AI writes buggy code. A Silicon Valley startup wants to fix it

TribLive TribLive Thu, Mar 12, 2026 • 10:18 AM

(TribLive)

PALO ALTO, Calif. — In January, a team of researchers at Carnegie Mellon University published a study analyzing the use of artificial intelligence technologies that can generate their own computer code.

The study found that while these increasingly popular AI systems could speed up software development, they could also degrade the quality of computer code, which can slow projects over time. In other words, they may generate buggy code.

Now a new wave of Silicon Valley startups is trying to solve that problem.

These startups, including Axiom Math and Harmonic, both in Palo Alto, California, and Logical Intelligence in San Francisco, hope to create AI systems that can automatically verify computer code in much the same way that mathematicians prove elaborate math problems.

“Code verification is probably the next frontier,” said Carina Hong, CEO and founder of Axiom.

Axiom said Thursday that it had raised $200 million in new funding from venture capital firms such as Menlo Ventures, Greycroft and Madrona. With headquarters next door to the downtown Palo Alto offices where Mark Zuckerberg built Facebook, the startup is a year old, employs about 20 people but is now valued at $1.6 billion.

Venture capitalists are betting big on this new kind of company because they see it as a path toward improving the code generated by AI systems such as OpenAI’s Codex or Anthropic’s Claude Code.

“Right now, the biggest problem with using AI to write code is that you don’t know when the code contains a bug,” said Matt Kraning, a partner with Menlo Ventures. “There are early signs that technology like Axiom’s can help with this.”

Like its rival Harmonic — which is valued at $1.45 billion after its latest funding round in the fall — Axiom began as an effort to create technology that solves math problems. In December, its technology, called AxiomProver, achieved a perfect score on the Putnam Exam, an annual competition that tests the math skills of top college students.

The AI systems that drive chatbots like ChatGPT often make mistakes. Sometimes, they even make stuff up. But when the subject is mathematics, technologies like AxiomProver can eliminate those mistakes.

Axiom has built technology that can formally prove whether an answer is right or wrong. It does this using a computer programming language called Lean, which was created more than decade ago as a way of proving mathematical statements.

Lean was originally a tool for mathematicians. Now, systems like AxiomProver are using Lean to prove math problems.

The hope is that these systems can use the same technique to verify the quality of computer code.

Although Axiom’s technology learned its skills by analyzing proofs of math problems, the company recently said it had achieved high scores on a standard benchmark test that judges whether AI systems can verify computer code. AI researchers called this “transfer learning” — when a system learns one skill (like proving math problem) and can successfully transfer that skill to a different task (like verifying computer code).

As they begin to train their systems for code verification, Hong and her colleagues said, they can further improve the quality of AI-generated code.

But experts warn that these methods have limits.

In mathematics, there is a clear distinction between correct and incorrect. But with computer programming, the distinction is harder to pin down, especially as programmers build things like social media services, which must handle the chaos created by millions of users across the globe.

“You can’t always specify what it means for computer code to be correct,” said Bogdan Vasilescu, a Carnegie Mellon computer science professor. “There are places where AI can verify code. But that does not mean that all problems in the code will suddenly go away.”

Featured Local Businesses

More local businesses