OpenAI Reasoning Models Crack a 1946 Geometry Conjecture
Validation from the critics who flagged previous failures
In 1946, mathematician Paul Erdős proposed a series of conjectures that would become foundational challenges in discrete geometry. For 78 years, the mathematical community was unable to settle one specific problem: the maximum number of unit distances among a set of points in a plane. OpenAI now claims that its newest reasoning models have disproved a related geometry conjecture and, unlike previous instances of artificial intelligence claiming mathematical breakthroughs, the experts agree.
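To make the underlying question concrete, the sketch below counts how many pairs in a small point set lie exactly one unit apart. It is only an illustration of what "unit distances between points in a plane" means; the function name `count_unit_distances`, the tolerance, and the two sample point sets are assumptions chosen for this example and have no connection to the model's actual output or proof.

```python
from itertools import combinations
from math import dist, isclose, sqrt


def count_unit_distances(points, tol=1e-9):
    """Count unordered pairs of points whose Euclidean distance is 1."""
    return sum(
        1
        for p, q in combinations(points, 2)
        # Compare with a small tolerance because coordinates are floats.
        if isclose(dist(p, q), 1.0, abs_tol=tol)
    )


# Four corners of a unit square: the sides count, the diagonals do not.
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]

# Two equilateral triangles sharing an edge (hypothetical sample set).
rhombus = [(0, 0), (1, 0), (0.5, sqrt(3) / 2), (1.5, sqrt(3) / 2)]

print(count_unit_distances(unit_square))  # 4
print(count_unit_distances(rhombus))      # 5
```

Both sets have four points, yet the rhombus yields five unit distances against the square's four. The open problem asks how large this count can grow as the number of points increases.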
The significance of this milestone lies not just in the solution itself, but in the identity of the verifiers. Fedja Nazarov and Terence Tao, mathematicians who famously debunked a previous OpenAI claim regarding the Riemann Hypothesis, have reviewed the output. This time, the rigorous logical steps provided by the model held up under professional scrutiny, marking a shift from probabilistic guessing to verifiable reasoning.
The shift from pattern matching to logical synthesis
Large language models typically operate on statistical probability, predicting the next word in a sequence based on massive datasets. This approach is poorly suited to mathematics, where a single incorrect digit or logical leap invalidates the entire output. The new model, internally referred to as o1, uses a chain-of-thought process that mimics human deliberation. This allows the system to check its own work and change course when a line of reasoning leads to a contradiction.
- The model generated a counterexample that contradicted the long-standing geometry hypothesis.
- It provided a step-by-step proof that was short enough for human experts to verify manually.
- The output lacked the typical hallucinations that plagued earlier iterations like GPT-4.
Data from internal benchmarks suggests that GPT-4 solved only 13% of the most difficult competitive math problems, while the o1 model reaches 83% accuracy on similar datasets. This gap of 70 percentage points represents a move toward symbolic reasoning, in which the machine applies the underlying rules of a system rather than relying on the frequency of words in a training set.
Implications for the cost of discovery
The cost of human capital in high-level mathematics is immense, often requiring decades of specialized education for a single breakthrough. By automating the verification of conjectures, the time-to-discovery for complex problems could drop from decades to hours. This is not merely an academic exercise; the logic used to solve geometry problems is the same logic required for optimizing logistics networks and designing complex semiconductors.
The proof is elegant and, more importantly, it is correct. It provides a path forward for using these tools as genuine research assistants rather than just sophisticated search engines.
Developers and founders should look at this as a signal that the bottleneck in AI is moving from data quantity to computational quality. We are seeing the first evidence that machines can contribute original knowledge to the hard sciences. By 2026, expect the integration of these reasoning engines into CAD software and cryptography suites to become the standard for enterprise-grade technical work.