MaxProof: How MiniMax M3 Reaches Gold-Level Proof Scores
MaxProof turns MiniMax-M3 into a generator, verifier, fixer, and ranker; with population-level test-time scaling it reports 35/42 on IMO 2025 and 36/42 on USAMO 2026.
Topics
Neural, symbolic, and hybrid systems for mathematical proof search.
Theorem Proving · Google Research
HOList makes machine learning for higher-order theorem proving a concrete object of study; this entry covers its evidence, method tradeoffs, and practical limits.
Theorem Proving · Princeton University
LeanDojo makes retrieval-augmented theorem proving in Lean a concrete object of study; this entry covers its evidence, method tradeoffs, and practical limits.
Theorem Proving · Independent Researcher
MiniF2F makes formal Olympiad-level mathematics benchmarking a concrete object of study; this entry covers its evidence, method tradeoffs, and practical limits.
Theorem Proving · Google DeepMind
This work evaluates AI-aided formal proof search on open math problems: the strongest agent resolves 9 of 353 Erdős problems and proves 44 of 492 OEIS conjectures.
DeepSeek-Prover-V1.5 combines Lean feedback, reinforcement learning, and RMaxTS search, reaching 63.5% on miniF2F and 25.3% on ProofNet.
Theorem Proving · Google DeepMind
AlphaGeometry pairs a language model with a symbolic engine and trains on 100M synthetic theorems, solving 25 of 30 olympiad geometry problems versus 10 for the prior best.