SorryDB Leaderboard
SorryDB is a dataset of open proof obligations drawn from real-world Lean formalization projects. It is designed as a benchmark for measuring how effective AI theorem provers are for day-to-day Lean practitioners.
SorryDB_2601 · January 2026
A fixed slice of 1,000 tasks from the SorryDB dataset. New snapshots will follow.
Evaluation pipeline
SorryDB continuously monitors active Lean projects listed on
Reservoir.
For every open sorry it finds, it records the repository, commit, Lean
version, and source location so the task can be reproduced locally.
Each strategy then proposes a proof to replace the sorry, and an
independent verifier compiles the candidate inside the original project to check
whether it closes the goal.
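The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not SorryDB's actual code: the names `SorryTask`, `Strategy`, `Verifier`, and `evaluate` are hypothetical, and the field names are assumptions based only on the items listed above (repository, commit, Lean version, source location). The verifier is passed in as a plain function; a real one would compile the candidate inside the original project.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class SorryTask:
    """One open `sorry` (hypothetical record; field names are assumptions)."""
    repository: str    # URL of the Lean project
    commit: str        # commit at which the sorry was found
    lean_version: str  # Lean version used by the project
    path: str          # source file containing the sorry
    line: int          # source location: line
    column: int        # source location: column


# A strategy maps a task to a candidate proof, or None if it gives up.
Strategy = Callable[[SorryTask], Optional[str]]
# A verifier reports whether a candidate proof is accepted for a task.
Verifier = Callable[[SorryTask, str], bool]


def evaluate(tasks: Sequence[SorryTask],
             strategy: Strategy,
             verifier: Verifier) -> float:
    """Return the fraction of tasks whose proposed proof the verifier accepts."""
    if not tasks:
        return 0.0
    solved = 0
    for task in tasks:
        proof = strategy(task)
        # The verifier is independent of the strategy: a proof only counts
        # if the verifier accepts it.
        if proof is not None and verifier(task, proof):
            solved += 1
    return solved / len(tasks)
```

Keeping the strategy and the verifier as separate callables mirrors the separation in the text: the component that proposes a proof is never the one that decides whether it counts.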
Evaluation Paper
An evaluation of current SOTA models on SorryDB
SorryDB: Can AI Provers Complete Real-World Lean Theorems?
Accepted at ICML 2026
1 Axiomatic AI · 2 University of Amsterdam · 3 Math, Inc. · 4 EPFL · 5 Imperial College · 6 The London School of Geometry and Number Theory · 7 Côte d'Azur University · 8 ENSEIRB-MATMECA, INP-Bordeaux · 9 Columbia University · 10 University of Toronto · 11 The University of Texas at Austin