skip to main content
Caltech

Mathematics & Machine Learning Seminar

Tuesday, January 30, 2024
2:00pm to 3:00pm
Add to Cal
East Bridge 114
Towards Large Language Models as Copilots for Formalization and Theorem Proving
Kaiyu Yang, Computing and Mathematical Sciences, Caltech,

Large language models (LLMs) have achieved strong results in various AI and machine learning tasks. This talk covers two applications of LLMs in mathematics: theorem proving and autoformalization. First, we introduce Lean Copilot, a tool that uses LLMs to assist mathematicians in proving formal theorems in the Lean proof assistant. It can generate proof steps, suggest useful lemmas, and search for complete proofs. Second, we investigate LLMs for autoformalization, i.e., automatically translating informal mathematics into formal theorems and proofs, focusing on the domain of 2D Euclidean geometry. We introduce LeanEuclid, a benchmark for testing LLMs on autoformalization. In constructing LeanEuclid, we manually formalized 48 problems from the first book of Euclid's Elements, which is challenging to formalize due to the extensive use of diagrams in proofs. This is the first time the diagrammatic proofs in Elements have been faithfully formalized and mechanically checked. Using LeanEuclid, our experiments reveal the capability and limitations of state-of-the-art LLMs on autoformalizing Euclidean geometry problems.

For more information, please contact Math Department by phone at 626-395-4335 or by email at [email protected].