Mathematics & Machine Learning Seminar
Large language models (LLMs) have achieved strong results in various AI and machine learning tasks. This talk covers two applications of LLMs in mathematics: theorem proving and autoformalization. First, we introduce Lean Copilot, a tool that uses LLMs to assist mathematicians in proving formal theorems in the Lean proof assistant. It can generate proof steps, suggest useful lemmas, and search for complete proofs. Second, we investigate LLMs for autoformalization, i.e., automatically translating informal mathematics into formal theorems and proofs, focusing on the domain of 2D Euclidean geometry. We introduce LeanEuclid, a benchmark for testing LLMs on autoformalization. In constructing LeanEuclid, we manually formalized 48 problems from the first book of Euclid's Elements, which is challenging to formalize due to the extensive use of diagrams in proofs. This is the first time the diagrammatic proofs in Elements have been faithfully formalized and mechanically checked. Using LeanEuclid, our experiments reveal the capability and limitations of state-of-the-art LLMs on autoformalizing Euclidean geometry problems.