论文标题

使用大语言模型建立数学形式化助手

Towards a Mathematics Formalisation Assistant using Large Language Models

论文作者

Agrawal, Ayush, Gadgil, Siddhartha, Goyal, Navin, Narayanan, Ashvni, Tadipatri, Anand

论文摘要

数学形式化是用自然语言编写数学(即定义,定理陈述,证明)的任务,如书和论文中所发现的,并将其用于正式语言,然后可以通过程序检查正确的正确性。今天这是一项蓬勃发展的活动,但是正式化仍然很麻烦。在本文中,我们探讨了大语言模型(Codex)的能力,以帮助精益定理宣传中的形式化。我们发现,通过仔细的输入及时选择和后处理,Codex能够以$ 120 $定理语句的近75 \%准确性在本科级别上形式化简短的数学语句。为了证明,定量分析是不可行的,我们进行了详细的案例研究。我们在本科级别选择了一组$ 13 $定理的$ 13 $定理,并提供了适合两三段的证据。我们表明,使用新的提示策略可以用自然语言形式化这些证据,而十二codex完成中至少有一个易于修复为完整的证明。这是令人惊讶的,因为正式数学本质上没有任何一致的数据,尤其是用于证明。这些结果表明,大型语言模型是完全或部分自动化形式化的有希望的途径。

Mathematics formalisation is the task of writing mathematics (i.e., definitions, theorem statements, proofs) in natural language, as found in books and papers, into a formal language that can then be checked for correctness by a program. It is a thriving activity today, however formalisation remains cumbersome. In this paper, we explore the abilities of a large language model (Codex) to help with formalisation in the Lean theorem prover. We find that with careful input-dependent prompt selection and postprocessing, Codex is able to formalise short mathematical statements at undergrad level with nearly 75\% accuracy for $120$ theorem statements. For proofs quantitative analysis is infeasible and we undertake a detailed case study. We choose a diverse set of $13$ theorems at undergrad level with proofs that fit in two-three paragraphs. We show that with a new prompting strategy Codex can formalise these proofs in natural language with at least one out of twelve Codex completion being easy to repair into a complete proof. This is surprising as essentially no aligned data exists for formalised mathematics, particularly for proofs. These results suggest that large language models are a promising avenue towards fully or partially automating formalisation.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源