Learning to Reason with Insight for Informal Theorem Proving
2026-04-17 • Artificial Intelligence
Artificial Intelligence • Computation and Language • Machine Learning
AI summary
The authors observe that current automated theorem provers mostly rely on strict formal logic, which does not play to the strength of language models in understanding natural language. They identify the main challenge as helping models find the key ideas or techniques needed to solve hard math problems. To address this, they create a new dataset called DeepInsightTheorem that decomposes informal proofs to make these core techniques and proof sketches explicit. They also develop a progressive training method that teaches the model step by step, much as humans learn, encouraging more insightful thinking. Their experiments on challenging math benchmarks show that this approach improves the model's mathematical reasoning.
automated theorem proving • informal theorem proving • large language models • mathematical reasoning • proof sketches • core techniques • progressive training • hierarchical dataset • deep learning • insightful reasoning
Authors
Yunhe Li, Hao Shi, Bowen Deng, Wei Wang, Mengzhe Ruan, Hanxu Hou, Zhongxiang Dai, Siyang Gao, Chao Wang, Shuang Qiu, Linqi Song
Abstract
Although most automated theorem-proving approaches depend on formal proof systems, informal theorem proving can align better with the strength of large language models (LLMs) in natural language processing. In this work, we identify a primary bottleneck in informal theorem proving as a lack of insight, namely the difficulty of recognizing the core techniques required to solve complex problems. To address this, we propose a novel framework designed to cultivate this essential reasoning skill and enable LLMs to perform insightful reasoning. We introduce $\mathtt{DeepInsightTheorem}$, a hierarchical dataset that structures informal proofs by explicitly extracting core techniques and proof sketches alongside the final proof. To fully exploit this dataset, we design a Progressive Multi-Stage SFT strategy that mimics the human learning process, guiding the model from basic proof writing to insightful thinking. Our experiments on challenging mathematical benchmarks demonstrate that this insight-aware generation strategy significantly outperforms baselines. These results confirm that teaching models to identify and apply core techniques can substantially improve their mathematical reasoning.
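To make the abstract's two ideas concrete, the sketch below models one hierarchical $\mathtt{DeepInsightTheorem}$-style record (theorem, core techniques, proof sketch, full proof) and derives a progressive multi-stage training curriculum from it. This is a minimal illustration only: the field names, the three-stage split, and the way targets are assembled are assumptions, since the paper's exact data schema and SFT schedule are not given here.

```python
from dataclasses import dataclass

@dataclass
class InsightTheoremExample:
    """One hierarchical record: a theorem plus the layers the abstract
    describes (field names are illustrative, not the paper's schema)."""
    theorem: str
    core_techniques: list[str]  # key ideas needed to solve the problem
    proof_sketch: str           # high-level outline built from those ideas
    full_proof: str             # complete informal proof

def curriculum_stages(ex: InsightTheoremExample) -> list[tuple[str, str, str]]:
    """Order (stage, prompt, target) triples from basic proof writing to
    insight-first generation, mimicking a progressive multi-stage SFT
    schedule. The staging here is an assumed reading of the abstract."""
    return [
        # Stage 1: basic proof writing -- theorem in, full proof out.
        ("stage1_proof_writing", ex.theorem, ex.full_proof),
        # Stage 2: sketch the argument before writing the proof.
        ("stage2_sketch_then_proof", ex.theorem,
         ex.proof_sketch + "\n" + ex.full_proof),
        # Stage 3: name the core techniques first, then sketch, then prove.
        ("stage3_insight_first", ex.theorem,
         "Core techniques: " + ", ".join(ex.core_techniques)
         + "\n" + ex.proof_sketch + "\n" + ex.full_proof),
    ]

example = InsightTheoremExample(
    theorem="The sum of two even integers is even.",
    core_techniques=["definition of evenness", "distributivity"],
    proof_sketch="Write each even number as 2k, add, and factor out 2.",
    full_proof="Let a = 2m and b = 2n. Then a + b = 2(m + n), which is even.",
)
for stage, _prompt, _target in curriculum_stages(example):
    print(stage)
```

Each stage supervises the model on progressively richer targets, so later stages teach it to surface the key idea before committing to a proof, matching the abstract's "basic proof writing to insightful thinking" progression.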