🤖 AI Summary
Current automated formalization tools struggle to formalize complex mathematical proofs on their own. This study investigates how people formalize proofs with AI assistance, using a mixed-methods approach that combines a qualitative survey with a controlled user study spanning multiple mathematical domains and difficulty levels. It characterizes how users flexibly combine multiple AI tools in practice and reveals a general desire to retain high-level human control over the proof discovery process. Participants tended to attain higher formalization accuracy with access to AI tools than without, and most chose to use several different AI tools.
📝 Abstract
For centuries, mathematicians have written proofs to substantiate their mathematical arguments, yet automatically verifying the validity of those proofs has long been a challenge. Advances in AI systems' ability to generate code and engage in increasingly high-level mathematical reasoning promise to transform people's ability to formalize, and thereby verify, proofs. While many works focus on benchmarking the current frontier, we instead study how people use these tools. We conduct a mixed-methods analysis of the initial impact of AI on people's formalization workflows: what people say they want, what they see as the barriers to those visions, and how they actually use and adapt AI in practice. A qualitative survey shows that people's preferences are diverse, but that there is a general desire for AI assistance in formalization that preserves high-level human control over the proof discovery process. To assess how people actually engage with AI for formalization given these barriers, we conduct a controlled user study in which participants formalize informal math problems and their proofs, with and without AI, across a range of mathematical domains and difficulty levels. Despite the limitations of autoformalization tools at the time, participants tend to attain higher formalization accuracy when allowed access to AI tools than when formalizing on their own, with most participants flexibly choosing to use multiple different AI tools. Taken together, our work sheds light on the early stages of AI integration into formalization workflows, which involve a close interplay of human and AI engagement.