PP-FormulaNet: Bridging Accuracy and Efficiency in Advanced Formula Recognition

📅 2025-03-24
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address the longstanding trade-off between accuracy and efficiency in mathematical formula recognition for document intelligence, this paper proposes PP-FormulaNet, a dual-model architecture. PP-FormulaNet-L targets high accuracy via a co-designed visual encoder, symbol decoder, and structural modeling module, attaining a 6% absolute improvement in LaTeX token accuracy over UniMERNet. PP-FormulaNet-S prioritizes efficiency through model lightweighting and inference optimization, running over 16× faster. The authors further introduce the first fully automated formula mining system, which improves training data quality and model generalization. All models are implemented end-to-end in PaddlePaddle; source code and pre-trained models are publicly released. The framework has been deployed in multiple industrial document processing systems, demonstrating real-world applicability.

📝 Abstract
Formula recognition is an important task in document intelligence. It involves converting mathematical expressions from document images into structured symbolic formats that computers can easily work with. LaTeX is the most common format used for this purpose. In this work, we present PP-FormulaNet, a state-of-the-art formula recognition model that excels in both accuracy and efficiency. To meet the diverse needs of applications, we have developed two specialized models: PP-FormulaNet-L, tailored for high-accuracy scenarios, and PP-FormulaNet-S, optimized for high-efficiency contexts. Our extensive evaluations reveal that PP-FormulaNet-L attains accuracy levels that surpass those of prominent models such as UniMERNet by a significant 6%. Conversely, PP-FormulaNet-S operates at speeds that are over 16 times faster. These advancements facilitate seamless integration of PP-FormulaNet into a broad spectrum of document processing environments that involve intricate mathematical formulas. Furthermore, we introduce a Formula Mining System, which is capable of extracting a vast amount of high-quality formula data. This system further enhances the robustness and applicability of our formula recognition model. Code and models are publicly available at PaddleOCR (https://github.com/PaddlePaddle/PaddleOCR) and PaddleX (https://github.com/PaddlePaddle/PaddleX).
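The text above does not say which metric the reported 6% refers to. As a rough illustration of how a predicted LaTeX string can be scored against a reference, here is a minimal sketch of a token-level similarity based on edit distance. The tokenizer and the function names (`tokenize_latex`, `edit_distance`, `token_similarity`) are assumptions made for this example, not the paper's evaluation protocol or any PaddleOCR API.

```python
import re

# Hypothetical tokenizer (an assumption for this sketch): a token is a LaTeX
# command such as \frac, an escaped character such as \{, or any single
# non-whitespace character. Whitespace is dropped.
_TOKEN_RE = re.compile(r"\\[A-Za-z]+|\\.|\S")


def tokenize_latex(source):
    """Split a LaTeX string into a list of tokens."""
    return _TOKEN_RE.findall(source)


def edit_distance(a, b):
    """Levenshtein distance between two token lists, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def token_similarity(pred, ref):
    """Return 1 - normalized edit distance over tokens, a value in [0, 1]."""
    p, r = tokenize_latex(pred), tokenize_latex(ref)
    if not p and not r:
        return 1.0
    return 1.0 - edit_distance(p, r) / max(len(p), len(r))
```

Averaging `token_similarity` over a test set gives a single score that is insensitive to spacing differences in the LaTeX source, which is one reason token-level measures are often preferred over exact string match for this task.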
Problem

Research questions and friction points this paper is trying to address.

Improving accuracy and efficiency in mathematical formula recognition
Developing specialized models for high-accuracy and high-efficiency scenarios
Enhancing robustness with a Formula Mining System for data extraction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Develops PP-FormulaNet-L for high accuracy
Optimizes PP-FormulaNet-S for high efficiency
Introduces Formula Mining System for data extraction
Authors

Hongen Liu
PaddlePaddle Team, Baidu Inc.; College of Intelligence and Computing, Tianjin University
Cheng Cui
BUAA
deep learning, network design, OCR, MLLM
Yuning Du
PhD student, University of Edinburgh
Machine Learning, Medical Image Analysis
Yi Liu
PaddlePaddle Team, Baidu Inc.
Gang Pan
Tianjin University
Computer vision, Multimodal, AI