Base and Exponent Prediction in Mathematical Expressions using Multi-Output CNN

📅 2024-07-20

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Problem: Jointly recognizing bases and exponents in mathematical expression images under realistic degradations (noise, font scaling, and blur) remains challenging for conventional serial or single-task models. Method: This paper proposes a lightweight multi-output convolutional neural network (CNN) that predicts bases and exponents simultaneously, end to end. It introduces a single-model dual-branch architecture: one branch jointly regresses bounding boxes and classifies base symbols, while the other does the same for exponents. To improve generalization, synthetic data augmentation (additive noise, multi-scale rendering, and Gaussian blur) is systematically applied. Contribution/Results: Trained on 10,900 synthetically generated, degraded images, the method is reported to achieve high accuracy while reducing model complexity and training resource consumption. It is also reported to be robust to common imaging degradations and practical for deployment in mathematical OCR systems.

📝 Abstract

The use of neural networks and deep learning techniques in image processing has significantly advanced the field, enabling highly accurate recognition results. However, achieving high recognition rates often necessitates complex network models, which can be challenging to train and require substantial computational resources. This research presents a simplified yet effective approach to predicting both the base and exponent from images of mathematical expressions using a multi-output Convolutional Neural Network (CNN). The model is trained on 10,900 synthetically generated images containing exponent expressions, incorporating random noise, font size variations, and blur intensity to simulate real-world conditions. The proposed CNN model demonstrates robust performance with efficient training time. The experimental results indicate that the model achieves high accuracy in predicting the base and exponent values, proving the efficacy of this approach in handling noisy and varied input images.
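The multi-output idea in the abstract can be illustrated with a minimal sketch: one shared feature vector feeds two separate classification heads, and the training loss is the sum of the two cross-entropies. Everything specific here is an assumption for illustration, not the paper's configuration: the feature size of 8, the 10 classes per head, the random weights, and the helper names `make_head`, `predict`, and `joint_loss` are all hypothetical, and the random `features` vector merely stands in for the output of shared convolutional layers.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, features):
    """Dense layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def make_head(n_classes, n_features, rng):
    """Hypothetical output head with small random weights and zero bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
               for _ in range(n_classes)]
    bias = [0.0] * n_classes
    return weights, bias


def predict(features, base_head, exp_head):
    """Both heads read the same shared features (the multi-output part)."""
    base_probs = softmax(linear(*base_head, features))
    exp_probs = softmax(linear(*exp_head, features))
    return base_probs, exp_probs


def joint_loss(base_probs, exp_probs, base_label, exp_label):
    """Sum of the two cross-entropy terms, one per output."""
    return -math.log(base_probs[base_label]) - math.log(exp_probs[exp_label])


rng = random.Random(0)
features = [rng.random() for _ in range(8)]  # stand-in for shared CNN features
base_head = make_head(10, 8, rng)            # assumed: 10 base classes
exp_head = make_head(10, 8, rng)             # assumed: 10 exponent classes

base_probs, exp_probs = predict(features, base_head, exp_head)
loss = joint_loss(base_probs, exp_probs, base_label=3, exp_label=2)
print(f"joint loss: {loss:.4f}")
```

Because both heads share one feature extractor, a single forward pass yields both predictions, which is one plausible reason a multi-output design can be cheaper than two separate models.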

Problem

Research questions and friction points this paper is trying to address.

Predict the base and exponent in images of mathematical expressions using a CNN

Simplify complex network models for efficient training

Handle noisy and varied input images effectively

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-output CNN for math expression prediction

Synthetic training data with real-world variations

Efficient model with high accuracy
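The synthetic variations listed above (noise, font size, blur) can be sketched as a small degradation pipeline. This is a toy illustration under stated assumptions, not the paper's generator: images are plain lists of floats in [0, 1], the blur uses a 3x3 Gaussian approximation, and the parameter ranges and the names `blur`, `add_noise`, and `degrade` are hypothetical. The sampled `font_size` is only recorded here; a real generator would pass it to a text renderer.

```python
import random

# 3x3 Gaussian approximation; the weights sum to 16.
KERNEL = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]


def blur(img):
    """One pass of 3x3 blur with edge pixels clamped (replicated)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += KERNEL[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc / 16.0
    return out


def add_noise(img, sigma, rng):
    """Additive Gaussian noise, clipped back into [0, 1]."""
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in img]


def degrade(img, rng):
    """Sample assumed degradation parameters and apply blur, then noise."""
    params = {
        "font_size": rng.randint(20, 40),     # recorded only in this sketch
        "blur_passes": rng.randint(0, 2),     # more passes = stronger blur
        "noise_sigma": rng.uniform(0.0, 0.1),
    }
    for _ in range(params["blur_passes"]):
        img = blur(img)
    img = add_noise(img, params["noise_sigma"], rng)
    return img, params


# Toy 8x8 "image": a bright 4x4 square on a dark background.
clean = [[1.0 if 2 <= y < 6 and 2 <= x < 6 else 0.0 for x in range(8)]
         for y in range(8)]
rng = random.Random(42)
degraded, params = degrade(clean, rng)
print(params)
```

Seeding the generator makes each synthetic sample reproducible, which helps when comparing training runs on the same degraded set.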

🔎 Similar Papers

Enhancing Complex Formula Recognition with Hierarchical Detail-Focused Network