Finetuning Is a Surprisingly Effective Domain Adaptation Baseline in Handwriting Recognition

📅 2023-02-13
🏛️ IEEE International Conference on Document Analysis and Recognition
📈 Citations: 7
Influential: 1
🤖 AI Summary
To address the challenge of few-shot domain adaptation in handwriting recognition, this paper investigates efficient domain transfer for Connectionist Temporal Classification (CTC) models under extremely limited target-domain data—down to just 16 lines of text. We propose a lightweight adaptation paradigm relying solely on supervised fine-tuning and image- or sequence-level data augmentation, deliberately avoiding complex domain-adaptation techniques. Evaluated under both writer-dependent and writer-independent protocols on large-scale real-world datasets, our method substantially mitigates overfitting: it achieves relative reductions in character error rate (CER) of 25% (with 16 lines) to 50% (with 256 lines) when adapting to unseen writers. Our core contribution is demonstrating that, within the CTC framework, carefully designed fine-tuning combined with simple yet effective data augmentation suffices to attain strong generalization—challenging the necessity of sophisticated domain-adaptation methods in low-resource handwriting recognition.
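The summary mentions image- or sequence-level data augmentation without giving the exact recipe. As an illustration only (not the paper's method), a common image-level augmentation for text-line recognition is random horizontal stretching, sketched here in pure Python with a nearest-neighbour resample; the function names are invented for this example:

```python
import random

def horizontal_stretch(image, factor):
    """Nearest-neighbour horizontal rescaling of a line image (rows x cols)."""
    width = len(image[0])
    new_width = max(1, int(round(width * factor)))
    # Map each output column back to the nearest source column.
    return [[row[min(width - 1, int(x / factor))] for x in range(new_width)]
            for row in image]

def augment(image, rng=random):
    """Randomly stretch a text-line image horizontally by up to +/-20 %."""
    return horizontal_stretch(image, rng.uniform(0.8, 1.2))
```

Stretching only the horizontal axis mimics natural variation in writing speed and character spacing while leaving stroke height, and thus the CTC label sequence, unchanged.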
📝 Abstract
In many machine learning tasks, a large general dataset and a small specialized dataset are available. In such situations, various domain adaptation methods can be used to adapt a general model to the target dataset. We show that in the case of neural networks trained for handwriting recognition using CTC, simple fine-tuning with data augmentation works surprisingly well in such scenarios and that it is resistant to overfitting even for very small target domain datasets. We evaluated the behavior of fine-tuning with respect to augmentation, training data size, and quality of the pre-trained network, both in writer-dependent and writer-independent settings. On a large real-world dataset, fine-tuning on new writers provided an average relative CER improvement of 25 % for 16 text lines and 50 % for 256 text lines.
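The headline numbers are relative reductions in character error rate (CER), i.e. edit distance over reference length. A minimal sketch of how CER and a relative improvement would be computed (pure Python; the helper names are illustrative, not from the paper):

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein distance over reference length."""
    m, n = len(reference), len(hypothesis)
    prev = list(range(n + 1))  # distances for the empty reference prefix
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[n] / m if m else 0.0

def relative_cer_improvement(cer_base: float, cer_adapted: float) -> float:
    """Relative CER reduction, e.g. 0.25 for the reported 25 % gain."""
    return (cer_base - cer_adapted) / cer_base

# Hypothetical figures: adaptation lowering CER from 8 % to 6 %
print(round(relative_cer_improvement(0.08, 0.06), 2))  # 0.25
```

So the "25 % for 16 text lines" result means the fine-tuned model removes a quarter of the baseline's character errors, not that CER drops by 25 percentage points.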
Problem

Research questions and friction points this paper is trying to address.

Improving handwriting recognition via fine-tuning
Evaluating fine-tuning for small target datasets
Assessing data augmentation impact on adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuning with data augmentation for adaptation
Resistant to overfitting in small datasets
Evaluated performance across various training conditions
Jan Kohút
Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic
Michal Hradiš
Brno University of Technology
Computer Vision · Pattern Recognition