Performance Analysis of Few-Shot Learning Approaches for Bangla Handwritten Character and Digit Recognition

📅 2024-12-14

🏛️ 2024 6th International Conference on Sustainable Technologies for Industry 5.0 (STI)

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the performance degradation of models in Bangla handwritten character and digit recognition under low-resource and few-shot settings, this paper proposes SynergiProtoNet, a hybrid prototypical network that combines multi-level feature extraction with clustering. It jointly models high- and low-level features within an embedding framework designed to capture fine-grained structural detail and context. The method is evaluated in four few-shot settings: monolingual intra-dataset, monolingual inter-dataset, cross-lingual transfer, and split digit testing. Under multiple few-shot configurations (e.g., 1- and 5-shot), SynergiProtoNet consistently outperforms state-of-the-art approaches such as BD-CSPN and Relation Network, establishing a new benchmark for Bangla handwritten recognition. The source code is publicly available.
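The "1- and 5-shot" configurations mentioned above refer to the usual episodic setup in few-shot learning: each evaluation episode draws N classes, K labeled support examples per class, and a handful of query examples to classify. The following is a minimal sketch of such a sampler in plain Python. It is not taken from the paper's code; the function name `sample_episode` and its parameters (`n_way`, `k_shot`, `n_query`) are illustrative assumptions.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way, k_shot, n_query, rng=None):
    """Draw one N-way K-shot episode from a labeled dataset.

    `labels[i]` is the class of example i. Returns two lists of
    (example_index, class) pairs: the support set and the query set.
    Hypothetical helper for illustration, not the authors' sampler.
    """
    rng = rng or random.Random()

    # Group example indices by class.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Only classes with enough examples for support + query are usable.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= k_shot + n_query]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query examples")

    support, query = [], []
    for c in rng.sample(eligible, n_way):
        # Sample without replacement so support and query never overlap.
        picked = rng.sample(by_class[c], k_shot + n_query)
        support += [(i, c) for i in picked[:k_shot]]
        query += [(i, c) for i in picked[k_shot:]]
    return support, query
```

For example, `sample_episode(labels, n_way=5, k_shot=1, n_query=15)` would yield a 5-way 1-shot episode with 5 support examples and 75 query examples.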

📝 Abstract

Few-shot learning (FSL) offers a promising solution for classification tasks with limited labeled examples, which makes it especially valuable for languages with few annotated samples. Traditional deep learning research has largely centered on optimizing performance using large-scale datasets, yet constructing extensive datasets for all languages is both labor-intensive and impractical. FSL offers a compelling alternative, achieving effective results with minimal data. This study investigates the performance of FSL approaches in Bangla character and numeral recognition with limited labeled data, demonstrating their applicability to scripts with intricate structures where dataset scarcity is prevalent. Given the complexity of Bangla script, we posit that models capable of performing well on these characters will generalize effectively to languages of similar or lower structural complexity. We introduce SynergiProtoNet, a hybrid network designed to enhance the recognition accuracy of handwritten characters and digits. Our model combines advanced clustering methods with a robust embedding framework to capture fine-grained details and contextual subtleties, leveraging multi-level (high- and low-level) feature extraction within a prototypical learning framework. We rigorously benchmark SynergiProtoNet against several state-of-the-art few-shot learning models, including BD-CSPN, Prototypical Network, Relation Network, Matching Network, and SimpleShot, across diverse evaluation settings. Our experiments (Monolingual Intra-Dataset Evaluation, Monolingual Inter-Dataset Evaluation, Cross-Lingual Transfer, and Split Digit Testing) demonstrate that SynergiProtoNet consistently achieves superior performance, establishing a new benchmark in few-shot learning for handwritten character and digit recognition. The code is available on GitHub: https://github.com/MehediAhamed/SynergiProtoNet.
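For readers unfamiliar with the prototypical learning framework the abstract builds on, the sketch below shows the standard Prototypical Network classification step: each class prototype is the mean of its support embeddings, and a query is assigned to the nearest prototype via a softmax over negative squared Euclidean distances. This is a generic illustration, not SynergiProtoNet itself. The abstract does not specify the model's distance metric or embedding network, so the 2-D embeddings, the class names, and the helper names (`class_prototypes`, `classify`) are all assumptions.

```python
import math
from collections import defaultdict


def class_prototypes(support_embeddings, support_labels):
    """Return {label: mean embedding} over the support set."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(support_embeddings, support_labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query_embedding, prototypes):
    """Softmax over negative squared distances to each prototype.

    Returns (predicted_label, {label: probability}).
    """
    labels = list(prototypes)
    logits = [-squared_distance(query_embedding, prototypes[l]) for l in labels]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    probs = {l: e / total for l, e in zip(labels, exps)}
    return max(probs, key=probs.get), probs


# Toy 2-shot, 2-way example with made-up 2-D embeddings and class names.
support = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
support_labels = ["ka", "ka", "kha", "kha"]
prototypes = class_prototypes(support, support_labels)
predicted, probabilities = classify([1.0, 1.0], prototypes)
```

In a real pipeline the embeddings would come from a trained feature extractor rather than hand-written vectors; the multi-level features described in the abstract would determine what those embeddings contain, while the nearest-prototype rule stays the same.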

Problem

Research questions and friction points this paper is trying to address.

Evaluating few-shot learning for Bangla handwritten recognition

Addressing dataset scarcity in complex script recognition

Improving accuracy with hybrid network SynergiProtoNet

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid network SynergiProtoNet for recognition

Multi-level feature extraction framework

Advanced clustering with robust embedding
