AI Summary
To address the performance degradation of models in Bangla handwritten character and digit recognition under low-resource, few-shot settings, this paper proposes SynergiProtoNet, a prototypical network framework that integrates multi-level feature extraction with clustering-based optimization. According to the authors, it is the first to bring context-aware embedding learning and joint modeling of high- and low-level features to this task. The method is designed to improve fine-grained structural representation and cross-domain generalization, and it is evaluated in four few-shot settings: monolingual intra-dataset, monolingual inter-dataset, cross-lingual transfer, and split digit testing. Under multiple few-shot configurations (e.g., 1- and 5-shot), SynergiProtoNet consistently outperforms state-of-the-art approaches such as BD-CSPN and Relation Network, establishing a new benchmark for Bangla handwritten recognition. The source code is publicly available.
Abstract
Few-shot learning (FSL) offers a promising solution for classification tasks with limited labeled examples, which makes it valuable for languages with few annotated samples. Traditional deep learning research has largely centered on optimizing performance using large-scale datasets, yet constructing extensive datasets for all languages is both labor-intensive and impractical. FSL offers a compelling alternative, achieving effective results with minimal data. This study investigates the performance of FSL approaches in recognizing Bangla characters and numerals with limited labeled data, demonstrating their applicability to structurally complex scripts where dataset scarcity is prevalent. Given the complexity of Bangla script, we posit that models capable of performing well on these characters will generalize effectively to languages of similar or lower structural complexity. We introduce SynergiProtoNet, a hybrid network designed to enhance the recognition accuracy of handwritten characters and digits. Our model combines advanced clustering methods with a robust embedding framework to capture fine-grained details and contextual subtleties, leveraging multi-level (high- and low-level) feature extraction within a prototypical learning framework. We rigorously benchmark SynergiProtoNet against several state-of-the-art few-shot learning models, including BD-CSPN, Prototypical Network, Relation Network, Matching Network, and SimpleShot, across diverse evaluation settings. Our experiments (Monolingual Intra-Dataset Evaluation, Monolingual Inter-Dataset Evaluation, Cross-Lingual Transfer, and Split Digit Testing) demonstrate that SynergiProtoNet consistently achieves superior performance, establishing a new benchmark in few-shot learning for handwritten character and digit recognition. The code is available on GitHub: https://github.com/MehediAhamed/SynergiProtoNet.
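For readers unfamiliar with the prototypical learning framework mentioned in the abstract, the following is a minimal sketch of the generic prototypical-network classification step: average the support embeddings of each class into a prototype, then assign a query to the nearest prototype. It is not the authors' SynergiProtoNet implementation. The function names `class_prototypes` and `classify`, the toy 2-D embeddings, and the class labels are all hypothetical, and the embedding network itself (the multi-level feature extractor) is assumed to exist upstream and is not shown.

```python
import math
from collections import defaultdict


def class_prototypes(support_embeddings, support_labels):
    """Average the support embeddings of each class into one prototype."""
    grouped = defaultdict(list)
    for vec, label in zip(support_embeddings, support_labels):
        grouped[label].append(vec)
    # zip(*vecs) iterates over embedding dimensions, so each prototype
    # is the per-dimension mean of that class's support vectors.
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def classify(query_embedding, prototypes):
    """Return the label whose prototype is nearest in Euclidean distance."""
    return min(
        prototypes,
        key=lambda label: math.dist(query_embedding, prototypes[label]),
    )


# Toy 2-way 2-shot episode. The embeddings are made-up 2-D vectors standing
# in for the output of an embedding network; the labels are placeholders.
support = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
labels = ["class_a", "class_a", "class_b", "class_b"]

prototypes = class_prototypes(support, labels)
prediction_near_b = classify([0.9, 1.0], prototypes)
prediction_near_a = classify([0.0, 0.1], prototypes)
```

In a 1-shot episode each prototype is simply the single support embedding for that class, while in a 5-shot episode it is the mean of five. This sketch uses plain Euclidean distance; the distance metric and any prototype refinement used in the paper may differ.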