Bangla Sign Language Translation: Dataset Creation Challenges, Benchmarking and Prospects

📅 2025-11-26
🤖 AI Summary
Problem: Sentence-level Bangladeshi Sign Language (BdSL) translation has long been hindered by a severe scarcity of sentence-aligned, annotated data.

Method: We introduce IsharaKhobor, the first standardized, publicly available sentence-level sign language translation dataset, comprising authentic news-domain multimodal videos, 3D skeletal keypoint sequences, corresponding Bengali text, and semantically normalized annotations. We propose a keypoint- and RQE-embedding-based baseline model and conduct systematic ablation studies incorporating lexical constraints and semantic normalization.

Contribution/Results: Experiments demonstrate IsharaKhobor's effectiveness, yielding substantial performance gains in low-resource BdSL translation. The dataset fills a critical gap in South Asian low-resource sign language resources, establishing a benchmark dataset, a unified evaluation protocol, and a reproducible modeling framework for BdSL translation, thereby advancing AI-assisted technologies for the Deaf and hard-of-hearing community.

📝 Abstract
Bangla Sign Language Translation (BdSLT) has so far been severely constrained because the language itself is very low-resource. Creating a standard sentence-level dataset for BdSLT is of immense importance for developing AI-based assistive tools for deaf and hard-of-hearing people in the Bangla-speaking community. In this paper, we present a dataset, IsharaKhobor, and two subsets of it to enable research. We also describe the challenges in developing the dataset and suggest a way forward by benchmarking with landmark-based raw and RQE embeddings. We perform ablations on vocabulary restriction and canonicalization within the dataset, which resulted in two more datasets, IsharaKhobor_small and IsharaKhobor_canonical_small. The dataset is publicly available at: www.kaggle.com/datasets/hasanssl/isharakhobor [1].
Problem

Research questions and friction points this paper is trying to address.

Creating standard sentence-level datasets for Bangla Sign Language Translation
Developing AI assistive tools for deaf Bangla-speaking communities
Addressing vocabulary restriction and canonicalization challenges in BdSLT
Innovation

Methods, ideas, or system contributions that make the work stand out.

Created the IsharaKhobor dataset for Bangla Sign Language
Benchmarked with landmark-based raw and RQE embeddings
Applied vocabulary restriction and canonicalization within the dataset
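
The abstract does not spell out how vocabulary restriction and canonicalization were carried out, so the following is only a minimal sketch of one plausible reading: map variant tokens to a canonical form, then keep only sentences whose tokens are all frequent enough. The function names, the `min_count` threshold, the mapping, and the toy corpus are all assumptions made for illustration and are not taken from the paper or the dataset.

```python
from collections import Counter


def restrict_vocabulary(sentences, min_count=2):
    """Keep sentences whose tokens all occur at least `min_count` times.

    Hypothetical helper: the paper's actual restriction rule is not stated.
    """
    counts = Counter(tok for s in sentences for tok in s.split())
    vocab = {tok for tok, n in counts.items() if n >= min_count}
    kept = [s for s in sentences if all(tok in vocab for tok in s.split())]
    return kept, vocab


def canonicalize(sentence, canonical_map):
    """Replace each token with its canonical form, if one is listed."""
    return " ".join(canonical_map.get(tok, tok) for tok in sentence.split())


# Toy placeholder corpus, not taken from IsharaKhobor.
corpus = ["a b c", "a b", "a d", "a bb"]
canonical_map = {"bb": "b"}  # hypothetical variant-to-canonical mapping

# Canonicalize first so that variants pool their counts, then restrict.
canonical_corpus = [canonicalize(s, canonical_map) for s in corpus]
kept, vocab = restrict_vocabulary(canonical_corpus, min_count=2)
```

Canonicalizing before restricting means spelling or inflection variants contribute to a single token's frequency, which can keep more sentences under the same threshold. Whether the authors applied the steps in this order is not stated in the abstract.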
Husne Ara Rubaiyeat
Systems and Software Lab (SSL), Islamic University of Technology (IUT)
Hasan Mahmud
Postdoctoral Research Associate, Rochester Institute of Technology
Information Systems, Algorithmic decision-making, HCI/Human-AI interaction
Md Kamrul Hasan
Systems and Software Lab (SSL), Islamic University of Technology (IUT)