🤖 AI Summary
This study addresses automatic Indian Sign Language (ISL) recognition to facilitate natural communication between deaf and hearing individuals. Focusing on 30 fundamental gestures, we propose a recognition method leveraging 3D skeletal sequences, captured via Kinect as 20 joint locations per frame, alongside synchronized RGB and depth modalities. Our core contribution is a geometric-transformation-based depth-frame alignment strategy, which mitigates temporal modeling errors induced by positional shifts and hand rotations of gestures, thereby improving the spatial robustness of recurrent neural networks (RNNs). Experimental evaluation on a standard ISL benchmark dataset achieves 84.81% classification accuracy, outperforming RNN baselines that operate directly on raw skeleton sequences. The proposed spatiotemporal alignment paradigm offers a generalizable approach for low-resource sign language recognition.
📝 Abstract
Sign language is a gesture-based symbolic communication medium among people with speech and hearing impairments. It also serves as a communication bridge between impaired and non-impaired populations. Unfortunately, in most situations, a non-impaired person is not well conversant in such symbolic languages, restricting the natural flow of information between the two groups. An automated mechanism that seamlessly translates sign language into natural language can therefore be highly advantageous. In this paper, we attempt to recognize 30 basic Indian sign gestures. Gestures are represented as temporal sequences of 3D frames (RGB + depth), each frame consisting of the 3D coordinates of 20 body joints captured by a Kinect sensor. A recurrent neural network (RNN) serves as the classifier. To improve its performance, we apply a geometric transformation to correct the alignment of depth frames. In our experiments, the model achieves 84.81% accuracy.
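To illustrate the alignment idea, the sketch below normalizes each skeleton frame with a rigid geometric transformation (translation to a hip-centered origin, then rotation about the vertical axis so the shoulder line faces the sensor) before the sequence reaches the RNN. This is a minimal, hypothetical reconstruction: the paper does not specify its exact transform, and the joint indices used here are illustrative placeholders, not the Kinect skeleton layout.

```python
import numpy as np

# Illustrative joint indices (NOT the actual Kinect ordering).
HIP, L_SHOULDER, R_SHOULDER = 0, 4, 8

def align_frame(joints: np.ndarray) -> np.ndarray:
    """Align one (20, 3) joint frame: remove the positional shift by
    translating the hip joint to the origin, then rotate about the
    z-axis so the shoulder line lies along the x-axis."""
    centred = joints - joints[HIP]                 # cancel body translation
    v = centred[R_SHOULDER] - centred[L_SHOULDER]  # shoulder direction
    theta = np.arctan2(v[1], v[0])                 # its angle in the xy-plane
    c, s = np.cos(-theta), np.sin(-theta)          # rotate by -theta
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    return centred @ rot.T

def align_sequence(seq: np.ndarray) -> np.ndarray:
    """Apply per-frame alignment to a (T, 20, 3) gesture sequence."""
    return np.stack([align_frame(frame) for frame in seq])
```

After this normalization, two performances of the same gesture recorded at different positions or body orientations map to near-identical coordinate sequences, which is what makes the downstream RNN's temporal modeling more robust.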