🤖 AI Summary
This work addresses the challenge of micro-gesture recognition, where subtle inter-class differences hinder effective modeling of local dynamic features under conventional class-level supervision. To overcome this limitation, the authors propose a fine-grained semantic-guided learning framework, introducing the first micro-gesture dataset annotated with four-dimensional fine-grained textual descriptions. They design a multi-level contrastive optimization strategy to enable coarse-to-fine joint training and incorporate two novel attention modules, Fine-Grained Semantic Attention (FG-SA) and Class-Level Prototype Attention (CP-A), to guide vision-language models toward discriminative local motion cues. Experimental results demonstrate that the proposed approach achieves competitive performance on micro-gesture recognition benchmarks, validating the efficacy of fine-grained semantic guidance in enhancing recognition accuracy.
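To make the idea of semantic-guided attention concrete, the sketch below shows one plausible form such a module could take: fine-grained text embeddings act as queries that attend over visual tokens, so the textual description of local motion steers which spatio-temporal features are emphasized. This is a minimal illustration, not the paper's implementation; the class name `FineGrainedSemanticAttention`, the single-head design, and the dimensions are assumptions.

```python
import torch
import torch.nn as nn

class FineGrainedSemanticAttention(nn.Module):
    """Hypothetical sketch of FG-SA-style guidance: text embeddings of
    fine-grained motion descriptions query the visual token sequence,
    producing one semantically guided visual summary per text cue."""

    def __init__(self, dim: int = 512):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)  # text embeddings -> queries
        self.k_proj = nn.Linear(dim, dim)  # visual tokens   -> keys
        self.v_proj = nn.Linear(dim, dim)  # visual tokens   -> values
        self.scale = dim ** -0.5

    def forward(self, text_emb: torch.Tensor, vis_tokens: torch.Tensor) -> torch.Tensor:
        # text_emb:   (B, T_txt, D) fine-grained description embeddings
        # vis_tokens: (B, N, D)     spatio-temporal visual tokens
        q = self.q_proj(text_emb)
        k = self.k_proj(vis_tokens)
        v = self.v_proj(vis_tokens)
        attn = (q @ k.transpose(-2, -1)) * self.scale  # (B, T_txt, N)
        attn = attn.softmax(dim=-1)
        return attn @ v  # (B, T_txt, D) text-guided visual features
```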
📄 Abstract
Micro-gesture recognition (MGR) is challenging due to subtle inter-class variations. Existing methods rely on category-level supervision, which is insufficient for capturing subtle and localized motion differences. This paper therefore proposes a Fine-Grained Semantic Guidance Learning (FG-SGL) framework that jointly integrates fine-grained and category-level semantics to guide vision-language models in perceiving local MG motions. The framework comprises two attention modules: Fine-Grained Semantic Attention (FG-SA), which adopts fine-grained semantic cues to guide the learning of local motion features, and Class-Level Prototype Attention (CP-A), which enhances the separability of MG features through category-level semantic guidance. To support fine-grained semantic guidance, this work constructs a fine-grained textual dataset with human annotations that describes the dynamic process of MGs along four refined semantic dimensions. Furthermore, a Multi-Level Contrastive Optimization strategy is designed to jointly optimize both modules in a coarse-to-fine manner. Experiments show that FG-SGL achieves competitive performance, validating the effectiveness of fine-grained semantic guidance for MGR.
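As a rough illustration of what a coarse-to-fine contrastive objective could look like, the sketch below combines a class-level and a fine-grained video-text InfoNCE term. The symmetric formulation, the temperature `tau`, and the weighting `lam` are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def multi_level_contrastive_loss(vid, coarse_txt, fine_txt, tau=0.07, lam=0.5):
    """Illustrative multi-level contrastive objective: each video embedding
    is pulled toward both its class-level text embedding (coarse) and its
    fine-grained description embedding, via symmetric InfoNCE.

    vid, coarse_txt, fine_txt: (B, D) embeddings, where row i of each
    tensor corresponds to the same sample."""
    vid = F.normalize(vid, dim=-1)

    def info_nce(txt):
        txt = F.normalize(txt, dim=-1)
        logits = vid @ txt.t() / tau  # (B, B) cosine-similarity logits
        targets = torch.arange(vid.size(0), device=vid.device)
        # Symmetric loss: video-to-text and text-to-video directions.
        return 0.5 * (F.cross_entropy(logits, targets)
                      + F.cross_entropy(logits.t(), targets))

    # Coarse (category-level) term plus weighted fine-grained term.
    return info_nce(coarse_txt) + lam * info_nce(fine_txt)
```

Under this reading, the coarse term keeps categories separable while the fine-grained term forces the video encoder to align with descriptions of local dynamics, which is the coarse-to-fine joint training the abstract describes.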