🤖 AI Summary
To address insufficient modeling of subtle motion variations in micro-action recognition, this paper proposes a Motion-guided Modulation Network (MMN) that captures and modulates subtle motion cues along two paths, operating at both the skeletal and frame levels. Key contributions include: (1) a Motion-guided Skeletal Modulation module (MSM) that injects motion cues at the skeletal level as a control signal to guide spatial representation modeling; (2) a Motion-guided Temporal Modulation module (MTM) that incorporates frame-level motion information to model holistic motion patterns; and (3) a motion consistency learning strategy that aggregates motion cues from multi-scale features for classification. By combining skeleton sequence modeling with motion-guided feature modulation, MMN enhances the discriminability of fine-grained dynamic changes. Extensive experiments demonstrate state-of-the-art performance on the Micro-Action 52 and iMiGUE benchmarks, validating the efficacy of motion-guided modulation for micro-action recognition.
📝 Abstract
Micro-Actions (MAs) are an important form of non-verbal communication in social interactions, with potential applications in human emotional analysis. However, existing methods in Micro-Action Recognition often overlook the inherent subtle changes in MAs, which limits their accuracy in distinguishing visually similar micro-actions. To address this issue, we present a novel Motion-guided Modulation Network (MMN) that implicitly captures and modulates subtle motion cues to enhance spatial-temporal representation learning. Specifically, we introduce a Motion-guided Skeletal Modulation module (MSM) to inject motion cues at the skeletal level, acting as a control signal to guide spatial representation modeling. In parallel, we design a Motion-guided Temporal Modulation module (MTM) to incorporate motion information at the frame level, facilitating the modeling of holistic motion patterns in micro-actions. Finally, we propose a motion consistency learning strategy to aggregate the motion cues from multi-scale features for micro-action classification. Experimental results on the Micro-Action 52 and iMiGUE datasets demonstrate that MMN achieves state-of-the-art performance in skeleton-based micro-action recognition, underscoring the importance of explicitly modeling subtle motion cues. The code will be available at https://github.com/momiji-bit/MMN.
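To make the modulation idea concrete, the sketch below illustrates one plausible reading of motion-guided modulation: frame-to-frame differences of skeleton features serve as a motion cue, and a sigmoid gate derived from that cue rescales the features. This is a minimal, hypothetical NumPy sketch for intuition only, not the authors' implementation; the function name, shapes, and gating form are all assumptions.

```python
import numpy as np

def motion_gated_modulation(x):
    """Hedged sketch of motion-guided feature modulation (not the paper's code).

    x: skeleton features of shape (T, J, C) — T frames, J joints, C channels.
    Motion cues are taken as frame-to-frame differences; their magnitude is
    passed through a sigmoid to form a gate that modulates the features.
    """
    # Motion cue: temporal difference, padded with the first frame so the
    # output keeps shape (T, J, C).
    motion = np.diff(x, axis=0, prepend=x[:1])
    # Sigmoid gate on motion magnitude: static regions get ~0.5,
    # strongly moving joints get gates approaching 1.
    gate = 1.0 / (1.0 + np.exp(-np.abs(motion)))
    # Modulated features: motion-salient joints are emphasized.
    return x * gate

# Toy usage: 4 frames, 2 joints, 3 channels of constant features.
feats = np.ones((4, 2, 3))
out = motion_gated_modulation(feats)
print(out.shape)  # (4, 2, 3)
```

With constant input, the motion cue is zero everywhere, so the gate is uniformly 0.5 and the features are simply halved; real skeleton sequences would instead amplify joints with larger displacements relative to static ones.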