KAN-HyperpointNet for Point Cloud Sequence-Based 3D Human Action Recognition

📅 2024-09-14
🏛️ IEEE International Conference on Acoustics, Speech, and Signal Processing
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing point cloud sequence-based 3D action recognition methods struggle to simultaneously preserve fine-grained limb motion fidelity and global pose structural integrity, leading to critical spatiotemporal cue loss. To address this, we propose D-Hyperpoint representation and KAN-HyperpointNet—a decoupled architecture. D-Hyperpoint is the first unified representation that jointly encodes local instantaneous motion and global static pose. We further pioneer the integration of Kolmogorov–Arnold Networks (KANs) into a spatiotemporal hybrid module to model high-order nonlinear interactions. Our framework comprises four key components: D-Hyperpoint embedding, KANsMixer, KAN-enhanced spatiotemporal decoupling, and a hierarchical nested aggregation mechanism. Extensive experiments demonstrate state-of-the-art performance on MSR Action3D and NTU-RGB+D 60, with significant improvements in fine-grained discrimination of complex actions.

📝 Abstract
Point cloud sequence-based 3D action recognition has achieved impressive performance and efficiency. However, existing point cloud sequence modeling methods cannot adequately balance the precision of limb micro-movements with the integrity of posture macro-structure, leading to the loss of crucial information cues in action inference. To overcome this limitation, we introduce D-Hyperpoint, a novel data type generated through a D-Hyperpoint Embedding module. D-Hyperpoint encapsulates both regional-momentary motion and global-static posture, effectively summarizing the unit human action at each moment. In addition, we present a D-Hyperpoint KANsMixer module, which is recursively applied to nested groupings of D-Hyperpoints to learn the action discrimination information and creatively integrates Kolmogorov-Arnold Networks (KAN) to enhance spatio-temporal interaction within D-Hyperpoints. Finally, we propose KAN-HyperpointNet, a spatio-temporal decoupled network architecture for 3D action recognition. Extensive experiments on two public datasets: MSR Action3D and NTU-RGB+D 60, demonstrate the state-of-the-art performance of our method.
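To make the D-Hyperpoint idea concrete, here is a minimal conceptual sketch of pairing a regional-momentary motion cue with a global-static posture cue for one frame of a point cloud sequence. This is illustrative only: the function name, the hand-crafted displacement/centering features, and the assumption of aligned point indices across frames are all ours, whereas the paper's D-Hyperpoint Embedding module is learned.

```python
import numpy as np

def d_hyperpoint_sketch(frame_prev, frame_curr, n_samples=64, rng=None):
    """Illustrative pairing of motion and posture cues (NOT the paper's
    learned embedding). Assumes frame_prev and frame_curr are (N, 3)
    arrays with corresponding point indices, which real point cloud
    sequences generally lack."""
    rng = rng or np.random.default_rng(0)
    # regional-momentary motion: displacement of a sampled point subset
    idx = rng.choice(len(frame_curr), size=min(n_samples, len(frame_curr)),
                     replace=False)
    motion = frame_curr[idx] - frame_prev[idx]
    # global-static posture: the whole frame, centered on its centroid
    posture = frame_curr - frame_curr.mean(axis=0)
    return motion, posture
```

The point of the pairing is that neither cue alone suffices: the motion term preserves limb micro-movements while the posture term preserves the macro-structure of the body at that instant.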
Problem

Research questions and friction points this paper is trying to address.

Balancing limb micro-movements and posture macro-structure in 3D action recognition
Capturing regional-momentary motion and global-static posture in point cloud sequences
Enhancing spatio-temporal interaction for improved 3D human action recognition
Innovation

Methods, ideas, or system contributions that make the work stand out.

D-Hyperpoint Embedding captures motion and posture
D-Hyperpoint KANsMixer enhances spatio-temporal interaction
KAN-HyperpointNet decouples spatio-temporal architecture
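The KANsMixer's distinguishing ingredient is the Kolmogorov-Arnold Network, which replaces fixed activations with a learnable univariate function on every input-output edge. The sketch below shows that core mechanic using a Gaussian radial-basis parameterization; the class name, grid range, and RBF basis are our simplifying assumptions (the KAN literature typically uses B-splines plus a residual activation), not the paper's implementation.

```python
import numpy as np

class KANLayer:
    """Minimal Kolmogorov-Arnold layer sketch: each input-output edge
    carries a learnable univariate function, parameterized here as a
    weighted sum of Gaussian radial basis functions on a fixed grid."""

    def __init__(self, in_dim, out_dim, n_basis=8, rng=None):
        rng = rng or np.random.default_rng(0)
        self.centers = np.linspace(-2.0, 2.0, n_basis)  # basis-function grid
        self.width = self.centers[1] - self.centers[0]
        # one coefficient vector per edge: (out_dim, in_dim, n_basis)
        self.coef = rng.normal(0.0, 0.1, (out_dim, in_dim, n_basis))

    def __call__(self, x):
        # x: (batch, in_dim) -> per-edge basis activations (batch, in_dim, n_basis)
        phi = np.exp(-((x[..., None] - self.centers) / self.width) ** 2)
        # y[b, o] = sum_i sum_k coef[o, i, k] * phi[b, i, k]
        return np.einsum("oik,bik->bo", self.coef, phi)
```

Because each edge function is nonlinear and independently learned, a stack of such layers can model the high-order spatio-temporal interactions within D-Hyperpoints that a plain MLP mixer captures only through fixed pointwise activations.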
👥 Authors
Zhaoyu Chen (TikTok)
Xing Li (Nanjing Forestry University)
Qian Huang (Hohai University)
Qiang Geng (Hohai University)
Tianjin Yang (Hohai University)
Shihao Han (The University of Hong Kong)