Bi-modal Prediction and Transformation Coding for Compressing Complex Human Dynamics

📅 2025-09-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

KeyNode-Driven codecs lose compression efficiency on dynamic human motion sequences with rapid motion and strong non-rigid deformations. To address this, the paper proposes a bi-modal prediction and region-adaptive transform coding framework. It combines semantic segmentation with hybrid rigid/affine transform modeling: affine transforms are applied only to regions with significant deformation (e.g., the torso), and rigid transforms are used elsewhere. A Lagrangian rate-distortion optimization selects which components of the decomposed affine model are coded, so that redundant parameters are not transmitted. This strategy preserves compression efficiency in simple, quasi-rigid regions while improving reconstruction fidelity for complex non-rigid deformations. The paper reports an average bitrate reduction of 33.81% relative to the baseline.

📝 Abstract

For dynamic human motion sequences, the original KeyNode-Driven codec often struggles to retain compression efficiency when confronted with rapid movements or strong non-rigid deformations. This paper proposes a novel Bi-modal coding framework that enhances the flexibility of motion representation by integrating semantic segmentation and region-specific transformation modeling. The rigid transformation model (rotation and translation) is extended with a hybrid scheme that selectively applies affine transformations (rotation, translation, scaling, and shearing) only to deformation-rich regions (e.g., the torso, where loose clothing induces high variability), while retaining rigid models elsewhere. The affine model is decomposed into minimal parameter sets for efficient coding, and these sets are combined through a component selection strategy guided by a Lagrangian Rate-Distortion optimization. The results show that the Bi-modal method achieves more accurate mesh deformation, especially in sequences involving complex non-rigid motion, without compromising compression efficiency in simpler regions, with an average bit-rate saving of 33.81% compared to the baseline.
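The Lagrangian rate-distortion decision mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual cost J = D + λR and picks, for each region, the mode with the lowest cost. The function name `select_mode`, the region names, the distortion and rate figures, and the value of λ are all hypothetical and chosen only for illustration; they are not taken from the paper. The rates assume 6 rigid and 12 affine parameters at 16 bits each.

```python
def select_mode(candidates, lam):
    """Return the candidate name with the lowest cost J = D + lam * R.

    candidates maps a mode name to a (distortion, rate_in_bits) pair.
    """
    return min(
        candidates,
        key=lambda name: candidates[name][0] + lam * candidates[name][1],
    )


# Hypothetical (distortion, rate) pairs per region; not measured data.
# Rates assume 6 rigid / 12 affine parameters at 16 bits per parameter.
regions = {
    "torso": {"rigid": (4.0, 96), "affine": (1.0, 192)},
    "forearm": {"rigid": (0.20, 96), "affine": (0.15, 192)},
}

LAMBDA = 0.01  # illustrative trade-off weight between distortion and rate

choices = {region: select_mode(modes, LAMBDA) for region, modes in regions.items()}
print(choices)
```

With these made-up numbers the torso selects the affine mode, because the drop in distortion outweighs the extra bits, while the forearm stays rigid. A larger λ shifts more regions toward the cheaper rigid mode.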

Problem

Research questions and friction points this paper is trying to address.

Improving compression of human motion with rapid movements

Enhancing representation of non-rigid deformations in dynamics

Reducing bit-rate while maintaining mesh deformation accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Bi-modal coding integrates semantic segmentation with region-specific transformation modeling

Hybrid scheme applies affine transformations selectively

Lagrangian optimization guides component selection strategy
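To make the idea of decomposing an affine model into a small set of components concrete, here is a toy 2D sketch. It assumes the factorisation M = R(θ) · diag(sx, sy) · [[1, k], [0, 1]], which is a hand-written QR decomposition; the paper works on 3D meshes and its exact decomposition is not given in this summary, so the function names and the choice of factorisation are assumptions.

```python
import math


def decompose_affine_2d(m):
    """Split a 2x2 linear map into (theta, sx, sy, k).

    Assumes M = R(theta) @ diag(sx, sy) @ [[1, k], [0, 1]] and a
    non-zero first column.
    """
    (a, b), (c, d) = m
    sx = math.hypot(a, c)            # length of the first column
    theta = math.atan2(c, a)         # rotation angle of the first column
    cos_t, sin_t = a / sx, c / sx
    u01 = cos_t * b + sin_t * d      # upper-right entry of R^T @ M
    sy = -sin_t * b + cos_t * d      # lower-right entry of R^T @ M
    k = u01 / sx                     # shear factor
    return theta, sx, sy, k


def compose_affine_2d(theta, sx, sy, k):
    """Rebuild the 2x2 matrix from rotation, scales, and shear."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return (
        (cos_t * sx, cos_t * sx * k - sin_t * sy),
        (sin_t * sx, sin_t * sx * k + cos_t * sy),
    )


matrix = compose_affine_2d(0.3, 1.2, 0.8, 0.1)
theta, sx, sy, k = decompose_affine_2d(matrix)
```

For a purely rigid motion the decomposition yields sx = sy = 1 and k = 0, so a component selection step could skip the scaling and shearing parameters for that region and code only the rotation.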
