VISUALCENT: Visual Human Analysis using Dynamic Centroid Representation

📅 2025-04-26
🤖 AI Summary
VISUALCENT targets two weaknesses of multi-person visual analysis: pose estimation and instance segmentation methods that generalize poorly, and limited robustness to occlusion and motion. It proposes a unified bottom-up framework built on a dynamic centroid representation. Two centroids are introduced, KeyCentroid (keypoint centroid) and MaskCentroid (mask centroid), and are combined with disk-shaped heatmap modeling and explicit centroid-driven pixel clustering to jointly address keypoint detection and instance segmentation. The design is intended to stay reliable under severe occlusion and rapid body movement. On the COCO and OCHuman benchmarks, the authors report that VISUALCENT outperforms existing methods in both mAP and frames per second (FPS), supporting real-time multi-person analysis. The implementation is available on the project page.

📝 Abstract
We introduce VISUALCENT, a unified human pose and instance segmentation framework that addresses generalizability and scalability limitations in multi-person visual human analysis. VISUALCENT leverages a centroid-based bottom-up keypoint detection paradigm and uses a Keypoint Heatmap incorporating Disk Representation and KeyCentroid to identify the optimal keypoint coordinates. For the unified segmentation task, an explicit keypoint is defined as a dynamic centroid called MaskCentroid to swiftly cluster pixels to a specific human instance during rapid changes in human body movement or in significantly occluded environments. Experimental results on the COCO and OCHuman datasets demonstrate VISUALCENT's accuracy and real-time performance advantages, outperforming existing methods in mAP scores and execution frame rate (frames per second). The implementation is available on the project page.
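To make the "Disk Representation" idea concrete, here is a minimal sketch of one common way to build a disk-shaped keypoint heatmap target: every pixel within a fixed radius of a keypoint is marked as positive. This is an illustration under stated assumptions, not the authors' implementation. The function name `disk_heatmap`, the binary 0/1 encoding, the integer keypoint coordinates, and the radius value are all hypothetical choices; this summary does not specify how the paper defines them.

```python
def disk_heatmap(height, width, keypoints, radius):
    """Return a height x width grid with 1.0 inside a disk around each keypoint.

    Assumption: keypoints are integer (x, y) pixel coordinates and the target
    is binary. The paper's actual encoding may differ.
    """
    heatmap = [[0.0] * width for _ in range(height)]
    r2 = radius * radius
    for kx, ky in keypoints:
        # Only visit the bounding square of the disk, clipped to the image.
        for y in range(max(0, ky - radius), min(height, ky + radius + 1)):
            for x in range(max(0, kx - radius), min(width, kx + radius + 1)):
                if (x - kx) ** 2 + (y - ky) ** 2 <= r2:
                    heatmap[y][x] = 1.0
    return heatmap


# Hypothetical example: one keypoint at the centre of a 7x7 grid.
hm = disk_heatmap(7, 7, [(3, 3)], radius=2)
for row in hm:
    print("".join("#" if v else "." for v in row))
```

Compared with a single-pixel target, a disk gives the detector a larger positive region per keypoint, which is one plausible reason to prefer it when keypoints are small or partially occluded.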
Problem

Research questions and friction points this paper is trying to address.

Addresses generalizability and scalability in multi-person visual human analysis
Improves keypoint detection using centroid-based bottom-up paradigm
Enhances segmentation via dynamic centroids for occluded or moving humans
Innovation

Methods, ideas, or system contributions that make the work stand out.

Centroid-based bottom-up keypoint detection paradigm
Keypoint Heatmap with Disk Representation and KeyCentroid
Dynamic MaskCentroid for swift pixel clustering
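The centroid-driven pixel clustering named in the last item can be sketched as follows. This is a simplified stand-in, not the method from the paper: it assumes each foreground pixel optionally carries a predicted offset vector pointing toward its instance centroid (a common device in bottom-up segmentation) and assigns the pixel to the nearest centroid. The names `assign_pixels`, `foreground`, `offsets`, and `centroids` are hypothetical.

```python
def assign_pixels(foreground, offsets, centroids):
    """Label each foreground pixel with the index of its nearest centroid.

    Assumption: `offsets` maps a pixel (x, y) to a predicted (dx, dy) vote
    toward its instance centroid; pixels without an entry vote for their own
    position. Ties go to the lower centroid index.
    """
    labels = {}
    for (x, y) in foreground:
        dx, dy = offsets.get((x, y), (0.0, 0.0))
        vx, vy = x + dx, y + dy  # where this pixel "thinks" its centroid is
        labels[(x, y)] = min(
            range(len(centroids)),
            key=lambda i: (centroids[i][0] - vx) ** 2 + (centroids[i][1] - vy) ** 2,
        )
    return labels


# Hypothetical example: two people with centroids at (2, 2) and (8, 2).
centroids = [(2, 2), (8, 2)]
foreground = [(1, 2), (3, 2), (7, 2), (9, 1), (5, 2)]
# The pixel at (5, 2) is equidistant from both centroids; its offset vote
# resolves the ambiguity in favour of the second person.
offsets = {(5, 2): (3.0, 0.0)}
labels = assign_pixels(foreground, offsets, centroids)
print(labels)
```

Because the centroids are recomputed per frame, an assignment of this kind can follow a person as they move, which matches the "dynamic" framing in the summary; how the paper actually predicts and uses MaskCentroid is not described here.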
Niaz Ahmad
Postdoctoral, TMU, Ontario
Visual Human Analysis using AI
Youngmoon Lee
Hanyang University
Real-Time AI, CPS, Robotics, Systems
Guanghui Wang
Department of Computer Science, Toronto Metropolitan University, Toronto, Canada.