Egocentric Action-aware Inertial Localization in Point Clouds

πŸ“… 2025-05-20
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work addresses inertial localization from head-mounted IMUs within egocentric 3D point clouds, tackling two core challenges: trajectory drift induced by IMU sensor noise and the difficulty of modeling diverse human motions. The authors propose EAIL, an end-to-end framework that jointly optimizes localization and action recognition. First, it leverages human actions as spatial anchors to compensate for IMU drift. Second, it introduces a hierarchical multimodal contrastive alignment mechanism that fuses IMU time-series signals with local geometric features from the point cloud and reasons over them across time and space. Evaluated on multiple benchmarks, EAIL achieves state-of-the-art performance, reducing localization error by 32% and attaining 91.4% accuracy in action sequence recognition.


πŸ“ Abstract
This paper presents a novel inertial localization framework named Egocentric Action-aware Inertial Localization (EAIL), which leverages egocentric action cues from head-mounted IMU signals to localize the target individual within a 3D point cloud. Human inertial localization is challenging due to IMU sensor noise, which causes trajectory drift over time. The diversity of human actions further complicates IMU signal processing by introducing varied motion patterns. Nevertheless, we observe that some actions captured by the head-mounted IMU correlate with spatial environmental structures (e.g., bending down to look inside an oven, washing dishes next to a sink), and can therefore serve as spatial anchors to compensate for localization drift. The proposed EAIL framework learns such correlations via hierarchical multi-modal alignment. Assuming that a 3D point cloud of the environment is available, it contrastively learns modality encoders that align short-term egocentric action cues in IMU signals with local environmental features in the point cloud. These encoders are then used to reason over the IMU data and the point cloud across time and space to perform inertial localization. Interestingly, the same encoders can further be utilized to recognize the corresponding sequence of actions as a by-product. Extensive experiments demonstrate the effectiveness of the proposed framework over state-of-the-art inertial localization and inertial action recognition baselines.
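The contrastive alignment the abstract describes pairs an IMU-window embedding with the embedding of the local point-cloud region where that action occurred. The paper does not publish implementation details, so the sketch below is only illustrative: it shows a standard symmetric InfoNCE objective (as used in CLIP-style multi-modal alignment) over hypothetical encoder outputs, with matched IMU/point-cloud pairs on the diagonal of the similarity matrix and the rest of the batch acting as negatives. The encoder architectures, batch construction, and temperature are all assumptions, not EAIL's actual design.

```python
import numpy as np

def log_softmax(x, axis):
    # Numerically stable log-softmax along the given axis
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def symmetric_info_nce(imu_emb, pc_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired embeddings.

    Row i of `imu_emb` (an IMU-window embedding) and row i of `pc_emb`
    (the embedding of the co-located point-cloud patch) form a positive
    pair; all other rows in the batch serve as negatives.
    """
    # L2-normalize so the dot products below are cosine similarities
    imu = imu_emb / np.linalg.norm(imu_emb, axis=1, keepdims=True)
    pc = pc_emb / np.linalg.norm(pc_emb, axis=1, keepdims=True)
    logits = imu @ pc.T / temperature  # (N, N); matches lie on the diagonal
    diag = np.arange(logits.shape[0])
    loss_i2p = -log_softmax(logits, axis=1)[diag, diag].mean()  # IMU -> cloud
    loss_p2i = -log_softmax(logits, axis=0)[diag, diag].mean()  # cloud -> IMU
    return 0.5 * (loss_i2p + loss_p2i)

# Toy usage with random stand-ins for learned encoder outputs
rng = np.random.default_rng(0)
imu_feats = rng.standard_normal((8, 32))                    # 8 windows, 32-d
pc_feats = imu_feats + 0.1 * rng.standard_normal((8, 32))   # near-aligned pairs
loss = symmetric_info_nce(imu_feats, pc_feats)
```

Because positives sit on the diagonal, the loss is low when each IMU window is most similar to its own point-cloud patch and rises when pairs are mismatched, which is the property that lets the learned encoders act as spatial anchors downstream.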
Problem

Research questions and friction points this paper is trying to address.

Localizing individuals in 3D point clouds using head-mounted IMU signals
Compensating for IMU trajectory drift via action-environment correlations
Aligning egocentric action cues with spatial environmental features
Innovation

Methods, ideas, or system contributions that make the work stand out.

Egocentric action cues for 3D localization
Hierarchical multi-modal alignment learning
Contrastive learning of IMU and point cloud