AI Summary
Existing tracking approaches for freehand 3D ultrasound reconstruction struggle to balance cost, invasiveness, and cumulative drift. This work proposes MLRecon, a low-cost, markerless reconstruction framework that uses a single off-the-shelf RGB-D camera and a vision foundation model for robust 6D ultrasound probe pose tracking. To mitigate drift, the method combines a vision-guided divergence detector with autonomous failure recovery and a two-stage pose refinement network that separates high-frequency jitter from low-frequency bias. Experiments report average position errors as low as 0.88 mm on complex scanning trajectories and sub-millimeter mean surface accuracy in the reconstructions, significantly outperforming competing sensor-aided and sensorless methods.
Abstract
Freehand 3D ultrasound (US) reconstruction promises volumetric imaging with the flexibility of standard 2D probes, yet existing tracking paradigms face a restrictive trilemma: marker-based systems demand prohibitive costs, inside-out methods require intrusive sensor attachment, and sensorless approaches suffer from severe cumulative drift. To overcome these limitations, we present MLRecon, a robust markerless 3D US reconstruction framework delivering drift-resilient 6D probe pose tracking using a single commodity RGB-D camera. Leveraging the generalization power of vision foundation models, our pipeline enables continuous markerless tracking of the probe, augmented by a vision-guided divergence detector that autonomously monitors tracking integrity and triggers failure recovery to ensure uninterrupted scanning. Crucially, we further propose a dual-stage pose refinement network that explicitly disentangles high-frequency jitter from low-frequency bias, effectively denoising the trajectory while maintaining the kinematic fidelity of operator maneuvers. Experiments demonstrate that MLRecon significantly outperforms competing sensorless and sensor-aided methods, achieving average position errors as low as 0.88 mm on complex trajectories and yielding high-quality 3D reconstructions with sub-millimeter mean surface accuracy. This establishes a new benchmark for low-cost, accessible volumetric US imaging in resource-limited clinical settings.
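To make the distinction between high-frequency jitter and low-frequency bias concrete, the following is a minimal sketch that splits a one-dimensional tracking-error signal into the two components with a centered moving average. It is not the dual-stage refinement network described above, which is learned. The filter, the window size, the synthetic error signal, and the names `moving_average` and `split_error` are all assumptions made for illustration, and the error signal presumes a reference trajectory is available (as it might be in training data).

```python
import math


def moving_average(signal, window):
    """Centered moving average with clamped edges (a simple low-pass filter)."""
    half = window // 2
    n = len(signal)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def split_error(error, window=9):
    """Split an error signal into a slow part (bias) and a fast residual (jitter)."""
    low = moving_average(error, window)
    high = [e - l for e, l in zip(error, low)]
    return low, high


# Hypothetical data: 2 s of position error along one axis, sampled at 100 Hz.
t = [i / 100 for i in range(200)]
bias = [0.5 * ti for ti in t]                                   # slow drift, in mm
jitter = [0.2 * math.sin(2 * math.pi * 25 * ti) for ti in t]    # 25 Hz noise, in mm
error = [b + j for b, j in zip(bias, jitter)]

low, high = split_error(error, window=9)
```

Here `low` approximates the slow drift and `high` the fast residual; the two always sum back to the original error signal, so the split loses no information. A learned refinement stage could then treat each component separately, which is the intuition behind handling jitter and bias in distinct stages.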