🤖 AI Summary
This work addresses the lack of effective self-supervised foundation models for 3D neural imaging, which leaves neuron segmentation methods heavily dependent on extensive annotations and prone to poor structural fidelity. We present the first successful adaptation of the 2D vision foundation model DINOv3 to 3D neuron segmentation, introducing an inflation-based strategy that expands 2D filters to initialize the 3D network and a novel topology-aware skeleton loss that enforces structural integrity in the reconstructed neuronal morphology. By combining DINOv3’s semantic priors with topological constraints, our approach consistently outperforms existing methods on four neuronal imaging datasets, with average gains of 2.9% in Entire Structure Average, 2.8% in Different Structure Average, and 3.8% in Percentage of Different Structure.
📝 Abstract
2D visual foundation models such as DINOv3, a self-supervised model trained on large-scale natural images, have demonstrated strong zero-shot generalization, capturing both rich global context and fine-grained structural cues. However, an analogous 3D foundation model for downstream volumetric neuroimaging is still lacking, largely because of the challenges of 3D image acquisition and the scarcity of high-quality annotations. To address this gap, we propose adapting the 2D visual representations learned by DINOv3 to a 3D biomedical segmentation model, enabling more data-efficient and morphologically faithful neuronal reconstruction. Specifically, we design an inflation-based adaptation strategy that inflates 2D filters into 3D operators, preserving the semantic priors of DINOv3 while adapting to 3D neuronal volume patches. In addition, we introduce a topology-aware skeleton loss that explicitly enforces the structural fidelity of graph-based neuronal arbor reconstruction. Extensive experiments on four neuronal imaging datasets, two from BigNeuron and two public datasets, NeuroFly and CWMBS, demonstrate consistent improvements in reconstruction accuracy over state-of-the-art methods, with average gains of 2.9% in Entire Structure Average, 2.8% in Different Structure Average, and 3.8% in Percentage of Different Structure. Code: https://github.com/yy0007/NeurINO.
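The abstract does not spell out how the 2D filters are inflated, so the following is only a minimal sketch of one common scheme, assumed here for illustration: replicate each 2D kernel along a new depth axis and divide by the depth, so that a volume that is constant along depth produces the same response as the original 2D filter. The function name `inflate_conv2d_weight`, the weight layout `(out, in, kH, kW)`, and the 16-voxel patch size are all hypothetical choices, not taken from the paper or its repository.

```python
import numpy as np


def inflate_conv2d_weight(w2d: np.ndarray, depth: int) -> np.ndarray:
    """Inflate a 2D conv weight (out, in, kH, kW) into (out, in, depth, kH, kW).

    Assumed scheme: copy the 2D kernel `depth` times along a new axis and
    rescale by 1/depth so the summed response is preserved.
    """
    if w2d.ndim != 4:
        raise ValueError("expected a weight of shape (out, in, kH, kW)")
    if depth < 1:
        raise ValueError("depth must be a positive integer")
    w3d = np.repeat(w2d[:, :, None, :, :], depth, axis=2)
    return w3d / depth


# Hypothetical shapes: 8 output channels, 3 input channels, 16x16 patch kernel.
rng = np.random.default_rng(0)
w2d = rng.standard_normal((8, 3, 16, 16))
w3d = inflate_conv2d_weight(w2d, depth=16)

# Sanity check: stack one 2D patch along depth to build a "flat" volume,
# then compare the 2D and inflated 3D filter responses on it.
patch2d = rng.standard_normal((3, 16, 16))
volume = np.repeat(patch2d[:, None, :, :], 16, axis=1)  # (in, D, H, W)
resp2d = np.einsum("oihw,ihw->o", w2d, patch2d)
resp3d = np.einsum("oidhw,idhw->o", w3d, volume)
```

Under this assumption the 3D layer starts out behaving like the pretrained 2D layer on depth-constant input, which is one plausible way to carry the 2D priors into a volumetric network before fine-tuning; the authors' actual strategy may differ.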