🤖 AI Summary
Existing methods struggle to model local anatomical structure and inter-subject variability jointly from whole-brain tractograms, and they lack a unified, transferable representation. This work proposes TractFM, the first framework to enable end-to-end learning of context-aware embeddings that capture both streamline-level and subject-level semantics. The approach integrates a local streamline encoder with a permutation-equivariant whole-brain tractogram encoder and pretrains on dense anatomical bundle segmentation (labeling every streamline) to yield compact, transferable representations. Experiments across three tractography algorithms and five diffusion MRI datasets show that the frozen representations support accurate bundle segmentation and cross-dataset prediction of age and sex.
📝 Abstract
Diffusion MRI (dMRI) tractography is the only noninvasive approach for mapping white-matter pathways in the living human brain. It represents each brain as a tractogram: a large, unordered set of three-dimensional streamlines that includes information about both local streamline geometry and whole-brain anatomical organization. This structure makes tractograms a natural but challenging target for representation learning. Existing methods treat streamline classification and subject-level prediction as separate problems: streamline classifiers focus on geometric patterns, whereas subject-level prediction often depends on hand-crafted features. As a result, current methods do not learn reusable representations that connect streamline anatomy with whole-brain inter-subject variation. Here we introduce TractFM, a tractogram foundation model that learns reusable representations directly from whole-brain streamline sets. TractFM combines a local streamline encoder with a permutation-equivariant tractogram encoder, allowing all streamlines from a subject to be contextualized jointly in a single forward pass. Pretraining on dense anatomical tract parcellation, i.e., assigning anatomical labels to individual streamlines, yields two complementary representations: contextualized streamline-level embeddings for tract parcellation and compact subject-level descriptors for downstream prediction of subject phenotypes. Across three tractography algorithms and five dMRI datasets, TractFM transfers to both streamline-level and subject-level tasks. Its frozen representations achieve accurate tract parcellation and predict age and sex across independent datasets. These results show that whole-brain geometric context, learned once, can generalize across tractography pipelines, datasets, and prediction tasks.
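To make "permutation-equivariant" concrete, here is a minimal NumPy sketch of the general idea, not TractFM's actual architecture. It assumes a DeepSets-style layer in which each streamline embedding is updated from itself and the set mean, and a mean-pooled subject descriptor. The function names, layer sizes, and random weights are all hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_streamlines(streamlines, w_local):
    """Hypothetical local encoder: embed each streamline independently.

    streamlines: (n_streamlines, n_points, 3) array of 3D points.
    Returns an (n_streamlines, d) array of embeddings.
    """
    n = streamlines.shape[0]
    flat = streamlines.reshape(n, -1)  # flatten the points of each streamline
    return np.tanh(flat @ w_local)

def equivariant_layer(h, w_self, w_ctx):
    """DeepSets-style layer (an assumption, not the paper's encoder).

    Each row is updated from itself plus the set mean, so reordering
    the input rows reorders the output rows in the same way.
    """
    context = h.mean(axis=0, keepdims=True)  # order-independent summary
    return np.tanh(h @ w_self + context @ w_ctx)

def forward(streamlines, params):
    h = encode_streamlines(streamlines, params["local"])
    h = equivariant_layer(h, params["self"], params["ctx"])
    subject = h.mean(axis=0)  # permutation-invariant subject descriptor
    return h, subject

# Toy sizes and random weights, for illustration only.
n_streamlines, n_points, d = 50, 16, 8
params = {
    "local": rng.normal(size=(n_points * 3, d)) * 0.1,
    "self": rng.normal(size=(d, d)) * 0.1,
    "ctx": rng.normal(size=(d, d)) * 0.1,
}
tractogram = rng.normal(size=(n_streamlines, n_points, 3))
perm = rng.permutation(n_streamlines)

emb, subj = forward(tractogram, params)
emb_p, subj_p = forward(tractogram[perm], params)

# Shuffling the streamlines shuffles the embeddings identically
# (equivariance) and leaves the pooled descriptor unchanged (invariance).
print(np.allclose(emb[perm], emb_p), np.allclose(subj, subj_p))
```

In this sketch the per-streamline outputs play the role of contextualized streamline-level embeddings, and the pooled vector plays the role of a compact subject-level descriptor. A real model would also need to handle details this toy omits, such as variable streamline lengths and the fact that a streamline can be stored in either point order.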