🤖 AI Summary
Representing tree-like geometric structures, such as 3D vascular networks, as compact vectors remains challenging. This paper proposes VeTTA, a framework built from two sequentially trained Transformer-based autoencoders: Stage I learns geometric embeddings of individual vessel segments from points sampled along each curve; Stage II encodes the full tree topology into a single latent vector, reusing the Stage I segment embeddings, and decodes it recursively so that the reconstruction is always a valid tree. The design is hierarchical: geometric detail is handled at the segment level and topological structure at the tree level. On a synthetic 2D tree dataset and a real 3D coronary artery dataset, the authors report superior reconstruction fidelity, accurate topology preservation, and realistic latent-space interpolations. The approach also needs substantially less GPU memory than 3D convolutional models, which makes large-scale training practical.
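To make the Stage I input concrete, the sketch below resamples a vessel centerline to a fixed number of points. The summary says only that points are sampled along each curve; uniform arc-length spacing, the function name `resample_polyline`, and the plain-tuple point format are assumptions made for illustration, not the paper's stated procedure.

```python
import math

def resample_polyline(points, n_samples):
    """Resample a polyline to n_samples points spaced uniformly by arc length.

    Assumption: uniform arc-length spacing; the paper's sampling scheme
    is not specified in this summary.
    """
    if n_samples < 2 or len(points) < 2:
        raise ValueError("need at least two input points and two samples")

    # Cumulative arc length at each input vertex.
    cum = [0.0]
    for p, q in zip(points, points[1:]):
        cum.append(cum[-1] + math.dist(p, q))
    total = cum[-1]

    out = []
    seg = 0
    for i in range(n_samples):
        target = total * i / (n_samples - 1)
        # Advance to the input segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = 0.0 if seg_len == 0 else (target - cum[seg]) / seg_len
        p, q = points[seg], points[seg + 1]
        # Linear interpolation between the two segment endpoints.
        out.append(tuple(a + t * (b - a) for a, b in zip(p, q)))
    return out
```

A fixed sample count gives every segment the same input length regardless of how densely the original centerline was traced, which is convenient for a sequence model.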
📝 Abstract
We introduce a novel framework for learning vector representations of tree-structured geometric data, focusing on 3D vascular networks. Our approach employs two sequentially trained Transformer-based autoencoders. In the first stage, the Vessel Autoencoder captures continuous geometric details of individual vessel segments by learning embeddings from sampled points along each curve. In the second stage, the Vessel Tree Autoencoder encodes the topology of the vascular network as a single vector representation, leveraging the segment-level embeddings from the first model. A recursive decoding process ensures that the reconstructed topology is a valid tree structure. Compared to 3D convolutional models, the proposed approach substantially lowers GPU memory requirements, facilitating large-scale training. Experimental results on a 2D synthetic tree dataset and a 3D coronary artery dataset demonstrate superior reconstruction fidelity, accurate topology preservation, and realistic interpolations in latent space. Our scalable framework, named VeTTA, offers precise, flexible, and topologically consistent modeling of anatomical tree structures in medical imaging.
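The claim that recursive decoding yields a valid tree can be illustrated with a toy example. The sketch below is not the Vessel Tree Autoencoder: `toy_step` is a hypothetical stand-in for a learned decoder step, and `Node`, `decode`, `count`, and the `max_depth` bound are names introduced here. It shows only the structural argument: if every recursive call creates exactly one new node and attaches it to its caller, the result has one parent per non-root node and no cycles.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    segment: float                                  # stand-in for a decoded segment embedding
    children: list = field(default_factory=list)

def toy_step(latent):
    """Hypothetical stand-in for a learned decoder step.

    Returns (segment value, child latents). A trained model would predict
    these; here a latent above 1 splits into two halves, otherwise the
    branch ends.
    """
    children = [latent / 2, latent / 2] if latent > 1 else []
    return latent, children

def decode(latent, step=toy_step, max_depth=8):
    """Recursively expand a latent into a tree; each call creates exactly one node."""
    segment, child_latents = step(latent)
    node = Node(segment)
    if max_depth > 0:                               # depth bound guarantees termination
        node.children = [decode(c, step, max_depth - 1) for c in child_latents]
    return node

def count(node):
    """Return (number of nodes, number of edges) of the decoded structure."""
    n, e = 1, len(node.children)
    for child in node.children:
        cn, ce = count(child)
        n, e = n + cn, e + ce
    return n, e

nodes, edges = count(decode(4.0))
assert edges == nodes - 1                           # holds for any tree
```

Because validity follows from the recursion itself, no post-hoc repair of the output graph is needed, whatever the decoder step predicts.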