🤖 AI Summary
To address the high memory overhead and poor scalability of contour tree computation on large-scale scientific datasets, this paper proposes a distributed contour tree construction method based on a pre-simplification strategy. A topological pre-simplification step is introduced prior to distributed hierarchical construction, significantly reducing memory consumption by eliminating redundant structural components. The method further integrates parallel branch decomposition with contour extraction to improve end-to-end processing efficiency. Evaluated on a dataset of 550 billion elements, the approach constructs a contour tree with over half a trillion nodes in under 15 minutes, the largest such tree reported to date, demonstrating strong scalability and practical efficiency. This work shows, for the first time, that full topological analysis of scalar fields on grids of this scale is practical.
📝 Abstract
Contour trees offer an abstract representation of the level set topology in scalar fields and are widely used in topological data analysis and visualization. However, applying contour trees to large-scale scientific datasets remains challenging due to scalability limitations. Recent developments in distributed hierarchical contour trees have addressed these challenges by enabling scalable computation across distributed systems. Building on these structures, advanced analytical tasks -- such as volumetric branch decomposition and contour extraction -- have been introduced to facilitate large-scale scientific analysis. Despite these advancements, such analytical tasks substantially increase memory usage, which hampers scalability. In this paper, we propose a pre-simplification strategy to significantly reduce the memory overhead associated with analytical tasks on distributed hierarchical contour trees. We demonstrate enhanced scalability through strong scaling experiments, constructing the largest known contour tree -- comprising over half a trillion nodes with complex topology -- in under 15 minutes on a dataset containing 550 billion elements.
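The paper does not spell out its pre-simplification algorithm here, but the general idea of shrinking a contour tree before downstream analysis is commonly realized by pruning low-persistence leaf arcs and contracting the regular (degree-2) nodes left behind. The sketch below is purely illustrative and assumes a toy in-memory tree (`values` maps nodes to scalar values, `edges` is a set of node pairs); it is not the distributed method described above.

```python
# Hypothetical sketch of persistence-based pre-simplification on a tiny
# in-memory contour tree. NOT the paper's distributed algorithm; it only
# illustrates why pruning shrinks the tree before branch decomposition.
from collections import defaultdict

def pre_simplify(values, edges, threshold):
    """Repeatedly remove leaf arcs whose persistence |f(leaf) - f(neighbor)|
    is below `threshold`, then contract degree-2 nodes, until stable.
    Returns the simplified edge set as frozenset pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    changed = True
    while changed:
        changed = False
        # Prune shallow leaf branches.
        for node in list(adj):
            if len(adj[node]) == 1:
                (nbr,) = adj[node]
                if abs(values[node] - values[nbr]) < threshold:
                    adj[nbr].discard(node)
                    del adj[node]
                    changed = True
        # Contract regular nodes left behind by pruning.
        for node in list(adj):
            if len(adj[node]) == 2:
                a, b = adj[node]
                adj[a].discard(node)
                adj[b].discard(node)
                adj[a].add(b)
                adj[b].add(a)
                del adj[node]
                changed = True
    return {frozenset((u, v)) for u in adj for v in adj[u]}
```

For example, on a tree with a shallow minimum C (value 0.5) hanging off a saddle B (value 5) between a deep minimum A (value 0) and a maximum D (value 10), a threshold of 5 prunes the C arc and contracts B, leaving the single arc A-D. In a distributed setting, applying such pruning per block before hierarchical merging is what reduces the memory footprint of later analytical tasks.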