🤖 AI Summary
Hybrid vector queries (HVQ) weight two similarity scores with a query-specific parameter α, and approximate nearest neighbor search (ANNS) indexes built for a fixed α degrade sharply when α changes between queries. To address this, the paper proposes the Dynamic Edge Navigation Graph (DEG), described as the first graph-based index structure supporting real-time adaptation to arbitrary α. DEG combines Pareto-frontier candidate generation, α-aware dynamic edge pruning, and edge-seed acceleration, so that neighborhood relationships for multiple α values are modeled explicitly during index construction. Evaluated on multiple real-world image–text datasets under frequent α-switching scenarios, DEG consistently outperforms state-of-the-art methods: it improves retrieval accuracy by up to 12.7% and reduces latency by up to 3.2×. The summary credits DEG as the first approach to achieve both high accuracy and low latency in dynamic HVQ settings.
📝 Abstract
Bimodal data, such as image-text pairs, has become increasingly prevalent in the digital era. The Hybrid Vector Query (HVQ) is an effective approach for querying such data and has recently garnered considerable attention from researchers. It calculates similarity scores for objects represented by two vectors using a weighted sum of each individual vector's similarity, with a query-specific parameter $\alpha$ determining the weight. Existing methods for HVQ typically construct Approximate Nearest Neighbor Search (ANNS) indexes with a fixed $\alpha$ value. This leads to significant performance degradation when the query's $\alpha$ changes dynamically across scenarios and needs. In this study, we introduce the Dynamic Edge Navigation Graph (DEG), a graph-based ANNS index that maintains efficiency and accuracy under changing $\alpha$ values. It includes three novel components: (1) a greedy Pareto frontier search algorithm that computes a candidate neighbor set for each node, comprising the node's approximate nearest neighbors for all possible $\alpha$ values; (2) a dynamic edge pruning strategy that determines the final edges from the candidate set and assigns each edge an active range, which enables the Relative Neighborhood Graph's pruning strategy to be applied dynamically based on the query's $\alpha$ value, skipping redundant edges at query time and achieving a better accuracy-efficiency trade-off; and (3) an edge seed method that accelerates the querying process. Extensive experiments on real-world datasets show that DEG outperforms existing methods under varying $\alpha$ values.
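The ideas in the abstract can be illustrated with a small, self-contained sketch. It is not the paper's implementation: the score form `alpha * sim_a + (1 - alpha) * sim_b`, the function names, and the representation of an edge's active range as a closed interval `(lo, hi)` are all assumptions made for illustration. The sketch shows why a Pareto frontier is a natural candidate set: for any α in [0, 1], the object with the highest weighted score is never dominated in both similarities, so it lies on the frontier.

```python
import random

def hybrid_score(sim_a, sim_b, alpha):
    """Weighted sum of two per-modality similarities (assumed form)."""
    return alpha * sim_a + (1.0 - alpha) * sim_b

def pareto_frontier(sims):
    """Indices of (sim_a, sim_b) pairs that no other pair dominates.

    Higher is better in both coordinates. Scanning in descending sim_a
    order, a pair is kept only if its sim_b beats every earlier pair.
    """
    order = sorted(range(len(sims)), key=lambda i: (-sims[i][0], -sims[i][1]))
    frontier, best_b = set(), float("-inf")
    for i in order:
        if sims[i][1] > best_b:
            frontier.add(i)
            best_b = sims[i][1]
    return frontier

def best_for_alpha(sims, alpha):
    """Index of the object with the highest hybrid score for this alpha."""
    return max(range(len(sims)), key=lambda i: hybrid_score(*sims[i], alpha))

def active_neighbors(edges, alpha):
    """Keep edges whose hypothetical active range [lo, hi] contains alpha."""
    return [dst for dst, lo, hi in edges if lo <= alpha <= hi]

# Toy data: 200 objects with random similarities to some query.
random.seed(0)
sims = [(random.random(), random.random()) for _ in range(200)]
frontier = pareto_frontier(sims)

# The top-1 object for each tested alpha falls inside the frontier.
winners = {best_for_alpha(sims, a / 10) for a in range(11)}
print(winners <= frontier, len(frontier), len(sims))

# Hypothetical out-edges of one node: (neighbor id, range start, range end).
edges = [(7, 0.0, 1.0), (8, 0.0, 0.3), (9, 0.4, 1.0)]
print(active_neighbors(edges, 0.5))
```

The frontier is typically far smaller than the full set, which is what makes it a practical pool of neighbor candidates. Filtering edges by range at query time reflects the abstract's description of skipping redundant edges for a given α, although how DEG actually computes and stores these ranges is not specified here.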