EMA: Approximate Nearest Neighbor Search with General Attribute Filtering and Dynamic Updates

📅 2026-05-30
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the limitations of existing approximate nearest neighbor (ANN) search methods in supporting general attribute filtering: they often incur high indexing overhead and substantial memory consumption, and they struggle with predicates of widely varying selectivity. To overcome these limitations, the authors propose EMA, an algorithm that attaches compact marker sketches with zero false negatives to the edges of a graph index, providing conservative predicate- and geometry-aware guidance. Combined with a bounded edge recovery mechanism, EMA integrates multi-predicate filtering into graph traversal during query processing. The approach supports dynamic updates over mixed numerical and categorical attributes while balancing expressiveness and efficiency. Experimental results show that EMA achieves speedups of 1.68× to 12.25× over state-of-the-art general-purpose filtered ANN methods across a range of workloads.
📝 Abstract
Filtering Approximate Nearest Neighbor (FANN) search is a critical and emerging task for strengthening the query capability of vector databases, supporting applications such as recommendation systems, retrieval-augmented generation (RAG), and agent memory. However, most existing methods are limited to range or label filtering, often incurring unacceptable index construction time and memory overhead. Predicate-agnostic approaches further struggle to handle a wide range of predicate selectivities effectively. In this paper, we propose EMA, a filtering ANN algorithm that supports multi-predicate queries over mixed numerical and categorical attributes, as well as efficient dynamic updates. EMA introduces Markers as compact summaries attached to graph edges, providing conservative predicate- and geometry-aware guidance with zero false negatives at the Marker level. During query processing, EMA performs Marker-augmented joint search with a bounded edge recovery mechanism, enabling efficient filtering while preserving graph navigability. Extensive experiments demonstrate that EMA achieves a 1.68× to 12.25× speedup over state-of-the-art general filtering ANN methods across diverse workloads.
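The Marker idea can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how Markers are encoded or how edge recovery is bounded. The sketch assumes a Marker is a min/max interval over one numerical attribute plus a set of categorical labels, summarizing an edge's target node and that node's direct neighbors. The names `Marker`, `build_markers`, and `filtered_search`, the 1-hop summary scope, and the per-node `recover` budget are all hypothetical choices made for illustration.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Marker:
    """Hypothetical edge summary: numeric interval plus label set."""
    lo: float = float("inf")
    hi: float = float("-inf")
    labels: set = field(default_factory=set)

    def add(self, value, label):
        self.lo, self.hi = min(self.lo, value), max(self.hi, value)
        self.labels.add(label)

    def may_match(self, lo, hi, labels):
        # Conservative test: returns False only if no summarized item can
        # satisfy (lo <= value <= hi and label in labels). False positives
        # are possible, false negatives are not.
        return self.lo <= hi and lo <= self.hi and bool(self.labels & labels)


def build_markers(graph, attrs):
    # Assumption: the marker on edge (u, v) summarizes v and v's neighbors.
    markers = {}
    for u, nbrs in graph.items():
        for v in nbrs:
            m = Marker()
            for w in [v, *graph[v]]:
                m.add(*attrs[w])
            markers[(u, v)] = m
    return markers


def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def filtered_search(graph, vecs, attrs, markers, query, pred,
                    k=2, entry=0, recover=1):
    """Best-first traversal that skips edges whose marker rules out a match,
    but re-admits up to `recover` skipped edges per node so the walk can
    still cross regions with no matching items. No early termination is
    applied, to keep the toy short."""
    lo, hi, labels = pred
    visited = {entry}
    frontier = [(dist(vecs[entry], query), entry)]
    hits = []
    while frontier:
        d, u = heapq.heappop(frontier)
        value, label = attrs[u]
        if lo <= value <= hi and label in labels:  # exact check on the node
            hits.append((d, u))
        kept, pruned = [], []
        for v in graph[u]:
            if v in visited:
                continue
            ok = markers[(u, v)].may_match(lo, hi, labels)
            (kept if ok else pruned).append(v)
        pruned.sort(key=lambda v: dist(vecs[v], query))
        for v in kept + pruned[:recover]:  # bounded edge recovery
            visited.add(v)
            heapq.heappush(frontier, (dist(vecs[v], query), v))
    return [u for _, u in sorted(hits)[:k]]


# Toy data: six points on a line, chained into a bidirectional graph.
vecs = {i: (float(i), 0.0) for i in range(6)}
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
attrs = {0: (10, "a"), 1: (20, "b"), 2: (30, "a"),
         3: (40, "b"), 4: (50, "a"), 5: (60, "b")}
markers = build_markers(graph, attrs)
pred = (45, 65, {"a", "b"})  # value in [45, 65] AND label in {a, b}

print(filtered_search(graph, vecs, attrs, markers, (4.2, 0.0), pred))
# [4, 5]
print(filtered_search(graph, vecs, attrs, markers, (4.2, 0.0), pred, recover=0))
# []
```

In this toy setup the first edges out of the entry node are pruned because their markers cannot overlap the predicate. With `recover=0` the traversal therefore stalls and returns nothing, while a budget of one recovered edge per node is enough to reach the matching nodes 4 and 5. This is only meant to show why a recovery mechanism helps keep the graph navigable under pruning.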
Problem

Research questions and friction points this paper is trying to address.

Approximate Nearest Neighbor Search
Attribute Filtering
Dynamic Updates
Vector Databases
Multi-predicate Queries
Innovation

Methods, ideas, or system contributions that make the work stand out.

Filtering Approximate Nearest Neighbor
Dynamic Updates
Predicate-Agnostic Filtering
Graph-Based ANN
Marker-Augmented Search