Filtered Approximate Nearest Neighbor Search in Vector Databases: System Design and Performance Analysis

📅 2026-02-11

🤖 AI Summary

This study addresses the lack of systematic evaluation of hybrid search, which combines semantic retrieval with metadata filtering, in existing vector databases. The authors propose a correlation metric, Global-Local Selectivity (GLS), that quantifies the relationship between filters and query vectors; construct MoReVec, a new relational benchmark dataset for filtered retrieval; and extend ANN-Benchmarks to evaluate filtered vector search in a unified way. Experiments integrate several filtering strategies into FAISS, Milvus, and pgvector with IVFFlat and HNSW indexes, and show that algorithmic adaptations inside the engine often matter more than raw index performance: Milvus achieves more stable recall through hybrid approximate/exact execution, pgvector's cost-based optimizer frequently selects suboptimal query plans, and IVFFlat outperforms HNSW for low-selectivity queries. The findings are synthesized into practical guidelines for selecting index types and configuring query optimizers for hybrid search workloads.

📝 Abstract

Retrieval-Augmented Generation (RAG) applications increasingly rely on Filtered Approximate Nearest Neighbor Search (FANNS) to combine semantic retrieval with metadata constraints. While algorithmic innovations for FANNS have been proposed, there remains a lack of understanding regarding how generic filtering strategies perform within Vector Databases. In this work, we systematize the taxonomy of filtering strategies and evaluate their integration into FAISS, Milvus, and pgvector. To provide a robust benchmarking framework, we introduce a new relational dataset, MoReVec, consisting of two tables, featuring 768-dimensional text embeddings and a rich schema of metadata attributes. We further propose the Global-Local Selectivity (GLS) correlation metric to quantify the relationship between filters and query vectors. Our experiments reveal that algorithmic adaptations within the engine often override raw index performance. Specifically, we find that: (1) Milvus achieves superior recall stability through hybrid approximate/exact execution; (2) pgvector's cost-based query optimizer frequently selects suboptimal execution plans, favoring approximate index scans even when exact sequential scans would yield perfect recall at comparable latency; and (3) partition-based indexes (IVFFlat) outperform graph-based indexes (HNSW) for low-selectivity queries. To facilitate this analysis, we extend the widely used ANN-Benchmarks to support filtered vector search and make it available online. Finally, we synthesize our findings into a set of practical guidelines for selecting index types and configuring query optimizers for hybrid search workloads.
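To make the notion of a filtering strategy concrete, the following minimal Python sketch contrasts two generic approaches: filtering before the vector search and filtering after it. It is an illustration under stated assumptions, not the paper's taxonomy or any engine's implementation. An exact brute-force scan stands in for an ANN index, the data is synthetic, selectivity is assumed to mean the fraction of rows that pass the filter, and all names (`make_dataset`, `pre_filter_search`, `post_filter_search`, `fetch_k`) are hypothetical.

```python
import math
import random


def make_dataset(n, dim, num_categories, seed=0):
    """Synthetic vectors plus one integer metadata attribute per row."""
    rng = random.Random(seed)
    vectors = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    categories = [rng.randrange(num_categories) for _ in range(n)]
    return vectors, categories


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pre_filter_search(vectors, categories, query, wanted, k):
    """Apply the metadata filter first, then rank only the matching rows.

    With an exact scan this yields the true filtered top-k.
    """
    candidates = [i for i, c in enumerate(categories) if c == wanted]
    candidates.sort(key=lambda i: l2(vectors[i], query))
    return candidates[:k]


def post_filter_search(vectors, categories, query, wanted, k, fetch_k):
    """Fetch the unfiltered top-fetch_k, then drop rows failing the filter.

    May return fewer than k rows when few rows match the filter.
    """
    order = sorted(range(len(vectors)), key=lambda i: l2(vectors[i], query))
    return [i for i in order[:fetch_k] if categories[i] == wanted][:k]


def recall(found, truth):
    """Fraction of the true filtered neighbors that were returned."""
    return len(set(found) & set(truth)) / len(truth) if truth else 1.0


vectors, categories = make_dataset(n=2000, dim=8, num_categories=50, seed=42)
query = [0.0] * 8
wanted, k = 3, 10

# Assumed definition: fraction of rows that satisfy the filter predicate.
selectivity = sum(c == wanted for c in categories) / len(categories)

truth = pre_filter_search(vectors, categories, query, wanted, k)
post = post_filter_search(vectors, categories, query, wanted, k, fetch_k=50)

print(f"selectivity={selectivity:.3f}")
print(f"post-filter returned {len(post)} of {k}, recall={recall(post, truth):.2f}")
```

In this sketch the post-filter result is always a prefix of the pre-filter result, so any recall loss comes purely from `fetch_k` being too small relative to how few rows match. Raising `fetch_k` trades latency for recall, which is one of the tuning decisions a hybrid engine or query optimizer has to make.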

Problem

Research questions and friction points this paper is trying to address.

Filtered Approximate Nearest Neighbor Search

Vector Databases

Retrieval-Augmented Generation

Filtering Strategies

Hybrid Search

Innovation

Methods, ideas, or system contributions that make the work stand out.

Filtered Approximate Nearest Neighbor Search

Vector Database

Global-Local Selectivity