Threshold-Protected Searchable Sharing: Privacy Preserving Aggregated-ANN Search for Collaborative RAG

📅 2025-07-23
🤖 AI Summary
This work addresses privacy-preserving approximate nearest neighbor (ANN) search for collaborative Retrieval-Augmented Generation (RAG). To reconcile private, locally held data repositories with high-dimensional vector retrieval, the paper proposes SP-A$^2$NN, a secure aggregated ANN search scheme that stays compatible with Hierarchical Navigable Small World (HNSW) indexing and is built on a threshold-based searchable sharing primitive. The scheme constructs a sharable bitgraph structure and extends it to support searches and dynamic insertions over shared data without compromising the underlying graph topology. Compared with a naive (undirected) graph-sharing approach that organizes graphs in the same HNSW manner, search complexity drops from $O(n^2)$ to $O(n)$. On the theory side, the paper introduces a leakage-guessing proof system: an interactive game, independent of the existing coin-toss game design, that analyzes privacy via reductions and aims to standardize leakage analysis.

📝 Abstract
LLM-powered search services have made data integration a significant trend. Combining individually held knowledge can significantly improve the relevance and quality of responses to specialized queries and make AI services more professional, yet progress on this trend is fundamentally hindered. Two key bottlenecks are the locality constraints of private data repositories and the need to maintain compatibility with mainstream search techniques, particularly Hierarchical Navigable Small World (HNSW) indexing for high-dimensional vector spaces. In this work, we develop a secure and privacy-preserving aggregated approximate nearest neighbor search (SP-A$^2$NN) with HNSW compatibility under a threshold-based searchable sharing primitive. A sharable bitgraph structure is constructed and extended to support searches and dynamic insertions over shared data without compromising the underlying graph topology. The approach reduces the complexity of a search from $O(n^2)$ to $O(n)$ compared to a naive (undirected) graph-sharing approach when graphs are organized in the identical HNSW manner. On the theoretical front, we explore a novel security analytical framework that incorporates privacy analysis via reductions. The proposed leakage-guessing proof system is built upon an entirely different interactive game that is independent of the existing coin-toss game design. Rather than being purely theoretical, this system is rooted in existing proof systems but goes beyond them to specifically address leakage concerns and standardize leakage analysis, one of the most critical security challenges amid AI's rapid development.
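The abstract names a threshold-based sharing primitive without giving its construction. As general background only, the sketch below shows the textbook idea of threshold sharing using Shamir's scheme, where any `threshold` of `num_parties` shares recover a secret. The field modulus `PRIME` and the function names `split_secret` and `reconstruct` are illustrative assumptions, and this is not claimed to be the primitive used in SP-A$^2$NN.

```python
import secrets

# Assumption: a Mersenne prime used as the field modulus for illustration.
PRIME = 2**127 - 1


def split_secret(secret, threshold, num_parties):
    """Split `secret` so that any `threshold` shares reconstruct it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    if not 1 <= threshold <= num_parties:
        raise ValueError("need 1 <= threshold <= num_parties")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_parties + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = split_secret(123456789, threshold=3, num_parties=5)
recovered = reconstruct(shares[:3])
```

Any three of the five shares yield the same value, while fewer than three reveal nothing about the secret in the information-theoretic sense. A searchable variant would need additional machinery that the abstract does not describe.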
Problem

Research questions and friction points this paper is trying to address.

Enables privacy-preserving aggregated ANN search for collaborative RAG
Overcomes locality constraints of private data repositories
Ensures HNSW compatibility while maintaining data security
Innovation

Methods, ideas, or system contributions that make the work stand out.

Secure aggregated ANN search with HNSW compatibility
Sharable bitgraph structure supporting searches and dynamic insertions
Leakage-guessing proof system for privacy analysis
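The page does not describe the internals of the bitgraph. Purely as an illustration of how a graph can be held as bit rows and split among parties, the sketch below encodes each node's outgoing edges as a bitmask and divides it into XOR shares. Everything here is an assumption made for the example: the names `adjacency_rows`, `xor_share`, and `recover_neighbors` are hypothetical, and the sharing is all-parties-required rather than threshold-based, so it should not be read as the paper's construction.

```python
import secrets
from functools import reduce


def adjacency_rows(num_nodes, edges):
    """Encode a directed graph as one bitmask per node (bit j set = edge i -> j)."""
    rows = [0] * num_nodes
    for src, dst in edges:
        rows[src] |= 1 << dst
    return rows


def xor_share(value, num_bits, num_parties):
    """Split a bitmask into XOR shares; all shares are needed to recover it."""
    shares = [secrets.randbits(num_bits) for _ in range(num_parties - 1)]
    last = value
    for s in shares:
        last ^= s
    shares.append(last)  # XOR of all shares equals the original value
    return shares


def recover_neighbors(row_shares, num_nodes):
    """Combine every party's share of one row and list the neighbor indices."""
    row = reduce(lambda a, b: a ^ b, row_shares)
    return [j for j in range(num_nodes) if (row >> j) & 1]


# Hypothetical 4-node graph shared among 3 parties.
rows = adjacency_rows(4, [(0, 1), (0, 3), (2, 1)])
shared_row0 = xor_share(rows[0], num_bits=4, num_parties=3)
neighbors_of_0 = recover_neighbors(shared_row0, 4)
```

Each individual share is a uniformly random bit string, so a single party learns nothing about the row it holds. Inserting a node in this toy layout would mean re-sharing the affected rows, which says nothing about how the paper handles dynamic insertions.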