SMA: Who Said That? Auditing Membership Leakage in Semi-Black-box RAG Controlling

📅 2025-08-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: In RAG/MRAG systems, retrieval augmentation and multimodal fusion obscure content provenance, so existing membership inference methods cannot distinguish whether generated content originates from pretraining data, external retrieval, or user input. This undermines the traceability of privacy leakage. Method: We propose SMA, the first source-aware membership auditing framework, which shifts membership inference from “whether memorized” to “where sourced” under a semi-black-box setting. SMA estimates the influence of input tokens through zeroth-order optimization, combining perturbation sampling with ridge regression. For cross-modal provenance, it relies on the semantic alignment between text and images in multimodal large language models (MLLMs) to turn image inputs into text, so that image retrieval traces can be attributed at the token level. Contribution/Results: SMA audits leakage from both textual and visual retrieval sources, and the experiments are reported to improve data provenance reliability and privacy accountability for complex generative systems.

📝 Abstract

Retrieval-Augmented Generation (RAG) and its multimodal extension, Multimodal Retrieval-Augmented Generation (MRAG), significantly improve the knowledge coverage and contextual understanding of Large Language Models (LLMs) by introducing external knowledge sources. However, retrieval and multimodal fusion obscure content provenance, rendering existing membership inference methods unable to reliably attribute generated outputs to pre-training, external retrieval, or user input, thus undermining privacy leakage accountability. To address these challenges, we propose the first Source-aware Membership Audit (SMA), which enables fine-grained source attribution of generated content in a semi-black-box setting with retrieval control capabilities. To address the environmental constraints of semi-black-box auditing, we further design an attribution estimation mechanism based on zeroth-order optimization, which robustly approximates the true influence of input tokens on the output through large-scale perturbation sampling and ridge regression modeling. In addition, SMA introduces a cross-modal attribution technique that projects image inputs into textual descriptions via MLLMs, enabling token-level attribution in the text modality, which for the first time facilitates membership inference on image retrieval traces in MRAG systems. This work shifts the focus of membership inference from 'whether the data has been memorized' to 'where the content is sourced from', offering a novel perspective for auditing data provenance in complex generative systems.
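
The attribution estimation step described in the abstract (perturbation sampling followed by ridge regression) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the random token-masking scheme, the sample count, the regularization strength, and the toy scoring function are all assumptions chosen for illustration. In a real audit, `score_fn` would query the semi-black-box system and return a scalar measuring how strongly the output is preserved.

```python
import numpy as np

def estimate_token_influence(tokens, score_fn, n_samples=500,
                             keep_prob=0.5, ridge_lambda=1.0, seed=0):
    """Estimate per-token influence using only queries to score_fn.

    Assumed scheme: randomly drop tokens, record the score of each
    perturbed input, then fit a ridge regression from the keep/drop
    masks to the scores. No gradients of the model are needed.
    """
    rng = np.random.default_rng(seed)
    n = len(tokens)
    # Each row is a binary mask: 1.0 keeps the token, 0.0 drops it.
    masks = (rng.random((n_samples, n)) < keep_prob).astype(float)
    scores = np.array([
        score_fn([tok for tok, keep in zip(tokens, row) if keep])
        for row in masks
    ])
    # Center features and targets so no intercept term is required.
    X = masks - masks.mean(axis=0)
    y = scores - scores.mean()
    # Closed-form ridge solution: (X^T X + lambda * I)^{-1} X^T y.
    return np.linalg.solve(X.T @ X + ridge_lambda * np.eye(n), X.T @ y)

def toy_score(kept_tokens):
    """Hypothetical stand-in for a black-box output score."""
    return 2.0 * ("paris" in kept_tokens) + 0.5 * ("france" in kept_tokens)

tokens = ["the", "capital", "of", "france", "is", "paris"]
weights = estimate_token_influence(tokens, toy_score)
top_token = tokens[int(np.argmax(weights))]
```

In this toy setup the largest weight falls on `paris`, a smaller one on `france`, and the remaining tokens receive weights near zero. The ridge penalty mainly matters when the number of queries is small relative to the number of tokens, which is the typical regime under a query budget.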

Problem

Research questions and friction points this paper is trying to address.

Audit content source attribution in RAG systems

Enable membership inference for multimodal retrieval traces

Address privacy leakage accountability in generative systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

Source-aware Membership Audit for content attribution

Zeroth-order optimization for attribution estimation

Cross-modal attribution via MLLMs for images
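
The cross-modal idea listed above (project a retrieved image into a textual description, then attribute at the token level) can be sketched as follows. Everything here is hypothetical: `describe_image` is a placeholder for an MLLM captioning call, the caption and answer strings are invented for the example, and simple leave-one-out deletion is swapped in for the paper's regression-based estimator to keep the sketch short.

```python
from typing import Callable, List, Tuple

def describe_image(image_id: str) -> str:
    """Hypothetical stand-in for an MLLM that captions a retrieved image."""
    captions = {"img_001": "a red bicycle leaning against a brick wall"}
    return captions[image_id]

def leave_one_out_attribution(
    tokens: List[str], score_fn: Callable[[List[str]], float]
) -> List[Tuple[str, float]]:
    """Score drop observed when each caption token is removed in turn."""
    base = score_fn(tokens)
    return [
        (tok, base - score_fn(tokens[:i] + tokens[i + 1:]))
        for i, tok in enumerate(tokens)
    ]

# Toy generated answer whose provenance is being audited (assumed).
ANSWER_TOKENS = {"the", "bicycle", "is", "red"}

def toy_overlap_score(kept_tokens: List[str]) -> float:
    """Hypothetical score: caption tokens that reappear in the answer."""
    return float(sum(tok in ANSWER_TOKENS for tok in kept_tokens))

caption_tokens = describe_image("img_001").split()
attributions = leave_one_out_attribution(caption_tokens, toy_overlap_score)
influential = {tok for tok, delta in attributions if delta > 0}
```

Here `influential` contains only the caption tokens that the toy answer depends on, which is the kind of signal that would suggest the answer draws on the retrieved image rather than on other sources.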

🔎 Similar Papers

Is My Data in Your Retrieval Database? Membership Inference Attacks Against Retrieval Augmented Generation