IDP-Bench: Benchmarking ability of LLMs to protect personal information in interdependent privacy contexts

📅 2026-06-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the neglect of interdependent privacy (IDP), where an individual's privacy may be compromised by others' actions, in prior work on LLM privacy. Drawing on contextual integrity theory, the study introduces IDP-Bench, which the authors describe as the first LLM benchmark for IDP scenarios. It evaluates eight open-source models across three levels of IDP reasoning, using two LLM judges, to assess how well models identify the subjects involved, the information attributes, and the appropriateness of sharing. Results show that the models are strong at recognizing co-ownership (6 of 8 exceed 90%) but weak at identifying secondary subjects (7 of 8 score below 74%) and at judging sharing appropriateness (5 of 8 score below 77%). Appropriateness judgments improve with model scale and tend to decline in smaller models, and prompt sensitivity remains high on IDP-specific questions.

📝 Abstract

Large language models (LLMs) are becoming widely deployed as personal AI assistants with access to sensitive user data, making privacy a major challenge for their design and evaluation. Prior work focuses mainly on individual-level risks, overlooking interdependent privacy (IDP), where one person's data may be revealed by others without their knowledge or consent. We address this gap by introducing IDP-Bench, the first LLM benchmark for IDP scenarios, grounded in the Contextual Integrity (CI) framework. We evaluate eight open-source LLMs on their understanding of IDP scenarios across three levels of IDP reasoning using two LLM judges. Results show strong co-ownership recognition (6/8 models exceed 90%) but persistent weaknesses in identifying CI parameters (information attribute, primary subject) and IDP-specific parameters such as secondary subjects, where 7/8 models score below 74%. Models also struggle to judge sharing appropriateness (5/8 scoring below 77%). While the ability to judge the appropriateness of sharing improves with scale, performance tends to decline in smaller models, and prompt sensitivity remains high on IDP-specific questions, highlighting the need for more targeted study of IDP in LLM privacy research. Data & code available at https://github.com/tisl-lab/Interdependent_Privacy_Bench.
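The abstract reports results as counts of models above or below a score threshold, based on verdicts from two LLM judges. How the two verdicts are combined is not stated here, so the following is only a minimal sketch under one assumed rule: a response counts as correct only when both judges mark it correct. The function names, model names, and verdicts are all hypothetical placeholders, not the paper's code or results.

```python
from typing import Dict, List, Tuple

# Each item is (judge_a_says_correct, judge_b_says_correct) for one model response.
Verdicts = List[Tuple[bool, bool]]


def model_score(verdicts: Verdicts) -> float:
    """Percentage of responses that BOTH judges marked correct (assumed rule)."""
    if not verdicts:
        raise ValueError("no verdicts to score")
    agreed_correct = sum(1 for a, b in verdicts if a and b)
    return 100.0 * agreed_correct / len(verdicts)


def judge_agreement(verdicts: Verdicts) -> float:
    """Percentage of responses on which the two judges gave the same verdict."""
    if not verdicts:
        raise ValueError("no verdicts to score")
    same = sum(1 for a, b in verdicts if a == b)
    return 100.0 * same / len(verdicts)


def count_above(scores: Dict[str, float], threshold: float) -> int:
    """Number of models whose score strictly exceeds the threshold."""
    return sum(1 for s in scores.values() if s > threshold)


# Made-up verdicts for two placeholder models; NOT results from the paper.
toy = {
    "model_a": [(True, True), (True, True), (True, False), (True, True)],
    "model_b": [(True, True), (False, False), (False, True), (True, True)],
}
scores = {name: model_score(v) for name, v in toy.items()}
agreement = {name: judge_agreement(v) for name, v in toy.items()}
print(scores)
print(agreement)
print(count_above(scores, 70.0))
```

Requiring both judges to agree is a conservative choice; averaging the two judges' scores or reporting them separately would be equally plausible readings of "two LLM judges".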

Problem

Research questions and friction points this paper is trying to address.

interdependent privacy

large language models

privacy protection

Contextual Integrity

personal information

Innovation

Methods, ideas, or system contributions that make the work stand out.

Interdependent Privacy

LLM Benchmark

Contextual Integrity