Enterprise AI Must Enforce Participant-Aware Access Control

📅 2025-09-18
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
In enterprise multi-user settings, large language models (LLMs) risk leaking sensitive data to unauthorized users, both knowledge internalized during fine-tuning and documents fetched at inference time by retrieval-augmented generation (RAG). Existing defenses such as prompt sanitization and output filtering are probabilistic and cannot guarantee protection, because no fine-grained access control is enforced in the pipeline. Method: We propose the first systematic LLM security paradigm embedding deterministic access control, centered on "participant authorization." Our approach dynamically verifies user permissions throughout the fine-tuning and RAG inference pipeline, enabling fine-grained visibility control over context documents and model outputs, and integrates classical access control theory with modern LLM techniques to build an end-to-end secure architecture resilient to diverse data exfiltration attacks. Contribution/Results: The framework has been deployed in Microsoft Copilot Tuning, demonstrating practical efficacy and scalability in production-grade enterprise AI systems.
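
A minimal sketch of what participant-aware gating of retrieved context might look like at RAG inference time. The Document class and filter_for_participants helper are illustrative assumptions, not the deployed Copilot Tuning implementation; the point is that the check is a deterministic set-containment test rather than a probabilistic filter.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    authorized_users: frozenset  # ACL: users permitted to read this document

def filter_for_participants(candidates, participants):
    """Deterministic gate: a retrieved document enters the prompt only if
    every participant in the interaction appears on its ACL."""
    participants = set(participants)
    return [d for d in candidates if participants <= d.authorized_users]

# Usage: similarity-search results are gated *before* prompt construction,
# so no downstream probabilistic filter is relied upon.
docs = [
    Document("d1", "Q3 acquisition memo", frozenset({"alice", "bob"})),
    Document("d2", "Public product FAQ", frozenset({"alice", "bob", "carol"})),
]
print([d.doc_id for d in filter_for_participants(docs, {"alice", "carol"})])
# -> ['d2']  (d1 is excluded because carol is not authorized to see it)
```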

📝 Abstract
Large language models (LLMs) are increasingly deployed in enterprise settings where they interact with multiple users and are trained or fine-tuned on sensitive internal data. While fine-tuning enhances performance by internalizing domain knowledge, it also introduces a critical security risk: leakage of confidential training data to unauthorized users. These risks are exacerbated when LLMs are combined with Retrieval-Augmented Generation (RAG) pipelines that dynamically fetch contextual documents at inference time. We demonstrate data exfiltration attacks on AI assistants where adversaries can exploit current fine-tuning and RAG architectures to leak sensitive information by leveraging the lack of access control enforcement. We show that existing defenses, including prompt sanitization, output filtering, system isolation, and training-level privacy mechanisms, are fundamentally probabilistic and fail to offer robust protection against such attacks. We take the position that only a deterministic and rigorous enforcement of fine-grained access control during both fine-tuning and RAG-based inference can reliably prevent the leakage of sensitive data to unauthorized recipients. We introduce a framework centered on the principle that any content used in training, retrieval, or generation by an LLM is explicitly authorized for all users involved in the interaction. Our approach offers a simple yet powerful paradigm shift for building secure multi-user LLM systems that are grounded in classical access control but adapted to the unique challenges of modern AI workflows. Our solution has been deployed in Microsoft Copilot Tuning, a product offering that enables organizations to fine-tune models using their own enterprise-specific data.
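
The same invariant applies to the training side: content may enter a fine-tuning corpus only if every user who will query the resulting model is authorized for it. The sketch below is a speculative illustration of that idea; build_finetune_corpus and the acl field are hypothetical names, not the paper's API.

```python
from typing import Iterable

def build_finetune_corpus(documents: Iterable[dict],
                          model_audience: set[str]) -> list[dict]:
    """Admit a document into the fine-tuning corpus only if every user who may
    later query the fine-tuned model is on its access-control list (ACL)."""
    return [d for d in documents if model_audience <= set(d["acl"])]

# Usage: one model variant per audience, so memorized knowledge can never
# surface to a user who was not authorized for the underlying documents.
corpus = [
    {"id": "hr-policy", "text": "...", "acl": ["alice", "bob", "carol"]},
    {"id": "deal-memo", "text": "...", "acl": ["alice"]},
]
print([d["id"] for d in build_finetune_corpus(corpus, {"alice", "bob"})])
# -> ['hr-policy']
```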
Problem

Research questions and friction points this paper is trying to address.

Preventing data leakage in enterprise LLMs
Addressing access control gaps in fine-tuning
Securing RAG pipelines against unauthorized access
Innovation

Methods, ideas, or system contributions that make the work stand out.

Enforcing participant-aware access control in AI
Deterministic access control during fine-tuning and inference
Framework ensuring explicit authorization for all content
Shashank Shreedhar Bhatt
Microsoft Corporation
Tanmay Rajore
Microsoft Corporation
Khushboo Aggarwal
Microsoft Corporation
Ganesh Ananthanarayanan
Senior Principal Researcher at Microsoft
Systems & Networking
Ranveer Chandra
Managing Director, Research for Industry, GM Networking Research, Microsoft Research
Networking, Wireless, Food, Agriculture, Space
Nishanth Chandran
Senior Principal Researcher, Microsoft Research, India
Cryptography, Security
Suyash Choudhury
Microsoft Corporation
Divya Gupta
Microsoft Corporation
Emre Kiciman
Microsoft Corporation
Sumit Kumar Pandey
Microsoft Corporation
Srinath Setty
Senior Principal Researcher at Microsoft Research
Security, Cryptography, Distributed Systems
Rahul Sharma
Microsoft Corporation
Teijia Zhao
Microsoft Corporation