🤖 AI Summary
To address the error-proneness and analytical intractability of manually authored access control policies in cloud environments, this paper investigates large language models (LLMs) for automated policy synthesis and for policy analysis. The authors evaluate diverse LLMs on generating policies from given specifications and introduce a semantic-based request summarization approach that combines LLMs with symbolic techniques to produce a precise characterization of the requests a policy allows. Key findings: while LLMs reliably generate syntactically correct policies, permissiveness is a problem — reasoning LLMs produce policies equivalent to the given specification 93.7% of the time, substantially outperforming non-reasoning LLMs (45.8%). Overall, fully automated policy generation still faces significant hurdles, but LLMs combined with symbolic approaches show promising results for analyzing existing policies.
📝 Abstract
Cloud computing is ubiquitous, with a growing number of services hosted on the cloud every day. Typical cloud computing systems allow administrators to write policies implementing access control rules that specify how access to private data is governed. These policies must be written manually and, due to their complexity, are often error-prone. Moreover, existing policies often implement complex access control specifications and can therefore be difficult to analyze precisely to determine whether their behavior is exactly as intended. Recently, Large Language Models (LLMs) have shown great success in automated code synthesis and summarization. Given this success, they could potentially be used to automatically generate access control policies or to aid in understanding existing ones. In this paper, we explore the effectiveness of LLMs for access control policy synthesis and summarization. Specifically, we first investigate diverse LLMs for access control policy synthesis, finding that although LLMs can effectively generate syntactically correct policies, they have permissiveness issues: the generated policies are equivalent to the given specification only 45.8% of the time for non-reasoning LLMs and 93.7% of the time for reasoning LLMs. We then investigate how LLMs can be used to analyze policies by introducing a novel semantic-based request summarization approach, which leverages LLMs to generate a precise characterization of the requests allowed by a policy. Our results show that while there are significant hurdles in leveraging LLMs for automated policy generation, LLMs show promising results when combined with symbolic approaches for analyzing existing policies.
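The "permissiveness issue" described above — a generated policy that is syntactically valid but allows more (or fewer) requests than the specification intends — can be made concrete with a small sketch. The snippet below is illustrative only, not the paper's implementation: the principals, actions, resources, and policy functions are hypothetical, and it compares policies by enumerating a small finite request domain, whereas a real analysis would use symbolic reasoning (e.g., SMT solving) over the full request space.

```python
# Hypothetical sketch: detecting a permissiveness gap between a
# specification and an (imagined) LLM-generated policy by enumerating
# a finite request domain. All names below are illustrative.
from itertools import product

PRINCIPALS = ["alice", "bob"]
ACTIONS = ["s3:GetObject", "s3:PutObject"]
RESOURCES = ["bucket/public/*", "bucket/private/*"]

def spec(principal, action, resource):
    # Intended rule: everyone may read public objects;
    # only alice may write.
    if action == "s3:GetObject" and resource == "bucket/public/*":
        return True
    return principal == "alice" and action == "s3:PutObject"

def generated(principal, action, resource):
    # Hypothetical generated policy that is overly permissive:
    # it lets any principal write, not just alice.
    if action == "s3:GetObject" and resource == "bucket/public/*":
        return True
    return action == "s3:PutObject"

def compare(p, q):
    """Return requests allowed only by p and only by q."""
    only_p, only_q = [], []
    for req in product(PRINCIPALS, ACTIONS, RESOURCES):
        a, b = p(*req), q(*req)
        if a and not b:
            only_p.append(req)
        elif b and not a:
            only_q.append(req)
    return only_p, only_q

missing, extra = compare(spec, generated)
print("requests the generated policy wrongly allows:", extra)
```

Here the two policies agree on every read request, but the generated one additionally grants `bob` write access — exactly the kind of subtle over-permissiveness that motivates checking generated policies for semantic equivalence rather than mere syntactic validity.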