Cross-Vendor Sola ISPM Benchmark: Evaluating Agentic AI for Federated Identity Security Reasoning

📅 2026-06-01

🤖 AI Summary

This study addresses Identity Security Posture Management (ISPM) in heterogeneous multi-cloud and multi-SaaS environments, where vendor diversity impedes unified detection of misconfigurations and privilege escalation paths. The authors present an ISPM benchmark spanning eight enterprise platforms, including AWS, Okta, Azure AD, and Google Workspace, with 50 data-grounded multi-hop tasks that require AI agents to perform cross-system entity resolution and correlation analysis. They evaluate the Sola AI Agent under five context configurations, ranging from no injected metadata to full schema, graph, and retrieval context, and score it with a multidimensional protocol covering answer correctness, evidentiary grounding, structural join fidelity, retrieval quality, and SQL equivalence. Structured relational context yields a relative improvement of approximately 34% in answer correctness, with the largest gains coming from cross-vendor graph topology; under full context, the best configuration reaches 78% correctness and reduces the complete failure rate to 4%. These results point to the critical role of explicit relational context in cross-domain identity security reasoning.
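As an illustration of what an SQL-equivalence check can look like, here is a minimal sketch that compares two queries by executing them and comparing their result rows as multisets. This is an assumption about one common approach, not the benchmark's actual scoring method; the table names (`okta_users`, `aws_iam_users`), columns, and sample rows are hypothetical.

```python
import sqlite3
from collections import Counter


def results_equivalent(conn, sql_a, sql_b):
    """Return True when both queries yield the same multiset of rows."""
    rows_a = Counter(conn.execute(sql_a).fetchall())
    rows_b = Counter(conn.execute(sql_b).fetchall())
    return rows_a == rows_b


# Hypothetical two-vendor dataset, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE okta_users (email TEXT, mfa_enabled INTEGER);
CREATE TABLE aws_iam_users (user_name TEXT, email TEXT, is_admin INTEGER);
INSERT INTO okta_users VALUES ('ana@example.com', 0), ('bo@example.com', 1);
INSERT INTO aws_iam_users VALUES
    ('ana', 'ana@example.com', 1), ('bo', 'bo@example.com', 1);
""")

# Reference query: AWS admins whose Okta account lacks MFA.
reference_sql = """
SELECT a.user_name FROM aws_iam_users a
JOIN okta_users o ON o.email = a.email
WHERE a.is_admin = 1 AND o.mfa_enabled = 0
"""

# Differently written candidate that expresses the same question.
candidate_sql = """
SELECT user_name FROM aws_iam_users
WHERE is_admin = 1
  AND email IN (SELECT email FROM okta_users WHERE mfa_enabled = 0)
"""

# Candidate that skips the cross-vendor join entirely.
wrong_sql = "SELECT user_name FROM aws_iam_users WHERE is_admin = 1"

print(results_equivalent(conn, reference_sql, candidate_sql))
print(results_equivalent(conn, reference_sql, wrong_sql))
```

Matching results on a single dataset are a necessary signal rather than proof of semantic equivalence, which may be one reason to score it alongside other dimensions such as join fidelity.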

📝 Abstract

The rapid proliferation of multi-cloud and SaaS platforms has transformed Identity Security Posture Management (ISPM) into a fundamentally cross-vendor challenge: critical misconfigurations and privilege escalation paths increasingly span multiple identity providers, infrastructure layers, and authentication systems that were never designed to interoperate. Existing evaluations focus on isolated single-platform environments and provide no means to assess whether an AI agent can reason across these fragmented boundaries. To address this gap, we introduce the Cross-Vendor Sola ISPM Benchmark, a production-grade benchmark of 50 data-grounded tasks requiring multi-hop entity resolution and cross-system correlation across eight integrated enterprise platforms, including AWS, Okta, Azure AD, and Google Workspace. We also contribute an evaluation framework that measures not only final answer correctness but also evidentiary grounding, structural join fidelity, retrieval quality, and SQL equivalence. We evaluate the Sola AI Agent across five context configurations, ranging from no injected metadata to full schema, graph, and retrieval context, using three frontier LLMs. Results show that structured relational context improves answer correctness by approximately 34% in relative terms and reduces exploration queries by approximately 70% across all tested models, with the largest gains driven by cross-vendor graph topology. Our findings indicate that frontier LLMs possess substantial latent security reasoning capability, but reliable cross-vendor identity analysis is fundamentally constrained by the availability of explicit relational context for entity resolution and evidentiary grounding. Under full context, the best configuration achieves 78% answer correctness while reducing complete failure to 4%.
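To make "multi-hop entity resolution" over a cross-vendor graph concrete, the following minimal sketch finds a join path between two vendor tables with breadth-first search and then follows it to resolve one identity across systems. Everything here is assumed for illustration: the table names, join keys, and records are hypothetical and are not taken from the benchmark or from the Sola AI Agent.

```python
from collections import deque

# Hypothetical schema graph: each edge links two vendor tables and names
# the column on each side that identifies the same entity.
SCHEMA_GRAPH = {
    ("okta.users", "azure_ad.users"): ("email", "user_principal_name"),
    ("azure_ad.users", "aws.sso_assignments"): ("object_id", "external_id"),
}

# Hypothetical records, invented for this sketch.
TABLES = {
    "okta.users": [{"login": "ana", "email": "ana@example.com"}],
    "azure_ad.users": [
        {"object_id": "o-1", "user_principal_name": "ana@example.com"}
    ],
    "aws.sso_assignments": [
        {"external_id": "o-1", "permission_set": "AdministratorAccess"}
    ],
}


def neighbors(table):
    """Yield (other_table, key_here, key_there) for every edge touching table."""
    for (left, right), (left_key, right_key) in SCHEMA_GRAPH.items():
        if left == table:
            yield right, left_key, right_key
        elif right == table:
            yield left, right_key, left_key


def join_path(start, goal):
    """Breadth-first search for the shortest chain of joins from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, path = queue.popleft()
        if table == goal:
            return path
        for nxt, from_key, to_key in neighbors(table):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(table, from_key, nxt, to_key)]))
    return None  # no relational path connects the two tables


def resolve(start_table, start_row, goal_table):
    """Follow the join path hop by hop and return the matching goal rows."""
    path = join_path(start_table, goal_table)
    if path is None:
        return []
    rows = [start_row]
    for _, from_key, to_table, to_key in path:
        values = {row[from_key] for row in rows}
        rows = [row for row in TABLES[to_table] if row[to_key] in values]
    return rows


okta_user = TABLES["okta.users"][0]
aws_rows = resolve("okta.users", okta_user, "aws.sso_assignments")
print(aws_rows)
```

Without the edges in `SCHEMA_GRAPH`, nothing in the raw tables says that an Okta `email` corresponds to an Azure AD `user_principal_name`, so a caller would have to discover that link by trial queries. That is the kind of exploration that explicit relational context is reported to reduce.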

Problem

Research questions and friction points this paper is trying to address.

Cross-Vendor

Identity Security Posture Management

Agentic AI

Federated Identity

Security Reasoning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-Vendor ISPM

Agentic AI

Entity Resolution