MARIA: A Framework for Marginal Risk Assessment without Ground Truth in AI Systems

📅 2025-10-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional AI risk assessment relies on ground-truth labels and absolute risk metrics, yet such labels are often unavailable in settings with delayed feedback, unknowable consequences, or high annotation costs, particularly for long-running safety-critical systems. Method: We propose the first label-free marginal risk assessment framework, which evaluates *changes* in risk relative to the existing workflow rather than absolute risk. It introduces three relative evaluation criteria: predictability, capability disparity, and interaction dominance, operationalized via prediction consistency analysis, system capability benchmarking, and human-AI interaction modeling. Contribution/Results: The framework quantifies risk gained or lost without ground truth, enabling actionable deployment decisions for software engineering teams. Empirically validated in high-assurance domains, it demonstrates practical utility and robustness under label scarcity, advancing risk-aware AI deployment in safety-sensitive applications.
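One way to picture the "predictability" criterion above is label-free prediction consistency analysis: compare the AI system's outputs against the incumbent's on shared inputs, and against its own outputs across repeated runs, with no ground truth involved. The sketch below is illustrative only; the function names and metric choices are assumptions, not the paper's actual implementation.

```python
# Illustrative, label-free consistency metrics (assumed operationalization,
# not the paper's published method).
from collections import Counter

def disagreement_rate(ai_preds, incumbent_preds):
    """Fraction of shared inputs where AI and incumbent disagree.
    Requires no ground-truth labels; high values flag marginal risk to audit."""
    assert len(ai_preds) == len(incumbent_preds)
    return sum(a != b for a, b in zip(ai_preds, incumbent_preds)) / len(ai_preds)

def self_consistency(pred_runs):
    """Average majority-vote agreement of one system with itself across
    repeated runs on the same inputs, a label-free proxy for predictability."""
    n_items = len(pred_runs[0])
    scores = []
    for i in range(n_items):
        votes = Counter(run[i] for run in pred_runs)
        # Share of runs that produced the modal answer for item i.
        scores.append(votes.most_common(1)[0][1] / len(pred_runs))
    return sum(scores) / n_items
```

Disagreement with the incumbent does not by itself say which system is wrong; under a marginal-risk view it only localizes where the two systems diverge and where human review effort should go.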

📝 Abstract
Before deploying an AI system to replace an existing process, it must be compared with the incumbent to ensure improvement without added risk. Traditional evaluation relies on ground truth for both systems, but this is often unavailable due to delayed or unknowable outcomes, high costs, or incomplete data, especially for long-standing systems deemed safe by convention. The more practical approach is to compute not absolute risk but the difference in risk between systems. We therefore propose a marginal risk assessment framework that avoids dependence on ground truth or absolute risk. It centers on three relative evaluation methodologies: predictability, capability disparity, and interaction dominance. By shifting focus from absolute to relative evaluation, our approach equips software teams with actionable guidance: identifying where AI enhances outcomes, where it introduces new risks, and how to adopt such systems responsibly.
Problem

Research questions and friction points this paper is trying to address.

Assessing AI system risks without ground truth data
Comparing marginal risk between AI and existing systems
Providing actionable guidance for responsible AI adoption
Innovation

Methods, ideas, or system contributions that make the work stand out.

Marginal risk assessment without ground truth
Relative evaluation through three methodologies
Focus on system comparison rather than absolute risk
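The comparison-over-absolute-risk stance sketched above implies a simple decision shape: adopt only when no relative criterion regresses versus the incumbent. The toy rule below is a hedged illustration under that assumption; the delta inputs, names, and zero threshold are mine, not the paper's.

```python
# Toy marginal-risk decision rule (illustrative assumption, not the paper's).
# Each delta is "AI score minus incumbent score" on one relative criterion,
# so a negative delta means the AI regresses versus the existing workflow.
def marginal_risk_verdict(predictability_delta, capability_delta,
                          interaction_delta, threshold=0.0):
    """Return a deployment recommendation from relative-criterion deltas."""
    deltas = {
        "predictability": predictability_delta,
        "capability": capability_delta,
        "interaction": interaction_delta,
    }
    regressions = [name for name, d in deltas.items() if d < threshold]
    if regressions:
        return "hold: regressions in " + ", ".join(regressions)
    return "adopt: no marginal risk detected on measured criteria"
```

A stricter team could raise `threshold` above zero to demand a measurable improvement, not mere parity, before replacing the incumbent.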