Learning Actionable Counterfactual Explanations in Large State Spaces

📅 2024-04-25

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Low-level counterfactual explanations (CFEs) prescribe feature-level changes that are hard to act on in the real world, especially in large state spaces. Method: The paper introduces three types of high-level CFEs (hl-continuous, hl-discrete, and hl-id) that shift CFE modeling from the feature level to the action level. It formulates the hl-discrete CFE as a weighted set cover problem and the hl-continuous CFE as an integer linear program, and it proposes data-driven CFE generation, which can be viewed as learning an optimal policy over a family of large deterministic MDPs. Results: Experiments on three healthcare datasets (BRFSS, Foods, and NHANES) show that the data-driven generators are accurate and resource-efficient, and that high-level CFEs have several advantages over conventional feature-level explanations.

📝 Abstract

Recourse generators provide actionable insights, often through feature-based counterfactual explanations (CFEs), to help negatively classified individuals understand how to adjust their input features to achieve a positive classification. These feature-based CFEs, which we refer to as low-level CFEs, are overly specific (e.g., coding experience: $4 \to 5+$ years) and often recommended in feature space that doesn't straightforwardly align with real-world actions. To bridge this gap, we introduce three novel recourse types grounded in real-world actions: high-level continuous (hl-continuous), high-level discrete (hl-discrete), and high-level ID (hl-id) CFEs. We formulate single-agent CFE generation methods, where we model the hl-discrete CFE as a solution to a weighted set cover problem and the hl-continuous CFE as a solution to an integer linear program. Since these methods require costly optimization per agent, we propose data-driven CFE generation approaches that, given instances of agents and their optimal CFEs, learn a CFE generator that quickly provides optimal CFEs for new agents. This approach, also viewed as one of learning an optimal policy in a family of large but deterministic MDPs, considers several problem formulations, including formulations in which the actions and their effects are unknown, and therefore addresses informational and computational challenges. Through extensive empirical evaluation using publicly available healthcare datasets (BRFSS, Foods, and NHANES), we compare the proposed forms of recourse to low-level CFEs and assess the effectiveness of our data-driven approaches. Empirical results show that the proposed data-driven CFE generators are accurate and resource-efficient, and the proposed forms of recourse have various advantages over the low-level CFEs.
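
To make the weighted set cover view of the hl-discrete CFE concrete, here is a minimal sketch in Python. It is not the paper's method: the abstract describes solving an optimization problem per agent, while this sketch swaps in the standard greedy approximation for weighted set cover, which may return a more expensive plan than the optimum. The function name, the action names, the feature names, and the costs are all hypothetical and chosen only for illustration. Each action is assumed to cover a known set of required feature changes at a known cost.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def greedy_weighted_set_cover(
    required: Set[str],
    actions: Dict[str, Tuple[FrozenSet[str], float]],
) -> List[str]:
    """Greedy approximation to weighted set cover (not an exact solver).

    required: feature changes the agent needs (hypothetical names).
    actions:  action name -> (feature changes it covers, cost of the action).
    Returns the chosen action names in the order they were picked.
    """
    uncovered = set(required)
    chosen: List[str] = []
    while uncovered:
        best_name = None
        best_ratio = float("inf")
        for name, (covers, cost) in actions.items():
            gain = len(covers & uncovered)  # newly covered feature changes
            if gain == 0:
                continue
            ratio = cost / gain  # cost per newly covered feature change
            if ratio < best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            raise ValueError("no set of actions covers all required changes")
        chosen.append(best_name)
        uncovered -= actions[best_name][0]
    return chosen


# Hypothetical agent and action catalogue (illustrative values only).
required = {"bmi", "blood_pressure", "cholesterol"}
actions = {
    "exercise_plan": (frozenset({"bmi", "blood_pressure"}), 3.0),
    "diet_change": (frozenset({"bmi", "cholesterol"}), 2.0),
    "medication": (frozenset({"blood_pressure"}), 1.5),
    "full_program": (frozenset({"bmi", "blood_pressure", "cholesterol"}), 6.0),
}
plan = greedy_weighted_set_cover(required, actions)
total_cost = sum(actions[name][1] for name in plan)
print(plan, total_cost)
```

In this toy instance the recommendation is a short list of actions rather than a list of per-feature edits, which is the contrast the abstract draws between high-level and low-level CFEs.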

Problem

Research questions and friction points this paper is trying to address.

Bridging the gap between low-level counterfactual explanations and real-world actions

Developing data-driven methods for efficient counterfactual explanation generation

Addressing informational and computational challenges in actionable recourse generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces high-level CFEs grounded in real-world actions

Formulates CFE generation as optimization problems

Proposes data-driven CFE generators for efficiency
