Strategic Learning with Local Explanations as Feedback

📅 2025-02-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper asks how a decision-maker (DM) can balance its own utility against agent welfare when agents respond strategically to local explanations rather than to full model disclosure. It proposes an "Action Recommendation" (AR) framework for explanations, described as the first to bring the revelation principle from information design into interpretable machine learning. The paper establishes a necessary condition for explanations not to mislead agents into self-harming actions and shows that, under conditional homogeneity, AR-based explanations are sufficient for non-harmful responses. Methodologically, it combines local explanation techniques (e.g., LIME, SHAP), conditional homogeneity modeling, and joint optimization to co-train the predictive model and the AR policy. Empirically, the approach improves DM outcomes while protecting agent welfare, supporting AR-based explanations as a safe and effective form of partial model disclosure.

📝 Abstract

We investigate algorithmic decision problems where agents can respond strategically to the decision maker's (DM) models. The demand for clear and actionable explanations from DMs to (potentially strategic) agents continues to rise. While prior work often treats explanations as full model disclosures, explanations in practice might convey only partial information, which can lead to misinterpretations and harmful responses. When full disclosure of the predictive model is neither feasible nor desirable, a key open question is how DMs can use explanations to maximise their utility without compromising agent welfare. In this work, we explore well-known local and global explanation methods, and establish a necessary condition to prevent explanations from misleading agents into self-harming actions. Moreover, with conditional homogeneity, we establish that action recommendation (AR)-based explanations are sufficient for non-harmful responses, akin to the revelation principle in information design. To operationalise AR-based explanations, we propose a simple algorithm to jointly optimise the predictive model and AR policy to balance DM outcomes with agent welfare. Our empirical results demonstrate the benefits of this approach as a more refined strategy for safe and effective partial model disclosure in algorithmic decision-making.
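The idea of a non-harmful action recommendation can be illustrated with a toy sketch. This is not the paper's algorithm: the linear model, the quadratic effort cost, the `gain` parameter, and the names `LinearModel`, `agent_utility`, and `recommend_action` are all assumptions made for illustration. "Non-harmful" is assumed here to mean that following the recommendation never leaves the agent worse off than not acting.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LinearModel:
    """Hypothetical DM model: accept when w . x + b >= 0."""
    w: List[float]
    b: float

    def score(self, x: List[float]) -> float:
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def accepts(self, x: List[float]) -> bool:
        return self.score(x) >= 0.0


def move_cost(x: List[float], x_new: List[float]) -> float:
    """Assumed quadratic effort cost of changing features."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x, x_new))


def agent_utility(model: LinearModel, x: List[float],
                  x_new: List[float], gain: float) -> float:
    """Agent's utility: reward if accepted at x_new, minus effort cost."""
    reward = gain if model.accepts(x_new) else 0.0
    return reward - move_cost(x, x_new)


def recommend_action(model: LinearModel, x: List[float],
                     gain: float, margin: float = 1e-9) -> Optional[List[float]]:
    """Return recommended features, or None to recommend no change.

    The DM knows the full model, so it can check the agent's utility
    before recommending anything; the agent only sees the recommendation.
    """
    s = model.score(x)
    norm_sq = sum(wi * wi for wi in model.w)
    if s >= 0.0 or norm_sq == 0.0:
        return None  # already accepted, or no move can change the score

    # Cheapest move (under the quadratic cost) that just crosses the boundary.
    step = (-s + margin) / norm_sq
    x_new = [xi + step * wi for xi, wi in zip(x, model.w)]

    # Non-harm check: only recommend if the agent is no worse off than staying.
    if agent_utility(model, x, x_new, gain) >= agent_utility(model, x, x, gain):
        return x_new
    return None


model = LinearModel(w=[1.0, 1.0], b=-2.0)
print(recommend_action(model, [0.0, 0.0], gain=2.0))  # a move toward acceptance
print(recommend_action(model, [0.0, 0.0], gain=0.5))  # None: effort exceeds gain
```

In this sketch the recommendation reveals nothing about `w` or `b` beyond a single target point, which is the sense in which it acts as a partial disclosure. A jointly optimised version, as the abstract describes, would also adjust the model itself, which this toy example does not attempt.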

Problem

Research questions and friction points this paper is trying to address.

Strategic agent responses to decision models

Partial model disclosure for safe explanations

Balancing DM outcomes with agent welfare

Innovation

Methods, ideas, or system contributions that make the work stand out.

Local and global explanation methods

Action recommendation-based explanations

Joint optimization of predictive model and AR policy

🔎 Similar Papers

Mutual Enhancement of Large Language and Reinforcement Learning Models through Bi-Directional Feedback Mechanisms: A Case Study