🤖 AI Summary
This study investigates health insurance selection (a high-stakes, preference-sensitive decision-making domain) to examine how AI systems can achieve personalized alignment with decision-makers' risk preferences. The authors present an empirical framework comparing the generalization capabilities of classical AI approaches (case-based reasoning, Bayesian inference, and naturalistic decision-making) against large language models (GPT-4 and GPT-5). Methodologically, they apply a zero-shot prompting technique with weighted self-consistency, drawn from recent literature, to stabilize LLM alignment in novel decision contexts. Results indicate comparable alignment performance between classical AI and LLM-based methods across diverse risk preferences, with classical methods showing a slight edge for a moderate-risk profile. All experimental data, source code, and prompt templates are publicly released to support reproducibility and further research.
📝 Abstract
As algorithmic decision-makers are increasingly applied to high-stakes domains, AI alignment research has evolved from a focus on universal value alignment to context-specific approaches that account for decision-maker attributes. Prior work on Decision-Maker Alignment (DMA) has explored two primary strategies: (1) classical AI methods integrating case-based reasoning, Bayesian reasoning, and naturalistic decision-making, and (2) large language model (LLM)-based methods leveraging prompt engineering. While both approaches have shown promise in limited domains such as medical triage, their generalizability to novel contexts remains underexplored. In this work, we implement a prior classical AI model and develop an LLM-based algorithmic decision-maker, evaluating the latter with a large reasoning model (GPT-5) and a non-reasoning model (GPT-4) using weighted self-consistency under a zero-shot prompting framework, as proposed in recent literature. We evaluate both approaches on a health insurance decision-making dataset annotated for three target decision-makers with varying levels of risk tolerance (0.0, 0.5, 1.0). In the experiments reported herein, classical AI and LLM-based models achieved comparable alignment with attribute-based targets, with classical AI exhibiting slightly better alignment for a moderate-risk profile. The dataset and open-source implementation are publicly available at: https://github.com/TeX-Base/ClassicalAIvsLLMsforDMAlignment and https://github.com/Parallax-Advanced-Research/ITM/tree/feature_insurance.
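The weighted self-consistency aggregation mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the per-sample weights (e.g., model confidence or alignment scores), and the tie-free input are all assumptions made for the example; in practice each sample would come from a zero-shot LLM call.

```python
from collections import defaultdict

def weighted_self_consistency(samples):
    """Aggregate multiple sampled (answer, weight) pairs by summing the
    weights per distinct answer and returning the answer with the
    largest total. This is the generic weighted-voting idea behind
    self-consistency; the exact weighting scheme here is illustrative.
    """
    totals = defaultdict(float)
    for answer, weight in samples:
        totals[answer] += weight
    return max(totals, key=totals.get)

# Hypothetical example: three zero-shot samples over insurance plans,
# each paired with an assumed confidence weight.
samples = [("Plan A", 0.9), ("Plan B", 0.4), ("Plan A", 0.3)]
print(weighted_self_consistency(samples))  # "Plan A" (total 1.2 vs 0.4)
```

Plain majority voting is the special case where every weight is 1.0; weighting lets higher-confidence samples count for more when stabilizing the model's decision across repeated prompts.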