SAFER: Risk-Constrained Sample-then-Filter in Large Language Models

📅 2025-10-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of ensuring output reliability of large language models (LLMs) in risk-sensitive open-ended question answering, this paper proposes SAFER, an abstention-aware sample-then-filter framework. First, an abstention-aware sampling stage calibrates the sampling budget under a finite cap using the Clopper–Pearson exact method, abstaining when the user-specified risk level cannot be met. Second, a conformalized filtering stage removes unreliable candidates using a conformal risk control threshold, explicitly bounding both the miscoverage rate of the sampling sets and the risk of erroneously excluding correct answers. SAFER is the first method to integrate principled abstention and dual-risk control into selective conformal prediction, enabling task-adaptive admission criteria and decoupled calibration–testing. Evaluated on multiple open-ended QA benchmarks, SAFER significantly improves output reliability under strict risk constraints, demonstrating strong robustness, high data efficiency, and flexible risk configurability, even at low sampling budgets.

📝 Abstract
As large language models (LLMs) are increasingly deployed in risk-sensitive applications such as real-world open-ended question answering (QA), ensuring the trustworthiness of their outputs has become critical. Existing selective conformal prediction (SCP) methods provide statistical guarantees by constructing prediction sets with a constrained miscoverage rate for correct answers. However, prior works unrealistically assume that admissible answers for all instances can be obtained via finite sampling, even in open-ended QA scenarios that lack a fixed and finite solution space. To address this, we introduce a two-stage risk control framework comprising abstention-aware sampling and conformalized filtering (SAFER). First, on a held-out calibration set, SAFER calibrates a sampling budget within the maximum sampling cap, using the Clopper–Pearson exact method at a user-desired risk level (i.e., the maximum allowable miscoverage rate of the sampling sets). If the risk level cannot be satisfied within the cap, we abstain; otherwise, the calibrated sampling budget becomes the minimum requirement at test time. Then, we employ the calibration instances where correct answers are attainable under the calibrated budget and apply the conformal risk control method to determine a statistically valid uncertainty threshold, which filters unreliable distractors from the candidate set for each test data point. In this stage, SAFER introduces an additional risk level to guide the calculation of the threshold, thereby controlling the risk of correct answers being excluded. Furthermore, we show that SAFER is compatible with various task-specific admission criteria and calibration-test split ratios, highlighting its robustness and high data efficiency.
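The first stage described in the abstract can be sketched in plain Python. This is a minimal illustration, not the paper's implementation: the helper names (`clopper_pearson_upper`, `calibrate_budget`), the 95% confidence setting, and the `miss_counts_by_budget` bookkeeping (misses vs. calibration size per budget) are all assumptions for the sketch.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def clopper_pearson_upper(failures, n, confidence=0.95):
    """One-sided Clopper-Pearson upper bound on the miscoverage rate:
    the p solving P(X <= failures; n, p) = 1 - confidence, via bisection
    (the binomial CDF is decreasing in p)."""
    if failures >= n:
        return 1.0
    lo, hi = failures / n, 1.0
    for _ in range(60):  # bisection; ample precision for this sketch
        mid = (lo + hi) / 2
        if binom_cdf(failures, n, mid) >= 1 - confidence:
            lo = mid  # CDF still too large -> true upper bound is higher
        else:
            hi = mid
    return hi

def calibrate_budget(miss_counts_by_budget, alpha, max_budget):
    """Return the smallest sampling budget whose Clopper-Pearson upper bound
    on the miscoverage rate is <= alpha, or None (abstain) if no budget
    within the cap satisfies the risk level.
    miss_counts_by_budget[b] = (misses, n_calibration) when b samples
    are drawn per question (hypothetical bookkeeping)."""
    for b in range(1, max_budget + 1):
        misses, n = miss_counts_by_budget[b]
        if clopper_pearson_upper(misses, n) <= alpha:
            return b
    return None  # abstain: risk level unattainable within the cap
```

For instance, with zero misses on 20 calibration questions the exact upper bound is about 0.139, not 0, which is what makes the budget calibration conservative at small calibration sizes.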
Problem

Research questions and friction points this paper is trying to address.

Ensuring trustworthy LLM outputs in risk-sensitive open-ended question answering
Addressing unrealistic finite sampling assumptions in conformal prediction methods
Providing statistical guarantees while controlling miscoverage risks during answer filtering
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage risk control framework with abstention-aware sampling
Calibrated sampling budget using Clopper-Pearson exact method
Conformal filtering with uncertainty threshold to exclude distractors
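The filtering step in the last bullet can be sketched with the standard split-conformal quantile rule, assuming each candidate answer carries a scalar uncertainty score (higher = less reliable). The function names and score convention are illustrative assumptions, not the paper's code.

```python
import math

def conformal_threshold(cal_scores, beta):
    """Conservative finite-sample quantile of calibration scores: with
    probability >= 1 - beta, a new correct answer's uncertainty score falls
    at or below this threshold, so filtering candidates above it excludes
    a correct answer with risk at most beta.
    cal_scores: uncertainty scores of admissible answers on calibration data."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - beta))
    if rank > n:
        return float("inf")  # too little data to certify: filter nothing
    return sorted(cal_scores)[rank - 1]

def filter_candidates(candidates, threshold):
    """Keep (answer, score) candidates whose uncertainty does not exceed
    the calibrated threshold."""
    return [ans for ans, score in candidates if score <= threshold]
```

Note the `(n + 1)` correction: with 10 calibration scores and beta = 0.2, the rule keeps the 9th smallest score as the threshold rather than the empirical 80th percentile, which is what yields the finite-sample guarantee.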