DeLTa: A Decoding Strategy based on Logit Trajectory Prediction Improves Factuality and Reasoning Ability

📅 2025-03-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the factual inaccuracies and logical reasoning errors that large language models (LLMs) commonly exhibit during autoregressive generation, this paper proposes DeLTa, a model-agnostic decoding strategy that requires no architectural or parameter modifications. DeLTa's core contribution is modeling how logits evolve across Transformer layers: fitting a linear regression to each token's logit trajectory enables calibration of next-token probabilities and reweighting of the output distribution. The method improves factual consistency and logical reasoning jointly, without decoupling the two capabilities, reporting gains of up to 4.9% on TruthfulQA, 8.1% on StrategyQA, and 7.3% on GSM8K over the baseline. As a purely inference-time, black-box enhancement, DeLTa is plug-and-play and applicable across LLMs.

📝 Abstract
Large Language Models (LLMs) are increasingly being used in real-world applications. However, concerns about the reliability of the content they generate persist, as it frequently deviates from factual correctness or exhibits deficiencies in logical reasoning. This paper proposes a novel decoding strategy aimed at enhancing both factual accuracy and inferential reasoning without requiring any modifications to the architecture or pre-trained parameters of LLMs. Our approach adjusts next-token probabilities by analyzing the trajectory of logits from lower to higher layers in Transformers and applying linear regression. We find that this Decoding by Logit Trajectory-based approach (DeLTa) effectively reinforces factuality and reasoning while mitigating incorrect generation. Experiments on TruthfulQA demonstrate that DeLTa attains up to a 4.9% improvement over the baseline. Furthermore, it enhances performance by up to 8.1% on StrategyQA and 7.3% on GSM8K, both of which demand strong reasoning capabilities.
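The abstract's core idea (regressing on the trajectory of logits from lower to higher Transformer layers to adjust next-token probabilities) can be sketched as follows. This is a hedged illustration, not the authors' exact method: the per-token linear fit, the choice to extrapolate one step past the final layer, and the function name `delta_calibrate` are all assumptions made for this sketch.

```python
import numpy as np

def delta_calibrate(layer_logits, tau=1.0):
    """Sketch of a DeLTa-style calibration (assumed form, not the paper's
    exact procedure): fit a linear regression to each vocabulary token's
    logit as a function of layer depth, then extrapolate the fitted line
    one step beyond the final layer to obtain calibrated logits.

    layer_logits: (num_layers, vocab_size) array, e.g. from applying the
    unembedding matrix to every layer's hidden state (early-exit logits).
    tau: softmax temperature.
    """
    num_layers, _ = layer_logits.shape
    x = np.arange(num_layers, dtype=float)  # layer indices as regressor
    # Closed-form least squares: one shared x, a separate y per token.
    x_centered = x - x.mean()
    y_mean = layer_logits.mean(axis=0)
    slope = (x_centered[:, None] * (layer_logits - y_mean)).sum(axis=0) \
        / (x_centered ** 2).sum()
    intercept = y_mean - slope * x.mean()
    # Extrapolate each token's fitted line to a hypothetical next layer.
    calibrated = intercept + slope * num_layers
    # Softmax (numerically stabilized) over the calibrated logits.
    z = calibrated / tau
    z -= z.max()
    p = np.exp(z)
    return p / p.sum()
```

For instance, a token whose logit rises steadily across layers would receive a higher calibrated probability than its final-layer logit alone suggests, which matches the intuition of reinforcing trajectories that trend toward a confident prediction.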
Problem

Research questions and friction points this paper is trying to address.

LLM-generated content frequently deviates from factual correctness.
LLM outputs exhibit logical reasoning errors during autoregressive generation.
Existing fixes often require modifying model architecture or parameters.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adjusts next-token probabilities using the trajectory of logits across Transformer layers.
Applies linear regression to the logit trajectory to calibrate the output distribution.
Enhances factuality and reasoning at inference time, with no model changes.