Ekka: Automated Diagnosis of Silent Errors in LLM Inference

📅 2026-06-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses silent errors in large language model (LLM) serving frameworks: bugs introduced during rapid development that degrade output quality without triggering any explicit failure. The authors frame diagnosis as a differential debugging problem. Their system, Ekka, relies on an existing semantically correct reference implementation and aligns intermediate execution states between that reference and the target framework to localize root causes automatically. The contributions are a diagnostic method based on intermediate state alignment and differential analysis, a benchmark of real-world silent errors from popular serving frameworks, and an evaluation showing 80% pass@1 and 88% pass@5 diagnosis accuracy on that benchmark, outperforming state-of-the-art systems. Ekka also diagnosed four previously unknown silent errors, all confirmed by the respective developers.
📝 Abstract
LLM serving frameworks are quickly evolving with a complex software stack and a vast number of optimizations. The rapid development process can introduce silent errors where output quality silently degrades without any explicit error signals. Diagnosing silent errors is notoriously difficult due to the substantial semantic gap between the high-level symptoms and the low-level root causes. We observe that diagnosis of silent errors can be effectively framed as a differential debugging problem by leveraging the existence of semantically correct reference implementations. We propose Ekka, an automated diagnosis system that identifies root causes by systematically aligning and comparing intermediate execution states between a target and a reference framework. We constructed a benchmark of real-world silent errors from popular serving frameworks, where Ekka shows 80% pass@1 diagnosis accuracy and 88% pass@5 diagnosis accuracy, outperforming state-of-the-art systems. Ekka also diagnoses 4 new silent errors from serving frameworks, all of which have been confirmed by the developers.
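The core idea of comparing intermediate states between a reference and a target can be illustrated with a minimal sketch. This is not Ekka's algorithm: the abstract does not say how states are captured or aligned, so the sketch assumes both runs have already been reduced to named checkpoints holding flat lists of floats. The names `max_rel_diff`, `first_divergence`, the checkpoint labels, and the tolerance are all hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: checkpoint name -> flattened activation values.
Trace = Dict[str, List[float]]


def max_rel_diff(ref: List[float], tgt: List[float]) -> float:
    """Largest element-wise difference, scaled by the reference magnitude."""
    if len(ref) != len(tgt):
        return math.inf  # a shape mismatch counts as a divergence
    scale = max((abs(x) for x in ref), default=0.0) + 1e-12
    return max((abs(r - t) for r, t in zip(ref, tgt)), default=0.0) / scale


def first_divergence(
    reference: Trace, target: Trace, tol: float = 1e-3
) -> Optional[Tuple[str, float]]:
    """Walk checkpoints in the reference's execution order and return the
    first one whose values differ from the target by more than `tol`."""
    for name, ref_vals in reference.items():  # dicts keep insertion order
        if name not in target:
            continue  # assumption: unaligned checkpoints are simply skipped
        diff = max_rel_diff(ref_vals, target[name])
        if diff > tol:
            return name, diff
    return None


# Toy traces: the target drifts at "attn_out" and the error reaches "logits".
reference = {"embed": [0.1, 0.2], "attn_out": [1.0, 2.0], "logits": [3.0, 4.0]}
target = {"embed": [0.1, 0.2], "attn_out": [1.0, 2.5], "logits": [3.0, 9.0]}

result = first_divergence(reference, target)
print(result)
```

In this toy case the comparison flags `attn_out` rather than `logits`, which is the point of comparing intermediate states: the earliest divergent checkpoint sits closer to the root cause than the degraded final output. A real system would also have to handle checkpoints that do not map one-to-one between frameworks, which this sketch ignores.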
Problem

Research questions and friction points this paper is trying to address.

silent errors
LLM inference
automated diagnosis
serving frameworks
differential debugging
Innovation

Methods, ideas, or system contributions that make the work stand out.

silent error diagnosis
differential debugging
LLM inference
automated debugging
execution state alignment