KIEval: Evaluation Metric for Document Key Information Extraction

📅 2025-03-07
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing KIE evaluation metrics (e.g., span-level F1) score only token-level entity matching and fail to reflect the real-world industrial requirement of extracting coherent, structured information groups. Method: We propose KIEval, the first application-oriented KIE evaluation framework, which builds structured-grouping capability into the assessment pipeline, jointly modeling entity recognition and hierarchical grouping consistency. Its core innovation is unifying a semantics-aware entity-level F1 with a novel group-level consistency score, enabling interpretable error diagnosis. Results: Experiments across multiple industrial document datasets show that KIEval discriminates the practical utility of models more accurately than conventional metrics. By shifting evaluation from isolated span matching to end-to-end structural understanding, KIEval moves KIE assessment closer to practical deployment readiness.
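The paper's exact formulation is not reproduced on this page, but the two ingredients the summary names, a semantics-aware entity-level F1 and a group-level consistency score, can be sketched as follows. All function names, the field schema, and the scoring details here are illustrative assumptions, not the authors' definitions.

```python
from collections import Counter

def entity_f1(pred, gold):
    # Micro-averaged F1 over labeled (field, value) entity pairs,
    # treating predictions and references as multisets.
    pred_c, gold_c = Counter(pred), Counter(gold)
    tp = sum((pred_c & gold_c).values())  # multiset intersection
    if tp == 0:
        return 0.0
    precision = tp / sum(pred_c.values())
    recall = tp / sum(gold_c.values())
    return 2 * precision * recall / (precision + recall)

def group_consistency(pred_groups, gold_groups):
    # Fraction of reference groups that some predicted group
    # reproduces exactly (each group taken as a set of entities).
    pred_sets = [frozenset(g) for g in pred_groups]
    hits = sum(1 for g in gold_groups if frozenset(g) in pred_sets)
    return hits / len(gold_groups) if gold_groups else 1.0
```

On a toy receipt, a model that recovers every entity but merges two line items into one group scores a perfect entity F1 yet a low group consistency, which is exactly the failure mode span-level metrics cannot see.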

📝 Abstract
Document Key Information Extraction (KIE) is a technology that transforms valuable information in document images into structured data, and it has become an essential function in industrial settings. However, current evaluation metrics for this technology do not accurately reflect the critical attributes of its industrial applications. In this paper, we present KIEval, a novel application-centric evaluation metric for Document KIE models. Unlike prior metrics, KIEval assesses Document KIE models not only on the extraction of individual pieces of information (entities) but also on the extraction of structured information (groupings). Evaluating structured information yields an assessment of Document KIE models that better reflects how grouped information is extracted from documents in industrial settings. Designed with industrial application in mind, we believe KIEval can become a standard evaluation metric for developing or applying Document KIE models in practice. The code will be publicly available.
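The entity-versus-grouping distinction in the abstract can be made concrete with a hypothetical receipt: a span-level view lists labeled entities in isolation, while the grouped view keeps each line item's fields together. The data layout and field names below are illustrative, not taken from the paper.

```python
# Grouped KIE output for a hypothetical receipt: each dict is one
# line item, so "Latte" stays attached to its own count and price.
grouped = [
    {"menu.name": "Latte", "menu.count": "2", "menu.price": "4.00"},
    {"menu.name": "Tea",   "menu.count": "1", "menu.price": "2.50"},
]

def flatten(groups):
    # Span-level view: the same labeled entities with their
    # group membership discarded.
    return [(field, value) for g in groups for field, value in g.items()]
```

A metric that only inspects `flatten(grouped)` cannot tell whether a model paired "4.00" with "Latte" or with "Tea"; preserving that association is what group-level assessment targets.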
Problem

Research questions and friction points this paper is trying to address.

Existing evaluation metrics do not accurately reflect how Document KIE is used in industrial applications.
Both individual (entity) and grouped (structured) information extraction need to be assessed.
No prior metric supports practical, application-centric evaluation of Document KIE models.
Innovation

Methods, ideas, or system contributions that make the work stand out.

KIEval evaluates structured information grouping, not just entity extraction.
Designed around the requirements of industrial applications.
Code to be released publicly for practical use.
Authors

Minsoo Khang, Upstage AI (OCR, Intelligent Document Parsing)
Sang Chul Jung, Upstage AI, South Korea
Sungrae Park, Upstage AI (Document AI, Large Language Models)
Teakgyu Hong, Upstage AI, South Korea