Benchmarking Table Extraction from Heterogeneous Scientific Documents

📅 2025-11-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing scientific document table extraction (TE) methods suffer from poor generalization, low robustness, and limited interpretability when applied to heterogeneous data. Method: We introduce the first large-scale heterogeneous scientific table benchmark—comprising 37K multi-source samples—and propose an end-to-end fine-grained evaluation framework that decouples subtasks (e.g., table detection and structure recognition), quantifies model uncertainty via confidence scoring, and systematically exposes deficiencies in conventional evaluation metrics. We conduct a unified comparative study integrating PDF parsing libraries, domain-specific tools, computer vision models, and multimodal large language models. Results: Empirical evaluation reveals substantial performance degradation of state-of-the-art TE methods under real-world heterogeneity, validating our framework’s critical role in advancing robust, interpretable, and reproducible TE research.

📝 Abstract
Table Extraction (TE) is the task of extracting tables from PDF documents into a structured format that can be processed automatically. While numerous TE tools exist, the variety of methods and techniques makes it difficult for users to choose an appropriate one. We propose a novel benchmark for assessing end-to-end TE methods (from PDF to the final table). We contribute an analysis of TE evaluation metrics and the design of a rigorous evaluation process, which scores each TE sub-task as well as end-to-end TE, and captures model uncertainty. Along with a prior dataset, our benchmark comprises two new heterogeneous datasets of 37k samples. We run our benchmark on diverse models, including off-the-shelf libraries, software tools, large vision-language models, and approaches based on computer vision. The results demonstrate that TE remains challenging: current methods lack generalizability when facing heterogeneous data and show limitations in robustness and interpretability.
Problem

Research questions and friction points this paper is trying to address.

Benchmarking table extraction methods from diverse scientific documents
Evaluating generalizability and robustness of table extraction tools
Assessing end-to-end table extraction performance across heterogeneous datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Novel benchmark for end-to-end table extraction
Rigorous evaluation process capturing model uncertainty
Two new heterogeneous datasets with 37k samples
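To make the "fine-grained evaluation with uncertainty" idea concrete, here is a minimal sketch of scoring one TE sub-task (cell content recovery) in isolation, with predictions carrying confidence scores so low-confidence cells can be filtered out. The function name, the cell representation, and the choice of a plain cell-level F1 are illustrative assumptions, not the paper's actual framework or metric.

```python
# Hypothetical sketch: score a single TE sub-task (cell recovery) and
# use per-cell confidence to capture model uncertainty. Not the paper's
# actual evaluation code; names and metric choice are assumptions.

def cell_f1(gold, predicted, min_conf=0.0):
    """Precision/recall/F1 over (row, col, text) cells.

    gold:      set of (row, col, text) tuples
    predicted: iterable of ((row, col, text), confidence) pairs
    min_conf:  predictions below this confidence are discarded
    """
    kept = {cell for cell, conf in predicted if conf >= min_conf}
    tp = len(kept & gold)  # exact matches on position and text
    precision = tp / len(kept) if kept else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy example: one OCR-corrupted cell that the model itself flags
# with low confidence, so thresholding removes the false positive.
gold = {(0, 0, "Model"), (0, 1, "F1"), (1, 0, "Ours"), (1, 1, "0.91")}
pred = [((0, 0, "Model"), 0.99), ((0, 1, "F1"), 0.95),
        ((1, 0, "Ours"), 0.90), ((1, 1, "O.91"), 0.40)]

p, r, f = cell_f1(gold, pred, min_conf=0.5)
```

Running this yields precision 1.0 and recall 0.75: thresholding on confidence trades recall for precision, which is the kind of behavior a decoupled, uncertainty-aware benchmark can expose while an aggregate end-to-end score would hide it.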