Had enough of experts? Quantitative knowledge retrieval from large language models

📅 2024-02-12
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the limited reliability of large language models (LLMs) in quantitative knowledge retrieval. To this end, we propose a novel Bayesian workflow-oriented paradigm that supports principled prior distribution construction and missing data imputation. Methodologically, our approach integrates an LLM interface combining prompt engineering with uncertainty calibration, a structured prior elicitation framework, and a multi-round consistency verification mechanism—enabling joint expert knowledge distillation and missing-value reasoning. We present the first systematic evaluation of LLMs’ robustness and interpretability in quantitative knowledge retrieval. Experiments across real-world datasets from healthcare, environmental science, and engineering domains demonstrate that our method improves prediction accuracy by 12.3% on average, reduces reliance on labeled data by 40%, and substantially enhances the practicality and generalizability of Bayesian analysis.

Technology Category

Application Category

📝 Abstract
Large language models (LLMs) have been extensively studied for their abilities to generate convincing natural language sequences, however their utility for quantitative information retrieval is less well understood. Here we explore the feasibility of LLMs as a mechanism for quantitative knowledge retrieval to aid two data analysis tasks: elicitation of prior distributions for Bayesian models and imputation of missing data. We introduce a framework that leverages LLMs to enhance Bayesian workflows by eliciting expert-like prior knowledge and imputing missing data. Tested on diverse datasets, this approach can improve predictive accuracy and reduce data requirements, offering significant potential in healthcare, environmental science and engineering applications. We discuss the implications and challenges of treating LLMs as 'experts'.
Problem

Research questions and friction points this paper is trying to address.

Assessing LLMs for quantitative knowledge retrieval.
Enhancing Bayesian workflows via prior elicitation.
Improving data imputation using LLMs.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages LLMs for quantitative knowledge retrieval
Enhances Bayesian workflows with expert-like priors
Imputes missing data to improve predictive accuracy
D
D. Selby
Deutsches Forschungszentrum für Künstliche Intelligenz GmbH (DFKI), Kaiserslautern, Germany
K
Kai Spriestersbach
Deutsches Forschungszentrum für Künstliche Intelligenz GmbH (DFKI), Kaiserslautern, Germany
Y
Yuichiro Iwashita
Deutsches Forschungszentrum für Künstliche Intelligenz GmbH (DFKI), Kaiserslautern, Germany; Graduate School of Informatics, Osaka Metropolitan University, Japan
M
Mohammad Saad
D
Dennis Bappert
Amazon Web Services, Mainz, Germany
A
Archana Warrier
Deutsches Forschungszentrum für Künstliche Intelligenz GmbH (DFKI), Kaiserslautern, Germany
S
Sumantrak Mukherjee
Deutsches Forschungszentrum für Künstliche Intelligenz GmbH (DFKI), Kaiserslautern, Germany
Koichi Kise
Koichi Kise
Professor of Graduate School of Informatics, Osaka Metropolitan University
Document Image AnalysisComputer VisionHuman Sensing and Actuation
S
Sebastian Vollmer
Deutsches Forschungszentrum für Künstliche Intelligenz GmbH (DFKI), Kaiserslautern, Germany