Machines acquire scientific taste from institutional traces

📅 2026-03-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the long-standing challenge of formalizing and automating “scientific taste”—the ability to judge the potential value of unverified research ideas—which has constrained the efficiency of scientific evaluation. The authors demonstrate for the first time that scientific taste can be learned from institutional traces such as journal acceptance decisions and encoded into a fine-tuned language model to automatically assess the quality of research proposals. In evaluations of management science proposals, this approach achieves 59% accuracy—significantly outperforming both expert panels (42%) and eleven state-of-the-art large language models (average 31%). Notably, its high-confidence predictions attain 100% accuracy and generalize effectively to economics, reaching 70% accuracy, thereby surpassing the performance limits of both human experts and existing models.

📝 Abstract
Artificial intelligence matches or exceeds human performance on tasks with verifiable answers, from protein folding to Olympiad mathematics. Yet the capacity that most governs scientific advance is not reasoning but taste: the ability to judge which untested ideas deserve pursuit, exercised daily by editors and funders but never successfully articulated, taught, or automated. Here we show that fine-tuning language models on journal publication decisions recovers evaluative judgment inaccessible to both frontier models and human expertise. Using a held-out benchmark of research pitches in management spanning four quality tiers, we find that eleven frontier models, spanning major proprietary and open architectures, barely exceed chance, averaging 31% accuracy. Panels of journal editors and editorial board members reach 42% by majority vote. Fine-tuned models trained on years of publication records each surpass every frontier model and expert panel, with the best single model achieving 59%. These models exhibit calibrated confidence, reaching 100% accuracy on their highest-confidence predictions, and transfer this evaluative signal to untrained pairwise comparisons and one-sentence summaries. The mechanism generalizes: models trained on economics publication records achieve 70% accuracy. Scientific taste was not missing from AI's reach; it was deposited in the institutional record, waiting to be extracted. These results provide a scalable mechanism to triage the expanding volume of scientific production across disciplines where quality resists formal verification.
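The abstract reports that the fine-tuned models are confidence-calibrated: accuracy rises with prediction confidence, reaching 100% on the highest-confidence subset, which is what makes automated triage viable. A minimal sketch of that triage idea, assuming a classifier that outputs a probability per quality tier (the tier names, `classify_pitch`, and the 0.9 threshold are illustrative, not from the paper):

```python
# Hypothetical sketch: confidence-gated triage of research pitches.
# Assumes an upstream fine-tuned model has produced, for each pitch,
# a probability distribution over quality tiers.

QUALITY_TIERS = ["reject", "weak", "solid", "strong"]  # illustrative labels

def classify_pitch(probs):
    """Return (tier, confidence) for one pitch, where confidence is the
    maximum tier probability."""
    conf = max(probs)
    tier = QUALITY_TIERS[probs.index(conf)]
    return tier, conf

def triage(pitch_probs, threshold=0.9):
    """Split pitches into auto-decided (high confidence) and human-review
    (low confidence) buckets, mirroring calibrated-confidence triage."""
    auto, review = [], []
    for i, probs in enumerate(pitch_probs):
        tier, conf = classify_pitch(probs)
        bucket = auto if conf >= threshold else review
        bucket.append((i, tier, conf))
    return auto, review
```

Under calibration, raising `threshold` trades coverage for accuracy on the auto-decided bucket; everything else falls back to human editors.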
Problem

Research questions and friction points this paper is trying to address.

scientific taste
evaluative judgment
publication decisions
research prioritization
AI in science
Innovation

Methods, ideas, or system contributions that make the work stand out.

scientific taste
publication decisions
fine-tuned language models
evaluative judgment
research triage
Ziqin Gong
School of Economics and Management, Tsinghua University, Beijing, China
Ning Li
School of Physics, Sun Yat-sen University
Hadron Physics · Lattice Effective Field Theory · Monte Carlo simulations · Few- and Many-body Systems
Huaikang Zhou
School of Economics and Management, Tsinghua University, Beijing, China