When should we trust the annotation? Selective prediction for molecular structure retrieval from mass spectra

📅 2026-03-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Current mass spectrometry–based molecular structure identification methods lack sufficient reliability for high-stakes applications, necessitating robust assessment of prediction confidence. This work proposes a selective prediction framework that abstains from making predictions when uncertainty is too high, leveraging a risk-coverage trade-off mechanism. It presents the first systematic evaluation of fingerprint-level versus retrieval-level uncertainty quantification strategies and their impact on annotation reliability. By integrating distribution-free risk control, the framework lets users specify a tolerable error rate and guarantees, with high probability, that this constraint is satisfied. Experimental results demonstrate that retrieval-level aleatoric uncertainty combined with a low-cost first-order confidence measure effectively balances risk and coverage, significantly outperforming fingerprint-level approaches and offering a reliable annotation solution for critical applications such as clinical metabolomics.

📝 Abstract
Machine learning methods for identifying molecular structures from tandem mass spectra (MS/MS) have advanced rapidly, yet current approaches still exhibit significant error rates. In high-stakes applications such as clinical metabolomics and environmental screening, incorrect annotations can have serious consequences, making it essential to determine when a prediction can be trusted. We introduce a selective prediction framework for molecular structure retrieval from MS/MS spectra, enabling models to abstain from predictions when uncertainty is too high. We formulate the problem within the risk-coverage tradeoff framework and comprehensively evaluate uncertainty quantification strategies at two levels of granularity: fingerprint-level uncertainty over predicted molecular fingerprint bits, and retrieval-level uncertainty over candidate rankings. We compare scoring functions including first-order confidence measures, aleatoric and epistemic uncertainty estimates from second-order distributions, as well as distance-based measures in the latent space. All experiments are conducted on the MassSpecGym benchmark. Our analysis reveals that while fingerprint-level uncertainty scores are poor proxies for retrieval success, computationally inexpensive first-order confidence measures and retrieval-level aleatoric uncertainty achieve strong risk-coverage tradeoffs across evaluation settings. We demonstrate that by applying distribution-free risk control via generalization bounds, practitioners can specify a tolerable error rate and obtain a subset of annotations satisfying that constraint with high probability.
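The abstract's core recipe is selective prediction with a confidence threshold calibrated so that the error rate among accepted annotations stays below a user-specified level with high probability. The sketch below is a simplified illustration of that idea, not the authors' implementation: it picks a threshold on a held-out calibration set using a Hoeffding-style upper confidence bound on the selective error rate. The function names (`calibrate_threshold`, `selective_predict`) and the bound used are assumptions for illustration; the paper's actual method applies distribution-free risk control via generalization bounds.

```python
import numpy as np

def calibrate_threshold(conf, correct, alpha=0.1, delta=0.05):
    """Choose a confidence threshold whose selective error on the
    calibration set is bounded below alpha by a Hoeffding-style upper
    confidence bound, aiming for risk <= alpha with prob. >= 1 - delta.

    conf:    array of per-prediction confidence scores
    correct: boolean array, True where the top-ranked candidate is right
    """
    order = np.argsort(-conf)                 # most confident first
    errors = np.cumsum(~correct[order])       # errors among top-k accepted
    n_acc = np.arange(1, len(conf) + 1)       # number accepted at each k
    # Hoeffding upper confidence bound on the selective error rate
    ucb = errors / n_acc + np.sqrt(np.log(1.0 / delta) / (2 * n_acc))
    valid = np.where(ucb <= alpha)[0]
    if len(valid) == 0:
        return np.inf                         # abstain on everything
    k = valid.max()                           # largest coverage meeting the bound
    return conf[order][k]

def selective_predict(conf, threshold):
    """Accept a prediction only when its confidence clears the threshold."""
    return conf >= threshold
```

On synthetic calibration data where higher confidence correlates with correctness, the calibrated threshold trades coverage (fraction of spectra annotated) for a bounded selective error rate, which is exactly the risk-coverage trade-off the paper evaluates.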
Problem

Research questions and friction points this paper is trying to address.

selective prediction
molecular structure retrieval
mass spectrometry
uncertainty quantification
risk-coverage tradeoff
Innovation

Methods, ideas, or system contributions that make the work stand out.

selective prediction
uncertainty quantification
risk-coverage tradeoff
mass spectrometry
molecular structure retrieval
Mira Jürgens
Department of Data Analysis and Mathematical Modeling, Ghent University, Coupure Links 653, Ghent, 9000, Belgium
Gaetan De Waele
Department of Computer Science, University of Antwerp, Middelheimlaan 1, Antwerp, 2020, Belgium
Morteza Rakhshaninejad
Department of Data Analysis and Mathematical Modeling, Ghent University, Coupure Links 653, Ghent, 9000, Belgium
Willem Waegeman
Ghent University
machine learning · data science · bioinformatics