How to rank imputation methods?

📅 2025-07-15
🤖 AI Summary
Problem: Evaluating imputation methods without access to the complete data remains challenging. The usual practice of artificially masking observed values and comparing imputed to observed values with metrics such as RMSE can be misleading, and it is generally not valid under Missing at Random (MAR). Method: Following the concept of Imputation Scores (I-Scores), this paper identifies a new missingness assumption and proposes a score that combines a careful masking of observations with proper scoring rules, estimated with an algorithm involving energy scores, so that imputations are ranked by how well they replicate the distribution of the data. Contribution/Results: The authors show that the score is proper. Experiments on simulated data and on a downstream task demonstrate its efficacy in ranking imputation methods.

📝 Abstract
Imputation is an attractive tool for dealing with the widespread issue of missing values. Consequently, studying and developing imputation methods has been an active field of research over the last decade. Faced with an imputation task and a large number of methods, how does one find the most suitable imputation? Although model selection in different contexts, such as prediction, has been well studied, this question appears not to have received much attention. In this paper, we follow the concept of Imputation Scores (I-Scores) and develop a new, reliable, and easy-to-implement score to rank missing value imputations for a given data set without access to the complete data. In practice, imputations are usually ranked by artificially masking observations and comparing imputed to observed values using measures such as the Root Mean Squared Error (RMSE). We discuss how this approach of additionally masking observations can be misleading if not done carefully and that it is generally not valid under MAR. We then identify a new missingness assumption and develop a score that combines a sensible masking of observations with proper scoring rules. As such, the ranking is geared towards the imputation that best replicates the distribution of the data, which makes it possible to find imputations that are suitable for a range of downstream tasks. We show the propriety of the score and discuss an estimation algorithm involving energy scores. Finally, we show the efficacy of the new score in simulated data examples, as well as in a downstream task.
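To make the contrast between RMSE and a proper scoring rule concrete, here is a minimal sketch of a sample-based energy score, ES(P, y) = E‖X − y‖ − ½ E‖X − X′‖ with X, X′ drawn independently from P (lower is better). It is not the paper's estimation algorithm. The function name `energy_score`, the toy data, and the two imputation strategies are assumptions made for illustration. The toy setup compares an imputer that draws from the correct distribution with one that always imputes the mean.

```python
import numpy as np

def energy_score(samples, y):
    """Plug-in (V-statistic) estimate of the energy score; lower is better.

    samples: array of shape (n_draws,) or (n_draws, d), draws from the imputation.
    y: the held-out true value, a scalar or a vector of length d.
    """
    samples = np.asarray(samples, dtype=float)
    if samples.ndim == 1:
        samples = samples[:, None]
    y = np.asarray(y, dtype=float).reshape(-1)
    # First term: average distance between the draws and the observed value.
    term1 = np.linalg.norm(samples - y, axis=1).mean()
    # Second term: average pairwise distance between draws (rewards spread).
    diffs = samples[:, None, :] - samples[None, :, :]
    term2 = np.linalg.norm(diffs, axis=2).mean()
    return term1 - 0.5 * term2

# Toy setup (assumption): the held-out values follow N(0, 1).
rng = np.random.default_rng(0)
n_test, n_draws = 500, 50
truth = rng.normal(size=n_test)

draws_a = rng.normal(size=(n_test, n_draws))  # A: draws from the correct distribution
draws_b = np.zeros((n_test, n_draws))         # B: always imputes the mean

# RMSE of a single imputed value per held-out point.
rmse_a = np.sqrt(np.mean((draws_a[:, 0] - truth) ** 2))
rmse_b = np.sqrt(np.mean((draws_b[:, 0] - truth) ** 2))

# Energy score averaged over held-out points.
es_a = np.mean([energy_score(draws_a[i], truth[i]) for i in range(n_test)])
es_b = np.mean([energy_score(draws_b[i], truth[i]) for i in range(n_test)])

print(f"RMSE          A: {rmse_a:.3f}  B: {rmse_b:.3f}")
print(f"Energy score  A: {es_a:.3f}  B: {es_b:.3f}")
```

In this toy example RMSE prefers the mean imputer B, although B collapses the distribution to a single point, while the energy score prefers A, which reproduces the distribution of the held-out values.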
Problem

Research questions and friction points this paper is trying to address.

Ranking imputation methods for missing data
Developing reliable scores without complete data
Ensuring imputation replicates true data distribution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Develops a new score following the concept of Imputation Scores (I-Scores)
Combines a careful masking of observations with proper scoring rules
Uses energy scores in the estimation algorithm
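The general shape of a "mask, impute, score" ranking can be sketched as follows. This is an illustration only, under strong simplifying assumptions: the masking is uniform at random over observed cells, which is exactly the naive scheme the abstract warns can be misleading, and each held-out cell is scored on its own, so only marginal distributions are assessed. The paper's masking scheme and estimator are not reproduced here. The names `score_imputer`, `mean_imputer`, and `hot_deck_imputer` are hypothetical.

```python
import numpy as np

def energy_score_1d(draws, y):
    """Univariate plug-in energy score for one held-out value; lower is better."""
    draws = np.asarray(draws, dtype=float)
    spread = np.abs(draws[:, None] - draws[None, :]).mean()
    return np.abs(draws - y).mean() - 0.5 * spread

def mean_imputer(X, rng):
    """Hypothetical imputer: fill each missing cell with its column's observed mean."""
    out = X.copy()
    rows, cols = np.where(np.isnan(X))
    out[rows, cols] = np.nanmean(X, axis=0)[cols]
    return out

def hot_deck_imputer(X, rng):
    """Hypothetical imputer: draw each missing cell from its column's observed values."""
    out = X.copy()
    for j in range(X.shape[1]):
        miss = np.isnan(X[:, j])
        out[miss, j] = rng.choice(X[~miss, j], size=int(miss.sum()))
    return out

def score_imputer(X, imputer, n_draws=20, mask_frac=0.2, seed=0):
    """Mask a fraction of observed cells, impute n_draws times, average the energy score."""
    rng = np.random.default_rng(seed)
    obs_rows, obs_cols = np.where(~np.isnan(X))
    n_mask = int(mask_frac * obs_rows.size)
    pick = rng.choice(obs_rows.size, size=n_mask, replace=False)
    r, c = obs_rows[pick], obs_cols[pick]
    held_out = X[r, c]
    X_masked = X.copy()
    X_masked[r, c] = np.nan  # naive uniform masking (assumption, see lead-in)
    # One column of draws per repeated imputation of the masked data.
    draws = np.stack([imputer(X_masked, rng)[r, c] for _ in range(n_draws)], axis=1)
    return float(np.mean([energy_score_1d(draws[i], held_out[i])
                          for i in range(n_mask)]))

# Toy data (assumption): independent standard normal columns, 10% missing at random.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
X[rng.random(X.shape) < 0.1] = np.nan

imputers = {"mean": mean_imputer, "hot_deck": hot_deck_imputer}
scores = {name: score_imputer(X, imp) for name, imp in imputers.items()}
ranking = sorted(scores, key=scores.get)  # best (lowest score) first
print(scores, ranking)
```

Because the same seed is used for every imputer, all candidates are scored on the same held-out cells, which keeps the comparison between them fair within this simplified setup.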
Jeffrey Näf
Research Institute for Statistics and Information Science, University of Geneva
Krystyna Grzesiak
Faculty of Mathematics and Computer Science, University of Wrocław
Erwan Scornet
Professor, Sorbonne Université
Statistics, Machine Learning