MVIAnalyzer: A Holistic Approach to Analyze Missing Value Imputation

📅 2025-07-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Missing values frequently bias an analysis or make it fail outright, yet no single imputation method performs fairly across all tasks. To address this, the paper proposes MVIAnalyzer, a generic framework that places missing value imputation (MVI) in the context of the full data analysis workflow, up to and including the application and evaluation of machine learning models. It supports assessment before and after imputation and covers missing value simulation, machine learning modeling, and result visualization. The simulation is configurable across missingness mechanisms (MCAR, MAR, MNAR) with fine-grained parameter control, and the accompanying software is provided so that other researchers can benchmark MVI methods across data types, models, and evaluation metrics. Experiments on data with different characteristics demonstrate the framework and show the possibilities and limitations of common imputation methods. MVIAnalyzer thereby offers a reproducible, extensible basis for MVI research and practical use.
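To make the three missingness mechanisms concrete, the following is a minimal sketch of how such a simulation could look. It is not MVIAnalyzer's code: the function `simulate_missing`, its parameters (`rate`, `driver`, `seed`), and the toy columns `age` and `income` are all hypothetical. For MAR and MNAR it uses a deliberately extreme, deterministic rule (remove the values in the rows that rank highest), whereas a real simulation would typically draw missingness probabilistically.

```python
import random

def simulate_missing(rows, target, mechanism="MCAR", rate=0.2, driver=None, seed=0):
    """Return a copy of `rows` (list of dicts) with `target` set to None in some rows."""
    rng = random.Random(seed)
    n = len(rows)
    k = round(rate * n)  # number of values to remove
    if mechanism == "MCAR":
        # Missing completely at random: every row is equally likely to be hit.
        hit = set(rng.sample(range(n), k))
    elif mechanism == "MAR":
        # Missing at random: missingness depends on another, observed column.
        # Simplification: remove the target where the driver column is largest.
        order = sorted(range(n), key=lambda i: rows[i][driver], reverse=True)
        hit = set(order[:k])
    elif mechanism == "MNAR":
        # Missing not at random: missingness depends on the removed value itself.
        order = sorted(range(n), key=lambda i: rows[i][target], reverse=True)
        hit = set(order[:k])
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    # Build new dicts so the input data stays untouched.
    return [dict(r, **{target: None}) if i in hit else dict(r)
            for i, r in enumerate(rows)]

# Hypothetical toy data: income grows with age.
data = [{"age": a, "income": 1000 + 50 * a} for a in range(20, 40)]
mcar = simulate_missing(data, "income", "MCAR", rate=0.25)
mar = simulate_missing(data, "income", "MAR", rate=0.25, driver="age")
mnar = simulate_missing(data, "income", "MNAR", rate=0.25)
```

In this sketch the MAR and MNAR variants remove the same rows, because the toy `income` column is a monotone function of `age`. The difference between the mechanisms lies in which column drives the missingness, and that is what an imputation method can or cannot exploit.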

📝 Abstract

Missing values often limit the applicability of data analysis or distort its results. Methods for missing value imputation (MVI) are therefore of great importance. In general, however, no single MVI method is universally suitable and fair across different tasks. This work therefore places MVI in the overall context of data analysis. To this end, we present the MVIAnalyzer, a generic framework for the holistic analysis of MVI. It covers the entire process up to the application and analysis of machine learning methods. The associated software is provided and can be used by other researchers for their own analyses. It also includes a missing value simulation that takes relevant parameters into account. The application of the MVIAnalyzer is demonstrated on data with different characteristics. An evaluation of the results shows the possibilities and limitations of different MVI methods. Since MVI is a complex topic with many influencing variables, this paper additionally illustrates how the analysis can be supported by visualizations.
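As an illustration of what comparing MVI methods can involve at the simplest level, the sketch below masks known values, imputes them with two basic strategies, and scores each by RMSE on the masked positions only. This is an assumption-laden toy, not the paper's evaluation protocol: the helpers `impute` and `rmse_on_missing` and the data are hypothetical, and the abstract describes an analysis that extends further, up to the downstream machine learning methods.

```python
import math
import statistics

def impute(values, strategy="mean"):
    """Fill None entries with the mean or median of the observed entries."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    else:
        fill = statistics.median(observed)
    return [fill if v is None else v for v in values]

def rmse_on_missing(truth, masked, imputed):
    """RMSE between true and imputed values, restricted to masked positions."""
    errs = [(t - p) ** 2 for t, m, p in zip(truth, masked, imputed) if m is None]
    return math.sqrt(sum(errs) / len(errs))

# Hypothetical column with one outlier; two values are masked for evaluation.
truth = [1.0, 2.0, 3.0, 4.0, 100.0, 6.0]
masked = [1.0, None, 3.0, 4.0, 100.0, None]

scores = {s: rmse_on_missing(truth, masked, impute(masked, s))
          for s in ("mean", "median")}
```

On this toy column the median strategy scores better because the outlier pulls the mean far from the masked values. With different data the ranking can flip, which is why no single method can be assumed to be best for every task.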

Problem

Research questions and friction points this paper is trying to address.

Analyzing missing value imputation in the context of the overall data analysis

Evaluating limitations of different MVI methods

Providing a framework for holistic MVI analysis

Innovation

Methods, ideas, or system contributions that make the work stand out.

Holistic framework for missing value imputation analysis

Includes missing value simulation with key parameters

Supports analysis with visualization techniques

🔎 Similar Papers

Deep Learning for Multivariate Time Series Imputation: A Survey