MVIAnalyzer: A Holistic Approach to Analyze Missing Value Imputation

📅 2025-07-28
🤖 AI Summary
Missing values frequently induce analytical bias or failure, yet existing imputation methods lack task-agnostic, universal evaluation criteria. To address this, we propose MVIAnalyzer, the first general-purpose analytical framework that embeds missing value imputation (MVI) into end-to-end data science workflows. Our approach enables comprehensive pre- and post-imputation assessment, encompassing synthetic data generation, machine learning modeling, and result visualization. It introduces configurable, multi-mechanism missingness simulation (MCAR, MAR, MNAR) with fine-grained parametric control, and delivers an open-source toolchain supporting systematic benchmarking across diverse data types, models, and evaluation metrics. Extensive experiments on multiple real-world datasets validate the framework’s effectiveness, uncovering performance boundaries and contextual applicability of mainstream imputation methods across tasks. MVIAnalyzer establishes a reproducible, extensible analytical paradigm and evidence-based guidance for both MVI research and practical deployment.
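The three missingness mechanisms named in the summary can be illustrated with a small sketch. This is not the MVIAnalyzer implementation; the function name `simulate_missing`, its parameters, and the rank-based probability scheme are assumptions chosen for illustration. Under MCAR every cell is removed with the same probability, under MAR the probability grows with the rank of a fully observed `driver` column, and under MNAR it grows with the rank of the target value itself.

```python
import random


def simulate_missing(rows, target, mechanism="MCAR", rate=0.2, driver=None, seed=0):
    """Return a copy of `rows` (a list of dicts) with None injected into `target`.

    Hypothetical helper for illustration only:
      MCAR: each cell is removed with the same probability `rate`.
      MAR:  removal probability rises with the rank of the `driver` column.
      MNAR: removal probability rises with the rank of `target` itself.
    For MAR and MNAR the probabilities average to `rate`.
    """
    if mechanism not in ("MCAR", "MAR", "MNAR"):
        raise ValueError(f"unknown mechanism: {mechanism}")
    if not 0.0 <= rate <= 0.5:
        raise ValueError("rate must lie in [0, 0.5] for this sketch")
    if mechanism == "MAR" and driver is None:
        raise ValueError("MAR needs a fully observed driver column")

    rng = random.Random(seed)
    n = len(rows)
    if mechanism == "MCAR":
        probs = [rate] * n
    else:
        col = driver if mechanism == "MAR" else target
        # Rank the rows by the controlling column; higher rank, higher risk.
        order = sorted(range(n), key=lambda i: rows[i][col])
        probs = [0.0] * n
        for rank, i in enumerate(order):
            probs[i] = 2 * rate * rank / max(n - 1, 1)

    masked = []
    for row, p in zip(rows, probs):
        new_row = dict(row)  # copy so the input stays untouched
        if rng.random() < p:
            new_row[target] = None
        masked.append(new_row)
    return masked
```

Calling `simulate_missing(data, "income", mechanism="MAR", driver="age", rate=0.3)` on a list of records with hypothetical `age` and `income` fields would remove roughly 30% of the `income` values, preferentially for rows with higher `age`. The linear rank scheme is only one of many ways to parameterize MAR and MNAR.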

📝 Abstract
Missing values often limit the applicability of data analysis or distort its results. Methods of missing value imputation (MVI) are therefore of great significance. In general, however, there is no universal, fair MVI method for different tasks. This work thus places MVI in the overall context of data analysis. For this purpose, we present the MVIAnalyzer, a generic framework for a holistic analysis of MVI. It considers the overall process up to the application and analysis of machine learning methods. The associated software is provided and can be used by other researchers for their own analyses. It further includes a missing value simulation that takes relevant parameters into account. The application of the MVIAnalyzer is demonstrated on data with different characteristics. An evaluation of the results shows the possibilities and limitations of different MVI methods. Since MVI is a very complex topic with different influencing variables, this paper additionally illustrates how the analysis can be supported by visualizations.
Problem

Research questions and friction points this paper is trying to address.

Analyzing missing value imputation in data analysis context
Evaluating limitations of different MVI methods
Providing a framework for holistic MVI analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Holistic framework for missing value imputation analysis
Includes missing value simulation with key parameters
Supports analysis with visualization techniques
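The idea of analyzing imputation within the whole analysis process, rather than by reconstruction error alone, can be sketched as follows. This is a minimal illustration under stated assumptions, not the MVIAnalyzer software: the names `analyze`, `impute_mean`, and `impute_regression` are hypothetical, missingness is simulated as MCAR only, and a one-variable least-squares fit stands in for the downstream machine learning method.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def impute_mean(xs, ys):
    """Fill missing y values with the mean of the observed ones."""
    mean = statistics.fmean(y for y in ys if y is not None)
    return [mean if y is None else y for y in ys]


def impute_regression(xs, ys):
    """Fill missing y values with predictions from a line fitted on observed pairs."""
    pairs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    return [a + b * x if y is None else y for x, y in zip(xs, ys)]


def rmse(truth, filled, mask):
    """Root mean squared error over the cells that were masked."""
    errs = [(t - f) ** 2 for t, f, m in zip(truth, filled, mask) if m]
    return math.sqrt(statistics.fmean(errs))


def analyze(xs, ys, rate=0.3, seed=0):
    """Mask y completely at random, impute, and score each imputer twice:
    by reconstruction error and by its effect on the downstream slope."""
    rng = random.Random(seed)
    mask = [rng.random() < rate for _ in ys]
    masked = [None if m else y for y, m in zip(ys, mask)]
    _, true_slope = fit_line(xs, ys)
    report = {}
    for name, imputer in (("mean", impute_mean), ("regression", impute_regression)):
        filled = imputer(xs, masked)
        _, slope = fit_line(xs, filled)
        report[name] = {
            "imputation_rmse": rmse(ys, filled, mask),
            "slope_error": abs(slope - true_slope),
        }
    return report
```

On data where `ys` depends linearly on `xs`, mean imputation typically flattens the downstream slope while regression imputation preserves it, which is the kind of effect that only becomes visible when the downstream analysis is part of the evaluation. The per-method report could then be plotted to support the comparison visually.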
Valerie Restat
University of Hagen, Hagen, Germany
Kai Tejkl
University of Hagen, Hagen, Germany
Uta Störl
Professor of Computer Science, University of Hagen
Database Systems, NoSQL, Big Data