🤖 AI Summary
This study addresses the challenge of missing data imputation in compositional datasets containing zeros by proposing a nonparametric approach that avoids distributional assumptions. The method constructs k-nearest neighbors based on Jensen–Shannon divergence and performs imputation using the Fréchet mean, while incorporating an adaptive hyperparameter mechanism to accommodate diverse missingness patterns. Notably, this work is the first to jointly leverage Jensen–Shannon divergence and the Fréchet mean for compositional data analysis, offering a natural handling of zero values without imposing strong parametric constraints. Experimental results across multiple real-data simulation scenarios demonstrate that the proposed method consistently achieves higher imputation accuracy and computational efficiency compared to existing approaches.
📝 Abstract
A novel nonparametric method to impute missing values in compositional data is developed. The method is based on the $k$--$NN$ algorithm, utilizes the Jensen-Shannon divergence and employs the Fr{é}chet mean to allow for more flexibility in the estimation process. As an extra feature, the hyper-parameters can be self-adaptive according to the pattern of missing values. Unlike restrictive parametric models, the proposed method makes no assumption about the structure of the data and, most importantly, it is applicable even when compositional data contain zero values. Through simulation studies using real data, it is shown that the proposed algorithm outperforms competing algorithms at various settings, not only in terms of accuracy but also in terms of computational efficiency.