🤖 AI Summary
To address the dual challenges of data silos and multi-source missing values in profit forecasting for small and medium-sized enterprises (SMEs), this paper proposes a joint modeling framework under vertical federated learning. It introduces Vertical Federated Expectation Maximization (VFEM), an algorithm that brings the Expectation-Maximization (EM) method, which can handle complex missing-data patterns, into the vertical federated setting, so that institutions can collaborate without sharing raw data. The authors establish a linear convergence rate for VFEM and develop a statistical inference framework that supports interpretation of covariate effects. By combining distributed optimization with missing-data techniques, VFEM supports joint analysis of incomplete data held by multiple parties. Simulation studies and a real-world SME profit prediction task indicate that VFEM is a promising way to handle both data isolation and missing values.
📝 Abstract
Small and medium-sized enterprises (SMEs) play a crucial role in driving economic growth. Monitoring their financial performance and identifying relevant covariates are essential for risk assessment, business planning, and policy formulation. This paper focuses on predicting SME profits. The study faces two major challenges: 1) SME data are stored across different institutions, and centralized analysis is restricted by data security concerns; 2) data from different institutions contain different levels of missing values, resulting in complex missingness patterns. To tackle these issues, we introduce Vertical Federated Expectation Maximization (VFEM), an approach designed for federated learning with missing data. We embed a new EM algorithm into VFEM to handle complex missing patterns when access to the full dataset is infeasible. Furthermore, we establish a linear convergence rate for VFEM and develop a statistical inference framework, which enables assessment of covariate influence and enhances model interpretability. Extensive simulation studies validate its finite-sample performance. Finally, we apply VFEM to a real-life SME profit prediction problem. Our findings demonstrate that VFEM provides a promising solution for addressing data isolation and missing values, ultimately improving the understanding of SMEs' financial performance.
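For readers unfamiliar with how EM copes with missing entries, the following is a minimal sketch of the classical, centralized EM algorithm for estimating the mean and covariance of Gaussian data with missing values. It is not the VFEM algorithm from the paper: it assumes all data sit on one machine, that the data are multivariate normal, and that missing entries are coded as NaN and missing at random. The function name `em_gaussian_missing` and the simulated data are hypothetical and exist only for illustration.

```python
import numpy as np

def em_gaussian_missing(X, n_iter=30):
    """Generic EM for the mean/covariance of Gaussian rows with NaN-coded gaps.

    Illustrative only: a centralized textbook EM, not the paper's VFEM.
    """
    n, d = X.shape
    miss = np.isnan(X)
    mu = np.nanmean(X, axis=0)                # start from observed-data means
    sigma = np.diag(np.nanvar(X, axis=0))     # and a diagonal covariance
    for _ in range(n_iter):
        sum_x = np.zeros(d)
        sum_xx = np.zeros((d, d))
        for i in range(n):
            m, o = miss[i], ~miss[i]
            x = X[i].copy()
            c = np.zeros((d, d))              # conditional covariance of the gaps
            if m.any() and o.any():
                s_oo = sigma[np.ix_(o, o)]
                s_mo = sigma[np.ix_(m, o)]
                k = np.linalg.solve(s_oo, s_mo.T).T   # s_mo @ inv(s_oo)
                # E-step: conditional mean of the missing block given the observed block
                x[m] = mu[m] + k @ (x[o] - mu[o])
                c[np.ix_(m, m)] = sigma[np.ix_(m, m)] - k @ s_mo.T
            elif m.any():                     # row with nothing observed
                x = mu.copy()
                c = sigma.copy()
            sum_x += x
            sum_xx += np.outer(x, x) + c
        # M-step: update parameters from the expected sufficient statistics
        mu = sum_x / n
        sigma = sum_xx / n - np.outer(mu, mu)
    return mu, sigma

# Simulated example (assumed setup): 3 features, about 20% of entries missing.
rng = np.random.default_rng(0)
true_mu = np.array([1.0, -2.0, 0.5])
true_sigma = np.array([[1.0, 0.5, 0.2],
                       [0.5, 1.0, 0.3],
                       [0.2, 0.3, 1.0]])
X = rng.multivariate_normal(true_mu, true_sigma, size=1000)
X[rng.random(X.shape) < 0.2] = np.nan
mu_hat, sigma_hat = em_gaussian_missing(X)
```

In a vertical federated setting, the columns of `X` would be split across institutions and the cross-party blocks of the covariance could not be computed from raw data in one place. Per the abstract, that restriction is what VFEM is designed to handle.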