Feature Shift Localization Network

📅 2025-06-10
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Feature distribution shift—prevalent in multi-source heterogeneous data (e.g., healthcare, finance, multi-sensor systems)—poses significant challenges for model generalization; existing approaches suffer from accuracy limitations and poor scalability to high-dimensional settings. Method: We propose the first end-to-end, generalizable neural framework that precisely localizes feature-level shifts across datasets and shift types—without requiring model retraining. Our approach integrates deep statistical modeling, meta-learning, and unsupervised feature importance estimation to explicitly characterize distributional discrepancies and quantify per-dimension shift magnitude. Contribution/Results: The framework achieves millisecond-scale inference on large-scale, high-dimensional data, significantly outperforming conventional statistical tests (e.g., KS, MMD) and state-of-the-art learning-based methods in localization accuracy. It is robust to diverse shift types—including covariate, concept, and label shift—and generalizes across domains. Source code and pre-trained models are publicly released.

Technology Category

Application Category

📝 Abstract
Feature shifts between data sources are present in many applications involving healthcare, biomedical, socioeconomic, financial, survey, and multi-sensor data, among others, where unharmonized heterogeneous data sources, noisy data measurements, or inconsistent processing and standardization pipelines can lead to erroneous features. Localizing shifted features is important to address the underlying cause of the shift and correct or filter the data to avoid degrading downstream analysis. While many techniques can detect distribution shifts, localizing the features originating them is still challenging, with current solutions being either inaccurate or not scalable to large and high-dimensional datasets. In this work, we introduce the Feature Shift Localization Network (FSL-Net), a neural network that can localize feature shifts in large and high-dimensional datasets in a fast and accurate manner. The network, trained with a large number of datasets, learns to extract the statistical properties of the datasets and can localize feature shifts from previously unseen datasets and shifts without the need for re-training. The code and ready-to-use trained model are available at https://github.com/AI-sandbox/FSL-Net.
Problem

Research questions and friction points this paper is trying to address.

Localizing feature shifts in heterogeneous data sources
Addressing inaccurate or non-scalable current solutions
Handling large and high-dimensional datasets efficiently
Innovation

Methods, ideas, or system contributions that make the work stand out.

Neural network localizes feature shifts
Handles large high-dimensional datasets
No re-training needed for new data
🔎 Similar Papers
No similar papers found.
M
M'iriam Barrab'es
Department of Computer Science, Munster Technological University, Cork T12 P928, Ireland
D
Daniel Mas Montserrat
Department of Biomedical Data Science, Stanford University, Stanford, CA 94305 USA
Kapal Dev
Kapal Dev
Assistant Professor @ Munster Technological University, Ireland.
Wireless NetworksSecurity and PrivacyAgentic AIIndustry 5.0
Alexander G. Ioannidis
Alexander G. Ioannidis
Assistant Professor