Optimizing importance weighting in the presence of sub-population shifts

📅 2024-10-18

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Importance weighting methods can underperform under sub-population shift because their heuristic weight choices ignore the extra variance of a model estimated from a finite training sample. Method: We propose a bias-variance-driven framework for choosing importance weights, formulated as a bi-level optimization problem in which the weights and the model parameters are optimized simultaneously. The method explicitly accounts for the finite-sample variance of the weighted estimator, so the weights are chosen to trade bias against variance rather than fixed by a heuristic. It is applied on top of existing importance weighting techniques and supports efficient adaptation by retraining only the final layer of a deep neural network. Contribution/Results: Evaluated on multiple sub-population shift benchmarks, optimizing the weights significantly improves generalization performance compared with the heuristic weights of the underlying methods.

📝 Abstract

A distribution shift between the training and test data can severely harm the performance of machine learning models. Importance weighting addresses this issue by assigning different weights to data points during training. We argue that existing heuristics for determining the weights are suboptimal, as they neglect the increase in the variance of the estimated model due to the finite sample size of the training data. We interpret the optimal weights in terms of a bias-variance trade-off, and propose a bi-level optimization procedure in which the weights and model parameters are optimized simultaneously. We apply this optimization to existing importance weighting techniques for last-layer retraining of deep neural networks in the presence of sub-population shifts and show empirically that optimizing weights significantly improves generalization performance.
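To make the idea of assigning different weights to data points concrete, here is a minimal sketch of one common heuristic, inverse group-frequency weighting, combined with a weighted logistic loss on fixed features (as in last-layer retraining). The function names, the choice of heuristic, and the normalization to mean 1 are assumptions made for illustration; the abstract does not specify which heuristics the paper builds on.

```python
import numpy as np

def inverse_frequency_weights(groups):
    """Per-sample weights proportional to 1 / (size of the sample's group).

    Weights are rescaled to have mean 1 (an illustrative convention).
    """
    groups = np.asarray(groups)
    _, inverse, counts = np.unique(groups, return_inverse=True, return_counts=True)
    w = 1.0 / counts[inverse]
    return w * len(groups) / w.sum()

def weighted_logistic_loss(theta, X, y, w):
    """Weighted average of the logistic loss for labels y in {0, 1}."""
    logits = X @ theta
    # log(1 + exp(z)) - y * z, evaluated in a numerically stable way
    losses = np.logaddexp(0.0, logits) - y * logits
    return float(np.sum(w * losses) / np.sum(w))

# Hypothetical toy data: three samples from a majority group, one from a minority group.
groups = np.array([0, 0, 0, 1])
w = inverse_frequency_weights(groups)
```

With these weights each group contributes equally to the loss. That reduces the bias toward the majority group, but the few minority samples then carry large weights, which is the variance cost the abstract argues such heuristics neglect.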

Problem

Research questions and friction points this paper is trying to address.

Addressing suboptimal heuristics in importance weighting methods

Optimizing weights to balance bias-variance trade-off in distribution shifts

Improving generalization performance under sub-population shifts between training and test data

Innovation

Methods, ideas, or system contributions that make the work stand out.

Bi-level optimization for weight and parameter tuning

Bias-variance trade-off interpretation of optimal weights

Simultaneous optimization applied to last-layer retraining of deep neural networks
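The bi-level structure named above can be sketched roughly as follows. The inner problem fits model parameters for a given set of weights, and the outer problem picks the weights that minimize the loss on held-out data. This is a simplified stand-in under stated assumptions, not the paper's procedure: it uses weighted ridge regression with a closed-form inner solution, a single scalar weight `p` for a minority group, and a grid search in place of a gradient-based outer optimizer. All names here are hypothetical.

```python
import numpy as np

def fit_weighted_ridge(X, y, w, lam=1e-3):
    """Inner problem: weighted ridge regression solved in closed form."""
    A = X.T @ (w[:, None] * X) + lam * np.eye(X.shape[1])
    b = X.T @ (w * y)
    return np.linalg.solve(A, b)

def select_group_weight(X_tr, y_tr, g_tr, X_val, y_val, candidates):
    """Outer problem: choose the minority-group weight p with the lowest validation loss.

    Samples with g_tr == 1 get weight p; all others get weight 1 - p.
    Returns (best_p, best_val_loss, best_theta).
    """
    best = None
    for p in candidates:
        w = np.where(g_tr == 1, p, 1.0 - p)
        theta = fit_weighted_ridge(X_tr, y_tr, w)
        val_loss = float(np.mean((X_val @ theta - y_val) ** 2))
        if best is None or val_loss < best[1]:
            best = (p, val_loss, theta)
    return best
```

The validation set is assumed to reflect the target distribution, so the outer loss accounts for both the bias from under-weighting the minority group and the variance from over-weighting a handful of samples. A grid over one scalar is used here only because it is easy to verify; it does not scale to many weights.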

🔎 Similar Papers

Multiple importance sampling for stochastic gradient estimation