How Useful is Causal Invariance for Domain Adaptation in Finite-Sample Settings?

📅 2026-06-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work asks when causal invariance improves supervised domain adaptation with only a few labeled target samples. Focusing on linear regression, the authors treat full or partial causal knowledge as specifying a collection of invariant or possibly invariant feature subsets, each yielding a source-trained candidate predictor, and they propose an adaptive aggregation procedure over these candidates that avoids negative transfer. They derive matching upper and lower bounds showing that finite-sample gains are governed by the target-risk margins separating the candidates, together with the finite-source estimation error. When the margins are sufficiently large relative to $n_Q$, the procedure matches the best candidate predictor without doing worse than target-only learning; when they are too small, no algorithm can reliably exploit the candidates to obtain faster finite-sample rates. The margins are further connected to structural shift magnitude in linear SCMs, and experiments on real-world causal benchmarks validate the theory.
📝 Abstract
Machine learning models often degrade when they are deployed on a target distribution that differs from the source distributions they were trained on. Recent work in causality-based domain generalization has shown how shared causal structure between domains can induce invariant predictors, e.g., models on a subset of features which have stable risk across structured domain shifts. However, the extent to which such population-level causal invariances can lead to gains in finite-sample settings remains underexplored. In particular, in practice we often have access to a few labeled target samples, a setting called supervised domain adaptation (sDA). In this paper, we explore when (full or partial) causal knowledge can provably improve supervised domain adaptation. As a first step, we study linear regression, where full or partial causal knowledge specifies a collection of invariant or possibly invariant feature subsets, each yielding a source-trained candidate predictor. We derive matching upper and lower bounds showing that finite-sample gains are governed by the target-risk margins separating the candidates, together with the finite-source estimation error. When these margins are sufficiently large relative to $n_Q$, an adaptive aggregation procedure can match the best candidate predictor while avoiding negative transfer relative to target-only learning. On the other hand, when the margins are too small, no algorithm can reliably exploit the candidate collection to obtain faster finite-sample rates. We further connect these margins to structural shift magnitude in linear SCMs and validate the theory on real-world causal benchmarks.
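The candidate-collection setup in the abstract can be illustrated with a minimal sketch. It is not the paper's aggregation procedure, which the abstract does not specify; plain hold-out selection on the target sample is swapped in as a simple stand-in. The function names, the synthetic data, and the subset labels "invariant" and "all-features" are all hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients (no intercept, for brevity)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def select_predictor(Xs, ys, Xt, yt, subsets, seed=0):
    """Hold-out selection (a stand-in, not the paper's method) among
    source-trained subset predictors and a target-only fit.
    Returns (label, feature_indices, coef)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(yt))
    half = len(yt) // 2
    fit_idx, val_idx = perm[:half], perm[half:]

    all_feats = list(range(Xt.shape[1]))
    # Target-only baseline, fit on one half of the target sample.
    candidates = [("target-only", all_feats, fit_ols(Xt[fit_idx], yt[fit_idx]))]
    # One source-trained candidate per feature subset.
    for name, S in subsets.items():
        S = list(S)
        candidates.append((name, S, fit_ols(Xs[:, S], ys)))

    def val_risk(candidate):
        _, S, coef = candidate
        resid = yt[val_idx] - Xt[val_idx][:, S] @ coef
        return np.mean(resid ** 2)

    # Keep whichever candidate has the lowest held-out target risk.
    return min(candidates, key=val_risk)

# Synthetic illustration (assumed data, not from the paper):
# x1 causes y in both domains; x2 is an effect of y whose sign flips in the target.
rng = np.random.default_rng(0)

def sample(n, sign):
    x1 = rng.normal(size=n)
    y = 2.0 * x1 + rng.normal(size=n)
    x2 = sign * y + rng.normal(size=n)
    return np.column_stack([x1, x2]), y

Xs, ys = sample(2000, +1.0)   # plentiful source data
Xt, yt = sample(40, -1.0)     # few labeled target samples
label, feats, coef = select_predictor(
    Xs, ys, Xt, yt, {"invariant": [0], "all-features": [0, 1]}
)
print(label, feats, coef)
```

Including the target-only fit among the candidates is what guards against negative transfer in this sketch: a source-trained predictor is kept only when it looks at least as good on held-out target data.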
Problem

Research questions and friction points this paper is trying to address.

causal invariance
domain adaptation
finite-sample
supervised domain adaptation
invariant predictors
Innovation

Methods, ideas, or system contributions that make the work stand out.

causal invariance
supervised domain adaptation
finite-sample analysis
invariant predictors
adaptive aggregation