Mixture Proportion Estimation and Weakly-supervised Kernel Test for Conditional Independence

📅 2026-04-08
🤖 AI Summary
This study addresses the estimation of class priors from unlabeled data, a fundamental problem in weakly supervised learning scenarios such as positive-unlabeled (PU) learning, learning with label noise, and domain adaptation. By assuming conditional independence given the class label, the work obtains identifiability of mixture proportions even when the traditional irreducibility assumption does not hold. Building on this insight, the authors develop method-of-moments estimators for class priors and analyze their asymptotic properties. They also propose weakly-supervised kernel-based hypothesis tests to validate the conditional independence assumptions, with potential applications to causal discovery and fairness evaluation. Experiments show that the proposed estimators improve on existing approaches and that the kernel-based tests control both Type I and Type II errors.
📝 Abstract
Mixture proportion estimation (MPE) aims to estimate class priors from unlabeled data. This task is a critical component in weakly supervised learning, such as PU learning, learning with label noise, and domain adaptation. Existing MPE methods rely on the irreducibility assumption or its variant for identifiability. In this paper, we propose novel assumptions based on conditional independence (CI) given the class label, which ensure identifiability even when irreducibility does not hold. We develop method-of-moments estimators under these assumptions and analyze their asymptotic properties. Furthermore, we present weakly-supervised kernel tests to validate the CI assumptions, which are of independent interest in applications such as causal discovery and fairness evaluation. Empirically, we demonstrate the improved performance of our estimators compared with existing methods and that our tests successfully control both type I and type II errors.
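To make the idea of a moment-based estimator under conditional independence concrete, here is a minimal sketch. It is not the paper's estimator, whose exact form the abstract does not give. It assumes a PU setting with one sample from the positive class and one unlabeled sample, two scalar features that are independent given the class label, and the features themselves as moment functions. Under those assumptions the unlabeled covariance equals π(1-π)(a-a')(b-b') and the product of mean shifts equals (1-π)²(a-a')(b-b'), where a, b are positive-class means and a', b' are negative-class means, so π = cov / (cov + shift). The function name `estimate_pi_pu` and the Gaussian simulation are hypothetical and chosen only for illustration.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def estimate_pi_pu(pos, unl):
    """Moment estimate of the positive-class proportion in `unl`.

    pos, unl: lists of (x1, x2) pairs. Assumes x1 and x2 are
    independent given the class label (illustrative assumption).
    """
    a = mean([x1 for x1, _ in pos])   # E[x1 | positive]
    b = mean([x2 for _, x2 in pos])   # E[x2 | positive]
    m1 = mean([x1 for x1, _ in unl])  # E[x1] under the mixture
    m2 = mean([x2 for _, x2 in unl])  # E[x2] under the mixture
    # Mixture covariance: pi * (1 - pi) * (a - a') * (b - b')
    cov = mean([(x1 - m1) * (x2 - m2) for x1, x2 in unl])
    # Product of mean shifts: (1 - pi)^2 * (a - a') * (b - b')
    shift = (m1 - a) * (m2 - b)
    return cov / (cov + shift)


# Simulated data (hypothetical): class means +1 (positive) and -1 (negative),
# unit variance, both features drawn independently given the class.
rng = random.Random(0)
true_pi = 0.3
n = 20000
pos = [(rng.gauss(1.0, 1.0), rng.gauss(1.0, 1.0)) for _ in range(n)]
unl = []
for _ in range(n):
    mu = 1.0 if rng.random() < true_pi else -1.0
    unl.append((rng.gauss(mu, 1.0), rng.gauss(mu, 1.0)))

pi_hat = estimate_pi_pu(pos, unl)
print(round(pi_hat, 3))
```

The ratio is undefined when the class means coincide in either feature, since both the covariance and the shift then vanish; a practical estimator would need moment functions that separate the classes.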
Problem

Research questions and friction points this paper is trying to address.

Mixture Proportion Estimation
Weakly-supervised Learning
Conditional Independence
Class Prior Estimation
Identifiability
Innovation

Methods, ideas, or system contributions that make the work stand out.

mixture proportion estimation
conditional independence
weakly-supervised learning
method of moments
kernel test
Yushi Hirose
Institute of Science Tokyo, RIKEN AIP
Akito Narahara
Institute of Science Tokyo, RIKEN AIP
Takafumi Kanamori
Institute of Science Tokyo
Machine Learning, Mathematical Statistics