AI Summary
In weakly supervised learning, existing methods are typically tailored to specific supervision paradigms (e.g., PU, UU, CLL, PLL) and rely on post-hoc calibration to mitigate instability; a unified, robust risk minimization framework has been missing. This paper proposes the first *direct surrogate risk minimization framework* applicable across diverse weak supervision settings, without requiring post-processing correction. By explicitly modeling the structural properties of weak supervision signals, the authors design a structure-aware surrogate loss. Theoretically, they derive non-asymptotic generalization bounds, analyze the impact of class-prior misspecification, and establish sufficient conditions for identifiability of the target risk. Empirically, the method consistently outperforms state-of-the-art approaches across varying class priors, dataset scales, and numbers of classes, and it resists overfitting without heuristic stabilization.
Abstract
Weakly supervised learning has emerged as a practical alternative to fully supervised learning when complete and accurate labels are costly or infeasible to acquire. However, many existing methods are tailored to specific supervision patterns -- such as positive-unlabeled (PU), unlabeled-unlabeled (UU), complementary-label (CLL), partial-label (PLL), or similarity-unlabeled annotations -- and rely on post-hoc corrections to mitigate instability induced by indirect supervision. We propose a principled, unified framework that bypasses such post-hoc adjustments by directly formulating a stable surrogate risk grounded in the structure of weakly supervised data. The formulation naturally subsumes diverse settings -- including PU, UU, CLL, PLL, multi-class unlabeled, and tuple-based learning -- under a single optimization objective. We further establish a non-asymptotic generalization bound via Rademacher complexity that clarifies how supervision structure, model capacity, and sample size jointly govern performance. Beyond this, we analyze the effect of class-prior misspecification on the bound, deriving explicit terms that quantify its impact, and we study identifiability, giving sufficient conditions -- most notably via supervision stratification across groups -- under which the target risk is recoverable. Extensive experiments show consistent gains across class priors, dataset scales, and class counts -- without heuristic stabilization -- while exhibiting robustness to overfitting.
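To make the instability mentioned above concrete, the sketch below uses the PU setting. It is not the framework proposed in this paper, whose surrogate risk is not specified in the abstract. It shows the widely used unbiased PU risk estimator and the non-negative clipping commonly applied to it, as one example of the kind of post-hoc correction the abstract refers to. The function names, the choice of sigmoid loss, and the toy scores are all assumptions made for illustration.

```python
import math


def sigmoid_loss(margin):
    """Sigmoid loss l(z) = 1 / (1 + exp(z)), evaluated at margin z = y * f(x)."""
    return 1.0 / (1.0 + math.exp(margin))


def mean(values):
    return sum(values) / len(values)


def pu_risks(scores_pos, scores_unl, prior):
    """Return (unbiased, clipped) PU risk estimates from classifier scores f(x).

    scores_pos: scores on labelled positive samples
    scores_unl: scores on unlabelled samples
    prior:      assumed class prior P(y = +1)
    """
    pos_as_pos = mean([sigmoid_loss(s) for s in scores_pos])   # P data, label +1
    pos_as_neg = mean([sigmoid_loss(-s) for s in scores_pos])  # P data, label -1
    unl_as_neg = mean([sigmoid_loss(-s) for s in scores_unl])  # U data, label -1

    # Estimates (1 - prior) * negative-class risk; it is non-negative in
    # expectation, but the empirical value can go below zero when the model overfits.
    neg_part = unl_as_neg - prior * pos_as_neg

    unbiased = prior * pos_as_pos + neg_part
    clipped = prior * pos_as_pos + max(0.0, neg_part)  # post-hoc non-negative correction
    return unbiased, clipped


# Toy scores imitating an overfit model: positives pushed far up, unlabelled far down.
unbiased, clipped = pu_risks([5.0, 6.0], [-5.0, -6.0], prior=0.5)
print(unbiased < 0.0, clipped >= 0.0)
```

With these toy scores the unbiased estimate becomes negative even though a risk cannot be, while the clipped variant stays non-negative. A framework that formulates a stable surrogate risk directly, as the abstract proposes, aims to avoid needing such a correction step.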