PASS: Ambiguity Guided Subsets for Scalable Classical and Quantum Constrained Clustering

📅 2026-01-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of balancing constraint satisfaction and computational efficiency in pairwise-constrained clustering, particularly in large-scale and quantum or quantum-hybrid settings. The authors propose PASS, a framework built around ambiguity-guided subset selection. PASS first collapses must-link (ML) constraints into pseudo-points, then applies one of two selectors: a constraint-aware margin rule that collects near-boundary points together with all detected cannot-link (CL) violations, or an information-geometric rule that scores points by a Fisher-Rao distance derived from soft assignment posteriors and keeps the highest-information subset under a simple budget. Across diverse benchmarks, PASS attains a sum of squared errors (SSE) competitive with exact and penalty-based methods at substantially lower computational cost, while preserving ML and CL constraint satisfaction, and it remains effective in regimes where those prior approaches fail.
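The must-link compression step can be illustrated with a short sketch. It is not the authors' implementation: it assumes that a pseudo-point is the mean of a must-link connected component and that the component size is carried along as a weight. The function name `collapse_must_links` and its return values are hypothetical.

```python
import numpy as np

def collapse_must_links(X, ml_pairs):
    """Merge must-link components into pseudo-points.

    Assumption (not from the source): each pseudo-point is the mean of
    its component, and the component size is returned as a weight.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Union the two endpoints of every must-link pair.
    for a, b in ml_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    roots = np.array([find(i) for i in range(n)])
    # group[i] is the pseudo-point index of original sample i.
    _, group = np.unique(roots, return_inverse=True)
    k = int(group.max()) + 1
    weights = np.bincount(group, minlength=k).astype(float)
    pseudo = np.zeros((k, X.shape[1]))
    np.add.at(pseudo, group, X)      # sum the members of each component
    pseudo /= weights[:, None]       # turn sums into means
    return pseudo, weights, group
```

Because must-link is transitive, chained pairs such as (0, 1) and (1, 3) end up in a single pseudo-point, which shrinks the problem before any subset selection takes place.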

📝 Abstract

Pairwise-constrained clustering augments unsupervised partitioning with side information by enforcing must-link (ML) and cannot-link (CL) constraints between specific samples, yielding labelings that respect known affinities and separations. However, ML and CL constraints add an extra layer of complexity to the clustering problem, and current methods struggle to scale with data size, especially in niche applications such as quantum or quantum-hybrid clustering. We propose PASS, a pairwise-constraints and ambiguity-driven subset selection framework that preserves ML and CL constraint satisfaction while allowing scalable, high-quality clustering solutions. PASS collapses ML constraints into pseudo-points and offers two selectors: a constraint-aware margin rule that collects near-boundary points and all detected CL violations, and an information-geometric rule that scores points via a Fisher-Rao distance derived from soft assignment posteriors, then selects the highest-information subset under a simple budget. Across diverse benchmarks, PASS attains competitive SSE at substantially lower cost than exact or penalty-based methods, and remains effective in regimes where prior approaches fail.
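The two selectors described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's method. The abstract does not specify the posterior model, the reference distribution for the Fisher-Rao score, or the margin definition. Here the posteriors are assumed to be a softmax over negative squared distances to the centroids, the score is assumed to be the Fisher-Rao distance from a point's posterior to its own hard (one-hot) assignment, and the margin is assumed to be the gap between the distances to the two nearest centroids. All function names and the parameters `temperature`, `budget`, and `tau` are hypothetical. The only standard fact used is the closed form of the Fisher-Rao distance between categorical distributions, $d(p, q) = 2 \arccos \sum_k \sqrt{p_k q_k}$.

```python
import numpy as np

def soft_posteriors(X, centroids, temperature=1.0):
    """Softmax over negative squared distances (an assumed posterior model)."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True), d2

def fisher_rao(p, q):
    """Fisher-Rao geodesic distance between categorical distributions."""
    bc = np.sqrt(p * q).sum(axis=-1)  # Bhattacharyya coefficient
    return 2.0 * np.arccos(np.clip(bc, 0.0, 1.0))

def select_by_information(P, budget):
    """Keep the `budget` points farthest from their own hard assignment.

    Assumption: ambiguity is measured against the one-hot vector of the
    most likely cluster; the source does not state the reference point.
    """
    hard = np.eye(P.shape[1])[P.argmax(axis=1)]
    scores = fisher_rao(P, hard)
    order = np.argsort(-scores, kind="stable")  # most ambiguous first
    return order[:budget], scores

def select_by_margin(d2, labels, cl_pairs, tau):
    """Near-boundary points plus both endpoints of every violated CL pair."""
    d = np.sqrt(np.sort(d2, axis=1))
    near = np.flatnonzero(d[:, 1] - d[:, 0] < tau)  # small two-nearest gap
    violated = [i for a, b in cl_pairs if labels[a] == labels[b] for i in (a, b)]
    return np.union1d(near, np.array(violated, dtype=int))
```

With two centroids, a point equidistant from both has posterior (0.5, 0.5) and receives the score $\pi/2$, while a point far inside one cluster scores close to zero, so a small budget concentrates on the boundary region. The selected indices would then be passed to a more expensive exact, penalty-based, or quantum solver, with the remaining points keeping their current assignment.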

Problem

Research questions and friction points this paper is trying to address.

pairwise-constrained clustering

scalability

must-link constraints

cannot-link constraints

quantum clustering

Innovation

Methods, ideas, or system contributions that make the work stand out.

pairwise-constrained clustering

subset selection

ambiguity-guided