qc-kmeans: A Quantum Compressive K-Means Algorithm for NISQ Devices

📅 2025-10-26
🤖 AI Summary
On NISQ devices, quantum k-means clustering suffers from high data-loading overhead and a qubit count that grows with sample size, which limits practical deployment. Method: We propose a hybrid quantum k-means algorithm featuring (i) a constant-size, unbiased Fourier-feature sketch that compresses the dataset, so the peak qubit requirement does not scale with the number of samples (≤9 qubits on the low-dimensional synthetic benchmarks); and (ii) an elitist-retention refinement step combined with shallow QAOA (depth p=1) that solves small surrogate QUBO subproblems within the compressive-learning framework. Contribution/Results: In Qiskit Aer simulations, the method achieves sum-of-squared errors competitive with quantum baselines on synthetic benchmarks, keeps peak-qubit usage constant on nine real-world datasets (up to 4.3×10^5 points), and reaches similar accuracy under IBM noise models. Runtimes are not directly comparable. The result is a NISQ-oriented k-means formulation with shallow, bounded-width circuits whose qubit count is independent of sample size.

📝 Abstract
Clustering on NISQ hardware is constrained by data loading and limited qubits. We present \textbf{qc-kmeans}, a hybrid compressive $k$-means that summarizes a dataset with a constant-size Fourier-feature sketch and selects centroids by solving small per-group QUBOs with shallow QAOA circuits. The QFF sketch estimator is unbiased with mean-squared error $O(\varepsilon^2)$ for $B,S=\Theta(\varepsilon^{-2})$, and the peak-qubit requirement $q_{\text{peak}}=\max\{D,\lceil \log_2 B \rceil + 1\}$ does not scale with the number of samples. A refinement step with elitist retention ensures non-increasing surrogate cost. In Qiskit Aer simulations (depth $p{=}1$), the method ran with $\le 9$ qubits on low-dimensional synthetic benchmarks and achieved competitive sum-of-squared errors relative to quantum baselines; runtimes are not directly comparable. On nine real datasets (up to $4.3\times 10^5$ points), the pipeline maintained constant peak-qubit usage in simulation. Under IBM noise models, accuracy was similar to the idealized setting. Overall, qc-kmeans offers a NISQ-oriented formulation with shallow, bounded-width circuits and competitive clustering quality in simulation.
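
To make the peak-qubit formula concrete, the short sketch below evaluates $q_{\text{peak}}=\max\{D,\lceil \log_2 B \rceil + 1\}$ for a few settings. The abstract does not define $D$ and $B$, so the code treats them only as positive integers; the sample values are hypothetical and chosen to illustrate the arithmetic, not taken from the paper's experiments.

```python
def peak_qubits(D: int, B: int) -> int:
    """Evaluate q_peak = max(D, ceil(log2(B)) + 1) from the abstract."""
    if D < 1 or B < 1:
        raise ValueError("D and B must be positive integers")
    # (B - 1).bit_length() is an exact integer ceil(log2(B)) for B >= 1,
    # which avoids floating-point rounding in math.log2.
    ceil_log2_B = (B - 1).bit_length()
    return max(D, ceil_log2_B + 1)


# Hypothetical (D, B) pairs, for illustration only.
for D, B in [(2, 64), (2, 256), (8, 100), (12, 256)]:
    print(f"D={D:>2}, B={B:>3} -> q_peak={peak_qubits(D, B)}")
```

Note that the number of samples never appears in the function: under this formula only $D$ and $B$ set the circuit width. For example, any $D \le 9$ with $B \le 256$ stays within 9 qubits.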
Problem

Research questions and friction points this paper is trying to address.

Developing quantum clustering for NISQ devices with limited qubits
Addressing data loading constraints via compressive Fourier-feature sketches
Solving centroid selection with shallow QAOA circuits on small QUBOs
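
The idea of a constant-size sketch can be illustrated classically. The following is a minimal sketch, assuming the common compressive-learning construction in which a dataset is summarized by its empirical characteristic function at a fixed set of random frequencies. It is not the paper's QFF estimator, and the names `fourier_sketch`, `dim`, and `m` are hypothetical.

```python
import cmath
import random


def fourier_sketch(points, freqs):
    """Empirical characteristic function of `points` sampled at `freqs`.

    Returns one complex number per frequency, so the output length depends
    only on len(freqs), not on len(points).
    """
    n = len(points)
    sketch = []
    for w in freqs:
        total = 0j
        for x in points:
            phase = sum(wi * xi for wi, xi in zip(w, x))
            total += cmath.exp(1j * phase)
        sketch.append(total / n)  # sample mean of exp(i * w . x)
    return sketch


rng = random.Random(0)
dim, m = 2, 16  # hypothetical data dimension and sketch size
freqs = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(m)]

small = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(100)]
large = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(5000)]

sketch_small = fourier_sketch(small, freqs)
sketch_large = fourier_sketch(large, freqs)
print(len(sketch_small), len(sketch_large))  # both equal m
```

Each entry is a sample mean, so it is an unbiased estimate of the underlying distribution's characteristic function at that frequency. The summary keeps the same length whether the dataset has 100 points or 5000.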
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses constant-size Fourier-feature sketch for dataset summarization
Solves small QUBOs with shallow QAOA circuits for centroids
Employs refinement step with elitist retention for cost control
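
The last two points can be sketched together in a few lines. The example below is a hedged illustration, not the authors' implementation: an exhaustive search stands in for QAOA on a small QUBO, and a generic keep-the-best loop stands in for the elitist-retention refinement. All function names, the random matrix `Q`, and the bit-flip proposal are assumptions made for the example.

```python
import itertools
import random


def qubo_cost(Q, x):
    """Return x^T Q x for a binary vector x."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def solve_qubo_brute_force(Q):
    """Exhaustively minimize x^T Q x; a classical stand-in for QAOA."""
    n = len(Q)
    return min(itertools.product((0, 1), repeat=n),
               key=lambda x: qubo_cost(Q, x))


def refine_with_elitism(cost, propose, initial, rounds):
    """Keep the best solution so far; accept a proposal only if it is no worse."""
    best, best_cost = initial, cost(initial)
    history = [best_cost]
    for _ in range(rounds):
        candidate = propose(best)
        candidate_cost = cost(candidate)
        if candidate_cost <= best_cost:  # elitist retention
            best, best_cost = candidate, candidate_cost
        history.append(best_cost)
    return best, history


rng = random.Random(1)
n = 4  # small enough for exhaustive search
Q = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]


def flip_one_bit(x):
    """Hypothetical noisy proposal: flip one randomly chosen bit."""
    i = rng.randrange(len(x))
    return tuple(b ^ 1 if k == i else b for k, b in enumerate(x))


exact = solve_qubo_brute_force(Q)
best, history = refine_with_elitism(
    lambda x: qubo_cost(Q, x), flip_one_bit, (1, 1, 1, 1), rounds=20
)
print("exact optimum cost:", qubo_cost(Q, exact))
print("cost history:", history)
```

Because a candidate replaces the incumbent only when its cost is no higher, the recorded cost history is non-increasing by construction, whatever the quality of the proposals. This is the property the refinement step is described as guaranteeing for the surrogate cost.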
Authors
Pedro Chumpitaz-Flores
University of South Florida, Tampa, FL, USA
My Duong
University of South Florida, Tampa, FL, USA
Ying Mao
Fordham University, New York, NY, USA
Kaixun Hua
Assistant Professor, University of South Florida
Trustworthy AI, Clustering, Global Optimization