qc-kmeans: A Quantum Compressive K-Means Algorithm for NISQ Devices

📅 2025-10-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

On NISQ devices, quantum k-means clustering is limited by high data-loading overhead and by qubit counts that grow with the number of samples. Method: We propose a hybrid quantum k-means algorithm with (i) a constant-size, unbiased Fourier-feature sketch that compresses the dataset, so that the peak qubit requirement does not scale with the number of samples; and (ii) shallow-depth QAOA applied to small surrogate QUBO subproblems, followed by a refinement step with elitist retention that keeps the surrogate cost non-increasing. Contribution/Results: In Qiskit Aer simulations, the method ran with ≤9 qubits on low-dimensional synthetic benchmarks and achieved sum-of-squared errors competitive with quantum baselines. On nine real datasets, peak-qubit usage stayed constant in simulation, and accuracy under IBM noise models was similar to the idealized setting.

📝 Abstract

Clustering on NISQ hardware is constrained by data loading and limited qubits. We present \textbf{qc-kmeans}, a hybrid compressive $k$-means that summarizes a dataset with a constant-size Fourier-feature sketch and selects centroids by solving small per-group QUBOs with shallow QAOA circuits. The QFF sketch estimator is unbiased with mean-squared error $O(\varepsilon^2)$ for $B,S=\Theta(\varepsilon^{-2})$, and the peak-qubit requirement $q_{\text{peak}}=\max\{D,\lceil \log_2 B \rceil + 1\}$ does not scale with the number of samples. A refinement step with elitist retention ensures non-increasing surrogate cost. In Qiskit Aer simulations (depth $p{=}1$), the method ran with $\le 9$ qubits on low-dimensional synthetic benchmarks and achieved competitive sum-of-squared errors relative to quantum baselines; runtimes are not directly comparable. On nine real datasets (up to $4.3\times 10^5$ points), the pipeline maintained constant peak-qubit usage in simulation. Under IBM noise models, accuracy was similar to the idealized setting. Overall, qc-kmeans offers a NISQ-oriented formulation with shallow, bounded-width circuits and competitive clustering quality in simulation.
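
The peak-qubit formula in the abstract can be evaluated directly. The following is a minimal sketch; the function name `peak_qubits` and the example values of D and B are illustrative assumptions, not taken from the paper, and the abstract does not define D or B beyond their role in the formula.

```python
def peak_qubits(D: int, B: int) -> int:
    """Evaluate q_peak = max(D, ceil(log2(B)) + 1) from the abstract.

    D and B are the symbols used in the abstract; their exact meaning
    is not spelled out there, so the values below are assumptions.
    """
    if D < 1 or B < 1:
        raise ValueError("D and B must be positive integers")
    # For a positive integer B, (B - 1).bit_length() equals ceil(log2(B))
    # and avoids floating-point rounding.
    ceil_log2_B = (B - 1).bit_length()
    return max(D, ceil_log2_B + 1)


# Hypothetical settings: the number of samples never enters the formula.
for D, B in [(2, 256), (4, 200), (16, 256)]:
    print(f"D={D}, B={B} -> q_peak={peak_qubits(D, B)}")
```

Because the sample count is not an argument, the result is the same for a thousand points or several hundred thousand; only D and B change it.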

Problem

Research questions and friction points this paper is trying to address.

Developing quantum clustering for NISQ devices with limited qubits

Addressing data loading constraints via compressive Fourier-feature sketches

Solving centroid selection with shallow QAOA circuits on small QUBOs
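
To illustrate why a Fourier-feature sketch sidesteps the data-loading constraint, here is a minimal classical sketch in NumPy. It computes the empirical characteristic function at a fixed set of random frequencies, which is the standard construction in classical compressive learning; it is not the paper's QFF estimator, and the frequency matrix `W`, the sketch size `m`, and the synthetic data are all assumptions for illustration.

```python
import numpy as np


def fourier_sketch(X: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Average exp(i * w_j^T x_n) over all samples, one entry per frequency.

    X has shape (N, D); W has shape (D, m). The result has shape (m,),
    independent of the number of samples N.
    """
    return np.exp(1j * (X @ W)).mean(axis=0)


rng = np.random.default_rng(0)
D, m = 2, 32                      # assumed data dimension and sketch size
W = rng.normal(size=(D, m))       # assumed Gaussian random frequencies

X_small = rng.normal(size=(1_000, D))
X_large = rng.normal(size=(50_000, D))

z_small = fourier_sketch(X_small, W)
z_large = fourier_sketch(X_large, W)

# Both sketches have the same length even though N differs by 50x.
print(z_small.shape, z_large.shape)
```

Downstream steps then work on the fixed-length vector rather than on the raw points, which is what keeps the quantum part of the pipeline bounded in width.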

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses constant-size Fourier-feature sketch for dataset summarization

Solves small QUBOs with shallow QAOA circuits for centroids

Employs refinement step with elitist retention for cost control
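
The interplay between a small QUBO and elitist retention can be sketched classically. In the example below, exhaustive search stands in for the shallow QAOA solver, and a random bit flip stands in for whatever proposal mechanism the paper uses; the function names, the random QUBO matrix, and the proposal rule are all hypothetical. The only property it aims to show is that keeping the best candidate seen so far makes the cost sequence non-increasing.

```python
import itertools

import numpy as np


def solve_qubo_brute_force(Q: np.ndarray):
    """Minimize x^T Q x over binary x by exhaustive search (small n only)."""
    n = Q.shape[0]
    best_x, best_val = None, float("inf")
    for bits in itertools.product((0, 1), repeat=n):
        x = np.array(bits)
        val = float(x @ Q @ x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val


def refine_with_elitism(cost, initial, propose, n_rounds):
    """Keep a candidate only if it does not increase the cost."""
    best, best_cost = initial, cost(initial)
    history = [best_cost]
    for _ in range(n_rounds):
        candidate = propose(best)
        candidate_cost = cost(candidate)
        if candidate_cost <= best_cost:   # elitist retention
            best, best_cost = candidate, candidate_cost
        history.append(best_cost)
    return best, history


rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 4))
Q = (Q + Q.T) / 2                 # assumed symmetric 4-variable QUBO


def qubo_cost(x):
    return float(x @ Q @ x)


def flip_random_bit(x):
    y = x.copy()
    y[rng.integers(len(y))] ^= 1  # toggle one randomly chosen bit
    return y


x_opt, v_opt = solve_qubo_brute_force(Q)
_, history = refine_with_elitism(qubo_cost, np.ones(4, dtype=int), flip_random_bit, 20)
print("exhaustive optimum:", v_opt)
print("elitist cost trace:", history)
```

Exhaustive search is only viable here because the QUBO has four variables; it is used purely so the example runs without a quantum simulator.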

🔎 Similar Papers

A Quantum Approximation Scheme for k-Means