A General Anchor-Based Framework for Scalable Fair Clustering

📅 2025-11-13
🤖 AI Summary
Problem: Existing fair clustering algorithms typically have quadratic or higher time complexity, which makes them infeasible for large-scale datasets. Method: We propose a generic, plug-and-play anchor-based fair clustering framework that selects anchors with an efficient fair sampling strategy, applies any baseline fair clustering algorithm to the anchors, and propagates the anchor labels to the full dataset through an anchor graph. The propagation is formulated as an optimization problem with a joint group-label constraint that enforces global fairness, and it is solved with an ADMM-based algorithm, so the overall cost is linear in the dataset size for arbitrary baseline fair clustering algorithms. Contribution/Results: Evaluated on multiple large-scale benchmarks, our method achieves 10×–1000× speedup over state-of-the-art approaches while maintaining competitive clustering quality and provable fairness guarantees.

📝 Abstract
Fair clustering is crucial for mitigating bias in unsupervised learning, yet existing algorithms often suffer from quadratic or super-quadratic computational complexity, rendering them impractical for large-scale datasets. To bridge this gap, we introduce the Anchor-based Fair Clustering Framework (AFCF), a novel, general, and plug-and-play framework that empowers arbitrary fair clustering algorithms with linear-time scalability. Our approach first selects a small but representative set of anchors using a novel fair sampling strategy. Then, any off-the-shelf fair clustering algorithm can be applied to this small anchor set. The core of our framework lies in a novel anchor graph construction module, where we formulate an optimization problem to propagate labels while preserving fairness. This is achieved through a carefully designed group-label joint constraint, which we prove ensures that the fairness of the final clustering on the entire dataset matches that of the anchor clustering. We solve this optimization problem efficiently using an ADMM-based algorithm. Extensive experiments on multiple large-scale benchmarks demonstrate that AFCF drastically accelerates state-of-the-art methods, reducing computation time by orders of magnitude while maintaining strong clustering performance and fairness guarantees.
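
The three stages described in the abstract (sample anchors, cluster the anchors, propagate labels) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the "fair sampling" is plain stratified sampling by protected group, and the propagation step is a nearest-same-group-anchor rule swapped in for the paper's ADMM-optimized anchor graph. The simplified propagation does not carry the fairness guarantee the abstract claims for the real method; it only shows why the cost scales with n × m (points × anchors) rather than n².

```python
import numpy as np

def fair_sample_anchors(groups, m, rng):
    """Stratified sampling (an assumption, not the paper's strategy):
    pick about m anchors so each protected group keeps its share of the data."""
    n = len(groups)
    picked = []
    for g in np.unique(groups):
        members = np.flatnonzero(groups == g)
        k = max(1, round(m * len(members) / n))  # at least one anchor per group
        picked.append(rng.choice(members, size=min(k, len(members)), replace=False))
    return np.concatenate(picked)

def propagate_labels(X, groups, anchors, anchor_labels):
    """Simplified stand-in for the anchor graph + ADMM step: each point takes
    the label of its nearest anchor from the same protected group.
    Work is O(n * m) distance evaluations."""
    labels = np.empty(len(X), dtype=int)
    for g in np.unique(groups):
        pts = np.flatnonzero(groups == g)           # data indices in group g
        anc = np.flatnonzero(groups[anchors] == g)  # positions within `anchors`
        diff = X[pts, None, :] - X[anchors[anc]][None, :, :]
        dist = (diff ** 2).sum(axis=-1)             # shape (len(pts), len(anc))
        labels[pts] = anchor_labels[anc[dist.argmin(axis=1)]]
    return labels

def anchor_pipeline(X, groups, m, base_clusterer, seed=0):
    """Sample anchors, cluster only the anchors, then propagate to all points."""
    rng = np.random.default_rng(seed)
    anchors = fair_sample_anchors(groups, m, rng)
    anchor_labels = np.asarray(base_clusterer(X[anchors], groups[anchors]))
    return anchors, propagate_labels(X, groups, anchors, anchor_labels)

# Toy data: 200 points, two protected groups with a 150/50 split.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 2))
groups = np.array([0] * 150 + [1] * 50)

# Hypothetical placeholder for an off-the-shelf fair clustering algorithm.
toy_clusterer = lambda Xa, ga: (Xa[:, 0] > 0).astype(int)

anchors, labels = anchor_pipeline(X, groups, m=20, base_clusterer=toy_clusterer)
```

The base algorithm only ever sees the 20 anchors, so its quadratic (or worse) cost is paid on m points instead of n. Any callable that maps anchor features and group memberships to integer labels can be passed as `base_clusterer`.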

Problem

Research questions and friction points this paper is trying to address.

Mitigating bias in unsupervised learning through fair clustering
Reducing computational complexity of fair clustering algorithms
Enabling linear-time scalability for large-scale datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Anchor-based framework enables linear-time fair clustering
Fair sampling selects representative anchors for scalability
ADMM optimization propagates labels with fairness guarantees
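
To make "fairness guarantees" concrete, the sketch below computes one commonly used fairness measure for clusterings, the balance score: for each cluster, the ratio of its smallest to its largest protected-group count, minimized over clusters. This page does not state which fairness metric the paper uses, so the choice of balance and the function names are assumptions for illustration only.

```python
import numpy as np

def group_counts(labels, groups):
    """Count matrix with one row per cluster and one column per protected group."""
    labels, groups = np.asarray(labels), np.asarray(groups)
    clusters, group_ids = np.unique(labels), np.unique(groups)
    return np.array([[np.sum((labels == c) & (groups == g)) for g in group_ids]
                     for c in clusters])

def balance(labels, groups):
    """Balance score: min over clusters of (smallest / largest group count).
    1.0 means every cluster holds all groups equally; 0.0 means some cluster
    is missing a group entirely."""
    counts = group_counts(labels, groups)
    return float((counts.min(axis=1) / counts.max(axis=1)).min())

# Cluster 0 holds two points from each group; cluster 1 holds one and three.
example_labels = [0, 0, 0, 0, 1, 1, 1, 1]
example_groups = [0, 0, 1, 1, 0, 1, 1, 1]
score = balance(example_labels, example_groups)  # limited by cluster 1: 1/3
```

Under this measure, the property claimed for AFCF would amount to the score on the full dataset matching the score of the anchor clustering, which could be checked by calling `balance` on both labelings.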
Shengfei Wei
College of Computer Science and Technology, National University of Defense Technology, Changsha, 410073, China
Suyuan Liu
National University of Defense Technology
Multi-view Clustering, Anchor Learning, Graph Learning
Jun Wang
College of Computer Science and Technology, National University of Defense Technology, Changsha, 410073, China
Ke Liang
NUDT
Graph Learning, Knowledge Representation and Reasoning, Multi-view Clustering
Miaomiao Li
School of Computer, Changsha College, Changsha, 410022, China
Lei Luo
Kansas State University
Computer Vision, GANs, Image Restoration