🤖 AI Summary
To address the high communication overhead and costly human annotation of cross-institutional Federated Active Learning (FAL), this paper proposes FAST, a two-pass coupled optimization framework. In the first pass, vision foundation models (ViT/SAM) generate weak labels to pre-screen candidate samples; in the second pass, only the most uncertain samples, selected via entropy or margin confidence, are sent for precise human labeling, followed by lightweight local model refinement and FedAvg-based federated training. This is the first FAL paradigm to integrate weak supervision into the active sampling pipeline, jointly optimizing sample selection and model training. Evaluated on multimodal medical and natural image datasets, the method achieves a 4.36% average accuracy gain using only a 5% annotation budget, while reducing total communication rounds eightfold compared with baseline approaches.
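The refinement pass above routes only the most uncertain weak-labeled samples to human annotators. A minimal sketch of entropy-based selection, assuming the foundation model's soft predictions are available as a probability matrix (the function name and budget parameter are illustrative, not from the paper):

```python
import numpy as np

def select_uncertain(probs: np.ndarray, budget: float = 0.05) -> np.ndarray:
    """Pick the top-`budget` fraction of samples by predictive entropy.

    probs: (n_samples, n_classes) soft labels from a foundation model.
    Returns indices of samples to route to human annotators.
    """
    eps = 1e-12  # avoid log(0)
    entropy = -(probs * np.log(probs + eps)).sum(axis=1)
    k = max(1, int(round(budget * len(probs))))
    # Highest-entropy (most uncertain) samples first
    return np.argsort(entropy)[::-1][:k]

# Toy example: 4 samples, 2 classes; a 25% budget selects 1 sample
p = np.array([[0.99, 0.01],
              [0.50, 0.50],
              [0.90, 0.10],
              [0.60, 0.40]])
chosen = select_uncertain(p, budget=0.25)
print(chosen)  # -> [1], the maximally uncertain sample
```

Margin confidence (the gap between the top two class probabilities) could be substituted for entropy with only the scoring line changed.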
📝 Abstract
Federated Active Learning (FAL) has emerged as a promising framework to leverage large quantities of unlabeled data across distributed clients while preserving data privacy. However, real-world deployments remain limited by high annotation costs and communication-intensive sampling processes, particularly in cross-silo settings where clients possess substantial local datasets. This paper addresses the crucial question: What is the best practice to reduce communication costs in human-in-the-loop learning with minimal annotator effort? Existing FAL methods typically rely on iterative annotation processes that separate active sampling from federated updates, leading to multiple rounds of expensive communication and annotation. In response, we introduce FAST, a two-pass FAL framework that harnesses foundation models for weak labeling in a preliminary pass, followed by a refinement pass focused exclusively on the most uncertain samples. By leveraging representation knowledge from foundation models and integrating refinement steps into a streamlined workflow, FAST substantially reduces the overhead incurred by iterative active sampling. Extensive experiments on diverse medical and natural image benchmarks demonstrate that FAST outperforms existing FAL methods by an average of 4.36% while reducing communication rounds eightfold under a limited 5% labeling budget.
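After the refinement pass, client models are aggregated with FedAvg, as the summary notes. A minimal sketch of the size-weighted parameter averaging at the heart of FedAvg (the function signature and list-of-arrays representation are illustrative assumptions, not the paper's actual code):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Size-weighted average of client model parameters (FedAvg).

    client_weights: one list of np.ndarray layers per client.
    client_sizes:   number of local training samples per client.
    """
    total = sum(client_sizes)
    n_layers = len(client_weights[0])
    return [
        sum(w[layer] * (n / total) for w, n in zip(client_weights, client_sizes))
        for layer in range(n_layers)
    ]

# Toy example: two clients, one scalar "layer"; the larger client dominates
w_a = [np.array([1.0])]
w_b = [np.array([3.0])]
avg = fedavg([w_a, w_b], client_sizes=[1, 3])
print(avg[0])  # -> [2.5]  (1 * 0.25 + 3 * 0.75)
```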