🤖 AI Summary
To address the high communication overhead and costly human annotation of cross-institutional Federated Active Learning (FAL), this paper proposes FAST, a two-pass coupled optimization framework. In the first pass, vision foundation models (ViT/SAM) generate weak labels to pre-screen candidate samples; in the second pass, only the most uncertain samples, selected via entropy or margin confidence, are sent for precise human labeling, followed by lightweight local model refinement and FedAvg-based federated training. This is the first FAL paradigm to integrate weak supervision into the active sampling pipeline, jointly optimizing sample selection and model training. Evaluated on multimodal medical and natural image datasets, the method achieves a 4.36% average accuracy gain using only a 5% annotation budget, while reducing total communication rounds eightfold compared with baseline approaches.
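The refinement pass above routes only the most uncertain weak-labeled samples to human annotators. A minimal sketch of entropy-based selection, assuming the foundation model's soft predictions are available as a probability matrix (the function name and budget parameter are illustrative, not from the paper):

```python
import numpy as np

def select_uncertain(probs: np.ndarray, budget: float = 0.05) -> np.ndarray:
    """Pick the top-`budget` fraction of samples by predictive entropy.

    probs: (n_samples, n_classes) soft labels from a foundation model.
    Returns indices of samples to route to human annotators.
    """
    eps = 1e-12  # avoid log(0)
    entropy = -(probs * np.log(probs + eps)).sum(axis=1)
    k = max(1, int(round(budget * len(probs))))
    # Highest-entropy (most uncertain) samples first
    return np.argsort(entropy)[::-1][:k]

# Toy example: 4 samples, 2 classes; a 25% budget selects 1 sample
p = np.array([[0.99, 0.01],
              [0.50, 0.50],
              [0.90, 0.10],
              [0.60, 0.40]])
chosen = select_uncertain(p, budget=0.25)
print(chosen)  # -> [1], the maximally uncertain sample
```

Margin confidence (the gap between the top two class probabilities) could be substituted for entropy with only the scoring line changed.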
📝 Abstract
Federated Active Learning (FAL) has emerged as a promising framework to leverage large quantities of unlabeled data across distributed clients while preserving data privacy. However, real-world deployments remain limited by high annotation costs and communication-intensive sampling processes, particularly in cross-silo settings where clients possess substantial local datasets. This paper addresses the crucial question: What is the best practice to reduce communication costs in human-in-the-loop learning with minimal annotator effort? Existing FAL methods typically rely on iterative annotation processes that separate active sampling from federated updates, leading to multiple rounds of expensive communication and annotation. In response, we introduce FAST, a two-pass FAL framework that harnesses foundation models for weak labeling in a preliminary pass, followed by a refinement pass focused exclusively on the most uncertain samples. By leveraging representation knowledge from foundation models and integrating refinement steps into a streamlined workflow, FAST substantially reduces the overhead incurred by iterative active sampling. Extensive experiments on diverse medical and natural image benchmarks demonstrate that FAST outperforms existing FAL methods by an average of 4.36% while reducing communication rounds eightfold under a limited 5% labeling budget.
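After the refinement pass, client models are aggregated with FedAvg, as the summary notes. A minimal sketch of the size-weighted parameter averaging at the heart of FedAvg (the function signature and list-of-arrays representation are illustrative assumptions, not the paper's actual code):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Size-weighted average of client model parameters (FedAvg).

    client_weights: one list of np.ndarray layers per client.
    client_sizes:   number of local training samples per client.
    """
    total = sum(client_sizes)
    n_layers = len(client_weights[0])
    return [
        sum(w[layer] * (n / total) for w, n in zip(client_weights, client_sizes))
        for layer in range(n_layers)
    ]

# Toy example: two clients, one scalar "layer"; the larger client dominates
w_a = [np.array([1.0])]
w_b = [np.array([3.0])]
avg = fedavg([w_a, w_b], client_sizes=[1, 3])
print(avg[0])  # -> [2.5]  (1 * 0.25 + 3 * 0.75)
```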