AI Summary
Data silos (arising from privacy regulations, legal constraints, and intellectual property rights) undermine statistical power and exacerbate accessibility bias in machine learning. To address this, we propose a lightweight, browser-based distributed collaborative learning framework enabling non-technical users to jointly train models without sharing raw data. Our modular architecture unifies federated and decentralized paradigms, integrating client-side browser training, a frontend ML inference engine, end-to-end encrypted communication, and multi-tiered privacy safeguards. It supports customizable weight aggregation strategies to enhance model personalization and robustness against bias. The open-source platform is cross-device compatible (including smartphones) and requires only a web browser for participation. Empirical evaluation demonstrates significant improvements in usability, fairness, and scalability of collaborative modeling while preserving data confidentiality and regulatory compliance.
Abstract
Data is often impractical to share for a range of well-considered reasons, such as concerns over privacy, intellectual property, and legal constraints. This not only fragments the statistical power of predictive models, but also creates an accessibility bias, where accuracy is inequitably concentrated among those who have the resources to overcome these concerns. We present DISCO: an open-source DIStributed COllaborative learning platform accessible to non-technical users, offering a means to collaboratively build machine learning models without sharing any original data or requiring any programming knowledge. DISCO's web application trains models locally, directly in the browser, making our tool cross-platform out of the box, including on smartphones. The modular design of DISCO offers a choice between federated and decentralized paradigms, various levels of privacy guarantees, and several weight aggregation strategies that allow for model personalization and bias resilience in collaborative training. The code repository is available at https://github.com/epfml/disco, and a showcase web interface at https://discolab.ai.
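To make the weight aggregation step concrete, the following is a minimal sketch of one common strategy, a sample-weighted average of client weights in the style of federated averaging. It is an illustration only and not DISCO's implementation or API: the function name `weighted_average`, the `Weights` type, and the toy layer name `dense` are all hypothetical, and DISCO itself runs in the browser rather than in Python.

```python
from typing import Dict, List

# Hypothetical representation: each client's model is a mapping from
# layer name to a flat list of parameters. Only these weights are shared,
# never the raw training data.
Weights = Dict[str, List[float]]


def weighted_average(updates: List[Weights], num_samples: List[int]) -> Weights:
    """Return the sample-weighted mean of per-client weights (FedAvg-style)."""
    if not updates or len(updates) != len(num_samples):
        raise ValueError("need exactly one sample count per client update")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    aggregated: Weights = {}
    for name in updates[0]:
        length = len(updates[0][name])
        # Each client's contribution is scaled by its share of the samples.
        aggregated[name] = [
            sum(u[name][i] * n for u, n in zip(updates, num_samples)) / total
            for i in range(length)
        ]
    return aggregated


# Two toy clients holding 1 and 3 local samples respectively.
clients = [
    {"dense": [1.0, 2.0]},
    {"dense": [3.0, 6.0]},
]
result = weighted_average(clients, num_samples=[1, 3])
print(result)  # {'dense': [2.5, 5.0]}
```

In a federated setup, a central server would typically run such a function over the updates it receives. In a decentralized setup, each peer could apply it to the updates received from its neighbours. Swapping in a different aggregation rule (for example, one that down-weights outlying updates) is the kind of customization the abstract refers to.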