🤖 AI Summary
AI diagnostic models for skin diseases often suffer from population-level bias, leading to a trade-off between fairness and accuracy. Conventional debiasing approaches—by removing correlations with sensitive attributes—risk discarding clinically relevant features, thereby degrading performance. To address this, we propose FairMoE, a novel framework that treats sensitive attributes as modeling guidance rather than nuisance variables. FairMoE employs a hierarchical Mixture-of-Experts (MoE) architecture with dynamic routing, enabling group-specific learning and adaptive assignment of boundary samples. Crucially, it preserves diagnostically discriminative features while explicitly modeling population heterogeneity. Experiments demonstrate that FairMoE achieves fairness metrics—including equal opportunity difference and predictive equality—comparable to or better than state-of-the-art baselines, while significantly improving overall diagnostic accuracy. Thus, FairMoE effectively breaks the conventional fairness–performance trade-off.
📝 Abstract
AI-based systems have achieved high accuracy in skin disease diagnostics but often exhibit biases across demographic groups, leading to inequitable healthcare outcomes and diminished patient trust. Most existing bias mitigation methods attempt to eliminate the correlation between sensitive attributes and diagnostic predictions, but they often degrade performance due to the loss of clinically relevant diagnostic cues. In this work, we propose an alternative approach that incorporates sensitive attributes to achieve fairness. We introduce FairMoE, a framework that employs layer-wise mixture-of-experts modules as group-specific learners. Unlike traditional methods that rigidly assign data based on group labels, FairMoE dynamically routes data to the most suitable expert, making it particularly effective for cases near group boundaries. Experimental results show that, unlike previous fairness approaches that reduce performance, FairMoE achieves substantial accuracy improvements while preserving comparable fairness metrics.
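The abstract does not include an implementation, but the core idea of soft, dynamic routing can be sketched as a gated mixture-of-experts layer: a learned gate assigns each sample a weight over experts, so boundary samples blend several experts instead of being hard-assigned by a group label. All names, shapes, and the NumPy formulation below are illustrative assumptions, not FairMoE's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # numerically stable softmax over the last axis
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class MoELayer:
    """Minimal mixture-of-experts layer (illustrative sketch).

    Each expert is a linear map; a learned gate produces per-sample
    soft routing weights, rather than a hard assignment by group label.
    """
    def __init__(self, d_in, d_out, n_experts):
        self.experts = [rng.normal(0, 0.1, (d_in, d_out))
                        for _ in range(n_experts)]
        self.gate = rng.normal(0, 0.1, (d_in, n_experts))

    def __call__(self, x):
        # gate scores -> soft routing weights, one distribution per sample
        weights = softmax(x @ self.gate)                 # (batch, n_experts)
        outs = np.stack([x @ E for E in self.experts])   # (n_experts, batch, d_out)
        # weighted combination: boundary samples draw on multiple experts
        y = np.einsum("be,ebd->bd", weights, outs)       # (batch, d_out)
        return y, weights

layer = MoELayer(d_in=8, d_out=4, n_experts=3)
x = rng.normal(size=(5, 8))
y, weights = layer(x)
```

In a trained model the gate and experts would be optimized jointly; here the random weights only demonstrate the routing mechanics, where each row of `weights` sums to 1 and interpolates among expert outputs.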