FissionVAE: Federated Non-IID Image Generation with Latent Space and Decoder Decomposition

📅 2024-08-30

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

career value

217K/year

🤖 AI Summary

To address the challenge of federated generative modeling over non-IID multi-source image data, this paper proposes a decoupled federated VAE framework featuring group-customized decoders. The method decomposes the latent space into shared semantic representations and client-specific texture features, enabling effective knowledge transfer across heterogeneous clients. It introduces a group-aware decoder architecture with modular, client-adaptive branches—compatible with hierarchical VAEs—and incorporates configurable priors to enhance decoupling robustness. Evaluated on MNIST+FashionMNIST and a multi-source RGB composite dataset (cartoons, faces, animals, ships, remote sensing), the approach achieves over 35% FID reduction compared to state-of-the-art federated VAE baselines. This demonstrates substantial improvements in generated sample quality and cross-domain generalization capability under realistic non-IID data distributions.

Technology Category

Application Category

📝 Abstract

Federated learning is a machine learning paradigm that enables decentralized clients to collaboratively learn a shared model while keeping all the training data local. While considerable research has focused on federated image generation, particularly Generative Adversarial Networks, Variational Autoencoders have received less attention. In this paper, we address the challenges of non-IID (independently and identically distributed) data environments featuring multiple groups of images of different types. Non-IID data distributions can lead to difficulties in maintaining a consistent latent space and can also result in local generators with disparate texture features being blended during aggregation. We thereby introduce FissionVAE that decouples the latent space and constructs decoder branches tailored to individual client groups. This method allows for customized learning that aligns with the unique data distributions of each group. Additionally, we incorporate hierarchical VAEs and demonstrate the use of heterogeneous decoder architectures within FissionVAE. We also explore strategies for setting the latent prior distributions to enhance the decoupling process. To evaluate our approach, we assemble two composite datasets: the first combines MNIST and FashionMNIST; the second comprises RGB datasets of cartoon and human faces, wild animals, marine vessels, and remote sensing images. Our experiments demonstrate that FissionVAE greatly improves generation quality on these datasets compared to baseline federated VAE models.

Problem

Research questions and friction points this paper is trying to address.

Addressing non-IID data challenges in federated image generation

Maintaining consistent latent space across diverse client data groups

Preventing texture feature blending during model aggregation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Decouples latent space for non-IID data

Uses tailored decoder branches per client group

Incorporates hierarchical VAEs for enhanced decoupling

🔎 Similar Papers

No similar papers found.