Bringing together invertible UNets with invertible attention modules for memory-efficient diffusion models

📅 2025-04-15

📈 Citations: 0

✨ Influential: 0

career value

213K/year

🤖 AI Summary

Training diffusion models on high-dimensional medical imaging data (e.g., 3D MRI/CT) incurs prohibitive GPU memory consumption and energy cost under single-GPU settings. To address this, we propose a memory-efficient architecture that integrates reversible U-Net with reversible attention mechanisms, enabling fully invertible feature transformations and attention computations. Coupled with memory-optimized gradient computation strategies, our design decouples peak memory usage from data dimensionality. Experiments on BraTS2020 demonstrate a 15% reduction in peak GPU memory, substantial energy savings during training, and state-of-the-art image reconstruction quality. This work constitutes the first systematic incorporation of reversible design principles into diffusion-based 3D medical image generation frameworks, establishing a novel paradigm for efficient generative modeling under resource-constrained conditions.

Technology Category

Application Category

📝 Abstract

Diffusion models have recently gained state of the art performance on many image generation tasks. However, most models require significant computational resources to achieve this. This becomes apparent in the application of medical image synthesis due to the 3D nature of medical datasets like CT-scans, MRIs, electron microscope, etc. In this paper we propose a novel architecture for a single GPU memory-efficient training for diffusion models for high dimensional medical datasets. The proposed model is built by using an invertible UNet architecture with invertible attention modules. This leads to the following two contributions: 1. denoising diffusion models and thus enabling memory usage to be independent of the dimensionality of the dataset, and 2. reducing the energy usage during training. While this new model can be applied to a multitude of image generation tasks, we showcase its memory-efficiency on the 3D BraTS2020 dataset leading to up to 15% decrease in peak memory consumption during training with comparable results to SOTA while maintaining the image quality.

Problem

Research questions and friction points this paper is trying to address.

Memory-efficient diffusion models for high-dimensional medical datasets

Reducing computational resources in 3D medical image synthesis

Decreasing peak memory usage while maintaining image quality

Innovation

Methods, ideas, or system contributions that make the work stand out.

Invertible UNet with invertible attention modules

Memory-efficient diffusion model training

Reduced energy usage during training

🔎 Similar Papers

DiffCut: Catalyzing Zero-Shot Semantic Segmentation with Diffusion Features and Recursive Normalized Cut