Ctrl-GenAug: Controllable Generative Augmentation for Medical Sequence Classification

📅 2024-09-25
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Classification of medical video and 3D volumetric sequences is held back by data scarcity and high annotation costs, while existing diffusion-based generation methods offer weak semantic and temporal controllability and little quality control over noisy synthesized samples. To address these challenges, this paper proposes Ctrl-GenAug, a controllable generative augmentation framework. Its core contributions are: (1) a multimodal conditional guidance mechanism for controllable sequence generation, enabling customization along both semantic and temporal dimensions; (2) a spatio-temporal consistency enhancement module that preserves structural coherence across video frames and volume slices; and (3) a dual-level (semantic and sequential) noise filtering mechanism with fine- and coarse-grained quality assessment to remove spurious samples. Built on a diffusion model architecture, the framework reports significant performance gains across three medical datasets, eleven classifiers, and three training paradigms, particularly in high-risk patient identification and out-of-distribution generalization.

📝 Abstract
In the medical field, the limited availability of large-scale datasets and labor-intensive annotation processes hinder the performance of deep models. Diffusion-based generative augmentation approaches present a promising solution to this issue, having been proven effective in advancing downstream medical recognition tasks. Nevertheless, existing works lack sufficient semantic and sequential steerability for challenging video/3D sequence generation, and neglect quality control of noisy synthesized samples, resulting in unreliable synthetic databases and severely limiting the performance of downstream tasks. In this work, we present Ctrl-GenAug, a novel and general generative augmentation framework that enables highly semantic- and sequential-customized sequence synthesis and suppresses incorrectly synthesized samples, to aid medical sequence classification. Specifically, we first design a multimodal conditions-guided sequence generator for controllably synthesizing diagnosis-promotive samples. A sequential augmentation module is integrated to enhance the temporal/stereoscopic coherence of generated samples. Then, we propose a noisy synthetic data filter to suppress unreliable cases at semantic and sequential levels. Extensive experiments on 3 medical datasets, using 11 networks trained on 3 paradigms, comprehensively analyze the effectiveness and generality of Ctrl-GenAug, particularly in underrepresented high-risk populations and out-domain conditions.
Problem

Research questions and friction points this paper is trying to address.

Limited medical datasets hinder deep model performance
Existing methods lack control in sequence generation
Noisy synthetic data reduces downstream task reliability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal conditions-guided sequence generator
Sequential augmentation module for coherence
Noisy synthetic data filter for reliability
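
The page does not say how the noisy synthetic data filter is implemented. As a rough illustration of the general idea only, the sketch below keeps a synthetic sequence when it passes one check at the semantic level and one at the sequential level. Everything in it is an assumption made for illustration: the function names, the use of a classifier probability as the semantic score, adjacent-frame cosine similarity as the sequential score, and both thresholds. It is not the authors' method.

```python
import numpy as np


def semantic_score(class_probs: np.ndarray, target_label: int) -> float:
    """Hypothetical semantic check: probability that some pretrained
    classifier assigns to the label the sample was generated for."""
    return float(class_probs[target_label])


def sequential_score(frames: np.ndarray) -> float:
    """Hypothetical sequential check: mean cosine similarity between
    adjacent frames (or slices) of a sequence shaped (T, H, W)."""
    flat = frames.reshape(frames.shape[0], -1).astype(np.float64)
    prev, nxt = flat[:-1], flat[1:]
    dots = (prev * nxt).sum(axis=1)
    norms = np.linalg.norm(prev, axis=1) * np.linalg.norm(nxt, axis=1) + 1e-12
    return float((dots / norms).mean())


def keep_sample(
    frames: np.ndarray,
    class_probs: np.ndarray,
    target_label: int,
    sem_thresh: float = 0.5,   # assumed threshold, not from the paper
    seq_thresh: float = 0.9,   # assumed threshold, not from the paper
) -> bool:
    """Keep a synthetic sequence only if it passes both checks."""
    return (
        semantic_score(class_probs, target_label) >= sem_thresh
        and sequential_score(frames) >= seq_thresh
    )


# Toy data: a constant sequence is perfectly coherent, while a sequence
# whose frames flip sign every step is maximally incoherent.
smooth = np.ones((4, 8, 8))
flicker = np.stack([((-1.0) ** t) * np.ones((8, 8)) for t in range(4)])
probs = np.array([0.1, 0.9])

kept_smooth = keep_sample(smooth, probs, target_label=1)
kept_wrong_label = keep_sample(smooth, probs, target_label=0)
kept_flicker = keep_sample(flicker, probs, target_label=1)
```

In this toy setup only the coherent sequence with a matching label is kept. A real filter would replace both scores with learned or domain-specific measures.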
Xinrui Zhou
The Hong Kong University of Science and Technology
Medical image computing
Yuhao Huang
Shenzhen University
Medical Image Computing, Ultrasound, Model Robustness
Haoran Dou
Research Associate, The University of Manchester
Medical Image Analysis, In-silico Trials
Shijing Chen
National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China, the Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China, and also with the Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
Ao Chang
University of Science and Technology of China
Jia Liu
The Third Affiliated Hospital of Sun Yat-sen University
Weiran Long
The Third Affiliated Hospital of Sun Yat-sen University
Jian Zheng
Erjiao Xu
Jie Ren
The Third Affiliated Hospital of Sun Yat-sen University
Ruobing Huang
Post-doctoral Research Assistant, University of Oxford
Machine learning, Medical Image Analysis, Ultrasound
Jun Cheng
National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China, the Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China, and also with the Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
Wufeng Xue
Shenzhen University; Xian Jiaotong University; University of Western Ontario
medical image analysis, computer vision, image processing, image quality assessment
Dong Ni
National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China, the Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China, and also with the Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China