SDM-Q: Cost-Aware Staged Decision-Making for Multi-Omics Classification with Deep Q-Learning

📅 2026-05-29

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

This work addresses the high cost of multi-omics data acquisition and the limited clinical practicality of existing deep learning methods, which require all modalities at inference time. The authors formulate multi-omics classification as a finite-horizon sequential decision problem and propose a cost-aware deep Q-learning framework that decides, at each stage, whether to acquire an additional modality or to stop and output a prediction, thereby balancing diagnostic accuracy against acquisition cost. A joint reward, defined only at termination and combining classification correctness with cumulative acquisition cost, and a backward stage-wise optimization strategy are introduced to improve policy consistency and training stability. On four public datasets, the model substantially reduces modality usage: more than 99% of BRCA samples and more than 95% of KIPAN samples are accurately classified from a single modality, and the average number of acquired modalities stays below two for ROSMAP and LGG, while classification performance remains competitive with methods that use complete multi-omics inputs.
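The acquire-or-stop loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the reward form (correctness minus a linear per-modality cost), the names `terminal_reward`, `run_episode`, and `cost_per_modality`, and the assumption that the first modality is always acquired are all hypothetical choices made for illustration.

```python
from typing import Callable, Tuple


def terminal_reward(correct: bool, n_acquired: int,
                    cost_per_modality: float = 0.1) -> float:
    """Hypothetical terminal reward: 1 for a correct prediction, 0 otherwise,
    minus a linear cost for every modality acquired so far (assumed form)."""
    return (1.0 if correct else 0.0) - cost_per_modality * n_acquired


def run_episode(q_values: Callable[[int], Tuple[float, float]],
                n_modalities: int) -> int:
    """Greedy acquire-or-stop rollout for one sample.

    q_values(stage) is a stand-in for a trained Q-network; it returns
    (q_stop, q_acquire) given how many modalities are already available.
    Returns the number of modalities acquired before stopping.
    """
    acquired = 1  # assumption: the first modality is always acquired
    while acquired < n_modalities:
        q_stop, q_acquire = q_values(acquired)
        if q_stop >= q_acquire:
            break  # stopping is estimated to be at least as good
        acquired += 1
    return acquired


# A stand-in Q-function that always prefers stopping uses one modality.
always_stop = lambda stage: (0.9, 0.5)
n_used = run_episode(always_stop, n_modalities=3)
reward = terminal_reward(correct=True, n_acquired=n_used)
```

Because the reward is only computed once the episode ends, intermediate acquisitions carry no immediate signal; their cost shows up solely through the `n_acquired` term of the final reward.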

📝 Abstract

Multi-omics data provide complementary molecular characterizations of disease phenotypes and play an important role in disease diagnosis and subtype classification in precision medicine. However, acquiring complete multi-omics profiles is expensive and time-consuming, while most existing deep learning methods assume full modality availability during inference, resulting in substantial redundancy and limited practicality in clinical settings. To address this issue, we propose SDM-Q, a reinforcement learning framework for adaptive and cost-aware multi-omics classification. Specifically, multi-omics diagnosis is reformulated as a finite-horizon sequential decision problem, where the currently acquired omics modalities define the diagnostic state at each stage. An action-value function determines whether to acquire an additional modality or terminate the decision process and output the final prediction. To balance diagnostic utility and acquisition cost, the reward is defined only at the terminal stage and jointly determined by classification correctness and cumulative modality acquisition cost. A backward stage-wise optimization strategy is introduced to improve policy consistency and training stability. Experiments on four public multi-omics datasets, including ROSMAP, LGG, BRCA, and KIPAN, demonstrate that SDM-Q effectively reduces redundant modality acquisition while maintaining competitive classification performance compared with methods using complete multi-omics inputs. In the BRCA and KIPAN datasets, more than 99% and 95% of subjects, respectively, achieve accurate classification using only a single omics modality, while the average number of acquired modalities remains below two for ROSMAP and LGG. These results suggest that cost-aware sequential decision-making provides an effective paradigm for improving the efficiency of precision medicine workflows.
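The abstract does not spell out the backward stage-wise optimization, so the following is only a toy illustration of the general idea using plain finite-horizon backward induction, not the paper's procedure. It assumes a state-independent setting in which `p_correct[k]` is a hypothetical probability of a correct prediction when stopping with `k + 1` modalities, and `cost` is a hypothetical per-modality cost; the function name `backward_values` is likewise invented for this sketch.

```python
from typing import List, Tuple


def backward_values(p_correct: List[float],
                    cost: float) -> Tuple[List[float], List[str]]:
    """Backward induction over stages for an acquire-or-stop problem.

    Stage k (0-based) means k + 1 modalities are available.
    Q_stop(k)    = p_correct[k] - cost * (k + 1)   (expected terminal reward)
    Q_acquire(k) = V(k + 1)                        (no intermediate reward)
    V(k)         = max(Q_stop(k), Q_acquire(k)); only stopping is possible
                   at the last stage.
    """
    n_stages = len(p_correct)
    q_stop = [p_correct[k] - cost * (k + 1) for k in range(n_stages)]

    values = [0.0] * n_stages
    policy = [""] * n_stages

    # Last stage: every modality is already acquired, so the agent must stop.
    values[-1] = q_stop[-1]
    policy[-1] = "stop"

    # Earlier stages are solved from the end toward the start, so each
    # stage's target relies on the already-fixed value of the next stage.
    for k in range(n_stages - 2, -1, -1):
        q_acquire = values[k + 1]
        if q_stop[k] >= q_acquire:
            values[k], policy[k] = q_stop[k], "stop"
        else:
            values[k], policy[k] = q_acquire, "acquire"
    return values, policy


# When one modality is already accurate, extra acquisitions do not pay off.
_, easy_policy = backward_values([0.95, 0.96, 0.97], cost=0.1)
# When the first modality is weak, acquiring a second one is worthwhile.
_, hard_policy = backward_values([0.60, 0.90, 0.92], cost=0.05)
```

In a deep Q-learning setting the per-stage values would be sample-dependent network outputs rather than fixed numbers; the sketch only shows why fitting later stages first gives earlier stages stable targets.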

Problem

Research questions and friction points this paper is trying to address.

multi-omics classification

cost-aware decision-making

modality acquisition

precision medicine

clinical efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

cost-aware decision-making

multi-omics classification

deep Q-learning