Completion of the DrugMatrix Toxicogenomics Database using 3-Dimensional Tensors

📅 2025-07-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the high prevalence of missing data in the DrugMatrix toxicogenomics database. We propose a three-dimensional imputation method based on non-negative tensor completion (NTC) that models tissue type, treatment condition, and gene expression jointly, which helps preserve the original data distribution and capture organ-specific variability. The method keeps the 3D structure of the data and uses a machine learning formulation, in contrast to conventional Canonical Polyadic (CP) decomposition and 2D matrix factorization. In the reported experiments it achieves lower mean squared error and mean absolute error than both of these baselines. We applied the method to complete the world's largest in vivo toxicogenomics database, which comprises multi-organ, multi-dose, and multi-time-point measurements. The approach is offered as a promising methodology for future studies of drugs that may cross species barriers, for example from rats to humans.

📝 Abstract
We explore applying a tensor completion approach to complete the DrugMatrix toxicogenomics dataset. Our hypothesis is that by preserving the 3-dimensional structure of the data, which comprises tissue, treatment, and transcriptomic measurements, and by leveraging a machine learning formulation, our approach will improve upon prior state-of-the-art results. Our results demonstrate that the new tensor-based method more accurately reflects the original data distribution and effectively captures organ-specific variability. The proposed tensor-based methodology achieved lower mean squared errors and mean absolute errors compared to both conventional Canonical Polyadic decomposition and 2-dimensional matrix factorization methods. In addition, our non-negative tensor completion implementation reveals relationships among tissues. Our findings not only complete the world's largest in-vivo toxicogenomics database with improved accuracy but also offer a promising methodology for future studies of drugs that may cross species barriers, for example, from rats to humans.
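To make the idea of completing a tissue-by-treatment-by-gene tensor concrete, here is a minimal sketch of masked non-negative CP factorization with multiplicative updates. It is a generic, textbook-style technique and not the authors' NTC implementation; conceptually it is closer to the CP baseline that the abstract compares against. The function name `complete_nonneg_cp`, the default rank, and the iteration count are illustrative assumptions, and the sketch assumes all observed values are non-negative.

```python
import numpy as np

def complete_nonneg_cp(X, mask, rank=3, n_iter=500, eps=1e-9, seed=0):
    """Fill missing entries of a 3-way tensor with a non-negative CP model.

    X    : array of shape (tissue, treatment, gene); entries where mask == 0
           are ignored and may be NaN
    mask : same shape as X, 1.0 where observed and 0.0 where missing
    Returns the completed tensor and the factor matrices (A, B, C).
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    # Strictly positive random initialization keeps the updates well defined.
    A = rng.random((I, rank)) + 0.1
    B = rng.random((J, rank)) + 0.1
    C = rng.random((K, rank)) + 0.1
    # Zero out unobserved entries so they never contribute to the fit.
    Xo = np.where(mask > 0, X, 0.0)

    def masked_model():
        # Current reconstruction, restricted to observed entries.
        return mask * np.einsum('ir,jr,kr->ijk', A, B, C)

    for _ in range(n_iter):
        # Multiplicative updates for the masked least-squares objective.
        # Each factor is rescaled by a ratio of non-negative terms,
        # so it stays non-negative throughout.
        R = masked_model()
        A *= np.einsum('ijk,jr,kr->ir', Xo, B, C) / (
            np.einsum('ijk,jr,kr->ir', R, B, C) + eps)
        R = masked_model()
        B *= np.einsum('ijk,ir,kr->jr', Xo, A, C) / (
            np.einsum('ijk,ir,kr->jr', R, A, C) + eps)
        R = masked_model()
        C *= np.einsum('ijk,ir,jr->kr', Xo, A, B) / (
            np.einsum('ijk,ir,jr->kr', R, A, B) + eps)

    X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
    return X_hat, (A, B, C)
```

Calling `complete_nonneg_cp(X, mask, rank=2)` returns the completed tensor together with the three factor matrices. Under this sketch, one could compare rows of the tissue factor `A` to look for relationships among tissues, which is the kind of structure the abstract says the non-negative formulation reveals.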
Problem

Research questions and friction points this paper is trying to address.

Complete DrugMatrix toxicogenomics data using tensor completion
Improve accuracy by preserving the 3D data structure and using a machine learning formulation
Capture organ-specific variability and support future studies of drugs that may cross species barriers (e.g., rat to human)
Innovation

Methods, ideas, or system contributions that make the work stand out.

Tensor completion preserves 3D data structure
Non-negative tensor completion reveals relationships among tissues
Lower MSE and MAE than CP decomposition and 2D matrix factorization
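The error comparison above is stated in terms of mean squared error and mean absolute error. One common way to obtain such numbers for an imputation method is to hide a fraction of the observed entries, complete the tensor from the rest, and score only the hidden entries. The sketch below illustrates that protocol; the hold-out fraction and the helper names `split_observed` and `holdout_scores` are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def split_observed(mask, frac=0.1, seed=0):
    """Hide a random fraction of the observed entries for evaluation.

    Returns (train_mask, held_out), both boolean arrays shaped like mask.
    """
    rng = np.random.default_rng(seed)
    observed = np.flatnonzero(mask)            # flat indices of observed entries
    n_test = int(round(frac * observed.size))
    test_idx = rng.choice(observed, size=n_test, replace=False)
    held_out = np.zeros(mask.shape, dtype=bool)
    held_out.flat[test_idx] = True
    train_mask = np.asarray(mask, dtype=bool) & ~held_out
    return train_mask, held_out

def holdout_scores(X_true, X_pred, held_out):
    """Mean squared error and mean absolute error on held-out entries only."""
    diff = X_pred[held_out] - X_true[held_out]
    mse = float(np.mean(diff ** 2))
    mae = float(np.mean(np.abs(diff)))
    return mse, mae
```

Any completion method (tensor-based or matrix-based) can be fitted with `train_mask` and then scored with `holdout_scores`, so that all methods are compared on the same hidden entries.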