Completion of the DrugMatrix Toxicogenomics Database using 3-Dimensional Tensors

📅 2025-07-02

🤖 AI Summary
This study addresses the high prevalence of missing data in the DrugMatrix toxicogenomics database. We propose a three-dimensional imputation method based on non-negative tensor completion (NTC) that models the coupled relationships among tissue, treatment, and gene expression, which preserves the original data distribution and captures organ-specific variability. Unlike conventional Canonical Polyadic (CP) decomposition or 2-dimensional matrix factorization, our method combines non-negative tensor decomposition with a machine learning formulation. In our experiments it achieves lower mean squared error and mean absolute error than both baselines. We applied the method to complete the world's largest in vivo toxicogenomics database, which comprises multi-organ, multi-dose, and multi-time-point measurements. The approach is a promising methodology for future studies of drugs that may cross species barriers, for example from rats to humans.

📝 Abstract
We explore applying a tensor completion approach to complete the DrugMatrix toxicogenomics dataset. Our hypothesis is that by preserving the 3-dimensional structure of the data, which comprises tissue, treatment, and transcriptomic measurements, and by leveraging a machine learning formulation, our approach will improve upon prior state-of-the-art results. Our results demonstrate that the new tensor-based method more accurately reflects the original data distribution and effectively captures organ-specific variability. The proposed tensor-based methodology achieved lower mean squared errors and mean absolute errors compared to both conventional Canonical Polyadic decomposition and 2-dimensional matrix factorization methods. In addition, our non-negative tensor completion implementation reveals relationships among tissues. Our findings not only complete the world's largest in-vivo toxicogenomics database with improved accuracy but also offer a promising methodology for future studies of drugs that may cross species barriers, for example, from rats to humans.
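To make the idea of completing a tissue × treatment × gene tensor concrete, here is a minimal sketch of non-negative CP completion using masked multiplicative updates. It is not the authors' implementation: the update rule, the rank, the toy tensor sizes, and the function names (`reconstruct`, `masked_mse`, `nonneg_cp_complete`) are all assumptions chosen for illustration.

```python
import numpy as np

def reconstruct(A, B, C):
    """Rebuild the full tensor from CP factors (one factor matrix per mode)."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def masked_mse(X, mask, A, B, C):
    """Mean squared error over the observed entries only."""
    diff = (reconstruct(A, B, C) - X)[mask]
    return float(np.mean(diff ** 2))

def nonneg_cp_complete(X, mask, rank=2, n_iter=300, eps=1e-9, seed=0):
    """Fit non-negative CP factors to the observed entries of X.

    Assumed illustrative algorithm: weighted multiplicative updates,
    which keep every factor entry non-negative by construction.
    """
    rng = np.random.default_rng(seed)
    A, B, C = (rng.random((n, rank)) + 0.1 for n in X.shape)
    W = mask.astype(float)           # 1 where observed, 0 where missing
    Xo = np.where(mask, X, 0.0)      # zero-fill missing entries (NaN-safe)
    history = [masked_mse(X, mask, A, B, C)]
    for _ in range(n_iter):
        A *= np.einsum("ijk,jr,kr->ir", Xo, B, C) / (
            np.einsum("ijk,jr,kr->ir", W * reconstruct(A, B, C), B, C) + eps)
        B *= np.einsum("ijk,ir,kr->jr", Xo, A, C) / (
            np.einsum("ijk,ir,kr->jr", W * reconstruct(A, B, C), A, C) + eps)
        C *= np.einsum("ijk,ir,jr->kr", Xo, A, B) / (
            np.einsum("ijk,ir,jr->kr", W * reconstruct(A, B, C), A, B) + eps)
        history.append(masked_mse(X, mask, A, B, C))
    return A, B, C, history

# Toy data (hypothetical sizes): 4 tissues x 6 treatments x 8 genes, ~30% missing.
rng = np.random.default_rng(1)
truth = reconstruct(rng.random((4, 2)), rng.random((6, 2)), rng.random((8, 2)))
mask = rng.random(truth.shape) < 0.7
X = np.where(mask, truth, np.nan)

A, B, C, history = nonneg_cp_complete(X, mask)
completed = np.where(mask, X, reconstruct(A, B, C))  # keep observed, fill missing
print("observed-entry MSE: start", history[0], "end", history[-1])
```

In this sketch the rows of the first factor matrix play the role of per-tissue loadings, so comparing those rows is one simple way a non-negative factorization could expose relationships among tissues.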
Problem

Research questions and friction points this paper is trying to address.

Complete the DrugMatrix toxicogenomics dataset using tensor completion
Improve accuracy by preserving the 3D data structure and using a machine learning formulation
Capture organ-specific variability and support future studies of drugs that may cross species barriers
Innovation

Methods, ideas, or system contributions that make the work stand out.

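Tensor completion preserves the 3D structure of the data (tissue, treatment, transcriptomic measurements)
Non-negative tensor completion reveals relationships among tissues
Lower mean squared and mean absolute errors than CP decomposition and 2D matrix factorization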
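The error comparison above rests on two metrics, mean squared error and mean absolute error. The sketch below shows one common way to compute them for an imputation method: hide a set of known entries, impute them, and score only those hidden positions. The evaluation protocol is not described in this summary, so the held-out-mask setup and the helper name `heldout_errors` are assumptions.

```python
import numpy as np

def heldout_errors(X_true, X_pred, heldout_mask):
    """Return (MSE, MAE) computed only on entries that were hidden from the model."""
    diff = (X_pred - X_true)[heldout_mask]
    mse = float(np.mean(diff ** 2))
    mae = float(np.mean(np.abs(diff)))
    return mse, mae

# Tiny illustrative tensor: every prediction is off by exactly 2.
X_true = np.zeros((2, 2, 2))
X_pred = np.full((2, 2, 2), 2.0)
heldout = np.zeros((2, 2, 2), dtype=bool)
heldout[0, 0, 0] = heldout[1, 1, 1] = True   # two hidden entries

mse, mae = heldout_errors(X_true, X_pred, heldout)
print(mse, mae)
```

Scoring each candidate method on the same held-out mask keeps the comparison between tensor-based and matrix-based imputations like-for-like.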