Sequence-to-Image Transformation for Sequence Classification Using Rips Complex Construction and Chaos Game Representation

📅 2025-12-10
🤖 AI Summary
To address feature sparsity, high computational complexity, and the limited performance of deep models on tabular biological data in molecular sequence classification, this paper proposes a novel sequence-to-image transformation framework. It integrates Chaos Game Representation (CGR) with algebraic-topological Rips complexes to generate topological images that jointly encode local sequence patterns and global topological structure. This work is the first to introduce the Rips complex into sequence visualization, providing theoretical guarantees of representation uniqueness, topological stability, and information preservation. Evaluated on anticancer peptide datasets for breast and lung cancer, the method achieves 86.8% and 94.5% classification accuracy, respectively—outperforming conventional vector-based approaches, sequence language models, and state-of-the-art image-based baselines. The framework is compatible with vision-oriented architectures such as Vision Transformers and ResNet, demonstrating both efficacy and generalizability in biomedical sequence analysis.

📝 Abstract
Traditional feature engineering approaches for molecular sequence classification suffer from sparsity and high computational complexity, while deep learning models often underperform on tabular biological data. This paper introduces a novel topological approach that transforms molecular sequences into images by combining Chaos Game Representation (CGR) with Rips complex construction from algebraic topology. Our method maps sequence elements to 2D coordinates via CGR, computes pairwise distances between the resulting points, and constructs Rips complexes to capture both local structural and global topological features. We provide formal guarantees on representation uniqueness, topological stability, and information preservation. Extensive experiments on anticancer peptide datasets demonstrate superior performance over vector-based methods, sequence language models, and existing image-based methods, achieving 86.8% and 94.5% accuracy on the breast and lung cancer datasets, respectively. The topological representation preserves critical sequence information while enabling effective use of vision-based deep learning architectures for molecular sequence analysis.
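The three steps named in the abstract (CGR coordinates, pairwise distances, Rips complex) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The polygon layout for the 20 amino acids, the `ratio` of 0.5, the function names, and the convention that a simplex is included when all of its pairwise distances are at most `epsilon` are assumptions chosen for illustration.

```python
import math
from itertools import combinations

# Assumption: the 20 standard amino acids sit on the vertices of a regular
# polygon on the unit circle. The paper's actual CGR layout may differ.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cgr_points(sequence, alphabet=AMINO_ACIDS, ratio=0.5):
    """Map each residue to a 2D point with the chaos game iteration."""
    n = len(alphabet)
    vertices = {
        ch: (math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
        for i, ch in enumerate(alphabet)
    }
    x, y = 0.0, 0.0  # start at the polygon centre
    points = []
    for ch in sequence:
        vx, vy = vertices[ch]  # raises KeyError for symbols outside the alphabet
        # Move a fixed fraction of the way toward the residue's vertex.
        x, y = x + ratio * (vx - x), y + ratio * (vy - y)
        points.append((x, y))
    return points


def pairwise_distances(points):
    """Return the symmetric Euclidean distance matrix as nested lists."""
    return [[math.dist(p, q) for q in points] for p in points]


def rips_complex(dist, epsilon):
    """Vietoris-Rips complex up to dimension 2 at scale epsilon.

    Assumed convention: a simplex is included when every pairwise distance
    among its vertices is at most epsilon.
    """
    n = len(dist)
    vertices = [(i,) for i in range(n)]
    edges = [(i, j) for i, j in combinations(range(n), 2) if dist[i][j] <= epsilon]
    edge_set = set(edges)
    triangles = [
        (i, j, k)
        for i, j, k in combinations(range(n), 3)
        if (i, j) in edge_set and (i, k) in edge_set and (j, k) in edge_set
    ]
    return vertices, edges, triangles


# Toy peptide fragment, used only to exercise the functions.
points = cgr_points("ACDA")
dist = pairwise_distances(points)
vertices, edges, triangles = rips_complex(dist, epsilon=0.6)
```

Raising `epsilon` adds edges and triangles monotonically, so sweeping it yields a nested family of complexes; which scale or scales the paper uses is not stated in this summary.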
Problem

Research questions and friction points this paper is trying to address.

Transforms molecular sequences into topological images for classification
Addresses sparsity and computational issues in traditional sequence analysis
Enables vision-based deep learning on biological sequence data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines Chaos Game Representation with Rips complex construction
Transforms molecular sequences into topological image features
Enables vision-based deep learning for sequence analysis
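To make the "topological image" idea concrete, the sketch below rasterizes the vertices and edges of a complex onto a small binary grid that a vision model could consume. The rendering scheme is an assumption for illustration only: this summary does not say how the paper draws its images, and `rasterize_edges`, `size`, and `samples` are hypothetical names.

```python
def rasterize_edges(points, edges, size=32, samples=64):
    """Draw points and edges as pixels on a size x size binary grid.

    Assumption: coordinates are rescaled to the bounding box of the points,
    and each edge is drawn by sampling positions along the segment.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    span_x = (max(xs) - min_x) or 1.0  # avoid division by zero
    span_y = (max(ys) - min_y) or 1.0

    def to_pixel(x, y):
        # Round to the nearest pixel and clamp to the grid bounds.
        col = max(0, min(size - 1, int((x - min_x) / span_x * (size - 1) + 0.5)))
        row = max(0, min(size - 1, int((y - min_y) / span_y * (size - 1) + 0.5)))
        return row, col

    image = [[0] * size for _ in range(size)]
    for x, y in points:
        r, c = to_pixel(x, y)
        image[r][c] = 1
    for i, j in edges:
        (x0, y0), (x1, y1) = points[i], points[j]
        for s in range(samples + 1):
            t = s / samples
            r, c = to_pixel(x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            image[r][c] = 1
    return image


# Toy input: three points with one edge between the first two.
toy_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
toy_edges = [(0, 1)]
image = rasterize_edges(toy_points, toy_edges, size=8)
```

The nested list can be converted to an array or tensor and resized to whatever input shape a ResNet or Vision Transformer expects.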
Sarwan Ali
Columbia University
Deep Learning, Machine Learning, Adversarial Attack, Combinatorial Optimization, Bioinformatics
Taslim Murad
IBA, Karachi, Pakistan
Imdadullah Khan
Lahore University of Management Sciences, Lahore, Pakistan