🤖 AI Summary
This work addresses the vulnerability of gas chromatography–mass spectrometry (GC-MS) detection to non-specific peaks, retention time shifts, and background noise caused by interfering substances, all of which degrade sensitivity and increase false positives. To overcome these limitations, the authors propose a peak-aware conditional generative adversarial network (CGAN) framework that synthesizes high-fidelity GC-MS signals by encoding chemical and solvent information into the latent vector. A peak-aware mechanism is integrated to emphasize the characteristic diagnostic peaks. The approach generates diverse interference scenarios without conducting additional real experiments, thereby augmenting the training dataset for downstream discrimination models. The synthesized signals achieve cosine similarity and Pearson correlation coefficients above 0.9, and training on them reduces false alarms in the discrimination model.
📝 Abstract
Gas chromatography-mass spectrometry (GC-MS) is a widely used analytical method for detecting chemical substances, but its measurement reliability deteriorates in the presence of interfering substances. In particular, interfering substances cause nonspecific peaks, retention time shifts, and increased background noise, which reduce sensitivity and trigger false alarms. To overcome these challenges, we propose an artificial intelligence (AI) discrimination framework based on a peak-aware conditional generative model that improves the reliability of GC-MS measurements under interference conditions. The framework is trained with a novel peak-aware mechanism that emphasizes the characteristic peaks of GC-MS data, allowing it to reproduce important spectral features more faithfully. In addition, chemical and solvent information is encoded into the latent vector, allowing a conditional generative adversarial network (CGAN) to generate synthetic GC-MS signals consistent with the experimental conditions. This yields a dataset that simulates interfering-substance scenarios, for which data acquisition is limited, without conducting real experiments. The generated data are then used to train AI-based GC-MS discrimination models for more accurate chemical substance discrimination. We evaluate the generated data quantitatively and qualitatively to verify the validity of the proposed framework, and we examine how the generative model improves the performance of the discrimination model. Notably, the proposed method consistently achieves cosine similarity and Pearson correlation coefficient values above 0.9 while preserving peak number diversity and reducing false alarms in the discrimination model.
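As a rough illustration of the two similarity metrics named above, the following minimal sketch compares a measured signal with a synthetic one using only the Python standard library. The function names and the toy intensity values are hypothetical and chosen for illustration; the abstract does not describe the authors' evaluation code, signal length, or preprocessing, so this should not be read as their implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length intensity vectors."""
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("similarity is undefined for an all-zero signal")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson r, computed as the cosine similarity of mean-centered signals."""
    if not a or not b:
        raise ValueError("signals must be non-empty")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    # A constant signal centers to all zeros, so the call below raises.
    return cosine_similarity(centered_a, centered_b)


# Hypothetical toy signals (normalized intensities at matching time points).
real = [0.0, 0.1, 0.9, 0.2, 0.0, 0.5, 0.1]
synthetic = [0.0, 0.2, 0.8, 0.3, 0.0, 0.4, 0.1]

print("cosine:", cosine_similarity(real, synthetic))
print("pearson:", pearson_correlation(real, synthetic))
```

Cosine similarity is sensitive to the overall baseline of the signal, whereas Pearson correlation removes the mean first, so reporting both gives a partial check that agreement is not driven by a shared background level alone.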