🤖 AI Summary
Predicting data transfer performance in scientific computing networks faces severe class imbalance, particularly due to the scarcity of slow-transfer samples. Method: This paper systematically evaluates generative data augmentation techniques—SMOTE, ADASYN, and CTGAN—in conjunction with logistic regression, XGBoost, and LSTM models, under multiple imbalance ratios. Contribution/Results: (1) Generative augmentation yields only marginal improvements for slow-transfer prediction, with diminishing returns as class imbalance intensifies; (2) CTGAN does not significantly outperform simple stratified sampling; (3) We propose a lightweight, efficient prediction paradigm grounded in stratified sampling as a baseline. This work challenges the prevailing assumption that generative models are necessary for this task, establishing a reproducible, low-overhead benchmark for early network performance prediction.
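The stratified-sampling baseline the summary advocates can be illustrated with a minimal, self-contained sketch (not the authors' code): subsample each class independently so the original imbalance ratio is preserved in the smaller training set.

```python
import random
from collections import Counter

def stratified_sample(X, y, frac, seed=0):
    """Draw a subsample that preserves each class's original proportion."""
    rng = random.Random(seed)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    Xs, ys = [], []
    for label, items in by_class.items():
        k = max(1, round(frac * len(items)))  # per-class sample size
        for xi in rng.sample(items, k):
            Xs.append(xi)
            ys.append(label)
    return Xs, ys

# Synthetic 10:1 imbalanced dataset: 1000 "fast" vs 100 "slow" transfers
X = list(range(1100))
y = ["fast"] * 1000 + ["slow"] * 100
Xs, ys = stratified_sample(X, y, frac=0.2)
print(Counter(ys))  # → Counter({'fast': 200, 'slow': 20}); 10:1 ratio kept
```

Because the per-class proportions are fixed, the slow-transfer class is never accidentally dropped from a small training split, which is the low-overhead property the paper contrasts against generative augmentation.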
📝 Abstract
Monitoring data transfer performance is a crucial task in scientific computing networks. By predicting performance early in the communication phase, potentially sluggish transfers can be identified and selectively monitored, optimizing network usage and overall performance. A key bottleneck to improving the predictive power of machine learning (ML) models in this context is class imbalance. This project addresses the class imbalance problem to enhance the accuracy of performance predictions. We analyze and compare various augmentation strategies, including traditional oversampling methods and generative techniques, and we vary the class imbalance ratios in the training datasets to evaluate their impact on model performance. While augmentation can yield modest gains, these gains diminish as the imbalance ratio increases. We conclude that even an advanced generative technique such as CTGAN does not significantly outperform simple stratified sampling.
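To make the oversampling side of the comparison concrete, here is a minimal sketch of the SMOTE idea (synthesizing minority samples by interpolating between a minority point and one of its nearest minority-class neighbors). The feature vectors and function name are illustrative, not taken from the paper.

```python
import math
import random

def smote_like(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples by interpolating each
    chosen point toward one of its k nearest minority-class neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbors of x (excluding x itself)
        neighbors = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nb = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(xi + gap * (ni - xi) for xi, ni in zip(x, nb)))
    return synthetic

# Toy minority class: a handful of "slow transfer" feature vectors
slow = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.3), (1.1, 2.1)]
new_points = smote_like(slow, n_new=6)
print(len(new_points))  # 6 synthetic slow-transfer samples
```

Each synthetic point lies on a segment between two real minority samples, so it stays inside the minority class's local region; the paper's finding is that adding such points (or CTGAN-generated ones) helps only marginally once the imbalance ratio grows large.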