🤖 AI Summary
This study addresses credit card fraud detection, where performance is limited because fraudulent transactions are rare, costly, and unevenly distributed. The authors apply Combinatorial Fusion Analysis (CFA) at the validation stage to search for the best subset among seven base classifiers, including Random Forest, XGBoost, and LightGBM, and combine the selected models with diversity-weighted score fusion (DEF WtScore). CFA is treated as a model selection and weighting mechanism rather than as a way to fuse every classifier. Evaluation uses a leakage-free 60/20/20 train/validation/test split and bootstrap confidence intervals from 1,000 resamples. On the IEEE-CIS dataset, the method achieves an AUC-ROC of 0.9405, an AUPRC of 0.6699, and an F1-score of 0.6373. The confidence intervals for the gains over the best individual model exclude zero on all three metrics; CFA matches soft voting on AUC-ROC while improving AUPRC and F1, and it outperforms stacking.
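The diversity-aware weighting mentioned in this summary can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: it assumes the rank-score characteristic (RSC) function and cognitive diversity as they are commonly defined in the CFA literature, and it assumes that DEF WtScore weights each model's min-max normalized score by that model's average diversity from the other models. All function names and the toy scores are hypothetical.

```python
from itertools import combinations
from math import sqrt


def min_max(scores):
    """Rescale scores to [0, 1] so models are comparable (assumed preprocessing)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def rank_score_function(scores):
    """RSC function: the score found at rank 1, 2, ..., n (best first)."""
    return sorted(scores, reverse=True)


def cognitive_diversity(scores_a, scores_b):
    """Root-mean-square gap between the RSC functions of two models."""
    f_a = rank_score_function(scores_a)
    f_b = rank_score_function(scores_b)
    return sqrt(sum((x - y) ** 2 for x, y in zip(f_a, f_b)) / len(f_a))


def diversity_strengths(models):
    """Average cognitive diversity of each model against all the others."""
    if len(models) < 2:
        raise ValueError("need at least two models")
    totals = {name: 0.0 for name in models}
    for a, b in combinations(models, 2):
        d = cognitive_diversity(models[a], models[b])
        totals[a] += d
        totals[b] += d
    return {name: t / (len(models) - 1) for name, t in totals.items()}


def diversity_weighted_fusion(models):
    """Weighted average of normalized scores, weights = diversity strengths."""
    normalized = {name: min_max(s) for name, s in models.items()}
    weights = diversity_strengths(normalized)
    if sum(weights.values()) == 0:
        # Identical RSC functions: fall back to equal weights (an assumption).
        weights = {name: 1.0 for name in normalized}
    total = sum(weights.values())
    n = len(next(iter(normalized.values())))
    return [
        sum(weights[m] * normalized[m][i] for m in normalized) / total
        for i in range(n)
    ]


# Hypothetical validation-set fraud scores for four transactions.
toy_scores = {
    "random_forest": [0.90, 0.20, 0.40, 0.10],
    "xgboost": [0.80, 0.30, 0.70, 0.05],
    "lightgbm": [0.95, 0.10, 0.50, 0.20],
}
print(diversity_weighted_fusion(toy_scores))
```

In a validation-stage search, a routine like this would be run for each candidate subset of the seven classifiers, and the subset and weights that score best on the validation split would then be frozen before touching the test split.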
📝 Abstract
Credit-card fraud detection is difficult because fraudulent transactions are rare, costly, and unevenly distributed. Strong gradient-boosted tree models already perform well on structured transaction data, so the value of another fusion method is not obvious. This paper examines whether Combinatorial Fusion Analysis (CFA), which searches over model subsets and rank-score fusion rules, can still add value on the IEEE-CIS Fraud Detection benchmark. Using a leakage-free 60/20/20 train/validation/test protocol, we evaluate 480 fusion configurations built from seven base classifiers. The best test-set result comes from diversity-weighted score fusion of Random Forest, XGBoost, and LightGBM (DEF WtScore), with AUC-ROC = 0.9405, AUPRC = 0.6699, and F1 = 0.6373. Bootstrap confidence intervals from 1,000 resamples for the gains over the strongest single model exclude zero on all three metrics. CFA matches soft voting on AUC-ROC, improves AUPRC and F1, and outperforms stacking in this setting. A CTGAN augmentation experiment gives a negative result: synthetic fraud samples degrade both individual models and CFA. Overall, CFA is most useful here not as a way to combine every classifier, but as a validation-stage method for choosing a small, complementary subset and assigning diversity-aware weights.
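The bootstrap comparison against the strongest single model can be illustrated with a short sketch. The abstract does not say which bootstrap variant was used, so this example assumes a paired percentile bootstrap: the same resampled test rows are scored by both models, and the interval is taken over the metric difference. The helper names, the seed, and the toy data are hypothetical, and AUC-ROC is computed with the pairwise (Mann-Whitney) formulation to keep the example dependency-free.

```python
import random


def auc_roc(labels, scores):
    """AUC-ROC as the share of (fraud, non-fraud) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def paired_bootstrap_ci(labels, scores_a, scores_b, metric,
                        n_resamples=1000, alpha=0.05, seed=0):
    """Percentile interval for metric(A) - metric(B) on shared resamples."""
    rng = random.Random(seed)
    n = len(labels)
    deltas = []
    while len(deltas) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # skip resamples that lost one class entirely
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        deltas.append(metric(y, a) - metric(y, b))
    deltas.sort()
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    return deltas[lo_idx], deltas[hi_idx]


# Hypothetical test labels and scores: a fused model versus a single model.
rng = random.Random(42)
labels = [1] * 20 + [0] * 180
fused = [rng.gauss(0.70 if y else 0.30, 0.15) for y in labels]
single = [rng.gauss(0.60 if y else 0.35, 0.20) for y in labels]
low, high = paired_bootstrap_ci(labels, fused, single, auc_roc)
print(f"95% CI for the AUC-ROC gain: [{low:.4f}, {high:.4f}]")
```

A gain is read as excluding zero when the lower bound of the interval is above zero. The same routine could take an AUPRC or F1 function as `metric`, provided it accepts labels and scores in the same order.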