The Role of Review Process Failures in Affective State Estimation: An Empirical Investigation of DEAP Dataset

📅 2025-08-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
The reliability of EEG-based affective state estimation remains questionable due to widespread methodological flaws and the absence of standardized evaluation protocols. Method: We conducted a systematic literature review and reproducibility-focused experimental validation of 101 published studies using the DEAP dataset, identifying five prevalent methodological errors: data leakage, biased feature selection, improper hyperparameter optimization, neglect of class imbalance, and inadequate cross-validation design. Contribution/Results: Our analysis reveals that 87% of the studies contain at least one severe flaw, and that typical errors artificially inflate classification accuracy by up to 46%. This work provides the first quantitative assessment of how methodological biases distort performance evaluation in EEG-based affective computing, and it exposes systemic gaps in peer-review rigor and standardization practices within neuroscience-oriented machine learning research. The findings establish an empirical foundation and an actionable roadmap for developing stringent methodological guidelines, reproducible evaluation protocols, and community-adopted benchmarks.

📝 Abstract
The reliability of affective state estimation using EEG data is in question, given the variability in reported performance and the lack of standardized evaluation protocols. To investigate this, we reviewed 101 studies, focusing on the widely used DEAP dataset for emotion recognition. Our analysis revealed widespread methodological issues that include data leakage from improper segmentation, biased feature selection, flawed hyperparameter optimization, neglect of class imbalance, and insufficient methodological reporting. Notably, we found that nearly 87% of the reviewed papers contained one or more of these errors. Moreover, through experimental analysis, we observed that such methodological flaws can inflate the classification accuracy by up to 46%. These findings reveal fundamental gaps in standardized evaluation practices and highlight critical deficiencies in the peer review process for machine learning applications in neuroscience, emphasizing the urgent need for stricter methodological standards and evaluation protocols.
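The data leakage from improper segmentation described above can be sketched in a few lines. This is a minimal, hypothetical illustration (the trial and window counts are invented, not taken from the paper): when trials are cut into windows before the train/test split, windows from the same trial land on both sides of the split and share trial-specific signal.

```python
# Hypothetical sketch of segmentation-induced data leakage.
# All sizes are illustrative, not from the reviewed studies.
import random

random.seed(0)

n_trials, windows_per_trial = 40, 15

# Each window remembers which trial it came from.
windows = [(trial, w) for trial in range(n_trials)
           for w in range(windows_per_trial)]

# Flawed protocol: segment first, then shuffle and split at the WINDOW level.
shuffled = windows[:]
random.shuffle(shuffled)
split = int(0.8 * len(shuffled))
train_w, test_w = shuffled[:split], shuffled[split:]
leaked = {t for t, _ in train_w} & {t for t, _ in test_w}

# Sound protocol: split at the TRIAL level first, then segment each side.
trials = list(range(n_trials))
random.shuffle(trials)
train_t, test_t = set(trials[:32]), set(trials[32:])
clean = train_t & test_t

# Trial overlap between train and test under each protocol.
print(len(leaked), len(clean))
```

Splitting at the trial (or subject) level before segmentation is the standard remedy: the overlap set is empty by construction, so no window in the test set shares a trial with the training data.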
Problem

Research questions and friction points this paper is trying to address.

Identifying methodological flaws in EEG-based affective state estimation
Assessing impact of review process failures on emotion recognition accuracy
Proposing need for standardized evaluation protocols in neuroscience ML
Innovation

Methods, ideas, or system contributions that make the work stand out.

Identified data leakage from improper trial segmentation
Exposed biased feature selection and flawed hyperparameter optimization
Documented neglect of class imbalance and inadequate cross-validation design
Measured accuracy inflation of up to 46% caused by these flaws
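The biased feature selection flaw can be reproduced on pure-noise data, where any above-chance accuracy is an artifact. The following is a toy sketch, not the paper's experimental setup: the nearest-centroid classifier and the sizes (60 samples, 500 noise features, top 10 features) are invented for illustration. Selecting features on the full dataset before cross-validation inflates accuracy; selecting inside each fold does not.

```python
# Toy demonstration of biased feature selection (illustrative, not the
# paper's method): labels are independent of the pure-noise features, so
# honest cross-validation should hover around chance (0.5).
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 60, 500, 10
X = rng.standard_normal((n, p))     # pure noise features
y = np.array([0, 1] * (n // 2))     # labels unrelated to X

def top_k_features(Xs, ys):
    # Rank features by absolute covariance with the label.
    corr = np.abs(((Xs - Xs.mean(0)) * (ys - ys.mean())[:, None]).mean(0))
    return np.argsort(corr)[-k:]

def loo_accuracy(preselected=None):
    # Leave-one-out CV with a nearest-centroid classifier.
    correct = 0
    for i in range(n):
        mask = np.arange(n) != i
        Xtr, ytr = X[mask], y[mask]
        feats = (preselected if preselected is not None
                 else top_k_features(Xtr, ytr))   # selection inside the fold
        c0 = Xtr[ytr == 0][:, feats].mean(0)
        c1 = Xtr[ytr == 1][:, feats].mean(0)
        d0 = np.sum((X[i, feats] - c0) ** 2)
        d1 = np.sum((X[i, feats] - c1) ** 2)
        correct += int(d1 < d0) == y[i]
    return correct / n

# Biased: features chosen on ALL data, so selection has seen the test point.
biased = loo_accuracy(preselected=top_k_features(X, y))
# Honest: features re-selected within each training fold.
honest = loo_accuracy()
print(f"biased={biased:.2f}  honest={honest:.2f}")
```

The biased protocol reports well above chance on data that contains no signal at all, which is exactly the kind of inflation the review quantifies; the honest protocol stays near 0.5.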
Nazmun N Khan
Mike Wiegers Department of Electrical & Computer Engineering, Kansas State University, Manhattan, KS, USA
Taylor Sweet
Mike Wiegers Department of Electrical & Computer Engineering, Kansas State University, Manhattan, KS, USA
Chase A Harvey
Mike Wiegers Department of Electrical & Computer Engineering, Kansas State University, Manhattan, KS, USA
Calder Knapp
Mike Wiegers Department of Electrical & Computer Engineering, Kansas State University, Manhattan, KS, USA
Dean J. Krusienski
Professor of Biomedical Engineering, Virginia Commonwealth University
Signal Processing, Machine Learning, Brain-Computer Interfaces, Neural Engineering
David E Thompson
Mike Wiegers Department of Electrical & Computer Engineering, Kansas State University, Manhattan, KS, USA