Task Decoding based on Eye Movements using Synthetic Data Augmentation

📅 2025-09-14
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study validates the Yarbus hypothesis—that eye movement patterns reflect observers’ cognitive tasks—while addressing the challenge of low task decoding accuracy under small-sample conditions. We propose a multi-source synthetic data augmentation framework leveraging CTGAN, CopulaGAN, and Gretel AI, constituting the first systematic integration of generative modeling with task classification (e.g., Random Forest, InceptionTime) on this eye-tracking dataset. Augmenting the original dataset fivefold with high-fidelity synthetic samples elevates task decoding accuracy from 28.1% to 82.0%, substantially outperforming existing traditional machine learning and deep learning approaches. Our core contributions are threefold: (1) establishing the first synthetic data augmentation paradigm specifically designed for eye movement–based task decoding; (2) empirically demonstrating the efficacy and generalizability of synthetic data in small-sample cognitive decoding; and (3) offering a novel, non-invasive pathway for cognitive state recognition.

📝 Abstract
Machine learning has been used extensively in eye-tracking research. Understanding eye movements is one of the most significant subsets of eye-tracking research, as it reveals an individual's scanning pattern. Researchers have analyzed eye movement data to study applications such as attention mechanisms, navigational behavior, and task understanding. Traditional machine learning algorithms applied to task decoding from eye movement data have yielded mixed support for Yarbus' claim that an observer's task can be decoded from their eye movements. In this paper, to support Yarbus' hypothesis, we decode task categories while generating synthetic data samples with the well-known synthetic data generators CTGAN and its variants CopulaGAN and Gretel AI, applied to data from an in-person user study. Our results show that augmenting the real eye movement data with additional synthetically generated samples improves classification accuracy, even with traditional machine learning algorithms. We observe a significant improvement in task decoding accuracy, from 28.1% using Random Forest to 82% using InceptionTime, when five times as much synthetic data is added to the 320 real eye movement samples. Owing to this additional synthetic data, our proposed framework outperforms all prior studies on this dataset. We validate this claim with various algorithms and combinations of real and synthetic data, showing how decoding accuracy increases as more generated data is added to the real data.
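The augment-then-classify pipeline the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the GAN-based generators (CTGAN, CopulaGAN, Gretel AI) are replaced here by a crude Gaussian-jitter generator, `make_classification` stands in for the real 320-sample eye-movement feature table, and Random Forest stands in for the classifiers tested; all names and parameters below are assumptions for the sketch.

```python
# Sketch of the augmentation pipeline: train a classifier on real samples
# plus 5x synthetic samples. The GAN synthesizers from the paper are
# replaced by a simple Gaussian-jitter generator (hypothetical stand-in).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder "real" dataset, mimicking the 320-sample eye-movement table.
X, y = make_classification(n_samples=320, n_features=20, n_informative=8,
                           n_classes=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

def jitter_augment(X, y, factor=5, scale=0.05):
    """Produce `factor`x synthetic samples by adding small Gaussian noise,
    a crude proxy for GAN-based tabular synthesis."""
    X_syn = np.repeat(X, factor, axis=0)
    y_syn = np.repeat(y, factor)
    X_syn = X_syn + rng.normal(0.0, scale * X.std(axis=0), size=X_syn.shape)
    return X_syn, y_syn

# Combine real training data with five times as many synthetic samples.
X_syn, y_syn = jitter_augment(X_train, y_train, factor=5)
X_aug = np.vstack([X_train, X_syn])
y_aug = np.concatenate([y_train, y_syn])

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_aug, y_aug)
acc = accuracy_score(y_test, clf.predict(X_test))
print(f"test accuracy with 5x augmentation: {acc:.3f}")
```

A real reproduction would swap `jitter_augment` for a fitted tabular synthesizer (e.g., CTGAN) and evaluate on held-out real samples only, as done here.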
Problem

Research questions and friction points this paper is trying to address.

Decoding task categories from eye movement data
Improving classification accuracy with synthetic data augmentation
Validating Yarbus' hypothesis on task inference from eye movements
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthetic data augmentation using CTGAN
Combining real and synthetic eye movement data
Improving classification accuracy with data generation
Shanmuka Sadhu, Rutgers University
Arca Baran, New Jersey Institute of Technology
Preeti Pandey
Ayush Kumar, University of Manitoba