🤖 AI Summary
This study addresses the implicit recognition of user intent in visual search—specifically, distinguishing navigational gaze from goal-directed search—without requiring explicit user input, fixed trial durations, or subject-specific training data. We propose a cross-subject intent classification method that fuses electroencephalography (EEG) and eye-tracking signals. To support this, we construct the first publicly available multimodal dataset featuring self-paced search trials. We further design a machine learning model explicitly optimized for cross-subject generalization. Using leave-one-subject-out cross-validation, our approach achieves 84.5% classification accuracy in cross-subject evaluation—nearly matching intra-subject performance (85.5%). This represents a significant advance over prior methods, which rely heavily on ground-truth behavioral labels and rigid experimental paradigms. Our work enhances the practicality and scalability of brain–computer interfaces for naturalistic human–computer interaction.
📝 Abstract
For machines to effectively assist humans in challenging visual search tasks, they must differentiate whether a human is simply glancing into a scene (navigational intent) or searching for a target object (informational intent). Previous research proposed combining electroencephalography (EEG) and eye-tracking measurements to recognize such search intents implicitly, i.e., without explicit user input. However, the applicability of these approaches to real-world scenarios suffers from two key limitations. First, previous work used fixed search times in the informational intent condition -- a stark contrast to visual search, which naturally terminates when the target is found. Second, methods incorporating EEG measurements addressed prediction scenarios that require ground-truth training data from the target user, which is impractical in many use cases. We address these limitations by introducing the first publicly available EEG and eye-tracking dataset for navigational vs. informational intent recognition in which the user determines search times. We present the first method for cross-user prediction of search intents from EEG and eye-tracking recordings and reach 84.5% accuracy in leave-one-user-out evaluations -- comparable to within-user prediction accuracy (85.5%) but offering much greater flexibility.
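The leave-one-user-out protocol mentioned above can be sketched with scikit-learn's `LeaveOneGroupOut` splitter, which holds out all trials from one user per fold. The data, features, and classifier below are hypothetical stand-ins for illustration; the paper's actual model and feature set are not specified in this summary.

```python
# Sketch of leave-one-user-out (cross-subject) evaluation.
# X, y, and the classifier are hypothetical placeholders, not the paper's method.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)
n_users, trials_per_user, n_features = 5, 40, 16

# X: fused EEG + eye-tracking features, one row per self-paced trial (simulated)
X = rng.normal(size=(n_users * trials_per_user, n_features))
# y: 0 = navigational intent, 1 = informational intent (simulated labels)
y = rng.integers(0, 2, size=n_users * trials_per_user)
# groups: which user produced each trial -- the splitter holds out one user per fold
groups = np.repeat(np.arange(n_users), trials_per_user)

logo = LeaveOneGroupOut()
accuracies = []
for train_idx, test_idx in logo.split(X, y, groups):
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X[train_idx], y[train_idx])          # train on all other users
    accuracies.append(clf.score(X[test_idx], y[test_idx]))  # test on held-out user

print(f"mean leave-one-user-out accuracy: {np.mean(accuracies):.3f}")
```

Because no trial from the held-out user appears in training, each fold's score estimates how well the classifier generalizes to an entirely new user, which is the setting the 84.5% figure refers to.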