🤖 AI Summary
Existing research platforms predominantly focus on simplified tasks or single-perspective analysis, limiting their capacity to support interdisciplinary empirical studies of complex human-AI collaborative decision-making. To address this gap, we introduce CREW, an open-source platform with a modular architecture designed for ecologically valid, real-time collaborative scenarios. CREW integrates cognitive experimental paradigms, multimodal human physiological signal recording, benchmarks for real-time human-guided reinforcement learning built on state-of-the-art algorithms and well-tuned baselines, and an extensible task framework, bringing human behavioral analysis and AI algorithm evaluation into a single platform. Using CREW, we conducted 50 human subject studies within one week to verify the effectiveness of the benchmark. CREW thus offers a unified experimental foundation for both foundational research and applied development in human-AI collaborative decision-making.
📝 Abstract
With the increasing deployment of artificial intelligence (AI) technologies, the potential for humans to work alongside AI agents has been growing rapidly. Human-AI teaming is an important paradigm for studying how humans and AI agents work together. What sets Human-AI teaming research apart is the need to study humans and AI agents jointly, which demands multidisciplinary effort spanning machine learning, human-computer interaction, robotics, cognitive science, neuroscience, psychology, social science, and complex systems. However, existing platforms for Human-AI teaming research are limited: they often support only oversimplified scenarios or a single task, or focus specifically on either human-teaming research or multi-agent AI algorithms. We introduce CREW, a platform that facilitates Human-AI teaming research in real-time decision-making scenarios and fosters collaboration across multiple scientific disciplines, with a strong emphasis on human involvement. It includes pre-built tasks for cognitive studies and Human-AI teaming, and its modular design makes it readily extensible. In keeping with conventional cognitive neuroscience research, CREW also supports multimodal recording of human physiological signals for behavior analysis. Moreover, CREW benchmarks real-time human-guided reinforcement learning agents using state-of-the-art algorithms and well-tuned baselines. With CREW, we were able to conduct 50 human subject studies within a week to verify the effectiveness of our benchmark.