MEGC2026: Micro-Expression Grand Challenge on Visual Question Answering

📅 2026-03-09
🤖 AI Summary
This work addresses the understanding of facial micro-expressions in high-stakes scenarios where emotions are suppressed, introducing two novel tasks: micro-expression video question answering (ME-VQA) and micro-expression long-video question answering (ME-LVQA). It pioneers the application of the visual question answering (VQA) paradigm to micro-expression analysis and extends it to long-duration videos. By integrating multimodal large language models (MLLMs) with temporal modeling techniques, the approach enables fine-grained semantic reasoning about micro-expressions and their cross-temporal associations. The challenge also establishes the first public benchmark platform and leaderboard dedicated to semantic question answering on micro-expressions, moving the field beyond basic recognition toward deeper semantic comprehension.
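As a minimal sketch of the temporal-modeling step mentioned above (this is an illustration, not the challenge's official pipeline): long-video QA systems typically cannot feed every frame to an MLLM, so they sample a fixed frame budget that covers the whole clip while sampling more densely around a candidate micro-expression interval. The function below, including its name and parameters, is a hypothetical helper written for illustration.

```python
def sample_frame_indices(num_frames, budget=16, focus=None, focus_ratio=0.5):
    """Pick up to `budget` frame indices from a video of `num_frames` frames.

    If `focus` is a (start, end) interval (e.g. a spotted micro-expression
    candidate), spend `focus_ratio` of the budget inside that interval and
    spread the remainder uniformly over the full video; otherwise sample
    uniformly. Returns a sorted, de-duplicated list of frame indices.
    """
    def uniform(start, end, k):
        # k indices evenly spaced over [start, end), taken at bin centers.
        if k <= 0 or end <= start:
            return []
        step = (end - start) / k
        return [min(end - 1, int(start + step * (i + 0.5))) for i in range(k)]

    if focus is None:
        return uniform(0, num_frames, budget)
    dense = int(budget * focus_ratio)
    idx = uniform(focus[0], focus[1], dense) + uniform(0, num_frames, budget - dense)
    return sorted(set(idx))
```

For example, with a 1000-frame clip and a spotted candidate around frames 400-430, half of a 16-frame budget lands inside the candidate interval while the rest preserves global context for cross-temporal reasoning.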

📝 Abstract
Facial micro-expressions (MEs) are involuntary movements of the face that occur spontaneously when a person experiences an emotion but attempts to suppress or repress the facial expression, typically found in a high-stakes environment. In recent years, substantial advancements have been made in the areas of ME recognition, spotting, and generation. The emergence of multimodal large language models (MLLMs) and large vision-language models (LVLMs) offers promising new avenues for enhancing ME analysis through their powerful multimodal reasoning capabilities. The ME grand challenge (MEGC) 2026 introduces two tasks that reflect these evolving research directions: (1) ME video question answering (ME-VQA), which explores ME understanding through visual question answering on relatively short video sequences, leveraging MLLMs or LVLMs to address diverse question types related to MEs; and (2) ME long-video question answering (ME-LVQA), which extends VQA to long-duration video sequences in realistic settings, requiring models to handle temporal reasoning and subtle micro-expression detection across extended time periods. All participating algorithms are required to submit their results on a public leaderboard. More details are available at https://megc2026.github.io.
Problem

Research questions and friction points this paper is trying to address.

Micro-Expression
Visual Question Answering
Long Video
Temporal Reasoning
Multimodal Reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

micro-expression
visual question answering
multimodal large language models
temporal reasoning
long-video analysis
Xinqi Fan
Department of Computing and Mathematics, Manchester Metropolitan University
Jingting Li
State Key Laboratory of Cognitive Science and Mental Health, Institute of Psychology, CAS & Department of Psychology, University of the Chinese Academy of Sciences
John See
Professor, Heriot-Watt University Malaysia
Computer Vision, Image Processing, Multimedia, Artificial Intelligence, Affective Computing
Moi Hoon Yap
Professor of Image and Vision Computing, Manchester Metropolitan University
Face and Gesture Analysis, Medical Image Analysis, Computer Vision and Deep Learning
Su-Jing Wang
State Key Laboratory of Cognitive Science and Mental Health, Institute of Psychology, CAS & Department of Psychology, University of the Chinese Academy of Sciences
Adrian K. Davison
Senior Lecturer, Manchester Metropolitan University
Computer Vision, Micro-facial expressions, facial expression analysis, medical image analysis, deep