DinoCompanion: An Attachment-Theory Informed Multimodal Robot for Emotionally Responsive Child-AI Interaction

📅 2025-06-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current AI companions lack grounding in developmental psychology, which limits their capacity to provide the attachment-based emotional support that children aged 3–8 need. To address the lack of developmentally appropriate affective architectures, the trade-off between safety and engagement, and the absence of attachment-oriented evaluation, this work introduces DinoCompanion, which the authors describe as the first attachment-theory-grounded multimodal companion robot for children. The contributions are: (1) CARPO (Child-Aware Risk-calibrated Preference Optimization), a training objective that maximizes engagement while applying risk penalties weighted by epistemic uncertainty; (2) AttachSecure-Bench, an attachment-competence benchmark covering ten competencies (Cohen's κ = 0.81); and (3) a multimodal dataset of 128 caregiver-child dyads comprising 125,382 annotated clips with paired preference-risk labels. The system combines multimodal perception (vision, speech, and behavior), hierarchical memory, and epistemic-uncertainty-aware risk modeling. In experiments, DinoCompanion reaches a composite attachment-competence score of 57.15%, ahead of GPT-4o (50.29%) and Claude-3.7-Sonnet (53.43%), with secure-base behavior accuracy of 72.99% (human experts: 78.4%) and attachment risk detection accuracy of 69.73%.

📝 Abstract
Children's emotional development fundamentally relies on secure attachment relationships, yet current AI companions lack the theoretical foundation to provide developmentally appropriate emotional support. We introduce DinoCompanion, the first attachment-theory-grounded multimodal robot for emotionally responsive child-AI interaction. We address three critical challenges in child-AI systems: the absence of developmentally-informed AI architectures, the need to balance engagement with safety, and the lack of standardized evaluation frameworks for attachment-based capabilities. Our contributions include: (i) a multimodal dataset of 128 caregiver-child dyads containing 125,382 annotated clips with paired preference-risk labels, (ii) CARPO (Child-Aware Risk-calibrated Preference Optimization), a novel training objective that maximizes engagement while applying epistemic-uncertainty-weighted risk penalties, and (iii) AttachSecure-Bench, a comprehensive evaluation benchmark covering ten attachment-centric competencies with strong expert consensus (κ = 0.81). DinoCompanion achieves state-of-the-art performance (57.15%), outperforming GPT-4o (50.29%) and Claude-3.7-Sonnet (53.43%), with exceptional secure base behaviors (72.99%, approaching human expert levels of 78.4%) and superior attachment risk detection (69.73%). Ablations validate the critical importance of multimodal fusion, uncertainty-aware risk modeling, and hierarchical memory for coherent, emotionally attuned interactions.
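The agreement statistic quoted for AttachSecure-Bench can be made concrete with a small worked example. The sketch below computes Cohen's κ for two raters, assuming the two-rater form named in the AI summary; the abstract does not say how many experts rated each item or how labels were aggregated, so the function name `cohens_kappa` and the toy labels are purely illustrative and are not taken from the paper.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each rater's
    marginal label frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item set")

    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of marginal frequencies, summed over labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if p_e == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical toy labels ("s" = secure response, "i" = insecure response).
expert_1 = ["s", "s", "i", "i"]
expert_2 = ["s", "s", "i", "s"]
kappa = cohens_kappa(expert_1, expert_2)
```

Here the raters agree on 3 of 4 items (p_o = 0.75) and chance agreement is p_e = 0.5, so κ = 0.5. A value of 0.81, as reported, indicates agreement well beyond chance.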
Problem

Research questions and friction points this paper is trying to address.

Lack of developmentally-informed AI architectures for child emotional support
Balancing engagement with safety in child-AI interaction systems
Absence of standardized evaluation frameworks for attachment-based AI capabilities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal dataset with annotated caregiver-child clips
CARPO optimizes engagement with risk penalties
AttachSecure-Bench evaluates attachment competencies
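The general shape of a risk-calibrated preference objective can be sketched as follows. This is not the CARPO formula, which this listing does not give. It is a minimal illustration that assumes a DPO-style pairwise preference term plus a risk penalty scaled by an epistemic-uncertainty estimate. The function `risk_calibrated_preference_loss`, the parameters `beta` and `lam`, and the `1 + uncertainty` weighting are all hypothetical choices made for this example.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def risk_calibrated_preference_loss(
    logp_chosen,        # policy log-prob of the preferred response
    logp_rejected,      # policy log-prob of the dispreferred response
    ref_logp_chosen,    # reference-model log-prob of the preferred response
    ref_logp_rejected,  # reference-model log-prob of the dispreferred response
    risk_prob,          # estimated probability that the chosen response is risky
    epistemic_unc,      # non-negative uncertainty of that risk estimate
    beta=0.1,           # ASSUMED preference temperature (as in DPO)
    lam=1.0,            # ASSUMED trade-off between engagement and safety
):
    # Pairwise preference term: -log sigmoid(margin), written via softplus.
    margin = beta * (
        (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    )
    preference_loss = softplus(-margin)

    # ASSUMPTION: uncertain risk estimates are treated conservatively, so the
    # penalty grows with epistemic uncertainty. The paper may weight differently.
    risk_penalty = lam * (1.0 + epistemic_unc) * risk_prob

    return preference_loss + risk_penalty


# Hypothetical values: identical preference margin, different risk estimates.
safe_loss = risk_calibrated_preference_loss(-1.0, -2.0, -1.5, -1.5, 0.0, 0.0)
risky_loss = risk_calibrated_preference_loss(-1.0, -2.0, -1.5, -1.5, 0.5, 1.0)
```

With these toy inputs the two calls share the same preference term, so `risky_loss` exceeds `safe_loss` by exactly the penalty `lam * (1 + 1.0) * 0.5 = 1.0`. That shows how a single scalar can trade engagement against safety.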
Boyang Wang
Beihang University
Yuhao Song
The University of Melbourne
Jinyuan Cao
Independent Researcher
Peng Yu
Panasonic Appliances (China) Co., Ltd.
Hongcheng Guo
School of Data Science, Fudan University
LLMs, Multimodal LLMs
Zhoujun Li
Beihang University
Artificial Intelligence, Natural Language Processing, Network Security