Imitation Learning for Active Neck Motion Enabling Robot Manipulation beyond the Field of View

📅 2025-06-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses robotic manipulation failures caused by limited field-of-view (FoV). We propose an imitation learning framework integrating active neck motion to dynamically regulate visual perception. Methodologically, we design a systematic teleoperation data collection protocol that synchronously captures articulated neck pose and multi-view visual inputs, and introduce a novel neural architecture that explicitly models the dynamic coupling between neck articulation and hand-eye coordination. Our key contribution is the first integration of active visual control into an end-to-end imitation learning pipeline—overcoming the constraints of fixed-camera setups and enabling continuous perception and manipulation of objects outside the initial FoV. Experiments demonstrate a 90% task success rate under dynamic viewpoint perturbations, significantly outperforming fixed-FoV baselines. The approach exhibits superior robustness and generalization in edge-of-view and occluded scenarios.

📝 Abstract
Most prior research in deep imitation learning has relied on fixed cameras for image input, which constrains task performance to a predefined field of view. However, enabling a robot to actively move its neck can significantly expand the scope of imitation learning to a wider variety of tasks and expressive actions such as neck gestures. To facilitate imitation learning in robots that move their neck while simultaneously manipulating objects, we propose a teaching system that systematically collects datasets incorporating neck movements while minimizing the discomfort caused by dynamic viewpoints during teleoperation. In addition, we present a novel network model for learning manipulation tasks that include active neck motion. Experimental results showed that our model achieves a high success rate of around 90% despite the viewpoint variations introduced by active neck motion. Moreover, the proposed model proved particularly effective in challenging scenarios, such as when objects were situated at the periphery of or beyond the standard field of view, where traditional models struggled. The proposed approach improves the efficiency of dataset collection and extends the applicability of imitation learning to more complex and dynamic scenarios.
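The paper itself does not include code. As a rough illustration of the architectural idea described in the abstract, a policy that couples visual input with the neck's proprioceptive state and emits both arm and neck commands could be sketched as below. All names, dimensions, and the tiny MLP itself are hypothetical placeholders, not the authors' actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class NeckAwarePolicy:
    """Toy sketch: image features + neck joint angles -> arm and neck commands.

    Hypothetical stand-in for the paper's network; the real model is an
    end-to-end imitation-learning architecture, not this two-layer MLP.
    """

    def __init__(self, img_dim=64, neck_dim=2, hidden=32, arm_dim=7):
        in_dim = img_dim + neck_dim
        self.W1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (hidden, arm_dim + neck_dim))
        self.b2 = np.zeros(arm_dim + neck_dim)
        self.arm_dim = arm_dim

    def act(self, img_feat, neck_pose):
        # Concatenate vision with the proprioceptive neck state, reflecting
        # the coupling between neck articulation and hand-eye coordination.
        x = np.concatenate([img_feat, neck_pose])
        h = relu(x @ self.W1 + self.b1)
        out = h @ self.W2 + self.b2
        # The head predicts the arm action and the next neck motion jointly,
        # so the learned policy can reorient the camera view on its own.
        return out[:self.arm_dim], out[self.arm_dim:]

policy = NeckAwarePolicy()
arm_cmd, neck_cmd = policy.act(rng.normal(size=64), np.array([0.1, -0.2]))
```

The key design point mirrored here is that the neck command is an output of the same policy rather than a fixed camera pose, which is what lets the learned controller track objects that leave the initial field of view.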
Problem

Research questions and friction points this paper is trying to address.

Expands imitation learning beyond fixed camera views
Enables robot manipulation with active neck motion
Improves performance in dynamic and peripheral scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Teaching system for neck movement dataset collection
Novel network model for active neck motion
High success rate despite viewpoint variations
Koki Nakagawa
Laboratory for Intelligent Systems and Informatics, Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, Japan
Yoshiyuki Ohmura
Laboratory for Intelligent Systems and Informatics, Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, Japan
Yasuo Kuniyoshi
School of Information Science and Technology, The University of Tokyo
Intelligent Systems · Embodied Artificial Intelligence · Developmental Cognitive Neuroscience · Complex Emergent Systems · Humanoid