IRIS: Learning-Driven Task-Specific Cinema Robot Arm for Visuomotor Motion Control

📅 2026-02-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a lightweight, task-specific six-degree-of-freedom (6-DOF) cinematographic robot arm, aimed at the high cost and operational complexity that limit adoption of industrial-grade film robots. It combines a fully 3D-printed mechanical structure with a goal-conditioned visuomotor imitation learning framework based on Action Chunking with Transformers (ACT), so camera motion is learned from human demonstrations rather than specified through explicit geometric programming. The complete platform costs under $1,000 USD, supports a 1.5 kg payload, and achieves approximately 1 mm repeatability. In real-world experiments it tracks trajectories accurately, executes them autonomously, and generalizes across diverse cinematic motions.

📝 Abstract
Robotic camera systems enable dynamic, repeatable motion beyond human capabilities, yet their adoption remains limited by the high cost and operational complexity of industrial-grade platforms. We present the Intelligent Robotic Imaging System (IRIS), a task-specific 6-DOF manipulator designed for autonomous, learning-driven cinematic motion control. IRIS integrates a lightweight, fully 3D-printed hardware design with a goal-conditioned visuomotor imitation learning framework based on Action Chunking with Transformers (ACT). The system learns object-aware and perceptually smooth camera trajectories directly from human demonstrations, eliminating the need for explicit geometric programming. The complete platform costs under $1,000 USD, supports a 1.5 kg payload, and achieves approximately 1 mm repeatability. Real-world experiments demonstrate accurate trajectory tracking, reliable autonomous execution, and generalization across diverse cinematic motions.
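To make the action-chunking idea concrete, here is a minimal sketch of chunked execution with temporal ensembling as described in the original ACT work: at every step the policy predicts a short chunk of future actions, and the action actually executed is an exponentially weighted average of all predictions made so far for that timestep, with the oldest prediction weighted most. The abstract does not say whether IRIS uses temporal ensembling, so treat this as an illustration of the general technique only. The names `temporal_ensemble`, `run_chunked_policy`, and `policy`, along with the chunk size and the constant `m`, are hypothetical and not taken from the paper.

```python
import math


def temporal_ensemble(predictions, m=0.1):
    """Blend all predictions made for one timestep.

    predictions: list of actions (each a list of floats), oldest first.
    Weights follow w_i = exp(-m * i), so the oldest prediction (i = 0)
    gets the largest weight. m is an assumed smoothing constant.
    """
    weights = [math.exp(-m * i) for i in range(len(predictions))]
    total = sum(weights)
    dim = len(predictions[0])
    return [
        sum(w * p[d] for w, p in zip(weights, predictions)) / total
        for d in range(dim)
    ]


def run_chunked_policy(policy, observations, chunk_size, m=0.1):
    """Query a chunk-predicting policy at every step and execute ensembled actions.

    policy(obs) is a hypothetical callable returning a list of at least
    chunk_size future actions, starting with the current timestep.
    """
    pending = {}   # timestep -> predictions for that timestep, oldest first
    executed = []
    for t, obs in enumerate(observations):
        chunk = policy(obs)
        for offset, action in enumerate(chunk[:chunk_size]):
            pending.setdefault(t + offset, []).append(action)
        # Every chunk predicted so far that covers timestep t contributes.
        executed.append(temporal_ensemble(pending.pop(t), m))
    return executed
```

Larger values of `m` lean harder on the earliest prediction for each timestep, while `m = 0` gives a plain average; either way, overlapping chunks smooth out step-to-step jitter, which is relevant when the output drives a camera.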
Problem

Research questions and friction points this paper is trying to address.

robotic camera systems
high cost
operational complexity
cinematic motion control
visuomotor control
Innovation

Methods, ideas, or system contributions that make the work stand out.

visuomotor imitation learning
Action Chunking with Transformers
task-specific robot arm
cinematic motion control
low-cost robotic platform
Qilong Cheng
Mechanical and Aerospace Eng. Department, New York University
Matthew Mackay
MIE Department, University of Toronto
Ali Bereyhi
University of Toronto
Statistical Learning, Information Theory, Signal Processing, Wireless Communications, Statistical Mechanics