Autonomous Aerial Manipulation via Contextual Contrastive Meta Reinforcement Learning

📅 2026-06-07
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the challenge of enabling drones to autonomously grasp, transport, and deliver diverse payloads without pre-installed fixtures or human intervention. To this end, the authors propose the Aco² framework, which uses contextual contrastive meta-reinforcement learning for end-to-end aerial manipulation. The approach introduces a context observation encoder that implicitly extracts dynamic payload characteristics through a contrastive learning objective, allowing the policy to generalize to unseen payloads without explicit system identification. By combining meta-reinforcement learning, contrastive representation learning, and domain randomization, the policy is trained entirely in simulation and deployed directly on a real quadrotor platform. Experimental results demonstrate fully autonomous pick-and-place operations on a variety of handle-equipped objects, supporting the framework's robustness and transferability.
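As a rough illustration of the domain randomization mentioned above, the sketch below samples a fresh set of payload parameters for each simulated episode. The parameter names (`mass_kg`, `handle_height_m`, `drag_coeff`) and their ranges are hypothetical placeholders chosen for illustration; the summary does not state which quantities the authors randomize or over what ranges.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PayloadParams:
    """Hypothetical per-episode payload properties (not taken from the paper)."""
    mass_kg: float
    handle_height_m: float
    drag_coeff: float


# Assumed sampling ranges, for illustration only.
RANGES = {
    "mass_kg": (0.05, 0.50),
    "handle_height_m": (0.05, 0.20),
    "drag_coeff": (0.0, 0.3),
}


def sample_payload(rng: random.Random) -> PayloadParams:
    """Draw each parameter uniformly from its range for one training episode."""
    return PayloadParams(**{name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()})


# A seeded generator keeps the sequence of simulated payloads reproducible.
rng = random.Random(0)
episodes = [sample_payload(rng) for _ in range(4)]
```

Because the policy never observes these parameters directly, it has to infer their effect from interaction, which is the role the summary assigns to the context encoder.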
📝 Abstract
Unmanned aerial vehicles (UAVs) are increasingly deployed in logistics, service robotics, and other real-world applications, creating a growing demand for autonomous payload acquisition and delivery. Existing approaches typically assume pre-attached payloads or rely on specialized grippers, leaving versatile end-to-end aerial delivery largely unresolved. The difficulty is that different payloads induce highly variable flight dynamics, so a single policy must adapt online without manual calibration or explicit system identification. To this end, we study Autonomous Aerial Manipulation via Contextual Contrastive Meta Reinforcement Learning (Aco2), a fully autonomous aerial delivery setting in which a quadrotor equipped with a lightweight hook continuously picks up, transports, and delivers diverse handle-equipped objects between randomized locations, all without human intervention. First, we design a contextual observation encoder that infers a compact latent context from recent interaction history, enabling the policy to adapt online to payload-dependent dynamics. To further improve the quality of this context, we introduce a contrastive objective that structures the context embedding around task-relevant variations, improving generalization across diverse payloads without requiring explicit system identification. Trained entirely in simulation with extensive domain randomization, Aco2 can be directly deployed on a physical quadrotor without real-world fine-tuning.
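The abstract does not specify the form of the contrastive objective. One common choice for structuring an embedding space is the InfoNCE loss, sketched below under the assumption that two context embeddings computed from the same payload form a positive pair, while embeddings of other payloads in the batch serve as negatives. The function names, the temperature value, and the pairing scheme are illustrative assumptions, not the authors' implementation.

```python
import math


def cosine(u, v, eps=1e-12):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    anchors[i] and positives[i] are assumed to be embeddings of the same
    payload; positives[j] with j != i act as negatives for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of picking the matching positive.
        total += log_denom - logits[i]
    return total / len(anchors)


# Toy 2-D embeddings: each anchor is close to its own positive.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
aligned_loss = info_nce(anchors, positives)
shuffled_loss = info_nce(anchors, positives[::-1])
```

With these toy inputs the loss is small when each anchor is paired with its own payload and larger when the pairs are swapped. That is the property a contrastive objective exploits to pull together contexts from the same payload and push apart contexts from different ones.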
Problem

Research questions and friction points this paper is trying to address.

Autonomous Aerial Manipulation
Payload Adaptation
End-to-End Aerial Delivery
Variable Flight Dynamics
Online Adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Contextual Contrastive Learning
Meta Reinforcement Learning
Autonomous Aerial Manipulation
Online Adaptation
Domain Randomization
Lixuan Jin
National Key Laboratory of Novel Software Technology, Nanjing University, Nanjing, China; School of Artificial Intelligence, Nanjing University, Nanjing, China
Bingxuan Lan
National Key Laboratory of Novel Software Technology, Nanjing University, Nanjing, China; School of Artificial Intelligence, Nanjing University, Nanjing, China
Xinyi Bao
National Key Laboratory of Novel Software Technology, Nanjing University, Nanjing, China; School of Artificial Intelligence, Nanjing University, Nanjing, China
Xiangyuan Xie
Faculty of Robot Science and Engineering, Northeastern University, China
Chunjie Zhang
Beijing Jiaotong University
multimedia, computer vision
Zheng Chen
National Key Laboratory of Novel Software Technology, Nanjing University, Nanjing, China; School of Artificial Intelligence, Nanjing University, Nanjing, China
Tianshuo Liu
National Key Laboratory of Novel Software Technology, Nanjing University, Nanjing, China; School of Artificial Intelligence, Nanjing University, Nanjing, China
Ruijie Tian
School of Computer and Artificial Intelligence, Liaoning Normal University
Big data management, Distributed processing, Query optimization with machine learning
Jinyu Ru
Faculty of Robot Science and Engineering, Northeastern University, China
Gang Wang
Beijing Institute of Technology
Distributed learning, non-convex optimization, reinforcement learning, data-driven control
Lei Yuan
Nanjing University
Machine Learning, Reinforcement Learning, Multi-Agent Systems, Embodied AI
Yang Yu
Professor, Nanjing University
Artificial Intelligence, Reinforcement Learning, Evolutionary Algorithms