Beyond the Patch: Exploring Vulnerabilities of Visuomotor Policies via Viewpoint-Consistent 3D Adversarial Object

📅 2026-03-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the vulnerability of vision-based robotic manipulation policies to adversarial attacks under dynamic viewpoints. Conventional 2D adversarial patches lose effectiveness in this setting because of perspective distortion, particularly when the camera is wrist-mounted on a robotic arm. To overcome this limitation, the authors propose a viewpoint-consistent adversarial texture optimization method that optimizes the surface texture of a 3D object through differentiable rendering. The approach combines Expectation over Transformation (EOT) with a Coarse-to-Fine (C2F) frequency curriculum, saliency-guided perturbations, and a targeted loss that drives the robot toward the adversarial object, so that the attack holds across viewpoints and camera-object distances. Experiments show that the method is effective under various environmental conditions, transfers to black-box settings, and applies to real-world robotic systems, revealing vulnerabilities in visuomotor policies.

📝 Abstract

Neural network-based visuomotor policies enable robots to perform manipulation tasks but remain susceptible to perceptual attacks. For example, conventional 2D adversarial patches are effective under fixed-camera setups, where appearance is relatively consistent; however, their efficacy often diminishes under dynamic viewpoints from moving cameras, such as wrist-mounted setups, due to perspective distortions. To proactively investigate potential vulnerabilities beyond 2D patches, this work proposes a viewpoint-consistent adversarial texture optimization method for 3D objects through differentiable rendering. As optimization strategies, we employ Expectation over Transformation (EOT) with a Coarse-to-Fine (C2F) curriculum, exploiting distance-dependent frequency characteristics to induce textures effective across varying camera-object distances. We further integrate saliency-guided perturbations to redirect policy attention and design a targeted loss that persistently drives robots toward adversarial objects. Our comprehensive experiments show that the proposed method is effective under various environmental conditions, while confirming its black-box transferability and real-world applicability.
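
The combination of EOT and a coarse-to-fine curriculum described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the differentiable renderer is replaced by a hypothetical `block_mean` blur that stands in for camera-object distance, the policy is replaced by a fixed linear score (`weights`), and the stage schedule, step size, and batch size are all assumptions chosen for illustration. The sketch only shows the general pattern: average the gradient over sampled views (EOT), and restrict updates to low spatial frequencies first before refining finer detail (C2F).

```python
import numpy as np

def block_mean(img, b):
    """Replace every b x b block by its mean (toy stand-in for distance blur)."""
    h, w = img.shape
    blocks = img.reshape(h // b, b, w // b, b).mean(axis=(1, 3))
    return np.repeat(np.repeat(blocks, b, axis=0), b, axis=1)

def expected_score(texture, weights, distances):
    """Toy 'policy response' averaged over all simulated distances."""
    return float(np.mean([np.sum(weights * block_mean(texture, d))
                          for d in distances]))

def optimize_texture(weights, distances=(1, 2, 4, 8), stages=(8, 4, 2, 1),
                     steps_per_stage=50, batch=4, lr=0.05, seed=0):
    """Gradient ascent on the expected score with a coarse-to-fine schedule."""
    rng = np.random.default_rng(seed)
    texture = np.full(weights.shape, 0.5)  # start from a flat gray texture
    for b in stages:  # coarse-to-fine: large blocks first, single pixels last
        for _ in range(steps_per_stage):
            sampled = rng.choice(distances, size=batch)
            # EOT: average the gradient over randomly sampled views.
            # The score is linear and block averaging is symmetric, so the
            # gradient w.r.t. the texture for distance d is block_mean(weights, d).
            grad = np.mean([block_mean(weights, int(d)) for d in sampled], axis=0)
            # C2F: keep only the part of the update that is constant on b x b blocks.
            grad = block_mean(grad, b)
            texture = np.clip(texture + lr * grad, 0.0, 1.0)  # valid color range
    return texture

# Usage: compare the expected score before and after optimization.
distances = (1, 2, 4, 8)
weights = np.random.default_rng(1).normal(size=(32, 32))  # hypothetical policy
before = expected_score(np.full((32, 32), 0.5), weights, distances)
texture = optimize_texture(weights, distances=distances)
after = expected_score(texture, weights, distances)
print(f"expected score before: {before:.2f}, after: {after:.2f}")
```

In an actual attack of the kind the abstract describes, the blur would be replaced by differentiable rendering of the textured 3D object under sampled camera poses, and the linear score by a loss on the visuomotor policy's output; the sampling-and-averaging structure is the part this sketch is meant to convey.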

Problem

Research questions and friction points this paper is trying to address.

visuomotor policies

adversarial attacks

3D objects

dynamic viewpoints

perceptual vulnerabilities

Innovation

Methods, ideas, or system contributions that make the work stand out.

viewpoint-consistent

3D adversarial texture

differentiable rendering

visuomotor policy

Expectation over Transformation
