Fourier Features Let Agents Learn High Precision Policies with Imitation Learning

📅 2026-06-10

🤖 AI Summary

This work addresses two limitations of vision-based robotic manipulation policies: RGB-only policies suffer from depth ambiguity and perspective scaling, while point cloud policies, despite their stronger geometric prior, perform inconsistently across tasks. The authors hypothesize that this inconsistency stems from the spectral bias of neural networks toward low-frequency functions, and propose mapping point clouds from Cartesian space into a high-dimensional Fourier space so that the point cloud encoder of an imitation learning policy has direct access to high-frequency geometric features. Evaluated on the simulated RoboCasa and ManiSkill3 benchmarks and on a real robot setup, the method significantly improves policy performance, works across diverse encoder architectures, and is robust to hyperparameter choices.

📝 Abstract

High-precision robotic manipulation requires fine-grained spatial reasoning that is often difficult to achieve with RGB-only policies due to depth ambiguity and perspective scale issues. Policies that leverage 3D information directly, such as those based on point clouds, offer a stronger geometric prior than purely image-based ones, yet their performance remains highly task-dependent. We hypothesize that this discrepancy may be due to the spectral bias of neural networks towards learning low-frequency functions, which especially affects architectures conditioned on slow-moving Cartesian features. We thus propose to map point clouds from Cartesian space into high-dimensional Fourier space, effectively equipping the point cloud encoder with direct access to high-frequency features. We experimentally validate the use of Fourier features on challenging manipulation tasks from the RoboCasa and ManiSkill3 benchmarks and on a real robot setup. Despite their simplicity, we find that Fourier features provide significant benefits across diverse encoder architectures and benchmarks and are robust across hyperparameters. Our results indicate that Fourier features let policies leverage geometric details more effectively than Cartesian features, showing their potential as a general-purpose tool for point cloud-based imitation learning. We provide source code and videos on our project page: https://fourier-il.github.io/fourier-il
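To make the core idea concrete, here is a minimal sketch of lifting a point cloud from Cartesian coordinates into Fourier features before it reaches an encoder. The abstract does not specify the exact encoding, so everything here is an assumption for illustration: the function name `fourier_features`, the octave-spaced frequency schedule (as commonly used in positional encodings), the `num_bands` parameter, and the choice to keep the raw coordinates alongside the sinusoids. The authors' implementation may differ.

```python
import numpy as np

def fourier_features(points, num_bands=6, include_input=True):
    """Map (N, 3) Cartesian points to sin/cos features.

    Assumed frequency schedule (not taken from the paper):
    pi * 2^0, pi * 2^1, ..., pi * 2^(num_bands - 1).
    """
    points = np.asarray(points, dtype=np.float64)
    freqs = np.pi * (2.0 ** np.arange(num_bands))        # (B,)
    # Scale every coordinate by every frequency.
    angles = points[:, :, None] * freqs[None, None, :]   # (N, 3, B)
    # Per coordinate: B sine features followed by B cosine features.
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)  # (N, 3, 2B)
    feats = feats.reshape(points.shape[0], -1)           # (N, 6B)
    if include_input:
        # Optionally keep the original low-frequency Cartesian coordinates.
        feats = np.concatenate([points, feats], axis=-1) # (N, 3 + 6B)
    return feats

# Example: a random cloud of 1024 points, normalized to [-1, 1]^3.
rng = np.random.default_rng(0)
cloud = rng.uniform(-1.0, 1.0, size=(1024, 3))
encoded = fourier_features(cloud, num_bands=6)
print(encoded.shape)  # (1024, 39)
```

The encoded array would then replace the raw `(N, 3)` coordinates as the per-point input to a point cloud encoder. With higher bands, two points that are only millimeters apart produce clearly different feature vectors, which is the intuition behind giving the encoder direct access to high-frequency information.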

Problem

Research questions and friction points this paper is trying to address.

high-precision manipulation

imitation learning

point clouds

spectral bias

spatial reasoning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Fourier features

point cloud

imitation learning