🤖 AI Summary
This work addresses the privacy risks that arise when user behavioral data leaks during robot personalization fine-tuning. We propose PRoP (Privacy-preserving Robot Personalization), a framework built around a user-specific, key-driven weight transformation: the model activates its personalized policy only upon receiving the correct key; otherwise, it reverts to a generic baseline policy -- providing model-agnostic privacy protection without modifying the underlying architecture. The mechanism is compatible with diverse policy models, including those used in imitation learning, reinforcement learning, and classification tasks. Experiments across multiple benchmark tasks demonstrate that PRoP significantly outperforms existing encoder-based privacy-preserving approaches. It maintains high personalization performance while effectively preventing external adversaries from reverse-engineering sensitive user information -- such as preferences and behavioral habits -- thereby ensuring robust privacy guarantees without compromising functionality.
📝 Abstract
Recent works introduce general-purpose robot policies. These policies provide a strong prior over how robots should behave -- e.g., how a robot arm should manipulate food items. But in order for robots to match an individual person's needs, users typically fine-tune these generalized policies -- e.g., showing the robot arm how to make their own preferred dinners. Importantly, during the process of personalizing robots, end-users leak data about their preferences, habits, and styles (e.g., the foods they prefer to eat). Other agents can simply roll out the fine-tuned policy and observe these personally trained behaviors. This leads to a fundamental challenge: how can we develop robots that personalize actions while keeping learning private from external agents? Here we explore this emerging topic in human-robot interaction and develop PRoP, a model-agnostic framework for personalized and private robot policies. Our core idea is to equip each user with a unique key; this key is then used to mathematically transform the weights of the robot's network. With the correct key, the robot's policy switches to match that user's preferences -- but with incorrect keys, the robot reverts to its baseline behaviors. We show the general applicability of our method across multiple model types in imitation learning, reinforcement learning, and classification tasks. PRoP is practically advantageous because it retains the architecture and behaviors of the original policy, and experimentally outperforms existing encoder-based approaches. See videos and code here: https://prop-icra26.github.io.
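To make the key-gated behavior concrete, below is a minimal toy sketch of the *interface* the abstract describes: personalized weights are stored in key-scrambled form, the correct key recovers them, and any other key leaves the robot on its baseline policy. This is not the paper's actual transformation (the abstract does not specify the math); the row-permutation scrambling, the hash-based key check, and the names `lock_weights` / `load_policy` are all illustrative assumptions.

```python
import hashlib

import numpy as np


def _perm(key: str, n: int) -> np.ndarray:
    """Deterministic permutation of n rows derived from the user's key."""
    seed = int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")
    return np.random.default_rng(seed).permutation(n)


def lock_weights(w_personal: np.ndarray, key: str):
    """Store a personalized layer in key-scrambled form, plus a key tag.

    The permutation stands in for PRoP's weight transformation (assumption);
    the tag lets load_policy detect a wrong key and fall back to baseline.
    """
    perm = _perm(key, w_personal.shape[0])
    tag = hashlib.sha256(key.encode()).hexdigest()
    return w_personal[perm], tag


def load_policy(w_base: np.ndarray, w_locked: np.ndarray,
                tag: str, key: str) -> np.ndarray:
    """Return personalized weights for the correct key, baseline otherwise."""
    if hashlib.sha256(key.encode()).hexdigest() != tag:
        return w_base                         # wrong key: generic behavior
    inv = np.argsort(_perm(key, w_locked.shape[0]))
    return w_locked[inv]                      # correct key: unscramble


# Toy usage: one linear layer of a policy network.
rng = np.random.default_rng(0)
w_base = rng.normal(size=(4, 3))              # generic pretrained weights
w_personal = w_base + 0.1 * rng.normal(size=(4, 3))  # fine-tuned weights
w_locked, tag = lock_weights(w_personal, "alice-secret")

correct = load_policy(w_base, w_locked, tag, "alice-secret")
wrong = load_policy(w_base, w_locked, tag, "not-the-key")
```

With the correct key the loaded weights match the personalized fine-tune exactly; with any other key an adversary rolling out the policy only ever sees baseline behavior, mirroring the property the abstract claims for PRoP.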