A Review of Reward Functions for Reinforcement Learning in the context of Autonomous Driving

📅 2024-04-12

🏛️ 2024 IEEE Intelligent Vehicles Symposium (IV)

📈 Citations: 8

✨ Influential: 1

🤖 AI Summary

Reward function design for reinforcement learning (RL) in autonomous driving faces several challenges: conflicting objectives (safety, comfort, progress, and traffic-rule compliance), insufficient awareness of driving context, the lack of a principled way to aggregate objectives, and inadequate standardization. Method: The paper reviews reward formulations proposed in the literature and divides individual objectives into four categories: Safety, Comfort, Progress, and Traffic Rules compliance. It then discusses the limitations of the reviewed reward functions, in particular how they aggregate objectives and their indifference to driving context. Contribution/Results: The paper proposes directions for future research that could address these shortcomings, including a reward validation framework and structured rewards that are context-aware and able to resolve conflicts.

📝 Abstract

Reinforcement learning has emerged as an important approach for autonomous driving. A reward function is used in reinforcement learning to establish the learned skill objectives and guide the agent toward the optimal policy. Since autonomous driving is a complex domain with partly conflicting objectives with varying degrees of priority, developing a suitable reward function represents a fundamental challenge. This paper aims to highlight the gap in such function design by assessing different proposed formulations in the literature and dividing individual objectives into Safety, Comfort, Progress, and Traffic Rules compliance categories. Additionally, the limitations of the reviewed reward functions are discussed, such as objectives aggregation and indifference to driving context. Furthermore, the reward categories are frequently inadequately formulated and lack standardization. This paper concludes by proposing future research that potentially addresses the observed shortcomings in rewards, including a reward validation framework and structured rewards that are context-aware and able to resolve conflicts.
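The aggregation limitation mentioned in the abstract can be illustrated with a weighted sum over the four categories. This is a minimal sketch and not a formulation taken from the paper: the `RewardTerms` class, the `weighted_sum_reward` function, the sign conventions, and the weight values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RewardTerms:
    """Per-step reward terms, one per category from the review."""
    safety: float    # assumed: negative on a collision or near-collision
    comfort: float   # assumed: negative for harsh acceleration or jerk
    progress: float  # assumed: positive for progress toward the goal
    rules: float     # assumed: negative for a traffic-rule violation


# Hypothetical fixed weights, chosen only for illustration.
WEIGHTS = {"safety": 10.0, "comfort": 0.5, "progress": 1.0, "rules": 2.0}


def weighted_sum_reward(terms: RewardTerms, weights: dict = WEIGHTS) -> float:
    """Collapse the four category terms into one scalar by weighted sum."""
    return (
        weights["safety"] * terms.safety
        + weights["comfort"] * terms.comfort
        + weights["progress"] * terms.progress
        + weights["rules"] * terms.rules
    )


# A collision step (safety = -1) with large progress still nets a positive reward.
collision_step = RewardTerms(safety=-1.0, comfort=0.0, progress=12.0, rules=0.0)
print(weighted_sum_reward(collision_step))  # 2.0
```

With fixed weights, a large enough progress term can offset a safety penalty, and the same trade-off applies in every driving situation. That is one way to read the aggregation and context-indifference concerns raised above.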

Problem

Research questions and friction points this paper is trying to address.

Developing suitable reward functions for autonomous driving RL

Addressing conflicting objectives in safety, comfort, progress, and rules

Standardizing and improving context-aware reward function formulations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Reviewing reward functions for autonomous driving RL

Categorizing objectives into Safety, Comfort, Progress, Rules

Proposing context-aware, conflict-resolving structured rewards
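The listing only names context-aware, conflict-resolving structured rewards as a proposed research direction, so the following is one possible reading rather than the authors' design. In this sketch, safety takes strict priority over the other categories, and the remaining weights depend on the driving context. The context names, weight values, `SAFETY_PENALTY`, and the `structured_reward` function are hypothetical.

```python
# Hypothetical per-context weights; contexts and values are illustrative only.
CONTEXT_WEIGHTS = {
    "urban": {"comfort": 1.0, "progress": 0.5, "rules": 2.0},
    "highway": {"comfort": 0.5, "progress": 1.5, "rules": 1.0},
}

# Assumed fixed penalty returned whenever safety is violated.
SAFETY_PENALTY = -100.0


def structured_reward(
    safety_violated: bool,
    comfort: float,
    progress: float,
    rules: float,
    context: str,
) -> float:
    """Safety-first reward with context-dependent weights for the rest."""
    # Conflict resolution: a safety violation overrides all other objectives,
    # so no amount of progress can compensate for it.
    if safety_violated:
        return SAFETY_PENALTY
    # Context awareness: the trade-off between the remaining categories
    # is looked up from the current driving context.
    w = CONTEXT_WEIGHTS[context]
    return w["comfort"] * comfort + w["progress"] * progress + w["rules"] * rules
```

Unlike a single fixed weighted sum, the same comfort and progress values here produce different rewards in `"urban"` and `"highway"` contexts, and a safety violation can never be traded against progress.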
