Abstract
To travel successfully, every vehicle must frequently handle interactions with other road users. Enabling autonomous vehicles to interact naturally, as human drivers do, is an important challenge in the field. This paper proposes a decision-making approach for autonomous vehicles that integrates Social Value Orientation (SVO) into a reinforcement learning framework to enable interactive behavior in complex merging scenarios. We define a quantitative method for computing the SVO value, use it for reward shaping, and investigate the impact of SVO on agent behavior patterns. We train agents with the Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) algorithms and validate our findings in the highway-env simulator. The simulation results indicate that, with the proposed approach, agents complete merging tasks while following the intended interaction behavior patterns, displaying modes ranging from self-centered to altruistic. This confirms that the SVO-based reward function is concise yet effective at guiding agents toward a diverse range of anticipated interaction behaviors.
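The abstract does not state the paper's exact SVO formulation, but a common convention in the literature expresses SVO as an angle that weights the ego agent's reward against other agents' rewards via cosine and sine. The sketch below illustrates that idea only; the function name, signature, and angle convention are assumptions for illustration, not the authors' definition.

```python
import math

def svo_shaped_reward(r_ego: float, r_other: float, svo_angle_rad: float) -> float:
    """Blend ego and other-vehicle rewards by an assumed SVO angle.

    Under this common convention (an assumption, not this paper's formula):
      svo_angle_rad = 0       -> purely self-centered (egoistic)
      svo_angle_rad = pi / 4  -> prosocial (equal weighting)
      svo_angle_rad = pi / 2  -> purely altruistic
    """
    return math.cos(svo_angle_rad) * r_ego + math.sin(svo_angle_rad) * r_other

# An egoistic agent (angle 0) ignores the other vehicle's reward entirely:
# svo_shaped_reward(1.0, 0.5, 0.0) -> 1.0
```

Sweeping the angle from 0 toward pi/2 would shift a trained agent's behavior from self-centered to altruistic, matching the range of interaction modes the abstract reports.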
