Abstract
To travel successfully, every vehicle must frequently handle interactions with other road users. Enabling autonomous vehicles to interact naturally, as human drivers do, is an important challenge in the field. This paper proposes a decision-making approach for autonomous vehicles that integrates Social Value Orientation (SVO) into a reinforcement learning framework to enable interactive behavior in complex merging scenarios. We define a quantitative method for computing the SVO value, use it for reward shaping, and investigate the impact of SVO on agent behavior patterns. We train agents with the Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) algorithms and validate our findings in the highway-env simulator. The simulation results indicate that, with the proposed approach, agents complete merging tasks while following the intended interaction behavior patterns, displaying modes ranging from self-centered to altruistic. This confirms that the SVO-based reward function is concise yet effective at guiding agents toward a diverse range of anticipated interaction behaviors.
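The abstract does not state the paper's exact SVO formulation, but a common convention in the literature expresses SVO as an angle that weights the ego agent's reward against other agents' rewards via cosine and sine. The sketch below illustrates that idea only; the function name, signature, and angle convention are assumptions for illustration, not the authors' definition.

```python
import math

def svo_shaped_reward(r_ego: float, r_other: float, svo_angle_rad: float) -> float:
    """Blend ego and other-vehicle rewards by an assumed SVO angle.

    Under this common convention (an assumption, not this paper's formula):
      svo_angle_rad = 0       -> purely self-centered (egoistic)
      svo_angle_rad = pi / 4  -> prosocial (equal weighting)
      svo_angle_rad = pi / 2  -> purely altruistic
    """
    return math.cos(svo_angle_rad) * r_ego + math.sin(svo_angle_rad) * r_other

# An egoistic agent (angle 0) ignores the other vehicle's reward entirely:
# svo_shaped_reward(1.0, 0.5, 0.0) -> 1.0
```

Sweeping the angle from 0 toward pi/2 would shift a trained agent's behavior from self-centered to altruistic, matching the range of interaction modes the abstract reports.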
