Abstract

Navigating heterogeneous urban traffic environments is challenging for autonomous vehicles (AVs) because of the dense and intricate interactions among AVs, human-driven vehicles (HDVs), and non-motorized vehicles (NMVs). In this paper, we propose a decentralized multi-agent reinforcement learning (MARL) algorithm with a bi-level intention inference module for joint motion and intention prediction of AVs. We model the underlying representation of agents’ intentions on two levels: the high-level intention represents long-term behavioral patterns, while the low-level intention depicts immediate interactive dynamics. By integrating intent-aware motion forecasting, the algorithm ensures safe and resilient decision-making for AVs in mixed traffic flow. Experiments are performed in a modified Highway-Env simulation environment that incorporates models for both HDVs and NMVs calibrated on real-world data. Results demonstrate that, compared with QMIX, a centralized training with decentralized execution (CTDE) MARL baseline, our method yields 20.0% and 13.8% higher episodic reward in stable and chaotic traffic, respectively, with a 53.2% higher non-collision rate and a 13.8% longer agent lifespan in chaotic traffic. Compared with IPPO, a decentralized training and decentralized execution (DTDE) baseline, our method achieves 7.7% and 15.8% higher episodic reward in stable and chaotic traffic, respectively, along with a 24.1% higher non-collision rate and a 3.1% longer agent lifespan.

Keywords

data and data science; artificial intelligence and advanced computing applications; reinforcement learning; operations; intelligent transportation systems; automated ITS; traffic flow theory and characteristics; automated/autonomous vehicles

Toward Heterogeneous Urban Traffic: Distributed Multi-Agent Reinforcement Learning with Bi-level Intention Inference
