Abstract
Cooperative adaptive cruise control (CACC) can improve the traffic efficiency and safety of a platoon on the road. Most traditional CACC methods rely on accurate mathematical models, while those based on deep reinforcement learning (DRL) suffer from long training times and poor convergence. In this context, this study proposes a CACC framework based on imitation learning (IL) and DRL, which aims to improve the car-following efficiency and long-platoon stability of connected autonomous vehicles (CAVs) in a mixed traffic environment. The method combines the optimization ability of model predictive control (MPC) with the adaptive learning characteristics of the soft actor-critic (SAC) algorithm. MPC serves as the expert, and a pre-trained policy network is obtained through imitation learning. This pre-trained network is then used to initialize SAC's actor network, which improves the training efficiency of the SAC algorithm. Numerical simulation results show that the improved DRL algorithm converges better during training. Compared with the baseline model, the proposed framework achieves higher reward, lower tracking error, and better platoon stability in the evaluation. In addition, the proposed model efficiently completes the car-following task under different CAV penetration rates.
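The two-stage idea described above (imitation learning from an MPC expert, then warm-starting the SAC actor) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the state layout, the linear-feedback stand-in for the MPC expert, and the network sizes are all assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

STATE_DIM = 3   # assumed: gap error, relative speed, ego acceleration
ACTION_DIM = 1  # longitudinal acceleration command

def expert_action(state: torch.Tensor) -> torch.Tensor:
    # Linear feedback controller standing in for the MPC expert
    # (hypothetical gains, not the paper's controller).
    k = torch.tensor([0.5, 0.3, -0.1])
    return (state @ k).unsqueeze(-1)

def make_policy() -> nn.Module:
    # Small MLP; the same architecture is reused for the SAC actor's mean head.
    return nn.Sequential(nn.Linear(STATE_DIM, 32), nn.ReLU(),
                         nn.Linear(32, ACTION_DIM))

# --- Stage 1: imitation learning (behavior cloning on expert rollouts) ---
policy = make_policy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
states = torch.randn(512, STATE_DIM)      # stand-in for recorded states
targets = expert_action(states)           # expert (MPC) actions to imitate
for _ in range(300):
    loss = nn.functional.mse_loss(policy(states), targets)
    opt.zero_grad()
    loss.backward()
    opt.step()

# --- Stage 2: initialize the SAC actor with the pre-trained weights ---
# SAC training (critics, entropy temperature, replay buffer) would then
# fine-tune this warm-started actor; that loop is omitted here.
actor_mean = make_policy()
actor_mean.load_state_dict(policy.state_dict())
```

The warm start matters because a randomly initialized SAC actor spends many early episodes exploring unsafe or inefficient car-following behavior; starting from a cloned MPC policy skips that phase, which is the source of the faster convergence the abstract reports.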
