Sage Journals: Discover world-class research

Abstract

Approaches based on physical modeling tend to be over-simplistic and cannot forecast complex dynamical phenomena, which leads to non-negligible errors. Some parameters are not easy to measure precisely and are usually only roughly approximated. This approximation reduces the accuracy of the physical model, which is a common problem in complex systems research. It is well known that neural networks are capable of encoding dynamic information, so a vehicle can be modeled accurately from data collected during its motion. However, purely data-driven approaches have low interpretability and cannot be used in commercial applications. In this work, we present a new hybrid modeling architecture. Starting from the physical model, a deep learning method is introduced to augment the incomplete dynamics described by the differential equations. Compared with the physical modeling-based and purely data-driven approaches, the proposed technique has lower modeling error and higher interpretability. We evaluate the performance of the hybrid model on the collected data. The test results show that the proposed architecture successfully captures the vehicle dynamics and reduces the error accumulated in multi-step prediction compared with the data-driven models. The results also show that the proposed method has significant value for research and practical application.

Keywords

Vehicle dynamics, hybrid model, multi-step prediction, recurrent neural networks

Introduction

Autonomy is the future of automobile development. Autonomous vehicles are considerably transforming the transportation ecosystem and travel modes. Classic self-driving systems typically include perception, positioning, decision-making, trajectory planning, and control modules. Accurate kinetic information of the vehicle is essential for the safety of autonomous vehicles.

In the last century, physical modeling-based approaches were widely studied. In the mid-20th century, researchers from Cornell University studied the linear two-degrees-of-freedom (2-DOF) vehicle model.¹ The researchers assumed that the whole vehicle is a rigid body and considered the cornering stiffness of the front and rear axles. In addition, they did not distinguish between the wheels on the left and right sides. The wheel angle is used directly as the input of the model, and the understeer and oversteer characteristics of the car are defined. Segel² regarded the vehicle as a linear dynamical system and established a 3-DOF vehicle model including yaw, lateral, and roll motions to describe the steering response. Kazemi et al.³ established a nonlinear 7-DOF vehicle model using the magic formula tire model and studied the influence of rear-wheel steering on the vehicle’s stability during driving. Notably, the parameters of these models have actual physical meanings and therefore high interpretability. However, a physical model based on first principles is usually idealized during modeling. Consequently, it cannot accurately reproduce the true dynamic response of the vehicle during experimental driving, such as the load transfer between the front and rear axles and the high-order dynamic response of the tires. In addition, the environment in which the physical model is tested generally differs from the actual environment, which makes it difficult to estimate the model parameters in real time.

With the development of deep learning, significant progress has been made in many research fields, especially in autonomous driving. The deep learning-based methods have been successfully used to perform perceptual tasks, such as target detection,^4–8 image segmentation,^9–11 and trajectory prediction.¹² Recently, various works have been proposed that use deep learning in an end-to-end fashion to control the vehicles based on the original sensor data.^13,14 These methods directly obtain the instructions at the output end from the image information at the input end. It is noteworthy that neural network training requires a large amount of data. The neural networks have strong nonlinear modeling capabilities. The authors¹⁵ proved that the nonlinear modeling ability of a neural network could be combined with the feedback, and the input and output data of the system can be used to model the nonlinear systems accurately. Ji et al.¹⁶ proposed an adaptive control mechanism based on the Lyapunov stability theory and radial basis function neural network (RBFNN). This network uses an ANN to estimate the uncertainty in tire cornering stiffness. Spielberg et al.¹⁷ established a vehicle lateral dynamics model based on a neural network and successfully used it to design a trajectory tracking controller. Simon et al.¹⁸ studied and optimized a neural network’s size, structure, and initial weights for modeling. In addition, the authors also examine the results of the fusion weight network. Kabzan et al.¹⁹ established a data-driven and mechanism mixed model using a relatively simple and nominal model. The researchers also established the online learning of model error based on Gaussian process regression. Compared with physical models, the dynamic models based on neural networks require almost no specific domain knowledge. In addition, the construction cost of the model is high as the network parameters are learned using a large amount of data. 
Although a data-driven model can adapt to continuous changes in the environment, estimating its parameters is difficult because the model is nonlinear. Moreover, the interpretability of the neural network is low because the network weights do not correspond to the parameters of the physical model. In addition, compared with the traditional physical model, the data-driven model is prone to uncontrollable errors, which reduces the safety of the autonomous vehicle during driving. When an error arises inside the neural network, its source cannot be located.

When neural networks are used to model dynamic systems, the network’s convergence speed and prediction accuracy can be improved by incorporating prior knowledge. In system identification, the white-box model is one of the most convenient ways to represent prior knowledge. However, because the white-box model relies on simplifying assumptions, it cannot capture many of the complex nonlinearities acting on the system, and its output error is considerably high. One way to overcome these shortcomings is to combine an analytical model with a neural network to improve the overall performance. For the time series prediction task, researchers have shown that a hybrid model performs better than either the analytical model or the artificial neural network alone.^20,21 Furthermore, existing methods have shown the potential of hybrid neural network models in dynamic system modeling. Jiahao et al.²² demonstrated the compatibility of neural networks with first-principles dynamic models by modeling various nonlinear systems. Chee et al.²³ used a deep learning method and differential equations to augment a model obtained from first principles. Holzmann et al.²⁴ used a radial basis function network to compensate for the influence of changing road conditions on a vehicle dynamics simulation model. Pracny et al.²⁵ coupled neural networks and spline functions to study the influence of oil temperature changes on the operating characteristics of a shock absorber. Fraikin et al.²⁶ established an efficient and accurate vehicle lateral dynamics simulation by coupling a long short-term memory (LSTM) neural network with the single-track model. Graeber et al.²⁷ combined neural networks with a vehicle kinematics model for side-slip angle estimation, using the kinematics model to increase the number of input features of the neural network and thereby improve the estimation quality.
Mohajerin et al.²⁸ combined RNN-based black-box models with a physics-based model into a single RNN-based modeling system, addressing many of the limitations of the existing state of the art in long-term prediction for dynamic systems. However, the neural network’s output is used as the input of the physical model, which may cause the input-output relationship of the physical model to be obscured by the neural network. De Groote et al.²⁹ proposed a neural network-enhanced physical model for modeling a servo system. The unknown loads and parameters in the physical model are obtained using the neural network, but the final output is still obtained from the physical model. Since this modeling method requires a highly accurate physical model within the hybrid model, it is not suitable for complex dynamic systems such as vehicle dynamics. Another challenge is that a hybrid model is difficult to train when the physical model is inaccurate. Since the predicted state is fed back to the input, the error increases with time, which causes the weights of the neural network to diverge.

To address these challenges, we propose a new hybrid model for the multi-step prediction of vehicle state. The main contributions of this study are stated as follows.

(1) We employ a hybrid model to develop a high-fidelity vehicle model capable of capturing poorly understood uncertainties and residual dynamics. The hybrid model incorporates prior knowledge of the system dynamics and better represents the vehicle dynamics.

(2) We show that the hybrid model significantly improves the accuracy of state predictions over both the nominal model and a separate data-driven prediction model.

(3) To reduce the divergence during the training of the hybrid model in multi-step prediction, we use open-loop and closed-loop training methods so that the inaccurate physical model can be used as part of the hybrid model.

The rest of this paper is structured as follows. In Section II, the overall architecture and composition of the hybrid model used for multi-step prediction are presented. In Section III, the vehicle dynamics data collection and model training methods are introduced. In Section IV, the predictive performance of the physical model, the neural network, and the hybrid model is evaluated. We conclude this work in Section V.

Hybrid model for multi-step prediction

In this section, we present the framework and methodology of the hybrid model. The proposed algorithm is mainly composed of three components: the physical model, feed-forward neural network (FFNN), and recurrent neural network (RNN).

The structure of hybrid model

Figure 1 shows the proposed hybrid model, which comprises three modules: a physical model, an FFNN, and an RNN (with an initialization network). The physical model receives the current control input $c_{t}$ and state $u_{t}$ and outputs the predicted state vector $u_{t + 1}^{physical}$. Given $c_{t}$ and $u_{t}$, the FFNN outputs $u_{t + 1}^{FFNN}$, which accounts for the residual and uncertain dynamics of the system in the single-step prediction process. Their sum is the compensated state ${\tilde{u}}_{t + 1}$. The RNN receives the previous hidden state $h_{t}$, $c_{t}$, and ${\tilde{u}}_{t + 1}$ and generates the compensation $u_{t + 1}^{RNN}$ for the multi-step prediction process. The previous state $u_{t}$ does not need to be part of the RNN’s input because the information it contains is already encoded in the RNN. The final output state ${\hat{u}}_{t + 1}$ of the hybrid model is therefore the sum of ${\tilde{u}}_{t + 1}$ and $u_{t + 1}^{RNN}$.

Figure 1.

The proposed hybrid architecture consists of a physical model, FFNN, and RNN.

During multi-step prediction, the feedback loop indicated by the red arrow feeds the previous predicted state back to the input to predict the next state. Because of this feedback, the multi-step prediction error accumulates over time. In this work, $c_{t}$ denotes the control input, consisting of the resultant longitudinal force on the front tires, the front-wheel steering angle, and the longitudinal speed at the current moment, and $u_{t}$ denotes the state, consisting of the yaw rate and lateral speed at the current moment. The forward computation of the proposed hybrid model for the time step $t + 1$ is defined as follows:

\begin{matrix} u_{t + 1}^{physical} = physical model (c_{t}, u_{t}) \\ u_{t + 1}^{FFNN} = FFNN (c_{t}, u_{t}) \\ {\tilde{u}}_{t + 1} = u_{t + 1}^{physical} + u_{t + 1}^{FFNN} \\ h_{t + 1} = RNN (c_{t}, {\tilde{u}}_{t + 1}, h_{t}) \\ u_{t + 1}^{RNN} = W^{hx} h_{t + 1} \\ {\hat{u}}_{t + 1} = {\tilde{u}}_{t + 1} + u_{t + 1}^{RNN} \end{matrix}

(1)
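The data flow of Eq. (1) can be sketched in a few lines of Python. This is only an illustration: the callables `physical_model`, `ffnn`, and `rnn_step`, the readout matrix `w_hx`, and all toy stand-ins below are hypothetical placeholders, not the trained components described in this paper. States are plain lists ordered as lateral speed and yaw rate.

```python
def hybrid_step(physical_model, ffnn, rnn_step, w_hx, c_t, u_t, h_t):
    """One forward step of Eq. (1); every component is passed in as a callable."""
    u_phys = physical_model(c_t, u_t)            # first-principles prediction
    u_ffnn = ffnn(c_t, u_t)                      # single-step residual correction
    u_tilde = [p + f for p, f in zip(u_phys, u_ffnn)]
    h_next = rnn_step(c_t, u_tilde, h_t)         # recurrent hidden-state update
    # Linear readout u_RNN = W_hx h_{t+1}
    u_rnn = [sum(w * h for w, h in zip(row, h_next)) for row in w_hx]
    u_hat = [ut + ur for ut, ur in zip(u_tilde, u_rnn)]
    return u_hat, h_next

# Toy stand-ins (assumptions for illustration only).
toy_physical = lambda c, u: [0.9 * x for x in u]
toy_ffnn = lambda c, u: [0.01, -0.01]
toy_rnn = lambda c, u_tilde, h: [0.5 * hi + 0.1 * sum(u_tilde) for hi in h]
W_HX = [[1.0, 0.0], [0.0, 1.0]]

u_hat, h_next = hybrid_step(toy_physical, toy_ffnn, toy_rnn, W_HX,
                            c_t=[0.0, 0.0, 10.0], u_t=[1.0, 0.2], h_t=[0.0, 0.0])
```

In multi-step prediction, `u_hat` and `h_next` would be passed back in as `u_t` and `h_t` for the next step.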

Physical model

Kinematic and dynamic models are commonly used in vehicle motion simulation. A kinematic model uses kinematic relations to describe the motion of an object in space, whereas a dynamic model is established by describing the forces acting on the object. This section describes the vehicle dynamic model used for hybrid modeling of the vehicle lateral dynamics, which consists of the single-track model and the brush Fiala model. The model is built on the following simplifying assumptions:

The self-driving cars run on a flat road and the factors, such as the slope and the vertical movement, are ignored.

The suspension system and vehicle are rigid. The influence of suspension motion and its coupling are ignored.

Only the tire cornering characteristics are considered. The coupling between the longitudinal and lateral tire forces is ignored.

A 2-DOF vehicle model is used to describe the movement of the vehicle without considering the left and right load transfer.

The longitudinal speed of the vehicle is constant and the weight transfer of the front and rear axles is ignored.

The resistance, such as vertical and horizontal aerodynamics, is also ignored.

In Figure 2, $U$ denotes the velocity of the vehicle’s center of mass; $U_{x}, U_{y}$ denote the velocities of the center of mass along the x and y directions of the vehicle body coordinate system, respectively; $α_{f}, α_{r}$ denote the side-slip angles of the front and rear wheels, respectively; $β$ represents the side-slip angle of the center of mass; $r$ denotes the vehicle yaw rate; $a$ and $b$ denote the distances from the center of mass to the front and rear axles, respectively, and $L = a + b$ denotes the wheelbase; $m$ denotes the vehicle mass; $I_{z}$ represents the moment of inertia of the vehicle about the z-axis through the center of mass; $F_{yf}, F_{yr}$ denote the resultant lateral forces on the front and rear tires, respectively; $F_{xf}$ represents the resultant longitudinal force on the front axle tires; and $δ$ denotes the front wheel steering angle.

Figure 2.

The schematic of a single-track model.

The dynamics single-track model is expressed as follows:

\begin{array}{l} {\dot{U}}_{y} = \frac{F_{x f} \sin δ + F_{y f} \cos δ + F_{y r}}{m} - U_{x} r \\ \dot{r} = \frac{a (F_{x f} \sin δ + F_{y f} \cos δ) - b F_{y r}}{I_{z}} . \end{array}

(2)
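As a minimal sketch, Eq. (2) can be evaluated directly once the tire forces are known. The function and argument names below are illustrative choices; the numerical constants are the mass, axle distances, and yaw inertia listed in Table 1.

```python
import math

# Table 1 values: mass [kg], CG-to-axle distances [m], yaw inertia [kg m^2].
M, A, B, I_Z = 1580.0, 1.098, 1.57, 2512.0

def single_track_derivatives(f_xf, f_yf, f_yr, delta, u_x, r):
    """Return (dU_y/dt, dr/dt) from Eq. (2) for given tire forces."""
    # Front-axle force resolved along the body lateral axis.
    front_lateral = f_xf * math.sin(delta) + f_yf * math.cos(delta)
    u_y_dot = (front_lateral + f_yr) / M - u_x * r
    r_dot = (A * front_lateral - B * f_yr) / I_Z
    return u_y_dot, r_dot

u_y_dot, r_dot = single_track_derivatives(
    f_xf=0.0, f_yf=1000.0, f_yr=1000.0, delta=0.0, u_x=20.0, r=0.1)
```

A one-step state update then follows from any numerical integrator, for example a forward Euler step with the 0.01 s sample time of the down-sampled data.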

The nonlinear characteristics of a vehicle driving under different road conditions arise from the tires when the vehicle is turning. Therefore, to expand the scope of application of the vehicle model, a nonlinear tire model is introduced. The tire lateral force is calculated as:

\begin{array}{l} F_{y} = \begin{cases} - C_{α} \tan α + \frac{C_{α}^{2}}{3 μ F_{z}} | \tan α | \tan α - \frac{C_{α}^{3}}{27 μ^{2} F_{z}^{2}} \tan^{3} α, & | α | < α_{sat} \\ - μ F_{z} \, sgn α, & otherwise \end{cases} \\ α_{sat} = \arctan \frac{3 μ F_{z}}{C_{α}} \end{array}

(3)

where $C_{α}$ and $μ$ denote the tire cornering stiffness and road adhesion coefficient, respectively; $F_{z}$ denotes the tire vertical force; $α$ denotes the tire side-slip angle; and $α_{sat}$ denotes the tire saturation side-slip angle. The side-slip angles of the front and rear tires are calculated by:

\begin{matrix} α_{f} = \arctan (\frac{U_{y} + ar}{U_{x}}) - δ, \\ α_{r} = \arctan (\frac{U_{y} - br}{U_{x}}) . \end{matrix}

(4)

When the vehicle’s longitudinal speed changes slowly, the load transfer between the front and rear axles can be ignored. The normal forces on the front and rear tires are then calculated by:

\begin{matrix} F_{zf} = \frac{b}{a + b} mg \\ F_{zr} = \frac{a}{a + b} mg \end{matrix}

(5)
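Equations (3) to (5) can be combined into a short Python sketch. Several choices here are assumptions made for illustration: g = 9.81 m/s², the front cornering stiffness from Table 1 is used as $C_{α}$ of the front axle, μ = 0.85 is the dry-road value used later in this paper, and the operating point (speeds, yaw rate, steering angle) is arbitrary.

```python
import math

M, A, B, G = 1580.0, 1.098, 1.57, 9.81  # Table 1 values; G is an assumed constant

def normal_loads():
    """Static front and rear axle loads from Eq. (5)."""
    return B / (A + B) * M * G, A / (A + B) * M * G

def slip_angles(u_x, u_y, r, delta):
    """Front and rear side-slip angles from Eq. (4)."""
    alpha_f = math.atan((u_y + A * r) / u_x) - delta
    alpha_r = math.atan((u_y - B * r) / u_x)
    return alpha_f, alpha_r

def fiala_lateral_force(alpha, c_alpha, mu, f_z):
    """Lateral tire force from Eq. (3)."""
    alpha_sat = math.atan(3.0 * mu * f_z / c_alpha)
    if abs(alpha) >= alpha_sat:
        # Fully sliding tire: force saturates at the friction limit.
        return -mu * f_z * math.copysign(1.0, alpha)
    t = math.tan(alpha)
    return (-c_alpha * t
            + c_alpha ** 2 / (3.0 * mu * f_z) * abs(t) * t
            - c_alpha ** 3 / (27.0 * mu ** 2 * f_z ** 2) * t ** 3)

f_zf, f_zr = normal_loads()
alpha_f, alpha_r = slip_angles(u_x=15.0, u_y=0.1, r=0.05, delta=0.02)
f_yf = fiala_lateral_force(alpha_f, c_alpha=146452.0, mu=0.85, f_z=f_zf)
```

At $α = α_{sat}$ the three polynomial terms sum to $-μ F_{z}$, so the two branches of Eq. (3) join continuously.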

The real vehicle parameters corresponding to the simplified vehicle dynamics model are shown in Table 1.

Table 1.

Vehicle parameters.

Symbol	Parameter	Value	Unit
m	Vehicle mass	1580	kg
a	Distance of the front wheel axle from the CG	1.098	m
b	Distance of the rear wheel axle from the CG	1.57	m
$I_{Z}$	Vehicle moment of inertia about the yaw axis	2512	kg m²
$c_{f}$	Cornering stiffness of front tires	146,452	N/rad
$c_{r}$	Cornering stiffness of rear tires	157,526	N/rad

Feed-forward neural networks

Hornik et al.³⁰ proved that an FFNN with a single hidden layer can approximate any continuous function. FFNNs have been used extensively in the modeling and control of dynamic systems and as the modeling part of the controller in the Lyapunov design method to stabilize the system.³¹ Since an FFNN cannot capture the long-term characteristics of a dynamic system, it is mainly used to capture poorly understood uncertainties and residual dynamics. In this work, the FFNN comprises three fully connected layers with ReLU activation functions and serves as a single-step predictor or compensator.

Recurrent neural networks

In this section, we present the basic concepts of the RNN and the gated recurrent unit (GRU). RNNs have proved their advantages in time sequence modeling tasks such as natural language processing and trajectory prediction. They have proved more effective than traditional networks at modeling long-term dependence between current and historical information. We first present a vanilla RNN and then use a GRU to enhance its performance. A schematic structure of the RNN is shown in Figure 3.

Figure 3.

The schematic structure of RNN.

The RNN is mathematically expressed as:

s_{t} = σ (W_{x} x_{t} + H_{s} s_{t - 1} + b_{x})

(6)

o_{t} = softmax (W_{o} s_{t} + b_{o})

(7)

where $σ$ denotes the activation function of the hidden state; $x_{t}$, $s_{t}$, and $o_{t}$ represent the input, hidden state, and output, respectively; $W_{x}$, $H_{s}$, and $W_{o}$ denote the corresponding weight matrices; and $b_{x}$ and $b_{o}$ denote the bias vectors. To reduce the number of parameters, $W_{x}$, $H_{s}$, and $W_{o}$ are shared across all time steps.
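A minimal pure-Python sketch of Eqs. (6) and (7) is given below. The hidden activation is taken to be tanh, which is an assumption since $σ$ is not specified, and the weight values are arbitrary illustrative numbers.

```python
import math

def matvec(w, v):
    """Matrix-vector product for nested lists."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]

def softmax(z):
    m = max(z)                      # subtract the max for numerical stability
    e = [math.exp(zi - m) for zi in z]
    total = sum(e)
    return [ei / total for ei in e]

def rnn_step(x_t, s_prev, w_x, h_s, b_x, w_o, b_o):
    """Eqs. (6)-(7) with tanh as the hidden activation (an assumption)."""
    pre = [a + b + c for a, b, c in zip(matvec(w_x, x_t), matvec(h_s, s_prev), b_x)]
    s_t = [math.tanh(p) for p in pre]
    o_t = softmax([a + b for a, b in zip(matvec(w_o, s_t), b_o)])
    return s_t, o_t

# The same weights are reused at every step of the sequence.
W_X, H_S, B_X = [[0.5], [-0.3]], [[0.1, 0.0], [0.0, 0.1]], [0.0, 0.0]
W_O, B_O = [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]
s = [0.0, 0.0]
for x in ([1.0], [0.5], [-0.2]):
    s, o = rnn_step(x, s, W_X, H_S, B_X, W_O, B_O)
```

Because `rnn_step` carries `s` from one call to the next, the loop handles sequences of any length with a fixed number of parameters.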

The blue arrow in Figure 3 indicates the chain-rule-based Back Propagation Through Time (BPTT) scheme, which is expressed as:

\frac{\partial L}{\partial θ} = \sum_{1 \leq t \leq T} \frac{\partial L_{t}}{\partial θ}

(8)

\frac{\partial L_{t}}{\partial θ} = \sum_{1 \leq k \leq t} (\frac{\partial L_{t}}{\partial s_{t}} \frac{\partial s_{t}}{\partial s_{k}} \frac{\partial^{+} s_{k}}{\partial θ})

(9)

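where $L$ denotes the loss, that is, the performance measure of the network; $L_{t}$ denotes the loss at time step $t$; $\frac{\partial L}{\partial θ}$ denotes the network gradient; $\frac{\partial L_{t}}{\partial θ}$ is the sum of the contributions of the past $t$ time steps; $\frac{\partial s_{t}}{\partial s_{k}}$ propagates the error at time step $t$ back to time step $k$; and $\frac{\partial^{+} s_{k}}{\partial θ}$ denotes the immediate partial derivative of $s_{k}$ with respect to $θ$, with $s_{k - 1}$ held constant.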
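The BPTT sums can be checked numerically on a scalar toy RNN. The sketch below assumes $s_{t} = \tanh(w x_{t} + h s_{t - 1})$ with $s_{0} = 0$ and a squared-error loss per step; the weights, inputs, and targets are arbitrary. It assembles the gradient with respect to the recurrent weight term by term, with the inner sum running over the earlier steps $k \leq t$, and compares it with a central finite difference.

```python
import math

def forward(w, h, xs):
    """Scalar RNN s_t = tanh(w x_t + h s_{t-1}) with s_0 = 0; returns [s_0, ..., s_T]."""
    states = [0.0]
    for x in xs:
        states.append(math.tanh(w * x + h * states[-1]))
    return states

def loss(w, h, xs, ys):
    states = forward(w, h, xs)
    return sum(0.5 * (s - y) ** 2 for s, y in zip(states[1:], ys))

def bptt_grad_h(w, h, xs, ys):
    """dL/dh assembled as a double sum over t and k <= t."""
    states = forward(w, h, xs)
    total = 0.0
    for t in range(1, len(xs) + 1):
        dl_ds = states[t] - ys[t - 1]                  # dL_t / ds_t
        for k in range(1, t + 1):
            ds_ds = 1.0                                # ds_t / ds_k as a product
            for j in range(k + 1, t + 1):
                ds_ds *= (1.0 - states[j] ** 2) * h
            immediate = (1.0 - states[k] ** 2) * states[k - 1]   # d+ s_k / dh
            total += dl_ds * ds_ds * immediate
    return total

xs, ys = [0.5, -0.1, 0.3, 0.8], [0.2, 0.0, 0.1, 0.4]
w, h, eps = 0.7, 0.4, 1e-6
analytic = bptt_grad_h(w, h, xs, ys)
numeric = (loss(w, h + eps, xs, ys) - loss(w, h - eps, xs, ys)) / (2 * eps)
```

The product inside `ds_ds` is the term that shrinks or grows geometrically with the distance between $t$ and $k$, which is the source of vanishing and exploding gradients.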

Gated recurrent units

Unlike an FFNN, an RNN can take sequences of arbitrary length as input. It has the universal approximation property and can, in theory, reconstruct the state space of a dynamical system well, which makes it suitable for multi-step prediction problems. Jin et al.³² showed that an RNN can uniformly approximate a state-space trajectory produced by a discrete-time nonlinear system. Because of the large delay between the driver input and the vehicle response, an RNN is expected to offer good performance. Hochreiter and Schmidhuber³³ proposed the LSTM model, which improves the long-term modeling ability and has been widely used for trajectory prediction of traffic participants. However, because it has many parameters, the LSTM model takes longer to process large datasets. Cho et al.³⁴ proposed the GRU, which simplifies the structure of the LSTM, reduces the number of gates, and improves computational efficiency.
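The saving in parameters can be made concrete with a rough count. The sketch below assumes one bias vector per block and an input size of 5 (three control inputs plus two states); both are assumptions for illustration, and deep learning frameworks that store two bias vectors per block will report slightly different totals. An LSTM layer has four weight blocks (input, forget, and output gates plus the cell candidate), whereas a GRU layer has three (update and reset gates plus the candidate state).

```python
def recurrent_param_count(input_size, hidden_size, n_blocks):
    """Weights and biases of n_blocks gate-like blocks, one bias vector each."""
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    return n_blocks * per_block

INPUT_SIZE, HIDDEN_SIZE = 5, 128   # assumed: 3 control inputs + 2 states
lstm_params = recurrent_param_count(INPUT_SIZE, HIDDEN_SIZE, n_blocks=4)
gru_params = recurrent_param_count(INPUT_SIZE, HIDDEN_SIZE, n_blocks=3)
```

Under these assumptions the GRU layer needs three quarters of the parameters of an LSTM layer of the same width.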

RNN state initialization

The multi-step prediction models, such as LSTMs and GRUs, rely on accurate hidden states to produce accurate predictions. The initial state of the GRU should contain historical information. Mohajerin and Waslander³⁵ described different LSTM hidden state initializations in the quadrotor modeling using RNN. The authors finally used another layer of LSTM as the initializer. An initialization network extracts the potential dynamics from the historical data. Please note that using the final hidden state of the initializer to initialize the predictive network improves the long-term prediction performance of a network. In this work, we choose another layer of GRU to receive the historical data to generate the hidden states of the predictive network (Figure 4), which are expressed as:

h_{init} = f_{NN, init} (u_{t - k}, \dots, u_{t - 1}, c_{t - k}, \dots, c_{t - 1})

(10)

Figure 4.

The schematic structure of GRU cells and regression layers.

where $u_{t - k}, \dots, u_{t - 1}$ and $c_{t - k}, \dots, c_{t - 1}$ denote the states and control inputs at the historical time steps, respectively.

Experiments and dataset

Data collection and processing

We collected the trajectory samples based on the “Chery Arezer 5E” for approximately 48 min. The dataset includes the dynamic response data of an intelligent vehicle under dry and wet road conditions. The test platform is presented in Figure 5. This platform includes an environment sensing system, inertial navigation and positioning system, a decision control module, and an underlying actuator by wire. The wheel force sensors, S-Motion DTI, and MSW DTI sensors are also installed.

Figure 5.

The intelligent driving platform used in this work.

During data collection, the sampling frequency of the signals is 10 kHz, and the recorded data contain noise. We use a mean filter to down-sample the collected data to 100 Hz to reduce the training complexity. A second-order Butterworth low-pass filter with a cut-off frequency of 2 Hz is then used to smooth the data and filter out the impact of high-frequency behaviors, such as suspension vibration, on the vehicle dynamics. The data filtering is carried out in MATLAB 2020a on a computer with a 2.5 GHz Intel Core i7 processor. Although the friction coefficient between the tire and road changes slightly in actual conditions, we assume that it is constant: 0.85 on the dry road and 0.5 on the wet road. We use 80% of the data for model training and the remaining 20% for testing.
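The down-sampling and split steps can be sketched as follows. The block-averaging reading of the mean filter, the chronological (unshuffled) split, and the synthetic signal are assumptions for illustration; the Butterworth smoothing step is omitted because it needs a signal-processing library.

```python
def mean_downsample(samples, factor):
    """Average consecutive blocks of `factor` samples; a trailing partial block is dropped."""
    n_blocks = len(samples) // factor
    return [sum(samples[i * factor:(i + 1) * factor]) / factor for i in range(n_blocks)]

def chronological_split(samples, train_fraction=0.8):
    """Split a sequence into leading training and trailing test portions."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]

raw = [float(i % 100) for i in range(1000)]        # synthetic 10 kHz signal, 0.1 s long
signal_100hz = mean_downsample(raw, factor=100)    # 10 kHz / 100 Hz = 100
train, test = chronological_split(signal_100hz)
```

Averaging 100 raw samples per output sample both reduces the rate to 100 Hz and attenuates uncorrelated sensor noise.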

Training methods

The training of the hybrid model is divided into FFNN training and GRU training. In the first stage, FFNN is used to compensate for the single-step prediction error between the output of the physical model and the actual measured value. The loss function used for training FFNN is expressed as:

\begin{matrix} L_{FFNN} = \frac{1}{D} \sum_{i = 1}^{D} L_{i} = \\ \frac{1}{TD} \sum_{i = 1}^{D} \sum_{t = t_{0} + 1}^{t_{0} + T} ∥ u_{t + 1}^{physical} + u_{t + 1}^{FFNN} - u_{t + 1} ∥_{1} \end{matrix}

(11)

where, $D$ denotes the size of a minibatch, $t$ denotes the time step, and $T$ denotes the total time step of a training sample.

Given the initial state of the GRU and the future control inputs, multi-step prediction training of the GRU is difficult because the inputs of the FFNN and the physical model depend on the previous, inaccurate output. The error accumulates over multiple time steps and finally leads to divergence between the predicted state and the real state. As the GRU uses the BPTT algorithm to backpropagate errors and update parameters, it is more sensitive to these errors. The error accumulation leads to gradient explosion, which makes the GRU difficult to train. We use open-loop and closed-loop training to address this problem (Figure 6).

Figure 6.

Open-loop and closed-loop training process.

Open-loop training: In open-loop training, we initialize the hybrid model with the pre-trained weights of the FFNN. Teacher forcing is used as the training method.³⁶ When predicting time step $t + 1$, the physical model and FFNN receive the true previous state as input, which enables the GRU to reduce the one-step prediction error at each time step.

Closed-loop training: In the closed-loop training phase, the hybrid model learns to predict the states over multiple time steps without receiving the true states as input. This is achieved by using the state prediction from the previous time step as the input to the current prediction step. The training remains effective because the open-loop training phase ensures that the hybrid model already predicts accurately enough to prevent strong divergence of the GRU gradient. Closed-loop training starts from the weights obtained in open-loop training.
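The difference between the two modes can be seen on a toy scalar system. Everything in the sketch below is hypothetical: a "true" system $x_{t + 1} = 0.9 x_{t} + c_{t}$ and a deliberately imperfect one-step model with coefficient 0.85 stand in for the vehicle and the hybrid model. Only the choice of what is fed back differs between the two rollouts.

```python
def true_step(x, c):
    return 0.9 * x + c            # toy "real" system (assumption)

def model_step(x, c):
    return 0.85 * x + c           # deliberately imperfect one-step model

def rollout_errors(x0, controls, closed_loop):
    """Absolute prediction error at each step of a rollout."""
    x_true, x_in, errors = x0, x0, []
    for c in controls:
        x_pred = model_step(x_in, c)
        x_true = true_step(x_true, c)
        errors.append(abs(x_pred - x_true))
        # Closed loop feeds the prediction back; open loop (teacher forcing)
        # feeds the measured state.
        x_in = x_pred if closed_loop else x_true
    return errors

controls = [1.0] * 20
err_open = rollout_errors(0.0, controls, closed_loop=False)
err_closed = rollout_errors(0.0, controls, closed_loop=True)
```

In this toy case the open-loop error stays at the size of the one-step model mismatch, while the closed-loop error grows as each prediction inherits the error of the previous one.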

The loss function used in open-loop training phase is expressed as follows:

L_{pre} = \sum_{i = 1}^{D} \sum_{t = t_{0} + 1}^{t_{0} + T} ∥ u_{t + 1}^{physical} + u_{t + 1}^{FFNN} - (u_{t + 1} - GRU (c_{t}, u_{t + 1}^{physical}, h_{t}; θ)) ∥_{1}

(12)

u_{t + 1}^{physical} = physical model (c_{t}, u_{t})

(13)

u_{t + 1}^{FFNN} = FFNN (c_{t}, u_{t})

(14)

where, $D$ denotes the size of minibatch, $t$ denotes the time step, and $T$ denotes the total time steps of a training sample.

In the closed-loop training phase, the model does not receive the true state as input; instead, it receives the output of the previous time step as the input of the next time step. The inputs of the physical model and the FFNN therefore change to:

u_{t + 1}^{physical} = physical model (c_{t}, {\hat{u}}_{t})

(15)

u_{t + 1}^{FFNN} = FFNN (c_{t}, {\hat{u}}_{t})

(16)

Experimental setup

The Facebook PyTorch package in Python 3.6 is used for implementing and training the networks. In the FFNN training phase, we set the learning rate to 0.0001 and mini-batch to 500. In the GRU training phase, we use the exponential decay learning rate to train the model. The initial learning rate is set to $5 \times 10^{- 4}$ , and the learning rate is changed after each epoch as:

5 \times 10^{- 4} \times (lr_{decay})^{epoch - 1}

(17)
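Eq. (17) translates directly into a schedule function. The decay factor is not stated in the text, so the value 0.9 below is a hypothetical placeholder; the six epochs match the training length reported later in this paper.

```python
INITIAL_LR = 5e-4
LR_DECAY = 0.9   # hypothetical value; the decay factor is not stated in the text

def learning_rate(epoch, initial_lr=INITIAL_LR, lr_decay=LR_DECAY):
    """Learning rate for a 1-based epoch index, following Eq. (17)."""
    return initial_lr * lr_decay ** (epoch - 1)

schedule = [learning_rate(e) for e in range(1, 7)]   # six epochs
```

The first epoch uses the initial rate unchanged because the exponent is zero.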

The optimization is performed using the standard Adam optimizer.³⁷ To avoid gradient explosion during training, we clip the gradient at a maximum threshold of 10 and use a mini-batch size of 128 for each parameter update.
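One common way to apply such a threshold is to rescale the gradient whenever its overall L2 norm exceeds the limit. Whether the threshold of 10 refers to the gradient norm or to individual gradient values is not specified here, so the norm-based variant below is an assumption.

```python
import math

def clip_by_global_norm(grads, max_norm=10.0):
    """Rescale a flat list of gradients so that its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]

clipped = clip_by_global_norm([30.0, 40.0])   # norm 50 is rescaled to norm 10
unchanged = clip_by_global_norm([3.0, 4.0])   # norm 5 is left as is
```

Rescaling keeps the direction of the update and only limits its length, which bounds the step size when accumulated multi-step errors make the gradient very large.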

Model selection

Hyperparameter search is performed via grid search on the test dataset with a prediction horizon of T = 0.5 s. The initialization network, FFNN, and RNN use the same number of hidden neurons. A GRU encoder-decoder (GRU-ED) is used as the deep neural network baseline. Both the hybrid model and GRU-ED have one hidden layer, and the number of hidden neurons is chosen from 32, 64, 128, and 256. The best model is chosen based on the root mean sum-of-squared error³⁵ (RMSSE) on the test dataset. The RMSSE is defined as

RMSSE = \sqrt{\frac{1}{T_{pred} \cdot n} \sum_{i = 1}^{n} \sum_{k = 1}^{T_{pred}} e_{i}^{T} (k) e_{i} (k)}

(18)

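where $n$ is the size of the test dataset, $T_{pred}$ is the number of prediction steps, and $e_{i}(k)$ is the prediction error of test sample $i$ at prediction step $k$.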
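A direct implementation of Eq. (18) is sketched below. The nested-list layout `errors[i][k]` and the example numbers are illustrative assumptions.

```python
import math

def rmsse(errors):
    """errors[i][k] is the error vector e_i(k) of test sample i at prediction step k."""
    n = len(errors)
    t_pred = len(errors[0])
    # Sum of squared error components over all samples and prediction steps.
    total = sum(sum(c * c for c in e_k) for sample in errors for e_k in sample)
    return math.sqrt(total / (t_pred * n))

# Two samples, two prediction steps, errors ordered as [lateral velocity, yaw rate].
example_errors = [
    [[0.3, 0.4], [0.0, 0.0]],
    [[0.0, 0.0], [0.6, 0.8]],
]
value = rmsse(example_errors)
```

Because the squared errors of both states are added before the square root, the states should be on comparable scales for the metric to weight them evenly.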

As shown in Figure 7, increasing the number of hidden neurons significantly reduces the test RMSSE of both the hybrid model and GRU-ED. With additional hidden neurons, the RMSSE generally continues to decrease. However, additional parameters increase the size of the model, which increases the computational complexity and solve time and may cause overfitting. Ultimately, 128 hidden neurons are chosen because they offer good predictive performance with a shorter solve time than the larger model.

Figure 7.

Network size versus RMSSE: (a) RMSSE changes with GRU-ED parameters for high friction data and (b) RMSSE changes with hybrid model parameters for high friction data.

Evaluation metrics

In order to evaluate the prediction accuracy of the model, the mean absolute error and distribution of prediction error in each time step are analyzed. The distribution of error in each time step is represented by using boxplots. The mean absolute error in each time step is evaluated by the L1 norm:

‖ {\bar{e}}_{u_{y}, k} ‖ = \frac{1}{N} \sum_{i = 1}^{N} | u_{y, k}^{(i)} - {\hat{u}}_{y, k}^{(i)} |

(19)

‖ {\bar{e}}_{r, k} ‖ = \frac{1}{N} \sum_{i = 1}^{N} | r_{k}^{(i)} - {\hat{r}}_{k}^{(i)} |

(20)

where $u_{y, k}^{(i)}$ and $r_{k}^{(i)}$ denote the true lateral velocity and yaw rate of test sample $i$ at time step $k$, respectively; ${\hat{u}}_{y, k}^{(i)}$ and ${\hat{r}}_{k}^{(i)}$ denote the corresponding predicted states; $N$ denotes the number of test samples; and $k = 1, \dots, T$, where $T$ denotes the total number of time steps of a test sample.
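Reading Eqs. (19) and (20) as an average over the N test samples at each prediction step k, the per-step mean absolute error can be sketched as follows. The nested-list layout and the yaw-rate numbers are made up for illustration.

```python
def mae_per_step(truth, prediction):
    """truth[i][k] and prediction[i][k]: scalar state of sample i at step k."""
    n = len(truth)
    steps = len(truth[0])
    return [sum(abs(truth[i][k] - prediction[i][k]) for i in range(n)) / n
            for k in range(steps)]

true_yaw = [[0.10, 0.20, 0.30], [0.00, 0.10, 0.20]]
pred_yaw = [[0.11, 0.18, 0.34], [0.01, 0.14, 0.20]]
errors = mae_per_step(true_yaw, pred_yaw)
```

The result has one entry per prediction step, which corresponds to one column of an error table organized by prediction time.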

Result and discussion

In this section, the hybrid model prediction performance is compared with both the white-box and black-box models. It is demonstrated that the hybrid models provide a more accurate and reliable prediction than the white-box and black-box models.

Hybrid model versus physical model

Because of its simplified parameters, the physical model is not suitable for multi-step prediction. The one-step prediction performances of the hybrid model and the physical model are therefore compared in Figure 8. The results show that the hybrid model performs better than the physical model, whose predictions show a large deviation.

Figure 8.

The white-box prediction performance: (a, b) histograms of the lateral velocity and yaw rate single-step prediction errors for the white-box and hybrid model for high friction surface and (c, d) histograms of the lateral velocity and yaw rate single-step prediction errors for the white-box and hybrid model for low friction surface.

Hybrid model versus GRU encoder-decoder

Figures 9 and 10 and Tables 2–4 illustrate the mean test error over the prediction horizon and the evolution of the training cost over the training process. Each column corresponds to one prediction length, which, from left to right, is T = 0.15, 0.3, and 0.5 s. It can be observed that the hybrid model reduces the prediction error and the training cost significantly. Note that each model uses the same hyperparameters and is trained for six epochs.

Table 2.

The L1-norm error of lateral velocity and yaw rate, T = 0.15 s.

	0.01 s	0.03 s	0.06 s	0.09 s	0.12 s	0.15 s
Hybrid model/lateral velocity MAE/high friction	0.00125	0.00312	0.00525	0.00710	0.00868	0.01009
GRU-ED/lateral velocity MAE/high friction	0.00216	0.00359	0.00585	0.00830	0.01063	0.01273
Hybrid model/yaw rate MAE/high friction	0.00069	0.00146	0.00215	0.00255	0.00279	0.00296
GRU-ED/yaw rate MAE/high friction	0.00170	0.00248	0.00366	0.00477	0.00570	0.00647
Hybrid model/lateral velocity MAE/low friction	0.00246	0.00586	0.01033	0.01363	0.01692	0.01972
GRU-ED/lateral velocity MAE/low friction	0.00097	0.00308	0.00644	0.00973	0.01280	0.01561
Hybrid model/yaw rate MAE/low friction	0.00049	0.00128	0.00214	0.00269	0.00306	0.00331
GRU-ED/yaw rate MAE/low friction	0.00129	0.00223	0.00326	0.00379	0.00419	0.00449

Bold indicates hybrid model’s prediction error.

Table 3.

The L1-norm error of lateral velocity and yaw rate, T = 0.3 s.

	0.01 s	0.06 s	0.12 s	0.18 s	0.24 s	0.30 s
Hybrid model/lateral velocity MAE/high friction	0.00139	0.00562	0.00914	0.01154	0.01321	0.01438
GRU-ED/lateral velocity MAE/high friction	0.00247	0.00652	0.01151	0.01496	0.01727	0.01895
Hybrid model/yaw rate MAE/high friction	0.00066	0.00218	0.00277	0.00298	0.00305	0.00310
GRU-ED/yaw rate MAE/high friction	0.00374	0.00925	0.01412	0.01690	0.01841	0.01918
Hybrid model/lateral velocity MAE/low friction	0.00106	0.00663	0.01279	0.01756	0.02113	0.02394
GRU-ED/lateral velocity MAE/low friction	0.00517	0.01194	0.01741	0.02137	0.02422	0.02640
Hybrid model/yaw rate MAE/low friction	0.00062	0.00228	0.00312	0.00348	0.00361	0.00370
GRU-ED/yaw rate MAE/low friction	0.00240	0.00434	0.00501	0.00521	0.00527	0.00531

Bold indicates hybrid model’s prediction error.

Table 4.

The L1-norm error of lateral velocity and yaw rate, T = 0.5 s.

	0.01 s	0.1 s	0.2 s	0.3 s	0.4 s	0.5 s
Hybrid model/lateral velocity MAE/high friction	0.00073	0.00669	0.01089	0.01303	0.01418	0.01491
GRU-ED/lateral velocity MAE/high friction	0.00172	0.00718	0.01215	0.01510	0.01710	0.01856
Hybrid model/yaw rate MAE/high friction	0.00057	0.00251	0.00304	0.00304	0.00301	0.00299
GRU-ED/yaw rate MAE/high friction	0.00157	0.00410	0.00520	0.00543	0.00549	0.00559
Hybrid model/lateral velocity MAE/low friction	0.00142	0.01154	0.01989	0.02494	0.02827	0.03066
GRU-ED/lateral velocity MAE/low friction	0.00431	0.01311	0.02081	0.02581	0.02946	0.03240
Hybrid model/yaw rate MAE/low friction	0.00071	0.00304	0.00359	0.00374	0.00379	0.00381
GRU-ED/yaw rate MAE/low friction	0.00268	0.00604	0.00905	0.01121	0.01278	0.01396

Bold indicates hybrid model’s prediction error.
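To make the tabulated comparison easier to read, the short sketch below computes the relative reduction in MAE of the hybrid model with respect to the GRU-ED at the final prediction step of Table 4 (0.5 s). The numbers are copied from Table 4; the helper name `relative_reduction` is introduced here only for illustration.

```python
# Final-step (0.5 s) MAE values copied from Table 4: (hybrid model, GRU-ED).
table4_final_step = {
    "lateral velocity / high friction": (0.01491, 0.01856),
    "yaw rate / high friction": (0.00299, 0.00559),
    "lateral velocity / low friction": (0.03066, 0.03240),
    "yaw rate / low friction": (0.00381, 0.01396),
}

def relative_reduction(hybrid, baseline):
    """Fraction by which the hybrid MAE is lower than the baseline MAE."""
    return (baseline - hybrid) / baseline

reductions = {
    name: relative_reduction(hybrid, gru_ed)
    for name, (hybrid, gru_ed) in table4_final_step.items()
}

for name, value in reductions.items():
    print(f"{name}: {100 * value:.1f}% lower MAE than GRU-ED")
```

The same calculation can be repeated for any column of Tables 2 and 3 by replacing the value pairs.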

Figure 9.

Comparison between hybrid model and GRU-ED on the vehicle lateral velocity and yaw rate.

Figure 10.

The training performance of the proposed model on the actual vehicle dataset. From top to bottom, the prediction time is T = 0.15, 0.3, and 0.5 s. The left column shows the training process for high friction data and the right column shows the training process for low friction data: (a) case 1, T = 0.15 s, (b) case 2, T = 0.15 s, (c) case 1, T = 0.3 s, (d) case 2, T = 0.3 s, (e) case 1, T = 0.5 s, and (f) case 2, T = 0.5 s.

The prediction performance of the model can be further evaluated by studying the error distribution. The lateral velocity and yaw rate prediction error distributions of the hybrid model and GRU-ED are presented using boxplots in Figures 11 and 12, respectively. At each prediction time step, the yellow dotted line represents the median, the lower and upper limits of the blue rectangle correspond to the lower and upper quartiles q1 and q3, and the ends of the whiskers correspond to the extreme cases.
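The quantities drawn in such a boxplot can be computed directly from the absolute prediction errors. The sketch below is a minimal example that assumes the whiskers extend to the minimum and maximum error at each step (one reading of "extreme cases"); the function name and the array layout are hypothetical.

```python
import numpy as np

def boxplot_summary(abs_errors):
    """Per-step boxplot statistics for errors of shape (n_sequences, n_steps)."""
    abs_errors = np.asarray(abs_errors, dtype=float)
    # Quartiles and median are taken over the sequence axis, one value per step.
    q1, median, q3 = np.percentile(abs_errors, [25, 50, 75], axis=0)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Assumption: whiskers reach the smallest and largest error per step.
        "whisker_low": abs_errors.min(axis=0),
        "whisker_high": abs_errors.max(axis=0),
    }

# Toy data: five sequences, two prediction steps.
toy_errors = np.array([[0, 10], [1, 20], [2, 30], [3, 40], [4, 50]])
summary = boxplot_summary(toy_errors)
print(summary["median"], summary["q1"], summary["q3"])
```

A wide box or long whiskers at a given step indicate that the error varies strongly between test sequences, which is the uncertainty discussed in the text.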

Figure 11.

The comparison between the hybrid model and GRU-ED prediction errors. The distribution of the L1 norm of the lateral velocity prediction error is plotted for the prediction length T = 0.5 s. The plots on the left represent the hybrid model and the plots on the right represent the GRU-ED: (a) is the prediction error of lateral velocity with hybrid model for high friction data, (b) is the prediction error of lateral velocity with GRU-ED for high friction data, (c) is the prediction error of lateral velocity with hybrid model for low friction data, and (d) is the prediction error of lateral velocity with GRU-ED for low friction data.

Figure 12.

The comparison between the hybrid model and GRU-ED prediction errors. The distribution of the L1 norm of the yaw rate prediction error is plotted for the prediction length T = 0.5 s. The plots on the left represent the hybrid model and the plots on the right represent the GRU-ED: (a) is the prediction error of yaw rate with hybrid model for high friction data, (b) is the prediction error of yaw rate with GRU-ED for high friction data, (c) is the prediction error of yaw rate with hybrid model for low friction data, and (d) is the prediction error of yaw rate with GRU-ED for low friction data.

We observe that within the prediction range, that is, T = 0.5 s, the hybrid model reduces the prediction error of the lateral velocity on the low friction coefficient road surface by 50%. Similarly, the prediction error of the yaw rate is reduced by a factor of four. Over the prediction horizon, the error of the hybrid model does not change significantly, whereas the GRU-ED prediction error diverges considerably. On the high friction coefficient road surface, the hybrid model does not improve the performance significantly. The boxplots, and in particular their whisker lengths, show that both the mean prediction error and the uncertainty of the hybrid model are significantly reduced. Note that the hybrid model yields only minor errors in the early prediction stages.

The hybrid model obtains more information from the measured physical parameters than the neural network model. These parameters, such as the vehicle's mass and the wheelbase between the front and rear axles, are easily obtained and do not change. The introduction of the physical model also changes the way the neural network optimizes its weights, because the weight space is no longer explored uniformly: the regions of the weight space that are consistent with the physical model's output are explored first. Exploring these specific regions enables the model to converge faster and to reach a smaller prediction error. In the long run, it performs better than the GRU-ED.
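The idea of combining a physical prediction with a learned correction can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a linear single-track model integrated with an explicit Euler step, an additive correction term standing in for the neural network output, and hypothetical parameter values.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class VehicleParams:
    # Hypothetical values chosen only for this example.
    mass: float = 1500.0       # kg
    inertia_z: float = 2500.0  # kg m^2, yaw moment of inertia
    lf: float = 1.2            # m, center of gravity to front axle
    lr: float = 1.4            # m, center of gravity to rear axle
    cf: float = 80000.0        # N/rad, front cornering stiffness
    cr: float = 90000.0        # N/rad, rear cornering stiffness

def single_track_step(state, steer, vx, p, dt=0.01):
    """One Euler step of a linear single-track model; state = [vy, yaw_rate]."""
    vy, r = state
    alpha_f = steer - (vy + p.lf * r) / vx   # front slip angle (small angles)
    alpha_r = -(vy - p.lr * r) / vx          # rear slip angle (small angles)
    fyf = p.cf * alpha_f                     # linear tire forces
    fyr = p.cr * alpha_r
    vy_dot = (fyf + fyr) / p.mass - vx * r
    r_dot = (p.lf * fyf - p.lr * fyr) / p.inertia_z
    return np.array([vy + dt * vy_dot, r + dt * r_dot])

def hybrid_step(state, steer, vx, p, correction, dt=0.01):
    """Physical prediction plus a learned correction (assumed additive)."""
    physical = single_track_step(state, steer, vx, p, dt)
    return physical + correction(state, steer, vx)

params = VehicleParams()
# Placeholder for a trained network: here it simply returns zero correction.
no_correction = lambda state, steer, vx: np.zeros(2)
x_next = hybrid_step(np.zeros(2), steer=0.05, vx=15.0, p=params,
                     correction=no_correction)
print(x_next)
```

Because the network only has to represent the residual of the physical model, fixed and easily measured quantities such as the mass and axle distances enter through `VehicleParams` rather than having to be learned from data.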

The simulation time for the test dataset was also analyzed and compared to that of the GRU-ED. The test dataset was simulated on a computer with an i7-8700 CPU and 64 GB of RAM. Figure 13 shows the mean computation times of the hybrid model and the GRU-ED. Although the computation time of the hybrid model is higher, the overall real-time requirements can still be met.

Figure 13.

Mean computation times of the GRU-ED and the hybrid model.

Conclusion

Multi-step prediction for dynamic systems is difficult because the unmodeled parts of the system and the noise in the data lead to error accumulation and reduce the prediction accuracy. For vehicle dynamic systems, this problem is particularly important because the vehicle's driving conditions are affected by many complex phenomena that are not easy to model accurately with physical models. Data-driven modeling, however, is less interpretable because the weight parameters of the network do not correspond to the parameters of the physical model. In this work, we present a hybrid vehicle model based on the single-track model and a GRU neural network to model the lateral vehicle dynamics. To enable the GRU neural network to make accurate long-term predictions with feedback input values, a two-stage learning algorithm comprising open-loop and closed-loop training is employed. The prediction accuracy of the hybrid vehicle model is evaluated on data collected by the vehicle in a real environment. On all datasets, the hybrid model obtains the best results compared to the single-track model and the GRU-ED architecture. The analysis shows that the GRU neural network learns the single-track model's inaccuracies and provides an error value that compensates for the difference between the model output and the measured values. The accurate modeling results can be used as reliable information in automatic driving tests and calibration.
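The difference between open-loop and closed-loop prediction, and the error accumulation it causes, can be illustrated with a toy scalar system. In this sketch "open loop" is taken to mean that the measured state is fed to the model at every step, and "closed loop" that the model's own prediction is fed back; this reading of the terms, the toy dynamics, and all names are assumptions made for illustration.

```python
import numpy as np

def rollout(step_fn, measured_states, inputs, closed_loop):
    """Predict one state per input, starting from the first measured state."""
    state = measured_states[0]
    predictions = []
    for k, u in enumerate(inputs):
        # Closed loop: feed back the previous prediction.
        # Open loop: always start from the measured state at step k.
        model_input = state if closed_loop else measured_states[k]
        state = step_fn(model_input, u)
        predictions.append(state)
    return np.array(predictions)

# Toy "true" system x[k+1] = 0.9 x[k] + u[k], driven by a constant input.
n_steps = 50
inputs = np.ones(n_steps)
measured = np.zeros(n_steps + 1)
for k in range(n_steps):
    measured[k + 1] = 0.9 * measured[k] + inputs[k]

# Deliberately imperfect one-step model of the same system.
imperfect_model = lambda x, u: 0.8 * x + u

open_loop_error = np.abs(
    rollout(imperfect_model, measured, inputs, closed_loop=False) - measured[1:])
closed_loop_error = np.abs(
    rollout(imperfect_model, measured, inputs, closed_loop=True) - measured[1:])
print(open_loop_error[-1], closed_loop_error[-1])
```

In this toy case the one-step (open-loop) error stays bounded, while the fed-back (closed-loop) error grows over the horizon, which is why a model intended for multi-step prediction is also trained in closed loop.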

Several challenges remain before a robust modeling scheme based on neural networks is reached. While this work provides an accurate modeling technique, future work could improve the results further. A grid search or a more advanced method should be used to search the hyperparameter space. Furthermore, improving the quality of the training data will potentially increase the overall robustness of the model.
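A grid search of the kind mentioned above can be sketched in a few lines. The hyperparameter names, the candidate values, and the stand-in scoring function are hypothetical; in practice the scoring function would train the model and return its validation error.

```python
from itertools import product

def grid_search(evaluate, grid):
    """Evaluate every combination in `grid` and return the best one."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)  # lower is better
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_validation_error(params):
    # Stand-in for "train the model and return the validation MAE".
    return ((params["hidden_size"] - 64) ** 2 * 1e-6
            + abs(params["learning_rate"] - 1e-3))

grid = {"hidden_size": [32, 64, 128], "learning_rate": [1e-2, 1e-3, 1e-4]}
best_params, best_score = grid_search(toy_validation_error, grid)
print(best_params, best_score)
```

The cost of an exhaustive grid grows multiplicatively with each added hyperparameter, which is one reason more advanced search methods may be preferable.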

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by the National Natural Science Foundation of China (U20A20333, 52072160, 51875255, U1764264), the Key Research and Development Program of Jiangsu Province (BE2019010-2, BE2020083-3), and Jiangsu Province's six talent peaks (TD-GDZB-022).

ORCID iDs

Xuekai Yu

Xiaoqiang Sun


Hybrid physics and neural network model for lateral vehicle dynamic state prediction
