Sage Journals: Discover world-class research

Abstract

In this paper, a novel algorithm is proposed for the motion planning and path following of automated cars that incorporates a collision avoidance strategy. The approach combines an optimal reinforcement learning (RL) scheme with a new risk assessment method. For this purpose, a probabilistic function-based collision avoidance strategy is developed, and the proposed RL approach learns the probability distributions of the adjacent and leading vehicles. Subsequently, the nonlinear model predictive control (NMPC) algorithm approximates the optimal steering input and the required yaw moment to follow the safest and shortest path through the optimal RL-based probabilistic risk function framework. Additionally, the travel speed of the ego vehicle is kept stable so that ride comfort is preserved for the vehicle occupants. The steering system dynamics are also incorporated to capture the vehicle dynamics more completely. Different driving scenarios are employed to evaluate the effectiveness of the proposed algorithm.

Keywords

Automated cars, obstacle avoidance, reinforcement learning, path-planning

Introduction

Rapid advancement in vehicular technologies, together with the practical implementation of adaptive equipment, has pushed the concept of automated cars forward.^1–3 Strong motivations for employing automated cars include passenger safety, driving comfort, improved car performance, and efficiency in terms of time and infrastructure deployment.^4,5 Statistics reveal that drivers are the main cause of almost 90% of casualties in road accidents, which could potentially be avoided by employing automated cars.^6,7 To accomplish this goal, automated cars must be intelligent enough to make effective decisions and remain aware of their environment when handling severe traffic scenarios and hazardous road conditions. This is essential when the driving task is taken over entirely by the car. Optimal path planning based on road conditions, potential obstacles, and traffic regulations is crucial for automated cars. Accordingly, developing a framework that equips cars with optimized route planning algorithms based on potential road obstacles and the available road space is still an active research field.

Path-planning combined with obstacle avoidance has been extensively investigated in the literature for non-holonomic robots.^8,9 However, obstacle-avoiding path-planning for cars requires extra considerations such as road regulations, free maneuvering space, and the dynamic constraints related to the vehicle components and system states. Overall, these factors turn the path-planning and path-following problem into a hugely challenging task.¹⁰ It is also essential to plan the path in real time because obstacles can emerge on the road at any moment. Recently employed path-planning techniques for automated cars include artificial potential field (APF) methods,¹⁰ random search methods,¹¹ and variants of optimal control such as nonlinear model predictive control (NMPC).^2,4 Huang et al.¹² employed an APF approach that assigns distinct potential functions to possible obstacles and road barriers. Moreover, the obstacle-free areas were meshed and used as safe driving zones, so that the driving path was planned spatiotemporally. The APF approach can assign various potential functions to complex obstacles and road barriers and set the desired path accordingly. However, it does not necessarily account for the optimal vehicle dynamic response needed to follow the desired path. Rasekhipour et al.¹³ developed a combined model predictive and APF algorithm for planning the optimal path, in which the objectives incorporated the obstacle-based potential functions together with the vehicle dynamics constraints.

A path-planning strategy typically accounts for each individual road barrier under the operating conditions, such as the shortest vehicle-obstacle distance and whether the obstacle is visible to the approaching vehicle. Among the commonly employed strategies, a single MPC with an obstacle avoidance cost is typically slow in handling two-dimensional collision-free maneuvering. Additionally, the optimal control problem treats different types of obstacles with identical functions and ignores the road regulations. A hybrid path-planning paradigm for automated cars in constrained environments was proposed by considering various constraints related to the path geometry, vehicle dynamics, and holonomicity.¹⁴ In this approach, a non-derivative-based global search algorithm was employed to derive the higher-order state information for state sampling. Yoon et al.¹⁵ proposed a recursive path-planning algorithm that reduces the states of the search space and incorporates several factors such as the geometry of the road and the dynamics of the car. Additionally, the framework operates through two heuristic, constraint-wise node expansion approaches that correct the future path according to the available geometry and cornering space. Similar convex optimization-based attempts were undertaken to develop collision-free path-planning algorithms for automated cars by considering road geometry, vehicle states, and infrastructure constraints.^16–18 Apart from the optimality of the path, it is also crucial to consider all risks surrounding the ego vehicle and to guarantee that the intended direction is reliable and reasonably safe.
Therefore, other techniques have been reported that address obstacle collision during maneuvering by assessing the risk of the vehicle approaching the surrounding obstacles.^19–21 Kim et al.¹⁹ developed an algorithm based on potential risk assessment that evaluates the possible collision risk with obstacles in the driving situation and identifies the safest path to take. In addition, an integrated path-planning and path-following strategy was proposed in Chen et al.²² based on velocity prediction for the leading vehicle, employing a composite nonlinear feedback controller for path following and an input-output hidden Markov model (HMM) for estimating the velocity of the leading car. In this field, variants of curve shapes such as Bezier,²³ cubic,²⁴ and quintic polynomials²⁵ have proved effective in generating smooth trajectories for the vehicle during the path-planning process. Lattice-based path-planning frameworks^26,27 as well as multi-objective trajectory planning based on evolutionary algorithms^28–30 have also been proposed as potential approaches. However, such strategies mainly incorporate the kinematic constraints and lack collision risk and obstacle avoidance considerations, which limits their effectiveness in practical applications. In Liang et al.,³¹ a local motion planning framework was conceived and connected with a cruise control algorithm using adaptive MPC. Another lateral MPC controller then performed the lateral path-tracking for the global path. In Hang et al.,³² a tube-based MPC method was used to control autonomous electric vehicles with active safety considerations at driving limits through four-wheel steering (4WS) and direct yaw-moment control (DYC).
Because the sideslip angle, as a critical state, plays a significant role in achieving the desired tire forces for controlling the vehicle, sideslip angle estimation was improved in Xia et al.³³ by combining vehicle kinematic and dynamic states and feeding them to a fuzzy logic system. Other hybridized local and global path-planning methods, such as the visibility graph method, have also been introduced to control autonomous vehicles, combined with NMPC algorithms for path-tracking purposes.³⁴ Despite the particular merits of the reviewed approaches, path planning in situations where the existence of obstacles and barriers is unknown, or where no information exists on their dynamic states, remains a challenge for optimal path planning.

The reviewed literature indicates that optimal path-planning and predictive models have been widely employed so far. However, these algorithms lack dynamic interaction with the surrounding environment and therefore cannot plan the optimal path adaptively. Furthermore, expressing the varying growth of the risk function through probabilistic functions offers greater reliability and generality. Moreover, reinforcement learning-based risk assessment is broadly considered a suitable solution for the collision-free path-planning of automated cars. In this paper, the path planning of automated cars is explored with the following main contributions: (i) a novel optimal path-planner paradigm for avoiding obstacle collision is proposed by employing an optimal reinforcement learning algorithm combined with probabilistic risk assessment, and (ii) the potential risks of car-obstacle collision, based on a growth function, are considered to be uniformly distributed but unequal in magnitude. Hence, the proposed algorithm combines the merits of unstructured obstacle avoidance control according to the nearest neighbor principle and the deep learning benefit of the RL algorithm with the optimality search of a nonlinear model predictive control (NMPC) paradigm. The optimal path is computed based on the approaching obstacles, road structures, and the dynamic response of the automated car in terms of the constrained inputs and states.

The structure of the present paper is laid out as follows. In Section II, the dynamics of an automated car are formulated. Section III presents the path-planning problem, prospective obstacles, and constraints, and the probabilistic risk assessment. Section IV describes the optimal RL-algorithm. In Section V, the path planning paradigm is investigated based on numerous simulations under various operating situations, and the results are discussed in further detail. Finally, Section VI concludes the paper.

Problem formulation

The dynamic response of a vehicle depends closely on the directional forces and resulting moments generated by the pneumatic tires. However, only the lateral tire forces have a substantial effect on the handling performance of the vehicle, whereas the longitudinal force components affect the handling dynamics negligibly. The longitudinal acceleration and the resultant force components of the car are therefore neglected. Nevertheless, the vehicle’s longitudinal speed must be adequate for producing lateral forces in proportion to the slip angles, in line with well-known tire models. Furthermore, the roll effect of the lateral weight transfer during cornering is considered negligible owing to the adequately adjusted suspension setting. Therefore, a bicycle model with two degrees of freedom is used to describe the main dynamic modes of the vehicle in the yaw plane of motion, exploiting the symmetry between the right- and left-side tracks (Figure 1). For vehicle yaw stability, the yaw angle, also called the heading angle, is taken as the controlled parameter. Furthermore, keeping the vehicle yaw velocity $γ$ in the vicinity of the desired value that professional drivers achieve is a substantial step in the functional control of vehicles during cornering. The lateral offset error in automated cars can be defined as the shortest distance between the desired trajectory and the vehicle, obtained as an orthogonal projection. The vehicle yaw rate $γ$ is the first-order time derivative of the heading angle $φ$ , and the difference between the actual yaw rate and the desired yaw rate $γ_{d}$ defines the yaw rate error $γ_{e}$ . Consequently, the vehicle dynamics for the path-following task can be formulated as:

\begin{cases} γ_{e} = γ - γ_{d} = γ - v_{x} / R (ρ), \\ \dot{y} = v_{x} \sin φ + v_{y} \cos φ, \\ \dot{φ} = γ, \end{cases}

(1)

Figure 1.

Yaw-plane vehicle bicycle model due to the symmetricity between the right and left tracks.

where $R (ρ)$ represents the road radius of curvature and $ρ$ is the arc length of the position to be tracked instantaneously, which varies as a function of the road trajectory. Furthermore, $v_{x}$ and $v_{y}$ denote the vehicle’s longitudinal and lateral speed components described in the body-attached reference system. Moreover, $y$ and $\dot{y}$ denote the lateral displacement and velocity of the vehicle at the center of gravity (C.G.), respectively. A primary goal in developing automated cars is to guarantee the minimum lateral offset of the car by designing a controller that drives $γ_{e}$ and $\dot{y}_{e}$ to zero, where $\dot{y}_{e} = \dot{y} - \dot{y}_{d}$ and $y_{d}$ is the desired trajectory of the car in the lateral direction. Consequently, the vehicle must be able to follow the desired trajectory satisfactorily. Integrated Active Front Steering (AFS) coupled with Direct Yaw-Moment Control (DYC) has shown remarkable benefits within different variants of control approaches.^35,36 Thus, a 2-DOF yaw model of the vehicle can adequately predict the dominant responses for the purpose of path-planning and trajectory following. The governing equations can be written as:

\begin{cases} \dot{v}_{y} = \frac{1}{m} (F_{yf} + F_{yr}) - v_{x} γ, \\ \dot{γ} = \frac{1}{I_{z}} (F_{yf} l_{f} - F_{yr} l_{r} + Δ T), \end{cases}

(2)

where $Δ T$ represents the acting yaw moment, and $F_{yf}$ and $F_{yr}$ are the lateral tire forces at the front and rear wheels, respectively. Furthermore, $l_{f}$ and $l_{r}$ are the distances from the C.G. to the front and rear axles, $m$ is the total mass, and $I_{z}$ is the yaw inertia. The yaw moment generated with the track width $l_{b}$ is described by:

Δ T = \sum_{i} \sum_{j = 1}^{2} {(- 1)}^{j} F_{xij} \frac{l_{b}}{2}, \quad i = f, r

(3)

where $F_{xij}$ represents the longitudinal force exerted on the wheels of the front and rear axles. The tire lateral force is proportional to the side slip angle. Therefore, the lateral force components can be computed from the front and rear tire cornering stiffness parameters (i.e. $C_{f}$ and $C_{r}$ ):

\begin{matrix} F_{yf} = C_{f} α_{f}, F_{yr} = C_{r} α_{r}, \end{matrix}

(4)

The nonlinear cornering characteristics of the tire may be captured as an uncertainty about the nominal tire cornering stiffness as follows:

\begin{matrix} C_{f} = {\tilde{C}}_{f} + Δ C_{f}, C_{r} = {\tilde{C}}_{r} + Δ C_{r}, \end{matrix}

(5)

where ${\tilde{C}}_{f}$ and ${\tilde{C}}_{r}$ represent the nominal cornering stiffness values of the front and rear tires, respectively. The nominal cornering stiffness terms describe the linear region of the tire force-deflection curve, and $Δ C_{f}$ and $Δ C_{r}$ describe the bounded uncertainties in the cornering stiffness of the front and rear tires, respectively. Additionally, the side slip angles of the front and rear tires can be described as:

\begin{matrix} α_{f} = \tan^{- 1} \left[ \frac{v_{x} \sin (β) + l_{f} γ}{v_{x} \cos (β)} \right] - δ_{f}, \\ α_{r} = \tan^{- 1} \left[ \frac{v_{x} \sin (β) - l_{r} γ}{v_{x} \cos (β)} \right], \end{matrix}

(6)

We also incorporated the steering system dynamics in the present study to explore the vehicle response thoroughly. The steering system dynamics can drastically affect the vehicle’s dynamic response and its capacity to follow the path produced by the developed algorithm. Considering the steering system dynamics shown in Figure 2, the moment produced about the kingpin by the tire lateral force is obtained as:

\begin{matrix} T_{s} = 2 (σ_{c} + σ_{n}) C_{f} α_{f}, \end{matrix}

(7)

Figure 2.

EPS-based steering system model.

where $σ_{c}$ and $σ_{n}$ define the caster and pneumatic trails, respectively. Substituting $α_{f}$ from equation (6) into equation (7) gives:

T_{s} = 2 (σ_{c} + σ_{n}) C_{f} \left\{ \tan^{- 1} \left[ \frac{v_{x} \sin (β) + l_{f} γ}{v_{x} \cos (β)} \right] - δ_{f} \right\},

(8)

Note that this moment acts as an external load on the front wheels and the attached steering system. Hence, the equation of motion for the steering system accounts for the rotations transferred through the kingpin:

\begin{matrix} I_{z} \left( \frac{d^{2} δ}{d t^{2}} + \frac{d γ}{dt} \right) + C_{s} \frac{d δ}{dt} - K_{s} (ϑ - δ) = \\ 2 (σ_{c} + σ_{n}) C_{f} \left\{ \arctan \left[ \frac{v_{x} \sin (β) + l_{f} γ}{v_{x} \cos (β)} \right] - δ \right\}, \end{matrix}

(9)

Since the rotational acceleration is measured relative to absolute space while the steering system itself is moving, the term $d γ / dt$ appears in this model. Nevertheless, the relation $d^{2} δ / d t^{2} \gg d γ / dt$ holds for regular passenger cars.^37,38 Therefore, a dynamic model for the steering system can be expressed as:

C_{s} \dot{δ} = 2 (σ_{c} + σ_{n}) C_{f} \left\{ \arctan \left[ \frac{v_{x} \sin (β) + l_{f} γ}{v_{x} \cos (β)} \right] - δ \right\} + K_{s} (ϑ - δ),

(10)

The equations of motion associated with the vehicle’s lateral dynamics, in terms of the vehicle lateral speed, the yaw rate, and the steering system, can be restructured as follows:

\begin{matrix} \dot{v}_{y} = \frac{C_{f}}{m} \left\{ \arctan \left[ \frac{v_{x} \sin β + l_{f} γ}{v_{x} \cos β} \right] - δ \right\} \\ + \frac{C_{r}}{m} \arctan \left[ \frac{v_{x} \sin β - l_{r} γ}{v_{x} \cos β} \right] - v_{x} γ, \end{matrix}

(11)

\begin{matrix} \dot{γ} = \frac{l_{f} C_{f}}{I_{z}} \left\{ \arctan \left[ \frac{v_{x} \sin β + l_{f} γ}{v_{x} \cos β} \right] - δ \right\} \\ + \frac{l_{r} C_{r}}{I_{z}} \arctan \left[ \frac{v_{x} \sin β - l_{r} γ}{v_{x} \cos β} \right] + \frac{Δ T}{I_{z}}, \end{matrix}

(12)

\dot{δ} = 2 (σ_{c} + σ_{n}) C_{f} C_{s}^{- 1} \left\{ \arctan \left[ \frac{v_{x} \sin β + l_{f} γ}{v_{x} \cos β} \right] - δ \right\} + K_{s} C_{s}^{- 1} (ϑ - δ) .

(13)

Accordingly, the general state-space representation of the system dynamics is derivable as follows:

\underline{\dot{ξ}} = \underline{G} (\underline{ξ}) + \underline{H} (\underline{ξ}) \underline{U},

(14)

where $\underline{ξ} = {[v_{y}, γ, δ]}^{T}$ represents the states of the system, $\underline{G} = {[g_{1}, g_{2}, g_{3}]}^{T}$ denotes the nonlinear but bounded system function, $\underline{H} = {[h_{1}, h_{2}]}^{T}$ contains the control function terms, and $\underline{U} = {[u_{1}, u_{2}]}^{T}$ is the control input to the system. The corresponding subfunctions are obtained as:

\begin{matrix} g_{1} = \frac{C_{f}}{m} \left\{ \arctan \left[ \frac{v_{x} \sin β + l_{f} γ}{v_{x} \cos β} \right] - δ \right\} \\ + \frac{C_{r}}{m} \arctan \left[ \frac{v_{x} \sin β - l_{r} γ}{v_{x} \cos β} \right] - v_{x} γ, \end{matrix}

(15)

\begin{matrix} g_{2} = \frac{l_{f} C_{f}}{I_{z}} \left\{ \arctan \left[ \frac{v_{x} \sin β + l_{f} γ}{v_{x} \cos β} \right] - δ \right\} \\ + \frac{l_{r} C_{r}}{I_{z}} \arctan \left[ \frac{v_{x} \sin β - l_{r} γ}{v_{x} \cos β} \right], \end{matrix}

(16)

g_{3} = 2 (σ_{c} + σ_{n}) C_{f} C_{s}^{- 1} \left\{ \arctan \left[ \frac{v_{x} \sin β + l_{f} γ}{v_{x} \cos β} \right] - δ \right\} - K_{s} C_{s}^{- 1} δ,

(17)

Furthermore, it can be stated that $h_{1} = 1 / I_{z}$ and $h_{2} = K_{s} {C_{s}}^{- 1}$ , $u_{1} = Δ T$ and $u_{2} = ϑ$ .
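As a minimal sketch of how the state-space model could be evaluated numerically, the following Python function returns the state derivative for $\underline{ξ} = [v_{y}, γ, δ]^{T}$ by composing equations (6), (4), (2), and (13). The names `VehicleParams`, `slip_angles`, and `state_derivative` are hypothetical, all numeric values are placeholders rather than the parameters of Table 1, and the sideslip angle is assumed to be $β = \arctan (v_{y} / v_{x})$ with a constant, positive $v_{x}$.

```python
import math
from dataclasses import dataclass


@dataclass
class VehicleParams:
    # Illustrative placeholder values only (assumptions, not the paper's data).
    m: float = 1500.0      # total mass [kg]
    I_z: float = 2500.0    # yaw inertia [kg m^2]
    l_f: float = 1.2       # C.G. to front axle [m]
    l_r: float = 1.4       # C.G. to rear axle [m]
    C_f: float = 60000.0   # front cornering stiffness [N/rad]
    C_r: float = 60000.0   # rear cornering stiffness [N/rad]
    C_s: float = 50.0      # steering damping coefficient
    K_s: float = 500.0     # steering stiffness coefficient
    sigma_c: float = 0.03  # caster trail [m]
    sigma_n: float = 0.02  # pneumatic trail [m]


def slip_angles(v_x, v_y, gamma, delta, p):
    """Front and rear slip angles as written in equation (6); needs v_x > 0."""
    beta = math.atan2(v_y, v_x)  # assumed definition of the sideslip angle
    den = v_x * math.cos(beta)
    a_f = math.atan((v_x * math.sin(beta) + p.l_f * gamma) / den) - delta
    a_r = math.atan((v_x * math.sin(beta) - p.l_r * gamma) / den)
    return a_f, a_r


def state_derivative(xi, u, v_x, p):
    """Return [dv_y, dgamma, ddelta] for xi = [v_y, gamma, delta], u = [dT, theta]."""
    v_y, gamma, delta = xi
    dT, theta = u
    a_f, a_r = slip_angles(v_x, v_y, gamma, delta, p)
    F_yf, F_yr = p.C_f * a_f, p.C_r * a_r                  # equation (4)
    dv_y = (F_yf + F_yr) / p.m - v_x * gamma               # equation (2)
    dgamma = (F_yf * p.l_f - F_yr * p.l_r + dT) / p.I_z    # equation (2)
    ddelta = (2.0 * (p.sigma_c + p.sigma_n) * F_yf
              + p.K_s * (theta - delta)) / p.C_s           # equation (13)
    return [dv_y, dgamma, ddelta]
```

A fixed-step integrator (for instance forward Euler with a small step) could call `state_derivative` repeatedly to roll the model forward inside a prediction horizon; the choice of integrator is left open here.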

Path-planning and probabilistic risk assessment

Path considerations and risk assessment

For the automated cars to follow the intended path, a modified model according to the Frenet–Serret differential geometry can be developed. Assuming $θ$ denotes the independent parameterization variable and the intended path curve $D (x_{d}, y_{d})$ , the desired position vector is denoted by $\underline{c_{d}} (θ (t)) = {[x_{d} (θ (t)), y_{d} (θ (t))]}^{T}$ . Moreover, it is obvious that $‖ \underline{c_{d}} (θ (t)) ‖ \geq 0 \forall t \geq 0$ and that $\underline{v_{d}} (t) = \underline{{\overset{\cdot}{c}}_{d}} (t)$ and $\frac{ds}{dt} = ‖ \underline{v_{d}} (t) ‖$ where $s$ represents the curvilinear coordinate (arc-length) related to point $Ω$ in the direction of the traveling path from a predetermined initial position. Therefore, $s$ can be expressed as:

s \overset{Δ}{=} \int_{θ_{0}}^{θ} ‖ \frac{d \underline{c_{d}}}{d \tilde{θ}} ‖ d \tilde{θ},

(18)

Since $‖ d \underline{c_{d}} / d \tilde{θ} ‖$ is always nonzero, it follows that:

s = \int_{θ_{0}}^{θ} \sqrt{{(\frac{d x_{d}}{d \tilde{θ}})}^{2} + {(\frac{d y_{d}}{d \tilde{θ}})}^{2}} d \tilde{θ},

(19)
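The arc-length integral of equation (19) rarely has a closed form for a general path, so it is usually approximated numerically. The sketch below sums chord lengths along a finely sampled path; the function name `arc_length`, the sample count, and the quarter-circle test path are illustrative assumptions.

```python
import math


def arc_length(x_d, y_d, theta0, theta1, n=2000):
    """Approximate s in equation (19) by summing chord lengths of a fine polyline."""
    s = 0.0
    x_prev, y_prev = x_d(theta0), y_d(theta0)
    for i in range(1, n + 1):
        th = theta0 + (theta1 - theta0) * i / n
        x, y = x_d(th), y_d(th)
        s += math.hypot(x - x_prev, y - y_prev)
        x_prev, y_prev = x, y
    return s


# Hypothetical example path: a quarter circle of radius 50 m,
# whose exact arc length is 50 * pi / 2.
radius = 50.0
s_quarter = arc_length(lambda th: radius * math.cos(th),
                       lambda th: radius * math.sin(th),
                       0.0, math.pi / 2)
```

Increasing `n` tightens the approximation, since each chord slightly underestimates the true arc it spans.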

Hereafter, the path is parameterized by $s$ , with $φ_{d} (s)$ representing the angle between the unit vector tangent to the path, $T (s)$ , and the X-axis. The tangential vector is represented as:

T = {[T_{x}, T_{y}]}^{T} = {[\cos φ_{d}, \sin φ_{d}]}^{T},

(20)

Differentiating (20) with respect to the variable $s$ :

\frac{\partial T}{\partial s} = \frac{\partial T}{\partial φ_{d}} \frac{\partial φ_{d}}{\partial s} = {[- \sin φ_{d}, \cos φ_{d}]}^{T} \frac{\partial φ_{d}}{\partial s},

(21)

Additionally, the curvature vector is defined as $K (s) = \frac{\partial T}{\partial s}$ and the scalar curvature as $k (s) = \frac{\partial φ_{d}}{\partial s}$ . It follows that $‖ k (s) ‖ = ‖ K (s) ‖$ since $‖ \frac{\partial T}{\partial s} ‖ = ‖ \frac{\partial φ_{d}}{\partial s} ‖$ , which leads to the following kinematic equality²:

\underline{\dot{Ξ}} = {(Λ (Ω, k))}^{- 1} R (φ_{e}) \underline{Π},

(22)

where $\underline{Ξ} = {[s, Ω, φ_{e}]}^{T}$ ,

Λ = \begin{bmatrix} 1 - Ω k & 0 & 0 \\ 0 & 1 & 0 \\ k & 0 & 1 \end{bmatrix}, \quad R (φ_{e}) = \begin{bmatrix} \cos φ_{e} & - \sin φ_{e} & 0 \\ \sin φ_{e} & \cos φ_{e} & 0 \\ 0 & 0 & 1 \end{bmatrix},

(23)

$\underline{Π} = {[v_{x}, v_{y}, r]}^{T}$ , $φ_{e} = φ - φ_{d}$ and $φ$ is the vehicle heading angle. Assuming a scenario with two lanes that can be represented by a cubic spline $c (Ω)$ , the position in the direction of the spline can be parametrically expressed as follows:

c (Ω) = [x (Ω), y (Ω)], Ω \in [0, t_{\max}],

(24)
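The kinematic relation of equations (22) and (23) can be evaluated without an explicit matrix inverse because $Λ$ is lower triangular. The sketch below assumes that the left-hand side of equation (22) is read as the time rate of $[s, Ω, φ_{e}]^{T}$ , with $Ω$ the lateral offset from the path and $r$ the yaw rate; the function name `frenet_rates` is hypothetical.

```python
import math


def frenet_rates(Omega, phi_e, k, v_x, v_y, r):
    """Solve Lambda * [s_dot, Omega_dot, phi_e_dot]^T = R(phi_e) * [v_x, v_y, r]^T."""
    # First two rows of R(phi_e) * Pi, using the rotation matrix of equation (23).
    rhs_s = v_x * math.cos(phi_e) - v_y * math.sin(phi_e)
    rhs_offset = v_x * math.sin(phi_e) + v_y * math.cos(phi_e)
    scale = 1.0 - Omega * k
    if abs(scale) < 1e-9:
        raise ValueError("singular configuration: 1 - Omega * k is zero")
    # Lambda is lower triangular, so forward substitution inverts it.
    s_dot = rhs_s / scale            # row 1: (1 - Omega k) s_dot = rhs_s
    Omega_dot = rhs_offset           # row 2: Omega_dot = rhs_offset
    phi_e_dot = r - k * s_dot        # row 3: k s_dot + phi_e_dot = r
    return s_dot, Omega_dot, phi_e_dot
```

For a vehicle travelling exactly on a path of curvature $k$ with yaw rate $r = k v_{x}$ , the heading error rate evaluates to zero, which is a quick sanity check on the sign conventions.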

For any spline $c (Ω)$ , a number of disjoint unobserved segments are assumed to exist. Sampling independently from uniform distributions over these unobserved segments, the position and speed of a possible hidden vehicle are expressed as:

\begin{matrix} d \sim U (\bigcup_{i = 1} [d_{\min} (i), d_{\max} (i)]), \\ v \sim U ([v_{\min}^{*} (i), v_{\max}^{*} (i)]), \end{matrix}

(25)

where $[., .]$ denotes a closed interval between two real numbers, $U (.)$ denotes a uniform distribution on the set, $d_{\min} (i)$ and $d_{\max} (i)$ represent the starting and final positions of unobserved segment $i$ on spline $c (Ω)$ , and $v_{\min}^{*} (i)$ and $v_{\max}^{*} (i)$ represent the minimum and maximum speeds of other vehicles, respectively. Assuming that each car travels at a constant speed, all segments can be propagated forward in time by $t_{\max}$ seconds:

\tilde{d} (k) = d (k) + v (k) t_{\max},

(26)

where $\tilde{d} (k)$ is the position of the car in the k-th segment after $t_{\max}$ seconds. To account for the size of vehicles and the permissible lateral displacements within the lane, an offset term $Γ (k)$ is drawn from a uniform distribution for each road segment, in Cartesian space perpendicular to the spline. A uniform distribution is used because no prior information is assumed about the lateral location of the adjacent vehicles.
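The sampling and propagation steps of equations (25) and (26) can be sketched as follows. The function name `sample_hidden_vehicles` and the tuple layout `(d_min, d_max, v_min, v_max)` are assumptions made for illustration, and choosing a segment with probability proportional to its length is one way to make $d$ uniform over the union of disjoint segments.

```python
import random


def sample_hidden_vehicles(segments, t_max, n_samples, rng):
    """Draw (d, v) as in equation (25) and propagate each sample with equation (26).

    segments: list of (d_min, d_max, v_min, v_max) tuples, one per unobserved segment.
    """
    lengths = [d_max - d_min for d_min, d_max, _, _ in segments]
    propagated = []
    for _ in range(n_samples):
        # Weighting by segment length keeps d uniform over the union of segments.
        d_min, d_max, v_min, v_max = rng.choices(segments, weights=lengths, k=1)[0]
        d = rng.uniform(d_min, d_max)
        v = rng.uniform(v_min, v_max)
        propagated.append(d + v * t_max)  # constant-speed propagation
    return propagated


# Hypothetical scenario: two occluded stretches of road, speeds of 5 to 15 m/s.
rng = random.Random(0)
positions = sample_hidden_vehicles(
    [(0.0, 10.0, 5.0, 15.0), (30.0, 40.0, 5.0, 15.0)],
    t_max=2.0, n_samples=500, rng=rng)
```

Every propagated position lies between the earliest segment start moved at the lowest speed and the latest segment end moved at the highest speed, which gives a simple bound for checking the sampler.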

r (Ω) := \left[ \frac{\partial x (Ω)}{\partial Ω}, \frac{\partial y (Ω)}{\partial Ω} \right] \begin{bmatrix} 0 & {(- 1)}^{m + n} \\ {(- 1)}^{m + n} & 0 \end{bmatrix},

(27)

where m and n are the indices related to the columns and rows of the matrix elements, respectively. Therefore, the probability function is expressed as:

P (Ω) = c (\tilde{d} (k)) + Γ (k) r (\tilde{d} (k)) {({‖ r (\tilde{Ω} (k)) ‖}_{2})}^{- 1},

(28)

where $‖ . ‖_{2}$ is the 2-norm of a vector, $r (Ω)$ is the non-normalized vector perpendicular to $c (Ω)$ , and $Γ (k)$ is the maximum deviation among the segments alongside the spline curve. The remainder of this part illustrates how the risk described above is employed in practice. To this end, the risk assessment is combined with an optimization-based planning algorithm, and improvements in both safety and ride comfort are demonstrated. The simple path-planner employed here can be substituted by any other cost-function-based path-planner because the method is agnostic to the planning technique. It is assumed that the ego vehicle speed at time $t$ is $v_{ego}$ and that the target path $c (Ω)$ is a cubic spline parameterized by the position along the spline. The path-planning algorithm first evaluates the safety cost $J_{a_{ego}}$ corresponding to the acceleration or deceleration $a_{ego}$ as $J_{a_{ego}} = \sum_{i = 1}^{H_{p}} (h_{a_{ego}})$ , where $h_{a_{ego}}$ is the potential function of the vehicle, which is positive when the vehicle $P (Ω)$ is within the ego lane and zero otherwise, and is defined as follows:

h_{a_{ego}} = \begin{cases} \exp \left( - \frac{{Q (a_{ego})}^{2}}{σ^{2}} \right), & \text{if } z (k) \leq Γ (k), \\ 0, & \text{otherwise} . \end{cases}

(29)

Additionally, the following kinematic equalities can be described as³⁹:

\tilde{d} {(k)}_{ego} : = d {(k)}_{ego} + v {(k)}_{ego} t_{\max} + 0.5 a_{ego} {t_{\max}}^{2},

(30)

Q (a_{ego}) : = ‖ c (\tilde{d} {(k)}_{ego}) - P (Ω) ‖_{2},

(31)

z (k) : = \inf_{s} ‖ c (d (k)) - P (Ω) ‖_{2},

(32)

where $\tilde{d} {(k)}_{ego}$ denotes the expected position of the ego car along its route, $Q (a_{ego})$ represents the distance between $P (Ω)$ and $c (\tilde{d} {(k)}_{ego})$ , $z (k)$ is the minimum distance between the vehicle $P (Ω)$ and the ego car’s intended route $c (d (k))$ , and $σ$ is the bandwidth of the repulsive potential field. Figure 3 shows how a sample collision risk probability function varies in different orientations.

Figure 3.

Sample collision risk probability function value variations in different orientations.
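To make equations (29) to (32) concrete, the sketch below evaluates the repulsive potential for one sampled vehicle position and one candidate acceleration. The function name `potential`, the straight-road path, and all numeric values are illustrative assumptions; the infimum in equation (32) is approximated by a minimum over a grid of route positions supplied by the caller.

```python
import math


def potential(a_ego, d_ego, v_ego, t_max, c, P, Gamma, sigma, route_grid):
    """Repulsive potential of equation (29) for one sampled vehicle position P.

    c maps an arc position to an (x, y) point on the ego route;
    route_grid lists the arc positions used to approximate the infimum in (32).
    """
    # Equation (30): predicted ego arc position after t_max seconds.
    d_tilde = d_ego + v_ego * t_max + 0.5 * a_ego * t_max ** 2
    # Equation (31): distance between the predicted ego position and P.
    Q = math.dist(c(d_tilde), P)
    # Equation (32): closest approach of P to the intended route.
    z = min(math.dist(c(d), P) for d in route_grid)
    return math.exp(-(Q ** 2) / sigma ** 2) if z <= Gamma else 0.0


# Hypothetical straight road along the x-axis, sampled every 0.5 m up to 60 m.
road = lambda d: (d, 0.0)
grid = [0.5 * i for i in range(121)]
common = dict(d_ego=0.0, v_ego=10.0, t_max=2.0, c=road,
              Gamma=1.5, sigma=5.0, route_grid=grid)
h_keep = potential(0.0, P=(20.0, 0.0), **common)    # ego ends on top of the vehicle
h_brake = potential(-5.0, P=(20.0, 0.0), **common)  # braking leaves a 10 m gap
h_other = potential(0.0, P=(20.0, 5.0), **common)   # vehicle outside the ego lane
```

In this toy case the potential drops from its maximum when the ego car brakes and vanishes for a vehicle that is laterally outside the offset $Γ (k)$ , which is the behavior the cost term is meant to reward.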

NMPC path-tracking

Constraints are placed on the vehicle side-slip angle and the yaw rate to guarantee cornering stability. The slip-angle constraint is intended to keep the vehicle away from the tire-road adhesion limits, because the slip angle depends strongly on the road condition and varies under different adhesion characteristics. Accordingly, the constraints imposed on the yaw-rate and slip-angle states of the vehicle are as follows:

\begin{cases} γ_{\min} \leq γ_{k + i, t} \leq γ_{\max}, \\ β_{\min} \leq β_{k + i, t} \leq β_{\max} . \end{cases}

(33)

The vehicle’s total acceleration/deceleration performance is directly related to the tire-road adhesion characteristics. Such value is bounded by $μ g$ where $μ$ represents the tire/road friction coefficient and $g$ denotes the gravitational acceleration. Accordingly, the acceleration absolute value is expressed as:

\sqrt{{\overset{\cdot}{v}}_{y}^{2} + {\overset{\cdot}{v}}_{x}^{2}} \leq μ g,

(34)

Assuming that the vehicle longitudinal acceleration is negligible ${\overset{\cdot}{v}}_{x} ≅ 0$ , equation (34) is simplified to:

‖ {\overset{\cdot}{v}}_{y} ‖_{2} \leq μ g,

(35)

From equation (35), it follows that the control input required to keep the vehicle on the preplanned path is constrained for a specific operating condition. As a result, the steering angle, the direct yaw moment control (DYC) input, and their variations are constrained as follows:

\begin{cases} {ϑ_{f}}_{\min} \leq {ϑ_{f}}_{k + i, t} \leq {ϑ_{f}}_{\max}, \\ Δ {ϑ_{f}}_{\min} - ε_{1} \leq Δ {ϑ_{f}}_{k + i, t} \leq Δ {ϑ_{f}}_{\max} + ε_{1}, \\ Δ T_{\min} \leq Δ T_{k + i, t} \leq Δ T_{\max}, \\ Δ T_{\min} - ε_{2} \leq Δ T_{k + i, t} \leq Δ T_{\max} + ε_{2}, \end{cases}

(36)

where $| ε_{1} |$ and $| ε_{2} |$ define the maximum allowable magnitude of the control input perturbation. The NMPC problem is then posed as the following constrained optimization:

\begin{matrix} \min_{u} J (x_{d} (k), u_{d} (k)) = \sum_{i = 0}^{H_{p}} ‖ y_{d} (k + i | k) - w_{ref} (k + i | k) ‖_{Q_{i}}^{2} + \\ \sum_{i = 1}^{H_{c} - 1} ‖ Δ u ‖_{R_{i}}^{2} + J_{a_{ego}} + Ξ ε^{2} \end{matrix}

(37)

\begin{matrix} \text{s.t.} \quad x_{d} (k + i + 1) = f (x_{d} (k + i + 1)) + g (u_{d} (k + i + 1)), \quad i = 0, 1, \dots, H_{p} - 1 \\ y_{d} (k + i + 1) = x_{d} (k + i), \quad i = 0, 1, \dots, H_{p} - 1 \\ u_{\min} \leq u_{k + i, t} \leq u_{\max}, \quad i = 0, 1, \dots, H_{c} - 1 \\ Δ u_{\min} \leq Δ u_{k + i, t} \leq Δ u_{\max}, \quad i = 0, 1, \dots, H_{c} - 1 \\ y_{\min} - ζ \leq {y_{d}}_{k + i, t} \leq y_{\min} + ζ, \quad i = 0, 1, \dots, H_{p} \\ 0 \leq ζ < 1, \end{matrix}

(38)

where $Q$ is the weighting matrix for the difference between the actual vehicle path and the planned path, and the weighting matrix associated with the control input is denoted by $R$ . Moreover, the intended path vector, the control input, and the control increment are denoted by $w_{ref} (k)$ , $u (k)$ , and $Δ u (k)$ , respectively. In addition, $Ξ$ weights the slack variable $ε$ , and the term $Ξ ε^{2}$ penalizes violations of the constrained tire slip angle.
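The structure of the cost in equation (37) can be illustrated with a scalar-output sketch in which the weighting matrices reduce to constant scalars `q` and `r`. The function names `nmpc_cost` and `within_bounds`, the scalar simplification, and the sample numbers are assumptions for illustration; an actual NMPC solver would minimize this cost over the input sequence subject to the constraints of equation (38).

```python
def nmpc_cost(y_pred, w_ref, du, q, r, J_safety, slack, slack_weight):
    """Scalar-output form of equation (37): tracking + effort + safety + slack."""
    tracking = sum(q * (y - w) ** 2 for y, w in zip(y_pred, w_ref))
    effort = sum(r * d ** 2 for d in du)
    return tracking + effort + J_safety + slack_weight * slack ** 2


def within_bounds(value, lo, hi, slack=0.0):
    """Softened box constraint in the style of equation (36)."""
    return lo - slack <= value <= hi + slack


# Hypothetical two-step horizon with one control increment.
J = nmpc_cost(y_pred=[1.0, 2.0], w_ref=[0.0, 0.0], du=[1.0],
              q=2.0, r=3.0, J_safety=0.5, slack=0.1, slack_weight=100.0)
```

With these numbers the tracking term contributes 10, the effort term 3, the safety term 0.5, and the slack penalty 1, so a large slack weight makes constraint softening expensive relative to tracking error.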

Reinforcement learning algorithm

Conventional reinforcement learning (RL) models are mostly described through an agent that interacts dynamically with its environment through perception and action. At each step of the interaction, the agent receives an input $i$ that indicates the current state $s$ of the environment. The agent then chooses an action $α$ to generate as output. The action changes the state of the environment, and the value of this state transition is communicated to the agent through a scalar reinforcement signal $r$ . The agent's behavior $B$ should choose actions that tend to increase the long-run sum of the values of $r$ . Additionally, $S$ and $A$ denote the discrete sets of environment states and agent actions. In this context, the problem of delayed reinforcement is modeled as a Markov Decision Process (MDP).³⁹ It consists of a reward function $R$ $(R : S \times A \to ℜ)$ and a state transition function $T$ , where $T : S \times A \to Π (S)$ and a member of $Π (S)$ is a probability distribution over the set $S$ . Accordingly, $T (s, s^{'}, α)$ is the probability of making a transition from state $s$ to state $s^{'}$ under action $α$ .

To account for the longer-term reward of the agent, the infinite-horizon discounted model is applied, in which future rewards are geometrically discounted by a discount factor $χ$ $(0 \leq χ < 1)$ , so that the agent maximizes $E (\sum_{t = 0}^{\infty} χ^{t} r_{t})$ . The optimal value of a state is the expected infinite discounted sum of rewards that the agent gains when it starts in that state and follows the optimal policy $π$ :

V^{*} (s) = \max_{π} E (\sum_{t = 0}^{\infty} χ^{t} r_{t}),

(39)

This optimal value function is unique and can be defined as the solution to the simultaneous equations, written in recursive form³⁹:

\begin{matrix} V^{*} (s) = \max_{α} \left\{ R (s, α) + χ \sum_{s^{'} \in S} T (s, s^{'}, α) V^{*} (s^{'}) \right\}, \quad \forall s \in S, \end{matrix}

(40)

where $V^{*} (s)$ is the value of state $s$ under the optimal policy. The above statement shows that the value of a state is the sum of the expected instantaneous reward and the expected discounted value of the subsequent state when the best available action is taken. Accordingly, the optimal policy is given as follows³⁹:

\begin{matrix} π^{*} (s) = \underset{α}{\arg \max} \left\{ R (s, α) + χ \sum_{s^{'} \in S} T (s, s^{'}, α) V^{*} (s^{'}) \right\}, \quad \forall s \in S, \end{matrix}

(41)
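The recursion for $V^{*}$ and the greedy policy $π^{*}$ can be sketched with plain value iteration, in which the reward $R (s, α)$ is added to the discounted, probability-weighted value of the successor states. This is a minimal sketch under stated assumptions: the two-state overtaking problem, its probabilities, and its rewards (`toy_overtake_mdp`) are hypothetical and are not the driving environment used in this paper.

```python
from typing import Dict, List, Tuple

State = str
Action = str
# T[(s, a)][s_next] holds the transition probability T(s, s', a).
Transition = Dict[Tuple[State, Action], Dict[State, float]]
Reward = Dict[Tuple[State, Action], float]


def q_value(s: State, a: Action, V: Dict[State, float],
            T: Transition, R: Reward, chi: float) -> float:
    """R(s, a) plus the discounted expected value of the successor state."""
    return R[(s, a)] + chi * sum(p * V[s2] for s2, p in T[(s, a)].items())


def value_iteration(states: List[State], actions: List[Action],
                    T: Transition, R: Reward, chi: float,
                    tol: float = 1e-9
                    ) -> Tuple[Dict[State, float], Dict[State, Action]]:
    """Iterate the Bellman recursion to a fixed point, then act greedily."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(s, a, V, T, R, chi) for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # The optimal policy picks the maximizing action in every state.
    policy = {s: max(actions, key=lambda a: q_value(s, a, V, T, R, chi))
              for s in states}
    return V, policy


def toy_overtake_mdp():
    """Hypothetical two-state problem; all numbers are illustrative only."""
    states = ["behind", "clear"]
    actions = ["keep", "overtake"]
    T = {
        ("behind", "keep"): {"behind": 1.0},
        ("behind", "overtake"): {"clear": 0.9, "behind": 0.1},
        ("clear", "keep"): {"clear": 1.0},
        ("clear", "overtake"): {"clear": 1.0},
    }
    R = {
        ("behind", "keep"): -1.0,
        ("behind", "overtake"): -2.0,
        ("clear", "keep"): 0.0,
        ("clear", "overtake"): -2.0,
    }
    return states, actions, T, R


if __name__ == "__main__":
    states, actions, T, R = toy_overtake_mdp()
    V, policy = value_iteration(states, actions, T, R, chi=0.9)
    print(V, policy)
```

In this toy problem the one-off cost of overtaking is smaller than the accumulated cost of staying behind, so the greedy policy overtakes when behind and keeps the lane once the road is clear.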

Moreover, the action-value function $Q (s, α)$ is defined as:

Q (s, α) = R (s, α) + χ \sum_{s^{'} \in S} T (s, s^{'}, α) Q (s^{'}, α^{'}),

(42)

Hence, the associated optimal solution $Q^{*} (s, α)$ is defined according to the action-value function³⁹:

Q^{*} (s, α) = R (s, α) + χ \sum_{s^{'} \in S} T (s, s^{'}, α) \max_{α^{'}} Q^{*} (s^{'}, α^{'}),

(43)

where $Q^{*} (s, α)$ denotes the expected discounted reinforcement of taking action $α$ in state $s$ and continuing to act optimally thereafter. Moreover, the $Q$ -learning algorithm updates the $Q$ value according to the learning-rate parameter $Θ$ $(Θ \in [0, 1])$ :

Q (s, α) := Q (s, α) + Θ \left( r + χ \max_{α^{'}} Q (s^{'}, α^{'}) - Q (s, α) \right),

(44)
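The update in equation (44) can be written as a single tabular step. The sketch below is a hedged illustration: the function name `q_learning_update`, the dictionary-based table, and the sample states, actions, and numbers are assumptions introduced here, not the implementation used for the reported results.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Sequence, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def q_learning_update(Q: QTable, s: Hashable, a: Hashable, r: float,
                      s_next: Hashable, actions: Sequence[Hashable],
                      theta: float, chi: float) -> float:
    """Apply one Q-learning step in place and return the new Q(s, a)."""
    # Greedy estimate of the value of the successor state.
    best_next = max(Q[(s_next, a_next)] for a_next in actions)
    # Temporal-difference error: bootstrapped target minus current estimate.
    td_error = r + chi * best_next - Q[(s, a)]
    Q[(s, a)] += theta * td_error
    return Q[(s, a)]


if __name__ == "__main__":
    # Hypothetical table entries and transition, for illustration only.
    actions = ["keep", "overtake"]
    Q: QTable = defaultdict(float)
    Q[("s1", "keep")] = 1.0
    Q[("s1", "overtake")] = 2.0
    print(q_learning_update(Q, "s0", "keep", r=-1.0, s_next="s1",
                            actions=actions, theta=0.5, chi=0.9))
```

With $Θ = 0$ the table never changes, and with $Θ = 1$ the old estimate is replaced entirely by the bootstrapped target, so intermediate values trade off stability against learning speed.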

The stated update rule is used to implement the RL-based predictive decision-making that avoids collisions with obstacles according to the predictive model. Figure 4 illustrates the integrated algorithm for the path planning and path following strategies according to the provided discussions, including the NMPC algorithm and the control commands, namely the steering input and the DYC signal, applied to the vehicle dynamics model.

Figure 4.

The flowchart related to the integrated path planning and path following algorithm.

Results and discussion

To evaluate the performance of the proposed path-planning algorithm for automated cars, simulations are carried out for two different driving scenarios, which demonstrate the feasibility of the proposed method under different operating conditions. The simulation parameters are summarized in Table 1. The results reported for the proposed controller are obtained from numerous simulation runs. In the present study, a road with a single lane in each direction is employed; this choice does not limit generality, and the method can be extended to other road conditions and driving environments.

Table 1.

Simulation parameters.

Parameter (unit)	Value
Total mass $m$ (kg)	1480
Yaw moment of inertia $I_{z}$ ($kg \cdot m^{2}$)	2350
Steering system inertia $I_{s}$ ($kg \cdot m^{2}$)	4200
Distance between C.G. and front axle $l_{f}$ (m)	1.05
Distance between C.G. and rear axle $l_{r}$ (m)	1.63
Track width $l_{b}$ (m)	1.54
Front tire cornering stiffness ${\tilde{C}}_{f}$ ($N / rad$)	67,500
Rear tire cornering stiffness ${\tilde{C}}_{r}$ ($N / rad$)	74,500
Steering equivalent stiffness $K_{s}$ ($N / rad$)	10
Steering equivalent damping $C_{s}$ ($Ns / rad$)	225
Caster + pneumatic trails $σ_{n} + σ_{c}$ (m)	0.04

Scenario A

In the first scenario, the ego car travels at an average forward speed of 30 km/h while two leading vehicles travel in the same lane at a constant speed of 25 km/h. The ego vehicle is therefore required to pass the leading vehicles safely. The space available for the ego vehicle to pass a leading vehicle has to be sufficient, and the required space is typically a function of the traveling speed. Herein, the threshold is set at a small spacing to verify whether the car is able to pass the leading vehicles and return to its original lane successfully. This maneuver mimics the double-lane-change maneuver. The ego vehicle is represented by the risk functions shown in Figure 5, and the two leading vehicles are represented by their collision risk functions in the global coordinate system. Additionally, it can be seen that the planned path allows the ego vehicle to pass the two in-line leading vehicles and safely return to the original lane without collision.
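To make the dependence of the passing space on speed concrete, the following hedged sketch works through the arithmetic for the speeds of Scenario A. The 20 m relative gap (the distance the ego vehicle has to gain on a leading vehicle) and the function name are hypothetical values chosen for illustration; they are not parameters reported in this paper.

```python
from typing import Tuple

KMH_TO_MS = 1.0 / 3.6  # conversion factor from km/h to m/s


def overtaking_time_and_distance(v_ego_kmh: float, v_lead_kmh: float,
                                 relative_gap_m: float) -> Tuple[float, float]:
    """Return (time in s, ego travel distance in m) to gain relative_gap_m."""
    closing_speed = (v_ego_kmh - v_lead_kmh) * KMH_TO_MS
    if closing_speed <= 0.0:
        raise ValueError("the ego vehicle must be faster than the leader")
    duration = relative_gap_m / closing_speed
    # Distance covered by the ego vehicle in the road frame meanwhile.
    distance = v_ego_kmh * KMH_TO_MS * duration
    return duration, distance


if __name__ == "__main__":
    # Scenario A speeds with an assumed (hypothetical) 20 m relative gap.
    print(overtaking_time_and_distance(30.0, 25.0, relative_gap_m=20.0))
```

With a closing speed of only 5 km/h (about 1.39 m/s), even a modest relative gap translates into a long stretch of road spent in the passing lane, which is why the available space matters at these speeds.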

Figure 5.

Sample collision risk probability function value variations in different orientations.

Because the target lane is clear after the first lane change, the vehicle is planned to complete the double lane change without changing its travel speed, such that the left lane is kept free for other vehicles attempting to pass. Additionally, it can be seen that after the critical pass of the leading vehicles, because the lane ahead is free, the vehicle has the opportunity to make the second lane change in a more gradual and smoother manner. Figure 6 represents the vehicle trajectory in the plane of motion and how the vehicle passes the leading vehicles as a function of the iteration number of the interaction between the RL agent and the environment. The plot encompasses both the longitudinal and lateral trajectory variations and shows how the ego vehicle (blue) passes the leading vehicles (red) without any collision while considering the other dynamic obstacle in the environment (green car). Figure 7 illustrates the vehicle inputs in terms of the steering system input and the torque applied to generate the yaw moment for smooth cornering. These control inputs remain within a reasonable range, below the level at which the tires saturate and the lateral force generation starts to drop. In response to the inputs applied to the ego vehicle to follow the planned path, the dynamic responses of the car in terms of the lateral acceleration (in g), yaw rate, and vehicle heading angle along the intended trajectory are presented in Figure 8. Furthermore, the co-simulations of the model show that the ego vehicle can plan the path and follow the proposed trajectory satisfactorily.

Figure 6.

Vehicle trajectory in the plane of motion and the planned strategy for the vehicle passing the leading vehicles as a function of the iteration numbers of the RL-agent interaction with the environment.

Figure 7.

Dynamic variations related to: (a) steering system input and (b) applied torque for the yaw generation of the ego car.

Figure 8.

Dynamic responses of the ego car in terms of: (a) lateral acceleration, (b) yaw-rate variations, and (c) vehicle heading.

Scenario B

This scenario is considerably more complex than the first one, mainly because the ego vehicle is expected to perform two consecutive double-lane-change maneuvers. The leading vehicles are distributed across two lanes with different collision risk functions depending on their traveling speeds. The relative positions and the collision risk functions of the leading vehicles in the plane of motion can be seen in Figure 9, which also shows the trajectory planned for the ego vehicle by the algorithm. Because the leading vehicles hold constant speeds lower than that of the ego vehicle and are distributed randomly, the optimal path to pass all the vehicles safely without collision risk is to perform two consecutive double-lane changes, whose lengths differ depending on the collision risk function, the road condition, and the geometric understanding of the environment. It is also noted that the ego vehicle moves to the left lane only when that lane is clear and has sufficient space to accommodate the ego vehicle at constant speed. The vehicle keeps a constant speed to provide a comfortable ride for the passengers. The vehicle is also intended to return to the original lane after each pass of the leading vehicles to keep the left lane free for other cars traveling at higher speeds. According to ISO 2631-1:1997, the comfort of seated passengers inside vehicles is associated with their exposure to the total magnitude of the weighted accelerations in all directions. The root mean square (RMS) of the accelerations can be utilized to quantify the magnitude of the weighted accelerations:

(a_{w})_{j} = {\left[ \frac{1}{T} \int_{0}^{T} {\left( (a_{w})_{j} (τ) \right)}^{2} d τ \right]}^{\frac{1}{2}}, \quad j = x, y, z

(45)

where $a_{w}$ is the weighted acceleration according to ISO 2631-1:1997, $T$ is the duration of the measurement, and the subscript $j$ denotes the acceleration component in each direction. As the road is assumed to be reasonably flat and the longitudinal acceleration is negligible, the RMS contributions of the vertical and longitudinal components approach zero. The only remaining component is the RMS of the weighted acceleration in the lateral direction, which is obtained at 0.201 $\frac{m}{s^{2}}$ and, according to ISO 2631-1 and Zhao and Schindler,⁴⁰ falls in the comfortable range.
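The RMS in equation (45) can be approximated from sampled data with the trapezoidal rule, as in the hedged sketch below. It assumes that the samples are already frequency-weighted (the ISO 2631-1 weighting filters are not implemented here), and the function name and the sinusoidal test signal are hypothetical choices for illustration rather than data from the simulations.

```python
import math
from typing import Sequence


def rms_acceleration(t: Sequence[float], a_w: Sequence[float]) -> float:
    """RMS of a sampled, already-weighted acceleration signal a_w(t)."""
    if len(t) != len(a_w) or len(t) < 2:
        raise ValueError("t and a_w need the same length (at least 2)")
    duration = t[-1] - t[0]
    if duration <= 0.0:
        raise ValueError("time stamps must span a positive duration")
    integral = 0.0
    # Trapezoidal approximation of the integral of a_w(tau)**2 over [0, T].
    for k in range(1, len(t)):
        dt = t[k] - t[k - 1]
        integral += 0.5 * (a_w[k - 1] ** 2 + a_w[k] ** 2) * dt
    return math.sqrt(integral / duration)


if __name__ == "__main__":
    # Hypothetical lateral acceleration: 0.3 m/s^2 sine at 0.5 Hz for 10 s.
    time = [0.01 * i for i in range(1001)]
    lateral = [0.3 * math.sin(2.0 * math.pi * 0.5 * ti) for ti in time]
    print(rms_acceleration(time, lateral))
```

For a pure sinusoid sampled over whole periods, the result should be close to the amplitude divided by $\sqrt{2}$ , which offers a quick sanity check before applying the function to simulated acceleration traces.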

Figure 9.

Path planning in the global coordinate system based on the proposed algorithm: (a) collision-risk function and (b) yaw plane of motion.

Figure 10 represents the vehicle trajectory in the plane of motion and how the vehicle passes the leading vehicles as a function of the iteration number of the RL-agent interaction with the environment. The plot encompasses the lateral trajectory variations and shows how the ego vehicle (blue) passes the leading vehicles (red) without any collision while considering the other dynamic obstacle in the environment (green car). It is also noted that, as the number of agent-environment interactions grows, the ego vehicle tunes itself and learns to adapt to the driving environment and the road condition. Finally, Figure 11 explores the tracking performance of the ego vehicle when the designed NMPC algorithm is used to follow the path planned by the risk-assessment-based collision avoidance algorithm. It can be seen that the car is able to follow the planned path over the entire simulation range, although slight deviations are observed, which are followed by rapid stabilization. Furthermore, the second double-lane-change maneuver is performed more consistently and smoothly than the first one, which can be attributed to the improved learning of the algorithm after iterative interactions of the agent with the environment. Additionally, the variations of the tracking error along the X-coordinate are presented in Figure 11 along with the standard deviation of the tracking error. According to the obtained results, the maximum and mean values of the tracking error are 0.11 and 0.01 m, respectively, indicating the effectiveness and reliability of the proposed integrated path planning and following algorithm.

Figure 10.

Vehicle trajectory in the plane of motion and the planned strategy for the vehicle passing the leading vehicles as a function of the iteration numbers of the RL-agent interaction with environment.

Figure 11.

(a) The planned path versus the actual path subsequent to applying NMPC in the global coordinate system and (b) tracking error variations alongside the traveling direction.
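Tracking-error statistics of the kind reported for Figure 11 (maximum, mean, and standard deviation) can be computed from sampled paths as in the sketch below. This is a hedged illustration: it assumes that the planned and actual lateral positions are sampled at the same X-coordinates and that the mean refers to the absolute error, and the function name and sample values are hypothetical.

```python
from statistics import fmean, pstdev
from typing import Dict, Sequence


def tracking_error_stats(planned_y: Sequence[float],
                         actual_y: Sequence[float]) -> Dict[str, float]:
    """Summarize the lateral tracking error between two sampled paths."""
    if len(planned_y) != len(actual_y) or not planned_y:
        raise ValueError("paths must be non-empty and equally sampled")
    # Signed lateral error at each sampled X-coordinate.
    errors = [a - p for p, a in zip(planned_y, actual_y)]
    abs_errors = [abs(e) for e in errors]
    return {
        "max_abs": max(abs_errors),
        "mean_abs": fmean(abs_errors),
        "std": pstdev(errors),  # population standard deviation
    }


if __name__ == "__main__":
    # Hypothetical lateral positions in metres, for illustration only.
    planned = [0.0, 0.0, 1.0, 1.0]
    actual = [0.0, 0.1, 0.9, 1.0]
    print(tracking_error_stats(planned, actual))
```

Reporting the standard deviation alongside the maximum helps distinguish a single brief excursion from a persistent offset between the planned and actual paths.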

Conclusions

In this paper, a motion planning and path following algorithm was proposed by employing optimal reinforcement learning (RL) coupled with a novel risk assessment approach to avoid collisions with the leading and adjacent vehicles and obstacles during lane changes and critical maneuvers. The proposed RL approach was shown to be capable of learning the collision risk based on the probability distributions of the adjacent and leading vehicles and of identifying the safest and shortest paths during the lane changes. Additionally, the travel speed of the ego vehicle was kept unchanged, which improves the ride comfort of the vehicle occupants by minimizing the contribution of the weighted longitudinal acceleration, as explored in equation (45). For this purpose, the dynamics of the steering system were also incorporated to provide an understanding of how they can affect the vehicle response to input variations. Different driving scenarios were employed in the present paper to verify the effectiveness and performance of the proposed algorithm.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Hamid Taghavifar

Chongfeng Wei

Yechen Qin

References

1. Taghavifar, Rakheja. Path-tracking of autonomous vehicles using a novel adaptive robust exponential-like-sliding-mode fuzzy type-2 neural network controller. Mech Syst Signal Process 2019; 130: 41–55.

2. Taghavifar. Neural network autoregressive with exogenous input assisted multi-constraint nonlinear predictive control of autonomous vehicles. IEEE Trans Vehicular Technol 2019; 68(7): 6293–6304.

3. Wang, Taghavifar, et al. Mme-ekf-based path-tracking control of autonomous vehicles considering input saturation. IEEE Trans Vehicular Technol 2019; 68(6): 5246–5259.

4. Wei, Romano, Merat, et al. Risk-based autonomous vehicle motion control with considering human driver’s behaviour. Transp Res Part C Emerg Technol 2019; 107: 1–14.

5. Gao, Guo, et al. RISE-based integrated motion control of autonomous ground vehicles with asymptotic prescribed performance. IEEE Trans Syst Man Cybern Syst 2021; 51: 5336–5348.

6. et al. Adaptive-neural-network-based robust lateral motion control for autonomous vehicle at driving limits. Control Eng Pract 2018; 76: 41–53.

7. Mohammadzadeh, Taghavifar. A robust fuzzy control approach for path-following control of autonomous vehicles. Soft Comput 2020; 24: 3223–3235.

8. Taghavifar, Rakheja. A novel Terramechanics-based path-tracking control of terrain-based wheeled robot vehicle with matched-mismatched uncertainties. IEEE Trans Vehicular Technol 2020; 69: 67–77.

9. Taghavifar, et al. Optimal Path-planning of nonholonomic terrain robots for dynamic obstacle avoidance using single-time velocity estimator and reinforcement learning approach. IEEE Access 2019; 7: 159347–159356.

10. Mohamed, Ren, Lang, et al. Optimal path planning for an autonomous articulated vehicle with two trailers. Int J Autom Control 2018; 12(3): 449–465.

11. Gottlieb, Manathara, Shima. Multi-target motion planning amidst obstacles for autonomous aerial and ground vehicles. J Intell Robot Syst 2018; 90(3-4): 515–536.

12. Huang, Ding, Zhang, et al. A motion planning and tracking framework for autonomous vehicles based on artificial potential field elaborated resistance network approach. IEEE Trans Ind Electron 2020; 67(2): 1376–1386.

13. Rasekhipour, Khajepour, Chen, et al. A potential field-based model predictive path-planning controller for autonomous road vehicles. IEEE Trans Intell Transp Syst 2017; 18(5): 1255–1267.

14. Zhang, Chen, Waslander, et al. Hybrid trajectory planning for autonomous driving in highly constrained environments. IEEE Access 2018; 6: 32800–32819.

15. Yoon, Lee, et al. Recursive path planning using reduced states for car-like vehicles on grid maps. IEEE Trans Intell Transp Syst 2015; 16(5): 2797–2813.

16. Zhang, Chen, Waslander, et al. Toward a more complete, flexible, and safer speed planning for autonomous driving via convex optimization. Sensors 2018; 18(7): 2185.

17. Althoff, Stursberg, Buss. Model-based probabilistic collision detection in autonomous driving. IEEE Trans Intell Transp Syst 2019; 10(2): 299–310.

18. Lim, Lee, Sunwoo, et al. Hierarchical trajectory planning of an autonomous car based on the integration of a sampling and an optimization method. IEEE Trans Intell Transp Syst 2018; 19(2): 613–626.

19. Kim, Lee, et al. Design of integrated risk management-based dynamic driving control of automated vehicles. IEEE Intell Transp Syst Mag 2017; 9(1): 57–73.

20. Lee, Kum. Collision avoidance/mitigation system: motion planning of autonomous vehicle via predictive occupancy map. IEEE Access 2019; 7: 52846–52857.

21. Khajepour, Melek, et al. Path planning and tracking for vehicle collision avoidance based on model predictive control with multiconstraints. IEEE Trans Vehicular Technol 2017; 66(2): 952–964.

22. Chen, Wang. Motion planning with velocity prediction and composite nonlinear feedback tracking control for Lane-change strategy of autonomous vehicles. IEEE Trans Intell Vehicles 2020; 5(1): 63–74.

23. Elhoseny, Tharwat, Hassanien AE. Bezier curve based path planning in a dynamic field using modified genetic algorithm. J Comput Sci 2018; 25: 339–350.

24. Chen, Wang. Personalized vehicle path following based on robust gain-scheduling control in lane-changing and left-turning maneuvers. In: Proceedings of the 2018 American control conference, 2018, pp.4784–4789. Milwaukee, WI, USA: IEEE.

25. Werling, Ziegler, Kammel, et al. Optimal trajectory generation for dynamic street scenarios in a Frenet frame. In: IEEE international conference on robotics and automation, 2010, pp.987–993. New York: IEEE.

26. Ljungqvist, Evestedt, Axehill, et al. A path planning and path-following control framework for a general 2-trailer with a car-like tractor. J Field Robot 2019; 36: 1345–1377.

27. Ljungqvist, Evestedt, Cirillo, et al. Lattice-based motion planning for a general 2-trailer system. In: IEEE intelligent vehicles symposium, 2017, pp.819–824. New York: IEEE.

28. Chai, Savvaris, Tsourdos, et al. Solving multiobjective constrained trajectory optimization problem by an extended evolutionary algorithm. IEEE Trans Cybern 2020; 50: 1630–1643.

29. Chai, Tsourdos, Savvaris, et al. Multiobjective overtaking maneuver planning for autonomous ground vehicles. IEEE Trans Cybern 2021; 51: 4035–4049.

30. Bai, Cao, Yan, et al. Efficient heuristic algorithms for single-vehicle task planning with precedence constraints. IEEE Trans Cybern 2021; 51: 6274–6283.

31. Liang, Khajepour, et al. A novel combined decision and control scheme for autonomous vehicle in structured road based on adaptive model predictive control. IEEE Trans Intell Transp Syst 2022; 23(9): 16083–16097.

32. Hang, Xia, Chen, et al. Active safety control of automated electric vehicles at driving limits: a tube-based MPC approach. IEEE Trans Transp Electrification 2022; 8(1): 1338–1349.

33. Xia, Hang, et al. Advancing estimation accuracy of sideslip angle by fusing vehicle kinematics and dynamics information with fuzzy logic. IEEE Trans Vehicular Technol 2021; 70(7): 6577–6590.

34. Hang, Huang, Chen, et al. Path planning of collision avoidance for unmanned ground vehicles: a nonlinear model predictive control approach. Proc IMechE Part I: J Systems Control Engineering 2021; 235(2): 222–236.

35. Qin, Cao, et al. Lane keeping of autonomous vehicles based on differential steering with adaptive multivariable super-twisting control. Mech Syst Signal Process 2019; 125: 330–346.

36. Wang, et al. Integrated optimal dynamics control of 4WD4WS electric ground vehicle with tire-road frictional coefficient estimation. Mech Syst Signal Process 2015; 60–61: 727–741.

37. Abe. Vehicle handling dynamics: theory and application. Oxford: Butterworth-Heinemann, 2015.

38. Wang, Yan, et al. Differential steering based yaw stabilization using ISMC for independently actuated electric vehicles. IEEE Trans Intell Transp Syst 2018; 19(2): 627–638.

39. Vasudevan, Johnson-Roberson. Occlusion-aware risk assessment for autonomous driving in urban environments. IEEE Robot Autom Lett 2019; 4(2): 2235–2241.

40. Zhao, Schindler. Evaluation of whole-body vibration exposure experienced by operators of a compact wheel loader according to ISO 2631-1:1997 and ISO 2631-5:2004. Int J Ind Ergon 2014; 44(6): 840–850.

Optimal reinforcement learning and probabilistic-risk-based path planning and following of autonomous vehicles with obstacle avoidance
