Abstract
In vehicular ad hoc networks (VANETs), the network topology and communication links change frequently because of the high mobility of vehicles. Key challenges include shortening transmission delays and increasing transmission stability. When establishing routing paths, most research focuses on detecting traffic and selecting roads with higher vehicle densities for packet transmission, thus avoiding carry-and-forward scenarios and decreasing transmission delays; however, because vehicle densities change so rapidly, approaches that monitor each road only periodically may not obtain accurate real-time traffic densities. In this paper, we propose a novel routing information system, called the machine learning-assisted route selection (MARS) system, that estimates the information routing protocols need. In MARS, road information is maintained in roadside units (RSUs) with the help of machine learning. We use machine learning to predict vehicle movements and then choose suitable routing paths with better transmission capacity to transmit packets. Further, MARS helps decide the forwarding direction between two RSUs according to the predicted location of the destination and the estimated transmission delays in both forwarding directions. The proposed system can provide timely routing information for VANETs and greatly enhance network performance.
1. Introduction
A vehicular ad hoc network (VANET) is a subtype of the mobile ad hoc network (MANET), an emerging technology that combines the ad hoc network and the wireless local area network (WLAN). Vehicles with wireless communication capabilities can communicate with roadside units (RSUs) or with other vehicles. Users in vehicles can connect to the Internet at any time to obtain the desired services. Application areas for VANETs include driver assistance, road safety, improvements in traffic efficiency, and mobile entertainment services. There are three common communication models in a VANET: vehicle-to-infrastructure (V2I), vehicle-to-roadside unit (V2R), and vehicle-to-vehicle (V2V).
Vehicles in a VANET have high mobility, which causes the network topology to change and connections to become unstable. The maximum transmission range in the IEEE 802.11p standard is 1 km, but in practice it is limited by vehicular speed and interference. When the distance between a source and destination is very long, the vehicle that has data to send transmits packets to an adjacent vehicle located within its transmission range. If there are no vehicles within range, the vehicle carries the messages until it meets another vehicle within its range and then forwards the packets. This carry-and-forward technique increases transmission delays. Therefore, the typical strategy when designing a routing protocol is to select road sections with higher vehicle density, if possible, to transmit packets, thereby minimizing the need for the carry-and-forward technique.
To date, many routing protocols have been proposed. Some routing mechanisms detect the vehicle density of road sections to decide routing paths; however, if such detection occurs only periodically, problems may arise. First, a vehicle detects the density of the road section by itself and exchanges information with other vehicles, which causes substantial amounts of control overhead. Second, vehicle densities change rapidly in VANETs, thus requiring a longer time to converge. Vehicles may not obtain real-time information, which is especially unfavorable to the source routing protocol. Third, vehicles can only obtain information regarding adjacent road sections. If the routing protocol uses a per-hop calculation, it will probably fall victim to the local optimum problem.
For these reasons, we propose a machine learning-assisted route selection (MARS) mechanism. In MARS, RSUs are assisted by a machine learning system to maintain road traffic information. MARS predicts the movement of vehicles and calculates the probability of passing within the range of other RSUs. According to these predictions, packets can be transmitted via more appropriate routing paths. We utilize the real-time state detection of urban traffic in order to provide reference information for the routing protocol in a VANET. In our method, we use machine learning to predict the movement behavior of vehicles and the transmission capacity of routing paths. We adopt an unsupervised clustering approach to judge the similarity of the data and decide whether such data can be grouped into the same cluster. Next, we map each cluster to a predefined class and then make predictions for newly arriving vehicles.
The remainder of this paper is organized as follows. Section 2 introduces related studies. Section 3 presents MARS, and Section 4 shows the simulation results. Finally, Section 5 concludes our paper.
2. Related Work
Designing an efficient routing mechanism that adapts to the mobility characteristics of VANETs is challenging. Many routing protocols have already been proposed to transmit data more efficiently. Routing protocols can be classified into the five categories detailed below [1–10]. First, topology-based routing protocols use link information that exists in the network in order to find the best route on which to forward packets. Such routing protocols can be further divided into proactive routing, reactive routing, and hybrid routing. Second, position-based routing protocols use GPS devices to obtain geographic position information and share such information with neighboring nodes in order to select the next forwarding hops. Third, in cluster-based routing protocols, a node in each cluster is selected as the cluster head, which is responsible for intra- and intercluster communications. Nodes inside a cluster communicate via a direct link, and intercluster communications occur via the cluster heads. Fourth, broadcast routing protocols are frequently used in VANETs to share information. Fifth, geocast routing protocols deliver packets from a source to all other nodes within a specified geographical region. Many applications in VANETs benefit from this routing approach.
Routing in urban environments is challenging because of restricted mobility patterns, obstacles, intersections, uneven vehicle densities, and other related factors [11, 12]. Several proposed routing protocols aim to address requirements that are unique to urban scenarios. In [13], the authors propose a geographic-based routing protocol called the improved greedy traffic aware routing protocol (GyTAR), which considers vehicle densities and road topologies as main factors in deciding the best street to traverse. GyTAR comprises three major components: traffic density estimation, intersection selection, and forwarding data between two intersections. However, packet retransmissions caused by packet collisions or packet losses are not considered.
The beacon-less routing algorithm for vehicular environments (BRAVE) is proposed in [14]. This protocol uses hop-by-hop data forwarding along a selected street by using an opportunistic forwarding scheme. Further, the protocol can perform in the carry-and-forward paradigm to handle disconnected topologies.
In [15], a cross-layer weighted position-based routing (CLWPR) protocol is proposed. This protocol uses the prediction of the node's position and navigation information to improve routing efficiency. Furthermore, it uses the SNIR value and MAC frame error rate to estimate the quality of links. All information is combined into a weighted function in order to calculate the weight for each neighboring node. From the simulations, a prediction-based scheme achieved better packet delivery rates and lower network overhead.
A novel VANET routing protocol called shortest-path-based traffic-light-aware routing (STAR) takes traffic lights into account [16]. The authors show that traffic lights greatly impact VANET routing in urban areas, an issue that has received little attention elsewhere. The manner in which traffic lights influence the performance of VANET routing is discussed in detail in [16]; more specifically, STAR is developed to improve the performance of VANET routing by utilizing traffic lights in urban environments. The authors note that vehicles in green light segments may move more smoothly, whereas vehicles in red light segments tend to cluster at the two sides of the road segments. The strategy behind STAR is to select green light segments first to forward packets, while also checking the connectivity of red light segments at intersections. If the connectivity of a red light segment is good and the segment leads toward the destination, the segment is selected to forward packets. In STAR, GPS is also used to obtain the position of a destination. To maintain information about the connectivity of segments, vehicles deliver test messages to the next intersection when nearing an intersection. Simulation results show that STAR performs quite well.
Machine learning techniques can learn from a dataset by automatically analyzing the dataset to identify rules. We can use these rules to predict results from new data. In general, machine learning systems need a pretraining process such that they can generalize input data. According to different training data, machine learning algorithms can be categorized as either unsupervised clustering or supervised classification. Unsupervised machine learning systems judge the similarity between data to decide whether such data can be grouped together. Supervised machine learning systems map input data to the desired outputs, classifying input data according to their attributes.
Recently, machine learning systems have been used to improve the performance of wireless networks. When packets are lost in the traditional TCP-friendly rate control protocol, it is regarded as network congestion. The protocol will reduce the packet transmission rate and lower channel utilization. In [17], the authors use machine learning to determine the cause of packet losses and take different measures to improve the network performance. The cause can be congestion, a route change, or link errors.
The cluster-based routing protocol can group nodes into clusters and distinguish between cluster heads and general nodes through machine learning techniques [18]. In [19, 20], broadcast routing uses machine learning to predict whether a packet needs to be rebroadcast. In [21], the authors dynamically adjust the beacon interval through machine learning, which decreases the control overhead and maintains the reliability of transmissions.
3. MARS
In this paper, we propose a routing information system that can be applied to urban environments. We do not adopt preestablished routing between a source and destination, but we incorporate RSUs and the assistance of a machine learning system to maintain route and traffic information. We predict the change of vehicle densities through a machine learning system and extend these predictions of vehicular movement on each road section. During transmissions, MARS dynamically selects routes with better capacity and higher probabilities of reaching the destination based on the calculations of our proposed machine learning system. To train the system, a vehicle transmits its mobile information (e.g., moving path and speed) to a RSU that the vehicle is passing through to update the information regarding its movement. When the vehicle enters the coverage area of the next RSU, the RSU informs the previous RSU about the arrival of the vehicle.
3.1. Scenario and Assumptions
We make the following basic assumptions.
Each vehicle is equipped with a GPS device and a digital map to be aware of the road and location information regarding all RSUs. Note that querying GPS about the position of a destination would introduce many security issues and may invade personal privacy. Thus, our proposed protocol estimates the position of a destination by means of the machine learning system rather than GPS.
A RSU can communicate with other RSUs via wired networks to query regarding the possible location of a destination.
The machine learning system is built into RSUs and can estimate traffic patterns by collecting vehicle information. Further, a RSU is also a relay node.
A source vehicle only knows its own location and each RSU's location via GPS. When a source wants to transmit packets, it needs the help of RSUs to learn where the destination is. The original source routing problem between the source and destination is divided into several subproblems, that is, (Source, RSU1), (RSU1, RSU2), …, (RSUn−1, RSUn), and (RSUn, Destination), as shown in Figure 1. In our proposed protocol, RSUs are assigned the responsibility to forward packets to the RSU nearest to the destination by means of wired links between RSUs. For example, when the source in Figure 1 wants to send data to the destination, it simply sends data to the nearest RSU (i.e., RSU1). RSU1 takes charge of forwarding data to the RSU nearest to the destination (i.e., RSU3). Next, RSU3 tries to forward data to the destination by means of suitable paths based on the knowledge provided by its built-in machine learning system. Thus, the time that packets spend traveling over wireless V2V links can be minimized. Furthermore, because a RSU can collect a wider range of information, it can obtain more real-time information and avoid the local optimum problem.

Figure 1: The original routing problem is divided into several subrouting problems.
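The decomposition into subproblems can be illustrated with a short Python sketch. The function name `split_route` and the representation of the RSU chain as a list of labels are assumptions made only for illustration; they are not part of MARS.

```python
def split_route(source, rsu_chain, destination):
    """Split an end-to-end route into consecutive (sender, receiver) subproblems.

    `rsu_chain` is the ordered list of RSUs the packets traverse
    (a hypothetical representation chosen for this sketch).
    """
    hops = [source, *rsu_chain, destination]
    # Pair each hop with its successor: (Source, RSU1), (RSU1, RSU2), ...
    return list(zip(hops, hops[1:]))


subproblems = split_route("Source", ["RSU1", "RSU2", "RSU3"], "Destination")
```

In this sketch only the first and last pairs involve wireless V2V or V2R links; the pairs between RSUs would be carried over the wired links described above.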
3.2. Proposed Method
As mentioned previously, machine learning systems generalize data through a series of recurring processes, such as data collection, comparison and categorization, feedback, and self-tuning. Therefore, extensive training data and sufficient training time are necessary for a machine learning system to improve its output accuracy. This is a major reason why applications of machine learning systems are currently so restricted, despite their powerful prediction capabilities.
In our paper, a dedicated machine learning system is embedded in RSUs in order to make their functionality more mature and efficient. The requirements of a machine learning system are easily met in the scenario considered in this paper, namely, a VANET in an urban environment. RSUs equipped with the proposed system can collect information from many passing vehicles every day and continually tune parameters to produce more accurate predictions. Thus, the proposed scheme can take full advantage of machine learning to provide users with more functional, higher-quality services while driving. In this paper, we focus on improving VANET communication capacity. Note that the application of the proposed machine learning system is not limited to VANETs; it also has many other applications, such as traffic jam prevention and dynamic navigation services. By deploying the proposed system, several promising and valuable services that make daily life more convenient can be realized. Thus, extended applications of the proposed system will be our most significant area of focus in the future.
In this paper, we aim to improve the reliability and stability of transmissions in VANETs. The proposed machine learning system provides VANET routing protocols with precise and critical information, which is still a great challenge for such a highly dynamic environment. The proposed system can be integrated with many existing routing protocols and can significantly improve their performance in terms of packet delivery ratios and transmission delays.
The scenario discussed in this paper is shown in Figure 2. When a source vehicle tries to transmit packets to the corresponding destination, the first question we should answer is how to find the destination. In this paper, we do not make a strong assumption that a source vehicle can obtain the location of the destination by means of querying GPS. Instead, we let RSUs perform a “paging” process [22] similar to cellular networks. In cellular networks, the system tries to page a certain mobile user in the cells maintained by it since the user does not report its location when a call arrives. Similarly, in our proposed system, when packets will be transmitted to a destination, RSUs page the destination in their coverage areas to find the destination's location. By adopting the proposed system, a source vehicle delivers packets to the nearest RSU and relies on the routing service provided by the proposed system when it has data to transmit.

Figure 2: Scenarios in urban environments with hybrid networks.
There are three cases in our scenario. First, the destination, that is, vehicle A in Figure 2, is in the path between the source vehicle and its nearest RSU; thus, evidently packets can reach their destination without the help of RSUs. Second, the destination, that is, vehicle B, is in the coverage of a certain RSU (not the RSU nearest to the source vehicle). In this case, the RSU nearest to the source vehicle will perform the paging process and then know the location of the destination based on the reply it receives from the RSU covering the destination. Packets will then be transmitted to the destination by way of the wired path between the two RSUs (i.e., R1 and R2 in Figure 2). The packet delivery ratio and transmission delay mainly profit from the rapid and stable transmission in a wired network. The third case, the most complicated, is that the destination, that is, vehicle C, has left the coverage of a RSU and has not yet entered the coverage of another RSU. In this case, getting back to the original question, that is, how to find the current location of the destination, is very difficult. For this reason, the first duty of our proposed machine learning system is to predict the location of the destination. The RSU that the destination most recently passed (i.e., R3 in Figure 2) utilizes the following information as inputs to predict the location of the destination: the moving path of the destination in its coverage, the driving lane and spot speed of the destination when it left, and generalized histories. After the proposed machine learning system completes its computations, R3 can predict the next RSU the destination will pass through and tries to forward packets along the paths between R3 and R4.
At this point, the second question to solve is how to select one or multiple paths from all the paths between R3 and R4. The proposed machine learning system is again used to answer that question. In this paper, RSUs are assigned the responsibility of monitoring path capability by counting the number of vehicles and transmission probabilities. The proposed system adopts the unsupervised learning method to predict the network capability of each path between two RSUs. Based on the capability of each path, the system can determine how many and which paths should be selected to balance the overhead and transmission delays.
After the path selection, the final question to be answered is which direction the forwarded packets should take along the selected paths. Consider the cases of vehicles C and D: both lie on paths between R3 and R4 but at different relative distances. The best way to transmit data to such a vehicle is to send packets to the RSU nearest to it by way of the wired path and then forward packets hop-by-hop along the selected path. For example, packets to C should be sent to R3 and then forwarded toward R4, whereas packets to D should be sent to R4 and then forwarded toward R3. Compared to wired networks, wireless networks are still more unstable, especially in highly dynamic environments such as VANETs; however, by using our proposed system, the path on which packets travel by means of wireless technologies can be shortened. The proposed system also provides instructions regarding the forwarding direction, taking into consideration vehicle speeds and transmission capacities of each path. Therefore, higher packet delivery ratios and lower transmission delays are expected after deploying the proposed system.
In summary, there are three prediction or evaluation mechanisms provided by our proposed machine learning system:
the prediction mechanism of vehicle moves,
the evaluation mechanism of transmission capacity,
the evaluation mechanism of forwarding direction.
The sample data of the first and second mechanisms have different features, as shown in Figure 3. Once new input data appear, we can appropriately determine the classification of new data through our system. We describe the specific mechanisms of our proposed system in detail below.

Figure 3: The prediction system for traffic flows.
3.2.1. Predicting the Vehicle Moves
A mechanism for predicting vehicle locations is very useful for looking up destinations. Therefore, several driving features of vehicles are monitored so that the proposed system can trace and predict the movement of a vehicle.
When a vehicle is about to leave the coverage of a RSU, the RSU uses the prediction mechanism of vehicle moves to predict the moving direction of the vehicle at the first intersection after leaving the RSU. The purpose is to determine which RSU the vehicle will visit next. To train the system, when a vehicle leaves the coverage of RSU i and then enters the coverage of RSU j, the vehicle informs RSU j of the identity of the previous RSU i. RSU j can then notify RSU i that the vehicle is now in its coverage. From these notifications, a RSU can estimate the proportion of traffic that flows into each of the other RSUs and use it to predict the moving directions of vehicles. In Figure 4, RSU1 can estimate the proportion of traffic flows into RSU2 and RSU3 after it predicts that the direction of a vehicle is A, B, or C. If a RSU wants to relay packets to a vehicle that once appeared in its coverage but has not yet entered the coverage of another RSU, this RSU can use the proportion of traffic flows to decide which RSUs to forward packets to.

Figure 4: RSU1 estimates the proportion of traffic flows into each RSU, respectively.
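The estimation of traffic-flow proportions from arrival notifications can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names and the representation of notifications as a list of next-RSU identifiers are hypothetical and are not defined in the paper.

```python
from collections import Counter


def flow_proportions(notifications):
    """Estimate the share of departed vehicles that next entered each RSU.

    `notifications` holds one next-RSU identifier per vehicle, as reported
    back to the previous RSU (hypothetical representation).
    """
    counts = Counter(notifications)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {rsu: n / total for rsu, n in counts.items()}


def rank_candidates(proportions):
    """Order candidate RSUs from most to least likely next visit."""
    return sorted(proportions, key=lambda rsu: (-proportions[rsu], rsu))


# RSU1 received three notifications from RSU2 and one from RSU3.
shares = flow_proportions(["RSU2", "RSU2", "RSU3", "RSU2"])
ranking = rank_candidates(shares)
```

A RSU that has lost track of a destination could then consult the ranking to decide toward which RSU packets should be forwarded first.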
The information transmitted to the RSU by vehicle i is represented by particular features, including the lane number (
To give different weight values to the different features, we determine the maximum and minimum values of each feature except
Our system determines the similarity between the sample data via the K-means [23] algorithm; it groups the sample data into several clusters. The operational steps are as follows.
Step 1.
K samples are randomly selected as the initial positions of the centers of mass.
Step 2.
Each remaining sample joins the nearest of the K centers of mass, thus forming K clusters.
Step 3.
The algorithm recalculates the center of mass of each cluster.
Step 4.
Repeat Steps 2 and 3 until the centers of mass are almost unchanged.
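The four steps can be sketched in plain Python as follows. This is an illustrative implementation of standard K-means, not the authors' code; the Euclidean distance, the stopping tolerance `tol`, the iteration cap `max_iter`, and the fixed random seed are assumptions introduced for the sketch.

```python
import math
import random


def nearest_center(sample, centers):
    """Return the index of the center of mass closest to `sample`."""
    return min(range(len(centers)), key=lambda c: math.dist(sample, centers[c]))


def kmeans(samples, k, max_iter=100, tol=1e-6, seed=0):
    """Cluster `samples` (a list of equal-length numeric tuples) into k groups."""
    rng = random.Random(seed)
    # Step 1: K samples are randomly selected as the initial centers of mass.
    centers = [list(s) for s in rng.sample(samples, k)]
    for _ in range(max_iter):
        # Step 2: each sample joins its nearest center, forming K clusters.
        labels = [nearest_center(s, centers) for s in samples]
        # Step 3: recalculate the center of mass of each cluster.
        new_centers = []
        for c in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # empty cluster: keep the old center
        # Step 4: repeat until the centers of mass are almost unchanged.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    # Final assignment against the converged centers.
    labels = [nearest_center(s, centers) for s in samples]
    return centers, labels
```

In the MARS setting, each sample would be the weighted feature vector reported by one vehicle, and K would be chosen much larger than the number of movement classes.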
As noted above, in our system, there are three classes of vehicular movements, that is, go straight, turn left, and turn right. Each cluster will randomly select P samples to determine which classification it belongs to. The number of clusters is much larger than the number of classes, so some clusters probably correspond to the same classification, as illustrated in Figure 5.

Figure 5: The relationship between the cluster and the classification.
After the training process, the sample data have been analyzed and grouped into several clusters. When a new sample appears, the system computes the cluster it is closest to and assigns the sample to that cluster. Therefore, the system is aware of which classification the new sample belongs to. When the number of samples in the clusters reaches a certain threshold, a newly arriving sample is likely to be grouped into more than one cluster at the same time, meaning that clusters overlap. At this point, the RSU needs to run the K-means algorithm again.
Using this mechanism, we use four features (lane number, direction, speed, and traveling roads) to predict the movement behavior of each vehicle. We can obtain tables from these predictions, as given in Figure 6. After predicting the movement behavior of each vehicle, a RSU can estimate the proportion of traffic flows into each RSU. When a RSU wants to forward packets to a vehicle that has left its coverage and has not yet been seen by another RSU, the proposed system can point out the RSU that the destination vehicle will visit with the highest probability. That RSU is selected, and packets are forwarded in its direction. To avoid buffer overflow, we do not forward packets to all RSUs. RSUs can collect more global information because of their wider coverage. They can evaluate the number of messages forwarded by vehicles and the number of vehicles moving on the road; therefore, they can predict the transmission situation of all possible routes. In other words, RSUs calculate the maximum delay time from the estimated delays of all routes and use it to set a threshold time. This threshold is used when transmitting packets to the RSU that the destination will visit with the highest probability. If the threshold expires before packets have been delivered to the destination, the RSU nearest to the source chooses the RSU with the second highest probability to forward packets to the destination. If the destination has passed through this selected RSU, the selected RSU runs our machine learning system again. Note that we choose multiple suitable routes to forward packets toward the selected RSU, as described in the next subsection.
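The mapping from clusters to movement classes and the classification of a newly arriving sample can be sketched as below. The paper states that each cluster randomly selects P samples to determine its classification but does not spell out the decision rule; the majority vote used here, as well as the function names and the Euclidean nearest-center rule, are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def label_clusters(labels, observed_classes, p, seed=0):
    """Map each cluster index to a movement class.

    labels[i] is the cluster of training sample i and observed_classes[i]
    is the movement that vehicle actually made. Each cluster draws up to
    `p` members at random and takes the majority class (assumed rule).
    """
    rng = random.Random(seed)
    members = {}
    for idx, cluster in enumerate(labels):
        members.setdefault(cluster, []).append(idx)
    mapping = {}
    for cluster, idxs in members.items():
        picked = rng.sample(idxs, min(p, len(idxs)))
        votes = Counter(observed_classes[i] for i in picked)
        mapping[cluster] = votes.most_common(1)[0][0]
    return mapping


def classify(sample, centers, mapping):
    """Assign a new sample to its nearest cluster and return that cluster's class."""
    nearest = min(range(len(centers)), key=lambda c: math.dist(sample, centers[c]))
    return mapping.get(nearest)  # None if the cluster was never labeled


# Three clusters, two of which correspond to the same class.
labels = [0, 0, 0, 1, 1, 2, 2]
observed = ["straight", "straight", "straight", "left", "left", "straight", "straight"]
mapping = label_clusters(labels, observed, p=2)
centers = [(0.0, 0.0), (5.0, 5.0), (9.0, 0.0)]
prediction = classify((4.5, 5.2), centers, mapping)
```

Because the number of clusters exceeds the number of classes, several clusters may map to the same class, as clusters 0 and 2 do in this example.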

Figure 6: The prediction for vehicles’ moves.
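The threshold-based fallback between candidate RSUs can be sketched as follows. This is a simplified illustration: the callback `try_deliver`, the dictionary of visit probabilities, and the `max_attempts` limit are hypothetical, and the threshold is assumed to equal the largest estimated route delay.

```python
def delivery_threshold(route_delays):
    """Threshold time derived from the estimated delays of all routes (assumed: the maximum)."""
    return max(route_delays)


def forward_with_fallback(visit_probability, route_delays, try_deliver, max_attempts=2):
    """Try candidate RSUs in decreasing order of visit probability.

    `try_deliver(rsu, threshold)` is a hypothetical callback that returns
    True if the packets reach the destination via `rsu` before `threshold`
    expires. Only the top `max_attempts` candidates are tried, so packets
    are never forwarded to all RSUs.
    """
    threshold = delivery_threshold(route_delays)
    ranked = sorted(visit_probability, key=visit_probability.get, reverse=True)
    for rsu in ranked[:max_attempts]:
        if try_deliver(rsu, threshold):
            return rsu
    return None


# The most likely RSU fails within the threshold, so the second is tried.
attempts = []

def fake_deliver(rsu, threshold):
    attempts.append((rsu, threshold))
    return rsu == "RSU3"

chosen = forward_with_fallback(
    {"RSU2": 0.6, "RSU3": 0.3, "RSU4": 0.1}, [0.2, 0.5, 0.35], fake_deliver
)
```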
3.2.2. Evaluating Transmission Capacity
Once a RSU knows which RSU it should forward packets toward, it uses an evaluation mechanism for transmission capacity to predict suitable routes with smaller transmission delays. When a vehicle (e.g., vehicle A in Figure 7) leaves the coverage area of RSU i and enters the coverage area of RSU j, it records its traveling route and reports it to RSU i via RSU j. Consequently, a RSU can learn all the routes that vehicles travel between itself and other RSUs. Note that the RSU can collect more global information because of its wider coverage. It can estimate the number of messages that should be forwarded and the number of vehicles present on the road. According to the number of messages and vehicles, a RSU can predict the transmission capacity of all routes.

Figure 7: A vehicle reports its traveling route to a RSU.
In the evaluation mechanism, features of the input data are the number of messages (
We give different weight values to different features. Then, we determine the similarity between sample data by K-means and group the sample data into several clusters, as mentioned above. Each cluster randomly selects P samples for verification to determine its classification as high, middle, or low. When a new sample appears, the system computes the cluster it is closest to and assigns the sample to that cluster. Therefore, the system is aware of the classification to which the new sample belongs.
After the evaluation of transmission capacity is completed, a RSU knows the transmission capacity of all related routes. Moreover, instead of choosing only the route with the highest transmission capacity for forwarding packets, our mechanism allows all routes with smaller transmission delays to relay packets. Because the destination may not travel on the selected route, one way to successively relay packets is to forward packets via wired networks to the next RSU near which the destination will pass; however, this method introduces longer delay times. Therefore, except for the routes with poor transmissions, the RSU proceeds with the third part of MARS for the other possible routes, as shown in Figure 8. Then, we can decide on one of three relaying methods for each traveling route according to the estimation mechanism.

Figure 8: Packets will be relayed via all possible routes.
3.2.3. Evaluating Forwarding Direction
According to related research, the number of vehicles on the road has a great impact on network transmission performance. When the density of vehicles is high, the number of vehicles competing for channel resources is also high. Collisions occur more frequently, and the transmission performance degrades. When the density of vehicles is low, the number of competitors decreases and collisions are fewer; therefore, performance improves. Sometimes, however, the traffic is too sparse to transmit messages via wireless communications, and transmissions have to rely on the carry-and-forward technique. The speed of the moving vehicle carrying information is far slower than the speed of wireless communications, so transmission delays increase.
If a destination is not currently in the coverage area of any RSU, there are three relaying methods. First, the RSU directly chooses a relay vehicle to relay packets to the destination. Second, the RSU forwards packets via wired networks to the next RSU through which the destination will pass. Third, the RSU forwards packets to the next RSU; then the next RSU directly chooses a vehicle in its coverage area to relay packets to the destination. The estimation mechanism can calculate the required transmission delay time of these three methods, as illustrated in Figure 9.

Figure 9: Three possible methods for relaying packets to the destination.
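Once the delay of each relaying method has been estimated, the choice among the three methods can be sketched as a simple comparison. The function below assumes that the method with the smallest estimated delay is preferred and takes the three delays as given inputs; the function name and the method labels are hypothetical.

```python
def choose_relay_method(delay_v2v, delay_wired, delay_wired_then_v2v):
    """Pick the relaying method with the smallest estimated delay (assumed rule).

    delay_v2v:            the current RSU hands packets to a relay vehicle.
    delay_wired:          packets wait at the next RSU for the destination.
    delay_wired_then_v2v: the next RSU hands packets to a relay vehicle.
    """
    delays = {
        "v2v_from_current_rsu": delay_v2v,
        "wired_to_next_rsu": delay_wired,
        "wired_then_v2v_from_next_rsu": delay_wired_then_v2v,
    }
    method = min(delays, key=delays.get)
    return method, delays[method]


method, delay = choose_relay_method(0.8, 1.5, 0.4)
```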
RSU1 receives packets from the other RSU via a backbone network and is the latest RSU the destination vehicle has moved through. We first calculate the delay time of forwarding packets from RSU1 to the predicted RSU2. Because the backbone network is part of the wired network, forwarding the packets to RSU2 takes very little time. We assume that D is the distance between RSU1 and RSU2 and
where T is the required driving time of the destination vehicle from RSU1 to RSU2 and
If RSU1 transmits packets by V2V, the required transmission delay time is the time to forward packets from a relay vehicle, which RSU1 chooses, to the destination vehicle. If collision probability
In the basic mode, the competitor subtracts one from its backoff timer when one of the following conditions is met [24–26]: (1) there is no competitor transmitting packets; that is, it is an idle time slot and consumes one time slot, and we use
Note that we can also calculate these parameters in the RTS/CTS access mode according to the method mentioned above. Next, we can estimate the expected value
The average delay time
The required transmission delay time
In (8),
The required transmission delay time of the third relaying method
In (9),
RSU1 compares
3.3. Analysis of MARS and Other RSU-Assisted Schemes
In MARS, RSUs equipped with a machine learning system predict the movement of vehicles, so packets can be forwarded in the correct direction. Other RSU-assisted mechanisms do not have such prediction capabilities; they adopt traditional broadcast routing techniques to forward packets. Therefore, MARS significantly reduces transmission delays and traffic overheads. We analyze MARS and other RSU-assisted methods below.
As shown in Figure 10, a source vehicle forwards packets to its nearest RSU1, and RSU1 forwards packets to RSU2, which has the newest information (obtained via backbone network) regarding the destination vehicle. In MARS, we predict the RSU3 as the next RSU that the destination will pass through. The distance from the destination to RSU2 and RSU3 is

Figure 10: We analyze MARS and other RSU-assisted methods.
If
In (10),
In Figure 11, we assume that the length of each road segment is S and the transmission range of each RSU is R (
In (11),

Figure 11: The estimation of traffic overheads.
We infer the total number of road segments covered by the furthest transmission area as follows:
As a result, we obtain the traffic overhead of other RSU-assisted mechanisms:
For MARS, we predict the movement of vehicles. We know that packets should be forwarded to specific RSUs instead of all directions. In Figure 11, we estimate the total number of road segments covered by the largest square area with side length
In (14), the value of
From the equations above, we conclude that MARS can greatly reduce traffic overheads compared with other RSU-assisted approaches.
4. Simulation Results
To evaluate the performance of the proposed MARS mechanism, we used NS-2 (version 2.35) as our simulation tool. The simulation scenario, shown in Figure 12, is a 2000 m × 2000 m area of Manhattan in New York City extracted from the OpenStreetMap database [27]. We use VanetMobiSim [28] to generate vehicle movements that fit the general behavior of vehicles in an urban environment. RSUs are deployed at random locations. Each vehicle's speed ranges from 5 to 30 m/s, and the simulation time is set to 400 s. Wireless signal propagation follows the two-ray-ground model, and transmission power is adjusted to give a maximum transmission range of 250 m. Each source vehicle generates packets at a constant bit rate (CBR) in the application layer, with a packet size of 512 bytes. For each destination vehicle, the actual data rate is affected by factors such as the distance between sender and receiver and channel fading. The parameter settings of the lower layers, such as the physical layer and data link layer, have less impact on the proposed system because our machine learning system can significantly shorten V2V routes for data transmissions with the help of RSUs. We ran 30 independent simulations for each configuration and averaged the outcomes to obtain the performance graphs. Table 1 lists the simulation parameters. We assess the benefits of the proposed mechanism in terms of packet delivery ratio, end-to-end delay, and control overhead.
Table 1: Simulation parameters.

Figure 12: Manhattan map extracted from the OpenStreetMap database.
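The packet delivery ratio and end-to-end delay named above, and the averaging over independent runs, can be sketched as follows. This is only an illustration under stated assumptions: the `PacketRecord` layout and the toy traces are hypothetical and are not taken from the NS-2 trace format or from the authors' scripts. Control overhead is left out because this section does not state how it is counted.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class PacketRecord:
    """One data packet in a run (hypothetical record layout)."""
    sent_at: float                 # send time in seconds
    received_at: Optional[float]   # None if the packet never arrived


def packet_delivery_ratio(records: List[PacketRecord]) -> float:
    """Fraction of sent packets that reached the destination."""
    delivered = sum(1 for r in records if r.received_at is not None)
    return delivered / len(records) if records else 0.0


def mean_end_to_end_delay(records: List[PacketRecord]) -> float:
    """Average delay over delivered packets only."""
    delays = [r.received_at - r.sent_at
              for r in records if r.received_at is not None]
    return mean(delays) if delays else float("nan")


# Two toy runs standing in for the 30 independent runs per configuration.
runs = [
    [PacketRecord(0.0, 0.25), PacketRecord(1.0, None),
     PacketRecord(2.0, 2.75), PacketRecord(3.0, 3.5)],
    [PacketRecord(0.0, 0.5), PacketRecord(1.0, 1.5)],
]

# Each metric is computed per run and then averaged across runs.
pdr = mean(packet_delivery_ratio(run) for run in runs)
delay = mean(mean_end_to_end_delay(run) for run in runs)
print(f"PDR = {pdr:.3f}, mean delay = {delay:.3f} s")
```

Averaging per-run metrics, rather than pooling all packets, gives each run equal weight regardless of how many packets it generated.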
The simulation results compare our proposed MARS mechanism with CLWPR and STAR. Like many VANET routing protocols, CLWPR and STAR assume that the position of a destination can be obtained by means of GPS, and neither presents any other positioning mechanism; however, obtaining the position of a destination is difficult in VANETs due to the high mobility of vehicles. In fact, for a routing protocol, reaching a destination by means of GPS may require hop-by-hop GPS queries because the destination keeps moving. Furthermore, as noted above, querying GPS regarding the position of a destination introduces many security issues and may invade personal privacy.
Thus, there might not be a system available to provide the service of querying positions of other vehicles. For this reason, MARS estimates the position of a destination by means of our machine learning system rather than GPS. To compare with CLWPR and STAR, we also present the results in which MARS uses GPS to find a destination; similarly, we present results in which STAR adopts our proposed machine learning system to find a destination. In the MARS protocol, the RSU nearest to the source can help to deliver packets to the RSU nearest to the destination by means of wired links. We also integrate this function into the STAR protocol in order to verify the improvements gained by this function.
Thus, six datasets can be seen in our experiments. The notation “(GPS, RSU forward)” indicates that the protocol adopts GPS to obtain the position of a destination and RSUs are involved to deliver packets to the RSU nearest to the destination. The notation “(ML, RSU forward)” indicates that the protocol adopts our proposed machine learning system to obtain the position of a destination and RSUs are involved to deliver packets to the RSU nearest to the destination. Our proposed functions in MARS are difficult to integrate with CLWPR; therefore, only one dataset regarding CLWPR is presented in each figure.
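The "RSU forward" function described above (source to its nearest RSU, wired links to the RSU nearest to the destination, then V2V) can be illustrated with a minimal sketch. The RSU names, the coordinates, and the `relay_plan` helper are hypothetical. The destination position is whatever the protocol variant supplies (GPS or the machine learning estimate). Straight-line distance is used here only for simplicity and is an assumption, not something the paper specifies.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def nearest_rsu(position: Point, rsus: Dict[str, Point]) -> str:
    """Name of the RSU with the smallest Euclidean distance to `position`."""
    return min(rsus, key=lambda name: math.dist(position, rsus[name]))


def relay_plan(source: Point, dest_estimate: Point,
               rsus: Dict[str, Point]) -> List[Tuple[str, str, str]]:
    """Return the hops as (link type, from, to) tuples."""
    entry = nearest_rsu(source, rsus)
    exit_rsu = nearest_rsu(dest_estimate, rsus)
    plan = [("wireless", "source", entry)]
    if exit_rsu != entry:
        # The wired backbone replaces what would otherwise be a long V2V route.
        plan.append(("wired", entry, exit_rsu))
    plan.append(("V2V", exit_rsu, "destination"))
    return plan


# Hypothetical RSU layout in metres.
rsus = {"RSU1": (0.0, 0.0), "RSU2": (1000.0, 0.0), "RSU3": (1000.0, 1000.0)}
plan = relay_plan((100.0, 50.0), (900.0, 950.0), rsus)
print(plan)
```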
In Figure 13, we first present one of the key indices for routing protocols, namely, the packet delivery ratio. The RSU coverage ratio is the ratio of the total area covered by RSUs to the whole area of the map. The proposed protocol estimates the position of a destination and the transmission capacity of a path based on the information gathered by the RSUs; performance is therefore expected to improve as the total coverage area of the RSUs increases. CLWPR selects the next hop by considering SNR and frame error rate; however, its hop-by-hop selection suffers from the local optimum problem, which degrades the packet delivery ratio. Routing protocols that consider the capacity of the whole path, such as STAR, perform well in terms of packet delivery ratio. According to Figure 13, regardless of the protocol deployed, forwarding packets over the wired links between RSUs yields significant improvements because it shortens the distance that packets travel over V2V links.

Figure 13: Packet delivery ratio under different RSU coverage ratios.
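The RSU coverage ratio defined above (area covered by RSUs divided by the map area) can be estimated numerically. The sketch below is one possible way to do so, not the authors' procedure: it assumes each RSU covers a disc whose radius equals the 250 m maximum transmission range, treats overlapping discs as covered once, and clips coverage to the 2000 m × 2000 m map. The function name and the grid step are hypothetical.

```python
import math
from typing import List, Tuple


def rsu_coverage_ratio(rsus: List[Tuple[float, float]],
                       side: float = 2000.0,
                       radius: float = 250.0,
                       step: float = 10.0) -> float:
    """Grid estimate: share of sample points within `radius` of any RSU."""
    n = int(side / step)
    covered = 0
    for i in range(n):
        for j in range(n):
            # Sample the centre of each grid cell.
            x = (i + 0.5) * step
            y = (j + 0.5) * step
            if any(math.hypot(x - rx, y - ry) <= radius for rx, ry in rsus):
                covered += 1
    return covered / (n * n)


# A single RSU at the map centre should cover about pi * 250^2 / 2000^2.
single = rsu_coverage_ratio([(1000.0, 1000.0)])
print(f"estimated: {single:.4f}, analytic: {math.pi * 250**2 / 2000**2:.4f}")
```

With randomly placed RSUs, discs overlap and spill outside the map, so the ratio grows more slowly than the number of RSUs; this is why a grid estimate is used here rather than summing disc areas.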
When GPS is adopted to obtain the position of a destination, the packet delivery ratio improves by ~5% compared with our proposed machine learning system. In other words, the machine learning system comes within about 5% of the GPS-based result, which we take as an indication of its credibility. These results are promising and show that our proposed system can find a destination without the aid of GPS, which makes it more practical and secure.
Compared with STAR, MARS improves the packet delivery ratio by ~4%–6%, because STAR maintains connectivity information by sending announcement messages from one intersection to the next. These announcement messages are forwarded via V2V links, which are less reliable than gathering and distributing information via RSUs. As shown in Figure 14, the gap between the low density and high density environments in STAR is ~6.7%, compared with ~2.7% in MARS. Because of its distributed architecture, STAR is affected more strongly when V2V links become unstable.

Figure 14: Packet delivery ratio in the low and high density environments.
The gaps between the low density and high density environments for CLWPR and pure STAR are 14.5% and 13.3%, respectively. Figure 14 also shows the notable benefit of RSU forwarding: comparing pure STAR with STAR with RSU forwarding, the packet delivery ratio improves by ~22.7% when RSU forwarding is deployed.
Figures 15 and 16 show the simulation results for end-to-end delay. The RSU coverage ratio has a strong impact on the end-to-end delay when RSUs forward packets to the RSU nearest to a destination. Using the wired links between RSUs shortens the total length of the V2V route, so the end-to-end delay decreases considerably. The delay difference between protocols with and without GPS (i.e., with GPS replaced by our proposed machine learning system) again reflects the credibility (e.g., hit rate) of our machine learning system. Between two RSUs, MARS applies the machine learning system a second time to select multiple paths with higher probabilities of successfully delivering packets to the destination. Compared with STAR, the information collected by RSUs in MARS, such as vehicle densities of roads and driving directions of vehicles, is more detailed. Further, the information can be maintained reliably due to the stability of RSUs.

Figure 15: End-to-end delays under different RSU coverage ratios.

Figure 16: End-to-end delays in the low and high density environments.
In Figure 15, as discussed above, information maintained by vehicles is less stable than information maintained by RSUs. Thus, the information resolution in MARS is better than that in STAR. Furthermore, MARS selects the RSU nearest to a destination as the starting point of the V2V path, so the V2V path is shorter than that of STAR. Therefore, MARS has a shorter average end-to-end delay.
Finally, we evaluated the control overhead of MARS and STAR to determine their costs. The results are presented in Figure 17. The control overhead in MARS comes primarily from vehicles reporting information to RSUs for training the machine learning system. These transmissions between vehicles and RSUs are one-hop transmissions, whereas STAR relies on multihop transmissions: vehicles transmit messages from one intersection to the next to calculate the connectivity probabilities of all road segments. The multihop transmissions needed to maintain protocol information in STAR cause more control overhead than in MARS. The amount of control overhead in MARS is related to the number of RSUs in the map; therefore, when the RSU coverage ratio increases, the control overhead in MARS also increases. Even with an RSU coverage ratio of 90%, MARS still has lower control overhead than STAR.

Figure 17: The comparison of control overheads between STAR and MARS.
In summary, our proposed MARS protocol achieves a higher packet delivery ratio and a shorter end-to-end delay while keeping control overhead lower, which leaves more transmission opportunities for data. According to the simulation results, MARS is useful for VANETs in urban environments.
5. Conclusions
In this paper, we proposed a routing information system—MARS. We use RSUs and machine learning to maintain road information. MARS can predict the movement of vehicles and then choose some suitable routing paths with higher transmission capacity to transmit packets. Moreover, MARS can help to decide the forwarding direction between two RSUs. Our method can construct more complete and real-time traffic information and provide appropriate routing information for VANETs.
In our simulation results, our method performed better than the compared protocols under different vehicle densities. Further, we found that variations in V2V capacity had a relatively low impact on the performance of MARS. These results indicate that our proposed protocol is more reliable and efficient for data transmissions in VANETs in which vehicles have high mobility. As a result, VANETs combined with our proposed machine learning system can effectively improve routing performance.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
