Abstract
In vehicular ad hoc networks (VANETs), video communication makes a significant contribution to quality of experience (QoE) for people on the road. However, the selection of the video source is an impediment to video delivery due to the high mobility and dynamic topology of VANETs. An improper provider not only leads to frequent interruptions of communications, but the transmission of the invalid video fragments would also result in the waster of precious bandwidth. To address the issue, a novel video source decision scheme, named Cluster and Dynamic Overlay based video delivery over VANETs (CDOV), is proposed in this paper. By the on-demand clustering approach, nodes with the same video requirement/supply and moving features are clustered. Further, in a cluster, an overlay tree is constructed dynamically based on the relation between supply and demand, in which all requesters can find their greedy optimal source easily. In addition, the intracluster communication and head-RSU communication are designed for video streaming over this network structure. Using extensive simulations, the effectiveness of the proposed scheme is demonstrated. Compared with two existing works, the proposed solution is capable of obtaining lower startup latency and higher delivery ratio.
1. Introduction
Benefitted from the exclusive directed short range communication (DSRC) radio spectrum, vehicular ad hoc networks (VANETs) enable vehicles on the road to have wireless communication with each other and access the roadside unit (RSU) for Internet service [1]. By vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communications, the network provides a variety of novel and exciting applications. VANETs make the trip of users safe, efficient, and enjoyable, which attracts increasing attention in various fields [2].
Recently, as the technology of intervehicle communication development, video communication is possible in VANETs [3]. Utilizing communication protocols, video can be transmitted from the provider to the requester since the source is decided. Compared with text, video is more informative and intelligible. Furthermore, video transmission is bit loss tolerant. Thus one packet missing may not affect the experience of users [4]. Therefore, it has potential to be of high benefit for traffic management as well as for providing value-added entertainment [5] and advertising services [6]. In VANETs, valuable literatures are abound in studying transmission technologies for video streaming on each protocol layer. The performance of video streaming in IEEE 802.11p vehicular networks is discussed on MAC layer [7–9]. On the network layer, a rebroadcast mechanism is presented in [10] while Rezende et al. [11] study the relay node selection algorithm. As more and more vehicles are equipped with wireless communication devices, large number of users expect to be serviced with high quality of experience (QoE) in video applications. Thus not only the video delivery strategy but also the video source decision scheme should be studied. However, features of VANETs and video applications make the study of video source selection very challenging. First of all, the limited number of RSUs makes the channel of V2I more precious. A vehicle can access the Internet directly and get a wealth of information by contacting with a RSU. However, it is impractical that every vehicle requests video from one RSU, since a RSU can only serve one user at a time. Thus more available video sources instead of RSUs should be explored. Secondly, large scale broadcast causes heavy overhead though broadcast is necessary in VANETs. It is expensive that every node in the network discovers its video source by multihop broadcast. Thus, the broadcast area, as well as the size and number of the packets broadcasted, should be as little as possible. In addition, the nonuniform distribution of nodes, the high mobility, and dynamic nature of VANETs cause the issue of intermittent connectivity [12, 13]. The quality of communication is not guaranteed; even the selected provider has plenty of video sources. We propose a structure for the video source decision. By combing clustering approach and the overlay tree method, the relation between supply and demand is clear. A vehicle can find its greedy optimal source fast and efficiently by local information. A content-aware and on-demand clustering approach and a dynamic mesh-structure overlay tree are designed. By clustering nodes with similar features, the overhead of video source decision is decreased. In addition, the channel of RSUs is utilized more efficiently since only the cluster head can communicate with RSUs. Over the structure of the network, we proposed two simple but efficient communication modes for video delivery, which are the intracluster mode and cluster-RSU mode. Then we conduct a comprehensive simulation to verify the performance of CDOV.
An improper provider not only leads to frequent interruptions of communications, but the transmission of the invalid video fragments would also result in the waster of precious bandwidth. It makes the video source decision the first impediment to video communication in VANETs. To address the issue, a cluster based dynamic overlay scheme for video delivery over VANETs (CDOV) is proposed for video source decision in this paper. We design an on-demand clustering approach and a dynamic overlay algorithm. By the two algorithms, nodes with same video requirement/supply and moving features are gathered in a cluster. All requesters can find their greedy optimal source easily by searching in the tree. Then a mechanism for video delivery is designed over the network structure. Utilizing extensive simulations, the performance of the proposed CDOV is demonstrated. The main contributions of this work are as follows.
The rest of this paper is structured as follows. Section 2 surveys the related work. In Section 3, our assumptions and the structure of the network are introduced. The on-demand clustering approach and the overlay tree method are depicted in Sections 4 and 5, respectively. Then the transmission mechanism over this architecture is depicted in the following section. Section 7 shows our simulation results. In the end, we conclude this work.
2. Related Work
The source decision and transmission mechanism are important components of video communication. The source is always selected by broadcast while there are valuable literatures devoted to the transmission mechanism. This section highlights and classifies literatures on multimedia streaming approaches that have been proposed to provide video delivery in vehicles on the roads. The classification of the existing multimedia streaming approaches is illustrated in Figure 1.

Classification of video delivery approaches over VANETs.
2.1. Gossiping Based Streaming
Gossiping mechanism leverages the inherent broadcast nature of the wireless medium. In this mechanism, the source node starts the dissemination by broadcasting packets, and every node that receives a packet retransmits it. The difference with broadcast is that the retransmission of a packet by intermediary nodes is subjected to a random test of probability γ (flooding happens when
2.2. Network Coding Based Streaming
Network coding allows nodes to combine different previous received packets together to generate coded packets to transmit, which is an effective technique. Lee et al. [16] firstly realize the utility of random linear coding for P2P file sharing systems in mobile network. Then a SVC-based streaming is proposed for multipurpose applications in [17], which uses the path diversity and network node. Yang et al. [18] introduce a live multimedia streaming scheme in VANETs and fully exploit the benefits of symbol-level network. The network coding based method is an effective technique that can improve user experience and maximize network capacity in high density and low mobility networks. It takes a lot of time to collect enough pieces to decode when the density of nodes is low.
2.3. Cluster Based Streaming
It is generally known that clustering can solve the scalability, stability, and mobility issues. In [19], an application-layer overlay-network solution is introduced. It includes a distributed clustering method (DVAC) and an efficient peer-to-peer relay method (VAPER) to deliver live video streaming. ZIPPER, a zero infrastructure peer to peer system for VANETs, is proposed in [20]. Different from VAPER, ZIPPER pulls the multimedia stream, which means that the vehicle obtains the required pieces if it finds the required pieces. However, the above schemes form one hop clusters regardless of whether cluster members need to receive multimedia stream without considering the user interest. Then, a user-oriented cluster-based solution for multimedia delivery over VANETs is introduced in [21], which addresses vehicle passenger preferences and delivery multimedia content of their interest.
2.4. Overlay Based Streaming
Overlay network with multiple sources has proven to be robust, which is a distributed solution to multimedia transport. Video streaming using an overlay is firstly introduced in [22]. However, this overlay network is static, so it cannot adjust the overlay in time for the high-dynamic VANETs. In [23], Hsieh and Wang present a dynamic application layer mesh-structure overlay for multimedia streaming multicast to vehicles for the first time. Then in MANETs, OMHF (overlay multicast based on heterogeneous forwarding) [24] and ALMA (application layer multicast algorithm) [25] are designed, which are typical dynamic overlay video transmission protocols. However, these methods did not take full advantage of the V2I communication. Moreover, the number of the nodes in overlay may be too great to decrease the transmission efficiency as the overhead of maintaining the dynamic overlay is high. Reference [26] focused on the outside park vehicles to construct a stable parking backbone on top of the highly dynamic VANETs. However, it ignores the dynamic feature of the moving nodes which is one of the most important characteristics and challenges in VANETs.
3. Framework of CDOV
In this section, we first describe the problem of video source decision. After giving the assumption of this work, we then introduce the system architecture of the proposed CDOV.
3.1. Problem Description
This work focuses on the issue of video source selection in highway scenarios of VANETs. In these scenarios, a limited number of RSUs (
3.2. Assumption
Considering a straight highway over which a vehicular network resides, we make the following definitions and assumptions in this work.
3.2.1. Location and Distance
The location of a vehicle is described by (longitude, latitude) which is obtained by the GPS device. Then distance between node i and j in the network is defined by
3.2.2. Communication Model
We assume that both the RSU and the vehicle have the same transmission range R. We adopt the IEEE 802.11p as the PHY and MAC protocols. We assume that the transmission from node i to node j can be successful if and only if the following condition holds:
3.2.3. ID for Vehicles and Videos
We assume that each vehicle and video have unique ID numbers, that is,
3.2.4. Length of a Cluster
Vehicles tend to move in clusters where two consecutive clusters of vehicles are normally separated by a relatively large distance. By assuming that vehicles are distributed following a Poisson process with density λ and the transmission range is R, Wisitpongphan et al. [28] give the average length of a cluster by
3.3. Architecture
CDOV contains an on-demand clustering approach and a dynamic overlay algorithm for video source selection. The architecture of CDOV is shown in Figure 2, in which infrastructures are located on the side of the road. Units are divided into three types, that is, RSUs/infrastructures, cluster headers, and cluster members. Thus the V2V and V2I communications all exist. As shown in Figure 2, the upper layer consists of cluster heads which are in charge of its cluster members. The cluster header can join in not only the V2V communication but also the V2I communication.

Illustration of architecture of CDOV.
In CDOV, a source table for each requester is conducted by combining the clustering approach and the overlay scheme. Then, if a vehicle demands for some video clips, it can require the video from the corresponding source by local information. The proposed scheme contains the following parts. First of all, CDOV has an on-demand clustering method to distinguish the header and the member. Based on the mobility and the quality of link, moving directions, locations, and video requirements of vehicles are considered in the method. In a cluster, the logical relation between supply and demand is revealed by the design overlay tree. Communication mechanism is needed with the clear supply-demand relationship.
Up till now, we have an outline of the proposed CDOV scheme. Based on these assumptions, we will proceed to present the clustering approach, the overlay tree method, and the communication mode in detail.
4. On-Demand Clustering Approach
In our clustering approach, three measures are taken to deal with the mobility of the network. To ensure the stability of a cluster, moving directions and locations of vehicles are considered. Further, we restrict the size of a cluster to avoid the packet loss caused by excessive hops. In addition, taking demands of requesters and contents of providers, the efficiency of video communication and QoE of users are guaranteed. Thus the principle of our clustering algorithm is that nodes satisfying three conditions, that is, requiring or providing the same video and locating in the communication range of the cluster head and the same moving direction, are in a cluster. A simple cluster is shown in Figure 3 where all colored vehicles form a cluster. Then we will describe the two stages of our clustering algorithm, that is, the stage of clustering and the stage of maintaining.

Illustration of a cluster.
4.1. Clustering
The clustering algorithm is described in Algorithm 1. Dedicated to a video, a cluster is sponsored by a status message, which is broadcasted by a vehicle named the cluster head.
(1) (2) // Determine the type of i (3) (4) (5) insert corresponding information in record table (6) (7) broadcast new table in the cluster (8) (9) // Measures of a cluster header (10) (11) (12) rebroadcast in the cluster (13) (14) renew a replay to the header (15) (16) // Measures of a cluster member (17) (18) (19) (20) as a header, i sponsors a new cluster for (21) i broadcasts a status message of the cluster (22) (23) (24) (25) i sends a reply message to header (26) (27) (28) // Determine the condition of a cluster member (29) (30) (31) // Measures of a non-member (32) (33)
Any vehicle can be a cluster head if it requires a video and hears no status message about the video. Then it will broadcast its status message to build a cluster. The format of the status message is shown in Table 1, which contains the vehicle ID
Format of status message.
Each vehicle has a chance to be a member of a cluster if and only if it satisfies the above mentioned conditions and sends a reply message to the corresponding cluster head. This reply message contains the vehicle ID
Format of reply message.
In addition, the head takes the responsibility to manage the record table (shown in Table 3) for the cluster. Once the head receives a reply from one nonmember node, it will update the table according to the new coming member, as well as broadcasting the new record table in the cluster.
Illustration of the record table.
Since the cluster is built, a member can easily and quickly find a stable video provider because of the limited size and the stability of the cluster.
4.2. Maintaining
In the stage of clustering, not only the cluster is set up, but also new coming nodes can join the cluster following Algorithm 1. Then in this part, the maintaining method is introduced. In CDOV, the maintaining of a cluster is defined by detecting leaving nodes. There are two cases, that is, active and passive.
4.2.1. Case 1
A vehicle is called an active leaving node if it wants to stop providing video to others or requiring the video. Then it can leave the cluster by sending a special reply message with
4.2.2. Case 2
A passive leaving node is the node that is far away from the header. According to the information of location, speed, and acceleration, a cluster can calculate the real-time location for every node recorded in the table. It will delete the corresponding item of a node once the header finds the distance between them is larger than L.
5. Proposed Overlay Tree
In this section, we will depict the overlay scheme based on the cluster, which is used in intervehicles communication. In the process to construct the overlay structure, we divided the cluster members into two types: 1-type and 0-type. 1-type members stand for the nodes which want to obtain video segments and 0-type members represent the nodes which can provide the video segment and do not require video segments. The cluster members and the cluster head are organized as an overlay binary tree according to the relationship of the supply and demand. To further improve the robustness of the overlay, we set the 1-type as left child nodes and 0-type as right child nodes. The optimal child node is selected by an analogous XOR function. Moreover, the overlay structure needs to change frequently to meet the strict QoE because of the highly moving nature of the nodes in VANETs.
5.1. Analogous XOR Function
An analogous XOR function is defined to present the maximum difference of video segments. The function is described as follows:
Let vectors
5.2. Construction of an Overlay Tree
In a cluster, an overlay tree is built by Algorithm 2. The principle of the algorithm is that both the left child and the right child of a parent must be the node with the highest value among all the available nodes. The value is calculated by analogous XOR function.
(1) // Classify all members into two types (2) (3) (4) (5) (6) (7) (8) (9) // Construct the overlay tree (10) set p equals to V-ID of header (11) (12) H-ID of left child of p equals to j (13) p equals to j (14) (15) (16) // decide the left child (17) set p equals to V-ID of header (18) (19) H-ID of right child of p equals to j (20) p equals to j (21) (22) (23) // decide the right child
5.3. Record of the Overlay Tree
The cluster head should record the overlay tree and then broadcast the record in the cluster. It is generally known that the preorder traversal and sequence traversal can uniquely determine a tree. The preorder traversal of the overlay tree is
5.4. Illustration
By the following concrete example we explain the procedure of video source decision in detail. The scenario is shown in Figure 4. A cluster is formed with eight vehicles. The cluster head is node 1 with red colour. The information about the distribution of the video segments is registered in the record table. The video is split into 9 segments and each vehicle contains the different segments. Based on the supply-demand relationship, the header classifies members into two types. Two blue vehicles, vehicles 4 and 7, are 1-type nodes, and the rest of the vehicles are 0-type nodes. Then, by the results of XOR function for each pair of nodes, the overlay tree of the cluster is generated.

Classification of video delivery approaches over VANETs.
Remark 1.
The root node is the cluster head because the cluster head is the first node that sends requesting message to get some segments. This means that the cluster head must be the 1-type node. Another reason why the header is chosen as the root is that it is the only node which can communicate with RSU.
Remark 2.
Based on the consideration that once a node downloads a segment of a video it may be interested in this video, we look for the children for the chosen right children if there are some 0-type nodes left. Another advantage is that the more segments the right child contains, the more segments its parent can get. We use similar sorts of rules to search for children for the right children until all the nodes in the
Based upon the two algorithms, the structure of the network is structured. Any node in the network can quickly and efficiently find its greedy optimal source that is its right child node. In addition, the node has one backup source, which is its left child node. Then delivery scheme can be executed over the structure.
6. Communication Mode
To deliver video packets in VANETs, a transmission mechanism is designed over the above network structure. Two communication modes exist, which are the intracluster communication and the head-RSU communication. In a cluster, every vehicle can require its desired video from its source by the intracluster communication mode. However, only the cluster head can access the RSU by the head-RSU mode when it is moving around one RSU. Further, the head-RSU communication has higher priority to use the channel than the intracluster communication. In the following, the two modes are described in detail.
6.1. Intracluster Communication
The intracluster communication occurs between the head and the member or two members. Recall that
6.2. Cluster-RSU Communication
In a cluster, the cluster head knows which segments are rarest or nonexistent. Thus it can download these video segments from a RSU when it is moving around the RSU. The request message from the cluster to the RSU contains the number of the cluster members, the number of the segments, a vector of video segment ID, current speed of cluster head, acceleration of cluster head, and current position. Then, the RSU delivers the video segments to the head once it receives the request message.
However, a RSU can serve one head at one time. Thus, the service order as well as the service time for each cluster head should be scheduled if more than one cluster head want to access the RSU. In this paper, we consider both the remaining time and the weighting value of a cluster to allocate the accessing priority.
The remaining time

Illustration of remaining distance.
The weighting value of the cluster i (
Then, we use the
7. Performance Evaluation
In this part, we verify the performance of CDOV via NS-2 [30]. The startup delay, delivery rate, and control overhead will be used to measure the performance. We introduce the simulation from three parts: the simulation setup, the evaluation metrics, and the simulation results.
7.1. Simulation Setup
We simulate the protocols in a
7.2. Evaluation Metrics
We compare our solution with two video streaming schemes, that is, a noncooperative scheme [32] and a gossiping-based protocol [14]. In the noncooperative scheme, the vehicle can only communicate with RSU for downloading videos, which means the RSU is all vehicles’ video source. However, vehicles can use V2V communication to share video information in a gossiping-based protocol. Thus both RSU and vehicles on the road can be the source of a node.
We use the three QoS metrics, the startup latency, the delivery rate, and the control overhead, which further affect the QoE, to measure the performance of the CDOV we proposed. The startup latency is defined as the time interval of the video segment from source node to destination node. It is clear that the shorter the startup latency, the higher QoE. The delivery rate is the portion of the successful transmittal segments to the total transmittal segments. As we know, substantial amount of dropped packets can result in the fact that the drivers or someone else cannot acquire the correct information. We define the control overhead as the portion of the transmitting information except the segments to the total transmittal information. All the three metrics mentioned above have effects on the interactivity, so these can reflect the performance of the protocol and further reflect QoE at some level.
7.3. Simulation Results
We test the protocol performance under two kinds of simulation scenarios. Figures 6 and 7 show trends of startup delay and the delivery rate in the first simulation scenario, respectively. Moreover, the control overhead is shown in Figure 8. At the same time, Figure 9 shows the startup delay in the second scenario. Figures 10 and 11 show the delivery rate and control overhead, respectively.

Startup delay versus speed.

Packet delivery versus speed.

Performance versus speed.

Startup delay versus node number.

Packet delivery versus node number.

Performance versus node number.
As in Figure 6, the startup delay achieved by the CDOV is the best among the three protocols. The reason why CDOV performs well is that it uses both V2V and V2I communication, combing the cluster scheme and the overlay scheme to find the best node to download the required segments, which saves the startup delay. Moreover, the increasing of the node speed causes the dynamic changes in topology structure, which leads to the increased startup delay. In noncooperative protocol, it only uses the V2I communication, in which each node needs to send require messages when it enters the cover area of the RSU, and obtaining the required video is a probability event; therefore, the startup delay is the longest. For the gossiping-based protocol, the nodes download the video from the RSU and then broadcast the message about the video it stores, its ID, and location information when it leaves the cover area of the RSU. The nodes which received the gossiping message forward it with a certain probability p, and then the nodes requiring the video will send the request, which causes the longer startup delay.
From Figure 7, we can easily find that, with the increasing speed of the nodes, the packet delivery rate reduced, and the reason is that the instability of the links becomes higher with the increasing speed, which causes the delivery rate reduction. This figure also shows that CDOV achieves up to 30.1% and 44.5% improvement in packet delivery rate than gossiping-based protocol and noncooperative protocol, respectively. The dynamic overlay structure used in CDOV can select the best destination node, which contributes to the improvement in delivery rate. However, the flooding is used in gossiping-based protocol, which increases the network load and the storm causes the collision. In noncooperative protocol, once the node in the communication range of RSU can get the required segments, hence, it has the lower delivery rate.
In Figure 8, the control overhead of CDOV is greater than both the gossiping-based protocol and the noncooperative protocol because it needs not only the control information for video segment every node stored but also the information for constructing cluster-based overlay structure. The control overhead used to collect the video clips is increasing as the frequent changeable cluster members which are caused by the rising speed of moving node. However, the gossiping-based protocol only sends the require message one time at the beginning of a video segment transmission. In addition, in noncooperative protocol, the node only needs to send the request message to the RSU which it can communicate with, so the control overhead is smaller than the CDOV.
Figure 9 shows the effect of the varying number of the nodes on the startup delay. With this figure, we may see that the delay of the CDOV and the gossiping-based protocol both drop in some degree with the increase of the number of the nodes, but the proposed method shows obvious superiority; its delay is lowest. However, the delay decreases with the increase of the number of the nodes. When the number of the nodes is close to or even greater than 120, the delay of CDOV is lower than the noncooperative and the gossiping-based protocols. The CDOV achieves down to 40.48% and 79.2% improvement in delay than gossiping-based protocol and noncooperative protocol, respectively.
We also measure the achieved packets delivery rate versus the number of the nodes. Figure 10 illustrates how the delivery rate is influenced by the number of nodes. It shows that CDOV greatly improves the packet delivery rate compared with gossiping-based protocol and noncooperative protocol. The packet delivery rate is 51.4% higher than noncooperative protocol and 30.9% higher than gossiping-based protocol. In CDOV, nodes take advantage of the overlay structure to find the best target nodes and avoid that a large number of nodes send request messages to the same node. Hence, the delivery rate in CDOV is higher than others. In noncooperative protocol, it is observed that more request messages are sent to RSU as node number increases, so the delivery rate decreases with the increase of the node number.
The control overhead rate of the simulation protocols is shown in Figure 11. We can see that the control overhead rate of CDOV is higher than others. With the increasing number of the nodes, the control overhead rate of the CDOV is rising as the increasing nodes contain more information which needs more control message to collect. The gossiping-based protocol has an increasing overhead with more nodes because it needs more reply messages when the nodes number rises. Additionally, above all, we can conclude that the CDOV obtains the lower setup delay and the higher delivery rate with the little higher control overhead.
8. Conclusion and Future Work
Video delivery over VANETs is an efficient way to enhance the applications in both safety and infotainment. However, the character of the VANETs, interactive requirements, and limited number of infrastructures pose great challenges for video delivery in VANETs. In this paper, we proposed a novel video source decision scheme for video delivery over VANETs, in which we use the content-aware clustering approach and construct the overlay tree according to the XOR function to reveal the relationship between supply and the demand. Moreover, we design communications between the cluster head and RSU to ensure the node fairness. Simulation results show that our scheme really has an outstanding performance on both startup latency and delivery rate. Though we use the small increasing amount of control overhead to build a more efficient cluster-based dynamic overlay structure, CDOV achieves up to 30.1% and 44.5% improvement in packet delivery rate compared to gossiping-based protocol and noncooperative protocol, respectively. In a word, the proposed solution is demonstrated to be a promising solution to improve the experience of user in VANETs.
Footnotes
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was supported by the National Natural Science Foundation of China under Grant nos. 61271176 and 61401334, the National Science and Technology Major Project under Grant no. 2013ZX03004007-003, the Fundamental Research Funds for the Central Universities, and the 111 Project (B08038).
