Abstract
Software-defined networking is a promising research area that enables network-wide updates, and wireless networks are ubiquitous in the modern world. The combination of these techniques, referred to as a wireless software-defined network, has been deployed in numerous testbeds. Distributed controllers, which scale elastically and tolerate faults, are promising for large-scale wireless networks. Despite these benefits, they generate significant overhead in synchronizing network information, and the limited energy of wireless software-defined network devices therefore restricts the use of distributed controllers in wireless networks. The pattern in which network information is stored has a crucial effect on controller performance because it determines the probability of processing requests based on local data. To address this problem, we identify a better pattern for controllers to retain appropriate network information so as to balance costs in terms of both synchronous traffic and response time. We analyze the characteristics of distributed controllers and propose an overhead model of a distributed controller in a wireless software-defined network. Based on this model, we design an asymmetric network information cache to make trade-offs among different controller architectures. The asymmetric network information cache measures real-time network traffic and computes the traffic among sub-networks controlled by controller nodes. Then, the controller asymmetrically caches suitable network data on the substantially busier nodes. The experimental results indicate that the asymmetric network information cache achieves a trade-off between synchronous traffic and the average response time.
Introduction
The software-defined network (SDN) has attracted considerable attention from both academic researchers and industrial vendors since it was proposed by N McKeown et al. 1 It is now extensively employed in various networks to improve the management of physical networks. The core property of an SDN is the programmability achieved by separating the control plane and data plane of a network. 2 The detached control functions are centralized into an individual control unit, referred to as the controller, so that network devices only focus on forwarding packets. The principles of an SDN suit many types of networks; an SDN has been applied to campus networks, optical backbone networks, and data center networks. Both the theory and the technology of SDNs advance with controller improvement.
Having started in wired networks and undergone many improvements, the SDN is now being combined with wireless networks (WNs); this combination, known as a wSDN, is a promising direction in practice. Many studies have been devoted to using a wSDN to transform WNs. C Chaudet and Y Haddad 3 summarized several advantages: (1) Both the connectivity and the Quality of Service of end users are likely to improve, and multiple controllers keep access points as simple as possible while satisfying the needs of users. General wireless devices hold multiple wireless interfaces, such as Wi-Fi, 3G, and Bluetooth. A network vendor can use an SDN to coordinate services according to user requests. (2) Within a wSDN, programmers can easily realize network-wide multi-area planning, for instance, collaboration among nearby access points, power control, and channel selection. (3) Centralized control creates a global view (or a network-wide view 4 ). A controller can conveniently execute security policies based on centralized network information. (4) Network information is crucial for locating users and predicting their movements. Numerous paradigms have emerged in several wSDN environments. Research on the Internet of Things (IoT), 5 which combines multiple types of wireless devices, has attracted global attention. Betzler et al. 6 discussed the benefits of applying a wSDN to dense cells and proposed a model that consists of many agents. Gallo et al. 7 designed a wSDN home network with a specific MAC processor. K Wang et al. 8 proposed an architecture for the next-generation WN, in which optimized resource allocation, terminal handover, and failure recovery are identified as key problems. However, we note that these studies directly apply the SDN techniques used in wired networks. To clarify this point, A El-Mougy et al. 9 summarized related projects, and we use Figure 1 to depict these testbeds.

Figure 1. wSDN architecture.
Figure 1 depicts common wSDN paradigms with a three-layer architecture. All services run on the application plane. The control plane consists of several controller nodes, which may be connected by wired or wireless links. Numerous wireless transport nodes constitute a valid network, and various types of end nodes can connect to this network. Many cluster heads are directly connected to controller nodes. Other network devices can use one or more links to connect with other devices. In a WN, the resources on nodes are limited by battery capacity, CPU, and memory. 10 Distributed controllers impose considerable overhead on these resources. Therefore, saving network costs with respect to distributed controllers is crucial. 11
In the early stages of SDN development, single-node controllers (such as NOX, 12 Floodlight, Ryu, and Beacon) were extensively adopted; however, demands for scalability and high performance have increased the use of distributed controllers. 13 As the scale of a network continues to expand, deploying additional controller nodes is the better choice. We consider placing a distributed controller in the IoT, encouraged by Benamrane et al. 14 and Farhady et al. 15 Jayashree and Infant Princy 16 declared that proper management is necessary for a wSDN to make a reasonable trade-off between resource utilization and performance. This work can be accomplished by SDN techniques because of their programmability. However, distributed controllers consume more resources than single controllers. In addition, several typical architectures of distributed controllers perform well in terms of either synchronous communication or the response time, but not both. 17 Sato et al. conclude that distributed controllers can either be configured to share data centers or hold exclusive storage. These architectures exhibit fundamental differences with regard to synchronization and response processes. A method that performs well on both indicators is hard to find: a controller cannot achieve the best synchronous traffic and the best response time at once, as we discussed in our previous study. 18
The objective of this article is to make a trade-off among traffic, the response time, and storage. We analyze some extensively applied distributed controllers and propose an improvement method referred to as the asymmetric network information cache (ANIC). Sato et al. 17 defined network information as an aggregation of basic network data and respective application data. Under this definition, network information is substantially more complicated than topology data, as noted by many previous studies.17,19 It is the state of the network that is frequently used by both pre-installed applications and user-selected applications, such as firewalls, load balancing, 20 and intrusion detection. 21 In addition, A Krishnamurthy et al. 22 discussed how both the synchronization and the reading of network information have a significant effect on storage, the response time, and synchronous traffic. Therefore, we improve the storage of network information in an SDN distributed controller. Our contributions include the following:
An overhead model of a distributed controller in a wSDN. This model depicts the total cost of current distributed controllers with some key factors.
A traffic-aware ANIC algorithm. As a module that runs on high-level node(s), it computes the traffic among sub-networks controlled by controller nodes. The controller asymmetrically caches suitable network data on the substantially busier nodes. This algorithm requires a minimal amount of overhead to achieve a trade-off between synchronous traffic and the average response time. Less synchronous traffic saves energy on mobile nodes.
An investigation of the monitoring period and re-computation conditions of the ANIC. The mobility of the data plane in WNs leads to time-sensitive variations of the traffic matrix, and we study these two settings to address this problem. The experimental results, which are based on real collected data, show that an optimal setting exists.
SDN distributed controllers
An SDN controller is an individual system that is in charge of instructing network devices to take proper actions on packets. Farhady et al. 15 explained these network operating systems. Since the emergence of SDNs, many different controllers with several architectures have been developed. Figure 2 shows some basic steps in the development of mainstream controllers.

Figure 2. SDN controller development.
In traditional networks, a control plane is integrated into each forwarding device; these devices individually acquire network information and store it in a routing table. An SDN centralizes control functions into a logically centralized controller. In this phase, the key issue is to construct a network-wide view. Early controllers only implement some basic functions and adopt a single-threaded, single-node architecture (NOX, Floodlight, and Ryu). The advancement of SDNs continues to require additional controller capability in terms of computing and transportation. Controllers (such as Beacon 23 ) were therefore designed to support multiple threads to process requests faster. However, an individual controller has limited computational power and storage, which led subsequent researchers to develop distributed controllers that supervise large-scale networks. Distributed controllers offer advantages in scalability, robustness, and performance. 24 Two main categories of distributed controllers exist: collaborative distributed controllers (CDCs) and hierarchical distributed controllers (HDCs). All these controllers are elastic and can extend to additional physical nodes; they differ in how the nodes are connected. CDCs (Hyperflow, 25 DISCO 26 ) regard all controller nodes as equal and horizontally scale the control plane; thus, their nodes have the same functions and privileges. In contrast, HDCs (Kandoo, 27 Orion 28 ) classify physical nodes into two roles: a root node processes tasks with low frequency and high overhead, whereas other, real-time tasks are processed by local nodes, each of which controls a sub-network.
HDCs and CDCs each have benefits and drawbacks. Although they have been deployed in wSDNs, no existing design has been improved specifically for wireless environments.
Overhead model
A controller monitors many indicators in a wSDN, and we consider synchronous traffic and the response time as the most significant indicators. Synchronous traffic, which is a communication cost among controller nodes in a synchronous process, must be decreased to save energy. The response time reflects the speed of processing requests from network devices. OpenFlow 29 is extensively employed in both industry and academia. Therefore, we adopt the OpenFlow protocol to analyze the overhead model and perform experiments.
Synchronous traffic
The traffic handled by controller nodes consists of monitor traffic (from network devices to the controller) and synchronous traffic (among controller nodes). To maintain a consistent global view, both physical and logical network devices continuously transmit their collected information to their respective controller nodes. Because each controller node only observes its own sub-network, all nodes then spread their local network data to maintain consistency. In a WN, terminal nodes have limited resources, and synchronous traffic consumes energy and bandwidth. Therefore, a wSDN controller should reduce this overhead.
For a wSDN, in which the number of controller nodes is n, a CDC has n local nodes, and an HDC has a root node and n – 1 local nodes; the synchronization process is shown in Figures 3 and 4.

Figure 3. HDC synchronization process.

Figure 4. CDC synchronization process.
When a sub-network experiences variations Δg, the HDC local node only needs to upload the related data to the root node (shown in Figure 3). In contrast, the CDC generates (n – 1)Δg of traffic to keep all other nodes consistent (shown in Figure 4). Assume that the synchronous traffic of the distributed controller A is Tr(A); we can compare the HDC and CDC
As shown in (1) and (2), CDCs generate a significant amount of synchronous traffic, which has a significant effect on the energy consumption of wireless nodes. Therefore, HDCs better satisfy the requirements of a wSDN from the perspective of saving energy.
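The comparison can be illustrated with a minimal sketch. The CDC term (n – 1)Δg comes directly from the text. For the HDC, the sketch assumes one upload of Δg per change, scaled by a hypothetical parameter `uploads_per_change` that is not a symbol from this article; it is exposed because the evaluation later reports a ratio of 2 for the HDC.

```python
def cdc_sync_traffic(n: int, delta_g: float) -> float:
    """Traffic when a CDC node pushes a change delta_g to the other n - 1 nodes."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1) * delta_g


def hdc_sync_traffic(delta_g: float, uploads_per_change: int = 1) -> float:
    """Traffic when an HDC local node uploads a change toward the root node.

    uploads_per_change is an assumption of this sketch, not a value
    defined in the article.
    """
    if uploads_per_change < 1:
        raise ValueError("uploads_per_change must be at least 1")
    return uploads_per_change * delta_g


if __name__ == "__main__":
    # The gap grows linearly with the number of controller nodes.
    for n in (2, 4, 12):
        print(n, hdc_sync_traffic(1.0), cdc_sync_traffic(n, 1.0))
```

Under these assumptions the HDC cost stays constant while the CDC cost grows with n, which is the energy argument made above.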
Average response time
The average response time of a controller is defined as the total time from the moment the switch sends a request to the moment the controller returns forwarding rules. It has a significant effect on switch performance. Jayashree and Infant Princy 16 model the forwarding latency of a wSDN but focus on single-node controllers. In this section, we propose a model that depicts the response time of distributed controllers and consider that a wSDN works according to the flow diagram in Figure 5 after receiving unmatched packets. This procedure starts with the switch sending a packet-in message, which includes basic data of an unmatched data flow. The average delay Tt is defined as the latency between any pair of physical devices. Assume that the local node spends Tp on processing every request from switches. Then, it requires Tl to traverse local data to assess whether this request can be processed. If the local data are sufficient, the local node sends the data to applications. Otherwise, it acquires a network-wide view from the shared database. The time required for the shared database to obtain a sufficient amount of data is TL. Relevant applications compute forwarding rules based on the acquired data (Tc) and send these rules to switches.

Figure 5. Workflow of distributed controller.
We introduce the factor r as the probability of processing a request with local data. Therefore, the average response time of a distributed controller is calculated as follows
Equation (5) can be simplified due to the capacity of the controller node: Tp and Tc can be neglected if the local node has sufficient computing power. Tl and TL depend on the scale of the network, and the scale of the sub-network is relatively similar. Thus, Tl can be expressed as follows
Thus, the response time T is given by
The HDCs cannot ensure that all requests are addressed at the local nodes, whereas CDCs handle requests by retaining the entire network state on local nodes. Based on the difference in r, CDCs are better than HDCs from the perspective of the average response time.
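The effect of r can be sketched numerically. The function below is only an assumed form of the model: it charges one transmission delay Tt in each direction, the local costs Tp, Tl, and Tc on every request, and the shared-database lookup TL only with probability 1 – r. The parameter names are illustrative, and the exact published equation may differ.

```python
def average_response_time(
    r: float,
    t_t: float,
    t_l: float,
    t_db: float,
    t_p: float = 0.0,
    t_c: float = 0.0,
) -> float:
    """Assumed model: T = 2*Tt + Tp + Tl + (1 - r)*TL + Tc.

    r    : probability that a request is served from local data
    t_t  : average delay between two physical devices (Tt)
    t_l  : time to traverse local data (Tl)
    t_db : time to fetch data from the shared database (TL)
    t_p, t_c : processing and rule-computation times, which the text
               treats as negligible when the node has enough computing power
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    return 2 * t_t + t_p + t_l + (1.0 - r) * t_db + t_c


if __name__ == "__main__":
    # A CDC-like node (r = 1) never pays the shared-database lookup.
    for r in (0.0, 0.5, 1.0):
        print(r, average_response_time(r, t_t=1.0, t_l=0.5, t_db=4.0))
```

In this form the response time decreases linearly as r grows, which matches the conclusion that a controller holding the entire network state locally responds fastest.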
Controller total cost
The resource optimization problem formulated in the previous sections involves both synchronous traffic and the response time. However, these two quantities are expressed in different units. Thus, the total cost is defined as
where α and β are adjustable coefficients selected based on empirical values, and
The probability of processing requests with local data is determined by the network information cached on local nodes. Caching more network information on local nodes can reduce the average response time but increases synchronous communication. The main challenge addressed in the use of distributed controllers for a wSDN is the trade-off between synchronous communication and the average response time. CDC generates substantially more synchronous communication in exchange for a shorter response time based on the overhead model.
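A minimal sketch of this trade-off, assuming the total cost is a weighted sum of synchronous traffic and response time with weights α and β. The weighted-sum form and every number in the example are illustrative assumptions, not values from this article.

```python
from typing import Dict, Tuple


def total_cost(sync_traffic: float, response_time: float,
               alpha: float, beta: float) -> float:
    """Assumed weighted-sum cost: alpha * traffic + beta * response time."""
    return alpha * sync_traffic + beta * response_time


def cheapest(candidates: Dict[str, Tuple[float, float]],
             alpha: float, beta: float) -> str:
    """Return the name of the architecture with the lowest total cost."""
    return min(
        candidates,
        key=lambda name: total_cost(*candidates[name], alpha, beta),
    )


if __name__ == "__main__":
    # Hypothetical (synchronous traffic, response time) pairs.
    candidates = {"HDC": (2.0, 10.0), "CDC": (11.0, 4.0)}
    print(cheapest(candidates, alpha=0.5, beta=0.5))  # traffic matters as much as delay
    print(cheapest(candidates, alpha=0.1, beta=0.9))  # delay dominates
```

With equal weights the low-traffic architecture wins, and when the response time dominates the preference flips, which is why neither the HDC nor the CDC is best in every setting.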
wSDN distributed controller improvement
Current controllers cannot satisfy all indicators simultaneously: reducing the average response time generates more synchronous traffic. Therefore, we make a trade-off between these indicators rather than choosing either the HDC or the CDC. We observed that traffic locality exists among controller nodes and that this locality holds for a limited period (its available time). Consequently, we propose an ANIC algorithm that caches frequently used network information on the relevant local nodes, as some sub-networks generate substantially more traffic than others.
ANIC problem definition
The direct result of a network information cache is a higher r, that is, a higher probability of processing requests based on local data. The main problems are selecting the local controllers and computing the network information to cache. Assume that two adjacent sub-networks (Gx and Gy) are controlled by the local nodes Cx and Cy, and the network information of these sub-networks is Nx and Ny, respectively. The traffic between Gx and Gy is asymmetric (f1 > f2); three plans are shown as follows (Figure 6).

Figure 6. Network information cache plans: (a) is a symmetric plan, (b) and (c) are asymmetric plans that only cache data to proper local nodes selected based on traffic.
Figure 6(a) is a symmetric plan of spreading local network information to the other networks. Hu et al. 18 proposed a paradigm with a correlation-based algorithm. Figure 6(b) and (c) are asymmetric plans that only cache data to proper local nodes selected based on traffic. Assume that the actual network information on Ci is
Symmetric plan
Cache Nx to Cy
Cache Ny to Cx
Although a symmetric plan can adapt to variations in network traffic, it consumes more resources than an asymmetric plan. The ANIC saves resources by focusing on frequently used network information. For instance (Figure 6(c)), when f1 traverses a path controlled by Cx, it is processed based on both Nx and Ny, which optimizes the forwarding rules. After f1 enters the devices controlled by Cy, the controller also has sufficient data to compute the best path. Conversely, f2 is handled based on Ny only, so Cy has to acquire additional information from the shared database. In the case of n local nodes, the distributed controller needs to calculate the network view on each node. Thus, we present an ANIC algorithm to acquire the results.
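The three plans can be sketched with plain sets. The sketch assumes that network information can be modeled as a set of items and that a flow between Gx and Gy needs both Nx and Ny to be handled locally; the plan names and item labels are hypothetical.

```python
from typing import FrozenSet, Tuple

Info = FrozenSet[str]


def apply_plan(plan: str, n_x: Info, n_y: Info) -> Tuple[Info, Info]:
    """Return the network information held on (Cx, Cy) under a cache plan."""
    both = n_x | n_y
    if plan == "symmetric":          # plan (a): each node caches the other's data
        return both, both
    if plan == "cache_nx_to_cy":     # asymmetric: only Cy receives extra data
        return n_x, both
    if plan == "cache_ny_to_cx":     # asymmetric: only Cx receives extra data
        return both, n_y
    raise ValueError(f"unknown plan: {plan}")


def extra_copies(plan: str, n_x: Info, n_y: Info) -> int:
    """Count cached items that must be stored and kept synchronized."""
    view_x, view_y = apply_plan(plan, n_x, n_y)
    return (len(view_x) - len(n_x)) + (len(view_y) - len(n_y))


if __name__ == "__main__":
    n_x = frozenset({"x-topology", "x-firewall"})
    n_y = frozenset({"y-topology", "y-load"})
    needed = n_x | n_y  # assumed requirement of a flow crossing Gx and Gy
    for plan in ("symmetric", "cache_nx_to_cy", "cache_ny_to_cx"):
        view_x, view_y = apply_plan(plan, n_x, n_y)
        print(plan, needed <= view_x, needed <= view_y, extra_copies(plan, n_x, n_y))
```

Caching Ny on Cx lets Cx serve the heavier flow f1 locally with half the cached copies of the symmetric plan, while Cy still falls back to the shared database for f2.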
ANIC algorithm
We adopt a hierarchical structure to run this algorithm. A shared database is in charge of storing network information. Our ANIC algorithm takes the scarce resources of a wSDN into account when it computes the data to be cached on local nodes; it directly changes the probability that the distributed controller processes requests based on local network information. With cached data, we reduce the number of requests for distant network data. The algorithm makes decisions based on network traffic statistics to prevent useless operations.
We sort the elements mij of M in descending order; a higher value indicates a higher probability of improving controller efficiency. In line 5, pop() removes the first element of the operational object. In line 9, getDst() is a function that obtains the ID of the destination sub-network. Lines 11 and 16 sum the data of two networks; the operation that achieves this summation depends on the data structure selected by the controller. Redundant operations are prevented by evaluating whether the controller cost decreases.
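The steps described above can be sketched as a greedy procedure. This is an illustrative reconstruction rather than the published listing: the names `anic` and `make_cost`, the representation of BufferM as a dictionary of sets, and the example cost function are all assumptions.

```python
from typing import Callable, Dict, List, Set

Buffer = Dict[int, Set[int]]  # node id -> ids of sub-networks whose data it holds


def anic(traffic: List[List[float]], cost: Callable[[Buffer], float]) -> Buffer:
    """Greedily cache destination data on source nodes while the cost decreases."""
    n = len(traffic)
    buffer_m: Buffer = {i: {i} for i in range(n)}  # each node starts with its own data
    # Sort the entries m_ij in descending order of traffic volume.
    entries = sorted(
        ((traffic[i][j], i, j) for i in range(n) for j in range(n)
         if i != j and traffic[i][j] > 0),
        reverse=True,
    )
    best = cost(buffer_m)
    while entries:
        _, src, dst = entries.pop(0)   # take the busiest remaining pair
        buffer_m[src].add(dst)         # tentatively cache N_dst on C_src
        candidate = cost(buffer_m)
        if candidate < best:
            best = candidate           # keep the cache entry
        else:
            buffer_m[src].discard(dst) # redundant operation: undo it
    return buffer_m


def make_cost(traffic: List[List[float]], alpha: float, beta: float) -> Callable[[Buffer], float]:
    """Assumed cost: alpha * cached copies + beta * share of traffic not served locally."""
    n = len(traffic)
    total = sum(traffic[i][j] for i in range(n) for j in range(n) if i != j)

    def cost(buffer_m: Buffer) -> float:
        copies = sum(len(held) - 1 for held in buffer_m.values())
        missed = sum(traffic[i][j] for i in range(n) for j in range(n)
                     if i != j and j not in buffer_m[i])
        return alpha * copies + beta * (missed / total if total else 0.0)

    return cost


if __name__ == "__main__":
    # Hypothetical traffic matrix: sub-network 0 sends heavily to sub-network 1.
    m = [[0, 9, 1], [1, 0, 1], [1, 1, 0]]
    print(anic(m, make_cost(m, alpha=1.0, beta=10.0)))
```

In this example only the busiest pair passes the cost check, so node 0 caches the data of sub-network 1 and the other nodes keep only their own data, which is the asymmetric outcome the algorithm aims for.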
Evaluation
Experiment environment
The computers are DELL OPTIPLEX3010 (64-bit Intel Core i5-3470, 8 GB of memory). Synchronous traffic and the response time are simulated and measured with a wSDN environment that consists of ONOS and OMNeT++. The dataset of network traffic is the Malware Capture Facility Project (MCFP), which is monitored by the Czech Technical University to detect intrusions. We use this dataset to construct our traffic matrix. The ANIC algorithm is programmed with Python.
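As a rough sketch of how a traffic matrix could be derived from flow records, the code below maps source and destination IP addresses to sub-networks and accumulates the volume per ordered pair. The record layout `(src_ip, dst_ip, size)`, the CIDR prefixes, and the function names are hypothetical and do not describe the actual dataset format.

```python
import ipaddress
from typing import Iterable, List, Optional, Sequence, Tuple


def subnet_index(ip: str, subnets: Sequence[ipaddress.IPv4Network]) -> Optional[int]:
    """Return the index of the first sub-network containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    for idx, net in enumerate(subnets):
        if addr in net:
            return idx
    return None


def build_traffic_matrix(
    flows: Iterable[Tuple[str, str, int]],
    cidrs: Sequence[str],
) -> List[List[int]]:
    """Sum flow sizes per (source sub-network, destination sub-network) pair."""
    subnets = [ipaddress.ip_network(c) for c in cidrs]
    n = len(subnets)
    matrix = [[0] * n for _ in range(n)]
    for src, dst, size in flows:
        i = subnet_index(src, subnets)
        j = subnet_index(dst, subnets)
        # Skip external endpoints and traffic that stays inside one sub-network.
        if i is None or j is None or i == j:
            continue
        matrix[i][j] += size
    return matrix


if __name__ == "__main__":
    flows = [
        ("10.0.0.5", "10.0.1.7", 100),
        ("10.0.1.7", "10.0.0.5", 30),
        ("10.0.0.5", "10.0.0.9", 50),   # intra-sub-network, ignored
        ("8.8.8.8", "10.0.0.5", 10),    # external source, ignored
    ]
    print(build_traffic_matrix(flows, ["10.0.0.0/24", "10.0.1.0/24"]))
```

The resulting matrix is directional, so asymmetric traffic between two sub-networks remains visible to the caching decision.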
The specific values of the evaluation may vary with the experimental devices; however, we use the same conditions to compare the costs of the HDC, CDC, and ANIC. Then, we adopt a series of sequential traffic matrices to test the available time of the ANIC.
Evaluation indicators
We defined several indicators to evaluate the HDC, CDC, and ANIC. These indicators are listed in Table 1.
Table 1. Evaluation indicators.
ANIC: asymmetric network information cache.
1. Coverage rate of BufferM
Network traffic continues to vary after the controller has cached network information on local nodes. Therefore, we introduce µ to evaluate the present BufferM. µ is given by
2. Ratio of synchronous traffic
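The two indicators can be sketched as follows. Both definitions are assumptions made for illustration: µ is taken as the share of current inter-sub-network traffic whose destination data is already cached on the source node, and ξ is taken as synchronous traffic expressed in multiples of Δg. The published definitions may differ.

```python
from typing import Dict, List, Set


def coverage_rate(traffic: List[List[float]], buffer_m: Dict[int, Set[int]]) -> float:
    """Assumed mu: fraction of inter-sub-network traffic covered by BufferM."""
    total = 0.0
    covered = 0.0
    for i, row in enumerate(traffic):
        for j, volume in enumerate(row):
            if i == j:
                continue
            total += volume
            if j in buffer_m.get(i, set()):
                covered += volume
    # With no inter-sub-network traffic there is nothing left uncovered.
    return covered / total if total else 1.0


def sync_traffic_ratio(sync_traffic: float, delta_g: float) -> float:
    """Assumed xi: synchronous traffic measured in units of delta_g."""
    if delta_g <= 0:
        raise ValueError("delta_g must be positive")
    return sync_traffic / delta_g


if __name__ == "__main__":
    m = [[0, 9, 1], [1, 0, 1], [1, 1, 0]]        # hypothetical traffic matrix
    buffer_m = {0: {0, 1}, 1: {1}, 2: {2}}       # node 0 caches sub-network 1
    print(coverage_rate(m, buffer_m))
    print(sync_traffic_ratio(11.0, 1.0))
```

Under this reading, µ falls as traffic drifts away from the pairs that were busy when BufferM was computed.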
ANIC experiment
We deploy 12 local nodes and assume that all nodes manage an equal number of switches. Then, we employ a statistical traffic matrix based on both the source IP and the destination IP of the packets. Changes in terminal location are represented by a traffic matrix without communication between non-adjacent sub-networks. The controller nodes

Figure 7. Network traffic.
Assume that the average synchronous traffic caused by network variations is Δg. BufferM is shown in Figure 8, in which Ni shows the actual network information of Ci.

Figure 8. BufferM result.
ξ(HDC) is equal to 2, and ξ(CDC) is equal to 11. The value of ξ(ANIC) is approximately 2.8. Thus, Figure 8 shows that the ANIC consumes slightly more synchronous traffic than the HDC. The details of all indicators are depicted in the histogram in Figure 9:

Figure 9. Cost comparison of HDC, CDC, and ANIC.
The experiment shows that HDCs entail significantly less synchronous traffic than CDCs, whereas CDCs have significant advantages in terms of the average response time. The ANIC increases synchronous traffic by 40% relative to the HDC (ξ rises from 2 to approximately 2.8); in return, it saves 26.5% in the average response time. The total cost of the ANIC is the lowest of the three patterns. Thus, the ANIC makes a trade-off between traffic and the response time according to the locality of the network traffic. Evaluating the relative importance of these indicators is beyond the scope of this article.
Monitoring period
After proving the applicability of the ANIC, we compute the average total cost of different Tmp values (shown in Figure 10).

Figure 10. Average total cost.
We observe that the total cost increases with Tmp; its trend is relatively stable between 46.4 and 81.2 s. Enlarging Tmp saves computing power at the expense of the validity of BufferM. If Tmp is too long, the total cost becomes unacceptable for the controller; if it is too short, the controller spends more power on monitoring. Therefore, adopting a value between 46.4 and 81.2 s is beneficial. The mobility of terminals has a critical effect on the monitoring period. In mobile environments, we recommend setting a relatively smaller value for Tmp.
Conditions of ANIC re-computation
For mobile networks and WNs, the mobility of endpoints results in a frequently varying traffic matrix. It is therefore critical to find proper conditions for ANIC re-computation. In this section, we explore a trigger for re-computing the ANIC. Although the decision to re-compute BufferM is based on the total cost, µ is relatively simple to calculate.

Figure 11. Variation of µ in different BufferM.
Table 2. Re-computation indicators.
The data in Table 2 show that frequent re-computation of the ANIC can improve the performance of the controller (the average total cost is minimized). As with the monitoring period, however, re-computation generates computation overhead, which should be limited. Selecting a suitable ω can efficiently lengthen the period for which an ANIC result remains in use and thus achieve a balance between computing power and total cost.
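A minimal sketch of such a trigger, assuming that ω acts as a lower threshold on µ and that BufferM is re-computed whenever the measured µ falls below it. The direction of the comparison, the function name, and the µ trace are illustrative assumptions.

```python
from typing import List, Sequence


def recompute_points(mu_trace: Sequence[float], omega: float) -> List[int]:
    """Return the monitoring periods in which mu drops below omega."""
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega must lie in [0, 1]")
    return [k for k, mu in enumerate(mu_trace) if mu < omega]


if __name__ == "__main__":
    # Hypothetical coverage rates measured once per monitoring period.
    trace = [0.95, 0.90, 0.80, 0.60, 0.92, 0.85, 0.55]
    for omega in (0.5, 0.7, 0.9):
        points = recompute_points(trace, omega)
        print(omega, len(points), points)
```

A larger ω triggers re-computation more often, which keeps BufferM closer to the current traffic at the price of more computation, while a smaller ω does the opposite.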
Conclusion
Using an SDN to construct a wSDN inevitably conflicts with the scarce resources of a WN. In this article, we present a resource consumption model of distributed controllers in WNs. It depicts two major indicators of distributed controllers in a wSDN and guides us in proposing further improvements to the wSDN control plane. The ANIC is proposed to improve the efficiency of a distributed controller. It caches appropriate network information on physical nodes so that each physical node has sufficient data to make decisions. Our work spends a minimal amount of computing resources to reduce the overhead of distributed controllers and helps them perform better in a wSDN.
In the future, we will introduce more indicators to improve this algorithm and dynamically adjust α and β based on the states of local nodes. We also plan to design a self-adaptable module to learn the values of ω and Tmp.
Footnotes
Handling Editor: Fei Yu
Author’s note
Tao Fu is also affiliated with College of Computer Science and Technology, Jilin University, Changchun, China.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical standards
The funding mentioned in the “Funding” section does not lead to any conflict of interest regarding the publication of this manuscript. Our work is original research achieved by all authors. This manuscript has not been submitted to more than one journal for simultaneous consideration. The manuscript has not been published previously (partly or in full). No data have been fabricated or manipulated (including images) to support our conclusions. Authors whose names appear on the submission have contributed sufficiently to the scientific work and therefore share collective responsibility and accountability for the results.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is funded by National Natural Science Foundation of China under Grant No. 61701190, National Key R&D Plan of China under Grant No. 2017YFA0604500, National Sci-Tech Support Plan of China under Grant No. 2014BAH02F00, Youth Science Foundation of Jilin Province of China under Grant Nos 20160520011JH and 20180520021JH, Youth Sci-Tech Innovation Leader and Team Project of Jilin Province of China under Grant No. 20170519017JH, Key Technology Innovation Cooperation Project of Government and University for the Whole Industry Demonstration under Grant No. SXGJSF2017-4, Key Scientific and Technological R&D Plan of Jilin Province of China under Grant No. 20180201103GX, and China Postdoctoral Science Foundation under Grant No. 2018M631873.
