Abstract
We propose novel strategies for end-to-end reliability-aware scheduling in Industrial Wireless Sensor Networks (IWSNs). Because of stringent reliability requirements in industrial applications where missed packets may have disastrous or lethal consequences, all IWSN communication standards are based on Time Division Multiple Access (TDMA), allowing for deterministic channel access on the MAC layer. We therefore extend an existing generic and scalable reliability-aware scheduling approach by the name of SchedEx. SchedEx has proven to quickly produce TDMA schedules that guarantee a user-defined end-to-end reliability level
1. Introduction
The advantages of wireless technologies with respect to flexibility and cost compared to wired solutions together with the introduction of industrial wireless communication standards have led to a growing deployment of Industrial Wireless Sensor Networks (IWSNs) [1]. For an adaptation of WSN for control applications in industrial automation, though, the communication protocol stack must in a scalable way offer end-to-end quality of service (QoS) guarantees which is an acknowledged research challenge lacking a general enough solution [2, 3]. Further, actuator support in the problem formulation, as well as coexistence of interfering networks, is among the open issues that require attention from the research community [3].
IWSN provides an opportunity for flexible and cost-effective sensor networks but introduces difficult challenges. In scenarios with stringent end-to-end QoS requirements, wireless is not trivially implemented, mostly because WSNs suffer from a shared and open communication medium. QoS features, such as end-to-end latency or reliability, can be addressed on different communication layers. For instance, an improvement in terms of end-to-end reliability can be achieved by a change of route on the network layer, or a rescheduling on the MAC layer. One prominent approach to achieve end-to-end QoS guarantees is to apply contention-free medium access protocols, such as time-division medium access (TDMA) [4]. TDMA assigns dedicated transmission rights to the sensors in a repeating schedule. Energy-efficiency due to duty-cycling and a guaranteed access to the communication medium are avdantages of TDMA, as opposed to the best-effort strategies based on carrier sensing [5, 6]. This feature is especially relevant in densely populated sensor networks, with stringent QoS demands, such as for automation control in factories.
WirelessHART [7], the most widespread standard for IWSN, supports an IEEE 802.15.4e amendment for the MAC layer using a variant of TDMA on top of IEEE 802.15.4 2006/2012 PHY. WirelessHART does not dictate the choice of scheduling algorithm: it is the vendor who decides the implementation. Finding minimal size TDMA schedules is an NP-hard problem to solve, both in the single-channel case where concurrent noninterfering transmissions are allowed (slot-reuse [8]) and in the case of multichannel use (e.g., WirelessHART), where concurrent transmissions in dedicated slots may only be scheduled on different channels [9]. Combining multichannel use and slot-reuse is desired in order to further reduce the achievable schedule sizes but brings along a large complexity increase in the formal problem formulation. In order to address end-to-end QoS requirements, the common assumption for IWSN is that all network link quality information is collected at the network manager which stands responsible for scheduling, routing, and the distribution of control messages on the downlink to the sensors for both topology deployment and adjustment. The centralized collection of global control information allows for network wide optimal choices and QoS optimization with respect to, for example, fairness or jitter, which would otherwise not be possible in a decentralized setting.
The focus of this paper is on scalable reliability-aware scheduling. A scheduling algorithm is here defined as reliability-aware, if it guarantees a user-defined statistic within-network end-to-end reliability The performance of the recent reliability-aware scheduling algorithm SchedEx from [12] is here improved by introducing an optimal bound that reduces latencies for the investigated convergecast topologies by on average 20%, while guaranteeing the same end-to-end reliability levels as in the original paper. A proof of correctness with respect to end-to-end reliability is given too. The results of the scalable extension show that the schedules are calculated about 20% faster than those in the original paper and that the performance in terms of latency outperforms those created by a recent but poorly scaling approach from [13] in all tested topologies. To the best of our knowledge, this is the first paper addressing scalable TDMA scheduling combining slot-reuse, multiple channels, and reliability-aware scheduling in combination. We provide numerous simulation results that verify how even for stringent end-to-end reliabilities of 99.999% all sensor readings can be collected within seconds using schedules that are calculated within microseconds, assuming a variant of WirelessHART that supports slot-reuse. The paper makes a case for the simple extendability of SchedEx for other features such as multichannel use and multisink deployment.
Where convergecast with a single sink does not suffice, SchedEx is here evaluated with different sink deployment plans. Our simulations reveal the latency gain that can be expected for the given topologies using reliability-aware scheduling, highly relevant for mission-critical applications.
The structure of this paper is as follows. Section 2 presents the relevant research in the area. In Section 3 the network model is described, and Section 4 defines the formal problem formulation. The SchedEx algorithm is explained in Section 5. Section 6 introduces the improvements and extensions. The simulations are explained in Section 7, and the results are presented in Section 8. A discussion and concluding remarks follow in Sections 9 and 10, respectively.
2. Related Work
A lack of a generic framework that addresses the end-to-end QoS concerns for mission-critical applications over WSN has been reported in [2] with regards to the MAC layer, and more generally in [3]. Many noteworthy contributions to QoS-aware scheduling have been made [14–20], but since the requirements in industrial automation differ heavily from traditional applications for WSN, alternative approaches are required.
The work in [17] investigates multichannel transmissions with dynamic channel assignment to reduce interference. Network lifetime maximization has been investigated under the consideration of energy harvesting [20], applying cross-layer optimization [19], and the use of redundant actors that support the network in regions where sensor availability drops with time [18]. However, contention-based approaches, such as [17–20], are not suitable to guarantee end-to-end reliability, because of their best-effort nature.
TDMA-based MAC protocols are the most common means to ensure reliability and latency bounds for WSNs (e.g., [8, 12, 21, 22]). TDMA scheduling is an NP-hard combinatorial problem in its most basic static version without the consideration of time synchronization or reliability guarantees [8]. Finding optimal or even initial branch-and-cut solutions using traditional Integer Programming can take weeks or months for moderately large topologies of sizes around 50 nodes without even considering stringent QoS constraints.
In [8], the authors prove the schedule minimization problem with guaranteed packet delivery to be NP-hard and propose two scalable scheduling algorithms to solve it. In [22], the authors show how their SAS-TDMA outperforms the algorithms from [8] under the consideration of a harsh dynamic industrial environment. However, SAS-TDMA does not come with any guarantees in terms of reliability. The TDMA-based multichannel scheduling approaches introduced in [23, 24] address the scheduling problem assuming static network topologies without the consideration of slot-reuse or reliability guarantees. Offline dimensioning approaches, such as GinMAC [25] and Burst [26], address the problem for industrial networks by maximizing end-to-end reliability under latency constraints. However, they cannot be considered for large or dynamic networks because of their time-consuming execution. The authors in [13] introduce a two-phase scheduling strategy that extends existing scheduling algorithms and maximizes end-to-end reliability under a fixed latency constraint, outperforming the algorithms introduced in [8]. They further investigate redundancy in form of multipath routing in both theory and simulations, and their results show the superiority of single-path routing, not even considering communication overhead in their comparison. The SchedEx scheduling approach is introduced in [12]. SchedEx produces schedules that guarantee a given end-to-end reliability while cooptimizing latency. Simulations show that SchedEx is an order of magnitude faster than the strategy from [13]. So far, SchedEx has only been investigated on single sink topologies and a single communication channel. In this paper the SchedEx approach is further improved with respect to execution time and end-to-end latency performance. Simulations verify its suitability even for multichannel and multisink scenarios, compared to existing work.
3. Model Assumptions
We base our network model, optimization problem formulation, and simulation setup on the work in [12], extending it for multiple channel use in order to allow for an easy comparison between the approaches. Multiple sinks are already supported but have not yet been evaluated. A WSN is defined as a digraph
4. Problem Formulation
Let
In this paper, the end-to-end delay in terms of schedule frame size for a given end-to-end reliability demand
End-to-end reliability
A schedule frame F is a binary matrix of size
The formal routing constraints (4)–(6) are expressed over a binary routing table R of size
A schedule F is called valid or collision-free if no receiving sensor is disturbed by a second concurrent transmission (7)-(8) (either no sensor exists that hears two concurrent transmissions or that sensor is not interested in either transmission, or both send via different channels while not sharing the same parent). According to the constraints, spatial-reuse within a slot is allowed if no conflicts arise and transceivers may be scheduled in multiple slots of the same superframe. Constraint (10) for channel assignment has been added in this paper. Channel assignment is realized by
5. SchedEx
In [12], the authors compare the two-phase schedule reliability enhancer from [13] and a novel scheduling algorithm extension for reliability-awareness by name SchedEx, with respect to performance and runtime. They show that the schedule enhancer does not scale in the size of the network by both complexity analysis and simulations. For WSNs in industrial applications, the deployed sensors are not capable of running long processes to identify better schedules, because of battery drainage and restricted computational power. SchedEx shows to scale well in the size of the network and guarantees a reliability lower bound
The two-phase schedule enhancer from [13] produced schedules that are on average between 5 and 20% shorter than those for SchedEx, with a cost between 4000 and 12000% runtime for networks with node-to-node reliability levels produced by the Rayleigh Fading model with a signal to noise ratio of 60 dB. For 50 dB, the results for SchedEx are reported to be on average better than those for the two-phase enhancer. SchedEx produces the schedules in runtimes 2-3 orders of magnitude faster than the two-phase approach. The simulations in [12] are conducted for topologies of sizes between 50 and 200 uniformly at random distributed sensors, for single-path routing over a single channel, given a single sink in the center of a circular area.
As explained in [12], slotted scheduling algorithms operate as follows.
update meta-data (if required).
SchedEx substitutes step 3 for packet update to ensure reliability by a controlled packet move delay. The authors in [12] therefore propose Algorithm 1 which maintains a vector of down-counters for all transceivers in the network, based on
Counter Vector c, Repetition Vector τ (1) (2) (3) (4) (5) (6) (7) SchedEx, the slot-based scheduling algorithm extension improved in this paper by a tighter bound introduced further below.
The lower-bound formulation of minimal repetitions
Given the lower bound
6. Method
SchedEx calculates the minimal amount
6.1. Multichannel Assignment
Each centralized slot-based scheduling algorithm, in Step 1 of the loop, sequentially adds transceivers to the slot until a conflict, according to constraints
6.2. SchedEx Improvement
In this section, we introduce a new formula that substitutes (11), prove its correctness, and show that the adjusted formula requires the proven lowest amount of repetitions per scheduled link for SchedEx and any given initial buffer b.
Proposition 1.
Given a network with a single-path routing table R, sinks
Proof.
For the case of one link and k packets, this is trivially true because of Proposition 2 in [12]. For
Due to the equality of
7. Simulations
We apply single-path routing in this paper. Routing tables are determined by the Dijkstra shortest path algorithm using the expected transmission count (ETX) metric [27]. Table 1 contains the network model parameters.
Simulation setup parameters.
7.1. Topologies
A simulation model with n randomly distributed nodes in a circular area of radius 100 units around a center point at
We analyze 50 topologies, each ten of sizes 50, 100, 200, 400, and 800. SchedEx supports WSNs with multiple sinks, while the simulations only use a single sink. We extend the work by assessing each topology in three variations, with a single sink in the center at two sinks, one at four sinks at
For the multisink topologies, all sinks are connected via a backbone-network, which is why the choice of sink for transmission does not matter for the sensors. We investigate all scenarios first without reliability constraint and then for reliability demand
8. Results
8.1. Multiple Channels, Reliability-Unaware
Figure 1 illustrates the impact of the amount of utilized channels on latency and runtime. In terms of schedule size the performance is similar for all algorithms. Node-based scheduling provides the majority of shortest schedules for the given topologies, with a runtime that is between 50 and 220% faster than for the competitors. For one and two channels, node-based scheduling is clearly better, whereas the variance in the amount of utilized channels. For 8 and 15 channels the expected performance is very similar, also because the results converge towards the theoretical optimum

The average impact of the amount of available channels on the scheduling performance (a) and runtime (b) of the benchmarked scheduling algorithms. The more channels are utilized, the less impact has the choice of scheduling algorithm on the latency performance.
8.2. Multiple Channels, Reliability-Aware
We define the simulations using
Results for node-based scheduling given the networks of 5 different sizes, 1–15 utilized channels, and a stringent reliability requirement of

The average impact of the amount of available channels on the scheduling performance (a) and runtime (b) for the three reliability-aware approaches. SchedEx(1) produces the shortest schedules throughout the investigation.
8.3. Multiple Sinks
The impact for the one, two, and four sink scenarios is presented in Figure 3(a) for basic scheduling and Figure 3(b) for reliability-aware scheduling. The results show that all algorithms react similar to the introduction of multiple sinks. For two sinks, schedule size can on average be reduced by ca.

The average impact of the amount of available sinks on latency for the four scheduling algorithms (a) and the three reliability-aware scheduling approaches (b). Reliability-aware scheduling performance is equally positively correlated with the amount of deployed sinks in the network compared to basic scheduling.
8.4. Runtime Impact
Figures 4(a) and 4(b) illustrate the impact of multiple sinks and multiple channels, respectively, on the runtime of the scheduling. Where the amount of sinks has a large impact on the scheduling performance, the runtime is less strongly affected. The amount of applied channels, however, has a much larger impact on the algorithm runtime than the amount of sinks. The runtime increases substantially until a certain point. Thereafter, a performance decline can be observed.

The average execution time impact of the amount of available sinks (a) and channels (b), respectively, on the scheduling algorithms. The choice of utilized scheduling algorithm becomes less important with respect to latency performance the more channels are available. Node-based scheduling however performs best in both execution time and latency.
9. Discussion
We are interested in how the introduced formula (12) compares to the original one and how the scheduling algorithms scale in the amount of sinks and utilized channels. From the nonreliability-aware simulations, we derive that single-channel simulations, as conducted in previous studies, are not sufficient to evaluate the scheduling efficiency of an algorithm for real-life scenarios where multiple channels are available. The performance for the four algorithms is very different for one to four channels, and node-based scheduling seems to offer a good trade-off between frame-size and execution time. The overall general schedule size in Figure 3(a) suggests that the choice of a best scheduling algorithm with respect to schedule frame size is hard, because the summarized statistics reveal that all competitors perform on average very similar. For the reliability-aware simulations, the new formula (12) significantly outperforms the one from [12]. It is substantially faster in execution time and obtains better schedules than both SchedEx(1) and the two-phase approach from [13], with narrow variances. For the one-channel one-sink utilization on topologies of size 50 and reliability constraint
10. Conclusions
The simulations presented in this paper reveal that the introduced SchedEx enhancements and problem formulation extensions lead to a substantial performance improvement. Significant improvements in terms of performance and runtime can be achieved using multiple sinks and utilizing multiple channels for reliable packet delivery in WSN. Still, there is space for improvement, and a need for future work. Adaptive algorithms that for a given topology are able to recommend a setup that includes the amount of channels to be used and also the number of sinks and their positioning for a simplified deployment are required. Our simulations suggest that the utilization of four channels is a good heuristic value for IWSN, in order to reach a close-to-optimal scheduling performance and as fast a runtime as possible. In real scenarios, the distribution of sensors is usually not uniform, and positioning of sinks might be difficult due to application specific constraints. Where objective functions for network design can be formulated, heuristic algorithms can identify near-optimal solutions that fulfill the given constraints.
The identification of the routing table R could become part of the problem formulation in order to better address the network dynamics. Weighted shortest path routing may be far from being the most appropriate routing table for a topology given specific transmission patterns and interference landscapes.
Further, a WSN specific risk analysis for reliable packet delivery is required which for the given deployment calculates the risk of failure over a time period. Then, for real networks, flows are usually arbitrary and maximum acceptable latencies are on a sensor, if not packet, level. SchedEx in its extended version significantly improves on the applicability of reliability-aware scheduling for IWSN, where link qualities in the network can vary in time, and new schedules have to quickly be computed and deployed.
Footnotes
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
