Abstract
Although a number of research efforts address Industrial Wireless Sensor Networks (IWSNs), practical IWSN implementations, deployments, and in-field applications remain scarce. This paper presents the design and implementation of WirelessCAN, an IWSN for welder machine systems that is based on WirelessHART and replaces the wired CAN fieldbus. Although the implementation and its challenges are application-specific, we believe that the problems we encountered will be faced by many IWSN designers. We introduce selected challenges exposed by our implementation, together with potential solutions, that have not been addressed previously. Specifically, we discuss in detail the communication resource constraint, real-time rescheduling, integrated knowledge for IWSN applications, centralized control architecture, reliable and immediate message delivery for human and equipment safety, and differential QoS requirements. The goal of this paper is to make the design and implementation of IWSNs more efficient and applicable.
1. Introduction
With recent advances in wireless communication and embedded systems, the market of Wireless Sensor Networks (WSNs) has expanded exponentially and has started to find its place in industrial processing and automation. The IWSN standards including WirelessHART, ISA100.11a, WIA-PA, and IEEE 802.15.4e have been released in recent years. The MarketsandMarkets' 2012 market research report [1] on IWSN said that the IWSN market is expected to reach $3.795 billion by 2017.
IWSNs are poised to become a key technology with a large market. However, a large gap still exists between existing techniques and in-field applications. The published IWSN standards mainly define mechanisms (such as TDMA and graph routing) but not strategies (such as scheduling algorithms and self-healing schemes), which require in-depth study, so this research field is just getting started. A complete implementation of the IWSN standards that satisfies industrial requirements faces many challenges.
Many studies have discussed IWSN challenges from the viewpoint of system requirements. Some of them, such as [2–5], start from the theoretical industrial requirements of current protocols and standards. Others [6, 7] concentrate on one or two challenges in industry or WSNs, for example, hardware constraints, reliability, latency, and security. They may not cover the detailed challenges that arise in the design of a real IWSN system. This paper differs from these studies in that all of the challenges it describes are derived from a real IWSN implementation project [8, 9], WirelessCAN, which applies IWSN technology to a welder machine control system. These concrete challenges are general and in-depth and have not been addressed previously. We propose some potential approaches to these challenges as further research topics for the design and implementation of IWSNs or TDMA-based multihop mesh networks.
The main characteristic of existing IWSN standards is the use of the time division multiple access (TDMA) medium access control (MAC) protocol to enable more reliable and real-time communication instead of the contention-based MAC protocol carrier sense multiple access with collision avoidance (CSMA/CA) [10]. Though existing studies [6, 11, 12] all emphasized the necessity of TDMA for IWSNs, there is a lack of practical in-field comparison experiments and analysis. Another contribution of this paper is that we compare and analyze the performance between CSMA-based ZigBee and TDMA-based WirelessHART by real experiments.
The remainder of this paper is organized as follows. Section 2 briefly describes the project WirelessCAN and comparison experiments between CSMA-based ZigBee and TDMA-based WirelessHART. In Section 3, we discuss some selected technical challenges and possible solutions. Finally this paper is concluded in Section 4.
2. Project Description
Welder machines need to be moved frequently from one weld site to another, so the welder machine industry is eager to introduce wireless technology into welder machine communication. WirelessHART targets the process control industry, whereas our WirelessCAN project, which is based on it, belongs to the class of distributed control systems; we thus extend the application scenarios of WirelessHART. A leading welder machine company is cooperating with the project team to replace the wired CAN fieldbus with wireless technology. Currently, all the welder machines (WMs) connect to a Welding Production Manager (WPM) by a wired CAN fieldbus, as shown in Figure 1. The WPM is the central controller of the whole WM system. We also add a CAN port to WirelessHART, which by itself supports only HART-related devices.

Welder machine control system with CAN fieldbus.
2.1. Communication Models
The CAN fieldbus in welder machine systems has four application functions or communication models: monitoring, browsing, configuring, and combination of monitoring with browsing or configuring.
The monitoring application is used by the WPM to monitor the working conditions of all the WMs. The WPM broadcasts a request packet every 500 ms, and every WM in the system replies to the WPM with its current working conditions.
The browsing application is triggered by the operator and is used by the WPM to browse the detailed working parameters of one WM. The WPM sends 20–30 request packets to a selected WM, and the WM replies to every request packet.
The configuring application is triggered by the operator and is used by the WPM to configure the working models and parameters of one WM. The WPM sends 20–30 configuration packets to a selected welder machine, and the selected welder machine acknowledges every configuration packet.
In practice, monitoring runs for as long as the WM is powered on, so there is a hybrid operational model that combines monitoring with browsing or configuring. The monitoring application is cyclic with light traffic, whereas the browsing and configuring applications are bursty with heavy traffic. Because of this large difference, the hybrid operational model makes it much harder to design and deploy an efficient IWSN system.
2.2. Wireless Technologies
Several mature wireless technologies, such as WiFi, provide WSNs with high-bandwidth data communication. However, a project we designed earlier on top of WiFi showed poor timeliness and reliability. In the industrial context of the WirelessCAN project, the sampled data and management data have strict timeliness and reliability requirements, so the wireless technology we choose must provide time-predictable and reliable communication. WiFi uses the IEEE 802.11 family of standards, whose contention-based MAC protocol cannot satisfy time-critical communication. Hence, before adopting IWSN technology, we considered two representative WSN technologies: CSMA-based ZigBee and TDMA-based WirelessHART. ZigBee defines the Network Layer and Application Layer on top of the IEEE 802.15.4 PHY and MAC layers. However, IEEE 802.15.4 was not specially designed for reliable real-time communication in IWSNs [11]. The IEEE 802.15.4 MAC is a hybrid of CSMA and TDMA, and its TDMA scheduling capability is limited because the number of TDMA slots is limited to seven. In contrast to ZigBee, WirelessHART adopts pure TDMA for reliable and real-time industrial applications. To choose the approach that better meets the demands for real-time delivery, stability, and reliability in harsh environments, we compare ZigBee and WirelessHART. Some studies have documented the differences between them; however, most of their results come from theoretical analysis and simulation. In this paper, we compare the packet delay and the packet loss ratio (PLR) of the two technologies in real experiments, using the monitoring application described above as the test traffic. Because this application runs cyclically, it accounts for most of the system's operating time in this project.
Without loss of generality, the experiments were conducted under the following conditions on hardware equipment, superframe structure, and routing:
Hardware equipment: the experiments were carried out on our own testbed, which comprises a set of sensor nodes, an access point (AP), a gateway, and a manager server. The experimental nodes and the AP are built from an LPC1769 MCU (Cortex-M3) and an AT86RF231 radio chip. Both the nodes and the AP run at a 120 MHz CPU frequency, with 5 dBm transmission power and a 250 kb/s transmission rate. In the factory, a node is an add-on part of a welder machine and has no battery, so the experimental nodes are powered through a USB port. The radio transceiver works on a single channel, number 25 (centered at 2.450 GHz), without channel hopping.
Superframe structure: as shown in Figure 2, the TDMA superframe is simply divided into a 40-slot management part and a 60-slot data part. The duration of one slot is 10 ms. The 40-byte packets that each node generates periodically in the management part are sent out in dedicated slots. If a node fails to transmit a packet in its own dedicated slot, it can retransmit the packet in shared slots. We use a simple slot scheme, similar to a linear network slot schedule, in which the slots of one-hop nodes are placed at the end of the dedicated slots. One slot is reserved for each packet, without aggregation. In the CSMA schedule, we set aside a 400 ms interval at the beginning of every second to correspond to the management part of the TDMA schedule, which leaves 600 ms per second for packet transmission and retransmission. As for parameters, we set the number of backoffs to 4 and the number of retransmission opportunities per packet to 3.
Routing: both technologies use graph routing [7] for the path from a node to the manager, which follows the fixed topology. A node receives a packet and then forwards it to its next-hop node according to the slot schedule.
In the context of our IWSN project, we carried out an on-site investigation in the factory, where groups of 10 or 20 welder machines are commonly seen. Therefore, we run two sets of tests with two topologies, graph A and graph B, as shown in Figures 3 and 4, where a dashed arrow indicates the next hop of each node. The first set of tests contains 10 nodes and the second contains 20 nodes for comparison.

Superframe structure of TDMA and transmission schedule of CSMA.

Topology of graph A including 10 nodes.

Topology of graph B including 20 nodes.
Under the above conditions, two sets of experiments were carried out to show the difference in transmission performance between the two technologies. The results are summarized in the following figures, each of which contains three diagrams and corresponds to one topology and number of nodes: 10 nodes in Figure 5 and 20 nodes in Figure 6. In both topologies, the maximum value of “delay of packet arrival” represents two situations: the periodically generated packet was dropped after exceeding the maximum number of retransmission opportunities, or the packet delay exceeded the accessible interval. We call both cases untimely arrival. The accessible interval is set to one second in graph A and three seconds in graph B. We select reference nodes along a linear path of three hops and observe the packet arrival delay of each. Because of the slot schedule, the arrival delays of packets generated by nodes at different hop counts differ by approximately 10 ms under TDMA. From the smallest packet arrival delay, we estimate that the processing time of packets in the AP, gateway, and manager is approximately 350 ms to 450 ms, depending on the number of packets.

Delay of packet arrival in graph A.

Delay of packet arrival in graph B.
The outcome of these tests allowed us to determine which MAC protocol is suitable in different environments. The blue line with circle marks shows that the TDMA schedule yields stable and predictable delay, with some jitter caused by retransmissions. The red line with square marks shows that under the CSMA schedule the delay fluctuates with the number of nodes and the hop count, which means that packet arrival under CSMA can be unstable and unpredictable. As a consequence, vital data may not be collected in real time in an IWSN. In Figure 5(a), the packet arrival delay under the CSMA schedule is shorter overall, but some packets arrive untimely. As the number of nodes increases in graph B, the delay under CSMA grows and exceeds the delay under TDMA overall once the hop count reaches two. The CSMA delay rises because contention for medium access becomes more intense as the number of nodes increases. Moreover, the arrival delay of several CSMA packets (red) exceeds the accessible interval regardless of the number of nodes or hops. From this, we conclude that the PLR of the TDMA schedule is lower overall than that of the CSMA schedule. Reliability can be traded off against latency, but our focus is on highly reliable networks with the lowest possible packet delay. The experimental results give a rough indication that the TDMA schedule is more suitable for industrial applications, and for the WirelessCAN project in particular.
2.3. Integration of IWSN with WMs System
On the basis of the above experiments, we decided to apply an IWSN technique based on a TDMA MAC protocol. Since the CAN fieldbus is to be replaced with wireless communication, the project is titled WirelessCAN. The company does not want to modify anything in their WMs and WPM, so they asked us to replace the wired CAN fieldbus with wireless technology transparently. The integration scheme is shown in Figure 7. We add a complete IWSN between the WPM and the WMs. There are four kinds of devices in the IWSN.

Integration of IWSN with WMs system.
(1) W-CAN is a wireless communication and processing device that is responsible for the data conversion between a WM and the wireless protocol. The W-CAN also works as a bidirectional relay device to form a multihop mesh network. The hardware of the W-CAN, shown in Figure 8(a), mainly consists of a radio, a processor, a CAN module, and a supply protection circuit. The CAN module is the connector, or input/output (I/O), between a W-CAN and a WM that was originally connected to the fieldbus. The supply protection circuit is designed to protect the W-CAN from the high transient voltages generated by the welding machines. Figure 9 shows an in-field connection of a W-CAN device with a welder machine.
(2) AP (access point) is attached to the gateway and provides radio paths between the W-CANs and the gateway. Unlike the W-CAN, the AP has a high-performance Cortex-M3 MCU, as shown in Figure 8(b).
(3) IWSN manager is the central controller of the whole IWSN.
(4) Gateway is the intermediate device among the AP, the IWSN manager, and the WPM.

Devices of WirelessCAN.

In-field connection of W-CAN with WM.
3. Challenges
3.1. Resource Constraint
The main challenges of traditional WSNs are restricted resources: energy, memory, and processing. In our experience, the hardware constraints will ease as hardware is upgraded, but the restricted communication resource, that is, the number of slots in a time unit, becomes a bottleneck in the implementation of IWSNs.
3.1.1. Communication Resource Constraint
As introduced in Section 2.1, the sampling period of welder machine monitoring is 500 ms, so the duration of a superframe is 500 ms. With this 500 ms restriction, the number of available slots is limited to 50 for a single channel or AP, because the slot length is 10 ms in the WirelessHART specification. The usage of these 50 slots should be considered carefully for the following reasons. (1) Most wireless sensor devices have half-duplex radios, which means that each device can be scheduled to either transmit or receive in a slot, so receiving and forwarding a packet consumes two time slots. (2) In a multihop network, a packet may be relayed several times on its way to the destination, which consumes even more time slots. (3) The IWSN standards propose multipath routing to improve reliability. If a packet is delivered simultaneously via multiple paths, a significant number of slots is required. (4) Not all of the slots can be assigned to industrial process data packets; some slots must be assigned to periodic beacons, network management, and maintenance, such as advertisement, keep-alive, joining interaction, and neighbor and link quality reports. (5) In the WirelessCAN project, simultaneous monitoring and browsing/configuring need many more slots than monitoring alone. (6) To guarantee reliability in an IWSN, some slots should be reserved for retransmissions of corrupted packets. The scheduling algorithm described in [13] assigns 50% of the slots for retransmissions. This raises a question: how many shared slots are really needed, given the network condition and the required QoS? During the development of WirelessCAN, we found that this problem also greatly affects real-time communication.
The basic idea of the current contention algorithm is that a node with a packet to retransmit sends it in the current shared slot; if the retransmission fails because of internal contention or external interference, the node backs off a random number of slots to avoid further contention or interference. Because slots are a restricted resource, the number of shared slots in a superframe is small, so it is easy to run out of shared slots in the current superframe, as shown in Figure 10, which destroys reliable real-time communication.
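The slot arithmetic in this paragraph can be sketched in a few lines. The following is only an illustrative estimate, not the WirelessCAN scheduler: the function names, the hop distribution of the 10 devices, and the `shared_per_dedicated` parameter are assumptions introduced here. It counts one dedicated slot per hop per packet and adds shared slots in proportion to the dedicated ones.

```python
import math

SLOT_MS = 10          # WirelessHART slot length (from the text)
SUPERFRAME_MS = 500   # monitoring period in WirelessCAN (from the text)

def slots_per_superframe(superframe_ms=SUPERFRAME_MS, slot_ms=SLOT_MS):
    """Number of TDMA slots available on one channel in one superframe."""
    return superframe_ms // slot_ms

def uplink_slots_needed(hop_counts, paths=1, shared_per_dedicated=0.0):
    """Slots needed for one uplink packet per device per superframe.

    hop_counts: hops from each device to the AP (one slot per hop).
    paths: number of routes used simultaneously for every packet.
    shared_per_dedicated: shared (retransmission) slots reserved per
        dedicated slot; 1.0 means as many shared as dedicated slots.
    """
    dedicated = sum(hop_counts) * paths
    return dedicated + math.ceil(dedicated * shared_per_dedicated)

# Hypothetical 10-device network: four 1-hop, four 2-hop, two 3-hop devices.
hops = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3]
budget = slots_per_superframe()
single_path = uplink_slots_needed(hops, shared_per_dedicated=1.0)
dual_path = uplink_slots_needed(hops, paths=2, shared_per_dedicated=1.0)
print(budget, single_path, dual_path)
```

Under these assumed hop counts, a single route with equal shared and dedicated slots already uses 36 of the 50 slots, and sending every packet over two routes would need 72, which exceeds the budget before any management traffic is scheduled.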

Exponential backoff algorithm during multiple superframes.
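How quickly a retransmission can exhaust the shared slots of a superframe can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the algorithm implemented in WirelessCAN: the function name, the per-attempt success probability, and the cap on the backoff exponent are hypothetical.

```python
import random

def retransmit_in_shared_slots(shared_slots, success_prob,
                               max_backoff_exp=4, rng=None):
    """Try to retransmit one packet within the shared slots of a superframe.

    shared_slots: slot numbers of the shared slots, in time order.
    success_prob: assumed probability that one attempt succeeds.
    Returns the slot number of the successful attempt, or None if the
    shared slots of this superframe run out first.
    """
    rng = rng or random.Random()
    index = 0      # position of the next shared slot to try
    attempt = 0    # number of failed attempts so far
    while index < len(shared_slots):
        if rng.random() < success_prob:
            return shared_slots[index]
        attempt += 1
        # Exponential backoff: skip a random number of shared slots drawn
        # from a window that doubles after every failure (up to a cap).
        window = 2 ** min(attempt, max_backoff_exp)
        index += 1 + rng.randrange(window)
    return None

# Example: only 5 shared slots in a 50-slot superframe, noisy channel.
rng = random.Random(1)
outcomes = [retransmit_in_shared_slots([45, 46, 47, 48, 49], 0.3, rng=rng)
            for _ in range(1000)]
print("packets left undelivered:", outcomes.count(None))
```

With few shared slots, a failed attempt followed by a backoff often jumps past the end of the superframe, so the packet has to wait for the next one.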
As the above analysis shows, the restricted TDMA slots greatly limit the network size, reliability, sampling rate, and other performance aspects of IWSNs. For example, an AP or a single channel can accommodate only 10 field devices in the current WirelessCAN project. Some of the following techniques have been adopted and implemented in WirelessCAN; with them, we estimate that an AP is able to accommodate 40–50 welder machines.
To resolve the challenge of the restricted slot resource, the potential solutions are as follows: (1) data aggregation, which means that several data items or packets are merged into one in the temporal and/or spatial domain; (2) reducing the maximum hop count by properly deploying routers in the field; (3) a full-duplex AP, since the AP is a communication bottleneck; (4) multiple access points (APs) in one IWSN; (5) two or more APs sharing one channel for downlink, owing to the asymmetric traffic between downlink and uplink in IWSNs; (6) channel reuse in the spatial domain; (7) efficient communication scheduling, especially one slot for multiple duties, which will be presented in Section 3.2.
Multiple channels could provide more communication opportunities; however, the outlook in practical deployments is not optimistic. For the successful adoption of IWSNs in the process automation and manufacturing industry, IWSNs are expected to coexist in a friendly manner with other wireless systems that operate in the 2.4 GHz band, especially WiFi, as WLANs have reached process and manufacturing plants. In a welding field, it is also natural to expect WirelessCAN to coexist with WiFi; therefore the bandwidth available to WirelessCAN is very constrained. To obtain more clean channels, besides the 2.4 GHz W-CAN device we designed a 700/800/900 MHz W-CAN device with an AT86RF212, as shown in Figure 8(a).
3.1.2. Memory and Processing Constraint
The authors of [5] give a detailed analysis of the hardware challenges for IWSNs. Here we add further problems that we encountered during the WirelessCAN development to illustrate the challenge of resource constraints. To implement TDMA, we define a 25-byte C-language structure to describe a slot. If a node activates 50 slots in the superframes, it needs 1.25 kB of memory. To send and receive a packet in a slot, WirelessHART defines 11 timers and their interrupt handlers, which greatly increase the processing burden on the MCU. In the implementation of WirelessCAN, we found that both the memory and the processing power of the MSP430F2618 at the AP are not enough to accommodate the scheduled duties; therefore we upgraded the hardware of the AP devices from the MSP430F2618 to the Cortex-M3 LPC1769.
3.2. Routing and Scheduling
Cross-layer routing and TDMA scheduling have been intensively studied. Based on the existing standards and papers, such as [13–16], the approaches for reliable and real-time communication in industrial wireless mesh networks are summarized below with our comments. (1) Three types of routing graphs are defined: a broadcast graph from the gateway downward to all devices for common configuration and network control, an uplink graph for data collection from all devices to the gateway, and a downlink graph from the gateway to each individual device for actuator control and other commands. (2) WirelessHART requires each intermediate node on the routing graph to have at least two neighbors that can forward traffic to the destination, so that reliable routing is achievable. However, a problem occurs when TDMA scheduling is performed with a limited number of time slots in a time unit. For example, if a packet is delivered from its source to the destination simultaneously through multiple paths, the required number of slots increases multiplicatively compared to a single path; if packets are delivered alternately along one of the multiple paths, a path failure causes packet losses. (3) Two types of superframes are defined: data superframes and management superframes. Superframe scheduling for periodic collection and control data has been studied in depth, but scheduling for management and event-generated data, which is more complicated, has not. (4) Four types of schedule entries are defined: exclusive entries for dedicated communication; shared entries in which multiple devices compete for burst communication such as retransmission; reserved entries for maintenance purposes; and unused entries for new assignments. The reserved and unused entries are clearly necessary to simplify rescheduling, but they may waste the limited communication resource. (5) The typical data sampling rates are defined as
In [13], the number of shared slots equals the number of dedicated slots. A problem arises: how to guarantee that earlier hops are served first in a noisy environment. We name this problem routing-ordered slot scheduling. Take the routing path 3→2→1→AP in Figure 11(a) as a simple example; its dedicated slots should be scheduled in the same order as the route, that is, 3→2, 2→1, and 1→AP within a superframe, as shown in Figure 11(b). Otherwise, the transmission of a packet from 3 to AP cannot be completed in one superframe. The routing-ordered slot scheduling problem has been considered by most current IWSN studies; however, to the best of our knowledge, none has considered it together with retransmission. Taking the slot 3→2 as an example, if the packet is corrupted, the retransmission should be scheduled before slot 2→1; otherwise, the following slots 2→1 and 1→AP are wasted and the packet transmission from 3 to AP cannot be completed in one superframe.

Simple topology and slot schedule of example.
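The ordering constraint of the 3→2→1→AP example, including retransmission, can be expressed as a short sketch. This is one possible illustration rather than a published algorithm: the function names and the choice of reserving a fixed number of retransmission slots directly after each dedicated slot are assumptions made here.

```python
def routing_ordered_schedule(path, retx_per_hop=1, first_slot=0):
    """Build a slot list for one packet travelling along `path`.

    Each hop gets one dedicated slot followed by `retx_per_hop`
    retransmission slots, all placed before the next hop's slot.
    Returns a list of (slot, sender, receiver, kind) tuples.
    """
    schedule = []
    slot = first_slot
    for sender, receiver in zip(path, path[1:]):
        schedule.append((slot, sender, receiver, "dedicated"))
        slot += 1
        for _ in range(retx_per_hop):
            schedule.append((slot, sender, receiver, "retx"))
            slot += 1
    return schedule

def is_routing_ordered(schedule, path):
    """True if all slots of hop k come before every slot of hop k+1."""
    last = -1
    for hop in zip(path, path[1:]):
        slots = [s for s, a, b, _ in schedule if (a, b) == hop]
        if not slots or min(slots) <= last:
            return False
        last = max(slots)
    return True

path = ["3", "2", "1", "AP"]
good = routing_ordered_schedule(path, retx_per_hop=1)
# A schedule that serves 2->1 before 3->2 cannot finish in one superframe.
bad = [(0, "2", "1", "dedicated"), (1, "3", "2", "dedicated"),
       (2, "1", "AP", "dedicated")]
print(is_routing_ordered(good, path), is_routing_ordered(bad, path))
```

The cost of this simple placement is visible immediately: with one retransmission slot per hop, the three-hop path occupies six slots instead of three.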
From the above analysis, many problems remain unaddressed. Additionally, because industrial environments are harsh and unstable, link and node failures are unavoidable, so rerouting and TDMA rescheduling occur frequently. The cost of rerouting and TDMA rescheduling is very high: the network needs a large amount of bandwidth to collect link information and distribute scheduling information, and slot resource recovery is complicated and takes a long time, as highlighted by the WirelessHART standard. Furthermore, how to instantly deliver event-generated emergency messages under the communication resource constraint remains a problem. Current scheduling algorithms lack strategies for recovery and real-time emergency delivery.
To evaluate the performance of scheduling algorithms, we propose the following performance metrics: (1) end-to-end message delivery rate for specific types of data; (2) end-to-end delivery latency for specific types of data; (3) scheduling length (number of slots); (4) rescheduling convergence time: the time period from the link or device failure to the recovery by rerouting and rescheduling; (5) rescheduling overhead: the number of slots used for rerouting and rescheduling information collection and spreading.
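Metrics (1) and (2) can be computed per traffic type from a delivery log. The sketch below assumes a hypothetical log format, a list of `(traffic_type, delay_in_seconds)` pairs with `None` for a lost packet, and counts a packet that misses its deadline as undelivered; neither the format nor the deadlines come from the WirelessCAN implementation.

```python
from collections import defaultdict

def per_type_metrics(records, deadline_s):
    """Delivery rate and worst on-time latency for each traffic type.

    records: iterable of (traffic_type, delay_s), delay_s is None if lost.
    deadline_s: dict mapping traffic_type to its latency deadline (seconds).
    """
    stats = defaultdict(lambda: {"sent": 0, "on_time": 0, "max_delay": 0.0})
    for traffic_type, delay in records:
        entry = stats[traffic_type]
        entry["sent"] += 1
        if delay is not None and delay <= deadline_s[traffic_type]:
            entry["on_time"] += 1
            entry["max_delay"] = max(entry["max_delay"], delay)
    return {
        t: {"delivery_rate": e["on_time"] / e["sent"],
            "max_delay": e["max_delay"]}
        for t, e in stats.items()
    }

# Hypothetical log: one lost and one late monitoring packet.
log = [("monitoring", 0.4), ("monitoring", None), ("monitoring", 1.2),
       ("configuring", 0.9)]
metrics = per_type_metrics(log, {"monitoring": 1.0, "configuring": 3.0})
print(metrics)
```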
In 2013, 6TiSCH, a new IETF working group, was formed to study the interaction between IEEE 802.15.4e and RPL [17], the routing protocol for low-power and lossy networks (LLNs). This cross-layer scheduling challenge cannot be avoided in large-scale applications of IWSNs and will be studied in depth.
The analysis in [15] proved that real-time transmission scheduling in WirelessHART is an NP-hard problem in the single-superframe case. Unfortunately, all of the IWSN standards support multiple superframes, which further increases the complexity of routing and TDMA scheduling. We use the structure shown in Figure 12 to implement the joint scheduling. One task is driven by packet arrival or generation events: the system looks up the Graph Table to set the next hop field according to the Graph ID of the packet and then enqueues the packet. Another task is driven by the TDMA timers: if a slot is active, the system dequeues a packet according to the next hop and Graph ID. In this way, slot scheduling in the MAC layer obtains routing knowledge through the Graph ID. Further, Han et al. have studied scheduling of multiple data superframes [13]. The proposed algorithm is simple and suitable for assigning plenty of slots during the network forming stage but lacks flexibility and controllability. We will propose an algorithm for multiple superframe scheduling, called Multiple Transparent Slides Scheduling (MTSS), which is simple, intuitive, collision-free, resource-efficient, and flexible to change, and which is able to greatly improve communication quality and application safety.

Task of routing and slot scheduling.
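The two tasks of the joint scheduling structure can be sketched as follows. This is a simplified model for illustration only: the class name, the dictionary layouts of packets and slots, and the one-next-hop-per-graph table are assumptions, and the actual WirelessCAN firmware is written in C with its own data structures.

```python
class JointScheduler:
    """Packet-driven routing task plus timer-driven slot task (sketch)."""

    def __init__(self, graph_table):
        # graph_table: {graph_id: next_hop} for this node (assumed layout).
        self.graph_table = graph_table
        self.queue = []

    def on_packet(self, packet):
        """Task 1: runs when a packet arrives or is generated locally."""
        next_hop = self.graph_table[packet["graph_id"]]
        self.queue.append(dict(packet, next_hop=next_hop))

    def on_slot(self, slot):
        """Task 2: runs on the TDMA timer; returns the packet to send."""
        if not slot["active"]:
            return None
        for i, packet in enumerate(self.queue):
            if (packet["next_hop"] == slot["next_hop"]
                    and packet["graph_id"] == slot["graph_id"]):
                return self.queue.pop(i)
        return None

node = JointScheduler({1: "node2", 2: "AP"})
node.on_packet({"graph_id": 1, "payload": "a"})
node.on_packet({"graph_id": 2, "payload": "b"})
sent = node.on_slot({"active": True, "next_hop": "AP", "graph_id": 2})
print(sent["payload"])
```

Because each queued packet carries both its next hop and its graph ID, the slot task can pick the right packet without consulting the routing layer again.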
3.3. Centralized Control
A common feature of IWSN standards is to push the complexity of ensuring reliable and expedited data transfer to a centralized entity, the network manager. Because current research on IWSNs focuses on protocols and algorithms, the implementation of a centralized IWSN and its manager has not received enough attention. From the implementation of WirelessCAN, we find that replacing wired CAN with wireless technology is not just a matter of inserting an IWSN between the WPM and the WMs; more system architecture design is needed.
From our experience of WirelessCAN, we summarize some general rules to design centralized IWSNs: (1) separate policy from mechanism, to implement mechanisms such as TDMA slot in the nodes, while implementing policies such as slot scheduling in the manager; (2) separate system configuration from system logic, to implement system logic (functions) in the nodes, while implementing system configuration (parameters) in the manager; (3) distinguish static parameters from variable parameters; the static parameters such as unique ID, code version, and password are written into the node's ROM; other variable parameters could be dynamically configured by the manager; (4) separate network maintenance traffic from data traffic, in both buffers and communication schedules.
Another problem in centralized control is the integration of the IWSN network manager with the industrial controller. In the WirelessCAN project, this is the integration of the WPM and the IWSN manager. To replace the wired CAN bus with wireless technology transparently, we implement the IWSN manager independently of the WPM. However, to provide easy operation and an efficient IWSN, it is better to physically integrate the two into one for some specific applications. This integration involves new knowledge, which further increases the design difficulty of centralized control.
Some studies are helpful for designing a centralized architecture. Luo et al. [18] propose Sensor OpenFlow, which is a typical centralized control structure. Finne et al. [19] separate system configuration from system logic to enable cross-layer optimization.
3.4. Different QoS for Different Kinds of Message in a Single IWSN
Packets in IWSNs are application-aware and have different degree of importance and consequently require differential QoS, that is, update rate, message delivery ratio, transmission delay, and so forth. Based on our experience and knowledge on process automation, the typical requirements for IWSN are as follows: event-generated interlocking and emergency commands require 100% delivery ratio with maximum latency 10–100 ms; critical missions including closed loop control data require periodic sampling with update rate 1-2 per second, 99.99% message delivery ratio, and maximum latency 100–500 ms; uncritical missions including diagnostic or environmental data require periodic sampling with update rate 1 per 1.0–10 seconds, average message delivery ratio 80%, and maximum latency 1.0–10 seconds. It is important to develop schemes that will guarantee the required QoS for different kinds of packets with limited communication resources.
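The three requirement classes listed above can be written down as a small table and checked against measured values. This is a sketch for illustration: the class labels and the function name are hypothetical, and each latency limit takes the upper end of the range stated in the text, which is an assumption made here.

```python
# Targets from the text; latency limits use the upper end of each range
# (an assumption of this sketch).
QOS_CLASSES = {
    "emergency":   {"min_delivery": 1.0,    "max_latency_ms": 100},
    "critical":    {"min_delivery": 0.9999, "max_latency_ms": 500},
    "noncritical": {"min_delivery": 0.80,   "max_latency_ms": 10_000},
}

def meets_qos(traffic_class, delivery_ratio, worst_latency_ms):
    """Check measured delivery ratio and worst latency against a class."""
    target = QOS_CLASSES[traffic_class]
    return (delivery_ratio >= target["min_delivery"]
            and worst_latency_ms <= target["max_latency_ms"])

print(meets_qos("critical", 0.9999, 400))     # within both limits
print(meets_qos("emergency", 0.9999, 80))     # any loss violates the class
```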
WirelessHART specifies four priority levels from highest to lowest: command, process data, normal, and alarm. ISA100.11a defines six usage classes from highest to lowest, as shown in Table 1. Each class has different characteristics and different communication requirements. Paper [3] classifies industrial data into three categories from lowest to highest: monitoring and supervision, closed control, and interlocking and control. All of these indicate that the data in IWSNs are function-aware and have different degrees of importance and therefore different QoS requirements.
Usage classes for industrial wireless communication.
To develop an IWSN, we should firstly classify all the network traffic (including closed loop, environmental, diagnostic, safety, network healing, and managing) into several categories with individual requirements of QoS. In our WirelessCAN project, monitoring application data are more sensitive to delay. The configuring data can tolerate delay but requires 100% reliability; browsing application data can tolerate both delay and packet losses of some degree. In addition, different types of networking data, such as advertisement, join interaction, keep-alive, neighbors, and link quality reports, have different requirements of QoS.
This requirement motivates us to develop a mechanism of unequal packet loss protection that guarantees differential QoS and uses the limited bandwidth efficiently. Unequal packet loss protection can be implemented in different layers of the OSI reference model. In the network layer, the possible approaches include simultaneous double-route scheduling for closed loop control, alternative route scheduling for control monitoring data, and shortest single route for noncritical traffic. Possible approaches in other layers include unequal error protection codes, unequal access to shared slots, and unequally weighted data sampling rates. All of these are cross-layer techniques and may be combined toward an optimal solution.
3.5. Immediate Delivery of Emergent Message
Personal and equipment safety as well as emergent interlocking should always be the highest priority. When these event-driven signals generated, the corresponding data or commands must be immediately delivered with 100% reliability. Up to now, researchers in IWSNs have not paid enough attention to this crucial problem and an effective solution has not been proposed.
Both WirelessHART and ISA100.11a adopt TDMA for efficient channel utilization and reliable transmission, and the time slots in superframes are divided into dedicated slots and shared slots. A straightforward “solution” to the problem of immediate delivery of emergency messages is to schedule the transmissions in dedicated slots. However, emergency messages may be generated at different sources and sent to different destinations; additionally, the delay from source to destination should not exceed 100 ms, with a 100% delivery ratio. Together, these constraints imply that almost all time slots would have to be reserved for rare events. Another straightforward “solution” is to let emergency packets use the shared slots, but the resulting delay and reliability are unacceptable.
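A rough count shows why dedicated slots do not scale for this purpose. The sketch below assumes a 10 ms WirelessHART slot, one dedicated slot per hop for every potential source inside each 100 ms window, no retransmissions, and no spatial slot reuse. The hop distribution for 40 sources is a made-up example, and the function names are hypothetical. Under these assumptions the result is an optimistic lower bound on the demand.

```python
import math


def slots_per_window(deadline_ms=100, slot_ms=10):
    """Number of slots that fit in one deadline window on a single channel."""
    return deadline_ms // slot_ms


def dedicated_slots_needed(hops_per_source):
    """One dedicated slot per hop for every potential emergency source."""
    return sum(hops_per_source)


def channels_needed(hops_per_source, deadline_ms=100, slot_ms=10):
    """Parallel channels required to fit all reserved slots in one window."""
    demand = dedicated_slots_needed(hops_per_source)
    return math.ceil(demand / slots_per_window(deadline_ms, slot_ms))


# Hypothetical network: 10 one-hop, 15 two-hop, and 15 three-hop sources.
hops = [1] * 10 + [2] * 15 + [3] * 15
print(dedicated_slots_needed(hops), "slots reserved per 100 ms window")
print(channels_needed(hops), "channels kept busy for events that rarely occur")
```

Even this optimistic count reserves far more slots per window than a single channel offers, and all of them stay idle in normal operation.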
The authors of [20] have developed a duel slotted ALOHA algorithm to improve MAC throughput, which has been implemented in the WirelessCAN project; ABB and the team are now developing and implementing a Wireless Arbitrator for efficient flooding [21]. These techniques can be improved and applied as a possible solution for emergency applications. How to solve the problems of network consistency and concurrency in emergency situations is also an open research issue.
3.6. Application-Specific Knowledge Integration
IWSNs are application-specific and call for cross-layer design. Developing WirelessCAN requires a combination of expertise from several disciplines. First of all, industrial expertise and knowledge are required. More importantly, IWSN designers must have some knowledge of the IWSN operational environment and insight into the process control system. Both IWSN designers and industrial users must take part in discussions of IWSN requirements and specifications and carry out in-field experiments together. In the WirelessCAN project, the welder company requested a transparent migration from the wired to the wireless solution and provided us with the requirements described in Section 2.1. During implementation, we found that the knowledge provided was not enough to optimize the IWSN design to satisfy the requirements, and many specific challenges were exposed in the in-field experiments. These application-specific challenges may not be encountered by other IWSN designers, since they are particular to the WirelessCAN project. Nevertheless, different industrial machines and requirements influence the design of an IWSN, as the following examples illustrate.
Example 1.
Time synchronization over the entire network is essential for TDMA. The IWSN standards define one device as the master time source, while all other devices in the network are slaves and must synchronize with the master. In our WirelessCAN project, the gateway is the master time source, and the efficient synchronization method proposed in [22] is implemented. In the monitoring application of the WirelessCAN system, the WPM broadcasts a request packet every 500 ms, and every WM then replies to the WPM with its current working conditions. We ran many experiments and found that some WPM broadcast request packets were dropped in every experiment. We eventually found that the WPM and the network manager were not synchronized: each measured 500 ms with its own clock, so the two periods did not match.
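One way to see how two slightly different notions of 500 ms lead to recurring drops is a small simulation. The sketch below is an assumption-laden model, not the measured behavior: it supposes that the network forwards at most one request per 500 ms cycle and discards any extra request that arrives in the same cycle, and it uses a hypothetical WPM timer that runs 100 ppm fast. All times are integers in microseconds, and the function name is hypothetical.

```python
def count_dropped_requests(request_period_us, cycle_us, n_cycles):
    """Count requests discarded because they share a network cycle.

    Assumed model: the network accepts one request per cycle; any further
    request generated by the unsynchronized WPM in that cycle is dropped.
    """
    horizon_us = cycle_us * n_cycles
    requests_in_cycle = {}
    t = 0
    while t < horizon_us:
        cycle = t // cycle_us
        requests_in_cycle[cycle] = requests_in_cycle.get(cycle, 0) + 1
        t += request_period_us
    return sum(n - 1 for n in requests_in_cycle.values() if n > 1)


CYCLE_US = 500_000            # 500 ms as counted by the network manager
FAST_WPM_PERIOD_US = 499_950  # 500 ms as counted by a WPM clock 100 ppm fast

print(count_dropped_requests(CYCLE_US, CYCLE_US, 20_000))
print(count_dropped_requests(FAST_WPM_PERIOD_US, CYCLE_US, 20_000))
```

With matching periods no request is dropped, while the mismatched timer produces a small but persistent number of drops in every long run, which is consistent with a loss that shows up in every experiment.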
Example 2.
When welding starts, strong electrical impulses cause the W-CAN to malfunction. We found that the impulses interfere with the W-CAN in three ways: through the electrical supply lines, through the connection wires between the W-CAN and the WM, and through radiation. As countermeasures, we added a supply protection circuit, strong filters, and electromagnetic shielding to the W-CAN.
Example 3.
Most current studies [2, 3] regard the protocol translation modules implemented in the gateway as the key to a smooth and efficient integration of IWSNs into existing industrial control systems. However, we encountered other obstacles that are more costly and harder to solve. The essence of these problems is the difference between the communication models of the IWSN and the industrial control system. Current CAN fieldbuses have the following characteristics: wired connections with large bandwidth, a one-hop star or bus topology, and a contention-based MAC like CSMA. Given these characteristics, the communication models of wired fieldbuses are simple and bandwidth-inefficient. An IWSN has totally different characteristics: restricted communication resources, a multihop wireless mesh topology, and a conflict-free TDMA MAC. In the implementation of WirelessCAN, we noticed the following. (1) In WirelessHART, a time slot is 10 ms long, which is designed for the maximum data link layer payload of 133 bytes. However, the payload of a CAN frame is very short, typically 8 bytes; thus a lot of bandwidth is wasted in WirelessCAN. (2) For the monitoring application, the welder machines passively respond to the WPM's broadcast request every 500 ms. In a one-hop star or bus topology the bandwidth cost of broadcasting is small, but in a wireless mesh network it is large. This integrated knowledge makes it possible for us to improve bandwidth utilization in WirelessCAN.
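The waste described in point (1) can be quantified with simple arithmetic. The sketch below uses the 133-byte and 8-byte figures from the text; it ignores protocol headers inside the 133 bytes, and the 3-byte per-frame overhead in the last line is a purely hypothetical value for an aggregated record. Function names are illustrative only.

```python
SLOT_PAYLOAD_BYTES = 133  # maximum payload one 10 ms slot is designed for
CAN_PAYLOAD_BYTES = 8     # typical CAN frame payload


def slot_utilization(payload_bytes, capacity=SLOT_PAYLOAD_BYTES):
    """Fraction of the slot capacity used by a single payload."""
    return payload_bytes / capacity


def frames_per_slot(frame_bytes, capacity=SLOT_PAYLOAD_BYTES, overhead=0):
    """How many frames fit in one slot if they are aggregated."""
    return capacity // (frame_bytes + overhead)


print(f"one CAN frame per slot uses {slot_utilization(CAN_PAYLOAD_BYTES):.1%}")
print(frames_per_slot(CAN_PAYLOAD_BYTES), "frames fit with no overhead")
print(frames_per_slot(CAN_PAYLOAD_BYTES, overhead=3), "frames fit with 3 B each")
```

Sending one CAN payload per slot uses only about 6% of the slot capacity, which is why aggregating several CAN frames into one wireless packet is attractive.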
3.7. Other Prevalent Challenges
Several prevalent challenges that have not been emphasized so far, such as power consumption and time synchronization, were not critical problems in this project, but we still considered them in general. Because most WSNs are powered by batteries that are difficult to replace or recharge, all processing and communication must minimize power consumption.
Although the nodes in our project are powered directly by the welder machines, our initial design supports low-power operation for energy conservation. In accordance with the WirelessHART standard, the slot schedule follows the principle of energy conservation. Compared with CSMA-based protocols, TDMA-based protocols are prescheduled, so nodes can sleep during idle slots to conserve energy. We evaluated the power consumption based on our slot schedule and physical circuit. The evaluation shows that the average current consumption of a one-hop node is 249.2 μA over a complete superframe. Based on this evaluation, we estimate that a node in our IWSN can work for more than 34 days if powered by an ordinary battery. We also estimate that a node with a 10 s sampling period can work for about 223 days when powered by a 1500 mAh battery.
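Lifetime estimates of this kind usually come from dividing battery capacity by average current. The sketch below shows that conversion under an idealized model, which is an assumption here: the full rated capacity is usable and the current is constant. The function names are hypothetical. Only the 1500 mAh and 223-day figures are taken from the text; the capacity of the "ordinary battery" is not stated, so the 34-day estimate is not reproduced.

```python
HOURS_PER_DAY = 24


def battery_life_days(capacity_mah, avg_current_ua):
    """Ideal lifetime in days for a constant average current."""
    avg_current_ma = avg_current_ua / 1000.0
    return capacity_mah / avg_current_ma / HOURS_PER_DAY


def implied_average_current_ua(capacity_mah, lifetime_days):
    """Average current that would drain the battery in the given time."""
    return capacity_mah / (lifetime_days * HOURS_PER_DAY) * 1000.0


# Mean current implied by the 1500 mAh / 223 days estimate (ideal model).
current_ua = implied_average_current_ua(1500, 223)
print(f"implied average current: {current_ua:.1f} uA")
print(f"round trip: {battery_life_days(1500, current_ua):.0f} days")
```

Real batteries deliver less than their rated capacity, so a practical estimate would apply a derating factor to the capacity term.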
Synchronization is one of the most important challenges in most WSNs. In IWSNs in particular, all nodes in the network need to collaborate to perform the sensing task, and the collected data are usually delay-sensitive. Every node has its own local clock. As the ambient temperature changes and time passes, the oscillator frequency varies, which results in time deviation between nodes. If the deviation exceeds the guard time of the reception window, the node loses its connection to the network. To solve this problem, we adopt the synchronization mechanism defined in the WirelessHART standard. The network schedule indicates the “time parent” of each node, that is, the neighbor node with which this node must stay synchronized. Timing information is included in all packets, so that two neighbor nodes can resynchronize with one another whenever they communicate. When a transmitter sends a packet, the receiver timestamps the instant at which it receives the packet and reports, in its acknowledgment, the offset between that measured time and the theoretical reception time TsTxOffset. Depending on which node is the time parent, either the transmitter or the receiver aligns its clock to the other. When the network is idle (no data packets are flowing), nodes are synchronized by the periodic advertisements of their time parents. In our IWSN, because of the 500 ms sampling period, each node has enough time and packets to synchronize with its own time parent.
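The pairwise correction described above can be sketched in a few lines. This is a simplified model under stated assumptions: propagation delay is ignored, each node is reduced to a single clock error relative to network time, and the value used for `TS_TX_OFFSET_US` is only illustrative and should be taken from the standard. `Node`, `measured_offset_us`, and `resynchronize` are hypothetical names.

```python
from dataclasses import dataclass

TS_TX_OFFSET_US = 2120  # illustrative value; consult the standard


@dataclass
class Node:
    name: str
    clock_error_us: int  # local clock minus network time


def measured_offset_us(tx, rx):
    """Offset the receiver reports in its acknowledgment.

    The transmitter sends when its own clock reads TsTxOffset after slot
    start; the receiver timestamps the arrival with its own clock and
    subtracts the theoretical reception time.
    """
    arrival_on_rx_clock = TS_TX_OFFSET_US - tx.clock_error_us + rx.clock_error_us
    return arrival_on_rx_clock - TS_TX_OFFSET_US


def resynchronize(tx, rx, time_parent):
    """Align the child clock to the time parent after one exchange."""
    offset = measured_offset_us(tx, rx)
    if time_parent is rx:
        tx.clock_error_us += offset  # transmitter follows the receiver
    elif time_parent is tx:
        rx.clock_error_us -= offset  # receiver follows the transmitter
    return offset


child = Node("node-7", clock_error_us=-300)  # 300 us behind
parent = Node("gateway", clock_error_us=0)
print(resynchronize(child, parent, time_parent=parent), child.clock_error_us)
```

Because every acknowledged data packet carries this offset, a node that reports every 500 ms gets frequent corrections without any dedicated synchronization traffic.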
4. Experiment
To evaluate the performance of this project, we deployed it in a factory that cooperates with us. Based on the above analysis of the challenges, the key parameters were chosen as shown in Table 2. The first part describes the platform of our project, including the hardware specifications, while the second part lists the main system and schedule parameters.
System parameters.
Before the real test, we carried out a simple experiment that measured the Packet Loss Rate (PLR) of transmissions between ordinary wireless nodes under different interference conditions in the factory. Besides the industrial interference, another source is the WiFi hotspots that we can detect. The PLR reaches 2% when only a small number of WiFi access points can be detected and as much as 10% when many access points interfere. Therefore, our implementation is mainly based on WirelessHART; at the same time, it supports some other useful designs to improve performance, such as the packet aggregation of WIA-PA and the segmented slot schedule in [9]. After working through a variety of challenges and solutions, we finally formed a three-hop topology with 40 welder machines for the experiment, as shown in Figure 13. A design with multiple superframes is used, which separates network management from data communication. The data communication superframe contains 50 slots, divided into 40 dedicated slots and 10 shared slots. The shared slots alternate regularly with the dedicated slots, as shown in Figure 14. All nodes submit the aggregated welder machine data in their dedicated slots during every superframe. If a transmission fails in a dedicated slot, the retransmission takes place in the shared slots.
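A data superframe of this shape can be sketched as follows. The exact layout of Figure 14 is not reproduced here, so the pattern used (four dedicated slots followed by one shared slot, one dedicated slot per node) is an assumption, and `build_data_superframe` is a hypothetical helper.

```python
def build_data_superframe(n_nodes=40, n_shared=10):
    """Interleave dedicated and shared slots in a regular pattern.

    Assumed layout: equal groups of dedicated slots, each group followed
    by one shared slot that can carry retransmissions.
    """
    if n_nodes % n_shared != 0:
        raise ValueError("nodes must divide evenly among the shared slots")
    group_size = n_nodes // n_shared
    schedule = []
    node_id = 0
    for _ in range(n_shared):
        for _ in range(group_size):
            schedule.append(("dedicated", node_id))
            node_id += 1
        schedule.append(("shared", None))
    return schedule


superframe = build_data_superframe()
print(len(superframe), "slots,", len(superframe) * 10, "ms at 10 ms per slot")
print(superframe[:5])
```

With 10 ms slots, the 50-slot superframe lasts 500 ms, which matches the 500 ms monitoring period, and placing a shared slot after each group keeps a retransmission opportunity close to the failed dedicated slot.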

A multihop topology of WirelessCAN.

Slot schedule of data superframe.
The packet error rate is one of the most important parameters affecting network performance. In the context of our project, this parameter depends on several of the challenges mentioned above. To take it into account, we use Pa to denote the probability that sampled data are successfully transmitted within the current superframe, as a measure of the capability for reliable real-time communication. We recorded the data collection statistics over a two-week run of the welder machine monitoring application. The factory operates on weekdays, and the system is powered off during the weekend. The first week of observation was in a low-interference environment with 2% PLR, and the second week was in an environment with 10% PLR. The experimental results are displayed in Figure 15. For each week, we computed the average Pa over all nodes at each hop count. As can be seen, our project achieves good performance by carefully addressing these challenges. Under heavy interference, the lower bound of Pa is 0.95. Under light interference, fewer than 1% of packets arrive late. This result is sufficient for the majority of real-time and reliable industrial applications.
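For intuition about how Pa depends on PLR and hop count, a back-of-the-envelope model can help. The sketch below is an assumption and not the analysis behind Figure 15: losses are taken as independent, every hop gets one dedicated attempt plus a fixed number of retries in the same superframe, and contention in the shared slots is ignored. The function name and parameters are hypothetical.

```python
def pa_estimate(plr, hops, retries_per_hop=1):
    """Estimated probability of delivery within the current superframe.

    A hop fails only if the dedicated attempt and all retries are lost;
    the packet arrives in time only if every hop succeeds.
    """
    per_hop_success = 1.0 - plr ** (retries_per_hop + 1)
    return per_hop_success ** hops


for plr in (0.02, 0.10):
    row = ", ".join(f"{h} hop(s): {pa_estimate(plr, h):.4f}" for h in (1, 2, 3))
    print(f"PLR {plr:.0%} -> {row}")
```

The model only illustrates the trend: Pa decreases with each additional hop and with higher PLR, and retransmission opportunities within the same superframe recover most of the loss.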

In-field connection of W-CAN with WM.
5. Conclusion
In this paper, we briefly introduce WirelessCAN, a real IWSN implementation for welder machine systems. We also validate the performance of WirelessHART in real experiments, compare it with CSMA/CA-based ZigBee, and present the results graphically. During the implementation, many challenges emerged that are more concrete and detailed than those addressed before. Selected challenges are presented in detail, such as the radio bandwidth constraint, real-time scheduling, the design of the network manager for central control, different QoS requirements, and the immediate delivery of emergency events. We believe that these challenges will be faced by many IWSN designers. We propose some novel ways to tackle them. Preliminary results from the WirelessCAN project are very encouraging.
Footnotes
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant no. 61201204), the Fundamental Research Funds for the Central Universities (Grant no. 2015JBM006 and Grant no. W15JB00430), and the Beijing Higher Education Young Elite Teacher Project (Grant no. YETP0533).
