Abstract
In this article, we consider physical layer security in Internet of Things systems that contain a sensing transceiver pair, a number of candidate nodes, and an eavesdropper. The transceiver pair needs to select a jammer node and a relay node among the candidate nodes so as to preserve the secrecy of the communications. Given the diversity of candidate channels and the limited available power, it is infeasible to scan all the nodes and find the optimal ones. We formulate this jammer and relay selection problem as an optimal stopping problem under a fixed sensing order. Then, by applying dynamic programming, we propose a low-complexity approach to obtain the optimal sensing order. The performance of the proposed selection scheme is evaluated through numerical results.
Keywords
Introduction
The Internet of Things (IoT), an advanced paradigm that supports omnipresent connectivity among physical devices (e.g. sensors, actuators, and smart phones), has grown in popularity since the concept was first introduced by K Ashton.1 To connect things for exchanging and gathering information, wireless networks (such as wireless sensor networks2–4) play an integral part in IoT. Yet, the wide adoption and deployment of IoT devices may be overshadowed by security threats.5 Specifically, devices in IoT are inherently vulnerable to eavesdropping attacks.6 Therefore, considerable work needs to be devoted to improving the secrecy capacity in such an environment. Traditionally, security is treated as an issue for the upper layers (e.g. the network layer) and is addressed with cryptographic methods (e.g. encryption).7,8 However, these cryptographic methods may be hard to implement in IoT systems since key distribution, encryption, and decryption are costly and complex for low-profile IoT devices.6
To address the security issue in IoT systems, physical layer security has attracted attention. Unlike upper layer security, physical layer security does not need a secret key, resulting in low complexity and low energy cost, which makes it more suitable for IoT devices. Its basic idea is to make use of the features of the spectrum to prevent the eavesdropper from correctly decoding the information signals.9–11 Specifically, physical layer security methods attempt to degrade the signal to interference plus noise ratio (SINR) of the eavesdropper to maintain a positive secrecy capacity, which is defined as the maximum rate difference between the legitimate link and the transmitter-eavesdropper link.12 To achieve a large secrecy capacity, cooperative jamming has been put forward.13–15 Its main idea is to confuse the eavesdropper with artificial noise from a cooperative helper.
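The secrecy capacity definition above can be illustrated with a minimal sketch. It assumes the common form in which each link rate is log2(1 + SINR) and the secrecy capacity is the positive part of the rate gap; the function name and the SINR values are hypothetical and are not taken from the article.

```python
import math

def secrecy_capacity(sinr_legit: float, sinr_eve: float) -> float:
    """Positive part of the rate gap between the legitimate link and the
    eavesdropper link, in bit/s/Hz (assumed textbook form)."""
    rate_legit = math.log2(1.0 + sinr_legit)
    rate_eve = math.log2(1.0 + sinr_eve)
    return max(0.0, rate_legit - rate_eve)

# Jamming lowers the eavesdropper's SINR and therefore widens the gap.
no_jamming = secrecy_capacity(sinr_legit=15.0, sinr_eve=7.0)    # log2(16) - log2(8)
with_jamming = secrecy_capacity(sinr_legit=15.0, sinr_eve=1.0)  # log2(16) - log2(2)
```

Under these illustrative numbers, degrading the eavesdropper's SINR from 7 to 1 raises the secrecy capacity from 1 to 3 bit/s/Hz, which is the effect cooperative jamming aims for.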
However, the existing cooperative jamming schemes may not be suitable for IoT systems since they need the instantaneous channel state information (CSI) of all users (referred to as GCSI). The general approach to obtaining GCSI is to transmit training signals for channel estimation between the transmitter and the receiver. Yet, in IoT systems, due to the restricted energy and the lack of high-rate feedback channels, the channel training opportunities are limited.6 Acquiring accurate GCSI also consumes spectrum access opportunities. As a result, it is prohibitively difficult to obtain the GCSI in IoT systems. Besides, these schemes do not treat energy consumption as a constraint, although it is a matter of great concern in IoT systems.16 Thus, the problem of how to design a suitable cooperative jamming scheme in such an energy-limited network with only statistical CSI needs to be tackled.
To solve these problems, we first propose a joint jammer and relay selection scheme for an IoT system that contains a sensing transmitter–receiver pair, some other candidate nodes, and an eavesdropper. The source needs to select two candidate nodes, which are employed as the jammer and the relay, respectively. In the proposed scheme, the source tests the secrecy capacity of the candidate nodes in a certain sequential order. Due to the time and energy constraints mentioned before, it is scarcely possible to sense all the candidate nodes to select the best relay and jammer. We therefore employ the optimal stopping theory to jointly select proper jammer and relay nodes. Specifically, the first two candidate nodes (one acting as the relay and the other as the jammer) that satisfy the secrecy capacity thresholds are selected. The optimal thresholds are calculated according to the probability distribution of the candidate nodes’ CSI and available power. Besides, as in Swindlehurst and colleagues,14,15 we assume the candidate nodes are all equipped with multiple antennas. The jamming vectors are designed to cancel out the jamming signals at the legitimate receiver. Finally, we consider a more general case in which each candidate node has its own probability distribution of CSI and available power. In this case, a proper sensing order is needed, since it helps the transmitter–receiver pair find superior candidate nodes sooner, which reduces the time and energy cost and improves the secrecy capacity. By applying dynamic programming, we propose a low-complexity method to obtain the sensing order.
The rest of the article is organized as follows. Section “Related work” presents the related work. The system model is described in section “System model.” The joint relay and jammer selection scheme is derived in section “Optimal stopping theory–based joint relay and jammer selection scheme.” The performance evaluations are detailed in section “Evaluation.” Section “Conclusion” concludes the article.
Notations:
Related work
We summarize the related work under the categories of jamming schemes in physical layer security and optimal stopping theory–based schemes in wireless resource allocation.
Existing work on jamming schemes in physical layer security
Dong et al. 13 considered a three-node topology and used cooperative jamming to confuse the eavesdropper. As for the IoT system, Zhang et al. 17 proposed a jamming strategy among a large number of IoT devices. Huang and Swindlehurst 14 employed cooperative jamming in relay networks. Using convex optimization, the jamming covariance matrices were derived in this work.
Jammer selection plays an important role in cooperative jamming–based physical layer security methods. To meet security performance requirements, a considerable amount of recent research has addressed jammer selection.18–22
Chen et al. 18 investigated the joint jammer and relay selection scheme in an amplify-and-forward (AF)-based network. Similar to Chen et al., 18 Liu et al. 19 proposed a cooperative jamming scheme in a relay network, in which one relay node and one or two jammers are selected. In these works, each node is assumed to have a single antenna. In the context of multi-antenna networks, Wang et al. 20 investigated the jammer selection issue, in which the secrecy capacity is maximized using a null-steering beamforming technique. Different from Wang et al., 20 Hui et al. 21 introduced another criterion, termed the secrecy outage probability, to select jammers. In the work by Hui et al., 21 the node that minimizes the secrecy outage probability is selected as the jammer. To choose multiple friendly jammers, Wang et al. 22 attempted to select the nodes whose channels are orthogonal to the legitimate channel. In summary, these existing jammer selection schemes mainly focus on choosing one or more jammers to optimize the security performance, assuming the instantaneous GCSI is known. However, to obtain the instantaneous GCSI, it is necessary to scan all the candidate nodes, which lowers the overall throughput since the selection process reduces the time left for data transmission. More importantly, as mentioned before, due to the time and energy constraints in IoT systems, it is scarcely possible to sense all the candidate nodes. To save time and energy, we apply the optimal stopping theory to the selection problem in IoT systems.
Existing work on the optimal stopping theory in wireless resource allocation
The optimal stopping theory has been well studied in wireless communications, for example in opportunistic scheduling, relay selection, and spectrum sensing.
For example, the opportunistic scheduling is an important issue in many wireless networks.23–25 Tan et al. 23 studied the distributed opportunistic scheduling problem through the use of stopping theory. In the scheduling scheme, the authors characterized the optimal scheduling policies under delay constraints. A distributed opportunistic scheduling framework was proposed by Li et al. 24
Besides, the optimal stopping theory has been employed to investigate the problem of spectrum sensing in cognitive radio networks. Shu and Krunz 26 treated the spectrum sensing issue as a stopping problem and suggested an optimal decision strategy that enhances the overall network performance by maximizing the system rewards. Jia et al. 27 considered the problem of channel allocation in cognitive radio networks. In the context of relay selection, Jing et al. 28 proposed a stopping theory–based selection strategy, in which the node that maximizes the transmission throughput is selected as the relay. As far as we know, using the optimal stopping theory to address the joint jammer and relay selection issue in IoT systems remains unexplored in the existing literature.
System model
System model
We establish a two-hop IoT system (Figure 1), which consists of a sensing transmitter S, a sensing receiver D, an eavesdropper E, and M candidate nodes denoted by

Figure 1. Network model.

Figure 2. Time slot structure.
In Phase I, S observes the candidate nodes step by step. In a certain sensing step, S picks and senses two candidate nodes. The time needed for one observation step can be given as t. After sensing these two candidate nodes, S should make a decision regarding whether to select them as the relay and the jammer or to skip to the next sensing step. Suppose in the
Data transmission process
Transmission process in phase II
Without loss of generality, we assume
where
To cancel out the undesired interference, a decoding vector can be designed at
Then, the received signals at
After using ZFBF, one can see that
Transmission process in phase III
In Phase III, the signals received at the receiver and the eavesdropper can be given as
Without ambiguity, the information signals and the jamming signals are still denoted by
Since D only has one single antenna, we deliberately design
Using matrix transformation, this problem can be solved and
For more information, one can refer to Gao et al. 29 Thus, the signals received at D can be given as
In conclusion, by carefully designing the jamming beamformer and the information beamformer, the information signals can be transmitted from S to D without interference from the jamming signals, no matter which candidate nodes are selected, since the jamming signals are canceled at D.
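The idea of steering the jamming signal away from D can be sketched as follows. This is a minimal illustration and not the article's beamformer design: it assumes a multi-antenna jammer, a single-antenna receiver D, and a received jamming component proportional to the Hermitian inner product of the channel and the jamming vector. The names `h_jd`, `h_je`, and `null_steering_vector` are hypothetical, and the channels are random draws.

```python
import random

def inner(a, b):
    """Hermitian inner product a^H b for lists of complex numbers."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def null_steering_vector(h_jd, seed_vec):
    """Project seed_vec onto the subspace orthogonal to h_jd and normalize it,
    so that the result has unit norm and satisfies h_jd^H w = 0."""
    scale = inner(h_jd, seed_vec) / inner(h_jd, h_jd)
    w = [g - scale * h for g, h in zip(seed_vec, h_jd)]
    norm = abs(inner(w, w)) ** 0.5
    return [x / norm for x in w]

rng = random.Random(0)

def random_channel(n_antennas):
    """Illustrative complex Gaussian channel vector (assumption)."""
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_antennas)]

h_jd = random_channel(4)  # jammer -> D
h_je = random_channel(4)  # jammer -> E
# Seeding the projection with h_je points the remaining jamming power toward E.
w = null_steering_vector(h_jd, h_je)
leak_at_d = abs(inner(h_jd, w))  # numerically zero
gain_at_e = abs(inner(h_je, w))  # nonzero unless h_je is parallel to h_jd
```

The projection removes the component of the jamming vector that D would receive, while E, whose channel is generally not aligned with that of D, still receives the jamming signal.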
Optimal stopping theory–based joint relay and jammer selection scheme
As section “System model” shows, the choice of relay and jammer has an important influence on the SINR at E. Intuitively, to enhance the secrecy capacity, the channel gain of the jammer-eavesdropper link should be large, while the channel gain of the jammer-receiver link should be small. For the relay node, the situation is reversed. In this section, we employ the optimal stopping theory to model the relay and jammer selection process (i.e. Phase I). To maximize the reward function, the source decides whether to stop or to continue by comparing the instantaneous reward with the expected reward of the subsequent sensing steps. The concepts of stopping theory are given as follows.
A sequence of random variables (i.e.
A sequence of reward functions, (i.e.
To be more specific, given these concepts, the optimal stopping problem can be formulated as follows: for each
Reward function of secrecy capacity
As mentioned before, we assume
where
According to Huang and Swindlehurst,30 the secrecy capacity in Phase II, denoted by
Similarly, in Phase III, the SINR at D and E can be calculated by
And the secrecy capacity in Phase III, denoted by
Since the eavesdropping attack can take place in Phase II and Phase III, both of the secrecy capacities
Then, we derive the reward function denoted by
According to equation (17), the value of
Optimal selection scheme
In IoT systems, as mentioned before, the instantaneous GCSI is hard to obtain. Therefore, we assume only statistical GCSI is known a priori to S. The channel gains for nodes i and j are assumed to be selected from a set of discrete values
Under these assumptions, we can derive the expected reward for each observation step. Denoted by
where
One can find that it is suitable to continue the selection process if
That is to say, Phase I stops at
In the following, we propose to use the backward induction method to obtain the expected reward for each observation step, since the number of the observation steps is limited to
In this subsection, we simply assume that the available power and the channel gains of the candidate nodes are independent and identically distributed (i.i.d.) variables. As a result,
The process of the proposed selection scheme is shown in Algorithm 1. Similar to Huang and Swindlehurst30 and Ly et al.,32 a common control channel (CCCH) is assumed to be set up for nodes to send the control information. First, S senses the candidate nodes according to a fixed sensing order (Line 1) and obtains the instantaneous reward
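The backward induction described in this subsection can be sketched as follows. This is a simplified illustration under stated assumptions, not a reproduction of Algorithm 1: the reward observed at each step is drawn i.i.d. from a discrete distribution, and stopping at step k keeps the fraction 1 - k*tau/slot of the slot for data transmission (an assumed form of the sensing overhead). The names `continuation_values`, `stopping_step`, `tau`, and `slot` are hypothetical.

```python
def continuation_values(values, probs, num_steps, tau, slot):
    """cont[k] = expected reward if S skips step k and acts optimally afterwards.

    cont[0] is the expected reward of the whole selection process.
    """
    cont = [0.0] * (num_steps + 1)  # cont[num_steps] = 0: nothing is left to sense
    for k in range(num_steps - 1, -1, -1):
        share = 1.0 - (k + 1) * tau / slot  # assumed fraction of the slot left for data
        cont[k] = sum(p * max(share * v, cont[k + 1]) for v, p in zip(values, probs))
    return cont

def stopping_step(observed, cont, tau, slot):
    """First step (1-indexed) whose discounted reward reaches the continuation value."""
    for k, reward in enumerate(observed, start=1):
        if (1.0 - k * tau / slot) * reward >= cont[k]:
            return k
    return len(observed)

# Toy distribution of the per-step reward (illustrative values only).
values, probs = [1.0, 2.0, 3.0, 4.0], [0.25, 0.25, 0.25, 0.25]

# Without sensing cost the thresholds are 3.4375, 3.25, 3.0, 2.5, 0.0 for steps 1..5.
cont = continuation_values(values, probs, num_steps=5, tau=0.0, slot=1.0)
step = stopping_step([2.0, 3.0, 4.0, 1.0, 1.0], cont, tau=0.0, slot=1.0)  # stops at step 3

# With a 2 us observation step in a 0.2 ms slot, every threshold becomes smaller.
cont_cost = continuation_values(values, probs, num_steps=5, tau=2e-6, slot=2e-4)
```

The thresholds decrease toward the last step, so S becomes less selective as fewer sensing opportunities remain, and it always accepts the final candidates.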
Optimal sensing order
In the subsection “Optimal selection scheme,” we assume that the channel gains and the available power of the candidate nodes are i.i.d. variables. However, in a more general and practical case, each candidate node has its own probability distributions. The sensing sequence can then dramatically affect the effectiveness of the selection process. More specifically, it is easier for S to find superior candidate nodes if an optimal sensing order is constructed before the selection process. Inspired by this, in this section, we attempt to find an optimal sensing order to optimize the selection process. Since each candidate node has its own characteristics, we redefine the channel gains between i and j as taking values in a finite set
At the
Note that the sequence of the first
At the
At each observation step, S can record which state results in the maximum expected reward. Thus, the optimal sensing order can be obtained from the optimal states recorded by S at each observation step. Another approach to derive the optimal sensing order is a brute force search over all the possible orders, in which S calculates the maximum expected reward of each order with Algorithm 1. Compared with the brute force search, our proposed approach significantly reduces the computation overhead.
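The contrast between a dynamic programming search and a brute force search over sensing orders can be sketched as follows. This is a simplified illustration under stated assumptions and not the article's exact recursion: each candidate is treated as a single item with its own discrete reward distribution, given as (value, probability) pairs, rewards are independent across candidates, and stopping at step k keeps the fraction 1 - k*tau/slot of the slot. The state is the set of candidates already sensed, stored as a bitmask; all names and numbers are hypothetical.

```python
from functools import lru_cache
from itertools import permutations

def expected_reward_of_order(order, dists, tau, slot):
    """Expected reward of optimal stopping along one fixed sensing order."""
    cont = 0.0
    for k in range(len(order), 0, -1):
        share = 1.0 - k * tau / slot
        cont = sum(p * max(share * v, cont) for v, p in dists[order[k - 1]])
    return cont

def optimal_order(dists, tau, slot):
    """Dynamic programming over subsets of already-sensed candidates."""
    n = len(dists)
    full = (1 << n) - 1

    @lru_cache(maxsize=None)
    def best(sensed):
        """(expected reward, next candidate) once the candidates in `sensed` were skipped."""
        if sensed == full:
            return 0.0, -1
        k = bin(sensed).count("1") + 1  # index of the upcoming observation step
        share = 1.0 - k * tau / slot
        choices = []
        for i in range(n):
            if not sensed & (1 << i):
                cont = best(sensed | (1 << i))[0]
                value = sum(p * max(share * v, cont) for v, p in dists[i])
                choices.append((value, i))
        return max(choices)

    order, sensed = [], 0
    while sensed != full:
        nxt = best(sensed)[1]
        order.append(nxt)
        sensed |= 1 << nxt
    return best(0)[0], order

# Four candidates with different (illustrative) reward distributions.
dists = [
    [(0.5, 0.5), (1.0, 0.5)],
    [(0.2, 0.7), (3.0, 0.3)],
    [(1.5, 1.0)],
    [(0.0, 0.4), (2.0, 0.6)],
]
dp_value, dp_order = optimal_order(dists, tau=2e-6, slot=2e-4)
brute_value = max(expected_reward_of_order(p, dists, 2e-6, 2e-4)
                  for p in permutations(range(len(dists))))
```

In this sketch the recursion visits at most 2^M subsets for M candidates, whereas the brute force search evaluates M! orders, which illustrates why a subset-based recursion scales better.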
Evaluation
In this section, we evaluate the performance of our proposed scheme through simulation experiments. As described in section “System model,” all the candidate nodes (CNs) are assumed to have four antennas, while the system slot is assumed to be 0.2 ms. We assume the distribution of the channel power gain is
Simulation study of fixed sensing order
In this subsection, we consider the case in which the sensing order is fixed, namely, the candidate nodes are sensed from 1 to M one by one. The time cost of one observation step is 2 µs with
First, we compare the optimal stopping theory–based scheme with a random selection scheme, in which the jammer and the relay are randomly selected by S. Figure 3 shows the secrecy capacity versus the number of candidate nodes. The proposed selection scheme can remarkably improve the secrecy capacity.

Figure 3. Secrecy capacity versus the number of candidate nodes.
Moreover, the secrecy capacity of the random scheme fluctuates with no clear trend, whereas that of the proposed scheme grows steadily as the network size increases. One can also notice that the secrecy capacity levels off when the number of candidate nodes is large. The explanation is that when the number is small, adding candidate nodes gives S more opportunities to choose suitable nodes. When the number is already large, however, a marginal increase in the candidate nodes has little impact on the variation of the channel gain and the available power.
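The comparison between threshold-based stopping and random selection can be mimicked with a small Monte Carlo sketch. It does not reproduce Figure 3: it assumes a toy reward that is uniform over a few non-negative discrete values, i.i.d. across steps, with no sensing cost, and the function name `simulate` and all numbers are hypothetical.

```python
import random

def simulate(values, num_steps, trials, seed=0):
    """Average reward of threshold stopping and of random selection, plus the
    expected reward predicted by backward induction."""
    rng = random.Random(seed)
    # Continuation values for rewards drawn uniformly from `values`.
    cont = [0.0] * (num_steps + 1)
    for k in range(num_steps - 1, -1, -1):
        cont[k] = sum(max(v, cont[k + 1]) for v in values) / len(values)
    stop_total = 0.0
    random_total = 0.0
    for _ in range(trials):
        rewards = [rng.choice(values) for _ in range(num_steps)]
        # Stop at the first step whose reward reaches the continuation value;
        # cont[num_steps] is 0, so the last step is always accepted.
        stop_total += next(r for k, r in enumerate(rewards, 1) if r >= cont[k])
        # Baseline: pick one of the observation steps at random.
        random_total += rng.choice(rewards)
    return stop_total / trials, random_total / trials, cont[0]

stop_avg, random_avg, predicted = simulate([1.0, 2.0, 3.0, 4.0], num_steps=5, trials=20000)
```

In this toy setting the stopping rule is expected to average about 3.58 against about 2.5 for random selection, and increasing `num_steps` raises the predicted value by ever smaller amounts, which is consistent with the levelling off described above.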
In Figure 4, the impact of the sensing time on the secrecy capacity is detailed. As the sensing time increases, less time can be used for data transmission, which leads to poorer performance regardless of the network size. Figure 5 reports the impact of the sensing time on the number of sensing steps, which is observed to increase with the number of candidate nodes. The reason is that with more candidate nodes, S has more chances to find suitable nodes. According to Figure 4, one can also notice that the secrecy capacity is more sensitive to the sensing time when the network size is large. This can be understood by the fact that more sensing steps combined with longer sensing times leave less time for transmission, so the performance drops sharply as the network size increases. In Figure 5, one can also find that for a fixed network size, the sensing time has almost no effect on the number of sensing steps. This demonstrates the convergence property of the proposed selection scheme.

Figure 4. Secrecy capacity versus sensing time.

Figure 5. The number of sensing steps versus sensing time.
Simulation study of optimal sensing order
Note that in section “Simulation study of fixed sensing order,” we assume that the available power and the channel gains are i.i.d. variables for all the candidate nodes. In this subsection, the performance of our scheme is evaluated with the optimal sensing order by changing the distribution of the available power. That is to say, each candidate node has its own probability distribution of the available power. We compare the proposed optimal sensing order with a fixed one, which is defined as the descending order. For example, when the number of candidate nodes M is 4, the fixed sensing order is [4,3,2,1].
Figure 6 shows the secrecy capacity achieved by these two sensing orders versus the number of candidate nodes. One can find that the optimal sensing order outperforms the fixed sensing order, which verifies the analysis in section “Optimal sensing order.” The increase in the secrecy capacity is observed to level off with a large number of candidate nodes. The same conclusion can also be drawn from Figure 3. In Figure 7, the number of sensing steps versus the number of candidate nodes is plotted. One can see that the optimal sensing order effectively reduces the number of sensing steps, and thus the sensing time, of the proposed selection scheme.

Figure 6. Secrecy capacity versus the number of candidate nodes with different sensing orders.

Figure 7. The number of sensing steps versus the number of candidate nodes.
Conclusion
In this article, we have investigated the joint jammer and relay selection issue in an IoT system, which consists of a sensing transmitter, a sensing receiver, some candidate nodes, and an eavesdropper. To maximize the secrecy capacity, we have formulated the selection process as an optimal stopping problem that takes the channel gains and the available power into account. The first two candidate nodes (one acting as the relay and the other as the jammer) that satisfy the secrecy capacity thresholds are selected. The optimal thresholds are calculated according to the probability distribution of the candidate nodes’ CSI and available power. Then, considering a more general case in which each candidate node has its own probability distribution of CSI and available power, we have proposed a low-complexity method that obtains the optimal sensing order by applying dynamic programming.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was financially supported by the National Natural Science Foundation of China (61572070, 61272505, 61371069, and 61471028) and the Specialized Research Fund for the Doctoral Program of Higher Education (Grant No. 20130009110015).
