Abstract
To maximize the long-term time-averaged secrecy rate of an energy harvesting wireless communication system, an online power control algorithm based on the Lyapunov optimization framework is proposed. The system is composed of a source node, a cooperative jamming node, and two destination nodes. The source node and the jamming node are powered by energy harvesting devices. The information sent to each destination node must be kept confidential from the other. Using the Lyapunov optimization framework, the original stochastic optimization problem is transformed into a per-time-slot optimization problem, and the power of the signal and that of the artificial noise are determined from the current system state, namely the battery levels and the channel coefficients. Fairness between the two destination nodes is also considered. Simulation results demonstrate that the proposed algorithm can effectively utilize the harvested energy and significantly improve the long-term averaged secrecy rate.
Introduction
With the rapid development of information technology, an era of intelligent systems is arriving. To achieve this intelligence, researchers have begun to explore the Internet of Things (IoT), mobile computing (MC), pervasive computing (PC), wireless sensor networks (WSNs), and cyber-physical systems (CPS). 1 IoT connects different things through heterogeneous networks and is an active topic in both academic and industrial research. In recent years, IoT has been widely used in environmental monitoring, security surveillance, spatial crowdsourcing, crowd dynamics management, and smart cities. 2 The sensor nodes in IoT, which collect sensory data from the environment, connect to the network through a wireless link in most cases, making wireless communication one of the most important technologies of IoT.
The broadcast nature of wireless communications allows the signal sent by a transmitter to be received by both legitimate nodes and eavesdroppers. Therefore, information security is an important issue in IoT. Traditionally, information is encrypted at the upper layers to guarantee its security, which relies on the high computational complexity of decrypting the information without the key. The higher the complexity of decryption without the key, the better the security performance, but the corresponding encryption and decryption with the key also become more complex. Due to the limited computing capability and energy supply of the nodes, high-complexity encryption algorithms cannot be used in IoT. Physical layer security (PLS) is another way to ensure the security of information. In 1975, Wyner 3 first proposed the eavesdropping channel model and pointed out that secure transmission of information can be achieved using physical layer technology when the legitimate channel is superior to the eavesdropping channel. Signal processing technology, which aims to create and increase the quality advantage of the legitimate channel over the eavesdropping channel, is an active topic in PLS research. Commonly used technologies include multi-antenna technology, 4 artificial noise (AN) technology, and cooperative communication technology. 5 The nodes of IoT are limited in size and generally have a single antenna. Because of the limited antenna, computing, and energy resources, PLS technology in IoT has its own special features. Burg et al. 6 reviewed PLS technology in IoT.
As the scale of communication networks continues to expand, the energy consumed by information transmission and processing is increasing, and the supply and effective use of energy have gradually become an important topic. IoT contains many nodes distributed over a wide area. In many cases, the nodes cannot be powered by the grid, but only by a battery, and replacing or recharging the battery is costly. Harvesting energy from the environment is a cost-effective energy supply solution. Energy in the environment comes from a wide range of sources, such as solar energy, thermal energy, environmental noise, and radio frequency signals. 7 In recent years, a lot of research has been conducted on the application of energy harvesting (EH) technology in communication systems. The main research topics include energy sources in the environment, EH communication models, and energy usage protocols. Ku et al. 8 provided a comprehensive review of the research and application of EH technology in communication systems. In addition, Zhao et al. 9 provided a comprehensive survey of the research on the utilization of interference for wireless EH systems.
In a wireless communication system powered by harvested energy, the random variation of the harvested energy and the fading of the channels make the control of energy usage and transmission rate quite complicated. The power control algorithms in EH communication systems can be divided into two categories, offline and online, according to whether the energy arrivals, channel states, and data arrivals of the transmission process are available in advance. Offline control algorithms apply when this information is known in advance. Several works have studied offline power control algorithms under different system models. Tutuncuoglu and Yener 10 considered the discontinuous energy arrival model and presented an offline power control algorithm to maximize the short-term throughput, or minimize the transmission time of a certain amount of data. For the two-hop relay system in which the source node and the relay node are both powered by EH devices, Wu et al. 11 decomposed an offline optimization problem of maximizing the end-to-end throughput into two sub-problems: the selection of the forwarding relay and the control of the transmission power. The optimal solution was obtained by solving the convex optimization problem. Ozel et al. 12 aimed to maximize the transmission rate and minimize the transmission time for a point-to-point communication system. Under the premise that the EH process and the channel fading process were known, the water-filling algorithm was used to control the transmission power. In an actual system, the rate of EH, the data flow, and the channel fading all change randomly, and this information cannot be obtained in advance. Therefore, only online power control algorithms, not offline ones, can be used in an actual system.
The complexity of online algorithms is usually higher than that of offline algorithms. Modeling the power control process as a Markov decision process is a common method in online power control algorithms. For example, Sinha and Chaporkar 13 formulated the optimal transmission power control problem under a random fading channel as a Markov decision process. Under the condition that the statistical characteristics of system states such as energy, channel, and data arrival are available, dynamic programming was used to solve the optimization problem and maximize the average transmission rate. Online power control algorithms based on the system’s statistical characteristics generally have high complexity. Furthermore, since the statistical information of the EH, channel fading, and data arrival processes is required, it is difficult to apply this category of algorithm in practice. The Lyapunov framework 14 is a widely used control method in control engineering. It does not require statistical information about the system and makes decisions based on the current system state to optimize the long-term time-averaged performance of the system. The Lyapunov framework is also a powerful tool for solving the online power control problem in EH communication systems. The basic task of the Lyapunov optimization framework is to keep the queues and virtual queues stable in the long term. The constraints of the optimization are transformed into virtual queues. The optimization objective is added to the drift of the queues (including virtual queues) as a penalty term, and the original constrained optimization problem is converted into the minimization of the drift-plus-penalty function. Lyapunov optimization thus transforms the long-term time-averaged optimization problem into an instantaneous optimization problem, which simplifies it. Several works have studied the optimization of EH communication systems based on the Lyapunov framework. Qiu et al. 15 explored the long-term time-averaged throughput maximization problem of EH communication systems under the limitation of battery capacity and the requirement of bit error rate. Virtual queues for the energy and the bit error rate were constructed, and the transmission power and modulation scheme were jointly optimized. Amirnavaei and Dong 16 used the Lyapunov optimization framework to optimize the performance of an EH point-to-point communication system; the long-term time-averaged transmission rate was maximized by controlling the transmission power at each time slot (TS). Dong et al. 17 studied the online joint power control for a two-hop amplify-and-forward relay system. The relay is an EH node, and the harvested energy is used for forwarding. The Lyapunov framework was used to solve the joint online power control of the source node and the relay node to maximize the long-term time-averaged transmission rate.
In this article, we study the power control of a PLS transmission system. The system is composed of a source node, a friendly jamming node, and two destination nodes. The source node and the jamming node are powered by EH devices. Information sent to one destination node must be kept secret from the other destination node. All nodes are equipped with a single antenna. Without any prior information about the channel fading process and the EH process, an online algorithm that jointly controls the power of the source node and that of the jamming node is designed based on the Lyapunov framework. The Lyapunov optimization framework is used to transform the long-term time-averaged optimization problem into a single-TS optimization problem. Given only the current channel state and battery state, the transmission powers of the information signal and the AN are jointly controlled. Although the power control algorithm in this article is proposed for the single-antenna system, it can be easily extended to the multi-antenna scenario. The main contributions of this article are summarized as follows:
We formulate the power control problem of the signal and AN as an optimization problem that aims to maximize the long-term time-averaged secrecy rate under long-term constraints on the batteries. We then transform the battery constraints into stability requirements on virtual queues.
We use Lyapunov optimization framework to transform the long-term optimization problem into a per-TS minimization of the drift-plus-penalty function, and dynamically adjust the penalty weight to guarantee the fairness between the two destination nodes. However, the joint optimization of signal power and AN power is a non-convex problem. We use Karush-Kuhn-Tucker (KKT) condition to obtain all possible optimal power pairs and choose the pair that minimizes the drift-plus-penalty function to be the optimal power pair.
We evaluate the performance of our proposed power control algorithm via simulations which show that the secrecy rate can be increased by the assistance of the cooperative jamming node. Compared with the algorithm that does not optimize the power, the proposed algorithm can utilize the harvested energy efficiently and achieve a higher secrecy rate.
The remainder of this article is organized as follows: the second part introduces the system model, the third part presents the optimization problem, the fourth part solves the optimization problem using the Lyapunov framework, the fifth part evaluates the algorithm by simulation, and the sixth part concludes the article.
System model
The system model is shown in Figure 1. The system is composed of a source node S, a cooperative jamming node (jammer) J, and two destination nodes D1 and D2. Each node is equipped with a single antenna. The source node and the jammer are each equipped with an EH device and a rechargeable battery. The EH device harvests energy from the environment and converts it into electricity. The battery stores the harvested electricity for data transmission. During the transmission process, the arrival rate of the energy and the channel state change randomly. The source node sends information to the two destination nodes, and the information sent to one destination node is confidential to the other destination node. At each TS, the source node chooses a destination node and sends confidential information to it according to the states of all channels and the power levels of the batteries. To ensure the secrecy of the transmission, the jammer uses its harvested energy to send AN at the same time. The transmission power of the source node and that of the jammer are controlled jointly to obtain high energy efficiency.

Figure 1. System model.
EH and energy usage model
Assume that the capacity of the battery at the source node is Emax (J). The energy stored in the battery at the beginning of TS t is Esb(t) (J), with 0 ≤ Esb(t) ≤ Emax. Denote the energy harvested by the EH device from the environment in TS t as Esa(t) (J), and the electricity charged into the battery as Ess(t) (J), which satisfies Ess(t) ≤ Esa(t). The transmission power in TS t is Ps(t) (W), and the energy consumed in the TS is ΔtPs(t) (J), where Δt is the duration of one TS. Limited by the storage capacity of the battery, the electricity charged in one TS does not exceed Emax − (Esb(t) − ΔtPs(t)). In addition, due to the physical characteristics of the battery, the charging rate is limited, and the maximum electricity charged into the battery in one TS is Ec,max (J). Thus, the electricity charged into the battery can be written as
$$E_{ss}(t) = \min\{E_{\max} - (E_{sb}(t) - \Delta t\,P_s(t)),\; E_{sa}(t),\; E_{c,\max}\} \tag{1}$$
In the formula, the first term in the minimum operation is the limitation of battery capacity, the second term is the amount of the harvested energy, and the third term is the limitation of the charging rate.
Similarly, for the jammer, we assume that the capacity of the battery and the maximum energy charged into the battery in a TS are Emax and Ec,max (J), respectively. We denote the energy stored in the battery at the beginning of TS t as Ejb(t) (J), and the energy harvested by the EH device, the energy charged into the battery, and the transmission power in TS t as Eja(t), Ejs(t), and Pj(t), respectively. Ejs(t) is determined by
$$E_{js}(t) = \min\{E_{\max} - (E_{jb}(t) - \Delta t\,P_j(t)),\; E_{ja}(t),\; E_{c,\max}\} \tag{2}$$
The signal power Ps(t) and the AN power Pj(t) are constrained by the maximum discharge rate of the batteries Pmax, so they are bounded by
$$0 \le P_s(t) \le P_{\max}, \qquad 0 \le P_j(t) \le P_{\max} \tag{3}$$
The energy consumed in TS t cannot exceed the energy stored in the batteries at the beginning of the TS, that is
$$\Delta t\,P_s(t) \le E_{sb}(t), \qquad \Delta t\,P_j(t) \le E_{jb}(t) \tag{4}$$
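Thus, after the charging and discharging process in TS t, the battery levels at the beginning of the next TS can be written as
$$E_{sb}(t+1) = E_{sb}(t) - \Delta t\,P_s(t) + E_{ss}(t), \qquad E_{jb}(t+1) = E_{jb}(t) - \Delta t\,P_j(t) + E_{js}(t) \tag{5}$$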
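The charging and discharging rules of this subsection can be sketched in code. This is a minimal illustration, not the authors' implementation: the `Battery` class and the function names are hypothetical, and the charged energy is taken as the minimum of the free capacity after transmission, the harvested energy, and the per-slot charging limit, as described in the text.

```python
from dataclasses import dataclass


@dataclass
class Battery:
    level: float       # energy stored at the start of the slot (J)
    capacity: float    # Emax (J)
    max_charge: float  # Ec,max (J per slot)
    max_power: float   # Pmax (W)


def feasible_power(b: Battery, dt: float) -> float:
    """Largest power allowed by the discharge-rate and stored-energy limits."""
    return min(b.max_power, b.level / dt)


def step(b: Battery, power: float, harvested: float, dt: float) -> float:
    """Discharge, then charge, for one slot; return the energy actually charged."""
    if not 0.0 <= power <= feasible_power(b, dt) + 1e-12:
        raise ValueError("power violates the discharge or stored-energy limit")
    after_tx = b.level - dt * power
    # Charged energy: limited by free capacity, harvested energy, and charging rate.
    charged = min(b.capacity - after_tx, harvested, b.max_charge)
    b.level = after_tx + charged
    return charged
```

For example, a 5 J battery holding 4.5 J that transmits at 0.2 W for a 1 s slot has only 0.7 J of free capacity, so at most 0.7 J of a 2 J harvest is stored and the rest is discarded.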
Secrecy transmission model
The channels from the source node and the jamming node to the two destination nodes are time-varying fading channels. Their coefficients, denoted as hsd1(t), hsd2(t), hjd1(t), and hjd2(t), respectively, remain unchanged within a TS. The noise of each channel is additive white Gaussian noise (AWGN) with mean zero and variance
where i = 1 and
where [x]+ = max{0, x}. For the model in this article, the destination of the information transmission is determined by the states of all channels, and the battery levels of the source node and the jammer. Only the destination node with non-zero achievable secrecy rate can be taken as a legitimate receiver. Therefore, [x]+ in equation (7) can be omitted, and the achievable secrecy rate can be written as
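The rate expressions themselves are not reproduced in this text. The sketch below therefore rests on a loudly stated assumption: the AN is treated as additional noise at each destination, so the rate at a destination is log2(1 + Ps·g_sd / (σ² + Pj·g_jd)) with g = |h|², and the secrecy rate toward one destination is the positive part of the rate difference with the other. The function names and the unit noise variance are hypothetical.

```python
import math


def rate(ps, pj, g_sd, g_jd, noise_var=1.0):
    """Assumed Shannon rate (bit/s/Hz) at one destination; g_* are gains |h|^2."""
    return math.log2(1.0 + ps * g_sd / (noise_var + pj * g_jd))


def secrecy_rate(ps, pj, g_sd, g_jd, i, noise_var=1.0):
    """Assumed secrecy rate toward destination i (0 or 1); the other one eavesdrops."""
    k = 1 - i
    diff = (rate(ps, pj, g_sd[i], g_jd[i], noise_var)
            - rate(ps, pj, g_sd[k], g_jd[k], noise_var))
    return max(0.0, diff)  # [x]+ = max{0, x}
```

Under this assumed model, AN helps only when the jammer's channel to the unintended destination is stronger than its channel to the intended one: with equal source gains, `secrecy_rate(1.0, 0.0, (1.0, 1.0), (0.0, 2.0), 0)` is zero, while raising the AN power to 1 W makes it positive.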
Optimization algorithm based on Lyapunov framework
Optimization problem
For each TS, in order to utilize the harvested energy efficiently, it is necessary to choose the legitimate receiver and determine the power of the signal and the AN based on the current channel state and the power levels of the batteries at the source node and the jammer. The target of the power control strategy is to maximize the long-term time-averaged achievable secrecy rate under the constraints of the batteries’ performance and the amount of energy stored in the batteries. The optimization problem for all t is formulated as
where E[x] represents expectation operation.
Rewrite the first formula of equation (5) as
From TS 0 to TS T, it is easy to get
Taking the expectation of both sides of equation (11) and summing all the formulae, we obtain
Dividing both sides of the above equation by T and letting T → ∞, we can get
Due to the finite capacity of the battery, Esb(0) and Esb(T) are bounded, so the left-hand side of equation (13) is 0. Denoting
Equation (14) indicates that, from a long-term perspective, all energy harvested by the EH device of the source node should be used to transmit information, and all energy harvested by the EH device of the jammer is used to transmit AN.
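The displayed steps of this derivation are not reproduced in this text. A sketch consistent with the description, assuming the battery update E_sb(t+1) = E_sb(t) − Δt P_s(t) + E_ss(t) from the EH model and a summation over slots 0 to T − 1 (the exact indexing is an assumption), is:

```latex
\begin{align}
E_{sb}(t+1) - E_{sb}(t) &= E_{ss}(t) - \Delta t\,P_s(t), \\
\mathbb{E}[E_{sb}(T)] - \mathbb{E}[E_{sb}(0)]
  &= \sum_{t=0}^{T-1} \mathbb{E}\big[E_{ss}(t) - \Delta t\,P_s(t)\big], \\
\lim_{T\to\infty} \frac{\mathbb{E}[E_{sb}(T)] - \mathbb{E}[E_{sb}(0)]}{T}
  &= \lim_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1}
     \mathbb{E}\big[E_{ss}(t) - \Delta t\,P_s(t)\big] = 0 .
\end{align}
```

The last equality holds because both battery levels lie in [0, E_max], so the numerator on the left is bounded while T grows. The same argument applies to the jammer with E_jb, E_js, and P_j.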
Define the two virtual queues, respectively, for the power levels of the batteries of the source node and the jammer as
where δs and δj are constants. Keeping the two queues stable in the long term is equivalent to satisfying the constraint (14), that is, all harvested energy is eventually used. By adding an offset to the energy queue, the battery level fluctuates around the offset. By choosing appropriate constants, enough energy is stored in the batteries for the transmission while there is enough free capacity to store the harvested energy in each TS.
The dynamics of the virtual queues from one TS to the next TS can be formulated as
If the virtual queues are stable in the long term, that is
the constraint (14) is satisfied.
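The virtual-queue idea can be sketched in a few lines. The shifted form X(t) = E_b(t) − δ is an assumption inferred from the description of the offset (the battery level fluctuates around δ while the queue fluctuates around 0), and the stability notion used here, |X(T)|/T → 0, is likewise an assumption. All function names are hypothetical.

```python
def virtual_queue(level: float, offset: float) -> float:
    """Assumed shifted battery level X(t) = E_b(t) - delta."""
    return level - offset


def queue_update(x: float, power: float, charged: float, dt: float = 1.0) -> float:
    """X(t+1) = X(t) - dt*P(t) + E_s(t): the same dynamics as the battery itself."""
    return x - dt * power + charged


def queue_bound(capacity: float, offset: float) -> float:
    """Largest |X(t)| possible when 0 <= level <= capacity."""
    return max(offset, capacity - offset)


def stability_gap(x_end: float, horizon: int) -> float:
    """|X(T)| / T, which shrinks toward 0 as T grows because X(t) is bounded."""
    return abs(x_end) / horizon
```

Because the battery level is physically confined to [0, Emax], the queue is bounded by `queue_bound`, and the gap returned by `stability_gap` vanishes for a long horizon regardless of the control decisions; the role of the offset is to decide where the level settles.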
Due to the random variation of the amount of the harvested energy and the channel states, the virtual queues Xs(t) and Xj(t) will fluctuate up and down around 0, so the values of Xs(t) and Xj(t) can be either positive or negative. Replacing the constraint (5) in P1 with equation (17), we can convert the optimization problem into
Transformation of the optimization problem
To solve the above optimization problem, some traditional offline power control strategies, such as the water-filling algorithm, can be used, but the complete information of the channel states and the energy arrival amounts need to be known in advance. In an actual system, this information is not easy to get. We use Lyapunov framework to solve the optimization problem based on the current system states.
The optimization goal of power control is to maximize the long-term time-averaged secrecy rate while the power levels of the two batteries (i.e. the two energy virtual queues) are kept stable. The optimization variables are the transmission power of the source node and that of the jammer. The goal can be achieved by minimizing the drift-plus-penalty function of the Lyapunov optimization framework.
Let
Define the Lyapunov drift as
The smaller the drift is, the more stable the queues are. In order to maximize the secrecy rate while ensuring the stability of queues, the negative of the mean value of the secrecy rate is used as the penalty and we construct the drift-plus-penalty function as
Now optimization problem P2 has been transformed into the minimization of the drift-plus-penalty equation (21). V in equation (21) is the penalty weight, a positive constant used to trade off rate maximization against queue stability in the optimization. It is not easy to minimize the drift-plus-penalty directly. However, it has an upper bound as follows, and the minimization of the drift-plus-penalty can be replaced by the minimization of its upper bound.
Lemma 1
Equation (21) has an upper bound
where B is a constant not smaller than
Proof
Since Ess(t), Ejs(t), Ps(t), and Pj(t) are all finite and
In summary
and the drift-plus-penalty equation (21) has an upper bound as
The proof is completed.
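The displayed steps of the proof are not reproduced in this text. The following sketch shows how such a bound typically arises; it assumes the standard quadratic Lyapunov function and queue dynamics that mirror the battery update, both of which are assumptions rather than the article's stated definitions.

```latex
\begin{align}
L(t) &= \tfrac{1}{2}\big[X_s^2(t) + X_j^2(t)\big], \qquad
X_s(t+1) = X_s(t) + E_{ss}(t) - \Delta t\,P_s(t), \\
X_s^2(t+1) - X_s^2(t) &= 2X_s(t)\big[E_{ss}(t) - \Delta t\,P_s(t)\big]
   + \big[E_{ss}(t) - \Delta t\,P_s(t)\big]^2 , \\
\big[E_{ss}(t) - \Delta t\,P_s(t)\big]^2 &\le \max\{E_{c,\max},\, \Delta t\,P_{\max}\}^2
   \quad\text{since } 0 \le E_{ss}(t) \le E_{c,\max},\; 0 \le \Delta t\,P_s(t) \le \Delta t\,P_{\max}, \\
L(t+1) - L(t) &\le B + X_s(t)\big[E_{ss}(t) - \Delta t\,P_s(t)\big]
   + X_j(t)\big[E_{js}(t) - \Delta t\,P_j(t)\big],
   \qquad B = \max\{E_{c,\max},\, \Delta t\,P_{\max}\}^2 .
\end{align}
```

Taking the conditional expectation of the last line and subtracting V times the expected secrecy rate gives an upper bound of the drift-plus-penalty form; the same squared-difference bound applies to the jammer's queue, which is why one constant B covers both terms.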
Under the condition that the current states of the channels and the energy virtual queues are known, the minimization of equation (21) can be replaced by the per-TS minimization of its upper bound. By removing the expectation and the constant B from the upper bound in equation (22), an equivalent per-TS optimization problem of P2 can be obtained as
In the transmission process, due to the random variation of the channels and the uncertainty of the harvested energy, there may be a significant difference between the information throughputs of the two destination nodes, so fairness should be considered in the power control. The idea of maintaining fairness between the information transmissions to the two destination nodes is as follows. Assume that the average secrecy rates of destination nodes D1 and D2 over the TSs before TS t are
where
and
In the above formula, the positive constant U is the adjustment factor, and Vmax and Vmin are the maximum and minimum values of the weight, which prevent an extremely large or small weight after the adjustment and maintain the necessary balance between the stability of the battery levels and the maximization of the secrecy rate in the optimization.
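The per-TS decision can be illustrated with a small numerical sketch. Several things here are assumptions and not the article's method: the per-slot cost is taken as −Δt·(Xs·Ps + Xj·Pj) − V·R with the charging terms treated as independent of the decision, the rate model is the interference-as-noise form with unit noise variance, a brute-force grid search is swapped in for the analytical solution, and the rule that adjusts each weight before clamping is not reproduced (only the clamping to [Vmin, Vmax] is shown). All names are hypothetical.

```python
import math


def rate(ps, pj, g_sd, g_jd, noise_var=1.0):
    """Assumed rate with AN treated as extra noise; g_* are gains |h|^2."""
    return math.log2(1.0 + ps * g_sd / (noise_var + pj * g_jd))


def secrecy_rate(ps, pj, g_sd, g_jd, i, noise_var=1.0):
    k = 1 - i
    return max(0.0, rate(ps, pj, g_sd[i], g_jd[i], noise_var)
               - rate(ps, pj, g_sd[k], g_jd[k], noise_var))


def clamp(v, v_min, v_max):
    """Keep an adjusted weight inside [v_min, v_max]."""
    return max(v_min, min(v_max, v))


def per_slot_cost(ps, pj, xs, xj, v, g_sd, g_jd, i, dt=1.0):
    """Assumed per-slot objective: -dt*(Xs*Ps + Xj*Pj) - V*R_i."""
    return -dt * (xs * ps + xj * pj) - v * secrecy_rate(ps, pj, g_sd, g_jd, i)


def grid_minimize(xs, xj, weights, g_sd, g_jd, ps_max, pj_max, steps=50, dt=1.0):
    """Search destinations and a power grid; return (cost, dest, Ps, Pj)."""
    best = None
    for i in (0, 1):
        for a in range(steps + 1):
            ps = ps_max * a / steps
            for b in range(steps + 1):
                # No AN is sent when no signal is sent.
                pj = pj_max * b / steps if ps > 0 else 0.0
                cost = per_slot_cost(ps, pj, xs, xj, weights[i], g_sd, g_jd, i, dt)
                if best is None or cost < best[0]:
                    best = (cost, i, ps, pj)
    return best
```

With both queues below their offsets (negative X), spending energy is penalized, so the search settles on a moderate signal power and no AN when AN cannot help; with a positive source queue it spends up to the feasible maximum.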
Solution of the optimization problem
The achievable secrecy rate is jointly determined by the channel states hsd1(t), hsd2(t), hjd1(t), and hjd2(t); the signal power Ps(t); and AN power Pj(t). Under the premise that secrecy rate is positive, the source node can send secrecy data to destination node D1 or D2, which needs a further discussion.
Lemma 2
Denote
γsd1(t) > γsd2(t):
If γjd1(t) < γjd2(t): only the confidential data of D1 can be sent and
If γjd1(t) > γjd2(t): If In other cases, both the confidential data of D1 and D2 can be sent. When the data of D1 are sent, Pj(t) = 0; otherwise,
γsd1(t) < γsd2(t):
If γjd1(t) > γjd2(t): only the confidential data of D2 can be sent and
If γjd1(t) < γjd2(t): If In other cases, both the confidential data of D1 and D2 can be sent. When the data of D2 are sent, Pj(t) = 0; otherwise,
For Proof, see Appendix 1.
The optimization problem P3 includes two constraints on the transmission power of the source node and that of the jammer, which can be solved by the KKT condition. 18 Since the KKT condition is a necessary condition for the optimal solution of nonlinear programming, the solution must satisfy KKT condition. The optimal pair of the signal power and AN power must be one of the solutions which satisfy the KKT condition.
It can be seen from Lemma 2 that the power pair of the signal and AN must belong to one of the following three cases when transmitting confidential data of Di (i = 1, 2):
Case 1:
Case 2:
Case 3:
Now, we analyze the solution of the optimization, respectively, for the three cases.
Case 1
In Case 1, the optimization problem (24) can be rewritten as
where
Define Lagrangian function as
The KKT condition is
First, we analyze the solutions of Ps(t) which meet the KKT condition. (1) If λ1 = 0 and μ1 = 0, the solution of Ps(t) is the root of
Similarly, there are three possible solutions to Pj(t), which are the root of
By combining the three possible values of Ps(t) and Pj(t), there are nine possible solutions of power pair of the signal and AN which meet the KKT condition. When the power of signal Ps(t) is zero, it is not necessary to transmit AN. So, the two pairs of Ps(t) = 0 and Pj(t) ≠ 0 cannot be the solution of the optimization and are removed. The optimal pair of the signal power and AN power
Next, we give the expressions of
where
Taking the first derivative of
where
Case 2
In Case 2, using KKT condition, the optimal power pair
Case 3
In Case 3, using KKT condition, the optimal power pair
For each TS, all possible solutions of power pair are calculated based on the current system states first, then they are substituted into equation (30) and the corresponding values of
Based on the above optimization process, the optimization algorithm is given as follows.
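The algorithm listing is not reproduced in this text. The candidate-selection step described in this section (three possible values for each power, nine pairs, minus the two pairs with zero signal power and non-zero AN power) can nevertheless be sketched. The interior stationary values are treated as inputs supplied by the caller, since their closed-form expressions are not available here; the function names and the cost callback are hypothetical.

```python
from itertools import product


def candidate_pairs(ps_root, pj_root, ps_max, pj_max):
    """Enumerate (Ps, Pj) candidates: lower bound, interior root, upper bound."""
    ps_vals = [0.0, ps_root, ps_max]
    pj_vals = [0.0, pj_root, pj_max]
    pairs = []
    for ps, pj in product(ps_vals, pj_vals):
        if ps == 0.0 and pj != 0.0:
            continue  # no AN is needed when no signal is sent
        if not (0.0 <= ps <= ps_max and 0.0 <= pj <= pj_max):
            continue  # discard roots that fall outside the feasible box
        pairs.append((ps, pj))
    return pairs


def best_pair(cost, pairs):
    """Return the candidate that minimizes the supplied per-slot cost function."""
    return min(pairs, key=lambda p: cost(*p))
```

When both interior values are feasible and distinct from the bounds, `candidate_pairs` returns seven pairs; `best_pair` then evaluates the drift-plus-penalty cost on each and keeps the smallest, which mirrors the substitute-and-compare step described above.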
Approach extended to multi-antenna scenario
The approach proposed in this section can be easily extended to the multi-antenna scenario. The secrecy rate of the system depends on the PLS scheme, the channel state, the signal power, and the AN power. In the multi-antenna scenario, the secrecy rate (equation (8)) in section “System model” needs to be replaced by the formula for the achievable secrecy rate of the PLS scheme adopted in the transmission (such as equation (4) in the work of Cumanan et al. 19 ). After a derivation similar to that in this section, the optimal solution for the signal power and AN power can be obtained.
Performance analysis
Using the Lyapunov optimization framework, the long-term time-averaged secrecy rate maximization problem P1 has been transformed into the per-TS optimization problem P3. If the per-TS optimization is achieved, the long-term time-averaged optimization is also approximately achieved. 13 The average achievable secrecy rate of the algorithm proposed in this article and the maximum possible secrecy rate of P1 have the following relationship
where
The proof of equation (32) is directly done using the method in Sinha and Chaporkar, 13 so we omit it here.
It can be seen from equations (25) and (26) that
Simulation results
This section verifies the performance of the proposed algorithm by simulation. Unless otherwise specified, the parameters in the simulation are set as follows. The energy arrival amounts per TS, Esa(t) and Eja(t), at the source node and the cooperative jammer follow a compound Poisson process whose unit energy is uniformly distributed. The energy arrival rate at the source node is λs = 2.5 unit/slot and that at the jammer is λj = 0.5 unit/slot. The amount of energy per unit is uniformly distributed in [0, 0.4] J with a mean of 0.2 J/unit. The battery capacity of both nodes is Emax = 5 J. The maximum charging amount per TS of the batteries is Ec,max = 1 J, and the maximum discharge rate is Pmax = 1 W. All channel coefficients follow the complex Gaussian distribution with a mean of 0 and unit variance. The channel coefficients remain unchanged during one TS and change randomly from one TS to the next. The variance of channel noise is
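The random inputs described above can be generated with the standard library alone. This is a sketch under stated assumptions: the Poisson count is drawn with Knuth's multiplication method (adequate for the small rates used here), the per-slot energy is the sum of that many uniform units, and the channel power gain is |h|² for a circularly symmetric complex Gaussian h with unit variance. The function names and the seeding are hypothetical.

```python
import math
import random


def poisson(lam: float, rng: random.Random) -> int:
    """Draw a Poisson(lam) count with Knuth's method; fine for small rates."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def harvested_energy(lam: float, rng: random.Random, unit_max: float = 0.4) -> float:
    """Compound Poisson arrival: a Poisson number of units, each uniform on [0, unit_max] J."""
    return sum(rng.uniform(0.0, unit_max) for _ in range(poisson(lam, rng)))


def channel_gain(rng: random.Random) -> float:
    """|h|^2 for h ~ CN(0, 1): real and imaginary parts each have variance 1/2."""
    s = math.sqrt(0.5)
    h = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
    return abs(h) ** 2
```

With λs = 2.5 unit/slot and a mean of 0.2 J/unit, the source harvests 0.5 J per slot on average, and the jammer, with λj = 0.5 unit/slot, harvests 0.1 J per slot on average.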
Performance comparison with baseline algorithms
To evaluate the performance of the proposed algorithm, we compare it with the following three algorithms:
Half-power algorithm (HPA): the system model is the same as that in Figure 1. In each TS, the source node sends confidential information to the destination node that can achieve the higher secrecy rate. The source node uses half of the energy stored in its battery to transmit the signal, and the jammer uses half of the energy stored in its battery to transmit AN when AN helps to improve the secrecy rate. The signal power and AN power are
Greedy algorithm (GA): the system model is the same as that in Figure 1. The algorithm is basically the same as HPA, except that the signal power and AN power are the maximum values supported by the energy stored in the batteries of the source node and the jammer, that is,
The algorithm proposed by Lei and Wang. 20 The system model considered in this algorithm is similar to that in this article, but there is no cooperative jammer. The Lyapunov optimization method is used to control the transmission power of the source node for the maximization of the secrecy rate. In each TS, the source node transmits the secrecy information to the destination node with better channel quality and optimizes the transmission power using Lyapunov optimization framework. We refer to this algorithm as without-jammer algorithm (WJA).
In the simulation of the comparison algorithms, the energy arrival, channel characteristics, and the features of the batteries are the same as those in the simulation of the proposed algorithm.
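The power rules of the two energy-driven baselines can be sketched as follows. The expressions for HPA and GA are not reproduced in this text, so the sketch assumes that the power spends the stated fraction of the stored energy within one slot and is capped by the maximum discharge rate Pmax; the function names and default values are hypothetical.

```python
def baseline_power(level: float, fraction: float, dt: float = 1.0, p_max: float = 1.0) -> float:
    """Power that spends `fraction` of the stored energy in one slot, capped at p_max."""
    return min(p_max, fraction * level / dt)


def hpa_power(level: float, dt: float = 1.0, p_max: float = 1.0) -> float:
    """Half-power rule: use half of the stored energy (assumed cap at p_max)."""
    return baseline_power(level, 0.5, dt, p_max)


def ga_power(level: float, dt: float = 1.0, p_max: float = 1.0) -> float:
    """Greedy rule: use all of the stored energy (assumed cap at p_max)."""
    return baseline_power(level, 1.0, dt, p_max)
```

Neither rule looks at the channel state, which is the property the comparison is meant to expose: both spend energy at the same rate in good and bad slots.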
Figure 2 shows the simulation results of the time-averaged achievable secrecy rate versus TS t. The time-averaged secrecy rate at each TS is the average of the secrecy rate from the beginning of the simulation to the current TS. It can be clearly seen that the proposed algorithm has a significant advantage over the other three algorithms. HPA and GA do not perform any optimization according to the channel states and the power levels of the batteries. All energy stored in the batteries is consumed in GA, and its performance is the worst among the three algorithms for the system with a jammer. HPA uses half of the energy stored in the batteries in the current TS and reserves the remaining energy for the future, so its performance is superior to that of GA. Compared with WJA, an extra jammer sends AN in the system of the proposed algorithm. The simulation results show that the achievable secrecy rate can be significantly improved by carefully controlling the power of AN.

Figure 2. Time-averaged secrecy rate versus TS.
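The time-averaged secrecy rate plotted against the TS index is a running mean of the per-slot rates. A minimal sketch, with a hypothetical function name:

```python
from itertools import accumulate


def running_average(rates):
    """Average of the per-slot rates from the first slot up to each slot."""
    return [total / (n + 1) for n, total in enumerate(accumulate(rates))]
```

For instance, per-slot rates of 2, 0, and 1 bit/s/Hz give running averages of 2, 1, and 1, which shows how a single slot without transmission pulls the early part of the curve down while having little effect later in a long run.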
Figure 3 shows the time trajectory of the power levels of the batteries for the four algorithms. At the beginning of the simulation, all batteries are fully charged. Figure 3 shows that although the energy is replenished during the transmission period, the energy stored in the two batteries in HPA and GA drops to a very low level (lower in GA) in a short time and then remains at this level, while that in the other two algorithms is maintained around a middle level (there is no jammer in WJA).

Figure 3. Time trajectory of the power levels of the batteries: (a) source node’s battery and (b) jammer’s battery.
Effects of the system parameters
In this section, we evaluate the effects of the system parameters on the performance of the proposed algorithm. The simulation period is T = 10^5 TS.
Figure 4 shows the effects of the energy arrival rates λs and λj on the long-term time-averaged secrecy rate. As the energy arrival rate λs at the source node increases, the secrecy rate increases. This is because the more energy arrives per TS, the more energy is available to transmit data, and the higher the achievable transmission rate. As the arrival rate λj at the jammer increases, the secrecy rate first increases and then levels off. When the energy arrival rate is small, the average power of AN increases with λj, which helps to improve the secrecy rate. As the analysis in section “Solution of the optimization problem” shows, in many cases there is no need to transmit AN, or the optimal power of AN is small. So, when the energy arrival rate λj is large enough, the secrecy rate no longer increases with λj.

Figure 4. Long-term time-averaged secrecy rate versus energy arrival rates.
Figure 5 shows the effects of the offsets of the energy virtual queues, δs and δj, on the system performance. The offsets keep the power level of each battery fluctuating around a certain level, so that the system can accommodate random variations in EH and channel quality. With suitable offsets, in each TS the battery stores enough energy to support the transmission of information or AN while leaving enough free space to store the harvested energy. Figure 5(a) shows that as the offset δs at the source node increases, the average secrecy rate first increases and then decreases slightly. When δs is small, the average amount of energy reserved in the source node’s battery is also small. According to the water-filling theory, the better the channel quality is, the higher the transmission power should be. Because the stored energy is insufficient, a high transmission power cannot be supported and the capacity of the channel is not fully utilized, so the average secrecy rate is low. As δs increases, the energy stored in the battery increases, which raises the secrecy rate. Since the available energy is also limited by the amount of harvested energy, the average secrecy rate does not keep increasing with δs. Moreover, a larger δs reduces the remaining capacity of the battery and makes it more likely that the harvested energy exceeds that remaining capacity (i.e. energy overflow). When energy overflow occurs, some harvested energy is discarded, which degrades the system performance. As a result, the average secrecy rate decreases slightly with δs once δs exceeds 4.6. The secrecy rate increases only slightly as δj increases, because AN consumes less energy than the signal, so changes in the energy stored in the jammer’s battery have little impact on the system performance. Figure 5(b) and (c) show that the time-averaged energy stored in the batteries of the source node and the jammer stays near the corresponding offsets.
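The overflow effect described above can be illustrated with a minimal sketch. It is not the proposed algorithm: the power control is replaced by a simple stand-in policy that spends whatever is stored above the offset, the harvested energy per TS is assumed to be exponentially distributed, and the function name `simulate_overflow`, the battery capacity, and the arrival rate are hypothetical values chosen only for illustration.

```python
import random


def simulate_overflow(delta, capacity=10.0, arrival_rate=1.0,
                      slots=20000, seed=0):
    """Return (mean battery level, fraction of harvested energy lost).

    delta        -- offset around which the battery level is held
    capacity     -- battery capacity (hypothetical value)
    arrival_rate -- mean energy harvested per TS (assumed exponential)
    """
    rng = random.Random(seed)
    level = delta                      # start at the target level
    harvested = lost = level_sum = 0.0
    for _ in range(slots):
        # Stand-in policy (assumption): spend everything above the offset.
        level -= max(level - delta, 0.0)
        # Assumed arrival model: exponential with mean arrival_rate.
        h = rng.expovariate(1.0 / arrival_rate)
        harvested += h
        # The battery cannot store more than its remaining capacity.
        stored = min(h, capacity - level)
        lost += h - stored             # energy discarded by overflow
        level += stored
        level_sum += level
    return level_sum / slots, lost / harvested


if __name__ == "__main__":
    for delta in (2.0, 5.0, 8.0):
        mean_level, lost_fraction = simulate_overflow(delta)
        print(f"offset={delta:.1f}  mean level={mean_level:.2f}  "
              f"lost fraction={lost_fraction:.4f}")
```

Under these assumptions, the mean battery level stays slightly above the offset, and the fraction of harvested energy lost to overflow grows as the offset approaches the battery capacity, which is consistent with the slight decrease of the secrecy rate at large δs.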

Effect of energy virtual queue offsets δs and δj on system performance: (a) long-term time-averaged secrecy rate, (b) long-term time-averaged power level of the source node’s battery, and (c) long-term time-averaged power level of the jammer’s battery.
Figure 6 shows the effects of the weights U and V on the system performance. Figure 6(a) gives the effect on the long-term time-averaged secrecy rate. Figure 6(b) and (c) give the effects on the standard deviations of the power levels of the two batteries, that is,

Effect of weights U and V on system performance: (a) long-term time-averaged secrecy rate, (b) standard deviations of the power level of the source node, (c) standard deviations of the power level of the jammer, and (d) normalized RMS of the difference between the secrecy rates of the two destination nodes.
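Figure 7 shows the effects of the upper and lower limits of the weight, Vmax and Vmin, on the system performance. Figure 7(a) gives the effect on the long-term time-averaged secrecy rate, and Figure 7(b) gives the effect on the normalized RMS of the difference between the secrecy rates of the two destination nodes.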

Effect of Vmax and Vmin on system performance: (a) long-term time-averaged secrecy rate and (b) normalized RMS of the difference between the secrecy rates of the two destination nodes.
Conclusion
This article studies the power control problem in an EH wireless communication system with a cooperative jamming node. The system model includes an EH source node, an EH jamming node, and two destination nodes. The information sent to one destination node is kept secret from the other destination node. In each TS, based on the channel states and battery states, the source node selects as the legitimate receiver the destination node that achieves the lower value of the drift-plus-penalty function and sends the confidential information to it. To utilize the harvested energy efficiently, the signal power and the AN power are jointly controlled to minimize the drift-plus-penalty function. The original optimization problem is transformed into a per-TS optimization problem using the Lyapunov optimization framework, so that the power control decision depends only on the current system states. The constraint on the battery operation is converted into a stability constraint on the energy virtual queues. In addition, the fairness between the transmissions of the two destination nodes’ information is considered in the optimization process. The optimal power pair of the signal and AN is analyzed for the different cases, and the solutions are given. The simulation results show that the proposed algorithm utilizes the harvested energy efficiently and achieves a higher averaged secrecy rate than the two comparison algorithms.
Footnotes
Appendix 1
Appendix 2
Appendix 3
Handling Editor: Bo Rong
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China under grant nos 61971080 and 61471076; the Chongqing Research Program of Basic Research and Frontier Exploration under grant no. cstc2018jcyjAX0432; and the Key Project of Science and Technology Research of Chongqing Education Commission under grant nos KJZD-K201800603 and KJZD-M201900602.
