
Abstract

With the number of Internet of Things devices continually increasing, the endogenous security of Internet of Things communication systems is increasingly critical. Physical layer authentication is a powerful means of resisting active attacks by exploiting the unique characteristics inherent in wireless signals and physical devices. Many existing physical layer authentication schemes assume that physical layer attributes obey certain statistical distributions, which are usually unknown to receivers. To overcome this uncertainty, machine learning–based authentication approaches have been employed to implement threshold-free authentication. In this article, we utilize an expectation–conditional maximization algorithm to provide the physical layer attribute estimates required for the authentication phase and a logistic regression model to achieve threshold-free physical layer authentication. Moreover, a Frank–Wolfe algorithm is used to achieve fast convergence of the logistic regression parameters, and multiple attributes are adopted to increase the differentiation of transmitters. Simulation results demonstrate that the obtained attribute estimates provide a reliable source of data for authentication and that the proposed threshold-free multi-attribute physical layer authentication scheme can effectively improve authentication accuracy, with the false alarm rate P_f reduced to 0.0263% and the miss detection rate P_m reduced to 0.3466%.

Keywords

Physical layer authentication, expectation–conditional maximization algorithm, multi-attributes, logistic regression, Internet of Things

Introduction

With the innovation of wireless access technologies, the complex heterogeneity of network access architectures, and the proliferation of Internet of Things (IoT) devices, the security risks of IoT communications are increasing. In an open and time-varying wireless channel environment, communications become transparent and unstable, which makes information and data transmission more susceptible to eavesdropping, tampering, and forgery.¹ Nowadays, network security has evolved to the era of endogenous security, which requires the continuous growth of self-adaptive and autonomous security capabilities within IoT communication systems. Therefore, effective responses to wireless communication security issues are urgent and critical.

Authentication of the signaling entity is one of the core technologies for securing wireless communications. Traditional key-based high-level authentication techniques have been well researched and widely used over the last few decades. However, the explosive growth in the number of access devices complicates the distribution and management of keys for high-level authentication. The high transmission overhead of high-level authentication results in high latency, making it difficult to adapt to latency-sensitive industrial IoT communication systems.²,³ Accordingly, physical layer authentication, which uses the physical layer attributes of the signal source to verify whether the transmitting entity is legitimate, is receiving increasing attention.

In this article, we propose a physical layer authentication scheme that combines multi-antenna technology with multiple attributes as authentication fingerprints to improve reliability. In the channel estimation phase, we use semi-blind estimation based on the expectation–conditional maximization (ECM) algorithm, assisted by a few pilots, so that the pilots do not take up too much bandwidth while the complexity of the estimation algorithm is kept in check. Physical layer attributes are time-varying and random due to the environment and equipment, making them difficult for attackers to imitate, especially when multiple attributes must be imitated simultaneously. Thus, we combine received signal strength indicator (RSSI), channel impulse response (CIR), and carrier frequency offset (CFO) as authentication fingerprints to enhance authentication performance. An attacker can launch a spoofing attack by imitating one attribute, but imitating several attributes at once is challenging, so combining multiple attributes as authentication fingerprints reduces the communication security risk if a single attribute fails. To avoid making the physical layer authentication system depend on a statistical channel model, we use a logistic regression model from machine learning (ML) to design the threshold-free authentication process. Logistic regression is a parametric learning method that requires less training data than non-parametric learning methods and can be regularized to avoid overfitting. In the convex optimization problem constructed with the loss function of logistic regression, we use the Frank–Wolfe (FW) algorithm to find the optimal solution of the parameters with a fast convergence rate. Moreover, we use multi-antenna technology to further improve the reliability of authentication.

The major contributions of this work can be summarized as follows:

We obtain multiple physical layer attributes, including RSSI, CIR, and CFO, from the received signals, where an ECM algorithm aided by a few pilots is designed to get the values of CIR and CFO.

Multiple attributes are combined as authentication fingerprints and multi-antenna technology is utilized to improve authentication accuracy. The different contribution of each attribute to the authentication decision, which depends on its stability, is also considered.

A logistic regression model is used to achieve threshold-free physical layer authentication, and the FW algorithm is adopted to find the optimal solution of the parameters with a fast convergence rate.

Related works

Some physical layer authentication schemes are based on watermarking. The transmitter uses a hash function to fuse a shared key with the signal to generate a tag, which arrives at the receiver together with the transmitted signal and is authenticated there. Based on two attack scenarios, with or without user–attacker complicity, several different tag overlay schemes for physical layer authentication in the Non-Orthogonal Multiple Access (NOMA) system are presented in Xie et al.⁴ Optimal certified tag embedding and optimized power distribution between signal and tag are designed in Gu et al.⁵ Xie and Chen⁶ have developed a slope authentication that divides the transmit signal into two equal groups based on the secret key, with labels used to mark the time index of each group. These schemes are based on shared private keys and use hash function encryption and signal processing techniques for physical layer authentication, so they require complex signal preprocessing and perfect secrecy of the shared private key.

In contrast to these schemes, channel characteristics and device attributes–based physical layer authentication schemes are also studied. Physical layer authentication techniques utilize characteristics carried by the signal during transmission about the transceiver and the channel, for example, RSSI, CIR, CFO, and so on; no additional authentication information needs to be transmitted. These characteristics are device- and environment-dependent, unpredictable, and difficult to imitate. Moreover, instantaneous measurements of the device and channel by the receiver can monitor the temporal variation of the authentication information, which is equivalent to providing a natural refreshing mechanism.⁷

In the schemes proposed in Hou et al.,⁸,⁹ the time-varying CFO is used as a radio frequency (RF) feature for physical layer authentication. Hypothesis testing is established to verify that the estimated value of the CFO is consistent with the predicted value for the authentication decision. The CIR is quantized by Liu and colleagues¹⁰,¹¹ in the magnitude dimension and the multipath delay dimension to simplify the decision rules for authentication, and then, based on the output of the quantizer, hypothesis testing is used to achieve physical layer authentication. The RSSI of the signals measured by multiple landmarks is used as the verification input to the authentication system of Xiao et al.¹² to detect spoofing attacks in a wireless network. The quantized channel gain and phase noise are combined to implement threshold-based physical layer authentication with enhanced performance in Zhang et al.,¹³ which demonstrates that the combination of two attributes provides a higher accuracy than using a single attribute as the basis for authentication. However, the optimal values of the quantization thresholds and authentication decision thresholds in these methods are found by exhaustive search, which is computationally inefficient and does not adapt to changing conditions.

Pan et al.¹⁴ conduct extensive simulation experiments and field experiments to verify the performance of different ML algorithms for threshold-free authentication, where the channel difference matrix or the channel state matrix is used as the input of each ML algorithm. However, the authentication accuracy of these experiments is still below 90%, which means that many attackers cannot be identified. Authentication methods based on kernel machines fusing multiple attributes to achieve reliable authentication in time-varying environments are presented in Fang et al.¹ The authors of this article also propose hierarchical authentication and progressive authorization with high complexity in Fang et al.¹⁵ This strategy is used to resist the risks associated with missed detections with one-time binary authentication. Fang et al.¹⁶ suggest selecting those physical layer attributes that have historically performed well for authentication as the basis for current authentication. However, it is also necessary to select the appropriate ML algorithm and carefully design the authentication process to improve authentication accuracy. The scheme based on ML to achieve threshold-free physical layer authentication with multiple landmarks is proposed in Xiao et al.,¹² which can reach a higher authentication accuracy and reduce overhead but requires the deployment of a large number of peripheral devices. A comparison of statistical-based and ML-based physical layer authentication methods for different channel correlation coefficients is presented in Senigagliesi et al.¹⁷ However, it is usually difficult for the receiver to be informed of the channel correlation coefficient between the attacker and the legitimate transmitter.

Only a few of the above physical layer authentication methods mention the issue of channel estimation prior to the authentication phase, and those that do implement it using pilots or training sequences. The problem of frequency synchronization, channel estimation, and data detection for all active users in the uplink of an orthogonal frequency division multiple access (OFDMA) system has been intensively studied in Wang and Liew,¹⁸ Chen et al.,¹⁹ and Pun et al.²⁰ Since exact maximum likelihood estimation is complex in practical scenarios, an alternative scheme operating in an iterative manner, where each user’s signal is processed using the ECM algorithm, has been proposed. This approach to channel estimation is well suited to obtaining physical layer attributes as authentication fingerprints.

Multi-attributes threshold-free physical layer authentication schemes with higher authentication accuracy combined with receiver-side channel estimation are still to be investigated. Many existing physical layer authentication schemes are based on a statistical approach. This approach requires the assumption that physical layer attributes obey a specific distribution. However, in reality, receivers usually do not know the distribution model of physical layer attributes for wireless communication. ML-based authentication approaches can achieve model independence and overcome the difficulty of modeling the uncertainty and unknown dynamics of the authentication process.³ While the aid of peripheral devices is beneficial in improving authentication accuracy, it sacrifices system deployment costs. Imperfect estimation and time-varying characteristics of physical layer attributes in wireless communication systems are inevitable, which affect the accuracy of physical layer authentication. In addition, ML-based authentication techniques can enhance authentication performance by analyzing and fusing multiple physical layer attributes.

System model

This section describes the system model for physical layer authentication, formulas for the received signals, and the calculation formula of path loss.

In this article, we consider physical layer authentication (PLA) in a single-input multiple-output (SIMO) system, where the universal Alice–Bob–Eve model is employed. Alice and Eve are transmitters equipped with one antenna each and Bob is the receiver equipped with $Nr$ antennas. Specifically, Alice is the transmitter with legitimate status recorded in the source list of Bob, while Eve is the attacker who wants to spoof Bob by imitating Alice to gain illegal benefits. As depicted in Figure 1, Eve imitates Alice to send signals to Bob by using the same MAC address as Alice. Once Eve successfully spoofs Bob to gain a legitimate status, it can damage the wireless communication system by transmitting malicious signals or passively eavesdropping. To resist this spoofing attack, Bob employs physical layer authentication to identify whether the current transmitter is the legitimate transmitter Alice or the attacker Eve.

Figure 1. System model.

To enhance the reliability of PLA at Bob, multiple attributes, including RSSI, CIR, and CFO, are used as authentication fingerprints in a combination method. To be specific, signals from different transmitters experience different path loss and multipath fading, which results in differences in the signal received power and channel impulse response, so RSSI and CIR can be exploited as the authentication fingerprints of the signal transmit entity. The CIR from Alice to Bob is illustrated as $h_{A}$ and the CIR from Eve to Bob is $h_{E}$ . Moreover, CFO is an essential wireless device characteristic consisting of two parts. One part is the inherent frequency difference caused by the device hardware during the production process. The other part is the random variable caused by the free motion related to time and environment. We use a normalized CFO, that is, $ϵ = \frac{Δ f}{f_{s}}$ , where $Δ f$ is the frequency difference between transmitter and receiver and $f_{s}$ is the sampling frequency. The CFO varies between each transceiver pair, so it can be used as a unique fingerprint for authentication.

It is assumed that the initial secure transmission has been established between Alice and Bob before Eve arrives so that Bob can collect the physical layer attributes of the legitimate transmitter. The initial transmission can be implemented by existing authentication methods or physical measures (such as manual setup during initial communication).¹¹,¹⁷ Bob keeps a record of Alice’s authentication information, that is, a record of physical layer attributes for the previous $M$ moments. Assume that Eve has been listening to the channel between Alice and Bob and takes the opportunity to send spoofing signals when the channel is free to avoid packet loss due to channel congestion. Bob first extracts information about the physical layer attributes of the transmitter from the received signal and then uses them as fingerprints in the subsequent authentication process. Assume that the signal received by the $r^{th}$ antenna at Bob is

y_{r} = Γ (ϵ_{r}) F^{H} XW h_{r} + n_{r}

(1)

where $y_{r}$ is the complex signal vector received by the $r^{th}$ antenna; $Γ (ϵ_{r})$ is the diagonal matrix containing the CFO between Bob and Alice (or Eve), with $Γ (ϵ_{r}) = diag \{1, e^{j 2 π ϵ_{r} / N}, \dots, e^{j 2 π (N - 1) ϵ_{r} / N}\}$ and $ϵ_{r} \in \{ϵ_{A}, ϵ_{E}\}$ ; $F$ is the N-point discrete Fourier transform (DFT) matrix, whose $[p, q]$th element is $[F]_{[p, q]} = \frac{1}{\sqrt{N}} e^{- j 2 π pq / N}$ ; $X$ is the diagonal matrix of frequency-domain symbols sent by Alice (or Eve); $W$ is the partial scaling matrix of $F$ ; and $h_{r}$ reflects both the large-scale and small-scale propagation effects. The former refers to the path loss $P L_{r} (d)$ , which is calculated as follows according to the log-normal shadow fading model in Goldsmith²¹

P L_{r} (d) [dB] = 20 \log_{10} \left( \frac{4 π d_{0}}{λ} \right) + 10 ξ \log_{10} \left( \frac{d}{d_{0}} \right) + S_{r}

(2)

where $d_{0}$ is a reference distance; $λ$ is the wavelength of the emitted signal; $d$ is the distance between the transmitter and the receiver; $ξ$ is the path loss exponent determined by the propagation environment; and $S_{r}$ denotes a Gaussian random variable with zero mean and variance $σ_{s}$ . The small-scale propagation effect refers to multipath fading, represented by a finite impulse response (FIR) with $L_{h}$ channel taps, which is assumed to be free of timing errors and constant within an orthogonal frequency division multiplexing (OFDM) symbol. Thus, $h_{r} = [\sqrt{\frac{1}{10^{(P L_{r} / 10)}}} h_{r} (1), \dots, \sqrt{\frac{1}{10^{(P L_{r} / 10)}}} h_{r} (L_{h} - 1)]^{T}$ . Moreover, $n_{r}$ is independent and identically distributed zero-mean complex additive white Gaussian noise (AWGN) with variance $σ_{n}^{2}$ .
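As an illustration, the path loss in equation (2) and the resulting amplitude factor applied to each channel tap can be sketched in Python. This is a minimal sketch rather than the simulation code of this article: the function names, the 2.4 GHz carrier, and the distances are hypothetical, and the shadowing parameter is treated as a standard deviation for sampling purposes.

```python
import math
import random

def path_loss_db(d, d0, wavelength, xi, sigma_s=0.0, rng=None):
    """Equation (2): log-normal shadowing path loss in dB."""
    # Loss at the reference distance d0
    ref_loss = 20.0 * math.log10(4.0 * math.pi * d0 / wavelength)
    # Distance-dependent term with path loss exponent xi
    dist_loss = 10.0 * xi * math.log10(d / d0)
    # Zero-mean Gaussian shadowing S_r; sigma_s is used as a standard
    # deviation here, which is an assumption of this sketch
    shadow = rng.gauss(0.0, sigma_s) if (rng is not None and sigma_s > 0) else 0.0
    return ref_loss + dist_loss + shadow

def tap_scale(pl_db):
    """Amplitude factor sqrt(1 / 10^(PL/10)) applied to every tap of h_r."""
    return math.sqrt(1.0 / 10.0 ** (pl_db / 10.0))

wavelength = 3e8 / 2.4e9  # hypothetical 2.4 GHz carrier
pl_ref = path_loss_db(d=1.0, d0=1.0, wavelength=wavelength, xi=3.0)
pl_far = path_loss_db(d=10.0, d0=1.0, wavelength=wavelength, xi=3.0)
pl_shadowed = path_loss_db(10.0, 1.0, wavelength, 3.0,
                           sigma_s=4.0, rng=random.Random(1))
```

Without shadowing, moving from $d_{0}$ to $10 d_{0}$ with $ξ = 3$ adds exactly 30 dB, which is a quick sanity check on the distance term.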

Physical layer authentication algorithm

In this section, we first discuss how to obtain the physical attributes from the received signal by using an estimation algorithm, then design a logistic regression model to make the authentication decision based on the estimation results, and give performance metrics for evaluating authentication decisions.

Physical attribute estimation

RSSI calculation

RSSI can be calculated as follows

RSSI (d) [dB] = 10 \log_{10} P_{r} (d)

(3)

where $P_{r} (d) = {‖ y ‖}^{2}$ represents the power of the received signal and $‖ \cdot ‖$ is the Frobenius norm. According to the expression for $y_{r}$ , RSSI is affected by both the path loss and the AWGN, although the effect of noise is relatively small.
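The received signal model of equation (1) and the RSSI of equation (3) can be illustrated together with NumPy. This is a sketch under stated assumptions: $W$ is taken to be $\sqrt{N}$ times the first $L_{h}$ columns of $F$ (one common reading of "partial scaling matrix"), the transmitted symbols have unit modulus, and all names and numerical values are hypothetical.

```python
import numpy as np

def dft_matrix(N):
    """Unitary N-point DFT matrix, [F]_{p,q} = exp(-j 2 pi p q / N) / sqrt(N)."""
    n = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)

def cfo_matrix(eps, N):
    """Gamma(eps) = diag{1, e^{j 2 pi eps / N}, ..., e^{j 2 pi (N-1) eps / N}}."""
    return np.diag(np.exp(2j * np.pi * eps * np.arange(N) / N))

def received_signal(x_freq, h, eps, noise_std=0.0, rng=None):
    """Equation (1): y_r = Gamma(eps) F^H X W h_r + n_r."""
    N, L = len(x_freq), len(h)
    F = dft_matrix(N)
    W = np.sqrt(N) * F[:, :L]  # assumed form of W (not given explicitly)
    y = cfo_matrix(eps, N) @ F.conj().T @ np.diag(x_freq) @ W @ h
    if noise_std > 0:
        if rng is None:
            rng = np.random.default_rng()
        noise = rng.standard_normal(N) + 1j * rng.standard_normal(N)
        y = y + noise_std * noise / np.sqrt(2)
    return y

def rssi_db(y):
    """Equation (3): RSSI in dB from the received power ||y||^2."""
    return 10.0 * np.log10(np.linalg.norm(y) ** 2)

N = 64
x_freq = np.exp(-1j * np.pi * np.arange(N) ** 2 / N)  # unit-modulus symbols
h = np.array([0.8, 0.5 + 0.2j, 0.1j])                 # hypothetical 3-tap CIR
y = received_signal(x_freq, h, eps=0.1)               # noise-free example
rssi = rssi_db(y)
```

Because $Γ$ and $F^{H}$ are unitary and the symbols have unit modulus, the noise-free received power under these assumptions equals $N ‖ h ‖^{2}$ regardless of the CFO, which is why RSSI mainly reflects path loss.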

CIR and CFO estimation

The ECM algorithm is employed to estimate the CIR and CFO, that is, $h_{r}$ and $ϵ_{r}$ . ECM is a variant of the expectation–maximization (EM) algorithm that is usually used to estimate hidden parameters from incomplete available data and includes E-step and M-step.

Although channel estimation based on the EM algorithm has many advantages, its computational complexity increases exponentially with the number of transmitted signals. Therefore, a variant of the EM algorithm, the ECM algorithm, is exploited to iteratively estimate the parameters $h_{r}$ and $ϵ_{r}$ , considering both algorithm complexity and frequency band utilization.

The ECM algorithm is an iterative optimization strategy. In this estimation task, the observed data is $y_{r}$ , the hidden data is $h_{r}$ , and the parameter to be estimated is $ϵ_{r}$ . We use a few pilots to aid the ECM algorithm and reduce the number of estimated parameters. We replace the signal matrix $X$ with the pilot matrix $X_{P} = diag \{X_{P} (0), X_{P} (1), \dots, X_{P} (N - 1)\}$ . The pilot is a constant amplitude zero autocorrelation (CAZAC) sequence. According to Chen et al.¹⁹ and Pun et al.,²⁰ the ECM algorithm can be divided into an E-step and an M-step.

E-step

The expected log-likelihood function is defined as follows

Q ({\tilde{ϵ}}_{r} | {\hat{ϵ}}_{r}^{i}) = E_{h_{r}} \{ \ln [p (y_{r} | h_{r}, {\tilde{ϵ}}_{r})] \, p (y_{r} | h_{r}, {\hat{ϵ}}_{r}^{i}) \}

(4)

where $p (y_{r} | h_{r}, {\tilde{ϵ}}_{r})$ and $p (y_{r} | h_{r}, {\hat{ϵ}}_{r}^{i})$ represent conditional probability density functions, $E_{h_{r}}$ denotes the expectation with respect to $h_{r}$ , ${\tilde{ϵ}}_{r}$ is a trial value of $ϵ_{r}$ , and ${\hat{ϵ}}_{r}^{i}$ is the estimate at iteration $i$ .

Assume that the signal undergoes log-normal shadow loss, time delay expansion, and Rayleigh fading as it travels from the transmitter to the receiver. The process of estimating the parameters starts with estimating $h_{r}$ . We adopt the least-squares (LS) algorithm, which is computationally simple and widely used for channel estimation, to initialize the value of $h_{r}$ , and this computational step is also aided by the pilot $X_{P}$ . The estimated $h_{r}$ is represented as

\begin{matrix} {\hat{h}}_{r, LS} ({\hat{ϵ}}_{r}^{i}) = {[W^{H} E (X_{P}) W]}^{- 1} W^{H} X_{P}^{H} F Γ^{H} ({\hat{ϵ}}_{r}^{i}) y_{r} \end{matrix}

(5)

where $E (X_{P}) = diag {| X_{P} [n] |^{2}}, n = 0, 1, \dots, N - 1$ . By substituting the estimated CIR into the Q function and skipping the summation term and multiplication factor, which are not relevant to the parameters to be estimated, the following equation can be obtained

\begin{matrix} Q' ({\tilde{ϵ}}_{r} | {\hat{ϵ}}_{r}^{i}) = - {‖ y_{r} - Γ ({\tilde{ϵ}}_{r}) F^{H} X_{P} W {\hat{h}}_{r, LS} ({\hat{ϵ}}_{r}^{i}) ‖}^{2} \end{matrix}

(6)

M-step

Once the $h_{r}$ for the current iteration is obtained, the parameter space is searched for the value that maximizes the expectation function $Q'$ as the parameter estimate for the current iteration. This process can be expressed as the following equation

{\hat{ϵ}}_{r}^{i + 1} = \arg \max_{{\tilde{ϵ}}_{r}} \{ - {‖ y_{r} - Γ ({\tilde{ϵ}}_{r}) F^{H} X_{P} W {\hat{h}}_{r, LS} ({\hat{ϵ}}_{r}^{i}) ‖}^{2} \}

(7)

To solve this optimization problem, $Γ (ϵ_{r})$ is written as a Taylor series expansion around ${\hat{ϵ}}_{r}^{i}$ and truncated after the second-order term as an approximation. After dropping the terms that do not depend on $ϵ_{r}$ , taking the derivative with respect to $ϵ_{r}$ , and setting the derivative to zero, the relationship between the current estimate and the estimate from the previous step is

{\hat{ϵ}}_{r}^{i + 1} = {\hat{ϵ}}_{r}^{i} + \frac{ℑ {y_{r}^{H} Γ' ({\hat{ϵ}}_{r}) F^{H} X_{P} W {\hat{h}}_{r, LS} ({\hat{ϵ}}_{r}^{i})}}{ℜ {y_{r}^{H} Γ ″ ({\hat{ϵ}}_{r}) F^{H} X_{P} W {\hat{h}}_{r, LS} ({\hat{ϵ}}_{r}^{i})}}

(8)

where $Γ' ({\tilde{ϵ}}_{r})$ and $Γ ″ ({\tilde{ϵ}}_{r})$ represent the first-order partial derivative and second-order partial derivative of $Γ ({\tilde{ϵ}}_{r})$ with respect to ${\tilde{ϵ}}_{r}$ . Ultimately, the parameters to be estimated can be obtained by continually iterating over equations (5) and (8). The above process completes the attributes estimation phase. Estimated physical layer attributes can be utilized in the subsequent authentication phases.
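The alternation between the LS channel estimate of equation (5) and the CFO search of equation (7) can be sketched as follows. Note the substitutions: the M-step here is a plain grid search over trial CFO values instead of the Taylor-based update of equation (8), $W$ is assumed to be $\sqrt{N}$ times the first $L_{h}$ columns of $F$ , and a Zadoff-Chu sequence is used as one example of a CAZAC pilot. All names and values are hypothetical.

```python
import numpy as np

def dft_matrix(N):
    """Unitary N-point DFT matrix."""
    n = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)

def cfo_diag(eps, N):
    """Diagonal entries of Gamma(eps)."""
    return np.exp(2j * np.pi * eps * np.arange(N) / N)

def ls_channel(y, eps_hat, B):
    """Equation (5), with B = F^H X_P W so that B^H B = W^H E(X_P) W."""
    derotated = np.conj(cfo_diag(eps_hat, len(y))) * y  # Gamma^H(eps_hat) y
    return np.linalg.solve(B.conj().T @ B, B.conj().T @ derotated)

def q_prime(y, eps_trial, h_hat, B):
    """Equation (6): negative squared residual for a trial CFO."""
    return -np.linalg.norm(y - cfo_diag(eps_trial, len(y)) * (B @ h_hat)) ** 2

def ecm_estimate(y, B, grid, eps_init=0.0, n_iter=20):
    """Alternate equation (5) with a grid-search version of equation (7)."""
    eps_hat = eps_init
    for _ in range(n_iter):
        h_hat = ls_channel(y, eps_hat, B)               # CIR given current CFO
        scores = [q_prime(y, e, h_hat, B) for e in grid]
        eps_hat = grid[int(np.argmax(scores))]          # CFO given current CIR
    return eps_hat, ls_channel(y, eps_hat, B)

# Noise-free toy setup (hypothetical values)
N, L = 64, 3
F = dft_matrix(N)
W = np.sqrt(N) * F[:, :L]                               # assumed form of W
x_pilot = np.exp(-1j * np.pi * np.arange(N) ** 2 / N)   # Zadoff-Chu pilot
B = F.conj().T @ np.diag(x_pilot) @ W
grid = np.linspace(-0.5, 0.5, 101)                      # trial CFO values
eps_true = grid[60]
h_true = np.array([0.8, 0.5 + 0.2j, 0.1j])
y = cfo_diag(eps_true, N) * (B @ h_true)

h_at_truth = ls_channel(y, eps_true, B)
eps_fixed, h_fixed = ecm_estimate(y, B, grid, eps_init=eps_true, n_iter=3)
```

In this noise-free setup the LS step recovers the channel exactly when the CFO is correct, and the true CFO is a fixed point of the iteration. The sketch does not make any claim about convergence speed from other starting points or under noise.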

Logistic regression authentication model

Different attributes have different ranges of variation and magnitudes, so the raw data need to be normalized before fitting the logistic regression model. There are three reasons for doing this. First, some attributes have a much wider range of variation than others, so the classification result would depend mainly on those attributes, which may be contrary to reality. Second, normalization removes the effect of magnitude. Finally, normalized data helps to improve the convergence speed of the gradient descent algorithm. In order to give the normalized data better discrimination without losing the original data characteristics, we normalize the attributes differently according to their assumed distributions. In this article, we assume that the CFO follows an incremental distribution in which the CFOs from Alice and Eve fluctuate around a constant value without outliers. We also assume that the RSSI varies relatively little over short periods and that its maximum and minimum values are relatively stable, so we normalize these two attributes using the $\max - \min$ method. Assuming that the range of variation of the attribute $a_{x}$ is $[\min, \max]$ , with $\min$ representing the smallest value and $\max$ representing the largest value in the dataset, the normalized data is

{\bar{a}}_{x} = \frac{2 a_{x} - (\max + \min)}{\max - \min}, x = CFO, RSSI, {\bar{a}}_{x} \in [- 1, 1]

(9)

After normalization, CFO and RSSI values are distributed in the range $[- 1, 1]$ . As described in the system model, the CIR is represented as $h_{r}$ with $L_{h}$ channel taps. Each tap is subject to path loss; to reduce this effect, the total power of each $h_{r}$ is normalized to 1, and the normalized value of each tap represents the ratio of the power of that tap to the total power. The normalization formula for the CIR is

{\bar{h}}_{r} (n) = \frac{{‖ h_{r} (n) ‖}^{2}}{{‖ h ‖}^{2}}

(10)

where ${‖ h_{r} (n) ‖}^{2}$ is the power of the $n$th tap, ${‖ h ‖}^{2}$ is the total power of $h$ , and the range of ${\bar{h}}_{r} (n)$ is $[0, 1)$ .
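The two normalizations in equations (9) and (10) can be illustrated with a short Python sketch. The function names and sample values are hypothetical; by default the minimum and maximum are taken from the data passed in, which is an assumption about how the dataset range is obtained.

```python
def minmax_normalize(values, lo=None, hi=None):
    """Equation (9): map an attribute (CFO or RSSI) onto [-1, 1]."""
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    return [(2.0 * a - (hi + lo)) / (hi - lo) for a in values]

def normalize_cir(taps):
    """Equation (10): power of each tap as a fraction of the total CIR power."""
    powers = [abs(t) ** 2 for t in taps]
    total = sum(powers)
    return [p / total for p in powers]

# Hypothetical RSSI readings in dB and a hypothetical 3-tap CIR
rssi_norm = minmax_normalize([-70.0, -60.0, -50.0])
cir_norm = normalize_cir([0.8, 0.5 + 0.2j, 0.1j])
```

The smallest and largest readings map to the two ends of the interval, and the normalized CIR taps sum to 1, so a common path loss factor on all taps cancels out.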

We utilize an ML approach to implement threshold-free physical layer authentication. The authentication decision is modeled as a binary classification problem. The receiver marks the current message sent by the transmitter as legal or illegal based on the attributes obtained in the channel estimation phase and records them as training data for the next message authentication. Assume that Bob marks the current authentication result as $r_{j}$ , $r_{j} \in \{0, 1\}$ . Define $r_{j} = 1$ to mean that the transmitter passes authentication, in which case Bob accepts the current message and marks the current transmitter as Alice. Alternatively, $r_{j} = 0$ means that the transmitter does not pass authentication; Bob rejects the current transmitter and initiates a spoof warning, marking the physical layer attributes of the current transmitter as Eve’s characteristics.

The critical issue is to choose a suitable ML model to obtain better authentication performance. Here we choose a logistic regression model, which is suitable for binary classification problems, to achieve threshold-free authentication. Suppose that the data in the same dataset are all estimated under the same signal-to-noise ratio (SNR). We use the attribute values of $M$ historical records, which come from both Alice and Eve, as the training dataset for the model. It is assumed that the training data labels are consistent with the actual situation, that is, the authentication results of all $M$ previous messages are correct. We encapsulate the normalized attributes of all antennas in $A$ , an $(L_{h} + 2) \times Nr$ -dimensional matrix whose $r^{th}$ column is $[{\bar{h}}_{r}, {\bar{ϵ}}_{r}, {\overline{RSSI}}_{r}]^{T}$ . The input to the logistic regression model is $A$ and the output is the authentication result $r_{j}$ of the current source. According to the logistic regression model of McCue,²² the output of the authentication of the current source can be expressed as a probability modeled by the sigmoid function

P (r_{j} = 0 | A_{j}) = \frac{1}{1 + e^{ϕ_{0} + tr \{Φ^{T} A_{j}\}}},

(11)

P (r_{j} = 1 | A_{j}) = \frac{e^{ϕ_{0} + tr \{Φ^{T} A_{j}\}}}{1 + e^{ϕ_{0} + tr \{Φ^{T} A_{j}\}}},

(12)

where $ϕ_{0}$ is the bias term, $Φ$ is the $(L_{h} + 2) \times Nr$ -dimensional weight matrix, $Φ^{T}$ is the transpose of $Φ$ , and $tr \{\cdot\}$ represents the trace of a matrix. Note that including a bias term can improve the fit of the logistic regression. If the probability of the current message being from Alice is greater than the probability of it being from Eve, then the authentication is passed, that is, if $P (r_{j} = 1 | A_{j}) > P (r_{j} = 0 | A_{j})$ , then $r_{j} = 1$ , otherwise $r_{j} = 0$ . Substituting equations (11) and (12) into this inequality and simplifying yields the following hypothesis test

{\hat{r}}_{j} = \begin{cases} 1, & \text{if } ϕ_{0} + tr \{Φ^{T} A_{j}\} > 0 \\ 0, & \text{otherwise} \end{cases}

(13)
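The decision rule of equations (11) to (13) can be sketched in plain Python. The parameter values and attribute matrices below are hypothetical and chosen only so that the arithmetic is easy to follow; the trace $tr \{Φ^{T} A_{j}\}$ is computed as the sum of elementwise products, which is an equivalent form.

```python
import math

def linear_score(phi0, Phi, A):
    """phi0 + tr{Phi^T A}; the trace equals the sum of elementwise products."""
    return phi0 + sum(p * a
                      for p_row, a_row in zip(Phi, A)
                      for p, a in zip(p_row, a_row))

def prob_alice(phi0, Phi, A):
    """Equation (12), rewritten as 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-linear_score(phi0, Phi, A)))

def authenticate(phi0, Phi, A):
    """Equation (13): accept (1) when the linear score is positive."""
    return 1 if linear_score(phi0, Phi, A) > 0 else 0

# Hypothetical parameters: 3 attributes per antenna, Nr = 2 antennas
phi0 = -1.0
Phi = [[1.0, 0.5], [-0.5, 0.0], [2.0, 1.0]]
A_alice = [[0.6, 0.7], [0.1, -0.2], [0.5, 0.4]]
A_eve = [[-a for a in row] for row in A_alice]

decision_alice = authenticate(phi0, Phi, A_alice)  # score is 1.3
decision_eve = authenticate(phi0, Phi, A_eve)      # score is -3.3
```

Comparing the two probabilities is the same as checking the sign of the linear score, so no explicit threshold on the attributes has to be tuned.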

Logistic regression uses the minimization of the cross-entropy loss as its objective. Maximum likelihood estimation for logistic regression has no analytic solution, so iterative algorithms such as gradient descent are commonly used to improve the parameters step by step. For the optimization task of this article, we choose the FW algorithm to solve for the parameter matrix $Φ$ and the gradient descent algorithm to solve for the bias term $ϕ_{0}$ . Following Xiao et al.¹² and McCue,²² the loss function associated with $ϕ_{0}$ and $Φ$ is defined as

L (ϕ_{0}, Φ) = - \ln \left( \prod_{m = j - M - 1}^{j - 1} \Pr (r_{m} | A_{m}) \right) .

(14)

The following functional expressions can be obtained after substituting equations (11) and (12) into the loss function

L (ϕ_{0}, Φ) = \sum_{m = j - 1 - M}^{j - 1} \left( \ln (1 + e^{ϕ_{0} + tr \{Φ^{T} A_{m}\}}) - r_{m} (ϕ_{0} + tr \{Φ^{T} A_{m}\}) \right) .

(15)

The prediction results get closer to the actual category as the loss function decreases.
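As a quick numerical check, the loss in equation (15) can be evaluated and compared with the negative log-likelihood form of equation (14). The sketch below uses hypothetical parameters and two hypothetical training records; the numerically stable form of $\ln (1 + e^{z})$ is an implementation choice and not part of the derivation.

```python
import math

def linear_score(phi0, Phi, A):
    """phi0 + tr{Phi^T A}, computed as a sum of elementwise products."""
    return phi0 + sum(p * a
                      for p_row, a_row in zip(Phi, A)
                      for p, a in zip(p_row, a_row))

def loss(phi0, Phi, samples, labels):
    """Equation (15): sum over records of ln(1 + e^z) - r * z."""
    total = 0.0
    for A, r in zip(samples, labels):
        z = linear_score(phi0, Phi, A)
        # Stable evaluation of ln(1 + e^z)
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        total += softplus - r * z
    return total

# Hypothetical parameters and two labeled records (1 = Alice, 0 = Eve)
phi0 = -1.0
Phi = [[1.0, 0.5], [-0.5, 0.0], [2.0, 1.0]]
A_alice = [[0.6, 0.7], [0.1, -0.2], [0.5, 0.4]]
A_eve = [[-a for a in row] for row in A_alice]
samples, labels = [A_alice, A_eve], [1, 0]

zero_Phi = [[0.0, 0.0] for _ in range(3)]
loss_untrained = loss(0.0, zero_Phi, samples, labels)  # every P equals 0.5
loss_fitted = loss(phi0, Phi, samples, labels)
```

With all parameters at zero, each record contributes $\ln 2$ to the loss, and parameters that separate the two records give a smaller value.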

FW-based parameter optimization algorithm

In this subsection, the adjustment of the weight parameters $Φ$ and offset term $ϕ_{0}$ to minimize the objective function $L (ϕ_{0}, Φ)$ is described. The process of adjusting the parameters can be described as the following optimization problem

{\hat{ϕ}}_{0}, \hat{Φ} = \arg \min_{ϕ_{0}, Φ} \sum_{m = j - 1 - M}^{j - 1} \left( \ln (1 + e^{ϕ_{0} + tr \{Φ^{T} A_{m}\}}) - r_{m} (ϕ_{0} + tr \{Φ^{T} A_{m}\}) \right)

(16)

The risk of overfitting can be reduced by adding regularization constraints: $L_{1}$ regularization limits the solution space, reduces the model capacity, and lowers the computational effort. In this optimization task, the column vectors of the matrix $Φ$ are regularized. In order to solve for the parameter matrix $Φ$ using the FW algorithm,²³ we rewrite the objective function in vector form, and the optimization problem is transformed as follows

\begin{array}{l} {\hat{ϕ}}_{0}, \hat{Φ} = \arg \min_{ϕ_{0}, Φ} \sum_{m = j - 1 - M}^{j - 1} \left( \ln (1 + e^{ϕ_{0} + \sum_{r = 1}^{Nr} Φ_{r}^{T} A_{m, r}}) - r_{m} (ϕ_{0} + \sum_{r = 1}^{Nr} Φ_{r}^{T} A_{m, r}) \right) \\ \text{s.t. } {‖ Φ_{r} ‖}_{1} \leq C_{1}, r = 1, 2, \dots, Nr \\ | ϕ_{0} | \leq C_{2}, \end{array}

(17)

where $C_{1}$ and $C_{2}$ are constants that limit the complexity of the parameters.

To solve this optimization problem, we decompose the problem in equation (17) into two simpler optimization problems. First, we treat $ϕ_{0}$ as a constant and use the FW algorithm described in Jaggi²⁴ to optimize $Φ$ ; then we treat $Φ$ as a constant matrix and use the gradient descent algorithm to optimize $ϕ_{0}$ . The basic idea of the FW algorithm is to use the linear approximation of the objective function $L (Φ_{r})$ , which is differentiable in the feasible domain, at the point $Φ_{r}^{k}$ , that is, $L (Φ_{r}) \approx L (Φ_{r}^{k}) + \nabla L (Φ_{r}^{k})^{T} (Φ_{r} - Φ_{r}^{k})$ . Replacing the objective function in equation (17) with this approximation and dropping the terms that do not depend on $Φ_{r}$ gives the following approximate linear program in a neighborhood of $Φ^{k}$

\begin{array}{l} {\hat{Φ}}_{r} = \arg \min_{Φ_{r}} \nabla L (Φ_{r}^{k})^{T} Φ_{r} \\ \text{s.t. } {‖ Φ_{r} ‖}_{1} \leq C_{1}, r = 1, 2, \dots, Nr . \end{array}

(18)

The gradient $\nabla L (Φ_{r})$ of the objective function is an $(L_{h} + 2)$ -dimensional column vector, and its $l$th element can be written as

\nabla L (Φ_{r})_{l} = \sum_{m = j - M - 1}^{j - 1} (\frac{e^{ϕ_{0} + \sum_{r = 1}^{Nr} (Φ_{r}^{T} A_{m, r})} A_{m, r, l}}{1 + e^{ϕ_{0} + \sum_{r = 1}^{Nr} (Φ_{r}^{T} A_{m, r})}} - r_{m} A_{m, r, l}) .

(19)

In order to construct a feasible descent direction for the approximate linear program, the index of the gradient element with the largest absolute value is denoted as $ξ$ ,

ξ = \arg \max_{1 \leq l \leq L_{h} + 2} | \nabla L (Φ_{r})_{l} | .

(20)

According to the rules of the FW algorithm, the feasible descent direction is the difference between the solution of the linear program in equation (18) and $Φ_{r}^{k}$ . To simplify the search process, we define the feasible descent direction directly as $d_{k} = sgn (- \nabla L (Φ_{r})_{ξ}) C_{1} e^{ξ} - Φ_{r}$ , where $e^{ξ}$ denotes a column vector with only the $ξ^{th}$ element being 1 and all other elements being 0. To obtain an approximate solution with a target accuracy of $ε$ , $φ$ is used as the stopping criterion for the iteration

\begin{matrix} φ = ‖ \nabla L {(Φ_{r}^{k})}^{T} d_{k} ‖ . \end{matrix}

(21)

If $φ \leq ε$ , the iteration that updates $Φ_{r}$ stops, $Φ^{k}$ is held constant, and the algorithm proceeds to the step of updating $ϕ_{0}$ . Otherwise, $Φ_{r}$ is updated along the feasible descent direction until the target accuracy is reached.

Ideally, the optimal step size $λ$ would be found by a line search at each step. However, to reduce the computational effort, the predetermined step size $λ = \frac{2}{k + 2}$ used in Jaggi²⁴ is applied in this article, that is, the step size depends only on the iteration count. This choice is reasonable because, as the number of iterations increases and the current solution moves closer to the optimal solution, the required step size becomes smaller. As long as the target accuracy has not been reached, the parameter matrix is updated using the feasible descent direction and the step size. The new vector is defined by the following equation

\begin{matrix} Φ_{r}^{(k + 1)} : = Φ_{r}^{(k)} + λ d_{k} \\ = (1 - λ) Φ_{r}^{(k)} + λ \, sgn (- \nabla L {(Φ_{r})}_{ξ}) C_{1} e^{ξ} \end{matrix}

(22)
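One iteration of equations (19)–(22) can be sketched in Python with NumPy. This is a minimal illustration rather than the authors' implementation: the function name `fw_step`, the array layout (`A` with shape `(M, Nr, L_h + 2)` and `Phi` with one row per antenna), and the choice to compute all gradients before updating any $Φ_{r}$ are assumptions made for the sketch.

```python
import numpy as np


def fw_step(Phi, phi0, A, labels, C1, k):
    """One Frank-Wolfe update of every Phi_r (rows of Phi), per eqs. (19)-(22).

    Phi    : (Nr, D) float weights, D = L_h + 2 (layout assumed for this sketch)
    A      : (M, Nr, D) normalized attribute records
    labels : (M,) tags r_m in {0, 1}
    Returns the updated Phi and the largest stopping flag over all antennas.
    """
    z = phi0 + np.einsum("rd,mrd->m", Phi, A)        # phi_0 + sum_r Phi_r^T A_{m,r}
    prob = 1.0 / (1.0 + np.exp(-z))                  # equals e^z / (1 + e^z)
    grad = np.einsum("m,mrd->rd", prob - labels, A)  # eq. (19), one row per r
    lam = 2.0 / (k + 2.0)                            # predetermined step size
    new_Phi = np.empty_like(Phi)
    flag = 0.0
    for r in range(Phi.shape[0]):
        xi = int(np.argmax(np.abs(grad[r])))         # eq. (20)
        vertex = np.zeros(Phi.shape[1])
        vertex[xi] = np.sign(-grad[r, xi]) * C1      # sgn(-grad_xi) * C1 * e^xi
        d = vertex - Phi[r]                          # feasible descent direction
        flag = max(flag, abs(grad[r] @ d))           # eq. (21)
        new_Phi[r] = Phi[r] + lam * d                # eq. (22)
    return new_Phi, flag
```

Because every iterate is a convex combination of points inside the $ℓ_{1}$ ball, the constraint ${‖ Φ_{r} ‖}_{1} \leq C_{1}$ holds after each step without any projection.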

Once $Φ$ has reached its target accuracy, we start optimizing the next parameter, $ϕ_{0}$ . Since the process of optimizing $ϕ_{0}$ involves only one variable, we can use the partial derivative directly to find the best $ϕ_{0}$ . The partial derivative of the objective function with respect to $ϕ_{0}$ is $\frac{\partial}{\partial ϕ_{0}} L (ϕ_{0}) = \sum_{m = j - M - 1}^{j - 1} \left( \frac{e^{ϕ_{0} + tr \{ Φ^{(k) T} A_{m} \}}}{1 + e^{ϕ_{0} + tr \{ Φ^{(k) T} A_{m} \}}} - r_{m} \right)$ . The new iteration point is given by

ϕ_{0}^{(k + 1)} : = ϕ_{0}^{(k)} - \frac{λ}{M} \frac{\partial}{\partial ϕ_{0}} L (ϕ_{0}) .

(23)
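The intercept update in equation (23) is a single scalar gradient step. A minimal sketch follows, assuming the scores $tr \{ Φ^{(k) T} A_{m} \}$ have already been computed for the stored records; the function name `update_intercept` and its arguments are illustrative, and the number of records passed in plays the role of $M$ .

```python
import numpy as np


def update_intercept(phi0, scores, labels, lam):
    """One gradient step on phi_0 following eq. (23).

    scores[m] holds tr(Phi^T A_m) for record m, with Phi held fixed.
    Returns the new phi_0 and the partial derivative used for the step.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    prob = 1.0 / (1.0 + np.exp(-(phi0 + scores)))   # equals e^z / (1 + e^z)
    grad = float(np.sum(prob - labels))             # dL / d(phi_0)
    return phi0 - lam / labels.size * grad, grad
```

When the stored records are balanced and the scores are zero, the derivative vanishes at $ϕ_{0} = 0$ , so the intercept is left unchanged.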

When both $ϕ_{0}$ and $Φ$ have reached the target accuracy, that is, an $ε$ -approximate solution of problem (15) has been obtained, the iteration can be stopped. The parameter values $ϕ_{0}^{(k)}$ and $Φ^{(k)}$ of the current iteration are then output as the final logistic regression model parameters. Substituting all the parameters into the hypothesis test in equation (13) gives the label of the transmitter to be authenticated. When the label ${\hat{r}}_{j} = 1$ , Bob treats the transmitter to be authenticated as Alice and accepts the message. Otherwise, Bob treats the transmitter as Eve, rejects its message, and issues a spoof warning. Regardless of the legitimacy of the source of the received message, Bob records the physical layer attribute data with tag $r_{j}$ as part of the training set for the next authentication. The process of physical layer attribute estimation and physical layer authentication is summarized in Algorithm 1.

Algorithm 1. FW-based PHY-layer authentication.
Input: received signals $y$ , pilot signal, target accuracy $ε$
Output: authentication result
1: Initialize $Φ = 0, ϕ_{0} = 0, k = 0, φ, C_{1}$ , and $C_{2}$
2: Calculate RSSI via (3)
3: for i = 1, 2, …, I do
4:   Estimate CIR via (5)
5:   Estimate CFO with the ECM algorithm via (8)
6: end for
7: Normalize and combine the raw attribute data to obtain $A_{m}$
8: while $\frac{\partial}{\partial ϕ_{0}} L (ϕ_{0}) > ε$ do
9:   while $φ > ε$ do
10:     Calculate $ξ$ via (20)
11:     Calculate $φ$ via (21)
12:     Set $λ = \frac{2}{k + 2}$
13:     Update $Φ_{r}$ via (22)
14:     $k + +$
15:   end while
16:   Update $ϕ_{0}$ via (23)
17: end while
18: Obtain $ϕ_{0} = {\hat{ϕ}}_{0}$ and $Φ = \hat{Φ}$
19: Calculate ${\hat{r}}_{j}$ via (13)
20: if ${\hat{r}}_{j} = 1$ then
21:   Accept message j
22: else
23:   Send spoofing alarm for current message
24: end if
25: Store the attribute data with labels
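Steps 19 to 24 of Algorithm 1 can be illustrated with a short Python sketch. It assumes that the hypothesis test in equation (13) reduces to the usual logistic regression rule, namely accepting the message when the modelled probability of the legitimate class is at least 0.5; the function name `authenticate` and the matrix layout are likewise assumptions.

```python
import numpy as np


def authenticate(Phi, phi0, A_j):
    """Return (label, probability) for record A_j; label 1 means 'accept as Alice'.

    Assumed decision rule: accept when the logistic output is at least 0.5,
    so no separately tuned detection threshold is involved.
    """
    z = phi0 + float(np.sum(Phi * A_j))   # tr(Phi^T A_j) = sum of elementwise products
    prob = 1.0 / (1.0 + np.exp(-z))
    return (1 if prob >= 0.5 else 0), prob
```

The elementwise product is used because $tr \{ Φ^{T} A_{j} \}$ equals the sum of the products of matching entries, which avoids forming the full matrix product.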

Performance metrics

To facilitate the analysis of PLA performance, specific indicators need to be selected to quantify the errors. Typically, the mean squared error (MSE) is used to quantify the error between the estimated attribute and the actual value, and it is calculated as

\begin{matrix} MSE (\hat{x}) = E [ {(\hat{x} - x)}^{2} ] \end{matrix},

(24)

where $\hat{x}$ is the estimate of the true value $x$ and $E (\cdot)$ denotes the expectation. A large estimation error is detrimental to authentication at the legitimate receiver and beneficial to the attacker, because it increases the attacker’s chance of passing authentication undetected.
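In a simulation, the expectation in equation (24) is usually approximated by a sample mean over repeated trials. A minimal sketch, with the helper name `empirical_mse` chosen for illustration:

```python
import numpy as np


def empirical_mse(estimates, true_value):
    """Sample-mean approximation of E[(x_hat - x)^2] in eq. (24)."""
    err = np.asarray(estimates) - true_value
    return float(np.mean(np.abs(err) ** 2))   # |.|^2 also covers complex CIR taps
```

Taking the squared magnitude rather than the plain square lets the same helper be used for the real-valued CFO and for complex-valued channel taps.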

The performance of our proposed method is measured by the probabilities of miss detection (MD) and false alarm (FA). An MD occurs when Bob marks physical layer attributes from Eve as legal and accepts the message. An FA occurs when Bob marks physical layer attributes from Alice as illegal and therefore issues a spoof warning. In the test data, signal samples from Alice that are classified correctly are true positives (TP); otherwise, they are false negatives (FN). Message samples from Eve that are classified correctly are true negatives (TN); otherwise, they are false positives (FP).

The miss detection rate $(P_{m})$ , that is, the FP rate can be described as

P_{m} = \Pr ({\hat{r}}_{j} = 1 | Φ \in Eve) = \frac{FP}{FP + TN} .

(25)

The false alarm rate $(P_{f})$ , that is, the FN rate can be described as

P_{f} = \Pr ({\hat{r}}_{j} = 0 | Φ \in Alice) = \frac{FN}{FN + TP} .

(26)
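Equations (25) and (26) translate directly into counting over a labeled test set. The sketch below assumes labels of 1 for Alice and 0 for Eve, matching ${\hat{r}}_{j} = 1$ for an accepted message; the function name `error_rates` is illustrative.

```python
def error_rates(true_labels, predicted_labels):
    """Return (P_m, P_f) from eqs. (25) and (26); label 1 = Alice, 0 = Eve."""
    tp = fn = tn = fp = 0
    for r, r_hat in zip(true_labels, predicted_labels):
        if r == 1:                 # sample really came from Alice
            if r_hat == 1:
                tp += 1
            else:
                fn += 1            # false alarm
        else:                      # sample really came from Eve
            if r_hat == 1:
                fp += 1            # miss detection
            else:
                tn += 1
    p_m = fp / (fp + tn) if (fp + tn) else 0.0
    p_f = fn / (fn + tp) if (fn + tp) else 0.0
    return p_m, p_f
```

For example, with four samples from each transmitter, one accepted Eve sample and one rejected Alice sample give $P_{m} = P_{f} = 0.25$ .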

Algorithm complexity analysis

The number of floating-point operations (FLOPS) reflects the computational complexity to some extent. The number of FLOPS needed for each operation of Algorithm 1 is listed in Table 1. After neglecting the constants, lower powers, and coefficients of the highest powers, the total computational complexity of Algorithm 1 is $O (I N_{r} L_{h} (N L_{h} + N^{2} + L_{h}^{2}) + KM N_{r} L_{h}^{2})$ FLOPS.

Table 1.

Complexity analysis.

Stages	Operations	FLOPS
Estimation	Initialization	$L_{h} + 4$
	Calculate RSSI	$Nr (N + 2)$
	Estimate CIR	$O (Nr L_{h} (N L_{h} + N^{2} + L_{h}^{2}))$
	Estimate CFO	$O (Nr N^{2} + Nr N L_{h})$
	Total $I$ iterations	$O (I Nr L_{h} (N L_{h} + N^{2} + L_{h}^{2}))$
Training and Authentication	Compute $\nabla L (Φ_{r})$	$M (L_{h} + 2) (Nr (L_{h} + 2) + 4)$
	Calculate $ξ$	$(L_{h} + 2) + (L_{h} + 1)$
	Calculate $φ$	$2 + 2 (L_{h} + 2)$
	Update $Φ_{r}$	$Nr (L_{h} + 2)$
	Update $ϕ_{0}$	$M + 2$
	Authentication	$Nr (L_{h} + 2)$
	Total $K$ cycles	$O (KM Nr L_{h}^{2})$
Total FLOPS	$O (I Nr L_{h} (N L_{h} + N^{2} + L_{h}^{2}) + KM Nr L_{h}^{2})$
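To give a feel for the training-stage entries of Table 1, the following sketch evaluates them for the simulation values $M = 100$ , $L_{h} = 6$ , and $Nr = 8$ . It assumes that each row is the cost of one pass through the loop; the function name `training_flops_per_cycle` and the dictionary keys are illustrative.

```python
def training_flops_per_cycle(M, L_h, Nr):
    """Evaluate the training rows of Table 1 for one cycle (assumed per-cycle costs)."""
    D = L_h + 2                                   # attribute dimension per antenna
    costs = {
        "gradient": M * D * (Nr * D + 4),         # compute grad L(Phi_r)
        "xi": D + (L_h + 1),                      # calculate xi
        "flag": 2 + 2 * D,                        # calculate the stopping flag
        "update_Phi": Nr * D,                     # update Phi_r
        "update_phi0": M + 2,                     # update phi_0
    }
    costs["total"] = sum(costs.values())
    return costs


print(training_flops_per_cycle(M=100, L_h=6, Nr=8))
```

With these values the gradient accounts for 54,400 of the 54,599 operations per cycle, which is consistent with the $O (KM Nr L_{h}^{2})$ total for $K$ cycles.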

Performance evaluation

This section presents and analyzes the simulation results of the proposed multi-attributes physical layer authentication scheme.

On the transmitter side, the modulation method is QAM and the access method is OFDM. The signal, which carries the imperfect characteristics of the device oscillator, passes through a multipath fading channel and is corrupted by noise. On the receiver side, the sampling rate $f_{s}$ is 20 MHz. Assuming the system has perfect time synchronization, only the CFO needs to be considered. Although the ML-based physical layer authentication approach is model independent, we still use specific channel models when simulating the communication system for the convenience of later evaluation. The CFO between Alice and Bob is modeled as a Wiener process with an initial value of 2.54 kHz. We assume that an attacker is within communication range, and its CFO is also modeled as a Wiener process with an initial value of 2.3 kHz. The CFO values of the attacker Eve and the legitimate transmitter Alice are therefore very close to each other. Achieving such closeness is very demanding on hardware, so this setting assumes that Eve has a powerful imitation capability. The CFO values mentioned here are normalized before being added to the signal. Eve deliberately moves close to the legitimate transmitter to spoof the receiver, making its path loss very similar to Alice’s. Unless otherwise stated, the distance from Alice to Bob is 5 m and the distance from Eve to Bob is 6 m. However, the attacker still has to launch the attack covertly and its surrounding environment may be complex, so its multipath delay spread is assumed to be large. The channel in the experiment is modeled as an indoor channel at 2.4 GHz under the IEEE 802.11a specification. The parameters of the FW method are set to $C_{1} = 6$ , $C_{2} = 1$ , $M = 100$ , and $ε = 0.01$ . For clarity, the simulation parameters are summarized in Table 2.

Table 2.

Simulation parameters.

Parameters	Value
Center frequency	2.4 GHz
Sampling rate $f_{s}$	20 MHz
Number of channel taps $L_{h}$	6
Number of antennas $Nr$	8
Initial $Δ f$ of Alice	2.54 kHz
Initial $Δ f$ of Eve	2.3 kHz
Distance from Alice to Bob	5 m
Distance from Eve to Bob	6 m
Path loss index	2.1
Number of historical records $M$	100
Regularization parameter $C_{1}$	6
Regularization parameter $C_{2}$	1
Target accuracy $ε$	0.01

For the subsequent evaluation comparison, we define the following variables.

The signal experiences path loss on its way from the transmitter to the receiver. The path loss index is set to 2.1 for the indoor model with obstacle occlusion. Define $d_{A}$ and $d_{E}$ as the distances from Alice to Bob and from Eve to Bob, respectively, and let $ρ_{d}$ denote the relative difference between the two distances, that is, $ρ_{d} = \frac{d_{E} - d_{A}}{d_{A}}$ . The root mean square (RMS) delay spread is a useful channel parameter that provides a reference for comparing different multipath fading channels. It is assumed that the power delay profile of the channel is exponential and that the CIR is represented by an FIR filter. In the experiments, the number of channel taps is set to 6, the channel power is normalized to 1, and the RMS delay spread determines the proportion of each tap power to the total power. $τ_{A}$ and $τ_{E}$ denote the RMS delay spread from Alice to Bob and from Eve to Bob, respectively. Define $ρ_{t}$ as the ratio of the RMS delay spread from Eve to Bob to that from Alice to Bob, that is, $ρ_{t} = \frac{τ_{E}}{τ_{A}}$ . For hardware devices with oscillators, the CFO can be modeled as a Wiener process. The CFO remains constant within a time slot but varies between time slots. Suppose the value of the CFO is $ϵ^{[m]}$ at time slot $m$ and $ϵ^{[m + n]}$ after $n$ further time slots; then the relationship between the CFOs at the two time slots can be expressed as

\begin{matrix} ϵ^{[m + n]} = ϵ^{[m]} + Δ (n) \end{matrix}

(27)

where $Δ (n)$ is the variable that is modeled as a zero-mean real Gaussian process. It has a variance of $σ_{Λ}^{2 n}$ . In the experiment, n is set to 1. The variance can be written in logarithmic form $σ [dB] = 10 \log σ_{Λ}^{2 n}$ .
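The slot-to-slot evolution in equation (27) with $n = 1$ is a Gaussian random walk, which can be simulated in a few lines. In this sketch the function name `simulate_cfo` and the parameter `step_std` (the standard deviation of one increment, in the same unit as the initial CFO) are assumptions; how `step_std` maps to $σ [dB]$ is deliberately left open.

```python
import numpy as np


def simulate_cfo(initial_cfo, step_std, num_slots, rng):
    """Random-walk CFO per eq. (27) with n = 1: eps[m+1] = eps[m] + Delta(1).

    step_std is the standard deviation of Delta(1); its relation to
    sigma[dB] is not fixed by this sketch.
    """
    steps = rng.normal(0.0, step_std, size=num_slots - 1)
    return initial_cfo + np.concatenate(([0.0], np.cumsum(steps)))


rng = np.random.default_rng(0)
alice_cfo = simulate_cfo(2540.0, 5.0, 100, rng)   # Hz; the step size is hypothetical
eve_cfo = simulate_cfo(2300.0, 5.0, 100, rng)
```

The CFO is constant within a slot, so one value per slot is sufficient; a larger increment variance makes consecutive values less predictable and therefore less useful as a fingerprint.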

Figure 2(a) shows that the MSE of the CFO declines at a decreasing rate as the SNR increases, which means that the estimation accuracy of the hardware device attribute improves as the noise in the environment decreases, especially at low SNR. Moreover, the CFO estimation error of the eight-antenna system is smaller than that of the single-antenna system, and this advantage is more pronounced at low SNR than at high SNR. From Figure 2(b), we can see that the MSE of the CIR decreases at a constant rate as the SNR increases for both the single-antenna and eight-antenna systems, and the MSE of the eight-antenna system is always lower than that of the single-antenna system. Taking SNR = 0 dB as an example, compared with the single antenna, the multi-antenna technology reduces the estimation errors of CFO and CIR by 2.14% and 98.43%, respectively. Thus, we can conclude that the multi-antenna technology can improve the received signal quality and reduce the estimation errors of physical layer attributes.

Figure 2.

Estimation errors versus SNR with the setting of ( $ρ_{d}$ = 0.2, $ρ_{t}$ = 4, $σ$ = 30 dB). (a) MSE of CFO. (b) MSE of CIR.

In order to compare the authentication accuracy using multi-attributes, we also carried out the authentication experiment using each physical layer attribute separately and their combinations. Figure 3 shows the $P_{m}$ of single-attribute and multi-attribute physical layer authentication versus SNR. Figure 4 shows the $P_{f}$ of authentication versus SNR.

Figure 3.

Miss detection rate versus SNR with the setting of (Nr = 8, $ρ_{d}$ = 0.2, $ρ_{t}$ = 4, $σ$ = 30 dB).

Figure 4.

False alarm rate versus SNR with the setting of (Nr = 8, $ρ_{d}$ = 0.2, $ρ_{t}$ = 4, $σ$ = 30 dB).

It can be seen from Figures 3 and 4 that $P_{m}$ and $P_{f}$ decrease continuously as the SNR increases. The $P_{m}$ of authentication based on the single attribute RSSI stays at a high level of 20%–30%, and $P_{f}$ is between 30% and 40%, which indicates that the performance of authentication with RSSI as a fingerprint does not improve as the received signal quality increases when Eve is close to Alice. Because the path losses of the two transmitters are of the same order of magnitude, distinguishing legitimate from illegitimate devices by RSSI alone is not a good solution.

The CIR-based $P_{m}$ and $P_{f}$ decrease significantly with increasing SNR, indicating that signal improvement contributes to the performance of CIR-based authentication. The $P_{m}$ of authentication based on CIR single physical layer attribute can reduce to 4.8362% and $P_{f}$ can reduce to 0.382% under the condition of SNR = 20 dB.

When SNR < 10 dB, the authentication error rates of CFO-based single-attribute physical layer authentication decrease as the SNR increases. When SNR > 10 dB, the error rates of CFO-based authentication do not decrease significantly as the signal quality improves, because the MSE of the CFO no longer decreases significantly. The $P_{m}$ of authentication based on the single attribute CFO can be reduced to 0.7526% and $P_{f}$ to 0.1833% at SNR = 20 dB. CFO-based authentication performs better than CIR-based single-attribute authentication because the CFO is modeled as a Wiener process, so its variation is continuous rather than abrupt, whereas each channel tap of the CIR is modeled as an independent complex Gaussian random variable, that is, the CIR is more random than the CFO.

Physical layer authentication based on both CFO and CIR performs better than authentication using either attribute alone as the fingerprint, indicating that combining multiple attributes can indeed improve the accuracy of physical layer authentication. When an attribute fails in the current authentication, that is, when this attribute of Eve is so similar to that of Alice that the legitimate and illegitimate transmitters cannot be distinguished by it, other attributes can be used as authentication fingerprints to distinguish them. The $P_{m}$ of authentication based on CFO and CIR can be reduced to 0.3869% and $P_{f}$ to 0.0667% at SNR = 20 dB. The best performance is achieved by using all three attributes simultaneously as authentication fingerprints, because an attacker can imitate some attributes of the legitimate transmitter, but it is almost impossible for Eve to imitate all attributes of Alice. Using multiple attributes as authentication fingerprints therefore increases the security of the system. The simulation results show that the $P_{m}$ of multi-antenna three-attribute physical layer authentication is reduced to 0.3466% and the $P_{f}$ is less than 0.0263% at SNR = 20 dB. The authentication performance based on the three attributes CFO, CIR, and RSSI is better than that based on the two attributes CFO and CIR, but the improvement is not significant. Because the RSSI-based authentication error rate is not low, adding RSSI as an authentication fingerprint improves performance only slightly.

The accuracy of CFO estimation and CIR estimation depends on the signal quality. Improved channel conditions improve the estimation accuracy, which is more favorable for Bob to distinguish between Eve and Alice, that is, the authentication accuracy is much higher. The multi-antenna technology can improve the signal reception quality and indirectly reduce the estimation error. Moreover, using multi-attributes is significantly more accurate than a single attribute as a fingerprint for authentication.

The distance between Alice and Bob is fixed at 5 m and specific values of $ρ_{d}$ are obtained by varying $d_{E} \in \{5, 6, \dots, 11\}$ . Figures 5 and 6 show that the authentication error rate decreases as $ρ_{d}$ increases. An increase in $ρ_{d}$ means that the distance between Eve and Bob increases while the distance between Alice and Bob remains constant. The signals from the legitimate transmitter and the illegitimate attacker then experience very different path losses, which increases the differentiation of the transmitters and makes it easier for the receiver to authenticate correctly. Simulation results show that when $ρ_{d}$ is equal to zero and RSSI is the only authentication fingerprint, the miss detection rate and false alarm rate are as high as 40.276% and 50.659%, respectively, which provides hardly any authentication reliability. When the distance from Alice to Bob and the distance from Eve to Bob are similar, RSSI must be combined with other attributes as an authentication fingerprint to obtain a low authentication error rate. For example, when $ρ_{d} = 0$ , the $P_{m}$ of RSSI and CIR as authentication fingerprints is reduced to 6.879% and $P_{f}$ is reduced to 8.963%. The $P_{m}$ of RSSI and CFO together as authentication fingerprints is reduced to 0.4603% and $P_{f}$ is reduced to 1.0498%. When all three attributes are used together as authentication fingerprints, $P_{m}$ is reduced to 0.3797% and $P_{f}$ is reduced to 0.8642%. When $ρ_{d}$ increases to 2, $P_{f}$ tends to zero both with RSSI as a single attribute and with three attributes as fingerprints, and $P_{m}$ is very small, which indicates that Alice and Eve can be well distinguished using RSSI alone. Furthermore, the multi-attribute authentication error rates are 0 in this case.
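The link between $ρ_{d}$ and the RSSI gap can be illustrated numerically. The sketch below assumes the common log-distance path loss model, in which the received power difference in dB is $10 γ \log_{10} (d_{E} / d_{A})$ with path loss index $γ = 2.1$ ; the model choice, the symbol $γ$ , and the function names are assumptions made for illustration.

```python
import math


def rho_d(d_a, d_e):
    """Relative distance difference rho_d = (d_E - d_A) / d_A."""
    return (d_e - d_a) / d_a


def path_loss_gap_db(d_a, d_e, exponent=2.1):
    """Extra path loss of Eve relative to Alice under a log-distance model (assumed)."""
    return 10.0 * exponent * math.log10(d_e / d_a)


for d_e in range(5, 12):
    print(d_e, round(rho_d(5.0, d_e), 2), round(path_loss_gap_db(5.0, d_e), 2))
```

Under this assumed model, the default setting $d_{A} = 5$ m and $d_{E} = 6$ m gives $ρ_{d} = 0.2$ and a gap of only about 1.66 dB, which helps explain why RSSI alone separates the two transmitters poorly when they are close together.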

Figure 5.

Miss detection rate versus $ρ_{d}$ with the setting of (Nr = 8, SNR = 20 dB, $ρ_{t}$ = 4, $σ$ = 30 dB).

Figure 6.

False alarm rate versus $ρ_{d}$ with the setting of (Nr = 8, SNR = 20 dB, $ρ_{t}$ = 4, $σ$ = 30 dB).

In the experiment, the RMS delay spread from Alice to Bob is fixed at 25 ns and different values of $ρ_{t}$ are obtained by changing the RMS delay spread from Eve to Bob. Figures 7 and 8 show that the authentication error rate decreases as $ρ_{t}$ increases. When $ρ_{t} = 1$ , that is, when the attacker has the same multipath delay spread as the legitimate transmitter, the CIR-based authentication error rate is as high as 50%, which indicates that CIR fails as an authentication attribute. The error rate of the combination of CIR and RSSI as the authentication fingerprint is also as high as 30%, because the default $d_{E}$ is very close to $d_{A}$ in the experiment, so RSSI as an authentication fingerprint cannot distinguish well between Alice and Eve either. When the value of $ρ_{t}$ is small, adding CFO as an authentication fingerprint yields lower $P_{m}$ and $P_{f}$ , and multiple attributes provide higher reliability for the authentication process. For example, when $ρ_{t} = 1$ , the miss detection rate of the combination of CIR and CFO as the authentication fingerprint is reduced to 4.2038% and the false alarm rate is reduced to 0.6497%. When the value of $ρ_{t}$ is large, relying on the single attribute CIR achieves the same low error rate as multi-attribute authentication, that is, it is easy to distinguish between the legitimate transmitter and the attacker.
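How the RMS delay spread shapes the tap powers can be sketched as follows. The code assumes that tap $l$ has power proportional to $e^{- l T_{s} / τ}$ with sample period $T_{s} = 1 / f_{s} = 50$ ns and that the RMS delay spread $τ$ is used directly as the decay constant; this is a common simplification, not necessarily the exact profile used in the simulations, and the function name `exponential_pdp` is illustrative.

```python
import numpy as np


def exponential_pdp(rms_delay_spread, num_taps=6, sample_period=50e-9):
    """Tap powers proportional to exp(-l * T_s / tau), normalized to sum to 1."""
    taps = np.arange(num_taps)
    powers = np.exp(-taps * sample_period / rms_delay_spread)
    return powers / powers.sum()


tau_a = 25e-9                        # RMS delay spread from Alice to Bob
p_alice = exponential_pdp(tau_a)
p_eve = exponential_pdp(4 * tau_a)   # rho_t = 4
print(np.round(p_alice, 4))
print(np.round(p_eve, 4))
```

A larger $ρ_{t}$ spreads Eve's power over the later taps while Alice's power stays concentrated in the first tap, so the two CIR fingerprints become easier to tell apart; at $ρ_{t} = 1$ the two profiles coincide.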

Figure 7.

Miss detection rate versus $ρ_{t}$ with the setting of (Nr = 8, SNR = 20 dB, $ρ_{d}$ = 0.2, $σ$ = 30 dB).

Figure 8.

False alarm rate versus $ρ_{t}$ with the setting of (Nr = 8, SNR = 20 dB, $ρ_{d}$ = 0.2, $σ$ = 30 dB).

The experiment assumes that Eve and Alice have the same Wiener process variance, and the impact of CFO variation on the authentication performance is observed by changing the value of the variance. We set $σ [dB] \in \{0, 10, \dots, 100\}$ . Figures 9 and 10 show the authentication error rates for different $σ [dB]$ values. When $σ [dB]$ is less than 20 dB, the authentication error rates of both multi-attribute and single-attribute authentication can be reduced to 0. The frequency of the device oscillator is more stable in this case, but this places very high demands on device quality. The authentication error rates increase with $σ [dB]$ when $20 dB < σ [dB] < 80 dB$ , which means that authentication performance degrades as device stability worsens. When $σ [dB] > 80 dB$ , the authentication error rates no longer change as $σ [dB]$ increases. In this case, $P_{m}$ and $P_{f}$ exceed 40% when only the CFO is used for authentication, which indicates that the CFO alone is no longer a valid authentication attribute. At this point, multi-attribute authentication performs the same as authentication using only the other attributes. In reality, the CFO between different hardware oscillator pairs is bound to differ. However, this difference itself changes over time, and if it changes too much, it gives an attacker an opportunity to exploit, thus causing a security breach. Improving the authentication reliability of the CFO as a fingerprint would inevitably place higher requirements on equipment quality, which would increase the cost of communication. Using multi-attributes as authentication fingerprints is a good solution.

Figure 9.

Miss detection rate versus $σ$ with the setting of (Nr = 8, SNR = 20 dB, $ρ_{d}$ = 0.2, $ρ_{t}$ = 4).

Figure 10.

False alarm rate versus $σ$ with the setting of (Nr = 8, SNR = 20 dB, $ρ_{d}$ = 0.2, $ρ_{t}$ = 4).

Figures 11 and 12 demonstrate the effect of the number of receive antennas on the multi-attributes authentication performance. It can be observed that the authentication error rate continues to decrease as the number of antennas increases. At low SNR, the improvement in authentication performance from increasing the number of antennas is significant. In contrast, when the signal conditions are inherently good, increasing the number of antennas improves the authentication performance only slightly. When SNR = 0 dB, $P_{m}$ decreases from 14.69% to 0.7459% and $P_{f}$ decreases from 12.77% to 0.1207% as the number of antennas increases from 1 to 8. When SNR = 20 dB, $P_{m}$ decreases from 0.65% to 0.3466% and $P_{f}$ decreases from 0.0491% to 0.0263% as the number of antennas increases from 1 to 8. This indicates that multiple-antenna techniques mainly improve the authentication performance of communication systems with poor channel conditions.

Figure 11.

Miss detection rate versus Nr with the setting of ( $ρ_{d}$ = 0.2, $ρ_{t}$ = 4, $σ$ = 30 dB).

Figure 12.

False alarm rate versus Nr with the setting of ( $ρ_{d}$ = 0.2, $ρ_{t}$ = 4, $σ$ = 30 dB).

Figure 13 illustrates the relationship between the authentication error rate and the number of iterations of the logistic regression algorithm. It can be seen from the figure that the authentication error rate decreases as the number of iterations increases. When the number of iterations is less than 2, the error rates are high because the weight parameters of logistic regression are initialized to 0, equivalent to no attribute involved in authentication. When the number of iterations is greater than 5, the authentication error rates can remain stable. For example, after five iterations, $P_{m}$ is below 3% and $P_{f}$ is below 0.9%, which indicates that the algorithm converges quickly.

Figure 13.

Physical layer authentication error rates for different number of iterations with the setting of (SNR = 20 dB, $ρ_{d}$ = 0.2, $ρ_{t}$ = 4, $σ$ = 30 dB).

Figure 14 compares the authentication performance of different authentication schemes in terms of $P_{m}$ versus $P_{f}$ . Fang et al.¹⁶ consider that different attributes are suitable for physical layer authentication in different communication scenarios. In complex wireless scenarios, the security performance of an authentication scheme deteriorates if a fixed physical layer attribute is used as the authentication fingerprint. Therefore, Fang et al.¹⁶ propose to select the most effective attributes (MEA) as the authentication fingerprints based on the observed performance of RSSI, CFO, and CIR and then to authenticate using their combination and a threshold-based method. In summary, although the selection of MEA is achieved via a learning approach, the authentication decision is still based on a threshold approach.

Figure 14.

$P_{m}$ versus $P_{f}$ based on our scheme and the scheme in Fang et al.¹⁶

The solid lines in the figure depict the relationship between $P_{m}$ and $P_{f}$ achieved in Fang et al.¹⁶ with different authentication fingerprints. The dashed lines depict the performance achieved by our proposed scheme. Curves of the same color represent the error rates obtained using the same attributes as the authentication fingerprints. It can be observed that as $P_{f}$ increases from 0 to 0.2, $P_{m}$ continues to decrease due to the inevitable trade-off between $P_{m}$ and $P_{f}$ . The closer a point in the figure is to the origin, the better the physical layer authentication performance. When $P_{f} = 0$ , the $P_{m}$ obtained with the scheme in Fang et al.¹⁶ is close to 1, whereas with our scheme $P_{m}$ is lower than 0.09 in all cases except when RSSI alone is used as the authentication fingerprint. The points obtained with our scheme are closer to the origin, and $P_{m}$ and $P_{f}$ can converge to 0 simultaneously, that is, the authentication performance is better. For example, the closest point to the origin with our scheme is $(8.79 \times 10^{- 3}, 2.48 \times 10^{- 3})$ , while the closest point to the origin with the scheme in Fang et al.¹⁶ is approximately (0.01, 0.09). Each dashed line in the figure is always lower than the solid line of the same color, which shows that the authentication achieved by our scheme is always better than that of Fang et al.,¹⁶ whether single or multiple attributes are used as authentication fingerprints.

Conclusion

We consider threshold-free multi-attributes physical layer authentication based on the ECM channel estimation algorithm. Once a signal is received, the receiver can directly calculate the RSSI and use the ECM algorithm to estimate the CFO and CIR from the received signal. The RSSI, CFO, and CIR obtained from the signal processing are exploited as fingerprints for physical layer authentication and serve as the input of the logistic regression model during the authentication phase. The logistic regression model achieves threshold-free authentication, and its parameters are optimized by the FW algorithm to achieve lower authentication error rates. Moreover, the combination of multi-attributes as authentication fingerprints enhances authentication reliability.

Experimental results show that the proposed threshold-free multi-attributes physical layer authentication scheme can effectively improve authentication accuracy, with $P_{f}$ reduced to 0.0263% and $P_{m}$ reduced to 0.3466%. Improved signal quality increases the accuracy of attribute estimation and thus indirectly enhances authentication performance. In addition, multi-antenna technology can improve the accuracy of attribute estimation when the signal quality is poor. The combination of multiple attributes can effectively mitigate the risk that a particular attribute fails because an attacker imitates it. Compared with using the CFO alone as the authentication fingerprint, $P_{m}$ and $P_{f}$ are reduced by 53.9% and 86.03%, respectively.

Footnotes

Handling Editor: Yanjiao Chen

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partly supported by the National Natural Science Foundation of China (Grants 61931001 and 61871023) and Beijing Natural Science Foundation (Grant 4202054).

ORCID iD

Qinghe Gao

References

Fang

Wang

Hanzo

Learning-aided physical layer authentication as an intelligent process. IEEE Trans Commun 2019; 67(3): 2260–2273.

Sun

A review of physical layer security techniques for internet of things: challenges and solutions. Entropy 2018; 20(10): 730.

Fang

Wang

Tomasin

Machine learning for intelligent authentication in 5G and beyond wireless networks. IEEE Wirel Commun 2019; 26(5): 55–61.

Xie

Zhang

Liu

AX.

Physical-layer authentication in non-orthogonal multiple access systems. IEEE/ACM Trans Netw 2020; 28(3): 1144–1157.

Chen

, et al. Physical layer authentication for non-coherent massive SIMO-enabled industrial IoT communications. IEEE Trans Inform Foren Sec 2020; 15: 3722–3733.

Xie

Chen

Slope authentication at the physical layer. IEEE Trans Inform Foren Sec 2018; 13(6): 1579–1594.

Wang

Hao

Hanzo

Physical-layer authentication for wireless security enhancement: current challenges and future developments. IEEE Commun Mag 2016; 54(6): 152–158.

Hou

Wang

Chouinard

. Physical layer authentication in OFDM systems based on hypothesis testing of CFO estimates. In: Proceedings of the 2012 IEEE international conference on communications (ICC), Ottawa, ON, Canada, 10–15 June 2012, pp.3559–3563. New York: IEEE.

Hou

Wang

Chouinard

, et al. Physical layer authentication for mobile systems with time-varying carrier frequency offsets. IEEE Trans Commun 2014; 62(5): 1658–1667.

10.

Liu

Wang

Primak

. A two dimensional quantization algorithm for CIR-based physical layer authentication. In: Proceedings of the 2013 IEEE international conference on communications (ICC), Budapest, 9–13 June 2013. New York: IEEE.

11.

Liu

Wang

Physical layer authentication enhancement using two-dimensional channel quantization. IEEE Trans Wirel Commun 2016; 15(6): 4171–4182.

12.

Xiao

Wan

Han

PHY-layer authentication with multiple landmarks with reduced overhead. IEEE Trans Wirel Commun 2018; 17(3): 1676–1687.

13.

Zhang

Shen

Jiang

, et al. Physical layer authentication jointly utilizing channel and phase noise in MIMO systems. IEEE Trans Commun 2020; 68(4): 2446–2458.

14.

Pan

Pang

Wen

, et al. Threshold-free physical layer authentication based on machine learning for industrial wireless CPS. IEEE Trans Ind Inform 2019; 15(12): 6481–6491.

15.

Fang

Wang

Hanzo

Adaptive trust management for soft authentication and progressive authorization relying on physical layer attributes. IEEE Trans Commun 2020; 68(4): 2607–2620.

16. Fang, Yin, Mei, et al. Learning enabled adaptive multiple attribute-based physical layer authentication. In: Proceedings of the 2020 IEEE 92nd vehicular technology conference (VTC2020-Fall), Victoria, BC, Canada, 18 November–16 December 2020, pp.1–5. New York: IEEE.

17. Senigagliesi, Baldi, Gambi. Comparison of statistical and machine learning techniques for physical layer authentication. IEEE Trans Inform Foren Sec 2020; 16(99): 1506–1521.

18. Wang, Liew SC. Frequency-asynchronous multiuser joint channel-parameter estimation, CFO compensation, and channel decoding. IEEE Trans Veh Technol 2016; 65(12): 9732–9746.

19. Chen TS. Optimal joint CFO and channel estimation in quasi-synchronized OFDM systems. In: Proceedings of the IEEE GLOBECOM 2007—IEEE global telecommunications conference, Washington, DC, 26–30 November 2007, pp.2816–2820. New York: IEEE.

20. Pun, Morelli, Kuo CCJ. Iterative detection and frequency synchronization for OFDMA uplink transmissions. IEEE Trans Wirel Commun 2007; 6(2): 629–639.

21. Goldsmith AJ. Wireless communications. Cambridge: Cambridge University Press, 2005.

22. McCue. Data mining and predictive analysis. 2nd ed. Amsterdam: Elsevier, 2015, pp.31–48.

23. Beck. First-order methods in optimization. Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM), 2017.

24. Jaggi. Revisiting Frank-Wolfe: projection-free sparse convex optimization. In: Proceedings of the 30th international conference on machine learning (ICML 2013), pp.427–435, http://proceedings.mlr.press/v28/jaggi13.pdf

Threshold-free multi-attributes physical layer authentication based on expectation–conditional maximization channel estimation in Internet of Things
