Sage Journals: Discover world-class research

Abstract

During the transmission of power measurement data through communication networks from the remote terminal unit (RTU) to the state estimator in Supervisory Control and Data Acquisition (SCADA), power cyber-physical systems (PCPSs) are susceptible to cyber-attacks. To mitigate that threat, this paper presents a new machine-learning-based data recovery strategy against false data injection attacks (FDIAs) in PCPSs. Firstly, in view of the limited resources (such as limited energy) of adversaries and system protections, a sparse target false data injection attack (FDIA) is constructed. Then, the FDIA detection problem is transformed into a tripartite separation problem, and the alternating direction method of multipliers on proximal exchange (ADMM-PE) is adopted to complete the intrusion detection of FDIAs. In addition, with the help of the reliable mask information and real incomplete measurement data provided by the FDIA detection, a similar supervised generative adversarial imputation network (GAIN) is proposed to complete the measurement data recovery after FDIAs. Specifically, pseudo labels generated by data analysis methods such as k-means clustering and support vector machine (SVM) are used to improve the accuracy of measurement data recovery. Finally, experimental results on PCPSs show the effectiveness and superiority of the proposed data recovery strategy against FDIAs.

Keywords

Power cyber-physical systems, false data injection attacks, measurement data recovery, intrusion detection, machine learning

Introduction

With more and more renewable power generation and smart devices, power systems are increasingly dependent on cyber infrastructure (e.g. interface terminals that promote real-time analysis, open communication networks).^1,2 Power Cyber-Physical Systems (PCPSs) are therefore a product of the high informatization, intelligence, and deep networking of power systems. However, the high integration of advanced information technology brings not only convenience but also many information security issues to PCPSs. While data is transmitted through communication networks, it is susceptible to cyber-attacks such as false data injection (FDI) attacks and denial of service (DoS) attacks, where FDI attacks (FDIAs) can inject subtle pseudo biases into the data in several ways, for example through the network layer or the communication channel. In addition, FDIAs can be made more intelligent and better concealed by collaboratively altering the meter measurements in PCPSs with the objective of minimizing the number of contaminated measurements and maximizing the attack impact.^3,4 In particular, unlike DoS attacks, FDIAs launched through intrusion of the communication link, magnetic field injection or global positioning system (GPS) spoofing¹ do not destroy the observability of systems, which makes them relatively difficult to detect and eliminate.^5,6

There have been numerous studies on intrusion detection methods against FDIAs for measurement data in PCPSs. According to Musleh et al.,⁵ intrusion detection methods against FDIAs can be divided into model-based and data-driven methods, and the advantages and disadvantages of these methods are analyzed and compared. At the end of Musleh et al.,⁵ future development trends of FDI attack (FDIA) detection methods are pointed out, namely methods that adapt to new types of FDIAs and to changes in system topology and parameters, or that are independent of system models and parameters. Therefore, more and more machine learning (ML)-based or data-driven FDIA detection methods that are independent of the system model and parameters are adopted to detect FDIAs in PCPSs. Examples include a fast go-decomposition (GoDec) approach on matrix decomposition,⁷ non-convex robust principal component analysis (NcRPCA) on matrix separation,⁸ a semi-supervised deep learning approach on autoencoders and an advanced generative adversarial network (GAN),⁹ a detection mechanism on a temporal and spatial correlation method and a deep convolutional neural network,¹⁰ secure federated deep learning with Transformer, federated learning and the Paillier cryptosystem,¹¹ a detection method on the Kalman filter and a recurrent neural network,¹² FDIA detection based on the spectral energy of the Hilbert-Huang transform,² and a novel interval state forecasting-based detection scheme on ensemble learning of long short-term memory neural networks and a parametric Gaussian distribution.⁴ It is worth noting that FDIA detection methods on ML are suitable for detecting all malicious attacks, including FDIAs, and thus have good universality. However, this type of FDIA detection method requires a large amount of training time and depends heavily on training samples. Compared to FDIA detection methods on ML, detection methods on matrix decomposition or separation do not have the above-mentioned problems, but they are only suitable for detecting FDIAs.

However, intrusion detection alone is not enough; effective recovery methods for measurement data are also needed to provide complete and reliable measurement data for state estimation and control decisions in PCPSs. Indeed, designing state estimators that are resilient and robust to FDIAs can reduce the impact of attacks on the system. However, the design of anti-FDIA state estimators relies heavily on system modeling, and the modeling of complex PCPSs is very difficult, so a data-driven measurement data recovery method is a better choice. In addition, the detected false data is often discarded in PCPSs, which leads to incomplete data. When the discarded data reaches a certain scale, it seriously affects downstream applications; for example, incomplete power measurement data can affect state estimation and control decisions in supervisory control and data acquisition (SCADA). Therefore, how to recover the incomplete data has become a focus of attention, and this problem can be converted into a data imputation problem for multivariate time series (DIP-MTS).¹³

In general, DIP-MTS methods can be divided into two main categories: non-deep learning (DP)-based and DP-based methods. Non-DP-based methods include imputation with mean values, median values, clustering, etc.,¹⁴ while DP-based methods include matrix completion, recurrent neural network (RNN)-based methods (e.g. bidirectional recurrent imputation,^15,16 gated recurrent unit (GRU)-D¹⁷), and GAN-based methods. However, non-DP-based methods for DIP-MTS have difficulty capturing the complex nonlinear correlations in PCPSs, and their imputation errors are large when the missing rate is relatively high. Most DP-based methods (e.g. RNN, GRU-D, GAN) need complete data for training, which limits their application in PCPSs.^13,15 This paper focuses on how intrusion detection can assist in completing the DIP-MTS based on incomplete measurement data and a DP-based method.

The intrusion detection problem against FDIAs in PCPSs and the data imputation problem, as two relatively independent problems, have each received much attention. In this paper, the two problems are combined to address the intrusion detection, identification, and data recovery problems against FDIAs in PCPSs. Our main contributions are summarized as follows:

(1) A sparse targeted FDIA against PCPSs is constructed to deceive the traditional detector in the state estimator. A new data recovery strategy against FDIAs is presented to recover the measurement data of PCPSs, which combines FDIA detection and data imputation techniques based on the alternating direction method of multipliers (ADMM) and similar supervised generative adversarial imputation nets.

(2) The results of the proposed FDIA detection provide the mask information and real incomplete measurement data for data imputation. Moreover, k-means clustering and support vector machine (SVM) are introduced to improve the accuracy of measurement data recovery.

(3) Experiments on power cyber-physical systems under FDIAs are conducted and analyzed quantitatively and qualitatively to evaluate the effectiveness and superiority of our data recovery strategy.

The rest of this paper is organized as follows. In Section 2, the preliminaries, including state estimation and the modeling of false data injection attacks, are introduced. In Section 3, intrusion detection (ID) on machine learning with the tripartite separation data model is presented. Then, data imputation on the improved generative adversarial imputation networks is proposed in Section 4. In Section 5, the proposed integrated data recovery strategy against FDIAs is described. Finally, the experimental results and analyses of the data recovery strategy against FDIAs in PCPSs are presented in Section 6, and the conclusion is given in Section 7.

Preliminary

State estimation

Safe and reliable PCPSs require accurate state estimation to make follow-up control decisions. State estimation is the process of deriving the state variables from measurement values $z \in R^{m \times 1}$ (injected active powers of buses and active power flows of branches) in PCPSs, which can be based on the following linear (i.e. Direct Current, DC) form:

z = H θ + e

(1)

where $H \in R^{m \times n}$ denotes the Jacobian matrix determined by the physical structure of the power system, $θ \in R^{n \times 1}$ denotes the phase angles of bus voltages (the magnitudes of bus voltages are all taken as 1 p.u.), and $e \in R^{m \times 1}$ denotes measurement noise. Let $\hat{θ}$ be the best estimate of $θ$ obtained by the weighted least squares (WLS) method; the residual $r$ is then calculated as $r = z - H \hat{θ}$ . If $‖ r ‖_{2} = ‖ z - H \hat{θ} ‖_{2} > τ$ , where $τ$ is the preset threshold of the residual-based detector, then $z$ is identified as bad measurement data; otherwise it is regarded as real measurement data.
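As an illustration of the residual test described above, the following is a minimal NumPy sketch. The toy matrix `H`, the noise level, the threshold `tau`, and the function names are hypothetical choices for illustration only; they are not taken from the paper's experiments.

```python
import numpy as np

def wls_estimate(H, z, weights=None):
    """Weighted least squares estimate of theta in z = H theta + e."""
    m = H.shape[0]
    W = np.eye(m) if weights is None else np.diag(weights)
    # Solve the normal equations (H^T W H) theta = H^T W z.
    return np.linalg.solve(H.T @ W @ H, H.T @ W @ z)

def residual_alarm(H, z, tau, weights=None):
    """Return (alarm, theta_hat); alarm is True when ||z - H theta_hat||_2 > tau."""
    theta_hat = wls_estimate(H, z, weights)
    r = z - H @ theta_hat
    return bool(np.linalg.norm(r, 2) > tau), theta_hat

# Toy example (assumed sizes: 8 measurements, 3 states).
rng = np.random.default_rng(0)
H = rng.standard_normal((8, 3))
theta = rng.standard_normal(3)
z = H @ theta + 0.01 * rng.standard_normal(8)   # small measurement noise

alarm_clean, theta_hat = residual_alarm(H, z, tau=0.1)

z_bad = z.copy()
z_bad[0] += 5.0                                  # one gross, unstructured error
alarm_bad, _ = residual_alarm(H, z_bad, tau=0.1)
```

In this toy setting the clean measurement passes the test, while the single gross error raises the alarm because it does not lie in the column space of `H`.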

Modeling of false data injection attacks

According to the work on FDIAs first proposed in Liu et al.,¹⁸ a successful FDIA that passes the residual-based detector can be constructed as $a = Hc$ $(a \in R^{m \times 1})$ , where $a$ denotes the attack vector injected into measurement $z$ , that is, $z_{a} = z + a$ is received by the state estimator, and $c \in R^{n \times 1}$ denotes an arbitrary nonzero vector. Since $z_{a} = H (θ + c) + e$ , the estimate is shifted by $c$ while the residual remains unchanged.
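The stealth property of $a = Hc$ can be checked numerically. The sketch below uses an unweighted least squares estimator on a hypothetical random `H`; the sizes, seed, and names are assumptions made for illustration.

```python
import numpy as np

def estimate_and_residual(H, z):
    """Least squares state estimate and the 2-norm of its residual."""
    theta_hat, *_ = np.linalg.lstsq(H, z, rcond=None)
    return theta_hat, np.linalg.norm(z - H @ theta_hat)

rng = np.random.default_rng(1)
m, n = 10, 4                               # assumed toy dimensions
H = rng.standard_normal((m, n))
theta = rng.standard_normal(n)
z = H @ theta + 0.01 * rng.standard_normal(m)

c = rng.standard_normal(n)                 # arbitrary nonzero vector
a = H @ c                                  # attack vector a = Hc
z_a = z + a                                # measurement received by the estimator

theta_clean, r_clean = estimate_and_residual(H, z)
theta_attacked, r_attacked = estimate_and_residual(H, z_a)
```

The residual norm is the same before and after the attack, while the estimated state moves by exactly `c`, which is why a residual-based detector cannot see this attack.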

Considering the specific scenarios and limited conditions under which FDIAs are launched, the sparse FDIA was proposed as early as 2014 in Liu et al.,¹⁹ and the sparsity of an FDIA was defined as $‖ a ‖_{0} / m$ , where $‖ a ‖_{0}$ denotes the number of nonzero elements in $a$ and $m$ is the size of $a$ according to Hao et al.²⁰ In this section, a sparse target data-driven FDIA is constructed by adversaries. On the one hand, in view of the limited resources (such as limited energy) of adversaries, system protections, regular maintenance of power systems and the utilization of phasor measurement units (PMU),¹⁹ it is assumed that the adversaries cannot obtain the topology information of systems (i.e. $H$ ) and can only intercept and tamper with the measurement data during the communication process, and that at most $k$ measurement data can be contaminated. Therefore, only a sparse data-driven FDIA can be launched, which is similar to the blind FDIA.²¹ Here, data-driven means that adversaries can obtain approximate topology information of systems (i.e. $H_{apx}$ ) by using data analysis methods such as principal component analysis (PCA). On the other hand, because some buses are heavily protected by defenders, adversaries can only approach and tamper with the measurement data corresponding to the specified set $Λ$ (the state variable set of the target buses), i.e. the FDIA launched is not only sparse and data-driven but also targeted.
The attack vector can be written as $a = H_{apx} c_{1}$ $(c_{1} \in R^{n \times 1})$ if and only if $B_{apx}^{\bar{Λ}} a = y$ , where $\bar{Λ}$ denotes the set of non-target state variables, $H_{apx}$ denotes the approximate estimate of the Jacobian matrix $H$ obtained by PCA,²² $y = B_{apx}^{\bar{Λ}} b$ and $b = \sum_{j \in Λ} h_{j} c_{j}$ , $h_{j}$ denotes the $j$th column of $H$ , and $c_{j}$ denotes the $j$th element of $c$ ; $B_{apx}^{\bar{Λ}} = H_{apx}^{\bar{Λ}} {[{(H_{apx}^{\bar{Λ}})}^{T} H_{apx}^{\bar{Λ}}]}^{- 1} {(H_{apx}^{\bar{Λ}})}^{T} - I$ , and $H_{apx}^{\bar{Λ}}$ is the sub-matrix of $H_{apx}$ consisting of the columns whose indices are not in set $Λ$ . The attack strategy of adversaries can be transformed into solving the following optimization problem:

\begin{matrix} min ‖ a ‖_{1} \\ s . t . y = B_{apx}^{\bar{Λ}} * a \end{matrix}

(2)

where $‖ a ‖_{1}$ denotes the $l_{1}$ relaxation of $‖ a ‖_{0}$ . Equation (2) is a basis pursuit (BP) problem, which seeks the optimal sparse solution $a^{*}$ . To solve the above BP problem by the alternating direction method of multipliers (ADMM),²³ equation (2) can be transformed into the following form:

\begin{matrix} min h (a) + ‖ β_{a} ‖_{1} \\ s . t . a - β_{a} = 0 \end{matrix}

(3)

where $h (a)$ denotes the indicator function of the set $\{ a : y = B_{apx}^{\bar{Λ}} * a \}$ and $β_{a}$ is the optimization variable. The sparsity of $a$ can be controlled by weighting $‖ β_{a} ‖_{1}$ with a real scalar $λ_{a}$ . In addition, to improve the stealth of FDIAs, $‖ y - B_{apx}^{\bar{Λ}} * a ‖_{2}^{2}$ can be used as a cost function and the construction of the attack can be transformed into the following optimization problem:

\begin{matrix} min ‖ y - B_{apx}^{\bar{Λ}} * a ‖_{2}^{2} + λ_{a} ‖ β_{a} ‖_{1} \\ s . t . a - β_{a} = 0 \end{matrix}

(4)

Equation (4) is a least absolute shrinkage and selection operator (Lasso) problem that can be solved using ADMM. However, due to the limited resources of adversaries, at most $k$ measurement data can be tampered with, so the sparsity of $a$ actually depends on the parameter $k$ . Then, the following regression selection problem can be obtained:

\begin{matrix} min ‖ y - B_{apx}^{\bar{Λ}} * a ‖_{2}^{2} \\ s . t . ‖ a ‖_{0} \leq k \end{matrix}

(5)

where $‖ y - B_{apx}^{\bar{Λ}} * a ‖_{2}^{2}$ denotes the cost function. Moreover, to reduce the probability of attacks being detected, the cost function should be minimized. Similarly, (5) can still be solved by ADMM. The specific construction steps of the sparse targeted data-driven attack can be found in Algorithm 2 in Li et al.²² Here, the FDIA using the real Jacobian matrix $H$ is named H-FDIA, and the proposed sparse target data-driven FDIA is named Blind-FDIA, as shown in Figure 1. According to Figure 1, measurement data is transmitted from the remote terminal unit (RTU) to the state estimator in SCADA through communication networks, where it is susceptible to cyber-attacks such as FDIAs.
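To make the Lasso form in (4) concrete, the following is a minimal sketch of a standard scaled-form ADMM iteration for it. It is not Algorithm 2 of Li et al.²²; the penalty `rho`, the iteration count, the random matrix standing in for $B_{apx}^{\bar{Λ}}$, and the `keep_top_k` helper (a simple heuristic for enforcing $‖ a ‖_{0} \leq k$ as in (5)) are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, tau):
    """Element-wise soft-thresholding, the proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def admm_lasso(B, y, lam, rho=1.0, n_iter=300):
    """ADMM for  min ||y - B a||_2^2 + lam * ||beta||_1  s.t.  a - beta = 0."""
    n = B.shape[1]
    beta = np.zeros(n)
    u = np.zeros(n)                                   # scaled dual variable
    lhs = 2.0 * B.T @ B + rho * np.eye(n)
    rhs_const = 2.0 * B.T @ y
    for _ in range(n_iter):
        # a-update: minimize the quadratic term plus the augmented penalty.
        a = np.linalg.solve(lhs, rhs_const + rho * (beta - u))
        # beta-update: soft-thresholding enforces sparsity.
        beta = soft_threshold(a + u, lam / rho)
        # dual update on the constraint a - beta = 0.
        u = u + a - beta
    return beta

def keep_top_k(a, k):
    """Heuristic: zero all but the k largest-magnitude entries of a."""
    out = np.zeros_like(a)
    if k > 0:
        idx = np.argsort(np.abs(a))[-k:]
        out[idx] = a[idx]
    return out

rng = np.random.default_rng(0)
B = rng.standard_normal((30, 10))        # hypothetical stand-in matrix
a_true = np.zeros(10)
a_true[[2, 7]] = [3.0, -3.0]
y = B @ a_true

beta = admm_lasso(B, y, lam=0.1)
a_k = keep_top_k(beta, k=2)              # at most k contaminated entries
```

The soft-thresholding step is what produces sparsity; a larger `lam` yields a sparser attack vector at the cost of a larger residual $‖ y - B a ‖_{2}^{2}$.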

Figure 1.

Two FDIA models against the state estimation in power cyber-physical systems.

Intrusion detection (ID) on machine learning

Transformation problem of ID

Data imputation can repair abnormal measurement data in PCPSs, but it requires the locations of the tampered data, which are obtained through intrusion detection. Therefore, an appropriate intrusion detection method against FDIAs is needed to provide reliable location information for the tampered data.

Firstly, due to system protection and the limited ability of adversaries, adversaries can only tamper with part of the measurement data if the FDIAs are to evade traditional detectors; hence the FDIAs are generally sparse. Secondly, when PCPSs operate stably, the measurements and system states change slowly over a short time, so the matrix composed of measurement data is generally low rank. Therefore, as a mathematical method of machine learning, a matrix separation algorithm, which has been used specifically to detect FDIAs,^7,19 can be introduced to complete the detection of FDIAs; that is, the FDIA detection problem can be transformed into the following convex matrix separation problem:

min_{L, S} ‖ L ‖_{*} + λ ‖ S ‖_{1}, s . t ., Z = L + S

(6)

where $Z$ denotes a matrix composed of the tampered measurement data. $L$ denotes a low rank matrix composed of the untampered measurement data, and $‖ L ‖_{*}$ denotes nuclear norm of $L$ ; $S$ denotes a sparse matrix composed of the false deviation data, and $‖ S ‖_{1}$ denotes $l_{1}$ norm of $S$ .

Remark 1. It should be noted that the tampered measurement data $Z$ in equation (6) is only separated into clean measurement data $L$ and injected sparse false data (bias) $S$ , without considering the measurement noise present in the actual measurement process. Therefore, the separated matrices $L$ and $S$ may have a certain degree of errors.

Intrusion detection on tripartite separation data model

Considering the impact of measurement noise on intrusion detection, an ADMM on proximal exchange (ADMM-PE) method for tripartite separation data model is proposed as a solution to the FDIA detection problem:

\begin{matrix} min_{L, S, E} Rank (L) \\ s . t ., Z = L + S + E \end{matrix}

(7)

where $Rank (L)$ denotes the rank of $L$ , $E$ is a measurement noise matrix.

Moreover, a proximal operator is introduced to obtain the optimal solution of (7) in the tripartite separation, which converts the solution of (7) into a distributed and parallel method on ADMM.²⁴ Specifically, the proximal operator $Π_{C}$ can be evaluated and projected onto the set $C = \{ L, S, E \}$ , and has the following form according to the equilibrium constraint in²⁵:

Π_{C} (L, S, E) = (L, S, E) - Λ + (\frac{1}{N}) Z

(8)

where $Λ = (L + S + E) / N$ , $N$ denotes the number of separated matrices. Here $N = 3$ owing to tripartite separation algorithm. Obviously, the entries of $Λ$ are the average values of the corresponding entries of these $N$ matrices. The sparse part $S$ in (7) can be obtained:

S_{k + 1} = {prox}_{sto}^{α} (S_{k} - Λ_{k} + (\frac{1}{N}) Z - P_{k})

(9)

where ${prox}_{sto}^{α} (\cdot)$ is a soft threshold operator, which can be obtained with ${prox}_{sto}^{α} (ζ) = sign (ζ) \cdot \max \{ | ζ | - α, 0 \}$ . Then, $L$ can be obtained and updated with singular value decomposition (SVD):

UW V^{T} = SVD (L_{k} - Λ_{k} + (\frac{1}{N}) Z - P_{k})

(10)

L_{k + 1} = U diag \{ {prox}_{sto}^{β} [diag (W)] \} V^{T}

(11)

where $diag (\cdot)$ is a function that generates diagonal matrices, and $W \in R^{m \times t}$ is a rectangular diagonal matrix. $U$ and $V$ are orthogonal matrices of order $m$ and $t$ , respectively. The parameters $α$ and $β$ in (9) and (11) control the soft threshold operator ${prox}_{sto}$ so that it approaches the optimal mapping, and can be obtained as follows:

\begin{matrix} α = 0.15 {‖ Z ‖}_{\infty} \\ β = 0.15 {‖ Z ‖}_{2} \end{matrix}

(12)

In addition, $E$ and Lagrange multiplier matrix $P$ can be iteratively updated according to the following way:

E_{k + 1} = {prox}_{so} (E_{k} - Λ_{k} + (\frac{1}{N}) Z - P_{k})

(13)

P_{k + 1} = P_{k} + Λ_{k + 1} - (\frac{1}{N}) Z

(14)

where ${prox}_{so}$ denotes the shrinkage operator²⁴ that can be expressed as:

{prox}_{so} (ϑ) = \frac{1}{1 + γ} ϑ

(15)

Finally, through continuous iteration, sparse attacks $S$ are obtained as the key input information for the next step of data imputation.

Remark 2. In this subsection, it is apparent that the intrusion detection method on ADMM-PE brings some advantages to obtain the optimal solution of equation (7): each component can be independently solved, and only the current matrices $Λ$ and $P_{k}$ are required in the process of matrix separation.
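The updates (9)-(15) can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: $‖ Z ‖_{\infty}$ is taken as the largest absolute entry of $Z$, $‖ Z ‖_{2}$ as the spectral norm, `gamma` defaults to 1, all matrices (including $P_{0}$) start from zero, and the 100 iterations follow the loop bound used in Algorithm 1. The function and variable names are hypothetical.

```python
import numpy as np

def soft_threshold(x, tau):
    """Soft threshold operator: sign(x) * max(|x| - tau, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def admm_pe(Z, gamma=1.0, n_iter=100):
    """Tripartite separation Z ~ L + S + E following updates (9)-(15)."""
    Z = np.asarray(Z, dtype=float)
    N = 3                                         # number of separated matrices
    L = np.zeros_like(Z)
    S = np.zeros_like(Z)
    E = np.zeros_like(Z)
    P = np.zeros_like(Z)                          # Lagrange multiplier matrix
    alpha = 0.15 * np.max(np.abs(Z))              # (12), entry-wise max assumed
    beta = 0.15 * np.linalg.norm(Z, 2)            # (12), spectral norm assumed
    for _ in range(n_iter):
        Lam = (L + S + E) / N                     # average of the three parts
        shift = -Lam + Z / N - P                  # common term in (9), (10), (13)
        S_new = soft_threshold(S + shift, alpha)                  # (9)
        U, w, Vt = np.linalg.svd(L + shift, full_matrices=False)  # (10)
        L_new = (U * soft_threshold(w, beta)) @ Vt                # (11)
        E_new = (E + shift) / (1.0 + gamma)                       # (13), (15)
        L, S, E = L_new, S_new, E_new
        P = P + (L + S + E) / N - Z / N                           # (14)
    return L, S, E

# Toy data: rank-2 background, a few sparse spikes, small noise (all assumed).
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 30))
spikes = np.zeros((20, 30))
spikes[rng.integers(0, 20, 10), rng.integers(0, 30, 10)] = 8.0
Z_toy = low_rank + spikes + 0.01 * rng.standard_normal((20, 30))

L_hat, S_hat, E_hat = admm_pe(Z_toy)
```

Each of the three updates depends only on the shared term `shift`, which reflects the independence noted in Remark 2: the components can be computed in parallel once $Λ_{k}$ and $P_{k}$ are known.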

Data imputation on the improved generative adversarial imputation networks

On the one hand, results of FDIA detection provide a binary mask matrix $M \in R^{m \times n}$ for data imputation, which reflects the missing rate $(MR)$ of measurement data:

\begin{matrix} MR = 1 - \frac{1}{m \times n} \sum_{i = 1}^{m} \sum_{j = 1}^{n} M_{ij} \\ {MR}_{j} = 1 - \frac{1}{m} \sum_{i = 1}^{m} M_{ij} \end{matrix}

(16)

where $M_{ij} = \begin{cases} 1, & S_{ij} = 0 \\ 0, & S_{ij} \neq 0 \end{cases}$ , that is, $M_{ij} = 1$ if the corresponding measurement data is determined as true data by the proposed FDIA detection $(S_{ij} = 0)$ , otherwise $M_{ij} = 0$ . On the other hand, due to the sparsity of the proposed FDIA, the missing rate $MR \times 100 \%$ of the power measurement data will not exceed 50%. Hence, the data imputation method for multivariate time series (DIM-MTS) on Generative Adversarial Imputation Nets (GAIN),¹⁵ which has been proven to have good data recovery performance under low $MR$s,²⁶ is introduced to recover the incomplete measurement data after FDIA detection.
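The mask and missing rates can be computed directly from the separated sparse matrix. The sketch below is illustrative: the tolerance `tol` (used instead of an exact comparison with zero) is an assumption, and the per-sample rate is taken as the fraction of flagged entries in column $j$, i.e. each column is treated as one sampling instant.

```python
import numpy as np

def mask_from_sparse(S, tol=1e-8):
    """M_ij = 1 where S_ij is (numerically) zero, 0 where an attack was detected."""
    return (np.abs(S) <= tol).astype(float)

def missing_rates(M):
    """Overall missing rate and the missing rate of each column (sample)."""
    overall = 1.0 - M.mean()
    per_sample = 1.0 - M.mean(axis=0)
    return overall, per_sample

S_example = np.array([[0.0, 0.0, 0.7],
                      [0.0, -0.2, 0.0]])
M_example = mask_from_sparse(S_example)          # [[1, 1, 0], [1, 0, 1]]
mr, mr_j = missing_rates(M_example)              # 1/3 and [0, 0.5, 0.5]
```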

Basic GAIN

GAIN includes two generators and one discriminator, adding a hint generator on the top of the traditional GAN. In GAIN, incomplete measurement data $\tilde{X}$ , the binary mask matrix $M$ and random matrix $Z$ are the inputs of the generator $G$ , then the output $X_{G}$ of $G$ can be obtained:

\begin{matrix} X_{G} = G (\tilde{X}, M, (1 - M) ⊙ Z) \\ \hat{X} = M ⊙ \tilde{X} + (1 - M) ⊙ X_{G} \end{matrix}

(17)

where $\hat{X}$ denotes the complete measurement data imputed by generator $G$ , and $⊙$ denotes the Hadamard multiplication. Then, with the additional information provided by the hint generator, the discriminator $D$ tries to distinguish which entries of $\hat{X}$ are truly observed and which are imputed. The output of $D$ is a probability matrix as follows:

P_{D} = D (\hat{X}, H)

(18)

where the entries of $P_{D}$ denote the probabilities that the entries of $\hat{X}$ are identified as the true observations. The information in $H$ is the output of the hint generator.

The biggest feature of GAIN is the introduction of the hint generator,^15,27 the output $H$ can be calculated as follows:

H = B ⊙ M + 0.5 (1 - B)

where $B = (B_{1}, \dots, B_{n})$ is obtained by first sampling $k$ from {1, …, $n$ } uniformly at random and then setting $B_{j} = 1$ if $j \neq k$ , and $B_{j} = 0$ otherwise.
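The combination step in (17) and the hint construction can be sketched for a single sample as follows. The generator itself is a neural network and is not implemented here; `x_g` is a hypothetical placeholder for its output, and the function names are assumptions.

```python
import numpy as np

def impute(x_tilde, x_g, mask):
    """Second line of (17): keep observed entries, fill the rest from the generator."""
    return mask * x_tilde + (1.0 - mask) * x_g

def hint_vector(mask, rng):
    """H = B * M + 0.5 * (1 - B), with B_j = 0 only at one random index k."""
    n = mask.shape[0]
    k = int(rng.integers(n))            # sample k uniformly (0-based here)
    b = np.ones(n)
    b[k] = 0.0
    return b * mask + 0.5 * (1.0 - b), k

rng = np.random.default_rng(0)
mask = np.array([1.0, 0.0, 1.0, 1.0])
x_tilde = np.array([0.3, 0.0, 0.8, 0.5])     # incomplete sample, missing entry zeroed
x_g = np.array([0.9, 0.4, 0.1, 0.2])         # placeholder generator output

x_hat = impute(x_tilde, x_g, mask)
hint, k = hint_vector(mask, rng)
```

The hint reveals the true mask everywhere except at index `k`, where the value 0.5 tells the discriminator nothing, so the discriminator must actually decide whether that entry is observed or imputed.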

The goals of the discriminator $D$ and generator $G$ can be mathematically described as

max_{D} \frac{1}{k_{D}} \sum_{κ = 1}^{k_{D}} L_{D} (M^{κ}, P_{D}^{κ})

(19)

min_{G} \frac{1}{k_{G}} \sum_{κ = 1}^{k_{G}} [L_{G} (M^{κ}, P_{D}^{κ}) + α L_{R} ({\tilde{X}}^{κ}, {\hat{X}}^{κ})]

(20)

where $k_{D}$ and $k_{G}$ are the mini-batch size of $D$ and $G$ , respectively. $L_{D}$ , $L_{G}$ , and $L_{R}$ denote the loss terms:

L_{D} (M, P_{D}) = M \log P_{D} + (1 - M) \log (1 - P_{D})

(21)

L_{G} (M, P_{D}) = - (1 - M) \log P_{D}

(22)

L_{R} ({\tilde{X}}^{κ}, {\hat{X}}^{κ}) = \sum_{i = 1}^{n} M_{i} L_{R} ({\tilde{x}}_{i}, {\hat{x}}_{i})

(23)

with

$L_{R} ({\tilde{x}}_{i}, {\hat{x}}_{i}) = \begin{cases} ({\tilde{x}}_{i} - {\hat{x}}_{i})^{2}, & \text{for numerical variables} \\ - {\tilde{x}}_{i} \log {\hat{x}}_{i}, & \text{for binary variables} \end{cases}$ and $α$ is a weight parameter. Then, generator $G$ and discriminator $D$ continue to play the zero-sum game, ultimately achieving an equilibrium.
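The loss terms (21)-(23) can be written out numerically as below. This sketch makes several assumptions that are not stated in the text: the terms are averaged over entries rather than summed, a small `eps` guards the logarithms, only the numerical-variable case of $L_{R}$ is covered, and the reconstruction term is evaluated on the raw generator output $X_{G}$ (as is common in GAIN implementations), because $\hat{X}$ coincides with $\tilde{X}$ on observed entries by (17).

```python
import numpy as np

def discriminator_term(M, P_D, eps=1e-8):
    """Mean of (21): M log P_D + (1 - M) log(1 - P_D); the discriminator maximizes it."""
    return float(np.mean(M * np.log(P_D + eps) + (1.0 - M) * np.log(1.0 - P_D + eps)))

def generator_adversarial_loss(M, P_D, eps=1e-8):
    """Mean of (22): -(1 - M) log P_D, small when imputed entries fool D."""
    return float(np.mean(-(1.0 - M) * np.log(P_D + eps)))

def reconstruction_loss(M, X_tilde, X_G):
    """Squared error on observed entries only (numerical variables)."""
    return float(np.sum(M * (X_tilde - X_G) ** 2) / max(M.sum(), 1.0))

M = np.array([[1.0, 0.0]])
P_D = np.array([[0.5, 0.5]])            # an undecided discriminator
X_tilde = np.array([[1.0, 0.0]])
X_G = np.array([[0.5, 3.0]])            # placeholder generator output

l_d = discriminator_term(M, P_D)
l_g = generator_adversarial_loss(M, P_D)
l_r = reconstruction_loss(M, X_tilde, X_G)
```

Only the first entry is observed, so `l_r` depends on the error at that entry alone; the generator's value at the missing entry is judged solely through the adversarial term.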

Similar supervised GAIN

GAIN is an unsupervised machine learning algorithm. In order to improve the accuracy of measurement data recovery, a clustering algorithm is introduced to provide potential category information for GAIN, transforming it from an unsupervised into a similar supervised algorithm.²⁶

At first, the sampling measurement data is arranged in ascending order of ${MR}_{j}$ , and the first $δ n$ sampling measurement data with low ${MR}_{j}$ are selected to pre-train the generator and discriminator, after which the imputed data can be obtained through (17), where $0 < δ < 0.8$ . Note that $δ$ should be set small if the ${MR}_{j}$ of all sampling measurement data is large; otherwise, a larger value should be set.

In the second step, k-means clustering algorithm is introduced to data analysis, forming the pseudo-labels.²⁷ Next, as a classifier, support vector machine (SVM) is trained with the pseudo-labels and the imputed data.

Then, with the information from SVM, the generator and discriminator play multiple games again. The goal of the discriminator is the same as that of the discriminator in basic GAIN (as shown in (19)), and the goal of the generator is transformed as follows:

min_{G} \frac{1}{k_{n}} \sum_{κ = 1}^{k_{n}} [L_{G} (M^{κ}, P_{D}^{κ}) + α L_{R} ({\tilde{X}}^{κ}, {\hat{X}}^{κ}) + β L_{C} ({\hat{X}}^{κ})]

where $k_{n}$ is the mini-batch size of $G$ , $L_{C} (\hat{X}) = - C (\hat{X}) \log C (\hat{X})$ , $C (\hat{X})$ denotes the output of the SVM, and $β$ is also a weight parameter. As the data recovery result, the output $\hat{X}$ of the generator $G$ after the zero-sum game is the final imputed measurement data. According to Wang et al.,²⁷ the similar supervised GAIN algorithm is named pseudo-label conditional GAIN (PC-GAIN).
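The classifier term $L_{C}$ and the combined generator objective can be illustrated as follows. The sketch assumes that $C (\hat{X})$ is available as a vector of class probabilities per sample (one row per sample), which is an interpretation rather than something stated in the text; the weights $α = 800$ and $β = 80$ are the PC-GAIN settings reported in the experiments, while the individual loss values are made-up numbers.

```python
import numpy as np

def classifier_entropy(probs, eps=1e-12):
    """L_C = -sum_c C log C, averaged over samples (rows are class probabilities)."""
    probs = np.asarray(probs, dtype=float)
    return float(np.mean(-np.sum(probs * np.log(probs + eps), axis=1)))

def generator_objective(l_g, l_r, l_c, alpha, beta):
    """Combined generator loss: L_G + alpha * L_R + beta * L_C."""
    return l_g + alpha * l_r + beta * l_c

uncertain = classifier_entropy([[0.5, 0.5]])   # classifier cannot decide
confident = classifier_entropy([[1.0, 0.0]])   # classifier is certain

total = generator_objective(l_g=1.0, l_r=0.5, l_c=0.2, alpha=800, beta=80)
```

The entropy is largest when the classifier is undecided and close to zero when it is certain, so penalizing it pushes the generator toward imputations that fall clearly into one of the pseudo-label classes.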

The proposed data recovery strategy against FDIAs

For FDIAs in PCPSs, a data recovery strategy consisting of two parts is proposed. The first part is the intrusion detection on the proposed ADMM-PE, which completes the separation and detection of the sparse FDIAs through the tripartite separation technique and proximal exchange, and provides the low rank measurement data, the noise and the sparse attack data. Then, the incomplete measurement data and the mask information for subsequent measurement data imputation can be obtained from the tampered location information contained in the sparse attack data. The intrusion detection on the proposed ADMM-PE is summarized in Algorithm 1:

Algorithm 1 The intrusion detection on the proposed ADMM-PE
Input: Measurement matrix $Z$ , $P_{0}$ .
Output: $L, S, E$ .
1: $k = 0$ , calculate parameters $α$ and $β$ according to (12).
2: while $k < 100$ do
3: The sparse matrix $S_{k + 1}$ is estimated according to (9);
4: The low rank matrix $L_{k + 1}$ is estimated according to (10) and (11);
5: The noise $E_{k + 1}$ is estimated according to (13);
6: The Lagrange multiplier matrix $P_{k + 1}$ is iteratively updated according to (14);
7: $k = k + 1$
8: end while

The second part is data imputation, in which a similar supervised GAIN algorithm is proposed to restore power measurement data, and provide reliable and complete data for state estimation. The proposed data recovery strategy against FDIAs is shown in Figure 2.

Figure 2.

The proposed data recovery strategy against FDIAs.

According to Figure 2, once the FDIAs are launched, the intrusion detection method on ADMM-PE can detect and identify the tampered locations, generating the mask information. Then, the tampered measurement data is discarded, resulting in incomplete measurement data $Z_{inc}$ , which can be calculated as follows:

Z_{inc} = Z ⊙ M

(24)
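Equation (24) is a single element-wise product. The short sketch below uses made-up values and also shows a NaN-marked variant, which is an assumption about how discarded entries might be represented as null rather than as 0.

```python
import numpy as np

Z = np.array([[1.0, 2.0],
              [3.0, 4.0]])
M = np.array([[1.0, 0.0],
              [1.0, 1.0]])              # 0 marks an entry flagged as tampered

Z_inc = Z * M                           # (24): discarded entries become 0
Z_nan = np.where(M == 1, Z, np.nan)     # alternative: mark discarded entries as NaN
```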

The results of the intrusion detection including mask information and incomplete measurement data will provide the input information for subsequent data imputation process. Considering that the FDIAs are sparse attacks, the proposed similar supervised GAIN algorithm is introduced to complete the task of data recovery, which is not only suitable for the recovery of low MR data, but also improves the accuracy of data recovery with the help of pseudo labels. The specific steps of the data imputation are as follows:

(1) Pre-training: select a subset of measurement data samples with low missing rates to pre-train the generator and discriminator, and obtain the pre-imputed (measurement) data through GAIN;

(2) K-means clustering is introduced to generate the pseudo-labels on the pre-imputed data; then an auxiliary classifier on SVM is trained with the pseudo-labels and the pre-imputed data;

(3) Formal training: all measurement data samples are used to train the generator and the discriminator in GAIN, while the classification information obtained by the SVM is used to constrain the generator and force it to learn features from different classes. In the end, the final imputed measurement data is obtained through the games between the generator and the discriminator.

Apparently, the quality of the generated pseudo-labels affects the performance of the generator and discriminator by affecting the performance of the auxiliary classifier.
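The pseudo-label generation in step (2) can be sketched with a small NumPy implementation of Lloyd's k-means. This is only an illustration: the farthest-point initialization, the fixed iteration count, and the convention that each row of `X` is one pre-imputed sample are assumptions, and the SVM training on the resulting labels is not shown.

```python
import numpy as np

def kmeans_pseudo_labels(X, k, n_iter=50):
    """Cluster the rows of X with Lloyd's algorithm and return (labels, centers)."""
    X = np.asarray(X, dtype=float)
    # Farthest-point initialization: start from the first sample, then
    # repeatedly add the sample farthest from the centers chosen so far.
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Assign each sample to its nearest center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Move each center to the mean of its assigned samples.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Two well-separated groups of hypothetical pre-imputed samples.
rng = np.random.default_rng(0)
group_a = 0.1 * rng.standard_normal((20, 3))
group_b = 10.0 + 0.1 * rng.standard_normal((20, 3))
X_pre = np.vstack([group_a, group_b])

pseudo_labels, centers = kmeans_pseudo_labels(X_pre, k=2)
```

The returned labels would then serve as training targets for the auxiliary classifier, which is why poorly separated clusters translate directly into a weaker constraint on the generator.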

Experiments

We validate the performance of the proposed data recovery strategy (the proposed DRS) on IEEE 14-bus power system extracted from the MATPOWER toolbox, where measurement data is generated by direct current power flow operations. Furthermore, we assume that adversaries can replace true measurement data with tampered measurement data through the communication link to deceive the state estimator in SCADA.

Experiments for intrusion detection on traditional chi square

In this section, we first verify the stealth of FDIAs against traditional residual-based detector. If the attack density of the sparse attacks (i.e. the sparsity of FDIAs) is 6% (all subsequent experiments have the same attack density), traditional residual-based detector cannot detect the H-FDIA and Blind-FDIA well, as shown in Figure 3, where the true positive rate is defined as follows:

P_{d} = \frac{N_{Hit}}{N_{Hit} + N_{Miss}}

Figure 3.

The results of chi square detection against H-FDIA and Blind-FDIA.

$N_{Hit}$ denotes the number of tampered measurement data that can be successfully detected, $N_{Miss}$ denotes the number of tampered measurement data that have not been detected. According to Figure 3, traditional Chi Square detector is completely unable to detect H-FDIA and can occasionally detect Blind-FDIA, hence, we replace $H_{apx}$ with a real $H$ to improve the concealment of the proposed sparse target FDIA (pFDIA) in section “Modeling of False Data Injection Attacks.”

Experiments for intrusion detection on tripartite separation model

Firstly, in order to evaluate the detection performance of FDIA detection methods, in addition to $P_{d}$ , the false alarm rate $(P_{f})$ , accuracy $(P_{i})$ , and comprehensive evaluation index $F1$ are also introduced:

P_{f} = \frac{N_{False}}{N_{False} + N_{Corred}}

(25)

P_{i} = \frac{N_{Hit} + N_{Corred}}{N_{Hit} + N_{Corred} + N_{Miss} + N_{False}}

(26)

F1 = \frac{2 \times P_{i} \times P_{d}}{P_{i} + P_{d}}

(27)

where $N_{False}$ denotes the number of false reports of the attack-free locations and $N_{Corred}$ denotes the number of correct reports of the attack-free locations.
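The four indicators can be computed from two boolean arrays, one marking the truly attacked locations and one marking the locations flagged by the detector. The sketch follows the definitions of $P_{d}$ and (25)-(27) as written, so $F1$ here combines $P_{i}$ and $P_{d}$; the function name and the toy arrays are hypothetical, and both attacked and attack-free locations are assumed to be present.

```python
import numpy as np

def detection_metrics(attacked, flagged):
    """Return (P_d, P_f, P_i, F1) from boolean ground truth and detector output."""
    attacked = np.asarray(attacked, dtype=bool)
    flagged = np.asarray(flagged, dtype=bool)
    n_hit = np.sum(attacked & flagged)          # tampered and detected
    n_miss = np.sum(attacked & ~flagged)        # tampered but not detected
    n_false = np.sum(~attacked & flagged)       # attack-free but reported
    n_correct = np.sum(~attacked & ~flagged)    # attack-free and not reported
    p_d = n_hit / (n_hit + n_miss)
    p_f = n_false / (n_false + n_correct)
    p_i = (n_hit + n_correct) / (n_hit + n_correct + n_miss + n_false)
    f1 = 2 * p_i * p_d / (p_i + p_d)
    return float(p_d), float(p_f), float(p_i), float(f1)

attacked = [1, 1, 0, 0, 0]
flagged = [1, 0, 1, 0, 0]
p_d, p_f, p_i, f1 = detection_metrics(attacked, flagged)
```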

Four machine learning methods on matrix separation are compared in terms of FDIA detection performance against pFDIA in the IEEE 14-bus power system under different signal-to-noise ratios (SNRs): inexact augmented Lagrange multipliers (IALM), low rank matrix factorization (LMaFit),¹⁹ a fast Go Decomposition (GoDec),⁷ and the proposed intrusion detection on ADMM-PE. The sampling time is set to $n = 200$ . The parameters for IALM are set as follows: the tolerance for the stopping criterion is $10^{- 6}$ , the maximum number of iterations is 50, and the weight on the sparse error term in the cost function is $m^{- 0.5}$ . The parameters for LMaFit are set as follows: the penalty parameter is 1:50, the maximum number of iterations is 50, and the initial rank estimate is $n \times 25 \%$ . The power scheme modification in GoDec is 6.

Table 1 shows the performance of the FDIA detection methods in the IEEE 14-bus power system under different SNRs, where bold font marks the proposed algorithm and the best-performing data. According to Table 1, IALM has the best performance on $P_{d}$ but performs poorly on the other three indicators. Regardless of whether the SNR is 30 dB or 18 dB, ADMM-PE has the best performance on $P_{f}$ , $P_{i}$ , and F1, followed by LMaFit. The proposed intrusion detection method on ADMM-PE is thus robust against noise interference and can accurately detect the positions of the tampered data, even when the SNR is equal to 18 dB.

Table 1.

The performance of FDIA detection methods in IEEE 14-bus power system.

	SNR = 30 dB
	$P_{d}$ (%)	$P_{f}$ (%)	$P_{i}$ (%)	F1 (%)
IALM	100.00	22.09	78.06	87.68
LMaFit	83.93	0.53	99.36	90.95
GoDec	83.75	29.03	71.07	76.79
ADMM-PE	96.76	0.05	99.93	98.31
	SNR = 18 dB
	$P_{d}$ (%)	$P_{f}$ (%)	$P_{i}$ (%)	F1 (%)
IALM	100.00	22.07	78.0	87.69
LMaFit	96.27	0.61	53.07	97.78
GoDec	88.98	28.99	71.14	79.03
ADMM-PE	96.89	0.03	99.95	98.38

Then, the intrusion detection results are post-processed: the tampered measurement data are discarded by setting the corresponding entries to 0 or null, so that the incomplete measurement data are obtained according to (24), and the corresponding mask information is generated for use in the data recovery process in the next section.
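This post-processing step can be sketched as follows, assuming the detector returns a boolean matrix that marks the tampered entries and that (24) takes the element-wise form $Z_{inc} = M \odot Z$. The names `build_mask_and_incomplete` and `attacked` are illustrative and do not come from the paper.

```python
import numpy as np


def build_mask_and_incomplete(z, attacked):
    """Build the mask M and the incomplete data Z_inc from detection results.

    z        : (m, n) array of received measurements
    attacked : (m, n) boolean array, True where tampering was detected (assumed input)
    """
    attacked = np.asarray(attacked, dtype=bool)
    # Mask convention assumed here: 1 = trusted entry, 0 = discarded entry.
    mask = (~attacked).astype(float)
    # Discarded entries are set to 0; np.where avoids propagating NaN/inf from tampered values.
    z_inc = np.where(attacked, 0.0, np.asarray(z, dtype=float))
    # Overall fraction of discarded entries.
    missing_rate = 1.0 - mask.mean()
    return mask, z_inc, missing_rate


z = np.array([[1.0, 2.0], [3.0, 4.0]])
attacked = np.array([[False, True], [False, False]])
mask, z_inc, missing_rate = build_mask_and_incomplete(z, attacked)
```

The mask and the incomplete matrix produced this way are the two inputs that the imputation stage relies on.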

Experiments for measurement data recovery

Data imputation is adopted to complete the measurement data recovery in this paper. According to Figure 2, the dataset consisting of $Z$, $M$, and the part of $Z_{inc}$ with low $MR_{j}$ obtained by intrusion detection is used for pre-training, and the dataset consisting of $Z$, $M$, and all of $Z_{inc}$ is then used for formal training. We compare PC-GAIN with three baselines: GAIN, K-nearest neighbor (KNN) imputation,²⁸ and mean-value imputation (MEAN). KNN and MEAN are traditional methods, while PC-GAIN and GAIN are deep learning methods. Since both PC-GAIN and GAIN are built on a generative adversarial network framework, the same network structure is used for both in the comparison.
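For orientation, the simplest baseline (MEAN) can be sketched as below. This is an illustrative version that assumes rows are samples and columns are measurement channels, and that fills each missing entry with the mean of the observed entries in its column; the exact variant used in the experiments is not specified in the text.

```python
import numpy as np


def mean_impute(z_inc, mask):
    """Fill entries where mask == 0 with the mean of the observed entries of the same column."""
    z_inc = np.asarray(z_inc, dtype=float)
    observed = np.asarray(mask).astype(bool)
    counts = observed.sum(axis=0)
    sums = np.where(observed, z_inc, 0.0).sum(axis=0)
    # Columns without any observed entry fall back to 0 (an arbitrary choice for this sketch).
    col_mean = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
    # Observed entries are kept unchanged; only missing ones are replaced.
    return np.where(observed, z_inc, col_mean)


z_inc = np.array([[1.0, 0.0], [3.0, 4.0], [5.0, 6.0]])
mask = np.array([[1, 0], [1, 1], [1, 1]])
filled = mean_impute(z_inc, mask)
```

In this toy input the single missing entry is replaced by 5.0, the mean of the two observed values 4.0 and 6.0 in its column.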

For the comparison experiments, the parameters are tuned one at a time: while one parameter is varied manually, the others are held fixed. The experiment for each parameter group is repeated 20 times and the average imputation result is recorded, and the best group is selected as the final setting. In this way, the parameters of PC-GAIN are set to $α = 800$, $β = 80$, cluster number $= 6$, $k_{D} = k_{n} = 64$, and epoch $= 1500$, and those of GAIN to $α = 100$, $k_{D} = k_{G} = 16$, and epoch $= 3000$. To make the comparison fair, all models are tested and evaluated under the same experimental conditions: imputation is performed at the same positions, PC-GAIN and GAIN use the same generator and discriminator structure, and each experiment is repeated 20 times with the same parameters.
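One possible reading of this tuning procedure is a one-parameter-at-a-time sweep with repeated runs, sketched below. The function `tune_one_at_a_time`, the callback `evaluate`, and the synthetic objective `toy_rmse` are hypothetical stand-ins; in the actual experiments each evaluation would be a full training and imputation run.

```python
import random
from statistics import mean


def tune_one_at_a_time(evaluate, defaults, grid, repeats=20, seed=0):
    """Sweep each parameter in turn while the others stay fixed.

    evaluate(params, rng) must return the RMSE of a single run.
    """
    rng = random.Random(seed)
    best = dict(defaults)
    for name, candidates in grid.items():
        scores = {}
        for value in candidates:
            trial = {**best, name: value}  # vary one parameter only
            # Average over repeated runs to smooth out run-to-run randomness.
            scores[value] = mean(evaluate(trial, rng) for _ in range(repeats))
        best[name] = min(scores, key=scores.get)  # keep the lowest average RMSE
    return best


def toy_rmse(params, rng):
    # Synthetic objective for demonstration only; not a result from the paper.
    return (abs(params["alpha"] - 800) / 1000
            + abs(params["beta"] - 80) / 100
            + rng.uniform(0, 1e-3))


best = tune_one_at_a_time(
    toy_rmse,
    defaults={"alpha": 100, "beta": 10},
    grid={"alpha": [100, 400, 800], "beta": [10, 80, 160]},
)
```

A sweep of this kind is cheap compared with a full grid search, but it can miss interactions between parameters.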

First, we compare the performance of these algorithms under different missing rates (MRs), as shown in Figure 4. The root mean square error (RMSE) between the imputed values and the true values is used as the evaluation metric; it represents the average deviation of the imputed values from the true values and hence the imputation accuracy of measurement data recovery. The smaller the RMSE, the better the data imputation model performs, that is, the better the measurement data recovery. RMSE is defined as follows:

RMSE = \sqrt{\frac{1}{m \times n} \sum_{i = 1}^{m} \sum_{j = 1}^{n} {(L_{ij}^{true} - {\hat{X}}_{ij})}^{2}}

where $L^{true}$ denotes the data matrix composed of real power measurement data and $\hat{X}$ denotes the complete measurement data imputed by generator $G$.
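The RMSE defined above translates directly into a few lines of NumPy. The sketch below follows the formula as written, averaging over all $m \times n$ entries; the function name `rmse` and the toy matrices are illustrative.

```python
import numpy as np


def rmse(l_true, x_hat):
    """Root mean square error between the true matrix and the imputed matrix."""
    l_true = np.asarray(l_true, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    # Mean of squared differences over all m x n entries, then square root.
    return float(np.sqrt(np.mean((l_true - x_hat) ** 2)))


l_true = [[1.0, 2.0], [3.0, 4.0]]
x_hat = [[1.0, 2.0], [3.0, 6.0]]  # one entry is off by 2
error = rmse(l_true, x_hat)
```

Here one of the four entries deviates by 2, so the squared error averages to 1 and the RMSE is 1.0.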

Figure 4.

The RMSE of four data imputation algorithms under different missing rates (MRs) in IEEE 14-bus power system.

According to Figure 4, the ranking of the four algorithms is stable across missing rates (MRs): regardless of the MR, PC-GAIN attains the minimum RMSE, followed by GAIN and KNN, while MEAN performs worst. However, as the MR increases, the data recovery performance of all four algorithms gradually deteriorates.

Next, we record the running time of the four algorithms under different MRs, as shown in Table 2.

Table 2.

Comparisons of running time (s) for four data imputation algorithms under different $MRs$ .

	MEAN	KNN	GAIN	PC-GAIN
MR = 0.1	0.0040	0.0282	4.1075	5.5440
MR = 0.2	0.0040	0.0289	4.1699	5.6228
MR = 0.3	0.0038	0.0319	4.0826	5.6241
MR = 0.4	0.0038	0.0521	4.7547	5.5461
MR = 0.5	0.0040	0.0644	4.1636	5.5746

From Table 2, the running time of both traditional methods is very short, whereas that of PC-GAIN and GAIN is considerably longer, because both are deep learning methods that require time to train. Moreover, PC-GAIN has more parameters and a more complex calculation process (e.g. the pre-training phase), since it further improves the inference ability of the generator by adding implicit category information on top of GAIN. Consequently, the running time of PC-GAIN is about 1 s longer than that of GAIN (between roughly 0.8 s and 1.5 s across the tested MRs) and far longer than that of MEAN and KNN.

In addition, we study the impact of the cluster number in k-means clustering on the performance of PC-GAIN algorithm, as shown in Figure 5.

Figure 5.

The RMSE of the data imputation on PC-GAIN with different cluster numbers and $MR$ s in IEEE 14-bus power system.

According to Figure 5, when the MR is 0.3 or 0.4, the performance of PC-GAIN is only slightly affected by the cluster number, whereas when the MR is 0.1, 0.2, or 0.5, the cluster number has a significant impact on the algorithm. For the data in this paper, the algorithm performs well under high MRs when the cluster number is 6.

In general, the proposed intrusion detection on ADMM-PE in the previous section provides reliable information for the subsequent data imputation, ensuring the effectiveness of measurement data recovery in PCPSs. In terms of data recovery, compared with the unsupervised GAIN, the proposed similar supervised GAIN achieves better imputation accuracy (minimum RMSE) at the cost of slightly more running time (lower computing efficiency). However, with the rapid improvement of hardware computing and processing capabilities, computing efficiency is not expected to be a major issue.

Conclusion

In this paper, we have developed a new data recovery strategy on machine learning against FDIAs in PCPSs. A sparse target FDIA has been introduced, and machine learning algorithms such as ADMM-PE and similar supervised GAIN are integrated to provide a data recovery solution after FDIAs. Specifically, the FDIA detection problem is transformed into a tripartite separation problem, whose solution provides reliable inputs such as mask information and real incomplete measurement data to the proposed similar supervised GAIN. With the help of the k-means clustering algorithm and SVM, the pseudo labels generated by data analysis provide information similar to supervised learning for data imputation, improving the accuracy of measurement data recovery. Finally, an example in power cyber-physical systems illustrates the effectiveness and superiority of the proposed data recovery strategy against FDIAs. The proposed strategy thus provides reliable and complete measurement data for the state estimator in SCADA, supporting the subsequent stable and reliable operation of the power system. Overall, since the protection configurations of PCPSs cannot be perfect once cost is taken into account, the proposed strategy is feasible in theory for recovering data after bad data injection caused by such sparse FDIAs.

However, the proposed data recovery strategy has two limitations. (1) In the intrusion detection phase, the proposed ADMM-PE on matrix separation may fail when FDIAs are not sparse according to equations (6) and (7). (2) In the data imputation phase, according to Table 2, PC-GAIN has a longer running time and may not meet the real-time requirements of PCPSs if the hardware computing and processing capabilities are insufficient. Nevertheless, with the improvement of hardware computing and processing capabilities, the proposed strategy remains a good data recovery strategy against FDIAs. Future research should study feasible and efficient data recovery solutions against mixed cyber-attacks in PCPSs, not only sparse FDIAs, which would be more universal. Examples include hybrid intrusion detection methods against mixed cyber-attacks, intrusion detection based on ensemble learning, and data recovery based on self-attention-based imputation.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China [grant numbers 62006052, 61973128]; Basic and Applied Basic Research Foundation of Guangdong Province [grant numbers 2023A1515012468, 2022A1515110148, 2021A1515011520].

ORCID iDs

Qinxue Li

Xuhuan Xie

Guiyun Liu

Data availability statement

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.

References

1. Ghiasi, Niknam, Wang, et al. A comprehensive review of cyber-attacks and defense mechanisms for improving security in smart grid energy systems: past, present and future. Elect Power Syst Res 2023; 215: 108975.

2. Ghiasi, Dehghani, Niknam, et al. Cyber-attack detection and cyber-security enhancement in smart DC-microgrid based on blockchain technology and Hilbert Huang transform. IEEE Access 2021; 9: 29429–29440.

3. ZG. Multi-objective false data injection attacks of cyber–physical power systems. IEEE Trans Circuits Syst II Express Briefs 2022; 69(9): 3924–3928.

4. Huang. Differential evolution-based three stage dynamic cyber-attack of cyber-physical power systems. IEEE/ASME Trans Mechatron 2023; 28(2): 1137–1148.

5. Musleh, Chen, Dong ZY. A survey on the detection algorithms for false data injection attacks in smart grids. IEEE Trans Smart Grid 2020; 11(3): 2218–2234.

6. Reda, Anwar, Mahmood. Comprehensive survey and taxonomies of false data injection attacks in smart grids: attack models, targets, and impacts. Renew Sustain Energ Rev 2022; 163: 112423.

7. Ding, Huang, et al. Detecting false data injection attacks against power system state estimation with fast go-decomposition approach. IEEE Trans Ind Inform 2019; 15(5): 2892–2904.

8. Zheng, Xie. False data injection attack detection in power system with non-convex principal component analysis. In: 2022 China automation congress (CAC). Xiamen, China, 2022, pp.5649–5654.

9. Zhang, Wang, Chen. Detecting false data injection attacks in smart grids: a semi-supervised deep learning approach. IEEE Trans Smart Grid 2021; 12(1): 623–634.

10. Zhang, Bamisile, et al. Spatio-temporal correlation-based false data injection attack detection using deep convolutional neural network. IEEE Trans Smart Grid 2022; 13(1): 750–761.

11. Wei, et al. Detection of false data injection attacks in smart grid: a secure federated deep learning approach. IEEE Trans Smart Grid 2022; 13(6): 4862–4872.

12. Wang, Zhang, et al. KFRNN: an effective false data injection attack detection in smart grid based on Kalman filter and recurrent neural network. IEEE Internet Things J 2022; 9(9): 6893–6904.

13. Sun. Parallel generative adversarial imputation network for multivariate missing time-series reconstruction and its application to aeroengines. IEEE Trans Instrum Meas 2023; 72: 1–16.

14. Yıldız, Koç. Multivariate time series imputation with transformers. IEEE Signal Process Lett 2022; 29: 2517–2521.

15. Yoon, Jordon, Schaar. GAIN: missing data imputation using generative adversarial nets. In: Proceedings of the 35th international conference on machine learning, 2018, pp.5689–5698.

16. Cao, Wang, et al. BRITS: bidirectional recurrent imputation for time series. Adv Neural Inform Process Syst 2018; 31: 6775–6785.

17. Che, Purushotham, Cho, et al. Recurrent neural networks for multivariate time series with missing values. Sci Rep 2018; 8(1): 6085.

18. Liu, Ning, Reiter MK. False data injection attacks against state estimation in electric power grids. ACM T Inf Syst Se 2011; 14(1): 1–33.

19. Liu, Esmalifalak, Ding, et al. Detecting false data injection attacks on power grid by sparse optimization. IEEE Trans Smart Grid 2014; 5(2): 612–621.

20. Hao, Piechocki, Kaleshi, et al. Sparse malicious false data injection attacks and defense mechanisms in smart grids. IEEE Trans Ind Inform 2015; 11(5): 1–12.

21. Yang, Wang, et al. Blind false data injection attacks against state estimation based on matrix reconstruction. IEEE Trans Smart Grid 2022; 13(4): 3174–3187.

22. et al. Data-driven attacks and data recovery with noise on state estimation of smart grid. J Frankl Inst 2021; 358(1): 35–55.

23. Boyd, Parikh, Chu. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 2010; 3(1): 1–122.

24. Zhang, Zhu, Wang, et al. Vision-based vehicle detection for videosar surveillance using low-rank plus sparse three-term decomposition. IEEE T Veh Technol 2020; 69(5): 4711–4726.

25. Parikh, Stephen. Proximal algorithms. Found Trends Optim 2014; 1(3): 127–239.

26. Zhang, Zhao. A systematic review of generative adversarial imputation network in missing data imputation. Neural Comput Appl 2023; 35(27): 19685–19705.

27. Wang, et al. PC-GAIN: pseudo-label conditional generative adversarial imputation networks for incomplete data. Neural Netw 2021; 141: 395–403.

28. Zhang. Nearest neighbor selection for iteratively kNN imputation. J Syst Softw 2012; 85(11): 2541–2552.

The data recovery strategy on machine learning against false data injection attacks in power cyber physical systems
