Abstract
Network security situation prediction is a complex task that typically requires extensive retraining of deep-learning models on large amounts of sample data to achieve good performance. This paper proposes an approach that integrates Model-Agnostic Meta-Learning (MAML) with Bidirectional Gated Recurrent Units (BiGRU) to address these challenges. Our method uses the BiGRU model’s ability to learn from both preceding and succeeding states in network security situation data, extracting the temporal information essential for prediction. The BiGRU parameters are updated with Stochastic Gradient Descent. The MAML algorithm is incorporated to help the BiGRU model adapt quickly to new tasks, thereby improving its generalization capability. The parameters are refined through a meta-learning process that sums the losses across multiple training tasks and optimizes them with a second gradient descent step. In our experiments, the approach achieves coefficients of determination (R²) of 0.926983 and 0.934452, improvements of at least 18.0% and 15.8% over conventional deep learning models for network security situation prediction. The novelty of this research lies in the combination of MAML and BiGRU, which reduces the dependency on large datasets for retraining and improves the model’s predictive accuracy and its generalization to new network security scenarios. It contributes an efficient solution to the problem of network security situation prediction and supports future work on cybersecurity defense mechanisms.
Introduction
The Internet has become an integral part of everyday life, a fact that has become more evident since the COVID-19 pandemic. Students learn remotely, employees work remotely, doctors diagnose remotely, and governments use health codes to track people’s movements and support epidemic prevention with big data technology. Cyberspace has become the fifth space after land, sea, sky, and outer space, carrying more and more human activities and becoming an indispensable element in the development of human society [1]. According to the latest data from the China Internet Network Information Center, China had 1.067 billion netizens by December 2022, an increase of 35.49 million over December 2021. Internet penetration reached 75.6%, up 2.6 percentage points from December 2021 [2]. Recent changes in netizen scale and Internet penetration are shown in Fig. 1.

Netizen scale and Internet penetration.
As the number of netizens continues to grow, the network environment becomes more complex and network security problems occur frequently. According to data released by the National Internet Emergency Response Center on August 19, 2022 [3], the overall evaluation of the country’s Internet network security status in July 2022 was good; nevertheless, 3,713 domestic websites were tampered with, 1,960 domestic websites were implanted with backdoors, and 7,740 domestic website pages were counterfeited. The National Information Security Vulnerability Sharing Platform collected and collated 2,066 information system security vulnerabilities, including 730 high-risk vulnerabilities, 1,553 zero-day vulnerabilities, and 1,613 vulnerabilities that can be exploited for remote attacks. These security issues threaten not only government departments, banks, and other important information system sectors but also education, telecommunications, self-publishing, and other related industries, and they pose a significant threat to individuals’ daily lives.
Network security situation prediction (NSSP) is very complex, and deep-learning models usually need to be retrained on large amounts of sample data to achieve good performance. To address these problems, this paper combines meta-learning and deep learning so that the model adapts quickly to small-sample tasks, and proposes a prediction approach based on MAML and BiGRU. The major contributions of the paper are as follows:
(1) A network security situation prediction dataset is constructed from the weekly network security information and dynamics reports of the National Internet Emergency Response Center for experimental validation.
(2) A network security situation prediction method based on MAML and BiGRU is proposed, which adds the MAML algorithm to the traditional BiGRU so that the model adapts quickly to new tasks in small-sample situations.
(3) Experiments verify the effectiveness and feasibility of the model in network security situation prediction, where its performance exceeds that of general deep learning models.
Our approach introduces a novel integration of Meta-Learning with Bidirectional Gated Recurrent Units (BiGRU) to address the challenges of data complexity and the need for retraining in deep learning models for network security situation prediction. This integration allows for rapid adaptation to new tasks and enhances the generalization performance, which is a significant advancement over existing methods.
The rest of the paper is organized as follows: Section Two reviews related work and the current state of research relevant to this paper. Section Three describes the proposed model and its principles in detail. Section Four describes the experiments and findings. Section Five summarizes the study and outlines future work.
Literature review
Network security situation prediction technology differs from traditional prediction technology in that it can be considered a relatively proactive defense system [4]. By predicting the next stage of the network security situation and developing corresponding strategies to defend against network attacks, traditional security defense can be shifted from passive to active, improving the success rate of defense. Network security situation prediction no longer focuses on a single part of the network but calculates and evaluates from a more macroscopic perspective. Analyzing the target network provides analysts with more macroscopic and intuitive data on the target network’s security. Standard prediction methods include time series prediction methods [5], gray theory prediction, regression analysis, etc. In reality, however, the evolution of the network security situation is a complex process because network attacks are often random and unpredictable. When dealing with nonlinear relationships, the above methods fall short and have gradually failed to meet the needs of network security situation prediction. Prediction methods and models based on theories such as neural networks, Markov chains [6], and Support Vector Machines (SVM) [7, 8] have been proposed by various scholars one after another. Among them, neural networks, which belong to artificial intelligence, are widely used for prediction; they can be processed in parallel and possess excellent function-fitting and self-learning capabilities. They also have high fault tolerance, providing solid support for data analysis and processing.
Background work
Meta learning
Meta learning is a concept initially introduced by Schmidhuber in the 1990s [9]. Unlike conventional machine learning, in which the data sample is the basic unit, meta-learning uses the task as the basic unit and aims to improve the learning algorithm through multi-task learning so that it adapts quickly to new tasks. Each task
The Model-Agnostic Meta-Learning (MAML) algorithm is a classic meta-learning algorithm put forward by Finn et al. in 2017 for models trained with gradient descent [10]. MAML is widely used to train a model that can solve multiple learning tasks from only a small number of training samples. The algorithm tunes the initial model parameters by training on a limited number of tasks and iterating gradient descent so that the model generalizes better to new tasks. Its training process is shown in Fig. 2. In addition, MAML is a fundamental framework for many other meta-learning algorithms and has been applied in numerous areas to address challenges such as data bottlenecks and generalization in deep learning. Liu et al. [11] applied meta-learning to stock price prediction and improved prediction accuracy by introducing the MAML algorithm to mitigate the influence of concept drift, providing guidance that helps investors reduce investment risk. Nie et al. [12] introduced meta-learning to address the lack of interoperability and scalability of existing methods and models in human activity recognition when new activities, new subjects, and new states arise, enabling rapid adaptation to human activity recognition in new states. Su et al. [13] applied meta-learning to bearing fault diagnosis with few samples in diverse operating environments, proposing a data-reconstruction hierarchical recursive meta-learning method that adapts rapidly when fault samples are lacking; the fault diagnosis task achieved good results. Motivated by these studies, we treat network security situation prediction as a meta-learning problem and apply the MAML algorithm to network security situation prediction to improve the model’s prediction performance.

MAML training process.
Gated recurrent unit (GRU) introduces the reset gate

Basic structure of GRU.
Where
GRU network is calculated as follows:
Where
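For reference, the conventional GRU update can be written as below. This is the textbook formulation under one common sign convention for the update gate, not necessarily the exact notation of this paper; the symbols are assumptions: $x_t$ is the input at time step $t$, $h_{t-1}$ the previous hidden state, $z_t$ and $r_t$ the update and reset gates, $\tilde{h}_t$ the candidate state, $W$, $U$, $b$ the learnable weights and biases, $\sigma$ the sigmoid function, and $\odot$ element-wise multiplication.

```latex
\begin{aligned}
z_t &= \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right) \\
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) \\
\tilde{h}_t &= \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right) \\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
```

Under this convention, a reset gate close to zero makes the candidate state ignore the previous hidden state, while an update gate close to zero carries the previous hidden state forward unchanged.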
A comparison of related work with the present study is given in Table 1.
Comparison of related work with the present study
Our proposed MAML-BiGRU model combines the strengths of meta-learning and deep learning to address the challenges of small sample learning and rapid adaptation to new tasks in network security situation prediction. Compared to traditional machine learning methods (Study A), our model eliminates the need for extensive manual feature engineering, instead learning useful representations directly from raw data. In contrast to pure deep learning approaches (Study B), our model significantly improves generalization performance on new datasets through the MAML algorithm, even when the number of samples is limited. Additionally, our BiGRU architecture effectively captures the dynamics of time-series data, which is crucial for real-time security situation prediction.
In the context of network security, accurately predicting security incidents is a complex challenge due to the dynamic nature of cyber threats. This section describes network security incidents and the problem of network security situation prediction.
Network security incidents
Network security incidents refer to any unauthorized actions or occurrences that potentially compromise the security, integrity, or availability of network systems. These incidents may include, but are not limited to, virus attacks, network intrusions, data breaches, and denial-of-service attacks (DoS/DDoS). To effectively predict and respond to these incidents, it is crucial to define and document them accurately.
Network security incidents can be categorized based on their nature, scope, and severity. For instance, events can be classified as low, medium, or high risk based on their potential impact. Additionally, incidents can be further classified by their origin (such as internal or external threats) and attack type (such as malware or social engineering).
Network security situation prediction
Situation prediction was first proposed by Endsley in 1988 as part of situation awareness. Situation awareness is defined as “cognition, understanding of environmental factors in a certain spatial and temporal context, and prediction of future trends.” It can be summarized in a classical three-layer model of situation perception, comprehension, and prediction, as shown in Fig. 4.

Network security situation awareness model diagram.
In 1999, Bass [15] put forward “network situation awareness”, applying situation awareness to cyberspace for the first time. He stated that “the next-generation intrusion detection system shall conduct data fusion from countless heterogeneous distributed network sensors for situational awareness in cyberspace,” and, drawing on the data-fusion model of the U.S. military, proposed a functional model of network security situation awareness based on multi-sensor data fusion, as shown in Fig. 5.

Functional model of network security situation awareness on the basis of multi-sensor data fusion.
Shi et al. [16] summarized network security situation awareness as “mining out various security elements in the network environment, processing and fusing them, forming a macroscopic security situation assessment of the global network environment, and making further time-series predictions of the security situation, which is a technical means of ensuring network environment security.” Chang et al. [17] summarized network security situation awareness as “discerning the attack behavior in the network from innumerable noisy data and then fusing them to evaluate and monitor the security situation of the network in real time, achieving comprehensive network control and providing a foundation for network managers’ decision analysis to reduce network risks and losses.”
Network security situation prediction is the final goal of situation awareness. It is mainly based on acquiring and processing historical situation information, establishing appropriate mathematical models to find potential development patterns in the situation data, and reasoning from them to obtain the future trend and state of the situation. Through situation prediction, qualitative or quantitative analysis can be conducted, and early warnings can be issued to provide a reference for security personnel’s decisions and to realize active defense of the network. The randomness and uncertainty of network attacks make attack-driven changes in the security situation highly complex and non-linear, which places great limitations on traditional prediction models.
Our research employs a novel approach to network security situation prediction by harnessing the power of Model-Agnostic Meta-Learning (MAML) in conjunction with Bidirectional Gated Recurrent Units (BiGRU). This section outlines the comprehensive methodology that underpins our MAML-BiGRU model.
Modeling framework based on MAML and BiGRU
This paper introduces a meta-learning method based on the Bidirectional Gated Recurrent Unit (BiGRU) network. It constructs a network security situation prediction approach based on MAML and BiGRU that consists of three major parts: data input, the BiGRU model, and the meta-learning network. The overall architecture is shown in Fig. 6.

Model architecture.
In this paper, we use the security data from the National Internet Emergency Response Center [18] as experimental data. We obtain the weekly posture values through posture assessment and then apply normalization and sliding-window processing. Finally, we convert the data into the form time step × input dimension. Taking the sliding window as an example, we reconstructed the data, and the results are detailed in Table 2.
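The sliding-window step can be sketched as follows. This is a minimal illustration, assuming a univariate weekly series (input dimension 1) and one-step-ahead targets; the function name `sliding_windows` and the parameter `time_step` are hypothetical and not taken from the paper.

```python
def sliding_windows(series, time_step):
    """Turn a 1-D series into (inputs, targets) for one-step-ahead prediction.

    Each input sample has shape time_step x input_dim with input_dim = 1
    (an assumption for illustration); the target is the value that
    immediately follows the window.
    """
    inputs, targets = [], []
    for start in range(len(series) - time_step):
        window = series[start:start + time_step]
        inputs.append([[value] for value in window])  # time_step x 1
        targets.append(series[start + time_step])
    return inputs, targets


# Example: six weekly posture values and a window of three time steps.
weekly_values = [0.10, 0.25, 0.40, 0.35, 0.50, 0.45]
X, y = sliding_windows(weekly_values, time_step=3)
```

With six values and a window of three, this yields three samples; the first input is the first three values and its target is the fourth.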
Data reconstruction results
Network security situation prediction usually depends on both the preceding and the succeeding network states, so the dependencies within network security situation data need to be considered. Therefore, to improve prediction, this paper uses a BiGRU network to predict the security situation, so that information before and after a given network state can be obtained simultaneously and the features in the network security situation data can be fully extracted. The bidirectional architecture captures the dynamic changes of the network state in more detail, thus improving prediction accuracy.
GRU [19] is a commonly used gated recurrent neural network with a strong learning ability for long-term dependent information.
However, the GRU network captures only the forward feature information of the network security posture data and cannot obtain the backward feature information, so this paper chooses the BiGRU network for predicting the network security situation. To prevent overfitting, a dropout layer is added after each BiGRU layer.
The BiGRU consists of a forward GRU and a backward GRU superimposed on each other, with the structure shown in Fig. 7.

BiGRU network model.
Using forward and backward GRU network, posture values of forward and reverse inputs are calculated separately for corresponding hidden layer state output
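The bidirectional pass can be sketched in plain Python. This is a toy illustration, assuming scalar inputs and a scalar hidden state per direction, with hand-picked weights; the names `gru_step`, `run_gru`, and `bigru` and all parameter values are hypothetical, and the sign convention for the update gate is one common choice rather than the paper's confirmed formulation.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h_prev, p):
    """One GRU step with scalar input x and scalar hidden state h_prev."""
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])  # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])
    return (1.0 - z) * h_prev + z * h_cand


def run_gru(xs, p):
    """Run a GRU over a sequence, returning the hidden state at every step."""
    h, states = 0.0, []
    for x in xs:
        h = gru_step(x, h, p)
        states.append(h)
    return states


def bigru(xs, p_fwd, p_bwd):
    """Pair the forward state with the backward state at each time step."""
    forward = run_gru(xs, p_fwd)
    backward = run_gru(xs[::-1], p_bwd)[::-1]  # process reversed, realign
    return list(zip(forward, backward))


# Hypothetical parameters, chosen only for illustration.
params_fwd = dict(wz=0.5, uz=0.1, bz=0.0, wr=0.4, ur=0.2, br=0.0,
                  wh=0.9, uh=0.3, bh=0.0)
params_bwd = dict(wz=0.3, uz=0.2, bz=0.1, wr=0.6, ur=0.1, br=0.0,
                  wh=0.7, uh=0.5, bh=0.0)
outputs = bigru([0.1, 0.5, 0.9], params_fwd, params_bwd)
```

At each time step the forward state summarizes the inputs up to that step and the backward state summarizes the inputs from that step onward, so the pair carries context from both directions.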
As an emerging technique in machine learning, meta-learning aims to solve the problem of learning new tasks quickly and accurately. In our research, the MAML algorithm is introduced as a meta-learning layer; by using previous learning experience, it adapts quickly to new learning tasks and improves the model’s generalization performance. MAML can also be combined with different neural networks and loss functions, which gives it good application value.
The MAML algorithm, as an initialization method for the learner, has significant advantages, such as fast adaptation to new tasks and improved model generalization performance. Compared with traditional machine learning techniques, the MAML algorithm can update parameters from only a small amount of data and thus achieves better results on new learning tasks; that is, it has the ability to learn to learn. When using BiGRU to predict the cyber security posture, we split the data into a training set and a test set. In the MAML-BiGRU method, the support set and the query set correspond to the training set and the test set, respectively.
In meta training phase, for each source task
Where
Where
In meta testing, model firstly calculates training error on support set
The specific algorithmic flow of the MAML-BiGRU method is detailed as Algorithm 1.
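The meta-training and meta-testing flow can be illustrated with a deliberately small stand-in. The sketch below is not the paper's Algorithm 1: it replaces the BiGRU with a scalar linear model `y = w * x` so that the gradient through the inner update can be written analytically, and all names (`adapt`, `meta_train`, `make_task`), learning rates, and task data are hypothetical. The inner loop takes one gradient step on each task's support set; the outer loop sums the query-set gradients of the adapted parameters across tasks, keeping the second-order term from differentiating through the inner step.

```python
def mse_loss(w, xs, ys):
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_grad(w, xs, ys):
    """Derivative of the MSE loss with respect to w."""
    return sum(2.0 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)


def mse_hessian(xs):
    """Second derivative of the MSE loss with respect to w."""
    return sum(2.0 * x * x for x in xs) / len(xs)


def adapt(w, support, inner_lr):
    """Inner loop: one gradient step on the support set."""
    xs, ys = support
    return w - inner_lr * mse_grad(w, xs, ys)


def meta_train(tasks, w=0.0, inner_lr=0.5, meta_lr=0.1, steps=100):
    """Outer loop: update the shared initialization w across all tasks."""
    for _ in range(steps):
        meta_grad = 0.0
        for support, query in tasks:
            w_adapted = adapt(w, support, inner_lr)
            # Chain rule through the inner step: d(w_adapted)/dw.
            d_adapted = 1.0 - inner_lr * mse_hessian(support[0])
            meta_grad += mse_grad(w_adapted, *query) * d_adapted
        w -= meta_lr * meta_grad
    return w


def make_task(slope):
    """Toy task: support and query sets drawn from y = slope * x."""
    xs_support, xs_query = [0.1, 0.3, 0.5, 0.7], [0.8, 1.0]
    return ((xs_support, [slope * x for x in xs_support]),
            (xs_query, [slope * x for x in xs_query]))


tasks = [make_task(s) for s in (1.0, 2.0, 3.0)]
w_meta = meta_train(tasks)

# Meta-testing on an unseen task: adapt on its support set, score on its query set.
new_support, new_query = make_task(2.5)
loss_before = mse_loss(w_meta, *new_query)
loss_after = mse_loss(adapt(w_meta, new_support, 0.5), *new_query)
```

In this toy setting the learned initialization settles between the task slopes, and a single adaptation step on the new task's support set lowers its query loss. With a BiGRU, the same structure would typically rely on automatic differentiation rather than the hand-written derivative used here.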
Experiment and analysis
Assumptions made
Data acquisition and environment configuration
Data acquisition
To apply the MAML algorithm to network security situation prediction so that the model adapts quickly to new data, the problem of splitting the cybersecurity posture data into multiple tasks must be solved first. The task distribution is generated as follows.
(1) Dividing tasks and data sets. The weekly network security information and dynamics reports from the National Internet Emergency Response Center are collected as experimental data. We select 520 weekly reports, from Issue 1 of 2013 to Issue 52 of 2022, as the basis for validation, treating each year as a separate network security situation prediction task with its own dataset. The eight years of data from 2013 to 2020 are used to train the meta-model, and the two years of data from 2021 and 2022 are used as new datasets to test the adaptive capability of the meta-model. To better assess the network security situation, the posture assessment method in Ref. [20] is employed to quantitatively assess the five network security threats. Weights are assigned according to the severity of each network security threat so that its impact level can be better understood; the specific weight assignments are detailed in Table 3. The weekly posture values are calculated from the obtained weights and Equation (11). This can effectively enhance the accuracy of real-time cybersecurity assessment and remind relevant personnel to adjust cybersecurity strategies in time.
Weight of cyber security threats
Where
(2) Sample extraction tasks. Each task draws the first 70% of continuous data as the support set and the last 30% as the query set.
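The two steps above can be sketched as follows: one task per year, with the first 70% of each year's weekly values as the support set and the last 30% as the query set. The function names and the placeholder values are hypothetical; the split is applied here to raw weekly values, whereas in practice it might be applied after sliding-window reconstruction.

```python
def build_tasks(weekly_values, first_year=2013, weeks_per_year=52,
                support_ratio=0.7):
    """Split a weekly series into one task per year with support/query sets."""
    tasks = {}
    n_years = len(weekly_values) // weeks_per_year
    for i in range(n_years):
        year_data = weekly_values[i * weeks_per_year:(i + 1) * weeks_per_year]
        cut = int(len(year_data) * support_ratio)  # first 70% of the year
        tasks[first_year + i] = {"support": year_data[:cut],
                                 "query": year_data[cut:]}
    return tasks


def split_meta_sets(tasks, test_years=(2021, 2022)):
    """Separate the meta-training tasks from the held-out test tasks."""
    meta_train = {y: t for y, t in tasks.items() if y not in test_years}
    meta_test = {y: t for y, t in tasks.items() if y in test_years}
    return meta_train, meta_test


# Placeholder values standing in for the 520 weekly posture values.
placeholder_values = [float(i) for i in range(520)]
tasks = build_tasks(placeholder_values)
meta_train_tasks, meta_test_tasks = split_meta_sets(tasks)
```

With 52 weeks per year, each task has 36 support weeks and 16 query weeks, and the ten yearly tasks divide into eight meta-training tasks and two test tasks.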
The MAML-BiGRU model was implemented and the experiments were conducted with the PyTorch deep learning framework, under the experimental conditions given in Table 4.
Experiment environment configuration
Data normalization is one of the essential pre-processing techniques in machine learning. Common methods in practice include z-score normalization, min-max normalization, and mean normalization. For our research, we choose min-max normalization, by which the feature data are scaled to the range [-1, 1]; this reduces the effect of outliers, speeds up model convergence, and improves the model’s ability to handle feature data.
Where
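Min-max scaling to the range [-1, 1] can be sketched as below. The formula used, `low + (high - low) * (x - min) / (max - min)`, is the standard one and is assumed here rather than taken from the paper; the function name `min_max_scale` is hypothetical.

```python
def min_max_scale(values, low=-1.0, high=1.0):
    """Linearly rescale values so the minimum maps to low and the maximum to high."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant series has no spread; map it to the middle of the range.
        return [(low + high) / 2.0 for _ in values]
    span = v_max - v_min
    return [low + (high - low) * (v - v_min) / span for v in values]


scaled = min_max_scale([2.0, 4.0, 6.0])
```

In a prediction setting, the minimum and maximum would usually be computed on the training data only and reused for the test data, so that no information from the test period leaks into training.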
The specific parameter settings for the experiments are shown in Table 5.
Model parameter settings
To evaluate the accuracy and stability of the proposed prediction model, Mean Absolute Error (MAE), Mean Square Error (MSE), Mean Absolute Percentage Error (MAPE), as well as the coefficient of determination (
In the above four equations,
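The four evaluation metrics can be computed with their standard definitions, as in the sketch below. The function names are illustrative, MAPE is expressed in percent, and the true values are assumed to be non-zero so that the percentage error is defined.

```python
def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; requires non-zero y_true."""
    return 100.0 * sum(abs((t - p) / t)
                       for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


actual = [1.0, 2.0, 4.0]
predicted = [1.5, 2.0, 3.0]
scores = (mae(actual, predicted), mse(actual, predicted),
          mape(actual, predicted), r2_score(actual, predicted))
```

Lower MAE, MSE, and MAPE indicate smaller errors, while a coefficient of determination closer to 1 indicates a better fit.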
To compare the prediction ability of the model proposed in our research with that of other models, the following experiments are performed: in the same experiment environment and setting the sliding window number

Comparison of the prediction situation values of different models.

Sum of the evaluation indicators of different models in 2021 and 2022.
Evaluation indexes of different models in 2021
Evaluation indexes of different models in 2022
From Fig. 8, we can see that most models predict only the general trend of the network security situation and cannot predict the details accurately. The proposed MAML-BiGRU prediction model introduces the MAML algorithm, so the BiGRU model can better extract the relationships within the time series from only a small amount of data, which makes the prediction results more accurate.
As seen in Table 6, in the 2021 data, compared to other models, MAE decreased by at least 42.2%, MSE decreased by at least 66.1%, MAPE decreased by at least 43.9%, and
As seen in Fig. 9, the proposed MAML-BiGRU prediction model has the smallest error values and the largest coefficient of determination in the summed data of 2021 and 2022, giving it a significant advantage over the other models and demonstrating its efficacy and accuracy in predicting situation values.
To ensure that our model does not overfit or underfit the data, we monitored the training and validation loss throughout the training process. Overfitting occurs when a model learns the training data too well, including its noise and outliers, which can reduce its ability to generalize to new data. Conversely, underfitting happens when a model is too simple to capture the underlying structure of the data.
K-fold cross-validation results
To ensure the robustness of our MAML-BiGRU model’s predictions, we conducted a 5-fold cross-validation experiment.
The dataset was stratified and divided into five equal-sized subsamples. Each subsample served as the test set once, while the other four subsamples were used as the training set. This process was repeated five times, ensuring each subsample was used for testing. The MAML-BiGRU model was trained and evaluated across these folds, and the performance metrics were documented.
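The fold rotation can be sketched as below. This is a minimal illustration that splits sample indices into five contiguous, near-equal folds without shuffling or stratification, which is a simplifying assumption; the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Return k (train_indices, test_indices) pairs over contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = (list(range(0, start))
                     + list(range(start + size, n_samples)))
        folds.append((train_idx, test_idx))
        start += size
    return folds


folds = k_fold_indices(n_samples=52, k=5)
```

Each sample appears in exactly one test fold, and the metrics reported for cross-validation are the mean and standard deviation over the five folds.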
The cross-validation results are summarized in the table below:
5-Fold Cross-Validation Performance Metrics
Average Performance: MAE: 0.124±0.008, MSE: 0.313±0.025, MAPE: 3.47±0.15, R²: 0.897±0.004.
The cross-validation results demonstrate that our MAML-BiGRU model provides consistent and accurate predictions across different subsets of the data. The low MAE and MSE values indicate that the model’s predictions are close to the actual values on average. The MAPE values, which are below 4%, further confirm the model’s accuracy in relative terms. The high
Our results demonstrate that the MAML-BiGRU model outperforms other benchmark methods in network security situation prediction. This superior performance can be attributed to several key factors:
The implications of these findings are significant for the field of network security. By leveraging meta-learning and bidirectional recurrent structures, our approach offers a more proactive and adaptive solution to network security situation prediction. This not only enhances the capability to anticipate potential threats but also facilitates the implementation of timely and effective countermeasures.
Moreover, our study contributes to the broader understanding of how advanced machine learning techniques can be applied to complex, real-world problems. The success of our MAML-BiGRU model suggests that similar approaches could be beneficial in other domains where rapid adaptation and the handling of time-series data are critical.
Conclusion
This paper proposes a network security situation prediction approach that integrates MAML and BiGRU. Using the weekly security information and dynamics reports from the National Internet Emergency Response Center, we construct a network security situation prediction dataset and divide it into tasks. We introduce the BiGRU model to learn the preceding and succeeding dependencies and the temporal features of network security situation data. Combining BiGRU with MAML effectively improves the model’s prediction performance on network security situation data in small-sample learning. Comparative experiments demonstrate the superiority and stability of the model on network security situation data: it surpasses general deep learning models on several metrics, which confirms the effectiveness of the proposed model. Future work will focus on refining our model to handle larger and more diverse datasets, as well as exploring the integration of additional types of network traffic data to further enhance the prediction accuracy.
Acknowledgments
This work was supported by the National Science Foundation of China (61806219, 61703426, and 61876189), the National Science Foundation of Shaanxi Province (2021JM-226), the Young Talent Fund of the University and Association for Science and Technology in Shaanxi, China (20190108, 20220106), and the Innovation Capability Support Plan of Shaanxi, China (2020KJXX-065).
