Data-driven-based spatio-temporal detection model against cyber attacks in Intelligent transportation

Abstract

The escalating sophistication of cyber attacks in intelligent transportation systems necessitates the development of advanced intrusion detection systems capable of processing multi-modal data efficiently while maintaining real-time performance. To address these cybersecurity challenges, this paper presents a novel spatio-temporal hybrid model for robust attack detection. Our framework, designed to overcome the limitations of existing methods in handling complex hybrid threats, integrates three key components: convolutional neural networks (CNN) for spatial feature extraction, bidirectional long short-term memory (BiLSTM) networks for capturing long-range temporal dependencies, and an attention mechanism for adaptive feature weighting. This architecture enables comprehensive spatio-temporal pattern analysis while dynamically prioritizing the most discriminative features. Experimental results demonstrate that our model outperforms existing AI-based detection methods, such as k-nearest neighbors (KNN) and multilayer perceptron (MLP), achieving higher detection accuracy and enhanced robustness against sophisticated attacks.

Keywords

intelligent transportation; hybrid attack detection; CNN-BiLSTM-attention; deceptive attack

Introduction

The intellectualization and networking of automobiles signify a crucial trajectory in the ongoing transformation of the automotive sector. Smart connected vehicles harness cutting-edge communication technologies to facilitate effortless data interchange and utilize a suite of sensors—such as cameras, laser radars, and millimeter-wave radars—to attain thorough environmental awareness. By amalgamating communication data with insights gleaned from sensor readings, these vehicles can make well-informed decisions and implement precise control measures.^1–3 Nowadays, the automotive electronic system has become the linchpin of contemporary vehicles, with the Vehicle Internet system materializing through the interconnection of these electronic components. Nevertheless, the inherent openness of the network environment renders automobiles vulnerable to an array of potential cyber security risks during the networking process, encompassing data tampering, viral attacks, and the insertion of malicious software, to name a few.^4–6 According to the 2020 Auto Network Security Report issued by Upstream Security, an overseas security research entity, automotive cyber security incidents witnessed a staggering 605% increase from 2016 to January 2020. For example, on January 11, 2024, the Cybernews research team discovered that certain subdomain names of BMW were susceptible to redirection vulnerabilities, enabling attackers to craft links that, in reality, directed users to malicious websites; on June 3, 2025, numerous foreign media outlets reported that Stormous, a notorious global blackmail gang, claimed to have infiltrated Volkswagen AG and stolen sensitive information, including user account data and vehicle VIN codes. This concerning trend highlights the urgent necessity for in-depth research into resilient security control strategies tailored for intelligent transportation.

As a quintessential example of a cyber-physical system, intelligent transportation systems’ heavy dependence on unsecured wireless communication channels inherently renders them susceptible to cyber-attacks.⁷ The lack of tangible security perimeters leaves these systems vulnerable to electromagnetic exploits, facilitating unauthorized network intrusions that jeopardize not only the integrity of transmitted data but also the reliability of vital vehicle safety communications in intelligent transportation.⁸ At present, attacks targeting intelligent transportation information systems primarily fall into three categories: Denial-of-Service (DoS), replay attack, and deception attacks. DoS (Denial-of-Service) assaults focus on vehicular communication by deliberately inundating network channels with an excessive volume of redundant data packets. This intentional congestion of malicious traffic clogs up the information flow, thereby obstructing the vehicle’s onboard systems from acquiring vital control instructions sent by the cloud-based infrastructure.⁹ Replay attacks entail the act of introducing adversarial signals that mimic legitimate system signals at a precise instant. In 2010, the Stuxnet virus unleashed extensive damage on industrial systems globally, utilizing replay attack strategies to transmit harmful commands formulated from intercepted data.¹⁰ In contrast, deceptive attacks undermine vehicular systems by introducing tampered data packets into the communication flow. These nefarious alterations mislead the vehicle’s decision-making algorithms, effectively seizing control of the mechanisms and steering operational responses toward hazardous or unintended conditions.¹¹ Therefore, the detection of hybrid attacks—including Denial-of-Service (DoS), replay attack, and deceptive attacks—is of paramount importance to ensuring the security of intelligent transportation systems.

To tackle the challenge of detecting hybrid attacks within intelligent transportation systems, significant research endeavors have been channeled into devising robust countermeasures. Current strategies can be broadly categorized into two main approaches: artificial intelligence (AI)-detection techniques and model-based detection mechanisms. Table 1 provides a summary of the aforementioned detection methods. An attack detection method using improved Kalman filter was proposed to detect the injected malicious attacks in intelligent transportation.¹¹ In Gao et al.,¹² to safeguard the information security of the train-ground communication system, an intrusion detection approach leveraging machine learning and state observer technology is introduced to identify and classify diverse types of attacks. A Partial Differential Equation-based observer is devised to identify the False Data Injection attack and pinpoint the exact location where the attack is introduced within the platoon.¹³ In Cheng et al.,¹⁴ an adaptive detection and identification approach, utilizing an unknown input observer, has been developed to counteract malicious attacks in intelligent transportation. While model-based detection approaches in Chowdhury et al.,¹¹ Gao et al.,¹² Biroon et al.,¹³ and Cheng et al.¹⁴ demonstrate effectiveness in identifying false data injection attacks, their performance is inherently limited by the fidelity of the underlying system model and the precision of threshold determination. Unlike these model-dependent techniques, AI-driven detection methods provide a significant advantage by operating without reliance on the intelligent transportation system’s mathematical representation. 
For example, a Long Short-Term Memory (LSTM)-based approach for SQL injection attack detection is proposed, enabling automatic extraction of meaningful data representations.¹⁵ A graph-based machine learning technique is proposed, by which the injected malicious attacks can be detected.¹⁶ In AlEisa et al.,¹⁷ to enhance cybersecurity in connected automotive systems, a deep learning-powered Intrusion Detection System is proposed for real-time monitoring of malicious activities across In-Vehicle Networks, Vehicle-to-Vehicle communications, and Vehicle-to-Infrastructure networks. A new traffic anomaly detection approach based on Multi-Head Attention is proposed, which considers the built-in correlations of network traffic data.¹⁸ Zhang et al.¹⁹ designed a Transformer-enabled transfer learning framework for intrusion detection, specifically tailored to extract and interpret spatiotemporal sequential patterns from vehicular data streams. Nevertheless, existing AI-driven detection frameworks in Li et al.,¹⁵ Gupta et al.,¹⁶ AlEisa et al.,¹⁷ Li et al.,¹⁸ and Zhang et al.¹⁹ neglect to model the interdependent spatio-temporal relationships that characterize modern network attacks in intelligent transportation systems.

Table 1.

Summary of techniques for detecting attacks.

Methods	Advantages	Limitations
Model-based detection methods^11–14	Reflect the change of physical system	Mitigation of model-induced errors and uncertainty
Model-based detection methods^11–14	Short detection time	Prior threshold calculation
AI-based detection methods^15–19	Extremely high accuracy	Extract individual spatial or temporal features
AI-based detection methods^15–19	Powerful feature recognition capability	Robustness of adversarial attacks

Motivated by the limitations of the above existing methods, this paper introduces a novel convolutional neural networks (CNN)-bidirectional long short-term memory (BiLSTM)-Attention hybrid model for robust attack detection. The proposed architecture synergistically combines CNN for spatial feature extraction, BiLSTM for capturing contextual temporal dependencies, and an attention mechanism for adaptive feature refinement. The proposed integration detection framework enables effective representation of complex threat patterns amid dynamic and high-dimensional intelligent transportation systems. Extensive experiments demonstrate that our model achieves state-of-the-art performance, outperforming existing detection approaches across multiple metrics. The main contributions of this work are summarized as follows:

A novel CNN-BiLSTM-Attention detection framework that jointly models spatial features (via CNN), bidirectional temporal dynamics (via BiLSTM), and critical feature selection (via attention) for hybrid attack detection in intelligent transportation systems. Unlike existing methods that process spatial and temporal features separately, our unified approach can detect the injected attacks by capturing the spatio-temporal feature in intelligent transportation systems.

The introduced attention mechanism can automatically emphasize the most discriminative spatio-temporal features while suppressing noise and irrelevant variations. This leads to more robust detection against evolving attack strategies compared to static feature-weighting approaches.

Experimental results underscore the superiority of our proposed detection model, demonstrating marked improvements over existing works, such as k-nearest neighbors (KNN)²¹ and multilayer perceptron (MLP).²²

The structure of this paper is outlined as follows. Section II provides the description of the problem. In Section III, the CNN-BiLSTM-Attention framework for attack detection is established. The simulation experiments in Section IV illustrate the advantages of the proposed approach. Finally, Section V presents the conclusion and discussion.

Problem description

In this section, a linear physical dynamic model is constructed to characterize the intelligent vehicular network system. Based on this model, we then develop a comprehensive hybrid attack framework that integrates DoS, replay, and deceptive attack strategies.

Physical dynamic model of intelligent transportation system

As shown in Figure 1, the intelligent networked vehicle system can monitor and control the vehicle status by collecting and processing sensor data from connected vehicles. According to the work in Wang et al.,²⁰ a linear physical model of intelligent vehicular system can be constructed as follows

\begin{matrix} x (k + 1) = Ax (k) + Bu (k) + w (k) \\ y (k) = Cx (k) + v (k) \end{matrix}

(1)

where $x (k)$ denotes the intelligent vehicular system state, $A$ and $B$ are system matrices, $C$ is observation matrix, $w (k)$ and $v (k)$ are system and observation noise, $y (k)$ is the system output of intelligent vehicular, $u (k)$ is the control input.
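The recursion in equation (1) can be rolled out numerically. The following is a minimal sketch rather than the simulation used in this paper: the two-state (position, speed) matrices, the sampling period, and the helper names `mat_vec`, `vec_add`, and `simulate` are assumptions chosen only for illustration.

```python
import random

def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def vec_add(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(vals) for vals in zip(*vectors)]

def simulate(A, B, C, x0, inputs, w_std=0.0, v_std=0.0, seed=0):
    """Roll out x(k+1) = A x(k) + B u(k) + w(k), y(k) = C x(k) + v(k)."""
    rng = random.Random(seed)
    x, outputs = list(x0), []
    for u in inputs:
        v = [rng.gauss(0.0, v_std) for _ in C]      # observation noise v(k)
        outputs.append(vec_add(mat_vec(C, x), v))   # y(k)
        w = [rng.gauss(0.0, w_std) for _ in x]      # system noise w(k)
        x = vec_add(mat_vec(A, x), mat_vec(B, u), w)
    return outputs

# Hypothetical values: state = [position, speed], sampling period 0.1 s.
dt = 0.1
A = [[1.0, dt], [0.0, 1.0]]
B = [[0.5 * dt * dt], [dt]]
C = [[0.0, 1.0]]                                    # only the speed is measured
y = simulate(A, B, C, x0=[0.0, 10.0], inputs=[[1.0]] * 5)
```

With both noise terms set to zero, the measured speed in this toy setting simply increases by 0.1 per step under the constant input.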

Figure 1.

Intelligent vehicular network system.

DoS attack model

DoS attacks are intended to obstruct or postpone the transmission of control commands $u (k)$ or observation data $y (k)$. Based on the above intelligent vehicular system model, the DoS attack model is constructed as follows

y_{receive} (k) = \left\{ \begin{matrix} (1 - σ) y (k) \\ σ \end{matrix} \right.

(2)

where $y_{receive} (k)$ denotes the observation data under DoS attack, $σ$ is data packet loss rate.

By injecting numerous malicious data packets, attackers cause channel congestion and subsequent data loss, as shown in Figure 2. This compromises the monitoring system’s state estimation capability, potentially leading to traffic disruptions or safety-critical incidents.

Figure 2.

Vehicular speed under different attacks.

Replay attack model

Malicious attackers execute replay attacks through the interception and retransmission of legitimate historical data, effectively bypassing the system’s security monitoring and misleading the control center. Based on the above intelligent vehicular system model, the replay attack model can be constructed as follows

y_{receive} (k) = y (k - d)

(3)

where $d$ is attack replay delay.

Through a coordinated replay attack, the adversary can retransmit the historical data from time $k - d$ at current time $k$ while compromising the vehicle system state, as shown in Figure 2. Consequently, the persistent use of stale data forces autonomous vehicles to make navigation decisions based on inaccurate information, significantly increasing accident probabilities.

Deceptive attack model

Attackers deceive the detection and control center by injecting fabricated data at time instant $k$, aiming to conceal alterations in the state of the intelligent vehicular system. Based on the above intelligent vehicular system model, the deceptive attack model can be established as follows

y_{receive} (k) = y (k) + δ (k)

(4)

where $δ (k)$ is the deception signal injected by the attacker.

According to equation (4), an attacker has the capability to inject the previously mentioned deceptive attack into the vehicle networking system without setting off its detection mechanism, as shown in Figure 2.
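The three attack models in equations (2) to (4) can be illustrated on a short speed sequence. This is a sketch under stated assumptions, not the injection procedure used in the experiments: the DoS model is read here as a packet-loss process in which each sample is lost with probability $σ$, a lost sample is represented by `None`, and the `start` argument of the replay attack is a hypothetical onset time.

```python
import random

def dos_attack(y, sigma, seed=0):
    """Eq. (2) read as packet loss: each sample is lost with probability sigma.
    Assumption: a lost packet is represented by None."""
    rng = random.Random(seed)
    return [None if rng.random() < sigma else yk for yk in y]

def replay_attack(y, d, start):
    """Eq. (3): from step `start` on, the receiver sees y(k - d)."""
    return [y[k - d] if k >= start and k - d >= 0 else y[k]
            for k in range(len(y))]

def deceptive_attack(y, delta):
    """Eq. (4): y_receive(k) = y(k) + delta(k)."""
    return [yk + dk for yk, dk in zip(y, delta)]

speed = [20.0, 20.5, 21.0, 21.5, 22.0, 22.5]          # hypothetical samples
replayed = replay_attack(speed, d=2, start=3)
deceived = deceptive_attack(speed, [0.0, 0.0, 0.0, 5.0, 5.0, 5.0])
dropped = dos_attack(speed, sigma=0.5)
```

In this example the replayed sequence repeats the values from two steps earlier once the attack starts, while the deceptive attack shifts the last three samples by the injected signal.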

Ultimately, the robust detection of cyber-physical attacks targeting intelligent transportation systems is of paramount significance, as it serves as a cornerstone for system safety and mitigates the risk of systemic failures with catastrophic repercussions.

CNN-BiLSTM-attention detection framework against hybrid attacks

This section introduces a spatio-temporal detection framework designed to counteract false data injection attacks (FDIAs) in intelligent transportation systems. The proposed solution leverages a CNN-BiLSTM-Attention hybrid model, as illustrated in Figure 3. The architecture integrates three key components: a deep CNN module for extracting spatial features from continuous vehicular sensor data streams; a BiLSTM network for modeling temporal dependencies in the sequential sensor data; and an attention mechanism that enhances detection accuracy by adaptively focusing on critical data patterns or anomalous behaviors. This multi-module approach captures both spatial and temporal correlations while prioritizing the most relevant attack indicators through dynamic feature weighting.

Figure 3.

Spatio-temporal detection framework using CNN-BiLSTM-attention.

Construction of CNN model-based spatial feature extraction

To extract spatial features from the intelligent transportation systems dataset, a CNN-based framework is designed. As shown in Figure 4, the implemented CNN structure comprises an input layer, a feature extraction layer (convolutional layer), a downsampling layer (pooling layer), and a classification layer (fully connected layer). The input layer receives the intelligent transportation systems data, containing both normal and attack conditions. The feature extraction and downsampling layers are essential for refining the spatial patterns from the input signals. Ultimately, the classification layer processes and outputs the learned features.

Figure 4.

CNN-based spatial feature extraction framework.

The convolutional layer performs local feature extraction by systematically sliding the convolutional filter across the input feature map. The mathematical formulation of the discrete convolution operation is expressed as follows:

K_{i} = ε (W_{i} * X + o_{i})

(5)

where $K_{i}$ is the output of the $i$-th convolutional layer; $ε$ is a nonlinear activation function; $W_{i}$ is the $i$-th convolution kernel; $o_{i}$ represents the bias term of the convolutional layer; and $X$ is the input data of the CNN model.

The pooling layer primarily serves to downsample feature maps, significantly reducing computational complexity while preserving essential signal characteristics. The mathematical formulation of this operation is expressed as follows:

H_{i / \max} = maxpool [K_{i}]

(6)

where $H_{i / \max}$ represents the output of pooling layer.

By forging global links among all activations, the fully connected layer allows for thorough feature learning and supports informed decision-making in subsequent tasks. The mathematical expression for this transformation is outlined as follows:

β_{i} = ReLU (ζ_{i} * H_{i / \max} + γ_{i})

(7)

where $γ_{i}$ is the bias term, $β_{i}$ is the output of fully connected layer, $ζ_{i}$ is the weight parameter.
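Equations (5) to (7) can be traced on a one-dimensional toy signal. The sketch below is only illustrative: it assumes ReLU for the activation $ε$, valid padding, non-overlapping pooling, and hand-picked weights, none of which are specified in this paper. As is usual in deep learning, the sliding operation is implemented as a cross-correlation.

```python
def relu(z):
    return max(0.0, z)

def conv1d(x, kernel, bias):
    """Eq. (5): slide the kernel over x (valid padding) and apply the activation."""
    n = len(kernel)
    return [relu(sum(w * x[i + j] for j, w in enumerate(kernel)) + bias)
            for i in range(len(x) - n + 1)]

def maxpool1d(k, size):
    """Eq. (6): non-overlapping max pooling over windows of `size` samples."""
    return [max(k[i:i + size]) for i in range(0, len(k) - size + 1, size)]

def dense(h, weights, bias):
    """Eq. (7): one fully connected unit per weight row, followed by ReLU."""
    return [relu(sum(w * v for w, v in zip(row, h)) + b)
            for row, b in zip(weights, bias)]

x = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0, 22.0]    # hypothetical input signal
K = conv1d(x, kernel=[-1.0, 1.0], bias=0.0)   # first differences of x
H = maxpool1d(K, size=2)
beta = dense(H, weights=[[0.5, 0.5, 0.5]], bias=[-1.0])
```

Here the difference kernel responds to local changes in the signal, pooling keeps the largest response in each window, and the dense unit combines the pooled responses into a single feature.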

Construction of BiLSTM model-based temporal feature extraction

As depicted in Figure 5, a BiLSTM framework is constructed to extract temporal features and capture the complex temporal dynamics inherent in network traffic data. Unlike unidirectional models that process sequences in a single direction, the BiLSTM analyzes the input sequence both forwards and backwards. This bidirectional analysis allows the model to place each data point in both its past and future context, which is crucial for identifying sophisticated attacks that may exhibit subtle, long-range dependencies. The model mitigates the vanishing gradient problem through its gating mechanisms (input, forget, and output gates), enabling it to learn and retain long-term dependencies critical for detecting multi-stage cyber threats.

Figure 5.

BiLSTM-attention-based temporal feature extraction framework.

The introduced model consists of two separate LSTM modules featuring gated architectures, each equipped with input gates, forget gates, output gates, and memory cells. Importantly, one LSTM module processes input sequences in a forward, chronological manner, whereas the other processes them in a reverse temporal sequence. This bidirectional processing setup allows the BiLSTM architecture to concurrently capture both forward and backward contextual relationships within time-series data, thereby improving the extraction of holistic temporal features. The mathematical expression for the BiLSTM structure is given as follows.

{\vec{h}}_{t} = LSTM (x_{t}, {\vec{h}}_{t - 1})

(8)

{\overset{\leftarrow}{h}}_{t} = LSTM (x_{t}, {\overset{\leftarrow}{h}}_{t - 1})

(9)

y_{t} = W_{\vec{hy}} {\vec{h}}_{t} + W_{\overset{\leftarrow}{hy}} {\overset{\leftarrow}{h}}_{t} + b_{y}

(10)

where $x_{t}$ and $y_{t}$ are the input and output at time $t$, ${\vec{h}}_{t}$ is the hidden layer state of the forward LSTM at time $t$, ${\overset{\leftarrow}{h}}_{t}$ is the hidden layer state of the backward LSTM at time $t$, ${\vec{h}}_{t - 1}$ is the hidden layer state of the forward LSTM at time $t - 1$, ${\overset{\leftarrow}{h}}_{t - 1}$ is the hidden layer state of the backward LSTM at time $t - 1$, $W_{\vec{hy}}$ and $W_{\overset{\leftarrow}{hy}}$ are the weight matrices of the forward and backward directions, and $b_{y}$ is the bias parameter.
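The bidirectional structure of equations (8) to (10) can be sketched with a scalar LSTM cell. This is a simplified illustration, assuming a scalar input, hidden state, and cell state, with placeholder parameter values; the backward pass is obtained by running the same recurrence over the reversed sequence and realigning its outputs to the original time order.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step with scalar input, hidden state, and cell state."""
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate state
    c = f * c_prev + i * g
    return o * math.tanh(c), c

def run_lstm(xs, p):
    """Hidden states of a single-direction LSTM over the sequence xs."""
    h, c, hs = 0.0, 0.0, []
    for x in xs:
        h, c = lstm_step(x, h, c, p)
        hs.append(h)
    return hs

def bilstm(xs, p_fwd, p_bwd, w_fwd, w_bwd, b_y):
    """Eqs. (8)-(10): combine forward and backward hidden states per time step."""
    h_fwd = run_lstm(xs, p_fwd)
    h_bwd = run_lstm(xs[::-1], p_bwd)[::-1]   # realign to original time order
    return [w_fwd * hf + w_bwd * hb + b_y for hf, hb in zip(h_fwd, h_bwd)]

# Placeholder parameters (all 0.5), shared by both directions for simplicity.
names = ("wi", "ui", "bi", "wf", "uf", "bf", "wo", "uo", "bo", "wg", "ug", "bg")
params = {k: 0.5 for k in names}
y = bilstm([0.1, 0.4, -0.2, 0.3], params, params, w_fwd=1.0, w_bwd=1.0, b_y=0.0)
```

Each output therefore depends on samples both before and after the current time step, which is the property the bidirectional design relies on.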

Attention

To better capture the pertinent information in the encoding and extract the corresponding temporal and spatial features, we incorporate the attention mechanism into the model, as shown in Figure 5. Attention enables heightened focus on specific trends or concentrations within the training data. The fundamental calculation formula is provided as follows.

s_{ij} = \tanh (O_{1} h_{i} + O_{2} h_{j} + θ_{1})

(11)

d_{ij} = softmax (s_{ij})

(12)

L_{i} = \sum_{j} d_{ij} h_{j}

(13)

where $s_{ij}$ is the relation of i-th parameter and j-th parameter. $O_{1}$ , $O_{2}$ , and $θ_{1}$ are the corresponding structural weight and bias. $L_{i}$ is the output, $d_{ij}$ is the weight.
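The attention computation in equations (11) to (13) can be written out for a short sequence of hidden states. The sketch below assumes scalar hidden states and scalar weights $O_{1}$, $O_{2}$, $θ_{1}$ with arbitrary values; in the actual model these would be vectors and matrices.

```python
import math

def attention(h, o1, o2, theta):
    """Additive attention over scalar hidden states h (Eqs. 11-13)."""
    outputs = []
    for h_i in h:
        # Eq. (11): relevance score between state i and every state j.
        scores = [math.tanh(o1 * h_i + o2 * h_j + theta) for h_j in h]
        # Eq. (12): softmax over j turns the scores into weights d_ij.
        exp_scores = [math.exp(s) for s in scores]
        total = sum(exp_scores)
        weights = [e / total for e in exp_scores]
        # Eq. (13): weighted sum of the hidden states.
        outputs.append(sum(d * h_j for d, h_j in zip(weights, h)))
    return outputs

h = [0.2, -0.1, 0.9, 0.4]                      # hypothetical hidden states
L = attention(h, o1=1.0, o2=2.0, theta=0.0)
```

Because the weights for each $i$ are non-negative and sum to one, every output is a convex combination of the hidden states; with all parameters set to zero the weights become uniform and each output reduces to the mean of the hidden states.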

Algorithm 1

Spatio-temporal detection algorithm using CNN-BiLSTM-Attention

Input: Data set including train and test set
Output: Output label of test data $L_{i}$

1. for i ∈ [1, Total Epochs]
2. for j ∈ [1, Total Epochs]
3. Predict Attacked Probability = CNN-BiLSTM-Attention Detection Model(Trained data);
4. end for
5. Loss value = Loss function(Predict Attacked Probability, Trained data);
6. Update CNN-BiLSTM-Attention Detection Model parameters;
7. end for
8. Predict Attacked Probability = Trained Model(CNN-BiLSTM-Attention Detection Model-Test Data);
9. if Predict Attacked Probability > 1 − Predict Attacked Probability
10. $L_{i}$-Test = Abnormal;
11. else
12. $L_{i}$-Test = normal;
13. end if
14. return $L_{i}$

Since the detection of FDIAs is expressed as a binary classification problem, the cross-entropy loss function is defined as:

Loss = - \frac{1}{T} \sum_{i = 1}^{T} [L_{tru} \log (L_{i}) + (1 - L_{tru}) \log (1 - L_{i})]

(14)

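where $L_{tru}$ is the true label category, $L_{i}$ is the predicted attack probability for the $i$-th sample, and $T$ is the number of samples.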
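Equation (14) and the decision rule in steps 9 to 13 of Algorithm 1 can be checked on a few hand-made predictions. The labels and probabilities below are hypothetical, and the clipping constant `eps` is an implementation detail added here to avoid taking the logarithm of zero.

```python
import math

def bce_loss(y_true, y_prob, eps=1e-12):
    """Binary cross-entropy of Eq. (14), averaged over all samples."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)          # clip to avoid log(0)
        total += t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return -total / len(y_true)

def decide(p):
    """Steps 9-13 of Algorithm 1: p > 1 - p is equivalent to p > 0.5."""
    return "Abnormal" if p > 1.0 - p else "normal"

labels = [1, 0, 1, 0]                            # 1 = attack, 0 = normal
probs = [0.9, 0.2, 0.6, 0.4]                     # predicted attack probabilities
loss = bce_loss(labels, probs)
predicted = [decide(p) for p in probs]
```

Confident correct predictions (0.9 for an attack, 0.2 for a normal sample) contribute little to the loss, while the two borderline predictions dominate it.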

Remark 1. The CNN-BiLSTM-Attention detection model demonstrates superior performance in intelligent transportation attack detection by hierarchically integrating CNN’s spatial feature extraction, BiLSTM’s bidirectional temporal modeling, and attention mechanism’s adaptive feature weighting. CNNs effectively process high-dimensional vehicular network data to capture local spatial patterns, while BiLSTMs analyze the extracted features to model both forward and backward time-dependent relationships in traffic flow and communication sequences. The attention layer dynamically emphasizes critical temporal features for attack identification, significantly improving detection sensitivity to sophisticated threats. The proposed architecture achieves enhanced real-time processing capability through parallel feature computation and focused attention allocation, making it particularly effective for large-scale vehicular networks with heterogeneous data streams.

CNN-BiLSTM-Attention-based attack detection

The CNN-BiLSTM-Attention detection framework against hybrid attacks is designed by leveraging spatial-temporal feature learning. The detailed process is outlined as follows:

Step 1: Construct a CNN-based spatial feature extraction model to capture local patterns from vehicular network data.

Step 2: Develop a BiLSTM-based temporal feature extraction model to analyze sequential dependencies in traffic behavior and detect anomalies such as sudden trajectory deviations or message falsification.

Step 3: Integrate an attention mechanism to dynamically weigh critical time steps, improving detection sensitivity to stealthy attacks.

Step 4: Train the CNN-BiLSTM-Attention model offline using historical vehicular network data.

Step 5: Deploy the trained detection model for real-time attack detection online.

The proposed spatial-temporal detection approach can ensure robust anomaly identification while maintaining low-latency processing, making it suitable for secure and efficient intelligent transportation systems. Based on the above detection steps, the detection algorithm against hybrid attacks is summarized in Algorithm 1.

Results

This section presents experimental simulations to evaluate the detection performance of the developed CNN-BiLSTM-Attention model for identifying cyber threats in intelligent transportation network systems. Comparative analyses with existing approaches (KNN²¹ and MLP²²) show that the proposed model achieves superior detection performance against hybrid attacks.

Simulation and attack injection

The simulation experiments in this study are conducted on a high-performance computing platform with the following specifications: MATLAB R2024a (https://www.mathworks.com) running on an Intel i9-13900HX processor (2.20 GHz), 16 GB RAM, and an NVIDIA TITAN RTX 4060 GPU. The proposed CNN-BiLSTM-Attention model is implemented with the following key parameters. BiLSTM Network: 128 hidden units per layer, 2 stacked layers, and a dropout rate of 0.2 to prevent overfitting; Attention Mechanism: 64-dimensional attention layer for feature weighting; CNN Architecture: Optimized kernel sizes and pooling layers for spatial feature extraction from vehicular data; Training Configuration: Adam optimizer with a learning rate of 0.01.

Based on an evaluation dataset that combines real-world driving records from the Open Vehicle Dataset of Shenzhen City with the dataset in Song et al.,²³ we simulate different attacks, including DoS, replay, and deceptive attacks. The dataset is divided into a training set and a test set in a ratio of 7:3.
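A 7:3 split can be produced as follows. This is a generic sketch: the paper does not state whether the split is random or chronological, so the seeded shuffle used here is an assumption, and the integer samples stand in for real records.

```python
import random

def split_dataset(samples, train_ratio=0.7, seed=0):
    """Shuffle the samples with a fixed seed and split them by train_ratio."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_ratio * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

train, test = split_dataset(range(100))          # placeholder samples 0..99
```

For sequential vehicular data, a chronological split (training on earlier records and testing on later ones) is a common alternative that avoids leaking future information into training.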

Evaluation indicators

To assess the performance of the detection model, we have selected the following evaluation indicators: Accuracy, Precision, F1-Score, and Alarm Recall Rate.²⁴ The corresponding mathematical expressions for these metrics are provided below.

Accuracy = \frac{Γ_{TN} + Γ_{TP}}{Γ_{TN} + Γ_{TP} + Γ_{FN} + Γ_{FP}}

(15)

Precision = \frac{Γ_{TP}}{Γ_{TP} + Γ_{FP}}

(16)

Recall = \frac{Γ_{TP}}{Γ_{TP} + Γ_{FN}}

(17)

F 1 - Score = \frac{2 Precision \times Recall}{Precision + Recall}

(18)

where $Γ_{TP}$ denotes the number of attack instances correctly identified as attacks (true positives), $Γ_{TN}$ denotes the number of normal instances correctly identified as normal (true negatives), $Γ_{FP}$ denotes the number of normal instances incorrectly flagged as anomalous (false positives), and $Γ_{FN}$ denotes the number of abnormal instances incorrectly labeled as normal (false negatives).
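Equations (15) to (18) can be evaluated directly from the four confusion-matrix counts. The counts in this sketch are hypothetical and are not taken from the experiments; attacks are treated as the positive class.

```python
def detection_metrics(tp, tn, fp, fn):
    """Accuracy, Precision, Recall, and F1-Score from confusion-matrix counts.

    tp: attacks flagged as attacks    tn: normal samples passed as normal
    fp: normal samples flagged        fn: attacks missed
    """
    accuracy = (tn + tp) / (tn + tp + fn + fp)            # Eq. (15)
    precision = tp / (tp + fp)                            # Eq. (16)
    recall = tp / (tp + fn)                               # Eq. (17)
    f1 = 2 * precision * recall / (precision + recall)    # Eq. (18)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

m = detection_metrics(tp=90, tn=85, fp=15, fn=10)         # hypothetical counts
```

Since the F1-Score is the harmonic mean of Precision and Recall, it always lies between the two.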

Ablation experiment analysis

To evaluate the contributions of the CNN, BiLSTM, and Attention mechanisms to the detection performance of the CNN-BiLSTM-Attention model, we conduct the following ablation experiments:

Model 1: CNN-BiLSTM (to assess the impact of the Attention mechanism)

Model 2: CNN-Attention (to examine the effect of BiLSTM)

Model 3: BiLSTM-Attention (to evaluate the influence of CNN)

Model 4: The complete CNN-BiLSTM-Attention model

Based on the above ablation framework, key performance metrics (Accuracy, Precision, F1-Score, and Alarm Recall Rate) along with the receiver operating characteristic (ROC) curves can be obtained, as shown in Table 2 and Figure 6.

Table 2.

Ablation results of CNN-BiLSTM-Attention.

Detection models	Accuracy (%)	F1-score (%)
Model 1	95.12	94.89
Model 2	93.15	92.84
Model 3	92.62	91.93
Model 4	97.58	97.59

Figure 6.

ROC of the ablation experiments for CNN-BiLSTM-Attention.

As illustrated in Table 2, the contribution of each module to the overall model performance can be summarized as follows. When Attention is removed (Model 1), the model shows the smallest decline in performance, with the Accuracy and F1-Score decreasing from 97.58% to 95.12% and from 97.59% to 94.89%, respectively. This result indicates that Attention plays an important role in focusing on the most informative features. When BiLSTM is removed (Model 2), the model performance drops significantly, with the Accuracy and F1-Score decreasing from 97.58% to 93.15% and from 97.59% to 92.84%, respectively, which reveals that BiLSTM plays a crucial role in extracting temporal features. When CNN is removed (Model 3), the model performance drops the most, with the Accuracy and F1-Score decreasing from 97.58% to 92.62% and from 97.59% to 91.93%, respectively, which reveals that CNN plays a crucial role in extracting spatial features.
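The per-module contributions can be read off Table 2 as differences from the complete model. The short calculation below uses only the values reported in Table 2; the dictionary layout and variable names are illustrative.

```python
# Values copied from Table 2 (percent).
full = {"accuracy": 97.58, "f1": 97.59}
ablations = {
    "without Attention (Model 1)": {"accuracy": 95.12, "f1": 94.89},
    "without BiLSTM (Model 2)": {"accuracy": 93.15, "f1": 92.84},
    "without CNN (Model 3)": {"accuracy": 92.62, "f1": 91.93},
}

# Drop in percentage points relative to the complete model (Model 4).
drops = {name: {metric: round(full[metric] - vals[metric], 2) for metric in full}
         for name, vals in ablations.items()}

for name, d in drops.items():
    print(f"{name}: accuracy -{d['accuracy']:.2f}, F1 -{d['f1']:.2f}")
```

The resulting drops are 2.46, 4.43, and 4.96 percentage points in Accuracy and 2.70, 4.75, and 5.66 in F1-Score, which orders the modules by contribution as CNN, BiLSTM, and Attention.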

By comparing the ROC curves of each model variant in Figure 6, one can observe that the complete CNN-BiLSTM-Attention model achieves the best performance. Its curve is closest to the upper-left corner, with an AUC value of 0.9768, significantly outperforming the other ablation variants. The CNN-Attention model (without the BiLSTM module, AUC = 0.9368) and the CNN-BiLSTM model (without the Attention module, AUC = 0.9525) show limitations in time-series modeling and feature extraction, respectively, across different threshold intervals. Meanwhile, the BiLSTM-Attention model (without the CNN module, AUC = 0.9299) exhibits the most pronounced performance degradation across all FPR (False Positive Rate) ranges, confirming the critical role of the CNN module in spatial feature extraction. The gap between each ablation model’s curve and the baseline model’s curve visually reflects the contribution of the removed component. Notably, the CNN and BiLSTM modules synergistically enhance performance, while the Attention mechanism primarily improves detection sensitivity in specific regions.
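For readers less familiar with ROC analysis, the curve and its AUC can be computed from labels and predicted scores as sketched below. The six samples are hypothetical; the AUC is computed through its pairwise-ranking interpretation, that is, the probability that a randomly chosen attack sample receives a higher score than a randomly chosen normal sample.

```python
def roc_points(labels, scores):
    """(FPR, TPR) pairs obtained by sweeping the threshold over all scores."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for l, s in zip(labels, scores) if l == 1 and s >= t)
        fp = sum(1 for l, s in zip(labels, scores) if l == 0 and s >= t)
        points.append((fp / n_neg, tp / n_pos))
    return points

def roc_auc(labels, scores):
    """AUC as the fraction of (attack, normal) pairs ranked correctly; ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]                      # 1 = attack, 0 = normal
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]          # hypothetical model outputs
points = roc_points(labels, scores)
auc = roc_auc(labels, scores)
```

In this toy case eight of the nine attack-normal pairs are ranked correctly, so the AUC is 8/9; a curve closer to the upper-left corner corresponds to a larger share of correctly ranked pairs.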

Comparison of detection performance under different detection models

To evaluate detection performance against hybrid attacks, this section selects baseline models (KNN and MLP) for comparative analysis. The comparative detection metrics (Accuracy, Precision, F1-Score, and Alarm Recall Rate) and confusion matrices are presented in Table 3 and Figure 7, respectively.

Table 3.

Comparison results of evaluation indicators under different detection models.

Detection models	Accuracy (%)	Precision (%)	Recall (%)	F1-Score (%)
Proposed detection model	97.58	97.63	97.58	97.59
KNN	91.26	89.96	90.32	91.18
MLP	90.23	87.25	89.56	89.12

Figure 7.

The confusion matrices under different detection models: (a) KNN, (b) MLP, and (c) CNN-BiLSTM-Attention.

As illustrated in Table 3, the CNN-BiLSTM-Attention model attains the best performance (Accuracy: 97.58%, Precision: 97.63%, Recall: 97.58%, F1-Score: 97.59%) by integrating three complementary modules: spatial convolution, temporal BiLSTM modeling, and attention-based feature reweighting. By contrast, the KNN and MLP models score lower on every metric, and their four metrics are spread over wider ranges (1.30 and 2.98 percentage points, respectively, versus 0.05 for the proposed model), which suggests that they cannot adaptively process high-dimensional spatio-temporal patterns. As shown in Figure 7, the CNN-BiLSTM-Attention model achieves the highest performance in the confusion matrix, with main diagonal elements exceeding 97.5%, significantly outperforming the KNN (91.2%) and MLP (90.2%) models. This quantitative advantage demonstrates that the CNN-BiLSTM-Attention architecture effectively captures complex characteristics of intelligent transportation data by integrating spatio-temporal feature extraction and an attention mechanism. In contrast, the KNN model suffers from fuzzy classification boundaries due to its reliance on distance-based metrics, while the MLP model is constrained by the inability of fully connected architectures to learn spatio-temporal features.
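The gaps discussed above follow directly from Table 3. The calculation below uses only the reported values; the spread is taken here as the difference between the largest and smallest of the four metrics of each model, which is one reading of the ranges quoted in the text.

```python
# Values copied from Table 3 (percent).
table3 = {
    "Proposed": {"accuracy": 97.58, "precision": 97.63, "recall": 97.58, "f1": 97.59},
    "KNN": {"accuracy": 91.26, "precision": 89.96, "recall": 90.32, "f1": 91.18},
    "MLP": {"accuracy": 90.23, "precision": 87.25, "recall": 89.56, "f1": 89.12},
}

# Spread between the best and worst metric of each model (percentage points).
spread = {name: round(max(m.values()) - min(m.values()), 2)
          for name, m in table3.items()}

# Accuracy advantage of the proposed model over each baseline.
accuracy_gap = {name: round(table3["Proposed"]["accuracy"] - m["accuracy"], 2)
                for name, m in table3.items() if name != "Proposed"}
```

This gives spreads of 0.05, 1.30, and 2.98 percentage points for the proposed model, KNN, and MLP, and accuracy advantages of 6.32 and 7.35 percentage points over KNN and MLP, respectively.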

The experimental results demonstrate that the spatio-temporal detection model proposed in this study achieves optimal feature extraction and attack recognition for hybrid attack detection tasks characterized by pronounced spatio-temporal dynamics.

Comparative analysis of attack robustness under different detection models

To assess the attack resilience of our CNN-BiLSTM-Attention framework in intelligent transportation scenarios, comparative experiments were performed against KNN and MLP baselines. Figure 8 presents the detection accuracy trends under escalating attack magnitudes, while Figure 9 compares the ROC metrics across all models.

Figure 8.

The comparative results of detection rate under different detection models.

Figure 9.

The comparative results of ROC under different detection models.

As illustrated in Figure 8, all models exhibit an upward trend in detection rates with increasing attack intensity; however, the proposed CNN-BiLSTM-Attention model consistently outperforms both KNN and MLP baselines in detection accuracy. Further analysis of the ROC curves in Figure 9 confirms that our model achieves significantly higher discriminative power (AUC) compared to the conventional approaches. This indicates its superior capability in accurately identifying adversarial data within Internet of Vehicles (IoV) datasets.

Figure 10 illustrates that as the noise standard deviation increases, the overall detection rate exhibits a declining trend. Initially, when the noise standard deviation is low, the detection rate remains high and decreases at a relatively slow pace. However, once the noise standard deviation surpasses a certain threshold, the rate of decline in the detection rate accelerates, before eventually tapering off and stabilizing at a comparatively low level. Conversely, the false alarm rate demonstrates an overall upward trend. At low levels of noise standard deviation, the false alarm rate is extremely low and increases gradually. As the noise standard deviation rises to a certain extent, the rate of false alarms experiences an acceleration, continuing to rise as the noise standard deviation further increases. This observed trend underscores the significant impact of noise on the performance of detection systems, with increasing noise leading to a decrease in detection rate and an increase in false alarms.

Figure 10. Detection performance under different noise levels.
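The qualitative trend in Figure 10 can be reproduced with a deliberately simple model. The sketch below is not the proposed CNN-BiLSTM-Attention detector. It assumes a one-dimensional threshold detector in which normal samples are zero-mean Gaussian noise, attack samples carry a constant bias plus the same noise, and a sample is flagged when it exceeds a fixed threshold. The function `rates_under_noise` and the values of `attack_bias` and `threshold` are hypothetical choices made for illustration only.

```python
import math
from typing import Tuple


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def rates_under_noise(sigma: float,
                      attack_bias: float = 1.5,
                      threshold: float = 0.5) -> Tuple[float, float]:
    """Return (detection rate, false alarm rate) for noise standard deviation sigma.

    Toy model (an assumption, not the paper's detector):
      normal sample = noise,               noise ~ N(0, sigma^2)
      attack sample = attack_bias + noise
      alarm is raised when sample > threshold
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # P(attack_bias + noise > threshold)
    detection_rate = normal_cdf((attack_bias - threshold) / sigma)
    # P(noise > threshold)
    false_alarm_rate = 1.0 - normal_cdf(threshold / sigma)
    return detection_rate, false_alarm_rate


if __name__ == "__main__":
    for sigma in (0.1, 0.25, 0.5, 1.0, 2.0, 4.0):
        dr, far = rates_under_noise(sigma)
        print(f"sigma={sigma:4.2f}  detection rate={dr:.3f}  false alarm rate={far:.3f}")
```

Under these assumptions the detection rate stays near 1 for small noise, falls once the noise standard deviation becomes comparable to the margin between the attack bias and the threshold, and then flattens, while the false alarm rate rises monotonically. This matches the shape described above, although the actual curves depend on the trained model and the dataset.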

Conclusions and future work

This study proposes a spatio-temporal detection model for identifying hybrid attacks in intelligent transportation systems. The CNN-BiLSTM-Attention framework integrates convolutional neural networks for spatial feature extraction, bidirectional long short-term memory networks for capturing temporal dependencies, and a channel attention mechanism for adaptive feature weighting. Simulation results demonstrate that the proposed model outperforms conventional detection approaches such as KNN and MLP across multiple performance metrics, including accuracy, precision, recall, and F1-score. The model also remains robust across the attack scenarios tested.

Looking forward, several promising directions warrant further investigation. First, we intend to explore more advanced feature representation learning techniques to enhance both detection performance and computational efficiency. Second, transfer learning methodologies will be further developed, with emphasis on parameter optimization strategies and cross-domain adaptation mechanisms to improve generalizability across different transportation networks. Third, we plan to extend our validation to more complex and realistic attack scenarios, incorporating larger and more diversified datasets to evaluate model performance under near-real-world conditions. Finally, the integration of explainable AI (XAI) techniques will be considered to improve the interpretability of detection results, which is critical for practical deployment in security-sensitive transportation applications.

Footnotes

ORCID iD

Xinyu Wang

Ethical considerations

This work did not involve humans or animals. Ethical approval was not required for this research.

Consent to participate

Not applicable.

Consent for publication

The corresponding author gave consent for the publication of the identifiable details.

Author contributions

Conceptualization, R.X.W. and M.Y.Z.; methodology, M.Y.Z. and X.Y.W.; data curation, R.X.W.; writing (original draft preparation), X.Y.W.; writing (review and editing), X.Y.W. and R.X.W.; visualization, M.Y.Z. and R.X.W. All authors have read and agreed to the published version of the manuscript.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partially supported by the Open Research Fund of Intelligent Electric Power Grid Key Laboratory of Sichuan Province under grant 2023-IEPGKLSPKFYB05 and by the Hebei Natural Science Foundation under grant F2025203071.

The authors acknowledge the Open Research Fund of Intelligent Electric Power Grid Key Laboratory of Sichuan Province (2023-IEPGKLSPKFYB05) and the Hebei Natural Science Foundation (F2025203071) for the financial support that made this research possible.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability statement

Data underlying the results presented in this paper are available from the corresponding author upon reasonable request.

References

1. Wang, Ban. Data poisoning attacks in intelligent transportation systems: a survey. Transp Res Part C: Emerg Technol 2024; 165: 104750.

2. Abdo, Chen, Zhao, et al. Cybersecurity on connected and automated transportation systems: a survey. IEEE Trans Intell Vehicles 2024; 9(1): 1382–1401.

3. Sakiz, Sen. A survey of attacks and detection mechanisms on intelligent transportation systems: VANETs and IoV. Ad Hoc Netw 2017; 61: 33–50.

4. Parkinson, Ward, Wilson, et al. Cyber threats facing autonomous and connected vehicles: future challenges. IEEE Trans Intell Transp Syst 2017; 18(11): 2898–2915.

5. Acharya, Dvorkin, Pandzic, et al. Cybersecurity of smart electric vehicle charging: a power grid perspective. IEEE Access 2020; 8: 214434–214453.

6. Thapliyal, Wazid, Singh, et al. Robust authenticated key agreement protocol for internet of vehicles-envisioned intelligent transportation system. J Syst Archit 2023; 142: 102937.

7. Saleem, Ayub, et al. An efficient and physically secure privacy-preserving key-agreement protocol for vehicular ad-hoc network. IEEE Trans Intell Transp Syst 2023; 24(9): 9940–9951.

8. Abdollahi Biron, Dey, Pisu. Real-time detection and estimation of denial of service attack in connected vehicle systems. IEEE Trans Intell Transp Syst 2018; 19: 3893–3902.

9. Lei, Zhu, et al. Improving Kalman filter for cyber physical systems subject to replay attacks: an attack-detection-based compensation strategy. Appl Math Comput 2024; 466: 128444.

10. Zemmoudj, Bermad, Bouallouche-Medjkoune. Detection and mitigation of vehicle platooning disruption attacks. Veh Commun 2024; 47: 10076.

11. Chowdhury, Belikov, Baimel, et al. Observer-based detection and identification of sensor attacks in networked CPSs. Automatica 2020; 121: 109166.

12. Gao, Zhang, et al. An intrusion detection method based on machine learning and state observer for train-ground communication systems. IEEE Trans Intell Transp Syst 2022; 23(7): 6608–6620.

13. Biroon, Biron, Pisu. False data injection attack in a platoon of CACC: real-time detection and isolation with a PDE approach. IEEE Trans Intell Transp Syst 2022; 23(7): 8692–8703.

14. Cheng, Pan, Zhang. Adaptive unknown input observer-based detection and identification method for intelligent transportation under malicious attack. Meas Control 2023; 56(7–8): 1377–1386.

15. Wang, et al. LSTM-based SQL injection detection method for intelligent transportation system. IEEE Trans Vehicular Technol 2019; 68(5): 1–4191.

16. Gupta, Gaurav, Marín, et al. Novel graph-based machine learning technique to secure smart vehicles in intelligent transportation systems. IEEE Trans Intell Transp Syst 2023; 24(8): 8483–8491.

17. AlEisa, Alrowais, Allafi, et al. Transforming transportation: safe and secure vehicular communication and anomaly detection with intelligent cyber–physical system and deep learning. IEEE Trans Consum Electron 2024; 70(1): 1736–1746.

18. Zhang, et al. Detecting anomalies in intelligent vehicle charging and station power supply systems with multi-head attention models. IEEE Trans Intell Transp Syst 2021; 22(1): 555–564.

19. Zhang, Zhou, et al. Hybrid transfer and self-supervised learning approaches in neural networks for intelligent vehicle intrusion detection and analysis. IEEE Internet Things J 2025; 12(7): 7677–7692.

20. Wang, Hongyu, Ruiping, et al. Detection of malicious attack in network vehicle system via observer. In: International conference on green intelligent transportation system and safety. Lecture notes in electrical engineering, Qinhuangdao, China, 16–18 September 2022, paper no. 1200, pp.13–14. Singapore: Springer.

21. Nadeem, Arshad, Riaz, et al. Preventing cloud network from spamming attacks using Cloudflare and KNN. Comput Mater Contin 2023; 74(2): 2641–2659.

22. Wang, Qin. A dynamic MLP-based DDoS attack detection method using feature selection and feedback. Comput Secur 2020; 88: 101645.

23. Song, Woo, Kim. In-vehicle network intrusion detection using deep convolutional neural network. Veh Commun 2020; 21: 100198.

24. Javed, Rehman, Khan, et al. CANintelliIDS: detecting in-vehicle intrusion attacks on a controller area network using CNN and attention-based GRU. IEEE Trans Netw Sci Eng 2021; 8(2): 1456–1466.