Forecasting traffic congestion status in terminal areas based on support vector machine

Abstract

This article investigates a traffic congestion status forecasting method to improve the real-time monitoring and control of air traffic in terminal areas. First, a traffic congestion status evaluation method based on a fuzzy C-means clustering algorithm is introduced, together with several traffic congestion status evaluation metrics. Then, a traffic congestion status forecasting model based on a support vector machine is proposed. Finally, a real case study from a terminal area in China is used to test and verify the proposed evaluation method and forecasting model. The evaluation results show that the traffic congestion status of the terminal area can be classified into five levels: free, fluent, slightly congested, moderately congested, and severely congested. The forecasting results show that the mean absolute error and the cluster accuracy are 0.041 and 92.2%, respectively, which indicates that the forecasting model is effective and accurate. In addition, the forecasting period and the size of the training set are found to influence the forecasting results, and the optimal results are obtained when the two parameters are 15 and 3, respectively.

Keywords

Transport engineering, support vector machine, forecasting model, air traffic congestion

Introduction

A terminal maneuvering area (TMA) is an air traffic control area normally established at the confluence of air traffic service routes in the vicinity of one or more major airdromes. Usually, the air route network in a TMA is complex, the types of flights are varied (departures, approaches, and over-flights), and potential conflicts between aircraft are frequent. As a result, air traffic problems, such as air traffic congestion, large numbers of flight delays, and even aircraft accidents, occur more easily in a terminal area. Air traffic management in terminal areas is becoming an extremely difficult problem, and technologies for identifying and forecasting the congestion situation play an increasingly important role in terminal area traffic control.

Air traffic congestion status is a macroscopic representation of the fluency of air traffic flow in a certain air traffic control area. Forecasting congestion status aims to provide air traffic controllers with real-time and accurate information about upcoming traffic and to help them control air traffic more safely and effectively, which is of great practical significance.

Research on the evaluation of air traffic congestion status is at an early stage, and results are relatively scarce. However, scholars have done a great deal of work on dynamic density and intrinsic properties, which provides an important basis for research on forecasting air traffic congestion status. The combined influence of metrics (i.e. traffic density, complexity, and number of airports) on controllers’ workload was studied, and the workload was found to be closely connected with these metrics.¹ The influence of different airspace factors on controllers’ workload was analyzed, and the workload was found to correlate strongly with sector geometry.² In 1995, the Radio Technical Commission for Aeronautics (RTCA) first proposed the idea of quantitatively describing airspace complexity.³ After that, the “dynamic density” model was established to evaluate air traffic complexity.⁴ Dynamic density is a proposed concept for a metric that includes dynamic metrics, traffic density, and conflict metrics, and it reflects controller workload with a linear weighted sum. However, the model cannot reflect controllers’ intentions, and the weights are vulnerable to subjective factors.⁵ By analyzing intrinsic data, such as radar tracks, another way to measure complexity was found.^6,7 The research on intrinsic properties suggests that complexity is correlated with airspace structure and the connections between individual aircraft.⁸ Researchers believe that the number of aircraft is the crucial factor that influences air traffic complexity.⁹ The research on intrinsic properties avoids the difficulty of measuring controller workload. However, it also ignores human factors.

According to the data used and the models applied, air traffic congestion forecasting methods are mainly divided into four categories: forecasting based on mathematical statistics, forecasting based on air traffic flow models, forecasting based on intelligent algorithms, and forecasting based on computer simulation.¹⁰ The forecasting method based on mathematical statistics obtains air traffic congestion status by analyzing traffic operation data. Researchers from NASA established the probability distribution function of aircraft passing an air traffic unit,¹¹ and forecasting results were obtained by comparing the forecasted demand with the actual capacity. From the perspective of weather, a weather-impacted traffic index was introduced, and a corresponding linear regression model was proposed.¹² Taking different lengths of time slices into consideration, researchers from the National Transportation Systems Center proposed linear regression forecasting models based on three continuous time intervals¹³ and on adjacent time intervals¹⁴ and obtained forecasting results by comparing demand with capacity. From the perspective of risk, an airport congestion risk forecasting model was proposed based on the random demand of airport approaches and departures.¹⁵

The forecasting method based on air traffic flow models obtains congestion status metrics using traffic flow parameter relation models, such as the converging traffic flow model,^16–18 the flight delay and cancelation model,¹⁹ and the approach flight instantaneous queuing model.²⁰ The forecasting method based on intelligent algorithms obtains congestion status metrics by inputting forecasted values into a congestion status identification model. Based on the support vector machine (SVM), air route traffic congestion status was forecasted with an air route traffic demand model and a threshold method.²¹ An airport congestion status forecasting method was proposed based on a fuzzy neural network and measured data.²² The forecasting method based on simulation obtains congestion status using traffic simulation systems. Based on the Total Airspace and Airport Modeller (TAAM) simulation system, an airport operation under stochastic factors was simulated, and the congestion status was analyzed from the simulated results.²³ Based on the SIMMOD simulation system, an airport operation under different weather conditions and runway configurations was simulated.²⁴ Based on the NetLogo simulation tools, a cell transmission model of air traffic flow was established for the terminal area, and its basic congestion evaluation rules were discussed.^25,26

The air traffic system is a complex human-centered system, which consists of air traffic controllers, air traffic flow, and the airspace environment. An analysis of traffic congestion status should therefore take the operating characteristics of air traffic flow, air traffic controller factors, and airspace complexity into account. This article presents a comprehensive method for forecasting TMA traffic congestion status with these factors based on fuzzy C-means (FCM) and SVM. The main steps of forecasting TMA congestion status, shown in Figure 1, are as follows:

Congestion status evaluation. First, an integrated congestion status evaluation model is formulated. Its main component is the equivalent airspace occupancy (AO), which accounts for the air traffic controllers’ workload on each aircraft category (e.g. landing, departure, and flyover); the basic congestion status metrics, which describe air traffic flow characteristics and airspace complexity, enter the model as contributing factors.

Congestion status classification. Next, based on the result of status evaluation, an FCM clustering method is adopted to divide air traffic status into five levels under stability and minimum-iteration constraints.

Congestion status forecasting. Finally, with the historic results of congestion status evaluation and classification as training samples, a traffic congestion status forecasting model based on SVM is proposed, and the sensitivity of model performance to its parameters is analyzed.

Figure 1.

Process of forecasting congestion status in terminal area.

The rest of this article is organized as follows. Section “Research method” presents TMA congestion status forecasting models, consisting of congestion status evaluation metrics, identification, and forecasting models. Section “Results and discussion” presents a case study with a large TMA in China to test and verify the proposed models. Some conclusions and implications are presented in section “Conclusion.”

Research method

Congestion status evaluation metrics

Average flow velocity

The metric reflects the fluency of traffic flow in TMA. According to previous research, air traffic flow status can be divided into five phases (free, unimpeded, metastable, pseudo-congestion, and synchronized congestion). The average flow velocity (AFV) is an important parameter to measure the status. In general, when the AO is extremely low, the average velocity of the free traffic flow fluctuates greatly with no monotonic regularity. As the AO increases, the interaction between aircraft increases. As a result, the following behavior between successive aircraft leads to a decrease in average velocity. If $v_{t, i}$ is the velocity of aircraft i at time t and $N_{t}$ is the total number of aircraft in the TMA at time t, the AFV of the TMA is as follows

{\bar{V}}_{t} = \frac{v_{t, 1} + v_{t, 2} + \dots + v_{t, N_{t}}}{N_{t}}

(1)

Standard deviation of velocity

The standard deviation of velocity (SDV) reflects the deviation of individual aircraft velocities from the average velocity. When the AO is fixed and the aircraft velocities are concentrated, controllers can predict aircraft positions correctly and quickly and sequence the aircraft. Otherwise, interval allocation and velocity adjustment become more difficult; as a result, the control workload increases and the traffic status shifts toward congestion. The SDV in TMA at time t is as follows

S_{t}^{v} = \sqrt{\frac{\sum {(| v_{t, i} | - {\bar{V}}_{t})}^{2}}{N_{t} - 1}}

(2)

Standard deviation of heading angle

The metric reflects the clustering of flight paths. When the flight paths are dispersed, the value is fairly large, making it hard for controllers to cluster the flight paths using abstract structure cognition. When the flight paths gather into lines and form standard flows, controllers can simplify the traffic situation and the control workload is relieved. If the average heading angle of aircraft at time t is ${\bar{Hd}}_{t}$ and the heading angle of aircraft i at time t is $H d_{t, i}$ , the standard deviation of heading angle (SDH) in TMA is as follows

S_{t}^{Hd} = \sqrt{\frac{\sum {(| H d_{t, i} | - {\bar{Hd}}_{t})}^{2}}{N_{t} - 1}}

(3)
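As an illustration of equations (1)–(3), the following minimal sketch computes AFV, SDV, and SDH for a single time slice. The function name `flow_metrics` and the input lists are hypothetical, and the sketch assumes that heading angles can be averaged arithmetically (wrap-around at 0°/360° is ignored), which the article does not discuss.

```python
import math


def flow_metrics(velocities, headings):
    """Return (AFV, SDV, SDH) for one time slice, following equations (1)-(3).

    velocities: ground speeds of the aircraft in the TMA (e.g. km/h)
    headings:   heading angles of the same aircraft (degrees)
    Assumption: headings are averaged arithmetically, without wrap-around.
    """
    n = len(velocities)
    if n < 2 or n != len(headings):
        raise ValueError("need at least two aircraft and matching list lengths")
    afv = sum(velocities) / n  # equation (1)
    # equation (2): sample standard deviation of speed around the AFV
    sdv = math.sqrt(sum((abs(v) - afv) ** 2 for v in velocities) / (n - 1))
    mean_heading = sum(headings) / n
    # equation (3): sample standard deviation of heading angle
    sdh = math.sqrt(sum((abs(h) - mean_heading) ** 2 for h in headings) / (n - 1))
    return afv, sdv, sdh


afv, sdv, sdh = flow_metrics([500.0, 600.0, 700.0], [80.0, 90.0, 100.0])
```

For the three hypothetical aircraft above, the sketch gives an AFV of 600 km/h, an SDV of 100 km/h, and an SDH of 10°.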

Traffic mixing coefficient

The metric reflects the number of aircraft and the mixing degree of different categories of aircraft (climbing, descending, and cruising). Normally, the bigger the traffic mixing coefficient (TMC), the bigger the workload of air traffic controllers. Assuming that $n_{c, t}$ , $n_{d, t}$ , and $n_{r, t}$ , respectively, represent the number of aircraft climbing, descending, and cruising, the TMC can be calculated as follows

C_{t} = \frac{n_{c, t} \times n_{d, t} + n_{c, t} \times n_{r, t} + n_{d, t} \times n_{r, t}}{(n_{c, t} + n_{d, t} + n_{r, t}) \cdot N_{t}}

(4)
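Equation (4) can be sketched as a small function. The name `traffic_mixing_coefficient` is hypothetical; the sketch assumes that $N_{t}$ equals the sum of the three category counts unless it is passed explicitly, and that an empty airspace yields a coefficient of 0. Neither convention is stated in the article.

```python
def traffic_mixing_coefficient(n_climb, n_descend, n_cruise, n_total=None):
    """Traffic mixing coefficient C_t of equation (4).

    Assumptions: n_total defaults to the sum of the three counts,
    and an empty airspace returns 0.0.
    """
    count = n_climb + n_descend + n_cruise
    if n_total is None:
        n_total = count
    if count == 0 or n_total == 0:
        return 0.0
    # Pairwise products between the three flight-phase categories
    pairs = n_climb * n_descend + n_climb * n_cruise + n_descend * n_cruise
    return pairs / (count * n_total)


mixed = traffic_mixing_coefficient(4, 4, 4)    # evenly mixed traffic
uniform = traffic_mixing_coefficient(12, 0, 0)  # a single category only
```

With these hypothetical counts, evenly mixed traffic gives 1/3, while traffic in a single category gives 0, which matches the intent that more mixing means a larger coefficient.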

Equivalent AO

The AO is the number of aircraft within a certain spatial range at a given instant. Research has shown that control workload has a complicated non-linear relation to the AO. By analyzing control audio data, different categories of aircraft (taking off, landing, and flyover) are found to generate different control workloads. Therefore, the concept of equivalent AO is introduced, which is the weighted sum of the number of aircraft within a certain spatial range at a given instant. The weights are set according to the ratios of the control workload spent on each flight per minute, and the weight of the category with the lowest workload is set to 1. If the number of aircraft of category k in TMA at time t is $n_{k, t}$ and the weight of category k is $w_{k}$ , the equivalent AO is as follows

N_{e, t} = \sum_{k} w_{k} \cdot n_{k, t}

(5)

Congestion status evaluation method based on FCM

Integrated congestion status evaluation model

In this article, aircraft are divided into three categories (flyover, taking off, and landing). However, in terminal airspace, flyover aircraft at different altitudes generate different complexity and difficulty for air traffic control because of their interactions with departure and landing flights, so the flyover category is further split into high-altitude and low-altitude flyover. A different weight is given to each category, and the weighted counts are added up as the equivalent AO. Multiplying the equivalent AO, normalized by capacity, by the basic status metric coefficient gives the terminal air traffic congestion status value, as shown in formulas (6)–(8)

U_{t} = \frac{N_{e, t}}{Cp} \times K_{t}

(6)

N_{e, t} = w_{c} \cdot n_{c, t} + w_{d} \cdot n_{d, t} + w_{r} \cdot n_{r, t} + w_{R} \cdot n_{R, t}

(7)

K_{t} = k_{\bar{v}, t} \cdot k_{s^{v}, t} \cdot k_{s^{Hd}, t} \cdot k_{c, t}

(8)

where $U_{t}$ is congestion status value at time t; Cp is the maximum instantaneous capacity of TMA; $K_{t}$ is the basic status metric coefficient; $w_{c}$ , $w_{d}$ , $w_{r}$ , and $w_{R}$ are, respectively, the weight of taking off aircraft, landing aircraft, low-altitude flyover aircraft, and high-altitude flyover aircraft; $k_{\bar{v}, t}$ , $k_{s^{v}, t}$ , $k_{s^{Hd}, t}$ , and $k_{c, t}$ are the coefficient of average velocity, SDV, SDH, and traffic mixing, respectively, at time t.
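Formulas (6)–(8) can be combined in a few lines. The sketch below is illustrative only: the function name `congestion_status`, the category keys, and the numeric weights and capacity in the usage lines are hypothetical placeholders rather than values from the article.

```python
def congestion_status(counts, weights, capacity,
                      k_v=1.0, k_sv=1.0, k_shd=1.0, k_c=1.0):
    """Congestion status value U_t from formulas (6)-(8).

    counts / weights: dicts keyed by aircraft category
    capacity:         maximum instantaneous capacity Cp of the TMA
    k_*:              coefficients of average velocity, SDV, SDH, and TMC
    """
    # Formula (7): equivalent AO as a weighted sum over categories
    n_e = sum(weights[cat] * counts.get(cat, 0) for cat in weights)
    # Formula (8): basic status metric coefficient
    k_t = k_v * k_sv * k_shd * k_c
    # Formula (6): normalize by capacity and scale by the coefficient
    return n_e / capacity * k_t


# Hypothetical placeholder values, for illustration only
weights = {"takeoff": 1.5, "landing": 1.5, "low_flyover": 1.0, "high_flyover": 1.0}
counts = {"takeoff": 4, "landing": 4, "low_flyover": 2, "high_flyover": 2}
u_t = congestion_status(counts, weights, capacity=20)
```

With these placeholder inputs the equivalent AO is 16, so the status value is 0.8 when all four coefficients equal 1.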

Congestion status classification process

The classification of congestion status values is uncertain and fuzzy. To deal with this, FCM clustering is introduced.²⁷ The basic idea of FCM is to cluster the sample into a given number of sets so that elements within a set reach the maximum similarity and elements in different sets reach the minimum similarity. Once the membership degree matrix has been obtained, each element is assigned to a set according to the maximum membership principle.²⁸ The process of FCM clustering is shown in Figure 2.

Figure 2.

The general process of FCM clustering.
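The general FCM loop can be sketched for one-dimensional congestion status values as follows. This is a minimal illustration of the standard algorithm, not the article's implementation: the function names, the fuzzifier `m = 2`, the evenly spaced initial centers, and the stopping tolerance are all assumptions, and the 15-min stability constraint used later in the article is not modeled.

```python
def _membership_row(x, centers, m):
    """Membership degrees of one value x with respect to every center."""
    dists = [abs(x - c) for c in centers]
    if min(dists) == 0.0:
        # x coincides with a center: give it full membership there
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
    total = sum(inv)
    return [v / total for v in inv]


def fuzzy_c_means_1d(values, n_clusters, m=2.0, max_iter=100, tol=1e-6):
    """Cluster 1-D values; returns (centers, labels).

    Assumption: centers start evenly spaced between min and max.
    """
    lo, hi = min(values), max(values)
    centers = [lo + (j + 0.5) * (hi - lo) / n_clusters for j in range(n_clusters)]
    memberships = []
    for _ in range(max_iter):
        memberships = [_membership_row(x, centers, m) for x in values]
        new_centers = []
        for j in range(n_clusters):
            w = [row[j] ** m for row in memberships]
            new_centers.append(sum(wi * x for wi, x in zip(w, values)) / sum(w))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    # Maximum membership principle: pick the set with the largest degree
    labels = [row.index(max(row)) for row in memberships]
    return centers, labels


centers, labels = fuzzy_c_means_1d([0.10, 0.12, 0.11, 0.90, 0.92, 0.91], 2)
```

On the toy data above, the two centers settle near 0.11 and 0.91, and the first three values are assigned to one set and the last three to the other.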

Congestion status forecasting method based on SVM

Congestion status forecasting model

Given a training data set $\{(x_{i}, y_{i}), i = 1, 2, \dots, N\}$ , where $x_{i} = (x_{i}^{1}, x_{i}^{2}, \dots, x_{i}^{d})^{T}$ is a d-dimensional column vector and $y_{i}$ is the corresponding target value, the idea of SVM regression is to map the sample space into a higher-dimensional feature space through a nonlinear mapping and to fit a linear function there. The regression estimation function is as follows^29,30

f (x) = ω \cdot φ (x) + b

(9)

where $φ (x)$ is the mapping of the input vector x into the feature space, and $ω$ and b are undetermined coefficients, which are calculated as follows

\begin{matrix} \min r (ω, ξ, ξ^{*}) = \frac{1}{2} {‖ ω ‖}^{2} + C \sum_{i = 1}^{N} (ξ_{i} + ξ_{i}^{*}) \\ \text{s.t. } ω \cdot φ (x_{i}) + b - y_{i} \leq ε + ξ_{i}, i = 1, 2, \dots, N \\ y_{i} - ω \cdot φ (x_{i}) - b \leq ε + ξ_{i}^{*}, i = 1, 2, \dots, N \\ ξ_{i}, ξ_{i}^{*} \geq 0, i = 1, 2, \dots, N \end{matrix}

(10)

where $ε$ is an insensitive loss parameter, which determines maximum allowable error; C is a penalty parameter; and $ξ$ and $ξ^{*}$ are slack variables. Introducing the Lagrange multiplier, formula (10) turns into the following

f (x, β, β^{*}) = \sum_{i = 1}^{N} (β_{i} - β_{i}^{*}) K (x, x_{i}) + b

(11)

The kernel function $K (x_{i}, x_{j}) = φ (x_{i}) φ (x_{j})$ could be any function that meets Mercer’s condition.
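To make the kernel form of the regression function in formula (11) concrete, the sketch below evaluates it for given dual coefficients. The function name `svr_predict`, the dot-product kernel, and all numeric values are hypothetical; in practice the coefficients and the offset b come out of the trained model, not from hand-picked numbers.

```python
def svr_predict(x, support_vectors, dual_coefs, b, kernel):
    """Evaluate f(x) = sum_i coef_i * K(x, x_i) + b.

    dual_coefs[i] stands for the difference of the two Lagrange
    multipliers associated with training sample x_i.
    kernel is any callable K(x, z) that satisfies Mercer's condition.
    """
    return sum(coef * kernel(x, sv)
               for coef, sv in zip(dual_coefs, support_vectors)) + b


def linear_kernel(x, z):
    """Plain dot product, the simplest kernel that meets Mercer's condition."""
    return sum(a * c for a, c in zip(x, z))


# Hypothetical support vectors and coefficients, for illustration only
svs = [[1.0, 0.0], [0.0, 1.0]]
coefs = [0.5, -0.25]
y_hat = svr_predict([2.0, 4.0], svs, coefs, b=0.1, kernel=linear_kernel)
```

Here the prediction is 0.5·2 − 0.25·4 + 0.1 = 0.1. Swapping `linear_kernel` for another Mercer kernel changes the feature space without changing the prediction code.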

Congestion status forecasting process

The forecasting model is trained with the training data from a sample, which consist of the congestion status values and airspace occupancies of a day. Parameters s and k are predetermined. s is the forecasting period: s time slices are combined into one, and the average congestion status value and average AO of the combined time slices are calculated. k is the size of the training set, which determines the number of combined time slices in the input data set of a single forecasting loop.

Given the congestion status values and AO of the first k combined time slices of the forecasted day, the congestion status values of the whole day can be forecasted. The forecasted congestion status values are clustered based on the clustering results of the sample. The process of congestion status forecasting is as follows:

Step 1. Combine every s time slices into one. Calculate the average congestion status value $U_{t}$ and AO $N_{t}$ at combined time slice t of the training sample and calculate the average congestion status value $U'_{t}$ and AO $N'_{t}$ at combined time slice t of the actual data from the forecasted day. Cluster the average congestion status values of the training sample based on FCM and use the results to cluster the average congestion status values of the forecasted day.

Step 2. Build an independent variable matrix of the training data

Φ = \left[ \begin{matrix} U_{1} & U_{2} & \dots & U_{k} & N_{k} \\ U_{2} & U_{3} & \dots & U_{k + 1} & N_{k + 1} \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋮ \\ U_{m - k} & U_{m - k + 1} & \dots & U_{m - 1} & N_{m - 1} \end{matrix} \right]

(12)

Build a dependent variable matrix of the training data

P = {\left[ \begin{matrix} U_{k + 1} & U_{k + 2} & \dots & U_{m} \end{matrix} \right]}^{T}

(13)

Input $Φ$ and P to the SVMTRAIN function in LIBSVM software package. The forecasting model is trained after selecting the proper kernel function and setting the model parameters.

Step 3. Build the input matrix of the trained forecasting model

A_{i} = {\left[ \begin{matrix} U'_{i} & U'_{i + 1} & \dots & U'_{i + k - 1} & N'_{i + k - 1} \end{matrix} \right]}^{T}

(14)

Forecast based on the trained forecasting model. The output is the congestion status value $U'_{i + k}$ . i is the loop marker, which begins with 1.

Step 4. Let $i = i + 1$ . If $i \leq n$ , return to step 3; otherwise, go to step 5. Here, n is the length of the forecasting congestion status value array.

Step 5. Output the forecasting congestion status value array

Λ = (U'_{k + 1}, U'_{k + 2}, \dots, U'_{n + k})

(15)

Cluster the congestion status value according to the clustered results of the sample and evaluate forecasting performances by comparing with the actual congestion status values of the forecasted day.
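Steps 2–5 can be sketched as two small helpers. The names `build_training_matrices` and `rolling_forecast` are hypothetical, and `predict` is a stand-in for the trained SVM model. The sketch assumes that the forecast is recursive (each forecasted value is fed back into later input windows) and that the AO values of the forecasted day are available for every combined time slice; the article does not spell out either point.

```python
def build_training_matrices(u, ao, k):
    """Build the rows of matrix (12) and the targets of matrix (13).

    u, ao: congestion status values and AO of the training day,
           one entry per combined time slice (length m).
    Each row holds k consecutive status values plus the AO of the last one;
    the target is the status value of the following slice.
    """
    m = len(u)
    phi = [list(u[j:j + k]) + [ao[j + k - 1]] for j in range(m - k)]
    p = [u[j + k] for j in range(m - k)]
    return phi, p


def rolling_forecast(predict, u_first_k, ao, k, n_steps):
    """Steps 3-5: forecast n_steps values starting from the first k values.

    predict: callable taking one input vector (14) and returning a value;
             a hypothetical stand-in for the trained model.
    Assumption: forecasts are fed back into later input windows.
    """
    history = list(u_first_k)
    for i in range(n_steps):
        features = history[i:i + k] + [ao[i + k - 1]]
        history.append(predict(features))
    return history[k:]  # the array of formula (15)


phi, p = build_training_matrices([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], k=2)
forecast = rolling_forecast(lambda f: f[-2] + 1, [1, 2], [10, 20, 30, 40], 2, 3)
```

With the toy series above, `phi` contains the rows `[1, 2, 20]`, `[2, 3, 30]`, and `[3, 4, 40]`, and `p` is `[3, 4, 5]`. The rows of `phi` and the entries of `p` would then be passed to the training routine.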

Results and discussion

The congestion status evaluation and forecasting methods discussed above are verified with a terminal area in China, whose airspace structure is shown in Figure 3. The airspace consists of standard arrival routes, standard instrument departure routes, runways, and so on. The radar data consist of aircraft positions (longitudes and latitudes), ground speeds (km/h), heights (100 feet), heading angles (degree), and flight numbers for 5000 time slices during a typical day.

Figure 3.

The airspace structure of the TMA.

Evaluation results

Calibrating for basic status metrics

The values of AFV, SDV, SDH, and TMC of the day at 5-min intervals can be calculated based on the proposed method, and their relationships with AO are analyzed. It is found that when the AO is relatively low (0–10), the values of these metrics are spread over a wide interval with no regularity; when the AO is relatively high, the values of these metrics are concentrated, with a slight fluctuation around an average. All the metrics except AFV have positive correlations with AO when AO is greater than 10. The relationship between TMC and AO is taken as an example, as shown in Figure 4. With the cutoff value of AO set to 10, the average value of each metric over the samples with AO >10 is calculated and used as the reference value of that basic status metric, from which the basic status metric coefficients are determined. Reference values of basic status metrics are shown in Table 1.

Figure 4.

Relationship between TMC and AO.

Table 1.

Reference values of basic status metrics.

AFV reference value ${\bar{V}}_{t}^{R}$	SDV reference value $S_{R}^{v}$	SDH angle reference value $S_{R}^{Hd}$	TMC reference value $C_{t}^{R}$
600 km/h	180 km/h	108°	0.316

AFV: average flow velocity; SDV: standard deviation of velocity; SDH: standard deviation of heading angle; TMC: traffic mixing coefficient.

When the AO is >10, SDV, SDH, and TMC are all positively correlated with traffic congestion, so $k_{s^{v}, t} = S_{t}^{v} / S_{R}^{v}$ , $k_{s^{Hd}, t} = S_{t}^{Hd} / S_{R}^{Hd}$ , and $k_{c, t} = C_{t} / C_{t}^{R}$ . The AFV is negatively correlated with traffic congestion, so $k_{\bar{v}, t} = {\bar{V}}_{t}^{R} / {\bar{V}}_{t}$ . When the AO is ≤10, the four metrics are unable to reflect congestion status, and the four coefficients are set to 1.
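The calibration rule can be sketched with the reference values of Table 1. The function name `basic_status_coefficient` and the dictionary keys are hypothetical; the reference values and the AO cutoff of 10 are taken from the text.

```python
# Reference values from Table 1 (AFV and SDV in km/h, SDH in degrees)
REFERENCE = {"afv": 600.0, "sdv": 180.0, "sdh": 108.0, "tmc": 0.316}
AO_CUTOFF = 10


def basic_status_coefficient(ao, afv, sdv, sdh, tmc):
    """Product of the four metric coefficients for one time slice."""
    if ao <= AO_CUTOFF:
        # Below the cutoff the metrics carry no congestion information
        return 1.0
    k_v = REFERENCE["afv"] / afv     # AFV is negatively correlated
    k_sv = sdv / REFERENCE["sdv"]    # SDV, SDH, TMC are positively correlated
    k_shd = sdh / REFERENCE["sdh"]
    k_c = tmc / REFERENCE["tmc"]
    return k_v * k_sv * k_shd * k_c


low_traffic = basic_status_coefficient(8, 450.0, 200.0, 90.0, 0.2)
slow_flow = basic_status_coefficient(15, 300.0, 180.0, 108.0, 0.316)
```

In the first hypothetical call the AO is below the cutoff, so the coefficient is 1; in the second, the flow moves at half the reference velocity while the other metrics sit at their reference values, so the coefficient is 2.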

Calibrating for equivalent AO

Currently, radio communication is the main means of conducting air traffic control, and the duration of control audio is an observable expression of control workload. Evaluating airspace capacity from controllers’ workload with audio data is therefore an effective method. The average working time per minute for dealing with different categories of aircraft can be obtained from control audio data. The contributions of each category to control workload are shown in Table 2.

Table 2.

Contributions to control workload of each category.

Categories	Average work time per minute (s)	Explanation
Landing aircraft	8.5	Spend 8.5s per minute on controlling a landing aircraft
Takeoff aircraft	7.5	Spend 7.5s per minute on controlling a takeoff aircraft
Low-altitude flyover aircraft	5.8	Spend 5.8s per minute on controlling a low-altitude flying over aircraft
High-altitude flyover aircraft	5.2	Spend 5.2s per minute on controlling a high-altitude flying over aircraft

The contributions of different aircraft categories to control workload are obtained by combining audio data and radar tracks. The equivalent weight of high-altitude flyover aircraft is set to 1, and the other equivalent weights are the ratios of their workload to that of high-altitude flyover aircraft. The equivalent weight of landing aircraft is $w_{d} = 8.5 / 5.2 = 1.63$ , the equivalent weight of takeoff aircraft is $w_{c} = 7.5 / 5.2 = 1.44$ , and the equivalent weight of low-altitude flyover aircraft is $w_{r} = 5.8 / 5.2 = 1.12$ . The instantaneous capacity is 22 flights, which is obtained using an evaluation model in Zhang.³¹ With the calibrated indexes, the congestion status values and AO values are calculated using the proposed models. Their trends during the typical day are shown in Figure 5. The two trends are clearly consistent, which shows that the AO is also an important metric for evaluating air traffic congestion.
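The weight arithmetic can be reproduced from Table 2 in a few lines. The variable and function names are hypothetical, and the aircraft counts in the usage line are illustrative, not data from the case study.

```python
# Average work time per minute (s) for each category, from Table 2
work_seconds = {
    "landing": 8.5,
    "takeoff": 7.5,
    "low_flyover": 5.8,
    "high_flyover": 5.2,
}

# The category with the lowest workload gets weight 1
baseline = min(work_seconds.values())
weights = {cat: round(sec / baseline, 2) for cat, sec in work_seconds.items()}


def equivalent_ao(counts):
    """Weighted number of aircraft for one time slice."""
    return sum(weights[cat] * counts.get(cat, 0) for cat in weights)


# Illustrative counts only
n_e = equivalent_ao({"landing": 2, "takeoff": 1, "high_flyover": 3})
```

The resulting weights are 1.63, 1.44, 1.12, and 1.0, matching the values in the text; for the illustrative counts the equivalent AO is 2·1.63 + 1.44 + 3 = 7.7.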

Figure 5.

Changing tendencies of congestion status and AO.

Congestion status classification

The evaluation of congestion status aims to support air traffic control operation, traffic flow management, and airspace management, so sudden changes of the evaluated congestion status should be avoided. Therefore, based on the principle of 15-min stability and minimum iterations, the terminal air traffic congestion status is clustered into five levels: free, fluent, slightly congested, moderately congested, and severely congested, as shown in Table 3.

Table 3.

Congestion status clustering results.

Congestion levels	Free	Fluent	Slightly congested	Moderately congested	Severely congested
Value intervals	<0.274	0.275–0.567	0.568–0.831	0.832–1.107	>1.108
Time slices	1376	814	1092	1131	587
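A lookup from a congestion status value to its level can be sketched from the value intervals in Table 3. The function name `congestion_level` is hypothetical. Because the printed intervals leave small gaps (for example between 0.274 and 0.275), the sketch assumes that each printed upper bound is inclusive and that anything above it belongs to the next level.

```python
import bisect

# Upper bounds of the first four levels, taken from Table 3
UPPER_BOUNDS = [0.274, 0.567, 0.831, 1.107]
LEVELS = [
    "free",
    "fluent",
    "slightly congested",
    "moderately congested",
    "severely congested",
]


def congestion_level(u):
    """Map a congestion status value to one of the five levels.

    Assumption: a value equal to an upper bound stays in the lower level.
    """
    return LEVELS[bisect.bisect_left(UPPER_BOUNDS, u)]


levels = [congestion_level(u) for u in (0.10, 0.50, 0.70, 1.00, 1.20)]
```

The five illustrative values above fall into the five levels in order, from free to severely congested.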

According to the congestion status clustering results shown in Table 3, the TMA congestion status during the day can be evaluated, as shown in Figure 6. In the free level, AO is extremely low, aircraft are scattered and guided by standard arrival or departure procedures with no potential conflict, and air traffic controllers mainly monitor the TMA. In the fluent level, AO is low, flight paths are usually straight, and some potential conflicts exist, so controllers should control individual aircraft with some instructions. In the slightly congested level, AO is moderate, aircraft are uniformly distributed in a dynamic state, and interaction between aircraft gradually appears; controllers should manage aircraft following by adjusting velocities and shift their focus from individual aircraft to multiple aircraft. In the moderately congested level, AO is high, safety requirements can hardly be met with velocity adjustment alone, aircraft maneuvers occur occasionally, and the controllers’ workload increases. In the severely congested level, the number of aircraft allowed to approach the TMA is limited, aircraft maneuvers are frequent, holding aircraft appear, and the traffic flow status changes from linear to scattered.

Figure 6.

Congestion status with time series.

Forecasting results

The forecasting model is developed using the LIBSVM software in MATLAB. It is necessary to select a proper kernel function and determine reliable model parameters before forecasting congestion based on the developed model.

Selecting the kernel function

Different kernel functions yield different forecasting performance, so a proper kernel function must be selected. The frequently used kernel functions are the polynomial kernel function, the radial basis function (RBF) kernel function, and the sigmoid kernel function.^32–35 The polynomial kernel function is formulated as $k (x_{i}, x) = [s (x \cdot x_{i}) + b]^{d}$ , where s and b are adjustable parameters, $b \geq 0$ , and d is any positive integer. The RBF kernel function is formulated as $k (x_{i}, x) = \exp (- g ‖ x - x_{i} ‖^{2})$ , where g is an adjustable parameter. The sigmoid kernel function is formulated as $k (x_{i}, x) = \tanh (s (x \cdot x_{i}) + b)$ , where s and b are adjustable parameters. To select a proper kernel function, a data set with 200 time slices is chosen. The results show that the relative errors of the RBF kernel function, the polynomial kernel function, and the sigmoid kernel function are 13.2%, 16.4%, and 18.2%, respectively. The RBF kernel function is therefore selected.
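The three candidate kernels can be written out directly. The function names and default parameter values below are illustrative assumptions, and the polynomial offset is taken to be the parameter b that appears in the kernel formula.

```python
import math


def _dot(x, z):
    """Inner product of two equal-length vectors."""
    return sum(a * c for a, c in zip(x, z))


def polynomial_kernel(x, z, s=1.0, b=0.0, d=3):
    """Polynomial kernel: (s * <x, z> + b) ** d."""
    return (s * _dot(x, z) + b) ** d


def rbf_kernel(x, z, g=1.0):
    """RBF kernel: exp(-g * squared Euclidean distance)."""
    return math.exp(-g * sum((a - c) ** 2 for a, c in zip(x, z)))


def sigmoid_kernel(x, z, s=1.0, b=0.0):
    """Sigmoid kernel: tanh(s * <x, z> + b)."""
    return math.tanh(s * _dot(x, z) + b)


same_point = rbf_kernel([1.0, 2.0], [1.0, 2.0])             # identical inputs
poly = polynomial_kernel([1, 2], [3, 4], s=1.0, b=1.0, d=2)  # (11 + 1) ** 2
```

For identical inputs the RBF kernel returns 1, and the polynomial example evaluates to 144.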

Determining model parameters

The penalty factor c and the kernel function parameter g have an important influence on the forecasting results. The penalty factor c determines the influence of outliers: a larger c imposes a greater loss on the objective function for each outlier, which means the model is less willing to ignore these outliers. The kernel function parameter g affects the generalization ability of the SVM. With the RBF kernel defined above, when g is large, the fit to the training sample is satisfactory, but the generalization ability to new samples is poor. When g is small, the decision function is close to a constant, which gives poor fitting and forecasting ability.

The grid searching method is used to determine the optimal parameter pair $(c, g)$ , and K-fold cross validation is used to evaluate the generalization of each parameter pair. The K-fold cross validation divides the raw data into K groups. In each of the K validation rounds, one group is the validation set and the other $K - 1$ groups form the training set. The average accuracy over the K rounds is taken as the performance of the parameter pair. Setting the searching range of c and g as $[2^{- 8}, 2^{8}]$ and the searching step as 1, the optimal result is $c = 0.1$ , $g = 0.001$ . Setting the forecasting period $s = 20$ and the size of training set $k = 3$ , and taking the congestion status values and AO of the day as the training sample, the congestion status values of the next day are forecasted, as shown in Figure 7.
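The grid search with K-fold cross validation can be sketched independently of any SVM library. In the sketch below, `score` is a hypothetical callable that would train a model with the given (c, g) on the training indices and return its accuracy on the validation indices. The interleaved fold assignment, the default of five folds, and the reading of the searching step as a step of 1 in the exponent of 2 are assumptions.

```python
def k_fold_indices(n_samples, n_folds):
    """Split sample indices into n_folds interleaved groups."""
    return [list(range(i, n_samples, n_folds)) for i in range(n_folds)]


def grid_search(score, n_samples, n_folds=5, exponents=range(-8, 9)):
    """Return (best_mean_score, c, g) over the grid c, g = 2 ** exponent.

    score(c, g, train_idx, val_idx) is a hypothetical stand-in for
    training and validating a model with one parameter pair.
    """
    folds = k_fold_indices(n_samples, n_folds)
    best = None
    for exp_c in exponents:
        for exp_g in exponents:
            c, g = 2.0 ** exp_c, 2.0 ** exp_g
            total = 0.0
            for val_idx in folds:
                held_out = set(val_idx)
                train_idx = [i for i in range(n_samples) if i not in held_out]
                total += score(c, g, train_idx, val_idx)
            mean_score = total / n_folds
            if best is None or mean_score > best[0]:
                best = (mean_score, c, g)
    return best


# Toy score that peaks at c = 2 and g = 0.125, for illustration only
toy_score = lambda c, g, train_idx, val_idx: -((c - 2.0) ** 2 + (g - 0.125) ** 2)
best_score, best_c, best_g = grid_search(toy_score, n_samples=20)
```

With the toy score the search returns c = 2 and g = 0.125. For time-ordered data, contiguous folds may be preferable to interleaved ones; that choice is left open here.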

Figure 7.

Forecasting results with $s = 20$ , $k = 3$ .

From Figure 7, the mean absolute error (MAE) is 0.041, the mean absolute percentage error (MAPE) is 12.3%, and the cluster accuracy (CA) is 92.2%. Except for some outliers, the similarity between forecasted values and actual values is relatively high, and the forecast accuracy of the congested levels is higher than that of the free level. For combined time slices 10–80, the congestion status is in the free level, and the forecasted values deviate from the actual values to a certain extent, because there is little regularity in the free level. When the congestion status values in the free level are excluded, the MAE and MAPE decrease to 0.038 and 5.1%, respectively. In addition, 225 forecasted values are higher than the actual values because the corresponding values in the training sample are higher than those of the forecasted day. This shows that the training sample has an influence on the forecasting results.
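The three accuracy measures can be computed as below. The function names are hypothetical, and since the article does not give a formula for CA, the sketch assumes it is the share of time slices whose forecasted congestion level equals the actual level.

```python
def mean_absolute_error(actual, forecast):
    """MAE: average absolute difference between forecast and actual values."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mean_absolute_percentage_error(actual, forecast):
    """MAPE in percent; actual values must be non-zero."""
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


def cluster_accuracy(actual_levels, forecast_levels):
    """CA in percent (assumed definition): share of matching levels."""
    hits = sum(1 for a, f in zip(actual_levels, forecast_levels) if a == f)
    return 100.0 * hits / len(actual_levels)


# Illustrative values only
mae = mean_absolute_error([1.0, 2.0], [1.5, 2.5])
mape = mean_absolute_percentage_error([1.0, 2.0], [1.5, 2.5])
ca = cluster_accuracy(["free", "fluent"], ["free", "free"])
```

For the illustrative inputs the MAE is 0.5, the MAPE is 37.5%, and the CA is 50%. The MAPE is sensitive to small actual values, which is consistent with the larger percentage error the article reports for the free level.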

Parameter influences

Influences of the size of the training set

With the forecasting period fixed at s = 20 and the size of the training set k set to 2, 5, 10, and 20 in turn, four groups of forecasting results are obtained, as shown in Figure 8. Some basic features can be seen in the distribution of the forecasted data. In terms of overall forecasting accuracy, MAE and MAPE follow the same trend as k varies, while CA follows the opposite one. Forecasted data with higher values have higher accuracies. As k increases, the range of forecasted values narrows, and the accuracy of extreme values drops. In the free congestion level, forecasted values are generally higher than actual values, and the deviation grows as k increases.

Figure 8.

Forecasting results with s = 20 and different values of k.

The forecasting accuracies with a constant s = 20 and different k from 1 to 20 are shown in Figure 9. When k = 3, CA, MAE, and MAPE all reach their optimal values. When $k < 3$ , the forecasting accuracy drops because the training sample is too small. When $k > 3$ , excessive samples disturb the forecast as the size increases, but there is no linear relation between k and the loss of accuracy. This shows that the size of the training set has an important influence on the forecasting performance, and the forecasting results are optimal when it is 3.

Figure 9.

Relationship between accuracy and k.

Influences of the forecasting period

In the same way as above, with the size of the training set fixed at k = 3 and the forecasting period s set to 1, 5, 15, and 30 in turn, four groups of forecasting results are obtained, as shown in Figure 10. Some basic features can also be seen in the distribution of the forecasted data. When s = 1, the forecasted values match the actual values but have punctate distributions. When s increases to 5 and 15, forecasting accuracies are higher, and the distributions of forecasted values are continuous. However, the forecasting accuracies of lower level congestion values and some extreme values decrease when s increases to 30.

Figure 10.

Forecasting results with k = 3 and different values of s.

The forecasting accuracies with a constant k = 3 and different s from 1 to 30 are shown in Figure 11. The trends of MAPE, CA, and MAE are all fairly stable, with little volatility. When the forecasting period is too small (s = 1) or too large (s = 30), all three metrics perform poorly. CA and MAE reach their optimal values when s = 15, and MAPE reaches its optimal value when s = 7. On the whole, the forecasting period has an influence on the forecasting performance, and the forecasting results are optimal when it is around 15.

Figure 11.

Relationship between accuracy and s.

Conclusion

This article focuses on air traffic congestion status forecasting method for terminal areas based on FCM and SVM. A congestion status evaluation model is introduced based on FCM clustering algorithm, and a congestion status forecasting model was proposed using the SVM method. Using the evaluation model, the terminal traffic congestion status is clustered into five levels: free, fluent, slightly congested, moderately congested, and severely congested. The proposed forecasting model is verified to have a good performance by an actual sample in China. In addition, it is found that some parameters have an important influence on forecasting performance and there exists an optimal value to improve forecasting accuracy.

Further research could focus on air traffic congestion situation awareness based on four-dimensional (4D) trajectory prediction. A comprehensive evaluation index system for air traffic congestion should be introduced, since it is an important basis for congestion situation awareness. The proposed congestion evaluation and forecasting models could be improved using some dynamic parameters, and they could be further developed for the whole air traffic network, not only for terminal areas. Moreover, the rules governing the formation, evolution, and dissipation of air traffic congestion should be identified, which would help control air traffic more safely and efficiently.

Footnotes

Academic Editor: Gang Chen

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China (61104159 and 61573181), the Natural Science Foundation of Jiangsu Province (BK20131366), and the Fundamental Research Funds for the Central Universities (NS2014068).
