Abstract

This paper develops a model for quantifying the relationship between flight volume and its operational performance at the macro level and investigating whether there are any changes before, during, and after the pandemic. Inspired by the market basket concept from economics, we first calculate macro-level effective flight time (EFT) for the U.S. domestic flight market by constructing a flight basket. Semi-log-linear models are developed to formulate the relationship between the total number of flights and macro-level EFT and its components. The estimation results indicate that the total number of flights has a positive and significant impact on EFT and its components, with 29.244 min longer for the weighted EFT for the analysis period compared with a zero-traffic scenario. Further investigations into the post-pandemic period verify the performance of the model and indicate that deteriorating operational performance in this period, especially with regard to gate delay and taxi-in time, is not only a result of the recovery of the flight market.

Keywords

aviation, advanced analytics and data science, air traffic management, capacity, delay analysis

Beginning in early 2020, COVID-19 had a negative impact on people’s willingness to fly ( 1 ), leading to dramatic changes in air transportation ( 2 ). Both flight volume and passenger load decreased dramatically ( 3 ), but traffic has recovered as the pandemic has abated. These dramatic fluctuations in traffic have affected the operational performance of air transportation ( 3 – 5 ).

The pandemic provides a natural experiment for investigating the relationship between flight traffic and operational performance. Traffic volume is one well-known factor that affects operational performance ( 4 – 7 ). However, many factors influence flight delays, and only some of these are closely linked with the levels of traffic in the system. Comparisons between operational performance in low- and high-volume traffic situations can reveal how much delay is volume related. This would allow for predictions of increased delay that would result from traffic growth in the future, and how much reduction in delay would be possible with capacity enhancements, which, theoretically, can eliminate volume-related delays. Additionally, the recovery in flight traffic arising from the reduced concerns about COVID-19 (a phenomenon that we term, somewhat loosely, the “post-pandemic recovery”), enables us to assess whether operational performance has changed in recent months in a manner that cannot be explained simply by an increase in flight traffic.

Many previous studies have analyzed flight operational performance. Metrics based on flight time are the most direct means of characterizing operational performance. Effective flight time (EFT) is one such metric. EFT is defined as the duration between the scheduled departure time and actual arrival time, and can be broken down into four components: gate delay; taxi-out time; airborne time; and taxi-in time ( 8 ). Definitions for these EFT components are as follows: (a) gate delay (the difference between actual departure time and scheduled departure time); (b) taxi-out time (the time between actual departure time and wheels-off time); (c) airborne time (the time between wheels-off and wheels-on); and (d) taxi-in time (the time between wheels-on and actual arrival at the destination gate).

Another commonly used metric for evaluating flight operational performance is arrival delay, calculated as the difference between the actual and scheduled arrival time, which is equivalent to the difference between the EFT and the scheduled block time. The EFT is a more reliable metric than the arrival delay for two reasons. First, scheduled block time changes over time ( 8 , 9 ). For example, the scheduled block time may be set longer to improve on-time performance. Therefore, the arrival delay may differ even for two flights with the same EFT. This means we should not directly compare arrival delays for the same flights at different times. Therefore, on-time performance based on arrival delay is not always reliable for operational performance analysis ( 9 ). Second, EFT provides more detailed information for different flight phases, allowing a more complete operational performance analysis by covering all flight phases.
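As an illustration of these definitions, the following minimal Python sketch derives the four EFT components and the arrival delay from a flight's timestamps. The function names, field names, and sample times are hypothetical and are not drawn from the ASPM data.

```python
from datetime import datetime


def minutes(start, end):
    """Elapsed minutes from start to end."""
    return (end - start).total_seconds() / 60.0


def eft_components(sched_dep, act_dep, wheels_off, wheels_on, act_arr):
    """Split effective flight time (EFT) into its four components, in minutes."""
    parts = {
        "gate_delay": minutes(sched_dep, act_dep),
        "taxi_out": minutes(act_dep, wheels_off),
        "airborne": minutes(wheels_off, wheels_on),
        "taxi_in": minutes(wheels_on, act_arr),
    }
    # EFT is the sum of the four components computed above.
    parts["eft"] = sum(parts.values())
    return parts


def at(hhmm):
    """Hypothetical helper: a clock time on an arbitrary day."""
    return datetime.strptime("2019-01-01 " + hhmm, "%Y-%m-%d %H:%M")


flight = eft_components(at("08:00"), at("08:10"), at("08:25"), at("10:05"), at("10:12"))

# The same EFT yields different arrival delays under different scheduled block times.
arrival_delays = {block: flight["eft"] - block for block in (120, 135)}
print(flight["eft"], arrival_delays)
```

With a scheduled block time of 120 min the flight is 12 min late, whereas with 135 min it is 3 min early, although the EFT is 132 min in both cases.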

Depending on the objective of the research, various approaches have been developed to analyze the operational performance of the air transportation system. Hsiao and Hansen ( 6 ) analyzed flight delays considering the effects of arrival queuing, volume, terminal weather, en route weather, and seasonal and secular effects. Wang et al. ( 8 ) compared the on-time performance of U.S. and Chinese airlines to investigate how different strategies in relation to setting scheduled block time affected operational performance. Dai et al. ( 7 ) included a broader scope of factors to model the system delay and predict days for which there was likely to be considerable delay, for example, queuing delays, terminal conditions, convective weather, wind, traffic volume, and special events.

This literature suggests that a high volume of traffic is among the most important reasons for poor operational performance ( 4 – 7 ). The total number of flights is an accessible and helpful metric for evaluating traffic volume at the macro level. Sun et al. ( 4 ) estimated the impact of the number of flights on the air transportation network during the COVID-19 pandemic. Zhou et al. ( 5 ) illustrated the vulnerability of the air transportation network, taking traffic volume into consideration. Thus, it is expected that flight volume will be a good predictor of operational performance. In Figure 1, we plot the weekly total number of U.S. domestic flights since January 2015. The volume of air traffic was fairly stable before COVID-19 but plummeted in the wake of the pandemic.

Figure 1.

Weekly total number of domestic flights in the U.S.A. since January 1, 2015.

Although previous studies have yielded valuable insights into the relationship between air traffic volume and operational performance, none has developed a method for quantifying this relationship for all flight phases, or employed a data set that includes the large reduction in flight volume resulting from the pandemic. In this paper, we investigate how the operational performance across all flight phases is influenced by traffic, specifically during the pandemic period. Toward this end, we focus on the relationship between flight volume and EFT, as well as the EFT components. Because flight traffic, that is, the total number of flights, is calculated for the whole flight system, we analyze the relationship at the macro level. Specifically, we first borrow the market basket concept from economics and construct a flight basket to describe the EFT and its components at the macro level. Then, we develop regression models to formulate the relationship between the macro-level EFT and the total number of flights, while controlling for airport capacity and other factors to limit omitted variable bias. The regression analysis is developed based on domestic flights in the U.S.A., for which the requisite data are readily available. Finally, we apply our models to the period of recovery following the pandemic. This serves two purposes. First, it tests model performance on out-of-sample data. Second, it reveals whether a high volume of delays during 2022 is simply the result of traffic recovery, as opposed to other factors, such as labor shortages, which are unique to the recovery period.

The remainder of this paper is organized as follows. We discuss the data source, construction of the flight basket, and the model specifications in the methodology section. Then, the estimation results are presented, and post-pandemic performance is discussed. Finally, we offer some conclusions and suggestions for future work.

Methodology

Data Source

We mainly use the Aviation System Performance Metrics (ASPM) flight-level data set in the Federal Aviation Administration’s (FAA) Operations and Performance Database ( 10 ). The data set provides departure date and time for each operating flight, location identifier for both departure and arrival airport, and components of EFT, including gate delay, and taxi-out, airborne, and taxi-in time, which can be used to derive variables needed for our analysis. The traffic volume and the EFT components can be aggregated at different spatial and temporal levels according to the flight, departure time, and departure/arrival airports. The location identifier for departure and arrival airports is used to construct airport origin–destination (OD) pairs for the operating flights of interest. In this study, the analysis period is from January 1, 2019 to July 25, 2022. All the variables needed are obtained or derived for this 186-week period.

Flight Basket

In economics, a market basket ( 11 ), that is, a selected group of goods and services, is usually used to measure price trends. The market basket for the Consumer Price Index (CPI) is the most popular application. The CPI is an index measuring the overall change in living costs, and is the primary measure of inflation. Prices of essential goods are constantly changing, and they change differently for different goods. For instance, the price of clothes may increase by 5%, and the price of electricity may decrease by 3% in the same period. We cannot simply calculate the average changes for all kinds of goods because the amounts of different goods required are not the same. Assuming that the kinds and quantities of goods are held constant, researchers only need to calculate the weighted sum of the prices, with weights reflecting quantities that are purchased, to obtain living costs in different periods ( 11 ). The basket of all kinds of goods with proper quantities is the market basket, and the change in living cost is the CPI. The CPI in period $t$, $CPI_{t}$, is the ratio of the total cost in period $t$, $C_{t}$, to the total cost in the base period, $C_{0}$. In period $t$, if we know the price $p_{it}$ and the consumption amount $x_{i}$ (assumed to be constant across all time periods) of each of the $I$ kinds of goods ($i = 1, 2, \dots, I$) in the market basket, then we can calculate the total cost of the basket in period $t$, $C_{t}$, and, accordingly, $CPI_{t}$ (Equation 1).

\begin{matrix} C_{t} = \sum_{i = 1}^{I} p_{it} x_{i} \\ CPI_{t} = C_{t} / C_{0} \end{matrix}

(1)
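A small numerical sketch of Equation 1 may help. The goods, prices, and quantities below are hypothetical and chosen only to mirror the clothes and electricity example; the function name is likewise illustrative.

```python
def basket_cost(prices, quantities):
    """Total cost C_t of a fixed basket: sum over goods of price times quantity."""
    return sum(prices[good] * quantities[good] for good in quantities)


# Hypothetical basket; the quantities x_i are held fixed across periods.
quantities = {"clothes": 1, "electricity": 15}
prices_base = {"clothes": 100.0, "electricity": 20.0}  # base period, t = 0
prices_t = {"clothes": 105.0, "electricity": 19.4}     # clothes +5%, electricity -3%

c0 = basket_cost(prices_base, quantities)
ct = basket_cost(prices_t, quantities)
cpi_t = ct / c0
print(c0, ct, cpi_t)
```

In this sketch the basket becomes about 1% cheaper, even though the unweighted average of the two price changes is +1%, because electricity carries most of the weight.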

Inspired by the market basket, we can construct a metric similar to $C_{t}$ to measure the operational performance of airline flights consistently during the analysis period (from 2019 to 2022). We call this metric the weighted EFT, which is calculated for a specific basket in each time period, defined in this study to be a week. We do not calculate a value like $CPI_{t}$, because in this analysis, the value of weighted EFT itself is more important than the changes relative to a base period.

To calculate the weighted EFT, we first need to identify a suitable basket that consists of a specific set of flights. Flights between specific airport OD pairs are treated as goods. Once a group of airport pairs is selected to form the flight basket according to our defined criterion, we calculate the weighted EFT (similar to $C_{t}$ ) to measure the performance of the flight basket. With regard to the CPI, amounts represent the quantities of goods purchased. In the case of weighted EFT, the “goods” in the flight basket are the airport pairs, the “price” of goods is defined based on the average EFT of each particular airport pair, and the “amount” of goods is the number of flights operating between the airport pairs. Therefore, the weights are based on the number of flights operating between the airport pairs.

Similar to the $p_{it}$ calculation in the CPI, although prices for certain goods (e.g., clothes) vary in the market, researchers use one value to represent the price of the goods. Likewise, we will calculate one weekly EFT for each airport pair, using a simple average across all the flights between that pair in that week. The airport pairs and their weights should be the same for different periods to maintain the consistency of the basket. We use the same approach to measure the four EFT components. In the following sections, we will demonstrate the process of constructing the flight basket: how we select the goods (i.e., airport pairs) to be included in the flight basket; how we determine the amount; how we determine the price; and how we calculate the total cost as the weighted EFT to compare the operational performance over time.

The Goods—Airport Pairs

To construct our flight basket, we must identify airport pairs that are consistently represented in our data throughout the analysis period. (We consider directional airport pairs: the A to B market is distinct from the B to A market.) Specifically, we include the airport OD pairs with at least $n$ flights operating every week of the analysis period. We denote the number of flights operating for airport pair $m$ in week $t$ as $a_{mt}$. Suppose there are in total $M$ airport pairs in the domestic flight market; then we can calculate the total number of flights in week $t$, $Q_{t}$, by summing the flight operations over all $M$ airport pairs, $Q_{t} = \sum_{m = 1}^{M} a_{mt}$, which is visualized in Figure 1. Across the $T$ weeks in the analysis period (the shaded part of Figure 1), if $\min_{t} a_{mt} \geq n$, then the airport pair $m$ is included in our flight basket and will be analyzed in this study. Applying this criterion, there are $M' < M$ airport pairs included in our flight basket.

To make certain the criterion $n$ is large enough to ensure that all airport pairs in the flight basket have reliable EFT averages for every week, while also including a large set of airport pairs in our basket, we vary the criterion $n$ from 1 to 7 and show the resulting number of selected airport pairs in Figure 2. Furthermore, we vary the flight basket construction period to gain more insights into the selection of $n$ . With different values of $n$ , we construct one flight basket for the whole analysis period (January 2019–July 2022) and seven flight baskets for each calendar year in the period 2015–2021.

Figure 2.

Trade-off between criterion n and number of airport pairs in flight basket.

The lines of different colors in Figure 2 represent different values of $n$ for determining airport pairs included in the flight basket. Given the same value of $n$ (points on the line of the same color), the resulting number of airport pairs for the whole analysis period is the smallest, because fewer airport pairs can satisfy the $\min_{t} a_{mt} \geq n$ constraint over a longer time span. The flight market shrank because of the outbreak of the pandemic in 2020. Although the traffic recovered in 2021, the number of airport pairs remained well below pre-pandemic levels. We choose $n = 5$ as the criterion for selecting airport pairs for the flight basket, allowing a large set of airport pairs for which the weekly averages are still based on a reasonable sample size. There are 1,597 airport pairs included in our constructed flight basket during the whole analysis period, that is, from 2019 to 2022.
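The selection rule can be sketched in a few lines of Python. The airport pairs and weekly counts below are hypothetical, and the sketch assumes the "at least $n$ flights in every week" reading of the criterion.

```python
def select_basket(weekly_counts, n):
    """Return the airport pairs whose weekly flight count is at least n in every week.

    weekly_counts maps a directional pair (origin, destination) to a list of
    weekly flight counts a_mt, one entry per week of the construction period.
    """
    return sorted(pair for pair, counts in weekly_counts.items() if min(counts) >= n)


# Hypothetical counts for three directional pairs over four weeks.
weekly_counts = {
    ("SFO", "LAX"): [70, 12, 35, 68],
    ("LAX", "SFO"): [69, 4, 30, 66],  # drops below 5 in the second week
    ("OAK", "BUR"): [21, 0, 0, 14],   # no flights for two weeks
}

baskets = {n: select_basket(weekly_counts, n) for n in (1, 5, 7)}
for n, pairs in baskets.items():
    print(n, pairs)
```

Raising $n$ or lengthening the construction period can only shrink the basket, which is the trade-off shown in Figure 2.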

The Total Cost—Basket-Level Weekly Weighted EFT

In this section, we will calculate the “price” and “quantity” of the selected airport pairs. These, in turn, are used to find the “total cost” of the constructed flight basket.

For the “price,” we compute the weekly average EFT ($EFT_{mt}$) and its components ($GD_{mt}$, $TO_{mt}$, $AT_{mt}$, and $TI_{mt}$) for each airport pair $m$ in week $t$.

\begin{matrix} GD_{mt} = \frac{\sum_{i = 1}^{N_{mt}} GD_{imt}}{N_{mt}} \\ TO_{mt} = \frac{\sum_{i = 1}^{N_{mt}} TO_{imt}}{N_{mt}} \\ AT_{mt} = \frac{\sum_{i = 1}^{N_{mt}} AT_{imt}}{N_{mt}} \\ TI_{mt} = \frac{\sum_{i = 1}^{N_{mt}} TI_{imt}}{N_{mt}} \\ EFT_{mt} = \frac{\sum_{i = 1}^{N_{mt}} EFT_{imt}}{N_{mt}} \end{matrix}

(2)

where $EFT_{mt}$, $GD_{mt}$, $TO_{mt}$, $AT_{mt}$, and $TI_{mt}$ are the weekly average EFT and its components for airport pair $m$ in week $t$; $EFT_{imt}$, $GD_{imt}$, $TO_{imt}$, $AT_{imt}$, and $TI_{imt}$ are the values of EFT and its components obtained directly from the ASPM data set for flight $i$ between airport pair $m$ in week $t$; and $N_{mt}$ is the number of flights between airport pair $m$ in week $t$.

For the “quantity,” we use the total number of flights operating for a certain airport pair as the weight to reflect the consumption amounts for each airport pair. Intuitively, the more flights in one airport pair, the more contribution the airport pair makes to the market. We calculate the weights for each airport pair $m$ , $W_{m}$ , based on the ratio of the number of flights for airport pair $m$ across the whole analysis period $T$ to the total number of flights for all $M'$ airport pairs selected in the flight basket over all $T$ weeks in the analysis period, as denoted in Equation 3. Note that the weights are held constant across all periods.

W_{m} = \frac{\sum_{t = 1}^{T} a_{mt}}{\sum_{m = 1}^{M'} \sum_{t = 1}^{T} a_{mt}}

(3)

where $W_{m}$ is the weight assigned to a certain airport pair $m$ and does not vary with time $t$ , $a_{mt}$ is the number of flights operating for airport pair $m$ in week $t$ , $T$ is the number of weeks in the analysis period, and $M'$ is the number of airport pairs in the constructed flight basket.
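As a minimal sketch of Equation 3, the following Python snippet computes time-invariant weights from weekly flight counts. The pair labels and counts are hypothetical.

```python
def basket_weights(weekly_counts):
    """Equation 3: each pair's share of the basket's total flights over all weeks."""
    totals = {pair: sum(counts) for pair, counts in weekly_counts.items()}
    grand_total = sum(totals.values())
    return {pair: total / grand_total for pair, total in totals.items()}


# Hypothetical weekly counts a_mt for a three-pair basket over three weeks.
weekly_counts = {"A-B": [30, 10, 20], "B-A": [30, 10, 20], "A-C": [40, 20, 20]}
weights = basket_weights(weekly_counts)
print(weights)
```

Because the weights are computed once from counts pooled over the whole period, a week with unusually few flights on a pair does not change that pair's contribution to the basket.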

Given the “amount” (i.e., the weights of airport pairs, $W_{m}$) and the “price” (the weekly average EFT and its components), we can finally calculate the “total cost,” that is, the flight basket-level weekly weighted EFT and its components, $EFT_{t}$, $GD_{t}$, $TO_{t}$, $AT_{t}$, and $TI_{t}$, in Equation 4 using Equations 1 to 3:

\begin{matrix} GD_{t} = \sum_{m = 1}^{M'} (W_{m} \cdot GD_{mt}) \\ TO_{t} = \sum_{m = 1}^{M'} (W_{m} \cdot TO_{mt}) \\ AT_{t} = \sum_{m = 1}^{M'} (W_{m} \cdot AT_{mt}) \\ TI_{t} = \sum_{m = 1}^{M'} (W_{m} \cdot TI_{mt}) \\ EFT_{t} = \sum_{m = 1}^{M'} (W_{m} \cdot EFT_{mt}) \end{matrix}

(4)
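The aggregation in Equation 4 can be illustrated as follows. The weights and pair-level averages are hypothetical, and a single week is used for brevity. The sketch also checks that the weighted components add up to the weighted EFT, which follows from EFT being the sum of its four components at the flight level.

```python
def weighted_metric(weights, pair_week_avg, week):
    """Equation 4: weighted sum over basket pairs of a pair-level weekly average."""
    return sum(w * pair_week_avg[pair][week] for pair, w in weights.items())


weights = {"A-B": 0.3, "B-A": 0.3, "A-C": 0.4}  # hypothetical W_m, fixed over time

# Hypothetical pair-level weekly averages in minutes for one week (index 0).
gd = {"A-B": [8.0], "B-A": [12.0], "A-C": [10.0]}   # gate delay
to = {"A-B": [15.0], "B-A": [18.0], "A-C": [20.0]}  # taxi-out time
at = {"A-B": [60.0], "B-A": [64.0], "A-C": [110.0]} # airborne time
ti = {"A-B": [6.0], "B-A": [7.0], "A-C": [9.0]}     # taxi-in time
eft = {p: [gd[p][0] + to[p][0] + at[p][0] + ti[p][0]] for p in weights}

components = [weighted_metric(weights, metric, 0) for metric in (gd, to, at, ti)]
total = weighted_metric(weights, eft, 0)
print(components, total)
```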

The calculation results for the flight basket-weighted EFT and its components for each week during the analysis period are shown in Figures 3 and 4. Figure 3 is the stacked bar chart showing the weekly weighted EFT and its components in different colors. Each bar represents the weighted EFT of one week in the analysis period, ranging from 120 to 140 min. We can see from the overall bar heights that there was a sharp drop around March 2020 when the COVID-19 pandemic started to affect the flight market. Each bar is also subdivided into its components. The airborne time accounts for most of the weighted EFT. To obtain a clearer view of the trends, we depict the time series for each component separately in Figure 4. Gate delay decreases during 2020 but increases in the post-recovery period compared with the pre-pandemic level. Taxi-in/out times share the same pattern as the total number of flights (Figure 1). The airborne time shows a periodic pattern but does not seem to be affected much by the COVID-19 pandemic.

Figure 3.

Weighted effective flight time for the flight basket.

Figure 4.

Components of weighted effective flight time for the flight basket.

Sensitivity Analysis

In the above sections, we have constructed the flight basket by selecting a group of airport OD pairs that have at least five flights operating every week of the whole analysis period, and have calculated the weighted EFT for the flight basket in each week. In this section, sensitivity analysis is conducted to illustrate that the construction period for the flight basket will not lead to a significant difference in the operational performance measures. By doing so, we can further verify the representativeness of the flight basket in relation to the whole aviation market.

Specifically, we calculate the basket-level weighted EFT and its components for four flight baskets constructed over different periods (the whole analysis period and each of the calendar years 2019, 2020, and 2021), with the criterion value $n$ set to 5. For each yearly basket, we then fit a simple linear regression of the metrics based on the whole-period basket (y-axis) on the metrics based on the yearly basket (x-axis). The fitted regression lines are depicted in Figure 5, with the slope and $R^{2}$ reported in each sub-plot title. Each point represents the values for one week.

Figure 5.

Weighted EFT comparisons for different flight baskets.

The $R^{2}$ values and slopes of the regression models are all close to 1, which indicates a nearly perfect linear relationship for $EFT_{t}$, $GD_{t}$, $TO_{t}$, $AT_{t}$, and $TI_{t}$: the metrics from the flight basket for the whole analysis period (January 2019–July 2022) move almost one for one with those from the flight baskets for each year (2019, 2020, and 2021). We can conclude that the choice of construction period for the flight basket does not materially change the weighted operational performance measures. The following analysis is based on the flight basket constructed over the whole analysis period from January 2019 to July 2022.
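The comparison behind Figure 5 amounts to a simple linear regression between two weekly series. A minimal sketch, using hypothetical weekly values rather than the study's data, is:

```python
def fit_line(x, y):
    """Ordinary least squares fit y = a + b*x; returns (intercept, slope, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


# Hypothetical weekly weighted EFT (minutes) from a yearly basket (x)
# and from the whole-period basket (y).
yearly_basket = [128.0, 131.5, 124.2, 135.8, 129.9]
whole_period_basket = [127.6, 131.9, 124.0, 135.5, 130.3]

intercept, slope, r_squared = fit_line(yearly_basket, whole_period_basket)
print(round(slope, 3), round(r_squared, 3))
```

A slope and an $R^{2}$ near 1 indicate that the two baskets track each other closely from week to week.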

Regression Analysis

Model Specification

As discussed above, the construction of the flight basket helps make the operational performance metrics (i.e., weighted EFT and its components) comparable over time. In this study, we would like to understand further how and to what extent the number of flights may affect the weighted EFT and its components, while controlling for other factors to limit omitted variable bias.

We first explore the data to discover patterns and identify the hypotheses to test in statistical modeling. In Figure 6, we visually assess the relationship between the log-transformed weighted EFT and the total number of flights. Each point represents one observation of weekly weighted EFT and the total number of flight operations. There are 186 points in total over the analysis period, and the plot suggests a positive correlation between the two variables. This is expected, because increasing flight traffic exposes more bottlenecks in the National Airspace System (NAS), resulting in congestion. Moreover, the relationship when EFT is log-transformed appears to be roughly linear, suggesting an exponential relationship between EFT and flight traffic.

Figure 6.

Relationship between log-transformed effective flight time and the total number of flights.

Thus, we estimate a semi-log-linear regression model to relate weighted EFT and its components to the number of flights and other factors. The model specification is formulated as in Equation 5, where $Y_{t}$ is the dependent variable in period $t$, which enters on a logarithmic scale; $X_{t}$ contains all explanatory variables in period $t$, which enter in levels; $β$ is the associated coefficient vector estimated by ordinary least squares; and $ε_{t}$ is the regression residual.

\ln (Y_{t}) = β X_{t} + ε_{t}

(5)

As well as the total number of flight operations $Q_{t}$ in the constructed flight basket in week $t$, we introduce monthly dummy variables ($N_{t}^{month}$) in the model to control for factors changing at the monthly level. We only include 11 monthly dummy variables in the model to avoid the multicollinearity issue. Each monthly dummy variable represents the number of days of week $t$ that fall in the subject $month$. Because one week may span two months, at most two of the monthly dummy variables will have non-zero values. For instance, the first week ($t = 1$) is from January 1 to January 7, 2019. All seven days in this week are in January. Therefore, $N_{1}^{1} = 7$ for January ($month = 1$) and $N_{1}^{month} = 0$ for all other months ($month = 2, 3, \dots, 11$). For the fifth week, from January 29 to February 4, 2019, there are three days of the week in January and four days in February. Therefore, $N_{5}^{1} = 3$ for January ($month = 1$), $N_{5}^{2} = 4$ for February ($month = 2$), and $N_{5}^{month} = 0$ for all other months ($month = 3, 4, \dots, 11$). As a result, the relationship between the weekly weighted EFT and the predictor variables is specified as in Equation 6.

\ln (Y_{t}) = α + γ \cdot Q_{t} + \sum_{month = 1}^{11} λ_{month} \cdot N_{t}^{month} + ε_{t}

(6)

where $Y_{t}$ is the dependent variable of interest in period $t$, and $α$, $γ$, and $λ_{month}$ are parameters to be estimated. We estimate five models in which the dependent variable is, in turn, the weighted EFT $EFT_{t}$ and its components: gate delay $GD_{t}$, taxi-out time $TO_{t}$, airborne time $AT_{t}$, and taxi-in time $TI_{t}$.
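The monthly day-count variables can be generated directly from the calendar. The sketch below is illustrative only: the function name is hypothetical, and December is treated as the omitted baseline month, which is inferred from the summation over months 1 to 11 in Equation 6.

```python
from datetime import date, timedelta


def month_day_counts(week_start):
    """Days of the 7-day week starting at week_start that fall in each month.

    Returns a dict month -> count for months 1..11. December is assumed to be
    the omitted baseline, so December days are not counted in any variable.
    """
    counts = {month: 0 for month in range(1, 12)}
    for offset in range(7):
        day = week_start + timedelta(days=offset)
        if day.month != 12:
            counts[day.month] += 1
    return counts


first_week = date(2019, 1, 1)
week_1 = month_day_counts(first_week)                       # January 1 to 7, 2019
week_5 = month_day_counts(first_week + timedelta(weeks=4))  # January 29 to February 4, 2019
print(week_1[1], week_5[1], week_5[2])
```

For these two weeks the sketch reproduces the values given in the text: seven January days in week 1, and three January days plus four February days in week 5.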

Airport Conditions

Apart from a high volume of traffic, adverse conditions at the terminal, such as congestion and bad weather, are also leading causes of poor operational performance. In this study, we employ the standard deviation of the airport demand-to-capacity ratio for departure and arrival to evaluate airport conditions for each airport pair. The demand-to-capacity ratio is a popular metric in transportation system analysis for describing the level of service. However, in our case, we found that the average value of the demand-to-capacity ratio is highly correlated with the total number of flights, with a correlation score of 0.88 for departure, and 0.90 for arrival, which leads to multicollinearity in our regression analysis. Intuitively, the greater the total number of flights, the higher the departure/arrival demand for each airport and, thus, the higher the demand-to-capacity ratio given that the capacity is relatively stable. Therefore, we employ the standard deviation of the demand-to-capacity ratio, which is less highly correlated to the total number of flights than the average, although it still captures imbalances between demand and capacity. If the standard deviation is high, then there are periods when the demand-to-capacity ratio is quite high, either because of reduced capacity or surges in demand; these are the conditions expected to cause a great deal of delay. Thus, the standard deviation of the demand-to-capacity ratio can be used to represent airport conditions and capture their impact on the weighted EFT.

We calculate the demand-to-capacity ratio for each flight for both departure and arrival airports based on the quarter-hourly scheduled demand and capacity of the airport. We obtain the metric for airport conditions, the standard deviation of the departure/arrival demand-to-capacity ratio, in a manner similar to the method for obtaining the weighted EFT. First, because demand and capacity data are missing for some airports, we construct new flight baskets with $M^{d}$ airport pairs for departure and $M^{a}$ airport pairs for arrival, and recalculate the weights for each airport pair with the same strategy as for the original flight basket. The $M^{d}$ airport pairs for departure and the $M^{a}$ airport pairs for arrival are subsets of the $M'$ airport pairs in the original basket. Second, for each airport pair in every week, we calculate the standard deviation of the demand-to-capacity ratio separately for departures and arrivals, using the airport acceptance and airport departure rates recorded in the ASPM data combined with the scheduled arrival and departure demands also included in this database. For every flight in a given airport pair, we find the departure demand-to-capacity ratio at its scheduled departure time and the arrival demand-to-capacity ratio at its scheduled arrival time, and then determine the standard deviations of these ratios across all the flights in the airport pair. Third, we obtain the weighted average standard deviation of the demand-to-capacity ratio for departure and arrival for each week, using weights similar to those in Equation 3. Finally, we average the departure and arrival standard deviations to represent airport conditions in our semi-log-linear model (Equation 7) and update Equation 6 to Equation 8. Assuming that $Y_{t}$ follows a lognormal distribution, we can obtain the expected values for $Y_{t}$, as shown in Equation 9.

{\bar{σ}}_{ratio, t} = \frac{\sum_{m = 1}^{M^{d}} \underset{i \in m, t}{σ} \left( \frac{D_{t, i_{dep}}}{C_{t, i_{dep}}} \right) + \sum_{m = 1}^{M^{a}} \underset{i \in m, t}{σ} \left( \frac{D_{t, i_{arr}}}{C_{t, i_{arr}}} \right)}{2}

(7)
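The airport-conditions metric can be sketched as below. This follows the verbal description (a weighted average across pairs of within-pair standard deviations, then the mean of the departure and arrival values). The ratios and weights are hypothetical, the same pairs and weights are assumed for both baskets, and the population standard deviation is an assumption, since the estimator is not specified.

```python
from statistics import pstdev


def weighted_sd(ratios_by_pair, weights):
    """Weighted average across pairs of the within-pair standard deviation of ratios."""
    return sum(weights[pair] * pstdev(ratios) for pair, ratios in ratios_by_pair.items())


# Hypothetical demand-to-capacity ratios for the flights of two pairs in one week.
dep_ratios = {"A-B": [0.5, 0.7, 0.9], "A-C": [0.6, 0.6, 0.6]}
arr_ratios = {"A-B": [0.4, 0.8, 1.2], "A-C": [0.7, 0.9, 1.1]}
weights = {"A-B": 0.5, "A-C": 0.5}  # hypothetical basket weights

sigma_dep = weighted_sd(dep_ratios, weights)
sigma_arr = weighted_sd(arr_ratios, weights)
sigma_ratio = (sigma_dep + sigma_arr) / 2
print(sigma_dep, sigma_arr, sigma_ratio)
```

A pair whose flights all face the same ratio contributes zero, however high that ratio is, so the metric responds to swings in the demand-to-capacity balance rather than to its average level.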

\ln (Y_{t}) = α + γ \cdot Q_{t} + {\bar{σ}}_{ratio, t} + \sum_{month = 1}^{11} λ_{month} \cdot N_{t}^{month} + ε_{t}

(8)

E [Y_{t}] = e^{E [\ln (Y_{t})] + σ {(ε_{t})}^{2} / 2}

(9)
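Equation 9 is the standard retransformation for a lognormal variable. A minimal sketch, with a hypothetical fitted log value and residual variance, is:

```python
import math


def expected_level(mean_log, resid_variance):
    """Equation 9: E[Y] = exp(E[ln Y] + variance / 2) for lognormal Y."""
    return math.exp(mean_log + resid_variance / 2.0)


# Hypothetical fitted value of ln(Y_t) and residual variance.
mean_log, resid_variance = 4.85, 0.0004

naive = math.exp(mean_log)  # median of Y_t; ignores the variance term
corrected = expected_level(mean_log, resid_variance)
print(naive, corrected)
```

Exponentiating the fitted log value alone gives the median rather than the mean of $Y_{t}$; the variance term corrects for this, and the correction is small when the residual variance is small.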

Results and Discussion

Estimation Results

In this section, we present the estimation results for the semi-log-linear model described in the previous section. We want not only to investigate the relationship between flight volume and operational performance, but also to assess the ability of a model trained on data from the pre-pandemic and pandemic periods to predict performance in the post-pandemic period. Accordingly, we divide the 186-week analysis period into a 126-week training set (roughly 70%), from January 1, 2019 to May 31, 2021, and a 60-week test set (roughly 30%), from June 1, 2021 to July 25, 2022. By the test period, flight volume had substantially recovered from COVID-19. We split the data chronologically for two reasons. First, doing so tests model performance on out-of-sample data. Second, we can determine whether a high level of delays in the post-pandemic period is simply the result of traffic volume recovery.
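The chronological split can be reproduced from the stated dates. The sketch below assumes that week $t$ covers the seven days beginning $7(t - 1)$ days after January 1, 2019, consistent with the first week running from January 1 to January 7; the variable names are illustrative.

```python
from datetime import date, timedelta

start, split, end = date(2019, 1, 1), date(2021, 6, 1), date(2022, 7, 25)

# Number of complete 7-day weeks between start and end, both inclusive.
n_weeks = ((end - start).days + 1) // 7
week_starts = [start + timedelta(weeks=k) for k in range(n_weeks)]

# Chronological split: weeks starting before June 1, 2021 form the training set.
train = [w for w in week_starts if w < split]
test = [w for w in week_starts if w >= split]
print(n_weeks, len(train), len(test))
```

Under this convention the split date coincides with a week boundary, giving 126 training weeks and 60 test weeks out of 186.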

We estimate the variable coefficients of the semi-log-linear regression models specified in Equations 6 and 8 based on the training set, which covers January 1, 2019 to May 31, 2021. Five models with different dependent variables but the same set of predictors are analyzed for each specification. The estimation results for the models without and with airport conditions are shown in Table 1. The coefficient estimates for the primary variable of interest, the total number of flights $Q_{t}$, are presented in the second column for each model. The coefficient estimates for the average departure and arrival standard deviation of the demand-to-capacity ratio are presented in the third column for the second model. The monthly dummy variables are included to reduce potential omitted variable bias, but we do not report their coefficient estimates because they are not the primary focus of this paper. The adjusted $R^{2}$ values are listed in the last column for each model.

Table 1.

Regression Estimates for Equations 6 and 8

	Semi-log-linear model without considerations for airport conditions			Semi-log-linear model with considerations for airport conditions
Dependent variable (EFT and its components)	Constant	Coefficient estimate for $Q_{t}$	Adjusted $R^{2}$	Constant	Coefficient estimate for $Q_{t}$	Coefficient estimate for ${\bar{σ}}_{ratio, t}$	Adjusted $R^{2}$
EFT ($EFT_{t}$)	4.8939	8.74 × 10^-7***	0.797	4.7977	5.48 × 10^-7***	0.228***	0.849
Gate delay ($GD_{t}$)	0.9798	9.27 × 10^-6***	0.790	-0.0386	5.81 × 10^-6***	2.40***	0.845
Taxi-out ($TO_{t}$)	2.5133	1.98 × 10^-6***	0.886	2.3573	1.45 × 10^-6***	0.369***	0.911
Airborne ($AT_{t}$)	4.7534	3.91 × 10^-8***	0.698	4.7338	-2.76 × 10^-8	0.0465***	0.759
Taxi-in ($TI_{t}$)	1.7104	2.23 × 10^-6***	0.921	1.5723	1.76 × 10^-6***	0.327***	0.941

Note: EFT = effective flight time. *** = significant at the 0.1% level.

Generally, the adjusted $R^{2}$ for the second model, which includes the metric ${\bar{σ}}_{ratio, t}$ representing airport conditions (Equation 8), is higher than that for the original semi-log-linear model. For the original model, the adjusted $R^{2}$ ranges from 0.698 to 0.921, indicating that the semi-log-linear models describe the relationship well. The adjusted $R^{2}$ improves to between 0.759 and 0.941 after we add ${\bar{σ}}_{ratio, t}$ to represent airport conditions.

The flight volume coefficient is the largest for gate delay, at 9.27 × 10^-6 for the original model and 5.81 × 10^-6 for the updated model. This reflects the high variability in gate delay, measured in percentage terms (Figure 4), as well as the strong influence of flight volume on gate delay. The latter results in the high flight volume coefficient, whereas the former means that factors omitted from the model also play a large role, which diminishes the adjusted R².
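Because the dependent variable is in logarithms, a flight volume coefficient $γ$ implies a percentage change of $e^{γ \cdot ΔQ} - 1$ in the dependent variable when the number of flights rises by $ΔQ$, holding the other variables fixed. The sketch below applies this to the flight volume coefficients reported in Table 1 for the model with airport conditions; the increment of 10,000 weekly flights is an arbitrary illustrative choice, and the function name is hypothetical.

```python
import math


def pct_change(coef, delta_q):
    """Percentage change in Y implied by ln(Y) = ... + coef * Q when Q rises by delta_q."""
    return (math.exp(coef * delta_q) - 1.0) * 100.0


# Flight volume coefficients from Table 1, model with airport conditions.
coefs = {"EFT": 5.48e-7, "gate delay": 5.81e-6, "taxi-out": 1.45e-6, "taxi-in": 1.76e-6}

# Effect of 10,000 additional weekly flights (illustrative increment only).
effects = {name: pct_change(c, 10_000) for name, c in coefs.items()}
for name, effect in effects.items():
    print(f"{name}: {effect:.2f}%")
```

Under these estimates, 10,000 additional weekly flights correspond to an increase of roughly 0.55% in weighted EFT and roughly 6% in gate delay.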

In contrast, the flight volume coefficient is smallest for airborne time: 3.91 × 10^-8 in the original model, and negative and insignificant (-2.76 × 10^-8) in the updated model. A negative sign is contrary to expectations, but a near-zero estimate is plausible for airborne time. When delays are foreseeable because of capacity–demand imbalances, ground delay is usually preferable to airborne delay because of fuel consumption and safety concerns. Therefore, FAA traffic managers try to shift airborne delays to the ground. As flight volume increases, more ground delay is inevitable in response to bottlenecks in the system, and these strategies ensure that flight volume causes only small changes in airborne time. Consequently, the traffic volume coefficient is the smallest in the airborne time model, as is the adjusted R², indicating that other factors, such as en route weather, have a greater impact on airborne time than the total number of flights or airport conditions.

The taxi time models have comparable traffic volume coefficients of slightly greater than 10⁻⁶. Taxi-out times are subject to queuing delays at the departure runway and for the overhead airspace, whereas taxi-in delays are caused by gate and ramp congestion. Taxi times may also increase as a result of the assignment of a runway that is further from the gate area. Flight volume is likely to influence all of these delays. The adjusted R² is slightly lower for taxi-out time, implying that these times have greater unexplained variability. This is probably the result of long departure queues for some flights.

The coefficient estimates for ${\bar{σ}}_{ratio, t}$ are all positive and significant. The estimate is largest for the gate delay model and smallest for the airborne time model, and the estimates for the two taxi time models are roughly equal. The interpretations of these results are broadly similar to those for the traffic volume coefficients.
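To make the magnitudes of the flight volume coefficients more concrete, the following sketch converts them into percentage changes. It relies on the standard property of a semi-log model that adding ΔQ flights multiplies the dependent variable by exp(β·ΔQ), other predictors held fixed. The coefficients are taken from the model with airport conditions in Table 1; the increment of 10,000 flights per period is a hypothetical value chosen only for illustration.

```python
import math

# Coefficients on Q_t from Table 1 (model with airport conditions).
beta_q = {
    "EFT": 5.48e-7,
    "Gate delay": 5.81e-6,
    "Taxi-out": 1.45e-6,
    "Airborne": -2.76e-8,   # not statistically significant
    "Taxi-in": 1.76e-6,
}

def pct_change(beta, delta_q):
    """Percentage change in Y implied by delta_q additional flights."""
    return (math.exp(beta * delta_q) - 1.0) * 100.0

delta_q = 10_000  # hypothetical increase in the total number of flights
for name, beta in beta_q.items():
    print(f"{name:<10s} {pct_change(beta, delta_q):+.3f}%")
```

Under these assumptions, the same increment in flights raises gate delay by roughly 6%, the taxi times by between 1% and 2%, and overall EFT by about half a percent, while airborne time is essentially unchanged.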

The estimation results can be used to predict operational performance in a system with zero traffic. Such performance could result either from a very low level of traffic or from infrastructure investment adequate to eliminate bottlenecks in the NAS. Thus, we consider a counterfactual scenario in which there are no flights in the system, so that the total number of flights $Q_{t}$ and ${\bar{σ}}_{ratio, t}$ are both set to 0. Assuming the counterfactual scenario occurs in December, we calculate the weighted EFT and its components using the estimates from the updated model, as shown in Table 2.
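The zero-traffic calculation can be sketched as follows. The sketch assumes that December is the omitted baseline month, so that with $Q_{t}$ and ${\bar{σ}}_{ratio, t}$ set to 0 the prediction reduces to exp(constant); this is an assumption, because the monthly dummy estimates are not reported. The constants are those of the model with airport conditions in Table 1, and the function name and its arguments are hypothetical.

```python
import math

# Constants from Table 1 (model with airport conditions).
constants = {
    "EFT": 4.7977,
    "Gate delay": -0.0386,
    "Taxi-out": 2.3573,
    "Airborne": 4.7338,
    "Taxi-in": 1.5723,
}

def predict_minutes(constant, beta_q=0.0, q=0.0, gamma=0.0, sigma=0.0, month_effect=0.0):
    """Back-transform a semi-log prediction from log-minutes to minutes."""
    return math.exp(constant + beta_q * q + gamma * sigma + month_effect)

# Empty system: Q_t = 0, sigma = 0, baseline month (assumed to be December).
empty = {name: predict_minutes(c) for name, c in constants.items()}
for name, minutes in empty.items():
    print(f"{name:<10s} {minutes:8.3f} min")
```

Under this assumption, the EFT, taxi-out, airborne, and taxi-in values agree with the last column of Table 2 to rounding. For gate delay the same calculation gives about 0.96 min rather than 1.039 min; the tabulated value would instead correspond to a constant of +0.0386, so the sign of that entry may be worth checking against the original estimates.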

Table 2.

Estimations for the Weighted EFT and its Components for Different Scenarios

EFT and its components	Average weighted time for the pre-COVID 63-week analysis period (minutes)	Average weighted time for the COVID 63-week analysis period (minutes)	Average weighted time for the post-COVID 60-week analysis period (minutes)	Average weighted time for the 186-week analysis period (minutes)	Estimated time for empty air traffic system (minutes)
EFT ( $EF T_{t}$ )	153.772	143.131	154.726	150.475	121.231
Gate delay ( $G D_{t}$ )	12.668	6.232	14.942	11.221	1.039
Taxi-out ( $T O_{t}$ )	16.885	14.310	16.062	15.747	10.562
Airborne ( $A T_{t}$ )	116.382	115.981	115.740	116.039	113.727
Taxi-in ( $T I_{t}$ )	7.87	6.608	7.981	7.467	4.818

Note: EFT = effective flight time.

For the weighted EFT, the estimate for an empty air traffic system is 121.231 min. The average weighted EFT for the 186-week analysis period is 150.475 min, which means that about 24% more weighted EFT (29.244 min) is caused in some way by the presence of other flight traffic. This increase comes mainly from gate delay, whose actual average value is about 10 min greater than in the empty system, in which it essentially disappears. The residual gate delay in the empty system is presumably the result of delays in loading aircraft as well as known and expected delays. A busy air traffic system also affects taxi times: both taxi-out and taxi-in times are around 50% longer than in the empty scenario. This may be explained by queues for flights about to depart and queues at the destination gate. Airborne time is only 2.312 min longer in the existing system than in the empty one. This supports the previous analysis that the total number of flights and airport conditions have little impact on airborne time.
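The comparisons in this paragraph can be reproduced directly from Table 2. The short sketch below computes, for each component, the absolute and percentage difference between the 186-week average and the empty-system estimate; the dictionary and function names are hypothetical, and the numbers are copied from the table.

```python
# Values in minutes, copied from Table 2.
observed = {"EFT": 150.475, "Gate delay": 11.221, "Taxi-out": 15.747,
            "Airborne": 116.039, "Taxi-in": 7.467}
empty = {"EFT": 121.231, "Gate delay": 1.039, "Taxi-out": 10.562,
         "Airborne": 113.727, "Taxi-in": 4.818}

def volume_related(observed, empty):
    """Return (difference in minutes, difference in percent) per component."""
    return {
        name: (obs - empty[name], (obs / empty[name] - 1.0) * 100.0)
        for name, obs in observed.items()
    }

diffs = volume_related(observed, empty)
for name, (minutes, pct) in diffs.items():
    print(f"{name:<10s} {minutes:7.3f} min  {pct:7.1f}%")
```

The percentage for gate delay is very large because its empty-system baseline is close to zero, so the difference in minutes is the more informative figure for that component.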

Post-COVID Recovery Performance

The semi-log-linear models perform well on the training set in modeling the relationship between the weighted EFT (as well as its components) and the total number of flights. In this section, we evaluate model performance on both the training and test sets to assess the out-of-sample performance of the trained model and to identify changes in operational performance in the post-COVID period. The test period is from June 1, 2021 to July 25, 2022.

Based on Equation 8 and the estimated parameters in Table 1, we calculate predicted values for the log-transformed weighted EFT and its components. From these we compute the root mean square error (RMSE) and the average difference between the predicted and actual values of the dependent variable. As shown in Figure 7, the predictions track the ground truths closely in the training set but diverge more in the test set. Figure 8 reveals that the residuals are evenly distributed around 0 in the training set, whereas there are systematic biases in the test set. These differences are confirmed in Table 3, which shows that the RMSEs are larger for the test set for every component except taxi-out time, and that the average residuals are non-zero. These average residuals indicate how performance during the post-COVID recovery period has changed. They imply that overall EFT is about 3% greater than what is predicted by a model trained on pre-recovery data. However, we see larger differences in certain EFT components. Specifically, gate delay is about (exp(0.393) − 1) × 100 = 48% greater and taxi-in time is about (exp(0.065) − 1) × 100 = 6.7% greater, whereas the changes in taxi-out time (+0.8%) and airborne time (−0.2%) are minimal.
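The conversion from average log residuals to percentage differences used above can be checked with a few lines of code. The residuals are taken from the last column of Table 3; the function name is hypothetical.

```python
import math

# Average test-set residuals of ln(Y_t), from Table 3.
avg_residual = {
    "EFT": 0.030,
    "Gate delay": 0.393,
    "Taxi-out": 0.008,
    "Airborne": -0.002,
    "Taxi-in": 0.065,
}

def residual_to_percent(r):
    """Convert an average log residual into a percentage difference."""
    return (math.exp(r) - 1.0) * 100.0

pct = {name: residual_to_percent(r) for name, r in avg_residual.items()}
for name, value in pct.items():
    print(f"{name:<10s} {value:+.1f}%")
```

For small residuals the percentage is close to 100 times the residual itself, which is why the values for EFT, taxi-out, and airborne time can be read almost directly from the table, whereas the gate delay residual needs the exponential transformation.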

Figure 7.

Ground truths and predictions for both training and test sets.

Figure 8.

Residuals for both training and test sets.

Table 3.

Prediction Performances of the Training and Test Sets

Dependent variable: EFT and its components	Training set (January 1, 2019–May 31, 2021)		Test set (June 1, 2021–July 25, 2022)
Dependent variable: EFT and its components	RMSE	Standard deviation of $\ln (Y_{t})$	RMSE	Standard deviation of $\ln (Y_{t})$	Average residuals of $\ln (Y_{t})$
EFT ( $EF T_{t}$ )	0.017	0.043	0.040	0.019	0.030
Gate delay ( $G D_{t}$ )	0.177	0.434	0.454	0.218	0.393
Taxi-out ( $T O_{t}$ )	0.030	0.103	0.028	0.029	0.008
Airborne ( $A T_{t}$ )	0.004	0.008	0.005	0.006	-0.002
Taxi-in ( $T I_{t}$ )	0.025	0.104	0.072	0.031	0.065

Note: EFT = effective flight time; RMSE = root mean square error.
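For readers who wish to compute the same diagnostics on their own data, the sketch below shows one way to obtain the RMSE and the average residual of the log-transformed dependent variable. It assumes that the residual is defined as actual minus predicted, so that a positive average means the observed value exceeds the prediction. The arrays are small made-up examples, not values from the study, and the function names are hypothetical.

```python
import numpy as np

def rmse(actual_ln, predicted_ln):
    """Root mean square error of the log-scale predictions."""
    err = np.asarray(actual_ln) - np.asarray(predicted_ln)
    return float(np.sqrt(np.mean(err ** 2)))

def mean_residual(actual_ln, predicted_ln):
    """Average residual (actual minus predicted) on the log scale."""
    return float(np.mean(np.asarray(actual_ln) - np.asarray(predicted_ln)))

# Made-up training data: residuals scatter evenly around zero.
train_actual = np.array([5.00, 5.02, 4.98, 5.01])
train_pred = np.array([5.01, 5.01, 4.99, 5.00])

# Made-up test data: the actual values sit systematically above the predictions.
test_actual = np.array([5.05, 5.06, 5.04, 5.07])
test_pred = np.array([5.02, 5.02, 5.02, 5.03])

for label, a, p in [("train", train_actual, train_pred),
                    ("test", test_actual, test_pred)]:
    print(f"{label:<5s} RMSE = {rmse(a, p):.4f}, mean residual = {mean_residual(a, p):+.4f}")
```

An RMSE can never be smaller than the absolute value of the average residual, so a test-set RMSE that is close to the average residual, as for gate delay and taxi-in time in Table 3, indicates that most of the error is systematic bias rather than scatter.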

These results suggest that the perceived degradation in operational performance in recent months is partly illusory and partly real. The EFT has increased, but this is largely to be expected as a result of increased traffic in the system. Nonetheless, EFT is about 3% greater than might be expected based on an increase in traffic alone. Moreover, certain EFT components, namely, gate delay and taxi-in time, have shown more substantial percentage increases. One possible reason is labor shortages, which inhibit the ability of airlines to achieve timely departures or expeditiously move landed aircraft to their arrival gates. Delays resulting from these problems are more evident to the traveling public, even if the change in EFT is fairly small.

Conclusion and Future Work

In this paper, we use the natural experiment provided by the COVID-19 pandemic, which caused dramatic fluctuations in air traffic, to quantify the relationship between flight volume and the operational performance of the flight system. Analyzing changes in operational performance allows us to investigate the extent to which delays can be explained by a high volume of traffic and whether other factors contribute to the high level of delays in the post-pandemic world. We develop regression models to formalize this relationship and, by incorporating flight volume as a predictor, compare operational performance between the pre-pandemic and post-pandemic periods to reveal any changes.

A flight basket for the U.S. domestic aviation market is constructed from the FAA ASPM database for the analysis period, that is, from January 1, 2019 to July 25, 2022. We first determine the airport OD pairs that should be included in the flight basket and assign a weight to each airport pair based on the number of flights performed over the analysis period. To understand the relationship between air traffic and operational performance further, semi-log-linear models are formulated and estimated using the training dataset from January 1, 2019 to May 31, 2021. We include the primary variable of interest (the total number of flights) in the model, as well as metrics based on fluctuations in airport demand-to-capacity ratios and monthly dummy variables to capture the potential effects of other factors. The model captures a statistically significant positive relationship between EFT and the total number of flights. The estimation results suggest that flight times in a system with zero traffic would be about 30 min less than in one with the levels of traffic observed in 2019, with most of the difference in the gate delay and taxi-out segments.

Using the estimated coefficients, we test the model on the unseen test period (i.e., the post-recovery period) from June 1, 2021 to July 25, 2022. The models perform well on the training set and give similar but not identical results on the test set, for which RMSEs are generally larger and residuals show systematic biases for certain EFT components. The positive residuals for gate delay and taxi-in time during the test period indicate that certain changes in operational performance in the post-pandemic period, for example, increased gate delay, cannot be explained simply by the recovery of flight volume. Overall EFT in the post-pandemic period, however, is only about 3% greater than what is predicted by the model trained on pre-recovery data.

In future work, we will consider adding more factors that may influence the operational performance of flights to make the model more reliable. Model performance could possibly be improved by adding more data and relevant features such as queuing delays, convective weather, and wind, as in our earlier work on modeling system delays ( 7 ). Furthermore, analysis at a micro level may provide more practical insights and help improve the operational performance of the aviation system. This analysis can also be extended to investigate performance at a single airport or for a specific group of flights.

Footnotes

Author Contributions

The authors confirm contribution to the paper as follows: study conception and design: J. Xu, L. Dai, M. Hansen; data collection: J. Xu, L. Dai; analysis and interpretation of results: J. Xu, L. Dai, M. Hansen; draft manuscript preparation: J. Xu, L. Dai, M. Hansen. All authors reviewed the results and approved the final version of the manuscript.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Jing Xu

Lu Dai

Mark Hansen

Data Accessibility Statements

ASPM data are available at .

References

1. Lamb, T. L., S. R. Winter, Rice, K. J. Ruskin, and Vaughn. Factors That Predict Passengers Willingness to Fly during and after the COVID-19 Pandemic. Journal of Air Transport Management, Vol. 89, 2020, article 101897.

2. Dai, Tereshchenko, and Hansen. Quantifying the Impact of Air Travel on Growth of COVID-19 Pandemic in the United States. Proc., Fourteenth USA/Europe Air Traffic Management Research and Development Seminar (ATM2021), New Orleans, LA, 2021.

3. Bureau of Transportation Statistics. BTS Quick Links to Popular Air Carrier Statistics. https://www.bts.gov/topics/airlines-and-airports/quick-links-popular-air-carrier-statistics. Accessed July 25, 2022.

4. Sun, Wandelt, Fricke, and Rosenow. The Impact of COVID-19 on Air Transportation Network in the United States, Europe, and China. Sustainability, Vol. 13, No. 17, 2021, article 9656.

5. Zhou, Kundu, Qin, Goh, and J. B. Sheu. Vulnerability of the Worldwide Air Transportation Network to Global Catastrophes Such as COVID-19. Transportation Research Part E: Logistics and Transportation Review, Vol. 154, 2021, article 102469.

6. Hsiao, C. Y., and Hansen. Econometric Analysis of US Airline Flight Delays with Time-of-Day Effects. Transportation Research Record: Journal of the Transportation Research Board, 2006. 1951: 104–112.

7. Dai, Hansen, M. O. Ball, and D. J. Lovell. Having a Bad Day? Predicting High Delay Days in the National Airspace System. Proc., Fourteenth USA/Europe Air Traffic Management Research and Development Seminar (ATM2021), New Orleans, LA, 2021.

8. Wang, Zhou, Hansen, and Chin. Scheduled Block Time Setting and On-Time Performance of US and Chinese Airlines—A Comparative Analysis. Transportation Research Part A: Policy and Practice, Vol. 130, 2019, pp. 825–843.

9. Eufrásio, A. B. R., R. A. Eller, and A. V. Oliveira. Are On-Time Performance Statistics Worthless? An Empirical Study of the Flight Scheduling Strategies of Brazilian Airlines. Transportation Research Part E: Logistics and Transportation Review, Vol. 145, 2021, article 102186.

10. FAA. Aviation System Performance Metrics. https://aspm.faa.gov. Accessed April 14, 2022.

11. Nicholson and C. M. Snyder. Microeconomic Theory: Basic Principles and Extensions, 11th ed. Cengage Learning, Boston, MA, 2012, pp. 181–182.

Flight Time and Flight Traffic Before, During, and After the Pandemic: What Has Changed?
