Abstract
Air passenger travel forecasting is valuable for airline companies: accurately identifying the practical requirements of air passengers not only helps airlines improve passenger satisfaction and user experience, and thereby increase revenue, but also helps air passengers find a suitable travel plan quickly. To build an air passenger travel forecasting model, this paper analyzes the internal driving force and the social influence factors simultaneously, based on dynamical personal behaviors and air passenger social relationships. In particular, three aspects are considered together: dynamical personal behaviors, the effect of fellow air passengers, and the influence of similar air passengers. The data from these aspects are then used for training to obtain the weight allocation in different scenarios. In addition, workdays and non-workdays are considered separately to make the forecasting model feasible and effective.
Introduction
Air passenger travel forecasting has recently become an active research topic. Hofer et al. 1 measured how socio-economic mobility affects domestic passengers at U.S. airports; in addition to metrics such as income and population levels, socio-economic mobility was considered an important characteristic of the socio-economic fabric of market areas. In view of growing urbanization and the resulting increase in traffic, Becker et al. 2 aimed to identify potential future worldwide markets for interurban air mobility over distances of up to 300 km. Forecasts of air passengers and cargo have a major influence on master plans for airport infrastructure development and on investment by civil airlines. Sulistyowati et al. 3 aimed to obtain the most accurate forecasts of air passengers and cargo at three international airports in Indonesia: Soekarno Hatta, I Gusti Ngurah Rai, and Juanda. Liu Huang et al. 4 forecast the number of daily passengers of an airline company on the route from Beijing to Sanya, using historical data from 2010 to 2016. The forecasting was carried out with machine learning models, including multi-variable regression, support vector regression, RMA-improved, and RBF-based neural networks. Li and Sheng 5 investigated the mode choice behavior of inter-city passengers among air transport, high-speed rail, and integrated air and high-speed rail services. Kim and Shin 6 developed a forecasting model for short-term air passenger demand that used big data from search queries to identify short-term fluctuations. Fatemi Ghomi and Forghani 7 used data from a major airline company in Turkey, comprising the past five years of daily passenger data for a flight, to forecast the expected passenger count for that flight over the 355 days that were open for sale in reservation systems.
Travel forecasting research has also been used in practical applications. Sun et al. 8 first used a reliability evaluation method to obtain the reliability of a bus line, and tested their reliability forecasting model with data from bus line 23 in Dalian, China. The results showed that a random forest with reasonable parameters could forecast the reliability of bus service accurately. Gu et al. 9 solved the train timetabling problem for a double-track high-speed railway line with heavy traffic and trains with different operational speeds. The case study showed that more trains could be scheduled when different kinds of headway time were considered.
Current research on air passenger travel forecasting mainly uses passenger name record (PNR) data to calculate the degree of travel relevancy between different air passengers, so as to measure the relationships among them. In fact, however, the implicit relationships across different ticket orders are more worth discovering than the obvious relationships within a single ticket order. Furthermore, current methods for measuring the value of air passengers usually do not consider the interaction effects among individuals in their social relationship networks.
Research on air passenger value facilitates air passenger travel forecasting. Bingyu 10 used a generated bigraph to forecast the travel probability of air passengers and calculated their potential value; however, the proposed model only considered the relationship between air passengers and airlines, without considering the interaction effects between different air passengers. Bingyu 10 also proposed that the value arising from the social network should be considered in addition to individual value. However, when the air passenger network value was calculated, only the topological relationships among air passengers were considered, while the differences between individual air passengers, which lead to differences in influence force, were ignored.
In this paper, forecasting the route selection of air passengers is abstracted as a link forecasting model together with the factors that influence the choice of airlines. An air passenger travel forecasting model is then proposed, based on both dynamical personal behavior and social influence force. The main contribution of this paper is to measure the strength of the influence force in terms of dynamical personal behaviors and air passenger social relationships, and to provide a theoretical foundation for the probability calculation in the airline forecasting process.
Analysis of passenger travel and influence factors
Supposing

Heterogeneous network of air passenger travel and chosen airlines.
For the relationship
Supposing
Dynamical individual behavior impact analysis
For air passengers and their historical airlines, we collected the time intervals between two consecutive choices of the same airline by the same air passenger, and identified the interval with the maximum travel count, referred to as the hot cycle time. The closer the time elapsed since the last trip is to the hot cycle time, the higher the probability that the air passenger chooses the same airline again.
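The interval collection described above can be sketched as follows. This is a minimal illustration rather than the paper's own procedure: the function names `repeat_intervals` and `hot_cycle_time`, the 7-day bin width, and the tie-breaking rule are all assumptions introduced here.

```python
from collections import Counter
from datetime import date


def repeat_intervals(travel_dates):
    """Day gaps between consecutive trips of one passenger on one airline."""
    ordered = sorted(travel_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def hot_cycle_time(travel_dates, bin_days=7):
    """Start (in days) of the interval bin holding the most repeat trips.

    bin_days is a hypothetical bin width; returns None when the passenger
    has fewer than two trips on the airline.
    """
    gaps = repeat_intervals(travel_dates)
    if not gaps:
        return None
    bins = Counter((gap // bin_days) * bin_days for gap in gaps)
    # Highest count wins; ties go to the shorter interval (an assumption).
    best_bin, _ = min(bins.items(), key=lambda item: (-item[1], item[0]))
    return best_bin


# Illustrative dates only, not taken from the experimental data.
trips = [date(2016, 1, 4), date(2016, 2, 1), date(2016, 3, 1),
         date(2016, 3, 29), date(2016, 6, 1)]
print(repeat_intervals(trips))
print(hot_cycle_time(trips))
```

Binning the gaps keeps near-identical intervals (for example 28 and 29 days) in the same peak, which matches the idea of a fixed repeat period more closely than counting exact day values.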
In order to describe the probability that air passenger chooses certain airlines, hot cycle time is adopted so as to represent this value, which can be calculated by
To verify the algorithm's performance, we used historical travel data to obtain the cycle time distribution and the probability distribution, as shown in Figures 2 and 3, respectively.

Cycle time distribution.

Probability distribution.
As Figure 2 shows, the time interval for choosing the same airline again is not irregular: there are one or two peaks, which means that air passengers tend to choose a given airline again around a fixed time interval. Figure 3 shows the relationship between probability and cycle time, which is similar to the distribution of air passengers' actual choices.
Supposing
Fellow air passenger impact analysis
Although the probability that an air passenger chooses certain airlines within different periods can be forecast from the passenger's historical data, potential demand cannot be found for airlines that the passenger has never chosen, especially when travel data are sparse. The travel plan therefore cannot be forecast well from historical travel data alone. This paper uses both the travel data and the impact of fellow air passengers to compensate for sparse travel data.
The effect of fellow air passengers on the choice of airlines is described as follows:
Fellow air passengers are more likely to choose the same airlines again. Air passengers usually have requirements similar to those of their fellow air passengers when choosing airlines, so potentially required airlines can be discovered by calculating the degree of fellow relationship between air passengers.
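As a rough sketch of a fellow relationship degree, the code below counts how often two passengers appear in the same booking and normalizes by the trip count of the less frequent traveller. The text does not give the formula, so this normalization, the function name `fellow_degree`, and the input layout (one collection of passenger identifiers per shared booking) are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def fellow_degree(orders):
    """Map each passenger pair to shared trips / trips of the rarer traveller.

    orders: iterable of passenger-id collections, one per shared booking
    (a hypothetical simplification of PNR records).
    """
    trips = defaultdict(int)
    shared = defaultdict(int)
    for order in orders:
        members = sorted(set(order))
        for passenger in members:
            trips[passenger] += 1
        for a, b in combinations(members, 2):
            shared[(a, b)] += 1
    return {pair: count / min(trips[pair[0]], trips[pair[1]])
            for pair, count in shared.items()}


# Illustrative bookings only.
orders = [["p1", "p2"], ["p1", "p2", "p3"], ["p3"], ["p1"]]
for pair, degree in sorted(fellow_degree(orders).items()):
    print(pair, degree)
```

A pair with a high degree can then pass airline preferences between its members, which is how airlines that one passenger has never chosen could still enter that passenger's candidate set.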
Similar air passenger impact analysis
Similar air passengers usually prefer the same airlines under certain conditions; for example, air passengers with similar interests tend to choose the same airlines. Likewise, air passengers attending the same activities, such as sports competitions or academic conferences, tend to choose the same airlines. To measure the similarity of air passengers, workday travel and non-workday travel are both considered, so that the similarity can be obtained effectively for different time types. Supposing
Then finally the air passenger workday travel similarity matrix
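One way to picture the workday and non-workday similarity matrices is a pairwise cosine similarity over per-passenger airline counts, computed once for each time type. Cosine similarity is a stand-in chosen here for illustration, since the similarity formula is not shown in the text; the names `cosine` and `similarity_matrix` and the sample counts are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(counts):
    """Pairwise similarity between passengers.

    counts: one row per passenger, one column per airline, each entry the
    number of trips of that passenger on that airline for one time type.
    """
    return [[cosine(row_i, row_j) for row_j in counts] for row_i in counts]


# Illustrative counts: 3 passengers x 3 airlines.
workday_counts = [[2, 0, 1], [4, 0, 2], [0, 3, 0]]
non_workday_counts = [[0, 1, 0], [0, 0, 2], [0, 2, 0]]

workday_similarity = similarity_matrix(workday_counts)
non_workday_similarity = similarity_matrix(non_workday_counts)
print(workday_similarity[0][1], non_workday_similarity[0][2])
```

Keeping two separate matrices means that passengers who resemble each other only on business trips are not treated as similar for weekend or holiday travel.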
Forecasting model of airline choice for air passenger travel
The choice of airline is mainly affected by personal inner driving force and social influence. The inner driving force can be obtained by analyzing the rules of personal behavior, and the social influence can be obtained by analyzing fellow air passengers and similar air passengers.
Accordingly, in the network construction, the inner driving force and the social influence appear as two types of route: a direct route from the air passenger to the airlines, and an indirect route from the air passenger to the airlines that passes through, and is thus influenced by, other air passengers.
The travel mode and chosen airlines of air passengers may differ greatly between periods, so dates are divided into workdays, weekends, and holidays. Figure 4 shows the network edge weight calculation method based on time perception, in which network edges are generated while considering workdays, weekends, and holidays. Figure 5 shows the network generated to describe the relationships between different air passengers.
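The division of dates into workdays, weekends, and holidays can be sketched as below. The holiday calendar is a hypothetical placeholder, and the sketch does not handle make-up workdays that fall on weekends; both are assumptions outside the text.

```python
from collections import defaultdict
from datetime import date

# Hypothetical holiday calendar; a real one would come from an official source.
HOLIDAYS = {date(2016, 10, 3)}


def date_type(day, holidays=HOLIDAYS):
    """Classify a date as 'holiday', 'weekend', or 'workday'."""
    if day in holidays:
        return "holiday"
    if day.weekday() >= 5:  # Monday is 0, so 5 and 6 are Saturday and Sunday
        return "weekend"
    return "workday"


def group_by_date_type(trips, holidays=HOLIDAYS):
    """Group (travel_date, airline) records by date type."""
    grouped = defaultdict(list)
    for travel_date, airline in trips:
        grouped[date_type(travel_date, holidays)].append(airline)
    return dict(grouped)


# Illustrative records only.
records = [(date(2016, 10, 1), "A"), (date(2016, 10, 3), "B"), (date(2016, 10, 4), "A")]
print(group_by_date_type(records))
```

Each group can then feed its own edge weights, so that a passenger's workday habits do not dominate the weekend and holiday forecasts.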

Network edge weight calculation method based on time perception.

Network description about passengers’ relationship.
Where last relation time
Closeness coefficients mean the probability of choosing travel again, which can be obtained by
While the last travel date is less than the average cycle time
Travelling frequency
The choosing of airlines should be considered under the conditions of workday, weekend, and holiday, respectively, so for the date type c, the weight
Then, membership degree is used to solve exact forecasting time. Supposing time membership degree is
Design of proposed air passenger travel forecasting algorithm
Supposing
Forecasting model of choosing airlines is described in Figure 6, and for forecasting date

Workflow description of proposed algorithm.
Step 1: Constructing the workday and non-workday travel networks separately, based on the historical travel data of air passengers.
Step 2: Analyzing the dynamical personal behavior, fellow relationships, and similarity relationships of air passengers separately, based on the above networks.
Step 3: Calculating and obtaining the inner driving force matrix
Step 4: Merging air passenger inner driving force, fellow air passenger influence force, and similar air passenger effect force so as to forecast airlines based on formula (9).
Step 5: Estimating the parameters based on the above two networks, and obtaining the values
Step 6: If
Step 7: Choosing totally
Step 8: Taking the airlines corresponding to the nonzero entries of the matrix as the forecasting result, generating the airlines forecasting result set
Step 9: Sorting the probability values of each row in order, replacing the original values with the sorted probability values and setting all other entries to 0, generating the sorted matrix
Step 10: Airlines forecasting result set
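The merging and selection steps above can be illustrated with the sketch below. Formula (9) is not reproduced in the text, so the linear weighted sum, the weight names `alpha`, `beta`, and `gamma`, and the helper names are assumptions made for illustration only.

```python
def merge_forces(inner, fellow, similar, alpha, beta, gamma):
    """Weighted sum of three passenger-by-airline score matrices (lists of rows).

    The linear form and the three weights are assumptions; the paper's
    formula (9) may differ.
    """
    return [[alpha * i + beta * f + gamma * s
             for i, f, s in zip(row_i, row_f, row_s)]
            for row_i, row_f, row_s in zip(inner, fellow, similar)]


def top_n_airlines(scores_row, airlines, n):
    """Airlines with the n highest nonzero scores for one passenger, best first."""
    ranked = sorted(zip(airlines, scores_row), key=lambda item: (-item[1], item[0]))
    return [airline for airline, score in ranked[:n] if score > 0]


# Illustrative scores for one passenger over three airlines.
airlines = ["A", "B", "C"]
inner = [[0.6, 0.0, 0.1]]     # inner driving force from personal history
fellow = [[0.0, 0.5, 0.0]]    # influence of fellow air passengers
similar = [[0.2, 0.2, 0.0]]   # influence of similar air passengers

merged = merge_forces(inner, fellow, similar, alpha=0.5, beta=0.3, gamma=0.2)
print(top_n_airlines(merged[0], airlines, n=2))
```

In this example airline B has no personal history for the passenger, yet it still enters the forecast through the fellow and similar passenger terms, which is the gap the social influence components are meant to fill.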
Performance verification about proposed passenger travel forecasting algorithm
Experimental data
The experimental data are the PNR data and departure data of the civil aviation passenger booking system. A total of 5000 air passengers are used as the air passenger set
The evaluation indices, namely recall ratio, system forecasting precision rate, and sorting accuracy, are described below. 11–18
1. Recall ratio and system forecasting precision rate
The set definitions used for forecasting are presented in Table 1, which compares forecast travel with actual travel and indicates whether the airline forecast is correct.
Forecasting description about different set definition.
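For orientation, the sketch below computes recall and precision from a forecast set and an actually traveled set using the standard definitions. Table 1 and the paper's own formulas are not reproduced in the text, so the exact set definitions here, and the function name `recall_precision`, are assumptions.

```python
def recall_precision(forecast, actual):
    """Standard recall and precision over sets of airlines.

    recall    = correctly forecast airlines / actually traveled airlines
    precision = correctly forecast airlines / forecast airlines
    """
    hits = len(forecast & actual)
    recall = hits / len(actual) if actual else 0.0
    precision = hits / len(forecast) if forecast else 0.0
    return recall, precision


# Illustrative sets for one passenger.
forecast_airlines = {"A", "B", "C"}
actual_airlines = {"A", "D"}
print(recall_precision(forecast_airlines, actual_airlines))
```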
Supposing recall ratio
Supposing
2. Sorting accuracy
Sorting accuracy is used to measure the recall ratio of forecasting system. Forecasting sorting result matrix
Constructing the result matrix
The sorting score
And the sorting accuracy
Experiment design and result analysis
The influence about model affected by
Choosing totally
Making

Comparison of recall ratio and sorting accuracy for different
The results show that, while
2. The effectiveness and necessity of merging various influence force factors
Three component models, each using a single factor, are compared as follows:
(1) Personal behavior model
Only the dynamical personal behavior rule is used to forecast the chosen airlines, which means
(2) Fellow air passenger relationship model
Only the travel preference of fellow air passengers and their influence force are used to forecast the chosen airlines of the target air passenger, which means
(3) Similar air passenger relationship model
Only the travel preference of similar air passengers and their degree of similarity are used to forecast the chosen airlines of the target air passenger, which means
While the

Comparison of recall ratio and sorting accuracy.
As shown in Figure 8, the random walk algorithm, which is based on a bigraph, considers how frequently each air passenger chooses each airline but ignores the time factor of travel choices. Meanwhile, it relies heavily on aggregate statistics of how each airline is chosen and ignores individual differences, so some irrelevant statistical rules disturb the forecasting result.
The proposed algorithm performs relatively better than the other models in terms of recall ratio and sorting accuracy. Among the single-factor models, the personal behavior model performs best, because it is designed according to the personal behavior rules of air passengers, which suits workday travel. In fact, air passengers usually travel on workdays, so the personal behavior model has better forecasting performance. However, the personal behavior model can only forecast airlines that have already been traveled, so it has difficulty discovering airlines that have never been chosen but are potentially required. The fellow air passenger and similar air passenger relationships effectively compensate for this shortcoming, so the proposed model, which merges the various influence force factors, achieves better forecasting performance. Consequently, the proposed algorithm can effectively forecast potential airline requirements in the long run, but it cannot make real-time forecasts.
Conclusion
In this paper, an impact analysis in terms of dynamical individual behavior, fellow air passengers, and similar air passengers is presented first. An air passenger travel forecasting model is then put forward to mine potential air passengers and forecast the corresponding airlines. Finally, results analysis in terms of the model affected by
Acknowledgements
The authors thank Jan Vitek, Ales Plsek, and Lei Zhao for their help while the first author studied in S3 lab at Purdue University as a visiting scholar from September 2010 to September 2012. We also thank the anonymous reviewers for their valuable comments.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Fundamental Research Funds for the Central Universities of Civil Aviation University of China (no. 3122018C025), Scientific Research Foundation of Civil Aviation University of China (no. 2014QD13X), and Major Projects of Civil Aviation Technology Innovation Funds of China (nos. MHRD 20150107 and MHRD 20160109).
