Abstract
Integration of urban and rural infrastructure is critical to integrating urban and rural public transport. A public transport hub is an important element of infrastructure, and it is the key facilities that serve as transferring points between cities and towns. The location of hub is related to the convenience of travel for urban and rural residents and the closeness of economic interactions between urban and rural areas. In this article, considering the background of the integration of urban and rural public transport, from the perspective of public transport hubs in urban and central town, a multi-level hub-and-spoke network is designed, and the location of integration of urban and rural public transport hub is determined. Based on the connection associated with central towns and the capacity constraints of hubs and to achieve the minimum total cost, this article proposes a mixed-integer programming model that employs a genetic and tabu search hybrid optimization algorithm to validate and analyze, which used the urban and rural public transport data from a specified area of Shandong province in China. The results indicate that the model can simultaneously determine locations for hubs in cities and central towns while minimizing total cost. The hub capacity constraint significantly influences the location of two-level hubs. The hub capacity constraint in the model can reduce the transportation cost for an entire network and optimize the transportation network. This study on urban and rural public transport hub location in a hub-and-spoke network not only reduces the transportation cost of the network but also completes and supplements the location theory of integration of urban and rural public transport.
Keywords
Introduction
The integration of urban and rural passenger transport is critical to integration of urban and rural areas. The main mode is the integration of urban and rural public transport. Its major task is to satisfy urban and rural residents’ ever-increasing travel demand, particularly to enable the residents of cities, villages, and towns to conveniently travel to and from destinations at low cost. However, the residents of villages and towns are scattered and travel less frequently than their urban counterparts. Satisfying residents’ travel demands by directly connecting all villages and towns with a city would increase costs and waste resources. A hub-and-spoke network employs indirect links between nodes, which concentrates travelers at a central node and then convey them to their respective destinations. 1 This approach reduces costs while improving efficiency of the network by taking advantage of economies of scale. A hub-and-spoke network includes two types of nodes: hub nodes and non-hub nodes. A hub location determines the distribution of hubs and the assignment of non-hub nodes.
Numerous studies of hub location in a hub-and-spoke network have been conducted. O’Kelly 2 presented the first mathematical formulation to solve the single allocation p-hub median problem. He investigated an American Airlines passenger network and established the first quadratic integer programming for single allocation p-hub median problem. Researchers have primarily investigated four models and algorithms: p-hub median problem, hub location problem with fixed costs, p-hub center problem, and hub covering problem. 3 They discussed allocation and capacity constraints. However, the majority of studies addressed the problem of capacity constraints in single-layer networks; few studies explored capacity constraints in multi-level networks.
An urban and rural public transport network contains two types of hubs: urban public transport hubs and hubs in central towns. 4 Hubs in central towns primarily serve passengers in general towns and administrative villages, whereas urban public transport hubs serve passengers in cities and central towns. Thus, urban public transport hubs, hubs in central towns, and demand nodes constitute a three-level urban and rural public transport network. The hub location in the network is a multi-level location problem. Existing researches on the integration of urban and rural public transport have primarily focused on urban public transport hubs and hubs in central towns, respectively, using different methods. For example, the hub location models for city environments considering different optimization goal were proposed in these studies.5–8 Wang 5 built a hub covering location model for hubs in central towns. This layout failed to consider urban and rural public transport terminal as a whole and was unable to guarantee optimal transportation cost.
Existing researches on multi-level hub location primarily focus on facility hierarchical locations, such as production-distribution systems; 9 express delivery services systems; 10 hospitals, schools, and other public service facilities;11,12 solid waste management systems; 13 and city emergency facilities. 14 The majority of studies take the standpoint of management features and do not consider the use of a hub-and-spoke network structure. Only a few papers on hub location in a hierarchical hub-and-spoke network have considered transportation. One study was based on an application in cargo delivery. A previous survey of a major cargo company in Turkey by Elmastas 15 discovered that the company used a three-level transport network, where two central hubs are directly connected, the remaining hubs are connected to one of these two central hubs, and each remaining demand node is connected to a hub or a central hub. Based on the network, Elmastas established a hierarchical hub location model that considers time constraints and compared it with an actual situation. The results show that the constructed model has fewer hubs with the lowest total transportation cost. Based on this research, Yaman 16 developed a complete-star-star network and proposed a hierarchical single allocation hub location without capacity constraints. Alumur et al. 17 considered the hierarchical hub location problem in multimodal transportation. Davari and Zarandi 18 considered the uncertainty demand and introduced fuzzy variables into the model based on the model proposed by Yaman. Karimi et al. 19 considered hub capacity and time constraints and presented the capacitated single allocation hierarchical hub median location model. Dükkanci et al. 20 defined a ring-star-star hierarchical network and combined location with route and schedule to construct a hub covering model. Zheng et al. 21 studied the location of regional and international hub ports in liner shipping by proposing a hierarchical hub location problem. Li et al. 22 proposed a hierarchical hub location model for a hub-and-spoke network that was primarily aimed at urban comprehensive passenger hub in China. These studies, however, considered the network structure from perspective of carriage of goods, which is not completely aligned with the characteristics of urban and rural public transport. Moreover, level II hubs are not allowed to connect in the literatures, and few studies have considered the capacity limitations for hubs and paths. In reality, central towns can be connected in an urban and rural public transport network, and factors such as capital, land, traffic distribution, and capacity limitations for hubs and paths require consideration. This article’s emphasis on capacity constraints will align well with real-world situation.
To remedy the previously mentioned deficiencies, this article seeks to improve three aspects of existing research:
The article will consider urban and rural public transport terminals as a whole, build a multi-level hub-and-spoke network, and simultaneously locate urban public transport hubs and hubs in central town.
According to the characteristics of urban and rural public transport, passengers travel from demand nodes to a hub in a central town and then travel to urban public transport hubs. This article will build a star-complete-complete network structure. However, considering real-world situation, if two hubs in central towns connect to the same urban public transport hub, then the connection between them will be allowed, as indicated in Figure 1.
Reflecting real-world situations, this article will simultaneously consider urban public transport hubs and hubs in central towns with capacity constraints while establishing a capacitated hierarchical location model.

Urban and rural public transport hub network with three level I hubs, six level II hubs.
Applying the background of the integration of urban and rural public transport, considering a hub-and-spoke network that considers connections of central towns and hub capacity constraints, this article proposes a hierarchical location model that employs a genetic and tabu search hybrid optimization algorithm to validate using real data. The remainder of this article is organized as follows. In section “Description of hub location for the integration of urban and rural public transport,” the problem is described. The model is constructed in section “Construction of hub location model,” and a solution approach is designed in section “Solution method.” Section “Computational result and analysis” presents the computational results and parameters analysis. The conclusions are discussed in section “Conclusion.”
Description of hub location for the integration of urban and rural public transport
The integration of urban and rural public transport does not cover the complete range of urban and rural areas but rather is confined to a certain administrative area, which is defined in this article as a specific area of a city and its neighboring townships. 7 In this region, urban and rural public transport primarily aims to satisfy the growing travel demand generated by urban and rural residents, especially to satisfy the residents who moving into and out of city. Figure 1 displays a network of urban and rural public transport. Urban public transport hubs are known as the level I hub, which is located in the city or on the rural–urban continuum. Hubs in central towns are known as the level II hub. Level I hubs may be connected, and connections can be formed between level II hubs that connect to the same level I hub. Other demand nodes, however, are not allowed to connect.
The urban area and town are divided into numerous traffic districts, each of which can be considered as a demand node or destination that constitutes a set of urban and rural network nodes. The problem to be solved is to determine the number and location of level I hubs and level II hubs and assign demand nodes to the hub and assign level II hubs to level I hubs while minimizing total cost.
The following assumptions for the model are formed:
Each demand node only connects to one hub.
Direct connections between non-hub nodes are not allowed.
The unit transportation cost between level I hubs, level II hubs, level II hubs and level I hubs is lower than the unit transportation cost between two demand nodes. Moreover, the unit transportation cost between two level II hubs, level II hubs, and level I hubs is equal, as expressed by the discount factor
An alternative node can only place one hub.
Construction of hub location model
In this section, the article proposes a mixed-integer programming model that aims to minimize the sum of transportation costs and hub construction costs. The transportation costs include the assignment costs of non-hub nodes and transportation costs among the hubs. The hub construction cost is considered because the number of hubs is determined by the model. To calculate the traffic on the path, the article refers to the idea of multi-commodity flow proposed by Ernst and Krishnamoorthy 24 and Ebery 25 and employs three index variables to represent the traffic. The article assumes that the path cost of the network satisfies the triangular inequality.
Parameters and variables
To illustrate the problem, parameters and variables are defined as follows.
Mathematical formulation
According to the above description of the problem, the related hypothesis and the definition of parameters and variables, based on literatures,16,22 and considering the hub capacity constraints, the following mathematical model is established
The objective function (1) indicates the minimum total costs that conclude transportation costs and hub construction costs. The first term is the assignment costs, the second is the transportation costs among the level I hubs, the third is the transportation costs among the level II hubs and costs between the level II hubs and level I hubs, the fourth and the fifth are construction costs of level I hubs and level II hubs, respectively. Constraint (2) ensures that the demand nodes are assigned only to one hub; and constraint (3) ensures that the level II hubs cannot be selected from the set of level I hubs. Constraint (4) ensures that the nodes must be assigned to the hub. Constraints (5) and (6) ensure that a level II hub is assigned only to one level I hub, which must serve as the hub. According to constraints (7)–(9), if the level II hub m and n are assigned to the same level I hub k, a connection exists between m and n. If m and n are not assigned to the same level I hub k, a connection between m and n does not exist. Constraints (10)–(12) indicate that m and n are selected as hubs. If the level II hub m is assigned to level I hub n, then the connection between m and n is the connection between the level II hubs. Constraint (13) represents the flow balance among level II hubs. Constraint (14) represents the flow balance among level I hubs. According to constraint (15) if a connection does not exist between the hub j and the hub l, the amount of traffic between j and l is 0. Constraints (16) and (17) are the capacity constraints of the level II hub and the level I hub, respectively. Constraints (18)–(20) are non-negativity restrictions on the amount of traffic, and constraint (21) is a 0–1 integer constraint. The model is 0–1 mixed-integer programming model, which has proved to be an NP hard problem in the literature. 16
Solution method
Scholars have proposed many exact and intelligent optimization algorithms about hub location in a hub-and-spoke network 3 and have demonstrated the effectiveness and superiority of intelligent optimization algorithms for solving the hub location problem. The genetic algorithm is an intelligent optimization algorithm that mimics natural selection, genetic biology, and evolution, and it is extensively employed for addressing location and allocation. For example, Lin et al. 26 employed genetic algorithm to solve the p-hub median problem with capacity constraints. Topcuoglu et al. 27 adopted genetic algorithm to determine the number of hubs, location, and the allocation of non-hub nodes. Kratica et al. 28 applied genetic algorithm to solve the uncapacitated hub median location problem. However, the genetic algorithm can easily fall into the local optimal solution, the simple genetic algorithm cannot successfully solve many complex problems, and allocation is not necessarily the optimal solution. 29 Therefore, scholars often combine other algorithms with genetic algorithms and employ combinatorial optimization algorithm to solve the hub location problem in the hub-and-spoke network. For instance, Cunha and Silva 30 embedded simulated annealing algorithm into genetic algorithm to solve the single allocation hub location problem. Abdinnour-Helm 29 combined genetic algorithm and tabu search algorithm to solve the single allocation hub location problem. Skorin-Kapov et al. 31 proved that the tabu search algorithm can achieve optimal results, and it is better than simulated annealing algorithm for solving the problem. For the mixed-integer programming model established in the article, when the number of level II hubs is 0, the multi-level hub location problem becomes a classic single-layer median hub location problem. Abdinnour-Helm 29 has proved the effectiveness and superiority of the method; thus, this article applies the genetic and tabu search hybrid optimization algorithm to solve the model. The solution includes two stages: determining location by genetic algorithm to obtain an initial solution and solving the optimal assignment problem using the tabu search algorithm.
Solution representation
In genetic algorithm, a solution includes two arrays: HubArray and AssignArray. HubArray is expressed in 0s and 1s, where 0 indicates a non-hub node and 1 indicates a hub. AssignArray is expressed in real numbers; the value of the corresponding position in the array represents the serial number of a non-hub node that is assigned to a level II hub or a level I hub. The level I hub is the node that is located to itself. The length of the two arrays equals the number of nodes in the network k. The solution representation of Figure 1 is shown in Figure 2.

The representation of solution.
Steps of genetic and tabu search hybrid optimization algorithm
Initial parameters including population of solution
Generate initial population. A set of initial solution is generated based on the given fixed probability
Calculate the fitness function. Calculate the fitness function value of each individual considering the objective function as a fitness function.
Select two parent solutions for reproduction. Use the binary tournament selection method due to its fewer iterations and greater speed. 30 Randomly choose two individuals from the population each time and save the individual with the smaller fitness value in the next generation. Because each binary tournament selection produces one parent, the article holds two binary tournament selections to perform the following operation.
Produce offspring by recombination. This step includes two parts: crossover operation and mutation operation. Apply one point crossover operator and randomly select one crossover point for HubArray and AssignArray. The selected crossover point is the same. Then, offspring are generated by exchanging genes behind the crossover point in the parents. After exchanging, if two offspring are both hubs and non-hubs, they are dropped and a crossover point is generated once more. For the AssignArray, if a node is assigned to one node that is not a hub after exchanging, then the node is randomly reassigned to a selected hub. Mutation is performed as follows: For HubArray, randomly select a hub node and a non-hub node and exchange the gene value for the corresponding positions: 0–1 and 1–0. For AssignArray, choose a non-hub node and reassign it to another randomly selected hub.
Apply a tabu search algorithm to each generated individual to improve the initial assignment process.
Replace an individual in the initial population using the new offspring solution obtained through this process, then evaluate the new population. Use an elitist generation replacement and replace the individual with the greatest fitness value by offspring solution.
Repeat steps 4–7 until a pre-defined number of iterations is attained.
Parameters for genetic and tabu search hybrid algorithm.
Tabu search optimization component
The article improves the allocation of non-hub node to hub with tabu search algorithm, which finds the optimal allocation within a neighborhood. The allocation does not change the location of hubs. Find the optimal allocation by movement, which includes two types: shift moves and exchange moves. Shift moves switch the assignment of a non-hub from one hub to another hub. Exchange moves involve a pair of non-hubs, which randomly selects two non-hub nodes and switches their assignments. The process is described as follows:
Set the move process as tabu object. Keep two separate tabu lists: one is tabu_list_shift for shift moves and the other one is tabu_list_exchange for exchange moves. The initial value is set 0. Tabu_size is set
Assess the termination conditions of tabu search algorithm. If the conditions are satisfied, tabu search is terminated, and return to step 8 above, or continue the following steps.
When shift probability
Designate the optimal solution as the current solution and determine whether the current solution satisfies the despised criterion. If the despised criterion is satisfied, update despised criterion and the current solution, or continue the following steps.
Update the tabu list.
Return step 2.
Computational result and analysis
The model is computed and analyzed with a set of urban and rural public transport survey data from Guangrao County in Shandong province, China, in 2013. The research area is shown in Figure 3. Guangrao County has a total area of 1167.55 km 2 , which administers six towns, one township, two streets, and one economic development zone. According to the basis of traffic district division, Guangrao County is divided into 54 districts, and residents’ public transport trips are surveyed by a sampling survey to obtain the data of public transport trips. The article uses the distance between the centroid of traffic districts instead of the distance between two districts. To analyze the influence of some parameters on the total cost and hub location, considering the balanced distribution, this article, respectively, chooses 15, 20, 25, and 30 traffic districts with more flow volume as demand points (represented by circles). Some districts are chosen as a possible set of level I hubs (depicted by triangles) and some districts are chosen as a possible set of level II hubs (represented by squares). The distances and traffic volumes among districts are listed in Tables 2 and 3. The article assumes that the unit transportation cost is 0.2 yuan per passenger-kilometer. Based on planning, the maximum capacity of a level I hub ranges from 5000 to 10,000 and the maximum capacity of a level II hub ranges from 2000 to 5000. The planned construction cost of a level I hub is 3,000,000, and the planned construction cost of a level II hub is 1,500,000.

The map for the 30 node instance.
Traffic volumes between demand nodes.
Distances between demand nodes.
Computational results
According to the genetic and tabu search hybrid optimization algorithm, a simulation is performed with MATLAB. The population size is 100, the crossover probability
The change of cost and hub location.
It can be seen that the number and location of level I hubs and level II hubs have changed with different number of demand nodes. With an increase in node number, the number of level I hubs increases, and the number of level II hubs decreases. For example, when
Parameter analysis
The influence of capacity constraints on hub location and cost
Considering the two cases of the hub with and without capacity constraints, 15, 20, 25, and 30 nodes are selected from the 54 nodes:
The change of hub number and location.
When the hub has no capacity constraints, all non-hub nodes are directly assigned to only one level I hub, and no level II hub is selected, because the total cost includes two parts: transportation cost and hub construction cost. The minimum total cost is the cost of building only one level I hub. When the hub has capacity constraints, the number of level I hubs increases, and the location of level II hubs also changes. The results indicate that the number of nodes increases from 15 to 30, the number of level I hubs increases from two to six, and nodes 33 and 35 are required, because the minimum distance is the distance from all nodes to nodes 33 and 35 in the set of possible location of level I hubs.
Figures 4 and 5 depict the trends of total cost and transportation cost for different numbers of demand nodes. Figure 4 reveals a distinct growth trend for total cost when a hub has capacity constraints and a very small change in total cost when a hub has no capacity constraints. Due to the capacity constraints, total cost changes with an increase in the number of selected hubs. From the standpoint of transportation costs, the cost with capacity constraints is lower than the cost without capacity constraints, as shown in Figure 5. Due to capacity constraints, a greater number of hubs can produce a greater economies of scale, which reduces the cost.

Trend of total cost with different number of demand nodes.

Trend of transportation cost with different number of demand nodes.
Influence of capacity value on hub numbers, locations, and costs
As planned, the capacity of a level I hub ranges from 5000 to 10,000. A level II hub ranges from 2000 to 5000. Table 6 shows seven different situations when
Cost and location for different capacity constraints.
When the capacity of a level I hub is constant and the capacity of a level II hub is increased, the number of level I hubs decreases, and the number of level II hubs increases, as the location of hubs changes. If
Figures 6 clarifies how level I hub capacity and level II hub capacity affect the transportation costs. When the level II hub capacity is constant, as

Trend of transportation cost with different capacity.
Influence of construction cost on hubs numbers and locations
According to the calculation, the construction costs of a level I hub and a level II hub are estimated as follows:
Hub quantity and location for different construction costs.
Conclusion
The application of a hub-and-spoke network in urban and rural public transport planning is a relatively new idea. This work can integrate the resources of urban and rural public transport and reduce the travel cost of residents by utilizing economies of scale. Combining the actual background of integration of urban and rural public transport and considering three-level nodes of urban public transport hub, central town hub and demand nodes, a multi-level hub-and-spoke network is constructed, and a mixed-integer programming model is presented in this article. The effectiveness of the model is computed, verified, and analyzed using a genetic and tabu search hybrid optimization algorithm and actual data of urban and rural passenger transportation.
The main contribution of the article is described as follows: first, combining the practical problems of integration of urban and rural public transport, a multi-level complete network that considers the connections among level II hubs, which differs from existing network, is constructed. Second, to address existing problems of separately locating urban hubs and central town hubs, a hierarchical hub location model is constructed. The model can effectively and simultaneously determine the locations of urban public transport hubs (level I hubs) and central town hubs (level II hubs), which minimizes the total cost of the network. Third, the capacity constraints of two-level hubs in the hub-and-spoke network are also considered. The results reveal that the capacity constraints of two-level hubs not only significantly influence the locations and quantities of hubs but also have a significant impact on transportation costs. Moreover, the capacity constraints make the network more reasonable and enable the layout of hubs to fit the actual travel characteristics.
However, due to lapse of time and development of economy, the network structure of integration of urban and rural public transport will change. Establishing a dynamic and uncertain urban and rural public transport hub location model will be studied in the future. The genetic algorithm and tabu search hybrid optimization algorithm in this article are aimed at small scale demand nodes; Abdinnour-Helm proves its effectiveness in solving the problem of single layer network location when the number of demand nodes is large. However, additional research is necessary to determine whether the algorithm is superior in solving multi-layer hub location problem.
Footnotes
Handling Editor: Daniel Gutierrez-Reina
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by “Chunhui” planning project of Ministry of Education of the People’s Republic of China (S2016012). This research was also supported by Science & Technology Development program of Jilin Province (20180418101FG).
