Empirical study of travel mode forecasting improvement for the combined revealed preference/stated preference data–based discrete choice model

Abstract

Discrete choice models based on combined revealed preference/stated preference data capture the constraints of actual choice-making and reduce prediction errors. However, their universality depends on how the random error variance of the alternatives differs between the two data sources. In this article, we studied the traffic corridor between Chengdu and Longquan, predicting the choice probability of each mode separately with the revealed preference/stated preference joint model and with a model based on stated preference data alone. We found that the revealed preference/stated preference joint model is applicable only when the random error terms of the different data differ significantly. Stated preference data alone amplify travelers’ preferences and cause prediction error. We propose a universal approach that uses revealed preference data to modify the parameter estimates obtained from stated preference data alone, yielding a composite utility that reduces the prediction error. The results suggest that predictions based on the composite utility are more reasonable than those based on stated preference data alone, especially for the mode share of bus. The future metro line will be the main travel mode in this corridor, and 45% of passenger flow will transfer to the metro.

Keywords

Discrete choice model, combined data, mode share prediction, random error terms variance

Introduction

In the transportation field, discrete choice models are frequently used for traffic demand prediction, policy evaluation, and level of service (LOS) evaluation^1–9 with satisfactory results.

Compared with other models, the discrete choice model is better at analyzing travelers’ travel choice behavior from small samples of individual-level data. Stated preference (SP) data and revealed preference (RP) data are the two data types used for discrete choice models.¹⁰ RP data represent travelers’ actual travel choices,^10–12 but cannot reflect travelers’ preferences for new travel modes or new travel services. RP data may also cause model errors because of correlation issues,^13–16 whereas SP data are free of the correlation issue and represent travelers’ preferences under hypothetical scenarios. Unfortunately, SP data lack the constraints of real choices.^17–22 Thus, the combined RP/SP data model was proposed to exploit the advantages of both data types and reduce their prediction errors.^10,23,24 Bradley and Daly theoretically proved the feasibility of applying combined RP/SP data in discrete choice models.²³ Ben-Akiva and Morikawa used a discrete choice model to predict traffic demand based on combined RP/SP data; their results suggested that SP data alone led to biased conclusions, while the combined data analysis was more precise.²⁴ Cherchi and Ortúzar analyzed the impact of a new train service in Cagliari with the RP/SP joint model.²⁵ However, little empirical research has examined the universality of the joint model in realistic settings. In this article, we demonstrate the conditions under which the RP/SP joint model is applicable and propose using RP data to modify the parameter estimates from SP data alone to obtain a composite utility, in order to overcome the disadvantages of the RP/SP joint model and improve the precision of the model results.

A new subway line, Route 2, is planned along the Chengdu–Longquan corridor in Sichuan, China. Assuming subway Route 2 is in operation, we use the single SP data and the combined RP/SP data to predict the mode shares of this corridor, verify the universality of the combined RP/SP data model, and demonstrate the effectiveness of the composite utility. The results can support policies that encourage people to take public transit and help relieve congestion in the Chengdu–Longquan corridor.

The article is structured as follows. Section “RP/SP joint model” introduces the RP/SP joint model. Then, section “Data analysis” describes the questionnaire survey and data analysis. Section “Model results” demonstrates the applicable condition of the RP/SP joint model. Section “The composite utility” illustrates the specific process of realizing the composite utility. Section “Elastic analysis” compares prediction results between composite utility and the single SP data. Finally, section “Conclusion” summarizes the main contributions and suggests possible extensions for this research.

RP/SP joint model

The RP/SP joint model, a discrete choice model combining SP data with RP data, was developed by Bradley and Daly. According to random utility theory, the utility is a random variable composed of a deterministic component V_ni and an unobserved stochastic component ε_ni. V_ni can be calculated from observed characteristics and is assumed to be linear in them. For travel mode i, the utility for traveler n can therefore be described as follows^10,11,26

U_{ni} = V_{ni} + ε_{ni}

(1)

V_{ni} = θ X_{ni} = \sum_{k = 1}^{K} θ_{k} X_{nik}

(2)

where X_nik is the kth attribute of alternative i for traveler n and $θ_{k}$ is the corresponding unknown parameter.

The probability P of traveler n choosing travel mode i is

\begin{matrix} P_{ni} = \Pr (U_{ni} > U_{nj}) = \Pr (V_{ni} + ε_{ni} > V_{nj} + ε_{nj}) \\ = \Pr (ε_{nj} < V_{ni} - V_{nj} + ε_{ni}) \\ = \int_{- \infty}^{+ \infty} \Pr (ε_{ni} = y, ε_{nj} < V_{ni} - V_{nj} + y) dy \\ = \int_{- \infty}^{+ \infty} [\int_{- \infty}^{V_{ni} - V_{nj} + y} f_{12} (y, z) dz] dy \end{matrix}

(3)

where $f_{12} (y, z)$ is the joint probability density function of $ε_{ni}$ and $ε_{nj}$.

When ε is independent and follows the Gumbel distribution, its distribution function and density function are as follows

F (y) = \exp [- \exp (- λ y)], \quad f (y) = λ \exp (- λ y) F (y)

(4)

where λ > 0 is the scale parameter, and its relationship with the variance D(ε) is as follows

D (ε) = \frac{π^{2}}{6 λ^{2}}

(5)
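As a quick numerical check of equations (4) and (5), the sketch below draws Gumbel errors by inverting F(y) and compares the sample variance with π²/(6λ²). The scale value λ = 2, the sample size, the seed, and the function names are illustrative assumptions, not values from the study.

```python
import math
import random


def gumbel_draw(rng, lam):
    """Draw one Gumbel error by inverting F(y) = exp(-exp(-lam * y))."""
    u = rng.random()
    while u == 0.0:  # random() may return exactly 0.0, which would give log(0)
        u = rng.random()
    return -math.log(-math.log(u)) / lam


def sample_variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


lam = 2.0                   # assumed scale parameter (illustrative only)
rng = random.Random(0)      # fixed seed so the run is repeatable
draws = [gumbel_draw(rng, lam) for _ in range(200_000)]

empirical = sample_variance(draws)
theoretical = math.pi ** 2 / (6 * lam ** 2)   # equation (5)
print(f"empirical variance   = {empirical:.4f}")
print(f"theoretical variance = {theoretical:.4f}")
```

A larger λ gives a smaller error variance, which is why the ratio of scale parameters can stand in for the ratio of variances between data sources.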

Substituting equation (4) into equation (3) and integrating, with the scale parameter λ absorbed into θ, gives the multinomial logit (MNL) model in equation (6)

\begin{matrix} P_{ni} = \frac{\exp (θ X_{ni})}{\sum_{j \in C_{n}} \exp (θ X_{nj})} \\ = \frac{1}{\sum_{j \in C_{n}} \exp (\sum_{k = 1}^{K} θ_{k} (X_{njk} - X_{nik}))}, (i, j \in C_{n}) \end{matrix}

(6)

where C_n is the choice set of traveler n, X_nik is the kth attribute of alternative i for traveler n, and θ_k is the unknown parameter of attribute k.
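Equation (6) can be illustrated with a minimal sketch. The two alternatives, their attribute values, and the parameter values below are hypothetical and chosen only to show the calculation; they are not estimates from this study.

```python
import math


def systematic_utility(theta, x):
    """V = sum_k theta_k * x_k, as in equation (2)."""
    return sum(t * xk for t, xk in zip(theta, x))


def mnl_probabilities(utilities):
    """MNL choice probabilities (equation (6)) for a dict {alternative: V}."""
    v_max = max(utilities.values())            # subtract the max for numerical stability
    exp_v = {alt: math.exp(v - v_max) for alt, v in utilities.items()}
    total = sum(exp_v.values())
    return {alt: e / total for alt, e in exp_v.items()}


# Hypothetical parameters: [in-vehicle time (min), fee (yuan)]
theta = [-0.03, -0.05]
# Hypothetical attribute values for two alternatives
attributes = {"bus": [60.0, 2.0], "car": [40.0, 20.0]}

utilities = {alt: systematic_utility(theta, x) for alt, x in attributes.items()}
probs = mnl_probabilities(utilities)
for alt, p in probs.items():
    print(f"{alt}: V = {utilities[alt]:.2f}, P = {p:.3f}")
```

Only utility differences matter: adding the same constant to every V leaves the probabilities unchanged, which is the property the second line of equation (6) expresses.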

The parameters θ_k in equation (6) are unknown; they can be estimated by maximum likelihood as follows, after which the choice probability of each travel mode can be calculated.

If traveler n’s travel mode choice result is δ_ni, then the probability of observing δ_n1, δ_n2, …, δ_ni is

\prod_{i \in C_{n}} P_{ni}^{δ_{ni}} = P_{n1}^{δ_{n1}} P_{n2}^{δ_{n2}} \cdots P_{ni}^{δ_{ni}}

(7)

where $δ_{ni} = \begin{cases} 1, & \text{traveler } n \text{ chooses alternative } i \\ 0, & \text{otherwise} \end{cases}$.

For a sample of N travelers, the probability of observing all travelers’ choices is

L^{*} = \prod_{n = 1}^{N} \prod_{i \in C_{n}} P_{ni}^{δ_{ni}}

(8)

Equation (8) is the likelihood function of the MNL model, and its log-likelihood function L is

L = \ln L^{*} = \sum_{n = 1}^{N} \sum_{i \in C_{n}} δ_{ni} \ln P_{ni} = \sum_{n = 1}^{N} \sum_{i \in C_{n}} δ_{ni} (θ X_{ni} - \ln \sum_{j \in C_{n}} e^{θ X_{nj}})

(9)

The function L in equation (9) can be shown to be concave in θ, so the maximum likelihood estimate $\hat{θ}$ is obtained by setting the derivatives of equation (9) to zero

\frac{\partial L}{\partial θ_{k}} = \sum_{n = 1}^{N} \sum_{i \in C_{n}} δ_{ni} (X_{nik} - \frac{\sum_{j \in C_{n}} X_{njk} e^{θ X_{nj}}}{\sum_{j \in C_{n}} e^{θ X_{nj}}}) = 0, (k = 1, 2, \dots, K)

(10)

With equation (6) and $\sum_{i \in C_{n}} δ_{ni} = 1$ , equation (10) can be simplified as

\sum_{n = 1}^{N} \sum_{i \in C_{n}} (δ_{ni} - P_{ni}) X_{nik} = 0, (k = 1, 2, \dots, K)

(11a)

where

P_{ni} = \frac{\exp (θ X_{ni})}{\sum_{j \in C_{n}} \exp (θ X_{nj})}

(11b)
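The estimation in equations (9) to (11) can be sketched on a toy data set. The code below maximizes the log-likelihood with plain gradient ascent, using the left-hand side of equation (11a) as the gradient. The six observations, the units, the step size, and the iteration count are assumptions made for illustration, and gradient ascent is a simple stand-in optimizer rather than the routine used by the estimation software in this study.

```python
import math


def probabilities(theta, alternatives):
    """Equation (11b): MNL probabilities for one traveler.

    `alternatives` is a list of attribute vectors, one per alternative.
    """
    v = [sum(t * x for t, x in zip(theta, attrs)) for attrs in alternatives]
    v_max = max(v)                               # stabilise the exponentials
    exp_v = [math.exp(val - v_max) for val in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]


def score(theta, data):
    """Left-hand side of equation (11a), one entry per parameter k."""
    grad = [0.0] * len(theta)
    for alternatives, chosen in data:
        p = probabilities(theta, alternatives)
        for i, attrs in enumerate(alternatives):
            delta = 1.0 if i == chosen else 0.0
            for k, x in enumerate(attrs):
                grad[k] += (delta - p[i]) * x
    return grad


def log_likelihood(theta, data):
    """Equation (9)."""
    return sum(math.log(probabilities(theta, alts)[chosen]) for alts, chosen in data)


def fit(data, n_params, step=0.5, iterations=20000):
    """Gradient ascent on the mean log-likelihood (a simple stand-in optimizer)."""
    theta = [0.0] * n_params
    for _ in range(iterations):
        grad = score(theta, data)
        theta = [t + step * g / len(data) for t, g in zip(theta, grad)]
    return theta


# Hypothetical observations: ([attributes of alt 0, attributes of alt 1], chosen index),
# with attributes = [travel time in hours, cost in tens of yuan].
data = [
    ([[1.0, 0.2], [0.7, 2.0]], 0),
    ([[1.5, 0.2], [0.7, 2.0]], 1),
    ([[1.0, 0.4], [1.0, 3.0]], 0),
    ([[2.0, 0.2], [0.7, 4.0]], 1),
    ([[1.0, 0.2], [0.7, 2.0]], 1),
    ([[1.5, 0.4], [1.0, 2.0]], 0),
]

theta_hat = fit(data, n_params=2)
print("theta_hat =", [round(t, 3) for t in theta_hat])
print("score at theta_hat =", score(theta_hat, data))
```

At the estimate, every component of the score is numerically zero, which is exactly the condition stated in equation (11a).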

For the sample, the maximum likelihood estimation (equation (11)) gives the parameter estimates and hence the utility function and the utility of every alternative, and equation (6) then gives the probability of each alternative. Using the probability prediction method, we also obtain the share of every alternative over the whole sample: we first calculate the choice probability of each alternative for each individual from the utility function and then average these probabilities over the sample to obtain the share of each alternative. The calculation is shown below¹⁰

P_{i} = \frac{1}{N} \sum_{n = 1}^{N} P_{ni}

(12)

where P_i is the share of alternative i and P_ni is the probability of traveler n choosing alternative i.
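The averaging in equation (12) can be sketched as follows. The three travelers and their individual probabilities are hypothetical numbers used only to show the arithmetic.

```python
def mode_shares(individual_probs):
    """Equation (12): average individual choice probabilities over the sample.

    `individual_probs` is a list with one dict {alternative: P_ni} per traveler.
    """
    n = len(individual_probs)
    alternatives = individual_probs[0].keys()
    return {alt: sum(p[alt] for p in individual_probs) / n for alt in alternatives}


# Hypothetical individual probabilities for three travelers
individual_probs = [
    {"bus": 0.6, "car": 0.4},
    {"bus": 0.3, "car": 0.7},
    {"bus": 0.9, "car": 0.1},
]

shares = mode_shares(individual_probs)
print(shares)
```

Because each traveler's probabilities sum to one, the averaged shares also sum to one.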

The unobserved factors of the utility functions are represented by the random error terms ε and γ. The RP/SP joint model assumes that the random error terms of the two data sources differ markedly in variance. As shown in equation (13), µ² equals the ratio of the variances of the random error terms of the RP and SP utility functions¹

μ^{2} = \frac{Var (ε)}{Var (γ)} = \frac{\frac{π^{2}}{6 λ_{rp}^{2}}}{\frac{π^{2}}{6 λ_{sp}^{2}}} = \frac{λ_{sp}^{2}}{λ_{rp}^{2}}

(13)

where ε and γ are the random error terms of the RP and SP utility functions, respectively, and λ_rp and λ_sp are the corresponding scale parameters.

Therefore, the utility functions of alternative i for the combined data are

U_{i}^{rp} = V_{i}^{rp} + ε_{i} = α X_{i}^{rp} + θ Y_{i}^{rp} + ε_{i}

μ U_{i}^{sp} = μ (V_{i}^{sp} + γ_{i}) = μ (α X_{i}^{sp} + ϑ ω_{i}^{sp} + γ_{i})

(14)

where $α, θ, ϑ$ are the parameters to be estimated; $X_{i}^{rp}$ and $X_{i}^{sp}$ are the attributes common to the RP and SP data; and $Y_{i}^{rp}$ and $ω_{i}^{sp}$ are the attributes specific to the RP and SP data, respectively.

Since the scaled random term of the SP utility, µγ_i, has the same variance as the random error term of the RP utility, the joint model can be estimated by building a virtual nested logit (NL) tree. In this NL tree, the alternatives of the RP data are placed directly below the root, while the alternatives of the SP data are placed in single-alternative nests one level lower, the SPL level. Figure 1 shows the virtual NL tree for the RP and SP data.

Figure 1.

Virtual Nested Logit (NL) tree.

The utility functions of the SPL level are shown below

\begin{matrix} V^{SPL} = μ \ln \sum \exp (V_{i}^{sp}) \\ V_{i}^{sp} = U_{i}^{sp} - γ_{i} = α X_{i}^{sp} + ϑ ω_{i}^{sp} \\ V^{SPL} = μ (α X_{i}^{sp} + ϑ ω_{i}^{sp}) \end{matrix}

(15)

Equations (15) and (14) have the same form; thus, the RP/SP joint model can be estimated by building the virtual NL tree.
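The effect of the scale parameter can be sketched as a pooled log-likelihood in which RP observations use their systematic utilities directly and SP observations use utilities multiplied by µ. This is a simplified illustration of the scaling idea, not the nested-tree estimation carried out in the software; the utilities below are hypothetical fixed numbers, whereas in practice they depend on the parameters, which are estimated jointly with µ.

```python
import math


def mnl_log_prob(utilities, chosen):
    """Log of the MNL probability of the chosen alternative."""
    v_max = max(utilities)
    log_denominator = v_max + math.log(sum(math.exp(v - v_max) for v in utilities))
    return utilities[chosen] - log_denominator


def joint_log_likelihood(rp_obs, sp_obs, mu):
    """Pooled RP/SP log-likelihood with SP utilities scaled by mu.

    Each observation is (list of systematic utilities, index of chosen alternative).
    """
    ll_rp = sum(mnl_log_prob(v, chosen) for v, chosen in rp_obs)
    ll_sp = sum(mnl_log_prob([mu * x for x in v], chosen) for v, chosen in sp_obs)
    return ll_rp + ll_sp


# Hypothetical systematic utilities, for illustration only:
# four RP alternatives and five SP alternatives per observation.
rp_obs = [([-1.0, -0.5, -2.0, -0.8], 1), ([-0.7, -1.2, -2.5, -0.4], 3)]
sp_obs = [([-1.0, -0.5, -2.0, -0.8, -0.3], 4), ([-0.9, -1.1, -2.2, -0.6, -0.2], 0)]

ll_equal_scale = joint_log_likelihood(rp_obs, sp_obs, mu=1.0)  # same variance in both data
ll_flat_sp = joint_log_likelihood(rp_obs, sp_obs, mu=0.0)      # SP choices become pure noise
print(ll_equal_scale, ll_flat_sp)
```

With µ = 1 the two data sources are simply pooled; an estimate of µ close to 1 therefore indicates that the two error variances do not differ.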

Data analysis

In this article, the Chengdu–Longquan corridor is selected as the case study. To demonstrate the effectiveness of the composite utility method, we use the combined RP/SP data to predict mode shares and compare the result with the prediction based on SP data alone. We first collected RP data on travelers’ actual choices in the corridor and then surveyed travelers’ preferences under the assumption that the subway is in service on the corridor.

RP data

RP data contain travelers’ actual travel choices between Chengdu and Longquan, together with their demographic and socioeconomic attributes. The survey targets are residents of Chengdu and Longquan. Random sampling was used, and 550 replies were obtained (545 valid). The RP data reflect travelers’ actual choice behavior among four travel modes: coach, bus, private car, and taxi. The average monthly income of the respondents is ¥4112, the male–female ratio is 1.02, and the average age is 33 years. According to the RP questionnaires, travelers choose bus and private car as their primary travel modes.

SP data

SP questionnaires record travelers’ preferences regarding the new subway service between Chengdu and Longquan. The questionnaire is based on an orthogonal design, which maintains orthogonality among the alternative attributes and avoids the multicollinearity that would cause model estimation errors. There are five alternatives (coach, bus, private car, taxi, and subway); their attributes and attribute levels are shown in Table 1. (Arrival time means the travel time from home to station.) The attribute levels are set based on real values to avoid dominant alternatives in the choice combinations. The SP survey was conducted by interview, with 410 respondents in total and 1600 valid observations; the respondents have an average monthly income of ¥4135, a male–female ratio of 1.03, and an average age of 34 years.

Table 1.

Alternatives and their attribute levels.

Alternative	Attribute	Level 1	Level 2	Level 3
Bus	In-vehicle time (min)	60	90	120
	Fee (¥)	2	4	−
	Waiting time (min)	5	10	−
	Arrival time (min)	5	10	−
	Off-vehicle time (min)	5	10	−
Coach	In-vehicle time (min)	40	60	−
	Fee (¥)	6	8	10
	Waiting time (min)	15	−	−
	Arrival time (min)	20	−	−
	Off-vehicle time (min)	15	−	−
Taxi	In-vehicle time (min)	40	60	−
	Fee (¥)	50	70	−
Private car	In-vehicle time (min)	40	60	−
	Fee (¥)	20	40	−
Subway	In-vehicle time (min)	30	40	−
	Fee (¥)	4	6	−
	Arrival time (min)	10	30	−
	Off-vehicle time (min)	10	−	−

The fee of private car means fuel cost.

Model results

When building the models, we assume that the socioeconomic attributes have the same impact on all travelers’ choice-making behavior. We used NLOGIT to estimate the model parameters, and the results are shown in Table 2.

Table 2.

Model parameter estimation by software NLOGIT.

Variables	Model 1	Model 2	Model 3	Model 4
Constant of coach (RP)	−1.157 (−1.139)	−	−4.418 (−3.885)	−7.811 (−3.256)
Constant of coach (SP)	−	−1.202 (−3.282)	−0.104 (−0.365)	−0.117 (−0.378)
Constant of bus (RP)	−4.201 (−3.488)	−	−5.006 (−4.842)	−8.487 (−3.571)
Constant of bus (SP)	−	−0.456 (−2.375)	−0.251 (−1.536)	−0.331 (1.830)
Constant of taxi (RP)	6.714 (9.201)	−	5.795 (10.942)	6.138 (8.227)
Constant of taxi (SP)	−	−3.794 (−7.003)	−3.817 (−8.076)	−3.327 (−6.233)
Constant of private car (RP)	−	−	−	−
Constant of private car (SP)	−	−1.309 (−3.666)	−1.784 (−8.076)	−1.286 (−3.593)
Arrival time (RP)	−0.207 (−5.374)	−	−0.227 (−3.885)	−0.246 (−6.463)
Arrival time (SP)	−	−0.09 (−17.079)	−0.095 (−17.211)	−0.103 (−15.086)
Waiting time	−0.312 (−7.691)	−0.033 (−1.236)	−0.121 (6.171)	−0.121 (−5.393)
Off-vehicle time	−0.056 (−1.666)	−0.01 (−0.372)	−0.026 (−1.367)	−0.029 (−1.387)
Fee (RP)	−0.30 (−10.181)	−	−0.311 (−12.843)	−0.327 (−10.241)
Fee (SP)	−	−0.033 (−5.414)	−0.032 (−5.649)	−0.032 (−5.473)
In-vehicle time	−0.049 (−0.780)	−0.03 (−12.277)	−0.031 (−12.339)	−0.031 (−12.195)
Log Likelihood (LL) (&)	−360.542	−1697.787	−2084.005	−2081.697
ρ ²	0.207	0.318	0.219	0.003
µ	−	−	1(0.06)	−
Inclusive Value (IV)
CAR (taxi and private car)	−	−	−	1 (fixed)
Public Transit (PT)	−	−	−	0.808 (9.718)

RP: revealed preference; SP: stated preference. Model 3 (RP/SP), Model 4 (RP/SP with different structure tree).

Values in parentheses are the t-test statistics (T) of the parameters.

According to Table 2, Model 1 reflects the actual behavior of travelers based on the RP data. With private car as the reference alternative, the attribute parameters have the expected signs. Except for the off-vehicle time, the other attributes are significant variables with |T| ≥ 1.96. This means that, at the 5% significance level, the off-vehicle time has little impact on travelers’ choice behavior, and the Chengdu–Longquan corridor currently has good accessibility. According to equation (12), the mode shares in the Chengdu–Longquan corridor are 8.9% by coach, 43.8% by bus, 5% by taxi, and 42.3% by private car. Model 2 gives the parameter estimates based on the SP data, with the metro as the reference alternative. The results show that all the parameters have the expected signs and that the variables are significant except for the off-vehicle time and the waiting time. The most significant variables, arrival time and in-vehicle time, suggest that convenience and fast access should be improved in the Chengdu–Longquan corridor. The comparison between Models 1 and 2 suggests that SP data may introduce prediction errors, for example, in the waiting time and in-vehicle time. In Model 1, the waiting time has a significant impact on travelers, whereas in Model 2 the waiting time does not significantly affect travelers’ choice behavior. So, the SP data may cause prediction errors and produce unreasonable results.

Model 3 gives the parameter estimates of the RP/SP joint model. The alternative-specific constants differ between the two kinds of data. The in-vehicle time and the waiting time are each significant in only one of the single-data models, and the off-vehicle time is not significant in either. To test whether combining the data improves the significance of the parameter estimates, the in-vehicle time and the waiting time were defined as generic variables shared by both data sets. According to the single-data estimation results, the arrival time and the fee are significant in both Models 1 and 2. Therefore, the arrival time and the fee are given separate parameters for the RP and SP data in the combined model.

The result shows that the generic variables become more significant, while the off-vehicle time is still not significant. This means that the off-vehicle time has little impact on travelers’ choice behavior and that accessibility in the Chengdu–Longquan corridor is good. The scale parameter µ is equal to 1 and is not significant in the RP/SP joint model. This shows that there is no difference between the random error variances of the RP and SP data, and that the variance of the error term is not related to the data source.

Model 4 is the RP/SP joint model with another tree structure, as shown in Figure 2. The first level includes PT (public transit) and Car (taxi and private car). The IV parameter of the PT nest is 0.808 and is significant, which means that the alternatives within the PT nest are correlated. This suggests that the variance of the random error term in the RP/SP joint model is related to the tree structure (how the alternatives are allocated) but not to the data source.

Figure 2.

Nested Logit (NL) tree.

According to the Model 3 results, the RP/SP joint model can improve the significance of the alternative attributes and compensate for the shortcomings of SP data. However, the results contradict the primary assumption of the RP/SP joint model: the variance of the random error term of the alternatives is not related to the data source but is related to the tree structure (how the alternatives are allocated). Therefore, the combined RP/SP model is not universal.

The composite utility

The model results demonstrate that the RP/SP joint model is not universally applicable: it can be used only when the random error terms of the alternatives differ significantly between the data sources. We therefore propose a universal method that uses RP data to modify the parameter estimates from SP data alone, yielding a composite utility that overcomes the disadvantages of single SP data.

The alternative-specific constants in the RP data model reflect unobserved factors that affect travel choice behavior, as well as the market share of each alternative in the real market. The alternative-specific constants based on the SP data cannot reflect these unobserved factors, since SP data lack real constraints in the hypothetical scenarios.

So, we treat the alternative-specific constants as unknown parameters to be re-estimated with RP data; at the same time, we hold fixed the other parameter estimates, which can be obtained from the SP data alone. According to the parameter estimates from the single SP data, the utility functions of the alternatives are as follows, where cost denotes the fee, waitt the waiting time, inveht the in-vehicle time, accesst the arrival time, and egresst the off-vehicle time

\begin{matrix} V_{(coach)} = - 1.202 - 0.033 \times cost - 0.033 \times waitt - 0.034 \times inveht - 0.095 \times accesst - 0.01 \times egresst \\ V_{(bus)} = - 0.456 - 0.033 \times cost - 0.033 \times waitt - 0.034 \times inveht - 0.095 \times accesst - 0.01 \times egresst \\ V_{(taxi)} = - 3.794 - 0.033 \times cost - 0.033 \times waitt - 0.034 \times inveht - 0.095 \times accesst - 0.01 \times egresst \\ V_{(privatecar)} = - 1.309 - 0.033 \times cost - 0.034 \times inveht \\ V_{(metro)} = - 0.033 \times cost - 0.033 \times waitt - 0.034 \times inveht - 0.095 \times accesst - 0.01 \times egresst \end{matrix}

(16)

When the utility function is linear, its constant can be calculated from the following equation

ASC = V - \sum_{k = 1}^{K} β_{k} X_{k}

(17)

Fixing the attribute parameters at the values estimated from the single SP data and using the market share of each alternative in the RP data as weights, the alternative-specific constants can be re-estimated. The composite utility is as follows

\begin{matrix} V_{(coach)} = - 1.239 - 0.033 \times cost - 0.033 \times waitt - 0.034 \times inveht - 0.095 \times accesst - 0.01 \times egresst \\ V_{(bus)} = 0.857 - 0.033 \times cost - 0.033 \times waitt - 0.034 \times inveht - 0.095 \times accesst - 0.01 \times egresst \\ V_{(taxi)} = - 1.397 - 0.033 \times cost - 0.033 \times waitt - 0.034 \times inveht - 0.095 \times accesst - 0.01 \times egresst \\ V_{(privatecar)} = - 0.289 - 0.033 \times cost - 0.034 \times inveht \\ V_{(metro)} = - 0.033 \times cost - 0.033 \times waitt - 0.034 \times inveht - 0.095 \times accesst - 0.01 \times egresst \end{matrix}

(18)

According to the composite utility function in equation (18), the choice probability of each alternative can be calculated. Thus, the market share of each travel mode in this corridor can be calculated through probability prediction method, as shown in Table 3.

Table 3.

Probability of travel mode choice.

Alternative	Probability (single SP; %)	Probability (composite; %)
Coach	4.9	4.9
Bus	21.3	10.9
Metro	45.7	45.6
Taxi	0.8	8.9
Private car	27.3	29.7

SP: stated preference.
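To show how equation (18) turns into choice probabilities, the sketch below evaluates the composite utilities for one scenario and applies the MNL formula. The coefficients and constants are copied from equation (18); the scenario itself is an assumption assembled from attribute levels in Table 1, with unlisted attributes set to zero. It represents a single hypothetical traveler, so the output is not expected to reproduce the sample-averaged shares in Table 3.

```python
import math

# Attribute coefficients and re-estimated constants copied from equation (18).
COEF = {"cost": -0.033, "waitt": -0.033, "inveht": -0.034,
        "accesst": -0.095, "egresst": -0.01}
ASC = {"coach": -1.239, "bus": 0.857, "taxi": -1.397,
       "privatecar": -0.289, "metro": 0.0}

# Hypothetical scenario (an assumption): one attribute level per alternative
# picked from Table 1; attributes not listed are treated as zero.
scenario = {
    "coach": {"cost": 8, "waitt": 15, "inveht": 60, "accesst": 20, "egresst": 15},
    "bus": {"cost": 2, "waitt": 5, "inveht": 90, "accesst": 5, "egresst": 5},
    "taxi": {"cost": 50, "inveht": 40},
    "privatecar": {"cost": 20, "inveht": 40},
    "metro": {"cost": 4, "inveht": 30, "accesst": 10, "egresst": 10},
}


def utility(mode, attrs):
    """Composite systematic utility of one alternative."""
    return ASC[mode] + sum(COEF[name] * value for name, value in attrs.items())


def choice_probabilities(scenario):
    """MNL probabilities (equation (6)) from the composite utilities."""
    v = {mode: utility(mode, attrs) for mode, attrs in scenario.items()}
    denominator = sum(math.exp(val) for val in v.values())
    return {mode: math.exp(val) / denominator for mode, val in v.items()}


probs = choice_probabilities(scenario)
for mode, p in probs.items():
    print(f"{mode:11s} V = {utility(mode, scenario[mode]):7.3f}  P = {p:.3f}")
```

Repeating this calculation for every respondent and averaging the probabilities, as in equation (12), gives the mode shares.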

Elastic analysis

The elastic analysis is used to compare the forecasting precision of the composite utility function and the utility function of single SP data. The elasticity is the relative change in the choice probability of an alternative in response to a relative change in a key attribute. It is calculated by the following equation⁴

E_{X_{nik}}^{P_{ni}} = \frac{\partial P_{ni}}{\partial X_{nik}} \cdot \frac{X_{nik}}{P_{ni}}

(19)

Equation (19) gives the elasticity of the probability P_ni that choice maker n chooses alternative i with respect to a marginal change in attribute k of alternative i. If the utility is linear in the attributes, equation (19) simplifies to equation (20), where β_ik is the parameter of the variable X_nik

E_{X_{nik}}^{P_{ni}} = β_{ik} X_{nik} (1 - P_{ni})

(20)
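The link between equations (19) and (20) can be checked numerically. The sketch below compares a finite-difference version of equation (19) with the closed form β X (1 − P) for a two-alternative logit. The in-vehicle time coefficient −0.034 and the composite bus share 0.109 come from the article; the 64-minute in-vehicle time and the utility of the competing alternative are assumed values used only for illustration.

```python
import math

BETA_INVEHT = -0.034   # in-vehicle time coefficient from equation (18)


def direct_elasticity(beta, x, p):
    """Closed-form direct elasticity for a linear-in-attributes MNL: beta * x * (1 - p)."""
    return beta * x * (1.0 - p)


def bus_probability(inveht, v_other=-2.0):
    """Two-alternative logit; v_other is an assumed utility of the competing mode."""
    v_bus = BETA_INVEHT * inveht
    return math.exp(v_bus) / (math.exp(v_bus) + math.exp(v_other))


x = 64.0                       # assumed bus in-vehicle time in minutes
p = bus_probability(x)

# Equation (19) by central finite difference
h = 1e-5
dp_dx = (bus_probability(x + h) - bus_probability(x - h)) / (2 * h)
numeric = dp_dx * x / p

# Closed form with the same probability
analytic = direct_elasticity(BETA_INVEHT, x, p)
print(f"numeric = {numeric:.6f}, analytic = {analytic:.6f}")

# Closed form with the composite bus share of 10.9% from Table 3
e_bus = direct_elasticity(BETA_INVEHT, x, 0.109)
print(f"bus in-vehicle time elasticity at P = 0.109: {e_bus:.2f}")
```

Because β is negative for time and cost attributes, the direct elasticity is negative: a longer in-vehicle time lowers the probability of choosing that mode.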

The in-vehicle time and arrival time are the two most significant factors in the model, and their elasticity is shown in Table 4.

Table 4.

Value of elasticity.

	Coach	Bus	Taxi	Private car	Metro
In-vehicle time (single SP)	−1.29	−1.75	−1.35	−0.92	−0.43
In-vehicle time (composite)	−1.49	−1.95	−1.37	−1.04	−0.51
Arrival time (single SP)	−1.79	−0.48	−	−	−0.78
Arrival time (composite)	−1.79	−0.56	−	−	−0.79

SP: stated preference.

The elasticity represents the percentage change in the choice probability of an alternative caused by a 1% change in a key factor. Table 4 shows that the in-vehicle time has the largest influence on the choice probability of bus. The changes in choice probability when the bus in-vehicle time and the coach arrival time are each decreased by 20% and 50% are shown in Table 5.

Table 5.

Probability prediction.

Scenario	Model	Coach (%)	Bus (%)	Taxi (%)	Private car (%)	Metro (%)
Bus in-vehicle time −20%	Composite	−0.31	4.69	−0.57	−1.75	−2.06
Bus in-vehicle time −20%	Single SP	−0.62	8.14	−0.1	−3.22	−4.02
Bus in-vehicle time −50%	Composite	−1	15.58	−1.94	−5.72	−6.92
Bus in-vehicle time −50%	Single SP	−1.88	25.86	−0.31	−10.02	−13.65
Coach arrival time −20%	Composite	2.08	−0.25	−0.26	−0.73	−0.84
Coach arrival time −20%	Single SP	2.09	−0.5	−0.02	−0.71	−0.86
Coach arrival time −50%	Composite	6.71	−0.81	−0.83	−2.34	−2.73
Coach arrival time −50%	Single SP	6.73	−1.6	−0.08	−2.26	−2.79

SP: stated preference.

According to Table 5, the probabilities based on the RP/SP composite utility change less than those based on the single SP data. When the bus in-vehicle time is decreased by 50%, the bus choice probability under the single SP data increases by 25.86 percentage points, bringing the bus choice probability to almost 50% in total, which is not reasonable. Therefore, we find that the single SP data sometimes amplify the impact of key factors and cause prediction error, while the composite utility accounts for real constraints that limit this kind of error.

In summary, the composite utility takes both real constraints and travelers’ preferences into account and thus yields more reasonable predictions. The predicted travel mode shares after introducing the subway into the Chengdu–Longquan corridor are coach 4.9%, bus 10.9%, taxi 8.9%, private car 29.7%, and Metro 45.6%.

Conclusion

In this article, we used single SP data and combined RP/SP data to predict the travel mode shares of the Chengdu–Longquan corridor. Predictions based on single SP data are sometimes unreasonable, for example, regarding the impact of in-vehicle time. The elastic analysis also suggests that single SP data may amplify the impact of key factors and cause prediction error. Predictions based on combined RP/SP data can improve the significance of the models. However, the variances of the random error terms of the alternatives do not differ between the RP and SP data: the variance of the random error term is related not to the data source but to where the alternatives are allocated in the tree structure. The RP/SP joint model is therefore not universal; it can be used only when the random error terms of the alternatives differ significantly between the data sources. We therefore proposed the composite utility function to combine the RP and SP data.

After modifying the SP-based estimation results with the RP data, we used the composite utility function to predict the travel mode shares in the corridor. We find that single SP data cause prediction error, while the composite utility yields more reasonable predictions by taking real constraints and travelers’ preferences into account. The predicted shares are 4.9% by coach, 10.9% by bus, 8.9% by taxi, 29.7% by private car, and 45.6% by Metro. The results show that the Metro will be the primary travel mode in the Chengdu–Longquan corridor. We suggest that the government adopt policies that provide fast access to public transit, so that improved convenience guides travelers toward public transit and thereby helps relieve traffic congestion in this corridor.

Footnotes

Academic Editor: Geert Wets

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Science Foundation of China under Grant Nos 50908195 and 51178403, the Fundamental Research Funds for the Central Universities (No. SWJTU11CX080 and No. 2682014CX130), Program for New Century Excellent Talents in University (NCET-13-0977), Key Laboratory of Road and Traffic Engineering of the Ministry of Education, Tongji University (No. K201207), Specialized Research Fund for the Doctoral Program of Higher Education (No. 20130184110020), Chengdu Science and Technology Bureau (No. 2014-RK00-00034-ZF), and Science & Technology Department of Sichuan Province (No. 2014RZ0037)

References

1. Van Can. Estimation of travel mode choice for domestic tourists to Nha Trang using the multinomial probit model. Transport Res A: Pol 2013; 49: 149–151.
2. Paha, Rompf, Warnecke. Customer choice patterns in passenger rail competition. Transport Res A: Pol 2013; 50: 209–227.
3. Hoen, Koetse MJ. A choice experiment on alternative fuel vehicle preferences of private car owners in the Netherlands. Transport Res A: Pol 2014; 61: 199–215.
4. Basu, Hunt JD. Valuing of attributes influencing the attractiveness of suburban train service in Mumbai city: a stated preference approach. Transport Res A: Pol 2012; 46: 1465–1476.
5. Chen, Liu. Comparative study on mode split discrete choice models. J Mod Transport 2013; 21: 266–272.
6. Tirachini, Hensher, Jara-Díaz SR. Restating modal investment priority with an improved model for public transport analysis. Transport Res E: Log 2010; 46: 1148–1168.
7. Anderson, Das, Tyrrell TJ. Parking preferences among tourists in Newport, Rhode Island. Transport Res A: Pol 2006; 40: 334–351.
8. Washbrook, Haider, Jaccard. Estimating commuter mode choice: a discrete choice analysis of the impact of road pricing and parking charges. Transportation 2006; 33: 621–639.
9. Bliemer MCJ, Rose. Serial choice conjoint analysis for estimating discrete choice models. In: Hess, Daly (eds) Choice modelling: the state-of-the-art and state-of-the-practice. Bingley: Emerald, 2010, pp.139–161.
10. Hensher, Rose, Greene WH. Applied choice analysis: a primer. Cambridge: Cambridge University Press, 2005.
11. Ben-Akiva, Lerman SR. Discrete choice analysis theory and application to travel demand. Cambridge, MA: The MIT Press, 1997.
12. Hensher, Rose. Does the choice model method and/or the data matter? Transportation 2011; 39: 351–385.
13. Holguín-Veras, Wang. Behavioral investigation on the factors that determine adoption of an electronic toll collection system: freight carriers. Transport Res C: Emer 2011; 19: 593–605.
14. Cherchi, de Dios Ortúzar. Mixed RP/SP models incorporating interaction effects. Transportation 2002; 29: 371–395.
15. Espino, Roma, de Dios Ortúzar. Analysing demand for suburban trips: a mixed RP/SP model with latent variables and interaction effects. Transportation 2006; 33: 241–261.
16. Holguín-Veras, Preziosi. Behavioral investigation on the factors that determine adoption of an electronic toll collection system: passenger car users. Transport Res C: Emer 2011; 19: 498–509.
17. Ahern, Tapley. The use of stated preference techniques to model modal choices on interurban trips in Ireland. Transport Res A: Pol 2008; 42: 15–27.
18. Train, Wilson WW. Estimation on stated-preference experiments constructed from revealed-preference choices. Transport Res B: Meth 2008; 42: 191–203.
19. Kumar, Rao KVK. A stated preference study for a car ownership model in the context of developing countries. Transport Plan Techn 2006; 29: 409–425.
20. Lee, Cho. Demand forecasting of diesel passenger car considering consumer preference and government regulation in South Korea. Transport Res A: Pol 2009; 43: 420–429.
21. Bliemer, Rose JM. Experimental design influences on stated choice outputs: an empirical study in air travel choice. Transport Res A: Pol 2011; 45: 63–79.
22. Gunn, Bradley, Hensher DA. High speed rail market projection: survey design and analysis. Transportation 1992; 19: 117–139.
23. Bradley, Daly. Estimation of logit choice models using mixed stated preference and revealed preference information. In: Les methodes d’analyse des comportements de deplacements pour les annees 1990-6e conference internationale sur les comportements de deplacements, Chateau Bonne Entente, Quebec City, QC, Canada, 22–24 May 1991, vol. 1.
24. Ben-Akiva, Morikawa. Estimation of travel demand models from multiple data sources. In: Proceedings of the 11th international symposium on transportation and traffic theory, Yokohama, Japan, 18–20 July 1990.
25. Cherchi, de Dios Ortúzar. On fitting mode specific constants in the presence of new options in RP/SP models. Transport Res A: Pol 2006; 40: 1–18.
26. Koppelman, Bhat. A self instructing course in mode choice modeling: multinomial and nested logit models. Washington, DC: The United States Department of Transportation, 2006.
