
Abstract

As a continuous generalization of the multinomial logit (MNL) model, the continuous logit (CL) model can be used for continuous response variables (e.g., departure time and activity duration). However, the existing CL model requires the calculation of numerical integrals to obtain the choice probabilities; it thus takes a long time to estimate the model parameters, particularly when the sample size is large. In this paper, we formulate the finite-mixture CL (FMCL) model as a new continuous choice model by combining the finite-mixture method and the CL model, in which the continuous distributional function of the finite mixture is embedded in the CL model. As a result, the individual choice probability can be obtained directly by computing the probability density of the continuous distribution function; this avoids calculation of the integral but still obeys the random utility maximization (RUM) principle. Simulation experiments are conducted to demonstrate the capability of the model. In an empirical study, the proposed model is applied for non-commuters’ shopping activity start time using the expectation-maximization (EM) algorithm based on Shanghai Household Travel Survey data. The results show that the FMCL model developed in this paper can greatly reduce the model estimation time (10,048 observations requiring only 3 min) of the CL model, and the model also has a more intuitive interpretation of model coefficients, directly reflecting variable effects on time-of-day choice. These two advantages can greatly enhance the practical value of the proposed modeling method.

Keywords

travel behavior modeling, continuous logit, finite mixture, EM algorithm, time-of-day modeling

Time is an important dimension of travel choices; it is therefore essential to model time choice in transportation system analysis when evaluating travel demand management policies. The generalized extreme value (GEV) models ( 1 ), based on random utility maximization (RUM) theory, are the most important class of discrete choice models in travel behavior analysis, of which the multinomial logit (MNL) model is most widely applied ( 2 ). These models, despite their dominance in travel behavior analysis, are only used for discrete choice analysis. Some travel-related decisions are inherently continuous in nature; for example, time choices are typically continuous decisions, for which GEV models can only be applied when these variables are artificially discretized.

Using the GEV class of models for time-of-day (TOD) choice can take advantage of random utility theory. However, artificially dividing time into intervals places adjacent time points on either side of a boundary in different intervals; the model then treats them as completely different alternatives, although travelers may regard them as very similar.

The time interval boundaries are usually set in an arbitrary manner, as discussed by Bhat and Steed ( 3 ), without a reliable rule, and different model results can appear if the boundaries change. Considering correlations in discrete models can alleviate this problem to some extent, but a continuous treatment of time can solve the problem completely.

The continuous logit (CL) model, as a continuous extension of the MNL model ( 4 , 5 ), is able to take advantage of random utility theory when time is treated as a continuous variable, but it requires time-invariant variables (e.g., age, gender) to be given a time-varying treatment. The treatment currently in use interacts each variable with trigonometric functions of time ( 6 – 8 ), so the parameters of the utility function do not intuitively reflect the effect of a variable on time choice. A further obstacle to model estimation is that the CL model requires numerical integration of the time-varying utility function for each observation, which greatly increases the computational effort and makes the model harder to apply to larger samples.

In this paper, a finite mixture of continuous distributions is introduced to avoid the integration calculation of the model, which is still based on the framework of the CL model and therefore called the finite-mixture CL (FMCL) model. The proposed model not only retains the advantages of random utility theory, but also avoids the calculation of numerical integration, and more intuitively shows the influence of variables on time choice. Then simulation experiments of the FMCL model are conducted to verify the model estimation procedure. In an empirical study, the proposed model is applied for non-commuters’ shopping activity start time, based on Shanghai Household Travel Survey data, using the expectation-maximization (EM) algorithm. Non-commuting activities are chosen because the schedule of non-commuting activities is relatively flexible and has a larger choice space for demonstration purpose.

The remainder of the paper is organized as follows. The next section is a literature review of the TOD modeling, and is followed by a section that gives the equations of the CL model, the FMCL model, and the estimation method. The fourth section shows the simulation experiments of the FMCL model. The fifth section gives the results of the empirical study based on the FMCL model and the analysis of model estimation results. In the last section, some conclusions are drawn and some suggestions for future work are discussed.

Literature Review

TOD choice and time use are two important components of activity scheduling. Time use focuses on time allocation decisions and on capturing the interactions between duration, travel time, and activity–travel frequency. The fractional logit model, multiple discrete-continuous extreme value (MDCEV) model, and episode-based MDCEV model are mainly applied to time-use problems ( 9 – 13 ). The improvements proposed by Saxena et al. ( 12 ) and Palma et al. ( 13 ) for the episode-based MDCEV model are recent advances in this field.

In the four-step travel-based demand forecasting model, the trip departure time is usually classified into a few peak and off-peak periods, for aggregating trip frequencies in each time period. As the focus of transportation planning shifts from long-term transportation infrastructure construction to short-term travel demand management policies, the activity-based travel demand model is more suitable for travel demand forecasting, which increasingly requires higher resolution of time measures, but time-choice modeling is still one of the main weaknesses of current activity-based travel demand forecasting models.

The TOD models in the literature can be divided into two main categories: (1) models that consider time as a discrete variable; and (2) models that consider time as a continuous variable.

The first category of time-choice models are discrete choice models, which can directly compute the choice utility of travelers and provide a convenient form for TOD choice models. However, these models require the division of time into discrete time intervals ( 14 , 15 ), and the interval boundaries are often arbitrarily set; nevertheless, the convenient structural form of discrete choice models has many advantages for model estimation and application. Discrete choice models have been used extensively in the literature for time-choice modeling.

Some researchers divided time into different time intervals and used MNL models to model commuting departure time choice ( 16 – 26 ). To address the independence of irrelevant alternatives property of MNL models, some researchers have applied the nested logit model ( 27 ), the cross-nested logit model ( 28 – 30 ), the ordered GEV model ( 31 – 33 ), the dogit ordered GEV model ( 34 ), the mixed logit model ( 35 – 38 ), or the multinomial probit (MNP) model ( 39 , 40 ) to account for correlations between alternatives. However, there is no robust theory to support the choice of interval width for time-choice discretization: some researchers discretized time into 5–15-min intervals ( 16 , 17 , 24 ), some into 30-min to 1-h intervals ( 7 , 27 , 41 ), and some divided time into morning and evening peak hours ( 14 , 18 , 20 , 21 , 30 – 33 , 42 , 43 ). Any such division places adjacent time points that differ little for travelers into different intervals, a problem that cannot be solved using discrete choice models.

The second category of time-choice models treats time as a continuous variable and is further divided into two categories: (a) hazard duration models; and (b) continuous time-choice models based on RUM. The hazard duration models do not require discretization of time variables, but they are not based on RUM theory. Instead, they are based on a debatable assumption that the start time of the latter state only depends on the duration and end time of the previous state (3, 44 –47). RUM-based continuous time-choice models include the CL model ( 48 , 49 ), the autoregressive continuous logit model ( 50 ), and the continuous cross-nested logit (CCNL) model ( 48 , 49 ). These models treat time as a continuous variable and have a good behavioral basis. The main difference between the CL model and the CCNL model is the latter’s ability to capture the correlation between alternatives that are similar in the continuous spectrum. Ghader et al. ( 50 ) proposed the autoregressive continuous logit model, which combines the CL model and the autoregressive process and can accommodate the correlation between alternatives in the continuous spectrum. Ghader et al. ( 51 ) introduced a Gaussian copula function based on a CCNL model to describe the correlation between the two dependent alternatives of activity start and end times.

The main advantage of both categories of continuous time models is that they avoid the arbitrary time interval settings that are required in discrete choice methods, and the problems that can be caused by time interval choice. The hazard duration continuous model is not a choice model based on RUM theory. For RUM-based models, the CL model requires one-dimensional integration calculations for each observation. The CCNL model and autoregressive CL model involve two- and multi-dimensional integration calculations, respectively, and the estimation time is longer than that of the CL model. Therefore, all the present RUM-based continuous models suffer from the problem of a time-consuming estimation process, which hinders the wide application of such models.

Methods

McFadden ( 1 ) proposed a class of multivariate extreme value distribution and then developed a GEV model for discrete choice applications based on this distribution and RUM theory. The model assumes that an individual will rationally choose the alternative with the highest utility value when making a decision. The utility $U_{j}$ of alternative $j (j = 1, 2, \dots, J)$ consists of the systematic utility $V_{j}$ and a random error term $ε_{j}$ :

U_{j} = V_{j} + ε_{j}

(1)

If the random error terms $ε_{j}$ are mutually independent and follow the standard Gumbel distribution, the model is the MNL model and the probability of choosing alternative $k$ is

P_{k} = \frac{\exp (V_{k})}{\sum_{j = 1}^{J} \exp (V_{j})}

(2)
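As a minimal sketch of Equation 2, the following Python function turns a list of systematic utilities into MNL choice probabilities. The utility values are hypothetical and chosen only for illustration; subtracting the largest utility before exponentiating is a common numerical safeguard and not part of the model itself.

```python
import math

def mnl_probabilities(systematic_utilities):
    """Return MNL probabilities exp(V_k) / sum_j exp(V_j) (Equation 2)."""
    # Subtracting the maximum leaves the ratios unchanged and avoids overflow.
    v_max = max(systematic_utilities)
    exp_v = [math.exp(v - v_max) for v in systematic_utilities]
    total = sum(exp_v)
    return [e / total for e in exp_v]

# Hypothetical systematic utilities for three alternatives (assumption).
probs = mnl_probabilities([1.0, 0.5, -0.2])
```

The probabilities sum to one and preserve the ordering of the utilities, which is the property that the continuous extension in the next subsection carries over to a density.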

CL Model

The CL model is a continuous extension of the MNL model. Suppose the lower and upper bounds of the continuous variable $t$ are $t_{1}$ and $t_{2}$ , and discretize $t$ into $I$ discrete alternatives at a fixed interval $s$ , where $I$ is computed as $I = (t_{2} - t_{1}) / s$ .

The model can then be written in the form of the MNL model. As the interval $s$ decreases and the number of alternatives $I$ increases, the choice probability can still be written in the same form. When $s \to 0$ , we obtain the CL choice density function

P_{t_{k}} = \frac{\exp (V (t_{k}))}{\int_{t_{1}}^{t_{2}} \exp (V (t)) dt}

(3)

For identification purposes, all the explanatory variables need to interact with some continuous time functions. In addition, the function representing the TOD should have the same utility at $t = 0$ and $t = 24$ , so a cyclic function is needed for interaction. Currently, the commonly used utility functions are in the form of trigonometric functions ( 6 , 7 , 43 , 49 , 51 ):

V (t) = X β s (t)

(4)

s (t) = [\sin (\frac{2 π t}{24}), \sin (\frac{4 π t}{24}), \dots, \sin (\frac{2 f π t}{24}), \cos (\frac{2 π t}{24}), \cos (\frac{4 π t}{24}), \dots, \cos (\frac{2 f π t}{24})]^{'}

(5)

This utility function equalizes the utilities at $t = 0$ and $t = 24$ . Here $X$ is a $(K \times 1)$ vector of demographic attribute variables (including the constant), $β$ is a $(K \times 2 f)$ matrix of parameters to be estimated, and $s (t)$ is a $(2 f \times 1)$ vector of cyclic functions that captures the cyclicality of the TOD choices over the 24-h day. The integer value of $f$ is chosen by the analyst. Increasing $f$ makes the utility function more flexible, so that it can reproduce a more diverse range of time distributions.

As shown in Equation 3, the CL model needs to calculate $\int_{t_{1}}^{t_{2}} \exp (V (t)) dt$ when calculating the probability density $P_{t_{k}}$ . According to Equation 4, the inclusion of variables in the utility function $V (t)$ makes each individual utility function expression unique, which involves the CL model calculating the integral for each observation. The integral needs to be computed by numerical integration methods, which can be time-consuming and harder to use for applications with larger data sets. In addition, the interaction between variables and $s (t)$ , resulting in one variable corresponding to $2 f$ parameters, makes it challenging to intuitively analyze the effect of variables on time choice based on the parameters.
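The computational burden described above can be illustrated with a small sketch of Equations 3 to 5. The covariate vector `x`, the parameter matrix `beta`, and the choice `f = 1` are hypothetical values, and the trapezoidal rule stands in for whatever numerical integration routine an analyst would actually use; the point is only that the denominator must be recomputed for every individual.

```python
import math

def cyclic_terms(t, f):
    """s(t) from Equation 5: f sine terms followed by f cosine terms."""
    sines = [math.sin(2 * m * math.pi * t / 24) for m in range(1, f + 1)]
    cosines = [math.cos(2 * m * math.pi * t / 24) for m in range(1, f + 1)]
    return sines + cosines

def utility(t, x, beta, f):
    """V(t) from Equation 4, with x of length K and beta of shape K x 2f."""
    s = cyclic_terms(t, f)
    return sum(x[k] * beta[k][c] * s[c]
               for k in range(len(x)) for c in range(2 * f))

def cl_density(t_k, x, beta, f, t1=0.0, t2=24.0, steps=2000):
    """Equation 3, with the integral approximated by the trapezoidal rule."""
    h = (t2 - t1) / steps
    values = [math.exp(utility(t1 + i * h, x, beta, f))
              for i in range(steps + 1)]
    integral = h * (sum(values) - 0.5 * (values[0] + values[-1]))
    return math.exp(utility(t_k, x, beta, f)) / integral

# Hypothetical individual: constant and age in units of 100 years (assumption).
x = [1.0, 0.62]
beta = [[0.8, -0.5], [0.3, 0.2]]  # hypothetical K x 2f parameter matrix
density_at_8 = cl_density(8.0, x, beta, f=1)
```

Each call to `cl_density` evaluates the utility at every integration node, so the cost of one likelihood evaluation grows with the sample size times the number of nodes.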

Finite-Mixture CL Model

Utility Specification

The CL model can be used to obtain the probability density corresponding to each choice $t$ only after calculating the integral for each observation. When time t is a continuous variable, if a function $f (t)$ can be used to represent the probability density of $t$ , this time-consuming calculation of integration can be avoided. In this paper, we construct a new utility function $V (t_{k})$ based on the existing CL model, so that the choice probability density of time $t_{k}$ can be directly calculated by the function $f (t_{k})$ . This saves computational time and takes advantage of the RUM theory. The utility function $V (t_{k})$ of time $t_{k}$ is constructed as

V (t_{k}) = \ln (f (t_{k}))

(6)

Then the probability density is

P_{t_{k}} = \frac{\exp (\ln f (t_{k}))}{\int_{t_{1}}^{t_{2}} \exp (\ln f (t)) dt} = \frac{f (t_{k})}{\int_{t_{1}}^{t_{2}} f (t) dt}

(7)

If $f (t)$ represents probability density of $t$ , $\int_{t_{1}}^{t_{2}} f (t) dt = 1$ and the probability density can be simplified, as

P_{t_{k}} = f (t_{k})

(8)

This utility specification would enable the choice probability density $P_{t_{k}}$ to be obtained by computing $f (t_{k})$ directly, while satisfying the practical constraint that the choice probability density is integrated to 1.

Finite Mixture of Unimodal Distribution

Owing to heterogeneous preferences, people will have different choice preferences for time. People with the same choice preference may prefer to travel around a certain time point; the closer to that time point, the higher the density of the population distribution, thus forming a unimodal distribution. The time-choice distributions formed by people with different preferences often form a multimodal distribution.

The probability density distribution of activity start time choice of commuters is usually simple, with a unimodal or bimodal distribution. However, the probability density distribution of the activity start time choice of non-commuters tends to be more complex, showing a multimodal distribution. Therefore, it is difficult to fit well using a single unimodal continuous distribution, which often does not reflect the actual distribution.

In this paper, the finite-mixture method is used to mix several unimodal continuous distributions together to jointly approximate a multimodal distribution of time choice. Let $f (t)$ represent a multimodal distribution consisting of $N$ unimodal distributions $f_{n} (t)$ ; the expression for $f (t)$ is

f (t) = \sum_{n = 1}^{N} w_{n} f_{n} (t)

(9)

where $w_{n}$ is the weight of the unimodal distribution $f_{n} (t)$ , and $\sum_{n = 1}^{N} w_{n} = 1$ . Each unimodal distribution is formed by people with similar choice preferences and is influenced by their demographic attributes, which means that $w_{n}$ is also the probability of a discrete choice. The MNL model is used here to calculate this choice probability, as

w_{n} = \frac{\exp (z_{n})}{\sum_{m = 1}^{N} \exp (z_{m})}

(10)

Here, $z_{n}$ plays a role similar to a utility function. For the $n$th alternative of individual $i$ it is written $z_{in}$ and formulated as

z_{in} = X_{i} \cdot θ_{n} + e_{in}

(11)

where $θ_{n}$ is the vector of parameters to be estimated and $e_{in}$ is a random error term following the standard Gumbel distribution.

In addition, it is often assumed that the unimodal continuous distribution used for a finite mixture follows some common distribution (e.g., normal, Gumbel). It is necessary to evaluate and compare the goodness-of-fit of alternative density functions to choose a specific distribution. Based on the time-choice distribution characteristics of the actual data, the fit of the skew-normal and normal distributions were compared and the normal distribution was finally selected in this study.

When $f_{n} (t)$ follows a normal distribution, the probability density $P_{t_{k}}$ of alternative $t_{k}$ is

\begin{matrix} P_{t_{k}} = \sum_{n = 1}^{N} w_{n} f_{n} (t_{k}) \\ = \sum_{n = 1}^{N} w_{n} f_{n} (t_{k}, μ_{n}, σ_{n}) \\ = \sum_{n = 1}^{N} w_{n} \frac{1}{\sqrt{2 π} σ_{n}} \exp (- \frac{{(t_{k} - μ_{n})}^{2}}{2 {σ_{n}}^{2}}) \end{matrix}

(12)

where $μ_{n}$ and $σ_{n}$ represent the location parameter and scale parameter of the normal distribution, respectively.
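A short sketch of Equation 12 shows why no integration is needed: the choice density is a weighted sum of closed-form normal densities. The two-component weights, locations, and scales below are hypothetical values for illustration only.

```python
import math

def normal_pdf(t, mu, sigma):
    """Normal density with location mu and scale sigma."""
    z = (t - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2 * math.pi) * sigma)

def mixture_density(t, weights, mus, sigmas):
    """Equation 12: weighted sum of the normal component densities."""
    return sum(w * normal_pdf(t, mu, sigma)
               for w, mu, sigma in zip(weights, mus, sigmas))

# Hypothetical morning and afternoon components (assumption).
weights = [0.6, 0.4]
mus = [8.0, 15.0]
sigmas = [1.0, 2.0]
p_at_8 = mixture_density(8.0, weights, mus, sigmas)
```

Evaluating the density at an observed start time costs one pass over the $N$ components, in contrast to the per-observation integral required by Equation 3.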

Next, we consider how the variables are incorporated into $f_{n} (t_{k}, μ_{n}, σ_{n})$ to analyze the effect of the explanatory variables on time choice. Through reparameterization of location parameter $μ_{n}$ and the scale parameter $σ_{n}$ , $μ_{n}$ is expressed as a linear function of the variables, as

μ_{n} = α_{n} X_{1}

(13)

where $α_{n}$ is a vector of parameters to be estimated and $X_{1}$ is a $(K_{1} \times 1)$ vector of demographic attribute variables (including a constant). Since the scale parameter $σ_{n}$ must be greater than zero, $σ_{n}$ is reparameterized as

σ_{n} = \exp (γ_{n} X_{2})

(14)

where $γ_{n}$ is a vector of parameters to be estimated and $X_{2}$ is a $(K_{2} \times 1)$ vector of demographic attribute variables (including a constant).
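The reparameterization in Equations 10, 13, and 14 can be sketched as follows. All numerical values are hypothetical, the same covariate vector is used for the weight, location, and scale for brevity, and fixing the first segment's weight parameters at zero is an assumed normalization rather than something stated in the text.

```python
import math

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def segment_weights(x, thetas):
    """Equation 10: MNL weights from the systematic part of z_n."""
    z = [dot(x, theta) for theta in thetas]
    z_max = max(z)  # subtracting the maximum avoids overflow in exp()
    exp_z = [math.exp(v - z_max) for v in z]
    total = sum(exp_z)
    return [e / total for e in exp_z]

def segment_location(x1, alpha_n):
    """Equation 13: the location parameter is linear in the covariates."""
    return dot(alpha_n, x1)

def segment_scale(x2, gamma_n):
    """Equation 14: exp() keeps the scale parameter strictly positive."""
    return math.exp(dot(gamma_n, x2))

# Hypothetical individual: constant and age in units of 100 years (assumption).
x = [1.0, 0.62]
thetas = [[0.0, 0.0], [1.5, -0.4]]  # first segment fixed at zero (assumption)
w = segment_weights(x, thetas)
mu_1 = segment_location(x, [8.5, -0.6])
sigma_1 = segment_scale(x, [0.1, -0.2])
```

Because $μ_{n}$ is linear in the covariates, a coefficient in `alpha_n` reads directly as a shift of the segment's mean start time in hours per unit of the covariate.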

EM Algorithms

The parameters $α_{n}$ , $γ_{n}$ , and $θ_{n}$ are to be estimated; the log-likelihood function for an observation can be formulated as

\begin{matrix} L = \ln P (t_{k}) \\ = \ln \sum_{n = 1}^{N} w_{n} (P (t_{k} | n)) \\ = \ln \sum_{n = 1}^{N} w_{n} f_{n} (t_{k}) \end{matrix}

(15)

where $w_{n}$ is the weight of the unimodal distribution $f_{n} (t)$ and $P (t_{k} | n)$ represents the choice probability density of individual $i$ under the condition that the individual belongs to the latent segment represented by the normal distribution $f_{n} (t)$ . The log-likelihood function for the entire sample of size $I$ can be formulated as $LL = \sum_{i = 1}^{I} L_{i} / I$ , which can be maximized to estimate model coefficients.

It has been noted earlier that maximization of the likelihood function using the usual Newton or quasi-Newton (secant) routines in such mixture models can be computationally unstable ( 52 , 53 ). To obtain good start values, Dempster et al. ( 54 ) developed a two-stage iterative method, which belongs to the EM family of algorithms.

The EM algorithm consists of an E step and an M step, where the E step is to define an expectation and the M step is to maximize the expectation. For the EM algorithm, the log-likelihood function for an observation can be rewritten, as

\begin{matrix} L = \ln \sum_{n = 1}^{N} w_{n} f_{n} (t_{k}) \\ = \ln \sum_{n = 1}^{N} ({\tilde{w}}_{n} \frac{w_{n} f_{n} (t_{k})}{{\tilde{w}}_{n}}) \\ \geq \sum_{n = 1}^{N} {\tilde{w}}_{n} \ln (\frac{w_{n} f_{n} (t_{k})}{{\tilde{w}}_{n}}) \end{matrix}

(16)

{\tilde{w}}_{n} = \frac{w_{n} f_{n} (t_{k})}{\sum_{m = 1}^{N} w_{m} f_{m} (t_{k})}

(17)

In the EM algorithm, the E step uses Equation 17 to calculate the weights of the normal distributions, starting from initial guesses for the parameters. In the M step, the weights calculated in the E step are held constant while the likelihood function is maximized. Since the likelihood function of Equation 15 is fully concave for a fixed value of ${\tilde{w}}_{n}$ , a simple quasi-Newton search converges quickly. The two steps are then alternated, and in each iteration the EM algorithm provides monotonically increasing values of the original log-likelihood function by maximizing $\sum_{n = 1}^{N} {\tilde{w}}_{n} \ln (w_{n} f_{n} (t_{k}) / {\tilde{w}}_{n})$ . Iterating the EM algorithm until convergence returns the maximum likelihood estimate ( 54 ).
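The alternation of the two steps can be sketched for a simplified case. The code below assumes a constants-only mixture (no explanatory variables), for which the M step has a closed form, so it replaces the quasi-Newton search described above with direct weighted averages; it is an illustration of the E and M steps, not the estimation code used in this paper. The synthetic data, seed, and starting values are assumptions.

```python
import math
import random

def normal_pdf(t, mu, sigma):
    z = (t - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2 * math.pi) * sigma)

def em_normal_mixture(times, weights, mus, sigmas, iterations=100):
    """EM for a constants-only normal mixture.

    Returns the estimates and the mean log-likelihood at each iteration.
    """
    weights, mus, sigmas = list(weights), list(mus), list(sigmas)
    n_seg = len(weights)
    ll_path = []
    for _ in range(iterations):
        # E step (Equation 17): posterior segment weights per observation.
        posteriors = []
        log_lik = 0.0
        for t in times:
            joint = [weights[n] * normal_pdf(t, mus[n], sigmas[n])
                     for n in range(n_seg)]
            total = sum(joint)
            log_lik += math.log(total)
            posteriors.append([j / total for j in joint])
        ll_path.append(log_lik / len(times))
        # M step: posterior weights are held fixed; without covariates the
        # maximizer is a set of posterior-weighted averages.
        for n in range(n_seg):
            mass = sum(p[n] for p in posteriors)
            weights[n] = mass / len(times)
            mus[n] = sum(p[n] * t for p, t in zip(posteriors, times)) / mass
            variance = sum(p[n] * (t - mus[n]) ** 2
                           for p, t in zip(posteriors, times)) / mass
            sigmas[n] = math.sqrt(variance)
    return weights, mus, sigmas, ll_path

# Synthetic start times from a hypothetical two-component mixture.
rng = random.Random(1)
times = [rng.gauss(8.25, 0.92) if rng.random() < 0.7 else rng.gauss(5.0, 1.3)
         for _ in range(2000)]
w_hat, mu_hat, sigma_hat, ll_path = em_normal_mixture(
    times, weights=[0.5, 0.5], mus=[9.0, 4.0], sigmas=[1.0, 1.0])
```

The returned `ll_path` makes the monotonicity property visible: each entry is at least as large as the one before it, up to floating-point error.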

Simulation Experiments

Before using survey data for empirical analysis, several simulation experiments are conducted to demonstrate the FMCL model and show whether the model estimation method is able to recover the parameters whose values are given in advance.

The simulation experiment is designed with a bimodal distribution obtained from a finite mixture of two normal distributions. A normal distribution is named Norm A, with $μ_{1} = 8.25$ , $σ_{1} = 0.92$ , and another normal distribution is named Norm B, with $μ_{2} = 5$ , $σ_{2} = 1.3$ . The sample size is set at 10,000, where the percentage of Norm A is 70% and the percentage of Norm B is 30%. The probability density expression for this mixed bimodal distribution is

\begin{matrix} P_{t} & = f (t) = 0.7 \cdot \frac{1}{\sqrt{2 π} \cdot 0.92} \exp (- \frac{{(t - 8.25)}^{2}}{2 \cdot {(0.92)}^{2}}) \\ + 0.3 \cdot \frac{1}{\sqrt{2 π} \cdot 1.3} \exp (- \frac{{(t - 5)}^{2}}{2 \cdot {(1.3)}^{2}}) \end{matrix}

(18)
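A sample from the density in Equation 18 can be drawn by first picking a component and then drawing from the corresponding normal distribution. This is a minimal sketch; the seed and the use of Python's `random` module are assumptions, as the simulation software is not described in the text.

```python
import random

def draw_bimodal_sample(size, seed=0):
    """Draw from 0.7 * N(8.25, 0.92) + 0.3 * N(5, 1.3) (Equation 18)."""
    rng = random.Random(seed)
    sample = []
    for _ in range(size):
        if rng.random() < 0.7:
            sample.append(rng.gauss(8.25, 0.92))  # Norm A
        else:
            sample.append(rng.gauss(5.0, 1.3))    # Norm B
    return sample

sample = draw_bimodal_sample(10_000)
sample_mean = sum(sample) / len(sample)
```

The sample mean should lie close to the mixture mean $0.7 \cdot 8.25 + 0.3 \cdot 5 = 7.275$ , which gives a quick sanity check before estimation.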

Table 1 shows the parameter estimation results of the FMCL model simulation experiments, and each parameter is significantly identifiable. The mean values of the parameter estimates are very close to the true values of the parameters. The standard deviation of the parameter estimates and the mean of the standard deviation estimates are basically the same, and the absolute percentage bias (APB) is less than 0.4% ( $\mathrm{APB} = | (\text{mean estimate} - \text{true value}) / \text{true value} | \times 100\%$ ), which means that the FMCL model can be estimated consistently using the EM algorithm.

Table 1.

Simulation Results for FMCL Model (Repetition Number = 30)

True value	Parameter estimate (mean)	Parameter estimate (standard deviation)	Standard deviation estimate (mean)	APB (%)
8.250	8.247	0.012	0.011	0.03
5.000	5.000	0.025	0.024	0.00
0.920	0.918	0.009	0.008	0.27
1.300	1.302	0.021	0.017	0.14
0.850	0.847	0.027	0.022	0.38

Note: FMCL = finite-mixture continuous logit; APB = absolute percentage bias.

Empirical Analysis

Data

The data were derived from the 2019 Shanghai Household Travel Survey, which surveyed 51,114 households and 111,511 individuals. The survey includes personal attributes, household attributes, and trip characteristics. Travelers are classified as commuters or non-commuters according to the purpose of the trip and whether the destination is the workplace or not. In this study, the activity start times of 10,048 home-shopping-home trips of non-commuters were extracted for analysis. The distribution characteristics of the activity start time are shown in Figure 1, showing a significant morning peak and a slight afternoon peak. This suggests that most non-commuters choose to shop at around 8:00 a.m., almost at the same period as the commuters’ morning peak, which further exacerbates the traffic congestion in the morning peak. Traffic congestion can be alleviated to some extent if trips derived from this relatively flexible schedule of shopping activities can be shifted to other time periods.

Figure 1.

Distribution of travelers’ shopping start time choices.

Sample Description

Table 2 gives descriptive statistics of the sample data, which comprise 9,754 travelers and 10,048 home-shopping-home trips. The proportion of women in the sample (61.84%) is higher than that of men (38.16%). The average age of non-commuters is over 60 years. Most non-commuters have a household size of two people, do not own a car, and live with their families. The average trip distance of non-commuters is 2.5 km, and few of them live in the suburbs and shop downtown, or vice versa. This indicates that people tend to shop close to their homes: those who live in the city center tend to shop in the central city area, and those who live in the suburbs tend to shop in the suburbs.

Table 2.

Sample Characteristics

	Categorical variable	Attribute	Percentage (%)
Demographic attributes	Gender	Male	38.16
		Female	61.84
	Household size	1	11.25
		2	53.77
		3	20.70
		4	7.15
		5	6.03
		>5	1.11
	Having a private car	Own	26.28
		Does not own	73.72
	Number of family cars	0	73.72
		1	22.73
		2	3.25
		3	0.30
	Residential type	Family	99.21
		Dormitory	0.79
Travel attributes	Activity location	Living and shopping downtown	54.76
		Living downtown (suburbs) but shopping in suburbs (downtown)	2.51
		Living and shopping in suburbs	42.73
	Continuous variable	Mean	Standard deviation
Demographic attribute	Age (100 years)	0.622	0.111
Travel attribute	Distance (km)	2.556	6.158

Estimation Results

After comparing results from models with two, three, and four latent segments, this paper uses a finite mixture of four normal distributions to fit the distribution of non-commuters’ shopping activity start times. The parameter estimation was performed in GAUSS software using the EM algorithm, and it took only 3 min to run with 10,048 observations on a computer with a 2.11 GHz CPU and 16 GB of RAM.

Table 3 shows the FMCL model estimation results. The four normal distributions used for the finite mixture are named Norm 1, Norm 2, Norm 3, and Norm 4. The weight value in Table 3 is obtained by averaging the probability values of each individual entering the four normal distributions. The normal distribution with the highest weight is Norm 2 (81.82%), with $μ_{2} = 8.129$ , $σ_{2} = 1.019$ , followed by Norm 3 (8.25%), with $μ_{3} = 14.827$ , $σ_{3} = 2.473$ , then Norm 1 (6.45%), with $μ_{1} = 7.483$ , $σ_{1} = 1.184$ , and finally Norm 4 (3.48%), with $μ_{4} = 15.262$ , $σ_{4} = 0.543$ . The four normal distributions are shown in Figure 2.

Table 3.

Model Estimation Results (N = 10,048)

	Norm 1		Norm 2		Norm 3		Norm 4
Variable	Coefficient	t	Coefficient	t	Coefficient	t	Coefficient	t
Mean
Constants	6.538	16.53	9.215	113.08	15.035	137.43	15.414	212.37
Age (100 years)	0.201	2.16	−1.474	−12.87	NA	NA	NA	NA
Living with family	0.827	2.09	NA	NA	NA	NA	NA	NA
Having a private car	NA	NA	0.151	5.10	NA	NA	NA	NA
Household size	NA	NA	−0.026	−2.16	NA	NA	−0.062	−2.37
Living and shopping in suburbs	NA	NA	−0.337	−14.47	NA	NA	NA	NA
Distance (km)	NA	NA	NA	NA	−0.029	−2.44	NA	NA
Number of family cars	NA	NA	NA	NA	−0.355	−2.55	NA	NA
Downsub^*	NA	NA	NA	NA	−1.115	−2.47	NA	NA
Standard deviation in exp( ) function
Constants	0.169	6.07	0.124	2.69	0.905	36.87	−0.611	−16.15
Age (100 years)	NA	NA	−0.169	−2.34	NA	NA	NA	NA
Weight
Constants	−2.667	−6.67	1.633	5.90	2.023	6.62	NA	NA
Age (100 years)	8.534	16.35	4.755	16.10	NA	NA	2.201	4.23
Male	NA	NA	−0.349	−6.47	NA	NA	NA	NA
Having a private car	−0.557	−4.92	NA	NA	NA	NA	NA	NA
Living and shopping in downtown	NA	NA	1.877	18.26	1.908	15.32	1.688	11.37
Mean weight value (%)	6.45		81.82		8.25		3.48
Mean value (h)	7.483		8.129		14.827		15.262
Standard deviation (h)	1.184		1.019		2.473		0.543
Sample size	10048
Log-likelihood value	−18779.71

“Downsub” = observation lives downtown and shops in the suburbs or lives in the suburbs and shops downtown; NA = not applicable.

Figure 2.

Four segments of normal distributions.

Norm 1 and Norm 2 together constitute the morning peak of the activity. Norm 2 is the main part of the morning peak, while Norm 1 accounts for a small proportion and only makes a local adjustment to the distribution. The two distributions are mixed together to better fit the real shopping activity start time of the morning peak. The location parameter here is the average start time of the activity; that is, the average start time for people belonging to Norm 1 is 07:29 (7.483 h) and the average start time for people belonging to Norm 2 is 08:07. The location parameters of the two distribution functions are close. The sum of the weights of Norm 1 and Norm 2 is 88.27%, indicating that 88.27% of people choose to start their shopping activities at around 08:00. The average start times for Norm 3 and Norm 4 are 14:50 and 15:16, respectively, and the sum of their weights is 11.73%, which means that 11.73% of people choose to start their shopping activities at around 15:00.

In Figure 3, we compare the actual shopping start time distribution with the finite mixture of the four normal distributions. To draw the fitted curve, random numbers are generated as time points from each normal distribution, with the number of draws from component $n$ proportional to its weight $w_{n}$ . The kernel density curves of the generated random numbers are plotted in R, with the bandwidth of the kernel density estimation set to 0.5.
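The plotting procedure can be approximated with the following sketch. It is written in Python rather than R, uses the mean weights, means, and standard deviations reported in Table 3 (ignoring individual-level covariates), and implements a plain Gaussian kernel density estimator with bandwidth 0.5; the total number of draws, the seed, and the evaluation grid are assumptions.

```python
import math
import random

# Mean weights, locations (h), and scales (h) as reported in Table 3.
WEIGHTS = [0.0645, 0.8182, 0.0825, 0.0348]
MEANS = [7.483, 8.129, 14.827, 15.262]
SDS = [1.184, 1.019, 2.473, 0.543]

def draw_mixture(total, seed=0):
    """Draw round(w_n * total) time points from each normal component."""
    rng = random.Random(seed)
    draws = []
    for w, mu, sd in zip(WEIGHTS, MEANS, SDS):
        draws.extend(rng.gauss(mu, sd) for _ in range(round(w * total)))
    return draws

def gaussian_kde(points, grid, bandwidth=0.5):
    """Gaussian kernel density estimate evaluated at each grid value."""
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((g - p) / bandwidth) ** 2)
                       for p in points)
            for g in grid]

draws = draw_mixture(2000)
grid = [0.25 * i for i in range(97)]  # 00:00 to 24:00 in 15-min steps
density = gaussian_kde(draws, grid)
peak_hour = grid[density.index(max(density))]
```

The resulting curve can be overlaid on a kernel density estimate of the observed start times, computed with the same bandwidth, to reproduce the kind of comparison described here.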

Figure 3.

Comparison of model-fitted distribution and actual distribution.

For the morning peak, the fitted distributions are almost identical to the actual distributions, and for the afternoon peak, the fit shows a slight deviation. In general, the fitting effect is acceptable.

As shown in Table 3, the four variables “age,” “male,” “having a private car,” and “living and shopping downtown” significantly affect the probability that the traveler belongs to each latent segment. Older individuals tend to choose the distribution with an earlier activity start time. Men are less likely to choose Norm 2; more women shop during this time period, probably because women undertake more shopping tasks in the household. “Having a private car” has a negative effect on choosing Norm 1; this suggests that people with private cars at home do not tend to go out shopping early in the morning because it is very quick and convenient to go shopping by car. “Living and shopping downtown” has positive effects on choosing Norm 2, Norm 3, and Norm 4, which may reflect the characteristics of shopping activities downtown.

For Norm 1, the constant term is 6.538 (≈06:30) and the effects of both “age” and “living with family” on the mean of Norm 1 are positive, indicating that people who are older and live with their families tend to start their shopping activities later than 06:30.

For Norm 2, the constant term is 9.215 (≈09:13). The effects of “age,” “household size,” and “living and shopping in suburbs” on the mean of Norm 2 are negative, indicating that older people, people with larger households, and people who live and shop in the suburbs tend to start shopping activities earlier than 09:13. Possible reasons for this are that people from a large household need to undertake more shopping activities to satisfy the whole family, or that they also undertake more household chores and need to end their shopping activities earlier and go home. Older people tend to go shopping earlier in the morning, owing to their long-term habits (e.g., requiring less sleeping time and waking up early). The effect of “having a private car” is positive, indicating that households with cars tend to start shopping activities later than 09:13; a possible reason has already been described.

For Norm 3, the constant term is 15.035 (≈15:02), and the effects of “distance,” “number of family cars,” and “downsub” on the mean of Norm 3 are negative, indicating that the farther the shopping place is from home and the more cars the household has, the greater the tendency of people to start their shopping activities before 15:02. People living downtown and shopping in suburbs, or living in suburbs and shopping downtown, will have longer travel distances; this factor, together with the “distance” variable, suggests that people who travel longer distances tend to go out shopping earlier in the afternoon.

For Norm 4, the constant term is 15.414 (≈15:25) and the effect of “household size” on the mean for Norm 4 is negative, showing that people from larger households tend to start their shopping activities earlier than 15:25, possibly because more shopping tasks from household members push them to start shopping earlier.
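The constants above are times of day expressed in decimal hours. As a small illustrative sketch (the helper name `decimal_hours_to_clock` is our own and not part of the paper), the conversion to clock time, rounded to the nearest minute, can be written as follows. Note that the text rounds the Norm 1 constant more coarsely, to about 06:30.

```python
def decimal_hours_to_clock(hours: float) -> str:
    """Convert a time of day in decimal hours to HH:MM, rounded to the nearest minute."""
    total_minutes = round(hours * 60)
    hh, mm = divmod(total_minutes, 60)
    return f"{hh % 24:02d}:{mm:02d}"


# Constant terms of the four component means reported in the text.
constants = {"Norm 1": 6.538, "Norm 2": 9.215, "Norm 3": 15.035, "Norm 4": 15.414}

for name, value in constants.items():
    print(name, decimal_hours_to_clock(value))
```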

The standard deviations of Norm 2 and Norm 4 are less than those of Norm 1 and Norm 3. One possible explanation is that people belonging to Norm 2 and Norm 4 are shopping in downtown areas, with a clearer purpose, and are more concentrated in the morning and afternoon peak hours. People belonging to Norm 1 and Norm 3 are shopping with an unclear purpose; thus the standard deviation is larger.

Conclusions and Discussions

Considering the application limitations of the CL model, this work combines the finite-mixture method and the CL model to develop a new continuous choice model, called the FMCL model. The model uses the finite-mixture method to mix several unimodal distributions together proportionally to construct a utility function. Embedding the novel utility function in the CL model allows the probability of individual choice to be obtained directly by calculating the probability density function of the continuous distribution. This improvement avoids the numerical integration calculation of the CL model, which simplifies the calculation of individual choice probability densities and reduces the estimation time of the model.
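The following minimal sketch illustrates this point under stated assumptions: the utility is taken as V(t) = ln f(t), with f a normal mixture, and the CL density is taken as exp(V(t)) divided by its integral over the choice range. The component means reuse the constants reported above, while the weights and standard deviations are hypothetical values chosen only for illustration. The quadrature route and the direct density evaluation should agree, but only the former needs an integral.

```python
import math


def normal_pdf(t, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (t - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(t, weights, mus, sigmas):
    """Finite-mixture density: weighted sum of the component densities."""
    return sum(w * normal_pdf(t, m, s) for w, m, s in zip(weights, mus, sigmas))


def cl_density_by_quadrature(t, utility, lower, upper, steps=20000):
    """CL density exp(V(t)) / integral of exp(V(s)) ds, using the trapezoidal rule."""
    h = (upper - lower) / steps
    total = 0.5 * (math.exp(utility(lower)) + math.exp(utility(upper)))
    for i in range(1, steps):
        total += math.exp(utility(lower + i * h))
    return math.exp(utility(t)) / (total * h)


# Hypothetical weights and standard deviations; means are the reported constants.
WEIGHTS = [0.15, 0.35, 0.30, 0.20]
MUS = [6.538, 9.215, 15.035, 15.414]
SIGMAS = [1.5, 1.0, 2.0, 1.0]


def utility(t):
    """Assumed FMCL utility: the log of the mixture density."""
    return math.log(mixture_pdf(t, WEIGHTS, MUS, SIGMAS))


t0 = 10.0
via_integral = cl_density_by_quadrature(t0, utility, -10.0, 40.0)
direct = mixture_pdf(t0, WEIGHTS, MUS, SIGMAS)
print(via_integral, direct)
```

Because the mixture density already integrates to one over a sufficiently wide range, the denominator of the CL formula is effectively one and the integral can be skipped.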

Reparameterization of the scale and location parameters for continuous distributions can quantify the effect of variables on continuous choice; this not only allows for an intuitive explanation of the effect of variables on the choice based on coefficients, but also retains a strong theoretical basis following the RUM principle to assess the economic welfare effect of alternative policies (e.g., Lemp and Kockelman [ 48 ]).

Several simulation experiments were conducted to demonstrate the FMCL model. The results show that each parameter is identifiable and that the FMCL model can be estimated consistently. This paper also presents an empirical analysis of the FMCL model for activity start time choice in the context of home-shopping-home trips of non-commuters, based on Shanghai Household Travel Survey data. The EM algorithm was coded in GAUSS software for parameter estimation, which took only 3 min for 10,048 observations. A mixture of normal distributions was chosen to construct the utility function and to fit the distribution of shopping activity start times; the fitted distribution from the model is largely consistent with the observed distribution in the data.
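For readers unfamiliar with the EM algorithm, the sketch below shows its basic form for a normal mixture. It is a simplified illustration only: it assumes constant weights, means, and standard deviations (no covariates), it is written in Python rather than GAUSS, and the synthetic data and starting values are hypothetical. It is not the estimation code used in the paper.

```python
import math
import random


def normal_pdf(t, mu, sigma):
    z = (t - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def em_normal_mixture(data, weights, mus, sigmas, iterations=100):
    """Basic EM for a normal mixture without covariates (illustrative assumption)."""
    weights, mus, sigmas = list(weights), list(mus), list(sigmas)
    n, k = len(data), len(mus)
    for _ in range(iterations):
        # E-step: posterior probability that each observation belongs to each component.
        resp = []
        for t in data:
            dens = [weights[j] * normal_pdf(t, mus[j], sigmas[j]) for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and standard deviations from the posteriors.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            mus[j] = sum(r[j] * t for r, t in zip(resp, data)) / nj
            var = sum(r[j] * (t - mus[j]) ** 2 for r, t in zip(resp, data)) / nj
            sigmas[j] = max(math.sqrt(var), 1e-6)  # guard against a collapsing component
    return weights, mus, sigmas


# Synthetic start times (hours): a morning and an afternoon cluster, both hypothetical.
rng = random.Random(0)
data = [rng.gauss(9.0, 1.0) for _ in range(600)] + [rng.gauss(15.0, 1.5) for _ in range(400)]

est_w, est_mu, est_sigma = em_normal_mixture(data, [0.5, 0.5], [8.0, 16.0], [2.0, 2.0])
print(est_w, est_mu, est_sigma)
```

Each iteration uses only closed-form density evaluations and weighted averages, which is consistent with the short estimation time reported above.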

The FMCL model has a well-defined probability density expression, so simulation can be conducted by directly generating random numbers from the component distributions. In contrast, simulation from the CL model usually requires the Metropolis–Hastings (MH) algorithm, which is relatively complicated and time-consuming. Therefore, the FMCL model also has the potential to reduce the simulation time.
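A minimal sketch of such direct simulation is given below, assuming a normal mixture with fixed parameters. The means reuse the constants reported earlier; the weights and standard deviations are hypothetical. A component is drawn according to the mixture weights, and a start time is then drawn from that component.

```python
import random


def sample_mixture(weights, mus, sigmas, size, rng):
    """Draw start times by first picking a component, then sampling from it."""
    components = rng.choices(range(len(weights)), weights=weights, k=size)
    return [rng.gauss(mus[j], sigmas[j]) for j in components]


# Hypothetical weights and standard deviations; means are the reported constants.
WEIGHTS = [0.15, 0.35, 0.30, 0.20]
MUS = [6.538, 9.215, 15.035, 15.414]
SIGMAS = [1.5, 1.0, 2.0, 1.0]

rng = random.Random(42)
draws = sample_mixture(WEIGHTS, MUS, SIGMAS, 20000, rng)

sample_mean = sum(draws) / len(draws)
mixture_mean = sum(w * m for w, m in zip(WEIGHTS, MUS))
print(sample_mean, mixture_mean)
```

No accept/reject step or burn-in period is involved, in contrast to an MH sampler.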

It is worth noting that the proposed FMCL model can also be applied for welfare evaluation, which is possible because the model is derived within the RUM framework. Ben-Akiva and Watanatada (4) showed that consumer surplus for the CL model can be computed as the limiting form of the MNL formula:

CS_{i} = \ln \left( \int_{t_{1}}^{t_{2}} \exp (V(t)) \, dt \right)

(19)

The welfare calculation formula is rewritten for the proposed FMCL model as

\begin{aligned} CS_{i} &= \ln \left( \int_{t_{1}}^{t_{2}} \exp (V(t)) \, dt \right) = \ln \left( \int_{t_{1}}^{t_{2}} \exp (\ln (f(t))) \, dt \right) \\ &= \ln \left( \int_{t_{1}}^{t_{2}} f(t) \, dt \right) = \ln \left( \int_{t_{1}}^{t_{2}} \sum_{n = 1}^{N} w_{n} f_{n} (t, μ_{n}, σ_{n}) \, dt \right) \\ &= \ln \left( \int_{t_{1}}^{t_{2}} \sum_{n = 1}^{N} w_{n} \frac{1}{\sqrt{2 π} σ_{n}} \exp \left( - \frac{(t - μ_{n})^{2}}{2 σ_{n}^{2}} \right) dt \right) \\ &= \ln \left( \sum_{n = 1}^{N} w_{n} \left( Φ \left( \frac{t_{2} - μ_{n}}{σ_{n}} \right) - Φ \left( \frac{t_{1} - μ_{n}}{σ_{n}} \right) \right) \right) \end{aligned}

(20)

From Equation 20, the welfare in the FMCL model can be calculated from the standard normal cumulative distribution function $Φ(\cdot)$, which is available in most statistical and computational packages. Thus, existing CL models are not as simple or computationally efficient as the FMCL model in welfare calculation.
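As an illustration of Equation 20, the sketch below evaluates the closed-form consumer surplus with the standard normal CDF (computed from `math.erf`) and compares it with a trapezoidal approximation of the integral in Equation 19. The means reuse the constants reported earlier; the weights, standard deviations, and the time window are hypothetical values chosen for illustration.

```python
import math


def std_normal_cdf(x):
    """Standard normal CDF, Phi(x), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(t, mu, sigma):
    z = (t - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fmcl_consumer_surplus(t1, t2, weights, mus, sigmas):
    """Closed form of Equation 20: log of the mixture probability mass on [t1, t2]."""
    mass = sum(
        w * (std_normal_cdf((t2 - m) / s) - std_normal_cdf((t1 - m) / s))
        for w, m, s in zip(weights, mus, sigmas)
    )
    return math.log(mass)


def consumer_surplus_by_quadrature(t1, t2, weights, mus, sigmas, steps=4000):
    """Equation 19 with exp(V(t)) = f(t), integrated numerically (trapezoidal rule)."""
    def f(t):
        return sum(w * normal_pdf(t, m, s) for w, m, s in zip(weights, mus, sigmas))

    h = (t2 - t1) / steps
    total = 0.5 * (f(t1) + f(t2)) + sum(f(t1 + i * h) for i in range(1, steps))
    return math.log(total * h)


# Hypothetical weights and standard deviations; means are the reported constants.
WEIGHTS = [0.15, 0.35, 0.30, 0.20]
MUS = [6.538, 9.215, 15.035, 15.414]
SIGMAS = [1.5, 1.0, 2.0, 1.0]

closed_form = fmcl_consumer_surplus(8.0, 12.0, WEIGHTS, MUS, SIGMAS)
numeric = consumer_surplus_by_quadrature(8.0, 12.0, WEIGHTS, MUS, SIGMAS)
print(closed_form, numeric)
```

The closed form needs only two CDF evaluations per component, whereas the quadrature route evaluates the density at every grid point.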

For the utility function of the FMCL model, a finite mixture of other distribution functions can also be chosen, which leaves great flexibility for further extension and improvement. This study is only a first exploration of the finite-mixture model, which can be more widely applied to model continuous activity–travel decision variables in the future.

Footnotes

Author Contributions

The authors confirm contribution to the paper as follows: study conception and design: X. Ye; data collection: T. Zhang; analysis and interpretation of results: S. Geng; draft manuscript preparation: S. Geng, X. Ye, T. Zhang, and K. Wang. All authors reviewed the results and approved the final version of the manuscript.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Key project “Research on the Theories for Modernization of Urban Transport Governance” (No. 71734004) from the National Natural Science Foundation of China and project “Activity-Based Travel Demand Model Development for Shanghai” (No. kh0160020220382) from Shanghai Urban Planning and Design Research Institute.

ORCID iDs

Shuguang Geng

Xin Ye

Ke Wang

References

1. McFadden. Modeling the Choice of Residential Location. Transportation Research Record: Journal of the Transportation Research Board, 1978. 673: 72–77.

2. McFadden. Conditional Logit Analysis of Qualitative Choice Behavior. Academic Press, New York, 1974.

3. Bhat C. R., Steed J. L. A Continuous-Time Model of Departure Time Choice for Urban Shopping Trips. Transportation Research Part B: Methodological, Vol. 36, No. 3, 2002, pp. 207–224.

4. Ben-Akiva, Watanatada. Applications of a Continuous Spatial Choice Logit Model. In: Structural Analysis of Discrete Data with Econometric Applications (Manski C. F., McFadden, eds.), MIT Press, London, 1981, pp. 320–343.

5. Ben-Akiva, Litinas, Tsunokawa. Continuous Spatial Choice: The Continuous Logit Model and Distributions of Trips and Urban Densities. Transportation Research Part A: General, Vol. 19, No. 2, 1985, pp. 119–154.

6. Zeid, Rossi, Gardner. Modeling Time-of-Day Choice in Context of Tour and Activity-Based Models. Transportation Research Record: Journal of the Transportation Research Board, 2006. 1981: 42–49.

7. Popuri, Ben-Akiva, Proussaloglou. Time-of-Day Modeling in a Tour-Based Context: Tel Aviv Experience. Transportation Research Record: Journal of the Transportation Research Board, 2008. 2076, No. 1: 88–96.

8. Ben-Akiva, Abou-Zeid. Methodological Issues in Modelling Time-of-Travel Preferences. Transportmetrica, Vol. 9, 2012, pp. 1–14.

9. Pendyala. A Model of Daily Time Use Allocation Using Fractional Logit Methodology. In Proceedings of Transportation and Traffic Theory. Flow, Dynamics and Human Interaction. 16th International Symposium on Transportation and Traffic Theory. University of Maryland, College Park, Elsevier Science Ltd., 2005, pp. 507–524.

10. Bhat C. R. A Multiple Discrete-Continuous Extreme Value Model: Formulation and Application to Discretionary Time-Use Decisions. Transportation Research Part B: Methodological, Vol. 39, No. 8, 2005, pp. 679–707.

11. Bhat C. R. The Multiple Discrete-Continuous Extreme Value (MDCEV) Model: Role of Utility Function Parameters, Identification Considerations, and Model Extensions. Transportation Research Part B: Methodological, Vol. 42, No. 3, 2008, pp. 274–303.

12. Saxena, Pinjari A. R., Roy, Paleti. Multiple Discrete-Continuous Choice Models with Bounds on Consumptions. Transportation Research Part A: Policy and Practice, Vol. 149, 2021, pp. 237–265.

13. Palma, Enam, Hess, Calastri, Crastes dit Sourd. Modelling Multiple Occurrences of Activities During a Day: An Extension of the MDCEV Model. Transportmetrica B: Transport Dynamics, Vol. 9, 2021, pp. 456–478.

14. Saleh, Farrell. Implications of Congestion Charging for Departure Time Choice: Work and Non-Work Schedule Flexibility. Transportation Research Part A: Policy and Practice, Vol. 39, No. 7–9, 2005, pp. 773–791.

15. Hess, Polak J. W., Bierlaire. Functional Approximations to Alternative-Specific Constants in Time-Period Choice-Modelling. In: Transportation and Traffic Theory: Flow, Dynamics and Human Interaction (Mahmassani H. S., ed.), Emerald Group, Bingley, UK, 2005, pp. 545–564.

16. Abkowitz M. D. An Analysis of the Commuter Departure Time Decision. Transportation, Vol. 10, No. 3, 1981, pp. 283–297.

17. Small K. A. The Scheduling of Consumer Activities: Work Trips. The American Economic Review, Vol. 72, No. 3, 1982, pp. 467–479.

18. McCafferty, Hall F. L. The Use of Multinomial Logit Analysis to Model the Choice of Time to Travel. Economic Geography, Vol. 58, No. 3, 1982, pp. 236–246.

19. Hunt J. D., Patterson. A Stated Preference Examination of Time of Travel Choice for a Recreational Trip. Journal of Advanced Transportation, Vol. 30, No. 3, 1996, pp. 17–44.

20. Okola. Departure Time Choice for Recreational Activities by Elderly Nonworkers. Transportation Research Record: Journal of the Transportation Research Board, 2003. 1848(1): 86–93.

21. Yang, Jin, Liu. Tour-Based Time-of-Day Choices for Weekend Nonwork Activities. Transportation Research Record: Journal of the Transportation Research Board, 2008. 2054(1): 37–45.

22. Chang M. S., P. R. A Multinomial Logit Model of Mode and Arrival Time Choices for Planned Special Events. Journal of the Eastern Asia Society for Transportation Studies, Vol. 10, 2013, pp. 710–727.

23. Chaichannawatik, Kanitpong, Limanond. Departure Time Choice (DTC) Behavior for Intercity Travel During a Long-Holiday in Bangkok, Thailand. Journal of Advanced Transportation, Vol. 2019, 2019, pp. 1–11.

24. Hendrickson, Plank. The Flexibility of Departure Times for Work Trips. Transportation Research Part A: General, Vol. 18, No. 1, 1984, pp. 25–36.

25. Vovsha, Bradley. Hybrid Discrete Choice Departure-Time and Duration Model for Scheduling Travel Tours. Transportation Research Record: Journal of the Transportation Research Board, 2004. 1894(1): 46–56.

26. Sikder, Augustin, Pinjari A. R., Eluru. Spatial Transferability of Tour-Based Time-of-Day Choice Models: Empirical Assessment. Transportation Research Record, 2014. 2429(1): 99–109.

27. Chin A. T. Influences on Commuter Trip Departure Time Decisions in Singapore. Transportation Research Part A: General, Vol. 24, No. 5, 1990, pp. 321–333.

28. Bajwa, Bekhor, Kuwahara, Chung. Discrete Choice Modeling of Combined Mode and Departure Time. Transportmetrica, Vol. 4, No. 2, 2008, pp. 155–177.

29. Hess, Polak J. W. Exploring the Potential for Cross-Nesting Structures in Airport-Choice Analysis: A Case-Study of the Greater London Area. Transportation Research Part E: Logistics and Transportation Review, Vol. 42, No. 2, 2006, pp. 63–81.

30. Ding, Mishra, Lin, Xie. Cross-Nested Joint Model of Travel Mode and Departure Time Choice for Urban Commuting Trips: Case Study in Maryland–Washington, DC Region. Journal of Urban Planning and Development, Vol. 141, No. 4, 2015, p. 04014036.

31. Small K. A. A Discrete Choice Model for Ordered Alternatives. Econometrica: Journal of the Econometric Society, Vol. 55, No. 2, 1987, pp. 409–424.

32. Steed J. L., Bhat C. R. On Modeling Departure-Time Choice for Home-Based Social/Recreational and Shopping Trips. Transportation Research Record, Vol. 1706, No. 1, 2000, pp. 152–159.

33. Ozbay, Yanmaz-Tuzel. Valuation of Travel Time and Departure Time Choice in the Presence of Time-of-Day Pricing. Transportation Research Part A: Policy and Practice, Vol. 42, No. 4, 2008, pp. 577–590.

34. Chu Y. L. Work Departure Time Analysis Using Dogit Ordered Generalized Extreme Value Model. Transportation Research Record: Journal of the Transportation Research Board, Vol. 2132, No. 1, 2009, pp. 42–49.

35. Bhat C. R. Accommodating Flexible Substitution Patterns in Multi-Dimensional Choice Modeling: Formulation and Application to Travel Mode and Departure Time Choice. Transportation Research Part B: Methodological, Vol. 32, No. 7, 1998, pp. 455–466.

36. Holyoak. Modelling the Trip Departure Timing Decision and Peak Spreading Policies. In: Proceedings of the European Transport Conference (ETC), Leiden, The Netherlands, 2007.

37. Cheng, Yang X. K. Random Parameter Nested Logit Model for Combined Departure Time and Route Choice. International Journal of Transportation Science and Technology, Vol. 4, No. 1, 2015, pp. 93–105.

38. De Jong, Daly, Pieters, Vellay, Bradley, Hofman. A Model for Time of Day and Mode Choice Using Error Components Logit. Transportation Research Part E: Logistics and Transportation Review, Vol. 39, No. 3, 2003, pp. 245–268.

39. Liu, Mahmassani H. S. Dynamic Aspects of Commuter Decisions under Advanced Traveler Information Systems: Modeling Framework and Experimental Results. Transportation Research Record: Journal of the Transportation Research Board, 1998. 1645(1): 111–119.

40. Horowitz J. L. Reconsidering the Multinomial Probit Model. Transportation Research Part B: Methodological, Vol. 25, No. 6, 1991, pp. 433–438.

41. Holyoak. Departure Time Choice for the Car-Based Commute. In: Proceedings of the 31st Australasian Transport Research Forum, Australia, Department of Transport, Victoria, 2008, pp. 429–442. https://www.researchgate.net/publication/264967373

42. Tringides C. A., Pendyala M. R. Departure-Time Choice and Mode Choice for Nonwork Trips: Alternative Formulations of Joint Model Systems. Transportation Research Record: Journal of the Transportation Research Board, 2004. 1898(1): 1–9.

43. Vishnu, Srinivasan K. K. Tour-Based Departure Time Models for Work and Non-Work Tours of Workers. Procedia-Social and Behavioral Sciences, Vol. 104, 2013, pp. 630–639.

44. Wang J. J. Timing Utility of Daily Activities and Its Impact on Travel. Transportation Research Part A: Policy and Practice, Vol. 30, No. 3, 1996, pp. 189–206.

45. Komma, Srinivasan. Modeling Home-to-Work Commute-Timing Decisions of Workers with Flexible Work Schedules. Presented at 87th Annual Meeting of the Transportation Research Board, Washington, D.C., 2008.

46. Habib K. M. N., Day, Miller E. J. An Investigation of Commuting Trip Timing and Mode Choice in the Greater Toronto Area: Application of a Joint Discrete-Continuous Model. Transportation Research Part A: Policy and Practice, Vol. 43, No. 7, 2009, pp. 639–653.

47. Gadda, Kockelman K. M., Damien. Continuous Departure Time Models: A Bayesian Approach. Transportation Research Record: Journal of the Transportation Research Board, 2009. 2132(1): 13–24.

48. Lemp J. D., Kockelman K. M. Understanding and Accommodating Risk and Uncertainty in Toll Road Projects: A Review of the Literature. Transportation Research Record: Journal of the Transportation Research Board, 2009. 2132(1): 106–112.

49. Lemp J. D., Kockelman K. M., Damien. The Continuous Cross-Nested Logit Model: Formulation and Application for Departure Time Choice. Transportation Research Part B: Methodological, Vol. 44, No. 5, 2010, pp. 646–661.

50. Ghader, Carrion, Zhang. Autoregressive Continuous Logit: Formulation and Application to Time-of-Day Choice Modeling. Transportation Research Part B: Methodological, Vol. 123, 2019, pp. 240–257.

51. Ghader, Carrion, Tang, Asadabadi, Zhang. A Copula-Based Continuous Cross-Nested Logit Model for Tour Scheduling in Activity-Based Travel Demand Models. Transportation Research Part B: Methodological, Vol. 145, 2021, pp. 324–341.

52. McLachlan G. J., Basford K. E. Mixture Models: Inference and Applications to Clustering. CRC Press, New York, 1988.

53. Redner R. A., Walker H. F. Mixture Densities, Maximum Likelihood and the EM Algorithm. SIAM Review, Vol. 26, No. 2, 1984, pp. 195–239.

54. Dempster A. P., Laird N. M., Rubin D. B. Maximum Likelihood from Incomplete Data Via the EM Algorithm. Journal of the Royal Statistical Society: Series B (Methodological), Vol. 39, No. 1, 1977, pp. 1–22.

Finite-Mixture Continuous Logit Model: Formulation and Application for Shopping Start Time Choice of Non-Commuters
