Comparing the efficiency and robustness of state-of-the-art experimental designs for stated choice modeling: A simulation analysis

Abstract

Among the methods proposed for constructing experimental designs, orthogonal design, uniform design, and D-efficient design are the state of the art. This article compares the efficiency and robustness of these methods in detail through three case studies using multinomial logit and mixed multinomial logit models. Normalized D-error (ND-error) values measure design efficiency, and the departures of D-errors under misspecified prior information measure design robustness. The design methods are described, and designs with various numbers of runs are constructed. The results indicate that (a) when parameter priors are available, the D-efficient design method outperforms the other two methods in terms of design efficiency, while the uniform design and orthogonal design methods are comparable with each other; (b) efficiency is lost when a D-efficient design constructed for one specific model is applied to another; (c) all three methods have comparable robustness against misspecification of parameter prior values; however, the effect of misspecification of the prior distribution is substantial when D-efficient design is used in the mixed multinomial logit model; and (d) when parameter priors are unknown, uniform design is suggested for constructing experimental designs.

Keywords

Experimental design, efficiency, robustness, stated choice

Introduction

Originally derived by Luce and further developed by McFadden and others, stated choice (SC) models, also known as discrete choice models, have been widely used in traveler behavior analysis and forecasting in transportation studies. For SC models, a good experimental design can yield better parameter estimates than a poor one with a comparable number of observations. Experimental designs therefore play an important role in SC modeling, as they are the basis of data collection.

The purpose of an SC experiment is to determine the influence of the design attributes on the choices observed to be made by sampled respondents undertaking the experiment.¹ The most straightforward construction is the full factorial design, which considers every possible choice situation; for example, a design with two alternatives and three attributes with three levels each has $3^{2 \times 3} = 729$ runs (also known as the number of choice situations). However, this explosive growth in the number of runs makes the design difficult or even impossible for an individual respondent to handle, so the full factorial design method is infeasible when many attributes and levels are involved. Taking a subset of choice situations from the full factorial design to construct the survey is therefore a reasonable approach for analysts. Such designs are known as fractional factorial designs. Since the manpower and monetary cost of a survey can be enormous, researchers and engineers have long sought better ways to construct designs that provide more statistically reliable parameter estimates.
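The run count quoted above can be checked by direct enumeration. The following Python sketch is illustrative only; the function name `full_factorial` and the level coding 1, ..., q are assumptions made for this example and are not part of the original study.

```python
from itertools import product


def full_factorial(levels_per_column):
    """Enumerate every choice situation of a full factorial design.

    levels_per_column holds one entry per attribute column (the attributes of
    all alternatives placed side by side), giving that column's number of levels.
    """
    level_ranges = [range(1, q + 1) for q in levels_per_column]
    return list(product(*level_ranges))


# Two alternatives, three attributes each, three levels per attribute.
design = full_factorial([3] * (2 * 3))
print(len(design))  # 729
```

Each added attribute multiplies the number of runs by its number of levels, which is why fractional designs are needed in practice.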

Historically, researchers have relied on orthogonal experimental designs (ORDs), in which the attributes of the experiment are made statistically independent by forcing them to be orthogonal.¹ However, the orthogonality constraint also restricts the numbers of runs that can be chosen in an ORD. Moreover, because SC models are nonlinear, the importance of orthogonality is questionable when ORD is used to construct experimental designs for SC models. Some scholars have also pointed out that orthogonality may be lost when subsets of the design are administered unevenly.²

Uniform design (UD), by contrast, is a space-filling design that seeks design points scattered uniformly over the experimental domain.³ Proposed by K-T Fang,⁴ it has been widely used in agriculture, medicine, and the chemical industry,⁵ but it has been little studied or applied in SC experimental design. D Wang and J Li⁶ stated that when UD is used, an overall mean model, which is a polynomial function that can approximate any type of model (either linear or nonlinear), is assumed, and the design is arranged accordingly. D Wang and P Li⁷ and P Li and D Wang⁸ studied the statistical properties of UD in a transport mode choice problem using the multinomial logit (MNL) model and found that the efficiency of parameter estimation with UD is comparable to that with the ORD method. These findings motivated us to explore UD for SC modeling further.

Another group of researchers has focused on so-called D-efficient designs (DEDs).⁹ Unlike the ORD and UD methods, DED constructs designs whose efficiency is directly linked to the SC model that is most likely to be estimated. DED seeks an experimental design that minimizes the expected asymptotic standard errors, so that more reliable parameter estimates can be achieved. To do so, prior information on the parameters to be estimated is needed. The theoretical basis of DED mainly involves three topics: the analytical derivation of the asymptotic variance–covariance (AVC) matrix for the corresponding SC models,⁹⁻¹¹ algorithms for searching for low D-error designs,¹²,¹³ and the analysis of misspecification of parameter priors.¹⁴

Given that each method has advantages and limitations in SC experimental design, how should analysts make trade-offs and select a suitable method for a given SC problem? This raises the issues of experimental design efficiency and robustness. Design efficiency measures the statistical performance of a design in parameter estimation: a design is more efficient if it yields parameter estimates with smaller standard errors than other designs with a comparable number of runs. Design robustness measures how well a design withstands bias in the prior information: a more robust design performs more stably in parameter estimation when priors are misspecified. The subject has been studied with simulation by DS Bunch and colleagues¹⁵,¹⁶ and with empirical data by F Yang et al.¹⁷ and T Li et al.¹⁸ However, the existing studies have some limitations. First, to the authors’ knowledge, efficiency is usually compared among designs constructed by different design methods at a single, arbitrarily chosen number of runs, and few comparisons have been made among designs with different numbers of runs. Second, although design methods have been compared in previous studies,²,¹⁴⁻¹⁶ the SC models used are mainly MNL. As more advanced models with stronger explanatory power are implemented in practice, studies on the efficiency and robustness of design methods for mixed multinomial logit (MMNL) models are urgently needed. Third, when MMNL models are considered, both prior values and prior distributions can be misspecified, so further study of the misspecification of parameter prior information is needed. This article attempts to address these questions by comparing the performance of the ORD, UD, and DED methods under various choice scenarios. We use the normalized D-error criterion to evaluate the efficiency of designs constructed with different design methods at various numbers of runs. The departures of D-error values under biased prior information (prior value and prior distribution) are used to evaluate the robustness of the design methods.

The remainder of this article is organized as follows. Section “Considerations in SC experimental designs” details the considerations in SC experimental design, including choice scenarios, SC models, experimental design methods, and the number of runs. Section “Measurements of efficiency and robustness” elaborates the measures of design efficiency and robustness. In section “Case studies,” three cases representing simple and more complex choice scenarios are presented, in which experimental designs with different numbers of runs are constructed using the ORD, UD, and DED methods for MNL and MMNL models, and the efficiency and robustness of the design methods are calculated and compared. Section “Results and discussion” discusses the results of the case studies. Section “Conclusion” provides conclusions and suggestions for further research.

Considerations in SC experimental designs

SC models

SC models are usually based on random utility theory. Let $U_{nsj}$ denote the utility of alternative j perceived by respondent n in choice situation s, which consists of an observed component $V_{nsj}$ and an unobserved component $ε_{nsj}$. The observed component of utility is typically assumed to be a linear combination of the observed attributes of each alternative, denoted x, and their corresponding parameters $β$. The utility and its observed component can be expressed by the following equations

U_{nsj} = V_{nsj} + ε_{nsj}

(1)

V_{nsj} = \sum_{k = 1}^{K} β_{jk} x_{nsjk}

(2)

If a parameter $β_{jk}$ appears in the utility functions of multiple alternatives j, it is said to be generic over these alternatives. Otherwise, the parameter is called alternative specific. The utility functions affect the complexity of the SC experimental design. Specifically, the number of choice alternatives, the attribute categories (generic or specific), the number of attributes, and the number of attribute levels all matter (see JM Rose and MCJ Bliemer² for more information on the determinants of experimental design complexity).

Following the utility-maximizing rule, different SC models can be derived from different assumptions about the statistical form of the unobserved component and the parameters.

When the unobserved components are assumed to be uncorrelated across alternatives and individuals and to follow a type I extreme value distribution, the MNL model is derived.¹⁰ The choice probabilities of the MNL model are expressed as follows

P_{nsj} = \frac{\exp (V_{nsj})}{\sum_{i \in J_{ns}} \exp (V_{nsi})}

(3)
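Equations (2) and (3) can be evaluated directly for a single choice situation. The sketch below is a minimal illustration; the parameter values, attribute levels, and function names are hypothetical and chosen only to show the calculation.

```python
import math


def observed_utility(beta, x):
    """Equation (2): linear-in-parameters observed utility."""
    return sum(b * xi for b, xi in zip(beta, x))


def mnl_probabilities(utilities):
    """Equation (3): logit probabilities over the alternatives of one choice situation."""
    shift = max(utilities)  # subtracting the maximum avoids overflow in exp
    weights = [math.exp(v - shift) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: three generic parameters, two alternatives.
beta = [-0.2, 0.6, 0.2]
choice_situation = [[1, 2, 1], [3, 1, 2]]
V = [observed_utility(beta, x) for x in choice_situation]
P = mnl_probabilities(V)
print(V, P)
```

Only differences in utility matter: adding a constant to every $V_{nsj}$ leaves the probabilities unchanged, which is what makes the shift inside `mnl_probabilities` harmless.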

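When one or more parameters of the observed component are random rather than fixed, the MMNL model (also known as the random parameters logit model) is derived.¹⁹ The choice probabilities of the MMNL model are expressed as follows, where $f (β | θ)$ is the density of the random parameters given the distribution parameters $θ$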

P_{nsj} = \int_{β} \frac{\exp (V_{nsj})}{\sum_{i \in J_{ns}} \exp (V_{nsi})} f (β | θ) d β

(4)

The probability expressions of the MNL and MMNL models are quite different, and the latter is far more complex because it has no closed form. Draws are needed to simulate the probabilities in MMNL models. This suggests that, under the same choice scenario and with the same experimental design method, designs may perform differently when they are used to estimate the parameters of different SC models.
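To make the role of draws concrete, the integral in equation (4) can be approximated by averaging MNL probabilities over random draws of the parameters. The sketch below uses plain pseudo-random Monte Carlo draws, which is an assumption made for simplicity (the case studies in this article use Gaussian quadrature inside Ngene); the parameter distribution, attribute levels, and function names are hypothetical.

```python
import math
import random


def mnl_probabilities(utilities):
    """Logit probabilities for one choice situation (equation (3))."""
    shift = max(utilities)
    weights = [math.exp(v - shift) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def simulated_mmnl_probabilities(choice_situation, draw_beta, n_draws=2000, seed=0):
    """Approximate equation (4) by averaging MNL probabilities over draws of beta."""
    rng = random.Random(seed)
    totals = [0.0] * len(choice_situation)
    for _ in range(n_draws):
        beta = draw_beta(rng)
        utilities = [sum(b * x for b, x in zip(beta, alt)) for alt in choice_situation]
        for j, p in enumerate(mnl_probabilities(utilities)):
            totals[j] += p
    return [t / n_draws for t in totals]


def draw_beta(rng):
    """Hypothetical specification: first parameter normal (mean -0.2, sd 0.5), second fixed."""
    return [rng.gauss(-0.2, 0.5), 0.6]


choice_situation = [[1, 2], [3, 1]]
P = simulated_mmnl_probabilities(choice_situation, draw_beta)
print(P)
```

The simulated probabilities depend on the number and type of draws, which is one reason D-errors for MMNL models can fluctuate more than those for MNL models.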

Experimental design methods

Orthogonal design method

A design is said to be orthogonal if it satisfies attribute level balance and all parameters are independently estimable. In other words, the correlation matrix of the coded values in an orthogonal design (ORD) is an identity matrix. When orthogonal coding ($\{-1, 1\}$ for two levels, $\{-1, 0, 1\}$ for three levels, $\{-3, -1, 1, 3\}$ for four levels, etc.) is adopted in the design process, the inner product of any two columns in the ORD is zero, as shown below

\sum_{s = 1}^{S} x_{j_{1} k_{1} s} x_{j_{2} k_{2} s} = 0, \forall (j_{1}, k_{1}) \neq (j_{2}, k_{2})

(5)

where s indexes the runs of the design and $x_{j_{1} k_{1} s}$ and $x_{j_{2} k_{2} s}$ denote the coded values of an arbitrary pair of attributes in choice situation s. Several ways to construct full or fractional factorial ORDs are reported in JD Swait et al.¹ and JM Rose and MCJ Bliemer.² When computation is available, however, the simplest way to construct an ORD is to construct the full factorial design and then select subsets of choice tasks that satisfy the orthogonality rule. Clearly, more than one ORD can exist for a given design scenario.
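The orthogonality condition in equation (5) is easy to test numerically. The following sketch checks every pair of columns of a small, hypothetical two-level design in orthogonal coding; the function name `is_orthogonal` and the example designs are illustrative assumptions.

```python
from itertools import combinations


def is_orthogonal(design):
    """Return True if every pair of columns has a zero inner product (equation (5))."""
    columns = list(zip(*design))
    return all(
        sum(a * b for a, b in zip(col_1, col_2)) == 0
        for col_1, col_2 in combinations(columns, 2)
    )


# Half fraction of a 2^3 factorial in orthogonal coding (third column = product of the first two).
half_fraction = [
    (-1, -1, 1),
    (-1, 1, -1),
    (1, -1, -1),
    (1, 1, 1),
]
print(is_orthogonal(half_fraction))      # True
print(is_orthogonal(half_fraction[:3]))  # False: dropping a run breaks orthogonality
```

The second call illustrates why orthogonality restricts the admissible numbers of runs: removing even one run from an orthogonal fraction generally destroys the property.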

UD method

The main objective of UD is to sample a small set of points from a given closed and bounded set $C^{s} \subset R^{s}$ so that the sampled points are uniformly scattered on $C^{s}$. Let the model to be estimated be expressed by

Y = h (x_{1}, \dots, x_{s}) + ε

(6)

where $ε$ denotes the random error component. The main goal of UD is to estimate the average value $E (h (x))$ over the experimental domain, where $h (x)$ is an output of the experiment. When the domain is assumed to be the unit cube $C^{s} = [0, 1)^{s}$, the average value can be expressed as follows

E (h (x)) = \int_{C^{s}} h (x) dx

(7)

From a sampling perspective, this can usually be estimated by the sample mean of $h (x)$, as shown below

\bar{h} = \frac{1}{n} \sum_{x \in P} h (x)

(8)

where P is a set with n experimental points over the domain.

By the Koksma–Hlawka inequality, an upper bound on the error of the estimate of $E (h (x))$ is given by

| E (h (x)) - \bar{h} | \leq D (P) V (h)

(9)

where $D (P)$ is the discrepancy of P and $V (h)$ is the variation of the integrand, as given by Niederreiter and referenced by Fang,⁴ which is independent of the design points. Equation (9) thus suggests seeking a design with minimum discrepancy. Accordingly, a design U is called a UD under the measure M if the following condition holds

M (U) = \min_{V \in U (n; q_{1} \times \dots \times q_{s})} M (V)

(10)

where M denotes the discrepancy measure of a design and $U (n; q_{1} \times \dots \times q_{s})$ denotes an $n \times s$ matrix whose jth column contains the elements $\{1, \dots, q_{j}\}$, each appearing equally often; the minimum in equation (10) is taken over all such matrices.

The centered L₂ discrepancy (CD) is adopted as the measure of uniformity because of its appealing properties.³ JH Fred²⁰ gave an analytical expression for CD, shown in equation (11), where n denotes the number of points in the s-dimensional unit cube $C^{s} = [0, 1)^{s}$ and s denotes the number of attributes

{(CD (P))}^{2} = {(\frac{13}{12})}^{s} - \frac{2}{n} \sum_{k = 1}^{n} \prod_{j = 1}^{s} (1 + \frac{1}{2} | x_{kj} - 0.5 | - \frac{1}{2} {| x_{kj} - 0.5 |}^{2}) + \frac{1}{n^{2}} \sum_{k = 1}^{n} \sum_{j = 1}^{n} \prod_{i = 1}^{s} [1 + \frac{1}{2} | x_{ki} - 0.5 | + \frac{1}{2} | x_{ji} - 0.5 | - \frac{1}{2} | x_{ki} - x_{ji} |]

(11)

To map the design levels into the unit cube, $x_{ij}$ is given by

x_{ij} = \frac{(u_{ij} - 0.5)}{q}

(12)

where $u_{ij}$ is the level of attribute j in run i and q is the maximum number of levels among the attributes. With the measure M and the criterion in equation (10), a UD can be constructed by computer optimization.
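Equations (11) and (12) can be combined into a short routine that scores the uniformity of a candidate design. The sketch below is a direct transcription of those two equations; the function names and the two small example designs are hypothetical, and no optimization over designs is attempted.

```python
def to_unit_cube(design, q):
    """Equation (12): map integer levels u to points x = (u - 0.5) / q."""
    return [[(u - 0.5) / q for u in run] for run in design]


def centered_l2_discrepancy_squared(points):
    """Equation (11): squared centered L2 discrepancy of n points in [0, 1)^s."""
    n = len(points)
    s = len(points[0])
    first = (13.0 / 12.0) ** s
    second = 0.0
    for x in points:
        term = 1.0
        for v in x:
            d = abs(v - 0.5)
            term *= 1.0 + 0.5 * d - 0.5 * d * d
        second += term
    third = 0.0
    for xk in points:
        for xj in points:
            term = 1.0
            for a, b in zip(xk, xj):
                term *= 1.0 + 0.5 * abs(a - 0.5) + 0.5 * abs(b - 0.5) - 0.5 * abs(a - b)
            third += term
    return first - 2.0 / n * second + third / (n * n)


# Two hypothetical 4-run designs with two four-level attributes.
diagonal = [(1, 1), (2, 2), (3, 3), (4, 4)]
scattered = [(1, 2), (2, 4), (3, 1), (4, 3)]
cd_diagonal = centered_l2_discrepancy_squared(to_unit_cube(diagonal, 4))
cd_scattered = centered_l2_discrepancy_squared(to_unit_cube(scattered, 4))
print(cd_diagonal, cd_scattered)
```

Both designs are level balanced, but the scattered design has the lower discrepancy, so it would be preferred under the criterion in equation (10).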

DED method

The DED method links design construction to the reduction of the asymptotic standard errors of the parameter estimates. Its theoretical basis is to minimize the determinant of the AVC matrix of the model, a quantity known as the D-error.¹⁷ Smaller D-error values indicate more reliable parameter estimates.

Let $Ω_{N}$ denote the AVC matrix given a sample of N respondents (each facing S runs), X denote the matrix of attributes, Y denote the matrix of choice outcomes, and $\tilde{β}$ denote the prior values of the parameters. The AVC matrix is the negative inverse of the expected Fisher information matrix $I_{N}$, where the latter is equal to the matrix of second derivatives of the log-likelihood function, as shown below

Ω_{N} (X, Y, \tilde{β}) = - {[E (I_{N} (X, Y, \tilde{β}))]}^{- 1} = - {[\frac{\partial^{2} L_{N} (X, Y, \tilde{β})}{\partial β \partial β^{'}}]}^{- 1}

(13)

If the choice observations from a single respondent over a series of choice situations are assumed independent, then the log-likelihood function can be written as below

L_{N} (X, Y, \tilde{β}) = \sum_{n = 1}^{N} \sum_{s = 1}^{S} \sum_{j = 1}^{J} y_{nsj} \log P_{nsj} (X, \tilde{β})

(14)

where $y_{nsj}$ is the choice indicator, which equals one when respondent n chooses alternative j in choice situation s and zero otherwise.

Following M Daniel¹⁰ and MCJ Bliemer and JM Rose,¹² it can be shown that for the MNL and MMNL models the outcomes Y either drop out or can be replaced by probabilities when the second derivatives of $L_{N}$ are taken. When all respondents face exactly the same choice situations, that is, $X_{n} = X$ for all respondents, $I_{N}$ and $Ω_{N}$ take the forms shown below

I_{N} (X, \tilde{β}) = N \cdot I_{1} (X, \tilde{β})

(15)

Ω_{N} (X, \tilde{β}) = \frac{1}{N} Ω_{1} (X, \tilde{β})

(16)

In other words, the AVC matrix for a sample of N respondents can be obtained directly from the AVC matrix for a single respondent by scaling it with a factor of $1 / N$.

In this article, we choose the $D_{p}$-error as the D-error criterion, assuming parameters with fixed prior values or distributions with fixed parameters. It is computed as shown below, where K is the number of parameters to be estimated

D_{p}\text{-error} = {(\det Ω_{1} (X, \tilde{β}))}^{1 / K}

(17)
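For the MNL model with generic parameters, equation (17) can be evaluated with a few lines of code. The sketch below assumes the standard closed form of the single-respondent MNL information matrix, $\sum_{s} \sum_{j} P_{sj} (x_{sj} - \bar{x}_{s}) (x_{sj} - \bar{x}_{s})^{'}$ with $\bar{x}_{s} = \sum_{j} P_{sj} x_{sj}$, and inverts it to obtain the AVC matrix; the four-run design, the function names, and the hand-written determinant routine are illustrative assumptions, not the implementation used in Ngene.

```python
import math


def mnl_probabilities(utilities):
    shift = max(utilities)
    weights = [math.exp(v - shift) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def information_matrix(design, beta):
    """Single-respondent MNL information matrix for generic parameters."""
    K = len(beta)
    info = [[0.0] * K for _ in range(K)]
    for situation in design:
        V = [sum(b * x for b, x in zip(beta, alt)) for alt in situation]
        P = mnl_probabilities(V)
        mean = [sum(p * alt[k] for p, alt in zip(P, situation)) for k in range(K)]
        for p, alt in zip(P, situation):
            dev = [alt[k] - mean[k] for k in range(K)]
            for a in range(K):
                for b in range(K):
                    info[a][b] += p * dev[a] * dev[b]
    return info


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) == 0.0:
            return 0.0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det
        det *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return det


def d_error(design, beta):
    """Equation (17): det(AVC)^(1/K), using det(AVC) = 1 / det(information)."""
    det_info = determinant(information_matrix(design, beta))
    if det_info > 0.0:
        return det_info ** (-1.0 / len(beta))
    return math.inf  # singular information matrix: parameters not identified


# Hypothetical design: four choice situations, two alternatives, three generic attributes.
beta = [-0.2, 0.6, 0.2]
design = [
    [[1, 1, 1], [4, 2, 2]],
    [[2, 2, 1], [3, 1, 2]],
    [[3, 2, 2], [2, 1, 1]],
    [[4, 1, 2], [1, 2, 1]],
]
print(d_error(design, beta))
print(d_error(design * 2, beta))  # repeating every situation halves the D-error
```

The second call shows why raw D-errors of designs with different numbers of runs are not directly comparable: doubling the information matrix halves the D-error regardless of how well the runs are chosen.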

Thus, a DED can be constructed for a specific SC model by solving the following problem

\begin{matrix} \min {(\det (Ω_{1}))}^{1 / K} \\ \text{s.t. } x \in X \end{matrix}

(18)

In principle, the problem is solved by evaluating every combination of choice situations from the full factorial design. Among the combinations with a given number of choice situations, the one with the lowest D-error is the optimal design. From equation (17) and the objective function in equation (18), it is clear that advance knowledge of the parameter values $\tilde{β}$ is needed to construct a DED. This advance knowledge is the so-called prior information. In practice, parameter estimates reported in the related literature or obtained from pilot studies may be taken as prior information. In this article, we set the priors in advance, assuming them to be true in the design efficiency comparisons and to depart from the true values in the robustness comparisons. As DED is linked to the model form most likely to be used in parameter estimation, we denote the DED for the MNL model as DpMNL and the DED for the MMNL model as DpMMNL.
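The exhaustive evaluation described above can be sketched for a deliberately small problem. The example assumes a two-alternative MNL model with two generic parameters, for which the information matrix is 2 × 2 and its determinant has a closed form; the candidate set, prior values, and function names are hypothetical, and mirrored or identical pairs of profiles are excluded from the candidates. Practical tools rely on heuristic search because the number of combinations grows very quickly.

```python
import math
from itertools import combinations, product


def candidate_contributions(profiles, beta):
    """Information contribution of each candidate choice situation (a pair of
    distinct profiles) in a two-alternative MNL with two generic parameters."""
    candidates = []
    for a, b in combinations(profiles, 2):
        d0, d1 = a[0] - b[0], a[1] - b[1]
        p = 1.0 / (1.0 + math.exp(-(beta[0] * d0 + beta[1] * d1)))
        w = p * (1.0 - p)
        candidates.append(((a, b), (w * d0 * d0, w * d0 * d1, w * d1 * d1)))
    return candidates


def exhaustive_d_efficient(profiles, beta, runs):
    """Return the design with the lowest D-error among all designs with `runs` situations."""
    best_design, best_error = None, math.inf
    for combo in combinations(candidate_contributions(profiles, beta), runs):
        i00 = sum(c[1][0] for c in combo)
        i01 = sum(c[1][1] for c in combo)
        i11 = sum(c[1][2] for c in combo)
        det_info = i00 * i11 - i01 * i01
        if det_info > 1e-12:  # skip designs whose information matrix is singular
            error = det_info ** -0.5  # det(AVC)^(1/K) with K = 2
            if best_error > error:
                best_design, best_error = [c[0] for c in combo], error
    return best_design, best_error


# Hypothetical scenario: attribute 1 has three levels, attribute 2 has two levels.
profiles = list(product([1, 2, 3], [1, 2]))
prior = (-0.2, 0.6)
design, error = exhaustive_d_efficient(profiles, prior, runs=3)
print(design, error)
```

Changing `prior` changes the weights attached to each candidate situation and can therefore change the selected design, which is the reason misspecified priors matter for DED.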

The number of runs

The number of runs determines the burden that the questionnaire places on respondents: the more runs an experimental design has, the larger the burden a respondent faces. The smallest number of runs depends on the degrees of freedom of the choice scenario, the principle of attribute level balance (over all choice situations, each attribute level should appear an equal number of times), and additional constraints imposed by the design method, such as orthogonality. The largest number of runs equals the number of runs in the corresponding full factorial design. In this article, designs with different numbers of runs were constructed under the same choice scenarios, with the same design methods and the same SC models, to analyze the effect of the number of runs on the efficiency of the design methods.

Measurements of efficiency and robustness

Normalized D-error

As introduced in section “DED method,” the D-error is an overall measure related to the standard errors of the estimated parameters of an SC model, and it is therefore regarded as the criterion of efficiency for an experimental design. According to equation (17) in section “DED method,” the D-error depends on the number of runs. To compare the D-errors of designs constructed with different numbers of runs, we use the normalized D-error criterion proposed by MCJ Bliemer and JM Rose.²¹ Let ND denote the normalized D-error, D the original D-error, S the number of runs of the design, and $S^{*}$ the number of runs of the standard design. The D-error is then normalized by

ND = \frac{D \cdot S}{S^{*}}

(19)
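As a worked example of equation (19), the sketch below normalizes the two DpMNL D-errors quoted for case 1 in this article (0.351 at 8 runs and 0.096 at 32 runs) to the 20-run standard design used in that case. The resulting ND values are simple arithmetic on those quoted figures and are not results reported by the authors; the function name is an assumption.

```python
def normalized_d_error(d_error, runs, standard_runs):
    """Equation (19): rescale a D-error to the number of runs of the standard design."""
    return d_error * runs / standard_runs


nd_8_runs = normalized_d_error(0.351, 8, 20)
nd_32_runs = normalized_d_error(0.096, 32, 20)
print(round(nd_8_runs, 4), round(nd_32_runs, 4))  # 0.1404 0.1536
```

Although the raw D-errors differ by a factor of more than three, the normalized values are close, which is the sense in which the number of runs has little effect on efficiency.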

Misspecification of priors

In calculating the D-error and normalized D-error, we have assumed that the prior parameter information corresponds to the true parameter values in the SC models. In practice, however, parameter priors can only be assumed from experience or pilot surveys. This may lead to bias between the prior values set by analysts and the real parameter values held by the population. Studying the robustness of a design against prior bias can therefore help designers choose more reliable methods when prior information is uncertain.

Two kinds of prior information misspecification can arise, depending on the characteristics of the parameters. One concerns parameters with fixed values and the other concerns parameters that follow specific distributions. For the former, fixing the design and varying the parameter values over some range provides a way to test the robustness of different designs to prior information bias through the departures of the D-error values. For the latter, we can change the form of the distribution and examine the departure of the D-error values.
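The first kind of test can be sketched as a one-parameter sweep: the design is held fixed, one true parameter is moved away from its prior, and the D-error is re-evaluated at each step. The example below assumes a two-alternative MNL model with two generic parameters so that the D-error has a closed form; the design (given as attribute differences between the two alternatives), the prior values, and the function names are hypothetical.

```python
import math


def binary_mnl_d_error(differences, beta):
    """D-error of a two-alternative, two-parameter MNL design.

    differences holds, per choice situation, the attribute differences
    between alternative 1 and alternative 2.
    """
    i00 = i01 = i11 = 0.0
    for d0, d1 in differences:
        p = 1.0 / (1.0 + math.exp(-(beta[0] * d0 + beta[1] * d1)))
        w = p * (1.0 - p)
        i00 += w * d0 * d0
        i01 += w * d0 * d1
        i11 += w * d1 * d1
    return (i00 * i11 - i01 * i01) ** -0.5


def d_error_departures(differences, prior, index, deviations):
    """Vary one true parameter around its prior and record the D-error each time."""
    results = []
    for deviation in deviations:
        true_beta = list(prior)
        true_beta[index] = prior[index] * (1.0 + deviation)
        results.append((deviation, binary_mnl_d_error(differences, true_beta)))
    return results


differences = [(-3, -1), (-1, 1), (1, 1), (3, -1)]  # fixed, hypothetical design
prior = (-0.2, 0.6)
deviations = [step / 4.0 for step in range(-4, 5)]   # -100% to +100% in 25% steps
for deviation, value in d_error_departures(differences, prior, 1, deviations):
    print(f"{deviation:+.2f}  {value:.4f}")
```

Plotting the second column against the first gives one line of the kind used in the robustness figures; a flatter line indicates a design that is less sensitive to misspecification of that parameter.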

Case studies

Based on the considerations specified in section “Considerations in SC experimental designs,” we designed three cases that vary in the number of alternatives, the number of attributes, the attribute levels, the number of runs, and the model form. In this article, UDs are constructed with the DPS software (Version 7.05). In all cases, we optimized the UDs using the mixed attribute level UD function with a limit of 2000 iterations or 10 min of run time. ORDs and DEDs are constructed in the Ngene software (Version 1.1.2), which is also used to calculate the D-error values. For ORDs, the sequential method is used to ensure orthogonality across alternatives; in the construction of DEDs, Gaussian quadrature with five abscissas per parameter is used to take draws from the distributions of the random parameters.

Case study 1

Case 1 is designed to compare the efficiency of ORD, UD, and DED methods for MNL model in the scenario where respondents face two alternatives. The utility of each alternative is specified as follows

\begin{matrix} U_{1} = β_{1} x_{11} + β_{2} x_{12} + β_{3} x_{13} + ε_{1} \\ U_{2} = β_{1} x_{21} + β_{2} x_{22} + β_{3} x_{23} + ε_{2} \end{matrix}

(20)

where $x_{ji}$ are all generic attributes. $x_{j 1} \in \{1, 2, 3, 4\}$, while $x_{j 2}$ and $x_{j 3}$ take values in the set $\{1, 2\}$. $β_{i}$ are the parameters to be estimated. $ε_{1}$ and $ε_{2}$ are independent error components following a type I extreme value distribution. We assume that the parameters in the utility functions are fixed, with prior values $β_{1} = - 0.2$, $β_{2} = 0.6$, and $β_{3} = 0.2$.

The D-error and normalized D-error values are presented in Figure 1. As introduced in section “DED method,” we denote the DED for the MNL model as DpMNL. ND-errors are normalized to the design with 20 runs within each design method. Figure 1 shows that for all three design methods, the D-error decreases as the number of runs increases. For example, when the number of runs in DpMNL rises from 8 to 32, the D-error decreases from 0.351 to 0.096. This is likely because, when only a single respondent is considered, more runs provide more information for parameter estimation, so that lower standard errors and hence smaller D-errors are achieved. Meanwhile, the DpMNL line stays below the UD and ORD lines throughout the charts, which suggests that designs constructed with the DpMNL method have smaller D-errors than those constructed with the UD and ORD methods. Taking the designs with 16 runs as an example, the D-error of DpMNL is 0.18, which is 51.7% of that of UD and 52.5% of that of ORD. Moreover, the UD and ORD lines nearly overlap. This is consistent with the finding of D Wang and P Li⁷ that the UD and ORD methods have comparable efficiency when used with the MNL model. In addition, within each of the three design methods, the ND-error differs little as the number of runs varies. This implies that for the MNL model, the number of runs has little effect on design efficiency. Analysts may therefore select any feasible number of runs that imposes an acceptable burden when the MNL model is implemented.

Figure 1.

Criteria comparisons of different design methods with different runs in case 1.

Case study 2

Case 2 compares the efficiency of the design methods for the MNL model in a more complex choice scenario, with four alternatives that have both generic and alternative-specific attributes. In addition, designs with 24 runs are chosen to inspect the effect of misspecification of prior values. The utility specification for each alternative in case 2 is given as follows

\begin{matrix} U_{1} = β_{0} + β_{1} x_{11} + β_{2} x_{12} + β_{3} x_{13} + ε_{1} \\ U_{2} = β_{4} + β_{1} x_{21} + β_{2} x_{22} + β_{5} x_{23} + ε_{2} \\ U_{3} = β_{6} + β_{1} x_{31} + β_{2} x_{32} + β_{7} x_{33} + ε_{3} \\ U_{4} = β_{1} x_{41} + β_{2} x_{42} + β_{8} x_{43} + ε_{4} \end{matrix}

(21)

where $x_{j 1}$ and $x_{j 2}$ are generic attributes across the four alternatives, with levels taken from the set $\{1, 2, 3, 4\}$. $x_{j 3}$ are alternative-specific attributes with two levels from the set $\{1, 2\}$. $β_{i}$ are the parameters to be estimated, and $ε_{j}$ are independent error components following a type I extreme value distribution.

The true parameter values are assumed to be fixed as follows: for the generic attributes, $β_{1} = - 0.2$ and $β_{2} = 0.6$; for the alternative-specific attributes, $β_{3} = 0.8$, $β_{5} = 0.6$, $β_{7} = 0.4$, and $β_{8} = 0.2$; and for the constants of alternatives 1–3, $β_{0} = 0.2$, $β_{4} = 0.4$, and $β_{6} = 0.6$.

The D-error and ND-error values are presented in Figure 2. ND-errors are normalized to the design with 24 runs within each design method. The effect of misspecification of prior values is shown in Figure 3.

Figure 2.

Criteria comparison of different design methods with different runs in case 2.

Figure 3.

D-error departures with prior value misspecification in parameters in MNL in case 2.

Design efficiency comparisons

In general, the design efficiency comparisons in case 2 mirror the pattern in case 1. Moreover, with the more complex design scenario, Figure 2 shows that the minimum number of runs available to the UD and DED methods is 12, while the minimum for the ORD method is 24. This implies that UD and DED are more flexible than ORD in the choice of the number of runs. In Figure 2(2), the ND-errors mirror the finding in case 1: within each of the three design methods, the ND-error remains nearly constant as the number of runs varies. Together with the available numbers of runs of the UD and ORD methods noted above, this indicates that for the MNL model UD may be a potential substitute for ORD, because it is more flexible in the choice of runs while having ND-error values comparable with those of ORD. For example, UD16 places a smaller burden on respondents than ORD24, with 16 questions per questionnaire rather than 24. At the same time, the ND-errors of UD16 and ORD24 are 0.438 and 0.422, respectively, which are very close.

Prior value misspecification

Figure 3 plots the departures of the D-error values of the designs with 24 runs in case 2 when prior value misspecification occurs in each parameter. We determine the D-error values when each true parameter independently deviates between −100% and +100% of its prior value. The center of the x axis therefore corresponds to the original parameter values used for design construction. The vertical axes show the D-error values of the different design methods as the parameter values vary.

In Figure 3, the DpMNL line retains the smallest D-error values as the parameter values vary. The lines in each panel show similar rates of change as the parameter values depart from the priors assumed in design construction. This suggests that, in this case, the UD, ORD, and DpMNL methods have comparable robustness against misspecification of parameter values. The curvature of the lines, however, differs among the panels. Taking panels 2 and 4 as examples, although the original parameter values of the two panels are both 0.6, the lines in panel 4 are flatter than those in panel 2. This suggests that attributes may differ in their sensitivity to parameter value misspecification.

Case study 3

Case 3 compares the efficiency of designs for the MMNL model under the same choice scenario as case 2 but with different parameter specifications. Specifically, we assume that $β_{1}$ and $β_{2}$ follow the distributions $β_{1} \sim N (- 0.2, 0.2)$ and $β_{2} \sim N (0.6, 0.3)$, whose means equal the corresponding fixed values in case 2. The remaining parameters are specified as in case 2. The DpMNL designs from case 2 are applied to the MMNL model in case 3 to inspect the effect of model form on design efficiency and robustness. The D-error and normalized D-error values are presented in Figure 4. As introduced in section “DED method,” we denote the DED for the MMNL model as DpMMNL. ND-errors are normalized to the design with 24 runs within each design method. Designs with 24 runs are also chosen to inspect the effects of misspecification of prior values and of the prior distribution, presented in Figure 5 and Table 1, respectively.

Figure 4.

Criteria comparison of different design methods with different runs in case 3.

Figure 5.

D-error departures with prior value misspecification in parameters in MMNL in case 3.

Table 1.

D-error values with prior distribution misspecification in MMNL model in case 3.

Design method	D-error before	D-error after	Percent change (%)
ORD	0.62	1.07	72.58
UD	0.65	1.09	67.69
DpMNL	0.64	1.24	93.75
DpMMNL	0.48	0.88	83.33

ORD: orthogonal design; UD: uniform design; MNL: multinomial logit; MMNL: mixed multinomial logit.
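The percent changes in Table 1 follow directly from the D-errors before and after the distribution is changed. A short check in Python, using only the values in the table (the variable and function names are illustrative):

```python
table_1 = {
    "ORD": (0.62, 1.07),
    "UD": (0.65, 1.09),
    "DpMNL": (0.64, 1.24),
    "DpMMNL": (0.48, 0.88),
}


def percent_change(before, after):
    """Relative increase of the D-error, in percent."""
    return 100.0 * (after - before) / before


changes = {name: round(percent_change(b, a), 2) for name, (b, a) in table_1.items()}
print(changes)
```

A smaller percent change indicates a design whose D-error is less affected by the change of distribution form.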

Design efficiency comparisons

In Figure 4, the designs constructed by DpMMNL are the most efficient, with the lowest D-error at every number of runs. This is consistent with the outcomes of cases 1 and 2, in which the DED dedicated to a specific model performs better than the other designs. Meanwhile, the DpMNL, UD, and ORD lines nearly overlap one another in Figure 4(1) and (2). This implies that the SC model for which a design is constructed matters. In other words, efficiency is lost when a DED constructed for one specific SC model is used with another. Moreover, as shown in Figure 4(2), the ND-error values fluctuate more than in cases 1 and 2 as the number of runs varies. This may result from the draws required by the MMNL model. However, as the fluctuations are relatively small, we may still say that the number of runs has little effect on the design efficiency of the design methods in the MMNL model. Thus, as in case 2, UD may be a potential substitute for ORD for the MMNL model, because it is more flexible in the choice of runs while having ND-error values comparable with those of ORD.

Prior value misspecification

Figure 5 plots the departures of the D-error values of the designs with 24 runs when prior value misspecification occurs in each parameter of the MMNL model. As in case 2, we determine the D-errors when each true parameter independently deviates between −100% and +100% of its prior value. For the parameters that follow distributions, $β_{1}$ and $β_{2}$, the departures are applied to their means.

As Figure 5 shows, apart from line DpMNL, the lines in each panel have similar rates of change as the parameter values depart from the prior values assumed in design construction. This suggests that, in this case, the UD, ORD, and DpMMNL methods have comparable levels of robustness against misspecification of prior values when the MMNL model is considered. In addition, panels 3 to 5 of Figure 5 show “∩”-shaped lines for the D-error values of DpMNL. The peaks of these “∩”-shaped lines lie close to the center of the horizontal axis. This suggests that the original prior value is actually the worst-case prior setting for the DpMNL design in the MMNL model, which may explain the efficiency loss when DpMNL is used in the MMNL model.

Prior distribution misspecification

To analyze the effect of misspecification of the prior distribution, we keep the values of the fixed parameters as assumed in design construction, while the parameters that follow distributions are given misspecified distributional forms with the same means and standard deviations. Specifically, we change $β_{1} ~ N (- 0.2, 0.2)$ and $β_{2} ~ N (0.6, 0.3)$ into $β_{1} ~ U (- 1.0, 0.6)$ and $β_{2} ~ U (- 0.35, 1.55)$. The simulation outcome is shown in Table 1.
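The moment matching behind this substitution can be checked with a short calculation. The sketch below assumes that the second argument of $N(·, ·)$ is a variance; the text does not state the parameterization, so this reading is an assumption, and the function names are hypothetical. For a uniform distribution $U(a, b)$, the mean is $(a + b)/2$ and the variance is $(b − a)^{2}/12$.

```python
import math

def uniform_moments(a, b):
    """Mean and variance of a uniform distribution on [a, b]."""
    return (a + b) / 2.0, (b - a) ** 2 / 12.0

def matching_uniform(mean, variance):
    """Bounds of the uniform distribution with the given mean and variance."""
    half_width = math.sqrt(3.0 * variance)
    return mean - half_width, mean + half_width

# Uniform priors used in the misspecification experiment.
beta1_mean, beta1_var = uniform_moments(-1.0, 0.6)    # mean -0.2, variance about 0.213
beta2_mean, beta2_var = uniform_moments(-0.35, 1.55)  # mean 0.6, variance about 0.301

# Exact bounds implied by N(0.6, 0.3) if 0.3 is read as a variance.
beta2_bounds = matching_uniform(0.6, 0.3)
```

Under this reading, the uniform bounds reproduce the normal means exactly and the variances approximately (about 0.213 against 0.2 for $β_{1}$, and about 0.301 against 0.3 for $β_{2}$).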

As Table 1 shows, DpMMNL still has the smallest D-error value, despite its poor D-error robustness (worse than UD and ORD), with a percent change of 83.33%. This may be because the information on the mean and standard deviation of the parameter distribution still enables DED to be the most efficient design among the three methods. Meanwhile, the percent change values show that UD and ORD are more robust to misspecification of the prior parameter distribution than the other two design methods. This may result from the fact that the construction of UD and ORD is not linked directly to the model form. Moreover, the similar percent changes of ORD and UD indicate that the two have comparable robustness to misspecification of the parameter distribution. In addition, DpMNL becomes the least efficient design, with the largest D-error value, and also the least robust, with the largest percent change of 93.75%. In short, prior distribution misspecifications have a greater effect on the D-error of designs in MMNL models.
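The percent changes in Table 1 follow directly from the before and after D-error values, as the following short check illustrates. The values are copied from Table 1; the variable and function names are illustrative only.

```python
# D-error values (before, after) taken from Table 1.
table_1 = {
    "ORD": (0.62, 1.07),
    "UD": (0.65, 1.09),
    "DpMNL": (0.64, 1.24),
    "DpMMNL": (0.48, 0.88),
}

def percent_change(before, after):
    """Relative increase of the D-error, in percent."""
    return 100.0 * (after - before) / before

changes = {name: round(percent_change(before, after), 2)
           for name, (before, after) in table_1.items()}

# Most efficient design after misspecification (smallest D-error).
most_efficient = min(table_1, key=lambda name: table_1[name][1])
# Most robust design (smallest percent change).
most_robust = min(changes, key=changes.get)

for name, change in changes.items():
    print(f"{name}: {change:.2f}%")
```

Ranking by the after value and by the percent change separately makes the distinction in the text explicit: the design with the lowest D-error is not the one whose D-error changes least.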

Results and discussion

We have examined the efficiency and robustness of the ORD, UD, and DED methods through three case studies that vary in the complexity of the choice scenarios, the model forms, and the number of runs. ND-error values and the departures of D-error values are used to evaluate the efficiency and robustness of experimental designs, respectively.

The comparisons of design efficiency show the following. (a) As the number of runs varies, line DED (line DpMNL in Figures 1 and 2 and line DpMMNL in Figure 4) stays below lines UD and ORD in the panels. This suggests that the DED method outperforms UD and ORD in design efficiency when prior information is available. (b) The overlap of lines UD and ORD in Figures 1 and 2 is consistent with the finding of D Wang and P Li⁷ that uniform design is comparable to orthogonal design in design efficiency for the MNL model. The overlap of lines UD and ORD in Figure 4 extends this finding to the MMNL model. Thus, we may say that UD and ORD are comparable in design efficiency in both MNL and MMNL models. (c) The lines in Figures 1(2), 2(2), and 4(2) remain relatively flat as the number of runs varies. This indicates that the number of runs has little effect on the efficiency of the UD, ORD, and DED methods for MNL and MMNL models. (d) We designed case 3 with the same choice scenario as case 2 so that the designs constructed by DED in case 2 could be implemented in case 3, in order to inspect the effect of model form on the design efficiency of DED methods. In Figure 4, DpMNL performs worse than DpMMNL and overlaps with UD and ORD, which suggests that a DED should be applied to its target SC model; otherwise, there will be an efficiency loss. This may result from the fact that DEDs are generated by minimizing the D-error as an objective function, and the D-error is itself a function of the SC model. Once the model form deviates, efficiency loss occurs.

The comparisons of robustness to prior misspecification show the following. (a) Within each panel of Figures 3 and 5, the lines have similar rates of change in D-error values as the parameter values depart from the original values assumed in design construction. This indicates that the UD, ORD, and DED methods have comparable levels of robustness against misspecification of prior values in MNL and MMNL models. (b) For the MMNL model, the percent change of the D-error values is larger when the misspecification lies in the prior distribution than when it lies in the prior values. This suggests that prior distribution misspecifications have a much stronger effect on designs in MMNL models.

Conclusion

The high cost of SC surveys has motivated analysts to find better ways to construct SC experimental designs that provide better parameter estimates with a smaller or comparable number of observations. This article compares the efficiency and robustness of the ORD, UD, and DED methods to address the issue of how to choose an experimental design method for MNL and MMNL models. We find that, in terms of design efficiency, the DED method outperforms UD and ORD provided that parameter priors are available, while the latter two perform comparably. Meanwhile, the SC model form matters: there is an efficiency loss when a DED constructed for one specific model is implemented in another. Regarding the robustness of the design methods, all three methods have comparable robustness against misspecifications of parameter prior values. However, misspecifications of parameter prior distributions have a larger effect on the DED method in MMNL models than misspecifications of parameter prior values. This suggests that extra care should be taken when prior distributions are assumed in the construction of a DED for the MMNL model. Moreover, the number of runs seems to have little effect on the efficiency of designs, so analysts may choose it according to the acceptable burden on respondents. Finally, the UD method is recommended when prior information is not available, because it offers more flexibility in the choice of the number of runs while having D-error values comparable to those of the ORD method in both MNL and MMNL models.

However, this article has a number of limitations. First, the conclusions above should be treated cautiously; more numerical and empirical cases are needed to verify the results further. Second, in future work, Monte Carlo simulations that mimic the choice process of respondents could be carried out to check goodness-of-fit criteria.

Footnotes

Academic Editor: Yongjun Shen

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the Specialized Research Fund for the Doctoral Program of Higher Education (no. 20130184110020), Technological Research and Development Program of China Railway Corporation (no. 2015G002-N and no. 2014X006-A), Outstanding Innovation Talents Program of Southwest Jiaotong University (no. SWJTU-R-[2014]-1), as well as the Technological Research and Development Program of China Eryuan Engineering Group (no. KYY2015026).

References

1. Swait, Louviere, Hensher. Stated choice methods: analysis and applications. Cambridge: Cambridge University Press, 2000.

2. Rose, Bliemer MCJ. Constructing efficient stated choice experimental. Transport Rev 2009; 29: 587–617.

3. Fang K-T, Lin DKJ, Winker. Uniform design: theory and application. Technometrics 2000; 42: 237–248.

4. Fang K-T. The uniform design: application of number-theoretic methods in experimental design. Acta Math Appl Sin: E 1980; 3: 363–372.

5. Fang K-T, Lin DKJ. Uniform experimental designs and their applications in industry. Handbook Stat 2003; 22: 131–170.

6. Wang. Handling large numbers of attributes and/or levels in conjoint experiments. J Geogr Anal 2002; 34: 350–362.

7. Wang. Does uniform design really work in stated choice modeling? A simulation study. Transportmetrica 2005; 1: 209–221.

8. Wang. Numerical analysis of the statistical properties of uniform design in stated choice modeling. Transport Rev 2009; 29: 619–634.

9. Bliemer MCJ, Rose. Efficient stated choice experiments for estimating nested logit models. Transport Res B: Meth 2009; 43: 19–35.

10. Daniel. Conditional logit analysis of qualitative choice behaviour. In: Zarembka (ed.) Frontiers of econometrics. New York: Academic Press, 1974, pp.105–142.

11. Street, Burgess, Louviere JJ. Quick and easy choice sets: constructing optimal and nearly optimal stated choice experiments. Int J Res Mark 2005; 22: 459–470.

12. Bliemer MCJ, Rose. Construction of experimental designs for mixed logit models allowing for correlation across choice observations. Transport Res B: Meth 2010; 44: 720–734.

13. Goos, Vandebroek. Comparing different sampling schemes for approximating the integrals involved in the efficient design of stated choice experiments. Transport Res B: Meth 2010; 44: 1268–1289.

14. Rose, Bliemer MCJ. Sample size requirements for stated choice experiments. Transportation 2013; 40: 1021–1041.

15. Bunch, Batsell RR. A Monte Carlo comparison of estimators for the multinomial logit model. J Marketing Res 1989; 26: 56–68.

16. Bunch, Louviere, Anderson. A comparison of experimental design strategies for multinomial logit models: the case of generic attributes. Working paper, University of California, Davis, CA, January 1996.

17. Yang, Chen, Cheng. An empirical study of parameter estimation for stated preference experimental design. Math Probl Eng 2014; 2014: 292608 (11 pp.).

18. Xia, Yang. Comparing the state-of-the-art efficient stated choice designs based on empirical analysis. Math Probl Eng 2014; 2014: 740612 (8 pp.).

19. Train KE. Discrete choice methods with simulation. Cambridge: Cambridge University Press, 2003, p.134.

20. Fred JH. A generalized discrepancy and quadrature error bound. Math Comput Am Math Soc 1998; 67: 299–322.

21. Bliemer MCJ, Rose. Efficiency and sample size requirements for stated choice experiments. In: Proceedings of the transportation research board 88th annual meeting (paper no. 09–2421), Washington, DC, 11–15 January 2009. New York: Transportation Research Board.