
Abstract

A general random effects model is proposed that allows for continuous as well as discrete distributions of the responses. Responses can be unrestricted continuous, bounded continuous, binary, ordered categorical or given in the form of counts. The distribution of the responses is not restricted to exponential families, which is a severe restriction in generalized mixed models. Generalized mixed models use fixed distributions for responses, for example the Poisson distribution in count data, which has the disadvantage of not accounting for overdispersion. By using a response function and a threshold function, the proposed mixed threshold model can account for a variety of alternative distributions that often show better fits than fixed distributions used within the generalized linear model framework. A particular strength of the model is that it provides a tool for joint modelling: responses may be of different types, some discrete and others continuous.

Keywords

Random effects models, joint modelling, count data, bounded continuous data, ordinal data

1 Introduction

Random effects models are a strong tool to model the heterogeneity of clustered responses. By postulating the existence of unobserved latent variables, the so-called random effects, which are shared by the measurements within a cluster, correlation between the measurements within clusters is introduced. The clusters or units can refer to persons in repeated measurement trials or to larger units such as schools, with measurements referring to performance scores of students.

Detailed expositions of linear mixed models, which are typically used for continuous dependent variables, are found in Hsiao (1986), Lindsey (1993) and Jones (1993). Models for binary variables and counts are often discussed within the framework of generalized mixed models, see, for example, McCulloch and Searle (2001). Random effects models for ordinal dependent variables were considered by Harville and Mee (1984), Jansen (1990), Tutz and Hennevogl (1996) and Hartzel et al. (2001). Mixed model versions for continuous bounded data in the form of rates and proportions that take values in the interval (0, 1) have been considered by Qiu et al. (2008) based on the simplex model and by Bonat et al. (2015), who propose beta distribution models. Several R packages are available to fit generalized linear mixed models, for example, glmmTMB (Brooks et al., 2017) for various continuous and discrete distributions, lme4 for continuous and binary data, and ordinal and MultOrdRS (Schauberger, 2024) for ordinal data.

Generalized mixed models within the generalized linear model framework, as well as extended approaches such as the models for bounded continuous data, postulate familiar fixed distributions for the responses. This is different in the approach proposed here, although there is some overlap with generalized linear models. The mixed threshold model used here gains its flexibility concerning distributional assumptions by using two components, a response function, which is a distribution function, and a threshold function that modifies the distribution. In the simplest case, by assuming a linear threshold function, the distribution of the responses follows the response function. Thus, familiar linear Gaussian response models are obtained but also linear models with quite different, possibly skewed distribution functions are available. Distributions with a restricted support, for example if responses are observed in an interval or are positive only, are obtained by using non-linear threshold functions. They are also useful when modelling discrete data, which typically are restricted to a specific range: for example, count data take only the values 0, 1, …, and ordered categorical responses can be coded by 0, 1, …, k, where the numbers only represent the order of values. In the case of binary and ordered categorical responses, the cumulative generalized mixed model is a special case of the proposed threshold model.

More concretely, let $y_{i 1}, \dots, y_{i m}$ denote the observations on unit i (i = 1, …, n), which can be continuous or discrete. In addition, let x_ij, z_ij denote covariates associated with response y_ij. Then, the mixed threshold model (MTM) has the form

P (Y_{i j} > y ∣ b_{i}, z_{i j}, x_{i j}) = F (z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j} - δ_{j} (y)),

(1.1)

where F(.) is a strictly increasing distribution function and δ_j(.) is a non-decreasing measurement-specific function defined on the support S of the dependent variables, referred to as threshold function. The function δ_j(.) is a function to be specified whose parameters are estimated jointly with the other model parameters. In addition, it is assumed that the observations $y_{i 1}, \dots, y_{i m}$ are conditionally independent given b_i, z_ij, x_ij. The form of the model is familiar from generalized linear mixed models for ordinal responses, where F⁻¹(.) is considered the link function and δ_j(y) represents the category-specific intercepts, see, for example, Tutz (2012). However, the mixed proportional odds model is just one special case of this much wider class of models.

The predictor $η_{i j} = z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j} - δ_{j} (y)$ contains two components linked to explanatory variables.

The term $z_{i j}^{T} b_{i}$ contains the cluster-specific effects b_i. They are assumed to vary independently across clusters and are assumed to follow a specific distribution, typically the normal distribution, $b_{i} \sim N (0, Σ)$ , which is used throughout the paper.

The term $x_{i j}^{T} β_{j}$ contains the effects of x_ij on the dependent variable. The parameters β_j are fixed measurement-specific parameters, but can also be independent of the measurement, that is, $β_{1} = \dots = β_{m} = β .$ Typically z_ij is a subset of x_ij.
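How the shared random effect induces within-cluster correlation can be illustrated by simulation. The following is a minimal sketch, not taken from the paper: it assumes a standard normal response function F, a linear threshold function δ_j(y) = δ_0j + δ_j y, a random intercept only and no covariates. The function names and parameter values are hypothetical. It uses the fact that P(Y > y | b) = F(b − δ_0 − δ y) is equivalent to Y = (b − δ_0 − W)/δ with W distributed according to F.

```python
import random

def simulate_mtm(n_units, m, sigma_b, delta0=0.0, delta=1.0, seed=1):
    """Draw clustered responses from (1.1) with normal F and a random intercept.

    P(Y > y | b) = F(b - delta0 - delta * y) is equivalent to
    Y = (b - delta0 - W) / delta with W ~ F, which is used for sampling.
    Covariates are omitted to keep the sketch short (an assumption).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_units):
        b = rng.gauss(0.0, sigma_b)  # random effect shared within the unit
        row = [(b - delta0 - rng.gauss(0.0, 1.0)) / delta for _ in range(m)]
        data.append(row)
    return data

def correlation(u, v):
    """Plain Pearson correlation of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (c - mv) for a, c in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((c - mv) ** 2 for c in v)
    return cov / (var_u * var_v) ** 0.5

data = simulate_mtm(n_units=2000, m=2, sigma_b=1.0)
r = correlation([row[0] for row in data], [row[1] for row in data])
print(round(r, 2))  # theoretical value in this setting: sigma_b^2 / (sigma_b^2 + 1)
```

In this simplified setting the two measurements on a unit are correlated only through b_i, so the empirical correlation should lie near 0.5 for sigma_b = 1.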

The distribution of the dependent variables is crucially determined by the choice of the distribution function F(.), also referred to as response function, and the threshold function δ_j(.). Specific choices yield models that are in common use in random effects modelling. Other choices widen the toolbox, yielding models that show better fits than classical approaches. Threshold models have been considered before in the form of item threshold models (Tutz, 2022), which are latent trait models that aim at measurement and do not contain any covariates. In the item response model, it is assumed that the responses on a collection of items depend on the latent abilities or attitudes of persons and the difficulties of the items. Items are considered tools to measure the ability or attitude. In the random effects models considered here, the objective is quite different. The focus is on the effect of covariates on responses: random effects are used to account for the heterogeneity of respondents, and covariates and their effects are explicitly included. The role the response and the threshold functions play in modelling the response distribution will become obvious when considering specific choices.

The strengths of the modelling approach are in particular:

The model provides a common framework for different types of responses, which may be continuous or discrete.

If responses are continuous, the model allows for alternative distributions that can fit much better than the typically used normal response mixed model. In particular, restrictions on the responses can be adequately modelled. Responses can be bounded, that is, restricted to an interval, providing flexible alternatives to fixed distribution approaches such as the beta distribution. Specific choices of the threshold functions provide alternative models for positive-valued responses.

Within the model framework, discrete responses can be finite or have infinite support. In the case of binary or ordered categorical data, familiar models such as the random effects version of the proportional odds model are special cases. In the case of count data, the model is an alternative to fixed distribution models such as the Poisson model.

The common framework allows software to be developed that fits all the models that are included.

The model does not assume that the form of the distribution is the same across measurements. In general, the type of the distribution can be specified in a measurement-specific way, allowing in particular for the joint modelling of discrete and continuous distributions.

In Section 2 the case of continuous responses is considered with linear and non-linear threshold functions. Section 3 is devoted to discrete data with infinite and finite support. Joint modelling, which allows for different types of responses, in particular a mixture of continuous and discrete responses, is considered in Section 4. Marginal likelihood estimation methods are given in Section 5. Several small examples are used to demonstrate the versatility of the approach. They are meant for illustration, and no in-depth investigation of effects is given.

2 Continuous dependent variables

We start with models that contain linear threshold functions. The model class comprises the classical normal response model but allows for alternative distributions. Then we consider models for restricted support, which call for non-linear threshold functions.

2.1 Linear random effects models for Gaussian data and other distributions

Let the dependent variables be continuous with support $ℝ$ . Then a threshold function that is simple but already yields very flexible models is the linear threshold function

δ_{j} (y) = δ_{0 j} + δ_{j} y, δ_{j} > 0 .

Let F(.) denote a fixed, typically standardized, distribution function with support $ℝ$ , for example, the standardized normal distribution function. Then, the means $μ_{i j} = E (Y_{i j})$ and variances $σ_{i j}^{2} = var (Y_{i j})$ of dependent variables have a very simple form

μ_{i j} = \frac{1}{δ_{j}} (β_{0 j} + z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j}), σ_{i j}^{2} = \frac{σ_{F}^{2}}{δ_{j}^{2}},

(2.1)

where $β_{0 j} = - μ_{F} - δ_{0 j}$ , and $μ_{F}, σ_{F}^{2}$ are constants that are determined by the distribution function F(.). More concretely, $μ_{F} = \int y f (y) d y$ is the expectation corresponding to distribution function F(.) and $σ_{F}^{2} = {var}_{F} = \int {(y - μ_{F})}^{2} f (y) d y$ the corresponding variance; for a proof see Proposition 1. It should be noted that the identifiability of parameters depends on the data structure. If x_ij does not depend on the measurement, that is, $x_{i j} = x_{i}$ for all j, measurement-specific parameters β_j are not identifiable and parameters have to be chosen as global parameters by assuming $β_{j} = β$ for all j.
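The mean and variance formulas in (2.1) can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: F is taken to be the Gompertz distribution function F(u) = 1 − exp(−exp(u)), for which μ_F = −γ (Euler's constant) and σ_F² = π²/6, and the predictor value and threshold parameters are hypothetical. The density of Y is the derivative of 1 − F(η − δ_0 − δ y), and its moments are obtained by the midpoint rule.

```python
import math

EULER_GAMMA = 0.5772156649015329

def gompertz_pdf(u):
    """Density belonging to F(u) = 1 - exp(-exp(u))."""
    return math.exp(u - math.exp(u))

def moments_by_integration(eta, delta0, delta, n=50_000):
    """Mean and variance of Y for a linear threshold, by the midpoint rule.

    The density of Y is f(eta - delta0 - delta * y) * delta. The integration
    range is chosen so that the argument of f runs from 6 down to -40,
    outside of which the Gompertz density is numerically negligible.
    """
    y_lo = (eta - delta0 - 6.0) / delta
    y_hi = (eta - delta0 + 40.0) / delta
    h = (y_hi - y_lo) / n
    m0 = m1 = m2 = 0.0
    for k in range(n):
        y = y_lo + (k + 0.5) * h
        dens = gompertz_pdf(eta - delta0 - delta * y) * delta
        m0 += dens * h
        m1 += y * dens * h
        m2 += y * y * dens * h
    mean = m1 / m0
    return mean, m2 / m0 - mean ** 2

# Hypothetical values: eta stands for z'b + x'beta of one observation.
eta, delta0, delta = 1.5, 0.3, 2.0
mean_num, var_num = moments_by_integration(eta, delta0, delta)

mu_F, var_F = -EULER_GAMMA, math.pi ** 2 / 6
beta0 = -mu_F - delta0                    # beta_0j as defined below (2.1)
mean_formula = (beta0 + eta) / delta
var_formula = var_F / delta ** 2
print(mean_num, mean_formula)
print(var_num, var_formula)
```

The numerical and closed-form values should agree up to integration error, which illustrates that the mean stays linear in the predictor even for a skewed response function.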

It is seen from equation (2.1) that the means of dependent variables are simple linear functions of z_ij, x_ij and the variances vary across measurements. It is a linear model but not necessarily for Gaussian data. Responses can take any strictly increasing distribution function. The model also parameterizes the mean as a linear function of covariates if responses follow a skewed distribution, which might be more appropriate in applications.

The distribution of responses is easily derived since the density f_ij(.) of variable Y_ij is given by

f_{i j} (y) = f (z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j} - δ_{0 j} - δ_{j} y) δ_{j},

(2.2)

where f(.) denotes the density linked to F(.), $f (y) = \partial F (y) / \partial y$ . It means, in particular, that for symmetric distribution functions F(.), the distributions of all dependent variables are scaled and shifted versions of the distribution specified by F(.). If F(.) is not symmetric the distributions of all dependent variables are scaled and shifted versions of the distribution function $\tilde{F} (y) = 1 - F (- y)$ . This is easily seen by considering the distribution function of Y_ij, which is given by $P (Y_{i j} \leq y | b_{i}, z_{i j}, x_{i j}) = 1 - P (Y_{i j} > y | b_{i}, z_{i j}, x_{i j})$ .
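The reflection $\tilde{F} (y) = 1 - F (- y)$ can be made concrete with a small check. The sketch below assumes a linear threshold function and uses the Gompertz distribution function as F; the parameter values are hypothetical. It confirms that the resulting distribution function of Y is a shifted and scaled Gumbel distribution function with location (η − δ_0)/δ and scale 1/δ.

```python
import math

def gompertz_cdf(u):
    return 1.0 - math.exp(-math.exp(u))

def gumbel_cdf(u):
    return math.exp(-math.exp(-u))

def response_cdf(y, eta, delta0, delta, F):
    """P(Y <= y | b) = 1 - F(eta - delta0 - delta * y) for a linear threshold."""
    return 1.0 - F(eta - delta0 - delta * y)

# Hypothetical values; eta stands for z'b + x'beta.
eta, delta0, delta = 0.7, 0.2, 1.3
loc, scale = (eta - delta0) / delta, 1.0 / delta

for y in (-1.0, 0.0, 0.5, 2.0):
    lhs = response_cdf(y, eta, delta0, delta, gompertz_cdf)
    rhs = gumbel_cdf((y - loc) / scale)
    print(y, lhs, rhs)   # the two columns should coincide up to rounding
```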

A simplified version is the homogeneous MTM, in which δ_j does not depend on the measurement, that is, $δ_{1} = \dots = δ_{m} = δ$ . Then, the variance is the same for all observations and the mean and variance have the simple form

μ_{i j} = {\tilde{β}}_{0 j} + z_{i j}^{T} {\tilde{b}}_{i} + x_{i j}^{T} {\tilde{β}}_{j}, σ_{i j}^{2} = {\tilde{σ}}_{F}^{2},

(2.3)

where ${\tilde{β}}_{0 j}, {\tilde{β}}_{j}$ are the original parameters divided by δ, ${\tilde{b}}_{i}$ is the original random effect divided by δ, and ${\tilde{σ}}_{F}^{2} = σ_{F}^{2} / δ^{2}$ .

A special case of the MTM is the classical linear random effects model with normally distributed dependent variables. Let F(.) denote the standardized normal distribution function and assume that dispersion homogeneity holds $(δ_{1} = \dots = δ_{m} = δ)$ . Then, the dependent variables are normally distributed with means and variance given by (2.3). A more familiar representation of the model is the vector-valued representation

y_{i} = β_{0} + X_{i} β + Z_{i} b_{i} + ε_{i}, b_{i} \sim N (0, Σ), ε_{i} \sim N (0, {\tilde{σ}}_{F}^{2} I)

where $y_{i}^{T} = (y_{i 1}, \dots, y_{i m})$ , the matrices X_i, Z_i are composed from the vectors x_ij, z_ij and I denotes the unit matrix.

Table 1

Parameter estimates for rent data (standard errors given in brackets).

Response distribution	Floor	Rooms	Age	Std deviation random effects	Log-lik
Normal	0.072	−0.504	−0.010	0.233	−3380.422
	(0.003)	(0.081)	(0.002)
Gumbel	0.060	−0.302	−0.015	0.320	−3339.328
	(0.003)	(0.073)	(0.002)

The more general model with varying dispersion parameters δ_j is an extension of this classical model. It is more flexible and can be more appropriate, in particular when repeated measurements on a unit are time-dependent and the variance changes over time.

Within the MTM framework there is no need to assume that dependent variables are normally distributed. Any strictly monotone distribution function F(.) can be used in the model. With a linear threshold function δ_j(.) one obtains a linear form of the expectation and simple terms for the variance. In particular, skewed distributions can be used. This extends the usual normal distribution approach to modelling clustered data to a wider class of models with a simple link between covariates and measurements.

Rent data

For illustration we use the Munich rent index data. The variables are rent (monthly rent in Euros), floor (floor space), rooms (number of rooms) and age in years. There are 25 districts (residential areas) which have an effect on rents and are modelled as random effects. For an extensive description of the rent data, see Fahrmeir et al. (2011). The rent is the dependent variable. Floor space, number of rooms and age are explanatory variables, which are assumed to have global effects, that is, $β_{1} = \dots = β_{m} = β$ .

Monthly rents tend to have a right-skewed distribution since there are typically some houses that are much more expensive than the average house. Thus, the normal distribution might not be the best choice for modelling this kind of data. A candidate for a right-skewed distribution is the Gumbel distribution $F (y) = \exp (- \exp (- y))$ , which is the distribution of responses if one chooses the Gompertz distribution $F (y) = 1 - \exp (- \exp (y))$ as response function F(.). As the log-likelihoods in Table 1 show, the assumption of the Gumbel distribution for responses (F(.) is chosen as the Gompertz distribution) yields a better fit. The Gompertz distribution as a left-skewed distribution for responses (if F(.) is chosen as the Gumbel distribution function) yields a much worse fit and is not shown.

2.2 Random effects models for positive-valued variables

In many applications the dependent variable can take only positive values, for example if responses are response times. Although often used, the assumption of a normal distribution or any other distribution with support $ℝ$ is not warranted and will only yield a crude approximation to the true distribution.

In the MTM the support of the dependent variable can be restricted by using an appropriate threshold function δ_j(.). If the threshold function is chosen such that $\lim_{y \to 0} δ_{j} (y) = - \infty$ holds, the responses automatically have positive values, $y \geq 0$ . One candidate that can be chosen is the logarithmic threshold function

δ_{j} (y) = δ_{0 j} + δ_{j} \log (y) .

Threshold functions of this type combine linearity with a transformation function. The general form of threshold functions of this type, which are used throughout the paper, is

δ_{j} (y) = δ_{0 j} + δ_{j} g (y),

(2.4)

where g(.) is a non-decreasing function. Threshold functions of this form are simply named after the transformation function g(.).

If the threshold function is logarithmic a familiar distribution is found if F(.) is chosen as the standard normal distribution function. Then, one can derive that the density of y_ij denoted by f_ij(.) is given by

f_{i j} (y) = \frac{1}{{\bar{σ}}_{j} y \sqrt{2 π}} \exp (\frac{- {(\log (y) - {\tilde{μ}}_{i j})}^{2}}{2 {\bar{σ}}_{j}^{2}}),

(2.5)

where ${\tilde{μ}}_{i j} = (z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j} - δ_{0 j}) / δ_{j}, {\bar{σ}}_{j} = 1 / δ_{j}$ . This is the lognormal distribution with parameters ${\tilde{μ}}_{i j}, {\bar{σ}}_{j}$ . Thus, the logarithmic threshold function generates a random effects model in which dependent variables follow a lognormal distribution.
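The link to the lognormal distribution can be verified numerically. The sketch below, with hypothetical parameter values, differentiates 1 − Φ(η − δ_0 − δ log y) to obtain the density implied by the threshold model and compares it with the standard lognormal density with log-scale mean (η − δ_0)/δ and log-scale standard deviation 1/δ.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def threshold_density(y, eta, delta0, delta):
    """Density implied by P(Y > y | b) = Phi(eta - delta0 - delta * log(y))."""
    return STD_NORMAL.pdf(eta - delta0 - delta * math.log(y)) * delta / y

def lognormal_density(y, mu, sigma):
    """Standard lognormal density with log-scale parameters mu and sigma."""
    z = (math.log(y) - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * y * math.sqrt(2.0 * math.pi))

# Hypothetical values; eta stands for z'b + x'beta.
eta, delta0, delta = 0.4, -10.0, 2.0
mu, sigma = (eta - delta0) / delta, 1.0 / delta

for y in (50.0, 150.0, 400.0):
    print(y, threshold_density(y, eta, delta0, delta), lognormal_density(y, mu, sigma))
```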

Sleep data

In a sleep deprivation study, the average reaction time per day for subjects has been measured (data set sleepstudy from package lme4). On day 1, the subjects had their normal amount of sleep. Starting that night, they were restricted to 3 hours of sleep per night. The observations represent the average reaction time on a series of tests given each day to each subject. The 10 days represent the repeated measurements on 18 persons.

Instead of using a normal distribution model with linear effect of days a threshold model with normal response function and logarithmic threshold function is used. In the model

P (Y_{i j} > y ∣ b_{i}, z_{i j}, x_{i j}) = F (b_{i} - δ_{0 j} - δ_{j} \log (y)),

the random effect b_i refers to the person and δ_0j accounts for the effect of days. It is not assumed that the mean is a linear function of days, as is common in typical random effects models. Instead, the basic variation of responses over repeated measurements is captured in the parameters δ_0j. The parameters δ_j account for possible heterogeneity of variances. The log(y) function, which makes the dependent variables follow a log-normal distribution, shows slightly better fit than the common normal distribution model. The log-likelihood was −872.96 with the log(y) function and −875.50 for the identity function (Gaussian distribution). Figure 1 shows the fitted densities for days 1, 3, 5, and 9 for b_i = 0 and b_i = 1. The latter value is approximately the same as the estimated standard deviation of the random coefficients, which was 1.08. It is seen that the distributions have quite different forms and variances vary across days. The mean reaction time as well as the variances increase over days of sleep deprivation.

Figure 1

Reaction time for days 1, 3, 5, and 9 of sleep deprivation for b_i = 0 (left) and b_i = 1 (right).

2.3 Random effects models for continuous bounded data

Various regression models have been proposed for continuous bounded data in the form of rates and proportions that take values in the interval (0, 1), see, for example, Kieschnick and McCullough (2003) and Bonat et al. (2019). Also mixed model versions for repeated measurement have been developed, in particular the simplex mixed model (Qiu et al., 2008) and beta mixed models (Bonat et al., 2015).

Let us more generally consider the case where $Y_{i j} \in (a, b)$ . The restriction of responses to the interval is obtained within the threshold model framework by choosing threshold functions for which $\lim_{y \to a} δ_{j} (y) = - \infty$ and $\lim_{y \to b} δ_{j} (y) = \infty$ hold since then $P (Y_{i j} > a) \to 1$ and $P (Y_{i j} > b) \to 0$ . A threshold function that meets these demands is, for example, the logit threshold function

δ_{j} (y) = δ_{0 j} + δ_{j} \log ((y - a) / (b - y)) .

Instead of $g (y) = \log ((y - a) / (b - y))$ , any inverse distribution function can be used. The logistic distribution function is just one option, which yields the logit function.
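The effect of the logit threshold function on the support can be sketched in a few lines. The example below is an illustration, not the authors' implementation; it assumes a standard normal response function and, as hypothetical values, the interval (0, 10) with δ_0j = 0 and δ_j = 1. It evaluates P(Y > y) near both boundaries and provides the implied density, obtained by differentiating 1 − P(Y > y).

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def logit_threshold(y, a, b, delta0=0.0, delta=1.0):
    """delta_j(y) = delta0 + delta * log((y - a) / (b - y)) for a < y < b."""
    return delta0 + delta * math.log((y - a) / (b - y))

def survival(y, eta, a, b, delta0=0.0, delta=1.0):
    """P(Y > y | b_i) = Phi(eta - delta_j(y))."""
    return STD_NORMAL.cdf(eta - logit_threshold(y, a, b, delta0, delta))

def density(y, eta, a, b, delta0=0.0, delta=1.0):
    """Derivative of 1 - survival: phi(eta - delta_j(y)) * delta_j'(y)."""
    slope = delta * (b - a) / ((y - a) * (b - y))
    return STD_NORMAL.pdf(eta - logit_threshold(y, a, b, delta0, delta)) * slope

a, b = 0.0, 10.0
print(survival(1e-9, 0.0, a, b))        # close to 1 near the lower bound
print(survival(b - 1e-9, 0.0, a, b))    # close to 0 near the upper bound
print(density(5.0, 0.0, a, b), density(5.0, 1.0, a, b))
```

Because the threshold function diverges at both ends of the interval, all probability mass stays inside (a, b) for any value of the predictor.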

While simple terms for means and variances of the dependent variable can be found only for simple threshold functions such as the linear one, it is straightforward to show that for any (non-decreasing) transformation function g(y) the means and variances of the transformed variables g(Y_ij) are given by

E (g (Y_{i j})) = \frac{1}{δ_{j}} (β_{0 j} + z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j}), var (g (Y_{i j})) = \frac{σ_{F}^{2}}{δ_{j}^{2}},

(2.6)

see Proposition 2 for a proof. That means the mean of the transformed responses g(Y_ij) is a linear function of covariates and variances can vary over measurements.
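The idea behind this result can be sketched in one line of reasoning. The following is an informal sketch rather than the proof given in Proposition 2, and it assumes in addition that g is strictly increasing and continuous on the support, so that the inverse g⁻¹ exists:

```latex
\begin{aligned}
P\bigl(g(Y_{ij}) > t \mid b_i, z_{ij}, x_{ij}\bigr)
  &= P\bigl(Y_{ij} > g^{-1}(t) \mid b_i, z_{ij}, x_{ij}\bigr) \\
  &= F\bigl(z_{ij}^{T} b_i + x_{ij}^{T} \beta_j - \delta_{0j} - \delta_j\, g(g^{-1}(t))\bigr) \\
  &= F\bigl(z_{ij}^{T} b_i + x_{ij}^{T} \beta_j - \delta_{0j} - \delta_j t\bigr).
\end{aligned}
```

Under this assumption the transformed variable g(Y_ij) follows a threshold model with linear threshold function, so the mean and variance formulas in (2.1) carry over to g(Y_ij).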

Figure 2

Densities for predictor $η_{i j} = b_{i} + x_{i} β$ with binary predictor $x_{i}$ , $β = 1$ and $δ_{j} (y) = \log ((y - a) / (b - y))$ , a = 0, b = 10; left: b_i = 0; right: b_i = 0.5; drawn line: x_i = 0, dashed line: x_i = 1.

To illustrate the restriction to intervals generated by properly chosen threshold functions, Figure 2 shows the obtained distributions if F(.) is the normal distribution, the predictor is $η_{i j} = b_{i} + x_{i} β$ with binary predictor $x_{i}$ , $β = 1$ and $δ_{j} (y) = \log ((y - a) / (b - y)), a = 0, b = 10$ . In the left picture b_i = 0, in the right picture b_i = 0.5. The drawn line shows the density for x_i = 0, the dashed line for x_i = 1. It is seen that the logit-type threshold function ensures that the support of the distribution is (0, 10). For a larger random effect (right picture), the density becomes larger close to the upper boundary.

3 Discrete data

Discrete data come in two forms, with infinite support and finite support. We will consider first the case of infinite support and then the case where the response is in categories. In the latter, typically only an ordinal scale level is assumed for the dependent variable.

3.1 Random effects models for count data

Let the responses be counts, that is, Y_ij takes values from $\{0, 1, \dots\}$ . If responses are assumed to follow a Poisson distribution, mixed models can be formulated within the generalized linear model framework. Extended versions that are able to account for overdispersion, such as the negative binomial model, have been considered by Tempelman and Gianola (1996), Molenberghs et al. (2007) and Zhang et al. (2017). In psychometrics, the Conway–Maxwell–Poisson model has also been used, albeit without covariates (Forthmann et al., 2020).

In the mixed threshold model, the discrete density or mass function is given by

\begin{array}{l} f_{i j} (0) = 1 - P (Y_{i j} > 0) = 1 - F (z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j} - δ_{j} (0)), \\ f_{i j} (r) = P (Y_{i j} > r - 1) - P (Y_{i j} > r) \\ \quad = F (z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j} - δ_{j} (r - 1)) - F (z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j} - δ_{j} (r)), \quad r = 1, 2, \dots \end{array}

Table 2

Parameter estimates for epilepsy data (p-values given in brackets).

Response distribution	Treatment	Base	Age	Log-lik	AIC
Gumbel	−0.606	0.058	0.021	−604.678	1233.358
	(0.031)	(0.000)	(0.321)
Normal	−0.404	0.047	0.023	−623.277	1270.554
	(0.102)	(0.000)	(0.251)

A threshold function that ensures that the responses have support 0, 1, … is the shifted logarithmic threshold function

δ_{j} (y) = δ_{0 j} + δ_{j} \log (1 + y) .

The resulting model is as flexible as models that account for overdispersion. In particular, the varying slopes δ_j make it very flexible and able to account for changes in distributional shape over measurements.
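The probability mass function implied by the shifted logarithmic threshold function is easy to compute from differences of P(Y > r). The sketch below is illustrative only: the Gumbel distribution function is used as F, and the predictor value and threshold parameters are hypothetical. Because log(1 + y) grows without bound, P(Y > r) tends to zero and the masses sum to one over 0, 1, ….

```python
import math

def gumbel_cdf(u):
    return math.exp(-math.exp(-u))

def count_pmf(r, eta, delta0, delta, F=gumbel_cdf):
    """f(r) = P(Y > r - 1) - P(Y > r) with delta_j(y) = delta0 + delta * log(1 + y)."""
    def surv(y):
        return F(eta - delta0 - delta * math.log(1.0 + y))
    upper = 1.0 if r == 0 else surv(r - 1)   # f(0) = 1 - P(Y > 0)
    return upper - surv(r)

# Hypothetical values; eta stands for z'b + x'beta.
eta, delta0, delta = 3.0, 0.0, 1.5
probs = [count_pmf(r, eta, delta0, delta) for r in range(0, 501)]
print(sum(probs))                 # numerically 1: mass beyond 500 is negligible here
print([round(p, 4) for p in probs[:6]])
```

Changing the slope δ_j alters how quickly the tail decays, which is the mechanism behind the flexibility in distributional shape mentioned above.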

Epilepsy data

The response in the data set epil from R package MASS (Ripley et al., 2013) is the number of seizures in a fixed period (four periods considered). As covariates we use age, treatment (1: treatment, 0: placebo) and base (number of seizures at the beginning of the trial). The model uses the shifted logarithmic threshold function $δ_{j} (y) = δ_{0 j} + δ_{j} \log (1 + y)$ . As response function F(.) we again used the normal, the Gompertz and the Gumbel distribution. As Table 2 shows, the Gumbel distribution shows a much better fit than the normal distribution; the fit of the Gompertz distribution was much worse (not given). The table also gives the p-values for the likelihood ratio tests for single covariates. It is seen that the treatment effect, which is more pronounced in the Gumbel model, is significant at the 0.05 level when using the Gumbel response function but not when using the normal distribution. As is to be expected, the number of seizures at the beginning of the trial (base) is highly significant while age can be neglected.

In the study, the most interesting effect is the treatment effect. For illustration, Figure 3 shows the densities for periods 2 and 4 for placebo (left) and treatment (right) when using the Gumbel distribution. The other variables were fixed at b_i = 0, base = 20, age = 20; diamonds indicate period four, circles period two. It is seen that treatment distinctly reduces the number of seizures.

The threshold model fits much better than the Poisson model, which yields log-likelihood −651.8071 and AIC 1313.614. It also fits better than the more general negative binomial model, which yielded log-likelihood −609.711. AIC values of the Gumbel threshold model and the negative binomial model are comparable (AIC for negative binomial model: 1231.423). Fitting of the Poisson and the negative binomial model was done by using the R package glmmTMB (Brooks et al., 2017).
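The comparison of log-likelihoods and AIC values can be made more transparent by backing out the number of parameters each reported AIC implies. The sketch below uses only the values reported in the text and in Table 2 together with the usual definition AIC = 2p − 2 log-lik; the helper name is hypothetical, and the implied counts are a reading of the reported numbers, not figures stated by the authors.

```python
def implied_parameters(loglik, aic):
    """Solve AIC = 2 * p - 2 * loglik for the number of parameters p."""
    return (aic + 2.0 * loglik) / 2.0

# (log-likelihood, AIC) pairs as reported in the text and in Table 2.
reported = {
    "Gumbel threshold": (-604.678, 1233.358),
    "Normal threshold": (-623.277, 1270.554),
    "Poisson": (-651.8071, 1313.614),
    "Negative binomial": (-609.711, 1231.423),
}

for name, (loglik, aic) in reported.items():
    print(name, round(implied_parameters(loglik, aic), 3))
```

Read this way, the threshold models appear to use more parameters than the negative binomial model, which would explain why the Gumbel threshold model has the higher log-likelihood while the AIC values are close.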

3.2 Random effects models for ordered responses

Let Y_ij take values from {1, …, k} and assume that categories are ordered. The typical mixed model for this type of data is the cumulative mixed model considered among others by Jansen (1990) and Tutz and Hennevogl (1996) for univariate random effects. It has the form

P (Y_{i j} > r ∣ b_{i}, z_{i j}, x_{i j}) = F (β_{0 r} + z_{i j}^{T} b_{i} + x_{i j}^{T} β), r = 1, \dots, k - 1

(3.1)

with ordered intercepts $β_{0 r} \geq β_{0, r + 1}$ for all r. For random intercepts $z_{i j}^{T} b_{i} = b_{i}$ it can be fitted by using the package ordinal.

Figure 3

Densities for epileptics data with b_i = 0, base = 20, age = 20; left: placebo, right: treatment, diamonds indicate period four, circles period two.

The model is equivalent to the threshold model

P (Y_{i j} > r ∣ b_{i}, z_{i j}, x_{i j}) = F (z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j} - δ_{j} (r))

(3.2)

with the special choices $δ_{j} (r) = - β_{0 r}$ and $β_{1} = \dots = β_{m} = β$ .

However, the model (3.2) without restrictions is more general than the simple cumulative model (3.1). In the simple cumulative model, the thresholds $β_{01} \geq \dots \geq β_{0, k - 1}$ do not depend on the measurement j. It is implicitly assumed that only covariates x _ij modify the distribution of the dependent variables. This is far too restrictive in many applications. In addition, in model (3.1) the effect of covariates does not depend on the measurement $(β_{1} = \dots = β_{m} = β)$ . This is hardly realistic, in particular if dependent variables refer to different variables as in the fears data example considered in the following.

Since the range of outcomes is bounded, similar threshold functions as in the case of bounded continuous data should be used. We use the logit type function

δ_{j} (y) = δ_{0 j} + δ_{j} \log ((y - a) / (b - y))

with a < 1 but close to 1, b = k.

An advantage of using a threshold function of the form $δ_{j} (y) = δ_{0 j} + δ_{j} g (y)$ instead of letting all intercepts vary freely (apart from order restrictions) is that sparser representations are obtained. Only 2m parameters are needed instead of $m (k - 1)$ , and order restriction problems do not occur.
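The sparse parameterization can be illustrated by computing all k category probabilities from just two threshold parameters. The sketch below is not the authors' code: it assumes a logistic response function, b = k, and a hypothetical value a = 0.9 for the lower constant (the text only requires a < 1 but close to 1); the values of δ_0j, δ_j and the predictor are also hypothetical.

```python
import math

def logistic_cdf(u):
    return 1.0 / (1.0 + math.exp(-u))

def category_probs(eta, k, delta0, delta, a=0.9):
    """P(Y = r), r = 1..k, from P(Y > r) = F(eta - delta_j(r)) with b = k."""
    def surv(r):
        if r >= k:   # delta_j(k) is +infinity, hence P(Y > k) = 0
            return 0.0
        return logistic_cdf(eta - delta0 - delta * math.log((r - a) / (k - r)))
    probs = []
    upper = 1.0      # P(Y > 0) = 1 because the support starts at 1
    for r in range(1, k + 1):
        lower = surv(r)
        probs.append(upper - lower)
        upper = lower
    return probs

# Hypothetical values for a seven-category item.
probs = category_probs(eta=0.0, k=7, delta0=0.0, delta=1.5)
print([round(p, 3) for p in probs], sum(probs))
```

Since the logit-type function is increasing in r, the implied thresholds are ordered automatically whenever δ_j > 0, which is why no explicit order restrictions are needed.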

Fears data

As an illustrating example, we consider data from the German Longitudinal Election Study (GLES). The data originate from the pre-election survey for the German federal election in 2017 and are concerned with political fears. The participants were asked: ‘How afraid are you due to the …’: (1) refugee crisis? (2) global climate change? (3) international terrorism? (4) globalization? (5) use of nuclear energy? The answers were measured on a Likert scale from 1 (not afraid at all) to 7 (very afraid).

We fitted a discrete threshold model with logistic response function and logit difficulty function including the covariates gender (1: female; 0: male), standardized age in decades, EastWest (1: Eastern German states/former GDR, 0: Western German states/former FRG) and Abitur (high school degree for the admission to the university, 1: yes, 0: no). Table 3 shows the estimates of item parameters. The parameters given are the intercept and the slope of the difficulty function $δ_{j} (y) = δ_{0 j} + δ_{j} \log ((y - a) / (b - y))$ . It is seen that all items show significant covariate effects for at least one of the covariates (z-values given below estimates). Older respondents tend to be more afraid than younger respondents, in particular concerning terrorism but less concerning climate change. Females have higher fear levels than males for all items. The effects of EastWest are rather mixed: people from the Eastern parts of the country are more afraid of globalization but less afraid of nuclear energy. Higher education seems to reduce the level of fears. The necessity of covariates is also supported by testing. The likelihood ratio statistic that compares the model without covariates to the model with covariates is 88.081 on 10 df. Thus, the covariates turn out to be influential if one accounts for the heterogeneity in the population.

The model fits better than the common cumulative model (3.1), in which thresholds do not depend on j. By default the package ordinal fits models with global covariate effects, that is, $β_{1} = \dots = β_{k - 1} = β$ . By constructing appropriate design matrices, it is possible to fit variable-specific covariate effects. The corresponding model has log-likelihood −1710.76, which is much smaller than −1684.871 for the threshold model. Consequently, in terms of AIC values the threshold model fits better. The AIC value for the model with measurement-specific covariate effects but global threshold effects is 3475.52 (20 covariate effects, 6 thresholds, variance of mixing distribution). For the threshold model one obtains 3431.87 (20 covariate effects, 10 difficulty function parameters, variance of mixing distribution).

4 Joint modelling of different types of responses

A strength of the threshold model is that dependent variables can be of various types. It allows for some of the measurements to be continuous while others are binary, ordinal or given as counts. Also the combination of continuous measurements with differing support can be modelled in a joint random effects model. There have been some approaches to joint modelling of different types of responses in specific settings, see, for example, Ivanova et al. (2016) with a focus on ordinal variables or Loeys et al. (2011), where a joint modelling approach for reaction time and accuracy in psycholinguistic experiments has been proposed. However, no general random effects model that allows different types of responses to be combined seems to be available.

The flexibility of threshold models to account for various types of measurement is due to the general form of the model. Since the same model form, which specifies $P (Y_{i j} > y)$ , applies to different types of measurement, it is straightforward to obtain a joint model simply by allowing for different distributions (continuous or discrete) and specifying the threshold function accordingly. For example, if measurement 1 is continuous and measurement 2 ordered categorical, one can choose for the first measurement the linear or logarithmic threshold function (depending on the support) and for the second measurement the logit threshold function. The common random effects will account for the correlation between measurements without the need for an explicit new concept for the correlation between a continuous and an ordered categorical variable.
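A joint likelihood contribution for one unit with a continuous and an ordered categorical measurement can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the paper: a standard normal response function for both measurements, a random intercept b with standard deviation sigma_b, no covariates, a hypothetical value a = 0.9 in the logit threshold, hypothetical parameter values, and plain midpoint quadrature over b as a generic stand-in for whatever integration method is actually used for marginal likelihood estimation.

```python
import math
from statistics import NormalDist

STD = NormalDist()

def cont_density(y, b, delta0, delta):
    """Continuous measurement: normal F with logarithmic threshold function."""
    return STD.pdf(b - delta0 - delta * math.log(y)) * delta / y

def ord_prob(r, b, k, delta0, delta, a=0.9):
    """Ordinal measurement: normal F with logit threshold on {1, ..., k}."""
    def surv(s):
        if s <= 0:
            return 1.0
        if s >= k:
            return 0.0
        return STD.cdf(b - delta0 - delta * math.log((s - a) / (k - s)))
    return surv(r - 1) - surv(r)

def marginal_likelihood(y_cont, r_ord, sigma_b, k, par_cont, par_ord, n=2000):
    """Integrate the product of conditional terms over b ~ N(0, sigma_b^2).

    Conditional independence given b allows the two terms to be multiplied;
    the integral is approximated by the midpoint rule on [-8, 8] * sigma_b.
    """
    lo, hi = -8.0 * sigma_b, 8.0 * sigma_b
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        b = lo + (i + 0.5) * h
        weight = STD.pdf(b / sigma_b) / sigma_b
        total += (cont_density(y_cont, b, *par_cont)
                  * ord_prob(r_ord, b, k, *par_ord) * weight * h)
    return total

# Hypothetical example: reaction time 300 and ordinal category 3 out of 6.
lik = marginal_likelihood(300.0, 3, sigma_b=1.0, k=6,
                          par_cont=(-28.5, 5.0), par_ord=(0.0, 1.5))
print(lik)
```

The only coupling between the two measurements is the shared value of b inside the integral, which mirrors the statement that the common random effect accounts for their correlation.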

Table 3

Parameter estimates for the fears data with logit difficulty function and logistic response function; z-values of the covariate parameter estimates are given in the lower part. Variable age was standardized.

Parameter estimates
	Item	Intercepts	Slopes	Age	Gender	EastWest	Abitur
Measurement-specific effects
1	Refugee	−0.341	1.653	0.151	0.606	0.392	−1.388
2	Climate change	−1.140	1.858	0.013	1.061	−0.594	−0.062
3	Terrorism	−2.062	1.795	0.393	1.329	0.321	−1.207
4	Globalization	0.716	1.862	0.195	1.020	0.725	−0.719
5	Nuclear energy	−0.922	1.641	0.254	0.379	−0.416	−0.300
	Log-lik	−1684.871
z-values of covariate effects
1	Refugee			0.937	1.883	1.179	−4.002
2	Climate change			0.079	3.313	−1.798	−0.186
3	Terrorism			2.354	3.935	0.939	−3.468
4	Globalization			1.236	3.184	2.206	−2.108
5	Nuclear energy			1.565	1.189	−1.253	−0.892
Global effects
	Log-lik	−1715.82		0.230	1.030	0.290	−0.545

Sleep data

As an illustrative example, let us again consider the sleep deprivation data. Instead of using the average reaction time per day in all 10 measurements, the last two measurements were transformed to ordered categorical data. More concretely, the interval (200, 500), which covers the reaction times, has been divided into six intervals of equal width, and responses were coded as 1, …, 6 according to the interval in which they fall. For the first eight measurements, the logarithmic threshold function has been chosen, and they are specified as continuous. For the last two measurements, the logit threshold function has been chosen and the measurements are specified as discrete with values 1, …, 6. Fitting the model with normal response function yielded log-likelihood −768.597, which necessarily differs from the log-likelihood of the model with continuous responses considered earlier (−875.50), since now a combination of continuous and discrete distributions is assumed.
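The coding of the last two measurements can be sketched as follows. This is only an illustration of splitting (200, 500) into six equal-width intervals; the function name `to_ordered_categories`, the example reaction times and the treatment of values that fall exactly on an interval boundary are assumptions, not taken from the analysis above.

```python
import numpy as np

def to_ordered_categories(reaction_time, lower=200.0, upper=500.0, n_cat=6):
    """Map reaction times in (lower, upper) to categories 1, ..., n_cat."""
    edges = np.linspace(lower, upper, n_cat + 1)  # 200, 250, ..., 500
    # Only the interior edges are needed to split the range. A value lying
    # exactly on an edge is assigned to the upper category (assumed convention).
    return np.digitize(reaction_time, edges[1:-1]) + 1

# Hypothetical reaction times, one per interval
rt = np.array([210.0, 260.0, 310.0, 360.0, 410.0, 460.0])
categories = to_ordered_categories(rt)
print(categories)
```
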

However, the variation of reaction times over days remains essentially the same. Figure 4 shows the densities for days 1, 3, 5 for b_i = 0 (left) and b_i = 1 (right). They are practically the same as the densities given in Figure 1, which shows the densities if all variables are considered as continuous.

Figure 4

Reaction time for days 1, 3, 5 of sleep deprivation for b_i = 0 (left) and b_i = 1 (right) for mixed responses.

5 Estimation

A general form of the threshold function is given by $δ_{j} (y) = Φ_{j} {(y)}^{T} δ_{j}$, where $Φ_{j} (y) = (Φ_{j 0} (y), \dots, Φ_{j M} (y))$ is a vector that contains functions of y and $δ_{j}$ on the right-hand side is the corresponding parameter vector. If the same threshold function is used for all measurements, it can be specified as $Φ_{j} {(y)}^{T} = (1, g (y))$. Measurement-specific threshold functions can be obtained by $Φ_{j} {(y)}^{T} = (1, g_{j} (y))$. But more general vectors of functions can also be useful. Using this parameterization, the model has the form

P (Y_{i j} > y ∣ b_{i}, z_{i j}, x_{i j}) = F (z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j} - Φ_{j} {(y)}^{T} δ_{j}) .

(5.1)

For continuous measurement j, the density is

f_{i j} (y) = f (z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j} - Φ_{j} {(y)}^{T} δ_{j}) Φ_{j}^{'} {(y)}^{T} δ_{j},

where $Φ_{j}^{'} (y) = (Φ_{j 0}^{'} (y), \dots, Φ_{j M}^{'} (y))$ contains the derivatives of the components and f(.) is the density linked to F(.). If $Φ_{j} {(y)}^{T} = (1, g (y))$, with intercept $δ_{0 j}$ and slope $δ_{j}$ as the two components of the parameter vector, one obtains the simpler form

f_{i j} (y) = f (z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j} - δ_{0 j} - δ_{j} g (y)) δ_{j} g^{'} (y) .

For discrete data with $Y_{i j} \in \{0, 1, \dots\}$, the discrete density is given by

\begin{array}{l} f_{i j} (0) = 1 - P (Y_{i j} > 0) = 1 - F (z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j} - Φ_{j} {(0)}^{T} δ_{j}), \\ f_{i j} (r) = P (Y_{i j} > r - 1) - P (Y_{i j} > r) \\ = F (z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j} - Φ_{j} {(r - 1)}^{T} δ_{j}) - F (z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j} - Φ_{j} {(r)}^{T} δ_{j}), r = 1, 2, \dots \end{array}
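As a rough illustration of the discrete density, the following sketch computes the probabilities f(0), ..., f(r_max) from the survival function P(Y > r) = F(η − δ(r)). The logistic response function, the threshold function δ(r) = δ_0 + δ log(1 + r), all parameter values and the name `discrete_density` are assumptions chosen only for the example.

```python
import numpy as np
from scipy.stats import logistic

def discrete_density(eta, delta0, delta, r_max):
    """Return f(0), ..., f(r_max) and P(Y > r) for r = 0, ..., r_max."""
    r = np.arange(r_max + 1)
    threshold = delta0 + delta * np.log1p(r)    # assumed threshold function
    surv = logistic.cdf(eta - threshold)        # P(Y > r)
    upper = np.concatenate(([1.0], surv[:-1]))  # P(Y > r - 1), with P(Y > -1) = 1
    return upper - surv, surv

# Hypothetical linear predictor and threshold parameters
pmf, surv = discrete_density(eta=1.0, delta0=0.5, delta=2.0, r_max=50)
print(pmf[:5], pmf.sum())
```

Because the differences telescope, the probabilities up to r_max sum to 1 − P(Y > r_max), so the truncated sum approaches one only as fast as the threshold function grows.
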

When assuming that random effects are normally distributed with mean zero and covariance matrix Σ, the marginal likelihood has the form

L (ψ) = \prod_{i = 1}^{n} \int \prod_{j = 1}^{m} f_{i j} (y_{i j} ∣ b_{i}) f_{0, Σ} (b_{i}) d b_{i},

where $f_{0, Σ} (.)$ is the density of the normal distribution with mean zero and covariance matrix Σ, and all parameters to be estimated are collected in ψ. For clarity, the conditioning on b_i is made explicit in f_ij(y_ij | b_i).

The corresponding log-likelihood is given by

l (ψ) = \log (L (ψ)) = \sum_{i = 1}^{n} \log (\int \prod_{j = 1}^{m} f_{i j} (y_{i j} ∣ b_{i}) f_{0, Σ} (b_{i}) d b_{i}) .

Maximization of the marginal log-likelihood can be carried out using Gauss–Hermite quadrature, as used, for example, by Anderson and Aitkin (1985), Rodríguez (2008) and Gueorguieva (2001). Gauss–Hermite quadrature is typically based on the standardized random effects $a_{i} = Σ^{- 1 / 2} b_{i}$, where $Σ^{1 / 2}$ denotes the left Cholesky factor, which is a lower triangular matrix, so that $Σ = Σ^{1 / 2} {(Σ^{1 / 2})}^{T}$. With $Σ^{- 1 / 2}$ denoting the inverse of $Σ^{1 / 2}$, one obtains cov(a_i) = I and the linear predictor

η_{i j} = z_{i j}^{T} Σ^{1 / 2} a_{i} + x_{i j}^{T} β_{j} - Φ_{j} {(y_{i j})}^{T} δ_{j} = (a_{i}^{T} \otimes z_{i j}^{T}) θ + x_{i j}^{T} β_{j} - Φ_{j} {(y_{i j})}^{T} δ_{j}

where ⊗ is the Kronecker product and θ denotes the vectorization of $Σ^{1 / 2}$, that is, $θ = \mathrm{vec} (Σ^{1 / 2})$.
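The Kronecker form of the random effects term can be checked numerically. The sketch below uses an arbitrary lower triangular matrix in place of the Cholesky factor and random vectors for a_i and z_ij; all values are hypothetical, and vec(.) is taken to stack the columns of its argument.

```python
import numpy as np

rng = np.random.default_rng(1)
q = 3

# Arbitrary lower triangular matrix with positive diagonal, standing in
# for the Cholesky factor Sigma^{1/2}
L = np.tril(rng.normal(size=(q, q)))
L[np.diag_indices(q)] = np.abs(L[np.diag_indices(q)]) + 0.1

a = rng.normal(size=q)  # standardized random effects a_i
z = rng.normal(size=q)  # design vector z_ij

theta = L.flatten(order="F")   # vec(.): stack the columns
lhs = z @ L @ a                # z^T Sigma^{1/2} a
rhs = np.kron(a, z) @ theta    # (a^T kron z^T) theta
print(lhs, rhs)
```
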

Then, the log-likelihood for the standardized random effects a_i is given by

l (ψ) = \log (L (ψ)) = \sum_{i = 1}^{n} \log (\int \prod_{j = 1}^{m} f_{i j} (y_{i j} ∣ a_{i}) p (a_{i}) d a_{i}),

where p(a_i) denotes the (standardized) density of a_i, which has zero mean and covariance matrix I. The Gauss–Hermite approximation to the log-likelihood has the form

l^{G H} (ψ) = \sum_{i = 1}^{n} \log (\sum_{s} v_{s} \prod_{j = 1}^{m} f_{i j} (y_{i j} ∣ d_{s})),

where d_s denotes fixed quadrature points and v_s denotes fixed weights that are associated with d_s. Quadrature points and weights are given, for example, in Stroud and Secrest (1966).
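A minimal sketch of the Gauss–Hermite approximation is given below for a deliberately simple case: a scalar random intercept, normal response function and linear threshold function, so that P(Y_ij > y | a_i) = Φ(σ a_i + c − δ y). The function name `gh_loglik`, the parameter values and the simulated data are assumptions made for illustration; this is not the implementation used for the fits reported here. The special case is chosen because the marginal distribution is then multivariate normal, which gives an exact value to compare against.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss
from scipy.stats import norm, multivariate_normal

def gh_loglik(y, c, delta, sigma, n_points=40):
    """Gauss-Hermite approximation of the marginal log-likelihood.

    y has shape (n, m): n clusters with m continuous measurements each.
    """
    d, v = hermegauss(n_points)    # nodes and weights for weight exp(-a^2 / 2)
    v = v / np.sqrt(2.0 * np.pi)   # rescale so that weights match the N(0, 1) density
    # Linear predictor at every quadrature point, shape (n, m, n_points)
    eta = sigma * d[None, None, :] + c - delta * y[:, :, None]
    dens = norm.pdf(eta) * delta   # conditional densities f_ij(y_ij | d_s)
    cluster_lik = (dens.prod(axis=1) * v).sum(axis=1)
    return np.log(cluster_lik).sum()

# Simulate hypothetical data from the assumed model
rng = np.random.default_rng(0)
n, m, c, delta, sigma = 5, 3, 1.0, 2.0, 0.5
a = rng.normal(size=(n, 1))
y = (sigma * a + c - rng.normal(size=(n, m))) / delta

approx = gh_loglik(y, c, delta, sigma)

# Exact marginal: Y_i ~ N((c / delta) 1, (I + sigma^2 1 1^T) / delta^2)
cov = (np.eye(m) + sigma**2 * np.ones((m, m))) / delta**2
exact = multivariate_normal.logpdf(y, mean=np.full(m, c / delta), cov=cov).sum()
print(approx, exact)
```

For other response or threshold functions no closed form is available in general, and the number of quadrature points needed for a good approximation may be larger.
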

Taking derivatives yields the score function $s^{G H} (ψ) = \partial l^{G H} (ψ) / \partial ψ$, and the equation system $s^{G H} (ψ) = 0$ is solved iteratively.

An alternative is the adaptive Gauss–Hermite quadrature, which typically is more efficient and needs fewer quadrature points to obtain a good approximation, see Liu and Pierce (1994) and Rabe-Hesketh et al. (2005). A further alternative is the EM algorithm, which was considered for mixed models by, among others, Bock and Aitkin (1981) and Anderson and Aitkin (1985). Overviews of inference tools for generalized mixed models are found in Jiang and Nguyen (2007) and McCulloch and Searle (2001).

6 Concluding remarks

It has been demonstrated that the threshold model is very versatile and can be adapted to quite different distribution functions. Moreover, it can be used in the joint modelling of different responses in a straightforward way. Joint modelling has been used in particular in the modelling of survival times and longitudinal data, see, for example, Hsieh et al. (2006). Mixed model approaches that are able to handle continuous, binary and ordinal responses can be based on a latent multivariate normal distribution, as in Goldstein et al. (2009), who typically use the normal distribution only when modelling continuous responses.

We used marginal estimation based on integration methods, but alternative estimation methods from the toolbox of mixed models could be used. Regularization methods could be useful when trying to find sparser representations, for example by identifying those variables for which the effects do not vary over measurements. An overview of basic regularization methods, including boosting, has been given by Hastie et al. (2009), and strategies to select those effects that do not vary over measurements have been considered, for example, by Tutz (2025).

The class of mixed threshold models could be made even more flexible by letting the data decide which threshold function is appropriate. In particular, B-splines as considered extensively by Eilers and Marx (2021) could be useful, which has been demonstrated by Tutz (2022) in the item-response setting without covariates.

Alternative flexible regression models that are in common use are quantile regression approaches and generalized additive models for location, scale and shape (GAMLSS). Extensions of quantile regression approaches that include random effects have been considered by Koenker (2004), Geraci and Bottai (2007) and Hsieh et al. (2006), but with a focus on continuous data, for which quantile regression is most appropriate. Random effects within GAMLSS have been considered briefly in Stasinopoulos et al. (2024).

R programs that can be used to fit models by maximization of the marginal log-likelihood are available on GitHub (https://github.com/GerhardTutz/MixedThresholdsModels). Future work might focus on the implementation of a comprehensive package with alternative algorithms in which threshold functions and response functions are chosen from a set of options.

Appendix

For simplicity, let the model be given in the form

P (Y_{i j} > y ∣ b_{i}, z_{i j}, x_{i j}) = F ({\tilde{η}}_{i j} - δ_{j} (y)),

(A.1)

where ${\tilde{η}}_{i j} = z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j}$ and F(.) is a strictly increasing distribution function.

Proposition A.1 Let the threshold function have the form $δ_{j} (y) = δ_{0 j} + δ_{j} y, δ_{j} \geq 0$ . Then, one obtains for the expectation and the variance

\begin{array}{l} E (Y_{i j}) = μ_{i j} = \frac{1}{δ_{j}} (β_{0 j} + z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j}), \\ var (Y_{i j}) = σ_{i j}^{2} = \frac{σ_{F}^{2}}{δ_{j}^{2}}, \end{array}

where $β_{0 j} = - μ_{F} - δ_{0 j}$ with $μ_{F} = \int y f (y) d y, σ_{F}^{2} = {var}_{F} = \int {(y - μ_{F})}^{2} f (y) d y, f (y) = \frac{\partial F (y)}{\partial y}$ .

Proof: For linear threshold function the threshold model has the form $P (Y_{i j} > y ∣ b_{i}, z_{i j}, x_{i j}) = F ({\tilde{η}}_{i j} - δ_{j} (y))$ . The corresponding distribution function is

F_{Y_{i j}} (y) = P (Y_{i j} \leq y) = 1 - F ({\tilde{η}}_{i j} - δ_{0 j} - δ_{j} y),

which has the density

f_{Y_{i j}} (y) = \frac{\partial F_{Y_{i j}} (y)}{\partial y} = f ({\tilde{η}}_{i j} - δ_{0 j} - δ_{j} y) δ_{j} .

Thus, the mean is given by

E (Y_{i j}) = δ_{j} \int y f ({\tilde{η}}_{i j} - δ_{0 j} - δ_{j} y) d y .

With $η = {\tilde{η}}_{i j} - δ_{0 j} - δ_{j} y$ and $d η / d y = - δ_{j}$ one obtains

\begin{matrix} E (Y_{i j}) = - \frac{1}{δ_{j}} \int_{\infty}^{- \infty} ({\tilde{η}}_{i j} - η - δ_{0 j}) f (η) d η = \frac{1}{δ_{j}} \int_{- \infty}^{\infty} ({\tilde{η}}_{i j} - η - δ_{0 j}) f (η) d η \\ = \frac{1}{δ_{j}} ({\tilde{η}}_{i j} - μ_{F} - δ_{0 j}) \end{matrix}

where $μ_{F} = \int y f (y) d y$ is a parameter that depends on F only.

The variance is given by

\begin{matrix} var (Y_{i j}) = \int {(y - \frac{{\tilde{η}}_{i j} - μ_{F} - δ_{0 j}}{δ_{j}})}^{2} f ({\tilde{η}}_{i j} - δ_{0 j} - δ_{j} y) δ_{j} d y = \int {(\frac{η - μ_{F}}{δ_{j}})}^{2} f (η) d η \\ = {var}_{F} / δ_{j}^{2}, \end{matrix}

where ${var}_{F} = \int {(η - μ_{F})}^{2} f (η) d η$ .
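The mean and variance in Proposition A.1 can be checked by numerical integration. The sketch below assumes a logistic response function (for which μ_F = 0 and σ_F² = π²/3) and hypothetical values for the linear predictor and the threshold parameters; it is meant as a sanity check, not as part of the proof.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import logistic

# Hypothetical values for eta-tilde and the linear threshold function
eta_tilde, delta0, delta = 1.2, 0.3, 2.0

def density(y):
    """Density of Y for the linear threshold function delta0 + delta * y."""
    return logistic.pdf(eta_tilde - delta0 - delta * y) * delta

mean, _ = quad(lambda y: y * density(y), -np.inf, np.inf)
var, _ = quad(lambda y: (y - mean) ** 2 * density(y), -np.inf, np.inf)

# Closed forms from Proposition A.1
mean_formula = (eta_tilde - logistic.mean() - delta0) / delta
var_formula = logistic.var() / delta**2
print(mean, mean_formula, var, var_formula)
```
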

Proposition A.2 Let the threshold function have the form $δ_{j} (y) = δ_{0 j} + δ_{j} g (y), δ_{j} \geq 0$ . Then, one obtains

E (g (Y_{i j})) = \frac{1}{δ_{j}} (β_{0 j} + z_{i j}^{T} b_{i} + x_{i j}^{T} β_{j}), var (g (Y_{i j})) = \frac{σ_{F}^{2}}{δ_{j}^{2}} .

Proof: The mean is given by

E (g (Y_{i j})) = \int g (y) f ({\tilde{η}}_{i j} - δ_{0 j} - δ_{j} g (y)) δ_{j} g^{'} (y) d y .

With $η = {\tilde{η}}_{i j} - δ_{0 j} - δ_{j} g (y)$ one obtains

E (g (Y_{i j})) = - \int \frac{η - {\tilde{η}}_{i j} + δ_{0 j}}{δ_{j}} f (η) d η = \frac{{\tilde{η}}_{i j} - δ_{0 j} - μ_{F}}{δ_{j}} = \frac{{\tilde{η}}_{i j} + β_{0 j}}{δ_{j}},

since $β_{0 j} = - δ_{0 j} - μ_{F}$ .

The variance is given by

\begin{array}{l} var (g (Y_{i j})) = \int {(g (y) - \frac{{\tilde{η}}_{i j} - δ_{0 j} - μ_{F}}{δ_{j}})}^{2} f ({\tilde{η}}_{i j} - δ_{0 j} - δ_{j} g (y)) δ_{j} g^{'} (y) d y \\ = \int {(\frac{η - μ_{F}}{δ_{j}})}^{2} f (η) d η = {var}_{F} / δ_{j}^{2} . \end{array}

Acknowledgements

I want to thank the associate editor and two unknown reviewers for their helpful comments.

Declaration of Conflicting Interests

The author declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Funding

The author received no financial support for the research, authorship and/or publication of this article.

References

Anderson and Aitkin (1985) Variance component models with binary response: interviewer variability. Journal of the Royal Statistical Society, Series B, 47, 203–10.

Bock and Aitkin (1981) Marginal maximum likelihood estimation of item parameters: an application of an EM algorithm. Psychometrika, 46, 443–59.

Bonat, Petterle, Hinde and Demétrio (2019) Flexible quasi-beta regression models for continuous bounded data. Statistical Modelling, 19(6), 617–33.

Bonat, Ribeiro Jr and Zeviani (2015) Likelihood analysis for a class of beta mixed models. Journal of Applied Statistics, 42(2), 252–66.

Brooks, Kristensen, VanBenthem, Magnusson, Berg, Nielsen, Skaug, Machler and Bolker (2017) glmmTMB balances speed and flexibility among packages for zero-inflated generalized linear mixed modelling. The R Journal, 9(2), 378–400.

Eilers and Marx (2021) Practical smoothing: the joys of P-splines. Cambridge University Press.

Fahrmeir, Kneib, Lang and Marx (2011) Regression: models, methods and applications. Springer, Berlin.

Forthmann, Gühne and Doebler (2020) Revisiting dispersion in count data item response theory models: the Conway–Maxwell–Poisson counts model. British Journal of Mathematical and Statistical Psychology, 73(1), 32–50.

Geraci and Bottai (2007) Quantile regression for longitudinal data using the asymmetric Laplace distribution. Biostatistics, 8(1), 140–54.

Goldstein, Carpenter, Kenward and Levin (2009) Multilevel models with multivariate mixed response types. Statistical Modelling, 9(3), 173–97.

Gueorguieva (2001) A multivariate generalized linear mixed model for joint modelling of clustered outcomes in the exponential family. Statistical Modelling, 1(3), 177–93.

Hartzel, Liu and Agresti (2001) Describing heterogeneous effects in stratified ordinal contingency tables, with applications to multi-center clinical trials. Computational Statistics & Data Analysis, 35(4), 429–49.

Harville and Mee (1984) A mixed-model procedure for analyzing ordered categorical data. Biometrics, 40, 393–408.

Hastie, Tibshirani and Friedman (2009) The elements of statistical learning, 2nd edition. New York: Springer-Verlag.

Hsiao (1986) Analysis of panel data. Cambridge: Cambridge University Press.

Hsieh, Tseng Y-K and Wang J-L (2006) Joint modelling of survival and longitudinal data: likelihood approach revisited. Biometrics, 62(4), 1037–43.

Ivanova, Molenberghs and Verbeke (2016) Mixed models approaches for joint modelling of different types of responses. Journal of Biopharmaceutical Statistics, 26(4), 601–18.

Jansen (1990) On the statistical analysis of ordinal data when extravariation is present. Applied Statistics, 39, 74–85.

Jiang and Nguyen (2007) Linear and generalized linear mixed models and their applications, volume 1. Springer.

Jones (1993) Longitudinal data with serial correlation: a state-space approach. London: Chapman & Hall.

Kieschnick and McCullough (2003) Regression analysis of variates observed on (0, 1): percentages, proportions and fractions. Statistical Modelling, 3(3), 193–213.

Koenker (2004) Quantile regression for longitudinal data. Journal of Multivariate Analysis, 91(1), 74–89.

Lindsey (1993) Models for repeated measurements. Oxford: Oxford University Press.

Liu and Pierce (1994) A note on Gauss-Hermite quadrature. Biometrika, 81, 624–29.

Loeys, Rosseel and Baten (2011) A joint modelling approach for reaction time and accuracy in psycholinguistic experiments. Psychometrika, 76, 487–503.

McCulloch and Searle (2001) Generalized, linear, and mixed models. New York: Wiley.

Molenberghs, Verbeke and Demétrio (2007) An extended random-effects approach to modelling repeated, overdispersed count data. Lifetime Data Analysis, 13, 513–31.

Qiu, Song PX-K and Tan (2008) Simplex mixed-effects models for longitudinal proportional data. Scandinavian Journal of Statistics, 35(4), 577–96.

Rabe-Hesketh, Skrondal and Pickles (2005) Maximum likelihood estimation of limited and discrete dependent variable models with nested random effects. Journal of Econometrics, 128(2), 301–23.

Ripley, Venables, Bates, Hornik, Gebhardt, Firth and Ripley (2013) Package MASS. CRAN R, 538, 113–20.

Rodríguez (2008) Multilevel generalized linear models. In Handbook of Multilevel Analysis, pages 335–76. Springer.

Schauberger (2024) MultOrdRS: Model Multivariate Ordinal Responses Including Response Styles. URL https://CRAN.R-project.org/package=MultOrdRS. R package version 0.1-3.

Stasinopoulos, Kneib, Klein, Mayr and Heller (2024) Generalized additive models for location, scale and shape: a distributional regression approach, with applications, volume 56. Cambridge University Press.

Stroud and Secrest (1966) Gaussian quadrature formulas. Englewood Cliffs, NJ: Prentice-Hall.

Tempelman and Gianola (1996) A mixed effects model for overdispersed count data in animal breeding. Biometrics, 265–79.

Tutz (2012) Regression for categorical data. Cambridge University Press.

Tutz and Hennevogl (1996) Random effects in ordinal regression models. Computational Statistics and Data Analysis, 22, 537–57.

Tutz (2022) Item response thresholds models: a general class of models for varying types of items. Psychometrika, 87, 1238–69.

Tutz (2025) A short guide to item response theory models. Springer.

Zhang, Mallick, Tang, Zhang, Cui, Benson and Yi (2017) Negative binomial mixed models for analyzing microbiome count data. BMC Bioinformatics, 18, 1–10.

A general framework for random effects models for binary, ordinal, count type and continuous dependent variables
