
Abstract

In this article, we introduce trigmm, a package designed to estimate the parameters of triangular two-equation systems without the need for instrumental variables. The trigmm command leverages the identification conditions proposed by Lewbel, Schennach, and Zhang (2024, Journal of Business and Economic Statistics 42: 14–25), enabling instrument-free identification primarily through the non-Gaussianity assumption of error variables. We also introduce the trigmmset command, which provides bounds on the parameters based on the set identification results from the same article even without the non-Gaussianity assumption, offering complementary analyses alongside the trigmm command. Estimation is performed by integrating these moment conditions with Stata’s built-in generalized method of moments (gmm) framework. The package’s functionality is demonstrated through an empirical application.

Keywords

st0797, trigmm, trigmmset, instrument-free identification, triangular system

1 Introduction

One of the main obstructions to identifying the direct causal effect of one variable on another is the existence of confounding factors. For example, a naive ordinary least-squares (OLS) regression does not consistently estimate the causal effect of a worker’s level of schooling on their wage because there can be confounding factors, such as ability, that affect both variables. In econometrics, such issues have traditionally been addressed by finding instrumental variables (IV). For instance, Card (1995, 2001) and other authors have proposed instruments like the accessibility of schools to resolve endogeneity concerns.

However, it can be difficult to find instruments and ascertain their validity. In this regard, Lewbel, Schennach, and Zhang (2024) develop a novel identification result within a standard linear triangular system that does not rely on IV. While there has been prior work on instrument-free identification (for example, Rigobon [2003], Klein and Vella [2010], and Lewbel [2012]), these approaches differ from Lewbel, Schennach, and Zhang (2024) in that they impose restrictions on the heteroskedasticity of unobserved errors and use this structure for identification. Lewbel (2012) requires covariates and heteroskedasticity, while we impose homoskedasticity and do not need covariates. In Stata, instrument-free identification through heteroskedasticity is implemented with the ivreg2h package. For instructions on this package, see Baum and Lewbel (2019).

Lewbel, Schennach, and Zhang (2024) provide moment conditions for identifying various parameters of linear triangular systems, which facilitate the use of simple generalized method of moments estimators (Hansen 1982). The identification result is primarily achieved through the non-Gaussianity of unobservable error terms. The model and the point identification result are discussed in section 2. The trigmm command implements these moment conditions and interfaces with Stata’s built-in gmm command. The syntax of the trigmm command is described in section 4.

Another approach to addressing questionable or absent IV is to construct set identification results or provide bounds on parameters. In Stata, robust inference based on these approaches has been implemented in community-contributed packages such as tebounds (McCarthy, Millimet, and Roy 2015), plausexog and imperfectiv (Clarke and Matta 2018), psacalc (Oster 2019), and kinkyreg (Kripfganz and Kiviet 2021). Indeed, Lewbel, Schennach, and Zhang (2024) construct a sharp identified set without the non-Gaussianity assumption on error terms and provide bounds on the identified set based on the variance and covariance of observable variables. These bounds are discussed in section 3. The trigmmset command implements these bounds, and its syntax is described in section 5.

In section 6, we illustrate the usage of the trigmm and trigmmset commands based on Acemoglu and Johnson (2007), who consider the causal effect of a country’s health measures, such as growth in life expectancy, on the country’s gross domestic product (GDP) growth. However, technological advances can affect both GDP growth and life expectancy, leading to endogeneity and the failure of naive linear regression. Acemoglu and Johnson (2007) propose using changes in innovations in healthcare as an instrument, but this instrument may also be related to technological advances. We walk through this example in detail in section 6 using the trigmm package. Finally, we conclude and discuss future developments in section 7.

2 Model and point identification

This section provides an overview of the theoretical background of the command trigmm. We describe a triangular two-equation system of interest and briefly introduce the identification result.

2.1 A triangular two-equation system

We consider a standard linear triangular structural model

y = x^{'} b_{1} + ε_{1} (1)

w = x^{'} b_{2} + γ y + ε_{2} (2)

Here y is an endogenous variable, w is a dependent variable, x represents exogenous covariates, and ε₁, ε₂ are unobserved error terms. The correlation between ε₁ and ε₂ induces the endogeneity of y.

The structural model of this article makes the key assumption

ε_{1} = u + υ

ε_{2} = β u + r

where β is an unknown constant and u, v, and r are unobserved, with (u, v, r, x) being mutually independent. In this context, u represents a confounding factor, while v and r are idiosyncratic errors for the endogenous and dependent variables, respectively.

The model is identified under a weaker condition that u, v, and r are mutually independent conditional on x, but the trigmm command exploits the additional independence from x to avoid the introduction of infinite-dimensional nuisance parameters. The plausibility of the additional independence from x can be assessed by performing a test for homoskedasticity in the reduced-form regressions of y on x as well as w on x. This can be accomplished with the Stata command estat imtest, white. However, note that the absence of heteroskedasticity is only a necessary condition.

Additionally, the approach identifies β only up to its sign because changing the sign of the unobserved u produces two observationally equivalent models. As a result, the user must specify the sign of β, which is a decision that can be guided by economic theory. For instance, in the wage-schooling example discussed in section 1, we expect the sign of β to be positive because ability likely affects the wage and the level of schooling in the same direction. In section 2.2, we present the identification result assuming that β is positive. We extend the identification result to the negative β case in section 2.3.

In addition to the coefficients of the triangular model and β, the command trigmm gives estimates for the variances of unobserved variables. These estimates can help researchers assess the model’s validity by checking whether the latent variable variances implied by the model have plausible magnitudes.

2.2 Point identification result

We provide the point identification result for (1) and (2) (Lewbel, Schennach, and Zhang 2024). For simplicity, we consider a model without exogenous variables:

y = u + υ (3)

w = γ y + β u + r (4)

A complete list of moment conditions in the presence of exogenous variables can be found in the supplementary materials B of Lewbel, Schennach, and Zhang (2024).

To state the point identification result, we introduce some notation. Let $ϕ_{y}$ denote the characteristic function of the random variable y, $ϕ_{y} (ζ) = E (e^{i ζ y})$, where $E (\cdot)$ is the expectation and $i = \sqrt{- 1}$. Similarly, the characteristic function of the random variables (y, w) is denoted as $ϕ_{y, w}$, where $ϕ_{y, w} (ζ, ξ) = E (e^{i ζ y + i ξ w})$. Let $Φ_{y} \equiv \log ϕ_{y}$ and $Φ_{y, w} \equiv \log ϕ_{y, w}$ denote the cumulant generating functions. Finally, we define the cumulant of order k, l (Lukacs 1970) as

Φ_{y, w}^{k, l} \equiv \left. \frac{\partial^{k + l} Φ_{y, w} (ζ, ξ)}{i^{k + l} \partial ζ^{k} \partial ξ^{l}} \right|_{ζ = 0, ξ = 0}

where ∂ is the partial differential operator. Similarly, for a single random variable, the cumulant of order k is defined as

Φ_{y}^{k} \equiv \left. \frac{\partial^{k} Φ_{y} (ζ)}{i^{k} \partial ζ^{k}} \right|_{ζ = 0}

We are now prepared to state the point identification result. Based on these notations, Lewbel, Schennach, and Zhang (2024) construct an infinite number of moment constraints indexed by nonnegative integer p:

M_{p} (α, γ) \equiv Φ_{y, w}^{1 + p, 2} - α^{2} Φ_{y}^{3 + p} - (γ + α) (Φ_{y, w}^{2 + p, 1} - α Φ_{y}^{3 + p})

where α = β + γ. It is shown that combining two of the above equations uniquely identifies α and γ. For example, a system of quadratic equations,

M_{0} (α, γ) = 0

M_{1} (α, γ) = 0

point identifies α and γ. The command trigmm implements moment constraints for p ∈ {0, 1, 2}. Note that once α and γ are identified, the variances of u, v, and r (denoted $σ_{u}^{2}$, $σ_{υ}^{2}$, and $σ_{r}^{2}$, respectively) can be straightforwardly recovered by exploiting the independence among unobserved variables.
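The moment constraints can be illustrated numerically. The following is a minimal sketch, not the package's implementation: it evaluates M_p(α, γ) for p ∈ {0, 1} from sample cumulants, using the standard expressions of third- and fourth-order cumulants in terms of central moments. The function names and the toy sample are hypothetical. The toy sample takes every combination of a few values of u, v, and r, so the three variables are exactly independent in the sample and the constraints hold exactly at the true parameters.

```python
from itertools import product


def central_moment(y, w, k, l):
    """Sample joint central moment E[(y - ybar)^k (w - wbar)^l]."""
    n = len(y)
    ybar, wbar = sum(y) / n, sum(w) / n
    return sum((a - ybar) ** k * (b - wbar) ** l for a, b in zip(y, w)) / n


def cumulant(y, w, k, l):
    """Joint cumulant of order (k, l); only the orders needed for p in {0, 1}."""
    m = lambda i, j: central_moment(y, w, i, j)
    if k + l == 3:
        return m(k, l)  # third-order cumulants equal central moments
    if (k, l) == (4, 0):
        return m(4, 0) - 3 * m(2, 0) ** 2
    if (k, l) == (3, 1):
        return m(3, 1) - 3 * m(2, 0) * m(1, 1)
    if (k, l) == (2, 2):
        return m(2, 2) - m(2, 0) * m(0, 2) - 2 * m(1, 1) ** 2
    raise ValueError("order not covered by this sketch")


def moment_condition(p, y, w, alpha, gamma):
    """M_p(alpha, gamma) evaluated with sample cumulants, for p in {0, 1}."""
    k_y = cumulant(y, w, 3 + p, 0)    # cumulant of y of order 3 + p
    k_12 = cumulant(y, w, 1 + p, 2)   # joint cumulant of order (1 + p, 2)
    k_21 = cumulant(y, w, 2 + p, 1)   # joint cumulant of order (2 + p, 1)
    return k_12 - alpha ** 2 * k_y - (gamma + alpha) * (k_21 - alpha * k_y)


def toy_sample(gamma=0.5, beta=1.0):
    """Hypothetical data from (3) and (4) with skewed, mean-zero u and v."""
    u_vals, v_vals, r_vals = [-1, -1, 2], [-2, 1, 1], [-1, 1]
    y, w = [], []
    for u, v, r in product(u_vals, v_vals, r_vals):
        y.append(u + v)
        w.append(gamma * (u + v) + beta * u + r)
    return y, w


y, w = toy_sample(gamma=0.5, beta=1.0)
alpha = 0.5 + 1.0  # alpha = beta + gamma
print(moment_condition(0, y, w, alpha, 0.5), moment_condition(1, y, w, alpha, 0.5))
```

Evaluating the same functions away from the true γ gives a nonzero value, which is what a generalized method of moments objective penalizes.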

Next we discuss the assumptions required for this identification result to hold. Suppose that we want to combine two moment conditions, $M_{q} (α, γ) = 0$ and $M_{\tilde{q}} (α, γ) = 0$, for $q < \tilde{q}$. There are two main assumptions. First, the higher-order moments $E (| u |^{\bar{q}})$, $E (| υ |^{\bar{q}})$, and $E (| r |^{\bar{q}})$ must exist. Second, there is a technical assumption that

Φ_{y}^{3 + \tilde{q}} Φ_{y, w}^{2 + q, 1} \neq Φ_{y}^{3 + q} Φ_{y, w}^{2 + \tilde{q}, 1} (5)

When q = 0, $\tilde{q} = 1$ , which is the default choice of the trigmm command, (5) fails to hold, for instance, if either u or v is normal, if both u and v are symmetric, or if both u and v have the exact same distribution.
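To make the failure cases concrete, the sketch below computes the difference between the two sides of (5) for q = 0 and q̃ = 1 from sample cumulants. It is an illustration under assumed toy data, not a test provided by the package; `condition5_gap` and `factorial_sample` are hypothetical names. With skewed u and v the gap is nonzero, whereas with symmetric u and v it vanishes, matching the second failure case listed above.

```python
from itertools import product


def central_moment(y, w, k, l):
    """Sample joint central moment E[(y - ybar)^k (w - wbar)^l]."""
    n = len(y)
    ybar, wbar = sum(y) / n, sum(w) / n
    return sum((a - ybar) ** k * (b - wbar) ** l for a, b in zip(y, w)) / n


def condition5_gap(y, w):
    """Left side minus right side of (5) for q = 0 and q-tilde = 1."""
    m = lambda k, l: central_moment(y, w, k, l)
    phi_y3 = m(3, 0)                              # third cumulant of y
    phi_y4 = m(4, 0) - 3 * m(2, 0) ** 2           # fourth cumulant of y
    phi_yw21 = m(2, 1)                            # joint cumulant of order (2, 1)
    phi_yw31 = m(3, 1) - 3 * m(2, 0) * m(1, 1)    # joint cumulant of order (3, 1)
    return phi_y4 * phi_yw21 - phi_y3 * phi_yw31


def factorial_sample(u_vals, v_vals, r_vals, gamma, beta):
    """All combinations of (u, v, r), so they are exactly independent in-sample."""
    y, w = [], []
    for u, v, r in product(u_vals, v_vals, r_vals):
        y.append(u + v)
        w.append(gamma * (u + v) + beta * u + r)
    return y, w


skewed = factorial_sample([-1, -1, 2], [-2, 1, 1], [-1, 1], gamma=0.5, beta=1.0)
symmetric = factorial_sample([-1, 1], [-2, 0, 2], [-1, 1], gamma=0.5, beta=1.0)
print(condition5_gap(*skewed), condition5_gap(*symmetric))
```

A gap close to zero relative to its sampling variability would suggest weak identification from the p(0 1) pair.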

Note that the use of one pair of moment conditions does not preclude the use of another pair of moment conditions. In fact, a combination of two pairs of moment conditions provides overidentifying restrictions on the parameters. For instance, a pair of moment conditions M₀(α, γ) = 0 and M₁(α, γ) = 0 point identifies α and γ provided that (5) is satisfied for q = 0, $\tilde{q} = 1$. Similarly, if (5) holds for q = 1, $\tilde{q} = 2$, then another pair of moment conditions, M₁(α, γ) = 0 and M₂(α, γ) = 0, can also point identify α and γ. Therefore, a combination of two pairs of moment conditions, M₀(α, γ) = 0, M₁(α, γ) = 0, and M₂(α, γ) = 0, overidentifies the parameters.

2.3 Identification in the negative β case

Throughout this section, we have assumed that β > 0. This section explains how the trigmm package performs an estimation when β is negative.

We first rewrite (1) and (2) as

\tilde{y} = x^{'} {\tilde{b}}_{1} + \tilde{u} + \tilde{υ} (6)

w = x^{'} b_{2} + \tilde{γ} \tilde{y} + \tilde{β} \tilde{u} + r (7)

where the tilde over the character, $\tilde{\cdot}$, denotes the negative value of the argument. For example, $\tilde{β} = - β$. Through this reparameterization, the parameter $\tilde{β}$ becomes a positive value that validates the identification result introduced in this article.

The package trigmm estimates (6) and (7) using the gmm command with the moment conditions described in section 2.2. It then transforms the estimates and the variance–covariance matrix of ${\tilde{b}}_{1}$, $\tilde{γ}$, $\tilde{β}$, and other parameters to recover those of the original parameters.
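Because negation is a linear map, converting back from the tilde parameters only changes signs: Var(−a) = Var(a) and Cov(−a, c) = −Cov(a, c). The following minimal sketch shows that bookkeeping under stated assumptions. The function `flip_sign`, the parameter names, and the numbers are hypothetical and are not taken from the package.

```python
def flip_sign(estimates, vcov, flipped):
    """Negate the parameters named in `flipped` and adjust the covariance matrix.

    estimates: dict mapping parameter name to estimate (order defines vcov rows).
    vcov: square list of lists, ordered like `estimates`.
    flipped: set of parameter names whose sign is reversed.
    """
    names = list(estimates)
    sign = [-1.0 if name in flipped else 1.0 for name in names]
    new_est = {name: s * estimates[name] for name, s in zip(names, sign)}
    size = len(names)
    new_vcov = [[sign[i] * sign[j] * vcov[i][j] for j in range(size)]
                for i in range(size)]
    return new_est, new_vcov


# Hypothetical estimates of the tilde parameters and their covariance matrix.
tilde_est = {"gamma": 1.2, "beta": 0.4, "var_u": 2.0}
tilde_vcov = [[0.09, 0.01, 0.02],
              [0.01, 0.04, 0.00],
              [0.02, 0.00, 0.25]]
est, vcov = flip_sign(tilde_est, tilde_vcov, flipped={"gamma", "beta"})
print(est)
```

Variances on the diagonal are unchanged, and only covariances between a flipped and an unflipped parameter change sign.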

3 Set identification

The point identification result described in section 2.2 relies on assumptions about the unobservable error terms, u and v, such as their non-Gaussianity. Indeed, Lewbel, Schennach, and Zhang (2024) also construct a sharp identified set that essentially requires only the existence of second-order moments. However, the sharp identified set is based on the decomposition of observable variables into independent Gaussian and non-Gaussian factors, which can be hard to estimate and interpret. In this regard, the article also provides a coarser bound on the identified set (corollary 6 in Lewbel, Schennach, and Zhang [2024]) based solely on the covariances of observable variables. This procedure, being robust to distributional assumptions on error components, provides useful and easily interpretable complementary analysis to the point identification procedure. We implement this procedure in the trigmmset command.

3.1 Set identification result

For simplicity of exposition, we revisit the model without exogenous covariates presented in section 2.2. Recall that γ is the causal effect of the endogenous variable, y, on the dependent variable, w, and that α, defined as γ + β, is the total effect of the unobservable confounding factor, u, on the dependent variable. For positive β, we have the following bounds on α and γ under the assumption of the existence of second-order moments only,

γ \leq B_{0} \leq α (8)

(B_{0} - γ) (α - B_{0}) \leq D_{0} (9)

where

\begin{aligned} B_{0} = \frac{E (w y)}{E (y^{2})} \\ D_{0} = \frac{E (w^{2}) E (y^{2}) - {E (w y)}^{2}}{{E (y^{2})}^{2}} \geq 0 \end{aligned}

Notice that B₀ is simply the slope coefficient from a linear regression of w on y. The bound on (α, γ) is the lower-right quadrant of the region enclosed by the hyperbola defined by (9).

The bound (8) exhibits a relationship between a vanilla linear regression that does not account for endogeneity and our causal parameters. The derivation of this bound is immediate from the model setup in (3) and (4). Observe that

B_{0} = \frac{E (w y)}{E (y^{2})} = \frac{E \{ (u + υ) (α u + γ υ) \}}{E \{ {(u + υ)}^{2} \}} = \frac{α E (u^{2}) + γ E (υ^{2})}{E (u^{2}) + E (υ^{2})} = α λ + γ (1 - λ)

where λ = E(u²)/{E(u²) + E(v²)}. Because λ lies between 0 and 1, we obtain the desired bound. For the derivation of bound (9), see Lewbel, Schennach, and Zhang (2024).
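The bounds (8) and (9) can be checked on a small example. The sketch below computes plug-in versions of B₀ and D₀ from mean-zero data and tests whether a candidate (α, γ) satisfies both bounds for positive β. The function names and the toy sample are hypothetical, and sampling uncertainty is ignored here.

```python
from itertools import product


def ols_bounds(y, w):
    """Plug-in B0 and D0 for mean-zero y and w (model without covariates)."""
    n = len(y)
    e_yy = sum(a * a for a in y) / n
    e_wy = sum(a * b for a, b in zip(y, w)) / n
    e_ww = sum(b * b for b in w) / n
    b0 = e_wy / e_yy
    d0 = (e_ww * e_yy - e_wy ** 2) / e_yy ** 2
    return b0, d0


def satisfies_bounds(alpha, gamma, b0, d0):
    """True if (alpha, gamma) satisfies (8) and (9), assuming beta > 0."""
    return gamma <= b0 <= alpha and (b0 - gamma) * (alpha - b0) <= d0


def toy_sample(gamma=0.5, beta=1.0):
    """Hypothetical mean-zero data from (3) and (4) with Var(u) = Var(v) = 2."""
    y, w = [], []
    for u, v, r in product([-1, -1, 2], [-2, 1, 1], [-1, 1]):
        y.append(u + v)
        w.append(gamma * (u + v) + beta * u + r)
    return y, w


y, w = toy_sample()
b0, d0 = ols_bounds(y, w)
print(b0, d0, satisfies_bounds(1.5, 0.5, b0, d0))
```

In this toy sample λ = 1/2, so B₀ is the midpoint of γ = 0.5 and α = 1.5, in line with the derivation above.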

3.2 Inference

To account for randomness in estimating B₀ and D₀, the trigmmset command reports a worst-case bound as follows. It first estimates B₀ and D₀ using plugin estimators, $\hat{B}$ and $\hat{D}$ , and computes the variance–covariance matrix of the estimates, $\hat{V}$ , using the delta method. Then one can construct a confidence set for (B₀, D₀) at a prespecified significance level δ in the form of an ellipse,

E_{δ} := \{ (B, D) : [B - \hat{B}, \; D - \hat{D}] \, {\hat{V}}^{- 1} {[B - \hat{B}, \; D - \hat{D}]}^{'} \leq χ_{1 - δ} \}

where $χ_{1 - δ}$ is the (1 − δ)th quantile of the χ² distribution with 2 degrees of freedom. This ellipse contains the true (B₀, D₀) with probability 1 − δ.

The identified set for (α, γ) reported by trigmmset is a worst-case bound in that it includes the union of the bounds (8) and (9) for all (B, D) contained within $E_{δ}$. The reported confidence set is slightly coarser than the union bound and takes the form

α ≥ γ, B⁻ ≤ α ≤ B⁺, B⁻ ≤ γ ≤ B⁺

B⁻ ≤ α ≤ B⁺, γ ≤ B⁻

B⁻ ≤ γ ≤ B⁺, B⁺ ≤ α

D* ≥ (α − B⁺)(B⁻ − γ), B⁺ ≤ α, γ ≤ B⁻

where

\begin{aligned} B^{-} = \min_{(B, D) \in E_{δ}} B \\ B^{+} = \max_{(B, D) \in E_{δ}} B \\ D^{*} = \max_{(B, D) \in E_{δ}} \{ D + (B - B^{+}) (B^{-} - B) \} \end{aligned}

A direct algebraic verification shows that the reported region contains the bounds (8) and (9) for all (B, D) within $E_{δ}$, thereby ensuring that the reported region contains the true identified set with a probability of at least 1 − δ.
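Two ingredients of this construction have closed forms, which the following sketch illustrates. The χ² quantile with 2 degrees of freedom equals −2 ln δ, and the extreme values of B over the ellipse are B̂ ± (χ_{1−δ} V̂_BB)^{1/2}, where V̂_BB is the variance of B̂. The function names and the numerical inputs are hypothetical, and the code is a sketch of the geometry rather than the command's implementation. D* is not computed here.

```python
import math


def chi2_quantile_2df(delta):
    """(1 - delta) quantile of the chi-squared distribution with 2 df."""
    return -2.0 * math.log(delta)


def in_ellipse(b, d, b_hat, d_hat, vcov, delta):
    """True if (b, d) lies in the confidence ellipse around (b_hat, d_hat)."""
    (v11, v12), (v21, v22) = vcov
    det = v11 * v22 - v12 * v21
    db, dd = b - b_hat, d - d_hat
    # Quadratic form with the inverse of the 2 x 2 covariance matrix.
    quad = (v22 * db * db - (v12 + v21) * db * dd + v11 * dd * dd) / det
    return quad <= chi2_quantile_2df(delta)


def b_range(b_hat, vcov, delta):
    """Smallest and largest B over the ellipse (B-minus and B-plus)."""
    half_width = math.sqrt(chi2_quantile_2df(delta) * vcov[0][0])
    return b_hat - half_width, b_hat + half_width


# Hypothetical estimates and covariance matrix of (B-hat, D-hat).
v_hat = [[0.04, 0.01], [0.01, 0.09]]
b_minus, b_plus = b_range(1.0, v_hat, delta=0.05)
print(b_minus, b_plus)
```

Because the ellipse is a joint region for (B₀, D₀), the interval [B⁻, B⁺] is wider than a conventional one-parameter confidence interval for B₀ at the same level.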

When covariates x are present, inference for their coefficients becomes slightly more complex. While the inference for (α, γ) remains similar, there is a difference for the covariate coefficient vectors b₁ and b₂. The coefficients b₁ are identified from the reduced-form regression of y on x. However, b₂ cannot be directly identified from a regression of w on x because of the endogeneity of y.

Instead, we use the fact that the combined coefficient vector (γb₁ + b₂) is identified. To derive bounds for individual elements of b₂, let b_1,k and b_2,k be the kth elements of b₁ and b₂, corresponding to the kth covariate x_k. We use the fact that b_2,k = (γb_1,k + b_2,k) − γb_1,k.

Using the inequality γ ≤ B₀, we can bound the term γb_1,k. This logic is applied elementwise for each coefficient b_2,k in the vector b₂, based on the statistically determined sign of the corresponding b_1,k:

If the confidence interval for b_1,k is strictly positive (that is, does not contain 0 and b_1,k > 0), then

b_{2, k} = (γ b_{1, k} + b_{2, k}) - γ b_{1, k} \geq (γ b_{1, k} + b_{2, k}) - B_{0} b_{1, k}

If the confidence interval for b_1,k is strictly negative (that is, does not contain 0 and b_1,k < 0), then

b_{2, k} = (γ b_{1, k} + b_{2, k}) - γ b_{1, k} \leq (γ b_{1, k} + b_{2, k}) - B_{0} b_{1, k}

If the confidence interval for b_1,k contains 0, the sign of b_1,k is undetermined, and the interval for b_2,k is reported as uninformative, that is, (−∞, ∞).

The reported bound for b_2,k in the first two cases is constructed by incorporating the confidence bounds of the estimated components—namely, B₀, b_1,k, and the identified sum (γb_1,k +b_2,k)—to ensure a conservative (worst-case) overall interval. Because multiple parameter estimates contribute to this bound, users can choose to apply a Bonferroni correction for the resulting confidence intervals for the elements of b₂.
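The three cases can be summarized in a few lines. The sketch below is a simplified illustration: it uses point values for B₀, b_1,k, and the identified sum, and it does not widen the result with their confidence bounds as the worst-case interval described above does. The function name `b2k_bound` and the inputs are hypothetical.

```python
import math


def b2k_bound(sum_k, b1_k, b1_ci, b0):
    """One-sided bound for b_{2,k} from the sign of b_{1,k}.

    sum_k: identified value of gamma * b_{1,k} + b_{2,k}.
    b1_k: value of b_{1,k}.
    b1_ci: (lower, upper) confidence limits for b_{1,k}.
    b0: value of B0, which satisfies gamma <= B0 when beta > 0.
    """
    lower, upper = b1_ci
    anchor = sum_k - b0 * b1_k
    if lower > 0:        # b_{1,k} is significantly positive: lower bound
        return anchor, math.inf
    if upper < 0:        # b_{1,k} is significantly negative: upper bound
        return -math.inf, anchor
    return -math.inf, math.inf  # sign undetermined: uninformative


print(b2k_bound(sum_k=2.0, b1_k=0.5, b1_ci=(0.2, 0.8), b0=1.0))
```

The bound is one-sided in the first two cases because only the inequality γ ≤ B₀ is used.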

4 The trigmm command

4.1 Overview

The trigmm command is a wrapper function that provides an interface between the built-in gmm command and the identification results proposed by Lewbel, Schennach, and Zhang (2024). Most of the options and stored results are inherited from the gmm command. Therefore, in this section, we describe the few options and stored results that are specific to the trigmm command.

Users specify the linear model by selecting the list of variables: the endogenous variable y, the dependent variable w, and the exogenous covariates x. Users also choose the set of moment conditions, M_p(α, γ) = 0, by specifying the indices and the sign of β. Based on the variable list, the moment condition indices, and the sign of β, the trigmm command constructs the moment conditions and passes them to the built-in gmm command. Aside from these essential specifications for constructing moment conditions, most of the options for gmm are directly passed to the internal execution of the gmm command.

While its core functionality implements the instrument-free identification strategy, trigmm also allows users to incorporate traditional IV if available. This is done through the instruments() option (see section 4.3), which adds standard moment conditions based on the provided instruments [E(z’Q) = 0, where z are the instruments and Q is the relevant residual from the second structural equation (2)], providing additional overidentifying moment conditions.

4.2 Syntax

The basic syntax of trigmm is as follows:

The varlist contains two variables: the dependent variable, w, and the endogenous variable, y, in that order.

4.3 Options

Most options of the trigmm command are passed directly to the execution of the nested gmm command. Therefore, we omit the description of these options and focus on those specific to the trigmm command. For a detailed explanation of other options, refer to [R] gmm.

covariates(varlist) specifies the exogenous variables of the model.

instruments(varlist) specifies the instrumental variables of the model.

sign(#) specifies the assumed sign of β. Specify any positive number if the unobservable variable affects the endogenous variable and the dependent variable in the same direction, and any negative number if it affects them in opposite directions. Zero is not allowed. The default is sign(1).

p(numlist) specifies moment conditions. The default is p(0 1); that is, M₀(α, γ) = 0, M₁(α, γ) = 0. The user should choose at least two of 0, 1, and 2.

noconstant1 suppresses the constant term in the first regression, (1).

noconstant2 suppresses the constant term in the second regression, (2).

quietly suppresses the terminal output of the nested gmm execution. The final coefficient table produced by trigmm will still be displayed.

from() specifies the initial values for the optimization of the nested gmm execution. The default is obtained by two regressions. trigmm regresses the endogenous variable on other covariates. Regression coefficients are used as the initial values. The variance of the residuals gives initial values for the variance of unobserved variables in the first regression. Next the package regresses the dependent variable on the endogenous variable and other covariates. The regression coefficient is used as the initial values for γ and coefficients of other covariates. Residuals of the second regression are used to set the initial values for remaining parameters. Users can override the initial values.

technique(string) specifies the optimization method. string may be bfgs, nr, gn, or dfp. The default is technique(bfgs).

overid displays Hansen’s J statistic for the test of overidentifying restrictions if the model is overidentified. This option applies only if the model is overidentified and is not available with onestep estimation.

gmm_options are any options documented in [R] gmm.

4.4 The from() option: Default initial values

Note that as with many nonlinear generalized method of moments estimators, the objective function optimized by trigmm can exhibit multiple local minimums. Consequently, the final estimates, particularly in overidentified specifications (for example, when using moment conditions p(0 1 2)), can sometimes be sensitive to the choice of initial values. Therefore, users are encouraged to explore different starting points, for instance, by performing a Monte Carlo search as detailed in section 6.1.

The from() option in the trigmm command facilitates this by allowing users to specify their own initial values for the optimization procedure of the internal gmm command. In scenarios where a reasonable prior guess of the true parameters is unavailable or as a starting point for further exploration, trigmm also provides a default mechanism to generate initial values. However, given the potential for multiple local minimums, relying solely on the default values without further checks is not always advisable. The default initial value generation process is as follows:

First, it conducts a regression of y on x:

y = x^{'} {\hat{b}}_{1} + {\hat{ε}}_{1}

Here ${\hat{ε}}_{1}$ is the residual. The estimated coefficients ${\hat{b}}_{1}$ are used as the initial values for b₁. The sample variance of the residuals, denoted as $\hat{Var} ({\hat{ε}}_{1})$, is computed, and the initial values for the variances of u and v are set as

Var {(u)}^{init} = Var {(υ)}^{init} = \hat{Var} ({\hat{ε}}_{1}) / 2

Next, the default process performs a regression of w on y and x:

w = \hat{γ} y + x^{'} {\hat{b}}_{2} + {\hat{ε}}_{2}

${\hat{ε}}_{2}$ is the residual. The initial values are then set as

b_{2}^{init} = {\hat{b}}_{2}, γ^{init} = \hat{γ}, Var {(r)}^{init} = \hat{Var} ({\hat{ε}}_{2})

Considering the positivity constraint on β, we set β^init = 0.01.

When one uses moment conditions with p = 2, the initial value for µ_ww, defined as

μ_{w w} = Var \{ w - (γ b_{1} + b_{2})^{'} x \}

is set to the sample variance of residuals from the regression of w on x. Users can access these initial values by setting the conv_maxiter() option to 0 and retrieving them from e(init).
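The default rule can be sketched for the simplest case in which x contains only a constant. The code below is an illustration under assumptions, not the package's code: the function name and the returned keys are hypothetical, and the variance divisor (n rather than n − 1) is an assumption because the text does not specify it.

```python
def default_inits(y, w):
    """Default-style initial values when x contains only a constant."""
    n = len(y)
    ybar, wbar = sum(y) / n, sum(w) / n
    # Step 1: regress y on the constant; residuals are deviations from the mean.
    e1 = [a - ybar for a in y]
    var_e1 = sum(e * e for e in e1) / n  # divisor n is an assumption
    # Step 2: regress w on y and the constant; the slope initializes gamma.
    gamma_hat = sum(e * (b - wbar) for e, b in zip(e1, w)) / sum(e * e for e in e1)
    e2 = [(b - wbar) - gamma_hat * e for e, b in zip(e1, w)]
    var_e2 = sum(e * e for e in e2) / n
    return {
        "gamma": gamma_hat,
        "beta": 0.01,            # small positive value, as in the text
        "var_u": var_e1 / 2,     # residual variance split equally
        "var_v": var_e1 / 2,
        "var_r": var_e2,
    }


print(default_inits([0.0, 1.0, 2.0, 3.0], [1.0, 3.5, 4.5, 7.0]))
```

Splitting the first-stage residual variance equally between u and v is only a neutral starting point, which is one reason the text recommends trying other starting values.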

4.5 Stored results: Transformation of parameters

The trigmm command stores the estimated results in e(), making them accessible to postestimation commands such as test. Many of the stored results from the internal gmm command are retained as stored results of the trigmm command. Refer to the help file of the trigmm command for a complete list of stored results.

However, there are aspects of the trigmm command, specifically those involving the transformation of the original parameters, that users should know when interpreting certain stored results. As explained in section 2.3, the sign of β affects the sign of some parameters when constructing the internal gmm moment conditions. Additionally, there are nonnegativity constraints on several parameters: β, Var(u), Var(v), Var(r), and µ_ww . To enforce these constraints, these parameters are expressed as the exponential of corresponding unconstrained variables in the internal gmm command:

β = e^{b}, Var (u) = e^{τ_{u}}, Var (υ) = e^{τ_{υ}}, Var (r) = e^{τ_{r}}, μ_{w w} = e^{τ_{w}}

The stored results e(b_gmm), e(V_gmm), e(init_gmm), e(G_gmm), and e(exp_j_gmm) report the outputs of the internal gmm command based on these transformed parameters.

On the other hand, e(b), e(V), and e(init) represent the estimates, variances, and initial values in terms of the original parameters, respectively. After the gmm command performs estimation of the transformed parameters, the trigmm command converts these estimates back to the original parameters and stores them accordingly. This transformation ensures that the final reported estimates adhere to the required constraints and accurately reflect the original model parameters.
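For a single log-transformed parameter, the conversion back to the original scale can be sketched as follows. The point estimate is the exponential of the internal estimate. For the standard error, the sketch applies the delta method, which is an assumption here because the text does not state how trigmm converts the variance matrix. The function name is hypothetical.

```python
import math


def back_transform(tau_hat, se_tau):
    """Map an estimate of tau = log(theta) and its standard error to theta.

    The delta method gives se(theta) = exp(tau_hat) * se(tau), because the
    derivative of exp(tau) with respect to tau is exp(tau).
    """
    theta_hat = math.exp(tau_hat)
    return theta_hat, theta_hat * se_tau


# Hypothetical internal estimate of tau_u = log(var_u) and its standard error.
var_u, se_var_u = back_transform(tau_hat=0.5, se_tau=0.2)
print(var_u, se_var_u)
```

The back-transformed estimate is positive for any real value of the internal parameter, which is how the nonnegativity constraints are enforced.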

5 The trigmmset command

5.1 Syntax

The varlist contains two variables: the dependent variable, w, and the endogenous variable, y, in that order.

5.2 Options

covariates(varlist) specifies the exogenous variables of the model.

noconstant1 suppresses the constant term in the first regression, (1).

noconstant2 suppresses the constant term in the second regression, (2).

level(#) specifies the confidence level. The default is level(95).

bonferroni applies the Bonferroni correction to adjust confidence intervals for b₂.

graph generates a graph displaying the identified region for (α, γ).

scale(#) adjusts the scale of the graph. The default is scale(5). Larger values zoom out. When betarange() is specified with a positive value, scale() interacts with betarange(). Specifically, when sign() is positive, the γ-axis lower limit is extended by betarange() × scale() below B⁻, and the α-axis upper limit is extended by betarange() × scale() beyond B⁺. Similar logic applies for a negative sign(). If betarange(0) (the default) is used, the plot’s axis scaling reverts to its standard behavior, which is based on the span (B⁺− B⁻) and the scale() value.

betarange(#) specifies a positive real number giving the magnitude of β used to adjust the graph scale. The default is betarange(0).

lstyle1(string) specifies the graph options for boundary line 1 (γ = α).

lstyle2(string) specifies the graph options for boundary line 2 (line parallel to α axis).

lstyle3(string) specifies the graph options for boundary line 3 (line parallel to γ axis).

lstyle4(string) specifies the graph options for boundary line 4 (hyperbola).

6 Examples

In this section, we demonstrate the application of the trigmm package using the dataset from Acemoglu and Johnson (2007), which investigates the causal effect of health on economic growth. Acemoglu and Johnson (2007) proxy general health by life expectancy at birth and examine various economic outcomes, including log GDP per working age population. For our analysis, we focus on the change in life expectancy (y) and the difference in log GDP per working age (w) between 1940 and 1980. The exogenous variables (x) include a constant term and a measure of institutional quality for each country.

A primary concern in this analysis is the potential endogeneity of health improvements. Countries with increasing life expectancy may also experience advancements in economic productivity because of unobserved factors such as technological progress (u), which could simultaneously influence both health (y) and GDP growth (w). To address this issue, Acemoglu and Johnson (2007) use predicted mortality from various diseases as an instrumental variable (z), under the assumption that disease decline driven by global interventions is exogenous to country-specific technological changes. However, there remains a possibility that these health interventions are still correlated with unobserved technological advances, potentially violating the exclusion restriction of the instruments.

To mitigate concerns regarding instrument validity, we use the trigmm package’s instrument-free estimation approach on the same dataset. Our analysis reveals that the estimates obtained through the instrument-free method are comparable with those estimated using the IV approach.

6.1 Example 1: Estimation without covariates

To satisfy the required condition (5) for the identification result to hold, we must ensure specific conditions on the unobservable variables u and v, such as their non-Gaussianity, as discussed following (5). Although a direct test for this assumption is unavailable, if y is close to normal, it may suggest that either u or v is also close to normal. In our analysis, the skewness of y is 0.170, and the kurtosis is 1.791. Additionally, the p-value from a Shapiro–Wilk test of normality (implemented using Stata’s built-in swilk command) is 0.02. Based on this diagnosis, we proceed with the trigmm method.

Before delving into the trigmm results, we report OLS and IV regression results. When we use IV, the coefficient of y decreases from −0.78 to −1.35, which is consistent with our assumption that β is positive.
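The link between this comparison and the sign of β follows from the derivation in section 3.1. As a rough check, assume that the OLS slope estimates B₀ and that the IV estimate is consistent for γ (both are assumptions of this sketch), and ignore sampling error:

```latex
\begin{aligned}
B_{0} - \gamma &= \lambda (\alpha - \gamma) = \lambda \beta,
  \qquad \lambda = \frac{E(u^{2})}{E(u^{2}) + E(v^{2})} \in [0, 1] \\
-0.78 - (-1.35) &= 0.57 > 0 \;\Longrightarrow\; \lambda \beta > 0
\end{aligned}
```

A positive difference between the OLS and IV coefficients therefore points to β > 0 under these assumptions.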

Next we apply the just-identifying moment conditions, M₀(α, γ) = 0 and M₁(α, γ) = 0, by setting the option p(0 1). We input an arbitrary positive number for the sign() option to indicate the assumed sign of β. When the covariates() option is unspecified, the trigmm command automatically demeans y and w. The noconstant1 and noconstant2 options suppress this demeaning process for the first and second regressions, respectively. Additionally, we specify certain options to be passed to the nested gmm command.

The initial values for the from() option were determined by a Monte Carlo search. Given the nonlinearity of the moment conditions, finding good starting values is important for the gmm estimation. Because a full grid search can be computationally expensive, we randomly sampled initial values. One strategy to define a search region is to first obtain a reasonable starting point using trigmm’s default initial value calculation (detailed in section 4.4) and then sample candidate values from a region around that point. From the converged estimations resulting from these random starting values, we selected the set of initial values that yielded the minimum gmm objective function value, e(Q). Specifically, for the examples, we drew 1,000 sets of initial values, with γ sampled from Unif[−3, 0], and the transformed parameters $b = \ln (β)$ , $τ_{u} = \ln (σ_{u}^{2})$ , $τ_{υ} = \ln (σ_{υ}^{2})$ , $τ_{r} = \ln (σ_{r}^{2})$ , and (if applicable) τ_w = ln(µ_ww) sampled from Unif[−3, 2]. In example 2 in section 6.2, where covariates are present, initial values for the coefficients of these covariates (elements of b₁ and b₂) were additionally required. These were sampled from Unif[−2, 2] for each coefficient. For an implementation of a Monte Carlo search for initial values using simulated data, see the accompanying do-file, example.do.
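The search strategy can be sketched generically. In the code below, starting values are drawn from the uniform ranges given above for the model without covariates and p(0 1), and the draw with the smallest objective value is kept. The `objective` argument is a hypothetical stand-in for running the estimator from a given start and returning the converged objective value. The function names and the seed are also assumptions, and the accompanying example.do remains the reference for the Stata implementation.

```python
import random

# Sampling ranges from the text: gamma from Unif[-3, 0], the log-transformed
# parameters from Unif[-3, 2].
RANGES = {
    "gamma": (-3.0, 0.0),
    "b": (-3.0, 2.0),
    "tau_u": (-3.0, 2.0),
    "tau_v": (-3.0, 2.0),
    "tau_r": (-3.0, 2.0),
}


def draw_start(rng):
    """Draw one set of starting values from the uniform ranges."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()}


def random_search(objective, n_draws=1000, seed=12345):
    """Keep the starting values that give the smallest objective value."""
    rng = random.Random(seed)
    best_start, best_value = None, float("inf")
    for _ in range(n_draws):
        start = draw_start(rng)
        value = objective(start)
        if value < best_value:
            best_start, best_value = start, value
    return best_start, best_value


# Hypothetical objective used only to exercise the search.
toy_objective = lambda s: (s["gamma"] + 1.5) ** 2 + (s["b"] - 0.5) ** 2
best, value = random_search(toy_objective, n_draws=1000)
print(best["gamma"], best["b"], value)
```

Fixing the seed makes the search reproducible, which helps when reporting which starting values were selected.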

The following conventions are used for naming the model’s parameters. When there are no exogenous variables, gamma = γ, beta = β, var_u = Var(u), var_v = Var(v), and var_r = Var(r). When using the moment condition with p = 2, we have an additional parameter, mu_ww = Var(w). When the quietly option is not specified, the trigmm command also displays outputs from the internal gmm execution. As explained in section 4.5, the internal gmm command operates on transformed parameters. The naming of parameters is as follows: b = log(beta), tau_u = log(var_u), tau_v = log(var_v), tau_r = log(var_r), and tau_w = log(mu_ww) when the sign() is positive. When it is negative, the signs of beta and gamma are negated as explained in section 2.3. However, we note that all of these transformations are performed by the trigmm command when specifying the from() option and displaying the final results, so users need not be concerned about these complications. The following log captures the results:

The standard errors are very large, and the estimate for γ is substantially different from the IV result. This discrepancy might suggest that (5) is close to being violated. To address this issue, we consider adding higher-order moment conditions, which can help stabilize the estimates. Consequently, we set the moment condition option p(0 1 2) to construct overidentifying restrictions with higher moments.

In this revised analysis, the results obtained using higher-order moment conditions exhibit smaller standard errors than those obtained using only p = 0, 1, and the estimate for γ becomes comparable with the one obtained using Acemoglu and Johnson’s (2007) IV approach. Specifically, the IV estimate of the effect of life expectancy on log GDP per working-age population is −1.35 with a standard error of 0.43. By comparison, the row for gamma in the results from the overidentifying restrictions shows an estimate of −1.60 with a standard error of 0.71.

When the model is overidentified (for example, by using p(0 1 2) or including external instruments) and the overid option is specified, trigmm reports Hansen’s J statistic for the joint validity of all moment conditions. Users should interpret this test with caution. If overidentification arises primarily from higher-order internal moments rather than from traditional external instruments, a rejection of the J test may indicate a failure of the underlying distributional assumptions or model specification, rather than invalid external instruments. Therefore, the source of overidentification should be considered when evaluating the J statistic.
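For readers who want to see how a J statistic maps to a p-value, the following Python sketch computes the upper-tail probability of a chi-squared distribution whose degrees of freedom equal the number of moment conditions minus the number of parameters, which is the standard reference distribution for Hansen's test. The function names are hypothetical, and the moment and parameter counts must be supplied by the user; nothing here reproduces trigmm's internal calculation.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable, integer df."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    h = x / 2.0
    # Start from a closed form: Q(1/2, h) = erfc(sqrt(h)) for odd df,
    # Q(1, h) = exp(-h) for even df (Q is the regularized upper gamma function).
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(h))
    else:
        a, q = 1.0, math.exp(-h)
    # Recurrence: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1).
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


def hansen_j_pvalue(j_stat, n_moments, n_params):
    """P-value of the J statistic with df = n_moments - n_params."""
    df = n_moments - n_params
    if df <= 0:
        raise ValueError("the model must be overidentified")
    return chi2_sf(j_stat, df)
```

A small p-value rejects the joint validity of all moment conditions. As the paragraph above cautions, it does not by itself say which condition fails.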

As an additional diagnostic, we examine the average derivative of the moment conditions with respect to the parameters. This diagnostic is stored in e(G_gmm). We perform a singular value decomposition of this matrix and compute the ratio of its maximum singular value to its minimum singular value. For p(0 1), the ratio is 2136.69, and for p(0 1 2), the ratio is 121.90. The ratio is therefore 17.53 times smaller (2136.69/121.90) when higher-order moments are included, indicating better identification of the model parameters. The code for this singular value diagnostic is provided in the accompanying example.do file.
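The ratio of the largest to the smallest singular value can be illustrated with a short Python sketch. It is not the code in example.do: the function names are hypothetical, and the singular values are obtained by Jacobi rotations on G'G only so that the example runs without external libraries (in practice a linear algebra library's SVD routine would be the natural choice).

```python
import math


def singular_values(G):
    """Singular values of G (a list of rows), largest first.

    Uses cyclic Jacobi rotations to diagonalize A = G'G; the singular
    values of G are the square roots of the eigenvalues of A. Forming
    G'G squares the condition number, so this is for illustration only.
    """
    n = len(G[0])
    A = [[sum(row[i] * row[j] for row in G) for j in range(n)] for i in range(n)]
    total = sum(A[i][j] ** 2 for i in range(n) for j in range(n))
    for _ in range(100):  # sweeps over all off-diagonal pairs
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off <= 1e-24 * total:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) element.
                theta = 0.5 * math.atan2(2.0 * A[p][q], A[q][q] - A[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = A[k][p], A[k][q]
                    A[k][p] = c * akp - s * akq
                    A[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k] = c * apk - s * aqk
                    A[q][k] = s * apk + c * aqk
    return sorted((math.sqrt(max(A[i][i], 0.0)) for i in range(n)), reverse=True)


def singular_ratio(G):
    """Ratio of the maximum to the minimum singular value of G."""
    sv = singular_values(G)
    return math.inf if sv[-1] == 0.0 else sv[0] / sv[-1]
```

A large ratio means that some linear combination of the parameters barely moves the moment conditions, which is why a smaller ratio is read as a sign of better identification.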

6.2 Example 2: Estimation with covariates

One of the specifications in Acemoglu and Johnson (2007) incorporates a measure of institutional quality as a covariate. We replicate the diagnostic assessment of the validity of (5) as performed in example 1. Specifically, after obtaining the residuals from the regression of y on x, we evaluate the distributional properties of these residuals. The residuals exhibit a skewness of 0.74 and a kurtosis of 3.02. Additionally, the Shapiro–Wilk test for normality yields a p-value of 0.01, leading us to reject the null hypothesis of normality.
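As a reminder of what the skewness and kurtosis figures measure, here is a minimal Python sketch using the usual moment-based definitions, under which a normal distribution has skewness 0 and kurtosis 3. The function name is hypothetical, and whether the reported values were computed with exactly this convention is an assumption.

```python
import math


def skewness_kurtosis(values):
    """Moment-based skewness m3 / m2**1.5 and kurtosis m4 / m2**2.

    m_k is the k-th central sample moment with divisor n, so kurtosis is
    not reported in excess of 3.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2
```

Applied to regression residuals, a skewness away from 0 or a kurtosis away from 3 is informal evidence of non-Gaussian errors, which a formal test such as Shapiro–Wilk then assesses.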

When a covariate is included in the model, the trigmm command imposes an additional assumption that the unobservable variables are independent of x. Although a direct test for this assumption is unavailable, we can examine necessary conditions such as homoskedasticity. We perform the White heteroskedasticity test on the residuals from the regression of y on x and obtain a p-value of 0.97, which does not reject the null hypothesis of homoskedasticity. However, the p-value from the White test on the residuals of the regression of w on x is 0.07, which is close to rejection at conventional significance levels. Therefore, the results in this example may be less reliable than those in example 1.

We present the OLS and IV regression results below:

In the IV regression, the estimate of γ decreases from −0.94 (OLS) to −1.82, which is consistent with our assumption that β is positive.

Next we apply the trigmm command to fit the model. In the presence of covariates, we use additional notational conventions to represent the parameters related to these covariates. Specifically, the coefficients of covariates b₁ and b₂ in (1) and (2) are represented by xb1: and xb2: followed by the variable names, respectively. For example, the coefficient of the institution variable in the first equation is written as xb1:institution. The coefficients of the constant terms are represented by xb1:_cons and xb2:_cons and can be suppressed using the noconstant1 and noconstant2 options, respectively. The following results are from just-identifying moment restrictions with p = 0, 1 and overidentifying restrictions with p = 0, 1, 2.

As in example 1, the standard errors are smaller when using overidentifying restrictions. Additionally, the singular value ratio decreases by a factor of 5.96, indicating better identification of the model parameters. The estimates obtained using overidentifying restrictions are comparable with those from the IV estimates, suggesting that the trigmm method provides reliable results under these conditions. The coefficients of life expectancy, institution, and the constant term are, respectively, −1.82, −0.07, and 1.86 in IV regression, while they are −1.44 (gamma), −0.05 (xb2:institution), and 1.63 (xb2:_cons) in trigmm results with overidentifying restrictions.

Furthermore, we explore combining the moment conditions from the trigmm command with additional moment conditions derived from the instruments. This approach produces coefficients that are very close to the IV estimates and standard errors that are somewhat smaller. The standard errors for the coefficients of life expectancy, institution, and the constant term are 0.595, 0.047, and 0.421 in the IV regression, whereas they are 0.574, 0.043, and 0.392 in the trigmm results with the instrumental variable. This alignment is expected if our moment conditions are valid and the proposed instruments are strong.

Finally, we apply the set identification results using the trigmmset command with a significance level of 0.05. The confidence bounds are reported in the subsequent log, and the graph option generates figure 1:

Figure 1. trigmmset output

Comparing this with the results from overidentifying restrictions without IV, we find that α, defined as β + γ, is 0.91 and γ is −1.44, so the point estimate falls into region 3 of the reported identified set. Additionally, the estimates for b₁ and b₂ are contained within the reported confidence intervals, which is consistent with the point estimation results.

7 Conclusions

We introduced the trigmm package, which implements a novel method for identifying a triangular two-equation system. A key assumption of this method is the additive decomposition of the error terms into two independent components, one of which is common to both equations. This approach provides an easy-to-use alternative to IV estimation, proving particularly useful when researchers face challenges in finding suitable exogenous instruments.

When the non-Gaussianity of error terms can be further ascertained, users can leverage the point identification results using the trigmm command. Additionally, the trigmmset command offers complementary functionality for cases where the non-Gaussianity assumption does not hold, providing bounds on parameters.

The moment conditions of the trigmm command are nonlinear, resulting in a non-convex objective function that may present optimization challenges. We note that the bfgs optimizer within the gmm command performed well compared with other built-in optimizers. However, exploring alternative optimization techniques remains an area for future work.

Furthermore, refinements to our approach to inference under set identification could be beneficial. Currently, the trigmmset command uses slightly conservative bounds to provide a geometrically simple and closed-form description of a region that bounds the identified set. Making these bounds less conservative while maintaining simplicity and user friendliness would improve the package’s utility.

Supplemental Material

Supplemental material, sj-txt-1-stj-10.1177_1536867X261425780 for Instrument-free estimation of triangular equation systems with the trigmm command by Heejun Lee, Arthur Lewbel, Susanne M. Schennach and Linqi Zhang

Footnotes

Acknowledgments

We thank the anonymous referee and the editor for their detailed comments, which have greatly improved this package and the article.

Susanne M. Schennach acknowledges support from NSF grants SES-1950969 and SES-2150003.

8

To install the software files as they existed at the time of publication of this article, type

About the authors

Heejun Lee is a PhD candidate in economics at Brown University.

Arthur Lewbel is a professor of economics at Boston College.

Susanne M. Schennach is a professor of economics at Brown University.

Linqi Zhang is an assistant professor at The Chinese University of Hong Kong.

References

Acemoglu, D., and S. Johnson. 2007. Disease and development: The effect of life expectancy on economic growth. Journal of Political Economy 115: 925–985. 10.1086/529000.

Baum, C. F., and A. Lewbel. 2019. Advice on using heteroskedasticity-based identification. Stata Journal 19: 757–767. 10.1177/1536867X19893614.

Card, D. E. 1995. “Using geographic variation in college proximity to estimate the return to schooling”. In Aspects of Labour Market Behaviour: Essays in Honour of John Vanderkamp, edited by L. N. Christofides, E. K. Grant, and R. Swidinsky, 201–222. Toronto, Canada: University of Toronto Press.

Card, D. E. 2001. Estimating the return to schooling: Progress on some persistent econometric problems. Econometrica 69: 1127–1160. 10.1111/1468-0262.00237.

Clarke, D., and B. Matta. 2018. Practical considerations for questionable IVs. Stata Journal 18: 663–691. 10.1177/1536867X1801800308.

Hansen, L. P. 1982. Large sample properties of generalized method of moments estimators. Econometrica 50: 1029–1054. 10.2307/1912775.

Klein, R., and F. Vella. 2010. Estimating a class of triangular simultaneous equations models without exclusion restrictions. Journal of Econometrics 154: 154–164. 10.1016/j.jeconom.2009.05.005.

Kripfganz, S., and J. F. Kiviet. 2021. kinkyreg: Instrument-free inference for linear regression models with endogenous regressors. Stata Journal 21: 772–813. 10.1177/1536867X211045575.

Lewbel, A. 2012. Using heteroscedasticity to identify and estimate mismeasured and endogenous regressor models. Journal of Business and Economic Statistics 30: 67–80. 10.1080/07350015.2012.643126.

Lewbel, A., S. M. Schennach, and L. Zhang. 2024. Identification of a triangular two equation system without instruments. Journal of Business and Economic Statistics 42: 14–25. 10.1080/07350015.2023.2166052.

Lukacs, E. 1970. Characteristic Functions. 2nd ed. London: Griffin.

McCarthy, I., D. L. Millimet, and R. Roy. 2015. Bounding treatment effects: A command for the partial identification of the average treatment effect with endogenous and misreported treatment assignment. Stata Journal 15: 411–436. 10.1177/1536867X1501500205.

Oster, E. 2019. Unobservable selection and coefficient stability: Theory and evidence. Journal of Business and Economic Statistics 37: 187–204. 10.1080/07350015.2016.1227711.

Rigobon, R. 2003. Identification through heteroskedasticity. Review of Economics and Statistics 85: 777–792. 10.1162/003465303772815727.
