Abstract
In this paper, we model good and bad outputs simultaneously. We use an input distance function (IDF) with endogenous inputs as well as endogenous bad outputs, which is novel in the literature. Moreover, we allow input efficiency to depend on the production of bad outputs, which lets us investigate whether emissions of pollutants (bad outputs) are related to technological performance (technical efficiency). We also model the production of each bad output separately with a spatial structure, each depending on the production of good outputs, inputs and other exogenous variables. These bad output production functions allow us to estimate both direct and indirect effects of good output on the production of bad outputs, which may be of special interest because they show the cost to society, in terms of pollutants released to the environment, of increasing the production of good outputs. We apply the new technique to a data set on U.S. electric utilities with four bad outputs, three inputs and two good outputs. We use a Bayesian technique to estimate the model, which is a system consisting of the input distance function, reduced form equations for each input, the dynamics of inefficiency, and a separate production technology for each bad output. Empirically, bad outputs are found to affect inefficiency positively. Percentage increases in inefficiency due to a one percent increase in each bad output are found to vary from 0.225% to 0.42%. Energy prices are found to be positively related to inefficiency. From the spatial specifications of bad outputs, we find that the spillover effects of increasing production of good outputs account for the majority of the total effect, indicating that neighborhood effects are more important than own effects. This means that neighboring utilities play a crucial role, indicating “contagion” of practices.
1. Introduction
In many production processes, the production of desirable (good) outputs also produces some unintentional (undesirable or bad) outputs. Electric power generation is one such example. This unintentional but inevitable outcome is labeled “bi-production” (BP). The modeling of production technologies that involve undesirable outputs follows two distinct strands. In the directional distance function and transformation function approaches (see Färe et al. (2005), Atkinson and Dorfman (2005), Fernández et al. (2005), Agee et al. (2014), among others), the technology is specified by a single equation in which good and bad outputs as well as inputs enter as arguments. In contrast, the BP approach consists of separate production technologies for good and bad outputs.
In this paper we use a modeling approach that relies on the BP concept introduced in Murty et al. (2012), Førsund (2009), Fernández et al. (2002), and Kumbhakar and Tsionas (2016). The technology for the production of good outputs is specified in terms of a standard translog input distance function (IDF) with input-oriented technical inefficiency. We depart from the Kumbhakar and Tsionas (2016) model by including bad outputs in the IDF. Our by-production technologies depend on good outputs. Thus we consider simultaneity between good and bad outputs. That is, as in a single equation representation of the technology, our IDF also depends on bad outputs. Other distinguishing features of our model are as follows. (i) We use a very flexible and dynamic inefficiency model. Inefficiency is allowed to depend on the current and past inefficiency of nearby electric plants via a spatial structure. It also depends on some exogenous variables and, more importantly, on bad outputs. (ii) The production technologies of bad outputs are specified as spatial autoregressive (SAR) models, each bad output with its own SAR parameters. The exogenous variables in these functions are good outputs and some other environmental variables (which are ideally inputs used to mitigate production of bad outputs).
Since we allow simultaneity between good and bad outputs in the IDF, it cannot be estimated directly because of the endogeneity of bad outputs and the possible endogeneity of input ratios (which appear in the IDF because it is homogeneous of degree 1 in inputs). To control for this endogeneity, following the BP approach, we add separate technologies for the production of each bad output, each of which also depends on the quantity of good outputs produced. Thus, the simultaneity between good and bad outputs is addressed by using a system approach, in contrast to the single equation approach in Färe et al. (2005) and Atkinson and Dorfman (2005), in which endogeneity is not corrected. The main advantages of doing this are as follows. (i) We can compute the shadow price of each bad output from
Note that here we are using an IDF from which we can compute the marginal effects of each bad output on cost represented by the IDF (
There are several papers that deal with the issue in general terms in a stochastic frontier (SF) framework. For example, Atkinson, Primont and Tsionas (2018) compute optimal directions subject to profit maximization and cost minimization, correct for the potential endogeneity of inputs and outputs, estimate latent prices for bad outputs, measure firms’ responses to shadow prices rather than actual prices, and introduce an unobserved productivity term. They also apply their techniques to U.S. electric utilities. Atkinson and Tsionas (2018) include parameters to estimate optimal directions which correspond to the firm’s profit-maximizing (PM) position and generalize the directional distance function (DDF) to a shadow-quantity DDF. To estimate the shadow quantities and the structural parameters, they form the shadow DDF system, which includes the shadow DDF and all the first-order price equations from the shadow-PM problem. Note that these papers are not directly related to what we do here, because none of them deals with spatial models with dynamics and with endogeneity of both good and bad outputs as well as inputs. Our model is quite flexible and technically sophisticated. Kumbhakar and Tsionas (2016, hereafter KT) is the first paper that uses an IDF and addresses endogeneity of inputs in a bi-production model. However, although they use panel data in their application, their models did not exploit any panel features and contained no dynamics. 2
There are other papers in the literature that use the nonparametric DEA approach, such as Bigerna et al. (2020), which accounts for greenhouse gas emissions as a bad output, and Bigerna et al. (2019), which estimates the spatial effects on efficiency. Our literature survey is not exhaustive, primarily because we focus first on bi-production spatial econometric models with good and bad outputs, and then on endogeneity of inputs and outputs (both bad and good) with dynamic inefficiency. Although there are some papers in the first category, which we cited above, there are no papers that deal with both. That is where the contribution of this paper lies. Our main contributions are as follows. First, we consider a truly simultaneous model (not a triangular one as in KT) in which the production of good output depends on the bad outputs and vice versa. We introduce separate equations to represent the technology of good output production (in terms of an IDF) and separate technologies for the production of each bad output. Input-oriented technical inefficiency is made dynamic by allowing inefficiency to depend not only on its past values but also on some exogenous variables and the production of bad outputs. We assume that the input ratios in the IDF are endogenous and introduce a system of equations with exogenous instruments. Finally, we use a Bayesian method to estimate the system.
2. Model
We specify the production technology in terms of a transformation function
where
where
where the lowercase letters are logarithms of uppercase letters, and
where
where
where
where
These equations can be written in the form
These are compactly written as
where
and write (10) as
Denote by
Marginal effects of inefficiency can be derived as follows. Since we specified inefficiency as a dynamic process, it should have a steady-state value that is finite. If we denote the steady-state values of
In the short-term we have
The marginal effects of variables in
The marginal effects of variables in
where the dimensionality
whose dimensionality is
where
The system of equations used for estimation are (1), (4), (5), and (12). Estimation details are reported in the Appendix A. Finally, the decomposition into direct and indirect (spillover) effects is performed as follows. In log form, most marginal effects can be expressed in the generic form as
where b represents any parameter in the right-hand side of (7) for example. The first row of A, for example, is
This procedure is repeated to compute marginal effects of all the variables of interest.
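The decomposition described above can be illustrated with a minimal sketch. It assumes the generic form is the standard SAR one, A = b (I − ρW)^{-1}, so that the (i, j) element of A is the effect of a change at utility j on utility i. The function name `sar_effects`, the toy weight matrix, and the values of `rho` and `b` are hypothetical and are not taken from our estimates.

```python
import numpy as np

def sar_effects(W, rho, b):
    """Average direct, indirect and total effects of one regressor in a
    SAR equation y = rho * W y + b * x + ... (assumed generic form)."""
    n = W.shape[0]
    # A[i, j] = d y_i / d x_j under the assumed SAR form
    A = b * np.linalg.inv(np.eye(n) - rho * W)
    direct = np.trace(A) / n      # average own (diagonal) effect
    total = A.sum() / n           # average row sum of A
    indirect = total - direct     # average spillover from other units
    return direct, indirect, total

# Toy example: three units on a line with row-normalised weights (hypothetical)
W = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
direct, indirect, total = sar_effects(W, rho=0.4, b=0.3)
own_share = direct / total        # analogue of a "% own" share
```

With a row-normalised W the average total effect reduces to b / (1 − ρ), which gives a quick sanity check on any implementation of this kind.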
3. Data
Our data are sourced from Atkinson, Primont, and Tsionas (2018). They contain information on 77 U.S. electric utilities observed during the period 1995–2005. The panel is unbalanced and contains 1201 plant-year observations. We use three inputs (energy, labor, and capital), two good outputs (residential and industrial/commercial delivery of electricity) and four bad outputs: SO2, CO2, NOx, and sulfur (S). The quantity of energy is measured in total Btu of fuel consumed, and the quantity of labor is the number of full-time employees plus 50% of the part-time employees. Capital is measured as the expenses on interest plus depreciation. We also have prices for the inputs and good outputs. In the IDF, labor appears on the left-hand side, and we also include a dummy for privatized utilities and the vintage of equipment for each plant. The W matrix was constructed using GIS information. We use the inverse distance weighting formulation
with
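As an illustration of inverse distance weighting, the following minimal sketch builds a row-normalised W from unit locations. It is not the construction used for our data: the planar Euclidean distance, the exponent `power`, the row normalisation and the function name `inverse_distance_weights` are all assumptions made for the example, and the coordinates are hypothetical.

```python
import numpy as np

def inverse_distance_weights(coords, power=1.0):
    """Row-normalised inverse-distance weights from planar coordinates.
    Assumes all locations are distinct (no zero off-diagonal distances)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # pairwise Euclidean distances
    W = np.zeros_like(dist)
    off_diag = ~np.eye(len(coords), dtype=bool)
    W[off_diag] = dist[off_diag] ** (-power)   # w_ij = d_ij^(-power), w_ii = 0
    W /= W.sum(axis=1, keepdims=True)          # each row sums to one
    return W

# Three hypothetical units located on a line
W = inverse_distance_weights([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)])
```

In this toy case the first unit places weight 0.75 on its neighbour at distance 1 and 0.25 on the one at distance 3, so nearer utilities receive more weight.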
Summary statistics of the variables used are given in Table 1.
Descriptive Statistics
4. Results
4.1 Estimation and results
We use 150,000 Markov Chain Monte Carlo (MCMC; Tierney, 1994) iterations, the first 50,000 of which are discarded to mitigate possible start-up effects. Moreover, we use 1,000 particles per MCMC iteration. Details on MCMC are reported in Appendix A. MCMC performance and prior sensitivity are taken up in Appendix B.
4.2 Parameter estimates
Of special interest are parameters
Posterior moments of SAR parameters and other functions of interest

Marginal posterior densities of SAR parameters

Sampling distributions of posterior mean estimates of functions of interest
The IDF in (1) includes time dummies. We estimate Technical Change (TC) using the coefficients of time dummies in (1), i.e.,
4.3 Efficiency change decomposition
Given efficiency estimates,
In part (b) of Figure 2 we report own effects and effects due to others of technical inefficiency for the current period
4.4 Marginal effects and shadow prices
Technologies of the production of bad outputs are specified in (7) (rewritten in (9)). We use (9) to examine the marginal effects of good outputs (residential and commercial use of electricity) on the four bad outputs obtained from these formulas. From (9)
Since

Marginal effects of good output on bad outputs
In Table 3, we report marginal effects of prices on labor, good and bad outputs. Note that since the IDF is homogeneous of degree 1 in inputs, the effect on labor is the same as that on the other inputs and therefore on cost. Since prices are used as exogenous variables, it is of interest to check their marginal effects, which have both a direct and an indirect component. The formulas are shown in equations (21) and (22). We report the overall effect (direct and indirect combined) in Table 3. Marginal effects of output prices on outputs are positive, as expected. Their effects on labor and capital are negative, while the effect on energy is positive. The effects on all four bad outputs are positive. An increase in output prices increases supply (production), which in turn increases the production of bad outputs. This is expected because it is not possible to reduce bad outputs when the production of good outputs increases. The marginal effects of input prices on outputs are negative because an increase in input prices reduces input use, which lowers the production of good and bad outputs.
Marginal effects of prices
In Table 4, we report marginal effects of determinant (environmental or contextual) variables on technical inefficiency
Marginal effects of determinants of inefficiency
Notes: Reported are posterior means with posterior standard deviations in parentheses.
Sampling distributions of some of these marginal effects are presented in Figure 4. It can be seen that the marginal effects of bad outputs on inefficiency are positive. These plots indicate that utilities that produce more bad outputs are also more inefficient. These findings suggest that plants that produce more bad outputs have poor management quality, which shows up as high inefficiency.

Marginal effects of bad outputs on inefficiency
4.5 Dynamic effects
The dynamic effect of

Dynamic effect of inefficiency on cost (%)
In panels (a) through (d) of Figure 6, we report dynamic effects of bad outputs on inefficiency; specifically their percentage contribution to own production of bad outputs. This is computed using the simultaneous relationship between bad outputs and good output. That is,

Dynamic effects of bad outputs on inefficiency
4.6 Output elasticities
Finally, in Table 5, we report posterior moments for certain elasticities. Note that elasticity of
Marginal effects (elasticities)
Notes: Inputs are energy, labor, and capital. Good outputs are residential and industrial consumption. Bad outputs are SO2, CO2, NOx, and S. ∂ logx1/∂ logY1 is computed from (1) and (5). All other elasticities take W into consideration. “% own” is the percentage of the total effect that is attributed to a specific utility, computed as one minus the expression in (20).
In the last block of the table we report the marginal effects (elasticities) of bad outputs on inefficiency. As mentioned earlier, these effects are all positive, indicating that an increase in bad output is associated with an increase in inefficiency (and therefore cost). This might be the effect of tight regulation, meaning that an increase in bad outputs is not free: plants need to incur additional cost, for example by buying permits to pollute. Alternatively, when bad outputs increase, more management (quality) is exercised, as the production of bad outputs is highly regulated in the U.S. 6
5. Concluding Remarks
In this paper we use a bi-production approach to model the production of good and bad outputs. The good output production technology is modeled using an input distance function. We use panel data on U.S. electric utilities for this purpose. In the production of good outputs we allow for the endogeneity of inputs and bad outputs (SO2, CO2, NOx and S). This enables us to estimate the marginal effect of an increase in good output on the production of each bad output. The shadow prices, on the other hand, indicate the sacrifice of good output for a decrease in the production of each bad output. Thus, conceptually, a shadow price is the opposite of the marginal effect of good output, and is associated with a decrease in each of the bad outputs. The shadow elasticity in a logarithmic model is the percentage decrease (sacrifice) in good output for a one percent decrease in each bad output due to a 1% increase in good output. In our proposed model the elasticities of each of the bad outputs are directly estimated, instead of the shadow elasticities. These elasticities range from 0.137% to 0.36%. An advantage of using the IDF is that since
Supplemental Material
sj-pdf-1-enj-10.5547_01956574.45.1.mtsi – Supplemental material for Endogenous Bad Outputs and Technical Inefficiency in U.S. Electric Utilities
Supplemental material, sj-pdf-1-enj-10.5547_01956574.45.1.mtsi for Endogenous Bad Outputs and Technical Inefficiency in U.S. Electric Utilities by Mike Tsionas and Subal C. Kumbhakar in The Energy Journal
Footnotes
Appendix A
Appendix B
First, we focus on sensitivity of the posterior with respect to our prior assumptions. In our benchmark prior, we have
The results are reported in Figure B.1 for the main quantities of interest, viz. productivity growth (panel (a)), inefficiency (panel (b)), IDF parameters (panel (c)) and reduced form parameters for both inputs and bad outputs (panel (d)). Across the 10,000 different priors, we report densities of the differences of posterior means and posterior standard deviations relative to the benchmark case. 7 Evidently, the model is reasonably robust, so inferences based on the benchmark prior are representative.
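To illustrate how sampling-importance-resampling can be used for this kind of prior sensitivity check, the sketch below reweights draws obtained under a benchmark prior by the ratio of an alternative prior to the benchmark prior and then resamples. It is a minimal one-parameter example under stated assumptions: the normal priors, their hyperparameters, the stand-in "posterior" draws and the function names are hypothetical and do not correspond to the priors or parameters of our model.

```python
import numpy as np

def log_normal_pdf(theta, mean, sd):
    """Log density of N(mean, sd^2) evaluated at theta."""
    return -0.5 * np.log(2 * np.pi * sd ** 2) - 0.5 * ((theta - mean) / sd) ** 2

def sir_reweight(draws, log_prior_old, log_prior_new, rng):
    """Resample benchmark-prior draws so they target the posterior
    under the alternative prior (the likelihood cancels in the weights)."""
    log_w = log_prior_new(draws) - log_prior_old(draws)
    w = np.exp(log_w - log_w.max())        # subtract max for numerical stability
    w /= w.sum()
    idx = rng.choice(len(draws), size=len(draws), replace=True, p=w)
    return draws[idx], w

rng = np.random.default_rng(0)
draws = rng.normal(1.0, 0.2, size=20_000)          # stand-in for MCMC output
bench = lambda t: log_normal_pdf(t, 0.0, 10.0)     # hypothetical benchmark prior
alt = lambda t: log_normal_pdf(t, 0.5, 5.0)        # hypothetical alternative prior
resampled, w = sir_reweight(draws, bench, alt, rng)
shift = resampled.mean() - draws.mean()            # change in posterior mean
```

Repeating the last three lines over many alternative priors and collecting `shift` (and the analogous change in the posterior standard deviation) yields densities of the type shown in Figure B.1, without rerunning MCMC for each prior.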
To examine numerical performance we rely on autocorrelation functions (panel (a) of Figure B.2) and the standard Geweke (1992) diagnostics, viz. relative numerical efficiency (RNE, panel (b)) and convergence diagnostics (GCD, panel (c)). We consider acfs, RNEs and absolute GCDs for a given lag across 50 randomly chosen parameters. RNE would be close to one if i.i.d. sampling from the posterior were possible, and GCDs asymptotically (in the number of MCMC draws) follow a standard normal distribution, so absolute values above 1.96 would indicate non-convergence. Specifically, GCD is the absolute value of a z-test that compares the means of the first and last 25% of MCMC draws. For particle filtering (PF) methods to be effective, the weights should be approximately uniformly distributed, to avoid a degenerate situation where just a few particles have a weight close to 1 and the vast majority of the others are zero. In panel (d) we show 50 representative particle weights corresponding to our state variables. These results indicate that although the weights are not uniformly distributed, they do not have degenerate distributions either. Finally, in panel (e) we report acfs for inefficiency estimates across fifty randomly selected units i and periods t. For other units and periods the behavior of the acfs was qualitatively the same. These results show that MCMC mixes well and allows a thorough exploration of the posterior.
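The two Geweke-type diagnostics described above can be sketched as follows. This is a simplified illustration rather than the exact procedure behind Figure B.2: the Bartlett-window estimate of the long-run variance, the truncation lag `max_lag`, and the function names are assumptions, and the two chains are simulated purely for the example.

```python
import numpy as np

def long_run_variance(x, max_lag):
    """Bartlett-weighted estimate of the long-run variance of a chain."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    lrv = np.dot(x, x) / n
    for k in range(1, max_lag + 1):
        gamma_k = np.dot(x[k:], x[:-k]) / n          # lag-k autocovariance
        lrv += 2.0 * (1.0 - k / (max_lag + 1)) * gamma_k
    return lrv

def geweke_diagnostics(draws, frac=0.25, max_lag=50):
    """Absolute z-statistic comparing the first and last `frac` of the
    draws (GCD) and relative numerical efficiency (RNE)."""
    draws = np.asarray(draws, dtype=float)
    m = int(frac * len(draws))
    first, last = draws[:m], draws[-m:]
    se2 = (long_run_variance(first, max_lag) + long_run_variance(last, max_lag)) / m
    gcd = abs(first.mean() - last.mean()) / np.sqrt(se2)
    rne = np.var(draws) / long_run_variance(draws, max_lag)
    return gcd, rne

rng = np.random.default_rng(0)
iid = rng.normal(size=20_000)                        # ideal i.i.d. sampler
ar = np.empty(20_000)                                # strongly autocorrelated chain
ar[0] = 0.0
shocks = rng.normal(size=20_000)
for t in range(1, len(ar)):
    ar[t] = 0.9 * ar[t - 1] + shocks[t]
gcd_iid, rne_iid = geweke_diagnostics(iid)
gcd_ar, rne_ar = geweke_diagnostics(ar)
```

For the i.i.d. chain the RNE is close to one, while for the autocorrelated chain it is far below one, which is the pattern the diagnostics are designed to reveal.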
Appendix C
To examine the role of instruments, we first report the distribution of R-squared from (5) and (7) across MCMC iterations, which provides evidence on the strength of the instruments. These results are reported in panel (a) of Figure C.1. Second, we report correlation coefficients from the first row of ∑ corresponding to (1). These are reported in panel (b) of Figure C.1. From this evidence, we conclude that the instruments are strong and valid.
1.
Although shadow prices are calculated in Färe et al. (2005), their models do not control for endogeneity of bad outputs. Consequently, the estimates are likely to be biased and inconsistent.
2.
Similarly, there are some other papers in the SF literature that deal with endogeneity of inputs. For example, Tsionas et al. (2015) use an IDF system approach without bad outputs. These papers are not important to discuss here because none of them, to the best of our knowledge, discusses endogeneity in the context of a bi-production model. With or without endogeneity of inputs, the bi-production models constitute a system as in KT.
3.
We assume that the variables in
4.
5.
In a cost function, these shadow prices are partial derivatives of cost with respect to bad outputs with a negative sign. Since the IDF is dual to the cost function, we expect to find similar results.
6.
We have not formally introduced management in the bad output production functions.
7.
We make use of the sampling-importance-resampling algorithm of Rubin (1987, 1998); see also
.
References
