Abstract
This paper specifies and estimates regression models to test several hypotheses about the operational and managerial drivers of pollution in a multi-rig onshore drilling organization. In addition to conventional models, detection-controlled models are specified to explicitly control for the potential for imperfect reporting. The results suggest that continuity in operations and supervision acts to reduce the likelihood of pollution. Additional variables such as site complexity are also significant. The results are largely consistent with related research on personal safety incidents. While the analysis was completed for one organization in one geographic area, the results may be applicable to similar regions and organizations. The results can be used to drive decisions regarding operating practices and managerial policies.
Introduction
The consequences of pollution have been studied extensively, and the importance of prevention is self-evident; see, for example, Assaf et al. (1986), Cohen (1986), Sovacool (2006), Douglas (2011), and Harzl and Pickl (2012). In an oil company, engineers and analysts are involved in identifying pollution risks, estimating the probability of incidents, and advising decision-makers on options for elimination, mitigation, and control of these risks. This work often involves the quantitative analysis of historical pollution performance. To this end, this investigation specifies and estimates regression models to test several hypotheses about the operational and managerial drivers of pollution in a multi-rig onshore drilling organization. The results of this investigation provide information that can be used for resource allocation and the definition of operating practices and managerial policies.
Quantitative analysis of pollution poses special challenges because there is typically no theoretical basis for assumptions regarding the functional form of pollution incident phenomena, historical incident data is often unbalanced (few incidents), data typically is not collected in cases when there are no incidents, and incidents are not always reported. Some of these and other challenges in pollution risk analysis are described in Stewart and Leschine (1986). However, many of these challenges can be overcome with common-sense assumptions, improved data collection strategies, and advanced modeling methods.
The models specified in this investigation include conventional regression-based models, but also include detection-controlled models that explicitly control for the potential for imperfect reporting. Specifically, we investigate the drivers for loss of primary containment (LOPC) incidents. An LOPC event is defined here as an unplanned release at the drilling location from the well or drilling-related equipment into the environment via air releases, spills, leaks, tank overfills, etc., irrespective of measures to protect the environment (e.g. safely capturing the release) or the fact that the release was removed immediately. Excluded are supplied water, and planned or anticipated flaring or venting related to drilling operations.
Regression model specification
This investigation specifies and estimates regression models to test several hypotheses about the operational and managerial drivers of LOPC incidents. In addition to conventional models, the authors specify detection-controlled models to explicitly control for the potential for imperfect reporting. This is an important aspect of the investigation. As described in Pransky et al. (1999), Deissenberg et al. (2001a), Leigh et al. (2004), Phimister et al. (2004), Rosenman et al. (2006), Probst et al. (2008), and Probst and Estrada (2010), imperfect reporting of incidents in the workplace occurs across many sectors. There are various reasons for underreporting: some are intentional (evasion), while others are unintentional (ignorance). Overreporting is also possible, for example, fraudulent reports of personal safety incidents that did not occur. With respect to LOPC incidents, however, it is reasonable to assume there is no overreporting.
The notion of imperfect reporting (and/or detection) of pollution was introduced in Epple and Visscher (1984), and there is a growing body of empirical work on the subject of incomplete detection based on the subsequent seminal work of Feinstein (1989, 1990). Feinstein's model of detection-controlled estimation (DCE) has been applied in various contexts. Studies have been completed in tax compliance (Erard, 1997), health diagnosis (Bradford et al., 2001; Kleit and Ruiz, 2003), political science (Scholz and Wang, 2006), and safety in oil and gas drilling (Jablonowski, 2007, 2011). The present investigation specifies and estimates a detection-controlled model of pollution in oil and gas drilling, and will add to the existing detection-controlled literature in environmental compliance (Brehm and Hamilton, 1996; Helland, 1998; Stafford, 2003).
Implications of imperfect reporting
Imperfect reporting distorts the observations of incident data. A simple example demonstrates the impacts of imperfect reporting, assuming that no fraud occurs. Consider 100 hypothetical pollution outcomes in Table 1. The columns represent whether or not a pollution incident occurred, while the rows represent whether or not the incident was reported. In this unobservable “truth” case, the imperfect reporting is evident. In practice, however, the underreported incidents are counted with the actual non-incidents. Thus, the analyst observes the data as depicted in Table 2.
Table 1. True incident data.
Table 2. Observed incident data.
Depending on the levels of imperfect reporting, the implications can be severe. The true frequency of an incident in this example is equal to 14/100, while the analyst computes a value of 10/100. Of course the conditional probabilities are also affected. It is clear that in the presence of imperfect reporting, use of the data in Table 2 will bias any qualitative or quantitative analysis. However, when the imperfect reporting is modeled explicitly, more accurate assessments can be made of the true incident phenomena. Also, the analyst can investigate factors that affect the reporting rate.
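The arithmetic behind this example can be sketched as follows. The cell counts are inferred from the two frequencies quoted in the text (14/100 and 10/100) and the stated no-fraud assumption; the variable names are illustrative only.

```python
# Counts implied by the example: 100 outcomes, 14 true incidents,
# 10 of them reported, and (as assumed in the text) no fraudulent reports.
total = 100
true_incidents = 14
reported_incidents = 10
fraudulent_reports = 0  # assumption stated in the text: no overreporting

unreported_incidents = true_incidents - reported_incidents
true_non_incidents = total - true_incidents

# What the analyst observes: unreported incidents are pooled with the
# true non-incidents, so they are indistinguishable in the data.
observed_incidents = reported_incidents + fraudulent_reports
observed_non_incidents = total - observed_incidents

true_frequency = true_incidents / total
observed_frequency = observed_incidents / total

# Conditional probability of a report given that an incident occurred.
reporting_rate = reported_incidents / true_incidents

# Relative understatement of the incident frequency in the observed data.
relative_bias = (observed_frequency - true_frequency) / true_frequency

print(f"true frequency     = {true_frequency:.2f}")
print(f"observed frequency = {observed_frequency:.2f}")
print(f"reporting rate     = {reporting_rate:.3f}")
print(f"relative bias      = {relative_bias:.3f}")
```

In this example, four of the fourteen true incidents go unreported, so the observed frequency understates the true frequency by roughly 29%.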
Models of perfect reporting
These models are specified and estimated to establish a base case for comparison, and reflect conventional practice in frequency and regression analysis of pollution and safety incidents. For discussion and examples of this approach see Lanfear and Amstutz (1983), Anderson and LaBelle (1990, 1994), Fleming et al. (1996), Iledare et al. (1997, 1998), Chunlin and Chengyu (1999), Shultz (1999), Shultz and Fischbeck (1999), Mearns et al. (2001), Conchie and Donald (2006), Boehmer-Christiansen (2008), Malallah (2009), and Winter et al. (2010). That is, these models estimate the case as depicted in Table 2.
The unit of observation is defined as one well. Data is collected for each well i on each rig r in the study period. There are
Model of imperfect reporting
It is assumed that the probability of a reported incident,
The incidence function is specified as Poisson, and the reporting function is specified as a binary probit model (see the Appendix for development of the probit model). The variable
Data set and pollution incident hypotheses
Observations were collected from eight drilling rigs over a recent ∼24 month period from onshore oil and gas development assets in the Permian Basin in the U.S. For each well i on rig r, the dependent variable is defined as the number of LOPC events that occurred on the well. In this data set, 22 LOPC events occurred on 143 wells, which yields a frequency of ∼15%. While the dependent variable is somewhat unbalanced, this aspect of the data set does not appear to impede the regression analysis in this case.
When defining the hypotheses and independent variables, the emphasis was placed on those operational and managerial attributes that are controlled by the organization and thus subject to modification (although this criterion does not apply to every variable, all of the attributes are amenable to mitigation activities). For each independent variable defined below, the hypothesis regarding the directional impact (sign) of the variable on incidence and reporting is stated, along with the expectation for statistical significance (at the 95% confidence level).
The first three variables reflect attributes of the work and site.
Supervision and monitoring are important factors in driving compliance with procedures as discussed in Epple and Visscher (1984), Embrey (1992), Viladrich-Grau and Groves (1997), Cohen (2000), Deissenberg et al. (2001b), and Skaugrud et al. (2012). Variables were specified to test hypotheses along two dimensions of supervision: concentration and turnover. The results can be used to adjust policies on allocation of supervisory resources.
All of the rigs in this study were governed by the operator's safety management system (SMS). Simply put, the SMS defines expectations and requirements for practically all aspects of well operations. One hypothesis of great interest to safety and environmental practitioners is whether there is improvement over time with respect to SMS compliance. It is an important hypothesis because if indicated to be true, the result may affect procurement strategy when picking up and dropping rigs from the fleet. The following variable is defined to test this hypothesis.
The operator invests considerable resources to ensure that all rigs are compliant with the SMS (e.g. assigning multiple foremen on site). However, the drilling rig contractor employs its own rig manager to manage the detailed activities of the rig crew. These individuals also may have an impact on pollution incident performance and reporting. The following suite of binary variables is defined to control for this potential effect.
There are some unavoidable regrets in the hypothesis tests. For example, because the rigs were all drilling similar types of wells using similar procedures, variables such as well design and operational practices that have been shown to reduce LOPC incidents like underbalanced or managed pressure drilling could not be tested (Jablonowski and Podio, 2011). Also, all of the rigs were provided by the same drilling contractor and outfitted in a similar way (e.g. similar levels of automation), and the drilling sites all shared the same geography and degree of site remoteness, thus it was not possible to test hypotheses about these variables.
Regression analysis and discussion
The models of perfect reporting were estimated first to identify probable drivers of incidence and/or reporting. When one observes a statistically significant variable in these models, it is not discernible whether the effect is attributable to incidence or reporting behavior. However, it is a sign that the variable is probably important in one or both functions, and careful attention is warranted in the model of imperfect reporting. Conversely, when a variable is not significant in the model of perfect reporting, one cannot ignore the variable in the model of imperfect reporting. That is, it is possible that the incidence and reporting effects “cancel out” and thus are not observed in the model of perfect reporting.
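A small numerical illustration of this cancellation follows. It assumes, for illustration only, that the expected number of reported incidents is the product of a true incidence rate and a reporting probability. All numerical values are hypothetical and are not taken from the data set.

```python
import math

def observed_rate(incidence_rate, reporting_prob):
    """Expected reported incidents per well, assuming rate times probability."""
    return incidence_rate * reporting_prob

# Hypothetical values chosen only to illustrate the cancellation.
# Baseline wells: lower incidence, higher reporting.
baseline = observed_rate(incidence_rate=0.20, reporting_prob=0.90)
# Wells following a disruption: higher incidence, lower reporting.
disrupted = observed_rate(incidence_rate=0.30, reporting_prob=0.60)

true_rate_ratio = 0.30 / 0.20              # what is actually happening
observed_rate_ratio = disrupted / baseline  # what the analyst sees

print(f"true incidence ratio    = {true_rate_ratio:.2f}")
print(f"observed reported ratio = {observed_rate_ratio:.2f}")
print("effects cancel:", math.isclose(baseline, disrupted))
```

Under these hypothetical values, incidence rises by 50% while the reported rate is unchanged, so a model of perfect reporting would show no effect for the variable.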
Table 3 presents a summary of regression results. Columns A and B report the results from the ordinary least squares model, where column A contains the Rig binary variables and column B does not. The model was re-estimated without the Rig binaries because none of the Rig binary variables were statistically significant. The same structure is used in columns C and D where the Poisson model results are presented. Column E contains the detection-controlled estimates.
Table 3. Summary of regression results.
# significant at 90% confidence, * significant at 95% confidence, ** significant at 99% confidence.
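For readers less familiar with the Poisson model of perfect reporting, the following is a minimal sketch of a Poisson regression with a single covariate, fitted by Newton-Raphson. The function name, the toy data, and the single-covariate simplification are assumptions made for illustration; the estimates in Table 3 come from models with several covariates and were not produced by this code.

```python
import math

def fit_poisson_one_covariate(x, y, iterations=25):
    """Fit log E[y] = b0 + b1 * x by Newton-Raphson; returns (b0, b1)."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), a symmetric 2x2 matrix.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step using the closed-form inverse of the 2x2 matrix.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: x is a binary site attribute, y is the reported
# incident count per well. These are not observations from the study.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 1, 0, 1, 0, 2, 1]

b0, b1 = fit_poisson_one_covariate(x, y)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, rate ratio = {math.exp(b1):.4f}")
```

With a single binary covariate, the fitted rates equal the group means of the counts, so exp(b1) is the ratio of the mean count where x = 1 to the mean count where x = 0. Under perfect reporting this ratio is read as an incidence effect. Under imperfect reporting it mixes incidence and reporting effects.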
The Rig Move variable is not significant in any of the perfect reporting models, contrary to expectations. However, when the detection-controlled model is estimated, this variable is significant in the incidence function, and significant (at the 90% level) in the reporting function, consistent with expectations. This result suggests that after a rig move, there is the potential for a breakdown in incidence and reporting performance. For this variable, the perfect reporting models and the detection-controlled model appear to be inconsistent. However, recall that in the models of perfect reporting, the coefficient estimate represents a mash-up of two underlying effects (incidence and reporting), and instead of observing increased incidence and decreased reporting, the effects cancel out and there is no apparent effect. The detection-controlled model disentangles these two effects. Beyond the obvious value of this result and how it may affect mitigation activities, the coefficient estimate also matters. That is, if one were estimating the likelihood of an LOPC event after a rig move, the appropriate coefficient to use would be the one from the detection-controlled model, which is two to three times as large as the estimate from the Poisson model of perfect reporting.
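To make the disentangling concrete, the following is a minimal sketch of a detection-controlled log-likelihood. It assumes, as a simplification that may differ from the exact specification estimated here, that true incidents on a well are Poisson with rate exp(x·β) and that each incident is reported independently with probit probability Φ(z·γ). Under those assumptions the reported count is Poisson with mean exp(x·β)Φ(z·γ). The function and argument names are hypothetical.

```python
import math

def normal_cdf(t):
    """Standard normal CDF, used as the probit link."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def dot(coefs, row):
    """Inner product of a coefficient list and one row of covariates."""
    return sum(c * v for c, v in zip(coefs, row))

def dce_loglik(beta, gamma, X, Z, y):
    """Log-likelihood of reported counts under the stated assumptions.

    beta  : incidence coefficients (Poisson, log link)
    gamma : reporting coefficients (probit)
    X, Z  : lists of covariate rows for incidence and reporting
    y     : reported incident counts, one per well
    """
    total = 0.0
    for xi, zi, yi in zip(X, Z, y):
        true_rate = math.exp(dot(beta, xi))        # incidence function
        report_prob = normal_cdf(dot(gamma, zi))   # reporting function
        mu = true_rate * report_prob               # mean of reported count
        total += yi * math.log(mu) - mu - math.lgamma(yi + 1)
    return total

# Hypothetical two-well example with an intercept and one binary
# covariate (for instance, a rig-move indicator) in both functions.
X = [[1.0, 0.0], [1.0, 1.0]]
Z = [[1.0, 0.0], [1.0, 1.0]]
y = [0, 1]
print(dce_loglik(beta=[-1.5, 0.8], gamma=[1.0, -0.5], X=X, Z=Z, y=y))
```

In practice the coefficients would be obtained by maximizing this function over β and γ with a numerical optimizer. Because a variable such as Rig Move can enter both X and Z, it receives separate incidence and reporting coefficients rather than a single blended one.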
The Well Type variable is significant in all models, consistent with expectations, suggesting that additional congestion and complexity on development sites increase the likelihood of LOPC incidence. The control variable Drilling Days is significant in the detection-controlled model as expected, but not in the models of perfect reporting. A diagnostic detection-controlled regression was estimated with Drilling Days in the reporting function to test whether this result was similar to the mash-up effect discussed above with respect to the Rig Move variable; it was not. Thus, the reason for this inconsistency between the models is probably attributable to a spurious correlation in the data set.
The Foreman Concentration variable is not significant in any of the models, suggesting that a larger concentration of supervision and oversight does not necessarily result in a decrease in LOPC incidence or an increase in reporting. This result is contrary to expectations. Additional diagnostics revealed that while the variable ranges from 1.5 to 3, the observations are tightly clustered at a value of 2, and this lack of dispersion in the data may constrain the ability to estimate the coefficient with precision.
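The link between dispersion and precision can be illustrated with the textbook formula for the standard error of a slope in a simple linear regression, σ / sqrt(Σ(x − x̄)²). This is a stand-in: the models in this investigation have several regressors and are not all linear, but the same intuition applies. Both samples below are hypothetical and span the same range of 1.5 to 3.

```python
import math

def slope_standard_error(x, sigma=1.0):
    """Standard error of a simple-regression slope for error s.d. sigma."""
    mean_x = sum(x) / len(x)
    sxx = sum((xi - mean_x) ** 2 for xi in x)  # dispersion of the regressor
    return sigma / math.sqrt(sxx)

# Hypothetical regressor values, not observations from the study.
clustered = [1.5, 2, 2, 2, 2, 2, 2, 3]        # most values sit at 2
dispersed = [1.5, 1.5, 2, 2, 2.5, 2.5, 3, 3]  # values spread over the range

se_clustered = slope_standard_error(clustered)
se_dispersed = slope_standard_error(dispersed)

print(f"clustered: se = {se_clustered:.3f}")
print(f"dispersed: se = {se_dispersed:.3f}")
print(f"ratio      = {se_clustered / se_dispersed:.2f}")
```

Holding the sample size and error variance fixed, the clustered sample yields a larger standard error, which makes a true effect harder to detect.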
The Foreman Turnover variable is negative and significant in the models of perfect reporting and in the incidence function of the detection-controlled model, consistent with expectations. This result suggests that well-to-well consistency in supervision decreases the likelihood of LOPC incidents. The variable is not significant in the reporting function of the detection-controlled model, contrary to expectations.
The Rig-SMS Maturity variable is negative and significant in the models of perfect reporting and in the incidence function of the detection-controlled model, suggesting that cumulative experience in working with the operator under its SMS serves to improve LOPC incidence performance. While this result is consistent with expectations, the variable is not significant in the reporting function of the detection-controlled model, which suggests that reporting behavior is independent of experience.
As already mentioned, none of the Rig binary variables were statistically significant in the models of perfect reporting (note, two of the Rig variables were excluded because of collinearity issues). This result is consistent with expectations because the operator invests considerable resources to ensure that all rigs are compliant with the SMS, and there is consistency across rigs. Diagnostic detection-controlled regressions were estimated that included the Rig variables, but there were no noteworthy outcomes and these results are not reported.
As reported in Table 3, there is only one variable (Rig Move) that affects the likelihood of reporting, and it is significant at only a 90% confidence level. The intuitive interpretation of this result is that imperfect reporting is probably not a significant problem in this asset, except of course for the biases introduced in the coefficient estimates (i.e. compare Columns D and E in Table 3). This is not surprising because LOPC events are difficult not to report because they are often witnessed by more than one person, and some releases linger and take time and resources to clean up (e.g. chemical spill) increasing the likelihood of detection. It is possible to use the detection-controlled model to compute the probability of a false negative for each zero observation,
Conclusion
This paper specifies and estimates regression models to test several hypotheses about the operational and managerial drivers of pollution in a multi-rig onshore drilling organization. The overarching theme in these results is that consistency is important in reducing the likelihood of LOPC incidents, and potentially in reporting. This applies to consistency in operations (Rig Move), in supervision (Foreman Turnover), and in the rig crew (Rig-SMS Maturity). Thus, disruptions should be taken seriously and mitigation activities should be defined and implemented when disruptions cannot be avoided. Site complexity was also shown to be an aggravating factor in LOPC incidence. The results are largely consistent with related research on personal safety incidents. While the analysis was completed for one organization in one geographic area, the results may be applicable to similar regions and organizations. The results can be used to drive decisions regarding operating practices and managerial policies.
In addition to conventional models, the authors estimated a detection-controlled model to explicitly control for the potential for imperfect reporting. The detection-controlled model provides incremental value over models of perfect reporting by disentangling the drivers of incidence and reporting phenomena. In this case, the detection-controlled model is also able to demonstrate that imperfect reporting probably is not a significant problem in this asset.
A final note of caution is needed regarding the use of this kind of information. First, when a relationship between LOPC incidents and a variable is identified and an intervention plan or policy is enacted to reduce risk, the relationship between LOPC incidents and the variable will degrade over time and ultimately be eliminated if the intervention is effective. In that case, the disappearance of the relationship indicates that the intervention has been successful and should be continued. Second, this same phenomenon makes it difficult to identify risk factors that are already being mitigated by some policy. Here, the lack of statistical evidence would not be a sufficient reason to alter or cancel an existing mitigation policy that is otherwise believed to be working.
Acknowledgements
The author thanks John Tolle, Greg Carlson, and John Lynaugh for excellent research assistance. Numerous colleagues contributed technical expertise and data which was also vital to the completion of this work: Cody Buyer, Eryn Clark, Jonathan Dallaire, Cesar Gongora, James Goodwyne, Kevin Hoffman, Lake Johnson, Gregory Knott, R.J. Mendoza, Jose Mota, Brendan O'Shea, Donald Patton, Bobby Ramos, and Linda Randle. All conclusions, errors, and omissions remain the sole responsibility of the author.
Appendix
A probit model can be derived by defining a latent (unobservable) variable
