Abstract
Background:
Before-and-after studies are a valuable study design in situations where randomization is not feasible. These studies measure an outcome both before and after an intervention and compare the outcome rates in both time periods to determine the effectiveness of the intervention. Before-and-after studies do not involve a contemporaneous control group and must, therefore, take into account any underlying secular trends to separate the effect of the intervention from any pre-existing trend.
Methods:
To illustrate the importance of accounting for underlying trends, we performed a before-and-after study assessing 30-day mortality in hip fracture patients without any actual intervention, instead designating an arbitrarily chosen time point as our ‘intervention’. We then analysed the data first disregarding and then incorporating the pre-existing underlying trend, to show that even an intervention of nothing may be spuriously interpreted as having an effect if a before-and-after study is incorrectly analysed. Our study involved a secondary analysis of routinely collected data on 30-day mortality following hip fracture in our institution.
Results:
We found a secular trend in our data showing improving 30-day mortality in hip fracture patients in our institution. We then demonstrated that, when this underlying trend was disregarded, our intervention of nothing appeared to ‘result’ in a significant 54% decrease in mortality, from 6.7% in the ‘before’ period to 3.1% in the ‘after’ period (p = 0.0008). Though the 30-day mortality rate decreased during the ‘after’ period, the decrease was not significantly different from the underlying trend in the ‘before’ period projected onto the ‘after’ period. When we accounted for the underlying trend in our analysis, the impact of the intervention (nothing) on 30-day mortality was no longer apparent (incidence rate ratio 0.75, 95% confidence interval 0.32–1.78; p = 0.5).
Conclusion:
Our study highlights the importance of appropriate measurement and consideration of underlying trends when analysing data from before-and-after studies and illustrates what can happen should researchers neglect this important step.
Introduction
What is a before-and-after study?
Before-and-after studies compare group-level data for a specified population prior to and following an intervention and assess changes in outcomes between the two periods. These studies may also be referred to as interrupted time series, pre-post studies or historically controlled studies and are often used in situations where resources are limited or where randomization is either not feasible or not ethical. 1 Compared with other study designs, before-and-after studies can be relatively simple to implement, less logistically demanding (particularly if they are observing changes in standard practice) and cost-effective. 2
Examples of before-and-after studies
Before-and-after studies allow researchers to answer questions such as: Does the opening of a new trauma centre reduce admissions in surrounding hospitals? 3 Do sugar taxes reduce the consumption of sweetened beverages? 4 and Does the introduction of compulsory helmet legislation for cyclists reduce the rate of head injury? 5 These and other questions that may be difficult or impossible to investigate with a randomized controlled trial lend themselves easily to the before-and-after study design.
Pitfalls and dangers of before-and-after studies
As with all study designs, however, before-and-after studies have important limitations. Many health conditions or states improve over time, due either to the natural history of the condition or to concomitant interventions that are not considered or measured as part of a study. It can be difficult to isolate the effect of a single intervention from the myriad other changes occurring around us all the time, in the health system or in society at large. 6 For example, European survival rates for breast cancer have been improving since the 1980s due to a combination of improved therapeutics, earlier diagnosis and greater engagement in screening programs. 7 A before-and-after study measuring the effectiveness of an intervention on survival rates of European breast cancer patients must, therefore, identify and account for this trend to avoid spuriously finding a benefit of the study intervention.
Anticipating, measuring and accounting for underlying trends allows us to consider the natural variations and fluctuations that must be expected in all data sets, and permits us to assess an intervention in a real-world context. A failure to do so may cause the studied intervention to appear, spuriously, to be effective. This can contribute to the relative overestimation of effectiveness that has been observed when using a before-and-after study design compared to randomized trials. 8 A careful analysis of data from a before-and-after study, therefore, should focus on assessing whether an intervention has an effect beyond that due to any pre-existing trends. 9
Why is this important?
Prior to performing our own before-and-after study, we reviewed a sample of 249 randomly selected before-and-after studies of population-level interventions and found that 77% of these studies reported that the intervention under assessment was effective. While this may be the case, it is possible that many of the improvements seen in the aforementioned 249 studies were spurious findings, reached without accounting for underlying trends in the data (i.e. by performing unadjusted, simple before-and-after comparisons). Alternatively, this may reflect publication bias, a phenomenon in which studies with positive findings are more likely to be written up and published than those with neutral or negative findings. 10
Is there a solution?
Fortunately, this is a solvable problem. One solution, explored in this article, is interrupted time series analysis or segmented regression. This is an analytical approach in which the study period is divided into ‘before’ and ‘after’ segments and the observed trend measured in each. The trend in the ‘before’ period can be projected into the ‘after’ period, showing what we would have expected to see in the absence of any intervention or change. This is called the counterfactual. Then, the trend that is actually observed in the ‘after’ period is measured and compared to the counterfactual. This allows us to compare the observed outcome with the outcome that we would have expected to see in the absence of any intervention, with the latter being a projection based on the pre-existing trend. Another solution involves the use of a control group, in which case an outcome is measured before and after an intervention at an institution and this change is compared to outcomes at a similar ‘control’ institution which did not concomitantly receive the same intervention. However, the focus of this article is the interrupted time series approach.
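The counterfactual logic described above can be sketched numerically: fit a trend to the ‘before’ period, project it into the ‘after’ period, and measure the gap between what was observed and what the projection predicts. The sketch below uses Python with purely hypothetical rates (the article's analyses were run in Stata, and these are not the study data):

```python
# Sketch of the counterfactual idea: fit a trend to the 'before' period,
# project it into the 'after' period, and compare with what was observed.
# All rates below are hypothetical, for illustration only.

def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for a simple linear trend."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

# Hypothetical yearly outcome rates (%): four 'before' years, three 'after' years
before_years, before_rates = [0, 1, 2, 3], [8.0, 7.2, 6.5, 5.9]
after_years, after_rates = [4, 5, 6], [5.0, 4.4, 3.8]

intercept, slope = fit_line(before_years, before_rates)

# Counterfactual: the 'before' trend projected into the 'after' period
counterfactual = [intercept + slope * t for t in after_years]

# The effect estimate is the gap between observed and counterfactual,
# not the raw difference between the two periods' averages.
for t, obs, cf in zip(after_years, after_rates, counterfactual):
    print(f"year {t}: observed {obs:.2f}%, counterfactual {cf:.2f}%, gap {obs - cf:+.2f}")
```

In this toy example the ‘after’ rates sit almost exactly on the projected ‘before’ trend, so a naive comparison of period averages would find a large drop while the counterfactual comparison correctly finds essentially no effect.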
A few notes on interrupted time series analysis
It is important for a researcher to specify a priori what type of change might be expected following a certain intervention. This is specific to the intervention, and there are three main patterns of change typically seen in a before-and-after study, which can then be modelled using an interrupted time series analysis. The first is a step change, in which there is a sudden and precipitous increase (or decrease) in the outcome of interest that occurs following the implementation of the intervention. The second is a slope change, in which the rate of the outcome of interest increases (or decreases) gradually following the intervention. This is seen as a difference between the slope of the line in the ‘after’ period and the slope of the line in the ‘before’ period. The third is a combination of a step change and a slope change, whereby the rate of the outcome both rises (or drops) immediately following the intervention, and the rate of the outcome also increases (or decreases) gradually in the ‘after’ period relative to the ‘before’ period.
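In regression terms, the three patterns correspond to which terms are allowed to change at the intervention: an indicator for the post-intervention period (step), a time-since-intervention term (slope), or both. A minimal sketch of the three design-row specifications follows; the variable names are ours, not from the article:

```python
# The three change patterns of an interrupted time series, expressed as
# regression design rows. For an observation at time t, `post` = 1 after the
# intervention and 0 before, and `t_since` = time elapsed since the
# intervention (0 beforehand). Names are illustrative only.

def design_row(t, intervention_time, pattern):
    post = 1 if t >= intervention_time else 0
    t_since = max(0, t - intervention_time)
    if pattern == "step":             # level shifts at the intervention
        return [1, t, post]
    if pattern == "slope":            # trend changes, no immediate jump
        return [1, t, t_since]
    if pattern == "step_and_slope":   # immediate jump plus a new trend
        return [1, t, post, t_since]
    raise ValueError(pattern)

# Example: observations at t = 3 and t = 5, intervention at t = 4
print(design_row(3, 4, "step"))            # [1, 3, 0]
print(design_row(5, 4, "step_and_slope"))  # [1, 5, 1, 1]
```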
The objective of this article is to highlight the importance of identifying and adjusting for underlying trends in before-and-after studies, and to show what can happen when this step is omitted. To illustrate this, we used data from our institution to report 30-day hip fracture mortality before and after nothing, that is, we assigned an arbitrary mid-point date as the ‘intervention’ with no specific intervention having taken place. We then compared the 30-day mortality in the before-and-after periods in two ways: first, without considering the underlying trend; then, using an approach described previously by Bernal et al., 11 by taking into account the underlying secular trend.
Methods and results
Using hip fracture mortality as an example
This before-and-after study investigated changes in 30-day mortality at a single tertiary care institution from 2010 to 2016 (inclusive) using two different analytical approaches: one which accounted for any trend during the ‘before’ period and one which did not. For illustrative purposes, we arbitrarily chose 1 January 2014 to denote the date of implementation of an ‘intervention’; in reality, there was no intervention. We did this to show that even an intervention of nothing may be spuriously found to have an effect if an underlying trend in the data already exists. Details of each analysis can be found below. The study used de-identified routinely collected departmental audit data.
This study included patients aged 50 years or older who presented to Liverpool Hospital, New South Wales, Australia, between January 2010 and December 2016 for the treatment of hip fractures. The outcome of interest, 30-day mortality, is routinely audited by our institution as part of standard care using direct patient or family contact.
We used a pre-specified significance level of p < 0.05 and performed all analyses using Stata 15.1 (www.stata.com; StataCorp; College Station, Texas, USA).
Analysis of data: two ways
Analysis 1: Not accounting for the underlying trend
For the first analysis, we used a χ² test to compare the proportion of people who died within 30 days of a hip fracture between two time periods, before (2010–2013) and after (2014–2016) the ‘intervention’.
This analytical approach, which did not incorporate or even consider any pre-existing trend in 30-day mortality, found a statistically significant decrease in 30-day mortality between the ‘before’ and ‘after’ periods; 65 (6.7%) of 970 hip fracture patients died within 30 days of presentation during the ‘before’ period (2010–2013) compared to 24 (3.1%) of 765 in the ‘after’ period (2014–2016; p = 0.0008), a decrease of 54% (Figure 1). Interpretation: 1 January 2014 was such an important day that it set into motion a decrease in hip fracture mortality.
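The unadjusted comparison can be reproduced from the counts reported above. A minimal Pearson χ² test for a 2×2 table needs only the standard library (the article's analysis was run in Stata; this Python sketch is for illustration):

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (1 df, no continuity correction) for a 2x2
    table [[a, b], [c, d]]. Returns the test statistic and two-sided p-value."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 df, chi-squared is the square of a standard normal variate,
    # so the upper-tail probability is erfc(sqrt(stat / 2)).
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

# Counts from the article: 65 of 970 deaths 'before', 24 of 765 'after'
stat, p = chi2_2x2(65, 970 - 65, 24, 765 - 24)
print(f"chi2 = {stat:.2f}, p = {p:.4f}")  # p ~ 0.0008, matching the reported value
```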

Proportion of people who died within 30 days of hip fracture during the ‘before’ (2010–2013) and ‘after’ (2014–2016) time periods.
Analysis 2: Accounting for the underlying trend
We found markedly different results when we performed an interrupted time series analysis as described by Bernal et al. 11 We specified a priori that we would expect a step change following the intervention, that is, 30-day hip fracture mortality would drop following the ‘intervention’, with no other change in the trend. Under this specification, we proposed that the slope from the ‘before’ period would remain stable, and there would be a single drop (step change) in 30-day hip fracture mortality at the time of the ‘intervention’. We did this for two reasons: first, as we are modelling a fake intervention we stipulated that any change would be expected to occur immediately and not change over time, that is, we chose the simplest model; and second, as the unit of time in our data set is years (and not months, or weeks, which would provide data at a more granular level), even a gradual change in 30-day hip fracture mortality might be expected to appear as a step change due to the amount of data contained in a single year.
Using a generalized linear model specifying a Poisson distribution, we measured the underlying 30-day mortality during the ‘before’ period, then projected that measure into the ‘after’ period. Modelling allowed us to compare the trend we would have expected to see in the ‘after’ period (the counterfactual) with the trend we actually observed. Using this analytical approach, we found that 30-day hip fracture mortality was 25% lower following the ‘intervention’, with an incidence rate ratio (IRR) of 0.75 (95% confidence interval 0.32–1.78; p = 0.5). As we had specified a priori that we would expect a single step change following the ‘intervention’, the IRR reflects the magnitude of that step change: it estimates the difference between the counterfactual (the 30-day mortality rate in the ‘before’ period projected into the ‘after’ period) and the actual, observed 30-day mortality rate in the ‘after’ period. This model estimate suggests that 30-day mortality was 25% lower in the ‘after’ period, but the difference from the pre-existing trend was not statistically significant; it is entirely possible that our intervention had no effect. In summary, compared to what we would have expected to see based on the underlying trend during the ‘before’ period (2010–2013), there was a non-significant decrease in 30-day mortality during the ‘after’ period (2014–2016) (Figure 2).
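A step-change model of this kind can be sketched as a Poisson regression with an intercept, a linear year trend and a post-intervention indicator, fitted by Newton-Raphson. The sketch below is self-contained Python (the article used Stata's generalized linear model facilities); the data are synthetic, generated from a known model purely to show the machinery, and are not the study data:

```python
import math

def gauss_solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def poisson_fit(X, y, iters=25):
    """Poisson regression (log link) fitted by Newton-Raphson."""
    p = len(X[0])
    beta = [math.log(sum(y) / len(y))] + [0.0] * (p - 1)
    for _ in range(iters):
        mu = [math.exp(sum(b * xij for b, xij in zip(beta, xi))) for xi in X]
        # Score: X'(y - mu); observed information: X' diag(mu) X
        score = [sum(xi[j] * (yi - mi) for xi, yi, mi in zip(X, y, mu))
                 for j in range(p)]
        info = [[sum(xi[j] * xi[k] * mi for xi, mi in zip(X, mu))
                 for k in range(p)] for j in range(p)]
        step = gauss_solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

# Synthetic yearly counts from log(mu) = b0 + b1*year + b2*post, with a
# known step of log(0.8) at year 4 (illustrative values, not the study data)
b_true = [math.log(60), -0.05, math.log(0.8)]
X = [[1, t, 1 if t >= 4 else 0] for t in range(7)]
y = [math.exp(sum(b * xij for b, xij in zip(b_true, xi))) for xi in X]

beta = poisson_fit(X, y)
irr_step = math.exp(beta[2])
print(f"estimated step-change IRR: {irr_step:.3f}")  # recovers the true IRR of 0.8
```

Because the synthetic counts were generated exactly from the model, the fit recovers the true step-change IRR; with real, noisy data the estimate would also carry a confidence interval, as reported in the analysis above.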

30-day hip fracture mortality before and after an ‘intervention’. The dotted line projects the 30-day mortality during the ‘before’ period into the ‘after’ period.
Discussion
When applying a statistical method that was simple but inappropriate for this context, that is, not considering the possibility of any underlying secular trend, we found a statistically significant decline in 30-day mortality when comparing periods before and after 1 January 2014, an arbitrarily chosen time point denoting no specific or actual intervention. Following adjustment for the underlying trend in 30-day mortality using an appropriate analytical method, however, the magnitude of the effect of the ‘intervention’ decreased, and the difference between periods was not significant.
This illustrates the importance of performing statistical analyses that consider pre-existing trends when assessing the possible effectiveness of an intervention in before-and-after studies to prevent spurious results. Using the first analytical approach, any intervention could have been deemed to be effective, regardless of its true effectiveness, as in the environment of decreasing 30-day hip fracture mortality, the effect of this pre-existing trend could be mistakenly attributed to the ‘intervention’.
When considering possible drivers of 30-day mortality in our patients, we reflected on clinical practice during this period. The overall practice did not change in our institution over the period under study. No policy changes were introduced regarding the timing or urgency of hip fracture surgery and there was no change in the involvement of the orthogeriatric service, a factor that has been previously shown to influence 30-day mortality in our state. 12 We discussed 30-day mortality at our institution with staff, who made several suggestions to explain the reduction in mortality over the studied time period. These included increased awareness of the problems facing hip fracture patients, the addition of two new consultant orthopaedic staff, the addition of a second Clinical Nurse Consultant to the ward, the introduction of regular Structured Interdisciplinary Bedside Rounding, and the initiation of data collection for the Australia & New Zealand Hip Fracture Registry. In the context of a gradual decline in mortality over time, it is likely that a simply analysed before-and-after study of nearly any intervention, whether or not that intervention was actually effective, would have reported a significant decline in mortality.
The wide range of mortality for each year (from 1.4% to 8.1%) and the decline over time is consistent with the findings in similar published studies. A 2016 review of 16 separate international studies including more than a million elderly patients who experienced traumatic hip fractures between 1986 and 2013 reported 30-day mortality varying from 1.4% to 10% with multiple factors contributing to a steady reduction in 30-day mortality from year to year including improved time to surgery, multidisciplinary care and implementation of National Institute for Health and Care Excellence (NICE) guidelines. 13
We have shown that, even in the absence of any specific intervention, it is possible to find a spurious statistically significant result using a simple comparison of outcomes between two discrete time periods in a before-and-after study in the presence of an underlying trend. Using an appropriate analytical method to measure and account for the underlying trend permits an appropriate comparison of two periods in studies of this type. Our findings highlight the importance of adjusting for underlying secular trends, and we recommend cautious interpretation when evaluating the effects of interventions in before-and-after studies.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
