Abstract
In this article, I introduce two commands for computing the fragility index (FI):
Keywords
1 Introduction
When considering the results of a randomized controlled trial (RCT), scientists and those who rely on scientific evidence often conclude that a treatment is effective based solely on a p-value threshold (that is, < 0.05). However, the use of a p-value threshold to declare statistical significance has been widely criticized for being overly simplistic, frequently misunderstood, and inappropriately interpreted (see, for example, Amrhein, Greenland, and McShane [2019]; Colquhoun [2017]; Feinstein [1998]; Ioannidis [2005, 2018]; Sterne and Davey Smith [2001]; Wasserstein and Lazar [2016]).
As an upshot of this discourse, several supplementary measures to the p-value have been proposed to provide more focus on the robustness of statistically significant results from RCTs. Among these are Bayesian analyses (Quatto, Ripamonti, and Marasini 2020); the type S (“sign”) error risk and exaggeration ratio (Gelman and Tuerlinckx 2000; Gelman and Carlin 2014); S-values (Greenland 2019); second-generation p-values (Blume et al. 2019); and the fragility index (FI) (Walsh et al. 2014).
In this article, I introduce two commands for computing the FI: the
2 Methods
2.1 Computing the FI for individual RCTs
The FI represents the absolute number of additional events (primary endpoints) required to obtain a p-value greater than or equal to a predetermined statistical significance threshold (typically set to 0.05). The FI for individual RCTs is computed by adding an event to the study group with the smaller number of events (and subtracting a nonevent from the same group to keep the total number of patients within that group constant) and recomputing the two-sided significance. Events are iteratively added until the first time the computed p-value becomes statistically nonsignificant (Walsh et al. 2014).
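The iterative procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the implementation used by the command introduced in this article. It assumes a two-sided Fisher's exact test, defined as the sum of all table probabilities no larger than that of the observed table. The function names `fisher_two_sided` and `fragility_index` are hypothetical, as is the convention of returning 0 when the initial result is already nonsignificant.

```python
from math import comb


def fisher_two_sided(e1, n1, e2, n2):
    """Two-sided Fisher exact p-value for events/nonevents (e1, n1) vs (e2, n2)."""
    r1, r2, c1 = e1 + n1, e2 + n2, e1 + e2
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Unnormalized hypergeometric weights, kept as exact integers.
    w = {k: comb(r1, k) * comb(r2, c1 - k) for k in range(lo, hi + 1)}
    # Sum the weights of all tables no more probable than the observed one.
    extreme = sum(v for v in w.values() if v <= w[e1])
    return extreme / comb(r1 + r2, c1)


def fragility_index(e1, n1, e2, n2, alpha=0.05):
    """Number of nonevent-to-event changes needed to reach p >= alpha.

    Events are added to the group that starts with fewer events, and the
    same number of nonevents is removed so the group size stays constant.
    Returns 0 if the initial p-value is already >= alpha (assumed convention)
    and None if the group runs out of nonevents first.
    """
    add_to_first = e1 <= e2
    fi = 0
    while fisher_two_sided(e1, n1, e2, n2) < alpha:
        if add_to_first:
            if n1 == 0:
                return None
            e1, n1 = e1 + 1, n1 - 1
        else:
            if n2 == 0:
                return None
            e2, n2 = e2 + 1, n2 - 1
        fi += 1
    return fi


# 1 event and 99 nonevents versus 9 events and 91 nonevents
print(fragility_index(1, 99, 9, 91))  # 1
```

Working with integer weights avoids floating-point ties when deciding which tables are at least as extreme as the observed one. A different two-sided definition or a different test (for example, chi-squared) could yield a different FI for the same data.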
2.2 Computing the fragility index for meta-analyses
To evaluate the FI of a meta-analysis, one sequentially recalculates the 95% confidence interval (CI) of the pooled estimate after performing all single event-status modifications that increase the estimate (or decrease it, depending on whether the treatment is expected to increase or decrease the risk of the outcome) by 1) changing a nonevent to an event for patients receiving treatment A for each single trial or 2) changing an event to a nonevent for patients receiving treatment B for each trial (Atal et al. 2019).
This process leads to 2N newly calculated 95% CIs for the pooled estimate (where N is the total number of studies in the meta-analysis). If one of the newly calculated CIs overlaps 1.0, the FI of the meta-analysis is 1 because one unique event-status modification (that is, changing a nonevent to an event in arm A or an event to a nonevent in arm B) in one specific trial changed the statistical significance of the meta-analysis. If all the newly calculated 95% CIs for the pooled estimate remain < 1.0 (in the case of a treatment that lowers the risk of the outcome) or > 1.0 (if the treatment is expected to increase the probability of the outcome), the specific trial and event-status modification that bring the 95% CI for the pooled estimate closest to 1.0 are selected as the starting point for the next iteration (Atal et al. 2019).
This process is then repeated by performing a new single event-status modification in each arm of each trial in turn on top of the first selected modification. Similarly, if one of these 2N event-status modifications leads to a newly calculated 95% CI for the pooled estimate overlapping 1.0, the FI of the meta-analysis is then equal to 2. This process is iterated until one event-status modification leads to a newly calculated 95% CI for the pooled estimate overlapping 1.0. The number of iterations needed to find a combination of event-status modifications in specific arms and trials leading to a modified meta-analysis with 95% CI for the pooled estimate overlapping 1.0 is thus the FI for the meta-analysis (Atal et al. 2019).
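The greedy search described in this section can be illustrated with the following Python sketch. It is not the implementation of the command introduced in this article. It assumes a fixed-effect, inverse-variance pooled risk ratio with a 0.5 continuity correction, whereas the actual pooling model may differ. The function names `pooled_rr_ci` and `meta_fragility_index` and the tuple layout (events and totals for arms A and B) are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

Z = NormalDist().inv_cdf(0.975)  # normal quantile for a 95% CI


def pooled_rr_ci(studies):
    """Fixed-effect inverse-variance pooled RR with 95% CI (assumed model).

    Each study is (events_a, total_a, events_b, total_b).
    """
    num = den = 0.0
    for ea, na, eb, nb in studies:
        if 0 in (ea, eb) or ea == na or eb == nb:
            # Continuity correction (assumption) keeps log RR and variance finite.
            ea, eb, na, nb = ea + 0.5, eb + 0.5, na + 1, nb + 1
        log_rr = log((ea / na) / (eb / nb))
        var = 1 / ea - 1 / na + 1 / eb - 1 / nb
        num += log_rr / var
        den += 1 / var
    est, se = num / den, sqrt(1 / den)
    return exp(est), exp(est - Z * se), exp(est + Z * se)


def meta_fragility_index(studies):
    """Greedy count of single event-status changes until the CI overlaps 1."""
    studies = [list(s) for s in studies]
    _, lo, hi = pooled_rr_ci(studies)
    if lo <= 1 <= hi:
        return 0  # already nonsignificant (assumed convention)
    protective = hi < 1
    # (position of the event count, change) pairs that move the RR toward 1.
    steps = [(0, +1), (2, -1)] if protective else [(0, -1), (2, +1)]
    fi = 0
    while True:
        best = None
        for i, s in enumerate(studies):
            for pos, delta in steps:
                new = s[pos] + delta
                if not 0 <= new <= s[pos + 1]:
                    continue  # modification not feasible in this arm
                trial = [list(t) for t in studies]
                trial[i][pos] = new
                _, lo, hi = pooled_rr_ci(trial)
                if lo <= 1 <= hi:
                    return fi + 1
                gap = 1 - hi if protective else lo - 1
                if best is None or gap < best[0]:
                    best = (gap, i, pos, new)
        if best is None:
            return None  # no feasible modification remains
        _, i, pos, new = best
        studies[i][pos] = new  # keep the change that moved the CI closest to 1
        fi += 1


# Made-up data for illustration only
print(meta_fragility_index([(10, 100, 30, 100), (15, 150, 40, 150)]))
```

Each pass evaluates at most 2N candidate modifications, keeps the one that moves the CI bound closest to 1.0, and stops as soon as any candidate produces a CI that overlaps 1.0.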
2.3 Differences between metafrag and the R package fragility_ma
3 The fragility command
This section describes the syntax of the
3.1 Syntax
In the syntax, variables #n11 and #n12 contain the respective numbers of events and nonevents from individuals in group 1 (treatment), and variables #n21 and #n22 contain the respective numbers for group 2 (control).
3.2 Options
3.3 Stored results
4 The metafrag command
This section describes the syntax of the
4.1 Syntax
4.2 Options
4.3 Stored results
5 Examples
In this section, we demonstrate the use of
5.1 A fragile RCT
This example from Walsh et al. (2014) specifies that group 1 has 1 event and 99 nonevents and group 2 has 9 events and 91 nonevents.
As shown in the output, the resulting FI of 1 suggests that the inference of a treatment effect is “fragile.” That is, only one additional event is needed to flip the results from being statistically significant to nonsignificant at the 0.05 level.
5.2 A more robust RCT
In example 2 from Walsh et al. (2014), group 1 has 200 events and 3,800 nonevents, and group 2 has 250 events and 3,750 nonevents:
As shown in the output, the resulting FI of 9 suggests that the inference of a treatment effect is more robust than that of example 1.
5.3 A fragile meta-analysis
This meta-analysis includes 7 individual studies, with a total of 448 patients. We first load the data and then use
Next, we plot a forest plot of these data, specifying that the results be presented as exponentiated values, and modify some elements of the display (see [META]
As shown in the forest plot, the treatment was associated with a statistically significant increase in the risk of the outcome (RR 1.23, 95% CI [1.00 to 1.51]). Next, we use
As shown in the output, the FI is 1, indicating that the pooled treatment effect turns statistically nonsignificant after only one event-status modification. In this meta-analysis, the one event modification was made by subtracting one event from group 1 in study 3. In the forest plot, this subtraction corresponds with the value highlighted in gray (red on actual screen) under group 1 in study 3. The RR for the pooled effect is now statistically nonsignificant (RR 1.22, 95% CI [0.99 to 1.49]).
5.4 A more robust meta-analysis
This meta-analysis includes 8 individual studies, with a total of 1,344 patients. As before, we first load the data, then use
As shown in the forest plot, the treatment was associated with a statistically significant reduction in the risk of the outcome (RR 0.75, 95% CI [0.68 to 0.83]). Next, we use
As shown in the output, the FI is 65, indicating that the pooled treatment effect turns statistically nonsignificant after 65 event-status modifications, with the event modifications occurring in 4 studies. In the forest plot, event additions correspond with values highlighted in bold (blue on actual screen), and event subtractions correspond with values highlighted in gray (red on actual screen). The RR is now statistically nonsignificant (RR 0.91, 95% CI [0.83 to 1.00]). The FI suggests that the pooled estimate from this meta-analysis is more robust than that in the previous example, where only one event modification was necessary to nullify the statistical significance of the pooled estimate.
6 Discussion
In this article, I introduced the
While the FI offers an intuitive supplemental measure to the p-value in interpreting the reliability of study findings, it has its critics. In particular, Carter, McKie, and Storlie (2017) illustrated a strong inverse relationship between the FI and the log10 of the p-value because both operate by decreasing the differences in response rates, resulting in a quantification of how extreme the observed trial results are relative to the null condition. Thus, as is true with p-values, the FI should not be misinterpreted as a measure of clinical effect. In other words, a higher FI should not be interpreted to imply greater clinical effect than a lower FI; rather, it simply illustrates the strength of the statistical significance itself (Brown et al. 2019; Narayan et al. 2018).
In conclusion, the
Supplemental Material, sj-zip-1-stj-10.1177_1536867X221083856 for Computing the fragility index for randomized trials and meta-analyses using Stata by Ariel Linden in The Stata Journal
7 Acknowledgments
I thank John Moran for advocating that I write both of these commands and for providing me with many of the references used in the introduction. I also thank Ignacio Atal for his support in testing the results reported by
, and I thank Houssein Assaad at StataCorp for providing details of how Stata and R differ in their respective computations for meta-analyses. Finally, I thank the chief editor and anonymous reviewer for providing helpful comments to improve the article and commands.
8 Programs and supplemental materials
To install a snapshot of the corresponding software files as they existed at the time of publication of this article, type
References
