Abstract
Background:
Randomized controlled trial (RCT) outcomes reaching statistical significance, frequently determined by P <.05, are often used to guide decision making. Noted lack of reproducibility of some RCTs has brought special attention to the limitations of this approach. In this meta-analysis, we assessed the robustness of RCTs evaluating platelet-rich plasma (PRP) for the treatment of chronic noninsertional Achilles tendinopathy (AT) by using fragility indices.
Methods:
The present study was a systematic review and meta-analysis of RCTs comparing outcomes after PRP injection vs alternative treatment in patients with AT. Representative data sets were generated for each reported continuous outcome event using summary statistics. Fragility indices refer to the minimal number of patients whose status would have to change from a nonevent to an event to turn a statistically significant result into a nonsignificant result, or vice versa. The fragility index (FI) and continuous FI (CFI) were determined for dichotomous and continuous outcomes, respectively, by manipulating each data set until reversal of significance (α = 0.05) was achieved. The corresponding fragility quotient (FQ) and continuous FQ (CFQ) were calculated by dividing the FI or CFI by the sample size.
Results:
Of 432 studies screened, 8 studies (52 outcome events) were included in this analysis. The 12 dichotomous outcomes had a median FI of 4.5 (FQ: 0.111), and the 40 continuous outcomes had a median CFI of 5 (CFQ: 0.154). All 52 outcome events included lost-to-follow-up data, and 12 (23.1%) indicated a greater number of patients lost to follow-up than the FI or CFI.
Conclusion:
Our findings suggest that RCTs evaluating PRP for AT therapy lack statistical robustness, because changing only a small number of events may alter outcome significance.
Level of Evidence:
Level II, therapeutic study.
Introduction
Achilles tendinopathy (AT) is one of the most common running-related injuries with an annual incidence rate of 9.1% to 10.9%. 28 More than 52% of endurance runners will develop AT over their lifetimes, of which 5% progress to tendon rupture. 26 The pathology can be associated with a severe reduction in physical activity and persistent pain over several years, which can be especially devastating for athletes and inhibit return to sport. 35 Treatment emphasizes exercise-based interventions (eg, eccentric exercises) and correction of underlying biomechanical problems via splinting and orthoses to lessen disability. 43 Injections and surgical intervention have demonstrated effectiveness but are less well studied.19,22 With the recent surge in clinician use of biologics for orthopaedic conditions, platelet-rich plasma (PRP) has been proposed as a possible therapy or adjunct to other treatment options for chronic AT. 7 PRP is an autologous concentrate with platelet levels greater than whole blood. Its mechanism of action in tendinopathy is thought to relate to the action of growth factors—including platelet-derived growth factor, vascular endothelial growth factor, and transforming growth factor—which promote a healing response.16,30 However, this practice has been a subject of thorough debate, with recent meta-analyses showing limited to no advantage in using PRP vs placebo for the management of chronic AT.27,45
In the last 2 decades, randomized controlled trials (RCTs) have faced increased scrutiny because of concerns regarding the reproducibility of their findings. 17 It has been proposed that misunderstanding and misuse of the almost universally utilized P <.05 threshold may be the culprit.33,42 With statistically significant findings in RCTs being frequently used to guide decision making, it has become a priority to find a solution for this problem. Eliminating the P value threshold to avoid misinterpretation or, more conservatively, lowering it to a stricter value to achieve significance have both been proposed as possible solutions.2,18,42 Recently, an index that addresses several of the limitations of the P value was proposed as an adjunct to its reporting. Walsh et al 41 described the utilization of the fragility index (FI) as a complement to the P value. They defined the FI, which was first described by Feinstein, 12 as the minimum number of patients whose status would have to change from a nonevent to an event to turn a statistically significant result into a nonsignificant result, or vice versa. 41 For example, an RCT with statistically significant results and an FI of 1 would lose significance if even 1 patient had the opposite outcome. A lower FI indicates a more fragile, less statistically robust study and associated results. By utilizing the FI in addition to the P value, readers can assess the statistical robustness of a study’s findings and make their own inferences regarding their utility. Prior studies have applied the FI and fragility quotient (FQ), which considers sample size, to RCTs evaluating other conditions in orthopaedic surgery.1,11,13,14,23,24,29,34 To our knowledge, a meta-analysis evaluating the statistical fragility of RCTs pertaining to utilization of PRP injections for the treatment of chronic AT has not been performed.
Further, the FI and FQ have only recently been modified to extend beyond dichotomous outcomes to continuous outcomes, 5 and this study is the first to comprehensively apply these measures.
The purpose of this study was to synthesize outcomes of existing RCTs reporting on the utilization of PRP for treatment of chronic AT and evaluate their statistical robustness by applying the FI and FQ to both dichotomous and continuous outcomes. We hypothesized that, comparable to similar studies in orthopaedic surgery, we would identify substantial statistical fragility in RCTs evaluating PRP as therapy in patients with chronic AT.
Methods
Study Selection
In accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, we performed a systematic search of the following 6 online databases: PubMed, Embase, Cochrane, Web of Science, Scopus, and ClinicalTrials.gov. Medical Subject Headings and Emtree terms were used with keywords to identify articles reporting on randomized controlled trials involving PRP therapy for AT in each database. Articles were excluded if they were of the wrong study design; lacked a control group; used therapies that were not PRP; used cadaveric, animal, or in vitro models; used a pediatric patient population (<18 years); included patients with tendon rupture; or were published in a non-English language. Studies were also excluded if they reported neither summary statistics for continuous outcomes (mean ± SD) nor dichotomous outcomes, or if they did not report P values.
After articles were extracted, 2 independent reviewers completed the initial screening using titles and abstracts as well as the subsequent full-text review to ensure appropriate studies were included in our analysis. In both steps, a third reviewer acted as a tiebreaker when there was disagreement between the 2 initial reviewers. Of the 699 studies initially identified, 46 were selected for full-text review. At the conclusion of the screening process, 8 studies reporting on RCTs were included in the present systematic review and meta-analysis (Figure 1, Table 1).3,8,9,20,21,25,36,40 No RCTs on PRP utilization in insertional AT that met our inclusion criteria were found in our systematic database search, and thus the present study was limited to only noninsertional AT.
Table 1. Randomized Controlled Trials Using Platelet-Rich Plasma in Chronic Noninsertional Achilles Tendinopathy.
Abbreviation: FU, follow-up.

Figure 1. Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram for studies reporting on platelet-rich plasma in Achilles tendinopathy.
Data Extraction
For each of the 8 included studies, we collected the control treatment to which PRP therapy was compared, duration of follow-up, and all primary and secondary outcomes. Outcomes were characterized as continuous or dichotomous. We recorded summary statistics for continuous outcomes and event distribution for dichotomous outcomes. For both types of outcomes, we collected sample size, number of patients lost to follow-up, and the original P values comparing the PRP and respective control groups.
Analysis of Statistical Fragility
The dichotomous analysis was performed by manipulating the reported outcome events in a 2 × 2 contingency table and recalculating a Fisher exact or chi-squared test, as appropriate, until reversal of significance or nonsignificance was achieved. Statistical significance was defined as P <.05. The number of outcome events required to raise P to ≥.05 for outcomes initially reported as significant, or the number required to decrease P to <.05 for outcomes initially reported as nonsignificant, was defined as the FI. The FQ was subsequently calculated by dividing the FI by the sample size.
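The dichotomous procedure described above can be illustrated with a minimal Python sketch. This is not the code used in the present study (which was run in R); it is an illustration under stated assumptions. It uses only the Fisher exact test (the chi-squared alternative is omitted), it adds events to the arm with the lower event proportion for initially significant results and to the arm with the higher event proportion for initially nonsignificant results, and the function names `fisher_exact_p`, `fragility_index`, and `fragility_quotient` are hypothetical.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    # Unnormalized hypergeometric weight of the table whose top-left cell is x.
    def weight(x):
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    support = range(max(0, col1 - row2), min(row1, col1) + 1)
    # Sum every table that is no more likely than the observed one.
    extreme = sum(weight(x) for x in support if weight(x) <= observed)
    return extreme / comb(row1 + row2, col1)


def fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Number of nonevent-to-event changes needed to reverse significance.

    Assumption: which arm is modified is a choice made for this sketch.
    Returns None if the modified arm runs out of nonevents first.
    """
    # Order the arms so that `lo` has the lower event proportion.
    lo, hi = sorted([(events_a, n_a), (events_b, n_b)],
                    key=lambda arm: arm[0] / arm[1])
    (e_lo, n_lo), (e_hi, n_hi) = lo, hi

    def p_value(x_lo, x_hi):
        return fisher_exact_p(x_lo, n_lo - x_lo, x_hi, n_hi - x_hi)

    initially_significant = p_value(e_lo, e_hi) < alpha
    changes = 0
    while True:
        if initially_significant:
            if e_lo >= n_lo:
                return None
            e_lo += 1  # narrow the gap between arms
        else:
            if e_hi >= n_hi:
                return None
            e_hi += 1  # widen the gap between arms
        changes += 1
        if (p_value(e_lo, e_hi) < alpha) != initially_significant:
            return changes


def fragility_quotient(fi, total_sample_size):
    """FQ: the FI divided by the total sample size."""
    return fi / total_sample_size
```

For a hypothetical trial with 0 of 10 events in one arm and 5 of 10 in the other (P ≈ .033), `fragility_index(0, 10, 5, 10)` returns 1, and the corresponding FQ is 1/20 = 0.05.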
Previously, statistical fragility could only be evaluated for dichotomous outcomes, substantially limiting the assessment of RCTs that predominantly reported continuous outcomes. 41 This limitation was recently surmounted by Caldwell et al, 5 who developed a new algorithm to calculate FI and FQ for continuous variables using raw data or summary statistics, appropriately named the continuous fragility index (CFI) and continuous fragility quotient (CFQ), respectively. The CFI was modeled to increase linearly with sample size, increase logarithmically with mean difference, and decrease exponentially with SD. Importantly, CFI and FI are uncorrelated and inherently different; CFI only expands the concept of statistical fragility to more outcomes. In the present study, we further modified the code presented by Caldwell et al to extend CFI and CFQ to initially nonsignificant findings (P > .05). All continuous analyses were conducted with n=5 simulations of representative, synthetic data sets using mean ± SD and sample size, eliminating the need for raw data collection. 5
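The general idea of the continuous analysis can be sketched in Python as follows. This is a simplified illustration and not the algorithm of Caldwell et al or the modified code used in the present study. The assumptions are ours: synthetic samples are drawn from a normal distribution and rescaled to match the reported mean and SD exactly, significance is assessed with a Welch t test, and the rule for choosing which patient to move between arms (the most extreme value of the higher-mean arm) is a choice made for this sketch. All function names are hypothetical.

```python
import random
from math import exp, lgamma, log, sqrt
from statistics import fmean, stdev, variance


def _beta_cf(a, b, x, max_iter=200, eps=1e-12):
    """Continued fraction used by the regularized incomplete beta function."""
    tiny = 1e-300
    c, d = 1.0, 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        for num in (
            m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m)),
            -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1)),
        ):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def t_two_sided_p(t, df):
    """Two-sided P value of Student's t distribution."""
    x = df / (df + t * t)
    if x >= 1.0:
        return 1.0
    if x <= 0.0:
        return 0.0
    a, b = df / 2.0, 0.5
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def synthetic_sample(mean, sd, n, rng):
    """Normal draws rescaled so the sample mean and SD match the summary."""
    raw = [rng.gauss(0.0, 1.0) for _ in range(n)]
    m, s = fmean(raw), stdev(raw)
    return [mean + sd * (x - m) / s for x in raw]


def welch_p(a, b):
    """Two-sided P value of the Welch (unequal variances) t test."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (fmean(a) - fmean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t_two_sided_p(t, df)


def continuous_fragility_index(group_a, group_b, alpha=0.05):
    """Patients moved from the higher-mean arm until significance reverses.

    Assumption of this sketch: the largest value is moved when the result is
    initially significant, and the smallest when it is initially
    nonsignificant. Returns None if no reversal occurs.
    """
    lo, hi = sorted([sorted(group_a), sorted(group_b)], key=fmean)
    initially_significant = welch_p(lo, hi) < alpha
    moves = 0
    while len(hi) > 2:
        lo.append(hi.pop() if initially_significant else hi.pop(0))
        moves += 1
        if (welch_p(lo, hi) < alpha) != initially_significant:
            return moves
    return None
```

In use, one would generate several synthetic data sets per outcome with different seeds (for example, `random.Random(seed)` for each simulation), compute the index for each, and average the results. The CFQ is then the CFI divided by the total sample size.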
For both dichotomous and continuous outcomes, statistical fragility measures were summarized using the median and interquartile range (IQR) to reflect their distribution. Data were analyzed using R, version 3.6.1, software (The R Foundation for Statistical Computing, Vienna, Austria).
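As a small illustration of this summary step, the following Python sketch computes a median and IQR for a list of fragility indices. The input values are hypothetical and are not data from the included studies. The `inclusive` quantile method is an assumption: to our understanding it corresponds to the default quantile definition in R, but the exact definition used in the original analysis is not stated.

```python
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, (Q1, Q3)) using the inclusive quantile method."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), (q1, q3)


# Hypothetical fragility indices for six outcome events.
example_fi = [2, 4, 4, 5, 6, 9]
summary = median_iqr(example_fi)  # (4.5, (4.0, 5.75))
```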
Results
Of the 8 studies included in the present systematic review and meta-analysis, all reported complete summary statistics and/or dichotomous outcomes that allowed calculation of statistical fragility. Mean sample size was 63.1 patients (range: 20-240), with a mean of 4.7 patients lost to follow-up (range: 0-19). Mean study follow-up duration was 39.8 weeks (range: 24-104 weeks). All studies assessed patients with chronic noninsertional AT; none evaluated patients with insertional AT. Control group treatments varied, with 3 studies using isotonic saline injection;3,8,9 1 using dry injection; 20 1 using high-volume injection with saline, corticosteroid, and local anesthetic; 3 1 using adipose-derived stromal vascular fraction injection; 40 1 using eccentric loading exercises; 21 1 using percutaneous needle tenotomy; 25 and 1 using endoscopic debridement. 36 One RCT used 2 control treatments, high-volume injection and saline injection. 3 Each study yielded a mean of 6.5 outcome events (range: 1-12) suitable for analysis.
There were 52 total outcome events—12 dichotomous (23.1%) and 40 continuous (76.9%)—recorded across all studies. Nine (17.3%) were initially reported as statistically significant and 43 (82.7%) as nonsignificant. Of the 12 dichotomous outcome events, 1 (8.3%) was initially reported as significant and 11 (91.7%) as nonsignificant. Of the 40 continuous outcome events, 8 (20.0%) were initially reported as significant and 32 (80.0%) as nonsignificant. For the 12 dichotomous outcome events, the median FI was 4.5 (IQR: 4-6) and median FQ was 0.111 (IQR: 0.102-0.144) (Table 2). For the 40 continuous outcome events, the median CFI was 5 (IQR: 3.5-9) and median CFQ was 0.154 (IQR: 0.117-0.206) (Table 3). All outcome events reported lost-to-follow-up data, of which 12 (23.1%) represented those with a greater number of patients lost than the FI or CFI for dichotomous and continuous outcomes, respectively.
Table 2. Fragility Index and Quotient Data for Dichotomous Outcomes Reported in Randomized Controlled Trials Using Platelet-Rich Plasma in Chronic Noninsertional Achilles Tendinopathy.
Abbreviations: FU, follow-up; IQR, interquartile range; FI, fragility index; FQ, fragility quotient.
Table 3. Fragility Index and Quotient Data for Continuous Outcomes Reported in Randomized Controlled Trials Using Platelet-Rich Plasma in Chronic Noninsertional Achilles Tendinopathy.
Abbreviations: AT, Achilles tendon; CFI, continuous fragility index; CFQ, continuous fragility quotient; FU, follow-up; IQR, interquartile range; MR, magnetic resonance imaging; NRS, numeric rating scale; US, ultrasonography; VAS, visual analog scale; VISA-A, Victorian Institute of Sports Assessment–Achilles; PFS-SF, Physical Functioning Scale of the Short-Form; MHC-SF, Mental Health Continuum of the Short Form.
Discussion
The present systematic review and meta-analysis is the first to assess the statistical fragility of RCTs evaluating PRP therapy for AT. After screening, 8 studies were included in the analysis with a total of 52 outcome events. For the dichotomous events, the median FI was 4.5 with an associated FQ of 0.111, meaning that reversal of only 4.5 patient outcome events (or 11.1 of 100 patients) would alter the statistical significance of the evaluated RCTs. For the continuous events, the median CFI was 5 with an associated CFQ of 0.154. This suggests that moving only 5 patients (or 15.4 patients out of 100) from the test group to the control group (or vice versa) would be sufficient to reverse significance. Almost one-quarter of the outcome events had a greater number of patients who did not complete their respective study protocols than the FI or CFI. Attention must be paid to the difference between the definitions of FI and CFI. The FI is defined as the number of patients whose outcome must change to alter significance, whereas the CFI is defined as the number of patients who must be moved from one intervention arm to the other to alter significance.5,41
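The "per 100 patients" figures quoted above follow directly from the definition of the quotients. As a brief worked illustration, with N denoting the sample size of the outcome in question (our notation):

```latex
\mathrm{FQ} = \frac{\mathrm{FI}}{N}, \qquad \mathrm{CFQ} = \frac{\mathrm{CFI}}{N}

100 \times \mathrm{FQ} = 100 \times 0.111 = 11.1 \ \text{patients per 100}, \qquad
100 \times \mathrm{CFQ} = 100 \times 0.154 = 15.4 \ \text{patients per 100}
```

Because the reported FQ and CFQ are medians of the per-outcome quotients, they need not equal the median FI or CFI divided by any single sample size.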
Despite the high burden of AT among the active population, best practices for managing the condition remain poorly defined. In addition to traditional methodology (ie, eccentric exercises and orthoses), recent research has explored the value of extracorporeal shock-wave treatment, therapeutic ultrasonography, corticosteroid injections, dry needling, hyaluronic acid, dextrose, autologous whole blood, and PRP. 6 A main advantage of PRP is that it is autologous and believed to have few to no side effects. 9 Laboratory-based studies have shown PRP to exert a positive therapeutic effect on damaged Achilles tendons.10,44 Clinically, PRP has demonstrated benefit for limited outcome events when compared with controls or when used to augment another intervention.3,36 However, available RCTs have not found convincing evidence that it improves function or decreases pain for patients with AT vs placebo. Analyses of pooled data have similarly shown no substantial benefits in regard to secondary outcome measures, including change in tendon thickness, color Doppler activity, or return to sport.27,45 Accordingly, the current high-level evidence for PRP utilization in patients with chronic AT is inconsistent and requires further evaluation to determine whether the therapy’s theorized potential translates to real-world applicability.
The present fragility analysis revealed an FI of 4.5 (FQ: 0.111) and CFI of 5 (CFQ: 0.154) for RCTs assessing PRP therapy in chronic AT. These values are on par with prior studies assessing statistical fragility in the orthopaedic literature.11,13,14,23,24,29,31,32,34 Most pertinently, Parisien et al 31 evaluated the statistical fragility of unspecified comparative trials for Achilles tendon pathology, presenting an overall FI of 4 and associated FQ of 0.048 for 51 outcome events. Parisien et al 32 further studied the statistical fragility of PRP utilization in rotator cuff repair and found an overall FI of 4 and FQ of 0.092 for 177 outcome events. In these analyses, 21.6% and 30.2% of included outcome events had ≥4 patients lost to follow-up during the study periods, respectively. In the present study, 23.1% of outcome events represented those with more patients lost to follow-up than the FI or CFI. Thus, in the context of the available literature, our findings highlight that there is substantial fragility across the orthopaedic literature, including for RCTs examining the therapeutic use of PRP in chronic tendinopathy. It is important to note that for prior studies, the majority of outcomes were omitted from analysis because they were continuous variables for which the fragility measures could not traditionally be calculated. To our knowledge, only 2 prior studies have written about the CFI and CFQ. Caldwell et al 5 reported a CFI of 9 for 39 nondichotomous outcomes in the sports medicine and arthroscopy literature, and Ho et al 15 calculated a CFI of 3 for a single RCT investigating vagal nerve electrical stimulation after stroke. The present study is thereby the first to thoroughly assess statistical fragility for a specific orthopaedic intervention.
Overall, our analyses demonstrate that available results from RCTs on PRP therapy in AT should be interpreted with caution, and future protocols must aim to lower their statistical fragility in order to determine the true therapeutic value of PRP. We emphasize the critical importance of maintaining follow-up, as almost one-quarter of outcome events could have had a reversal of significance if all patients had completed the protocol as designed. Effective strategies that have been proposed for improving retention in clinical trials involve reminders to nonrespondents, flexible appointments, reduced research burden (ie, shortened assessments), incentives, and training staff on the importance of maintaining sample size. 4 Yet, retention remains a considerable obstacle for RCTs. Therefore, protocols must also aim for larger initial recruitment to reduce the impact of anticipated losses. Suggested methods for improving recruitment include the use of opt-out rather than opt-in procedures for contacting potential participants and open designs where participants know which treatment they are to receive in the trial.38,39 Yet, the disadvantages to such strategies, such as higher risk of bias with unblinded trials in open designs, must also be considered carefully before implementation. Altogether, the largely accepted 20% lost-to-follow-up rate for most study analyses thus may not be appropriate. Given the reliance on RCT data for evidence-based clinical decision making, we believe there is strong justification for including fragility indices alongside P values to better inform readers about the strength of statistical findings.
The limitations of evaluating statistical fragility must be discussed. Fragility indices are absolute values without known cutoffs or thresholds that correlate with study strength. 37 Although FQ and CFQ account for sample size and supplement the reporting of the FI and CFI, how these measures speak to the robustness of a protocol remains unclear. The FI and CFI are also unable to account for differences in outcomes over time, which is important for deciding length of follow-up. Moreover, although we were able to overcome the obstacle of FI being applicable to only dichotomous variables by using the methodology outlined by Caldwell et al, 5 use of the novel CFI concept carries its own limitations. The technique uses the Welch t test to assess statistical significance, which assumes that the data set is normally distributed. Although this assumption is always true for simulated data sets, raw samples of finite size are unlikely to be normally distributed and thereby may not be accurately represented with the simulated samples. The use of multiple simulations partially compensates for this shortcoming by deriving an average value, but the accuracy of such calculations may fall short of ideal. It must be noted that we only included nondichotomous data that were reported as mean ± SD, which theoretically assumes a normal Gaussian distribution, although we recognize that some authors may report mean ± SD for nonparametric data. This also limits the inclusion of continuous data presented in other formats. Additionally, a major limitation of fragility meta-analyses is that studies do not contribute an equivalent number of outcome events. As a result, studies with more events disproportionately influence the overall FI/CFI compared to studies with fewer events. Because of the inherent difference between FI and CFI, we were also unable to calculate an overall fragility measure. 
However, the ability to assess statistical fragility for all dichotomous and continuous outcome events is a considerable strength that outweighs these limitations. Finally, all RCTs included in the current study assessed noninsertional AT, which is more likely to resolve with nonoperative treatment, and our findings are not applicable to insertional AT.
Conclusion
In this systematic review and meta-analysis, we determined that RCTs evaluating the use of PRP as therapy for chronic AT have an overall median FI of 4.5 (FQ: 0.111) for dichotomous events and a median CFI of 5 (CFQ: 0.154) for continuous events. Almost one-quarter of outcome events could have had a reversal of significance or nonsignificance had all patients completed follow-up. It is paramount that future RCTs be designed with consideration of sample size from both recruitment and retention perspectives to maximize protocol robustness and determine the true therapeutic effect of PRP in AT.
Footnotes
Ethical Approval
Ethical approval was not sought for the present study because it is a systematic review.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article. ICMJE forms for all authors are available online.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
