Abstract
Background:
Randomized controlled trials (RCTs) have traditionally been designed with an explanatory approach rather than incorporating real-world, pragmatic considerations.
Aims:
This methodological review assesses the uptake of pragmatic designs in Phase III acute stroke RCTs.
Methods:
We conducted a comprehensive literature search of the MEDLINE, Embase, and Cochrane Library databases from inception to 1 July 2024. Eligible articles included English-language published Phase III RCTs of acute ischemic stroke and intracerebral hemorrhage interventions. Using the Pragmatic Explanatory Continuum Indicator Summary (PRECIS-2) tool, each trial was rated on nine key domains, and relevant study characteristics were extracted. Trials with an average rating of 3 or higher, or a total score (sum of ratings) of 27 or higher (given that all domains were assessed), were considered to adopt an overall pragmatic approach to their design. Risk of bias was evaluated using the Cochrane risk of bias tool.
Results:
Of the 5663 unique articles obtained after deduplication, 136 trials were included, and 71 (52%) trials were classified as pragmatic using the PRECIS-2 tool. A majority had a low risk of bias (63.2%). Pragmatic trials were more likely to be large, multicenter, multinational trials with broad inclusion criteria covering multiple stroke types.
Conclusion:
There has been an increased uptake of pragmatic designs in acute stroke over the last decade, reflecting improvements in acute stroke care and a greater consideration of real-world applicability by trialists.
Introduction
The randomized controlled trial (RCT) provides the highest level of direct evidence regarding the efficacy of interventions,1–4 including those targeting acute ischemic and hemorrhagic stroke.3,5 The acute phase is commonly defined as the first 7 days following symptom onset, during which interventions are considered to target acute management.6–8 Historically, trials have been explanatory in design, with strict inclusion criteria that yield highly homogeneous populations that may not closely represent the patients presenting with stroke in routine acute settings.9–11 Explanatory trials prioritize internal validity, aiming to assess whether an intervention works under ideal, controlled conditions.12 However, there is a growing demand for RCTs that accurately reflect real-world clinical settings and generate results that are directly generalizable to the patient populations commonly seen at the point of care.11–14
Pragmatic designs aim to assess the effectiveness of interventions in real-world clinical practice. They incorporate broad eligibility criteria and flexible treatment protocols, so that a broader population is enrolled and the findings generalize to routine care and diverse patient populations. This approach acknowledges that real-world adherence and clinician discretion vary and, as a result, prioritizes patient-centered, practical trial conduct and the collection of outcomes directly relevant to improving clinical practice.13,15 Pragmatic and explanatory trials thus represent two ends of the clinical trial design spectrum. A trial's position on this spectrum stems primarily from its design rather than solely from its execution.4,10,15,16
The Pragmatic Explanatory Continuum Indicator Summary (PRECIS-2) tool provides a framework for assessing how closely a trial's design aligns with real-world clinical conditions.10,17 It comprises nine domains: eligibility criteria, recruitment, setting, organization, flexibility in delivery, flexibility in adherence, follow-up, primary outcome, and primary analysis.10 Each domain is rated on a scale from 1 (highly explanatory) to 5 (highly pragmatic), allowing for a systematic assessment of trial design.
Previous reviews have assessed the extent to which pragmatic trial design is employed in RCTs across various clinical specialties, including critical care, rheumatoid arthritis, pain management, and pediatric neurosurgery.18–22 This study aims to evaluate the extent to which pragmatic designs are adopted in Phase III trials in acute stroke settings. By systematically analyzing these trials through the lens of the PRECIS-2 framework, this review provides a comprehensive assessment of the trend toward pragmatic trial designs over time. In addition, the review identifies gaps in current research, proposes strategies to enhance pragmatism in future clinical trials in acute ischemic stroke and intracerebral hemorrhage, and highlights best practices that can optimize the translation of trial findings into clinical care.
Methods
This methodological review consisted of five primary phases: literature search, study selection, data extraction, quality assessment, and data analysis. The protocol was registered with the International Prospective Register of Systematic Reviews (PROSPERO registration number: CRD42024560101).
Literature search
A comprehensive literature search was conducted across multiple electronic databases, including MEDLINE, Embase, and the Cochrane Central Register of Controlled Trials (CENTRAL), to gather relevant English-language articles reporting Phase III RCTs in acute stroke published up to 1 July 2024. The search strategy incorporated a combination of Medical Subject Headings (MeSH) and free-text terms related to “Phase III trials,” “stroke,” and “randomized controlled trials.” The search terms for “Randomized Controlled Trials” were refined iteratively using the search strategy recommended by the Cochrane Handbook for Systematic Reviews of Interventions to ensure comprehensive coverage of relevant studies. Advanced Boolean operators and truncation techniques were applied to maximize search efficiency. See Supplemental Table S1 for the full list of search terms. We searched for “stroke” rather than specifying “acute stroke” to avoid missing papers that did not specify acuteness in their title or abstract but were conducted within the 7-day timeframe. In our review process, we included only articles that satisfied this definition of acute stroke.
Due to the extensive nature of acute stroke research, we identified studies that might fulfill the inclusion and exclusion criteria below by exploring (1) multiple electronic bibliographic databases, using a systematic and comprehensive search strategy covering a wide range of acute stroke interventions, and (2) the reference lists of studies already obtained from that search strategy and of relevant previously published reviews, to identify additional trials not captured in the database searches.
This study included RCTs meeting the following criteria: (1) trials involving adult human subjects with acute ischemic or hemorrhagic stroke; (2) trials with published protocols or ClinicalTrials.gov registration, to allow for the assessment of PRECIS-2 elements within the study; (3) Phase III clinical trials; and (4) articles published in the English language. Subarachnoid hemorrhage (SAH) trials were excluded. We also excluded: (1) studies in animal, plant, cell, or other non-human subjects; (2) systematic reviews, literature reviews, observational studies, pilot trials, non-randomized studies, and meta-analyses, as pragmatism is a characteristic of clinical trial design; (3) papers based on subgroup analyses of existing trials; and (4) publications in the gray literature.
For this study, Phase III clinical trials were defined as trials that assessed the safety and efficacy of an intervention and involved at least 100 participants. If a trial had fewer than 100 participants and was classified as a Phase III study by the investigators, it was still included in our analysis.
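The working definition above amounts to a two-route inclusion rule, which can be sketched as follows. This is only an illustration of the rule as worded in the text: the `TrialRecord` class and its field names are hypothetical and are not taken from the review's extraction form.

```python
from dataclasses import dataclass


@dataclass
class TrialRecord:
    """Hypothetical screening record; field names are illustrative only."""
    assesses_safety_and_efficacy: bool
    n_participants: int
    labelled_phase3_by_investigators: bool


def meets_phase3_definition(trial: TrialRecord, min_participants: int = 100) -> bool:
    """Return True if the trial counts as Phase III under the working definition."""
    # Route 1: operational definition (safety and efficacy assessed, at least 100 participants).
    operational = (
        trial.assesses_safety_and_efficacy
        and trial.n_participants >= min_participants
    )
    # Route 2: the investigators classified the trial as Phase III, regardless of size.
    return operational or trial.labelled_phase3_by_investigators
```

Under this reading, a 60-participant trial is retained only when its investigators classified it as Phase III.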
Study selection
AO, SR, and KC conducted the title and abstract screening and full-text review. Two of the three independent reviewers screened all retrieved articles by title and abstract using Covidence (AO and SR, AO and KC, or KC and SR).23 Articles meeting the eligibility criteria were then subjected to full-text review. Discrepancies between reviewers were resolved through a discussed consensus, with a third reviewer consulted if necessary. A PRISMA flow diagram (Figure 1) illustrates the selection process. Each screener also independently evaluated the full text of all the studies potentially eligible for inclusion. The interrater reliability score was also assessed. See the list of studies included in this review in Supplemental Table S2.

Figure 1. PRISMA flow diagram for the methodological review.
Data extraction
Data on the included articles were extracted independently by AO, SR, KC, and OA using a standardized extraction form. We extracted the study design characteristics, publication characteristics, and other relevant study details. The full extraction sheet is available in the supplementary material. AO (Reviewer 1) and TS (Reviewer 2) independently evaluated each trial using the PRECIS-2 tool, ensuring that domain scores were applied consistently using a checklist guide (Supplemental Table S3). PRECIS-2 domains were rated on a scale from 1 (highly explanatory) to 5 (highly pragmatic). Non-applicable domains, such as Adherence, were not rated for studies that lacked concrete ways to assess them due to the acuity of the interventions. The final PRECIS-2 scores were derived by averaging the individual PRECIS-2 ratings. The Cochrane risk-of-bias tool was used to evaluate selection, performance, measurement, attrition, and reporting biases.24,25 Based on these criteria, each study was assigned a risk rating (low, high, or unclear).
Statistical analyses
The primary objective of the analysis was to explore the extent of pragmatic design uptake in stroke clinical trial design. Accordingly, we compared the characteristics of studies classified as either pragmatic or explanatory and evaluated the agreement between two independent reviewers in their assessments of trial pragmatism using the PRECIS-2 tool. The average pragmatic rating per domain was calculated by taking the median of the domain-specific ratings for each reviewer and then averaging the two medians. A study with an average rating of 3 or higher, or a total score (sum of ratings) of 27 or higher (given that all domains were assessed), was considered to adopt an overall pragmatic approach. See the supplementary material for a detailed description of how each domain was rated (Supplemental Table S3). Non-applicable domains were excluded from domain-wise analyses and were not included in the sums. In addition, while binary thresholds are used for descriptive purposes in this study, true pragmatism lies on a continuum.
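The classification rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function name, the use of `None` to mark a non-applicable domain, and the example ratings are all assumptions made for the sketch.

```python
from statistics import mean
from typing import Optional, Sequence

N_DOMAINS = 9  # PRECIS-2 has nine domains


def classify_trial(
    ratings: Sequence[Optional[float]],
    avg_cutoff: float = 3.0,
    sum_cutoff: float = 27.0,
) -> str:
    """Label one trial from its nine domain ratings (1-5); None marks a non-applicable domain."""
    if len(ratings) != N_DOMAINS:
        raise ValueError("expected one entry per PRECIS-2 domain")
    rated = [r for r in ratings if r is not None]
    if not rated:
        raise ValueError("no applicable domains were rated")
    # Non-applicable domains are left out of both the average and the sum.
    meets_average = mean(rated) >= avg_cutoff
    # The total-score criterion applies only when all nine domains were assessed.
    meets_total = len(rated) == N_DOMAINS and sum(rated) >= sum_cutoff
    return "pragmatic" if (meets_average or meets_total) else "explanatory"


# Hypothetical ratings: the sixth domain is not applicable in the first example.
print(classify_trial([2, 4, 3, 3, 3, None, 3, 4, 5]))  # pragmatic (mean 3.375)
print(classify_trial([1, 3, 3, 3, 2, 3, 3, 4, 4]))     # explanatory (sum 26)
```

With the primary cutoffs the two criteria coincide whenever all nine domains are rated, since a mean of 3 over nine domains is a sum of 27; the total-score branch only adds something when other cutoff pairs are passed in.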
Using the reviewer-weighted Cohen’s Kappa statistic, we assessed interrater reliability for the screening, full-text review, and PRECIS-2 ratings. In addition, interrater reliability for the overall PRECIS-2 total scores was evaluated using the Intraclass Correlation Coefficient (ICC), based on a two-way random-effects model with absolute agreement. Kappa statistics were also used to determine the agreement between the two reviewers' PRECIS-2 total scores. Interpretations of the Kappa coefficient followed established guidelines,26 where values of 0.01–0.20 indicate slight agreement, 0.21–0.40 fair agreement, 0.41–0.60 moderate agreement, 0.61–0.80 substantial agreement, and 0.81 or higher almost perfect agreement. ICC values between 0.50 and 0.75 indicate moderate reliability, values between 0.75 and 0.90 good reliability, and values above 0.90 excellent reliability.27 Descriptive statistics were used to summarize the characteristics of the included studies. Categorical variables were reported as frequencies and percentages, while continuous and ordinal variables were summarized using medians and interquartile ranges due to non-normal distributions. Studies were stratified according to their final pragmatic status (pragmatic vs explanatory). Group comparisons were made using Pearson’s Chi-square test for categorical variables and the Wilcoxon rank-sum test for continuous or ordinal variables. A two-sided p-value <0.05 was considered statistically significant. We also conducted sensitivity analyses using a more lenient cutoff (average ≥ 2.5 or total ≥ 24) and a stricter cutoff (average ≥ 3.5 or total ≥ 30) (see Supplemental Tables S6 and S7).
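To make the agreement statistics concrete, the sketch below computes a plain, unweighted Cohen's kappa for two raters and maps a kappa value to the interpretation bands quoted above. It is an illustration only: the text reports a reviewer-weighted kappa whose weighting scheme is not described, so the unweighted form here is a stand-in, and the function names and the label used for values below 0.01 are assumptions.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions, summed over labels.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


def interpret_kappa(kappa: float) -> str:
    """Map kappa to the agreement bands quoted in the text (0.81 treated as inclusive)."""
    if kappa >= 0.81:
        return "almost perfect"
    if kappa >= 0.61:
        return "substantial"
    if kappa >= 0.41:
        return "moderate"
    if kappa >= 0.21:
        return "fair"
    if kappa >= 0.01:
        return "slight"
    # Values below 0.01 are not covered by the quoted bands; this label is an assumption.
    return "no agreement beyond chance"


# Hypothetical include/exclude decisions from two screeners.
kappa = cohen_kappa(["in", "in", "out", "out"], ["in", "out", "out", "out"])
print(round(kappa, 2), interpret_kappa(kappa))  # 0.5 moderate
```

Applied to the values reported in this review, `interpret_kappa` returns "moderate" for 0.42, "substantial" for 0.78, and "fair" for 0.40.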
Where applicable, missing data were handled using pairwise deletion (complete case analysis), and no imputation was performed. All statistical analyses were conducted using R version 4.4.2 in RStudio.28
Results
A total of 8510 citations were identified through the database searches, and nine additional articles were obtained from reference lists. After removing duplicates, 5663 unique articles remained and were subjected to title and abstract screening. Of these, 5442 were excluded, leaving 221 studies for full-text review. A total of 136 Phase III trials met the inclusion criteria and were included in the analysis. A comprehensive list of all included articles is provided in the supplementary material (the supplementary reference list and Supplemental Tables S2 and S8). The study selection process is summarized in the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram (Figure 1). The reviewer-weighted Kappa statistic was 0.42 for the title and abstract screening phase, indicating moderate agreement between reviewers, and 0.78 for the full-text review, indicating substantial agreement. See the Supplemental Material for the detailed extraction sheet outlining the characteristics of the included studies.
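The selection counts reported above can be checked with simple arithmetic, as in the sketch below. The first six values are taken from the text; the derived values are not reported in the source and rest on the assumption that database and reference-list records were pooled before deduplication.

```python
# Counts reported in the text.
database_records = 8510
reference_list_records = 9
unique_after_dedup = 5663
excluded_at_screening = 5442
full_text_reviewed = 221
included_trials = 136

# Derived values (not reported in the text; they assume both sources were pooled
# before duplicates were removed).
total_identified = database_records + reference_list_records    # 8519
duplicates_removed = total_identified - unique_after_dedup      # 2856
excluded_at_full_text = full_text_reviewed - included_trials    # 85

# Internal consistency check on the screening stage.
assert unique_after_dedup - excluded_at_screening == full_text_reviewed

print(total_identified, duplicates_removed, excluded_at_full_text)  # 8519 2856 85
```

The screening stage is internally consistent (5663 − 5442 = 221); the duplicate and full-text exclusion counts are implied by the reported figures rather than stated.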
Using a total PRECIS-2 score of ≥ 27 (or an average rating of ≥ 3 across the PRECIS-2 domains) as the threshold for a pragmatic design, 71 of the 136 included studies were classified as pragmatic-leaning and 65 as explanatory-leaning. The median PRECIS-2 scores by domain were as follows: Eligibility, 1.5; Recruitment, 3.5; Setting, 3; Organization, 3; Delivery, 2.5; Adherence, 3; Follow-up, 3; Primary Outcome, 4; and Primary Analysis, 4.5 (see Figures 2 and S1). Spearman’s correlation between the two reviewers' total PRECIS-2 scores was 0.44, indicating a moderate positive correlation, while the Kappa statistic for agreement between the two reviewers' average overall PRECIS-2 scores was 0.40, indicating fair agreement. Furthermore, the overall Intraclass Correlation Coefficient was 0.57, indicating moderate reliability.29 The full trial-level PRECIS-2 dataset is provided in Supplemental Table S4.

Figure 2. The average PRECIS-2 ratings for included trials across all nine domains by pragmatic status. (a) Overall distribution of average PRECIS-2 ratings. (b) Distribution of average PRECIS-2 ratings by pragmatic status. (c) Radar chart showing explanatory and pragmatic distributions.
Table 1 describes the main design features of the included studies, classified by pragmatic-explanatory score. The median (IQR) sample size of the included trials was 567 (933), and the trials enrolled 197,257 patients in total. Most studies (97%) were multicenter, with a median of 45 sites (range: 1–674). Sixty-nine (51%) studies were multinational trials. Of the studies included, 131 (96%) were two-arm parallel-group trials, 54 (40%) adopted the PROBE design, 76 (56%) were double-blinded, and 121 (89%) were superiority trials. Of the 136 studies included in this methodological review, 75 (55%) were published after 2016, of which 51 (68%) were pragmatic-leaning trials. Supplemental Table S5 provides additional information on the pragmatic status by journal type.
Table 1. Descriptive characteristics (n, %) of included studies.
n (%); Median [Min, Max]. Positive results indicate the establishment of superiority or non-inferiority of an intervention over the standard of care or control.
Univariate analysis revealed a significant correlation between pragmatic design classification and the average PRECIS-2 scores, as well as the study sample size and number of centers (Table 2). Figure 3 illustrates the trends in the distribution of more pragmatic-leaning trials up to 2024, with a notable increase in the number of such trials between 2016 and 2024. In the sensitivity analyses, a lenient threshold (average ≥ 2.5 or sum ≥ 24) classified 91% of trials as pragmatic, whereas a stricter cutoff (average ≥ 3.5 or sum ≥ 30) classified only 19% of trials as pragmatic. Despite this re-labeling, pragmatic-labeled trials remained substantially larger and involved more centers under both cutoffs, while the proportions of trials published after 2016, with positive efficacy findings, and with safety demonstrations did not differ materially by threshold. See the supplementary material for more details (Supplemental Tables S6 and S7).
Table 2. Correlation analysis between domain-specific PRECIS-2 ratings and study characteristics.
a Spearman’s correlation was used for numeric variables (Spearman's rank-order correlation coefficients are presented). b The Kruskal–Wallis test was used for categorical variables, where the categories are the same as indicated in Table 1 (p-values are presented). * Statistically significant association or correlation between the study characteristic and the PRECIS-2 domain (p < 0.05).

Figure 3. Number of explanatory and pragmatic published studies by year.
As expected, the methodological quality of the included studies was generally high, with a low risk of bias across key domains, including randomization, protocol adherence, data completeness, and outcome measurement. There was an overall low risk of bias in 86 (63.2%) studies, while 21 (15.4%) showed some concerns. Twenty-nine studies (21.3%) had a high risk of bias (Figures 4, S2, and S3). The percentage of explanatory-leaning trials with a low risk of bias was 64.6%, compared with 62.0% of pragmatic-leaning trials (p = 0.888). Similarly, there was no significant difference in the percentage of trials with a high risk of bias between explanatory-leaning and pragmatic-leaning trials (21.5% and 21.1%, respectively). See the supplementary material for detailed data on the risk-of-bias ratings for each domain and all the included studies (Supplemental Tables S8 and S9).
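The low risk-of-bias comparison can be illustrated with a 2×2 chi-square test, sketched below. Several points are assumptions rather than reported facts: the group-level counts are back-calculated from the reported percentages and group sizes, the table is collapsed to low versus not-low risk, and a continuity (Yates) correction is applied, as R's `chisq.test` does by default for 2×2 tables. The function name is hypothetical.

```python
import math
from typing import Tuple

# Group sizes reported in the Results.
n_explanatory, n_pragmatic = 65, 71

# Low-risk counts back-calculated from the reported 64.6% and 62.0%
# (an assumption; the source does not report these counts directly).
low_explanatory = round(0.646 * n_explanatory)  # 42
low_pragmatic = round(0.620 * n_pragmatic)      # 44


def yates_chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Continuity-corrected chi-square statistic and p-value (1 df) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


chi2, p_value = yates_chi_square_2x2(
    low_explanatory, n_explanatory - low_explanatory,
    low_pragmatic, n_pragmatic - low_pragmatic,
)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

The back-calculated counts sum to the 86 low-risk studies reported overall, and under these assumptions the sketch gives a p-value of about 0.888, in line with the reported value. This is a consistency check only and does not establish exactly how the authors ran the test.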

Figure 4. Risk of bias for included studies using the Cochrane risk-of-bias tool stratified by pragmatic-explanatory classification. (a) Risk of bias for all included studies. (b) Risk of bias for all included studies stratified by pragmatic-explanatory classification.
Discussion
This methodological review confirms an increased uptake of pragmatic designs in acute stroke trials, with more than half of the included trials considered pragmatic. Notably, the majority of the trials classified as pragmatic are large (N > 1000), multicenter, international Phase III acute stroke trials conducted over the last decade.
While pragmatic-leaning trials were more likely than explanatory-leaning trials to be large, multicenter, international, and pharmacological, both types of trials were equally likely to be positive (i.e. to demonstrate superiority or non-inferiority). The included trials had the highest average PRECIS-2 ratings in the Recruitment, Primary outcome, and Primary analysis domains, indicating that acute stroke trials tend to prioritize pragmatism in these three elements of trial design. The recruitment process in acute stroke trials is inherently pragmatic, as patients are often enrolled from emergency departments or stroke centers, that is, from readily available patient populations. In addition, given the impact of stroke on patients, functional outcome is the most common primary outcome in acute trials, as it is the outcome that matters most to patients. Assessing primary outcomes and conducting primary analyses, in turn, typically require clinically relevant tools and biostatistical expertise, respectively. In contrast, the Eligibility domain had the lowest ratings across all the studies. Most acute stroke trials adopt strict inclusion criteria, resulting in highly homogeneous populations and often excluding patients with premorbid disability and those presenting more than 24 hours after stroke onset.30,31 Finally, the Adherence domain of PRECIS-2 was applicable to fewer than one-third of the included trials.
The observed increase in large, multicenter, international pragmatic trials during the last decade may be attributed to several factors. First, recognition of the importance of pragmatic design has grown in the medical literature over the past decade.10,14,32–34 Second, stroke care has evolved from a lack of available treatments a couple of decades ago to the availability of newer interventions (such as mechanical thrombectomy, thrombolysis, and other pharmacological interventions) during the last decade; once initial trials had demonstrated the safety and efficacy of these interventions, pragmatic considerations became more prominent in stroke research.35 Third, clinical trial infrastructures for acute stroke research have matured over the years, with many centers in Europe, the United States, Canada, Australia, and China now capable of conducting large, multicenter, international stroke trials.36–38 The increased uptake of pragmatic designs in acute stroke trials has significant implications for clinical practice. By designing trials that closely mimic real-world conditions, researchers can generate findings that are directly translatable to clinical practice.
Nevertheless, several misconceptions about pragmatic designs persist, namely: (1) that pragmatic-leaning trials have a reduced ability to establish causality and assess the intervention’s effectiveness due to the influence of potential confounding variables,39 (2) that more pragmatic trials have higher total financial costs and require more effort in study design and execution than explanatory-leaning trials because of larger sample sizes and extended follow-up periods,15,40,41 and (3) that pragmatic-leaning trials alone do not generate sufficient evidence to support applications for regulatory approval.42 While explanatory-leaning trials aim to control for confounding and other types of bias, more pragmatic trials seek to demonstrate the real-world effectiveness of interventions. Results from this methodological review indicate that the likelihood of demonstrating safety and efficacy did not differ between predominantly pragmatic and predominantly explanatory designs, suggesting that pragmatic-leaning trials are equally capable of assessing the effects of interventions. In addition, the risk of bias analysis in this study showed that pragmatic-leaning trials do not necessarily have a higher susceptibility to different types of bias. While the more pragmatic trials may require larger sample sizes, longer follow-up periods, and broader study sites, these factors contribute to generalizability rather than constituting methodological weaknesses. Perceived financial constraints may be mitigated by leveraging existing patient-centered clinical research infrastructure, such as clinical registries, electronic data-collection systems, and simplified data-collection procedures. Finally, data from pragmatic-leaning trials can support label claims and approval from regulatory authorities.
Regulatory agencies, such as the Food and Drug Administration and the European Medicines Agency, have published guidelines on using pragmatic trials and real-world evidence to support regulatory approval.9,42 By producing findings more directly applicable to clinical practice, pragmatic trials can accelerate the translation of evidence into practice, potentially reducing overall healthcare costs by enhancing treatment effectiveness and improving health outcomes. Nonetheless, early-phase explanatory trials, which prioritize safety over efficacy or effectiveness, may be constrained in sample size and eligibility, two features that naturally shape how pragmatic a trial can be. In such cases, an explanatory design may serve as an essential precursor to subsequent pragmatic trials; for this reason, this review focused on Phase III acute stroke trials. Furthermore, the apparent increase in pragmatism over time may be less a matter of deliberate design choice and more a reflection of the accumulation of evidence on safety and efficacy. For instance, early endovascular trials in the 2010s adopted narrow eligibility criteria due to safety concerns.43–46 After the MR CLEAN trial demonstrated a clear benefit, subsequent trials expanded their inclusion criteria.47 A similar trajectory has occurred with intravenous thrombolysis, where eligibility has broadened as safety and responder populations have become better defined.48–51
This study has several limitations. First, it focused on English-language publications, which may have excluded relevant articles in other languages and may underrepresent pragmatic trials from Asia, Latin America, and non-Anglophone Europe. Second, agreement between the two reviewers' PRECIS-2 ratings across the domains was only fair to moderate, which could lead to the misclassification of studies as pragmatic or explanatory; nevertheless, the overall PRECIS-2 scores remained moderately positively correlated. Third, although this study used a binary classification of included trials as either pragmatic-leaning or explanatory-leaning, no trial is purely explanatory or purely pragmatic. As our phrasing suggests, most studies fall somewhere along the pragmatic–explanatory continuum. Therefore, the findings of this methodological review should be interpreted with caution. Finally, despite the broad use of the PRECIS-2 tool to grade the extent to which trial designs are pragmatic, several shortcomings are well documented in the literature: its greater likelihood of rating masked pragmatic RCTs as explanatory compared with open-label pragmatic studies, the irrelevance of specific domains for some types of trials, and concerns about its validity and reliability for trials in which not all domains are applicable.52–54 This may introduce some misclassification bias for the included trials, as qualitative information is used to derive a subjective quantitative score.
In conclusion, the increased adoption of pragmatic designs in acute stroke trials, particularly over the last decade, represents a positive step toward bridging the gap between research and clinical practice. The evidence generated from these trials, which are primarily multicenter and conducted in multiple countries, is robust and generalizable to real-world settings, ultimately improving stroke care and patient outcomes globally.
Supplemental Material
Supplemental material, sj-docx-1-wso-10.1177_17474930251407852 for A methodological review of pragmatic designs in acute stroke trials by Ayooluwanimi P Okikiolu, Sucharita Ray, Kamalesh Chakravarty, Olayinka Arimoro, Riley Martens, Nishita Singh, Aravind Ganesh, Mohammed Almekhlafi, Michael D Hill, Bijoy K Menon and Tolulope T Sajobi in International Journal of Stroke
Footnotes
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Data availability statement
All data used in this methodological review are publicly available from the included trial publications and their registries.
Supplemental material
Supplemental material for this article is available online.
