Controls, comparator arms, and designs for critical care comparative effectiveness research: It’s complicated

Abstract

Background

Comparative effectiveness research is meant to determine which commonly employed medical interventions are most beneficial, least harmful, and/or most costly in a real-world setting. While the objectives for comparative effectiveness research are clear, the field has failed to develop either a uniform definition of comparative effectiveness research or an appropriate set of recommendations to provide standards for the design of critical care comparative effectiveness research trials, spurring controversy in recent years. The insertion of non-representative control and/or comparator arm subjects into critical care comparative effectiveness research trials can threaten trial subjects’ safety. Nonetheless, the broader scientific community does not always appreciate the importance of defining and maintaining critical care practices during a trial, especially when vulnerable, critically ill populations are studied. Consequently, critical care comparative effectiveness research trials sometimes lack properly constructed control or active comparator arms altogether and/or suffer from the inclusion of “unusual critical care” that may adversely affect groups enrolled in one or more arms. This oversight has led to critical care comparative effectiveness research trial designs that impair informed consent, confound interpretation of trial results, and increase the risk of harm for trial participants.

Methods/Examples

We propose a novel approach to performing critical care comparative effectiveness research trials that mandates the documentation of critical care practices prior to trial initiation. We also classify the most common types of critical care comparative effectiveness research trials, as well as the most frequent errors in trial design. We present examples of these design flaws drawn from past and recently published trials as well as examples of trials that avoided those errors. Finally, we summarize strategies employed successfully in well-designed trials, in hopes of suggesting a comprehensive standard for the field.

Conclusion

Flawed critical care comparative effectiveness research trial designs can lead to unsound trial conclusions, compromise informed consent, and increase risks to research subjects, undermining the major goal of comparative effectiveness research: to inform current practice. Well-constructed control and comparator arms comprise indispensable elements of critical care comparative effectiveness research trials, key to improving the trials’ safety and to generating trial results likely to improve patient outcomes in clinical practice.

Keywords

Comparative effectiveness research, critical care, clinical trials, control, comparator group, misalignment, trial designs, usual critical care, unusual critical care

Introduction

Poorly constructed control arms or inappropriate active comparators in critical care comparative effectiveness research (CER) trials have drawn challenges and critiques for over two decades.^1–7 Failure to resolve this problem stems in part from a widely held assumption that CER trials pose little or no risk to subjects because they study routinely used clinical interventions.^2,8–10 This belief has led ethicists and trialists evaluating clinical trials to wrongly advise investigators, institutional review boards, safety monitoring committees, and research subjects that care as studied in some critical care CER trials will not differ substantially from care administered outside of a trial setting.^2,8,9,11 The methodology standards of the Patient-Centered Outcomes Research Institute, an influential arbiter of CER policies, state that “usual care … groups should … represent legitimate and coherent clinical options.”¹² To that point, critical care CER trials commonly examine life-sustaining therapies in critically ill subjects, and deviations from “legitimate” and “coherent” usual practices may yield care that is unsafe and research conclusions that are uninformative.^2,7,13–19 In routine clinical practice, life-sustaining therapies are often dose-adjusted or limited to patient subsets based on clinical criteria such as history, pathophysiology, or severity of illness.^2,15 Furthermore, some interventions are titrated to effect for patient safety throughout the course of critical illness. Disruption or distortion of these relationships through randomization may result in potentially hazardous “unusual care” that subjects are unlikely to receive outside of the trial.^{2,3,7,11,13,20}

The definition of CER is intentionally broad. The Institute of Medicine defines CER as “the generation and synthesis of evidence that compares the benefits and harms of alternative methods to prevent, diagnose, treat, and monitor a clinical condition or to improve the delivery of care.”²¹ The Patient Protection and Affordable Care Act describes CER as “evaluating and comparing health outcomes and the clinical effectiveness, risks, and benefits of two or more medical treatments.”²² While these definitions emphasize the clarity CER can bring to medical practice, neither addresses the hidden issues plaguing CER trial design that can confound studies and harm research subjects. Applying these non-granular CER definitions to critically ill, vulnerable subjects allows for the creation of flawed comparator groups that can undermine the safety and interpretability of these trials.^{2,7,11,13–19} Misconceptions about the risk profile of CER trials and the lack of preconditions guiding the selection of appropriate comparator arms have undermined a substantial number of critical care CER trials.^11,13

To understand the scope of this problem, we reviewed 25 critical care CER trials published in three high-impact medical journals between 2019 and 2020.¹¹ Of the trials studied, eight failed to incorporate designated control and/or active comparator arms representative of contemporaneous usual critical care practices. It appears that this type of design weakness is widespread if little appreciated. This study intends to raise awareness of this issue and develop mitigating methodology that will safeguard research subjects while preventing the adoption of improperly or inadequately tested alterations to clinical practice.

Methods/examples

Definition of usual critical care

For this analysis, we define “usual critical care,” as it applies to CER, as “management consistent with contemporary practices and interventions that would have been received routinely outside of the trial.” Deviation from common practices in control or active comparator arms after randomization in a trial thus constitutes “unusual critical care.” To determine what constitutes “usual critical care,” we consider not only the therapy or intervention itself but also the specific way it would be administered to each patient. While clinical care varies considerably in practice, not all the variation is random. Non-random variability in care is typically driven by patient characteristics, clinical factors, and disease severity, and it is vulnerable to disruption by randomization. By failing to offer differential care to patients with well-recognized differences, trial authors create practice misalignments that are at once unconventional (considered “unusual care” or not “usual critical care”) and potentially unsafe. To avoid this error, trial authors must study objective contemporary data on usual critical care across participating institutions prior to trial design and enrollment. The strongest sources of such data include investigator-initiated surveys and observational studies. Prevailing local or national guidelines should also be examined to characterize current critical care practices; however, guidelines alone cannot substitute for actual data when such data are available or can be obtained. Moreover, some guidelines are built around general statements that overlook common individualization patterns at the national, regional, local, or institutional levels.^2,16 Most importantly, at least one trial arm must provide usual care after randomization. If two active comparator arms are contrasted in CER, then both arms need to provide usual care after randomization. Otherwise, the results of the trial can have no direct applicability to current clinical practice. Some published studies fail to recognize major deviations from usual care and/or from how care is commonly provided outside of the trial; this can also lead to misinterpretation or misapplication of study results. Experimental approaches that compare two new treatments but lack a contemporaneous usual care arm severely compromise any ability to determine the equivalence or superiority of either novel approach to current practice.

We found that many recent critical care CER trials clearly define and incorporate usual care into their trial designs. To examine this as stated above, we performed a systematic review of high-impact clinical trial journals and determined that 12 of 146 randomized clinical trials published in The New England Journal of Medicine from April 2019 to March 2020 met the Institute of Medicine criteria for critical care CER.¹¹ Six of these relied on contemporaneous data to define and incorporate current practices into trial design.^23–28 Across these six trials, investigators were exacting in defining usual care. The investigators responsible for these six trials performed 19 separate studies to understand and characterize usual care. The investigators additionally cited more than 60 articles to define contemporaneous practices.¹¹ We further found that, in 11 out of 13 critical care CER studies published in The Lancet and Journal of the American Medical Association during the same time frame, trial authors also meticulously characterized contemporary practices a priori and rigorously replicated these practices in the design and conduct of the trial.^29–39 Based on these findings, we would dispute any suggestion that it is not possible to characterize and/or implement usual care in critical care CER trials.¹¹
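The counts reported in this section can be reconciled with the totals given in the Introduction using simple arithmetic. The short Python sketch below only restates numbers from the text; the dictionary layout and variable names are our own illustrative choices and are not part of the original review.

```python
# Counts as reported in the text; the data layout is an illustrative assumption.
reviewed = {
    "NEJM": {"cer_trials": 12, "defined_usual_care": 6},
    "Lancet + JAMA": {"cer_trials": 13, "defined_usual_care": 11},
}

# Total critical care CER trials reviewed across the three journals.
total = sum(j["cer_trials"] for j in reviewed.values())

# Trials that characterized and incorporated usual care.
defined = sum(j["defined_usual_care"] for j in reviewed.values())

# Trials that did not, and their share of all reviewed trials.
not_defined = total - defined
share_not_defined = not_defined / total

print(total, defined, not_defined, f"{share_not_defined:.0%}")
# prints: 25 17 8 32%
```

Under this tally, the 25 trials and the eight trials lacking representative usual care arms cited in the Introduction correspond to the 12 + 13 trials and the 6 + 2 exceptions described here.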

Risks of not studying usual care in CER

Critical care CER studies that lack a control arm, or that comprise two or more active comparator arms which fail to reliably incorporate contemporaneous usual care, create the following risks: (1) Monitoring boards cannot easily determine whether an intervention is harmful or beneficial compared to usual critical care. Thus, the boards cannot reliably detect a signal for benefit or harm. For an illustrative case, see Example 10 in section “Type 2B trials.” (2) If usual critical care is not studied, it becomes nearly impossible to know with assurance whether conclusions drawn from any comparison will improve future patient care. For an illustrative case, see Example 8 in section “Type 2B trials.” (3) Informed consent, which should clearly explain usual critical care practices and how care will differ in the context of the trial, is compromised. For an illustrative case, see Example 5 in section “Type 2A trials.”

Types of CER trials and common design errors

We propose that there are at least three common types of critical care CER trials: Type 1, Type 2A, and Type 2B (see Table 1). Each of these types is associated with persistent, specific, and unique design errors.

Table 1.

Clinical trial types and the most frequent design errors and potential results.

Arm	Reported comparison	Common error	Potential result after randomization	Examples	Solutions
Type 1 trials
Arm 1	Control (usual care for a therapy)	Control group administers “unusual care” to all or at least some subjects	Experimental arm may appear superior to control because the “control arm” does not represent usual care as given outside of the trial	Examples 1 and 2 in section “Type 1 trials”	See right-hand panel of Figure 1; Example 3 in section “Type 1 trials”
Arm 2	Novel approach to usual care for a therapy			Examples 1 and 2 in section “Type 1 trials”
Type 2A trials
Arm 1	Therapy #1 for the disease which, under some circumstances, might represent usual care	Some subjects fail to receive the usual care they would have received outside of the trial	Many subjects in one or both arms get contraindicated therapies, meaning the results of the study may not be applicable to current practice	Examples 4 and 5 in section “Type 2A trials”	See right-hand panel of Figure 2; Example 6 in section “Type 2A trials”
Arm 2	Therapy #2 for the disease which, under some circumstances, might represent usual care	Other subjects fail to receive the usual care they would have received outside of the trial		Examples 4 and 5 in section “Type 2A trials”
Type 2B trials
Arm 1	Low dose of a therapy used in critical care	Some subjects with severe disease are treated with a low dose of the therapy, representing an unusual misalignment of disease severity and dose	The study compares two distinct misalignments of disease severity and dose, calling into question the relevance and applicability of the study’s results	Examples 7–10 in section “Type 2B trials”	See right-hand panel of Figure 3; Example 11 in section “Type 2B trials”
Arm 2	High dose of the same therapy	Some subjects with mild disease are treated with a high dose of the therapy, representing a distinct but also unusual misalignment of disease severity and dose		Examples 7–10 in section “Type 2B trials”

Type 1 trials

These compare a designated usual critical care control intervention to an unusual but potentially beneficial and acceptable modification of that intervention (Figure 1 and Table 1). A common error is that the trial design fails to establish usual critical care as it is practiced outside of the trial and instead provides novel, potentially unusual, critical care across both arms.

Figure 1.

Type 1 CER: a usual critical care control compared to a novel or unusual approach to usual care believed potentially beneficial. For a Type 1 CER comparison (“usual care vs novel usual care”), a common error is therapeutic misalignment. In this case, the error consists of disadvantaging the control arm. The trialists are comparing a novel approach toward usual care to a disadvantaged control, meaning the trialists cannot determine whether the novel therapy is beneficial or whether the disadvantaged control is harmful. Furthermore, the study’s results cannot easily advise current practice, because the trial design ensures that current practice was never actually studied. The right-hand panel shows how trialists, by studying usual care pre-randomization and verifying it again post-randomization, can ensure the control arm still comprises usual care, eliminating the possibility of a therapeutic misalignment and making the findings readily applicable to current practice.

Example 1, Type 1 error: a recent trial compared cardiac arrest subjects treated with hypothermia to a designated “normothermia control.”⁴⁰ After randomization, subjects in the arm designated “normothermia control” with low temperatures at enrollment were warmed until their core temperature reached an assigned normothermia range. Contemporary critical care practice and guidelines did not support actively warming such patients.^11,41–45 In this design, the “control” represented unusual critical care, potentially disadvantaging this arm compared to the experimental treatment and making it difficult to determine whether the hypothermia arm was beneficial or the “normothermia” arm was harmful.

Example 2, Type 1 error: a recent trial investigated whether early neuromuscular blockade in acute respiratory distress syndrome improves outcomes compared to usual care.⁴⁶ After surveying only the primary site investigators, trial authors restricted use of neuromuscular blockade in the “usual care control arm” to refractory hypoxia and/or to subjects with plateau pressures above 32 cm H₂O that persisted for at least 10 min despite increasing sedation and decreasing positive end-expiratory pressure and tidal volume. However, this did not constitute current practice: data from a large survey, as well as most published observational data available at the time, conclusively showed that clinicians did not commonly restrict neuromuscular blockade in this manner. Instead, clinicians administered neuromuscular blockade most commonly for other clinical indications, such as ventilator asynchrony.^11,47–50

Example 3, Type 1 error avoided: in a trial comparing sedation practices of dexmedetomidine versus usual care in intubated mechanically ventilated subjects, investigators diligently determined usual care a priori to avoid this Type 1 error.^11,26 Three observational studies published by the investigators, and another referenced in the trial and protocol, helped define and support their design of the usual care arm that was employed.^51–54 A prospective observational study found that dexmedetomidine was used in only 7.6% of patients in the first 48 h following intubation.⁵¹ Therefore, the control arm was designed to replicate usual care by only discouraging dexmedetomidine use in the control arm but fully permitting dexmedetomidine if additional sedation was deemed necessary by treating clinicians at the bedside.

Type 2A trials

These compare two or more different active comparator treatments, all of which are defined as usual critical care (Figure 2 and Table 1). The common error is therapeutic misalignment, which occurs when randomization leads to subjects within a trial being administered therapeutics and treatments that their clinical characteristics would usually preclude them from receiving during routine care outside of the trial.¹⁵ In a critical care Type 2A error, the two or more treatments being compared differ from each other but treat the same disease. However, in contemporaneous common practice, each treatment is typically administered to patients based on their distinct clinical characteristics. This is because each approach may be associated with perceived risks and benefits that differ across patient subgroups. Therapeutic misalignments result when investigators fail to consider these practice/patient relationships and decouple them through randomization, causing patient subgroups in one or more arms to receive therapies or interventions that would be considered inappropriate outside of the trial setting. In other words, investigators have administered unusual care to trial subjects. Any trial that only compares two arms, both constituting unusual care, cannot yield results that reliably inform current critical care practice.^2,15 Current practice might be superior to both unusual care arms, but this would remain unknown because current practice was omitted from the trial design.

Figure 2.

Type 2A CER: two active comparator arms which vary categorically from each other are compared. For Type 2A CER, the panel on the left shows a therapeutic misalignment. In both arms, post-randomization, some subjects received an inadvisable treatment. This comparison has limited clinical meaning. Furthermore, trialists have limited ability to advise current practice because current practice was never studied. The panel on the right shows how this error can be avoided if trialists, prior to randomization, exclude subjects for whom either therapy is inadvisable. The correction shown for Type 1 can also be used to help prevent this error.
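The pre-randomization exclusion described for Figure 2 can be illustrated with a small simulation. The following Python sketch is hypothetical throughout: the arm labels, the cohort size, and the assumption that roughly 20% of subjects have a characteristic making each therapy inadvisable are invented for illustration and are not drawn from any trial discussed here.

```python
import random

ARMS = ("therapy_1", "therapy_2")  # hypothetical arm labels


def make_cohort(n, seed=0):
    """Build a hypothetical cohort; each subject lists therapies that
    usual care would consider inadvisable for them."""
    rng = random.Random(seed)
    cohort = []
    for i in range(n):
        r = rng.random()
        if r < 0.2:            # assumed: ~20% should not get therapy 1
            inadvisable = {"therapy_1"}
        elif r < 0.4:          # assumed: ~20% should not get therapy 2
            inadvisable = {"therapy_2"}
        else:
            inadvisable = set()
        cohort.append({"id": i, "inadvisable": inadvisable})
    return cohort


def randomize(cohort, seed=1):
    """Simple randomization that ignores patient characteristics."""
    rng = random.Random(seed)
    return [(subject, rng.choice(ARMS)) for subject in cohort]


def count_misaligned(assignments):
    """Count subjects assigned a therapy that is inadvisable for them."""
    return sum(1 for subject, arm in assignments
               if arm in subject["inadvisable"])


cohort = make_cohort(1000)

# Left panel: randomize everyone regardless of characteristics.
unscreened = randomize(cohort)

# Right panel: exclude, before randomization, anyone for whom
# either therapy is inadvisable.
eligible = [s for s in cohort if not s["inadvisable"]]
screened = randomize(eligible)

print("misaligned without screening:", count_misaligned(unscreened))
print("misaligned with exclusion:   ", count_misaligned(screened))
```

In this toy setting, exclusion removes misalignment entirely, at the cost of a smaller and narrower enrolled population. Whether that trade-off is acceptable in a real trial depends on how well usual care has been characterized beforehand.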

Example 4, Type 2A error: in a recent trial studying the acute management of patients with status epilepticus, trial authors randomly assigned a broad, heterogeneous group of subjects to receive one of three antiepileptic drugs.⁵⁵ In routine practice, physicians select such treatment after considering which of the three drugs patients are already receiving, as well as their compliance history, age, comorbidities, and underlying conditions.^13,56–58 But in this trial, the authors failed to account for any of these factors for the majority of trial subjects and instead randomized heterogeneous subjects to receive any one of the three trial drugs regardless of their histories or of the dictates of usual practice. In approximately 10%–15% of subjects, the etiology of the episode of status epilepticus was withdrawal of, or non-compliance with, their home antiepileptic drugs.⁵⁵ If poor compliance with home antiepileptic drugs is suspected, the preferred treatment after benzodiazepine administration would be to administer additional doses of the patient’s maintenance antiepileptic drugs.⁵⁷ Arbitrarily selecting an antiepileptic drug for an individual who was in status epilepticus because of missed doses, instead of giving additional doses of a maintenance drug known to have previously provided seizure control, would constitute unusual care and could delay effective treatment. Despite this deviation from usual care, the Food and Drug Administration allowed a waiver of informed consent.⁵⁵ Ultimately, the authors compared three Food and Drug Administration-approved drugs that were randomly assigned in contradistinction to usual practices. As such, the trial’s ability to inform current practice was compromised, since current practice was never studied.

Example 5, Type 2A error: in a trial investigating target oxygen saturation range, neonates born at less than 28 weeks’ gestation were randomly assigned to the lower or upper half of the American Academy of Pediatrics’ recommended range of oxygenation.⁵⁹ Relying solely on this range, the trial authors failed to consider or incorporate previously published available data outlining “current” practices.^16,60 The study as performed ignored the common practice of neonatologists to almost always set the upper limit of the oxygen saturation range at or above 92%, thus allowing bedside caregivers to err on the side of adequately oxygenating all neonates. While the lower limit of the range was highly variable,^2,13,16,60 nurses were found to routinely skew oxygen delivery toward the upper end of target ranges to avoid hypoxia.^16,60 The upper limit of the lower oxygen saturation range studied in the trial (89%) was rarely, if ever, used in clinical practice.¹⁶ By restricting the upper limit of the overall range, this arm limited the amount of oxygen that could be delivered to low levels. Maintaining low oxygen saturation levels can be extremely harmful to neonatal subjects, increasing the risk of necrotizing enterocolitis and death.^16,61 The consent documents for this trial did not include this risk. The documents misled parents by stating that there was no increased risk associated with taking part in the study and that, in the two arms studied, neonates would receive “routine” or “standard” care.^16,20,62 Furthermore, within a year after publication in The New England Journal of Medicine, this trial resulted in a 3-year-long controversy in the lay and scientific press over the adequacy of the explanation of risks in the consent documents provided to the parents.^{2,9,16,59,63,64} Defenders of the trial argued that since both arms were usual care, informed consent was not even necessary.^9,10 This controversy abruptly ended once it was shown that one of the two oxygen ranges studied was lower than usual care, carried increased risks, and was rarely if ever used in neonatal intensive care units.¹⁶ This trial illustrates how important it is to understand contemporary practices, and not just guidelines, in developing well-designed critical care CER trials.

Example 6, Type 2A error avoided: usual care was effectively incorporated in a trial testing whether high-flow nasal cannula therapy in premature infants with respiratory distress was non-inferior to nasal continuous positive airway pressure.^11,25 Prior to commencement, the investigators conducted a survey that indicated most healthcare providers used nasal continuous positive airway pressure and only a few centers additionally used high-flow nasal cannula therapy in infants.⁶⁵ Therefore, to maintain usual care, enrollment was restricted to centers using nasal continuous positive airway pressure alone. Given this restriction, investigators did not have to determine whether other factors influenced the choice of treatment and did not need to incorporate such factors into the trial design. Focusing enrollment on the chosen centers made it possible to draw firm conclusions that institutions primarily using nasal continuous positive airway pressure could easily apply to common practice.

Type 2B trials

These compare different doses or levels of the same treatment, one that is commonly titrated over a range based on patient characteristics during routine care (Figure 3 and Table 1). The common error is therapeutic misalignment between dose or level and disease severity. Many critical care interventions are adjusted to individual patient needs, reflecting differences in patient-level characteristics that can evolve over time during critical illness. Errors result when trial authors conduct randomization that misaligns treatment dose or intensity with the needs of individual patients, subjecting them to care different from what they would receive outside of the trial. Instead of preserving routine titration practices, trial authors randomize subjects to fixed levels of the same treatment.¹⁵ In general, there is a known, significant relationship between the treatment and clinical characteristics, grounded in physiologic outcomes, that must be preserved or subjects will be harmed. Therefore, randomizing subjects to two fixed and widely separated treatment doses of a routinely titrated therapy, irrespective of the severity of subjects’ illness or need, means that subgroups in both arms will receive different, non-comparable unusual care. In one arm, a subgroup of subjects with the least severe disease will be randomized to receive maximal therapy. In the other arm, a subgroup with the most severe disease will be randomized to receive minimal therapy. Comparing these two arms has limited clinical applicability.

Figure 3.

Type 2B CER: two active comparator arms which vary orthogonally (by dose) from each other are compared. For Type 2B CER, the panel on the left shows a therapeutic misalignment. Therapy variation is not random in this case but is based on a clinical characteristic that is likely to alter patient outcome. Although the therapy dose covaries with severity of disease, trialists disregard this fact. As a result, after randomization, the trial compares some subjects with severe disease receiving low-dose therapy to subjects with mild disease receiving high-dose therapy. This comparison has limited clinical meaning for current practice. Thus, the trialists’ ability to correctly advise practice is diminished, because current practice was never studied. The panel on the right shows how these misalignments can be avoided if trialists stratify randomization to maintain the current practice of adjusting dose by disease severity in one arm. In the other arm, one can modify the low-dose and high-dose therapy in a manner believed beneficial for each and stratify the randomization so that each level of disease severity is studied with a suitable therapy dose. It is also possible to circumvent the need for stratification and avoid misalignment by studying the experimental therapy in comparison with only the low-dose therapy as given in current practice or only the high-dose therapy as given in current practice.

Example 7, Type 2B error: this problem can be easily seen in a hypothetical trial of vasopressor therapy. In routine care, vasopressors are carefully titrated to maintain a target blood pressure in septic shock. Patients with severe septic shock will require higher doses of vasopressors than patients with milder cases, who may require little or no vasopressor administration. A hypothetical study can be designed in which subjects are randomized to receive either high- or low-dose vasopressors regardless of the severity of shock. This produces a predictable misalignment: some subjects with severe septic shock randomized into the low-dose arm will receive insufficient therapy and have persistent hypotension. Conversely, in the high-dose vasopressor arm, there will be a subgroup of subjects with mild shock who, despite clinically requiring minimal vasopressors, receive excessive doses and become hypertensive. Therefore, the comparison between the two fixed dosing arms of this hypothetical study is a meaningless one and reveals only which practice misalignment is more harmful.¹⁵ If prescribed dosage in clinical practice significantly covaries with severity of illness or other patient-determined factors, then randomly assigning patients to two widely separated, fixed treatment regimens will likely produce qualitative interactions or different harmful effects in each arm. If this is not recognized, then whichever arm is more harmful will determine the overall outcome of the study. If the presence of a qualitative interaction is not recognized, the less harmful of the two clinical scenarios could be enshrined in perpetuity in future clinical practice.
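The hypothetical vasopressor trial can be made concrete with a toy simulation. In the Python sketch below, every number is an assumption chosen for illustration: severity is drawn uniformly between 0 and 1, the dose needed under usual care is assumed to equal severity, the fixed low and high doses are 0.2 and 0.8 in arbitrary units, and a gap of more than 0.2 between dose and need counts as misalignment. A third, titrated arm is added purely as a contrast and is not part of the hypothetical trial described above.

```python
import random

LOW_DOSE, HIGH_DOSE = 0.2, 0.8  # hypothetical fixed doses (arbitrary units)
TOLERANCE = 0.2                 # hypothetical acceptable dose-need gap


def needed_dose(severity):
    """Assumed usual care: titrated dose tracks severity one-to-one."""
    return severity


def classify(assigned, severity):
    """Label a subject by how the assigned dose compares with need."""
    gap = assigned - needed_dose(severity)
    if gap < -TOLERANCE:
        return "under-treated"
    if gap > TOLERANCE:
        return "over-treated"
    return "aligned"


def run_arm(severities, dose_rule):
    """Tally alignment categories for one arm under a dosing rule."""
    counts = {"under-treated": 0, "aligned": 0, "over-treated": 0}
    for s in severities:
        counts[classify(dose_rule(s), s)] += 1
    return counts


rng = random.Random(42)
severities = [rng.random() for _ in range(900)]  # shock severity, 0 to 1
rng.shuffle(severities)                          # randomize to three arms
low_arm = severities[:300]
high_arm = severities[300:600]
titrated_arm = severities[600:]

low = run_arm(low_arm, lambda s: LOW_DOSE)     # fixed low dose
high = run_arm(high_arm, lambda s: HIGH_DOSE)  # fixed high dose
usual = run_arm(titrated_arm, needed_dose)     # usual titrated care

print("low-dose arm: ", low)
print("high-dose arm:", high)
print("titrated arm: ", usual)
```

Under these assumptions, the low-dose arm contains only under-treated misalignments (subjects with more severe shock) and the high-dose arm only over-treated ones (subjects with milder shock), while the titrated arm has none. Comparing the two fixed arms therefore contrasts two different kinds of misalignment rather than either one against usual care.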

Example 8, Type 2B error: a trial of critically ill subjects studied how restrictive and liberal approaches to red blood cell transfusion affect mortality.⁶⁶ A prior survey by the investigators showed that physicians typically employed a range of hemoglobin levels to trigger transfusions and prescribed more red blood cell units as patients’ age, cardiovascular comorbidities, and severity of illness increased, consistent with contemporaneous consensus practices.⁶⁷ Although the survey findings were supported by consensus conferences at the time,⁶⁸ the trial authors did not include a usual critical care arm in which transfusions were adjusted based on individual patient characteristics. In one arm, younger, stable subjects received transfusions that were not clinically indicated and potentially harmful, while in the other arm, older subjects at risk for cardiovascular disease did not receive transfusions that were clinically indicated.^2,13–15,17 Comparison of these two arms was uninformative, as it was a comparison of two different types of unusual critical care. This study could not fully inform contemporary practices because current usual care was not incorporated into the trial design. Whether either of the fixed-dose arms (both experimental) might be better than usual care, which is routinely titrated, is not known.^14,15,17,68

Example 9, Type 2B error: in another trial where routinely titrated therapies were instead studied at fixed levels, the authors compared a set high versus low range of arterial blood oxygen during the treatment of acute respiratory distress syndrome.⁶⁹ In no trial arm did clinicians practice usual critical care, which involves titration of supplemental oxygen to avoid hypoxia while minimizing patients’ exposure to toxic oxygen levels.^{11,47,70–74} Consequently, some patients with severe disease randomized to the low fraction of inspired oxygen arm were unnecessarily kept in a state of relative hypoxia.¹¹ Other subjects with minimal disease randomized to the high fraction of inspired oxygen arm were unnecessarily exposed to potentially toxic O₂ levels despite their high blood oxygen levels.¹¹ Ultimately, this trial compared two arms in which different subgroups received unusual critical care, increasing trial subjects’ risk and significantly diminishing the usefulness of results for informing current practice.¹¹

Example 10, Type 2B error: in another trial of severe respiratory failure (acute respiratory distress syndrome), subjects were randomized to two fixed treatment strategies: mechanical ventilation with a large breath in one arm versus a small breath in the other (high vs low tidal volume ventilation).⁷⁵ The baseline data from this trial showed that healthcare providers typically used a range of tidal volumes pre-randomization.^{2,13,15,18,76} On average, pre-randomization tidal volumes decreased as lungs became more rigid (non-compliant) or diseased.^76,77 But subjects in this trial’s high tidal volume arm, designated “traditional volume,” did not receive care consistent with contemporaneous practice as defined by current mechanical ventilation data. Approximately 80% of subjects enrolled in the high tidal volume control arm saw their tidal volumes increased from the pre-study baseline values prescribed by their personal physicians.¹⁵ The death rate was much higher in the high tidal volume arm than it was for patients who met enrollment criteria but were not randomized.⁷⁶ In this design, the “traditional volume” arm represented unusual critical care, likely disadvantaging subjects in this arm. Moreover, the trial’s design rendered it difficult to determine whether the intervention in the low tidal volume arm was beneficial or the high “traditional volume” arm harmful.^{2,13,15,76,78}

Example 11, Type 2B error avoided: a trial that maintained usual care titration in all arms investigated whether more tightly managed routine oxygen therapy in acute respiratory failure would prevent lung injury.^{11,23} One arm of this trial was unrestricted usual titrated care, which was determined from eight previous studies conducted by the investigators as well as four observational studies published by others.^{70–74,79–85} The other arm was a more conservative arm in which healthcare providers more fastidiously lowered blood oxygen levels if the oxygen saturation was at an acceptable level of >91%, which was consistent with current practice. The trial maintained usual titration practices in both arms, using the lowest inspired oxygen therapy that would result in a safe level of arterial oxygenation. The investigators tested whether a more conservative strategy, characterized by more aggressively lowering oxygen levels whenever possible, could further limit unnecessary hyperoxia and injury.

Considerations for CER trial design

We have proposed a schema to organize critical care CER trials. Across the three types of error, the foundational design flaw underlying the common errors we describe is the lack of a control or active comparator arm representing contemporaneous usual care. To safeguard against this widespread error, we suggest the following “best practices.”

Step (1) Investigators should perform an in-depth determination of contemporaneous usual care practices at enrolling institutions before designing critical care CER trials. This process can include, but is not limited to, literature review, surveys, and retrospective or prospective observational studies. The investigators should define “usual care” by clearly documenting how therapies and interventions are administered, dosed, and adjusted based on disease dynamics and the individual characteristics of all major subtypes of subjects.

Step (2) Critical care CER trials should be designed to incorporate usual care by one or more of the following methods. For a Type 1 study, at least one arm should reasonably represent usual care as administered outside of the trial. For a Type 2 study, multiple arms of the trial must constitute usual care. Trial authors must consider and apply appropriate exclusion or inclusion criteria when designing their trials so that, after randomization, subjects enrolled in one or more arms of a trial still receive usual care.

Step (3) After designing the CER trial, but prior to enrollment of subjects, trial authors should conduct a thought experiment.

In a Type 1 trial, the authors should ask whether, after randomization, the different major subtypes of subjects enrolled in the “control” arm will still receive usual care. If necessary, the authors should then modify their originally planned exclusion criteria to exclude those subjects deemed unlikely to receive usual care post-randomization. In a Type 2 trial, the authors should examine all active comparator arms to ensure that enrolled subjects will still receive usual critical care post-randomization as it is practiced for that specific population of patients. Once again, the authors must modify planned exclusion criteria—or develop additional ones—to account for those subjects deemed unlikely to receive usual care post-randomization.
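The thought experiment can be made concrete with a simple pre-enrollment check. The Python sketch below is illustrative only: the subtype labels, the `SettingRange` type, the function name, and every number are hypothetical assumptions in arbitrary units, not values drawn from any trial discussed here. It assumes that usual care has already been documented as a range of a titrated setting for each major subtype, and it flags the subtypes for whom a fixed-protocol arm would fall outside that documented range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SettingRange:
    """Closed interval [low, high] of a titrated therapy setting (arbitrary units)."""
    low: float
    high: float

    def contains(self, other: "SettingRange") -> bool:
        # True only when `other` lies entirely inside this interval.
        return self.low <= other.low and other.high <= self.high


def subtypes_outside_usual_care(usual_care, arm_protocol):
    """List subtypes for whom the arm's protocol range is not fully within usual care."""
    return sorted(
        subtype
        for subtype, usual_range in usual_care.items()
        if not usual_range.contains(arm_protocol)
    )


# Hypothetical documented usual care: the setting is titrated down as disease worsens.
usual_care = {
    "mild": SettingRange(8, 12),
    "moderate": SettingRange(6, 10),
    "severe": SettingRange(4, 8),
}

# Hypothetical fixed-protocol arms applied to every enrolled subject.
high_arm = SettingRange(10, 12)
low_arm = SettingRange(6, 7)

for name, arm in (("high", high_arm), ("low", low_arm)):
    flagged = subtypes_outside_usual_care(usual_care, arm)
    print(f"{name} arm: subtypes moved outside usual care -> {flagged}")
```

Under these assumed numbers, the high arm moves moderate and severe subjects outside usual care and the low arm does the same for mild subjects. Flagged subtypes are candidates either for exclusion criteria or for a redesigned arm that permits titration. A real assessment would rest on the documented practice data described in Step (1), not on a single interval per subtype.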

Limitations

The limitations of applying our recommendations to critical care CER require mention. In this article, our focus is not on whether critical care CER trials ask clinically meaningful questions or whether the informed consent documents were adequate. Rather, we focus on whether trial designs optimized their ability to inform current practice, facilitated (or allowed) the writing of informed consent documents, and minimized risks to subjects through an understanding of usual care and capacity to adequately monitor safety. We acknowledge that there may be additional errors of trial design involving lack of or inappropriate use of usual care comparators in CER.

Conclusion

It is essential to both define and incorporate usual care when designing and conducting critical care CER trials. In some past studies, insufficient attention to these matters compromised study conclusions, patient safety, and the process of informed consent.^{2,11,13–19} Our proposed schema may not capture every potential type of critical care CER comparison, and we acknowledge that some of the classification types overlap. Furthermore, we understand that our recommendations could make it more difficult to recruit and enroll patients. However, we believe that the approach we outline is not only feasible but will also decrease risks and generate valid study results that have real-world implications. We are confident that our proposed definitions and categorizations, as well as our description of common errors, will constitute a meaningful step toward the improvement of critical care CER trial design.

Footnotes

Acknowledgements

The authors thank Ruth Macklin and Michael Carome for their invaluable help in ensuring the accuracy and readability of the manuscript. The authors dedicate this manuscript to the memory of Jordi Mancebo, who died on 6 August 2022, at 64 years. Jordi reviewed an earlier version of the manuscript, and we can only hope it captures his passion for truth and dedication to patients.

Author contributions

C.N. and H.K. contributed to conceptualization. C.N., H.G.K., and V.J.F. contributed to data curation. V.J.F. contributed to visualization. C.N., H.G.K., P.Q.E., V.J.F., I.C.-P., W.N.A., and J.W. contributed to writing. H.G.K., C.N., and V.J.F. contributed to editing.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Institutes of Health (NIH) intramural funding from the NIH Clinical Center. The work by the authors was done as part of US government–funded research; however, the opinions expressed are not necessarily those of the NIH.

ORCID iD

Charles Natanson

References

1. Macklin, Natanson. Response to open peer commentaries on “misrepresenting ‘usual care’ in research: an ethical and scientific error.” Am J Bioeth 2020; 20(1): W12–W14.

2. Macklin, Natanson. Misrepresenting “usual care” in research: an ethical and scientific error. Am J Bioeth 2020; 20: 31–39.

3. Marshall. Clinical trials in critical condition. Science 2003; 300: 1225–1226.

4. Couzin-Frankel. Call to halt heart trial raises vexing questions. Science 2017; 357: 538–539.

5. Takala. Better conduct of clinical trials: the control group in critical care trials. Crit Care Med 2009; 37(1 Suppl.): S80–S90.

6. Thompson, Schoenfeld. Usual care as the control group in clinical trials of nonpharmacologic interventions. Proc Am Thorac Soc 2007; 4: 577–582.

7. Silverman, Miller. Control group selection in critical care randomized controlled trials evaluating interventional strategies: an ethical assessment. Crit Care Med 2004; 32(3): 852–857.

8. Truog, Robinson, Randolph, et al. Is informed consent always necessary for randomized, controlled trials? N Engl J Med 1999; 340: 804–807.

9. Magnus, Caplan. Risk, consent, and SUPPORT. N Engl J Med 2013; 368: 1864–1865.

10. Lantos, Spertus. The concept of risk in comparative-effectiveness research. N Engl J Med 2014; 371: 2129–2130.

11. Applefeld, Wang, Cortés-Puch, et al. Modeling current practices in critical care comparative effectiveness research. Crit Care Resusc 2022; 24: 150–162.

12. PCORI. PCORI methodology standards, https://www.pcori.org/research/about-our-research/research-methodology/pcori-methodology-standards (2019, accessed November 2022).

13. Applefeld, Wang, Klein, et al. Comparative effectiveness research in critically ill patients: risks associated with mischaracterising usual care. Crit Care Resusc 2020; 22(2): 110–118.

14. Deans, Minneci, Klein, et al. The relevance of practice misalignments to trials in transfusion medicine. Vox Sang 2010; 99: 16–23.

15. Deans, Minneci, Suffredini, et al. Randomization in clinical trials of titrated therapies: unintended consequences of using fixed treatment protocols. Crit Care Med 2007; 35(6): 1509–1516.

16. Cortés-Puch, Wesley, Carome, et al. Usual care and informed consent in clinical trials of oxygen management in extremely premature infants. PLoS ONE 2016; 11(5): e0155005.

17. Cortés-Puch, Wiley, Sun, et al. Risks of restrictive red blood cell transfusion strategies in patients with cardiovascular disease (CVD): a meta-analysis. Transfus Med 2018; 28(5): 335–345.

18. Eichacker, Gerstenberger, Banks, et al. Meta-analysis of acute lung injury and acute respiratory distress syndrome trials testing low tidal volumes. Am J Respir Crit Care Med 2002; 166: 1510–1514.

19. Piller. Failure to protect? Science 2021; 373: 729–733.

20. Horn, Weijer, Grimshaw, et al. An ethical analysis of the support trial: addressing challenges posed by a pragmatic comparative effectiveness randomized controlled trial. Kennedy Inst Ethics J 2018; 28(1): 85–118.

21. Institute of Medicine (IOM). Initial national priorities for comparative effectiveness research. Washington, DC: The National Academies Press, 2009.

22. Sox, Goodman. The methods of comparative effectiveness research. Annu Rev Public Health 2012; 33: 425–445.

23. ICU-ROX Investigators and the Australian and New Zealand Intensive Care Society Clinical Trials Group, Mackle, Bellomo, et al. Conservative oxygen therapy during mechanical ventilation in the ICU. N Engl J Med 2020; 382: 989–998.

24. Lemkes, Janssens, van der Hoeven, et al. Coronary angiography after cardiac arrest without ST-segment elevation. N Engl J Med 2019; 380: 1397–1407.

25. Manley, Arnolda GRB, Wright IMR, et al. Nasal high-flow therapy for newborn infants in special care nurseries. N Engl J Med 2019; 380: 2031–2040.

26. Shehabi, Howe, Bellomo, et al. Early sedation with dexmedetomidine in critically ill patients. N Engl J Med 2019; 380: 2506–2517.

27. Arabi, Burns KEA, Finfer. Pneumatic compression in venous thromboprophylaxis. Reply. N Engl J Med 2019; 381: 95.

28. Schupke, Neumann, Menichelli, et al. Ticagrelor or prasugrel in patients with acute coronary syndromes. N Engl J Med 2019; 381: 1524–1534.

29. Callum, Farkouh, Scales, et al. Effect of fibrinogen concentrate vs cryoprecipitate on blood component transfusion after cardiac surgery: the FIBRES randomized clinical trial. JAMA 2019; 322: 1966–1976.

30. Garrouste-Orgeas, Flahault, Vinatier, et al. Effect of an ICU diary on posttraumatic stress disorder symptoms among patients receiving mechanical ventilation: a randomized clinical trial. JAMA 2019; 322: 229–239.

31. Guihard, Chollet-Xemard, Lakhnati, et al. Effect of rocuronium vs succinylcholine on endotracheal intubation success rate among patients undergoing out-of-hospital rapid sequence intubation: a randomized clinical trial. JAMA 2019; 322: 2303–2312.

32. Johnston, Bruno, Pauls, et al. Intensive vs standard treatment of hyperglycemia and functional outcome in patients with acute ischemic stroke: the SHINE randomized clinical trial. JAMA 2019; 322: 326–335.

33. Rosa, Falavigna, da Silva, et al. Effect of flexible family visitation on delirium among patients in the intensive care unit: the ICU visits randomized clinical trial. JAMA 2019; 322: 216–228.

34. Spinella, Tucci, Fergusson, et al. Effect of fresh vs standard-issue red blood cell transfusions on multiple organ dysfunction syndrome in critically ill pediatric patients: a randomized clinical trial. JAMA 2019; 322: 2179–2190.

35. PEPTIC Investigators for the Australian and New Zealand Intensive Care Society Clinical Trials Group, Alberta Health Services Critical Care Strategic Clinical Network, and the Irish Critical Care Trials Group, Young, Bagshaw, et al. Effect of stress ulcer prophylaxis with proton pump inhibitors vs histamine-2 receptor blockers on in-hospital mortality among ICU patients receiving invasive mechanical ventilation: the PEPTIC randomized clinical trial. JAMA 2020; 323: 616–626.

36. Dalziel, Borland, Furyk, et al. Levetiracetam versus phenytoin for second-line treatment of convulsive status epilepticus in children (ConSEPT): an open-label, multicentre, randomised controlled trial. Lancet 2019; 393: 2135–2145.

37. Lamontagne, Richards-Belle, Thomas, et al. Effect of reduced exposure to vasopressors on 90-day mortality in older critically ill patients with vasodilatory hypotension: a randomized clinical trial. JAMA 2020; 323: 938–949.

38. Lyttle, Rainford NEA, Gamble, et al. Levetiracetam versus phenytoin for second-line treatment of paediatric convulsive status epilepticus (EcLiPSE): a multicentre, open-label, randomised trial. Lancet 2019; 393: 2125–2134.

39. RESTART Collaboration. Effects of antiplatelet therapy after stroke due to intracerebral haemorrhage (RESTART): a randomised, open-label trial. Lancet 2019; 393: 2613–2623.

40. Lascarrou, Merdji, Le Gouge, et al. Targeted temperature management for cardiac arrest with nonshockable rhythm. N Engl J Med 2019; 381: 2327–2337.

41. Callaway, Donnino, Fink, et al. Part 8: post-cardiac arrest care: 2015 American Heart Association Guidelines update for cardiopulmonary resuscitation and emergency cardiovascular care. Circulation 2015; 132: S465–S482.

42. Donnino, Andersen, Berg, et al. Temperature management after cardiac arrest: an advisory statement by the advanced life support task force of the international liaison committee on resuscitation and the American heart association emergency cardiovascular care committee and the council on cardiopulmonary, critical care, perioperative and resuscitation. Circulation 2015; 132: 2448–2456.

43. Dumas, Grimaldi, Zuber, et al. Is hypothermia after cardiac arrest effective in both shockable and nonshockable patients? Insights from a large registry. Circulation 2011; 123: 877–886.

44. Legriel, Hilly-Ginoux, Resche-Rigon, et al. Prognostic value of electrographic postanoxic status epilepticus in comatose cardiac-arrest survivors in the therapeutic hypothermia era. Resuscitation 2013; 84(3): 343–350.

45. Lemiale, Huet, Vigue, et al. Changes in cerebral blood flow and oxygen extraction during post-resuscitation syndrome. Resuscitation 2008; 76: 17–24.

46. The National Heart, Lung, and Blood Institute PETAL Clinical Trials Network, Moss, Huang, et al. Early neuromuscular blockade in the acute respiratory distress syndrome. N Engl J Med 2019; 380: 1997–2008.

47. Bellani, Laffey, Pham, et al. Epidemiology, patterns of care, and mortality for patients with acute respiratory distress syndrome in intensive care units in 50 countries. JAMA 2016; 315: 788–800.

48. Dodia, Richert, Deitchman, et al. A survey of academic intensivists use of neuromuscular blockade in subjects with ARDS. Respir Care 2020; 65(3): 362–368.

49. Duan, Adhikari NKJ, D’Aragon, et al. Management of acute respiratory distress syndrome and refractory hypoxemia. Ann Am Thorac Soc 2017; 14(12): 1818–1826.

50. Torbic, Bauer, Personett, et al. Perceived safety and efficacy of neuromuscular blockers for acute respiratory distress syndrome among medical intensive care unit practitioners: a multicenter survey. J Crit Care 2017; 38: 278–283.

51. Shehabi, Bellomo, Reade, et al. Early intensive care sedation predicts long-term mortality in ventilated critically ill patients. Am J Respir Crit Care Med 2012; 186: 724–731.

52. Shehabi, Chan, Kadiman, et al. Sedation depth and long-term mortality in mechanically ventilated critically ill adults: a prospective longitudinal multicentre cohort study. Intensive Care Med 2013; 39(5): 910–918.

53. Shehabi, Bellomo, Kadiman, et al. Sedation intensity in the first 48 hours of mechanical ventilation and 180-day mortality: a multinational prospective longitudinal cohort study. Crit Care Med 2018; 46(6): 850–859.

54. Wunsch, Kahn, Kramer, et al. Use of intravenous infusion sedation among mechanically ventilated patients in the United States. Crit Care Med 2009; 37(12): 3031–3039.

55. Kapur, Elm, Chamberlain, et al. Randomized trial of three anticonvulsant medications for status epilepticus. N Engl J Med 2019; 381: 2103–2113.

56. Brophy, Bell, Claassen, et al. Guidelines for the evaluation and management of status epilepticus. Neurocrit Care 2012; 17: 3–23.

57. Cock, Coles, Elm, et al. Lessons from the established status epilepticus treatment trial. Epilepsy Behav 2019; 101(Pt B): 106296.

58. Riviello Jr, Ashwal, Hirtz, et al. Practice parameter: diagnostic assessment of the child with status epilepticus (an evidence-based review): report of the quality standards subcommittee of the American Academy of Neurology and the Practice Committee of the Child Neurology Society. Neurology 2006; 67: 1542–1550.

59. SUPPORT Study Group of the Eunice Kennedy Shriver NICHD Neonatal Research Network, Carlo, Finer, et al. Target ranges of oxygen saturation in extremely preterm infants. N Engl J Med 2010; 362: 1959–1969.

60. Hagadorn, Furey, Nghiem, et al. Achieved versus intended pulse oximeter saturation in infants born less than 28 weeks’ gestation: the AVIOx study. Pediatrics 2006; 118(4): 1574–1582.

61. Saugstad, Aune. Optimal oxygenation of extremely low birth weight infants: a meta-analysis and systematic review of the oxygen saturation target studies. Neonatology 2014; 105(1): 55–63.

62. Annas. Legally blind: the therapeutic illusion in the support trial of extremely premature infants. J Contemp Health Law Policy 2013; 30: 1–36.

63. Lantos. Learning the right lessons from the SUPPORT study controversy. Arch Dis Child Fetal Neonatal Ed 2014; 99(1): F4–F5.

64. Tavernise. Study of babies did not disclose risks, U.S. finds. The New York Times, April 10, 2013, https://www.nytimes.com/2013/04/11/health/parents-of-preemies-werent-told-of-risks-in-study.html#:∼:text=A%20federal%20agency%20has%20found,chances%20of%20blindness%20or%20death.

65. Manley, Owen, Doyle, et al. High-flow nasal cannulae and nasal continuous positive airway pressure use in non-tertiary special care nurseries in Australia and New Zealand. J Paediatr Child Health 2012; 48(1): 16–21.

66. Hébert, Wells, Blajchman, et al. A multicenter, randomized, controlled clinical trial of transfusion requirements in critical care. Transfusion Requirements in Critical Care Investigators, Canadian Critical Care Trials Group. N Engl J Med 1999; 340: 409–417.

67. Hébert, Wells, Martin, et al. A Canadian survey of transfusion practices in critically ill patients. Crit Care Med 1998; 26(3): 482–487.

68. Consensus conference. Perioperative red blood cell transfusion. JAMA 1988; 260: 2700–2703.

69. Barrot, Asfar, Mauny, et al. Liberal or conservative oxygen therapy for acute respiratory distress syndrome. N Engl J Med 2020; 382: 999–1008.

70. Chu, Kim, Young, et al. Mortality and morbidity in acutely ill adults treated with liberal versus conservative oxygen therapy (IOTA): a systematic review and meta-analysis. Lancet 2018; 391: 1693–1705.

71. Helmerhorst, Schultz, van der Voort, et al. Self-reported attitudes versus actual practice of oxygen therapy by ICU physicians and nurses. Ann Intensive Care 2014; 4: 23.

72. Panwar, Capellier, Schmutz, et al. Current oxygenation practice in ventilated patients-an observational cohort study. Anaesth Intensive Care 2013; 41(4): 505–514.

73. Suzuki, Eastwood, Peck, et al. Current oxygen management in mechanically ventilated patients: a prospective observational cohort study. J Crit Care 2013; 28(5): 647–654.

74. Young, Beasley, Capellier, et al. Oxygenation targets, monitoring in the critically ill: a point prevalence study of clinical practice in Australia and New Zealand. Crit Care Resusc 2015; 17: 202–207.

75. Acute Respiratory Distress Syndrome Network, Brower, Matthay, et al. Ventilation with lower tidal volumes as compared with traditional tidal volumes for acute lung injury and the acute respiratory distress syndrome. N Engl J Med 2000; 342: 1301–1308.

76. Deans, Minneci, Cui, et al. Mechanical ventilation in ARDS: one size does not fit all. Crit Care Med 2005; 33: 1141–1143.

77. Carmichael, Dorinsky, Higgins, et al. Diagnosis and therapy of acute respiratory distress syndrome in adults: an international survey. J Crit Care 1996; 11(1): 9–18.

78. Cortés-Puch, Applefeld, Wang, et al. Individualized care is superior to standardized care for the majority of critically ill patients. Crit Care Med 2020; 48(12): 1845–1847.

79. Eastwood, Bellomo, Bailey, et al. Arterial oxygen tension and mortality in mechanically ventilated patients. Intensive Care Med 2012; 38: 91–98.

80. Beasley, Chien, Douglas, et al. Thoracic Society of Australia and New Zealand oxygen guidelines for acute oxygen use in adults: “Swimming between the flags.” Respirology 2015; 20(8): 1182–1191.

81. Suzuki, Eastwood, Glassford, et al. Conservative oxygen therapy in mechanically ventilated patients: a pilot before-and-after trial. Crit Care Med 2014; 42(6): 1414–1422.

82. Young, Mackle, Bailey, et al. Intensive care unit randomised trial comparing two approaches to oxygen therapy (ICU-ROX): results of the pilot phase. Crit Care Resusc 2017; 19: 344–354.

83. de Jonge, Peelen, Keijzers, et al. Association between administered oxygen, arterial partial oxygen pressure and mortality in mechanically ventilated intensive care unit patients. Crit Care 2008; 12(6): R156.

84. Helmerhorst, Arts, Schultz, et al. Metrics of arterial hyperoxia and associated outcomes in critical care. Crit Care Med 2017; 45(2): 187–195.

85. Roberts, Kilgannon, Hunter, et al. Association between early hyperoxia exposure after resuscitation from cardiac arrest and neurological disability. Circulation 2018; 137: 2114–2124.