Abstract

In studies such as randomized controlled trials (RCTs), ethical requirements, inter alia, are that the study should be powered to detect the smallest clinically significant difference between treatments, and that the treatment groups should be in equipoise at the start of the trial. This article provides examples of circumstances where small sample, potentially underpowered studies may be justifiable, and where RCTs of treatments that are not in equipoise may be considered appropriate. The concepts presented may be extendable from RCTs to other study designs, as well.

Keywords

Ethics, randomized controlled trial, sample size, power, equipoise

All studies involving human participants need prior approval from an appropriately constituted scientific or research committee, and from an institutional review board or institute (or independent) ethics committee (EC); the specifics vary across countries. In India, procedures follow the guidelines¹ issued by the Indian Council of Medical Research (ICMR), downloadable from the ICMR website.

Denial of Approval

An almost invariable requirement during the research appraisal and approval processes is the justification of the proposed study sample size. The sample size should be neither larger nor smaller than what is necessary to answer the research question outlined as the primary outcome in the research proposal.²

Investigators rarely request ECs for a larger sample size than is necessary, unless it is to guard against sample attrition related to dropout. More commonly, due to limitations related to study budget and study completion time, investigators present “small sample” proposals that are likely to be underpowered for the primary outcome. Studies that are underpowered may be denied approval for two reasons: first, it is not ethical to put patients through the inconveniences and risks associated with study participation when the primary research question cannot be confidently answered; and second, inconclusive results from underpowered studies waste academic and institutional resources.²

Separately, in the context of randomized controlled trials (RCTs), ECs may deny approval if equipoise is violated. Equipoise exists when there is sufficient uncertainty about the relative merits of interventions to justify their comparison in an RCT.³ For example, if there is uncertainty about whether a new drug is superior to placebo, equipoise is present and an RCT comparing the drug with placebo can be considered. However, if there are grounds to believe that one treatment is better than another, equipoise is absent and it would not be ethical to randomize patients to these treatments.

The above notwithstanding, there may be situations that might justify approval of studies that are underpowered and studies that are not in equipoise. This article presents examples of such situations for the consideration of investigators and EC members, alike.

Small Sample Size

Studies should be powered to detect the smallest meaningful difference between groups. However, the proscription of underpowered studies is not set in stone. As an example, an underpowered study may be justified in the novel situation where we do not know what values to input for smallest meaningful difference, or other values necessary for sample size estimation.

When the primary outcome is a categorical variable, we may decide that, for example, when comparing two treatments the smallest meaningful difference in response rates is 15 percentage points. So, should we power our studies for response rates of 75% versus 60% or for 50% versus 35%? In these two situations, although the absolute advantage for one treatment over the other is the same (15 percentage points), the estimated sample size is different.
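
The dependence of sample size on the assumed response rates can be checked with a short calculation. The sketch below assumes a two-sided alpha of 0.05, 80% power, equal group sizes, and the simple normal-approximation formula for comparing two proportions; none of these settings is specified in the text, and the function name is illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-group sample size for comparing two response rates.

    Uses the unpooled normal approximation; alpha and power are assumed
    defaults, not values taken from the article.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)   # binomial variance of each arm
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)


# Same 15-point difference, different baseline response rates
for p1, p2 in [(0.75, 0.60), (0.50, 0.35)]:
    print(f"{p1:.0%} vs {p2:.0%}: {n_per_group_two_proportions(p1, p2)} per group")
```

Under these assumptions, the two scenarios require about 150 and 167 participants per group, respectively. Other formulas (for example, with pooled variance or a continuity correction) would give somewhat different numbers, but the point stands: the same absolute difference yields different sample sizes.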

When the primary outcome is a continuous variable, we can decide what the smallest meaningful difference is based on our familiarity with the measure. For example, in an antidepressant RCT, we may power our study to detect a mean difference of three units on the Hamilton Rating Scale for Depression (HAM-D). However, to estimate sample size, we also need to input standard deviations (SDs), and whereas there may be published data on what to expect as SDs in samples drawn from other populations, there is no assurance that those SDs would be appropriate for the population from which our sample will be drawn.
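
How strongly the assumed SD drives the estimate can be illustrated as follows. This is a minimal sketch, assuming a two-sided alpha of 0.05, 80% power, equal group sizes, a common SD in both groups, and a normal approximation; the SD values of 4, 6, and 8 are hypothetical and are not taken from any published HAM-D data.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_mean_difference(delta, sd, alpha=0.05, power=0.80):
    """Approximate per-group sample size to detect a mean difference `delta`.

    Assumes a common SD in both groups and a normal approximation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Three-unit HAM-D difference under three hypothetical SDs
for sd in (4, 6, 8):
    print(f"SD = {sd}: {n_per_group_mean_difference(delta=3, sd=sd)} per group")
```

With these inputs, the required sample rises from 28 to 63 to 112 per group as the SD doubles from 4 to 8, which is why an SD borrowed from a different population can leave a study badly mis-sized.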

In such a situation, we can, instead, choose an effect size (ES) to detect. For example, if we select an ES of 0.75, we will need a sample of approximately 30 subjects in each group. Using an ES to power the study is also helpful when the primary outcome measure is an unfamiliar or new research instrument and we cannot confidently state a value as the smallest meaningful difference.⁴
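
The figure of approximately 30 per group can be checked with the standard normal-approximation formula for comparing two means. The worked line below assumes a two-sided alpha of 0.05 and 80% power (so that the two quantiles are about 1.96 and 0.84); these settings are assumptions, as the text does not state them. Here d denotes the ES.

```latex
n_{\text{per group}} \approx \frac{2\,\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2}}{d^{2}}
  = \frac{2\,(1.96 + 0.84)^{2}}{0.75^{2}}
  = \frac{15.68}{0.5625}
  \approx 27.9
```

Rounding up gives 28 per group. An exact calculation based on the t distribution would add slightly to this, which is consistent with the approximate value of 30 stated above.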

Unfortunately, a limitation of powering a study for an ES rather than for an absolute difference is that, for a specified ES, we cannot know what the absolute difference between treatments might be. For example, we cannot know whether an ES of 0.75 is three units on the HAM-D or a smaller or larger value. So, our problem is not truly solved unless we take refuge in the logic explained in the next section.
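
The ambiguity arises because a standardized ES is the absolute difference divided by the SD, so the same ES maps to different absolute differences depending on the SD in the sample. A minimal sketch, in which the SD values are hypothetical and the function name is illustrative:

```python
def absolute_difference(effect_size, sd):
    """Convert a standardized mean difference into raw scale units."""
    return effect_size * sd


# An ES of 0.75 under three hypothetical HAM-D standard deviations
for sd in (3, 4, 6):
    print(f"SD = {sd}: ES 0.75 corresponds to {absolute_difference(0.75, sd)} units")
```

Only if the SD happens to be 4 does an ES of 0.75 correspond to the three-unit HAM-D difference used in the earlier example; with an SD of 3 or 6 it corresponds to 2.25 or 4.5 units.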

So, in situations such as those described above, we may justifiably plan to conduct a pilot study with a small-to-modest sample size. We accept the risk that the study may be underpowered. We justify the pilot study on the grounds that the results can guide sample size estimation in future research, and on the grounds that the results can contribute useful data to future meta-analyses. Thus, the pilot study becomes scientifically and ethically justifiable. Other situations justifying pilot studies are described in the ICMR ethics guidelines.¹

Assuming Absence of Equipoise

A small ES is one that is not obvious; so, a large sample is necessary to detect it. A large ES is one that is easily spotted; so, a small sample suffices to detect it. Investigators may, therefore, be tempted to power an RCT to detect a large ES so as to justify recruiting a small sample in resource-starved settings. However, ECs may demur because, if the ES is expected to be large, the treatments, prima facie, cannot be in equipoise. But investigators may sometimes have a defense. Consider the following example.
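
The inverse relationship between ES and sample size can be made concrete. The sketch below assumes a two-sided alpha of 0.05, 80% power, equal group sizes, and a normal approximation; the ES values of 0.2, 0.5, and 0.8 are Cohen's conventional benchmarks for small, medium, and large effects, and are used here only for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_for_effect_size(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group sample size to detect a standardized mean difference."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


for label, es in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    print(f"{label} (ES = {es}): {n_per_group_for_effect_size(es)} per group")
```

Under these assumptions, the required sample falls from 393 per group for a small ES to 63 for a medium ES and 25 for a large ES, which shows why powering for a large ES is attractive when resources are scarce.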

Ketamine is conventionally administered intravenously (iv) and in a standardized dose to treat depression. Oral ketamine is also a treatment option, but one that has not yet been standardized. Limitations of oral ketamine are poor bioavailability and a lower ketamine to norketamine ratio because of conversion of ketamine to norketamine during first-pass metabolism.⁵ When lower bioavailability is compensated for by higher dosing, the two routes of treatment should ideally be compared in a noninferiority design. However, noninferiority RCTs necessitate large samples, and the associated demands on funding and time are beyond the reach of most investigators. Instead, investigators can opt to conduct a “small” RCT that is powered to detect a medium to large ES of (for example) 0.75. This assumes that the treatments are not in equipoise. The necessary sample size is approximately 30 patients per group.⁴

The scientific and ethical justification for such a strategy is that it tells us whether or not the difference between groups is large. If such an RCT finds an advantage for iv ketamine, iv ketamine can be offered to patients if the statistically significant difference is also clinically significant. If the RCT finds no advantage for iv ketamine, it can be concluded that (a) iv ketamine may truly not be superior to oral ketamine or (b) if iv ketamine is indeed superior, the ES is likely to be <0.75; that is, less than medium to large. Oral ketamine can then be offered as a possibly slightly less effective but certainly simpler, more convenient, and less expensive option.⁶ Either way, the findings of the RCT can help patients make an informed choice.
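
The reasoning behind conclusion (b) can be illustrated by asking how likely a 30-per-group RCT is to reach statistical significance for different true ES values. This is a rough sketch, assuming a two-sided alpha of 0.05 and a normal approximation to the two-sample comparison; the function name and the ES values other than 0.75 are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-group comparison of means.

    Normal approximation; ignores the negligible chance of a significant
    result in the wrong direction.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)        # two-sided critical value
    noncentrality = effect_size * sqrt(n_per_group / 2)  # expected test statistic
    return NormalDist().cdf(noncentrality - z_alpha)


for es in (0.75, 0.5, 0.3):
    print(f"true ES = {es}: power = {approximate_power(es, n_per_group=30):.2f}")
```

Under these assumptions, the power is roughly 0.83 for a true ES of 0.75, 0.49 for 0.5, and 0.21 for 0.3. A nonsignificant result therefore argues against a medium-to-large advantage, but it cannot exclude a smaller one.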

When Equipoise Is Truly Absent

Oral and iv ketamine treatment arms may truly be in equipoise if the dose of oral ketamine is sufficiently high to compensate for lower bioavailability. So, in the example in the previous section, assuming absence of equipoise is merely a strategy to establish how large the ES may not be. But what if two treatments are truly not in equipoise; is it ethical to compare them in an RCT? Indeed yes, as the example below shows.

We know that electroconvulsive therapy (ECT) is more effective than parenteral ketamine.⁷ This suggests that ECT is likely to be more effective than oral ketamine, as well. However, ECT has not yet been compared with oral ketamine. So, is an RCT of ECT versus oral ketamine scientifically and ethically justifiable? Absolutely yes, and for two reasons. First, equipoise should be based on a composite of efficacy and tolerability, and not on efficacy alone. So, whereas ECT may be more effective than oral ketamine, it is also more likely than oral ketamine to cause clinically relevant, persistent cognitive adverse effects. Thus, if efficacy and tolerability are considered together, there is uncertainty about which treatment is better. Second, as in the previous example, the RCT of ECT versus oral ketamine can establish how large the advantage for ECT is or may not be. Thus, an RCT that compares ECT and oral ketamine can provide information that would allow patients to make an informed choice between favoring efficacy and favoring preservation of cognition.

Conclusions

Investigators and scientific/ethics committee members may wish to keep in mind that there could be circumstances that justify underpowered samples in research, and circumstances that justify performing RCTs that compare treatments that are not in equipoise. The concepts presented in this article may be extendable from RCTs to other study designs, as well.

Footnotes

Acknowledgements

I acknowledge useful comments on an early draft of this article, received from Vikas Menon, Professor, Department of Psychiatry, JIPMER, Puducherry, and Shahul Ameen, Consultant Psychiatrist, St. Thomas Hospital, Changanacherry, Kerala.

Declaration of Conflicting Interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Declaration Regarding the Use of Generative AI

None used.

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

References

1. Indian Council of Medical Research. National ethical guidelines for biomedical and health research involving human participants. Indian Council of Medical Research, New Delhi, 2017.

2. Andrade. Sample size and its importance in research. Indian J Psychol Med, 2020; 42(1): 102–103.

3. London. Equipoise in research: Integrating ethics and science in human research. JAMA, 2017; 317(5): 525–526.

4. Norman, Monteiro and Salama. Sample size calculations: Should the emperor’s clothes be off the peg or made to measure? BMJ, 2012; 345: e5278. Erratum in: BMJ, 2014; 349: g5341.

5. Andrade. Oral ketamine for depression, 1: Pharmacologic considerations and clinical evidence. J Clin Psychiatry, 2019; 80(2): 19f12820.

6. Andrade. Ketamine for depression-knowns, unknowns, possibilities, barriers, and opportunities. JAMA Psychiatry, 2023; 80(12): 1189–1190.

7. Menon, Varadharajan, Faheem, et al. Ketamine vs electroconvulsive therapy for major depressive episode: A systematic review and meta-analysis. JAMA Psychiatry, 2023; 80(6): 639–642. Erratum in: JAMA Psychiatry, 2023; 80(6): 651.

A Reconsideration of Small Sample Studies and Absence of Equipoise as Ethical Issues in Research
