Sage Journals: Discover world-class research

Abstract

Statistical issues concerning multiple response criteria, multiple treatment groups, and multiple subgroups in clinical trials require careful attention, both to avoid an inappropriately high prevalence of chance findings and to avoid unsatisfactorily low power to detect real treatment differences. An underlying goal is to use a 0.050 significance level as often as possible for separate assessments while maintaining a 0.050 level for all assessments taken together, so that statistical power is not compromised. For multiple response criteria, a useful assessment strategy is to evaluate a composite ranking as a single criterion first and then to evaluate its individual components. Multiple treatment comparisons can often be addressed effectively with closed testing procedures that use hierarchical evaluation. This hierarchy must be well specified, since significance at its first stage is required before testing is allowed at the next stage.
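As a rough illustration of the two ideas in this paragraph, the sketch below computes a per-patient composite score by summing within-endpoint ranks (in the spirit of a rank-sum composite) and applies a fixed-sequence gate, in which each hypothesis is tested at the full level only if every earlier hypothesis in the prespecified order was significant. The function names, the "higher is better" coding of endpoints, and the example numbers are assumptions made for illustration. The sketch stops at the composite scores and does not perform the between-group comparison itself.

```python
def midranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def composite_rank_scores(endpoints):
    """Sum each patient's ranks across endpoints.

    `endpoints` is a list of equal-length lists, one per response criterion,
    pooled over treatment groups (assumed coding: higher value = better).
    """
    per_endpoint = [midranks(col) for col in endpoints]
    return [sum(patient_ranks) for patient_ranks in zip(*per_endpoint)]


def fixed_sequence_test(p_values, alpha=0.05):
    """Test hypotheses in their prespecified order, each at level alpha.

    A hypothesis is rejected only if it and all earlier ones have p <= alpha;
    testing stops being allowed after the first non-significant stage.
    """
    rejected = []
    gate_open = True
    for p in p_values:
        gate_open = gate_open and (p <= alpha)
        rejected.append(gate_open)
    return rejected


if __name__ == "__main__":
    # Hypothetical data: two endpoints measured on four patients.
    print(composite_rank_scores([[10, 20, 20, 5], [1, 2, 3, 4]]))
    # Hypothetical ordered p-values: the third is never formally tested.
    print(fixed_sequence_test([0.01, 0.20, 0.001]))
```

The composite scores would then be compared between treatment groups with a single test, and the gate would govern whether the individual components or later comparisons may be evaluated at the same level.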

In most clinical trials, subgroups are of supportive interest after statistical significance for all patients is shown. A subgroup hierarchy, however, permits primary evaluation of subgroups in conjunction with all patients through significance level spending function methods like those used in interim analyses, for example, the O'Brien-Fleming method. The rationale is the analogy between a subgroup hierarchy and the patient hierarchy at successive interim analyses. With this method, the significance level for the evaluation of all patients typically ranges between 0.040 and 0.045, and that for subgroups ranges from 0.005 to 0.020.
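The spending-function analogy can be sketched numerically. The example below assumes the Lan-DeMets O'Brien-Fleming-type spending function, alpha*(t) = 2 - 2*Phi(z_{alpha/2} / sqrt(t)), and reads the information fraction t as the share of all patients that falls in a nested subgroup. Both the function name `obf_alpha_spent` and that reading are assumptions for illustration, not details taken from the article. The values are cumulative alpha spent, which is not the same as the nominal significance level at each stage, so the sketch is not meant to reproduce the ranges quoted above.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def obf_alpha_spent(t, alpha=0.05):
    """Cumulative two-sided alpha spent at information fraction t (0 < t <= 1).

    Uses the O'Brien-Fleming-type spending function
    alpha*(t) = 2 - 2 * Phi(z_{alpha/2} / sqrt(t)).
    """
    if not 0 < t <= 1:
        raise ValueError("information fraction t must satisfy 0 < t <= 1")
    z = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 * (1 - _STD_NORMAL.cdf(z / t ** 0.5))


if __name__ == "__main__":
    # Hypothetical hierarchy: subgroups holding 50% and 75% of all patients,
    # followed by the all-patients evaluation (t = 1).
    for fraction in (0.50, 0.75, 1.00):
        print(f"t = {fraction:.2f}  cumulative alpha spent = {obf_alpha_spent(fraction):.4f}")
```

Very little alpha is spent at small fractions, so most of the overall 0.050 level stays available for the all-patients evaluation, which is the qualitative pattern the paragraph describes.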

The methods outlined here for multiple response criteria, multiple treatment groups, and subgroups, or related counterparts, should be prespecified in the protocol for a clinical trial. If not in the protocol, they should be incorporated in the analysis plan prior to study unmasking.

Keywords

Multiple endpoints; Multiple treatments; Subgroups; Significance level spending function; Closed testing procedure



Statistical Considerations for Multiplicity in Confirmatory Protocols
