Statistical and practical considerations in planning and conduct of dose-optimization trials

Abstract

The U.S. Food and Drug Administration launched Project Optimus with the aim of shifting the paradigm of dose-finding and selection toward identifying the optimal biological dose that offers the best balance between benefit and risk, rather than the maximum tolerated dose. However, achieving dose optimization is a challenging task that involves a variety of factors and is considerably more complicated than identifying the maximum tolerated dose, both in terms of design and implementation. This article provides a comprehensive review of various design strategies for dose-optimization trials, including phase 1/2 and 2/3 designs, and highlights their respective advantages and disadvantages. In addition, practical considerations for selecting an appropriate design and planning and executing the trial are discussed. The article also presents freely available software tools that can be utilized for designing and implementing dose-optimization trials. The approaches and their implementation are illustrated through real-world examples.

Keywords

Optimal dose, benefit–risk trade-off, Project Optimus, adaptive design

Background

The conventional phase 1 dose-finding paradigm was developed during the era of cytotoxic therapies with the primary goal of identifying the maximum tolerated dose (MTD) based on dose-limiting toxicity (DLT). The underlying assumption of this more-is-better approach is that both efficacy and toxicity increase monotonically with the dose. However, concerns have been raised regarding the appropriateness of using this approach in the age of targeted therapies and immunotherapies.^1–4 Many of these innovative therapies exhibit a shallow dose–response, meaning that the MTD may not be reached within a clinically effective dose range. Furthermore, efficacy may not increase monotonically with the dose and often reaches a plateau after a certain level is reached.^5,6 Consequently, the MTD may provide minimal improvements in efficacy over a lower dose, while causing more adverse events (AEs). For these reasons, the focus of dose-finding and selection should be shifted from finding the MTD to the identification of the optimal biological dose (OBD).

In 2021, the U.S. Food and Drug Administration (FDA) Oncology Center of Excellence launched Project Optimus with the goal of reforming the dose optimization and selection paradigm in oncology drug development.⁷ To facilitate this shift, the FDA also released draft guidance titled “Optimizing the Dosage of Human Prescription Drugs and Biological Products for the Treatment of Oncologic Diseases.”⁸ Zirkelbach et al.⁶ and Shah et al.³ have offered valuable insights into the rationale, significance, and principles of dose optimization, along with a few drug approval examples from a regulatory agency perspective. Poor dose optimization has several negative consequences, such as failure to bring a drug to market, frequent dose modifications at the approved dose, and postmarketing requirements to further evaluate the dose. Shah et al.³ provide examples of approved drugs whose dose was modified for safety or tolerability after approval.

Dose-optimization trials are more complex than conventional MTD-finding trials.⁹ The latter mainly focus on DLT and are guided by a simple decision rule: if the observed data suggest that the DLT probability of the current dose is unacceptably higher (or lower) than the target DLT rate, we de-escalate (or escalate) the dose. In contrast, dose-optimization trials are inherently multidimensional. By definition, they require the characterization and assessment of the benefit–risk profile of the doses, which involves various data, including toxicity and efficacy data, as well as pharmacokinetic, pharmacodynamic, and biomarker data. Decisions on dose transition and selection must be based on the benefit–risk assessment of the doses. This increased dimensionality complicates the trial design, the decision rules, and their implementation, and often requires larger sample sizes. As a result, it likely increases costs and prolongs the timeline for early phase drug development. Therefore, it is critically important to have efficient and novel statistical design strategies to address these challenges and meet the increasing regulatory requirements for oncology dose justification.

The aim of this article is to provide statistical and practical considerations related to the planning and execution of dose-optimization trials. To facilitate understanding, we classify dose-optimization trials conducted premarket into two types: phase 1/2 dose optimization, where dose optimization is performed in phase 1 and/or 2, and phase 2/3 dose optimization, where dose optimization is performed in phase 2 and/or 3. In what follows, we discuss the methods, challenges, and practical considerations for designing and implementing both types of trials. We also provide real-world trial examples to illustrate key considerations. Finally, we conclude with a brief discussion.

Phase 1/2 dose optimization

The topic of dose optimization has recently been propelled into the spotlight with the advent of Project Optimus, but the concept of finding the optimal dose based on a consideration of both risk and benefit is not a new one.^10–12 Since Thall and Cook’s seminal work on the EffTox design,¹³ a plethora of designs have been proposed for optimizing doses in phase 1/2 settings^14–25 (see the book of Yuan et al.⁹ for a comprehensive review). To aid understanding of this vast and ever-increasing number of designs and provide a roadmap for future development, we categorize them as efficacy-integrated designs and two-stage designs (Figure 1).

Figure 1.

(a) Efficacy-integrated designs, which achieve dose optimization by continuously updating the estimate of the benefit–risk trade-off of each dose, based on the most recent data, to guide dose escalation, de-escalation, and selection. (b) Two-stage designs, where stage 1 dose escalation is performed to establish the MTD, and stage 2 conducts dose optimization often by randomization in multiple doses identified in stage 1. MTD: maximum tolerated dose.

Efficacy-integrated phase 1/2 designs, also known as fully sequential phase 1/2 designs, directly target the OBD by performing dose escalation or de-escalation based on the benefit–risk trade-off. The decision to escalate or de-escalate the dose is typically made by continuously updating the dose–toxicity and dose–efficacy model estimates based on interim data, in a similar fashion as the continual reassessment method.²⁶ In contrast, two-stage phase 1/2 designs first identify the MTD through conventional DLT-based dose escalation and then randomize patients among multiple doses to identify the OBD based on risk–benefit assessment. Examples of efficacy-integrated phase 1/2 designs include EffTox¹³ and Bayesian optimal interval phase 1/2 (BOIN12) design,²⁵ while examples of two-stage phase 1/2 designs include the design by Hoering et al.,²⁷ utility-based Bayesian optimal interval (U-BOIN) design²⁸ and dose-ranging approach to optimizing dose (DROID).²⁹ It is important to note that the purpose of differentiating the two classes is to aid understanding and not to provide a definitive definition. A design may incorporate features of both approaches. For instance, a design may use a benefit–risk trade-off to perform dose escalation while also including randomization.^30,31

In the following sections, we first describe some essential elements of phase 1/2 designs, including efficacy and toxicity endpoints and benefit–risk trade-off, and then provide a more detailed description of efficacy-integrated and two-stage design strategies.

Efficacy and toxicity endpoints

Dose optimization involves assessing the risks and benefits of a new drug, which requires specifying efficacy and toxicity endpoints to characterize its potential outcomes. An example of a toxicity endpoint is DLT, which is often defined as a grade 3 or higher AE that occurs within the first treatment cycle according to the Common Terminology Criteria for Adverse Events. However, DLTs may not be sufficient to fully characterize the safety and tolerability of many novel targeted drugs. These drugs often cause low-grade toxicity but rarely result in DLTs. In such cases, more comprehensive toxicity endpoints that account for low-grade or cumulative toxicity, such as ordinal toxicity endpoint,³² total toxicity burden,^33–38 or equivalent toxicity score,^37,38 may be preferred. To quantify the safety profile of the drug when cumulative toxicity is expected, the dose tolerability rate over multiple cycles can also be used.³⁹ The resulting toxicity endpoint can be binary, categorical, semicontinuous, or continuous.³⁸ Each type of toxicity endpoint has its advantages and limitations and should be chosen carefully with both clinical and statistical inputs. Alternatively, multiple constraints can be used to control different grades of toxicity at prespecified levels.^40,41

Efficacy endpoints should be chosen based on clinical and logistic considerations. For early phase trials, objective response rate along with the duration of response is commonly used. However, the response often takes several cycles to be ascertained, which causes major logistic difficulties in making real-time dose assignment decisions. In this case, intermediate short-term endpoints, such as pharmacodynamic endpoints or target receptor occupancy, may be used instead. Alternatively, new designs are available to handle delayed responses.^19,42,43 A potential issue with using short-term endpoints is that they may not be a reliable surrogate of clinical endpoints. To address this, one approach is to use short-term endpoints to guide real-time dose assignment decisions during the trial and then use the long-term clinical endpoint at the end of the trial to identify the OBD.²⁹ In some situations, more than one efficacy endpoint can be used to capture the multifaceted effects of the drug and improve design efficacy. It is more convincing if different pieces of evidence, such as pharmacokinetics, pharmacodynamics, and efficacy, point to the same dose. For instance, Agrawal et al.⁴⁴ considered receptor occupancy, pharmacokinetic parameters, and tumor shrinkage jointly for nivolumab dose selection. In some trials, patient-reported outcomes, such as quality of life, are important for determining the OBD.

Benefit–risk trade-off

The goal of dose optimization is to find the dose that achieves the optimal balance between benefit and risk. However, in current practice, the benefit–risk trade-off is rarely explicitly defined or used to guide dose optimization and selection. We believe that explicitly defining the benefit–risk trade-off, tailored to the trial’s objectives and characteristics, is advantageous and should be more widely adopted. By doing so, investigators can evaluate and fine-tune the design’s operating characteristics, resulting in more efficient dose optimization.

A straightforward method to define the benefit–risk trade-off is by considering the trade-off between the probability of efficacy and toxicity of a dose.²¹ For example, benefit–risk trade-off $= π_{E} - w π_{T}$ , where $π_{E}$ and $π_{T}$ are the probability of efficacy and toxicity of a dose, respectively; and $w > 0$ represents the penalty for an increase in the toxicity rate. If two doses have similar $π_{E}$ but different $π_{T}$ , the dose with the lower $π_{T}$ will have a higher benefit–risk trade-off and will be preferred. Nonetheless, the benefit–risk trade-off can also take more complex forms, such as nonlinear functions of $π_{E}$ and $π_{T}$ , as illustrated in the EffTox design.¹³

A more versatile and broadly applicable method of defining the benefit–risk trade-off is through utility,^{22,25,30,45,46} where clinicians are asked to provide utility scores for each potential patient-level outcome, thus quantifying the desirability of doses. For instance, with binary toxicity and efficacy endpoints, four potential outcomes exist for each patient: (efficacy, toxicity) = (yes, no), (yes, yes), (no, no), and (no, yes). We assign a score of 100 to the most desirable outcome (yes, no) and a score of 0 to the least desirable outcome (no, yes). For the remaining two outcomes, we elicit the scores from clinicians; for example, scores of 60 and 40 may be assigned to (yes, yes) and (no, no), respectively (see Table 1). The benefit–risk trade-off of a dose is then defined as the average of the scores, weighted by the probability of each potential outcome. That is, $\text{benefit–risk trade-off} = \sum_{k = 1}^{K} π_{k} u_{k}$, where $u_{k}$ is the utility score assigned to the $k$th potential outcome, and $π_{k}$ is the probability of observing the $k$th potential outcome, for $k = 1, \dots, K$. For instance, consider a dose with the probabilities of observing (no toxicity, efficacy), (no toxicity, no efficacy), (toxicity, efficacy), and (toxicity, no efficacy) being 0.5, 0.15, 0.25, and 0.1, respectively. The desirability of this dose is calculated as $0.5 \times 100 + 0.15 \times 40 + 0.25 \times 60 + 0.1 \times 0 = 71$. As a result, a dose with a higher probability of producing favorable outcomes will have a higher desirability.
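The calculation above can be reproduced in a few lines of Python. This is a minimal sketch rather than part of any published software: the dictionary layout and the function name `mean_utility` are illustrative assumptions, while the utility scores and outcome probabilities are those of the worked example.

```python
# Utility scores from the example, keyed by (efficacy, toxicity) outcome.
UTILITY = {
    ("yes", "no"): 100,   # efficacy without toxicity (most desirable)
    ("yes", "yes"): 60,   # score elicited from clinicians in the example
    ("no", "no"): 40,     # score elicited from clinicians in the example
    ("no", "yes"): 0,     # toxicity without efficacy (least desirable)
}


def mean_utility(outcome_probs, utility=UTILITY):
    """Return sum_k pi_k * u_k for one dose (hypothetical helper)."""
    total = sum(outcome_probs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * utility[outcome] for outcome, p in outcome_probs.items())


# Outcome probabilities of the dose considered in the text.
dose_probs = {
    ("yes", "no"): 0.50,
    ("no", "no"): 0.15,
    ("yes", "yes"): 0.25,
    ("no", "yes"): 0.10,
}
desirability = mean_utility(dose_probs)
print(round(desirability, 6))  # 71.0
```

In practice the outcome probabilities are unknown and are replaced by estimates from interim data, so this sketch only illustrates how the scores and probabilities combine.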

Table 1.

An example of utility for binary toxicity and efficacy endpoints.

	Efficacy (yes)	Efficacy (no)
Toxicity (yes)	60	0
Toxicity (no)	100	40

The utility-based benefit–risk trade-off is often a better option than the probability-based benefit–risk trade-off discussed earlier since clinicians typically have a better grasp of the relative desirability of patient outcomes than probabilities. Furthermore, research has shown that the utility approach encompasses the efficacy-and-toxicity-probability-based benefit–risk trade-off as a special case.^25,28,47
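One way to see the special-case relationship can be sketched under the assumption that the two elicited scores sum to 100, as they do in Table 1. Writing $u_{2}$ and $u_{3}$ (our own labels) for the scores of (efficacy, toxicity) = (yes, yes) and (no, no), the mean utility then reduces to $u_{3} + u_{2} π_{E} - u_{3} π_{T}$, so ranking doses by utility is the same as ranking them by $π_{E} - w π_{T}$ with $w = u_{3} / u_{2}$. The Python check below verifies this identity numerically; the function names and the dose probabilities other than the first are hypothetical.

```python
def mean_utility_from_joint(p_e_only, p_both, p_neither, p_t_only,
                            u_both=60, u_neither=40):
    """Mean utility with scores 100 / u_both / u_neither / 0."""
    return 100 * p_e_only + u_both * p_both + u_neither * p_neither + 0 * p_t_only


def linear_tradeoff(pi_e, pi_t, u_both=60, u_neither=40):
    """u_neither + u_both * pi_E - u_neither * pi_T.

    Equals the mean utility only when u_both + u_neither == 100 (assumption).
    """
    return u_neither + u_both * pi_e - u_neither * pi_t


# Joint outcome probabilities for three doses, in the order
# (efficacy only, both, neither, toxicity only). The first row is the
# worked example from the text; the other two are hypothetical.
doses = [
    (0.50, 0.25, 0.15, 0.10),
    (0.30, 0.05, 0.60, 0.05),
    (0.20, 0.40, 0.10, 0.30),
]
for p_e_only, p_both, p_neither, p_t_only in doses:
    pi_e = p_e_only + p_both   # marginal efficacy probability
    pi_t = p_both + p_t_only   # marginal toxicity probability
    lhs = mean_utility_from_joint(p_e_only, p_both, p_neither, p_t_only)
    rhs = linear_tradeoff(pi_e, pi_t)
    assert abs(lhs - rhs) < 1e-9
    print(round(lhs, 6), round(rhs, 6))
```

When the two elicited scores do not sum to 100, the mean utility also depends on the joint probability of efficacy and toxicity, which is one reason the utility formulation is the more general of the two.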

The use of a benefit–risk trade-off to guide dose optimization may raise some concerns. The first concern is that the benefit–risk trade-off is subjective. However, we consider the need for elicitation from trial clinicians as a strength. This process encourages clinicians to carefully consider the benefit–risk trade-off and design the study accordingly, reducing subjectivity and variability. Leaving the decision process unspecified actually leads to more subjectivity and variability. In addition, specifying the benefit–risk trade-off enables the evaluation of the design’s operating characteristics through simulation, enhancing understanding and providing opportunities to improve the design before the trial begins. Finally, research indicates that many designs are quite robust to the specification of benefit–risk trade-off. To reduce the subjectivity of benefit–risk trade-off elicitation, open communication and active efforts among stakeholders to reach a consensus are recommended. Seeking regulatory input early on during the design stage is also essential.

The second common concern regarding the use of benefit–risk trade-off for dose optimization is that the benefit and risk of a treatment are multidimensional, making it challenging to capture all aspects with a single benefit–risk trade-off. However, it is important to note that the benefit–risk trade-off is primarily defined to facilitate and enhance the efficiency of the adaptive decision-making process for dose assignment and evaluation of the trial design’s operating characteristics. Ultimately, the final decision for dose selection, particularly at the end of the trial, should be based on both the design recommendation, which is determined by the prespecified benefit–risk trade-off, and the totality of the evidence.

Efficacy-integrated designs

The efficacy-integrated design is characterized by its use of toxicity and efficacy, which are combined simultaneously through the use of benefit–risk trade-off, to guide dose escalation and de-escalation, and ultimately determine the OBD (Figure 1(a)). This approach assumes a statistical model to update the dose–toxicity and dose–efficacy relationship based on interim data. This information is then utilized to make adaptive dose assignment decisions that prioritize doses with high benefit–risk trade-off for treating the next cohort of patients. The efficacy-integrated design operates similarly to the continual reassessment method, but uses the estimate of benefit–risk trade-off rather than DLT to make dose transition decisions. For ease of explanation, the terms benefit–risk trade-off and desirability are used interchangeably. Depending on the method and level of complexity of implementation, efficacy-integrated designs can be further differentiated into model-based designs and model-assisted designs.

Model-based designs

The model-based design assumes a statistical model to depict the dose–toxicity and dose–efficacy curves, which are used to guide dose transition. Examples of model-based designs include EffTox¹³ and late-onset EffTox designs.¹⁹ EffTox assumes the following logistic marginal dose–toxicity and dose–efficacy models: $\text{logit}(π_{T} | x) = γ_{0} + γ_{1} x$ and $\text{logit}(π_{E} | x) = β_{0} + β_{1} x + β_{2} x^{2}$, respectively, where $x$ is a standardized dosage. Given the marginal models, the Gumbel–Morgenstern copula is further used to model the joint distribution of $(\text{toxicity}, \text{efficacy}) = (y_{T}, y_{E})$ as follows: $f (y_{T}, y_{E} | x) = π_{E}^{y_{E}} (1 - π_{E})^{1 - y_{E}} π_{T}^{y_{T}} (1 - π_{T})^{1 - y_{T}} + (- 1)^{y_{T} + y_{E}} π_{E} (1 - π_{E}) π_{T} (1 - π_{T}) \frac{e^{ψ} - 1}{e^{ψ} + 1}$, where $ψ$ is a parameter representing the correlation between $y_{T}$ and $y_{E}$. During the trial, the dose–toxicity and dose–efficacy models are fitted based on the interim data, and the next cohort of patients is assigned to the dose with the highest estimate of benefit–risk trade-off. Despite their desirable statistical properties, the use of model-based designs has been hindered by the requirement for complicated model estimation after each cohort and the influence of model misspecification.
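To make the joint model concrete, the following Python sketch evaluates the two marginal logistic curves and the Gumbel–Morgenstern joint probability for the four possible outcomes at a single dose. It is only an illustration of the formulas above: the parameter values are hypothetical, the function names are our own, and no model fitting (the computationally demanding part of a model-based design) is attempted.

```python
import math
from itertools import product


def inv_logit(z):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-z))


def marginal_probs(x, gamma, beta):
    """Marginal toxicity and efficacy probabilities at standardized dose x."""
    pi_t = inv_logit(gamma[0] + gamma[1] * x)
    pi_e = inv_logit(beta[0] + beta[1] * x + beta[2] * x ** 2)
    return pi_t, pi_e


def joint_pmf(y_t, y_e, pi_t, pi_e, psi):
    """Gumbel-Morgenstern joint probability of (y_T, y_E), each 0 or 1."""
    independent = (pi_e ** y_e * (1 - pi_e) ** (1 - y_e)
                   * pi_t ** y_t * (1 - pi_t) ** (1 - y_t))
    association = ((-1) ** (y_t + y_e)
                   * pi_e * (1 - pi_e) * pi_t * (1 - pi_t)
                   * (math.exp(psi) - 1) / (math.exp(psi) + 1))
    return independent + association


# Hypothetical parameter values, chosen only for illustration.
gamma = (-1.0, 1.2)        # (gamma_0, gamma_1)
beta = (-0.5, 1.0, -0.3)   # (beta_0, beta_1, beta_2)
psi = 0.8                  # association parameter
pi_t, pi_e = marginal_probs(0.5, gamma, beta)

pmf = {(y_t, y_e): joint_pmf(y_t, y_e, pi_t, pi_e, psi)
       for y_t, y_e in product((0, 1), repeat=2)}
for outcome, prob in pmf.items():
    print(outcome, round(prob, 4))
```

The four probabilities sum to one and reproduce the marginal probabilities $π_{T}$ and $π_{E}$, because the association term cancels when summed over either outcome; $ψ = 0$ corresponds to independent toxicity and efficacy.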

Model-assisted designs

Model-assisted designs have been developed to address the limitations of model-based designs.^48,49 Unlike model-based approaches, model-assisted approaches use simple models, such as binomial or multinomial models, at each dose, without assuming any specific shape for the dose–toxicity or dose–efficacy curves. As a result, the decision rule can be derived and tabulated before the trial onset. During the trial, users can simply consult the decision table to make dose assignment decisions. The book by Yuan et al.⁴⁹ provides a comprehensive review of model-assisted designs. Due to their simplicity and desirable operating characteristics, model-assisted designs have gained popularity in recent years. For instance, the Bayesian optimal interval (BOIN) design,⁵⁰ a model-assisted design, received the fit-for-purpose designation from the FDA as a tool for dose-finding in oncology.⁵¹

A number of model-assisted designs have been proposed for dose optimization.^{24,25,52–54} We here use BOIN12 to illustrate this approach. BOIN12 uses the utility to measure the toxicity–efficacy trade-off and models it using a simple beta-binomial model based on pseudolikelihood methodology. Based on interim toxicity and efficacy data, the BOIN12 design adaptively assigns patients to the dose with the highest estimated desirability. The dose-finding rule of BOIN12 is depicted in Figure 2. A key feature of this design is that dose desirability can be pretabulated and included in the trial protocol prior to the trial’s start (see Table 2), making implementation simple. During the trial, determining the desirability of a dose is straightforward: count the number of patients treated at a given dose, the number who experienced toxicity, and the number who experienced efficacy. This information is used to look up the dose’s rank-based desirability score in Table 2. The next cohort of patients is then assigned to the dose with the highest rank-based desirability score value. For instance, consider a scenario in which a trial has treated three, six, and three patients at the first three doses, respectively, and has observed toxicity and efficacy outcomes of (0, 1, 2) and (0, 3, 1), respectively. The current dose being administered is d = 2. In accordance with the dose-finding rule, we compare the observed toxicity rate ${\hat{π}}_{T} = 0.167$ to the escalation boundary $λ_{e} = 0.276$ . Since ${\hat{π}}_{T}$ is less than the escalation boundary $λ_{e}$ , we consult Table 2, which provides the rank-based desirability score of each dose as 13, 23, and 11, respectively. As dose level 2 has the highest rank-based desirability score, the decision is to continue administering the current dose to the next cohort of patients.
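The table lookup in this example can be sketched in Python as follows. The snippet is a simplified illustration, not the BOIN12 software: it hard-codes a handful of rows from Table 2, uses our own variable names, and reproduces only the branch of the rule exercised in the example (observed toxicity rate below the escalation boundary, followed by a comparison of the three doses). The remaining branches of the rule in Figure 2 are deliberately not implemented.

```python
# A few rows of Table 2:
# (no. patients, no. toxicities, no. efficacy responses) -> RDS.
RDS_TABLE = {
    (3, 0, 0): 13,
    (3, 0, 1): 22,
    (3, 2, 1): 11,
    (6, 0, 3): 27,
    (6, 1, 3): 23,
    (6, 2, 3): 18,
}

LAMBDA_E = 0.276  # escalation boundary quoted in the example

# Interim data from the example, for dose levels 1 to 3.
n_pts = [3, 6, 3]
n_tox = [0, 1, 2]
n_eff = [0, 3, 1]
current = 2  # current dose level (1-based)

tox_rate = n_tox[current - 1] / n_pts[current - 1]
scores = [RDS_TABLE[key] for key in zip(n_pts, n_tox, n_eff)]

next_dose = None  # other branches of the Figure 2 rule are not reproduced here
if tox_rate < LAMBDA_E:
    # Pick the dose level with the highest rank-based desirability score.
    next_dose = max(range(1, len(scores) + 1), key=lambda d: scores[d - 1])

print(round(tox_rate, 3), scores, next_dose)  # 0.167 [13, 23, 11] 2
```

Because the scores depend only on three counts per dose, the whole decision table can be fixed in the protocol before enrollment starts, which is the practical appeal of the model-assisted approach.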

Figure 2.

The schema of the BOIN12 design, where $(λ_{e}, λ_{d})$ are a pair of optimized dose escalation and de-escalation boundaries, and N* is a prespecified sample size cutoff (e.g. $N^{*} = 6$ ). The desirability score table is provided in Table 2.

Table 2.

Rank-based desirability score table for the BOIN12 design with the utility score 100 = (no toxicity, efficacy), 60 = (toxicity, efficacy), 40 = (no toxicity, no efficacy), and 0 = (toxicity, no efficacy).

No. Pts	No. Tox	No. Eff	RDS	No. Pts	No. Tox	No. Eff	RDS
0	0	0	24	6	1	1	10
3	0	0	13	6	1	2	16
3	0	1	22	6	1	3	23
3	0	2	31	6	1	4	30
3	0	3	38	6	1	5	36
3	1	0	9	6	1	6	40
3	1	1	17	6	2	0	2
3	1	2	25	6	2	1	6
3	1	3	33	6	2	2	12
3	2	0	4	6	2	3	18
3	2	1	11	6	2	4	26
3	2	2	19	6	2	5	32
3	2	3	29	6	2	6	37
3	≥3	Any	E	6	3	0	1
6	0	0	7	6	3	1	3
6	0	1	14	6	3	2	7
6	0	2	20	6	3	3	14
6	0	3	27	6	3	4	20
6	0	4	34	6	3	5	27
6	0	5	39	6	3	6	34
6	0	6	41	6	≥4	Any	E
6	1	0	5

Pts: Patients; Tox: toxicity; Eff: efficacy; RDS: rank-based desirability score.

Note: “E” means that the dose should be eliminated, as it does not satisfy the safety and efficacy admissible criteria (i.e. not admissible due to high toxicity or low efficacy) with the upper toxicity limit of 0.35 and the lower efficacy limit of 0.25 and the probability cutoff of 0.9 for the admissibility.

It is worth noting that Table 2 assumes a cohort size of 3. To account for the possibility that the number of evaluable patients may not be a multiple of 3, a more comprehensive decision table can be generated using the software described later, which includes every possible number of patients up to the maximum number of patients that can be treated at a dose. The BOIN12 has been shown to have desirable operating characteristics through an extensive simulation study, often outperforming more complex model-based phase 1/2 designs, such as the EffTox design.²⁵

Two-stage designs

The two-stage design approach takes a staged approach to dose optimization.^27–29,55 As illustrated in Figure 1(b), in stage 1, dose escalation is performed to establish the MTD. At the end of stage 1, typically, the MTD and one or two lower doses that demonstrate appropriate antitumor activities and pharmacokinetic/pharmacodynamic characteristics are selected and proceed to stage 2 for dose optimization often via randomization. Stage 1 often uses conventional MTD-targeted dose-escalation methods, such as model-based continual reassessment method or model-assisted Bayesian optimal interval design. Thus, the key question for this approach is how to design stage 2, especially in terms of sample size determination.

Yang et al.⁵⁶ proposed the MERIT (Multiple-dosE RandomIzed Trial design for dose optimization based on toxicity and efficacy) design to provide a systematic approach to determining the sample size for stage 2 randomization. As in practice the final selection of the OBD involves both statistical and nonstatistical considerations, it is of limited value to control the statistical properties of the design solely in terms of the OBD selection. Thus, MERIT focuses on controlling the statistical properties, such as type 1 error and power, for identifying doses that are admissible to be the OBD. The OBD can only be selected from the admissible doses. To define the admissible doses, let $ϕ_{T, 0}$ denote the null toxicity rate that is considered high and unacceptable, and $ϕ_{T, 1}$ denote the alternative toxicity rate that is deemed acceptable. Similarly, let $ϕ_{E, 0}$ and $ϕ_{E, 1}$ denote the null and alternative efficacy rates that are deemed unacceptable and acceptable, respectively. A dose is considered OBD admissible if its toxicity rate $π_{T} \leq ϕ_{T, 1}$ and its efficacy rate $π_{E} \geq ϕ_{E, 1}$ .

MERIT considers a null hypothesis $(H_{0})$ : none of the doses is OBD admissible, versus an alternative hypothesis $(H_{1})$ : at least one dose is OBD admissible. The design derives the minimal sample size, along with decision boundaries, that satisfies a prespecified requirement on generalized type 1 error and power. The generalized type 1 error and power is a modification of standard type 1 error and power to accommodate the unique features of multiple-arm dose optimization (see Yang et al.⁵⁶ for details).

Table 3 shows the optimal sample size for a randomized trial with two doses, as well as the decision rule to determine whether a dose is OBD admissible. For example, suppose $(ϕ_{T, 0}, ϕ_{T, 1}) = (0.4, 0.2)$ and $(ϕ_{E, 0}, ϕ_{E, 1}) = (0.1, 0.3)$. To achieve a (generalized) power of 70% while maintaining a (generalized) type 1 error of 0.2, we need to randomize $n = 24$ patients per dose. After the completion of the trial, a dose is considered OBD admissible (i.e. it can be chosen as the OBD based on the totality of evidence) if the number of efficacy responses is $\geq m_{E} = 5$ and the number of toxicity events is $\leq m_{T} = 7$.
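The end-of-trial decision rule in this example is simple enough to express directly in Python. The sketch below also computes, for a single dose, the chance of passing both cutoffs under the alternative and null rates. That calculation assumes independent binomial toxicity and efficacy counts, which is our simplifying assumption; it is not the generalized power or generalized type 1 error of Yang et al.,⁵⁶ which are defined across multiple doses. All function names are illustrative.

```python
from math import comb


def is_obd_admissible(n_eff, n_tox, m_e, m_t):
    """Decision rule: enough efficacy responses and few enough toxicities."""
    return n_eff >= m_e and n_tox <= m_t


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def prob_declared_admissible(n, m_e, m_t, pi_e, pi_t):
    """Per-dose chance of passing both cutoffs, ASSUMING independent endpoints."""
    p_eff_ok = 1.0 - binom_cdf(m_e - 1, n, pi_e)  # P(no. efficacy >= m_e)
    p_tox_ok = binom_cdf(m_t, n, pi_t)            # P(no. toxicity <= m_t)
    return p_eff_ok * p_tox_ok


n, m_e, m_t = 24, 5, 7  # design parameters from the example

print(is_obd_admissible(n_eff=6, n_tox=5, m_e=m_e, m_t=m_t))  # True
print(is_obd_admissible(n_eff=4, n_tox=5, m_e=m_e, m_t=m_t))  # False

# Acceptable rates (0.3 efficacy, 0.2 toxicity) versus
# unacceptable rates (0.1 efficacy, 0.4 toxicity).
p_alt = prob_declared_admissible(n, m_e, m_t, pi_e=0.3, pi_t=0.2)
p_null = prob_declared_admissible(n, m_e, m_t, pi_e=0.1, pi_t=0.4)
print(p_alt > p_null)  # True
```

For an actual trial, the operating characteristics should be taken from the published design tables or software rather than from this single-dose approximation.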

Table 3.

Optimal design parameters for two doses when $(ϕ_{T, 1}, ϕ_{T, 0}) = (0.2, 0.4)$ .

			$α = 0.1$			$α = 0.2$
$ϕ_{E, 0}$	$ϕ_{E, 1}$	$β$	$n$	$m_{T}$	$m_{E}$	$n$	$m_{T}$	$m_{E}$
0.1	0.3	0.6	25	6	5	18	5	4
		0.7	33	8	6	24	7	5
		0.8	39	11	8	30	8	5
0.2	0.4	0.6	26	7	9	18	5	6
		0.7	34	9	11	25	7	8
		0.8	45	12	14	35	10	10
0.3	0.5	0.6	28	7	12	19	5	8
		0.7	37	10	16	28	8	12
		0.8	44	12	18	34	10	14
0.4	0.6	0.6	28	7	15	19	5	10
		0.7	38	10	20	25	7	13
		0.8	46	13	24	32	9	16

Note: $α$ and $β$ are the prespecified generalized type 1 error and generalized power, respectively. $n$, $m_{T}$, and $m_{E}$ are the optimal design parameters: the sample size per dose and the critical values for toxicity and efficacy responses, respectively. A dose is considered OBD admissible (i.e. it can be selected as the OBD based on the totality of evidence) if the number of efficacy responses is $\geq m_{E}$ and the number of toxicity events is $\leq m_{T}$. $β$ is the generalized power 2 defined in the work of Yang et al.⁵⁶

Regarding the method of randomization, equal randomization is the most commonly used approach due to its ease of implementation and unbiased comparison between doses. Although response-adaptive randomization may seem attractive as it allows for more patients to be allocated to a more desirable dose, it often provides little benefit for multiple-dose randomization trials with small sample sizes. In fact, accrual may be nearly complete before the data start to skew the randomization toward better doses. In addition, response-adaptive randomization is more logistically challenging and increases the likelihood of unbalanced patient characteristic distribution across arms, which can lead to biased estimates.⁵⁷ Equal randomization combined with safety and futility monitoring, such as using Bayesian optimal phase 2 design,⁵⁸ is often effective and allows for early stopping of overly toxic and futile doses during the trial.

The approach of randomizing patients among multiple doses for optimization is commonly used in nononcology drug development and is referred to as dose-ranging. However, well-established dose-ranging methods in nononcology therapeutic areas, such as the multiple comparison procedure—modeling method,⁵⁹ are rarely used in oncology due to the unique characteristics and challenges of cancer drug development.²⁹ To address this issue, Guo and Yuan²⁹ developed an oncology-specific dose-ranging design referred to as DROID, by combining the mature framework of nononcology dose-ranging with oncology dose-finding.

Design choice

The efficacy-integrated and two-stage strategies each have their own advantages and disadvantages, making them suitable for different scenarios. The two-stage approach is well-aligned with conventional develop-by-stage practices and can accommodate different populations for the dose-escalation and randomization stages. However, a potential drawback of the two-stage design is that the true optimal dose may be incorrectly excluded when transitioning from stages 1 to 2 due to unreliable toxicity and efficacy estimates based on a small stage 1 sample size. This issue can be partially addressed by backfilling patients during the dose-escalation stage to obtain more data and increase the reliability of dose selection. However, this approach may still be limited by a small sample size. In addition, the two-stage approach generally requires larger sample sizes than the efficacy-integrated approach.

In contrast, the efficacy-integrated approach continuously learns the toxicity and efficacy profile of all doses throughout the trial, making it more efficient at identifying the optimal dose and requiring smaller sample sizes. One limitation of this approach is that it requires efficacy and toxicity endpoints to be observable quickly enough to make adaptive decisions. However, methods such as time-to-event BOIN12 (TITE-BOIN12) have been proposed to address this limitation and facilitate real-time decision-making in the presence of pending toxicity or efficacy data.⁴² In addition, the efficacy-integrated approach requires that the population used for dose optimization is comparable to that for subsequent phase 2b or 3 trials, which could be challenging when the target population is not clear. In this case, after phase 1/2 dose-finding, we may conduct cohort expansion (e.g. basket trials) in potential target populations to confirm the OBD and establish the target population before proceeding to phase 3 trials. This strategy is also applicable to the two-stage approach.

The efficacy-integrated and two-stage approaches demand different sample sizes. For the two-stage approach, the recommended sample size is $6 \times J$ for the dose-escalation portion,⁴⁹ where $J$ is the number of doses under investigation, and is 20–40 patients per dose arm for the randomization portion to achieve reasonable generalized power and type 1 error.⁵⁶ For the efficacy-integrated approach, based on our experience, a sample size of $6 \times J$ to $9 \times J$ generally yields reasonable operating characteristics.⁴⁹ For example, given four doses under investigation, a reasonable sample size for the two-stage design is between 64 and 104 (assuming two doses are selected for randomization), and that for the efficacy-integrated approach is between 24 and 36. It is important to note that these are rules of thumb. Given a specific trial, the sample size should be validated and calibrated using simulation to ensure reasonable operating characteristics.
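The arithmetic behind these rules of thumb can be written out as a short Python sketch. The function names and default arguments are our own and simply encode the ranges quoted above; they are no substitute for simulation-based calibration.

```python
def two_stage_range(n_doses, n_randomized, per_arm=(20, 40), escalation_per_dose=6):
    """Rule-of-thumb total sample size range for a two-stage design."""
    escalation = escalation_per_dose * n_doses   # stage 1: 6 x J
    return (escalation + per_arm[0] * n_randomized,
            escalation + per_arm[1] * n_randomized)


def efficacy_integrated_range(n_doses, per_dose=(6, 9)):
    """Rule-of-thumb sample size range for an efficacy-integrated design."""
    return (per_dose[0] * n_doses, per_dose[1] * n_doses)


# Four doses under investigation, two doses carried into randomization.
print(two_stage_range(n_doses=4, n_randomized=2))  # (64, 104)
print(efficacy_integrated_range(n_doses=4))        # (24, 36)
```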

The efficacy-integrated and two-stage design strategies can be combined to achieve more efficient dose optimization, and they are not mutually exclusive. For instance, a trial can begin with an efficacy-integrated design (e.g. BOIN12) to optimize the initial dose efficiently, and then progress to the second stage with multiple-dose randomization to refine optimization using the MERIT design. In the generalized phase 1/2 design, a third randomized stage is added to further optimize the dose based on long-term endpoints.³¹

Phase 2/3 dose optimization

A phase 2/3 design offers an alternative strategy for dose optimization. This design type encompasses a broad range of designs and can serve various purposes, including treatment selection, population selection, and endpoint selection,^60,61 and expediting the drug development process for accelerated approval.⁶² Here, we focus on the phase 2/3 design for the purpose of dose optimization.

In this context, the phase 2 component involves the random assignment of patients to multiple doses, with or without a control, to evaluate the benefits and risks of each dose. The doses are typically selected based on factors such as toxicity, pharmacokinetics/pharmacodynamics, and preliminary efficacy data collected in the phase 1 dose-escalation study, which should demonstrate reasonable safety and antitumor activity. At the end of phase 2, an interim analysis is performed to determine the optimal dose that produces the most favorable benefit–risk trade-off for further investigation in the phase 3 component of the trial. The goal of phase 3 is to confirm the efficacy of the selected optimal dose with a randomized concurrent control or historical control.

Types of phase 2/3 designs

Depending on whether the concurrent control is included and the type of endpoints used in phases 2 and 3, Jiang and Yuan⁶³ distinguish four forms of phase 2/3 dose-optimization designs (Figure 3) that are suitable for different clinical settings.

Figure 3. Four types of phase 2/3 trial designs, varying in whether the concurrent control is included and the type of endpoints used in phases 2 and 3.

Design A incorporates a concurrent control in both stages and employs a short-term binary endpoint (e.g. objective response rate) in phase 2 to identify the optimal dose. In phase 3, a long-term time-to-event endpoint (e.g. progression-free survival (PFS) or overall survival) is used to assess the treatment’s therapeutic effect. The use of a short-term endpoint in phase 2 allows for a prompt selection of the optimal dose to progress to phase 3. Although not depicted in the schema, when appropriate, phase 3 may include an additional interim futility/superiority analysis akin to the standard group sequential design. An example of Design A is the HORIZON 3 [Cediranib Plus FOLFOX6 Versus Bevacizumab Plus FOLFOX6 in Patients With Untreated Metastatic Colorectal Cancer] trial for advanced metastatic colorectal cancer,⁶⁴ which will be further elaborated in the “Trial Examples” section.

Design B is a modification of Design A that includes the control only in phase 3. This can further reduce the sample size. However, Design B’s drawback is the lack of a concurrent control in phase 2, which makes it difficult to combine phase 2 and 3 data and obtain an unbiased estimate of the treatment effect if there is a drift in the patient population and/or the treatment effect. Design B is a reasonable option when a drift is unlikely, for example, when accrual is fast enough that the patient population is unlikely to change, and the characteristics and performance of study centers remain stable over the trial period. Design B has been used in several clinical trials, such as a randomized multicenter trial of SM-88 in patients with metastatic pancreatic cancer.⁶⁵

Design C is similar to Design A but simpler because it employs the same short-term endpoint (e.g. objective response rate) for both phases 2 and 3. This design is particularly useful in situations where demonstrating an effect on a long-term endpoint (e.g. survival or morbidity) requires lengthy and often large trials due to the disease’s prolonged course, and the short-term endpoint is reasonably likely to predict clinical benefit on the long-term endpoint. The seamless phase 2/3 trial of intravenous (IV) tenecteplase versus standard-dose IV alteplase for treating patients with acute ischemic stroke is an example of Design C.⁶⁶

Design D is a simplified version of Design C that does not include a control. This design is appropriate when there is a particularly acute unmet medical need (e.g. a refractory or resistant patient population), and/or the tumor under treatment is rare. Designs C and D are useful for drug development that targets accelerated approval from the FDA, which often relies on a short-term surrogate or intermediate clinical endpoint such as response. A limitation of Design D, like Design B, is the lack of concurrent controls, which may result in a biased estimate of the treatment effect if there is a drift in the patient population.

In addition to considering short-term phase 2 and long-term phase 3 endpoints, additional endpoints can be employed for making adaptive decisions regarding the transition from phase 2 to phase 3. One such approach is the 2-in-1 phase 2/3 design,^67,68 which may provide additional flexibility for dose optimization.

Operational versus inferential

The phase 2/3 design can be categorized as either operational or inferential. In operational phase 2/3 designs, both phases are conducted under the same protocol to eliminate any gaps between them and reduce the overall trial cost. The data collected in phase 2 are not used in the phase 3 confirmatory analysis. Operational phase 2/3 designs are relatively simple to implement, and provide flexibility in dose selection and study design adjustments based on the results of the phase 2 portion while maintaining the integrity and reliability of the confirmatory phase 3 analysis.

Inferential phase 2/3 designs integrate both phase 2 and 3 data to evaluate the treatment effect. Because they incorporate additional phase 2 data, inferential phase 2/3 designs typically demand smaller sample sizes, shorten timelines of drug development, and exhibit greater statistical efficiency compared with operational phase 2/3 designs. However, for the same reason, they require careful consideration and specialized statistical methods to control the family-wise error rate (see Stallard and Todd⁶⁰ and Kunz et al.⁶¹ for relevant methods). In addition, the implementation of inferential phase 2/3 designs is more logistically and operationally challenging compared with operational phase 2/3 designs, as detailed below.
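To illustrate why pooling phase 2 and 3 data calls for specialized methods, the following Python sketch simulates a simplified trial under the global null hypothesis and compares an operational analysis (phase 3 data only) with a naive pooled analysis that ignores the interim dose selection. It is a minimal sketch under stated assumptions, not any of the cited methods: outcomes are taken to be normal with unit variance, the same endpoint is used in both phases, the dose with the best phase 2 mean is selected, and the function name, sample sizes, and seed are hypothetical.

```python
import math
import random


def simulate_type1_error(n_sims=100_000, n_doses=2, n1=100, n2=100,
                         z_crit=1.959964, seed=2024):
    """Monte Carlo type 1 error under the global null (no dose differs from control).

    Outcomes are assumed normal with unit variance, so each arm's sample mean
    can be drawn directly as N(0, 1/n). Higher means are better.
    """
    rng = random.Random(seed)
    sd1, sd2 = 1 / math.sqrt(n1), 1 / math.sqrt(n2)
    n_total = n1 + n2
    reject_operational = 0
    reject_naive = 0
    for _ in range(n_sims):
        # Phase 2: each dose arm and the control enroll n1 patients.
        dose_means = [rng.gauss(0.0, sd1) for _ in range(n_doses)]
        ctrl_mean1 = rng.gauss(0.0, sd1)
        best_mean1 = max(dose_means)  # select the best-looking dose
        # Phase 3: selected dose versus control, n2 patients per arm.
        sel_mean2 = rng.gauss(0.0, sd2)
        ctrl_mean2 = rng.gauss(0.0, sd2)
        # Operational analysis: phase 3 data only.
        z_op = (sel_mean2 - ctrl_mean2) / math.sqrt(2 / n2)
        # Naive inferential analysis: pool both phases and ignore the selection.
        pooled_sel = (n1 * best_mean1 + n2 * sel_mean2) / n_total
        pooled_ctrl = (n1 * ctrl_mean1 + n2 * ctrl_mean2) / n_total
        z_naive = (pooled_sel - pooled_ctrl) / math.sqrt(2 / n_total)
        reject_operational += z_op > z_crit
        reject_naive += z_naive > z_crit
    return reject_operational / n_sims, reject_naive / n_sims


operational_error, naive_error = simulate_type1_error()
print(f"operational (phase 3 only): {operational_error:.4f}")
print(f"naive pooling with selection: {naive_error:.4f}")
```

Under these assumptions the operational analysis stays close to the nominal one-sided 0.025 level, while the naive pooled analysis exceeds it because the selected dose carries an optimistic phase 2 estimate into the final test. This inflation is what the adjustment methods referenced above are designed to remove.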

Practical considerations

In operational phase 2/3 designs, the two phases are conducted independently and the data collected in phase 2 are not used in the phase 3 confirmatory analysis. Therefore, the phase 2 and 3 portions can be conducted using standard considerations for phase 2 and 3 trials, respectively, resulting in little additional complexity beyond what is expected for each individual phase.

The use of inferential phase 2/3 designs for dose optimization presents more logistical and operational challenges. Due to the complexity of the design, it is crucial that sponsors and regulatory agencies engage in discussions about the trial design as early as possible in the development process. This allows for agencies to communicate their expectations and potentially leads to more efficient studies. Determining the doses to be studied in phase 2/3 trials requires knowledge of therapeutic properties, patient population heterogeneity, the need for additional dose exploration for a supplemental application, as well as communication between patients and providers. Similar considerations also apply to the selection of the optimal dose when stage 1 is complete.

For phase 2/3 trials to select the optimal dose, unblinded data access is necessary at the end of stage 1. However, this could compromise the trial’s integrity if not handled appropriately. The FDA guidance on adaptive designs recommends limiting access to comparative interim results to individuals with relevant expertise who are independent of the trial’s conducting and managing personnel and who have a need to know. An Independent Data Monitoring Committee or an independent adaptation body should make the interim dose selection decision. Procedures must be in place to ensure that personnel responsible for preparing and reporting interim analysis results to the Independent Data Monitoring Committee are physically and logistically separated from the trial’s managing and conducting personnel. This requires planned procedures to maintain and verify confidentiality, as well as documentation of monitoring and adherence to operating procedures. To maximize the trial’s integrity, investigators should prespecify design details, including the anticipated number and timing of interim analyses, criteria for dose selection, methods for controlling type 1 error and estimating treatment effects, and the data access plan.

Software

Designing dose-optimization trials is more challenging than conventional dose-finding trials and requires the use of more complicated statistical designs. It is critical to thoroughly evaluate and calibrate the operating characteristics before beginning the trial. Moreover, for certain designs, such as model-based designs, the trial conduct also requires real-time model fitting and calculation. Thus, easy-to-use software is a key to the success of dose-optimization trials. This is an area that requires immediate attention and development. The website www.trialdesign.org offers a dose-optimization module that includes several dose-optimization designs, such as BOIN12, TITE-BOIN12, MERIT, and DROID. In addition, software to implement the EffTox design is available from the software download website at The University of Texas MD Anderson Cancer Center.

Trial examples

Efficacy-integrated phase 1/2 dose optimization

An efficacy-integrated model-based late-onset EffTox design¹⁹ was utilized to determine the optimal dose of sitravatinib in combination with a fixed dose of nivolumab for the treatment of clear cell renal cell carcinoma.⁶⁹ The trial had two primary endpoints: toxicity, defined as the time to DLT within 12 weeks of starting therapy, and early efficacy, defined as the absence of progressive disease at 6 weeks according to the Response Evaluation Criteria in Solid Tumors (RECIST) guideline by investigator assessment. Dose escalation/de-escalation was performed based on the benefit–risk trade-off constructed using marginal toxicity and efficacy probabilities. At the end of the trial, the 80 and 120 mg doses had almost the same estimated trade-off desirability scores; thus, additional criteria were used to compare the doses, including an evaluation of quality of life. The 120 mg dose was chosen as the optimal dose for sitravatinib and is currently being evaluated in ongoing phase 2 and 3 clinical trials for various malignancies. The implementation of this model-based phase 1/2 design was logistically demanding and resource-intensive. It required a dedicated staff biostatistician to maintain frequent day-to-day communication between the clinical and data teams and to perform real-time calculations to determine the dose assignment for the next patients. Tidwell et al.⁷⁰ review this process and provide a summary of challenges and potential solutions, one of which is to use model-assisted designs such as BOIN12, as described next.

As an example, an optimal dose-finding trial of donor-derived CD5 chimeric antigen receptor (CAR) T cells in patients with relapsed or refractory T-cell acute lymphoblastic leukemia was based on the BOIN12 design.⁷¹ The utility function shown in Table 1 was used to measure the benefit–risk trade-off of the treatment, and the rank-based desirability score in Table 2 was used to guide dose escalation. Patients were treated in cohorts of 3. Up to the time of reporting, a total of five patients who had CD7-negative relapse after CD7 CAR therapy were enrolled and received prior stem cell transplantation donor-derived CD5 CAR T cells at an initial dose of $1 \times 10^{6}$ CAR T cells/kg. No DLT occurred, and all five patients achieved complete remission at day 30. It is important to note that because the first three patients all achieved complete remission, based on the BOIN12 design rule and rank-based desirability score table, the trial continued to treat the second cohort at the dose of $1 \times 10^{6}$ CAR T cells/kg. In contrast, if a standard dose-escalation design had been used, the dose would have been increased to a higher level after the first three patients showed no DLT. Such an escalation would be unlikely to further improve efficacy, but would carry the risk of more toxicity and a higher burden of manufacturing high levels of CAR T cells. This demonstrates the adverse effect of ignoring efficacy in conventional dose-finding designs and highlights the importance of performing dose optimization and the advantages of a dose-optimization design.
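The intuition behind staying at the starting dose can be sketched with a simple mean-utility calculation in Python. This is a simplified illustration, not the BOIN12 decision rule itself, which relies on a posterior-based rank desirability score. The utility values below are hypothetical and are not taken from Table 1, and the outcomes at a higher dose are invented purely for contrast.

```python
# Hypothetical utilities on a 0-100 scale for each (toxicity, efficacy) outcome.
# These values are illustrative assumptions, not the trial's actual utility table.
UTILITY = {
    (False, True): 100,   # no DLT, response
    (True, True): 60,     # DLT, response
    (False, False): 40,   # no DLT, no response
    (True, False): 0,     # DLT, no response
}


def mean_utility(outcomes):
    """Average utility over a list of (had_dlt, had_response) pairs."""
    return sum(UTILITY[outcome] for outcome in outcomes) / len(outcomes)


# Observed at the starting dose: five patients, no DLT, all in complete remission.
starting_dose = [(False, True)] * 5

# Invented scenario at a higher dose: same efficacy, but one patient with a DLT.
higher_dose = [(False, True)] * 4 + [(True, True)]

print(mean_utility(starting_dose))
print(mean_utility(higher_dose))
```

Because every patient at the starting dose already attains the best possible outcome, its mean utility is at the ceiling of the scale, and a higher dose can at best match it while adding toxicity risk.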

Two-stage phase 1/2 dose optimization

Belantamab mafodotin, an antibody-drug conjugate targeting B-cell maturation antigen, was developed in a two-stage phase 1/2 study. In the DREAMM-1 first-in-human trial,⁷² dose escalation was performed using the continual reassessment method design to explore doses ranging from 0.03 to 4.6 mg/kg. The MTD was not reached. The 3.4 mg/kg dose showed activity in the dose-expansion portion of DREAMM-1, but many patients experienced dose interruptions (71%) and reductions (66%). To improve tolerability, both the 2.5 and 3.4 mg/kg doses were further evaluated in the DREAMM-2 trial, in which patients were randomly assigned between the two dose arms.⁷³ Efficacy was similar between the 2.5 mg/kg cohort $(n = 97)$, with an objective response rate of 31% (97.5% confidence interval (CI) = 20.8–42.6), and the 3.4 mg/kg cohort $(n = 99)$, with an objective response rate of 34% (97.5% CI = 23.9–46.0). There were fewer fatal AEs, serious AEs, dose interruptions, and dose reductions in patients receiving the 2.5 mg/kg IV dose. Exposure–response analysis showed a flat relationship, while a positive exposure–safety relationship was observed for keratopathy toxicity. Therefore, the 2.5 mg/kg IV dose was recommended, and the drug was granted accelerated approval. However, a postmarket commitment is required to optimize the dose due to ocular toxicity.

Phase 2/3 dose optimization

The HORIZON 3 trial compared the efficacy of cediranib with that of bevacizumab when used in combination with chemotherapy mFOLFOX6 for first-line treatment of advanced metastatic colorectal cancer.⁶⁴ The trial employed a randomized, double-blind, inferential phase 2/3 design (Design A). During the phase 2 part, patients were randomly assigned 1:1:1 to receive cediranib 20 or 30 mg/day or bevacizumab 5 mg/kg IV infusion every 14 days, each combined with 14-day treatment cycles of the regimen. An Independent Data Monitoring Committee conducted the end-of-phase 2 data analysis after 225 patients had 3 months of follow-up and concluded that cediranib 20 mg met all predefined criteria for continuation. As a result, patients enrolled in the phase 3 part of the study were randomly assigned 1:1 to receive mFOLFOX6 with cediranib 20 mg or bevacizumab. All study personnel other than the Independent Data Monitoring Committee remained blinded to the data until the trial ended. Patients who received cediranib 30 mg in the phase 2 part were unblinded and given the option to continue on open-label cediranib (20 or 30 mg/day). The primary analysis of the primary endpoint, PFS, was planned to take place after 850 progression events had occurred, based on all data from patients recruited into both the phase 2 and phase 3 parts of the study, excluding data from the cediranib dose discontinued at the end of phase 2. Since PFS data from patients recruited into the phase 2 part of the study were used in the phase 3 analyses, and data from the phase 2 part were used to select the phase 3 dose, the method of Todd and Stallard⁷⁴ was used to adjust the type 1 error for the primary analysis. The primary analysis showed no significant difference in PFS between the arms: the estimated hazard ratio was 1.10 (95% CI = 0.97–1.25), and the median PFS was 9.9 months for cediranib 20 mg and 10.3 months for bevacizumab. However, since the upper limit of the 95% CI exceeded the predefined limit of 1.2, noninferiority of cediranib versus bevacizumab could not be concluded.
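The reading of the HORIZON 3 hazard ratio can be sketched in a few lines of Python. The function name and output keys are hypothetical, and the standard error is back-calculated from the reported interval under an assumed symmetric normal approximation on the log scale, so it is only approximate given that the trial's analysis was adjusted for dose selection.

```python
import math


def assess_hazard_ratio(hr, ci_lower, ci_upper, ni_margin, z_crit=1.959964):
    """Interpret a hazard ratio and its 95% CI against a noninferiority margin."""
    # Approximate standard error of log(HR), assuming a symmetric CI on the log scale.
    se_log_hr = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z_crit)
    return {
        "se_log_hr": se_log_hr,
        "z_statistic": math.log(hr) / se_log_hr,
        # A difference is significant only if the CI excludes 1.
        "significant_difference": not (ci_lower <= 1.0 <= ci_upper),
        # Noninferiority requires the upper CI limit to fall below the margin.
        "noninferior": ci_upper < ni_margin,
    }


result = assess_hazard_ratio(hr=1.10, ci_lower=0.97, ci_upper=1.25, ni_margin=1.2)
for key, value in result.items():
    print(key, value)
```

The two conclusions are distinct: the interval contains 1, so no significant difference is shown, yet its upper limit lies above the margin, so noninferiority cannot be claimed either.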

Discussion

We have reviewed design strategies and provided practical guidance on dose-optimization trials. For phase 1/2 designs, we contrast efficacy-integrated and two-stage phase 1/2 design strategies and discuss their pros and cons and key considerations for trial implementation. For phase 2/3 designs, we discuss and compare different types of designs based on the type of endpoint, whether the control is included, and whether phase 2 data are combined with phase 3 data for the primary analysis (inferential or operational).

In practice, the decision of whether to pursue a phase 1/2 or a phase 2/3 strategy should be made on a case-by-case basis, taking into account a variety of factors including clinical, statistical, logistic, and budgetary considerations. For instance, if a drug candidate has a fast readout of efficacy and pharmacodynamic parameters, an efficacy-integrated phase 1/2 design may be preferred to evaluate safety and efficacy simultaneously, starting early in the trial. Conversely, if a drug candidate has been tested in a sufficient number of patients at various dose levels and there is a good understanding of its therapeutic window, a seamless phase 2/3 design may be a more attractive approach for selecting a dose in phase 2 and bringing it seamlessly into the confirmatory phase 3.

We have focused on dose optimization in the early phases (e.g. phases 1 and 2) of drug development, which is generally preferred to postmarket dose optimization. This approach increases the likelihood that the recommended dosage of the marketed product maximizes efficacy and minimizes toxicity, and avoids many issues associated with postmarketing dose optimization, such as the requirement for large sample sizes, long study durations, and difficulties in conducting the study as patients and investigators may be reluctant to be randomized to a dose of an approved product that differs from the approved dose. However, it is not uncommon that a dose has not been optimized at the time of marketing approval, and dose-optimization studies are conducted after the drug has been approved. In such cases, postapproval studies evaluating two or more doses may be planned as noninferiority trials. One may be concerned that performing dose optimization in the early phase will needlessly expose large numbers of patients to ineffective therapies and slow down drug development.⁷⁵ Nevertheless, novel statistical designs, such as BOIN12 and EffTox, can stop the trial early when the drug demonstrates little activity, alleviating these concerns. Further research is warranted to develop and implement better study designs to maximize the benefit of dose optimization and deliver safe and effective treatments to patients.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Ying Yuan

References

1. Ratain. Redefining the primary objective of phase I oncology trials. Nature Rev Clin Oncol 2014; 11: 503–504.

2. Yan, Thall, Yuan. Phase I-II clinical trial design: a state-of-the-art paradigm for dose finding. Ann Oncol 2018; 29: 694–699.

3. Shah, Rahman, Theoret, et al. The drug-dosing conundrum in oncology – when less is more. New Engl J Med 2021; 385: 1445–1447.

4. Ratain, Tannock, Lichter. Dose optimization of sotorasib: is the U.S. Food and Drug Administration sending a message? J Clin Oncol 2021; 39: 3423–3426.

5. Sachs, Mayawala, Gadamsetty, et al. Optimal dosing for targeted therapies in oncology: drug development cases leading by example. Clin Cancer Res 2016; 22: 1318–1324.

6. Fourie Zirkelbach, Shah, Vallejo, et al. Improving dose-optimization processes used in oncology drug development to minimize toxicity and maximize benefit to patients. J Clin Oncol 2022; 40: 3489–3500.

7. U.S. Food and Drug Administration. Project Optimus: reforming the dose optimization and dose selection paradigm in oncology, https://www.fda.gov/about-fda/oncology-center-excellence/project-optimus (2022, accessed 16 September 2023).

8. U.S. Food and Drug Administration. Optimizing the dosage of human prescription drugs and biological products for the treatment of oncologic diseases, https://www.fda.gov/regulatory-information/search-fda-guidance-documents/optimizing-dosage-human-prescription-drugs-and-biological-products-treatment-oncologic-diseases (2023, accessed 16 September 2023).

9. Yuan, Nguyen, Thall. Bayesian designs for phase I-II clinical trials. Boca Raton, FL: Chapman and Hall, 2016.

10. Thall, Russell. A strategy for dose-finding and safety monitoring based on efficacy and adverse outcomes in phase I/II clinical trials. Biometrics 1998; 54(1): 251–264.

11. O’Quigley, Hughes, Fenton. Dose-finding designs for HIV studies. Biometrics 2001; 57(4): 1018–1029.

12. Braun. The bivariate continual reassessment method: extending the CRM to phase I trials of two competing outcomes. Control Clin Trial 2002; 23(3): 240–256.

13. Thall, Cook. Dose-finding based on efficacy-toxicity trade-offs. Biometrics 2004; 60(3): 684–693.

14. Yin. Bayesian dose-finding in phase I/II clinical trials using toxicity and efficacy odds ratios. Biometrics 2006; 62(3): 777–784.

15. Mandrekar, Cui, Sargent. An adaptive phase I design for identifying a biologically optimal dose for dual agent drug combinations. Stat Med 2007; 26: 2317–2330.

16. Yuan, Yin. Bayesian dose finding by jointly modeling toxicity and efficacy as time-to-event outcomes. J Royal Stat Soc Ser C 2009; 58: 719–736.

17. Yuan, Yin. Bayesian phase I/II drug-combination trial design in oncology. Ann Appl Stat 2011; 5: 924–942.

18. Wages, Conaway. Phase I/II adaptive design for drug combination oncology trials. Stat Med 2014; 33(12): 1990–2003.

19. Jin, Liu, Thall, et al. Using data augmentation to facilitate conduct of phase I/II clinical trials with delayed outcomes. J Am Stat Assoc 2014; 109(506): 525–536.

20. Zang, Lee, Yuan. Adaptive designs for identifying optimal biological dose for molecularly targeted agents. Clin Trial 2014; 11(3): 319–327.

21. Liu, Johnson. A robust Bayesian dose-finding design for phase I/II clinical trials. Biostatistics 2016; 17(2): 249–263.

22. Guo, Yuan. Bayesian phase I/II biomarker-based dose finding for precision medicine with molecularly targeted agents. J Am Stat Assoc 2017; 112(518): 508–520.

23. Riviere, Yuan, Jourdan, et al. Phase I/II dose-finding design for molecularly targeted agent: plateau determination using adaptive randomization. Stat Methods Med Res 2018; 27(2): 466–479.

24. Takeda, Taguri, Morita. BOIN-ET: Bayesian optimal interval design for dose finding based on both efficacy and toxicity outcomes. Pharm Stat 2018; 17(4): 383–395.

25. Lin, Zhou, Yan, et al. BOIN12: Bayesian optimal interval phase I/II trial design for utility-based dose finding in immunotherapy and targeted therapies. JCO Precis Oncol 2020; 16: 1392–1402.

26. O’Quigley, Pepe, Fisher. Continual reassessment method: a practical design for phase 1 clinical trials in cancer. Biometrics 1990; 46(1): 33–48.

27. Hoering, LeBlanc, Crowley. Seamless phase I-II trial design for assessing toxicity and efficacy for targeted agents. Clin Cancer Res 2011; 17(4): 640–646.

28. Zhou, Lee, Yuan. A utility-based Bayesian optimal interval (U-BOIN) phase I/II design to identify the optimal biological dose for targeted and immune therapies. Stat Med 2019; 38: 5299–5316.

29. Guo, Yuan. DROID: dose-ranging approach to optimizing dose in oncology drug development. Biometrics. Epub ahead of print 20 February 2023. DOI:10.1111/biom.13840

30. Liu, Guo, Yuan. A Bayesian phase I/II trial design for immunotherapy. J Am Stat Assoc 2018; 113: 1016–1027.

31. Thall, Zang, Yuan. Generalized phase I-II designs to increase long term therapeutic success rate. Pharm Stat 2023; 22(4): 692–706.

32. Van Meter, Garrett-Mayer, Bandyopadhyay. Proportional odds model for dose-finding clinical trial designs with ordinal toxicity grading. Stat Med 2011; 30(17): 2070–2080.

33. Bekele, Thall. Dose-finding based on multiple toxicities in a soft tissue sarcoma trial. J Am Stat Assoc 2004; 99: 26–35.

34. Lee, Hershman, Martin, et al. Toxicity burden score: a novel approach to summarize multiple toxic effects. Ann Oncol 2012; 23(2): 537–541.

35. Ezzalfani, Zohar, Qin, et al. Dose-finding designs using a novel quasi-continuous endpoint for multiple toxicities. Stat Med 2013; 32: 2728–2746.

36. Chen, Krailo, Azen, et al. A novel toxicity scoring system treating toxicity response as a quasi-continuous variable in phase I clinical trials. Contemp Clin Trial 2010; 31(5): 473–482.

37. Yuan, Chappell, Bailey. The continual reassessment method for multiple toxicity grades: a Bayesian quasi-likelihood approach. Biometrics 2007; 63: 173–179.

38. Yuan, et al. GBOIN: a unified model-assisted phase I trial design accounting for toxicity grades, and binary or continuous endpoints. J Royal Stat Soc Ser C 2019; 68: 289–308.

39. Yin, Sargent, et al. An adaptive multi-stage phase I dose-finding design incorporating continuous efficacy and toxicity data from multiple treatment cycles. J Biopharm Stat 2019; 29(2): 271–286.

40. Lee, Cheng, Cheung. Continual reassessment method with multiple toxicity constraints. Biostatistics 2011; 12(2): 386–398.

41. Lin. Bayesian optimal interval design with multiple toxicity constraints. Biometrics 2018; 74: 1320–1330.

42. Zhou, Lin, Lee, et al. TITE-BOIN12: a Bayesian phase I/II trial design to find the optimal biological dose with late-onset toxicity and efficacy. Stat Med 2022; 41(11): 1918–1931.

43. Takeda, Morita, Taguri. TITE-BOIN-ET: time-to-event Bayesian optimal interval design to accelerate dose-finding based on both efficacy and toxicity outcomes. Pharm Stat 2020; 19(3): 335–349.

44. Agrawal, Feng, Roy, et al. Nivolumab dose selection: challenges, opportunities, and lessons learned for cancer immunotherapy. J Immunother Cancer 2016; 4: 72–11.

45. Houede, Thall, Nguyen, et al. Utility-based optimization of combination therapy using ordinal toxicity and efficacy in phase I/II trials. Biometrics 2010; 66(2): 532–540.

46. Murray, Thall, Yuan, et al. Robust treatment comparison based on utilities of semi-competing risks in non-small-cell lung cancer. J Am Stat Assoc 2017; 112: 11–23.

47. Schipper, Taylor, TenHaken, et al. Personalized dose selection in radiation therapy using statistical models for toxicity and efficacy with dose and biomarkers as covariates. Stat Med 2014; 33: 5330–5339.

48. Yuan, Lee, Hilsenbeck. Model-assisted designs for early-phase clinical trials: simplicity meets superiority. JCO Precis Oncol 2019; 3: 1–12.

49. Yuan, Lin, Lee. Model-assisted Bayesian designs for dose finding and optimization: methods and applications. Boca Raton, FL: Chapman and Hall/CRC, 2022.

50. Liu, Yuan. Bayesian optimal interval designs for phase I clinical trials. J Royal Stat Soc Ser C 2014; 64: 507–523.

51. U.S. Food and Drug Administration. Drug development tools: fit-for-purpose initiative, https://www.fda.gov/drugs/development-approval-process-drugs/drug-development-tools-fit-purpose-initiative#:∼:text=Due%20to%20the%20evolving%20nature,evaluation%20of%20the%20information%20provided (2022, accessed 16 September 2023).

52. Lin, Yin. STEIN: a simple toxicity and efficacy interval design for seamless phase I/II clinical trials. Stat Med 2017; 36(26): 4106–4120.

53. Whitmore, Guo, et al. Toxicity and efficacy probability interval design for phase I adoptive cell therapy dose-finding clinical trials. Clin Cancer Res 2017; 23(1): 13–20.

54. Shi, Cao, Yuan, et al. UTPI: a utility-based toxicity probability interval design for phase I/II dose-finding trials. Stat Med 2021; 40: 2626–2649.

55. Pan, Xie, Liu, et al. A phase I/II seamless dose escalation/expansion with adaptive randomization scheme (SEARS). Clin Trial 2014; 11(1): 49–59.

56. Yang, Lin, et al. Design and sample size determination for multiple-dose randomized phase II trials for dose optimization. arXiv Preprint, https://arxiv.org/abs/2302.09612 (2023, accessed 16 September 2023).

57. Wathen, Thall. A simulation study of outcome adaptive randomization in multi-arm clinical trials. Clin Trial 2017; 14(5): 432–440.

58. Zhou, Lee, Yuan. BOP2: Bayesian optimal design for phase II clinical trials with simple and complex endpoints. Stat Med 2017; 36(21): 3302–3314.

59. Bretz, Pinheiro, Branson. Combining multiple comparisons and modeling techniques in dose-response studies. Biometrics 2005; 61(3): 738–748.

60. Stallard, Todd. Seamless phase II/III designs. Stat Methods Med Res 2011; 20(6): 623–634.

61. Kunz, Friede, Parsons, et al. A comparison of methods for treatment selection in seamless phase II/III clinical trials incorporating information on short-term endpoints. J Biopharm Stat 2015; 25(1): 170–189.

62. U.S. Food and Drug Administration. Clinical trial considerations to support accelerated approval of oncology therapeutics, https://www.fda.gov/regulatory-information/search-fda-guidance-documents/clinical-trial-considerations-support-accelerated-approval-oncology-therapeutics (2023, accessed 16 September 2023).

63. Jiang, Yuan. Seamless phase 2-3 design: a useful strategy to reduce the sample size for dose optimization. J Natl Cancer Inst 2023; 115(9): 1092–1098.

64. Schmoll, Cunningham, Sobrero, et al. Cediranib with mFOLFOX6 versus bevacizumab with mFOLFOX6 as first-line treatment for patients with advanced colorectal cancer: a double-blind, randomized phase III study (HORIZON III). J Clin Oncol 2012; 30(29): 3588–3595.

65. A randomized phase 2/3 multi-center study of SM-88 in patients with metastatic pancreatic cancer, https://clinicaltrials.gov/ct2/show/NCT03512756 (accessed 16 September 2023).

66. Levin, Thompson, Chakraborty, et al. Statistical aspects of the TNK-S2B trial of tenecteplase versus alteplase in acute ischemic stroke: an efficient, dose-adaptive, seamless phase II/III design. Clin Trial 2011; 8(4): 398–407.

67. Chen, Anderson, Mehrotra, et al. A 2-in-1 adaptive phase 2/3 design for expedited oncology drug development. Contemp Clin Trial 2018; 64: 238–242.

68. Zhang, et al. A 2-in-1 adaptive design to seamlessly expand a selected dose from a phase 2 trial to a phase 3 trial for oncology drug development. Contemp Clin Trial 2022; 122: 106931.

69. Msaouel, Goswami, Thall, et al. A phase 1-2 trial of sitravatinib and nivolumab in clear cell renal cell carcinoma following progression on antiangiogenic therapy. Sci Trans Med 2022; 14(641): eabm6420.

70. Tidwell RSS, Thall, Yuan. Lessons learned from implementing a novel Bayesian adaptive dose-finding design in advanced pancreatic cancer. JCO Precis Oncol 2021; 5: 1719–1726.

71. Pan, Tan, Shan, et al. Phase I study of donor-derived CD5 CAR T cells in patients with relapsed or refractory T-cell acute lymphoblastic leukemia. J Clin Oncol 2022; 4: 7028–7028.

72. Trudel, Lendvai, Popat, et al. Targeting B-cell maturation antigen with GSK2857916 antibody-drug conjugate in relapsed or refractory multiple myeloma (BMA117159): a dose escalation and expansion phase 1 trial. Lancet Oncol 2018; 19(12): 1641–1653.

73. Lonial, Lee, Badros, et al. Belantamab mafodotin for relapsed or refractory multiple myeloma (DREAMM-2): a two-arm, randomised, open-label, phase 2 study. Lancet Oncol 2020; 21(2): 207–221.

74. Todd, Stallard. A new clinical trial design combining phases 2 and 3: sequential designs with treatment selection and a change of endpoint. Drug Inform J 2005; 39(2): 109–118.

75. Korn, Moscow, Freidlin. Dose optimization during drug development: whether and when to optimize. J Natl Cancer Inst 2023; 115(5): 492–497.