Abstract

One common approach for dose optimization is a two-stage design, which initially conducts dose escalation to identify the maximum tolerated dose, followed by a randomization stage where patients are assigned to two or more doses to further assess and compare their risk-benefit profiles to identify the optimal dose. A limitation of this approach is its requirement for a relatively large sample size. To address this challenge, we propose a seamless two-stage design, BARD (Backfill and Adaptive Randomization for Dose Optimization), which incorporates two key features to reduce sample size and shorten trial duration. The first feature is the integration of backfilling into the stage 1 dose escalation, enhancing patient enrollment and data generation without prolonging the trial. The second feature involves seamlessly combining patients treated in stage 1 with those in stage 2, enabled by covariate-adaptive randomization, to inform the optimal dose and thereby reduce the sample size. Our simulation study demonstrates that BARD reduces the sample size, improves the accuracy of identifying the optimal dose, and maintains covariate balance in randomization, allowing for unbiased comparisons between doses. BARD designs offer an efficient solution to meet the dose optimization requirements set by Project Optimus, with software freely available at www.trialdesign.org.

Keywords

Dose optimization, Project Optimus, adaptive designs, seamless designs, dose finding

Introduction

Conventionally, phase I dose-finding trials aim to identify the maximum tolerated dose (MTD) under the assumption that both toxicity and efficacy increase with dose. However, this paradigm poses challenges for targeted therapies and immunotherapies, where the monotonicity assumption often doesn’t hold.^1,2 For instance, when a targeted agent’s binding is saturated before reaching the MTD, increasing the dose may not improve efficacy further. In such cases, a dose below the MTD may offer a better benefit-risk trade-off by providing similar efficacy with lower toxicity and better tolerability.³ Recognizing this issue, the United States Food and Drug Administration (FDA) launched Project Optimus⁴ to reform the dose selection paradigm. This initiative shifts the focus of dose finding and selection from the MTD to the optimal biological dose (OBD), which delivers the optimal risk-benefit profile.

Numerous methods have been proposed to identify the OBD. Yuan et al.⁵ reviewed phase I–II trial designs and discussed critical topics in OBD identification. To better understand these various designs, Yuan et al.⁶ classified them into two strategies: the efficacy-integrated strategy and the two-stage strategy. The efficacy-integrated strategy considers the risk-benefit trade-off from the beginning of the trial and uses it to guide the dose finding. Examples of efficacy-integrated designs include model-based designs, such as EffTox and late-onset EffTox,^7,8 and model-assisted designs, such as BOIN12, BOIN-ET, and uTPI.^9–11 Efficacy-integrated designs are efficient for identifying the OBD and are most suitable when the efficacy endpoint can be ascertained relatively quickly and the patient population is expected to be similar between dose finding and subsequent phase II trials. A number of clinical trials have applied this strategy to find the OBD using EffTox or BOIN12.^12–14

The two-stage strategy first performs dose escalation to identify the MTD and a safe dose range, followed by a randomization stage in which patients are randomized between two or more doses to further assess and compare the risk-benefit profiles of these doses and identify the OBD. Compared with the efficacy-integrated strategy, the two-stage strategy is more flexible, allowing different populations in the two stages (e.g. all-comers in stage 1 and particular indications in stage 2). In addition, randomization decreases heterogeneity and enables fairer, unbiased comparisons between doses.¹⁵ This approach has been described in the FDA’s guidance on dose optimization.¹⁶ Examples of trial designs using the two-stage strategy include the method by Hoering et al.,¹⁷ DROID,¹⁸ and U-BOIN.¹⁹

One limitation of two-stage designs is their requirement for a relatively large sample size due to their structured approach. In stage 1, typical sample sizes for dose escalation often range from 20 to 30 patients, depending on the number of doses (e.g. 4–5 doses). For stage 2, Yang et al.²⁰ recommended sample sizes of 20 to 40 patients per dose arm to achieve reasonable accuracy in identifying the OBD. As a result, the total sample size required is substantially larger than that in conventional dose-finding trials, leading to increased costs and longer development times.

To address this challenge, we propose a seamless two-stage design, BARD (Backfill and Adaptive Randomization for Dose Optimization), incorporating two key features to reduce sample size and shorten trial duration. The first feature of BARD is the integration of backfilling into stage 1 dose escalation, allowing additional patients to be treated (backfilled) at doses deemed safe and showing promising activity. This concurrent approach enhances patient enrollment and data generation without prolonging the trial duration, thereby better informing the determination of the MTD and OBD.²¹ The second feature seamlessly combines patients treated in stage 1 with those in stage 2, significantly reducing the overall sample size requirement. Integrating stage 1 patients, who are not randomized, into stage 2 may compromise covariate balance across doses. To mitigate this, we employ covariate-adaptive randomization in stage 2 to actively address potential imbalances in prognostic factors among stage 1 non-randomized patients. Our simulation study demonstrates that this approach achieves covariate balance comparable with that of fully randomized trials comprising only randomized patients.

Method

BARD consists of two seamless stages: stage 1 conducts dose escalation with backfill, and stage 2 performs covariate-adaptive randomization. The objective of stage 1 is to establish the MTD and provide toxicity and efficacy data, as well as pharmacokinetics and pharmacodynamics data, to select the doses for stage 2 randomization. Depending on the dose escalation method used in stage 1, different versions of BARD can be constructed. We focus on the Bayesian optimal interval design (BOIN)²² and the Bayesian logistic regression model (BLRM)²³ methods to illustrate the use of model-assisted and model-based dose escalation designs, respectively, while noting that our methodology can readily accommodate other dose escalation methods, such as the keyboard design²⁴ and the continuous reassessment method.²⁵

Stage 1 dose escalation with backfill

BOIN with backfill (BF-BOIN)

In this section, we briefly review the BF-BOIN design proposed by Zhao et al.,²⁶ which combines BOIN with backfill. This review is not only for the completeness of the method but also to provide the necessary notation and concepts for the development of BLRM with backfill (BF-BLRM) in the next section.

BF-BOIN uses the same rule as BOIN for dose escalation and de-escalation. Let $ϕ$ denote the target dose-limiting toxicity (DLT) rate, and $λ_{e}$ and $λ_{d}$ denote the corresponding dose escalation and de-escalation boundaries of BOIN, respectively. Let $j = 1, \dots, J$ denote the dose levels. Let $y_{T_{j}}$ and $n_{j}$ denote the number of patients who experienced DLT and the number of patients who have completed DLT assessment at dose level $j$ , respectively, with ${\hat{p}}_{j} = y_{T_{j}} / n_{j}$ denoting the observed DLT rate at dose $j$ . Let $c$ denote the current dose level of dose escalation, that is, the dose at which the most recent cohort of patients was treated. The dose escalation/de-escalation of BF-BOIN is given by:

If ${\hat{p}}_{c} \leq λ_{e}$ , escalate one dose level to $c + 1$ .

If ${\hat{p}}_{c} > λ_{d}$ , de-escalate one dose level to $c - 1$ .

Otherwise, stay at the current dose $c$ .
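The three-way rule above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' software: the function name `boin_decision` is hypothetical, the boundaries are passed in as plain numbers, and handling of the lowest and highest dose levels is deliberately left out.

```python
def boin_decision(n_dlt: int, n_evaluable: int, lam_e: float, lam_d: float) -> str:
    """Return the BOIN-style decision for the current dose.

    n_dlt       -- patients with DLT at the current dose (y_T in the text)
    n_evaluable -- patients who completed DLT assessment at that dose (n in the text)
    lam_e/lam_d -- escalation and de-escalation boundaries
    """
    if n_evaluable <= 0:
        raise ValueError("need at least one evaluable patient")
    p_hat = n_dlt / n_evaluable          # observed DLT rate
    if p_hat <= lam_e:
        return "escalate"
    if p_hat > lam_d:
        return "de-escalate"
    return "stay"


# Boundaries quoted later in the text for a target DLT rate of 0.25.
LAM_E, LAM_D = 0.197, 0.298
for y, n in [(0, 3), (1, 3), (1, 4)]:
    print(y, n, boin_decision(y, n, LAM_E, LAM_D))
```

With these boundaries, 0/3 leads to escalation, 1/3 to de-escalation, and 1/4 to staying at the current dose.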

For safety, an overdose control rule is applied throughout the dose escalation: if $\Pr (p_{j} > ϕ \mid y_{T_{j}}, n_{j}) > C$ , where $p_{j}$ denotes the true DLT rate at dose level $j$ , dose level $j$ and all higher doses are eliminated and no further patients are treated at them. This rule is evaluated under a beta-binomial model with a uniform prior for $p_{j}$ , which gives the posterior $p_{j} \mid y_{T_{j}}, n_{j} \sim Beta (y_{T_{j}} + 1, n_{j} - y_{T_{j}} + 1)$ , where $Beta (\cdot)$ denotes a beta distribution. The default value of $C$ is 0.95 for BOIN, but it can be adjusted to fit specific trial safety considerations.
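As an illustration, the posterior tail probability in the overdose control rule can be computed without any statistical library. The sketch below relies on a standard identity for beta distributions with integer parameters: the upper tail of a Beta(y + 1, n - y + 1) distribution at ϕ equals the probability that a Binomial(n + 1, ϕ) variable is at most y. The function names are hypothetical.

```python
from math import comb


def prob_exceeds_target(n_dlt: int, n_evaluable: int, phi: float) -> float:
    """Pr(p > phi | data) under a Beta(y + 1, n - y + 1) posterior (uniform prior).

    Uses the identity Pr(Beta(y+1, n-y+1) > phi) = Pr(Binomial(n+1, phi) <= y),
    which holds because both beta parameters are positive integers.
    """
    m = n_evaluable + 1
    return sum(comb(m, k) * phi**k * (1 - phi) ** (m - k) for k in range(n_dlt + 1))


def eliminate_dose(n_dlt: int, n_evaluable: int, phi: float, cutoff: float = 0.95) -> bool:
    """True if the dose (and all higher doses) should be eliminated."""
    return prob_exceeds_target(n_dlt, n_evaluable, phi) > cutoff


print(eliminate_dose(3, 3, phi=0.25))  # 3 DLTs in 3 patients
print(eliminate_dose(2, 3, phi=0.25))  # 2 DLTs in 3 patients
```

With ϕ = 0.25 and C = 0.95, three DLTs in three patients give a posterior probability of about 0.996 and trigger elimination, whereas two DLTs in three patients give about 0.949 and do not.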

To conduct backfilling, BF-BOIN adaptively opens and closes a dose for backfilling based on observed interim data as follows. A dose level $b$ is regarded as eligible for backfilling if it satisfies the following two conditions:

(Safety) $b$ is lower than the current dose of the dose escalation (i.e. $b < c$ ).

(Activity) At least one response is observed at $b$ or at a dose lower than $b$ .

Here, the response can be any reasonable anti-tumor activity readout, such as objective response, or a surrogate endpoint, such as a pharmacodynamics biomarker, receptor occupancy, or changes in ctDNA.

Dose level $b$ will be closed for backfilling if:

(a) both of the following two conditions are met:

The observed DLT rate based on all patients who have completed DLT assessment at $b$ is greater than the de-escalation boundary $λ_{d}$ , and

the pooled DLT rate based on the pooled DLT data over $b$ and $b + 1$ is also greater than $λ_{d}$ . Or,

(b) the number of evaluable patients treated at $b$ is $\geq n_{cap}$ , where $n_{cap}$ is a prespecified sample size cap.

The closing rule (a) temporarily closes a dose for backfilling due to its toxicity, while rule (b) permanently closes a dose for backfilling. If dose level $b$ is closed due to rule (a), all the doses higher than $b$ should also be closed. As described in Zhao et al.,²⁶ in rule (a), the pooled DLT estimate is used to mitigate the impact of an accidentally high DLT rate caused by a small sample size. This approach is simple and performs similarly to isotonic regression, which is theoretically more desirable but more complex. When multiple doses are eligible for backfilling, we can either prioritize the highest dose or randomize among them,²⁶ depending on the context. Pin et al.²⁷ considered using response-adaptive randomization.
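The opening and closing rules can be combined into a single helper that lists the doses currently open for backfilling. The sketch below is one possible reading of the rules, not the published implementation: dose levels are 0-based list indices, the function names are hypothetical, and the same count (`n_eval`) is used both for patients who completed DLT assessment and for the evaluable patients counted against $n_{cap}$.

```python
def toxicity_closed(b, n_dlt, n_eval, lam_d):
    """Closing rule (a): both the rate at b and the rate pooled over b and b+1 exceed lam_d."""
    if n_eval[b] == 0:
        return False
    own = n_dlt[b] / n_eval[b]
    pooled = (n_dlt[b] + n_dlt[b + 1]) / (n_eval[b] + n_eval[b + 1])
    return own > lam_d and pooled > lam_d


def open_backfill_doses(current, n_dlt, n_eval, n_resp, lam_d, n_cap):
    """Return the 0-based dose levels currently open for backfilling.

    current -- index of the current dose-escalation dose c
    n_resp  -- number of responses observed at each dose
    """
    open_doses = []
    for b in range(current):                      # safety: only doses below c
        if toxicity_closed(b, n_dlt, n_eval, lam_d):
            break                                 # rule (a) also closes all higher doses
        has_activity = sum(n_resp[: b + 1]) >= 1  # response at b or at a lower dose
        under_cap = n_eval[b] < n_cap             # closing rule (b)
        if has_activity and under_cap:
            open_doses.append(b)
    return open_doses


# Three doses tried, escalation currently at the third dose (index 2).
print(open_backfill_doses(2, n_dlt=[0, 1, 0], n_eval=[3, 6, 3],
                          n_resp=[0, 1, 0], lam_d=0.298, n_cap=12))
```

In this example only the second dose (index 1) is open: the first dose has no response at or below it, and the second dose has a response, an acceptable DLT rate, and fewer than $n_{cap}$ patients.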

One complication introduced by backfilling is that the new data observed from backfilled patients may conflict with those from dose escalation. Specifically, after integrating the data from backfilled patients, the observed DLT rate at a lower, backfilled dose $b (b < c)$ could be higher than that at the current dose $c$ (for dose escalation), that is, ${\hat{p}}_{b} > {\hat{p}}_{c}$ . Table 1 summarizes the possible conflicts between dose escalation and backfilling.

Table 1.

Conflict between the current dose of dose-escalation and backfilling doses.

Observed DLT rate of backfilled dose ${\hat{p}}_{b}$ suggests (rows) / observed DLT rate of the current dose ${\hat{p}}_{c}$ suggests (columns)	Escalation $({\hat{p}}_{c} \leq λ_{e})$	Stay $(λ_{e} < {\hat{p}}_{c} \leq λ_{d})$	De-escalation $({\hat{p}}_{c} > λ_{d})$
Escalation $({\hat{p}}_{b} \leq λ_{e})$	-	-	-
Stay $(λ_{e} < {\hat{p}}_{b} \leq λ_{d})$	conflict	-	-
De-escalation $({\hat{p}}_{b} > λ_{d})$	conflict	conflict	conflict^a

^a This case does not necessarily mean that ${\hat{p}}_{b} > {\hat{p}}_{c}$ . However, because $b < c$ , the data previously observed at $b$ during the dose escalation must have satisfied ${\hat{p}}_{b} \leq λ_{e}$ . Observing ${\hat{p}}_{b} > λ_{d}$ therefore means that the additional data from backfilled patients show alarmingly higher toxicity than what was originally observed during the dose escalation, and it is important to reconcile such a conflict for patient safety.

BF-BOIN reconciles the conflicts shown in Table 1 using the following rule. Let $b^{*}$ denote the dose conflicting the current dose $c (b^{*} < c)$ , and define the pooled DLT rate from $b^{*}$ to $j$ , where $b^{*} \leq j \leq c$ , as follows:

$$ {\hat{q}}_{j} = \frac{\text{number of patients who experienced DLT at doses } b^{*} \text{ through } j}{\text{number of patients who completed DLT assessment at doses } b^{*} \text{ through } j} . $$

In the presence of the conflict, BF-BOIN uses the following rule to replace the original BOIN rule to determine dose escalation/de-escalation:

If ${\hat{q}}_{c} \leq λ_{e}$ , escalate one dose level.

If ${\hat{q}}_{c} > λ_{d}$ , de-escalate to the highest dose $j$ with ${\hat{q}}_{j} \leq λ_{d}$ , $b^{*} \leq j \leq c - 1$ ; if such a dose does not exist, de-escalate to dose $b^{*} - 1$ .

Otherwise, stay at the current dose.
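The reconciliation rule can be sketched as follows, assuming 0-based dose indices and hypothetical function and argument names. Two details are assumptions not stated in the text: escalation is capped at the highest dose, and a returned index of -1 (below the lowest dose) is left for the caller to handle.

```python
def reconcile(b_star, current, n_dlt, n_eval, lam_e, lam_d):
    """Return the next dose index for dose escalation when dose b_star conflicts with dose c."""

    def pooled(j):
        # Pooled DLT rate over doses b_star..j (inclusive), i.e. q_hat_j in the text.
        return sum(n_dlt[b_star : j + 1]) / sum(n_eval[b_star : j + 1])

    q_c = pooled(current)
    if q_c <= lam_e:
        return min(current + 1, len(n_eval) - 1)   # escalate (capped at top dose: assumption)
    if q_c > lam_d:
        for j in range(current - 1, b_star - 1, -1):
            if pooled(j) <= lam_d:
                return j                           # highest dose with an acceptable pooled rate
        return b_star - 1                          # no such dose; may be -1 (caller decides)
    return current                                 # stay


LAM_E, LAM_D = 0.197, 0.298
# Backfilling at index 1 now shows 3/9 DLTs, while the current dose (index 2) shows 0/3.
print(reconcile(1, 2, n_dlt=[0, 3, 0], n_eval=[3, 9, 3], lam_e=LAM_E, lam_d=LAM_D))
```

Here the pooled rate over the two doses is 3/12 = 0.25, which falls between the boundaries, so the trial stays at the current dose instead of escalating.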

When a new patient is enrolled in the trial, BF-BOIN assigns the patient to dose escalation or backfilling in the following way:

If the current cohort of the dose-escalation has not been filled, the patient will be allocated to that dose-escalation cohort;

Otherwise, the patient will be allocated to a dose that is open for backfilling. If multiple dose levels are open for backfilling, the patient will be assigned to the highest one.

This patient assignment rule prioritizes dose escalation over backfilling, but can be customized based on the trial. Patient enrollment is staggered between cohorts in dose-escalation, and no stagger is necessary in backfilling.

When dose escalation ends (e.g. the prespecified maximum sample size is reached or another stopping rule is satisfied), backfilling stops, and stage 1 ends. At the end of stage 1, an isotonic regression is applied using all the data, and the dose whose isotonic estimated DLT rate is closest to $ϕ$ is identified as the MTD.
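The MTD selection step can be illustrated with a small pool-adjacent-violators routine. This is a simplified sketch: it applies isotonic regression to the raw observed DLT rates weighted by the number of evaluable patients, ignores untried doses, and breaks ties toward the lower dose, all of which are assumptions made here for illustration. The function names are hypothetical.

```python
def pava(rates, weights):
    """Weighted pool-adjacent-violators fit of a non-decreasing sequence."""
    blocks = []  # each block holds [pooled mean, total weight, number of doses]
    for r, w in zip(rates, weights):
        blocks.append([r, w, 1])
        # Merge backwards while the monotone order is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def select_mtd(n_dlt, n_eval, phi):
    """Return (0-based MTD index, isotonic estimates for the tried doses)."""
    tried = [j for j, n in enumerate(n_eval) if n > 0]
    rates = [n_dlt[j] / n_eval[j] for j in tried]
    fitted = pava(rates, [n_eval[j] for j in tried])
    best = min(range(len(tried)), key=lambda i: abs(fitted[i] - phi))  # ties: lower dose
    return tried[best], fitted


mtd, estimates = select_mtd(n_dlt=[0, 2, 1, 3], n_eval=[3, 6, 6, 6], phi=0.25)
print(mtd, estimates)
```

In the example, the observed rates 2/6 and 1/6 at the second and third doses violate monotonicity and are pooled to 3/12 = 0.25, so both doses share the same isotonic estimate.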

BLRM with backfill (BF-BLRM)

In this section, we present how to incorporate backfill into BLRM,²³ a model-based design. For convenience, we refer to the resulting design as BF-BLRM. The proposed method is directly applicable to other model-based designs, such as the continuous reassessment method and its extensions.

In BF-BLRM, a Bayesian logistic regression model is used to model the dose-toxicity curve. Let $d_{j}$ denote the dosage of dose level $j$ , and $d^{*}$ denote a reference dosage. BF-BLRM assumes:

$$ \operatorname{logit} (p_{j}) = \log (α) + β \left( \frac{d_{j}}{d^{*}} \right), \quad α, β > 0 , $$

$$ \begin{pmatrix} \log (α) \\ \log (β) \end{pmatrix} \sim N \left( \begin{pmatrix} μ_{α} \\ μ_{β} \end{pmatrix}, \begin{pmatrix} σ_{α}^{2} & 0 \\ 0 & σ_{β}^{2} \end{pmatrix} \right) . $$

To conduct dose escalation/de-escalation, BF-BLRM specifies an underdose cutoff $γ_{1}$ and an overdose cutoff $γ_{2}$ , dividing the toxicity probability into three intervals: the underdose interval $[0, γ_{1}]$ , the target toxicity interval $(γ_{1}, γ_{2})$ and the overdose interval $[γ_{2}, 1]$ .

Given the observed interim data, BF-BLRM estimates the posterior probability of the target toxicity (PTT) and posterior probability of overdose (POD) as $PTT_{j} = \Pr (p_{j} \in (γ_{1}, γ_{2}) \mid data)$ and $POD_{j} = \Pr (p_{j} \in [γ_{2}, 1] \mid data)$ , $j = 1, \dots, J$ . BF-BLRM identifies the dose $j^{*}$ that has the highest value of PTT with POD $< η$ among $J$ doses, where $η$ is a prespecified overdose control cutoff. BF-BLRM conducts dose-escalation/de-escalation as follows:

If $j^{*} > c$ , escalate one dose level to $c + 1$ .

If $j^{*} < c$ , de-escalate one dose level to $c - 1$ .

Otherwise, stay at the current dose $c$ .

At any time of the trial, if the POD $\geq η$ for all doses, we terminate the trial and claim that all doses are over-toxic. In this case, no dose should be selected as the MTD. The authors of BLRM recommended $η = 0.25$ , which was found to be overly conservative and led to poor accuracy in selecting the MTD.^28–31 In our simulation study, we studied different values of $η$ .
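Once the posterior quantities are available, the decision logic itself is simple. The sketch below takes the PTT and POD values as given inputs (computing them requires fitting the logistic model, which is not shown) and uses 0-based dose indices; the function name and the example numbers are hypothetical.

```python
def blrm_decision(ptt, pod, current, eta):
    """Return 'escalate', 'de-escalate', 'stay', or 'stop' from per-dose PTT and POD values.

    ptt, pod -- posterior probabilities of target toxicity and overdose, one per dose
    current  -- 0-based index of the current dose c
    eta      -- overdose control cutoff
    """
    admissible = [j for j in range(len(ptt)) if pod[j] < eta]
    if not admissible:
        return "stop"                      # POD >= eta for all doses: all doses overly toxic
    j_star = max(admissible, key=lambda j: ptt[j])
    if j_star > current:
        return "escalate"                  # move one level up, to c + 1
    if j_star < current:
        return "de-escalate"               # move one level down, to c - 1
    return "stay"


ptt = [0.20, 0.45, 0.50, 0.30]             # illustrative values only
pod = [0.01, 0.10, 0.35, 0.70]
print(blrm_decision(ptt, pod, current=1, eta=0.30))
print(blrm_decision(ptt, pod, current=1, eta=0.50))
```

The example shows how the cutoff drives the decision: with η = 0.30 the third dose is excluded by overdose control and the trial stays, whereas with η = 0.50 the third dose becomes admissible and the trial escalates.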

BF-BLRM incorporates backfilling into dose escalation similarly to BF-BOIN. Specifically, BF-BLRM adaptively opens and closes a dose level $b$ for backfilling using criteria similar to BF-BOIN, with a modification to the rule for closing a dose:

A dose level $b$ will be closed for backfilling if:

(a) the POD for dose $b$ is $\geq η$ , or

(b) The number of evaluable patients treated at $b$ is $\geq n_{cap}$ .

Patients are assigned to the current dose-escalation cohort or backfilled using the same approach as in BF-BOIN. Guidance on selecting $n_{cap}$ will be provided in the next section, following the introduction of stage 2 of the design.

The BF-BLRM enforces monotonicity in the dose-toxicity relationship by constraining $β > 0$ . Therefore, data conflicts between the backfilled dose $b$ and the current dose $c$ are automatically reconciled and smoothed out during the model fitting, obviating the need for additional rules.

At the end of stage 1, BF-BLRM selects the MTD based on all data, including dose escalation and backfilling data. The MTD is chosen as the dose that satisfies the following two conditions:

Treated with at least 6 patients,

Has the highest PTT among all doses with POD $< η$ .

Stage 2 with adaptive randomization

Suppose at the end of stage 1, two doses are selected for stage 2 randomization, referred to as $d_{low}$ and $d_{high}$ with $d_{low} < d_{high}$ . This selection is based on the evaluation of the totality of stage 1 data, including safety, efficacy, pharmacokinetics, pharmacodynamics, and tolerability. Often, $d_{high}$ is the MTD. For ease of exposition, we assume two doses, but the method is directly applicable to more than two doses.

To optimize the dose, the most straightforward approach is to randomize patients to $d_{low}$ and $d_{high}$ in a fixed ratio, most commonly 1:1. Randomization is preferred, as noted in FDA’s guidance,¹⁶ because it balances important prognostic factors between the two doses, allowing an unbiased comparison of their risk-benefit profiles to select the OBD. Since some patients were treated with $d_{low}$ and $d_{high}$ in stage 1, it is highly desirable to incorporate this stage 1 data with stage 2 data to enhance trial efficiency and reduce the sample size for dose optimization. The challenge is that incorporating non-randomized stage 1 data with randomized stage 2 data may compromise the balance of the latter, defeating the purpose of stage 2 randomization.

To address this challenge, we leverage the idea of the Pocock-Simon minimization³² for stage 2 randomization. Minimization is a widely used covariate-adaptive randomization method that is discussed in the FDA’s guidance on adaptive designs.³³ Our key idea is to randomize stage 2 patients, conditional on stage 1 data, in a covariate-adaptive way to eliminate the covariate imbalance present in the stage 1 data. By doing so, at the end of stage 2, covariates are balanced between the two dose arms. This allows stage 1 and stage 2 data to be combined to better inform dose comparison and the selection of OBD.

Let $n_{1, low}$ and $n_{1, high}$ denote the number of patients treated at dose $d_{low}$ and $d_{high}$ , respectively, in stage 1. In cases where patient enrollment criteria differ between stage 1 and stage 2 (e.g. stage 1 enrolls all-comers and stage 2 enrolls specific indications), $n_{1, low}$ and $n_{1, high}$ refer to subsets of patients who meet the stage 2 eligibility criteria. Let $N_{2}$ denote the target total sample size to be treated with $d_{low}$ and $d_{high}$ by the end of the trial, including those from both stage 1 and 2. Then, $N_{2}^{*} = N_{2} - n_{1, low} - n_{1, high}$ new patients will be enrolled and randomized in stage 2. Yang et al.²⁰ proposed a method to determine $N_{2}$ and recommended that a sample size of 20 to 40 patients per dose is often reasonable for randomized dose optimization.

Let $X_{1}$ and $X_{2}$ denote important baseline prognostic factors that we aim to balance via randomization. For illustrative purposes, we consider two prognostic factors, but the method is applicable to more than two factors. We assume that $X_{1}$ and $X_{2}$ are categorical with $L_{1}$ and $L_{2}$ levels, respectively. When prognostic factors are continuous, they can be discretized to balance their distribution between the two dose arms. Let $n_{low} (X_{k} = l)$ and $n_{high} (X_{k} = l)$ denote the numbers of patients with $X_{k} = l$ who are treated with $d_{low}$ and $d_{high}$ , respectively, where $k = 1, 2$ . The difference $| n_{low} (X_{k} = l) - n_{high} (X_{k} = l) |$ provides a measure of imbalance on level $l$ of $X_{k}$ between $d_{low}$ and $d_{high}$ .

When a new patient with $X_{1} = v_{1}$ and $X_{2} = v_{2}$ is enrolled at stage 2, the dose assignment of this patient only impacts the balance of $X_{1}$ at level $v_{1}$ and $X_{2}$ at level $v_{2}$ . Following Pocock and Simon,³² define the imbalance index that embraces both $X_{1}$ and $X_{2}$ as

$$ ω = \sum_{k = 1}^{2} \left| n_{low} (X_{k} = v_{k}) - n_{high} (X_{k} = v_{k}) \right| . \tag{1} $$

To balance the distribution of $X_{1}$ and $X_{2}$ between the two dose arms, the new patient will be assigned with probability $r$ to the dose ( $d_{low}$ or $d_{high}$ ) that minimizes $ω$ , where $r$ is a large probability between 0.8 and 1. When $r = 1$ , the patient is always assigned to the dose that minimizes $ω$ . The FDA’s guidance on adaptive designs³³ noted that setting $r < 1$ reduces the predictability of treatment assignment.

It is important to note that the calculation of $n_{low} (X_{k} = v_{k})$ and $n_{high} (X_{k} = v_{k})$ in (1) is based on both stage 1 and stage 2 data. Thus, our approach can be viewed as a conditional version of the Pocock-Simon method, termed conditional minimization, in which the randomization of stage 2 patients is conditional on stage 1 data to accommodate the mixture of non-randomized and randomized patients. By doing so, we actively rebalance prognostic factors that may not be well balanced in stage 1, so that at the end of randomization the combined stage 1 and stage 2 data effectively resemble those generated by randomizing all $N_{2}$ patients. In contrast, the Pocock-Simon method assumes “full” randomization, with the assignment of the next patient based only on the covariate distribution of patients who have already been randomized. Because substantial imbalance might be present in the stage 1 data, the stage 2 sample size (i.e. $N_{2}^{*}$ ) is often small, and a trial may have to balance several prognostic factors at a time, we generally recommend using a large value of $r$ (e.g. 0.85 to 0.95) to quickly correct the imbalance.
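The conditional minimization step can be sketched as below. This is an illustrative reading, not the authors' code: the imbalance index is evaluated after hypothetically adding the new patient to each arm (the usual Pocock-Simon convention), ties are broken by a fair coin, and the data layout (`counts` as one dictionary of level counts per factor and arm) and function names are hypothetical. The key point is that `counts` is initialized with the stage 1 patients before any stage 2 patient is randomized.

```python
import random


def imbalance_if_assigned(counts, patient, arm):
    """Imbalance index (1) after hypothetically adding the patient to `arm`."""
    total = 0
    for k, level in enumerate(patient):
        low = counts["low"][k].get(level, 0) + (arm == "low")
        high = counts["high"][k].get(level, 0) + (arm == "high")
        total += abs(low - high)
    return total


def assign(counts, patient, r=0.9, rng=random):
    """Assign one stage 2 patient and update the running counts in place."""
    w_low = imbalance_if_assigned(counts, patient, "low")
    w_high = imbalance_if_assigned(counts, patient, "high")
    if w_low == w_high:
        arm = rng.choice(["low", "high"])            # tie: fair coin (assumption)
    else:
        preferred, other = ("low", "high") if w_low < w_high else ("high", "low")
        arm = preferred if rng.random() < r else other
    for k, level in enumerate(patient):
        counts[arm][k][level] = counts[arm][k].get(level, 0) + 1
    return arm


# Stage 1 (non-randomized) counts for two binary factors X1 and X2.
counts = {
    "low": [{1: 5, 0: 1}, {1: 3, 0: 3}],
    "high": [{1: 1, 0: 5}, {1: 3, 0: 3}],
}
print(assign(counts, patient=(1, 1), r=1.0))  # X1 = 1 is over-represented at the low dose
```

In this example the stage 1 data carry a strong imbalance on $X_{1}$ , so with $r = 1$ a new patient with $X_{1} = 1$ is sent to the high dose to offset it.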

Due to the interplay between the two stages, the choice of the sample size cap for backfill in stage 1 ( $n_{cap}$ ) should account for the characteristics of both stages. For example, given a fixed $N_{2}$ , $n_{cap}$ should be set so that the number of patients carried forward to stage 2 remains sufficiently smaller than $N_{2}$ ; otherwise, there would be limited sample size remaining in stage 2 to rebalance any potential imbalance among stage 1 non-randomized patients. For the same reason, if baseline characteristics of stage 1 patients are likely to be imbalanced between $d_{low}$ and $d_{high}$ (e.g. due to excessively high patient heterogeneity), it may be preferable to reserve more sample size for adaptive randomization by using a relatively smaller $n_{cap}$ . In addition, in some applications where stage 1 enrolls patients across multiple indications but stage 2 focuses on a single indication or a subset of indications, a larger $n_{cap}$ may be appropriate because only a fraction of the patients treated at $d_{low}$ and $d_{high}$ will be eligible to carry forward into stage 2. Finally, under a fast accrual rate, a larger $n_{cap}$ may be desirable to ensure timely treatment of the patients in the trial.

OBD selection

At the end of stage 2, we identify the OBD based on data from $N_{2}$ patients, combined from stages 1 and 2. Depending on the trial setting, different criteria can be used to select the OBD. We consider two approaches, noting that their modifications and other criteria can also be used to define and select the OBD. Let ${\hat{p}}_{E, low}$ and ${\hat{p}}_{E, high}$ denote the estimates of the efficacy rate for $d_{low}$ and $d_{high}$ , respectively. These estimates can be sample means (e.g. observed efficacy rates) or model-based estimates (e.g. from a logistic model).

The first approach implicitly considers the toxicity-efficacy trade-off and selects the OBD as follows:

If ${\hat{p}}_{E, high} - {\hat{p}}_{E, low} \leq δ$ , select $d_{low}$ as the OBD; otherwise, select $d_{high}$ as the OBD,

where $δ$ is a prespecified noninferiority/indifference margin. The rationale behind this criterion is that $d_{low}$ is presumably safer than $d_{high}$ . Therefore, if the efficacy of $d_{low}$ is noninferior to $d_{high}$ , $d_{low}$ has a better toxicity-efficacy trade-off and should be selected as the OBD.

The second method explicitly accounts for the toxicity-efficacy trade-off and selects the OBD based on utility. For binary toxicity and efficacy endpoints, each patient can experience one of four possible outcomes: (toxicity, no efficacy), (no toxicity, no efficacy), (toxicity, efficacy), and (no toxicity, efficacy). Let $u_{1}, \dots, u_{4}$ denote the utility scores assigned to these outcomes, which should be elicited from clinicians to reflect the relative desirability of each outcome. Typically, $u_{1}$ is assigned a score of 0 (least desirable, toxicity without efficacy), and $u_{4}$ a score of 100 (most desirable, no toxicity with efficacy). Clinicians then provide scores for the other outcomes based on this reference. Table 2 provides an example of elicited utility scores.

Table 2.

Utility ascribed to each possible efficacy-toxicity outcome.

	Toxicity	No toxicity
No efficacy	0	30
Efficacy	50	100

Let $π_{j 1}, π_{j 2}, π_{j 3}$ and $π_{j 4}$ denote the probabilities of occurrence of these four outcomes at dose level $j$ . We assume the prior distribution of $π_{j} = (π_{j 1}, π_{j 2}, π_{j 3}, π_{j 4})$ is Dirichlet $(α_{1}, α_{2}, α_{3}, α_{4})$ . Let $n_{j 1}$ , $n_{j 2}$ , $n_{j 3}$ , and $n_{j 4}$ denote the numbers of patients treated at dose level $j$ who experienced each of these four outcomes. Applying the Dirichlet-multinomial model, the posterior distribution of $π_{j}$ is:

$$ π_{j} \mid data \sim \text{Dirichlet} (α_{1} + n_{j 1}, α_{2} + n_{j 2}, α_{3} + n_{j 3}, α_{4} + n_{j 4}) . $$

The posterior mean utility of dose $j$ is estimated as:

$$ {\hat{U}}_{j} = \sum_{k = 1}^{4} u_{k} E (π_{jk} \mid data) . $$

Whichever of $d_{low}$ and $d_{high}$ maximizes ${\hat{U}}_{j}$ is selected as the OBD.
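Because the posterior mean of a Dirichlet distribution is available in closed form, the utility-based comparison reduces to a few lines. In the sketch below, the utilities follow Table 2 in the outcome order given above, while the prior hyperparameters (all equal to 1) and the outcome counts are hypothetical values chosen only for illustration.

```python
def posterior_mean_utility(counts, utilities, prior=(1.0, 1.0, 1.0, 1.0)):
    """Posterior mean utility under a Dirichlet-multinomial model.

    counts    -- patients with (tox, no eff), (no tox, no eff), (tox, eff), (no tox, eff)
    utilities -- utility scores u_1..u_4 in the same order
    prior     -- Dirichlet hyperparameters alpha_1..alpha_4 (hypothetical default)
    """
    post = [a + n for a, n in zip(prior, counts)]
    total = sum(post)
    # E(pi_k | data) = (alpha_k + n_k) / sum over all four outcomes
    return sum(u * p / total for u, p in zip(utilities, post))


UTILITIES = (0, 30, 50, 100)               # Table 2, in the outcome order above
u_low = posterior_mean_utility((2, 11, 1, 6), UTILITIES)
u_high = posterior_mean_utility((4, 7, 3, 6), UTILITIES)
print(round(u_low, 2), round(u_high, 2))
print("OBD:", "d_low" if u_low >= u_high else "d_high")
```

With these illustrative counts (20 patients per arm), the lower dose has the slightly larger posterior mean utility, driven by its lower toxicity.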

In both approaches, we require that the OBD $j$ also satisfies the following safety and efficacy requirements:

(Safety) $Pr (p_{j} > ϕ_{T} | data) \leq C_{T}$ ,

(Efficacy) $Pr (p_{E, j} < ϕ_{E} | data) \leq C_{E}$ ,

where $ϕ_{T}$ and $ϕ_{E}$ are the upper and lower limits of the toxicity and efficacy rates, respectively, and $C_{T}$ and $C_{E}$ are probability cutoffs calibrated through simulation. Typically, $C_{T}$ and $C_{E}$ are set to relatively high values, such as 0.8 to 0.95, to minimize the probability of incorrectly ruling out safe and effective doses. If $d_{low}$ has a higher posterior probability of being overly toxic than $d_{high}$ , isotonic-adjusted posterior probabilities are used to evaluate the safety condition.

If only one of $d_{low}$ and $d_{high}$ satisfies the safety and efficacy requirements, that dose will be selected as the OBD.

To facilitate the use of the BARD design, software is available at www.trialdesign.org, allowing users to run simulations and conduct trials.

Numerical studies

Simulation setting

We considered a trial where stage 1 aims to find the MTD from 5 doses with a maximum sample size of 30 for dose escalation. The dose escalation starts from the lowest dose, and patients are treated in cohorts of 3. The accrual rate is 3/month, and the DLT assessment window is 1 month. The sample size cap for backfilling is $n_{cap}$ = 12 per dose. At the end of stage 1, the identified MTD and the dose one level lower (if it exists) move forward to stage 2 for randomization. The targeted total sample size for stage 2 is $N_{2} = 40$ , with 20 patients per dose arm. The randomization parameter $r = 0.95$ is used in stage 2 to assign patients to the arm that minimizes covariate imbalance.

We compared two BARD designs, BARD-BOIN and BARD-BLRM, with their conventional counterparts, referred to as BOIN-SR and BLRM-SR, where BOIN or BLRM is used for stage 1 dose escalation followed by 1:1 simple randomization. In BOIN-SR and BLRM-SR, stage 1 data are not combined with stage 2 data. Thus, a total of 40 new patients are enrolled and randomized in stage 2 to reach 20 patients per dose arm.

For BARD-BOIN and BOIN-SR, the target DLT rate is set at $ϕ = 0.25$ , with the corresponding escalation and de-escalation boundaries being $λ_{e} = 0.197$ and $λ_{d} = 0.298$ , respectively. Stage 1 dose escalation ends when the maximum dose-escalation sample size of 30 is reached, or the number of patients treated at the current dose reaches $n_{stop} = 9$ and the decision is “stay.” The default cutoff $C = 0.95$ is used in its overdose control rule.
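The boundary values quoted above can be reproduced from the standard BOIN formulas. The formulas and the default choices $ϕ_{1} = 0.6ϕ$ and $ϕ_{2} = 1.4ϕ$ are taken from the original BOIN design, not from this text, so they are stated here as assumptions; the function name is hypothetical.

```python
from math import log


def boin_boundaries(phi, phi1=None, phi2=None):
    """Escalation and de-escalation boundaries of the BOIN design.

    phi1 -- highest DLT rate regarded as underdosing (assumed default 0.6 * phi)
    phi2 -- lowest DLT rate regarded as overdosing (assumed default 1.4 * phi)
    """
    phi1 = 0.6 * phi if phi1 is None else phi1
    phi2 = 1.4 * phi if phi2 is None else phi2
    lam_e = log((1 - phi1) / (1 - phi)) / log(phi * (1 - phi1) / (phi1 * (1 - phi)))
    lam_d = log((1 - phi) / (1 - phi2)) / log(phi2 * (1 - phi) / (phi * (1 - phi2)))
    return lam_e, lam_d


lam_e, lam_d = boin_boundaries(0.25)
print(round(lam_e, 3), round(lam_d, 3))
```

For $ϕ = 0.25$ this gives boundaries that round to 0.197 and 0.298, matching the values used in the simulation.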

For BARD-BLRM and BLRM-SR, the target toxicity interval is set at $(0.16, 0.33)$ . The dosages are set as 10, 20, 50, 100, and 200, with the reference dosage being 50. The following weakly informative prior suggested by Neuenschwander et al.²³ is used to fit the model:

$$ \begin{pmatrix} \log (α) \\ \log (β) \end{pmatrix} \sim N \left( μ = \begin{pmatrix} - 1.1 \\ 0 \end{pmatrix}, V = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix} \right) . \tag{2} $$

The overdose control cutoff is set at $η = 0.30$ . The results for $η = 0.50$ and 0.95 are provided in the Supplementary materials. Unlike BOIN, BLRM does not have a rule for stopping the trial when the number of patients treated at the current dose reaches $n_{stop} = 9$ and the decision is “stay.” To facilitate comparison, we calibrated the maximum stage 1 dose-escalation sample size of BARD-BLRM and BLRM-SR so that their average sample size in stage 1 matches that of BARD-BOIN and BOIN-SR.

To evaluate the performance of BARD-BOIN and BARD-BLRM in balancing covariates, we assumed three binary prognostic factors $X_{1}, X_{2}$ and $X_{3}$ , generated from Bernoulli (0.5), which are related to the efficacy rate $p_{E}$ as follows: $logit (p_{E} \mid d_{j}) = β_{0 j} + β_{1} X_{1} + β_{2} X_{2} + β_{3} X_{3}$ , where $β_{1} = 1.7$ , $β_{2} = - 1.5$ , $β_{3} = 0.4$ , and the values of the intercepts $β_{0 j}$ can be found in Table S1 in the Supplementary materials. In our stage 2 covariate-adaptive randomization algorithm, we included only $X_{1}$ and $X_{2}$ . We intentionally omitted $X_{3}$ from the algorithm to compare the balance of a covariate that is included with that of one that is not. In addition, we included Pocock-Simon randomization with 40 patients as the benchmark for comparison.

At the end of stage 2, the OBD is identified by the two approaches described previously. In the first efficacy-rate-based approach, we used $δ = 0.05$ to select the OBD. In the utility-based method, we assigned utility scores according to Table 2. We set $ϕ_{T} = 0.3$ , $ϕ_{E} = 0.2$ , $C_{T} = 0.9$ , and $C_{E} = 0.95$ to ensure safety and efficacy of the OBD.

We considered 8 representative scenarios that differ in the toxicity-response curves and the location of the OBD, as presented in Table 3. In scenarios 1–4, the toxicity-response curve is monotone increasing, while in scenarios 5–8, the response rate plateaus below the MTD.

Table 3.

Simulation scenarios, with the OBD highlighted in bold.

		Dose level
Scenario		1	2	3	4	5
1	DLT	0.12	0.25	0.42	0.49	0.55
	Efficacy	0.181	0.349	0.439	0.519	0.596
	Utility	38.6	45.2	44.4	46.6	48.7
2	DLT	0.04	0.12	0.25	0.43	0.63
	Efficacy	0.152	0.181	0.349	0.439	0.519
	Utility	39.3	38.6	45.2	44.0	40.9
3	DLT	0.02	0.06	0.1	0.25	0.4
	Efficacy	0.103	0.152	0.181	0.349	0.439
	Utility	36.6	38.6	39.3	45.2	45.2
4	DLT	0.02	0.05	0.08	0.11	0.25
	Efficacy	0.046	0.103	0.152	0.181	0.349
	Utility	32.6	35.6	38.0	38.9	45.2
5	DLT	0.12	0.25	0.42	0.49	0.55
	Efficacy	0.349	0.349	0.359	0.359	0.359
	Utility	50.0	45.2	39.5	36.9	34.7
6	DLT	0.04	0.12	0.25	0.43	0.63
	Efficacy	0.181	0.349	0.349	0.359	0.359
	Utility	41.3	50.0	45.2	39.1	31.7
7	DLT	0.02	0.06	0.1	0.25	0.4
	Efficacy	0.152	0.181	0.349	0.349	0.359
	Utility	40.0	40.6	50.8	45.2	40.3
8	DLT	0.02	0.05	0.08	0.11	0.25
	Efficacy	0.103	0.152	0.181	0.349	0.349
	Utility	36.6	39.0	39.9	50.4	45.2

The following performance metrics were evaluated based on 30,000 simulated trials.

Average total sample size.

Average trial duration.

Imbalance index, defined as the absolute difference between the proportion of patients with $X_{k} = 1$ in $d_{low}$ and $d_{high}$ , $k = 1, 2, 3$ , which measures the imbalance of the distribution of $X_{k}$ between $d_{low}$ and $d_{high}$ . A smaller value of imbalance index indicates a better balance.

Imbalance in allocation, defined as the absolute difference in the number of patients treated at $d_{low}$ and that treated at $d_{high}$ .

PCS1: the percentage of correct selection (PCS) of the true OBD based on the efficacy-rate-based approach.

PCS2: the PCS of the true OBD based on the utility approach.
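The two imbalance metrics defined above are simple to compute. The following is a minimal sketch with hypothetical function names and made-up data, assuming each arm's covariate values are stored as a list of 0/1 indicators. The function returns a proportion; the values reported in the results appear to be on a percentage scale.

```python
def imbalance_index(x_low, x_high):
    """Absolute difference in the proportion with X_k = 1 between d_low and d_high."""
    if not x_low or not x_high:
        raise ValueError("both arms need at least one patient")
    p_low = sum(x_low) / len(x_low)
    p_high = sum(x_high) / len(x_high)
    return abs(p_low - p_high)

def allocation_imbalance(n_low, n_high):
    """Absolute difference in the number of patients treated at the two doses."""
    return abs(n_low - n_high)

# Made-up example: 8 patients at d_low, 10 patients at d_high.
x1_low = [1, 0, 1, 1, 0, 0, 1, 0]          # 4 of 8 have X1 = 1
x1_high = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]   # 7 of 10 have X1 = 1
print(imbalance_index(x1_low, x1_high))
print(allocation_imbalance(len(x1_low), len(x1_high)))
```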

Results

Table 4 summarizes the operating characteristics of the designs. BARD-BOIN outperforms its counterpart, BOIN-SR. Across all eight scenarios, BARD-BOIN reduces the sample size by 10–15 patients and shortens the trial duration by 7–8 months compared with BOIN-SR, because of the integration of stage 1 and stage 2 data. In addition, BARD-BOIN achieves substantially better balance on $X_{1}$ and $X_{2}$ than BOIN-SR, with the imbalance index for BARD-BOIN often being about one-third that of BOIN-SR. This highlights the effectiveness of the proposed covariate-adaptive randomization, which yields superior covariate balance compared with the “full” simple randomization (without combining stage 1 data). The covariate balance under BARD-BOIN is very similar to that of the “full” Pocock-Simon assignment (without combining stage 1 data), further confirming the approach’s effectiveness. For the omitted covariate $X_{3}$, the balance remains comparable with that of full simple randomization, indicating that the presence of unknown prognostic factors does not compromise the proposed method. The cumulative numbers of patients allocated to $d_{low}$ and $d_{high}$ are nearly 1:1, with the average difference generally less than one patient.

Table 4.

Operating characteristics of BARD-BOIN and BARD-BLRM, in comparison with BOIN-SR and BLRM-SR.

Design	N	Duration (months)	Imbalance^a $X_{1}$	Imbalance^a $X_{2}$	Imbalance^a $X_{3}$	Imbalance^a allocation	PCS1	PCS2
Scenario 1
BARD-BOIN	39.45	17.77	4.50(3.46)	4.48(3.49)	12.59(12.56)	0.98(0.82)	51.58	48.70
BARD-BLRM	31.85	15.16	7.55(3.46)	7.61(3.46)	12.66(12.57)	2.46(0.83)	45.77	42.99
BOIN-SR	54.83	24.72	12.56	12.57	12.42	0	49.87	48.34
BLRM-SR	44.95	20.75	12.50	12.37	12.45	0	42.24	40.79
Scenario 2
BARD-BOIN	48.60	20.86	4.13(3.48)	4.13(3.49)	12.51(12.58)	0.87(0.82)	50.26	47.63
BARD-BLRM	43.79	19.60	6.34(3.47)	6.40(3.50)	12.62(12.49)	1.86(0.81)	31.86	31.35
BOIN-SR	63.50	28.53	12.60	12.51	12.53	0	49.87	48.14
BLRM-SR	59.61	26.93	12.51	12.47	12.44	0	30.66	29.44
Scenario 3
BARD-BOIN	53.35	22.39	3.83(3.47)	3.81(3.47)	12.57(12.57)	0.78(0.82)	51.17	47.31
BARD-BLRM	49.83	21.95	6.31(3.52)	6.32(3.47)	12.60(12.62)	2.03(0.82)	31.74	29.76
BOIN-SR	65.70	29.86	12.60	12.55	12.58	0	50.30	47.87
BLRM-SR	64.83	29.79	12.48	12.66	12.49	0	29.22	27.65
Scenario 4
BARD-BOIN	53.43	22.87	3.35(3.48)	3.28(3.46)	12.46(12.54)	0.65(0.81)	49.73	46.70
BARD-BLRM	50.65	22.65	5.15(3.50)	5.08(3.49)	12.62(12.64)	1.47(0.83)	29.63	28.35
BOIN-SR	64.83	29.46	12.55	12.55	12.55	0	47.74	45.25
BLRM-SR	65.02	29.94	12.49	12.67	12.50	0	28.13	26.82
Scenario 5
BARD-BOIN	39.53	17.36	4.80(3.48)	4.72(3.48)	12.57(12.57)	1.06(0.82)	69.42	71.51
BARD-BLRM	31.97	14.72	7.59(3.48)	7.71(3.50)	12.67(12.45)	2.43(0.83)	51.86	53.74
BOIN-SR	54.83	24.72	12.56	12.57	12.42	0	66.32	66.83
BLRM-SR	44.95	20.75	12.50	12.37	12.45	0	50.74	51.12
Scenario 6
BARD-BOIN	48.90	20.58	4.28(3.47)	4.29(3.50)	12.58(12.53)	0.90(0.83)	62.76	63.81
BARD-BLRM	44.00	19.30	6.55(3.48)	6.61(3.50)	12.63(12.51)	1.94(0.81)	66.92	66.70
BOIN-SR	63.50	28.53	12.60	12.51	12.53	0	59.87	61.12
BLRM-SR	59.61	26.93	12.51	12.47	12.44	0	63.71	65.28
Scenario 7
BARD-BOIN	54.24	22.16	3.91(3.46)	3.94(3.48)	12.61(12.54)	0.80(0.82)	57.87	61.69
BARD-BLRM	50.63	21.67	6.43(3.48)	6.42(3.49)	12.65(12.6)	2.05(0.82)	60.45	63.28
BOIN-SR	65.70	29.86	12.60	12.55	12.58	0	54.72	57.92
BLRM-SR	64.83	29.79	12.48	12.66	12.49	0	56.93	60.56
Scenario 8
BARD-BOIN	55.09	22.64	3.45(3.48)	3.36(3.47)	12.46(12.62)	0.68(0.82)	63.25	65.64
BARD-BLRM	52.11	22.30	5.24(3.50)	5.19(3.47)	12.60(12.54)	1.49(0.82)	51.82	54.03
BOIN-SR	64.83	29.46	12.55	12.55	12.55	0	62.29	65.11
BLRM-SR	65.02	29.94	12.49	12.67	12.50	0	50.14	52.79

N: average total sample size; PCS1: percentage of correct selection (PCS) of the true OBD based on the efficacy-rate-based approach; PCS2: PCS of the true OBD based on the utility approach.

^a Numbers in parentheses are the results from the Pocock-Simon method with 40 patients randomized.

In terms of OBD selection, BARD-BOIN generally outperforms BOIN-SR, with PCS1 and PCS2 that are, on average, 1.89 and 1.55 percentage points higher, respectively. For example, in scenarios 6 and 7, the PCS1 of BARD-BOIN is 2.89 and 3.15 percentage points higher than that of BOIN-SR. This result is remarkable, considering that BARD-BOIN uses a smaller overall sample size. Of note, although BARD-BOIN has a smaller overall sample size, the number of patients used to inform the OBD selection (i.e. $N_{2}$) is the same for both methods. The performance gain may stem from BARD-BOIN’s better balance in prognostic factors, which results in more accurate estimates (see Table S2 in the Supplementary materials) and, consequently, higher PCS.
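The average gains quoted above can be checked directly from the PCS columns of Table 4. The short sketch below copies the BARD-BOIN and BOIN-SR values for scenarios 1–8 and averages the per-scenario differences; the variable and function names are illustrative only.

```python
# PCS1 and PCS2 (in %) for scenarios 1-8, copied from Table 4.
bard_boin_pcs1 = [51.58, 50.26, 51.17, 49.73, 69.42, 62.76, 57.87, 63.25]
boin_sr_pcs1 = [49.87, 49.87, 50.30, 47.74, 66.32, 59.87, 54.72, 62.29]
bard_boin_pcs2 = [48.70, 47.63, 47.31, 46.70, 71.51, 63.81, 61.69, 65.64]
boin_sr_pcs2 = [48.34, 48.14, 47.87, 45.25, 66.83, 61.12, 57.92, 65.11]

def mean_gain(new, ref):
    """Average per-scenario difference (new minus ref), in percentage points."""
    diffs = [a - b for a, b in zip(new, ref)]
    return sum(diffs) / len(diffs)

gain_pcs1 = mean_gain(bard_boin_pcs1, boin_sr_pcs1)
gain_pcs2 = mean_gain(bard_boin_pcs2, boin_sr_pcs2)
print(f"mean PCS1 gain: {gain_pcs1:.2f} percentage points")
print(f"mean PCS2 gain: {gain_pcs2:.2f} percentage points")
```

The averages come out at roughly 1.9 and 1.55 percentage points, in line with the figures reported in the text; a difference in the last digit can arise because the table entries are themselves rounded. The per-scenario differences also show that the PCS2 gain is not uniform: it is slightly negative in scenarios 2 and 3 and largest in scenario 5.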

The two OBD selection approaches (evaluated by PCS1 and PCS2) perform comparably overall. Since they are based on different criteria, reflecting distinct clinical considerations and suited to different clinical settings, it is more meaningful to focus on their overall operating characteristics than to compare them directly and draw general conclusions about which approach is superior.

Similar patterns are observed when comparing BARD-BLRM to BLRM-SR across these performance metrics. Specifically, BARD-BLRM reduces the sample size by 12–15 patients and shortens the trial duration by 6–8 months compared with BLRM-SR. In addition, BARD-BLRM demonstrates greater accuracy in identifying the OBD, with higher PCS, and a superior ability to balance covariates compared with BLRM-SR.

Between BARD-BOIN and BARD-BLRM, BARD-BOIN often exhibits higher accuracy in identifying the OBD, as evidenced by higher PCS1 and PCS2. This is primarily due to BLRM’s lower probability of correctly identifying the MTD. Table S3 in the Supplementary materials summarizes stage 1 of the BARD-BOIN and BARD-BLRM designs in the simulation. BARD-BLRM has a lower probability of carrying forward the true OBD to stage 2. Our results align with previous findings that BLRM tends to be overly conservative, resulting in a lower probability of identifying the MTD.^28–31 Tables S4 and S5 in the Supplementary materials present the results for BARD-BLRM with $η = 0.50$ and 0.95. Relaxing overdose control increases both PCS1 and PCS2, approaching the performance of BARD-BOIN, but at the cost of a higher risk of overdosing patients. These results are consistent with previous findings.²⁸

However, it was somewhat unexpected that BARD-BLRM showed notably worse covariate balance than BARD-BOIN, although it still outperformed simple randomization, given that both designs use the same covariate-adaptive randomization method in stage 2. A key factor contributing to this result is the rigidity of BLRM due to the use of the two-parameter logistic model. The concept of “rigidity” is defined and discussed in Cheung³⁴ and Iasonos et al.³⁵ It refers to the tendency of a flexible model to overfit the data, which in turn causes the dose-finding process to become stuck at a low dose: higher doses that are actually safe appear toxic based on the data from only a few patients (e.g. 3) and are therefore not explored. Once the process is stuck at a dose, treating more patients does not resolve the issue. Given the limited data available at the beginning of a dose-finding trial (e.g. data from only 3 or 6 patients), the two-parameter logistic model is often deemed overly flexible, leading to overfitting and getting stuck at a particular dose. As a result, BLRM often leads to a highly imbalanced number of patients between $d_{low}$ and $d_{high}$ at the end of stage 1. This imbalance is carried over to stage 2 and is difficult to fully correct given the limited sample size in that stage. Supplementary materials section 4 provides further explanation and numerical results on this issue.

Sensitivity analysis

We conducted a sensitivity analysis to assess the robustness of BARD-BOIN regarding the number of covariates, the stage 2 sample size $N_{2}$, the stage 2 adaptive randomization probability $r$, as well as the number of doses $J$. We focused on BARD-BOIN due to its superior performance in balancing covariates.

Figure 1 depicts the differences in the covariate imbalance index for $X_{1}$ between BARD-BOIN and “full” Pocock-Simon randomization. Due to the symmetric role of covariates, the covariate imbalance index for the other covariates is similar to that of $X_{1}$; therefore, only $X_{1}$ is displayed. Figure 1(a) presents results for 2, 3, and 4 prognostic factors with $N_{2} = 40$, while Figure 1(b) shows results for $N_{2} = 40$, 60, and 80 with 2 covariates. The imbalance under BARD-BOIN is comparable with that of full Pocock-Simon randomization, with the imbalance index generally no more than 1.5% higher. This demonstrates the robustness of BARD-BOIN to variations in both the number of prognostic factors and the target sample size for randomized doses. Figure 1(c) presents results for $r = 0.75$, 0.85, and 0.95. A larger value of $r$ yields notably better covariate balance, as it more strongly favors randomizing patients to the dose arm that minimizes imbalance. These results support our recommendation of using a large value of $r$ (e.g. 0.85 to 0.95) to quickly correct any imbalance carried over from stage 1 non-randomized patients. The accuracy of identifying the OBD (PCS1 and PCS2), provided in Supplementary materials section 5, is not sensitive to the number of prognostic factors or the adaptive randomization probability $r$, but as expected, improves as the stage 2 sample size $N_{2}$ increases.

Figure 1.

The difference in the imbalance index of $X_{1}$ between BARD-BOIN and the Pocock-Simon method under different (a) numbers of covariates adjusted, (b) target sample sizes $N_{2}$ in stage 2, and (c) stage 2 adaptive randomization probabilities $r$.
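To illustrate why a larger $r$ corrects imbalance faster, the following is a minimal sketch of a generic Pocock-Simon-style minimization step for two arms. It is not necessarily the exact rule used in BARD: the function names, the use of summed marginal count differences as the imbalance measure, and the stage 1 carry-over counts are all assumptions made for illustration.

```python
import random

def marginal_imbalance(arms, covs, candidate):
    """Total marginal imbalance if a patient with `covs` joined `candidate`.

    `arms` maps arm name -> list of per-factor [count at level 0, count at level 1].
    Only the new patient's own covariate levels are compared, as in minimization.
    """
    other = "high" if candidate == "low" else "low"
    total = 0
    for k, level in enumerate(covs):
        n_candidate = arms[candidate][k][level] + 1
        n_other = arms[other][k][level]
        total += abs(n_candidate - n_other)
    return total

def assign(arms, covs, r, rng):
    """Assign to the imbalance-minimizing arm with probability r; 1:1 on ties."""
    g_low = marginal_imbalance(arms, covs, "low")
    g_high = marginal_imbalance(arms, covs, "high")
    if g_low == g_high:
        p_low = 0.5
    else:
        p_low = r if g_low < g_high else 1 - r
    arm = "low" if rng.random() < p_low else "high"
    for k, level in enumerate(covs):
        arms[arm][k][level] += 1  # update counts with the newly assigned patient
    return arm

rng = random.Random(7)
# Hypothetical stage 1 carry-over counts for X1 and X2 (8 patients vs 3 patients).
arms = {"low": [[2, 6], [5, 3]], "high": [[2, 1], [1, 2]]}
stage2 = [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(40)]
for covs in stage2:
    assign(arms, covs, r=0.85, rng=rng)
print(arms)
```

Seeding the counts with stage 1 patients is what lets the randomization account for non-randomized data: early stage 2 patients are steered toward the arm that is under-represented at their covariate levels, and with $r$ close to 1 this steering is nearly deterministic.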

Tables S6–S8 in the Supplementary materials show the settings and results with $J = 3$ doses. The results are generally consistent with those observed with 5 doses. Specifically, compared with BOIN-SR, BARD-BOIN reduces the sample size and trial duration, achieves better covariate balance, and improves the accuracy of identifying the OBD.

Discussion

We have proposed a seamless two-stage design, BARD, that integrates backfilling and adaptive randomization for efficient dose optimization. Backfilling allows additional patients to be enrolled at doses deemed safe and showing promising activity, enhancing patient enrollment and data generation without extending the trial duration. The adaptive randomization enables the combination of data from dose escalation and randomization without compromising the balance of baseline characteristics between comparative dose arms. BARD designs offer an efficient solution to meet the dose optimization requirements set by Project Optimus.

Backfilling and adaptive randomization substantially enhance trial efficiency when used together, but they do not necessarily need to be bundled. Stage 1 dose escalation can proceed without backfilling while still utilizing adaptive randomization to combine stage 1 and 2 data for a more efficient comparison of multiple doses. In addition, while we focus on using the Pocock-Simon method, its various extensions and other covariate-balancing randomization methods³⁶ can also be employed when appropriate.

Because dose selection is performed at the end of stage 1, one may be concerned about potential estimation bias when stage 1 and 2 data are combined to estimate and select the OBD at the end of the trial, an issue analogous to what occurs in inferential phase II–III trials.³⁷ Table S2 in the Supplementary materials presents the bias of the estimates of the DLT rate and response rate at the end of the trial. The estimate of the response rate has minimal bias. The estimate of the DLT rate exhibits a small negative bias (−0.015 to −0.022), which arises from the DLT-data-dependent dose assignment and selection in stage 1. However, this bias is generally negligible relative to the high heterogeneity typically observed in early-phase patients and the inherent variance of the DLT estimate.

Stage 1 of the BARD designs centers on dose escalation based on DLT and the identification of the MTD. When suitable, efficacy-integrated designs, such as EffTox⁷ and BOIN12,⁹ can be used to more efficiently identify doses likely to be the OBD, which can then be advanced to stage 2 for adaptive randomization. In addition, our simulation does not include interim toxicity and futility monitoring, which could further reduce the sample size if one or two doses in stage 2 are overly toxic or futile. The Bayesian optimal phase 2 design^38,39 can be employed to achieve this goal. Finally, this article focuses on single-agent dose-finding trials. Extending BARD to combination trials involving the identification of the OBD from a dose matrix is a topic for future research.

Supplemental Material

Supplemental material, sj-pdf-1-ctj-10.1177_17407745251350596 for BARD: A seamless two-stage dose optimization design integrating backfill and adaptive randomization by Yixuan Zhao, Rachael Liu, Jianchang Lin and Ying Yuan in Clinical Trials

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Yuan’s research was partially supported by Award Numbers P50CA281701, P50CA127001, and U24CA274212 from the National Cancer Institute, and the Bettyann Asche Murray Distinguished Professorship.

ORCID iD

Ying Yuan

Supplemental material

Supplemental material for this article is available online.

References

1. Ratain. Redefining the primary objective of phase I oncology trials. Nature Rev Clin Oncol 2014; 11: 503–504.

2. Yan, Thall, Yuan. Phase I-II clinical trial design: a state-of-the-art paradigm for dose finding. Ann Oncol 2018; 29: 694–699.

3. Shah, Rahman, Theoret, et al. The drug-dosing conundrum in oncology—when less is more. N Engl J Med 2021; 385(16): 1445–1447.

4. U.S. Food & Drug Administration. Project Optimus: reforming the dose optimization and dose selection paradigm in oncology, https://www.fda.gov/about-fda/oncology-center-excellence/project-optimus (2024, accessed 13 July 2024).

5. Yuan, Nguyen, Thall. Bayesian designs for phase I-II clinical trials. 1st ed. New York: Chapman and Hall/CRC, 2016.

6. Yuan, Zhou, Liu. Statistical and practical considerations in planning and conduct of dose-optimization trials. Clin Trials 2024; 21(3): 273–286.

7. Thall, Cook JD. Dose-finding based on efficacy toxicity trade-offs. Biometrics 2004; 60(3): 684–693.

8. Jin, Liu, Thall, et al. Using data augmentation to facilitate conduct of phase I-II clinical trials with delayed outcomes. J Am Stat Assoc 2014; 109(506): 525–536.

9. Lin, Zhou, Yan, et al. BOIN12: Bayesian optimal interval phase I/II trial design for utility-based dose finding in immunotherapy and targeted therapies. JCO Precis Oncol 2020; 4(4): 1393–1402.

10. Takeda, Taguri, Morita. BOIN-ET: Bayesian optimal interval design for dose finding based on both efficacy and toxicity outcomes. Pharm Stat 2018; 17(4): 383–395.

11. Shi, Cao, Yuan, et al. uTPI: a utility-based toxicity probability interval design for phase I/II dose-finding trials. Stat Med 2021; 40(11): 2626–2649.

12. National Library of Medicine. Intratumoral injections of LL37 for melanoma, https://clinicaltrials.gov/ct2/show/record/NCT02225366 (2021, accessed 13 July 2024).

13. Msaouel, Goswami, Thall, et al. A phase 1-2 trial sitravatinib and nivolumab in clear cell renal cell carcinoma following progression on antiangiogenic therapy. Sci Trans Med 2022; 14(641): eabm6420.

14. Pan, Tan, Shan, et al. Phase I study of donor derived CD5 CAR T cells in patients with relapsed or refractory T-cell acute lymphoblastic leukemia. J Clin Oncol 2022; 4: 7028–7028.

15. Simon, Wittes, Ellenberg SS. Randomized phase II clinical trials. Cancer Treat Rep 1985; 69(12): 1375.

16. U.S. Food & Drug Administration. Optimizing the dosage of human prescription drugs and biological products for the treatment of oncologic diseases, https://www.fda.gov/regulatory-information/search-fda-guidance-documents/optimizing-dosage-human-prescription-drugs-and-biological-products-treatment-oncologic-diseases (2021, accessed 13 July 2024).

17. Hoering, LeBlanc, Crowley. Seamless phase I-II trial design for assessing toxicity and efficacy for targeted agents. Clin Cancer Res 2011; 17(4): 640–646.

18. Guo, Yuan. Droid: dose-ranging approach to optimizing dose in oncology drug development. Biometrics 2023; 79(4): 2907–2919.

19. Zhou, Lee, Yuan. A utility-based Bayesian optimal interval (U-BOIN) phase I/II design to identify the optimal biological dose for targeted and immune therapies. Stat Med 2019; 38(28): 5299–5316.

20. Yang, Lin, et al. Design and sample size determination for multiple-dose randomized phase II trials for dose optimization. Stat Med 2024; 43(15): 2972–2986.

21. Foster, Korn, Freidlin, et al. The potential to backfill in phase I trials: the National Cancer Institute’s Cancer Therapy Evaluation Program experience. JNCI Cancer Spectr 2023; 7(6): pkad102.

22. Yuan, Hess, Hilsenbeck, et al. Bayesian optimal interval design: a simple and well-performing design for phase I oncology trials. Clin Cancer Res 2016; 22(17): 4291–4301.

23. Neuenschwander, Branson, Gsponer. Critical aspects of the Bayesian approach to phase I cancer trials. Stat Med 2008; 27(13): 2420–2439.

24. Yan, Mandrekar, Yuan. Keyboard: a novel Bayesian toxicity probability interval design for phase I clinical trials. Clin Cancer Res 2017; 23(15): 3994–4003.

25. O’Quigley, Pepe, Fisher. Continual reassessment method: a practical design for phase 1 clinical trials in cancer. Biometrics 1990; 46(1): 33–48.

26. Zhao, Yuan, Korn, et al. Backfilling patients in phase I dose-escalation trials using Bayesian Optimal Interval Design (BOIN). Clin Cancer Res 2024; 30(4): 673–679.

27. Villar, Dehbi HM. Implementing and assessing Bayesian response-adaptive randomisation for backfilling in dose-finding trials. Contemp Clin Trials 2024; 142: 107567.

28. Zhou, Yuan, Nie. Accuracy, safety, and reliability of novel phase I trial designs. Clin Cancer Res 2018; 24(18): 4357–4364.

29. Zhang, Chiang, Wang. Improving the performance of Bayesian logistic regression model with overdose control in oncology dose finding studies. Stat Med 2022; 41: 5463–5483.

30. Yuan, Zhao. Commentary on “improving the performance of Bayesian logistic regression model with overdose control in oncology dose-finding studies.” Stat Med 2022; 41(27): 5484–5490.

31. Yang, Cheng, Lin. On the relative conservativeness of Bayesian logistic regression method in oncology dose-finding studies. Pharm Stat 2024; 23(4): 585–594.

32. Pocock, Simon. Sequential treatment assignment with balancing for prognostic factors in the controlled clinical trial. Biometrics 1975; 31(1): 103–115.

33. U.S. Food & Drug Administration. Adaptive designs for clinical trials of drugs and biologics guidance for industry, https://www.fda.gov/regulatory-information/search-fda-guidance-documents/adaptive-design-clinical-trials-drugs-and-biologics-guidance-industry (2019, accessed 28 August 2024).

34. Cheung YK. Dose finding by the continual reassessment method. 1st ed. New York: Chapman and Hall/CRC, 2011.

35. Iasonos, Wages, Conaway, et al. Dimension of model parameter space and operating characteristics in adaptive dose-finding studies. Stat Med 2016; 35(21): 3760–3775.

36. Scott, McPherson, Ramsay, et al. The method of minimization for allocation to clinical trials: a review. Control Clin Trials 2002; 23(6): 662–674.

37. Jiang, Yuan. Seamless phase II/III design: a useful strategy to reduce the sample size for dose optimization. J Natl Cancer Inst 2023; 115(9): 1092–1098.

38. Zhou, Lee, Yuan. BOP2: Bayesian optimal design for phase II clinical trials with simple and complex endpoints. Stat Med 2017; 36(21): 3302–3314.

39. Chen, Zhou, Lee, et al. BOP2-TE: Bayesian optimal phase 2 design for jointly monitoring efficacy and toxicity with application to dose optimization. J Biopharm Stat. Epub ahead of print 24 November 2024. DOI: 10.1080/10543406.2024.2429481.
