Abstract
Phase I dose-escalation studies of single agents and combinations of anti-cancer agents have used various model-based designs to guide identification of a maximum tolerated dose and recommended phase II dose. This work describes a parallel approach to dose escalation that expedites identification of maximum tolerated doses for an anti-cancer agent both as monotherapy and in combination with another agent. We develop a three-parameter Bayesian logistic regression model that allows for more efficient use of information between the monotherapy and combination parts of the study. The model allows the monotherapy and combination data to drive dose escalation of the combination of treatments, reflecting the known dose-toxicity relationship between the monotherapy and combination settings. Through a thorough simulation study in which the proposed model is compared to two comparator approaches, the three-parameter Bayesian logistic regression model is shown to accurately select doses in the target toxicity interval, performing similarly to the comparator approaches in terms of the proportion of target dose/combination selections, while more than halving the proportion of selected doses that exceed the target toxicity, thereby improving safety.
Keywords
Introduction
In a phase I dose-escalation study, the traditional primary objective is to determine a maximum tolerated dose (MTD). The MTD is defined as the highest dose that harbours an acceptable level of toxicity. 1 This toxicity is determined by the probability of an unacceptable adverse event related to the treatment. These events are referred to as dose-limiting toxicities (DLTs). Dose-finding studies commonly rely on a monotonicity assumption, meaning that a higher dose level will result in a greater clinical benefit compared to a lower dose. However, it is also assumed that a higher dose will result in a greater toxicity risk. Thus, the goal of the early-phase study is to explore the dose-toxicity relationship by escalating/de-escalating through the doses, ultimately identifying the MTD.
These trials have predominantly been implemented in the monotherapy setting, in which the study investigates a single therapeutic treatment with interest lying in identifying the MTD of the monotherapy. 2 However, it is increasingly common, particularly in the immunotherapy setting, to investigate an active agent in combination with a backbone agent. In this setting, dose escalation only concerns the dose level of the active agent under investigation, while the dose of the backbone agent is fixed. If the backbone agent were to have multiple dose levels under investigation, a plethora of existing methods is available in the literature.3–9 The model proposed in this work has been developed for a single dose level of the backbone agent, as motivated by Cancer Research UK’s current Phase I/IIa trial of HTL0039732. 10 HTL0039732 (or NXE0039732), from Nxera Pharma, is a small molecule antagonist of prostaglandin PGE2’s receptor EP4 that has been designed to block EP4-driven immunosuppressive effects within the tumour microenvironment and to enhance the activity of existing immunotherapies such as programmed cell death protein 1 (PD-1)/programmed death ligand 1 (PD-L1) checkpoint inhibitors, including atezolizumab. This motivates the development of a unique trial design that can efficiently conduct both dose escalation under monotherapy and dose escalation of the treatment in combination with the backbone agent.
In this article, we propose a novel dose-escalation model that enables the parallel testing of a treatment as a monotherapy alongside the dose escalation of the treatment in combination with a fixed dose of a backbone agent. The goal of the study is to find the MTD at monotherapy (where the MTD is defined as the dose that maximises the probability of the toxicity risk being in the target toxicity interval), as well as the maximum tolerated combination (MTC) of the treatment with the backbone agent. This relies on the assumption of monotonicity of toxicity under monotherapy, with a higher dose assumed to be more toxic than a lower dose, and also assumes that a treatment administered in combination with a backbone agent is at least as toxic as the same dose administered as monotherapy. By conducting parallel rather than sequential testing of the monotherapy and combination, the study efficiency is greatly improved. Not only will this reduce the duration of the study, but it will also allow for more efficient use of information between the monotherapy and combination parts of the study. The proposed model builds on the two-parameter Bayesian logistic regression model (BLRM) proposed by Neuenschwander et al. 11 by incorporating a third parameter to model the combination with the backbone agent. We compare the performance of the model to two alternative models: the partial-ordering continual reassessment method (POCRM) 12 and the two-parameter BLRM 11 that conducts the dose escalation of monotherapy and combination sequentially. The proposed three-parameter model is demonstrated to balance both accuracy of selection and safety concerns, with a substantial reduction in the selection of over-toxic doses compared to the two comparator models.
Despite the plethora of efficient statistical models for phase I dose-finding studies, their uptake in practice remains severely limited. 8 Through the work presented in this article, we hope to build on the growing literature on the implementation of more complex dose-escalation models and to demonstrate the practical and statistical advantages they hold.
Methodology
Notation
In this section, notation is defined and the novel three-parameter BLRM for dose escalation outlined. Let there be a total of
Proposed three-parameter Bayesian logistic regression dose-escalation model
Neuenschwander et al. 11 proposed a two-parameter BLRM which is adapted to incorporate the escalation of the dose of a single agent (monotherapy) alongside the dose escalation of this single agent administered in combination with a backbone agent (combination), into a single model. The adaptation incorporates an additional binary covariate to the two-parameter BLRM. This is a similar model to that of Bailey et al.; 13 however, their approach was not implemented to escalate both monotherapy and combination therapy in parallel.
The proposed model is a three-parameter BLRM where the extra parameter indicates whether the agent is administered as a monotherapy or in combination. The model has the following form
where
is the covariance matrix. The reference dose
The monotherapy and combination therapy dose escalation can run in parallel under this model. The addition of the indicator function means that only the
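As an illustration of how a binary combination covariate can extend a two-parameter logistic dose-toxicity model, the following is a minimal sketch in Python. The exact model equation is not reproduced in this text, so the functional form below is an assumption: the slope and the combination effect are exponentiated so that toxicity is non-decreasing in dose and the combination is no less toxic than the same dose as monotherapy, matching the two monotonicity assumptions stated in the Introduction. The names `dlt_probability`, `theta1`, `theta2`, `theta3` and `reference_dose`, and all numerical values, are hypothetical.

```python
import math


def dlt_probability(dose, combination, theta1, theta2, theta3, reference_dose):
    """Probability of a DLT under an ASSUMED three-parameter logistic form.

    logit(p) = theta1
               + exp(theta2) * log(dose / reference_dose)
               + exp(theta3) * I(combination)

    This parameterisation is an illustration, not the paper's equation.
    """
    linear = theta1 + math.exp(theta2) * math.log(dose / reference_dose)
    if combination:
        # The indicator switches the third term on for combination cohorts only.
        linear += math.exp(theta3)
    return 1.0 / (1.0 + math.exp(-linear))


# Made-up parameter values; using 320 mg as the reference dose is an assumption.
theta = dict(theta1=math.log(0.25 / 0.75), theta2=0.0, theta3=-1.0,
             reference_dose=320)
p_mono = dlt_probability(160, False, **theta)
p_combo = dlt_probability(160, True, **theta)
print(f"160 mg monotherapy: {p_mono:.3f}, 160 mg combination: {p_combo:.3f}")
```

Under this form, monotherapy observations inform the intercept and slope only, while combination observations inform all three terms, which is one way a single model can share information between the two parts of a study.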
Figure 1 demonstrates how a dose-escalation trial is conducted under this proposed three-parameter BLRM, for the parallel testing of monotherapy and combination therapy. The trial commences with a cohort of patients assigned to the starting dose of monotherapy. Their DLT outcomes are observed and are used to update the model, which is then used for dose escalation recommendation. From the model, the next cohort of patients will be assigned to the highest admissible dose that maximises the probability of the risk of a DLT being in the target toxicity interval of
Doses cannot be skipped;
A dose of the treatment given in combination can only be assigned from the second cohort onwards (given that the dose-escalation rules are satisfied), allowing the accrual of some data for the monotherapy before administering in combination;
A dose of the treatment given in combination can only be assigned if there have been at least three DLT-evaluated patients on this (or higher) dose as a monotherapy;
If one out of three patients in the current cohort had a DLT event, then the next cohort of patients cannot receive a dose that is higher than the current dose;
Patients can only be assigned to doses/combinations that are deemed safe. A dose/combination is deemed safe if the probability of the risk of toxicity exceeding the upper toxicity threshold of
where

Figure 1. Dose escalation under the proposed three-parameter BLRM.
Any dose or combination that satisfies all four of the constraints is referred to as an admissible dose/combination. Dose escalation/de-escalation can only occur between admissible doses. The trial terminates when either no doses are deemed admissible (for safety concerns) or the total maximum sample sizes of
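The admissibility rules listed above, together with the rule of assigning the highest admissible dose that maximises the probability of being in the target toxicity interval, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the overdose threshold is passed in as a parameter because its value is not given in this text, and blocking escalation after two or three DLTs in a cohort (and not only after exactly one) is an assumption. All probabilities in the usage example are made up.

```python
def admissible_doses(p_overtoxic, current, overdose_threshold,
                     dlts_last_cohort=0, evaluated_mono=None):
    """Return the indices of admissible doses for one arm.

    p_overtoxic      -- posterior P(DLT risk above upper threshold), per dose
    current          -- index of the dose given to the most recent cohort
    dlts_last_cohort -- number of DLTs observed in that cohort
    evaluated_mono   -- for the combination arm only: DLT-evaluated
                        monotherapy patients per dose (None for monotherapy)
    """
    # No skipping: at most one level above the current dose,
    # and no escalation at all after a DLT in the last cohort (assumption: >= 1).
    highest = current if dlts_last_cohort >= 1 else current + 1
    admissible = []
    for i, p_over in enumerate(p_overtoxic):
        if i > highest:
            continue
        if p_over >= overdose_threshold:  # fails the safety constraint
            continue
        # Combination needs >= 3 evaluated patients on this or a higher
        # monotherapy dose.
        if evaluated_mono is not None and sum(evaluated_mono[i:]) < 3:
            continue
        admissible.append(i)
    return admissible


def next_dose(p_target, admissible):
    """Highest admissible dose maximising P(risk in target interval)."""
    if not admissible:
        return None  # no admissible dose: the trial stops for safety
    best = max(p_target[i] for i in admissible)
    return max(i for i in admissible if p_target[i] == best)


doses_mg = [20, 40, 80, 160, 320, 640, 960]
p_over = [0.01, 0.02, 0.05, 0.10, 0.30, 0.60, 0.80]    # made-up values
p_target = [0.05, 0.10, 0.20, 0.30, 0.35, 0.20, 0.10]  # made-up values
allowed = admissible_doses(p_over, current=2, overdose_threshold=0.25)
print([doses_mg[i] for i in allowed], doses_mg[next_dose(p_target, allowed)])
```

Returning `None` when no dose is admissible mirrors the stopping rule for safety; the sample size stopping rule would be checked separately by the trial loop.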
Throughout the rest of this work, the proposed three-parameter BLRM is compared to two comparative methods: the two-parameter BLRM (which is equivalent to the proposed design when
Example trial
In the example trial, an experimental treatment is administered to patients with advanced cancers. The study includes a small number of patients in order to determine the safety and tolerability of the treatment. In total, it is assumed that seven doses will be explored: 20 mg, 40 mg, 80 mg, 160 mg, 320 mg, 640 mg and 960 mg, where 20 mg and 40 mg are de-escalation doses only and 80 mg is the starting dose. Each dose is administered orally on a daily 21-day schedule with no breaks and the DLTs are assessed over several cycles. The trial is split into two parts: Part A in which doses of the treatment are administered as a monotherapy and Part B in which doses of the treatment are administered alongside intravenous atezolizumab as a backbone agent. The maximum number of patients to be enrolled in the study (in the monotherapy and combination parts) is assumed
The goal is to implement a dose-escalation model that can efficiently explore the dose space and to ultimately determine both the MTD and the MTC as quickly as possible. As such, the proposed design is to conduct the monotherapy dose escalation and the dose escalation of the treatment in combination with atezolizumab in parallel. This will reduce the duration of the trial and more efficiently utilise the accrued monotherapy and combination data.
Example model outputs for the proposed design
Given the above trial setting, we now demonstrate the model outputs for the first three cohorts under two example data sets. For each cohort, the mean toxicity, the probability of being in the target toxicity interval and the probability of being over-toxic are presented for every dose under both monotherapy and combination treatment. The first example is presented in Figure 2, where the first cohort was administered the monotherapy only, with the combination treatment not yet started. In this cohort, no DLTs were observed. This information was used to update the model and obtain the toxicity risk for doses at both monotherapy and combination. The model recommends increasing the monotherapy dose to 160 mg, with all higher doses inadmissible due to the escalation rule that no doses may be skipped. The model also indicates that the first three doses at combination are admissible and selects 80 mg as the dose for the first combination cohort. This process is repeated in cohorts 2 and 3, with the doses for both monotherapy and combination remaining the same or being de-escalated based on the DLTs observed.
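The three quantities reported for each dose (mean toxicity, probability of being in the target toxicity interval and probability of being over-toxic) are all summaries of the posterior distribution of the DLT risk. A minimal sketch of how they could be computed from posterior draws is given below. The function name `summarise_toxicity` is hypothetical, the 20%–30% interval is taken from the text, treating both interval end points as inclusive is an assumption, and the Beta draws in the usage lines are only a stand-in for MCMC output.

```python
import random


def summarise_toxicity(draws, lower=0.20, upper=0.30):
    """Summarise posterior draws of the DLT risk at one dose.

    Returns the posterior mean, P(lower <= risk <= upper) and P(risk > upper).
    """
    n = len(draws)
    return {
        "mean": sum(draws) / n,
        "p_target": sum(lower <= p <= upper for p in draws) / n,
        "p_overtoxic": sum(p > upper for p in draws) / n,
    }


# Stand-in for MCMC output: draws from an arbitrary Beta distribution.
random.seed(1)
posterior_draws = [random.betavariate(2, 8) for _ in range(4000)]
print(summarise_toxicity(posterior_draws))
```

Repeating this for every dose under monotherapy and combination would give the per-dose values of the kind shown in the example figures.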

Figure 2. Example trial 1: model outputs for the first three cohorts of a trial. For each cohort, the mean toxicity, the probability of being in the target toxicity interval (TTI) and the probability of being over-toxic are presented for each dose under both monotherapy and combination treatment. Doses that are inadmissible are greyed out, with the next allocated dose for both monotherapy and combination highlighted in blue.
In the second data set, presented in Figure 3, three DLTs were observed in the first cohort, so only the first monotherapy dose was admissible. As a result, the second cohort was assigned to a dose of 20 mg. Unlike in example 1, combination treatment did not start in cohort 2, as no combination doses were admissible. In the second cohort, no DLTs were observed and the monotherapy dose was escalated to 40 mg, while still no combinations were deemed admissible. Only after the third cohort did a combination dose become admissible, so dose escalation for the combination starts in parallel from cohort 4 onwards.

Figure 3. Example trial 2: model outputs for the first three cohorts of a trial. For each cohort, the mean toxicity, the probability of being in the target toxicity interval (TTI) and the probability of being over-toxic are presented for each dose under both monotherapy and combination treatment. Doses that are inadmissible are greyed out, with the next allocated dose for both monotherapy and combination highlighted in blue.
Calibration of model parameters
The proposed model has a number of prior parameters. Before conducting the trial, each of these parameters is calibrated in order to produce favourable operating characteristics across several quantitatively and qualitatively different dose-toxicity scenarios. This is conducted via extensive simulation studies in order to optimise the accuracy of dose escalation and MTD/MTC selection.
Setting
The simulation setting considered here is one in which both monotherapy and combination therapy are explored in parallel. Results of simulations under monotherapy alone are provided in the supplementary materials.
From the motivating example, we now define the dose-escalation trial setting used throughout the rest of this article. For all three approaches, the starting dose is set at 80 mg. At each stage of dose escalation, a cohort of three patients is enrolled onto a dose. The trial is terminated when either the maximum total number of
Calibration of monotherapy parameters
First consider the setting where patients are assigned to the monotherapy treatment, with no combination dose escalation considered; thus, the maximum sample size is
Table 1. Considered dose-toxicity scenarios for monotherapy treatment.
Doses within the target toxicity interval of 20%–30% are highlighted in
The proposed three-parameter model consists of several parameters:
Hyper-parameters for the intercept
Hyper-parameters for the slope,
Overdosing probability threshold
with the covariance parameter set to 0. For each combination of these parameter values, the probability of selecting the dose with the risk of toxicity within the target interval (selection of the doses with a DLT risk between 20% and 30%) is computed under each of the seven dose-toxicity scenarios using 500 simulation runs for each combination and scenario. Simulations were conducted using the ‘rjags’ package 15 in R Studio 16 version 4.1.1. Only parameter combinations that satisfy the following dose-escalation constraints are considered in the calibration:
If 0/3 DLTs are observed in the first cohort, 160 mg should be safe;
If 1/3 DLTs are observed in the first cohort, 80 mg should be safe;
If 2/3 DLTs are observed in the first cohort, the dose must be de-escalated and 40 mg should be safe;
If 3/3 DLTs are observed in the first cohort, the dose must be de-escalated, 40 mg should be unsafe and 20 mg should be safe.
The combination of parameter values that implies good operating characteristics across all scenarios, while satisfying the above constraints, is taken forward for further evaluation. This is quantified by taking the geometric mean of the probability of correct selection of an MTD in the target toxicity interval across the seven dose-toxicity scenarios. Through this calibration, the combination of parameter values selected was:
The dose-escalation behaviour of the calibrated monotherapy model can be evaluated through decision trees that show the recommended doses for different cohorts of patients dependent on the observed number of DLTs. For details on this please refer to the supplementary materials.
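The calibration procedure described in this section (a grid search over prior parameter values, filtered by the first-cohort constraints and ranked by the geometric mean of the probability of correct selection across scenarios) can be sketched as below. This is an illustrative outline under stated assumptions and not the authors' code: `calibrate`, `pcs_by_scenario` and `satisfies_constraints` are hypothetical names, and in a real calibration the two callables would wrap the trial simulations and the posterior safety checks. The toy grid and scores in the usage lines are made up.

```python
import itertools
import math


def geometric_mean(values):
    """Geometric mean; zero if any value is zero."""
    if any(v <= 0 for v in values):
        return 0.0
    return math.exp(sum(math.log(v) for v in values) / len(values))


def calibrate(grid, pcs_by_scenario, satisfies_constraints):
    """Grid search over prior parameter values.

    grid                  -- dict mapping parameter name to candidate values
    pcs_by_scenario       -- callable: params -> list of proportions of
                             correct selection, one per scenario
    satisfies_constraints -- callable: params -> bool (first-cohort rules)
    """
    best_params, best_score = None, -1.0
    for values in itertools.product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        if not satisfies_constraints(params):
            continue  # discard combinations violating the escalation rules
        score = geometric_mean(pcs_by_scenario(params))
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy usage with made-up scores in place of simulation results.
toy_grid = {"prior_sd": [0.5, 1.0], "overdose_threshold": [0.25, 0.35]}
toy_pcs = lambda p: [0.6, 0.5] if p["prior_sd"] == 1.0 else [0.9, 0.1]
toy_ok = lambda p: p["overdose_threshold"] < 0.30
print(calibrate(toy_grid, toy_pcs, toy_ok))
```

Compared with an arithmetic mean, the geometric mean penalises parameter choices that perform very poorly in any single scenario, which suits the stated aim of good operating characteristics across all scenarios.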
Calibration of combination therapy parameters
In order to calibrate the parameter values for the combination parameter in the proposed three-parameter BLRM, the calibrated values for
To conduct the calibration, combination-toxicity scenarios must be constructed. These scenarios are used as the true probabilities of a DLT under the combination treatment at each dose. For the monotherapy treatment, the scenarios in Table 1 are implemented for the true probabilities of a DLT. The monotonicity assumption is used to construct combination-toxicity scenarios from these monotherapy scenarios, wherein the dose at combination can be no less toxic than the same dose at monotherapy. As such, for each monotherapy scenario in Table 1, several possible combination-toxicity scenarios are considered, where the true MTC lies between 1 and 3 doses lower than the MTD at monotherapy. Formally, under monotherapy scenario
Table 2. Considered dose-toxicity and combination-toxicity scenarios.
Doses within the target toxicity interval of 20%–30% are highlighted in
To calibrate the model parameters, only a subset of dose-toxicity and combination-toxicity scenarios are implemented to make calibration more computationally feasible. The scenarios implemented in this calibration are presented in Table 2. These scenarios cover various locations of the true MTC on the dose grid.
For the calibration of the combination parameters, another grid search is implemented, where the values of the grid again represent substantially different combination-toxicity relationships as in the monotherapy calibration. As such, the mean and the standard deviation of the prior distribution on the
Hyper-parameters for the combination term
The parameter values selected are those that yielded the highest geometric mean of the proportion of correct MTD and MTC selection in the target toxicity interval, computed across all six of the considered scenarios which were
A similar discussion on how to calibrate the model parameters for the comparator two-parameter BLRM and POCRM is provided in the supplementary materials. Note that the model parameters for all comparator models are calibrated to optimise the combined proportion of MTD and MTC selection.
Assessing the dose-escalation model performance
To assess the operating characteristics of the calibrated models and their expected performance, a simulation study is conducted in which all three models are compared. The performance of the models is measured through the following operating characteristics:
Proportion of the dose selections in the target interval (having the DLT risk of 20%–30%);
Proportion of the over-toxic dose selections (having the DLT risk of
Proportion of trials that were terminated early for safety concerns;
Proportion of patients assigned to an over-toxic dose.
A total of 1000 simulation runs were conducted for each model under each considered scenario. Note that the Monte Carlo standard error for this simulation study is approximated to be 0.008.
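To make the selection-based operating characteristics concrete, the sketch below computes them from a list of simulated trial outcomes, together with a binomial approximation to the Monte Carlo standard error of an estimated proportion. It is an illustration only: the function names are hypothetical, early termination is encoded as `None`, the inclusive 20%–30% interval end points are an assumption, and the binomial formula is a standard approximation that is not necessarily the calculation behind the figure quoted above. The true risks and selections in the usage lines are made up.

```python
import math


def operating_characteristics(selections, true_risk, lower=0.20, upper=0.30):
    """Summarise simulated trials for one arm and one scenario.

    selections -- selected dose index per simulated trial, or None if the
                  trial stopped early for safety
    true_risk  -- true DLT probability at each dose in this scenario
    """
    n = len(selections)
    in_target = sum(1 for s in selections
                    if s is not None and lower <= true_risk[s] <= upper)
    over_toxic = sum(1 for s in selections
                     if s is not None and true_risk[s] > upper)
    stopped = sum(1 for s in selections if s is None)
    return {
        "target": in_target / n,
        "over_toxic": over_toxic / n,
        "stopped_early": stopped / n,
    }


def monte_carlo_se(proportion, n_sims):
    """Binomial standard error of a proportion estimated from n_sims runs."""
    return math.sqrt(proportion * (1.0 - proportion) / n_sims)


scenario = [0.05, 0.10, 0.25, 0.40]  # made-up true DLT risks
simulated = [2, 2, 3, None]          # made-up selections from four trials
oc = operating_characteristics(simulated, scenario)
print(oc, monte_carlo_se(oc["target"], len(simulated)))
```

Under this approximation the standard error depends on the proportion being estimated as well as on the number of simulation runs, so a single quoted value should be read as indicative.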
Monotherapy and combination simulation study
The results of the simulation study investigating the monotherapy and combination therapy using the calibrated parameter values are presented in Table 3.
Table 3. Results of simulation study.
Columns are scenarios. Values in the table for first three rows of each block show percentage of scenarios achieving target (20%–30%) and over-toxic
First, looking at the monotherapy part of the study, the geometric mean of the percentage of trials selecting an MTD in the 20%–30% target interval across all scenarios is greatest under the POCRM design at 72%, followed by the proposed three-parameter BLRM at 67%, with the lowest value observed under the two-parameter BLRM at 64%. However, the proposed design reduces the percentage of over-toxic dose selections, with an average of 8% of trials selecting a monotherapy dose above the targeted 30% level. The two-parameter BLRM doubles this value at 16%, while an average of 13% of trials under the POCRM select an over-toxic dose. However, when the true MTD is either dose level 1 or 2, there is a trade-off between safety considerations and correctly selecting the true MTD. The proposed design terminates early in between 9% and 42% of trials, with 38%–81% of trials selecting the correct dose in the target toxicity interval and 5%–18% of trials selecting an over-toxic dose. In contrast, the POCRM terminates the trial early under scenario 1 only, in which
A similar comparison is drawn from the combination therapy part of the study, where the two BLRM approaches have an identical average correct selection of an MTC in the target toxicity interval of around 58%, with the POCRM approach having a 63% average correct selection probability. However, both the two-parameter BLRM and the POCRM have a higher average over-toxic selection, at 12% and 18%, respectively, compared to the three-parameter BLRM at 5%. The proposed design has lower over-toxic selection across most combination-toxicity scenarios compared to the comparator models, with the two-parameter BLRM and POCRM demonstrating up to a 46% increase compared to the proposed design. The POCRM has a maximum over-toxic selection of 61%, meaning that under that scenario, over half of the simulation runs selected an unsafe dose. The proposed three-parameter BLRM has a maximum value of 26%. All three models have the same average number of patients assigned to an over-toxic dose.
Practical implementation
In practice, a review of safety data will be conducted at a dose review meeting prior to opening any new dose level or opening enrolment to a new part of the trial. The BLRM will be updated using all available data, and a report summarising the model output will be sent to the safety review committee prior to the meeting. These dose review meetings will be held to review whether the dose level can be deemed tolerated by the participants, assess any emerging data from previous cohorts and review the updated BLRM to set dose levels for subsequent participants in Part A and Part B. In addition, these review meetings may open enrolment to Part B if this is not yet open and declare the recommended phase II dose (RP2D) to open Phase IIa, the dose expansion part. These meetings will be scheduled regularly (if required) during Phase I Part A and Part B (additional meetings may take place if urgent review is required due to emerging safety data) to update the Bayesian safety model and review dosing if a participant has become evaluable for DLT assessment and has data available since the last dose review. The safety data will be assessed along with the available PK data and any additional appropriate trial data, providing quantitative support to the dose decision meetings.
Discussion
In this work, we present an adaptation to a BLRM to incorporate the dose escalation of a monotherapy in parallel with the dose escalation of the treatment in combination with a backbone agent.
The proposed three-parameter model is beneficial over the two-parameter BLRM due to its ability to test the monotherapy and combination side by side. By conducting dose escalation of the two settings in parallel, the duration of the trial can be substantially reduced. Like the three-parameter BLRM, the POCRM also allows parallel testing of both treatments; however, it results in far more selections of an over-toxic dose with similar selections of doses in the target toxicity interval. In contrast, the two-parameter BLRM defined in this work requires the completion of the monotherapy dose escalation before commencement of the testing of the combination. In this approach, the observed DLTs in the monotherapy are not utilised in the dose-escalation model for the combination, despite the clear link between the toxicities at monotherapy and combination. The monotherapy part for the two-parameter BLRM simply reduces the dose-combination space, which, although intuitive in terms of dose-toxicity monotonicity, fails to utilise the informative data accrued in the first part of the study. The two-parameter model has similar selections of the true MTD and MTC to those of the three-parameter model but, like the POCRM, increases the number of over-toxic doses selected.
One potential concern regarding the proposed model is the starting dose of the first cohort on combination treatment. The model, as defined, allows the first cohort of combination patients to be allocated the same dose as the next cohort for monotherapy despite the monotonicity assumption that assumes the combination is more toxic. Alongside the overdose constraint (
In addition, the three-parameter BLRM excludes the interaction term between the monotherapy and the combination treatment. This was motivated by Mozgunov et al., 2 where they demonstrate that including an interaction effect does not improve the accuracy of correct dose selection and decreases the number of patients assigned to the true MTD/MTC.
The simulation study presented focuses on a single trial setting with a parallel shift in toxicity relationship from monotherapy to combination therapy, with a 2:1 allocation ratio between monotherapy and combination therapy. In the supplementary materials, we also consider a setting with 1:1 allocation between the two parts of the trial and found very similar operating characteristics across all three approaches compared to 2:1 allocation. The proposed design was less sensitive to the allocation ratio as the combination part contributes to the model parameters for monotherapy and thus limits the impact of reducing the sample size in the monotherapy when the sample size in the combination part increases. In addition, further scenarios were explored (and presented in the supplementary material) where the combination-toxicity relationship was altered such that there was not a parallel shift in toxicity compared to the monotherapy. The proposed design proved less sensitive to the shape of the combination-toxicity in the considered scenarios, with the lowest proportion of over-toxic selections.
Supplemental Material
sj-pdf-1-ctj-10.1177_17407745251350604 – Supplemental material for Seamless monotherapy-combination phase I dose-escalation model-based design
Supplemental material, sj-pdf-1-ctj-10.1177_17407745251350604 for Seamless monotherapy-combination phase I dose-escalation model-based design by Libby Daniells, Thomas Jaki, Alimu Dayimu, Nikos Demiris, Basu Bristi, Stefan Symeonides and Pavel Mozgunov in Clinical Trials
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This report is independent research supported by the National Institute for Health Research (NIHR Advanced Fellowship, Dr Pavel Mozgunov, NIHR300576). The views expressed in this publication are those of the authors and not necessarily those of the NHS, the National Institute for Health Research or the Department of Health and Social Care (DHSC). T Jaki, P Mozgunov and L Daniells received funding from UK Medical Research Council (MC UU 00040/03). For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising.
Supplemental material
Supplemental material for this article is available online.
References
