Abstract
Background/aims:
Dose-escalation studies are essential in the early stages of developing novel treatments, when the aim is to find a safe dose for administration in humans. Despite their great importance, many dose-escalation studies use study designs based on heuristic algorithms with well-documented drawbacks. Bayesian decision procedures provide a design alternative that is conceptually simple and methodologically sound, but they are very rarely used in practice, at least in part due to their perceived statistical complexity. There are currently very few easily accessible software implementations that would facilitate their application.
Methods:
We have created MoDEsT, a free and easy-to-use web application for designing and conducting single-agent dose-escalation studies with a binary toxicity endpoint, where the objective is to estimate the maximum tolerated dose. MoDEsT uses a well-established Bayesian decision procedure based on logistic regression. The software has a user-friendly point-and-click interface, makes changes visible in real time, and automatically generates a range of graphs, tables, and reports. It is aimed at clinicians as well as statisticians with limited expertise in model-based dose-escalation designs, and does not require any statistical programming skills to evaluate the operating characteristics of, or implement, the Bayesian dose-escalation design.
Results:
MoDEsT comes in two parts: a ‘Design’ module to explore design options and simulate their operating characteristics, and a ‘Conduct’ module to guide the dose-finding process throughout the study. We illustrate the practical use of both modules with data from a real phase I study in terminal cancer.
Conclusion:
Enabling both methodologists and clinicians to understand and apply model-based study designs with ease is a key factor towards their routine use in early-phase studies. We hope that MoDEsT will enable incorporation of Bayesian decision procedures for dose escalation at the earliest stage of clinical trial design, thus increasing their use in early-phase trials.
Introduction
The primary aim of many phase I dose-escalation studies is to estimate the maximum tolerated dose of a novel drug or treatment. In practice, this often means identifying a dose for which the probability of a patient developing a dose-limiting toxicity is close to a prespecified target toxicity level, typically between 0.20 and 0.33 in cancer trials. Patients enter the study in cohorts of one or more (usually three), and for every new cohort a decision is made whether to stay at the current dose level, escalate or de-escalate the dose, or stop the study entirely. A statistical study design informs and guides this process, but the ultimate decision will always be based on clinical judgement.
It is good practice for early-phase dose-escalation designs to use a well-fitting statistical model to synthesise all available information (from prior knowledge and accumulating patient data) and to deduce a recommendation for how to proceed with the study.1,2 The first model-based designs were developed in the 1990s, most prominently the continual reassessment method,3 and also Bayesian decision procedures,4,5 which are closely related to one another.6
The alternatives to model-based designs are methods that rely on largely heuristic rules or algorithms, such as the 3+3 design. The only real virtue of these algorithms is their simplicity (at least when nothing unforeseen happens); on the downside, they are less likely to identify the correct maximum tolerated dose than model-based designs, require on average more patients to reach a dose recommendation, have less flexibility to accommodate deviations from the prespecified dose-escalation procedure, and lack any theoretical foundation.7–10 Despite these clear drawbacks, rule-based designs are still in wide use, while the uptake of model-based designs remains slow, especially in the public sector.11–13 Unfortunately, model-based designs appear to many as a black box requiring specialist statistical input, which they are not.14 There is also a misconception that they are overly complicated because they require more planning than rule-based designs, such as the choice of a prior probability distribution (or ‘prior’ for short).
In this article, we introduce MoDEsT (
Onerous tasks that are currently a barrier to the use of model-based designs, like setting a prior, become easy with MoDEsT: it allows trying out different priors and immediately visualises the consequences in terms of the operating characteristics (e.g. probability of identifying the correct maximum tolerated dose, expected number of patients required) of the procedure. This provides users with insight as to how dose recommendations come about, thus demystifying the model-based design.
In contrast to most other software for study design, MoDEsT is specifically aimed at both clinical trialists and statisticians with no previous experience of model-based dose escalation who would default to the 3+3 design for simplicity, although we believe it is also useful for statistical experts who already have a thorough understanding of model-based dose escalation. The intuitive point-and-click interface of MoDEsT encourages users to explore a variety of design options and allows them to watch changes become effective in real time and get a feel for the design’s performance in different clinically relevant scenarios. It facilitates the consideration and inclusion of efficient model-based dose escalation at the earliest stage of clinical trial design, which should always be a collaborative effort between clinical and statistical experts.
MoDEsT was written in the
Design and run dose-escalation studies using the continual reassessment method, 18
Design dual-agent dose-escalation studies, 19
Compare the performances of various model- and rule-based designs,20,21
Simulate the highest achievable (i.e. optimal benchmark) accuracy when selecting the maximum tolerated dose. 22
There are also a number of graphical user interfaces for dose-finding methods that are not based on
It runs under any operating system;
It does not require any software package to be installed;
Its point-and-click interface means no specialist software or programming skills are required;
It automatically generates PDF reports;
It is free to use.
Methods
Bayesian decision procedure
The Bayesian decision procedure implemented in MoDEsT is made up of four main components: (1) a logistic regression model, (2) prior information about the dose-toxicity relationship, (3) a gain function, and (4) a set of rules for (de-)escalating the dose and stopping the study. We describe each component briefly below; for a detailed exposition we refer to the original article by Zhou and Whitehead. 15
Logistic model
We assume the relationship between dose and risk of toxicity follows a logistic model
where the logit transformation of the probability (i.e. the log odds) of observing a dose-limiting toxicity (left-hand side of the equation) is assumed to depend on the log-transformed dose in a linear fashion (right-hand side of the equation; see the illustration in Figure 1). We use toxicity data from study patients to estimate the values of the model parameters

Figure 1. Example of an S-shaped dose-toxicity curve (left) and the corresponding straight line after transformation (right). The dotted horizontal line indicates a target toxicity level of 0.3, or log(0.3/0.7) = –0.847 on the logit scale. P(DLT): probability of a dose-limiting toxicity.
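The model described above can be written as logit(p) = intercept + slope × log(dose), where p is the probability of a dose-limiting toxicity. The following is a minimal sketch of that relationship and of how a straight line on the logit scale can be inverted to find the dose at a target toxicity level. The parameter names `intercept` and `slope`, and the numerical values assigned to them, are illustrative assumptions and are not taken from MoDEsT.

```python
import math


def logit(p):
    """Log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def prob_dlt(dose, intercept, slope):
    """P(DLT) at a dose, assuming logit(p) = intercept + slope * log(dose)."""
    eta = intercept + slope * math.log(dose)
    return 1.0 / (1.0 + math.exp(-eta))


def dose_at_target(target, intercept, slope):
    """Dose at which P(DLT) equals the target (inverts the straight line)."""
    return math.exp((logit(target) - intercept) / slope)


# Hypothetical parameter values, chosen only for illustration
intercept, slope = -7.0, 1.0
target = 0.3

mtd = dose_at_target(target, intercept, slope)
print(round(logit(target), 3))  # -0.847, the logit-scale target quoted in the caption
print(round(prob_dlt(mtd, intercept, slope), 3))  # recovers the target of 0.3
```

In practice the two parameters would be estimated from prior information and accumulating patient data rather than fixed by hand.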
Prior information
To get the Bayesian decision procedure started before any patient data are available, we need prior information on the dose-toxicity relationship. Guessing values of the model parameters
Gain function
A gain function can be used to quantify, for each dose, the advantage of prescribing different dose levels to the next cohort of patients, where larger gains are to be preferred. Thus, the gain function helps to determine which dose should be recommended for the next patient cohort. The ‘patient gain’ function would assign the dose currently thought to be closest to the target toxicity level (which is optimal from a current patient’s perspective), whereas the ‘variance gain’ function would choose the dose that will likely maximise learning about the dose-toxicity relationship (which is optimal from an investigator’s perspective and also from the perspective of future patients who will be treated beyond the current clinical trial). 5 In practice, the choice of gain function is unlikely to have a significant impact on the performance characteristics of the study design, but this can be explored in MoDEsT.
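As a rough illustration of the ‘patient gain’ idea, the sketch below picks, from a fixed dose set, the dose whose currently estimated toxicity probability is closest to the target. The dose set, the parameter estimates, and the function name `patient_gain_dose` are hypothetical, and the exact mathematical form of the gain functions used in MoDEsT is not reproduced here; the ‘variance gain’ criterion is not sketched.

```python
import math


def prob_dlt(dose, intercept, slope):
    """P(DLT) at a dose, assuming logit(p) = intercept + slope * log(dose)."""
    eta = intercept + slope * math.log(dose)
    return 1.0 / (1.0 + math.exp(-eta))


def patient_gain_dose(doses, intercept, slope, target):
    """Return the dose whose estimated P(DLT) is closest to the target level."""
    return min(doses, key=lambda d: abs(prob_dlt(d, intercept, slope) - target))


# Hypothetical dose set and current parameter estimates (illustration only)
doses = [10, 20, 40, 80, 160]
intercept, slope = -4.0, 0.8

best = patient_gain_dose(doses, intercept, slope, target=0.2)
print(best)
```

With these made-up values the estimated toxicity probabilities at doses 20 and 40 are roughly 0.17 and 0.26, so the dose of 20 lies closest to the target of 0.2.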
Escalation and stopping rules
Dose recommendations are determined primarily by the model and the gain function, but we may wish to apply additional restrictions such as:
Always start at the lowest dose;
Do not skip over any doses when escalating;
Do not escalate upon observing a toxicity in the current cohort.
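The three restrictions listed above can be thought of as a filter applied to whatever dose the model and gain function propose. The sketch below expresses this on dose indices (0 for the lowest dose); the function name and its arguments are assumptions made for illustration, and de-escalation is deliberately left unrestricted.

```python
def restrict_dose(proposed_idx, current_idx, dlt_in_current_cohort):
    """Apply the optional escalation restrictions to a model-proposed dose index.

    proposed_idx: index of the dose proposed by the model and gain function
    current_idx: index of the dose given to the current cohort (None before the first cohort)
    dlt_in_current_cohort: True if any toxicity was observed in the current cohort
    """
    if current_idx is None:
        return 0  # always start at the lowest dose
    highest_allowed = current_idx + 1  # do not skip over doses when escalating
    if dlt_in_current_cohort:
        highest_allowed = current_idx  # do not escalate after a toxicity
    return min(proposed_idx, highest_allowed)


print(restrict_dose(4, None, False))  # first cohort: lowest dose
print(restrict_dose(4, 1, False))     # escalation limited to one level
print(restrict_dose(4, 1, True))      # toxicity observed: stay at current dose
print(restrict_dose(0, 3, True))      # de-escalation is not restricted
```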
Stopping recruitment to the study will be recommended once
The maximum number of patients have been analysed;
A pre-defined maximum number of consecutive patients receiving the same dose has been reached;
A sufficiently accurate estimate of the maximum tolerated dose has been obtained; and/or
No dose among those in the prespecified set is deemed safe.
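A minimal sketch of how these four criteria might be checked at an interim analysis is given below. Accuracy is expressed here as the ratio of the upper to the lower credible limit for the maximum tolerated dose, which is the form used in the worked example later in this article; all argument names and threshold values are placeholders, and how the credible limits and the safety assessment are obtained is outside the scope of the sketch.

```python
def stopping_reasons(n_patients, max_patients,
                     n_consecutive_same_dose, max_consecutive,
                     mtd_lower, mtd_upper, max_ratio,
                     any_dose_safe):
    """Return the stopping criteria currently met (an empty list means continue)."""
    reasons = []
    if n_patients >= max_patients:
        reasons.append("maximum sample size reached")
    if n_consecutive_same_dose >= max_consecutive:
        reasons.append("maximum consecutive patients at one dose reached")
    # Accuracy: ratio of upper to lower credible limit for the MTD
    if mtd_lower > 0 and mtd_upper / mtd_lower <= max_ratio:
        reasons.append("sufficiently accurate estimate obtained")
    if not any_dose_safe:
        reasons.append("no safe dose in the prespecified set")
    return reasons


# Hypothetical interim state (all numbers are illustrative)
reasons = stopping_reasons(
    n_patients=36, max_patients=54,
    n_consecutive_same_dose=12, max_consecutive=12,
    mtd_lower=400.0, mtd_upper=1000.0, max_ratio=3.0,
    any_dose_safe=True,
)
print(reasons)
```

Returning a list rather than a single flag reflects the fact that several criteria can be fulfilled at the same analysis.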
Results
The MoDEsT app
MoDEsT comes in two parts: a ‘Design’ module to investigate candidate design options and simulate their operating characteristics, and a ‘Conduct’ module to guide decision making throughout the study, incorporate accruing patient data into the model and provide summaries of the final dataset on completion of the study. Both modules are fully reactive, that is, changes made by the user become effective in real time. MoDEsT can be accessed online (https://medstats-lancs.shinyapps.io/design/ and https://medstats-lancs.shinyapps.io/conduct/) from any device with a web browser. For
The ‘Design’ module
The ‘Design’ module takes as inputs the basic study parameters (maximum sample size, cohort size, dose levels, target toxicity level, gain function), the pseudo-observations needed to specify prior distributions for parameters of the dose-toxicity model, ‘true’ values of model parameters for simulation of the Bayesian procedure in different scenarios, and additional escalation and stopping rules as detailed above; all these are conveniently specified via sliders, text boxes, and tick boxes (Figure 3). The app then creates graphical displays of the dose-toxicity curves, simulates an example of a study given the current specifications, and suggests a variety of scenarios for use in a subsequent simulation study (Figure 4). For the chosen scenario, MoDEsT assesses a variety of operating characteristics and presents the results in tables and graphics. On the basis of the inputs, the app generates a CSV design file that can subsequently be fed into the ‘Conduct’ module. In addition, a report summarising the design, prior information, and simulation results can be downloaded in PDF format.
The ‘Conduct’ module
The ‘Conduct’ module requires the user to upload a design file (obtained from the ‘Design’ module) and supply (anonymised) patient data. The latter can either be uploaded as a CSV file (typically created with a text editor or spreadsheet software such as Microsoft Excel, OpenOffice/LibreOffice Calc, or Google Sheets), or entered manually via a spreadsheet interface. The app then produces graphical displays of the data, fits the logistic model, calculates the current estimate of the maximum tolerated dose, and recommends either a dose for the next cohort or stopping the study in case a relevant criterion is fulfilled (Figure 5). A PDF report summarising the design, data, analysis, and recommendation is available for download. This can all be easily produced by the clinical study team for each dose review meeting so that dose recommendations based on statistical analyses of current and past patient data are used in real time alongside clinical opinion from the investigators to decide on dosing for the next patient cohort.
Getting help
While the app’s user guidance should be intuitive and most inputs and outputs self-explanatory, additional help may occasionally be required. The quickest way to learn more about an input element (e.g. slider, button, check box, text box) is to hover the mouse pointer over it, whereupon a tooltip appears. A full description of MoDEsT’s functionality, along with a detailed explanation of all inputs and outputs, is given in the help pages on the website. They are also included in the
Example: a phase I study of quercetin
The workflow when designing and conducting a study with MoDEsT is best illustrated with a real data example. In the following, we re-design and re-analyse (parts of) a dose-escalation study of a novel drug product in terminal cancer that originally used a 3+3-type design with several spontaneous modifications.
Dataset
Ferry
Figure 2 provides an overview of the study data from a total of 52 patients. We note a number of oddities and decisions made against the rules of the 3+3 design:
The dose was escalated to 1400 mg/m2 for the 8th cohort despite a dose-limiting toxicity having been recorded for the 7th cohort at 945 mg/m2.
Dose-limiting toxicities occurred in both the 10th and 11th cohorts at 1400 mg/m2, and yet the dose was not de-escalated for the 12th cohort.
No dose-limiting toxicities occurred in the 12th cohort, but the dose was de-escalated for the 13th cohort to 945 mg/m2.
The 12th cohort consisted of four patients.
A dose-limiting toxicity was recorded for the 17th cohort at 630 mg/m2, yet the dose was escalated to 945 mg/m2 for the 18th cohort.
The 16th, 17th, and 18th cohorts each consisted of only two patients.

Figure 2. Overview of the quercetin study data.
These deviations cannot be accommodated within the inherently inflexible 3+3 design, which, once its rules have been departed from, no longer makes a relevant contribution to dose-escalation decisions. By contrast, the model-based procedure implemented in MoDEsT can easily handle cohorts of non-standard size and dose recommendations overruled by clinical judgement.
Study design
We set the study design parameters (maximum sample size, cohort size, dose levels, target toxicity level) as in Ferry

Figure 3. Input mask of the ‘Design’ module with specifications inspired by the quercetin study.
For the purpose of this example, we enforce starting at the lowest dose, not skipping over any doses when escalating, and not escalating when a dose-limiting toxicity occurs in the current cohort. We consider trial designs that would recommend stopping the study once 12 consecutive patients have received the same dose, or stopping for accuracy when the ratio of the upper to the lower 95% credible limit around the estimated maximum tolerated dose is 3 or less.
From these inputs, MoDEsT generates a number of graphs and tables to summarise the operating characteristics of the stipulated design. We see that if our simulation model were indeed the true dose-toxicity curve, the maximum tolerated dose would be estimated as 584 mg/m2, but only 352 mg/m2 under the much more cautious prior model (Figure 4, top left panel). Unsurprisingly, the 95% credible band around the prior curve is extremely wide, as it is based on only three (pseudo-)observations.

Figure 4. Top left: ‘Model’ tab of the ‘Design’ module displaying the prior and the assumed true dose-toxicity curve. Top right: ‘Example’ tab of the ‘Design’ module showing one simulated example dataset. Bottom left: ‘Scenarios’ tab of the ‘Design’ module giving an overview of six simulation scenarios. Bottom right: ‘Simulations’ tab of the ‘Design’ module summarising simulation results.
MoDEsT displays one simulated realisation of a study that proceeds according to the proposed Bayesian dose-escalation procedure and stopping rules, with patient outcomes simulated using dose-limiting toxicity risks consistent with the current dose-toxicity simulation model. This is intended as an illustrative example of what the study
The example trial shown in Figure 4 (top right panel) is stopped after 13 cohorts, when a sufficiently accurate estimate has been obtained in accordance with the pre-specified stopping rules. The 95% credible band for the maximum tolerated dose becomes narrower over the course of the study and always contains the ‘true’ value of 584 mg/m2.
To facilitate assessment of the design’s operating characteristics, MoDEsT automatically creates six simulation scenarios that can be used in the simulation study: the standard scenario defined by the ‘true’ simulation model, and five additional scenarios derived from it that imply lower (‘inactive’) or higher (‘potent’) toxicity rates over the whole dose range or parts of it. They are summarised in a table and graph, alongside the prior for comparison (Figure 4, bottom left panel).
Figure 4 (bottom right panel) shows summary tables and graphs of 1000 simulations performed within seconds at the press of a button under the (anticipated) standard scenario. In this example, the average sample size required was 36.28 patients (averaged over all 1000 simulated trials) and the average of the maximum likelihood estimates of the maximum tolerated dose was 649.86 mg/m2, with a large mean squared error and notable bias; the average toxicity rate was 14%, which is well below the targeted 20%. We see how many simulated trials were stopped for which reason(s); the sum of the percentages is greater than 100% because multiple stopping criteria can be fulfilled at the same time. We also get an overview of the sample sizes used, numbers of dose-limiting toxicities observed, reasons for stopping, and doses recommended as the maximum tolerated dose across the 1000 simulated trials. These summary plots and tables produced instantly by MoDEsT will be sufficient for most users, but a detailed account of all individual simulation runs can be downloaded as a CSV file.
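To make these summary measures concrete, the sketch below computes the mean sample size, the bias and mean squared error of the estimated maximum tolerated dose, and the percentage of trials meeting each stopping criterion from a handful of simulated trials. The four trial records are invented for illustration and are not MoDEsT output; only the ‘true’ value of 584 mg/m2 is taken from the example above. The record layout and the function name `summarise` are likewise assumptions.

```python
from collections import Counter
from statistics import mean


def summarise(trials, true_mtd):
    """Summarise operating characteristics over a list of simulated trials."""
    n = len(trials)
    estimates = [t["mtd_estimate"] for t in trials]
    # A trial may meet several stopping criteria, so each one is counted separately
    reason_counts = Counter(r for t in trials for r in t["reasons"])
    return {
        "mean_sample_size": mean(t["n_patients"] for t in trials),
        "bias": mean(estimates) - true_mtd,
        "mse": mean((e - true_mtd) ** 2 for e in estimates),
        "stop_percent": {r: 100.0 * c / n for r, c in reason_counts.items()},
    }


# Invented records for four simulated trials (illustration only)
trials = [
    {"n_patients": 30, "mtd_estimate": 500, "reasons": ["accuracy"]},
    {"n_patients": 36, "mtd_estimate": 650, "reasons": ["accuracy", "consecutive"]},
    {"n_patients": 54, "mtd_estimate": 700, "reasons": ["max_n"]},
    {"n_patients": 39, "mtd_estimate": 600, "reasons": ["accuracy", "consecutive"]},
]

result = summarise(trials, true_mtd=584)
print(result)
print(sum(result["stop_percent"].values()))  # exceeds 100 when reasons overlap
```

In this toy example two of the four trials meet two criteria at once, so the stopping percentages add up to 150% rather than 100%.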
Study conduct
For illustrative purposes we present and discuss here only the analyses following the 7th and the last cohort, respectively; in practice a similar analysis would be performed after every single patient cohort.
We upload the design file and a CSV file containing the study data. MoDEsT generates tabular overviews of the design parameters and patient data, as well as plots such as the one in Figure 2. We recommend that users review this output to double-check the information and to confirm that the data have been read in as intended, that is, that the correct columns were specified for the cohort, dose, and response variables.
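A check of this kind can also be made outside the app before uploading. The sketch below tabulates patients and dose-limiting toxicities per cohort from CSV text and raises an error for obviously mis-specified columns. The column names `cohort`, `dose`, and `response`, the 0/1 coding of the response, the assumption that each cohort receives a single dose, and the example rows are all hypothetical and may differ from the file layout MoDEsT expects.

```python
import csv
import io


def summarise_by_cohort(csv_text, cohort_col="cohort", dose_col="dose",
                        response_col="response"):
    """Tabulate patients and DLTs per cohort and flag obvious read-in problems."""
    summary = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        cohort = int(row[cohort_col])
        dose = float(row[dose_col])
        response = int(row[response_col])
        if response not in (0, 1):
            raise ValueError(f"response must be 0 or 1, got {response}")
        entry = summary.setdefault(cohort, {"dose": dose, "patients": 0, "dlts": 0})
        if entry["dose"] != dose:
            raise ValueError(f"cohort {cohort} has more than one dose")
        entry["patients"] += 1
        entry["dlts"] += response
    return summary


# Made-up example data, not the quercetin dataset
example = """cohort,dose,response
1,60,0
1,60,0
1,60,0
2,120,0
2,120,1
"""

table = summarise_by_cohort(example)
for cohort, entry in sorted(table.items()):
    print(cohort, entry)
```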
In Ferry

Figure 5. ‘Recommendation’ tab of the ‘Conduct’ module after the 7th cohort (left) and after the final analysis (right) of the quercetin study.
The dataset accrued upon completion of the study comprises 52 patients, which is less than the envisaged maximum sample size of 54, so we have to tick the box in MoDEsT to indicate that the study has been stopped (Figure 5, right panel). The final model-based estimate of the maximum tolerated dose is 998 mg/m2, but this is influenced by the very pessimistic prior. Removing the pseudo-observations used to formulate the prior yields a final (maximum likelihood) estimate of 1090 mg/m2 and a marginally wider credible band. Both estimates lead to a recommendation of 945 mg/m2 for the maximum tolerated dose, which is the same as in Ferry
Discussion
Building trust in the utility, safety, and practical applicability of model-based dose-escalation designs is an essential step towards their wider acceptance within the clinical community. To assist this process, we have created MoDEsT, a software tool that is straightforward to use even without any specialist knowledge of statistical programming. We are positive it will convince trialists and statisticians that model-based methods are a feasible and worthwhile alternative to the 3+3 design and can be implemented with limited additional effort. Tasks that currently discourage many clinicians from using model-based designs, like having to set a prior, are made simple in MoDEsT. One of the main advantages of the software is that it allows investigators to input different (hypothetical or real) datasets ahead of time to see what dose recommendations the Bayesian dose-escalation procedure would generate, allowing them to develop some intuition as to how the procedure works and how it compares with their own judgement or with the algorithmic rules they might be more familiar with. We hope that by seeing the effects of changing design parameters in real time, trialists will become more confident in using model-based designs and that these will increasingly become the norm in early-phase dose-escalation studies.
Footnotes
Acknowledgements
The authors are grateful to Ian Wadsworth for providing R code, Michael Sweeting for advice during the early stages of the project, and two anonymous referees, the Associate Editor, and the Editor for a number of very helpful suggestions.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by MRC Network of Hubs for Trials Methodology Research project grant N78. C.Y. was funded by Cancer Research UK grant C22436/A15958. L.V.H.’s contribution to the project was funded by Medical Research Council grant MR/J014079/1.
