
Abstract

BACKGROUND:

The initiation of biomarker-driven trials has revolutionized oncology drug development by challenging the traditional phased approach and introducing basket studies. Notable successes in non-small cell lung cancer (NSCLC) with ALK, ALK/ROS1, and EGFR inhibitors have prompted the need to expand this approach to other cancer sites.

OBJECTIVES:

This study explores the use of dose response modeling and time-to-event algorithms on the biomarker molecular targeted agent (MTA). By simulating subgroup identification in MTA-related time-to-event data, the study aims to develop statistical methodology supporting biomarker-driven trials in oncology.

METHODS:

A total of n patients are selected and assigned to different doses. A dataset is prepared to mimic subgroup identification of the MTA for time-to-event data analysis. The response is measured through the MTA, and the MTA value is also assessed through ROC analysis. Markov Chain Monte Carlo (MCMC) techniques are used to run the proposed algorithm. The analysis is carried out with a simulation study. Subset selection is performed through the Threshold Limit Value (TLV) under the Bayesian approach.

RESULTS:

The MTA is observed in the range 12–16. A marginal level shift of the MTA from pre- to post-treatment is expected. The Cox time-varying model can further be adopted to establish a cause-effect relation between the MTA and prolonged survival duration. The proposed work provides statistical methodology to support biomarker-driven trials in oncology research.

CONCLUSION:

This study extends the application of biomarker-driven trials beyond NSCLC, opening possibilities for implementation in other cancer sites. By demonstrating the feasibility and efficacy of utilizing MTA as a biomarker, the research lays the foundation for refining and validating biomarker use in clinical trials. These advancements aim to enhance the precision and effectiveness of cancer treatments, ultimately benefiting patients.

Keywords

Bayesian, algorithm, biomarker, personalized medicine

1. Introduction

The initiation of biomarker-driven trials has transformed the oncology drug development process. The conventional drug development sequence (phase I, phase II, phase III) is being abandoned in favor of a single procedure in which first-in-human studies explore the dose and activity of a drug across different cancer sites (“basket studies”). This has become the basic platform for drug development. Trials of ALK, ALK/ROS1 and EGFR inhibitors for non-small cell lung cancer (NSCLC) are now available and are real examples of the current requirement, which now needs to be expanded to all other sites. Identifying the subgroups with relevant treatment benefit is the most challenging task [1, 2, 3, 4]. Subgroup identification is usually taken as a secondary objective of a prospective study, and studies are not powered enough to identify subgroups with statistical inference. Identifying the proper subgroup is what makes treatment successful, and it moves treatment practice towards personalized medicine. However, it is not an easy task. It is usually hypothesized that covariates are independent of the treatment effect, although they are expected to contribute to treatment success. There are other issues, such as bias in selecting the subgroups. Recently, the related statistical challenges have been extended towards developing personalized medicine [5]. There have been several attempts to develop statistical methodology to identify and work with subgroups. The tree-based approach is one of the most widely applied methods for generating subgroups from trial data [6, 7, 8, 9, 10]. Penalized regression [11] and Bayesian [12] models have also been found capable of identifying subgroups and drawing statistical inference. However, all of these methods are dedicated to the randomized controlled trial setting, where the statistical inference compares the treatment and control arms.
Statistical methodology also serves the extension from the randomized controlled trial setup to dose-finding modeling. Methodology has been developed to work with subgroups for phase-II and phase-II dose-finding studies separately [13, 14]. The challenge in subgroup identification across different doses is to keep track of how dose changes interact with the different baseline features of the patients. It is necessary to assume that baseline features are the same within a specific subgroup and that only changes of dose influence treatment outcomes. Similarly, the dose-response relation changes across subgroups. This can be handled through a dose-response functional relationship [14, 15]. The dose-response relationship depends on the selection of a potential biomarker value. The suspected biomarker needs to be filtered with a diagnostic accuracy test. The popular tool for measuring the diagnostic accuracy of a clinical feature is the receiver operating characteristic (ROC) curve or the area under the curve (AUC) [16]. The AUC is a comprehensive measure of the false positive rate (FPR) and true positive rate (TPR). The TPR is also called sensitivity, and (1-FPR) specificity. However, there are several limitations on the predictive capability of TPR, FPR and AUC [17, 18]. In this scenario, the positive predictive value (PPV) is found satisfactory for predicting time-to-event outcomes at a future time point. The PPV is more sensitive than the AUC as a performance measure, but it may measure poorly when the population prevalence is very low. Nevertheless, the PPV can be presented as an attractive metric to test the predictive power of a risk score [19]. The risk score is generated from the baseline characteristics of the data, and the PPV is adapted through a dynamic approach towards prediction of future data. The PPV is defined as

$\displaystyle\text{PPV}(z)=\text{Pr}\{D=1|Z\geqslant z\}\ \text{and}\ \text{NPV}(z)=\text{Pr}\{D=0|Z<z\}$ (1)

where $D=1$ denotes presence of the subgroup status and $D=0$ its absence. This definition is extended to the time-to-event context [20] by generating a risk score [21, 19]. The challenge is to specify the threshold limit value (TLV) of a clinical feature, i.e. the biomarker. The PPV can be formulated as a threshold-free value [22, 20], but this may bias inference about the TLV of a biomarker. Therefore, a time-dependent AUC is proposed in this work and termed the average positive predictive value (AP) [23].

The identification and development of MTAs relies on a thorough understanding of the molecular biology of cancer. Advances in genomic and proteomic technologies have allowed researchers to identify key molecular targets that are involved in the onset and progression of various types of cancer. These targets include specific proteins, enzymes, and receptors that are either overexpressed or abnormally activated in cancer cells. By inhibiting or modulating the function of these targets, MTAs can effectively halt or slow down cancer growth. Some examples of MTAs include:

Tyrosine kinase inhibitors (TKIs): These agents target specific tyrosine kinases, which are enzymes involved in signal transduction pathways that regulate cell growth, differentiation, and survival. Examples of TKIs include imatinib (Gleevec) for chronic myeloid leukemia and gefitinib (Iressa) for non-small cell lung cancer.

Monoclonal antibodies (mAbs): These are immune system proteins that specifically recognize and bind to target molecules on the surface of cancer cells. By doing so, they can either directly kill the cancer cells or recruit the immune system to attack them. Examples of mAbs include trastuzumab (Herceptin) for HER2-positive breast cancer and cetuximab (Erbitux) for colorectal cancer.

Immune checkpoint inhibitors: These agents target proteins on immune cells or cancer cells that act as checkpoints to prevent the immune system from attacking cancer cells. By blocking these checkpoints, immune checkpoint inhibitors can unleash the full potential of the immune system to recognize and destroy cancer cells. Examples include pembrolizumab (Keytruda) and nivolumab (Opdivo), which target the PD-1/PD-L1 pathway.

Angiogenesis inhibitors: These agents inhibit the formation of new blood vessels (angiogenesis), which is crucial for tumor growth and metastasis. By cutting off the blood supply, angiogenesis inhibitors can starve the tumor and inhibit its growth. Examples include bevacizumab (Avastin) and sunitinib (Sutent).

To enhance the efficacy of MTAs and minimize potential side effects, it is essential to identify patients who are most likely to benefit from a specific MTA. This can be achieved through the use of biomarkers, which are biological molecules that can be measured in blood, tissue, or other bodily fluids and provide information about the presence, severity, or response to treatment of a specific disease. In the context of MTAs, biomarkers can be used to predict the response to treatment, monitor disease progression, and identify potential resistance mechanisms. The integration of biomarkers into the clinical development and application of MTAs has been a critical factor in the success of personalized cancer therapy.

In summary, molecular targeted agents (MTAs) represent a promising approach to cancer treatment that is based on a deep understanding of the molecular biology of cancer. By targeting specific molecular pathways that drive cancer growth, MTAs can offer more precise and effective treatment options with fewer side effects compared to conventional therapies. The use of biomarkers to predict and monitor the response to MTAs is crucial for the successful development and application of personalized cancer therapy.

The biological ground for cancer development is represented by a molecule, also known as the “targeted agent”, and it is confirmed that biological pathways are linked to observable changes in the clinical outcome, as in Fig. 1. However, the complex mechanisms associated with the pathways make it challenging to estimate the effect of the “targeted agent” on clinical outcomes [24]. If the targeted agent is established as a biomarker, it is still difficult to replicate the result in other patients because of subject-level variation in clinical outcome [25], as shown in Fig. 2.

Let us assume that the therapeutic effect is defined as $\tau$ and the covariate of interest is $Z$. The baseline value of the MTA is $X$ and its value at a future time point is $Y$. The clinical outcome is progression-free survival (PFS), denoted $T$. The distribution of the outcome under the therapeutic effect $\tau$ is written as $p(.)$ or $p(T|X,Y,Z,\tau)$. A change of $Y-X$ is expected due to the MTA effect $\tau$. The point of interest is the distributional change from $X$ to $Y$ as a therapeutic effect of $\tau$. The change from $X$ to $Y$ is due to subject-level variation as well as the therapeutic effect, and this complex scenario is difficult to address. Generally, $X$ and $Y$ are observed as binary values (low versus high expression). They may also be observed with a skewed or multimodal distribution [26, 27]. The functional form is prepared as

$\displaystyle\Delta=\Delta\{p(X,Y|Z,\tau)\}$ (2)

It is further separated into sub-models to cover different subgroups by $\Delta$ through $\tau$ on $T$ . The factorization is defined as

$\displaystyle p(T,X,Y|Z,\tau)=p(T|X,Y,Z,\tau)p(X,Y|Z,\tau)$ (3)

This factorization allows the problem to be solved with Bayesian parametric modeling of $p(T|\Delta,Z,\tau)$. For the complex distributional forms of $X$ and $Y$, density estimation is carried out through an estimator of $P(X<Y|\tau)$ [28, 29].

Figure 1.

An example of different subgroups and their effect on the molecular targeted agent. y-axis, molecular targeted agent; x-axis, dose.

Figure 2.

Dose and MTA relation.

2. Generated data set

Studies dedicated to personalized medicine are relatively new, and the application of personalized medicine in the oncology setting is not yet common practice. There are some ethical constraints, but it is the need of the hour for more successful treatment. Randomizing patients to different treatment groups according to their biomarker profile is relatively uncommon. We could not obtain permission to use real oncology data for illustration because of ethical constraints. In this analysis, a dataset is therefore prepared to mimic subgroup identification of the MTA for time-to-event data analysis. A total of six doses of cisplatin chemotherapy are generated: $d_{1}=$ 15 mg/m ${}^{2}$ , $d_{2}=$ 12.5 mg/m ${}^{2}$ , $d_{3}=$ 10 mg/m ${}^{2}$ , $d_{4}=$ 7.5 mg/m ${}^{2}$ , $d_{5}=$ 5 mg/m ${}^{2}$ and $d_{6}=$ 2.5 mg/m ${}^{2}$ . The MTA is summarized by a proteomic value treated as a continuous variable. The MTA values are generated from a normal distribution with mean in the range 12–16 and standard deviation in the range 2.5–5.5 for the different doses. The subgroups are prepared from an arbitrarily chosen cut-off value of the MTA, and the relevance of the cut-off values is validated through ROC analysis. The intention is to investigate whether there is a subgroup with a differential treatment effect. Data for a total of 240 patients are formulated from the different doses, i.e. $d_{1}-d_{6}$ . The corresponding PFS duration is obtained from a normal distribution with median 500 days and standard deviation 20. The binomial distribution is assumed to generate the event status of the time-to-event data, with the probability of an event set to 0.7. The distribution of PFS is presented in Fig. 4.
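
The data-generating steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: equal allocation of 40 patients per dose and a linear spacing of the MTA mean and standard deviation across the six doses are not specified in the text, and the names `simulate_dataset`, `mta`, `pfs_days` and `event` are hypothetical.

```python
import random

def simulate_dataset(n_per_dose=40, seed=1):
    """Simulate MTA, PFS and event status for six dose levels.

    Assumptions (not stated in the paper): equal allocation per dose, and
    MTA mean/SD spaced linearly over the stated ranges 12-16 and 2.5-5.5.
    """
    rng = random.Random(seed)
    doses = [15.0, 12.5, 10.0, 7.5, 5.0, 2.5]        # mg/m^2, d1..d6
    k = len(doses)
    rows = []
    for j, dose in enumerate(doses):
        mta_mean = 12.0 + 4.0 * j / (k - 1)           # from 12 up to 16
        mta_sd = 2.5 + 3.0 * j / (k - 1)              # from 2.5 up to 5.5
        for _ in range(n_per_dose):
            rows.append({
                "dose": dose,
                "mta": rng.gauss(mta_mean, mta_sd),
                "pfs_days": rng.gauss(500.0, 20.0),   # centre 500 days, SD 20
                "event": 1 if rng.random() < 0.7 else 0,  # Bernoulli(0.7)
            })
    return rows

data = simulate_dataset()   # 6 doses x 40 patients = 240 records
```

Fixing the seed keeps the generated records reproducible, which is convenient when the same data are reused for the ROC and subgroup analyses.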

3. Modeling

3.1 Dose response modeling

We investigate dose-response modeling based on the molecular targeted agent (MTA) and then extend it towards a time-to-event algorithm. Suppose a total of $n$ patients are selected and assigned to different doses $d_{i}$. The response is measured through the MTA and taken as a series of values $(y_{1},y_{2},\ldots,y_{n})$. The inverse link function $g^{-1}$ is used so that $E(y_{i})=\mu_{i}=g^{-1}(\eta_{i})$. The predictor is formulated as

$\displaystyle\eta_{i}=\beta_{0}+\Delta(d_{i},\theta)+\gamma^{\prime}x_{i}$ (4)

The placebo effect is denoted $\beta_{0}$ and the dose-response function is $\Delta(d_{i},\theta)$, where $\Delta(0,\theta)$ corresponds to the absence of a treatment effect. Baseline covariates may be present; the covariate vector is denoted $x_{i}$, and its effect on the response variable is $\gamma$ (appearing as $\gamma^{\prime}$ in the equation above). The regression parameters $\beta_{0},\theta,\gamma$ are combined as $\phi=(\beta_{0},\theta,\gamma,\sigma)$. The additional term $\sigma$ is a nuisance parameter that represents the standard deviation of the linear model. The estimate is obtained as

$\displaystyle\hat{\phi}=\underset{\phi}{\arg\min}\sum_{i=1}^{n}\Psi((y_{i},d_{i},x_{i}),\phi)$ (5)

Here $\Psi$ is the objective function, and its partial derivative with respect to $\phi$ works like a score function.
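
As a rough illustration of the minimization in Eq. (5), the sketch below fits a dose-response curve by least squares. It assumes an Emax-type form $\Delta(d,\theta)=\theta_{1}d/(\theta_{2}+d)$, a squared-error objective, no baseline covariates, and a placebo dose of zero in the example data; all of these, along with the names `emax_curve` and `fit_emax`, are illustrative choices rather than the authors' implementation. For a fixed $\theta_{2}$ the model is linear in $(\beta_{0},\theta_{1})$, so a grid over $\theta_{2}$ combined with a closed-form line fit is enough.

```python
import random

def emax_curve(d, beta0, theta1, theta2):
    """Mean response beta0 + theta1 * d / (theta2 + d)."""
    return beta0 + theta1 * d / (theta2 + d)

def fit_emax(doses, y, theta2_grid):
    """Least squares: grid over theta2, closed-form line fit for beta0, theta1."""
    n = len(y)
    ybar = sum(y) / n
    best = None
    for theta2 in theta2_grid:
        x = [d / (theta2 + d) for d in doses]       # transformed dose
        xbar = sum(x) / n
        sxx = sum((xi - xbar) ** 2 for xi in x)
        sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
        theta1 = sxy / sxx
        beta0 = ybar - theta1 * xbar
        sse = sum((yi - beta0 - theta1 * xi) ** 2 for xi, yi in zip(x, y))
        if best is None or sse < best["sse"]:
            best = {"sse": sse, "beta0": beta0, "theta1": theta1, "theta2": theta2}
    return best

# Hypothetical data: 50 patients per dose, true curve 12 + 4 d / (5 + d).
rng = random.Random(0)
dose_levels = [0.0, 2.5, 5.0, 7.5, 10.0, 12.5, 15.0]
doses = [d for d in dose_levels for _ in range(50)]
y = [emax_curve(d, 12.0, 4.0, 5.0) + rng.gauss(0.0, 0.5) for d in doses]
fit = fit_emax(doses, y, [0.5 * k for k in range(1, 41)])
```

The grid keeps the example dependency-free; a general-purpose nonlinear optimizer would serve the same purpose when covariates or other dose-response shapes are added.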

3.2 Subgroup generation on MTA

In a clinical trial, it is usually assumed that the treatment is equally effective across the population. Baseline features of the patients are incorporated only where they are understood to be important at the design stage of the trial. The conventional handling of the baseline covariates $Z$ never explores the interaction between treatment and baseline characteristics as a clinically meaningful outcome. In biomarker-driven studies, specific covariates, i.e. $X$, need to be captured when they are suspected to influence the treatment effect. In this study, the subset covariate is defined as $Z$. A model-based subset provides the effect as $\text{treatment}\times\text{subgroup}$ interactions. It allows a parametric model to be used and is useful for exploring the subset covariates $Z$. The model can be written compactly as $m((y,d,Z),\phi(Z))$, which gives an estimate for each subset. The subsets are generated by a model-based algorithm that decides which covariates affect the parameters of model $m$. The model then splits the patients into subsets by looking at their covariates, and dose-response modeling is carried out separately within each subset. The subset of covariates is decided based on the threshold limit value of the MTA. The partition is defined as $z^{(1)},\cdots,z^{(J)}$ , i.e.,

$\displaystyle\Psi((y,d,Z),\hat{\psi})\in z^{(j)},\quad j=1,2,\ldots,J,\quad p=1,\ldots,P$ (6)

The term $J$ indexes the partitions of the covariates and $P$ is the number of prepared models. The relative score is defined as $\Psi_{p}$ through the partial derivative [30]. Multiple hypothesis tests can then be performed to create subsets from the covariates $Z$, and the covariate with the smallest $p$-value is selected for the split. The Bonferroni correction is used to address the multiple testing problem. In the context of dose-response modeling, $\gamma$ is the effect of the baseline covariates and $\sigma$ is a nuisance parameter. In subset formation it is possible to restrict attention to $\theta$, which generates the parameter for the split covariate.

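
The selection step (smallest $p$-value, followed by a Bonferroni correction) can be sketched as below. The raw $p$-values are hypothetical placeholders: in practice they would come from the score-based tests of each candidate covariate, which are not reproduced here. The function name `select_split` and the covariate names are illustrative only.

```python
def select_split(p_values, alpha=0.05):
    """Pick the split covariate after a Bonferroni adjustment.

    p_values: dict mapping covariate name -> raw p-value of its test.
    Returns (name, adjusted_p), or None when no covariate is significant.
    """
    m = len(p_values)                                # number of tests
    name, p_raw = min(p_values.items(), key=lambda kv: kv[1])
    p_adj = min(1.0, m * p_raw)                      # Bonferroni adjustment
    return (name, p_adj) if p_adj < alpha else None

raw = {"mta": 0.004, "age": 0.30, "stage": 0.02}     # hypothetical p-values
choice = select_split(raw)                           # covariate chosen for the split
```

Returning `None` when the adjusted $p$-value is not below `alpha` gives a natural stopping rule: no further split is made in that subset.
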
3.3 Model to decide the positive predictive value (PPV) and true positive rate (TPR)

In this study, we focus on the response value, which is measured as a continuous variable and denoted $Z$. The event time is denoted $T$. The positive predictive value (PPV) and true positive rate (TPR) [31, 32] are defined as

$\displaystyle\text{PPV}_{t_{0}}(z)=\text{Pr}\{T<t_{0}|Z\geqslant z\}$ (7)

$\displaystyle\text{TPR}_{t_{0}}(z)=\text{Pr}\{Z\geqslant z|T<t_{0}\}$ (8)

The event status at $t_{0}$ is $D_{t_{0}}=I(T<t_{0})$, where $I(.)$ is the indicator function. The $\textit{AP}_{t_{0}}$ is defined as [23]

$\displaystyle\text{AP}_{t_{0}}=\int_{R}\text{PPV}_{t_{0}}(z)\,d\text{TPR}_{t_{0}}(z)$ (9)

The TPR gives the distribution of $Z$ among subjects who experienced the event before $t_{0}$, i.e. $T<t_{0}$. Writing $Z_{1}$ for the MTA value of such a subject,

$\displaystyle\textit{AP}_{t_{0}}=E_{Z_{1}}\{\text{PPV}_{t_{0}}(Z_{1})\}$ (10)

Similarly, $\text{PPV}_{t_{0}}(z)$ is formulated as

$\displaystyle\text{PPV}_{t_{0}}(z)=\frac{\text{P}(Z\geqslant z|T<t_{0})\text{P}(T<t_{0})}{\text{P}(Z\geqslant z)}=\frac{\pi_{t_{0}}\{1-F_{1}(z)\}}{\{1-F(z)\}}$

where $F_{1}$ is the distribution function of the MTA value among subjects with $T<t_{0}$,

$\displaystyle F_{1}(z)=\text{Pr}(Z<z|T<t_{0})=\text{P}(Z_{1}<z)$ (11)

The marginal distribution function of the MTA value is defined as

$\displaystyle F(z)=P(Z<z)$ (12)

The event rate till the time $t_{0}$ is defined as

$\displaystyle\pi_{t_{0}}=\text{Pr}(T<t_{0})$ (13)

The $\textit{AP}_{t_{0}}$ is redefined as

$\displaystyle\text{AP}_{t_{0}}=\pi_{t_{0}}\int\frac{1-F_{1}(z)}{1-F(z)}\,dF_{1}(z)$ (14)
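
For intuition, the quantities in this section can be computed empirically when there is no censoring, so that the event status $I(T<t_{0})$ is known for every subject. The sketch below is a plain empirical version under that simplifying assumption; the function names and the five data points are hypothetical.

```python
def ppv(z, marker, event):
    """Empirical Pr(T < t0 | Z >= z); event[i] = 1 if T_i < t0."""
    at_or_above = [e for m, e in zip(marker, event) if m >= z]
    return sum(at_or_above) / len(at_or_above)

def tpr(z, marker, event):
    """Empirical Pr(Z >= z | T < t0)."""
    cases = [m for m, e in zip(marker, event) if e == 1]
    return sum(1 for m in cases if m >= z) / len(cases)

def average_precision(marker, event):
    """AP as the mean of PPV(Z_1) over subjects with an event before t0."""
    cases = [m for m, e in zip(marker, event) if e == 1]
    return sum(ppv(m, marker, event) for m in cases) / len(cases)

marker = [16.0, 15.0, 14.0, 13.0, 12.0]   # hypothetical MTA values
event = [1, 1, 0, 1, 0]                   # 1 = event before t0
ap = average_precision(marker, event)
```

At the lowest threshold every subject is included, so `ppv` reduces to the overall event rate, which is the natural baseline against which AP is read.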

3.4 Estimating $\textit{AP}_{t_{0}}$ in presence of MTA

Survival data are usually generated in the presence of censored information, since loss to follow-up is common in any survival analysis. The observed time is $V=\textit{min}\{T,C\}$, where $T$ is the duration to the event and $C$ is the censoring time, and $\delta=I(T<C)$ is the event indicator. The observations $\{(V_{i},\delta_{i},Z_{i}),i=1,\ldots,n\}$ are assumed to be independent copies of $(V,\delta,Z)$. The event status at $t_{0}$, $I(T_{i}<t_{0})$, is handled with the inverse probability of censoring [33, 34]. A distribution-free non-parametric estimator is presented in relation to the MTA observations. The time-dependent PPV and TPR are estimated as

$\displaystyle\hat{\textit{PPV}}_{t_{0}}(z)=\frac{\sum\limits_{i=1}^{n}\hat{w}_{t_{0},i}I(Z_{i}\geqslant z)I(V_{i}<t_{0})}{\sum\limits_{i=1}^{n}I(Z_{i}\geqslant z)}$ (15)

and

$\displaystyle\hat{\textit{TPR}}_{t_{0}}(z)=\frac{\sum\limits_{i=1}^{n}\hat{w}_{t_{0},i}I(Z_{i}\geqslant z)I(V_{i}<t_{0})}{\sum\limits_{i=1}^{n}\hat{w}_{t_{0},i}I(V_{i}<t_{0})}$ (16)

The weight $\hat{w}_{t_{0},i}$ is the inverse probability of censoring weight attached to the event status $I(T_{i}<t_{0})$:

$\displaystyle\hat{w}_{t_{0},i}=\frac{I(V_{i}<t_{0})\delta_{i}}{\hat{G}(V_{i})}+\frac{I(V_{i}\geqslant t_{0})}{\hat{G}(t_{0})}$ (17)

The estimator $\hat{G}(c)$ is a consistent estimator of the survival function of the censoring time, $G(c)=\text{Pr}(C\geqslant c)$. The censoring is assumed to be independent of the survival duration and the MTA value. The estimate of $G(c)$ is obtained through the nonparametric Nelson-Aalen or Kaplan-Meier estimator. Conditional on $Z$, it is formulated as

$\displaystyle G_{Z}(c)=\text{Pr}(C\geqslant c|Z=z)$ (18)

Using the redefined $\textit{PPV}_{t_{0}}(z)$ and $\textit{TPR}_{t_{0}}(z)$, the estimator of $\textit{AP}_{t_{0}}$ is

$\displaystyle\hat{\textit{AP}}_{t_{0}}=\frac{\sum\limits_{j=1}^{n}I(V_{j}\leqslant t_{0})\hat{w}_{t_{0},j}\hat{\textit{PPV}}_{t_{0}}(Z_{j})}{\sum\limits_{j=1}^{n}I(V_{j}\leqslant t_{0})\hat{w}_{t_{0},j}}$ (19)

It is a consistent estimator [35]. Now the MTA values are incorporated as

$\displaystyle\hat{\textit{PPV}}_{t_{0}}(Z_{j})=$ (20)
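
The censoring-weighted estimators of this section can be sketched as follows, assuming censoring that does not depend on $Z$, a Kaplan-Meier estimate of $G(c)=\text{Pr}(C\geqslant c)$ without special handling of tied times, and continuous times so that $V_{j}<t_{0}$ and $V_{j}\leqslant t_{0}$ coincide. Function names and the six observations are hypothetical, and the code is an illustration of Eqs. (15), (17) and (19) rather than the authors' implementation.

```python
def censoring_survival(times, delta):
    """Kaplan-Meier estimate of G(c) = Pr(C >= c); censorings act as events."""
    censor_times = sorted({t for t, d in zip(times, delta) if d == 0})

    def G(c):
        surv = 1.0
        for u in censor_times:
            if u >= c:                    # only censorings strictly before c
                break
            at_risk = sum(1 for t in times if t >= u)
            n_cens = sum(1 for t, d in zip(times, delta) if t == u and d == 0)
            surv *= 1.0 - n_cens / at_risk
        return surv

    return G

def ipcw_weights(times, delta, t0):
    """Weights of Eq. (17): delta/G(V) before t0, 1/G(t0) at or after t0."""
    G = censoring_survival(times, delta)
    weights = []
    for v, d in zip(times, delta):
        if v < t0:
            weights.append(1.0 / G(v) if d == 1 else 0.0)
        else:
            weights.append(1.0 / G(t0))
    return weights

def ppv_hat(z, times, marker, w, t0):
    """Eq. (15): weighted events before t0 among subjects with Z >= z."""
    num = sum(wi for v, m, wi in zip(times, marker, w) if m >= z and v < t0)
    den = sum(1 for m in marker if m >= z)
    return num / den

def ap_hat(times, delta, marker, t0):
    """Eq. (19): weighted mean of PPV-hat over subjects observed before t0."""
    w = ipcw_weights(times, delta, t0)
    num = sum(wi * ppv_hat(m, times, marker, w, t0)
              for v, m, wi in zip(times, marker, w) if v < t0)
    den = sum(wi for v, wi in zip(times, w) if v < t0)
    return num / den

times = [100.0, 150.0, 400.0, 200.0, 450.0, 120.0]   # observed V_i (days)
delta = [1, 1, 1, 1, 1, 0]                            # 0 = censored
marker = [16.0, 15.0, 14.0, 13.0, 12.0, 15.5]         # hypothetical MTA values
ap = ap_hat(times, delta, marker, t0=300.0)
```

A subject censored before $t_{0}$ receives weight zero, and its mass is redistributed to subjects still under observation after that time, which is why the remaining weights exceed one.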

3.5 Progression-free survival model

The patients are indexed by $i=1,\ldots,N$. The time to event, i.e. progression-free survival (PFS) or overall survival (OS), for the $i^{th}$ patient is $T_{i}$. The right-censored observed times are denoted as

$\displaystyle T^{0}=(T_{1}^{0},T_{2}^{0},\ldots,T_{N}^{0})$ (21)

If $T_{i}^{0}=T_{i}$ the event is observed and $\epsilon_{i}=1$; if $T_{i}^{0}<T_{i}$ the observation is censored and $\epsilon_{i}=0$. The indicators are collected as

$\displaystyle\epsilon=(\epsilon_{1},\ldots,\epsilon_{N})$ (22)

The pre-treatment and post-treatment MTA values are $X=(X_{1},\ldots,X_{N})$ and $Y=(Y_{1},\ldots,Y_{N})$. The individual-level change of the MTA from pre- to post-treatment is defined as

$\displaystyle\Delta_{i}=\Delta\{p(X_{i},Y_{i}|\tau_{i})\}$ (23)

The term $\tau_{i}$ represents the assigned dose-response. The patient-level data are modeled as

$\displaystyle p(T_{i}^{0},\epsilon_{i},X_{i},Y_{i}|\tau_{i},\beta,\theta)=p(T_{i}^{0},\epsilon_{i}|\Delta_{i},\tau_{i},\beta)\,p(X_{i},Y_{i}|\tau_{i},\theta)$ (24)

Here $\theta$ is the parameter governing the MTA value and $\beta$ is the regression coefficient, with the model

$\displaystyle p(T^{0},\epsilon|X,Y,\tau,\beta)=p(T^{0},\epsilon|\Delta,\tau,\beta)$ (25)

The survival distribution of $T^{0}$ is thus a function of $\Delta$ and $\tau$.

3.6 Bayesian model for MTA

The model is intended to classify the patient population through their MTA values. It is written as $p(X,Y|\tau,\theta)$. The MTA measurements are expected to follow subject-specific distributions,

$\displaystyle X_{i1},\ldots,X_{in_{i}}\stackrel{\text{i.i.d.}}{\sim}F_{X_{i}}$ (26)

and

$\displaystyle Y_{i1},\cdots,Y_{im_{i}}\stackrel{\text{i.i.d.}}{\sim}F_{Y_{i}}$ (27)

The pre- and post-treatment MTA values are measured as $X_{i}$ and $Y_{i}$ respectively. It is assumed that $X_{i}$ and $Y_{i}$ follow the normal distribution with mean $\mu$ and standard deviation $\sigma$. For the $j^{th}$ pre-treatment and $k^{th}$ post-treatment measurement of the $i^{th}$ individual, the general form is

$\displaystyle X_{ij}|\mu_{X_{ij}},\sigma_{X_{ij}}\stackrel{\text{i.i.d.}}{\sim}N(\mu_{X_{ij}},\sigma_{X_{ij}}),\quad j=1,\ldots,n_{i},$ (28)

and

$\displaystyle Y_{ik}|\mu_{Y_{ik}},\sigma_{Y_{ik}}\stackrel{\text{i.i.d.}}{\sim}N(\mu_{Y_{ik}},\sigma_{Y_{ik}}),\quad k=1,\ldots,m_{i}$ (29)

Now, $\theta_{X_{ij}}=(\mu_{X_{ij}},\sigma_{X_{ij}})^{\prime}$ and $\theta_{Y_{ik}}=(\mu_{Y_{ik}},\sigma_{Y_{ik}})^{\prime}$. It is expected that $\theta_{X_{ij}}$ and $\theta_{Y_{ik}}$ follow mixing distributions $G_{X_{i}}$ and $G_{Y_{i}}$, with

$\displaystyle\theta_{X_{i1}},\ldots,\theta_{X_{in_{i}}}|G_{X_{i}}\stackrel{\text{i.i.d.}}{\sim}G_{X_{i}}\ \text{and}\ \theta_{Y_{i1}},\ldots,\theta_{Y_{im_{i}}}|G_{Y_{i}}\stackrel{\text{i.i.d.}}{\sim}G_{Y_{i}}$ (30)

The densities of the individual vectors $X_{i}$ and $Y_{i}$, given $G_{X_{i}}$ and $G_{Y_{i}}$, are

$\displaystyle f_{X}(x_{i}|G_{X_{i}})=\int\prod_{j=1}^{n_{i}}\phi(x_{ij},\theta_{X_{ij}})G_{X_{i}}(d\theta_{X_{ij}})$ (31)

$\displaystyle f_{Y}(y_{i}|G_{Y_{i}})=\int\prod_{k=1}^{m_{i}}\phi(y_{ik},\theta_{Y_{ik}})G_{Y_{i}}(d\theta_{Y_{ik}})$ (32)

The individual-level change of the MTA value from pre- to post-treatment is defined as $\Delta_{i}$. The model on $G_{X_{i}}$ and $G_{Y_{i}}$ helps to classify patients for a specific treatment. A Dirichlet Process (DP) prior is assumed for the unknown distribution [36]. The random distribution is written as $G\sim DP(\alpha,G^{*})$, with base measure $G^{*}$ satisfying $E(G)=G^{*}$ and parameter $\alpha$. The prior is represented as

$\displaystyle G=\sum_{h=1}^{\infty}\pi_{h}\delta_{m_{h}}$ (33)

Here $\delta_{m_{h}}$ denotes the degenerate distribution at $m_{h}$ [37], and

$\displaystyle\pi_{h}=s_{h}\prod_{j=1}^{h-1}(1-s_{j})$ (34)

It is assumed that $s_{h}\sim\text{Be}(1,\alpha)$ and $m_{h}\sim G^{*},h=1,\ldots,\infty$. Finally, the pre- and post-treatment assumptions are separated as

$\displaystyle G_{X_{i}}\sim\sum_{r=1}^{\infty}\pi_{r}\delta_{G_{r}^{*}}$ (35)

and

$\displaystyle G_{Y_{i}}\sim\sum_{r=1}^{\infty}\pi_{r}\delta_{G_{r}^{*}}$ (36)

Each $G_{r}^{*}$ is assumed to follow the DP prior $DP(\gamma,G_{0})$. For $i\neq i^{\prime}$, $(G_{X_{i}},G_{Y_{i}})=(G_{X_{i^{\prime}}},G_{Y_{i^{\prime}}})$ implies $\Delta_{i}=\Delta_{i^{\prime}}$, which places the two patients in the same cluster. Gamma priors are used for $\alpha$ and $\gamma$.
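
A truncated stick-breaking draw from a single Dirichlet Process can be sketched as below. It assumes the standard construction with $s_{h}\sim\text{Beta}(1,\alpha)$, a finite truncation level, and a hypothetical normal base measure for an MTA mean; the nested structure over $G_{r}^{*}$ and the gamma hyperpriors are not reproduced. The function name `stick_breaking` and its parameters are illustrative.

```python
import random

def stick_breaking(alpha, base_draw, truncation=50, seed=0):
    """Truncated stick-breaking draw G = sum_h pi_h * delta_{m_h}."""
    rng = random.Random(seed)
    weights, atoms = [], []
    remaining = 1.0                        # length of the unbroken stick
    for h in range(truncation):
        # the last stick takes all remaining mass so the weights sum to one
        s = 1.0 if h == truncation - 1 else rng.betavariate(1.0, alpha)
        weights.append(s * remaining)      # pi_h = s_h * prod_{j<h} (1 - s_j)
        atoms.append(base_draw(rng))       # m_h drawn from the base measure
        remaining *= 1.0 - s
    return weights, atoms

# Hypothetical base measure: Normal(14, 2) for an MTA mean.
weights, atoms = stick_breaking(alpha=2.0, base_draw=lambda r: r.gauss(14.0, 2.0))
```

Smaller values of `alpha` concentrate the mass on the first few atoms, which corresponds to fewer distinct patient clusters.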

3.7 Provision to consider MTA measurement in binary format

Generally, MTA values are observed as continuous measurements. However, a decision about the MTA, or the relevance of any biomarker, can only be expressed as a binary indicator. MTA values are commonly available as pre- and post-treatment measurements on each patient, and the corresponding response variable is recorded as complete response (CR), partial response (PR), stable disease (SD), progressive disease (PD), etc. Our intention is to take the response as time to event, with PFS or OS. The model is expressed in binary format through $X_{i}$ and $Y_{i}$, with unobserved real-valued latent variables $\bar{X}_{ij}$ and $\bar{Y}_{ik}$. The observed binary variables are defined as $X_{ij}=I(\bar{X}_{ij}>0)$ and $Y_{ik}=I(\bar{Y}_{ik}>0)$ [38]. The variables $X_{ij}$ and $Y_{ik}$ are conditionally independent Bernoulli with probabilities $p_{X_{ij}}$ and $p_{Y_{ik}}$, that is $X_{ij}\sim\text{Bern}(p_{X_{ij}})$ and $Y_{ik}\sim\text{Bern}(p_{Y_{ik}})$. They are linked to the parameters $\theta_{X_{ij}}=\text{link}(p_{X_{ij}})$, $\theta_{Y_{ik}}=\text{link}(p_{Y_{ik}})$ through a logit or probit link function. The prior assumption is used to obtain an estimate of $\Delta_{i}$, which is presented as

$\displaystyle\Delta_{i}=\text{Pr}(X_{i}=0,Y_{i}=1)$ (37)

Now $\Delta_{i}$ and $G_{Y_{i}}$ are both expressed through the Bernoulli distribution, which helps to assess whether the MTA is equally present in all samples. Further, the post- and pre-treatment values are denoted $(Y_{i},X_{i})$, and equal numbers of measurements are assumed for post- and pre-treatment. The continuously measured MTA value is presented as $\theta_{X_{i}}=(\mu_{X_{i}},\sigma_{X_{i}})$ and $\theta_{Y_{i}}=(\mu_{Y_{i}},\sigma_{Y_{i}})$, $X_{i}|\theta_{X_{i}}\sim N(\theta_{X_{i}})$ and $Y_{i}|\theta_{Y_{i}}\sim N(\theta_{Y_{i}})$, with $\theta_{X_{i}}\sim G_{X}$ and $\theta_{Y_{i}}\sim G_{Y}$ and $(G_{X},G_{Y})\sim\text{DP}(\alpha,G_{0})$. In the binary case, $X_{i}\sim\text{Bern}(p_{X_{i}})$ and $Y_{i}\sim\text{Bern}(p_{Y_{i}})$, with $\theta_{X_{i}}=\text{link}(p_{X_{i}})$ and $\theta_{Y_{i}}=\text{link}(p_{Y_{i}})$.

3.8 Model to capture the expected change on MTA value

The term $\Delta$ is used to compare the MTA profile pre- and post-treatment. Our intention is to estimate the MTA measurement and capture the real change due to the treatment effect. The real change is defined as a distributional change and is captured through the cdfs $F_{X}$ and $F_{Y}$. The joint function is defined as

$\displaystyle Q_{X,Y}(p)=F_{Y}\{F_{X}^{-1}(p)\},\ \text{for}\ p\in(0,1)$ (38)

The Receiver Operating Characteristic (ROC) curve is defined as

$\displaystyle\text{ROC}(p)=1-F_{Y}\{F_{X}^{-1}(1-p)\}$ (39)

Our intention is to split the MTA profile based on its ability to prolong the survival duration. The expected distributional change between the pre- and post-treatment arms is presented as

$\displaystyle\Delta=\int_{0}^{1}Q_{X,Y}(p)dp=E_{F_{Y}}\{F_{X}(Y)\}=P(X<Y)$ (40)

For the $i^{th}$ individual, $F_{X_{i}}$ and $F_{Y_{i}}$ define the pre- and post-treatment measurements of the MTA value.
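
A simple empirical counterpart of $P(X<Y)$ is the proportion of (pre, post) pairs in which the post-treatment value is larger, and it coincides with the area under the empirical ROC curve. The sketch below checks this on a small hypothetical sample; it is a plain nonparametric illustration, not the Bayesian estimator used in the paper, and the function names are illustrative.

```python
def prob_x_less_y(x, y):
    """Empirical P(X < Y) over all (pre, post) pairs; ties count one half."""
    total = 0.0
    for xi in x:
        for yj in y:
            if xi < yj:
                total += 1.0
            elif xi == yj:
                total += 0.5
    return total / (len(x) * len(y))

def empirical_roc_auc(x, y):
    """Area under the empirical ROC curve by the trapezoid rule."""
    thresholds = sorted(set(x) | set(y), reverse=True)
    points = [(0.0, 0.0)]
    for c in thresholds:
        fpr = sum(1 for xi in x if xi >= c) / len(x)   # pre values at or above c
        tpr = sum(1 for yj in y if yj >= c) / len(y)   # post values at or above c
        points.append((fpr, tpr))
    auc = 0.0
    for (p0, r0), (p1, r1) in zip(points, points[1:]):
        auc += (p1 - p0) * (r0 + r1) / 2.0
    return auc

pre = [11.2, 12.5, 13.1, 14.0]     # hypothetical pre-treatment MTA values
post = [12.9, 13.8, 14.6, 15.3]    # hypothetical post-treatment MTA values
delta = prob_x_less_y(pre, post)
auc = empirical_roc_auc(pre, post)
```

A value of `delta` near 0.5 indicates no distributional shift between pre- and post-treatment values, while values near 1 indicate that post-treatment values are almost always higher.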

4. Simulation study and posterior computation

The performance of the proposed model is tested through simulation. The simulation is designed to mimic a Phase II dose-finding setting. The intention is to define the subgroups based on exploratory measurement. The mimicked data represent the dose-response relation, different doses and standard error. Data for a total of $n=$ 210 patients are simulated under 5 different doses of chemotherapy, i.e. 15, 12.5, 10, 7.5 and 5. The covariate is defined as $z_{i}$, and it is assumed that $z_{i}\sim N(0,I_{10})$. The assumed distribution is formulated as

$\displaystyle y_{i}\stackrel{\text{i.i.d.}}{\sim}N(\mu_{i},\sigma^{2})$ (41)

$\displaystyle\mu_{i}=\beta_{0}(z_{i})+\theta_{1}(z_{i})\frac{d_{i}}{\theta_{2}(z_{i})+d_{i}},\quad i=1,\ldots,n$ (42)

The dose intercept effect is captured by $\beta_{0}(z_{i})$. The highest MTA effect is captured by $\theta_{1}$, and the lowest effect in terms of MTA is captured by $\theta_{2}$. The simulation is performed under the following scenarios.

Scenario 1:

No treatment effect is present.

Scenario 2:

Treatment effect is present and measured only on the MTA as covariate $z_{i}$. This condition is separated by the different distributional forms of the MTA.

Scenario 3:

Treatment effect is present and measured through the presence of different MTAs as covariate $z_{i}$.
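
One way to generate responses under these scenarios is sketched below. The numerical values (intercept 12, maximal effects 4 and 2, $\theta_{2}=5$), the use of a single scalar covariate instead of ten, the subgroup rule $z>0$, the balanced allocation, and in particular the reading of Scenario 3 as a treatment effect that is present in all patients but larger in the subgroup are assumptions made purely for illustration.

```python
import random

def mean_response(d, z, scenario):
    """Mean MTA response under a hypothetical version of each scenario."""
    beta0 = 12.0                          # assumed intercept
    if scenario == 1:                     # no treatment effect
        theta1 = 0.0
    elif scenario == 2:                   # effect only in the subgroup z > 0
        theta1 = 4.0 if z > 0 else 0.0
    else:                                 # effect in all, larger when z > 0
        theta1 = 4.0 if z > 0 else 2.0
    theta2 = 5.0                          # assumed half-effect dose
    return beta0 + theta1 * d / (theta2 + d)

def simulate(scenario, n=210, sigma=1.0, seed=0):
    """Simulate (dose, covariate, response) triples for one scenario."""
    rng = random.Random(seed)
    doses = [15.0, 12.5, 10.0, 7.5, 5.0]
    rows = []
    for i in range(n):
        d = doses[i % len(doses)]         # balanced allocation (assumption)
        z = rng.gauss(0.0, 1.0)           # one covariate, z ~ N(0, 1)
        y = rng.gauss(mean_response(d, z, scenario), sigma)
        rows.append((d, z, y))
    return rows

rows = simulate(scenario=2)
```

Under Scenario 1 the mean does not depend on dose or covariate, so any subgroup reported by the algorithm there is a false positive, which makes it a useful calibration case.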

Table 1

Time-dependent-ROC curve estimation

Parameter	Time	Cases (Survivors)	Censored (AUC $(\%)$ )	SE
Potassium and calcium combined	$t=$ 235	55 (428)	53 (57.08)	4.30
	$t=$ 365.8	111 (354)	71 (56.12)	3.20
	$t=$ 579	159 (268)	109 (54.86)	2.92
	$t=$ 842.6	201 (182)	153 (53.72)	2.98
	$t=$ 1137	217 (107)	212 (52.64)	3.34
Serum calcium	$t=$ 235	55 (428)	53 (60.72)	4.29
	$t=$ 365.8	111 (354)	71 (57.11)	3.17
	$t=$ 579	159 (268)	109 (55.28)	2.98
	$t=$ 842.6	201 (182)	153 (52.67)	2.95
	$t=$ 1137	217 (107)	212 (52.97)	3.34

The results obtained through the simulation are presented in Table 1. Different scenarios are also considered, as follows.

(I)

The absence of any subgroups.

(II)

No treatment effect difference observed between the subgroups.

(III)

In subgroups, the MTA reaches the optimum level as expected.

(IV)

Subgroups for which the dose-response relation reaches a plateau after the dose level is increased.

Our intention is to identify the doses within the control limit of the MTA. This identification procedure is extended further in the next sections. The MTA measurement is assumed to follow a normal distribution through

$\displaystyle y_{i}\stackrel{\text{i.i.d.}}{\sim}N(\mu_{i},\sigma^{2}),\quad\mu_{i}=\beta_{0}+\Delta(d_{i},\theta),\quad\text{for}\ i=1,2,\ldots,n$ (43)

Table 2

Results observed through the simulation ( $N=$ 20,000)

Mean PPV		$M_{1}$	$M_{2}$	$\Delta$
Dose	$t_{0}$ (Event Rate)	True (Bias)	True (Bias)	True (Bias)
2.5	50(0.05)	0.28(0.03)	0.17(0.03)	0.10(0.00)
2.5	100(0.30)	0.92(0.01)	0.29(0.012)	0.09(0.00)
2.5	200(0.85)	0.46(0.00)	0.38(0.00)	0.07(0.00)
2.5	250(0.97)	0.52(0.00)	0.44(0.00)	0.08(0.00)
5	50(0.07)	0.26(0.03)	0.16(0.03)	0.10(0.00)
5	100(0.30)	0.35(0.01)	0.23(0.01)	0.12(0.00)
5	200(0.97)	0.44(0.00)	0.321(0.00)	0.12( $-$ 0.00)
5	250(1)	0.51(0.005)	0.456(0.00)	0.05(0.00)
7.5	50(0.05)	0.276(0.03)	0.17(0.03)	0.09(0.00)
7.5	100(0.42)	0.394(0.02)	0.25(0.02)	0.13( $-$ 0.00)
7.5	200(0.97)	0.49(0.00)	0.41(0.00)	0.08(0.00)
7.5	250(1)	0.55(0.00)	0.49(0.00)	0.49( $-$ 0.00)
10.5	50(0.07)	0.25(0.03)	0.18(0.03)	0.07(0.00)
10.5	100(0.17)	0.36(0.02)	0.23(0.02)	0.12(0.05)
10.5	200(0.80)	0.43(0.01)	0.32(0.01)	0.10(0.00)
10.5	250(0.95)	0.49(0.00)	0.40(0.00)	0.09( $-$ 0.00)
12.5	50(0.12)	0.28(0.03)	0.16(0.03)	0.11(0.00)
12.5	100(0.25)	0.38(0.02)	0.30(0.02)	0.08( $-$ 0.00)
12.5	200(0.77)	0.43(0.01)	0.35(0.01)	0.07(0.00)
12.5	250(0.95)	0.54(0.00)	0.43(0.00)	0.10(0.00)
15.0	50(0.07)	0.24(0.03)	0.18(0.02)	0.06(0.00)
15.0	100(0.30)	0.32(0.02)	0.23(0.01)	0.09(0.00)
15.0	200(0.82)	0.39(0.01)	0.29(0.01)	0.10(0.00)
15.0	250(0.97)	0.48(0.00)	0.40(0.00)	0.08(0.00)

Figure 3.

A time-dependent AUC estimation on the threshold value of an MTA.

The functional form is captured by $\Delta$. The covariate $z$ is partitioned to form the subgroups. In one scenario, both $\beta_{0}$ and $\theta$ are assumed to vary; in the other, only $\theta$ is assumed to be affected by the dose-response curve. Model performance is judged by convergence, which is assessed through MCMC simulation with $N=$ 20,000 iterations. The bias of the tested data is presented in Table 2.

The interest is to identify the treatment as the main effect for subgroup analysis. Patients can be separated into subgroups according to the treatment effect at different doses. Threshold values can then be generated for the relevant subgroups and used to finalize the effective dose. The challenge is twofold: first, to identify the actual subgroup defined by the covariate, and second, to finalize the effective dose. The threshold detection procedure is formulated through the ROC curve; details are provided in Sections 3.3 and 3.4. The threshold values of an MTA must be established through survival modeling. The dose-response modeling procedure is described in the earlier sections.

The baseline distribution $G_{0}$ is obtained through iterative methods for the MTA levels, i.e. $(X,Y)$. The parameters are obtained through truncation of the Dirichlet process prior. The proposed model attempts to estimate the biological and clinical effects jointly. The patient-level inconsistency is measured through $\Delta_{i}$. MCMC is used to generate the MTA distributions $(G_{X_{i}},G_{Y_{i}})$ for each individual. After burn-in, $\Delta_{i}$ is computed for each individual by averaging over the posterior estimates of $(G_{X_{i}}^{*},G_{Y_{i}}^{*})$. The posterior MTA profile is computed as

$\displaystyle\Delta_{i}^{*}=E_{G_{Y_{i}}^{*}}\{G_{X_{i}}^{*}(Y_{ik})\}=\int G_{X_{i}}^{*}(y)\,dG_{Y_{i}}^{*}(y)$ (44)

and

$\displaystyle G^{*}_{X_{i}}=\sum_{l=1}^{\infty}w_{l}^{*}\delta_{\theta_{l}^{*}}$ (45)

$\displaystyle G^{*}_{Y_{i}}=\sum_{l^{\prime}=1}^{\infty}\omega_{l^{\prime}}^{*}\delta_{\theta_{l^{\prime}}^{*}}$ (46)
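The infinite sums above are handled in practice by truncating the Dirichlet process prior. A minimal sketch of a truncated stick-breaking draw, following the constructive definition of Sethuraman [37], is given below. The concentration parameter `alpha`, the truncation level `n_atoms`, and the normal base measure are assumptions chosen for illustration, not values taken from the study.

```python
import random


def stick_breaking_weights(alpha, n_atoms, rng):
    """Truncated stick-breaking weights: w_l = v_l * prod_{k<l} (1 - v_k)."""
    weights = []
    remaining = 1.0
    for _ in range(n_atoms - 1):
        v = rng.betavariate(1.0, alpha)   # v_l ~ Beta(1, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)             # last atom takes the leftover mass
    return weights


def draw_truncated_dp(alpha, n_atoms, base_mean, base_sd, seed=0):
    """One draw G = sum_l w_l * delta_{theta_l}, with atoms from a normal base measure."""
    rng = random.Random(seed)
    weights = stick_breaking_weights(alpha, n_atoms, rng)
    atoms = [rng.gauss(base_mean, base_sd) for _ in range(n_atoms)]
    return weights, atoms


weights, atoms = draw_truncated_dp(alpha=1.0, n_atoms=20, base_mean=14.0, base_sd=1.0)
print(round(sum(weights), 6), len(atoms))
```

Assigning the leftover mass to the final atom keeps the truncated weights summing to one, which is the usual convention for a finite approximation.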

Figure 4.

Progression-free survival with different doses.

Table 3

Posterior means computed with $95\%$ credible intervals of the MTA

Parameter	Posterior mean (95% C.I.)
$\beta_{0}$ (Intercept)	1.60 (0.39, 2.87)
$\theta$ ( $d_{1}$ vs $d_{2}$ )	0.27 (0.04, 1.46)
$\beta_{2}$ (d $\Delta$ )	0.81 ( $-$ 0.48, 2.30)

The posterior mean is estimated as

$\displaystyle\Delta_{i}^{*}=\sum_{l}\sum_{l^{\prime}}w_{l}^{*}\omega_{l^{\prime}}^{*}\left(1-\Phi\left(\frac{\mu_{l}^{*}-\mu_{l^{\prime}}^{*}}{\sqrt{\sigma_{l}^{2*}+\sigma_{l^{\prime}}^{2*}}}\right)\right)$ (47)

The posterior mean estimates are provided in Table 3.
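The double sum in equation (47) can be evaluated directly once the mixture weights, means, and variances are available. The sketch below assumes normal mixture components for both distributions; the function and argument names are hypothetical. Each term is the probability that a draw from component $l$ of the first mixture does not exceed a draw from component $l^{\prime}$ of the second.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def posterior_delta(w, mu_x, var_x, omega, mu_y, var_y):
    """Evaluate sum_l sum_l' w_l * omega_l' * (1 - Phi((mu_l - mu_l') / sqrt(var_l + var_l')))."""
    total = 0.0
    for w_l, m_l, v_l in zip(w, mu_x, var_x):
        for o_lp, m_lp, v_lp in zip(omega, mu_y, var_y):
            z = (m_l - m_lp) / math.sqrt(v_l + v_lp)
            total += w_l * o_lp * (1.0 - normal_cdf(z))
    return total


# Hypothetical two-component mixtures for one patient.
delta_star = posterior_delta(
    w=[0.6, 0.4], mu_x=[13.0, 14.5], var_x=[0.5, 0.8],
    omega=[0.7, 0.3], mu_y=[14.0, 15.5], var_y=[0.6, 0.9],
)
print(round(delta_star, 3))
```

When both mixtures are identical single components the value is 0.5, which is a convenient sanity check.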

5. Result

The computations are performed by MCMC with a sample size of 20,000 and a burn-in of 20,000 iterations. Convergence of the MCMC chains is confirmed visually. The presence of heterogeneity in the MTA is verified graphically for each patient. The posterior estimates obtained through the MCMC iterations help to classify the patients into different clusters.
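The role of the burn-in can be illustrated with a minimal random-walk Metropolis sketch. This is not the sampler used in the study, whose details are not given here: it targets the posterior of a single normal mean under a flat prior with known variance. The data, step size, and starting value are hypothetical, and the run lengths are taken as 20,000 discarded plus 20,000 retained draws, which is one reading of the figures quoted above.

```python
import math
import random


def log_posterior(mu, data, sigma):
    """Log posterior of mu under a flat prior (log likelihood up to a constant)."""
    return -sum((y - mu) ** 2 for y in data) / (2.0 * sigma ** 2)


def metropolis_mean(data, sigma, n_keep=20000, n_burn=20000, step=0.5, seed=0):
    """Random-walk Metropolis; the first n_burn iterations are discarded."""
    rng = random.Random(seed)
    mu = 0.0  # deliberately poor starting value
    current = log_posterior(mu, data, sigma)
    draws = []
    for it in range(n_burn + n_keep):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data, sigma)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        if it >= n_burn:
            draws.append(mu)
    return draws


data_rng = random.Random(1)
data = [data_rng.gauss(14.0, 1.0) for _ in range(30)]  # hypothetical MTA values
draws = metropolis_mean(data, sigma=1.0)
posterior_mean = sum(draws) / len(draws)
print(round(posterior_mean, 2), round(sum(data) / len(data), 2))
```

Under a flat prior the posterior mean equals the sample mean, so agreement between the two printed values is a simple check that the retained draws have moved away from the starting value.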

The MTA is observed in the range 12–16. A marginal level shift of the MTA from pre- to post-treatment is expected. Other changes in MTA level may occur due to therapeutic efficacy. The loss function for each patient from pre- to post-treatment is defined as $\Delta_{i}$, and the corresponding posterior mean is obtained as $E(\Delta_{i}|\textit{data})$. The frequency of the posterior mean for each dose level is presented in Fig. 3. The model is thus used to measure the presence of heterogeneity across all patients. Further, the difference in the MTA value distribution between pre- and post-treatment reflects the effectiveness of the dose in controlling the MTA.

The dose that is able to control the MTA post-treatment, as measured by the $\Delta_{i}$ value, can be defined as the most effective dose. The control limit of the MTA depends on the influence of the MTA on the progression of the disease. Whether the MTA level is under control can be decided by examining the MTA threshold value for disease progression, which is presented separately in this manuscript. In this section, the arbitrarily selected threshold value of the MTA is 14. Patients are clustered into two groups, $\Delta_{i}\geqslant 14$ and $\Delta_{i}<14$. The Kaplan-Meier curves of these groups are presented in Fig. 4. The patient population can be separated as a function of the posterior mean of $\Delta_{i}$ for better survival. The Cox time-varying model can further be adopted to establish a cause-effect relation between the MTA and prolonged survival. A parametric approach can be considered to subgroup the patients by distributional changes in their MTA profiles. However, a nonparametric approach can also be considered for better goodness of fit [39]. The suitable model is defined as the one that minimizes the posterior predictive loss [40]. The patients can then be separated by the mean MTA values observed pre- and post-treatment. The means of the pre- and post-treatment measurements are denoted $\bar{X}_{i}$ and $\bar{Y}_{i}$, respectively, and are used to convert the continuous MTA measurement into a binary format.
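The grouping at the threshold of 14 and the Kaplan-Meier comparison can be sketched as follows. The estimator is the standard product-limit form; the patient values, follow-up times, and helper names are hypothetical and serve only to show the mechanics.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (t, S(t)) at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather all subjects (events and censorings) sharing this time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def split_by_threshold(delta, times, events, threshold=14.0):
    """Split patients into delta >= threshold and delta < threshold."""
    high = [(t, e) for d, t, e in zip(delta, times, events) if d >= threshold]
    low = [(t, e) for d, t, e in zip(delta, times, events) if d < threshold]
    return high, low


# Hypothetical posterior means of Delta_i, follow-up times, and event indicators.
delta = [12.5, 13.0, 13.8, 14.0, 14.6, 15.2, 15.9, 13.4]
times = [90, 150, 210, 300, 420, 500, 610, 180]
events = [1, 1, 1, 1, 0, 1, 0, 0]

high, low = split_by_threshold(delta, times, events)
km_high = kaplan_meier([t for t, _ in high], [e for _, e in high])
km_low = kaplan_meier([t for t, _ in low], [e for _, e in low])
print(km_high)
print(km_low)
```

A formal comparison of the two curves would normally use a log-rank test or a Cox model, which this sketch does not include.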

In conclusion, this study utilizes MCMC with a sample size of 20,000 to perform computations and analyze the heterogeneity of MTA among patients. The results indicate a marginal level shift of MTA from pre to post-treatment and provide insight into the effectiveness of various doses. A threshold value of 14 for MTA is used to cluster patients into two groups, and Kaplan-Meier curves are employed to visualize survival rates. Both parametric and nonparametric approaches are considered for further analysis, with the goal of minimizing posterior predictive loss. Ultimately, this research provides a foundation for understanding the impact of MTA on disease progression and for identifying effective treatment strategies to improve patient outcomes.

6. Discussion

Dose-response modeling has been presented with linear and non-linear models [15]. The problems with linear and non-linear modeling are addressed by spline and B-spline methods [41]. Our proposed method is suitable for use without any assumption about the relation between response and covariates, which provides additional flexibility in dose-response modeling. This manuscript is dedicated to subgroup classification for dose-finding cancer clinical trials. The intention of performing a dose-finding trial is to prolong the duration of PFS and OS. In particular, this manuscript bridges subgroup classification and the dose-finding problem, and prepares the application of a classified prognostic marker toward prolonging survival. The work proceeds by identifying subgroups and then applying those subgroups in the dose-finding trial. It is really challenging to identify an effective treatment with a limited sample size. The simulation study is performed to generate the variability due to sample size, treatment effect size, etc. The methodology is presented as an exploratory analysis. The proposed method is useful for identifying subgroups with different dose-response curves, and a Bonferroni correction factor should also be adopted in this context. Naive approaches without multiplicity adjustments will tend to identify more subgroups. In this work, only a normal distribution is assumed for the response variable. This work is proposed as an integrated approach to finding the most effective dose through the MTA. It provides a comprehensive framework to quantify the MTA threshold value, finalize the patient subgroups, and decide the effective dose. The Bayesian nonparametric approach is used to decide the proper dose for each individual by classifying the patients' outcomes on the metronomic dose. It is observed that our model can test the conventional clinical hypothesis.

This work is integrated by deciding the effective dose on time-to-event data, i.e. survival outcomes. It is very difficult to measure the MTA repeatedly because of ethical and feasibility issues. This work is performed with a single-time-point measurement of the baseline MTA value. However, the computational approach can be extended to repeated measurements of the MTA value. The proposed model is used to classify patients into groups according to their MTA levels. It should be considered that the MTA may also be harmful to normal cells as an immune suppressor. Further, the structural formation needs to be considered in this context. The accuracy of the MTA is tested by $\textit{PPV}_{t_{0}}$. The $\textit{PPV}_{t_{0}}$ is obtained from a continuous variable and depends purely on the threshold value of the MTA. Measurement of the MTA is time dependent and is defined as $\textit{AP}_{t_{0}}$. In medical research, the level of performance of a tool is measured by the ROC curve. In this context, the MTA value is also evaluated through ROC. However, the MTA is a time-dependent measurement, and there are advantages [31, 23, 42] and limitations to the application of ROC [19, 31]. The ROC provides the true risk probability $P(T<t_{0}|Z)$ of an MTA at every point [43], which helps to maximize the true risk probability.
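The dependence of $\textit{PPV}_{t_{0}}$ on the MTA threshold can be illustrated with a naive sketch that estimates $P(T<t_{0}\mid Z\geqslant c)$ from subjects whose status at $t_{0}$ is known. This is a simplification that ignores censoring adjustments and is not the semiparametric estimator of the cited literature; the function name and toy data are hypothetical.

```python
def naive_ppv(times, events, markers, t0, threshold):
    """Fraction of marker-positive subjects (Z >= threshold) with an event before t0.

    Subjects censored before t0 have unknown status at t0 and are excluded.
    """
    positives = [(T, e) for T, e, z in zip(times, events, markers) if z >= threshold]
    known = [(T, e) for T, e in positives if (T < t0 and e == 1) or T >= t0]
    if not known:
        raise ValueError("no marker-positive subjects with known status at t0")
    cases = sum(1 for T, e in known if T < t0 and e == 1)
    return cases / len(known)


# Hypothetical toy data: follow-up time, event indicator (1 = event), MTA value.
times = [50, 120, 180, 220, 260, 300]
events = [1, 1, 0, 1, 0, 0]
markers = [15.5, 14.2, 14.8, 13.1, 14.0, 12.4]

ppv_14 = naive_ppv(times, events, markers, t0=200, threshold=14.0)
ppv_15 = naive_ppv(times, events, markers, t0=200, threshold=15.0)
print(round(ppv_14, 3), round(ppv_15, 3))
```

Evaluating the function over a grid of thresholds traces how the predictive value changes with the cut-off at a fixed $t_{0}$.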

Footnotes

Acknowledgments

The authors express their sincere gratitude to the editor-in-chief, Prof. Sudhir Srivastava, the managing editor, and the learned reviewers for their valuable comments, which have greatly contributed to improving the content and presentation of the original manuscript.

Author contributions

Conception: GKV and AB.

Interpretation and analysis of data: AB and FT.

Preparation of the manuscript: AB and AFP.

Revision for important intellectual content: FT and AFP.

Supervision: GKV.

References

1. D.T., Durham J.N., Smith K.N., Wang, Bartlett B.R., Aulakh L.K., Kemberling, Wilt, Luber B.S., et al., Mismatch-repair deficiency predicts response of solid tumors to PD-1 blockade, Science (2017), eaan6733.

2. Khozin, Weinstock, Blumenthal G.M., Cheng, Zhuang, Zhao, Charlab, Fan, Keegan, et al., Osimertinib for the treatment of metastatic EGFR T790M mutation-positive non-small cell lung cancer, Clinical Cancer Research 23(9) (2017), 2131–2135.

3. Malik S.M., Maher V.E., Bijwaard K.E., Becker R.L., Zhang, Tang S.W., Song, Liu, Marathe, Gehrke, et al., US Food and Drug Administration approval: crizotinib for treatment of advanced or metastatic non–small cell lung cancer that is anaplastic lymphoma kinase positive, Clinical Cancer Research 20(8) (2014), 2029–2034.

4. Vishwakarma G.K., Bhattacharjee, Banerjee and Liquet, Classification algorithm for high-dimensional protein markers in time-course data, Statistics in Medicine 39(28) (2020), 4201–4217.

5. Ruberg S.J. and Shen, Personalized medicine: Four perspectives of tailored medicine, Statistics in Biopharmaceutical Research 7(3) (2015), 214–229.

6. Loh W.Y. and Man, A regression tree approach to identifying subgroups with differential treatment effects, Statistics in Medicine 34(11) (2015), 1818–1833.

7. Dusseldorp and Van Mechelen, Qualitative interaction trees: a tool to identify qualitative treatment-subgroup interactions, Statistics in Medicine 33(2) (2014), 219–237.

8. Lipkovich, Dmitrienko, Denne and Enas, Subgroup identification based on differential effect search – a recursive partitioning method for establishing response to treatment in patient subpopulations, Statistics in Medicine 30(21) (2011), 2601–2621.

9. Tsai C.L., Wang, Nickerson D.M. and Li, Subgroup analysis via recursive partitioning, Journal of Machine Learning Research 10 (2009), 141–158.

10. Bhattacharjee, Vishwakarma G.K. and Ong S.H., A modified risk detection approach of biomarkers by frailty effect on multiple time to event data, Journal of Computational and Applied Mathematics (2022), 114681.

11. Tian, Alizadeh A.A., Gentles A.J. and Tibshirani, A simple method for estimating interactions between a treatment and a large number of covariates, Journal of the American Statistical Association 109(508) (2014), 1517–1532.

12. Berger J.O., Wang and Shen, A Bayesian approach to subgroup identification, Journal of Biopharmaceutical Statistics 24(1) (2014), 110–129.

13. Ting, Dose finding in drug development, Springer Science & Business Media, 2006.

14. Yan, BOIN: An R Package for Designing Single-Agent and Drug-Combination Dose-Finding Trials Using Bayesian Optimal Interval Designs.

15. Thomas, Sweeney and Somayaji, Meta-analysis of clinical dose-response in a large drug development portfolio, Statistics in Biopharmaceutical Research 6(4) (2014), 302–317.

16. Wright C.F. and Zimmern R.L., Conceptual issues for screening in the genomic era-time for an update? Epidemiology, Biostatistics, and Public Health 11(4) (2014).

17. Greenland, The need for reorientation toward cost-effective prediction: Comments on 'Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond' by MJ Pencina et al., Statistics in Medicine, Statistics in Medicine 27(2) (2008), 199–206.

18. Grimes D.A. and Schulz K.F., Uses and abuses of screening tests, The Lancet 359 (2002), 881–884.

19. Moskowitz C.S. and Pepe M.S., Quantifying and comparing the predictive accuracy of continuous prognostic factors for binary outcomes, Biostatistics 5(1) (2004), 113–127.

20. Zheng, Cai, Pepe M.S. and Levy W.C., Semiparametric Models of Time-Dependent Predictive Values of Prognostic Biomarkers, Biometrics 66(1) (2010), 50–60.

21. Wald N.J. and Bestwick J.P., Is the area under an ROC curve a valid measure of the performance of a screening or diagnostic test? Journal of Medical Screening 21(1) (2014), 51–56.

22. Raghavan, Bollmann and Jung G.S., A critical investigation of recall and precision as measures of retrieval system performance, ACM Transactions on Information Systems (TOIS) 7(3) (1989), 205–229.

23. Yuan and Zhu, Threshold-free measures for assessing the performance of medical screening tests, Frontiers in Public Health 3 (2015), 57.

24. Ratain M.J. and Glassman R.H., Biomarkers in phase I oncology trials: signal, noise, or expensive distraction? AACR, 2007.

25. Kelloff G.J. and Sigman C.C., Cancer biomarkers: selecting the right drug for the right patient, Nature Reviews Drug Discovery 11(3) (2012), 201.

26. Bessarabova, Kirillov, Shi, Bugrim, Nikolsky and Nikolskaya, Bimodal gene expression patterns in breast cancer, BMC Genomics 11(1) (2010), S8.

27. Lucas J.E., Carvalho C.M., Chen J.L.Y., Chi J.T. and West, Cross-study projections of genomic biomarkers: an evaluation in cancer genomics, PLoS One 4(2) (2009), e4523.

28. Escobar M.D. and West, Bayesian density estimation and inference using mixtures, Journal of The American Statistical Association 90(430) (1995), 577–588.

29. Rodriguez, Dunson D.B. and Gelfand A.E., The nested Dirichlet process, Journal of The American Statistical Association 103(483) (2008), 1131–1154.

30. Zeileis, Hothorn and Hornik, Model-based recursive partitioning, Journal of Computational and Graphical Statistics 17(2) (2008), 492–514.

31. Zheng, Cai, Pepe M.S. and Levy W.C., Time-dependent predictive values of prognostic biomarkers with failure time outcome, Journal of The American Statistical Association 103(481) (2008), 362–368.

32. Heagerty P.J., Lumley and Pepe M.S., Time-dependent ROC curves for censored survival data and a diagnostic marker, Biometrics 56(2) (2000), 337–344.

33. Lawless J.F. and Yuan, Estimation of prediction error for survival models, Statistics in Medicine 29(2) (2010), 262–274.

34. Uno, Cai, Tian and Wei, Evaluating prediction rules for t-year survivors with censored regression models, Journal of The American Statistical Association 102(478) (2007), 527–537.

35. Pepe M.S., The statistical evaluation of medical tests for classification and prediction, Medicine, 2003.

36. Ferguson T.S., Bayesian density estimation by mixtures of normal distributions, Recent Advances in Statistics (1983), 287–302.

37. Sethuraman, A constructive definition of Dirichlet priors, Statistica Sinica (1994), 639–650.

38. Albert J.H. and Chib, Bayesian analysis of binary and polychotomous response data, Journal of The American Statistical Association 88(422) (1993), 669–679.

39. Ibrahim J.G., Chen M.-H. and Sinha, Criterion-based methods for Bayesian model assessment, Statistica Sinica (2001), 419–443.

40. Gelfand A.E. and Ghosh S.K., Model choice: a minimum posterior predictive loss approach, Biometrika 85(1) (1998), 1–11.

41. Thomas, Bornkamp and Seibold, Subgroup identification in dose-finding trials via model-based recursive partitioning, Statistics in Medicine 37(10) (2018), 1608–1624.

42. Gail M.H. and Pfeiffer R.M., On criteria for evaluating models of absolute risk, Biostatistics 6(2) (2005), 227–239.

43. McIntosh M.W. and Pepe M.S., Combining several screening tests: optimality of the risk score, Biometrics 58(3) (2002), 657–664.

Subgroup identification of targeted therapy effects on biomarker for time to event data
