Abstract
Recent studies of (cost-) effectiveness in cardiothoracic transplantation have required estimation of mean survival over the lifetime of the recipients. In order to calculate mean survival, the complete survivor curve is required but is often not fully observed, so that survival extrapolation is necessary. After transplantation, the hazard function is bathtub-shaped, reflecting latent competing risks which operate additively in overlapping time periods. The poly-Weibull distribution is a flexible parametric model that may be used to extrapolate survival and has a natural competing risks interpretation. In addition, treatment effects and subgroups can be modelled separately for each component of risk. We describe the model and develop inference procedures using freely available software. The methods are applied to two problems from cardiothoracic transplantation.
Keywords
1 Introduction
In recent studies of (cost-)effectiveness in transplantation, the quantity of interest has been mean survival over the lifetime of the recipients. Two examples are: (1) estimation of the difference in mean survival for recipients of a single lung transplant (SLT) compared with a double lung transplant (DLT), and (2) potential mean life years gained (LYG) using a novel device for preservation of donor hearts during transportation to the recipients; the device is expected to incur high costs during the transplant procedure but the full benefit will only be accrued over several years after heart transplantation. Mean survival is estimated as the area under the survivor curve but, in common with many cases, the complete survivor curve is not observed at the time of the study. Thus, we can either estimate the truncated mean survival defined by the period over which the survivor curve is observed, or we can use some form of survival extrapolation. The former will give an accurate estimate of a quantity that is of limited interest, while the second gives a possibly less accurate estimate of mean survival but answers the question of interest. Therefore calculation of the full benefit of new therapies and comparisons of the life expectancy of different subgroups depends on survival extrapolation.
Extrapolation of survivor curves in cost-effectiveness studies is routine but is often completed in ad hoc ways, including extrapolation of constant hazard rates, which at the very least ignores the increasing risk due to ageing. In transplantation, hazards have been observed to be bathtub- or U-shaped. Empirical hazards are high in the period immediately after transplantation due to the procedure itself and associated immunological events. This is followed by a steep decrease to a period of relative stability reflected by constant hazards, and thereafter hazards increase due to the ageing of the participants and the chronic complications of transplantation. 1 In this context, and others, the bathtub-shaped hazard functions reflect latent competing risks that operate simultaneously in an additive manner. They are common to many surgical applications, for example in the study carried out by Blackstone et al., 2 the hazard following heart valve replacement operations was decomposed into additive overlapping components reflecting similar latent competing risks.
One further complication in transplantation is that the cause of death is either unknown or its determination is often not straightforward. For example, an episode of acute rejection of the transplanted organ will result in increased immunosuppression therapy, which in turn leaves the recipient at risk of infections and further immunological compromise. If the patient then dies, it is not clear whether the cause of death is acute rejection, infection or primary donor organ failure, and all three may contribute. Therefore cause of death is often considered unknown so that we cannot model the competing risks directly, we are required to infer the contributions of the competing risks from the shape of the hazard function.
The above features suggest that a parametric model like a mixture may be appropriate for this task. Sufficient conditions for the shape of the hazard rate are given by Glaser. 3 In particular, he shows that a mixture of two Weibull distributions cannot result in bathtub-shaped hazard rate, correcting a contention due to Kao. 4 In contrast, poly-Weibull models5,6 in which the overall hazard is the sum of several independent components appear natural in this context. Poly-Weibull models have been used in the competing risks framework, typically with different objectives to the ones that concern us here. Specifically, we are not interested in marginal (or ‘net’) hazards, cumulative incidence functions or examination of the effect of removing a risk component. Merely, we shall use such models as flexible parametric assumptions that may be reasonable for extrapolating survival while their parameters retain a natural interpretation. Gelber et al. 7 used a hybrid approach in which a parametric model is fitted to the tail of the survival curve while the ‘early’ part is treated non-parametrically. The proposed model also has close connections to other ‘rich’ parametric models such as fractional polynomial regression 8 (FPR) and generalised additive models 9 (GAM). However, each component in our model retains a competing risks interpretation in terms of the survival outcome. This is not typically the case with the hybrid, FPR and GAM models.
In our first example, comparison of single and double lung transplantation as treatment options over the lifetime of the recipients is of primary interest, and can be combined with data on quality of life in a more general analysis. In the second example, the assessment of a novel organ preservation system, the main focus is estimation of mean survival within a cost-effectiveness analysis. In both these contexts, it is attractive to adopt a Bayesian approach since diverse sources of evidence often need to be synethesised in a coherent framework. 10 In our applications, the clinical context dictates the nature of the components of the hazard function and this information can be incorporated via prior distributions in a Bayesian approach. In addition, simulation-based techniques seem especially advantageous as the correlation structures that may be present are propagated through to the evaluation of (non-linear) functionals of the basic model parameters, in this case, the mean survival. The development of the WinBUGS software11,12 has made the routine fitting of fairly complicated models possible and as a result of these transplantation studies, the poly-Weibull model has been programmed into WinBUGS and is freely available. The remainder of this article is organised as follows. The modelling framework is described in Section 2 and in Section 3 we describe the details of the inference procedure. Implementation of the methods to answer substantive questions in cardiothoracic transplantation is presented in Section 4, while this article is concluded with some discussion in Section 5. The details of the Markov chain Monte Carlo (MCMC) algorithm can be found in Appendix.
2 Poly-Weibull model
Suppose that an individual is subject to m independent sources of risk that operate additively. Assume further that the distribution of each of the components may be sufficiently described by a Weibull form with density f(t ∣ ν, λ) = νλtν−1 exp(−λtν). Then, the observed event time is said to follow a poly-Weibull distribution. The hazard function arises as the sum of the m independent Weibull-type hazards:
Covariate effects can be modelled via the rate parameters λ1, λ2,…, λ
m
in the usual way
2.1 Mean survival and discounted mean survival
In many health economic applications, mean survival is of primary interest,
3 Bayesian Inference
3.1 Likelihood
We shall be concerned with inference based on observed survival data {y = (t
i
, δ
i
), i = 1,…, n}, where for individual i, t
i
denotes the time to the event or to censoring and δ
i
indicates whether i was censored (δ
i
= 0) or not (δ
i
= 1). When the observations of distinct individuals are conditionally independent, the likelihood function can be written as
3.2 Prior specification and identifiability
The choice of priors should be dictated by subject-matter knowledge. It has been empirically observed that a bathtub shape is the most common shape of the hazard rate following transplantation. This suggests that m = 2 (often referred to as the bi-Weibull model) 13 and m = 3 are the most likely candidates. In all the examples, considered values for m between 1 and 4 were investigated, with the case m = 1 used to confirm the validity of our results when compared to the standard Weibull model.
In the case of no covariate effects, we specify vague, log-normal priors for the rate parameters: log λ
j
∼ Normal(0, 1002), j = 1,…, m. If covariate effects are included, however, we specify Normal(0, 1002) priors for the coefficients of those effects. For the shape parameters, we note that the immediately post-operative hazard must be decreasing, and so ν1 ∈ (0, 1) (assuming the first poly-Weibull component is designated as corresponding to immediately post-operative risk). An obvious prior choice is then ν1 ∼ Uniform(0, 1). Subsequent components of risk may be either increasing or decreasing, and so for these, we specify log-normal priors with mode 1:
To ensure identifiability of the individual hazard components we impose an ordering constraint on the shape parameters, such that ν1 < ν2 < ··· < ν
m
. This is necessary because, with the same parameters, the individual components have identical likelihood contributions. Hence, if unordered, the parameters might switch roles during the MCMC simulation giving, for each parameter, posterior distributions that are a mixture over the supported values. The constraint is applied by multiplying the joint prior by an indicator function
In cases where covariate effects are to be included for the shape parameters, then the ordering constraint is somewhat more difficult to apply, except in the absence of covariate effects for the rates, where we would be able to impose λ1 < λ2 < ⋯ < λ
m
instead, say. When covariate effects are included for both sets of parameters, we impose the constraint that ν1 < ν2 < ⋯ < ν
m
for all possible combinations of observed covariate values. The indicator γ then becomes
3.3 Implementation
The details of the MCMC algorithm we used can be found in Appendix, including details of the implementation of the poly-Weibull distribution in the WinBUGS Development Interface. 15 Evaluating the summaries of interest to health economists, such as the mean survival and the LYG for different covariate levels is straightforward in our simulation-based approach. Specifically, sampling from non-linear functionals of the basic model parameters requires no additional coding. For each model fitted, we calculated the deviance information criterion (DIC) described in Spiegelhalter et al. 16 However, due to skewness in some of the posterior distributions, which can lead to poor estimates of the ‘effective number of parameters’, we prefer to use the posterior mean deviance, penalised by the actual number of parameters (since the model is non-hierarchical), as our model selection criterion, alongside consideration of coefficients' credible intervals. Plummer 17 discusses the performance of DIC within the framework of penalised loss functions and cross-validation. On related work, Draper and Krnjajic 18 discuss the equivalence of minimising the DIC and maximising the log-scoring rule using cross-validation. This connection was first established by Stone 19 for the Akaike information criterion.
4 Application to transplantation data
In Section 1, we briefly described the two studies in cardiothoracic transplantation that motivated this study. The poly-Weibull models were developed to estimate mean survival after transplantation where survival extrapolation was required. The first problem relates to different lung transplant procedures and the second is concerned with the effect on life expectancy of cold ischaemic time (IT) in heart transplantation.
4.1 Assessing different transplant types
Single lung transplantation (SLT), as opposed to double lung transplantation (DLT), may be used for some patient groups with severe lung disease and it allows two patients to be treated by a single organ donor. In addition, a donor may have only one suitable lung. However, SLT may be considered as a partial treatment and there is some evidence that post-transplant survival may be worse for SLT recipients.
20
However, whether the effect is mainly associated with poorer operative survival, background survival or chronic complications remains unclear. In addition, the impact on life expectancy for the transplant types is important when deciding whether to accept a SLT when offered. Data from Papworth, one of the UK's specialist cardiothoracic hospitals with a programme of heart and lung transplantation, were used. The dataset consists of 338 patients presenting for lung transplantation mainly due to either chronic obstructive pulmonary disease or pulmonary fibrosis, conditions for which single lung transplantation is a possibility. There were 173 single and 165 double lung transplants out of whom 144 and 79, respectively, died before the end of the study, while the remaining 115 patients were alive at the time of data analysis. Cumulative hazards are plotted in Figure 1, while product-limit survival estimates in Figure 2. It is evident from both figures that there is a high death rate which decreases before slowly increasing in the long term. The single lung patients clearly have worse survival. The cumulative hazard plot indicates that this is most evident in the longer term.
Cumulative hazard estimates for lung transplant recipients. Product-limit survival estimates for lung transplant recipients.

Posterior mean deviance for different models
Posterior mean and 95% credible intervals for the different shapes (ν) and log-rate (β) parameters
Note: SL, single lung and DL, double lung. Subscript 1 (2) refers to the early (late) risk component.
The model with m = 3 was not essentially identifiable as the parameter estimates for the second hazard component are very similar to the estimates of the third component. Thus, considering models with m > 2 appears unnecessary in this example.
4.2 Cold Ischaemic Time effect
Our second study is concerned with the health economic evaluation of a novel device for storing donated hearts for transplantation. The length of time between excision of a heart from the donor and successful implant into the recipient is called IT. Longer IT has a detrimental effect on survival, particularly in the short term. 21 In the UK, average IT has been steadily increasing in recent years. 22 Traditionally, donated organs are packed in ice during transfer from the donor hospital to the transplant centre but the new devices store the hearts in warm, oxygenated donor blood, supplemented with nutrients. It is claimed by the manufacturers that the new machine could effectively reduce the IT to 60 min. However, the machine is likely to be expensive and the potential health gains, evaluated over the lifetime of the patients, will dictate whether the intervention can be termed cost-effective or not. Device regulation is less well developed when compared to drug regulation and an analysis of this kind could help to decide whether a randomised controlled trial of the new device is likely to be a good use of resources.
A preliminary analysis of heart transplant activity from the UK was conducted by Goldsmith et al. 23 This analysis included 4335 patients of whom 2279 died before the end of the study, with median survival of 10.4 years. The mean IT was 191 min and standard deviation was 59 min. The authors used a Cox 24 proportional hazard regression model with the hazard ratio for IT piecewise constant, with a single fixed change-point after the initial operative hazards had passed. It showed that short ITs can confer a significant advantage in the early (post-transplant) survival. However, the effect appears smaller in the latter segment of the survival curve and in particular, the IT effect beyond the initial period is not ‘statistically significant’. The length of the initial period was hard to estimate using a change-point model but it appeared that the estimate of the overall LYG (defined as the difference in mean survival) was not sensitive to varying the period between 6 and 12 months. One advantage of the poly-Weibull distribution we shall consider is that a choice of the length of the initial period is unnecessary since the various risk components overlap and there is no assumption of an instantaneous jump in relative hazards. A related issue regarding health economic evaluation, is the sensitivity of the estimated LYG to the piecewise-constant relative hazards assumption. Specifically, assuming different IT effects in the early and late phases can cause large changes in the estimated LYG with severe overestimation bias when a common effect is assumed by the model and extrapolated over the lifetime. Choices of this kind are naturally dealt within the parametric approach we adopt in this article. Finally, and perhaps more importantly, any non- or semi-parametric method necessitates some reliable technique for extrapolating the observed survival. Goldsmith et al. 23 used the estimated hazard rate in years 13–20 to model crudely the shape of the hazard rate beyond 20 years, for which no data were available. In the model we propose here, this problem is inherently accommodated by the parametric survival distribution.
Posterior mean and 95% credible intervals for the different IT effect parameters
The estimates we derive are slightly different to the results obtained by Goldsmith et al.
23
although the findings are not materially different. In particular, the IT effect is strong in the early part of the survival curve but negative, although not ‘statistically significant’, in the latter part. A negative effect of IT is considered implausible by the transplant clinicians, although this was not known when the prior distributions were determined, and has a large effect on the estimates of potential LYG. Mean survival and the LYG for different levels of IT are presented in Table 4. For the model presented in Table 3, i.e. with covariate effects on both rates and no constraints on the second component, a posterior median of −0.901 life years are gained when the IT is reduced from its average of 191 to 60 min. Two alternative assumptions regarding the second component were considered: (1) where the coefficient is constrained to be non-negative (results presented in Table 3); and (2) where the coefficient is fixed at zero. These gave point estimates for LYG of 1.66 and 1.09, respectively. Of course, this assumes that the IT of 60 min can be achieved in practice and that changing IT will result in the predicted change in survival rates, irrespective of other donor and recipient factors. However, it does provide estimates for the potential gain in survival. These results are to be contrasted to the costs of such systems. For example, if health care providers are willing to pay £30 000 per life year gained, the preservation system (including base system, consumables and staffing) must cost no more than £50 000 (1.66 × £30 000) per patient if the constrained analysis is appropriate, or £33 000 (1.09 × £30 000) if the effect is restricted to the early post-operative component. Thus, such machines may not be cost-effective for relatively small IT, but they may prove very cost-effective (as well as clinically effective) in certain circumstances. Long ITs may be less common in Europe (see Figure 3 for the distribution of the observed ITs in the UK) but do arise frequently in countries like Australia and Canada. Further, using such systems may still prove cost-effective in smaller countries when long ITs are anticipated. In summary, our findings suggest that reducing IT could improve average post-transplant survival over the lifetime of the patients by between 1.1 and 1.7 years depending on model assumptions, but the cost-effectiveness of the donor organ maintenance systems will depend on the exact costs and the clinical context in which it is used.
The distribution of the observed ITs. Posterior summaries for the mean survival, the mean survival if the IT is reduced to 60 min and the corresponding LYG Note: CI, confidence interval.
5 Discussion
We have presented methods for survival extrapolation using flexible parametric models that have been previously used within the competing risks framework. The poly-Weibull function has been incorporated into the freely available software, namely WinBUGS, making its implementation straightforward since the details are not more complicated than any other distribution within the package, excepting the requirement for identifiability constraints.
This study was motivated by two problems from the field of cardiothoracic transplantation and these applications demonstrate the simplicity of the approach. The IT example is typical of cost-effectiveness analyses in which there is a high initial cost, but for which the full benefit is only accrued over the lifetime of the patients. Thus, it is important to extrapolate survival in order to estimate the potential benefit and therefore the value of further assessment of the technology. In addition to the poly-Weibull providing a flexible parametric distribution for this purpose, there is no need to make the arbitrary choices surrounding time-varying hazards that were required in a suitable semi-parametric approach. Using this method, we were able to provide some guidance on acceptable costs for the heart preservation system.
There has been recent interest 20 in the utility of SLT transplants and whether the ability to treat two individuals from the transplant list outperforms the potential LYG from DLT, at least in the cases where both these choices are an option. The comparison between SLT and DLT provides an indication of the difference in life expectancy, which is considerable and, combined with the poorer quality of life associated with SLT, suggests that it should be used selectively. However, credible intervals are wide and this analysis should be revisited in the future. The flexible parametric models of this article could be a significant part of such a study.
In our examples, identifiability issues have been tackled via the use of ordering constraints. Occasionally, we may wish to include covariate effects for both the rate and shape parameters simultaneously. In such cases, the proposed constraint is increasingly difficult to satisfy, during the MCMC simulation, as the number of covariates increases. It may be preferable, therefore, to apply some form of post-analysis relabelling algorithm instead, as discussed by Jasra et al., 25 to impose the required constraint. Jasra et al. 25 also discuss more advanced techniques that may be of use in situations where the hazard components are not well separated. Another option may be to incorporate prior information on the model parameters if it is available. Otherwise the model may need to be simplified by combining different sources of hazard. It was not necessary to use strong priors in our studies, and results were robust to a range of prior distributions.
There are alternative approaches for flexible hazard modelling. Various three-parameter distributions have been proposed in the literature.26,27 Models of this kind may be appropriate in some cases and they can result in U-shaped hazards. However, some identifiability issues can arise when estimating all three parameters and the choice of the optimal parametrisation is often not straightforward. Also, the interpretation of the model parameters is not always natural in terms of clinical quantities. Finally, incorporating covariates can be complicated in these approaches while both categorical and continuous covariates are simply incorporated within the proposed methods of this article, and retain a natural interpretation.
One potential benefit of using parametric survival curves is that we are able to extrapolate beyond the data and calculate mean survival, a commonly used measure of effect for health economic analyses. This approach allows us to divide hazards into early and late phases, so that later survival extrapolation is less influenced by the early pattern. However, the later phase inevitably contains fewer observations and therefore less information, with poorer mixing of related parameters and typically wide credible intervals, as seen in the IT example. Instabilities may arise in examples with fewer patients with long-term follow-up. Stronger priors may then be required for convergence. Finally, in our experience, posterior distributions can be highly skewed so that DIC estimates can be unreliable. Therefore, unless there is sufficient information extrapolations that rely heavily on the later phase parameter estimates should not be made uncritically.
Footnotes
Acknowledgements
The authors are grateful to UK Transplant Cardiothoracic Advisory Group and Papworth hospital transplant unit for allowing them to present data from their registries and to several members of the MRC Biostatistics Unit for their helpful discussions. This study was supported by the Medical Research Council.
