Abstract
Predicting the risk of death for chronic patients is highly valuable for informed medical decision-making. This paper proposes a general framework for dynamic prediction of the risk of death of a patient given her hospitalization history. Predictions are based on a joint model for the death and hospitalization processes, thereby avoiding the potential bias arising from selection of survivors. The framework is valid for arbitrary models for the hospitalization process—it does not require independence of hospitalization times nor gap times. In particular, we study the prediction of the risk of death in a renewal model for hospitalizations—a common approach to recurrent event modeling. In the renewal model, the distribution of hospitalizations throughout the follow-up period impacts the risk of death. This result differs from the prediction of death when considering the Poisson model for the hospitalization process, previously studied, where only the number of hospitalizations matters. We apply our methodology to a prospective, observational cohort study of 512 patients treated for chronic obstructive pulmonary disease in one of six outpatient respiratory clinics run by the Respiratory Service of Galdakao University Hospital, with a median follow-up of 4.7 years. We find that more concentrated hospitalizations increase the risk of death and that the hazard ratio for death continuously increases as the number of hospitalizations increases during follow-up.
Keywords
Introduction
When studying disease evolution in chronic patients, data on the patient’s hospitalization history is often available. Recurrent hospitalizations may indicate worsening of the disease and are therefore expected to be informative about the risk of death. Hence, several works have studied the relationship between hospitalizations and death.1–4 Developing prediction frameworks for the risk of death that account for the history of hospitalizations of the patient may be crucial for the effective management of patients. This paper provides a dynamic prediction framework under a joint frailty model for death and hospitalizations, considering that hospitalization times may be dependent within a patient. The joint modeling approach overcomes the informative censoring problem in the simultaneous analysis of death and hospitalization recurrences. 5 Moreover, compared to the prediction of risk with baseline patient characteristics, dynamic prediction allows for continuously updated risk assessments that incorporate a patient’s evolving clinical history. It is therefore particularly suitable for predicting the risk of death of chronic patients, who are monitored during their disease’s evolution.
Our development is motivated by the need to better understand the relationship between recurrent hospitalizations and mortality in patients with chronic obstructive pulmonary disease (COPD), a connection that has been well-documented as a critical factor in disease progression and long-term prognosis. For instance, Soler-Cataluña et al. 4 found that severe exacerbations requiring hospitalization are a significant predictor of mortality in COPD patients; Serra-Picamal et al. 6 observed that the number of re-hospitalizations in COPD patients is directly related to increased all-cause mortality, emphasizing the prognostic significance of recurrent hospital admissions. Suissa et al. 3 studied this relationship in a Cox-regression framework, finding that subsequent hospitalizations increased the risk of death and that the time between hospitalizations decreases over the course of follow-up. We contribute to the understanding of hospitalizations and death for COPD patients by estimating a joint model (avoiding the informative censoring bias) and predicting the risk of death given the history of hospitalizations. Specifically, our methodological contribution is guided by the following clinical hypothesis “The risk of death is larger for patients who experience hospitalizations that are concentrated in time.”
In this work, we develop a dynamic prediction framework that is valid for any model of the hospitalization process. In particular, we study and compare the risk of death when the Poisson (calendar timescale) and renewal (gap timescale) models are used to model the hospitalization process. In addition, we show that the distribution of hospitalizations throughout follow-up impacts the risk of death in the renewal model. This result differs from the prediction of death within a Poisson model, where only the number of hospitalizations matters. 7 Therefore, we contribute to the study of the modeling implications of the choice of the timescale. 8 In addition, we find that in the renewal model, monotonicity of the baseline hazard for hospitalizations characterizes whether concentrated or dispersed hospitalizations lead to a higher risk of death. Our results help link clinical knowledge with features of the hospitalization process. We are able to confirm the clinical hypothesis for COPD patients: concentrated hospitalizations lead to a higher risk of death.
Dynamic prediction is an active research area.7,9–12 Mauguen et al. 7 studied dynamic prediction of the risk of a terminal event given a Poisson model for recurrent events—that is, under the assumption that recurrent event times are independent. We relax this hypothesis and generalize their results to allow for arbitrary models for the recurrent event. The general model encompasses, among others, the renewal model for recurrent events, which is frequently used for the hospitalization process.2,3,13 Several works have been done to relax the conditional independence assumption for joint models by introducing a copula to account for the dependence between the terminal and recurrent processes.9,14 Following this strand, Liang et al. 10 propose a Bayesian hierarchical model, where the copula parameter is allowed to be individual and recurrence specific. These works provide a more flexible structure for the dependence between the terminal and recurrent events, while maintaining independence between recurrent event gap times. The work proposed in this paper advances in a parallel path. We do not require independence between recurrent event times nor gap times, allowing for arbitrary models for the recurrent event, although we maintain conditional independence between the terminal and recurrent events.
This paper is organized as follows. Section 2 introduces the motivating case: the study of the risk of death of COPD patients. Section 3 presents the methodological contributions of the paper. In Section 4, we apply our results to predict the risk of death, given the hospitalization history, for COPD patients. Section 5 concludes with a discussion. A more detailed technical discussion of the methodological contributions and the results is presented in the Appendixes.
Motivating study: Risk of death of COPD patients
COPD’s influence on global health and healthcare policies is significant and expanding. It is currently one of the most paradigmatic chronic respiratory conditions, with projections indicating further escalation in the years ahead. 15 COPD accounted for 3% of all deaths in 2021 across OECD countries, 16 and it was the third leading cause of death in 2019 according to the World Health Organization March 2023 report. The burden of disease will continue to increase, especially in women and regions with low to medium gross domestic product or income. 17
Exacerbations, and particularly hospitalizations, represent critical events in the management of COPD due to their substantial negative consequences. 18 Hospitalizations have a significant clinical impact on patients with COPD. They further compromise already impaired pulmonary function and health-related quality of life. 19 Moreover, hospitalizations increase the risk of subsequent readmissions, cardiovascular events, and mortality both during and after the acute episode.4,20,21 The cycle of hospitalization and readmission not only elevates healthcare costs but also intensifies these adverse outcomes for patients.
Several prediction models have been developed for COPD readmissions or mortality risk.22–25 Furthermore, the increasing hospitalization frequency has been shown to increase the risk of mortality. 3 Therefore, given the importance of these two outcomes (hospitalization and mortality) and their intrinsic relationship, joint modeling of both processes is necessary for two reasons. On the one hand, it reduces the potential bias present in Cox regression models. The cause of the bias is the fact that long-lived patients tend to experience more hospitalizations, and this could mask the accentuating effect of hospitalizations on the risk of death. On the other hand, it allows us to evaluate the effect that hospitalization history (frequency and distribution) has on mortality and to make predictions of mortality risk based on the history of hospitalizations during follow-up. In the COPD literature, joint modeling has primarily focused on linking longitudinal biomarkers—such as forced expiratory volume trajectories—with the risk of exacerbations. 26 However, although recurrent exacerbations and hospitalizations are crucial for predicting mortality in COPD patients, the application of joint frailty models to examine the relationship between these events and death remains underexplored. This presents an opportunity to improve our understanding of disease progression and improve individualized risk prediction in COPD.
In this work, we considered a prospective, observational cohort study of 512 patients recruited after being treated for COPD in one of six outpatient respiratory clinics run by the Respiratory Service of Galdakao University Hospital. Patients were consecutively included in the study if they had been diagnosed with COPD for at least 6 months and had been stable for at least 6 weeks. The protocol was approved by the Ethics and Research Committees of the hospital (reference 16/14). All candidate patients were given detailed information about the study, and all those included provided written informed consent. Sociodemographic, smoking habits, and clinical variables were recorded. Pulmonary function tests included forced spirometry and body plethysmography, and measurements of carbon monoxide diffusing capacity in percentage (DLCO). These tests were performed in accordance with the standards of the European Respiratory Society. 27 For theoretical values, we considered those of the European Community for Steel and Coal. 28 All variables were measured at baseline. The median follow-up time was 4.7 years (interquartile range: 2.66 to 5.13 years). Patient hospitalizations were reviewed during the follow-up period. A brief description of the main variables used for this study is presented in Table 1. More detailed information regarding the dataset can be found elsewhere. 29
Descriptive statistics for the whole sample.
Descriptive statistics for the whole sample.
DLCO: carbon monoxide diffusing capacity in percentage;
Joint frailty model for death and hospitalization
We briefly introduce the joint frailty model for death, the terminal event, and hospitalization, the recurrent event. A more detailed description can be found in previous work.1,5,30,31 For each patient
Estimation and prediction are based on conditional independence of the death and hospitalization processes given a frailty variable
The function
We have developed an expression for the probability of a patient dying between
Consider that we have followed up a patient for
Derivation of equation (3.2) is based on the independence of the death and hospitalization processes given frailty (and covariates) and applications of Bayes’ rule. The result relies on the density of the hospitalization history provided in Theorem 2.1 in Cook and Lawless.
32
Mauguen et al.,
7
Equation (3.2) is valid for any model of the hazard for the next hospitalization given their history:
Poisson model for hospitalizations
The hazard for the next hospitalization at time
Renewal model for hospitalizations
The hazard for the next hospitalization given their history at time
In the Poisson model, since
Under the renewal model, the distribution of hospitalization plays a significant role in predicting the risk of death. A natural question to ask then is which pattern of hospitalization times leads to the highest risk of death. Is it when hospitalizations are concentrated or spread out through the follow-up period? For instance, the clinical hypothesis in the study of COPD patients is that concentrated hospitalizations lead to a higher risk of death. In this section, we show that the answer depends on two features of the model: the relationship between the two processes (
We consider a patient with
Two patients with different distributions of hospitalizations will have a different value of the survival function
Let us consider two patients under two different scenarios for the baseline hazards. During the first year of follow-up (
The first column of Figure 1 plots the “timing of the last hospitalization before First column: Timing of the last hospitalization before time 
The second and third columns of Figure 1 plot the hazard for hospitalizations given their history
The fact that the relationship between the distribution of hospitalizations and the cumulative hazard (and hence the survival function) depends on the shape of the baseline hazard can be generalized beyond the example (cf. Proposition 1 in Appendix B):
If the baseline hazard for hospitalizations is increasing, then dispersed hospitalizations lead to larger values of the survival function for hospitalizations at time
Up to this point, we have studied the impact of the distribution of hospitalizations on the value of the survival function for hospitalizations at time
As we have already done, it is better to study the problem through the lenses of the cumulative hazard for hospitalizations. Consider Scenario 1, where Patient A accumulates less hazard than Patient B. When confronted with this fact, how does the model update the expected value of the frailty variable for each patient? Note that Patient B has been “exposed” to a larger cumulative hazard than Patient A, so Patient B was expected to experience more hospitalizations. However, they both experienced the same number of hospitalizations. Thus, Patient B must be less fragile than Patient A. In conclusion, patients with larger values of the cumulative hazard are expected to have smaller frailty values.
The preceding discussion may be rephrased as “when comparing two patients with the same number of hospitalizations, patients with larger values of the survival function for hospitalizations (
When the death and hospitalization processes are positively related (
The two results introduced in this section describe how the distribution of hospitalizations affects the risk of death in the renewal model. Table 2 summarizes the main findings (see also Theorem 1 in Appendix B). For instance, when death and hospitalizations are positively related, and the baseline hazard for hospitalizations is decreasing, the risk of death is higher for patients with concentrated hospitalizations.
The hospitalization pattern that leads to a higher risk of death according to the renewal model is shown based on the distribution of hospitalizations and the relationship between death and hospitalization processes.
We would like to highlight that the results in Table 2 indicate that the renewal model for hospitalizations is flexible enough to provide a wide range of predictions for the risk of death. Depending on the parameters of the model—
We fit a joint frailty model for hospitalization and death of COPD patients. We consider Weibull baseline hazards for death and hospitalization. These are given by
We consider both a Poisson (calendar timescale) model and a renewal (gap timescale) model for hospitalizations. In the Poisson model, time is measured in “days since inclusion in the study.” In the renewal model, time is measured in “days since inclusion in the study” for the first hospitalization and “days since last hospitalization” for the following ones. We note that COPD patients tend to have short hospital stays: median of 4 days (see Table 1). Therefore, we keep the timescale as “days,” as opposed to “days out of hospital.” 2 Prediction results are more interpretable this way.
Table 3 shows the results obtained for the estimated joint models, considering both renewal and Poisson specifications for the hospitalization process. We have included age, sex, DLCO, and forced expiratory volume in 1 second in percentage (
Fit results.
Fit results.
Sample: 507 patients, 91 death events, and 473 hospitalization events. Columns: HR = hazard ratio (computed for covariates); Estimate = computed for scale, shape, and frailty parameters; CI = confidence interval. Covariates: Age, female, DLCO = carbon monoxide diffusing capacity in percentage;
Figure 2 displays predictions of the risk of death given the history of hospitalizations for four different patients. All of them resemble the median patient (65-year-old, male, DLCO of 61.0,

Prediction of the risk of death given the hospitalization history at different follow-up times. Results for the median patient (65-year-old, male, DLCO of 61.0,
The dynamic prediction approach allows us to approximate the HR for death between different patients at different follow-up times. At follow-up time
Hazard ratios at different follow-up times for patients with different numbers of hospitalizations. The reference (denominator) has one hospitalization prior to
DLCO: carbon monoxide diffusing capacity in percentage;
The key result from Section 3.3 is that, in a renewal model for the hospitalization process, the risk of death given hospitalizations depends on the distribution of the latter. The dependence is characterized by the relationship between the processes (parameter
Figure 3 shows the prediction of the risk of death given the hospitalization history for four patients with median covariate values. The patients experienced a different number of hospitalizations (two or three) during follow-up. Moreover, the distribution of the hospitalizations is dispersed (in orange) or concentrated (in green). As can be observed in all the panels, different predicted risks are estimated (under the renewal model) when hospitalizations occur in a dispersed manner, when compared to all of them occurring in a concentrated pattern, with a higher risk of death for those patients with concentrated hospitalizations. In Appendix D, we show how these differences in risk of death between patients with dispersed and concentrated hospitalizations vary across different values of

Prediction of the risk of death given the hospitalization history at different follow-up times. Results for the median patient (65-year-old, male, DLCO of 61.0,
The differences in terms of risk of death between concentrated and dispersed hospitalizations are sizeable. Table 5 shows the HRs at different follow-up times for patients with different distributions of hospitalizations. Note that, even after conditioning for clinical and sociodemographic variables, the distribution of the hospitalizations is an important risk factor for death. For instance, at the fourth year, the HR of three concentrated hospitalizations versus three dispersed hospitalizations (HR = 1.050) is equivalent to the HR associated with being 6 months older, holding the remaining clinical characteristics fixed. In addition, for a given number of hospitalizations, the risk of concentrated versus dispersed hospitalizations increases as the follow-up time increases.
Hazard ratios at different follow-up times for patients with different distributions of hospitalizations (renewal model). Fixed the number of hospitalizations, we compute the hazard ratio of concentrated hospitalizations (numerator) over dispersed hospitalizations (denominator). Results for the median patient (65-year-old, male, DLCO of 61.0,
DLCO: carbon monoxide diffusing capacity in percentage;
We propose to statistically test whether the distribution of hospitalization determines the shape of the risk of death. The distribution of hospitalizations is irrelevant if
We evaluate the prediction accuracy of both the renewal and the Poisson models. To that end, we compute the time-dependent Brier score, which measures the square loss between the “survived up to
Figure 4 shows the results for the renewal and Poisson models. As a benchmark, we have included a Weibull failure time regression model with the number of hospitalizations

Time-dependent Brier score and AUC for three models: renewal, Poisson, and regression model with number of hospitalizations as time-dependent covariate. Predictions are made in intervals of 4 months, from follow-up times
Figure 4 shows that prediction under the joint-modeling approach outperforms the inclusion of the number of hospitalizations as a time-dependent covariate, both in terms of Brier score and AUC. We see that the renewal and the Poisson models perform similarly. For the dataset at hand, the renewal model has slightly better Brier scores (prediction accuracy), while the Poisson model has slightly better AUC (discrimination capacity). Prediction accuracy decreases with the prediction window
We have developed a general dynamic prediction framework for the risk of a terminal event (death) given the recurrent event (hospitalization) history in a joint frailty model. The result is valid for any model for the recurrent event process, allowing for any dependence structure between recurrence times. This contributes to the literature on the dynamic prediction of terminal events given recurrent events. In particular, Mauguen et al. 7 obtained a prediction result solely for a Poisson process for the recurrent event (assuming independence between the recurrent event times). On the other hand, Liang et al. 10 obtained prediction results complementary to this proposal, relaxing the conditional independence between the terminal and recurrent events. Nevertheless, they assumed independence between recurrence gap times (renewal model).
We have studied how the distribution of hospitalizations throughout the follow-up period determines the risk of death when hospitalizations are modeled following a renewal process. In contrast to the Poisson case, where solely the number of hospitalizations during follow-up matters, we have found that the risk of death depends on the gap times between hospitalizations. The dependence between the risk of death and the distribution of hospitalizations is characterized by two features: whether the two processes are positively or negatively related (parameter
Time-dependent external covariates could be included in the hazard for any of the two processes. Note that time-dependent covariates are usually assumed to be constant in between hospitalizations (see Cook and Lawless
32
We applied our methodology to a dataset of patients with COPD. We found that the risk of death is generally higher when the recurrent hospitalization process is modeled using the Poisson specification. We believe that this is a consequence of hospitalization times being independent in the Poisson model. We see that, in both models, the number of hospitalizations greatly impacts the risk of death, in line with previous results.3,4,20 In this study, we advance the understanding of COPD’s evolution through several contributions. First, within the renewal modeling framework, we identify the distribution of hospitalizations as a significant risk factor. Second, by employing a joint modeling approach, we effectively mitigate attenuation bias—arising from the fact that patients who survive longer are inherently more likely to accumulate hospital admissions—thus revealing HRs that increase in the number of hospitalizations over the follow-up. Finally, the use of a dynamic prediction framework enables real-time estimation of mortality risk, continuously updated with each new recurrent event throughout the follow-up period.
In our application to COPD data, we considered a Weibull model for the baseline hazards. The hazard of this model is monotone, with a single parameter determining whether it is increasing or decreasing. This eases the characterization of the dependence between the risk of death and the distribution of hospitalizations, as it is reduced to two parameters. Appendix C provides fits for various combinations of the Weibull, Gompertz, and log-logistic hazards. In all the renewal models, the baseline hazard for hospitalization appears to be monotone decreasing.
If the practitioner is uncertain about the baseline hazard being monotonic, we recommend fitting a semiparametric joint model (e.g. with spline approximations of the baseline hazards). 30 If the fit for the baseline hazard for hospitalizations is non-monotonic, a general conclusion about whether a concentrated or dispersed history of hospitalizations leads to a higher risk cannot be drawn. Nevertheless, our modeling framework remains applicable and valuable: it allows for individualized risk predictions based on each patient’s characteristics and event history. These personalized predictions can be used to compare the death risk of two patients with differing hospitalization trajectories.
In the COPD study, hospitalizations were modeled as instantaneous events due to their short duration in our dataset (median length of stay: 4 days; interquartile range: 3–7 days). If hospital stays were longer or more variable, an alternative approach could be considered within the renewal framework. It consists of measuring time from the discharge of the previous hospitalization to the admission of the next. The primary distinction between this approach and the one used in the paper lies in the interpretation of the inter-hospitalization period: whether it includes or excludes time spent hospitalized.
There are different avenues for future work. Our results correspond to Setting 1 in Mauguen et al. 7 —we believe that our results could be readily extended to cover Setting 2. Moreover, to increase prediction accuracy, it may be of interest to consider the evolution of different biomarkers alongside the history of hospitalizations. The present paper has solely considered a joint model for the death and hospitalization process. Nevertheless, since biomarker data is often available for chronic patients, it is appealing to include it in the model. This requires proposing a joint model for a terminal, a recurrent, and a longitudinal outcome, as in Król et al. 39 A promising line of research is to simultaneously relax conditional independence between the terminal and recurrent events, as in Liang et al., 10 and allow for dependence of the recurrence times.
Additionally, our study of the renewal model for hospitalizations has disclosed some of its limitations. In the renewal model, the distribution of hospitalizations affects the risk of death through the gap times between hospitalizations. Say, for instance, that we have followed up two patients for one year. The first was hospitalized at months 1 and 2. The second was hospitalized at months 10 and 11. These two patients have the same gap times between hospitalizations: a large gap of 10 months and two short gaps of 1 month. The model thus predicts the same risk of death for both patients, which may not be completely plausible from a clinical perspective.
In view of this result, it may be of interest to study more complex models for the hospitalization process. Indeed, one may think that the hospitalization process should have characteristics from both the Poisson and a renewal model.8,40 For instance, the hazard for the next hospitalization may be modeled as:
Finally, it is worth mentioning that the software developed to implement the methodological proposal presented in this paper is available at the corresponding author’s GitHub page.
Footnotes
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was financially supported in part by grants from the Departamento de Educación, Política Lingüística y Cultura del Gobierno Vasco IT1456-22, IT1866-26 and BERC 2022-2025 program, by the Ministry of Science and Innovation through BCAM Severo Ochoa accreditation CEX2021-001142-S/MICIN/AEI/10.13039/501100011033, by MICIU/AEI/10.13039/2501100011033 and FEDER, UE, PID2024-156800OB-I00, by the BMTF “Mathematical Modeling Applied to Health” Project and the Network for Research on Chronicity, Primary Care, and Health Promotion (RICAPPS RD21/0016/0017 and RD21/0016/001) and PI13/02352 of the Instituto de Salud Carlos III (PI13/02352). The COPD study received an unrestricted grant from Laboratorios Menarin.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Appendix A Risk of death given the hospitalization history
The probability of interest is
First, using the conditional independence of the two processes given frailty and covariates:
To compute the second part of the integral, we use Bayes’ rule to write the conditional density of the frailty variable as
We can now plug in the above results into equation (A.2). A key aspect of the model is that, since
We conclude by putting all the results together to obtain equation (3.2):
