Abstract
The high morbidity and poor survival rates associated with chronic heart failure still represent a big challenge, despite improvements in treatments and the development of new therapeutic opportunities. The prediction of outcome in heart failure is gradually moving towards a multiparametric approach in order to obtain more accurate models and to tailor the prognostic evaluation to the individual characteristics of a single subject. The Metabolic Exercise test data combined with Cardiac and Kidney Indexes (MECKI) score was developed 10 years ago from 2715 patients and subsequently validated in a different population. The score allows an accurate evaluation of the risk of heart failure patients using only six variables that include the evaluation of the exercise capacity (peak oxygen uptake and ventilation/CO2 production slope), blood samples (haemoglobin, Na+, Modification of Diet in Renal Disease) and echocardiography (left ventricular ejection fraction). Over the following years, the MECKI score was tested taking into account therapies and specific markers of heart failure, and it proved to be a simple, useful tool for risk stratification and for therapeutic strategies in heart failure patients. The close connection between the centres involved and the continuous updating of the data allow the participating sites to propose substudies on specific subpopulations based on a common dataset and to put together and develop new ideas and perspectives.
Introduction
Over the past 20 years, despite improvements in treatments and the development of new therapeutic opportunities that have reduced mortality, the prevalence of heart failure has increased. 1 This is probably due to population ageing and to the prolonged survival of patients obtained with new treatments, along with the increasing prevalence of cardiovascular risk factors, such as hypertension, diabetes and obesity. 2
The high morbidity and poor survival rates associated with chronic heart failure still represent a major challenge for the scientific community, and different approaches have been attempted to improve the treatment of these patients. Besides therapeutic strategies, identifying patients at higher risk has become crucial in order to define those on whom to focus the greatest efforts and, from a healthcare point of view, to best direct economic resources.
Due to the ageing of the population and the heterogeneity of comorbidities, the prediction of outcome in heart failure is gradually moving towards a multiparametric approach. This makes it possible to stratify patients by taking into account different parameters simultaneously and, at the same time, to tailor the prognostic (or therapeutic) evaluation to the individual characteristics of a single subject.
This evidence has boosted the identification and study of the parameters that, combined, allow the prognosis of a patient to be calculated as accurately as possible, and therefore high-risk patients to be identified. From these studies, different scores have emerged, which combine different variables through an algorithm and return the probability of death of the individual patient.3–8
In this context, 10 years ago we conceived the idea of developing a new risk score based on the exercise capacity of patients, aiming to isolate only a few variables able to identify high-risk patients with an easy, reproducible approach. 9 Indeed, the most used scores available at that time had some limitations, such as the large number of variables required,4,7 or the total7 or partial4,8 lack of the main exercise parameters, which are crucial for the prognostic evaluation of heart failure patients.
The choice to limit the evaluation to subjects able to perform exercise was made for different reasons. First of all, in parallel with variables that can be collected at rest, a complete evaluation of patients should also be made during activity, in order to mirror their daily life. This is, in fact, the only way to represent the real health status of a subject, because in real life they constantly need to perform at least simple exercises and not only to stay at rest. Second, the cardiopulmonary exercise test (CPET) is considered the gold standard for the functional evaluation of heart failure patients, since peak oxygen uptake (VO2),10,11 ventilatory efficiency (VE/VCO2 relationship)10,12–14 and their combination15,16 are recognized as independent predictors of heart failure prognosis and are routinely used to guide heart transplant listing.11,17,18 These CPET-derived parameters need to be integrated into clinical practice, choosing from among demographic data, medical history and laboratory samples those most closely related to the prognosis of the patients.
The MECKI score project
The Metabolic Exercise test data combined with Cardiac and Kidney Indexes (MECKI) score project was initially conducted by 13 Italian centres with proven experience in heart failure and CPET. The database was conceived to collect a large amount of data generally available during a standard hospitalization for heart failure. Demographic data, echocardiography, electrocardiography (ECG), complete CPET variables, main procedures, previous cardiac resynchronization therapy (CRT)/implantable cardioverter-defibrillator (ICD) implant, hospitalization history, therapy at enrolment, heart failure aetiology and main laboratory results were retrospectively collected. Follow-up information was also recorded to document vital status and outcome.
Inclusion criteria were: previous or present heart failure symptoms (New York Heart Association (NYHA) functional class I–III, stage C of American College of Cardiology/American Heart Association classification) and former documentation of left ventricular systolic dysfunction (left ventricular ejection fraction <40%), stable clinical conditions with unchanged medications for at least three months, ability to perform a CPET, no major cardiovascular intervention scheduled. Notably, patients with a history of left ventricular systolic dysfunction but with improved left ventricular ejection fraction at the time of enrolment were also included. Furthermore, only subjects who performed what they considered a maximal effort, regardless of the respiratory quotient reached, were included in the study population. Exclusion criteria were: history of pulmonary embolism, moderate-to-severe aortic and mitral stenosis, pericardial disease, severe obstructive lung disease, exercise-induced angina and significant ECG alterations, 19 or presence of any clinical co-morbidity interfering with exercise performance.
Details about CPET procedures have already been reported. 9
Patient follow-up was carried out according to the local heart failure programme in a theoretically endless fashion. Follow-up ended with the last clinical evaluation in the centre where the patient had been enrolled, or with the patient’s death or urgent cardiac transplantation. The study endpoint was the composite of cardiovascular death or urgent heart transplant.
We also put much effort into data management and cleaning procedures to avoid errors in the database. Centro Cardiologico Monzino was the coordinator centre, responsible for data collection, while individual investigators were responsible for their own records. Moreover, two ‘external’ experts, not involved in patient recruitment, reviewed all the patients’ data, supported by one data manager for checking data quality and consistency.
After this first phase of data collection and quality check, the Biostatistics Unit of Centro Cardiologico Monzino was asked to develop a score to quantify patients’ risk of the designated outcome (death or need for urgent heart transplant). The basic idea was to develop a tool similar to those most commonly used in cardiology for risk stratification, which would provide an accurate quantification of the probability of developing a major cardiovascular event within two years. The score had to be based on a set of variables collected at baseline, including all the parameters potentially predictive of the endpoint occurrence.
The strategy of development was based on three points:
1. To start from a large set of variables measured at baseline;
2. To select a small subset of strongly predictive variables (according to a ‘parsimony rule’);
3. To perform an internal cross-validation of the variables included in order to guarantee the robustness and the reproducibility of the results.
Candidate variables
Among the collected parameters, the candidate variables chosen to be included in the score are listed in Table 1; they consisted of demographic, biometric, laboratory, echocardiographic and CPET data. All variables were screened, regardless of their univariable association with the endpoint (Table 1). Moreover, in order to account for the potential heterogeneity between clinical sites, the analysis was also stratified by recruiting centre.
Table 1. Characteristics of the population used to build the MECKI score.
aBike ergometer.
Reproduced with permission from Agostoni et al. 9
AT: anaerobic threshold; BMI: body mass index; CI: confidence interval; Crea: creatinine; CRT: cardiac resynchronization therapy; Hb: haemoglobin; HF: heart failure; HR: heart rate; ICD: implantable cardioverter-defibrillator; K+: potassium; LVeDV: left ventricular end-diastolic volume; LVEF: left ventricular ejection fraction; LVeSV: left ventricular end-systolic volume; MDRD: Modification of Diet in Renal Disease; Na+: sodium; NYHA: New York Heart Association; PM: pacemaker; pred.: predicted; Prob. chi sq: chi-square probability; RER: respiratory exchange ratio; RR: respiratory rate; TV: tidal volume; VCO2: carbon dioxide production; VE: ventilation; VO2: oxygen uptake.
Variable selection
To identify the independent predictors of the study outcome, we employed a Cox proportional hazards regression model with stepwise selection of variables. However, it is well known that automated variable selection procedures, such as stepwise selection, can introduce a disproportionate number of false positives, serious problems of selection bias and an overoptimistic estimation of the predictive value of the model. 20 Therefore, in order to minimize the false positives and to overcome the problem that the model was built and tested on the same sample, we employed a cross-validation procedure. The sample was randomly split in half, and a Cox model with a stepwise selection procedure was applied to the complete variable set in the first half of the sample (training set); the variables selected in the training set were then tested on the second half (testing set) using a multivariable Cox model. After 200 iterations, we computed the number of times each variable was selected in the first step and the number of times it was confirmed (deemed significant) in the second step. The covariates that were selected and confirmed at least 70% of the time were considered independent predictors of the outcome.
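The repeated split-sample procedure described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the stepwise Cox selection and the multivariable Cox confirmation are represented by two hypothetical callables, `select` and `confirm`, which a reader would have to supply with a survival-analysis library of their choice. The sketch also assumes that the 70% rule refers to the share of iterations in which a variable was both selected and confirmed.

```python
import random
from typing import Callable, Dict, List, Sequence

# Hypothetical signatures: both callables receive a list of patient records
# and a list of variable names, and return the variable names they retain.
Selector = Callable[[Sequence[dict], List[str]], List[str]]


def stable_predictors(
    rows: Sequence[dict],
    candidates: List[str],
    select: Selector,    # stands in for stepwise Cox selection (training half)
    confirm: Selector,   # stands in for multivariable Cox testing (testing half)
    iterations: int = 200,
    threshold: float = 0.70,
    seed: int = 0,
) -> List[str]:
    """Return the candidates selected and confirmed in >= threshold of the splits."""
    rng = random.Random(seed)
    confirmed: Dict[str, int] = {name: 0 for name in candidates}
    for _ in range(iterations):
        shuffled = list(rows)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        training, testing = shuffled[:half], shuffled[half:]
        chosen = select(training, list(candidates))
        # Only variables chosen on the training half are tested on the other half.
        for name in confirm(testing, chosen):
            confirmed[name] += 1
    return [n for n in candidates if confirmed[n] / iterations >= threshold]
```

Keeping the two modelling steps behind callables makes the bookkeeping (random halving, counting, thresholding) easy to check independently of the regression software used.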
Six variables – peak VO2 (% of predicted value), VE/VCO2 slope, haemoglobin (g/dL), Na+ (mmol/L), left ventricular ejection fraction (%) and Modification of Diet in Renal Disease (mL/min) – were considered independent predictors of the study outcome after the Cox analysis and cross-validation procedure.
Risk score
In order to develop a risk score able to accurately quantify the probability of an event (mortality or urgent transplant) within two years, we proceeded as follows: all patients with a censoring time shorter than two years were excluded from the analysis; all patients with events occurring after two years were considered as censored. Then we used a logistic regression model including all the previously selected and validated independent predictors of outcomes.
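The construction of the two-year binary outcome described above can be sketched as follows. This is an illustrative reading of the text, with assumptions: follow-up is expressed in days, the horizon is taken as 730 days, and patients whose event occurred after the horizon are coded as event-free at two years. The function name and the coding are hypothetical.

```python
from typing import Optional

HORIZON_DAYS = 2 * 365  # two-year horizon; the exact day count is an assumption


def two_year_outcome(follow_up_days: float, had_event: bool) -> Optional[int]:
    """Return 1 (event within horizon), 0 (event-free at horizon) or None (excluded)."""
    if had_event and follow_up_days <= HORIZON_DAYS:
        return 1
    if follow_up_days < HORIZON_DAYS:
        # Censored before the horizon without an event: status at two years
        # is unknown, so the patient is left out of the logistic model.
        return None
    # Followed beyond the horizon: any later event is not counted.
    return 0
```

For example, a patient who died at day 400 is coded 1, a patient lost to follow-up at day 400 is excluded, and a patient who died at day 900 is coded 0.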
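The predicted probability of an event was computed, for each subject, by incorporating into a logistic formula the individual values of the six predictors, weighted by the estimated logistic coefficients:
$$P(\text{event}) = \frac{1}{1 + \exp\left[-\left(\beta_0 + \sum_{i=1}^{6} \beta_i X_i\right)\right]}$$
where βi are the estimated coefficients for the six variables, Xi are the actual values of the six predictors and β0 is the intercept of the logistic model.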
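The logistic calculation described here can be sketched in Python. The sketch assumes the standard logistic form with an intercept. The function name, the variable keys and every numeric value below are hypothetical placeholders chosen for illustration; they are not the published MECKI coefficients and must not be used for clinical estimation.

```python
import math
from typing import Mapping


def predicted_risk(
    intercept: float,
    coefficients: Mapping[str, float],
    values: Mapping[str, float],
) -> float:
    """Logistic probability from an intercept and per-variable coefficients."""
    linear = intercept + sum(coefficients[k] * values[k] for k in coefficients)
    return 1.0 / (1.0 + math.exp(-linear))


# PLACEHOLDER coefficients and patient values, for illustration only.
placeholder_betas = {
    "peak_vo2_pct": -0.02,
    "ve_vco2_slope": 0.03,
    "haemoglobin": -0.10,
    "sodium": -0.05,
    "lvef": -0.03,
    "mdrd": -0.01,
}
example_patient = {
    "peak_vo2_pct": 55.0,
    "ve_vco2_slope": 32.0,
    "haemoglobin": 13.5,
    "sodium": 139.0,
    "lvef": 30.0,
    "mdrd": 70.0,
}
risk = predicted_risk(8.0, placeholder_betas, example_patient)
```

Whatever the coefficients, the result always lies strictly between 0 and 1, which is what allows it to be read as a probability.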
In analogy with the risk score for events at two years, we also computed risk scores devised to predict events occurring within one, three and four years.
The calibration analysis, performed by dividing the sample into deciles of risk, showed a remarkable concordance between the observed and the predicted events in each decile (p = 0.36 at Hosmer–Lemeshow test).
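A decile-based calibration check of this kind can be sketched as follows. This is a minimal illustration of the Hosmer–Lemeshow statistic, assuming groups of near-equal size formed after sorting by predicted risk; it returns the statistic and the observed and expected events per group, and leaves the p value (usually taken from a chi-square distribution with the number of groups minus two degrees of freedom) to a statistics library.

```python
from typing import List, Sequence, Tuple


def hosmer_lemeshow(
    predicted: Sequence[float],
    observed: Sequence[int],
    groups: int = 10,
) -> Tuple[float, List[Tuple[int, float]]]:
    """Return the Hosmer-Lemeshow statistic and (observed, expected) events per group."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    n = len(order)
    statistic = 0.0
    table: List[Tuple[int, float]] = []
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue
        obs_events = sum(observed[i] for i in idx)
        exp_events = sum(predicted[i] for i in idx)
        mean_risk = exp_events / len(idx)
        # Groups with a mean risk of exactly 0 or 1 have zero variance and are skipped.
        if 0.0 < mean_risk < 1.0:
            variance = len(idx) * mean_risk * (1.0 - mean_risk)
            statistic += (obs_events - exp_events) ** 2 / variance
        table.append((obs_events, exp_events))
    return statistic, table
```

A small statistic (and hence a large p value) indicates that observed and predicted event counts agree across risk groups.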
Finally, the predictive capacity of the score in the classification of patients undergoing and not undergoing an event was quantified and tested by receiver operating characteristic (ROC) curve analysis. Again, to correctly estimate the area under the ROC curve, we applied a cross-validation procedure, similar to that employed to select independent predictors.
Figure 1 shows that the predictive capacity of the risk scores, although slightly decreasing in more extended time frames, is always remarkable, ranging from 0.80 for events occurring within one year to 0.76 for events occurring within four years.
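The area under the ROC curve reported here can be illustrated with a short sketch. It uses the rank-based definition: the AUC equals the probability that a randomly chosen patient with an event receives a higher score than a randomly chosen patient without one, with ties counted as one half. The function name is hypothetical, and the sketch covers only the AUC itself, not the cross-validation applied by the authors.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison of events and non-events."""
    events = [s for s, y in zip(scores, labels) if y == 1]
    non_events = [s for s, y in zip(scores, labels) if y == 0]
    if not events or not non_events:
        raise ValueError("both outcome classes are required")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5  # ties count as half a concordant pair
    return concordant / (len(events) * len(non_events))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of patients with and without events.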

Figure 1. Receiver operating characteristic analysis of the MECKI score. The area under the curve (AUC) of the MECKI score was 0.804 (0.754–0.852) at one year (1758 survivors and 83 events), 0.789 (0.750–0.828) at two years (1254 survivors and 152 events), 0.762 (0.726–0.799) at three years (1114 survivors and 205 events) and 0.760 (0.724–0.796) at four years (891 survivors and 246 events).
The evolution of the MECKI score
To simplify the calculation of the score, we developed a free online calculator, available at https://www.cardiologicomonzino.it/en/mecki-score/.
Some years later, these findings were confirmed by a validation study 21 that applied the MECKI score to a new population and confirmed its usefulness as a prognostic tool in daily heart failure routine. Later, other internal and external studies confirmed the value of the MECKI score also in comparison with other scores used in heart failure.22,23 Moreover, over the following years, the population of the MECKI score registry was enlarged and continuously updated, so that the MECKI group conducted a number of studies in different subpopulations of patients according to comorbidities or to study-specific parameters.22,24–43
At present, 25 Italian centres participate in the collection of data, and the registry counts more than 7000 patients so far, with a median follow-up of 1421 (627–2713) days and 1899 events. Table 2 shows the evolution of the MECKI score registry over time with the main steps of data collection.
Table 2. Main characteristics of the MECKI score registry population according to the enrolment steps.
CV deaths = CV death + urgent transplant or left ventricular assist device implant.
CV: cardiovascular; MECKI: Metabolic Exercise test data combined with Cardiac and Kidney Indexes; VO2: oxygen uptake.
In parallel with the Italian work, two new projects started in Europe and China, with the aim of extending the prognosis study through the MECKI score tool to different populations and ethnicities, and eventually to improve and correct the score according to their results.
Limitations and strengths of the MECKI score
Although the score is easy to calculate, its main limitation is that the patient must be able to perform a maximal CPET. Thus, the MECKI score cannot be applied to very severe heart failure patients (i.e. NYHA class IV and inotropic-dependent patients), who are not sufficiently represented in the study population, or to patients who are not able to pedal. However, most patients with reasonably stable severe heart failure can undergo a full exercise evaluation with a significant improvement in their prognostic evaluation.
Due to the different impact of single prognostic values in different patients (e.g. different subjects can have a dramatically different prognosis even though they have the same ejection fraction), a MECKI score evaluation can offer a common ground to compare patients from different institutions and, even more, to compare different stages of the disease in the same patient during follow-up. Moreover, due to the length of the study, which has collected data since 1993, the MECKI score dataset is also of paramount importance in assessing the real weight of different prognostic values over time. In this regard, the ‘fixed’ cutoffs usually reported in the literature (i.e. peak VO2 < 14 mL/min per kg or 12 mL/min per kg for patients receiving beta-blockers) should be interpreted in a dynamic fashion, since heart failure treatments (i.e. new drugs, CRT/ICD implant, biomarkers, risk factor control, coronary and valvular interventions) have clearly modified the prognosis of heart failure patients, even in advanced stages of the disease.41,44,45
In conclusion, the MECKI score initiative has proven to be a simple, useful tool for risk stratification and for therapeutic strategies in heart failure patients. The close connection between the heart failure centres involved and the continuous updating of the data allow the participating sites to propose substudies on specific subpopulations based on a common dataset and to put together and develop new ideas and perspectives.
Footnotes
Acknowledgement
We thank Dr Michela Palmieri for the English revision of the manuscript.
Author contribution
ES and PA contributed to the conception or design of the work. ES, AB, FR, MM, IM, GV, FMS, PP, FV, PA contributed to the acquisition, analysis, or interpretation of data for the work. ES and AB drafted the manuscript. FR, MM, IM, GV, FMS, PP, FV, PA critically revised the manuscript. All gave final approval and agree to be accountable for all aspects of work ensuring integrity and accuracy.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
