Abstract
Background and aim
The Barcelona Clinic Liver Cancer (BCLC) staging system is commonly used to classify hepatocellular carcinoma (HCC) patients. However, other staging classification schemes have been proposed. We aimed to compare the prognostic accuracy of the Hong Kong Liver Cancer Staging (HKLC), the Model to Estimate Survival for HCC (MESH), and the BCLC staging systems using a Western cohort of HCC patients.
Methods
We retrospectively analyzed 918 patients diagnosed with HCC treated at the University Medical Center of Mainz between 2005 and 2014. We compared the ability of the BCLC, HKLC, and MESH systems to predict survival time. Predictive ability was tested using the integrated Brier score (IBS) and Harrell’s C index.
Results
Kaplan–Meier analyses showed significant differences in survival between stages defined by the BCLC, HKLC, and MESH. The HKLC classification demonstrated a more robust classification concordance and lower prediction error compared to the BCLC and MESH. In addition, we found that the BCLC offers superior predictive ability to the MESH in the first four years, whereas the MESH is superior for long-term predictions.
Conclusion
Our analyses confirm the prognostic value of three different HCC scoring systems. When compared, the HKLC provides superior prognostication ability.
Introduction
Hepatocellular carcinoma (HCC) has seen a rising incidence in the Western world and has long ranked among the most common malignancies worldwide. 1 Despite well-known risk factors such as chronic viral hepatitis, alcohol consumption, and metabolic syndrome, the majority of patients are diagnosed with advanced-stage unresectable disease. 2 Underlying chronic liver disease, particularly cirrhosis, increases the risk of developing HCC and impedes therapeutic efficacy. Curative options are further limited by a shortage of donor organs for liver transplantation. Over the last 20 years, different staging systems have been developed to provide prognostic characterization and treatment recommendations. 3 Staging systems for HCC differ considerably from other solid tumors, given that both tumor growth and underlying liver disease can influence liver function and directly impact prognosis.
The BCLC staging system, originally introduced in 1999, 4 is most frequently used to classify HCC patients in the USA and Europe and is endorsed by the guidelines of the European Association for the Study of the Liver (EASL) 5 and the American Association for the Study of Liver Diseases (AASLD). 6 For more than 10 years, it was the only scoring system offering both prognostic information and therapeutic recommendations. A variety of different HCC staging systems have since been proposed. Recently, Liu et al. and Yau et al. introduced two newly developed scoring systems, the Model to Estimate Survival for HCC (MESH) and the Hong Kong Liver Cancer Staging System (HKLC), that have been validated in large patient cohorts. The HKLC scoring system, like the BCLC system, provides treatment recommendations, but these are generally more aggressive. 7 This system likely reflects the specific treatment strategies for HCC patients in Asian countries. The recently published MESH score incorporates tumor size, vascular invasion, Child–Pugh stage (CP), performance status (PST), alpha fetoprotein (AFP), and serum alanine aminotransferase (ALT). In contrast to BCLC and HKLC scores, it does not provide concrete treatment recommendations. 8 While predictive ability has been demonstrated, 9 the HKLC and MESH systems are not currently endorsed in European guidelines. The utility of these scoring schemes and their comparison in different cohorts are currently being evaluated. 10
Since the underlying liver disease and preserved liver function have an impact on tumor development and therapeutic approaches, it is important to validate new scoring systems in different patient cohorts.
In this study, we aimed to validate the HKLC as well as the MESH score in a cohort of Western HCC patients from a tertiary liver center. We also sought to compare both staging systems to the established BCLC system for patient stratification and prediction of overall survival (OS).
Methods
Patients and staging system
All patients were treated at the University Medical Center Mainz from 2005 to 2014. Data were collected from our Clinical Registry Unit. In accordance with the EASL 5 or AASLD 6 guidelines, first diagnosis of HCC was confirmed either by tissue histology or radiologic criteria of magnetic resonance imaging or computed tomography scan.5,11 Laboratory parameters were analyzed at first diagnosis of HCC prior to treatment initiation. Tumor staging was established using patient performance status, serum levels of hepatic markers, hepatic function (including ascites, episodes of hepatic encephalopathy, CP stage), and radiological staging. Treatment decisions were reached in a multidisciplinary tumor board. Treatments were documented in the hospital documentation system. Scores were determined using information obtained at the time of initial diagnosis. Patients were only included if the treatment they received was in accordance with the corresponding stage treatment recommendations of both the BCLC and HKLC systems. Only patients who met the eligibility criteria for both scores (BCLC and HKLC) were included in the calculation.
Statistical analyses
Patient data collection was conducted with clinical registry software especially developed for clinical characterization of patients with HCC. 12 All statistical analyses were performed using R v3.5.0 (R Foundation for Statistical Computing, Vienna, Austria).
Clinical parameters are provided as frequency and percentage (e.g., etiology), and continuous variables as median and range (e.g., age). OS was calculated from date of first diagnosis to death or end of observation at 66 months. Kaplan–Meier survival curves were created using R. OS between stages was compared using the log-rank test.
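The Kaplan–Meier estimate and the median survival derived from it can be illustrated with a minimal sketch. The analyses in this study were performed in R; the following is a plain Python illustration, not the code used here, and the function names (`kaplan_meier`, `median_survival`) and the toy data are hypothetical.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    times  -- follow-up in months
    events -- 1 if death was observed, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve

def median_survival(curve):
    """First time at which S(t) drops to 0.5 or below; None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Hypothetical toy data (months, death indicator); not taken from the cohort.
times = [6, 6, 12, 18, 24, 30, 66, 66]
events = [1, 0, 1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
print(curve)
print(median_survival(curve))
```

In this sketch, patients still alive at 66 months are treated as censored at that time, mirroring the end of observation described above.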
Results
Patient baseline characteristics
We retrospectively analyzed a total of 1173 HCC patients who were treated at the University Medical Center of Mainz. A total of 918 patients met the eligibility criteria for our study and were included in further analyses (Figure 1). Patients’ baseline characteristics were collected from the HCC registry and are shown in Table 1. 12
Of the 918 patients, 83.1% (
Figure 1. Flow diagram of included patients. Hepatocellular carcinoma (HCC) patient cohort treated at the University Medical Center Mainz between 2005 and 2014.
Table 1. Patient baseline characteristics. HCC patient cohort treated at the University Medical Center Mainz between 2005 and 2014.
Patients who died due to, e.g., acute liver failure, sepsis, etc. while waiting for therapy.
HBV: hepatitis B virus; HCV: hepatitis C virus; NASH: non-alcoholic steatohepatitis; ECOG: Eastern Cooperative Oncology Group; PS: performance status; AFP: alpha-fetoprotein; PEI: percutaneous ethanol injection; RFA: radiofrequency ablation; LITT: laser-induced thermotherapy; OLT: orthotopic liver transplantation; TACE: trans-catheter arterial chemoembolization; SIRT: selective internal radiation therapy; BSC: best supportive care; BCLC: Barcelona Clinic Liver Cancer; MESH: Model to Estimate Survival of HCC; HKLC: Hong Kong Liver Cancer.
Median survival and Kaplan–Meier analyses according to different staging systems
All 918 patients in this cohort were classified according to the BCLC, HKLC, and MESH scoring systems. Median survival was calculated for all patients. Kaplan–Meier survival curves stratified by the BCLC, HKLC, and MESH scores are shown in Figure 2. Significant differences (
Figure 2. Kaplan–Meier survival curves for (a) the Barcelona Clinic Liver Cancer score, (b) the Hong Kong Liver Cancer Staging score, and (c) the Model to Estimate Survival for HCC score. Significant differences in survival could be seen between stages (***
Comparative analyses of the different staging systems
Comparison of parameters of the staging systems BCLC, HKLC, and MESH.
AUC: area under the curve. Bold shows superior values.
We further compared the scoring systems using ROC curve analysis and calculated AUC values at specific time points. Time intervals of six months up to 66 months for the MESH, BCLC, and HKLC were calculated as had been done before for the c statistic. The HKLC score showed moderately higher AUC values than either the BCLC or MESH for the first three years (BCLC AUC = 0.680 (95% CI 0.642–0.717); MESH AUC = 0.710 (95% CI 0.675–0.745); HKLC AUC = 0.737 (95% CI 0.701–0.772) for 36 months). MESH stratification was better than the BCLC or HKLC at 54 months (BCLC AUC = 0.664 (95% CI 0.619–0.710); MESH AUC = 0.724 (95% CI 0.684–0.765); HKLC AUC = 0.704 (95% CI 0.661–0.747); Table 2 and Figure 3).
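The two discrimination measures referred to here, Harrell's C index and the AUC at a fixed time point, can be sketched as follows. This is a simplified Python illustration, not the R code used in this study: the stage is used directly as an ordinal risk score, the AUC is a cumulative/dynamic version that simply drops patients censored before the horizon instead of weighting for censoring, and the function names and toy data are hypothetical.

```python
def harrell_c(times, events, risk):
    """Harrell's concordance index: higher risk should mean shorter survival."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when patient i is observed to die before time j.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied risk scores count as half
    return concordant / usable

def auc_at(times, events, risk, horizon):
    """AUC at `horizon`: deaths by the horizon versus patients alive beyond it.

    Assumption: patients censored at or before the horizon are excluded;
    no inverse-probability-of-censoring weights are applied.
    """
    cases = [r for t, e, r in zip(times, events, risk) if e == 1 and t <= horizon]
    controls = [r for t, r in zip(times, risk) if t > horizon]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

# Hypothetical toy data: months of follow-up, death indicator, stage (1 to 4).
times = [6, 12, 18, 24, 30, 66]
events = [1, 1, 1, 0, 1, 0]
stage = [4, 3, 2, 3, 2, 1]
print(harrell_c(times, events, stage))
print(auc_at(times, events, stage, horizon=24))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, which is the scale on which the AUC values in this section are reported.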
Time-dependent receiver operating characteristic curves for (a) 12 months, (b) 36 months, and (c) 54 months. (d) Area under the curve values for specific time points (months). (e) Prediction error curves show the Brier score for each time point 
To verify the accuracy of the probability forecast, we calculated the IBS for Kaplan–Meier analyses up to a time point of 66 months. The IBS of the HKLC revealed a uniformly lower prediction error than the BCLC and MESH at times up to 42 months (BCLC IBS = 0.176, HKLC IBS = 0.168, and MESH IBS = 0.168 for 42 months; Table 2). The MESH score revealed a slightly lower prediction error than the HKLC after month 42 (HKLC IBS = 0.162 and MESH IBS = 0.161 for 48 months; Table 2). Calculation of the IBS also showed a slight superiority of the BCLC to the MESH score within the first 42 months (BCLC IBS = 0.188 and MESH IBS = 0.184 for 36 months), whereas later on, the MESH had a slightly lower prediction error (BCLC IBS = 0.160 and MESH IBS = 0.148; Table 2 and Figure 3).
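The idea behind the Brier score and its integrated form can be sketched in a few lines. This Python illustration is deliberately simplified and is not the R code used in this study: it assumes that every patient's status is known at each evaluation time (no censoring before the last grid point), so the inverse-probability-of-censoring weights that a full IBS calculation requires are omitted. The function names, the exponential prediction model, and the toy data are hypothetical.

```python
import math

def brier_score(times, predicted_surv, t):
    """Mean squared error between predicted S(t) and observed status at t.

    Assumption: status at time t is known for every patient, so no
    censoring weights are applied.
    """
    total = 0.0
    for obs_time, surv in zip(times, predicted_surv):
        alive = 1.0 if obs_time > t else 0.0
        total += (alive - surv(t)) ** 2
    return total / len(times)

def integrated_brier_score(times, predicted_surv, grid):
    """Trapezoidal integral of the Brier score over `grid`, divided by its span."""
    scores = [brier_score(times, predicted_surv, t) for t in grid]
    area = sum((grid[k + 1] - grid[k]) * (scores[k] + scores[k + 1]) / 2.0
               for k in range(len(grid) - 1))
    return area / (grid[-1] - grid[0])

# Hypothetical toy data: observed death times in months, all uncensored.
death_months = [6, 18, 30, 54]
# Hypothetical predictions: exponential survival with a per-patient median.
medians = [6, 12, 24, 48]
predicted = [lambda t, m=m: math.exp(-math.log(2) * t / m) for m in medians]
grid = list(range(6, 67, 6))  # six-month steps up to 66 months
print(integrated_brier_score(death_months, predicted, grid))
```

In this simplified setting, a model that always predicts a survival probability of 0.5 scores 0.25 at every time point and a perfect model scores 0, so lower values indicate a smaller prediction error.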
Discussion
An optimal staging system offers accurate prognostic information and allows for accurate stratification of patients into subgroups based on the best treatment strategy. 15 In this study, we validated the MESH score as well as the HKLC score in a large European cohort.
The BRIDGE study suggested that there is a difference in survival time, patient demographics, as well as documented treatment approach between different regions. 16 As such, a validation of new scoring systems should be performed in large patient cohorts from different regions across the world. Our cohort is a representative cohort from a tertiary European clinical center with patient demographics similar to those reported in the BRIDGE study. 16
The limitations of this study lie principally in its retrospective design and the patient population of a tertiary clinical center, which usually includes patients in a more advanced stage than those representative of the entire population. While 918 patients can be considered a large cohort, it is still limited in number and therefore limited in power. Cancer registries such as the Surveillance, Epidemiology, and End Results program (National Cancer Institute) would help to overcome this issue, but such tools are often limited in terms of the quality of the data.
The development of a scoring system may be influenced by the patient cohort used to establish it. The BCLC was criticized for its lack of universal applicability, since it was developed in a small European patient cohort of mainly advanced-stage hepatitis C virus (HCV)-related HCC cases. 4 These patients have highly impaired liver function secondary to advanced cirrhosis. The MESH and HKLC scoring systems were developed in Asian patient cohorts, where hepatitis B infection is endemic. In contrast to HCV patients, patients with chronic hepatitis B virus (HBV) infection frequently develop HCC in a non-cirrhotic liver.17,18 Despite being classified as the same stage, HBV-related HCCs carry a better stage-matched prognosis than HCV-related HCCs, even when resected. 19
We demonstrated that the BCLC, HKLC, and MESH scores show moderate discriminative ability but are reliable tools to predict OS in patients with HCC based on the stage at diagnosis in a Western patient cohort. We demonstrated that the HKLC score provides higher predictive power as well as a lower prediction error than either the BCLC or MESH scores. Of note, the BCLC may be superior to the MESH score in the prediction of OS in short-term analyses. For observations over 54 months, MESH demonstrated slightly better discriminative ability. Although previous analyses have also shown a moderate superiority of the HKLC score, these analyses were limited by smaller cohorts, shorter follow-up time, 20 different patient demographics, 21 and analyses of groups receiving only intra-arterial treatments. 22 A recently published letter to the editor evaluated the MESH as well as the HKLC score in a French cohort.21,23 We could confirm their findings about the applicability of the MESH as well as the HKLC score in European cohorts. Overall, they showed slightly higher values of the c statistic for all three scoring systems compared to our analysis. We assume that this is due to the differences in patient cohorts (e.g., less advanced tumors, better preserved liver function). However, to our knowledge, our study is the first comparative analysis of different observational time points for these scoring systems, which provides a more detailed analysis of the applicability of a scoring system.
Comparison of parameters of the staging systems BCLC, HKLC, and MESH.
ALT: alanine transaminase; PVI: portal vein invasion; EVM: extrahepatic vascular metastasis.
Treatment recommendations of the HKLC and BCLC score according to staging of HCC.
LT: liver transplantation.
Tumor size and extent, parameters emphasized by the MESH score, are important prognostic predictors for HCC. 28 HCCs within the Milan criteria are classified as early stage. Falling within the Milan criteria is one of the most important requirements for curative treatment by orthotopic liver transplantation, although the validity of these criteria continues to be debated.30–32 Extrahepatic metastasis or macrovascular invasion is usually a sign of aggressive tumor biology and is associated with poor prognosis, and thus presents a contraindication for liver transplantation. Milan criteria are included in the MESH as well as in the BCLC score. The HKLC score estimates tumor burden and vascular invasion with more parameters, which offers a more accurate picture. In particular, the HKLC score discriminates between invasion of the main portal trunk and smaller vessels, which was shown to influence prognosis significantly. 33 More diverse criteria defining tumor burden may result in a better predictive ability of OS and improved treatment stratification.
The MESH score has been shown to be superior for risk stratification in patients with early-stage HCC (BCLC A and HKLC I/II). 8 In line with this, we were able to show that the MESH score is superior to the BCLC score in terms of long-term prediction accuracy. Long-term predictions are particularly difficult due to different individual responses to therapy and progression of liver disease. Better means of discrimination are needed in early as well as in intermediate and advanced stages to select optimal treatment options. 2
As previously stated, the HKLC generally recommends a more aggressive treatment strategy. Consequently, even patients classified as being in the advanced stages of BCLC with a consecutive recommendation for a palliative therapy regime can be classified as stage I or II according to the HKLC score, which would provide the possibility of resection or transplantation, and are thus treated with a curative intent (Table 4). HKLC scoring therefore offers the ability to identify patients who are suitable for a more aggressive treatment option and could potentially improve outcomes.
Overall, we were able to show superiority of the HKLC score in terms of predictive power as well as prediction error in a Western patient cohort. In addition, the HKLC shows a uniformly higher discriminative power and lower prediction error up to four years, whereas the MESH shows the lowest prediction error after three years.
Conclusion
Our analyses of different scoring systems for HCC confirmed the prognostic ability of the BCLC, HKLC, and MESH scoring systems. Our comparative analyses suggest that the MESH and HKLC possess a high predictive accuracy in Western HCC patients. However, the HKLC offers a uniformly lower prediction error and higher discriminative power than the BCLC and MESH scores. Thus, use of the HKLC scoring system should be further evaluated for potential use in prognostication and treatment strategy for HCC patients in Europe and the Americas.
Footnotes
Declaration of conflicting interests
The authors have no conflicts of interest to declare.
Ethics approval
The study was approved by the responsible ethics committee for the retrospective analysis of clinical data.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: J.U.M. was supported by grants from the Volkswagen Foundation (Lichtenberg program), and C.C. was supported by a TransMed Fellowship of the University Medical Center Mainz.
Informed consent
No informed consent was requested. The study design was retrospective and all data were anonymized.
