Introduction
Intrahepatic cholangiocarcinoma (ICCA) is the second most common primary malignancy of the liver. 1 Although primary hepatic lymphoma (PHL) is rare, secondary hepatic lymphoma (SHL) is quite common and accounts for 20% of non-Hodgkin lymphoma. 2 The treatments of these 2 diseases differ: surgery is the only curative treatment for ICCA, while chemotherapy plays an important role in treating hepatic lymphoma (HL). 3 Thus, accurate differentiation of these 2 diseases at an early stage is necessary for choosing appropriate treatment.
Traditional radiological diagnosis is not sufficiently objective, because the number of traditional image features is limited and their interpretation depends to some extent on the experience of the radiologist. Meanwhile, with the development of radiological techniques such as contrast-enhanced computed tomography (CECT), magnetic resonance imaging, and contrast-enhanced ultrasound, more digital parameters can be obtained, enabling quantitative radiological diagnosis.4,5 To promote objective diagnosis, digital and artificial intelligence (AI) methods have developed rapidly in recent years.
Texture parameters capture digital image information that cannot be seen by the naked eye and describe the heterogeneity of regions of interest (ROIs). 6 Previous studies have used texture analysis and AI in the diagnosis, staging, treatment planning, and prognosis prediction of hepatic diseases. 7 For instance, texture analysis and topological data with 3 machine learning (ML) models were used to classify different malignant liver masses. 8 The differentiation of ICCA and HL has not been studied with AI methods, and in clinical practice discriminating these 2 diseases requires a combination of clinical, laboratory, and radiological results. Thus, in this study, we aimed to explore the ability of texture analysis combined with ML to differentiate these 2 malignant liver diseases.
Method
Patient Enrollment
This study was approved by the ethics committee of West China Hospital, Sichuan University, and no written informed consent was required. From January 2014 to October 2019, a total of 129 patients from West China Hospital were included, all with histologically confirmed diagnoses (28 HL and 101 ICCA). The inclusion criteria were: (1) a histological diagnosis of ICCA or HL, (2) a complete electronic medical record, and (3) abdominal contrast CT images. The exclusion criteria were: (1) incomplete information or low-resolution images, and (2) other diseases that can significantly influence the images, such as liver cirrhosis.
Contrast CT Data Acquisition and ROI Segmentation
All CECT images were obtained with Philips Brilliance 64-slice detector-row machines (Philips Healthcare). The enhanced scan (120 kVp, 200 mA, pitch 0.891 to 1.235; collimation 64 × 0.625 mm) was initiated 30 s (hepatic arterial phase) and 90 s (portal venous phase) after injection of the contrast medium (1.5-2.0 mL/kg, Iohexol: Beijing Beilu Pharmaceutical), which was delivered by a power injector (Stellant D, Medrad) at a rate of 2 to 3 mL/s.
Texture Feature Extraction
Texture features were extracted with the software LifeX (version 3.74, French Alternative Energies and Atomic Energy Commission), an open-source platform that can analyze and extract a large number of quantitative parameters from digital images. 9 ROIs on the delayed phase were drawn independently by 2 readers with more than 5 years of clinical experience. All results were supervised by a third reader with 15 years of clinical experience, who resolved disagreements. Three-dimensional ROIs were built by stacking the two-dimensional ROIs delineated around the lesion boundary on each slice of the transaxial CT images. Representative CECT images and ROIs of ICCA and HL patients are shown in Figure 1.

Figure 1. CECT of patients with ICCA and HL. (a and b) The CECT images of 1 ICCA patient. This patient presented with intermittent dull pain in the right upper abdomen with postprandial pain that had been evident for 8 years and had worsened over the preceding 20 days. No nausea, vomiting, or yellowish skin staining was found. An irregular, mixed low-density mass was seen in the left internal lobe of the liver, with a blurred boundary and a size of about 8.1 × 3.9 cm on CECT. (c and d) The CECT images of 1 HL patient. This patient was admitted to the hospital after 3 months of intermittent fever, night sweats, and pain in the right chest, accompanied by decreased appetite and no yellowing of the skin. On CECT, a soft tissue mass with slightly lower density was seen in the lower segment of the right lobe of the liver, about 8.8 × 8.7 cm, with an ill-defined boundary and uneven moderate enhancement. The boundary between the lesion and the right anterior and posterior portal vein branches was not clear. All ROIs were drawn along the liver lesion slice by slice, and all areas of calcification and necrosis were excluded.
Feature Selection and Classification Methods
In this study, 5 feature selection methods were used: distance correlation (DC), random forest (RF), least absolute shrinkage and selection operator (LASSO), eXtreme gradient boosting (XGBoost), and gradient boosted decision tree (GBDT). Feature selection was implemented in Python. Meanwhile, 9 classification methods were used: linear discriminant analysis (LDA), support vector machine (SVM), RF, adaptive boosting (Adaboost), k-nearest neighbor (KNN), Gaussian Naïve Bayes (GaussianNB), logistic regression (LR), GBDT, and decision tree (DT). A total of 45 models were built by combining the 5 selection methods with the 9 classification methods.
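The 5 × 9 grid of models can be enumerated in a few lines. The sketch below only generates the model names, joining selector and classifier abbreviations with an underscore as in the Results (e.g., RF_LDA); it does not train anything, and the list order and the helper name `model_names` are illustrative assumptions rather than part of the study's code.

```python
from itertools import product

# Abbreviations as given in the text; the ordering is an arbitrary choice.
SELECTORS = ["DC", "RF", "LASSO", "XGBoost", "GBDT"]
CLASSIFIERS = ["LDA", "SVM", "RF", "Adaboost", "KNN",
               "GaussianNB", "LR", "GBDT", "DT"]


def model_names(selectors, classifiers):
    """Return one 'selector_classifier' name per pairing (hypothetical helper)."""
    return [f"{s}_{c}" for s, c in product(selectors, classifiers)]


names = model_names(SELECTORS, CLASSIFIERS)
print(len(names))   # number of selector/classifier pairings
print(names[:3])    # first few model names
```

Note that RF and GBDT appear on both sides of the grid, so names such as RF_RF and GBDT_GBDT denote the same algorithm used first for selection and then for classification.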
Diagnostic Model Built by ML
The patients were randomly divided into a training group and a test group in a 4:1 ratio. The models were assessed by 10-fold cross-validation to make maximal use of the data and improve model accuracy. 10 The sensitivity, specificity, area under the receiver operating characteristic curve (AUC), and accuracy were calculated to assess the differential ability of the 45 models (Figure 2). To explore the necessity of ML selection methods, the 3 features with the highest AUC in single-feature LDA classification were also used for modeling. The ML algorithms were all programmed in Python (version 3.6.4) with the ML library scikit-learn (version 0.19.0).
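The data partitioning and three of the metrics named above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the study's code: the split is a plain (unstratified) shuffle, the cross-validation is applied to the training indices, and the function names `split_4_to_1`, `k_folds`, and `binary_metrics` are hypothetical. The labels and predictions at the end are toy values.

```python
import random


def split_4_to_1(n_samples, seed=0):
    """Shuffle sample indices and split them 4:1 into training and test lists."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_test = n_samples // 5
    return idx[n_test:], idx[:n_test]


def k_folds(indices, k=10):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, folds[i]


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for 0/1 labels (1 = positive).

    Assumes both classes occur in y_true; otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


train_idx, test_idx = split_4_to_1(129)      # cohort size from the text
folds = list(k_folds(train_idx, k=10))
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0],   # toy labels
                         [1, 1, 0, 0, 0, 0, 0, 1])   # toy predictions
print(len(train_idx), len(test_idx), metrics)
```

With 28 HL and 101 ICCA cases the classes are imbalanced, so a stratified split may be preferable in practice; the text does not say whether stratification was used.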

Figure 2. The flowchart of this study.
Results
Characteristics of the Study Cohort
The clinical and pathological characteristics of the ICCA and HL patients are summarized in Table 1. The mean age of ICCA patients was 58.2 (10.8) years, and that of HL patients was 53.2 (17.9) years. The male:female ratios were 55:46 and 17:11 for ICCA and HL patients, respectively. In terms of the pathological findings, there were 26 poorly differentiated, 54 poorly to moderately differentiated, 19 moderately differentiated, and 2 moderately to highly differentiated ICCA cases. Among the HL patients included, 8 had PHL and 20 had SHL.
Table 1. Clinical Parameters of ICCA and HL.
Abbreviations: ICCA, intrahepatic cholangiocarcinoma; HL, hepatic lymphoma; M:F, male:female; NA, not appropriate; PHL, primary hepatic lymphoma; SHL, secondary hepatic lymphoma.
Characteristics of Texture Parameters
A total of 45 features were extracted from the CECT images of each patient, and 38 of them were eligible. They comprised 4 histogram-based features, 3 shape-based features, 6 gray-level co-occurrence matrix features, 11 gray-level run length matrix features, 3 neighborhood gray-level dependence matrix features, and 11 gray-level zone length matrix features (Supplemental Material 1). The definitions of the texture parameters are given in Supplemental Material 2, and the features selected by each selection method are listed in Supplemental Material 3.
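To make one of these feature families concrete, the sketch below builds a gray-level co-occurrence matrix for a tiny two-dimensional image and derives two common descriptors from it. It is a generic textbook illustration, not the LifeX implementation: the single horizontal offset, the symmetric counting, the descriptor formulas, and the toy images are all assumptions, and LifeX's own definitions (Supplemental Material 2) may differ.

```python
def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix for one offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Weighted sum of squared gray-level differences; 0 for a flat region."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def homogeneity(p):
    """Inverse-difference weighting; 1 when all neighbors share a gray level."""
    n = len(p)
    return sum(p[i][j] / (1 + abs(i - j)) for i in range(n) for j in range(n))


# Toy 3 x 3 images with 2 gray levels (hypothetical data).
uniform = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
checker = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]

for name, img in (("uniform", uniform), ("checker", checker)):
    p = glcm(img, levels=2)
    print(name, contrast(p), homogeneity(p))
```

The flat image yields zero contrast and maximal homogeneity, while the checkerboard, where every horizontal neighbor differs, yields the opposite pattern; this is the sense in which such features quantify heterogeneity within an ROI.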
Diagnostic Performance of Models
Features were selected by 5 methods and classified by 9 methods, so a total of 45 predictive models were built by combining the feature selection and classification methods. Each model is named by joining the selection and classification method names with an underscore. The diagnostic ability of each model is listed in Table 2. The models with the highest AUC and accuracy (>0.96) were DC_LDA (AUC 0.997, accuracy 0.962) and RF_LDA (AUC 0.997, accuracy 0.969). Heatmaps of the AUC and accuracy in the test group are presented in Figure 3. To assess the necessity of ML selection methods, we compared the AUC of each single feature and used the top 3 features to build a model with the LDA classification method. The AUC of this model without ML selection was 0.975 and the accuracy was 0.899 (the AUCs of the individual features are given in Supplemental Material 4).
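The single-feature comparison described above can be sketched as follows. This is an illustration under assumptions, not the study's procedure: the AUC is computed directly from raw feature values with the rank-based (Mann-Whitney) formulation rather than from a fitted LDA classifier, the ranking is direction-agnostic, and the feature names and values are made up.

```python
def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_features(table, labels, k=3):
    """Rank features by single-feature AUC and keep the best k.

    max(a, 1 - a) treats a feature that is lower in the positive class
    as equally informative as one that is higher.
    """
    ranked = []
    for name, values in table.items():
        a = auc(values, labels)
        ranked.append((max(a, 1 - a), name))
    ranked.sort(reverse=True)
    return [(name, score) for score, name in ranked[:k]]


# Toy data: 3 positive and 3 negative cases, 4 hypothetical features.
labels = [1, 1, 1, 0, 0, 0]
table = {
    "feat_a": [5, 6, 7, 1, 2, 3],
    "feat_b": [4, 5, 1, 2, 3, 6],
    "feat_c": [9, 8, 2, 1, 3, 4],
    "feat_d": [1, 2, 6, 3, 4, 5],
}
top = top_features(table, labels, k=3)
print(top)
```

The retained features would then be passed to a classifier such as LDA, which is the comparison model the paragraph above reports.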

Figure 3. Heatmaps of the AUC and accuracy of the 45 models in the test group.
Table 2. The Differential Ability of All Models Based on 5 Feature Selection Methods and 9 Feature Classification Methods.
Abbreviations: Adaboost, Adaptiveboosting; AUC, area under the curve; DC, distance correlation; DT, decision tree; GaussianNB, Gaussian Naïve Bayes; GBDT, gradient boosted decision tree; ICCA, intrahepatic cholangiocarcinoma; KNN, k-nearest neighbor; LASSO, least absolute shrinkage and selection operator; LDA, linear discriminant analysis; LR, logistic regression; RF, random forest; SVM, support vector machine; OD, original data; XGBoost, eXtreme gradient boosting.
Discussion
Early and accurate differentiation of ICCA and HL is necessary for treatment. Image features carrying rich digital information can be explored to facilitate this process, and texture parameters combined with ML make imaging diagnosis more objective and precise. 11 To improve radiological diagnosis and its ability to assist clinical diagnosis, this study used texture parameters and ML to differentiate ICCA and HL. The results showed that texture parameters can differentiate ICCA from HL effectively, provided that suitable ML models are chosen.
ICCA and HL have similar features on CECT, namely low-attenuation masses with a clear rim,12–14 which increases the difficulty of image-based diagnosis and makes it dependent on the experience of the radiologist. The specific diagnosis of these 2 diseases depends on histopathological results from surgical excisional biopsies or computed tomography- or ultrasound-guided percutaneous biopsies, which are invasive and subject to selection bias. Although patients with SHL have systemic symptoms and diffuse lesions on images, systemic symptoms also occur in patients with advanced-stage ICCA, and isolated intrahepatic masses also occur in SHL. 15 Several studies have reported cases in which HL mimicked ICCA and multiple modalities were required for diagnosis.16–18 For example, one study tried to use serum alkaline phosphatase isoenzyme electrophoresis to differentiate the 2 diseases. 19 To improve radiological diagnostic capacity, texture analysis combined with ML is a promising method that remains to be explored.
Histological differences are the basis of differential diagnosis, and some studies have shown that histological features are reflected in digital images. 20 Thus, we hypothesized that the differences in the components of ICCA and HL can also be uncovered by texture analysis. Texture analysis combined with ML has been used in many previous studies to improve the staging, diagnosis, and therapy response assessment of many diseases.21,22 For the liver, previous studies used a convolutional neural network to differentiate malignant indeterminate and benign liver masses with an overall accuracy of 0.84. 23 Besides digital images, hematoxylin and eosin-stained whole-slide images can also be analyzed by deep learning methods to differentiate malignant liver masses (AUC > 0.85). 24 To our knowledge, this is the first study to use multiple ML models and texture features to distinguish ICCA and HL. Moreover, compared with previous studies that used only a few algorithms for feature selection and classification, our study built 45 models so that proper parameters could be selected and effective classification methods could be used, as no single model suits every problem.25–28
In our study, RF_LDA performed the best in differentiating ICCA and HL, followed by DC_LDA. Comparing the AUC and accuracy of LDA models built with and without ML feature selection, the models with ML selection showed higher diagnostic ability. To some extent, this result indicates the superiority of ML selection. LDA is among the simplest classification methods, with low cost and good performance. It separates the groups by projecting the parameters onto a line and secures discriminative ability by maximizing the ratio of the intergroup variance to the intragroup variance. 29 RF is an effective method for both feature selection and classification: it builds subtrees from bootstrap samples of the training data and assigns the class that receives the most votes. It can deal with not only linear but also nonlinear variables, with high accuracy and resistance to overtraining. 30 A previous study used ML methods to explore prognostic biomarkers of advanced nasopharyngeal carcinoma and found that RF_RF performed the best. 31
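The LDA criterion mentioned above, the ratio of intergroup to intragroup variance along the projection direction, can be written for the two-class case as the standard Fisher criterion. The notation below is an assumption introduced for illustration: w is the projection direction, mu_1 and mu_2 are the class mean vectors of the feature vectors x, and C_1 and C_2 are the two classes (here ICCA and HL).

```latex
\[
J(\mathbf{w}) =
  \frac{\mathbf{w}^{\top} S_B \,\mathbf{w}}{\mathbf{w}^{\top} S_W \,\mathbf{w}},
\qquad
S_B = (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)^{\top},
\qquad
S_W = \sum_{c=1}^{2} \sum_{\mathbf{x} \in C_c}
      (\mathbf{x} - \boldsymbol{\mu}_c)(\mathbf{x} - \boldsymbol{\mu}_c)^{\top}
\]
\[
\mathbf{w}^{*} = \arg\max_{\mathbf{w}} J(\mathbf{w})
  \;\propto\; S_W^{-1}(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)
\]
```

The closed-form direction in the second line is one reason LDA is cheap to fit: it requires only the class means and the pooled within-class scatter, provided that S_W is invertible.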
There are some limitations to this study. Firstly, the number of HL patients enrolled is limited because HL does not have a high prevalence, and larger cohorts are required in future studies. Secondly, this study was conducted in a single hospital, so the generalization performance of the models remains uncertain. However, texture parameter extraction and model building were conducted with open-source packages, which supports repeatability by other researchers. Moreover, this study only explored texture parameters from CECT; other radiological modalities should be studied and compared to identify the most effective radiological method for distinguishing these 2 diseases. In conclusion, combining texture parameters from CECT with multiple ML models can differentiate ICCA and HL effectively, and the LDA-based models performed the best in this process.
Supplemental Material
sj-docx-1-tct-10.1177_15330338211039125 - Supplemental material for Differentiation of Intrahepatic Cholangiocarcinoma and Hepatic Lymphoma Based on Radiomics and Machine Learning in Contrast-Enhanced Computer Tomography
Supplemental material, sj-docx-1-tct-10.1177_15330338211039125 for Differentiation of Intrahepatic Cholangiocarcinoma and Hepatic Lymphoma Based on Radiomics and Machine Learning in Contrast-Enhanced Computer Tomography by Hanyue Xu, Xiuhe Zou, Yunuo Zhao, Tao Zhang, Youyin Tang, Aiping Zheng, Xianghong Zhou and Xuelei Ma in Technology in Cancer Research & Treatment
Supplemental material files sj-docx-2-tct-10.1177_15330338211039125, sj-DOCX-3-tct-10.1177_15330338211039125, sj-docx-4-tct-10.1177_15330338211039125, and sj-docx-5-tct-10.1177_15330338211039125 accompany the same article, with the same title, authors, and journal as above.
Footnotes
Abbreviations
Authors’ Contributions
HY X, XH Z, and XL M contributed to the study design. AP Z and YY T contributed to the data collection; YN Z, XH Z, and T Z contributed to the data analysis. All authors contributed to the writing or revising of this manuscript. All authors have read and approved this version of the manuscript to be published and agree to be responsible for all aspects of the work.
Availability of Data and Materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Declarations
Ethics approval and consent to participate were obtained from the ethics committee of West China Hospital.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
References
