Evaluation of Combined Cancer Markers With Lactate Dehydrogenase and Application of Machine Learning Algorithms for Differentiating Benign Disease From Malignant Ovarian Cancer

Abstract

Background

The differential diagnosis of ovarian cancer is important, and research to identify biomarkers with higher diagnostic performance is ongoing. This study aimed to evaluate the diagnostic utility of combinations of cancer markers classified by machine learning algorithms in patients with early stage ovarian cancer, an application that has rarely been reported.

Methods

In total, 730 serum samples were assayed for lactate dehydrogenase (LD), neutrophil-to-lymphocyte ratio (NLR), human epididymis protein 4 (HE4), cancer antigen 125 (CA125), and risk of ovarian malignancy algorithm (ROMA). Among them, 53 were diagnosed with early stage ovarian cancer, and the remaining 677 were diagnosed with benign disease.

Results

The areas under the receiver operating characteristic curves (ROC-AUCs) of the ROMA, HE4, CA125, LD, and NLR for discriminating ovarian cancer from non-cancerous disease were .707, .680, .643, .657, and .624, respectively. The ROC-AUC of the combination of ROMA and LD (.709) was similar to that of ROMA alone in the total population. In the postmenopausal group, the ROC-AUC of HE4 and CA125 combined with LD was the highest (.718). When machine learning algorithms were applied to ROMA combined with LD, the ROC-AUC of random forest was higher than that of the other applied algorithms in the total population (.757), showing acceptable performance.

Conclusion

Our data suggest that the combinations of ovarian cancer-specific markers with LD classified by random forest may be a useful tool for predicting ovarian cancer, particularly in clinical settings, due to easy accessibility and cost-effectiveness. Application of an optimal combination of cancer markers and algorithms would facilitate appropriate management of ovarian cancer patients.

Keywords

cancer antigen 125, human epididymis protein 4, lactate dehydrogenase, machine learning, neutrophil-to-lymphocyte ratio, ovarian cancer

Highlights

Risk of ovarian malignancy algorithm (ROMA) is a widely used ovarian cancer marker.

ROMA consists of cancer antigen 125 (CA125), human epididymis protein 4 (HE4), and menopause state.

Lactate dehydrogenase (LD) related to cancer is a routinely prescribed biomarker.

The diagnostic utility of combined cancer markers with LD for ovarian cancer, which has been seldom reported, was evaluated.

Machine learning algorithms were applied to differential diagnosis of early stage ovarian cancer from benign conditions for better performance.

Areas under the receiver operating characteristic curves (ROC-AUCs) and sensitivities at 75.0% specificities were measured.

The combination of ROMA with LD classified by random forest showed the best ROC-AUC (.757, 68.8% sensitivity at 75.0% specificity), indicating acceptable usefulness for differential diagnosis of ovarian cancer.

Introduction

A total of 295 414 new ovarian cancer patients were diagnosed and 184 799 cancer-related deaths occurred worldwide in 2018.¹ Ovarian cancer is the eighth most common cause of cancer-related death in women in South Korea.² The age-standardized incidence rate of ovarian cancer has increased progressively from 5.0 to 6.3 based on the Korea Central Cancer Registry.³ Unfortunately, the physical inaccessibility of the ovaries and the lack of specific symptoms in the early stages of ovarian cancer make differential diagnosis difficult. Several patients undergo extensive surgical staging, such as oophorectomy, without a definite diagnosis of malignant cancer, leading to increased morbidity.⁴

There has been ongoing research to identify biomarkers with higher diagnostic performance in differentiating ovarian cancer from benign conditions. Among diverse biomarkers, cancer antigen 125 (CA125) is a representative marker for detecting ovarian cancer and guiding treatment. However, CA125 showed a wide range of sensitivity (27–66%) for the detection of early stage ovarian cancer, and a high false-positive rate among premenopausal women with benign diseases.^5,6 Therefore, there have been attempts to find other biomarkers that can complement or replace CA125. Among them, human epididymis protein 4 (HE4) has been reported to have better specificity than CA125 in discriminating benign from malignant ovarian masses.⁷ The combination of these markers with the menopausal status of patients led to the proposal of the risk of ovarian malignancy algorithm (ROMA) to predict ovarian cancer.⁸ However, there have been discrepancies in the reported diagnostic performances of CA125, HE4, and ROMA in previous studies.⁷ In addition, multiple markers other than these representative ovarian markers, and their combinations based on machine learning algorithms, have also been investigated, with inconsistent results.^9,10

In this study, we evaluated the diagnostic value of combinations of conventional ovarian cancer markers with routinely prescribed markers such as lactate dehydrogenase (LD) and neutrophil-to-lymphocyte ratio (NLR), which have rarely been reported, to identify practically useful tools for differentiating early stage ovarian cancer from benign diseases in clinical settings. We also applied machine learning algorithms including bagging, boosting, classification tree, random forest, support vector machine, and K-nearest neighbor algorithms^10,11 to investigate the optimal diagnostic performance of combinations of multiple ovarian cancer markers.

Materials and Methods

Study Population

A total of 743 samples from patients who visited Kangnam Sacred Heart Hospital for ROMA testing between June 2014 and October 2016 were collected consecutively to demonstrate the diagnostic performance of ovarian cancer markers. We excluded 35 patients with malignant diseases other than ovarian cancer. Three patients diagnosed with advanced stage disease (n = 2 for stage III and n = 1 for stage IV) were also excluded so that only patients with early stage ovarian cancer were investigated. Additionally, 25 patients diagnosed with early stage ovarian cancer between November 2016 and December 2020 were included for more thorough analyses. The 730 samples, with no duplicated patients, were classified according to the patients’ diagnosis as follows: ovarian cancer group (n = 53) and control group (n = 677) (Supplementary Figure 1). All patients were diagnosed by specialized gynecologists and pathologists in their clinics at Kangnam Sacred Heart Hospital based on the criteria of the International Federation of Gynecology and Obstetrics^12,13 for ovarian cancer. The control group included patients with benign pelvic masses such as simple ovarian cysts and uterine leiomyoma, reflecting actual clinical laboratory conditions. The dataset analyzed in this study is provided in Supplementary Table 1. The procedures for the determination of the major laboratory parameters used as ovarian cancer markers are described below. The medical technicians and researchers were blinded to the test results.

HE4, CA125, and ROMA

HE4 serum concentration was determined using a commercially available Alinity i HE4 Reagent kit (Abbott Diagnostics, Abbott Park, IL, USA), which was used according to the manufacturer’s instructions. A two-step chemiluminescent microparticle immunoassay was used for quantitative analysis of HE4. Serum samples were incubated with 2H5 anti-HE4-coated paramagnetic microparticles. After unbound material was washed out, an acridinium-labeled 3D8 anti-HE4 conjugate was added. After another wash, pretrigger and trigger solutions were combined with the reaction complexes. The resulting chemiluminescent reaction was measured as relative light units (RLUs). The amount of HE4 antigen in the serum and the RLUs detected by the Alinity i HE4 assay exhibited a direct relationship, and the results were calculated automatically by the analyzer. CA125 was also detected using a two-step chemiluminescent microparticle immunoassay with the Alinity i CA125 II Reagent kit (Abbott Diagnostics). Serum samples were incubated with paramagnetic microparticles coated with ovarian cancer 125 so that CA125-reactive determinants bound to the particles. After washing, an M11 acridinium-labeled conjugate was added to the mixture. The subsequent steps were similar to the protocol for the Alinity i HE4 Reagent kit. ROMA was calculated according to a study by Moore et al.⁸ as follows.

Premenopausal: PI (predictive index) = −12.0 + 2.38 × LN(HE4) + .0626 × LN(CA125).

Postmenopausal: PI = −8.09 + 1.04 × LN(HE4) + .732 × LN(CA125).

Then, the ROMA value (predictive value) was calculated using the following equation: ROMA (%) = e^PI/(1 + e^PI) × 100.
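The two-step calculation above can be sketched in a few lines of Python. This is only an illustration of the printed equations: the function names and the dictionary layout are hypothetical, the coefficients are copied from the text, and the analyzer software used in the study may apply rounding or validity checks that are not reproduced here.

```python
import math

# Coefficients (intercept, HE4 slope, CA125 slope) as printed in the Methods.
ROMA_COEFFICIENTS = {
    "premenopausal": (-12.0, 2.38, 0.0626),
    "postmenopausal": (-8.09, 1.04, 0.732),
}


def predictive_index(he4_pmol_l, ca125_u_ml, menopausal_status):
    """PI = b0 + b1 * ln(HE4) + b2 * ln(CA125) for the given menopausal status."""
    b0, b1, b2 = ROMA_COEFFICIENTS[menopausal_status]
    return b0 + b1 * math.log(he4_pmol_l) + b2 * math.log(ca125_u_ml)


def roma_percent(he4_pmol_l, ca125_u_ml, menopausal_status):
    """ROMA (%) = e^PI / (1 + e^PI) * 100, written in the equivalent logistic form."""
    pi = predictive_index(he4_pmol_l, ca125_u_ml, menopausal_status)
    return 100.0 / (1.0 + math.exp(-pi))


# Illustrative inputs only (not a patient from the study).
print(round(roma_percent(50.0, 30.0, "postmenopausal"), 1))
```

Because the logistic transform is monotonic, a higher HE4 or CA125 concentration always yields a higher ROMA value within the same menopausal stratum.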

LD and NLR

The AU LD reagent kit (Beckman Coulter, Inc, Brea, CA, USA) was used on the Beckman Coulter AU5800 to quantitate LD levels. Lactate and nicotinamide adenine dinucleotide (NAD) were converted to pyruvate and NADH in a reaction catalyzed by LD. NADH strongly absorbs light at 340 nm, whereas NAD does not. The rate of change of absorbance at 340 nm is directly proportional to LD activity in serum samples. Values were calculated automatically by the analyzer. The Siemens Advia 2120i Hematology System (Siemens Healthcare Diagnostics, Deerfield, IL, USA) was used to count total and differential white blood cells (WBCs). This flow cytometry–based system uses a combination of reactions that occur within the peroxidase and the basophil/nuclear lobularity channels. A cluster analysis of the cells within each channel was used to generate a cytogram in which the x-axis reflected nuclear complexity and the y-axis reflected cell size. We calculated NLR by dividing the neutrophil count by the lymphocyte count provided by this hematology system.

Statistical Analysis

Statistical analyses were performed using Analyse-it Method Evaluation Edition, version 2.26, software (Analyse-it Software Ltd., Leeds, UK), PASW version 18.0 (SPSS Inc, Chicago, IL, USA), and R statistical software (version 3.6.3, R Foundation for Statistical Computing, Vienna, Austria). Nominal and continuous variables were compared between groups using Pearson’s chi-square and Mann–Whitney U tests, respectively. The adjusted P-values were calculated using the Benjamini–Hochberg method¹⁴ for multiple tests. Variables that remained significant after Benjamini–Hochberg adjustment were included in receiver operating characteristic (ROC) analysis. ROC curves were plotted for ovarian cancer markers and their combinations with LD in order to assess their diagnostic ability to differentiate between ovarian cancer and control groups. The areas under ROC curves (AUCs) of ovarian cancer markers and their combinations with LD were compared. AUC values were interpreted as follows: no discrimination (AUC < .5), acceptable (.7 < AUC < .8), excellent (.8 < AUC < .9), and outstanding (.9 < AUC).¹⁵ Binary logistic regression analysis was used to calculate the predicted probability values of the combinations of ovarian cancer markers and LD, and these values were used to estimate ROC-AUCs, similar to the previously described method.¹⁶ The presence of ovarian cancer was used as the outcome, and the measured values of the ovarian cancer markers were used as predictor variables. P-values less than .05 were considered statistically significant. In addition, widely available machine learning algorithms that have been used for ovarian cancer markers, such as bagging, boosting, classification tree, random forest, support vector machine, and K-nearest neighbor algorithms,^10,11 were applied to our datasets to improve diagnostic performance in differentiating ovarian cancer from controls. The datasets were randomly separated into independent training and testing sets at a ratio of 7:3. Three-fold cross-validation was performed for machine learning analyses. When these machine learning algorithms were applied to our data, the values of markers were used as they stand.
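The 7:3 partition and three-fold cross-validation described above can be illustrated with a short sketch. The study performed these analyses in R; the Python below is a stand-in, and several details are assumptions rather than reported facts: that the split was stratified by diagnosis, that the folds were drawn from the training portion, and the seed value. The function names are hypothetical.

```python
import random


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx), sampling each class separately.

    Stratification is an assumption; the source only states a random 7:3 split.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def k_fold_indices(indices, k=3, seed=0):
    """Shuffle the indices and deal them into k folds of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[f::k] for f in range(k)]


# Group sizes from the study: 53 ovarian cancer (1) and 677 control (0) samples.
labels = [1] * 53 + [0] * 677
train_idx, test_idx = stratified_split(labels)
folds = k_fold_indices(train_idx, k=3)

print(len(train_idx), len(test_idx), [len(f) for f in folds])
```

Under these assumptions the testing set holds 16 cancer and 203 control samples, which shows how few positive cases are available for estimating test-set sensitivity.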

Results

Study Population Characteristics

The basic characteristics of our study cohort are shown in Table 1. All patients included in our study were female, and all patients in the ovarian cancer group were diagnosed with early stage disease (stage I, 81.1% and stage II, 18.9%). The median ages of patients in the ovarian cancer and non-cancer control groups were 54.0 and 49.0 years, respectively (P < .001). The proportion of patients with menopause was higher in the ovarian cancer group than in the control group (75.5% vs 50.5%, P = .003). Among cancer markers, the median values of ROMA (15.3% vs 6.0%, P < .001), HE4 (48.3 pmol/L vs 36.9 pmol/L, P < .001), and CA125 (27.4 U/mL vs 18.4 U/mL, P = .003) were significantly higher in the ovarian cancer group than in the control group. Among hematological laboratory results, NLR (2.9 vs 2.2, P = .008) differed significantly between the groups. Regarding routine chemistry, median LD values differed significantly between the ovarian cancer and non-cancer groups (202.0 IU/L vs 183.0 IU/L, P < .001).

Table 1.

Basic Characteristics and Laboratory Results Related to Ovarian Cancer of the Study Population.

Variable^a	Ovarian cancer	Non-cancer control	p-value^b
Age, years	54.0 (48.7-62.0)	49.0 (37.0-55.3)	<.001
Menopause	40 (75.5)	342 (50.5)	.003
BMI, kg/m²	23.4 (20.8-26.1)	22.9 (20.9-25.2)	.490
Cancer marker
ROMA, %	15.3 (5.4-87.8)	6.0 (3.5-10.5)	<.001
HE4, pmol/L	48.3 (34.5-210.2)	36.9 (30.9-45.8)	<.001
CA125, U/mL	27.4 (14.7-420.3)	18.4 (11.5-37.9)	.003
CEA, ng/mL	.8 (.1-1.4)	.7 (.3-1.3)	.607
Hematology
Hemoglobin, g/dL	12.6 (10.8-13.5)	12.8 (11.8-13.5)	.364
WBC, ×10⁹/L	6.3 (4.9-7.4)	6.3 (5.1-8.1)	.789
Neutrophil, %	66.9 (58.0-77.5)	63.6 (56.1-69.7)	.040
Lymphocyte, %	23.3 (14.7-32.5)	27.7 (21.9-33.5)	.027
Neutrophil-to-lymphocyte ratio	2.9 (1.8-5.5)	2.2 (1.7-3.2)	.008
Monocyte, %	4.2 (3.7-5.7)	4.7 (3.8-5.6)	.607
Monocyte-to-lymphocyte ratio	.2 (.1-.3)	.2 (.1-.2)	.027
Platelet, ×10⁹/L	275.0 (223.7-320.7)	255.0 (215.0-296.0)	.051
Chemistry
Creatinine, mg/dL	.6 (.6-.7)	.6 (.6-.7)	.963
Albumin, g/dL	4.3 (4.0-4.6)	4.5 (4.3-4.7)	.027
LD, IU/L	202.0 (176.7-261.7)	183.0 (165.0-208.0)	<.001
Smoking	2 (3.8)	48 (7.1)	.485

Abbreviations: BMI, body mass index; CA125, cancer antigen 125; CEA, carcinoembryonic antigen; HE4, human epididymis protein 4; LD, lactate dehydrogenase; ROMA, risk of ovarian malignancy algorithm; WBC, white blood cell.

^aData are expressed as median (first to third quartiles) or number (percentage).

^bAdjusted using the Benjamini–Hochberg method after Pearson’s chi-square test for nominal variables and the Mann–Whitney U test for continuous variables.
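The Benjamini–Hochberg adjustment named in footnote b can be sketched as follows. This is a generic illustration of the step-up procedure, not the authors' code; the function name is hypothetical and the input P-values are made-up numbers, not the unadjusted values behind Table 1.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw P-values for illustration only.
raw = [0.01, 0.04, 0.03, 0.005]
print([round(p, 3) for p in benjamini_hochberg(raw)])
```

Each raw P-value is multiplied by the number of tests divided by its rank, and the running minimum keeps a smaller raw P-value from receiving a larger adjusted value than a larger one.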

Performance of Single Ovarian Cancer Markers

The AUCs of ROMA, HE4, CA125, LD, and NLR for differentiating ovarian cancer from all other conditions were .707, .680, .643, .657, and .624, respectively. Among single markers, only ROMA showed acceptable performance based on AUCs. The study cohort was subsequently divided into premenopausal (n = 348) and postmenopausal (n = 382) groups. In sub-group analysis, the AUCs were .580 for ROMA, .589 for HE4, .540 for CA125, .586 for LD, and .609 for NLR in the premenopausal group, while the AUCs were higher in the postmenopausal group (.685 for ROMA, .684 for HE4, .693 for CA125, .635 for LD, and .623 for NLR).

The resulting ROC-AUCs with 95% confidence intervals (CIs) and the sensitivities at 75.0% specificity are summarized in Table 2. When the specificity was fixed at 75.0%, as recommended by the manufacturer of ROMA, the sensitivities of ROMA and HE4 were 60.4% and 54.7%, respectively. None of the markers had a sensitivity over 50.0% in the premenopausal group. In the postmenopausal group, the markers with sensitivities over 50.0% were ROMA (57.5%), HE4 (55.0%), and CA125 (52.5%).

Table 2.

Performance of Single Ovarian Cancer Markers.^a

Menopause state	Markers	ROC-AUC	Sensitivity^b (%)
Total	ROMA	.707 (.623-.792)	60.4 (46.0-73.6)
	HE4	.680 (.596-.765)	54.7 (40.5-68.4)
	CA125	.643 (.553-.733)	49.1 (35.1-63.2)
	LD	.657 (.577-.737)	49.1 (35.1-63.2)
	NLR	.624 (.533-.716)	45.3 (31.6-60.0)
Pre-menopause	ROMA	.580 (.411-.749)	30.8 (9.1-61.4)
	HE4	.589 (.423-.755)	38.5 (13.9-68.4)
	CA125	.540 (.372-.708)	30.8 (9.1-61.4)
	LD	.586 (.413-.759)	30.8 (9.1-61.4)
	NLR	.609 (.433-.786)	46.2 (19.2-74.9)
Post-menopause	ROMA	.685 (.577-.793)	57.5 (40.9-73.0)
	HE4	.684 (.586-.782)	55.0 (38.5-70.7)
	CA125	.693 (.590-.796)	52.5 (36.1-68.5)
	LD	.635 (.539-.730)	45.0 (29.3-61.5)
	NLR	.623 (.515-.732)	47.5 (31.5-63.9)

Abbreviations: CA125, cancer antigen 125; HE4, human epididymis protein 4; LD, lactate dehydrogenase; NLR, neutrophil-to-lymphocyte ratio; ROC-AUC, areas under the receiver operating characteristic curve; ROMA, risk of ovarian malignancy algorithm.

^aData are shown as value (95% confidence interval).

^bSensitivities at 75.0% specificities are presented.
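The two statistics reported in Table 2 can be sketched with plain Python. This is a minimal illustration under stated assumptions: higher marker values are taken to indicate cancer, a sample is called positive when its value is strictly above the cutoff, and the toy numbers are invented. The function names are hypothetical, and the statistical packages used in the study may interpolate the ROC curve differently.

```python
def roc_auc(cases, controls):
    """Probability that a random case scores above a random control (ties count 1/2)."""
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def sensitivity_at_specificity(cases, controls, target_specificity=0.75):
    """Sensitivity at the lowest observed cutoff whose specificity reaches the target."""
    for cutoff in sorted(set(cases) | set(controls)):
        # Controls at or below the cutoff are correctly called negative.
        specificity = sum(k <= cutoff for k in controls) / len(controls)
        if specificity >= target_specificity:
            return sum(c > cutoff for c in cases) / len(cases)
    return 0.0


# Invented marker values for illustration only.
toy_cases = [3, 5, 6, 8]
toy_controls = [1, 2, 4, 7]
print(roc_auc(toy_cases, toy_controls))
print(sensitivity_at_specificity(toy_cases, toy_controls))
```

Scanning cutoffs in ascending order returns the first one that meets the specificity target, which is also the one with the highest sensitivity among qualifying cutoffs.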

Performances of Combined Ovarian Cancer Markers

The performances of the combinations of ovarian cancer markers were evaluated because these 5 markers showed overlapping ROC curves. In particular, the combination of conventional cancer markers (ROMA, HE4, and CA125) with LD was examined because LD showed better performance than NLR among routinely prescribed laboratory results. The AUCs of ROMA with LD, HE4 with LD, CA125 with LD, and NLR with LD for differentiating ovarian cancers from all other conditions were .709, .692, .698, and .690, respectively. Regarding the combinations of 3 or more markers, the AUCs for distinguishing ovarian cancers from other conditions were .708 for ROMA + NLR + LD, .705 for HE4 + CA125 + LD, .698 for HE4 + NLR + LD, .690 for CA125 + NLR + LD, and .696 for HE4 + CA125 + NLR + LD. In sub-group analysis, the AUCs in the premenopausal group (.556 to .600) were lower than those in the postmenopausal group (.687 to .718).

The ROC-AUCs and sensitivities at 75.0% specificity of the combined markers are presented in Table 3. Among combinations of 2 to 4 markers, the sensitivities of the combinations with the best ROC-AUCs were as follows: 58.5% for ROMA + LD (AUC = .709) in the total cohort and 62.5% for HE4 + CA125 + LD (AUC = .718) in the postmenopausal group. No combination showed a sensitivity over 50.0% in the premenopausal group.

Table 3.

Performance of Ovarian Cancer Markers in Combination.^a

Menopause state	Markers	ROC-AUC	Sensitivity^b (%)
Total	ROMA+LD	.709 (.624-.794)	58.5 (44.1-71.9)
	HE4+LD	.692 (.611-.773)	58.5 (44.1-71.9)
	CA125+LD	.698 (.616-.780)	56.6 (42.3-70.2)
	NLR+LD	.690 (.608-.771)	50.9 (36.8-64.9)
	ROMA+NLR+LD	.708 (.621-.795)	58.5 (44.1-71.9)
	HE4+CA125+LD	.705 (.621-.789)	58.5 (44.1-71.9)
	HE4+NLR+LD	.698 (.615-.782)	52.8 (38.6-66.7)
	CA125+NLR+LD	.690 (.605-.775)	49.1 (35.1-63.2)
	HE4+CA125+NLR+LD	.696 (.610-.783)	52.8 (38.6-66.7)
Pre-menopause	ROMA+LD	.556 (.379-.734)	30.8 (9.1-61.4)
	HE4+LD	.584 (.406-.761)	30.8 (9.1-61.4)
	CA125+LD	.572 (.396-.749)	30.8 (9.1-61.4)
	NLR+LD	.600 (.414-.786)	38.5 (13.9-68.4)
	ROMA+NLR+LD	.583 (.399-.767)	30.8 (9.1-61.4)
	HE4+CA125+LD	.568 (.389-.747)	30.8 (9.1-61.4)
	HE4+NLR+LD	.600 (.411-.789)	46.2 (19.2-74.9)
	CA125+NLR+LD	.591 (.401-.781)	46.2 (19.2-74.9)
	HE4+CA125+NLR+LD	.593 (.402-.784)	46.2 (19.2-74.9)
Post-menopause	ROMA+LD	.694 (.589-.798)	60.0 (43.3-75.1)
	HE4+LD	.687 (.593-.781)	55.0 (38.5-70.7)
	CA125+LD	.708 (.614-.802)	55.0 (38.5-70.7)
	NLR+LD	.689 (.596-.781)	50.0 (33.8-66.2)
	ROMA+NLR+LD	.702 (.599-.805)	57.5 (40.9-73.0)
	HE4+CA125+LD	.718 (.623-.814)	62.5 (45.8-77.3)
	HE4+NLR+LD	.701 (.607-.796)	55.0 (38.5-70.7)
	CA125+NLR+LD	.703 (.606-.799)	52.5 (36.1-68.5)
	HE4+CA125+NLR+LD	.712 (.614-.809)	57.5 (40.9-73.0)

^aData are shown as value (95% confidence interval).

^bSensitivities at 75.0% specificities are presented.

Machine Learning Analysis of Ovarian Cancer Markers

Machine learning analyses, including classification tree, bagging, random forest, adaptive boosting (AdaBoost), support vector machine, and K-nearest neighbor algorithms, were performed. The presence of ovarian cancer was considered the dependent variable. The predictors were ROMA + LD for the total cohort and HE4 + CA125 + LD for the postmenopausal group, the combinations that presented the best AUCs with acceptable performance in the conventional combined-marker analysis (Figure 1(A) and (B)). We found that ROMA + LD classified by random forest showed the best AUC (.757, 95% confidence interval [CI] = .615-.898) among machine learning sets. Three-fold cross-validation was performed, and its sensitivity at 75% specificity was 68.8%. Additionally, ROMA alone was analyzed in the total cohort using these machine learning algorithms, based on its acceptable AUC in the conventional analysis. The best AUC of ROMA alone (.681 by AdaBoost) was lower than that of ROMA + LD (.757 by random forest) (Supplementary Figure 2). In the postmenopausal group, HE4 + CA125 + LD classified by these 6 algorithms yielded AUCs ranging from .500 to .648, which were not higher than those of conventional logistic regression. In the premenopausal group, machine learning analyses with these algorithms showed AUCs less than .500.

Figure 1.

Performance of combined markers classified using machine learning for predicting ovarian cancer. (A) ROC curves of ROMA + LD determined by classification tree, bagging, random forest, adaptive boosting, support vector machine, and K-nearest neighbor analyses for distinguishing ovarian cancer from the non-cancer controls, (B) ROC curve of HE4 + CA125 + NLR + LD in the postmenopausal group. The areas under the ROC curves (AUCs) of combined markers are presented in brackets. Abbreviations: AdaBoost, adaptive boosting; CA125, cancer antigen 125; HE4, human epididymis protein 4; KNN, K-nearest neighbor; LD, lactate dehydrogenase; ROC, Receiver operating characteristic; ROMA, risk of ovarian malignancy algorithm; SVM, support vector machine; Tree, classification tree.

Discussion

Here, diagnostic applications of single cancer markers and their combinations with LD were evaluated in Korean patients with early stage ovarian cancer. The diagnostic values of machine learning algorithms for these combinations of cancer markers were also examined.

In terms of single cancer markers, our data showed that ROMA, which incorporates CA125, HE4, and menopausal status, was the best marker (AUC = .707) for discriminating epithelial ovarian cancer from benign disease. Many studies have suggested that HE4 is likely more specific than CA125, the conventional ovarian cancer marker.^17,18 Consistent with our study, ROMA has also been suggested to be an effective diagnostic tool for the detection of ovarian cancer.^8,17,19 In contrast, ROMA and HE4 were not superior to CA125 in our postmenopausal group. The reported diagnostic performance of these markers has been controversial. Some studies revealed no benefit for ROMA.^20-22 A prospective validation study showed that ROMA and HE4 alone performed similarly to CA125 alone in the premenopausal group, whereas their performance was worse in the postmenopausal group.²³ In another retrospective study, perioperative CA125 alone was superior to ROMA and HE4 in predicting ovarian tumors based on ROC analysis.²⁴

In addition to these markers, we investigated whether LD, which is available from routine chemistry, can complement HE4, ROMA, and CA125. LD is an enzyme that plays a major role in anaerobic glycolysis and is related to the prognosis of patients with various cancers.²⁵ Serum LD levels in ovarian cancer patients were significantly elevated and were correlated with shorter survival time in previous reports.^26,27 Special AT-rich-binding protein 1, a global genome organizer, may reprogram energy metabolism in ovarian cancer by mediating LD levels, thus promoting metastasis.²⁷ Although only a few studies have covered LD in ovarian cancer patients, LD has been considered a potential biochemical marker because of its diagnostic accuracy with relatively high specificity.^26,27

NLR has been reported as a potent prognostic biomarker for progression-free survival and overall survival in ovarian cancer.^28,29 Regarding diagnostic utility, some studies demonstrated that preoperative NLR could differentiate ovarian cancer from benign ovarian masses.^30-32 In a recent study investigating NLR in a Korean cohort, the AUC for NLR was .709, which was higher than the AUC in our study (.624, 95% CI = .533-.716). The higher proportion of patients with advanced stages (52.9%) in that study, compared to our cohort of early stage patients, may account for this difference.³⁰

A recent review reported that the combination of HE4 with CA125 is a highly efficient tool for the diagnosis of ovarian cancer. This combination can bypass variations in HE4 derived from smoking or contraceptive drugs.⁷ The diagnostic performance of this combination with LD has seldom been reported, whereas the combination with NLR has been evaluated in some previous articles. These studies demonstrated that preoperative CA125 in combination with NLR would be more sensitive and cost-effective. Furthermore, this strategy could be conducted routinely for identifying ovarian cancers.^33,34

There have been a few studies applying machine learning algorithms for the detection of ovarian cancer.^9,10,35 A recent study demonstrated that machine learning algorithms enhanced biomarker specificity for several types of tumors, including ovarian cancer. K-nearest neighbor and classification tree were used to improve specificity, which is challenging in early detection of cancer by conventional serum biomarkers.³⁵ In another study, machine learning was applied to preoperative diagnosis and prognosis prediction in ovarian cancer based on blood biomarkers. The authors found that machine learning systems provided critical diagnostic and prognostic prediction before initial intervention.⁹ Song et al¹⁰ also adopted machine learning algorithms such as linear discriminant analysis and K-nearest neighbor for the early detection of ovarian cancer. Combinations of 3 or 4 markers, which included transthyretin and prolactin, revealed outstanding performance ranging from .91 to .95 for cancer detection. That study cohort included healthy controls, and the choice of serum biomarkers that are not routinely used in clinical settings generated differences compared to our study. The utilization of machine learning algorithms may facilitate personalized management and increase the number of treatment options for better outcomes through early stratification of patients.

This study had some limitations. Only a small number of ovarian cancer patients were available from the collected samples. In addition, restricting the analysis to early stage patients would bias the results toward lower diagnostic performance compared to other studies. However, evaluation of early stage patients is important because early differentiation is correlated with a better outcome. Furthermore, to the best of our knowledge, this is the first study to apply machine learning algorithms to combinations of ovarian cancer-specific markers with LD and NLR, which could be utilized in clinical practice. Additional studies with large sample sizes are necessary for the validation of our algorithm in ovarian cancer patients.

Conclusion

In conclusion, we evaluated the diagnostic benefit of HE4, CA125, and ROMA combined with LD to identify early stage ovarian cancer patients. Although a few published studies have discussed the usefulness of machine learning algorithms, no study has assessed the diagnostic performance of combinations of ovarian cancer markers with LD using machine learning algorithms for early stage ovarian cancer. The combination of ROMA and LD was acceptable for ovarian cancer patients, and classification by random forest was effective for the differential diagnosis of cancer. Our study provides information on the application of machine learning to combinations of practical biomarkers for patients with early stage ovarian cancer to facilitate appropriate patient management. Because our study results are based on a relatively small sample size of cancer patients, further studies including a larger number of ovarian cancer patients are needed to confirm our study findings.

Supplemental Material

Supplemental Material, sj-pdf-1-ccx-10.1177_10732748211033401 for Evaluation of Combined Cancer Markers With Lactate Dehydrogenase and Application of Machine Learning Algorithms for Differentiating Benign Disease From Malignant Ovarian Cancer by Seri Jeong, Dae-Soon Son, Minseob Cho, Nuri Lee, Wonkeun Song, Saeam Shin, Sung-Ho Park, Dong Jin Lee and Min-Jeong Park in Cancer Control

Footnotes

Acknowledgments

The authors would like to thank staff members of Abbott Diagnostics, Beckman Coulter, and Siemens Healthcare Diagnostics for their technical support.

Abbreviations

AdaBoost, adaptive boosting; BMI, body mass index; CA125, cancer antigen 125; CEA, carcinoembryonic antigen; HE4, human epididymis protein 4; LD, lactate dehydrogenase; NAD, nicotinamide adenine dinucleotide; NLR, neutrophil-to-lymphocyte ratio; PLR, platelet-to-lymphocyte ratio; RLUs, relative light units; ROC-AUCs, areas under the receiver operating characteristic curves; ROMA, risk of ovarian malignancy algorithm; WBC, white blood cell.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Research Foundation of Korea (NRF) grant, funded by the Korean government (Ministry of Science, ICT & Future Planning) [NRF-2017R1C1B2004597].

Ethics Statement

This study was approved by the independent Institutional Review Board of Kangnam Sacred Heart Hospital (HKS 2020-05-009) and was conducted in accordance with the Declaration of Helsinki. Moreover, the need for informed consent was waived because anonymity of personal information was maintained.

Supplemental Material

Supplemental material for this article is available online.
