Abstract
BACKGROUND:
Breast cancer is a common malignant tumor in women, and its clinical diagnosis currently depends mainly on imaging and measurement of serum marker levels.
OBJECTIVE:
To analyze the correlation between serum marker levels one week and six months after operation and the prognosis of patients with breast cancer, and to establish a support vector machine (SVM) model and evaluate its effectiveness.
METHODS:
A total of 168 patients diagnosed with breast cancer at the Affiliated Cancer Hospital of Zhengzhou University were enrolled; 46 relapsed within six months after operation and the other 122 did not. Serum CA153, CA125 and CEA levels at different time points were compared between the two groups. Diagnostic threshold values were calculated by receiver operating characteristic (ROC) curve analysis, and an SVM model was built.
RESULTS:
Serum marker levels differed significantly between the recurrence group and the non-recurrence group one week and six months after operation (
CONCLUSIONS:
Serum CA153, CEA and CA125 levels after operation have some guiding value for the prognosis of breast cancer patients, and the established SVM model has high clinical application value.
Introduction
Breast cancer is a common malignant tumor in women, and its clinical diagnosis currently depends mainly on imaging (used in the present study only as a specific diagnostic method) and measurement of serum marker levels. Many studies have shown that serum markers such as carbohydrate antigens (CA153, CA199 and CA125), carcinoembryonic antigen (CEA) and cytokeratin fragment antigen 21-1 (CYFRA21-1) are related to breast cancer [1, 2], and they are now widely used in its clinical diagnosis. Since breast cancer is usually found at an early stage and only a few cases have metastasis at first diagnosis, generally 3
Starting from three serum markers with high specificity and sensitivity, and drawing on the records of a number of patients with breast cancer and on clinical experience, we retrospectively studied recurrence after mastocarcinoma resection and its correlation with postoperative serum marker levels, aiming to predict patients’ postoperative outcome from the serum marker levels one week after resection.
Materials and methods
Subjects
One hundred and sixty-eight patients diagnosed with breast cancer who underwent resection at the Affiliated Cancer Hospital of Zhengzhou University from January 2013 to December 2014 were selected. Of these, 122 did not relapse within 6 months, with age of 52.1
Collection of specimens
Serum CA153, CEA and CA125 levels of the 168 patients were determined by enzyme-linked immunosorbent assay (ELISA) at three time points: before surgery, one week after surgery and 6 months after surgery. Patients with other tumors, infections, diabetes, pregnancy or other conditions that might raise serum marker levels were excluded.
Methods
Statistical analysis
SPSS 19.0 software was used to analyze the collected data. One-way ANOVA was used to analyze differences in serum CA153, CEA and CA125 levels at the three time points. ROC curves of each marker were then drawn to compare their sensitivity and specificity and to find the diagnostic threshold value for breast cancer recurrence. Measurement data were processed by
Establishment of SVM model
SVM, first proposed by Vapnik [6], has distinctive advantages in small-sample, nonlinear and high-dimensional pattern recognition, and it can be generalized to function fitting and other machine learning problems. Given limited sample data, it seeks the optimal compromise between model complexity and learning ability in order to obtain the best generalization capacity. In this study, all patients were divided into a recurrence group and a non-recurrence group according to their status six months after surgery; the recurrence group was labeled 1 and the non-recurrence group 0. Five indexes were included: the levels of CA153 one week and six months after the operation, the level of CA125 one week after the operation, and the levels of CEA one week and six months after the operation. Choosing an appropriate kernel function parameter and error penalty factor C is very important to the performance of the learning machine. The penalty factor C controls the punishment for misclassification. If C is infinite, all constraint conditions must be satisfied, which means that all training samples must be classified correctly, but this makes the algorithm more complex. Therefore, C should be chosen according to the practical application, taking the smallest value that still gives acceptable classification accuracy so as to obtain a relatively simple decision function; the parameters of the kernel function also need to be determined in advance. When establishing the SVM model, a parallel grid search method was used to screen the SVM parameters. The grid search algorithm gave M and N values to the penalty parameter C and kernel parameter
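As a minimal sketch of the grid search described above, the following Python snippet enumerates every pair drawn from M candidate values of C and N candidate values of the kernel parameter and keeps the best-scoring pair. The candidate grids, the name `kernel_values` and the `toy_score` function are hypothetical placeholders introduced for illustration; in the study the score would be the classification accuracy of an SVM trained with that pair, which is not reproduced here.

```python
from itertools import product
from math import log2


def grid_search(c_values, kernel_values, score):
    """Try every (C, kernel parameter) pair; return (best_score, C, kernel parameter).

    Pairs are visited with C ascending, and only a strictly better score
    replaces the current best, so ties go to the smaller C.
    """
    best = None
    for c, k in product(sorted(c_values), sorted(kernel_values)):
        s = score(c, k)
        if best is None or s > best[0]:
            best = (s, c, k)
    return best


def toy_score(c, k):
    # Hypothetical stand-in for SVM validation accuracy:
    # a smooth surface that peaks at C = 2, kernel parameter = 0.5.
    return 1.0 - 0.01 * ((log2(c) - 1) ** 2 + (log2(k) + 1) ** 2)


c_grid = [2.0 ** e for e in range(-5, 6)]       # M = 11 candidate values of C
kernel_grid = [2.0 ** e for e in range(-5, 6)]  # N = 11 candidate kernel values

best_score, best_c, best_kernel = grid_search(c_grid, kernel_grid, toy_score)
print(len(c_grid) * len(kernel_grid), best_c, best_kernel)
```

Breaking ties in favor of the smaller C follows the preference stated above for the simplest decision function that still classifies acceptably.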
Serum marker CA153, CA125 and CEA levels of collected patients
Note: There were no apparent differences in the three serum marker levels before the operation between the recurrence group and the non-recurrence group, while they were significantly different one week after the operation (
Area under the ROC curve of each single marker
Note: Serum CA153 and CEA levels one week after the operation had the highest sensitivity and specificity for recurrence.
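To illustrate how an area under the ROC curve and a diagnostic threshold can be obtained from one marker, the sketch below computes the AUC as the fraction of (recurrence, non-recurrence) pairs ranked correctly and picks the cut-off that maximizes the Youden index (sensitivity + specificity - 1). The marker values are invented for illustration, higher values are assumed to indicate recurrence, and the Youden index is one common criterion; the criterion actually used in this study is not stated here.

```python
def auc(pos, neg):
    """AUC as the share of (pos, neg) pairs where pos scores higher (ties count 0.5)."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def youden_threshold(pos, neg):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A sample is called positive when its value is >= threshold.
    """
    best_t, best_j = None, -1.0
    for t in sorted(set(pos) | set(neg)):
        sensitivity = sum(p >= t for p in pos) / len(pos)
        specificity = sum(n < t for n in neg) / len(neg)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical marker levels (U/mL), not data from this study.
recurrence = [25.0, 19.5, 30.2, 17.0, 22.1]
non_recurrence = [12.0, 15.5, 18.0, 10.2, 16.4, 14.1, 9.8]

auc_value = auc(recurrence, non_recurrence)
threshold, youden_j = youden_threshold(recurrence, non_recurrence)
print(round(auc_value, 3), threshold, round(youden_j, 3))
```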
The SVM model was established and then the two models were assessed from the following aspects.
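Assuming the conventional definitions of the three measures reported for the model (accuracy, sensitivity and specificity), the assessment can be written as:

```latex
\begin{align}
\text{Accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Sensitivity} &= \frac{TP}{TP + FN} \\
\text{Specificity} &= \frac{TN}{TN + FP}
\end{align}
```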
In those equations, the true positive number (TP) is the number of positive samples in the test set that were correctly judged positive; the true negative number (TN) is the number of negative samples in the test set that were correctly judged negative; the false positive number (FP) is the number of negative samples in the test set that were mistakenly judged positive; and the false negative number (FN) is the number of positive samples in the test set that were mistakenly judged negative.
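As a quick arithmetic check, the snippet below computes accuracy, sensitivity and specificity from these four counts. The counts are inferred from the fractions reported later in this paper for the test set (29/30 correct overall, 9/10 recurrence cases and 20/20 non-recurrence cases), which imply TP = 9, FN = 1, TN = 20 and FP = 0; the function name is illustrative.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,     # all correct calls / all samples
        "sensitivity": tp / (tp + fn),     # correct calls among true positives
        "specificity": tn / (tn + fp),     # correct calls among true negatives
    }


# Counts inferred from the reported fractions 29/30, 9/10 and 20/20.
metrics = classification_metrics(tp=9, tn=20, fp=0, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```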

ROC curves of each single marker. Note: AUCs of CA153 one week and six months after the operation were 0.825 and 0.664 respectively (
Levels of the three serum markers (CA153, CEA and CA125) at the three time points (before the operation, one week after the operation and 6 months after the operation) in the recurrence group and the non-recurrence group are listed in Table 1. According to the table, there were no apparent differences in the three serum marker levels before the operation between the two groups, while they were significantly different one week after the operation (
ROC curves of every marker
The diagnostic value for breast cancer recurrence was evaluated based on AUC, which is between 0.5

SVM training process. Note: The SVM model in this study included five indexes (the levels of CA153 one week and six months after the operation, the CA125 level one week after the operation, and the levels of CEA one week and six months after the operation).
The SVM model in this study included five indexes (the levels of CA153 one week and six months after the operation, the CA125 level one week after the operation, and the levels of CEA one week and six months after the operation). After verification, the ultimately determined optimal parameter (

SVM simulation results. Note: “
It has been clinically shown that CA153 and CA125 play a role in the early diagnosis and postoperative follow-up of breast cancer patients, and both are sensitive tumor markers [9, 10]. CA153 is a soluble form of MUC1 mucin, usually located in normal secretory epithelium, and it is currently regarded as the breast cancer marker with the best diagnostic capability [11]. Some studies suggest that serum CA153 content is related to metastasis in stage III breast cancer, because the level of CA153 in patients with liver metastasis is higher than that in the control group [12]. CA125 is widely applied in ovarian cancer diagnosis and also has high value in breast cancer diagnosis [13]. The positive rate of CA125 in breast cancer has been reported to be 20%, and CA125 is correlated with breast cancer recurrence [14]. However, the use of CA125 to detect breast cancer remains controversial [15]. CEA was the first tumor marker applied clinically and has been broadly used in tumor detection. An increasing serum CEA level has been reported to be a sign of poor prognosis in primary breast cancer [16] and to be related to the stage of breast cancer [17]. Meanwhile, serum CEA level is positively correlated with the patient’s condition [18]. CA153, CA125 and CEA are generally used in the clinical diagnosis of breast cancer and other tumors, and are often taken as prognostic indicators of some cancers [19, 20, 21, 22]. It has been shown that mastocarcinoma resection may have a dynamic effect on breast cancer foci [23], which may change the levels of serum markers through the promotion of tissue growth or other activities [24, 25, 26], providing a basis for further research.
This study retrospectively analyzed whether tumor marker levels before and after surgery can indicate the therapeutic effect in breast cancer patients six months after operation. Serum CA153, CA125 and CEA levels before the operation were higher than normal in both the recurrence group and the non-recurrence group, but there was no difference between the two groups. One week after the operation, however, there were apparent differences, which indicated that the three serum marker levels one week after the operation were significant for the patient’s prognosis. Six months after the operation, CA153 and CEA levels differed markedly between the two groups, while CA125 levels did not. According to the ROC curves of serum markers one week after operation, the diagnostic thresholds of CA153, CA125 and CEA for the two groups were 18.0 U/mL, 19.29 U/mL and 18.09
Although the ROC curve of each serum marker showed some diagnostic value, none of the areas under the curve exceeded 0.9, so single markers have limitations. Therefore, this study included five indexes (the levels of CEA and CA153 one week and six months after the operation, and the CA125 level one week after the operation) to establish the SVM model. The results showed that the accuracy of the model was 96.67% (29/30), the sensitivity 90.0% (9/10) and the specificity 100.0% (20/20), indicating good prediction efficiency. With its development, the SVM model has been widely used in the medical field [27, 28, 29]. Oliveira et al. [30] used an SVM model to analyze whether lumps were present in breast X-ray images and found an accuracy of 98.88%. Juneja et al. [31] selected 23 patients (8 sparse and 15 non-sparse) to study the sparsity of fibroglandular tissue in breast cancer patients and used CT data to establish an SVM model whose accuracy was up to 91%. Belekar et al. [32] studied inhibitors and non-inhibitors of the breast cancer resistance protein (BCRP) using SVM, k-nearest neighbor (k-NN) and artificial neural network (ANN) models, whose accuracies were 90.8%, 88.3% and 85.2%, respectively. In addition, Zhao [33] measured the levels of CEA, CA125 and CA15-3 in breast cancer patients after surgery and found that CEA had the highest accuracy, 72.9%. In summary, the SVM model has high accuracy and provides a more accurate basis for the future clinical diagnosis of breast cancer. SVM is expected to be an effective and practical tool for predicting breast cancer recurrence and for objectively evaluating therapeutic efficacy. Through SVM, recurrence in patients with breast cancer may be detected accurately so that patients can understand their own condition in time, which may help reduce mortality.
Conclusions
In this study, analysis of serum marker level changes in patients with breast cancer before and after mastocarcinoma resection suggests that serum CA153 and CEA levels one week and six months after operation, and the CA125 level one week after operation, have significant value for predicting recurrence, which provides a theoretical reference and experimental evidence for clinicians assessing the prognosis of patients with breast cancer. Moreover, the SVM prognostic model established in this study has high clinical value for the prognosis of breast cancer patients.
Prospect
This research mainly discussed the establishment of a predictive model of breast cancer recurrence and analyzed how to improve its accuracy, providing a new idea and a new method for the prevention and treatment of recurrence, but it is still insufficient for practical application. The predictive model constructed in this study included only five input variables; it can be studied further with more input variables, and other tumor markers should be taken into consideration in practical application. Once an application platform is built, the input variables of the prediction model could be selected on the basis of gradually accumulated data so as to improve its accuracy as much as possible. Because of the specificity of medical data analysis, the input variables are not limited to data variables, and finding suitable combinations with other medical data, including medical imaging and electrocardiograms, is important for raising the predictive accuracy for breast cancer recurrence and reducing its recurrence rate.
Footnotes
Acknowledgments
I thank all authors who have contributed to this paper for advice and comments and thank Affiliated Cancer Hospital of Zhengzhou University for providing research materials and experimental base. Project supported by the Natural Science Foundation of Henan Province, China (Grant No. 132300410076).
Conflict of interest
The authors declare that they have no competing interests.
