Abstract
Introduction
In patients affected by epithelial ovarian cancer (EOC), complete cytoreduction (CC) has been associated with better survival outcomes. Artificial intelligence (AI) systems have shown clinical benefit in several areas of healthcare.
Objective
To systematically assemble and analyze the available literature on the use of AI in patients affected by EOC to evaluate its applicability to predict CC compared to traditional statistics.
Material and Methods
Data search was carried out through PubMed, Scopus, Ovid MEDLINE, Cochrane Library, EMBASE, international congresses and clinical trials. The main search terms were: Artificial Intelligence AND surgery/cytoreduction AND ovarian cancer. Two authors independently performed the search by October 2022 and evaluated the eligibility criteria. Studies were included when data about Artificial Intelligence and methodological data were detailed.
Results
A total of 1899 cases were analyzed. Survival data were reported in 2 articles, with a predictive accuracy of 92% for 5-year overall survival (OS) and 73% for 2-year OS. The median area under the curve (AUC) for survival prediction was 0.62. The model accuracy for surgical resection, reported in two articles, was 77.7% and 65.8%, respectively, while the median AUC was 0.81. On average, 8 variables were entered into the algorithms. The most frequently used parameters were age and CA125.
Discussion
AI showed greater accuracy than logistic regression models. Survival predictive accuracy and AUC were lower for advanced ovarian cancers. One study analyzed the importance of factors predicting CC in recurrent epithelial ovarian cancer: disease-free interval, retroperitoneal recurrence, residual disease at primary surgery and stage were the main influencing factors. Surgical Complexity Scores proved more useful in the algorithms than preoperative imaging.
Conclusion
AI showed better prognostic accuracy than conventional algorithms. However, further studies are needed to compare the impact of different AI methods and variables and to provide survival information.
Introduction
A prognostic factor is defined as a patient characteristic that identifies subgroups of untreated patients having different outcomes, while a factor predictive of treatment effect is a patient characteristic that identifies subgroups of treated patients having different outcomes. For that reason, with the advent of a greater range of treatment options for ovarian cancer, prediction has become increasingly important.
Currently, complete surgical cytoreduction (CC) followed by platinum-based chemotherapy, with or without maintenance therapy with bevacizumab and/or PARP inhibitors (PARPi), is the standard treatment for patients affected by epithelial ovarian cancer. 1 In patients with a low probability of optimal primary surgical debulking, neoadjuvant chemotherapy (NACT) followed by interval debulking surgery increases the chance of optimal residual tumor. Optimal debulking is defined as a residual tumor of less than 1 cm after cytoreductive surgery, 2 but ideally the crucial point is to eliminate all macroscopic lesions, leaving no residual disease, a condition known as CC0. 3 The impact of post-operative residual disease on patient survival has been described in several studies, with significantly poorer survival outcomes.4-7 If sub-optimal debulking is likely, primary surgery should be avoided; 8 for this reason, identifying the patients in whom optimal cytoreduction cannot be achieved is of strategic importance.
For patients with recurrences, chemotherapy represents the gold standard of treatment. 9 Nevertheless, several studies have highlighted that secondary cytoreduction (SC) might improve the survival outcomes of selected patients, underscoring the need for selection algorithms to predict the possibility of achieving complete cytoreduction (CC).10-17 Different predictive scores have been proposed to assess the feasibility of CC in recurrent ovarian cancers. The Arbeitsgemeinschaft Gynäkologische Onkologie Descriptive Evaluation of preoperative Selection of (K)Criteria for Operability in recurrent Ovarian cancer (AGO OVAR DESKTOP) I and II trials 10 identified a predictive score for complete resection (AGO score) comprising good performance status (ECOG 0), complete resection at primary surgery (or alternatively, International Federation of Gynaecology and Obstetrics, FIGO, stage I/II), and ascites of less than 500 ml. The presence of all three factors, meaning a positive AGO score, was predictive of complete resection in 79% and 76% of patients. Similar sensitivity (79%) but low specificity (57.6%) were reported by Tian et al, 11 who analyzed six predictive factors (FIGO stage, residual tumor after primary surgery, disease-free interval (DFI), ECOG performance status, CA125 at recurrence, and ascites at recurrence). Angioli et al 12 also included HE4 in their predictive SeC-Score (Secondary Cytoreductive Surgery in Recurrent Ovarian Cancer), while the “Memorial Sloan Kettering (MSK) criteria” are based on the site of recurrence and DFI.13,14
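The three-factor AGO score described above can be read as a simple rule. The following is a minimal sketch under stated assumptions: the class and field names are hypothetical, and the "or alternatively FIGO stage I/II" clause is modeled as a plain logical OR, which is our reading of the sentence rather than the trial's formal definition.

```python
from dataclasses import dataclass


@dataclass
class RecurrentPatient:
    """Hypothetical record holding only the three AGO score inputs."""
    ecog: int                            # ECOG performance status (0 = fully active)
    complete_resection_at_primary: bool  # no residual disease at primary surgery
    figo_stage: str                      # initial FIGO stage: "I", "II", "III" or "IV"
    ascites_ml: float                    # estimated ascites volume in millilitres


def ago_score_positive(p: RecurrentPatient) -> bool:
    """Return True when all three AGO factors are present."""
    good_performance_status = p.ecog == 0
    # Assumption: early FIGO stage is accepted as an alternative to
    # complete resection at primary surgery (modeled as a logical OR).
    favourable_primary = p.complete_resection_at_primary or p.figo_stage in {"I", "II"}
    low_ascites = p.ascites_ml < 500
    return good_performance_status and favourable_primary and low_ascites


# Illustrative patient, not taken from any study
example = RecurrentPatient(ecog=0, complete_resection_at_primary=True,
                           figo_stage="III", ascites_ml=100.0)
print(ago_score_positive(example))
```

A rule of this kind is transparent but uses only three inputs, which is one reason the AI models reviewed here combine a larger set of variables.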
Artificial intelligence (AI) systems provide many benefits, including the ability to handle large data series and to cope with missing data items or the arrival of new data. AI approaches excel when there is no requirement for ‘the absolutely provably correct or best’ answer but, instead, for an answer that is better than the currently known one. 18 At the base of AI systems is the concept of an algorithm or decision tree, a process that, after examining the properties of the available elements or clusters of elements, can split a set of possible answers to a question into subsets corresponding to different test results. More complex decision models, based on a simple mathematical model associated with learning algorithms, include feed-forward analyses with input and output layers. These systems are able to weigh the importance of single and associated variables and use back propagation of the error as a learning rule during the training phase in order to obtain an acceptable output value. Synaptic weights are updated during training so that the model can finally predict the effect of each new variable added.
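To make the feed-forward and back-propagation description concrete, the sketch below trains a very small network with one hidden layer in plain Python. It is an illustration only and not the architecture of any reviewed study: the layer size, learning rate, squared-error loss and the toy data set (two made-up binary features and a binary label) are all assumptions.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Feed-forward pass; the last weight of each unit is its bias."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + row[-1])
              for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + w_out[-1])
    return hidden, output


def mean_squared_error(data, w_hidden, w_out):
    return sum((forward(x, w_hidden, w_out)[1] - y) ** 2 for x, y in data) / len(data)


def train(data, n_hidden=3, epochs=2000, lr=0.5, seed=0):
    """Online gradient descent with back propagation of the output error."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for x, y in data:
            hidden, out = forward(x, w_hidden, w_out)
            # Error signal at the output unit for the loss 0.5 * (out - y)^2
            delta_out = (out - y) * out * (1.0 - out)
            # Propagate the error back to each hidden unit (before updating w_out)
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                            for j in range(n_hidden)]
            # Update the "synaptic" weights of the output unit
            for j in range(n_hidden):
                w_out[j] -= lr * delta_out * hidden[j]
            w_out[-1] -= lr * delta_out
            # Update the weights of the hidden units
            for j in range(n_hidden):
                for i in range(n_in):
                    w_hidden[j][i] -= lr * delta_hidden[j] * x[i]
                w_hidden[j][-1] -= lr * delta_hidden[j]
    return w_hidden, w_out


# Toy data (hypothetical): label is 1 when at least one feature is present
TOY_DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

untrained = train(TOY_DATA, epochs=0)
trained = train(TOY_DATA, epochs=2000)
print("error before training:", mean_squared_error(TOY_DATA, *untrained))
print("error after training: ", mean_squared_error(TOY_DATA, *trained))
```

Because both calls use the same seed, they start from identical weights, so the two printed errors show the effect of training alone.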
AI has proven to have an enormous potential in many areas of healthcare.19-22 In addition, there is evidence that AI methods can have a better performance than traditional statistical methods in analyzing complex information derived from large datasets with multiple input variables. 23
The aim of this review is to systematically assemble and analyze the available literature on the use of AI in patients affected by EOC to evaluate its applicability to predict CC compared to traditional statistics by analyzing the strengths and current limitations of this method.
Material and Methods
We performed a systematic literature review following Cochrane’s review methods guide and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.
Data search was carried out through the following databases: PubMed, Scopus, Ovid MEDLINE, Cochrane Library, and EMBASE. Communications of international gynecology and oncology congresses and studies reported in ClinicalTrials.gov were also screened to identify relevant literature.
The main search terms were: Artificial Intelligence AND surgery/cytoreduction AND ovarian cancer. The search was supplemented with a comprehensive evaluation of the references of relevant articles and of related articles. It was not restricted by date but was limited to publications in English and French.
Two authors (GP and ML) independently performed the search by October 2022 and evaluated the eligibility criteria. The data were extracted by one author (ML) and checked by the other (GP) under the supervision of a third author (PZ). Studies were included when data about Artificial Intelligence and methodological data were detailed. We excluded studies in which Artificial Intelligence was used for purposes other than the prediction of complete cytoreduction.
The following data were extracted: author, year of publication, number of patients, median age of patients, study period, type of pathology, AI method used, predictive and method accuracy, mean absolute error (MAE), root mean squared error (RMSE), area under the curve (AUC).
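For readers less familiar with the extracted performance measures, the sketch below shows how MAE, RMSE and AUC are conventionally computed. The function names and the example values are illustrative assumptions and are not data from the included studies; the AUC is computed as the proportion of positive/negative pairs that the model ranks correctly, with ties counted as one half.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared difference; penalizes large errors more."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def area_under_curve(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Illustrative values only: 1 = complete cytoreduction achieved, 0 = not achieved
labels = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.2]  # hypothetical predicted probabilities
print(area_under_curve(labels, scores))  # 3 of 4 pairs ranked correctly -> 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect discrimination, which is the scale on which the AUC values reported below should be read.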
Results
The search of the databases and registers found 47 items. After the assessment of article eligibility based on the selection criteria, 6 articles were finally retained for the review24-29 (Figure 1). The data were extracted from the published manuscripts as descriptive information of the populations and details.
Figure 1. Identification and selection of studies.
A total of 1536 cases were analyzed in this review (range 98 to 668 per study). The median age was 63.3 years. The study periods differed across studies, with a median of 8.5 years (range 4 to 15 years). Three articles reported data on the use of AI in advanced epithelial cancers,25,26,28 two did not specify the stage of cancer24,29 and one article focused only on recurrences. 27
Different AI methods were used in the individual studies. In 4 out of 6 studies the authors reported using a single AI system to analyze the data: Laios et al 25 applied k-NN (k-Nearest Neighbor), Laios et al 26 applied the XGBoost Model (eXtreme Gradient Boosting Model), Bogani et al 27 applied an ANN (Artificial Neural Network), and Feng Y et al 29 applied MLDTA (Machine Learning Based Decision Tree Algorithm). Multiple systems were used in the studies by Enshaei et al 24 and Laios et al. 28
Only limited data on survival were available. In particular, Enshaei et al 24 reported a predictive accuracy of 92% for 5-year overall survival, while Laios et al 28 reported a predictive accuracy of 73% for 2-year overall survival. Similarly, the AUC for survival prediction was described in three articles24,28,29 with a median of 0.62 (range 0.42-0.74).
For the prediction of surgical resection, two articles reported a model accuracy of 77.7% 24 and 65.8%, 25 respectively. The median AUC for a residual tumor of 0 was 0.81.
Variables considered in AI algorithms. Abbreviations: MS, menopausal status; BMI, body mass index; PS, performance status; TOS, time of surgery; RT, residual tumor; DFI, disease-free interval; SCS, surgical complexity score; LM, lymph node metastases; CT, chemotherapy; LP, leucocyte proportion.
The number of variables chosen for the algorithm differed between studies. On average, 8 variables (range 6-12) were entered into the analyzed algorithms. The most frequently used parameters were patient age, present in all the algorithms, and CA125, used in 5 out of 6 algorithms. Other variables, such as menopausal status, lymph node metastasis and leucocyte proportion, were used more sporadically. Surprisingly, imaging tools including transvaginal ultrasound, Computed Tomography (CT) and Positron Emission Tomography (PET) were included in only 1 out of 6 algorithms.
Discussion
Primary Tumours
The decision to perform or to delay surgery is one of the most important clinical decisions in gynecologic oncology, since the surgery can be complex and affects not only patient survival but also quality of life following the intervention. In most patients, a relatively simple surgery consisting of hysterectomy, bilateral salpingo-oophorectomy, infracolic omentectomy, limited excision of retroperitoneal nodes and segmental resection of intestine is sufficient. 30 Radical surgery involves more extensive procedures that are associated with more peri-operative and post-operative complications. In 2006, Aletti et al 31 published a retrospective study on more than 194 patients that aimed to analyze the effect of aggressive surgical effort on survival. They observed that residual disease was the only independent predictor of survival and that radical surgery was superior to non-radical surgery in terms of overall survival.
Overall survival is the most important outcome to predict, since it gives the physician more information and helps improve the treatment of gynecologic oncological diseases. Conventional statistical algorithms described in the literature have shown poor prognostic accuracy.32,33 In their study of 668 cases analyzed with an ANN algorithm, Enshaei et al 24 obtained a survival predictive accuracy of 93% with an AUC of .74 in predicting complete cytoreduction. Compared with logistic regression models, AI showed greater accuracy in 75% of analyses thanks to its ability to predict whether survival would be above or below the median for each cohort, confirming that AI is a better predictor of outcome than a traditional statistical system. A similar survival predictive AUC (.69) was described in the study of Feng et al, 29 which explored the role of preoperative circulating leukocytes in predicting the survival prognosis of serous ovarian carcinoma. Another interesting finding of this paper is the relationship between the rise of monocytes and leucocytes and the worsening of prognosis, especially in terms of the monocytes-to-leucocytes ratio: it confirms the fundamental importance of the immune environment in EOC, and this parameter deserves more attention in AI algorithms.
Predictive accuracy and AUC were lower for advanced-stage EOCs (73% and .42, respectively) in the study of Laios et al. 28 The predictive information about cytoreductive surgery seems to be less influenced by the stage of the disease, with a model accuracy only slightly lower (65.8% 25 vs 77.7% 24 ) and a higher AUC (.87 26 vs .75 24 ). Advanced EOCs are characterized by a greater extent of disease, including at distant sites. Therefore, it is more difficult to accurately estimate the survival of these patients because of the increasing number of factors influencing it. This type of disease clearly requires a higher number of variables and more stringent patient inclusion criteria, which constitutes an interesting future perspective for the application of AI. Moreover, Laios et al 28 tried to build a model to estimate 2-year prognosis using different features, and the mean predictive accuracy of the ML model was 73%. Unfortunately, the available data on overall survival and disease-free survival are still insufficient because of the relatively recent introduction of these models.
It is not possible to compare the performance of AI models with one another, since insufficient data are available. Moreover, multiple AI models are combined in order to ensure a higher level of accuracy.
Recurrences
Recurrence is difficult terrain for the clinician, because it can vary widely in location and size, and clinical and surgical treatment must be customized for each patient. Recently, the DESKTOP III study demonstrated a statistically significant increase in overall survival in the subgroup of platinum-sensitive patients treated by optimal secondary cytoreduction and chemotherapy vs the subgroup treated by chemotherapy only (53.7 months vs 46 months). 15 These results have been confirmed by the SOC-1 trial, 16 while in GOG-0213 secondary cytoreduction in patients with platinum-sensitive recurrence of ovarian cancer did not result in an increase in OS compared to chemotherapy. 17
Unfortunately, at present no study reporting the impact of AI in recurrent EOC (rEOC) with a descriptive and complete approach has been published, which represents an interesting perspective for future research. Among the little information available on this subject, the retrospective study of Bogani et al 27 is the only one that analyzed, through an AI model, the importance of factors predicting CC in patients having secondary surgery for rEOC. A total of 194 patients were enrolled and evaluated using an ANN, and 82.9% of them achieved CC at primary surgery. DFI (importance: .231), retroperitoneal recurrence (importance: .178), residual disease at primary surgical treatment (importance: .138), and International Federation of Gynecology and Obstetrics (FIGO) stage at presentation (importance: .088) were the main factors influencing the possibility of achieving CC, while DFI was the most important variable influencing OS (importance: .306).
The authors underlined that the presence of retroperitoneal disease alone is associated with an increased ability to achieve CC in comparison with the presence of peritoneal disease. On the other hand, the presence of single or multiple peritoneal nodules and carcinomatosis had a limited impact on the possibility of obtaining CC, highlighting that the presence of carcinomatosis should not be considered a contraindication to secondary cytoreductive surgery. Also, in this study the site of recurrence (peritoneal vs. retroperitoneal) had no impact on survival outcomes.
Variables Analyzed
According to our review of the literature, the variables used in AI algorithms include data that influence the operability of the patient and the probability of complications when maximum surgical effort is made to obtain complete cytoreduction. Among these, age and performance status recur most often, together with BMI. Ovarian cancer is an extremely heterogeneous disease with a variety of histologic subtypes and a wide range of responses to treatment, so it is extremely difficult to select which patient characteristics, in addition to disease parameters, can be entered into an AI algorithm. Older age, comorbidities, and postoperative complications are known major risk factors for prolonged hospitalization, highlighting that efforts to optimize baseline functional status and minimize surgical complications may improve hospital discharge rates and postoperative functional status. 34 However, these parameters have not been included in any AI algorithm, probably because of inter- and intra-patient heterogeneity. Smoking habits, maintenance therapy with molecular antibodies, previous surgery and blood cell counts are also elements that can affect the surgeon in the operative phase and the subsequent survival of the patient.
Other routine data that are easily available to the clinician, and more closely related to the spread of the cancer and therefore to the difficulty of the intervention, are CA125, stage, histology and grade. HE4, CA125 and other serum markers have demonstrated a statistically significant association with optimal cytoreduction.35-39 Recently, blood biomarkers analyzed by machine learning and conventional systems have also been shown to provide diagnostic and prognostic information for patients with EOC before initial intervention. 40 Moreover, preoperative leucocyte proportion (LP), and in particular the monocyte-to-lymphocyte (MO/LY) ratio, has been associated with suboptimal cytoreduction. 41 In the study of Feng et al, 29 the MO/LY ratio was significantly increased in the blood of patients with EOC, and the authors concluded that the reduction in lymphocytes was related to a poor immunological response to the cancer.
CT scan findings such as large-volume ascites, diffuse peritoneal thickening, extensive omental involvement, lymphadenopathy and spleen, liver and intestinal involvement have been used for the prediction of suboptimal cytoreductive surgery, often in association with other factors.42-44 Diagnostic laparoscopy has the advantage of direct detection of tumor extension and is often used to ensure correct staging of ovarian cancer based on tumor size and/or location within the peritoneal cavity. The most frequently cited scores are the Peritoneal Carcinomatosis Index (PCI) for peritoneal carcinomatosis of all types,45,46 the Fagotti score 47 and the Aletti score: 48 a score higher than the specific cut-off favors neoadjuvant chemotherapy, bearing in mind that blocking points such as massive hepatic, stomach or intestinal invasion can have a significant impact even in the presence of a low score. Surgical Complexity Scores (SCS), including the previously cited scores, had the strongest positive influence in the study of Laios et al. 28 This result is consistent with previous literature, which attributes high accuracy to these scores.49,50 In our review of the literature, these parameters were used in AI algorithms more often than preoperative imaging (3/6 and 1/6 studies, respectively). This can be explained in different ways: on the one hand, imaging systems are less accurate than laparoscopy in defining the extent of the disease; on the other hand, adding AI to imaging can be complex because of the different anatomical structures of the abdominal cavity. Thus, radiological predictors should be included with caution.
The presence of residual tumor (RT) drastically reduced the survival advantage of surgery in the study of Laios et al 28 and was influenced by the time of surgery (TOS), meaning exposure or non-exposure to NACT. Even though RT and adjuvant CT are post-operative features, which cannot be included in a hypothetical pre-operative algorithm for patient selection, these parameters were considered in three and two studies, respectively, because of their impact on survival analysis.
Conclusions
To our knowledge, this is the first systematic review focusing on the use of AI for predicting the possibility of achieving CC in both primary and recurrent EOC and on the variables that are part of the individual algorithms.
The most important limitation of our study is the lack of descriptive and survival data. In addition, the AI methods used by different authors differ considerably in terms of the algorithms and parameters considered. Indeed, the lack of a study that delivers solid results, substantiated by rigorous criteria, performance measures and scores that comprehensively refer to AI, makes it difficult to draw conclusions about the actual usefulness of AI in this field. This is the main reason for the reluctance of clinicians to use AI instead of conventional statistics, along with the additional costs of installing and maintaining these systems.
AI has demonstrated better prognostic accuracy than conventional algorithms because of its capability to handle a greater amount of data and more complex interactions. However, the role of AI currently remains unclear, and further studies are needed to compare the impact of different AI methods and variables and to provide information about survival in order to improve the management of EOC patients.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
