Abstract
Background
Cytoreductive surgery is critical for optimal tumor clearance in advanced epithelial ovarian cancer (EOC). Despite best efforts, some patients may experience R2 (>1 cm) resection, while others may not undergo surgery at all. We aimed to compare outcomes between advanced EOC patients undergoing R2 resection and those who had no surgery.
Methods
Retrospective data from 51 patients with R2 resection were compared to 122 patients with no surgery between January 2015 and December 2019 at a UK tertiary referral centre. Progression-free survival (PFS) and overall survival (OS) were the study endpoints. Principal Component Analysis and Term Frequency – Inverse Document Frequency scores were utilized for data discrimination and prediction of R>2 cm from computed tomography pre-operative reports, respectively.
Results
No statistical significance was observed, except for age (73 vs 67 years in the no-surgery vs R2 group,
Conclusions
R2 resection and no-surgery cohorts displayed unfavourable prognosis with a notable degree of uniformity. When cytoreduction yields a suboptimal result, survival may still be better than in patients who undergo no surgery.
Plain language summary
The study examined outcomes in advanced epithelial ovarian cancer (EOC) patients who underwent either R2 (suboptimal) surgical resection or received no surgery at all at a UK tertiary referral centre. Sophisticated machine learning methodologies were used to analyze data patterns and predict the extent of resection (>2 cm) from pre-operative CT reports. Reasons for not undergoing surgery included older age, presence of other medical conditions, patient preference, progressive disease, patient decline, or lack of detectable intra-abdominal disease. Factors like serous histology and performance status influenced the risk of recurrence in both groups, while serous histology and adjuvant chemotherapy predicted the risk of death in the R2 group. Word sequences like “omental disease” and “reduced bulk” helped differentiate between R>2 cm and less extensive resections (R1-2 cm). In summary, both R2 resection and no-surgery groups had poor outcomes, but patients who underwent R2 resection generally had better survival compared to those who received no surgery, even when complete tumour removal was not achieved.
Keywords
Introduction
Epithelial ovarian cancer (EOC) ranks third amongst gynaecological malignancies and is responsible for at least 4000 deaths annually in the UK and 140 000 deaths worldwide. 1 The cornerstone of treatment for advanced EOC is a combination of surgical cytoreduction (encompassing procedures such as hysterectomy, bilateral salpingo-oophorectomy, and debulking to minimize residual tumour burden) and systemic chemotherapy, typically employing platinum-based agents, which is integral to eradicating residual malignant cells. 2 Tumour biology and the extent of residual disease post-surgery are identified as the strongest predictors of survival. 3 Pre-treatment CT remains the standard-of-care imaging for determining the extent of disease in these patients. Accurate reporting can facilitate treatment selection and planning. 4 Conventional narrative radiology reports often overlook crucial findings pertinent to the indication for examination, potentially limiting their value as a communication tool. 5
Cytoreductive surgery for advanced EOC aims to achieve optimal tumour clearance and improve the efficacy of subsequent chemotherapy. Achieving optimal cytoreduction in advanced tubo-ovarian cancer is associated with improved outcomes. 6 Nevertheless, optimal resection is feasible in only 50-70% of women with FIGO stage III/IV disease, owing to disease burden or involvement of critical structures. 7 In some cases, despite the best surgical efforts, patients may undergo an R2 resection (in which >1 cm of cancer tissue remains), while others may not undergo surgery at all.
Both R2 resection and no surgery at all represent challenging scenarios in the management of advanced EOC. Bulky disease, with or without surgical resection, shifts treatment goals toward palliation, symptom control, and optimization of quality of life. 8 The administration of systemic chemotherapy to these patients is a mainstay, but its effectiveness in controlling the disease remains unclear. A valid consideration lies in examining the comparative outcomes observed in these potentially homogeneous patient groups. This question has received scant attention in the international literature, given the absence of published studies demonstrating a feasible comparison between the two groups. We aimed to compare the outcomes of advanced EOC patients who underwent R2 resection following surgical cytoreduction with those of patients who had no surgery. The primary endpoints were progression-free survival (PFS) and overall survival (OS). Secondary endpoints included the reasons why EOC patients did not undergo cytoreductive surgery. The value of preoperative CT reports in predicting suboptimal outcomes was additionally examined.
Materials & Methods
Patient Selection
Retrospective data from 51 advanced EOC patients who underwent R2 resection were compared to those from 122 advanced EOC patients who had no surgery between Jan 2015 and Dec 2019 at a UK tertiary referral ESGO centre of excellence for ovarian cancer surgery. All data were retrieved from electronic health records (EHRs) and were prospectively registered in the internal ovarian database. The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Institutional Review Board (23/NE/0229/328779/12.01.24). The research was registered in the UMIN/CTR Trial Registry (UMIN000049480). Patients underwent CT examination within one month from treatment initiation and/or interval debulking surgery. Historically, CT examinations were reported by dedicated gynaecologic oncology multidisciplinary team (MDT) radiologists using simple structured report templates. All patients were discussed at the central gynaecological oncology MDT meeting prior to treatment decisions.
The main inclusion criteria were women 18 years or older with known or suspected advanced ovarian cancer who underwent CT Thorax Abdomen Pelvis (TAP) with intravenous (IV) or oral contrast media. Major exclusion criteria were: a) non-epithelial histology; b) synchronous non-ovarian primary tumours; c) early stage or borderline EOC; d) insufficient CT imaging quality to allow for measurements according to RECIST criteria. Reasons for exclusion from the surgical option were recorded as verified by EHRs. Clinical variables included patient age, Eastern Co-operative Oncology Group (ECOG) performance status (PS), histology type (serous and non-serous), administration of adjuvant chemotherapy, Body Mass Index (BMI) and pre-treatment CA125. Residual disease was assessed at the end of the laparotomy; more than 1 cm of residual disease was considered to indicate suboptimal resection (R2 resection). 9 Progression-free survival (PFS) was defined as the time from the date of diagnosis to the date of first recurrence. Overall survival (OS) was defined as the time from the date of diagnosis to the date of death.
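As a minimal sketch of how the two endpoints defined above could be computed, the following Python example converts dates into months from diagnosis and estimates median survival with a Kaplan-Meier product-limit estimator. The follow-up values, the censoring rule (patients without an event are censored at last follow-up), the average month length and all function names are illustrative assumptions, not the study's data or code.

```python
from datetime import date


def months_between(start, end):
    """Elapsed time in months, assuming an average month of 30.4375 days."""
    return (end - start).days / 30.4375


def kaplan_meier(times, events):
    """Return the Kaplan-Meier curve as a list of (time, survival) steps.

    events[i] is True when the event (recurrence or death) was observed
    and False when the patient was censored (assumed: at last follow-up).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all patients sharing the same time point.
        while i < len(data) and data[i][0] == t:
            observed += int(data[i][1])
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Hypothetical follow-up in months from diagnosis (invented, not study data).
times = [5, 8, 12, 20, 26]
events = [True, True, True, False, True]  # False = censored
curve = kaplan_meier(times, events)
median = median_survival(curve)
```

The same two functions would serve for PFS (event = first recurrence) and OS (event = death); only the event date changes.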
Principal Component Analysis (PCA)
Principal Component Analysis (PCA) of four common variables (age, histology, ECOG performance status (PS), type of chemotherapy) was employed to quantify the amount of data discrimination. The PCA is used to reduce the dimensions of multivariate data by selecting the data projection that maximizes the explained variance, allowing for visualization in lower dimensions while maintaining information present within the data. 10 More specifically, each of the four variables was normalized by subtracting its mean and dividing each of its samples by its standard deviation. The normalized variables were used to calculate the principal component transformation, which was then limited to two principal components. Discrimination was visualized by setting the colour of the projected data points on the scatter plot according to the class to which they belonged.
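The normalization and projection steps described above can be sketched as follows. This is a standard-library illustration only: the eight rows of patient data, the numeric coding of histology and chemotherapy type, and the use of power iteration with deflation to obtain the two leading components are assumptions made for the example; in practice a numerical library would normally be used.

```python
import math
import random


def standardize(rows):
    """Z-score each column: subtract its mean, divide by its standard deviation."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(d)]
    return [[(r[j] - means[j]) / sds[j] for j in range(d)] for r in rows]


def covariance(z):
    """Sample covariance matrix of already centred data."""
    n, d = len(z), len(z[0])
    return [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(d)]
            for a in range(d)]


def mat_vec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def top_components(cov, k=2, iters=2000, seed=0):
    """Leading k (eigenvalue, eigenvector) pairs via power iteration with deflation."""
    rng = random.Random(seed)
    mat = [row[:] for row in cov]
    pairs = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in cov]
        for _ in range(iters):
            w = mat_vec(mat, v)
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(a * b for a, b in zip(v, mat_vec(mat, v)))
        pairs.append((lam, v))
        # Remove the component just found before searching for the next one.
        mat = [[mat[i][j] - lam * v[i] * v[j] for j in range(len(v))]
               for i in range(len(v))]
    return pairs


# Invented rows: age, histology (1 = serous), ECOG PS, chemotherapy type (coded).
rows = [[73, 1, 2, 0], [67, 1, 1, 1], [81, 0, 3, 0], [59, 1, 0, 1],
        [70, 0, 1, 0], [76, 1, 2, 1], [64, 0, 1, 1], [78, 1, 3, 0]]
z = standardize(rows)
pairs = top_components(covariance(z))
# Project every patient onto the first two principal components.
projected = [[sum(a * b for a, b in zip(r, vec)) for _, vec in pairs] for r in z]
# After z-scoring, the total variance equals the number of variables.
explained = sum(lam for lam, _ in pairs) / len(rows[0])
```

Each entry of `projected` is a point for the scatter plot, to be coloured by cohort; `explained` is the share of variance captured by the two components.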
Natural Language Processing
The EHRs were queried to identify women with advanced EOC, including their operative notes. Large EHR datasets have shown the potential to increase understanding of real-world patient journeys, and to identify subgroups of patients who share the same disease label but differ in their outcomes and surgical requirements. 11 For textual analysis of CT reports, word frequencies were calculated using the most common words and n-grams. N-grams are word sequences of length N that carry more contextual information than single words.
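To make the notion of an n-gram concrete, here is a small Python sketch that tokenizes free text and counts bi-grams. The two report sentences are invented for illustration, and the simple lowercase tokenizer is an assumption; the study's own preprocessing is not specified at this level of detail.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphanumeric runs as tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """All contiguous word sequences of length n."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Invented report fragments (not taken from the study's CT reports).
reports = [
    "Solid omental disease with abdominopelvic ascites.",
    "Reduced bulk of omental disease; no ascites.",
]
# Punctuation is ignored, so bi-grams may span clause boundaries.
bigram_counts = Counter(g for r in reports for g in ngrams(tokenize(r), 2))
```

Calling `bigram_counts.most_common()` then lists the bi-grams from most to least frequent.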
Term Frequency – Inverse Document Frequency (TF-IDF) was utilized to vectorize the individual texts at n-gram level. The words within the documents were also transformed to better represent the actual n-gram frequency by disregarding any variations due to word conjugation. The vectorization operation involved the calculation of the TF-IDF score of each of the unique individual n-grams contained in the corpus of the CT notes. The TF-IDF metric quantifies n-gram importance, which is defined by the frequency at which the n-gram appears in the document adjusted for the frequency at which it appears in the rest of the documents. A high TF-IDF score for an n-gram in a document indicates that the n-gram is highly specific to that document. Initially, the frequency and the conceptual meaning of words and phrases were analysed. Subsequently, the TF-IDF vectors of the CT reports were used to predict suboptimal debulking through a logistic regression model. More specifically, the logistic regression (LR) classifier was trained to predict the R2 resection outcome. Analysis of CT reports from patients who underwent R2 resection may enhance our comprehension of the potential of pre-operative imaging to predict suboptimal surgical outcomes, distinguishing between R1-2 cm (an acceptable outcome) and R>2 cm resection. Standard performance metrics were calculated to measure discrimination. The reporting of this study conformed to STROBE guidelines. 12
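The pipeline described above (bi-gram TF-IDF vectors fed to a logistic regression classifier) can be sketched in plain Python as follows. Several details are assumptions for the sake of a runnable example: the four report snippets and their labels are invented, the TF-IDF variant is the unsmoothed tf × log(N/df), word normalization for conjugation is omitted, and the classifier is fitted by simple stochastic gradient descent rather than by whichever solver the study used.

```python
import math
import re
from collections import Counter


def bigrams(text):
    tokens = re.findall(r"[a-z]+", text.lower())
    return [" ".join(tokens[i:i + 2]) for i in range(len(tokens) - 1)]


def tfidf_matrix(docs):
    """Return (vocabulary, rows); rows[d][j] is the TF-IDF of bi-gram j in doc d."""
    grams = [bigrams(d) for d in docs]
    vocab = sorted({g for doc in grams for g in doc})
    doc_freq = Counter(g for doc in grams for g in set(doc))
    idf = {g: math.log(len(docs) / doc_freq[g]) for g in vocab}
    rows = []
    for doc in grams:
        counts = Counter(doc)
        total = len(doc) or 1
        rows.append([counts[g] / total * idf[g] for g in vocab])
    return vocab, rows


def predict(x, weights, bias):
    """Predicted probability of the positive class (here: R>2 cm)."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 / (1 + math.exp(-z))


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit logistic regression by stochastic gradient descent on the log-loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = predict(x, weights, bias) - y
            bias -= lr * err
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
    return weights, bias


# Invented snippets; label 1 = R>2 cm, label 0 = R1-2 cm (hypothetical labels).
docs = [
    "solid omental disease with abdominopelvic ascites",
    "large volume abdominopelvic ascites and solid omental cake",
    "reduced bulk of pelvic mass no ascites",
    "reduced bulk omental disease small pelvic mass",
]
labels = [1, 1, 0, 0]
vocab, rows = tfidf_matrix(docs)
weights, bias = train_logistic(rows, labels)
# Rank bi-grams by the absolute value of their coefficient.
ranked = sorted(zip(vocab, weights), key=lambda p: abs(p[1]), reverse=True)
```

Bi-grams with large positive coefficients push a report towards the R>2 cm class and those with large negative coefficients towards R1-2 cm, which is the quantity a coefficient-weighted word cloud would display.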
Statistics
Descriptive statistics were summarised by frequency and percentages for binary and categorical variables and means with standard deviation (SD) for continuous variables. Continuous variables were analysed using the
Results
Group Descriptive Statistics. Independent Samples

Principal Component Analysis plot on the four variables commonly shared between the R2 resection and non-surgery groups. The first two principal components accounted for 34% of the data variance.
The median PFS and OS were 12 and 14 months for the no-surgery group vs 14 months and 26 months for the R2 group (
Cohort survival outcomes. Kaplan-Meier curves demonstrating: (A) progression-free survival (B) overall survival, analysed by the two groups (blue = R2 resection; orange = non-surgery).
Hazard ratio and 95% confidence intervals for univariate Cox regression analysis: (A) recurrence and non-recurrence in the non-surgery group (B) recurrence and non-recurrence in the R2 resection group (C) fatal and non-fatal outcomes in the non-surgery group (D) fatal and non-fatal outcomes in the R2 resection group.
Hazard ratio (HR) and 95% confidence intervals (CIs) for prospective log-linear associations (Cox regression) from multivariate analysis: (A) recurrence and non-recurrence in the non-surgery group (B) recurrence and non-recurrence in the R2 resection group (C) fatal and non-fatal outcomes in the non-surgery group (D) fatal and non-fatal outcomes in the R2 resection group.


In the R2 group, the median PFS was 12 (CI 11-15) and 5 (CI 4-18) months for the serous and non-serous group, respectively (
We further interrogated the survival outcomes of patients without evidence of intra-abdominal disease who had no surgery (hereafter referred to as group G6) compared to those who had no surgery for all other reasons (group non-G6). The median PFS and OS were 12 and 22 months for the G6 group vs 7 months and 11 months for the non-G6 group (
Survival outcomes in the non-surgery group. Kaplan-Meier curves demonstrating: (A) progression-free survival (B) overall survival for patients without evidence of intra-abdominal disease (G6, blue) and those who underwent no surgery for other reasons (non-G6, orange).
Discrete bi-gram clouds weighted by the ranked absolute value of the logistic regression coefficients for the R>2 cm and R1-2 cm groups were identified (Figure 6). The bi-grams “solid omental” and “abdominopelvic ascites” were amongst those best discriminating between R>2 cm and R1-2 cm. The LR model performance for the discrimination prediction using TF-IDF 2-grams was modest (Precision: 0.54; F1 Score: 0.70; Accuracy: 0.54; AUPRC: 0.70).
2-gram word clouds from CT reports best discriminating R>2 cm resection (left) and R1-2 cm resection (right).
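The reported precision, F1 score and accuracy are linked by simple arithmetic, which the short Python sketch below works through. Because F1 = 2PR / (P + R), the recall R implied by a given precision P and F1 is R = F1·P / (2P − F1). The confusion-matrix counts used in the second half are hypothetical, chosen only because they round to the reported figures; they are not taken from the study.

```python
def recall_from_precision_f1(precision, f1):
    """Solve F1 = 2PR / (P + R) for the recall R."""
    return f1 * precision / (2 * precision - f1)


def metrics(tp, fp, fn, tn):
    """Standard classification metrics from confusion-matrix counts."""
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Recall implied by the rounded values reported in the text.
implied_recall = recall_from_precision_f1(0.54, 0.70)

# Hypothetical counts (not study data) that reproduce the rounded figures.
hypothetical = metrics(tp=7, fp=6, fn=0, tn=0)
```

Under these assumptions the implied recall is close to 1, the pattern expected when nearly every report is assigned to the positive class; precision and accuracy then both approach the proportion of positive cases.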
Discussion
In this retrospective study, we compared outcomes between two cohorts of patients with advanced EOC: those who experienced R2 resection and those who did not undergo surgery. Our analysis initially showed that the baseline characteristics of the two groups were well balanced. This finding laid the foundation for further analysis. Evaluation of the compared data showed that OS was more favourable in patients with R2 resection than in the non-surgery group. While R0 resection represents the best-case surgical scenario, there are instances where R2 resection becomes unavoidable even when surgeons exert substantial effort to eliminate the disease. Nevertheless, the more favourable outcomes in the R2 resection group compared to the non-surgery group could be explained, but equally confounded, by the administration of adjuvant chemotherapy and the lower median age. The apparent benefit from surgery could be attributed to the adequate resection of the main tumour bulk, resulting in a smaller, better vascularized residual tissue that offers a better chemotherapeutic response. 14 It can be argued that in nearly 65% of patients undergoing R2 resections, minimal surgical effort was exerted. It appears that experienced surgeons are adept at recognizing early on when optimal tumour clearance is unattainable, using surgical discretion to balance outcomes against potential added complications. Intra-operative decisions may be influenced by human factors as well as factual knowledge, aiming to maximize the magnitude of selected effort trade-offs. 15 On further analysis, chemotherapy was not administered to women with low-grade or mucinous histology, constituting approximately 10% of patients in the R2 group. Another 10% of chemotherapy-naïve patients in the R2 group either became unfit for chemotherapy, suffered severe wound infections that caused them to miss the window for treatment, or had unrecorded chemotherapy data.
Notably, serous histology persistently prevailed as an independent predictor of both survival endpoints in both sub-cohorts.
In principle, the study results are in line with those of other published studies showing benefits from submitting patients to debulking surgery despite the likelihood of R2 resection. 5 This highlights a surgical challenge: one should be confident about achieving a < R2 resection, or a significant negative impact on those women’s prognosis is to be expected. Ovarian cancer is not a singular disease; rather, it encompasses a variety of heterogeneous malignancies that share a common anatomical site. The serous subtype is predominant and appears to contribute significantly to the unequal share of fatalities across the EOC spectrum due to the absence of anatomical barriers to restrict direct spread to the adjacent organs. 16 That said, in the R2 group, our subgroup histological analysis indicated a potential trend towards lower recurrence rates among patients with serous histology compared to those with non-serous histology. Additionally, fatality was less favourable amongst serous patients than amongst non-serous patients, albeit the numbers were small; this may be attributable to the notably smaller population of non-serous patients, or to the anticipated chemotherapeutic unresponsiveness or aggressiveness of non-serous types like carcinosarcomas. 17
For the no-surgery group, the reasons for those women not to have surgery have been previously outlined. 18 For instance, a significant proportion of women treated with primary chemotherapy do not undergo interval surgery; these women are usually older and frailer. In our study, the non-surgical group was on average seven years older than the R2 resection group. Despite the age disparity, age was not an independent prognostic factor in the separate sub-cohort analysis. Their diminished survival could also reflect an aggressive underlying disease phenotype, as more than half of them had a poor response to chemotherapy. Amongst the non-surgical patients with serous EOC, a small proportion of women opted out of surgery. These women frequently cite fear of surgical risk or a wish to maintain their quality of life. An advisory and holistic approach for these patients is a key action that the gynaecologic oncology team should undertake, as it is highly likely that these patients are unaware of the consequences of their decision. 19 Nevertheless, they should be counselled that despite some anticipated extended survival with chemotherapy, the struggle to manage the disease with an early relapse can be relentless and demanding. Those who opt for the no-surgery route should be aware that cytoreductive surgery significantly extends OS but not PFS. Also, a significant proportion of women (32/51) showed no signs of detectable disease in the abdomen-pelvis during interim imaging. As a result, they were not considered candidates for surgery and instead continued with chemotherapy. Our findings indicate that patients without image-detectable intra-abdominal disease had better survival outcomes, regardless of stage, compared to those who were not candidates for or opted out of cytoreductive surgery. This suggests a more favourable effect of chemotherapy or potentially indicates lower biological tumour aggressiveness.
In truth, many of these cases present with equivocal findings regarding residual disease, including lesions that are too small to measure or not easily detected by reproducible imaging techniques, uncertain identification of new lesions, or necrosis within existing lesions. Thus, guidance is provided to clearly define “unequivocal progression” of non-measurable or non-target disease. 20 This considerable proportion of women underscores a critical question previously addressed by the RECIST Working Group during the development of their updated criteria: whether it is appropriate to transition from an anatomic unidimensional assessment of tumour burden to either volumetric anatomical assessment or functional assessments using Positron Emission Tomography (PET) or Magnetic Resonance Imaging (MRI). 21 To add to the debate, in certain tumour types such as EOC, peritoneal mesothelioma and mucinous appendiceal tumours, there is a higher incidence of occult disease within visually normal peritoneum compared to others like colorectal and gastric cancer. 22 It is speculated that ‘target regions’ such as the greater omentum, umbilical round ligament and falciform ligament are more prone to hosting occult disease in such patients. For selected patients with advanced EOC without visible disease, resection of these target areas may be considered. 23 Herein, we surmise that EOC patients with no image-detectable intra-abdominal disease should still receive “box standard” or low surgical effort interval debulking surgery.
Another key idea of our study was to use the Term Frequency - Inverse Document Frequency (TF-IDF) score to discriminate and, to some extent, predict the specific surgical outcomes in the R2 resection group from the pre-operative imaging examinations. The TF-IDF scores, coupled with modelling techniques, provide insights into the relationship between specific in-text n-grams and the dependent variable of the analysis; however, these results must be qualitatively verified and tested for generalizability. CT images have previously shown their potential to evaluate surgical outcomes. 24,25 It is estimated that CT can be up to 80% accurate in assessing the degree of peritoneal carcinomatosis. 26 The primary limitations include lesion size, ascites, and technical challenges such as inadequate visualisation of bowel surfaces and mesenteric tumour implants, as well as difficulty distinguishing between parietal diaphragmatic and visceral liver carcinomatosis. 27 Collective efforts to predict suboptimal resectability based on preoperative imaging methods have been summarised in a recent expert narrative review. 28 In advanced EOC, structured imaging parameters can be used to create a radiological Peritoneal Carcinomatosis Index (CT-PCI) score to aid in surgical planning. 29 An association between CT-PCI score and pre-surgical CA125 has been proposed but does not correspond to histological subtypes. 30 In essence, the ideal imaging modality for assessing non-resectability does not yet exist. The 2023 ESMO–ESGO-ESP Consensus Conference on Ovarian Cancer recommended abdominal contrast-enhanced (CE) CT, MRI or whole-body PET-CT with the radiotracer 18F-fluorodeoxyglucose (FDG) as valid imaging modalities in the initial work-up. 28 The power of radiological assessment can be increased by its association with diagnostic laparoscopy (18) or mini-periumbilical laparotomy for early intra-operative evaluation. 31
One of the main arguments is that conventional free-text radiology reports often overlook key findings pertinent to the indication, making them questionable as communication tools. Radiological reports featuring predefined templates serve as reminders for radiologists to address specific areas, especially those posing challenges for resection or deemed unresectable. 32 Lately, a synoptic report documenting the occurrence of disease in 45 anatomical sites relevant to ovarian cancer distribution improved the completeness of pre-treatment CT reporting but added 30 min to the report turnaround time. 25
To address this, we explored the integration of sui generis natural language processing (NLP) tools to extract valuable information from the unstructured pre-operative imaging textual reports. We previously demonstrated the feasibility of extracting textual data from unstructured operative notes to predict R0 resection following advanced EOC surgical cytoreduction. 33 We employed the RoBERTa classifier, and we focused on the discrimination between R1-2 cm and R>2 cm, taking advantage of the expertise gained during the extensive pre-training process. We identified two specific bi-grams that were most strongly indicative of a suboptimal postoperative outcome (>R2 resection). The occurrence of the bi-grams “solid omental” and “abdominopelvic ascites” in these reports should alert surgeons and improve MDT communications between oncologists and radiologists, with due consideration of the accuracy and reliability of the predictions. These word sequences are not ambiguous to surgeons. They could potentially become pre-defined response options to support the development of local synoptic reports. As the current paradigm shifts towards treatment personalization, these n-grams, if validated in larger cohorts, can potentially serve as linguistic biomarkers to aid in the selection of the most appropriate EOC treatment. We acknowledge that model performance was far from satisfactory. This could be secondary to the small sample size. Nonetheless, the model could grasp contextual nuances extending across multiple words, even if they were not in sequential order. Consequently, it is conceivable that local information, which was not evident through straightforward TF-IDF analysis, would be crucial for discrimination prediction. Pending further research, this information can be exploited in the standardisation process of MDT quality assessment (in which the surgeon’s perception of suboptimal debulking serves as an MDT reference) with the goal of improving quality.
That said, our study did not intend to reflect any individual practice.
To the best of our knowledge, no similar recent study has compared outcomes between these specific patient cohorts. The results were generated in a tertiary centre of excellence for ovarian cancer surgery. Key features of such centres include high-quality infrastructure and high levels of expertise allowing for treatment centralisation. 34 Fully curated data constituted a major strength of this study, as previously described. Another strength of the study was the routine use of CT imaging in conjunction with a state-of-the-art NLP methodology, at no additional cost to the patient, to predict suboptimal surgical outcomes. Moreover, we demonstrated how intelligent use of narrative information could discern patterns, identify critical junctures, and potentially target areas for improvement in the continuum of EOC care. Our analytical approach to textual data may help healthcare professionals to improve the overall trajectory of patient outcomes in the context of EOC management.
However, some limitations exist. Firstly, the study was designed retrospectively, so we could not intervene in or manipulate any variable. This design was inevitable, as it would be unethical to conduct a randomised study in which surgeons deliberately carry out R2 resection in one arm. Secondly, it is a single-institution study, which may limit its generalizability or external validity. Thirdly, survival rates in the non-surgical group might have improved had routine early BRCA testing, alongside standardised salvage interventions using bevacizumab and maintenance PARP inhibitors, been consistently applied throughout the study period. Continued research examining the efficacy of immune checkpoint inhibitors in the treatment of advanced EOC holds promise, but current results indicate modest success. 35 We recognize that the ECOG-PS data for the no-surgery group were incomplete and therefore not included in the report. If these data were available, they could offer additional insight into the disparity in survival between the two groups.
Conclusion
The R2 resection and no-surgery cohorts exhibit unfavourable prognosis with a notable degree of uniformity. When cytoreduction yields a suboptimal result, survival is still better than in patients who undergo no surgery. Accurate MDT patient selection for cytoreductive surgery, guided by accurate pre-operative imaging, continues to be crucial in effectively addressing the needs of these women.
Footnotes
Acknowledgments
Sincere thanks to all the staff caring for our ovarian cancer patients at Leeds Teaching Hospitals.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
