Abstract
Introduction
Urinary bladder cancer is the 4th most common cancer in men and the 12th in women. 1 Most bladder cancers are diagnosed as nonmuscle invasive, but approximately 20% present with muscle invasion. 2 Nonmuscle invasive tumors are treated by transurethral resection and intravesical drug therapy. 3 Approximately 20% of the initially nonmuscle invasive tumors will progress to muscle invasive disease. 4 Muscle invasion is diagnosed when tumor cells infiltrate the thick muscle bundles of the muscularis propria layer. The assessment of muscularis propria invasion is crucial to the management of urothelial carcinoma because it is the main crossroad toward a more aggressive treatment (surgical and/or chemoradiation therapy).5,6 Transurethral resection of bladder tumor (TURBT) specimens are examined by pathologists for tumor type, grade, and invasion. These specimens usually contain a large amount of tissue that needs to be examined for muscularis propria invasion, and this can be a challenging process for pathologists. Artificial intelligence (AI) has the potential to be an accurate and time-efficient technology that can aid pathologists in diagnosing muscularis propria invasion. AI has already shown promising results in several areas of pathology, such as Gleason grading for prostate adenocarcinoma,7–11 melanoma scoring, 12 and more. In previous studies we developed and applied a new algorithmic approach called hierarchical contextual analysis (HCA) for the detection of perineural invasion in pancreatic adenocarcinoma, 13 and also for the detection of ganglion cells, a process of significant importance for improving the accuracy of Hirschsprung's disease diagnosis. 14 This approach allows the development of accurate algorithms even with relatively small training cohorts.
Using this novel approach, in the present study, we were able to show that machine learning techniques, combined with expert pathologists’ insights, yield an accurate tool for detection of muscularis propria invasion.
Methods
Ethical approval was obtained from the local studies ethics committee of our institution (approval no. 0660-16-TLV).
All methods were performed in accordance with the relevant guidelines and regulations. This is a cohort study. The reporting of this study conforms to STROBE guidelines. 15 We have de-identified all patient details to ensure the confidentiality of patient information.
Clinical Samples
The material used in this research was derived from formalin-fixed, paraffin-embedded tissue. Hematoxylin and eosin stained slides were scanned using the Philips UFS scanner (Koninklijke Philips, Amsterdam, The Netherlands) at ×40 magnification. The proprietary ISYNTAX format was converted to TIFF format using the Philips IntelliSite pathology solution program, version 3.2, and annotations were performed manually using a locally developed annotation tool. The pathological ground truth was based on hospital records. All specimens had previously been reviewed by a senior pathologist with expertise in genitourinary pathology.
Algorithmic Approach and Training
Following the scanning of 50 whole TURBT specimens, 925 images were selected randomly for training the algorithm (training cohort). The algorithm builds on our experience of working with slides of various tissues for various tasks. That experience shows that directly applying off-the-shelf methods yields insufficient results in terms of segmentation accuracy, gross errors, and the ability to rely on a limited amount of data. We therefore developed HCA; its full description is outside the scope of the current article.
Development proceeded in 2 phases. In the first phase, we used the pathologist's insights and explanations together with raw data, without any annotations, to create the algorithm framework, which integrates convolutional neural networks (CNN) with decision processes inspired by expert knowledge. The framework was trained in a fully unsupervised way on all relevant available data. At this stage we set the framework and the degrees of freedom of the algorithm and tailored some parameters to the raw data. A U-Net CNN structure was used as a first approximation of the desired result.
For the second phase, manual segmentation of muscularis propria and tumor structures was performed by 2 pathologists on the 925 images. Unmarked areas were considered negative for tumor and muscularis propria. We then tailored the system to these annotations by training the deep neural networks. Using an off-the-shelf backend did not yield sufficient results, and we therefore modified the networks through an iterative process during the construction of the HCA algorithm. Data augmentation was designed to make the algorithm more robust; part of the augmentation used a generative adversarial network (GAN).
Subsequently, the algorithm was run on these same images and we determined the intersection over union (degree of overlap between the algorithm and the pathologist markings), the detection rate and the false alarm rate.
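The three metrics can be illustrated with a minimal sketch. The article does not give formal definitions, so the ones used here are assumptions: all three are computed per pixel, the detection rate is taken as the fraction of pathologist-marked pixels that the algorithm also marks, and the false alarm rate as the fraction of algorithm-marked pixels that the pathologist did not mark. The function names and the `threshold` parameter are hypothetical and are not part of the HCA implementation.

```python
from typing import Dict, List

ConfidenceMap = List[List[float]]  # per-pixel algorithm confidence in [0, 1]
Mask = List[List[bool]]            # per-pixel pathologist annotation


def binarize(confidence: ConfidenceMap, threshold: float) -> Mask:
    """Mark a pixel as detected when its confidence reaches the threshold."""
    return [[value >= threshold for value in row] for row in confidence]


def segmentation_metrics(
    confidence: ConfidenceMap, truth: Mask, threshold: float = 0.5
) -> Dict[str, float]:
    """Compute IOU, detection rate, and false alarm rate for one structure.

    Assumed definitions (not taken from the article):
      iou              = TP / (TP + FP + FN)
      detection_rate   = TP / (TP + FN)
      false_alarm_rate = FP / (TP + FP)
    """
    predicted = binarize(confidence, threshold)
    tp = fp = fn = 0
    for predicted_row, truth_row in zip(predicted, truth):
        for is_predicted, is_true in zip(predicted_row, truth_row):
            if is_predicted and is_true:
                tp += 1
            elif is_predicted:
                fp += 1
            elif is_true:
                fn += 1
    union = tp + fp + fn
    return {
        "iou": tp / union if union else 0.0,
        "detection_rate": tp / (tp + fn) if (tp + fn) else 0.0,
        "false_alarm_rate": fp / (tp + fp) if (tp + fp) else 0.0,
    }


if __name__ == "__main__":
    # Toy 2x3 patch; values are illustrative only.
    toy_confidence = [[0.9, 0.8, 0.1], [0.7, 0.2, 0.6]]
    toy_truth = [[True, True, False], [False, False, True]]
    for t in (0.25, 0.5, 0.75):
        print(t, segmentation_metrics(toy_confidence, toy_truth, t))
```

Calling the function over a range of thresholds, as in the loop at the end, gives the kind of metric-versus-confidence-threshold curves that the training and validation figures report.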
Analytical Validation
A validation cohort was generated to test the algorithm, consisting of 97 additional labeled images from 10 new specimens that were not used in the training cohort. The algorithm was run on these images, and the intersection over union, detection rate, and false alarm rate were determined.
Clinical Validation
TURBT specimens (n = 127) with a total of 617 slides were used for the clinical validation cohort. The number of slides per specimen ranged from 1 to 36. For each case, the algorithm presented the 20 areas most suspicious for muscularis propria invasion. To do this, the algorithm first identified tumor and muscularis propria against a set confidence level, assigning a score of 0 to detections below that level and 1 to detections above it. Afterward, the 20 images with the smallest distance between muscularis propria and tumor were presented. A pathologist with expertise in genitourinary pathology analyzed these images and determined whether muscularis propria invasion was present or not in each case.
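The selection step described above can be sketched as follows. This is an illustration under stated assumptions, not the HCA implementation: detections are reduced to points with a confidence value, "distance" is taken as the smallest Euclidean distance between any confident tumor point and any confident muscularis propria point in the same image, and a detection exactly at the threshold is counted as passing. The names `Detection`, `Candidate`, and `rank_candidates` are hypothetical.

```python
import math
from typing import List, NamedTuple, Tuple

Point = Tuple[float, float]


class Detection(NamedTuple):
    x: float
    y: float
    confidence: float  # algorithm confidence in [0, 1]


class Candidate(NamedTuple):
    image_id: str
    tumor: List[Detection]
    muscle: List[Detection]  # muscularis propria detections


def confident_points(detections: List[Detection], threshold: float) -> List[Point]:
    """Keep detections scored 1 (at or above threshold); drop those scored 0."""
    return [(d.x, d.y) for d in detections if d.confidence >= threshold]


def min_distance(a: List[Point], b: List[Point]) -> float:
    """Smallest Euclidean distance between two point sets (inf if either is empty)."""
    if not a or not b:
        return math.inf
    return min(math.dist(p, q) for p in a for q in b)


def rank_candidates(
    candidates: List[Candidate], threshold: float = 0.5, top_k: int = 20
) -> List[str]:
    """Return up to top_k image ids, ordered by tumor-to-muscle proximity."""
    scored = []
    for candidate in candidates:
        distance = min_distance(
            confident_points(candidate.tumor, threshold),
            confident_points(candidate.muscle, threshold),
        )
        # Images lacking a confident tumor or muscle detection are not presented.
        if math.isfinite(distance):
            scored.append((distance, candidate.image_id))
    scored.sort()
    return [image_id for _, image_id in scored[:top_k]]
```

The default `top_k=20` mirrors the 20 images per case described in the text; the default threshold of 0.5 is only a placeholder for the set confidence level, which the article does not specify for this step.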
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request, pending institutional review board approval. The study was performed from November 2022 to September 2023.
Results
Training Cohort
After training on 925 fully labeled images, the algorithm was run on the same 925 images. Algorithm performance showed an intersection over union (IOU) of 85% for muscularis propria and 80% for tumor at the algorithm confidence threshold of 50%. The detection rates were 80% for muscularis propria and 90% for tumor, and the false alarm rates were 30% for muscularis propria and 20% for tumor, all at the same confidence threshold (Figure 1).

Figure 1. Training set graphs: (a) Intersection over union: the percentage of overlap between the pathologist's and the algorithm's muscle and tumor annotations, plotted against the confidence threshold. (b) Detection rate: the percentage of muscle and tumor that the algorithm detected, plotted against the confidence threshold. (c) False alarm rate: the percentage of muscle and tumor that the algorithm falsely identified, plotted against the confidence threshold.
Analytical and Clinical Validation
The analytical and clinical validations are 2 different parts of the research. In the analytical validation, the analysis was done at the patch level (not the entire slide), and we evaluated the accuracy of the segmentation, that is, how close it was to our manual segmentation. The analytical validation showed that the analytical accuracy of the algorithm was suboptimal, so we added the element of proximity between the tumor cells and the muscle fibers. In the clinical validation, the analysis was performed on whole slides and on all the slides of each case, and the output was a set of small images of areas suspicious for muscularis propria invasion. Here we did not assess the accuracy of the segmentation but rather the ability of the algorithm to “bring to the surface” the invasion areas.
Analytical validation: The algorithm was evaluated on 97 additional manually annotated images. At the 50% confidence threshold, the IOU was 78% for muscularis propria and 80% for tumor. The detection rate was 72% for muscularis propria and 65% for tumor, and the false alarm rate was 54% for muscularis propria and 23% for tumor (Figure 2).

Figure 2. Validation set graphs: (a) Intersection over union: the percentage of overlap between the pathologist's and the algorithm's muscle and tumor annotations, plotted against the confidence threshold. (b) Detection rate: the percentage of muscle and tumor that the algorithm detected, plotted against the confidence threshold. (c) False alarm rate: the percentage of muscle and tumor that the algorithm falsely identified, plotted against the confidence threshold.
These results were still not sufficient for clinical application, because the false alarm rates were not low enough. Analysis of the results showed multiple reasons for false alarms. One of the main reasons for muscularis propria false alarms was thick-walled blood vessels (Figure 3a). Tumor false alarms were caused by areas of mildly thickened urothelium with reactive changes (Figure 3b).

Figure 3. Examples of false alarms: (a) Thick-walled blood vessels were marked by the algorithm as muscle events, causing a muscle false alarm. (b) Surface urothelium with mild thickening and reactive changes was marked by the algorithm as a tumor event, causing a tumor false alarm.
In order to improve the algorithm's accuracy we decided to make the algorithm perform in a more logical way by mimicking the thought process of the pathologist. The algorithm was, therefore, modified to identify places where tumor and muscularis propria events were in close proximity to each other.
Clinical validation: For the clinical validation, 127 full TURBT specimens with a total of 617 slides were analyzed. For each of these specimens, the algorithm was instructed to provide the pathologist with the 20 images in which tumor and muscularis propria events were in nearest proximity. Of these 127 specimens, 17 were muscle invasive. The remaining specimens included 43 noninvasive urothelial carcinomas (29 low grade and 17 high grade), 17 specimens with invasion into lamina propria, 6 specimens of urothelial carcinoma in situ, and 44 specimens with no malignancy. The algorithm detected 16 of the 17 specimens with muscle invasive urothelial carcinoma: it provided images of muscularis propria and tumor events close to each other, and the pathologist was able to diagnose muscularis propria invasion based on these images. The one case missed by the algorithm was a nested subtype of urothelial carcinoma, which is characterized by unusually bland tumor cells arranged in nests (Figure 4).

Figure 4. Nested subtype of urothelial carcinoma, characterized by unusually bland tumor cells lacking the nuclear features of classic urothelial carcinoma. Tumor nests invade thick muscle bundles of muscularis propria.
The time for analyzing the images provided by the algorithm ranged between 1 and 50 s. The time for diagnosing muscularis propria invasion by the pathologist based on these images ranged from 1 to 34 s, with an average time of 5.62 s (Figure 5a). The image number at which muscularis propria invasion was detected ranged from 1 to 15 (median 2). In 50% (8 of 16) of the specimens in which the algorithm identified muscularis propria invasion, the pathologist made the diagnosis from the first image provided by the algorithm (Figure 5b).

Figure 5. (a) The time for diagnosing muscularis propria invasion by the pathologist based on the images provided by the algorithm. (b) Number of the image that was diagnostic for muscularis propria invasion.
Interestingly, in 15 of the 16 specimens with muscularis propria invasion, the diagnostic image number was 5 or less; in the remaining case, the muscle invasive tumor was presented by the algorithm in image number 15. This particular case was a urothelial carcinoma with squamous differentiation, in which nucleated keratin was presented as a muscle false alarm (Figure 6).

Figure 6. Example of urothelial carcinoma with squamous differentiation, where the nucleated keratin was labeled by the algorithm as a muscle event.
Discussion
The diagnosis of muscularis propria invasion in urothelial carcinoma has a significant impact on treatment. It is one of the main tasks for pathologists in the evaluation of TURBT specimens. Detection of muscularis propria invasion is a time-consuming process for pathologists, who carefully examine tens and sometimes even hundreds of tissue fragments from a bladder tumor, and small foci of muscle invasion can still be missed. Moreover, it can be challenging to diagnose muscle invasion confidently, since there are pathologic findings that may mimic muscle invasion in urothelial carcinoma of the bladder, such as desmoplastic stromal reaction and hypertrophic muscularis mucosa. 16
AI is being applied in the field of pathology 17 to aid pathologists in tedious tasks that require the identification of events that sometimes may be small and focal. Examples include mitotic count,18,19 tumor infiltrating lymphocytes,20,21 perineural invasion, 22 and others. In this study we developed an accurate and efficient algorithm for identification of muscularis propria invasion in urothelial carcinoma, which significantly reduced the diagnosis time and workload. Of note, our algorithm does not determine whether muscularis propria invasion is present or absent. Rather, it identifies and presents suspicious areas, and it is for the pathologist to decide, based on these areas, whether there is muscularis propria invasion.
The HCA method that we used in this study enabled us to eliminate the need for large datasets by training the algorithm to mimic the way that pathologists think: it detected muscularis propria and tumor separately and then identified areas where these 2 events were in nearest proximity. This allowed the pathologist to diagnose muscularis propria invasion in a significantly shorter time, a few seconds instead of tens of minutes using conventional means.
In the training and analytical phases, the false alarm levels for detection of muscularis propria and tumor were not low enough, and following analysis we found that one of the reasons for this was the muscular walls of blood vessels. In addition, tumor false alarms were the result of urothelium with reactive changes. These events were mostly seen as standalone events, since reactive urothelial changes are present in the surface urothelium, and the blood vessels are more prominent in the lamina propria which is located just below the surface urothelium. The inclusion of the proximity factor between tumor and muscularis propria events allowed the algorithm to accurately and efficiently provide the pathologist with images of areas with muscle invasive tumor.
Our algorithm was able to accurately identify specimens with muscularis propria invasion, but there were a few limitations that are worth mentioning, since they underline the value of our approach, in which it is not the algorithm that makes the decision and the pathologist has the final word in determining muscularis propria invasion. Close proximity between muscle and tumor events does not necessarily mean that there is muscle invasion. Two separate fragments of muscularis propria and tumor may be very close on the glass slide, and still the tumor will not be muscle invasive (Figure 7a). The identification of muscularis propria invasion requires the experience and insight of the pathologist to recognize the infiltrative architecture of the tumor: tumor cells need to invade and dissect through the thick muscle bundles in order for invasion of muscularis propria to be diagnosed (Figure 7b). In addition, not all urothelial carcinomas look the same under the microscope, and there are urothelial carcinomas with divergent differentiation, in addition to different subtypes. 23 Urothelial carcinoma with squamous differentiation shows foci of malignant squamous epithelium, which can be keratinized. The pink keratinized areas along with the nuclei of the cells can mimic muscle bundles. One of the 17 specimens with muscle invasion was urothelial carcinoma with squamous differentiation. The algorithm mistakenly labeled the squamous keratin as a muscle event and provided the pathologist with multiple images of these areas that were muscle false alarms. This particular case was muscle invasive, and the algorithm provided other images with tumor invading muscularis propria, from which the pathologist was able to diagnose muscle invasion. The number of the diagnostic image (15) was considerably higher than for the remaining specimens with muscularis propria invasion (5 or less).

Figure 7. Tumor and muscle events in nearest proximity are not always diagnostic of muscularis propria invasion. (a) Two adjacent but separate fragments of tumor and muscle close to each other, with no invasion of tumor into the muscle. (b) Tumor and muscle events in close proximity with muscularis propria invasion. The infiltrative pattern of tumor cells dissecting through the muscle is diagnostic of muscularis propria invasion.
There was one case of urothelial carcinoma with muscularis propria invasion that was missed by the algorithm. This was a case with an uncommon subtype called nested urothelial carcinoma.24–26 Two of the main features of malignant tumor cells that are present in the majority of cancers, including bladder cancer, are nuclear pleomorphism and hyperchromasia. The nested subtype of urothelial carcinoma breaks this rule, as it is characterized by unusually bland tumor cells that look uniform and lack pleomorphism and hyperchromasia. This is a rare subtype of urothelial carcinoma, which was not included in the training cohort. Training the algorithm to identify urothelium with benign features could have an added value for specimens with rare subtypes such as the nested subtype.
The diagnosis of urothelial carcinoma invasive to muscularis propria requires the identification of thick muscle bundles infiltrated by malignant cells. In the lamina propria, there are thin muscle bundles called muscularis mucosa. Tumor invasion of these thin bundles is not considered muscle invasive disease and is not an indication for aggressive surgical and/or chemoradiation therapy. Usually, muscularis mucosa can be easily identified by the small size of its bundles and their location adjacent to blood vessels in the lamina propria. Rarely, muscularis mucosa bundles may become hypertrophic. Currently, there are no guidelines that set a thickness threshold for muscle bundles to distinguish between muscularis mucosa and muscularis propria, and if tumor involves muscle bundles of medium size, it can be very challenging for the pathologist to distinguish between tumor involving hypertrophic muscularis mucosa and tumor invading muscularis propria. In this rare scenario, the pathologist emphasizes that tumor invades muscle bundles of intermediate size and that the distinction between hypertrophic muscularis mucosa and muscularis propria cannot be made with certainty. Additional biopsy is strongly recommended. Our study did not include specimens with this uncommon scenario.
Conclusion
The AI-based algorithm developed in this study is an excellent tool that can aid pathologists in identifying muscularis propria invasion in urothelial carcinoma in a time-efficient and accurate manner. The HCA algorithmic approach used in this study enabled this even with the use of small datasets. Nevertheless, varied training cohorts are needed to cover the entire spectrum of the different histologic morphologies and subtypes of urothelial carcinoma, and to address challenging specimens with ambiguous muscle invasion.
Footnotes
Abbreviations
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethics Statement
The study was approved and informed consent was waived by the local ethics committee at Tel-Aviv Sourasky Medical Center (ethics approval #TLV-16-660).
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
