Abstract
Introduction
Urinary bladder cancer (BCa) is one of the most common urological malignancies among males.1–4 Clinical treatment strategies for BCa patients differ between non-muscle-invasive lesions (stage T1 or lower) and muscle-invasive lesions (stage T2 or higher).5–7 With the rapid development of magnetic resonance (MR) imaging and image analysis technology, non-invasive radiomics based on MR images has been widely recognized for identifying the state of muscle infiltration in patients with BCa,8–11 and accurate segmentation of bladder tumor masses has become a necessary step for non-invasive staging, which is of critical importance for the clinical management of patients with BCa. 12
Currently, the standard reference for the segmentation of BCa on MR images is manual delineation based on experts' visual perception, which is a tedious process that requires considerable human effort and time. An accurate and efficient segmentation method is therefore highly desirable. However, bladder tumor segmentation faces substantial challenges, including the strong intensity inhomogeneity of the tumor region and its overlap with the bladder wall on MR images.
In the field of BCa segmentation, Duan et al 13 proposed an adaptive window-setting scheme in 2012 to automatically localize and roughly segment the BCa region on T1-weighted MR images. Inspired by this study, Xiao et al 14 adopted a two-step method, first using the coupled directional level-set (CDLS) method to obtain the entire bladder area, and then using a new morphological feature and the spatial fuzzy c-means (sFCM) algorithm to segment the BCa area. Because this method relies only on gray-level information and unsupervised clustering, its segmentation performance is not stable across different data, and its accuracy needs to be improved.
Afterward, Dolz et al 15 proposed a dilated progressive convolution-based UNet model (DPC-UNet) to simultaneously segment the bladder wall and tumor region on T2W MR images. The DPC-UNet model achieved a favorable performance for bladder wall segmentation, with an average Dice similarity coefficient (DSC) of 0.984 and 0.839 for the inner and outer borders of the bladder wall, respectively. Nevertheless, the performance for tumor segmentation was far from satisfactory, with an average DSC of only 0.686, indicating that the end-to-end deep learning approach might not sufficiently extract the image characteristics needed to differentiate the bladder tumor from its surrounding wall tissue.
Based on the above analysis, we propose a supervised segmentation method that first partitions a candidate region, then extracts pixel-level features of the tumor and normal wall tissue within it, and finally constructs a random forest (RF)-based segmentation model 16 to obtain more accurate segmentation results.
Methods
The proposed method consists of four main steps. The first step is the automatic extraction of the candidate region by using the CDLS algorithm; this region contains the entire tumor lesion and the neighboring wall tissue for further accurate tumor segmentation. In the second step, two groups of regions of interest (ROIs), comprising all tumor ROIs and all wall ROIs, are defined in the training set for feature extraction. The third step covers pixel-level feature extraction, feature selection, and RF-based segmentation model development. Finally, the tumor is segmented from the candidate regions previously defined in the testing set by using the developed model. The entire methodological outline of the present study is shown in Figure 1.

The entire methodological outline of the present study. a. The strategy for the automatic determination of the candidate region containing the entire tumor tissue and its neighboring wall tissue for further segmentation; b. The schematic pipeline for the accurate segmentation of the tumor tissue from the wall tissue in the candidate region by using the supervised learning strategy with an RF classifier. CDLS = coupled directional level-set; mRMR = maximum relevance and minimum redundancy; RF = random forest; DSC = Dice similarity coefficient; ASSD = average symmetric surface distance.

Composition of the LM and Gabor filter banks. a. The LM filter bank: 18 first-order and 18 second-order derivative-of-Gaussian filters (6 orientations and 3 scales), 8 Laplacian-of-Gaussian filters, and 4 Gaussian smoothing filters; b. The Gabor filter bank with five scales and eight orientations.
Subject Enrollment
The present study was approved by the Medical Ethical Committee of Xijing Hospital (First Affiliated Hospital of Air Force Medical University (Fourth Military Medical University)) (Approval number KY20194009). Informed consent was waived because of the retrospective nature of this study. In this study, 56 patients with postoperatively confirmed BCa were included. For patients with multifocal tumors, the highest tumor stage was taken as the pathological stage. Prior to any treatment, all patients underwent a 3.0 T T2W MR scan (GE Discovery MR 750). Primary imaging parameters included: slice thickness 1 mm, repetition time (TR) 2500 ms, echo time (TE) 135 ms, and pixel size 0.5 × 0.5 mm². These patients were then randomly divided into a training set (40 subjects) for model development and a testing set (16 subjects) for performance assessment, at a ratio of approximately 7:3.
Considering that the intermediate tumor layer always contains the most information for the preoperative BCa diagnosis by radiologists, most of the published articles focused on the application of radiomics with this tumor layer for preoperative staging and grading of BCa.8,11,17–19 Therefore, in the present study, the intermediate tumor layer was chosen from each patient's MRI data as the target for the proposed segmentation method.
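As a minimal sketch of the random patient-level split described above, the following assumes hypothetical patient identifiers 1 to 56 and an arbitrary random seed; neither is taken from the study. It only illustrates that a 40/16 split corresponds to roughly 71% and 29% of the cohort.

```python
import random

def split_patients(patient_ids, n_train, seed=0):
    """Randomly split patient IDs into training and testing lists."""
    ids = list(patient_ids)
    rng = random.Random(seed)      # fixed seed (an assumption) keeps the split reproducible
    rng.shuffle(ids)
    return sorted(ids[:n_train]), sorted(ids[n_train:])

# Hypothetical IDs 1..56 standing in for the enrolled patients.
train_ids, test_ids = split_patients(range(1, 57), n_train=40)
train_fraction = len(train_ids) / 56   # about 0.71, i.e. close to 7:3
```

Splitting at the patient level, rather than at the pixel level, keeps all pixels of one patient in the same set.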
Definition of the Candidate Region
For the collected T2W MR images, the coupled directional level-set (CDLS) method was first used to segment the inner and outer surfaces of the bladder. 20 A potential field and its streamlines were then generated between the two surfaces by solving the Laplace equation. 21 The thickness of the bladder wall was defined as the arc length of the streamline connecting a point on the inner surface to its corresponding point on the outer surface,14,22 as shown in the middle of Figure 1a. The bent rate difference between the paired points reflects the bladder anomaly caused by the tumor lesion. 14 Starting from these abnormal points, all the pixels on the corresponding streamlines constituted a candidate region, 14 which contains the entire tumor tissue and its neighboring wall tissue for further segmentation, as shown on the right of Figure 1a.
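The potential field between the inner and outer surfaces can be illustrated with a simple numerical sketch. The example below is not the authors' implementation: it assumes a synthetic circular "bladder wall" on a 41 × 41 grid, fixes the potential at 0 inside the inner surface and 1 outside the outer surface, and relaxes the wall pixels with Jacobi iterations. The function name and the iteration count are hypothetical.

```python
import numpy as np

def laplace_potential(wall, inner, outer, n_iter=2000):
    """Jacobi relaxation of the Laplace equation on the wall pixels.

    wall, inner, outer: boolean 2D masks. The potential is held at 0 on
    `inner` and at 1 on `outer`; only `wall` pixels are updated.
    The image border is assumed to belong to `outer`.
    """
    u = np.zeros(wall.shape)
    u[outer] = 1.0
    u[wall] = 0.5                      # initial guess inside the wall
    for _ in range(n_iter):
        avg = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0)
                      + np.roll(u, 1, 1) + np.roll(u, -1, 1))
        u[wall] = avg[wall]            # each wall pixel becomes the mean of its 4 neighbours
    return u

# Synthetic annulus standing in for a segmented bladder wall.
yy, xx = np.mgrid[:41, :41]
r = np.hypot(yy - 20, xx - 20)
inner, outer = r < 6, r > 14
wall = ~(inner | outer)

u = laplace_potential(wall, inner, outer)
gy, gx = np.gradient(u)                # streamlines follow this gradient direction
```

In this setting, a streamline can be traced by stepping along the normalized gradient from an inner-surface point until the outer surface is reached; the accumulated step length then approximates the wall thickness.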
Two ROI Groups Definition in the Training set
After the candidate region was determined, the next key task was the accurate extraction of the tumor tissue from the wall tissue. To achieve this, three main steps were implemented: i) definition of two ROI groups (tumor tissue and wall tissue) in the training set, ii) pixel-level feature extraction and selection, and iii) RF-based segmentation model building and performance assessment, as shown in Figure 1b.
For the ROI definition, two groups of ROIs were carefully defined in the training set: i) the tumor region enclosed by the red circle in the top left image of Figure 1b, which contains all the tumor tissue, and ii) the circular region enclosed by the green circle in the top left image of Figure 1b, which contains all the normal wall tissue. All pixels in the tumor ROIs were then labeled "1", and all pixels in the normal wall ROIs were labeled "0".
Pixel-Level Feature Extraction
To finely separate the tumor tissue from the wall tissue in the candidate region, pixel-level discrimination and classification were considered a more precise approach. To this end, pixel-level features were extracted from the two groups of ROIs previously defined in the training set. In the testing set, features were extracted from the entire candidate region, which is the union of the tumor and wall regions, as shown in the bottom left of Figure 1b.
Based on the intermediate tumor layer MR images, three feature categories were extracted from each pixel on the MR images, including (i) 20 intensity-based features, (ii) 48 Leung-Malik (LM) filter bank-based features, 23 and (iii) 40 Gabor filter bank-based features;24,25 the details are shown in Table 1. Therefore, a total of 108 pixel-level features were obtained from each pixel of the intermediate tumor layer. These features were used as a vector to quantitatively characterize the property of each pixel in the bladder tumor segmentation process.
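To make the filter bank-based features more concrete, the following sketch builds a Gabor bank with five scales and eight orientations (40 filters) and computes one response per pixel and filter. The wavelengths, the sigma-to-wavelength ratio, the kernel size, and the function names are all assumptions chosen for illustration; the study's actual filter parameters are not given in the text. The filtering uses FFT-based circular convolution for brevity.

```python
import numpy as np

def gabor_kernel(wavelength, theta, sigma, size=15):
    """Real part of a Gabor filter: a Gaussian envelope times a cosine carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * x_rot / wavelength)
    return kernel - kernel.mean()      # zero mean, so flat regions give ~0 response

def gabor_bank(n_scales=5, n_orient=8):
    kernels = []
    for s in range(n_scales):
        wavelength = 3.0 * 2 ** (0.5 * s)          # assumed scale progression
        for k in range(n_orient):
            theta = np.pi * k / n_orient
            kernels.append(gabor_kernel(wavelength, theta, sigma=0.56 * wavelength))
    return kernels

def filter_responses(image, kernels):
    """Per-pixel responses of shape (H, W, n_kernels), using circular convolution."""
    h, w = image.shape
    f_img = np.fft.rfft2(image)
    out = np.empty((h, w, len(kernels)))
    for i, ker in enumerate(kernels):
        kh, kw = ker.shape
        pad = np.zeros((h, w))
        pad[:kh, :kw] = ker
        pad = np.roll(pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))  # centre kernel at origin
        out[:, :, i] = np.fft.irfft2(f_img * np.fft.rfft2(pad), s=(h, w))
    return out

bank = gabor_bank()
n_gabor = len(bank)                    # 5 scales x 8 orientations
n_total = 20 + 48 + n_gabor            # intensity + LM + Gabor features per pixel
```

Each pixel's row in the response array can then be concatenated with its intensity-based and LM-based features to form the 108-dimensional feature vector.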
The pixel-level features used in this study.
Feature Selection and Segmentation Model Development
Because such a large number of features was used to characterize each pixel, feature redundancy could occur and impair the capability of the final segmentation model. Therefore, a feature selection process is indispensable. In this study, the maximum relevance and minimum redundancy (mRMR) strategy was used for feature selection, 26 as shown in Figure 1b. The core idea of mRMR is to maximize the correlation between the features and the classification variable while minimizing the correlation among the features themselves. The algorithm ranks features by a criterion (relevance minus redundancy) to obtain the optimal M features, which quickly removes a large number of irrelevant features at a low computational cost.
After the optimal feature subset was determined, the widely used machine learning method RF (as shown in Figure 3) was implemented to develop the classification model for BCa segmentation from the candidate region. In this study, the classifier was implemented by using the YAN-PRTools MATLAB toolbox.
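The relevance-minus-redundancy criterion can be sketched as a greedy ranking. The version below is a simplified stand-in, not the mRMR implementation used in the study: it assumes the absolute Pearson correlation as the measure of both relevance and redundancy (mRMR is often formulated with mutual information instead), and the function name is hypothetical.

```python
import numpy as np

def mrmr_select(X, y, n_select):
    """Greedy mRMR-style ranking using |Pearson correlation| as the dependence measure."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    Xc = X - X.mean(axis=0)
    Xc /= np.linalg.norm(Xc, axis=0) + 1e-12
    yc = y - y.mean()
    yc /= np.linalg.norm(yc) + 1e-12
    relevance = np.abs(Xc.T @ yc)          # feature-to-label dependence
    corr = np.abs(Xc.T @ Xc)               # feature-to-feature dependence
    selected = [int(np.argmax(relevance))]
    while len(selected) < min(n_select, n_features):
        redundancy = corr[:, selected].mean(axis=1)
        score = relevance - redundancy     # the "relevance minus redundancy" criterion
        score[selected] = -np.inf          # never pick the same feature twice
        selected.append(int(np.argmax(score)))
    return selected

# Toy data: feature 1 is an exact copy of feature 0, so it is fully redundant.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=2000)
f0 = y + 0.5 * rng.normal(size=2000)
f2 = y + 1.0 * rng.normal(size=2000)
f3 = rng.normal(size=2000)
X = np.column_stack([f0, f0, f2, f3])
ranking = mrmr_select(X, y, n_select=3)
```

In the toy data, the duplicated feature is penalized by its full redundancy with the first pick, which is the behavior the criterion is designed to produce.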
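For illustration only, the pixel-level classification step can be sketched with scikit-learn's `RandomForestClassifier` in place of the MATLAB toolbox used in the study. The feature vectors below are synthetic (30 features per pixel, matching the size of the selected subset), and the number of trees and the region size are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-ins for pixel-level feature vectors from the two ROI groups.
tumor_feats = rng.normal(loc=1.0, scale=1.0, size=(300, 30))    # labelled 1
wall_feats = rng.normal(loc=-1.0, scale=1.0, size=(300, 30))    # labelled 0
X_train = np.vstack([tumor_feats, wall_feats])
y_train = np.concatenate([np.ones(300, dtype=int), np.zeros(300, dtype=int)])

clf = RandomForestClassifier(n_estimators=100, random_state=0)  # assumed settings
clf.fit(X_train, y_train)

# Classify every pixel of a hypothetical 8 x 8 candidate region, then
# reshape the labels back onto the pixel grid.
h, w = 8, 8
candidate_feats = rng.normal(loc=1.0, scale=1.0, size=(h * w, 30))
pred_mask = clf.predict(candidate_feats).reshape(h, w)
```

Because each pixel is classified independently, the resulting mask may contain small holes or isolated pixels, which motivates a subsequent cleanup step.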

The display of the BCa segmentation results using the RF-based segmentation model in the testing set. The green curves represent the manual delineation results of the tumor regions (ground truth), and the red curves denote the segmentation results using the RF-based segmentation model.
Post-Processing and Performance Assessment
The classification label of each pixel in the candidate region was first predicted and then mapped back to the original image based on the original location information. Finally, considering the connectivity of the pixels in the segmented tumor region, hole filling and largest connected region extraction were applied to obtain a continuous and compact tumor region from the candidate region.
In this study, the manual delineation results were used as the ground truth to evaluate the automatic segmentation results. The contours of the cancer region in the testing set were independently delineated by two experts with 8 and 9 years of experience in bladder MR interpretation. After delineation, the experts carefully checked the contours slice by slice, and disagreements were resolved by consensus. The DSC and the average symmetric surface distance (ASSD) were used to quantitatively evaluate the performance of the proposed segmentation method.
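The post-processing described above can be sketched as follows. This is an assumed, simplified implementation using 4-connectivity and a breadth-first search; the study does not specify the connectivity or the order of operations beyond what is stated in the text, and the function names are hypothetical.

```python
from collections import deque
import numpy as np

def label_components(mask):
    """4-connected component labelling by breadth-first search."""
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=int)
    current = 0
    for r in range(h):
        for c in range(w):
            if mask[r, c] and labels[r, c] == 0:
                current += 1
                labels[r, c] = current
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and labels[ny, nx] == 0:
                            labels[ny, nx] = current
                            queue.append((ny, nx))
    return labels, current

def fill_holes(mask):
    """Turn background pixels that cannot reach the image border into foreground."""
    padded = np.pad(~mask, 1, constant_values=True)    # ring of background around the image
    labels, _ = label_components(padded)
    outside = labels == labels[0, 0]                   # background connected to the border
    return ~outside[1:-1, 1:-1]

def largest_component(mask):
    labels, n = label_components(mask)
    if n == 0:
        return mask.copy()
    sizes = np.bincount(labels.ravel())[1:]            # skip background label 0
    return labels == (np.argmax(sizes) + 1)

def postprocess(pred):
    mask = np.asarray(pred, dtype=bool)
    return largest_component(fill_holes(mask))

# Toy prediction: a 3 x 3 ring with a one-pixel hole, plus one isolated pixel.
pred = np.zeros((7, 7), dtype=int)
pred[1:4, 1:4] = 1
pred[2, 2] = 0
pred[5, 5] = 1
clean = postprocess(pred)
```

In the toy example, the hole inside the ring is filled and the isolated pixel is discarded, leaving a single compact region.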
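As a sketch of the two evaluation measures, the code below computes the DSC as 2|A ∩ B| / (|A| + |B|) and the ASSD as the mean of the nearest-boundary distances taken in both directions. It assumes non-empty 2D binary masks, defines boundary pixels with 4-connectivity, and uses a brute-force distance computation; these choices and the function names are illustrative assumptions rather than the study's implementation.

```python
import numpy as np

def dice(seg, gt):
    """Dice similarity coefficient between two binary masks."""
    seg, gt = np.asarray(seg, dtype=bool), np.asarray(gt, dtype=bool)
    denom = seg.sum() + gt.sum()
    return 2.0 * np.logical_and(seg, gt).sum() / denom if denom else 1.0

def boundary_points(mask):
    """Coordinates of foreground pixels with at least one 4-neighbour in the background."""
    mask = np.asarray(mask, dtype=bool)
    p = np.pad(mask, 1, constant_values=False)
    interior = p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    return np.argwhere(mask & ~interior)

def assd(seg, gt, spacing=(1.0, 1.0)):
    """Average symmetric surface distance, in the units of `spacing`."""
    a = boundary_points(seg) * np.asarray(spacing)
    b = boundary_points(gt) * np.asarray(spacing)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)   # all pairwise distances
    return (d.min(axis=1).sum() + d.min(axis=0).sum()) / (len(a) + len(b))

# Toy example: a 4 x 4 square and the same square shifted by one column.
gt = np.zeros((10, 10), dtype=bool)
gt[2:6, 2:6] = True
seg = np.zeros((10, 10), dtype=bool)
seg[2:6, 3:7] = True
dsc_value = dice(seg, gt)
assd_value = assd(seg, gt)
```

Passing `spacing=(0.5, 0.5)` would express the ASSD in millimetres for the 0.5 × 0.5 mm² pixel size reported for the primary dataset. A higher DSC and a lower ASSD both indicate better agreement with the manual delineation.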
Experiments and Results
Demographics of the Patients Enrolled
A total of 56 patients with postoperatively confirmed BCa and preoperative MR scans were included and randomly allocated into the training set (n = 40) and the testing set (n = 16), at a ratio of approximately 7:3. The demographics of these patients in the training and testing sets are listed in Table 2.
Demographics of the patients enrolled in this study.
Performance Evaluation Using the Candidate Regions Determined in the Testing set
In this study, the mRMR strategy was used for filter-based feature selection, and the 30 features most related to the classification variable were selected from the 108 features. Figure 3 shows examples of the BCa segmentation results using the supervised learning strategy with an RF classifier. The first column of Figure 3 shows the inner and outer bladder walls segmented by the CDLS approach and the candidate regions to be segmented. The second column shows the tumor regions (encircled by the green curve) manually segmented by the experts. The area enclosed by the red line in the third column represents the BCa segmentation results using the supervised learning strategy with an RF classifier. The fourth column compares the segmentation results with the tumor regions manually segmented by the experts, and illustrates how closely the constructed segmentation model follows the manual delineation of the tumor tissue.
Figure 4 shows the segmentation results obtained with several supervised machine learning methods on the intermediate tumor layer of different patients. Each row shows the result of one supervised machine learning method. The results indicate that the proposed RF-based segmentation model outperformed the other supervised learning methods, namely linear regression (LR) and decision tree (DT), for accurate bladder tumor segmentation.

The comparison of the BCa segmentation results using the RF-based segmentation model and other supervised machine learning methods on the intermediate tumor layer of different patients. The LR-based model represents the segmentation model constructed by the LR classifier, and the DT-based model represents the segmentation model constructed by the DT classifier. RF = random forest; LR = linear regression; DT = decision tree.
To quantitatively assess the precision of the tumor region segmentation, the DSC and ASSD were adopted as numerical measures of segmentation performance. The DSC and ASSD values of the RF-based segmentation model and of the other supervised machine learning methods, namely linear regression (LR) and decision tree (DT), were calculated on the candidate regions in the testing set, as shown in Table 3. The RF-based segmentation model achieved the best DSC and ASSD among the compared methods.
The DSC and ASSD of different approaches for the BCa segmentation in the testing set.
RF indicates the classifier of the random forest; LR indicates the classifier of linear regression; DT indicates the classifier of decision tree
Methods Comparison Experiment
In recent years, many attention-based methods have been proposed for MRI segmentation,27–29 and have achieved favorable performance in terms of precision and robustness. To verify the superiority of our methodology, we also compared our proposed method with the widely used Attention U-Net model. 29 To train the Attention U-Net model sufficiently, we used all T2W image layers with tumor appearance in the training process and five middle tumor layers in the testing process. This part of the experiment was run on an NVIDIA Tesla V100 machine with 16 GB of memory. The training parameters were: epochs = 200, batch size = 64, and learning rate = 0.001. The SGD optimizer was used with weight decay = 1e-8 and momentum = 0.9. The final segmentation results are shown in Table 4.
Comparison of our proposed methodology with the state-of-the-art Attention U-Net approach. 29
The average segmentation DSC of the Attention U-Net segmentation model is 0.675, which is considerably lower than that of our proposed RF-based model. Given the very limited sample size, these results suggest that the proposed methodology is well suited to the accurate segmentation of bladder tumors when only a small dataset is available.
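To show how the reported optimizer settings interact, the sketch below applies one common form of the SGD update with momentum and L2 weight decay to a toy quadratic loss. The update convention, the toy loss, and the function name are assumptions; the text does not state which deep learning framework or exact update rule was used, and no network is trained here.

```python
import numpy as np

# Hyperparameters as reported in the text.
config = {"epochs": 200, "batch_size": 64, "lr": 1e-3,
          "momentum": 0.9, "weight_decay": 1e-8}

def sgd_step(w, grad, velocity, lr, momentum, weight_decay):
    """One SGD update with momentum and L2 weight decay (assumed convention)."""
    g = grad + weight_decay * w            # weight decay adds a pull towards zero
    velocity = momentum * velocity + g     # running accumulation of past gradients
    return w - lr * velocity, velocity

# Toy problem: minimise 0.5 * ||w||^2, whose gradient is simply w.
w = np.array([1.0, -2.0])
v = np.zeros_like(w)
initial_norm = np.linalg.norm(w)
for _ in range(config["epochs"]):
    grad = w
    w, v = sgd_step(w, grad, v, config["lr"], config["momentum"], config["weight_decay"])
final_norm = np.linalg.norm(w)
```

With a momentum of 0.9, the effective step size on a slowly changing gradient is roughly ten times the nominal learning rate, while a weight decay of 1e-8 has a negligible regularizing effect at this scale.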
Extended Experiment
To verify the robustness of our proposed RF-based model, we further collected five T2W MR datasets from the Imaging Center of Xinjiang Uiger Municipal People's Hospital to conduct an external validation. These datasets were acquired with 3.0 T T2W MR scans, and the primary imaging parameters were: slice thickness 2 mm, repetition time (TR) 1500 ms, and echo time (TE) 80 ms. The final segmentation results are listed in Table 5.
The DSC and ASSD of different approaches for the BCa segmentation in the extended set.
RF indicates the classifier of the random forest.
The results in Table 5 further indicate the favorable performance of our proposed method in a multi-center data application.
Discussion
Precise segmentation of BCa on MR images is essential for preoperatively identifying the muscle-invasive status and stage of BCa, which is of critical importance for the treatment strategy and prognostication of patients with BCa. Clinically, manual annotation of BCa on MR images by experts is the standard reference for the AI-based diagnosis of BCa phenotypes and prognostication. However, this is usually a tedious, inefficient, and time-consuming process. Errors are likely to occur when radiologists have inadequate experience in interpreting bladder MR images, especially in the mixed region that contains both the tumor tissue and its neighboring wall tissue. The boundary between the tumor and wall tissue on MR images is the primary difficulty and main target of BCa segmentation, and it is usually very narrow, at the scale of individual pixels. Pixel-level features and samples should therefore be more appropriate for the accurate segmentation of the BCa lesion.
In this study, we propose a new approach for BCa segmentation. After the inner and outer walls of the bladder are delineated using the CDLS algorithm, the approach makes full use of the pixel-level features and samples determined from two kinds of ROIs, namely the ROIs of all tumor tissue and of all wall tissue, with a supervised RF classifier to achieve pixel-level segmentation of bladder tumors on T2W images. The results demonstrate the good performance of the proposed approach for BCa segmentation.
Compared with the widely used Attention U-Net model, which usually needs a large database for sufficient model training and parameter tuning to achieve excellent performance and generalizability, the proposed pixel-level RF-based model needs only a small database to achieve better segmentation performance. In fact, by leveraging a kernel function to transform the original data into a high-dimensional pixel-level feature space, the classifier could find the optimal hyperplane for the classification purpose. 30 The results further demonstrated the superiority of the proposed approach for BCa segmentation with a small database.
To verify the robustness of our proposed model, we added additional data as an independent validation set and used the originally trained model to segment these data directly, obtaining a good result. Although this result is not as good as the original one, which may be due to the difference between the scan parameters of the new data and those of the original data, it nevertheless suggests that our model is reasonably robust.
However, the results of this study should be interpreted carefully because of the following limitations. First, the sample size of our study is very small, and most of the subjects are from the same center, which might impair the generalizability of the model to large multi-center databases. Second, only a filter-based feature selection strategy was used in the present work to improve the performance. In future work, wrapper-based feature selection strategies should also be considered to obtain the most powerful feature subset. Third, a single imaging sequence can provide only limited information for BCa segmentation. Multimodal MR imaging, including T2-weighted, diffusion-weighted, apparent diffusion coefficient, and dynamic contrast-enhanced images, has been shown to be more useful for noninvasive BCa staging and muscle-invasive status identification, so the inclusion of these MR modalities may provide additional information useful for BCa segmentation. 12 Finally, a sample power analysis was not performed in the present study owing to the limited data currently collected. In future investigations, more datasets will be included and a sample power analysis will be conducted beforehand. Considering that generative adversarial networks (GANs) have recently been widely used for data augmentation and image quality enhancement, we plan in future work to use GANs for data augmentation and attention networks for deep information mining to achieve more accurate and robust bladder tumor segmentation on MR data.
Conclusion
The proposed pixel-level RF-based strategy can comprehensively use the rich information contained in bladder MR images, thus effectively distinguishing the tumor tissue from its neighboring wall tissue and achieving more accurate BCa segmentation.
Footnotes
Abbreviations
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest concerning the research, authorship, and/or publication of this article.
Funding
This work was supported by the National Natural Science Foundation of China under grants No. 81901698, 81871424, and Young Eagle Plan of High Ambition Project under grant No. 2020CYJHXXP.
Ethical Approval
Ethical approval to report this case was obtained from the Medical Ethical Committee of Xijing Hospital (First Affiliated Hospital of Air Force Medical University (Fourth Military Medical University)) (Approval number KY20194009).
Informed Consent
Informed consent was waived due to the retrospective nature of this study.
