
Abstract

Rationale and Objectives: We aimed to develop and validate prediction models for the histological grade of invasive breast carcinoma (BC) based on ultrasound radiomics features and clinical characteristics. Materials and Methods: A total of 383 patients with invasive BC were retrospectively enrolled and divided into a training set (207 patients), an internal validation set (90 patients), and an external validation set (86 patients). Ultrasound radiomics features were extracted from all eligible patients. The Boruta method was used to identify the most useful features. Seven classifiers were used to develop prediction models. The output of the best-performing classifier was labeled the radiomics score (Rad-score), and that classifier was selected as the Rad-score model. A combined model integrating clinical factors and the Rad-score was developed. Model performance was evaluated using receiver operating characteristic curves. Results: Seven radiomics features were selected from 788 candidate features. The logistic regression model, which performed best among the 7 classifiers in the internal and external validation sets, was taken as the Rad-score model, with area under the receiver operating characteristic curve (AUC) values of 0.731 and 0.738, respectively. Tumor size was identified as a risk factor and the combined model was developed, with AUC values of 0.721 and 0.737 in the internal and external validation sets. Furthermore, 10-fold cross-validation indicated that the 2 models were reliable and stable. Conclusion: The Rad-score model and the combined model were able to predict the histological grade of invasive BC, which may enable tailored therapeutic strategies for patients with BC in routine clinical use.

Keywords

invasive breast carcinoma, histological grade, radiomics, ultrasound, machine learning

Introduction

Breast carcinoma (BC) is the most common carcinoma in women and is also the leading cause of cancer-related deaths worldwide.¹ The histological grade of BC is an independent prognostic factor that combines information about 3 aspects, namely, the formation of glandular ducts, pleomorphism of the nucleus, and chromatin and mitotic phase.² The histological grade of BC maintains its pivotal role within the prognostic classification framework of the Nottingham Prognostic Index.³

Invasive BC, which represents the predominant histological subtype of breast cancer, constitutes approximately 80% of all breast cancer cases.⁴ Different histological grades of invasive BC call for different management and treatment schemes.^5,6 Hence, accurate determination of the histological grade in patients with invasive BC has a major influence on prognosis. Clinically, the pre-operative histological grade of invasive BC is mainly confirmed pathologically by ultrasound-guided biopsy, which is an invasive method and is still associated with complications, including hematoma, pain, and infection.⁷

Medical images contain information that reflects underlying pathophysiology, and these relationships can be revealed by advanced image processing methods.⁸ Radiomics converts digital medical images into mineable data, which enables the application of advanced image analysis in the standard clinical setting.^9,10 Advances in machine learning have also opened up new perspectives for radiomics-based ultrasound (US) image analysis.

Breast US is widely used in clinical practice as a noninvasive, nonradiative, and inexpensive modality to help clinicians detect and diagnose BC. There is evidence that some gray-scale US features are closely related to the histological grade of invasive BC.^11,12 However, the reproducibility and accuracy of US image interpretation vary greatly among ultrasonographers because the understanding and application of diagnostic criteria are considerably subjective. Theoretically, US images may contain hidden information that can hardly be perceived by the naked eye.¹³ Furthermore, US imaging is universally used in breast imaging, and many studies have demonstrated that US radiomics analysis is able to predict malignant breast tumor,^14,15 axillary lymph node metastasis,¹⁶ hormone receptor-positive BC,¹⁷ human epidermal growth factor receptor 2 (HER2) expression,¹⁸ Ki-67 expression level,¹⁹ neoadjuvant chemotherapy responses,^20,21 and disease-free survival of invasive BC.²²

To the best of our knowledge, no multicenter study has been conducted to pre-operatively predict histological grade of invasive BC using the method of US-based radiomics except for ours. In our study, it was hypothesized that US radiomics features extracted from the invasive BC lesions and machine learning classifiers can be used to develop imaging biomarkers that may differentiate the high histological grade (grade III) from low histological grade (grade I and II) of invasive BC, noninvasively.

Materials and Methods

The study was approved by the Institutional Review Boards of all participating hospitals and complied with the Declaration of Helsinki. Informed consent was waived because of the retrospective nature of this study.

Data Source and Patient Selection

Our multicenter study was conducted at 2 hospitals: Hospital 1 (Zhejiang Cancer Hospital, approval number IRB-2022-548) and Hospital 2 (Dongyang People's Hospital, approval number 2024-YX-111). In total, 297 patients with invasive BC meeting the inclusion criteria were consecutively enrolled at Hospital 1 between March 2019 and December 2021 and divided into either a training set or an internal validation set at a ratio of 7:3. A total of 207 cases were assigned to the training set, and 90 cases were assigned to the internal validation set. Another independent set of 86 patients was consecutively enrolled from September 2021 to December 2022 at Hospital 2 as the external validation set. The procedure for the inclusion and exclusion of patients is shown in Figure 1. The reporting of this study conforms to STROBE guidelines (https://www.equator-network.org/reporting-guidelines/strobe/). All patient details have been de-identified.

Figure 1.

Flow chart of patients for enrollment. The left panel represents patient screening at Hospital 1, while the right panel pertains to patient screening at Hospital 2.

The inclusion criteria were: (1) patients who underwent biopsy or surgery of the breast lesion and were histopathologically confirmed with nonspecial types of invasive BC; (2) lesions presenting as mass on ultrasound images; (3) time interval between surgery and ultrasound examination less than 2 weeks; and (4) patients who received no previous chemotherapy or radiotherapy. The exclusion criteria included: (1) ultrasound images with obvious artifacts; (2) the target tumor was not completely visible in the ultrasound image; (3) patients with lack of pathological biomarkers or incomplete history information in clinical medical records; and (4) patients with BC with multiple malignant lesions.

The training set contained high (n = 60) and low histological grades (n = 147). The internal validation set contained high (n = 27) and low histological grades (n = 63). The external validation set contained high (n = 22) and low histological grades (n = 64).

The patients’ demographic characteristics such as tumor size, age, tumor site, axillary lymph node status (metastasis or no metastasis), Ki-67 index, HER2 status (positive or negative), estrogen receptor (ER) status (positive or negative), progesterone receptor (PR) status (positive or negative), Breast Imaging-Reporting and Data System (BI-RADS), and histological grade (I, II or III) were collected.

Postoperative Pathological Assessment

Pathological results were confirmed by US-guided biopsy or surgery. The cutoff point for ER-positive and PR-positive expression was 1%.²³ The Ki-67 threshold was set at 14%, and values over 14% were considered high expression.²⁴ In cases of equivocal HER2 overexpression, an amplification ratio of 2 or higher on the fluorescence in situ hybridization test was considered to indicate HER2 positivity; otherwise, the cases were deemed negative.²⁵ The histological grade of invasive BC was scored according to the following criteria:^2,26 (1) for the formation of glandular ducts, clearly formed glandular ducts were scored as 1 point, moderately differentiated glandular ducts as 2 points, and tumor cells growing in solid patches or strips as 3 points; (2) for the size, shape, and chromatin of the nucleus, a uniform nucleus was scored as 1 point, a moderately irregular nucleus as 2 points, and a nucleus showing obvious pleomorphism as 3 points; and (3) for chromatin and mitotic phase, 1/10 high-power fields (HPF) was scored as 1 point, 2 to 3/10 HPF as 2 points, and > 3/10 HPF as 3 points. The 3 scores were summed and graded as follows: a total of 3 to 5 was graded I, 6 to 7 was graded II, and 8 to 9 was graded III.
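The mapping from the 3 component scores to the grade, and from the grade to the binary label used in this study, can be written as a small lookup. The following is a minimal sketch; the function names are hypothetical and are not taken from the study's own code.

```python
def histological_grade(ducts: int, nucleus: int, mitosis: int) -> int:
    """Map three component scores (each 1 to 3) to histological grade 1, 2, or 3."""
    for score in (ducts, nucleus, mitosis):
        if score not in (1, 2, 3):
            raise ValueError("each component score must be 1, 2, or 3")
    total = ducts + nucleus + mitosis  # total ranges from 3 to 9
    if total <= 5:
        return 1  # grade I
    if total <= 7:
        return 2  # grade II
    return 3      # grade III


def is_high_grade(grade: int) -> bool:
    """Binary label used in this study: grade III is high; grades I and II are low."""
    return grade == 3


print(histological_grade(2, 3, 2))                 # total 7 -> 2
print(is_high_grade(histological_grade(3, 3, 2)))  # total 8 -> True
```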

Ultrasound Acquisition and Image Segmentation

Different types of ultrasound diagnostic equipment (LOGIQ E9, Siemens Acuson S2000, Toshiba Aplio 500 and Philips EPIQ 5) were used at the above 2 hospitals, employing a high-frequency linear probe with radial, transverse, and longitudinal scans on both breasts. Ultrasound images were exported with the format of digital imaging and communication in medicine from the picture archiving and communication system database.

In this study, sonographer 1 was responsible for collecting the ultrasound image information of the patients. As a preprocessing step, all ultrasound images were resampled to 1 mm × 1 mm × 1 mm to reduce the disturbance caused by nonuniform spatial resolution. Next, ITK-SNAP software (open source software; http://www.itk-snap.org) was used to manually outline a region of interest covering the largest cross-sectional area of each breast lesion in the transverse plane. This was carried out independently by sonographer 2 (sonographer 1 from Hospital 2, with more than 5 years’ experience in ultrasonic diagnosis), who was blinded to each patient's histopathological grade.

Radiomics Feature Extraction and Selection

The “pyradiomics” package (version 3.0.1) of Python (version 3.7.11) was used to extract ultrasound radiomics features. A total of 788 features in 7 groups were extracted from the ultrasound image of each eligible case: (1) 18 first-order features, (2) 14 shape-based features, (3) 14 gray-level dependence matrix features, (4) 16 gray-level run length matrix features, (5) 16 gray-level size zone matrix features, (6) 22 gray-level co-occurrence matrix features, and (7) 688 features using wavelet filter images.
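As a quick arithmetic check, the group counts listed above sum to the reported total. The short sketch below only re-adds the numbers given in the text; the comment on the wavelet count is an interpretation, not something stated by the authors.

```python
# Feature counts per group, exactly as listed in the text.
feature_counts = {
    "first-order": 18,
    "shape-based": 14,
    "gray-level dependence matrix": 14,
    "gray-level run length matrix": 16,
    "gray-level size zone matrix": 16,
    "gray-level co-occurrence matrix": 22,
    "wavelet": 688,
}

total = sum(feature_counts.values())
print(len(feature_counts), total)  # 7 788

# Assumption (not stated in the text): 688 = 8 x 86, which would be consistent
# with 8 wavelet sub-bands each yielding the 86 non-shape features above.
non_shape = total - feature_counts["wavelet"] - feature_counts["shape-based"]
print(non_shape, feature_counts["wavelet"] // non_shape)  # 86 8
```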

To assess the inter- and intra-observer consistency of radiomics feature extraction, the ultrasound images of 50 patients were randomly selected. Two experienced sonographers (sonographer 2 and sonographer 3 from Hospital 2, with more than 5 years’ experience in ultrasonic diagnosis) performed the procedure independently according to the same references. Although the 2 sonographers knew that all the patients had invasive BC, they were blinded to the tumor histological grade. Moreover, sonographer 2 repeated the process according to the same procedure after 2 weeks. The intraclass correlation coefficient (ICC) was used to evaluate the inter- and intra-observer stability of the obtained radiomics features, and features with ICCs > 0.75 were selected for the following analysis.
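The text does not state which form of the ICC was computed. As an illustration only, the sketch below assumes the two-way random-effects, absolute-agreement, single-measure form, often written ICC(2,1), and applies the 0.75 cutoff to one feature. The function name and the example values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per lesion and one column per reader.

    Assumption: two-way random effects, absolute agreement, single measure.
    The study may have used a different ICC form.
    """
    n = len(ratings)       # number of lesions
    k = len(ratings[0])    # number of readers (or repeated sessions)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical values of one feature measured by two readers on five lesions.
feature_by_reader = [[0.82, 0.80], [0.41, 0.45], [0.63, 0.60], [0.95, 0.97], [0.30, 0.28]]
icc = icc_2_1(feature_by_reader)
keep_feature = icc > 0.75  # cutoff used in the study
print(keep_feature)
```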

For the extracted radiomics features, z-score normalization was used to standardize the radiomics feature data in each of the 3 sets. In the training set, 2 feature selection methods, the Mann-Whitney U test and the Boruta method, were applied in order to screen out the principal radiomics features most relevant to the histological grade of invasive BC. The Mann-Whitney U test was applied to compare the features of subjects with high histological grade and subjects with low histological grade. P values lower than .05 were considered significant. By comparing the importance of the original attributes with the importance achievable at random, the Boruta method performs a top-down search and is recommended for high-dimensional data analysis.^27,28 In this step, the features that are important to the classification process are selected. Finally, the algorithm outputs a minimal and optimal subset of features.
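The z-score step can be illustrated for a single feature as follows. This is a minimal sketch under two assumptions that the text does not settle: each set is standardized with its own mean and standard deviation, and the population standard deviation is used. The feature values are hypothetical.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize one feature to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical values of one radiomics feature in each of the 3 sets.
sets = {
    "training": [12.0, 15.0, 9.0, 20.0],
    "internal validation": [11.0, 14.0, 17.0],
    "external validation": [30.0, 25.0, 35.0],
}
standardized = {name: zscore(values) for name, values in sets.items()}
print([round(z, 3) for z in standardized["external validation"]])
```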

Model Construction and Validation

In the training set, 7 representative machine learning classifiers, namely naive Bayesian, support vector machine, k-nearest neighbor, decision tree, extreme gradient boosting (XGBoost), logistic regression, and random forest, were used to develop prediction models for the histological grade of BC. The output of the machine learning classifier with the best predictive power was labeled the radiomics score (Rad-score), and that model was selected as the Rad-score model. Univariable regression analysis was performed in the training set to determine the independent predictive factors for the histological grade of invasive BC. The combined model was developed by integrating the independent predictive factors with the Rad-score using multivariate logistic regression. Moreover, a nomogram was plotted for the combined model.

Sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were used to assess the performance of the prediction models in the training, internal validation, and external validation sets. Calibration curves, which illustrate the relationship between the observed and predicted results, were plotted to evaluate the calibration of the radiomics and combined models. Meanwhile, decision curve analysis was carried out to explore the clinical utility of the 2 models. Receiver operating characteristic (ROC) curves were plotted, and the area under the curve (AUC) was reported as a measure of the predictive power of each classifier. The flowchart of this research is shown in Figure 2.
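The 5 threshold-based measures all derive from one confusion matrix. The sketch below shows the definitions; the example counts are not reported in the paper but are back-calculated, as an assumption, from the training-set figures for the logistic regression model in Table 3 (60 high-grade and 147 low-grade cases, sensitivity 71.7%, specificity 70.1%).

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Threshold-based metrics, with high histological grade as the positive class."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Back-calculated (assumed) counts: 43 of 60 high-grade and 103 of 147 low-grade
# cases classified correctly.
metrics = diagnostic_metrics(tp=43, fn=17, tn=103, fp=44)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

With these counts the function returns 71.7%, 70.1%, 70.5%, 49.4%, and 85.8%, matching the corresponding row of Table 3.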

Figure 2.

Schematic diagram of the processing and analysis flowchart. Abbreviations: ACC, accuracy; AUC, area under the curve; DCA, decision curve analysis; DT, decision tree; ICC, intraclass correlation coefficient; KNN, k-nearest neighbors; LR, logistic regression; NB, naive Bayesian; NOM, nomogram; NPV, negative predictive value; PPV, positive predictive value; RF, random forest; ROI, region of interest; SEN, sensitivity; SPE, specificity; SVM, support vector machine; XGBoost, extreme gradient boosting.

Statistical Analysis

R software (version 3.5.1; www.r-project.org) was used to perform the statistical analysis and data processing. A P value < .05 (2-sided) was considered statistically significant. Continuous variables with a normal distribution are expressed as mean ± standard deviation, while continuous variables with a non-normal distribution are expressed as median (interquartile range). The Student's t test was applied to compare clinical-pathological characteristics with a normal distribution, while the Mann-Whitney U test was used for characteristics with a non-normal distribution. Categorical variables such as histological grade and histological type are presented as counts (percentages) and were compared using the chi-square test or Fisher's exact test.

Results

Clinical-Pathological Characteristics

The baseline clinical-pathological characteristics of patients with BC in the training, internal validation, and external validation sets are presented in Table 1. Of the 383 patients with invasive BC, 297 were included in the training and internal validation sets, and 86 were enrolled in the external validation set. In the training set, 147 patients had low-grade invasive BC, accounting for 71.0%. Overall, compared with the high-grade group, patients in the low-grade group had the following significant clinical-pathological characteristics: smaller tumor size, lower Ki-67 index, and a higher proportion of ER + and PR + tumors (all P < .05). In the internal validation set, there were statistically significant differences between the high- and low-grade groups in tumor location, Ki-67 index, and the proportion of ER + and PR + tumors. In the external validation set, patients in the low-grade group had a lower Ki-67 index and smaller tumor size.

Table 1.

The Baseline Characteristics of the Enrolled Patients in the Training Set, Internal Validation Set and External Validation Set.

	Training set			Internal validation set			External validation set
Variables	Low grade (147)	High grade (60)	P	Low grade (63)	High grade (27)	P	Low grade (64)	High grade (22)	P
Radiomics score (median [IQR])	0.201 [0.132, 0.337]	0.369 [0.265, 0.522]	<.001	0.214 [0.147, 0.318]	0.381 [0.234, 0.490]	<.001	0.205 [0.114, 0.396]	0.463 [0.216, 0.626]	.001
Age (median [IQR])	52.0 [46.0, 61.0]	51.5 [44.8, 57.3]	.712	52.0 [45.5, 59.5]	54.0 [48.5, 63.5]	.360	54.0 [47.8, 63.3]	55.5 [45.0, 56.8]	.223
Location (%)			.358			.005			.615
Left	78 (53.1)	27 (45.0)		17 (27.0)	16 (59.3)		24 (37.5)	10 (45.5)
Right	69 (46.9)	33 (55.0)		46 (73.0)	11 (40.7)		40 (62.5)	12 (54.5)
Tumor size (median [IQR])	20.0 [15.0, 25.0]	25.0 [20.0, 30.8]	<.001	20.0 [15.0, 30.0]	25.0 [18.0, 36.0]	.130	21.5 [15.0, 25.3]	25.0 [18.8, 34.5]	.019
ER (%)			.001			.003			.222
Positive	122 (83.0)	36 (60.0)		53 (84.1)	14 (51.9)		53 (82.8)	15 (68.2)
Negative	25 (17.0)	24 (40.0)		10 (15.9)	13 (48.1)		11 (17.2)	7 (31.8)
PR (%)			.001			.004			.087
Positive	104 (70.7)	27 (45.0)		47 (74.6)	11 (40.7)		51 (79.7)	13 (59.1)
Negative	43 (29.3)	33 (55.0)		16 (25.4)	16 (59.3)		13 (20.3)	9 (40.9)
HER2 (%)			.061			.176			.336
Positive	26 (17.7)	18 (30.0)		12 (19.0)	9 (33.3)		9 (14.1)	5 (22.7)
Negative	121 (82.3)	42 (70.0)		51 (81.0)	18 (66.7)		55 (85.9)	17 (77.3)
Ki-67 (median [IQR])	18.0 [10.0, 30.0]	50.0 [30.0, 70.0]	<.001	15.0 [10.0, 40.0]	30.0 [22.5, 50.0]	.017	20.0 [15.0, 33.3]	52.5 [31.3, 60.0]	<.001
Histological grade (%)			<.001			<.001			<.001
I	8 (5.4)	0 (0.0)		1 (1.6)	0 (0.0)		9 (14.1)	0 (0.0)
II	139 (94.6)	0 (0.0)		62 (98.4)	0 (0.0)		55 (85.9)	0 (0.0)
III	0 (0.0)	60 (100.0)		0 (0.0)	27 (100.0)		0 (0.0)	22 (100.0)
Pathology (%)			1.000			1.000			1.000
IDC	145 (98.6)	60 (100.0)		62 (98.4)	27 (100.0)		61 (95.3)	21 (95.5)
ILC	2 (1.4)	0 (0.0)		1 (1.6)	0 (0.0)		3 (4.7)	1 (4.5)
BI-RADS (%)			.796			.322			.158
4A	17 (11.6)	8 (13.3)		9 (14.3)	1 (3.7)		8 (12.5)	3 (13.6)
4B	38 (25.9)	12 (20.0)		19 (30.2)	6 (22.2)		45 (70.3)	11 (50.0)
4C	36 (24.5)	14 (23.3)		15 (23.8)	7 (25.9)		10 (15.6)	6 (27.3)
5	56 (38.1)	26 (43.3)		20 (31.7)	13 (48.1)		1 (1.6)	2 (9.1)
US-reported LN (%)			.756			1.000			.119
Positive	61 (41.5)	23 (38.3)		27 (42.9)	12 (44.4)		19 (29.7)	11 (50.0)
Negative	86 (58.5)	37 (61.7)		36 (57.1)	15 (55.6)		45 (70.3)	11 (50.0)
Pathology-reported LN (%)			.544			.819			.614
Positive	78 (53.1)	29 (48.3)		33 (52.4)	13 (48.1)		22 (34.4)	9 (40.9)
Negative	69 (46.9)	31 (51.7)		30 (47.6)	14 (51.9)		42 (65.6)	13 (59.1)
Number of metastasis LN (median [IQR])	0 [0, 4]	0 [0, 2]	.351	1 [0, 5]	0 [0, 4]	.993	0 [0, 1]	0 [0, 2]	.384

Abbreviations: BI-RADS, Breast Imaging Reporting and Data System; ER, estrogen receptor; HER2, human epidermal growth factor receptor 2; IDC, invasive ductal carcinoma; ILC, invasive lobular carcinoma; IQR, interquartile range; LN, lymph node; PR, progesterone receptor; US, ultrasound.

Radiomics Feature Extraction and Selection

A total of 788 radiomics features were extracted from each patient. Then, 765 robust features with ICCs > 0.75 were obtained and used for dimension reduction, which was consistent with our prior study.¹⁷ In the training set, the Mann-Whitney U test was performed on the 765 robust features, and 206 features with P < .05 were retained. Finally, 7 radiomics features were selected using the Boruta method (Figure 3).

Figure 3.

Feature selection using the Boruta algorithm. The x-axis represents ultrasound radiomics features by name, while the y-axis indicates their importance scores evaluated by the Boruta algorithm, aiding in understanding their significance within the dataset.

Independent Clinical Factor

Univariable regression analysis of the association between the clinical factors and the histological grade of invasive BC in the training set identified tumor size as the only statistically significant factor, with an odds ratio (OR) of 1.06 (95% confidence interval [CI], 1.03-1.09; P < .001) (Table 2). Multivariable regression analysis was not performed because only one significant clinical factor was detected in the univariable regression analysis.
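Because the OR is reported per unit of tumor size, its practical meaning is easier to see for a larger increment. The sketch below rescales the reported OR and its CI limits; the unit of tumor size is not stated in Table 2, so millimetres are assumed here, and the function name is hypothetical.

```python
import math


def odds_ratio_for_increase(or_per_unit: float, delta: float) -> float:
    """Rescale a per-unit odds ratio to an increase of `delta` units.

    In logistic regression the log-odds are linear in the predictor, so the
    odds ratio for `delta` units is exp(delta * ln(OR per unit)).
    """
    beta = math.log(or_per_unit)
    return math.exp(beta * delta)


# Reported per-unit estimate and 95% CI limits for tumor size (unit assumed to be mm).
for label, value in (("OR", 1.06), ("lower", 1.03), ("upper", 1.09)):
    print(label, round(odds_ratio_for_increase(value, 10), 2))
# OR 1.79
# lower 1.34
# upper 2.37
```

Under that assumption, a tumor 10 mm larger would have roughly 1.8 times the odds of high histological grade.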

Table 2.

Univariable Logistic Regression Analysis in the Training Set.

Characteristics	Univariable analysis
Characteristics	OR	95% CI	P
Age	1	0.97-1.02	.84
Location
Left	Ref.
Right	1.38	0.76-2.53	.29
Size	1.06	1.03-1.09	<.001
BI-RADS
4A	Ref.
4B	0.67	0.23-1.94	.46
4C	0.83	0.29-2.34	.72
5	0.99	0.38-2.58	.98
Lymph node
Negative	Ref.
Positive	0.88	0.47-1.62	.67

Abbreviations: BI-RADS, Breast Imaging Reporting and Data System; CI, confidence interval; OR, odds ratio; Ref., reference.

Machine Learning Classifiers and Radiomics Score Calculation

Seven machine learning classifiers were trained on the selected features in the training set and tested in the internal and external validation sets. The performance of the 7 machine learning models is summarized in Table 3. The results showed that the models could differentiate patients with high histological grade BC from those with low histological grade BC. Moreover, the AUCs of each pair of models were compared in each of the 3 sets, and the DeLong test was used to calculate the P values (Figure 4).
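The AUC compared here has a simple rank interpretation: it is the probability that a randomly chosen high-grade case receives a higher score than a randomly chosen low-grade case, with ties counted as one half. The sketch below computes this quantity directly; it does not implement the DeLong test itself, and the scores are hypothetical.

```python
def auc_by_pairs(high_scores, low_scores):
    """AUC as the fraction of (high, low) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for h in high_scores:
        for l in low_scores:
            if h > l:
                wins += 1.0
            elif h == l:
                wins += 0.5
    return wins / (len(high_scores) * len(low_scores))


# Hypothetical model outputs for three high-grade and three low-grade cases.
high = [0.9, 0.8, 0.4]
low = [0.7, 0.3, 0.2]
print(round(auc_by_pairs(high, low), 3))  # 8 of 9 pairs correct -> 0.889
```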

Figure 4.

The statistical comparison of area under the curve values using the DeLong test among 7 machine learning classifiers in the training set, internal validation set, and external validation set. Abbreviations: DT, decision tree; KNN, k-nearest neighbors; LR, logistic regression; NB, naive Bayesian; RF, random forest; SVM, support vector machine; XGB, extreme gradient boosting.

Table 3.

Diagnostic Performance of the 7 Machine Learning Classifiers in the Training Set, Internal Validation Set, and External Validation Set.

Model	Set	AUC (95% CI)	Specificity	Sensitivity	Accuracy	PPV	NPV
NB	Training	0.582 (0.514-0.649)	83.00%	33.30%	68.60%	44.40%	75.30%
	Internal validation	0.609 (0.502-0.715)	81.00%	40.70%	68.90%	47.80%	76.10%
	External validation	0.695 (0.582-0.809)	89.10%	50.00%	79.10%	61.10%	83.80%
DT	Training	0.688 (0.617-0.759)	81.00%	56.70%	73.90%	54.80%	82.10%
	Internal validation	0.614 (0.504-0.724)	74.60%	48.10%	66.70%	44.80%	77.00%
	External validation	0.656 (0.537-0.774)	76.60%	54.50%	70.90%	44.40%	83.10%
KNN	Training	0.659 (0.593-0.726)	91.80%	40.00%	76.80%	66.70%	78.90%
	Internal validation	0.627 (0.530-0.724)	92.10%	33.30%	74.40%	64.30%	76.30%
	External validation	0.590 (0.488-0.691)	90.60%	27.30%	74.40%	50.00%	78.40%
RF	Training	1.000 (1.000-1.000)	100.00%	100.00%	100.00%	100.00%	100.00%
	Internal validation	0.609 (0.502-0.715)	81.00%	40.70%	68.90%	47.80%	76.10%
	External validation	0.626 (0.512-0.741)	84.40%	40.90%	73.30%	47.40%	80.60%
SVM	Training	0.559 (0.504-0.615)	91.80%	20.00%	71.00%	50.00%	73.80%
	Internal validation	0.582 (0.490-0.674)	90.50%	25.90%	71.10%	53.80%	74.00%
	External validation	0.681 (0.573-0.789)	95.30%	40.90%	81.40%	75.00%	82.40%
XGBoost	Training	0.983 (0.960-1.000)	100.00%	96.70%	99.00%	100.00%	98.70%
	Internal validation	0.667 (0.558-0.775)	77.80%	55.60%	71.10%	51.70%	80.30%
	External validation	0.633 (0.514-0.752)	76.60%	50.00%	69.80%	42.30%	81.70%
LR	Training	0.742 (0.668-0.816)	70.10%	71.70%	70.50%	49.40%	85.80%
	Internal validation	0.731 (0.617-0.846)	84.10%	59.30%	76.70%	61.50%	82.80%
	External validation	0.738 (0.617-0.859)	67.20%	72.70%	68.60%	43.20%	87.80%

Abbreviations: AUC, area under the curve; CI, confidence interval; DT, decision tree; KNN, k-nearest neighbors; LR, logistic regression; NB, naive Bayesian; NPV, negative predictive value; PPV, positive predictive value; RF, random forest; SVM, support vector machine; XGBoost, extreme gradient boosting.

The logistic regression classifier performing best in the internal and external validation sets was determined as the Rad-score model. Figure 5 shows the regression coefficients of the logistic regression algorithm. By adopting the regression coefficients of the Rad-score model to weight each radiomics feature, the probability of high histological grade of BC based on the selected 7 radiomics features was quantitatively predicted and considered as Rad-score.
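For a logistic regression model, the weighting described above amounts to a linear combination of the standardized features passed through the logistic function. The sketch below illustrates this; the coefficients, intercept, and feature values are hypothetical placeholders, not the values shown in Figure 5.

```python
import math


def rad_score(features, coefficients, intercept):
    """Predicted probability of high grade from a logistic regression model."""
    if len(features) != len(coefficients):
        raise ValueError("one coefficient is needed per feature")
    linear = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients for the 7 selected features (not the study's values).
coefficients = [0.42, -0.31, 0.18, 0.25, -0.12, 0.30, -0.22]
intercept = -0.95
# Hypothetical z-scored feature values for one patient.
patient = [0.5, -1.2, 0.3, 0.8, 0.0, 1.1, -0.4]

print(round(rad_score(patient, coefficients, intercept), 3))
```

The output lies between 0 and 1, which is consistent with the Rad-score medians reported in Table 4.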

Figure 5.

Feature coefficients in predicting high histological grade of invasive breast cancer according to the Rad-score model.

The median Rad-score differed significantly between the low- and high-grade groups in the training set, and the same result was obtained in the internal and external validation sets (Table 4 and Figure 6A-C).

Figure 6.

Raincloud plots, receiver operating characteristic curves, and calibration curves in the training set (A, D, G), internal validation set (B, E, H), and external validation set (C, F, I).

Table 4.

Rad-Scores for the Training Set, Internal Validation Set, and External Validation Set.

Set	Radiomics score (median [IQR])		P
Set	Low grade	High grade	P
Training	0.201 [0.132, 0.337]	0.369 [0.265, 0.522]	<.001
Internal validation	0.214 [0.147, 0.318]	0.381 [0.234, 0.490]	<.001
External validation	0.205 [0.114, 0.396]	0.463 [0.216, 0.626]	.001

Abbreviation: IQR, interquartile range.

Establishment of Combined Model and Comparison of Models

The combined model was developed by integrating tumor size into the Rad-score model, and the performance of the Rad-score and combined models for predicting high histological grade is shown in Table 5 and Figure 6D-F. The combined model achieved satisfactory discrimination, with AUCs of 0.750 (95% CI: 0.677-0.824), 0.721 (95% CI: 0.604-0.838), and 0.737 (95% CI: 0.616-0.857) in the training, internal validation, and external validation sets, respectively. The combined model yielded a higher AUC for high histological grade in the training set, whereas the Rad-score model had slightly higher predictive ability in the internal and external validation sets. There was no significant difference between the ROC curves of the 2 models in the training, internal validation, and external validation sets (DeLong test, P = .307, P = .381, and P = .987, respectively).

Table 5.

Diagnostic Performance of the Rad-Score Model and Combined Model in the Training Set, Internal Validation Set, and External Validation Set.

Model	Set	AUC (95% CI)	Specificity	Sensitivity	Accuracy	NPV	PPV
Radiomics model	Training	0.742 (0.668-0.816)	70.10%	71.70%	70.50%	85.80%	49.40%
	Internal validation	0.731 (0.617-0.846)	84.10%	59.30%	76.70%	82.80%	61.50%
	External validation	0.738 (0.617-0.859)	67.20%	72.70%	68.60%	87.80%	43.20%
Combined model	Training	0.750 (0.677-0.824)	75.50%	70.00%	73.90%	86.00%	53.80%
	Internal validation	0.721 (0.604-0.838)	85.70%	55.60%	76.70%	81.80%	62.50%
	External validation	0.737 (0.616-0.857)	73.40%	63.60%	70.90%	85.50%	45.20%

Abbreviations: AUC, area under the curve; CI, confidence interval; NPV, negative predictive value; PPV, positive predictive value.

Ten-fold cross-validation was conducted in the training set to test the stability and reliability of the Rad-score model and the combined model. It yielded mean AUCs of 0.728 and 0.745, respectively, indicating that the predictive power of the 2 models was reliable and stable.
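The fold assignment behind such a cross-validation can be sketched as follows. The text does not say whether the folds were stratified by grade; stratification is assumed here so that every fold contains high-grade cases, and the function name and seed are hypothetical.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds, keeping the class ratio similar in each fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Training-set composition from the text: 147 low-grade (0) and 60 high-grade (1) cases.
labels = [0] * 147 + [1] * 60
folds = stratified_folds(labels, k=10)
print([len(fold) for fold in folds])
# [21, 21, 21, 21, 21, 21, 21, 20, 20, 20]
```

Each fold would serve once as the held-out set while the model is refit on the other 9, and the mean AUC is the average of the 10 held-out AUCs.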

Clinical Application of Prediction Models

The calibration of the Rad-score model and the combined model was tested using the Hosmer-Lemeshow method. No statistically significant differences were found, as all P values were > .05 in the training set (P = .789; P = .513), internal validation set (P = .405; P = .430), and external validation set (P = .378; P = .868), indicating good consistency between the observed and predicted results (Figure 6G-I). Moreover, we used tumor size and the Rad-score to build a nomogram based on the training set to discriminate high-grade from low-grade invasive BC (Figure 7). In the nomogram plot, each variable value of a patient is positioned along its corresponding axis, and a straight line is drawn upwards to determine the points corresponding to the Rad-score and tumor size. These points are then summed to derive the total points, which are projected downwards onto the probability axis to indicate the probability of a patient being diagnosed with high histological grade invasive BC. Using this nomogram on the external validation set, we obtained a median predicted probability of high histological grade invasive BC of 0.2226, with a maximum of 0.9128 and a minimum of 0.0725. Additionally, a box plot (Figure 8) illustrates the predicted probabilities for each patient in the external validation set. Decision curve analyses of the combined model and the Rad-score model are shown in Figure 9.
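Decision curve analysis plots the net benefit of a model across threshold probabilities and compares it with treating all patients as high grade. The sketch below shows the standard net benefit formula for a single threshold; the predicted probabilities and labels are hypothetical and do not come from the study.

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit of calling a case high grade when its probability >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of the reference strategy that calls every case high grade."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical predictions for five patients (1 = high grade).
probabilities = [0.9, 0.8, 0.3, 0.2, 0.6]
labels = [1, 1, 0, 0, 0]
print(round(net_benefit(probabilities, labels, 0.5), 3))   # 0.2
print(round(net_benefit_treat_all(labels, 0.5), 3))        # -0.2
```

A decision curve repeats this calculation over a range of thresholds; a model is useful at thresholds where its net benefit exceeds both the treat-all line and zero (treat none).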

Figure 7.

Nomogram for predicting high histological grade of invasive breast cancer. Clinicians can add up corresponding scores using the plot and obtain the high histological grade probability. The red bar represents the range where the variable's value falls, while the green bar delineates the 95% confidence interval of these values.

Figure 8.

The probability distribution of high and low histological grade invasive breast cancers in the external validation set.

Figure 9.

Decision curve analysis for the Rad-score model and combined model.

Discussion

In the current study, we developed a new method for predicting the histological grade of invasive BC. We used ultrasound radiomics and machine learning classifiers to develop prediction models for histological grade. Using high-throughput radiomics analysis, we analyzed 788 quantitative ultrasound features to determine their value in predicting the histological grade of invasive BC. With the Boruta method, we screened out 7 radiomics features as imaging markers for developing machine learning prediction models. Seven machine learning classifiers were used to establish 7 models for predicting the histological grade, all of which were assessed and validated. The logistic regression classifier, which performed best, was taken as the Rad-score model, with AUC values of 0.742, 0.731, and 0.738 in the training, internal validation, and external validation sets, respectively. In addition, clinical-pathological data on tumor size, BI-RADS category, tumor site, age, and axillary lymph node metastasis were collected and analyzed by univariate logistic regression analysis. Finally, tumor size was identified as an independent factor and was combined with the Rad-score to develop the combined model, with AUC values of 0.750, 0.721, and 0.737 in the training, internal validation, and external validation sets, respectively. However, there was no statistical difference between the Rad-score model and the combined model. Our findings demonstrated that both the Rad-score model and the combined model could predict the histological grade in patients with invasive BC. To our knowledge, this is the first study to combine ultrasound radiomics features with a clinical factor (tumor size) in the prediction of the histological grade of invasive BC.

According to Zodinpuii et al,²⁹ histological tumor grade is associated with lymph node invasion and hormone receptor subtype. Similarly, Zheng et al³⁰ found that a high histological grade of invasive BC was more likely to be present in patients with positive axillary lymph nodes, large tumor size (more than 2 cm), HER2 positivity, lymphovascular invasion, and Basal-like BC. However, tumor size was the only clinical factor that differed significantly between low- and high-grade BC in this investigation. Furthermore, the pathological markers ER, PR, and Ki-67 differed significantly between the low- and high-grade groups, as shown in Table 1. The ER, PR, and Ki-67 values here refer to postoperative pathological information, which is difficult to obtain prior to surgery and is therefore unsuitable for a pre-operative prediction study. Therefore, predicting the histological grade of invasive BC using radiomics features and clinical factors might be more successful and effective.

In recent years, radiomics, first proposed by Lambin et al in 2012,⁹ has developed rapidly. It can be used to diagnose and predict diseases noninvasively and is widely regarded as a breakthrough for personalized cancer management.¹⁷⁻²⁰ In the study by Mao et al,²⁵ a radiomics model based on contrast-enhanced spectral mammography was developed and validated to pre-operatively discriminate low-grade from high-grade invasive BC. The combined radiomics model, built on 28 radiomics features, showed the best performance for pre-operatively predicting histological grade in patients with invasive BC, with AUCs of 0.88 and 0.80 in the training and test sets. Despite these positive findings, that study had several shortcomings. First, because of the limited number of radiomics features and the small sample size, its conclusions are unlikely to generalize. Second, it did not include clinical-pathological characteristics, which have been shown to be related to histological grade. Third, it was a single-center study, and multicenter research is needed to provide high-level evidence for clinical use. A study by Wang et al³¹ evaluated 901 patients with invasive BC and pre-operative magnetic resonance imaging (MRI) scans. The AUC values of the radiomics model for histological grade prediction were 0.761 in the training set and 0.722 in the validation set, suggesting that an MRI-based radiomics model is capable of predicting the histological grade of invasive BC. However, this was a single-center study and the model was not tested in an external validation cohort, which may limit its generalizability. In the study by Fan et al,³² 167 patients with invasive ductal carcinoma were enrolled, and radiomics features from dynamic contrast-enhanced MRI and T2-weighted images were fused using canonical correlation analysis. The highest AUC value for predicting histological grade in the validation cohort was 0.803. Despite these findings, the sample size was smaller than ours, and the robustness of the predictive models needs to be further validated in future studies using a large external dataset.

After feature screening, the 7 selected features, which comprised shape, texture, and wavelet features, played an important role in correctly classifying the 2 groups. Texture and wavelet features were the most significant, particularly the wavelet features, which accounted for the majority of high-weight features. The wavelet transform makes it possible to quantify intratumoral heterogeneity at different scales, which is often invisible to the naked eye.³³ Furthermore, texture features retain the spatial characteristics of the lesions and can quantify subtle differences in image pixel values and their arrangement.³⁴ Conventional imaging diagnosis requires considerable radiodiagnostic experience and is subject to large differences between observers. In contrast, radiomics quantifies image features and builds an objective model, making the classification results less dependent on the observer. Thus, radiomics may serve as an auxiliary tool that helps doctors distinguish the 2 histological grades and make quick differential diagnoses.
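As an illustration of how a wavelet transform separates coarse structure from fine-scale variation, the sketch below applies one level of a 2D Haar transform to a small image and reports the share of energy held in the detail bands. This is an assumption-laden toy example: the study does not state here which wavelet family or software was used, and the names `haar_level`, `energy`, and `detail_energy_fraction` as well as the pixel values are hypothetical.

```python
def haar_level(image):
    """One level of a 2D Haar transform on an even-sized list-of-lists image.

    Each 2x2 block [[a, b], [c, d]] yields one coefficient in each of four
    half-size sub-bands: a local average plus three detail bands.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even")
    approx, col_detail, row_detail, diag_detail = [], [], [], []
    for i in range(0, rows, 2):
        bands = ([], [], [], [])
        for j in range(0, cols, 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            bands[0].append((a + b + c + d) / 2.0)  # coarse-scale average
            bands[1].append((a - b + c - d) / 2.0)  # change between columns
            bands[2].append((a + b - c - d) / 2.0)  # change between rows
            bands[3].append((a - b - c + d) / 2.0)  # diagonal change
        approx.append(bands[0])
        col_detail.append(bands[1])
        row_detail.append(bands[2])
        diag_detail.append(bands[3])
    return approx, col_detail, row_detail, diag_detail


def energy(band):
    """Sum of squared coefficients in a sub-band (or pixel values in an image)."""
    return sum(v * v for row in band for v in row)


def detail_energy_fraction(image):
    """Share of total energy in the three detail bands (0 for a flat image)."""
    approx, *details = haar_level(image)
    detail = sum(energy(b) for b in details)
    total = energy(approx) + detail
    return 0.0 if total == 0 else detail / total


# Toy 4x4 regions (invented values): one homogeneous, one finely textured.
flat = [[5, 5, 5, 5] for _ in range(4)]
checker = [[(i + j) % 2 for j in range(4)] for i in range(4)]
flat_fraction = detail_energy_fraction(flat)
checker_fraction = detail_energy_fraction(checker)
print(flat_fraction, checker_fraction)
```

A homogeneous region keeps all of its energy in the average band, whereas a finely textured region shifts energy into the detail bands. Repeating the transform on the average band would probe progressively coarser scales.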

This study had several limitations. First, the sample size was small and needs to be increased. A larger sample would make the machine learning classifiers less susceptible to data bias and would improve their learning ability by providing more training data. Second, it is not clear whether other planes of the tumor, such as the transverse plane, or peritumoral regions could also be used for differential diagnosis,³⁵ and further research is needed in this area. Third, radiomics analysis was performed only on 2-dimensional images at the largest tumor diameter. Compared with a model based on features of the whole tumor volume, radiomics analysis of a single slice may miss important information.³⁶ A 3-dimensional model should be developed in future studies for predicting the histological grade in patients with BC.

Conclusions

In summary, we developed and validated the Rad-score model and the combined model to distinguish between histological grades of invasive BC. These models may provide a useful diagnostic reference for histological grade identification in routine clinical use.

Footnotes

Abbreviations

Author Contributions

LG and YJ examined the experiment and wrote this article. JW provided help with the data analysis. ZW and DX revised this article. DX provided the research platform.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Ethics Approval

This research involving human participants was reviewed and approved by the Institutional Review Boards of Dongyang People's Hospital (Approval No. 2024-YX-111) and Zhejiang Cancer Hospital (Approval No. IRB-2022-548). Informed consent was waived because of the retrospective nature of this study.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by the Research Program of the National Health Commission Capacity Building and Continuing Education Center (CSJRZC2021JJSJ001).

ORCID iD

Jiangfeng Wu

References

1. Siegel, Miller, Fuchs, Jemal. Cancer statistics, 2021. CA Cancer J Clin. 2021;71(1):7‐33. doi: https://doi.org/10.3322/caac.21654

2. Schwartz, Henson, Chen, Rajamarthandan. Histologic grade remains a prognostic factor for breast cancer regardless of the number of positive lymph nodes and tumor size: a study of 161 708 cases of breast cancer from the SEER program. Arch Pathol Lab Med. 2014;138(8):1048‐1052. doi: https://doi.org/10.5858/arpa.2013-0435-OA

3. Galea, Blamey, Elston, Ellis. The Nottingham prognostic index in primary breast cancer. Breast Cancer Res Treat. 1992;22(3):207‐219. doi: https://doi.org/10.1007/BF01840834

4. DeSantis, Siegel, Bandi, Jemal. Breast cancer statistics, 2011. CA Cancer J Clin. 2011;61(6):409‐418. doi: https://doi.org/10.3322/caac.20134

5. Rakha, Reis-Filho, Baehner, et al. Breast cancer prognostic classification in the molecular era: the role of histological grade. Breast Cancer Res. 2010;12(4):207. doi: https://doi.org/10.1186/bcr2607

6. Weigelt, Geyer, Reis-Filho. Histological types of breast cancer: how special are they? Mol Oncol. 2010;4(3):192‐208. doi: https://doi.org/10.1016/j.molonc.2010.04.004

7. Knuttel, Menezes, van Diest, Witkamp, van den Bosch, Verkooijen. Meta-analysis of the concordance of histological grade of breast cancer between core needle biopsy and surgical excision specimen. Br J Surg. 2016;103(6):644‐655. doi: https://doi.org/10.1002/bjs.10128

8. Acs, Rantalainen, Hartman. Artificial intelligence as the next step towards precision pathology. J Intern Med. 2020;288(1):62‐81. doi: https://doi.org/10.1111/joim.13030

9. Lambin, Rios-Velazquez, Leijenaar, et al. Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer. 2012;48(4):441‐446. doi: https://doi.org/10.1016/j.ejca.2011.11.036

10. Gillies, Kinahan, Hricak. Radiomics: images are more than pictures, they are data. Radiology. 2016;278(2):563‐577. doi: https://doi.org/10.1148/radiol.2015151169

11. Ghai, Moshonov, Crystal. Histological grade and immunohistochemical biomarkers of breast cancer: correlation to ultrasound features. J Ultrasound Med. 2017;36(9):1883‐1894. doi: https://doi.org/10.1002/jum.14247

12. Farras Roca, Tardivon, Thibault, Rouzier, Klijanienko. Correlation of ultrasound, cytological, and histological features of 110 benign BI-RADS categories 4C and 5 nonpalpable breast lesions. The institute curie's experience. Cancer Cytopathol. 2021;129(6):479‐488. doi: https://doi.org/10.1002/cncy.22402

13. Limkin, Sun, Dercle, et al. Promises and challenges for the implementation of computational medical imaging (radiomics) in oncology. Ann Oncol. 2017;28(6):1191‐1206. doi: https://doi.org/10.1093/annonc/mdx034

14. Romeo, Cuocolo, Apolito, et al. Clinical value of radiomics and machine learning in breast ultrasound: a multicenter study for differential diagnosis of benign and malignant lesions. Eur Radiol. 2021;31(12):9511‐9519. doi: https://doi.org/10.1007/s00330-021-08009-2

15. Luo, Huang, Zeng, Wang. Predicting breast cancer in breast imaging reporting and data system (BI-RADS) ultrasound category 4 or 5 lesions: a nomogram combining radiomics and BI-RADS. Sci Rep. 2019;9(1):11921. doi: https://doi.org/10.1038/s41598-019-48488-4

16. Guo, Liu, Sun, et al. Deep learning radiomics of ultrasonography: identifying the risk of axillary non-sentinel lymph node involvement in primary breast cancer. EBioMedicine. 2020;60:103018. doi: https://doi.org/10.1016/j.ebiom.2020.103018

17. Jin, et al. Development and validation of an ultrasound-based radiomics nomogram for predicting the luminal from non-luminal type in patients with breast carcinoma. Front Oncol. 2022;12:993466. doi: https://doi.org/10.3389/fonc.2022.993466

18. Guo, Wang, Jin. Development and validation of an ultrasound-based radiomics nomogram for identifying HER2 status in patients with breast carcinoma. Diagnostics (Basel). 2022;12(12):3130. doi: https://doi.org/10.3390/diagnostics12123130

19. Fang, Yao, et al. Integration of ultrasound radiomics features and clinical factors: a nomogram model for identifying the Ki-67 status in patients with breast carcinoma. Front Oncol. 2022;12:979358. doi: https://doi.org/10.3389/fonc.2022.979358

20. Jiang, Luo, et al. Ultrasound-based deep learning radiomics in the assessment of pathological complete response to neoadjuvant chemotherapy in locally advanced breast cancer. Eur J Cancer. 2021;147:95‐105. doi: https://doi.org/10.1016/j.ejca.2021.01.028

21. Tong, et al. Deep learning radiomics of ultrasonography can predict response to neoadjuvant chemotherapy in breast cancer at an early stage of treatment: a prospective study. Eur Radiol. 2022;32(3):2099‐2109. doi: https://doi.org/10.1007/s00330-021-08293-y

22. Xiong, Chen, Tang, et al. Ultrasound-based radiomics analysis for predicting disease-free survival of invasive breast cancer. Front Oncol. 2021;11:621993. doi: https://doi.org/10.3389/fonc.2021.621993

23. Allison, Hammond MEH, Dowsett, et al. Estrogen and progesterone receptor testing in breast cancer: American Society of Clinical Oncology/College of American Pathologists Guideline update. Arch Pathol Lab Med. 2020;144(5):545‐563. doi: https://doi.org/10.5858/arpa.2019-0904-SA

24. Goldhirsch, Winer, Coates, et al. Panel members. Personalizing the treatment of women with early breast cancer: highlights of the St Gallen International Expert Consensus on the primary therapy of early breast cancer 2013. Ann Oncol. 2013;24(9):2206‐2223. doi: https://doi.org/10.1093/annonc/mdt303

25. Loibl, Gianni. HER2-positive breast cancer. Lancet. 2017;389(10087):2415‐2429. doi: https://doi.org/10.1016/S0140-6736(16)32417-5

26. Mao, Jiao, Duan, Xie. Preoperative prediction of histologic grade in invasive breast cancer by using contrast-enhanced spectral mammography-based radiomics. J Xray Sci Technol. 2021;29(5):763‐772. doi: https://doi.org/10.3233/XST-210886

27. Kursa, Rudnicki. Feature selection with the Boruta package. J Stat Softw. 2010;36(11):1‐13. doi: https://doi.org/10.18637/jss.v036.i11

28. Guryleva, Penzar, Chistyakov, Mironov, Favorov, Sergeeva. Investigation of the role of PUFA metabolism in breast cancer using a rank-based random forest algorithm. Cancers (Basel). 2022;14(19):4663. doi: https://doi.org/10.3390/cancers14194663

29. Zodinpuii, Pautu, Zothankima, Pachuau, Kumar. Clinical features and first degree relative breast cancer, their correlation with histological tumor grade: a 5-year retrospective case study of breast cancer in Mizoram, India. Environ Sci Pollut Res Int. 2020;27(2):1991‐2000. doi: https://doi.org/10.1007/s11356-019-06944-8

30. Zheng, Tan, et al. Clinicopathologic factors related to the histological tumor grade of breast cancer in western China: an epidemiological multicenter study of 8619 female patients. Transl Oncol. 2018;11(4):1023‐1033. doi: https://doi.org/10.1016/j.tranon.2018.06.005

31. Wang, Wei, Zhou. Development and validation of an MRI radiomics-based signature to predict histological grade in patients with invasive breast cancer. Breast Cancer (Dove Med Press). 2022;14:335‐342. doi: https://doi.org/10.2147/BCTT.S380651

32. Fan, Liu, Xie, et al. Integration of dynamic contrast-enhanced magnetic resonance imaging and T2-weighted imaging radiomic features by a canonical correlation analysis-based feature fusion method to predict histological grade in ductal breast carcinoma. Phys Med Biol. 2019;64(21):215001. doi: https://doi.org/10.1088/1361-6560/ab3fd3

33. Qiu, Xing, Wang, Feng, Wen. Development and validation of a radiomics nomogram using computed tomography for differentiating immune checkpoint inhibitor-related pneumonitis from radiation pneumonitis for patients with non-small cell lung cancer. Front Immunol. 2022;13:870842. doi: https://doi.org/10.3389/fimmu.2022.870842

34. Tang, et al. Ultrasound-based radiomics for predicting different pathological subtypes of epithelial ovarian cancer before surgery. BMC Med Imaging. 2022;22(1):147. doi: https://doi.org/10.1186/s12880-022-00879-2

35. Song, Yin. Intratumoral and peritumoral radiomics based on functional parametric maps from breast DCE-MRI for prediction of HER-2 and ki-67 status. J Magn Reson Imaging. 2021;54(3):703‐714. doi: https://doi.org/10.1002/jmri.27651

36. Ouyang, et al. Magnetic resonance imaging radiomics predicts pre-operative axillary lymph node metastasis to support surgical decisions and is associated with tumor microenvironment in invasive breast cancer: a machine learning, multicenter study. EBioMedicine. 2021;69:103460. doi: https://doi.org/10.1016/j.ebiom.2021.103460

Noninvasive Assessment of Tumor Histological Grade in Invasive Breast Carcinoma Based on Ultrasound Radiomics and Clinical Characteristics: A Multicenter Study
