Abstract
Background:
The NCCN clinical guidelines recommend core needle biopsy for breast lesions classified as Breast Imaging Reporting and Data System (BI-RADS) 4, although category 4A lesions are only 2-10% likely to be malignant. Thus, a large number of biopsied BI-RADS 4A lesions are ultimately determined to be benign, and these unnecessary biopsies incur additional cost and pain. However, current risk prediction models focus primarily on detailed and complex risk features of ultrasonography (US) or mammography (MG) findings, which makes them difficult to apply in practice. To stratify and manage BI-RADS 4A lesions effectively and efficiently, a more practical predictive model must be developed.
Methods:
We retrospectively analyzed 465 patients with BI-RADS ultrasonography (US) category 4A lesions, diagnosed between January 2019 and July 2019 in Tianjin Medical University Cancer Institute and Hospital and National Clinical Research Center for Cancer. Univariate and multivariate logistic regression analyses were conducted to identify risk factors. To stratify and predict the malignancy of BI-RADS 4A lesions, a nomogram combining the risk factors was constructed based on the multivariate logistic regression results. The predictive performance of the model was evaluated with the concordance index (C-index), calibration curve, and receiver operating characteristic (ROC) curve, and its clinical benefit was assessed with decision curve analysis (DCA).
Results:
Based on our analysis, 16.3% (76 out of 465) of patients were pathologically diagnosed with malignant lesions, while 83.7% (389 out of 465) were diagnosed with benign lesions. According to univariate and multivariate logistic regression analysis, age (OR = 3.414, 95% CI: 1.849-6.303), nipple discharge (OR = .326, 95% CI: .157-.835), palpable lesions (OR = 1.907, 95% CI: 1.004-3.621), uncircumscribed margin (US) (OR = 1.732, 95% CI: 1.033-2.905), calcification (mammography, MG) (OR = 2.384, 95% CI: 1.366-4.161), and BI-RADS(MG) (OR = 5.345, 95% CI: 2.934-9.736) were incorporated into the predictive nomogram (C-index = .773). There was good agreement between the predicted risk and the observed probability of malignancy. Furthermore, we determined that 153 was the best cutoff score for distinguishing between patients in the low- and high-risk groups. Malignant lesions were significantly more prevalent in high-risk patients than in low-risk patients.
Conclusion:
Based on clinical, US, and MG features, we present a predictive nomogram to reliably predict the malignancy risk of BI-RADS(US) 4A lesions, which may assist clinicians in the selection of patients at low risk of malignancy and reduce the number of false-positive biopsies.
Introduction
Global statistics have shown that breast cancer has become the most common cancer and the second most common cause of cancer-related death worldwide. 1 The key to reducing mortality and improving prognosis in breast cancer is early diagnosis and aggressive systemic treatment. 2 Ultrasonography (US) is one of the most important imaging techniques for the early detection and diagnosis of breast diseases, especially early-stage breast cancer. 3 The Breast Imaging Reporting and Data System (BI-RADS) for US was developed by the American College of Radiology (ACR) and is the standard for evaluating breast lesions on US. According to the fifth edition of BI-RADS, updated in 2013, breast lesions can be classified according to their sonographic characteristics (BI-RADS 0-6). 4 In general, BI-RADS 0 indicates that the assessment is incomplete and must be combined with other imaging, BI-RADS 1 indicates no lesions or negative findings, BI-RADS 2 indicates a benign lesion without suspicious features, and BI-RADS 3 indicates a probably benign lesion with less than 2% probability of malignancy. 5 BI-RADS 6 lesions have already been proven malignant by pathological examination. Suspicious lesions are classified by their likelihood of malignancy as BI-RADS 4, with a wide range from 2% to 95%, or BI-RADS 5, with a likelihood of more than 95%. Because BI-RADS 4 covers such a wide range of malignancy estimates, it was further divided into 3 subcategories: 4A, 4B, and 4C.
BI-RADS 4 lesions are recommended for core needle biopsy according to NCCN clinical guidelines. 6 In fact, only 2%-10% of BI-RADS 4A lesions are malignant, which means that the great majority of these biopsies are unnecessary.7,8 These unnecessary biopsies, which are ultimately determined to be benign pathologically, place a significant physical and psychological burden on patients. Consequently, it is of great importance to improve the differentiation between malignant and benign lesions in the clinical diagnosis and treatment of suspected BI-RADS 4A lesions.
To date, a few prediction models have been proposed to improve the discrimination of malignant from benign lesions, thus decreasing care costs and patient suffering.9,10 These analyses, however, were primarily concerned with detailed risk features of US or MG findings, which are highly dependent on the expertise and interpretation of breast radiologists. Currently, no study has evaluated the clinical characteristics together with basic US or MG characteristics of BI-RADS 4A lesions to assist physicians in distinguishing individuals who may be at risk for malignancy.
For these reasons, the purpose of this study was to establish and validate a nomogram that can stratify and predict the malignancy of BI-RADS(US) 4A lesions in order to avoid unnecessary biopsies for patients at low risk of malignancy.
Methods
Study Participants
This retrospective research was deemed exempt from institutional review board approval by Tianjin Medical University Cancer Institute and Hospital and National Clinical Research Center for Cancer (Tianjin, China) and the informed consent was waived. Our research was conducted in accordance with the relevant regulations and guidelines.
A total of 465 female patients, in line with our inclusion criteria diagnosed and treated at Tianjin Medical University Cancer Institute and Hospital (Tianjin, China) between January 2019 and July 2019, were enrolled in our study. The inclusion criteria were as follows: (a) female patients aged over 18 years old; (b) diagnosed with lesions assigned as BI-RADS(US) category 4A; (c) clinical characteristics and pathological results were available. The exclusion criteria were: (a) male patients; (b) incomplete clinical information or pathological diagnosis; (c) any history of malignant tumor lesions of the breast or other organs, blood diseases and acute infections.
Ultrasonography
The US examination was performed using GE LOGIQ E9, GE LOGIQ E7, and SuperSonic Imaging Aixplore color Doppler ultrasonic diagnostic apparatus equipped with a variable frequency linear array probe with a frequency range of 6 to 15.0 MHz. During the examination, the probe was used to scan the breast from the edge to the center with the patient in the supine position, and the area where suspicious lesions were discovered was then re-examined. All of the lesions were analyzed and diagnosed by our hospital’s 4 dedicated breast radiologists with over 5 years of experience in US breast examination. Based on the ACR BI-RADS (US) criteria, the radiologists, blinded to the pathological results, described the lesions and assigned each to one BI-RADS (US) category.
Mammography
The mammography examination was performed using the Selenia full-field digital mammography machine (Hologic, USA). Three technicians with over 5 years of experience operated the MG, and the MG images were reviewed by three radiologists with over 5 years of experience in breast imaging diagnosis. Using the ACR BI-RADS for MG (2013) criteria, the radiologists described the lesions and classified them according to BI-RADS(MG).
Pathological Result
In our hospital, patients with BI-RADS(US) 4A lesions are usually recommended to undergo US-guided core needle biopsy (US-CNB) or surgical excision in order to obtain a definitive pathological diagnosis. In our study, US-CNB was performed using a BARD semi-automatic core instrument (USA) with a 14-gauge Tru-Cut needle, obtaining an average of three tissue samples. Pathologists with more than 10 years of experience evaluated and classified all tissue samples according to the 2019 WHO classification of breast tumors. 11 Furthermore, we obtained the immunohistochemistry staining information (estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER-2), Ki-67, p53, CK5/6, epidermal growth factor receptor (EGFR), etc.) and classified the molecular subtype of breast cancer according to the expert consensus of the St. Gallen meeting (2013). 12
Statistical Analysis
All data analysis in our study was performed in R version 4.1.1 (http://www.Rproject.org) with the R packages reader, rms, riskRegression, pROC, rmda, waterfalls, forestplot, etc. To identify the correlation of clinical, US, and MG features with pathological results, we used the χ2 test or Fisher’s exact test. The characteristics that were statistically significant (P < .05) in the univariate analysis were then further filtered in the multivariable logistic regression analysis, and the odds ratio (OR) and its 95% confidence interval (CI) were calculated. The independent risk factors that were statistically significant in both the univariate analysis and the multivariable regression analysis were incorporated into our prediction nomogram. Internal validation was used to reduce the prediction model’s bias, the concordance index (C-index) was used to evaluate the model’s discriminative ability, and calibration curves were used to evaluate the reliability of the nomogram. According to the nomogram, the risk prediction score for each patient is calculated by adding the points of each index; the best cut-off point on the receiver operating characteristic (ROC) curve is then determined, and the patients are divided into low-risk and high-risk groups. To assess the clinical value of our nomogram, we used decision curve analysis (DCA), which quantifies the clinical utility along with the clinical consequences at different threshold probabilities. 13
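As a minimal sketch of how an odds ratio and its 95% CI follow from a fitted logistic regression coefficient, the snippet below exponentiates the coefficient and its Wald interval. The coefficient and standard error are hypothetical values chosen for illustration, and the function name is an assumption; the study itself performed this step in R.

```python
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a logistic-regression coefficient.

    beta: estimated log-odds coefficient
    se:   its standard error
    z:    normal quantile for the desired confidence level (1.96 for 95%)
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical coefficient and standard error, not taken from the study
beta, se = 0.80, 0.25
or_, lower, upper = odds_ratio_with_ci(beta, se)
print(f"OR = {or_:.3f}, 95% CI: {lower:.3f}-{upper:.3f}")
```

An interval that excludes 1 corresponds to a coefficient that differs significantly from 0 at the chosen level.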
Results
Univariate Analysis
Clinical-Pathological Parameters of Patients With BI-RADS(US) 4A Lesion.
Bold values highlight the name of each characteristic and have no special meaning.

Correlations between the possible risk factors.
In our research, there was no significant difference between the malignant and benign cohorts regarding history of childbearing (P = .148), history of breastfeeding (P = .208), family history of breast cancer (P = .375), history of smoking (P = .375), direction (P = .124), quadrant (P = .272), breast pain (P = .310), unclear boundary (P = .139), shape (US) (P = .488), duct dilatation (US) (P = .563), multiple benign nodules (US) (P = .508), echo pattern (US) (P = .824), blood flow signal (US) (P = .824), or size (US) (P = .698). As shown in Table 1, malignant patients were older than benign patients (P < .001), especially in the >55 years group (35.5% vs 16.2%). The rate of palpable lesions was higher among malignant patients than among benign patients (80.3% vs 67.6%, P = .028). In contrast, benign lesions showed much higher rates of nipple discharge (19.5% vs 9.2%, P = .032). Nipple discharge is one of the most common symptoms of breast disease, including breast intraductal papilloma, breast duct dilatation, and breast cancer. 15 The majority of researchers believe that 80∼90% of nipple discharges are caused by benign breast diseases. 16 In our study, 83 patients had nipple discharge, of whom 76 (91.6%) had benign lesions.
Among the US features, 11.8% of malignant lesions had enlarged lymph nodes, compared with only 5.1% of benign lesions (P = .027). In addition, an uncircumscribed US margin was significantly associated with malignancy (P < .05).
Among the MG features, 42.1% of malignant lesions had suspicious MG calcification, compared with only 19.3% in the benign cohort (P < .001). Furthermore, malignant patients had a much higher rate of BI-RADS(MG) 4B and above (38.2% vs 8.7%, P < .001).
Multivariate Analysis
Multivariate Analysis of Clinicopathological Parameters in the Patients With BI-RADS(US) 4A Lesion.

The forest plot represents the OR of the final malignancy risk features of BI-RADS(US) 4A lesions. The X-axis shows the OR and 95% CI of each risk factor.
Validation and Calibration of the Nomogram
We established a nomogram based on the forest plot that incorporated all significant predictive factors, as shown in Figure 3. In our analysis, the calibration curves of the predictive nomogram for the malignancy risk of BI-RADS(US) 4A lesions showed good agreement (Figure 4). Harrell’s C-index of the predictive nomogram was .773 (95% CI, .714 to .833). As shown in Figure 5, we compared the AUC of our nomogram with that of each component to determine which was more predictive. According to the receiver operating characteristic (ROC) curves, our predictive nomogram exhibited the highest AUC (.773 [95% CI: .714-.833]) compared with any single risk factor, indicating that the nomogram had a higher predictive power.
Risk nomogram for predicting the malignancy risk of BI-RADS(US) lesions. The age, nipple discharge, palpable lesions, margin (US), calcification (MG) and suspicious malignancy (MG) were used for building the prognostic nomogram. Total point values were independently calculated and then applied to the corresponding probability scale at the bottom of each figure.
Calibration curve for the predictive nomogram. The calibration of the nomogram was depicted by the calibration curve in terms of the agreement between the predicted malignancy risk and the actual results based on pathological results. The turquoise line represents an ideal prediction, and the purple line represents the predictive performance of the nomogram. The closer the fit of the red line to the ideal line, the better the prediction. The blue line represents the bias-corrected estimate.
Receiver operating characteristic (ROC) curves of our predictive model and each risk factor.
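For a binary outcome, the C-index equals the area under the ROC curve: the proportion of (malignant, benign) pairs in which the malignant case receives the higher predicted score. The following is a minimal sketch of that pairwise definition on hypothetical toy data; the function name and the data are assumptions for illustration, not the study's implementation.

```python
def c_index(scores, labels):
    """Concordance index for a binary outcome.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied scores count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))

# Hypothetical predicted risks and outcomes (1 = malignant, 0 = benign)
toy_scores = [0.9, 0.4, 0.35, 0.8, 0.1]
toy_labels = [1, 0, 1, 0, 0]
toy_c = c_index(toy_scores, toy_labels)
print(f"C-index = {toy_c:.3f}")
```

A value of .5 corresponds to chance-level ranking and 1.0 to perfect separation of malignant from benign cases.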


The decision curve analysis (DCA) of the predictive nomogram and of each single risk factor model is presented in Figure 6A and was used to determine an optimal decision point for the nomogram score. First, the DCA curves were used to compare the net benefit of the predictive nomogram with that of each risk factor in predicting the malignancy risk of BI-RADS 4A lesions. As shown in Figure 6A, when the threshold probability is between 0 and .51, using the nomogram to predict malignancy in BI-RADS(US) 4A lesions provides a higher net benefit than assuming that all BI-RADS(US) 4A lesions are malignant (line All in Figure 6A) or that all are benign (line None in Figure 6A), suggesting that the nomogram is superior to either default strategy. Lowering the risk threshold tends to increase the net benefit, but more patients are then classified as having malignant lesions, which increases the number of false positives.
Determination of decision point via Decision Curve Analysis and Clinical Impact Curve. (A) Decision curve analysis for the predictive nomogram and each single variable. (B) Clinical impact curve analysis for the predictive nomogram and each single variable. The vertical blue lines across (A) and (B) show the alignment of the DCA and the clinical impact curve to achieve a balance between higher net benefit and lower false-positive rates.
Finally, the best cut-off point was 153.189. The nomogram scores of the 465 patients are arranged from low to high in the waterfall plot. The horizontal black line shows the best cut-off score, which divides the patients into a high-risk group (recommended for biopsy or surgery) and a low-risk group (recommended for follow up).

By analyzing the percentage of patients classified as high risk by our nomogram and the percentage of patients pathologically diagnosed as malignant at each threshold, we developed the clinical impact curve in Figure 6B. As shown in Figure 6B, as the risk threshold increases, the difference between the number of patients predicted by our nomogram to have malignant lesions (the red curve) and the actual number of patients pathologically diagnosed as malignant (the blue curve) becomes larger.
To achieve a balance between lower false-positive rates and a higher net benefit, we aligned the DCA with the clinical impact curve. Based on the DCA curve, clinical impact curve, and ROC curve, the risk threshold for malignancy was determined to be .38.
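The net benefit plotted in a decision curve is conventionally defined as the true-positive rate minus the false-positive rate weighted by the odds of the threshold probability. The sketch below illustrates that standard formula and the "All" reference line, using only the cohort totals reported above (76 malignant of 465); the function names are assumptions, and the nomogram's own curve is not reproduced because per-threshold counts are not given in the text.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit = TP/n - (FP/n) * pt / (1 - pt) at threshold probability pt."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    return tp / n - (fp / n) * threshold / (1 - threshold)

def net_benefit_treat_all(n_malignant, n, threshold):
    """Reference strategy 'All': every lesion is treated as malignant."""
    return net_benefit(n_malignant, n - n_malignant, n, threshold)

N_TOTAL, N_MALIGNANT = 465, 76  # cohort totals reported in the Results

for pt in (0.05, 0.10, 0.20, 0.38):
    nb_all = net_benefit_treat_all(N_MALIGNANT, N_TOTAL, pt)
    print(f"threshold {pt:.2f}: net benefit of 'All' = {nb_all:.4f}")
```

The "None" strategy has a net benefit of 0 at every threshold, and the "All" line crosses 0 where the threshold equals the prevalence of malignancy.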
Risk Stratification via the Nomogram
According to our nomogram, the overall risk scores of all the patients ranged from 0 to 365, as shown in Figure 7. Based on the maximal Youden index for predicting malignancy, we chose 153 as the best cutoff score. The patients were then divided into a high-risk group (biopsy recommended) with a final score of ≥153 points and a low-risk group (follow-up recommended) with a final score of <153 points. In the high-risk group, 35.5% (54/152) of patients had malignant pathology and 64.5% (98/152) had benign pathology. By comparison, only 7% (22/313) of patients in the low-risk group had malignant pathology, while 93% (291/313) had benign pathology, as shown in Figure 8. By stratifying patients using our nomogram, the unnecessary biopsy rate was reduced from 83.7% (389/465) to 21.1% (98/465), while 4.7% (22/465) of malignant BI-RADS(US) 4A lesions were missed. In total, 76 breast lesions were diagnosed as malignant, and all of them were tested immunohistochemically. A comparison was then made between the low-risk group and the high-risk group in terms of histological type, molecular subtype, Ki67, EGFR, P53, ER, PR, HER2 and CK5/6 status. As shown in Table 3, the rate of invasive carcinoma was higher in the high-risk group than in the low-risk group, while the rate of positive ER was higher in the low-risk group.
The risk classification performance of our nomogram.
Comparison of Pathological Parameters Between High-Risk Group (Recommended for Biopsy or Surgery) and Low-Risk Group (Recommended for Follow Up). Bold values highlight the name of each characteristic and have no special meaning.
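The stratification figures above can be checked with a short calculation. The sketch below takes the four counts reported at the 153-point cutoff and derives the percentages quoted in the text; the sensitivity, specificity, and Youden index it prints are derived here from those counts and are not values stated by the authors.

```python
# Counts reported in the text at the cutoff of 153 points
high_malignant, high_benign = 54, 98   # high-risk group (n = 152)
low_malignant, low_benign = 22, 291    # low-risk group (n = 313)

total = high_malignant + high_benign + low_malignant + low_benign

# Share of each group that was malignant or benign on pathology
ppv = high_malignant / (high_malignant + high_benign)
npv = low_benign / (low_malignant + low_benign)

# Derived here (not reported in the text): performance of the cutoff
sensitivity = high_malignant / (high_malignant + low_malignant)
specificity = low_benign / (low_benign + high_benign)
youden_j = sensitivity + specificity - 1

# Biopsies of benign lesions before and after stratification
benign_biopsy_before = (high_benign + low_benign) / total
benign_biopsy_after = high_benign / total
missed_malignant = low_malignant / total

print(f"malignant in high-risk group: {ppv:.1%}")
print(f"benign in low-risk group:     {npv:.1%}")
print(f"benign biopsies: {benign_biopsy_before:.1%} -> {benign_biopsy_after:.1%}")
print(f"missed malignant lesions:     {missed_malignant:.1%}")
print(f"sensitivity {sensitivity:.3f}, specificity {specificity:.3f}, J {youden_j:.3f}")
```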
Figure 9 illustrates with an example how our predictive nomogram can be used. When a patient diagnosed with a BI-RADS(US) 4A lesion presented to the hospital and met the criteria for the study, we calculated the score of each component (Age-43 years old: 0, Nipple Discharge: 45.63, Palpable: 0, BI-RADS(MG)-4A: 0, Calcification (MG): 0, Margin-uncircumscribed: 40.23). A comprehensive score can then be calculated based on the nomogram. The points of all risk factors added up to 85.86, which was less than 153 points, so the patient was classified into the low-risk group and could be recommended for continued follow up.
The workflow of how to use the nomogram.
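The worked example can be reproduced as a short calculation. In this sketch the component points are the ones quoted for the example patient, and the 153-point cutoff is the one reported above; the dictionary keys and the function name are hypothetical labels, and the full point table of the nomogram is not reproduced here.

```python
CUTOFF = 153  # best cutoff score reported in the text

# Component points quoted for the example patient
example_points = {
    "age_43_years": 0.0,
    "nipple_discharge": 45.63,
    "palpable": 0.0,
    "birads_mg_4a": 0.0,
    "calcification_mg": 0.0,
    "margin_uncircumscribed": 40.23,
}

def classify(points, cutoff=CUTOFF):
    """Sum the component points and assign a risk group by the cutoff."""
    total_points = sum(points.values())
    group = "high-risk" if total_points >= cutoff else "low-risk"
    return total_points, group

total_points, group = classify(example_points)
print(f"total = {total_points:.2f} points -> {group}")
```

A score at or above the cutoff would instead place the patient in the high-risk group, for which biopsy or surgery is recommended.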
Discussion
It is imperative to distinguish benign from malignant breast lesions in the early stages of breast disease. US and MG are indispensable examinations, especially in the diagnosis of early breast disease. 17 ACR developed the BI-RADS (US) to classify breast lesions into different categories based on their degree of malignancy. Breast lesions in categories 4 and 5 are recommended for biopsy or open surgery in order to determine a final pathologic diagnosis. 4 In current studies, BI-RADS(US) 4A lesions account for about half of all BI-RADS 4 lesions; however, only 2-10% of those lesions were ultimately diagnosed as malignant.17-21 Accordingly, it is appropriate to reduce the surveillance rank of BI-RADS 4A lesions by providing more comprehensive information about the lesions. 22 Thus, we can identify those patients who are suitable for short-term follow-up and those who may benefit from biopsy or surgery in order to reduce the burden on patients.
Although nomograms have been widely used to predict medical prognosis and outcomes by combining multiple risk factors, 9 there are only a few predictive nomograms that can be used to differentiate benign from malignant lesions in BI-RADS (US) 4A.5,9,10,23,24 As these predictive models rely on many detailed US or MG features but only a limited number of clinical characteristics, their application in some basic medical institutions may be limited by the technology and radiologist experience they require. Furthermore, some nomograms include MRI features, but MRI is expensive and not always accessible, which may contribute to the limited use of these nomograms.
In this study, we retrospectively analyzed 465 patients across China from our center and evaluated the incidence of, and the risk factors for, malignancy in BI-RADS (US) 4A lesions. Our results indicated that the unnecessary biopsy rate was reduced from 83.7% (389/465) to 21.1% (98/465). As far as we know, this is one of the largest studies investigating the relationship between BI-RADS (US) 4A lesions and the risk of malignancy. The purpose of this study was to develop a predictive nomogram based on clinical information and examination images for predicting malignancy in breast lesions classified as BI-RADS(US) category 4A; the nomogram performed satisfactorily in terms of discrimination and is capable of acting as a noninvasive decision-making model. In our nomogram, we incorporated 6 risk factors, including 3 clinical features (age, nipple discharge, palpable lesions), 1 US imaging feature (margin), and 2 MG imaging features (calcification, BI-RADS(MG)). Consistent with previous findings, an uncircumscribed US margin remained a risk predictor of malignancy in our study. The reason may be that malignant lesions usually grow infiltratively, which is why US examinations display irregular margin features. It should be noted, however, that no study has examined the relationship between the malignancy of BI-RADS 4A lesions and age. We retrospectively classified patients into three age groups in accordance with the age of peak breast cancer incidence in China and found that age was also influential in predicting the malignancy of BI-RADS(US) 4A lesions. In particular, among young patients (less than 45 years old), the risk of malignancy was only .39 times that of elderly patients (over 55 years old).
In addition, most studies have shown that palpability is a risk predictor of nodal involvement in breast cancer.25-30 Moreover, we found that palpable BI-RADS(US) 4A lesions were associated with a higher risk of malignancy. Malignant lesions tend to grow more rapidly than benign lesions, which may explain why they are more likely to be palpable. Nipple discharge is a common symptom of breast disease and can be classified as physiological or pathological. We newly identified nipple discharge as a protective factor in predicting the malignancy of BI-RADS 4A lesions. This phenomenon may result from the high proportion of benign lesions among BI-RADS(US) lesions and from the fact that pathological nipple discharge is primarily caused by benign breast disease, such as intraductal breast papilloma, with only 5∼23% caused by malignant breast disease.31,32 The BI-RADS(MG) category is a powerful indicator for predicting malignant tumors, regardless of whether it is analyzed in a univariate or multivariate manner. However, high breast density is believed to increase the false-positive rate.33,34 In our study, we found that more than half of the BI-RADS(MG) 4B lesions were ultimately determined to be benign. Therefore, it would be more rigorous to include BI-RADS(MG) as one component of the nomogram rather than relying solely on MG results, given the lower sensitivity of MG in women with dense breast tissue.
It is important to note that although some studies have reported that breast cancer family history is an independent risk factor for BI-RADS 4A lesions, we found that breast cancer family history and other clinical factors (such as breastfeeding history, smoking history, etc.) were not statistically significant for the malignancy of BI-RADS(US) 4A lesions.10,23 Based on our analysis, the total malignancy rate of BI-RADS (US) 4A lesions was 16.3%, which was higher compared to other studies, probably because some patients did not have complete clinical information or did not undergo standard histological examinations.10,18-21,35
There are still limitations in our study. ① The nomogram developed from this retrospective analysis was only tested internally and needs further external validation to verify its efficacy. ② As this was a retrospective study, bias was inevitable, so future studies should expand the sample size and conduct multicenter analyses. ③ Although this study did not include the calculation and justification of the sample size, which may affect the statistical significance of our results, we believe that the results from the current sample size are still reliable. ④ Since the sample size of our study was limited, each patient had only one BI-RADS(US) 4A lesion; there were no cases with more than one 4A lesion. In spite of this, we believe that patients with two or more 4A lesions can also benefit from our prediction model by evaluating the risk factors associated with each 4A lesion separately.
Conclusion
In conclusion, we developed a risk model using a cohort of BI-RADS(US) 4A lesions and identified age, nipple discharge, palpable lesions, uncircumscribed US margin, MG calcification, and BI-RADS(MG) 4B and above as significant predictors of the risk of malignancy. After establishing a well-discriminated nomogram that quantitatively measures each patient’s risk of malignancy, we stratified the patients into a low-risk group (recommended for follow-up) and a high-risk group (recommended for biopsy or surgery).
Footnotes
Acknowledgments
We thank the staff in the Tianjin Medical University Cancer Institute and Hospital for supporting the research.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the Wu Jieping Medical Foundation (320.6750.2021-10-2). This study was funded by Tianjin Key Medical Discipline(Specialty) Construction Project (TJYXZDXK-009A).
Ethics Statement
This study was deemed exempt from institutional review board approval by Tianjin Medical University Cancer Institute and Hospital and National Clinical Research Center for Cancer (Tianjin, China) and the informed consent was waived.
