Abstract
Objective:
The Thyroid Imaging Reporting and Data System (TIRADS) has been proposed to reduce the number of unnecessary fine needle aspirations (FNA) from thyroid nodules.
Materials and Methods:
A single radiologist performed sonographic examinations and FNA on a collection of 188 thyroid nodules. The recommendation of each TIRADS system for each nodule was determined and evaluated against the cytology results.
Results:
The American College of Radiology (ACR), artificial intelligence (AI), European (EU), and Korean (K) scoring systems reduced FNAs by 53%, 56%, 48%, and 28%, respectively. Among the lesions without a recommendation for immediate FNA, the ACR would have missed four malignant nodules, the AI would have missed four malignant nodules, and the K TIRADS would have missed three malignant nodules, all with recommended follow-up imaging. The ACR would have missed three malignant nodules, the AI would have missed four malignant nodules, and the EU TIRADS would have missed four malignant nodules, without a recommended follow-up examination. The highest and lowest kappa agreements were between ACR and AI (0.902) and between AI and K (0.448), respectively.
Conclusion:
The ACR and AI TIRADS could substantially decrease the number of FNAs but rely on follow-up imaging. The EU TIRADS reduced the number of FNAs slightly less than the ACR and AI TIRADS; however, this system depended less on follow-up imaging. The K TIRADS was the most conservative method and the least dependent on follow-up diagnostics.
The American Thyroid Association defines a thyroid nodule as “a discrete lesion within the thyroid gland that is radiologically distinct from the surrounding thyroid parenchyma.” 1 Its prevalence has been estimated at 2%–6% with palpation, 19%–35% with sonography, and 8%–65% at autopsy.2,3 Thyroid nodules can cause functional abnormalities or pressure effects; however, malignancy is the main concern.
In evaluating a patient with a thyroid nodule, thyroid function tests follow the history and physical examination. If the thyroid stimulating hormone (TSH) level is increased, sonographic evaluation of the thyroid gland is the next diagnostic step. If the presence of a nodule is confirmed, fine needle aspiration (FNA) is considered. The treatment plan and the need for surgery are decided according to the FNA results. 3 FNA is an invasive diagnostic procedure that costs more than a stand-alone sonogram. Furthermore, some FNA results are indeterminate and a number are benign. Conversely, sonography is a noninvasive and considerably less costly tool, which could decrease the number of unnecessary FNAs by identifying benign nodules.
Thyroid imaging reporting and data systems (TIRADS) have been proposed to reduce the number of unnecessary FNAs. 4 Horvath et al proposed the first TIRADS by modifying the Breast Imaging Reporting and Data System (BIRADS). 5 Efforts to improve its efficacy in recent years have led to a number of versions. The most accepted systems are K TIRADS (Korean TIRADS), EU TIRADS (European TIRADS), ACR TIRADS (American College of Radiology TIRADS), and AI TIRADS (Artificial Intelligence TIRADS). The K TIRADS and EU TIRADS are descriptive methods,6,7 whereas ACR TIRADS uses a scoring system. 8 AI TIRADS was derived by optimizing the ACR TIRADS scoring system with artificial intelligence. 9 There is no agreement on which TIRADS best reduces unnecessary FNAs while carrying the lowest risk of missing malignancies. The goal of this study was to assess the recommendations of these TIRADS methods against FNA outcomes and their impact on the reduction of unnecessary FNAs.
Materials and Methods
This study was conducted after approval by the ethical committee of the corresponding medical school. It was a cohort study with convenience sampling of patients referred to the hospital's radiology department for thyroid nodule FNA. After informed consent, the patients underwent high-resolution gray-scale sonographic evaluation with a high-frequency linear transducer (12-6.2 MHz) on a GE Voluson 730 system. The full thyroid sonographic imaging protocol, which includes evaluation of the lymph nodes, was used. Each patient’s thyroid was evaluated in both the axial and sagittal planes. A single radiologist evaluated the nodules in real time for composition, echogenicity, shape, margin, and echogenic foci. After the sonographic evaluation, the FNA was performed using a 21- or 23-gauge needle and a 10 cc syringe. The cytology reports were prepared by a single pathologist in the institution, who was blinded to the sonographic results. Patients’ demographic data and the sonographic features of their thyroid nodules were documented. Sonographic images of the nodules were also stored in DICOM format in case reinspection became necessary during the study. A TIRADS category was then assigned to every nodule under each TIRADS system (see Figures 1–3). The ACR TIRADS and AI TIRADS include a TIRADS 1 category for benign nodules, whereas the TIRADS 1 category in K TIRADS and EU TIRADS indicates that no nodule has been detected. Since all of the patients in this study had at least one thyroid nodule, there is no TIRADS 1 category for K TIRADS and EU TIRADS in these results.

Figure 1. The ACR and AI TIRADS flow chart. ACR, American College of Radiology; AI, artificial intelligence; TIRADS, Thyroid Imaging Reporting and Data System.

Figure 2. The EU-TIRADS flow chart. EU-TIRADS, European-Thyroid Imaging Reporting and Data System.

Figure 3. The K-TIRADS flow chart. K-TIRADS, Korean-Thyroid Imaging Reporting and Data System.
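As an illustration of how a point-based system turns the five sonographic features and nodule size into a recommendation, the following is a minimal sketch of ACR TIRADS scoring. The point values and size thresholds are written from the commonly summarized ACR TI-RADS chart and are assumptions to be checked against the ACR white paper 8; they are not taken from this study's flow chart. The feature names, the treatment of a 1-point nodule as TR1, and the single-value handling of echogenic foci are simplifications introduced here.

```python
# Hypothetical feature vocabulary; point values follow the commonly summarized
# ACR TI-RADS chart (assumption: verify against the ACR white paper).
ACR_POINTS = {
    "composition": {"cystic": 0, "spongiform": 0, "mixed": 1, "solid": 2},
    "echogenicity": {"anechoic": 0, "hyperechoic": 1, "isoechoic": 1,
                     "hypoechoic": 2, "very_hypoechoic": 3},
    "shape": {"wider_than_tall": 0, "taller_than_wide": 3},
    "margin": {"smooth": 0, "ill_defined": 0, "lobulated_irregular": 2,
               "extrathyroidal_extension": 3},
    # Simplification: one foci type per nodule (the chart sums all that apply).
    "echogenic_foci": {"none": 0, "comet_tail": 0, "macrocalcification": 1,
                       "peripheral_rim": 2, "punctate": 3},
}

# Per level: (FNA size threshold in mm, follow-up size threshold in mm).
# None means the level carries no FNA or follow-up recommendation.
ACR_THRESHOLDS_MM = {1: (None, None), 2: (None, None),
                     3: (25, 15), 4: (15, 10), 5: (10, 5)}


def acr_level(points):
    """Map a total point score to a TR level (1 to 5)."""
    if points <= 1:
        return 1  # assumption: a 1-point nodule is treated as TR1
    if points == 2:
        return 2
    if points == 3:
        return 3
    if points <= 6:
        return 4
    return 5


def acr_recommendation(features, size_mm):
    """Return (TR level, recommendation), recommendation in FNA / FU / none."""
    points = sum(ACR_POINTS[name][value] for name, value in features.items())
    level = acr_level(points)
    fna_mm, fu_mm = ACR_THRESHOLDS_MM[level]
    if fna_mm is not None and size_mm >= fna_mm:
        return level, "FNA"
    if fu_mm is not None and size_mm >= fu_mm:
        return level, "FU"
    return level, "none"


# A solid, isoechoic, wider-than-tall nodule with a smooth margin and no
# echogenic foci scores 2 + 1 + 0 + 0 + 0 = 3 points.
example_nodule = {"composition": "solid", "echogenicity": "isoechoic",
                  "shape": "wider_than_tall", "margin": "smooth",
                  "echogenic_foci": "none"}
print(acr_recommendation(example_nodule, 13))
```

Under these assumptions, a 13 mm nodule with the features above falls below both the follow-up and FNA thresholds of its level, so the sketch returns no recommendation; the same nodule at 15 mm would be followed and at 25 mm would be aspirated.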
Pathology results were reported in the Bethesda system. Bethesda 1 (nondiagnostic/unsatisfactory) and Bethesda 6 (already known malignant nodules) results were excluded from the study. Cytology results of Bethesda 3 (atypia of undetermined significance) and Bethesda 4 (suspicious for follicular neoplasm) were grouped with Bethesda 5 (suspicious for malignancy) and considered malignant in the analyses. Therefore, nodules with Bethesda 2 were considered benign and nodules with Bethesda 3, 4, or 5 were considered malignant.
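The grouping described above can be written as a small lookup. This is a minimal sketch; the function name and the use of `None` to mark excluded categories are choices made here for illustration, not part of the study's workflow.

```python
from typing import Optional


def bethesda_to_label(category: int) -> Optional[str]:
    """Map a Bethesda category (1 to 6) to the study's analysis label.

    Returns None for the categories the study excluded (1 and 6).
    """
    if category in (1, 6):
        return None          # excluded from the analysis
    if category == 2:
        return "benign"
    if category in (3, 4, 5):
        return "malignant"   # grouped as malignant for the analysis
    raise ValueError(f"Bethesda category must be 1 to 6, got {category}")


# Hypothetical cytology results for six nodules.
labels = [bethesda_to_label(c) for c in (1, 2, 3, 4, 5, 6)]
print(labels)
```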
The sample size of 188 nodules was calculated with an effect size for kappa of 0.3 (alpha = 0.05, beta = 0.1); the nodules were collected from April 2019 to April 2020. The statistical analysis was performed using IBM SPSS version 18. The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated for each TIRADS scoring system based on the FNA recommendation of the corresponding TIRADS system and the cytopathology results. The kappa agreement between the different TIRADS scoring systems was calculated according to their recommendations on the need for FNA at the time of the sonographic evaluation. At this stage, a recommendation for follow-up imaging was treated as a negative result for FNA.
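The four diagnostic measures named above follow from a 2 × 2 table in which an FNA recommendation is the positive test and the cytology grouping is the reference. The sketch below shows the arithmetic only; the study used SPSS, and the counts in the example are illustrative values chosen to be roughly in line with the percentages reported for K TIRADS, not figures taken from the study's tables.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV from 2 x 2 table counts.

    tp: malignant nodules with an FNA recommendation
    fp: benign nodules with an FNA recommendation
    fn: malignant nodules without an FNA recommendation
    tn: benign nodules without an FNA recommendation
    """
    if min(tp + fn, fp + tn, tp + fp, fn + tn) == 0:
        raise ValueError("every row and column of the table must be nonempty")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Illustrative counts only (assumption, not study data): 188 nodules in total.
metrics = diagnostic_metrics(tp=16, fp=115, fn=3, tn=54)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because follow-up recommendations count as negative, any malignant nodule assigned to follow-up contributes to `fn` in this layout.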
Results
There were 164 patients included in the study (144 females, 20 males), of whom 24 had 2 nodules, for a total of 188 nodules. The mean age of the patients was 46.7 years (SD = 11.2), ranging from 21 to 79 years. The average nodule size was 23.82 mm (SD = 12.10 mm).
The TIRADS category for each nodule was calculated for ACR, AI, EU, and K TIRADS based on that nodule’s characteristics and size. The management plan for every nodule was then determined from the corresponding TIRADS guideline (see Table 1).
Table 1. The ACR, AI, EU, and K TIRADS Recommendations for Each Nodule, Based on the Lesion’s Characteristics and Size.
Abbreviations: ACR, American College of Radiology; AI, artificial intelligence; EU, European; FNA, fine needle aspiration; FU, follow-up; K, Korean; TIRADS, Thyroid Imaging Reporting and Data System.
The number and percentage of prevented FNAs, malignancies missed but followed, and malignancies missed entirely for each TIRADS system are shown in Table 2 and Figure 4. The AI TIRADS could prevent 56% of FNAs, followed by ACR (52%), EU (48%), and K (27%). The sensitivity, specificity, PPV, and NPV were calculated from these results (see Table 3). The K TIRADS had the highest sensitivity (84%), the AI TIRADS had the highest specificity (62%), and the EU TIRADS had the highest PPV and NPV (16% and 96%, respectively).
Table 2. The Number and Percent of the Prevented FNAs, Followed, or Missed Malignancies, Based on Each TIRADS Scoring System.
Abbreviations: ACR, American College of Radiology; AI, artificial intelligence; EU, European; FNA, Fine Needle Aspiration; K, Korean; TIRADS, Thyroid Imaging Reporting and Data System.

Figure 4. The percent of preventable FNAs, followed, or missed malignancies, based on each TIRADS scoring system. FNA, Fine Needle Aspiration; TIRADS, Thyroid Imaging Reporting and Data System.
Table 3. The Sensitivity, Specificity, PPV, and NPV for Each of the TIRADS Systems Based on the Diagnosis of Malignancy.
Abbreviations: ACR, American College of Radiology; AI, artificial intelligence; EU, European; K, Korean; NPV, negative predictive value; PPV, positive predictive value; TIRADS, Thyroid Imaging Reporting and Data System.
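The three tallies reported per system (prevented FNAs, malignancies missed but followed, and malignancies missed entirely) can be sketched as a single pass over the nodules. This is a hedged illustration: it assumes a prevented FNA means any nodule without an immediate FNA recommendation, which the text does not spell out, and the input records below are hypothetical.

```python
from collections import Counter

VALID_RECOMMENDATIONS = ("FNA", "FU", "none")


def tally_outcomes(nodules):
    """Tally outcomes for one TIRADS system.

    nodules: list of (recommendation, is_malignant) pairs, where
    recommendation is "FNA", "FU" (follow-up imaging), or "none".
    """
    if not nodules:
        raise ValueError("at least one nodule is required")
    counts = Counter()
    for recommendation, is_malignant in nodules:
        if recommendation not in VALID_RECOMMENDATIONS:
            raise ValueError(f"unknown recommendation: {recommendation}")
        if recommendation == "FNA":
            continue  # aspirated now, so neither prevented nor missed
        counts["prevented_fna"] += 1
        if is_malignant:
            key = "missed_but_followed" if recommendation == "FU" else "missed_entirely"
            counts[key] += 1
    return {
        "prevented_fna": counts["prevented_fna"],
        "prevented_fna_percent": 100 * counts["prevented_fna"] / len(nodules),
        "missed_but_followed": counts["missed_but_followed"],
        "missed_entirely": counts["missed_entirely"],
    }


# Hypothetical records for eight nodules (not study data).
records = [("FNA", True), ("FNA", False), ("FNA", False), ("FU", True),
           ("FU", False), ("none", True), ("none", False), ("none", False)]
summary = tally_outcomes(records)
print(summary)
```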
By considering a recommendation for FNA as a positive result and any other recommendation as negative for each TIRADS classification, the kappa agreement between each pair of systems was calculated (Table 4). All pairs showed statistically significant agreement (P < .001). The K TIRADS had relatively lower kappa values with the other TIRADS systems than those systems had with each other.
Table 4. The Kappa Interrelated Agreement Between the Different Types of TIRADS Scoring Systems.
Abbreviations: ACR, American College of Radiology; AI, artificial intelligence; EU, European; K, Korean; TIRADS, Thyroid Imaging Reporting and Data System.
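For readers who want to reproduce the pairwise agreement, Cohen's kappa for two binary recommendations (FNA versus no FNA) can be computed as below. This is a minimal sketch of the standard formula, not the SPSS procedure used in the study, and the two recommendation lists are hypothetical.

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of binary decisions.

    a, b: sequences of 0/1 (or False/True), one entry per nodule, where 1
    means the system recommends FNA.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be nonempty and of equal length")
    n = len(a)
    # Observed agreement: share of nodules on which both systems agree.
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    # Expected chance agreement from each system's marginal FNA rate.
    p_a = sum(a) / n
    p_b = sum(b) / n
    p_expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if p_expected == 1.0:
        return 1.0  # both systems constant and identical
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical FNA recommendations from two systems for ten nodules.
system_1 = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
system_2 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 1]
kappa = cohen_kappa(system_1, system_2)
print(f"kappa = {kappa:.3f}")
```

Here the two systems agree on 9 of 10 nodules, and chance agreement from the marginal rates is 0.5, so kappa is (0.9 - 0.5) / (1 - 0.5) = 0.8.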
Discussion
The Thyroid Imaging Reporting and Data System has been proposed to reduce unnecessary FNAs. 4 A number of TIRADS versions have been developed with the aim of increasing its accuracy and acceptance among physicians. The most recent and recognized TIRADS systems are K TIRADS, EU TIRADS, ACR TIRADS, and AI TIRADS.
According to the recommendations of each TIRADS system in this study (see Tables 1 and 2), ACR, AI, and EU TIRADS would decrease the number of FNAs to a similar degree (53%, 56%, and 48%, respectively), whereas K TIRADS would decrease it by roughly half as much (28%). Among nodules with a follow-up recommendation, the ACR would have missed four malignant nodules, the AI would have missed four malignant nodules, and the K TIRADS would have missed three malignant nodules. These lesions could have been detected on a follow-up imaging examination, according to the reported recommendations. With the EU TIRADS system, no malignant nodule fell into the diagnostic follow-up group. Finally, the ACR would have missed three malignant nodules, the AI would have missed four malignant nodules, and the EU TIRADS would have missed four malignant nodules, without a recommended follow-up examination. The K TIRADS would not have missed any malignancy without a recommendation for follow-up imaging.
On review of the missed nodules, three were found to be the same across all of the TIRADS systems. They fell into the category of missed malignancies without a follow-up recommendation for ACR, AI, and EU TIRADS and into the follow-up category for K TIRADS. These cases were a 69-year-old male with a 13 mm nodule and two females, 57 and 67 years old, with nodules of 11 and 9 mm, respectively. All of these nodules were solid, isoechoic, and wider than tall, with normal margins and without echogenic foci. It would seem that these lesions would have been missed without follow-up imaging. If a follow-up diagnostic examination were performed, only the K TIRADS might have detected them, based on their growth or the appearance of other suspicious features.
Tan et al, 10 in a similar prospective study with a smaller sample (144 nodules) and fewer malignant nodules on FNA (7 nodules), reported the recommendations of each of the TIRADS systems. They did not report the follow-up imaging category separately, having combined it with the category of no FNA recommendation. 10 Tan et al also excluded nodules smaller than 10 mm and nodules with an FNA report of Bethesda 3. 10 The initial sonographic exam and the FNA were completed by eight different individuals in their study. Based on Tan et al’s results, ACR, AI, and EU TIRADS would have missed one malignant nodule (with no mention of whether follow-up imaging was recommended for that nodule); however, the K TIRADS did not miss a malignant nodule. 10 Their results indicated that the ACR, AI, EU, and K TIRADS decreased the number of FNAs by 54%, 62%, 38%, and 12%, respectively. 10 The Tan et al results are comparable to the present study results (see Table 2 and Figure 4). In the present study as well, K TIRADS decreased the number of FNAs much less than the other TIRADS systems and did not miss a malignant nodule without a follow-up recommendation.
These TIRADS systems had similar PPVs (11%–16%) and NPVs (91%–96%). Sensitivity was highest for K TIRADS (84%) and lowest for AI TIRADS (58%). Conversely, specificity was lowest for K TIRADS (32%) and highest for AI TIRADS (62%). These results (see Table 3) are consistent with the numbers of prevented FNAs and missed malignancies (see Table 2). Tan et al 10 calculated sensitivity, specificity, PPV, and NPV differently; they considered a malignancy risk of more than 20% in each category of the TIRADS systems as the positive result. Because this analytical method differs from that of the current study, their values are not comparable to the present results.
The overdiagnosis of thyroid nodules, defined as “diagnosis of thyroid tumors that would not, if left alone, result in symptoms or death,” has been reported for about 80% and 45% of thyroid cancers in women and men, respectively. 11 It has also been suggested that increased diagnosis of thyroid cancers does not necessarily reduce mortality. 12 There is some evidence that even malignant nodules, in particular nodules smaller than 10 mm, could have indolent or nonaggressive behavior.13,14
Therefore, if the goal of using TIRADS is to exclude the “unnecessary lesions,” these might include malignant nodules. With the ACR and AI TIRADS systems,8,9 there is a risk of missed malignancies in the effort to reduce the cost and number of FNAs. These missed malignancies may or may not be detected during a follow-up diagnostic imaging examination. It should be emphasized that the ACR and AI TIRADS systems rely on follow-up imaging; without it, the number of missed malignancies could increase, based on the present results. The EU TIRADS had a similar number of missed malignancies but demonstrated less dependence on follow-up diagnostic imaging.
If the goal of TIRADS is to reduce FNAs for benign nodules, then the most conservative method, K TIRADS, would be the most appropriate, based on the current study results. With follow-up imaging, this system would be less likely to miss a malignant nodule. It is important to note that, without follow-up, the K TIRADS would likely miss about as many malignant nodules as the other TIRADS systems miss even when their recommended follow-up imaging is performed.
The similarity between ACR and AI TIRADS, their differences from K TIRADS, and the intermediate position of EU TIRADS are reflected in the kappa agreements (see Table 4), although all of the comparisons demonstrated a statistically significant result (P < .001).
The impact of using a TIRADS scoring system on patients and clinicians depends on the version that is used, as well as the socioeconomics of the population. The last factor to consider is the importance of follow-up diagnostic imaging. If patients are likely to comply with follow-up diagnostic imaging, ACR or AI TIRADS could be the first choice. EU TIRADS could be a more conservative substitute for them in this regard. If a follow-up examination cannot be ensured, then the K TIRADS would provide the most conservative course of action.
Although all of the TIRADS versions can potentially reduce the financial burden on the health system and patients, they carry the risk of missing some malignant nodules. The importance of this risk and its associated costs remain open to debate. Therefore, evaluation of the cost-benefit ratio of the different TIRADS scoring systems and of the diagnostic yield of follow-up examinations is recommended for future prospective studies.
Conclusion
The decision on which TIRADS to use depends on the demographic and socioeconomic state of the patients and region. In this study, the ACR and AI TIRADS provided a substantial decrease in unnecessary FNAs but required follow-up diagnostic imaging for these patients. The EU TIRADS decreased the number of unnecessary FNAs somewhat less than the ACR and AI TIRADS and relied less on follow-up diagnostic imaging. Conversely, the K TIRADS was the most conservative method, reducing the number of unnecessary FNAs the least in this cohort while having the lowest dependence on follow-up diagnostic imaging.
Limitations
This study was limited by its convenience sample and by threats to the research design. To remedy these limitations, a larger sample size, a stronger research design, and a larger proportion of malignant nodules would be beneficial. Including the cost of each treatment plan, as well as the results of follow-up diagnostic examinations, could significantly strengthen future study results.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
