Abstract
In accordance with national guidelines, we currently follow a linear approach to the diagnosis of thyroid nodules, with management decisions based primarily on a cytological diagnosis following fine-needle aspiration biopsy. However, 25% of these biopsies render an indeterminate cytology, leaving uncertainty regarding appropriate management. Individualizing the risk of malignancy of these nodules could improve their management significantly. We summarize the current evidence on the relevance of clinical information, radiological features, cytological features, and molecular marker test results and describe how these can be integrated to personalize the management of thyroid nodules with indeterminate cytology. Several factors can be used to stratify the risk of malignancy in thyroid nodules with indeterminate cytology. Male gender, large nodules (>4 cm), suspicious sonographic patterns, and the presence of nuclear atypia on cytology are all associated with an increased cancer prevalence. The added value of current molecular markers in the risk stratification process needs further study because their performance seems compromised in some clinical settings and remains to be validated in others. Risk stratification is possible in thyroid nodules with indeterminate cytology using data that are often underused by current guidelines. Future guidelines should integrate these factors and personalize the recommended diagnostic and therapeutic approaches accordingly.
Limitations and Opportunities of the Current Diagnostic Algorithm for the Evaluation of Thyroid Nodules
Most thyroid nodules are benign. However, thyroid cancer usually presents as a thyroid nodule. Establishing an adequate differential diagnosis is crucial to avoid unnecessary surgeries for asymptomatic benign nodules and delayed diagnosis and treatment for thyroid cancer. Current guidelines, including those from the American Thyroid Association (ATA); the American Association of Clinical Endocrinologists (AACE), American College of Endocrinology (ACE), and Associazione Medici Endocrinologi (AME); the National Comprehensive Cancer Network (NCCN); and the British Thyroid Association (BTA), follow a linear approach to the diagnosis of thyroid nodules (Figure 1). 1 -4 After clinical and physical evaluation, the sonographic appearance of the nodule is characterized to determine whether a biopsy is necessary. Management then relies almost exclusively on the cytological diagnosis, supplemented increasingly by the results of one of several molecular marker tests. This general approach has been replicated consistently across professional association guidelines worldwide and has remained fundamentally unchanged for many years. 1 -4 Nonetheless, it has a major drawback: approximately 25% of all thyroid biopsies do not render a definitive cytological diagnosis. 5

Figure 1. Current diagnostic approach for a newly diagnosed thyroid nodule.
We have known for many years, through population studies and autopsy series, that the prevalence of occult thyroid nodules and thyroid cancer is much higher than that which becomes clinically relevant. 6 -9 However, the rapid expansion of thyroid ultrasound use, both for diagnosis and for screening, has uncovered some of this reservoir of thyroid disease, contributing to a thyroid cancer “epidemic.” 10 Indeed, thyroid cancer incidence is currently rising faster than that of any other cancer type; it is already the fifth most common cancer among women. 11 During 2016, approximately 64 300 new cases of thyroid cancer were diagnosed in the United States. 12 Despite these striking figures and the many challenges that this situation creates for any health-care system, thyroid cancer overdiagnosis likely represents the tip of an iceberg. As the use of thyroid ultrasound and fine-needle aspiration (FNA) biopsy continues to increase rapidly (5- and 7-fold increase in the last decade, respectively), 13 the number of thyroid nodules with indeterminate cytology will follow those curves. In a recent meta-analysis, malignancies represented one-third of all resected nodules; half of all resected nodules had an indeterminate cytology; and two-thirds of them proved to be benign after a diagnostic surgery. 5 Applying these figures to the number of thyroid cancers diagnosed in 2016 (64 300), we estimate that 192 900 biopsied nodules were resected in 2016 (64 300 cancers × 3), one-half of which (192 900/2 = 96 450) had indeterminate cytology, and two-thirds of those ((96 450/3) × 2 = 64 300) ultimately proved to have benign histology. In summary, based on our current diagnostic approach, we removed approximately 64 300 thyroid cancers and, because of indeterminate cytology, approximately 64 300 benign thyroid nodules in 2016. 
This ongoing overtreatment exposes our patients to short- and long-term surgical complications, and many will face the need for lifelong thyroid hormone replacement. 14
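The back-of-the-envelope estimate above can be reproduced with a short script. This is only a sketch of the arithmetic: the function name and dictionary keys are ours, and it inherits the simplifying assumption made in the text that the meta-analysis proportions (one-third malignant, one-half indeterminate, two-thirds of indeterminate nodules benign) apply directly to the 64 300 cancers diagnosed in 2016.

```python
from fractions import Fraction


def estimate_resections(cancers: int) -> dict:
    """Reproduce the estimate in the text from the quoted proportions.

    Assumption (from the text): every diagnosed cancer corresponds to one
    resected nodule, and the meta-analysis proportions hold nationally.
    """
    malignant_share = Fraction(1, 3)      # malignancies among resected nodules
    indeterminate_share = Fraction(1, 2)  # resected nodules with indeterminate cytology
    benign_share = Fraction(2, 3)         # indeterminate nodules benign on histology

    resected = cancers / malignant_share
    indeterminate = resected * indeterminate_share
    benign_indeterminate = indeterminate * benign_share
    return {
        "resected": int(resected),
        "indeterminate": int(indeterminate),
        "benign_indeterminate": int(benign_indeterminate),
    }


if __name__ == "__main__":
    for label, count in estimate_resections(64_300).items():
        print(f"{label}: {count}")
```

Exact fractions are used instead of floating-point numbers so that the counts match the figures in the text without rounding.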
This situation has underpinned the development of several molecular marker tests, in an attempt to refine the preoperative diagnosis and reduce the rate of diagnostic surgeries. These tests have been embraced with enthusiasm in parts of the United States and have already gained a prominent role in the recommended evaluation of thyroid nodules with indeterminate cytology in the 2015 ATA and 2015 NCCN guidelines. 1,3 However, other professional association guidelines, like those of AACE/ACE/AME and BTA, are more cautious in recommending molecular markers for clinical use. 2,4 This more conservative attitude is based on the lack of large prospective independent validation studies, the significant cost of molecular markers (the most widely used tests currently cost >US$5 000 per nodule), and their unknown long-term impact on health-care management and cost-effectiveness.
Nonetheless, molecular tests have made at least 1 valuable contribution: making us aware of the need to individualize our assessment of a patient’s (and nodule’s) risk of malignancy, which underpins the interpretation of all subsequent tests. 15 It is well recognized that the performance, and therefore the clinical utility, of molecular marker tests is highly dependent on the pretest risk of malignancy 16 ; in this case, the prevalence of malignancy among thyroid nodules with indeterminate cytology, which varies widely among different institutions. 17 However, analyzing the risk of malignancy of each indeterminate category in each institution is just the first necessary step toward the correct interpretation of these tests, since many other factors affect the individual pretest risk of malignancy of a given nodule. 18 At present, most of these risk factors are used to alter the threshold for ultrasound evaluation and FNA biopsy but are largely ignored once cytology becomes available. 1 -4 We review here how clinical, radiological, cytological, and molecular features affect the risk of malignancy of thyroid nodules with indeterminate cytology and how these features can be integrated to optimize risk stratification.
Factors That Can Be Used for Risk Stratification of Indeterminate Thyroid Nodules
Clinical Characteristics
The evaluation of a patient with a known or suspected thyroid nodule begins with clinical history and physical evaluation. 1 -4 Some risk factors, although infrequent, significantly raise the suspicion for malignancy and may push the clinician to recommend surgery, despite indeterminate cytology. Examples include the finding of cervical lymph node metastases; hard consistency on palpation, with fixation to adjacent structures; rapidly progressive and painful growth; rapid onset of compressive symptoms in the central neck, including dysphagia, dysphonia, or dyspnea; or the loss of recurrent laryngeal nerve function, suggesting locally invasive disease. 19 Other risk factors modestly increase the risk of malignancy of thyroid nodules, including exposure to ionizing radiation, especially in childhood or adolescence; a strong family history of thyroid cancer in first-degree relatives; male gender; solitary nodules; and nodule size >4 cm. 20 Several studies have aimed to assess the impact of these factors on the risk of cancer of thyroid nodules with indeterminate cytology.
A recent meta-analysis of 19 studies, including 3494 patients with indeterminate thyroid nodules, showed that male gender and nodule size >4 cm were associated with an increased risk of malignancy (odds ratio [OR]: 1.51 and 2.10, respectively). 21 Nodule size was found to be an independent predictor of malignancy among mutation-negative nodules with atypia/follicular lesion of undetermined significance cytology in 1 study. 22 In another study, the risk of cancer was found to be lowest for nodules of 2.5 cm and increased by 40% and 50% for each centimeter increase or decrease in size, respectively. 23 However, the impact of size on the risk of cancer of indeterminate thyroid nodules is still controversial due to conflicting data in other studies. 24 Family history of thyroid cancer and past history of radiation exposure do not seem to significantly stratify the risk of malignancy of cytologically indeterminate thyroid nodules. 23,25,26 The impact of a solitary nodule on the risk stratification of indeterminate thyroid nodules is also unclear, as it has been associated with an increased risk, a decreased risk, and no effect on the risk of malignancy. 23,26 -30
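To give a sense of scale for the odds ratios quoted above, the following sketch converts an OR into an adjusted probability using the standard odds transformation. The 25% baseline risk is a hypothetical value chosen for illustration, not a figure from the meta-analysis, and the function name is ours. Strictly, the baseline should be the risk in the reference group (for example, women, or nodules ≤4 cm), so the output is indicative only.

```python
def adjust_risk(baseline_risk: float, odds_ratio: float) -> float:
    """Apply an odds ratio to a baseline probability.

    probability -> odds -> multiply by OR -> back to probability.
    """
    if not 0.0 < baseline_risk < 1.0:
        raise ValueError("baseline_risk must be strictly between 0 and 1")
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    adjusted_odds = baseline_odds * odds_ratio
    return adjusted_odds / (1.0 + adjusted_odds)


if __name__ == "__main__":
    baseline = 0.25  # hypothetical reference-group risk, for illustration only
    for factor, odds_ratio in [("male gender", 1.51), ("size >4 cm", 2.10)]:
        risk = adjust_risk(baseline, odds_ratio)
        print(f"{factor}: OR {odds_ratio} -> adjusted risk {risk:.1%}")
```

Because odds and probabilities diverge as risk rises, an OR of 2.10 does not double a 25% risk; under these assumptions it raises it to roughly 41%.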
Sonographic Features
Ultrasound plays a central role in the evaluation of thyroid nodules, identifying which nodules are worthy of biopsy. Some sonographic features have been consistently associated with an increased risk of malignancy, including (from least to most specific) internal vascularity (OR ∼ 3.5), solid composition (OR ∼ 4.5), hypoechogenicity (OR ∼ 5), irregular margins (OR ∼ 6), microcalcifications (OR ∼ 7), and shape taller than wide in the transverse view (OR ∼ 10). 20,31 -33 Unfortunately, the heterogeneity in the published literature for the risk of malignancy associated with each feature is substantial; none of these sonographic features has sensitivity and specificity high enough to be clinically relevant in isolation. 31,33 Furthermore, the features are subject to interpretation and influenced by external factors, such as imaging acquisition equipment, operator and settings, or screen resolution. As a result, the interobserver and intraobserver agreements for individual features are moderate (κ statistic of 0.4-0.6). 33 -35
Several classification systems have been developed to simplify the reporting of thyroid nodules and to guide the need for FNA biopsy. Each sonographic pattern is associated with an estimated risk of malignancy, and the use of patterns seems to increase interobserver agreement compared to individual sonographic features. 36,37 Table 1 summarizes and compares the 2015 ATA sonographic patterns and those described in other classifications. 1,2,4,38 -44 Despite differences in the number and nomenclature of sonographic patterns between classifications, all have demonstrated a significant gradation in the pretest risk of malignancy. 1,2,4,38 -44 More importantly, several studies, including ours, have demonstrated that these differences in the pretest risk of malignancy do have a significant impact on the interpretation of the risk of malignancy of thyroid nodules with indeterminate cytology. 45 -49 Results are difficult to compare because the sonographic patterns and the cytological diagnoses analyzed differ between studies, as does the underlying prevalence of malignancy, which ranges from 15% to 55% (Table 2). However, cytologically indeterminate thyroid nodules with a sonographic pattern equivalent to the ATA “very low suspicion” pattern had a prevalence of malignancy lower than other patterns in all studies, ranging from 4% to 22%, whereas those with a sonographic pattern equivalent to the ATA “high suspicion” pattern had a prevalence of malignancy consistently higher than other patterns, ranging from 46% to 100%. Cytologically indeterminate thyroid nodules with sonographic patterns equivalent to the ATA “low suspicion” or “intermediate suspicion” patterns had a prevalence of malignancy that ranged from 14% to 31%, with overlapping results between these categories. Current guidelines recognize that the sonographic appearance should influence management to some extent. 
For example, a suspicious ultrasound in a nodule with benign cytology should trigger reevaluation and possible rebiopsy within a year; whereas nodules with benign appearance and benign cytology may not need any follow-up whatsoever. 1,2,4 However, current guidelines do not yet recommend different management for cytologically indeterminate thyroid nodules with various sonographic patterns. 1 -4 Based on recent data demonstrating the ability of sonographic patterns to stratify the risk of malignancy, we believe that these patterns should not only inform the selection of nodules for biopsy but also guide management after an indeterminate cytological diagnosis.
Table 1. 2015 American Thyroid Association Sonographic Patterns and Equivalence With Other Classifications.
Abbreviations: AACE, American Association of Clinical Endocrinologists; ACE, American College of Endocrinology; AME, Associazione Medici Endocrinologi; CLT, chronic lymphocytic thyroiditis; ETE, extrathyroidal extension; LMN, lymph node metastasis; PoM, prevalence of malignancy; TIRADS, Thyroid Imaging Reporting and Data System.
(a) Classifications without validation studies in which risk estimates were derived from previous classification systems.
Table 2. Risk Stratification of Indeterminate Thyroid Nodules According to Sonographic Pattern.
Abbreviations: ATA, American Thyroid Association; Follicular tumors, equivalent to B-III and B-IV of the Bethesda system; B-III, atypia/follicular lesion of undetermined significance (Bethesda category III); B-IV, follicular/Hürthle cell neoplasm (Bethesda category IV); PoM, prevalence of malignancy; TIRADS, Thyroid Imaging Reporting and Data System; US pattern, sonographic pattern.
Additional Diagnostic Imaging Techniques
Thyroid nodules are one of the most frequent incidental findings reported on neck imaging studies, which have been increasingly used in recent decades. 50,51 Given that the prevalence of malignancy is not significantly different between palpable and nonpalpable nodules of the same size, many professional association guidelines recommend evaluating these incidental nodules with a dedicated thyroid ultrasound. 1 -3,52 Nonetheless, some guidelines, such as those of the BTA or the American College of Radiology, are more restrictive in selecting incidental thyroid nodules detected during computed tomography or magnetic resonance imaging for full evaluation, given the elevated prevalence of these findings. 4,50 Even so, there is general agreement that incidental hypermetabolic thyroid nodules detected during 18F-fluorodeoxyglucose positron emission tomography (PET) need further work-up with thyroid ultrasound and often biopsy, as this finding is associated with an increased risk of cancer of around 35%. 1 -4,50,53 However, the impact of hypermetabolism on the interpretation of the risk of malignancy of thyroid cytology has been unclear. Our recent study on 436 biopsied thyroid nodules with PET scan and cytological evaluation demonstrated that the risk of malignancy increases with increasing maximum standardized uptake values (SUVmax), exceeding that of the general population (around 5%) at levels ≥2.5. 54 This implies that nodules with SUVmax <2.5 should probably receive the standard management, whereas higher SUVmax values carry progressively increased suspicion for malignancy, which should arguably lower the size threshold for biopsy. 
54 The rate of aspirates diagnostic for malignancy increased with increasing SUVmax values; but SUVmax values did not alter the risk of malignancy of thyroid cytopathology: all resected nodules with a benign cytology were benign; all but one with malignant or suspicious for malignancy cytology were malignant; and no discrimination threshold was clinically useful for nodules with indeterminate cytology. 54
Cytological Features
Cytological evaluation of an FNA biopsy specimen remains the most precise single test for the presurgical diagnosis of a thyroid nodule. Table 3 summarizes the equivalence of the categories of the Bethesda system for reporting thyroid cytopathology (Bethesda) and those in other classifications. 55 -58 A benign (B-II) or a malignant (B-VI) diagnosis carries a risk of malignancy of <5% and >97%, respectively, making these categories very reliable. 55 Unfortunately, around 25% of FNA biopsies render an indeterminate cytological diagnosis. 5 Indeterminate cytology is stratified into 3 categories in the Bethesda system: atypia or follicular lesion of undetermined significance (B-III), follicular or Hürthle cell neoplasm (B-IV), and suspicious for malignancy (B-V). 55 Each category is associated with an estimated risk of malignancy: 5% to 15%, 15% to 30%, and 60% to 75%, respectively. 55 These ranges are used in current guidelines to recommend specific management for each category: repeat FNA in B-III, diagnostic lobectomy in B-IV, and lobectomy or total thyroidectomy in B-V. 1 -4 However, the risk of malignancy of each category varies much more widely in the published literature, ranging from 6% to 48% in B-III, 14% to 49% in B-IV, and 42% to 90% in B-V among different institutions. 17,18 As a consequence, the risk of malignancy of these categories may overlap in some institutions, making it necessary to individualize management recommendations for each indeterminate category. For example, our recently published series from Moffitt Cancer Center (Tampa, Florida) demonstrated a risk of malignancy among resected nodules of 30% in B-III and 33% in B-IV. 18
Table 3. Bethesda System for Reporting Thyroid Cytopathology and Equivalence With Other Classifications.
The variability in the risk of malignancy of indeterminate categories likely exists for several reasons, 59 although 3 of them are likely the most relevant:
1. Differences in the application of diagnostic criteria for cytology among different observers and institutions. There is substantial overlap between cytological diagnostic categories. 59 This is particularly true for indeterminate categories, where the interobserver agreement is much lower than for benign or malignant diagnoses. 60 -62 The decision on whether to allocate a specific aspirate to one or the other category is subjective and dependent on the pathologist’s training, experience, and local quality control. Second opinion or group consensus review has been shown to reduce the number of indeterminate aspirates and improve cytohistologic concordance, but interinstitutional agreement is likely to remain poor. 63 -65 Therefore, to optimize management recommendations, it is necessary to know the specific data for your institution/local pathologist.
2. The heterogeneity of the specimens in the indeterminate categories. Indeterminate categories are very heterogeneous, particularly the B-III category. Therefore, it is likely that not all nodules within the same diagnostic category have the same risk of malignancy. 59,66,67 For example, an aspirate with clotting artifact that creates tridimensional groups probably does not carry the same risk of malignancy as a nodule with mild or focal nuclear atypia suggestive of papillary thyroid carcinoma but insufficient to make a more definitive diagnosis, even though both scenarios are legitimately classified as B-III. 55 Several studies have attempted to refine the risk assessment by creating diagnostic subcategories. 66 In this regard, aspirates exhibiting mild/focal nuclear atypia have consistently exhibited a 2- to 3-fold higher risk of malignancy than other cytologically indeterminate thyroid nodules. 66 As a result, differences in the proportion of nodules with nuclear atypia within the indeterminate categories might explain, at least in part, the interinstitutional differences observed for the prevalence of malignancy of these categories, particularly for B-III specimens. 66 Furthermore, cytological subcategories seem to be associated with distinct histological profiles. 67 Therefore, implementing standardized subcategories in clinical practice should improve not only the cancer risk estimation but also the cytology–histology correlation, which is necessary to personalize management.
3. Differences in the interpretation of borderline follicular lesions in histological specimens. The most frequent malignancy among thyroid nodules with indeterminate cytology is the follicular variant of papillary thyroid carcinoma (FVPTC). 68 The diagnosis of these tumors is often challenging and has low interobserver agreement, even among acknowledged experts. This is particularly true when the lesion is encapsulated, has no invasive features, and shows only mild or focal areas of nuclear atypia. 69,70 These lesions may be classified as either follicular adenomas with atypia (benign) or encapsulated noninvasive FVPTC (malignant), depending on the pathologist’s interpretation. 71 Recently, a change in the nomenclature was proposed to recategorize well-defined noninvasive follicular pattern tumors with nuclear atypia as “noninvasive follicular thyroid neoplasm with papillary-like nuclear features” (NIFTP). 72 Recognizing the indolent behavior of these tumors, the new terminology eliminates the carcinoma label, but without truly defining them as benign, adding a further layer of complexity to this challenging arena. 72 At the same time, a nuclear-atypia score based on the qualitative assessment of 3 groups of nuclear features has been proposed to standardize the interpretation of nuclear atypia. 72 However, the interpretation of these nuclear features is still subjective; significant differences between observers are expected. 72 Depending on whether a stricter or less stringent threshold is applied for considering the nuclear atypia “sufficient” for an NIFTP diagnosis, the estimated risk of malignancy within cytologically indeterminate thyroid nodules may be significantly affected if these lesions are considered malignant. In contrast, if NIFTPs are no longer considered malignant, the risk of malignancy of all indeterminate categories should decrease significantly, 73,74 which could reduce the variability in the risk of malignancy of cytology categories between institutions.
The interinstitutional variability in the risk of malignancy of indeterminate categories was of little relevance a few years ago because the expectation was that a diagnostic surgery would be performed in most of these nodules. 75 However, it has gained relevance in recent years because of its impact on the performance of molecular marker tests. 16
Molecular Marker Tests
Several molecular marker tests have been commercialized in the United States in the last 5 years (Table 4). 76 -80 These can be classified according to their performance into 2 groups: those able to achieve high positive predictive value (PPV), often referred to as “rule-in” tests for cancer, and those able to achieve high negative predictive value (NPV), often referred to as “rule-out” tests. The advantage of rule-in tests is that they identify nodules that are most likely to be malignant, allowing an appropriate thyroid surgery to be planned from the outset and thereby reducing the need for a completion thyroidectomy. 81 This advantage was more relevant when the treatment of choice for all thyroid cancers >1 cm was total thyroidectomy. 75 However, recent guidelines have moved to acknowledge that many intrathyroidal thyroid cancers <4 cm may be adequately treated with a lobectomy alone, which might change the perceived value of rule-in tests. 1 In particular, lobectomy may be sufficient initial treatment for most FVPTCs and minimally invasive follicular thyroid carcinomas, which represent a significant majority of the malignancies among indeterminate thyroid nodules. 1,68
Table 4. Performance of Molecular Marker Tests as Reported in the Original Validation Studies.
Abbreviations: A/FLUS, atypia/follicular lesion of undetermined significance; FN/HCN, follicular neoplasm/Hürthle cell neoplasm; n, total number of nodules; NA, not applicable; NPV, negative predictive value; PoM, prevalence of malignancy; PPV, positive predictive value; Sn, sensitivity; Sp, specificity.
Furthermore, currently available rule-in tests rely on the identification of somatic mutations known to drive cancer. Most of the mutations found in nodules with indeterminate cytology occur in the RAS genes. 18,77,79 -81 The prevalence of malignancy of RAS-mutant tumors has usually been quoted at >80% in studies where histological interpretation was not blinded to molecular marker results and the interpretation of nuclear atypia was not standardized. 82,83 However, most malignant RAS-mutant tumors are FVPTCs, particularly encapsulated and noninvasive tumors (NIFTPs), which have very low malignant potential and for which no further treatment is usually recommended after lobectomy, limiting the clinical benefit of such a finding preoperatively. 18,72,82 -84 Moreover, the cancer prevalence of RAS-mutant tumors in several other, more recent studies has been lower, between 40% and 60%, suggesting that it may have been previously overestimated. 85,86 In our institution, RAS mutations were associated with a 53% (16/30 tumors) prevalence of cancer. 18,87 However, the prevalence of invasive RAS-mutant cancers (20%, 6/30 tumors) was similar to that of other larger series (30%). 83 This finding suggests that differences in the classification of noninvasive neoplasms are responsible for the discrepancy in the risk of cancer of RAS-mutant tumors, which contributes significantly to the overall PPV of oncogene panels being lower than previously reported. 87 Other more cancer-specific mutations, such as BRAF-V600E, are typically associated with a full display of classical nuclear features and categorized as B-V or B-VI by cytopathologists but are infrequent among B-III or B-IV specimens. 18,87 -89
The ATA recommends treating solitary, cytologically indeterminate thyroid nodules with a lobectomy but suggests that total thyroidectomy might be more appropriate if a cancer-specific mutation is identified, due to the higher prevalence of malignancy. 1 However, it has been recognized that mutations in RAS, the most frequent among nodules with indeterminate cytology, are usually associated with low-risk malignancies for which a lobectomy is often enough. 90 Therefore, current “rule-in” tests might have a limited role in guiding the surgical extent for solitary indeterminate thyroid nodules.
The advantage of rule-out tests is that surgery can be avoided if an indeterminate thyroid nodule is classified by the test as benign with a false-negative rate of approximately 5% or less (similar to that of a benign cytology). 3,17 Despite promising initial results, the ability to rule out cancer with that level of confidence has not been adequately validated to date. 76 The resection rate of nodules that are molecularly “benign” has been disappointingly low in all of the independent clinical validation studies (generally <10% of benign calls and ≤5 nodules resected in most studies), which precludes the accurate calculation of NPV. 91 -93
Moreover, the ATA has recently recognized that the clinical utility of molecular marker tests (the NPV and PPV) is heavily influenced by the pretest risk of malignancy. 16 Because clinical, sonographic, and cytological features influence the risk of malignancy of a given nodule beyond the cytological category, calculating the institutional prevalence of malignancy for each cytological category might be insufficient to adequately interpret molecular marker results. 18 In general, molecular markers are considered of little benefit in cases with very high or very low risk of malignancy. 16 For example, the risk of malignancy of indeterminate thyroid nodules with high-suspicion sonographic pattern or nuclear atypia might be around 45% to 50%. 47,66 Under these circumstances, none of the currently available molecular marker tests could, theoretically, achieve an NPV high enough to avoid surgery, whereas a positive result, despite increasing the risk of malignancy, is unlikely to change the clinical management given that a lobectomy is often sufficient in cytologically indeterminate thyroid nodules with a driver mutation (Figure 2). 90 Supporting this idea, a recent publication suggested that Afirma, a messenger RNA expression-based rule-out test, may not be useful in nodules with a repeated B-III cytology that are solid and hypoechoic on ultrasound scan because of the elevated pretest risk of malignancy in those nodules. 94 The performance of molecular marker tests (both rule-in and rule-out) in aspirates with oncocytic cells seems to be particularly poor, 18,95 -97 whereas the performance in other subsets of indeterminate thyroid nodules is currently unknown. In our experience, oncogene panels performed worse than expected in B-III specimens, perhaps due to differences in the cytological characteristics of specimens included in this category in our study and in previous studies. 18,87

Figure 2. Theoretical performance of molecular marker tests for indeterminate thyroid nodules with either high-suspicion sonographic pattern or nuclear atypia. The sensitivity (Sn) and specificity (Sp) of the tests were calculated for cytologically indeterminate thyroid nodules (Bethesda III and IV) with the information provided in the original validation studies. The expected negative (solid red line) and positive (solid blue line) predictive values were calculated using a prevalence of malignancy of 45% (dashed purple line). 77,79,80 NPV indicates negative predictive value; PPV, positive predictive value.
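The dependence of predictive values on pretest risk follows directly from Bayes' theorem and can be sketched in a few lines. The sensitivity and specificity used below (90% and 50%) are hypothetical values chosen for illustration; they are not the performance of any test in Table 4, and the function name is ours. Only the 45% prevalence comes from the text, and the 25% comparison value is assumed.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given pretest prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Hypothetical rule-out test; Sn and Sp are illustrative assumptions.
    sn, sp = 0.90, 0.50
    for prevalence in (0.25, 0.45):
        ppv, npv = predictive_values(sn, sp, prevalence)
        print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

Under these assumed values, the NPV falls from about 94% at a 25% pretest risk to about 86% at 45%. This illustrates why a rule-out test that performs acceptably in an average indeterminate nodule may leave too high a residual cancer risk in nodules with high-suspicion sonographic patterns or nuclear atypia.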
The reclassification of NIFTPs as benign (nonmalignant) tumors is expected to decrease the prevalence of malignancy of the indeterminate cytology categories by 20% to 45%, particularly among specimens with mild or focal nuclear atypia or architectural atypia. 67,73,74 This significant reduction in the pretest risk of malignancy is likely to increase the NPV and decrease the PPV of all molecular marker tests. However, the specific effect on these parameters will depend on the pathologist’s NIFTP diagnostic threshold, the proportion of NIFTPs within the indeterminate categories, and the proportion of NIFTPs that are classified as “benign” or “suspicious” by each test, all of which are currently unknown.
Additional independent clinical validation studies are needed to assess the performance of all of the molecular marker tests, applied to specific subgroups of indeterminate thyroid nodules. Until those results are available, we must carefully select which test to apply to an individual nodule with indeterminate cytology, considering all other risk factors that might impact on the test performance. 98,99 Indeterminate thyroid nodules that remain unresected on the basis of a rule-out test should be monitored carefully because the NPV of these tests has not been adequately validated. In that situation, the presence of suspicious clinical, sonographic, and/or cytological features, not just the nodule growth rate, should raise the concern about a false-negative result and lead to consideration of a repeat FNA biopsy. 22,100
Integration of Diagnostic Test Results
In our experience, the risk of malignancy of cytologically indeterminate thyroid nodules can be stratified using either sonographic patterns or cytological subcategories 47,67 but is further refined when these variables are integrated (data not published, Figure 3). More importantly, the information provided by the sonographic features and cytological characteristics improves the prognostication of histological outcome. 47,67 This information is necessary to optimize and personalize management because it is likely to affect the results of subsequent tests such as repeat biopsy or molecular marker tests. 95,96,101,102 Our findings are supported by the results of several other studies.

Proposed algorithm for evaluation and management of indeterminate thyroid nodules. Prevalence of malignancy in parentheses and most frequent malignancies for each scenario derived from Moffitt retrospective data. Note that other institutions could have different findings. Non-ATA suspicion sonographic pattern includes heteroechoic nodules and iso- or hyperechoic nodules with at least 1 suspicious sonographic feature. (a) Consider surgery if large (>4 cm), symptomatic, or patient preference. Consider repeating FNA if cytological specimen was of limited quality (scant cellularity/preparation artifact) in a solid nodule. (b) Repeat FNA is the preferred approach for A/FLUS unless management is already decided and unlikely to change with a different cytological diagnosis. Consider diagnostic surgery if large (>4 cm), symptomatic, patient preference, or if molecular markers are not available. Molecular markers could be helpful if surgery is not already indicated for other reasons, but their performance has not been validated in specific sonographic or cytologic scenarios, and negative results might need to be interpreted with caution. (c) In the absence of other indications for total thyroidectomy, a lobectomy is usually appropriate for A/FLUS or F/HN even if mutations are identified with oncogene panels. Consider repeating biopsy before surgery in A/FLUS. A/FLUS indicates atypia/follicular lesion of undetermined significance; ATA, American Thyroid Association; FC, follicular cell predominance without nuclear atypia; FNA, fine needle aspiration; F/HN, follicular/Hürthle cell neoplasm; FTC, follicular thyroid carcinoma; FVPTC, follicular variant of papillary thyroid carcinoma, includes encapsulated noninvasive tumors (NIFTP, asterisk denotes when NIFTP are the most frequent tumors); NA, nuclear atypia; OF, oncocytic features; OTA, other types of atypia such as air drying or clotting artifacts, atypical lymphocytes, atypical cyst-lining cells, reactive changes; Other, nonfollicular cell-derived malignancies (in our series lymphoma); PTC, conventional papillary thyroid carcinoma.
Rosario stratified 150 B-III specimens using 2 risk factors: nuclear atypia and suspicious sonographic pattern. 102 Nodules lacking both risk factors were associated with 4% prevalence of cancer, compared to 11% when only nuclear atypia was present, 47% when only suspicious ultrasound was present, and 87% when both factors were present. 102 Rago et al created a score with clinical and sonographic features in which the weight of each feature was given based on its association with malignancy in a cohort of 505 indeterminate thyroid nodules. 103 The cancer risk in nodules with a score >7 was 41% compared to 16% in nodules with score ≤4. In both score groups, the cancer risk increased by approximately 20 percentage points when only nodules with nuclear atypia were analyzed (63% and 35%, respectively). 103 In a later publication from the same group on 1520 cytologically indeterminate thyroid nodules, they used a similar scoring system that integrated patient age, 3 sonographic features (hypoechogenicity, irregular margins, and microcalcifications), and cytological category, and that stratified the risk of malignancy from 17% in the lowest risk group to 64% in the highest risk group. 68 In addition, they found that the presence of more advanced or aggressive malignancies (those requiring more than surgery and 131I remnant ablation) was similarly stratified, with <1% of these malignancies within the lowest risk score group. 68
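As a simple illustration of how two binary risk factors can be integrated, the four prevalences reported by Rosario can be arranged as a lookup table. This is only a restatement of the figures quoted above from a single cohort of 150 B-III specimens; the function and variable names are hypothetical, and the table is not a validated prediction tool.

```python
# Cancer prevalence reported by Rosario for B-III nodules, keyed by
# (nuclear atypia present, suspicious sonographic pattern present).
ROSARIO_PREVALENCE = {
    (False, False): 0.04,
    (True, False): 0.11,
    (False, True): 0.47,
    (True, True): 0.87,
}


def reported_prevalence(nuclear_atypia, suspicious_ultrasound):
    """Look up the prevalence reported for one combination of risk factors."""
    return ROSARIO_PREVALENCE[(bool(nuclear_atypia), bool(suspicious_ultrasound))]


for atypia in (False, True):
    for ultrasound in (False, True):
        risk = reported_prevalence(atypia, ultrasound)
        print(f"atypia={atypia!s:5} suspicious US={ultrasound!s:5} -> {risk:.0%}")
```

Laid out this way, the table makes plain that in this cohort the sonographic pattern separated the groups more strongly than nuclear atypia alone, and that the combination separated them most.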
The McGill thyroid nodule score was developed by a single-center multidisciplinary team that included endocrinologists, otolaryngologists, surgeons, and pathologists. 104 A total of 22 variables including clinical, laboratory, radiological (ultrasound and PET), cytological, and molecular features were included in the score. 104 The relative weight of each variable was determined according to the estimated cancer risk of the variable and the supporting evidence in the literature, by consensus of the panel. 104 Varshney et al found that this score was lower for cytologically indeterminate thyroid nodules with benign histology than for those with malignant histology (7 and 9, respectively). The estimated cancer risks associated with those McGill scores were 32% and 63%, respectively (P = .001). 105
Others have created mathematical models with variables selected through multivariable logistic regression. The area under the receiver operating characteristic curve (AUC) is used to measure the discrimination ability of diagnostic models. An AUC of 1 indicates perfect discrimination, >0.8 good discrimination, 0.6 to 0.8 moderate discrimination, and <0.6 poor discrimination. 106 Banks et al developed a model in a retrospective cohort of 638 cytologically indeterminate thyroid nodules using 3 variables (patient age, nodule size, and cytological diagnosis) that had an AUC of 0.74. 23 This model was validated in an independent cohort of 135 cytologically indeterminate thyroid nodules, where it achieved an AUC of 0.85. 23 Patient age, nodule size, and cytological diagnosis were also used to develop a score that stratified the risk of malignancy. Scores <5 were associated with <25% cancer risk; scores 5 to 15 were associated with 25% to 75% cancer risk; and scores >15 were associated with >75% cancer risk. 23 Lubitz et al found that the best 3 variables to predict malignancy in a cohort of 144 follicular neoplasms were nodule size and the presence of transgressing vessels or nuclear grooves on cytology. In their retrospective cohort, this model had an AUC of 0.88. 107 Macias et al developed a model that integrated information on age, tobacco use, nodule size, presence of calcifications on ultrasound imaging, and nuclear atypia on cytology that had an AUC of 0.82 on 151 follicular neoplasms. 25 Lin et al developed a scoring system based on 3 variables (sonographic pattern, elastographic score, and cytological category) that achieved an AUC of 0.87 on the 167 nodules of the study. 26
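To make the AUC figures above concrete, the AUC can be read as the probability that a randomly chosen malignant nodule receives a higher model score than a randomly chosen benign one, with ties counted as one half. The following minimal sketch computes it by direct pairwise comparison and applies the interpretation bands quoted above. The scores are invented toy data, the function names are hypothetical, and assigning values of exactly 0.6 or 0.8 to the "moderate" band is an assumption, since the text does not specify the boundaries.

```python
def auc(scores_malignant, scores_benign):
    """AUC as the fraction of malignant/benign pairs ranked correctly.

    Ties between a malignant and a benign score count as half a win.
    """
    wins = 0.0
    for m in scores_malignant:
        for b in scores_benign:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(scores_malignant) * len(scores_benign))


def discrimination_label(value):
    """Map an AUC to the bands quoted in the text (boundary handling assumed)."""
    if value == 1.0:
        return "perfect"
    if value > 0.8:
        return "good"
    if value >= 0.6:
        return "moderate"
    return "poor"


# Toy risk scores (invented for illustration only).
malignant_scores = [9, 7, 6, 4]
benign_scores = [8, 5, 3, 2]

toy_auc = auc(malignant_scores, benign_scores)
print(toy_auc, discrimination_label(toy_auc))
```

In this toy example, 12 of the 16 malignant/benign pairs are ranked correctly, giving an AUC of 0.75, which falls in the moderate band.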
Artificial neural network analysis is a highly flexible nonlinear regression model that uses machine learning algorithms inspired by the structural and functional aspects of neurons to integrate large sets of data and so generate a prediction of the probability of a specific event. 108 This approach has been used for different diagnostic and prognostic purposes in medicine. Ippolito et al used artificial neural network analysis to develop a model in 371 cytologically indeterminate thyroid nodules that was validated in an independent set of 82 nodules. 109 They found that only certain cytological descriptors, but not clinical information, contributed to the model, which performed better than standard cytological diagnosis alone, achieving an AUC of 0.88. 109 The work from Saylam et al suggests that models developed through artificial neural network analysis might be superior to multivariable logistic regression models. 110 In their study on 116 cytologically indeterminate thyroid nodules, the model developed through artificial neural network analysis achieved an AUC of 0.82 compared to 0.70 for the model developed through multivariable logistic regression using the same variables. 110
Interestingly, the AUC of commercialized molecular marker tests seems not to be superior to that of the models described above, which use readily available information at no extra cost. In a recent publication describing our experience with ThyroSeq version 2, we found an AUC of 0.84 in the Bethesda IV category, where the performance was similar to the original validation study, whereas the AUC for Bethesda III specimens was significantly worse, at 0.57 (P = 0.03). 87 Unfortunately, none of the clinical risk assessment models have been subjected to independent validation studies that might support their generalization. Perhaps this is because each study has looked into different variables, resulting in different algorithms. Most of them, however, incorporate some sonographic features (or patterns) and cytological features (often the presence of nuclear atypia), and frequently some other clinical or biochemical data. 23,25,26,102,103,105,107 The impact of NIFTP reclassification on the performance of these models needs to be addressed.
Future Challenges and Perspectives
Noninvasive follicular thyroid neoplasms are neither entirely benign nor threatening malignancies, but rather tumors likely to be in a benign-to-malignant transformation. 71,72,111,112 This concept, which also applies to other thyroid neoplastic lesions (such as well-differentiated tumors of uncertain malignant potential, minimally invasive follicular thyroid carcinomas, or papillary microcarcinomas), renders the traditional dichotomous outcome of thyroid histology (“benign” or “malignant”) obsolete. 113 As a consequence, NIFTP reclassification not only changes the nomenclature of some tumors but also constitutes acceptance of borderline/precursor lesions in the thyroid. In this context, B-III and B-IV are no longer indeterminate categories where we have benign or malignant tumors but rather categories that can accommodate these borderline/precursor follicular pattern lesions (Figure 4). The identification of more aggressive malignancies contaminating these categories seems to be appropriately performed through clinical risk assessment 68 ; but identifying borderline/precursor lesions contaminating higher risk cytological categories, particularly B-V, might be equally necessary to avoid overdiagnosis and overtreatment. 73,74,114 Therefore, we believe that it is time to incorporate clinical risk assessment tools/algorithms into clinical practice to allow personalized management of cytologically indeterminate thyroid nodules. Molecular marker test results have also shown early promise in this regard. However, both rule-in and rule-out tests should be applied, in our opinion, with caution because of the paucity of adequate independent clinical validation studies and the unknown impact on mid- and long-term clinical outcomes. Future professional guidelines need to face the challenge of changing the current linear diagnostic algorithm to one that integrates all relevant information before and after each diagnostic step.
Standardizing reporting systems and the interpretation of sonographic and cytologic features will be key to generating generalizable models. Meanwhile, risk stratification of thyroid nodules with indeterminate cytology has proven feasible using clinical factors such as gender or nodule size; sonographic features, preferably grouped into sonographic patterns; and cytological features such as the presence or absence of nuclear atypia. Such stratification, however, needs to be tailored to institutional outcomes.

Thyroid cytology–histology correlation. (A) Traditional view of thyroid cytology–histology correlation. (B) Current view of thyroid cytology–histology correlation. Borderline/precursor lesions include neoplastic lesions without clear evidence of invasion or minimally invasive, with or without papillary-like nuclear features.
Footnotes
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: No significant relationships exist between the authors and the companies/organizations whose products or services may be referenced in this article. Dr McIver receives grant/research support from GeneproDx. Dr Valderrabano has nothing to disclose.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
