Abstract

As noted in a 1973 Clinical Chemistry editorial, automation has resulted in precision displacing accuracy as the predominant standard for acceptable laboratory results; the author argues that ‘meaningful measurement’ must also address accuracy. 1 Because measurement of thyroid stimulating hormone (TSH) in conjunction with total triiodothyronine (TT3), total thyroxine (TT4), free T3 (FT3), and free T4 (FT4) is used for the diagnosis and management of thyroid disorders, it is important for all these assays to achieve optimal accuracy. A College of American Pathologists survey reported significant biases, so that more than 50% of results were unacceptably inaccurate in 13–60% of assays evaluated. 2 This inaccuracy, coupled with dissimilar reference intervals, complicates test interpretation and impacts the creation of clinical guidelines.
The International Federation of Clinical Chemistry (IFCC) Working Committee cofunded a project to harmonize thyroid hormone immunoassays (IAs), with the goal of achieving similar results with each IA method. To do this, clinical serum samples were run in multiple assays. All IAs were recalibrated to the statistically inferred targets with a robust mass spectrometry reference method. 3 Here, we describe some of the current limitations of thyroid hormone measurements by IAs that cannot be resolved by assay harmonization.
Limitations of IAs
The reliability of TSH as an index of thyroid function has been questioned due to findings of significant intra-individual effects, which include aging, pregnancy, comorbidities leading to euthyroid sick syndrome, drug-related effects on the hypothalamic-pituitary-thyroid axis, and diurnal and seasonal variations. 4,5 An investigation of 102 healthy volunteers using paired samples found that, compared with the mean morning (a.m.) concentration, the mean p.m. TSH concentration is 10% higher for males (reference interval: 6–8 a.m.: 0.7–3.7 µIU/ml; 6–8 p.m.: 0.7–4.7 µIU/ml) and 17% higher for females (reference interval: 6–8 a.m.: 0.5–4.3 µIU/ml; 6–8 p.m.: 0.5–6.1 µIU/ml). 5
TSH reference intervals should be age adjusted, particularly in individuals over 70 years old, to avoid diagnostic misclassification. The TEARS study revealed that the median TSH values in 153,127 adult participants without autoimmunity increased significantly with age, from 1.58 mU/liter at 31–40 years old to 1.86 mU/liter at >90 years old (p < 0.001). In addition, the 2.5th percentile decreased with age and the 97.5th percentile increased with age. 6
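The idea of an age-adjusted reference interval can be illustrated with a minimal sketch, assuming the interval is defined as the central 95% of values (2.5th to 97.5th percentile) within each age band. The function names, the 10-year band width, and the interpolation rule are illustrative assumptions, not the method used in the TEARS study, and any input data would be hypothetical.

```python
from collections import defaultdict


def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in 0..100) of a pre-sorted list."""
    if not sorted_values:
        raise ValueError("percentile() needs at least one value")
    rank = (len(sorted_values) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = rank - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


def reference_intervals_by_age(records, band_width=10):
    """Group (age, tsh) pairs into age bands and return, per band,
    the (2.5th percentile, 97.5th percentile) of TSH.

    records: iterable of (age_in_years, tsh_value) tuples (hypothetical data).
    Returns a dict keyed by (band_start, band_end) in years.
    """
    bands = defaultdict(list)
    for age, tsh in records:
        band_start = (int(age) // band_width) * band_width
        bands[band_start].append(tsh)

    intervals = {}
    for band_start in sorted(bands):
        values = sorted(bands[band_start])
        intervals[(band_start, band_start + band_width - 1)] = (
            percentile(values, 2.5),
            percentile(values, 97.5),
        )
    return intervals
```

With real cohort data, a widening gap between the two limits in older bands would correspond to the pattern described above, where the lower limit falls and the upper limit rises with age.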
Medications can cause clinically significant variations in TSH levels (Table 1). Thus, the utility of a single reference interval for TSH as a standalone marker for hyper- and hypothyroidism is questionable. 5
Table 1. Factors that alter thyroid hormone measurements by immunoassays in the absence of thyroidal illness.
FT3, free triiodothyronine; FT4, free thyroxine; NSAID, nonsteroidal anti-inflammatory drug; TBG, thyroxine binding globulin; TKI, tyrosine kinase inhibitor; TSH, thyroid stimulating hormone.
The diagnostic strategy of measuring both free thyroid hormone and TSH requires use of assays that correlate with the clinical presentation. Most immunoassays of free thyroid hormone are one-step direct analog IAs that are impacted by protein-binding variations. As free thyroid hormone concentrations measured by IA depend on serum binding proteins, 2 thyroxine binding globulin (TBG) deficiency or excess, abnormal transthyretin or albumin levels, and pregnancy have been found to alter free hormone levels. Similarly, medications can disrupt T3 and T4 binding to serum proteins and affect thyroid measurements (Table 1). Medical conditions such as cardiac surgery, renal disease, and critical illness affect both free and total thyroid hormone levels. Assay harmonization will not resolve the misleading IA results for free and total thyroid hormone measurements. Lastly, antibody-binding assays for stimulating and blocking TSH receptor antibody measurements require methodological optimization before they can be incorporated into clinical practice. 7
Comparisons of IA and liquid chromatography-tandem mass spectrometry results
Numerous studies have used paired samples to compare thyroid hormone IAs and liquid chromatography-tandem mass spectrometry (LC-MS/MS) results. These correlations are usually acceptable in the euthyroid range. However, up to 50% of IA measurements in patients with low T3 states provide a discordantly higher result (assay bias) than those obtained by LC-MS/MS. 8
Despite excellent precision, IAs often provide precisely the wrong value. In a population study, IAs showed a good inverse linear relationship between FT4 and log-transformed TSH in a pediatric population (correlation coefficient −0.82), while in adults the correlation coefficient of −0.48 was suboptimal. In contrast, FT4 measured by LC-MS/MS showed a good correlation coefficient of −0.90 in children and −0.77 in adults. 9 A subsequent study compared the thyroid hormone assay performance of LC-MS/MS versus IA in a diverse group of 100 patients of any age with any medical diagnosis in a mixed healthcare setting (inpatient versus outpatient) and demonstrated that the TT4 and TT3 values determined by the two assays had a good correlation coefficient (r: 0.91–0.95). Conversely, the FT4 and FT3 correlation coefficients were suboptimal (r: 0.75 and 0.50, respectively). A better correlation was found for FT3 and TT3 than for FT4 and TT4. The IAs gave discrepant values at the low and high ends of the established reference range, with only a moderate correlation coefficient (r: 0.51–0.75). 2 More recently, Hannah-Shmouni and Soldin highlighted the importance of LC-MS/MS in free thyroxine measurements. 10 This technique can avoid diagnostic misclassification of older adults with subclinical versus overt hypothyroidism.
In another study, 109 individuals were assigned to three equal-sized groups clinically characterized as hypothyroid, euthyroid, or hyperthyroid. For the entire group, the correlation coefficient of TSH with FT4 measured on the Siemens Immulite 2500 analyzer was moderate [0.45, 95% confidence interval (CI) 0.29–0.59]. 11 Analysis performed by LC-MS/MS demonstrated a better correlation (coefficient of 0.84, 95% CI 0.77–0.88). Importantly, when the euthyroid group was removed from the analysis, the correlation coefficient for IA dropped to 0.2, while that for LC-MS/MS was 0.72. In summary, the inverse log-linear correlation between TSH and FT4 was significantly improved when FT4 was assayed by LC-MS/MS compared with IA, indicating that FT4 results measured by LC-MS/MS agreed better with TSH results and the patient’s clinical condition. 12
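The inverse log-linear relationship discussed here can be quantified with a Pearson correlation between FT4 and log-transformed TSH. The following is a minimal sketch under stated assumptions: the function names are illustrative, base-10 logarithms are assumed (the choice of base does not change r), and the cited studies may have used different statistical procedures. An inverse relationship yields a negative coefficient; some reports quote its magnitude only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def ft4_log_tsh_correlation(ft4_values, tsh_values):
    """Correlate FT4 with log10(TSH) for paired samples.

    TSH values must be strictly positive so the logarithm is defined.
    """
    log_tsh = [math.log10(t) for t in tsh_values]
    return pearson_r(ft4_values, log_tsh)
```

Running the same function on paired FT4 values from two assays (for example, IA and LC-MS/MS) against the same TSH values would allow the two coefficients to be compared side by side.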
A study of 40 patients classified as having subclinical hypothyroidism using an FT4 IA found potential diagnostic misclassification: 65% of these patients had FT3 or FT4 values below the reference interval when measured by LC-MS/MS. The mass spectrometry findings agreed best with the clinical picture in this study and others. 8,13
In addition to improved clinical correlation, especially at low hormone concentrations, LC-MS/MS methods have the advantage of superior analytical sensitivity and specificity. 14 The assay performance is paramount and permits appropriate clinical decision-making, particularly in the lower or higher ends of a given reference range, and may change medical intervention from the watch-and-wait strategy to immediate implementation of targeted therapy. Unfortunately, the pitfalls of the IAs mentioned are not addressed by the current harmonization approach.
In summary, new data show that isolated TSH measurements by IA should no longer be regarded as the most reliable test of thyroid function. Similarly, FT4 measured by IA can be affected by TBG concentrations, heterophilic nonspecific antibodies, steroids, and various medications (Table 1). Measurements of plasma FT3/TT3 with accurate methods complement the clinical workup because of the biological activity of T3. 15 The value of T3 measurement is frequently seen in Graves’ disease T3-toxicosis, which is characterized by elevation of FT3/TT3 rather than of FT4/TT4. IAs for TT3/FT3 and FT4 frequently give falsely normal results in individuals with hypothyroidism, suggesting subclinical rather than overt hypothyroidism. 16
The direct measurement of thyroid hormones via LC-MS/MS is highly sensitive, specific, and precise, and these results correlate well with the patient’s clinical presentation. Measurement of TSH by IA may need to be accompanied by measurements of FT4 and TT3. Optimal measurements of FT4 and FT3 should include removal of TBG by ultrafiltration or equilibrium dialysis followed by measurement with either LC-MS/MS or IA, as this approach enables the most accurate assessment of the pituitary-thyroid axis. Therefore, this approach, as opposed to measurement of TSH levels alone or together with IA FT4, 3 is preferable and is the recommended method of screening for thyroid abnormalities. In support of this, the American Thyroid Association guidelines for the management of thyroid disorders during pregnancy recognize ultrafiltration or equilibrium dialysis followed by LC-MS/MS as a gold standard for measurement of thyroid hormones. 17
Removing IA interference requires the use of costlier LC-MS/MS techniques.
Since 2006, all FT4/FT3 analyses at Children’s National Medical Center have been performed by mass spectrometry rather than IA, under the leadership of both the Endocrinology Department and Dr Soldin. This resulted in improved diagnosis and patient management, while the LC-MS/MS unit generated a profit of approximately 2 million dollars per annum. An inaccurate IA result can lead to misdiagnosis, inappropriate treatment, and deleterious consequences for human health. Even if clinicians recognize interference in a given IA, additional testing, with its inherent financial costs, must be pursued. New information published over the last 8 years casts further doubt on the concept of IA thyroid hormone harmonization. The evidence accumulated over the past several years points to significant shortcomings in the diagnostic accuracy of FT4, FT3, TT4, TT3, and TSH measurements performed by IA. 18,19
