Abstract
Background:
Though the functional states of other endocrine systems are not defined on the basis of levels of controlling hormones, the assessment of thyroid function is based on levels of the controlling hormone thyrotropin (TSH). We, therefore, addressed the question as to whether levels of thyroid hormones [free thyroxine (fT4), total triiodothyronine (TT3)/free triiodothyronine (fT3)], or TSH levels, within and beyond the reference ranges, provide the better guide to the range of clinical parameters associated with thyroid status.
Methods:
A PubMed/MEDLINE search was performed for studies, published up to October 2019, that examined associations of levels of thyroid hormones and TSH, taken simultaneously in the same individuals, with clinical parameters. We analyzed atrial fibrillation, other cardiac parameters, osteoporosis and fracture, cancer, dementia, frailty, mortality, features of the metabolic syndrome, and pregnancy outcomes. Studies were assessed for quality by using a modified Newcastle–Ottawa score. Preferred Reporting Items for Systematic Reviews and Meta-analyses guidelines were followed. A meta-analysis of the associations was performed to determine the relative likelihood of fT4, TT3/fT3, and TSH levels being associated with the clinical parameters.
Results:
We identified 58 suitable articles and a total of 1880 associations. In general, clinical parameters were associated with thyroid hormone levels significantly more often than with TSH levels—the converse was not true for any of the clinical parameters. In the 1880 considered associations, fT4 levels were significantly associated with clinical parameters in 50% of analyses. The respective frequencies for TT3/fT3 and TSH levels were 53% and 23% (p < 0.0001 for both fT4 and TT3/fT3 vs. TSH). The fT4 and TT3/fT3 levels were comparably associated with clinical parameters (p = 0.71). More sophisticated statistical analyses, however, indicated that the associations with TT3/fT3 were not as robust as the associations with fT4.
Conclusions:
Thyroid hormone levels, and in particular fT4 levels, seem to have stronger associations with clinical parameters than do TSH levels. Associations of clinical parameters with TSH levels can be explained by the strong negative population correlation between thyroid hormones and TSH. Clinical and research components of thyroidology currently based on the measurement of the thyroid state by reference to TSH levels warrant reconsideration.
Introduction
Thyroid function testing (1,2
This classification of thyroid function is based on the concept of TSH levels being the most sensitive indicator of thyroid function such that subclinical thyroid dysfunction as currently defined is believed to be more significant than isolated hyper/hypothyroxinemia (2), as indicated by the alternative term for the latter, “euthyroid hyper/hypothyroidism” (4).
Subclinical thyroid dysfunction is common and comprises most cases of thyroid dysfunction with a population prevalence of ∼5% (5 –9), increasing to 15% in older adults (9). Even though it is generally asymptomatic or associated only with non-specific symptoms, subclinical thyroid dysfunction has been associated with many adverse outcomes across a variety of organ systems (5 –9). Therefore, despite the lack of convincing evidence of significant benefit (10,11), treatment for subclinical thyroid dysfunction has been recommended in certain circumstances (6 –9).
It has previously been suggested by some authors that the earlier definition of subclinical thyroid dysfunction is overly simple and that its diagnosis should not be based solely on the TSH level being outside of a general population range (12,13). Rather, it is claimed that more accuracy may be achieved by defining a normal reference range for the combination of thyroid hormones and TSH.
However, any model whereby judgment of the thyroid status includes consideration of the TSH level is anomalous, in that the levels of other physiological parameters are not judged by the levels of their controlling hormones. For example, whether or not an individual has hypoglycemia or hypercalcemia is not determined by reference to insulin (14) or parathyroid hormone levels (15), respectively. Adrenocorticotropic hormone (ACTH) levels, though helpful in diagnosing adrenal autonomy, are not considered diagnostic for Cushing's syndrome (16). In general, the level of a controlling hormone is used to determine the cause of a disturbance rather than identifying whether or not there is a disturbance (14 –16).
We, therefore, aimed at determining whether or not a systematic review of the literature might indicate the relative merits of thyroid hormone levels and TSH levels, in terms of associations with a broad range of clinical parameters. Because of the strong negative population correlation between free thyroxine (fT4) and TSH (17,18), we expected to find associations between both TSH and fT4 levels and the clinical features of thyroid dysfunction. We further reasoned that if the clinical features were associated better with TSH levels, the current rationale for thyroid function testing and the current consequent clinical and research classifications and practices would be supported, but, if the clinical features were associated better with thyroid hormone levels, these classifications and practices would warrant review. In this latter circumstance, the previously noted associations of clinical features with TSH levels could be attributed to the aforementioned strong negative population correlation between fT4 and TSH.
Methods
Search strategy
Up to October 9, 2019, a systematic search was performed of PubMed/MEDLINE by using the following terms: thyroxine (T4), fT4, total triiodothyronine (TT3), free triiodothyronine (fT3), TSH, and subclinical. No restrictions were placed on language, country, or publication date. The resulting literature was first examined to confirm the previously reported general trends of association between clinical parameters and thyroid status.
On account of the results of this first examination of the literature (see Results section), we studied atrial fibrillation (AF) and other cardiac parameters, bone density and fracture, cancer, death, frailty, dementia and associated pathology, obesity, features of the metabolic syndrome, and pregnancy outcomes. We specifically sought studies that addressed the associations between both TSH and thyroid hormone levels, determined simultaneously in the same individuals, with any of the clinical parameters just mentioned.
Study selection and data extraction
Initially, the titles of the articles were screened for relevance, followed by the abstracts, and the full-text reports of potentially relevant articles were then reviewed. Additional relevant articles were searched for in the reference lists of the retrieved full-text studies. If the same cohort was studied repeatedly, only the latest study was included. The literature search, data extraction, identification of additional relevant articles, and critical appraisal were conducted independently by two of the authors (S.P.F. and H.F.), and any discrepancies were resolved by consensus with reference to the criteria described in the next section. Had consensus regarding any article not been achieved, the default position was that the article would be included. No study that contradicted the results of our work was knowingly excluded.
Studies reporting on associations of levels of fT4, TT3/fT3, and TSH with clinical features related to thyroid dysfunction were included. We included both TT3 and fT3, as there were relatively few studies of fT3. We also included analyses comparing associations with subclinical hypothyroidism and euthyroid hypothyroxinemia, reasoning that this is a comparison of low thyroid function defined on the basis of TSH levels or thyroid hormone levels, respectively. Reports were excluded if the studied population was <100 individuals. Review articles, editorials, meta-analyses, and meeting abstracts were also excluded.
The following information was extracted from each such study: first author, country, number of individuals, sex, age intervals, nature of the study, and the relevant clinical parameter. As there were many subtly different parameters examined, we also grouped the parameters into eight major phenotypes or systems: “cardiac,” “bone,” “dementia,” “cancer,” “mortality,” “frailty,” “metabolic,” and “pregnancy.” We recorded any associations with thyroid hormones and/or TSH, in addition to the statistical techniques and degrees of significance of any associations (p-values and/or confidence limits). We also recorded the presence of “incongruent” associations, that is, associations in the opposite direction to that normally expected (e.g., obesity having associations with high thyroid function), or associations of thyroid hormones in the same direction as associations with TSH, as indicators of reverse causation (Supplementary Table S1) (19).
As our study was not directed at a collection of works addressing therapeutic outcomes of an intervention, the quality assessment tool (the Newcastle–Ottawa scale) was adjusted to suit this setting. Principally, this adjustment consisted of allowing for continuous, as well as binary, quantifications of clinical outcomes and exposure to thyroid hormone levels. Articles were scored according to the representativeness of the subjects, the similarity of the subjects apart from differences in the parameter of interest, the reliability of the classification of thyroid status and parameter status, control for confounding factors, and, for prospective studies, the demonstration that the outcome was not present at study onset and the adequacy of length and completeness of follow-up. The Preferred Reporting Items for Systematic Reviews and Meta-analyses guidelines were followed (20).
Statistical analysis
To determine whether thyroid hormone levels or TSH levels were associated better with the examined clinical parameters, we analyzed the earlier studies as to the relative frequencies of significant associations of thyroid hormone and TSH levels with the clinical parameters. We then performed further analyses to confirm that these findings did not result from any systematic bias.
We classified each result in a study as showing a significant result or a non-significant result. By a significant result, we mean that a given thyroid test has been shown to be associated with a given condition at a 5% significance level. We treated the result as a binary response variable with the levels of success (significant) and failure (non-significant). We combined TT3 and fT3 as TT3/fT3.
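The classification step just described can be illustrated with a minimal Python sketch. It is not the code used for the analysis (which was performed in R); the records, the field layout, and the use of a strict `p < 0.05` cutoff are assumptions made purely for illustration, and results reported only as confidence limits are not handled here.

```python
from collections import defaultdict

ALPHA = 0.05  # 5% significance level stated in the text

# Hypothetical records (study, thyroid test, p-value); values are illustrative only.
records = [
    ("study_A", "fT4", 0.003),
    ("study_A", "TSH", 0.21),
    ("study_A", "TT3", 0.04),
    ("study_B", "fT4", 0.08),
    ("study_B", "TSH", 0.01),
    ("study_B", "fT3", 0.60),
]

def normalise_test(name):
    """Pool TT3 and fT3 into the combined TT3/fT3 category."""
    return "TT3/fT3" if name in {"TT3", "fT3", "TT3/fT3"} else name

def is_significant(p_value, alpha=ALPHA):
    """Binary response: True (success) if significant, False (failure) otherwise."""
    return p_value < alpha

def proportion_significant(rows):
    """Return the fraction of significant results for each thyroid test."""
    counts = defaultdict(lambda: [0, 0])  # test -> [n_significant, n_total]
    for _study, test, p_value in rows:
        key = normalise_test(test)
        counts[key][0] += int(is_significant(p_value))
        counts[key][1] += 1
    return {test: sig / total for test, (sig, total) in counts.items()}

proportions = proportion_significant(records)
```

With real data, each record would also carry the clinical system, the number of subjects, and the number of covariates, so that these could serve as predictors.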
The predictors considered were the type of thyroid test (i.e., TSH, fT4, or TT3/fT3), the clinical system under consideration, the number of subjects in the analysis, and the number of covariates in the model. To account for the repeated analyses within each study, we also incorporated a random intercept term. We considered random intercepts for the study, the cohorts nested within each study, the type of analysis nested within the study, and the complexity of the models nested within the studies.
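As a hedged sketch of this model structure, a random-intercept logistic model for the binary significance outcome could be written as follows. The notation is an illustrative assumption and is not taken from the analysis code.

```latex
\operatorname{logit}\,\Pr(Y_{ij} = 1)
  = \beta_0 + \beta_{\mathrm{test}[ij]} + \mathbf{x}_{ij}^{\top}\boldsymbol{\gamma} + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
```

Here $Y_{ij} = 1$ if result $i$ in study $j$ was significant at the 5% level, $\beta_{\mathrm{test}[ij]}$ is the fixed effect of the thyroid test (TSH, fT4, or TT3/fT3), $\mathbf{x}_{ij}$ collects the other candidate predictors (clinical system, number of subjects, number of covariates), and $u_j$ is the study-level random intercept. The nested variants would add a further random intercept for cohort, type of analysis, or model complexity within study.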
Pairwise comparisons of the thyroid tests were performed at a 5% overall significance level for those models where a significant effect of the predictor, type of thyroid test, was found. We calculated the Tukey pairwise comparisons between the thyroid tests by using the multcomp package (21). We conducted a McNemar analysis on the contingency tables for each comparative pair of thyroid tests. We tested the null hypothesis that there was no change in the proportion of significant results between the two thyroid tests under consideration. As a final attempt to account for dependency within each study, we performed a simple logistic regression analysis by using only a single randomly chosen analysis from the series of nested models in each study. We performed this for each of the following strata: smallest number of subjects, simple model; smallest number of subjects, complex model; largest number of subjects, simple model; and largest number of subjects, complex model.
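The McNemar comparison for one pair of thyroid tests can be sketched as below. The text does not state which variant of the test was used, so the exact binomial form shown here is an assumption, and the paired results are hypothetical. Only the discordant pairs (one test significant, the other not) contribute to the statistic.

```python
from math import comb

def discordant_counts(pairs):
    """Count discordant pairs from (first_test_significant, second_test_significant) tuples."""
    b = sum(1 for first, second in pairs if first and not second)
    c = sum(1 for first, second in pairs if second and not first)
    return b, c

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts b and c.

    Under the null hypothesis, each discordant pair is equally likely to fall
    in either direction, so min(b, c) follows a Binomial(b + c, 0.5) law.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical paired results for the same analyses: (fT4 significant, TSH significant).
pairs = [(True, False), (True, False), (True, True),
         (False, False), (False, True), (True, False)]

b, c = discordant_counts(pairs)
p_value = mcnemar_exact(b, c)
```

A small p-value would indicate that the two tests differ in their proportion of significant results across the same set of analyses.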
We performed a sensitivity study minimizing the contribution of possible reverse causation, analyzing only the prospective analyses from studies that were free of incongruent associations.
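The selection rule for this sensitivity study has two levels: a study is dropped entirely if any of its analyses showed an incongruent association, and only prospective analyses are kept from the remaining studies. A minimal sketch follows, with hypothetical records and field names assumed for illustration.

```python
# Hypothetical analysis records; field names and values are illustrative assumptions.
analyses = [
    {"study": "A", "design": "prospective", "incongruent": False},
    {"study": "A", "design": "cross-sectional", "incongruent": False},
    {"study": "B", "design": "prospective", "incongruent": False},
    {"study": "B", "design": "prospective", "incongruent": True},
    {"study": "C", "design": "prospective", "incongruent": False},
]

def sensitivity_subset(rows):
    """Keep prospective analyses from studies with no incongruent associations."""
    # Study-level exclusion: one incongruent association removes the whole study.
    flagged = {row["study"] for row in rows if row["incongruent"]}
    return [
        row for row in rows
        if row["design"] == "prospective" and row["study"] not in flagged
    ]

subset = sensitivity_subset(analyses)
```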
All modeling was performed by using the lme4 (22) and lmerTest (23) packages in R (24), and all codes are available at
Results
We found, in our first examination of the literature, that though the findings were not unanimous, there was general consistency of the data. In general, consistent with prior work (8), AF (25 –31), osteoporosis (32 –39), and cancer (40 –43) were associated with higher thyroid function that was defined by using TSH and/or thyroid hormone levels, across and beyond the reference range, and steatohepatitis (44 –46) and other features of the metabolic syndrome (19,47 –66) were associated with lower thyroid function. Both high and low thyroid function, as compared with mid-range thyroid function, were associated with clinical and pathological features of cognitive decline (26,67 –75), frailty (76 –79), total/cardiovascular mortality (26,80 –88), cardiac physiology (89), cardiac disease (apart from AF) (26,31,67,83 –85,88,90,91), and pregnancy outcomes (92 –99).
There were many series finding these associations in the context of subclinical thyroid dysfunction. Many of these studies (25,50,51,67,83 –85,87 –89), however, did not address the relative associations of clinical parameters with TSH and thyroid hormone levels, the focus of our study.
In the end, we identified 58 studies that addressed this question (Fig. 1; Table 1). We found no previous synthesis of the data on the effect of thyroid function, as measured by TSH in comparison to thyroid hormone levels, across a range of organ systems. One meta-analysis restricted to AF (27) was not included in our analysis. Many of the studies addressed multiple parameters summarized by those indicated in Table 1.

Fig. 1. Description of literature search.
Table 1. Description and Quality Assessment of Included Studies
Age of largest subset of population.
F, female; M, male; Met S, metabolic syndrome; NOS, adapted Newcastle–Ottawa quality assessment scale (the higher number out of 9, the better the study).
We found 22 studies (19,26,30,31,34,35,39,41,44,45,48,55,60 –62,65,66,71,75,78,79,91) that examined associations with fT4, TT3/fT3, and TSH and a further 36 studies (26,29,33,36 –38,40,42,43,45,47,49,52 –54,56,63,64,69,73,74,76,77,80 –82,86,90,92 –99) that examined associations with only fT4 and TSH levels.
These 58 studies included cross-sectional and prospective cohort studies, diverse populations, and both sexes. They were contemporary and of high quality (Table 1). The study populations comprised strictly euthyroid subjects (26,29,30,34,39,45,48,52 –55,62,69,81,82,86,90,91), subjects either euthyroid or with subclinical thyroid dysfunction (19,33,35,36,38,40,42,47,49,60,65,71,73,75 –78,80,92,93,95,96,98,99), and subjects euthyroid or with subclinical/overt thyroid dysfunction (28,31,37,41,43,44,46,47,56,61,63,64,66,74,79,94). In some studies, different subsets were examined separately. The 58 articles included in our meta-analysis yielded 1880 results of association analyses. The supplement catalogues all of these associations in terms of clinical parameters, subgroups, number of participants, statistical methods, statistical significance, and p-values/confidence limits.
The number of subjects for each analysis ranged from 18 to 10,990 with a mean of 3071 (median 2078). The number of results in each study ranged from 3 (60) to 180 (92).
Analysis of all these data confirmed the superiority of associations with thyroid hormone levels (fT4, TT3/fT3) as compared with TSH levels (Fig. 2). fT4 had a significant association with a clinical parameter in 50% of the analyses of the articles. TT3/fT3 had a significant association in 53% of the analyses, whereas TSH had a significant association in only 23%. fT4 levels were associated with clinical parameters statistically significantly more often than were TSH levels (p < 0.0001), as were TT3/fT3 levels (p < 0.0001). The difference between fT4 and TT3/fT3 levels was not significant (p = 0.71).

Fig. 2. Overall associations of thyroid hormone and TSH levels with clinical parameters. T3, triiodothyronine; T4, thyroxine; TSH, thyrotropin.
When there was a significant association with fT4, the association with TSH was simultaneously significant 30% of the time; for the converse, the frequency was 66%. For TT3/fT3, the respective figures were similar at 33% and 62%. The McNemar analysis demonstrated these results to be significant; for the comparisons of fT4 versus TSH and TT3/fT3 versus TSH, the null hypothesis was rejected (p < 0.0001). For the comparison of fT4 versus TT3/fT3, we failed to reject the null hypothesis (p = 0.4305).
As the number of subjects in the analysis increased, the superior associations with thyroid hormones did not diminish (Fig. 3), and similarly the system did not play a significant role (Fig. 4). In an analysis including the number of covariates in the original result, at higher numbers of covariates, the association of clinical parameters with TT3/fT3 levels was no longer significantly different from the association with TSH levels (Fig. 5).

Fig. 3. Associations of thyroid hormone and TSH levels with clinical parameters according to sample size.

Fig. 4. Associations of thyroid hormone and TSH levels with clinical parameters according to clinical system.

Fig. 5. Associations of thyroid hormone and TSH levels with clinical parameters according to number of covariates.
The basic analysis provided earlier ignores the many sources of dependence between the results reported in each study. To account for this, it was necessary to incorporate a random intercept for study in the model (p = 2.2 × 10⁻¹⁶). There was still then a statistically significant effect of thyroid test in predicting the significance of results (p < 2.2 × 10⁻¹⁶). Post hoc pairwise comparisons showed that there was a statistically higher proportion of significant results for fT4 compared with TSH (p < 1 × 10⁻⁵), and also a statistically higher proportion of significant results for TT3/fT3 compared with TSH (p < 1 × 10⁻⁵). These results confirmed those illustrated in the earlier mentioned confidence interval plots. We found that the additional main effects of system, cohort size, and number of covariates again did not improve the predictive effect of the model, compared with one with just a thyroid test (based on minimizing the Bayesian Information Criterion). We found that a nested random effects structure of cohort within study was a statistically valid addition to the model, but it did not change the observed effects of the thyroid test described earlier.
We found, when addressing the issue of dependence of results, a statistically significant effect of thyroid test on the proportion of statistically significant results. Pairwise comparisons revealed that the only significant results in all four models were for fT4 having more significant associations than TSH (Table 2). In this analysis, TT3/fT3 levels did not have more associations with clinical parameters than levels of TSH.
Table 2. Model Description
fT4, free thyroxine; TSH, thyrotropin.
The results of our sensitivity analysis aimed at minimizing any effect of reverse causation showed no significant change in the proportion of fT4 and TSH levels being associated with clinical parameters, and hence in the statistical conclusions. However, the association of TT3/fT3 levels with clinical parameters was not significantly different from that of TSH levels. The proportion of associations with TT3/fT3 was only 13% in this analysis as compared with 53% in the full analysis.
There was no significant change to any of our results regarding the TT3/fT3 combination with consideration of TT3 and fT3 separately.
Only a few of the studies included patients on T4 therapy. In these studies, the proportion of patients on T4 was very low such that separate analyses of these patients were not undertaken. Analyses of cohorts with removal of these patients did not affect the results.
Discussion
We believe this is the first systematic review studying TSH and thyroid hormone associations with various clinical parameters. The results indicate that, contrary to the current paradigm, thyroid hormone levels are associated more strongly with clinical parameters than TSH levels. Any relationship of clinical parameters with TSH levels can be explained by the strong population relationship between thyroid hormone levels and TSH levels, such that TSH levels are merely indirect measures of thyroid hormone levels.
In our sample, we found no indication of, or reference to, any work that suggested that TSH levels consistently indicate thyroid status of any organ or tissue more strongly than thyroid hormone levels.
As our goal was not to estimate the effect size for one treatment, our meta-analysis methodology differed from some other meta-analysis methodologies in that we did not use a weighted technique or pool all original patient data. In addition, it would not have been appropriate to combine all of these factors by using such meta-analysis methodology, as our analysis encompassed multiple studies covering various clinical outcomes, using different methodologies, different assays, and statistical methods.
Theoretically, one could use such other methodology to do a meta-analysis of each clinical parameter, but these individual meta-analyses would still need to be combined by using a method akin to ours (i.e., summing the meta-analyses in some way) to determine whether levels of thyroid hormones or TSH are more likely to be associated in general with clinical parameters. Further, in using such a technique of analysis, the information from many of the studies in our sample would be lost as the parameter/population/statistical method might not be amenable to pooling (27).
The results of the individual patient meta-analysis of AF (27) do, in fact, support our conclusions, showing superior associations with fT4 levels than with TSH levels. Also supportive is a recently published similar meta-analysis of pre-term delivery (100), showing that fT4 levels are associated with the clinical parameter at least as well as TSH levels. One could even argue that the results of these two conventional meta-analyses alone disprove the general hypothesis that TSH levels provide a better guide to thyroid status than fT4 levels.
Potentially, the summation of statistically significant results can be unreliable (20), but we have accounted for the possibilities of bias on account of imbalance in the size of the studies, the nature of the parameters, and the possibility of reverse causation. In all of the studies (except for the few studies comparing individuals with subclinical hypothyroidism with individuals with isolated hypothyroxinemia), each subject was his/her own control, and the study populations of many of the studies were unselected members of a community, so the risk of bias from these considerations was obviated. The convincing degree of superiority of thyroid hormone levels as compared with TSH levels also provides a buffer against the possibility of some unidentified bias influencing our results.
The strictest interpretation of our data would, nevertheless, qualify our conclusions such that they would be valid only to the degree that the chosen parameters truly reflect the thyroid state. Residual confounding, by mechanisms as yet incompletely understood, may have affected the reported associations between thyroid function tests and all of the clinical parameters. This possibility, not considered to be likely in the studies we reviewed, would also compromise the previous literature describing the considered consequences of sub-clinical thyroid dysfunction.
However, even in these circumstances, a slightly weaker conclusion for our study—that is, that there is no evidence supporting the superiority of TSH levels in the assessment of thyroid function—would still stand. Further, there is much evidence to indicate that at least some of the chosen parameters do truly reflect the thyroid state.
In particular, although there are clinical parameters that can affect thyroid function, we do not believe that such reverse causation significantly influences our results. Reverse causation mechanisms have been described for parameters associated with low thyroid function (e.g., obesity (101–103) and dyslipidemia (57,58,104)), but in these circumstances the reverse causation effects would tend to lead to greater associations with TSH levels rather than with fT4 levels.
The sensitivity of TT3/fT3 levels to the sick euthyroid state (105), generated by altered deiodinase activity (106), may also explain some of the associations with TT3/fT3. In particular, mortality and frailty may be associated with low TT3/fT3 levels via reverse causation. As the TSH would also be expected to be low in this situation, one might expect incongruent associations between clinical parameters, and TT3/fT3 (and possibly fT4) and TSH. Our sensitivity study excluded such studies.
We are not aware of any association of a clinical parameter with a high fT4 having been linked to reverse causation. If anything, any component of the sick euthyroid state associated with these conditions, by lowering TSH and fT4 (105), should again favor an association with TSH rather than fT4.
Mendelian randomization studies have provided evidence that the relationship between thyroid function and AF is causal (107,108), whereas there may be reverse causation underlying the relationship between thyroid function and obesity (109). Other indicators supporting a causative relationship between thyroid function and at least some of the parameters we examined include the relationships being seen in otherwise healthy individuals (91 –99), the prospective nature of many of our included studies, our sensitivity study, the observed similarity of the relationships to those seen in overt thyroid disease (110 –120), basic science evidence (121,122), and positive animal (123) and human (19) intervention studies.
Nevertheless, additional intervention studies could provide further evidence as to the direction of causality in the associations we have studied. Ideally, such intervention studies would be designed to ensure that the intervention, rather than merely normalizing TSH levels, significantly changes the levels of thyroid hormones.
We found TT3/fT3 level associations with fewer parameters than we found for fT4. Although TT3/fT3 levels were associated more strongly than TSH levels, and equally strongly as fT4 levels, with clinical parameters, our sensitivity study showed a fall in the frequency of TT3/fT3 associations, suggesting a component of reverse causation.
Two other analyses, the analysis of associations according to the number of covariates and the sampling analysis, also indicated that the associations of clinical parameters with TT3/fT3 may not be as robust as the associations with fT4 levels. Overall, TT3/fT3 measurement added little to the assessment based on fT4 levels. Future studies may further clarify the relative importance of fT4 and fT3 levels.
fT4 is not the active thyroid hormone at the cellular nuclear level (106). The strong relationships of parameters, especially AF (risk increased up to 9 × across the normal reference range (30)), with levels of fT4 indicate that the active intracellular triiodothyronine generated by thyroid hormone transporters and deiodinases (106) appears to be, at least in the heart, proportional to circulating fT4. Any discrepancy, indicating local regulation of thyroid effect, may be more prominent in more severe pathophysiological circumstances (106), and therefore more relevant in the circumstances of multisystem entities such as frailty, death, and metabolic disturbance.
Our results do not imply that no information can be gleaned from the presence of an abnormal TSH level. In the presence of normal thyroid hormone levels, such TSH levels indicate that the thyroid gland physiology is abnormal. However, for the function of other tissues and organs, the TSH level required to maintain a given level of thyroid hormones appears generally not to be relevant.
It remains possible too that additional analyses might find that TSH levels provide a signal additional to that of fT4 levels, in some populations for some conditions. It has been suggested that TSH itself may have physiological effects apart from the stimulation of thyroid hormone levels (36,124), and such effects, rather than the reflection of thyroid status, might explain such a TSH signal. Empirically, thus far, the evidence suggests that any of these TSH effects are small.
The association of thyroid hormone and particularly fT4 levels, rather than TSH levels, with clinical features has been noted by many authors, covering many individual parameters (26 –28,30,33,35,42,44,46 –48,52,53,57,69,73,76,80,81,86). In particular, the meta-analysis regarding AF noted the association with fT4 but not with TSH (27). Authors also previously found evidence of associations of clinical parameters with fT4 in the absence of an association with subclinical thyroid dysfunction as currently diagnosed (33,49,73,80,86). One of these studies also showed associations with TSH (49).
Nevertheless, to date, to the best of our knowledge, this information from the individual studies showing the superiority of thyroid hormone levels in terms of associations with individual clinical parameters has not been synthesized into a formal conclusion regarding the biochemical assessment of thyroid function in general.
It has been suggested that despite TSH being considered a more sensitive indicator of thyroid status, fT4 may be a more sensitive indicator of “cardiac” (28), or “tissue” (47,53) thyroid status. Our study strengthens and generalizes these propositions, indicating that fT4 is the more sensitive indicator of thyroid status because it is the better indicator of tissue and organ effects.
The superior association of clinical parameters with fT4 as compared with TSH levels has more often been attributed to a putative disturbance of set point physiology (42,46,47,69,76,81,86), to a significant difference between pituitary and peripheral sensitivity to fT4 (27,46,48,52), or to statistical/other factors (including reverse causation) (33,36,44,49,58).
The explanations related to set points are denied in the first instance by the evidence that the relatively stable thyroid hormone levels seen in individuals are better explained by a model of “balance points” (or “equilibrium points”) rather than “set points” (125). Notwithstanding this concept, it has been suggested that in older adults there is an alteration of what is termed “set point physiology,” in that TSH may be less suppressed by any given rise in fT4 (42,76). However, in this situation, though the range of TSH may change, any physiological association with greater or lesser TSH levels should remain intact. Further, the greater association of clinical parameters with fT4 rather than TSH levels is apparent across a wide age range (Table 1).
At a population level, TSH levels do, indeed, decrease with rising fT4 levels (17,18), suggesting that in general pituitary sensitivity to thyroid hormones is robust. If for any reason there were a disturbance to pituitary sensitivity in the absence of a corresponding change to peripheral sensitivity, this would in any event provide another reason not to diagnose thyroid function on the basis of TSH levels.
The evidence also suggests that, regardless of the method used, the classification of thyroid function into normal, subclinical disease and overt disease is arbitrary. Thyroid hormones, as previously suggested (9,26), similar to many other biological parameters, exert a continuum of effects across the normal range. There is no clear border between normal and abnormal. There are advantages and disadvantages associated with all levels (9,26,126). Individuals with relatively low levels of fT4, for example, are less likely to develop AF but more likely to develop metabolic syndrome; the converse applies for individuals with higher fT4 levels. At the extremes, the disadvantages clearly outweigh the advantages, and individuals are likely to become symptomatic.
On the other hand, any excursion from the middle of the range has an association with some pathology or other. Some individual pathologies, for example, frailty, mortality, and dementia may increase with deviations either side of the middle of the range. It seems likely that evolutionary mechanisms have arisen to minimize variation from the middle of the reference range of thyroid hormones (127).
The fact that TSH levels reliably identify overt thyroid dysfunction can also be explained by the negative population relationship between TSH and fT4, that is, its extension into the abnormal ranges of fT4 (17,18). This extension is due merely to the fact that the vast majority of all overt thyroid dysfunction is primary rather than secondary (128). This situation differs from other endocrine pathology, for example Cushing's syndrome, where ACTH levels cannot be used as a screening test on account of the likelihood that Cushing's syndrome may be secondary, that is, be due to a disorder of ACTH regulation (129). The fact that TSH levels are thereby very sensitive screening tests for overt thyroid dysfunction (130) does not imply that TSH levels are very specific, that is, that an abnormal TSH level implies thyroid dysfunction. Our work indicates that an abnormal TSH level per se is an imprecise indicator of tissue or organ hyper/hypothyroidism as compared with thyroid hormone levels.
This work addressed diagnosis alone. Extrapolation of our findings appears logical, and there is no apparent a priori reason as to why TSH levels should be preferred over thyroid hormone levels in the context of monitoring thyroid treatments. Randomized trials might, nevertheless, reveal that additional considerations apply in these circumstances. Though there was no suggestion in the studies that we examined of a difference with individuals on thyroid hormone replacement, their numbers were small.
In summary, there is now matching theoretical and empiric evidence from a variety of sources suggesting that the thyroid status of an individual is better defined by thyroid hormone levels than TSH levels. There is evidence of a continuum of thyroid hormone effects along the continuum of thyroid hormone levels, with a possible optimum around the middle of the reference range. Though TSH levels remain good screening tests for overt thyroid dysfunction, it is theoretically and empirically more sound to rely on thyroid hormone and especially fT4 levels to classify the thyroid state.
This work should result in a simplification of the understanding of thyroid physiology and pathophysiology, and bring it more into line with the understanding of the physiology and pathophysiology of other parameters, whereby the status of a parameter is judged by its level rather than the level of any controlling factor. Reconsideration of the TSH-based diagnostic approach to thyroid function appears to be indicated. In turn, this would appear to have implications for clinical guidelines, research methodology, and the rationale of underlying physiological principles.
Footnotes
Author Disclosure Statement
No competing financial interests exist.
Funding Information
No funding was received for this article.
Supplementary Material
Supplementary Table S1
