Syndromes based on medically unexplained somatic experiences, such as persistent fatigue or pain, have a long and controversial history in medicine and psychiatry [1]. The publication of international consensus criteria in 1994 for the diagnosis of chronic fatigue syndrome (CFS) was pivotal to its recognition as a major public health problem [2]. A conceptual framework was proposed in which CFS was considered to exist in a subset of persons with fatigue lasting >6 months (chronic fatigue) and for which there was no other clear medical or psychiatric explanation. Chronic fatigue in turn was considered to be a subset of unexplained, but disabling, fatigue lasting >1 month (i.e. prolonged fatigue) [2]. The diagnostic criteria represented the consensus of expert clinicians working largely in specialist medical and psychiatric care settings. Although not derived empirically, the individual symptom criteria are broadly consistent with findings from both population-based and clinic studies in English-speaking, Western populations [3–7].
In 2003 the International CFS Study Group recommended a new study of patients with chronic, unexplained fatigue from which a definition of CFS could then be derived empirically [8]. They also recommended that the study be international in nature, encompassing different regions and cultures. From a psychiatric perspective it was considered important to access symptom data across a wide range of cognitive, affective and somatic domains. The present study was based on analysis of pre-existing data from population-based studies, as well as studies in primary, secondary or specialized tertiary care clinics. The data were collected from a range of international settings. Multivariate statistical modelling was used to test the construct validity of chronic fatigue and CFS (as defined by the 1994 case definition).
Method
Literature search
Individual datasets were identified via articles extracted from MEDLINE (1966–2003), PsycINFO (1985–2003), EMBASE (1988–2003), CINAHL (1982–2003), personal communications and reference list searches. Relevant papers were checked for cross-references to find as many trials and investigations studying prolonged fatigue, chronic fatigue and CFS as possible. Investigators were contacted to identify unpublished studies. Most studies had already been published.
Study inclusion criteria
The original studies were conducted in both English-speaking countries (Australia, Canada, UK, USA, n = 36 326) and non-English-speaking countries (n = 1398) in Africa (Nigeria), Asia (China, Hong Kong, India, Japan, Turkey, United Arab Emirates, Vietnam), continental Europe (France, Germany, Greece, Ireland, Italy, Spain, Sweden, The Netherlands), and South America (Brazil).
All studies contained data collected from patients with prolonged fatigue (defined as fatigue lasting 1–6 months), chronic fatigue (defined as fatigue lasting >6 months), or CFS (as specified in the diagnostic criteria, included here). These subjects had been enrolled in: population-based studies, which were defined as including subjects drawn from the community (and not seeking health care); primary care, which included subjects attending general or family practice centres; and finally, secondary or specialized tertiary referral clinics with expertise in fatigue syndromes. All studies were completed by investigators with experience in fatigue research. All datasets included demographic details, measures of fatigue duration and severity, and information as to the setting in which the data were collected. A minority of datasets included a specifier as CFS, or not CFS, according to a published case definition [2, 9–12]. Some datasets contained information concerning the presence of other physical and psychological symptoms, and notations of accompanying medical and psychiatric illnesses. Data collected in population-based or primary care studies did not necessarily include a formal clinical diagnosis of a fatigue syndrome. Forty-five studies qualified for inclusion in the analysis. The principal investigators of 33 studies agreed to provide data for the combined analysis (Table 1), while investigators of 12 studies declined.
Table 1. All merged dataset details by principal investigators and study sites
CDC SI, Centers for Disease Control and Prevention Symptom Inventory [47]; Chalder, Chalder fatigue scale [45]; CF, chronic fatigue; CFS, chronic fatigue syndrome; CIS, Checklist Individual Strength [44]; PB, population-based studies; PC, primary care studies; PF, prolonged fatigue; SCL-90, Symptom Checklist [46]; SPHERE, Somatic and Psychological HEalth REport [25]; S/T, secondary or specialist tertiary referral clinic studies; No. subjects included in this analysis may not coincide with those in the associated publication; †To date these datasets remain unpublished.
Ethics approval
For each study (and thus each dataset) included in this meta-analysis, the study protocols were approved by appropriate institutional human research ethics committees.
Data collection
The analyses included 37 724 patients who reported at least prolonged fatigue. The data were electronically submitted in standard format (subject identifier, date of interview or self-report, date of birth (or age), gender, setting(s) in which information was collected, measures of fatigue duration and severity, symptoms as required by the CFS case definition used in the clinic or study, other physical and psychological symptoms and, exclusionary and non-exclusionary medical and psychiatric illness). Where possible, data regarding marital status and years (or level) of education were also included. Standard range and consistency checks were performed on each dataset. Missing information, obvious errors, inconsistencies between variables, or extreme values were queried and rectified as necessary. If details of the study had been published, they were checked against the raw data and any inconsistencies queried and rectified.
Data analyses
Prior to analyses, all symptom data were dichotomized such that any negative response (e.g. none, never) was recoded to ‘no’ and all other responses recoded to ‘yes’. Factor analysis, using SPSS (SPSS, Chicago, IL, USA), was then used to evaluate the data. To determine the number of factors to retain for rotation to an interpretable solution, we used a combination of the eigenvalues, the percentage of the total variance explained by each possible number of factors and the associated scree plot, the reproducibility of the factors, and the clinical meaningfulness of the factors extracted. Orthogonal (or varimax) rotation was used to maximize interpretability of factors. An arbitrary but conventional threshold of 0.35 for the factor loadings was applied when interpreting and labelling the factors.
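The recoding and loading-threshold steps described above can be illustrated with a minimal Python sketch. The response labels, the set of responses treated as negative, the function names and the example loadings are all assumptions made for illustration; the study itself used SPSS, and the text does not state whether the 0.35 threshold was applied to absolute loadings (the sketch assumes it was).

```python
# Illustrative sketch only: labels, names and loadings are hypothetical.
NEGATIVE_RESPONSES = {"none", "never", "no", "not at all"}  # assumed negative labels


def dichotomize(response):
    """Recode a raw symptom response: negative -> 'no', anything else -> 'yes'.

    Missing responses (None) stay None so they can be handled separately.
    """
    if response is None:
        return None
    return "no" if response.strip().lower() in NEGATIVE_RESPONSES else "yes"


def salient_items(loadings, threshold=0.35):
    """Return the items whose absolute loading on a factor meets the threshold."""
    return [item for item, value in loadings.items() if abs(value) >= threshold]


raw = ["Never", "sometimes", "often", None, "none"]
recoded = [dichotomize(r) for r in raw]

# Made-up loadings for a single factor, purely to show the thresholding step.
example_loadings = {"sore throats": 0.62, "fevers": 0.41, "poor concentration": 0.12}
kept = salient_items(example_loadings)
```

Here `recoded` holds the dichotomized responses and `kept` lists only the items that would be used when interpreting and labelling the factor.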
The merged 37 724 person dataset contained 94 variables (each with ≥1000 total responses), translating to a subject:item ratio of >5:1 (meeting the statistical rule of thumb [48]). Given the nature of this study, the 94 variables were not common across all subjects. As a result, mean substitution (where missing values were replaced with the mean of the variable) was used for the replacement of missing data values.
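Mean substitution as described above can be sketched in a few lines of Python. The 0/1 coding of the dichotomized items, the use of `None` for a missing value and the function name are assumptions for illustration, not details taken from the study.

```python
def mean_substitute(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing observed: leave the variable untouched
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


# Assumed coding: 1 = 'yes', 0 = 'no', None = item not collected for this subject.
item = [1, 0, None, 1, None, 1]
filled = mean_substitute(item)
```

In this example the two missing entries are replaced by 0.75, the mean of the four observed responses, so the mean of the variable is unchanged by the substitution.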
In total, 152 physical and psychological items were collected, of which some captured similar symptoms. Before merging such items, consideration was given as to whether the item was describing a clinically similar phenomenon to any other item(s). If so, the authors (including a psychiatrist, research psychologist, virologist, biostatistician, epidemiologist and an infectious diseases physician) made a unanimous decision to merge them (e.g. ‘night sweating’ and ‘sweating more than usual’), or to leave them separated (e.g. ‘waking up tired’ and ‘feeling tired after rest or relaxation’). If clinical consensus could not be reached, polychoric correlations were used to guide decision-making.
Initially, an exploratory factor analysis was run on a random sample of 4257 subjects from the three diagnostic categories (prolonged fatigue, chronic fatigue, CFS) from three English-speaking countries. This model's items and factor structure were then imposed on all 37 724 cases to ascertain validity of the full dataset. Finally, a separate exploratory factor analysis was run on the dataset as a whole. After final factor models were derived, individual factor scores were computed for each subject using the regression method. For ease of understanding, these scores were then weighted using a total average sum of factor loadings ≥0.35 on each factor. Resultant mean symptom scores were consequently analysed using Hedges’ g effect size, which is an inferential measure that assesses the strength of the difference between two groups. We assumed that an effect size ≥0.50 (a medium effect size by Cohen's definition [49] and indicating that 69% of one group, vs 50% of another, is above the mean of the second) is likely to reflect a meaningful clinical difference.
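For readers unfamiliar with the effect size used here, the following Python sketch shows one common formulation of Hedges' g (the pooled-standard-deviation difference with the usual approximate small-sample correction) and the normal-theory calculation behind the 69% figure. The exact variant computed in the study is not stated, so the formula, function names and sample data are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def hedges_g(group_a, group_b):
    """Standardized mean difference with an approximate small-sample correction."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_sd = sqrt(
        ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b))
        / (n_a + n_b - 2)
    )
    d = (mean(group_a) - mean(group_b)) / pooled_sd
    correction = 1 - 3 / (4 * (n_a + n_b) - 9)  # approximate bias correction
    return correction * d


def proportion_above_other_mean(effect_size):
    """Share of one group above the other group's mean, assuming normal
    distributions with equal variance."""
    return NormalDist().cdf(effect_size)


# Made-up scores for two groups, purely to exercise the function.
g = hedges_g([2, 4, 6, 8], [1, 3, 5, 7])
share = proportion_above_other_mean(0.5)
```

With an effect size of 0.50, `share` is approximately 0.69, which matches the interpretation given in the text.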
A subset of subjects included in the final factor model had also completed the Brief Disability Questionnaire (BDQ) [50]. Pearson's correlations were used to assess the relationship between BDQ ‘days out of role’ and the mean symptom scores for each factor.
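The correlation step can be sketched as follows. The function name and the example values are hypothetical and are not taken from the BDQ data; the sketch only shows how a Pearson coefficient between 'days out of role' and a mean symptom score would be computed.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Made-up values: days out of role and a mean symptom score for five subjects.
days_out_of_role = [0, 2, 5, 9, 14]
mean_symptom_score = [0.1, 0.4, 0.3, 0.8, 0.7]
r = pearson_r(days_out_of_role, mean_symptom_score)
```

A coefficient near 0 indicates little linear association and a coefficient near 1 a strong positive one; significance testing is omitted from this sketch.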
Results
The first set of analyses evaluated a random sample of 4257 subjects from the three diagnostic categories from three English-speaking countries (Australia, n = 1543; UK, n = 1336; USA, n = 1378). This included 42% (n = 1799/4257) of subjects from population-based studies, 23% (n = 978/4257) from primary care and 35% (n = 1480/4257) from secondary or specialist tertiary referral clinics. Of note, 34% (n = 1439/4257) came from studies designed to evaluate subjects with chronic fatigue, and 33% (n = 1401/4257) from studies of persons with CFS. This subset was selected to minimize differences attributable to minor cultural or linguistic differences and to equally include all the variation in symptoms seen in different health-care settings. Within this subsample, 63% (n = 2674/4212) were female, 57% (n = 1730/3029) were married, and 56% (n = 1399/2504) had a college education.
The optimal solution for this random sample included five factors (and 18 individual symptom items) and explained 50% of the variance. The five factors were designated as: ‘musculoskeletal pain/fatigue’, including items such as ‘pain in arms or legs’, ‘joint aches and pains’ and ‘muscle weakness’; ‘neurocognitive difficulties’, including ‘poor concentration’ and ‘difficulty thinking’; ‘inflammation’, including ‘sore throats’, ‘fevers’ and ‘swollen glands’; ‘sleep disturbance/fatigue’, including ‘feeling tired after rest or relaxation’ and ‘waking up tired’; and ‘mood disturbance’, including items such as ‘constantly under strain’, ‘irritable or cranky’ and ‘unhappy or depressed’.
From this optimal solution, mean symptom scores were derived and analysed in terms of diagnostic category and health-care setting. When comparing prolonged fatigue to chronic fatigue, differences were noted for all factors but were most evident for ‘inflammation’ and ‘sleep disturbance/fatigue’ (Table 2). When comparing chronic fatigue with CFS, the former subjects more commonly reported items of ‘sleep disturbance/fatigue’ while the latter subjects more commonly reported ‘musculoskeletal pain/fatigue’ and ‘neurocognitive difficulties’ (Table 2). When comparing health-care settings, the largest differences were found for ‘neurocognitive difficulties’ and ‘inflammation’ in those with CFS (Table 3).
Table 2. Mean symptom scores and effect size vs fatigue type for a random subsample of 4257 subjects from Australia, USA and UK
CFS, chronic fatigue syndrome;
Table 3. Mean symptom scores and effect size vs health-care setting for a random subsample of 4257 subjects from Australia, USA and UK
A second set of analyses involved running three individual factor analyses on data from each health-care setting (i.e. population-based studies, primary care, and secondary or specialist tertiary referral clinics). These analyses, designed to test the homogeneity of the full dataset, found remarkably similar structures in each setting. Again, they were labelled: ‘musculoskeletal pain/fatigue’; ‘neurocognitive difficulties’; ‘inflammation’; ‘sleep disturbance/fatigue’; and, ‘mood disturbance’ (Table 4). In our view, these essentially uniform results justified running further factor analyses on the dataset as a whole.
Table 4. Common/similar items representing extracted factors across settings
†Absence of any sleep factor in the population-based studies does not mean it does not exist for this group but that such items were not included in the analyses due to missing data.
To explore the robustness of the first factor analytical solution, its structure was imposed on the full dataset. Because a very similar factor structure was achieved, it was decided to run a separate exploratory analysis including all data. Of the 37 724 persons included in this analysis, the mean age was 39 years (range = 16–97 years), 57% (n = 20 845/36 809) were female, 56% (n = 12 333/22 031) were married, 40% (n = 8838/22 221) had achieved a tertiary level of education (with only 10% (n = 2240/22 221) not receiving secondary level education). Most subjects had been seen in population-based studies (42%, n = 15 749/37 724) or primary health-care settings (52%, n = 19 472/37 724). Only a small proportion had attended secondary or specialist tertiary referral clinics (7%, n = 2503/37 724). While all subjects reported fatigue of at least 1 month's duration (i.e. prolonged fatigue), the final sample included 2013 people specified as having chronic fatigue, and 1958 had been formally diagnosed as having CFS.
The optimal solution for the entire dataset included five factors (and 25 individual symptom items) and explained 47% of the variance. The five factors were again comfortably designated as: ‘musculoskeletal pain/fatigue’; ‘neurocognitive difficulties’; ‘inflammation’; ‘sleep disturbance/fatigue’; and ‘mood disturbance’. This solution provided the most robust collection of the key constructs shared by persons with clinically significant prolonged fatigue states. That is, because it was based on subjects with fatigue of at least 1 month's duration and included all participating centres across cultural boundaries, it was not likely to be constrained by those sociodemographic, cultural or health-system factors that typically influence referral to secondary or specialist tertiary referral clinics.
Again, mean symptom scores were derived and analysed in terms of diagnostic category and health-care setting. When comparing prolonged fatigue to chronic fatigue, a large difference was detected for the ‘inflammation’ symptom factor, as well as a medium-size difference for ‘neurocognitive difficulties’ (Table 5), with the chronic fatigue group reporting these symptoms more commonly. Similarly, two important differences were detected when comparing chronic fatigue to CFS, but these were for ‘sleep disturbance/fatigue’ being more common in the former, and ‘musculoskeletal pain/fatigue’ being more common in the latter (Table 5). When comparing health-care settings, ‘neurocognitive difficulties’ and ‘inflammation’ were reported as more common in those seen in secondary or specialist tertiary care as compared with those in primary care (Table 6).
Table 5. Mean symptom scores and effect size vs fatigue type (n = 37 724)
CFS, chronic fatigue syndrome;
Table 6. Mean symptom scores and effect size vs health-care setting for all subjects with fatigue (n = 37 724)
An important measure of the validity of this factor model is the extent to which it can predict disability. Mean symptom scores for four of the five factors were found to be predictive of BDQ ‘days out of role’: ‘mood disturbance’ (r = 0.22, p < 0.001); ‘musculoskeletal pain/fatigue’ (r = 0.20, p < 0.001); ‘neurocognitive difficulties’ (r = 0.10, p < 0.001); and, ‘sleep disturbance/fatigue’ (r = 0.04, p < 0.05).
Discussion
Five domains of illness experience were derived empirically from multivariate analyses of large international epidemiological and clinical datasets based on symptom reports of subjects with prolonged fatigue. These domains were robust across cultures and health-care settings and are consistent with the key criteria described in the 1994 international CFS case definition. They are best summarized as: prolonged fatigue and musculoskeletal pain; impaired neurocognitive function; sleep disturbance; and symptoms suggestive of inflammation. From a psychiatric perspective, the only noteworthy variation is one of emphasis on the central role of mood disturbance. There has been a strong tendency in the medical and lay literature on CFS to suggest that depressive symptoms are simply an understandable psychological response to the severity or duration of disability. These data argue that mood disturbance is a core component.
The subjects contributing data came from a wide variety of cultures (European and non-European ethnicity, English- and non-English-speaking), and the full spectrum of population-based datasets to specialized health-care settings. Initial analyses restricted to persons from Australia, the USA and the UK (which included a comparable mix of subjects across diagnostic types and duration of illness), produced comparable results to analysis of the entire dataset. There has been genuine concern in the international medical literature as to the construct validity of chronic fatigue and CFS. This study suggests that the experience of prolonged fatigue is stereotyped internationally. Thus, rather than representing a cultural or medical construct restricted to specific cultures, or certain specialized health-care settings, chronic fatigue states are robust clinical entities internationally. Common symptom expressions (i.e. clinical phenotypes) are likely to be underpinned by common pathophysiological elements.
The International CFS Study Group recommended consideration of a revision of the 1994 CFS case definition based on empirical data gathered from a large international study [8]. The present findings indicate that the core dimensions specified in the 1994 definition have construct validity and do not need to be revised. The International CFS Study Group also recommended that for research purposes, the diagnosis of CFS should be made using validated instruments that allow standardized assessments of the major symptom domains of the illness. The present study supports that recommendation and suggests an empirical diagnostic algorithm similar to that used by the Centers for Disease Control and Prevention [51].
Although the factor analyses suggest a common clinical phenotype, there were differences in relative prevalence of some symptom dimensions between the groups diagnosed as having prolonged fatigue, chronic fatigue or CFS, as well as across levels of the population and health-care settings. Subjects with chronic fatigue differed from those with prolonged fatigue largely in terms of reporting of neurocognitive and physical symptoms suggestive of inflammation; subjects with CFS reported more pain and fatigue, but less sleep disturbance, than patients with chronic fatigue only. These differences may reflect an evolution of the illness complex over the months from a prolonged fatigue state (lasting ≥1 month) to a chronic fatigue state lasting >6 months. Similarly, persons seen in secondary care reported neurocognitive impairment and physical symptoms suggestive of inflammation more than those seen in primary care. This may reflect a selection bias towards individuals with more severe illness underpinning the referral to secondary or specialist tertiary clinics. These differences could also be influenced, however, by other intrinsic or extrinsic risk factors, such as exposure to a prior infective illness [23], as well as demographic (gender, socioeconomic status, education status), health-care system (e.g. availability of specialist clinics) and referral agency factors.
A great deal of research effort, particularly in mental health aspects of general medical care, continues to focus on whether such chronic fatigue states can be distinguished from other medical and psychiatric diagnoses and also from other similar medically unexplained syndromes (e.g. chronic pain, fibromyalgia, irritable bowel syndrome). We suggest that this international study supports the proposition that chronic fatigue states share a common and stereotyped set of symptom domains, and that these can be readily identified in the community and at all levels of health care. Consequently, it is likely that they share common risk factors, are underpinned by a common pathophysiology, and may respond to common treatment strategies. We also suggest that there is little to be gained by further reorganization of the diagnostic criteria, or the related diagnostic entities. It is time to consider whether chronic fatigue states should be included formally in future international classification systems both in psychiatry and in general medicine.
Conceptually, the present findings are consistent with the notion that the key symptom phenomena of chronic fatigue states are likely to share common central nervous system mechanisms, independent of any other precipitating illness (e.g. infection) or risk factors (e.g. prior mood disturbance). The systematic and longitudinal study of changes in phenotype in subjects who present with these syndromes, either after exposure to potential risk factors such as infection [23] or after treatment strategies (e.g. graded exercise, antidepressant agents) presents considerable opportunity for better understanding of the nature of these fatigue-related conditions.
International Chronic Fatigue Syndrome Study Group
The researchers who contributed data to this study included: Professor Gijs Bleijenberg, Dr Sieberen P. van der Werf, Dr Judith B. Prins (University Medical Centre, Nijmegen, The Netherlands); Dr Paul M.A. Blenkiron (Bootham Park Hospital, York, UK); Professor Dedra Buchwald, Dr Wayne R. Smith (Harborview Medical Center, Washington, USA); Dr Rachel Edwards, Dr Sean Lynch (University of Leeds, Leeds, UK); Professor Laurence J. Kirmayer, Ms Suzanne S. Taillefer (McGill University, Quebec, Canada); Dr Sing Lee (Prince of Wales Hospital, Shatin, Hong Kong); Professor Nicholas G. Martin, Dr Nathan A. Gillespie (Queensland Institute of Medical Research, Queensland, Australia); Dr Shirley McIlvenny (Sultan Qaboos University, Sultanate of Oman, United Arab Emirates); Professor Norman Sartorius, Dr T.B. Ustun (World Health Organization); Dr Petros Skapinakis (University of Bristol, UK and University of Ioannina, Greece); Professor Simon Wessely, Dr Trudie Chalder, Dr Matthew Hotopf, Dr Chaichana Nimnuan, Ms Bridget Candy, Dr Lucy Darbishire, Dr Leone Ridsdale (Guy's, King's and St Thomas’ School of Medicine, London, UK); Professor Peter D. White, Ms Janice M. Thomas (St Bartholomew's Hospital, London, UK); Associate Professor Kathleen Wilhelm, Dr Andrew Wilson (University of New South Wales, New South Wales, Australia).
Acknowledgements
This project was supported by the Centers for Disease Control and Prevention (CDC), Atlanta, Georgia, USA. The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the funding agency. The researchers involved in the present study were independent of the funding agencies, with the exception of S.D.V. and R.N. who were previous employees, and W.C.R., who is a current employee of the CDC.
