Abstract
Quality of life (QOL) in patients with chronic obstructive pulmonary disease (COPD) is a major global concern in respiratory care, yet the specific instruments in use have rarely been developed using a modular approach. This paper aimed to develop the COPD scale of the system of QOL Instruments for Chronic Diseases (QLICD-COPD) by the modular approach, based on Classical Test Theory and Generalizability Theory (GT). Data were provided by 114 inpatients with COPD, whose QOL was measured three times before and after treatment. The psychometric properties of the scale were evaluated with respect to validity, reliability and responsiveness using correlation analysis, factor analysis, multi-trait scaling analysis, and GT analysis. Multi-trait scaling analysis, correlation analysis and factor analysis confirmed good construct validity and criterion-related validity, with almost all correlation coefficients and factor loadings above 0.40. The internal consistency α and the test-retest reliability coefficients (Pearson r and intra-class correlation, ICC) for all domains except the social domain were larger than 0.70, ranging from 0.70 to 0.86, with r = 0.85 for the overall scale. The overall score and the scores for the physical and specific domains changed significantly after treatment, with moderate effect sizes (standardized response mean, SRM) ranging from 0.32 to 0.44. All G-coefficients and indices of dependability were greater than 0.80 except for the social domain (0.546 and 0.500, respectively), further confirming the reliability of the scale. It is concluded that the QLICD-COPD has good validity, good reliability and moderate responsiveness, and can be used as a QOL instrument for patients with COPD.
Introduction
Chronic obstructive pulmonary disease (COPD) is a slowly developing disease that causes a huge economic and social burden.1,2 It commonly causes symptoms such as difficulty breathing, cough, chest tightness and fatigue. Since COPD remains incurable in most patients, one of the main goals of care is to improve health-related quality of life (HRQOL), which is subjective in nature and involves patients' self-assessment of multiple health dimensions. As a result, many studies have included HRQOL measurements in assessing the impact and progress of the disease.3-5 However, many of these studies used generic questionnaires, which have a relatively wide application and allow comparison among different diseases, including the EuroQol 5 dimension,5-8 the MOS Short-Form 36 (SF-36) and Short-Form 12,8-11 and the World Health Organization Quality of Life-BREF.12 One of the main limitations of generic questionnaires is that they may not be sensitive to small changes associated with a specific disease, whereas disease-specific questionnaires can detect minor clinical changes. For this reason, many disease-specific questionnaires have been developed, including the St George’s Respiratory Questionnaire (SGRQ), the COPD Assessment Test (CAT), and the Clinical COPD Questionnaire (CCQ).13-15 Weldam et al.16 reviewed 77 studies on QOL in patients with COPD and found that 13 disease-specific questionnaires and 10 generic questionnaires had been used.
The SGRQ, CCQ and CAT are very well validated questionnaires and are widely used in research and clinical practice. However, these specific scales were developed neither for COPD (but for respiratory diseases) nor for QOL (but, for the most part, for health and functional status), which limits their use for measuring QOL in COPD.16-18 For example, most items (80%) in the SGRQ have two-point (yes/no) answers. As a result, the scale and domain scores may be less responsive to change, as two-point scales are less reliable than those with more categories.19 In addition, they were not developed through a modular method (a generic/core module plus a specific module), which is intended to overcome the shortcomings of using a generic or a specific instrument alone.20-22 Since diseases in the same class, such as respiratory diseases, have much in common, the modular approach helps to capture both common and unique characteristics. The general module can be used to compare QOL across different diseases, while the specific module can be used to describe symptoms and treatments in detail. The modular approach to developing instruments therefore has significant benefits. Accordingly, both the QLQs (Quality of Life Questionnaires) from the EORTC (European Organisation for Research and Treatment of Cancer) and the FACITs (Functional Assessment of Chronic Illness Therapy) from CORE (Center on Outcomes, Research and Education) in the USA were developed on this modular principle.20,21
To combine generic and disease-specific content in one questionnaire and to respond directly to the need for COPD-specific HRQOL instruments, we have developed a system of Quality of Life Instruments for Chronic Diseases (QLICD).23,24 This system includes a general module (QLICD-GM), which can be used with all types of chronic diseases, and specific modules for different diseases, each of which is used only for the relevant disease.23,24 Within this system, the QLICD-COPD is constructed by combining the QLICD-GM with the specific module for COPD.
Considering that no COPD instrument has been developed by the modular approach, and that both classical test theory (CTT) and generalizability theory (GT) have their advantages,24 with GT in particular giving more detailed results than CTT, this article aims to describe the development and validation of this instrument based on the modular approach using both CTT and GT. The process may serve as a reference so that other instruments can be developed and validated efficiently in similar ways.
Methods
Establishment of the general module (QLICD-GM)
The study population was limited to patients with chronic diseases, at any stage of disease or treatment, who were able to read and understand the questionnaires. The study protocol and the informed consent form were approved by the IRB (institutional review board) of Kunming Medical University.
Correlation coefficients r among items and domains of QLICD-COPD (n = 114).
*Correlations between each item and its designated scale are in bold type.
Establishment of the specific module
The programmed decision method was also used in item generation and selection.23 First, the focus group discussed and confirmed the structure of the module, which included five facets. Based on a literature review, reference to established domestic and foreign scales, and clinical experience with COPD, the nominal group proposed possible items under each facet of the module. As a result, 25 items were put forward to form the item pool of the specific module, reflecting the symptoms of COPD, such as cough and white sputum, and the side effects of COPD treatments, such as nausea, vomiting and dry mouth. Then, a pre-test and in-depth interviews were administered to 20 patients and 20 COPD experts (physicians/nurses), respectively. The following statistical procedures were used to re-screen the items: (1) Importance rating by patients. Each patient was asked to score the importance of each item on a 0-100 scale (0 = extremely unimportant and 100 = extremely important). Items with a low importance rating (<65) were deleted. (2) Importance rating by experts. Each expert was asked to score the importance of each item in the same way as in (1). Items with a low importance rating (<65) were deleted. (3) Variation procedure. The standard deviation (SD) of the scores for each item was calculated, and items with a small SD (<1.10) were deleted. (4) Correlation procedure. The Pearson correlation coefficient between the score of each item and the sum score of its own domain was computed, and items with a small correlation coefficient (<0.50) were deleted.
To determine the final items, the following selection rules were used: (1) retain items that were selected by at least three of the above procedures; and (2) retain items that were selected by at least two procedures and were also supported by the professional knowledge of the focus group.
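The four screening procedures and the two retention rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' program: the text does not say how the importance ratings were summarised, so the mean rating is assumed here, and the function names and data layout are hypothetical.

```python
from statistics import mean, stdev

# Thresholds as reported in the text.
IMPORTANCE_MIN = 65.0  # patient and expert importance ratings (0-100)
SD_MIN = 1.10          # minimum standard deviation of item scores
R_MIN = 0.50           # minimum item-domain Pearson correlation


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def procedures_passed(patient_ratings, expert_ratings, item_scores, domain_scores):
    """Count how many of the four screening procedures an item passes.

    Assumption: an item passes an importance procedure when its MEAN
    rating is at least 65; the source does not state the summary used.
    """
    checks = [
        mean(patient_ratings) >= IMPORTANCE_MIN,          # (1) patients
        mean(expert_ratings) >= IMPORTANCE_MIN,           # (2) experts
        stdev(item_scores) >= SD_MIN,                     # (3) variation
        pearson(item_scores, domain_scores) >= R_MIN,     # (4) correlation
    ]
    return sum(checks)


def keep_item(n_passed, endorsed_by_focus_group=False):
    """Rule 1: three or more procedures; rule 2: two plus focus-group support."""
    return n_passed >= 3 or (n_passed >= 2 and endorsed_by_focus_group)
```

For example, an item that fails only the expert rating passes three procedures and is retained under rule 1, whereas an item passing two procedures is retained only if the focus group supports it.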
In the end, the specific module consists of 15 items, coded COPD1-COPD15, classified into 5 facets (see Tables 1 and 3 for details).
Validation of the QLICD-COPD
Data collection and scoring
The final QLICD-COPD consists of the above general and specific modules and was used in a field survey at the First Affiliated Hospital of Kunming Medical University to study its psychometric properties. According to the guidelines,26,27 COPD was diagnosed by the presence of chronic cough, sputum production and/or dyspnea, together with other clinical indicators such as forced expiratory volume in the first second (FEV1) and forced vital capacity (FVC).
The study population was limited to COPD inpatients, at any stage of disease or treatment, who were able to read and understand the questionnaires. The investigators explained the aims of the tests and the instrument to the patients and obtained written informed consent from those who met the inclusion criteria and agreed to participate in the study. Participants were asked to complete the questionnaire by themselves at the time of admission. Each patient was assessed a second time on the second day after admission to assess test-retest reliability. A third assessment was completed at discharge, after about 1 week of treatment, to assess the responsiveness of the questionnaire. The Chinese version of the SF-36 28 was administered at the same time to provide data for evaluating the criterion-related validity, and also the convergent and discriminant validity, of the QLICD-COPD.
Participants took 15–30 min to complete the questionnaires. Each investigator checked the answers immediately each time to ensure that they were complete. If missing values were found, the questionnaire was returned to the patient to fill in the missing items.
Each item of the QLICD-COPD is rated on a 5-point Likert scale. Positively worded items are scored directly from 1 to 5, while negatively worded items are scored in reverse.
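The scoring rule can be illustrated with a short Python sketch. It assumes that "scored in reverse" means mapping a raw response r to 6 - r on the 5-point scale; the item codes and the set of negatively worded items in the usage note are hypothetical, since the text does not list them.

```python
def score_item(raw, negative):
    """Return the scored value of a 5-point item.

    Positively worded items keep their raw value (1..5); negatively
    worded items are reversed, assumed here to mean 6 - raw.
    """
    if raw not in (1, 2, 3, 4, 5):
        raise ValueError("raw response must be an integer from 1 to 5")
    return 6 - raw if negative else raw


def domain_raw_score(responses, negative_items):
    """Sum the scored items of one domain.

    `responses` maps an item code to its raw response; `negative_items`
    is the set of item codes that are negatively worded.
    """
    return sum(
        score_item(value, code in negative_items)
        for code, value in responses.items()
    )
```

With hypothetical codes, `domain_raw_score({"A": 2, "B": 5}, {"A"})` reverses item A to 4 and returns 9.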
Psychometrics analysis
Several types of validity can be distinguished.29,30 Construct validity was evaluated by the Pearson correlation coefficient r between each item and the domains, and by factor analysis with Varimax rotation. Criterion-related validity was evaluated by correlating corresponding domains of the QLICD-COPD and the SF-36. Relatively high correlations among conceptually related domains and relatively low correlations among conceptually distinct domains would suggest high criterion-related validity. Multi-trait scaling analysis 31 was used to test the convergent validity and discriminant validity of the scale. There are two criteria: (1) when the correlation between an item and its own domain is 0.40 or greater, convergent validity is supported; (2) when the correlation between an item and its own domain is higher than its correlations with other domains, discriminant validity is supported.
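The two multi-trait scaling criteria can be written as a small check per item. The sketch below is an illustration only: the function names are hypothetical, and it uses the plain item-domain correlation because the text does not say whether the item was removed from its own domain score before correlating.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def multitrait_check(item, own_domain, other_domains, threshold=0.40):
    """Apply the two criteria to one item.

    `item` and `own_domain` are score lists over the same patients;
    `other_domains` is a list of score lists for the remaining domains.
    """
    r_own = pearson(item, own_domain)
    r_others = [pearson(item, d) for d in other_domains]
    return {
        "r_own": r_own,
        "convergent": r_own >= threshold,                   # criterion (1)
        "discriminant": all(r_own > r for r in r_others),   # criterion (2)
    }
```

An item supports both kinds of validity when `convergent` and `discriminant` are both true; counting such items per domain gives the usual scaling-success summary.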
Reliability refers to the degree to which the instrument is free from random error and is evaluated by internal consistency and repeatability. Internal consistency was evaluated by Cronbach’s alpha coefficient for each domain/facet. A high internal consistency (above 0.7) indicates that the scale is measuring a single construct. Repeatability (test-retest reliability) is the stability of the instrument over time in a stable population.30 It was evaluated by the Pearson correlation coefficient between the first and second assessments, and by the intra-class correlation (ICC) under the two-way mixed model with the absolute-agreement definition for single measures.32,33
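Both reliability coefficients can be computed directly from a patients-by-items (or patients-by-occasions) score matrix. The following is a minimal sketch using the standard formulas, not the statistical package the authors used; the function names are hypothetical, and the ICC is the single-measure, absolute-agreement form obtained from two-way ANOVA mean squares.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of k item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def icc_absolute_single(rows):
    """ICC, absolute agreement, single measure, from two-way ANOVA.

    `rows` holds one list of k repeated scores per subject
    (k = 2 for a test-retest design).
    """
    n, k = len(rows), len(rows[0])
    grand = mean([x for row in rows for x in row])
    row_means = [mean(row) for row in rows]
    col_means = [mean([row[j] for row in rows]) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in rows for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)               # between-subject mean square
    ms_cols = ss_cols / (k - 1)               # between-occasion mean square
    ms_err = ss_err / ((n - 1) * (k - 1))     # residual mean square
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
```

For retest scores that are shifted by a constant, such as `[[1, 2], [2, 3], [3, 4]]`, Pearson r equals 1 while the absolute-agreement ICC is 2/3, which shows why the two coefficients are reported side by side.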
Responsiveness is the instrument’s ability to detect clinically important change over time. It was measured by comparing the mean score change between the two assessments before and after treatments using paired t-tests as well as the effect size SRM (standardized response mean), with values of 0.20, 0.50 and 0.80 having been proposed to represent small, moderate and large responsiveness, respectively.34,35
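As an illustration of these quantities, the SRM is the mean score change divided by the standard deviation of the change, and the paired t statistic is the SRM multiplied by the square root of the sample size. The sketch below uses hypothetical names and made-up scores, and it returns only the t statistic and degrees of freedom; obtaining a p-value would need a statistics library and is left out.

```python
from math import sqrt
from statistics import mean, stdev


def responsiveness(before, after):
    """Mean change, SRM and paired t statistic for two paired score lists."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_change = mean(diffs)
    srm = mean_change / stdev(diffs)   # standardized response mean
    t_stat = srm * sqrt(n)             # equals mean / (SD / sqrt(n))
    return {"mean_change": mean_change, "srm": srm, "t": t_stat, "df": n - 1}


# Made-up admission and discharge scores for four patients.
result = responsiveness([50, 60, 55, 70], [55, 62, 61, 71])
```

Because t = SRM × √n, a modest SRM can still reach statistical significance in a sample of about 100 patients, so the two measures answer different questions.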
GT analysis
In addition to CTT, in the current study we also applied GT (G theory) to analyze score dependability. G theory lets us improve the design of the measurement procedure through the techniques of experimental design and analysis of variance (ANOVA), and aims to produce reliable data through two types of study: G studies and D studies.36-39 The G study quantifies the amount of variance associated with the different facets (factors) being examined. The D study provides information about which protocols are optimal for a particular measurement situation, by means of G-coefficients that can be interpreted as reliability coefficients across the various facets of the study.
In our research, a random crossed design (person-by-item (p × i) design) was used in the G and D studies to estimate variance components and dependability coefficients. We defined a patient’s QOL as the object of measurement and used the item as a facet of measurement error. Because all participants were asked to reply to all items, the design is a one-facet crossed design.36-39 The corresponding relative and absolute errors in the G study, as well as the generalizability and dependability coefficients in the D study, were therefore estimated for each domain from the variance components.
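For a one-facet crossed p × i design, the variance components are usually estimated from the mean squares of a two-way ANOVA without replication, and the G-coefficient and index of dependability (Ф) follow from them. The Python sketch below shows this standard textbook calculation under stated assumptions; it is not the authors' software, the function names are hypothetical, and truncating negative estimates at zero is a common convention assumed here.

```python
def g_study(rows):
    """Estimate variance components for a p x i design.

    `rows` holds one list of item scores per person (all persons answer
    all items). Returns (var_p, var_i, var_pi), where var_pi is the
    person-by-item interaction confounded with residual error.
    """
    n_p, n_i = len(rows), len(rows[0])
    grand = sum(sum(row) for row in rows) / (n_p * n_i)
    p_means = [sum(row) / n_i for row in rows]
    i_means = [sum(row[j] for row in rows) / n_p for j in range(n_i)]
    ss_p = n_i * sum((m - grand) ** 2 for m in p_means)
    ss_i = n_p * sum((m - grand) ** 2 for m in i_means)
    ss_total = sum((x - grand) ** 2 for row in rows for x in row)
    ss_pi = ss_total - ss_p - ss_i
    ms_p = ss_p / (n_p - 1)
    ms_i = ss_i / (n_i - 1)
    ms_pi = ss_pi / ((n_p - 1) * (n_i - 1))
    var_pi = ms_pi
    var_p = max((ms_p - ms_pi) / n_i, 0.0)  # negative estimates set to 0
    var_i = max((ms_i - ms_pi) / n_p, 0.0)
    return var_p, var_i, var_pi


def d_study(var_p, var_i, var_pi, n_items):
    """Return (G-coefficient, Phi) for a design with `n_items` items."""
    rel_error = var_pi / n_items             # relative error variance
    abs_error = (var_i + var_pi) / n_items   # absolute error variance
    g_coef = var_p / (var_p + rel_error)
    phi = var_p / (var_p + abs_error)
    return g_coef, phi
```

In this design the G-coefficient at the observed number of items equals Cronbach’s alpha, while Ф is lower whenever the item variance component is positive, because the absolute error also counts differences in item difficulty.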
Results
Socio-demographic characteristics of the sample
At first measurement at admission to hospital, 114 patients with COPD were enrolled (mean age 67.4 ± 10.9, 71.9% were male). Of them, 104 subjects participated in the test-retest phase of the study, and 106 subjects completed an assessment of sensitivity to change at discharge.
Construct validity
Correlation analysis of the measurements at admission showed a strong correlation between each item and its own domain, with all correlation coefficients above 0.40 except for SO6 and SO10, but a weak relationship between items and other domains (see Table 1 for details).
Factor analysis extracted 8 principal components from the 30 items of the general module QLICD-GM based on the eigenvalue >1 criterion, accounting for 75.91% of the cumulative variance. These eight components mapped onto the three domains of the general module and reflected their different facets. The physical domain was represented by the first and seventh principal components, with higher loadings on PH2 (0.78), PH3 (0.64), PH4 (0.72), PH5 (0.73), PH6 (0.73), PH7 (0.79) and PH8 (0.78). The second and eighth components mainly reflected the social domain, with higher loadings on SO4 (0.84), SO5 (0.88), SO7 (0.62), SO8 (0.68) and SO10 (0.80). The other components mainly described the psychological domain, with higher loadings on PS1 (0.82), PS3 (0.65), PS6 (0.80), PS7 (0.67), PS8 (0.86) and PS11 (0.68).
Similarly, the principal component factor analysis extracted 4 principal components from the 15 items of the specific module with the cumulative variance of 70.78%, reflecting 4 facets (cough and phlegm, short breath, pulmonary encephalopathy, and effect of mental and life) with high loadings ranging from 0.61 to 0.83.
From the results described above, the theoretical structure was largely confirmed by the data, indicating good construct validity.
Criterion-related validity
Correlation Coefficients among domain scores of QLICD-COPD and SF-36 (n = 114)*.
PHD: physical domain, PSD: psychological domain, SOD: social domain, SPD: specific domain.
PF: physical function, RP: role-physical, BP: bodily pain, GH: general health, VT: vitality, SF: social function, RE: role-emotional, MH: mental-health, PCS: Physical Component Summary, MCS: Mental Component Summary.
These confirmed the criterion-related validity and also demonstrated the convergent and divergent validity to some extent.
Reliability
Reliability of the quality of life instrument QLICD-COPD (n = 114 for α, n = 104 for r and ICC).
- not acceptable/suitable.
The test-retest correlation coefficients (r) for the 4 domains and the overall QLICD-COPD were larger than 0.70, ranging between 0.70–0.86 with r = 0.85 for the overall scale. The results from ICC and their 95% confidence intervals computed based on the definition of absolute agreement were very similar to Pearson’s correlation coefficients (r).
Responsiveness
Responsiveness of the quality of life instrument QLICD-COPD (n = 106).
Results from GT
The estimated variance components and percentages of variance for the p × i design in the G-study for the four domains of the quality of life instrument QLICD-COPD.
p: person effect, i: item effect, p × i: person-by-item interaction effect.
PHD: physical domain, PSD: psychological domain, SOD: social domain, SPD: specific domain.
G-coefficients and Ф-coefficients for different numbers of items for the p × i design in the D-study for the four domains of the quality of life instrument QLICD-COPD.
Discussions
This article focuses on the main steps in the development and validation of a specific QOL instrument for COPD (the QLICD-COPD), built by combining the general module for the entire disease category with the specific module. As far as we know, although a number of instruments have been widely used to study the impact of COPD on patients’ HRQOL, none of them combines brevity, comprehensive coverage of all dimensions of HRQOL and COPD specificity. Moreover, none of them was developed directly for COPD by the modular approach, although the recent McGill COPD QOL questionnaire was established by combining generic and disease-specific properties.22,40 In addition, the QLICD-COPD has several advantages23,24 over existing instruments: (1) it allows HRQOL to be compared across diseases through the generic module while capturing symptoms and side effects through the specific module, thus showing both general and specific attributes; (2) it consists of a moderate number of items with a clear hierarchy (items → facets → domains → overall), so that mean scores can be computed not only at the domain (four domains) and overall levels but also at the facet level (15 facets) to detect changes in detail; (3) foreign-language versions can be developed from the scale with strict translation and back-translation procedures, so that its use in other languages and countries can be tested in future research.
For item generation and selection, both qualitative methods (in-depth interviews, focus group discussions) and quantitative methods (importance rating, variation and correlation procedures) were used. Qualitative methods served as essential tools for identifying content for the instrument, establishing relevant domains and rating the relevance of items (item content and clarity). The data were transcribed and managed, and thematic coding and analysis were done, with the NVivo software package. This is a large and comprehensive process covering item selection and the content validity of the instrument, and it will be reported in another paper.
In terms of quantitative methods, the thresholds for item selection were based on statistical standards (e.g. correlation coefficients <0.50) and on scale development experience or the number of items we preferred to retain (e.g. importance rating <65, SD <1.10), and were therefore subjective to some extent.
With regard to psychometrics, a practical QOL instrument must be validated in at least three respects: validity, reliability and responsiveness. We used correlation and factor analysis to confirm the construct and criterion-related validity of the QLICD-COPD. Correlation analysis showed a strong correlation between each item and its own domain/facet but a weak correlation between the item and other domains/facets, indicating good criterion-related validity and construct validity.
Our results indicated that the instrument has good reliability, with coefficients above 0.70. As for the exception, the social domain, many studies have reported a small Cronbach’s α for this domain because of its relatively high heterogeneity (social support, social relations, social security, etc.).12-15 Accordingly, we suggest that a Cronbach’s α of 0.60 is acceptable for the social domain.
For responsiveness, we focused on internal responsiveness, assuming that a sensitive instrument should detect changes after treatment. As shown in Table 4, QOL scores changed significantly after treatment for the physical and specific domains as well as for the overall score, with SRM ranging from 0.32 to 0.44 at the domain level, and 5 out of 8 facets also reached statistical significance. Possible reasons for the non-significant domains and facets are: (1) the observation period (about 1 week) was not long enough to detect significant changes; (2) they may by nature not change over such a period, such as social function in hospital. In other words, the instrument revealed changes in the domain/facet scores that were expected to change. Therefore, it can be inferred that this instrument has good (moderate) responsiveness.
Traditionally, scales are assessed by classical test theory. In this research, generalizability theory was also applied through both a G-study and a D-study. The index of dependability (Ф) is typically lower than the G-coefficient because it takes into account the main error effects in addition to the interaction effects that are used for the G-coefficient. This research presented both G-coefficients and Ф, and also how they change as the number of items changes.
For the social domain, we estimated a G-coefficient of 0.546 and an index of dependability of 0.500 for the current design, both below the acceptable level of 0.70. For an alternative design with 17 items, the G-coefficient was estimated to be 0.650 and the index of dependability 0.607, which would meet the acceptable criterion for the social domain. Theoretically speaking, it would be better to increase the number of items in the social domain from 11 to 17 in order to move dependability toward the conventional level of 0.70. However, this is very difficult in practice. Therefore, as in many other studies, we preferred to take a coefficient of 0.60 as acceptable dependability.12-15 For the other domains, the G-coefficients and indices of dependability were all greater than 0.80 for the current design and changed little as the number of items changed. The current items can therefore be considered reasonable and acceptable for all domains.
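The D-study figures quoted for the social domain can be checked with a short calculation. In a one-facet p × i design both coefficients have the form σ²(p) / (σ²(p) + error / n), so a coefficient observed with n items can be projected to n' items with a Spearman-Brown type formula. The function below is a hypothetical sketch of that check, and because it starts from the rounded published coefficients it reproduces the reported values only approximately.

```python
def project_coefficient(coef, n_items, new_n_items):
    """Project a G-coefficient or Phi from n_items to new_n_items.

    Valid for a one-facet crossed design, where the error variance is
    divided by the number of items.
    """
    k = new_n_items / n_items
    return k * coef / (1 + (k - 1) * coef)


# Social domain: 11 items in the current design, 17 in the alternative.
g_17 = project_coefficient(0.546, 11, 17)    # about 0.650
phi_17 = project_coefficient(0.500, 11, 17)  # about 0.607
```

Both projected values agree with the reported 0.650 and 0.607 to three decimal places, which is consistent with the one-facet design described in the Methods.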
In this paper, both CTT and GT (focusing on the reliability of the scale) were used to develop and validate the QLICD-COPD, so that each could compensate for the other’s weaknesses. The GT analysis further confirmed the reliability of the scale and provided much more information on the effect of changing the number of items. The number of items in the social domain could be increased to obtain better reliability, although 0.6 was accepted here. If that is not feasible, the quality of the items, rather than their quantity, should be addressed for this domain in a new version of the scale.
Finally, it is worth noting that this study has some limitations. First, the sample size is not very large, which may affect the results of the factor analysis (114 cases and 30 variables) and of the responsiveness analysis. Second, the subjects in this study were selected only from hospital inpatients. Third, the questionnaire takes 15–30 min to complete, which may be a burden for some patients. Moreover, some patients (8 cases) were not assessed at discharge because they were not on the wards when the study investigators visited at the scheduled assessment time (earlier discharge, transfer to other departments, etc.). The missing cases may affect responsiveness to some extent, although it is reasonable to infer that these events happened by chance. Further large-scale research is needed to assess the generalizability of the instrument to other settings and populations, such as outpatients in local clinics.
In summary, the QLICD-COPD is a useful instrument, with some strengths, for measuring and assessing quality of life in Chinese-speaking patients with COPD. The questionnaire is free for other researchers to use on request. Further large-scale studies are needed to confirm its psychometric properties in different settings (outpatients, community, etc.).
Supplemental Material
Supplemental material for Development and preliminary validation of the chronic obstructive pulmonary disease scale quality of life instruments for chronic diseases-chronic obstructive pulmonary disease based on classical test theory and generalizability theory by Chonghua Wan, Zheng Yang, Zhihuan Zhao, Peng Quan, Bin Wu and Yunbin Yang in Chronic Respiratory Disease
Footnotes
Acknowledgements
In carrying out this research project, we received substantial assistance from the staff of the First Affiliated Hospital of Kunming Medical University and the Affiliated Hospital of Guangdong Medical University, and from Prof. Fabio Efficace of the Italian Group for Adult Hematologic Diseases (GIMEMA). We sincerely acknowledge all their support.
Author Contributions
WCH, ZY and YYB designed the study. ZZH, QP and WB performed the data collection, WCH and QP performed the data analyses, and all authors contributed to interpreting the data. WCH and ZY wrote the first draft, which was critically revised by all the other authors. All authors have read and approved the final manuscript.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research is supported by the National Natural Science Foundation of China (71373058, 30860248), Science and Technological Planning Program of Guangdong Province (2013B021800074).
Ethics approval and consent to participate
The study protocol and the informed consent form were approved by the IRB (institutional review board) of Kunming Medical University (30860248, Kunming Medical University, 01/17/2009).