Abstract
This article describes the development of the Claudication Symptom Instrument (CSI) and its measurement properties for evaluating the symptom experience of patients diagnosed with intermittent claudication (IC). We conducted semi-structured qualitative interviews with IC patients for item development and cognitive interviews in which patient comprehension of items was tested. We evaluated measurement properties using data collected and analyzed in the context of an observational comparative effectiveness study of IC treatments. Items measuring five symptoms important to patients were developed and cognitively tested: Pain, Numbness, Heaviness, Cramping, and Tingling. Item means (higher scores indicate worse symptoms) ranged from 1.1 (Tingling) to 2.3 (Pain) (possible range: 0 ‘none’ to 4 ‘extreme’). Rasch analysis yielded support for an overall score (χ2=26.5, df=20, p=0.15). The total CSI score differed by clinician-rated severity of mild versus moderate (p<0.05), but not moderate versus severe. Re-administration of the CSI 5–10 days after baseline yielded an intra-class correlation coefficient of 0.86. Changes in CSI total score and VASCUQOL total score between baseline and 6 months post-treatment were correlated at −0.52 (p<0.05). The CSI preliminarily meets accepted measurement standards for content validity, internal consistency and test-retest reliability, construct validity, and sensitivity for detecting change. Because of its high test-retest reliability, it may also be useful in clinical care with individual patients. It takes approximately 3 minutes to complete.
Introduction
Peripheral artery disease (PAD) affects eight million American adults 1 and is associated with an increased risk of myocardial infarction, stroke, and death. 2 The most common classical presentation of lower extremity PAD, intermittent claudication (IC), manifests as pain with walking due to an imbalance between oxygen consumption and delivery. With disease progression, impairments typically appear in walking and daily functioning and can affect health-related quality of life (HrQoL). Goals for clinical management and treatment aim to decrease cardiovascular morbidity and mortality, reduce the risk of limb threat, and improve patients’ symptom experience, functional status, and HrQoL. Treatment strategies include lifestyle modification, medication management, and, for more severe limitations, surgical and percutaneous intervention. 3
Patient-reported outcomes (PROs) can aid treatment decision-making, document disease progression, and/or characterize patients’ perspectives of their disease-related symptoms, function, and HrQoL. 4 The incorporation of PROs into clinical care reflects a recognition that clinicians and patients sometimes offer different views of what constitutes an improved outcome.4–6 In IC, clinicians often describe treatment success in terms of hemodynamic improvement, bypass/stent patency, or ability to walk a number of city blocks, which may miss outcomes that patients care most about.
Existing PROs that measure IC-specific symptoms include the Intermittent Claudication Questionnaire (ICQ), 7 the Peripheral Artery Questionnaire (PAQ), 8 and the Questionnaire for Patients with Intermittent Claudication (CLAU-S). 9 The ICQ combines ‘Pains’, ‘Cramps’, ‘Numbness’, and ‘Discomfort’ into the same item stem and the PAQ combines ‘Discomfort’, ‘Fatigue’, ‘Pain’, ‘Aching’, and ‘Cramps’. Both instruments were developed with patient input. The CLAU-S assesses pain only and it is unclear if patients were involved in item development. 10 The ability of the ICQ and PAQ to detect change associated with treatment has not been established, and the CLAU-S has been shown to detect change for patients who improve, but not for those whose clinical status worsens. 10
In the qualitative interviews we conducted, patients described experiencing several distinct IC symptoms, the most prevalent ones being ‘Pain’, ‘Numbness’, ‘Heaviness’, ‘Cramping’, and ‘Tingling’. We believe it is plausible that patients may differ in how important they consider each of these different symptoms to be for their daily experience and functioning. If these distinct symptoms are assessed only globally, however (as current measures do), this important source of variation in patient experience will not be accounted for in treatment studies and clinical practice. To improve symptom assessment and optimize outcome sensitivity to patient experience in treatment studies, we developed and tested the Claudication Symptom Instrument (CSI).
Materials and methods
The study was carried out in two phases. In Phase 1, instrument development, we interviewed patients diagnosed with IC about their symptom experience (‘concept elicitation’) and their understanding of the symptom terms, recall period, and response scales we used for the instrument (‘cognitive interview’). In Phase 2, instrument validation, the measurement properties of the CSI were investigated as a sub-study in the context of a multisite, longitudinal, prospective observational cohort comparative effectiveness research study. 11 The study examined the comparative effectiveness of the following three treatment strategies for infrainguinal IC: medical management (physician-recommended walking program, smoking cessation, and phosphodiesterase III inhibitors) versus endovascular or surgical revascularization. We compared the change from baseline to 6- and 12-month physical function, HrQoL, and symptom scores using PRO measures. This study population consisted of English-speaking patients aged 21 years or older with newly diagnosed or established IC. Those with acute ischemia, rest pain or ulceration, or isolated aortic or iliac claudication were excluded. Potential participants were identified from clinician appointment schedules, followed by review of electronic medical records. Once the diagnosis was confirmed, recruitment and enrollment adhered to standard protocols. After participants completed a baseline survey, they were categorized into one of the following three cohorts: medical management, endovascular revascularization, or surgical revascularization. Patient-reported outcomes were collected at baseline, 6 months, and 12 months. Data were collected using DatStat, a secure web-based platform (DatStat Inc., Seattle, WA, USA). Patient characteristics were obtained at baseline through medical record abstraction or self-report.
The University of Washington Human Subjects Committee served as the study’s institutional review board of record and approved the study. Preliminary consent was obtained via a scripted telephone conversation. This was followed by completion of a written consent form and a Health Insurance Portability and Accountability Act (HIPAA) form (with witness signatures from the study coordinator), both of which were returned to the study coordinator via US mail. All study data were de-identified. 11
Recruitment and enrollment for Phases 1 and 2
We recruited patients diagnosed with IC from 15 clinics in 11 hospitals in Washington State. We identified participants through clinic schedules, or by direct referral by site physicians or clinic staff. We also displayed study posters and brochures in participating clinics.
Inclusion/exclusion criteria for Phases 1 and 2
Eligible participants were those diagnosed with IC. Participants were ineligible if they: (1) had documented acute ischemia, rest pain, or tissue loss; (2) had IC not caused by atherosclerotic disease; (3) had IC of aortic or iliac origin; (4) had a diagnosis of dementia confirmed in the medical record; (5) were not English speaking; or (6) were aged 20 years or younger. The inclusion/exclusion criteria were established to capture a sample of IC patients most appropriate for the intervention arms being tested.
Phase 1: Instrument development
Study procedures
From April through June 2011, we conducted semi-structured qualitative concept elicitation interviews with patients recruited in the Seattle area in order to elicit from them IC-related symptoms they experienced and in what circumstances they occurred. As the parent comparative-effectiveness study excluded patients with claudication of aortic or iliac origin, the interview questions focused on symptoms occurring in the leg or foot, and differentiated between symptoms occurring as a result of exertion and those that patients reported also occurring at rest.
Interviews were conducted until concept saturation was reached, defined as the point at which no new symptom information was obtained over three consecutive interviews. We interviewed a total of 11 participants, in person or over the telephone (about 50% each). The purpose of the interviews was to develop a new symptom instrument, which unlike other available measures would assess separate IC symptoms, with the goal of enhanced measurement precision and the ability to detect treatment-related change. The interviews lasted 30–45 minutes, and were audio-recorded and transcribed by a professional transcription service. We removed identifying information and uploaded the transcripts to an ATLAS.ti™ qualitative database for coding and analysis (ATLAS.ti Scientific Software Development, Berlin, Germany). 12
We conducted cognitive interviews with the last four concept elicitation participants, who completed a draft of the CSI and identified items that seemed confusing or redundant. 13
Phase 2: Instrument validation
Study measures
Instruments included to assess the measurement properties of the CSI were the Walking Impairment Questionnaire (WIQ), 14 the Vascular Quality of Life Questionnaire (VASCUQOL), 15 and an item assessing Claudication Impact on Usual Daily Activities adapted from the Work Productivity and Activity Impairment Questionnaire. 16 The WIQ assesses impairment in the domains of Walking Distance, Speed, and Stair Climbing, and has been validated against treadmill testing and the 6-Minute Walk.17,18 The VASCUQOL assesses quality of life specific to chronic lower limb ischemia. IC severity was categorized based on clinician judgment from notes in the electronic medical record. This was defined in terms of the participant’s ability to walk: two to three blocks or 900 feet (Mild), one to two blocks or 600 feet (Moderate), or less than one block or 300 feet (Severe).
Data collection
We collected and managed the study data using Research Electronic Data Capture (REDCap) tools (Vanderbilt University, Nashville, TN, USA). 19
Test-retest procedure
To assess retest reliability of the CSI, we administered it at two time points a minimum of 5 days apart, during which time no intervention occurred (the acceptable window was 5–10 days).
Six-month follow-up procedure
The index procedure date was the date that was used as the starting point for all follow-up analysis in the study. For participants enrolled in one of the interventional cohorts (surgical bypass/endovascular intervention), the index procedure date was the date at which the surgical bypass or endovascular procedure occurred. For participants enrolled in the medical management cohort, the index procedure date was the date occurring 28 days after the participant completed the baseline survey. The 6-month follow-up survey was collected 6 months after the index procedure date (–1 month/+2 month window).
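As a hedged illustration of the timing rules above, the following Python sketch derives an index date and a 6-month collection window. The function names, the cohort labels, and the reading of the –1/+2 month window as whole calendar months are assumptions made for illustration; they are not taken from the study protocol.

```python
import calendar
from datetime import date, timedelta


def add_months(d, months):
    """Shift a date by whole calendar months, clamping the day to the month's end."""
    month_index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(month_index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def index_date(cohort, baseline_survey, procedure=None):
    """Index date: procedure date for interventional cohorts,
    baseline survey date plus 28 days for medical management."""
    if cohort == "medical":  # hypothetical cohort label
        return baseline_survey + timedelta(days=28)
    if procedure is None:
        raise ValueError("interventional cohorts need a procedure date")
    return procedure


def six_month_window(index):
    """Return (earliest, target, latest) dates for the 6-month survey,
    assuming the window is 1 month before to 2 months after the target."""
    return add_months(index, 5), add_months(index, 6), add_months(index, 8)


if __name__ == "__main__":
    idx = index_date("medical", date(2012, 1, 3))
    print(idx, six_month_window(idx))
```

Clamping to the end of the month is one possible convention for dates such as the 31st; a day-count window (for example, 30-day months) would be an equally defensible reading.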
Analysis
Phase 1: CSI development
To create a content-valid outcome measure, we followed accepted development standards. 20 An initial code list was developed in the early stages of coding and refined as the coding process progressed. Each interview transcript was coded by one investigator and checked by a second, and discrepancies were reconciled. Concept saturation was reached at 11 interviews, in that no new symptom information was obtained in the last three interviews.
Phase 2: CSI validation
Standard descriptive statistics were first run on the data, and frequency and patterns of missing data examined. For continuous measures, the mean, median, standard deviation (SD), percentiles and range were computed. For categorical measures, the frequency and mode were computed. Unless otherwise noted, statistical analyses were conducted with SAS software, Version 9 (SAS Institute Inc., NC, USA).
Measurement model and scale internal consistency
Item reduction statistics were assessed, including: (1) items with greater than 5% missing data; (2) items demonstrating a floor or ceiling effect; (3) item-to-total correlation lower than 0.40; and (4) item-to-item correlation greater than 0.70. The inter-item correlations of Pain, Numbness, Heaviness, Cramping, and Tingling were examined. We tested item fit, response thresholds, and dimensionality using methods guided by Rasch Measurement Theory.21–23 All Rasch analyses were performed using RUMM 2030 (RUMM Laboratory Pty Ltd, Perth, Australia). 24 Floor and ceiling effects were considered to be present if more than 15% of respondents endorsed the lowest (‘None’) or highest (‘Extreme’) response option, respectively. 25 Cronbach’s alpha coefficient was used to assess internal scale consistency, with a minimum coefficient of 0.70 considered necessary to establish internal consistency. 25
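The floor/ceiling rule and the internal consistency criterion described above can be sketched in Python as follows. This is a minimal illustration using the standard formula for Cronbach's alpha and the 15% threshold stated in the text; the function names and the toy data are assumptions, and the study's own analyses were run in SAS and RUMM 2030.

```python
from statistics import variance


def floor_ceiling(responses, lowest=0, highest=4, threshold=0.15):
    """Return (floor_effect, ceiling_effect) for one item.

    An effect is flagged when more than `threshold` of respondents
    endorse the lowest or highest response option, respectively.
    """
    n = len(responses)
    floor_share = sum(r == lowest for r in responses) / n
    ceiling_share = sum(r == highest for r in responses) / n
    return floor_share > threshold, ceiling_share > threshold


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of item scores per respondent
    (complete cases only)."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    toy_item = [0, 0, 1, 2, 3, 4, 1, 2, 3, 1]  # illustrative data only
    print(floor_ceiling(toy_item))
    toy_rows = [[0, 1, 0], [2, 2, 1], [3, 4, 4], [1, 1, 2]]
    print(round(cronbach_alpha(toy_rows), 2))
```

An alpha of at least 0.70 from `cronbach_alpha` would meet the minimum stated in the text.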
Test-retest reliability
We tested reproducibility of the CSI on a randomly selected sample of 25 participants who completed it a minimum of 5 days post-baseline, using the intra-class correlation coefficient (ICC). ICC ranges between 0.00 and 1.00, and the minimal acceptable level is 0.70 for group comparisons 25 and 0.90 for use with individual patients. 26
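The text does not state which form of the ICC was used. As an assumption for illustration only, the sketch below computes the two-way random effects, absolute agreement, single-measurement coefficient, often written ICC(2,1), from baseline and retest total scores. The function name and the example data are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1) from a list of rows, one row per participant,
    each row holding that participant's score on every occasion."""
    n = len(scores)        # participants
    k = len(scores[0])     # occasions (2 for test-retest)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


if __name__ == "__main__":
    # Illustrative (baseline, retest) pairs, not study data.
    pairs = [[4, 5], [2, 2], [7, 6], [3, 4], [5, 5]]
    icc = icc_2_1(pairs)
    print(round(icc, 2), "group use" if icc >= 0.70 else "below 0.70")
```

Because this form penalizes systematic shifts between occasions, it is a conservative choice; a consistency-type ICC would give a higher value when all retest scores move by the same amount.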
Construct validity
To assess construct validity, we computed Pearson’s correlation coefficient to examine strength of association between the CSI and WIQ, Impact on Usual Daily Activities, and VASCUQOL. Based on studies of the relationship between symptoms, function, and HrQoL in other health areas,27,28 we hypothesized that IC symptoms (as assessed by CSI) would be more strongly associated with HrQoL (as assessed by VASCUQOL) than with function (as assessed by WIQ and Impact on Usual Daily Activities). Construct validity is demonstrated if hypotheses are specified in advance and ≥75% of the hypotheses are supported by results (in groups of ≥50 study participants). Correlations between measures of related constructs should be ≥0.50. 10,25
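The decision rule above can be illustrated with a short Python sketch: a plain Pearson correlation plus a check that at least 75% of prespecified hypotheses were supported in a sample of at least 50. The function names and the example inputs are assumptions for illustration and do not reproduce the study's hypothesis list.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def construct_validity_supported(hypothesis_outcomes, n_participants,
                                 min_share=0.75, min_n=50):
    """True when enough prespecified hypotheses held in a large enough sample.

    `hypothesis_outcomes` is a list of booleans, one per hypothesis.
    """
    share = sum(hypothesis_outcomes) / len(hypothesis_outcomes)
    return n_participants >= min_n and share >= min_share


if __name__ == "__main__":
    print(round(pearson_r([1, 2, 3, 4], [2, 1, 4, 3]), 2))
    print(construct_validity_supported([True, True, True, False], 323))
```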
Known groups validity
We evaluated known groups validity by examining mean CSI scores in relation to clinician-rated IC severity. We hypothesized that CSI scores would be consistently higher across the clinician-rated severity categories (Mild, Moderate, Severe), with the highest CSI scores found in the Severe group of participants. We used one-way analysis of variance (ANOVA) and group differences were expected to be statistically significant at the 0.05 level.
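As a minimal sketch of the one-way ANOVA used for this comparison, the following Python function computes the F statistic and its degrees of freedom from groups of CSI scores. The function name and the toy scores are illustrative assumptions; obtaining a p-value additionally requires the F distribution, which is left to a statistics package.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists,
    one list per severity category."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Toy scores for Mild, Moderate, Severe; not study data.
    mild, moderate, severe = [1, 2, 3], [2, 3, 4], [5, 6, 7]
    print(one_way_anova_f([mild, moderate, severe]))
```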
Sensitivity for detecting change
To assess the ability of the CSI to detect changes associated with IC treatment, we examined correlations of CSI change scores with VASCUQOL and WIQ change scores between baseline and 6 months post-intervention (all treatment groups combined) and calculated effect sizes. Measure sensitivity is demonstrated if correlations between changes on instruments measuring the same construct (CSI and VASCUQOL) are ≥0.50. 10
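The text does not specify how the effect sizes were standardized. The sketch below shows one common formulation for responsiveness, mean change divided by the standard deviation of baseline scores, as an assumption for illustration; other variants (for example, dividing by the standard deviation of the change scores) would give different values. Function names and data are hypothetical.

```python
from statistics import mean, stdev


def change_scores(baseline, follow_up):
    """Per-participant change (follow-up minus baseline)."""
    return [f - b for b, f in zip(baseline, follow_up)]


def standardized_effect_size(baseline, follow_up):
    """Mean change divided by the SD of baseline scores (assumed variant)."""
    return mean(change_scores(baseline, follow_up)) / stdev(baseline)


if __name__ == "__main__":
    # Toy CSI totals at baseline and 6 months; not study data.
    base = [4, 6, 8, 5, 7]
    six_months = [3, 5, 8, 4, 5]
    print(round(standardized_effect_size(base, six_months), 2))
```

With a symptom scale scored so that higher is worse, improvement yields a negative value under this convention, so the magnitude is what would be compared with the reported effect sizes.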
Results
Phase 1: Instrument development
Participant characteristics
Eight concept elicitation/cognitive interviewees were recruited from the University of Washington (UW) Vascular Clinic, and three from the UW Harborview Medical Center. Seven were male and four female. The age range of the interviewees was 54–78 years (median 62 years). Nine were Caucasian and two African-American. The earliest had been diagnosed with IC in 1992 and the most recent in 2011. All had had IC treatment in the past, or were in the process of planning treatment. Of those who had treatment, a mix of medical management and/or endovascular intervention/surgical bypass had been received.
Content results
Concept elicitation participants identified five distinct symptoms as important to them: Pain, Numbness, Heaviness, Cramping, and Tingling. These symptoms were reported as occurring most commonly while walking, and were more pronounced when walking up stairs than on level ground, and even more pronounced when walking up an incline than walking up stairs. An initial symptom code list was developed, and refined as the coding process progressed. Some codes were combined over time (such as ‘aching’ and ‘burning’ into pain), while new codes were added (such as ‘tingling’). ‘Tiredness’ was collapsed with ‘heaviness’.
Cognitive interview results
The four interviewees felt that the five CSI symptoms were distinguishable from one another, and were relevant to their experience of IC. The format of the CSI was modified as a result of the cognitive interviews so the flow across the page from left to right was clearer. The 7-day recall period was considered reasonable for recalling the worst symptom intensity experienced and the response options were considered sufficient in number and clear in their order of magnitude.
CSI design
We designed the CSI to consist of five items assessing IC symptoms in the leg or foot experienced in the previous 7 days. Each item is rated on a 5-point intensity scale for the worst intensity experienced in the past 7 days (range: 0 ‘none’ to 4 ‘extreme’). For each item, we also included response options for participants to indicate in what circumstances the symptom had occurred: Sitting or standing still; Walking on flat ground; Walking up stairs and/or Walking up hill. See the Appendix, available as Supplementary material, for the CSI instrument and instructions.
Phase 2: Instrument validation
Participant characteristics
The mean age of participants (n=323) was 71.1 years (SD 9.5), 70% were male and 87% were white. Ninety-nine percent reported living totally independently and 50% reported having had a previous vascular procedure for IC (Table 1).
Phase 2 participant characteristics (n=323).
Item descriptive statistics
No items had greater than 5% missing data and none demonstrated ceiling effects (% endorsing ‘Extreme’). We found floor effects (% endorsing ‘none’) for Numbness (47%), Heaviness (53%), Cramping (36%), and Tingling (47%) (Table 2). The percentages of participants reporting the presence of each symptom in the past 7 days were Pain = 92%, Numbness = 53%, Heaviness = 47%, Cramping = 64%, and Tingling = 53%. Mean symptom intensity scores ranged from 1.09 (Tingling) to 2.34 (Pain) (range: 0 ‘none’ to 4 ‘extreme’).
Claudication Symptom Instrument – item descriptive statistics (n=323) (baseline).
Measurement model and internal consistency
Symptom intensity inter-item correlations ranged from 0.25 (Cramping-Tingling) to 0.51 (Numbness-Tingling). None of the items had item-to-total score correlations <0.40, which would have indicated an item belonging to a different scale. We found evidence that the set of five CSI items covered the theoretical distribution of claudication severity from low to high. Table 3 shows the overall fit of the items and the location of the items along the continuum. All items appear to fit the Rasch model, with the possible exception of the Heaviness item, with a Fit Residual of 2.98. We also found potential evidence of disordered response thresholds for the items. The thresholds were resolved for four of the items by combining the response options ‘moderate’ and ‘quite a bit’. Rescoring the Heaviness item to dichotomous response options (yes/no) resolved the disordered thresholds for that item. Disordered thresholds occur when patients demonstrate difficulty consistently discriminating between response categories. Disordered thresholds often arise if there are too many response options or if the response labelling is confusing.
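The rescoring described above can be sketched in Python under explicit assumptions: that ‘moderate’ and ‘quite a bit’ correspond to raw codes 2 and 3 on the 0–4 scale, that the collapsed categories are renumbered consecutively, and that a rescored total is the simple sum of rescored items. None of these details is stated in the text, so the mapping and function names below are hypothetical.

```python
# Assumed recoding for the four polytomous items:
# raw 0 -> 0, 1 -> 1, 2 ('moderate') and 3 ('quite a bit') -> 2, 4 -> 3.
COLLAPSED = {0: 0, 1: 1, 2: 2, 3: 2, 4: 3}

ITEMS = ("Pain", "Numbness", "Heaviness", "Cramping", "Tingling")


def rescore_item(item, raw):
    """Apply the assumed recoding to one raw 0-4 response."""
    if item == "Heaviness":
        return 0 if raw == 0 else 1  # dichotomous: symptom absent/present
    return COLLAPSED[raw]


def rescored_total(responses):
    """Sum of rescored items; `responses` maps item name to raw 0-4 score."""
    return sum(rescore_item(item, responses[item]) for item in ITEMS)


if __name__ == "__main__":
    example = {"Pain": 3, "Numbness": 2, "Heaviness": 4,
               "Cramping": 0, "Tingling": 4}
    print(rescored_total(example))
```

Under these assumptions the rescored total would range from 0 to 13 rather than 0 to 20.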
Measures of fit and location (SE) of Claudication Symptom Instrument items (baseline).
SE, standard error; DF, degrees of freedom.
Table 4 summarizes the overall fit to the Rasch model with the modifications to the response options (χ2=26.50, df=20, p=0.15). In this analysis, a non-significant result indicates preliminary evidence that the five CSI items as a set represent a unidimensional construct. The internal consistency reliability of the resulting CSI scale is Cronbach’s alpha=0.73. For a fuller explanation of interpretation of Rasch method results, see Petrillo et al. 29
Indices of fit to the Rasch model.
The item–trait interaction is a measure of overall fit to the Rasch model.
Item–trait interaction statistic is a χ2 that is determined from the comparison between the expected score and the mean observed score for groups of patients with similar ability (i.e. claudication symptom severity) estimates on an item.
A statistically significant result on a χ2 test suggests that some items may not fit the Rasch model. Our results show a non-significant result on this test, suggesting that the items fit the model and are collectively assessing the construct of interest.
The item–person interaction statistic is a standardized residual derived from the difference between the expected or modeled score and the obtained score for each segment to each item. This statistic is determined for each Claudication Symptom Instrument item and can be summarized over the entire set of items.
An ideal model fit is a mean of 0 and a standard deviation of 1.
Test-retest reliability
The CSI total score had an ICC of 0.86, exceeding the standard of 0.70 for group comparisons 25 and approaching the standard of 0.90 for use with individual patients. 26
Construct validity
As hypothesized, correlations between the CSI total score and VASCUQOL domain scores were higher (range 0.50–0.64) than correlations between the CSI and the WIQ (range 0.21–0.46). The correlation between the CSI and Impact on Daily Activities, however, was higher than hypothesized (0.51) (Table 5).
Correlation of Claudication Symptom Instrument total score with other scales (n=323) (baseline).
Known groups validity
The total CSI score differed significantly by clinician-rated IC severity categories (F=3.64, p<0.05). The mild versus moderate severity categories were significantly different from one another in the expected directions (p<0.05), but the mild versus severe and moderate versus severe categories were not (Table 6).
Means and standard deviations of Claudication Symptom Instrument (CSI) by clinician-rated severity (n=197 a ) (baseline).
Mild vs moderate: p<0.05; mild vs severe: p=0.10; moderate vs severe: p=1.00.
Missing data: clinician-rated severity from medical record.
Sensitivity for detecting change
For the Medical arm, CSI total scores were 4.18 at baseline, 4.11 at 6 months and 4.07 at 12 months. For the Procedural arm (endovascular and surgical groups combined), they were 3.91, 3.34 and 3.28, respectively. Changes in the CSI total score and VASCUQOL total score between baseline and 6 months post-treatment were correlated at r=−0.52 10 (p<0.05), and the CSI change score and WIQ Distance change score were correlated at r=−0.30 (p<0.05). The effect sizes obtained (Cohen’s d) between baseline and 6 months post-treatment (all groups combined) for the study measures were: CSI total=0.23, WIQ Distance=0.21, WIQ Stairs=0.18, WIQ Speed=0.14, and VASCUQOL total=0.30 (all p<0.05).
Respondent burden
The CSI took approximately 3 minutes to complete, and, as described above, missing data were minimal.
Discussion
Our study participants identified five distinct symptoms of IC that they had experienced: Pain, Numbness, Heaviness, Cramping, and Tingling. When we collected CSI data from participants in the parent comparative effectiveness study, we found that 47% (Heaviness) to 92% (Pain) reported experiencing these five symptoms in the last 7 days. While we found floor effects for Numbness, Heaviness, Cramping, and Tingling, sizeable proportions of study participants reported experiencing these symptoms. This is consistent with clinical sources of information regarding symptoms associated with PAD.30–32 We also found psychometric evidence indicating that these symptoms represent a single unidimensional construct and that the CSI measures this construct across its continuum. This provides support for using the CSI total score in treatment studies. We found that the internal consistency of the scale was maximized with all five symptoms included (Cronbach’s alpha = 0.73). The high test-retest reliability of the CSI total score suggests that it may be useful in clinical practice as well as clinical research. Because the CSI captures a variety of symptoms, it may be useful as a tool for facilitating shared treatment decision-making among patients and treatment providers. While the effect size obtained with the CSI in this comparative effectiveness study was relatively small (0.23), it could also be useful for obtaining sample size estimates in other controlled clinical studies.
A primary objective in developing the CSI was to create an outcome instrument that would comprehensively capture patient symptom experience and be sensitive for use in treatment studies. Our results indicate that this objective was achieved as the CSI change score was sufficiently correlated with the change score of the VASCUQOL. As in previous studies,27,28 we found a stronger relationship between IC symptoms (CSI) and HrQoL (VASCUQOL) than between symptoms and function (WIQ). This may be due to symptoms and HrQoL having a more subjective or perceptual basis to them than function, which can be assessed more objectively. The WIQ has been validated against the 6-Minute Walk 17 and treadmill testing. 18
Future work should focus on how to interpret CSI change scores in different treatment contexts and on the development of responder criteria. 33 Future work should also examine further the correspondence between the CSI and clinician-rated IC severity. We found the CSI to track clinician-rated severity between the mild–moderate severity categories, but not between the mild–severe and moderate–severe categories. IC severity was categorized based on clinician judgment from notes in the electronic medical record. This was defined in terms of the participant’s ability to walk: two to three blocks or 900 feet (Mild), one to two blocks or 600 feet (Moderate), or less than one block or 300 feet (Severe). We have no way of knowing, however, to what degree the information was obtained from patients in a standardized and reliable manner in the clinical setting. Therefore, we hypothesize that the symptom information obtained from patients in a standardized manner with the CSI may better reflect claudication severity than the function information obtained from the medical record.
Limitations
Limitations of the study include a relatively small Phase 1 sample, although concept saturation was documented. A second limitation is the predominantly Caucasian samples in both Phase 1 and Phase 2. A third limitation was that the comparative effectiveness study within which the CSI’s measurement properties were assessed was not designed specifically for that purpose and thus was not ideal for examining instrument sensitivity to change. Rather, outcome sensitivity is typically assessed in the context of a treatment of known effectiveness. A final weakness is the lack of data relating the CSI score to a well-established objective measure such as treadmill walking distance or the 6-Minute Walk, although the WIQ (self-reported) used in the study has been validated against treadmill testing and the 6-Minute Walk. A strength of the study is its careful inclusion and consideration of the patient perspective, and development of a symptom instrument that comprehensively captures that perspective.
Conclusion
The CSI preliminarily meets established measurement standards for reliability, validity, and sensitivity for detecting change. It includes five distinct IC symptoms identified by participants, takes approximately 3 minutes to complete, and is easy to administer and score. The CSI will be useful as a valid, reliable, and sensitive instrument for assessing IC treatments in the context of clinical trials and possibly clinical care. If used as a profile, each CSI item provides important information about specific patient symptoms and the circumstances under which they occurred, and as such could be useful for informing treatment planning and assessment. Non-English versions are needed.
Footnotes
Acknowledgements
We would like to acknowledge Cheryl Armstrong and Rebecca Symons, without whose valuable assistance this paper would not have been possible.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: this work was supported by the Agency for Healthcare Research and Quality (R01HS020025, PI: D Flum).
Supplementary material
The supplementary material is available online with the article.
