Abstract
Background
In anticipation of future clinical trials involving individuals with Parkinson's disease (PD), it is important to have fully validated, clinically-relevant, responsive, disease-specific patient-reported outcome measures that are capable of serially measuring how an individual with PD feels and functions.
Objective
To develop and validate a disease-specific patient-reported outcome measure for PD that is sensitive in detecting clinically meaningful changes in multifactorial disease burden during clinical trials.
Methods
We conducted qualitative interviews and a cross-sectional study to identify the symptoms that are most important to individuals with PD. Symptom questions with the highest frequency and relative importance to the cross-sectional study cohort were selected for potential inclusion in the Parkinson's Disease-Health Index (PD-HI). The PD-HI was evaluated and refined through: 1) Qualitative beta interviews; 2) Test-retest reliability assessments; 3) Internal consistency analysis; and 4) Cross-sectional subgroup analysis of PD-HI subscale scores.
Results
Twenty individuals with PD participated in initial qualitative interviews and 404 individuals with PD participated in the cross-sectional study. Symptom questions representing 13 symptomatic themes of PD health were selected for inclusion in the PD-HI. Beta interview participants found the PD-HI to be comprehensive, relevant, and easy to use. The PD-HI demonstrated high test-retest reliability (ICC = 0.90) and internal consistency (α = 0.99). PD-HI total and subscale scores successfully distinguished between subgroups of participants with varying disease severity.
Conclusions
Initial evaluation of the PD-HI demonstrates its content validity, construct validity, usability, and test-retest reliability as a patient-reported outcome tool for potential use in PD clinical trials.
Plain language summary
This research describes the development of a disease-specific, patient-reported outcome measure called the Parkinson's Disease-Health Index (PD-HI). The PD-HI is a questionnaire completed by an individual with Parkinson's disease (PD) that provides an assessment of how they feel and function. This is a useful tool for tracking changes in health over time in the clinic or during research studies. The PD-HI is comprised of 13 subscales that ask about symptoms related to physical, social, and emotional PD health. When completed, the PD-HI produces a score of overall disease severity that ranges from 0 (no disease burden) to 100 (maximum disease burden). The symptom questions in the PD-HI were selected based on direct input from over 400 adults with PD who participated in qualitative interviews or a national survey study. The PD-HI was designed to measure the symptoms that are most important to individuals with PD. Additionally, testing of the PD-HI showed that the format and wording are acceptable to people with PD, that the questions are reliable over a short period of time, and that the questions in each subscale are consistent with the underlying concept that they intend to measure. In summary, the PD-HI is a new tool available to the PD research community that can help researchers evaluate therapeutic interventions for PD based on the priorities and perspectives of research participants.
Keywords
Introduction
Parkinson's disease (PD) is a movement disorder characterized by tremor, muscular rigidity, bradykinesia, and impaired balance and gait.1 In addition to these cardinal features, patients also experience a wide variety of non-motor symptoms that can affect their quality-of-life, including sleep impairment, fatigue, difficulty with communication, and impaired emotional and social health.2–4 In preparation for clinical trials involving individuals with PD, it is important that researchers have access to outcome measures that are able to comprehensively measure the symptoms of the disease that are most important to patients with PD. The United States Food and Drug Administration (FDA) has identified patient-reported outcome (PRO) measures as valid mechanisms to quantify how patients feel and function during clinical trials, and to support the merit of drug-labeling claims.5,6 Disease-specific PROs have been established as tools that can bring the patient voice to the forefront of clinical trials and measure small but clinically relevant changes in disease burden.7 They also provide an opportunity to measure symptoms of importance that are difficult to quantify without the direct input of the patient.
There are numerous disease-specific PROs for PD that have been previously developed and used to monitor PD disease status longitudinally, including global scales of disease severity such as the Parkinson's Disease Questionnaire-39, sections of the Movement Disorders Society-Unified Parkinson's Disease Rating Scale, and the Patient-Reported Outcomes in Parkinson's Disease.8–10 In addition, several PROs have been developed to assess specific symptomatic domains in PD such as the Parkinson's Disease Sleep Scale, the Parkinson Fatigue Scale, the Sialorrhea Clinical Scale for PD, and the Freezing of Gait Questionnaire.11–14 Despite the availability of a variety of PROs for PD, these tools have sometimes failed to comprehensively capture the symptoms and experiences that are most important to people with PD, especially those with early-stage PD.15 As stated by Morel et al. (2022), “The development of a new PRO instrument, created in conjunction with people with PD, that fully assesses symptoms and the experience of living with early-stage PD, is required.”15 Additionally, prior research has pointed to the need for PD PROs to include non-motor symptoms of PD, which are occasionally underrepresented in existing PROs and are known to be significant contributors to patient quality-of-life.16,17 This disconnect between what is measured by PD clinical trial outcome measures and what individuals with PD feel is important to capture during clinical trials is likely due to the lack of direct involvement of individuals with PD throughout the PRO development process. This research sought to fill a gap in existing PD clinical trial infrastructure by implementing patient-centered methodology to develop a new PRO known as the Parkinson's Disease-Health Index (PD-HI). The PD-HI was designed to measure how individuals with PD feel and function, and to track clinically relevant changes in disease burden during a clinical trial.
Importantly, the PD-HI uses a novel weighted scoring algorithm that accounts for the prevalence and perceived importance of each of the items, generating meaningful composite scores of disease burden that reflect patient priorities. The development and validation of the PD-HI were conducted in accordance with FDA guidelines to ensure its suitability for use in future clinical trials and in support of drug-labeling claims.5 The present study describes the comprehensive development process of the PD-HI and provides initial evidence for its content validity, construct validity, test-retest reliability, and internal consistency.
Methods
Study participants
All participants in this study: 1) Were aged 18 years or older; 2) Reported that they have been diagnosed with PD; and, 3) Spoke English. Semi-structured qualitative interview participants were recruited from the University of Rochester Brain Health Registry, the Parkinson's Support Groups of the Finger Lakes, and the University of Rochester affiliated movement disorder clinics. Cross-sectional study participants were recruited from the Davis Phinney Foundation for Parkinson's, the Michael J. Fox Foundation, and the Parkinson & Movement Disorder Alliance. Beta interview participants and test-retest reliability participants were recruited from the University of Rochester Brain Health Registry. The demographic characteristics of each sample cohort in this study are provided in Table 1.
Table 1. Demographic characteristics of research participants.
Disease duration was calculated as current age minus age at PD diagnosis.
Semi-structured qualitative interviews and a national cross-sectional study
In prior work, we conducted semi-structured qualitative interviews with adults with PD to identify potential symptoms of importance to those living with this disease.18 Using open-ended interview questions from a comprehensive interview guide, we asked participants to identify the symptoms of PD that have the greatest impact on their daily life. These interviews were performed by the study team clinical research coordinators, who were experienced in conducting and analyzing qualitative interviews with neurological disease patients. The comprehensive interview guide was developed from a literature review conducted by a multidisciplinary team of neurologists and qualitative researchers, and is provided in Appendix A. These questions served as a template to probe specific symptomatic domains; however, the interviewers were purposeful in allowing the participant to guide the discussion and elaborate on the symptoms that they think are most important and impactful to their lives. The sample size for the initial qualitative interviews was determined by content saturation, or the point at which participants were no longer mentioning symptoms that had not already been discussed in prior interviews. Symptoms identified during these qualitative interviews were included in a survey that was implemented in a national, online cross-sectional study of adults with PD. We determined the prevalence and average life impact score of each symptom and symptomatic theme assessed in the cross-sectional study. The average life impact score is a metric ranging from 0 to 4 that measures the relative importance of a symptom to participants. A population impact score was also calculated for each symptom and symptomatic theme, obtained by multiplying the prevalence by the average life impact score.
The range for the population impact score is also 0 to 4, with a value of 4 representing a symptom or symptomatic theme that affects all participants in the cross-sectional study cohort at the highest level.
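The population impact score described above can be illustrated with a short sketch. The symptom names and numeric values below are hypothetical and are not study data; the sketch also assumes that prevalence is expressed as a fraction between 0 and 1, which is what makes the product range from 0 to 4.

```python
from dataclasses import dataclass


@dataclass
class SymptomSummary:
    """Hypothetical per-symptom summary from a cross-sectional survey."""
    name: str
    prevalence: float        # assumed to be a fraction of respondents, 0.0 to 1.0
    mean_life_impact: float  # average life impact score, 0.0 to 4.0


def population_impact(symptom: SymptomSummary) -> float:
    """Population impact = prevalence x average life impact (range 0 to 4)."""
    if not 0.0 <= symptom.prevalence <= 1.0:
        raise ValueError("prevalence must be a fraction between 0 and 1")
    if not 0.0 <= symptom.mean_life_impact <= 4.0:
        raise ValueError("average life impact must be between 0 and 4")
    return symptom.prevalence * symptom.mean_life_impact


# Illustrative values only (not taken from the study).
examples = [
    SymptomSummary("fatigue", prevalence=0.90, mean_life_impact=2.0),
    SymptomSummary("rare symptom", prevalence=0.10, mean_life_impact=1.5),
]

# Rank symptoms from highest to lowest population impact.
for s in sorted(examples, key=population_impact, reverse=True):
    print(f"{s.name}: {population_impact(s):.2f}")
```

A symptom reported by every participant at the highest impact level reaches the maximum of 4, while a common but mild symptom and a rare but severe symptom can receive similar scores.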
Question selection and content validity
The present study identified the symptoms with the highest population impact scores for inclusion in the Parkinson's Disease-Health Index (PD-HI). Symptom questions were excluded if they were: 1) redundant; 2) vague; 3) not potentially responsive to therapeutic intervention; 4) potentially abrasive; or 5) lacking generalizability. In addition, symptom questions were excluded if the wording was considered above a 6th grade reading level. These decisions were guided by the cross-sectional study results and a research team consensus approach.
Initial internal consistency analysis
We used a research team consensus approach to group symptom questions into subscales representing distinct symptomatic themes of PD health. Cronbach's alpha scores were used to quantify the internal consistency of the full instrument and subscales. Item placement was evaluated using corrected item-total correlations. Symptom questions were deleted or moved to alternative subscales as appropriate to maximize the internal consistency of the subscales.
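For readers unfamiliar with these statistics, the following is a minimal sketch of Cronbach's alpha and the corrected item-total correlation using their standard textbook definitions. The function names and the response matrix are hypothetical; this is not the analysis code used in the study.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    columns = list(zip(*rows))
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def corrected_item_total(rows, item_index):
    """Correlate one item with the sum of the remaining items in the subscale."""
    item = [row[item_index] for row in rows]
    rest = [sum(row) - row[item_index] for row in rows]
    return pearson(item, rest)


# Hypothetical responses: 5 respondents x 3 items on a 0-4 scale.
responses = [
    [0, 1, 1],
    [1, 2, 2],
    [2, 3, 3],
    [3, 4, 4],
    [4, 3, 4],
]
print(round(cronbach_alpha(responses), 3))
print([round(corrected_item_total(responses, j), 3) for j in range(3)])
```

An item with a low corrected item-total correlation relative to its subscale would be a candidate for removal or for placement in a different subscale.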
Beta testing and participant interviews
We conducted beta interviews with adults with PD to obtain qualitative feedback regarding the content, relevance, and usability of the PD-HI. A sample size of 15 participants has been previously reported as sufficient for reaching content saturation in cognitive interviews for assessing the usability of a PRO measure.19–26 Eligible individuals affiliated with the University of Rochester Brain Health Registry were notified about the study via email. Participants were asked to complete the PD-HI, and then provide feedback about the instrument during an interview with a member of our research team. The PD-HI was administered remotely via Research Electronic Data Capture (REDCap), a Health Insurance Portability and Accountability Act (HIPAA)-compliant, secure electronic data capture software licensed by the University of Rochester. During these interviews, participants were asked to comment on the format of the PD-HI, the relevance and clarity of the symptom questions, and its overall usability. Participants also described their understanding of the symptomatic theme addressed in each of the subscales, provided feedback on the response options, and discussed the timeframe they referenced in responding to each symptom question. Lastly, participants identified any symptom questions of importance that were not included in the PD-HI and provided feedback regarding the wording and the placement of symptom questions within the instrument. All beta interviews were conducted remotely and were audio-recorded, transcribed, and analyzed by our research team. Participant feedback obtained through these interviews, in combination with a research team consensus approach, was used to make appropriate modifications to the instrument.
Scaling and scoring of the PD-HI
We developed a scoring algorithm for the PD-HI where subscales are scored from 0 to 100 with 0 representing no disease burden and 100 representing the maximum level of disease burden. Symptom questions within each subscale are weighted based on participant-reported prevalence and average impact scores as determined through the prior cross-sectional study. Subscale scores are also weighted to generate a total PD-HI score from 0 to 100, representing overall disease burden.
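The exact PD-HI scoring algorithm is not specified here, so the following is only an illustrative sketch of one way a weighted 0 to 100 score could be built. It assumes that each item weight is the product of prevalence and average impact, that responses are coded from 0 (no burden) to a common maximum, and that the total is a weighted mean of subscale scores. All names, weights, and response values are hypothetical.

```python
def subscale_score(responses, weights, max_response):
    """Weighted subscale score scaled to 0-100 (0 = no burden, 100 = maximum).

    responses: item responses coded 0..max_response (assumed coding)
    weights: one non-negative weight per item (assumed: prevalence x impact)
    """
    if len(responses) != len(weights):
        raise ValueError("responses and weights must have the same length")
    weighted_sum = sum(w * r for w, r in zip(weights, responses))
    return 100.0 * weighted_sum / (max_response * sum(weights))


def total_score(subscale_scores, subscale_weights):
    """Weighted mean of subscale scores, also on a 0-100 scale."""
    weight_sum = sum(subscale_weights[name] for name in subscale_scores)
    weighted = sum(subscale_weights[name] * score
                   for name, score in subscale_scores.items())
    return weighted / weight_sum


# Hypothetical example: two subscales, responses coded 0-5.
fatigue = subscale_score([3, 4, 2], weights=[1.8, 1.2, 0.6], max_response=5)
pain = subscale_score([1, 0], weights=[0.9, 0.4], max_response=5)
overall = total_score({"fatigue": fatigue, "pain": pain},
                      {"fatigue": 2.0, "pain": 1.0})
print(round(fatigue, 1), round(pain, 1), round(overall, 1))
```

Under this construction, items with higher weights move the subscale score more per response level than items with lower weights, in contrast to a one-point-per-question scheme.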
Test-retest reliability of the PD-HI
To assess the test-retest reliability of the instrument, participants completed the PD-HI at baseline and again 14 days later. Eligible individuals affiliated with the University of Rochester Brain Health Registry were notified about the study via email. The time interval of 14 days was selected to satisfy the following criteria regarding an instrument's ability to yield consistent and reproducible estimates of symptomatic burden: 1) the chosen time interval is long enough to minimize memory effects; and 2) the chosen time interval is short enough to minimize variability due to changes in disease status.5 This time interval (14 days) has been used in other research studies evaluating test-retest reliability of PD clinician-reported outcome measures or PROs.27,28 During the consent process, participants were encouraged to complete the surveys at the same time of day to minimize variability due to “on/off” states. Reliability of the symptom questions in the instrument was quantified using weighted kappa (WK) scores, while reliability of PD-HI total and subscale scores was examined using intraclass correlation coefficient (ICC) scores.
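As an illustration of the item-level statistic, the sketch below computes a weighted kappa for one ordinal item answered at two time points. The weighting scheme is an assumption, since the text does not state whether linear or quadratic weights were used; the function defaults to linear weights and exposes the choice as a parameter. The response data are hypothetical.

```python
def weighted_kappa(test, retest, n_levels, power=1):
    """Weighted kappa for two sets of ordinal ratings coded 0..n_levels-1.

    power=1 gives linear disagreement weights, power=2 gives quadratic weights.
    Raises ZeroDivisionError if all ratings fall in a single category.
    """
    n = len(test)
    observed = [[0.0] * n_levels for _ in range(n_levels)]
    for a, b in zip(test, retest):
        observed[a][b] += 1.0 / n

    row_marginals = [sum(observed[i]) for i in range(n_levels)]
    col_marginals = [sum(observed[i][j] for i in range(n_levels))
                     for j in range(n_levels)]

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_levels):
        for j in range(n_levels):
            weight = (abs(i - j) / (n_levels - 1)) ** power
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row_marginals[i] * col_marginals[j]
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical baseline and day-14 responses for one item (levels 0-4).
baseline = [0, 1, 2, 2, 3, 4, 1, 0]
day_14 = [0, 1, 2, 3, 3, 4, 2, 0]
print(round(weighted_kappa(baseline, day_14, n_levels=5), 3))
```

Weighted kappa gives partial credit when the two responses differ by only one level, which suits ordinal Likert-type items better than unweighted kappa.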
Cross-sectional analysis of PD-HI scores
We analyzed PD-HI total and subscale scores by demographic and clinical characteristics of the study cohort to evaluate the initial discriminatory performance of the PD-HI. Average PD-HI total and subscale scores were determined for predefined subgroups of the cross-sectional study cohort. PD-HI scores were grouped by age (above vs. equal to or below the mean), sex (male vs. female), employment status (on disability vs. not on disability), ambulatory status (uses assistive ambulatory device(s) vs. does not use assistive ambulatory device(s)), speech status (speaks clearly vs. experiences speech impairment), experiences gait freezing (yes vs. no), years since diagnosis (above vs. equal to or below the mean), duration of tremor (0–5 years vs. >5 years), and duration of ambulatory impairment (0–5 years vs. >5 years). Group comparisons were conducted using the Wilcoxon two-sample (rank sum) test with t-approximation two-sided p-values. The Benjamini-Hochberg procedure was applied to control for multiple comparisons. We used a false discovery rate of 0.05 and 135 test statistics. As outlined by this method, p-values were ranked from smallest to largest and the largest value of i such that p(i) ≤ 0.05 × i/135 was determined. The hypotheses associated with the p-values p(1), …, p(i) were rejected, resulting in i = 102 “discoveries”.
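The Benjamini-Hochberg step-up rule described above can be sketched as follows. The function name and the p-values are hypothetical; in the study the procedure was applied to 135 test statistics at a false discovery rate of 0.05.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Finds the largest rank i (1-based, p-values sorted ascending) such that
    p(i) <= fdr * i / m, then rejects the hypotheses ranked 1..i.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda idx: p_values[idx])

    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= fdr * rank / m:
            largest_rank = rank  # keep scanning: this is a step-up procedure

    rejected = [False] * m
    for idx in order[:largest_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for ten comparisons.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
decisions = benjamini_hochberg(p_vals, fdr=0.05)
print(sum(decisions), "discoveries out of", len(p_vals))
```

Because the rule searches for the largest qualifying rank, a p-value that misses its own threshold is still rejected when a larger p-value further down the ranking meets its threshold.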
Final internal consistency analysis
Upon completion of the final PD-HI, we conducted an internal consistency analysis and used Cronbach's alpha scores to quantify the internal consistency of the total instrument and each of the subscales. Symptom questions were deleted or moved to alternative subscales as appropriate to maximize the internal consistency of the subscales.
Standard protocol approvals, registrations, and patient consents
This study was approved by the University of Rochester Research Subjects Review Board in 2019 (STUDY00004139). We obtained verbal consent from all beta interview participants. We received a waiver of written consent for the cross-sectional and test-retest reliability studies such that participants’ acknowledgement of the study information letter and subsequent completion of the survey(s) implied consent.
Results
Semi-structured qualitative interviews and a national cross-sectional study
Twenty adults with PD participated in semi-structured qualitative interviews, providing 2,978 direct quotes regarding the symptomatic burden of PD. The subsequent cross-sectional study implemented a survey that included 315 symptom questions representing 14 symptomatic themes of PD health. Four hundred and four individuals with PD participated in the cross-sectional study. Cross-sectional study participants ranged in age from 22 to 89 years, were 55.2% female and 44.1% male, and represented 45 U.S. states. Participants provided over 125,000 survey item responses, which were used to identify the symptoms and symptomatic themes with the highest population impact scores for inclusion in the Parkinson's Disease-Health Index (PD-HI).
Question selection and content validity
Symptom questions with a population impact score greater than 0.25 were selected for inclusion in version 1.0 of the PD-HI. We removed 36 symptom questions due to a low population impact score (≤ 0.25). We removed 171 symptom questions and 1 symptomatic theme deemed by the research team to be redundant, vague, potentially not responsive to future therapeutic intervention, potentially abrasive, not generalizable, or written above a 6th grade reading level. We reworded 2 symptom questions for clarity. An overview of the process used to select, remove, and modify questions during each phase of the PD-HI development is provided in Figure 1.

Figure 1. Overview of the complete development process of the PD-HI, including symptom question selection, removal, and revision.
Initial internal consistency analysis
Version 1.0 of the PD-HI consisted of 108 symptom questions representing 13 symptomatic themes.
Beta testing and participant interviews
Fifteen individuals with PD participated in beta interviews lasting between 30–60 min. Overall, beta interview participants commented that the PD-HI Version 1.0 was easy to use and comprehensively covered the symptoms of importance that they experience in their daily lives. Participants indicated that they felt the PD-HI was capable of capturing how they feel and function. When participants were asked to identify the concept addressed by each subscale in the instrument, correct responses were consistently provided. Participants also provided input regarding the wording, placement, and inclusion of specific symptom questions. Following beta interview analysis, we implemented several modifications to the PD-HI to optimize the clarity, usability, and responsiveness of the instrument. We deleted 5 symptom questions that participants identified as vague, 3 symptom questions that participants identified as redundant, 3 symptom questions that participants indicated would be unlikely to respond to therapeutic intervention, and 1 symptom question that lacked generalizability. We reworded 5 symptom questions based on participant feedback to improve clarity and specificity. Lastly, 1 question was added back due to its high importance as indicated by participants. Upon completion of beta interview analysis and instrument modifications, Version 2.0 of the PD-HI was developed.
Test-retest reliability of the PD-HI version 2.0
Twenty individuals with PD participated in test-retest reliability testing of the PD-HI Version 2.0. The average WK value of all symptom questions in the PD-HI was 0.61 (95% confidence interval = 0.55–0.67; range = 0.21–0.92). Eighty-nine out of 97 symptom questions in the PD-HI displayed at least a satisfactory level of test-retest reliability over a 14-day period (WK value ≥ 0.4). Items with borderline WK values were reviewed by our study team and all were kept due to their high impact on patients' lives, their prevalence, and a reasonable likelihood of reflecting true changes in symptom burden over a two-week period (e.g., restless legs). PD-HI subscales demonstrated high reliability (ICC > 0.7), with the exception of Central Sensory Function, which is a one-item subscale (ICC = 0.44). ICC, smallest detectable change (SDC), and standard error of measurement (SEM) values for the full PD-HI instrument and subscales are provided in Table 2. PD-HI total scores at baseline and day 14 for n = 20 participants are plotted in Figure 2, and a Bland-Altman plot for PD-HI total score test-retest reliability is provided in Supplemental Figure 1. The average time to complete the PD-HI for n = 19 was 14.3 min with a median of 11 min (one data point was excluded from the time to complete analysis as an outlier). No changes to the PD-HI Version 2.0 were made as a result of test-retest reliability assessment.

Figure 2. Baseline and day 14 retest PD-HI total scores for the test-retest reliability cohort (n = 20), with a fitted linear trendline.
Table 2. Internal consistency and test-retest reliability of the PD-HI Version 3.0 and subscales.
SEM = SD × sqrt(1 − ICC); SDC = 1.96 × sqrt(2) × SEM
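As a worked illustration of the two formulas in the table footnote, the sketch below computes the standard error of measurement and the smallest detectable change. The ICC of 0.90 is the total-score value reported in the abstract; the standard deviation of 20 points is a hypothetical placeholder, since the observed value is not given in the text.

```python
from math import sqrt


def standard_error_of_measurement(sd, icc):
    """SEM = SD x sqrt(1 - ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must be between 0 and 1")
    return sd * sqrt(1.0 - icc)


def smallest_detectable_change(sem):
    """SDC = 1.96 x sqrt(2) x SEM (95% confidence level)."""
    return 1.96 * sqrt(2.0) * sem


# ICC from the abstract; SD is a hypothetical value for illustration only.
sem = standard_error_of_measurement(sd=20.0, icc=0.90)
sdc = smallest_detectable_change(sem)
print(f"SEM = {sem:.2f} points, SDC = {sdc:.2f} points")
```

A score change smaller than the SDC cannot be distinguished from measurement error at the individual level, so a higher ICC or a lower score variability both shrink the change that an instrument can detect.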
Internal consistency and instrument finalization
We performed internal consistency analysis to assess the PD-HI Version 2.0. This led to the deletion of one item to optimize the internal consistency of the PD-HI and subscales, resulting in the PD-HI Version 3.0. The PD-HI Version 3.0, or final version, contains one question and one set of Likert responses covering 96 items that represent 13 symptomatic themes (subscales). A 13-question short form was developed as a surrogate of the total instrument. The short form includes one representative question of each subscale in the instrument. Cronbach's alpha scores representing the internal consistency of the final PD-HI instrument, short form, and subscales are provided in Table 2.
Cross-sectional analysis of PD-HI version 3.0 scores
The full cohort distribution of total PD-HI scores from the cross-sectional study is provided in Supplemental Figure 2. Seventeen participants (4.21%) received the highest (100) or the lowest (0) possible total PD-HI score. Through subgroup analysis, we found significant differences in PD-HI total and subscale scores among predefined subgroups of the study cohort that differed in disease severity. Specifically, higher PD-HI total scores were observed in those on disability, those with speech impairment, those who require an assistive ambulatory device, those who experience gait freezing, those who were diagnosed above the mean number of years ago, those with a duration of ambulatory impairment greater than 5 years, and those with a duration of tremor greater than 5 years. Table 3 provides effect sizes and p-values for binary comparisons of PD-HI total, short form, and subscale scores across demographic and clinical subgroups. Binary comparisons of PD-HI total scores by demographic and clinical subgroups are also shown in Figure 3.

Figure 3. Mean PD-HI total scores (± standard error) across demographic and clinical subgroups of the cross-sectional study cohort; statistically significant differences for binary comparisons are indicated (p < 0.05).
Table 3. Cross-sectional subgroup analysis of PD-HI Version 3.0 scores.
Bolded values indicate Benjamini-Hochberg adjusted significant p-values.
Discussion
The Parkinson's Disease-Health Index (PD-HI) is a novel, multifactorial, disease-specific, patient-reported outcome measure (PRO) that was developed using large scale patient-reported data and designed to satisfy FDA guidelines for use as a primary, co-primary, or secondary endpoint in a therapeutic trial. The PD-HI can be administered on paper or electronically and consists of 13 subscales, each representing a distinct symptomatic theme of PD health. The instrument comprehensively assesses patient-reported disease burden in the following areas: fatigue, sleep and daytime sleepiness, mobility and ambulation, activity participation, central sensory function, pain, gastrointestinal function, hand and arm function, social health, emotional health, cognition, communication, and abnormal movements. With each completion of the instrument, a score from 0 (no disease burden) to 100 (maximum disease burden) is generated for each of these 13 subscales, as well as a PD-HI total score and short form score.
The PD-HI adds to existing clinical trial infrastructure for PD, as there are multiple disease-specific PROs that have been previously developed and utilized in PD clinical trials.8–14 Future studies have the option to implement one or multiple PROs based on which symptomatic domains need to be evaluated, the purpose of the study, and the strengths and limitations of the outcome measures. The PD-HI appears to address an unmet need in that it was designed from the ground up using a large-scale patient-centered approach involving participants in variable stages of their disease, adheres to FDA guidance regarding the development and validation of PROs for use in drug labeling claims, and was designed specifically to detect small but clinically important changes in patient health in response to therapeutic intervention. The PD-HI comprehensively assesses disease burden in many important symptomatic domains in PD and emphasizes non-motor symptoms such as fatigue and sleep impairment, which are not always addressed or are underrepresented in other PD outcome measures. Additionally, the PD-HI utilizes a scoring algorithm that weights symptoms and symptomatic themes according to their patient-identified importance as determined through the PRISM-PD study. As a matter of convenience, many prior PD outcome measures have utilized a one-to-one ratio of points per question, which limits their ability to prioritize the issues and items that are most important to individuals with PD.
The design and utility of the PD-HI offer potential benefit as a clinical trial tool for PD. The PD-HI can be completed by a patient in approximately 14 minutes without the need for professional administration. The short form version of the PD-HI includes one representative question from each of the subscales and can be completed in approximately 1 minute. This may be a useful alternative for longitudinal studies administering frequent assessments; however, short form scores are inherently less robust and may therefore demonstrate reduced sensitivity to change. Psychometric analysis revealed that the PD-HI has high internal consistency, high test-retest reliability, and limited ceiling and floor effects. Cross-sectional subgroup analysis of PD-HI scores suggests that the instrument is capable of distinguishing between individuals with PD with different levels of disease burden. For instance, the difference in PD-HI total score between those on disability vs. those not on disability is 14.2 points. This provides a rough estimate of what a score change might mean to a patient with PD; however, additional research to investigate this is warranted. The PD-HI is currently being implemented in multiple academic and industry longitudinal studies of PD, which will be used to inform its longitudinal responsiveness, correlation to other outcome measures, and ability to differentiate between subgroups of patients that differ in their disease burden.29–31 An important future step will be determining the responsiveness, convergent validity, and minimal clinically important difference of the PD-HI full form and subscales.
This research has limitations. While we utilized several recruitment sources throughout this research, our study cohort was not a perfect representation of the general PD population. Individuals without internet access were likely underrepresented in this study due to the remote administration of interviews and surveys, though we believe this increased access to the study and allowed for participation from a larger geographic range. Individuals with more severe disease or those with cognitive impairment were likely underrepresented in this study due to their disease interfering with their ability to participate in an interview or complete a survey. Another limitation stems from the fact that our study cohort was primarily white, possibly limiting relevance and generalizability of the PD-HI to individuals with PD of other races. Additional research to evaluate the PD-HI in diverse study cohorts will be imperative in order to fully understand the performance and usability of this tool in future research and clinical settings. Specifically, future work may investigate the cultural relevance of the PD-HI in select patient populations or translate the PD-HI into languages other than English. Lastly, participants in this study self-reported their diagnosis. While a detailed review of participants' medical records was beyond the scope of this research, we partnered with PD foundations and registries that manage robust databases designed to support clinical research recruitment. We acknowledge the possibility that a small number of participants in our study cohort may have inaccurately reported their diagnosis, including individuals with Parkinson-plus syndromes, non-PD-related tremors, or other nervous system disorders.
Overall, this research provides initial evidence of the content validity, construct validity, test-retest reliability, and usability of the PD-HI as a disease-specific PRO for PD clinical trials and clinical care. The PD-HI provides researchers and clinicians with a comprehensive mechanism to quantify clinically meaningful changes in health and disease status using the direct input of the patient. The PD-HI supplements existing clinical trial infrastructure and is a novel tool to facilitate patient-centered therapeutic assessment in PD.
Footnotes
Acknowledgements
This research was conducted in partnership with the Davis Phinney Foundation, The Michael J. Fox Foundation, the Parkinson & Movement Disorder Alliance, and the Parkinson's Support Groups of the Finger Lakes.
Ethical considerations
This study was approved by the University of Rochester Research Subjects Review Board in 2019 (STUDY00004139); Principal Investigator Chad Heatwole.
Consent to participate
We obtained verbal consent from all beta interview participants. We received a waiver of written consent for the cross-sectional and test-retest reliability studies such that participants’ acknowledgement of the study information letter and subsequent completion of the survey(s) implied consent.
Funding
This study was funded by the University of Rochester Center for Health + Technology.
Declaration of conflicting interests
Data availability statement
The data supporting the findings of this study are available within the article and/or its supplementary material. Additional data supporting the findings of this study are available on request to the corresponding author. The PD-HI and its scoring algorithm are owned by the University of Rochester. Requests for access, licensing, or further information can be directed to HealthIndexes@chet.rochester.edu, or visit HealthIndexes.com for additional information.
Supplemental material
Supplemental material for this article is available online.
