Abstract
Background:
Parkinson’s Disease (PD) is characterized by motor and non-motor symptoms that can overlap with other movement disorders, complicating accurate diagnosis and monitoring. Wearable technologies, such as smartwatches, offer continuous and objective assessment of motor function, but their clinical utility in multiclass classification and symptom prediction remains underexplored. This study aimed to determine whether smartwatch-derived motor features can distinguish idiopathic PD from other movement disorders and whether motor variability is associated with non-motor symptom burden in PD.
Methods:
We analyzed data from the Parkinson’s Disease Smartwatch (PADS) dataset (N = 469), which includes accelerometer and gyroscope signals recorded during 20 standardized motor tasks. For each participant, mean and standard deviation values for each axis were averaged across tasks. Diagnostic group classification was assessed using multinomial logistic regression. Among individuals with idiopathic PD (n = 276), linear regression evaluated associations between motor variability and total non-motor symptom scores from a 30-item questionnaire.
Results:
Motor variability features, particularly accelerometer Y-axis and gyroscope X-axis standard deviations, significantly differentiated diagnostic groups (pseudo R2 = 0.068). Age, sex, and handedness also contributed. In the PD subgroup, higher accelerometer X and gyroscope X variability were associated with greater non-motor symptom burden, while greater stability in the mediolateral (Y) axis was linked to fewer symptoms (adjusted R2 = 0.0081, p < 0.001).
Conclusion:
Smartwatch-derived motor variability features can modestly differentiate movement disorder diagnoses and are associated with non-motor symptom severity in PD. Our findings support the complementary use of wearable sensors in clinical assessment and remote monitoring. Our findings also lay the foundation for future integration of wearable-derived data into telemedicine workflows.
1 Introduction
Parkinson’s Disease (PD) is a progressive neurodegenerative disorder characterized by both motor and non-motor symptoms [1]. Among the earliest and most functionally disruptive symptoms are motor impairments, including tremor, bradykinesia, and postural instability. These symptoms interfere with basic activities such as walking, dressing, and writing [2]. In addition to motor symptoms, individuals with PD often experience cognitive decline, sleep disturbances, mood changes, urinary dysfunction, and gastrointestinal issues [3]. These non-motor symptoms may precede the motor symptoms and are often underrecognized, despite having a profound impact on quality of life [4]. As the disease progresses, both motor and non-motor symptoms become more severe and complex, underscoring the need for reliable, accessible, and continuous assessment tools that can support early detection and long-term monitoring [5].
Traditional clinical assessments of motor symptoms in PD rely on clinician-administered scales such as the Unified Parkinson’s Disease Rating Scale (UPDRS), which are subjective and require in-person visits. This approach may overlook subtle changes in motor performance that occur between appointments or during daily activities. In recent years, wearable sensor technologies, particularly smartwatches, have emerged as a promising method to detect and monitor motor symptoms in real time [6]. These devices can capture high-resolution data on limb movement during routine tasks, enabling more objective and frequent monitoring of symptom patterns. Despite the growing interest in wearables for PD, many studies have focused on single-wrist sensors, single-task measures, or isolated symptom detection, limiting their utility for broader clinical application [7, 8].
The Parkinson’s Disease Smartwatch (PADS) dataset offers a unique opportunity to address these gaps by combining bilateral smartwatch recordings with structured motor tasks and detailed clinical and demographic information [9]. The dataset includes individuals with idiopathic PD, essential tremor, atypical Parkinsonism, and healthy controls, allowing for a more comprehensive analysis of diagnostic differentiation. In addition, the inclusion of a structured non-motor symptom questionnaire enables researchers to investigate the relationship between motor patterns and non-motor symptom burden in PD, a connection that remains underexplored in the current literature [9].
While some studies have used wearable data to detect or classify PD, few have examined how bilateral motor features relate to the full range of Parkinsonian syndromes. Compared to previous wearable-based studies that focused primarily on binary classification or narrow symptom targets [6], our analysis offers a broader diagnostic scope and links wearable features to non-motor symptoms. Even fewer have explored the connection between motor asymmetry, one of the distinguishing features of PD, and the severity of non-motor symptoms [10]. While prior studies have explored wearable-based classification of PD, performance metrics such as accuracy, sensitivity, and specificity are often lacking [11, 12]. Although our study did not include such metrics, future work should evaluate predictive performance and clinical thresholds for real-world application. Moreover, given the modest explanatory power observed in our regression models, sensor-derived features should be considered complementary to traditional clinical and patient-reported data rather than standalone predictors.
Considering the limitations of existing assessment methods and the potential for wearable sensors to support earlier and more precise identification of disease characteristics, this study leverages the PADS dataset to explore two important research questions. The primary purpose of our study was to determine whether bilateral smartwatch-derived motor features can distinguish individuals with idiopathic PD from those with essential tremor (ET), atypical Parkinsonism, multiple sclerosis (MS), other movement disorders, and healthy controls. The secondary purpose was to assess whether smartwatch-derived motor asymmetry and instability features are associated with non-motor symptom burden among individuals with PD.
2 Methods
This study utilized an existing dataset and was deemed exempt by the Institutional Review Board at the University of Jamestown.
This study utilized data from the PADS dataset [9], a publicly available repository of wearable sensor data and clinical assessments. The sample included 469 participants assigned to one of six diagnostic categories: healthy controls (n = 79), other movement disorders (n = 60), idiopathic Parkinson’s disease (PD; n = 276), atypical Parkinsonism (n = 15), MS (n = 11), and ET (n = 28). Each participant completed motor tasks while wearing a smartwatch, and their demographic data, such as age, sex, and handedness, were recorded. An additional variable related to alcohol effects on symptoms was available but excluded from analysis due to its limited relevance to the study purposes.
2.1 Motor Task Protocol and Sensor Data Acquisition
All participants performed a standardized protocol consisting of 20 motor tasks designed to capture a range of voluntary and involuntary movements. These tasks were conducted while the participants wore a smartwatch on their left wrist. The smartwatch contained two primary motion sensors: a tri-axial accelerometer and a tri-axial gyroscope. The accelerometer measured linear acceleration in the X, Y, and Z directions (in m/s2), while the gyroscope recorded angular velocity in the same axes (in degrees/s). The time-series data were saved as plain text files for each participant and task, and included raw signal values recorded continuously over the course of each motor task. Sensor sampling rates were not explicitly reported in the PADS dataset documentation but are presumed to follow standard smartwatch acquisition settings. Although the raw smartwatch sensor data were used without explicit filtering, future studies should consider applying noise-reduction techniques (e.g., wavelet filters) to improve signal reliability.
2.2 Preprocessing and Feature Extraction
The original time-series files did not include header rows, but based on the known structure of the sensor outputs, the first column in each file was presumed to represent time and was excluded from analysis. The next six columns were retained and interpreted as accelerometer (X, Y, Z) and gyroscope (X, Y, Z) data. For each of the 20 tasks per participant, Python 3.11 (Python Software Foundation) was used to calculate the mean and standard deviation for each of the six signal axes. These values were then averaged across tasks, yielding 12 features per participant: mean and standard deviation for each accelerometer and gyroscope axis. This preprocessing step allowed for the derivation of stable sensor-derived metrics that reflect both average movement and variability in movement across different motor tasks.
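The feature-extraction step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function name `extract_features`, the axis labels, and the synthetic recordings are hypothetical, and the text does not say whether the sample or population standard deviation was used.

```python
import numpy as np

AXES = ["acc_x", "acc_y", "acc_z", "gyr_x", "gyr_y", "gyr_z"]  # hypothetical labels

def extract_features(task_recordings):
    """Return 12 features for one participant.

    task_recordings: list of 2-D arrays, one per task, each holding a time
    column followed by six signal columns (accelerometer X/Y/Z, gyroscope X/Y/Z).
    """
    per_task_mean, per_task_sd = [], []
    for rec in task_recordings:
        signals = np.asarray(rec, dtype=float)[:, 1:7]   # drop the presumed time column
        per_task_mean.append(signals.mean(axis=0))
        per_task_sd.append(signals.std(axis=0, ddof=1))  # sample SD (assumption)
    # Average the per-task statistics across tasks.
    mean_feats = np.mean(per_task_mean, axis=0)
    sd_feats = np.mean(per_task_sd, axis=0)
    features = {f"{a}_mean": float(m) for a, m in zip(AXES, mean_feats)}
    features.update({f"{a}_sd": float(s) for a, s in zip(AXES, sd_feats)})
    return features

# Synthetic stand-in for 20 task files; real files could be read with np.loadtxt.
rng = np.random.default_rng(0)
tasks = [np.column_stack([np.arange(500), rng.normal(size=(500, 6))])
         for _ in range(20)]
features = extract_features(tasks)
print(len(features))  # 12
```

Averaging per-task statistics, rather than pooling all samples before computing them, keeps each task's contribution equal regardless of its recording length.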
2.3 Questionnaire-Based Non-Motor Symptom Assessment
In addition to motor data, all participants completed a structured 30-item questionnaire evaluating non-motor symptoms commonly associated with PD [7]. Each item was scored using a numerical scale, and scores were summed to create a non-motor symptom total score, with higher scores indicating greater symptom burden. This variable was computed only for participants diagnosed with idiopathic PD, in order to investigate within-group variation.
2.4 Statistical Analyses
Descriptive statistics were first calculated for all variables, including measures of central tendency (means and medians) and dispersion (standard deviations and interquartile ranges), to summarize demographic characteristics and sensor-derived features across diagnostic groups. The analyses were designed to: (1) determine whether wearable sensor-derived motor features and demographic characteristics can differentiate between diagnostic groups, and (2) evaluate whether motor variability is associated with non-motor symptom burden within individuals diagnosed with idiopathic PD. All analyses were conducted using Stata 18 (StataCorp LLC, Stata statistical software: release 18. College Station, TX: StataCorp LLC. 2023). The alpha level for statistical significance was set at 0.05 across all models, and 95% confidence intervals were reported alongside regression coefficients to facilitate interpretation.
2.5 Diagnostic Classification Analyses
Bivariate analyses were performed to explore the unadjusted association between each predictor and the diagnostic group. For continuous sensor-derived features and age, the Kruskal-Wallis test was used due to non-normal distributions. Categorical predictors (gender and handedness) were analyzed using chi-square tests. A multinomial logistic regression model was utilized. This model is appropriate when the outcome variable is categorical with more than two unordered groups, in this case, diagnostic classification (healthy controls, other movement disorders, idiopathic PD, atypical Parkinsonism, MS, and ET). The dependent variable was the diagnostic group, with healthy controls (coded as 0) set as the reference category.
Predictors were selected based on their theoretical relevance to disease-related motor control characteristics, specifically, average movement patterns and signal variability, and were retained in the multivariable model if they showed at least marginal significance (p < 0.10) in bivariate analyses [7]. The independent variables included the mean and standard deviation values of tri-axial accelerometer and gyroscope data (six means and six standard deviations), as well as age (continuous), gender (binary: 0 = female, 1 = male), and handedness (binary: 0 = right-handed, 1 = left-handed). These predictors were chosen to capture both average movement patterns and variability, as fluctuations in motor signals have been suggested to reflect disease-specific motor control characteristics.
The model allowed us to estimate the log-odds of belonging to each clinical condition relative to the healthy control group as a function of these variables. Model fit was assessed via likelihood ratio chi-square tests and pseudo R2. Pseudo R2 values in logistic regression models do not represent the proportion of variance explained in the same way as in linear regression. Values below 0.1 are typically interpreted as modest explanatory power [13], while values above 0.2 are considered indicative of relatively good model fit, especially for complex categorical outcomes such as diagnostic classifications. Coefficients were interpreted in terms of their statistical significance and directionality.
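For reference, the baseline-category model and the fit statistics described above can be written as follows. The notation is illustrative, and we assume the reported pseudo R2 is McFadden's measure, which is the one Stata reports for multinomial logistic regression. Here k = 0 denotes healthy controls, x is the predictor vector, and L_M and L_0 are the likelihoods of the fitted and intercept-only models.

```latex
\ln\frac{P(Y = k \mid \mathbf{x})}{P(Y = 0 \mid \mathbf{x})}
  = \beta_{k0} + \mathbf{x}^{\top}\boldsymbol{\beta}_{k},
  \qquad k = 1, \dots, 5

\mathrm{LR}\ \chi^{2} = 2\left(\ln L_{M} - \ln L_{0}\right),
  \qquad
  R^{2}_{\text{pseudo}} = 1 - \frac{\ln L_{M}}{\ln L_{0}}
```

Under this formulation, a model with 15 predictors and five non-reference categories would have 5 × 15 = 75 slope parameters, which is the number of degrees of freedom of the likelihood ratio test against the intercept-only model.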
To assess multicollinearity among the predictors used in the multinomial logistic regression model, variance inflation factors (VIFs) were calculated using an auxiliary ordinary least squares (OLS) regression that included the same independent variables [14]. All continuous sensor-derived variables (means and standard deviations of tri-axial accelerometer and gyroscope data), as well as demographic covariates (age, gender, and handedness), were included. VIFs below 10 were considered acceptable, with higher values indicating potential multicollinearity.
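The VIF computation can be sketched as below. This is a minimal NumPy illustration rather than the Stata procedure used in the study; the function name and the synthetic predictors are hypothetical. Each predictor is regressed on all the others by OLS, and VIF_j = 1 / (1 − R_j²).

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF for each column of X, from an auxiliary OLS regression of that
    column on all remaining columns plus an intercept."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ coef
        r2 = 1.0 - resid.var() / y.var()  # R^2 of the auxiliary regression
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs

# Synthetic predictors: c is built to be strongly correlated with a.
rng = np.random.default_rng(1)
a = rng.normal(size=1000)
b = rng.normal(size=1000)
c = a + 0.2 * rng.normal(size=1000)
vifs = variance_inflation_factors(np.column_stack([a, b, c]))
flagged = vifs >= 10  # threshold used in the text
print(np.round(vifs, 1), flagged)
```

Because VIFs depend only on the predictors, they do not change with the outcome model, which is why an auxiliary OLS regression can be used alongside a multinomial model.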
2.6 Non-Motor Symptom Prediction Analyses
Bivariate analyses were performed to explore the unadjusted association between each predictor and the total non-motor symptom score within the idiopathic PD subgroup. For continuous variables, including the standard deviations of tri-axial accelerometer and gyroscope signals, and age, simple linear regressions were used. For categorical predictors (gender and handedness), separate regressions using binary coding were applied. Predictors with a significant or marginal association at p < 0.10 were retained for inclusion in the multivariable model. The multiple linear regression model was then used to estimate the independent associations between motor variability and non-motor symptom severity, adjusting for age, gender, and handedness. Model fit was evaluated using the F-statistic, R2, and adjusted R2.
Multiple linear regression analysis was conducted within the subgroup of participants diagnosed with idiopathic PD (n = 276). The dependent variable in this model was the total non-motor symptom score, calculated as the sum of the 30 questionnaire items. The primary predictors of interest were the standard deviations of the six sensor signals (three from the accelerometer and three from the gyroscope), which represent motor variability. These features were initially selected based on prior evidence suggesting that irregularities in motor patterns are linked to both motor and non-motor symptom domains in PD, and retained in the multivariable model if they met a significance threshold of p < 0.10 in bivariate analyses. Age, sex, and handedness were included as covariates. Adjusted R2 values were interpreted using conventional guidelines, where values between 0.01 and 0.09 reflect small or modest effects, 0.10 to 0.25 moderate effects, and values above 0.26 indicate large effects [15].
To assess multicollinearity among the predictors used in the multiple linear regression model, variance inflation factors (VIFs) were calculated following the estimation of the full model. All continuous sensor-derived variables (standard deviations of tri-axial accelerometer and gyroscope data), as well as demographic covariates (age, gender, and handedness), were included. VIFs below 10 were considered acceptable, with higher values indicating potential multicollinearity. In addition, model assumptions were evaluated. Residual normality was assessed using histogram and Q-Q plots; homoscedasticity was evaluated with a residual-versus-fitted values plot; and linearity of predictor relationships was examined using added-variable plots.
3 Results
3.1 Characteristics of Participants
The final dataset included a total of 469 unique participants, each performing 20 motor tasks. Based on their clinical status, participants were categorized into six groups: healthy controls (n = 79), other movement disorders (n = 60), idiopathic Parkinson’s Disease (PD; n = 276), atypical Parkinsonism (n = 15), multiple sclerosis (MS; n = 11), and essential tremor (ET; n = 28) (Table 1).
Table 1. Participant Characteristics by Diagnostic Group
Participant demographics and questionnaire data were merged with extracted motor features to generate the final analytic sample. Across all tasks performed by the 469 participants (totaling 6072 task observations), the mean age was 65.36 years (SD = 9.62). The sample was 70.7% male and 94.6% right-handed. The average tri-axial accelerometer signals were close to zero, reflecting baseline movement distributions: accelerometer X-axis mean = −0.003 (SD = 0.064), accelerometer Y-axis mean = 0.007 (SD = 0.037), and accelerometer Z-axis mean = 0.006 (SD = 0.037). Mean gyroscope readings were also near zero: gyroscope X-axis mean = −0.001 (SD = 0.049), gyroscope Y-axis mean = −0.002 (SD = 0.113), and gyroscope Z-axis mean = 0.0003 (SD = 0.090).
In terms of motor variability, standard deviations of accelerometer signals ranged from 0.065 (SD = 0.081) on the X-axis to 0.091 (SD = 0.127) on the Z-axis. Gyroscope variability showed a wider range: gyroscope X-axis standard deviation = 0.476 (SD = 0.685), gyroscope Y-axis standard deviation = 0.456 (SD = 0.589), and gyroscope Z-axis standard deviation = 0.560 (SD = 0.805) (Table 2). The mean total non-motor symptom score among idiopathic PD observations was 9.93 (SD = 5.22), with scores ranging from 0 to 24.
Table 2. Descriptive Statistics for Study Variables (N = 6,072 task observations)
3.2 Bivariate Associations Between Predictors and Disease Classification
Bivariate analyses revealed significant differences across diagnostic groups for multiple variables. Among the continuous predictors, accelerometer Y-axis mean, accelerometer Z-axis mean, and gyroscope X-axis mean showed significant variation across groups (p = 0.0124, 0.0054, and 0.0075, respectively), whereas accelerometer X-axis mean, gyroscope Y-axis mean, and gyroscope Z-axis mean did not. All six standard deviation features (accelerometer X-axis standard deviation, accelerometer Y-axis standard deviation, accelerometer Z-axis standard deviation, gyroscope X-axis standard deviation, gyroscope Y-axis standard deviation, gyroscope Z-axis standard deviation) differed significantly by group (p < 0.0001). Age also showed a significant group difference (p < 0.0001). Both gender and handedness were significantly associated with diagnostic classification (chi-square p < 0.001). These findings guided the selection of predictors for the full regression model.
3.3 Multinomial Logistic Regression Analysis
The multinomial logistic regression model examining whether motor features and demographics differentiated participants by diagnostic group was statistically significant overall (LR χ2(75) = 1740.33, p < 0.001), with a pseudo R2 of 0.068, indicating modest explanatory power. All variables had VIFs below the commonly used threshold of 10, suggesting no evidence of severe multicollinearity. Across comparisons, standard deviations (variability) of sensor signals emerged as more robust predictors than mean signal values.
Participants classified with other movement disorders were more likely than healthy controls to show higher variability in accelerometer Y-axis (β = 4.35, p < 0.001) and gyroscope X-axis (β = 0.38, p < 0.001), and higher gyroscope X-axis mean values (β = 4.30, p < 0.001). Conversely, lower variability in accelerometer Z-axis (β = −4.08, p < 0.001) and gyroscope Z-axis (β = −0.37, p = 0.002) was associated with this group. Younger age (β = −0.015, p < 0.001) and male sex (β = 0.19, p = 0.017) were also linked to higher odds of classification in this group.
For individuals with idiopathic PD, key positive predictors included greater variability in accelerometer Y-axis (β = 2.99, p < 0.001) and gyroscope X-axis (β = 0.30, p = 0.001), as well as higher gyroscope X-axis mean values (β = 1.29, p = 0.03). Accelerometer Y-axis mean was negatively associated with PD classification (β = −2.34, p = 0.011). Demographically, older age (β = 0.012, p < 0.001), male sex (β = 1.38, p < 0.001), and right-handedness (β = −0.94, p < 0.001) increased the odds of idiopathic PD classification.
Participants with atypical Parkinsonism were strongly differentiated from healthy controls by a marked reduction in accelerometer X-axis standard deviation (β = −12.47, p < 0.001). Male sex (β = 0.58, p < 0.001), older age (β = 0.023, p < 0.001), and right-handedness (β = −0.71, p < 0.001) also increased the odds of classification into this group.
The MS group showed higher gyroscope X-axis mean (β = 3.96, p = 0.004) and increased accelerometer X-axis standard deviation (β = 6.69, p = 0.003). Accelerometer Y-axis mean (β = −5.17, p = 0.027) and accelerometer Z-axis standard deviation (β = −6.05, p = 0.002) were negatively associated with MS classification. Males had increased odds (β = 1.58, p < 0.001), whereas older participants had decreased odds (β = −0.10, p < 0.001) of MS classification.
For ET, higher variability in accelerometer Y-axis (β = 7.69, p < 0.001) and gyroscope X-axis (β = 0.67, p < 0.001) was positively associated with disease classification. In contrast, accelerometer Y-axis mean (β = −4.28, p = 0.001), gyroscope Z-axis mean (β = −1.26, p < 0.001), and accelerometer Z-axis standard deviation (β = −3.15, p = 0.003) were significant negative predictors. Being male (β = 1.09, p < 0.001), older (β = 0.022, p < 0.001), and right-handed (β = −1.40, p < 0.001) also increased the odds of ET classification.
In summary, sensor-derived motor variability measures, especially from the accelerometer Y-axis and gyroscope X-axis, were the most consistent predictors across diagnostic groups (Table 3).
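To aid interpretation, the coefficients can be exponentiated. In a multinomial logistic model, exp(β) is the relative risk ratio for a one-unit increase in the predictor, relative to healthy controls and holding the other predictors constant. The short sketch below applies this to the demographic coefficients reported for idiopathic PD; the variable names are illustrative, and handedness is treated as coded 1 = left-handed, as described in the Methods.

```python
import math

# Coefficients reported in the text for idiopathic PD vs. healthy controls.
pd_coefficients = {
    "age (per year)": 0.012,
    "male sex": 1.38,
    "left-handedness": -0.94,
}

# exp(beta) gives the relative risk ratio for a one-unit change.
relative_risk_ratios = {name: math.exp(beta) for name, beta in pd_coefficients.items()}
for name, rrr in relative_risk_ratios.items():
    print(f"{name}: exp(beta) = {rrr:.2f}")
# age (per year): exp(beta) = 1.01
# male sex: exp(beta) = 3.97
# left-handedness: exp(beta) = 0.39
```

The same transformation is less informative for the sensor variability features, because a one-unit change is large relative to the observed standard deviations of those signals; rescaling such predictors before exponentiating may be preferable.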
Table 3. Multinomial Logistic Regression Summary Table
3.4 Bivariate Associations Between Predictors and Non-Motor Symptom Severity
In the bivariate analyses, greater variability in the mediolateral (accelerometer Y-axis standard deviation, p = 0.005) and vertical (accelerometer Z-axis standard deviation, p = 0.013) accelerometer axes, as well as older age (p < 0.001) and male gender (p < 0.001), were significantly associated with higher non-motor symptom scores. In the multivariable linear regression model, which included all predictors with a p-value < 0.10 in the unadjusted analyses, the overall model was statistically significant (F(9, 6062) = 6.48, p < 0.001) and explained 0.95% of the variance in non-motor symptom severity (R2 = 0.0095; adjusted R2 = 0.0081). Independent predictors included greater variability in the anteroposterior accelerometer axis (accelerometer X-axis standard deviation: β = 8.10, p < 0.001), greater gyroscope X-axis variability (β = 0.40, p = 0.025), reduced variability in the mediolateral axis (accelerometer Y-axis standard deviation: β = −4.05, p = 0.007), and older age (β = 0.03, p < 0.001). Male gender remained associated with lower non-motor symptom scores (β = −0.65, p < 0.001). No significant associations were found for gyroscope Y-axis or Z-axis variability or for handedness in the adjusted model.
3.5 Multiple Linear Regression
The multiple linear regression model was statistically significant (F(9, 6062) = 6.48, p < 0.001), though the proportion of variance explained was modest (R2 = 0.0095; adjusted R2 = 0.0081), indicating that the predictors together account for only a small portion of the variation in non-motor symptoms. Diagnostic checks indicated that the assumptions of linear regression were adequately met.
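As a consistency check, R2 and adjusted R2 can be recovered from the reported F-statistic and its degrees of freedom. The sketch below uses the standard OLS identities and assumes a model with an intercept; the function names are illustrative.

```python
def r2_from_f(f_stat, df_model, df_resid):
    """R^2 implied by the overall F-test of an OLS model with an intercept."""
    return (f_stat * df_model) / (f_stat * df_model + df_resid)

def adjusted_r2(r2, df_model, df_resid):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    n_minus_1 = df_model + df_resid  # n - 1 = p + (n - p - 1)
    return 1.0 - (1.0 - r2) * n_minus_1 / df_resid

r2 = r2_from_f(6.48, 9, 6062)
adj = adjusted_r2(r2, 9, 6062)
print(round(r2, 4), round(adj, 4))  # 0.0095 0.0081
```

The values agree with those reported, so the F-statistic, R2, and adjusted R2 are mutually consistent.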
Greater accelerometer X-axis standard deviation (β = 8.10, p < 0.001) and gyroscope X-axis standard deviation (β = 0.40, p = 0.025) were positively associated with non-motor symptom scores. In contrast, accelerometer Y-axis standard deviation (β = −4.05, p = 0.007) was negatively associated with symptom severity, and a trend was observed for accelerometer Z-axis standard deviation (β = −2.98, p = 0.061). Age (β = 0.029, p < 0.001) was positively associated with non-motor symptom severity, while being male (β = −0.65, p < 0.001) was linked to lower symptom scores. Variability in gyroscope Y-axis and Z-axis, as well as handedness, was not statistically significant in this model. All predictors were assessed for multicollinearity using VIF, which ranged from 1.01 to 9.28 and indicated no serious collinearity concerns.
4 Discussion
This study examined whether sensor-derived features collected via wearable devices during standardized motor tasks can differentiate between diagnostic groups and are associated with non-motor symptom burden in individuals with idiopathic PD. Using data from 469 participants performing multiple tasks, we applied multinomial logistic regression to assess differentiation across six diagnostic categories and linear regression to examine associations with non-motor symptom severity in PD. Our findings highlight the clinical promise of wearable sensor data for aiding diagnostic differentiation and understanding symptom heterogeneity. Compared to previous wearable-based studies that focused primarily on binary classification or narrow symptom targets [6, 11], our analysis offers a broader diagnostic scope and links wearable features to non-motor symptoms.
4.1 Clinical Implications of Sensor-Based Classification
Our multinomial logistic regression results demonstrate that variability in accelerometer and gyroscope signals, particularly along specific axes, is significantly associated with diagnostic classification. Features such as standard deviation in accelerometer Y and Z axes, and gyroscope X and Z axes, as well as mean values from gyroscope X, were consistently significant predictors for differentiating conditions such as PD, ET, MS, and other movement disorders. These results support earlier work showing that motor signal variability reflects disease-specific movement signatures, such as tremor frequency, rigidity, or bradykinesia patterns unique to PD versus ET or MS [1, 3]. For instance, accelerometer Y-axis variability (mediolateral sway) and gyroscope X-axis variability (rotational wrist movement) may be particularly sensitive to distinguishing tremor and bradykinetic patterns that are common in PD and ET. This biomechanical interpretation provides a plausible explanation for their consistent predictive value across groups.
Importantly, we found that certain predictors, such as higher gyroscope X variability or accelerometer Y variability, were more strongly associated with classification into PD and ET, which are often difficult to distinguish clinically in early stages [4]. Although the explained variance was modest (pseudo R2 = 0.068), this is not unusual for classification problems involving heterogeneous clinical groups and complex behavioral data, such as wearable sensor signals. Thus, wearable-derived signal variability may support clinical decision-making as an adjunct tool, rather than a primary diagnostic modality. Moreover, male sex was associated with higher odds of classification into every disease group, and older age with higher odds of PD, atypical Parkinsonism, and ET, aligning with known epidemiological trends in PD and essential tremor [6, 7]. These findings highlight the clinical potential of wearable sensor data in differentiating movement disorders beyond idiopathic PD.
4.2 Motor Variability and Non-Motor Symptom Burden
In the subsample of PD patients, we found a significant relationship between motor variability and non-motor symptom severity. Standard deviations of the accelerometer X and gyroscope X signals were positively associated with non-motor symptom scores, suggesting that greater motor fluctuation is linked to more pronounced non-motor burden. Conversely, greater stability in accelerometer Y (i.e., lower variability) was associated with fewer non-motor symptoms.
These findings align with prior evidence suggesting that motor and non-motor domains in PD are not entirely independent but may share common neurophysiological underpinnings, such as dopaminergic and non-dopaminergic system involvement [9, 16]. The association between motor instability and symptom severity also supports the potential for remote, passive monitoring to assess both motor and non-motor progression, which remains a challenge in clinical practice due to fluctuating daily symptom patterns [2].
4.3 Relevance to Clinical Practice and Future Applications
The ability to distinguish diagnostic groups based on wearable sensor data has significant implications for clinical neurology. Traditional clinical diagnosis relies heavily on expert observation, which can be subjective and prone to inter-rater variability [5]. Wearable technologies can augment diagnostic confidence, reduce misdiagnosis, and potentially shorten the diagnostic delay, particularly in resource-limited or telemedicine settings [8]. Our findings also support the integration of sensor-based monitoring into long-term care models, where fluctuations in motor stability can indicate disease progression or response to treatment. While our models captured statistically significant associations, their explanatory power was limited, and classification performance metrics (e.g., accuracy, sensitivity) were not evaluated. These should be prioritized in future studies to assess real-world diagnostic potential. Exploring machine learning classifiers may also improve predictive accuracy, as they can model non-linear relationships and higher-order feature interactions beyond what traditional regressions can capture.
Furthermore, the associations between motor features and non-motor symptom burden suggest that clinicians may benefit from incorporating sensor-derived metrics into broader patient assessment strategies. Given the heterogeneity of PD, a multimodal monitoring strategy, including wearable data, symptom scales, and biomarkers, may provide the most accurate picture of disease progression. While sensor technologies are not yet widely adopted in routine neurology clinics, growing evidence, including our results, highlights their value as adjuncts to improve clinical accuracy and personalize care.
4.4 Integration into Telemedicine
Wearable-derived motor variability features could be embedded into remote monitoring dashboards to generate alerts for symptom worsening or medication effects. This aligns with emerging telemedicine models, particularly for patients with limited access to in-person neurologic care.
4.5 Limitations and Future Directions
Despite these promising findings, several limitations must be acknowledged. First, while we standardized task performance and sensor positioning, variation in compliance or environmental factors may introduce noise into the signal. Second, while standardization improves comparability, it may reduce ecological validity. Lab-based tasks may not fully reflect real-world motor fluctuations or context-specific symptom patterns. Third, the models explain only a modest amount of variance, particularly in the non-motor domain, indicating that sensor features alone may be insufficient for comprehensive prediction and should be combined with clinical data. Fourth, demographic imbalance in the sample, 70.7% male and 94.6% right-handed, may limit generalizability to more diverse patient populations, such as women and left-handed individuals. The lack of validation metrics such as sensitivity, specificity, or classification accuracy further limits the ability to assess clinical utility. These metrics (e.g., ROC curves, AUC, or confusion matrices) are essential for evaluating real-world predictive performance and should be prioritized in follow-up analyses.
Future studies should assess diagnostic performance more explicitly using cross-validation, ROC curves, or confusion matrices. Moreover, expanding the analysis to include machine learning models (e.g., random forests, support vector machines) could improve classification accuracy and provide feature importance rankings.
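As an illustration of the kind of evaluation proposed here, the sketch below builds a confusion matrix and derives accuracy together with one-vs-rest sensitivity and specificity for each class. The labels are hypothetical and are not results from this study; in practice the predictions would come from held-out folds of a cross-validated classifier.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def sensitivity_specificity(cm):
    """One-vs-rest sensitivity and specificity for each class."""
    tp = np.diag(cm).astype(float)
    fn = cm.sum(axis=1) - tp
    fp = cm.sum(axis=0) - tp
    tn = cm.sum() - tp - fn - fp
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels for three classes (0 = control, 1 = PD, 2 = ET).
y_true = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 1, 2, 2, 2, 1]
cm = confusion_matrix(y_true, y_pred, 3)
sens, spec = sensitivity_specificity(cm)
accuracy = np.trace(cm) / cm.sum()
print(cm)
print(accuracy, np.round(sens, 2), np.round(spec, 2))
```

With strongly imbalanced groups such as those in this sample, per-class sensitivity and specificity are more informative than overall accuracy, which can be dominated by the largest class.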
Another limitation is the cross-sectional design, which precludes conclusions about symptom progression or changes over time. Longitudinal tracking using wearable sensors may offer a more dynamic and personalized understanding of motor and non-motor symptom evolution in PD. Future modeling efforts could also benefit from incorporating interaction terms or using machine learning models that offer feature importance rankings to improve interpretability and predictive accuracy.
5 Conclusion
This study provides evidence that sensor-derived features, particularly variability metrics from accelerometer and gyroscope data, can modestly differentiate diagnostic groups and are associated with non-motor symptom burden in PD. These findings underscore the clinical potential of wearable technology in supporting diagnosis and monitoring, paving the way for more personalized and data-driven approaches to neurodegenerative care.
However, given the modest effect sizes observed, these sensor-based features should be interpreted as supportive rather than definitive markers. Combining sensor data with clinical assessments, longitudinal data, and patient-reported outcomes is likely essential to improve predictive accuracy and applicability in real-world settings. In alignment with open science principles, the PADS dataset used in this study is publicly available and can be accessed via PhysioNet [9]. Additional code or processed data will be made available upon reasonable request.
Footnotes
Acknowledgements
None.
Funding information
None.
Author Contribution
PA analyzed the data and prepared the first draft of the manuscript. PA and GJ participated in the conception and design of the study, GJ constructively revised the manuscript; PA participated in data collection and organization; PA and GJ participated in and supervised the study throughout, and they share corresponding authorship. All authors commented on previous versions of the manuscript and approved the final version.
Declaration of Conflicting interests
GJ is an Associate at FIECON. The authors declare no conflict of interest.
Data Availability Statement
The data that support the findings of this study are openly available at https://physionet.org/content/parkinsons-disease-smartwatch/1.0.0/, reference number [
].
Ethics Statement
Not applicable.
Informed Consent
Not applicable.
