Abstract
Evaluating agreement between AI-generated functional assessment metrics and clinician judgment in stroke rehabilitation is important for understanding how AI-supported evaluation tools may complement clinical practice. This cross-sectional study examined agreement between AI-generated functional performance metrics and clinician-rated judgments in adults with chronic stroke and explored clinician- and patient-level predictors of discrepancies. A total of 186 participants, including 90 adults with chronic stroke and 96 rehabilitation clinicians, completed standardized functional assessments. AI-generated metrics derived from wearable inertial sensors and computer vision algorithms were compared with clinician-rated judgments across multiple functional domains. Agreement was evaluated using intraclass correlation coefficients, and multivariable linear regression identified predictors of AI–clinician discrepancy scores. Agreement between AI metrics and clinician judgment was moderate to good across domains (ICC range = 0.68–0.79), with the highest agreement observed for task execution speed. Greater clinician experience and physical therapy discipline were associated with higher agreement, while regression analysis showed that older age and longer time since stroke were associated with larger discrepancies, and greater motor impairment severity was associated with smaller discrepancies. Overall, AI-generated functional metrics demonstrated meaningful concordance with clinician judgment and appear most useful as complementary evaluative tools that require contextualized clinical interpretation.
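As an illustration of the agreement statistic named above, the following is a minimal sketch of one common intraclass correlation form, ICC(2,1) (two-way random effects, absolute agreement, single measurement). The abstract does not state which ICC form the study used, so this choice is an assumption. The function name `icc2_1` and the toy scores are hypothetical and are not study data.

```python
def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `ratings` is a list of rows, one per participant; each row holds one score
    per rater (for example, [ai_metric, clinician_rating]).
    Assumption: this ICC form is illustrative; the abstract does not name one.
    """
    n = len(ratings)          # number of participants (rows)
    k = len(ratings[0])       # number of raters (columns)
    grand = sum(sum(row) for row in ratings) / (n * k)

    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    # Mean squares
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical toy scores: the second rater is always exactly one point higher.
offset_scores = [[1, 2], [2, 3], [3, 4]]
icc_offset = icc2_1(offset_scores)

# Hypothetical toy scores: both raters give identical values.
identical_scores = [[1, 1], [2, 2], [3, 3]]
icc_identical = icc2_1(identical_scores)
```

In this sketch the identical scores give an ICC of 1, while the constant one-point offset lowers the ICC to about 0.67 even though the two raters rank participants identically. An absolute-agreement form penalizes systematic bias between raters, which a consistency form would ignore.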
