Sage Journals: Discover world-class research

Abstract

Background

Telerehabilitation (TR) delivers rehabilitation services through digital and information technologies. Recent advances in artificial intelligence (AI) have introduced new opportunities for TR, particularly in remote monitoring and individualized treatment. This scoping review aims to examine and synthesize the current literature on the use of AI and markerless motion analysis (MMA) within TR for patients with neurological disorders, distinguishing between approaches focused on remote monitoring/assessment and those supporting AI-based TR platforms.

Methods

A scoping search conducted in March 2025 identified articles published in the last ten years in the following databases: PubMed, Embase, Scopus and Web of Science (WoS).

Results

The initial search retrieved 290 records. After removing 67 duplicates, the remaining records were screened. Following full-text assessment, only 10 studies were included, while 208 were excluded due to wrong population (n = 93), study design (n = 89), outcomes (n = 21), or language (n = 5). Overall, the evidence for both MMA-based remote monitoring/assessment and AI-supported TR interventions remains early-stage and heterogeneous across populations, outcomes, and set-ups.

Conclusion

AI applications in TR and remote monitoring for neurological disorders remain early stage and heterogeneous. While current platforms remain largely experimental, AI-based TR devices and metrics can offer objective, quantitative data to support personalized care, reinforcing the essential role of remote rehabilitation and monitoring in maintaining the patient–clinician connection. Therefore, integrating AI to promote continuity of rehabilitation beyond the clinic may provide a novel way to tailor treatment intensity, adapt exercises over time, and optimize follow-up in neurological rehabilitation.

Registration number

10.17605/OSF.IO/FB8TD

Keywords

telerehabilitation, remote monitoring, markerless motion analysis, artificial intelligence, pose estimation, neurorehabilitation

Introduction

Telerehabilitation (TR) is a branch of telemedicine that refers to the delivery of rehabilitation interventions through digital and information technologies. It encompasses a broad range of services, such as assessment, monitoring, prevention, intervention, supervision, education, consultation, and coaching, which can be provided remotely to patients and their caregivers (Brennan et al., 2010). This rehabilitation approach is widely considered for patients affected by neurological disorders, including Parkinson's disease (PD) (Goffredo et al., 2023), multiple sclerosis (MS) (Pagliari et al., 2025), and stroke (Calabrò et al., 2023; Federico et al., 2023). These neurological conditions are a common cause of motor and cognitive disabilities, resulting in a lower level of functional independence and poorer quality of life. In this context, rehabilitation is fundamental to promote global recovery according to the patients’ needs. However, conventional face-to-face rehabilitation can pose challenges for patients and their caregivers. For example, geographic barriers, such as distance from rehabilitation centers, and the high costs of therapy and transport can result in low adherence to treatment. TR could represent a potential solution to these concerns (Maggio et al., 2020). Through digital platforms and/or gaming technologies, TR can provide therapeutic interventions remotely, improving accessibility and engagement for patients. On the one hand, TR serves as a facilitator by offering interactive and engaging experiences that can enhance patient outcomes. On the other hand, it also poses challenges, including technological constraints and differences in user adaptability. In addition, evaluating patients clinically in the TR modality can be complex because of the inherent limitations of remote assessment (Asgharzadeh Chamleh et al., 2025).
One of the primary issues is the self-performed nature of the assessments, since some items of the clinical scales can be difficult to execute accurately. For example, motor clinical scales, such as the Fugl-Meyer assessment scale for the upper limb, often require precise execution. This precision cannot be easily replicated in a remote setting, partly because of factors such as low camera resolution and poor lighting conditions, which can limit accurate observation of patient movements (Asgharzadeh Chamleh et al., 2025). Gaining a clear understanding of these factors is essential to ensure that remote rehabilitation programmes effectively support patients from diverse backgrounds in achieving their rehabilitation goals. Several studies (Calabrò et al., 2023; Krzyzaniak et al., 2023; Muñoz-Tomás et al., 2023) have reported the feasibility and usability of different TR platforms in multiple neurological conditions, suggesting that innovative TR technologies (e.g., the VRRS, Khymeia, Padua, Italy) can achieve outcomes comparable to conventional physiotherapy in selected populations. In these contexts, TR has shown evidence of clinical effectiveness and, in some trials, non-inferiority to in-clinic rehabilitation for neurological disorders. For example, Gandolfi et al. reported that TR improved balance in people with PD compared with in-clinic training. Other randomized controlled trials have suggested that TR may be a feasible and effective option for post-stroke rehabilitation (Calabrò et al., 2023; Saygili et al., 2024; Toh et al., 2025). Having established the feasibility and clinical effectiveness of TR across different neurological populations, an important next step is to explore how these interventions can be further optimized and individualized. In this context, recent progress in artificial intelligence (AI) has created novel opportunities for TR.

For instance, pose estimation (PE), which involves detecting the position and orientation of the body or an object from videos or images, can enhance the accuracy and quality of movement assessment in the TR modality (He et al., 2024). With 2D PE, physiotherapists can readily quantify patients’ movements and stay informed about changes in posture and movement, tracking recovery progress. In addition, 2D PE can be used to give feedback to patients during TR sessions, enabling them to understand and correct posture or movement issues. However, 2D PE is limited because it fails to capture some specific details of movement. In this regard, 3D PE estimates the 3D position (x, y, z coordinates) and orientation of joints (such as shoulders, elbows, wrists, hips, knees, and ankles) in human movement, allowing more precise motion detection (He et al., 2024). This method for detecting people’s movements without the use of active or passive markers is called markerless motion analysis (MMA). Unlike other types of motion capture systems (e.g., marker-based motion capture), MMA avoids time-consuming marker placement and reduces equipment costs (Lam et al., 2023). MMA typically relies on a standard camera, such as an RGB or infrared camera, to record human movements. For higher precision in 3D motion tracking, multiple cameras can be positioned at different angles (Colyer et al., 2018). After video acquisition, PE algorithms process the footage through AI systems, including OpenPose (CMU-Perceptual-Computing-Lab/openpose, 2017/2025), MediaPipe (google-ai-edge/mediapipe, 2019/2025), or DeepLabCut (DeepLabCut, n.d.). These systems detect joints and limb positions by applying deep learning (DL) techniques, such as Convolutional Neural Networks (CNNs), to extract skeletal structures accurately.
This enables a more natural and realistic capture of human movement while relying on portable, cost-effective sensors rather than traditional marker-based multi-camera setups (Lam et al., 2023).
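
To illustrate how 3D joint coordinates produced by a PE system can be turned into a clinically familiar quantity, the following minimal Python sketch computes the angle at a joint from three keypoints (for example shoulder, elbow, and wrist). The function name, the coordinate values, and the assumption that keypoints arrive as (x, y, z) tuples in a common metric frame are illustrative only and are not taken from any of the cited systems.

```python
import math

def joint_angle_deg(proximal, joint, distal):
    """Angle in degrees at `joint` between the segments
    joint->proximal and joint->distal, from 3D keypoints."""
    u = [p - j for p, j in zip(proximal, joint)]
    v = [d - j for d, j in zip(distal, joint)]
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("degenerate segment: two keypoints coincide")
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))

# Hypothetical keypoints in metres (not measured data).
shoulder = (0.0, 0.3, 0.0)
elbow = (0.0, 0.0, 0.0)
wrist = (0.25, 0.0, 0.0)
elbow_angle = joint_angle_deg(shoulder, elbow, wrist)
```

With 2D keypoints the same formula applies to (x, y) pairs, but the result is then the angle projected onto the image plane, which is one reason 2D estimates can differ from 3D ones when the limb moves out of plane.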

Recent developments (Cotton et al., 2023; West et al., 2023) have further enhanced precision by employing dense keypoint models trained on multiple datasets, such as MeTRAbs, which improve tracking of critical regions like the torso and pelvis. In addition, new approaches use neural networks to generate smooth and anatomically consistent 3D trajectories from video, leading to more reliable inverse kinematic analysis. Together, these advances position MMA as a promising tool for clinical applications, enabling fast and detailed movement assessments in rehabilitation settings, even for patients with complex motor deficits or those using assistive devices.

Moreover, AI has enabled the development of new tools not only for movement analysis but also for rehabilitation treatment. An AI-based training platform that integrates DL-based 3D human PE can deliver highly accurate feedback and guidance by capturing detailed movement patterns (Capecci et al., 2018, 2023). Through these systems, users can compare their own movements with synchronized reference models displayed on their devices, allowing them to detect deviations or irregularities and adjust their performance accordingly (Barzegar Khanghah et al., 2023). In this regard, He et al. (2024) reported, in a randomized controlled trial of older adults with sarcopenia, that a 3D PE-based TR programme achieved improvements in motor function and quality of life comparable to those observed with traditional rehabilitation, although these findings require confirmation in larger samples.

This scoping review aims to explore and synthesize the existing literature on the application of AI and MMA in the context of remote monitoring and TR of neurological disorders. Previous reviews were primarily focused on the use of AI in physical rehabilitation more broadly (Rasa, 2024; Sumner et al., 2023), considering different causes of injuries and not only neurological disorders. In addition, a recent bibliometric analysis was conducted on studies involving AI and robotics for stroke patients (Taşkaya & Taşkaya, 2026). Other reviews have examined markerless motion capture for clinical assessments and in healthy subjects (Lam et al., 2023; Pardell et al., 2024). To the best of our knowledge, no previous review has specifically focused on AI-based MMA systems that have been applied in home-based TR for adult patients with neurological disorders, with a dual emphasis on technical implementation and clinical applicability.

Specifically, this scoping review seeks to identify the AI-based and MMA methodologies that have been applied in clinical or home-based TR settings, summarize their reported performance in tracking and evaluating movement, describe their maturity in terms of usability and feasibility in real-world or near real-world environments, and examine their potential benefits and limitations for neurological rehabilitation.

Methods

This scoping review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guidelines (Tricco et al., 2018) to enhance the transparency, completeness, reliability, and validity of the reported information (the PRISMA-ScR checklist is available in the supplementary material, S1). The protocol was registered with the Open Science Framework (OSF) (https://osf.io/dashboard): 10.17605/OSF.IO/FB8TD.

PICO Model

We defined our combination of search terms using a PICO model (population, intervention, comparison, outcome) (Eriksen & Frandsen, 2018). The population comprised patients with neurological disorders, such as stroke, Parkinson's disease, and multiple sclerosis. The intervention included all studies that explored, described, or applied AI to remotely monitor and treat patients affected by neurological disorders. The comparison concerned the differences between AI models, including ML and DL algorithms. The outcomes encompassed the contributions of AI technologies to assessment and rehabilitation treatment.

Eligibility Criteria, Information Sources and Search Strategy

A scoping search, begun in March 2025, was conducted for all peer-reviewed articles published in the last ten years, using the following databases: PubMed, Embase, Scopus and WoS, which are among the most widely used in medicine and bioengineering. This time frame was chosen to capture the growing interest and recent technological advances in MMA and AI-based TR, and to map the emerging body of literature in this field. The initial search strategy included combinations of the following keywords: “telerehabilitation,” “machine learning,” “markerless,” “artificial intelligence,” and “neurological disorders”. The search strings were then adapted to each database's syntax, as reported in Table 1.

Table 1.

Database Search Strategies and Keyword Strings.

Database	Keyword string
PubMed	((telerehabilitation) AND (machine learning)) AND (neurological disorders); (“telerehabilitation” AND markerless); (telerehabilitation AND artificial intelligence); (((((telerehabilitation[Title/Abstract]) OR (telemonitoring[Title/Abstract])) OR (remote monitoring[Title/Abstract])) AND (machine learning[Title/Abstract])) OR (pose estimation[Title/Abstract])) OR (markerless[Title/Abstract])
Scopus	(((((telerehabilitation[Title/Abstract]) OR (telemonitoring[Title/Abstract])) OR (remote monitoring[Title/Abstract])) AND (machine learning[Title/Abstract])) OR (pose estimation[Title/Abstract])) OR (markerless[Title/Abstract])
WoS	(“telerehabilitation” AND markerless); (telerehabilitation AND artificial intelligence)

We included all studies on the adult population (>18 years) affected by neurological disorders, with various diagnoses. Specifically, the inclusion criteria were: i) patients affected by neurological disorders; ii) use of AI and related technologies for remote monitoring and TR treatment; iii) written in English; and iv) published in a peer-reviewed journal.

Articles describing theoretical models, methodological approaches, algorithms, and basic technical descriptions were excluded. We also excluded: i) animal studies; ii) reviews; iii) studies involving children; and iv) case reports. These restrictions were intentional, as our focus was on AI and MMA methods that have been applied in TR or remote monitoring of patients with neurological disorders. In particular, purely theoretical or algorithm-development studies, often conducted on healthy volunteers or on generic motion datasets without a TR context in neurological patients, were considered beyond the scope of this clinically oriented scoping review.

The list of articles was refined for relevance, revised, and summarized, with key topics identified based on the inclusion and exclusion criteria. Given the limited literature available, various study designs were included in the qualitative synthesis: i) Randomized Controlled Trials (RCTs); ii) Observational studies; iii) Cross-sectional studies; iv) Case-control studies; and v) Cohort studies.

Selection of Sources of Evidence and Data Charting Process

Two independent reviewers (M.B. and G.L.) screened titles, abstracts, and full texts, applying predefined inclusion and exclusion criteria to minimize selection and publication bias. Disagreements were resolved by consensus. All search results were imported into an online database (Rayyan) (Ouzzani et al., 2016), where the reviewers independently assessed each study's relevance. Following the initial title and abstract screening, blinding was lifted, and any remaining disagreements regarding study inclusion were resolved through discussion.

In line with current guidance for scoping reviews, we did not perform a formal methodological quality or risk-of-bias assessment of the included studies. The primary aim of this work was to map and descriptively synthesize the emerging literature on AI-based TR and MMA in neurological rehabilitation, rather than to generate pooled effect estimates or compare the efficacy of specific interventions.

To contextualize potential sources of bias in the available evidence, we recorded key methodological descriptors for each study during data extraction (e.g., design, sample size, intervention setting, and type of comparison group). These characteristics are reported in the summary tables (see Tables 2 and 3).

Table 2.

Summary of the Studies Reporting Markerless Applications for Remote Monitoring.

First author and year of publication	Study sample	AI technology/Markerless application	Sensor/mobile application	Features/Predictors	Main findings/clinical relevance
Nucita et al. (2023)	Total Patients: 21 Gender: Female Condition: Rett Syndrome Age Range: 4–31 years Clinical stage: III-IV Functional ability level: 95.57; Level of severity (RARS): 68.71;	ZED OpenPose Library Description: An open-source 3D extension of the original OpenPose library Functionality: Enables 3D human pose estimation by combining OpenPose with ZED stereo camera technology Use Case: Useful in applications requiring depth-aware skeletal tracking, such as advanced rehabilitation, motion analysis, or interactive environments	Laptop with a built-in eye-tracker, a stereo camera (3D ZED camera), a webcam and a headset. The software was implemented as a Web Application, exploiting CISCO Webex API for videocalls.	Joint angles during shoulder flexion and abduction, elbow flexion and extension, and knee extension	3D Skeleton-Based Measurements Comparison Tool: Therapist-measured angles using a goniometer Pearson's r: 0.62–0.89 (p < 0.01) Intraclass Correlation Coefficient (ICC): 0.78–0.92 High reliability for joint mobility assessment Joints evaluated: shoulder, elbow, knee→ 3D computer vision methods are strongly validated 2D Skeleton-Based Measurements Pearson's r: 0.45–0.77 (p < 0.05) ICC: 0.68–0.81 These findings suggest that 3D markerless computer vision can provide reliable joint mobility estimates in this sample, while 2D approaches may represent a lower-cost option with moderate precision for clinical or remote monitoring.
Capecci et al. (2018)	Healthy Control Group Total Subjects: 28 Neurological Patients Parkinson's Disease (PD): N = 8 (all female) Mean Age: 63.8 ± 8.7 years Stroke: N = 4 (all female) Mean Age: 56.4 ± 17.2 years Musculoskeletal Patients: Back Pain: N = 3 (all female) Mean Age: 49.8 ± 16.7 years	Remote assessment of motor exercises for axial disorders Tasks Monitored: 5 rehabilitation exercises (e.g., trunk tilt, arm lifts, squats)	Sensor: Microsoft Kinect v2 (RGB-D camera) Data Captured: 3D skeletal tracking data, joint positions, motion trajectories The subjects were at a distance of about 3 meters in front of the Kinect sensor.	Primary Outcomes (POs): e.g., limb angles, distances Control Factors (CFs): e.g., posture constraints Frequency Variability (FV): timing consistency of repetitions Functional correctness (PO) Postural compliance (CF) Repetition timing (FV)	Moderate to high correlation with clinician scores (e.g., Spearman's ρ = 0.41–0.64; p < 0.05) TS and PO metrics differentiated, at group level, between healthy participants and patients with neurological or musculoskeletal conditions, whereas CF scores were more variable, underlining limitations in posture assessment with Kinect. Overall, the system demonstrated the potential to support objective, remote evaluation of axial rehabilitation exercises, particularly in home or low-resource settings.
Dellepiane et al. (2025)	Stroke Patients (Group A). Participants: 2 male patients Ages: 62 and 79 years Frail Elderly (Group B) Participants: 20 elderly individuals Gender: 8 females, 12 males Mean Age: 82.5 ± 3.4 years. Control participants: 16 healthy individuals Gender: 9 females, 7 males Age Range: 27–61 years Mean Age: 38.62 ± 12.95	Feature Extraction & Segmentation: Automatic, AI-driven process. Utilizes skeletal tracking data from the Kinect sensor Machine Learning Technique: Random Forest Classifiers. Used to identify and classify movement phases based on joint positions and kinematic features Movement Similarity Evaluation: DTW. Measures the temporal and spatial similarity of movement patterns across repetitions.	Sensor: Microsoft Kinect v2, Markerless, contactless optical motion capture system Mechanism: Uses Time-of-Flight (ToF) infrared depth sensing; 30 Hz sampling frequency for the skeletal joint signals Data: 25 skeletal joints in 3D space Part of the ReMoVES IoT system. Used for exergame-based rehabilitation (interactive therapeutic exercises).	Joint angles, range of motion, velocity, and acceleration	Extracted features successfully characterize different types of motor impairments (e.g., in gait, posture, balance). Enables patient-specific rehabilitation monitoring and progress tracking Markerless motion capture combined with signal processing provides a low-cost, non-invasive, and clinically meaningful alternative to traditional lab-based systems
Hartman et al. (2022)	12 male participants, aged 14–25 years (Mean = 19) with Duchenne muscular dystrophy; all power wheelchair users; mostly non-Hispanic/Latino Caucasians (1 Asian); all right-dominant and used the device on the dominant side; 9 students, 2 employed, 1 not in school/work; no prior experience with dynamic arm support devices.	SVM	The KINOVA O540 (KINOVA Robotics Inc., Boisbriand QC, Canada). The O540 uses power from an electric powered wheelchair to provide the user with support for upper arm movements. The ActiGraph GT9X Link (ActiGraph Corp, LLC, Pensacola, FL), a wrist-worn accelerometer.	Key Features Identified (via feature selection): Common to both categories: Mean value across time; Standard deviation across time; Normalized energy Additional for movement category 2: Maximum value across time accelerometry signals, captured in three dimensions, were aligned to the start of each task, and their vector magnitude was calculated. From this signal, several features were extracted, including mean, standard deviation, minimum, maximum, median, task duration, entropy, signal energy, and normalized energy.	SVM models trained on wrist-worn accelerometer features achieved high accuracy, sensitivity, and specificity in distinguishing between periods of KINOVA O540 arm-support usage and non-usage in young adults with Duchenne muscular dystrophy. These results indicate that wearable inertial sensing can be used to objectively detect device utilization during daily activities in this population, although clinical implications for function and participation were not directly assessed.
Mulfari et al. (2022)	69 native Italian speakers with dysarthria (42 males and 27 females). The majority of voice contributions (54%) came from individuals with infantile cerebral palsy.	Deep learning: CNN Framework: TensorFlow 1.6.0, Python 3.6 OS/Hardware: Ubuntu 18.04, Intel i7-8700 K CPU, 16 GB RAM, Nvidia GTX 1070 (8 GB) Training Parameters: Dataset split: 80% training / 10% validation / 10% test	CapisciAMe mobile application	Automatic speech recognition: Training loops to run: • 15,000 (learning rate: 0.001) • 3,000 (learning rate: 0.0001) • Evaluation frequency: Every 500 steps • Expected sample rate of audio files: 16,000 Hz • Expected audio duration: 2,500 milliseconds • Spectrogram window size: 75 milliseconds • Spectrogram window stride: 25 milliseconds	Promoting the use of personal mobile devices for speech sample donation, offering personalized offline voice recognition through edge computing, and leveraging smartphone ubiquity to expand access to assistive technologies, but its impact on communication outcomes in real-world rehabilitation settings remains to be evaluated.

Legend: AI (Artificial Intelligence); PD (Parkinson's Disease); POs (Primary Outcomes); CFs (Control Factors); FV (Frequency Variability); TS (Total Score); ICC (Intraclass Correlation Coefficient); DTW (Dynamic Time Warping); IoT (Internet of Things); SVM (Support Vector Machine); CNN (Convolutional Neural Network); OS (Operating System); RAM (Random Access Memory); Hz (Hertz)
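
Dynamic time warping (DTW), listed in Table 2 as a movement-similarity measure, aligns two sequences that may differ in speed before comparing them. The sketch below is a textbook DTW on one-dimensional signals (for example, a joint-angle trace), using the absolute difference as local cost. It is not the implementation used in the cited studies, and the sample traces are invented for illustration.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance
    between two 1D sequences, with |x - y| as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b holds
                cost[i][j - 1],      # b advances, a holds
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

# Hypothetical joint-angle traces in degrees (not measured data).
reference = [0, 30, 60, 90, 60, 30, 0]            # therapist template
slow_rep = [0, 0, 30, 30, 60, 90, 90, 60, 30, 0]  # same shape, slower
short_rep = [0, 20, 40, 60, 40, 20, 0]            # reduced range of motion

d_slow = dtw_distance(reference, slow_rep)
d_short = dtw_distance(reference, short_rep)
```

In this toy example the slower repetition has zero DTW distance from the template because only its timing differs, whereas the repetition with reduced range of motion yields a positive distance.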

Table 3.

Summary of the Studies Reporting AI-based TR Platforms and Applications.

First author and year of publication	Study sample description	AI technology	Sensor/data	TR intervention/technology	Feature/Predictors	Model performance metrics	Clinical findings/relevance
Capecci et al. (2023)	11 COV19 with a mean age of 57.3, 5 F/6 M; Days from COV19 = 117; RANKIN disability score = 1.0 10 pwPD with a mean age of 65.4, 2 F/8 M; HY = 3; UPDRS = 33; MoCA = 24.5; mBI = 89; RANKIN disability score = 1.8	ARC: Wearable sensors + mobile device + AI.	5 inertial sensors (MetaMotionR+, MbientLab, San Francisco, USA), tablet with dedicated app, charging station.	Patients received 30-min instruction on ARC hardware/software, followed by a usability test (15 tasks rated for support needed and difficulty, plus SUS questionnaire). After enrollment, each patient underwent clinical assessment and used ARC at home (45-min sessions, 5 days/week for 4 weeks), with ≥30-min weekly video calls. ARC enabled exercise execution and adherence monitoring during unsupervised sessions.	Tested on 30 healthy subjects across 41 ARC exercises. Repetition counts by the algorithm (NARC) were compared with expert annotations (Noperator).	In 35/41 exercises (86%), no significant differences were found (p ≥ 0.05), confirming functional performance. Only successfully validated exercises were included in the study.	21/23 patients (11 post-COVID, 10 PD) completed the study. Mean SUS = 77/100; median adherence = 80%. Significant improvements observed in BaDI, 2MWT, BFI, BAI, BDI, and EQ-5D; no adverse effects. The authors concluded that ARC was feasible, well tolerated, and associated with pre–post improvements in clinical and patient-reported outcomes in this small sample, although comparative effectiveness versus standard rehabilitation was not assessed.
Ramírez-Sanz et al. (2023)	76 patients: Intervention group (n = 38) received telerehabilitation + standard care; Control group (n = 38) received standard care only. PD patients had a history of at least one fall in the previous 12 months; HY: < 3 MoCA: >18 Presence of freezing of gait or self-selected gait speed of less than 1.1 m per second	Deep-learning skeleton estimation using Detectron2 for real-time pose tracking during rehabilitation exercises.	Mini-PC (Intel Celeron N4000, 4 GB RAM, 120 GB SSD), the webcam model (Logitech HD Pro C920), and the AI processing server (Intel Xeon E5-2630 v4 with three NVIDIA Titan Xp GPUs and 128 GB RAM); Raspberry Pi 3 discarded due to overheating. Operated via low-cost SNES controller with color-coded buttons for key functions (e.g., therapist call, patient info).	Real-time sessions supervised by an occupational therapist via video. Sessions start with guided warm-up (mobility and stretching from neck to feet), followed by posture, mobility, balance, and coordination exercises aimed at reducing stiffness, dyskinesia, and gait freezes.	Seventeen 2D points represent key body joints, including wrists, elbows, shoulders, knees, and ankles.	Computational performance only: average time to process a frame (t_mean = 626 ms, t_std = 98.7 ms), and estimation of the number of Spark workers required to sustain a 10-fps video stream. The authors also compared loading and processing times for different keypoint R-CNN backbones used for 2D pose estimation.	TUQ administered to a subset of participants (n = 15), with a mean total score of 139.1/147, indicating high perceived usability and satisfaction. No motor or functional clinical outcome scales were reported; the system is presented as a proof-of-concept, low-cost TR system integrating AI and big-data tools (Apache Kafka/Spark) to support therapists in evaluating exercise performance and managing multiple remote sessions.
Barzegar Khanghah et al. (2023)	16 patients (5 SCI, 5 post-stroke, 1 brain injury, 5 other neurological; 11 M/5F; age 20–60) and 14 healthy controls.	Deep learning: CNN model (3DConvNets)	Microsoft Kinect One sensor	Upper limb rehabilitation movements (elbow flexion and extension, shoulder flexion, abduction) and trunk shifting exercises.	Each video consists of a set number of frames at 200 × 200 resolution. The model extracts features through successive layers using 3D convolutional kernels of varying sizes and 3D max pooling to reduce overfitting by abstracting the feature maps.	Accuracy = 90.6% ± 9.2% (10-Fold), 83.8% ± 7.6% (LOSO); F1-score = 71.8% ± 5.7% (10-Fold), 60.6% ± 21.3% (LOSO).	No clinical scales or functional outcome measures were reported. The system was presented as a potential tool to support unsupervised home rehabilitation by automatically classifying upper-limb and trunk exercises, but its effects on movement quality and adherence have not yet been evaluated in clinical trials.
Bertomeu-Motos et al. (2023)	2 post-stroke (74 and 68 years old, with an MRC of 4 and 3 respectively) patients and 1 healthy subject	OIESGP DTW	Shimmer magneto-inertial sensors (upper arm, forearm, hand); 9-DOF (accelerometer, gyroscope, magnetometer), 16-bit resolution.	Eight upper-limb activities (mouth, shoulder, knee, ear, head, triangle, square, circle) performed following 5-s visual cues on screen, with 5-s rest between tasks. The therapists executed ten trials of each activity while the healthy subject and the poststroke patients only performed five trials with the dominant/affected limb	Data acquired from the two therapists (e.g., the estimated upper limb joints and the sEMG signal), was used to train the OIESGP model, and the reference of each joint trajectory, necessary to compute the DTW distance, was estimated as the mean of the ten trials.	Accuracy: 64% (healthy), 38% (P1), 8.3% (P2). Correct classifications required one DTW distance per joint; misclassifications required two (true vs predicted activity).	No clinical scales or patient outcomes were reported. The study focused on demonstrating that the system can generate probability curves for activity recognition, trajectory comparisons with therapist reference movements, and compact visual summaries of session outcomes, which may help therapists monitor movement quality and patient progress. Its clinical utility still requires validation in larger patient cohorts.
Chae et al. (2020)	6 and 17 chronic stroke patients. CG: Age = 64.5 (9.6) WMFT = 38.8 (25.6) FMA = 29.0 (14.2) Grip power = 11.7 (11.6) BDI = 24.2 (11.2) HBR group: Age = 58.3 (9.3) WMFT = 39.7 (22.2) FMA = 36.6 (18.6) Grip power = 13.3 (12.7) BDI = 17.88 (14.7)	CNN (TensorFlow 1.7.0, Python 3.5).	IMU smartwatch LG W270: 3-axis accelerometer + gyroscope; sampled at 10 Hz. Data structured as time × sensor matrix. Feature extraction with 3-s sliding time window (optimized experimentally).	(1) bilateral shoulder flexion with both hands interlocked; (2) wall push exercise; (3) active scapular exercise; (4) towel slide exercise.	Models trained on personal vs. total datasets; accuracy compared using accelerometer, gyroscope, and combined sensor data to identify most reliable predictors for exercise classification.	ML performance: Accelerometer + gyroscope (99.8%) > accelerometer (98.1%) > gyroscope (96.1%).	In this randomized study, the HBR group (n = 17) showed significant improvements in WMFT (P = 0.02), shoulder flexion ROM (P = 0.004), and internal rotation (P = 0.001), whereas the control group (n = 6) improved only in internal rotation (P = 0.03). Drop-out rates: Control = 40% (12w), 100% (18w); HBR = 22% (12w), 45% (18w).

Legend: AI (Artificial Intelligence); TR (Telerehabilitation); ARC (Augmented Rehabilitation Care); SUS (System Usability Scale); BaDI (Brief Aphasia Disability Index); 2MWT (2-Minute Walk Test); BFI (Brief Fatigue Inventory); BAI (Beck Anxiety Inventory); BDI (Beck Depression Inventory); EQ-5D (EuroQol-5 Dimension); PD (Parkinson's Disease); HY (Hoehn & Yahr); MoCA (Montreal Cognitive Assessment); SCI (Spinal Cord Injury); CNN (Convolutional Neural Network); LOSO (Leave-One-Subject-Out); OIESGP (Online Infinite Echo-State Gaussian Process); DTW (Dynamic Time Warping); DOF (Degrees of Freedom); sEMG (Surface Electromyography); IMU (Inertial Measurement Unit); HBR (Home-Based Rehabilitation); WMFT (Wolf Motor Function Test); ROM (Range of Motion); CG (Control Group).
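
Table 3 reports both 10-fold and leave-one-subject-out (LOSO) validation for one of the included models. As a minimal sketch of the LOSO idea, assuming each sample carries a subject identifier, the code below builds folds in which all samples from one person are held out together. The helper name and the subject labels are hypothetical and do not come from the included studies.

```python
def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices),
    holding out every sample of one subject per fold."""
    for subject in sorted(set(subject_ids)):
        test_idx = [i for i, s in enumerate(subject_ids) if s == subject]
        train_idx = [i for i, s in enumerate(subject_ids) if s != subject]
        yield subject, train_idx, test_idx

# Hypothetical subject label for each recorded sample.
subject_ids = ["P1", "P1", "P2", "P2", "P2", "P3"]
folds = list(loso_splits(subject_ids))
```

Because a random k-fold split can place samples from the same person in both the training and the test set, LOSO is generally the stricter estimate of how a model may perform on an unseen patient.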

Data Extraction and Data Items

Following full-text selection, data from the included studies were charted in a structured data sheet. The extracted information included: first author and year of publication; study aim; sample size; baseline characteristics; intervention setting; type of comparison group; type of clinical predictors (if included); type of ML/DL/AI algorithm used; results and performance metrics (e.g., accuracy, sensitivity, specificity, area under the curve [AUC]); presence of interpretability techniques; and clinical implications.

Results

The initial search retrieved 290 records. After removing 67 duplicates, the remaining 223 records underwent screening. Five articles were excluded based on title and abstract. In the second round of screening, the two reviewers assessed the full text of the remaining articles. We included 10 articles in the final analysis and excluded 208 articles for the following reasons: wrong population (n = 93), wrong study design (n = 89), wrong outcomes (n = 21) and non-English studies (n = 5) (see Figure 1).

Figure 1.

PRISMA flow-chart showing the study selection process.

Markerless Applications for Remote Monitoring

Among the selected evidence, 50% (5 out of 10) of the articles dealt with markerless and AI-based applications for remote monitoring in patients with neurological disorders (see Table 2). In these studies, AI and MMA were used to capture human movements and gestures during TR sessions (Capecci et al., 2018; Dellepiane et al., 2025) or to assess movements more objectively in a TR setting (Hartman et al., 2022; Nucita et al., 2023). Another application was DL-based voice recognition for people with dysarthria (Mulfari et al., 2022).

Nucita et al. used a ZED camera, which was placed in front of the subject to evaluate passive range of motion (PRoM) on the frontal plane, and to the patient's side to evaluate PRoM on the sagittal plane. The body segments associated with the assessed joint were passively and carefully mobilized to the limit of their range of motion, at which point a measurement was recorded. Each evaluation was conducted by the participant's primary caregivers, who passively moved the subject's joints in front of the 3D ZED camera. Each motor evaluation was carried out before and after the TR intervention. The 3D ZED/OpenPose-based joint angle estimates showed high agreement with therapist-measured goniometric values (Pearson's r = 0.62–0.89; ICC = 0.78–0.92), supporting the validity of 3D markerless methods for joint mobility assessment. In contrast, 2D skeleton-based measurements exhibited only moderate agreement (Pearson's r = 0.45–0.77; ICC = 0.68–0.81), indicating lower accuracy than 3D but still acceptable precision for clinical or remote monitoring, and representing a feasible, low-cost alternative. The TR intervention consisted of three or four active exercises and two or three passive postures. The active exercises were performed in three 30-minute sessions per week for the three-month duration of the intervention. These sessions were conducted by participants’ primary caregivers (who were together with the patients) under the live supervision of a therapist experienced in the motor treatment of people with Rett syndrome.
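As an illustration of the agreement analysis described above, the following minimal sketch computes Pearson's r between paired goniometer and camera-derived joint angles. The function name and the sample values are hypothetical and are not data from Nucita et al.; the ICC, which the authors also reported, is not reproduced here because its exact form is not stated.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired joint angles in degrees (illustrative only).
goniometer = [85.0, 92.0, 110.0, 120.0, 131.0]
camera_3d = [83.0, 95.0, 108.0, 123.0, 128.0]
r = pearson_r(goniometer, camera_3d)
```

Correlation alone does not capture systematic offsets between methods, which is one reason agreement indices such as the ICC are usually reported alongside it.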

Similarly, Capecci et al. introduced a real-time monitoring system designed to assist clinicians in remotely evaluating exercise performance during home-based rehabilitation (HBR). They extracted specific kinematic features, based on clinician indications, to assess five motor tasks commonly used in axial disorder rehabilitation (trunk lateral tilt, arm lifting, trunk rotation, pelvis rotation, and squatting). These features were extracted using the Kinect v2 skeleton tracking system and processed to generate disaggregated performance scores based on a bell-shaped ranking function. The system was tested on 28 healthy individuals and 29 patients with neurological or orthopaedic conditions. Using a cross-sectional controlled design, the algorithm's scores were validated against blinded clinical evaluations via a structured questionnaire. The authors reported moderate to high correlations between Kinect-derived kinematic scores and clinician ratings (Spearman's r = 0.41–0.64; p < 0.05). These findings further support the clinical relevance of markerless, remote assessments of motor performance and their ability to discriminate between healthy and pathological subjects.
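The idea of a bell-shaped ranking function can be sketched as follows. The exact function used by Capecci et al. is not given here, so this example assumes a Gaussian bell; `bell_score`, `target`, and `tolerance` are hypothetical names chosen for illustration.

```python
import math

def bell_score(value, target, tolerance):
    """Map a kinematic feature to a score in [0, 1] using a Gaussian bell.

    The score is 1 when the feature equals the clinician-defined target
    and decays smoothly as the feature moves away from it.
    """
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    return math.exp(-0.5 * ((value - target) / tolerance) ** 2)

# Hypothetical example: target trunk tilt of 30 degrees, tolerance of 5 degrees.
perfect = bell_score(30.0, 30.0, 5.0)
slightly_off = bell_score(35.0, 30.0, 5.0)
```

A smooth score of this kind penalizes small deviations gently and large deviations strongly, which may be closer to how a clinician grades performance than a hard threshold.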

The Kinect v2 sensor was also used by Dellepiane et al. to recognize human movements during the Sit-to-Stand (STS) task. The motor exercises and instructions to the patient were delivered by the ReMoVES IoT system. Designed to integrate with traditional rehabilitation, ReMoVES operates during periods without therapist-guided activity, enabling information collection and access from various locations. The authors collected multidimensional kinematic parameters and found that most parameters were non-normally distributed in the stroke and elderly groups (p < 0.001), with some exceptions: knee flexion angles (normal in both patient groups), trunk extension and flexion angles (normal in the elderly group), and shoulder twist angle (normal in the stroke group). In addition, feature robustness was validated through correlation analysis of instantaneous right and left knee angles, which showed the highest correlation in the control and elderly groups. Although the correlation was lower in the stroke group, its values remained high, suggesting preserved bilateral movement coordination, possibly due to good residual autonomy in participants. Moreover, the authors provided a detailed analytical error model for Kinect v2 joint and angle measurements, combining additive and multiplicative noise modelling with error propagation to estimate joint and angle root mean square error (RMSE). Using published Kinect v2–Qualisys errors as a reference, they reported that their virtual center-of-mass (CoM) joint and nonlinear filtering pipeline can reduce CoM RMSE to below 1.4 cm and trunk angle errors to below 6°. However, no standardized per-joint position error metric was reported, and no quantitative comparison between 2D and 3D pose estimation was performed.

Unlike the other authors, Hartman et al. used accelerometer data in a novel analysis to remotely monitor the use of a dynamic arm support device for individuals with muscular dystrophy. The authors employed an ML algorithm (support vector machine, SVM) to evaluate the relationship between accelerometer measurements and functional tasks of the upper limb during the use of an actuated assistive device. The arm movements were recorded in all planes and included reaching forward, pushing backwards, and reaching left, right, and diagonally in all directions. The amount of support given by the device could be adjusted by the user with a hand-held remote. Each participant completed the functional upper limb tasks while wearing the ActiGraph and being video recorded. In this study, accelerometry data were used to address two classification problems: determining whether the O540 dynamic arm support device was being used and distinguishing between successful and unsuccessful task attempts. The SVM helped to automatically identify the most informative features from the accelerometer signals, such as mean, standard deviation, and signal power, while minimizing model overfitting. SVM was chosen not only for its ability to model complex, nonlinear decision boundaries (using kernels like the radial basis function), but also because it performs well with smaller datasets, a key consideration given the rarity of Duchenne muscular dystrophy and the resulting limited sample size. The study further explored how variations in these key features related to task success. Classification of task outcomes was performed under three conditions: when the O540 was used, when it was not used, and both combined. This allowed for an analysis of the device's effect on performance detection.
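The feature types named above (mean, standard deviation, and signal power) can be computed per window as in the sketch below. This is an illustrative assumption about how such features might be derived, not the authors' pipeline; `window_features` is a hypothetical helper, and signal power is taken here as the mean squared amplitude.

```python
import math

def window_features(signal):
    """Return mean, standard deviation, and signal power of one window."""
    n = len(signal)
    if n == 0:
        raise ValueError("empty window")
    mean = sum(signal) / n
    variance = sum((s - mean) ** 2 for s in signal) / n  # population variance
    power = sum(s * s for s in signal) / n  # mean squared amplitude
    return {"mean": mean, "std": math.sqrt(variance), "power": power}

# Hypothetical single-axis accelerometer window (arbitrary units).
features = window_features([1.0, -1.0, 1.0, -1.0])
```

Feature vectors of this kind, computed per axis and per window, could then be passed to an RBF-kernel SVM, for example scikit-learn's `SVC(kernel="rbf")`.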

Interestingly, Mulfari et al. developed an automatic speech recognition algorithm for native Italian speakers with dysarthria, exploiting an existing mobile app to collect audio data from users with speech disorders while they performed articulation exercises for speech therapy purposes. Using these data, a CNN was trained with a supervised, speaker-dependent approach to spot a small number of isolated keywords within atypical speech. Speech data consisted of 650 audio samples from six native Italian speakers with varying levels of dysarthria. Each speaker contributed 50 samples per keyword. The computational setup used for deep learning training and deployment on mobile devices included an i7 CPU, 16 GB RAM, and a GTX 1070 GPU. The results showed that personalized models trained with more data (mode30) achieved the highest accuracy, with up to 98.6% in speaker-specific configurations. Global models also performed well, particularly in the mode30 setup (97.9%), confirming that increasing the number of training examples significantly improves keyword recognition performance.

AI-Based Platforms Carrying out TR

The other 50% (5 out of 10) of collected evidence dealt with AI-based TR platforms. Some authors have conducted feasibility and usability studies on developing systems that incorporate AI algorithms to optimize the selection of exercises based on patient-specific difficulties (Capecci et al., 2023). Others have developed systems to support TR using devices equipped with ML algorithms (Bertomeu-Motos et al., 2023; Chae et al., 2020), where the systems are able to correct movements and support the safe guidance of rehabilitation sessions for patients (see Table 3).

In their study of 2023, Capecci et al. delivered TR sessions with an innovative device called ARC. The ARC is a TR solution that integrates multiple wearable sensors and a mobile device supported by AI algorithms. The system includes five inertial sensors (MetaMotionR + from MbientLab, San Francisco, CA, USA), a tablet equipped with a dedicated application, and a charging station. In its Home version, ARC enables patients to carry out their rehabilitation programme independently, providing simple instructions, video tutorials, and automated tracking of correctly executed repetitions. The innovative core of the device lies in its AI algorithm, which automatically counts the number of repetitions performed accurately. The algorithm processes input signals consisting of tri-axial accelerations and angular velocities, transmitted in real time via Bluetooth from three of the five inertial measurement unit (IMU) sensors worn by the patient during each session. The specific sensors used vary according to the type of exercise and the targeted body region. Before using the ARC device at home, each participant received a 30-min in-person training session on the software and hardware components, followed by hands-on practice with selected rehabilitation exercises. After the training, participants used the ARC system at home for four weeks, performing 45-min sessions, five days per week, and participating in at least one 30-min video call per week with the investigator. The ARC system enabled remote monitoring of exercise execution and adherence. Despite the high usability score achieved (77/100), seven participants (9% of the total sample, mostly individuals with PD) experienced difficulties in using or accepting the technology. Nevertheless, the system was found to be safe, and no adverse effects were reported during its use.

Chae et al. designed an HBR system that combines a smartwatch with a smartphone application, employing an ML algorithm to identify and track both the type and frequency of rehabilitation exercises. The system used off-the-shelf devices and custom applications, with a CNN to detect exercises. The authors compared detection accuracy across different data types (accelerometer, gyroscope, or both) and data sets (individual vs. total). The intervention focused on bilateral upper limb exercises for stroke patients with mild motor impairment, including shoulder flexion, wall push-ups, scapular activation, and towel slides. Bilateral training aimed to promote contralateral motor network reorganization via interhemispheric crosstalk, supporting upper limb recovery in chronic stroke.

For the ML model, the authors implemented a CNN with two convolutional layers containing 8 and 16 feature maps, followed by a fully connected layer with 32 nodes. They tested two approaches: one model trained on personal datasets, and another trained on the combined dataset. Model performance was assessed using cross-validation. The most accurate model was built with personal data that combined accelerometer and gyroscope signals, achieving 99.80% accuracy (5590/5601). This outperformed models trained with accelerometer data alone (98.13%, 5496/5601) or gyroscope data alone (96.07%, 5381/5601). In the comparative study, dropout rates at 12 weeks were 40% (4/10) in the control group and 22% (5/22) in the TR group, increasing at 18 weeks to 100% (10/10) and 45% (10/22), respectively.
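Table 3 reports that this study sampled the smartwatch at 10 Hz and used a 3-s sliding window, which corresponds to 30 samples per window. The sketch below shows one way such segmentation could be done before feeding windows to a CNN. The step between windows is not reported in this section, so `step_s` is an assumption, as are the function and variable names.

```python
def sliding_windows(samples, rate_hz=10, window_s=3.0, step_s=1.0):
    """Split a list of per-sample sensor rows into fixed-length windows.

    samples: list of rows, e.g. [ax, ay, az, gx, gy, gz] per time step.
    step_s is an assumed stride; the source does not state the overlap.
    """
    size = int(round(rate_hz * window_s))
    step = int(round(rate_hz * step_s))
    if size <= 0 or step <= 0:
        raise ValueError("window and step must span at least one sample")
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, step)]

# Hypothetical 10-second recording: 100 time steps x 6 sensor channels.
recording = [[0.0] * 6 for _ in range(100)]
windows = sliding_windows(recording)
```

Each window is a time × sensor matrix, matching the data structure described for this study in Table 3.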

In the study by Bertomeu-Motos et al., the authors proposed a home-based AI system that guided patients toward correct movement execution. The system also had to decide which activity the patient should perform next, to help the patient stay motivated and eager to continue the therapy. It was further used to assess upper limb joint movements in patients with motor impairments. Motion trajectory results were derived by comparing the joint trajectories executed by patients with reference movements performed by clinicians. The system supports a wide range of neurorehabilitation tasks, from simple actions (e.g., touching the head) to complex ones (e.g., drawing geometric shapes in the air). Activity classification was obtained with a time-series classification model, the Online Infinite Echo-State Gaussian Process (OIESGP), which was trained on clinician-performed activities. Trajectories were compared using Dynamic Time Warping (DTW), which measures the similarity between the patient's and the clinician's joint trajectories. The authors defined the correct execution of an upper limb task as a case in which the system both correctly classified the activity and reported a low DTW distance. Three data scenarios were tested: trajectories of seven joints, trajectories of five joints, and eight sEMG signals (flexor and extensor muscles of the forearm). Due to poor accuracy in the sEMG-based scenario, the five-joint model was selected. Classification accuracy was 64% for a healthy subject but substantially lower for post-stroke patients (38% for patient 1 and 8.3% for patient 2). The system also supported adaptive therapy by suggesting subsequent exercises and evaluating movement quality to maintain patient motivation during HBR.
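The DTW comparison and the two-part correctness rule described above can be sketched as follows. This is the textbook dynamic-programming form of DTW, not necessarily the variant used by the authors; `max_distance` is a hypothetical threshold, since the value that counts as a "low" DTW distance is not stated.

```python
import math

def dtw_distance(a, b):
    """Classic DTW distance between two sequences of points (tuples)."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean point distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def is_correct_execution(predicted, expected, patient, reference, max_distance):
    """Correct only if the activity label matches AND the DTW distance is low."""
    return predicted == expected and dtw_distance(patient, reference) <= max_distance

# Hypothetical 2D trajectories: same path, different speed, then an offset path.
reference = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
slower = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
offset = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
```

Because DTW aligns sequences in time, a patient who performs the same path more slowly is not penalized, whereas a spatial deviation from the reference increases the distance.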

Similarly, other researchers have introduced TR platforms with automated exercise guidance, a feature that is expected to enhance rehabilitation outcomes. In this context, Barzegar Khanghah et al. (Barzegar Khanghah et al., 2023) designed and validated a vision-based biofeedback system capable of assessing the quality of exercises performed during TR sessions. The study developed an activity recognition model to automatically detect whether users performed rehabilitation exercises correctly. The model was trained using only “correctly executed” gestures, leveraging the Inflated 3D ConvNets (I3D) architecture by Carreira et al. (Carreira & Zisserman, 2017), pre-trained on the Kinetics dataset. I3D is based on Inception-v1 with batch normalization, with its filters and pooling kernels inflated into 3D. The I3D architecture was reported to achieve high accuracy on benchmark datasets (e.g., 97.9% on UCF-101, 96.9% on HMDB-51, and 74.1% on Kinetics with RGB data). The proposed system achieved average accuracy values of 90.57% ± 9.17% and 83.78% ± 7.63% using 10-Fold and Leave-One-Subject-Out (LOSO) cross-validation, respectively. In addition, the authors achieved average F1-scores of 71.78% ± 5.68% using 10-Fold and 60.64% ± 21.3% using LOSO validation. The proposed 3D-CNN successfully classified rehabilitation videos and provided feedback on exercise quality, enabling users to adjust their movement patterns.
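The difference between the two validation schemes is that LOSO holds out all recordings of one subject at a time, so the model is always tested on a person it has never seen. A minimal sketch of how LOSO splits can be generated is shown below; the function name and subject labels are hypothetical.

```python
def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

# Hypothetical dataset: five video clips recorded from three subjects.
subjects = ["s1", "s1", "s2", "s3", "s3"]
splits = list(loso_splits(subjects))
```

Lower scores under LOSO than under 10-fold validation, as reported above, are a common pattern, because random folds can place clips of the same subject in both the training and the test set.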

Ramírez-Sanz et al. (Ramírez-Sanz et al., 2023) proposed a TR system consisting of three main components: i) a Jitsi-based server for real-time data transmission; ii) an affordable, patient-side device that connects to a television, and iii) an AI-processing server using Kafka, Spark, and Detectron2 for advanced computer vision tasks. A key element of the system is the real-time video processing module used to analyze patient movements during rehabilitation exercises. This module identifies the skeletal structure of the patient using a PE model. For each video frame, the model outputs a series of tensors, with the primary tensor containing 17 key points representing major body joints (e.g., wrists, elbows, shoulders, knees, ankles). This skeletal data can be used to track patient progress over time by analyzing joint angles and movement patterns throughout the rehabilitation process. To reduce false positives in person detection, particularly in the controlled scenario where only one person (the patient) is present and centrally located, a high confidence threshold (0.99) was set. This ensures that only the patient is detected and classified, enhancing accuracy. To address privacy concerns during system development and research, an anonymization step was integrated. The system uses the res10_300 × 300_ssd_iter_140000 Caffe model in OpenCV to detect facial regions, which are then blurred using a 3 × 3 Gaussian filter. This guarantees that patients cannot be identified in the recorded videos. In production deployments, this step may be skipped to reduce computational cost, as video processing would be fully automated and only accessible by the therapist.
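To illustrate how 17-keypoint skeletal output can be turned into a joint angle for tracking progress, the sketch below computes the left elbow angle from 2D key points. It assumes the standard COCO keypoint ordering (left shoulder = 5, left elbow = 7, left wrist = 9); the helper names and the sample coordinates are hypothetical and are not taken from the authors' code.

```python
import math

# Indices in the standard COCO 17-keypoint ordering (assumed here).
L_SHOULDER, L_ELBOW, L_WRIST = 5, 7, 9

def joint_angle(a, b, c):
    """Angle at point b, in degrees, formed by the segments b-a and b-c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("coincident key points")
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))  # clamp for safety

def left_elbow_angle(keypoints):
    """keypoints: list of 17 (x, y) pixel coordinates for one person."""
    return joint_angle(keypoints[L_SHOULDER], keypoints[L_ELBOW], keypoints[L_WRIST])

# Hypothetical frame: upper arm vertical, forearm horizontal.
frame = [(0.0, 0.0)] * 17
frame[L_SHOULDER], frame[L_ELBOW], frame[L_WRIST] = (100.0, 50.0), (100.0, 150.0), (200.0, 150.0)
angle = left_elbow_angle(frame)
```

Angles computed from 2D key points depend on the camera viewpoint, so values are only comparable across sessions if the camera-to-subject geometry is kept consistent.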

Across these studies, performance metrics related to classification or detection, such as accuracy and F1-scores, should be interpreted as indicators of technical performance in recognizing or quantifying specific tasks or movement patterns. They do not in themselves demonstrate the clinical effectiveness of the underlying rehabilitation interventions.

Discussion

To the best of our knowledge, this is the first review to examine the role of AI, including ML, DL, and markerless applications, in the context of remote motor assessment and treatment. Specifically, we focused on TR applications and AI-based platforms capable of recording patients’ movements and exercises in home settings. We found that only a limited number of studies addressed these applications in the context of neurological disorders. In fact, many studies were excluded because they investigated TR combined with AI in healthy individuals (Abrar Ashraf et al., 2025; Clemente et al., 2024; Lam & Fong, 2024). For example, Clemente et al. (Clemente et al., 2024) explored the feasibility of a model for 3D PE from monocular 2D videos (MediaPipe Pose) in healthy subjects by comparing its performance to ground truth measurements. MediaPipe Pose was investigated in eight exercises typically performed in musculoskeletal physiotherapy sessions, with the ROM of the joints as the evaluated parameter. The model achieved its best performance in key upper- and lower-limb exercises, supporting the potential of monocular 2D PE as a markerless, low-cost, and accessible tool for musculoskeletal TR. Interestingly, other authors (Abrar Ashraf et al., 2025) developed a novel TR platform for remote monitoring in elderly people that processes depth video frames using a multistage methodology. Their pipeline started with noise and floor removal, followed by 3D connected component labelling to identify the human subject and extract the human silhouette. Next, skeleton joint points were estimated, and features were extracted from both the joints and the silhouette. These multimodal features were fused and input into a DL model for classification and correctness assessment. Advanced feature extraction techniques, including the Synchrosqueezing Transform and the Hilbert-Huang Transform, were employed to capture dynamic time-frequency characteristics of human actions. 
The proposed system classified nine distinct exercises and assessed the correctness of movements. The authors reported an accuracy of 91% for exercise recognition and 82% for movement correctness. In addition, we excluded other studies focusing on facial expression recognition, which were mainly conceptual, involved healthy subjects, and were not TR-based applications (Ciraolo et al., 2024; Hadjar et al., 2025; Yolcu et al., 2019; Yoonesi et al., 2025). Other excluded studies focused on the analysis of clinical data using ML algorithms (Buscarini et al., 2025), whereas our review concentrated on AI-integrated TR technologies for motor assessment and treatment.

Considering major challenges in healthcare, such as population aging and the prevalence of chronic conditions, the integration of AI into TR may prove highly valuable, enabling continuous monitoring and treatment while providing low-cost, eco-friendly, and adaptable solutions for patients and their caregivers.

Markerless Applications for Remote Monitoring

Some of the studies included in this review used 3D PE models. In general, human PE is the process of tracking the human body. This technique typically represents the human body as a skeletal model: an interconnected, tree-like framework consisting of key landmarks (such as joints and other crucial points) linked by segments representing body parts (Nucita et al., 2023). Commonly tracked joints include the shoulders, elbows, wrists, hips, knees, and ankles, while additional key points from the spine, hands, feet, and face can also be incorporated. This form of body representation relies on a relatively small number of parameters, making it highly effective for motion capture applications (Figure 2).

Figure 2.

General pipeline for markerless motion analysis. (1) Preparation of the setup and definition of the task (e.g., a standardized test such as the 10-Meter Walk Test); (2) Acquisition of body landmarks during the motor task through camera recording; (3) Post-processing and selection of the most relevant landmarks; (4) Quantitative motion analysis based on the extracted landmarks.

Human PE is generally categorized into 2D and 3D approaches. While 2D PE identifies the spatial position of body landmarks on the x–y plane, 3D PE incorporates the z-axis as well, thus capturing depth. The essential distinction lies in the additional depth dimension provided by 3D models (Clemente et al., 2024). PE methods can be further divided into monocular (single-view) and multi-view systems. Monocular methods employ only one fixed 2D camera, whereas multi-view approaches combine images from two or more cameras placed at different viewpoints to reconstruct the subject from several angles (Zheng et al., 2023). Another axis of PE classification relates to the number of individuals detected. PE algorithms can be designed either for single-person or multi-person detection. Single-person models extract the body landmarks of one subject per frame, while multi-person models must simultaneously detect and separate landmarks for several individuals, a process that is inherently more complex (Clemente et al., 2024). In the selected studies, TR typically involves a single participant per session, making the single-person model the predominant approach in this clinical context. Most of the included studies used the Microsoft Kinect to assess movements in a markerless way. This system was validated by Metcalf et al. (Metcalf et al., 2013) for measuring dynamic hand, finger, and thumb movements. The system reached an accuracy of 78% in landmark identification, with joint angle errors ranging between ±10–12° and generally low absolute errors. These results outperformed conventional manual assessments, indicating the system's potential for home-based hand motion capture in telerehabilitation settings. However, the tracking accuracy of Kinect may be influenced by the subject's orientation and distance from the sensor, noise affecting depth data, the subject's body shape, and limitations in the PE algorithm (Wang et al., 2015). 
The accuracy of Kinect v2 may also be affected by motion speed, as its relatively low capture rate (30 fps) and local fluctuations in depth sensing during movement can introduce errors (Sarbolandi et al., 2015). In this context, Timmi et al. (Timmi et al., 2018) developed a novel tracking method using Kinect v2, employing custom-made coloured markers and computer vision techniques. The authors tested the accuracy of this approach relative to a conventional Vicon motion analysis system, performing a Bland–Altman analysis of agreement. In most conditions, the limits of agreement (LOA) for marker coordinates were within 10 mm, although accuracy along the depth axis of the Kinect tended to decrease as treadmill speed increased. For knee joint angles, LOA remained within −1.8° to 1.7° for flexion and −2.9° to 1.7° for adduction during fast walking. These findings indicate that the proposed method showed good consistency with a marker-based reference system across different gait speeds, supporting its use as a cost-effective motion analysis solution for selected biomechanical applications.
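For readers unfamiliar with Bland–Altman analysis, the limits of agreement are conventionally computed as the mean difference between two methods plus or minus 1.96 standard deviations of the differences. The sketch below shows this calculation; the function name and the paired values are hypothetical and are not data from Timmi et al.

```python
import math

def bland_altman(method_a, method_b):
    """Return (bias, lower LOA, upper LOA) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two paired sequences of length >= 2")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    n = len(diffs)
    bias = sum(diffs) / n  # mean difference between methods
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))  # sample SD
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical knee flexion angles (degrees) from two systems.
markerless = [10.0, 12.0, 14.0, 16.0]
reference = [9.0, 13.0, 13.0, 17.0]
bias, lower, upper = bland_altman(markerless, reference)
```

Narrow limits centred near zero indicate that the two systems can be used interchangeably for that measurement, which is a stronger statement than a high correlation.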

AI-Based Platforms Carrying out TR

The integration of AI-based platforms in a rehabilitation setting allows for tailoring physical exercises to the specific characteristics of each patient's condition. The growing demand for remote healthcare solutions has further accelerated the development of effective TR systems to support individuals recovering from chronic conditions. In this review, we identified evidence related to AI-based TR platforms; however, a large proportion of these platforms are still experimental, with research primarily addressing technical accuracy, feasibility, and usability aspects. In contrast, Chae et al. also investigated the effectiveness of the treatment delivered with an innovative AI-based TR system. Similarly, Capecci et al. (2023) clinically evaluated patients who received TR sessions with ARC, an AI-based platform. Despite the unsupervised nature of the TR sessions, the median patient adherence to the prescribed exercises was 80%. In addition, the authors reported significant improvements in walking function (2-Minute Walk Test), fatigue (Brief Fatigue Inventory), and quality of life (EuroQol-5 Dimension). Furthermore, no side effects were reported, suggesting that ARC is a feasible and well-tolerated option for HBR in people with PD.

Current trends in AI for rehabilitation emphasize the acquisition and analysis of extensive datasets generated by wearable sensors, ambient monitoring tools, and smart home technologies (Calabrò & Mojdehdehbaher, 2025; Celesti et al., 2024). By applying predictive analytics, these continuous data streams allow for dynamic, individualized adjustments to treatment and help anticipate risks like falls or progressive functional loss (Celesti et al., 2023, 2024). For example, Bertomeu-Motos et al. developed a system that intelligently guides post-stroke patients during exercise, according to the quality of the movement executed, as part of a self-managed home rehabilitation platform. In addition, the system could offer a novel tool for clinical assessment of patient improvements throughout rehabilitation therapy. This aspect is particularly important in the context of remote monitoring, where clinicians cannot visit patients directly and therefore need objective information about their status. For example, the TR system implemented by Ramírez-Sanz et al. allowed the personalization of TR sessions through a real-time DL model for human PE (Detectron2), which tracked patients’ skeletal movements during therapy sessions. Beyond exercise monitoring, AI-enhanced TR systems can also track physiological parameters, sleep quality, and treatment compliance, thereby broadening the scope of patient management, as highlighted by Capecci et al. This can provide a comprehensive overview of the patient's overall health status.

Technical Gaps in Vision-Based AI and Markerless Systems

Regarding the technical gaps identified in the included evidence, we observed that the vision-based systems lack standardized and consistent procedures to validate PE accuracy and overall system performance. Classical computer-vision pose-estimation metrics (e.g., mean per-joint position error (MPJPE) and percentage of correct key points) are commonly used to quantify the accuracy of 2D and 3D PE algorithms. However, among the studies included in this review, only Nucita et al. provided a direct quantitative comparison between 3D and 2D PE, and this was based on correlations and intraclass correlation coefficients with goniometric joint measurements rather than on these standard pose-estimation metrics. Furthermore, most vision-based TR systems included in this review evaluated performance mainly at the level of the final task, for example, by reporting exercise-level classification accuracy or the degree of concordance with clinical rating scales, rather than by directly quantifying pose-estimation error. For example, in Kinect-based systems such as those by Capecci et al. (2018) and Dellepiane et al. (2025), the authors evaluated their methods by analysing the correlations between kinematic indices and clinical ratings or by modelling how measurement errors propagate to joint angles. Likewise, both depth-based and 2D CNN methodologies, such as the Kinect One depth-I3D system developed by Barzegar Khanghah et al. and the webcam-based Detectron2 pipeline created by Ramírez-Sanz et al., primarily evaluated performance through exercise classification accuracy. However, they did not offer quantitative error measures for individual joints.
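For reference, the two standard pose-estimation metrics mentioned above can be sketched as follows. MPJPE is the mean Euclidean distance between predicted and reference joint positions, and the percentage of correct key points (PCK) is the fraction of joints whose error falls within a threshold. In practice the PCK threshold is usually normalized by a body dimension; an absolute threshold is used here for simplicity, and all names and values are hypothetical.

```python
import math

def mpjpe(pred, truth):
    """Mean per-joint position error between matched joint lists."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("need two non-empty joint lists of equal length")
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(pred)

def pck(pred, truth, threshold):
    """Fraction of joints whose position error is within the threshold."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("need two non-empty joint lists of equal length")
    hits = sum(1 for p, t in zip(pred, truth) if math.dist(p, t) <= threshold)
    return hits / len(pred)

# Hypothetical 3D joints in metres: reference system vs. markerless estimate.
truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.03), (0.0, 1.04, 0.0), (0.3, 0.4, 1.0)]
error = mpjpe(pred, truth)
correct = pck(pred, truth, threshold=0.05)
```

Reporting such per-joint metrics against a reference system would make vision-based TR pipelines directly comparable, independently of the downstream classification task.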

System-level performance reporting was also limited. Apart from the system proposed by Ramírez-Sanz et al., which reported the time required to process each frame and the amount of data needed to sustain a 10-fps video stream, the vision-based systems did not provide any information on processing speed or use of computing resources. Consequently, the available information does not allow us to determine whether the proposed pipelines can be reliably deployed on standard home devices, mobile platforms, or in low-bandwidth settings. Beyond raw processing speed, the way AI-generated feedback is integrated into patient and therapist workflows was also only partially described. The majority of the studies included in this review refer to “real-time” or “online” feedback, as in the architecture proposed by Ramírez-Sanz et al. However, metrics regarding end-to-end latency or user–machine feedback delay (i.e., the time between a patient's movement, AI processing, and the delivery of corrective feedback to the user or the therapist) were rarely reported. As a result, the temporal characteristics of these TR systems, and their potential impact on rehabilitation responsiveness remain largely undocumented in the current literature.

From a reproducibility standpoint, hardware and camera-configuration details were often incomplete. While the type of sensor (e.g., Kinect v2, Kinect One, ZED stereo, webcam model) was usually specified, essential information such as camera calibration procedures, image resolution, frame rate, camera-to-subject geometry, and illumination control was rarely reported. Some vision-based studies (Barzegar Khanghah et al., 2023; Capecci et al., 2018; Dellepiane et al., 2025) mentioned the potential impact of lighting conditions but did not provide standardized protocols for environmental control, thereby making it difficult to compare the different experimental setups.

Across the included studies, the reporting of AI algorithms was often limited and rarely addressed comparative performance, hyperparameter optimization, or computational complexity systematically (see Supplementary Table 1). AI components were mainly used to support activity recognition, exercise-quality classification, or automatic speech recognition, and typically consisted of a single selected model or hybrid model rather than a family of alternatives evaluated under the same conditions. The algorithms ranged from conventional ML models (e.g., SVM, random forests, hidden semi-Markov models, Gaussian process–based classifiers) to DL architectures such as CNNs, CNN-LSTMs, 3D CNNs (I3D), and time-series CNNs for inertial sensor data.

Only a few studies reported some form of comparative analysis. For instance, Chae et al. compared CNN models trained on accelerometer-only, gyroscope-only, and combined accelerometer–gyroscope signals and compared personalized with population-level models. Personalized models using both modalities achieved the highest recognition accuracy in chronic stroke survivors. Similarly, Bertomeu-Motos et al. compared three input configurations (seven-joint, five-joint, and EMG-only trajectories) and selected the five-joint model as a compromise between robustness and performance, while the EMG-only configuration yielded poor accuracy, especially in post-stroke patients. In the vision-based domain, Ramírez-Sanz et al. empirically compared four COCO keypoint R-CNN variants implemented in Detectron2 and chose the keypoint_rcnn_R_50_FPN_3x model based on a favourable trade-off between loading and processing time on their big-data pipeline. Barzegar Khanghah et al. trained a depth-based I3D model to classify exercises as correctly or incorrectly executed and evaluated its performance using both 10-fold and LOSO cross-validation. These authors explicitly describe a hyperparameter optimization procedure, using grid search over batch size, learning rate, and number of epochs, together with dropout and early stopping.
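The difference between 10-fold and leave-one-subject-out (LOSO) cross-validation matters because random folds can place recordings from the same patient in both the training and the test set. The following sketch shows the LOSO splitting logic only; the subject labels are hypothetical, and it does not reproduce the procedure of any included study.

```python
def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) so that all
    recordings of one subject are held out together in each fold."""
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical subject label for each recorded exercise repetition.
recording_subjects = ["p1", "p1", "p2", "p3", "p3", "p3"]

for subject, train_idx, test_idx in loso_splits(recording_subjects):
    print(f"held out {subject}: train={train_idx} test={test_idx}")
```

Because each fold tests on a person the model has never seen, LOSO estimates are generally closer to the performance that can be expected for a new patient.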

Furthermore, the statistical validation of AI performance was generally based on point estimates (e.g., accuracy, F1-score), and confidence intervals or formal tests to compare alternative models or devices were rarely reported. Similarly, we noticed a lack of Bland–Altman analyses in the included evidence to assess agreement between AI-derived measures and reference instruments, which could help quantify bias and limits of agreement in clinical applications.
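For readers less familiar with these analyses, the sketch below computes the Bland–Altman bias and 95% limits of agreement between two measurement methods, and a Wilson score confidence interval for a classification accuracy. The joint-angle values and the 45-of-50 accuracy are invented for illustration and do not come from the included studies.

```python
import math
import statistics


def bland_altman(method_a, method_b, z=1.96):
    """Return (bias, lower limit, upper limit) of agreement for paired
    measurements, using the mean and sample SD of the differences."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - z * sd, bias + z * sd


def wilson_interval(correct, total, z=1.96):
    """Wilson score confidence interval for a proportion such as accuracy."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("invalid counts")
    p = correct / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


# Hypothetical elbow flexion angles (degrees): AI-derived vs goniometer.
ai_angles = [88.0, 92.5, 101.0, 79.5, 95.0]
goniometer_angles = [90.0, 91.0, 99.0, 82.0, 94.0]
bias, lower, upper = bland_altman(ai_angles, goniometer_angles)
print(f"bias={bias:.2f} deg, limits of agreement=({lower:.2f}, {upper:.2f}) deg")

low, high = wilson_interval(correct=45, total=50)
print(f"accuracy=0.90, 95% CI=({low:.3f}, {high:.3f})")
```

With small test sets, as in most of the included studies, the interval around a point estimate can be wide, which is why reporting it alongside accuracy or F1-score is informative.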

Benefits and Challenges for the Adoption in Clinical Practice

The evidence summarized in this review should still be regarded as preliminary, especially when considering its translation into everyday clinical practice. Although initial trials report promising outcomes, important questions persist, particularly about the sustainability of AI-based tools over time and their applicability across different clinical populations. To confirm the clinical value of AI-supported TR, future research will need to rely on broad, long-term studies involving heterogeneous patient groups to strengthen external validity. Although conventional rehabilitation has consistently proven effective, it often relies on intermittent monitoring, potentially missing subtle yet clinically meaningful changes in a patient's condition. In contrast, AI-assisted systems offer the potential for continuous, personalized monitoring and intervention. However, this transition brings substantial challenges. A critical challenge relates to digital inequities: differences in device availability, internet reliability, and digital skills may widen current gaps in healthcare access. These infrastructural limitations must be addressed to ensure equitable implementation of AI-driven rehabilitation systems. Moreover, adherence to the use of devices must also be considered. This aspect goes hand in hand with system usability and technology acceptance. However, few authors have investigated this important issue. Ongoing remote monitoring depends mainly on patients’ regular and consistent use of the technological devices provided. Formal usability questionnaires were reported in the ARC platform (Capecci et al., 2023), which achieved high System Usability Scale scores, and in the big-data architecture proposed by Ramírez-Sanz et al., where Telehealth Usability Questionnaire scores suggested good perceived ease of use and satisfaction. 
However, most other studies provided little or no insight into how patients perceived remote supervision, AI-generated feedback, or the overall burden of interacting with the technology. Beyond quantitative adherence rates, very few data are available on barriers and facilitators from the patient's point of view, such as fear of technology, perceived usefulness, changes in motivation, or the impact on autonomy in daily life. In this regard, providing a patient-centric and personalized approach can be an important factor in determining the success of TR and remote monitoring interventions in reducing acute care use (Srivastava et al., 2019). Another important aspect in this field is related to the patients’ training on how to use the device, which will likely also need to be personalized and, at times, repeated. For example, the TR session or remote monitoring intervention can be personalized by using individual data to determine alert thresholds (Thomas et al., 2021). In this regard, the caregiver's role is crucial: by acting as a co-therapist and being actively involved in the rehabilitation process, the caregiver can facilitate the patient's engagement in the use of TR devices. In this context, the way AI-generated feedback is integrated into the therapeutic workflow is also clinically relevant. In most available systems, it is not always clear whether feedback is delivered synchronously during the exercise or mainly through asynchronous reports that clinicians review after the session. From a clinical perspective, the timing and modality of feedback are likely to influence motor learning, patient engagement, and the possibility for therapists to correct compensatory strategies in real time. Future TR platforms should therefore not only ensure technical feasibility, but also explicitly consider how feedback delays, communication patterns, and supervision modes shape the therapeutic interaction. 
Furthermore, while advanced ML models excel in accuracy, they frequently lack clinical interpretability, an essential component of effective and transparent decision-making. This stands in contrast to conventional rehabilitation, where clinicians observe patients directly, instilling trust and ensuring clear clinical judgment.

From a clinical perspective, technical limitations (e.g., scarce use of pose-estimation metrics, limited reporting of processing speed and feedback latency, and incomplete descriptions of hardware settings and setups) have direct consequences for how AI-based TR systems can be used in practice. The lack of standardized pose-estimation metrics and detailed statistical validation makes it difficult to determine whether small changes in joint angles or movement quality reflect true clinical improvement, which in turn limits the use of these measures as reliable rehabilitation biomarkers or as decision-support tools at the individual patient level. Similarly, incomplete reporting of hardware configurations, processing speed, and feedback latency prevents clinicians from knowing whether the proposed systems can deliver feedback with sufficient temporal precision and robustness in typical home environments, where lighting, camera positioning, and connectivity are often suboptimal. Better characterizing these aspects will help ensure that the promising performance reported in feasibility studies can be translated into consistent support for day-to-day clinical decision-making. Achieving this will require close collaboration between engineers, computer scientists, and clinicians to design AI tools that combine technical sophistication with usability, affordability, and clinical practicality.
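One common way to judge whether a small change exceeds measurement error is the minimal detectable change (MDC), derived from the standard error of measurement (SEM). The sketch below applies the usual formulas, SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM; the standard deviation and ICC values are hypothetical and are not results from the included studies.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), with ICC as the reliability coefficient."""
    if sd < 0 or not 0.0 <= icc <= 1.0:
        raise ValueError("sd must be non-negative and icc within [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sd, icc, z=1.96):
    """MDC at the confidence level implied by z (1.96 for 95%)."""
    return z * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


# Hypothetical values: between-subject SD of 8 degrees, ICC of 0.90.
mdc95 = minimal_detectable_change(sd=8.0, icc=0.90)
print(f"MDC95 = {mdc95:.1f} degrees")
```

Under these assumed values, a change in joint angle smaller than the printed MDC95 could not be distinguished from measurement noise at the individual patient level.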

Moreover, current evidence suggests that AI-based MMA has primarily been administered to relatively stable patients with mild to moderate motor impairments who can perform structured tasks, such as reaching movements, sit-to-stand transitions, or simple upper-limb exercises, in a home environment. These include, for example, stroke survivors and people with Parkinson's disease (Bertomeu-Motos et al., 2023; Capecci et al., 2018; Dellepiane et al., 2025; Ramírez-Sanz et al., 2023), patients with fatigue-related deficits (Capecci et al., 2023), and other conditions with altered motor function (Hartman et al., 2022; Nucita et al., 2023). In acute or medically unstable patients, existing AI research has focused more on enhancing diagnosis and prognosis (e.g., by identifying key recovery factors and optimizing patient outcomes; Bonanno et al., 2025) rather than on unsupervised home-based TR, which may be unsafe or difficult to implement.

Regarding clinical outcomes, most studies have focused on task-specific performance metrics (e.g., movement smoothness, range of motion, number and quality of repetitions) and on correlations between AI-derived kinematic indices and standard clinical scales. These metrics could complement the traditional assessment by providing information on movement quality, training intensity, and adherence over time. Future work should clarify which digital biomarkers derived from MMA are most responsive to change and clinically meaningful for different neurological populations.
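As a simple example of how one such metric can be obtained from pose-estimation output, the sketch below computes a joint angle from three keypoints and derives the range of motion over a repetition. The keypoint coordinates are invented, and the code illustrates the geometry only, not the feature-extraction pipeline of any included study.

```python
import math


def joint_angle_deg(proximal, joint, distal):
    """Angle (degrees) at `joint` between the segments pointing to the
    proximal and distal keypoints, for 2D or 3D coordinates."""
    v1 = [p - j for p, j in zip(proximal, joint)]
    v2 = [d - j for d, j in zip(distal, joint)]
    norm1, norm2 = math.hypot(*v1), math.hypot(*v2)
    if norm1 == 0 or norm2 == 0:
        raise ValueError("keypoints must not coincide with the joint")
    cosine = sum(a * b for a, b in zip(v1, v2)) / (norm1 * norm2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def range_of_motion(angles_deg):
    """Difference between the largest and smallest angle in a repetition."""
    return max(angles_deg) - min(angles_deg)


# Hypothetical 2D keypoints (shoulder, elbow, wrist) for three video frames.
frames = [
    ((0.0, 1.0), (0.0, 0.0), (1.0, 0.0)),
    ((0.0, 1.0), (0.0, 0.0), (1.0, -1.0)),
    ((0.0, 1.0), (0.0, 0.0), (0.0, -1.0)),
]
elbow_angles = [joint_angle_deg(*frame) for frame in frames]
print([round(angle, 1) for angle in elbow_angles])
print(f"range of motion: {range_of_motion(elbow_angles):.1f} degrees")
```

Angles computed from 2D keypoints depend on the camera viewpoint, which is one reason the camera-to-subject geometry needs to be reported.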

According to the International Classification of Functioning, Disability and Health (ICF) framework, the clinical outcomes in the reviewed studies focused primarily on activity-level measures, such as upper-limb tests (e.g., the Wolf Motor Function Test, WMFT) (Chae et al., 2020) and walking capacity tests (e.g., the 2-Minute Walk Test) (Capecci et al., 2023). Capecci et al. also incorporated outcomes related to participation and broader health status, such as quality-of-life indices (e.g., EQ-5D) and fatigue, anxiety, or depression scales. Environmental and personal factors were addressed more indirectly, mainly through usability questionnaires (e.g., SUS, TUQ), adherence metrics, and digital access constraints (Capecci et al., 2023; Chae et al., 2020; Ramírez-Sanz et al., 2023). In contrast, AI-derived kinematic metrics (e.g., movement smoothness, range of motion, number and quality of repetitions) could be mapped to the body functions domain.

Finally, environmental conditions play a critical role in the reliability of markerless systems. Most vision-based setups were tested in controlled environments with sufficient lighting, limited occlusions, and predefined camera-to-subject distances and viewpoints. When translating these solutions to routine practice, whether in outpatient clinics or at home, it will be important to ensure adequate space, stable illumination, and simple positioning guidelines for cameras or sensors, as well as to provide clear instructions to patients and caregivers. Defining and standardizing these environmental requirements is likely to improve signal quality, reduce data loss, and enhance the robustness of AI-based MMA in real-world TR settings.

Ethical, Regulatory, and Educational Considerations

Integrating AI into rehabilitation requires rethinking and redefining the roles and responsibilities of healthcare professionals. Such a transition is not limited to learning how to use software but also demands a deeper move toward clinical reasoning guided by data. Despite the perceived benefits of AI in rehabilitation, there is a lack of understanding and uptake of AI among physical therapy professionals. This knowledge is crucial for the successful implementation of AI applications in the rehabilitation field, particularly given the increasing emergence of technological advancements that have the potential to enhance healthcare delivery. In the study by Shawli et al. (2024), physiotherapists agreed that AI can reduce the workload, time, and effort required by physical therapists; however, they emphasized that AI will not replace their role. While reducing physical workload can enhance efficiency and job satisfaction among therapists, many participants expressed concerns about relying solely on AI for clinical decisions. Participants stressed that AI cannot adequately capture patients’ psychological and social dimensions, which remain fundamental to rehabilitation. Similar concerns were found in a study among radiation oncology professionals, where 77% agreed that human input remains essential for refining AI-driven decisions (Wong et al., 2021). Despite these concerns, some participants acknowledged AI's potential in delivering accurate diagnoses and treatment plans. This aligns with findings from general practitioner studies, which view AI as a tool to enhance diagnostic accuracy and support clinicians in their roles (Alsobhi et al., 2022; Buck et al., 2022). 
However, in most of the current studies, the role of therapists is described only briefly, and there is little information on how clinicians are trained to interpret AI-derived metrics, understand their limitations (e.g., measurement error, latency, algorithmic uncertainty), and integrate them into their clinical reasoning. Future studies should therefore assess how clinicians interpret AI-based outputs and how these technologies can be appropriately incorporated into everyday rehabilitation practice.

Ethical and regulatory issues are also central. Relying on AI for personalized treatment recommendations raises concerns related to data bias, transparency of algorithms, patient autonomy, and responsibility for clinical decisions (Díaz-Rodríguez et al., 2023; Mennella et al., 2024). Strong data protection measures, such as end-to-end encryption, multi-factor authentication, secure data storage, and routine audits, are critical in reducing privacy threats and maintaining trust. To ensure equity and maintain patient confidence, AI-based TR platforms must be designed with fairness, openness, and accountability at their core.

The Human Element: Caregivers and the Therapeutic Relationship

Another underexplored aspect, addressed notably only by Nucita et al., is the role of the caregiver. Several studies highlight the caregiver's role as a co-therapist in TR, providing support to physiotherapists during both real-time and remote-delayed sessions (Calabrò et al., 2023; Dulawan et al., 2024; Sun et al., 2023). Their participation is essential for the success of home rehabilitation, particularly for patients who need help in performing exercises or maintaining adherence to prescribed protocols (Calabrò et al., 2023).

More broadly, the shift from traditional to AI-enhanced rehabilitation compels a deeper reflection on the very nature of “care” in digital environments. Technological proficiency alone cannot capture the nuanced dynamics of therapeutic practice and alliance. Core aspects such as the emotional bond and interpersonal relationship established in face-to-face therapy, especially the therapeutic alliance, remain difficult for AI systems to reproduce (Dolev & Zilcha-Mano, 2019; Kornhaber et al., 2016). Therefore, evaluations of AI-based TR must account not only for clinical outcomes but also for these intangible, yet deeply impactful, dimensions of care. Accordingly, incorporating patients’ active participation and constant feedback between patient and clinician is likely to improve engagement.

Limitations

The limitations of the included studies and of the scoping review methodology must be acknowledged. Many findings have limited generalizability due to small sample sizes. In addition, clinical characterization of the study populations was often restricted by incomplete reporting in the primary articles; diagnostic criteria, disease severity, and detailed demographic or clinical scale data were absent or only briefly described, thereby reducing the clinical interpretability and between-study comparability of the results. Moreover, several studies (Bertomeu-Motos et al., 2023; Hartman et al., 2022; Ramírez-Sanz et al., 2023) evaluated only methodological aspects, overlooking the potential effectiveness of the innovative approaches. Furthermore, most AI-based TR studies were small feasibility or pilot trials without a conventional rehabilitation control group, which prevents drawing firm conclusions about the relative efficiency, safety, and usability of AI-based interventions compared with standard care. With respect to our review, restricting the search to English-language publications may have led to the exclusion of relevant evidence, while the absence of statistical analyses prevented a quantitative appraisal of the literature. In line with the scoping review design, we did not perform formal statistical analyses to aggregate results across studies, as our primary aim was to map and qualitatively describe the range of AI-based TR and remote monitoring approaches. In addition, the marked heterogeneity of the included evidence, in terms of sample size and characteristics, study design, interventions, and technological equipment, indicates that a systematic review with meta-analysis would currently be difficult to conduct and of limited clinical interpretability. Consequently, this review provides a broad qualitative synthesis of the available evidence, offering meaningful insights into the role of AI technologies in the field of TR and remote monitoring.

In the future, TR studies using MMA and AI-based technologies should adopt more rigorous and standardized reporting of pose-estimation metrics, provide detailed descriptions of hardware and environmental conditions, and specify the TR modality (synchronous or asynchronous) and feedback latency. In addition, future work should include comparisons of 2D and 3D pipelines within the same experimental setup and clinical population, allowing quantitative assessment of the trade-off between accuracy, robustness, computational cost, and deployability on low-cost devices. Benchmark datasets for TR scenarios, annotated with both clinical scores and ground-truth kinematics, would greatly facilitate such comparisons and accelerate progress toward standardized evaluation protocols.

For non-visual AI-based TR modalities such as IMU-based, accelerometry-based, or speech-based systems, similar principles apply: reporting of sensor calibration procedures, environmental conditions, and inference latency will be essential to ensure reproducibility and to enable fair comparisons between alternative hardware configurations and algorithms. Together, these improvements would address the current gaps highlighted by our review and provide a more solid empirical basis for home-based neurorehabilitation using AI-based technologies.

Conclusion

In conclusion, research in the field of AI and TR, including remote monitoring, remains in a preliminary stage, particularly regarding its application in patients with neurological disorders. This is related to the marked heterogeneity in terms of patient populations, outcome measures, and extracted features. Furthermore, AI-based TR platforms and applications focused on physical exercise seem to be largely experimental when compared to those aimed at remote movement assessment. Despite these limitations, this review highlighted considerable potential for future development, both in terms of technological advancements and in relation to patient outcomes and effects. In particular, AI-based platforms could be used to dynamically adapt the level and type of exercises in real time according to the patient's movement performance and difficulties, allowing a more finely tuned progression of training intensity and task complexity. In parallel, MMA in a TR setting could provide continuous, quantitative information on motor status, offering an objective complement to clinical scales that are often difficult to administer remotely and may be insensitive to subtle changes over time. Beyond innovation, it is important to emphasize that TR and remote monitoring have become essential, as they allow clinicians to maintain close contact with patients despite the absence of traditional face-to-face interaction. For these reasons, the integration of AI as a clinical support tool, providing quantitative and objective data, could represent a novel approach to enhancing the personalization of care.

Supplemental Material

Supplemental material, sj-docx-1-nre-10.1177_10538135261426537 for Markerless Motion Analysis and AI-Based Platforms for Neurological Telerehabilitation: A Scoping Review by Mirjam Bonanno, Giovanni Lonia, Sepehr Mojdehdehbaher, Antonio Celesti and Rocco Salvatore Calabrò in NeuroRehabilitation

Supplemental material, sj-docx-2-nre-10.1177_10538135261426537 for Markerless Motion Analysis and AI-Based Platforms for Neurological Telerehabilitation: A Scoping Review by Mirjam Bonanno, Giovanni Lonia, Sepehr Mojdehdehbaher, Antonio Celesti and Rocco Salvatore Calabrò in NeuroRehabilitation

Footnotes

Acknowledgements

MB is a PhD student enrolled in the National PhD in Artificial Intelligence, XL cycle, course on Health and life sciences, organized by Università Campus Bio-Medico di Roma. GL is a PhD student enrolled in the National PhD in Artificial Intelligence, XL cycle, course on AI for society, organized by Università degli Studi di Pisa.

Informed Consent Statement

Not applicable.

Institutional Review Board Statement

Not applicable.

Author Contributions

Conceptualization, MB and RSC; methodology, MB, GL, and RSC; validation, all authors; investigation, MB, GL and SM; resources, R.S.C.; data curation, MB, GL, SM; writing—original draft preparation, MB, GL, SM; writing—review and editing, RSC and AC; visualization, SM and AC; supervision, R.S.C and AC; project administration, R.S.C.; funding acquisition, R.S.C. All authors have read and agreed to the published version of the manuscript.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and publication of this article: This study was supported by Current Research Funds 2025 RRC-2025-23686388, Ministry of Health, Italy.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

Not applicable.

Supplemental Material

Supplemental material for this article is available online.

References

1. Abrar Ashraf; Najam; Sadiq; Algamdi; Aljuaid; Rahman; Jalal (2025). A novel telerehabilitation system for physical exercise monitoring in elderly healthcare. IEEE Access, 13, 9120–9133. https://doi.org/10.1109/ACCESS.2025.3526710

2. Alsobhi; Khan; Chevidikunnan, M. F.; Basuodan; Shawli; Neamatallah (2022). Physical therapists’ knowledge and attitudes regarding artificial intelligence applications in health care and rehabilitation: Cross-sectional study. Journal of Medical Internet Research, 24(10), e39565. https://doi.org/10.2196/39565

3. Asgharzadeh Chamleh, M. R.; Afkanpour; Tehrany Dehkordy; Norouzkhani; Aalaei (2025). Game-based telerehabilitation in neurological disorders: A systematic review of features, opportunities and challenges. Disability and Rehabilitation: Assistive Technology, 1–15. https://doi.org/10.1080/17483107.2025.2450010

4. Barzegar Khanghah; Fernie; Roshan Fekr (2023). Design and validation of vision-based exercise biofeedback for tele-rehabilitation. Sensors, 23(3), Article 3. https://doi.org/10.3390/s23031206

5. Bertomeu-Motos; Ezquerro; Barios, J. A.; Catalán, J. M.; Blanco-Ivorra; Martinez-Pascual; Garcia-Aracil (2023). Feasibility of an intelligent home-based neurorehabilitation system for upper extremity mobility assessment. IEEE Sensors Journal, 23(24), 31117–31124. https://doi.org/10.1109/JSEN.2023.3326531

6. Bonanno; Cardile; Liuzzi; Celesti; Micali; Corallo; Quartarone; Tomaiuolo; Calabrò, R. S. (2025). Can artificial intelligence improve the diagnosis and prognosis of disorders of consciousness? A scoping review. Frontiers in Artificial Intelligence, 8, 1608778. https://doi.org/10.3389/frai.2025.1608778

7. Brennan; Tindall; Theodoros; Brown; Campbell; Christiana; Smith; Cason; Lee (2010). A blueprint for telerehabilitation guidelines. International Journal of Telerehabilitation, 2(2), 31–34. https://doi.org/10.5195/ijt.2010.6063

8. Buck; Doctor; Hennrich; Jöhnk; Eymann (2022). General practitioners’ attitudes toward artificial intelligence-enabled systems: Interview study. Journal of Medical Internet Research, 24(1), e28916. https://doi.org/10.2196/28916

9. Buscarini; Romano; Cocco, E. S.; Damiani; Pournajaf; Franceschini; Infarinato (2025). Enhancing patient rehabilitation outcomes: Artificial intelligence-driven predictive modeling for home discharge in neurological and orthopedic conditions. Journal of NeuroEngineering and Rehabilitation, 22(1), 117. https://doi.org/10.1186/s12984-025-01654-4

10. Calabrò, R. S.; Bonanno; Torregrossa; Cacciante; Celesti; Rifici; Tonin; De Luca; Quartarone (2023). Benefits of telerehabilitation for patients with severe acquired brain injury: Promising results from a multicenter randomized controlled trial using nonimmersive virtual reality. Journal of Medical Internet Research, 25, e45458. https://doi.org/10.2196/45458

11. Calabrò, R. S.; Mojdehdehbaher (2025). AI-Driven telerehabilitation: Benefits and challenges of a transformative healthcare approach. AI, 6(3), Article 3. https://doi.org/10.3390/ai6030062

12. Capecci; Ceravolo, M. G.; Ferracuti; Grugnetti; Iarlori; Longhi; Romeo; Verdini (2018). An instrumental approach for monitoring physical exercises in a visual markerless scenario: A proof of concept. Journal of Biomechanics, 69, 70–80. https://doi.org/10.1016/j.jbiomech.2018.01.008

13. Capecci; Cima; Barbini, F. A.; Mantoan; Sernissi; Lai; Fava; Tagliapietra; Ascari; Izzo, R. N.; Leombruni, M. E.; Casoli; Hibel; Ceravolo, M. G. (2023). Telerehabilitation with ARC intellicare to cope with motor and respiratory disabilities: Results about the process, usability, and clinical effect of the “ricominciare” pilot study. Sensors (Basel, Switzerland), 23(16), 7238. https://doi.org/10.3390/s23167238

14. Carreira; Zisserman (2017). Quo Vadis, action recognition? A new model and the kinetics dataset. In 2017 IEEE Conference on computer vision and pattern recognition (CVPR) (pp. 4724–4733). https://doi.org/10.1109/CVPR.2017.502

15. Celesti; Dell’Acqua; Lonia; Ciraolo; Celesti; Fazio; Villari; Bonanno; Calabrò, R. S. (2024). Leveraging audio biomarkers for enriching the tele-monitoring of patients. In 2024 IEEE Symposium on computers and communications (ISCC) (pp. 1–6). https://doi.org/10.1109/ISCC61673.2024.10733631

16. Celesti; Fazio; Ruggeri; Celesti; Villari; Bonanno; Calabrò, R. S. (2023). Adopting machine learning-based pose estimation as digital biomarker in motor tele-rehabilitation. In 2023 IEEE Symposium on computers and communications (ISCC) (pp. 1–4). https://doi.org/10.1109/ISCC58397.2023.10218121

17. Chae, S. H.; Kim; Lee, K.-S.; Park, H.-S. (2020). Development and clinical evaluation of a web-based upper limb home rehabilitation system using a smartwatch and machine learning model for chronic stroke survivors: Prospective comparative study. JMIR mHealth and uHealth, 8(7), e17216. https://doi.org/10.2196/17216

18. Ciraolo; Fazio; Calabrò, R. S.; Villari; Celesti (2024). Facial expression recognition based on emotional artificial intelligence for tele-rehabilitation. Biomedical Signal Processing and Control, 92, 106096. https://doi.org/10.1016/j.bspc.2024.106096

19. Clemente; Chambel; Silva, D. C. F.; Montes, A. M.; Pinto, J. F.; Silva, H. P. D. (2024). Feasibility of 3D body tracking from monocular 2D video feeds in musculoskeletal telerehabilitation. Sensors, 24(1). https://doi.org/10.3390/s24010206

20. CMU-Perceptual-Computing-Lab/openpose. (2025). [C++]. CMU-Perceptual-Computing-Lab. https://github.com/CMU-Perceptual-Computing-Lab/openpose (Original work published 2017)

21. Colyer, S. L.; Evans; Cosker, D. P.; Salo, A. I. T. (2018). A review of the evolution of vision-based motion analysis and the integration of advanced computer vision methods towards developing a markerless system. Sports Medicine - Open, 4(1), 24. https://doi.org/10.1186/s40798-018-0139-y

22. Cotton, R. J.; DeLillo; Cimorelli; Shah; Peiffer, J. D.; Anarwala; Abdou; Karakostas (2023). Markerless motion capture and biomechanical analysis pipeline (No. arXiv:2303.10654). arXiv. https://doi.org/10.48550/arXiv.2303.10654

23. DeepLabCut. (n.d.). GitHub. Retrieved August 21, 2025, from https://github.com/DeepLabCut

24. Dellepiane, S. G.; Ferraro; Baffigo; Simonini (2025). Signal processing and feature extraction in markerless telerehabilitation. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 33, 911–924. https://doi.org/10.1109/TNSRE.2025.3541153

25. Díaz-Rodríguez; Del Ser; Coeckelbergh; López de Prado; Herrera-Viedma; Herrera (2023). Connecting the dots in trustworthy Artificial Intelligence: From AI principles, ethics, and key requirements to responsible AI systems and regulation. Information Fusion, 99, 101896. https://doi.org/10.1016/j.inffus.2023.101896

26. Dolev; Zilcha-Mano (2019). The role of the therapeutic relationship in the association between interpersonal behaviors and outcome: Comparison of two competing models. Psychotherapy Research: Journal of the Society for Psychotherapy Research, 29(5), 553–564. https://doi.org/10.1080/10503307.2017.1422215

27. Dulawan, J. A. T.; Ignacio, S. D.; Ang-Muñoz, C. D.; Carlos, F. A. B.; Leochico, C. F. D. (2024). Caregivers’ perceptions and willingness to utilize telerehabilitation for outpatient consultation and therapy for pediatric patients in a COVID-referral center in a developing country: A cross-sectional study. Acta Medica Philippina, 58(20), 20–28. https://doi.org/10.47895/amp.v58i20.8713

28. Eriksen, M. B.; Frandsen, T. F. (2018). The impact of patient, intervention, comparison, outcome (PICO) as a search strategy tool on literature search quality: A systematic review. Journal of the Medical Library Association: JMLA, 106(4), 420–431. https://doi.org/10.5195/jmla.2018.345

29. Federico; Cacciante; De Icco; Gatti; Jonsdottir; Pagliari; Franceschini; Goffredo; Cioeta; Calabrò, R. S.; Maistrello; Turolla; Kiper; on behalf of RIN_TR_Group (2023). Telerehabilitation for stroke: A personalized multi-domain approach in a pilot study. Journal of Personalized Medicine, 13(12), 1692. https://doi.org/10.3390/jpm13121692

30. Goffredo; Baglio; De Icco; Proietti; Maggioni; Turolla; Pournajaf; Jonsdottir; Zeni; Federico; Cacciante; Cioeta; Tassorelli; Franceschini; Calabrò, R. S.; RIN_TR_Group (2023). Efficacy of non-immersive virtual reality-based telerehabilitation on postural stability in Parkinson’s disease: A multicenter randomized controlled trial. European Journal of Physical and Rehabilitation Medicine, 59(6), 689–696. https://doi.org/10.23736/S1973-9087.23.07954-6

31. Google-ai-edge/mediapipe. (2025). [C++]. google-ai-edge. https://github.com/google-ai-edge/mediapipe (Original work published 2019)

32. Hadjar; Hemmje (2025). Empowering recovery: The T-Rehab system’s semi-immersive approach to emotional and physical well-being in tele-rehabilitation. Electronics, 14(5), 852. https://doi.org/10.3390/electronics14050852

33. Hartman; Elkhadrawi; McKendry; Akcakaya; Bendixen, R. M. (2022). Towards remote monitoring of dynamic arm supports for individuals with Duchenne muscular dystrophy using 3D accelerometry. Expert Systems with Applications, 205, 117712. https://doi.org/10.1016/j.eswa.2022.117712

34. Meng; Wei; Guo; Yang; Wang (2024). Proposal and validation of a new approach in tele-rehabilitation with 3D human posture estimation: A randomized controlled trial in older individuals with sarcopenia. BMC Geriatrics, 24(1), 586. https://doi.org/10.1186/s12877-024-05188-7

35. Kornhaber; Walsh; Duff; Walker (2016). Enhancing adult therapeutic interpersonal relationships in the acute health care setting: An integrative review. Journal of Multidisciplinary Healthcare, 9, 537–546. https://doi.org/10.2147/JMDH.S116957

36. Krzyzaniak; Cardona; Peiris; Michaleff, Z. A.; Greenwood; Clark; Scott, A. M.; Glasziou (2023). Telerehabilitation versus face-to-face rehabilitation in the management of musculoskeletal conditions: A systematic review and meta-analysis. Physical Therapy Reviews, 28(2), 71–87. https://doi.org/10.1080/10833196.2023.2195214

37. Lam, W. W. T.; Fong, K. N. K. (2024). Validity and reliability of upper limb kinematic assessment using a markerless motion capture (MMC) system: A pilot study. Archives of Physical Medicine and Rehabilitation, 105(4), 673–681.e2. https://doi.org/10.1016/j.apmr.2023.10.018

38. Lam, W. W. T.; Tang, Y. M.; Fong, K. N. K. (2023). A systematic review of the applications of markerless motion capture (MMC) technology for clinical measurement in rehabilitation. Journal of NeuroEngineering and Rehabilitation, 20(1), 57. https://doi.org/10.1186/s12984-023-01186-9

39. Maggio, M. G.; De Luca; Manuli; Calabrò, R. S. (2020). The five ‘W’ of cognitive telerehabilitation in the COVID-19 era. Expert Review of Medical Devices, 17(6), 473–475. https://doi.org/10.1080/17434440.2020.1776607

40. Mennella; Maniscalco; De Pietro; Esposito (2024). Ethical and regulatory challenges of AI technologies in healthcare: A narrative review. Heliyon, 10(4), e26297. https://doi.org/10.1016/j.heliyon.2024.e26297

41.

Metcalf

C. D.

Robinson

Malpass

A. J.

Bogle

T. P.

Dell

T. A.

Harris

Demain

S. H.

(2013). Markerless motion capture and measurement of hand kinematics: Validation and application to home-based upper limb rehabilitation. IEEE Transactions on Bio-Medical Engineering, 60(8), 2184–2192. https://doi.org/10.1109/TBME.2013.2250286

42.

Mulfari

La Placa

Rovito

Celesti

Villari

(2022). Deep learning applications in telerehabilitation speech therapy scenarios. Computers in Biology and Medicine, 148, 105864. https://doi.org/10.1016/j.compbiomed.2022.105864

43.

Muñoz-Tomás

M. T.

Burillo-Lafuente

Vicente-Parra

Sanz-Rubio

M. C.

Suarez-Serrano

Marcén-Román

Franco-Sierra

M. Á.

(2023). Telerehabilitation as a therapeutic exercise tool versus face-to-face physiotherapy: A systematic review. International Journal of Environmental Research and Public Health, 20(5), 4358. https://doi.org/10.3390/ijerph20054358

44.

Nucita

Iannizzotto

Perina

Romano

Fabio

R. A.

(2023). Telerehabilitation with computer vision-assisted markerless measures: A pilot study with Rett syndrome patients. Electronics (Switzerland), 12(2). https://doi.org/10.3390/electronics12020435

45.

Ouzzani

Hammady

Fedorowicz

Elmagarmid

(2016). Rayyan-a web and mobile app for systematic reviews. Systematic Reviews, 5(1), 210. https://doi.org/10.1186/s13643-016-0384-4

46.

Pagliari

Tella

S. D.

Bonanno

Cacciante

Cioeta

De Icco

Jonsdottir

Federico

Franceschini

Goffredo

Rainoldi

Rovaris

Springhetti

Calabrò

R. S.

Tassorelli

Rossini

P. M.

Baglio

, & for RIN_TeleSM_Group. (2025). Enhancing the effect of rehabilitation on multiple sclerosis: A randomized clinical trial investigating the impact of remotely-supervised transcranial direct current stimulation and virtual reality telerehabilitation training. Multiple Sclerosis and Related Disorders, 94, 106256. https://doi.org/10.1016/j.msard.2024.106256

47.

Pardell

Dolgoy

N. D.

Bernard

Bayless

Hirsche

Dennett

Tandon

(2024). Movement outcomes acquired via markerless motion capture systems compared with marker-based systems for adult patient populations: A scoping review. Biomechanics, 4(4), 618–632. https://doi.org/10.3390/biomechanics4040044

48.

Ramírez-Sanz

J. M.

Garrido-Labrador

J. L.

Olivares-Gil

García-Bustillo

Arnaiz-González

Díez-Pastor

J.-F.

Jahouh

González-Santos

González-Bernal

J. J.

Allende-Río

Valiñas-Sieiro

Trejo-Gabriel-Galan

J. M.

Cubo

(2023). A low-cost system using a big-data deep-learning framework for assessing physical telerehabilitation: A proof-of-concept. Healthcare, 11(4), Articolo 4. https://doi.org/10.3390/healthcare11040507

49.

Rasa

A. R.

(2024). Artificial intelligence and its revolutionary role in physical and mental rehabilitation: A review of recent advancements. BioMed Research International, 2024, 9554590. https://doi.org/10.1155/bmri/9554590

50.

Sarbolandi

Lefloch

Kolb

(2015). Kinect range sensing: Structured-light versus Time-of-Flight Kinect. Computer Vision and Image Understanding, 139, 1–20. https://doi.org/10.1016/j.cviu.2015.05.006

51.

Saygili

Guclu-Gunduz

Eldemir

Ozkul

Gursoy

G. T.

(2024). Effects of modified-constraint induced movement therapy based telerehabilitation on upper extremity motor functions in stroke patients. Brain and Behavior, 14(6), e3569. https://doi.org/10.1002/brb3.3569

52.

Shawli

Alsobhi

Faisal Chevidikunnan

Rosewilliam

Basuodan

Khan

(2024). Physical therapists’ perceptions and attitudes towards artificial intelligence in healthcare and rehabilitation: A qualitative study. Musculoskeletal Science and Practice, 73, 103152. https://doi.org/10.1016/j.msksp.2024.103152

53.

Srivastava

J.-M.

Sales

V. L.

Joseph

(2019). Impact of patient-centred home telehealth programme on outcomes in heart failure. Journal of Telemedicine and Telecare, 25(7), 425–430. https://doi.org/10.1177/1357633X18775852

54.

Sumner

Lim

H. W.

Chong

L. S.

Bundele

Mukhopadhyay

Kayambu

(2023). Artificial intelligence in physical rehabilitation: A systematic review. Artificial Intelligence in Medicine, 146, 102693. https://doi.org/10.1016/j.artmed.2023.102693

55.

Sun

W.-J.

Song

Y.-Y.

Wang

Jiang

Cui

W.-Y.

Liu

W.-J.

Liu

(2023). Telerehabilitation for family caregivers of stroke survivors: A systematic review and meta-analysis. Journal of Nursing Management, 2023(1), 3450312. https://doi.org/10.1155/2023/3450312

56.

Taşkaya

(2026). Robotics and artificial intelligence applications in neurorehabilitation: a bibliometric analysis (2003–2025). Journal of NeuroEngineering and Rehabilitation, 23, 55. https://doi.org/10.1186/s12984-025-01870-y

57.

Thomas

E. E.

Taylor

M. L.

Banbury

Snoswell

C. L.

Haydon

H. M.

Gallegos Rejas

V. M.

Smith

A. C.

Caffery

L. J.

(2021). Factors influencing the effectiveness of remote patient monitoring interventions: A realist review. BMJ Open, 11(8), e051844. https://doi.org/10.1136/bmjopen-2021-051844

58.

Timmi

Coates

Fortin

Ackland

Bryant

A. L.

Gordon

Pivonka

(2018). Accuracy of a novel marker tracking approach based on the low-cost Microsoft Kinect v2 sensor. Medical Engineering & Physics, 59, 63–69. https://doi.org/10.1016/j.medengphy.2018.04.020

59.

Toh

F. M.

Lam

W. W. T.

Cruz Gonzalez

Fong

K. N. K.

(2025). Effects of a wearable-based intervention on the hemiparetic upper limb in persons with stroke: A randomized controlled trial. Neurorehabilitation and Neural Repair, 39(1), 31–46. https://doi.org/10.1177/15459683241283412

60.

Tricco

A. C.

Lillie

Zarin

O’Brien

K. K.

Colquhoun

Levac

Moher

Peters

M. D. J.

Horsley

Weeks

Hempel

Akl

E. A.

Chang

McGowan

Stewart

Hartling

Aldcroft

Wilson

M. G.

Garritty

Straus

S. E.

(2018). PRISMA Extension for scoping reviews (PRISMA-ScR): Checklist and explanation. Annals of Internal Medicine, 169(7), 467–473. https://doi.org/10.7326/M18-0850

61.

Wang

Kurillo

Ofli

Bajcsy

(2015). Evaluation of pose tracking accuracy in the first and second generations of Microsoft Kinect. In 2015 International conference on healthcare informatics (pp. 380–389). https://doi.org/10.1109/ICHI.2015.54

62.

West

A. M.

Tessari

Hogan

(2023). The study of complex manipulation via kinematic hand synergies: The effects of data pre-processing. In 2023 International conference on rehabilitation robotics (ICORR) (pp. 1–6). https://doi.org/10.1109/ICORR58425.2023.10304710

63.

Wong

Gallant

Szumacher

(2021). Perceptions of Canadian radiation oncologists, radiation physicists, radiation therapists and radiation trainees about the impact of artificial intelligence in radiation oncology—National survey. Journal of Medical Imaging and Radiation Sciences, 52(1), 44–48. https://doi.org/10.1016/j.jmir.2020.11.013

64.

Yolcu

Oztel

Kazan

Palaniappan

Lever

T. E.

Bunyak

(2019). Facial expression recognition for monitoring neurological disorders based on convolutional neural network. Multimedia Tools and Applications, 78(22), 31581–31603. https://doi.org/10.1007/s11042-019-07959-6

65.

Yoonesi

Abedi Azar

Arab Bafrani

Yaghmayee

Shahavand

Mirmazloumi

Moazeni Limoudehi

Rahmani

Hasany

Idjadi

F. Z.

Aalipour

M. A.

Gharedaghi

Salehi

Asadi Anar

Soleimani

M. S.

(2025). Facial expression deep learning algorithms in the detection of neurological disorders: A systematic review and meta-analysis. BioMedical Engineering OnLine, 24(1), 64. https://doi.org/10.1186/s12938-025-01396-3

66.

Zheng

Chen

Yang

Zhu

Shen

Kehtarnavaz

Shah

(2023). Deep learning-based human pose estimation: A survey (No. arXiv:2012.13392). arXiv. https://doi.org/10.48550/arXiv.2012.13392
