Abstract
Duchenne muscular dystrophy is a severe neuromuscular disorder characterized by progressive muscle degeneration resulting from mutations in the dystrophin gene. Digital outcome measures offer a promising alternative to traditional outcome measures used in clinical trials. This review explores the development and application of digital outcome measures in Duchenne muscular dystrophy, emphasizing the feasibility, reliability, sensitivity, and validity of these measures. The stride velocity 95th centile has been validated as a robust endpoint and has been approved for use in clinical evaluation of drugs for the treatment of Duchenne muscular dystrophy by the European Medicines Agency. Although digital outcome measures have the potential to enhance the efficiency and accuracy of clinical trials, challenges such as limited sample sizes and patient compliance persist. The integration of artificial intelligence into the data analysis is in progress, but further validation is required before these analysis strategies can be incorporated into future clinical trial methodologies.
Highlights
Digital outcome measures are transforming development of drugs for Duchenne muscular dystrophy.
The Stride Velocity 95th Centile was the first digital clinical outcome assessment approved.
Digital outcome measures are now well perceived by regulators and the pharma industry.
Introduction
Duchenne muscular dystrophy (DMD) is a severe neuromuscular condition caused by a mutation in the dystrophin gene that disrupts production of dystrophin protein. The main function of dystrophin is to maintain the integrity of muscle cells, and its absence results in muscle cell fragility and death, leading to a gradual deterioration of muscle function. The proximal muscles are the first muscles affected, and the initial signs of DMD typically emerge when boys are 2–3 years old. Early symptoms include difficulty climbing stairs, running, and rising from the floor, a waddling gait, and frequent falls. The disease time course is considerably heterogeneous and includes a phase of improvement that usually lasts until 6–7 years of age. Patients living with DMD may present with a large spectrum of central nervous system involvement ranging from attention deficit to severe intellectual disability and autism.1,2 Clinical trials of drugs that aim to restore dystrophin or quasi-dystrophin production or to limit the consequences downstream of the absence of dystrophin are ongoing. 3
Because of the slow progression of symptoms and the spectrum of symptoms at a given age in subjects with DMD, it is difficult to capture significant changes in clinical trials, which usually have a limited number of patients and occur over a short period of time. Traditional outcome measures in DMD trials are based on timed assessments and clinical scores with tests conducted in clinical environments at designated time points. These approaches measure the condition of the patient at a very specific time point and can therefore be influenced by the patient's motivation or level of fatigue or by daily fluctuations of the condition due, for instance, to concomitant minor viral infection.
To address these issues, there has been growing interest in developing digital outcome measures. Digital outcome measures involve continuous and longitudinal data collection as well as automated analysis. 4 Thus, digital outcome measures may have improved sensitivity to change and provide greater objectivity than traditional measures. Digital outcome measures that robustly capture drug efficacy based on disease-specific outcomes meaningful to patients will be crucial for the successful development of new treatments. This review discusses the efforts conducted so far in the field of digital outcome measures in DMD and the extent to which they have improved sensitivity, reliability, and objectivity in DMD research.
Methods
The PubMed online database was searched to identify pertinent articles using keywords related to digital outcome measures in DMD research, such as “digital”, “wearable”, “movement”, and “motor function”. A total of 56 titles and abstracts were screened for eligibility based on the use of digital outcome measures to assess movement functionality in DMD. The full-text articles of the selected studies were further evaluated to confirm whether they addressed the review question; 21 articles met the criteria. This was not a comprehensive review, since the search was done by only one evaluator and only one database was searched.
Results
Table 1 outlines the properties of different digital outcomes assessed in DMD, while Table 2 summarizes the data collection methods used in the analyzed studies. Figure 1 illustrates the sensor placements and the outcomes measured.

Figure 1. Sensor locations and outcomes measured using digital devices in subjects with DMD and controls. The color bar shows the number of studies per sensor site.
Table 1. Properties of digital outcome measures applied in evaluation of subjects with DMD. White spaces indicate an unassessed property for the digital outcome metric.
* ICC: intraclass correlation.
** The values in parentheses are the correlation coefficients.
Table 2. Studies on digital data collection in DMD.
* Number of DMD patients with available data for analysis.
** TD: typically developed.
Activity
Studies examining activity levels in DMD patients have primarily focused on quantifying physical activity using accelerometers. The feasibility and accuracy of digital daily physical activity monitoring were first evaluated in a pilot study involving five patients with DMD. Participants wore a three-dimensional (3D) accelerometer and gyroscope on their chest for two consecutive days at baseline and one month after initiating prednisolone treatment. 5 This approach provided initial insights into the intensity, distribution, and structure of physical activity throughout the day, including time spent sitting, standing, lying, and walking; transitions between rest and activity; number of steps taken; walking duration; and cadence. Compared to baseline, after 1 month of prednisolone treatment, overall time spent in activity was increased for all subjects, walking duration was longer for three patients, total numbers of steps were higher for four patients, maximal quasi-continuous walking episodes were longer for four patients, and maximal walking cadence was improved across all patients. Although the small sample size was a limitation, the study demonstrated the sensitivity of digital outcome measures to change. Furthermore, subjects were not significantly inconvenienced by wearing of the accelerometer during the 10 hours of daily use, indicating acceptability and feasibility of accelerometers for activity monitoring.
In a cohort consisting mostly of non-ambulant individuals with DMD, activity was recorded for 7 consecutive days at baseline, one year, and two years later. Vector magnitude counts, calculated as the square root of the sum of the squared recordings, were higher at the wrist than at the ankle. 6 The frequency of movements, quantified as the number of times per epoch that the signal crossed a threshold, and activity levels, measured as the area under the curve for each epoch of wrist movements, were lower in non-ambulatory than in ambulatory subjects. Changes in activity levels between 7-day recordings at baseline and one year later were correlated with the progression of muscle weakness. 7 In a cross-sectional study based on up to 3 days of recordings, patients with higher Brooke and Vignos scale scores had higher activity counts, assessed from the level and rate of arm elevation of the upper limbs. 8
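The three count-based metrics described above can be illustrated with a minimal Python sketch. It assumes that each epoch provides one count per axis and that the signal is regularly sampled; the function names, the threshold value, and the sampling interval are hypothetical and are not taken from the cited studies.

```python
import numpy as np


def vector_magnitude(counts_xyz):
    """Per-epoch vector magnitude: sqrt(x^2 + y^2 + z^2) of the axis counts."""
    counts = np.asarray(counts_xyz, dtype=float)
    return np.sqrt((counts ** 2).sum(axis=1))


def threshold_crossings(signal, threshold):
    """Number of times the signal crosses the threshold, in either direction."""
    above = np.asarray(signal, dtype=float) > threshold
    return int(np.count_nonzero(above[1:] != above[:-1]))


def area_under_curve(signal, dt=1.0):
    """Trapezoidal area under a regularly sampled signal (dt = sample spacing)."""
    s = np.asarray(signal, dtype=float)
    return float(np.sum((s[1:] + s[:-1]) / 2.0) * dt)


# Illustrative values only (not study data)
epoch_counts = [[3, 4, 0], [6, 8, 0]]
print(vector_magnitude(epoch_counts))
print(threshold_crossings([0, 2, 0, 2, 0], threshold=1))
print(area_under_curve([0, 2, 0]))
```

In practice the definition of an epoch, the filtering applied before counting, and the threshold are device-specific, so values from different devices are not directly comparable.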
In a similar study, accelerometry data collected for 7 days were converted into counts per minute, and periods were classified into sedentary, low-intensity, and moderate-to-vigorous physical activity categories. DMD patients spent most of their waking time in sedentary positions (85%), with most of the remainder spent in low-intensity movements (13.8%). Age and ambulation level were correlated with physical activity. Vector magnitude was lower in patients with DMD than in healthy controls and in non-ambulatory than in ambulatory patients with DMD. 9 Furthermore, differences in physical activity monitored with a wrist-worn 3D accelerometer for 12 weeks were observed between children with Niemann-Pick type C, juvenile idiopathic arthritis, and DMD. 10
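The classification step can be sketched as follows. This is an illustration only: the two cut points are hypothetical placeholders, since the thresholds used in the cited study are not given here, and the function name is an assumption.

```python
import numpy as np

# Hypothetical cut points in counts per minute (NOT the cited study's values)
SEDENTARY_MAX = 100   # below this: sedentary
LOW_MAX = 2000        # below this (and >= SEDENTARY_MAX): low intensity


def intensity_shares(counts_per_minute, sedentary_max=SEDENTARY_MAX, low_max=LOW_MAX):
    """Percentage of wear-time minutes in each intensity category."""
    cpm = np.asarray(counts_per_minute, dtype=float)
    # digitize returns 0 for cpm < sedentary_max, 1 for the low band, 2 above low_max
    labels = np.digitize(cpm, [sedentary_max, low_max])
    shares = np.bincount(labels, minlength=3) / cpm.size * 100.0
    return {"sedentary": shares[0], "low": shares[1], "mvpa": shares[2]}


# Illustrative minute-level counts (not study data)
print(intensity_shares([0, 50, 99, 100, 1500, 2500, 10, 20, 30, 40]))
```

Because the resulting percentages depend directly on the chosen cut points, comparisons between studies are only meaningful when the same thresholds and wear-time rules are used.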
The relationship between DMD and pathological sleep was investigated by evaluating rest activity with a wrist-worn accelerometer for 10 days. 11 Participants who had more fragmented activity rhythms experienced higher levels of subjective sleep disorder in terms of initiating and maintaining sleep. Intraday variability was associated with the subjective sleep disorder subscale measure, suggesting it may serve as an indicator of sleep-wake dysfunction in DMD. Additionally, daytime activity correlated with the 6-minute walk test (6MWT), indicating that peak activity hours may provide a reliable indicator of ambulatory status. 11
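Fragmentation of the rest-activity rhythm is commonly summarized by a nonparametric variability index: the ratio of the mean squared hour-to-hour change to the overall variance of the activity signal. The sketch below implements that common definition as an assumption; the cited study may have used a different binning or software, and the function name is hypothetical.

```python
import numpy as np


def intradaily_variability(hourly_activity):
    """IV = [N * sum((x_i - x_{i-1})^2)] / [(N - 1) * sum((x_i - mean)^2)].

    Higher values indicate a more fragmented rest-activity rhythm.
    Assumes hourly activity totals as input.
    """
    x = np.asarray(hourly_activity, dtype=float)
    n = x.size
    numerator = n * np.sum(np.diff(x) ** 2)
    denominator = (n - 1) * np.sum((x - x.mean()) ** 2)
    return float(numerator / denominator)


# Illustrative values only: an alternating signal is more fragmented than a ramp
print(intradaily_variability([0, 1, 0, 1]))
print(intradaily_variability([0, 1, 2, 3]))
```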
Step and stride count
A waist-worn accelerometer was used to measure step count over 7 days in a large cohort of boys with DMD and unaffected controls. On average, DMD patients took 63% of the daily steps of the unaffected controls, and step activity was correlated with function and strength, with an overall decline in daily steps with increasing age in subjects with DMD. 12 A 5-year longitudinal study of DMD patients also showed a decline in average strides per day and in all stride rates (low, moderate, high, pediatric high) as a function of age. 13 The number of steps taken, together with the 6MWT, was used as a primary endpoint for assessing the effect of a multicomponent nutritional supplement on functional outcomes in DMD in a 50-week randomized controlled trial; however, the trial was discontinued due to a lack of significant improvements. 14
Stride length
DMD patients and typically developing children were asked to walk 200 meters at a self-determined pace, and stride was monitored using accelerometers attached to the shanks and trunk. Stride length was lower in DMD patients than in control subjects. 15 Stride length has also been assessed from videos as the distance from heel to heel using a camera moving alongside the patient; however, the lack of standardization in the setup limited the assessment of this outcome. 4
Data from a waist-worn iPhone equipped with a built-in accelerometer were analyzed with a machine learning model trained on various speeds of gait. This methodology accurately measured gait features such as step length, duration, and speed, and also predicted travel distances over a range of walking speeds, in individuals with DMD and their typically developing peers. 16 DMD participants were subdivided by North Star Ambulatory Assessment (NSAA) scores into near-typically developing (≥30), mildly affected (20–29), moderately affected (10–19), and severely affected (<10) groups. Significant differences in average step lengths were observed across different velocities, with the most substantial drop occurring in the moderate NSAA scoring range. The largest differences in step lengths in the moderate group might be due to the non-linear nature of NSAA scoring. 16 Stride length, along with other clinical features of DMD, including speed, step frequency, total power, mediolateral power, and anteroposterior power, differentiated DMD from typically developing children, with the best accuracy of 100% obtained at self-selected walking speeds. The extracted temporospatial gait features showed reduced step length and a greater mediolateral component of total power in DMD subjects compared to controls, consistent with the shorter strides and Trendelenburg-like gait commonly observed in DMD. The contribution of stride length to the discrimination between groups was confirmed by the moderate to high correlation after dimensionality reduction (principal component analysis and linear discriminant analysis) before classification. Raw accelerometry data were also used as input for a deep learning model for the same purpose, achieving 86.67% accuracy during slow and free walking, with the amount of data likely limiting performance. 17
Mean stride velocity
DMD gait is characterized by a reduced stride speed, calculated as the mean linear speed of the foot throughout the gait cycle, in comparison to typically developing children. 15 There is also a significant difference in stride speed between mildly and moderately affected DMD subjects. 15
Cadence
Stride cadence, which is the number of strides taken per minute, decreases with age in normally developing children, 18 and this decrease is greater in boys with DMD than in typically developing boys. 13 For instance, among the oldest boys with DMD who remained ambulatory, approximately 60% of strides per day fell within a low cadence range. 13 When distinguishing between populations, cadence has a significant effect size, with moderately affected DMD patients having a lower cadence than mildly affected DMD subjects and typically developing children. 15
Angular velocity and acceleration at the wrist level in non-ambulant patients
Angular velocity can be measured with the aid of a gyroscope. The shank peak angular velocity is lower in moderately affected DMD patients than in normally developing controls. 15 The magnitude of the angular speed, the ratio of the vertical acceleration component, the energy required to move the forearm (the torque's scalar product with the angular velocity), and the forearm's lifting velocity (the angular velocity at which it was raised) were also assessed in DMD patients using wrist-worn magneto-inertial sensors as subjects completed various tasks on two occasions, 15 days apart. 19
Double support time
Double support time is the percentage of the gait cycle during which both feet are in contact with the ground. It is greater in children with DMD than in typically developing children. 15
Trunk smoothness
Higher spectral entropy of acceleration norm was seen in moderately affected DMD children compared to mildly affected DMD children and typically developing children, indicating less smooth trunk movement. 15 Assessing functional muscle strength by analysis of trunk movement in videos may identify muscle weakness, as less smooth, time-consuming, and asymmetrical movements suggest compensatory strategies. 4
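To make the smoothness metric concrete, the sketch below computes a normalized spectral entropy of the acceleration norm. It assumes a Shannon entropy of the power spectrum of the mean-removed norm, scaled to the range 0 to 1; the cited study may have used different windowing, frequency bands, or normalization, and the function name is hypothetical.

```python
import numpy as np


def spectral_entropy(acc_xyz):
    """Normalized Shannon entropy of the power spectrum of the acceleration norm.

    Values near 0 indicate power concentrated at one frequency (smooth,
    rhythmic movement); values near 1 indicate broadband, irregular movement.
    """
    acc = np.asarray(acc_xyz, dtype=float)
    norm = np.linalg.norm(acc, axis=1)
    norm = norm - norm.mean()                    # remove the constant offset
    power = np.abs(np.fft.rfft(norm)) ** 2
    power = power[1:]                            # drop the zero-frequency bin
    total = power.sum()
    if total == 0:                               # perfectly constant signal
        return 0.0
    p = power / total
    p = p[p > 0]                                 # avoid log(0)
    return float(-(p * np.log(p)).sum() / np.log(power.size))


# Synthetic illustration (not study data): a clean oscillation vs. a noisy one
t = np.arange(256)
smooth = np.column_stack([np.zeros(256), np.zeros(256),
                          1.0 + 0.5 * np.sin(2 * np.pi * 8 * t / 256)])
rough = smooth + np.random.default_rng(0).normal(0.0, 0.3, size=smooth.shape)
print(spectral_entropy(smooth), spectral_entropy(rough))
```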
Relative coupling coefficient
Accelerometer data collected during a 14-meter walk were used to calculate the relative coupling coefficient, which distinguishes between DMD and typically developing children based on the core-limb coupling coefficient (CLCC) and homolateral-limb coupling coefficient (HLCC). 20 CLCC and HLCC can detect compensatory movements caused by muscle weakness, with CLCC assessing the core muscle group's ability to control and coordinate limbs, and HLCC reflecting compensation in the ipsilateral limb. The relative coupling coefficient integrates the input from CLCC and HLCC to better measure the coordination of the entire body during walking. As children with DMD get older, their CLCC decreases progressively, and their HLCC increases. In contrast, typically developing children do not show a clear pattern of change in the CLCC and HLCC. Consequently, a notable difference in relative coupling coefficient between children with DMD and age-matched typically developing children is observed. 20
Behavioral fingerprints and KineDMD
The KineDMD behavioral biomarker was identified through the analysis of data collected on a cohort of 21 participants with DMD and 17 age-matched controls, who were assessed at baseline, 6 months, and 12 months. 21 Evaluations included the 6MWT, the NSAA, the Performance of the Upper Limb (PUL) test, and everyday activities performed while wearing a 17-sensor bodysuit. The data-derived measures encompassed mean velocity of the limbs, hip movement orbit, and volumetric workspaces of various joints, forming movement behavioral fingerprints that were shown to distinguish subjects with DMD from controls. Skeletal joint movement histograms from activities of daily living reflected hyperlordosis in the DMD subjects. The angle distribution of the hip joint in patients is shifted to the right compared to healthy individuals, causing a more pronounced leftward shift in the distribution of knee joint angles. Additionally, there is an anticorrelation between the joint angular velocities of knee and hip flexion movements in the sagittal plane and increased correlation in the sideways abduction of the hips in the coronal plane, indicative of waddling. Fingerprints were also used to predict clinical outcome measures at a cross-sectional and longitudinal level, and the fingerprints outperformed predictions based on the clinical evaluations. Finally, Bayesian optimization was applied to create the KineDMD biomarker, which increases in value with age following a sigmoid (S-shaped) curve. 21
Video assessments
Videos can also be used to generate digital measures. Phones were used to capture videos of patients performing activities at home, chosen according to their age and functional abilities. 22 The 63 videos taken over 2 weeks for each patient were rated by certified physiotherapists using severity scoreboards to assess weaknesses in patients’ performances. Nevertheless, the risk of subjectivity in the evaluation persists, and evaluators may be influenced by prior task performance. OpenPose, a computer vision technique, 23 was used to extract body joints of interest and to quantify the time of execution, trajectory pattern, smoothness, and symmetry of movement. 4 In this research, tasks were not fundamental daily activities such as eating and drinking. Furthermore, issues related to the standardization of videos were identified, resulting from insufficient guidance regarding settings, task execution, duration of actions, recording angles, and tools required for tasks, such as the height of the chair used to measure time needed to stand.
Stride velocity 95th centile
The stride velocity 95th centile (SV95C) is the 95th centile of the stride velocities achieved spontaneously over a set period, that is, the velocity exceeded only by the fastest 5% of strides. 24 Captured using a passive wearable device, SV95C quantifies a patient's ambulatory ability and serves as a maximal performance indicator. SV95C can be determined in the home environment and is less influenced by the time of day at which the test is conducted and by patient motivation than traditional tests performed in a clinical setting. The optimal recording duration for SV95C was mathematically determined to be 180 hours, with a mean variability of 4.41% (standard deviation [SD], 2.33%); the minimum was established to be 50 hours, with a mean variability of 6.38% (SD, 2.60%). 25 The excellent and early standardized response mean observed in natural history studies suggests that use of SV95C as an endpoint could reduce the number of patients needed in trials, thereby reducing the cost, duration, and logistical complexity of trials. 26
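As a rough illustration, SV95C can be computed as the 95th percentile of all stride velocities recorded over the wear period, and responsiveness can be summarized with the standardized response mean (mean change divided by the standard deviation of the change). The sketch below assumes stride velocities have already been extracted by the device; the function names and the linear-interpolation percentile method are assumptions, not the qualified device's algorithm.

```python
import numpy as np


def sv95c(stride_velocities):
    """95th percentile of stride velocities (linear interpolation between ranks)."""
    v = np.asarray(stride_velocities, dtype=float)
    return float(np.percentile(v, 95))


def standardized_response_mean(baseline, follow_up):
    """SRM = mean(change) / SD(change), using the sample SD (ddof=1)."""
    change = np.asarray(follow_up, dtype=float) - np.asarray(baseline, dtype=float)
    return float(change.mean() / change.std(ddof=1))


# Illustrative values only (not study data)
velocities = np.arange(1, 101)          # 100 hypothetical strides, arbitrary units
threshold = sv95c(velocities)
print(threshold, np.mean(velocities > threshold))   # about 5% of strides exceed it
print(standardized_response_mean([1.0, 1.0, 1.0], [0.0, -1.0, -2.0]))
```

A larger absolute SRM means that the same change can be detected with fewer participants, which is the basis for the sample-size argument made above.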
The European Medicines Agency now recognizes SV95C as a primary endpoint for ambulant DMD patients over 4 years old. The qualification of SV95C as a clinical outcome assessment was based on successful demonstrations of feasibility, reliability, sensitivity to change, and external validity, supported by consistent and established efficacy endpoints in DMD, which were included as secondary endpoints. 26 Regarding the Food and Drug Administration's qualification of SV95C, the Letter of Intent (stage 1) has been accepted, and the qualification plan (stage 2) was submitted in July 2024.
A recent example of how SV95C affects the development of new treatments comes from the EMBARK study (NCT05096221) of adeno-associated virus delivery of a gene encoding a micro-dystrophin. Treated and untreated patients could not be distinguished on the primary endpoint, the North Star Ambulatory Assessment (NSAA); however, SV95C, used as a secondary endpoint in this study, showed a statistically and clinically significant difference between the two groups. 27 Another example is the SPITFIRE/WN40227 trial (NCT03039686), which was discontinued due to a lack of clinical benefit. In this trial, SV95C demonstrated responsiveness to clinical changes as early as Week 12, whereas other outcome measures only showed responsiveness at Week 36 or later. This suggests that SV95C can detect disease progression earlier than traditional endpoints. 28
Discussion
Diverse digital outcome measures have been investigated in DMD. These digital outcome measures span a wide range of validation maturity, from proof of concept to approval for clinical use. Most studies conducted to date have been cross-sectional, assessing reliability, clinical validity, and the ability to discriminate between controls and individuals with DMD. Although longitudinal data are rare, certain measures, such as activity levels, step count, and cadence, have shown significant sensitivity to changes over time.
Endpoints in clinical trials must accurately reflect patient conditions. They need to be objective, reliable, and sensitive to change. Digital outcome measures, collected through standardized protocols and instruments, may ensure accurate measurements of intended metrics, identifying significant improvements or deterioration during disease progression or treatment, and providing valuable insights into optimal patient management. Furthermore, digital outcome measures have lower variability than traditional hospital-based assessments as monitoring can be conducted at home and over time, offering more reliable and repeatable measurements. Among all the digital outcome measures evaluated to date, only the SV95C has formally proven to be superior to traditional hospital-based assessments (Table 1). The approval by the European Medicines Agency of the SV95C as a secondary 24 and as a primary endpoint 29 for DMD clinical trials is expected to lead to increased use and a possible transition to digital outcome measures in clinical research. Many clinical trials have included the SV95C as a secondary endpoint (NCT06138639, NCT04906460, NCT05291091, NCT03039686, NCT05096221, NCT05524883, NCT03907072), and its use as a primary endpoint in trials is currently underway.
Research on DMD benefits from the increasing use of AI methods in data analysis. Artificial intelligence can automate data processing, improve step detection, and estimate distance traveled, while also accurately distinguishing DMD patients from controls. 16 Although machine learning approaches can achieve high accuracy in classification, prior work to extract clinical features, such as stride length and speed, is needed. Conversely, deep learning models can utilize raw data; however, they require a large amount of data to effectively identify patterns that differentiate populations. 17 Moreover, machine learning enables the analysis of movement patterns, presenting opportunities for both cross-sectional and longitudinal projections of clinical assessments. Furthermore, the KineDMD biomarker was demonstrated to be predictive of disease progression and response to therapy. 21 In another instance, computer vision could extract objective and quantitative measures of movement from videos in a home environment to identify voluntary or compensatory movements that may indicate muscle weakness. 4 Although these findings are promising, the use of AI methods presents significant challenges, including hurdles in validation for clinical use, the amount of data needed, and the validation of measurements in non-clinical environments, where events cannot be controlled or predicted. How AI-derived outcomes can be incorporated into evaluations of the standard of care and the introduction of new treatments remains to be established. AI tools must not only be accurate but also clinically relevant and adaptable to changing treatment landscapes.
The main weakness in evaluating DMD patients using digital outcome measures is the limited sample size, due to the rarity of DMD. Recruiting sufficient numbers of patients for clinical trials poses significant challenges, highlighting the necessity for further validation of results. The broader use of digital outcome measures in randomized controlled studies and the potential use of placebo groups could be a future way to assess the reliability and accuracy of digital outcome measures.
Patient compliance with wearing sensors or taking videos is important. Limitations in this area have been identified in the literature 9 and include complaints by patients about the aesthetics and comfort of the devices, 10 deficiencies in the wearables themselves, 8 and failure of users to upload data or submission of videos that did not meet quality standards.4,22 Moreover, the impacts of seasons, weather, and day of the week as potential confounding factors have not been generally examined. Despite these challenges, research conducted in the DMD population has provided indications that digital outcome measures will be useful clinically. 30
This review has limitations. The selection of papers was carried out by a single person, and only one database was consulted for the research. This method could have created biases and missed important studies available through different databases or selection processes. Nevertheless, despite the need for more validation, we expect that digital outcome measures will begin to influence the design and conduct of clinical trials, reducing numbers of patients needed and costs compared to trials conducted with traditional endpoint measures, and enhancing the reliability, sensitivity, and objectivity of clinical trials for DMD.
Conclusions
Over the last two decades, several digital outcome measures have been developed for use in patients with DMD. Validation has reached different levels of maturity, from proof of concept to regulatory approval. Evidence from several trials and natural history studies shows that certain digital outcome measures have better reliability and sensitivity than traditional tests conducted in a clinical setting. Collection of large amounts of data, supported by multiple stakeholders, for the validation and refinement of these outcome measures is needed to accelerate this trend and support more efficient development.
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: CGB is an employee of the ActiMyo/Syde wearable sensor manufacturer, Sysnav.
LS has provided consultancy to Roche, Biogen Digital Health, PepGen, Dyne Therapeutics, WaveLife and Sysnav in the context of digital outcome measures.
