Abstract
Introduction
Digital home rehabilitation systems require accurate segmentation methods to provide appropriate feedback on repetition counting and exercise technique. Current segmentation methods are not suitable for clinical use: they are either insufficiently accurate or require multiple sensors, which creates usability problems. We propose a model for accurately segmenting inertial measurement unit data for shoulder rehabilitation exercises. This study aims to use inertial measurement unit data to train and test a machine learning segmentation model for single- and multiple-inertial measurement unit systems and to identify the optimal single-sensor location.
Methods
A focus group of specialist physiotherapists selected the exercises, which were performed by participants wearing inertial measurement units on the wrist, arm and scapula. We applied a novel machine-learning-based segmentation technique, involving a convolutional classifier and a finite state machine, to the inertial measurement unit data. An accuracy score was calculated for each possible single- or multiple-sensor system.
Results
The wrist inertial measurement unit was chosen as the optimal single-sensor location for future system development (mean overall accuracy 0.871). Most flexion- and abduction-based exercises could be segmented with high accuracy, but scapular movement exercises had poor accuracy.
Conclusion
A wrist-worn single inertial measurement unit system can accurately segment shoulder exercise repetitions; however, accuracy varies depending on characteristics of the exercise.
Introduction
Shoulder impairments caused by musculoskeletal conditions or chronic diseases require a management plan which includes rehabilitation in the home environment.1–3 An important component of home rehabilitation is the exercise programme provided by physiotherapists, yet low levels of adherence to these programmes are observed.4,5 The development of digital biofeedback systems, which aim to support and motivate patients during home rehabilitation, is the focus of a new field of rehabilitation technologies.6,7
These digital biofeedback systems contain an external sensor which collects biomechanical data, which is then analysed by the system and relayed to the user. Analysis of shoulder kinematics may be performed with a variety of sensors, but these are often not suitable for home rehabilitation. Electromyography, real-time ultrasound and marker-based motion tracking systems are costly and require specialist training, while commercial gaming consoles cannot detect the subtleties of rehabilitation exercise movements.8–12 Inertial measurement units (IMUs) are suitable alternatives for tracking human movement in home rehabilitation interventions, due to their low cost, small size and ease of use.13,14 IMUs consist of an accelerometer, gyroscope and magnetometer and are capable of measuring quantities such as acceleration and angular velocity of a body. Most commercial devices currently available are triaxial; therefore, each sampled quantity is measured in 3D, on the axes x, y and z. IMU-based systems for shoulder rehabilitation have been developed for measurement and analysis of range of movement (ROM),15 function16 and activities of daily living (ADLs).17 However, these systems are usually developed as assessment tools for clinicians only, and not for use in unsupervised home rehabilitation.
The analysis of IMU data for rehabilitation exercises has two main stages.18 The first stage, referred to as ‘segmentation’, detects primitive movements, which constitute the basic unit of a rehabilitation exercise (i.e. a repetition). A digital tool with highly accurate segmentation, meaning that it identifies and counts repetitions correctly, can support features such as repetition counting, automatic logging of exercise sessions and compilation of progress reports without manual data entry. Poor segmentation accuracy would result in a system which is unreliable and misleading, and which will ultimately fail to be valuable for the end user, i.e. patients undergoing home rehabilitation. Accurate segmentation is also crucial for success in the second stage of IMU data analysis, named ‘classification’. In the classification stage, each repetition is given a label which categorises it based on certain characteristics, such as movement speed, movement direction or movement quality. This stage does not always involve exercise detection; the system is usually already aware of the specific exercise being performed. Several systems have managed to segment and classify upper limb exercises to a clinically acceptable level, but they require multiple IMUs for data collection, which may be seen as impractical in a home rehabilitation context.19,20 A single-IMU system, being both more user-friendly and more cost-effective, would be ideal for home rehabilitation. Lee et al.21 developed a single-IMU exercise system for the upper limb, but this required manual segmentation and was tested on one upper limb exercise only. There is a clear need for an accurate segmentation model that is based on minimal IMU data input, covers a comprehensive exercise programme and is capable of being implemented and evaluated in the home setting.
The challenge of segmenting exercise sensor signals, as described in detail by Lin et al.,22 can be approached using a variety of methods, including zero-velocity crossing (ZVC), dynamic time warping (DTW) and hidden Markov models (HMMs). While each method has its advantages, none of them entirely fulfils the needs of the above-described application. ZVC or local minima/maxima methods are based on specific features of the signal exceeding a threshold, whereupon the algorithm detects that an exercise repetition has occurred. These methods do not require any prior knowledge of the activities performed by the subjects and are computationally economical. However, ZVC methods tend to over-segment the input signals, which would lead to erroneous repetition counting and confusion or frustration for the user.23 DTW compares temporal and spatial differences between the performed movements and a pre-determined ideal (or ‘golden’) movement template. DTW-based methods can segment accurately, but are computationally expensive, and suffer from the issue of singularity, where short portions of movements are mapped to fit the target patterns, and therefore identified as such.24 Additionally, DTW-based methods are not designed to segment in ‘real time’, meaning that users must wait until an exercise set is finished before receiving any feedback. Alternatively, HMMs model the input signal as a sequence of unobservable states, but despite being more flexible, they also tend to over-segment.22 In an earlier paper, we assessed our segmentation system against ZVC and HMM for seven shoulder exercises recorded at the Wrist sensor and found that it consistently achieved the highest accuracy, precision and recall.25 ZVC and HMM methods suffered from over-segmentation, as can be seen from their relatively lower precision scores. This paper assesses the algorithm across a broader set of exercises and considers the clinical relevance of the findings with regard to developing a digital biofeedback system.
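To illustrate why zero-crossing approaches tend to over-segment, the following is a minimal sketch only, not the ZVC implementation evaluated in the earlier paper. The function name `zero_crossings` and the two toy angular-velocity traces are hypothetical and chosen purely for illustration.

```python
def zero_crossings(signal):
    """Return the indices at which the signal changes sign."""
    crossings = []
    for i in range(1, len(signal)):
        prev, cur = signal[i - 1], signal[i]
        # A crossing occurs when the sign flips between consecutive samples.
        if (prev > 0 and cur <= 0) or (prev < 0 and cur >= 0):
            crossings.append(i)
    return crossings


# Toy angular velocity for one repetition: raise (positive), then lower (negative).
clean = [0.5, 2.0, 0.5, -0.5, -2.0, -0.5]

# The same repetition with a small jitter around the turning point.
noisy = [0.5, 2.0, 0.5, -0.1, 0.1, -0.5, -2.0, -0.5]

clean_crossings = zero_crossings(clean)
noisy_crossings = zero_crossings(noisy)
print(len(clean_crossings), len(noisy_crossings))
```

In this toy case the clean trace yields a single crossing at the turning point, whereas the jittered trace yields three, so a counter driven directly by crossings would report extra boundaries for what is a single repetition.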
There is a need for a model which possesses the various strengths of these systems, while also operating in real time, being computationally cheap and delivering accurate segmentation counts. This project proposes the use of a deep learning model, which does not require feature engineering or much domain knowledge, can scale well with unseen movements and will not over-segment the data. To fulfil the technological and practical needs of a digital biofeedback system for shoulder rehabilitation, we aim to use a single IMU to collect biomechanical data, and a novel machine-learning algorithm to process and segment the signal in real-time. Future steps will involve developing an exercise classifier, and this was a consideration when designing the protocol for IMU data collection.
There were two main aims of this study. Firstly, we collected a reference set of IMU data for a set of shoulder rehabilitation exercises designed by specialist physiotherapists, and used this data to train a segmentation model, which could operate with either single or multiple sensor input. Secondly, the accuracy of the segmentation model was evaluated for multi- and single-sensor systems, and the most suitable location for a single sensor system with this exercise set was identified.
Methodology
Exercise selection
This system focuses on the specific clinical scenario of early stage post-operative rehabilitation following shoulder surgery. The initial system development required a suitable exercise programme which was both comprehensive enough to form a clinically relevant system and sufficiently refined to not over-burden the data collection and analysis processes. A focus group of three physiotherapists with experience in shoulder rehabilitation assisted in the development of this programme. Ethical approval was granted by Beacon Hospital Research Ethics Committee. Participants were asked to list the most commonly prescribed exercises in this clinical scenario and to develop a descriptive definition of each movement. They then discussed the various compensatory movement patterns (‘deviations’) associated with these exercises in a post-operative population. Participants were asked to focus on deviations which would prove detrimental to achieving full motor recovery and not on variations of ‘normal’ movement. Several proposed deviations were excluded and their descriptions and reasons for exclusion are as follows:
Neck movement during scapular retraction – none of the three IMUs were in a location that could detect this.
Inadequate ROM during all exercises – in the early post-operative stages this is mainly caused by pain and oedema, and the user cannot increase ROM volitionally in these circumstances.
This resulted in a total of 11 exercises and 31 deviations as listed in Table 1.
Table 1. Exercises, abbreviations and variations for data collection.
Focus group participants also highlighted that, to meet both strengthening and ROM goals, several movements should be performed both as isotonic exercises and as static stretches within the programme. We therefore included a variety of ‘hold’ times from five to ten seconds in the data collection protocol. The list of exercises and their associated stretches and deviations can together be referred to as the different exercise ‘variations’, which would be performed by participants in the IMU data collection phase of the study.
IMU data collection procedure
Participants were a convenience sample of healthy adults who were recruited via posters in the university campus. Exclusion criteria were current shoulder or upper limb injury limiting normal range of movement and inability to provide informed consent. Participants provided informed consent, and their height, weight and upper limb measurements were recorded. Ethical approval for this study was granted by the Human Research Ethics Committee of University College Dublin.
Data collection took place in a university laboratory. Three IMUs (SHIMMER, Shimmer Research, Dublin, Ireland), labelled ‘Wrist’, ‘Arm’ and ‘Traps’, were attached to specific locations on the right arm of the participants using adhesive tape and bandaging. Placement and orientation of IMUs were consistent across all participants (Figure 1). IMU parameters were set as follows: sampling frequency 102.4 Hz, tri-axial low-noise accelerometer ±2 g and tri-axial gyroscope ±500 dps. These settings were adapted from previous research using IMUs to collect rehabilitation exercise data.13,26 These exercises are low-velocity and low-impact, so an accelerometer range of ±2 g achieves maximum granularity in the data while still ensuring that all sensor data is captured. ConsensysPRO v1.5.0 software by Shimmer was used to configure the IMUs and manage the data.

Figure 1. Orientation and placement of IMU sensors.
The lead investigator demonstrated the first exercise variation to the participant, who then held the specified starting position for one second before performing ten repetitions of the movement at a moderate pace with a rest of ∼0.5–1 s in between repetitions. This procedure was repeated for each of the 31 exercise variations. Exercises were performed in standing, except those listed as supine exercises, which were performed on a plinth. All data collection sessions were observed by a physiotherapist and any variations from the instructed exercise technique, such as additional or incomplete repetitions, were noted so that they could be appropriately labelled and would not compromise the dataset integrity. High-definition video cameras were used to record participants laterally and posteriorly, so their performance could be reviewed if any discrepancies in data were noted during the analysis phase.
Data analysis
The segmentation system implemented for the isolation of the repetitions has a two-tier structure. The first component is a convolutional classifier, which classifies small sliding windows extracted from the streaming IMU signals. Each sliding window can be classified as either dynamic or dormant, depending on whether it corresponds to a period of movement or to a period of ‘silence’ (no movement). A window size of 30 data points was chosen. With a fixed sampling frequency of 102.4 Hz, each window spans ∼0.3 s. Considering the expected length of both the exercise movements and the pause between exercises, this is a suitable window size to enable both detection of short pauses between consecutive repetitions and extraction of discriminative features from windows.27 Consecutive windows overlapped by 29 points. As convolutional classifiers can automatically learn meaningful features from the raw input, no pre-processing of the IMU signals is required. The window labels produced by the convolutional classifier are then streamed as input to a finite state machine (FSM), a stateful component that models the high-level movement patterns of the exercise repetitions. The FSM keeps track of the movement phase currently under execution (eccentric, concentric, isometric), and for each individual primitive it returns the starting point and the ending point (Figure 2).
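The two-tier structure described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: `label_window` is a hypothetical threshold rule standing in for the convolutional classifier, and `segment` is a two-state machine that ignores the eccentric, concentric and isometric phases tracked by the real FSM. Only the window size (30), the overlap (29) and the sampling frequency (102.4 Hz) are taken from the text.

```python
WINDOW_SIZE = 30      # data points per window (from the text)
SAMPLING_HZ = 102.4   # sampling frequency in Hz (from the text)


def sliding_windows(samples, size=WINDOW_SIZE, step=1):
    """Extract windows; a step of 1 gives an overlap of size - 1 = 29 points."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, step)]


def label_window(window, threshold=0.5):
    """ASSUMPTION: a stand-in for the convolutional classifier.

    Labels a window 'dynamic' when its mean absolute value exceeds a
    hypothetical threshold, and 'dormant' otherwise.
    """
    mean_abs = sum(abs(x) for x in window) / len(window)
    return "dynamic" if mean_abs > threshold else "dormant"


def segment(labels):
    """Simplified two-state machine returning (start, end) window indices."""
    segments = []
    state, start = "dormant", None
    for i, label in enumerate(labels):
        if state == "dormant" and label == "dynamic":
            state, start = "dynamic", i          # movement begins
        elif state == "dynamic" and label == "dormant":
            segments.append((start, i - 1))      # movement ends
            state = "dormant"
    if state == "dynamic":                       # signal ended mid-movement
        segments.append((start, len(labels) - 1))
    return segments


# Toy one-channel signal: rest, movement, rest, movement, rest.
signal = [0.0] * 40 + [1.0] * 60 + [0.0] * 40 + [1.0] * 60 + [0.0] * 40
labels = [label_window(w) for w in sliding_windows(signal)]
segments = segment(labels)

window_seconds = WINDOW_SIZE / SAMPLING_HZ  # roughly 0.29 s per window
print(len(segments), round(window_seconds, 3))
```

On this toy signal the sketch recovers two movement segments. Real IMU input is multi-channel (tri-axial accelerometer and gyroscope), which the single-channel example deliberately omits.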

Figure 2. Signal segmentation of accelerometer and gyroscope data from Wrist IMU for three repetitions of shoulder abduction exercise.
The overall performance of the segmentation method was assessed with the leave-one-subject-out (LOSO) cross-validation protocol.28 The validation folds generated with LOSO are designed so that repetitions from the same subject cannot be distributed between the training data and the test data. This results in a more accurate estimate of the system's generalisation capabilities with respect to subject variability. Segmentation accuracy (ACC), the proportion of repetitions in the testing data which have been correctly segmented, is the metric used to assess the segmentation ability of the algorithm.18
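The LOSO fold construction described above can be sketched as follows. This is an illustrative sketch only: the function name `loso_folds` and the participant identifiers are hypothetical, and the model training step is omitted.

```python
def loso_folds(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject.

    Each fold holds out every repetition belonging to one subject, so no
    subject contributes to both the training and the test data.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# One entry per recorded repetition, giving the (hypothetical) subject it came from.
subjects = ["p1", "p1", "p2", "p3", "p3", "p3"]
folds = list(loso_folds(subjects))

for held_out, train, test in folds:
    print(held_out, train, test)
```

The number of folds equals the number of distinct subjects, and the per-fold scores would then be aggregated to give the overall estimate.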
To compute ACC, the training dataset was manually labelled in advance with ground truth segment coordinates, to compare with the coordinates proposed by the algorithm. In this system, and as described by Bevilacqua et al.,25 any coordinate generated by the segmentation algorithm was reviewed by an investigator and identified as a true positive (TP) if it corresponded to or fell within 50 data points either side of a ground truth coordinate, or a false positive (FP) if it was further than 50 data points from any of the ground truth coordinates. Instances where the ground truth coordinates did not have a correspondence within the set of generated coordinates were marked as false negatives (FN). For the purpose of accuracy computation, we allowed the set of true negative (TN) points to be empty. Precision and recall are metrics which provide additional detail regarding the accuracy of a model. Precision (PRE) reports the proportion of generated coordinates which were correctly labelled; recall (REC) reports the proportion of ground truth coordinates which were correctly detected.22
Precision, recall and accuracy were calculated as follows, where tp is true positive, fp is false positive and fn is false negative:
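The expressions below are a reconstruction that assumes the standard definitions of precision and recall. With the set of true negatives taken as empty, as stated above, the usual accuracy ratio reduces to the third form.

```latex
\mathrm{PRE} = \frac{tp}{tp + fp}, \qquad
\mathrm{REC} = \frac{tp}{tp + fn}, \qquad
\mathrm{ACC} = \frac{tp}{tp + fp + fn}
```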
We chose a threshold accuracy level of ≥0.85 to select the exercises, which were acceptable for inclusion in our system prototype as biofeedback exercises; exercises falling below this level will be included but without segmentation-related biofeedback.
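The tolerance-based matching and the inclusion threshold described above can be sketched as follows. This is a minimal sketch under stated assumptions rather than the procedure used in the study, where coordinates were reviewed by an investigator. The function names and the example coordinates are hypothetical, and the metric formulas assume the standard definitions with an empty TN set.

```python
TOLERANCE = 50        # data points either side of a ground truth coordinate (from the text)
ACC_THRESHOLD = 0.85  # inclusion threshold for segmentation-related biofeedback (from the text)
SAMPLING_HZ = 102.4   # sampling frequency in Hz (from the text)


def match_coordinates(predicted, truth, tolerance=TOLERANCE):
    """Count (tp, fp, fn) using the +/- tolerance rule."""
    # TP: a generated coordinate within the tolerance of any ground truth coordinate.
    tp = sum(1 for p in predicted if any(abs(p - t) <= tolerance for t in truth))
    # FP: every other generated coordinate.
    fp = len(predicted) - tp
    # FN: a ground truth coordinate with no generated coordinate within the tolerance.
    fn = sum(1 for t in truth if not any(abs(p - t) <= tolerance for p in predicted))
    return tp, fp, fn


def scores(tp, fp, fn):
    """Return (precision, recall, accuracy), with the TN set taken as empty."""
    pre = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    acc = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return pre, rec, acc


# Hypothetical coordinates, expressed as sample indices.
truth = [100, 400, 700, 1000]
predicted = [110, 390, 560, 1030]

tp, fp, fn = match_coordinates(predicted, truth)
pre, rec, acc = scores(tp, fp, fn)
meets_threshold = acc >= ACC_THRESHOLD
tolerance_seconds = TOLERANCE / SAMPLING_HZ  # just under half a second

print(tp, fp, fn, pre, rec, acc, meets_threshold)
```

In this example three of the four generated coordinates fall within the tolerance, one is a false positive and one ground truth coordinate is missed, so the exercise would fall below the 0.85 threshold.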
Results
IMU data for the 31 variations was collected from 35 participants (22 male, 13 female, age: 22–69). A mean of 677 (range 449–1079) repetitions per exercise and a mean of 240 (range 98–300) repetitions per variation were collected, with some loss of data due to dropped signals or automatic reconfiguration of sensor parameters to default settings. Twenty percent of the data was allocated for algorithm testing and the remainder used for training purposes. Results for accuracy testing are presented in Tables 2 and 3. A score of 1 signifies that all co-ordinates corresponding to repetition start and end points were correctly identified, with no false positive or false negative co-ordinates detected.
Table 2. Accuracy of segmentation system in exercises.
ACC: accuracy; PRE: precision; REC: recall; W: wrist sensor; A: arm sensor; T: traps sensor.
Low accuracy scores of <0.85 are italicised.
Table 3. Mean overall accuracy (MOA), precision and recall of each sensor or sensor combination for all 11 exercises.
ACC: accuracy; PRE: precision; REC: recall; W: wrist sensor; A: arm sensor; T: traps sensor.
Discussion
The results demonstrate that this novel machine-learning system can segment IMU data for shoulder rehabilitation exercises to a high level of accuracy, although there are some exceptions to this. The accuracy of this system varies between exercises, as several exercises possess characteristics which make segmentation fundamentally challenging. Overall, flexion and abduction exercises can be segmented to an excellent (>0.95) or very good (>0.90) level. This high level of accuracy is due to the large-magnitude movements created during these exercises, forming distinct and clear patterns in the IMU signal. However, in FLEX WALL, participants are instructed to ‘walk your fingers up the wall,’ which creates noise in the wrist IMU, resulting in a high number of false positive co-ordinates for this sensor. Traps as a single-sensor system produced poor results in six exercises (see Table 2), as these exercises involved little movement around the scapula, and larger limb movements. Conversely, in the ROLL exercise, which involves broad scapular movements and minimal limb motion, Traps out-performed Wrist and Arm sensors. We included the Traps sensor to monitor for elevation of the scapula, as the focus group had identified this as a common compensatory movement following shoulder surgery. Other sensor-based systems designed to detect scapular motion have been successfully developed using multiple IMUs, electromagnetic sensors and ultrasonography,29–31 but are not suitable for patient use in home rehabilitation. RET and ROLL, which both primarily involve motions of the scapulae against the thorax, achieved poor accuracy overall. These comparatively subtle, low-velocity movements created low-magnitude IMU signals with less meaningful patterns for segmentation purposes, particularly in the limb-worn sensors. 
RET, ROLL and FLEX WALL did not achieve an accuracy of ≥0.85 and so, while they will be included in the system prototype as part of the exercise programme, they will not have segmentation-related biofeedback features.
The ability to demonstrate a high level of accuracy with a single-sensor system is both novel and clinically relevant. While systems using IMUs to track shoulder movement for home rehabilitation have been successfully developed,19,20 so far no effective system exists using one IMU. In comparison to multi-sensor systems, a single-sensor system is more user-friendly and realistically deployable for use in a home rehabilitation setting. Of the single-sensor systems, the mean overall accuracy (MOA) values for Wrist, Arm and Traps were 0.871, 0.837 and 0.666, respectively (Table 3). In comparison, the most accurate multi-sensor system (Arm and Wrist) achieved an MOA of 0.918. As only those exercises with an accuracy of ≥0.85 will be developed to have biofeedback features in the prototype, we considered these exercises separately and found the MOA was 0.956 for Arm and 0.95 for Wrist. To decide which sensor is most suitable for a single-sensor system, usability must also be considered. Users may prefer the wrist as it is further from the operation site, which may still be painful or swollen in the early post-operative stages. The wrist location is also feasible: in a study of 20 participants with mild to moderate upper limb impairments, Lee et al.21 found that commercial activity monitors could be independently applied to the wrist, provided that the fastening mechanism did not require a high level of finger and hand dexterity. Forner-Cordero et al.32 report that skin-mounted IMUs are more likely to produce a distorted signal or artefact if there is soft tissue between the sensor and the bone. We placed the wrist sensor close to the bone on the dorsum of the wrist, while the arm sensor was placed where there is often excess adipose or muscle tissue. Additionally, the upper arm can often be wider at the top than at the bottom, a shape in which displacement of the sensor and loss of orientation can occur.33 Taken together, the accuracy and usability considerations indicate that the wrist is the optimal location for the development of a single-sensor system.
The main musculoskeletal aims of post-operative exercise programmes are to improve ROM, increase strength and return to functional patterns of movement.34,35 To achieve this, a variety of exercises are prescribed, some of which require the user to ‘hold’ at the peak of the movement for several seconds to increase ROM (i.e. a ‘static stretch’). 36 This segmentation model is designed to detect static stretches by identifying sections of movement with sections of silence in between as one repetition, regardless of the number of windows of silence in between them (Figure 3). As a result, users will be able to perform any exercise in the system as a stretch as well as a conventional repetition; this greatly expands the versatility and relevance of the system.

Figure 3. Accelerometer and gyroscope data of shoulder flexion stretch in supine. Windows of silence, lasting ∼9 s, do not limit the system’s ability to detect a single repetition.
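The stretch handling described above, in which movement, silence and movement are counted as one repetition regardless of how long the silence lasts, can be sketched as a small state machine. This is a simplified illustration and not the authors' FSM: the state names, the function `segment_stretches` and the helper `stretch_labels` are hypothetical, and the sketch assumes that every repetition in the set contains a hold.

```python
def segment_stretches(labels):
    """Return (start, end) window indices, one pair per stretch repetition.

    States: rest -> out (movement) -> hold (silence) -> back (movement) -> rest.
    The length of the hold plays no part in the transitions.
    """
    repetitions = []
    state, start = "rest", None
    for i, label in enumerate(labels):
        moving = label == "dynamic"
        if state == "rest" and moving:
            state, start = "out", i              # outward movement begins
        elif state == "out" and not moving:
            state = "hold"                       # silence at the peak of the movement
        elif state == "hold" and moving:
            state = "back"                       # return movement begins
        elif state == "back" and not moving:
            repetitions.append((start, i - 1))   # repetition complete
            state = "rest"
    if state == "back":                          # signal ended during the return
        repetitions.append((start, len(labels) - 1))
    return repetitions


def stretch_labels(hold_windows):
    """Build a hypothetical label stream for one repetition with a given hold length."""
    return (["dormant"] * 5 + ["dynamic"] * 10 + ["dormant"] * hold_windows
            + ["dynamic"] * 10 + ["dormant"] * 5)


short_hold = segment_stretches(stretch_labels(20))
long_hold = segment_stretches(stretch_labels(900))
print(len(short_hold), len(long_hold))
```

Both the short and the long hold produce a single repetition, because only the order of dynamic and dormant runs drives the transitions. Separating a hold from a rest between repetitions in a mixed programme would need additional information, such as the movement phase, which this sketch does not model.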
The main limitation of this study is that the IMU data was collected from participants with no current shoulder pathologies, and participants needed to be taught how to perform the exercises with the specified technique deviations. Ideally, the dataset should be collected from a sample of the system’s target population, as IMU data from individuals with shoulder dysfunction will vary from that of a healthy population. Additionally, the data was collected under laboratory conditions with a researcher instructing participants on IMU placement and orientation. ‘Real world’ applications of IMU-based systems without constant supervision may suffer from poorer accuracy as a result. It is important that the validity of this system is now assessed with a clinical population in a ‘real world’ scenario. As previously mentioned, some data was lost due to dropped signal or system failure. While this did not appear to affect results, we recommend that any replications of this methodology use a second investigator to monitor the sensor signal during data collection. There are no published guidelines for selecting a threshold to distinguish TP from FP, and we were required to self-select the threshold of ±50 data points. These rehabilitation exercises are performed slowly, and there is a comparatively low level of precision required regarding the exact moment the exercise begins or ends. Therefore, this is an appropriate timeframe and would lead to a minimal margin of error in the results, but we acknowledge this self-selection of metrics as a limitation of this study. Future work will focus on refining the algorithm to further improve accuracy, developing the exercise classification model and developing a mobile biofeedback application with which to validate the segmentation and classification system in a clinical population.
Conclusion
Digital biofeedback support systems using IMUs can assist during home rehabilitation by providing information on exercise performance. For this, accurate segmentation of exercise repetitions is essential. IMU data for shoulder exercises was used to train and test a machine learning segmentation system. The system could accurately segment exercise repetitions using a single sensor located at the wrist for most of the selected exercises; however, several exercises were inaccurately segmented, largely due to characteristics of the movement involved. Future work will focus on developing an exercise classification model and accompanying biofeedback mobile application.
Footnotes
Acknowledgements
The authors would like to thank all those who gave their time to participate in this research. We also wish to thank Prof. Anand Pandyan, Associate Editor, and the two anonymous editors for their helpful feedback on this manuscript.
Declaration of conflicting interests
The author(s) declare that there is no conflict of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article. This project was supported by a grant from the European Union’s Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement no. 722012.
Guarantor
BC.
Contributorship
LB, AB, BC and TK conceived the study. LB researched literature, developed protocol, gained ethical approval, recruited participants and conducted data collection. AB and TK developed and implemented data segmentation system. LB wrote the first draft of the manuscript. All authors reviewed and edited the manuscript and approved the final version of the manuscript.
