Abstract
Objective
Early detection and intervention are essential for the mitigation of degenerative cervical myelopathy (DCM). However, although several screening methods exist, they are difficult to understand for community-dwelling people, and the equipment required to set up the test environment is expensive. This study investigated the viability of a DCM-screening method based on the 10-second grip-and-release test using a machine learning algorithm and a smartphone equipped with a camera to facilitate a simple screening system.
Methods
The study included 22 patients with DCM (DCM group) and 17 participants without DCM (control group). A spine surgeon diagnosed the presence of DCM. Participants were filmed while performing the 10-second grip-and-release test, and the videos were analyzed. The probability of the presence of DCM was estimated using a support vector machine algorithm, and sensitivity, specificity, and area under the curve (AUC) were calculated. Two assessments of the correlation between estimated and recorded scores were conducted, each using a random forest regression model. The first used the Japanese Orthopaedic Association scores for cervical myelopathy (C-JOA), and the second used the Disabilities of the Arm, Shoulder, and Hand (DASH) questionnaire.
Results
The final classification model had a sensitivity of 90.9%, specificity of 88.2%, and AUC of 0.93. The correlations between each estimated score and the C-JOA and DASH scores were 0.79 and 0.67, respectively.
Conclusions
The proposed model could be a helpful screening tool for DCM as it showed excellent performance and high usability for community-dwelling people and non-spine surgeons.
Keywords
Introduction
Degenerative cervical myelopathy (DCM) 1 is a progressive disease with symptoms such as numbness, pain, and the inhibition of skilled movement, which can cause disabilities in daily life.2–4 Awareness of DCM is low,5,6 and its diagnosis is often delayed. 7 Early detection of DCM is essential for patients to receive appropriate medical treatment that inhibits the progression of these symptoms.8,9 In general, the longer the symptomatic period, the worse the severity of the disease and the postoperative outcome score.10–12 These issues have been listed among 10 research priorities for DCM. 13
A disorder known as “myelopathy hand” appears as the predominant symptom in DCM patients during the natural course of the disease.11,14 This disorder is a well-known symptom used to identify DCM's presence.15,16 In particular, the 10-second grip-and-release test (10-second test) is used to ascertain the presence of myelopathy hand. 16 This test evaluates the number of repetitive grip–release–grip motion cycles in a 10-second period (over 20 motion cycles is thought to be normal). 16 Because of its simplicity, this test is often used in a clinical setting; however, it typically only considers the number of motion cycles in a particular duration rather than the motion itself. Modified test protocols using video cameras and other devices have been proposed,15,17–22 with reports suggesting that there may be more important variables than the number of motion cycles when determining the myelopathy hand.15,20,21,23
Koyama et al. 24 developed a new classification model for DCM screening focusing on the 10-second test using Leap Motion (Leap Motion, San Francisco, CA, USA) and a machine learning algorithm. However, the developed test still requires a specific device and is not sufficiently easy to use for clinicians or patients themselves. More affordable and widespread tools such as a camera or smartphone are needed to overcome these limitations.
Recently, Zhang et al. 25 presented an open-source hand pose estimation system; as it only requires a processor, the entire system can run on a smartphone. If the system could accurately determine the DCM condition of an individual, it could be used as a screening tool for the general population. This would encourage individuals to visit a physician early enough to avoid worsening of their disease. Additionally, if the severity of DCM can be estimated, it would be helpful for medical staff.
This study aimed to identify subjects with DCM based on videos recorded with a smartphone. We hypothesized that the proposed identification system using a camera and the hand pose estimation system can identify the presence of DCM from a 10-second test video and accurately estimate the disease severity.
Methods
Participants
Participants were recruited between September 2020 and August 2021. Patients with DCM who were scheduled for cervical surgery were assigned to the DCM group. DCM was diagnosed by experienced spine surgeons through physical and neurological examinations (including examinations of deep tendon and pathologic reflexes such as the Hoffmann reflex) and imaging such as magnetic resonance imaging and computed tomography myelograms.
Patients in the control group were scheduled to undergo total hip arthroplasty and had no abnormal findings on neurological examination and no known degeneration or ossification of ligaments on cervical radiography. Experienced spine surgeons verified the latter aspect. The exclusion criteria included a history of upper extremity disorders, cerebral accidents, whole-body inflammatory diseases, dementia, and psychiatric diseases. Those who did not want to participate were also excluded. Further, because hand disorders were expected to affect the result, hand surgeons examined participants’ hands to confirm the absence of hand abnormalities. In addition, the presence of diabetes mellitus as a comorbidity and the HbA1c values were recorded.
All procedures were approved by the Institutional Review Board of Tokyo Medical and Dental University (approval number: M2019-047), and all participants provided written informed consent prior to participating in this study. The experimental procedures were conducted in accordance with the Declaration of Helsinki.
Questionnaires
All participants were requested to answer the Japanese Orthopaedic Association scores for cervical myelopathy (C-JOA)26,27 and the Disabilities of the Arm, Shoulder, and Hand (DASH)28,29 questionnaires to determine disease severity.
Measurement procedure
This study used a Samsung Galaxy S9 SC-02 smartphone (Samsung Electronics, Suwon, South Korea) equipped with a camera. The smartphone was placed on a desk with the screen facing upward, and participants held their hands above the smartphone (Figure 1A). They were then asked to grip and release their hands as fast as possible. The hand movement was recorded at 30 frames per second. During the trial, it was ensured that the hand remained within the image frame captured by the smartphone. Recording continued until 20 motion cycles were completed or the recording time reached 10 seconds. Both left and right hands were recorded. Before each participant's trial, an examiner demonstrated the 10-second test; participants were not allowed to practice the movement.

Figure 1. (A) Capture setting of the hand and smartphone. (B) Original coordinate system (white arrows and letters x and y) and transformed coordinate system (yellow arrows and letters x’ and y’). The third axis was set orthogonal to these two axes.
Data processing
The recorded videos were transferred to a personal computer, and the hand pose was estimated using MediaPipe Hands, provided by Google. 25 MediaPipe Hands estimates 21 positions of the hand and fingers from each video frame and outputs each point's horizontal, vertical, and depth position relative to the image's origin (top left). The coordinate system was adjusted to eliminate the effect of the position of the hand relative to the camera on the data characteristics. The horizontal and vertical axes were rotated so that the first axis coincided with the direction of the base of the index finger and the second axis was orthogonal to it. Furthermore, the origin was translated to the wrist point (Figure 1B). The positions of each point were calculated according to the adjusted coordinate system. Because the wrist position was the origin of the coordinate system, the remaining 20 of the 21 points were used for the analysis. In this step, 60 variables were obtained for further analysis (20 points in three directions) (Figure 2).
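The coordinate adjustment described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: it assumes that the "direction of the base of the index finger" means the in-plane direction from the wrist (MediaPipe Hands landmark 0) to the index-finger MCP joint (landmark 5), and that the depth axis is only translated, not rotated. The function name `normalize_hand` is hypothetical.

```python
import numpy as np

WRIST = 0             # MediaPipe Hands landmark index of the wrist
INDEX_FINGER_MCP = 5  # MediaPipe Hands landmark index of the index-finger base


def normalize_hand(landmarks):
    """Express one frame of landmarks in a wrist-centred, hand-aligned frame.

    landmarks: array of shape (21, 3) holding horizontal, vertical, and
    depth positions. Returns an array of shape (20, 3); the wrist row,
    which is all zeros after translation, is dropped, leaving
    20 points x 3 directions = 60 values per frame.
    """
    pts = np.asarray(landmarks, dtype=float)
    shifted = pts - pts[WRIST]                 # move the origin to the wrist
    dx, dy = shifted[INDEX_FINGER_MCP, :2]
    theta = np.arctan2(dy, dx)                 # assumed: wrist -> index base angle
    c, s = np.cos(theta), np.sin(theta)
    # Rotation by -theta maps the wrist -> index-base direction onto axis 1.
    rot = np.array([[c, s], [-s, c]])
    out = shifted.copy()
    out[:, :2] = shifted[:, :2] @ rot.T
    return np.delete(out, WRIST, axis=0)


if __name__ == "__main__":
    frame = np.random.default_rng(0).random((21, 3))  # synthetic landmarks
    print(normalize_hand(frame).shape)
```

With this convention, moving or rotating the hand within the image plane leaves the output unchanged, which is the stated purpose of the adjustment.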

Figure 2. Data analysis flow. The feature values calculated from the videos were preprocessed according to the procedure shown in this figure and used for further analysis.
Subsequently, the time-series data of each of the 60 variables were divided into 20 data segments of equal length (64 video frames) (Figure 2). Fast Fourier transform (FFT) was applied to each segment, and the one-sided amplitude spectrum was used, selecting the lower-half frequency components to eliminate duplicates. Because each segment consisted of 64 data points, the FFT procedure yielded 32 values per segment. In this phase, 32 frequency feature values, which described the periodic characteristics of hand movement, were obtained for each of the 20 data segments of each of the 60 kinematic variables per participant.
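The segmentation and FFT step can be illustrated as follows. This is a sketch under stated assumptions: the text does not say how 20 segments of 64 frames are placed within a recording of at most about 300 frames, so evenly spaced (overlapping) window starts are assumed here, and the amplitude scaling is one common convention for a one-sided spectrum. The function names are hypothetical.

```python
import numpy as np

FRAMES_PER_SEGMENT = 64
N_SEGMENTS = 20


def segment_starts(n_frames, n_segments=N_SEGMENTS, length=FRAMES_PER_SEGMENT):
    """Evenly spaced window start indices (an assumption; windows may overlap)."""
    if n_frames < length:
        raise ValueError("recording is shorter than one segment")
    return np.linspace(0, n_frames - length, n_segments).round().astype(int)


def fft_features(series):
    """Return a (20, 32) array of amplitude features for one kinematic variable."""
    series = np.asarray(series, dtype=float)
    feats = []
    for start in segment_starts(len(series)):
        seg = series[start:start + FRAMES_PER_SEGMENT]
        # rfft of 64 real samples gives 33 bins (0 .. Nyquist).
        spectrum = np.abs(np.fft.rfft(seg)) / FRAMES_PER_SEGMENT
        spectrum[1:] *= 2.0  # fold the mirrored half into a one-sided spectrum
        feats.append(spectrum[:FRAMES_PER_SEGMENT // 2])  # keep bins 0 .. 31
    return np.array(feats)


if __name__ == "__main__":
    # Synthetic 10-second trace at 30 frames per second, 3.75 Hz oscillation.
    t = np.arange(300) / 30.0
    features = fft_features(np.sin(2 * np.pi * 3.75 * t))
    print(features.shape)
```

At 30 frames per second, bin k of a 64-frame segment corresponds to k × 30/64 Hz, so the 32 retained values cover frequencies from 0 to just under 15 Hz.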
Several models were created by combining these feature values, and the model with the highest identification performance was selected according to the area under the curve (AUC). 30 Each finger and kinematic position were included in the models as feature values in this step.
In addition to these newly proposed models, we formulated the count-based model, which used the count of the 10-second test of both hands as feature values, as the baseline method.
Binary classification
With these variables, support vector machine (SVM) classifiers were used for binary classification of whether or not DCM was present (Figure 2). The feature values were evaluated within each of the 20 data segments, and the SVM output a classification result as a real number for each segment. Because this number reflected the probability that the participant had DCM, the probability for each participant was estimated by averaging the SVM outputs of the 20 segments. In the validation phase, 10-fold cross-validation was used, 31 in which the data were divided into 10 groups. Each group in turn was used for evaluation, while the others were used for training. The model performance scores were reported as the average of the cross-validation results.
In the count-based model, the steps of binary classification were the same as those of the proposed model, except for the feature values.
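The classification procedure described above might look roughly like the following scikit-learn sketch. Several details are assumptions, because the text does not specify them: an RBF kernel with default hyperparameters, feature standardization, folds split by participant, and the signed SVM decision value (not a calibrated probability) as the per-segment output. The function name `cross_validated_scores` is hypothetical.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def cross_validated_scores(X, y, n_splits=10, seed=0):
    """Return one averaged SVM score per participant.

    X: (n_participants, n_segments, n_features) feature array.
    y: (n_participants,) labels, 1 = DCM, 0 = control.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_part, n_seg, n_feat = X.shape
    scores = np.zeros(n_part)
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, test_idx in folds.split(np.zeros(n_part), y):
        # Assumed model: standardized features and an RBF-kernel SVM.
        model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
        # Every segment of a training participant is one training sample.
        model.fit(X[train_idx].reshape(-1, n_feat),
                  np.repeat(y[train_idx], n_seg))
        seg_scores = model.decision_function(X[test_idx].reshape(-1, n_feat))
        # Average the segment outputs into one score per held-out participant.
        scores[test_idx] = seg_scores.reshape(len(test_idx), n_seg).mean(axis=1)
    return scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = np.array([1] * 22 + [0] * 17)       # group sizes from the study
    feats = rng.normal(size=(39, 20, 4)) + labels[:, None, None] * 2.0
    print(cross_validated_scores(feats, labels).round(2))
```

Splitting folds by participant keeps all 20 segments of one person on the same side of the train/evaluation boundary, which avoids leaking a participant's own segments into training; whether the original analysis did this is not stated.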
Regression model of disease severity
Random forest regression analysis was used to evaluate whether the disease severity could be estimated using hand motion tracking data (Figure 2). The dependent variables were set as the C-JOA or DASH scores, and the explanatory variables were set as the feature values determined in the final classification model. The average of the 20 regression results, one for each segment, was regarded as the final result. In the validation phase, leave-one-out cross-validation was used. 32 This method holds out one case for evaluation and uses the remaining data for training. The training and evaluation protocol was repeated for every such split.
In addition to this proposed model, we calculated the correlation between the count of 10-second tests and each questionnaire score.
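The severity regression can be sketched in the same style. Again this is only an illustration under assumptions: the forest size and random seed are arbitrary, and the leave-one-out unit is taken to be one participant (all 20 segments held out together), which the text does not state explicitly. The function name `loo_severity_estimates` is hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def loo_severity_estimates(X, scores, n_trees=100, seed=0):
    """Leave-one-participant-out estimates of a questionnaire score.

    X: (n_participants, n_segments, n_features) feature array.
    scores: (n_participants,) recorded C-JOA or DASH scores.
    """
    X = np.asarray(X, dtype=float)
    scores = np.asarray(scores, dtype=float)
    n_part, n_seg, n_feat = X.shape
    estimates = np.zeros(n_part)
    for held_out in range(n_part):
        train = np.delete(np.arange(n_part), held_out)
        forest = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
        # Each training segment inherits its participant's recorded score.
        forest.fit(X[train].reshape(-1, n_feat),
                   np.repeat(scores[train], n_seg))
        # Average the per-segment predictions of the held-out participant.
        estimates[held_out] = forest.predict(X[held_out]).mean()
    return estimates


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    recorded = np.linspace(0.0, 17.0, 12)         # synthetic scores
    feats = rng.normal(scale=0.5, size=(12, 5, 3))
    feats[:, :, 0] += recorded[:, None]           # one informative feature
    print(loo_severity_estimates(feats, recorded, n_trees=20).round(1))
```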
Statistical analysis
The participants’ ages, HbA1c, 10-second test counts, C-JOA scores, and DASH scores were compared between the groups using Student's t-test, and their sex, dominant hand, and the number of patients with diabetes mellitus were compared using Fisher's exact test. The significance level was set at 5%. The binary classification model estimating the presence of DCM was evaluated using the AUC calculated from the receiver operating characteristic curves. The sensitivity and specificity were also calculated.
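For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, and AUC follow from labels and scores. The AUC is computed through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case), which equals the area under the receiver operating characteristic curve. The counts in the usage example are hypothetical: they are chosen to be consistent with the group sizes of this study, but the paper does not report the underlying confusion matrix.

```python
import numpy as np


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = DCM."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)     # DCM correctly flagged
    fn = np.sum(y_true & ~y_pred)    # DCM missed
    tn = np.sum(~y_true & ~y_pred)   # controls correctly cleared
    fp = np.sum(~y_true & y_pred)    # controls wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as P(positive score > negative score); ties count as 0.5."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[y_true], scores[~y_true]
    diff = pos[:, None] - neg[None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)


if __name__ == "__main__":
    # Hypothetical outcome: 20 of 22 DCM and 15 of 17 controls classified correctly.
    y_true = [1] * 22 + [0] * 17
    y_pred = [1] * 20 + [0] * 2 + [0] * 15 + [1] * 2
    sens, spec = sensitivity_specificity(y_true, y_pred)
    print(f"sensitivity {sens:.1%}, specificity {spec:.1%}")
    # prints "sensitivity 90.9%, specificity 88.2%"
```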
The disease severity values estimated by the random forest regression method were evaluated using Spearman's rank-order correlation coefficients. The root mean squared error (RMSE) between the estimated values and the recorded values was also calculated for the C-JOA and DASH scores. In addition, Spearman's rank-order correlation coefficients were calculated between the 10-second test count and each questionnaire score.
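The two evaluation measures for the regression can be written compactly with NumPy. This is a generic sketch rather than the authors' implementation: Spearman's coefficient is computed as the Pearson correlation of the ranks, with tied values sharing their mean rank. The helper names are hypothetical.

```python
import numpy as np


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values))
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks


def spearman(a, b):
    """Spearman's rank-order correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(average_ranks(a), average_ranks(b))[0, 1])


def rmse(estimated, recorded):
    """Root mean squared error between estimated and recorded scores."""
    diff = np.asarray(estimated, dtype=float) - np.asarray(recorded, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


if __name__ == "__main__":
    recorded = [17, 15, 12, 10, 9]           # synthetic questionnaire scores
    estimated = [16.0, 15.5, 11.0, 12.0, 8.5]
    print(spearman(estimated, recorded), rmse(estimated, recorded))
```

Because Spearman's coefficient depends only on rank order, it rewards a model that orders participants correctly by severity even when the absolute estimates are offset, which is why the RMSE is reported alongside it.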
All data analyses were performed using Python (Python Software Foundation, Wilmington, DE).
Results
Participant demographics and characteristics
In total, 22 patients with DCM (DCM group) and 17 participants without DCM (control group) were recruited. No significant differences were found between the DCM and control groups regarding age, sex, or dominant hand (Table 1). The number of patients with diabetes mellitus tended to be larger in the DCM group; however, the HbA1c was not significantly different. The 10-second test count and C-JOA score were lower in the DCM group than in the control group. The DASH score was higher in the DCM group than in the control group.
Table 1. Demographic and disease details of participants.
The age, HbA1c, Count of 10-second test, C-JOA score, and DASH score are given as the mean ± standard deviation.
DCM, degenerative cervical myelopathy; CSM, cervical spondylotic myelopathy; OPLL, ossification of the posterior longitudinal ligament; CDH, cervical disk herniation.
Binary classification model
First, a model including all feature variables and models focusing on each finger were constructed and evaluated with the SVM. These results are shown in Table 2. Thumb movement was the most predictive variable. Second, models focusing on the tip and each joint of each finger were constructed and evaluated, and the thumb interphalangeal (IP) joint was the most predictive variable, as shown in Table 3. Moreover, the AUC was higher in the model that considered the movement of both hands than in the models that considered each hand individually (AUC; right: 0.89, left: 0.90, both: 0.93) (Table 4). Thus, the thumb IP joint movement of both hands was used as the feature values for further analysis. The final model's sensitivity and specificity were 90.9% and 88.2%, respectively (left panel in Figure 3).

Figure 3. Receiver operating characteristic curves of the final proposed model using the thumb IP joints of both hands as features (left panel) and of the count-based model (right panel). The red cross indicates the highest performance point.
Table 2. Classification model performances for each and all fingers.
The models were constructed with both hand data.
AUC, area under the curve.
Table 3. AUCs of the classification models focusing on the tips and joints of each finger.
Joint names are given for the index to little fingers, with the corresponding thumb joints in parentheses.
The models were constructed using both hand data.
AUC, area under the curve; DIP, distal interphalangeal; IP, interphalangeal; PIP, proximal interphalangeal; MP, metacarpophalangeal; CM, carpometacarpal.
Table 4. Model performance by hand selection, using the thumb interphalangeal joint position as the feature value.
AUC, area under the curve.
Furthermore, the count-based model, which used the count of 10-second test of both hands, indicated 79.4% sensitivity, 86.4% specificity, and 0.87 AUC (right panel in Figure 3).
Regression model
Regression analysis was performed using the random forest regression model with both hands’ thumb IP joint movement as explanatory variables. Significant correlations were found between the values estimated by the random forest regression and the C-JOA and DASH scores. The correlation coefficients for the proposed model were 0.79 for the C-JOA score and 0.67 for the DASH score (left panels in Figure 4A, B). The RMSE values were 2.41 for the C-JOA score and 21.86 for the DASH score. The correlation coefficient between the 10-second test count and the C-JOA score was 0.61, and that for the DASH score was −0.68 (the coefficient is negative because a higher count indicates lower severity, whereas a higher DASH score indicates greater disability) (right panels in Figure 4A, B).

Figure 4. (A) Scatterplots of the C-JOA scores: the scores estimated using the random forest regression model (left panel in A) and the count of 10-second test (right panel in A). (B) Scatterplots of the DASH scores: the scores estimated using the random forest regression model (left panel in B) and the count of 10-second test (right panel in B).
Discussion
This study used a smartphone with a camera and a machine-learning algorithm to evaluate whether the proposed screening method for DCM was effective, focusing on the movement of each joint or point of the hand during the 10-second test.
Owing to its inherent usability, the 10-second test was employed to screen patients for DCM, and the sensitivity, specificity, and AUC were reported as 61–77%, 52–66%, and 0.71–0.77, respectively. 33 Zheng et al. 34 previously used video and artificial intelligence and also reported higher model performance when the outcome was set as the classification of the disease severity presumed by the number of grip–release–grip cycles. However, it appeared to be insufficient because the correlation of the C-JOA score with the number of grip–release–grip cycles was approximately 0.5, 35 which, although indicating a strong correlation, is not sufficient. 30 Thus, there was much room for improving the screening accuracy and severity estimation.
The performance of the 10-second test focusing on the required time and the number of grip–release–grip cycles was considered insufficient as it appeared to discard rich information regarding hand movement. Further performance improvements are expected from utilizing this discarded information. Koyama et al. 24 investigated hand movement during the 10-second test for the screening of DCM using Leap Motion (which consists of infrared cameras and LEDs) and obtained a good performance with 84.0% sensitivity, 60.7% specificity, and 0.85 AUC. In this study, the count-based model had 79.4% sensitivity, 86.4% specificity, and 0.87 AUC. The sensitivity, specificity, and AUC of the final proposed model were 90.9%, 88.2%, and 0.93, respectively. These scores are high, suggesting that this is a viable tool for screening DCM.
Conventionally, various additional methods are used to support the diagnosis or screening; for example, Hoffmann's sign (sensitivity: 31–94%, specificity: 73–78%),36–40 finger escape sign (sensitivity: 48–55%),38,40 and Babinski's test (sensitivity: 7–33%, specificity: 92–100%).36–38 Recent studies have also reported the hand-grip strength (sensitivity: 91–96%, specificity: 59–88%, AUC: 0.87–0.97), 41 10-second step test (sensitivity: 86–94%, specificity: 59–82%, AUC: 0.87–0.93), 33 and 10-second test with a glove equipped with sensors (sensitivity: 86%, specificity: 86%, AUC: 0.93) as being helpful for screening the DCM condition. 22
Compared with the above methods, the proposed method showed comparable or better performance. In addition, it offers advantages in usability: it requires no equipment other than a smartphone or a similar device for capturing videos, and the test can be performed in the same way as the traditional 10-second test. These features make it simple to use and satisfy the easy-screening requirement.
The final model was selected according to the AUC value. The chosen feature was the movement of the thumb IP joint of both hands, which was the most characteristic for screening in this setting. Previous reports suggested that the inability to extend the ulnar fingers was a feature of patients with DCM, 16 which was not reflected in the results of this study. However, the models that focused on features other than the thumb also showed excellent classification performance (Table 2; AUC > 0.85). The highest performance of the thumb movement model can be attributed to the capture setting and compensatory movement.
This study captured the hand movement from under the hand, facing the palm. In this setting, the finger movement direction of the index to little fingers was orthogonal to the camera view. The thumb movement was parallel to the camera view; thus, the characteristic motion may have been detected in greater detail. The characteristics of the myelopathy hand were believed to be a dysfunction of the extension of two or three ulnar fingers. 16 However, Sakai recently reported that the thumb IP joint demonstrated a more significant flexion movement for compensating the weakened intrinsic muscles during the repeated pinching task in patients with DCM. 42
The final model obtained the highest performance by focusing on the thumb IP movement, possibly because this movement reflects the thumb's compensatory motion. Such compensatory motion during the 10-second test may be a new observation in patients with DCM and has been little investigated.
Moreover, the models aggregating the data from both hands performed better than those based on only one hand. It should be noted that the appearance of symptoms in both upper extremities is typical in patients with DCM. 43 Considering the bilateral progression of DCM may further improve the model's accuracy.
Random forest regression analysis was used to estimate the disease severity in the proposed model. The estimates of the developed model correlated significantly with both the C-JOA and DASH scores, with correlation coefficients of 0.79 and 0.67, respectively. The correlations between the 10-second test count and each questionnaire score were 0.61 and −0.68, respectively. Thus, the proposed model may perform similarly to or better than the count-based method in predicting disease severity. This study used only the hand motion data recorded during the 10-second test. The DASH questionnaire includes questions related to upper extremity functions (including hand, elbow, and shoulder functions), such as the usability of the upper extremities in activities of daily living. The C-JOA score includes items related to upper extremity functions as well as ambulation and bladder functions. As both scores were similarly correlated with the proposed model and the RMSE values were small, especially for the C-JOA score, it may be that the proposed model precisely evaluated the hand movement function of the DCM patients and that the severity of DCM was captured in the patients’ hand movement during the 10-second test.
This study has certain limitations. First, the patients in the DCM group were those scheduled for surgery; thus, their disease was already severe, and whether our model can screen patients with early DCM remains unknown. Future work should include patients with mild to moderate symptoms so that DCM can be screened early. Second, participants with other disorders that affect hand movement were not included in this study. Ideally, such patients should be included, and the model should distinguish patients with DCM from them. Third, while the effect of aging on hand movements during the 10-second test is under discussion,20,33,35,44 the participants of this study were quite elderly, so younger participants should be recruited before the method is used more widely. Fourth, the examinees were able to watch a demonstration by the examiner in this study. For general public use, data recorded by the examinee alone will contain more noise, and the screening and prediction performance may be lower. Overcoming these limitations in future work should improve the classification method. Nevertheless, the results of this study remain valuable as a pilot study.
Conclusions
This study identified patients with DCM using a smartphone and a machine learning algorithm with high sensitivity, specificity, and AUC by focusing on the IP joint movements of the thumbs of both hands. Moreover, the severity of the disease was estimated using the same test protocol. These results provide a basis for a DCM screening tool that does not require special equipment. This is advantageous for self-checking among community dwellers, people without access to a spine surgeon, and surgeons not specialized in diagnosing spine disease.
Footnotes
Acknowledgments
We thank Drs Davies, Hilton, and their colleagues for their work, which inspired the present study.
Authors’ Note
Takuya Ibara, Ryota Matsui and Takafumi Koyama contributed equally to this work.
Contributorship
TI conducted the data analysis and wrote the first draft of the manuscript. RM and TK contributed to the study design and data quality control and analyzed the data. EY, AY, KT, and HK conducted literature searches, developed the concept for the study, and recruited participants during the study. AN, TY, AO, and HS developed the study materials and protocol and facilitated the data collection. HS and KF developed the concept for the study and curated and analyzed the data. All authors contributed to the data interpretation and manuscript revision, and approved the final version.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical approval
The Institutional Review Board of Tokyo Medical and Dental University approved this study (REC number: M2019-047).
Funding
This work was funded by grants from JST AIP-PRISM (grant number JPMJCR18Y2), JST PRESTO (grant numbers JPMJPR17J4, JPMJPR2134), and JSPS KAKENHI (grant number JP21H03485).
Guarantor
KF.
