Abstract
With the development of robotics, intelligent neuroprostheses for amputees are attracting growing attention, and robot control based on electrocardiography, electromyography, and electroencephalography is an active research area. In medical research, electrode arrays are commonly used as sensors for surface electromyography. Although these sensors collect more accurate data and sample at higher frequencies, they offer no advantage in portability or ease of use. In recent years, some small surface electromyography sensors have also become available for research. The portability of the sensor and the speed of the classification method directly affect the development of bionic prostheses. In this study, a consumer-grade surface electromyography device is used as the sensor. We first propose a data structure that converts raw surface electromyography signals from an array structure into a matrix structure (which we call a surface electromyography graph). A convolutional neural network is then used to classify it. Discrete surface electromyography signals recorded from three subjects performing 14 gestures (widely used in other research to evaluate classifier performance) were used to train the classifier, giving an accuracy of 97.27%. The impact of different convolutional neural network components was tested with these data, and the best-performing configuration was used to build the classifier in this article. The NinaPro database 5 (one of the largest surface electromyography data sets), which comprises hand movement data from 10 intact subjects recorded with two myo armbands, was also used to evaluate our method; classification accuracy increased by 13.76% on average with two myo armbands and by 18.92% on average with a single myo armband.
To drive the robot hand (bionic manipulator), a set of continuous surface electromyography signals was recorded to train the classifier, and an accuracy of 91.72% was obtained. We used the same method to collect surface electromyography data from an amputee who had lost a hand, classified them with the same network, and achieved an accuracy of 89.37%. Finally, the classifier was deployed to drive the bionic manipulator through a microcontroller; both the healthy subject and the amputee tested the bionic manipulator, and the URL of the full video is given in the conclusion. These results suggest that the method can help facilitate the development and application of surface electromyography neuroprostheses.
Introduction
With the development of technologies such as computer vision and signal processing, new types of human–computer interaction are becoming increasingly popular in robot control. Using robots such as neuroprostheses to assist amputees can significantly improve their quality of life. 1,2 Compared to other technologies, surface electromyography (sEMG) makes it possible to design wearable smart prostheses because of its ease of use and noninvasiveness. 3,4 A growing number of upper limb prostheses use sEMG pattern recognition for control. 5 Over the past decade of research, the support vector machine (SVM) 6 and linear discriminant analysis 7–9 have been considered the two best methods for classifying sEMG signals.
Since AlexNet outperformed SVM in image classification, 10 deep learning has been widely used in various classification problems. An early convolutional neural network, LeNet-5, 11 was proposed to solve the problem of handwritten character classification. With improvements in computer hardware and increases in data size, convolutional neural networks have made great breakthroughs in image classification, image recognition, 12 and image semantic segmentation. 13 In recent years, convolutional neural networks have also been used successfully in animal behavior classification, drug synthesis, and many other chemical and biological fields. 14,15 We believe that convolutional neural networks can perform well in classifying hand gestures from sEMG. Indeed, in recent years, many researchers have tried to use convolutional neural networks for sEMG-based gesture classification and have achieved some results. 16,17
There are two main reasons why sEMG-based neural prostheses have not reached the consumer level. First, the price, volume, and power consumption of traditional medical-grade sensors cannot meet the requirements of ease of use and low weight. 16,18 Second, traditional methods need to collect data over a period of time, extract features from the data, and classify them using those features. 19,20 This discourages long-term use of neuroprostheses by amputees. A convolutional neural network is an end-to-end classification method: unlike traditional machine learning methods, which rely on hand-designed features, its convolutional layers learn the required features automatically, which makes it possible to classify gestures from time-domain sEMG signals alone. Most researchers who have tried convolutional neural networks focused on the capability of the network, but for a bionic manipulator the structure of the network matters more: a better structure leads to a smaller model and faster computation. 21
In this article, a cheap, lightweight sEMG sensor called the myo armband is used to sample the sEMG signals. Considering the characteristics of the sEMG signals recorded from the myo armband, we designed a time-domain sEMG graph structure (a matrix structure) and then proposed an end-to-end classifier, based on a convolutional neural network and this data structure, to classify gestures. We recorded discrete sEMG signals from three subjects performing 14 gestures and used these data to test the performance of the classifier and the influence of different network structures on it. We also tested our method on the publicly accessible NinaPro database and compared our results with SVM and random forests, which have been shown to achieve the top performance on the NinaPro database. 22 Next, we recorded continuous sEMG signals from an intact subject (five gestures) and an amputee (six gestures), tested the performance of the classifier, and then applied it to real-time sEMG-based gesture control of a bionic manipulator.
Materials and methods
Database
Database 5 of the NinaPro project was used in this study. NinaPro is a publicly available database that aims to aid research on advanced myoelectric hand prostheses. 23 It has been widely used for research on sEMG-based gesture recognition. 24 NinaPro database5 includes sEMG data from 10 intact subjects recorded with two Thalmic Myo armbands. Each subject performs three exercises with a total of 52 different hand gestures. Since we want to guide a bionic manipulator with sEMG signals, we test our algorithm only on exercise B, which contains eight isometric and isotonic hand configurations and nine basic movements of the wrist, all of which are useful in robot control.
Next, we recorded our own experimental data: two sets of continuous gesture sEMG data, one from an intact subject and one from an amputee, and a set of discrete gesture sEMG data. Since we want to train the network to control the bionic manipulator, our data acquisition method differs from most sEMG recording experiments. An sEMG database usually requires the subject to hold each gesture for a short time and then repeat it several times; 17 to avoid fatigue, the gestures are alternated with a resting posture. In daily use, however, it is normal to hold a posture for a long time. Therefore, to design a smart prosthesis better suited to an amputee's daily use, we asked subjects to hold each gesture for a relatively long time and recorded the entire sEMG signal, so the acquired signal contains some noise due to fatigue and recovery. Because fatigue differs greatly between subjects, and these fluctuations would interfere strongly when exploring the effect of different normalization methods on accuracy, we created both discrete and continuous data sets. The difference between continuous and discrete sampling is shown in Figure 1(a).

Figure 1. Difference between discrete and continuous sampling and the 14 gestures used in this article. (a) Difference between discrete sampling (top) and continuous sampling (bottom) and (b) the 14 gestures used in this article.
In discrete sampling, we label only the stable part of the motion signal as the target action, as shown at the top right of Figure 1(a). Label 1 represents the action "rest," and label 2 is the label of the target action. We marked the beginning and end of each action (the transitions) and the periods of fatigue and recovery as 0. After all sEMG signals have been recorded, the signals corresponding to label 0 are deleted, which gives the discrete data set. Because we want to use the sEMG signal to drive the bionic manipulator, and the user's signal during actual use is not as clean as the discrete data set, we use the original data containing all signal perturbations as our continuous data set, as shown at the bottom right of Figure 1(a). Here too we labeled the action "rest" as 1 and the target motion as 2, and the perturbations generated by each action are counted as part of that action's data.
The discrete data set contains sEMG data from 3 intact subjects performing 14 gestures, recorded with one Thalmic Myo armband. The gestures are similar to those in NinaPro database5 exercise B and include 10 isometric and isotonic hand configurations and 4 basic movements of the wrist. The continuous data sets contain sEMG data from 1 intact subject performing 5 gestures and 1 amputee performing 6 gestures, both recorded at five different times. In the data sets we recorded, each gesture contains 6000 groups of sEMG signals. The gestures are shown in Figure 1(b), and their descriptions are given in Table 1.
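The construction of the discrete data set can be sketched in a few lines. This is only a minimal illustration, assuming each recorded sample carries one of the labels 0, 1, or 2 described above; the function name `make_discrete` and the list-based layout are hypothetical and are not taken from the recording software.

```python
# Labels as described in the text: 0 = transition, fatigue, or recovery;
# 1 = rest; 2 = target action.
TRANSITION = 0


def make_discrete(samples, labels):
    """Return only the samples (and their labels) whose label is not 0."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    kept = [(s, y) for s, y in zip(samples, labels) if y != TRANSITION]
    return [s for s, _ in kept], [y for _, y in kept]


# Six toy samples; the samples at indices 1 and 4 fall in transitions.
samples = ["s0", "s1", "s2", "s3", "s4", "s5"]
labels = [1, 0, 2, 2, 0, 1]
discrete_samples, discrete_labels = make_discrete(samples, labels)
```

The continuous data set, by contrast, would keep every sample and never assign label 0.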
Table 1. Fourteen gestures used in this article.
The discrete data set uses all 14 gestures, the continuous data set of the intact subject uses the gestures with labels 1, 4, 5, 6, and 9, and the continuous data set of the amputee uses the gestures with labels 1, 5, 7, 8, 9, and 12. NinaPro database5 is a continuous data set.
The amputee who participated in our experiment lost her left hand at the age of 4 in a car accident, and the skin on her arm was grafted from her scalp. Her left arm has barely grown since and has kept the length and thickness it had at the age of 4. Even when tightened to its minimum size, the myo armband was still too large for her left arm. We therefore taped two of the surface electromyography sensors together to make the armband smaller and used only the six remaining sensors to record her sEMG signals. The amputee's left arm and the myo armband she used are shown in Figure 2(a) and (b).

Figure 2. sEMG signal recording of the amputee. (a) The difference between the left arm and the right arm of the amputee, (b) myo armband for the amputee, and (c) the mirror experiment. sEMG: surface electromyography.
Because the amputee lost her hand at an early age, she had almost no phantom limb sensation and could not accurately perform the gestures we designed. We therefore designed a "mirror experiment" to help the amputee find the feeling of the phantom limb: a mirror is placed in front of the residual limb, covering the amputation site. The left and right hands perform the same gesture simultaneously, so that the amputee sees the amputated limb in the mirror and finds the feeling of moving it. The mirror experiment was first used to treat "phantom limb pain," 25,26 but in our research it is used to help amputees in rehabilitation training and to find the feeling of the phantom limb. Figure 2(c) shows the mirror experiment.
Hardware
The sensor used in this article is the myo armband, which contains a nine-degree-of-freedom inertial measurement unit and eight dry surface electromyography sensors, as shown in Figure 3(b). Its sampling frequency is 200 Hz. Although the sEMG sensors used in medical research usually sample at 1 kHz or more, the low price (only US$200) and ease of use (the user simply wears the armband) made the myo more convenient for our research. The bionic manipulator is a 3D-printed robot hand. The hand has five fingers, and each finger is driven by a motor through a plastic line. An Arduino microcontroller and a motor-driver shield are used to control the motors, and the program that converts each gesture into five motor angles is deployed on the Arduino. The structure and components are shown in Figure 3(a). The Arduino communicates with the host computer through USB type-D; the computer transmits the index of the gesture, and the Arduino then drives the motors to perform it.

Figure 3. Hardware used in this article. (a) Bionic manipulator, motors, and controller and (b) myo armband.
Data preprocessing
We rearrange the sEMG signals from the abovementioned three databases into an sEMG graph structure. The two data sets we recorded use one myo armband, which returns a group of eight electrode signals at each sampling instant; we map the signals from eight consecutive groups into an 8 × 8 matrix. If we map the "pixels" of the sEMG graph into [0, 255], the graph becomes a grayscale image that can be visualized. The process of mapping and visualization is shown in Figure 4. Each line contains eight values, and lines 10–17 form the first graph. When the 18th line arrives, it is inserted at the bottom of the picture, each line is shifted up by one, and the 10th line is removed, so that lines 11–18 form a new graph, and so on. Because two of the sensors were unused while the amputee wore the myo armband, only six of the eight channels collected at each sampling instant are effective. We therefore removed the two unused channels and obtained a 6 × 6 matrix. NinaPro database5 uses two myo armbands, which return a group of 16 electrode signals at each sampling instant, so we map the signals from 16 consecutive groups into a 16 × 16 graph.
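The sliding-window construction of the sEMG graph can be sketched as follows. This is a minimal sketch rather than the code used in the study: the names `semg_graphs` and `to_grayscale` are hypothetical, and the default raw range of −128 to 127 assumes that the armband delivers signed 8-bit samples.

```python
from collections import deque


def semg_graphs(frames, size=8):
    """Yield one size x size graph for every new frame once the window is full.

    Each frame is one group of `size` electrode values. Appending a frame to a
    full window drops the oldest frame, which mirrors the shift-up-by-one
    update described in the text.
    """
    window = deque(maxlen=size)
    for frame in frames:
        if len(frame) != size:
            raise ValueError("each frame must contain %d values" % size)
        window.append(list(frame))
        if len(window) == size:
            yield [row[:] for row in window]


def to_grayscale(graph, lo=-128, hi=127):
    """Linearly map values in [lo, hi] to integers in [0, 255] for display.

    The default range is an assumption (signed 8-bit samples).
    """
    scale = 255.0 / (hi - lo)
    return [[int(round((min(max(v, lo), hi) - lo) * scale)) for v in row]
            for row in graph]


# Ten toy frames of eight channels each give three overlapping 8 x 8 graphs.
frames = [[i] * 8 for i in range(10)]
graphs = list(semg_graphs(frames))
```

Calling `semg_graphs(frames, size=6)` or `size=16` would give the 6 × 6 and 16 × 16 variants, provided each frame has the matching number of channels.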

Figure 4. The process of converting sEMG signals into a graph and its visualization. sEMG: surface electromyography.
Then, we processed the sEMG graph with two different normalization methods, tested the influence of the two methods on the classifier using the discrete sEMG data set from the three subjects, and determined the better preprocessing method for the sEMG gesture classification problem.
Min–max normalization
Min–max normalization is a linear normalization technique widely used in image preprocessing. It does not change the shape of the distribution of the data set:
\[ x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \]
where x_min and x_max are the minimum and maximum of the input sEMG data. Min–max normalization maps the sEMG data into [0, 1].
Zero-mean normalization
Zero-mean normalization is one of the most common normalization methods:
\[ x' = \frac{x - \mu}{\sigma} \]
where μ and σ are the mean and standard deviation of the full input sEMG data, respectively. After zero-mean normalization, the sEMG data have a mean of 0 and a standard deviation of 1.
During testing, we found that the magnitude of the sEMG values changes each time the myo armband is taken off and put back on. Therefore, each time the armband is put on, in a step similar to the myo initialization, we ask the user to hold the gesture with label 12 for 5 s, find the maximum and minimum values, and calculate the mean and variance. These values are then used for the normalization.
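The calibration step and the two normalization methods can be illustrated with a short sketch. It assumes that calibration frames are collected while the user holds gesture 12 (5 s at 200 Hz would give about 1000 frames) and that one set of statistics is shared by all channels; the function names are hypothetical.

```python
import statistics


def calibrate(frames):
    """Compute normalization statistics from the calibration frames."""
    values = [v for frame in frames for v in frame]
    return {
        "min": min(values),
        "max": max(values),
        "mean": float(statistics.mean(values)),
        "std": statistics.pstdev(values),  # population standard deviation
    }


def min_max(graph, stats):
    """Map values into [0, 1] using the calibrated minimum and maximum."""
    span = stats["max"] - stats["min"]
    if span == 0:
        raise ValueError("calibration data are constant")
    return [[(v - stats["min"]) / span for v in row] for row in graph]


def zero_mean(graph, stats):
    """Subtract the calibrated mean and divide by the standard deviation."""
    if stats["std"] == 0:
        raise ValueError("calibration data are constant")
    return [[(v - stats["mean"]) / stats["std"] for v in row] for row in graph]


# Toy calibration data: two frames of two channels.
stats = calibrate([[0, 2], [4, 6]])
scaled = min_max([[0, 3, 6]], stats)
centered = zero_mean([[3]], stats)
```

Values recorded later can fall outside the calibrated range, so `min_max` as written may return values slightly below 0 or above 1.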
Classification
We use a residual convolutional neural network as our base classifier, 27 as shown in Figure 5, and we test the performance of different networks by removing the residual layers and convolutional layers from it, which turns the network into a plain convolutional neural network and a fully connected neural network, respectively. Our classifier contains four convolutional layers, one residual layer, four fully connected layers with dropout, and a softmax loss layer. The normalized exponential (softmax) function is used in the cost function, and the softmax layer outputs both the gesture label and the probability of each gesture. We use Adam as the optimizer to train the parameters of each layer by backpropagation. 28 In the fully connected layers, we use rectified linear units as the activation function instead of the sigmoid or tanh function, because they have been shown to mitigate the vanishing gradient problem. 29 To avoid overfitting, we used dropout 30 with a keep probability of 0.5 in the fully connected layers. We also applied instance normalization, which is usually used in image stylization, to the residual layer in order to maximize the influence of the residual value. The computation in instance normalization is the same as in zero-mean normalization, except that μ and σ are calculated from the current input graph only. All classifier code was written with the open source deep learning framework TensorFlow, and we save the model to a binary file so that it can be applied to control the bionic manipulator.
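The difference between instance normalization and zero-mean normalization can be made concrete with a small sketch. It is written in plain Python for clarity and is not the TensorFlow code of the classifier; the small constant `eps`, added to avoid division by zero, is an assumption.

```python
import math


def instance_normalize(graph, eps=1e-5):
    """Normalize one graph using the mean and variance of that graph alone."""
    values = [v for row in graph for v in row]
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    sigma = math.sqrt(var + eps)  # eps is an assumed stabilizer
    return [[(v - mu) / sigma for v in row] for row in graph]


normalized = instance_normalize([[1.0, 2.0], [3.0, 4.0]])
flat = [v for row in normalized for v in row]
norm_mean = sum(flat) / len(flat)
norm_var = sum((v - norm_mean) ** 2 for v in flat) / len(flat)
```

Because the statistics come from the current graph, no calibration data are needed for this step, unlike the zero-mean normalization applied during preprocessing.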

Figure 5. Structure of the residual neural network; different components are shown in different colors.
Performance evaluation and optimization
We calculated the accuracy for each subject in NinaPro database5 independently, then calculated a total accuracy over all subjects, compared it with the methods of others, and validated the advantage of our algorithm
where i and j are the indices of subjects and gestures, respectively, and M and N are the numbers of subjects and gestures, respectively. In NinaPro database5, M = 10 and N = 18; in the discrete gesture database, M = 3 and N = 14; and in the continuous gesture databases, M = 1 and N = 5 for the intact subject data set and M = 1 and N = 6 for the amputee data set. Next, we trained a single-use gesture classifier, applied it to a rope-driven bionic manipulator, and demonstrated the real-time performance and robustness of our method.
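The accuracy formula itself is not reproduced in this text. One reading that is consistent with the symbols defined above, and that should be treated as an assumption rather than as the formula used in the study, is an unweighted mean of the per-subject, per-gesture accuracies over all M subjects and N gestures. The function name `total_accuracy` is hypothetical.

```python
def total_accuracy(acc):
    """Mean of acc[i][j] over M subjects (rows) and N gestures (columns).

    This averaging rule is an assumed reading of the total accuracy.
    """
    m = len(acc)
    n = len(acc[0])
    if any(len(row) != n for row in acc):
        raise ValueError("every subject needs one accuracy per gesture")
    return sum(sum(row) for row in acc) / (m * n)


# Toy example with M = 2 subjects and N = 2 gestures.
example_total = total_accuracy([[1.0, 0.5], [0.5, 0.0]])
```

If the numbers of test samples differ between gestures, a sample-weighted average would give a different value, which is why the averaging rule matters when comparing with other work.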
Results analysis
We randomly selected one subject's data from NinaPro database5, trained the convolutional neural network for 5000 iterations, and plotted the accuracy and loss with TensorBoard, as shown in Figure 6. The network performance has converged, and the error on the test set begins to increase, after 2000 iterations, so we consider the network saturated at 2000 iterations. We therefore trained our networks for 2000 iterations in each of the following experiments. By default, the four experiments described below in this section convert the raw data, without any other preprocessing, into the proposed sEMG graph structure and use the converted data.

Figure 6. Accuracy and loss value versus iteration. (a) Accuracy versus iteration and (b) cross-entropy loss versus iteration.
Evaluation of normalization methods and dropout
The discrete sEMG data set was used here to identify the better normalization method. The classifier was the convolutional neural network without the residual layer, that is, the structure shown in Figure 5 without the green block. Figure 7(a) compares the overall accuracies of the two methods. The accuracy with zero-mean normalization is 2.3362% higher than with min–max normalization. We suspect that the distribution of sEMG signals (symmetric about zero) is similar to a Gaussian distribution, so zero-mean normalization is more useful for gesture classification.

Figure 7. Evaluation results of normalization methods and dropout. (a) Accuracy versus normalization and (b) accuracy versus keep probability.
Then, we tested the influence of the dropout parameter called keep probability, which determines how many neurons (the parameters in those neurons) are not updated (that is, are dropped) during training. We ran convolutional neural networks with keep probabilities from 0.1 (only 10% of the neurons are updated) to 1 (all neurons are updated) in steps of 0.1. The results are shown in Figure 7(b). Since all results on the discrete sEMG database were very high and differed only slightly, NinaPro database5 was used here. A higher keep probability may speed up network convergence, but a lower keep probability can improve network performance. The results show that accuracy changes by only 1.895% across keep probabilities between 0.4 and 0.7, so we consider any value between 0.4 and 0.7 acceptable for this problem. In this article, we choose 0.5 as the keep probability.
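The role of the keep probability can be illustrated with a sketch of dropout at training time. It uses the common "inverted dropout" convention, in which the surviving activations are divided by the keep probability so that their expected value is unchanged; whether the classifier in this study uses exactly this convention is an assumption, and the function name is hypothetical.

```python
import random


def dropout(activations, keep_prob, rng):
    """Keep each activation with probability keep_prob, otherwise zero it.

    Kept activations are scaled by 1 / keep_prob (inverted dropout).
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)  # fixed seed so the example is repeatable
out = dropout([1.0] * 1000, 0.5, rng)
kept_fraction = sum(1 for v in out if v != 0.0) / len(out)
unchanged = dropout([1.0, 2.0], 1.0, rng)
```

With a keep probability of 1, every activation passes through unchanged, which corresponds to the right-hand end of the sweep from 0.1 to 1.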
Evaluation of network structure
We tested the performance of four network structures: a residual network, a convolutional network with keep probability 0.5, a convolutional network with keep probability 0.9, and a fully connected network. Given the success of residual networks in image classification, we expected this structure to give good results in our problem too. The residual network structure and the details of each layer are shown in Figure 5 and Table 2, where different color blocks represent different network components: yellow is a convolutional layer, green is the residual layer, and blue is a fully connected layer. The detailed structure of the residual layer is shown in Figure 8; the residual layer consists of the layers with indices 6, 7, and 8 in Table 2. The convolutional neural network used here has the same structure as the residual network with the residual layer (layers 6, 7, and 8) removed, that is, without the green blocks in Figure 5. With the convolutional layers also removed (layers 2 to 9), the convolutional network becomes a fully connected network, that is, the figure without the yellow and green blocks. Each layer in Table 2 uses ReLU as the activation function.
Table 2. Structure of the residual neural network.

Figure 8. The detailed structure of the residual layer.
The network layers that were removed did not participate in network training, in either forward propagation or backpropagation. We tested the four networks separately on each of the 10 subjects' data in NinaPro database5, so that weaknesses that appear only on particular subjects' data are not masked by the overall accuracy.
From the results shown in Figure 9(a), the convolutional network outperforms the fully connected network, which in turn outperforms the residual network. A keep probability of 0.5 gives an accuracy 1.63% higher on average than 0.9. A possible explanation is that the residual values increase the error and decrease the accuracy of the residual network.

Figure 9. Evaluation results of structures and comparison with other methods. (a) Accuracy versus networks and (b) accuracy comparison among three methods.
Performance evaluation of classifier
Having determined the best number of iterations, normalization method, keep probability, and network structure, we can compare the accuracy of our method with that of other methods. First, we tested our classifier on the discrete data set recorded from three subjects performing 14 gestures, selected from NinaPro exercise B and commonly used in robot control, and obtained an accuracy of 97.27%. Owing to the discrete recording method, noise and transitions from one gesture to another were excluded, and the differences between gestures were maximized.
Then, the performance of our method on the full NinaPro database5 is compared with machine learning methods that also use the myo armbands. The authors of NinaPro database5 tested different machine learning methods and different feature analysis methods and found that the SVM with the multi-variate discrete wavelet technique (mDWT) and random forests (RF) with temporal difference (TD) show the best performance. 22 Since our method can use either two myo armbands or a single one, we compare it with these two methods for three cases: the upper myo alone, the lower myo alone, and both myos. From Figure 9(b), our method is 19.39%, 19.9587%, and 13.10557% higher than SVM + mDWT, and 18.3701%, 17.9551%, and 14.43191% higher than RF + TD, for the upper, lower, and double myo, respectively.
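As a quick arithmetic check, the averages reported in the abstract can be recovered from the six differences quoted above. The sketch below only restates those numbers; the variable names are ours.

```python
# Accuracy differences (in %) quoted in the text for Figure 9(b).
gain_vs_svm = {"upper": 19.39, "lower": 19.9587, "double": 13.10557}
gain_vs_rf = {"upper": 18.3701, "lower": 17.9551, "double": 14.43191}

# Single armband: average over upper and lower myo and over both baselines.
single = [gain_vs_svm["upper"], gain_vs_svm["lower"],
          gain_vs_rf["upper"], gain_vs_rf["lower"]]
avg_single = sum(single) / len(single)

# Double armband: average over the two baselines.
avg_double = (gain_vs_svm["double"] + gain_vs_rf["double"]) / 2

print("single myo average gain: %.4f" % avg_single)
print("double myo average gain: %.4f" % avg_double)
```

The single-armband average is about 18.92 and the double-armband average is about 13.77, which agrees with the 18.92% and 13.76% given in the abstract if the latter was truncated rather than rounded.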
Real-time controlling of bionic manipulator
We trained the classifiers separately on the continuous sEMG databases and obtained an accuracy of 91.7157% for the intact subject and 89.37% for the amputee. We then compiled the two models into binary files and deployed them in a Python program. One run of our classifier takes about 6.58 ms on an NVIDIA GTX 1070 and 14.5 ms on an AMD Ryzen 5 1500X (CPU only). The maximum sampling frequency of the myo armband is 200 Hz, that is, one sample every 5 ms. The control program uses two threads, one to read sEMG data from the myo sensor and another to calculate which gesture should be performed next, and transmits the gesture signal to the bionic manipulator through USB. We wrote the conversion from each gesture to the angles of the five motors in the bionic manipulator, so that when the manipulator receives a signal, it performs the corresponding gesture. Figure 10(a) shows a schematic of the whole program. The input of the classifier is a single sEMG graph containing eight groups of signals, and each update of the sEMG graph needs only one new group of signals, so the maximum sampling frequency can be set to 150 Hz (with GPU). Here, we set the sampling frequency to 10 Hz: each time, we sample eight groups of signals, convert them, feed them to the classifier, and send the result to the microcontroller. Note that the sampling process is synchronized with the calculation process, and one complete run takes less than 100 ms. When the same user puts the bionic manipulator on again, only the parameters of the zero-mean normalization need to be updated, and no retraining is needed (retraining is needed after a long period without use or if the wearing position or angle of the sensor has changed). The performance is shown in Figure 10(b); both a healthy man and the amputee tested the bionic manipulator. Because we cannot see which gesture the amputee is making with her left hand, we asked her to make the mirrored gesture with her right hand at the same time, which makes the effect easier to observe.
Note that we have reversed the left and right turn instructions for the manipulator's wrist because our bionic manipulator is a right hand; this makes it consistent with the amputee's right-hand movement and easier to follow. We uploaded the full video, and the link can be found in the conclusion.
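The two-thread structure and the timing budget described above can be sketched as follows. This is a minimal sketch with stand-in callables for the sensor, the classifier, and the USB link; the function names are hypothetical, no myo or serial library is used, and adding the acquisition and inference times assumes they run one after the other within a cycle.

```python
import queue
import threading

FRAME_MS = 1000 / 200        # one myo frame at 200 Hz takes 5 ms
GPU_MS, CPU_MS = 6.58, 14.5  # inference times quoted in the text
WINDOW = 8                   # frames per sEMG graph

# One control cycle: acquire eight frames, then classify once.
cycle_gpu_ms = WINDOW * FRAME_MS + GPU_MS
cycle_cpu_ms = WINDOW * FRAME_MS + CPU_MS
# Upper bound on the per-frame classification rate with the GPU.
max_rate_hz = 1000 / GPU_MS


def reader(read_frame, frames, n_frames):
    """Sensor thread: push each new frame onto the queue."""
    for _ in range(n_frames):
        frames.put(read_frame())
    frames.put(None)  # sentinel: no more data


def controller(frames, classify, send):
    """Control thread: classify every block of WINDOW frames."""
    window = []
    while True:
        frame = frames.get()
        if frame is None:
            break
        window.append(frame)
        if len(window) == WINDOW:
            send(classify(window))  # gesture index goes to the manipulator
            window = []


def run(read_frame, classify, send, n_frames):
    frames = queue.Queue()
    threads = [
        threading.Thread(target=reader, args=(read_frame, frames, n_frames)),
        threading.Thread(target=controller, args=(frames, classify, send)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


# Stand-ins: a constant frame, a classifier that always returns gesture 1,
# and a list that records what would be sent over USB.
sent = []
run(lambda: [0] * 8, lambda graph: 1, sent.append, n_frames=16)
```

Under these assumptions one cycle takes about 46.6 ms with the GPU and 54.5 ms on the CPU, both within the 100 ms budget, and 1000 / 6.58 is roughly 152 classifications per second, which is consistent with the 150 Hz limit stated above.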

Figure 10. Performance of real-time control based on sEMG and program structure. (a) Schematic of the whole program, from sensor to motor. (b) Snapshots of real-time control. sEMG: surface electromyography.
Conclusion
In this study, we proposed a data structure that converts sEMG signals from an array structure into a matrix structure, and a classifier based on a convolutional neural network was then used to classify the gesture. Once classification was done, the result was transmitted to the microcontroller to drive the bionic manipulator. Our method allows us to control the bionic manipulator with sEMG signals in real time (within 100 ms) using a cheap and easy-to-use sensor, the myo armband. We also tested the impact of different components and parameters of the convolutional neural network. The influence of different normalization methods and dropout keep probabilities on classifier performance was tested first. Based on the best normalization method and keep probability, we evaluated the performance of four different network structures, all of which perform well in image classification, on the gesture classification problem. Since NinaPro database5 uses two myo armbands as sensors, we compared the results of our method with the best results reported in other articles. With two myo armbands, our method offers an average increase in accuracy of 13.10557% compared to SVM + mDWT and 14.43191% compared to RF + TD. With a single myo armband, our method shows increases of about 19% and 18% over these two methods, respectively. Finally, we deployed our classification method to control the bionic manipulator; after optimization, the running time can be kept within 100 ms, and the result is good. The video can be found at https://youtu.be/EHhMaB_r2J8.
Our future work addresses three problems. First, we will test whether our method can be deployed on embedded hardware such as a Raspberry Pi and whether the program running on the embedded hardware can be optimized for real-time control. Second, we will try to generalize our method to different users with a common neural network, which is key to turning the algorithm into a product. Third, we will investigate whether our method can be used for upper limb amputees, since it has so far been tested only on the NinaPro database and on sEMG signals recorded from intact subjects.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Supported by National Natural Science Foundation of China (Grant No. 51475034).
