Electromyography-based gesture recognition for quadriplegic users using hidden Markov model with improved particle swarm optimization

Abstract

People with quadriplegia cannot move their body and limbs freely, which prevents them from interacting normally with their environment. This article aims to improve the quality of life of quadriplegia patients by developing a system that helps them interact with their surroundings. A novel algorithm to classify human gestures is proposed. The algorithm is the core of an assistive technology system in the form of a human interface device that uses an electromyograph as its sensor. The system uses a wearable electromyograph with custom software as the signal capturing and processing tool. The electrodes of the electromyograph are placed at specific positions on the face, corresponding to the locations of the major muscles that govern certain facial gestures. The signals are then processed using a novel algorithm that employs a hidden Markov model and improved particle swarm optimization to classify the gesture. A custom command can then be assigned to each gesture for different conditions. The accuracy of the system is 96.25% for five-gesture classification.

Keywords

Assistive technology, electromyograph, hidden Markov model, human interface device, quadriplegia

Introduction

Spinal cord injury (SCI)¹ affects between 250,000 and 500,000 people around the world annually, of whom 54.1% have suffered cervical injuries.² An SCI is called a cervical injury when it occurs in any of the first eight segments of the spinal cord. People with cervical SCI have a high tendency to develop quadriplegia, a loss of sensory and motor capability in their arms, body, and legs. This condition prevents them from interacting normally with their surroundings, including operating human interface devices (HIDs) such as a mouse and keyboard. They may have to rely on their caregivers, which in turn can lead to depression, impacting their overall health.¹ An assistive technology (AT) in the form of a novel HID might help to improve their condition. With such an AT, users would not only be able to connect with the world through a personal computer (PC); the HID could also be connected to actuators such as a wheelchair or even a home automation system. Several methods have been developed as solutions to this problem, for example, a voice-controlled system,^3,4 an electroencephalographic (EEG)-driven device,^5,6 mouth sticks, and several image recognition systems applied to head, eye, or facial movements.^7–11

Advances in voice recognition and artificial intelligence have made a voice-controlled system a good choice for an AT, but it still has weaknesses in certain daily life scenarios. For tasks that need detailed commands, such as drawing an image with a mouse, a voice-controlled system is too arduous to use. Furthermore, it cannot be used by a mute person or by people with a disability in their articulation organs. Another concern is whether the system works in languages other than the main language it was developed for. An EEG-based system addresses all of these concerns, but it currently lacks the speed and consistency of a proper HID. Mouth sticks are indispensable to many quadriplegia patients around the world. Despite this popularity, the performance of the mouth stick is still limited by the inconvenience it causes the user and by its impracticality for detailed tasks. The same can be said for image recognition systems, which in addition need relatively elaborate equipment to be set up around the user. The ideal solution would be an AT that responds to small voluntary changes of the body, where the change can be made repetitively without placing too much burden on the user. For example, small movements of the face or other parts of the head would be sufficient for such a system, so we need a sensing tool that is able to capture even the slightest change caused by the movement.

Electromyography (EMG) is a technique for acquiring the electrical biosignals produced by muscle cells. When the nervous system triggers a muscle cell, the cell generates an electrical signal. The generated signal varies with several factors, such as the duration of the activation, the intensity of the movement, and the number of activated cells. These signals can be acquired by an EMG device. The obtained signal can then be processed further to extract information such as muscle activation level, duration, abnormalities, and recruitment order. With sufficiently sensitive sensors placed at suitable positions over the facial muscles, EMG would be an ideal input for an AT.

Breakthroughs in semiconductor and material technology have helped lower the cost of small sensors and supporting circuitry, ushering in the production of numerous smart network-connected devices, which are collectively known as Internet of things (IoT) devices. Smart wearables are a class of IoT devices that differ from the rest of the group in being worn on or strapped to the body of the user. A network of wearable sensors can be deployed together to achieve certain objectives, such as human activity recognition,¹² drowsiness detection, overeating detection, and cardiovascular autonomic neuropathy diagnosis.¹³ In practice, wearable sensors can also be used in tandem with other technologies to improve the result, for example, to complement a human activity recognition system that employs a depth camera as its sensor.¹⁴

Recent developments also make it possible to build an EMG device with compact size and low power consumption. These characteristics enable the EMG device to be worn on the body, which makes it a wearable device. Recently, bioelectrical signals have gained popularity among several prominent research institutes.^5,6,15–18 Aside from the rise of IoT devices, breakthroughs in related fields such as cardiology, muscle physiology, and neuroscience have also helped to improve the EMG field. The availability of low-cost sensors and semiconductors, combined with increasingly deep knowledge of muscles and human gestures, has propelled research on EMG and its implementation outside the medical field.

Human and machine interaction, the core of every AT, is one of the many fields where efforts to improve existing techniques are evenly matched by efforts to invent new methods and devices. Many techniques have been introduced since the first interaction between human and machine was recorded. Of the plethora of resulting devices, only a small number are still used by the majority of people. Despite this, research to find a better HID continues in many places. With the advent of the IoT, progress has become more rapid. Other new concepts such as crowdsourcing and rapid prototyping also help to push the development of more sophisticated HIDs.

To be used as an HID, a method should give consistent results for multiple users in different conditions. Interaction between human and machine has reached the point where inaccuracy affects the user experience so strongly that the user acceptance rate of an HID depends on it. In numerous real-life scenarios, the interaction is so critical that an HID is acceptable only if its error rate is lower than 0.1%. However, even the most sophisticated EMG system still makes some incorrect recognitions when classifying muscle signals. This problem affects how the user experiences the system and restrains the adoption of EMG-based gesture recognition human–machine interface (HMI) devices in current systems. A feasible solution to this problem is to use an algorithm with better accuracy, allowing the user to control the system with minimal error.

The system proposed in this article uses a wearable EMG device as an input and a hidden Markov model–particle swarm optimization (HMM-PSO) algorithm to realize an HID that takes specific facial gestures as input and translates each gesture into a certain command. New facial gestures can be assigned to the system by calibrating the system just once. An example of the electrode placements for this system can be seen in Figure 1. The system achieves high accuracy while still maintaining a fast calculation time: the approach offers a classification accuracy of up to 96.25% with a calculation time averaging 0.05 s.

Figure 1.

Example of electrode placement on a patient.

Related works

Several HMI implementations with various inputs for quadriplegia patients were mentioned in the introduction, such as voice-controlled systems, EEG- and EMG-driven devices, mouth sticks, and image recognition systems. In this part, we focus on systems that are similar to our work: EMG-driven systems with various types of classification algorithm. Although none of the works mentioned here use facial muscles as the input of their system, EMG is the main input sensor in all of them, so we assume that a comparison between their works and ours is acceptable.

EMG-based gesture recognition systems have been developed in numerous research works for several reasons, for example, as an input of biomechatronic-based prosthetics, HMI, and sign-language interpreter. The main focus of these research works is the machine learning techniques utilized to classify the gestures. Some of the popular algorithms are linear discriminant analysis (LDA),^19–21 artificial neural network (ANN),^22,23 and support vector machine (SVM).²⁴

Patel et al.²¹ propose a classification method that uses only one set of training data obtained over a single contraction period and achieves better accuracy than an LDA classifier. The feature set used in the classification is also smaller than the one used in the LDA method. The method proposed by Geng et al.²⁵ implements sparse representation-based classification with a root mean square descriptor (RMS-SRC). This method is claimed to withstand signal interference in the form of white Gaussian noise (WGN) better than other classification methods. The system proposed by Jiang et al.²⁰ uses LDA-based gesture recognition, with additional input from an inertial measurement unit (IMU) module to help improve the classification. Another highlight of that system is that it places the electrodes at the wrist of the user. Kanitz et al.²⁶ propose using an onset detection algorithm (ODA) to capture an EMG signal of a certain duration before processing it with a classifier. After the signal is captured, an error-correcting output codes (ECOC) classifier with a one-versus-all coding matrix is used as the classification scheme. Each of these algorithms has its own advantages and disadvantages, showing that for gesture recognition there is no single method that far outperforms the others.

Apart from the classification algorithm, other factors that affect a system include the input gestures, the muscles involved, the number of electrodes employed, and the signal processing applied to the EMG signal before classification. Hargrove et al.²⁷ compare the accuracy of systems with invasive and non-invasive electrodes. Although invasive electrodes are deemed to produce more detailed results for diagnostic purposes, the research verifies that there is no significant difference in gesture recognition between the two types of electrode. Similarly, the works by Giroux and Lamontagne²⁸ and Chiou et al.²⁹ show that there is only a slight difference in accuracy between active and passive surface electrodes. This research involved five test subjects. Two pairs of passive and active electrodes were secured at the same location while the subject performed a certain gesture.

The hidden Markov model (HMM) is typically used in clustering and classification schemes, such as speech recognition and heart rate anomaly detection, with prominent results. Much development has been done on this method, improving the understanding of the algorithm itself. The results of this development are useful not only for enhancing results in the earlier fields of application but also for applying HMM efficiently to new fields. Ultimately, HMM is a good tool for modeling a time-varying random process such as an EMG signal, which makes it well suited to classifying gestures from EMG signals. The research examples described above were all done offline, using signal data sets recorded before the experiment with the corresponding EMG device.

A novel wearable EMG solution, the MYO armband, on the other hand, performs real-time gesture recognition, although the calculation is done on a separate device and not on the EMG device itself. MYO is a wearable device that is strapped around the arm like an armband, and it needs to connect to another system, either a PC or a smartphone, to process the signals it acquires. The current MYO system uses an undisclosed algorithm that classifies five types of hand gestures: fist, spread, wave right, wave left, and double click. The scheme proposed in this article offers improved accuracy and, more importantly, the possibility of adding gestures to the current system. The approach is based on HMM and the particle swarm optimization (PSO) method applied to EMG signals captured with a MYO device.

Methods

The system proposed in this article offers a concept of an HID that uses an EMG-based gesture recognition scheme. The concept is built on several underlying principles, which are described in this section. EMG sensors are used to obtain the input of the system, and features are extracted from the signals. An optimization method called improved swarm wavelet optimization (ISWO) is used to obtain the number of hidden states that gives the HMM the best result. ISWO is an algorithm built on PSO that incorporates additional features such as wavelet mutation, particle refresh, and velocity improvement.

The sensor utilized in the proposed concept is MYO, a wearable EMG device for biopotential signal acquisition. A local server is deployed to store the obtained signal and also to process the signal classification. This system implements the HMM-based algorithm to classify the EMG signal. In this system, the classification algorithm is performed offline. We use existing consumer-level products as our hardware and focus this work on the software algorithm implemented in processing the captured EMG signal. This section explains related methods on which our system is built.

Signal acquisition and preprocessing

To acquire biopotential signals, we need a signal acquisition device: sensors attached to analog front-end circuitry, complemented by an analog-to-digital converter whose output matches the input expected by the processing stage. The biopotential signal of human muscles is small, ranging from 10 µV to 10 mV. The amplitude is determined by several factors: the size of the muscle, as well as the speed and the intensity of the movement. In this system, we use MYO from Thalmic Labs as the signal acquisition device, as can be seen in Figures 2 and 3. MYO is a consumer-grade wearable device that contains EMG and IMU sensors and uses low-power Bluetooth to connect to its host. The EMG sensor has eight channels with a sampling frequency of 200 Hz. Figure 4 shows an example of an EMG signal captured by MYO. Each input signal, whether for training or for testing, is set at 200 samples, which corresponds to a duration of 1 s. This is comparable to the duration of the gestures, which averages between 0.7 and 0.9 s in this experiment. The starting point of the clipped signal is found by detecting where the signal crosses a decision threshold; the signal is then clipped starting 20 sample points before that point. These procedures and constants were determined heuristically after analyzing the signals obtained for this experiment.
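The clipping rule described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `clip_gesture`, the use of the absolute amplitude for the threshold test, and the zero-padding of short recordings are assumptions, and the threshold value itself is not given in the text.

```python
def clip_gesture(signal, threshold, window=200, pre_samples=20):
    """Return a fixed-length window that starts pre_samples before the onset.

    The onset is the first sample whose absolute value reaches `threshold`
    (hypothetical criterion). Returns None when no sample crosses it.
    """
    onset = next((i for i, x in enumerate(signal) if abs(x) >= threshold), None)
    if onset is None:
        return None
    start = max(0, onset - pre_samples)
    clipped = list(signal[start:start + window])
    # Assumption: pad with zeros when the recording ends before the window is full.
    clipped.extend([0] * (window - len(clipped)))
    return clipped
```

At the 200 Hz sampling rate of the MYO, the default `window=200` corresponds to the 1 s duration stated above.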

Figure 2.

MYO armband—the hardware for signal acquisition.

Figure 3.

Example of signal acquisition using MYO armband.

Figure 4.

Example of a signal preprocessing in the proposed system: original signal (left), after enveloping (middle), and after step signaling (right).

Before being handled by the classification algorithm, the signals undergo preprocessing to make them suitable as inputs to the algorithm. The preprocessing consists of signal enveloping and conversion of the signal into a step signal. This is necessary because the algorithm used in this system needs an input with discrete values, at a precision that matches the available computational resources. The precision can be altered as needed to balance accuracy against calculation efficiency. An example of the preprocessing can be seen in Figure 4.
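A minimal sketch of the two preprocessing steps is given below. The text does not specify how the envelope is computed or how many discrete levels are used, so the rectify-and-moving-average envelope, the window `width`, and the number of `levels` are assumptions chosen only for illustration.

```python
def envelope(signal, width=10):
    """Full-wave rectify the signal, then smooth it with a trailing moving average."""
    rectified = [abs(x) for x in signal]
    smoothed = []
    running = 0.0
    for i, x in enumerate(rectified):
        running += x
        if i >= width:
            running -= rectified[i - width]  # drop the sample leaving the window
        smoothed.append(running / min(i + 1, width))
    return smoothed


def to_steps(env, levels=8, max_value=None):
    """Quantize an envelope into integer symbols 0 .. levels - 1 (a step signal)."""
    top = max_value if max_value is not None else max(env)
    if top <= 0:
        return [0] * len(env)
    return [min(levels - 1, int(levels * x / top)) for x in env]
```

Raising `levels` gives the classifier a finer-grained input at the cost of a larger emission table, which is the precision trade-off mentioned above.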

HMM

An HMM is a sequence model, also known as a sequence classifier: a model that assigns a label or class to each unit in a sequence of observations. Specifically, an HMM is a probabilistic sequence model; given an observation sequence, it computes a probability distribution over the possible label sequences and selects the most probable one. The observations can be signals, symbols, values, words, or any other kind of discernible condition. An HMM is useful for labeling sequences with many known different states, mapping an observed sequence to its respective label. In general, working with an HMM involves three algorithms: the forward algorithm, the Viterbi algorithm, and the Baum–Welch algorithm. The forward algorithm calculates the probability of an observed sequence by summing the probabilities over all hidden state sequences that could have produced it. These probability values are stored efficiently in a forward trellis. The Viterbi algorithm is the most commonly used decoding algorithm for HMMs. It is a dynamic programming algorithm similar to the forward algorithm, in which the value of each cell of the trellis is computed recursively as the probability of the most probable path to that cell. The Baum–Welch algorithm, also known as the forward–backward algorithm, is the standard algorithm for training an HMM. It trains both the transition probabilities and the emission probabilities by starting from an initial estimate and iteratively computing a better estimate from the previous one. The Baum–Welch algorithm is a special case of the expectation–maximization (EM) algorithm.
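To make the forward and Viterbi recursions concrete, the following sketch implements both for a discrete HMM in plain Python. The function names and the tiny two-state model in the usage note are illustrative assumptions; a practical implementation would work with log probabilities or scaling to avoid underflow on sequences as long as 200 samples.

```python
def forward_likelihood(obs, start, trans, emit):
    """Probability of the observation sequence under a discrete HMM.

    start[i]: initial probability of state i
    trans[i][j]: probability of moving from state i to state j
    emit[i][k]: probability that state i emits symbol k
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    for symbol in obs[1:]:
        # Each trellis cell sums over every state the path could come from.
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][symbol]
            for j in range(n)
        ]
    return sum(alpha)


def viterbi_path(obs, start, trans, emit):
    """Most probable hidden state sequence for the observation sequence."""
    n = len(start)
    delta = [start[i] * emit[i][obs[0]] for i in range(n)]
    backpointers = []
    for symbol in obs[1:]:
        # Same trellis as the forward pass, but with max in place of sum.
        prev = [max(range(n), key=lambda i: delta[i] * trans[i][j]) for j in range(n)]
        delta = [delta[prev[j]] * trans[prev[j]][j] * emit[j][symbol] for j in range(n)]
        backpointers.append(prev)
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for prev in reversed(backpointers):
        state = prev[state]
        path.append(state)
    return path[::-1]
```

For example, with `start = [0.6, 0.4]`, `trans = [[0.7, 0.3], [0.4, 0.6]]`, and `emit = [[0.9, 0.1], [0.2, 0.8]]`, the observation sequence `[0, 1]` has likelihood 0.209 and most probable state path `[0, 1]`.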

PSO

PSO is an optimization algorithm that mimics the behavior of a group of animals such as birds or fish. The algorithm iteratively updates the movement of a group of agents, each with its own position and velocity, toward their optimal position. Each agent stores two optimal positions: its own best position found so far and the global best position, that is, the best position known to every agent in the population. The agents also know the movement of their neighbors, and they use this information to advance toward the optimal position. The agents are called particles and the population is known as the swarm, hence the term “particle swarm optimization.”

Each particle is usually generated with random parameters that indicate its position in the problem space. This position is evaluated in every iteration by a certain function, which lets the particle know how good its current position is. The particles then update themselves with another function, and a new iteration of particles is produced, replacing the previous one.

PSO works by evaluating how good the position of each particle is and moving the particles closer and closer to the best positions found so far in each iteration. The strength of PSO is the short time the whole swarm needs to reach an optimum. This is also its main weakness: because it converges quickly, it often gets trapped in a local optimum without spending time considering other solutions.
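The loop described above can be sketched for a one-dimensional problem as follows. This is the widely used textbook form of PSO with an inertia weight, shown only as a baseline; the function name `pso_minimize` and all parameter values are assumptions and are not taken from the article.

```python
import random


def pso_minimize(cost, bounds, n_particles=20, iterations=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize a one-dimensional cost function with a basic particle swarm."""
    rng = random.Random(seed)
    low, high = bounds
    pos = [rng.uniform(low, high) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                       # each particle's own best position
    best_cost = [cost(x) for x in pos]
    g = min(range(n_particles), key=lambda i: best_cost[i])
    g_pos, g_cost = best_pos[g], best_cost[g]  # best position known to the swarm
    for _ in range(iterations):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (inertia * vel[i]
                      + c_personal * r1 * (best_pos[i] - pos[i])
                      + c_global * r2 * (g_pos - pos[i]))
            pos[i] = min(high, max(low, pos[i] + vel[i]))  # keep inside bounds
            c = cost(pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i], c
                if c < g_cost:
                    g_pos, g_cost = pos[i], c
    return g_pos, g_cost
```

On a single-minimum function such as `lambda x: (x - 3.0) ** 2` the swarm collapses quickly onto the minimum; on a function with several minima the same fast collapse is what can leave it in a local optimum.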

ISWO

To avoid getting trapped in a local optimum, several methods have been proposed. One solution is to apply a mutation technique to the particles to keep them from converging too quickly. There are several mutation schemes, with the wavelet function being one that produces significantly better results in processing EMG signals. The use of a wavelet function as a PSO mutation scheme was proposed in the work by Ling et al.³⁰ Each particle has a mutation probability $p_{m} \in [0, 1]$, and a mutation changes the new position of that particle. A random number is generated for the particle, and the particle mutates if this number is less than or equal to $p_{m}$. The following equation gives the new position of the particle

q_{i} = \begin{cases} q_{i} + σ \times (q_{i_{max}} - q_{i}) & \text{if } σ > 0 \\ q_{i} + σ \times (q_{i} - q_{i_{min}}) & \text{if } σ \leq 0 \end{cases}

(1)

$q_{i}$ is the new position acquired through mutation, $q_{i_{max}}$ is the allowed maximum value of the particle, and $q_{i_{min}}$ is the minimum counterpart. The wavelet function is implemented through $σ$ . It represents the mother wavelet function used in the equation. The examples include Morlet wavelet, Gaussian wavelet, and Mexican hat wavelet functions.

Morlet function was chosen in the work by Ling et al.³⁰ as the mother wavelet because of the performance it gives in the nature of cost values. The Morlet wavelet function is defined as follows

σ = \frac{1}{\sqrt{α}} e^{- {(\frac{ϕ}{α})}^{2} / 2} \cos (5 (\frac{ϕ}{α}))

(2)

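$ϕ$ is a randomly generated value between $- 2.5 α$ and $2.5 α$, where $α$ is the dilation parameter. With $t$ the current iteration number, $T$ the total number of iterations, $g$ the upper limit of $α$, and $ζ_{wm}$ a shape parameter, $α$ is given by the following equation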

α = e^{- \ln (g) \times {(1 - \frac{t}{T})}^{ζ_{wm}} + \ln (g)}

(3)
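Equations (1) to (3) can be combined into a short mutation routine, sketched below for a single scalar position. The sketch assumes that the dilation parameter follows the form given by Ling et al., with $\ln (g)$ inside the exponent so that the parameter grows from 1 at $t = 0$ to $g$ at $t = T$. The default values of `g` and `zeta_wm` and all function names are illustrative assumptions.

```python
import math
import random


def dilation(t, T, g=10000.0, zeta_wm=2.0):
    """Dilation parameter; assumed to rise from 1 at t = 0 to g at t = T."""
    return math.exp(-math.log(g) * (1.0 - t / T) ** zeta_wm + math.log(g))


def morlet_sigma(alpha, rng):
    """Equation (2), with phi drawn uniformly from [-2.5 * alpha, 2.5 * alpha]."""
    phi = rng.uniform(-2.5 * alpha, 2.5 * alpha)
    x = phi / alpha
    return math.exp(-x * x / 2.0) * math.cos(5.0 * x) / math.sqrt(alpha)


def wavelet_mutate(q, q_min, q_max, t, T, p_m, rng):
    """Equation (1): mutate q with probability p_m, otherwise return it unchanged."""
    if rng.random() > p_m:
        return q
    sigma = morlet_sigma(dilation(t, T), rng)
    if sigma > 0:
        return q + sigma * (q_max - q)   # move toward the upper bound
    return q + sigma * (q - q_min)       # move toward the lower bound
```

Because the dilation parameter is at least 1 under this assumption, the magnitude of the wavelet value never exceeds 1, so a mutated position always stays between `q_min` and `q_max`. As the dilation parameter grows with the iteration number, the mutation steps shrink, which shifts the search from exploration to fine-tuning.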

In the PSO formula, each particle has a velocity that is related to two positions: the local optimum and the global optimum. The relation to the local optimum decides how fast the particle approaches the local optimum, while the relation to the global optimum affects how fast it approaches the global optimum. Because of these relations, velocities are of great importance in PSO; after all, the objective of the algorithm is to reach the optimum point of the problem space. The velocities are updated in each iteration, and how they are updated affects the end result of the whole algorithm. By improving the update function, the end result can be made more balanced in both speed and optimization, avoiding local optima while maintaining fast processing speed. In standard PSO, the velocity is related to the particle’s local best position $p_{best}$ and the global best position $g_{best}$; in the improved version, the global best particle’s velocity $v_{g}$ and the best neighbor particle’s velocity $v_{nb}$ are also taken into consideration. $v_{g}$ is the velocity of the particle that is considered closest to the optimum point, and $v_{nb}$ is the velocity of the neighbor particle that is considered closest to the optimum point. The velocity update function is as follows

\begin{matrix} v_{i} (t + 1) = χ v_{i} (t) \\ + (c_{1} - \frac{(c_{1} - c_{2}) \times t}{T}) r_{1} (p_{best} - q_{i} (t)) \\ + (c_{1} - \frac{(c_{1} - c_{2}) \times t}{T}) r_{2} (g_{best} - q_{i} (t)) \\ + (c_{1} - \frac{(c_{1} - c_{2}) \times t}{T}) r_{3} (v_{g} (t) - v_{i} (t)) \\ + (c_{1} - \frac{(c_{1} - c_{2}) \times t}{T}) r_{4} (v_{nb} (t) - v_{i} (t)) \end{matrix}

(4)

and the respective position update function is as follows

\begin{matrix} q_{i} (t + 1) = (d_{1} - \frac{(d_{1} - d_{2}) \times t}{T}) q_{i} (t) \\ + (d_{1} - \frac{(d_{1} - d_{2}) \times t}{T}) v_{i} (t + 1) \end{matrix}

(5)

where $q_{i} (t)$ is the current position, $v_{i} (t)$ is the current velocity, $v_{i} (t + 1)$ is the updated velocity, $v_{g}$ is the global best particle’s velocity, and $v_{nb}$ is the best neighbor particle’s velocity. The acceleration factors $c_{1}$ and $c_{2}$ satisfy $c_{1} > c_{2}$. $r_{1}$, $r_{2}$, $r_{3}$, and $r_{4}$ are randomly generated numbers between 0 and 1. $d_{1}$ and $d_{2}$ are weighting factors where $d_{1} > d_{2}$.
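One update step of equations (4) and (5) can be sketched for a scalar particle as follows. The sketch assumes, following the description in the text, that the first attraction term pulls toward the particle's own best position and the second toward the global best position. The function names and the example values are hypothetical.

```python
def schedule(first, last, t, T):
    """Coefficient that moves linearly from `first` at t = 0 to `last` at t = T."""
    return first - (first - last) * t / T


def update_velocity(v, q, p_best, g_best, v_g, v_nb, t, T, chi, c1, c2, r):
    """Equation (4) for one particle; r holds the four random numbers r1..r4."""
    c = schedule(c1, c2, t, T)
    r1, r2, r3, r4 = r
    return (chi * v
            + c * r1 * (p_best - q)   # pull toward the particle's own best
            + c * r2 * (g_best - q)   # pull toward the global best (assumed)
            + c * r3 * (v_g - v)      # align with the best particle's velocity
            + c * r4 * (v_nb - v))    # align with the best neighbor's velocity


def update_position(q, v_next, t, T, d1, d2):
    """Equation (5): weighted sum of the old position and the new velocity."""
    d = schedule(d1, d2, t, T)
    return d * q + d * v_next
```

Since $c_{1} > c_{2}$ and $d_{1} > d_{2}$, both coefficients decrease over the run, so later iterations take smaller steps than early ones.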

To keep particles from getting trapped in local optima, a refresh operator can be added to the PSO algorithm.³¹ This operator can be used together with the improved velocity update to further strengthen the algorithm. A particle needs to be refreshed if its velocity is small: if $v_{i} < E$, where $E$ is a very small threshold constant, the particle is relocated to a new position.

The improved algorithm, which combines wavelet mutation, the improved velocity update, and particle refresh, is called ISWO.

Implementation of HMM with ISWO

As mentioned before, in this HID system, the HMM algorithm is used to classify the EMG signal obtained from the corresponding channel. HMM is chosen because it requires only one set of signals for training, which considerably shortens the calibration process of the HID. The calibration process requires the user to perform two sets of gestures before the HID can function properly. This process is the training phase of the HMM, which produces a model that serves as the basis of gesture classification for this particular user.

In building the model, the number of hidden states must be set to a value that gives a well-balanced result: a response that is as accurate as possible in the shortest amount of time. ISWO is used to obtain the number of hidden states that gives the HMM the best result. In implementing ISWO, the input parameter is the number of hidden states of the HMM, and the classification accuracy is the optimization criterion. The classification accuracy is calculated by comparing the first set of calibration signals with the second set. This is why the HID system requires two sets of gesture repetitions in the calibration process, although the HMM training phase itself needs only one set of signals.
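Because a particle position is a real number while the number of hidden states is an integer, the fitness evaluation needs a mapping between the two. A possible sketch is shown below; the rounding and clamping rule, the state limits, and the `train_and_score` callback (which stands in for training an HMM on the first calibration set and scoring it on the second) are all hypothetical and not described in the article.

```python
def hidden_state_fitness(position, train_set, validation_set, train_and_score,
                         min_states=2, max_states=20):
    """Evaluate one particle whose position encodes the number of hidden states.

    train_and_score(n_states, train_set, validation_set) is a caller-supplied
    function (hypothetical) that returns the classification accuracy in percent.
    """
    # Round to the nearest integer and clamp into the allowed range.
    n_states = max(min_states, min(max_states, int(round(position))))
    accuracy = train_and_score(n_states, train_set, validation_set)
    return n_states, accuracy
```

The optimizer would then maximize the returned accuracy, for example by minimizing `100 - accuracy`.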

The high-level block diagrams can be seen in Figures 5 and 6. Figure 5 shows the flowchart of the training phase, in which the HMM model is built. It is during this phase that ISWO is applied. Building the HMM model requires deciding the number of hidden states in the model, and ISWO is used to optimize this number so that the classification process achieves a good balance between speed and accuracy. After the HMM model is acquired, gesture recognition is performed using the HMM, as shown in Figure 6.

Figure 5.

Flowchart diagram of the model training phase of the proposed system.

Figure 6.

Flowchart diagram of the classification phase of the proposed system.

Experiment and result

The proposed system determines the output control signal of the HID by classifying the facial gesture of the user from the EMG signal captured by a wearable EMG device, the MYO armband. To verify that the system responds accurately, an experiment is performed as follows. First, the electrodes of the MYO wearable EMG device are affixed at several specific positions on the head of the user. Then, custom-made software is activated to capture the EMG signal. While the software is active, the user is asked to perform certain gestures. Each gesture is repeated five times. The recorded signal is then passed to a MATLAB function, which processes the signal and classifies the gesture based on the previously added trainer signal.

Electrode placements

This experiment captures EMG signals from three locations on specific areas of the head, which means three channels of the MYO are used. Two electrodes are used in each channel. They are placed side by side with a 15-mm inter-electrode distance at the region of interest. The positions are symmetrical about the sagittal plane of the head. The movements classified in this experiment are brow raise, lip puck, sneer left, sneer right, and jaw tighten, as can be seen in Table 1. The electrodes used in this experiment are standard disposable Ag/AgCl EMG electrodes. The skin is cleaned first and then the electrodes are attached at their respective positions. The muscles on which the electrodes are placed and their respective gestures are listed in Table 2.

Table 1.

Registered face gestures.

No.	Gesture name
1	Brow raise
2	Lip puck
3	Sneer left
4	Sneer right
5	Jaw tighten

Table 2.

Muscles involved and respective face gestures.

Channel	Muscle name	Common name	Respective gesture
1	Occipitofrontalis	Forehead	Brow raise
2	Left zygomaticus major	Left cheek	Sneer left, jaw tighten, lip puck
3	Right zygomaticus major	Right cheek	Sneer right, jaw tighten, lip puck

Data acquisition

Biosignals acquired from the MYO were recorded using custom software. The MYO armband sent the data through Bluetooth to a connected PC, where the software received the data and wrote them to a file. The signals are acquired with a sampling rate of 200 Hz. Each recording is stored as a .CSV file, which is then processed in MATLAB. Each file has nine columns: the first column is the time label and the remaining eight columns are the signals from the eight channels of the MYO. This system uses only three channels of the MYO because the experiment takes measurements only from the three previously mentioned electrode pairs. Consequently, the five unused channel columns of the .CSV file are discarded, and only three channels of signals for each gesture are passed to the algorithm.
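The column selection described above can be sketched as follows. The article processes the files in MATLAB; this Python sketch only illustrates the layout. The function name, the choice of columns 1 to 3 as the three used channels, and the handling of a possible header row are assumptions, since the text does not state which MYO channels carry the facial electrodes.

```python
import csv


def parse_recording(lines, channels=(1, 2, 3)):
    """Parse nine-column rows (time label + eight EMG channels), keep some channels.

    `lines` is any iterable of text lines, for example an open file.
    `channels` lists the column indices to keep (hypothetical default).
    """
    times, samples = [], []
    for row in csv.reader(lines):
        if len(row) < 9:
            continue  # skip blank or malformed rows
        try:
            values = [float(cell) for cell in row[:9]]
        except ValueError:
            continue  # skip a header row, if the file has one
        times.append(values[0])
        samples.append([values[c] for c in channels])
    return times, samples
```

A recording could then be read with `with open("recording.csv", newline="") as f: times, samples = parse_recording(f)`, where the file name is a placeholder.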

Gesture recognition

For each gesture, we need to choose a trainer signal. A set of captured signals generated by the brow raise, lip puck, sneer left, sneer right, and jaw tighten gestures is prepared as the trainers. All five of these signals are then passed to the MATLAB function, together with another signal of unknown gesture that we want to classify. The MATLAB function then classifies the unknown input signal using the five trainers as the calibration signals. In this experiment, each gesture is repeated five times: one repetition is used as the trainer and the other four as the testing signals. Generally, the EMG signals of different persons are similar, but there are specific differences for each individual caused by biological traits such as muscle fiber pattern and changes of blood flow in the muscle.³² This is why consumer-grade EMG products such as the MYO armband require a calibration process before use. HMM-ISWO can be extended to use more than one calibration input to train its model, but in this work we limit the input of the training process to one set of signals. To evaluate the performance of the algorithm, fivefold cross-validation is applied to these signals. In each fold, the ISWO algorithm processes the training signals only, to build the model for the classification.
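One plausible reading of this protocol is that each of the five repetitions serves once as the trainer while the remaining four are used for testing. The sketch below generates the folds under that assumption; the function name and the repetition labels are hypothetical.

```python
def trainer_folds(repetitions):
    """Yield (trainer, tests) pairs in which each repetition is the trainer once."""
    for k, trainer in enumerate(repetitions):
        tests = repetitions[:k] + repetitions[k + 1:]
        yield trainer, tests
```

For five repetitions of a gesture this yields five folds, each with one trainer signal and four testing signals.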

Results

Figure 7 shows the result of the experiment. The methods are listed on the x-axis and their accuracy is shown on the y-axis. The figure shows that the proposed method, HMM with ISWO (HMM-ISWO), is more accurate than the other methods in this experiment. The other methods are HMM-PSO, which uses the standard PSO method; HMM-WPSO, which adds the wavelet function to the standard PSO; and standard HMM. The accuracy of HMM-ISWO is 96.25 ± 2.75%, whereas HMM-PSO, HMM-WPSO, and HMM have accuracies of 94.5 ± 3.25%, 90.55 ± 3.75%, and 82.75 ± 4.25%, respectively. The accuracy is calculated by averaging the accuracy for each gesture over every person. The accuracy is defined as follows

\mathrm{Accuracy} = \frac{\text{Correct classification}}{\text{Number of inputs}} \times 100\%

(6)
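As a small worked instance of equation (6), the following Python sketch counts correct classifications over a list of predictions. The function name `accuracy` and the example labels are illustrative assumptions, not data from the experiment.

```python
def accuracy(predicted, actual):
    """Equation (6): correct classifications / number of inputs * 100%."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Illustrative labels only: three of four inputs are classified correctly.
actual = ["brow raise", "jaw drop", "jaw drop", "lip tighten"]
predicted = ["brow raise", "jaw drop", "jaw tighten", "lip tighten"]
print(accuracy(predicted, actual))  # 75.0
```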

Figure 7.

Accuracy comparison of standard HMM, HMM-PSO, HMM-WPSO, and HMM with improved wavelet PSO using fivefold cross-validation.

Observation during the experiment shows that the signals produced by a movement differ from person to person, significantly so for certain gestures, and that some participants even struggle to repeat a gesture with a consistent signal. Calibration of the system is therefore always needed, because the input signals themselves differ significantly from person to person.

Discussion

In this work, an AT for quadriplegia patients is proposed. The system aims to provide a versatile HID for the user, offering an interface method which provides complete control without causing too much discomfort.

Quadriplegia patients cannot voluntarily move any of their body parts below the neck. With current technology, several options allow a person to operate an HID without using their limbs. Speech recognition and video-based object or movement detection are two fields undergoing rapid growth at present, driven by the growth of machine learning methods. However, these methods have several drawbacks as inputs for an HID. Speech recognition^3,4 cannot efficiently issue detailed commands such as moving a mouse pointer several pixels at a 23° elevation angle or moving a wheelchair in a specific direction with a certain acceleration. An EMG-based system eliminates this problem by arranging its inputs to behave like those of a keyboard. By assigning certain user gestures to respective key presses such as the directional arrows, it is possible to deliver meticulous control with a defined precision. A video-based system^7–11 captures every movement of the user and can therefore mistakenly translate unintended, accidental, or reflex movements. These systems also limit the usage of the HID to certain environments, such as quiet places or well-lit places.
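The keyboard-like behavior described above amounts to a lookup from the classified gesture to a key. The sketch below is only an illustration: the particular assignment of gestures to keys and the names `GESTURE_TO_KEY` and `key_for` are hypothetical, since the text does not fix a mapping.

```python
# Hypothetical gesture-to-key assignment; the source does not specify one.
GESTURE_TO_KEY = {
    "brow raise": "UP",
    "jaw drop": "DOWN",
    "jaw tighten": "LEFT",
    "lip tighten": "RIGHT",
    "lip puck": "ENTER",
}


def key_for(gesture):
    """Return the key assigned to a classified gesture, or None if unassigned."""
    return GESTURE_TO_KEY.get(gesture)


print(key_for("brow raise"))
print(key_for("head nod"))
```

Returning `None` for an unassigned label lets the caller ignore it instead of sending an unintended key press.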

The decision to put the electrodes on the face of the user also stems from the aim of avoiding mistranslation of the input command. Aside from the face muscles, two other prominent muscle groups available for EMG input are the eye muscles and the neck muscles. These two locations are avoided because of the restriction they may impose on the user. The neck and eyes are required for a person to observe and communicate; when we want to see something, we intuitively point our head and eyes toward it. The same happens when we hear a sudden loud noise or are otherwise surprised. Although placing the electrodes on the face cannot entirely eliminate the mistranslation problem, it can minimize its effect.

Another important issue of an HID is the delay of the system. Even when the input of the user is correctly translated, a considerably long delay would hinder the adoption of the system, so an algorithm with fast response is required. In our work, the delay is minimized by implementing the classification with HMM-ISWO while maintaining good accuracy. As mentioned before, ISWO is an improved version of PSO. The main motivation for using ISWO to improve the HMM is the difficulty of choosing one of the HMM parameters, the number of hidden states. ISWO optimizes the HMM by choosing the number of hidden states that produces the highest accuracy without sacrificing calculation time. With the implementation of ISWO, we achieve an average calculation time of 0.05 s. Although both HMM and PSO are classical algorithms, our results show that, with proper implementation, their combination raises the accuracy by 13.5 percentage points over the HMM-only algorithm (from 82.75% to 96.25%).
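To illustrate the idea of letting a swarm pick the number of hidden states, here is a minimal sketch using standard PSO over a single integer parameter. It is not the ISWO variant used in this work and omits its improvements. The objective `toy_score` is a made-up stand-in for the validation accuracy of an HMM trained with a given number of states, and the search range of 2 to 10 states, the swarm size, and the coefficients `w`, `c1`, `c2` are all assumptions.

```python
import random


def pso_select_states(score, low=2, high=10, n_particles=8, n_iters=30, seed=0):
    """Standard PSO over one integer parameter (not the paper's ISWO)."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5                 # assumed inertia/acceleration terms
    pos = [rng.uniform(low, high) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                          # personal best positions
    best_val = [score(round(p)) for p in pos]  # personal best scores
    g = max(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g], best_val[g]    # global best
    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (w * vel[i]
                      + c1 * r1 * (best_pos[i] - pos[i])
                      + c2 * r2 * (g_pos - pos[i]))
            # Keep the particle inside the allowed range of state counts.
            pos[i] = min(high, max(low, pos[i] + vel[i]))
            val = score(round(pos[i]))
            if val > best_val[i]:
                best_pos[i], best_val[i] = pos[i], val
            if val > g_val:
                g_pos, g_val = pos[i], val
    return round(g_pos), g_val


def toy_score(n_states):
    """Hypothetical objective peaking at 6 states; replace with real accuracy."""
    return -(n_states - 6) ** 2


n_states, value = pso_select_states(toy_score)
print(n_states, value)
```

In a real pipeline, `score` would train and validate an HMM for each candidate state count, which is the expensive step; caching scores per integer would avoid repeated training.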

The performance of EMG-based systems depends heavily on their classification algorithm. Figure 8 and Table 3 compare our system with previous state-of-the-art systems. The accuracy of these systems is obtained from equation (6); although some papers write the equation differently, the essential calculation is the same. The table shows that our work is comparable with the others in terms of accuracy. Although the other methods use inputs from other parts of the body and not the face muscles, the classification process for EMG signals is the same across different muscles, with the difference lying mainly in the maximum amplitude of the muscles.

Figure 8.

Accuracy comparison graph.

Table 3.

Comparison table of accuracy and number of employed electrodes.

	Accuracy (%)	Number of electrodes
This work	96.25	6
RMS-SRC²⁵	96.94	56
ECOC²⁶	95.63	15
LDA²⁰	94.6	8
LDA-TD²¹	89.2	8

RMS-SRC: root mean square descriptor; ECOC: error-correcting output codes; LDA: linear discriminant analysis.

RMS-SRC, employed by Geng et al.,²⁵ provides the highest accuracy of all the systems, 0.69 percentage points higher than our work. However, that system places 56 electrodes on the subject, excluding the reference electrode placed on the wrist, whereas our work utilizes six electrodes on the three channels of our system. RMS-SRC also requires a long calculation time: 21 s on average, as opposed to 0.05 s in our system. The system by Kanitz et al.²⁶ utilizes ECOC, which produces an accuracy slightly below our own at 95.63%. Their approach highlights the usage of onset classification, which puts a detector of muscle contraction before the classification takes place. The input of their classification phase is a fixed duration of signal captured from the input signal by their ODA, instead of a continuous signal. In our work, we employ the same mechanism of setting a threshold to capture a fixed duration of signal, but we use a duration of 1 s, as opposed to 300 ms in their system. Although a longer delay is expected, we prefer higher reliability to lower latency in this work. In future work, we will consider shortening the duration to obtain lower latency.
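The threshold-based capture mentioned above can be sketched as follows. This is an illustrative single-channel version, not the implementation used in this work: the absolute-amplitude onset rule, the 200 Hz sampling rate, and the name `capture_window` are assumptions; only the 1 s capture duration comes from the text.

```python
def capture_window(samples, threshold, sample_rate_hz=200, duration_s=1.0):
    """Return a fixed-length segment starting at the first onset.

    The onset is the first sample whose absolute value exceeds `threshold`.
    Returns None if no onset is found or the recording ends too early.
    """
    n = int(sample_rate_hz * duration_s)       # samples in one capture window
    for start, value in enumerate(samples):
        if abs(value) > threshold:
            segment = samples[start:start + n]
            return segment if len(segment) == n else None
    return None


# Synthetic signal: 50 quiet samples followed by 250 active samples.
signal = [0.0] * 50 + [1.0] * 250
window = capture_window(signal, threshold=0.5)
print(len(window))  # 200
```

Setting `duration_s=0.3` would give the shorter 300 ms window used by Kanitz et al., trading less captured signal for lower latency.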

The LDA algorithm utilized by Jiang et al.²⁰ produces a better result than LDA-TD by Patel et al.²¹ Although the method implemented by Patel et al.²¹ is an improved version of LDA, its accuracy is lower than that of the standard LDA. This can be attributed to differences in the hardware, the preprocessing method, and the experiment scenarios. Instead of focusing on the accuracy of individual muscle movements, LDA-TD is expected to perform better at detecting the coordination of several muscles. In our future work, we also consider utilizing the characteristics of coordinated movement instead of individual movement, such as grabbing a glass from the table and lifting it up rather than a single bicep flexion. Extending body movement translation to face gesture coordination is a further challenge that is interesting in itself.

Conclusion

In this article, a concept for an HID using an EMG-based gesture recognition scheme is proposed. The concept is built with quadriplegic users in mind. The system utilizes an MYO device from Thalmic Labs and implements custom software as the signal capturing tool. The electrodes of the EMG are placed at certain positions on the face, corresponding to the locations of the major muscles that govern facial gestures. To classify the gestures, the system utilizes HMM optimized by ISWO. Based on the gesture, custom commands can be assigned. Five different recognizable gestures are implemented in this system.

In the experiment conducted, the proposed system obtained an accuracy of 96.25%. For future work, we envision an HID system with real-time signal processing embedded in the hardware and built into the controlled devices, such as wheelchairs, computers, or even prosthetic limbs.

Footnotes

Handling Editor: Salvatore Serrano

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Taiwan Tech-Tokyo Tech Joint Research Program, under grant TIT-NTUST-107-05.

ORCID iDs

Xanno K Sigalingging

Jun-ichi Takada

References

1. Spinal cord injury, 2013, http://www.who.int/mediacentre/factsheets/fs384/en/ (accessed 23 May 2018).

2. Jackson, Dijkers, DeVivo, et al. A demographic profile of new traumatic spinal cord injuries: change and stability over 30 years. Arch Phys Med Rehabil 2004; 85(11): 1740–1748.

3. Pacnik, Benkic, Brecko. Voice operated intelligent wheelchair–VOIC. In: Proceedings of the IEEE international symposium on industrial electronics (ISIE 2005), Dubrovnik, 20–23 June 2005, vol. 3, pp. 1221–1226. New York: IEEE.

4. Qidwai, Shakir. Ubiquitous Arabic voice control device to assist people with disabilities. In: Proceedings of the 2012 4th international conference on intelligent and advanced systems (ICIAS2012), Kuala Lumpur, Malaysia, 12–14 June 2012, vol. 1, pp. 333–338. New York: IEEE.

5. Wolpaw, Birbaumer, McFarland, et al. Brain–computer interfaces for communication and control. Clin Neurophysiol 2002; 113(6): 767–791.

6. Qidwai, Shakir. Fuzzy classification-based control of wheelchair using EEG data to assist people with disabilities. In: Proceedings of the 19th international conference on neural information processing (ICONIP’12), Doha, Qatar, 12–15 November 2012, vol. 4, pp. 458–467. Berlin: Springer.

7. MacKenzie. Evaluating eye tracking systems for computer input. In: Majaranta, Aoki, Donegan, et al. (eds) Gaze interaction and applications of eye tracking: advances in assistive technologies. Hershey, PA: IGI Global, 2012, pp. 202–225.

8. Guness, Deravi, Sirlantzis, et al. Evaluation of vision-based head-trackers for assistive devices. In: Proceedings of the 2012 annual international conference of the IEEE engineering in medicine and biology society, San Diego, CA, 28 August–1 September 2012, pp. 4804–4807. New York: IEEE.

9. Jiang, Duerstock, Wachs. An analytic approach to decipher usable gestures for quadriplegic users. In: Proceedings of the 2014 IEEE international conference on systems, man, and cybernetics (SMC), San Diego, CA, 5–8 October 2014, pp. 3912–3917. New York: IEEE.

10. Bian, Hou, Chau, et al. Facial position and expression based human computer interface for persons with tetraplegia. IEEE J Biomed Hlth Inform 2016; 20(3): 915–924.

11. Epstein, Missimer, Betke. Using kernels for a video-based mouse-replacement interface. Pers Ubiquit Comput 2014; 18(1): 47–60.

12. Hassan, Huda, Uddin, et al. Human activity recognition from body sensor data using deep learning. J Med Syst 2018; 42: 1–8.

13. Fortino, Ghasemzadeh, Gravina, et al. Advances in multi-sensor fusion for body sensor networks: algorithms, architectures, and applications. Inform Fusion 2019; 45: 150–152.

14. Liu, Wang, et al. Learning structures of interval-based Bayesian networks in probabilistic generative model for human complex activity recognition. Pattern Recogn 2018; 81: 545–561.

15. Mandel, Luth, Laue, et al. Navigating a smart wheelchair with a brain-computer interface interpreting steady-state visual evoked potentials. In: Proceedings of the 2009 IEEE/RSJ international conference on intelligent robots and systems, St. Louis, MO, 10–15 October 2009, pp. 1118–1125. New York: IEEE.

16. Ullah, Ali, Rizwan, et al. Low-cost single-channel EEG based communication system for people with lock-in syndrome. In: Proceedings of the 2011 IEEE 14th international multitopic conference, Karachi, Pakistan, 22–24 December 2011, pp. 120–125. New York: IEEE.

17. Jali, Sulaima, Izzuddin, et al. Comparative study of EMG based joint torque estimation ANN models for arm rehabilitation device. Int J Appl Eng Res 2014; 9: 1289–1301.

18. Izzuddin, Ariffin, Bohari, et al. Movement intention detection using neural network for quadriplegic assistive machine. In: Proceedings of the 2015 IEEE international conference on control system, computing and engineering (ICCSCE), Penang, Malaysia, 27–29 November 2015, pp. 275–280. New York: IEEE.

19. Zhang, Zhao, Yao, et al. An adaptation strategy of using LDA classifier for EMG pattern recognition. In: Proceedings of the 2013 35th annual international conference of the IEEE engineering in medicine and biology society (EMBC), Osaka, Japan, 3–7 July 2013, pp. 4267–4270. New York: IEEE.

20. Jiang, Guo, et al. Feasibility of wrist-worn, real-time hand, and surface gesture recognition via sEMG and IMU sensing. IEEE Trans Ind Inform 2018; 14(8): 3376–3385.

21. Patel, Castellini, Hahne, et al. A classification method for myoelectric control of hand prostheses inspired by muscle coordination. IEEE Trans Neur Syst Rehabil Eng 2018; 26(9): 1745–1755.

22. Matsumura, Mitsukura, Fukumi, et al. Recognition of EMG signal patterns by neural networks. In: Proceedings of the 9th international conference on neural information processing (ICONIP’02), Singapore, 18–22 November 2002, vol. 2, pp. 750–754. New York: IEEE.

23. Gonzalez-Ibarra, Soubervielle-Montalvo, Vital-Ochoa, et al. EMG pattern recognition system based on neural networks. In: Proceedings of the 2012 11th Mexican international conference on artificial intelligence, San Luis Potosi, Mexico, 27 October–4 November 2012, pp. 71–74. New York: IEEE.

24. Leon, Gutierrez, Leija, et al. EMG pattern recognition using support vector machines classifier for myoelectric control purposes. In: Proceedings of the 2011 Pan American health care exchanges (PAHCE), 28 March–1 April 2011, pp. 175–178. Buenos Aires, Argentina: PAHCE.

25. Geng, Ouyang, Samuel, et al. A robust sparse representation based pattern recognition approach for myoelectric control. IEEE Access 2018; 6: 38326–38335.

26. Kanitz, Cipriani, Edin BB. Classification of transient myoelectric signals for the control of multi-grasp hand prostheses. IEEE Trans Neur Syst Rehabil Eng 2018; 26(9): 1756–1764.

27. Hargrove, Englehart, Hudgins. A comparison of surface and intramuscular myoelectric signal classification. IEEE Trans Biomed Eng 2007; 54(5): 847–853.

28. Giroux, Lamontagne. Comparisons between surface electrodes and intramuscular wire electrodes in isometric and dynamic conditions. Electromyogr Clin Neurophysiol 1990; 30: 397–405.

29. Chiou, Luh, Chen, et al. The comparison of electromyographic pattern classifications with active and passive electrodes. Med Eng Phys 2004; 26(7): 605–610.

30. Ling, HHC, Chan, et al. Hybrid particle swarm optimization with wavelet mutation and its industrial applications. IEEE Trans Syst Man Cyb B Cybernet 2008; 38(3): 743–763.

31. Kawakami, Meng. Improvement of particle swarm optimization. PIERS Online 2009; 5(3): 261–264.

32. Holi MS. Electromyography analysis for person identification. Int J Biomet Bioinform 2011; 5(3): 172.