Sage Journals: Discover world-class research

Abstract

Objective

In this study, we propose a method for removing artifacts from superficial electromyography (sEMG) data, which have been widely proposed for health monitoring because they encompass the basic neuromuscular processes underlying human motion.

Methods

Our method is based on a spectral source decomposition from single-channel data using a non-negative matrix factorization. The algorithm is validated with two data sets: the first contained muscle activity coupled to artificially generated noises and the second comprised signals recorded under fully unsupervised conditions. Algorithm performance was further assessed by comparison with other state-of-the-art approaches for noise removal using a single channel.

Results

The comparison of methods shows that the proposed algorithm achieves the highest performance on the noise-removal process in terms of signal-to-noise ratio reconstruction, root mean square error, and correlation coefficient with the original muscle activity. Moreover, the spectral distribution of the extracted sources shows high correlation with the noise sources traditionally associated with sEMG recordings.

Conclusion

This research shows the ability of spectral source separation to detect and remove noise sources coupled to sEMG signals recorded during unsupervised daily activities, which opens the door to the implementation of sEMG recording during daily activities for motor and health monitoring.

Keywords

Electromyography, daily logging, muscle activity, health monitoring, artifact removal

Introduction

The adoption of new technologies by large population groups has paved the way for the emergence of what is known as the “digital health era”, in which people are encouraged to take up new digital media technologies to engage in health self-monitoring.^1,2 This trend has been further accentuated since 2020, when the COVID-19 pandemic triggered a situation in which people were requested to socially distance and reduce visits to hospitals and other health centers.^3,4 Governments and health organizations have promoted the development and implementation of self-monitoring networks to reduce health care system burden and improve societal health.^5–7

Feasible self-monitoring is usually based on cheap and easy-to-use technologies that actually target societal needs. Some examples are self-measurement of blood glucose by people with diabetes⁸ and heart beat monitoring for athletes and people with heart disease.^9,10 It is likely that new self-monitoring technologies will be adopted by the general populace in the coming years.

The monitoring of muscle activity during daily life has been proposed for the early diagnosis and treatment of the age-related loss of muscle mass known as sarcopenia,¹¹ head, neck, and back pain,¹² and anxiety.¹³ Moreover, superficial electromyography (sEMG) has been extensively studied and correlated to the basic neuromuscular processes underlying human motion.¹⁴ For these reasons, daily logging of sEMG signals will likely play an important role in the future of health monitoring. Although the recording of muscle activity during daily life presents several challenges,¹⁵ sEMG can now be acquired using wearable and increasingly inexpensive devices.^16–19

In general, sEMG signals present a high signal-to-noise ratio (SNR) compared with other physiological signals such as electroencephalography and magnetic resonance images. In the laboratory, it is relatively simple to create experimental conditions in which artifacts affecting sEMG are suppressed or controlled. However, this becomes more challenging in the analysis of large amounts of data recorded during unsupervised daily tasks. The wide range of environmental and behavioral changes during daily activities generates huge variability in signal quality because the noises affecting muscle recordings under these conditions present heterogeneous distributions and are difficult to predict. Therefore, the identification and removal of noises during unsupervised recordings is one of the main challenges facing the implementation of daily health logging systems based on this technology.

Noise sources for sEMG signals are mainly divided into three groups: white Gaussian noise (WGN), which has an equally distributed spectrum; power line interference (PLI), which mainly affects 50/60 Hz (depending on the country); and low-frequency artifacts (LFAs), which include electrocardiography (for electrodes around the chest) and a range of disturbances produced by motion (e.g. electrode vibration, hanging wires, and wireless data loss).^20,21

Raw sEMG signals have a bandwidth in the approximate range of 10–200 Hz.^22,23 Regular bandpass filtering is normally accepted to reduce LFAs and WGN, while a common technique to remove PLI is the application of a 50/60-Hz notch filter.^24,25 However, this approach is not time discriminant; that is, it requires a priori knowledge of the noises affecting the signals and also assumes that the noises have a homogeneous distribution and temporal modulation. These assumptions can be true under controlled conditions but should not be made during unsupervised recordings.

Previous studies with high-density sEMG (HD-EMG) arrays have proposed noise removal methods based on source separation approaches such as canonical correlation analysis^21,26 and independent component analysis (ICA).²⁷ Their aim was to separate sEMG from noise components by using redundant information recorded from several electrodes placed on the same muscle. These approaches are strongly dependent on the amount of spatial information available.

For daily logging of sEMG data, the use of HD arrays is not always a feasible option. Apart from the increased cost associated with HD-EMG arrays, not all muscles are large enough to allow multiple simultaneous sEMG recordings. Aware of this issue, Mijovic et al. proposed a generalized noise source separation method from single-channel biological signal recordings that combines empirical-mode decomposition (EMD) (to decompose a signal into spectrally separated sources) and ICA (to extract statistically independent sources).²⁸ This technique was successfully validated with experimental data recorded under laboratory conditions and compared with other similar approaches. However, EMD decomposition assumes that the spectral components extracted are present throughout the signal under analysis. In this case, the limitations of EMD analysis during fully unsupervised conditions are unclear.

However, state-of-the-art applications of this technology still lack a developed and validated method for sEMG artifact removal during fully unsupervised activities. For this reason, the current work proposes a novel algorithm for the analysis of sEMG data with the following two main goals: (a) the effective detection of noise sources from data recorded during fully unsupervised conditions, and (b) the development of a noise removal technique that allows the discrimination of valid sEMG data from large and highly contaminated signals.

Our approach targets the separation of muscle activity from noise sources by using a non-negative matrix factorization (NMF) over the signal spectrum. To perform such a decomposition, the NMF algorithm extracts common spectral patterns from a number of signals. Redundancy is achieved by the temporal segmentation of the signal into overlapping epochs. The data used for source extraction are the normalized spectral magnitude of those epochs. This article describes the proposed methodology and compares its performance with that of regular bandpass filtering and of the method proposed by Mijovic et al.²⁸ Two different data sets were used to validate methodology performance. A set of sEMG signals combined with artificially created noises was used to compare the signal reconstruction performance among methods. In addition, to show the effects of each method on real data, a second set of signals recorded during fully unsupervised conditions was used.

Materials and methods

Conceptual source decomposition

In general, most of the power of raw sEMG signals is contained in a frequency range between 10 and 200 Hz,^22,23 with an approximate energy distribution as presented in Figure 1(A). Although the noise sources might overlap the spectral distribution of sEMG data, they present independent modulation. This means that, with enough information, the spectrum of sEMG signals contaminated with different noise sources (Figure 1(B)) can be decomposed into signal and noise components. Figure 1(C) shows an approximate graphical representation of the spectral distribution associated with the three main noise sources affecting sEMG signals (LFAs, PLI, and WGN).²¹ The present work focuses on the detection and removal of these sources by using redundant information extracted from temporally shifted signal epochs.

Figure 1.

Spectral sources. (A) Spectral distribution associated with sEMG signals. (B) Spectral distribution of sEMG activity coupled to noise sources. (C) Spectral distributions of the main noise sources affecting sEMG signals: white Gaussian noise (WGN), low-frequency artifacts (LFAs), and power line interference (PLI).

General workflow

Figure 2 shows the general workflow of the methodology presented in this work. First, single-/multi-channel signals are segmented (Figure 2, segmentation) and the spectrum of each epoch is computed and normalized (Figure 2, spectral computation). Standardized spectra undergo an NMF in which the noise sources are decoupled from the sEMG spectrum (Figure 2, source separation). Then, noise source contributions are removed from each epoch, leaving only the contribution related to sEMG activation (Figure 2, filter design/noise removal). Clean epochs are shifted back to the time domain and the final signal is reconstructed by the temporal arrangement of all of the clean epochs. The technical details of each stage are provided in Appendix 1.

Figure 2.

General workflow. Stages of the proposed algorithm for sEMG noise removal. Temporal signals are segmented into L-length epochs with an O-overlap prior to a spectral extraction and normalization through a fast-Fourier transform. Spectral information undergoes non-negative matrix factorization to separate the sEMG information from the noise sources. Based on this decomposition, epoch-specific filters are designed to remove the relative spectral contributions associated with the noise sources. After filtering, the remaining spectral distribution, associated with sEMG information, is turned into the time domain. Finally, the original signal is reconstructed from all clean signal epochs.
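The segmentation, spectral computation, and source separation stages can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the epoch length, overlap, sum-to-one normalization, and the Lee-Seung multiplicative-update NMF are all assumptions chosen for illustration (the actual settings are given in the appendices), and the names `epoch_spectra` and `nmf` are hypothetical.

```python
import numpy as np

def epoch_spectra(signal, epoch_len, overlap):
    """Split a 1-D signal into overlapping epochs and return normalized FFT magnitudes."""
    step = epoch_len - overlap
    starts = range(0, len(signal) - epoch_len + 1, step)
    epochs = np.stack([signal[s:s + epoch_len] for s in starts])
    mags = np.abs(np.fft.rfft(epochs, axis=1))      # one-sided magnitude per epoch
    norms = mags.sum(axis=1, keepdims=True)
    norms[norms == 0] = 1.0                         # guard against silent epochs
    return mags / norms                             # assumed normalization: each row sums to 1

def nmf(V, n_sources, n_iter=200, seed=0, eps=1e-12):
    """Multiplicative-update NMF: V (epochs x freqs) ~ W (epochs x sources) @ H (sources x freqs)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], n_sources)) + eps
    H = rng.random((n_sources, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)        # update source spectra
        W *= (V @ H.T) / (W @ H @ H.T + eps)        # update per-epoch weights
    return W, H

# Toy input: 10 s of random noise plus a 50 Hz tone, sampled at 2000 Hz.
fs = 2000
t = np.arange(10 * fs) / fs
x = np.random.default_rng(1).standard_normal(t.size) + 0.5 * np.sin(2 * np.pi * 50 * t)

V = epoch_spectra(x, epoch_len=1000, overlap=500)   # hypothetical L = 1000, O = 500 samples
W, H = nmf(V, n_sources=4)                          # four sources, as in the article
```

Each row of `H` is a candidate spectral source and each row of `W` holds the weights of those sources for one epoch, which is the information the later filter-design stage relies on.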

Noise removal

After source decomposition, the spectrum of each epoch, $\mathit{Ep}_{e}(f)$, can be reconstructed as $\mathit{REp}_{e}(f)$, the linear combination of one sEMG source and three noise sources:

$$\mathit{REp}_{e}(f) = \mathit{sEMGSource}_{e}(f) + \mathit{NoiseSources}_{e}(f) \tag{1}$$

where

$$\mathit{sEMGSource}_{e}(f) = \mathit{sEMG}(f) \cdot W_{e}^{\mathit{sEMG}} \tag{2}$$

$$\mathit{NoiseSources}_{e}(f) = \mathit{WGN}(f) \cdot W_{e}^{\mathit{WGN}} + \mathit{LFA}(f) \cdot W_{e}^{\mathit{LFA}} + \mathit{PLI}(f) \cdot W_{e}^{\mathit{PLI}} \tag{3}$$

Here, $\mathit{sEMG}(f)$, $\mathit{WGN}(f)$, $\mathit{LFA}(f)$, and $\mathit{PLI}(f)$ are the sources associated with sEMG and each noise, respectively, and $W_{e}^{\mathit{sEMG}}$, $W_{e}^{\mathit{WGN}}$, $W_{e}^{\mathit{LFA}}$, and $W_{e}^{\mathit{PLI}}$ are the weights of each source associated with the epoch $e$. From this source-based reconstruction, it is possible to define an epoch-specific filter that removes the noise contributions from each frequency of the original epoch $\mathit{Ep}_{e}(f)$. The filter is described by the following equation:

$$\mathit{SF}_{e}(f) = 1 - \frac{\mathit{NoiseSources}_{e}(f)}{\mathit{REp}_{e}(f)} \tag{4}$$

Finally, the clean spectrum of each epoch $e$ is computed as the element-by-element multiplication of the original epoch spectrum $\mathit{Ep}_{e}(f)$ by the epoch-specific filter $\mathit{SF}_{e}(f)$:

$$\mathit{CleanEp}_{e}(f) = \mathit{Ep}_{e}(f) \cdot \mathit{SF}_{e}(f) \tag{5}$$
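As an illustration of equations (1) to (5), the sketch below builds the epoch-specific filter from a factorization with weights `W` (epochs × sources) and source spectra `H` (sources × frequencies). The function name `clean_epochs`, the matrix layout, and the small `eps` guard against division by zero are assumptions made for this example, not details taken from the article.

```python
import numpy as np

def clean_epochs(Ep, W, H, semg_idx=0, eps=1e-12):
    """Remove the noise-source contribution from each epoch spectrum.

    Ep : (n_epochs, n_freqs) original epoch magnitude spectra
    W  : (n_epochs, n_sources) per-epoch source weights
    H  : (n_sources, n_freqs) source spectra; row `semg_idx` is the sEMG source
    """
    REp = W @ H                                     # eq. (1): source-based reconstruction
    semg = np.outer(W[:, semg_idx], H[semg_idx])    # eq. (2): sEMG contribution
    noise = REp - semg                              # eq. (3): sum of the remaining sources
    SF = 1.0 - noise / (REp + eps)                  # eq. (4): epoch-specific filter
    return Ep * SF, SF                              # eq. (5): element-wise filtering

# Toy factorization: 2 epochs, 3 frequency bins, 4 sources (sEMG, WGN, LFA, PLI).
H = np.array([[0.0, 1.0, 0.5],
              [0.2, 0.2, 0.2],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
W = np.array([[1.0, 0.5, 0.0, 2.0],
              [0.5, 0.0, 1.0, 0.0]])
Ep = W @ H                                          # epochs that the model reconstructs exactly
clean, SF = clean_epochs(Ep, W, H)
```

When the reconstruction is exact, as in this toy case, the filtered spectrum reduces to the sEMG contribution alone; on real data the filter only attenuates each frequency by the estimated noise fraction.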

Method comparison

To test the proposed methodology, the extraction and removal of noise sources was compared among three different methods. The first comprised a regular bandpass and notch filtering (hereinafter, RFilt), the second was based on the approach introduced in this work (hereinafter, FFT-NMF), and the third used the EMD-ICA approach previously introduced in the literature.²⁸ The technical details of the parameter tuning for each method can be found in Appendix 2.

The NMF and ICA source separation techniques require a priori selection of the number of sources into which the data will be decomposed. Four sources were selected in both methods (FFT-NMF and EMD-ICA): one source representing the sEMG data and three sources representing WGN, PLI, and LFA noises. Both source separation algorithms return randomly arranged components. Therefore, to avoid biases generated by a visual component arrangement, the sources extracted were rearranged according to their similarity to the model spectra shown in Figure 3. These signals are a simplification of the spectral magnitudes associated with each expected spectral source. Similarity was tested by computing the correlation coefficient matrix between the sources extracted from each session and the modeled spectral distributions. The highest coefficient values were used to rearrange the data.

Figure 3.

Modeled spectral distributions. A simple model of the spectral distribution associated with sEMG signals and the three main noise sources described (WGN, LFAs, and PLI). This model is used as a common reference point for the rearrangement of the sources extracted from the FFT-NMF and EMD-ICA methods.
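The rearrangement step can be sketched as follows. The article states only that the highest correlation coefficients were used, so the greedy one-to-one assignment below is an assumption, as are the function name `rearrange_sources` and the toy model spectra (which are not the curves of Figure 3).

```python
import numpy as np

def rearrange_sources(extracted, models):
    """Reorder extracted spectra so that row i best matches model spectrum i."""
    n = models.shape[0]
    corr = np.corrcoef(models, extracted)[:n, n:]   # corr[i, j]: model i vs extracted source j
    work = np.nan_to_num(corr, nan=-1.0)            # a constant spectrum has undefined correlation
    order = np.full(n, -1)
    for _ in range(n):
        i, j = np.unravel_index(np.argmax(work), work.shape)
        order[i] = j                                # highest remaining coefficient wins
        work[i, :] = -np.inf                        # model i is now assigned
        work[:, j] = -np.inf                        # extracted source j is now used
    return extracted[order], corr

# Toy model spectra over six frequency bins (illustrative shapes only).
models = np.array([[0.0, 1.0, 3.0, 2.0, 1.0, 0.0],      # sEMG-like hump
                   [1.0, 1.1, 0.9, 1.0, 1.05, 0.95],    # roughly flat, WGN-like
                   [5.0, 2.0, 0.5, 0.0, 0.0, 0.0],      # low-frequency, LFA-like
                   [0.0, 0.0, 0.0, 4.0, 0.0, 0.0]])     # narrow peak, PLI-like
extracted = 2.0 * models[[2, 0, 3, 1]]                  # shuffled and rescaled components
reordered, corr = rearrange_sources(extracted, models)
```

Because the correlation coefficient ignores scale, the shuffled components are matched back to their models even though their amplitudes differ.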

Two data sets were used to validate and compare methodology performance. The first data set included single-channel sessions in which sEMG data recorded under laboratory conditions were combined with different distributions of artificially generated noise sources (including LFAs, PLI, and WGN). The second data set comprised sessions of sEMG data recorded during unsupervised daily activities.

Data set 1: artificial noises

Artificial noise data generation

Figure 4 describes the process followed to add artificial noises to sEMG data recorded under laboratory conditions. A 10-min session of sEMG recordings was combined with three types of noise whose spectral distributions corresponded to those associated with LFAs, PLI, and WGN.

Figure 4.

Addition of artificial noises. sEMG signal recorded under controlled conditions is coupled with three different noise sources generated artificially following the spectral distributions associated with white Gaussian noise, low-frequency artifacts, and power line interference.

Noise source modulation and their temporal features were randomized to achieve the unstable behavior expected from unsupervised recordings. Figure 5 describes the details for noise generation and sEMG data acquisition. Figure 5(A) shows a 10-min sEMG signal recorded from the sternocleidomastoid muscle of a healthy subject under laboratory conditions. The subject was instructed to sit in a chair and wait for 10 min. During the waiting period, the subject was allowed to perform any desired postural changes and head motions. The fast-Fourier transform (FFT) of the data presented in the right graph of Figure 5(A) shows the spectral distribution of the power of the data.

Figure 5.

Artificial noise generation. (A) A temporal and spectral representation of the sEMG data used to generate the data from data set 1. (B) Each artificial noise source added to sEMG data was described by two vectors: the normalized noise source (NNS) (containing time data with the spectral distribution associated with a specific noise source) and the noise appearance vector (NAV) (containing information about noise appearance timings and amplitude). (C) The normalized noise sources were computed from white Gaussian noise generated from random values. The spectral magnitude and phase of each vector were computed through a fast-Fourier transform (FFT). Magnitudes were modified according to the expected spectral distribution of each noise source. Finally, the NNS was computed as the inverse fast-Fourier transform (IFFT) of the original phase and the modified magnitude. (D) Algorithm followed for the computation of the noise appearance vector. An iterative process generates activation periods with a length, amplitude, and position randomly selected from an amplitude vector (AV), length vector (LV), and the available positions given by the length of the sEMG signal (P). The iterative process finishes when noise activations reach 80% of the total signal length.

Figure 5(B) shows the elements required for noise generation. Each noise source was described by two vectors. The first was a 10-min normalized noise source (NNS) in which the spectral distribution corresponded with one of the three main noises affecting sEMG data. The second was a 10-min noise appearance vector (NAV) defining the noise appearance periods and modulating the noise amplitude during the 10-min session. The element-by-element multiplication of both vectors defined the features and behavior of a single noise source during a single session. Figure 5(C) shows the process followed to generate the NNS. First, a random 10-min vector was generated with values between 1 and −1 (top-left graph, Figure 5(C)). The spectrum of this vector was computed through an FFT and divided into its magnitude and phase (note that its magnitude behaves as WGN with an equally distributed spectrum). The noise source spectral distributions modeled in Figure 3 were used to fit the computed magnitude into the spectral distribution expected for LFAs, PLI, or WGN. The modified magnitude and its original phase were used to reconstruct the signal in the temporal domain by an inverse FFT (IFFT). Finally, the temporal signal was normalized between 1 and −1.
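The NNS generation can be sketched in a few lines of NumPy. How the magnitude is "fitted" to the model is not detailed in the text, so multiplying the random magnitude by a spectral envelope is an assumption here, as are the function name `normalized_noise_source` and the narrow 50 Hz envelope used in the example.

```python
import numpy as np

def normalized_noise_source(n_samples, envelope, seed=0):
    """Shape uniform random noise with a one-sided spectral envelope.

    envelope must have n_samples // 2 + 1 values, one per rfft bin.
    """
    rng = np.random.default_rng(seed)
    base = rng.uniform(-1.0, 1.0, n_samples)        # random vector with values in [-1, 1]
    spec = np.fft.rfft(base)
    magnitude, phase = np.abs(spec), np.angle(spec)
    shaped = (magnitude * envelope) * np.exp(1j * phase)   # modified magnitude, original phase
    noise = np.fft.irfft(shaped, n=n_samples)       # back to the time domain
    peak = np.max(np.abs(noise))
    return noise / peak if peak > 0 else noise      # normalize between -1 and 1

# Example: a PLI-like source, 2 s at 2000 Hz, with energy concentrated around 50 Hz.
fs, n = 2000, 4000
freqs = np.fft.rfftfreq(n, d=1 / fs)
envelope = np.exp(-0.5 * ((freqs - 50.0) / 1.0) ** 2)
pli_like = normalized_noise_source(n, envelope)
peak_freq = freqs[np.argmax(np.abs(np.fft.rfft(pli_like)))]
```

Swapping the envelope for a flat or low-pass shape would give WGN-like or LFA-like sources under the same assumptions.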

For NAV generation (Figure 5(D)), several parameters were considered. Each noise source was set to be active for 80% of the total session duration (8 out of 10 min). Moreover, to decide the length and amplitude of each activation period, two vectors were used. These vectors, amplitude vector (AV), and length vector (LV), were defined as:

$$\mathit{AV} = [\mathit{MaxAmp},\ \mathit{MaxAmp} - \mathit{incA},\ \dots,\ \mathit{MinAmp}] \tag{6}$$

$$\mathit{MinAmp} = \frac{3}{4} \cdot \mathit{sEMG}_{p95} \tag{7}$$

$$\mathit{MaxAmp} = 1.5 \cdot \mathit{sEMG}_{p95} \tag{8}$$

$$\mathit{incA} = \frac{\mathit{MaxAmp} - \mathit{MinAmp}}{99} \tag{9}$$

$$\mathit{LV} = [\mathit{MinLen},\ \mathit{MinLen} + \mathit{incL},\ \dots,\ \mathit{MaxLen}]$$

$$\mathit{incL} = \frac{\mathit{MaxLen} - \mathit{MinLen}}{99}$$

where $\mathit{sEMG}_{p95}$ is the 95th percentile of the rectified amplitude from the sEMG signal and EMGLen is the length of the session (10 min). Both vectors contain 100 values, arranged in descending order for AV and in ascending order for LV. Figure 5(D) summarizes the algorithm workflow to generate the NAV. First, a random amplitude and length are chosen from AV and LV, respectively. In addition, a random starting sample is chosen from the number of samples within the total length of the data. If these randomly chosen values allow the creation of an activation period that fits within the session time and does not overlap with any other activation period, the period is included in the NAV. The algorithm keeps repeating this loop, adding new activation periods, until the total time of noise activation reaches 80% of the signal length. In addition, a stability index (SI) was defined as a number between 1 and 100. The SI was used to tune the range of available values in AV and LV as:

$$\mathit{AV}_{\mathit{available}} = \mathit{AV}[\mathit{SI} : \mathit{end}] \tag{10}$$

$$\mathit{LV}_{\mathit{available}} = \mathit{LV}[\mathit{SI} : \mathit{end}] \tag{11}$$

Therefore, for SI = 100, there will be a single value available for both vectors ($\mathit{AV}_{\mathit{available}} = \mathit{AV}[100] = \mathit{MinAmp}$ and $\mathit{LV}_{\mathit{available}} = \mathit{LV}[100] = \mathit{MaxLen}$), and for SI = 1, all amplitude and length values in AV and LV will be available. Modification of this parameter allows tuning of the noise variability both in amplitude and active time.
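A minimal sketch of the NAV generation loop is given below. The length range (10 to 200 samples), the unit value used for the 95th-percentile amplitude, the `max_tries` guard, and the function name `noise_appearance_vector` are hypothetical choices for this example; the SI is treated as a 1-based index into AV and LV, following the text.

```python
import numpy as np

def noise_appearance_vector(n_samples, av, lv, si, coverage=0.8, seed=0, max_tries=100_000):
    """Place non-overlapping activation periods until `coverage` of the signal is active.

    av, lv : 100-value amplitude and length (in samples) vectors
    si     : stability index in 1..100; only av[si-1:] and lv[si-1:] remain available
    """
    rng = np.random.default_rng(seed)
    av_avail, lv_avail = av[si - 1:], lv[si - 1:]   # 1-based SI mapped to a 0-based slice
    nav = np.zeros(n_samples)
    active = 0
    for _ in range(max_tries):
        if active >= coverage * n_samples:
            break
        amp = rng.choice(av_avail)                  # random amplitude
        length = int(rng.choice(lv_avail))          # random length
        start = int(rng.integers(0, n_samples))     # random starting sample
        end = start + length
        if end <= n_samples and not nav[start:end].any():   # fits and does not overlap
            nav[start:end] = amp
            active += length
    return nav

semg_p95 = 1.0                                      # placeholder for the measured percentile
min_amp, max_amp = 0.75 * semg_p95, 1.5 * semg_p95
av = np.linspace(max_amp, min_amp, 100)             # descending amplitudes
lv = np.linspace(10, 200, 100).astype(int)          # ascending lengths (hypothetical range)
nav = noise_appearance_vector(20_000, av, lv, si=1)
```

The `max_tries` guard is included because purely random placement may stall before reaching the target coverage when only long activation periods are available. Multiplying `nav` element by element with a normalized noise source of the same length would give one noise source for one session.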

Data set generation

The process described in Figure 4 was repeated 100 times for each $SI = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]$, giving a total of 1000 10-min sessions in which different distributions of LFA, PLI, and WGN sources were added to the sEMG signal.

Method performance validation

The results obtained from this data set were used to compare the performances of each method regarding the separation of sEMG signals from noise sources. For that purpose, the original SNR data were compared to the SNR predicted from each method after source decomposition. Efficiency in signal and noise reconstruction was quantified by two parameters. The first parameter was the root mean square error (RMSE), computed as:

$$\mathit{RMSE} = \sqrt{\mathrm{avg}\left((S_{0} - S_{r})^{2}\right)} \tag{12}$$

where $\mathrm{avg}\left((S_{0} - S_{r})^{2}\right)$ is the averaged value of the squared difference between the original signal $S_{0}$ and the reconstructed signal $S_{r}$. This parameter quantifies the energy difference between two signals. The second parameter was the correlation coefficient between the original and reconstructed recordings. This coefficient quantifies, with a value between −1 and 1, the similarity in the shapes of the two vectors.
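Both reconstruction metrics are short to compute. The sketch below uses NumPy and a toy signal pair; the function names `rmse` and `corr_coef` are chosen for this example.

```python
import numpy as np

def rmse(s0, sr):
    """Root mean square error between the original and reconstructed signals."""
    s0, sr = np.asarray(s0, dtype=float), np.asarray(sr, dtype=float)
    return np.sqrt(np.mean((s0 - sr) ** 2))

def corr_coef(s0, sr):
    """Pearson correlation coefficient between two signals of equal length."""
    return np.corrcoef(s0, sr)[0, 1]

# A reconstruction with the right shape but half the amplitude.
s0 = np.array([0.0, 1.0, 0.0, -1.0])
sr = 0.5 * s0
error = rmse(s0, sr)           # nonzero: the energy differs
similarity = corr_coef(s0, sr) # close to 1: the shape is preserved
```

The example shows why the two parameters are complementary: a pure amplitude error is penalized by the RMSE but leaves the correlation coefficient unchanged.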

Data set 2: unsupervised sEMG recordings

Experimental set-up

Unsupervised sEMG data were acquired using a medium-density band recording 8 channels from 16 bipolar electrodes (Figure 6(B)) with a sampling frequency of 2000 Hz. A single electrode located in the area of the sternocleidomastoid muscle was selected for analysis. The system included a belt pouch with a battery providing autonomy of around 10 h. Every 10 min, a data file was stored on an SD card (Figure 6(C)).

Figure 6.

sEMG recording system and set-up. (A) A subject wearing the whole system for neck sEMG recordings. (B) Detailed view of the neck sEMG band. (C) sEMG recording system including the SD card for data storage, battery for 10-h autonomy, and hand button for external flag generation.

Participants

Five patients with neck pain were requested to wear the sEMG band shown in Figure 6(A). They were instructed on how to set up the recording device and all of them provided signed informed consent according to the Declaration of Helsinki. No further instructions were given, and the participants were asked to continue with their normal daily activities.

Data set

Each patient recorded around 5 h of muscle activity. Thirty 10-min recordings were extracted from each patient, giving a total of 150 10-min sessions (30 sessions × 5 patients).

Data evaluation

This data set was used to visualize the spectral distribution of sEMG and the noise sources extracted by the RFilt, FFT-NMF, and EMD-ICA methods during real unsupervised sEMG recordings. Moreover, the separability between spectral component pairs was quantified for each method as their Bhattacharyya distance (Bdist), first introduced by Bhattacharyya et al.²⁹ This parameter is closely related to the Bhattacharyya coefficient, which measures the amount of overlap between two statistical samples or populations. Bdist can provide any value in the range 0 < Bdist < ∞, with larger values associated with higher class separability.
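For reference, the Bhattacharyya distance between two discrete distributions is the negative logarithm of their Bhattacharyya coefficient. The sketch below treats two non-negative spectra as discrete distributions by normalizing them to unit sum; this is an assumption for illustration, since the text does not state how the distance was computed from the spectral components, and the `eps` floor is only there to keep the result finite for non-overlapping inputs.

```python
import numpy as np

def bhattacharyya_distance(p, q, eps=1e-12):
    """Bhattacharyya distance between two non-negative vectors treated as distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()                                 # normalize to unit sum
    q = q / q.sum()
    bc = np.sum(np.sqrt(p * q))                     # Bhattacharyya coefficient, between 0 and 1
    return -np.log(max(bc, eps))                    # larger value means less overlap

same = bhattacharyya_distance([1, 2, 3], [1, 2, 3])             # full overlap
partial = bhattacharyya_distance([0.5, 0.5, 0], [0, 0.5, 0.5])  # half overlap
disjoint = bhattacharyya_distance([1, 0], [0, 1])               # no overlap
```

Identical inputs give a distance of about zero, while inputs with no common support give the largest value the `eps` floor allows.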

Details of the study

The patients recorded their sEMG data themselves during their daily lives in Saitama and Tokyo Prefectures in Japan, over periods that varied between 4 and 17 days. The conceptualization of the experiment, generation of artificial noises, and recordings of clean sEMG signals were performed at the same time in the facilities of Riken in Nagoya. Data analysis, results extraction, and interpretation were done during the following months, also in Riken's laboratory in Nagoya.

Results

sEMG reconstruction performance

The signal/noise decomposition performance for each method was assessed by using the data set including real sEMG data coupled to artificially created noises (data set 1), in which the original sources and SNR were known. Figure 7(A) compares the SNR of the original data with that predicted from source decomposition using the RFilt, FFT-NMF, and EMD-ICA methods. Each boxplot represents the SNRs of 100 10-min sessions associated with different values of the SI, as shown on the x-axis (original SNR, blue; FFT-NMF-predicted SNR, green; EMD-ICA-predicted SNR, red; RFilt-predicted SNR, orange). As designed, the real SNR (blue) decreases with an increasing variability of noise appearance. The predicted SNRs presented significant differences for the RFilt, FFT-NMF, and EMD-ICA methods, both compared with the original SNR and among each other. However, the SNR predicted by the FFT-NMF algorithm exhibited higher accuracy and less variability than that predicted by the RFilt and EMD-ICA methods, which showed a significantly higher standard deviation for all values of the SI. Moreover, the RFilt method presented significantly lower performance in SNR prediction and was also more affected by noise stability than the methods based on source decomposition.

Figure 7.

sEMG reconstruction performance. (A) Comparison between the signal-to-noise ratio of the signals from data set 1 (blue) and the signal-to-noise ratio predicted by FFT-NMF (green), EMD-ICA (red), and RFilt (orange) methods. (B) Root mean square error between the original sEMG data and the sEMG data reconstructed by FFT-NMF (green), EMD-ICA (red), and RFilt (orange) methods. (C) Correlation coefficient between the original sEMG data and the sEMG data reconstructed by FFT-NMF (green), EMD-ICA (red), and RFilt (orange) methods.

Figure 7(B) and (C) use the same boxplot format to show the RMSE and correlation coefficients between the original sEMG data and the sEMG data reconstructed by each method (FFT-NMF, green; EMD-ICA, red; and RFilt, orange). For each SI value, the FFT-NMF method showed a lower RMSE (error recovering the energy of the original data) and a higher correlation coefficient (shape similarity to original data) than the RFilt and EMD-ICA methods. The increase in the noise variability (a decrease in the SI) produced, in the three methods, a decrease in the energy and shape reconstruction performance. This phenomenon was stronger in the sEMG data extracted using the RFilt method. Moreover, the FFT-NMF algorithm showed a more stable reconstruction (lower standard deviation) for lower stability values (70 ≥ SI ≥ 10).

Real data analysis

Extracted sources. Data set 2 was used to compare the signal decomposition results from the FFT-NMF and EMD-ICA algorithms applied to data recorded from the sternocleidomastoid muscle during unsupervised tasks. In Figure 8(A) and (B), the four spectral components extracted using each method are presented according to their median (red), 25th and 75th percentiles (blue), and maximum and minimum values (black) computed from the set of 150 sessions (30 10-min sessions × 5 subjects). Spectral components were arranged according to their correlation coefficient with the sources modeled in Figure 3, which represent a simplification of the spectral distributions expected for the sEMG and LFA, WGN, and PLI sources. Figure 8(C) shows the Bhattacharyya distances computed between each pair of components extracted using each method. The spectral distributions of sources extracted using the FFT-NMF method showed higher similarity with the conceptual decomposition of the data into four sources representing the sEMG data and WGN, PLI, and LFA sources. Moreover, Bhattacharyya distances exhibited higher class separability for component pairs extracted using the FFT-NMF method. Figure 8(D) shows an example of the FFT-NMF algorithm reconstructing the original sEMG signals, recorded under a controlled environment, from their version with artificial noises added.

Figure 8.

Sources extracted by FFT-NMF and EMD-ICA. sEMG and noise sources extracted from data set 2 and rearranged according to the model spectral distributions defined in Figure 3. (A) Sources extracted from FFT-NMF. (B) Sources extracted from EMD-ICA. (C) Bhattacharyya distances computed for each pair of sources. (D) Example of the FFT-NMF algorithm used to remove artificial noises coupled to clean sEMG signals.

Clean data and sEMG estimation. Figure 9 shows the results of the use of the FFT-NMF algorithm to remove the noise contributions from two 10-min recordings from the sternocleidomastoid muscle during fully unsupervised tasks. The graphs on the left show the raw data (top graph), the clean sEMG signal (blue) + original signal reconstruction (OSR) (orange) (middle graph), and the epoch-wise predicted SNR (bottom graph). The right graphs show a zoom-in of selected periods that allows a clear visualization of the effects of the noise removal technique. Both for raw signals (two upper graphs) and clean data (two lower graphs), a time/amplitude and a spectrogram representation are shown. Three signal periods labeled as 1*, 2*, and 3* are highlighted from both examples to discuss algorithm effects. 1* shows an example in which the main contribution of the algorithm is the removal of WGN (affecting mainly the higher frequency bands), allowing a clear visualization of the muscle activity after cleaning. 2* shows an example in which the strong coupling of PLI is removed by the algorithm, leaving only the part in which a small sEMG activation happened. 3* shows an example of the ability of the FFT-NMF algorithm to remove an LFA that produced low-frequency variations in the raw data. The OSR ratio and SNR provide extra information about the relative power between the predicted sEMG signals and noises. Both ratios can be used for a time-wise estimation of the physiological meaning of cleaned signals.

Figure 9.

Noise removal results. Two examples of the FFT-NMF algorithm applied to data recorded during fully unsupervised recordings. Graphs on the left show the temporal information of raw and clean recordings. Graphs on the right show the spectral properties of the signals under analysis. The three data periods labeled as 1*, 2*, and 3* show the ability of the algorithm to remove WGN, PLI, and LFA sources.

Finally, Figure 10 shows an example of the RFilt, FFT-NMF, and EMD-ICA methods applied to a highly contaminated session where most of the recorded data was noise. The blue data represent the raw recorded data while the orange data represent the signal after noise removal for each method. The effects on the spectral domain can be seen in the respective FFTs presented in the right graphs.

Figure 10.

Example of highly contaminated data. Example of each method applied to highly contaminated data. The left graphs show the temporal domain of a single 10-min session. The blue signal shows the raw data while the orange signals represent the clean data after the application of each method. The right graphs show the frequency domain computed from a fast-Fourier transform between 0 and 1000 Hz.

Discussion

To the best of our knowledge, the current work presents the first method for the effective detection and removal of noise sources from sEMG recordings during fully unsupervised daily activities. The method can be fully automated after establishment of the number of sources (four in the case of this article) and their spectral distributions (Figure 3). Depending on the data set under analysis, the number of sources and the model of their spectral distribution can be tuned to increase method performance.

The method was compared to traditional signal filtering and to a state-of-the-art approach (EMD-ICA) previously introduced in the literature.²⁸ The FFT-NMF method showed higher performance both when reconstructing the original sEMG data and when predicting the SNR (Figure 7). The segmentation of the original data into overlapping epochs generates sufficient signal redundancy for the extraction of common spectral sources, which allows its application to both single- and multi-channel recordings. Moreover, epoching decouples time and frequency domains, which improves the detection of sources whose contributions are not stable in time. The results also show that traditional noise removal (RFilt) is more affected by higher noise variabilities than FFT-NMF and EMD-ICA methods (Figure 7). The sources extracted with the FFT-NMF algorithm present a spectral distribution that correlates with the main noises affecting sEMG data (Figure 8(A)).²⁰ Such a correlation was not found in the sources extracted using the EMD-ICA method, at least when applied to fully unsupervised recordings (Figure 8(B)). The epoch-specific filters generated by the FFT-NMF method allow fine-tuning of the spectral bands and power reduction in accordance with the noises affecting the data in each time period (Figure 9).

In addition, the OSR and SNR indices allow the detection of time periods of interest that can be used to estimate the physiological meaning of the cleaned signals. Finally, the FFT-NMF algorithm allows a signal reconstruction that illustrates which spectral features follow a distribution similar to the one expected from sEMG signals (Figure 10).

The effectiveness of the current algorithm is strongly related to the number of noises coupled to the sEMG signals, because the redundancy in their spectral distributions over time is what allows the extraction of the spectral pattern associated with each noise. The algorithm might therefore present limitations if applied to large data sets that contain very few coupled noises, because there will not be enough redundant noise information to extract common spectral patterns between epochs. Moreover, researchers interested in applying the current methodology should carefully check the origins of the LFA noises coupled to their signals. Given the multi-origin nature of these noises, fine-tuning their reference spectral distribution could be vital for the proper detection and removal of these artifacts.
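When tuning the reference spectral distributions mentioned above, one option is to score each extracted source against each reference with a similarity measure for distributions. The sketch below uses the Bhattacharyya coefficient (the measure introduced in reference 29) as an assumption about how such a comparison could be made. The reference shapes (`power_line`, `low_freq`, `muscle`), the frequency grid, and all function names are hypothetical placeholders and are not the distributions shown in Figure 3.

```python
import numpy as np

def normalise(p, eps=1e-12):
    """Scale a non-negative spectrum so that it sums to (almost exactly) 1."""
    p = np.clip(np.asarray(p, dtype=float), 0.0, None)
    return p / (p.sum() + eps)

def bhattacharyya_coefficient(p, q):
    """Similarity in [0, 1]; 1 means the two normalised spectra coincide."""
    return float(np.sum(np.sqrt(normalise(p) * normalise(q))))

def label_sources(H, references):
    """Assign each row of H (one spectral source) to its best-matching reference."""
    names = list(references)
    scores = np.array(
        [[bhattacharyya_coefficient(h, references[n]) for n in names] for h in H]
    )
    return [names[j] for j in scores.argmax(axis=1)], scores

# Hypothetical reference shapes on a 0-500 Hz grid (illustrative only).
freqs = np.linspace(0.0, 500.0, 251)
references = {
    "power_line": np.exp(-0.5 * ((freqs - 50.0) / 1.0) ** 2),
    "low_freq": np.exp(-freqs / 10.0),
    "muscle": np.exp(-0.5 * ((freqs - 120.0) / 60.0) ** 2),
}

# Stand-in for extracted sources: scaled copies of the references, shuffled.
H = np.vstack([
    3.0 * references["muscle"],
    0.5 * references["low_freq"],
    2.0 * references["power_line"],
])
labels, scores = label_sources(H, references)
```

Inspecting `scores` row by row shows how sharply each source matches one reference; a source with similar scores for several references would suggest that the reference models need adjusting for that data set.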

Conclusion

This research shows the ability of spectral source separation to detect and remove noise sources coupled to sEMG signals recorded during unsupervised daily activities. The presented algorithm performed better than traditional filtering and another state-of-the-art noise-removal method, which opens the door to the implementation of sEMG recording during daily activities for motor and health monitoring.

As mentioned in the Introduction, social trends toward subject engagement with self-health monitoring and the development of increasingly cheap and wearable sEMG devices put this technology in the early mass adoption stage. Under this scenario, the first major challenge for those in charge of sEMG big data analysis will be artifact removal and the discrimination of valid muscle activity. In this regard, the current methodology was developed as a tool for the treatment of large sets of sEMG data recorded during fully unsupervised daily activities. Our method provides a robust basis for the preliminary treatment and classification of data recorded under highly variable conditions.

Nevertheless, several challenges remain to be addressed to further improve this implementation. One outstanding challenge is the development and inclusion of hardware-specific modules in the amplification stage of sEMG sensors that apply the proposed algorithm during recording. Moreover, future research must address the characterization of the spectral distribution of other potential noises affecting sEMG signals.

Finally, in future work, this methodology will be used to treat sEMG signals recorded during the daily lives of people in age ranges where muscle conditions such as back and neck pain and sarcopenia present their first symptoms, potentially enabling their early detection and treatment. The effects of artifact reduction on the later stages of sEMG signal processing and analysis must also be discussed in light of the results of that future work.

Footnotes

Acknowledgments

The authors would like to thank the Toyota Motor Corporation for their promotion of this work and the Saitama Neuropsychiatric Institute for assisting with recruitment of participants for this research.

Authors’ Note

Álvaro Costa-García, National Institute of Advanced Industrial Science and Technology, Chiba, Kashiwa, Japan; Shotaro Okajima, Nagoya University, Aichi, Nagoya, Japan; Shingo Shimoda, Nagoya University, Aichi, Nagoya, Japan.

Contributorship

ACG and SS researched the literature and conceived the study. ACG, NY, and SO were involved in methodology development, data recording, and analysis. SO and SS contributed to the recruitment of participants. ACG wrote the first draft of the manuscript. All authors reviewed and edited the manuscript and approved the final version of the manuscript.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Ethical approval

The study was conducted in accordance with the guidelines of the Declaration of Helsinki and was approved by the Institutional Review Board of Riken (Reference Wako3 28-13). Informed consent was obtained from all participants involved in the study.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the Toyota Motor Corporation and a JSPS KAKENHI grant (number 18K18431).

Guarantor

ACG accepts responsibility for the overall integrity of the manuscript (including ethics, data handling, reporting of results, and study conduct).

ORCID iDs

Álvaro Costa-García


References

1. Lupton. The digitally engaged patient: self-monitoring and self-care in the digital health era. Soc Theory Health 2013; 11: 256–270.

2. Lancaster, Abuzour, Khaira, et al. The use and effects of electronic health tools for patient self-monitoring and reporting of outcomes following medication use: systematic review. J Med Internet Res 2018; 20: e294.

3. Chakraborty, Maity. COVID-19 outbreak: migration, effects on society, global environment and prevention. Sci Total Environ 2020; 728: 138882.

4. Bayoumy, Veugen, van Rijssen, et al. Self-monitoring of the tympanic membrane: an opportunity for telemedicine during times of COVID-19 and beyond. J Otol 2021; 16: 120–122.

5. Wahlqvist. Self-monitoring networks for personal and societal health: dietary patterns, activities, blood pressure and Covid-19. Asia Pac J Clin Nutr 2020; 29: 446–449.

6. Lim, Teo, et al. An automated patient self-monitoring system to reduce health care system burden during the COVID-19 pandemic in Malaysia: development and implementation study. JMIR Med Inform 2021; 9: e23427.

7. Whiting, Elwenspoek. Accuracy of self-monitoring heart rate, respiratory rate and oxygen saturation in patients with symptoms suggestive of covid infection.

8. Benjamin. Self-monitoring of blood glucose: the basics. Clin Diabetes 2002; 20: 45–47.

9. Dooley, Golaszewski, Bartholomew. Estimating accuracy at exercise intensities: a comparative study of self-monitoring heart rate and physical activity wearable devices. JMIR Mhealth Uhealth 2017; 5: e7043.

10. Forerunner, Fuse. Comparison of wearables for self-monitoring of heart rate in coronary rehabilitation patients. Georgian Medical 2021; 315: 78–85.

11. Caroppo, Rescio, Leone, et al. A surface electromyography-based platform for the evaluation of sarcopenia.

12. Hart, Cichanski. A comparison of frontal EMG biofeedback and neck EMG biofeedback in the treatment of muscle-contraction headache. Biofeedback Self Regul 1981; 6: 63–74.

13. Viens. The effects of self-monitoring, emg biofeedback, and relaxation tapes in the treatment of tension headaches.

14. Merletti, Farina (eds). Surface electromyography: physiology, engineering, and applications. Hoboken, New Jersey, USA: John Wiley & Sons, 2016.

15. Milosevic, Benatti, Farella. Design challenges for wearable EMG applications. In: Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017, 2017 Mar 27 (pp. 1432–1437). IEEE.

16. Cerone, Botter, Gazzoni. A modular, smart, and wearable system for high density sEMG detection. IEEE Trans Biomed Eng 2019; 66: 3371–3380.

17. Pino, Arias, Aqueveque. Wearable EMG shirt for upper limb training. In: 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2018 Jul 18 (pp. 4406–4409). IEEE.

18. Yamaguchi, Mikami, Maeda, et al. Portable and wearable electromyographic devices for the assessment of sleep bruxism and awake bruxism: a literature review. CRANIO® 2020; 41(1): 69–77.

19. Michelsen, Lund, Alkjær, et al. Wearable electromyography recordings during daily life activities in children with cerebral palsy. Dev Med Child Neurol 2020; 62: 714–722.

20. Clancy, Morin, Merletti. Sampling, noise-reduction and amplitude estimation issues in surface electromyography. J Electromyogr Kinesiol 2002; 12: 1–6.

21. Al Harrach, Boudaoud, Hassan, et al. Denoising of HD-sEMG signals using canonical correlation analysis. Med Biol Eng Comput 2017; 55: 375–388.

22. Wakeling, Rozitis. Spectral properties of myoelectric signals from different motor units in the leg extensor muscles. J Exp Biol 2004; 207: 2519–2528.

23. Von Tscharner, Goepfert. Estimation of the interplay between groups of fast and slow muscle fibers of the tibialis anterior and gastrocnemius muscle while running. J Electromyogr Kinesiol 2006; 16: 188–197.

24. De Luca, Gilmore, Kuznetsov, et al. Filtering the surface EMG signal: movement artifact and baseline noise contamination. J Biomech 2010; 43: 1573–1579.

25. Wang, Tang, Bronlund. Surface EMG signal amplification and filtering. Int J Comput Appl 2013; 82: 15–22.

26. Anand, Bhateja, Srivastava, et al. An approach for the preprocessing of EMG signals using canonical correlation analysis. In: Smart Computing and Informatics: Proceedings of the First International Conference on SCI 2016, Volume 2, 2018 (pp. 201–208). Springer Singapore.

27. Azzerboni, Carpentieri, La Foresta, et al. Neural-ICA and wavelet transform for artifacts removal in surface EMG. In: 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No. 04CH37541), 2004 Jul 25 (Vol. 4, pp. 3223–3228). IEEE.

28. Mijović, De Vos, Gligorijević, et al. Source separation from single-channel recordings by combining empirical-mode decomposition and independent component analysis. IEEE Trans Biomed Eng 2010; 57: 2188–2196.

29. Bhattacharyya. On a measure of divergence between two multinomial populations. Sankhyā: The Indian J Stat 1946; 7(4): 401–406.

30. Zhang, Fang. A NMF algorithm for blind separation of uncorrelated signals. In: 2007 International Conference on Wavelet Analysis and Pattern Recognition, 2007 Nov 2 (Vol. 3, pp. 999–1003). IEEE.

Artifact removal from sEMG signals recorded during fully unsupervised daily activities
