Abstract
Surface electromyography (sEMG) is a noninvasive method for monitoring muscle activity, essential in rehabilitation, prosthetics, sports, and medicine. However, current sEMG systems struggle with signal noise, unclear muscle activation detection, and limited adaptability. This study proposes a method combining signal processing and machine learning to enhance muscle activation detection. Experiments on EMG data from muscle contractions (n = 140 per test: 28 participants, each performing five biceps curl and five wrist curl repetitions) demonstrated 92.23% overall accuracy (3.83% false positives, 3.94% false negatives) for biceps tests and 91.11% overall accuracy (4.36% false positives, 4.53% false negatives) for wrist tests, with an average recall of 96%, significantly outperforming traditional methods. This approach highlights potential applications in real-world biomedical settings, from rehabilitation and prosthetics to sports science.
Keywords
Introduction
Surface electromyography (sEMG) is a noninvasive measurement technique that records muscle-generated electrical signals through two or three electrodes placed on the skin overlying the muscle of interest. These electrodes measure the voltage difference generated by muscle fibers in different states. This raw signal, known as an electromyogram, is faint and mixed with noise from various sources; it therefore needs careful amplification and filtering to be read correctly. The resulting signal can be analyzed to provide real-time feedback on the timing and intensity of muscle contraction, making it extremely valuable for applications in sport science, biomechanics, rehabilitation, and assistive technology.1,2 Medical applications range from diagnosing neuromuscular disorders like muscular dystrophy and amyotrophic lateral sclerosis 3 to monitoring muscle function and fatigue when recovering from strokes or performing physical therapy. 4 In sports, EMG helps collect large datasets to better understand muscular effort during activities, optimizing training and techniques for maximum efficiency and power. 5 In augmented reality and robotic applications, sEMG sensors can be used as a human–machine interface to capture and classify human motion.2,5–7
Despite their advantages, sEMG systems often suffer from signal noise and limited adaptability to physiological variations.5,8 A variety of hardware solutions and postprocessing methods have been suggested to improve data quality, and these partially solve the issues.1,9 However, the low magnitude of the signal still makes it extremely sensitive to noise and interference. A common solution in the literature is the use of multiple EMG sensors, or fusion with other types of sensors, in order to extract either force, 10 motion,11,12 or personal 13 information from the acquired data. This option, however, involves complex systems and introduces additional data and processing that are not always necessary when the EMG is used only to identify muscle activation, for example, to drive a prosthesis. Artificial intelligence (AI) has often been used in multisensor scenarios, with previous works relying on sensor fusion to perform pattern recognition14,15; other works adopt AI to analyze sEMG datasets and evaluate overall muscle activity or performance with a single global performance parameter. 16 Recent works generally rely on multiple sEMG sensors to identify muscle activation patterns across related muscles,17,18 sometimes combined with inertial measurement units. 19
Overall, there is currently a research gap in the literature regarding the identification of muscle activation patterns with a single sensor: existing approaches rely on multiple sensors (either all sEMG or a combination of sEMG and other sensing technologies), which increases cost and complexity and creates a barrier to the widespread adoption of sEMG in daily-task or sport monitoring. This study addresses this gap by providing a streamlined method for muscle activation detection with an individual sEMG sensor. Here we show how muscle activation patterns can be detected with a single sEMG sensor, enabling simpler designs for sEMG-activated controllers (e.g. in prostheses), by integrating signal processing techniques (filtering, rectification, root mean square (RMS) envelope extraction, and thresholding) with machine learning models (Random Forest, XGBoost) to improve accuracy and reliability.
The article first introduces the three-electrode sEMG sensor used in this study, explaining our experimental protocol over two different upper-limb exercises and our postprocessing pipeline. Then, we report study results, discussing both analytical thresholds and learning-based methods for pattern detection. Finally, we compare the performance of the proposed methods, highlighting the advantages and disadvantages of both.
Material and methods
EMG sensor
This study analyzed signals obtained through a Beyond EMG sensor (Sensor Medica srl). As a two-channel wearable EMG sensor, it can measure electrical signals from two different muscles at the same time in either a two-electrode (Figure 1(a)) or three-electrode (Figure 1(b)) configuration. Two of the electrodes collect muscle signals, while the optional third electrode, placed on a bone, provides a stable reference for a cleaner signal. The amplifier gain can be set to four different levels: 330, 466, 658, and 1170 V/V. The sensor filters signals through a low-pass filter set at 250 Hz and a high-pass filter at 10 Hz to reduce unwanted noise. Noise levels of less than 2 µV RMS ensure clean signals, while its high input impedance of over 100 MΩ helps protect the signal from interference. It can detect signal levels up to 10 mV peak-to-peak (mVpp), which is suitable for capturing muscle electrical activity. The sensor uses a 12-bit analog-to-digital converter with a dynamic range of 3.3 V to convert the signals into digital form for analysis, and users can choose between two sampling rates: 1000 Hz or 500 Hz.
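The relation between the gain levels, the 3.3 V converter range, and the 12-bit resolution can be made explicit with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the mid-scale centring of the amplified signal is an assumption that the sensor description does not state.

```python
# Values quoted in the sensor description above.
ADC_BITS = 12
ADC_RANGE_V = 3.3
GAINS_V_PER_V = (330, 466, 658, 1170)


def input_range_mvpp(gain):
    """Largest peak-to-peak input (mV) that fits the ADC range at this gain."""
    return ADC_RANGE_V / gain * 1e3


def lsb_input_uv(gain):
    """Input-referred size of one ADC step (microvolts) at this gain."""
    return ADC_RANGE_V / 2**ADC_BITS / gain * 1e6


def counts_to_input_mv(counts, gain, midscale=2**ADC_BITS // 2):
    """Convert a raw ADC count to input-referred millivolts.

    Assumption: the amplified signal is centred on mid-scale (count 2048).
    """
    return (counts - midscale) * ADC_RANGE_V / 2**ADC_BITS / gain * 1e3
```

At the lowest gain (330 V/V), the 3.3 V range corresponds to an input of 10 mVpp, which matches the maximum detectable level quoted above; at the highest gain (1170 V/V) the same range covers about 2.82 mVpp.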

Figure 1. Beyond EMG sensor (Sensor Medica srl): (a) two-electrode layout, with two pairs (positive–negative) of electrodes; (b) three-electrode layout with two sets (positive–negative–reference) of electrodes; (c) user interface of the commercial integrated software with raw signal (white) and postprocessed signal with activation (blue) and deactivation (red) detection; (d) detail from an example of inaccurate detection with false negatives (in red despite clear muscle activity in the raw signal).
The software provided with the sensor acquires sensor data and processes them, providing as output the signal plot in the example graphical user interface in Figure 1(c) and automatically detecting muscle activation and deactivation at threshold increase or decrease rates of the acquired values. However, while this code proves fairly reliable in controlled environments and for main muscle groups (e.g. biceps), high rates of false positives (i.e. identifying the muscle as active when it is not) and especially false negatives (i.e. identifying no activation when the muscle is active, visible in the examples in Figure 1(d)) are observed in realistic environments and smaller muscle groups (e.g. wrist), leading to a very inaccurate detection in several use cases (less than 20% accuracy for some exercises). The lack of medical and technical standards on EMG signal acquisition and postprocessing, together with the low accuracy of state-of-the-art muscle activation algorithms, is the main motivation behind this study, which aims to improve EMG signal analysis through new analytical methods and learning-based algorithms.
Experimental protocol
To obtain functional signals with clear muscle activation and deactivation to be used as ground truth, a timed weight-lifting exercise has been performed to generate a sample signal with five repetitions of biceps curls and then five repetitions of wrist curl exercises, repeated for 28 users (n = 140 for biceps; n = 140 for wrist). Figure 2 shows the experimental layout with the two different phases in our experimental acquisitions. The first phase is to acquire biceps activation during a 50 s cycle of biceps curl (5 s rest, 5 s activity) with a 4 kg dumbbell. In the second phase, new electrodes are placed on the forearm to perform an equally timed exercise of wrist curl with the same dumbbell. The latter task, after an empirical tuning, has shown the best results in activating the flexor digitorum superficialis muscle. In both exercises, the skin was precleaned with alcohol, the interelectrode distance was set to 1–2 cm, and placement strictly followed SENIAM guidelines for reproducibility. 20 More details are reported in the experimental protocol in Figure 3. This step-by-step procedure has been applied to all the test subjects. When instabilities were observed (e.g. interference, unexpected sensor behavior), the test was discarded and repeated from the beginning.

Figure 2. Experimental setup with reference electrode positions: on the left, weight-lifting routine for biceps curl; on the right, the EMG sensor on the forearm measures the flexor digitorum superficialis activation in the wrist curl. The blue electrode is the ground (positioned in both cases on the elbow). The red and black electrodes measure the signal difference with respect to this body reference; they are positioned along the muscle, 2 cm apart.

Figure 3. Experimental protocol for the acquisition of muscle activity during a biceps curl exercise first and a wrist curl exercise later; each exercise is repeated five times during the routine.
Postprocessing
The acquired EMG data in this work have been analyzed through the streamlined postprocessing pipeline presented here. First, an initial calibration is performed by asking the user to relax the muscle for 5 s. The acquired data are then used to remove the baseline offset (constant background value): the initial n data points are averaged, and this average is subtracted from every sample (equation (1)).
After baseline correction, rectification converts all values to positive by taking the absolute value of each sample (equation (2)).
The value after equations (1)–(2) provides a cleaner visualization of the raw signal (in blue in Figure 4). The signal is then smoothed by computing its RMS over a moving window. This moving RMS, used as in related literature to minimize the effect of outliers on further analysis, represents muscle activation.
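The three steps described above (baseline removal, rectification, and moving RMS) can be written compactly as follows. The notation is assumed here for illustration and is not taken from the original equations: x[i] denotes the raw samples, n the number of rest samples used for the baseline, and N the length of the moving window; the use of a trailing (causal) window is likewise an assumption.

```latex
\begin{align}
\tilde{x}[i] &= x[i] - \frac{1}{n}\sum_{j=1}^{n} x[j], \\
x_r[i] &= \left|\tilde{x}[i]\right|, \\
\mathrm{RMS}[i] &= \sqrt{\frac{1}{N}\sum_{j=i-N+1}^{i} x_r[j]^{2}}.
\end{align}
```

Because the RMS squares each sample, rectification does not change the value of the envelope itself; it mainly affects the intermediate trace used for visualization.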

Figure 4. Example of EMG signal processing on three repetitions of an example biceps curl exercise, with raw signal in black, processed signal with equations (1)–(2) in blue, and RMS in red.
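The three traces described in this caption correspond to successive stages of the postprocessing pipeline. A minimal NumPy sketch of these stages is given below; the function names are hypothetical, and the trailing window that shortens at the start of the record is an assumption rather than a documented choice of this study.

```python
import numpy as np


def baseline_correct(x, n_rest):
    """Subtract the mean of the first n_rest samples (rest calibration)."""
    x = np.asarray(x, dtype=float)
    return x - x[:n_rest].mean()


def rectify(x):
    """Full-wave rectification: convert every sample to its absolute value."""
    return np.abs(x)


def moving_rms(x, window=50):
    """RMS over a trailing window of up to `window` samples.

    Assumption: the first samples use a shorter window instead of padding.
    """
    sq = np.asarray(x, dtype=float) ** 2
    csum = np.concatenate(([0.0], np.cumsum(sq)))
    end = np.arange(1, len(sq) + 1)
    start = np.maximum(end - window, 0)
    return np.sqrt((csum[end] - csum[start]) / (end - start))


def emg_envelope(raw, n_rest, window=50):
    """Chain the three stages: baseline removal, rectification, moving RMS."""
    return moving_rms(rectify(baseline_correct(raw, n_rest)), window)
```

With a 1000 Hz sampling rate, the 5 s rest calibration corresponds to n_rest = 5000 samples, and a 50-sample window spans 50 ms.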
The challenge of detecting muscle activation is addressed here with two possible solutions: an analytical one, focusing on the change of the RMS envelope through thresholds on both its absolute value and its derivative, and a learning-based one, training a machine learning classifier to automatically detect activation and deactivation. The two approaches are described in the following two subsections, respectively.
Analytical detection of muscle activation
The analytical approach for muscle activation aims at determining functional thresholds for both the absolute value of the signal and its rate of change (time derivative), computed on the RMS envelope.
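One way such a threshold rule could be realised is sketched below. This is an illustrative reading, not the implementation used in the study: the latching logic (activation when the envelope rises faster than the threshold, deactivation when it falls faster than the same threshold) and the optional absolute-value floor are assumptions, and the function name is hypothetical.

```python
import numpy as np


def detect_activation(envelope_mv, fs_hz=1000.0, rate_thr=0.095, abs_thr=None):
    """Label each sample as active (True) or at rest (False).

    envelope_mv : RMS envelope in mV
    rate_thr    : threshold on the envelope rate of change, in mV/s
    abs_thr     : optional floor in mV; below it the muscle is forced to rest
    """
    env = np.asarray(envelope_mv, dtype=float)
    rate = np.gradient(env) * fs_hz  # finite-difference derivative in mV/s
    active = np.zeros(len(env), dtype=bool)
    state = False
    for i in range(len(env)):
        if rate[i] > rate_thr:       # fast rise: activation onset
            state = True
        elif rate[i] < -rate_thr:    # fast fall: deactivation
            state = False
        if abs_thr is not None and env[i] < abs_thr:
            state = False            # assumed safeguard on absolute value
        active[i] = state
    return active
```

Since the derivative of a noisy envelope fluctuates strongly, a rule of this kind is sensitive to both the threshold and the smoothing window, which is in line with the sample dependence discussed for the analytical method.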
AI-based detection of muscle activation
To further improve muscle detection accuracy, machine learning methods were integrated in the signal analysis pipeline, using Random Forest and XGBoost classifiers. Preprocessing steps included baseline correction to remove offsets, as in equation (1), followed by Wiener and Butterworth filtering to reduce noise by isolating the frequency range of interest (15–450 Hz) and rectification as per equation (2). Key features capturing time-domain (RMS, variance, waveform length), frequency-domain (median frequency, peak frequency, spectral entropy), and energy-based (Teager-Kaiser Energy Operator and the Hilbert envelope) characteristics were extracted from the filtered signal and used to train a machine learning model to classify muscle activity as active or inactive (rest).
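As an illustration of the feature set listed above, the following sketch computes most of these quantities for a single window of signal using NumPy only. The function name, the dictionary keys, and the exact feature definitions (e.g. median frequency from the cumulative periodogram, mean Teager-Kaiser energy) are assumptions chosen for clarity; the Hilbert envelope is omitted because it is normally computed with a signal-processing library such as SciPy.

```python
import numpy as np


def window_features(w, fs_hz=1000.0):
    """Time-, frequency-, and energy-domain features of one signal window."""
    w = np.asarray(w, dtype=float)
    if w.size < 3:
        raise ValueError("window must contain at least 3 samples")

    # Time domain
    rms = np.sqrt(np.mean(w ** 2))
    variance = np.var(w)
    waveform_length = np.sum(np.abs(np.diff(w)))

    # Frequency domain (plain periodogram, no tapering window)
    psd = np.abs(np.fft.rfft(w)) ** 2
    freqs = np.fft.rfftfreq(w.size, d=1.0 / fs_hz)
    total = psd.sum()
    if total > 0:
        median_freq = freqs[np.searchsorted(np.cumsum(psd), total / 2)]
        peak_freq = freqs[np.argmax(psd)]
        p = psd[psd > 0] / total
        spectral_entropy = -np.sum(p * np.log2(p))
    else:
        median_freq = peak_freq = spectral_entropy = 0.0

    # Energy: Teager-Kaiser operator x[n]^2 - x[n-1] * x[n+1]
    tkeo = w[1:-1] ** 2 - w[:-2] * w[2:]

    return {
        "rms": rms,
        "variance": variance,
        "waveform_length": waveform_length,
        "median_freq": median_freq,
        "peak_freq": peak_freq,
        "spectral_entropy": spectral_entropy,
        "tkeo_mean": tkeo.mean(),
    }
```

Feature vectors of this kind, one per window and paired with an active/rest label, would then form the training set for a classifier such as Random Forest or XGBoost.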
Results and discussion
Results: Analytical detection
Different values of moving window sample size and thresholds have been tested, both in absolute value (mV) and change rate (mV/s), with optimal results obtained for a 50-sample moving window and a threshold value of 0.095 mV/s on change rate, as shown in Figure 5(d). The other tests highlight the general errors observed in EMG sensor analysis, with false positives in Figure 5(a) and false negatives in Figure 5(c).

Figure 5. Muscle activation identified with analytical methods on three repetitions of an example biceps curl exercise, using different threshold values on the signal rate of change to evaluate sensitivity to threshold parameters: (a) threshold at 0.01 mV/s; (b) threshold at 0.10 mV/s; (c) threshold at 1.00 mV/s; (d) threshold at 0.095 mV/s.
Results: AI-based detection
The results of the tests with the AI-based detection method are reported in Table 1, with examples of acquisition and postprocessing for biceps and wrist curls reported in Figures 6 and 7, respectively. For each test subject, the table reports the time intervals in which the sensor detects muscle activation.

Figure 6. Examples of muscle activation identified with AI-based methods from biceps curls: (a) an acquisition example; (b) an example of an acquisition with increased noise.

Figure 7. Examples of muscle activation identified with AI-based methods from wrist curls: (a) an acquisition example; (b) an example of an acquisition with a weak signal.
Table 1. Summary of results, with 92.23% overall accuracy (3.83% false positives, 3.94% false negatives) for biceps tests and 91.11% overall accuracy (4.36% false positives, 4.53% false negatives) for wrist tests.
Accuracy is computed as the ratio between the time in which the muscle was active and the measured activation time.
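For both exercises, the reported accuracy, false-positive, and false-negative percentages sum to 100% (92.23 + 3.83 + 3.94 and 91.11 + 4.36 + 4.53). One reading consistent with these figures is that all three are fractions of the total recording time. The sketch below computes them sample by sample under that assumption, using the timed protocol (5 s rest, 5 s activity, five repetitions) as ground truth; the function names are hypothetical, and recall is added with its standard definition.

```python
import numpy as np


def protocol_labels(fs_hz=1000.0, rest_s=5.0, active_s=5.0, reps=5):
    """Ground-truth labels for the timed routine: rest, then activity, repeated."""
    rest = np.zeros(int(round(rest_s * fs_hz)), dtype=bool)
    active = np.ones(int(round(active_s * fs_hz)), dtype=bool)
    return np.tile(np.concatenate([rest, active]), reps)


def detection_metrics(truth, pred):
    """Sample-wise metrics, each expressed as a fraction of total time."""
    truth = np.asarray(truth, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    if truth.shape != pred.shape:
        raise ValueError("truth and pred must have the same shape")
    n = truth.size
    tp = np.count_nonzero(pred & truth)    # active and detected as active
    fp = np.count_nonzero(pred & ~truth)   # rest detected as active
    fn = np.count_nonzero(~pred & truth)   # active detected as rest
    return {
        "accuracy": (n - fp - fn) / n,
        "false_positive": fp / n,
        "false_negative": fn / n,
        "recall": tp / (tp + fn) if (tp + fn) else float("nan"),
    }
```

By construction, accuracy, false-positive rate, and false-negative rate returned by this function always sum to one.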
Discussion
Overall, the analytical detection results proved to be sample-dependent, as different samples and acquisitions show different optimal threshold values. This threshold-based method provides a simple approach for identifying muscle activation without relying on machine learning, but it requires time- and effort-intensive manual calibration for each set of samples. Further, its effectiveness diminishes under high-noise or variable physiological conditions, suggesting the potential benefit of integrating AI techniques. Finally, the moving window introduces delays in the detected activation time that reduce overall sensor effectiveness, especially in online usage (e.g. exoskeleton control). For these reasons, and given its low performance on a testing subsample (detection accuracy lower than 70%), the analytical detection method was not used to post-process the full dataset.
The results for the AI-based detection show different kinds of acquisitions. In Figure 6(a), a typical biceps acquisition is reported: the raw signal, in gray at the top, already shows clearly when the muscle is active and when it is not; after postprocessing, the proposed algorithm correctly detects the first, second, and fifth repetitions; in the third and fourth repetitions, limited intervals with lower muscle activation are incorrectly labeled as periods of rest (see zeros in the diagram at the bottom). Figure 6(b) shows a much noisier acquisition, with noise partially covering the periods of rest; while this noise would have caused the analytical methods to fail, the proposed algorithm manages to detect activation with good accuracy. In Figure 7(a), the code correctly identifies the five main repetitions, even though some false positives and negatives are observed at the beginning and end of each period of activation, as the detected events are not fully synchronized with the experiments. Figure 7(b) shows how the forearm electrodes acquire a much weaker signal than the biceps layout; this reduced amplitude led to misdetection for the analytical methods, but the AI-based algorithm proved to be robust even in this case, although significant noise before the first activation lowered performance.
The main challenge observed during all experiments is the presence of excessive noise, likely caused by electrical interference, as it became worse in the presence of electronics (e.g. smartwatches and exoskeletons). This noise makes it difficult to clearly capture muscle activity, leading to unreliable data when assessing muscle strength and timing. Initial tests conducted without the use of weights showed unclear results. Signal amplification solves this problem only partially, as it amplifies the above-mentioned noise in addition to the signal; introducing weights during the tests improved the clarity of the results by forcing higher muscle activation, making the signal more readable. This higher activation is particularly needed in the wrist test: the biceps signal is generally clear and stable, given the larger size of the muscle, whereas the wrist flexors are smaller and resulted in worse signal quality, also because electrodes of the same size were used for both wrist and biceps acquisitions. This signal difference is critical for analytical thresholding, whereas the learning-based methods demonstrate their robustness through the small loss of accuracy between the two tests (approximately 1.12 percentage points).
The analytical method works from a theoretical perspective but requires manual calibration case-by-case, sometimes resulting in low accuracy (less than 70%) even between consecutive tests with the same person. This level of effort represents a critical barrier to the accessibility of sEMG technology. Conversely, the trained models demonstrated strong performance, achieving detection with 92.23% overall accuracy (3.83% false positives, 3.94% false negatives) for biceps tests and 91.11% overall accuracy (4.36% false positives, 4.53% false negatives) for wrist tests, and an average recall of 96%. As such, the AI-based method outperforms threshold-based methods, easily adapting to noisy signals (e.g. Figure 6(b)) as well as weak and variable signals (e.g. Figure 7(b)).
Conclusions
This study developed a framework for analyzing sEMG signals to improve muscle activation detection. By integrating baseline correction, rectification, RMS envelope extraction, and thresholding, the signal processing approach reduced noise and highlighted activation patterns. However, traditional threshold-based methods can struggle with adaptability in complex or noisy conditions (accuracy lower than 70%). To overcome these limitations, machine learning models such as Random Forest and XGBoost were introduced. By using key time-domain, frequency-domain, and energy-based features, these AI-based methods significantly enhanced classification accuracy and robustness. The combined approach achieved high performance levels, offering a reliable method for muscle activation analysis with 92.23% overall accuracy (3.83% false positives, 3.94% false negatives) for biceps tests and 91.11% overall accuracy (4.36% false positives, 4.53% false negatives) for wrist tests. While multiple sensors can still provide more nuanced information on motion and biomechanics, the proposed method shows how a single sensor can still be effective in identifying muscle activation patterns for a simple but efficient analysis as well as potential for real-time exoskeleton control. When compared with previous works, where muscle activation detection is obtained manually during postprocessing or discussed without any quality metrics, these findings provide a simple but effective automatic method, contributing to the integration of sEMG in real-time monitoring or control for rehabilitation, sports, exoskeletons, and prosthetics. Future research could extend this work to additional muscle groups and explore real-time, wireless solutions to improve practical usability in clinical and biomechanical settings.
Footnotes
Consent to participate
Written informed consent was obtained from all participants. This study protocol was reviewed and approved by the Institutional Ethics Committee of the Tor Vergata University Hospital (Policlinico Tor Vergata), Rome, on 21 March 2024 (Approval No. RS. 26.24 CET2 UTV) in accordance with the Declaration of Helsinki.
Funding
This study was partly funded by the Italian Government through the ASSIST project (Grant No. P2022A4ELB, PRIN-PNRR 2022) and Alessandro Perini's DM630 PhD Scholarship, and by Sensor Medica srl.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability
Collected data are reported in the body of the article.
