
Abstract

The growing need for mental stress detection in the workplace has prompted the exploration of machine-learning solutions. However, traditional centralized methods often raise critical data privacy issues, especially when dealing with sensitive physiological signals. To address this, we introduce a privacy-preserving mental stress detection framework based on federated learning, focusing on human-robot collaboration scenarios. We first developed classifiers using traditional centralized algorithms, including support vector machine (SVM), multilayer perceptron, random forest, and Naïve Bayes, and then implemented a federated SVM classifier. These classifiers use multimodal physiological features to distinguish between relaxed, low-level stressed, and high-level stressed states. The federated learning model was compared against the centralized models in terms of precision, recall, and F1-score. The results demonstrate that federated learning not only offers performance comparable to centralized methods but also helps protect sensitive data, making it a valuable approach in scenarios where data privacy is paramount.

Keywords

federated learning, mental stress detection, data privacy, human-robot collaboration

Introduction

Mental stress in the workplace is increasingly recognized as a critical factor that affects individual work performance and health (Arsalan et al., 2023). Inadequate management of mental stress can lead to adverse consequences, spanning from short-term task errors and accidents to a variety of long-term health issues, such as cardiovascular diseases, diabetes, strokes, sleep disturbances, and weakened immune systems (Cacioppo et al., 2007; Lagraauw et al., 2015). Consequently, the detection of mental stress has become essential to help workers maintain good mental status and promote workplace well-being.

Traditionally, mental stress evaluation has relied on subjective measures, such as self-report questionnaires and interviews. However, one limitation of these measures is their inability to capture real-time experiences (Bethel et al., 2007). As a complement, objective measures based on physiological signals, such as galvanic skin response (GSR), heart rate (HR), and electromyography (EMG), have been widely employed to quantify mental stress levels (Healey & Picard, 2005; Robles et al., 2022; Umer, 2022). Leveraging these physiological signals, various machine-learning methods have been explored for mental stress classification. Algorithms commonly used for their efficacy and relative simplicity include Support Vector Machine (SVM), Multilayer Perceptron (MLP), Random Forest (RF), and Naïve Bayes (NB) (Arsalan et al., 2019; Kim et al., 2021; Subhani et al., 2017).

Nevertheless, these machine learning approaches rely on training the model in a centralized way. This process involves collecting and storing substantial volumes of sensitive data on a central server, which poses significant privacy concerns. With data privacy becoming a paramount concern in the digital era (Das et al., 2020), there is a growing need for decentralized methods in mental stress classification. Federated learning (FL) has emerged as a promising solution to these challenges. In the federated framework, local clients train models on their own data and share only specific model parameters, such as gradients or weights, with a central server. The central server then aggregates these parameters using algorithms like federated averaging to minimize the global loss function, constructing a global model without directly accessing individual user data (Somandepalli et al., 2022). This method significantly mitigates the risks associated with centralizing user data, thereby preserving user data privacy.
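The aggregation step described above can be written compactly in the notation of McMahan et al. (2017). The symbols below are not defined in this paper and are introduced only for illustration: K is the number of clients, n_k the number of samples held by client k, n the total number of samples, F_k the local loss on client k, and w the model parameters at communication round t.

```latex
\min_{w} f(w) = \sum_{k=1}^{K} \frac{n_k}{n} F_k(w),
\qquad
w_{t+1} = \sum_{k=1}^{K} \frac{n_k}{n}\, w_{t+1}^{k},
\qquad
n = \sum_{k=1}^{K} n_k
```

Under this formulation, clients holding more data contribute proportionally more to the global parameters.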

In this study, we aim to assess the effectiveness of federated learning for mental stress classification, using mental stress levels observed during human-robot collaboration (HRC) as the experimental framework. Our prior experimental study showed that human-robot interaction (HRI) paradigms can significantly affect workers’ mental stress levels (Su et al., 2024). Specifically, participants experienced significantly higher mental stress in scenarios without interaction with robots than in scenarios with interaction. Building upon these findings, the current study developed classifiers that assign workers’ mental stress during HRC to one of three levels: relaxed (not engaged in any task), low-level stressed (engaged in HRC tasks with interaction), and high-level stressed (engaged in HRC tasks without interaction). Both centralized learning-based and federated learning-based classifiers were constructed from key features extracted from multimodal physiological data such as GSR, EMG, and HR. The performances of these classifiers were analyzed and compared to understand the effectiveness of federated learning in mental stress classification.

Method

An experiment involving 24 participants performing HRC assembly tasks was conducted. The experimental design induced varying levels of mental stress among participants, categorized as relaxed, low-level stressed, and high-level stressed, based on how participants interacted with the robot. During the experiment, multimodal physiological signals such as GSR, EMG, and HR were collected. Time-domain and frequency-domain features were then extracted from these signals, yielding 38 features. After feature selection using the F-test method, 14 features, listed in Table 1, were selected for constructing the machine learning models.
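As an illustration of the time-domain part of this step, the following is a minimal Python sketch that computes several of the statistics named in Table 1 for one window of samples. The function name, the use of sample (n - 1) statistics, the unit sample spacing for the first derivative, and the example values are all assumptions made here; the paper does not describe its extraction code or windowing.

```python
import statistics


def time_domain_features(signal):
    """Return simple time-domain features for one window of signal samples."""
    # First derivative approximated by successive differences
    # (unit sample spacing is assumed here).
    diffs = [b - a for a, b in zip(signal, signal[1:])]
    return {
        "mean": statistics.fmean(signal),
        "std": statistics.stdev(signal),          # sample standard deviation
        "max": max(signal),
        "range": max(signal) - min(signal),
        "variance": statistics.variance(signal),  # sample variance
        "mean_first_derivative": statistics.fmean(diffs),
    }


# Made-up window of samples, for illustration only.
window = [1.0, 2.0, 4.0, 7.0]
features = time_domain_features(window)
print(features)
```

Frequency-domain features such as spectral rolloff and spectral entropy would require a spectral estimate of each window and are not covered by this sketch.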

Table 1. Physiological Features for Training Models.

Signal	Features
GSR	Mean value; standard deviation; max value; range; variance; mean of the first derivative; spectral rolloff; spectral entropy
EMG	Mean value; standard deviation; variance; mean of the first derivative
HR	Mean value; max value
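The F-test selection that produced the features in Table 1 can be sketched as a one-way ANOVA F statistic computed per feature, followed by ranking. The selection rule used here (keep the top k features), the function names, and the toy data are assumptions for illustration; the paper does not state its threshold or implementation.

```python
def anova_f(values, labels):
    """One-way ANOVA F statistic of one feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    # Between-class and within-class sums of squares.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )
    if ss_within == 0:
        return float("inf")  # classes perfectly separated by this feature
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_k(feature_table, labels, k):
    """Rank features by F statistic and return the names of the k best."""
    scores = {name: anova_f(vals, labels) for name, vals in feature_table.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data: three classes, two hypothetical features.
labels = [0, 0, 1, 1, 2, 2]
feature_table = {
    "feature_a": [1.0, 1.2, 3.0, 3.2, 5.0, 5.2],  # varies strongly with class
    "feature_b": [1.0, 3.0, 1.1, 3.1, 0.9, 2.9],  # barely varies with class
}
selected = select_top_k(feature_table, labels, k=1)
print(selected)
```

A larger F value indicates that class means differ more relative to the within-class spread, which is why the class-dependent feature is ranked first.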

Next, centralized machine learning methods, including SVM, MLP, RF, and NB, were applied to determine the most effective method for classifying mental stress based on the selected features. Subsequently, SVM, the best-performing method in the centralized approach, was adopted in a federated learning framework. In this framework, the training data were manually allocated across different local computers to mimic scenarios in which data are collected by different parties who prefer not to, or are unable to, share the raw data directly. Model aggregation in the federated SVM was conducted using the federated averaging algorithm, in which the local parameters are averaged on a central server to construct the global model (McMahan et al., 2017).
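A minimal sketch of the server-side averaging step is given below. It assumes a linear SVM whose parameters can be flattened into a vector (weights followed by a bias) and assumes that each client is weighted by its sample count, as in McMahan et al. (2017); the paper does not specify the kernel or whether the average was weighted. The function name and the parameter values are hypothetical.

```python
def federated_average(client_params, client_sizes):
    """Average client parameter vectors, weighted by local sample counts."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(n * params[j] for params, n in zip(client_params, client_sizes)) / total
        for j in range(dim)
    ]


# Hypothetical local linear-SVM parameters [w1, w2, b] from three clients.
client_params = [
    [1.0, 0.0, 0.5],
    [2.0, 1.0, 0.0],
    [3.0, 2.0, -0.5],
]
client_sizes = [20, 50, 30]  # illustrative sample counts per client

global_params = federated_average(client_params, client_sizes)
print(global_params)
```

Only the parameter vectors and sample counts reach the server in this sketch; the raw physiological samples stay on each client.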

This study also aimed to explore how varying local data distributions affect model performance and how the global model derived from federated learning could improve on the local models. To this end, local models were trained individually on their respective local data, without federation, for comparison. Precision, recall, and F1-score were used to assess the performance of the classifiers.
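For a three-class problem, these metrics are computed per class and then averaged. The sketch below uses macro averaging (an unweighted mean over classes), which is an assumption; the paper does not state which averaging scheme was used. The labels and predictions are made up for illustration.

```python
def macro_scores(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over all classes."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Made-up labels for the three stress levels.
y_true = ["relaxed", "relaxed", "low", "low", "high", "high"]
y_pred = ["relaxed", "low", "low", "low", "high", "low"]

precision, recall, f1 = macro_scores(y_true, y_pred)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

With imbalanced classes, a sample-weighted average would give different values, so the averaging scheme should be reported alongside the scores.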

Results

First, the study analyzed the classification results of the centralized classifiers. The SVM classifier demonstrated superior performance, achieving a precision of 0.837, a recall of 0.830, and an F1-score of 0.829. MLP and RF followed with comparable precisions of 0.830 and 0.826, respectively, but with lower recall and F1-scores. The NB classifier was the least effective, with a precision of 0.810, a recall of 0.723, and an F1-score of 0.747.

Federated learning was then implemented using SVM due to its superior performance over other centralized methods. When performing federated SVM, the training dataset was allocated across three individual computers in proportions of 20%, 50%, and 30%, respectively. Additionally, local models were developed using their own respective data on the three individual computers, referred to as SVM1, SVM2, and SVM3.
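The allocation described above can be sketched as a shuffled split into three shards. Whether the original split was random, stratified by class, or grouped by participant is not stated, so the random shuffle, the fixed seed, and the function name below are assumptions.

```python
import random


def partition(samples, proportions, seed=0):
    """Shuffle samples and split them into shards of the given proportions."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    shards, start = [], 0
    for i, p in enumerate(proportions):
        # The last shard takes the remainder so that no sample is dropped.
        if i == len(proportions) - 1:
            end = len(shuffled)
        else:
            end = start + round(p * len(shuffled))
        shards.append(shuffled[start:end])
        start = end
    return shards


# 100 placeholder sample indices split 20% / 50% / 30%.
shards = partition(list(range(100)), [0.2, 0.5, 0.3])
print([len(s) for s in shards])
```

Each shard would then be used both to train its local model (SVM1, SVM2, SVM3) and as that client's contribution to the federated model.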

The federated learning approach achieved a precision of 0.816, a recall of 0.809, and an F1-score of 0.809. These metrics closely match the performance of the centralized SVM method. When the global model from federated learning was compared with the individual local models, the global model outperformed each of them. Among the local models, SVM2, representing the device with a 50% data proportion, performed best, with precision, recall, and F1-score of 0.777, 0.766, and 0.760, respectively. The other local models, SVM3 and SVM1, corresponding to devices with smaller data proportions (30% and 20%), performed worse still. These results indicate the benefits of federated learning and the potential impact of data distribution on model efficacy.
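The size of these differences follows directly from the reported scores. The short Python calculation below uses only the values given in this section; the variable names are chosen here for illustration.

```python
centralized_svm = {"precision": 0.837, "recall": 0.830, "f1": 0.829}
federated_svm = {"precision": 0.816, "recall": 0.809, "f1": 0.809}
local_svm2 = {"precision": 0.777, "recall": 0.766, "f1": 0.760}

# Absolute gap between the centralized SVM and the federated SVM.
gap_to_centralized = {
    m: round(centralized_svm[m] - federated_svm[m], 3) for m in centralized_svm
}

# Absolute gain of the federated SVM over the best local model (SVM2).
gain_over_svm2 = {
    m: round(federated_svm[m] - local_svm2[m], 3) for m in federated_svm
}

print(gap_to_centralized)
print(gain_over_svm2)
```

The federated model therefore trails the centralized SVM by about 0.02 on each metric while exceeding the best local model by about 0.04 to 0.05.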

Discussion

This study investigated the potential of machine learning methods for identifying mental stress levels among workers engaged in HRC tasks. Established centralized machine learning algorithms, especially SVM, proved feasible for stress detection in practical human-robot interaction (HRI) scenarios. These centralized methods, which use all available data for model training, establish a baseline for performance evaluation. In particular, the SVM-based centralized method stands out as the top performer in this context, with a precision of 0.837, a recall of 0.830, and an F1-score of 0.829.

Importantly, the study highlights the significant potential of federated learning for mental stress recognition in HRC environments. On the one hand, the decentralized nature of federated learning, which facilitates data privacy, is particularly advantageous in HRC scenarios where sensitive data are involved. For example, multimodal physiological signals, including GSR, EMG, and HR, were used to train the model in our study. Such data often cannot be shared among different organizations because of Institutional Review Board (IRB) concerns (Fairchild & Bayer, 2004). On the other hand, the federated learning approach achieved a precision of 0.816, a recall of 0.809, and an F1-score of 0.809, closely matching the performance of the centralized SVM method. Similar results were reported by Tsouvalas et al. (2022), where the accuracy gap between their federated learning approach and the centralized method was less than 4% in speech emotion recognition tasks.

Comparing the global model in federated learning with the individual local models further revealed that federated learning has distinct advantages when data are unevenly distributed among local devices or limited in quantity. By aggregating the individual models, the global model surpassed the performance of all local models. Notably, its recall was more than 15% higher than that of the local model on device 1 (SVM1). This is particularly relevant in situations where data collection is challenging or organizations face data scarcity. Federated learning thus offers a collaborative solution, enabling different organizations to jointly develop a more effective global model, ultimately benefiting all parties involved.

Several strategies can be employed to enhance classification performance in future work. First, our study incorporated various interaction scenarios to reflect real-world applications, but this resulted in an uneven number of low-stress (six) versus high-stress (two) scenarios per participant. To address this imbalance, data augmentation techniques such as oversampling could be employed (Chawla et al., 2002). Second, recent research has explored synthetic data generation, particularly with generative AI models such as Generative Adversarial Networks, to create artificial samples that augment underrepresented classes (Ma et al., 2022).
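As a simple illustration of oversampling, the sketch below duplicates randomly chosen minority-class samples until every class matches the largest one. This is plain random oversampling, not the SMOTE method of Chawla et al. (2002), which instead synthesizes new samples by interpolating between neighbors. The function name, the seed, and the toy data, which mirror the six-to-two scenario ratio, are assumptions.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until all classes are equally sized."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw extra samples with replacement from the same class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Toy feature vectors: six low-stress and two high-stress samples.
samples = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.9], [1.0]]
labels = ["low"] * 6 + ["high"] * 2

balanced_x, balanced_y = random_oversample(samples, labels)
print(Counter(balanced_y))
```

In practice, oversampling is usually applied to the training split only, so that duplicated samples do not leak into the evaluation data.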

Conclusion

This study explored the classification of mental stress in HRC assembly settings using privacy-preserving machine learning strategies. We first collected multimodal physiological signals across a variety of human-robot interaction scenarios. Features were then extracted and selected from these signals to categorize mental stress into three levels (relaxed, low-level stressed, and high-level stressed) using both centralized and federated learning methods. Our findings show that machine learning approaches are effective in recognizing mental stress and that federated learning, in particular, offers promising performance while maintaining data privacy. This research contributes to the understanding of mental stress in HRC settings and proposes federated learning as a viable solution for mental stress classification, paving the way for more efficient stress management strategies.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This manuscript is based upon work supported by the National Science Foundation under Grant #2024688.

ORCID iDs

Bingyi Su

Liwei Qing

SeHee Jung

References

Arsalan, Majid, Anwar S. M., & Bagci (2019). Classification of perceived human stress using physiological signals. 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) (pp. 1247–1250). https://doi.org/10.1109/EMBC.2019.8856377

Arsalan, Majid, Nizami I. F., Manzoor, Anwar S. M., & Ryu (2023). Human stress assessment: A comprehensive review of methods using wearable sensors and non-wearable techniques (arXiv:2202.03033). arXiv. http://arxiv.org/abs/2202.03033

Bethel C. L., Salomon, Murphy R. R., & Burke J. L. (2007). Survey of psychophysiology measurements applied to human-robot interaction. RO-MAN 2007 - The 16th IEEE International Symposium on Robot and Human Interactive Communication (pp. 732–737). https://doi.org/10.1109/ROMAN.2007.4415182

Cacioppo, Tassinary, & Berntson (2007). Handbook of psychophysiology. Cambridge University Press. https://doi.org/10.13140/2.1.2871.1369

Chawla N. V., Bowyer K. W., Hall L. O., & Kegelmeyer W. P. (2002). SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, 321–357. https://doi.org/10.1613/jair.953

Das, Gutzwiller R. S., Roscoe R. D., Rajivan, Wang, Jean Camp, & Hoyle (2020). Humans and technology for inclusive privacy and security. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 64(1), 461–464. https://doi.org/10.1177/1071181320641104

Fairchild A. L., & Bayer (2004). Ethics and the conduct of public health surveillance. Science, 303(5658), 631–632. https://doi.org/10.1126/science.1094038

Healey J. A., & Picard R. W. (2005). Detecting stress during real-world driving tasks using physiological sensors. IEEE Transactions on Intelligent Transportation Systems, 6(2), 156–166. https://doi.org/10.1109/TITS.2005.848368

Kim, Park, & Kim H.-S. (2021, October). Mental stress assessment using SVM with physiological sensor data. In 2021 International Conference on Information and Communication Technology Convergence (ICTC) (pp. 1296–1299). IEEE. https://doi.org/10.1109/ICTC52510.2021.9621148

Lagraauw H. M., Kuiper, & Bot (2015). Acute and chronic psychological stress as risk factors for cardiovascular disease: Insights gained from epidemiological, clinical and experimental studies. Brain, Behavior, and Immunity, 50, 18–30. https://doi.org/10.1016/j.bbi.2015.08.007

Ma, Huang S.-L., & Zhang (2022). Data augmentation for audio-visual emotion recognition with an efficient multimodal conditional GAN. Applied Sciences, 12(1), 1. https://doi.org/10.3390/app12010527

McMahan, Moore, Ramage, Hampson, & Arcas B. A. (2017, April). Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics (pp. 1273–1282). PMLR.

Robles, Benchekroun, Zalc, Istrate, & Taramasco (2022, July). Stress detection from surface electromyography using convolutional neural networks. In 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC) (pp. 3235–3238). IEEE.

Somandepalli, Eoff, Cowen, Audhkhasi, Belanich, & Jou (2022, October). Federated learning for affective computing tasks. In 2022 10th International Conference on Affective Computing and Intelligent Interaction (ACII) (pp. 1–8). IEEE.

Su, Jung, Wang, & Qing (2024). Exploring the impact of human-robot interaction on workers’ mental stress in collaborative assembly tasks. Applied Ergonomics, 116, 104224. https://doi.org/10.1016/j.apergo.2024.104224

Subhani A. R., Mumtaz, Saad M. N. B. M., Kamel, & Malik A. S. (2017). Machine learning framework for the detection of mental stress at multiple levels. IEEE Access, 5, 13545–13556. https://doi.org/10.1109/ACCESS.2017.2723622

Tsouvalas, Ozcelebi, & Meratnia (2022). Privacy-preserving speech emotion recognition through semi-supervised federated learning. In 2022 IEEE International Conference on Pervasive Computing and Communications Workshops and Other Affiliated Events (PerCom Workshops) (pp. 359–364). https://doi.org/10.1109/PerComWorkshops53856.2022.9767445

Umer (2022). Simultaneous monitoring of physical and mental stress for construction tasks using physiological measures. Journal of Building Engineering, 46, 103777. https://doi.org/10.1016/j.jobe.2021.103777