
Abstract

Objective

This scoping review reports on findings of recent studies which assess the methodologies, significant features, and machine learning (ML) models employed in wearable-based panic attack (PA) research.

Background

PAs affect a significant percentage of people worldwide and can induce rapid heartbeat, sweating, and trembling in affected individuals. They are often unpredictable and can seriously affect day-to-day activities and functioning. The integration of wearable technology with advanced ML methods has been successfully used in identifying and managing several health conditions. Despite significant advancements, the predictive capabilities of these tools to detect PAs remain less understood.

Method

A comprehensive search of databases including PubMed, PsycINFO, Embase, and Google Scholar identified seven studies focusing on PA prediction using wearable devices.

Results

These studies employed a range of ML models, such as supervised anomaly detection, deep learning (e.g. LSTM, RNN), random forests, and mixed regression models. The studies analyzed physiological metrics like heart rate (HR) variability, respiratory rate, and activity levels. Accuracy rates varied, with models achieving between 67.4% and 94.8% predictive accuracy.

Conclusion

Findings show the utility of combining psychological, physiological, and environmental data for improved predictions, and highlight the key data features, such as resting HR, heart rate variability, and certain sleep metrics that may help predict the onset of PAs. However, most of these studies have impractical prediction time frames of PAs with limited evidence of successful near-real-time prediction, highlighting the need for further research to predict the onset of PAs in real time.

Keywords

Panic attack, anxiety, digital health, mental health, wearables, machine learning

Introduction

The World Health Organization estimates that 1 in 8 people worldwide suffer from a mental disorder, making mental health a major global concern.¹ Specifically, panic attacks (PAs), described as “sudden, intense feelings of fear,” affect around 2–3% of the US population, and up to 35% of the population are likely to experience one at some point in their lives.² These unexpected, strong episodes of anxiety or discomfort frequently manifest as heart palpitations, sweating, dizziness, or shortness of breath.² If left unchecked, they can cause social withdrawal, fear of certain places, and, in extreme situations, agoraphobia, which can seriously affect day-to-day activities and functioning.^3,4 Usually, medication, cognitive-behavioral therapy, and lifestyle changes are used to treat PAs after they occur, but anticipating the occurrence of these PAs may help patients better plan their daily activities and avoid certain PA triggers when at risk. However, predicting these panic episodes has not yet been adequately validated with the current technology available.^5,6

Wearable devices have advanced significantly in recent years, allowing the detection and real-time monitoring of various health conditions such as hypoglycemia and hypertension.^7,8 Devices such as smartwatches and fitness trackers are capable of monitoring changes in physiologic metrics such as skin conductance, heart rate (HR), physical activity levels, and sleep patterns.⁹ These devices offer several advantages over traditional laboratory-based assessments. Unlike stationary laboratory systems that restrict mobility, wearables enable unobtrusive collection of behavioral and physiological data in the participants’ natural environments, enhancing real-world validity.¹⁰ This is particularly important for panic disorder (PD) research, as PAs often occur spontaneously outside of controlled laboratory settings,¹¹ typically at home (especially during sleep or rest), in public places, at work, or during physical activity.¹² Wearable devices also facilitate remote monitoring and longitudinal study designs, reducing participant burden while allowing continuous data collection across weeks or years.^13,14 Additionally, they are more cost-effective than traditional laboratory systems, enabling scalability without the need for costly facilities or staff supervision.¹⁵ Recent research has explored the potential of wearable technology to monitor mental health and alleviate psychological distress. In their evaluation of the use of wearable sensors for mental health monitoring, Sadeghi¹⁶ demonstrated how these devices can effectively detect PTSD-related hyperarousal events in real-world settings.
Similarly, wearables were shown to promote mental health by lowering psychological discomfort through encouraging increased physical activity, better self-care, and improved health perception.⁹ These results point to the growing importance of wearable technology in treating mental health issues, and open new avenues to investigate how well wearables can inform their users of the onset of PAs.

Despite growing research on the utility of wearable technology in healthcare and recent developments in machine learning (ML), to our knowledge, there are no published reviews on the use of wearables to predict PAs. Therefore, we conducted a scoping review of the current research approaches to address this gap and propose a practical framework for PA prediction. The review specifically aimed to answer the following questions:

- What data features predict a PA?

- How effective are wearable devices in predicting PAs using physiological data?

- What are the common ML models employed for PA prediction?

- What challenges exist in using wearables to predict the onset of PAs?

Methods

The research team followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines represented in Figure 1 below.¹⁷

Figure 1.

PRISMA diagram.

Literature search and inclusion criteria

This review assessed studies that focused on the prediction of PAs using wearable devices and ML and was conducted between September and December 2024. The methodology utilized three databases: PubMed, PsycINFO, and Medline, and there was no restriction on the publication date. Two researchers collaborated on the screening process, guided by predefined inclusion and exclusion criteria.

The inclusion criteria were as follows:

- Empirical studies published in English.

- Studies focusing specifically on PAs.

- Studies utilizing wearable devices (e.g. smartwatches, fitness trackers) for data collection.

- Research that used ML or deep learning models to predict PAs.

The exclusion criteria were as follows:

- Review articles.

- Abstracts, editorials, or other partial academic articles.

The following keywords were used in the search: (“Panic attack prediction” OR “panic attacks” OR “anxiety attack prediction”) AND (“wearable devices” OR “wearable technology” OR “smartwatches”) AND (“machine learning” OR “deep learning” OR “predictive modeling”). In the search process, Boolean operators and quotation marks were employed to capture variations in terminology and ensure the scope was sufficiently broad to include all relevant studies.

Screening procedures and extracting themes

The search yielded 51 studies across the four databases. After removing duplicates (n = 24), 27 articles remained for screening. Using Rayyan.ai, a web-based systematic review tool, two researchers (KA and JH) independently performed the title and abstract screening, guided by the inclusion and exclusion criteria mentioned earlier. Cohen's kappa was calculated after the initial screening to assess inter-rater reliability and was 0.59, indicating moderate agreement. Discrepancies between the reviewers were resolved by a third reviewer (KZ).
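For readers who wish to reproduce this kind of inter-rater statistic, the following is a minimal sketch of Cohen's kappa for two screeners. The include/exclude decisions shown are hypothetical illustrative data, not the screening records of this review, and the function name `cohens_kappa` is our own.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, computed from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical decisions for 10 abstracts (1 = include, 0 = exclude).
reviewer_1 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
reviewer_2 = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(round(kappa, 3))
```

In this toy example the two reviewers agree on 8 of 10 abstracts, yet kappa is noticeably lower than 0.8 because part of that agreement is expected by chance.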

Based on the title and abstract screening, 10 articles were excluded for not meeting the inclusion criteria. The remaining 19 articles moved on to the full-text screening phase. During full-text screening, both researchers read and assessed the articles, ensuring that the studies met the inclusion and exclusion criteria. After full-text screening, seven articles were included in the final review and were assessed to extract the data features analyzed, the ML algorithms employed and their corresponding accuracy, and the devices used, and to thematically code any issues common across the reviewed studies.

After finalizing which studies were included in the review, those studies were assessed for quality based on the Quality of Study Rating Form (QSRF), a universally standardized instrument commonly applied in evaluation research to synthesize and assess the studies’ quality. The QSRF enables systematic rating of key features such as treatment effect size, characteristics of interventions, and participant profiles.¹⁸ It facilitates consistent evaluation of methodological rigor and the identification of important study parameters across multiple studies as seen in Table 1.

Table 1.

Quality rating for the reviewed studies.

Study	TQP (0–100)	Notes
Tsai et al. 2022	46	Highest quality: best methodological quality, validated scales, ML rigor; small cohort weakens generalizability, no external validation, attrition
Tsai et al. 2024	40	Strong 2-year cohort, large sample size, XAI integration, lacks randomization, self-reported bias present, external generalizability limited to one hospital, no control
Wu et al. 2022	40	Large precision health study, very large dataset, multi-disease modular platform, prospective follow-up, PD specific insights, heavy reliance on AI modeling without full interpretability
McGinnis et al. 2023	30	Pilot preprint, first to test digital biomarkers in panic, no control, small sample, high attrition, self-reported outcomes
Caldirola et al. 2023	29	Systematic review, PRISMA 2020 approach, systematic review not intervention (no protocol registration)
Rubin et al. 2015	28	Feasibility of prediction system, proof-of-concept demonstration, small sample size, no control group
Cruz et al. 2015	24	Lowest quality: feasibility prototype of intervention delivery, no control group, first attempt at just-in-time intervention delivery in panic, high risk of selective reporting

PD: panic disorder.

Results

The results were split into three themes revolving around data features collected and devices, ML models employed, and participant issues and challenges.

Devices and data features

Across recent efforts to predict PAs using wearable technology, a range of devices and data features were utilized. Both Cruz et al.¹⁹ and Rubin et al.¹¹ used the Zephyr BioPatch (Zephyr Technology – Annapolis, Maryland), a chest-worn device, to anticipate panic episodes in real time and offer prompt treatments, such as guided breathing exercises. The device collects key physiological features including HR, respiratory rate (RR), heart rate variability (HRV), core body temperature, and physical activity. These raw physiological signals were processed into engineered features such as mean, standard deviation, and change points. Pre-panic periods were generally characterized by lower HRV and higher HR, breathing rate (BR), and temperature, making these the most informative features, with 93.8% precision. However, the study was conducted on a small sample (n = 7), and no follow-up was performed to validate these initial findings on real-time prediction.

Similarly, Caldirola et al.¹⁰ utilized the Zephyr BioPatch in their pilot study to evaluate its accuracy in PA assessment by comparing its measurements of HR and BR against those obtained from a gold-standard system (Quark-b2). The BioPatch demonstrated promising but inconsistent accuracy, particularly for BR measurements. The study underscored challenges in using wearable devices for respiratory assessment and highlighted the need for methodological precision when deploying such tools outside lab environments.

McGinnis et al.²⁰ leveraged the Apple Watch to passively collect physiological and environmental data including resting heart rate (RHR), HRV, RR, ambient noise level, and physical activity (steps, distance, and stair flights, all combined into an “Activity” score), alongside an Ecological Momentary Assessment (EMA) style of reporting, whereby participants report the occurrence of PAs or log their symptoms in real time via a phone app or wearable. RHR and ambient noise showed statistically significant associations with increased likelihood of next-day PAs. The study found that both a 1 beat per minute (BPM) increase and a 5 BPM decrease in RHR from an individual's average more than doubled the risk of a next-day PA, from 9% to 23% and from 9% to 19%, respectively. Additionally, elevated ambient noise levels were associated with an 80% increase in the risk of a PA.

Next, Tsai et al.^13,14 conducted a longitudinal study using the Garmin Vivosmart 4 smartwatch (Garmin – Olathe, Kansas, USA), selected for its capacity to capture continuous HR metrics, HRV, physical activity, and sleep metrics including total sleep duration and general sleep patterns in naturalistic settings. These were combined with environmental factors such as air quality index (AQI) and psychological questionnaires, namely the Beck Anxiety Inventory (BAI),²¹ Beck Depression Inventory (BDI),²² and State-Trait Anxiety Inventory (STAI).²³ Prediction accuracy was 77.1% using questionnaires alone and only 67.4% using lifestyle and environmental data alone, rising to 81.3% when all factors were combined. Additionally, the study highlighted that adequate sleep and physical activity provide a protective effect against PAs.

This benefit of combining data sources is consistent with Wu et al.²⁴, who used a modular system that integrated data across multiple commercial devices including Fitbit (Fitbit, now owned by Google – Mountain View, California, USA), Garmin, Apple, Oura Ring (Oura Health – Oulu, Finland), and Asus (ASUS – Taipei, Taiwan). From these wearables, they collected diverse physiological features including HR, SpO2, HRV, sleep stages, and physical activity (steps, floors climbed). Environmental data, including AQI, were obtained via open APIs. Additionally, psychological questionnaires such as BDI, BAI, STAI, PD severity scale²⁵ and MINI²⁶ were utilized to collect psychological features. The prediction model achieved its highest performance when all features were integrated, with accuracy rising from 83.1% using only psychological state and sleep metrics to 88.5% when all features were combined.

A summary of the key findings from each study can be found in Tables 2a and 2b, which cover study characteristics and modeling approaches and outcomes, respectively. Additionally, a summary of the devices used and their benefits can be found in Table 3.

Table 2.

Key findings from each study included in the review.

Table 2a. Study characteristics
Study	Sample size	Duration	Demographic	Comorbidities	Device	PA reporting
Rubin et al., 2015	10	3 weeks (Short-term preliminary predictive modeling) An extension of Cruz et al. ¹⁹	Age:19–53yrs Gender:50% females,40% males,10% trans-males Clinical population: Self-identified panic disorder sufferers	No psychiatric comorbidity	Zephyr Biopatch	Mobile app with manual event-based EMA reporting
Cruz et al., 2015	10	3 weeks (Short-term pilot feasibility intervention system)	Gender: (5 females, 4 males, 1 Trans-male) Clinical population: Self-identified panic disorder sufferers	No psychiatric comorbidity	Zephyr Biopatch	Self-reporting through a smartphone app, using widget start/stop buttons
Wu et al., 2022	1667	24 months (Longitudinal)	Patients with chronic diseases Demographics not mentioned	COPD (177 patients) Panic disorder (70 patients) Obesity (120 patients)	Various Wearables (e.g. Fitbit, Garmin)	PAs inferred from wearable + clinical data via telecare platform
Tsai et al., 2022	59	24 weeks (Longitudinal)	Age: 20–74yrs Gender: 61% females Clinical population: Patients with a primary DSM-5 ^21,22 diagnosis of panic disorder Breakdown: 51% of participants had at least one comorbidity	GAD (32.2%), agoraphobia (22%), PTSD (6.8%), OCD (3.4%), bipolar disorder (1.7%), major depressive disorder (6.8%), social anxiety disorder (1.7%), others (3.4%)	Garmin Vivosmart 4	Mobile app (event-based self-report)
McGinnis et al. 2023	38	28 days (Short prospective study)	Age:18–69yrs Gender: 79% female Clinical population: All participants had at least one PA previous week Breakdown: 50% reported a mental health diagnosis, with an average of 5.56 PAs in the prior month	No specific comorbidities mentioned	Apple watch series 10	Daily EMA reporting at fixed times through a mobile app
Caldirola et al., 2023	10	Review + Pilot (short)	Clinical population: Healthy volunteers (Hospital Staff) Gender: 50% females No Other Demographics mentioned	No comorbidities	Zephyr Biopatch	Review included EMA/app studies, but in their pilot no PAs were reported
Tsai et al., 2024	114	2 years (Longitudinal)	Age: 20–89yrs Gender: 58.6% female Clinical population: outpatients with a primary diagnosis of panic disorder from En Chu Kong Hospital psychiatric clinics Breakdown: 47.5% comorbid with at least one psychiatric illness	OCD (0.3%), PTSD (0.5%), bipolar disorder (0.3%), GAD (26.3%), agoraphobia (16.1%), depressive disorders (14.1%), social anxiety disorder (0.1%), others, such as heroin disorder (0.2%)	Garmin Vivosmart 4	Smartphone app (event-based self-report)

Table 2b. Modeling approaches and outcomes
Study	Models used	Pre-processing	Class distribution	Evaluation metrics	Prediction timeframe	Significant findings
Rubin et al., 2015	Supervised Anomaly Detection + Change point detection	Data segmented into windows labeled pre-panic vs non-panic. Some manual artifact correction noted	Small number of manually logged PAs (exact count not given) → Imbalanced dataset	Precision = 0.938 Recall = 0.838 F1 = 0.885 Accuracy not reported	1 hour prior (Proof of Concept)	Best model: Personalized anomaly detection models Significant features: HR, RR, skin conductance, and skin temperature
Cruz et al., 2015	Anomaly Detection + Change point detection	N/A	29 PAs total→ Data Imbalanced	Not reported	1 hour prior (Proof of Concept)	No best feature identified but established that wearable-detected physiological changes could trigger therapeutic support
Wu et al., 2022	Modular Prediction (ML & DL models)	Wearable data denoised and aggregated into daily summaries. Clinical features standardized. All modalities merged into structured feature matrices	No PA episode-level reporting: Imbalance not analyzed	Accuracy = 0.885 Sensitivity = 0.756 Specificity = 0.93 F1 = 0.798	7-days prior	Significant Predictors: Combination of lifestyle + environmental factors outperformed psychological questionnaires alone
Tsai et al., 2022	Random Forest, Decision Trees, Gradient Boosting (XGBoost), Adaptive boosting (Ada Boost), Regularized greedy forests (RGF)	Wearable features summarized daily. Clinical scales digitized and standardized. Missing values imputed using mean substitution. Features normalized using z-scores	261 PAs → Moderate Imbalance	Accuracy = 0.674–0.813 (Random forest achieved the highest accuracy) F1 = 0.677 Precision = 0.827 Sensitivity = 0.574, Specificity = 0.938 AUROC = 0.871	During the next 7-day period	Best model: Random Forest Key insight: Psychological scores (BAI, BDI, STAI) were the most significant features, but HR and deep sleep were also significant
McGinnis et al. 2023	Mixed Regression (autoregressive structure)	Features aggregated daily, Missing/incomplete days excluded, features standardized using z-score normalization	Few daily PA days (exact count not reported)→ Strong imbalance: most daily reports = no PA	95% confidence interval and p-values ranging between 0.001 and 0.820 with no other metrics since no classifier used	1-day prior	Significant predictors: Higher RHR and louder Ambient noise exposure
Caldirola et al., 2023	Random Forest	Pilot data synchronized across devices; Bland–Altman + correlation analysis used	Pilot study: no PAs recorded	Accuracy = 0.813 No ML	Preliminary In Lab assessment (7-day prior)	Consistent abnormalities in respiration rate and HRV were observed in panic disorder patients Key insight: Respiratory signals were less reliable outside lab settings
Tsai et al., 2024	LSTM, XAI (SHAP), RNN, GRU	Features aggregated daily. Environmental features joined by timestamp. Data normalized using min–max scaling to [0,1]	402 PAs across 71 patients → Imbalanced: more non-PA days vs. fewer PA days	Accuracy = 0.948 (GRU), 0.928 (LSTM), and 0.908 (RNN) AUC = 0.986 Precision = 0.928 Sensitivity = 0.928 Specificity = 0.949	During the next 7-day period	Best model: LSTM deep learning model Significant features (via SHAP explainability): RHR (55–60 bpm), Daily average HR (72–87 bpm), Sleep duration (6 h 23 m – 10 h 50 m), Deep sleep (>50 min), and Daily climbing (>9 floors)

PA: panic attack; HR: heart rate; RR: respiratory rate; HRV: heart rate variability; RHR: resting heart rate; EMA: Ecological Momentary Assessment; bpm: beats per minute; BAI: Beck Anxiety Inventory; BDI: Beck Depression Inventory; STAI: State-Trait Anxiety Inventory; ML: machine learning.

Table 3.

Comparison of device characteristics across studies.

Reference	Device	Worn on	Battery life	HR monitoring	Sleep tracking	Sampling rate/Resolution	Metrics collected
Tsai et al., 2024	Garmin Vivosmart 4	Wrist	Up to 7 days	Yes	Yes	PPG-based HR sensor: 1 Hz Accelerometer: up to 50 Hz. Activity aggregated per min Sleep stages: 30–60 s epochs⁶	HR, HRV (derived), Step count, Activity intensity, Calories, Sleep duration & stages
Tsai et al., 2022	Garmin Vivosmart 4	Wrist	Up to 7 days	Yes	Yes
McGinnis et al., 2023	Apple Watch Series 7	Wrist	Up to 18 h	Yes	Yes	HR and HRV via PPG: 1 Hz Accelerometer and gyroscope: up to 100 Hz. Ambient noise sensor: Microphone 1 Hz²⁷	RHR, HRV, Step Count, Movement, & Noise levels
Caldirola et al., 2023	Zephyr BioPatch	Chest	Up to 36 h	yes	No	ECG and Respiration: 250 Hz, 3-axis accelerometer: 100 Hz.²⁸	ECG (HR, HRV, arrhythmia detection), RR, Tidal volume estimation, Posture, & activity level
Wu et al., 2022	Fitbit Charge 5	Wrist	Up to 7 days	Yes	Yes	HR: 1Hz Accelerometer: up to 50 Hz Activity aggregated: per minute Sleep: 30–60 s epochs²⁹	HR, HRV, SpO₂, steps, calories, & sleep staging
	Garmin Forerunner 955	Wrist	Up to 15 days	Yes	Yes	HR/HRV:1 Hz GPS, accelerometer: up to 100 Hz. Activity aggregated: per minute Sleep: 30–60 s epochs³⁰	HR, HRV, SpO₂, steps, GPS activity, & sleep
	Apple Watch Series 8	Wrist	Up to 18 h	Yes	Yes	HR/HRV via PPG: 1 Hz Accelerometer and gyroscope: up to 100 Hz Sleep: 30 s epochs	SpO₂, Skin temperature, HR, HRV, ECG, Steps, Sleep, Calories & Ambient noise
	Oura Ring Gen 3	Finger	4 to 7 days	Yes	Yes	Accelerometer:50 Hz PPG:250 Hz Sleep: 30–60 s epochs³¹	HR, HRV, SpO₂, Sleep staging & Temperature
	Asus VivoWatch 5	Wrist	Up to 7 days	Yes	Yes	HR: 1 Hz Data synced at minute-level intervals	HR, HRV, SpO₂, Activity, & Sleep
Rubin et al., 2015	Zephyr Biopatch	Chest	Up to 36 h	Yes	No	ECG and Respiration: 250 Hz Accelerometer: 100 Hz	HR, HRV, respiration, activity level, & Core temperature
Cruz et al., 2015	Zephyr Biopatch	Chest	Up to 36 h	Yes	No	ECG and Respiration: 250 Hz Accelerometer: 100 Hz	HR, HRV, respiration, activity level, & Core temperature

Most studies reported data at an aggregated level, whereas Caldirola et al. (2023), Rubin et al. (2015), and Cruz et al. (2015) employed continuous raw data, preserving the full signal.

HR: heart rate; RR: respiratory rate; HRV: heart rate variability; RHR: resting heart rate.

Machine learning models

Across six out of seven reviewed studies, both classical ML and deep learning (DL) approaches were employed to predict PAs using physiological, behavioral, and environmental data collected via wearable devices and psychological questionnaires.

First, Cruz et al.¹⁹ and Rubin et al.¹¹ both built personalized models using wearable data, relying on supervised anomaly detection via Gaussian probability models tailored to each subject. The approach fitted a Gaussian density to each subject's non-panic windows and classified windows whose density fell below a learned threshold as pre-panic, with the aim of triggering in-the-moment mobile interventions. Additionally, they adjusted leave-k-out evaluation so that every positive instance was tested, and reported macro averages alongside micro averages to avoid large-class dominance at the evaluation stage. They achieved an accuracy of 0.938, a precision of 0.938, a recall of 0.838, and an F1 score of 0.885, demonstrating strength in handling data imbalance (17 pre-panic and 280 non-panic windows) and establishing early-stage feasibility.
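The density-thresholding idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an independent (diagonal-covariance) Gaussian per feature, uses hypothetical feature values for a single subject, and sets the threshold with a simple heuristic rather than learning it from labeled pre-panic windows. All function names are our own.

```python
import math
import statistics

def fit_gaussian(windows):
    """Fit an independent Gaussian to each feature column (assumption:
    diagonal covariance; the reviewed studies may have modelled this differently)."""
    columns = list(zip(*windows))
    means = [statistics.fmean(col) for col in columns]
    # Guard against zero variance so the density stays defined.
    stds = [statistics.pstdev(col) or 1e-6 for col in columns]
    return means, stds

def log_density(window, means, stds):
    """Log-density of one feature window under the fitted Gaussian."""
    return sum(
        -0.5 * math.log(2 * math.pi * s * s) - (x - m) ** 2 / (2 * s * s)
        for x, m, s in zip(window, means, stds)
    )

def is_pre_panic(window, means, stds, threshold):
    """Flag a window whose density falls below the threshold."""
    return log_density(window, means, stds) < threshold

# Hypothetical non-panic windows for one subject: [mean HR (bpm), HRV (ms)].
non_panic = [[68, 52], [70, 50], [72, 49], [69, 53], [71, 51], [70, 48]]
means, stds = fit_gaussian(non_panic)

# Heuristic threshold (an assumption): slightly below the least likely
# non-panic training window.
threshold = min(log_density(w, means, stds) for w in non_panic) - 1.0

typical = is_pre_panic([70, 50], means, stds, threshold)
outlier = is_pre_panic([95, 30], means, stds, threshold)
print(typical, outlier)
```

Because the density is fitted per subject, a window that is unremarkable for one person (for example, a resting HR of 95 bpm) can still be flagged as anomalous for another, which is the motivation for personalized models.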

Next, McGinnis et al.²⁰ employed a per-feature mixed regression model with an autoregressive covariance structure, using physiological data (e.g. HRV, RHR, and RR) to predict the likelihood of next-day PAs. The model expressed physiological parameters relative to each participant's unique baseline, effectively capturing day-to-day variations in PA risk. They faced moderate class imbalance, with between 16% and 24% panic days, and reported parameter estimates with 95% confidence intervals and p-values ranging between 0.001 and 0.820; AUC, F1, precision, and recall were not calculated because no classifier was used.
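As an illustration of expressing a feature relative to each participant's baseline, the sketch below centers daily resting heart rate on the participant's own mean. The records are hypothetical, and taking the baseline as the mean over the whole observation period is an assumption; the exact baseline definition used in the study is not given here. The resulting deviations would serve as predictors in a regression model, which this sketch does not fit.

```python
from collections import defaultdict
from statistics import fmean

def deviations_from_baseline(records):
    """Return (participant, day, value minus that participant's mean)."""
    by_participant = defaultdict(list)
    for participant, _day, value in records:
        by_participant[participant].append(value)
    baseline = {p: fmean(values) for p, values in by_participant.items()}
    return [(p, day, value - baseline[p]) for p, day, value in records]

# Hypothetical daily resting heart rates in bpm: (participant, day, RHR).
records = [
    ("P01", 1, 60.0), ("P01", 2, 62.0), ("P01", 3, 64.0),
    ("P02", 1, 75.0), ("P02", 2, 75.0), ("P02", 3, 78.0),
]
centered = deviations_from_baseline(records)
for row in centered:
    print(row)
```

After centering, a value of +2 bpm means the same thing for both participants (two beats above their own usual level), even though their raw resting heart rates differ by more than 10 bpm.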

Third, Wu et al.²⁴ employed modular prediction models that incorporated both ML and DL techniques including decision trees, random forests, and deep neural networks. Collectively, these models achieved an overall accuracy of 0.885, a sensitivity of 0.756, a specificity of 0.93, and an F1 score of 0.798 in predicting acute exacerbations, and they passed external validation. These models, applied to various chronic conditions including PD, were trained on multimodal data from wearable sensors, lifestyle factors, and environmental inputs, reinforcing the value of multimodal lifestyle-environmental integration. The model handled class imbalance (386 out of the 1667 detected as abnormal episodes) by upweighting the minority class using “Keras” class weights to increase sensitivity. Additionally, re-sampling was done to mitigate a “disparate ratio of abnormal events,” although exact per-class counts for the dataset were not reported.

Similarly, Tsai et al.¹⁴ used six ML models: Random Forest, Decision Trees, linear discriminant analysis (LDA), AdaBoost, XGBoost, and Regularized Greedy Forests. They achieved the highest accuracy with Random Forest at 81.3%, with an F1 score of 0.677, a precision of 0.827, a sensitivity of 0.574, a specificity of 0.938, and an AUROC of 0.871, using features such as HR, sleep duration, and standardized anxiety scores from BAI, BDI, and STAI. They reported a moderate class imbalance, with 35.1% PA vs 64.9% non-PA in the training set and 34.2% vs 65.8% in the testing set. Seven days of physiological data were backfilled to align with participants’ weekly label, and models were trained to predict whether a PA would happen in the upcoming week.

Their later study, Tsai et al.,¹³ applied LSTM, RNN, and GRU models. LSTM achieved an accuracy of 92.8% for a 7-day prediction window, with an AUC of 0.986, a precision of 0.928, a sensitivity of 0.928, and a specificity of 0.949. Six days of time-series data were used to predict the label on the 7th day. However, this 7-day label still reflected whether a PA had occurred at any point during the week. Hence, the models signaled risk within the next 7 days, without specifying whether the attack would happen the following day, 3 days later, or exactly on the 7th day. As in the earlier study, they reported a moderate class imbalance for the PA label, with about 37% positive in both the training and testing sets; evaluating F1, recall, precision, and AUC alongside accuracy reduces bias from class imbalance. They also integrated SHAP to elucidate the most influential features, identifying optimal sleep and activity thresholds to mitigate PA risk. A summary of the accuracy of some of the models mentioned in the reviewed studies can be found in Figure 2 below.

Figure 2.

Panic attack prediction accuracy across various ML models.
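To make concrete why accuracy alone can mislead under class imbalance, the sketch below computes standard metrics from confusion-matrix counts. It uses the class sizes reported for the anomaly-detection work (17 pre-panic and 280 non-panic windows) but applies them to a hypothetical trivial classifier that always predicts "non-panic"; this is an illustration, not a result from any reviewed study. It also checks that the reported precision (0.938) and recall (0.838) are consistent with the reported F1 score.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }

def f1_from(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical classifier that labels every window as non-panic.
always_negative = classification_metrics(tp=0, fp=0, fn=17, tn=280)
print(round(always_negative["accuracy"], 3), always_negative["recall"])

# Consistency check on the reported precision and recall.
print(round(f1_from(0.938, 0.838), 3))
```

The trivial classifier reaches roughly 94% accuracy while detecting no pre-panic windows at all, which is why recall, precision, and F1 are more informative than accuracy for rare events such as PAs.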

Participant issues

Gender imbalance and dropout rates were common issues across most studies dealing with PAs. Caldirola et al.¹⁰ reviewed seven different studies before conducting their experiment and found varying participant demographics. For instance, one study they mentioned recruited 26 healthy controls and 26 patients with PD, with women constituting 85% of the PD group and 77% of the healthy controls. Across all the studies they reviewed, women were the majority, constituting 59% to 85% of the samples. Tsai et al.¹⁴ recruited 59 participants with a primary diagnosis of PD, 61% of whom were female, but excluded 974 data points due to missing environmental or physiological data. They later conducted a follow-up study in 2024, starting with 114 enrolled participants, of whom only 99 completed the study (58.6% were female).

Wu et al.²⁴ involved 1667 participants with various chronic diseases including PD. Over a 24-month follow-up, 386 abnormal episodes were explicitly identified. McGinnis et al.²⁰ started with 107 participants; however, only 87 completed the daily survey, and only 38 uploaded Apple Watch data, 79% of whom were female. Moreover, high dropout rates were noted: 35 participants were withdrawn after the first week, 25 after the second week, and 1 after the third week. Finally, Cruz et al.¹⁹ and Rubin et al.¹¹ both included 10 individuals who suffer from PD. In Rubin et al.'s¹¹ study, 7 active participants remained after excluding those who did not log any physiological data (Initial cohort included: 5 females, 1 male and 1 trans-male).

Discussion

Given that no reviews have been published on the use of wearable devices to predict PAs, this manuscript synthesizes the key findings from studies attempting the prediction of PAs using wearables. Below, we comment on the data features, devices, and ML algorithms employed to serve this purpose, while addressing the challenges, implications, and ethical issues pertinent to these interventions.

Data features and devices

Conducting research with wearables requires careful attention to several important factors, given the variety of devices utilized in literature and on the market. In fact, the selection of a device is critical since it affects what data features are analyzed, the data's accuracy, as well as the user experience with the wearable. Although device choice and features varied, several insights emerged.

Among the reviewed devices, the Apple Watch and Garmin Vivosmart 4 offered strong usability and reliable tracking of behavioral and physiological trends such as RHR, sleep patterns, and ambient noise.^32,33 However, both lacked ECG-grade precision: neither measured HRV during the day, and both showed low accuracy in inferring sleep stages. In contrast, the Zephyr BioPatch delivered the most physiologically precise data, particularly for HR, HRV, and RR, due to its chest-worn ECG-grade sensors.³⁴ However, its bulky design, susceptibility to motion artifacts, and lower comfort and wearability may limit scalability and study compliance.³⁵ Overall, consumer-grade wrist-worn devices may provide greater comfort and user adherence, while clinical-grade chest sensors offer greater depth at the expense of long-term feasibility.

Accordingly, we propose that the Garmin Vivosmart 4 seems to offer an optimal balance for PA prediction, as it combines acceptable physiological tracking with strong usability in real-world settings. Our findings suggest its predictive performance improves when integrated with psychological and environmental data.⁶ Additionally, it stands out as the most affordable device among those evaluated.³⁶ In terms of battery life, we note that the Garmin Vivosmart 4 supports up to 7 days of continuous use, making it particularly suitable for long-term monitoring.³⁶ The Zephyr BioPatch supports short-term high-fidelity assessments for up to 36 h of use but requires more frequent charging.³⁷ The Apple Watch SE provides the shortest battery life, at up to 18 h, which may limit its utility for continuous tracking.³⁸ Researchers must weigh the different limitations and benefits of each device as they choose the optimal one for their study. Other devices can also be validated in terms of PA prediction, such as the Fitbit, which is affordable, supports a long battery life, and can be considered an everyday smartwatch.

Regardless of the device used, our results indicate that key predictive features included HRV, RHR, sleep quality, and scores from validated psychological scales. Although HRV proved to be the most informative physiological feature, as it is sensitive to stress-induced shifts preceding PAs,^22,23 HRV alone did not outperform psychological features as a standalone predictor. Moreover, most watches are unable to measure HRV during the day and only provide this feature during sleep. Sleep quality therefore seems to be an interesting feature to investigate further, given the correlation between sleep stages and next-day PA vulnerability.³⁹

Meanwhile, data from psychological questionnaires (BAI, BDI, and STAI) emerged as the strongest standalone predictors of a PA 1 week in advance, as they captured subjective emotional states closely aligned with panic vulnerability. However, because psychological questionnaires require frequent user engagement, which can be perceived as intrusive, we propose that models integrate multimodal data streams (physiological, psychological, and environmental). This approach can mitigate participant burden by increasing passive monitoring, and it can enhance overall robustness and accuracy compared with single-feature models. Additionally, by leveraging personalization and adaptive sampling, models can learn individual patterns over time, increasing accuracy and ultimately reducing participant burden.⁴⁰

These key predictive features can be clinically contextualized to guide interventions. Reduced HRV and elevated RHR signify autonomic dysregulation, a state linked to heightened vulnerability to PAs.^10,13 Interventions such as HRV biofeedback and paced breathing have shown efficacy in improving autonomic balance and reducing anxiety symptoms in other studies.⁴¹ Poor sleep quality, indicated by fragmentation or reduced duration, reflects impaired recovery and greater next-day panic risk.¹³ Cognitive-behavioral therapy for insomnia (CBT-I) and structured sleep-hygiene strategies are effective treatments that also mitigate anxiety.⁴² Moreover, psychological burden scores (BDI, BAI, STAI) support stepped-care interventions including CBT, psychoeducation, and pharmacotherapy.¹⁴ Together, these insights may illustrate how wearable-derived metrics can be translated into actionable interventions for panic prediction and prevention.

Machine learning

The LSTM deep learning model delivered the highest performance for predicting PAs over a short-term 7-day window. It captured temporal patterns and handled sequential physiological data well. However, it required larger datasets and external tools such as SHAP for interpretability. Random Forest was the second-best performer in short-term PA prediction and the top-performing traditional ML model, achieving high accuracy using physiological and psychological time-series data. It demonstrated high interpretability and robustness to overfitting, although some models showed limited sensitivity.

The modular prediction model bridged the gap between performance and practicality by combining the strengths of deep learning and ML in an explainable, multi-source architecture. It delivered high accuracy using only a few selected features, making it highly suitable for real-world deployment and personalized health monitoring. While it does not quite match the peak accuracy of LSTM, it stands out as the most well-rounded and scalable approach for PA prediction.

As such, we recommend using LSTM models when the primary objective is to maximize performance, particularly when rich sequential physiological data are available. However, for real-world deployment scenarios that require interpretability and integration of diverse data types (e.g. physiological, environmental), we suggest using modular prediction models, which offer an optimal balance between accuracy, usability, and explainability. Meanwhile, Random Forest serves as a strong traditional ML baseline, offering computational simplicity, robustness, and interpretability with reasonable predictive performance.

Based on these findings on data features, devices, and ML models, we propose a framework to aid in the design of a PA prediction study in Figure 3. The framework integrates multimodal data acquisition from participants through three main modalities: wearable devices, psychological questionnaires, and environmental sensors. Wearables provide continuous physiological data such as RHR, HRV, RR, and sleep patterns. Psychological questionnaires capture assessments related to stress, anxiety, and PD symptoms, while environmental sensors collect contextual data including AQI and noise levels. These data streams feed into an ML model tasked with predicting impending PAs, potentially an LSTM, Random Forest, or modular ensemble, as these demonstrated strong predictive performance in our findings. Model performance is evaluated using standard performance metrics such as precision, accuracy, and F1-scores. If thresholds are met, the system proceeds to alert the user and deliver just-in-time interventions, either via notifications on mobile applications or through feedback delivered directly on wearable devices. The framework also opens the door to broader possibilities, encouraging the exploration of other novel feedback mechanisms and inviting researchers and practitioners to envision how such interventions could evolve.
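The evaluation step of the framework can be illustrated with a minimal Python sketch. It computes accuracy, precision, recall, and F1 from binary PA labels and checks them against minimum values before alerts would be enabled. The function names and the threshold values are hypothetical and chosen only for illustration; the reviewed studies do not specify them.

```python
from typing import Dict, Sequence


def evaluate(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = PA)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # predicted PA, PA occurred
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted PA, none occurred
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed PA
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly predicted no PA
    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def meets_thresholds(metrics: Dict[str, float], minimums: Dict[str, float]) -> bool:
    """Return True only if every listed metric reaches its minimum value."""
    return all(metrics[name] >= floor for name, floor in minimums.items())


# Hypothetical minimums; a real study would justify these clinically.
MINIMUMS = {"accuracy": 0.70, "precision": 0.70, "f1": 0.70}

observed = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = evaluate(observed, predicted)
alerts_enabled = meets_thresholds(metrics, MINIMUMS)
```

Reporting recall alongside precision matters here because PAs are comparatively rare events, so accuracy alone can look high even when most attacks are missed.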

Figure 3.

Proposed framework.

Predictive Timeframe: Weekly risk prediction in PAs, as seen in Tsai et al.,^13,14 offers clinical utility through closer monitoring and preventive strategies, aligning with real-world aggregated reporting and reducing daily fluctuation noise. However, this method sacrifices temporal precision: it fails to pinpoint the exact day of an attack within the week, which leads to ambiguous early warnings and potentially inflated performance metrics. This contrasts with other studies aiming to achieve more fine-grained predictions, such as 1 h or 1 day in advance.^11,20 The lack of temporal precision limits personalized interventions, as precise critical periods within the week cannot be identified. Future research should focus on developing models capable of daily or at least near-real-time predictions for more actionable, just-in-time interventions.
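The loss of temporal precision under weekly labeling can be shown with a small Python sketch. It assumes a hypothetical encoding in which each day is labeled 1 if a PA occurred and 0 otherwise, and a week is labeled positive if any of its days is positive; the reviewed studies may define their weekly outcome differently.

```python
from typing import List

DAYS_PER_WEEK = 7


def weekly_labels(daily: List[int]) -> List[int]:
    """Collapse daily PA labels into one label per consecutive 7-day block.

    A block is positive (1) if at least one of its days is positive.
    """
    return [
        int(any(daily[start:start + DAYS_PER_WEEK]))
        for start in range(0, len(daily), DAYS_PER_WEEK)
    ]


# Two weeks with an attack on different days (illustrative data only).
attack_on_day_2 = [0, 1, 0, 0, 0, 0, 0]
attack_on_day_7 = [0, 0, 0, 0, 0, 0, 1]

# Both collapse to the same weekly label, so a weekly model cannot
# distinguish an attack that is one day away from one that is six days away.
same_weekly_label = weekly_labels(attack_on_day_2) == weekly_labels(attack_on_day_7)
```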

Common study issues

Attrition

Research on wearable technology and PAs should address recruitment and dropout challenges and ensure adequate sample sizes for robust findings. Dropout rates in many studies ranged from 20% to 50% and were influenced by a variety of factors. These may include the sensitive nature of the study population, technology-related issues, the hassle of completing frequent surveys, and the longitudinal nature of the study.^10,13

These concerns highlight the need for better study design that accounts for the sensitive nature of participants, as well as for better retention and recruitment tactics. First, how participants are referred to the study plays a large role in how committed they may be to completing it.⁴³ Researchers should use various forms of recruitment (e.g. posters, direct email, physician referrals, counseling, or community centers) to recruit as many committed participants as possible. Researchers must also be understanding of the PA population and their concerns over participating in such a study, and should attempt to address any barriers to participants remaining in the study for the required period of time. Additionally, emphasizing the benefits of participating, scheduling regular reminders about required tasks, and including elements of gamification and goal setting have been found to be important motivators that keep participants engaged and dedicated.^44,45 Finally, early prediction of dropout using passive behavioral data, such as time spent on tasks or device usage, combined with self-reported feedback may enable the timely identification of at-risk users.⁴⁶ This allows researchers to customize support, such as reminders, nudges, or light coaching, to stop disengagement before it starts.

Gender imbalance

In most studies reviewed in this paper, the majority of participants were female, primarily because females are more likely to experience forms of anxiety such as PAs.⁴⁷ This indicates a need for higher representation of males in study samples to better understand how males experience and manage their PAs. However, males also seem to be more hesitant to participate or stay engaged in mental health studies, and this may be influenced especially by cultural and demographic factors.^48,49 Recruitment can target more male participants by tailoring the recruitment material to males and by incorporating gamification elements that align with male interests, such as sports themes and goal-oriented progression, to further enhance engagement among men. Male participants enrolled in the study may also help, through a snowball approach, to recruit additional male acquaintances into the study.

Outcome measurement and labeling

Most studies relied on self-reported PAs via EMA reporting of PA onset or symptoms in either real time or at fixed times, which is prone to recall bias, inaccurate timing, and inconsistent compliance.^11,20 Participants may miss or misreport episodes, especially during distress, and even with real-time widgets,¹⁹ alignment accuracy is limited by compliance and delays. Subjective interpretation further introduces inconsistent labeling, adding noise that undermines predictive validity in longitudinal studies.^13,14

Study design and cohort size

Most studies were short-term pilots with small cohorts,^10,11,19,20 which limits power and reproducibility. Even larger efforts, such as those by Tsai et al.,^13,14 remain modest for ML: they yield proof-of-concept value but may be of insufficient scale for robust, externally validated models.^50,51 Recruitment challenges and the difficulties of conducting a longitudinal study partly explain these tradeoffs. Additionally, both studies by Tsai et al.^13,14 lacked clarity in how the data were split. The first study used a temporal split without confirming participant separation, risking leakage, while the second study applied an 80:20 random split without maintaining chronological integrity. Rigorous time-aware methods may be essential to avoid inflated performance estimates.
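One way to address both concerns at once is sketched below in Python: held-out participants contribute only to the test set, and every training record predates every test record. This is an illustrative assumption rather than the procedure used in any reviewed study, and the `Record` fields, participant IDs, and cutoff date are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Record:
    participant_id: str
    day: date
    features: Tuple[float, ...]  # e.g. (RHR, sleep hours); illustrative only
    label: int                   # 1 = PA reported, 0 = no PA


def participant_time_split(
    records: List[Record], test_participants: Set[str], cutoff: date
) -> Tuple[List[Record], List[Record]]:
    """Split so that no participant and no time period is shared across sets.

    Training set: non-held-out participants, days strictly before the cutoff.
    Test set: held-out participants, days on or after the cutoff.
    Records matching neither condition are discarded to prevent leakage.
    """
    train = [
        r for r in records
        if r.participant_id not in test_participants and r.day < cutoff
    ]
    test = [
        r for r in records
        if r.participant_id in test_participants and r.day >= cutoff
    ]
    return train, test


# Tiny synthetic dataset: three participants, four consecutive days each.
records = [
    Record(pid, date(2024, 1, d), (60.0 + d, 7.0), d % 2)
    for pid in ("p1", "p2", "p3")
    for d in range(1, 5)
]
train_set, test_set = participant_time_split(records, {"p3"}, date(2024, 1, 3))
```

The cost of this design is that some records are discarded, which is a real constraint for the small cohorts described above; within-participant forward-chaining splits are a common alternative when the goal is personalized forecasting.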

Methodological and generalizability limitations

Several of the reviewed studies face methodological, ecological, and generalizability limitations. Many lacked transparency in pre-processing and artifact handling, hindering replication.^14,19,24 Ecological validity was also limited by the use of controlled settings or healthy volunteers rather than clinical populations.¹⁰ Generalizability is further constrained by single-country, outpatient cohorts, often in Taiwan or specific U.S. samples.^13,14,20,24 These issues underscore the need for transparent reporting, ecologically valid real-world data, and multi-site studies to improve external validity.

Ethical issues

The chosen articles addressed various ethical issues related to data privacy, digital literacy barriers, and predictive reliability that are discussed below.

Data privacy and confidentiality

Caldirola et al.¹⁰ and Wu et al.²⁴ highlighted the importance of privacy when handling sensitive patient data. Wearable devices and mobile systems continuously collect and transmit environmental data, physiological data, and physical activity data, as well as patient location. For this reason, Tsai et al.¹⁴ emphasized the necessity of implementing strict data protection guidelines that guarantee the confidentiality of patient information during collection and analysis and prevent it from being accessed by unauthorized parties. Privacy concerns must be handled seriously because they may affect recruitment and retention, especially in cultures where mental health stigma is particularly high.⁵²

Socioeconomic and digital literacy barriers

Aside from privacy, Wu et al.²⁴ stated that despite the promising potential of such technologies, not everyone would benefit from them equally. Wearable technologies such as smart watches may be expensive, thus preventing low-income population segments from accessing these devices. Additionally, digital literacy may hinder certain individuals, such as elderly people or underserved communities, from using such devices effectively, further widening the gap of healthcare accessibility. Communities interested in improving the management of PAs may subsidize and integrate wearable devices with their local services offered to individuals who may not be able to afford them, push insurance agencies to help cover costs of these wearables, and train affected individuals on using these devices effectively to manage their condition.^34,35

Challenges in data uncertainty and predictive reliability

One of the articles reviewed raised another ethical concern revolving around the validity and accuracy of information gathered from wearables in real-world settings.¹⁰ Compared with more controlled and stationary systems, the accuracy of commercial wearables’ physiological data, such as sleep stages, HR, and BR, may sometimes be low, leading to incorrect interpretations and decisions.⁵³ Therefore, care must be taken when such algorithms are applied, as inaccurate data may affect the quality of care provided and hamper people's trust in such technology.

Merits and advancements of reviewed studies

Across these studies, the main merit lies in advancing the conceptualization of PAs from unpredictable and spontaneous mental health events to conditions with identifiable physiological, behavioral, and environmental precursors. Feasibility studies by Rubin et al.¹¹ and Cruz et al.¹⁹ were the first to demonstrate that wearable devices may predict PA episodes up to an hour in advance and also deliver in-the-moment interventions, reframing panic management from reactive treatment to proactive support. Later, McGinnis et al.²⁰ conducted a fully remote, real-world feasibility study showing that consumer wearables can yield digital biomarkers, such as RHR and ambient noise, that prospectively signal next-day panic risk.

Meanwhile, Tsai et al.¹⁴ added the critical contribution of multimodal modeling, showing that combining environmental, physiological, and psychological data enriches predictive capacity and reframes PAs as outcomes of complex biopsychosocial interactions. Wu et al.²⁴ pushed this research further by embedding panic prediction within a precision health service, contributing a systems-level view that integrates panic management into chronic disease care. Tsai et al.¹³ then extended this work by demonstrating that panic recurrence is not only predictable with high accuracy but may also be affected by daily behaviors such as sleep and physical activity, situating PAs within a lifestyle framework. Collectively, these contributions have shifted the field toward viewing PAs as predictable, preventable, and digitally actionable, strengthening the foundation for precision psychiatry and early intervention.

Future work

While the potential of wearables to track physiological markers such as HRV, HR, sleep data, or BR has been shown in numerous studies, models that can reliably predict PAs before they happen are still in the early phases of development. First, the prediction timeframe of these PAs is still not well validated as mentioned earlier, and more research is warranted on defining what constitutes a practical prediction time frame within a naturalistic setting.

Second, phenotyping patients based on symptoms and demographics may be a necessary approach when attempting to predict PA onset, yet it has not been incorporated in the studies we reviewed. Ohst & Tuschen-Caffier⁵⁴ noted that while individuals with PD often show heightened interoception and exaggerated physiological responses, this sensitivity varies widely among individuals and does not consistently predict the onset of PAs. Additionally, Jang et al.⁵⁵ emphasized that tailoring predictions to individual baselines and accounting for inter-individual variability in panic symptom patterns is essential for achieving optimal performance. Therefore, personalizing the prediction model by merging advanced ML models with a patient's profile and symptoms may yield a more realistic prediction model.
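As a simple illustration of tailoring to individual baselines, the Python sketch below expresses a new reading as a z-score relative to a participant's own baseline window rather than a population norm. This is only one possible normalization, not the method of Jang et al. or any reviewed study, and the function name and RHR values are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def baseline_z_score(baseline: Sequence[float], value: float) -> float:
    """Express a reading relative to one participant's own baseline window.

    Returns how many sample standard deviations `value` lies from the
    participant's baseline mean; returns 0.0 if the baseline has no spread.
    """
    if len(baseline) < 2:
        raise ValueError("baseline needs at least two readings")
    spread = stdev(baseline)
    if spread == 0:
        return 0.0
    return (value - mean(baseline)) / spread


# Illustrative resting heart rates (beats per minute) for two participants.
participant_a_baseline = [60, 62, 58, 61, 59]  # typically around 60
participant_b_baseline = [66, 70, 68, 67, 69]  # typically around 68

# The same raw reading of 68 is a strong deviation for participant A
# but entirely ordinary for participant B.
z_a = baseline_z_score(participant_a_baseline, 68)
z_b = baseline_z_score(participant_b_baseline, 68)
```

Feeding such individualized deviations to a model, instead of raw values, is one way the inter-individual variability described above could be accounted for.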

Third, once prediction has achieved adequate accuracy and a practical time frame, a need will arise to provide patient-friendly interventions. This would require an understanding of how the wearable technology might be used in conjunction with proactive intervention measures, such as automated breathing exercises or relaxation tasks, to help patients better manage the onset of panic episodes.⁵⁶ Work is in progress to interview physicians and PA patients to understand how to design such an intervention in conjunction with PA prediction via a commercial smartwatch.

Limitations

Our review had several limitations, mainly stemming from our focus on the prediction of PAs using wearable devices. First, this focus led to a limited number of articles (n = 7) being included in the full review and possibly excluded articles that do not use a wearable device or are not solely focused on PAs. This reflects the fact that wearable-based research on PAs is still in its infancy, with very few studies achieving adequate sample sizes and monitoring patient condition longitudinally. The limited number of studies across a variety of devices also limits our ability to conduct a systematic review of the existing literature and provide more generalizable findings. Second, we only utilized three common research databases, namely PubMed, PsycINFO, and Medline. Third, we utilized specific keywords and search terms and may have missed other terminology in use. Lastly, the review only included articles published in English and did not critically assess the quality of the publications. Given the limited research in this field, conducting a review at this stage was nonetheless critical, as it enables the synthesis of emerging findings and the identification of consistent features that may serve as preliminary indicators to guide hypothesis generation and help lay the groundwork for future large-scale research.

Conclusion

This paper highlighted the various devices and data features used to predict PAs with different levels of accuracy and prediction time frames. Our study also provided a comprehensive framework for conducting a study to predict the onset of a PA by highlighting the recommended ML models, wearable devices, and data features. Given the limited research in this area, future work should investigate phenotyping participants to improve prediction efforts, validate a near-real-time prediction algorithm, and implement strategies to reduce participant attrition and ensure balanced gender representation. Additionally, studies should further investigate the correlation between sleep stages and next-day PA vulnerability, identify the ideal intervention to provide upon the detection of such a PA, and identify an approach that balances device wearability with the ability to record HRV during the day, which is deemed critical for near-real-time PA detection.

Footnotes

Acknowledgments

The team would like to thank Samir Abdul Aal for his help in final editing and proofreading of this manuscript.

ORCID iD

Karim Zahed

Ethical considerations

Not required.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Author contributions

KA and JH contributed to data curation, formal analysis, and writing—original draft; YH contributed to formal analysis, investigation, and writing—original draft; KZ contributed to methodology, project administration, conceptualization, supervision, and writing; HG contributed to supervision and writing—review & editing.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Vertically Integrated Project Program at the Maroun Semaan Faculty of Engineering & Architecture at the American University of Beirut.

Declaration of conflicting interest

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability

N/A.

Guarantor

KZ.

References

1. World Health Statistics. World Health Statistics 2022 [Internet]. 2022 [cited 2025 Mar 7]. https://www.who.int/news/item/20-05-2022-world-health-statistics-2022.

2. Cleveland Clinic. Panic Attacks & Panic Disorder: Causes, Symptoms & Treatment [Internet]. 2023 [cited 2025 Mar 7]. https://my.clevelandclinic.org/health/diseases/4451-panic-attack-panic-disorder.

3. Panic Attack Symptoms [Internet]. Verywell Health. [cited 2025 Sep 13]. https://www.verywellhealth.com/panic-attacks-symptoms-5093170.

4. Agoraphobia: MedlinePlus Medical Encyclopedia [Internet]. [cited 2025 Sep 13]. https://medlineplus.gov/ency/article/000923.htm.

5. Stäubli. Panic attacks. Schweiz Med Wochenschr 1993; 123: 800–806.

6. Gomes, Pato, Lourenço, et al. A survey on wearable sensors for mental health monitoring. Sensors 2023; 23: 1330.

7. Jahromi, Zahed, Sasangohar, et al. Hypoglycemia detection using hand tremors: home study of patients with type 1 diabetes. JMIR Diabetes 2023; 8: e40990.

8. Zahed, Markert, Dunn, et al. Investigating the effect of an mHealth coaching intervention on health beliefs, adherence and blood pressure of patients with hypertension: a longitudinal single group pilot study. DIGITAL HEALTH 2023; 9: 20552076231215904.

9. Choudhury, Asan. Impact of using wearable devices on psychological distress: analysis of the health information national trends survey. Int J Med Inf 2021; 156: 104612.

10. Caldirola, Daccò, Grassi, et al. Cardiorespiratory assessments in panic disorder facilitated by wearable devices: a systematic review and brief comparison of the wearable zephyr BioPatch with the quark-b2 stationary testing system. Brain Sci 2023; 13: 502.

11. Rubin, Eldardiry, Abreu, et al. Towards a mobile and wearable system for predicting panic attacks. In: Proceedings of the 2015 ACM international joint conference on pervasive and ubiquitous computing [Internet]. Osaka Japan: ACM; 2015 [cited 2025 Jan 12]. pp.529–33. https://dl.acm.org/doi/10.1145/2750858.2805834.

12. Margraf, Taylor, Ehlers, et al. Panic attacks in the natural environment. J Nerv Ment Dis 1987; 175: 558–565.

13. Tsai, Christian, Kuo, et al. Sleep, physical activity and panic attacks: a two-year prospective cohort study using smartwatches, deep learning and an explainable artificial intelligence model. Sleep Med 2024; 114: 55–63.

14. Tsai, Chen, Liu, et al. Panic attack prediction using wearable devices and machine learning: development and cohort study. JMIR Med Inform 2022; 10: e33063.

15. Huhn, Axt, Gunga, et al. The impact of wearable technologies in health research: scoping review. JMIR Mhealth Uhealth 2022; 10: e34384.

16. Sadeghi, McDonald, Sasangohar. Posttraumatic stress disorder hyperarousal event detection using smartwatch physiological and activity data. PLOS ONE 2022; 17: e0267749.

17. Tricco, Lillie, Zarin, et al. PRISMA extension for scoping reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med 2018; 169: 467–473.

18. Gibbs. Quality of study rating form: an instrument for synthesizing evaluation studies. J Soc Work Educ 1989; 25: 55–67.

19. Cruz, Rubin, Abreu, et al. A wearable and mobile intervention delivery system for individuals with panic disorder. In: Proceedings of the 14th international conference on mobile and ubiquitous multimedia [Internet]. Linz Austria: ACM; 2015 [cited 2025 Jan 12]. pp.175–82. https://dl.acm.org/doi/10.1145/2836041.2836058.

20. McGinnis, Lunna, Berman, et al. Discovering Digital Biomarkers of Panic Attack Risk in Consumer Wearables Data. medRxiv. 2023 Mar 6;2023.03.01.23286647.

21. Beck, Epstein, Brown, et al. An inventory for measuring clinical anxiety: psychometric properties. J Consult Clin Psychol 1988; 56: 893–897.

22. Beck, Ward, Mendelson, et al. An inventory for measuring depression. Arch Gen Psychiatry 1961; 4: 561–571.

23. Spielberger, Gonzalez-Reigosa, Martinez-Urrutia, et al. The State-Trait Anxiety Inventory. Revista Interamericana de Psicología/Interamerican Journal of Psychology [Internet]. 2017 Jul 17 [cited 2025 Apr 15];5(3 & 4). https://journal.sipsych.org/index.php/IJP/article/view/620.

24. Wang, et al. A precision health service for chronic diseases: development and cohort study using wearable device, machine learning, and deep learning. IEEE J Transl Eng Health Med 2022; 10: 2700414.

25. PDSS - Panic Disorder Severity Scale (PDSS) [Internet]. 2021 [cited 2025 Sep 13]. https://novopsych.com/assessments/anxiety/panic-disorder-severity-scale-pdss/.

26. Sheehan, Lecrubier, Sheehan, et al. The mini-international neuropsychiatric interview (M.I.N.I.): the development and validation of a structured diagnostic psychiatric interview for DSM-IV and ICD-10. J Clin Psychiatry 1998; 59: 22–33. quiz 34–57.

27. O’Grady, Lambe, Baldwin, et al. The validity of apple watch series 9 and ultra 2 for serial measurements of heart rate variability and resting heart rate. Sensors (Basel) 2024; 24: 6220.

28. Nazari, Bobos, MacDermid, et al. Psychometric properties of the zephyr bioharness device: a systematic review. BMC Sports Sci Med Rehabil 2018; 10: 6.

29. Schyvens, Van Oost, Aerts, et al. Accuracy of Fitbit Charge 4, Garmin Vivosmart 4, and WHOOP versus polysomnography: systematic review. JMIR Mhealth Uhealth 2024; 12: e52192.

30. Szot. Evolution of sport wearable global navigation satellite systems’ receivers: a look at the Garmin forerunner series. Proc Inst Mech Eng Part P J Sports Eng Technol 2024: 17543371241237319.

31. Svensson, Madhawa, et al. Validity and reliability of the oura ring generation 3 (Gen3) with Oura sleep staging algorithm 2.0 (OSSA 2.0) when compared to multi-night ambulatory polysomnography: a validation study of 96 participants and 421,045 epochs. Sleep Med 2024; 115: 251–263.

32. Heart Rate Accuracy from Garmin Wearables - Labfront [Internet]. [cited 2025 Sep 13]. https://www.labfront.com/article/heart-rate-accuracy-wearables.

33. Hernando, Roca, Sancho, et al. Validation of the Apple watch for heart rate variability measurements during relax and mental stress in healthy subjects. Sensors 2018; 18: 2619.

34. Reliability and Validity of the Zephyr™ BioHarness™ to Measure Respiratory Responses to Exercise. Request PDF. ResearchGate [Internet]. 2025 Aug 10 [cited 2025 Sep 13]; https://www.researchgate.net/publication/254307384_Reliability_and_Validity_of_the_Zephyr_BioHarness_to_Measure_Respiratory_Responses_to_Exercise.

35. Kang, Exworthy. Wearing the future-wearables to empower users to take greater responsibility for their health and care: scoping review. JMIR Mhealth Uhealth 2022; 10: e35684.

36. Garmin vivosmart® 4. Fitness Activity Tracker. Pulse Ox [Internet]. [cited 2025 May 24]. https://www.garmin.com/en-US/p/605739/.

37. Zephyr™ Performance Systems. Performance Monitoring Technology [Internet]. [cited 2025 May 24]. https://www.zephyranywhere.com/.

38. Apple Inc. Buy Apple Watch SE [Internet]. Apple. 2025 [cited 2025 May 24]. https://www.apple.com/shop/buy-watch/apple-watch-se.

39. Belleville, Potočnik, Belleville, et al. A meta-analysis of sleep disturbances in panic disorder. In: Psychopathology - an international and interdisciplinary perspective [Internet]. IntechOpen; 2019 [cited 2025 May 28]. https://www.intechopen.com/chapters/67142.

40. Wang, Dasilva, et al. Tracking depression dynamics in college students using mobile phone and wearable sensing. Proc ACM Interact Mob Wearable Ubiquitous Technol 2018; 2: 43.

41. Pizzoli SFM, Marzorati, Gatti, et al. A meta-analysis on heart rate variability biofeedback and depressive symptoms. Sci Rep 2021; 11: 6650.

42. Kanady, Talbot, Maguen, et al. Cognitive behavioral therapy for insomnia reduces fear of sleep in individuals with posttraumatic stress disorder. J Clin Sleep Med 2018; 14: 1193–1203.

43. Liu, Pencheon, Hunter, et al. Recruitment and retention strategies in mental health trials – A systematic review. PLOS ONE 2018; 13: e0203127.

44. Cheng, Ebrahimi. Gamification: a novel approach to mental health promotion. Curr Psychiatry Rep 2023; 25: 577–586.

45. Greene, Bina, Gum. Interventions to increase retention in mental health services: a systematic review. Psychiatric Services [Internet]. 2016 Jan 4 [cited 2025 Apr 19]. https://psychiatryonline.org/doi/10.1176/appi.ps.201400591.

46. Baee, Eberle, Baglione, et al. Early attrition prediction for web-based interpretation bias modification to reduce anxious thinking: a machine learning study. JMIR Ment Health 2024; 11: e51567.

47. Sheikh, Leskin, Klein. Gender differences in panic disorder: findings from the national comorbidity survey. AJP 2002; 159: 55–58.

48. BinDhim, Althumiri, Al-Luhaidan, et al. Assessing attitudes toward mental health illnesses in Saudi Arabia: a national cross-sectional study. Int J Soc Psychiatry 2024; 70: 1118–1127.

49. Sheikh, Payne-Cook, Lisk, et al. Why do young men not seek help for affective mental health issues? A systematic review of perceived barriers and facilitators among adolescent boys and young men. Eur Child Adolesc Psychiatry 2025; 34: 565–583.

50. Wong, Siah. Estimation of clinical trial success rates and related parameters. Biostatistics 2019; 20: 273–286.

51. Beam, Kohane. Big data and machine learning in health care. JAMA 2018; 319: 1317–1318.

52. Martínez-Pérez, de la Torre-Díez, López-Coronado. Privacy and security in mobile health apps: a review and recommendations. J Med Syst 2015; 39: 181.

53. Park, Ahn, Yoon, et al. Performance of Fitbit devices as tools for assessing sleep patterns and associated factors. J Sleep Med 2024; 21: 59–64.

54. Ohst, Tuschen-Caffier. Catastrophic misinterpretation of bodily sensations and external events in panic disorder, other anxiety disorders, and healthy subjects: a systematic review and meta-analysis. PLOS ONE 2018; 13: e0194493.

55. Jang, Sun, Lee, et al. Machine learning prediction of impending panic symptoms using digital phenotypes: from over 2-year prospective longitudinal data [Internet]. Rochester, NY: Social Science Research Network; 2024 [cited 2025 Apr 14]. https://papers.ssrn.com/abstract=4848512.

56. Cackovic, Nazir, Marwaha. Panic disorder. In: StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2025 [cited 2025 May 13]. http://www.ncbi.nlm.nih.gov/books/NBK430973/.

Utility of wearable technology in predicting panic attacks: A scoping review
