Abstract
Objective
As an indicator of exercise intensity, heart rate can be measured in a timely manner using wrist-worn devices. No study has attempted to estimate a target exercise intensity using wearable devices. The objective of the study was to evaluate the validity of prescribing exercise intensity using wrist-worn devices.
Methods
Thirty healthy subjects completed a maximal cardiopulmonary exercise test. Their heart rates were recorded using an electrocardiogram and two devices—Apple Watch Series 6 and Garmin Forerunner 945. Exercise intensity with the target heart rate was defined as resting heart rate + (maximal heart rate − resting heart rate) *n% (n%: 40–60% for moderate-intensity exercise and 60–89% for vigorous-intensity exercise). Heart rate was analyzed at the lower and upper limits of each exercise intensity (HR40, HR60, and HR89). The mean absolute percentage error and concordance correlation coefficient were calculated, and Bland–Altman plots and scatterplots were constructed.
Results
Both devices showed a low mean absolute error (1.16–1.48 bpm for Apple and 1.35–2.25 bpm for Garmin) and a low mean absolute percentage error (<1% for Apple and 1.16–1.39% for Garmin) at all intensities. A substantial correlation with electrocardiogram-measured heart rate was observed from moderate to vigorous intensity, with a concordance correlation coefficient > 0.95 for both devices, except that Garmin showed a moderate correlation at the upper limit of vigorous activity (concordance correlation coefficient = 0.936). Moreover, Bland–Altman plots and scatterplots demonstrated a strong correlation without systematic error when the values obtained via the two devices were compared with electrocardiogram measurements.
Conclusions
Our findings indicate the high validity of exercise prescriptions based on the heart rate measured by the two devices. Additional research should explore other populations to confirm these findings.
Introduction
Regular aerobic exercise has significant health benefits, such as improving cardiovascular fitness, relieving insomnia, and strengthening the immune system.1,2 The positive effect of exercise on health depends on the intensity, duration, and frequency of the exercise routine. 3 The American College of Sports Medicine (ACSM) and the American Heart Association3,4 recommend that most adults aged 18–65 years perform aerobic physical activity of moderate intensity for at least 30 min/day, 5 days/week, or of vigorous intensity for at least 20 min/day, 3 days/week. The basis of this recommendation is that minimally intense exercise that does not exert the body would not achieve health benefits. 3 Thus, the ACSM recommends moderate (e.g. 40–59% heart rate reserve [HRR] or 64–76% maximal heart rate [HRmax]) to vigorous (e.g. 60–89% HRR or 77–95% HRmax) exercise for healthy adults.
The use of a target heart rate (HR) as a tool for exercise prescription is common 5 because variations in HR during exercise correlate with changes in exercise intensity. 6 However, an individual's HRmax is difficult to obtain. Formulas such as the Fox (HRmax = 220 − age) 7 and Tanaka (HRmax = 208 − 0.7 × age) equations have been used to estimate HRmax. 8 These formulas are simple to use but could underestimate or overestimate the measured HRmax.8–12 The ACSM guidelines state that a directly measured HRmax is preferred to an estimated value for greater accuracy in determining exercise intensity.
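The two age-based estimates mentioned above can be written out as a short sketch. The function names are hypothetical, and the ages in the usage note are illustrative only, not study data.

```python
def hrmax_fox(age_years: float) -> float:
    """Age-predicted maximal heart rate (bpm), Fox equation: 220 - age."""
    return 220.0 - age_years


def hrmax_tanaka(age_years: float) -> float:
    """Age-predicted maximal heart rate (bpm), Tanaka equation: 208 - 0.7 * age."""
    return 208.0 - 0.7 * age_years


if __name__ == "__main__":
    # Illustrative ages spanning the 20-40 year inclusion range of the study.
    for age in (20, 30, 40):
        print(age, hrmax_fox(age), round(hrmax_tanaka(age), 1))
```

For a 30-year-old, the two equations give estimates a few beats apart; neither takes the individual's measured response into account, which is why a directly measured HRmax is preferred.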
In the last decade, there has been a surge in the availability of wrist-worn devices and HR monitors in the market. Shipments of wrist-worn wearables increased nearly fivefold, from 18.7 million units in 2016 to 91 million units in 2020 (an increase of about 387%). Many consumers have used wearable devices to measure parameters such as HR, steps, or energy expenditure. 13 Using a consumer-based wearable device can prompt a significant increase in physical activity and a decrease in weight.13–15 Wrist-worn devices with photoplethysmography (PPG) sensors can detect changes in blood volume and serve as HR monitors. 16 Wrist-worn devices that measure HR using PPG signals may obtain an accurate HRmax. 17
Previous studies have compared HR measurements between wearable PPG and a reference electrocardiogram (ECG).18–21 Many researchers have investigated the reliability and validity of wearable devices for monitoring HR17,20,22–25 and have shown that wearable devices may slightly underestimate absolute HR.19,22,26,27 Two systematic reviews noted a slight tendency to underestimate HR, particularly during vigorous activity.17,22 In both laboratory and real-life settings, wearable devices detected HR with an acceptable level of accuracy.19,21,28,29 Researchers reported that arm movement can interfere with PPG signaling.18,24,30,31 Exercises in which the wrist was stable, such as walking and stationary cycling, provided more accurate heartbeat measurements than exercises in which the wrist was unstable, such as using elliptical machines and intermittent exercises.25,31,32
To date, however, previous researchers have only focused on the traditional reliability and validity of HR measurement using wearable devices during various physical activities or during exercises of different intensities. No study has attempted to estimate a target exercise intensity using wearable devices. Therefore, the primary objective of this study was to evaluate the accuracy of exercise intensity measurement based on the HRmax measured by wearable devices. The secondary objective was to compare Garmin Forerunner 945 and Apple Watch Series 6 to determine the more accurate device.
Methods
Participants
The experiment was conducted at a large medical center, and all participants were recruited through open recruitment or personal introduction. To minimize the risk of complications, the included individuals comprised healthy subjects aged between 20 and 40 years who answered “no” to all questions in the 2019 Physical Activity Readiness Questionnaire. 32 Individuals were excluded if they had a resting blood pressure > 140/90 mmHg, a body mass index > 30 kg/m², or a resting HR > 100 beats per min; were taking medications that affect HR; were incapable of performing exercises of vigorous intensity (e.g. running, climbing, cycling fast, and playing basketball, football, and other competitive sports); or were pregnant.
Devices and data collection
All subjects performed formal ramp incremental exercise tests on an electronically braked cycle ergometer equipped with a face mask (Ergospirometry, Cortex, Germany) and a 12-lead electrocardiogram for recording HR and heart rhythm during the exercise test.
Exercise protocol
The exercise protocol started with the subjects resting on a seat for 3 min, followed by a warmup with unloaded cycling at 0 W for 1 min, and a 15-W/min ramp protocol at a pedaling rate of 60–70 r/min. The tests were terminated when the subjects were exhausted and unable to maintain 60 r/min. The subjects’ HR data were simultaneously recorded by the Apple Watch at 5-s intervals, the Garmin Forerunner at 1-s intervals, and the ECG.
Determination of exercise intensity
Resting HR was obtained from the average HR measured by both wearable devices and ECG during the 3-min resting phase. HRmax was defined as the highest HR value throughout the exercise test.
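The HRR formula suggested by ACSM for establishing exercise intensity in the general population is: target HR = resting HR + (HRmax − resting HR) × n%, where n% is 40–60% for moderate-intensity exercise and 60–89% for vigorous-intensity exercise. HR was analyzed at the lower and upper limits of each exercise intensity (HR40, HR60, and HR89).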
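As a minimal sketch of the HRR-based target heart rate, resting HR + (HRmax − resting HR) × n%, the intensity limits analyzed in this study (HR40, HR60, and HR89) could be computed as follows. The function names and the example resting and maximal heart rates are hypothetical and are not taken from the study data.

```python
def target_hr(hr_rest: float, hr_max: float, fraction: float) -> float:
    """Target heart rate (bpm) at a given fraction (0-1) of heart rate reserve."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    if hr_max <= hr_rest:
        raise ValueError("hr_max must exceed hr_rest")
    return hr_rest + (hr_max - hr_rest) * fraction


def intensity_limits(hr_rest: float, hr_max: float) -> dict:
    """Lower and upper limits of moderate and vigorous intensity."""
    return {
        "HR40": target_hr(hr_rest, hr_max, 0.40),  # lower limit of moderate intensity
        "HR60": target_hr(hr_rest, hr_max, 0.60),  # cut-off between moderate and vigorous
        "HR89": target_hr(hr_rest, hr_max, 0.89),  # upper limit of vigorous intensity
    }


if __name__ == "__main__":
    # Hypothetical subject: resting HR 70 bpm, measured HRmax 190 bpm.
    print(intensity_limits(70.0, 190.0))
```

Because every limit depends on both the resting HR and the HRmax, an error in either measurement carries through to all three thresholds, which is why the limits themselves are compared against ECG.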
Statistical analysis
Subject characteristics and cardiopulmonary exercise testing (CPET) data are presented as means with standard deviations or as percentages. Means and standard deviations of HR were calculated for ECG, the two devices, and the two age-related HR prediction formulas. Mean absolute error and mean absolute percentage error relative to ECG were calculated for both devices and both formulas at the lower and upper limits of each exercise intensity. Bland–Altman plots with the mean difference (MD) and 95% limits of agreement (LoA, MD ± 1.96 × SDD, where SDD is the standard deviation of the differences) were used to describe the agreement between the intensities measured by each device and those measured via ECG.
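The error and agreement statistics described above can be illustrated with a short sketch. The function names and the sample values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) for SDD is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def _check(device, reference):
    if len(device) != len(reference) or len(device) < 2:
        raise ValueError("need two equal-length series with at least 2 values")


def mae(device, reference):
    """Mean absolute error (bpm) of device readings against the reference."""
    _check(device, reference)
    return mean(abs(d - r) for d, r in zip(device, reference))


def mape(device, reference):
    """Mean absolute percentage error (%) relative to the reference."""
    _check(device, reference)
    return 100.0 * mean(abs(d - r) / r for d, r in zip(device, reference))


def bland_altman(device, reference):
    """Return (MD, lower LoA, upper LoA), with LoA = MD +/- 1.96 * SDD."""
    _check(device, reference)
    diffs = [d - r for d, r in zip(device, reference)]
    md = mean(diffs)
    sdd = stdev(diffs)  # assumption: sample standard deviation of the differences
    return md, md - 1.96 * sdd, md + 1.96 * sdd


if __name__ == "__main__":
    # Hypothetical target heart rates (bpm) for four subjects, not study data.
    watch = [118, 121, 125, 130]
    ecg = [120, 120, 124, 132]
    print(mae(watch, ecg), mape(watch, ecg), bland_altman(watch, ecg))
```

A mean difference close to zero with narrow limits of agreement indicates little systematic error, whereas MAE and MAPE summarize the typical size of the error regardless of its direction.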
Results
Thirty subjects (15 male and 15 female) were included in the study. Table 1 displays the characteristics of all subjects. All participants discontinued CPET when they felt exhausted, that is, when they felt fatigued in their lower extremities.
Table 1. Subject characteristics.
CPET: cardiopulmonary exercise testing; VO2: oxygen consumption.
Statistical results regarding the accuracy and precision of measurement using Apple Watch versus ECG are summarized in Table 2. The concordance correlation coefficient showed a substantial correlation between Apple Watch and 12-lead ECG, from moderate to vigorous-intensity exercises, with a concordance correlation coefficient of 0.977–0.979. Apple Watch showed very low error, with a mean absolute percentage error of < 1% in moderate to vigorous-exercise intensities. Figures 1 and 2 depict the Bland–Altman plots and scatterplots of Apple Watch in comparison with ECG. Both plots indicate good correlation and low systematic error.
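For readers unfamiliar with the concordance correlation coefficient, the sketch below assumes it refers to Lin's coefficient, computed with population (1/n) variances and covariance; the text does not state which estimator or software was used. The function name and the sample values are hypothetical.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient between two paired series.

    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y)) ** 2),
    using population (1/n) moments. Assumes the series are not both constant
    and equal, which would make the denominator zero.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


if __name__ == "__main__":
    # Hypothetical paired readings (bpm), not study data.
    ecg = [120, 124, 132, 141]
    watch = [119, 125, 131, 142]
    print(round(concordance_correlation(watch, ecg), 3))
```

Unlike the Pearson correlation, this coefficient is reduced by any constant offset between the two methods: two series that differ by a fixed number of beats are perfectly correlated but not perfectly concordant.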

Bland–Altman plot showing agreement between Apple Watch Series 6 and electrocardiogram (ECG) in all subjects.

Bland–Altman plot showing agreement between Garmin Forerunner 945 and electrocardiogram (ECG) in all subjects.
Table 2. Means, errors, and correlations between Apple/Garmin/age-predicted equation and ECG.
SD: standard deviation; HR40: lower limit of moderate intensity; HR60: cut-off between moderate and vigorous intensity; HR89: upper limit of vigorous-intensity; ECG: electrocardiogram; AW: Apple Watch; GF: Garmin Forerunner; SDD: standard deviation of difference; LoA: limit of agreement; MAE: mean absolute error; MAPE: mean absolute percentage error; CCC: concordance correlation coefficient.
The results of the statistical analysis of Garmin Forerunner and ECG are summarized in Table 2. The concordance correlation coefficient showed a substantial correlation in moderate-intensity exercises, but a moderate correlation was noted in the upper limit of vigorous-intensity exercises. The mean absolute error and mean absolute percentage error of Garmin were slightly higher than those of Apple Watch in moderate and vigorous-intensity exercises. Figures 3 and 4 depict the Bland–Altman plots and scatterplots of Garmin Forerunner in comparison with ECG. The plots show a high correlation without systematic error, from moderate to vigorous-intensity exercises.

Scatter plot and concordance correlation coefficient (CCC) showing the strength of association between Apple Watch Series 6 and electrocardiogram (ECG) in all subjects.

Scatter plot and concordance correlation coefficient (CCC) showing the strength of association between Garmin Forerunner 945 and electrocardiogram (ECG) in all subjects.
Exercise prescription based on age-related HR prediction for all subjects is shown in Table 2. Fox and Tanaka’s equations overestimated the subjects’ HRmax, and the error rate was more pronounced as exercise intensity increased. The CCC of both equations showed a poor correlation with ECG, and the correlation decreased with an increase in exercise intensity.
Discussion
The study validates the accuracy of wrist-worn devices for establishing exercise intensity through their HR assessment function. We found that the exercise intensity prescribed by the two wrist-worn devices was highly consistent with that prescribed by 12-lead ECG, from moderate to vigorous-intensity exercise, in healthy adults.
As shown in Tables 2 and 3, the overall errors of Apple Watch and Garmin Forerunner were much smaller than those in previous studies. A review article indicated an error of about 1.2% to 6.7% for the Apple Watch,34 while in our research, an error of less than 1% for the Apple Watch Series 6 was noted. The discrepancy is most likely attributable to differences in study design. Previous studies focused on the accuracy of HR monitoring at different exercise intensities. Most previous studies compared simultaneous measurements of HR using 12-lead ECG and wearable devices to calculate the accuracy of device-measured HR.18,22,23,26 There might have been some time lag in the display of HR between the devices. Furthermore, some studies averaged HR over different lengths of time, and other HR data were retrieved at different time intervals.18,23 In contrast, our research used the HRmax obtained by the device during CPET; thus, there was no concern regarding time delay and heterogeneity in the time interval.
In this study, the exercise intensity prescribed by Apple Watch seemed more precise and was associated with a lower error rate than that prescribed by Garmin Forerunner at each intensity. The Apple Watch uses two types of sensors, green and infrared light-emitting diodes (LEDs), to detect the amount of blood flowing through the wrist. 35 Conversely, Garmin uses only green LEDs to detect HR. 36 The green LED sensor is resistant to motion but is limited in its tissue penetration, whereas the infrared LED has better tissue penetration but is susceptible to motion artifacts. 37 The combination of sensors in the Apple Watch might be helpful in filtering noise and might further decrease the error rate. However, the overall error for Garmin remained very low (1.16–1.39%), from moderate to vigorous-intensity exercises.
In addition, in the present study, the Fox and Tanaka equations overestimated HRmax with an error > 10% in 33.3% (10/30) and 26.7% (8/30) of participants, respectively. The findings indicate that there is a limit to the ability of these formulas to accurately determine exercise intensity. The error in the determination of HR using the formulas increased with increasing exercise intensity. Moreover, the error in the predicted HRmax observed in our study is consistent with those reported by other researchers.9–11 Currently, it is possible to determine HRmax without equipment using the Fox and Tanaka equations; these formulas are commonly used by the general population.7–9 Although studies have suggested some issues with the age-related prediction of HRmax,9,38 many people use the traditional formula of “220 minus age” in training programs for convenience. However, it should be noted that if exercise intensity is not set according to individual differences, people might be injured due to excessive exercise intensity or fail to achieve the expected effect of exercise due to insufficient exercise intensity.
Recent systematic reviews have shown a dose–response relationship between physical activity and the primary and secondary prevention of several chronic diseases, as well as premature death. 39 However, there are several personal, societal, and environmental barriers to adopting and maintaining physical activity. 40 In a survey, about 29% of individuals said “I don't know how to do it.” 3 It is important for individuals to build self-efficacy using an appropriate strategy. Goal-setting (i.e. in the frequency, duration, and intensity of exercise) before engaging in exercise can lead to positive changes in physical activity behavior. 41 Wrist-worn devices may assist in goal-setting and provide feedback to individuals. Feedback, including the frequency and duration of exercise, HR during exercise, and the distance traveled or the number of steps, is commonly used for self-monitoring during exercise. Self-monitoring, which involves observing and recording behavior, has been shown to have a positive effect on changes in physical activity behavior. 41
There are some limitations to this study. First, only young and healthy participants were recruited; thus, the generalizability of the study findings is limited. Second, the results are based on an ergometer ramp protocol and may not be generalized to other forms of exercise or exercise protocols such as treadmill-based tests and increment protocols. Although previous systematic studies have shown that the accuracy of HR measurement using treadmills was higher than that using ergometers, it remains uncertain whether exercise intensity can be accurately calculated via other testing modalities (e.g. treadmills).
This is the first study to analyze the validity of exercise intensity prescription using wrist-worn devices. Previous systematic reviews showed that it is acceptable to monitor HR with wearable devices during exercise. Our findings confirm the high validity of using a watch to establish exercise intensity. Consequently, healthy adults may only require wearable devices and exercise equipment to establish the exercise intensity required for their training, and whether their HR falls within the training target throughout the exercise could be monitored synchronously.
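To illustrate how an HR reading could be checked against the training target during exercise, the sketch below classifies a reading by its fraction of heart rate reserve, using the 40%, 60%, and 89% limits from this study. The function name, the zone labels, the handling of readings that fall exactly on a limit, and the example values are all hypothetical.

```python
def classify_intensity(hr: float, hr_rest: float, hr_max: float) -> str:
    """Label a heart rate reading by its fraction of heart rate reserve."""
    if hr_max <= hr_rest:
        raise ValueError("hr_max must exceed hr_rest")
    reserve_fraction = (hr - hr_rest) / (hr_max - hr_rest)
    if reserve_fraction < 0.40:
        return "below moderate"
    if reserve_fraction < 0.60:
        return "moderate"
    if reserve_fraction <= 0.89:
        return "vigorous"
    return "above vigorous"


if __name__ == "__main__":
    # Hypothetical subject: resting HR 70 bpm, measured HRmax 190 bpm.
    for reading in (100, 130, 150, 185):
        print(reading, classify_intensity(reading, 70.0, 190.0))
```

In practice the resting HR and HRmax would come from the individual's own measurements rather than from an age-based estimate, in line with the findings above.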
Conclusion
In conclusion, we found that the exercise intensities prescribed by commercial wearable devices, including Apple Watch and Garmin Forerunner, were highly consistent with those prescribed by ECG in healthy adults. Future research on this subject should explore other populations, such as elite athletes or the elderly, and investigate other exercise testing modalities, such as treadmills.
Footnotes
Acknowledgements
The authors would like to acknowledge and thank the participants for their time and willingness to take part in this study.
Availability of data and material
The datasets generated during and/or analyzed during the current study are not publicly available for privacy purposes, because they include personal health information data; however, the datasets are available from the corresponding author on reasonable request.
Contributorship
W-T Ho and T-C Li participated in the study concept and design. W-T Ho collected the data. Y-J Yang participated in the analysis and interpretation of the data. T-C Li drafted the manuscript, and all authors critically revised the manuscript. All authors read and approved the final manuscript.
Consent to participate
All subjects participated in this study voluntarily and informed consent was obtained from all participants.
Consent for publication
All participants consented to the publication of their data.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethics approval
This study was approved by the Cathay General Hospital Research Ethics Committee (reference number: CGH-P109050).
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by project CGH-MR-B10911 from the Cathay General Hospital, Taipei, Taiwan.
Guarantor
T-C Li.
