Abstract

Background

COPD underdiagnosis persists in China due to limited spirometry access. Smart wearables enabling cough and physiological monitoring (SpO₂, respiratory rate) offer a scalable screening solution.

Methods

Participants were randomly allocated to training and validation cohorts. All underwent cough sound recordings, smartwatch monitoring (heart rate variability, respiratory rate, oxygen saturation), and pre-/post-bronchodilator spirometry. Machine learning algorithms extracted cough sound features to predict lung function (evaluated via MAE, Pearson correlation, and Bland-Altman analysis). These predictions were combined with physiological data in a multimodal COPD screening model, with diagnostic performance assessed against physician diagnosis.

Results

The training cohort included 178 patients (112 males) with COPD or pulmonary dysfunctions, aged 54.42 ± 14.77 years, BMI 24.81 ± 3.73 kg/m², FVC 3.64 ± 1.09 L, and FEV₁ 2.42 ± 0.96 L, alongside 298 healthy volunteers (151 males) aged 35.30 ± 12.35 years, BMI 22.62 ± 3.12 kg/m², FVC 3.63 ± 0.89 L, and FEV₁ 3.14 ± 0.73 L. The validation cohort comprised 47 COPD patients (35 males) aged 65.53 ± 7.62 years, BMI 25.38 ± 4.38 kg/m², FVC 3.27 ± 0.59 L, and FEV₁ 1.91 ± 0.50 L, and 71 healthy controls (27 males) aged 45.51 ± 12.15 years, BMI 25.79 ± 4.00 kg/m², FVC 3.35 ± 0.80 L, and FEV₁ 2.72 ± 0.67 L. Using cough sounds, the model's mean absolute error for FEV₁/FVC, FVC%, and FEV₁% prediction was 7.4%, 10.6%, and 17.78%, respectively, compared to spirometry (Tables 3–5). Significant correlations were found between predicted and measured FVC (r = 0.798, P < 0.001), FEV₁ (r = 0.752, P < 0.001), and FEV₁/FVC (r = 0.784, P < 0.001) (Table 6). Combined with physiological parameters, our model's overall accuracy, sensitivity, and specificity for differentiating between COPD and normal controls were 87.82%, 86.96%, and 88.73% (Table 9).

Conclusion

Our wearable-based algorithm effectively screens for ventilatory dysfunction and COPD, showing potential for large-scale population screening to reduce medical burdens.

Trial Registration

Chinese Clinical Trial Registry (International Clinical Trials Registry Platform of the World Health Organization); Clinical Trial Number: ChiCTR2100050843; Registration Date: 2021-09-04. https://www.chictr.org.cn/showproj.html?proj=126556

Keywords

COPD; pulmonary function test; wearable devices; smartwatch; cough sound analysis

Introduction

Chronic respiratory diseases, including chronic obstructive pulmonary disease (COPD) and asthma, affect hundreds of millions of people worldwide.¹ Recent large-scale epidemiological studies in China have shown that the overall prevalence of COPD is 8.6%, which translates to 99 million COPD patients, who largely go undiagnosed and untreated.² The 2023 GOLD guidelines define COPD as a heterogeneous lung disease characterized by persistent respiratory symptoms (e.g., dyspnea, cough, sputum, and/or exacerbations) due to airway and/or alveolar abnormalities, leading to persistent and often progressive airflow obstruction.³ Although the pulmonary function test (PFT) remains the gold standard, COPD diagnosis also requires integrating symptoms and medical history. A post-bronchodilator FEV₁/FVC < 0.7 confirms COPD.³ Despite spirometry's well-established role in COPD diagnosis, its strikingly low utilization rate (merely 12.0% among COPD patients) carries profound negative consequences for timely respiratory disease detection.²

Additionally, PFTs are not routinely used in emergency departments or primary care settings due to a lack of experienced practitioners and the low availability of specialized equipment. The test requires patient cooperation, experienced staff, and repeated testing to ensure measurement consistency. COPD prevalence in people over 40 years old rose by approximately 67% between 2002–2004 and 2012–2015, which further indicates the growing need for screening.¹ Moreover, the global pandemic of COVID-19, a respiratory infectious disease, has hindered the widespread use of PFT.⁴ Consequently, the basic ability to measure lung capacity remains limited, leading to incorrect or missed diagnoses of respiratory diseases, particularly in developing countries.

Smart wearable devices may provide new solutions for addressing this problem. In recent years, due to significant advancements in sensors, microprocessors, body area networks, and wireless data transmission, such devices have come to be widely used in the healthcare field.⁵ Ubiquitous human health monitoring is becoming a reality, and wearable technologies allow for the assessment of physical, physiological, and biochemical parameters across different environments without restrictions on activity.⁶ In addition, wearable devices are becoming increasingly popular with consumers, due to their low cost, portability, ease of use, and non-interference with the wearer's freedom of movement; this makes real-time monitoring of health status possible for both healthcare institutions and users. Lesions in the lung lead to decreased ventilatory and air-exchange function and entail respiratory symptoms, such as abnormal cough (cough with a tail sound, low-frequency energy increase, and so forth) or abnormal physiological parameters (reduced heart rate variability (HRV) and increased respiratory rate, increased body temperature, decreased blood oxygen, and so forth).^7–10 By establishing a mapping relationship between cough and the physiological parameters collected by smart wearable devices to evaluate the state of the lung, COPD screening may be possible.

Given the high prevalence and underdiagnosis of COPD in China, coupled with limited access to traditional pulmonary function tests, there is an urgent need for simpler and more accessible screening methods. Smart sensing devices, such as wearables, offer the advantage of conveniently collecting multi-dimensional physiological parameters (e.g., acoustic signals, pulse rate, oxygen saturation). This capability provides a novel and effective approach for enabling early and convenient screening of lung function and COPD, with the potential to significantly increase screening coverage and opportunities for early intervention. Currently, published research on predicting pulmonary function through cough sounds predominantly employs audio features of cough sounds to forecast pulmonary functional metrics.^7,8 However, there is a relative dearth of studies that integrate physiological parameter data with cough sound-predicted pulmonary function for COPD screening. Existing research focused on assessing COPD through physiological parameters primarily pertains to determining the severity of COPD or identifying instances of acute exacerbation,^9–12 or differentiating COPD from other respiratory diseases or healthy controls.^13,14 The inherent convenience of wearable smart devices offers a transformative opportunity for large-scale population screening, while practical application and iterative feedback provide avenues for continuous algorithmic refinement. Building on prior knowledge of the mechanism by which cough sounds are generated and their ability to reflect lung ventilation function,⁶ this study proposes an innovative approach leveraging data collected from smartwatches. Specifically, we aim to utilize cough sounds as a non-invasive proxy for pulmonary function assessment, while simultaneously employing physiological parameters to capture the distinctive physiological features of COPD patients. 
By integrating these two data streams, we seek to develop a robust algorithm capable of identifying characteristic signals in COPD patients, with the potential to achieve higher accuracy than previous studies. The ultimate goal of this research is to enable early detection of respiratory function decline, facilitate timely referral of high-risk individuals for comprehensive clinical evaluation, and promote early diagnosis and treatment of COPD, thereby reducing the overall burden on healthcare systems.

Methods

Recruitment and data collection

This prospective diagnostic accuracy study followed STARD 2015 reporting standards after obtaining ethical approval from the Chinese PLA General Hospital's ethics committee (Approval No. S2021-663-01). We recruited participants concurrently at the pulmonary function laboratories of both the First and Eighth Medical Centers (Figure 1). Each participant provided written informed consent before we collected physiological parameters and performed pulmonary function tests.

Figure 1.

STARD flow diagram of participant recruitment and analysis process.

Participants aged 18 years or older who could cooperate with PFTs and the collection of cough sounds and physiological parameters were included. Exclusion criteria were: (1) history of pneumonectomy, vocal cord surgery, or nasopharyngeal surgery; (2) vocal cord injury, tracheal or main bronchus obstruction, or neuromuscular disease involving respiratory muscle weakness; and (3) pregnancy or inability to cooperate with study requirements.

We recruited two groups for sample collection. The first group comprised patients who visited the Respiratory Department between June 1 and October 11, 2022, for lung function tests. The second group consisted of healthy volunteers without respiratory symptoms who also underwent lung function testing. Professional pulmonary function technicians performed all tests using standardized equipment (Masterscreen-PFT, Carefusion Germany 234 GmbH) according to ERS/ATS guidelines.¹⁵ The predictive equations used to assess pulmonary function have been reported in the literature.¹⁶

For assessment, we recorded lung ventilation function indicators (FEV₁, FVC, and FEV₁/FVC ratio). FEV₁ measures the volume of air exhaled in the first second after maximal inhalation and is critical for assessing airway obstruction in diseases like COPD and asthma. FVC represents the total volume of air exhaled forcefully after maximal inhalation, aiding in pulmonary fibrosis diagnosis. The FEV₁/FVC ratio helps differentiate obstructive from restrictive lung diseases, with a reduced ratio indicating airway obstruction. We then collected cough sounds using a smartwatch and recorded HRV and blood oxygen parameters via photoplethysmography (PPG) pulse wave signals over 1 min. Research technicians recorded cough sounds 5 min after spirometry in the PFT room using the following protocol:

1. After completing lung function tests, patients rested for 5 min. We then collected physiological signal data using a smartwatch (HUAWEI Watch GT 3) to capture PPG and acceleration signals (ACC) for 1 min (Figure 2). These signals provided essential inputs for computing HRV, blood oxygen saturation, and respiratory rate.

2. Patients inhaled to maximum lung capacity and voluntarily coughed forcefully 2–3 times. We positioned the smartwatch approximately 30 cm from the subject's mouth at a 45° angle (Figure 2).

3. We repeated step 2 once.

4. Because consecutive cough events following a single inhalation are not equivalent,¹⁷ they cannot serve as precise lung function indicators. After resting 2–3 min, subjects repeated steps 2 and 3.

Figure 2.

Watch placement to collect cough sounds and physiological parameters. Note: the 45° placement is to prevent the direct airflow from a patient's cough into the microphone, which may result in plosive sounds and subsequently degrade the quality of the collected audio. The fixed distance is maintained to mitigate variations in volume caused by different distances, thereby preserving the consistency of audio energy for comparative analysis.

We sampled cough sounds at 16 kHz, PPG at 25 Hz, and ACC at 100 Hz. Researchers made recordings in realistic hospital environments with background noise including conversations, medical equipment sounds, footsteps, and doors closing. A second sample population (patients visiting from October 15 to November 11, 2022) was used to validate the algorithm's effectiveness in two aspects: (1) pulmonary ventilation function indicators (FVC and FEV₁/FVC ratio) and (2) COPD screening accuracy. Finally, we compared algorithm-identified COPD patients with clinical diagnoses to assess the algorithm's accuracy, sensitivity, and specificity.

Data processing

Cough signal processing comprised audio preprocessing, sound segment extraction, non-cough sound segment exclusion, feature extraction, and feature aggregation (Supplementary Table 2).

Audio preprocessing: We applied a high-pass finite impulse response filter to each cough audio to remove background low-frequency noise (cutoff frequency: 200 Hz). Then speech enhancement and pre-emphasis methods were used to increase the high-frequency resolution of the cough sounds.
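The filtering step above can be sketched in plain Python. This is only an illustration under stated assumptions: the text gives the 200 Hz cutoff and the 16 kHz sampling rate, but the filter length (101 taps), the Hamming-windowed-sinc design, and the pre-emphasis coefficient (0.97) are hypothetical choices, and the speech enhancement step is not reproduced.

```python
import math


def highpass_fir(cutoff_hz, sample_rate_hz, num_taps=101):
    """Design a linear-phase high-pass FIR filter.

    Built by spectral inversion of a Hamming-windowed sinc low-pass filter.
    num_taps is an assumed value (not given in the text) and must be odd.
    """
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd")
    fc = cutoff_hz / sample_rate_hz  # normalized cutoff in cycles per sample
    mid = (num_taps - 1) // 2
    lowpass = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        lowpass.append(ideal * window)
    gain = sum(lowpass)
    lowpass = [h / gain for h in lowpass]  # unity gain at 0 Hz
    highpass = [-h for h in lowpass]
    highpass[mid] += 1.0  # unit impulse minus low-pass = high-pass
    return highpass


def fir_filter(signal, taps):
    """Apply a causal FIR filter: y[n] = sum_k taps[k] * x[n - k]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k < 0:
                break
            acc += h * signal[n - k]
        out.append(acc)
    return out


def pre_emphasis(signal, alpha=0.97):
    """Boost high frequencies: y[n] = x[n] - alpha * x[n - 1] (alpha assumed)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]


taps = highpass_fir(cutoff_hz=200, sample_rate_hz=16_000)
```

A constant (0 Hz) input is driven to zero once the filter has filled, while components near the Nyquist frequency pass almost unchanged.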

Spectrogram generation (Supplementary Figure 1): To analyze the frequency characteristics of cough sounds, the time-domain signals were transformed into the frequency domain using the Fast Fourier Transform (FFT) to generate spectrograms. The spectrograms visualize the energy distribution of the audio signals across different frequencies, providing an intuitive representation of the differences in cough sound characteristics between healthy controls, asthma patients, and COPD patients. The specific parameters were: a sampling rate of 16 kHz, an FFT window length of 1024 samples, and an overlap of 50%. We subsequently used the generated spectrograms for feature extraction and algorithmic analysis.
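The stated spectrogram parameters fix the time and frequency resolution of the analysis. The short sketch below works out those quantities; the function name `stft_layout` is hypothetical, and it assumes no padding at the signal edges, which the text does not specify.

```python
def stft_layout(num_samples, sample_rate_hz=16_000, window_length=1024, overlap=0.5):
    """Derive the frame layout implied by a window length and overlap.

    Assumes frames are taken only where a full window fits (no edge padding).
    """
    hop = int(window_length * (1 - overlap))  # samples between frame starts
    if num_samples < window_length:
        num_frames = 0
    else:
        num_frames = 1 + (num_samples - window_length) // hop
    return {
        "hop_samples": hop,
        "frame_duration_ms": 1000 * window_length / sample_rate_hz,
        "frequency_resolution_hz": sample_rate_hz / window_length,
        "num_frequency_bins": window_length // 2 + 1,  # one-sided spectrum
        "num_frames": num_frames,
    }


layout = stft_layout(num_samples=16_000)  # one second of audio
```

With these settings each frame spans 64 ms, adjacent frames start 512 samples apart, and frequency bins are 15.625 Hz wide.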

Sound segment extraction: silent segments were excluded using the root mean square energy and zero-crossing rate.
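A minimal sketch of energy-based segment extraction follows. The text names the two features but not how they were combined, so the decision rule here (keep a frame when its RMS energy is above a floor and its zero-crossing rate is below a ceiling), the 400-sample frame length, and both thresholds are assumptions for illustration only.

```python
import math


def frame_rms(frame):
    """Root mean square energy of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def frame_zcr(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def active_segments(signal, frame_length=400, rms_floor=0.02, zcr_ceiling=0.5):
    """Return (start, end) sample indices of contiguous non-silent frames.

    frame_length, rms_floor and zcr_ceiling are hypothetical values.
    """
    if frame_length < 2:
        raise ValueError("frame_length must be at least 2")
    segments = []
    start = None
    num_frames = len(signal) // frame_length
    for i in range(num_frames):
        frame = signal[i * frame_length:(i + 1) * frame_length]
        is_active = frame_rms(frame) >= rms_floor and frame_zcr(frame) <= zcr_ceiling
        if is_active and start is None:
            start = i * frame_length
        elif not is_active and start is not None:
            segments.append((start, i * frame_length))
            start = None
    if start is not None:
        segments.append((start, num_frames * frame_length))
    return segments


# Synthetic example: silence, a 500 Hz burst sampled at 16 kHz, then silence.
burst = [0.5 * math.sin(2 * math.pi * 500 * n / 16_000) for n in range(800)]
example = [0.0] * 800 + burst + [0.0] * 800
```

Calling `active_segments(example)` isolates the burst and drops the silent stretches on either side.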

Non-cough segment exclusion: We excluded non-cough sound segments segment by segment, using a machine learning recognition model. This recognition model is a binary classification model (accounting for cough and non-cough sounds), for which the training data include cough sounds and speech sounds.

Feature extraction: features including the short-time Fourier transform (STFT), Mel-frequency cepstral coefficients (MFCC), spectral contrast, spectral centroid, spectral bandwidth, zero-crossing rate, and root mean square energy were extracted frame by frame.

Feature aggregation: We calculated aggregation features including mean, variance, median, kurtosis, and skewness of all frames in one cough segment, which we then used for model building. The PPG data were filtered first, and then RRI (R-R Interval), blood oxygen, and respiration rate features were calculated. RRI data were further used to calculate HRV and rate-specific features (including SDNN, power in different frequency bands, and so forth).
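The aggregation step can be illustrated with standard-library Python. The statistical conventions below (population variance, moment-based skewness, excess kurtosis, sample standard deviation for SDNN) are assumptions, since the text does not state which definitions were used; the function names are hypothetical.

```python
import math
import statistics


def aggregate_frames(values):
    """Summarize one per-frame feature across all frames of a cough segment."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # population variance
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "variance": m2,
        "median": statistics.median(values),
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0,  # excess kurtosis
    }


def sdnn(rr_intervals_ms):
    """SDNN: standard deviation of R-R intervals, in milliseconds."""
    return statistics.stdev(rr_intervals_ms)


summary = aggregate_frames([1.0, 2.0, 3.0, 4.0, 5.0])
hrv_sdnn = sdnn([800, 810, 790, 805, 795])
```

Applying `aggregate_frames` to each frame-level feature turns a variable-length cough segment into a fixed-length feature vector, which is what a tabular model such as XGBoost requires.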

Algorithm establishment

Due to the distinct mapping relationships between cough sounds and the spirometric indicators FEV₁/FVC and FVC, we developed two separate XGBoost regression models to independently predict FEV₁/FVC and FVC, thereby minimizing prediction errors. Both models were constructed following the same methodological framework, as illustrated in Figure 3. The feature sets used in these models include aggregated features derived from cough sounds, HRV features extracted from R-R intervals, and blood oxygen saturation (SpO₂) and respiratory rate obtained from PPG signals (the feature extraction process was detailed earlier).

Figure 3.

The procedure of the algorithm establishment and prediction.

We employed the XGBoost regression algorithm to build both the FEV₁/FVC and FVC models, utilizing the selected feature subsets. Optimal hyperparameters for XGBoost were determined through five-fold cross-validation on the training dataset. To validate the models, the COPD screening results were compared against the gold standard of hospital-based diagnoses, with performance evaluated based on accuracy, sensitivity, and specificity.
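The five-fold cross-validation used for hyperparameter selection can be sketched as follows. This is not the study's code: the regressor is passed in as generic `fit` and `predict` callables, and the example uses a trivial mean predictor as a placeholder for XGBoost. The shuffling seed and the choice of MAE as the selection score are also assumptions.

```python
import random


def k_fold_indices(num_items, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(num_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        validation = folds[held_out]
        train = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        yield train, validation


def cross_validated_mae(fit, predict, features, targets, k=5, seed=0):
    """Mean absolute error pooled over the k held-out folds."""
    errors = []
    for train, validation in k_fold_indices(len(targets), k, seed):
        model = fit([features[i] for i in train], [targets[i] for i in train])
        predictions = predict(model, [features[i] for i in validation])
        errors.extend(abs(p - targets[i]) for p, i in zip(predictions, validation))
    return sum(errors) / len(errors)


# Placeholder model: always predicts the mean of the training targets.
def fit_mean(train_features, train_targets):
    return sum(train_targets) / len(train_targets)


def predict_mean(model, new_features):
    return [model] * len(new_features)
```

In practice `cross_validated_mae` would be evaluated once per candidate hyperparameter setting and the setting with the lowest score retained. Because each subject contributes several coughs, one may also prefer to split by subject rather than by recording so that the same person never appears in both training and validation folds; the text does not say which was done.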

The final COPD classification was derived by integrating the predicted spirometric values (FEV₁/FVC and FVC) from the regression models with additional physiological data collected from the smartwatch, including heart rate variability (HRV), blood oxygen saturation (SpO₂), and respiratory rate. We applied a threshold-based approach to the predicted FEV₁/FVC ratio—a widely accepted criterion in clinical practice for COPD diagnosis—to classify individuals as either healthy or likely to have COPD. Furthermore, the physiological parameters were leveraged to refine the classification, as they provide complementary insights into the patient's respiratory and cardiovascular health, enhancing the overall robustness of the screening algorithm.
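The threshold step can be written down directly. The cutoff used on the predicted ratio is not stated in the text, so the sketch below assumes the conventional fixed ratio of 0.70 mentioned in the Introduction; the function name is hypothetical, and the refinement with HRV, SpO₂, and respiratory rate is not specified in enough detail to reproduce, so it is omitted.

```python
def flag_airflow_obstruction(predicted_fev1_fvc, threshold=0.70):
    """Return True when the predicted FEV1/FVC ratio falls below the threshold.

    predicted_fev1_fvc is a fraction between 0 and 1 (e.g. 0.58, not 58).
    The default threshold of 0.70 is an assumption based on the fixed-ratio
    criterion; the study's actual cutoff is not reported.
    """
    if not 0.0 <= predicted_fev1_fvc <= 1.0:
        raise ValueError("ratio must be expressed as a fraction between 0 and 1")
    return predicted_fev1_fvc < threshold
```

For example, a predicted ratio of 0.58 would be flagged for referral, whereas 0.81 would not.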

Statistical analysis

Continuous variables are presented as means ± standard deviations. The normality of continuous variables was tested using the Shapiro-Wilk test. Intergroup differences in normally distributed variables were compared using a t-test. For normally distributed continuous variables, correlation analysis was performed using Pearson's method; Spearman rank correlation analysis was used for variables that did not conform to a normal distribution. Bland-Altman analysis was used to evaluate the agreement between the algorithm-derived parameters and the gold standard PFT, and the consistency of the two methods was compared using intraclass correlation coefficient analysis. Two-tailed P < 0.05 was considered statistically significant. Statistical analysis was performed using IBM SPSS Statistics, version 26.0 (IBM, Chicago, IL, USA) and MedCalc Statistical Software version 19.0.4 (MedCalc Software bvba, Ostend, Belgium).
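For readers unfamiliar with Bland-Altman analysis, the quantities involved can be computed as in the sketch below. The analysis in this study was run in MedCalc; this code is only an illustration. The sign convention (algorithm minus PFT) and the definition of the coefficient of repeatability as 1.96 times the root mean square difference are assumptions about conventions, not statements about the software's internals.

```python
import math
import statistics


def bland_altman(reference, predicted, z=1.96):
    """Bias, limits of agreement and a coefficient of repeatability.

    Differences are taken as predicted - reference (assumed convention).
    """
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    diffs = [p - r for r, p in zip(reference, predicted)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {
        "bias": bias,
        "sd": sd,
        "loa_lower": bias - z * sd,
        "loa_upper": bias + z * sd,
        "coefficient_of_repeatability": z * rms,  # one common convention
    }


# Made-up FVC values in litres, purely to exercise the function.
result = bland_altman([3.0, 2.5, 4.0, 3.5], [3.1, 2.4, 4.2, 3.4])
```

A bias close to zero with wide limits of agreement indicates that the method is unbiased on average but imprecise for individual subjects.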

Results

During the initial recruitment phase, we enrolled a total of 624 participants. After data analysis, 30 subjects (19 patients and 11 healthy volunteers) were excluded, primarily because their audio segments were identified as non-cough sounds, which may have resulted from environmental noise interference or throat-clearing behaviors. In the algorithm development phase, we established a training set of 476 subjects with a total of 1835 cough audio recordings. After the model was established, it was evaluated on the validation set, which comprised 479 cough sound samples from 118 subjects. Data from the remaining participants were excluded due to either excessive environmental noise interference or failure to meet quality control standards for pulmonary function measurements. The training cohort included 178 patients diagnosed with COPD or other pulmonary dysfunctions and 298 healthy volunteers with normal pulmonary function. Additionally, the cohort included 90 patients suffering predominantly from asthma, but also from a spectrum of other pulmonary conditions such as interstitial lung diseases, pulmonary nodules, malignant pulmonary tumors, pulmonary infections, bronchiectasis, allergic rhinitis, and various undiagnosed respiratory ailments (Supplementary Table 1). The validation cohort comprised 118 participants, including 47 COPD patients and 71 healthy volunteers with normal pulmonary function. Demographic characteristics and pulmonary function test (PFT) results for both cohorts are presented in Tables 1 and 2.

Table 1.

Demographic features of the training sets.

Index	COPD or Other Pulmonary Dysfunction Patients (n = 178)			Healthy Volunteers (n = 298)			P
	mean ± SD	Min	Max	mean ± SD	Min	Max
Male/Female (%)	62.92/37.08			50.67/49.33
Age (years)	54.42 ± 14.77	21	83	35.30 ± 12.35	18	78	0.002
Height (cm)	166.20 ± 8.87	137	189	166.62 ± 7.77	147	185	0.587
Weight (kg)	68.56 ± 11.89	36	100	63.14 ± 11.77	37	100	0.000
BMI (kg/m²)	24.81 ± 3.73	16.33	40.00	22.62 ± 3.12	15.76	34.60	0.000
Cough times	13.00 ± 5.70	2	27	15.20 ± 8.50	2	46	0.742
FVC (L)	3.64 ± 1.09	0.95	6.57	3.63 ± 0.89	1.43	5.97	0.718
FEV₁ (L)	2.42 ± 0.96	0.68	4.21	3.14 ± 0.73	1.05	5.57	0.000
FEV₁/FVC (%)	65.14 ± 15.13	15.34	87.20	86.77 ± 5.58	70.45	100	0.000

Table 2.

Demographic features of the validation sets.

Index	COPD Patients (n = 47)			Healthy Volunteers (n = 71)			P
	mean ± SD	Min	Max	mean ± SD	Min	Max
Male/Female (%)	74.47/25.53			38.03/61.97
Age (years)	65.53 ± 7.62	52	84	45.51 ± 12.15	18	69	0.000
Height (cm)	164.74 ± 7.44	149	178	163.00 ± 7.99	148	180	0.287
Weight (kg)	69.20 ± 14.26	43.00	125.00	68.59 ± 12.39	50	125	0.426
BMI (kg/m²)	25.38 ± 4.38	19.29	42.25	25.79 ± 4.00	18.82	42.25	0.924
Cough times	28.9 ± 15.0	4	55	32.9 ± 15.2	4	70	0.654
FVC (L)	3.27 ± 0.59	2.00	4.54	3.35 ± 0.80	2.01	5.41	0.718
FVC pred%	100.99 ± 17.57	60.86	136.41	98.07 ± 13.00	67.61	129.55
FEV₁ (L)	1.91 ± 0.50	0.88	2.96	2.72 ± 0.67	1.57	4.43	0.000
FEV₁ pred%	74.70 ± 20.63	35.56	137.07	94.59 ± 12.37	61.66	137.07
FEV₁/FVC (%)	58.35 ± 10.86	29.61	78.45	81.29 ± 5.87	70.37	92.96	0.000
FEV₁/FVC pred%	72.94 ± 12.77	38.79	92.25	96.57 ± 6.48	82.05	108.85

Tables 3–5 compare the algorithm-predicted FVC, FEV₁/FVC ratio, and FEV₁ values with their PFT-measured counterparts. The mean absolute errors (MAEs) for these parameters were 10.59%, 7.40%, and 17.78%, respectively. Table 3 highlights a notable trend: in the ≥50 years age group, the mean deviation increased and became more pronounced with larger lung capacities. This trend was less evident in the <50 years age group. Similarly, the analysis of FEV₁ distribution (Table 5) indicates that the more severe the reduction in FEV₁, the greater the absolute deviation of the algorithm-predicted values, particularly in the ≥50 years age group. In contrast, the <50 years age group did not show this trend, and its overall deviation was smaller than that of the ≥50 years group.
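The error measures reported in Tables 3–5 can be illustrated with a short sketch. How the percentage columns were derived is not stated; the version below assumes the percentage is the mean of each sample's absolute error divided by its PFT value, which is one plausible reading. The function names and the two-sample input are hypothetical.

```python
def mean_absolute_error(reference, predicted):
    """Average absolute difference, in the units of the measurement."""
    return sum(abs(p - r) for r, p in zip(reference, predicted)) / len(reference)


def mean_relative_error_pct(reference, predicted):
    """Average of |predicted - reference| / reference, as a percentage.

    Assumed definition of the 'MAE/PFT' columns; not confirmed by the text.
    """
    ratios = [abs(p - r) / r for r, p in zip(reference, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Made-up FVC values in litres, purely to exercise the functions.
pft_fvc = [2.0, 4.0]
algorithm_fvc = [2.2, 3.6]
```

Here the absolute error averages 0.3 L while both samples are off by 10% of their measured value, which shows why a relative measure is more comparable across subjects with different lung volumes.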

Table 3.

FVC distribution of validation sets.

FVC (L)	Age	FVC range (L)	n	Mean absolute error(MAE) (L)	MAE/PFT-FVC (%)
	≥50 years	[2,3)	30	0.31	11.02
		[3,4)	39	0.32	10.34
		[4,6)	9	0.57	16.87
	＜50 years	[2,3)	8	0.33	11.90
		[3,4)	18	0.35	10.28
		[4,6)	14	0.25	5.95
FVC			118	0.34	10.59

Table 4.

FEV₁/FVC distribution of validation sets.

FEV₁/FVC (%)	Age	FEV₁ /FVC range (%)	n	Cough samples	MAE/PFT-FEV₁/FVC (%)
	≥50 years	[40,50)	12	27	22.44
		[50,60)	12	53	12.29
		[60,70)	23	89	6.78
		[70,80)	20	91	4.64
		[80,100]	11	32	3.36
	＜50 years	[70,80)	12	56	4.19
		[80,100]	28	112	4.30
FEV₁/FVC			118	479	7.40

Table 5.

FEV₁ distribution of validation sets.

FEV₁ (L)	Age	FEV₁ range (L)	n	Mean absolute error (L)	MAE/PFT-FEV₁ (%)
	≥50 years	[0.5, 2)	34	0.547	38.58
		[2,3)	39	0.251	10.75
		[3,5)	5	0.251	8.02
	＜50 years	[0.5,2)	2	0.436	26.76
		[2,3)	17	0.173	6.75
		[3,5)	21	0.265	7.59
FEV₁			118	0.330	17.78

Cough sound feature distribution

Figure 4 shows the spectrogram distribution of cough sound segments and the spectrogram differences between the healthy, COPD, and asthma patients. Compared to healthy coughs, COPD-related coughs exhibited lower anterior segment power but higher posterior segment power, suggesting a prolonged duration with reduced burst intensity. However, asthma cough exhibited greater high-frequency power than healthy cough, especially in posterior sound segments, indicating that high-frequency enhancement is a key feature of asthmatic cough. COPD cough displayed greater low-frequency power but reduced high-frequency power compared to asthma cough. In cases of COPD-asthma overlap, the cough signature predominantly reflected COPD characteristics, showing stronger low-frequency and weaker high-frequency components than asthma alone.

Figure 4.

Comparison between the frequency spectrum and differences in cough sounds in healthy, COPD, and asthmatic patients. A–D: Cough audio spectrograms of healthy, COPD, asthma, and COPD with asthma patients, where all spectrograms were calculated using the means for all cough spectrograms in the respective group. E–H: Spectral differences between COPD and healthy cough, asthma and healthy cough, COPD and asthmatic cough, and COPD with asthma and asthmatic cough. This spectrogram represents the averaged statistics of cough sounds corresponding to various conditions.

Algorithm verification

The consistency between the algorithm-derived parameters and the PFT-based parameters (FEV₁, FVC, FEV₁/FVC) was evaluated using Pearson and Spearman correlation analyses. FEV₁ and FVC conformed to a normal distribution, so Pearson correlation analysis was performed between the algorithm-derived and PFT-measured values. FEV₁/FVC did not conform to a normal distribution, so Spearman analysis was used instead. A strong correlation was found between all algorithm-derived parameters and their PFT-based counterparts. The correlation results are shown in Table 6, and scatter diagrams are shown in Figures 5–7.
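The difference between the two correlation methods can be seen in a small standard-library sketch. This is illustrative only (the study used SPSS); Spearman's coefficient is computed here as Pearson's coefficient on average ranks, and the example data are made up.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# A monotonic but non-linear relationship.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
```

For this monotonic, non-linear example Spearman's coefficient is 1 while Pearson's is slightly lower, which is why the rank-based method is preferred when normality does not hold.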

Figure 5.

Scatter diagram of correlation analysis between two FVC values.

Figure 6.

Scatter diagram of correlation analysis between two FEV₁ values.

Figure 7.

Scatter diagram of correlation analysis between two FEV₁/FVC values.

Table 6.

Correlation analysis results of ventilatory function parameters.

Index	R	95% CI lower limit	95% CI upper limit	P
PFT-FVC & Algorithm-FVC	0.798	0.728	0.867	0.000
PFT-FEV₁ & Algorithm-FEV₁	0.752	0.754	0.884	0.000
PFT-FEV₁/FVC & Algorithm-FEV₁/FVC*	0.784	0.711	0.842	0.000

*Correlation analysis was conducted using the Spearman rank correlation method.

Furthermore, an intraclass correlation analysis was conducted to evaluate the concordance between the parameters estimated by the algorithm and those obtained through clinical PFTs (Table 7). Across all 118 samples, the algorithm-derived FEV₁, FVC, and FEV₁/FVC ratio showed notable consistency with the corresponding PFT measurements. FEV₁ had the highest intraclass correlation coefficient (0.801) and FEV₁/FVC the lowest (0.640). This result points to a need for further refinement of the algorithm, particularly in its capacity to accurately determine the FEV₁/FVC ratio.

Table 7.

Intraclass correlation analysis results of the algorithm: FEV₁, FVC, and FEV₁/FVC.

Parameters	Sample size	Intraclass correlation coefficient	95% CI lower limit	95% CI upper limit	P
FEV₁	118	0.801	0.726	0.857	0.000
FVC	118	0.794	0.716	0.852	0.000
FEV₁/FVC	118	0.640	0.520	0.736	0.000

Next, Bland-Altman analysis was used to further assess the agreement between the parameters measured by the algorithm and by PFT (Figure 8). The results are shown in Table 8.

Figure 8.

Bland-Altman plot of FVC, FEV₁, FEV₁/FVC between PFT and algorithm.

Table 8.

Bland-Altman analysis result of algorithm-FEV₁, FVC and FEV₁/FVC.

Parameters	Sample size	Arithmetic mean	95% CI lower limit	95% CI upper limit	Coefficient of repeatability	P
FEV₁	118	0.140	0.065	0.214	0.846	0.000
FVC	118	−0.026	−0.104	0.052	0.836	0.516
FEV₁/FVC	118	4.794	3.052	6.536	20.881	0.000
FVC*	104	0.025	−0.054	0.103	0.789	0.537

*14 Samples (PFT-FVC ≥ 4.3 L) were excluded.

The analysis revealed significant systematic biases in the algorithm predictions for FEV₁ (mean difference = 0.140, P < 0.001) and FEV₁/FVC (mean difference = 4.794, P < 0.001). Additionally, the coefficients of repeatability (CR) for these parameters were relatively wide (0.846 for FEV₁ and 20.881 for FEV₁/FVC), indicating substantial variability in the agreement between the algorithm predictions and the gold standard. These findings suggest that the algorithm's performance for FEV₁ and FEV₁/FVC requires further optimization to reduce both bias and variability. For FVC, the algorithm predictions showed no significant systematic bias (mean difference = -0.026, P = 0.516). However, the consistency range was relatively wide (CR = 0.836), indicating some variability in the agreement. The algorithm shows good FVC bias performance but requires improvement in reducing variability. When excluding extreme outliers, FVC* and FVC results were statistically equivalent (mean difference = 0.025, P = 0.537; CR = 0.789), demonstrating that results are somewhat dependent on sample selection.

Subsequently, the validation set was employed to ascertain the algorithm's accuracy in distinguishing between patients with COPD and healthy controls. Upon the exclusion of age as a confounding factor, the algorithm consistently demonstrated a quantifiable level of accuracy in differentiating between individuals with COPD and those without the condition. The metrics pertaining to the accuracy, sensitivity, and specificity of this algorithm are delineated in Table 9 and illustrated in Figure 9.

Figure 9.

ROC curve of algorithm in detecting COPD. Note: Solid blue line: ROC curve; dotted blue line: 95% confidence interval; ROC: Receiver Operating Characteristic.

Table 9.

Detailed diagnostic performance of the COPD screening algorithm model.

	AUC (%)	Sensitivity (%)	Specificity (%)	+LR	-LR	P	Youden index
Estimate	87.82	86.96	88.73	7.72	0.15	<0.001	0.757
95% CI lower limit	80.51	73.71	79.04	4.01	0.07	-	-
95% CI upper limit	93.20	95.15	95.03	15.00	0.30	-	-

AUC: Area under curve; +LR: positive likelihood ratio; -LR: negative likelihood ratio.
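The likelihood ratios and the Youden index in Table 9 follow directly from the reported sensitivity and specificity. The sketch below applies the standard definitions to those two values; the function name is hypothetical.

```python
def diagnostic_summary(sensitivity, specificity):
    """Derive likelihood ratios and the Youden index.

    Both inputs are fractions strictly between 0 and 1.
    """
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    return {
        "positive_lr": sensitivity / (1.0 - specificity),
        "negative_lr": (1.0 - sensitivity) / specificity,
        "youden_index": sensitivity + specificity - 1.0,
    }


# Sensitivity and specificity as reported in Table 9.
table9 = diagnostic_summary(sensitivity=0.8696, specificity=0.8873)
```

Rounded to the precision used in Table 9, these inputs reproduce the tabulated +LR of 7.72, -LR of 0.15, and Youden index of 0.757.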

In the subgroup analysis of participants aged ≥50 years, we observed a decline in model performance, with overall accuracy decreasing to 81.6% and specificity dropping to 74.2%. Sensitivity remained unchanged because there were no COPD cases in the <50 years age group.

Discussion

In this study, we developed a cough sound-based algorithm to predict pulmonary function parameters, including FEV₁, FVC, and the FEV₁/FVC ratio. The algorithm demonstrated significant correlation and consistency with gold standard measurements. Furthermore, we integrated this algorithm with wearable device sensor metrics (HRV, blood oxygen saturation, and respiratory rate) to construct a COPD screening model. Validation results revealed a COPD screening accuracy of 87.8%, highlighting its potential for large-scale population screening.

Cough sound characteristics and algorithm performance

Cough sounds reflect diverse aspects of bronchopulmonary pathophysiology, with characteristics such as duration and frequency varying across pulmonary disorders.¹⁸ Consistent with prior research,^8,19 our findings identified distinct acoustic features between individuals with normal lung function and those with pulmonary abnormalities. Unlike previous studies, our investigation utilized a larger sample size and established separate training and validation databases. By incorporating smartwatch-collected cough sounds and physiological metrics, our methodology achieved greater precision than approaches relying solely on cough sound analysis.

However, we observed that the algorithm's predictive accuracy varied with lung function severity. For individuals aged 50 and older, the mean deviation of predicted FVC increased compared to gold standard values (Table 3). Similarly, lower FEV₁ and FEV₁/FVC values were associated with greater prediction deviations (Tables 4 and 5). These findings suggest that the algorithm's efficacy in predicting FEV₁ inversely correlates with the degree of lung function deterioration. This phenomenon may be attributed to obstructive lung impairment, which diminishes the distinctive acoustic characteristics of cough sounds. Specifically, reduced lung function conductivity may weaken the ability of cough sounds to convey diagnostic audio features.

Moreover, this result may also stem from age-related confounders and physiological changes: subclinical laryngeal dysfunction and comorbidities (e.g., GERD, pharyngitis due to snoring) are more prevalent in older adults, potentially generating cough signals with overlapping acoustic features. Age-related vocal cord atrophy and reduced lung elastic recoil may attenuate the high-frequency components of cough, impairing the extraction of audio features.

Data imbalance and algorithm limitations

The algorithm's performance was further impacted by data imbalance within the dataset. Specifically, the training set exhibited a skewed distribution of key clinical parameters: only 10.7% of COPD patients (19/178) had severe airflow limitation (FEV₁/FVC < 0.4), and only 9.7% of the total cohort (46/476) demonstrated a high lung capacity (FVC ≥ 5 L). In addition, age distribution differed significantly between the COPD and healthy control groups (Supplemental Table 3). These imbalances likely introduced prediction bias, particularly for populations with severe airflow limitation, larger lung capacity, or advanced age, thereby limiting the algorithm's generalizability to these subgroups. The underrepresentation of older males in the training set may further limit generalizability, as COPD predominantly affects this demographic. While benign coughs were not a focus of this study, their inclusion among subjects with normal lung function measurements may have further complicated the analysis.
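The subgroup shares quoted above follow directly from the counts, and the same arithmetic can be used to audit any training set for sparse strata. The sketch below is illustrative only: `subgroup_shares` and the 15% flagging threshold are hypothetical choices, while the counts (19/178 and 46/476) are those reported in the paragraph.

```python
def subgroup_shares(counts, flag_below=15.0):
    """counts: {name: (n_in_subgroup, n_total)}. Returns {name: (percent, is_sparse)}."""
    report = {}
    for name, (n, total) in counts.items():
        percent = 100 * n / total
        # flag_below is an arbitrary illustrative threshold, not a study criterion.
        report[name] = (percent, percent < flag_below)
    return report


counts = {
    "FEV1/FVC < 0.4 among training patients": (19, 178),
    "FVC >= 5 L among training participants": (46, 476),
}
report = subgroup_shares(counts)
for name, (percent, sparse) in report.items():
    print(f"{name}: {percent:.2f}%{' (sparse)' if sparse else ''}")
```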

Spectral analysis and disease differentiation

Spectrogram analysis revealed distinct cough sound patterns among healthy individuals, COPD patients, and asthmatic patients. Key differentiating features included the power contrast in the initial phase of the cough and the duration of low-frequency energy. We found that using high-frequency power in the post-cough (later stage of cough sound) segment to distinguish healthy individuals from COPD patients reduced specificity for asthma. These findings align with studies by Knocikova et al.²⁰ and Ramesh et al.,²¹ which identified unique acoustic characteristics in asthmatic coughs. Current research on distinguishing COPD from asthma using cough sounds is limited. Most existing studies focus on differentiating pulmonary dysfunction patterns, such as obstructive and restrictive patterns, with reported accuracy exceeding 90%.²²,²³ Additionally, it has been shown that asthma and COPD exhibit differences in entropy distribution, with asthma demonstrating higher entropy due to greater variability in cough sequences. Our study further identifies distinctions in spectral characteristics between the two conditions. Future refinements to the algorithm aimed at differentiating asthma from COPD will focus on these two aspects, entropy distribution and spectral features, for enhanced discrimination.
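To make the two feature families above concrete, the following sketch computes a high-frequency power ratio for the later part of a signal and a spectral entropy. It rests on several assumptions: the study does not specify its feature definitions, so the 2000 Hz cutoff, the "second half" definition of the later stage, and the use of Shannon entropy of the power spectrum are illustrative choices, and the signals are synthetic tones rather than coughs. A naive DFT is used to keep the example dependency-free; an FFT library would be the practical choice.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power for bins 0..N/2 (adequate for short frames only)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))) ** 2
        for k in range(n // 2 + 1)
    ]


def high_frequency_ratio(frame, sample_rate, cutoff_hz):
    """Fraction of spectral power at or above cutoff_hz."""
    power = power_spectrum(frame)
    bin_hz = sample_rate / len(frame)
    total = sum(power)
    high = sum(p for k, p in enumerate(power) if k * bin_hz >= cutoff_hz)
    return high / total if total else 0.0


def spectral_entropy(frame):
    """Shannon entropy (bits) of the normalised power spectrum."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0:
        return 0.0
    probs = [p / total for p in power if p > 0]
    return -sum(p * math.log2(p) for p in probs)


# Synthetic stand-ins for a cough recording (assumed 8 kHz sampling).
sample_rate = 8000
n = 128
low = [math.sin(2 * math.pi * 500 * t / sample_rate) for t in range(n)]
high = [math.sin(2 * math.pi * 3000 * t / sample_rate) for t in range(n)]

cough_like = low + high                      # low-frequency onset, high-frequency tail
later_stage = cough_like[len(cough_like) // 2:]
tail_ratio = high_frequency_ratio(later_stage, sample_rate, cutoff_hz=2000)
mixed_entropy = spectral_entropy([a + b for a, b in zip(low, high)])
print(f"later-stage high-frequency ratio: {tail_ratio:.3f}")
print(f"entropy of two-tone mix: {mixed_entropy:.3f} bits")
```

A signal whose energy is spread over more frequencies yields a higher entropy, which is one way the greater variability attributed to asthmatic coughs could surface in such a feature.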

Integration of physiological metrics

Our COPD screening model integrates cough sound analysis with physiological parameters, such as HRV, blood oxygen saturation, and respiratory rate, collected via smartwatches. This multimodal approach achieved an accuracy of 87.8% in the validation set, surpassing the COPD model that relies solely on cough sounds for differentiation,⁷ thereby highlighting the potential of this algorithm for large-scale population screening. Some studies have highlighted the utility of 24-h cough counts,²⁴ exercise-induced HRV changes,²⁵ and six-minute walk distances²⁶,²⁷ as COPD indicators. In our model, cough audio features and SpO₂ are the most discriminative for COPD detection, as they directly reflect respiratory function and oxygenation status. HRV and respiratory rate provide supplementary information, enhancing the model's ability to identify subtle physiological changes associated with COPD. By integrating these features, our approach achieves a more comprehensive and accurate classification than relying solely on spirometric predictions. By leveraging wearable devices, our model offers a non-invasive and scalable solution for COPD management, addressing the limitations of traditional methods that require continuous microphone use and raise privacy concerns.
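The accuracy figure above, together with the sensitivity and specificity reported in the Results, derives from a comparison of model output against physician diagnosis. A minimal sketch of that calculation is given here. The function name and the label vectors are hypothetical and do not reproduce the study's confusion matrix.

```python
def screening_metrics(y_true, y_pred):
    """y_true, y_pred: sequences of 1 (COPD) or 0 (control).

    Returns accuracy, sensitivity, and specificity as fractions.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty, equal-length label sequences")

    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    return {
        "accuracy": (tp + tn) / len(y_true),
        # Sensitivity: share of physician-diagnosed COPD cases the model flags.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of controls the model correctly clears.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical labels (1 = COPD by physician diagnosis), for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = screening_metrics(y_true, y_pred)
print(metrics)
```

For a screening tool, sensitivity and specificity are more informative than accuracy alone, because accuracy depends on the case mix of the cohort being tested.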

Clinical implications and early detection

Population studies indicate that mild to moderate airflow limitations (FEV₁ ≥ 70%) often go undiagnosed.²⁸,²⁹ In our cohort, 83.0% of clinically diagnosed COPD patients fell into this category, underscoring the algorithm's potential to aid early disease identification. Early detection is critical, as timely intervention can significantly improve outcomes. Additionally, the integration of physiological metrics, such as altered respiratory patterns and HRV, enhances the model's ability to differentiate COPD patients from healthy individuals, particularly in older populations. While the current findings are promising, large-scale real-world studies remain imperative to verify clinical applicability, identify potential limitations, and guide further optimization.

Limitations and future directions

This study has several limitations. First, the algorithm's ability to detect extreme lung function conditions was limited by the sample's data distribution. Second, the relatively small sample size may limit the generalizability of the findings, particularly for individuals with extreme FEV₁ or FVC values, highlighting the need for validation in larger and more diverse cohorts. Additionally, the number of severe COPD cases was limited, which may affect the generalizability of our results to populations with more advanced disease. Third, the training set's male predominance may limit applicability to female patients. Fourth, the algorithm's accuracy in differentiating COPD from asthma remains suboptimal, highlighting the need for additional diagnostic features, such as airway dilation reversibility. Fifth, elderly individuals often present with comorbidities (such as cardiovascular diseases), which may influence physiological parameters such as HRV and SpO₂, thereby introducing additional noise into the data used for COPD screening algorithms. Finally, the current algorithm's performance may not yet be sufficient for large-scale screening applications, as its accuracy and generalizability require further optimization and validation in broader clinical settings. As a prospective diagnostic study, our data were collected under standardized protocols. It is essential to note that publication bias is not applicable at the level of a single primary study; however, our hospital-based recruitment strategy may introduce selection bias, as enrolled participants (e.g., those with more severe COPD or tech-savvy individuals) may not fully represent the general population.

Future research should prioritize several key directions to advance the clinical utility and generalizability of the algorithm. First, further validation studies in broader community-based populations are warranted, with particular attention to currently underrepresented groups, including patients with severe COPD, female participants, and individuals with comorbid conditions. Second, external validation studies in varied demographic and clinical settings are needed to rigorously evaluate the algorithm's performance and robustness. Third, further refinement of the algorithm's diagnostic accuracy, especially in differentiating COPD from asthma, should be pursued by incorporating additional diagnostic features, such as airway dilation reversibility and other biomarkers. Fourth, real-world clinical implementation studies are necessary to assess the algorithm's practicality and effectiveness in routine healthcare settings. Finally, while the current study utilized HUAWEI smartwatches, future investigations should explore the algorithm's compatibility with other wearable and mobile devices, such as smartphones, to enhance its accessibility and applicability across different technological platforms.

Conclusion

We developed novel algorithms to assess ventilatory function and screen for COPD based on cough sounds and physiological parameters detected by smartwatch sensors. The accuracy and effectiveness of the two models were preliminarily verified, providing a theoretical basis and practical reference for the screening of some chronic respiratory diseases via wearable device signals.

Supplemental Material

sj-docx-1-dhj-10.1177_20552076251377938 - Supplemental material for Smartwatch-based ventilatory assessment for COPD screening: A diagnostic accuracy study

Supplemental material, sj-docx-1-dhj-10.1177_20552076251377938 for Smartwatch-based ventilatory assessment for COPD screening: A diagnostic accuracy study by Yibing Chen, Lu Cao, Dahui Zhao, Song Meng, Dan Li, Jing Li, Yuqi Cui and Lixin Xie in DIGITAL HEALTH

Footnotes

ORCID iD

Yibing Chen

Ethical considerations

This study was performed according to the Declaration of Helsinki. The study protocol was approved by the Ethics Committee of the Chinese PLA General Hospital (Approval No. S2021-663-01). All participants provided written informed consent.

Consent for publication

Not applicable.

Authors’ contributions

YB Chen and L Cao collected and analyzed the data and drafted the manuscript. DH Zhao, S Meng, and D Li recruited the patients, conducted the examination, and made the diagnoses. LX Xie supervised the study. J Li and YQ Cui analyzed the data and built the algorithm model. All authors contributed to the article's analysis, drafting, and revising, agreed on the journal for submission, read and gave final approval of the version to be submitted, and agreed to be accountable for all aspects of the work.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the China Key Scientific Grant (grant number 2021YFC0122500).

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability statement

The data supporting this study's findings are available upon request from the corresponding author. The data are not publicly available because of privacy or ethical restrictions.

Disclosure

The abstract of this paper was accepted by ERS CONGRESS 2023, and presented as a Poster Discussion — Monitoring chronic airway diseases: COPD screening and ventilatory function test based on smartwatch sensor signals.

Supplemental material

Supplemental material for this article is available online.

References

1. GBD Chronic Respiratory Disease Collaborators. Prevalence and attributable health burden of chronic respiratory diseases, 1990-2017: a systematic analysis for the global burden of disease study 2017. Lancet Respir Med 2020 Jun; 8: 585–596.

2. Wang, Yang, et al. China Pulmonary health study group. Prevalence and risk factors of chronic obstructive pulmonary disease in China (the China pulmonary health [CPH] study): a national cross-sectional study. Lancet 2018 Apr 28; 391: 1706–1717.

3. Global Strategy for the Diagnosis, Management, and Prevention of Chronic Obstructive Pulmonary Disease (2023 Report). 2023 [cited 2023 Aug 10]. Available from: https://goldcopd.org/wp-content/uploads/2023/03/GOLD-2023-ver-1.3-17Feb2023_WMV.pdf.

4. Crimi, Impellizzeri, Campisi, et al. Practical considerations for spirometry during the COVID-19 outbreak: literature review and insights. Pulmonology 2021 Sep-Oct; 27: 438–447. Epub 2020 Aug 5.

5. Bietz, Bloss, Calvert, et al. Opportunities and challenges in the use of personal health data for health research. J Am Med Inform Assoc 2016 Apr; 23: e42–e48. Epub 2015 Sep 2.

6. Aliverti. Wearable technology: role in respiratory health and disease. Breathe (Sheff) 2017 Jun; 13: e27–e36.

7. Sharan, Abeyratne, Swarnkar, et al. Predicting spirometry readings using cough sound features and regression. Physiol Meas 2018 Sep 5; 39: 095001.

8. Pan, et al. A forced cough sound based pulmonary function assessment method by using machine learning. Front Public Health 2022 Oct 25; 10: 1015876.

9. Tiwari, Liaqat, et al. Remote COPD severity and exacerbation detection using heart rate and activity data measured from a wearable device. Annu Int Conf IEEE Eng Med Biol Soc 2021 Nov; 2021: 7450–7454.

10. Chen, Zhang, et al. Continuous monitoring of heart rate variability and respiration for the remote diagnosis of chronic obstructive pulmonary disease: prospective observational study. JMIR Mhealth Uhealth 2024 Jul 18; 12: e56226.

11. Bellos, Papadopoulos, Rosso, et al. Identification of COPD patients’ health status using an intelligent system in the CHRONIOUS wearable platform. IEEE J Biomed Health Inform 2014 May; 18: 731–738.

12. Shah, Velardo, Gibson, et al. Personalized alerts for patients with COPD using pulse oximetry and symptom scores. Annu Int Conf IEEE Eng Med Biol Soc 2014; 2014: 3164–3167.

13. Rahman, Nemati, Rahman, et al. Automated assessment of pulmonary patients using heart rate variability from everyday wearables. Smart Health 2020; 15: 100081.

14. Spina, Casale, Albert, et al. Nighttime features derived from topic models for classification of patients with COPD. Comput Biol Med 2021; 132: 104322.

15. Graham, Steenbruggen, Miller, et al. Standardization of spirometry 2019 update. An official American thoracic society and European respiratory society technical statement. Am J Respir Crit Care Med 2019 Oct 15; 200: e70–e88.

16. Zheng, Zhong. Normative values of pulmonary function testing in Chinese adults. Chin Med J (Engl) 2002 Jan; 115: 50–54.

17. Piirilä, Sovijärvi. Differences in acoustic and dynamic characteristics of spontaneous cough in pulmonary diseases. Chest 1989 Jul; 96: 46–53.

18. Piirilä, Sovijärvi. Objective assessment of cough. Eur Respir J 1995 Nov; 8: 1949–1956.

19. Nemati, Rahman, Blackstock, et al. Estimation of the lung function using acoustic features of the voluntary cough. Annu Int Conf IEEE Eng Med Biol Soc 2020 Jul; 2020: 4491–4497.

20. Sumner, et al. Predictors of objective cough frequency in chronic obstructive pulmonary disease. Am J Respir Crit Care Med 2013; 187: 943–949.

21. Bartels, Jelic, Ngai, et al. High-frequency modulation of heart rate variability during exercise in patients with COPD. Chest 2003 Sep; 124: 863–869.

22. Rudraraju, Palreddy, Mamidgi, et al. Cough sound analysis and objective correlation with spirometry and clinical diagnosis. Informatics in Medicine Unlocked 2020; 19: 100319.

23. Davies, Bachtiger, Williams, et al. Wearable in-ear PPG: detailed respiratory variations enable classification of COPD. IEEE Trans Biomed Eng 2022 Jul; 69: 2390–2400. Epub 2022 Jun 17.

24. Rejbi, Trabelsi, Chouchene, et al. Changes in six-minute walking distance during pulmonary rehabilitation in patients with COPD and in healthy subjects. Int J Chron Obstruct Pulmon Dis 2010 Aug 9; 5: 209–215.

25. Danilack, Weston, Richardson, et al. Reasons persons with COPD do not walk and relationship with daily step count. COPD 2014 Jun; 11: 290–299. Epub 2013 Oct 23.

26. Knocikova, Korpas, Vrabec, et al. Wavelet analysis of voluntary cough sound in patients with respiratory diseases. J Physiol Pharmacol 2008 Dec; 59: 331–340.

27. Ramesh, Vatanparvar, Nemati, et al. CoughGAN: generating synthetic coughs that improve respiratory disease classification. Annu Int Conf IEEE Eng Med Biol Soc 2020 Jul; 2020: 5682–5688.

28. Sandelowsky, Ställberg, Nager, et al. The prevalence of undiagnosed chronic obstructive pulmonary disease in a primary care population with respiratory tract infections - a case finding study. BMC Fam Pract 2011 Nov 3; 12: 22.

29. Lindberg, Bjerg, Rönmark, et al. Prevalence and underdiagnosis of COPD by disease severity and the attributable fraction of smoking report from the obstructive lung disease in northern Sweden studies. Respir Med 2006 Feb; 100: 264–272. Epub 2005 Jun 21.
