Abstract
Objective
Current adjustments of continuous positive airway pressure (CPAP) may expose patients to risks of respiratory episodes or oxygen desaturation. Therefore, this study developed machine learning models using leading indicators, such as heart rate variability (HRV) and oximetry-related metrics, to proactively predict optimal adjustment timings.
Methods
CPAP titration data were first collected from a sleep center in northern Taiwan. Subsequently, continuous HRV and oximetry-related metrics were retrieved (60-s window with 1-s stride) and then labeled based on the presence of pressure adjustments. The dataset, comprising seven HRV and two oximetry-related parameters, was divided at the patient level into training/validation (80%) and test (20%) datasets. Five cross-sectional and two time-series models were established. The model with the highest accuracy and area under the receiver operating characteristic curve (AUROC) in the training/validation dataset was applied to the test dataset to investigate feature importance through permutation analysis.
Results
A dataset comprising 14,629 time-series cases from 374 patients undergoing CPAP therapy was obtained. The InceptionTime model outperformed the others during the training/validation phase with an accuracy of 78.27% and an AUROC of 78.07%, and achieved 80.92% accuracy and 76.52% AUROC in the test phase. Feature importance analysis identified peripheral arterial oxygen saturation, its standard deviation over a 60-s window, and normalized power in the very-low-frequency band of HRV as the most impactful predictors.
Conclusions
The findings demonstrated the feasibility of incorporating HRV and oximetry-related metrics, as leading indicators, to proactively predict CPAP adjustment timings. Further research should consider integrating these metrics to improve CPAP therapy.
Keywords
Highlights
Utilizing machine learning models based on HRV and oximetry-related measures, this study proactively predicts continuous positive airway pressure (CPAP) adjustment timings. The established InceptionTime model (time-series) demonstrated an accuracy of 80.92% and an area under the receiver operating characteristic curve of 76.52%, proving its efficacy in predicting CPAP adjustment timings. Physiological features, such as standard deviation of peripheral arterial oxygen saturation, peripheral arterial oxygen saturation, and normalized power values of very low frequency, were identified as having the most significant impact on adjustment timing, elucidating the feasibility of using these indicators to evaluate pressure support levels.
Introduction
Obstructive sleep apnea (OSA) is a sleep-related breathing disorder caused by partial or complete obstruction of the airway. 1 Almost a billion people globally, between the ages of 30 and 69 years, have mild to severe OSA, and almost half a billion have moderate to severe OSA, making OSA a serious health concern worldwide. 2 The prevalence of OSA in the United States has increased significantly over the past 25 years, while the global prevalence is approximately 50%.3,4 Individuals with long-term untreated OSA, who are often exposed to nocturnal desaturation, face a heightened risk of developing comorbidities such as cardiovascular disease, hypertension, and type 2 diabetes, as well as increased all-cause mortality. 5 Additionally, long-term untreated OSA has been linked to cognitive impairment and neurodegenerative disorders, such as dementia.6,7 These clinical observations highlight the critical importance of treating OSA.
Continuous positive airway pressure (CPAP) maintains a constant pressure to reduce the work of breathing during sleep and is the evidence-based standard treatment for moderate to severe OSA. Previous research highlighted the benefits of CPAP therapy for OSA patients, including an improved quality of life and reduced daytime sleepiness, along with alleviating comorbid conditions of hypertension and glycemic control.8,9 However, the discomfort associated with CPAP, such as mouth dryness and headaches, can result in low compliance and adherence to therapy, significantly reducing its beneficial effects. 10 Additionally, different sleep stages, such as rapid eye movement (REM), and body positions like the supine sleep position, may require higher pressure settings for CPAP. 11 Fixed-pressure CPAP settings might not adequately accommodate such variations, potentially limiting their effectiveness. For autotitrating CPAP systems, the method of pressure adjustment is activated only upon detecting alterations in respiratory resistance or reduced minute ventilation, which may cause arousal occurrences and even extended exposure times to desaturation.12,13 Therefore, it may be beneficial to explore the potential of integrating additional reliable physiological signals to enable more precise and proactive pressure adjustments in CPAP systems, rather than relying solely on traditional signals like pressure and airflow.
Heart rate variability (HRV), modulated by both the sympathetic and parasympathetic nervous systems, can be proactively utilized to detect the occurrence of respiratory events associated with OSA. 14 More precisely, common manifestations of OSA, such as apnea and hypopnea, activate the sympathetic nervous system, leading to heart rate fluctuations triggered by restricted airflow and corresponding oxygen desaturation.15,16 Previous researchers also reported relationships between OSA severity and HRV metrics, such as the standard deviation of R-to-R intervals (SDNN) and the ratio of low-frequency to high-frequency power.17,18 Therefore, the level of autonomic dysfunction and stress caused by respiratory events may be feasibly assessed through alterations in HRV patterns. 19 Furthermore, in the early stages of respiratory events, minor airflow disturbances can trigger compensatory mechanisms in the brain aimed at restoring proper breathing. 20 These neural responses in the brain, facilitated by brain–heart interactions, are reflected in alterations to the HRV. Such variations may be observed even during initial minor disturbances, well before full-blown respiratory events occur, making them potential leading indicators of impending respiratory events. 21 Hence, the sensitivity of HRV parameters suggests that they could be effectively utilized to predict respiratory events in advance, potentially offering a critical window for timely medical interventions.
While the correlation between HRV and respiratory events has been discussed in previous literature, the application of HRV parameters to develop machine learning (ML) models for automatic pressure adjustments in CPAP systems remains underexplored. This study aims to address this gap by leveraging HRV and oximetry-related parameters to build ML models capable of predicting optimal moments for proactive CPAP adjustments. We hypothesize that these parameters, as early indicators of respiratory events, could enable precise and timely pressure adjustments, offering a novel alternative to the conventional reliance on pressure and airflow signals. By utilizing physiological data collected during CPAP titration in a hospital setting, we developed multiple ML models and performed feature importance analyses to identify the most critical parameters contributing to model performance. Our findings highlight the potential of incorporating HRV and oximetry signals into CPAP systems, paving the way for more proactive, data-driven approaches to optimize therapy outcomes and enhance patient care.
Materials and methods
Research ethics and study population
The protocol for this retrospective study received ethical approval from the Institutional Review Board at the Office of Human Research of Taipei Medical University (Clinical Trial Number: TMU-SHH, N202212067). All procedures regarding data collection, processing, analysis, and maintenance were conducted in accordance with the approved protocols. The study involved retrospective data retrieval from 374 patients who underwent both polysomnography (PSG) and full-night CPAP titration tests at a sleep center in New Taipei City from January 2020 to January 2022. The data inclusion criteria were as follows: (1) aged 18 to 65 years, (2) both the PSG and CPAP titration recording times were over 6 h and there was a determined optimal pressure, (3) having received no invasive treatment for OSA (i.e. otorhinolaryngology surgery), (4) no cardiopulmonary disease, and (5) not using hypnotic or psychotropic medications. Regarding the retrieved data, baseline parameters and clinical history were accessed from medical records in the hospital database, which included age, sex, body-mass index (BMI), neck and waist circumferences, as well as surgical and medication histories. Concerning sleep parameters, both initial PSG details and subsequent full-night CPAP titration data, including summary reports and original physiological signal data, were obtained from the sleep center database. All of the collected data of eligible patients were used for further analysis.
Sleep parameters of PSG and CPAP titration
PSG recordings were obtained using two systems in the sleep center, specifically the ResMed Embla N7000 (ResMed, San Diego, CA, USA) and Embla MPR (Natus Medical, Pleasanton, CA, USA), which were utilized for physiological signal measurements. The CPAP titration test was conducted via the same PSG recording system but with the addition of the S9 ResMed VPAP titration device (ResMed). Pressure adjustments were made using EasyCare Tx titration software (ResMed, version 7.00). RemLogic software, which served as the scoring environment (version 3.41; Embla, Thornton, CO, USA), was employed to determine sleep stages and identify episodes such as arousal responses and respiratory events. The recorded physiological signals included electroencephalography, electrocardiography (ECG), electrooculography, electromyography, snoring patterns, thoracic and abdominal impedance, sleeping position, nasal prong pressure and thermistor measurements (only for PSG), and oxygen parameters. Regarding details of sleep data scoring and pressure adjustments of CPAP, certified PSG technologists scored sleep stages and respiratory events, and adjusted the pressure according to the scoring manual released by the American Academy of Sleep Medicine in 2017. 22 To reduce scoring bias, another licensed technologist independently reviewed the scoring outcomes. Any discrepancies were extracted for further review and discussion until agreement was established. Notably, OSA severity was classified by the apnea-hypopnea index (AHI) from PSG into four categories: normal (AHI <5 events/h), mild (AHI: 5–15 events/h), moderate (AHI: 15–30 events/h), and severe (AHI >30 events/h). 
23 The CPAP pressure for patients aged 12 years and older was increased by at least 1 cmH2O, at intervals no shorter than 5 min, when the following situations were observed: (1) at least two obstructive apneas, (2) at least three hypopneas, (3) at least 3 min of loud or unambiguous snoring, and (4) at least five respiratory effort-related sleep arousals. 24 Additionally, the optimal CPAP titration was determined based on the established consensus that the pressure should reduce the respiratory disturbance index (RDI) to fewer than 5 events/h for at least a 15-min period, including REM sleep in a supine position, and not be frequently interrupted by spontaneous arousals or awakenings. 25 Relevant physiological signals (i.e. ECG and oxygen parameters) and pressure details were processed and applied to predict the time point of CPAP pressure adjustments.
Time series data preparation
To determine HRV metrics, this study applied Python (version 3.7.15) and open-source modules such as BioSPPy (version 1.0.0), pyHRV (version 0.4.1), and hrv-analysis (version 1.0.3). A flowchart for computing continuous HRV characteristics is presented in Figure 1. First, the electrocardiographic signal from CPAP titration was extracted to calculate HRV metrics (approximately a 6-h recording), while exact time points for adjusting the pressure level of CPAP were extracted for labeling. Regarding the HRV calculation, following previous work, the ECG signal was segmented into 60-s windows with a 1-s stride to continuously determine variations in HRV metrics. 26 For the types of HRV, time-domain metrics, including SDNN, root mean square of successive differences between normal heartbeats (RMSSD), and the number of interval differences of successive normal heartbeats greater than 50 ms (NN50), were obtained. For frequency-domain metrics, normalized power values of very low frequency (nVLF), low frequency (nLF), and high frequency (nHF) were also determined. Additionally, the peripheral arterial oxygen saturation (SpO2) level measured by pulse oximetry during PSG was extracted to serve as a continuous oxygen-related feature and aligned with continuous HRV metrics. Notably, to prevent erroneous HRV calculations due to artifacts in the ECG signal, data with a mean heart rate not between 40 and 90 beats per minute were eliminated. 27 Next, aligned time-series data were sliced using a 60-s window with a 60-s stride. The standard deviation of SpO2 (SpO2-std) and the mean heart rate (HR-mean) for each 1-min window were calculated as derived features. In total, seven HRV metrics (i.e. SDNN, RMSSD, NN50, HR-mean, nLF, nHF, and nVLF) and two oxygen-related metrics (i.e.
SpO2 and SpO2-std) served as input features for developing the ML models. To label each window, if a CPAP pressure adjustment was performed, the 60-s segment immediately preceding the adjustment was labeled accordingly. Conversely, segments labeled as no pressure adjustment were 60-s windows that met the following criteria: (1) total sleep time of more than 80%, and (2) absence of apnea, hypopnea, or desaturation events in the preceding 60 min. These criteria were applied uniformly across the dataset to ensure consistent labeling for model training.
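The windowing and time-domain calculations described above can be sketched as follows. This is a minimal illustration in plain Python rather than the BioSPPy/pyHRV pipeline used in the study. The function names (`time_domain_hrv`, `sliding_hrv`), the input format (beat times in seconds paired with RR intervals in milliseconds), and the definition of mean heart rate as 60,000 divided by the mean RR interval are assumptions made for the example.

```python
import math


def time_domain_hrv(rr_ms):
    """Time-domain HRV metrics for one window of RR intervals (milliseconds)."""
    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n
    # SDNN: sample standard deviation of the RR intervals in the window.
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))
    # Successive differences between neighboring intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    # NN50: count of successive differences strictly greater than 50 ms.
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    # Assumed definition of mean heart rate (beats per minute).
    hr_mean = 60000.0 / mean_rr
    return {"sdnn": sdnn, "rmssd": rmssd, "nn50": nn50, "hr_mean": hr_mean}


def sliding_hrv(beat_times_s, rr_ms, window_s=60.0, stride_s=1.0):
    """Compute metrics over 60-s windows advanced by a 1-s stride.

    Windows whose mean heart rate falls outside 40-90 beats per minute
    are dropped, mirroring the artifact rule stated in the text.
    """
    rows = []
    start = beat_times_s[0]
    end_of_record = beat_times_s[-1]
    while start + window_s <= end_of_record:
        window = [rr for t, rr in zip(beat_times_s, rr_ms)
                  if start <= t < start + window_s]
        if len(window) >= 3:  # need at least two successive differences
            metrics = time_domain_hrv(window)
            if 40 <= metrics["hr_mean"] <= 90:
                rows.append((start, metrics))
        start += stride_s
    return rows
```

Calling `sliding_hrv(beat_times_s, rr_ms)` on a single recording returns one `(window_start, metrics)` pair per retained window. The frequency-domain metrics (nVLF, nLF, nHF) would require a spectral estimate and are omitted from this sketch.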

Flowchart of data collection and processing. Polysomnography (PSG) and continuous positive airway pressure (CPAP) titration data were collected from 374 patients with each recording lasting approximately 6 h. Heart rate variability (HRV) was extracted and analyzed from electrocardiography (ECG) signals, with continuous HRV features calculated using a 60-s window and a 1-s stride. Next, time series data were generated with a 60-s window and stride. To ensure data accuracy, 60-s intervals where the heart rate exceeded 90 or fell below 40 beats per minute were adjusted by interpolating with neighboring data. Finally, data were labeled according to whether there was a pressure level adjustment.
ML approaches and model selection
This study established models using open-source Python libraries: PyTorch (version 1.13.1), tsai (version 0.3.4), fastai (version 2.7.9), and scikit-learn (version 1.1.2). Seven types of models were established, including cross-sectional models of logistic regression (LR), k-nearest neighbor (kNN), support vector machine (SVM), random forest (RF), and gradient boosting machine (GBM); and time-series models of long short-term memory (LSTM) and InceptionTime. Figure 2 depicts the flowchart for training, validating, and testing the models on a patient-independent dataset. The total data were initially split into training/validation datasets and test datasets using the patient-independent method with a ratio of 80:20. Since it was an imbalanced dataset, an over-sampling technique—Synthetic Minority Oversampling Technique—was applied to balance the data only in the training process. 28
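The patient-independent 80:20 split described above can be illustrated with a short sketch. The function name `patient_independent_split`, the seed, and the representation of the data as one patient identifier per 60-s segment are assumptions for the example; the study's actual splitting code is not given in the text. Oversampling (e.g. SMOTE) would then be applied to the training indices only and is not shown here.

```python
import random


def patient_independent_split(patient_ids, test_fraction=0.2, seed=0):
    """Split segment indices so that no patient appears in both partitions.

    patient_ids holds one identifier per segment. Patients, not segments,
    are shuffled and assigned, so every segment of a given patient lands
    on the same side of the split.
    """
    unique_patients = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_patients)
    n_test = round(len(unique_patients) * test_fraction)
    test_patients = set(unique_patients[:n_test])
    train_idx = [i for i, p in enumerate(patient_ids) if p not in test_patients]
    test_idx = [i for i, p in enumerate(patient_ids) if p in test_patients]
    return train_idx, test_idx
```

Splitting at the patient level avoids leakage that would occur if segments from the same night were spread across the training and test datasets.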

Training and testing processes in the development of prediction models for pressure level adjustment during continuous positive airway pressure (CPAP) titration using a patient-independent dataset. Total data from 374 patients were labeled as having pressure level adjustment or not having pressure level adjustment during CPAP titration. Time-series data were divided into a patient-independent training dataset and a test dataset with a ratio of 80%:20%. The Synthetic Minority Oversampling Technique (SMOTE) was first applied to the training dataset to oversample the data with a ratio of 4.36. Ten-fold cross-validation was performed. Trained models included logistic regression (LR), k-nearest neighbor (kNN), support vector machine (SVM), random forest (RF), gradient boosting machine (GBM), long short-term memory (LSTM), and InceptionTime. The best model was selected by both the highest accuracy and the largest area under the receiver operating characteristic curve (AUROC) and used for testing. Finally, the model performance and the permutation feature importance were evaluated.
Input features and hyperparameter tuning
The input features for cross-sectional models (i.e. LR, kNN, SVM, RF, and GBM) were the mean values of nine metrics (seven HRV parameters and two oxygen-related metrics) within each 60-s window. For time-series models (i.e. LSTM and InceptionTime), continuous HRV and SpO2 values in each 60-s window were used, with HR-mean and SpO2-std duplicated to align with the time-series format. To determine optimized hyperparameters, a grid search with 10-fold cross-validation was applied during the training and validation phase, and performance was evaluated using the area under the receiver operating characteristic curve (AUROC). The hyperparameters tuned for the cross-sectional models were as follows. (a) For the LR model, the inverse regularization strength (C, ranging from 10⁻¹ to 10¹) and optimization solvers (limited-memory Broyden–Fletcher–Goldfarb–Shanno method and liblinear) were tested. (b) For the kNN model, the k-value (ranging from 1 to 5) and weight type (uniform or distance) were optimized. (c) For the SVM model, variations included different types of kernels (linear, polynomial, radial basis function, and sigmoid), kernel coefficients (the reciprocal of the number of input features or the reciprocal of the product of the number of input features and the variance of input data), and the regularization parameter (C, ranging from 10⁻¹ to 10¹). (d) For the RF model, adjustments were made to the criterion (Gini impurity or Shannon entropy), the maximum number of features used for fitting (the square root or binary logarithm of the total number of features), the maximum depth of the trees (set to 2, 6, or 10), and the number of classification trees (set to 250, 500, or 800).
(e) For the GBM model, the criterion (mean squared error with or without improvement score by Friedman), the fraction of samples to be used for fitting the individual base learners (set to 0.5, 0.75, or 1), the maximum number of features used for fitting (the square root, binary logarithm, or total number of features), the maximum depth of the trees (set to 2, 6, or 10), and the number of estimators (set to 500 or 800) were configured. The training settings for time-series models (i.e. LSTM and InceptionTime) were as follows: (1) a batch size of 4096 and a cross-entropy loss function; and (2) Adam as the optimizer, 40 epochs, and a weight decay of 0.01.
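To give a sense of the size of the search space, the grids for the kNN, RF, and GBM models can be written out and enumerated. The keys below follow scikit-learn parameter naming, which is an assumption about how the study configured its estimators. The kNN grid assumes integer k values from 1 to 5, and the LR and SVM grids are left out because the text gives only the range of C, not the grid points. The helper `expand_grid` is introduced purely for illustration.

```python
from itertools import product

# Candidate values taken from the text; key names assume scikit-learn conventions.
param_grids = {
    "kNN": {
        "n_neighbors": [1, 2, 3, 4, 5],
        "weights": ["uniform", "distance"],
    },
    "RF": {
        "criterion": ["gini", "entropy"],
        "max_features": ["sqrt", "log2"],
        "max_depth": [2, 6, 10],
        "n_estimators": [250, 500, 800],
    },
    "GBM": {
        "criterion": ["friedman_mse", "squared_error"],
        "subsample": [0.5, 0.75, 1.0],
        "max_features": ["sqrt", "log2", None],  # None means all features
        "max_depth": [2, 6, 10],
        "n_estimators": [500, 800],
    },
}


def expand_grid(grid):
    """Return every hyperparameter combination in a grid as a list of dicts."""
    keys = list(grid)
    return [dict(zip(keys, values))
            for values in product(*(grid[k] for k in keys))]


# Each combination is fitted once per fold under 10-fold cross-validation.
n_folds = 10
fits_per_model = {name: len(expand_grid(grid)) * n_folds
                  for name, grid in param_grids.items()}
```

In practice, a dictionary of this shape could be passed to a grid-search utility with AUROC as the scoring metric; the enumeration here only makes the combinatorial cost explicit.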
Statistical analysis of model performance and interpretation
Model performance metrics included accuracy, precision, recall, F1-score (harmonic mean of precision and recall), and AUROC. The DeLong test was used to compare the AUROC between the models. The model with the highest accuracy and AUROC on the test dataset was selected for feature importance analysis to enhance interpretability (Figure 2). Feature importance was assessed using the permutation importance method, which quantified the impact of each feature on model performance. This approach involved measuring the decrease in AUROC when the values of a single feature were randomly shuffled, disrupting its association with the target variable. The process included calculating the model's baseline AUROC on the original dataset, shuffling each feature individually, and reevaluating AUROC. The importance of a feature was determined by the percentage decrease in AUROC, with larger reductions indicating higher significance to the model's predictive capability.
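The permutation procedure described above can be sketched in a few lines. This is an illustrative, model-agnostic version in plain Python, not the study's implementation: `predict` stands for any function that maps a list of feature rows to scores, the number of repeats and the seed are arbitrary, and the decrease is expressed relative to the baseline AUROC, which is one possible reading of "percentage decrease".

```python
import random


def auroc(labels, scores):
    """AUROC from the rank-sum (Mann-Whitney U) statistic, averaging tied ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Percentage decrease in AUROC after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = auroc(y, predict(X))
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)  # break the link between this feature and y
            X_perm = [list(row[:col]) + [v] + list(row[col + 1:])
                      for row, v in zip(X, shuffled)]
            drops.append(baseline - auroc(y, predict(X_perm)))
        mean_drop = sum(drops) / n_repeats
        importances.append(100.0 * mean_drop / baseline)
    return baseline, importances
```

For a time-series model, the same idea applies with one channel shuffled across segments at a time instead of one column.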
Results
Characteristics of enrolled data
Demographic characteristics and PSG parameters from eligible participants are presented in Table 1. In total, 374 participants (282 males and 92 females) with a mean age of 50.61 ± 13.92 years were included. The mean values of the BMI, neck circumference, and waist circumference were 31.33 ± 5.36 kg/m², 40.32 ± 3.47 cm, and 103.35 ± 11.22 cm, respectively. In terms of PSG parameters, the mean sleep efficiency was 72.3% ± 15.92%, while the mean AHI was 57.07 ± 25.98 events/h. The majority of eligible participants were categorized as having severe OSA (N = 315, 84.22%). As indicated in Figure 2, 14,629 time-series data segments were generated from the 374 participants.
Demographic characteristics of patients.
Abbreviations: BMI: body-mass index; SPT: sleep period of time; NREM: nonrapid eye movement; REM: rapid eye movement; WASO: wake time after sleep onset; ODI-3%: oxygen desaturation index ≥3%; AHI: apnea-hypopnea index; OSA: obstructive sleep apnea.
Data are expressed as the mean ± standard deviation.
Sleep parameters of CPAP titration
Sleep parameters from the CPAP titration are shown in Table 2. During CPAP titration, the optimal pressure was 11.75 ± 2.71 cmH2O, while the RDI and SpO2 under such a pressure were 1.53 ± 4.41 events/h and 96.59% ± 1.38%. In terms of other whole-night sleep parameters, the mean sleep efficiency was 76.84% ± 14.62%, and the mean RDI was 9.97 ± 7.09 events/h.
Sleep parameters and ventilator setting details in continuous positive airway pressure (CPAP) titration.
Abbreviations: RDI: respiratory disturbance index; SpO2: peripheral arterial oxygen saturation; SPT: sleep period of time; NREM: nonrapid eye movement; REM: rapid eye movement; WASO: wake time after sleep onset; ODI-3%: oxygen desaturation index ≥3%.
Data are expressed as the mean ± standard deviation.
Validation performance of ML approaches
There were 299 patients in the training dataset, with 9491 data points labeled as without pressure adjustment and 2067 with pressure adjustment (11,558 data points in total). Table 3 documents the performance of the ML algorithms in the training process. Mean accuracies of the various models were 48.49% ± 6.31% for the LR, 66.7% ± 5.5% for the kNN, 69.16% ± 6.64% for the SVM, 72% ± 3.26% for the RF, 72.06% ± 3.74% for the GBM, 77.1% ± 4.64% for the LSTM, and 78.27% ± 2.87% for InceptionTime. Mean AUROC values of the various models were 55.82% ± 3.2% for the LR, 56.63% ± 3.07% for the kNN, 74.95% ± 4.48% for the SVM, 77.48% ± 2.76% for the RF, 76.67% ± 2.84% for the GBM, 75.84% ± 3.73% for the LSTM, and 78.07% ± 2.19% for InceptionTime. In the validation phase, the InceptionTime model achieved the highest accuracy and AUROC.
Comparison of the cross-validation results of the models established using multiple machine learning approaches using a patient-independent training dataset (number of patients: 299).
Abbreviations: LR: logistic regression; kNN: k-nearest neighbor; SVM: support vector machine; RF: random forest; GBM: gradient boosting machine; LSTM: long short-term memory; AUROC: area under the receiver operating characteristic curve.
Test set performance and feature importance analysis
In the patient-independent test dataset (N = 75; with 3071 total data points), 2536 data points were labeled as not requiring pressure adjustment, while 535 were labeled as requiring adjustment (Table 4). The results of the DeLong test used to compare the AUROC between the models are provided in Supplemental Figure S1. Among the evaluated models, accuracy ranged from 50.02% (LR) to 80.92% (InceptionTime), while AUROC varied from 56.45% (LR) to 76.52% (InceptionTime). Notably, InceptionTime achieved the highest performance with an accuracy of 80.92% and an AUROC of 76.52%, outperforming the other classifiers. Given its superior performance, InceptionTime was selected for permutation feature importance analysis to further interpret the model's predictions. SpO2-std (9.32%) and SpO2 (4.44%) were the most and third-most important features, respectively. Among HRV metrics, nVLF showed the highest importance (6.94%) and ranked second overall, followed by nLF (3.95%) and nHF (2.03%), which were ranked fourth and fifth overall, respectively (Figure 3).

Feature importance of the selected model (InceptionTime model) in the patient-independent test dataset (number of patients: 75). Feature importance is represented as the percentage decrease in area under the receiver operating characteristic curve (AUROC) when the values of an individual input feature are randomly shuffled, highlighting the relative contribution of each feature to the model's performance.
Comparison of the performance of the models established using multiple machine learning approaches using a patient-independent test dataset (number of patients: 75).
Abbreviations: LR: logistic regression; kNN: k-nearest neighbor; SVM: support vector machine; RF: random forest; GBM: gradient boosting machine; LSTM: long short-term memory; AUROC: area under the receiver operating characteristic curve.
Discussion
Regarding the improvement in OSA manifestations in the derived data, the frequency of respiratory events decreased markedly, from a mean AHI of 57.07 events/h without CPAP (i.e. PSG) to a mean RDI of 9.97 events/h with CPAP (titration). Under optimal pressure settings, the mean RDI was only 1.53 ± 4.41 events/h, categorized as a normal level (i.e. without OSA). These improvements underscore CPAP as the gold standard treatment for OSA. Also, determining the optimal pressure for maintaining upper airway patency can aid in eliminating respiratory events associated with OSA. In line with the present findings, previous researchers indicated that CPAP therapy positively affects OSA patients, including decreasing the AHI and improving cognitive function.29,30 Another study also indicated improved self-rated sleep quality and sleepiness level after three months of CPAP therapy. 31 Collectively, the current results affirm the efficacy of CPAP therapy in managing OSA, emphasizing the importance of tailored optimal settings and treatment plans to maximize therapeutic benefits.
Concerning the performance of the developed approaches, the InceptionTime model exhibited the highest values of both accuracy and AUROC during the training and validation stages, followed by the LSTM and RF models. Additionally, in the testing stage, InceptionTime still maintained the highest accuracy and AUROC on a patient-independent dataset, demonstrating its resilience and adaptability when applied to new patients. While there is no conclusive evidence that time-series models are superior to cross-sectional methods for processing sequential data, the inclusion of dynamic features through time ordering or stride—treating each time point as a unique characteristic—may highlight the potential advantages of time-series models. In other words, time-series classification models (e.g. InceptionTime), which contain convolutional filters of varying lengths, are more likely to capture patterns and features at various scales. 32 This capability may make them highly effective in recognizing complex and diverse temporal patterns in time-series data. Similar to previous outcomes, InceptionTime was used to predict the severity of some depressive disorders based on other time-series data, such as electroencephalograms. 33 Another study incorporated the InceptionTime module into their deep learning model and achieved high accuracy (>80%) in predicting sleep stages. 34 Collectively, the relatively superior performance of time-series models may highlight their potential in utilizing temporal features to predict the necessity for pressure adjustments in CPAP therapy.
Regarding feature importances, a permutation analysis revealed that oxygen-related features, particularly SpO2-std and SpO2, were crucial in predicting pressure adjustments of CPAP using the InceptionTime model. These observations may be explained by oxygen-related features providing immediate insights into arterial blood oxygen saturation, crucial during sleep when breathing interruptions can cause oxygen levels to fluctuate. Such fluctuations indicate the need for adjusting the pressure of CPAP to maintain effective therapy and ensure stable oxygen saturation. Essentially, oxygen desaturation during sleep signals airway collapse and insufficient ventilation, necessitating adjustments in the pressure setting of CPAP to support proper airway function and oxygen delivery. To the best of our knowledge, although there are no widely applied guidelines that use oxygen-related features to prescribe pressures for CPAP, some studies indicated the importance of these features. For example, one study demonstrated that auto-adjusting CPAP, guided by maintaining SpO2 levels, significantly improved sleep-related breathing disorders in children with adequate pressure support. 35 Another study proposed a pressure optimization equation for CPAP considering the lowest SpO2 level as the main indicator. 36 Furthermore, a relevant study presented a novel clinical index, the saturation oxygen pressure index, to monitor the relationship between the pressure provided by CPAP and mean oxygen saturation in neonates. 37 Taken together, per the current outcomes, oxygen-related features may be reliable indicators that inform necessary adjustments in pressure for CPAP to prevent hypoxemia, underscoring the critical role of these features in optimizing therapy outcomes.
In terms of HRV features, frequency-domain measures, including nVLF, nLF, and nHF, were ranked as relatively important in predicting pressure adjustments. While there is no direct evidence linking frequency-domain HRV features to pressure level adjustment requirements, some underlying mechanisms may support their feasibility. Clinically, HRV measures are utilized to quantify autonomic changes, with high-frequency power reflecting vagal tone and low-frequency power indicating sympathetic activity. 38 A previous study demonstrated that significant alterations in the sympathovagal balance, indicated by consistently increased nLF, were observed during respiratory events. 39 Another study documented significant correlations between variations in SpO2 and nHF. 40 Those findings also indicated a relative increase in sympathetic modulation during episodes of oxygen desaturation, as evidenced by decreased nHF. Likewise, researchers revealed that amplitudes of very-low-frequency and low-frequency components were significantly reduced when receiving pressure support compared to sleep without it. 41 This partially suggests that physiological changes in both autonomic activity and sleep structure can be reflected by nVLF, thereby serving as an objective indicator for respiratory event episodes. Altogether, frequency-domain analysis of HRV may be able to represent activation of the sympathetic and parasympathetic nervous systems during sleep apnea episodes, highlighting the sympathovagal imbalance caused by respiratory disturbances.
There are several strengths of the present study. First, this study demonstrated the adequate accuracy of established prediction models for CPAP level adjustments based on readily available clinical parameters, namely HRV and oxygen-related features. In addition, this study evaluated the necessity for pressure adjustments by employing a 1-min window and demonstrated accurate results, which can be considered a reasonable and applicable time segment for predicting CPAP level adjustments in clinical practice. In exploring feature importances, the established models primarily focused on SpO2-std, SpO2, and the frequency domain of HRV within a 1-min window, aligning with findings from previous research. This highlighted the feasibility and confirmed the robustness of using these easily accessible parameters to predict pressure adjustments of CPAP. Additionally, these outcomes concerning feature importance suggest the potential for considering such parameters when monitoring and adjusting pressure levels of CPAP in clinical settings. Altogether, this study demonstrated the feasibility of integrating the proposed model into CPAP therapy systems. By leveraging readily available clinical parameters such as HRV and oximetry features, this approach can potentially enhance real-time pressure adjustments in CPAP machines, improve clinical decision-making, and personalize patient care.
Several limitations of this study should be addressed in future research. First, the retrospective collection of PSG and CPAP titration data was limited to a single sleep center and involved 374 participants from a single ethnic group in northern Taiwan. As a result, the generalizability of the CPAP adjustment prediction model to other populations and ethnicities may be limited, requiring validation in larger and more diverse cohorts. In addition, differences in datasets, patient demographics, and study protocols may hinder direct comparisons with existing methodologies. This study also did not account for craniofacial factors or sleep position, both of which can significantly influence breathing patterns (e.g. buccal respiration) and necessitate adjustments in CPAP pressure.42,43 Furthermore, although the model demonstrated feasibility and adequate performance, its real-world clinical impact remains uncertain. Variations in healthcare environments, CPAP equipment, and clinical protocols may affect model performance. Regarding feature importance, although permutation importance is a useful method for estimating feature relevance, it is sensitive to correlated features and may underestimate the importance of features with nonlinear interactions. Future research should focus on multicenter validation across diverse populations and healthcare settings to enhance model robustness and clinical relevance. Incorporating additional physiological metrics (e.g. respiratory rate, blood pressure) and longitudinal CPAP titration data may further improve prediction accuracy and generalizability. Comparative studies with other autotitration strategies are also recommended to benchmark performance and assess clinical value. Lastly, several considerations remain regarding clinical implementation. Integrating the model into real-world CPAP systems would require overcoming technical challenges such as real-time signal processing, hardware compatibility, and data privacy. Additionally, regulatory approval and usability testing will be essential to translate the algorithm into a practical and trusted tool for clinical use.
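The sensitivity of permutation importance to correlated features, noted among the limitations above, can be illustrated with a small Python sketch. Everything here is assumed for illustration: the data are synthetic, the two rule-based predictors stand in for a trained model (they are not InceptionTime), and the hand-written `permutation_importance` function is a hypothetical helper rather than the implementation used in this study.

```python
import numpy as np


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each column of X is shuffled in turn."""
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)
    drops = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        scores = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling one column breaks its link with the label.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            scores.append(np.mean(predict(X_perm) == y))
        drops[j] = baseline - np.mean(scores)
    return drops


# Synthetic data: column 1 is a noisy copy of column 0 (highly correlated).
rng = np.random.default_rng(1)
signal = rng.normal(size=2000)
X = np.column_stack([signal, signal + 0.1 * rng.normal(size=2000)])
y = (signal > 0).astype(int)


def single_rule(X):
    # Toy predictor that relies on column 0 only.
    return (X[:, 0] > 0).astype(int)


def shared_rule(X):
    # Toy predictor that splits its reliance across both correlated columns.
    return (X[:, 0] + X[:, 1] > 0).astype(int)


imp_single = permutation_importance(single_rule, X, y)
imp_shared = permutation_importance(shared_rule, X, y)
```

In this toy setting the same underlying signal drives both predictors, yet the accuracy drop attributed to column 0 is noticeably smaller for `shared_rule` than for `single_rule`, because the intact correlated column partly compensates for the shuffled one. Importance rankings for correlated physiological features should be read with this effect in mind.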
Conclusions
The current CPAP pressure adjustment procedure may expose patients to respiratory events or oxygen desaturation. To mitigate this risk, this study explored the feasibility of developing models that proactively predict adjustments using HRV and oximetry measures, analyzing 14,629 time-series cases from 374 patients. The InceptionTime model (time-series) demonstrated the highest performance, with an accuracy of 78.27% and an AUROC of 78.07% during training and validation, and an accuracy of 80.92% and an AUROC of 76.52% during testing. Notably, SpO2-std, SpO2, and nVLF were the most impactful features for predicting adjustment moments. These findings support the potential of HRV and oximetry-related measures as leading indicators for CPAP adjustments, which may help improve OSA treatment.
Supplemental Material
Supplemental material, sj-docx-1-dhj-10.1177_20552076251339273 for From reactive to proactive: Machine learning models for continuous positive airway pressure adjustments using heart rate variability and oximetry-related parameters by Chih-Fan Kuo, Yi-Chih Lin, Ze-Yu Chen, Jiunn-Horng Kang, Cheng-Chen Chang, Zhihe Chen, Arnab Majumdar, Yen-Ling Chen, Yi-Chun Kuan, Kang-Yun Lee, Po-Hao Feng, Kuan-Yuan Chen, Hsin-Chien Lee, Wun-Hao Cheng, Wen-Te Liu and Cheng-Yu Tsai in DIGITAL HEALTH
Footnotes
Acknowledgments
We are grateful to the technologists at the Sleep Center of Shuang Ho Hospital for gathering and managing the raw data. We also thank all of the participants and researchers who contributed to this study.
Ethical considerations
This study is a retrospective analysis conducted using data sets acquired from Taipei Medical University-Shuang Ho Hospital. The protocols for data collection, storage, deidentification, and statistical analysis were reviewed and approved by the Joint Institutional Review Board at the Office of Human Research of Taipei Medical University (Clinical Trial Number: TMU-SHH, N202212067). Additionally, the Joint Institutional Review Board of Taipei Medical University granted an exemption from the requirement of obtaining informed consent from subjects.
Author contributions
Conceptualization: Chih-Fan Kuo and Yi-Chih Lin; methodology, experimentation, and software: Ze-Yu Chen, Zhihe Chen, and Cheng-Chen Chang; data analysis and interpretation: Kang-Yun Lee and Po-Hao Feng; writing—original draft: Chih-Fan Kuo, Yi-Chun Kuan, and Cheng-Yu Tsai; visualization: Chih-Fan Kuo and Cheng-Chen Chang; supervision: Jiunn-Horng Kang, Arnab Majumdar, and Wen-Te Liu; writing—review, editing and revision: Yen-Ling Chen, Kuan-Yuan Chen, Hsin-Chien Lee, and Wun-Hao Cheng. All authors read and approved the final version.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by the Taiwan National Science and Technology Council (Grant Nos. NSTC 112-2634-F002-003 and NSTC 113-2222-E-038-003) and Taipei Medical University (Grant No. TMU113-AE1-B03). The funders had no role in the study design, data collection and analysis, decision to publish, or manuscript preparation.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
The data for this study were collected at the Sleep Center of Taipei Medical University-Shuang Ho Hospital between January 2020 and January 2022. Due to the inclusion of personal clinical information (Clinical Trial No. TMU-SHH, N202212067), the data are not provided as a supplementary file. Interested parties may contact the corresponding authors to request access to the dataset and relevant documents.
Precis
This study used HRV and oximetry-related data to develop ML models that proactively predict CPAP adjustment timings, enhancing treatment for OSA.
Supplemental material
Supplemental material for this article is available online.