Abstract
This study develops machine learning-based algorithms for accurate prediction of cerebral oxygen saturation using waveform data in the near-infrared range from a multi-modal oxygen saturation sensor. Data comprised 150,000 observations from a popular cerebral oximeter, Masimo O3™ regional oximetry (Masimo, United States), and a multi-modal cerebral oximeter, Votem (Inc., Korea). Of these observations, 112,500 (75%) were used for training and 37,500 (25%) for testing. The dependent variable was the cerebral oxygen saturation value from the Masimo O3™ (0–100%). The independent variables were the time of measurement (0–300,000 ms) and the 16-bit decimal amplitude values (infrared and red) from Votem (0–65,535). For the right part of the forehead, the root mean square error of the random forest (0.06) was much smaller than those of linear regression (1.22) and the artificial neural network with one, two or three hidden layers (2.58). The result was similar for the left part of the forehead: random forest (0.05) versus linear regression (1.22) and the artificial neural network with one, two or three hidden layers (2.97). Machine learning aids in accurately predicting cerebral oxygen saturation from the data of a multi-modal cerebral oximeter.
Introduction
In recent times, cerebral blood oxygen saturation has been increasingly recognized as an important hemodynamic parameter and has been utilized in various clinical contexts, such as the prevention of hypoxia and brain ischemia during anesthesia, the estimation of cerebral blood flow during cardiopulmonary resuscitation for post-cardiac arrest care, and the evaluation of patient status in neonatal intensive care units.1–3 A cerebral oximeter is a non-invasive device that continuously monitors cerebral oxygen saturation through an adhesive pad, carrying a light source and a light detector, that is attached to the patient’s scalp.4,5 These devices measure the oxygenation of hemoglobin in the frontal cerebral cortex, which is particularly vulnerable to changes in oxygen supply and demand, by utilizing light’s ability to penetrate the skull.
Specifically, cerebral oximeters are based on near-infrared spectroscopy (NIRS), which assesses the chemical composition of a compound or solution by measuring its absorption of near-infrared radiation. NIRS operates in the near-infrared electromagnetic spectrum, with wavelengths ranging from 700 nm to 2500 nm. The main principle behind NIRS is the Beer-Lambert Law: the concentration of a certain chemical compound in a solution determines how much light, whether red or infrared, this solution will absorb. The higher the concentration, the more radiation of a specific wavelength will be absorbed. Here, NIRS works differently from other spectroscopy approaches because of the unique interaction that near-infrared radiation has with matter. Rather than exciting the electrons within the atoms of a chemical element, near-infrared radiation affects entire molecules, more specifically their vibrational motion, that is, the motion of the bonds that hold the atoms within a molecule together.6–9
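For reference, the Beer-Lambert relationship described above can be written in its textbook form. The symbols are assumptions introduced here for illustration and do not appear in the original text: I_0 is the incident light intensity, I the transmitted intensity, ε the molar absorption coefficient, c the concentration and ℓ the optical path length. The second and third lines extend the same law to two absorbers, oxygenated and deoxygenated hemoglobin, in an idealized non-scattering medium. This is only a sketch of the principle, not the algorithm used by either device.

```latex
\begin{align}
A(\lambda) &= \log_{10}\frac{I_0(\lambda)}{I(\lambda)} = \varepsilon(\lambda)\, c\, \ell \\
A(\lambda) &= \bigl[\varepsilon_{\mathrm{HbO_2}}(\lambda)\, c_{\mathrm{HbO_2}}
            + \varepsilon_{\mathrm{Hb}}(\lambda)\, c_{\mathrm{Hb}}\bigr]\, \ell \\
S &= \frac{c_{\mathrm{HbO_2}}}{c_{\mathrm{HbO_2}} + c_{\mathrm{Hb}}}
\end{align}
```

Measuring the absorbance A at two wavelengths gives two equations in the two unknown concentrations, from which the saturation S could in principle be derived.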
However, existing NIRS-based cerebral oximetry faces several challenges. Firstly, existing patient monitors cannot measure cerebral oxygen saturation themselves, necessitating separate monitoring devices and sensors. These additional devices are costly and space-consuming, making it difficult to use them for many patients. Secondly, the utility of cerebral oxygen saturation in evaluating critically ill patients may be limited due to spatial constraints and the high cost of cerebral oximeters, which can hinder the collection and analysis of sufficient data. Thirdly, cerebral oximeters can be influenced by various factors such as hemodynamic status, skin conditions like scalp bleeding or venous congestion, and the presence of intracranial foreign bodies. For these reasons, it is essential to improve cerebral oximeter performance with machine learning-based algorithms, which become increasingly accurate through repeated learning from large datasets. In this study, we address these challenges by introducing a multi-modal cerebral oximeter (Votem) and developing machine learning-based algorithms that enable accurate prediction of cerebral oxygen saturation based on Votem data.
Methods
Data
The data source for this study comprised 150,000 observations from a popular cerebral oximeter (O3™, Masimo, Irvine, CA, USA) and patient monitoring equipment using a multi-modal oxygen saturation sensor, VP-1200, Votem (Inc., Korea). The inclusion criteria for the 20 participants were age 20–45 years, no underlying disease, and informed consent for study participation. The exclusion criteria were: (1) hypertension, diabetes mellitus, coronary artery disease, heart failure, renal insufficiency, liver insufficiency, peripheral vascular disease or hematologic disease; (2) skin pigmentation disorders like jaundice that affect optical absorption; (3) recent or acute respiratory tract infections, chronic lung disease, or smoking; (4) brain injury (including hemorrhagic stroke), ischemic stroke, or carotid artery stenosis; (5) pregnancy, sensitive skin or atopic conditions, or status as a foreign resident. The Masimo O3™ and Votem sensors are displayed in Figure 1. They were first attached to the right and left sides of the forehead, respectively (Masimo O3™ right, Votem left). Then, their positions were switched (Masimo O3™ left, Votem right). The Masimo O3™ sensor measured cerebral oxygen saturation every 2 s, while the Votem sensor recorded the waveform every 4 ms for 10 min on each side.
Figure 1. Masimo O3™ and Votem sensors. Legend: Masimo O3™ and Votem sensors were attached to the right and left of the forehead, respectively (Masimo O3™ right, Votem left). Then, they switched their positions (Masimo O3™ left, Votem right).
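The two sampling rates described above (one Masimo O3™ reading every 2 s, one Votem sample every 4 ms) imply 500 waveform samples per reference reading. How the two streams were paired is not described in this section. The following is a minimal sketch of one plausible approach, assuming each waveform sample is labeled with the most recent reference reading. The function name `pair_streams`, its parameters and all data values are hypothetical.

```python
def pair_streams(waveform, reference, waveform_step_ms=4, reference_step_ms=2000):
    """Label each waveform sample with the most recent reference reading.

    waveform  : list of (infrared, red) amplitude pairs, one per 4 ms (assumed)
    reference : list of saturation readings, one per 2 s (assumed)
    Returns rows of (time_ms, infrared, red, saturation).
    """
    rows = []
    for i, (infrared, red) in enumerate(waveform):
        time_ms = i * waveform_step_ms
        # Index of the latest reference reading taken at or before time_ms,
        # capped so trailing samples reuse the last available reading.
        ref_index = min(time_ms // reference_step_ms, len(reference) - 1)
        rows.append((time_ms, infrared, red, reference[ref_index]))
    return rows


# Hypothetical values: 1,000 waveform samples (4 s) and two reference readings.
waveform = [(30000 + i, 20000 + i) for i in range(1000)]
reference = [71.0, 72.0]
rows = pair_streams(waveform, reference)
print(rows[499], rows[500])
```

Under this assumption, samples 0 to 499 share the first reference reading and sample 500 (at 2,000 ms) is the first to carry the second one.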
The Votem sensor is presented in Figure 2. It includes a light source and two light detectors, and employs a flexible pad consisting of plastic sensors and black sponges with a conventional distance between the sensors. The light source produced an infrared waveform (870 nm) for the left light detector and a red waveform (660 nm) for the right light detector. As discussed below, cerebral oxygen saturation from the Masimo O3™ sensor served as the dependent variable, while infrared and red waveforms from the light detectors in the Votem sensor served as the independent variables.
Figure 2. Votem sensor. Legend: The Votem sensor includes a light source and two light detectors. The light source produces the infrared waveform (870 nm) for the left light detector and the red waveform (660 nm) for the right light detector. Masimo O3™ data serve as the dependent variable and Votem data from the light detectors serve as the independent variables. 140: electrical power source, 110: red/infrared light source, 120: photodiode for infrared light source, 130: photodiode for red light source.
Analysis
Recently, the term “machine learning” has attracted great attention all over the globe, and the random forest and the artificial neural network are among the most common machine learning algorithms. A decision tree consists of an intermediate node (the evaluation of a predictor), a branch (a value of the predictor as a result of the evaluation) and a terminal node (a value of the dependent variable). A random forest relies on “bootstrap aggregation,” that is, decision trees are created based on random samples with replacement (bootstrap), then these trees take a majority vote on the dependent variable or, for a continuous dependent variable, average their predictions (aggregation).10 An artificial neural network is composed of the input layer, intermediate layers and the output layer. Each layer has neurons (operation units), and neurons in different layers are connected with weights, which show the strengths of connection. The weights are updated as information moves forward and backward between the layers through the neurons.11 Traditional research for the early diagnosis of disease adopts linear regression with an unrealistic assumption of ceteris paribus, that is, “all the other variables staying constant.” In this context, emerging literature uses machine learning, for example, in studies of birth outcome11,12 and menopause.13,14 Machine learning does not require the unrealistic assumption of “all the other variables staying constant,” yet it can still identify which predictors are more important for the early diagnosis of the dependent variable.
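The bootstrap aggregation idea described above can be illustrated with a short, self-contained sketch. It is not the study's implementation. It uses one-split regression trees ("stumps") on a single predictor, averages their predictions because the dependent variable is continuous, and omits the random selection of predictors at each split that a full random forest also performs. All function names and data values are hypothetical.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree by minimizing the residual sum of squares."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # exclude the max so both sides are non-empty
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        rss = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or rss < best[0]:
            best = (rss, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: always predict the overall mean
        overall = mean(ys)
        return (float("inf"), overall, overall)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Bootstrap: fit each stump on a sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """Aggregation: average the predictions of all stumps."""
    return mean([predict_stump(stump, x) for stump in forest])


# Hypothetical toy data: the response steps from 60 to 80 at x = 10.
xs = list(range(20))
ys = [60.0 if x < 10 else 80.0 for x in xs]
forest = fit_bagged_stumps(xs, ys)
low, high = predict_forest(forest, 2), predict_forest(forest, 17)
print(low, high)
```

Each stump sees a slightly different resample of the data, so averaging over many of them smooths out the variability of any single tree.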
Five machine learning models were applied and compared for the prediction of cerebral oxygen saturation. Among the 150,000 observations, 112,500 (75%) and 37,500 (25%) were used for training and test sets, respectively. The dependent variable was the value of cerebral oxygen saturation from the Masimo O3™ with the range of 0–100% (Figure 1). The independent variables (or predictors) were the time of measurement with the range of 0–300,000 ms and the values of 16-bit decimal amplitudes (infrared and red) from Votem with the range of 0–65,535 (Figure 2). The random forest, linear regression, and artificial neural networks with one, two and three hidden layers were trained, tested and compared on the criterion of the root mean square error (RMSE) between the Masimo O3™ value and its predicted value from the Votem data. Default parameters for these models were used. For the random forest, the number of trees was 500 and node impurity was measured based on the residual sum of squares. For the artificial neural network, the number of hidden neurons in each layer was 1, the optimization algorithm was resilient backpropagation with weight backtracking, and the activation function was logistic.
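The 75%/25% partition and the RMSE criterion described above can be sketched in a few lines. The text does not state whether the split was random or chronological. Random shuffling is an assumption here, as are the function names and the placeholder rows.

```python
import math
import random


def train_test_split(rows, train_fraction=0.75, seed=0):
    """Shuffle rows reproducibly, then split them into training and test lists.

    Random shuffling is an assumption; the source does not specify the method.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical placeholder rows: (time_ms, infrared, red, reference_saturation).
rows = [(4 * i, 30000 + i, 20000 + i, 70.0) for i in range(150_000)]
train, test = train_test_split(rows)
print(len(train), len(test))  # 112500 37500

# RMSE of a hypothetical model that always predicts 71.0 on the test set.
actual = [row[3] for row in test]
print(rmse(actual, [71.0] * len(actual)))  # 1.0
```

A model's RMSE is expressed in the same units as the dependent variable, here percentage points of oxygen saturation.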
Results
Descriptive statistics.
Note. In Figure 1, Masimo O3™ and Votem sensors were attached to the right and left of the forehead, respectively (Masimo O3™ right, Votem left). Then, they switched their positions (Masimo O3™ left, Votem right). Votem 1 and 2 denote the infrared and red waveforms, respectively.
Model performance: root mean squared error.

Model prediction: predicted versus actual values - Votem left.

Model prediction: predicted versus actual values - Votem right.
Discussion
The cerebral oximeter relies on the fact that near-infrared light in the optical window of 650–940 nm demonstrates strong transmission capacity in human tissue. Light in this range can penetrate from the probe through the subcutaneous tissue and skull to the underlying cerebral tissue. The primary molecules in human tissue that absorb light in this range are metal complex chromophores such as hemoglobin, bilirubin, and cytochrome. A conventional cerebral oximeter estimates cerebral oxygen saturation based on the principle that hemoglobin is a major absorbing molecule of near-infrared light in this range and its absorption spectra vary depending on whether it exists in an oxygenated or deoxygenated form.6,7 A significant confounding factor for accurately measuring cerebral oxygen saturation is the absorption and reflection of light in extra-cerebral tissues such as the scalp and skull.15,16 Other body parts that absorb light of a similar range can affect the measurement outcome. Currently, there is no standard for addressing these issues. Several cerebral oximeters are commercially available, and they attempt to overcome these challenges with different combinations of (1) spatial resolution, which places a certain distance between the light source and the light detector, and (2) algorithmic resolution, which calculates cerebral oxygen saturation based on an effective combination of waveforms from the light detectors.7–9
In other words, an existing cerebral oximeter has challenging requirements such as an appropriate distance between the sensors, effective attachment to the forehead, and blocking of surrounding light.17–20 These requirements also vary considerably from person to person. Moreover, existing patient monitors lack the capability to measure cerebral oxygen saturation, necessitating additional monitoring devices and sensors to measure this parameter. However, the high cost and space requirements of the additional equipment present significant challenges to its implementation in the care of multiple patients. The potential utility of measuring cerebral oxygen saturation in critically ill patients may be hindered by the need for separate space and the high cost of cerebral oximeters.21–23 These factors may limit the collection and analysis of sufficient hemodynamic parameters as well as the reliable interpretation of cerebral oxygen saturation levels. Additionally, factors such as hemodynamic status, skin conditions like scalp bleeding or venous congestion, and the presence of intracranial foreign bodies can influence the performance of cerebral oximeters. Therefore, it is essential to improve the accuracy of cerebral oximetry through machine learning-based algorithms that use big data to enhance their performance over time.
In this context, this study developed a new approach for measuring cerebral oxygen saturation by combining sensors from existing equipment and applying machine learning-based algorithms. Firstly, this machine learning-based device is expected to be less expensive than the existing cerebral oximeter and applicable to a broader range of target patients. Secondly, it would exhibit excellent compatibility, given that it does not require a separate monitor and only uses sensors from existing equipment. Thirdly, it would serve as a robust foundation for machine learning-based models that predict severity in critically ill patients in real time, based on the collection and analysis of multiple hemodynamic parameters such as peripheral versus cerebral oxygen saturation levels. It can be expected that, as more data accumulate, more learning proceeds and stronger performance follows.
Several studies have attempted to assess and predict hemodynamic status using machine learning algorithms with data collected from existing patient monitors.24–26 One study developed an AI algorithm that predicts the correlation between the 3-lead ECG and hemodynamic status with excellent accuracy (85.6%). This approach allows for real-time prediction of a patient’s hemodynamic status using only the 3-lead ECG, without the need for invasive procedures.24 Another study developed an AI algorithm predicting acute patient deterioration in the emergency department based on just five parameters from existing patient monitors: heart rate, respiratory rate, peripheral oxygen saturation, body temperature, and blood pressure.26 The authors applied this algorithm to patient monitors, enabling medical staff to quickly identify patient deterioration. These studies developed AI algorithms that assess and predict hemodynamic status and outcomes using data collected from existing patient monitors and machine learning. In contrast, this study proposes a new approach that (1) modifies existing sensors for the prediction of additional hemodynamic parameters and (2) integrates these sensors with existing patient monitors. This method has the potential to overcome the limitations of cerebral oximeters in a cost-effective manner, although further cost-effectiveness analyses are necessary.
The random forest registered almost perfect performance. For Votem on the left part of the forehead, the RMSE of the random forest (0.05) was much smaller than those of linear regression (1.22) and the artificial neural network with one, two or three hidden layers (2.97). The result was similar for Votem on the right part of the forehead, that is, random forest (0.06) versus linear regression (1.22) and the artificial neural network with one, two or three hidden layers (2.58). To the best of our knowledge, this is the first study in this direction.
Limitations
This study had some limitations. Firstly, this study involved healthy adults from a single country, which may limit the generalizability of our findings. Future prospective studies should include individuals from diverse ethnic backgrounds, with various skin conditions, and those who are critically ill, to validate the applicability of our method more broadly. Secondly, this study used the training and test sets only. Employing an external validation set is expected to strengthen the external validity of this study. Thirdly, little literature is available and more investigation is needed on the expanding interaction among digital health, machine learning and multi-modal oxygen saturation sensors. It is a widely accepted claim that machine learning accelerates precision medicine with unprecedented performance and that this innovation is a driving force of digital medicine across all areas including diagnosis, management and prognosis.27 This development is expected to have a significant effect on the progress of multi-modal oxygen saturation sensors in the context of digital medicine, and it will be an important topic for future research.
Conclusions
Traditional cerebral oximeters face several challenges such as sensor placement, personal variations, and high costs associated with additional equipment. This study developed a new approach to overcome these limitations for measuring cerebral oxygen saturation by combining existing sensors and applying machine learning-based algorithms. This innovative method integrates with existing patient monitors and has the potential to improve the accuracy and cost-effectiveness of cerebral oximetry, making it more accessible and beneficial for patient care.
Footnotes
Author contributions
K-SL and SK designed the study. K-SL, SK, DCK, S-HP, D-HJ, EHK, YK, SL, SWL collected, analyzed, and interpreted the data. K-SL and SK wrote and reviewed the manuscript. All authors have read and approved the final version of the manuscript.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by (1) Korea Medical Device Development Fund grant funded by the Ministry of Science and ICT, the Ministry of Trade, Industry and Energy, the Ministry of Health & Welfare and the Ministry of Food and Drug Safety of South Korea (RS-2020-KD000275) and (2) Korea Health Industry Development Institute grant funded by the Ministry of Health and Welfare of South Korea (No. HI22C1302 (Korea Health Technology R&D Project)). The funders had no role in the design of the study, in the collection, analysis, and interpretation of the data; or the writing and review of the manuscript.
Ethical statement
Data availability statement
The data generated or analyzed during this study are available from the corresponding author upon reasonable request.
