Abstract
Introduction
COVID-19 continues to impact vulnerable populations disproportionately. Identifying modifiable risk factors could lead to targeted interventions to reduce infections. The purpose of this study is to identify risk factors for testing positive for SARS-CoV-2.
Methods
Using electronic health records collected from a large ambulatory care system in northern and central California, the study identified patients who had a test for SARS-CoV-2 between 2/20/2020 and 3/31/2021. The adjusted effects of active and passive smoking and other risk factors on the probability of testing positive for SARS-CoV-2 were estimated using multivariable logistic regression. Analyses were conducted in 2021.
Results
Of 556 690 eligible patients in our sample, 70 564 (12.7%) tested positive for SARS-CoV-2. Younger age, male sex, racial/ethnic minority status, and mild major comorbidities were significantly associated with a positive SARS-CoV-2 test. Current smokers (adjusted OR: 0.69, 95% CI: 0.66-0.73) and former smokers (adjusted OR: 0.92, 95% CI: 0.89-0.95) were less likely than nonsmokers to be lab-confirmed positive, but no statistically significant differences were found when comparing passive smokers with nonsmokers. The patients with missing smoking status (25.7%) were more likely to be members of vulnerable populations: they were more likely to have major comorbidities (adjusted OR ranges from severe: 2.52, 95% CI = 2.36-2.69 to mild: 3.28, 95% CI = 3.09-3.48), to have lower income (adjusted OR: 0.85, 95% CI: 0.85-0.86), to be aged 80 years or older (adjusted OR: 1.11, 95% CI: 1.07-1.16), to have less access to primary care (adjusted OR: 0.07, 95% CI: 0.07-0.07), and to identify as racial/ethnic minorities (adjusted OR ranges from Hispanic: 1.61, 95% CI = 1.56-1.65 to Non-Hispanic Black: 2.60, 95% CI = 2.5-2.69).
Conclusions
Our findings suggest that the odds of testing positive for SARS-CoV-2 were significantly lower in smokers than in nonsmokers. Other risk factors include missing data on smoking status, being under 18, being male, being a racial/ethnic minority, and having mild major comorbidities. Since those with missing data on smoking status were more likely to be members of vulnerable populations with higher smoking rates, the risk of testing positive for SARS-CoV-2 among smokers may have been underestimated due to missing data on smoking status. Future studies should investigate the risk of severe outcomes among active and passive smokers, the role that exposure to tobacco smoke plays among nonsmokers, the role of comorbidities in COVID-19 disease course, and health disparities experienced by disadvantaged groups.
Introduction
In two and a half years, over 557 million people worldwide have been infected with coronavirus disease 2019 (COVID-19), caused by the SARS-CoV-2 virus (severe acute respiratory syndrome coronavirus 2), and over 6.3 million people have died. 1
While increasing evidence suggests that the elderly, racial/ethnic minorities, and those with certain comorbidities are at significantly higher risk of adverse outcomes from COVID-19,2-4 little else is known about what impacts COVID-19 infection. Smoking is a particularly concerning health behavior because it suppresses immune function in the lungs, forces the exhalation of significant quantities and high concentrations of droplet particles, and impacts the amount of contact people may have with contaminated surfaces (from contaminated hands touching cigarettes and cigarettes touching lips) which may increase the risk of lung infection. Additionally, individuals regularly exposed to secondhand smoke (SHS), known as “passive smokers,” may face similar respiratory problems and immune system effects5-7 as smokers and therefore an increased likelihood of COVID-19 infection. Interestingly, published research to date has shown mixed results on the association of smoking and COVID-19, with some studies suggesting a slight protective effect of smoking on COVID-19 infection.8-10 However, one of the key challenges for studies on COVID-19 infection is having sufficient sample sizes to allow adjustment for confounding risk factors, such as comorbidities that are closely associated with tobacco smoking. 11 Thus, the need remains for well-designed population-based studies to examine the association between smoking, comorbidities, race/ethnicity, and COVID-19 in terms of the likelihood of infection.
These population-based studies are hindered by another challenge: the incomplete documentation of smoking status in the electronic health record (EHR). Certain patient subgroups such as older adults, racial/ethnic minorities, patients with language barriers, patients with fewer office visits, and light smokers may face a lower likelihood of having smoking history documentation in the EHR. 12 For patients with COVID-19, collecting information about tobacco use and SHS exposure is difficult during any emergency admission and is likely to lead to significant reporting errors and potential biases.
In light of these gaps, we sought to explore the differences between laboratory-confirmed COVID-19 positive and negative cases to identify risk factors for testing positive for SARS-CoV-2 and to understand whether smoking (active or passive), comorbidities, or race/ethnicity contributes independently to predisposition to COVID-19 infection, which could guide efforts aimed at minimizing these disparities in COVID-19. We hypothesized that racial and ethnic minority patients who are active or passive smokers and have more severe comorbidities are more likely than other patients to be laboratory-confirmed COVID-19 positive cases. Moreover, we examined the differences between patients with and without smoking history in the EHR among those who had a SARS-CoV-2 test to understand the generalizability of the findings as well as disparities in documentation of smoking status in the EHR when testing for COVID-19.
Methods
Study Sample
The study setting was Sutter Health, a not-for-profit organization serving over 100 communities in northern and central California, including San Francisco and many of its affluent suburbs, rural communities from Eureka to Modesto, and disadvantaged urban/inner city residents in Oakland and San Jose. In 2018, Sutter Health’s foundation-affiliated providers served more than 3 million patients who represent diverse groups of socio-demographic and cultural backgrounds. As of 2020, Sutter patients self-identified their race/ethnicity in the electronic health record (EHR) as: 45.6% non-Hispanic White, 15.6% Hispanic, 16.5% non-Hispanic Asian, 4.7% non-Hispanic Black/African American, and 17.4% other. 13 Sutter Health’s electronic health record system (EPIC) is deployed across the enterprise in 3 settings: acute, ambulatory, and community-based practices.
The COVID-19 Universal Registry for Vital Evaluations (CURVE) database, a semi-real-time registry of all Sutter Health patients with confirmed or suspected COVID-19, served as the centralized data resource for this cross-sectional study. Patients were included in the study sample if they had a SARS-CoV-2 test performed at a Sutter Health facility between February 20, 2020 and March 31, 2021. There was no exclusion criterion based on age, as COVID-19 can infect individuals of all ages and secondhand smoke exposure may be higher among children and adolescents. A total of 556 690 patients were identified. COVID-19 confirmed cases (N = 70 564) were defined as patients having one or more positive reverse transcription polymerase chain reaction (RT-PCR) tests for SARS-CoV-2. The comparison group included patients whose COVID-19 lab results were all negative (N = 486 126).
Measures
COVID-19 confirmed cases and the comparison group were linked longitudinally at the patient-encounter level to billing, diagnosis, procedure, clinical encounter records, and provider notes (text fields). Using linked data, we ascertained the prior exposures of subjects in each group, focusing on smoking habits, the severity of comorbidities, and race/ethnicity, and looked into details of care and missing-data patterns that may be relevant to understanding variation in the COVID-19 lab results. Specifically, detailed information about smoking status, age when the patient first smoked, age when the patient stopped smoking, type of tobacco smoked, duration of smoking, and quantity of cigarettes smoked per day was extracted from the EHR for both groups. We identified whether the smoking status of each person had been documented in the EHR (missing or non-missing), and if non-missing, we further identified whether the patient was a user of tobacco (current/former/passive/non-smoker) using the most recent self-reported smoking status found in the Social History part of the Epic EHR at the time of the earliest lab order for SARS-CoV-2. For example, “Passive smoker” was operationalized using the question “Exposure to secondhand smoke (Yes or No)?” on the adult patient questionnaire and “Does anyone who lives with your child smoke (Yes or No)?” on the children’s patient questionnaire. “Pack years of smoking” was calculated by multiplying the number of packs of cigarettes smoked per day by the number of years the person had smoked, and was used to quantify the level of smoking: mild (<10 pack-years); moderate (10-20 pack-years); or heavy (>20 pack-years). Charlson Comorbidity Index (CCI) scores 14 at the time of the earliest lab order for SARS-CoV-2 were calculated to determine the severity of major comorbidities, and patients were divided into four groups: no major comorbidity (CCI = 0); mild (CCI = 1-2); moderate (CCI = 3-4); or severe (CCI ≥ 5).
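The categorical exposure measures defined above can be sketched in a few lines of Python. This is an illustration only, not the authors' extraction code: the function names are hypothetical, the conversion of 20 cigarettes per pack is an assumed convention, and values of exactly 10 or 20 pack-years are treated as moderate, following the cut points in the text.

```python
def pack_years(cigarettes_per_day, years_smoked, cigarettes_per_pack=20):
    """Packs smoked per day multiplied by the number of years smoked.

    The default of 20 cigarettes per pack is an assumption, not from the text.
    """
    return (cigarettes_per_day / cigarettes_per_pack) * years_smoked


def smoking_level(py):
    """Map pack-years to the levels in the text: <10, 10-20, >20."""
    if py < 10:
        return "mild"
    if py <= 20:
        return "moderate"
    return "heavy"


def cci_severity(cci):
    """Map a Charlson Comorbidity Index score to the four study groups."""
    if cci == 0:
        return "none"
    if cci <= 2:
        return "mild"
    if cci <= 4:
        return "moderate"
    return "severe"


# Hypothetical patient: 10 cigarettes a day for 30 years, CCI score of 3.
py = pack_years(10, 30)  # 0.5 packs/day * 30 years = 15.0 pack-years
print(py, smoking_level(py), cci_severity(3))  # 15.0 moderate moderate
```

Computing the CCI score itself requires the weighted list of conditions from the index definition, which is not reproduced here.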
Sutter Health’s EHR system also captures race/ethnicity, categorized as non-Hispanic White, non-Hispanic Black, Hispanic, non-Hispanic Asian, or other. Other covariates include age at the time of the earliest lab order for SARS-CoV-2 and sex. In addition to age, sex, race/ethnicity, and severity of major comorbidities, two further variables were included to examine the documentation of smoking status: having received primary care at Sutter Health, defined as having a visit to internal medicine, family medicine, or OB/GYN in the 12 months prior to the earliest lab order for SARS-CoV-2, and median household income, estimated through census data and linked to geocoded home address.
Statistical Analysis
We audited the data for quality and completeness, including missing data patterns, before any analyses were carried out. We evaluated distributions to ensure that they met the assumptions of planned analyses and examined the variable distributions to detect outliers. All inferential tests were carried out at a two-tailed alpha level of 0.05. Unadjusted and adjusted odds ratios (ORs) were estimated to measure effect sizes, and Chi-square tests were used to test the association between SARS-CoV-2 positivity and race/ethnicity, smoking status, and severity of major comorbidities among patients who had a SARS-CoV-2 test performed at a Sutter Health facility. Variables were then assessed as potential independent predictors using multivariable logistic regression. Additionally, patients with missing smoking status were compared to those with documented smoking status using Chi-square tests to understand the generalizability of the findings. Logistic regression analysis was used to examine the association of independent variables (i.e., age, sex, race/ethnicity, median household income, having received primary care at Sutter Health, and severity of major comorbidities) with missing smoking status data. All analyses were conducted in 2021 and performed using SAS, version 9.4. This work was reviewed and approved by the Sutter Health IRB and granted a Waiver of Health Insurance Portability and Accountability Act Authorization and a Waiver of Consent as a data-only study.
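To make the distinction between unadjusted and adjusted ORs concrete, the following minimal sketch fits a logistic regression by Newton-Raphson on grouped data using only the Python standard library. It is not the authors' SAS code. The counts are hypothetical and were chosen so that the exposure OR is exactly 0.5 within each stratum of a second covariate; all function and variable names are illustrative assumptions.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def score_and_information(rows, beta):
    """Gradient and Fisher information of the grouped binomial log-likelihood."""
    p = len(beta)
    grad = [0.0] * p
    info = [[0.0] * p for _ in range(p)]
    for x, positives, tested in rows:
        eta = sum(b * xj for b, xj in zip(beta, x))
        mu = 1.0 / (1.0 + math.exp(-eta))
        for j in range(p):
            grad[j] += (positives - tested * mu) * x[j]
            for k in range(p):
                info[j][k] += tested * mu * (1.0 - mu) * x[j] * x[k]
    return grad, info


def fit_logistic(rows, iterations=25):
    """Newton-Raphson fit; returns coefficients and their standard errors."""
    p = len(rows[0][0])
    beta = [0.0] * p
    for _ in range(iterations):
        grad, info = score_and_information(rows, beta)
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    _, info = score_and_information(rows, beta)
    se = []
    for j in range(p):
        unit = [1.0 if k == j else 0.0 for k in range(p)]
        se.append(math.sqrt(solve(info, unit)[j]))  # sqrt of inverse diagonal
    return beta, se


# HYPOTHETICAL grouped data: ([intercept, exposed, covariate], positives, tested)
rows = [
    ([1, 0, 0], 100, 300),
    ([1, 1, 0], 10, 50),
    ([1, 0, 1], 40, 60),
    ([1, 1, 1], 40, 80),
]

beta, se = fit_logistic(rows)
adjusted_or = math.exp(beta[1])
ci = (math.exp(beta[1] - 1.96 * se[1]), math.exp(beta[1] + 1.96 * se[1]))

# Crude OR from the collapsed 2x2 table: exposed 50/80, unexposed 140/220.
crude_or = (50 * 220) / (80 * 140)

print(f"crude OR = {crude_or:.2f}")
print(f"adjusted OR = {adjusted_or:.2f} (95% CI {ci[0]:.2f}-{ci[1]:.2f})")
```

In this constructed example the crude OR is close to 1 while the adjusted OR is 0.50, which shows why adjustment for covariates can change the apparent association.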
Results
Sample Characteristics
Participant characteristics among patients with positive vs negative SARS-CoV-2 tests.
Note: SARS-CoV-2, severe acute respiratory syndrome coronavirus 2; NH, non-Hispanic; CCI, Charlson Comorbidity Index.
Of our sample, 5.1% (n = 28 270) were current smokers, 17.3% (n = 96 563) former smokers, and 0.7% (n = 4175) passive smokers. Smoking status data were missing in one quarter of the sample (n = 142 902). Among the lab-confirmed positive patients, 2447 (3.5%) were current smokers, 9106 (12.9%) were former smokers, 643 (0.9%) were passive smokers, and 22 025 (31.2%) did not have their smoking status recorded. Less than 50% of current and former smokers provided detailed information on their smoking habits that enabled us to calculate the pack-years. Twenty-two percent of the smokers were light smokers with fewer than 10 pack-years smoking history, 12.8% of the smokers were moderate smokers with 10-20 pack-years smoking history, while 12% of the smokers were heavy smokers with over 20 pack-years smoking history. The average age of smoking initiation was 22 years (SD = 12), ranging from 7 to 83 years old. The average age of smoking cessation was 39 years (SD = 15), ranging from 12 to 98 years old. Only 3.1% of all patients used smokeless tobacco (0.8% current users and 2.3% former users). Statistically significant differences in years of quitting, level of smoking, and smokeless tobacco use were found when comparing lab-confirmed SARS-CoV-2 positive and negative cases (P < .0001).
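The counts reported above allow a rough check of crude positivity by smoking status. The sketch below is illustrative only: the nonsmoker counts are not stated in the text and are derived here by subtraction, under the assumption that the five smoking categories are mutually exclusive and exhaustive. The resulting OR is unadjusted, so it need not match the adjusted estimates reported in the abstract.

```python
import math

# Counts reported in the text: (positives, total tested).
total = (70_564, 556_690)
reported = {
    "current": (2_447, 28_270),
    "former": (9_106, 96_563),
    "passive": (643, 4_175),
    "missing": (22_025, 142_902),
}
# ASSUMPTION: nonsmokers are everyone not in a reported category.
non_pos = total[0] - sum(p for p, _ in reported.values())
non_n = total[1] - sum(n for _, n in reported.values())
groups = dict(reported, nonsmoker=(non_pos, non_n))


def crude_or(exposed, reference):
    """Unadjusted OR, Wald 95% CI, and Pearson chi-square (1 df) p-value."""
    a, b = exposed[0], exposed[1] - exposed[0]
    c, d = reference[0], reference[1] - reference[0]
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo, hi = (or_ * math.exp(sign * 1.96 * se) for sign in (-1, 1))
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, 1 df
    return or_, lo, hi, p


for name, (pos, n) in groups.items():
    print(f"{name:10s} tested={n:7d} positive={100 * pos / n:5.1f}%")

or_, lo, hi, p = crude_or(groups["current"], groups["nonsmoker"])
print(f"crude OR, current vs nonsmoker: {or_:.2f} ({lo:.2f}-{hi:.2f}), p={p:.3g}")
```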
Risk Factors for Testing Positive for SARS-CoV-2
Odds ratio for SARS-CoV-2 positivity.
Note: Boldface indicates statistical significance (P < .05).
SARS-CoV-2, severe acute respiratory syndrome coronavirus 2; NH, non-Hispanic; CCI, Charlson Comorbidity Index.
Differences Between Patients With and Without Smoking History in the Electronic Health Record Among Patients Who had a SARS-CoV-2 Test
Participant characteristics among patients who received SARS-CoV-2 tests with or without documentation of smoking status in the EHR.
Note: SARS-CoV-2, severe acute respiratory syndrome coronavirus 2; NH, non-Hispanic; CCI, Charlson Comorbidity Index.
Risk Factors for Missing Smoking Status in the Electronic Health Record Among Patients Who had a SARS-CoV-2 Test
Odds Ratio for missing smoking status.
Note: Boldface indicates statistical significance (P < .05).
NH, non-Hispanic.
Discussion
This is one of the first and largest cross-sectional analyses to assess risk factors for testing positive for SARS-CoV-2 in primary care in the United States using EHR data from a large, integrated healthcare system. Moreover, we sought to understand the prevalence and correlates of missing smoking data among patients tested for SARS-CoV-2. This study includes patients of all ages who had a test for SARS-CoV-2, which is uncommon among published COVID-19 research to date. Including patients younger than 18 years old also provided a sufficient number of passive smokers to allow assessment of secondhand smoke exposure, a possible COVID-19 risk factor that has not been well studied.
Our findings suggest that current and former smokers are less likely to be lab-confirmed SARS-CoV-2 positive. To date, data on whether smokers have a higher risk of SARS-CoV-2 infection are contradictory and inconclusive. 11 A recent report from a primary care network in the UK showed that active smoking was linked with decreased odds of a positive test result, 15 which is consistent with our findings. Another study on risk of COVID-19 in health-care workers in Denmark showed no significant difference between current, former, and never smokers when comparing prevalence of antibodies against SARS-CoV-2. 16 We found no significant association between passive smoking and being lab-confirmed SARS-CoV-2 positive after adjustment for age, sex, race/ethnicity, and comorbidities. The possible reasons for the lower odds of being lab-confirmed SARS-CoV-2 positive among active smokers but not passive smokers remain unclear. However, Kashyap et al 17 hypothesized that a current smoker’s immune system is less responsive to a COVID-19 infection than that of a never smoker, whose immune system would rapidly trigger a cytokine release syndrome. Moreover, when smokers exhale, they are intentionally and forcefully pushing air out of their lungs. The deliberate, deep exhalation of tobacco smoke expels large quantities and high concentrations of particles. The massive rush of particulate matter (PM) may provide opportunities for viral particles to be expelled from the lung and transmitted to passive smokers. Exhaled smoke PM is likely to travel farther than regularly exhaled air. Subsequently, secondhand smoke deposits on surfaces and in dust to become thirdhand smoke residue. Viral particles and droplets with viral particles also deposit on surfaces and in dust. Thirdhand smoke constituents include compounds with known microbial activity. Thirdhand smoke may affect the survival of SARS-CoV-2 on surfaces and in dust. 18
An important finding is that missing smoking status was associated with increased odds of being lab-confirmed SARS-CoV-2 positive after adjustment for age, sex, race/ethnicity, and comorbidities. In our sample, the patients with missing smoking status were more likely to be male, to be aged 80 years or older, to belong to a racial/ethnic minority, and to have lower income, less access to primary care, and major comorbidities. Smokers are at high risk of having or developing other chronic diseases, 19 and smoking is more prevalent among economically disadvantaged groups and certain racial/ethnic minority groups. 20 This suggests that those with missing smoking status (25.7% of the sample) were more likely to be smokers. Therefore, the overall SARS-CoV-2 positivity rate of smokers in the entire sample may have been underestimated due to missing smoking status data. Additionally, selection bias may skew the results because smokers are more likely to experience respiratory symptoms including cough, expectoration, and sore throat, which could lead to more frequent testing and an increased proportion of smokers with negative SARS-CoV-2 results. 15
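The direction of the missing-data argument above can be illustrated with a simple what-if calculation. This is a hypothetical sketch, not an analysis from the study: it assumes that some share of patients with missing smoking status are in fact current smokers and that they test positive at the crude rate of the missing-status group. The counts are those reported in the Results; the shares tried are arbitrary.

```python
# Counts reported in the Results (positives, total tested).
CUR_POS, CUR_N = 2_447, 28_270       # documented current smokers
MISS_POS, MISS_N = 22_025, 142_902   # smoking status missing


def smoker_positivity(share_of_missing_who_smoke):
    """Crude positivity among current smokers after a hypothetical reassignment.

    ASSUMPTION: reassigned patients test positive at the missing group's rate.
    """
    s = share_of_missing_who_smoke
    positives = CUR_POS + s * MISS_POS
    tested = CUR_N + s * MISS_N
    return positives / tested


for share in (0.0, 0.05, 0.10, 0.20):
    print(f"share={share:.2f} -> positivity={100 * smoker_positivity(share):.1f}%")
```

Under these assumptions, any positive share raises the crude positivity among smokers above the documented value, which is the sense in which the observed rate may be an underestimate.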
Approximately 17% of U.S. adults currently smoke cigarettes 21 and, while shelter-in-place restrictions have been put in place to slow the spread of COVID-19, these restrictions may also increase feelings of social isolation and mental distress, both of which have been linked to increased motivation to smoke. 22 As smoking is ultimately a modifiable behavior, understanding the impact of active and passive smoking on COVID-19 infection is especially important during the COVID-19 pandemic. Evidence suggests that individuals who are aware of emerging infectious diseases are more likely to practice preventive behavior,23-25 providing a potential opportunity to decrease smoking rates and protect non-smokers from tobacco smoke exposure.
Previous studies suggested that COVID-19 has affected more adults than children, and more men than women.26,27 With a more recent sample, we found an increased risk of a positive SARS-CoV-2 test in patients under 18 years old and in men. Children made up only around 3% of U.S. COVID-19 cases in 2020, but children now account for more than a fifth of new coronavirus cases in states that release data by age, according to the American Academy of Pediatrics. 28 The steady increase in child COVID-19 cases has raised alarms about a surge in COVID infection among children. By March 31, 2021 (i.e., end of the study period), children were the remaining population ineligible for the vaccine and the main vectors of virus spread, creating risk to both themselves and the rest of the population. Emergency use authorization of the coronavirus vaccine for children 5-11 years old was issued on October 29, 2021, 29 which is essential to help protect children from COVID-19 infection.
We also found that Hispanic/Latino patients had 2.70 times the odds of a positive SARS-CoV-2 test result compared with non-Hispanic White patients, which remained significant after adjusting for smoking status, age, sex, and comorbidities. Consistent with previous studies,30,31 this finding highlights the disproportionate incidence of COVID-19 among the Hispanic population and the burden of the pandemic on racial and ethnic minority communities across the country.32,33 Employment in high-risk positions, education, income, and structural barriers to healthcare are all factors likely contributing to this association.34,35 According to data from Johns Hopkins and the American Community Survey, the COVID-19 infection rate is 3-fold higher in counties with a majority Black population vs predominantly White counties, and the death rate is 6-fold higher in majority-Black counties. 36 The higher proportion of positive SARS-CoV-2 tests in racial and ethnic minority groups highlights differential access to resources and underlying systemic racism, which are contributing to and exacerbating the racial/ethnic disparities seen in the COVID-19 pandemic.
Increasing evidence shows that a number of chronic comorbidities such as hypertension, diabetes, and cardiovascular disease are associated with an increased risk of progressing to severe COVID-19 disease.2-4 Comorbidities that coexist with COVID-19 may lead to a delayed diagnosis, be confounders in analysis of association between COVID-19 and other risk factors, and increase morbidity and mortality. Therefore, it is desirable to summarize one or multiple comorbidities efficiently into a single score, using comorbidity indices derived from EHR data. The CCI is the most commonly used comorbidity index and includes sixteen diseases with different weights based on the strength of their association with mortality. Our study is the first to adapt the CCI to a large health care database to examine the association between comorbidities and COVID-19. We found an association between severity of major comorbidities and a positive SARS-CoV-2 test as hypothesized. Specifically, patients with mild comorbidities were about 1.10 times as likely to have a positive SARS-CoV-2 test result as those without comorbidities. The CCI may therefore add prognostic information to the COVID-19 diagnosis as a predictor of SARS-CoV-2 positivity.
Over a quarter of patients tested for SARS-CoV-2 had no documentation of smoking status in the EHR. The influence of the missing data on the association between smoking and SARS-CoV-2 positivity remains unknown. Healthcare systems should enhance documentation of smoking status among all patients and especially among COVID-19 patients. Vulnerable populations (patients with major comorbidities, lower income, or less access to primary care, those aged 80 years or older, and those identifying as racial/ethnic minorities) were more likely to have missing smoking data, suggesting opportunities to target subgroups and influence clinical practice at patient, provider, and systems levels.
Limitations
We acknowledge several limitations in the current study. The study population only included individuals who were assessed for SARS-CoV-2 infection by a physician and were tested for SARS-CoV-2. Thus, results only reflect risk factors for testing positive for SARS-CoV-2. There are limitations in the diagnostic tests used and in the sensitivity and specificity of the tests themselves. For example, test sensitivity increases with viral load, so the SARS-CoV-2 positive patients may have been more likely to include patients who were more severely ill. On the other hand, a false negative test result, indicating that a person has not been infected when the person actually does have the infection, poses a challenge to COVID-19 diagnosis. 37 Such misclassification may result in an underestimate of the association between COVID-19 and risk factors. In our sample, many patients had several simultaneous or repeated tests (up to 20 tests for SARS-CoV-2), which could overcome an individual test’s limited sensitivity.
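The point about repeated testing can be made concrete with a short calculation. The sketch below assumes that test results are independent given infection, which is an optimistic simplification because false negatives in the same patient are likely to be correlated; the per-test sensitivity of 0.7 is a hypothetical value, not one reported in this study.

```python
def combined_sensitivity(per_test_sensitivity, n_tests):
    """Probability that at least one of n tests is positive in an infected patient.

    ASSUMPTION: tests are independent given true infection status.
    """
    return 1.0 - (1.0 - per_test_sensitivity) ** n_tests


# Hypothetical per-test sensitivity of 0.7.
for n in (1, 2, 3, 5):
    print(n, round(combined_sensitivity(0.7, n), 4))
```

With these assumed inputs, two tests already detect about 91% of infected patients, compared with 70% for a single test.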
Although barriers to documentation of smoking status for patients tested for SARS-CoV-2 in vulnerable populations need to be better understood, the completeness, sensitivity, and positive predictive values of smoking history documentation in the EHR have improved considerably in recent years, indicating that the EHR is an acceptable source for identifying persons with a history of smoking for research and clinical purposes similar to our study. 12 However, the measurement of some smoking behaviors such as duration of smoking, age at which patient first smoked, type of tobacco smoked, level of smoking, and passive smoking, are limited by the available data and may be underreported in the EHR.38,39 More comprehensive assessments of secondhand smoke exposure and biological confirmation are required to better understand potential biases in reported exposure and the conditions under which secondhand smoke may affect COVID-19 transmission, infection, symptoms, and health outcomes. Acknowledging this limitation, similar studies like ours on the health effects of passive smoking among COVID-19 patients may become another reason for clinicians to be more committed to improved passive smoking documentation and patient education.
Lastly, we are studying a single healthcare organization in northern and central California. However, studying a single system with a shared infrastructure controls for access barriers and variations in testing supplies and protocols, and allows us to instead focus on variation across patients who had a test for SARS-CoV-2 and factors that could lead to COVID-19 prevention and control.
Conclusions
Our findings suggest that children, men, patients identifying as racial/ethnic minorities, patients without documentation of smoking status, and patients with mild major comorbidities are at increased risk for testing positive for SARS-CoV-2. Federal, state, and local governments should dedicate the resources necessary to comprehensively support these particularly vulnerable populations during the COVID-19 pandemic. This should include, but not be limited to, protecting nonsmokers from exposure to secondhand smoke. Findings from this study may be useful in identifying the critical individual characteristics that should be considered in tailoring approaches to identify trusted sources of information and create effective messaging to disseminate COVID-19 prevention and control information. Additional research is also needed to improve documentation of smoking history, pack-years, vaping, and secondhand exposure, especially among patients tested for SARS-CoV-2. Future studies should investigate the risk of severe outcomes among active and passive smokers, the role that exposure to tobacco smoke plays among nonsmokers, the role of comorbidities in COVID-19 disease course, and health disparities experienced by disadvantaged groups.
Supplemental Material
Supplemental Material for Effects of Smoking on SARS-CoV-2 Positivity: A Study of Sutter Health Organization Serving in Northern and Central California by Jiang Li, Meghan C Martinez, Dominick L Frosch and Georg E Matt in Tobacco Use Insights
Footnotes
Acknowledgments
The authors would like to acknowledge Rene Everline, Senior Information Analyst, and Amandeep Mann, Statistical Analyst, at the Palo Alto Medical Foundation Research Institute, Center for Health Systems Research, Sutter Health, for their data extraction work.
Declaration of Conflicting Interests:
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding:
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Tobacco-Related Disease Research Program Emergency COVID-19 Funding (10.13039/100005188, R00RG2429).
Data Availability:
The data that support the findings of this study are available from the corresponding author, JL, upon reasonable request. The data are not publicly available due to their containing patient health information that could compromise the privacy of research participants. A Data Use Agreement will be implemented for any outside researcher interested in using the data for replication of research findings or for additional areas of research.
Supplemental Material:
Supplemental material for this article is available online.
