Abstract
Objectives:
People with disabilities, people experiencing homelessness, and people who have substance use disorders face unique health challenges. Gaps in public health surveillance data limit the identification of public health needs of these groups and data-driven action. This study aimed to identify current practices, challenges, and opportunities for collecting and reporting COVID-19 surveillance data for these populations.
Methods:
We used a rapid qualitative assessment to explore COVID-19 surveillance capacities. From July through October 2021, we virtually interviewed key informants from the Centers for Disease Control and Prevention, state and local health departments, and health care providers across the United States. We thematically analyzed and contextualized interview notes, peer-reviewed articles, and participant documents using a literature review.
Results:
We identified themes centered on foundational structural and systems issues that hinder actionable surveillance data for these and other populations that are disproportionately affected by multiple health conditions. Qualitative data analysis of 61 interviews elucidated 4 primary challenges: definitions and policies, resources, data systems, and articulation of the purpose of data collection to these groups. Participants noted the use of multisector partnerships, automated data collection and integration, and data scorecards to circumvent challenges.
Conclusions:
This study highlights the need for multisector, systematic improvements in surveillance data collection and reporting to advance health equity. Improvements must be buttressed with adequate investment in data infrastructure and promoted through clear communication of how data are used to protect health.
The COVID-19 pandemic highlighted persistent social and structural conditions that left many populations at disproportionate risk for COVID-19 infection, severe disease, and mortality.1-4 Three such groups compose the populations of focus for this study: people with disabilities, people experiencing homelessness (PEH), and people who have substance use disorders (PSUD). Each of these populations is growing, with about 1 in 5 people in the United States likely belonging to 1 or more of these groups.5-8
Public health data are essential for evidence-based action to reduce morbidity and mortality.9 Many tools and systems are used to gather these data, including electronic health records (EHRs), electronic laboratory reporting (ELR), electronic case reporting (eCR) or manually submitted case report forms (CRFs), and syndromic surveillance.10,11 During the COVID-19 pandemic, the Centers for Disease Control and Prevention (CDC) gathered COVID-19 case information from jurisdictions to provide aggregated estimates of COVID-19 case rates (Figure 1).12

Figure 1. Visualization of the case surveillance supply chain for COVID-19 data received and shared by the Centers for Disease Control and Prevention. The figure is based on a graphic available at: https://www.cdc.gov/nndss/about/conduct.html. 12
From May 5 to August 18, 2020, CDC expanded the COVID-19 CRF questions to include data on disability, housing status, and substance use. After August 18, 2020, CDC released a list of variables considered highest priority for collection and reporting that did not include expanded variables for disability, housing status, and substance use (Figure 2). During the period of expansion (May 5 to August 18, 2020), data on age in COVID-19 CRFs were 99.9% complete. However, completeness of data on race (67%), ethnicity (58%), disability (7%), housing status (3%), and substance use (phrased on the CRF as “substance abuse or misuse”) (1%) was lower. While the completeness of data on race and ethnicity improved over time through collaboration between jurisdictions and CDC,13 these collaborative data improvement efforts did not provide the data necessary to understand disparities among people with disabilities, PEH, and PSUD. A team of CDC subject matter experts supporting the COVID-19 response for these populations observed that some jurisdictions were more likely than others to collect and report this information, even when not required.14,15 While the CDC team found innovative ways to triangulate COVID-19 data on these populations across alternative data sources,14-18 traditional surveillance systems lacked relevant data on COVID-19 morbidity and mortality among people with disabilities, PEH, and PSUD.

Figure 2. Timeline of decisions made by the Centers for Disease Control and Prevention (CDC) about COVID-19 reporting and changes to data on people with disabilities, people experiencing homelessness, and people with substance use disorders in CDC data collection tools, United States, 2020. Dashed line indicates the period during which data were collected on disability, housing, and substance use status in COVID-19 case reporting to CDC. Abbreviations: CRF, case report form; STLT, state, tribal, local, and territorial. MITRE is the Health Federally Funded Research and Development Center, which was funded to survey federal, state, tribal, local, and territorial public health staff and health care providers.
This study aimed to investigate the observed differences in completeness of data on people with disabilities, PEH, and PSUD to (1) describe challenges in capturing these equity-related variables and (2) identify strategies used by jurisdictions and health systems to mitigate these challenges. While this study did not include every equity-related variable or population at disproportionate risk for COVID-19, we explored reasons behind 3 highly incomplete and rarely studied data elements to uncover weaknesses in data systems and processes whose effects likely apply more broadly across additional historically marginalized populations. An understanding of the root causes of suboptimal data is needed to improve equitable responses to future public health emergencies for populations at disproportionate risk of adverse outcomes.
Methods
The Health Federally Funded Research and Development Center, operated by the MITRE Corporation (hereinafter, MITRE) in collaboration with and on behalf of CDC, completed a rapid qualitative assessment to better understand challenges in collecting and reporting data on people with disabilities, PEH, and PSUD, as well as strategies that jurisdictions and health systems used to navigate these challenges from January 2020 through July 2021. This exploratory study included semistructured interviews with public health professionals and health care providers supplemented by document reviews.
Participant Identification and Recruitment
MITRE co-investigators interviewed people from 3 groups: (1) CDC staff supporting the COVID-19 response and engaged in data-related activities, (2) jurisdictional health department staff conducting similar activities, and (3) health care providers. We purposively sampled participants with rich knowledge about data collection and reporting at the federal and jurisdictional levels, and we used snowball sampling to identify additional participants within organizations.19-21 When identifying potential CDC staff participants, we ensured that participants varied by individual attributes (eg, race, sex, ethnicity) so that we included a breadth of potentially differing perspectives. Nineteen of 21 invited CDC staff participated.
When considering which jurisdictional health departments to include in our sample, we assessed completeness of COVID-19 reporting by calculating the percentage of cases with a response other than “unknown” or “missing” for 3 variables in the CRF: presence of a disability, staying in a homeless shelter or sleeping outside, and indication of substance use disorder (SUD). We calculated the average percentage completeness across these variables as an overall percentage complete score (Table 1). We aimed for variation of jurisdictions across calculated data completeness, overall COVID-19 case rates, US Census regions, and level of governance (ie, state vs local). Nine of 10 identified jurisdictions participated, and 39 of 47 invited jurisdictional health department staff were interviewed. We identified health care providers with clinical informatics expertise through referrals from staff at participating jurisdictions. Three of 9 health care providers contacted were interviewed.
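The completeness calculation described above can be illustrated with a minimal sketch. It assumes case records stored as dictionaries; the field names (`disability`, `housing`, `substance_use`) and the example records are hypothetical and are not the actual CRF variable names or study data.

```python
from statistics import mean

# Responses treated as incomplete, following the definition in the text.
INCOMPLETE = {"unknown", "missing", ""}

# Hypothetical field names; the actual CRF variable names may differ.
FIELDS = ("disability", "housing", "substance_use")


def percent_complete(cases, field):
    """Percentage of cases whose value for `field` is not unknown or missing."""
    if not cases:
        return 0.0
    complete = sum(
        1
        for case in cases
        if str(case.get(field) or "").strip().lower() not in INCOMPLETE
    )
    return 100.0 * complete / len(cases)


def overall_percent_complete(cases, fields=FIELDS):
    """Unweighted average of the per-field completeness percentages."""
    return mean(percent_complete(cases, field) for field in fields)


# Made-up records for one jurisdiction, for illustration only.
example_cases = [
    {"disability": "yes", "housing": "unknown", "substance_use": "missing"},
    {"disability": "no", "housing": "housed", "substance_use": "unknown"},
    {"disability": "unknown", "housing": "missing", "substance_use": None},
    {"disability": "missing", "housing": "unknown", "substance_use": "no"},
]

for field in FIELDS:
    print(field, percent_complete(example_cases, field))
print("overall", overall_percent_complete(example_cases))
```

Averaging the 3 per-field percentages, rather than pooling all responses, mirrors the description of a single overall score per jurisdiction; whether the study weighted the fields in any other way is not stated.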
Table 1. Characteristics of federal, state, and local participants interviewed about COVID-19 surveillance among people with disabilities, people experiencing homelessness, and people with substance use disorder, United States, July–October 2021
Abbreviation: CDC, Centers for Disease Control and Prevention.
The sum of rows in this category may not total to 100% because of rounding.
Completeness was defined as the percentage of cases reported that did not have “unknown” or “missing” selected for disability, housing, and substance use statuses. Because each jurisdiction had a different percentage complete for each of the 3 populations of focus for this study, the average percentage complete was calculated across the 3 data fields.
Two jurisdictions were included despite having unknown completeness of data across the 3 study populations of focus; these jurisdictions were selected based on their size, overall case counts of COVID-19, and proportion of the population experiencing homelessness or substance use and/or with a disability within the jurisdiction.
Data Collection: Interviews and Document Review
MITRE co-investigators completed all interviews via teleconference; interviews were recorded with participant permission. MITRE developed semistructured guides tailored to CDC, jurisdictional, and health care participants (eTable in Supplemental Material). Questions explored procedures for data collection and reporting and the implementation fidelity of these procedures.
To understand the context for data collection and reporting, MITRE reviewed jurisdictional and CDC policies, forms, and resources identified by the study team or shared by interview participants. Documents included scientific journal articles; federal and state, tribal, local, and territorial (STLT) department web pages; CDC deployment summaries; CRF protocols; case reporting data dictionaries; and other data collection documentation.
Data Analysis
MITRE analyzed interview responses and documents to identify prevalent themes and provided deidentified results aggregated by participant type to CDC collaborators. MITRE and CDC reviewed the identified themes and respective participant quotes to further refine themes and describe the data collection and reporting challenges, successes, and opportunities. A mix of inductive and deductive coding approaches was used; prior to coding, broad categories were anticipated (eg, data collection, reporting, challenges, solutions to challenges) and participant data were grouped into these broad categories. Inductive coding was used to further analyze participant responses and elucidate subthemes. No preexisting framework or schema was used in this analysis.
Ethical Considerations
This activity was reviewed by CDC and conducted consistent with applicable federal law and CDC policy (eg, 45 CFR part 46; 21 CFR part 56; and 42 USC §241[d], 5 USC §552a, 44 USC §3501 et seq). To minimize response bias, all data were collected by MITRE and were anonymized and aggregated by participant type before sharing with CDC collaborators.
Results
From July through October 2021, MITRE interviewed 19 CDC staff, 39 staff members representing 3 local and 6 state health departments, and 3 health care providers (Table 1). Qualitative data analysis identified 4 primary challenges: data definitions and policies, resource limitations, data systems, and articulation of the purpose of capturing data on disability, housing, and SUD status. We also report on strategies offered by participants to minimize some of these challenges (Table 2).
Table 2. Quotes from federal, state, and local interview participants about challenges in collecting and reporting COVID-19 surveillance data among people with disabilities, people experiencing homelessness, and people with substance use disorder, United States, July–October 2021
Abbreviations: CDC, Centers for Disease Control and Prevention; DAP, disproportionately affected population; ELR, electronic laboratory reporting.
Data Definitions and Policies
Definitions and measurement
Measures for disability, homelessness, and SUD differed across data systems and between public health and health care settings. Participants described uncertainty about how they should define and capture these data, which differed across organizations. Data fields were left blank in public health reporting forms, and EHRs where applicable, to avoid errors in classification; as a result, many participants described these data as having low utility.
Perceptions of federal regulations and policies
Participants were uncertain about the permissibility and prioritization of the collection and reporting of certain data fields after reviewing federal regulations and policies. Several participants described challenges in collecting and accessing SUD data, citing 42 CFR part 2, pertaining to disclosure of records maintained by programs that diagnose, treat, or refer individuals for treatment of SUD. Public health professionals and health care providers reported leaving SUD data fields blank because of this uncertainty. Additionally, once the disability, housing status, and SUD elements were omitted from the priority data fields that jurisdictions were required to submit to CDC after August 18, 2020 (Figure 2), many jurisdictions deprioritized these elements.
Overcoming measurement challenges
Strategies shared by participants were wide-ranging and resourceful. Participants reported that eCR, where implemented, improved the ease of sharing health system data with jurisdictions. One health care provider described how their hospital system improved reporting by automating matching of EHR data on disability and homelessness with COVID-19 laboratory results. Jurisdictional staff cited partnerships with local hospital systems and community-based organizations as key to enabling review of EHRs and other relevant systems (ie, housing services). They used International Classification of Diseases, Tenth Revision diagnostic codes for specific conditions and the social determinants of health, which they noted was helpful, particularly when analysis of free text was not feasible. One health department devised an address field response to indicate when a person was experiencing homelessness, to easily identify PEH within existing standardized data fields.
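Two of the strategies above, use of diagnostic codes and a standardized address field response, can be sketched as a simple record flag. This is an illustration under stated assumptions, not the method any participant used: the record layout and the address value `HOMELESS` are hypothetical, because the text does not state what value the health department chose. The code prefix reflects the ICD-10 category Z59.0 (homelessness).

```python
# ICD-10 category Z59.0 denotes homelessness; more specific subcodes share the prefix.
HOMELESSNESS_CODE_PREFIX = "Z590"

# Hypothetical standardized address entry; the actual value used is not stated in the text.
ADDRESS_SENTINEL = "HOMELESS"


def normalize_code(code):
    """Strip the dot, whitespace, and case differences from a diagnostic code."""
    return code.replace(".", "").strip().upper()


def is_experiencing_homelessness(record):
    """Flag a record using diagnostic codes first, then the address field."""
    codes = record.get("diagnosis_codes") or []
    if any(normalize_code(c).startswith(HOMELESSNESS_CODE_PREFIX) for c in codes):
        return True
    address = (record.get("address") or "").strip().upper()
    return address == ADDRESS_SENTINEL


# Made-up records, for illustration only.
example_records = [
    {"case_id": "c1", "diagnosis_codes": ["U07.1", "Z59.0"], "address": "12 Main St"},
    {"case_id": "c2", "diagnosis_codes": [], "address": "Homeless"},
    {"case_id": "c3", "diagnosis_codes": ["Z59.1"], "address": "34 Oak Ave"},
]

flagged = [r["case_id"] for r in example_records if is_experiencing_homelessness(r)]
print(flagged)
```

Matching on structured codes and a fixed address value avoids free-text analysis, which participants noted was not always feasible.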
Resource Limitations
Funding
Time-restricted and insufficient funding limited participants’ ability to sustainably enhance systems and staffing for data collection and reporting. While temporary budget increases for COVID-19 facilitated hires, these time-limited positions attracted a limited pool of viable candidates and led to increased staff turnover. These challenges in hiring and retaining staff with appropriate technical skills were amplified in rural areas.
Volume of cases and staff wellness
Across jurisdictions, data on disability, housing, and SUD were primarily collected by case investigators. Case investigation interview guides were described as lengthy; equity questions that appeared later in the guide or were included as subprompts of other questions were more likely to be skipped than earlier questions or question stems. The ability of jurisdictions to conduct outreach and investigations decreased as the number of cases increased. Concurrently, participants indicated that the public became less willing to share information than they had been earlier in the pandemic, which also challenged already low morale.
Staff across all sectors felt overwhelmed by the size and duration of the pandemic, which limited staff capacity for new data collection, even though data system updates were acknowledged as long-term solutions needed to manage increased data volume and complexity.
Overcoming resource challenges
While participants had limited levers to address funding and workforce challenges, jurisdictions described support from CDC deployers as force multipliers that improved communication with federal public health officials. To protect staff wellness and maintain other public health functions, public health agencies rotated staff to support COVID-19 efforts. Other strategies centered on optimizing data collection within manageable workflows. For example, in 1 hospital system, information technology staff and frontline health care providers collaborated to identify which data fields were routinely collected during the patient care workflow. Where feasible, collection of new data fields was integrated into the registration process to lessen the time and data collection effort for health care providers. One jurisdiction described streamlining the work of data collectors using “smart assigning,” which connects investigators with family units rather than individuals. Another jurisdiction reported that implementation of texts prior to calls from case investigators increased engagement, resulting in less time spent on repeat calls.
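The “smart assigning” approach is described only briefly, so the following is a rough sketch of one way to assign whole households, rather than individual cases, to investigators. The `household_id` field, the round-robin rule, and the example data are assumptions made for illustration; the jurisdiction's actual logic is not described in the interviews.

```python
from collections import defaultdict
from itertools import cycle


def assign_by_household(cases, investigators):
    """Assign every case in a household to the same investigator."""
    # Group case identifiers by household, in first-seen order.
    households = defaultdict(list)
    for case in cases:
        households[case["household_id"]].append(case["case_id"])

    # Round-robin over households, not over individual cases.
    assignments = defaultdict(list)
    for investigator, case_ids in zip(cycle(investigators), households.values()):
        assignments[investigator].extend(case_ids)
    return dict(assignments)


# Made-up cases, for illustration only.
example_cases = [
    {"case_id": "c1", "household_id": "H1"},
    {"case_id": "c2", "household_id": "H2"},
    {"case_id": "c3", "household_id": "H1"},
    {"case_id": "c4", "household_id": "H3"},
]

print(assign_by_household(example_cases, ["A", "B"]))
```

Grouping before assignment means a family is contacted by 1 investigator, which is the time saving the jurisdiction described.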
Data Systems
Data systems used
Most health care participants reported that EHR systems lacked standardized fields for data on disability, homelessness, and SUD. A similar challenge was shared by public health participants for various public health data systems. As a result, data were captured in text-based fields that are more difficult to search and share across systems. Jurisdictions collecting these data had to use manual medical record reviews or automated matching processes with health care systems. Diagnostic codes for homelessness and other social determinants of health were not easy to search, share, or analyze, because these codes are nonreimbursable and underused.
Jurisdictions received most case report data from laboratories. As of April 21, 2021, all participating jurisdictions had converted to data systems equipped to receive digital data directly from laboratories via ELR. However, laboratory data often did not include variables for disability, homelessness, or SUD; even data on race or ethnicity were frequently unavailable. Respondents described existing laboratory systems as “not set up” to collect additional demographic data. Data from pop-up laboratories established for high-volume COVID-19 testing were less complete than data from longstanding laboratories. Jurisdictions reported that the lack of financial incentives or consequences around data completeness decreased the effectiveness of federal recommendations to increase variables collected by laboratories.
Interoperability across data systems
Interoperability challenges occurred both within and across jurisdictions and among multiple data systems. Jurisdictions were required to report data to CDC using multiple platforms. CDC then had to reformat data internally because of lack of standardization between systems. Within jurisdictions, lack of data system interoperability limited opportunities to automate data integration (eg, from vital statistics, social services programs, local health care systems). In 1 state, data entered by multiple counties into the same state-level system could not be linked because the systems were not designed to be interoperable.
Overcoming data challenges
One state invested in a modernized public health data infrastructure before the pandemic. This state was able to use a single system for receiving and entering case reports, conducting case investigations, analyzing data, and reporting to CDC, increasing the accuracy and timeliness of data. This system was compliant with state-level data standards, which allowed automated data collection from medical records. In another jurisdiction, public health officials worked with partners to create a laboratory performance score card for iterative feedback to assess and improve the completeness and accuracy of laboratory data. Each report showed how the laboratory compared with others in the state and was sent directly to senior leadership to optimize visibility.
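A laboratory score card of the kind described could be approximated as follows. This sketch covers completeness only, not accuracy, and it assumes hypothetical field names (`lab`, `race`, `ethnicity`) and a comparison against the statewide median; the jurisdiction's actual metrics are not described.

```python
from statistics import median

# Responses treated as incomplete.
INCOMPLETE = {"unknown", "missing", ""}

# Hypothetical demographic fields expected in each laboratory report.
FIELDS = ("race", "ethnicity")


def lab_completeness(reports, fields=FIELDS):
    """Return {lab: percentage of expected field values that are filled in}."""
    filled, total = {}, {}
    for report in reports:
        lab = report["lab"]
        for field in fields:
            value = str(report.get(field) or "").strip().lower()
            total[lab] = total.get(lab, 0) + 1
            filled[lab] = filled.get(lab, 0) + (value not in INCOMPLETE)
    return {lab: 100.0 * filled[lab] / total[lab] for lab in total}


def scorecard(reports, fields=FIELDS):
    """Rank laboratories by completeness and show each gap from the state median."""
    scores = lab_completeness(reports, fields)
    if not scores:
        return []
    state_median = median(scores.values())
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [
        {
            "lab": lab,
            "rank": rank,
            "percent_complete": pct,
            "vs_state_median": pct - state_median,
        }
        for rank, (lab, pct) in enumerate(ranked, start=1)
    ]


# Made-up laboratory reports, for illustration only.
example_reports = [
    {"lab": "Lab A", "race": "white", "ethnicity": "non-hispanic"},
    {"lab": "Lab A", "race": "black", "ethnicity": "hispanic"},
    {"lab": "Lab B", "race": "unknown", "ethnicity": "missing"},
    {"lab": "Lab B", "race": "asian", "ethnicity": None},
    {"lab": "Lab C", "race": "white", "ethnicity": "unknown"},
]

for row in scorecard(example_reports):
    print(row)
```

Reporting each laboratory's gap from the median, alongside its rank, reflects the comparison with other laboratories in the state that participants described.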
Articulation of Purpose of Data Collection on Disproportionately Affected Populations
Incentives, clinical applicability, and public health utility
Participants shared that the lack of requirements or incentives to collect disability, homelessness, and SUD data was a disincentive to collect this information. Health care providers questioned the clinical relevance of collecting this information for direct patient care. Health care providers may disproportionately screen for and record disability, homelessness, or SUD among patients presenting in ways they associate with these populations based on social biases and stigma, which can lead to underreporting of these patient experiences.
Additionally, jurisdictional staff felt they had inadequate information on how these data were used and what actions were taken to protect data privacy for smaller populations. This lack of awareness concerning data use and data privacy limited their ability to communicate the benefits to the people in these populations, which may have increased hesitancy from respondents during case interviews.
Overcoming articulation of purpose challenges
We did not identify actions taken during the study period to overcome challenges in the articulation of purpose. However, jurisdictional staff shared that, in the future, communication from CDC and other national public health organizations (eg, the Council of State and Territorial Epidemiologists) about how these data are used and that these issues and populations are priorities would improve the ability of jurisdictional public health staff to communicate their importance to staff, partners, and the public.
Discussion
While timely, high-quality data are crucial for understanding and addressing health disparities, incomplete data limit evidence-based action for critical populations during public health emergencies. This study analyzed insights from partners at multiple levels to describe challenges in collecting and reporting data on disability, housing status, and SUD early in the COVID-19 pandemic. Challenges reported across each participant type included the absence of consistent definitions and standardized data fields for capturing these experiences in clinical and public health data systems and frequent lack of data system interoperability, both within public health and between public health and clinical data systems. Public health participants noted that high workload and the limitations of short-term funding and staffing strategies hindered their work; these were accompanied by a decline in public willingness to participate in data collection as the pandemic continued. Participants described that data collection for these fields was deprioritized because of a limited understanding of how these data would improve outcomes at a population level or an individual clinical level.
Jurisdictions with higher data completeness referenced preestablished supports, including strong and diverse partnerships with health care and community-based organizations, and investments in public health data infrastructure before the onset of the COVID-19 pandemic. Enforced requirements were cited as potential motivators, and regular feedback to the people collecting information on data quality and use could improve data completeness. Other solutions demonstrated the importance of engaging end users, such as health care providers, to efficiently incorporate data collection into existing workflows.
These findings highlight that the primary challenges to data collection and reporting for these populations are structural or systemic. Understanding and addressing these systems-level issues is fundamental for the wider context of public health data and health equity research. While innovative strategies were identified, jurisdictions and organizations had limited levers to address issues related to historic underinvestment in public health. Changes are needed at national and state levels to ensure sustainable modernization of equitably structured data systems, and further research is needed to understand and evaluate interim solutions for these challenges at the local level.
Data standards and requirements are 1 opportunity for systemic change. While recent efforts have focused on the implementation of automated and direct-reporting technologies such as eCR and ELR, the lack of systematic capture of equity-related variables in EHRs and laboratory data may prevent these systems from attaining their full potential to inform health equity efforts. Lack of race and ethnicity data in laboratory systems22,23 further complicates this challenge and is especially problematic when considering disproportionate rates of disabilities, homelessness, and SUD among intersecting racial, ethnic, and gender identities. Therefore, these findings emphasize the need to prioritize health equity in data modernization efforts.
To improve data collection and sharing, CDC is working to establish recommendations on definitions,24 validated question sets, and standardized data elements for homelessness. Multiple organizations are engaged in feedback for new data elements in the United States Core Data for Interoperability (USCDI),25 a standardized set of health data classes and elements used for a nationwide, interoperable health information exchange. As of July 2022, USCDI includes a new data element for disability status.26 Other efforts highlight opportunities to leverage existing data to their fullest potential by integrating multiple data sources within a state (eg, data from Medicaid or public housing services) to connect individuals with needed services.27,28 Improvements to laboratory data reporting requirements and enhancement of public health surveillance systems remain areas of opportunity to improve data equity.
While many key challenges are structural, interview data identified gaps that can be addressed through training. CDC is collaborating with the Council of State and Territorial Epidemiologists to provide free training for public health staff on collecting data among populations at disproportionate risk of adverse health outcomes. The training will include content on areas of need identified through this study, including trauma-informed practices for effective communication, clarification of federal policies, and how data can be used to benefit individuals and populations.
Limitations
This study had 4 limitations. First, self-selection of interview participants within jurisdictions might have introduced selection bias. Second, because of the time frame for this project, we did not interview community-based organizations or other service providers, which may also be part of the data collection and reporting process. Third, findings from this purposive sample may not reflect all jurisdictional experiences. Fourth, this rapid assessment aligned with a national surge in COVID-19 cases because of the Delta variant; while the timing of these interviews enabled participants to evaluate and discuss data processes and challenges in the context of large case numbers, it also may have limited participant input because of time constraints.
Conclusions
Multiple opportunities exist to improve public health data collection and reporting among populations at disproportionate risk of adverse health outcomes, particularly through systemic improvements to public health data structures. Jurisdictions would benefit from improved timeliness and completeness of data, which would allow them to efficiently understand and respond to population needs during public health emergencies. Advancing data standards and integrating data systems across multiple sectors are key to improving data to address health inequities. Improvements will only be successful if they are buttressed with adequate investment in workforce and data infrastructure, with clear communication about how data will be used to benefit the health of all people.
Supplemental Material
Supplemental material, sj-docx-1-phr-10.1177_00333549241245624 for Data Equity as a Building Block for Health Equity: Improving Surveillance Data for People With Disabilities, With Substance Use Disorder, or Experiencing Homelessness, United States by Ashley A. Meehan, Shauna S. Flemming, Shelley Lucas, Megan Schoonveld, Jennifer L. Matjasko, Megan E. Ward and Kristie E.N. Clarke in Public Health Reports
Footnotes
Acknowledgements
The authors acknowledge CDC COVID-19 health department liaisons for their participation and assistance in connecting us with participating jurisdictions. We also thank all who participated for their time commitment, as well as the CDC COVID-19 Disproportionately Affected Populations Team for their contributions to the development and interpretation of this study and the findings presented.
Authors’ Note
Megan E. Ward and Kristie E.N. Clarke are co–senior authors. The findings and conclusions of this article are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Work performed by MITRE co-investigators was supported by funding from the Centers for Disease Control and Prevention.
Supplemental Material
Supplemental material for this article is available online. The authors have provided these supplemental materials to give readers additional information about their work. These materials have not been edited or formatted by Public Health Reports’ scientific editors and, thus, may not conform to the guidelines of the AMA Manual of Style, 11th Edition.
