Project Overview and Context
The opioid epidemic remains a serious and growing public health problem, contributing to significant mortality and morbidity (Centers for Disease Control and Prevention, 2021) and imposing an economic burden of more than $1 trillion (Florence et al., 2021). In response, the National Institutes of Health and the Substance Abuse and Mental Health Services Administration launched the HEALing Communities Study (HCS) in 2019 to test the effectiveness of the “Communities that HEAL” (CTH) intervention at reducing opioid-related overdose deaths in 67 highly impacted communities across four states: Kentucky, Massachusetts, New York, and Ohio (Chandler et al., 2020; The HEALing Communities Study Consortium, 2020). The CTH intervention uses a community engagement approach to build diverse coalitions in each community that expand access to, and use of, evidence-based overdose prevention practices across healthcare, behavioral health, criminal justice, and other community-based settings.
In addition to testing the effectiveness of the CTH intervention, the HCS is evaluating the CTH implementation process using a mixed-methods implementation science approach. As part of this approach, researchers from each research site (i.e., state) conduct semi-structured qualitative interviews with key informants to understand the important domains of the internal community context and the external policy and systems context that can affect CTH implementation (Drainoni et al., 2022). As findings from various components of the HCS and the CTH implementation evaluation begin to emerge (Drainoni et al., 2022; Walker et al., 2022), we took the opportunity to reflect on our qualitative data collection and analysis approach and to share lessons learned about designing and applying qualitative methods within an implementation science framework, with the aim of improving understanding of how a Team Science approach to qualitative methods can be applied in large, multi-site evaluations.
Overview of Study Design
HCS is a multi-site, wait-listed, community-level cluster randomized trial with 34 communities randomized to Wave 1 and 33 communities randomized to Wave 2 (The HEALing Communities Study Consortium, 2020). Key informants (i.e., local/county health officials, criminal justice officials, medical providers, social service providers, and other community coalition members or local stakeholders in substance use issues) from all HCS communities are recruited to participate in semi-structured qualitative interviews at four timepoints: baseline (i.e., prior to CTH implementation for Wave 1; contextual for Wave 2); follow-up 1 (i.e., midway through CTH implementation for Wave 1; contextual for Wave 2); follow-up 2 (i.e., end of CTH implementation for Wave 1; prior to CTH implementation for Wave 2); and follow-up 3 (i.e., 1.5 years after CTH implementation has been completed for Wave 1; end of CTH implementation for Wave 2). To date, we have completed 382 baseline interviews with 389 participants and 304 follow-up 1 interviews with 310 participants.
The qualitative implementation evaluation component proceeds in a cyclical manner through each interview timepoint (Figure 1). We have completed this process for two timepoints (i.e., baseline and follow-up 1) and will continue using it for the remaining two rounds of data collection. First, an interview guide is drafted based on the RE-AIM/PRISM evaluation framework (Glasgow et al., 2019) and insights from previous rounds of data collection. Then, researchers and research staff from each site conduct interviews with key informants from their respective states. Data analysis begins with the drafting of a codebook, also grounded in the PRISM/RE-AIM framework, that is iteratively refined in each round of data collection through a cross-site consensus process involving weekly meetings of the site qualitative leads to resolve difficult coding decisions; this is followed by a within-site consensus process. Once the codebook is finalized, interview transcripts from each research site are divided among the members of the coding team from the respective state for individual coding. Emerging themes and changes to the codebook are documented in detail in a log, and this information informs interview guide and codebook development in subsequent interview rounds. The baseline (i.e., prior to CTH implementation) interview guide as well as the interview and data analysis procedures have been published in detail elsewhere (Drainoni et al., 2022; Knudsen et al., 2020; Walker et al., 2022).

Figure 1. Qualitative process for each wave of data collection.
Practical Lessons Learned
Embedded within our qualitative evaluation of the CTH intervention are the notions of Team Science and Big Qual, both of which confer particular benefits and challenges in qualitative research. Team Science refers to the leveraging of cross-disciplinary expertise to address complex scientific issues (Salas et al., 2018). By bringing together experts from different fields, Team Science blurs disciplinary boundaries and fosters a more inclusive and collaborative analytic process that can enhance understandings of phenomena in ways that cannot be achieved by experts from a single scientific field (Hall et al., 2012; Stokols et al., 2008). However, Team Science can create challenges for the qualitative research process, especially with building consensus and maintaining efficiency (Skillman et al., 2019; Vindrola-Padros & Johnson, 2020). These matters are further complicated when the qualitative research is Big Qual, that is, when it includes data from at least 100 participants (Brower et al., 2019). While the large volume of data in Big Qual is beneficial for theory-building, innovation, and generalizability, the data collection and analysis processes are often time and resource intensive (Brower et al., 2019). Similar to Team Science, Big Qual raises concerns about achieving consensus and ensuring data trustworthiness and validity (Hossain & Scott-Villiers, 2019).
The challenges posed by Team Science and Big Qual to the qualitative research process are especially salient in the CTH implementation evaluation, which not only contains data from several hundred participants, but also involves large teams of interviewers and coders (e.g., at baseline, a team of 28 interviewers and 25 coders) who are collaborating across four distinct research sites. Further, the CTH implementation evaluation also involves collecting data longitudinally and in different study arms, adding to the complexity (“bigness”) of the dataset. Below, we highlight some of our successes and challenges, and we discuss what we have learned from our experience so far with conducting a large-scale, multi-site qualitative implementation science study.
Researcher Collaboration and Skilled Leadership
Effective collaboration among all members of a research team is critical for ensuring consistent data collection and for achieving consensus during data analysis, both of which increase the dependability, credibility, and trustworthiness of our findings (Wisdom et al., 2012). Therefore, it was important to identify point persons at each research site and establish a process for collaborating early on. Because researchers were geographically dispersed, and with the emergence of the COVID-19 pandemic, in-person collaboration was not possible. As such, we adapted by using virtual platforms such as Zoom for meetings and developed clear procedures for keeping track of group decisions and for organizing research documents. Furthermore, two experienced qualitative researchers from each site formed the cross-site qualitative analysis core (QAC), and they were charged with liaising between the QAC and the other interviewers and coders at their respective sites. This structure has helped facilitate efficient group decision making.
In addition to the QAC, each research site designated a senior researcher as its site lead. Site leads varied in their levels of qualitative expertise, but all shared a health services orientation to research and had experience with project management and the design and implementation of qualitative research. Beyond facilitating the data collection and analysis process, the site leads are also critical in helping the interview and coding teams develop and flourish as they progress through different stages of the implementation evaluation (Tuckman & Jensen, 2010). Furthermore, leads establish the group norms at their sites and provide coaching that helps the interview and coding teams perform at their expected levels.
Structured Processes for Communication
Excerpt of Log of Coding Discrepancies and Codebook Updates.
Structured Data Collection and Analytic Approaches
A notable distinction between our data collection efforts and those of many other qualitative research projects is the degree of structure in both our interview guide and our interview process. While many interviewers on our implementation evaluation teams had prior experience conducting semi-structured interviews, it became apparent early on that our qualitative data collection process required a better-defined interview guide than interviewers had used in the past. By incorporating specific question probes that interviewers were required to ask, the more directed interview guide helps to improve consistency throughout data collection while still allowing for flexibility in the interviews.
Structure of the Codebook.
Diversity Among Research Team Members
One of the challenges that emerged during data analysis of the baseline and first follow-up interviews related to the diversity in experience, training, and knowledge among our coders. Qualitative data collection and analysis trainings were held prior to the start of each round of data analysis to ensure a consistent approach to coding across all 25 coders. These trainings reviewed the goals of qualitative research, types of qualitative data collection, interviewing techniques, and deductive coding and theme generation approaches, and included training in the NVivo software used for all project coding and analyses. The trainings provided the coding team with foundational and applied knowledge about the HCS. However, they could not eliminate differences in “insider knowledge” about HCS communities. For example, coders who were involved in other components of the HCS were more aware of the CTH intervention strategies chosen by a specific community, leading them to code transcripts differently than coders who did not have this information. To overcome these differences, the QAC discussed the epistemological stance of our coding and decided that coders should in general take a constructivist approach to coding rather than an objectivist one (Chamberlain, 2014). In practice, this position means that coders are advised to derive their interpretations and meanings from their reading of the transcripts, which involves a methodological tradeoff: while we gain consistency in our coding process, we may lose additional insights that an objectivist approach could have provided. A small-group secondary analysis focused on targeted codes will proceed after the completion of baseline coding; this secondary analysis will use a combination of deductive and inductive coding, and we believe it will provide an opportunity to consider more nuanced insights.
Data Management and Data Sharing
To facilitate cross-site collaboration, it is important to establish reliable ways to securely store and manage the large volumes of data that the sites collect and analyze. For HCS, RTI International serves as the data coordinating center (DCC) and is responsible for data management and statistical support. Under the original workflow, after completing baseline coding, each site sent the DCC a single NVivo 12 file (the coding software the QAC selected for all qualitative analysis in HCS, as it was the tool most familiar to and most widely used across sites) containing the primary coding of all transcripts for that round of interviews. The DCC then merged these files, and investigators could request code reports for manuscript analyses, which the DCC supplied as exported Microsoft Word documents.
Two key issues emerged with the original workflow. The first issue concerned sharing of personally identifiable information (PII) in the transcripts. The original workflow required sites to de-identify the transcripts prior to sharing with the DCC. However, sites were concerned that de-identifying the transcripts would diminish interpretability, and they requested to be able to send identifiable transcripts to the DCC. This change required the DCC to establish a revised data use agreement with the sites.
The second issue pertained to the code reports provided back to the investigators. The Word file exports of code reports neither retain the file structure that NVivo applies nor allow for examination of code overlap, making them unsortable by transcript identifiers and inefficient for in-depth analysis of the primary coded data. The DCC’s motivation for this approach was to preserve data integrity by not sharing the full NVivo file and to better track use of the data. Moving forward, the QAC and the DCC have decided on several process improvements, including using cases to classify the data along key analytic variables (e.g., site, coalition role, study intervention assignment) and providing investigators with a limited NVivo data file that maintains the file structure and more readily supports secondary analysis exploring relevant code overlaps.
Saturation
Beyond the practical lessons described above, issues emerged around the concept of saturation that merit discussion. Saturation is a core concept in qualitative research (Glaser & Strauss, 1999) that refers to the idea that additional data collection will not produce new findings; it is typically used to justify ceasing data collection. For HCS, this concept was problematic to apply, as our sampling strategy needed to achieve representation both within and across communities while also working within the study’s resource constraints. These issues are common to all types of qualitative studies, yet a multi-site study has additional needs for consistency in methods and timelines. Further, given our interest in the implementation of the CTH, we purposefully sampled key informants with heterogeneous roles in their communities, raising questions about the level at which saturation should be assessed (e.g., the role, the community, the site, the study intervention). Ultimately, our sampling approach was determined a priori and was designed to achieve the broader study goals of understanding the context of implementation of the CTH (Sim et al., 2018). The strength of this approach is that it is well-oriented toward thematic saturation and code identification as well as framework-driven deductive analysis (Hennink et al., 2017; Saunders et al., 2018). Yet, these strengths must be weighed against the limitations of this approach; namely, that our data have more breadth than depth in any single community and are less suited to developing theory. This limitation was acceptable for HCS given that the study’s goal was to test and explicate the RE-AIM/PRISM model rather than develop theory, and it is also aligned with our deductive-dominant coding approach.
The impact of this limitation is also minimized in HCS due to the multiple rounds of data collection that can strengthen code refinement, as well as the additional data collection efforts that are part of HCS (e.g., surveys, fidelity reporting, case notes) that can help us triangulate our findings and develop in-depth case studies at the community level.
Challenges
Significant challenges arose in the implementation science qualitative evaluation related to its position as one part of the larger HCS. Many researchers on the larger HCS are experienced quantitative scholars with a more limited understanding of qualitative evaluation, which led to the need to advocate for sufficient resources to support the qualitative research activities. Additionally, competing goals and requirements that taxed staff working on different aspects of the HCS may have drawn attention away from the qualitative data collection and analysis process. Further, given the duration of the study, staff turnover occurred, and new interviewers and coders had to be added to the team throughout the process. While the complexity of the study was challenging for new staff entering at different phases of the project to understand, this challenge was mitigated by the established processes and trainings.
Considerations for Future Research
Given that the HCS is a multi-year study with multiple rounds of data collection, we have been able to incorporate what we learned during the baseline interviews and analysis process into subsequent rounds of data collection and analysis. One particularly important lesson was the need to carefully plan the timeframe for both our data collection and analysis processes. Due to the time required to obtain ethics approval and the need to quickly start the intervention in Wave 1 communities, baseline data collection occurred from late November 2019 through early January 2020. This resulted in interview schedules becoming compressed, given limited interviewee availability over the holidays. In addition to scheduling interviews during a different part of the year for the next wave, our improved understanding of the time required to ensure consistency of coding across sites led us to schedule additional time for the primary coding process for these new data.
An additional consideration for future research relates to analytic techniques for the qualitative data. So far, our teams have completed primary coding of 686 interviews across two rounds of data collection and anticipate that this number will nearly double over the remaining two rounds of data collection. Primary coding of each round of interviews, from codebook development to final coding, requires teams of 20 to 25 researchers approximately six to nine months to complete. While the teams have successfully sub-coded some themes for manuscript development, this process is resource intensive and increasingly complex given the incorporation of study design elements (i.e., the start of the CTH intervention). In short, the HCS has a Big Qual problem: How can we distill the data into trustworthy, interpretable, and meaningful components? This challenge is an emergent one for qualitative research, as to our knowledge, there are few qualitative studies of this scale. A critical component of the success of the HCS qualitative work is the level and duration of funding support for the study from the National Institute on Drug Abuse (NIDA). This sustained, five-year funding reflected NIDA’s large investment in a multi-site, multi-method study, and it allowed the sites to develop a robust process that facilitated collection and analysis of the Big Qual dataset. This situation has been extraordinary, and while qualitative work is becoming a more prominent feature of large-scale funded projects, the resources required to collect and analyze Big Qual datasets remain an obstacle.
Current qualitative methods may need to progress to lower the costs involved in managing datasets of this size and to work toward a more efficient and cognitively accessible process. Alternative approaches to traditional qualitative analysis, such as the breadth-and-depth approach (Davidson et al., 2019; Edwards et al., 2021), rapid analysis (Gale et al., 2019; Vindrola-Padros & Johnson, 2020), the matrix approach (Averill, 2002), qualitative comparative analysis (McAlearney et al., 2016), and natural language processing (Abram et al., 2020; Crowston et al., 2012; Leeson et al., 2019) all offer potentially innovative methodological tool sets. However, a thorough understanding of the tradeoffs of these approaches is lacking, a shortfall that deserves attention in future research to enable discovery using Big Qual.
Qualitative methods using a Team Science approach have seen limited application in large, multi-site randomized controlled trials of health interventions (Lewin et al., 2009; Mannell & Davis, 2019). While this paper reports on the experience to date of a large, multi-disciplinary, multi-site team, we are limited in that our focus is on a single study. We are hopeful that the perspectives we provide can inform future large-scale qualitative data collection and analysis projects that advance implementation science across settings. Incorporating qualitative methods is essential to understanding intervention uptake and maintenance, particularly for etiologically complex phenomena such as the opioid epidemic. Our experience provides practical guidance for future multi-site studies with large, experientially and disciplinarily diverse teams seeking to incorporate qualitative or mixed-methods components.
Footnotes
Acknowledgments
The authors would like to thank Dr. Ramona Olvera for her excellent assistance with this manuscript. This study protocol (Pro00038088) was approved by Advarra Inc., the HEALing Communities Study single Institutional Review Board. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health, the Substance Abuse and Mental Health Services Administration or the NIH HEAL Initiative®. ClinicalTrials.gov identifier NCT04111939.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the National Institutes of Health (NIH) and the Substance Abuse and Mental Health Services Administration through the NIH HEAL (Helping to End Addiction Long-term®) Initiative under award numbers UM1DA049394, UM1DA049406, UM1DA049412, UM1DA049415, UM1DA049417 (ClinicalTrials.gov Identifier: NCT04111939).
Correction (November 2023):
The eighth author’s name has been corrected in this version.
Ethics approval
All procedures were approved by Advarra Inc., the HEALing Communities Study single Institutional Review Board.
