Abstract
Due to differing research traditions and philosophical positions, qualitative and epidemiological research are rarely integrated. Therefore, here, we reflect on how we integrated epidemiological methods into a purposive sampling approach to obtain a highly diverse study sample for our in-depth qualitative study (N = 37) from the Dutch Lifelines Cohort Study (N = 152,728). This study contributes to mixed methods research, as we first reflect on benefits and challenges related to interdisciplinarity among researchers, also in relation to open science practices. Second, we discuss the integration of quantitative epidemiological elements into a purposive sampling strategy and how this resulted in an information-rich and varied study sample. Last, we provide hands-on recommendations and lessons learned on integrating epidemiological and qualitative techniques within an interdisciplinary setting.
Introduction—Integration of Quantitative, Epidemiological, and Qualitative Methods
Different types of research produce different types of knowledge. The knowledge gained from epidemiological quantitative research is often considered broad, more general, and focused on the “what,” whereas qualitative research produces in-depth, contextualized knowledge on the “how” and “why.” Usually, epidemiological research also aims to generalize and extrapolate research findings beyond the study sample via statistical inference, whereas qualitative research aims to understand phenomena within their specific culture, time, and place via a circular research process (Hennink et al., 2020). Partly due to these different research aims, the sampling methods of qualitative and quantitative research are often considered to be in contrast with each other (Yilmaz, 2013). Epidemiological research usually requires a random study sample that represents the population that is being studied to allow for generalization. Qualitative research in contrast does not aim for statistically representative study samples, but rather an information-dense sample that varies in (sociodemographic) characteristics that are relevant to the research question at hand. The study sample in qualitative research is not static either, as the relevance of certain characteristics or traits of the sample may evolve during the iterative research process (Miles et al., 2014; Tong et al., 2007). Therefore, in qualitative research the study sample is flexibly and purposively recruited, rather than statically and randomly as in epidemiology. Both types of sampling have their merits in their respective contexts (Mazzocchi, 2019).
Contrasts are also observed in the philosophical position of epidemiological research and qualitative research (Hennink et al., 2020). Epidemiological quantitative research is generally considered to be rooted in an epistemology of objectivism and ontology of realism (Johnson & Gray, 2015). The objectivistic epistemology assumes a rather direct production of truthful knowledge, and ontological realism assumes the existence of a universal reality. In contrast, qualitative research follows a more subjectivistic and relativistic approach (Johnson & Gray, 2015). Epistemological subjectivism considers the production of knowledge as a process that is dependent on one’s theories, assumptions, and frames of reference. The relativist ontology ignores or denies the existence of a universal reality (Porpora, 2015). In practice, epistemologies and ontologies exist on continua, which may intersect at any point, creating combinations (Goertz & Mahoney, 2012).
While certain phenomena are particularly well-suited to one specific type of research, a full understanding of many real-world research questions requires an approach that integrates both quantitative epidemiological and qualitative methods, as well as their respective philosophical positions (Ågerfalk, 2013). A large body of literature argues for the benefits of mixed methods and interdisciplinary research approaches, but few studies maximize the potential of a mixed methods design (Mazzocchi, 2019; van Grootel et al., 2020). Although “a whole that is greater than the sum of the individual qualitative and quantitative parts” is produced when integrating both approaches (Fetters & Freshwater, 2015, p. 116), many mixed methods studies apply singular methods derived from quantitative epidemiological and qualitative research in parallel and do not integrate them (Guetterman et al., 2015). This frequently results in a mere discussion of qualitative and quantitative data in the same study, instead of a truly integrative, synergistic approach in which qualitative and epidemiological quantitative approaches are combined into one design or methodology. An integrative approach allows for an increased quality of research via, for example, obtaining a diverse study sample and consequently a deeper understanding of the research question at hand (Goertz & Mahoney, 2012; Shan, 2022). However, descriptions of research processes that aim to overcome this “integration challenge” (Fetters & Freshwater, 2015), including their drawbacks and benefits, are required to advance the field, yet are rarely provided (Bates et al., 2023; Fetters & Molina-Azorin, 2017; Hasan et al., 2023). Besides technical aspects, the “integration challenge” also comprises matters of collaboration and issues concerning open science practices.
In this paper, we reflect on how we integrated methods for data collection through connecting, as we linked quantitative, epidemiological data to a qualitative, purposive sampling approach (Fetters et al., 2013). In addition, we discuss the challenges we encountered, and strategies we applied to mitigate these, in harmonizing our interdisciplinary research team.
First, we concisely describe the study at hand. Second, we reflect on the interdisciplinary nature of our team. Third, we discuss the preregistration of our interview study. Fourth, we describe in detail the integration of quantitative epidemiological elements into a purposive sampling strategy and how this resulted in a highly diverse study sample, including the benefits and drawbacks of such an integration. Last, we conclude why we consider the integration of qualitative and quantitative sampling methods highly valuable, and sometimes required, to answer real-world research questions, and we provide recommendations for such integration.
The Study—Approaches to Illness in a Family Context
We illustrate the integration of the epidemiological and qualitative approaches by describing an in-depth interview study (n = 37; 46% female; mean age = 49.1 years [SD = 12.0]), in which we explored participants’ illness approaches in the family context. We define these illness approaches as people’s dynamic and context-dependent organized set of ideas, feelings, and management strategies regarding health, illness, and disease (Gottman et al., 1996). We conducted semi-structured interviews with participants, exploring, among other things, their beliefs and ideas regarding experienced somatic symptoms, these symptoms’ causes and consequences, their management, and their general ideas about vitality and health. We also asked participants to reflect on their parents’ beliefs and ideas on these topics, and on whether and how these are integrated into the upbringing of their own children. This allowed us to explore people’s illness approaches in a family context and to explore how these approaches were involved in the intergenerational transmission of somatic symptom proneness. We designed a purposive sampling strategy using epidemiological data from a large-scale cohort study to obtain an information-dense, varied, and diverse study sample for the interview study. This study sample provided us with a wide range of experiences with the presence and burden of somatic symptoms in families and subsequently with different illness approaches. In our research, we adopted a critical realist approach, although we realized this only after reflecting on our research process. This means that we acknowledge that a universal reality exists, but we are also aware that the knowledge we produce about this reality is imperfect and influenced by the theories we use and the frames of reference of the researchers involved (Shan, 2022).
The Team—Interdisciplinary Collaboration
We anticipated that, to gain nuanced insights into illness approaches, we required a variety of disciplines in our research team, as our research crossed the traditional boundaries of the social, medical, and public health disciplines. Our team consisted of six researchers (one male and five female), each with their own background and expertise, including but not limited to epidemiology, sociology, general practice, pediatric medicine, pedagogy, developmental psychology, neurosciences, and biomedical sciences. The academic experience of the researchers ranged from 2 years to 29 years. Three researchers also had experience working in clinical practice.
Although the variety in experience, knowledge, and backgrounds enriched the collection and understanding of our data, optimal interdisciplinary collaboration requires a shared understanding of research aims, methods, and values among team members (Fetters & Molina-Azorin, 2017). Therefore, to integrate our philosophical assumptions, academic backgrounds, and research experiences, we engaged in structured discussions at the start of the research project to identify individuals’ research skills and preferences. This guided how responsibilities were distributed. In these processes, we had to balance individual preferences and skills, as well as individual autonomy and collaboration, which resulted in the development of a detailed procedural handbook that ensured consistency and transparency throughout the study. Another balancing act between individual autonomy and collaboration pertained to the analyses of the qualitative data within the larger research project. As multiple research projects based on the same qualitative dataset, but addressing different research questions, were conducted in parallel, we maintained regular updates on progress and discussed challenges in the data analysis.
The discussions allowed us to recognize and acknowledge each other’s backgrounds, discipline-specific discourses, practices, and methods. For example, team members with a background in epidemiology tended to adopt a more objectivist epistemology, whereas those with backgrounds rooted in the social sciences leaned more toward constructivism. Similarly, team members with clinical experience and those with predominantly academic experience held different views on the balance between pragmatism and theory. By explicitly discussing research traditions related to each discipline, we reached a common understanding of these that facilitated equal collaboration in the project (Bates et al., 2023). These reflexive practices were maintained throughout the study to remain aligned in our approaches. Simultaneously, we collaboratively developed an overarching theoretical model, the research questions, sampling strategy, interview guide, and coding system as used in our study. This was an iterative process that developed during the research process. Five members of the research team participated in the qualitative data collection. To improve our interview skills, and minimize differences in experience with interviewing, we started the research project with an actor-based interview training. Furthermore, we discussed and exchanged interview experiences throughout the whole research process.
We achieved intensive collaboration among our relatively large team of researchers. All interviews were transcribed verbatim by the researchers, allowing for familiarization with the qualitative data. Thereafter, transcripts were independently double-coded by two researchers from different disciplines. The composition of coding duos was varied throughout the study to minimize coding bias, and duos discussed incongruences in the coding of the transcripts. Decisions on incongruences were logged. Here, we noticed at times that different research disciplines held distinct vocabularies, which sometimes led to conceptual misunderstanding of the defined codes. By keeping a structured glossary in which we updated the shared definition of discussed codes, we overcame these misunderstandings and simultaneously improved our coding scheme. Interview experiences, transcripts, interpretation of participants’ shared experiences, and applied codes were discussed among all researchers to develop a global overview of participants’ illness approaches. Personal triggers or emotions experienced by the researchers were also addressed. Due to the interdisciplinary nature of the research team, it was vital to build and maintain consensus during coding and analyses. This continued close collaboration facilitated a more natural intertwinement of disciplines in our study.
Increasing Trustworthiness—Preregistering Qualitative Research
To increase the trustworthiness of our research (Lincoln & Guba, 1986), we preregistered our qualitative research project on the Open Science Framework (OSF; https://osf.io/a4gn2). In a preregistration, one generally fixes the research question, hypotheses, and methods at a certain timepoint, ensuring that these are not altered to fit the results after data analyses (Bosnjak et al., 2022). This idea of a static preregistration is more fitting for epidemiological, quantitative research. Qualitative research, however, is an iterative process by design. Inherent to the process is the constant redevelopment and redefinition of data analysis. Even data collection is arguably an inconsistent process, as researchers function as an add-on to the measurement instrument (i.e., the interview guide) itself: different interviewers have different styles of interviewing, different triggers, and different associations with the narrative of the interviewee (Haven & Van Grootel, 2019).
The preregistration exemplifies how characteristics of quantitative epidemiological research and qualitative research may oppose each other, because the flexibility of qualitative research initially interfered with our notion of a preregistration. We, however, consider our preregistration of this qualitative study as a living document that functions as an audit trail for all to assess. It is not an abandonment of the flexibility of qualitative research, as a clear motivation on why we deviated from the preregistration does not make our research less valid. Although the interpretative and cyclical process of qualitative research cannot be predefined, the preregistration of our qualitative study forced us to explicate our research paradigm, sampling methods, and analytic strategies. This also contributed to the alignment of our interdisciplinary research team.
Integration—Synergistically Combining Epidemiological Quantitative and Qualitative Methods
In this study, integration of epidemiological and qualitative methods was achieved during the sampling phase of the qualitative study, which was started upon completion of the preregistration. Based on literature and expert opinion, we hypothesized that one’s illness approach relates to, among others, the experienced personal and familial symptom burden (Lynch-Jordan et al., 2013). Therefore, we used epidemiological data from the Lifelines Cohort Study, including sex, age, and information on somatic symptom burden within families, to inform and structure our purposive sampling strategy for participants in qualitative interviews. Via this approach, we ultimately identified and recruited a highly diverse and varied study sample, which enabled us to capture a wide range of experiences with somatic symptoms in families differing in both symptom burden and illness approaches. By embedding quantitative stratification, based on previous epidemiological analyses, within the sampling strategy for a qualitative interview study, we operationalized an integrative mixed methods sampling approach.
Lifelines is a multi-disciplinary prospective population-based cohort study examining in a unique three-generation design, via which families were included, the health and health-related behaviors of 167,729 persons living in the North of the Netherlands. Lifelines employs a broad range of investigative procedures in assessing the biomedical, sociodemographic, behavioral, physical, and psychological factors, which contribute to the health and disease of the general population, with a special focus on multi-morbidity and complex genetics. Lifelines participants consented to being contacted for add-on studies, such as the current qualitative interview study. Lifelines participants do not receive monetary incentives to participate in the main or add-on studies. Extensive information on the cohort, representativeness, design considerations, and recruitment procedures is provided elsewhere (Klijs et al., 2015; Scholtens et al., 2015).
In Lifelines, adult participants’ (N = 152,728) somatic symptom burden over time was previously estimated by means of data-driven latent class analyses of participants’ mean scores of the Symptom CheckList-90 Somatization subscale (Zijlema et al., 2013). This resulted in five distinct categories of somatic symptom burden (Figure 1) (Ballering et al., 2022). The majority of adult participants (N = 150,494; 98.5%) were categorized into one of these five burden categories, while the algorithm could not categorize a small minority of participants (N = 2,234; 1.5%).
Figure 1. Visualization of the integration of epidemiological information into the qualitative sampling strategy
By combining information on somatic symptom burden and family relations within Lifelines, we developed a sampling strategy that maximizes variation in somatic symptom burden within families and between generations in our qualitative study sample. First, we selected participants of whom we had information on their somatic symptom burden (N = 150,494). Based on the five aforementioned symptom burden categories, we dichotomized participants into subgroups with a low (category 1 and 2) or high (category 4 and 5; Figure 1) somatic symptom burden (N = 139,398 and N = 3,928, respectively). Second, we selected participants with children who live or had ever lived at least 50% of the time in their home (N = 102,191). Third, we only selected participants whose parents also provided information on their somatic symptom burden in Lifelines (N = 6,618). Within this population, we defined four patterns of familial somatic symptom burden: (1) a pattern in which the participant and both parents had a low symptom burden (N = 6,363); (2) a pattern in which the participant had a low symptom burden and at least one parent had a high symptom burden (N = 186); (3) a pattern in which the participant had a high symptom burden and both parents had a low symptom burden (N = 49); and (4) a pattern in which at least one parent had a high symptom burden, as well as the participant (N = 20). This process is visualized in Figure 1.
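The pattern definitions above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' actual selection code: the function names (`dichotomize`, `familial_pattern`) and the data representation (integer burden categories 1 to 5) are hypothetical, and the handling of cases the text does not spell out (category 3, or a family without exactly two categorized parents) is an assumption, returning `None` for such cases.

```python
from typing import Optional, Sequence

# Burden categories as described in the text: 1-2 = low, 4-5 = high.
# Category 3 falls in neither subgroup and is left unclassified here.
LOW_CATEGORIES = {1, 2}
HIGH_CATEGORIES = {4, 5}


def dichotomize(category: Optional[int]) -> Optional[str]:
    """Map a five-level burden category to 'low', 'high', or None."""
    if category in LOW_CATEGORIES:
        return "low"
    if category in HIGH_CATEGORIES:
        return "high"
    return None


def familial_pattern(
    participant_category: Optional[int],
    parent_categories: Sequence[Optional[int]],
) -> Optional[int]:
    """Return the familial pattern (1-4), or None if no pattern applies.

    Assumption (not stated in the source): exactly two parents are
    required, and 'both parents low' means both fall in category 1 or 2.
    """
    own = dichotomize(participant_category)
    parents = [dichotomize(c) for c in parent_categories]
    if own is None or len(parents) != 2:
        return None
    any_high = "high" in parents
    both_low = all(p == "low" for p in parents)
    if own == "low" and both_low:
        return 1  # participant low, both parents low
    if own == "low" and any_high:
        return 2  # participant low, at least one parent high
    if own == "high" and both_low:
        return 3  # participant high, both parents low
    if own == "high" and any_high:
        return 4  # participant high, at least one parent high
    return None


# Pattern sizes reported in the text; together they make up the
# N = 6,618 participants with parental symptom burden information.
PATTERN_COUNTS = {1: 6363, 2: 186, 3: 49, 4: 20}
TOTAL_WITH_PARENTAL_DATA = sum(PATTERN_COUNTS.values())
```

Under these assumptions, a participant in category 5 with parents in categories 4 and 1 falls into pattern 4, the smallest group, which illustrates why that stratum was depleted quickly during recruitment.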
We aimed to invite an equal number of people from each pattern, while maintaining maximum variation in participants’ sex and age. As a pilot, we invited five non-Lifelines participants via convenience sampling to participate (three male and two female pilot participants). During these pilots, we tested our interview guide and further developed our interview skills. Amendments made after the pilot interviews mainly involved rephrasing questions and adding questions and probes (the amendments are further described on OSF). Thereafter, we sent out five rounds of invitation letters to potential interviewees’ home addresses, with a reminder after a month. During each round, 32 to 64 Lifelines participants were invited to return an informed consent form if they wanted to participate in the interview study. As the last pattern of familial symptom burden (N = 20) was depleted after two rounds and the third pattern (N = 49) after three rounds, we also invited participants with a high symptom burden but unknown parental symptom burden. In total, we invited 224 Lifelines participants, of whom 35 returned an informed consent form (15.6% response rate). The participants who were invited per pattern were selected based on their sociodemographic characteristics. The relevance of these characteristics was determined throughout the research process, as we strived to maximize diversity in our sample. This iterative approach also facilitated saturation in our data. After we purposively selected potentially information-rich participants to aim for a highly diverse sample, we found that code saturation became apparent after 20 interviews (Rahimi & Khatooni, 2024). In practice, this means that during the preliminary coding processes we found that identified codes became repetitive, no new codes were added, and we observed no new relationships among the codes (Hennink et al., 2017; Rahimi & Khatooni, 2024).
As most Lifelines participants have been included in the cohort study for multiple years, these participants are used to participating in research and potentially feel a sense of duty to do so. Eventually, recruitment resulted in 32 interviews with Lifelines participants, as three opted out after providing informed consent. We included a member check by giving interviewed participants the opportunity to read their verbatim transcript and return it with corrections or additions within 4 weeks. Corrections or additions could be returned to the researchers via phone or e-mail. None of the interviewed participants chose to provide alterations or additions to the transcript.
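The recruitment figures reported above can be checked with a short worked calculation. The sketch below uses only numbers stated in the text; the variable names are illustrative, and adding the five pilot interviews to the Lifelines interviews reflects the authors' statement that pilot interviews were included in the study sample.

```python
# Recruitment flow, using the figures reported in the text.
invited = 224            # Lifelines participants invited over five rounds
consented = 35           # returned an informed consent form
opted_out = 3            # withdrew after providing informed consent
pilot_interviews = 5     # non-Lifelines pilot participants

response_rate = consented / invited              # 35 / 224 = 0.15625
lifelines_interviews = consented - opted_out     # 35 - 3 = 32
total_sample = lifelines_interviews + pilot_interviews  # 32 + 5 = 37

print(f"Response rate: {response_rate * 100:.3f}%")
print(f"Lifelines interviews: {lifelines_interviews}")
print(f"Total study sample: {total_sample}")
```

The resulting total of 37 matches the sample size reported for the interview study, and the response rate of 15.625% corresponds to the reported 15.6%.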
The Integrative Approach—Strengths and Limitations
[Table: Overview of the sociodemographic characteristics of the participants]
a Categorization of educational level in Lifelines is defined elsewhere (van Zon et al., 2017).
b Categorization of urbanization is defined by Statistics Netherlands (Statistics, 2023).
[Table: Occupations of interview participants]
A second strength of our integrative approach relates to the increased transparency and trustworthiness of our study due to our sampling strategy. As we integrated longitudinal and intergenerational epidemiological data into our purposive sampling strategy, we could a priori define diversity in familial symptom burden to guide our sampling. This predefinition of diversity also allows for transparent flexibility in sampling. While other purposive sampling strategies, such as ethnographic case-control sampling, usually define “cases” and “controls” in a binary manner and on an individual level, we were able to move beyond this by sampling from a spectrum of symptom burdens across generations. Purposive sampling that accounts for diversity across generations is rarely seen in qualitative research.
A limitation inherent to add-on studies within a large-scale epidemiological cohort study is a potential selection bias, as only those participants who are intrinsically interested in health and healthcare are willing to participate. As we informed participants of the aim of our research, it may be that only participants who have a clear idea about their illness approach wanted to participate in this study. Our analyses, however, showed that the study sample included participants in whom a clear illness approach was absent and who returned an informed consent form only because they felt it was their duty as a Lifelines participant. This indicates that the effect of the aforementioned limitation may be limited. To further mitigate any potential selection bias, we also included pilot interviews in our study sample, as we found that these were highly informative as well.
Additionally, Lifelines is a large-scale general population cohort representative of the North of the Netherlands. Nevertheless, our stringent criteria for defining familial symptom burden rendered some categories, especially the category including participants and parents with a high symptom burden, relatively small and easily depleted. This could be considered a limitation from an epidemiological point of view. However, as discussed earlier, qualitative research is not driven by a need to generalize findings to the Dutch general population. The adaptivity of qualitative research allowed us to redefine our sampling strategy: once specific categories of familial symptom burden became depleted, we recruited participants with a high symptom burden whose parents’ symptom burden was unknown.
Conclusion and Recommendations
Foremost, in this paper we discuss how we integrated a quantitative epidemiological method to define relevant characteristics of participants with a qualitative, purposive sampling approach. This integrative approach, combined with the interdisciplinarity of the research team, has significant value in obtaining a diverse and varied qualitative study sample. Subsequently, this approach contributes to enhancing the obtained qualitative data. It should be noted that we consider this strategy a guiding tool to diversify study samples and to potentially enrich datasets. Our approach can be tailored to the respective study’s design, research question, and study sample. We do not recommend this method as a gold standard that is suitable for every context. If a large-scale cohort study is available, this complementary strategy may have added value. In smaller cohorts, this strategy may be adapted to involve less stringent criteria to avoid depletion of categories of participants. If no cohorts are available, one may consider initiating a mixed methods study in which respondents to a survey are stratified according to relevant criteria, allowing for selective sampling from the defined strata of participants (Kouwenhoven et al., 2013).
We recommend assessing participants’ reasons for non-response. This would allow for adaptation of the sampling strategy to increase the response rate (Haan et al., 2024). Compensating participants for their time, if allowed by the protocol of the cohort study, may be a strategy to increase the response rate as well.
We also recommend that researchers, especially those operating in an interdisciplinary team or those who are relatively new to qualitative research, explicate and discuss their philosophical position at the very beginning of the research process. Consensus on the philosophical position allows for motivation and justification of the study design, and thus guides methodological choices, such as integrating epidemiological, quantitative, and qualitative approaches (Shan, 2022). Clearly articulating the researchers’ philosophical position beforehand enhances the transparency of the research process, allowing readers to understand how the researchers’ point of view may have shaped data interpretation, analyses, and reporting (Creswell & Creswell, 2022). We realized, only after thoroughly reflecting on our research process, that we maintained a critical realist position throughout our research. Explicitly stating this position early on could have further smoothed the research process.
Contribution to Mixed Methods Research
In conclusion, we show that integrating quantitative epidemiological elements into a qualitative purposive sampling strategy allows for obtaining a highly diversified study sample, ultimately contributing to increased transparency and enhanced quality of research. We consider this study a successful example of how barriers to integrating qualitative and quantitative research methods can be resolved, but also of how interdisciplinarity enhances research outcomes. It shows that our approach aids in answering complex, real-world research questions.
Footnotes
Acknowledgements
The Lifelines Biobank initiative has been made possible by subsidy from the Dutch Ministry of Health, Welfare and Sport, the Dutch Ministry of Economic Affairs, the University Medical Center Groningen (UMCG, the Netherlands), the University of Groningen, and the Northern Provinces of the Netherlands. The authors wish to acknowledge the services of the Lifelines Cohort Study, the contributing research centers delivering data to Lifelines, and all the study participants. This work is part of the project FEEL-IT with project number VI.C.191.021 of the research program NWO Talent Program Vici which is financed by the Dutch Research Council (NWO). The funders had no role in relation to the study design, data collection, analysis or interpretation, or writing the report.
Ethical Considerations
The Lifelines cohort study was approved by the Medical Ethical Committee of University Medical Center Groningen (2007/152), and participants provided written informed consent to take part. The qualitative add-on study as described in this manuscript was approved by the non-WMO committee from the UMCG (202200104).
Consent to Participate
All participants of the add-on study described in this manuscript have provided written informed consent to take part in this study.
Author Contributions
AB: conceptualization, data curation, investigation, software, formal analysis, visualization, writing—original draft, and writing—review and editing; MH: formal analysis, writing—original draft, and writing—review and editing; EH: investigation, data curation, and writing—review and editing; SvdZ: investigation, data curation, and writing—review and editing; SB: investigation, data curation, and writing—review and editing; DvT: conceptualization, data curation, supervision, and writing—review and editing; JR: conceptualization, supervision, project administration, funding acquisition, and writing—review and editing.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Dutch Research Council (NWO) and is part of the project FEEL-IT (grant number VI.C.191.021).
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
Lifelines data will not be shared publicly due to the sensitive nature of the data. Access to the Lifelines data is organized according to a strict data access procedure. For all types of access, a research proposal must be submitted for evaluation by the Lifelines Research Office. The evaluation is performed to align the goals of the researchers with the goals of Lifelines (which are in turn aligned with the informed consent form signed by Lifelines participants). Further information on Lifelines data can be obtained by contacting the Lifelines Research Office.
