Abstract
All qualitative researchers are familiar with the idea of saturation: that researchers should continue to collect and/or analyze data until nothing new is being added to their arguments or conclusions. Saturation is, however, used and understood in a variety of ways, often appearing as an unevidenced and dogmatic statement used to justify the claim that a piece of research is complete. This article explores the application of the idea of saturation in qualitative research, noting its association with grounded theory and the particular interest taken in it by health researchers. It concludes that saturation is both a misunderstood and an overworked concept.
Introduction
Saturation—which supposedly occurs when the collection and/or analysis of additional data adds nothing new to a piece of research—is a key concept in qualitative research. At one level, it can be seen as an attempt to place qualitative analysis on a par with quantitative; just as quantitative researchers can claim, through the application of the appropriate statistical tests, that their findings are “significant” and generalizable, so qualitative researchers can claim that their analysis is saturated and complete. At another level, saturation allows the harassed and time-poor qualitative researcher to call it a day on a particular project and move on.
This article argues that the concept has been overworked and misunderstood, and that it is perhaps now time to replace or retire it. Just as claiming saturation allows a project to end, we should recognize that the application of saturation in qualitative research is itself now well past saturation point and should hence be discontinued. We may then rely on other means for judging qualitative research, of which there are many. The article constructs this argument through an interrogation of the research literature that has used the concept over the past 60 years or so.
The next section considers the origins and meaning of the term, and how its achievement may be assessed, exploring its close connection to grounded theory and its subsequent application throughout qualitative research. A series of issues that the concept raises are then considered, including the relation between saturation and sample size, how saturation may be evidenced, the dominance of health researchers in discussions of saturation, whether saturation has different meanings in different disciplines, and whether it is being used in another attempt to “quantify” qualitative research.
Origins, Meanings, and Assessment
The notion of saturation is particularly associated with the grounded theory approach to qualitative research. In their much-referenced book, the originators of that approach, Glaser and Strauss (1967/2017) explained the term in the following fashion, linking it closely to the development of theory and categories: the sociologist must continually judge how many groups he should sample for each theoretical point. The criterion for judging when to stop sampling the different groups pertinent to a category is the category’s
Note the emphasis on the care that needs to be taken by the researcher to ensure that saturation has genuinely been reached. Glaser and Strauss clearly envisaged this process as being both lengthy and troublesome.
In a detailed commentary on this thinking, another key grounded theory author, Dey (1999), argued that “theoretical sufficiency” might be a better and more appropriate term than “theoretical saturation” as “‘Saturation’ . . . seems to imply that the process of generating categories (and their properties and relations) has been exhaustive rather than merely ‘good enough’” (p. 117). One can, of course, detect in this argument a practical desire to make the grounded theory process less lengthy and troubling, and thus more practical.
Glaser and Strauss later developed grounded theory in somewhat different ways, and new thinkers came to the field. Strauss allied with Corbin and they, similar to Dey, sought to offer a rather more nuanced explanation of saturation: Only when a researcher has explored each category or theme in some depth, identifying its various properties and dimensions under different conditions, can the researcher say that the research has reached the level of saturation. In reality, a researcher could go on collecting data forever, adding new properties and dimensions to categories. Eventually, a researcher has to say this concept is sufficiently well developed for purposes of this research and accept what has not been covered as one of the limitations of the study. (Corbin & Strauss, 2015, p. 140)
This explanation suggests that definitively knowing that saturation has occurred may not be possible, offering instead the rather looser and more subjective criterion of “sufficiently well developed,” which may be seen as analogous to Dey’s “good enough.”
Charmaz, probably the most cited contemporary exponent of grounded theory, offers another variant to these criteria, “nothing new happening”: Unfortunately many researchers have detached the term “saturation” from its theoretical moorings in grounded theory. Some researchers treat saturation in conjunction with theoretical sampling in grounded theory; many do not. The common use of the term “saturation” refers to nothing new happening. (Charmaz, 2014, p. 213)
As Charmaz points out, by the time that she was writing, the idea of saturation had lost its close association with grounded theory and had come to be applied throughout qualitative research. Some researchers interpreted it in much the same way as grounded theorists: for example, Bowen (2008) explained that “saturation is reached when the researcher gathers data to the point of diminishing returns, when nothing new is being added” (p. 140). Others, however, sided with Charmaz in questioning the varied application of the concept by qualitative researchers:
This seeks to make clear the different meanings and applications of saturation in grounded theory and qualitative research in general. Rather unhelpfully, Hood (2007) went on to note that “The concept of ‘theoretical saturation’ is as difficult to explain as it is for most researchers to understand,” which would suggest that seeking to achieve it was also problematic.
O’Reilly and Parker (2012), however, took a rather different view: While grounded theory has clear guidance about what constitutes theoretical saturation, how to apply it and when to use it, the new meanings in relation to other qualitative approaches are less developed. . . As saturation becomes unquestioned and expected, it is necessary to take time to reflect on what this actually means for research practice. (p. 196)
That “clear guidance,” however, has evidently been interpreted in different ways by different authors, within grounded theory and beyond, so what it actually means for practice may be rather varied.
In the broader arena of qualitative research as a whole, researchers were quick to point out that there was no “one-size-fits-all” approach to saturation available (e.g., Fusch & Ness, 2015; Nelson, 2017). Researchers needed to take an approach that was appropriate for their particular research method and field, and be explicit about how they determined that saturation had been achieved.
Thus, Braun and Clarke (2021), writing on the use of reflexive thematic analysis by sport and exercise researchers, argued, We encourage sport and exercise and other researchers using reflexive TA [thematic analysis] to dwell with uncertainty and recognise that meaning is generated through interpretation of, not excavated from, data, and therefore judgements about “how many” data items, and when to stop data collection, are inescapably situated and subjective, and cannot be determined (wholly) in advance of analysis. (p. 201)
It is refreshing, and also sensible, to see the subjectivity of qualitative research being recognized here, along with the impracticality of predicting in advance how much research will be needed.
Fusch and Ness (2015) took a different approach in adding further criteria for saturation to those already identified: “Data saturation is reached when there is enough information to replicate the study, when the ability to obtain additional new information has been attained, and when further coding is no longer feasible” (p. 1413). This reads a little contradictorily, but does recognize the practical demands placed on small-scale researchers.
Nelson (2017) went so far as to suggest that the term “saturation” be dropped and replaced with “conceptual depth,” for which he offered five criteria (range, complexity, subtlety, resonance, and validity). Sebele-Mpofu (2020) put forward a similar argument: “Researchers must also strive not to let their pursuit of saturation overshadow other important measures of quality in qualitative research such as: credibility, diversity, conformability, trustworthiness and reliability” (p. 17). One concern with both of these formulations is that, notably through the suggestion of using the criteria of reliability and validity (however they are interpreted), they begin to overlap with the criteria conventionally used for judging quantitative research.
Saunders et al. (2018) examined the existing research literature, identifying four models of saturation, each of which appears to make different core assumptions about what saturation is, and about what exactly is being saturated. These have been labelled as: theoretical saturation, inductive thematic saturation, a priori thematic saturation, and data saturation (p. 1903).
To these four, we may add code, conceptual, construct, meaning, researcher, studywise, and thematic (not qualified in any way) saturation as other terms that have been used in the literature; and there are probably others that I have yet to come across.
As with much terminology in the social sciences, therefore, it appears that the longer the idea of saturation has been in use the more it has become qualified and/or used in different ways.
Issues
As already suggested, the use of the concept of saturation in qualitative social research raises a number of issues. These include
the relation between saturation and sample size, and how saturation may be evidenced,
the dominance of health researchers in published discussions of saturation, and whether saturation has different meanings in different disciplines, and
whether the use of saturation is another attempt to “quantify” qualitative research.
These issues will now be discussed in turn.
Sample Size and Saturation
As anyone who has supervised qualitative social researchers for undergraduate, master’s, or doctoral degrees will know, one of the most common questions you will be asked is “how many interviews do I need to do?” or its equivalent. While at times we may suspect we are being asked this because the questioner wants to minimize—or at least put some limits on—their workload and effort, and have a convenient authority figure to justify their doing this, there are genuine concerns about unnecessarily overdoing data collection and ending up with lots of redundant material.
Firming up the link between sample size and saturation may work in two ways. You might, as in the question posed in the previous paragraph, want to know in advance how many interviews you will need to do to reach saturation. Or the interviews may already have been done and you’d like to know how many of them you’ll need to analyze to reach saturation.
Earlier writing on sample size and saturation both bemoaned the lack of such guidance and doubted those researchers who claimed to have reached saturation. For example, In qualitative research, there are no published guidelines or tests of adequacy for estimating the sample size required to reach saturation equivalent to those formulas used in quantitative research. Rather, in qualitative research, the signals of saturation seem to be determined by investigator proclamation and by evaluating the adequacy and the comprehensiveness of the results. (Morse, 1995, p. 147)
Thorne and Darbyshire (2005), in humorous vein, referred to this problem somewhat bizarrely as “the wet diaper”: Surprisingly not the exclusive domain of grounded theorists, this phenomenon represents the claim of “theoretical saturation” (that is, no new information will arise from further sampling) merely by conveniently ignoring the complexities inherent in any human health-related experience. Despite health disciplines whose logic is invested in a theory of infinite possible variance of the inherent complexities involved, the saturation claim is often invoked as a convenient stopping point. (p. 1108)
They didn’t, however, offer any suggestions as to how the problem might be handled.
Morse (1995, p. 149), then the editor of
Select a cohesive sample. The greater the cohesiveness of the sample, the faster saturation will be obtained, but the less generalizability of the project. . .
Saturation will be achieved most quickly if theoretical sampling is used. Snowball, or a convenience sample, will result in saturation being achieved more slowly. With a random sample, saturation may never be achieved. . .
Sample all variations appearing within the data until each “negative case” perspective is saturated. . .
Saturated data are rich, full, and complete. The resulting theory makes sense and does not have gaps.
The more complete the saturation, the easier it is to develop a comprehensive theoretical model.
These principles offer some useful guidance, but many qualitative researchers may not want a cohesive sample or be able to judge how cohesive or not their actual sample is. The fourth and fifth principles involve value judgments, but by whom—the researcher(s), their supervisors or funders, or their broader readership?
Five years later, Morse offered more comprehensive, but still not very definitive, guidance, including that Estimating the number of participants in a study required to reach saturation depends on a number of factors, including the quality of data, the scope of the study, the nature of the topic, the amount of useful information obtained from each participant, the number of interviews per participant, the use of shadowed data, and the qualitative method and study design used. Once all of these factors are considered, you may not be much further ahead in predicting the exact number, but you will be able to defend the estimated range presented in your proposal. (Morse, 2000, p. 3)
One of the key problems with this approach, though, is that, although you may be able to provide a defensible estimate of the number of participants required, you could then find—once you have completed and analyzed the interviews—that saturation has not been reached.
By contrast, more recent writing on this topic has been characterized by clearer and numeric guidance, either proposing quantitative methods that may be used to assess whether saturation has been reached in qualitative research (e.g., Fofana et al., 2020; Galvin, 2015; Guest et al., 2020; Lowe et al., 2018; Tran et al., 2017; van Rijnsoever, 2017) or specifying how many interviews and/or focus groups are needed in particular circumstances to reach saturation (e.g., Coenen et al., 2012; Francis et al., 2010; Fugard & Potts, 2015; Guest et al., 2006, 2017; Hagaman & Wutich, 2017; Hennink et al., 2017, 2019; Namey et al., 2016).
Hennink and Kaiser (2022), for example, carried out a systematic review that “identified studies that empirically assessed saturation in qualitative research, documented approaches to assess saturation, and identified sample sizes for saturation” (p. 9). Their analysis demonstrates that “saturation can be achieved in a narrow range of interviews (9–17) or focus group discussions (4–8), particularly in studies with relatively homogenous study populations and narrowly defined objectives” (Hennink & Kaiser, 2022, p. 9). These ranges allow for variations in the complexity and detail of the information being sought and the homogeneity of the population it is being sought from.
These figures, particularly those for focus groups, may seem readily achievable, even for small-scale research, which of course is what the bulk of qualitative research is. But is it realistic for other social science researchers to simply pick up and apply these suggestions, whatever their discipline or subject of study? Vasileiou et al. (2018), writing in a health research context, would caution otherwise: The past decade has seen a growing appetite in qualitative research for an evidence-based approach to sample size determination and to evaluations of the sufficiency of sample size. . . To ensure and maintain high quality research that will encourage greater appreciation of qualitative work in health-related sciences, we argue that qualitative researchers should be more transparent and thorough in their evaluation of sample size as part of their appraisal of data adequacy. We would. . . caution against responding to the growing methodological research in this area with a decontextualised application of sample size numerical guidelines, norms and principles. (p. 16)
Simply applying Hennink and Kaiser’s (2022) guidelines to any qualitative research project, particularly when the population being sampled is not homogeneous and the subject of study is broad and/or ill-defined, is not, therefore, a sensible or defensible strategy. As Sim et al. (2018) conclude, “defining sample size a priori is inherently problematic in the case of inductive, exploratory research, which, by definition, looks to explore phenomena in relation to which we cannot identify the key themes in advance” (p. 630).
We may agree with Bowen (2008) that “claims of saturation should always be supported by an explanation of how saturation was achieved and substantiated by clear evidence of its occurrence” (p. 150), but we have to ask how we can know whether saturation has been reached, especially if the topic being studied is complex. There is always the possibility that the next interviewee or the next focus group, particularly if we recruit outside of our comfort zone, will provide responses that challenge our existing analysis. And those responses might be just the ones we need to progress the understanding of our field further.
The Use of Saturation in Health Studies and Other Disciplines
It is, perhaps, telling that 22 of the 32 articles referred to in this article were authored by health researchers and/or published in health research journals; several of the quotations that have been used have also indicated this. Although Glaser and Strauss were sociologists, their original book on grounded theory (1967/2017) was written with the support of a health service research grant. It may be that saturation, and its relation with sample size, is much more important in the health research context. After all, In the patient-reported outcomes field, strict regulatory requirements must be met for qualitative research that contributes to labeling claims for medicinal products. These requirements not only emphasize the importance of reaching saturation but also of providing documentary evidence that saturation has been reached. (Kerr et al., 2010, p. 269)
In health studies research, where research is overwhelmingly grant-driven, it is critically important, therefore, to be able to budget for the sample size that will be needed to reach saturation.
The nature of interviews and focus groups in health studies research also differs somewhat from practice in other social science disciplines. Three examples from the research on saturation already referred to will be used to illustrate this:
Guest et al.’s (2006) research “examined how women talk about sex and their perceptions of self-report accuracy in two West African countries—Nigeria and Ghana” (p. 62). Their “interview guide consisted of six structured demographically oriented questions, sixteen open-ended main questions, and fourteen open-ended subquestions” (p. 63).
Guest et al.’s (2017) research involved “focus group discussions among members of the African American community in Durham, North Carolina, to solicit their opinions on the health issues most in need of research within their community” (p. 9). Their focus group instrument contained nine questions.
Hagaman and Wutich’s (2017) “research was designed to explore cultural understandings of problems in water delivery, fairness in water systems, and solutions to water problems” (p. 28). A total of 132 respondents were interviewed at sites in Bolivia, Fiji, New Zealand, and the United States: “Ethnographic interviews were conducted. . . The protocol contained open-ended questions covering local water sources, threats to future water supplies, potential solutions for improving water security, and respondent demographics. . . On average, interviews lasted about 30 minutes” (p. 29).
What this suggests is that interviews and focus groups in health studies research tend to be both more highly structured and shorter than in some other disciplines. They may even be characterized as quasi-quantitative in nature, recognizing and accepting only a limited range of responses to the questions being asked. In such circumstances—with reasonably homogeneous samples—it makes a lot of sense to link saturation to the number of interviews or focus groups required.
In other disciplines, and other forms of qualitative social research, however, it may be that saturation usually means something else. In such cases, interviews and focus groups may be rather more open-ended or semi-structured, and the researchers may have less of an idea as to what their respondents might say. Saturation might then be grasped at as a signal that the researcher has done or had enough: Qualitative research talks about data saturation; that is, stopping data collection once new information is no longer identified and when only repetitions are noted. . . On the other hand, to the extent that each life is unique, no data are ever truly saturated: there are always new things to explore. In our study, however, fewer interviews might have reduced researcher fatigue and led to timely data analysis. . . This also ensures that researchers are not “drowning in data,” another contributor to researcher saturation. (Wray et al., 2007, p. 1400)
Saturation, in these circumstances, may then be simply an acknowledgment of what the researcher is able to cope with in a particular time frame.
The “Quantification” of Qualitative Research
The interest in saturation, particularly the attempts to link it to sample size, may also be seen as another attempt to “quantify” qualitative research (I use double quotation marks to indicate that this is not necessarily achievable in any full sense). It has already been suggested that some health research may employ interviews and focus groups in a quasi-quantitative fashion, but the concern is broader than that. Being able to specify that, to research a given topic with a given population, you need to carry out interviews with 10, 15, or 20 people, or convene 5 or 10 focus groups, or whatever the numbers are, reduces qualitative research to its lowest common denominator. It invites the researcher to do the absolute minimum required, write up the results, and move on quickly to the next project.
The use of saturation is not the only example of such attempted “quantification” of qualitative research. The much-cited approach to thematic analysis suggested by Braun and Clarke (2006), which requires the researcher to work carefully through six phases of analysis, could be viewed as a move in the same direction, with the researcher strictly following a particular approach. Perhaps tellingly, it was suggested by a pair of psychologists, whose discipline is, like health research, predominantly quantitative in nature.
Much the same argument might be made, of course, with reference to the different versions of grounded theory (Charmaz, 2014; Corbin & Strauss, 2015; Glaser & Strauss, 1967/2017), which each specify a series of required stages and actions. It seems hardly surprising, then, that saturation has come to be interpreted in a simplistic and quantitative fashion.
While the availability of detailed, quantified, or semi-quantified guidance to the conduct of qualitative research may be found very helpful and supportive by many social researchers, whether novice or more experienced, it arguably then ceases to be qualitative research. A computer could be programmed to do the research and, very possibly, already has been in some contexts. Qualitative research surely needs to remain more flexible and open-ended.
Discussion and Conclusion
I have argued in this article that the concept of saturation in social research has become overworked and misunderstood. An idea that had a specific, if perhaps not so easily understood, meaning within the early formulations of grounded theory has been picked up within qualitative research generally and used as a justification for stopping data collection and/or analysis at a particular point. And that particular point has become a movable feast, with saturation being increasingly claimed after the completion of a relatively small but variable number of interviews or focus groups.
Much of this development has been driven by the demands of health research, where the need to be able to carefully specify sample sizes and come up with evidence-based results that may be safely generalized to larger populations is critical. But the needs, and the nature of the interviews and focus groups that are widely used to collect the data, are different and varied in other forms of social research.
In such forms of research, other criteria for ceasing data collection and/or analysis—such as “good enough,” “sufficiently well developed,” “nothing new happening,” or even “a convenient stopping point”—may be employed instead of the more nebulous and impossible-to-verify saturation. We might also choose to talk of theoretical sufficiency or conceptual depth, and use other criteria for judging qualitative research, such as credibility, confirmability, transferability, and trustworthiness (Denzin & Lincoln, 2005).
Without in any way disputing the importance of health research, we must not let its concerns and preferences drive methodological developments in the social sciences as a whole. At the same time, we should resist attempts to “quantify” qualitative research (while not rejecting the practicality of mixed-methods research). The achievements and flexibility of qualitative research (and of quantitative research) are too important and valuable to be compromised in this fashion.
It is arguably, then, time to retire the notion of saturation. As an idea, it has itself become saturated with too many varied and conflicting interpretations and practices. Let us move on—at least outside health research—and employ other evaluative criteria.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
