(I Can’t Get No) Saturation: A simulation and guidelines for sample sizes in qualitative research

PLOS ONE, Dec 2019

I explore the sample size in qualitative research that is required to reach theoretical saturation. I conceptualize a population as consisting of sub-populations that contain different types of information sources that hold a number of codes. Theoretical saturation is reached after all the codes in the population have been observed once in the sample. I delineate three different scenarios to sample information sources: “random chance,” which is based on probability sampling, “minimal information,” which yields at least one new code per sampling step, and “maximum information,” which yields the largest number of new codes per sampling step. Next, I use simulations to assess the minimum sample size for each scenario for systematically varying hypothetical populations. I show that theoretical saturation is more dependent on the mean probability of observing codes than on the number of codes in a population. Moreover, the minimal and maximal information scenarios are significantly more efficient than random chance, but yield fewer repetitions per code to validate the findings. I formulate guidelines for purposive sampling and recommend that researchers follow a minimum information scenario.
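The three sampling scenarios described above can be illustrated with a minimal Python sketch. This is not the author's simulation code. The population model (each source holds each code independently with probability `p`), the function names `make_population`, `steps_to_saturation`, and `mean_steps`, and all parameter values are assumptions made for illustration only. Saturation is taken here to mean that every code present in the generated population has been observed at least once.

```python
import random


def make_population(n_sources, n_codes, p, rng):
    """Hypothetical population: each source holds each code with probability p."""
    return [
        frozenset(c for c in range(n_codes) if rng.random() < p)
        for _ in range(n_sources)
    ]


def steps_to_saturation(population, scenario, rng):
    """Count the sources sampled (without replacement) until all codes are seen."""
    target = set().union(*population)  # codes that actually exist in the population
    seen = set()
    remaining = list(population)
    steps = 0
    while seen != target:
        if scenario == "random":
            # Random chance: any remaining source, informative or not.
            pick = rng.choice(remaining)
        elif scenario == "minimal":
            # Minimal information: any source that adds at least one new code.
            pick = rng.choice([s for s in remaining if s - seen])
        elif scenario == "maximum":
            # Maximum information: the source that adds the most new codes.
            pick = max(remaining, key=lambda s: len(s - seen))
        else:
            raise ValueError(f"unknown scenario: {scenario}")
        remaining.remove(pick)
        seen |= pick
        steps += 1
    return steps


def mean_steps(n_sources, n_codes, p, scenario, runs=200, seed=0):
    """Average sample size at saturation over several simulated populations."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        population = make_population(n_sources, n_codes, p, rng)
        total += steps_to_saturation(population, scenario, rng)
    return total / runs


if __name__ == "__main__":
    for scenario in ("random", "minimal", "maximum"):
        print(scenario, mean_steps(n_sources=500, n_codes=20, p=0.1, scenario=scenario))
```

Varying `p` and `n_codes` in `mean_steps` gives a rough feel for how the sample size at saturation responds to each under this simplified model. The greedy choice in the "maximum" branch is one possible reading of "the largest number of new codes per sampling step" and is not guaranteed to give the smallest possible sample.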



Frank J. van Rijnsoever (1, 2)

1 Innovation Studies, Copernicus Institute of Sustainable Development, Utrecht University, Utrecht, The Netherlands
2 INGENIO (CSIC-UPV), Universitat Politècnica de València, Valencia, Spain

Editor: Gemma Elizabeth Derrick, Lancaster University, United Kingdom

Data Availability Statement: All relevant data are within the paper and its Supporting Information files.

Funding: The author(s) received no specific funding for this work.

Competing interests: The author has declared that no competing interests exist.
Introduction

Qualitative research is becoming an increasingly prominent way to conduct scientific research in business, management, and organization studies [1]. In the first decade of the twenty-first century, more qualitative research has been published in top American management journals than in the preceding 20 years [2]. Qualitative research is seen as crucial in the process of building new theories [2–4] and it allows researchers to describe how change processes unfold over time [5,6]. Moreover, it gives close-up and in-depth insights into various organizational phenomena [7,8], perspectives and motivations for actions [1,8].

However, despite the explicit attention of journal editors to what qualitative research is and how it could or should be conducted [8–10], it is not always transparent how particular research was actually conducted [2,11]. A typical topic of debate is what the size of a sample should be for inductive qualitative research to be credible and dependable [9,12]. (Note that in this paper, I refer to qualitative research in an inductive context. I recognize that there are more deductive-oriented forms of qualitative research.)

A general statement from inductive qualitative research about sample size is that the data collection and analysis should continue until the point at which no new codes or concepts emerge [13,14]. This does not only mean that no new stories emerge, but also that no new codes that signify new properties of uncovered patterns emerge [15]. At this point, “theoretical saturation” is reached; all the relevant information that is needed to gain complete insights into a topic has been found [1,13]. (Note that to prevent confusion, I use the term ‘code’ in this article to refer to information uncovered in qualitative research. I reserve the term ‘concept’ to refer to the concepts in the theoretical framework.)

Most qualitative researchers who aim for theoretical saturation do not rely on probability sampling. Rather, the sampling procedure is purposive [14,16]. It aims “to select information-rich cases whose study will illuminate the questions under study” [12]. The researcher decides which cases to include in the sample based on prior information like theory or insights gained during the data collection. However, the minimum size of a purposive sample needed to reach theoretical saturation is difficult to estimate [9,17–22].

There are two reasons why the minimum size of a purposive sample deserves more attention. First, theoretical saturation seems to call for a “more is better” sampling approach, as this minimizes the chances of codes being missed. However, the coding process in qualitative research is laborious and time consuming. As such, especially researchers with scarce resources do not want to oversample too much. Some scholars give tentative indications of sample sizes that often lie between 20 and 30 and are usually below 50 [23,24], but the theoretical mechanism on which these estimates are based is unknown. Second, most research argues that determining whether theoretic (...truncated)


This is a preview of a remote PDF: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0181689&type=printable

Frank J. van Rijnsoever. (I Can’t Get No) Saturation: A simulation and guidelines for sample sizes in qualitative research, PLOS ONE, 2017, Volume 12, Issue 7, DOI: 10.1371/journal.pone.0181689