Sample Size Determination for Pilot Studies Based on Event Thresholds: A Framework for Evaluating Feasibility of Research Processes

Abstract

Background:

Pilot studies are essential for assessing feasibility and operational processes before large-scale research. This study presents a Poisson-based framework for determining the minimum pilot sample size based on discrete event thresholds.

Methods:

The number of events represents occurrences such as errors, deviations, or missing data that indicate whether a process is functioning as intended. Setting a maximum tolerable threshold for these events provides a clear decision rule for proceeding or refining study procedures. A Poisson-based model was applied for sample size calculation.

Results:

With α ≤ 0.05, the required sample size varies with the study conditions, namely the maximum tolerable event threshold and the event rate. When other parameters are held fixed, the sample size increases with the event threshold. Sample sizes ranging from 12 to 26 are generally sufficient for studies with a maximum tolerable threshold of three to five events, assuming an event rate of 10.0%.

Conclusions:

The framework of sample size determination facilitates the early identification of potential issues and supports informed decision-making prior to conducting a full-scale study. In general, a sample size of 25 to 30 units is considered adequate for a pilot study, consistent with existing literature. However, larger sample sizes may be necessary when a greater number of thresholds are involved or when the acceptable event rate is low.

Keywords

Feasibility, pilot study, Poisson, research design, sample size

Key Messages:

Question: How can the sample size for a pilot study be calculated or estimated when the focus is on the feasibility of research processes?

Findings: Using α = 0.05, results show that sample sizes vary with study conditions, mainly the maximum tolerable threshold and the event rate. At α = 0.05, sample sizes from 7 to fewer than 30 suit thresholds of three to five events and event rates from 20.0% to 30.0%.

Meaning: The sample size for a pilot study that emphasizes the feasibility of research processes can be determined using a Poisson model.

Pilot studies play a crucial role across research disciplines by assessing the feasibility, safety, and operational aspects of a planned main study prior to large-scale implementation.^1,2 Appropriate sample size determination in pilot studies ensures meaningful information can be obtained while avoiding unnecessary use of time, resources, and costs. Traditionally, pilot sample sizes are determined using rules of thumb or based on the precision of parameter estimates, such as confidence intervals for means or proportions.^3–5 Pilot studies are also frequently conducted to evaluate the suitability of research instruments or questionnaires before their use in definitive studies.^2,6

While these approaches provide useful guidance, they often do not explicitly address the feasibility of the research process, particularly from an operational perspective. In this context, research process feasibility can be defined as the extent to which selected study procedures achieve their intended objectives with minimal deviations while meeting acceptable expectations.^7,8 Because research feasibility is a multidimensional construct, a practical, observable proxy is the frequency of process deviations or adverse events during the pilot phase.

Rationale for the Study

Operational deviations, errors, or adverse incidents are naturally represented as discrete count outcomes, which commonly arise in feasibility and implementation research. However, methodological guidance for determining pilot sample sizes for count outcomes remains limited. To address this gap, this study proposes a Poisson framework for pilot sample size determination using predefined event thresholds. The approach allows researchers to calculate the number of participants required so that the probability of exceeding a prespecified event limit (e.g., more than three deviations) remains below a chosen risk level. In addition, sample size calculation or estimation is now a requirement for ethical approval. Although the emphasis is usually on the main study, the sample size for a pilot study is also important, because findings from the pilot justify conducting the main study. The sample size calculation and guidelines presented in this study can therefore serve as a reference for researchers conducting pilot studies.

Aims and Objectives

The primary objective of this study is to develop and illustrate a statistical approach for determining pilot study sample size based on event thresholds using the Poisson model. This framework enhances transparency and reproducibility in pilot study planning by enabling investigators to link acceptable event limits directly to sample size requirements, thereby improving feasibility assessment and supporting evidence-based progression decisions.

Methods

Approach

This study presents a statistical framework for determining pilot study sample size based on predefined event thresholds. The framework is intended for feasibility-oriented pilot studies that aim to evaluate the operational performance of study procedures by specifying an event threshold, that is, the maximum number of discrete events that can be tolerated. Under this approach, the pilot study functions as a decision-making tool, guiding whether the protocol can proceed to a full-scale study or requires refinement.

Outcome Definition

The primary outcome is the total number of discrete events (for example, procedural deviations, operational errors, or adverse incidents) observed during the pilot study. A progression rule is specified a priori as follows:

If the total number of events is ≤3, the study procedures are considered feasible, and the main study may proceed.

If the total number of events is >3, modifications to the protocol are required before further implementation.

This threshold-based criterion provides a transparent feasibility benchmark and ensures that pilot findings directly inform operational decision-making. The choice of three events as the threshold represents a practical and illustrative example rather than a fixed requirement. In practice, researchers may specify alternative thresholds based on the context of the study, the acceptable level of risk, and the operational or clinical implications of the events being monitored.

Statistical Model

Let X denote the total number of observed events during the pilot study. The event count is assumed to follow a Poisson distribution,

X \sim \mathrm{Poisson}(\mu),

where µ represents the expected number of events during the pilot study. In this framework, the expected count is expressed as

\mu = n \lambda_o,

where n is the pilot sample size, and λ_o is the maximum acceptable event rate per participant (or per study unit). The Poisson model is appropriate when events are rare, occur independently, and occur over a fixed exposure period. Hence, the pilot sample size is determined by imposing the probability constraint

P(X > c \mid \mu = n \lambda_o) \leq \alpha,

where,

c is the predefined event threshold (e.g., 3),

α is the acceptable risk level (probability of exceeding the threshold), and

λ_o represents the highest event rate considered operationally acceptable.

The smallest integer n satisfying this inequality is selected as the required pilot sample size. This ensures that, if the true event rate does not exceed the acceptable level, the probability of observing more than the tolerated number of events remains below the specified risk threshold.
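To make the constraint concrete, the exceedance probability can be written out from the Poisson probability mass function. The worked value below is an illustrative calculation added here, assuming c = 3 and µ = 1.25 (that is, n = 25 and λ_o = 0.05); it is a sketch of the arithmetic, not an additional reported result.

```latex
P(X > c \mid \mu) = 1 - \sum_{k=0}^{c} \frac{e^{-\mu}\,\mu^{k}}{k!}

% Worked example: c = 3, \mu = n \lambda_o = 25 \times 0.05 = 1.25
P(X > 3 \mid \mu = 1.25)
  = 1 - e^{-1.25}\left(1 + 1.25 + \frac{1.25^{2}}{2} + \frac{1.25^{3}}{6}\right)
  \approx 1 - 0.2865 \times 3.3568
  \approx 0.038 \leq 0.05
```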

Decision Framework

A structured progression rule is embedded within the pilot protocol:

Set the sample size for the pilot study, n.

Define the event or events counted by X, such as the number of missing records, the number of feedback responses not received within a specified time period, or the number of failed processes.

Conduct the pilot study.

Record the total number of observed events X. If X ≤ c, the study procedures are considered feasible and progression to the main study is supported.

Otherwise, if X > c, the protocol should be reviewed and modified before further implementation.

Overall, this Poisson-based decision framework links statistical assumptions directly to operational feasibility criteria, providing a transparent and reproducible basis for progression decisions in pilot studies.
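The progression rule above can be expressed in a few lines of code. The sketch below is a minimal illustration; the function name `progression_decision` and the return labels "proceed" and "revise" are hypothetical choices made for this example and are not part of the framework itself.

```python
def progression_decision(observed_events: int, threshold: int) -> str:
    """Apply the prespecified progression rule to a completed pilot.

    observed_events: total number of events X recorded in the pilot.
    threshold: maximum tolerated number of events c, fixed a priori.
    """
    if observed_events < 0 or threshold < 0:
        raise ValueError("event counts and thresholds must be non-negative")
    # X <= c: procedures are considered feasible; otherwise review the protocol.
    if observed_events <= threshold:
        return "proceed"
    return "revise"


# Illustrative use with the example threshold of three events.
print(progression_decision(2, 3))  # X <= c
print(progression_decision(4, 3))  # X > c
```

Keeping the rule as a pure function of X and c reflects the point that the threshold is fixed before the pilot is conducted and is not adjusted after the data are seen.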

Results

Table 1 presents the minimum pilot sample sizes required to ensure that the probability of observing more than three events remains below α = 0.05, for various maximum acceptable event rates per study unit (λ_o). The required sample size decreases as the acceptable event rate increases. For a low event rate of 0.05, at least 25 study units are required to maintain the risk of exceeding three events below 5.0%. When the event rate rises to 0.10, the minimum sample size decreases to 12 study units, and it decreases further to 4–8 study units for event rates between 0.15 and 0.30. Table 2 presents corresponding results for higher event thresholds (5, 10, 20, and 30 events). Across these thresholds, the required sample size ranges from 9 to 480, depending on the event rate and the maximum event threshold specified by the researchers.

Table 1.

Minimum Pilot Sample Size Based on Event Threshold (c = 3) and α ≤ 0.05.

Maximum Acceptable Event Rate Per Study Unit (λ_o)	Minimum Pilot Sample Size (n)	Expected Events (µ = n × λ_o)	Interpretation
0.05	25	1.25	Assuming a maximum acceptable event rate of 5% per study unit (λ_o = 0.05), recruiting 25 study units ensures the probability of observing more than three events remains below α = 0.05.
0.10	12	1.20	Assuming a maximum acceptable event rate of 10% per study unit (λ_o = 0.10), recruiting 12 study units ensures the probability of observing more than three events remains below α = 0.05.
0.15	8	1.20	Assuming a maximum acceptable event rate of 15% per study unit (λ_o = 0.15), recruiting 8 study units ensures the probability of observing more than three events remains below α = 0.05.
0.20	6	1.20	Assuming a maximum acceptable event rate of 20% per study unit (λ_o = 0.20), recruiting 6 study units ensures the probability of observing more than three events remains below α = 0.05.
0.25	5	1.25	Assuming a maximum acceptable event rate of 25% per study unit (λ_o = 0.25), recruiting 5 study units ensures the probability of observing more than three events remains below α = 0.05.
0.30	4	1.20	Assuming a maximum acceptable event rate of 30% per study unit (λ_o = 0.30), recruiting 4 study units ensures the probability of observing more than three events remains below α = 0.05.

Note: The sample size is set under the condition that the expected number of events, µ, does not exceed the threshold c.
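As a check on Table 1, the exceedance probability P(X > 3) can be evaluated for each tabulated pair of event rate and sample size. The sketch below uses only the Python standard library; the helper names `poisson_tail` and `check_table` are hypothetical and introduced purely for illustration, and the (λ_o, n) pairs are copied from Table 1.

```python
import math


def poisson_tail(c: int, mu: float) -> float:
    """Return P(X > c) for X ~ Poisson(mu)."""
    cdf = sum(math.exp(-mu) * mu**k / math.factorial(k) for k in range(c + 1))
    return 1.0 - cdf


# (lambda_o, n) pairs copied from Table 1; c and alpha as stated in its title.
TABLE_1 = [(0.05, 25), (0.10, 12), (0.15, 8), (0.20, 6), (0.25, 5), (0.30, 4)]
C, ALPHA = 3, 0.05


def check_table(rows, c, alpha):
    """For each row, compute mu = n * lambda_o and test P(X > c) <= alpha."""
    results = []
    for lam, n in rows:
        mu = n * lam
        tail = poisson_tail(c, mu)
        results.append((lam, n, mu, tail, tail <= alpha))
    return results


for lam, n, mu, tail, ok in check_table(TABLE_1, C, ALPHA):
    print(f"lambda_o={lam:.2f}  n={n:>2}  mu={mu:.2f}  "
          f"P(X>{C})={tail:.4f}  within alpha: {ok}")
```

Under these assumptions every row yields an exceedance probability below 0.05, which agrees with the interpretation column of the table.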

Discussion

The aim of a pilot study is not to obtain a statistically significant result that addresses the study objectives. In a main study, researchers ideally aim to achieve significant results (i.e., p < .05) with a sizable effect size.^9 In most pilot studies, by contrast, a key concern is the precision of the estimates.^4,5,10,11 In this study, the sample size calculation was instead based on the feasibility of the research process rather than on estimates of the outcome variables.

These results highlight several important insights for pilot study design. Threshold-based decision rules are statistically actionable because, by modeling the expected number of events, researchers can calculate the minimum sample size needed to meet operational feasibility criteria rather than relying on arbitrary rules of thumb. This transparent and reproducible approach enables early-phase researchers to plan pilot studies more effectively, anticipate potential challenges, and make informed decisions about whether to proceed to larger-scale trials.

The proposed Poisson framework addresses a methodological gap in pilot study design, where traditional approaches often focus on parameter estimation without formally accounting for discrete event thresholds. By explicitly linking event thresholds to sample size, researchers can quantitatively control the probability of exceeding predefined operational limits, ensuring pilot studies remain feasible and safe.^12,13 This approach strengthens decision-oriented study design, facilitates early identification of procedural issues, optimizes resource allocation, and enhances the reliability of feasibility assessments before scaling to full studies. Its applicability spans healthcare, social sciences, agriculture, and any research domain where efficiency and operational feasibility are critical.

A key contribution of this framework is the explicit connection between the acceptable event threshold and the probability of exceeding that threshold under an assumed true event rate. For example, assuming a deviation probability of 0.10, a sample size of 12 ensures that the probability of observing more than three deviations remains below α = 0.05. This provides a clear statistical justification for a decision rule: proceed to the main study if deviations are three or fewer, and refine procedures if this threshold is exceeded. Importantly, the threshold represents the maximum tolerable level, not the expected number of deviations. In practice, the framework allows a structured interpretation of pilot outcomes.

Consider a pilot survey study assessing patients’ mental health in a clinic. Missing or incomplete surveys are treated as discrete events. Using the Poisson-based framework, the research team can set a maximum threshold of three missing surveys with α equal to 0.05 and calculate the minimum number of surveys needed to ensure the probability of exceeding this threshold remains below 5.0%. If three or fewer surveys are missing, the survey administration process is considered feasible and can proceed to a larger study. If more than three surveys are missing, modifications to the survey design, instructions, or administration procedures are needed before scaling.

This example illustrates how the framework provides a transparent, statistically justified method for guiding decision-making in early-phase survey research. Accordingly, the sample size statement for the pilot study based on this example is as follows:

This pilot study requires a minimum of 25 respondents to tolerate a maximum of three missing surveys. The calculation was based on a Poisson model framework with a maximum acceptable event rate of 0.05 and alpha fixed at 0.05, introduced in a previous study.

In another example, researchers aim to determine the sample size for a pilot study of a diagnostic test for anxiety, a condition that is not easy to diagnose.^14,15 Here, a critical error during a test run, such as a false negative due to procedural mistakes, is treated as a discrete event. Using the Poisson-based framework, the team can set a maximum threshold of five errors with α = 0.05 and a maximum acceptable event rate of 0.10, and determine that at least 26 test runs are needed to ensure the probability of exceeding this threshold remains below 5.0%. Observing five or fewer errors indicates preliminary feasibility, while exceeding the threshold signals the need for procedural refinement before scaling to a larger clinical study.

This demonstrates how the framework can provide a statistically justified decision-making method in early-phase diagnostic research. Accordingly, the sample size statement can be written as follows:

This pilot study requires a minimum of 26 test runs to tolerate a maximum of five false negatives due to procedural mistakes. The calculation was based on a Poisson model framework with a maximum acceptable event rate of 0.10 and alpha fixed at 0.05, introduced in a previous study.
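The two worked examples can be checked numerically by computing the expected number of events and the probability of exceeding the threshold for each design. The sketch below is illustrative only: the class `PilotScenario` and its field names are hypothetical, and the inputs (n, λ_o, c) are taken from the two sample size statements above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PilotScenario:
    label: str
    n: int              # pilot sample size
    event_rate: float   # maximum acceptable event rate per unit (lambda_o)
    threshold: int      # maximum tolerated number of events (c)

    @property
    def expected_events(self) -> float:
        """mu = n * lambda_o."""
        return self.n * self.event_rate

    def exceedance_probability(self) -> float:
        """P(X > c) for X ~ Poisson(mu), via the pmf recurrence."""
        mu = self.expected_events
        term, cdf = math.exp(-mu), 0.0   # term starts at P(X = 0)
        for k in range(self.threshold + 1):
            cdf += term
            term *= mu / (k + 1)         # P(X = k + 1) from P(X = k)
        return 1.0 - cdf


scenarios = [
    PilotScenario("survey pilot", n=25, event_rate=0.05, threshold=3),
    PilotScenario("diagnostic pilot", n=26, event_rate=0.10, threshold=5),
]

for s in scenarios:
    print(f"{s.label}: mu={s.expected_events:.2f}, "
          f"P(X>{s.threshold})={s.exceedance_probability():.4f}")
```

In both designs the computed exceedance probability falls below α = 0.05, consistent with the sample size statements.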

Based on Table 2, the required sample size ranges from 9 to 480, depending on the event rate and the event threshold specified by the researchers (for example, 5 or 30 events). In many pilot study contexts, a sample size of fewer than 30 units is commonly considered adequate.^2,3 Under such circumstances, setting the threshold at three events is often reasonable, particularly when monitoring errors, deviations, or procedural inconsistencies. In larger studies, where the total number of study units may run into the thousands, the number of errors or deviations that can be tolerated naturally increases. In these situations, a higher event threshold is more appropriate to reflect realistic operational conditions. Ultimately, the choice of threshold and maximum event rate should align with the expected efficiency and effectiveness of the research process. Regardless of the threshold selected, researchers remain responsible for implementing appropriate measures to minimize errors, deviations, and inconsistencies throughout the study. For most pilot studies, a threshold of 3 or 5 events with an event rate of 10.0% or lower is therefore a sensible choice.

Table 2.

Minimum Pilot Sample Size Based on Various Event Thresholds with α = 0.05.

Maximum Event Threshold (c)	Maximum Acceptable Event Rate Per Study Unit (λ_o)	Minimum Pilot Sample Size (n)
5	0.05	52
	0.10	26
	0.15	18
	0.20	13
	0.25	11
	0.30	9
10	0.05	146
	0.10	73
	0.15	49
	0.20	37
	0.25	30
	0.30	25
20	0.05	312
	0.10	156
	0.15	104
	0.20	78
	0.25	63
	0.30	52
30	0.05	480
	0.10	240
	0.15	160
	0.20	120
	0.25	96
	0.30	80

In general, observing events at or below the threshold indicates preliminary feasibility and supports progression to a definitive study. Exceeding the threshold provides an evidence-based signal that modifications to procedures, training, or intervention delivery may be necessary. Thus, the decision rule functions both as a statistical criterion and as a quality-improvement tool. Overall, this framework can enhance methodological consistency across feasibility studies and improve the interpretability of pilot findings. Future research could evaluate its robustness and usability across diverse operational contexts.

Conclusions

The event-threshold-based sample size framework provides a statistically coherent and practically intuitive approach for planning pilot studies. This concept is particularly valuable when the objective is to evaluate the robustness and feasibility of a research process. In well-planned studies, setting the event threshold at three is often reasonable and typically yields a pilot sample size of fewer than 30 units. Such a sample size is widely preferred by researchers when conducting pilot studies and is consistent with recommendations in the literature. However, a larger sample size is required when a higher event threshold or a lower acceptable event rate is specified.

Footnotes

Acknowledgements

The author would like to thank the Director General of Health Malaysia for permission to publish this article.

Reporting Guideline

Not applicable.

Declaration of Conflicting Interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article. Mohamad Adam Bujang (the author) is one of the Statistical Editors for the Indian Journal of Psychological Medicine and was not involved in the review and decision process of this article.

Declaration Regarding the Use of Generative AI

The author wrote the article. ChatGPT was used to identify grammatical errors and suggest amendments, some of which were accepted after careful consideration. The author remains fully responsible for the entire content of this article.

Data Sharing Statements

Not available.

Ethical Approval

Not available.

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

Informed Consent

Not available.

Prospective Registration

Not available.

Citation Diversity Statement

We are committed to equitable citation practices and have made conscious efforts to include work by authors from diverse genders, geographic regions (including the Global South), career stages, and historically marginalized groups. We aim to support a more inclusive and representative scholarly record.

References

1. Leon, Davis and Kraemer. The role and interpretation of pilot studies in clinical research. J Psychiatr Res, 2011; 45(5): 626–629.

2. Bujang, Omar, Foo, et al. Sample size determination for conducting a pilot study to assess reliability of a questionnaire. Restor Dent Endod, 2024; 49(1).

3. Hertzog. Considerations in determining sample size for pilot studies. Res Nurs Health, 2008; 31(2): 180–191.

4. O’Neill. Sample size determination with a pilot study. PLoS One, 2022; 17(2): e0262804.

5. Kunselman. A brief overview of pilot studies and their sample size justification. Fertil Steril, 2024; 121(6): 899–901.

6. Johanson and Brooks. Initial scale development: Sample size for pilot studies. Educ Psychol Meas, 2010; 70: 394–400.

7. Donald. A brief summary of pilot and feasibility studies: Exploring terminology, aims, and methods. Eur J Integr Med, 2018; 24: 65–70.

8. Teresi, Yu, Stewart, et al. Guidelines for designing and evaluating feasibility pilot studies. Med Care, 2022; 60(1): 95–103.

9. Bujang. A power primer revisited. Indian J Psychol Med, 2026; 02537176261421310.

10. Viechtbauer, Smits, Kotz, et al. A simple formula for the calculation of sample size in pilot studies. J Clin Epidemiol, 2015; 68(11): 1375–1379.

11. Whitehead, Julious, Cooper, et al. Estimating the sample size for a pilot randomized trial to minimize the overall trial sample size for the external pilot and main trial for a continuous outcome variable. Stat Methods Med Res, 2016; 25(3): 1057–1073.

12. Eldridge, Lancaster, Campbell, et al. Defining feasibility and pilot studies in preparation for randomised controlled trials: Development of a conceptual framework. PLoS One, 2016; 11(3): e0150205.

13. Bond, Lancaster, Campbell, et al. Pilot and feasibility studies: Extending the conceptual framework. Pilot Feasibility Stud, 2023; 9(1): 24.

14. Kaufman and Charney. Comorbidity of mood and anxiety disorders. Depress Anxiety, 2000; 12(Suppl 1): 69–76.

15. Weissman, Fyer, Haghighi, et al. Potential panic disorder syndrome: Clinical and genetic linkage evidence. Am J Med Genet, 2000; 96(1): 24–35.