Abstract
Sample size calculations and power analyses are essential components of experimental design in modern biomedical research. Designs that account for sample correlation, multiple testing, and other sources of variability inherent to specific studies are routinely employed for identifying differential expression. Despite recent advances in methodologies and software tools for power analysis, there remains a lack of statistical packages capable of accommodating these complex designs in differential expression studies. To fill this gap, we provide the R package depower, which implements the simulation-based framework presented in our recent publications. This unified framework covers both independent and dependent group comparisons and controls false positive rates by simulating the empirical null distribution of the test statistics.
Introduction
Using biological samples from different conditions to uncover changes in gene expression profiles has become the primary objective of many transcriptomic studies. Sample size calculations and power analyses are essential components of experimental design for these differential expression analyses. To address this need, many methods have been proposed specifically for commonplace RNA-seq experiments with count data. Most of these methods are based on the Poisson or negative binomial distribution. However, these approaches often rely on asymptotic approximations, which can lead to inflated false positive rates, as demonstrated in prior studies.1,2 To address this issue, Yu et al. 3 introduced a simulation-based framework that leverages the empirical null distribution to establish cutoff values for differential expression detection, thereby circumventing the reliance on asymptotic approximations. The framework was originally intended for designs with uncorrelated samples; Yu et al. 4 later extended it to accommodate designs featuring diverse correlation structures. Other simulation-based approaches for power analysis of bulk and single-cell RNA-seq experiments, such as PROPER 5 and POWSC, 6 are available in the literature. However, these approaches are also affected by inflated false positive rates due to their reliance on asymptotic tests. In addition, they need to simulate genome-wide expression data, whereas our proposed approach simulates expression data at the gene level. This gene-level simulation framework requires substantially fewer parameters to be specified a priori, which makes the power analysis more direct and more robust.
In this note, we present the R package depower, which is an implementation of the power analysis procedures developed by Yu et al.3,4 Although originally developed for RNA-seq data, the methodologies and the package are equally applicable to other high-throughput technologies with count or lognormal data, such as ChIP-Seq and proteomics. depower provides a unified framework for power analyses with functions for simulating data, hypothesis testing, calculating power, and visualizing power for a variety of experimental designs.
Methods
depower supports 2 common outcome models: negative binomial (NB) for counts and lognormal for continuous intensities. It accommodates both independent and paired group comparisons. Throughout, the target fold ratio is a ratio of group means on the original scale (geometric means).
Negative Binomial Outcomes
Independent Groups
Gene count approximations
depower provides a likelihood ratio test and a Wald test for the ratio of group means. Both tests allow unequal dispersions across groups.
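To make the idea of a likelihood ratio test with group-specific dispersions concrete, the following base R sketch fits the null model (one shared mean, two dispersions) and the alternative model (two means, two dispersions) by numerical optimization. It is an illustration only and not the package's implementation: the function names `nb_loglik` and `lrt_nb_sketch` are hypothetical, and the sketch assumes the `size`/`mu` parameterization of `stats::dnbinom()`, which may differ from the package's convention.

```r
# Log-likelihood of one negative binomial sample.
# Assumption: 'size' is the dispersion-type parameter used by stats::dnbinom().
nb_loglik <- function(x, mu, size) {
  sum(dnbinom(x, size = size, mu = mu, log = TRUE))
}

# LRT statistic for H0: mean2 / mean1 = 1, with a separate dispersion per group.
# All parameters are optimized on the log scale to keep them positive.
lrt_nb_sketch <- function(x1, x2) {
  # Null model: one shared mean, two dispersions.
  nll_null <- function(p) {
    -(nb_loglik(x1, exp(p[1]), exp(p[2])) + nb_loglik(x2, exp(p[1]), exp(p[3])))
  }
  # Alternative model: each group has its own mean and dispersion.
  nll_alt <- function(p) {
    -(nb_loglik(x1, exp(p[1]), exp(p[3])) + nb_loglik(x2, exp(p[2]), exp(p[4])))
  }
  # Starting values from the sample means (floored to avoid log(0)).
  m1 <- max(mean(x1), 0.5)
  m2 <- max(mean(x2), 0.5)
  fit_null <- optim(c(log((m1 + m2) / 2), 0, 0), nll_null)
  fit_alt <- optim(c(log(m1), log(m2), 0, 0), nll_alt)
  # Twice the gain in log-likelihood; truncated at 0 to absorb optimizer noise.
  max(0, 2 * (fit_null$value - fit_alt$value))
}

set.seed(1)
x1 <- rnbinom(50, size = 1, mu = 10)
x2 <- rnbinom(50, size = 1, mu = 20)
lrt_stat <- lrt_nb_sketch(x1, x2)
lrt_stat
# Asymptotic chi-square p-value, shown for reference only.
pchisq(lrt_stat, df = 1, lower.tail = FALSE)
```

The statistic can be compared either with a chi-square quantile or with a quantile of statistics simulated under the null hypothesis.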
Paired Groups
For paired designs, the bivariate negative binomial (BNB) distribution is derived by compounding two conditionally independent Poisson variables
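As a rough illustration of how compounding Poisson variables can produce correlated count pairs, the sketch below uses one common construction, assumed here rather than taken from the package: each pair shares a gamma-distributed effect with mean 1, and the two counts are independent Poisson draws given that effect. The function name `sim_bnb_sketch` and its arguments are hypothetical.

```r
# Simulate n correlated count pairs from a gamma-Poisson mixture.
# Assumption: a shared gamma effect with mean 1 and variance 1 / dispersion.
sim_bnb_sketch <- function(n, mean1, ratio, dispersion) {
  g <- rgamma(n, shape = dispersion, rate = dispersion)
  # Given g, the two counts in a pair are independent Poisson draws.
  data.frame(
    value1 = rpois(n, lambda = mean1 * g),
    value2 = rpois(n, lambda = ratio * mean1 * g)
  )
}

set.seed(1)
count_pairs <- sim_bnb_sketch(n = 5000, mean1 = 10, ratio = 2, dispersion = 1)
# Ratio of sample means, which should be close to the target ratio of 2.
mean(count_pairs$value2) / mean(count_pairs$value1)
# Within-pair correlation, which is positive because of the shared effect.
cor(count_pairs$value1, count_pairs$value2)
```

Under this construction each margin is negative binomial, and the within-pair correlation comes entirely from the shared effect.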
Lognormal Outcomes
Independent Groups
Expression
Unequal CVs under different conditions are allowed for independent groups. Welch’s t-test is provided for hypothesis testing of the log-transformed data.
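The base R sketch below shows the same idea outside the package: lognormal data are simulated for two groups with different CVs, and Welch's t-test is applied to the log-transformed values. The geometric means (100 and 200), CVs (0.4 and 0.6), and sample size are illustrative assumptions; the conversion `sdlog = sqrt(log(1 + CV^2))` is the standard lognormal relationship.

```r
set.seed(1)
n <- 20
cv1 <- 0.4
cv2 <- 0.6
# Convert each CV to the standard deviation on the log scale.
sdlog1 <- sqrt(log(1 + cv1^2))
sdlog2 <- sqrt(log(1 + cv2^2))
# meanlog is the log of the geometric mean, so the true fold ratio is 2.
x1 <- rlnorm(n, meanlog = log(100), sdlog = sdlog1)
x2 <- rlnorm(n, meanlog = log(200), sdlog = sdlog2)

# Welch's t-test on the log scale (unequal variances allowed).
fit <- t.test(log(x2), log(x1), var.equal = FALSE)
# Back-transform to the estimated ratio of geometric means and its interval.
ratio_hat <- unname(exp(fit$estimate[1] - fit$estimate[2]))
ratio_hat
exp(fit$conf.int)
fit$p.value
```

Because the test is run on logged data, the difference in means back-transforms to a ratio of geometric means.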
Paired Groups
Expression data
Type I Error Control
Inferences based on the likelihood ratio test or Wald test usually depend on asymptotic theory, which approximates the null distribution of the test statistic as chi-square for large sample sizes. However, this testing strategy may result in much smaller critical values than expected.1,2 To address this issue, we implemented the simulation-based approach of Rettiganti and Nagaraja 2 for proper false positive rate control in Yu et al.3,4 Specifically, the empirical null distribution of the test statistic (likelihood ratio or Wald) is obtained from experimental data simulated under the null hypothesis over a large number of iterations (eg, 100 000). Then the 100(1 − α)th percentile of this empirical null distribution is used as the critical value for the test.
Simulation Procedure for Power Analysis
1. Specify all input parameters: sample size per condition
2. Simulate count data T times from NB
3. Fit the NB model or BNB model and obtain test statistics (likelihood ratio test or Wald test) under the null hypothesis for each simulation run.
4. Calculate the 100(1 − α)th percentile of the null test statistics and use it as the critical value.
5. Fit the NB model or BNB model and obtain test statistics under the alternative hypothesis for each simulation run.
6. Calculate power (percent of rejections under the alternative hypothesis) for the input parameters listed in Step 1.
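The procedure above can be sketched in base R for two independent NB groups. This is a simplified illustration under stated assumptions, not the package's code: a fast moment-based Wald-type statistic stands in for the likelihood-based tests, both groups are simulated at `mean1` under the null hypothesis, the `size`/`mu` parameterization of `stats::rnbinom()` is assumed, and the names `row_var`, `wald_stat_sketch`, and `power_sketch` are hypothetical.

```r
# Row-wise sample variance for a matrix with one simulated dataset per row.
row_var <- function(x) rowSums((x - rowMeans(x))^2) / (ncol(x) - 1)

# Moment-based Wald-type statistic for the log ratio of means, one per row.
# A small offset and a variance floor guard against zeros.
wald_stat_sketch <- function(x1, x2) {
  m1 <- rowMeans(x1) + 0.5
  m2 <- rowMeans(x2) + 0.5
  se2 <- row_var(x1) / (ncol(x1) * m1^2) + row_var(x2) / (ncol(x2) * m2^2)
  (log(m2) - log(m1))^2 / pmax(se2, 1e-8)
}

power_sketch <- function(n, mean1, ratio, dispersion, alpha,
                         nsims_null = 20000, nsims_alt = 5000) {
  # Draw 'nsims' datasets of size n as rows of a matrix.
  draw <- function(nsims, mu) {
    matrix(rnbinom(nsims * n, size = dispersion, mu = mu), nrow = nsims)
  }
  # Null statistics: both groups share the same mean.
  t_null <- wald_stat_sketch(draw(nsims_null, mean1), draw(nsims_null, mean1))
  # Empirical critical value: the 100(1 - alpha)th percentile of the null statistics.
  crit <- quantile(t_null, probs = 1 - alpha, names = FALSE)
  # Alternative statistics: group 2 mean is ratio * mean1.
  t_alt <- wald_stat_sketch(draw(nsims_alt, mean1), draw(nsims_alt, mean1 * ratio))
  # Power is the proportion of alternative statistics above the critical value.
  c(critical_value = crit, power = mean(t_alt > crit))
}

set.seed(1)
power_sketch(n = 20, mean1 = 10, ratio = 2, dispersion = 1, alpha = 0.00625)
# Asymptotic chi-square critical value at the same alpha, for comparison.
qchisq(1 - 0.00625, df = 1)
```

Comparing the empirical critical value with the chi-square quantile shows how far the asymptotic cutoff is from the simulated one for a given sample size.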
Example
Consider an RNA-Seq study in which we are interested in the difference in expression between 2 independent groups. Thousands of genes are typically examined in RNA-Seq studies; however, power is often calculated using the average parameter values for a single representative gene of interest. Yu et al.3,4 demonstrated that public resources such as TCGA and GEO are well suited for use as pilot datasets when planning new gene expression studies under similar experimental conditions (eg, tumor stages or normal tissues). Parameter estimates derived from these pilot datasets, such as gene-level mean expression and dispersion, can be incorporated into power calculations across a range of candidate sample sizes. In this example, using parameter estimates derived from the TCGA breast cancer dataset, we simulate data for 2 independent groups of negative binomial outcomes with a control group mean of 10, a dispersion parameter of 1 for both groups, and a minimum relevant fold change of 2. Suppose the total number of genes is 10 000, the proportion of truly non-differentially expressed genes is 0.8, and the number of acceptable type I errors is set to 50. Using the per-family error rate method, the type I error rate will be set to 50/(10 000 × 0.8) = 0.00625.
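The per-family error rate adjustment in this example divides the acceptable number of type I errors by the expected number of truly non-differentially expressed genes. A small helper makes it easy to see how the per-gene level changes with these inputs; the function name `pfer_alpha` and its argument names are hypothetical.

```r
# Per-gene type I error rate under the per-family error rate method:
# acceptable false positives divided by the number of true null genes.
pfer_alpha <- function(n_genes, prop_null, n_false_positives) {
  n_false_positives / (n_genes * prop_null)
}

# Values from the example: 10 000 genes, 80% true nulls, 50 acceptable errors.
alpha_example <- pfer_alpha(n_genes = 10000, prop_null = 0.8, n_false_positives = 50)
alpha_example

# A stricter budget of 10 acceptable errors lowers the per-gene level.
pfer_alpha(n_genes = 10000, prop_null = 0.8, n_false_positives = 10)
```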

Example: power versus sample sizes in a 2-group experiment with user-specified parameters: mean expression of 10, dispersion of 1, and fold ratio of 2 between groups.
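# Load the package (assumes depower is already installed)
library(depower)

# Fix the random seed so the simulated power estimates are reproducible
set.seed(20251004)

# Simulate 10 000 datasets per sample size under a fold ratio of 2
sim_nb(
  n1 = c(3, 10, 20, 40, 60, 100),
  mean1 = 10,
  ratio = 2,
  dispersion1 = 1,
  dispersion2 = 1,
  nsims = 10000
) |>
  # Likelihood ratio test using a simulated (empirical) null distribution
  power(
    "Simulated NB LRT" = lrt_nb(
      distribution = simulated(nsims = 20000)
    ),
    # Per-family error rate: 50 acceptable errors among 10 000 * 0.8 true nulls
    alpha = 50 / (10000 * 0.8)
  ) |>
  # Plot power against sample size with a reference line at 80% power
  plot(hline = 0.8)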
Conclusion
The depower package addresses a critical gap in the design and analysis of differential expression studies by providing a flexible simulation-based framework for power analysis. Unlike traditional approaches that rely on asymptotic approximations, depower leverages empirical null distributions to ensure accurate control of type I error rates, even with correlated samples and small sample sizes. This capability is particularly important for high-throughput experiments, where thousands of hypotheses are tested simultaneously and conventional methods may lead to inflated false positive rates.
Our proposed simulation-based framework is primarily designed for gene-level power analysis, but it can be easily extended to genome-wide power analysis by specifying a sequence of mean expression and dispersion levels representing different genes. For the proposed framework, the computation time is variable and depends mainly on 3 factors: the sample size of the simulated studies, the number of datasets simulated under the alternative hypotheses, and optionally, the number of datasets simulated under the null hypotheses. One limitation of the simulation-based approach is that the simulated data may become skewed under extreme parameter values (eg, mean expression, dispersion, or fold ratio), which can lead to larger variability of power estimates, particularly when sample sizes are relatively small.
By supporting both negative binomial and lognormal outcome distributions, as well as independent and paired designs, depower accommodates a wide range of experimental settings, including RNA-Seq, ChIP-Seq, and proteomics studies. The package offers integrated tools for data simulation, hypothesis testing, power estimation, and visualization, enabling researchers to make informed decisions about sample size and study design before data collection. Future extensions may include support for additional correlation structures, hierarchical models, and more complex experimental designs, further broadening the applicability of this framework. Overall, depower provides a robust and practical solution for researchers seeking reliable power calculations in complex high-dimensional biological studies.
Footnotes
Acknowledgements
We thank the reviewers for their valuable suggestions.
Author Contributions
BK and LY developed and tested the R package. BK and LY wrote and reviewed the manuscript.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
