Abstract
The importance of genomic surveillance strategies for pathogens has been particularly evident during the coronavirus disease 2019 (COVID-19) pandemic, as genomic data from the causative agent, severe acute respiratory syndrome coronavirus type 2 (SARS-CoV-2), have guided public health decisions worldwide. Bayesian phylodynamic inference, integrating epidemiology and evolutionary biology, has become an essential tool in genomic epidemiological surveillance. It enables the estimation of epidemiological parameters, such as the reproductive number, from pathogen sequence data alone. Despite the phylodynamic approach being widely adopted, the abundance of phylodynamic models often makes it challenging to select the appropriate model for specific research questions. This article illustrates the application of phylodynamic birth-death-sampling models in public health using genomic data, with a focus on SARS-CoV-2. Targeting researchers less familiar with phylodynamics, it introduces a comprehensive workflow, including the conceptualisation of a research study and detailed steps for data preprocessing and postprocessing. In addition, we demonstrate the versatility of birth-death-sampling models through three case studies from Germany, utilising the BEAST2 software and its model implementations. Each case study addresses a distinct research question relevant not only to SARS-CoV-2 but also to other pathogens: Case study 1 finds traces of a superspreading event at the start of an early outbreak, exemplifying how simple models for genomic data can provide information that would otherwise only be accessible through extensive contact tracing. Case study 2 compares transmission dynamics in a nosocomial outbreak to community transmission, highlighting distinct dynamics through integrative analysis. Case study 3 investigates whether local transmission patterns align with national trends, demonstrating how phylodynamic models can disentangle complex population substructure with little additional information. For each case study, we emphasise critical points where model assumptions and data properties may misalign and outline appropriate validation assessments. Overall, we aim to provide researchers with examples on using birth-death-sampling models in genomic epidemiology, balancing theoretical and practical aspects.
Introduction
Next-generation sequencing and its applications to genomic surveillance strategies for pathogens have provided an indispensable avenue for studying and controlling infectious disease outbreaks in the context of public health.1-3 The significance of genomic surveillance has become particularly evident during the severe acute respiratory syndrome coronavirus type 2 (SARS-CoV-2) pandemic, where viral genomic information has informed public health decisions on national and international levels.4-6 Monitoring pathogen infections through genome-based methods not only allows for molecular characterisation and observation of evolutionary changes but also harbours the potential to retro- and prospectively evaluate spatiotemporal transmission dynamics. 7 It can thus provide valuable insights for informing mitigation strategies.
In recent years, genomic epidemiological surveillance has increasingly employed phylodynamics – a concept pioneered by Grenfell et al. 8 It encompasses an integrated analysis of immunodynamics, epidemiology and evolutionary biology and has recently particularly focussed on epidemiological aspects. Phylodynamic methodologies predominantly operate under the assumption that evolutionary and epidemiological dynamics occur on similar timescales, such that the branching events within the viral phylogenetic tree correspond temporally to transmission events. Consequently, monitoring pathogen genetic data across time enables inferences about past epidemiological trends, and numerous studies have successfully applied phylodynamic methods to a wide range of pathogens, including viruses and bacteria (for a review, see Baele et al, 9 Rife et al 10 and references therein).
Bayesian statistics are widely adopted in fields of data-based parameter estimation due to their straightforward quantification of uncertainty by generating probability distributions that represent our prior belief updated in light of the observed data.
11
In Bayesian inference, these distributions, also called posterior probabilities, are calculated from the product of the probability of the data under the assumed model given the parameters (‘likelihood’) and the probability of the parameters (‘prior’). To obtain a normalised distribution, this product is divided by the model evidence, that is the probability of the data under the assumed model (‘marginal likelihood’). As the posterior probability can only be calculated analytically for very simple models, in more complex inferences, it is usually approximated by numerical methods. The most common of these methods is the Markov Chain Monte Carlo (MCMC) algorithm, which draws a representative set of samples from the posterior distribution. In Bayesian phylodynamic inference, we usually aim to estimate posterior distributions for the tree, which reconstructs the ancestral relationship between the sequences, jointly with evolutionary and epidemiological model parameters. By treating the tree as a parameter of interest, the likelihood is calculated from the genomic data through an evolutionary model and a prior probability distribution for the tree is defined by a population dynamic process. In epidemiological applications, these population dynamic models are often chosen to directly quantify parameters of interest, like the growth rate or reproductive number
Typically, the prior probability distribution of the tree is derived from birth-death processes17,18 or coalescent theory.
19
In this article, we consider transmission dynamics in the context of a birth-death process with sampling through time. A more detailed description of coalescent-based models can for example be found in Featherstone et al.
20
Within the birth-death-sampling framework, a bifurcating tree is generated forwards in time by allowing lineages to undergo birth/transmission, death/recovery and sampling events. Death/recovery events refer to the removal of an individual from the pool of infected individuals, either through death or recovery. In addition, an individual is assumed to be removed from said pool when being sampled. The corresponding rate parameters – that is the transmission rate
Due to the interdependence of the rate parameters, not all three epidemiological quantities can be estimated at the same time. From a mathematical perspective, this problem is usually discussed in the context of model identifiability. For birth-death models, it has recently been studied in more detail to provide a sound theoretical foundation for the inference problem and better guide practical inferences.22-27 It is important to note here that theoretical identifiability has not been shown for all birth-death models, including the one introduced above. Examples for identifiable models include birth-death without sampling through time or sampling through time without removal of the sampled individual from the group of infected individuals. However, even if the theoretical identifiability of a model is proven, the practical inference of it from a limited amount of genetic data remains challenging. It is, therefore, strongly advisable to scrutinise the model setting and inference results, for example through sensitivity analyses and simulation studies as discussed below. In phylodynamic inference, birth-death-sampling models are appealing as they allow for stochastic changes in the population size and explicitly model the sampling process. Particularly at the start of an outbreak, during its phase of exponential growth, models based on birth-death-sampling processes have been shown to outperform alternatives. 28 In the full analysis, the inferred branching times of the phylogenetic tree are informed by both the genetic differences between sequences, through the evolutionary model, and their sampling dates, through the birth-death-sampling model. We will, therefore, refer to the posterior trees that are informed by the phylogenetic and population dynamic model as ‘phylodynamic’ trees. Moreover, certain model extensions facilitate the integration of intricate population dynamics by enabling changes in the epidemiological parameters through time over one or multiple subpopulations. More details on these models and their implementations in BEAST2 packages will be given in later sections.
In this article, we illustrate the use of phylodynamic birth-death-sampling models in extracting information relevant to public health from genomic data. For this, we focus on genomic data sets of SARS-CoV-2 that present a well-defined setup for phylodynamic analysis. We provide a detailed description of key considerations involved in using birth-death-sampling models with SARS-CoV-2 data, particularly intended for researchers with little experience with these methods. We start by describing the general conceptualisation of a research study utilising phylodynamics, followed by a section introducing technical details of a data analysis encompassing crucial steps of data preprocessing and postprocessing of the results. As an illustration, we present three case studies from Germany exemplifying how surveillance of genomic variation through time can be used to make inferences about the transmission dynamics of SARS-CoV-2. To demonstrate the versatility of birth-death-sampling models implemented in BEAST2 software packages, each case study addresses a distinct research question: Case study 1 evaluates whether viral genomic data alone can point towards a superspreading event as the epicentre of an outbreak. Case study 2 infers relative transmission dynamics to compare a nosocomial outbreak cluster and the surrounding community in the Düsseldorf area. Case study 3 quantifies regional transmission dynamics in south-eastern Germany in a complex substructured population. We furthermore interpret the applicability of the methods employed and the results for each case study within the context of public health. The article is structured in a way that the ‘General workflow – Conceptualisation’ section introduces the main conceptual parts of a Bayesian phylodynamic analysis, which, in principle, can be applied to the analysis of any pathogen. The subsequent section, ‘General workflow – Data analysis’, offers a summary of the technical parts and their rationale, with a focus on SARS-CoV-2. Case studies 1 to 3 delve into the epidemiological and methodological details of the analyses conducted in this article. They thus also target readers interested in the technical aspects of phylodynamic analysis for SARS-CoV-2.
Methods
General workflow – Conceptualisation
As in any scientific research, the process of generating new insights into the emergence and spread of infectious diseases within a Bayesian phylodynamic framework begins with formulating adequate

Bayesian phylodynamic studies: Main components and case study details. Left panel: The defining parts of the analysis workflow based on the research question, which addresses the specific epidemiological setting. Ideally, the data is collected under a well-defined strategy and a method is chosen that can resolve the main epidemiological drivers and yield the results of interest. By interpreting the results in consideration of the method’s assumptions, implications for public health strategies can be derived. Right panel: In each case study, we consider a distinct epidemiological context and present the type of data, method and result that can objectify it. We focus on birth-death-sampling models implemented in the Bayesian inference software BEAST2.
In Bayesian phylodynamic analysis, selecting an appropriate method includes choosing an evolutionary and epidemiological model and defining prior distributions for parameters included in the model. Here, we will briefly discuss the selection of the epidemiological model, whereas the other aspects are covered in the section ‘General workflow – Data analysis’. The epidemiological model defines a probability distribution for the tree topology and branch lengths, the so-called ‘tree prior’. The models used in this study fundamentally rely on the same tree-generating process, a birth-death model with serial sampling, that can capture distinct aspects of transmission dynamics through model extensions. For example, the birth-death skyline model, 32 as implemented in the BEAST2 package BDSKY, allows the rate parameters – transmission, recovery and sampling – to vary in a piecewise constant manner over time. This facilitates the reconstruction of changes in epidemiological parameters, particularly after specific points in time, and provides a framework for estimating the impact of nonpharmaceutical interventions on disease transmission.33,34 The BDSKYλ model extends the BDSKY model to allow the direct estimation of transmission differences between outbreak clusters relative to one baseline transmission rate. 35 This method can be applied, for instance, to quantify the transmission advantage of one viral lineage over another. Importantly, both models assume each cluster to contain a panmictic population in which all infected individuals follow the same transmission dynamics. Especially in the case of a pandemic, this assumption is often violated because of population substructure, induced, for example, through transmission heterogeneity caused by superspreading or distinct subpopulations due to geographic barriers. Scientific questions seeking to quantify transmission patterns within and among such subpopulations can utilise a multitype birth-death model, available in the birth-death-migration model (BDMM) package. 36 In this model, the parameters may vary among discretized subpopulations (‘types’) and, similar to BDSKY, can change over time in a piecewise constant fashion. Moreover, subpopulations can be connected through type-change events reflecting, for instance, a migration of an infected individual.
In a carefully predetermined context, the
Before using the sequencing data in the described Bayesian inference framework, it is advisable to run prior predictive simulations (also called ‘sample from prior’). In these analyses, samples are drawn from the full prior distribution only. The full prior is the product of the tree prior, calculated from the specified birth-death-sampling process, and all other prior distributions set on estimated parameters. 11 As the sampling process is explicitly modelled in birth-death-sampling tree priors, these analyses are still informed by the observed data through the sampling times of the sequences. It is important to sample from the full prior for mainly two reasons: First, to assess that the MCMC chain is mixing well and can sample correctly from the prior distribution. This is verified through convergence of the ‘sample from prior’. Second, to validate that the joint prior distribution is sensible. Through parameter interactions, the actual prior distribution on a parameter can differ from the prior distribution that is defined for it during the analysis setup. Prior predictive simulations can, therefore, be utilised to inspect that these interactions do not lead to inappropriate full prior distributions.
Therefore, it is crucial to assess the sensitivity of the results and the validity of the model choice: The robustness of the results against small changes in the analysis setup should be evaluated through sensitivity analyses. These analyses can involve, for example, using different model setups or prior distributions. Alternatively, re-estimating parameters of interest under the same model specification, but using a slightly different data set obtained through random subsampling or holding back specific samples, can provide insights into the sensitivity of the results towards certain data points. In addition, as detailed above, it is helpful to compare the resulting posterior distributions with the joint prior distribution. This comparison can reveal the extent to which the data influence the results through the phylogenetic likelihood, as well as the impact of the model assumptions through the population dynamic model. To evaluate the practical identifiability of a model for a specific research question, simulation studies are recommended. Through simulations, it is possible to assess the performance of the method under various model settings and/or data properties to improve the interpretability of the results. In principle, the efficiency of the Bayesian phylodynamic framework in informing public health policy and practice hinges on the careful design of research studies and the selection of validated models. By conducting rigorous evaluations of the results and model choice, researchers can enhance the credibility of their findings and eventually contribute to the control of infectious disease outbreaks in the context of public health.
General workflow – Data analysis
In Figure 2, we show a general workflow representation of the steps needed for a full phylodynamic analysis of SARS-CoV-2 data using the software package BEAST2. Starting from a genomic data set, we subdivide the steps into three parts: (1) preprocessing, (2) phylodynamic setup and (3) postprocessing. Preprocessing consists of steps that ensure suitable data quality and knowledge of its structure. The phylodynamic setup entails the definition of the full Bayesian model. Postprocessing steps are necessary to summarise the Bayesian posterior samples and visualise the resulting information. In the following, we will discuss all generalisable steps, that is parts (1) and (3) in full and part (2) with a focus on the evolutionary model. As such, we applied these in the same manner to all three case studies. As the epidemiological model in part (2) is informed by the specific scientific question, we discuss these steps for each case study separately.

Overview of the steps included in a phylodynamic data analysis: Data-specific preprocessing steps include quality control (eg exclusion of low coverage genomes), multiple sequence alignment (MSA) and data control (eg masking of phylogenetically uninformative or misleading positions). In the main phylodynamic setup, the method specifications are defined (here in BEAST2) and the computation performed. To present the results of these computations, summary statistics of the Bayesian posterior sample have to be calculated, visualised and contextualised to obtain an interpretation as answer of the initial scientific question. Additional analyses to test the effect of changes in strong assumptions made during the BEAST2 setup, so-called sensitivity analyses, are important to demonstrate the robustness of the obtained results.
Preprocessing
For case studies 1 and 2, we have obtained genomic SARS-CoV-2 sequence data sets of interest together with metadata from Walker et al,41,42 respectively. For case study 3, data were collected and sequenced at the University Medical Center Regensburg and at the MVZ Labor Passau (Supplemental Text S5). Similar data sets can be downloaded from public databases for sharing genomic surveillance data, like Global initiative on sharing all influenza data (GISAID)43-45 and National Center for Biotechnology Information (NCBI).
46
The metadata associated with the data sets consisted of geographic location and sample collection date. During the
To compare the sequences site-by-site, we constructed multiple sequence alignments (MSA) using MAFFT v7.453. 48 We used the first reported SARS-CoV-2 genome MN908947.3 as a reference and set the ‘keeplength’ and ‘addfragments’ options to decrease the computational demand. Alternative MSA tools are, for example, described by Pervez et al. 49 As phylogenetic analyses are often purely based on information transported by the MSA, its quality can strongly impact the results. 50 SARS-CoV-2, however, has only recently spread to the human host and has in the considered data sets not shown drastic genomic changes (ie longer insertions and duplications). The construction of MSAs for it is, therefore, still relatively robust, with different methods yielding comparable results. 51 As a quality control measure for the MSAs, we, therefore, opted for visual inspection with Jalview v2.11.4.0. 52
SARS-CoV-2 sequencing data have been shown to contain nonrandom errors associated, for example, with certain laboratories or sequencing protocols.53,54 Therefore, we performed additional
Phylodynamic setup
To define the phylodynamic model, key components include defining an epidemiological model to characterise population dynamics, selecting an evolutionary model to describe sequence evolution, specifying prior distributions for parameters in these models and, finally, deciding on the number of iterations and samples for the MCMC process. For each case study, the epidemiological model and related
The molecular evolutionary process that generated the sequence data can be described mathematically through a nucleotide substitution and evolutionary clock model. Selecting a substitution model that accurately fits the data is crucial, as wrong assumptions about the mutational process can lead to systematic errors in phylogenetic reconstruction. 65 In general, two approaches can be employed for substitution model selection: first, using a priori knowledge of the data to choose an appropriate model and, second, employing likelihood-based model selection or averaging algorithms, such as jModelTest2, 66 PartitionFinder 67 or bModelTest. 68 In our study, we reconstructed phylogenetic trees for individual phylogenetic lineages with highly identical sequences. Given the simplicity of these evolutionary relationships, we opted for the first approach and chose to utilise the Hasegawa-Kishino-Yano-85 (HKY85) model, 69 which estimates base frequencies and accounts for the difference between transitions and transversions through a scaling factor. Moreover, among-site rate variation was modelled through a gamma distribution (Γ). 70 The same approach, involving the selection of the HKY + Γ substitution model based on prior knowledge, has been commonly employed in previous SARS-CoV-2 studies.34,35,59,71,72
Sequence data from recent outbreaks often lack enough evolutionary changes over time for robust inference of evolutionary rates. 71 To improve the accuracy of phylodynamic reconstructions for SARS-CoV-2, it is common to assume clock-like evolution and restrict the rate of evolution by previous knowledge.34,35,59,72 In our case studies, we focused either on a single clonal outbreak (case studies 1 and 2) or reconstructed phylodynamic trees for each Pango lineage separately (case study 3). Given the lack of strong evidence of nonclock-like evolution within lineage and the low genetic diversity, we followed the approach from studies34,35,59,72 and used a strict clock model to describe the accumulation of diversity over time, fixing the rate to a well-established value of 0.0008 substitutions/site/year. 73 These assumptions, however, can greatly impact the temporal scale of the phylodynamic trees. To illustrate this impact, we provide additional analyses relaxing the assumptions for case study 1, which focuses on timing the first transmission event of the outbreak (see section ‘Case study 1 – Superspreading during carnival’). For SARS-CoV-2 data sets with higher levels of genetic diversity, evaluating the temporal signal and selecting the best-fit clock model are critical steps in molecular dating and phylodynamic inferences. 71
For
Postprocessing
To calculate the
Since these analyses require a set of assumptions in each specific model setup, we also ran multiple
Results
Case study 1 – Superspreading during carnival
Epidemiological context and data
One of the first larger outbreaks of coronavirus disease 2019 (COVID-19) that was detected in Germany happened in February 2020 in the district Heinsberg. After the diagnosis of the first case on the February 25, 2020, local COVID-19 case numbers quickly started to rise. Through epidemiological investigations, many of the early infections were traced back to a large-scale carnival event on February 15, 2020. 41 After this initial superspreading event, infections with SARS-CoV-2 continued to be diagnosed in the area despite early nonpharmaceutical interventions.80,81 Phylogenetic analyses of viral genome data sampled until April 2020 showed that at the end of the sampling period the outbreak already consisted of two distinct genetic groups. The first of these includes all early sequences associated with the initial superspreading event, which are assigned to Pango lineage B.3. The second was first sampled in mid-March and includes sequences from multiple different Pango lineages. 82 This data set, therefore, presents an example of the seeding dynamics of a disease in a naive population. It highlights the interplay between early high-transmission events and multiple independent introductions in continued onward transmission.
With this case study, we aim to investigate whether we can quantify these seeding dynamics from genomic data. The genomic data, in this case, consist of viral sequences sampled in the Heinsberg area over a period of two months after the first local COVID-19 detection. The sequences form two subgroups 82 : samples from the initial connected outbreak and epidemiologically unconnected samples from multiple Pango lineages. In the following, we will focus on the former subgroup, because the latter has a very small sample size and is likely affected by nonpanmictic transmission dynamics. As we are interested in estimating the timing of the first transmission event in the first subgroup, as well as the following transmission dynamics, we use the BDSKY model. Through the joint inference of the transmission dynamics with a time-scaled phylogenetic tree, it allows us to date the most recent common ancestor (MRCA) of all included samples. We can thus draw conclusions about the timing of the initialisation of the outbreak. To be able to detect the overdispersed transmission dynamics at the start of the outbreak, we model a change in the transmission rate directly after the superspreading event. It is important to note that this model does not directly describe superspreading dynamics, as it estimates population-average transmission dynamics. However, as the first interval is very short and mainly includes the carnival event, the population size of transmitting individuals included in this interval is relatively small. We, therefore, expect to detect a population average that is elevated to values higher than the basic reproductive number in the region, if the data is informative for a superspreading event. Other, more complex, phylodynamic models exist that explicitly account for heterogeneity in the transmission dynamics.38,83-85 Due to the relatively small sample size and the high phylogenetic uncertainty, we chose the simpler model in a trade-off between model correctness and robustness of estimates.
Phylodynamic setup
We started with an initial set of 125 sequences sampled between February 25, 2020, and April 20, 2020. After the data preprocessing, as described above, 47 sequences, assigned to Pango lineage B.3, remained. Due to the epidemiological context, we opted for a stricter quality control in this case study: as we expect only very few correctly informative sites in the data set, sequencing artefacts can potentially have a bigger impact. For the main analysis, we thus excluded sequences with more than 1.0% Ns and more than 0.05% unique amino acid mutations (as recommended in GISAID).
44
The first collection date in this data set was February 25, 2020, and the last April 6, 2020. We used the panmictic birth-death-sampling model implemented in BDSKY. We allowed the reproductive number to change on February 16, 2020, the day after the potential superspreading event. For both intervals, we used the prior probability distribution LogNormal(0.0, 4.0) (Supplemental Table S1). We fixed the sampling proportion to zero for the time interval before the first sample collection, as no sampling efforts were implemented in the region before this date. We set a strong prior distribution, that is Beta(32.0, 968.0), for the following time interval. This defines a prior mean of 0.032, which corresponds to the ratio of included genome samples and number of diagnosed infections until April 6, 2020, in the district Heinsberg, that is
Results and interpretation
In the estimated phylodynamic summary tree (Figure 3, left panel) we see, first, the timing of the branching events and, second, the topology, that is how the sequence samples cluster together and coalesce back in time. From the estimated branching times, we can conclude the following: With 95% of probability, the first transmission event of the outbreak happened within the 11 days preceding the carnival event, that is between February 4 and February 15, 2020. Most of the reconstructed transmission events occur temporally close to the carnival event. This strongly supports the outbreak to have started shortly before the carnival event and to have grown substantially around the time of the event. This becomes even more evident from the full posterior sample of phylodynamic trees (Figure 3, right panel), which shows distinctly higher edge density until directly after the carnival event. The sharp decrease in edge density coincides with the time when the reproductive number is allowed to change, as the reproductive number and the number of branching events in a time interval are correlated. The high edge density thus highlights how strongly the data supports most transmission events to have happened in the first interval. From the topology of the estimated tree, we can distinguish two clades that are potentially of epidemiological relevance: The first transmission, with high certainty, gave rise to two distinct sublineages, one consisting of 12.7% of included sequences, the other of 87.3%. Although most branching events of the bigger group are dated around the time of the carnival event, fewer branching events of the smaller group have 95% HPDI that include the event. This suggests that after the initial transmission event, one large subgroup got infected mainly during the carnival session, whereas another grew a few days later. The increased transmission during the carnival event is reflected in the estimates of the reproductive number: In the first interval, ending shortly after the carnival event, we estimate a median reproductive number of 15.48 (95% HPDI: [4.8, 41.5]). This estimate is higher than the population level basic reproductive number of SARS-CoV-287-89 and strongly suggestive of a superspreading event. In the second interval, which spans roughly two months afterwards, the estimated reproductive number dropped to below one (median: 0.78, 95% HPDI: [0.61, 0.95]). The latter estimate highlights the declining dynamics of this outbreak after the carnival event. The validity of these findings is supported by the simulation results (Supplemental Table S3) and by all sensitivity analyses (Supplemental Table S2). The only remarkable difference we find is with sensitivity analysis 4, in which we estimate the substitution rate with a narrow prior. This results in a lower median estimate of the substitution rate which scales the tree height to higher values. As we fix the time point when the reproductive number is allowed to change to directly after the carnival event, the time covered by the first interval becomes longer. Thus, we estimate a lower reproductive number in the first interval, which now includes a much bigger population outside of the carnival event. However, if we instead let the reproductive number change ten days after the root node (sensitivity analysis 5), we again recover the elevated transmission rate. It is important to note here that we cannot quantify the superspreading behaviour directly but only the average transmission behaviour in the whole population over a time period and that the inferred time scales are generally highly sensitive to the value the substitution rate is fixed to. As mentioned above, however, prior information on the rate is necessary if the sequence data are not informative enough.

Case study 1 – Superspreading: MCC tree using median node heights of genomes belonging to Pango lineage B.3 (left panel). Posterior sample of consensus trees visualised with DensiTree 90 (right panel). As in this illustration all posterior samples are plotted on top of each other, darker shaded areas highlight a high edge/node density or uncertainty. The reproductive number is allowed to change shortly after the carnival event on February 15, 2020 (the red dashed line).
By definition, it is relatively unlikely that the MRCA of different Pango lineages falls in this region. Hence, the second subgroup most likely represents multiple independent introductions of SARS-CoV-2 into the area within the sampling period. In Supplemental Figure S1, we also show the summary tree for all samples assigned to Pango lineage B.1 in the second subgroup. Despite the high uncertainty in the estimated parameters for this lineage due to the small sample size and low genetic diversity, we see that it could have also been introduced during the carnival festivities in February.
Taken together, these results highlight the underlying smaller-scale dynamics that took place during the first weeks of COVID-19 spread in Heinsberg: Local case numbers first began to rise drastically due to a superspreading event. However, although the outbreak caused by this event started to decline, SARS-CoV-2 was introduced into the region on multiple other occasions independently. With this case study we thus demonstrate that from viral genome sequences and their collection dates we are able to reconstruct many important aspects of the epidemiological setting in this region. We provide strong indications for an initial superspreading event during the carnival session on February 15, 2020. We further show that the outbreak started shortly before this event and visualise further small-scale dynamics. Without expensive contact tracing efforts, conclusions like these are commonly not accessible from other surveillance data, like the number of hospitalisations or cumulative case counts.
Case study 2 – Hospital outbreak
Epidemiological context and data
SARS-CoV-2 transmissions in healthcare settings have been found to play a crucial part in disease propagation, 91 and despite the large number of reported nosocomial outbreaks (see Table 1 in the study by Abbas et al), 92 genomic-based inference of transmission dynamics in healthcare settings has received little attention (for discussion, see Abbas et al 92 and Shirreff et al 93 ). Nevertheless, characterising viral spread within healthcare facilities holds immense potential for informing epidemic control policies. In this case study, we aim to determine whether transmission dynamics differ between a hospital outbreak and the surrounding community. We use nosocomial transmission chain data depicting an outbreak of lineage B.1.221 that emerged at Düsseldorf University Hospital in late 2020, affecting 29 patients and healthcare workers (‘Ward D’). 42 A combined analysis of viral genomes from the hospital transmission chain and genomes from surveillance sampling revealed a highly clonal nature of the hospital outbreak. 42 Although the joint analysis of viral genomes and contact tracing data by Walker et al 42 identified potential infection clusters, our phylodynamic approach aims to further quantify relative transmission differences between the nosocomial outbreak and a nontargeted population surveillance sample. This is achieved using a baseline transmission rate and cluster-specific scaling factors, as implemented in the BDSKYλ model.
Phylodynamic setup
We utilised 23 SARS-CoV-2 sequences from the hospital outbreak and 50 sequences from the local background population. Surveillance sequences, also representing Pango lineage B.1.221, covered a timespan from August 17, 2020, to December 16, 2020, whereas the nosocomial outbreak was sampled between December 14 and December 23, 2020. For all samples, we performed the data preprocessing steps as outlined in Figure 2 and the corresponding text. For the hospital cluster, analysis with TempEst did not reveal any phylotemporal outliers.
Relative transmission rates between hospital and surveillance sample clusters were estimated with the BDSKYλ model. Within Bayesian inference, we inferred for both clusters a separate tree, sampling proportion and transmission rate ratio
Results and interpretation
We estimated the baseline reproductive number within two time intervals: before and after the introduction of a partial lockdown in Germany. Despite the 95% HPDIs overlapping, the inferred baseline transmission rate estimate within the first interval was lower than the corresponding estimate for the second interval yielding median values of

Case study 2 – Hospital outbreak: Phylodynamic trees inferred for the hospital outbreak cluster and surveillance sample cluster. The upper tree has been reconstructed based on viral genomes obtained from a nosocomial outbreak (‘Ward D’, N = 23), whereas the lower tree represents a nontargeted population surveillance sample (N = 50). For the surveillance data, the sampling proportion was allowed to change on October 12, 2020, indicated with a red dashed line, as the percentage of sequenced cases over all confirmed infections decreased (Supplemental Text S2 and Supplemental Figure S2). For the baseline reproductive number
While the
The inferred sampling proportions for the surveillance data were in accordance with the percentages of sequenced cases as reported by Walker et al.
42
For the time periods covering calendar weeks 33 to 41 and 42 to 51, we inferred median estimates of
In summary, given that both, nosocomial and surveillance data, encompass viral genomes of the same Pango lineage collected during a similar time frame, the elevated transmissibility observed within hospital settings likely correlates with environmental factors. Therefore, our results highlight the essential need to enforce stringent preventive measures within healthcare facilities to shield the most vulnerable individuals from infection. Rapid management of nosocomial outbreaks not only prevents transmissions among patients but also substantially influences epidemic dynamics in the surrounding community. 91 Moreover, the characterization of a baseline transmission rate and its cluster-specific scaling factor in this study showcased the suitability of the BDSKYλ model for inferring relative disparities in transmission dynamics between multiple clusters. However, it is important to note that challenges for estimating the reproductive number can arise from the typically small sample sizes of nosocomial outbreaks and the stochastic nature of the transmission process.
Case study 3 – Spatiotemporal pandemic excerpt
Epidemiological context and data
One important challenge in phylodynamic inference of regional transmission dynamics during a pandemic is population substructure due to multiple independent introductions. To address this problem, we focus on two research questions with this case study, one relating to methodology and one to epidemiological context: Can we successfully infer regional transmission dynamics from data sets with high levels of unknown substructure? Do the inferred local transmission dynamics align with the dynamics on the national level?
If a disease is introduced multiple times into the considered area, the genomic data sampled will likely not represent one clonal outbreak but multiple independent, simultaneously spreading transmission chains. The MRCA and many internal nodes of a phylogenetic tree reconstructed from all genome samples will, therefore, represent transmission events outside of the considered area. If the research goal is to quantify the regional transmission dynamics, these parts of the tree will bias the resulting estimates as they represent the epidemic dynamics outside of the region. Correctly treating these independent introductions becomes more difficult the higher the number of introductions and the more closely related the respective genome samples are. To account for this substructure, two main approaches exist: First, identifying the independent introductions to group the sequences into clusters belonging to the same outbreak and reconstructing a tree for each outbreak separately as in case study 1. Second, using a phylodynamic model that accounts for this substructure and reconstructing the global tree, as done in this case study. We consider a genome data set with high levels of unknown substructure. Through regional sequencing efforts, we obtained SARS-CoV-2 genome sequences of around 5% of diagnoses from multiple laboratories in south-eastern Germany sampled between January and October 2021. In Supplemental Figure S4, we show the number of sequences and diagnosed infections per Landkreis during the sampling period. This illustrates that the considered data set comprises samples that represent only a very small fraction of cases in a geographically not explicitly defined area. In addition, the sequences show high genetic similarity, even across time and space. Grouping into connected outbreaks through cluster detection methods is, therefore, difficult. We thus opted for the second option and used a population structure-aware phylodynamic model to infer the transmission dynamics.
Phylodynamic setup
We obtained 931 sequences from East Bavaria, sampled between January 13, 2021, and October 30, 2021. Details regarding the sample collection, sequencing and quality control are given in Supplemental Text S5. Each sequence included the collection date and first three numbers of the postal code as metadata. To reduce computational complexity of the Bayesian analysis, we inferred an independent tree for each Pango lineage with more than five sequences. To validate that the size of included Pango lineages does not change our results, we ran the same inference also on all Pango lineages with at least 20 sequences. We used pangolin
97
to identify the lineages and ran the preprocessing steps as described above for each of the lineages. This resulted in 32 Pango lineages and 698 sequences in total. In a joint Bayesian analysis, we inferred independent phylodynamic trees for each Pango lineage with a shared effective reproductive number and sampling proportion. We used a multitype birth-death model in BDMM to quantify the prior probability distribution for the trees. We defined two types that are connected by the same constant type-change (‘migration’) rate. One type is sampled with a constant rate, thus corresponding to our study population. The other type, by contrast, was not sampled and, therefore, includes everything outside of the study population. All samples were assigned the same tip type, corresponding to the type with nonzero sampling rate. At two points in time, April 14, 2021, and July 23, 2023, we allowed the reproductive number and sampling proportion for both types to change. We defined the change points so that the time intervals have a similar length. To test whether we can capture the overall dynamics well with only three intervals, we also analysed the data allowing the reproductive number to change at six points in time. These were February 24, 2021, April 7, 2021, May 19, 2021, June 30, 2021, August 11, 2021, and September 22, 2021. The rate to become noninfectious was fixed to 36.5 year-1. We also fixed the geo-frequencies parameter, representing the probability of the root node of the tree being assigned to either type, to 1 for the unsampled and 0 for the sampled type. This induces the assumption that every Pango lineage includes samples from at least two independent introductions, as the root of the lineage is outside of the study population. For the reproductive number, we set a lognormal distribution with mean 0.0 and standard deviation 4.0 as prior for all types and intervals (Supplemental Table S7). As prior probability for the sampling proportion, we chose a Beta(50, 950) distribution. For all other parameters, we kept the default priors. The following steps were then carried out as discussed above. As the sampling period covers the replacement of the variant of concern Alpha by Delta, our data sets include genome sequences from lineages that have shown to differ in their transmissibility. To see whether we can quantify the transmission difference in our study population, we also set up the same analysis as above with variant stratification: We inferred a different reproductive number for the Pango lineage B.1.1.7 (Alpha) and for all included Pango lineages belonging to Delta. We ran all MCMC chains for
Results and interpretation
In Supplemental Figure S8, we show the results of 50 simulation replicates, 40 out of which reached ESS values above 200 for all estimated parameters. In all of them, the HPDI coverage is high, with a value close to 100%, and the relative bias is small. This suggests that the method can recover the correct transmission dynamics from a single sampled population under our simulation conditions, particularly if it converges well. We thus applied the method to our data set to infer the transmission dynamics in three equally long intervals to quantify temporal changes within the considered region. The resulting estimates of the effective reproductive number (

Case study 3 – Spatiotemporal pandemic excerpt: Effective reproductive number (upper panel) and sampling proportion (lower panel) estimated for the sampled study population. Both parameters were allowed to change on April 14 and July 23, 2021. The bold line shows the median posterior estimate and the shaded area the 95% HPDI. The grey line marks an
This case study shows that without any additional source of information, we are able to disentangle the transmission dynamics of our sampled population and infer the dynamics that shape the diversity outside of this population. Similar approaches based on coalescent tree priors have previously been successfully employed on SARS-CoV-2 data.84,100,101 The decrease in the effective reproductive number in the interval between April 14, 2021, and July 23, 2021, can be caused by multiple factors, like nonpharmaceutical interventions implemented locally and increasing temperature.102,103 The increase above one in the following interval is most plausibly a result of behavioural changes due to the summer holidays, as reported previously. 99
Discussion
With this article, we provide a comprehensive overview of how phylodynamic birth-death models can be used to inform transmission dynamics of SARS-CoV-2. We have summarised all steps that constitute such an analysis and highlighted several situations in which model assumptions and data properties might not align and thus profit from additional validation assessments. To illustrate the versatility of these methods, we have presented and discussed three case studies from the COVID-19 pandemic in Germany. These show how we can disentangle complex epidemiological patterns purely from time-stamped genomic data through Bayesian inference using birth-death-sampling models. In case study 1, we find indications of a superspreading event in the initial phase of the first larger outbreak in Germany by allowing a change in the reproductive number directly after a large-scale gathering. By simultaneously estimating the time-scaled phylodynamic tree we can, in addition, provide a time interval for the date of the first transmission event of the outbreak, based on the observed mutational pattern and estimates of the mutation rate from the literature. Case study 2 takes a comparative approach by focusing on the estimation of the relative difference in the transmission rate between community spread and a nosocomial outbreak. Our results indicate pronounced levels of SARS-CoV-2 transmission within the hospital, emphasising the crucial role of implementing preventive measures effectively within healthcare facilities. In case study 3, we demonstrate the usability of the methods in cases where we do not consider clonal outbreaks. Even in the presence of complex population substructure, we can robustly quantify piecewise constant intervals of the effective reproductive number. The resulting analyses show that the dynamics in the abstractly defined subpopulation well resemble what the dynamics on the national level look like.
Despite their diverse applicability, the discussed methods also come with multiple limitations: First, fundamental challenges may arise from the mathematical properties of the model and simplifying assumptions made within the mathematical models, which may not necessarily be met by the analysed data. The mathematical properties include potential identifiability issues caused by multiple different parameter combinations yielding the same likelihood value and difficulties caused by limited sample size and uncertainty in the tree structure.22,25 One of the simplifying assumptions is, for example, the piecewise constant nature of all rate parameters. Particularly for the sampling process, this can lead to pronounced biases if the sampling rate is not actually piecewise constant, but rather changing exponentially through time. 30 In addition, if the phylogeny is not a good approximation of the transmission tree, for example because the evolutionary and epidemiological processes do not act on similar timescales, this can bias parameter estimates. In that situation, the phylogeny and transmission tree can be modelled separately. 104 Second, challenges arise from specific properties of the data. For SARS-CoV-2, this entails the large amount of available sequence data, the low genetic diversity and the unknown substructure in these data sets. 105 The small number of expected substitutions over time scales of weeks or few months and thereby high sequence similarity in data sets sampled over short time periods result in weak phylogenetic signals. In addition, the existing models are computationally expensive due to numerical integration over the high-dimensional tree space. Therefore they cannot easily be applied to large data sets.106,107 Recently, methodological advancements, such as PIQMEE 108 and Thorney BEAST https://beast.community/thorney_beast, last visited May 28, 2024), have enabled phylogenetic analysis in a Bayesian framework of data sets of more than tens of thousands of genomes. Additionally, deep learning approaches have been implemented to estimate epidemiological parameters from phylogenetic trees, allowing fast and accurate inference from large trees. 109
During the COVID-19 pandemic, it has become particularly important to understand when these comprehensive but complex frameworks are necessary and where simpler methods can yield results of comparable quality. For inference of trees from many genetically very closely related sequences, maximum parsimony approaches have, for example, been shown to perform comparably to more elaborate maximum likelihood or Bayesian analyses while being computationally much less expensive.110,111 Although, generally, the least complex method that can provide accurate estimates should be chosen, the assessment of accuracy is not always easy. Simpler models might, for example, be able to estimate the same parameter faster, but lack more detailed quantifications of uncertainty. Bayesian inference, on the contrary, provides a full probabilistic framework that directly quantifies uncertainty in joint probabilities for all parameters of interest. These joint probabilities consider all model components, accounting for correlations between them. Bayesian inference also allows direct model assessment by generating random samples from the generative probability distributions. Such simulations can provide insights into the performance in specific situations. It is important to keep in mind, though, that conclusions from simulation studies are based on synthetic data, which may lack important aspects of real-world data. 112 Because of the hierarchical structure of Bayesian models, the level of complexity can also be adjusted by making assumptions about parameter values through prior point or range estimates of parameters like the substitution rate or even the phylogenetic tree. 113 Other data types can also be integrated into the analysis via prior probability distributions, bringing together different lines of evidence. 114 In addition, especially when choosing a complex Bayesian inference method, the validation of results by simpler methodologies can be considered good practice.
Conclusions
As we demonstrate here, Bayesian methods can yield valuable information, particularly on well-defined and moderately-sized data sets. In case study 1, we show how we can easily retrace complex seeding dynamics of a large outbreak, that are otherwise only accessible through very time and resource-intensive contact tracing efforts. Case study 2 combines surveillance and outbreak data in a joint analysis setting, thus providing a comparable quantification. Even for data sets for which we cannot precisely define the study population, we are able to disentangle the unknown substructure and estimate the respective transmission dynamics, as shown in case study 3. Nevertheless, these three case studies illustrate only some of the relevant epidemiological information that can be extracted from genome data through phylodynamic models: A multitude of other studies has been investigating aspects such as the origin dates and intrinsic differences in transmissibility of new viral lineages, including variants of concern (see Attwood et al, 38 Martin et al, 115 Volz 116 and references therein).
Supplemental Material
sj-pdf-1-bbi-10.1177_11779322251321065 – Supplemental material for Evolutionary and epidemic dynamics of COVID-19 in Germany exemplified by three Bayesian phylodynamic case studies
Supplemental material, sj-pdf-1-bbi-10.1177_11779322251321065 for Evolutionary and epidemic dynamics of COVID-19 in Germany exemplified by three Bayesian phylodynamic case studies by Sanni Översti, Ariane Weber, Viktor Baran, Bärbel Kieninger, Alexander Dilthey, Torsten Houwaart, Andreas Walker, Wulf Schneider-Brachert and Denise Kühnert in Bioinformatics and Biology Insights
Footnotes
Acknowledgements
The authors thank Katharina Quadt from the MVZ Labour Passau for compiling the partial data of the data set used in case study 3, Jürgen Fritsch for implementing the nanopore sequencing workflow and Theresia Prenzel for her excellent technical support in the laboratory of the university hospital Regensburg. The present work was performed in partial fulfilment of the requirements for Viktor Baran to obtain the degree ‘Dr. med. dent.’. We also thank 4 anonymous reviewers for their comments and suggestions, which helped us to improve the article.
Author contributions
Conceptualisation: SÖ, AWe, VB, BK, AD, TH, AWa, W S-B, DK.
Data curation: SÖ, AWe, VB, BK, AD, TH, AWa, W S-B, DK.
Formal analysis: SÖ, AWe.
Methodology: SÖ, AWe, DK.
Software: SÖ, AWe.
Validation: SÖ, AWe.
Visualisation: SÖ, AWe.
Writing – original draft: SÖ, AWe.
Writing – review and editing: SÖ, AWe, VB, BK, AD, TH, AWa, W S-B, DK.
Declaration of conflicting interests:
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding:
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: Funding for this work was obtained from the Max-Planck Society (SÖ, AWe and DK) and through a Landesgraduiertenstipendium from Friedrich Schiller University Jena and the State of Thuringia (AWe). This work was, in addition, supported by the Ministry for Work (AD), Health and Social Affairs of the State of North Rhine-Westphalia (‘24.04.01 Gen. u. IKA’ and CPS-1-1A) (AD), the Jürgen Manchot Foundation (AD), the German Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung (award numbers 031L0184B and 01KX2021) (AD) and the German Research Foundation (award number 428994620) (AD).
Data availability
For case studies 1 and 2, no new data were generated in support of this article. For case study 1, focusing on an outbreak during a carnival event in Heinsberg, viral sequences and associated metadata were obtained from Walker et al (2020). For case study 2, characterising transmission differences between nosocomial outbreak and population surveillance sample in Düsseldorf, viral sequences and associated metadata were obtained from Walker et al (2022). The data for case study 3 were generated as part of the CorSurV project of the Robert Koch Institute, Germany. They are a partial data set of the data available under
.
We provide analysis xml files and R scripts for data evaluation and visualisation on Zenodo https://doi.org/10.5281/zenodo.14858533.
Supplemental material
Supplemental material for this article is available online.
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
