Abstract
Background
The COVID-19 pandemic demanded evidence synthesis approaches that could keep pace with rapidly evolving science and urgent policy needs. Rapid reviews and living evidence syntheses (LES) emerged as critical methodological responses, yet limited guidance exists for teams undertaking such work, particularly regarding the operational challenges of sustained production, multi-team coordination, and the transition from rapid reviews to living formats under emergency conditions.
Objectives
This paper aims to document and appraise methodological decisions made across each component of the synthesis process and their rationale; identify the contextual factors, coordination mechanisms, and infrastructure that enabled or constrained synthesis implementation under emergency conditions; and derive practical recommendations for evidence synthesis teams, funders and commissioners, and methodologists.
Methods
We present a structured reflective process evaluation of our experience producing three rapid LESs on COVID-19 vaccine effectiveness as part of COVID-END in Canada, addressing vaccine effectiveness in adults (short-term and long-term) and children/adolescents, collectively producing 79 versions over approximately two years. Drawing on a Context–Mechanism–Outcome (CMO) lens, we examine eight core methodological processes, each time highlighting: the methodological decision and its rationale; the challenges encountered; and the lessons learned.
Findings
Key insights include: decision-maker engagement played a central role in methodological decisions, directly determining scope boundaries, update frequency, and presentation formats; the multi-team coordination structure was essential to quality under streamlined methods, with cross-team verification partially offsetting the validity risks of single-reviewer screening; the choice of narrative synthesis over meta-analytic approaches in two of three products reflected capacity constraints during emergency production, a distinction with implications for how such adaptations are evaluated against existing guidance; automation produced considerable efficiency gains but required added development capacity and resources, demonstrating that teams should plan and budget for automation from the outset rather than absorbing its demands reactively during production; and search coverage was acceptable for decision timelines. External cross-checks identified five studies not captured by our searches across the entire project, of which only two were ultimately eligible for synthesis. Emergency production meant that key process records, including change logs, per-cycle search records, and pilot screening agreement data, were not systematically captured. Future teams should plan to assemble a minimal transparency package from the outset. A minimum viable LES framework is proposed comprising minimum infrastructure, minimum methods, and optional enhancers.
Conclusions
The successful implementation of rapid LES requires pre-existing or rapidly mobilisable infrastructure, sustained and explicitly resourced decision-maker engagement, transparent documentation of methodological adaptations across update cycles, effective coordination in multi-team contexts, and realistic sustainability planning. These lessons, derived from a specific emergency context, offer transferable guidance for evidence synthesis teams preparing for future health emergencies and for standards bodies considering how existing guidance accommodates emergency and living synthesis conditions.
Keywords
Introduction
Coronavirus disease 2019 (COVID-19) was declared a pandemic by the World Health Organization (WHO) on March 11, 2020 (World Health Organization, 2020). Mass vaccination against SARS-CoV-2 proved crucial to containing the pandemic’s impact. However, decision-makers faced numerous challenges in making public policy choices amid rapidly evolving scientific evidence. Traditional systematic review methods, often requiring 12–18 months to complete, were fundamentally incompatible with the pace of the pandemic, as policymakers required syntheses that could keep up with the evidence within days or weeks (Biesty et al., 2020).
Characteristics of the Three Living Evidence Syntheses
Note. HiRU = Health Information Research Unit; LES = Living Evidence Synthesis; UNED = Evidence and Deliberation Unit for Decision Making; MBMC = Montreal Behavioural Medicine Centre; CIUSSS = Centre Intégré Universitaire de Santé et de Services Sociaux; VESPa = Vaccine Effectiveness Systematic Review Platform. The syntheses were numbered 6, 8, and 10 within a broader series of Public Health Agency of Canada (PHAC)-commissioned COVID-END evidence products.
This paper presents a structured reflective process evaluation of our experience producing three rapid LESs on COVID-19 vaccine effectiveness for Canadian public health decision-making (Bacon et al., 2023; Elliott et al., 2017; Flórez et al., 2023; Iorio et al., 2022). It aims to document and critically appraise the methodological decisions made across each component of the synthesis process and their rationale; identify the contextual factors, coordination mechanisms, and infrastructure that enabled or constrained implementation under emergency conditions; and derive practical recommendations for evidence synthesis teams, funders and commissioners, and methodologists planning similar work. Rather than reporting findings on vaccine effectiveness, the focus throughout is on the methods themselves and the lessons that emerge from their application.
As a structured reflective process evaluation, this work examines the implementation of the three syntheses by documenting and comparing processes, barriers, facilitators, and contextual factors across their production (rather than reporting outcome findings, which are available in the original LES reports) (Pawson & Tilley, 1997). To strengthen the credibility and transferability of our reflections, we organise our analysis using a Context–Mechanism–Outcome (CMO) lens: we consider the conditions under which our teams operated (context), the methodological choices and coordination strategies employed (mechanisms), and the resulting implications for the quality and sustainability of the syntheses (outcomes). This framing clarifies the scope and limits of the insights offered, particularly with respect to generalisation to other emergency contexts.
The paper begins with an overview of the three products and their transition from rapid reviews to an LES format. It then discusses the methodological components of the LESs, highlighting the challenges encountered and lessons learned at each stage. The final substantive section offers recommendations for future LES production, followed by conclusions on the broader implications of this experience.
Overview of the Three Rapid Living Evidence Syntheses
Table 1 summarises the three products forming the empirical basis for this paper. We selected these syntheses because they represent our full COVID-END vaccine effectiveness portfolio, encompass different populations and time horizons, employ varying synthesis approaches (narrative evidence profiles and meta-analysis), and were produced by different institutional teams. This diversity provides a rich basis for methodological reflection.
The syntheses were designed as complementary products addressing COVID-19 vaccine effectiveness across different populations and time frames: short-term (up to 120 days) effectiveness in adults (Iorio et al., 2022); effectiveness in children and adolescents (Flórez et al., 2023); and long-term effectiveness in adults (over 112 days) (Bacon et al., 2023). This strategic division of scope around narrowly defined questions allowed each team to develop expertise and produce its LES in a short time frame, while collectively providing comprehensive coverage of the evidence on vaccination for Canadian decision-makers.
The syntheses were produced by collaborating teams across multiple institutions, including the Health Information Research Unit (HiRU) at McMaster University, Canada; the Ottawa Knowledge Synthesis and Application Unit at the Ottawa Hospital, Canada; the Evidence and Deliberation Unit for Decision Making (UNED) at the University of Antioquia, Colombia; and the Montreal Behavioural Medicine Centre (MBMC) at the Centre Intégré Universitaire de Santé et de Services Sociaux (CIUSSS) du Nord-de-l'Île-de-Montréal, Canada. All three syntheses were disseminated through the COVID-END website on a regular update cycle, enabling decision-makers to anticipate when new evidence summaries would be available and integrate findings systematically into their deliberations. Regular coordination meetings involving all teams and PHAC representatives following each report submission created essential infrastructure for quality assurance, shared methodological learning, and maintaining alignment with decision-maker needs throughout the production period.
This work was conducted during a period of acute and sustained policy urgency, from March 2021 to March 2023, in which PHAC required evidence summaries on a recurring cycle aligned with its policy deliberation schedule, with turnaround times of days rather than weeks between search completion and report release. The methodological decisions shaped by this urgency are examined in the following section.
Transition From Rapid Review to Living Synthesis
A defining methodological decision in our experience was the transition from a standalone rapid review to a rapid LES format. Importantly, this transition was not solely a methodological choice made by the research teams; it was a decision made together with PHAC, our primary governmental knowledge user. The goal was to package information to help decision-makers understand the ‘bottom line’ for each vaccine and identify when new studies changed this assessment, particularly for variants of concern. Specifically, decision-makers needed evidence to determine whether to adjust the distribution of purchased vaccines based on locally dominant variants of concern, refine messaging to citizens and healthcare workers, and adapt public-health measures to accommodate changes in vaccine effectiveness. This policy-driven request determined the subsequent trajectory of the work. Following completion of the initial rapid reviews, PHAC indicated that ongoing, regularly updated evidence syntheses would better serve their policy deliberation needs than a series of discrete rapid reviews. Several factors informed PHAC’s request and our assessment of the feasibility of delivering the work. First, during the pandemic, the rate of new studies being published was exceptionally high. This resulted in findings from reviews becoming irrelevant within weeks of completion. Second, the evolving pandemic context continuously created new policy-relevant questions; for example, the emergence of variants of concern (Delta, subsequently Omicron), with expected differences in vaccine effectiveness among them, and the introduction of booster vaccination programmes meant that policy questions were continuously shifting. A rapid LES format could incorporate these emerging questions more responsively and efficiently than sequential rapid reviews. Third, PHAC required evidence summaries aligned with their regular policy deliberation timelines. 
Update frequencies adapted to evidence flow: some teams began with weekly or biweekly cycles when evidence was emerging most rapidly, then transitioned to a four-week cycle as publication rates stabilised (Table 1). This eventual standard four-week update facilitated the integration of synthesis findings into established decision-making processes. Finally, the existence of COVID-19-specific evidence infrastructure (alerting services, curated databases, the Vaccine Effectiveness Systematic Review Platform (VESPa) collaboration) made the sustained updating feasible.
Transitioning to rapid LESs entailed significant and concrete methodological implications. Reviews had to be reconceptualised as ongoing processes without a defined end point, requiring explicit governance for methodological change. Every adaptation, such as modifying the search strategy as variants emerged, adjusting eligibility criteria for new vaccine types, or revising certainty thresholds, required a documented decision made in consultation with PHAC to preserve consistency across versions. Data extraction templates required iterative changes as new vaccine categories, booster schedules, and Omicron sub-lineages emerged. The update schedule also created sustained capacity pressures that differed fundamentally from the time-limited intensity of a rapid review; these are addressed in the Sustainability and Reflexive Practice subsection.
Factors Informing the Choice Between a Living Evidence Synthesis and a Rapid Review
Note.
• Factors are not absolute determinants; format decisions typically involve weighing considerations across domains.
• Teams may appropriately begin with rapid reviews and transition to living synthesis as conditions evolve.
• Decision-maker requirements are often determinative and should be assessed first.
• Feasibility constraints may preclude living synthesis even when other factors favour it.
Decision-maker requirements are often determinative, as our experience illustrates, and should be assessed early. Example considerations include how frequently decision-makers convene to review evidence and whether they need to track trends over time. Methodological considerations include the rate of evidence accumulation and the stability of the research question. Living evidence syntheses are most appropriate when new studies emerge frequently enough that findings risk becoming outdated quickly, and when core questions remain consistent even as specific parameters evolve.
Feasibility constraints encompass the availability of supporting infrastructure, team capacity for sustained engagement, and funding arrangements that accommodate ongoing work. Our work depended on COVID-19-specific resources that may not exist for other topics. Funding structures matter as well: traditional grant mechanisms with defined deliverables may not align well with open-ended LES commitments, and teams should negotiate expectations with funders early. The degree to which foundational infrastructure, relationships, and methodological capacity exist before a crisis is also often a determinant of what is methodologically feasible. Investment in capacity-building during non-emergency periods plays a crucial part in enabling timely, high-quality emergency responses, particularly when such responses need to be sustained over time.
These factors are not absolute determinants. Format decisions typically involve weighing multiple considerations and may appropriately change as contexts evolve. Table 2 summarises these factors systematically, providing a reference framework for teams facing similar decisions during future emergencies.
Methodology and Lessons Learned
This section examines eight core methodological processes across the three rapid LESs. Rather than providing exhaustive procedural detail, which is documented in the published LES reports (Bacon et al., 2023; Flórez et al., 2023; Iorio et al., 2022; Shaver et al., 2023; Wu, Joyal-Desmarais, Ribeiro, et al., 2023; Wu, Joyal-Desmarais, Vieira, et al., 2023), each subsection follows a consistent structure via a CMO lens, outlining: methodological decisions; challenges encountered; and lessons learned. The eight subsections follow the sequence of a typical synthesis workflow. Where decisions in one stage shaped options in the next, these connections are noted.
Research Questions and Scope
Methodological Decisions
Research questions were developed and refined in collaboration between the synthesis teams, COVID-END, and PHAC to ensure policy relevance and feasibility. Emergency conditions made the standard protocol-first, engagement-later sequence non-viable, so this collaborative approach was necessary. This engagement shaped product scope from the outset, including the prioritisation of Health Canada-approved vaccines over all globally available vaccines, the focus on variant-specific effectiveness, and the selection of clinically meaningful outcomes (infections, severe disease and hospitalisation, and death). The decision to create three complementary products rather than a single comprehensive synthesis reflected both feasibility considerations and the recognition that different populations and time horizons warranted distinct analytical approaches, allowing each team to maintain focus and develop specialised expertise while collectively providing comprehensive coverage. Sustained engagement with PHAC also determined presentation formats used in reporting.
Communication between PHAC and the synthesis teams operated through several complementary channels. Regular structured meetings, held following each report submission and involving all synthesis teams alongside PHAC representatives, served as the primary forum for shared interpretation of findings, coordinated feedback on scope and presentation, and timely identification of emerging policy questions. These meetings were deliberately timed to follow report submissions, creating a regularity in which each completed product immediately generated policy inputs that then shaped subsequent update cycles. Between meetings, direct email communication with designated PHAC contacts provided a channel for rapid clarification of urgent requirements, for example, when a new variant of concern emerged mid-cycle and required an assessment of whether scope adjustment was warranted. Dissemination through the COVID-END website on a predictable schedule enabled PHAC to integrate evidence summaries systematically into their deliberation timelines, reducing ad hoc requests and allowing synthesis teams to plan update cycles around known decision-making windows. The public-facing nature of the website also broadened access beyond PHAC to other national and international health authorities and gave all synthesis teams ready access to one another’s outputs, supporting cross-team awareness throughout the production period.
Challenges Encountered
Two challenges characterised scope management across the production period. First, maintaining flexibility to adapt scope as the pandemic evolved (e.g., adding booster dose categories, incorporating Omicron sub-lineages, adjusting outcome prioritisation) required ongoing negotiation with decision-makers that extended well beyond initial question development. Teams had to manage these negotiations carefully to ensure that scope adaptations remained within available capacity, as the cumulative effect of individually reasonable requests risked expanding the work beyond what they could sustain across repeated production cycles.
Second, teams varied considerably in their prior readiness for this type of work. Some teams, such as HiRU, were not configured to conduct an LES prior to COVID-19 and needed team members to be redeployed from other tasks to meet the demand. Integrating teams at different levels of preparedness into a coordinated multi-product portfolio required the COVID-END network to perform an active coordination and support function beyond what a conventional commissioned review arrangement would require.
Lessons Learned
Early and sustained decision-maker engagement was a central activity that directly shaped, and led to modifications in, scope boundaries, update frequency, and presentation formats. The resulting scope flexibility must be planned for prospectively. Establishing clear governance for scope change decisions, including who initiates, who approves, how changes are documented, and when changes may be expected, reduces the burden of ongoing negotiation without foreclosing necessary adaptation. Importantly, governance structures should also include a mechanism for monitoring team capacity relative to proposed scope changes, so that individually reasonable adaptation requests are assessed against cumulative production demands before approval.
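The capacity-monitoring mechanism described above can be made concrete with a simple scope change log. The following is a minimal sketch only, not a record of how our teams or COVID-END tracked changes: the class names, fields, and the use of estimated hours per update cycle as the unit of capacity are all hypothetical assumptions introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ScopeChange:
    """One requested scope adaptation (all fields are hypothetical)."""
    description: str
    initiated_by: str
    hours_per_cycle: float            # assumed estimate of added recurring workload
    approved_by: Optional[str] = None


@dataclass
class ScopeChangeLog:
    """Tracks approved changes against a fixed per-cycle capacity."""
    capacity_hours_per_cycle: float
    baseline_hours_per_cycle: float
    approved: List[ScopeChange] = field(default_factory=list)

    def committed_hours(self) -> float:
        # Baseline workload plus every change approved so far.
        return self.baseline_hours_per_cycle + sum(
            change.hours_per_cycle for change in self.approved
        )

    def review(self, change: ScopeChange, approver: str) -> bool:
        """Approve a change only if cumulative workload stays within capacity."""
        if self.committed_hours() + change.hours_per_cycle > self.capacity_hours_per_cycle:
            return False  # individually reasonable, but cumulatively unsustainable
        change.approved_by = approver
        self.approved.append(change)
        return True


# Hypothetical usage: two requests, each modest on its own.
log = ScopeChangeLog(capacity_hours_per_cycle=100, baseline_hours_per_cycle=80)
first = log.review(ScopeChange("Add booster dose categories", "PHAC", 15), "team lead")
second = log.review(ScopeChange("Add Omicron sub-lineages", "PHAC", 10), "team lead")
```

In this illustration the second request is declined because the first has already consumed most of the remaining capacity, which is the cumulative effect the governance mechanism is meant to surface. The log also records who initiated and who approved each change.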
Pre-existing infrastructure, relationships, and methodological capacity are foundational elements as well. Coordination infrastructures can facilitate the efficient recruitment of evidence synthesis teams in emergency contexts, but this depends on teams having sufficient baseline capacity to engage in the task. When emergencies require the recruitment of teams that differ in preparedness, coordination structures further benefit from tools that help bridge capacity gaps. In our experience, COVID-END supported this through regular structured coordination meetings that allowed less-experienced teams to align their approaches with more established ones, and through cross-team sharing of methodological tools, including the adapted ROBINS-I and data extraction templates, which reduced the burden on teams building capacity from a lower baseline.
Search Strategy
Methodological Decisions
Scope decisions shaped the search approaches employed across the three products. Searches were configured in response to two contextual pressures: the need for broad and timely coverage of a rapidly expanding evidence base; and the resource constraints of operating three coordinated but independently managed products simultaneously. Two teams (LES 6 and LES 8) shared a common search strategy based on PubMed via COVID-19+ Evidence Alerts and systematic scanning of preprint servers, while LES 10 conducted independent searches using the NIH iSearch COVID-19 portfolio and EMBASE. The broader LES 10 search complemented the other teams’ approaches by helping to identify potentially relevant studies that might otherwise have been missed, with cross-team sharing and verification serving as the mechanism through which such studies were captured. Potentially relevant studies were shared across teams after each update cycle, and cross-checking with the VESPa collaboration provided external validation across all three products. PHAC’s own evidence monitoring activities, through which they occasionally flagged newly published studies, also functioned as an informal validation mechanism, providing indirect confirmation that our search coverage was capturing the literature decision-makers were encountering through their own channels. Searches were updated to include studies indexed up to two to five days before each version release date.
Challenges Encountered
Two challenges characterised search management, both shaped by the living format. First, maintaining independent searches created coordination overhead: teams needed clear protocols for sharing identified studies and for reconciling differences in search cut-off dates and update schedules (that is, the varying points in the production cycle at which teams completed their searches), so that cross-team sharing remained coherent and comparable across products. Second, the continuously evolving nature of COVID-19 evidence required ongoing search adaptation. Newly available vaccines and emerging COVID-19 variant designations were incorporated iteratively, with changes documented in version notes. Our reliance on English-language resources and COVID-19-specific databases also introduced coverage limitations that could not be fully resolved within available resources.
Lessons Learned
Coordination between teams provided an additional opportunity to compare and refine search approaches at the outset, creating an extra layer of verification beyond each team’s internal processes. Several converging indicators gave us confidence that our searches were capturing the literature relevant to decision timelines. VESPa cross-checks identified five studies across the entire project not already captured by our searches, and three of them were subsequently excluded due to critical risk of bias, indicating that eligible studies were being captured with high consistency. Substantial overlap between independently conducted searches provided further triangulation, which was a useful benefit of coordinated collaboration across teams. The optimal balance between shared search approaches (LESs 6 and 8) and independent search approaches (LES 10) depends on similarity in research scope between teams, team capacity, and policy stakes.
Study Selection
Methodological Decisions
Studies identified through the searches were screened by each team independently of the other teams, using an approach shaped by the dual demands of methodological rigour and compressed update timelines. Within each team, screening followed a pilot approach consistent with rapid review guidance (Nussbaumer-Streit et al., 2023): during the pilot stage, two reviewers independently screened at least 20% of records at the title and abstract stage, with discrepancies resolved through discussion until sufficient agreement was reached. The remaining records were screened by a single reviewer, with a second reviewer verifying all potentially eligible studies. Teams shared lists of included and excluded articles after each update cycle, and PHAC also flagged relevant articles identified through their own monitoring activities. This cross-team sharing mechanism provided a portfolio-level quality assurance function that extended beyond each team’s internal procedures.
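The pilot stage above depends on a judgement that agreement between two reviewers is "sufficient". One common way to quantify this is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below is illustrative only: it is an assumption that kappa is a suitable statistic here, not a description of what the teams computed, and the function name and pilot decisions are hypothetical.

```python
from collections import Counter


def cohens_kappa(reviewer_a, reviewer_b):
    """Cohen's kappa for two reviewers' decisions on the same records."""
    if len(reviewer_a) != len(reviewer_b) or not reviewer_a:
        raise ValueError("Both reviewers must rate the same non-empty set of records")
    n = len(reviewer_a)
    # Observed agreement: share of records with identical decisions.
    observed = sum(a == b for a, b in zip(reviewer_a, reviewer_b)) / n
    # Chance agreement: product of each reviewer's marginal proportions per category.
    counts_a, counts_b = Counter(reviewer_a), Counter(reviewer_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both reviewers used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical pilot sample of ten title/abstract decisions.
pilot_a = ["include", "exclude", "exclude", "include", "exclude",
           "exclude", "include", "exclude", "exclude", "exclude"]
pilot_b = ["include", "exclude", "include", "include", "exclude",
           "exclude", "exclude", "exclude", "exclude", "exclude"]
kappa = cohens_kappa(pilot_a, pilot_b)
```

Recording this value at each pilot, alongside the number of records screened, would supply the pilot screening agreement data that a transparency package requires. Any threshold for "sufficient" agreement would need to be set by the team in advance.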
Challenges Encountered
Two challenges were specific to the living format. First, single-reviewer screening carries an inherent risk that compounds across repeated update cycles, where small inconsistencies in eligibility interpretation can affect comparability across versions. Second, the living format introduced recurring difficulties with preprints and multiple publications from the same datasets. Matching preprints to peer-reviewed publications required careful tracking across update cycles, and research groups produced multiple papers over time from the same data, requiring consistent identification to avoid duplicate inclusion. Cross-team sharing helped, as teams occasionally identified preprint-to-publication linkages that others had missed, but the challenge recurred and added workload at each cycle.
Lessons Learned
Cross-team sharing of inclusion and exclusion decisions is a practical quality assurance mechanism that goes beyond each team’s internal screening procedures and should be built into coordination structures as a planned activity between update cycles. Single-reviewer screening risk accumulates across update cycles. As such, periodic re-piloting of eligibility criteria following substantive scope changes helps recalibrate consistency between screeners and mitigates coder drift. Preprints and multiple publications from the same datasets are predictable and recurring features of living synthesis; explicit decision rules established at project outset, specifying how preprint-to-publication linkages will be tracked, how to select among multiple reports from the same data source, and how to avoid duplicate inclusion, would reduce the recurring burden on synthesis teams.
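A decision rule of the kind recommended above can be written down explicitly, so that it is applied identically at every update cycle. The following is a minimal sketch under stated assumptions: each report carries a team-assigned key identifying its underlying data source, and the preference order (peer-reviewed over preprint, then most recent) is a hypothetical rule, not one our teams documented. A real rule would probably also key on outcome and population, since separate papers from one dataset may each contribute distinct results.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Report:
    report_id: str
    data_source: str     # hypothetical key for the underlying dataset or cohort
    peer_reviewed: bool  # False for preprints
    published: date


def select_primary_reports(reports):
    """Keep one report per data source: peer-reviewed first, then most recent."""
    primary = {}
    for report in reports:
        current = primary.get(report.data_source)
        # Tuples compare element by element, so peer-review status is decisive
        # and publication date only breaks ties.
        if current is None or (report.peer_reviewed, report.published) > (
            current.peer_reviewed, current.published
        ):
            primary[report.data_source] = report
    return primary


# Hypothetical records: a preprint, its later journal version, and an unrelated preprint.
reports = [
    Report("preprint-1", "cohort-A", False, date(2021, 6, 1)),
    Report("journal-1", "cohort-A", True, date(2021, 9, 15)),
    Report("preprint-2", "cohort-B", False, date(2021, 7, 20)),
]
primary = select_primary_reports(reports)
```

Keeping the superseded reports in the log, rather than deleting them, preserves the preprint-to-publication linkage for later cycles.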
Risk of Bias
Methodological Decisions
Studies that met the inclusion criteria were assessed for risk of bias using an adapted version of ROBINS-I (Linkins et al., 2022). The adapted tool was developed collaboratively in response to the methodological demands of vaccine effectiveness research using predominantly observational evidence under emergency conditions. Drawing on WHO guidance for evaluating COVID-19 vaccine effectiveness, one team drafted the initial version, and all teams contributed to refinement through coordination meetings. The adaptation focused on study characteristics particularly relevant to bias in vaccine effectiveness research, including study design, method for confirming vaccination status, databases used for outcome ascertainment, adjustment for prognostic factors, and accounting for calendar time. Full operational details, including domain specifications, signalling questions, and the operationalisation of critical risk of bias judgements, are provided in Supplemental File S1. While teams employed a common framework, each made population- or context-specific refinements where warranted, addressing considerations unique to paediatric populations (LES 8), long-term follow-up studies (LES 10), and variant-specific effectiveness patterns. The tool was further refined iteratively throughout the production period as teams gained experience with emerging study designs, changes in the basic understanding of virus dynamics, and variant-specific methodological challenges.
In applying this adapted tool, a single reviewer conducted initial assessments, with a second reviewer verifying all judgements. Risk of bias assessments were shared across teams for studies relevant to multiple products, providing verification and promoting consistency in judgements.
Challenges Encountered
The living format introduced a challenge particular to iterative synthesis: datasets generating multiple publications over time required risk of bias to be re-evaluated for each new paper rather than carrying forward prior judgements. Subsequent analyses from the same data source do not necessarily maintain the same methodological quality. Analytical approaches, covariate adjustments, and reported outcomes often changed between versions, and each new paper warranted independent reassessment. Studies assessed as having critical risk of bias were generally excluded from synthesis. However, limited exceptions were made when data for a particular outcome or variant of concern were sparse; such cases required careful case-by-case deliberation in which the policy need for the data was weighed against the severity of the bias concerns. This iterative re-evaluation added workload but was essential for maintaining the integrity of the synthesis.
Lessons Learned
Collaborative tool development, in which a common framework is drafted by one team and refined collectively, enabled consistency across products while preserving flexibility for context-specific adaptation. This balance was deemed more achievable through structured co-development than post-hoc harmonisation. Iterative re-evaluation of studies generating multiple publications is a predictable feature of living formats and should be treated as a planned component of update cycles. Teams should document explicit decision rules at project outset specifying when reassessment is triggered, for instance, when a preprint is subsequently published in a peer-reviewed journal, when a new paper from the same data source reports different outcomes or analytical approaches from a previously assessed version, or when a study’s follow-up period is extended in a later publication, and who is responsible for initiating it. Establishing these rules prospectively reduces the case-by-case judgement burden at each cycle and promotes consistency across reviewers and across versions.
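The reassessment triggers listed above lend themselves to an explicit checklist that any reviewer can apply in the same way. The sketch below is one hypothetical encoding of those triggers. The record fields and function name are assumptions introduced for illustration and do not correspond to fields in the adapted ROBINS-I tool.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class AssessedVersion:
    """Features of one report from a data source (hypothetical fields)."""
    peer_reviewed: bool
    outcomes: FrozenSet[str]
    analysis_approach: str
    follow_up_days: int


def reassessment_triggers(previous: AssessedVersion, new: AssessedVersion) -> List[str]:
    """Return the reasons, if any, why risk of bias should be reassessed."""
    triggers = []
    if new.peer_reviewed and not previous.peer_reviewed:
        triggers.append("preprint now published in a peer-reviewed journal")
    if new.outcomes != previous.outcomes:
        triggers.append("reported outcomes differ")
    if new.analysis_approach != previous.analysis_approach:
        triggers.append("analytical approach differs")
    if new.follow_up_days > previous.follow_up_days:
        triggers.append("follow-up period extended")
    return triggers


# Hypothetical example: a preprint later published with longer follow-up.
earlier = AssessedVersion(False, frozenset({"infection"}), "test-negative design", 90)
later = AssessedVersion(True, frozenset({"infection"}), "test-negative design", 150)
reasons = reassessment_triggers(earlier, later)
```

An empty list means none of the pre-specified triggers fired. Whether such a report still warrants reassessment is a decision for the team's rules, which should also name who is responsible for initiating it.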
Data Extraction and Analysis
Methodological Decisions
Following risk of bias assessment, data were extracted using standardised templates developed specifically for each synthesis and piloted prior to implementation, with one reviewer extracting data and a second team member reviewing for accuracy and completeness. This dual-role approach was selected to balance verification rigour against the time constraints of recurring update cycles. Teams conducted data extraction independently, with templates tailored to each product’s analytical scope and evidence base. LES 10 made an additional methodological investment in setting up automation tools. These included developing R scripts to automate data analysis and building automated tools into data extraction spreadsheets to avoid by-hand calculations as well as to prevent, identify, flag, and correct errors. This investment, which took place between the second and third update cycle, enabled the subsequent incorporation of meta-analytic methods into the LES and provided capacity to address substantially more analytical questions than manual approaches alone would have permitted.
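To illustrate the kind of automated calculation and error flagging described above, the following sketch converts a ratio estimate to vaccine effectiveness using the standard relation VE = (1 − ratio) × 100 and applies basic consistency checks to an extracted row. It is written in Python for illustration only. It is not the LES 10 R code or spreadsheet logic, and the function names, row keys, and specific checks are assumptions.

```python
def ve_from_ratio(ratio, ci_low, ci_high):
    """Convert a ratio estimate (e.g. OR, RR, HR) and its CI to VE in percent."""
    if not (0 < ci_low <= ratio <= ci_high):
        raise ValueError("Expected 0 < ci_low <= ratio <= ci_high")
    # The bounds swap: the upper ratio bound gives the lower VE bound.
    ve = (1 - ratio) * 100
    ve_low = (1 - ci_high) * 100
    ve_high = (1 - ci_low) * 100
    return ve, ve_low, ve_high


def flag_row(row):
    """Return a list of problems found in one extracted row (hypothetical keys)."""
    ve, low, high = row.get("ve"), row.get("ve_low"), row.get("ve_high")
    if ve is None or low is None or high is None:
        return ["missing estimate or interval"]
    flags = []
    if not (low <= ve <= high):
        flags.append("point estimate outside interval")
    if high > 100:
        flags.append("VE above 100%")
    return flags


# Hypothetical extracted values: an odds ratio of 0.2 (95% CI 0.1 to 0.4).
ve, ve_low, ve_high = ve_from_ratio(0.2, 0.1, 0.4)
problems = flag_row({"ve": ve, "ve_low": ve_high, "ve_high": ve_low})  # bounds entered in reverse
```

Checks of this kind run on every row at every update, so a transposed interval or a mistyped estimate is flagged at the point of entry rather than discovered during analysis.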
Challenges Encountered
Three challenges characterised data extraction and analysis. First, templates required iterative revision as the evidence base evolved. For example, booster dose categories and Omicron sub-lineage distinctions that were not anticipated at the outset required new data fields, and each revision necessitated careful version control to maintain consistency across update cycles and across teams. Second, time constraints meant that not all initially planned data elements could be extracted. For example, results available only in graphical (non-numerical) form were omitted, and not all relevant population subgroups could be examined. Third, the conditions that made automation most valuable (e.g., high update frequency, large and growing evidence volumes, and extended production timelines) were precisely the conditions that consumed available capacity and left little room for infrastructure development. LES 6 and LES 8 did not implement automation because project demands left no capacity for development work beyond core synthesis activities, despite recognition of its value.
Lessons Learned
Automation of data extraction, validation, and analysis, as implemented by LES 10, enabled substantially more analytical questions to be addressed per update cycle, reduced extraction errors, and ultimately eased the burden felt by the team. For teams employing quantitative methods across repeated updates, this type of infrastructure can further maintain analytical consistency and promote reproducibility. The tension between the value of automation and the capacity to develop it is a structural feature of living syntheses under resource constraints. Prospective resource planning should include protected time for infrastructure development, recognising that early investment reduces cumulative workload over the life of the synthesis even though it competes with immediate delivery demands.
Certainty of Evidence
Methodological Decisions
Certainty of evidence was assessed using an approach modelled after GRADE criteria adapted for the predominantly observational evidence base (Guyatt et al., 2011). A few RCTs were included across the products where they met eligibility criteria; however, the predominantly observational nature of the evidence base reflects the inherent characteristics of the vaccine effectiveness literature. The framework was developed collaboratively across teams, though application varied between products to accommodate different analytical approaches. This adaptation also reflected a practical decision-support need: a strict GRADE application to a predominantly observational evidence base would have classified nearly all findings as ‘low certainty’, leaving decision-makers without the meaningful gradients required to inform policy. The adapted criteria were designed to retain GRADE’s principles while producing useful distinctions across findings.
Two teams (LES 6 and LES 8) synthesised findings narratively, reflecting capacity constraints during the emergency production period rather than a methodological judgement about the suitability of pooling; their certainty assessments were based on the risk of bias of included studies and the direction and consistency of vaccine effectiveness estimates across studies, with vaccine effects defined in relation to a WHO-determined threshold for protection. High certainty was assigned for convergent findings across low-to-moderate risk of bias studies with consistent findings; moderate certainty for more than one study with low-to-moderate risk of bias and at least partially consistent findings; and low certainty for studies with serious risk of bias or inconsistent findings.
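One possible reading of these narrative criteria can be expressed as a small decision rule. This is a sketch under stated assumptions, not the procedure used by LES 6 or LES 8: the category labels are hypothetical, the requirement of more than one study for "high" is our interpretation of "convergent findings across studies", and the treatment of a single study as "low" is an assumption the source does not confirm. The WHO-determined threshold for protection is not modelled here.

```python
from typing import List


def narrative_certainty(risk_of_bias: List[str], consistency: str) -> str:
    """Assign a narrative certainty level to one finding.

    risk_of_bias: one label per contributing study,
                  each "low", "moderate", or "serious" (hypothetical labels).
    consistency:  "consistent", "partial", or "inconsistent" (hypothetical labels).
    """
    if "serious" in risk_of_bias or consistency == "inconsistent":
        return "low"
    if len(risk_of_bias) < 2:
        # Assumption: a single study cannot show consistency across studies.
        return "low"
    if consistency == "consistent":
        return "high"
    return "moderate"  # more than one study, at least partially consistent


print(narrative_certainty(["low", "moderate", "low"], "consistent"))
print(narrative_certainty(["low", "moderate"], "partial"))
print(narrative_certainty(["low", "serious"], "consistent"))
```

Writing the rule out in this form makes each judgement traceable to its inputs, which is useful when certainty ratings are revisited across versions.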
LES 10, which employed quantitative pooling, developed pragmatic, threshold-based criteria that incorporated volume of evidence (number of cohorts contributing to the pooled estimate), an indicator of imprecision based on the width of the pooled confidence interval, and consistency of point estimates across the contributing cohorts. These thresholds were adapted from, rather than strictly equivalent to, GRADE, and were intended to support consistent reporting across updates (Guyatt et al., 2011). Higher certainty was assigned when more cohorts contributed, the pooled estimate was relatively precise, and findings were consistent across cohorts; low certainty was assigned when fewer than four cohorts were available, the pooled estimate was imprecise, or findings were inconsistent across cohorts. This threshold-based approach was feasible for LES 10 because its analytical scope and investment in script-based automation supported quantitative pooling in a way that the broader production demands on LES 6 and LES 8 did not permit.
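A threshold-based rule of this kind might be sketched as follows. Only the minimum of four cohorts comes from the text above; the confidence interval width limit (`max_ci_width`) and the limit on the spread of point estimates (`max_spread`) are hypothetical placeholders, as is the choice of range as the consistency measure. The text does not state how levels above "low" were separated, so the sketch returns a single "higher" category.

```python
from typing import Sequence, Tuple


def threshold_certainty(
    cohort_estimates: Sequence[float],
    pooled_ci: Tuple[float, float],
    min_cohorts: int = 4,          # from the text: fewer than four cohorts gives "low"
    max_ci_width: float = 20.0,    # hypothetical imprecision threshold (percentage points)
    max_spread: float = 30.0,      # hypothetical inconsistency threshold (percentage points)
) -> str:
    """Classify certainty for one pooled vaccine effectiveness estimate."""
    if len(cohort_estimates) < min_cohorts:
        return "low"
    ci_width = pooled_ci[1] - pooled_ci[0]
    spread = max(cohort_estimates) - min(cohort_estimates)
    if ci_width > max_ci_width or spread > max_spread:
        return "low"
    return "higher"


# Illustrative inputs only; these are not results from LES 10.
print(threshold_certainty([82.0, 85.0, 79.0, 88.0], (78.0, 89.0)))
print(threshold_certainty([82.0, 85.0, 79.0], (78.0, 89.0)))
```

Exposing the thresholds as named parameters means any later revision of a threshold is a visible, recordable change, with no hidden edit to the analysis code.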
Challenges Encountered
The variation in certainty approaches, while operationally grounded, introduced a comparability challenge: certainty judgements were not directly comparable across the three syntheses. The differences in approach, narrative consistency criteria in LES 6 and LES 8 versus quantitative precision thresholds in LES 10, reflect distinct capacity and analytical contexts rather than arbitrary divergence, as LES 10 could draw on additional criteria. However, this raises legitimate questions about whether a finding assessed as “high certainty” in LES 6 carries the same meaning as one assessed as “high certainty” in LES 10. This comparability limitation was not fully resolved during the production period.
Lessons Learned
First, in multi-product living syntheses operating under differential capacity constraints, strict harmonisation of certainty frameworks may be neither feasible nor entirely desirable. What matters is that each product’s approach is internally coherent, explicitly justified, and clearly documented, so that decision-makers can interpret certainty judgements within the context of each product. Distinctions, such as between narrative certainty criteria based on the direction and consistency of effects, and quantitative threshold-based criteria based on the direction, consistency and precision of effects, should be made explicit from the outset, both in protocols and in published products. This is especially important in living formats where both approaches may operate simultaneously across related products serving the same decision-maker, and when certainty criteria may also evolve within a single product as analytical capacity develops. In our experience, LES 10 began with narrative certainty criteria before transitioning to quantitative threshold-based criteria once meta-analytic methods were implemented, illustrating that such transitions are predictable rather than exceptional. Where such transitions occur, version notes should document the change explicitly so that users comparing findings across versions are not misled by apparent shifts in certainty ratings that reflect methodological evolution rather than changes in the underlying evidence. Second, teams planning multi-product LESs should consider whether a portfolio-level minimum standard, for example, common outcome definitions and a shared floor for risk of bias eligibility, is achievable even where full harmonisation of certainty frameworks is not; such a minimum standard would improve cross-product interpretability without requiring each team to abandon the approach best suited to its analytical capacity and evidence base.
Knowledge Translation
Methodological Decisions
Once syntheses were complete for each update cycle, findings were translated for multiple audiences, including PHAC decision-makers, frontline public health practitioners, and the general public. Teams developed plain language summaries and infographics in English and French; LES 8 also produced Spanish-language materials. Public partner involvement, facilitated through the COVID-END network’s citizen-partnership programme, also informed knowledge-translation activities, including the preparation of plain-language summaries and infographics. Public partners’ involvement brought citizen perspectives into the synthesis process and helped strengthen the clarity, accessibility, and relevance of outputs beyond technical and policy audiences. Dissemination occurred through direct communication to PHAC, the COVID-END website, and social media. Presentation formats were refined iteratively based on decision-maker feedback: colour-coded certainty tables using green, yellow, and orange shading were developed to allow decision-makers to identify evidence strength at a glance, with the colour scheme revised iteratively following usability input from PHAC. Temporal effectiveness figures, showing vaccine effectiveness plotted against days since vaccination with a separate trend line for each variant, were similarly developed in response to PHAC’s booster timing decisions. LES 10 additionally incorporated an explicit policy maker section applying behavioural science techniques to bridge the gap between technical evidence summaries and actionable policy guidance, and used formal literacy and reading complexity tools to ensure the accessibility of plain language summaries.
Challenges Encountered
Two challenges were specific to the living format. First, communicating evolving certainty across update cycles, conveying not just current findings but how and why they had changed, required ongoing adaptation of summary language and visual formats at each cycle. Certainty sometimes declined as new evidence emerged, and framing such changes accessibly without undermining decision-maker confidence in the synthesis required careful attention at each version. Second, time and resource constraints systematically limited knowledge translation scope. Plain language summaries and infographics were treated as a delivery activity appended to core synthesis work rather than an integrated component, constraining both quality and reach, and the creation of these products was not equally applied at each update cycle. The colour-coded certainty tables also used a palette not formally assessed for colour-blind accessibility, and their traffic-light visual logic may imply more categorical precision between certainty levels than the underlying evidence warrants.
Lessons Learned
Knowledge translation in living synthesis is not a one-time design problem but a recurring production challenge; summary language, visual formats, and dissemination strategies must be revisited as evidence evolves, certainty shifts, and new audiences emerge. This recurring demand should be reflected in resource planning. Communicating evolving and sometimes declining certainty requires dedicated attention to framing; teams should develop explicit conventions for how version-to-version changes are described so that audiences do not interpret shifts in certainty as undermining the reliability of the synthesis itself. Citizen partner engagement, facilitated through a coordination infrastructure such as COVID-END’s citizen-partnership programme, is a practical model for strengthening the clarity, accessibility, and relevance of plain language outputs; future teams should plan and resource this engagement prospectively rather than relying on it emerging informally or being absorbed within constrained production timelines. When adopting colour-coded certainty tools, teams should use colour-blind-safe palettes and include a brief interpretive caution about the limits of traffic-light visual logic to avoid implying more categorical precision than the underlying evidence warrants. More broadly, teams and coordinating groups should attend to accessibility concerns across all knowledge translation outputs, including reading level, language access, visual format usability, and colour-blind-safe design, recognising that outputs reaching diverse public audiences carry different accessibility demands than those directed solely at technical or policy readers. These considerations should be built into the design of knowledge translation materials from the outset rather than addressed retrospectively.
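As one way to act on the colour-blind-safe recommendation, the sketch below maps certainty levels to colours from the Okabe-Ito palette, a widely used colour-blind-safe palette, and pairs every colour with a text label so that meaning never depends on colour alone. The choice of this palette, the level-to-colour mapping, and the function name are assumptions for illustration; they are not the scheme used in the syntheses.

```python
from typing import Dict

# Hex values from the Okabe-Ito colour-blind-safe palette.
OKABE_ITO: Dict[str, str] = {
    "bluish green": "#009E73",
    "yellow": "#F0E442",
    "orange": "#E69F00",
}

# Hypothetical mapping from certainty level to palette colour.
CERTAINTY_COLOURS: Dict[str, str] = {
    "high": "bluish green",
    "moderate": "yellow",
    "low": "orange",
}


def certainty_cell(level: str) -> Dict[str, str]:
    """Return the text label and fill colour for one certainty table cell."""
    colour_name = CERTAINTY_COLOURS[level]
    return {
        "label": f"{level.capitalize()} certainty",  # text label always accompanies colour
        "fill": OKABE_ITO[colour_name],
    }


print(certainty_cell("moderate"))
```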
Sustainability and Reflexive Practice
Methodological Decisions
The continuous demands of all preceding processes created sustainability pressures throughout the production period that differed fundamentally from time-limited reviews. Decisions about update frequency, role distribution, and documentation protocols were made not once at the original protocol stage but repeatedly as conditions evolved.
Challenges Encountered
Sustaining team capacity over the two-year production period presented concrete challenges. Staff turnover was recurring, particularly for teams without automation tools. Each transition required orientation and training of new members, creating workflow disruptions and risk of accumulated knowledge loss. Workload was unevenly distributed across the update cycle, concentrating demands on a small number of team members between search completion and report release. Teams managed this through informal redistribution of tasks rather than prospectively planned protocols, an arrangement that functioned adequately but depended on team goodwill and was not resilient to simultaneous pressures on multiple members. Regarding reflexivity, living synthesis demands that positionality reflection occur alongside continuous production, which proved difficult to sustain. One team (LES 10) conducted structured reflections at the beginning and end of the review period examining the positionality of individual team members (e.g., disciplinary influences, vaccination attitudes, communication choices), and of the team itself (e.g., reflections on team operations under time pressure), producing formal positionality statements. Team members found these exercises informative and indicated that more frequent reflection would have been valuable had time constraints allowed. These reflections helped highlight the background and beliefs within the team and identified that the team lacked representation from key perspectives including policy makers, frontline healthcare workers, and immunologists. However, the reflexive exercise did not extend beyond the LES 10 team itself; for instance, it did not cover interactions with PHAC, which played a key role in shaping the LES (e.g., defining scope and how certainty was conceptualised).
Lessons Learned
Operational wellness and reflexivity should be treated as distinct concerns. For operational wellness, rotating roles distributes cumulative burden, and protected infrastructure sprints between cycles address the structural tension between development value and production demands. Explicit stop rules for living mode should be negotiated with commissioners at the outset. Standardised training protocols and workflow documentation buffer against the knowledge loss that accompanies staff turnover. For reflexivity, structured reflection points should be built into production cycles, at project outset, following major scope changes, and at close, with prompts examining how sustained engagement with evidence and with decision-makers may be shaping interpretation across updates.
Recommendations
Summary of Recommendations by Audience
Note.
• Each theme connects to lessons discussed in the corresponding section of this paper.
• Some recommendations apply broadly to any living evidence synthesis effort (e.g., transparent documentation, sustainability planning); others are more context-specific (e.g., leveraging disease-specific infrastructure) and should be adapted based on available resources and decision-maker needs.
• Teams may find it useful to review recommendations across all audience columns, as actions by funders and methodologists directly affect what is feasible for synthesis teams.
Planning and Decision-Maker Engagement
Decision-maker engagement should be resourced explicitly from project outset and sustained throughout the project. Funders should resource this engagement and methodologists should develop frameworks for structured end-user consultation adaptable across contexts. Communication should be structured through regular post-submission forums, so that each completed product routinely generates the policy input that shapes the next cycle.
Infrastructure and Capacity
Teams considering an LES should realistically assess whether supporting infrastructures exist before committing to a living format. Where absent, teams should adjust update frequency expectations or consider conducting discrete rapid reviews.
Drawing on our experience, we propose that a minimum viable rapid LES requires: (1) minimum infrastructure, including at least one curated evidence feed or database relevant to the review question, and a version control system ensuring that methodological changes, scope adaptations, and eligibility decisions are documented transparently across update cycles; (2) a minimum methods package, including defined criteria for inclusion and exclusion, a piloted screening procedure with verification of potentially eligible studies, a risk of bias tool with explicit decision rules, and a change log recording what changed between versions and why; and (3) optional enhancers where capacity permits, such as parallel independent searches, script-based automation, an external cross-check mechanism to provide independent validation of coverage, and advanced knowledge translation activities.
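The three tiers of the proposed framework can be captured as a simple readiness checklist that a team runs before committing to a living format. This is an illustrative sketch: the item wording is condensed from the paragraph above, and the constant and function names are hypothetical.

```python
from typing import List, Set, Tuple

MINIMUM_INFRASTRUCTURE: Tuple[str, ...] = (
    "curated evidence feed or database",
    "version control system",
)
MINIMUM_METHODS: Tuple[str, ...] = (
    "inclusion and exclusion criteria",
    "piloted screening with verification",
    "risk of bias tool with decision rules",
    "change log",
)
OPTIONAL_ENHANCERS: Tuple[str, ...] = (
    "parallel independent searches",
    "script-based automation",
    "external cross-check mechanism",
    "advanced knowledge translation",
)


def missing_minimum(available: Set[str]) -> List[str]:
    """List required components (infrastructure and methods) not yet in place."""
    required = MINIMUM_INFRASTRUCTURE + MINIMUM_METHODS
    return [item for item in required if item not in available]


# Hypothetical team that has everything except a change log.
in_place = set(MINIMUM_INFRASTRUCTURE + MINIMUM_METHODS) - {"change log"}
print(missing_minimum(in_place))
```

An empty result would indicate that the minimum tiers are covered; optional enhancers are deliberately excluded from the check because the framework treats them as capacity-dependent.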
Team Sustainability
Rotating roles, protected infrastructure sprints between update cycles, and explicit stop rules negotiated at project outset are essential operational practices. Funders should structure awards to accommodate ongoing work and recognise that living synthesis requires sustained commitment rather than one-time resource allocation. Prospective resourcing for bespoke automation development, not just production activities, reduces cumulative workload and can increase rigour, though this investment is itself resource-dependent.
Many of the resource constraints documented in this paper arise in areas where general-purpose automation tools for evidence synthesis have advanced rapidly in recent years, and such tools could be considered. Teams planning future living synthesis should prospectively assess which workflow components are amenable to automation and which existing tools have sufficient validation evidence, rather than defaulting to manual approaches. Adoption of any automation tool in a living synthesis context requires careful attention to reproducibility, version control, and documentation of tool-specific decisions across update cycles.
Multi-Team Coordination
Regular structured meetings involving all synthesis teams and decision-maker representatives, held following each report submission, proved essential for coordination, cross-team learning, and quality assurance. Cross-team sharing of inclusion/exclusion decisions and risk of bias assessments partially offset the validity implications of streamlined single-reviewer methods. Multi-team projects planning LESs should build these coordination structures in from the outset rather than allowing them to emerge informally.
Methodological Transparency and Reflexive Practice
LESs require explicit documentation of changes between versions, including what changed, what triggered the change, and who approved it. A minimal transparency package, including a protocol snapshot, change log, search strategies, included/excluded study lists, risk of bias decisions, and extraction schema, should be treated as a standard output of living syntheses. Structured reflexivity exercises should also be built into production cycles, though these may occur at wider intervals than every update.
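A change log with the three elements named above (what changed, what triggered it, who approved it) can be kept as structured records instead of free text. The sketch below is one possible shape, using only the Python standard library; the field names, the JSON Lines output format, and the example entry are all hypothetical and do not describe records kept by the teams.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable


@dataclass(frozen=True)
class ChangeLogEntry:
    """One documented change between versions of a living synthesis."""
    version: str        # version in which the change first applies
    date: str           # ISO 8601 date the change was approved
    what_changed: str
    trigger: str
    approved_by: str    # a role or name


def to_json_lines(entries: Iterable[ChangeLogEntry]) -> str:
    """Serialise entries as JSON Lines, one entry per line, for version control."""
    return "\n".join(json.dumps(asdict(e), ensure_ascii=False) for e in entries)


# Placeholder entry for illustration; values are not from the project record.
example = ChangeLogEntry(
    version="example-v2",
    date="2000-01-01",
    what_changed="Certainty criteria changed from narrative to threshold-based",
    trigger="Meta-analytic methods implemented",
    approved_by="Team lead (placeholder)",
)
print(to_json_lines([example]))
```

Storing one entry per line keeps diffs small when the log is tracked in a version control system, so each update cycle adds lines without rewriting earlier ones.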
Our experience invites reflection on the adequacy of existing evidence synthesis standards for emergency and living synthesis contexts. Several adaptations, such as single-reviewer screening, narrative synthesis under capacity constraints, and iterative certainty threshold specification, are not clearly addressed by guidance developed primarily for conventional systematic reviews. The absence of field-level standards for version control, change documentation, and methodological evolution was a practical gap we navigated through team-level conventions. We recommend that future LES teams designate a project-management role responsible specifically for implementing documentation from the outset, treating it as a methodological activity. We hope the lessons documented here contribute to the ongoing development of Campbell standards in these areas, particularly regarding the accommodation of emergency contexts and living synthesis approaches.
Conclusions
Rapid LES represents a critical methodological advancement for providing timely, rigorous evidence during rapidly evolving situations. Our experience producing three complementary products on COVID-19 vaccine effectiveness over approximately two years demonstrates both the feasibility and value of this approach for emergency decision support, while revealing the substantial infrastructure, coordination, and sustained commitment such work requires. The multi-team coordination structure that made this work feasible was not incidental to quality but constitutive of it: isolated teams employing the same streamlined methods would lack the safeguards that coordination provides, and future teams should not assume that equivalent resources will exist for other emergencies.
This paper contributes to methodological literature by documenting practical implementation of multi-team LESs under emergency conditions, identifying critical dependencies on coordination infrastructure, and revealing gaps between existing guidance and operational realities. Priority areas for future research include optimal update frequencies, minimum methodological standards under time constraints, and sustainable resourcing models for extended production.
We acknowledge limitations in generalisability. These lessons derive from three products within a single coordination network during an unprecedented period of scientific and policy attention; applicability to other emergencies and settings warrants investigation. By documenting our processes and reflections, we aim to strengthen the evidence synthesis community’s capacity for effective emergency response, ensuring decision-makers have access to current, transparently produced evidence to inform their deliberations.
Supplemental Material
Supplemental Material - From Rapid Reviews to Rapid Living Evidence Synthesis: Lessons From COVID-19 Vaccine Effectiveness Reviews for Canadian Public Health Decision-Making
Supplemental Material for From Rapid Reviews to Rapid Living Evidence Synthesis: Lessons From COVID-19 Vaccine Effectiveness Reviews for Canadian Public Health Decision-Making by Nana Wu, Keven Joyal-Desmarais, Lori-Ann Linkins, Iván Darío Flórez, Simon L. Bacon in Campbell Systematic Reviews.
Footnotes
Acknowledgments
We gratefully acknowledge the contributions of all team members who participated in the production of the three living evidence syntheses described in this paper. For LES 6, we thank the Health Information Research Unit (HiRU) at McMaster University and the Ottawa Knowledge Synthesis and Application Unit. For LES 8, we acknowledge the Evidence and Deliberation Unit for Decision Making (UNED) at the University of Antioquia and collaborators at McMaster University. For LES 10, we thank the Montreal Behavioural Medicine Centre META Group at the CIUSSS du Nord-de-l'Île-de-Montréal. We thank the VESPa (Vaccine Effectiveness Systematic Review Platform) collaboration for cross-checking and validation support throughout the production period. We are grateful to our knowledge users at the Public Health Agency of Canada for their ongoing engagement, feedback on presentation formats, and guidance on policy-relevant questions. We also acknowledge the broader COVID-END in Canada network and McMaster Health Forum for providing the coordination infrastructure that made this multi-team collaboration possible.
Author Contributions
Conceptualisation: Nana Wu, Keven Joyal-Desmarais, and Simon L. Bacon. Methodology: Nana Wu, Keven Joyal-Desmarais, Lori-Ann Linkins, Iván Darío Flórez, and Simon L. Bacon. Writing (original draft): Nana Wu. Review & editing: Nana Wu, Keven Joyal-Desmarais, Lori-Ann Linkins, Iván Darío Flórez, and Simon L. Bacon.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The development and continued updating of the three living evidence syntheses described in this paper were funded by the Canadian Institutes of Health Research (CIHR) and the Public Health Agency of Canada (PHAC) through the COVID-19 Evidence Network to support Decision-making (COVID-END) in Canada.
Declaration of conflicting interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: The authors declare no conflicts of interest. PHAC, as the primary funder and knowledge user of the three living evidence syntheses described in this paper, was actively involved in shaping the scope of those products, including the prioritisation of outcomes, selection of vaccines for inclusion, and update frequency, through ongoing engagement with synthesis teams throughout the production period. This engagement is documented as a substantive methodological feature in the Methodology and Lessons Learned section. PHAC had no role in the design of this methodological reflection, in the interpretation of lessons reported here, or in the decision to submit this paper for publication. The conclusions and recommendations are those of the authors alone.
Supplemental Material
Supplemental material for this article is available online.
References
