Abstract
Introduction
The COVID-19 pandemic, responsible for 15 million deaths worldwide in its first 2 years, 1 generated an unprecedented global response. This response saw a public health threat and non-pharmaceutical interventions (NPIs), such as social distancing, simple hygiene measures and travel restrictions,2–4 take centre stage in mainstream public discourse for the first time in recent history. Additionally, use of large open data resources from entities such as the World Health Organization (WHO), Google and Our World in Data (OWID)5–9 has allowed the general public access to ostensibly comparative data to analyse the trends relating to this novel threat for themselves.
As time passes, the daily relevance of NPIs and COVID-19 data dashboards is receding from public consciousness as countries return to normality. However, COVID-19 continues as an active issue across health systems.10,11
In the initial stages of this pandemic, by necessity, public health responses preceded evidence generation. As health systems faced unprecedented strain from COVID-19 spread amongst nonimmune populations, there was a recognition that evidence-based medicine was too slow to inform all decision-making. Accordingly, many clinicians called on colleagues to expand traditional evidence-based practice to rapidly include ‘practice-based evidence’, 12 whereby natural experiments largely provide the data for scientific study.
As the third noteworthy coronavirus outbreak of the past 20 years, SARS-CoV-2 will not be the last global public health challenge we will face. 13 Thus, lessons must be learned from what our national responses have achieved, and where they have fallen short. Modern programming techniques, which can automatically download, filter and rebuild freely available open data into useful visualisations, can facilitate timely and ongoing analyses of public health responses and subsequent changes in healthcare system trends.5–10 While such resources offer an opportunity to inform such approaches, there are significant caveats, including how differences in testing and reporting criteria make international comparisons difficult.14–16 However, comparisons within national boundaries, where similar definitions and health policies have been applied over time, may offer additional insights. Mature reflection on the pandemic must also be mindful of “excess mortality”, which “measures the additional deaths in a given time period compared to the number usually expected”. 17 As healthcare systems changed care pathways considerably, and public health messaging led to an understandable reluctance for many to attend healthcare settings, direct and indirect consequences of COVID-19 were, and likely continue to be, significant.
This article describes an open data, web-enabled methodology and presents key findings from a collaboration of frontline primary care, public health physicians and researchers from nine Northern Periphery and Arctic (NPA) programme 18 countries. The Northern Periphery and Arctic 2014-2020 Programme formed a cooperation between nine programme partner countries: the European Union member States of Finland, Ireland and Sweden in cooperation with Northern Ireland, Scotland, Norway, the Faroe Islands, Iceland and Greenland. While the entire territories of the latter three countries are designated as “NPA” regions, all other more populous countries are divided into more rural “NPA” and more urbanised “non-NPA” regions. For example, the western seaboard of the Republic of Ireland is designated as an NPA region, while the midlands and east of the country are designated non-NPA. The NPA programme primarily focused on remote primary healthcare settings and thus analyses of subnational data, organised into more rural NPA and more populous non-NPA regions, were of particular interest. Using open data resources, analysed and visualised on a custom-built website, as the backdrop to discussions, our remote collaborative effort aimed to identify and collate data to assist us in generating practice-based evidence.
Methods
Setting
The project team comprised core members from Ireland, Iceland, Norway and Canada responsible for the original project proposal, with other NPA-associated researchers and clinicians (from the Faroe Islands, Finland, Scotland, Sweden and Northern Ireland) invited to participate via email.
Despite geographical differences, designated NPA regions within partner countries share many common features, such as low population density, low accessibility, low economic diversity, abundant natural resources, and high impact of climate change. 18 Please see Online Appendix A for a list of NPA countries and regions.
Data sources
Demographic, geographic, COVID-19 and excess mortality data were gathered from 18 online resources (see Online Appendix B for full list of resources). While national data are available from large online databases curated by Google 6 and “Our World in Data” (OWID), 7 more detailed regional data were only available through local governmental and other official sites not included in larger databases. These latter data sources were important to facilitate more detailed analyses not available on other websites.
Data harnessing and analysis
Open data resources were harnessed using ‘R’, the open-source programming language. 19 Once the data were collated, cleaned and organised, ‘R-Shiny’, which allows users to build interactive web applications using R code, 20 was used to create a public-facing website for collaboration partners and the general public to explore. 21 Screenshots of the website and links for the current version of the website are included in Online Appendix C. Code written for this project and the data used are available in a GitHub repository. 22
To create meaningful comparisons, all cases and deaths data were standardised to a rate per 100,000 population, and granular daily data facilitated grouping by time according to user preference (by day, week, month). Excess mortality is presented as the percentage change in the national total number of deaths in each month relative to the aggregated baseline data for the period 2015-2019, drawn from the open data resources detailed in Online Appendix B.
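The standardisation described above can be illustrated with a minimal sketch. It is written in Python purely for illustration (the project's own code was written in R), the function names are hypothetical, and all figures are invented. Taking a simple mean across the baseline years is an assumption, as the exact aggregation of the 2015-2019 baseline is not specified here.

```python
from statistics import mean


def rate_per_100k(count, population):
    """Standardise a raw count (cases or deaths) to a rate per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population * 100_000


def excess_mortality_pct(observed_deaths, baseline_deaths):
    """Percentage change in observed deaths relative to a baseline figure."""
    if baseline_deaths <= 0:
        raise ValueError("baseline_deaths must be positive")
    return (observed_deaths - baseline_deaths) / baseline_deaths * 100


# Invented example: 250 cases in a region of 500,000 people.
example_rate = rate_per_100k(250, 500_000)

# Invented example: deaths recorded in the same calendar month of each
# baseline year (2015-2019). ASSUMPTION: the baseline is their simple mean.
baseline_years = [1000, 1020, 980, 1010, 990]
baseline = mean(baseline_years)

# Invented example: 1,150 deaths observed in that month during the pandemic.
example_excess = excess_mortality_pct(1150, baseline)

print(f"Rate per 100,000: {example_rate:.1f}")
print(f"Excess mortality: {example_excess:.1f}%")
```

Expressing both measures relative to a denominator (population or baseline deaths) is what allows territories of very different sizes to be placed on the same chart.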
In addition, various sections of our website were created to explain key public health concepts important for communication with the general public and other clinicians, illustrated with real-world data, at national and regional level (where available). These concepts included the importance of testing strategies and subsequent impact on data trends, varying definitions of COVID-19 deaths, illustration of some of these interrelated concepts through regional maps and explanations of case fatality proportion and percentage positivity.
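To make two of these concepts concrete, the sketch below computes case fatality proportion and percentage positivity, assuming the commonly used definitions (deaths divided by confirmed cases, and positive tests divided by total tests). It is in Python for illustration only, the function names are hypothetical, and the numbers are invented rather than taken from the project data.

```python
def case_fatality_proportion(deaths, confirmed_cases):
    """Deaths as a percentage of confirmed cases (assumed definition)."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return deaths / confirmed_cases * 100


def percent_positivity(positive_tests, total_tests):
    """Positive tests as a percentage of all tests performed (assumed definition)."""
    if total_tests <= 0:
        raise ValueError("total_tests must be positive")
    return positive_tests / total_tests * 100


# Invented scenario: the same 50 deaths, under two testing strategies.
# Narrow testing detects 1,000 cases from 5,000 tests.
narrow_cfp = case_fatality_proportion(50, 1_000)
narrow_positivity = percent_positivity(1_000, 5_000)

# Broad testing detects 4,000 cases from 80,000 tests.
broad_cfp = case_fatality_proportion(50, 4_000)
broad_positivity = percent_positivity(4_000, 80_000)

print(f"Narrow testing: CFP {narrow_cfp:.2f}%, positivity {narrow_positivity:.1f}%")
print(f"Broad testing:  CFP {broad_cfp:.2f}%, positivity {broad_positivity:.1f}%")
```

In this invented scenario the number of deaths is identical, yet the case fatality proportion falls from 5% to 1.25% simply because broader testing detects more cases, which is one reason testing strategy matters when reading these trends.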
For regional analyses within countries with available data, the Chi-squared test was employed to examine differences in proportions of case counts vs non-infected populations and death counts vs survivors across NPA and non-NPA regions.
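A minimal sketch of such a comparison, for a single 2x2 table of cases versus non-infected population in NPA and non-NPA regions, is given below. It is written in Python for illustration only (the project's analyses were run in R), the function name is hypothetical, and the counts are invented. It computes the uncorrected Pearson statistic; whether a continuity correction was applied in the original analysis is not stated here, and R's `chisq.test` applies Yates' correction to 2x2 tables by default, so results may differ slightly from that function's output.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table
    [[a, b], [c, d]]. Returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column total must be non-zero")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper-tail probability of the chi-squared
    # distribution equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Invented counts: cases and total population in each group of regions.
npa_cases, npa_population = 4_000, 500_000
non_npa_cases, non_npa_population = 45_000, 4_500_000

statistic, p_value = chi_squared_2x2(
    npa_cases, npa_population - npa_cases,              # NPA: cases, non-infected
    non_npa_cases, non_npa_population - non_npa_cases,  # non-NPA: cases, non-infected
)
print(f"chi-squared = {statistic:.1f}, p = {p_value:.3g}")
print("significant at p < .01" if p_value < 0.01 else "not significant at p < .01")
```

The same function applies to the mortality comparison by substituting death counts for cases and survivors for the non-infected population.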
Project reporting
Monthly teleconferences were held with our group of clinical experts involved in the NPA programme between September 2020 and February 2021. Scheduled monthly sessions involved the appraisal of latest data on our project website, highlighting trends in testing, case and death data across the partner territories, in addition to key public health developments and interventions ongoing in each country.
Our aim was to use open data to identify trends in COVID-19-related data that could be explained by speaking with local experts. Generation of practice-based evidence thus involved: (1) building a live, public-facing dashboard from open, comparative data from partner countries; (2) using this dashboard to stimulate discussion amongst collaborators; and (3) identifying and exploring key aspects of public health interventions in each territory, with a particular focus on the rural NPA regions in partner countries, to feed back to step 1, to benefit our collaboration and publish our insights in an open fashion.
Results
Analyses
At the national level, cases and deaths data were available for all countries.
In addition, regional case data, to facilitate dividing countries into NPA and non-NPA regions, were available for all countries, and thus appraisal of regional spread of COVID-19 cases was possible. Note that regional data were generally reported at “county” (or other large administrative geographic area) level by the countries examined.
However, regional COVID-19 deaths data were not published in an open data format by Finland and Norway, and therefore a regional breakdown of deaths was not possible for these countries.
Analysis 1: Case data
Table 1. COVID-19 cases per 100,000 population, March 2020 to February 2022 (data by NPA vs non-NPA regions where available).
* denotes significance p < .01 on Chi-squared testing examining cases vs non-infected populations across NPA and non-NPA regions.
Analysis 2: mortality data
Table 2. COVID-19 deaths per 100,000 population, March 2020 to February 2022 (data by NPA vs non-NPA regions where available).
* denotes significance p < .01 on Chi-squared testing examining death counts vs survivors in populations across NPA and non-NPA regions.
Significant differences in COVID-19 death rates were found between NPA regions and non-NPA regions in three of the four countries (Ireland, Scotland, Sweden) in the first 12 months of the pandemic. Despite persistence of significantly higher case counts across all non-NPA regions, a similar difference in death rate across NPA versus non-NPA regions only persisted in the second year of the pandemic for Scotland.
Analysis 3: excess mortality data
Excess Mortality across Northern Periphery Arctic (NPA) Countries by month (%) [Baseline period 2015-2019 inclusive].
Website creation and content
These analyses of open data provided the backdrop to a public-facing website built in R-Shiny that automatically incorporated new data as they were published. 22
Website content, covering areas such as the importance of differing testing strategies and case/death definitions, case fatality proportion, testing positivity rates, and disease trends across and within countries examined, was generated from visualisations of open data resources generated in R and minutes from our collaborative meetings.
Analysis of website traffic revealed over 4600 site visits from 1400 users from 33 countries over the period from its launch in October 2020 to February 2022.
Website contribution to collaboration insights
Main findings from NPA collaboration.
By enhancing our understanding of local and international COVID-19 trends as described above and providing a public-facing URL to share with colleagues, our website helped us to influence public health strategies and information campaigns across the EU as our group engaged with other clinicians, researchers and government stakeholders at various international fora and conferences. 34
Using key elements and screenshots of the website, largely through social media posts, we achieved widespread outreach, with over 700,000 engagements across digital platforms. 35
Initially the collaboration’s critical messages centred on the need for rapid responses, open data reporting and clear communication to mitigate SARS-CoV-2 spread, while also clarifying public health concepts like test positivity rates and why COVID-19 death definitions matter. We recognised the need for leadership to make informed decisions amidst uncertain data. The decentralised decision-making approach in countries like Norway and the Faroe Islands, involving local physicians, contrasted with centralised strategies in the UK, Sweden, and Ireland. Discussions evolved to emphasise protecting vulnerable groups, successful regional public-health autonomy, and the necessity for consistent official communication. Despite the obvious challenges, we also identified opportunities in NPA regions for remote work, re-skilling, and enhanced web-based collaboration.
Discussion
Main findings
A custom-built website allowed our collaboration space and scope to contextualise observed COVID-19 trends at regional and national level, in addition to highlighting many important elements of national responses. While engagement with our site was modest, “bite-size” social media posts based on its visualisations and our findings were popular. Having greater scope to explore concepts on the custom site meant we were not solely reliant on short social media posts, which can risk loss of message control and misinformation.36,37 In addition, the site gave our collaboration reference material and a link to send to various clinicians, researchers and policymakers interested in learning about our work.
As differing testing approaches, reporting strategies and COVID-19 death definitions exist across the countries examined, international comparisons need to be weighed carefully. 16 This is clear from examination of Table 1 and Table 2, which highlight variance in case detection and death rates. For instance, the “overall” Year 1 and Year 2 columns in Table 2 reveal that Scottish non-NPA regions experienced the highest death rates, while Table 1 shows that their case rates were relatively modest in comparison. Differences in testing strategies, case and death definitions, and the impact of national health responses underscore the complexity and nuance required when interpreting case and deaths data.
Nevertheless, our analyses suggest countries involved in our collaboration can be divided into two broad groups based on case and death rates during the first year of the pandemic in North-western Europe (March 2020 to February 2021). Ireland, Northern Ireland, Scotland and Sweden experienced higher cases and deaths from COVID-19 during this time period compared to Norway, Finland, Iceland, Greenland and the Faroe Islands, which largely seemed to favour more regional as opposed to centralised decision-making. These associations are borne out in excess mortality data also, with the four former countries experiencing greater than 10% excess mortality in the first year of the pandemic.
Regarding our rural analyses, NPA regions experienced significantly less COVID-19 in the first year of the pandemic, although the experience of countries in the second year of the pandemic (March 2021 to February 2022), both nationally and sub-nationally, was more uniform. Large case rates were seen in all regions due to a combination of maturation of testing systems and viral spread as societal restrictions were relaxed in the wake of large vaccination campaigns.
Comparison with existing literature
Widespread use of open data resources and dashboards for information purposes has been a feature of the COVID-19 pandemic.5–9 It has been suggested governments and researchers use these resources to focus more on recovery, future needs and ensuring measures are targeted for at-risk groups. 8 Involvement of frontline clinicians will assist these efforts.
Analyses of COVID-19 data across countries involved in this project mirror those of much larger studies.38,39 Whilst not perfect, excess mortality remains the best metric available to assess previous and ongoing impacts of this pandemic.39–42 A large-scale analysis of excess mortality in Europe reveals overall excess deaths to be 12% higher than the 2016-2019 average in 2020, while in 2021 the corresponding figure was 14%. 39 Comparison with our analyses suggests the countries examined in this project fared better in the second year of the pandemic compared to other European countries. This is likely explained by particularly high excess mortality in Eastern Europe in the second year of the pandemic.40,41 Regional data analyses and lower COVID-19 impacts seen in NPA regions in this project also mirror trends observed in Italy 43 and America. 44 While it might be expected that older rural populations would fare worse from a disease like COVID-19, some evidence points towards better social connectedness and “community spirit” amongst rural populations, which might lead to better protection of older community members. 45 Our analyses demonstrate that even across the limited group of countries examined, some countries’ NPA areas saw a greater protective effect than others. While the association in Northern Ireland was weak, the association in Scotland was stronger, which suggests NPA designation alone does not fully explain varying rates of COVID-19 activity seen in this project.
From a practice-based evidence perspective, our collaboration has highlighted several items relevant to ongoing COVID-19 challenges and other future significant public health issues. These are broadly in keeping with other studies in this vein, with calls for greater “synergy between the medical, political and population factors”, 46 consistent, coordinated and reliable information emanating from a trusted source, 47 acknowledgement of the importance of responding quickly and building contact tracing capacity rapidly 48 and a need to improve IT and public health generally to improve pandemic preparedness. 49
Strengths and limitations
Resources generated, chiefly the public-facing website and its featured analyses and tables, have helped our collaboration’s members over the past 3 years to formulate their assessments of their own country’s response to COVID-19, at local and international meetings. It is intended that this article and the code supplied should encourage and enable other like-minded researchers and frontline clinicians to engage in similar activities in the future.
While intra-country analyses are not immune to changes in testing and reporting strategies, regional analyses should be more robust than international comparisons as each area is subject to the same protocols at any given time. Analysis of regional data suggests there were significantly fewer cases and deaths from COVID-19 in NPA versus non-NPA regions in the first year of the pandemic. These associations, particularly for mortality, disappeared in the second year of the pandemic.
The countries involved in this project are quite a limited group, with well-developed healthcare systems, good quality data reporting, high levels of governmental trust and high vaccination rates. Thus, our findings may not reflect experiences elsewhere. In addition, NPA regions can be large and may contain both urban as well as rural regions, which can also be said of the non-NPA regions in the same countries. Our analyses are thus of a heterogeneous group and this may be reflected in the differing strength of association between NPA designation and COVID-19 impact. Given this heterogeneity, deeper analyses including examination of rurality, age-dependency, social deprivation and social capital and how they might impact on health and illness in our communities when responding to a pandemic threat are warranted. 50 This work will require granular regional data.
Analyses were affected to a small degree by some data quality issues. Case data are not available for Svalbard and Jan Mayen (NUTS region SJ), which are part of the NPA region for Norway. Regional case and death data reported in the very early stages were unreliable for Sweden, Scotland and Northern Ireland, although these soon tallied with national data, presumably due to improved reporting capabilities. From a data mapping perspective, all but one of the NPA-designated regions map directly onto boundaries that tally with regional level COVID-19 reporting. The exception was one Scottish region, UKM62 (Inverness, Nairn, Moray, Badenoch and Strathspey), which is divided between the NHS Highlands and Grampian regions.
Implications for policy and research
This article demonstrates how open data helped deliver insights, direction and a surer footing for our collaboration in the face of the new threat of COVID-19. While open data is freely available, collating it, particularly at regional level, with oversight and insight from a group of experts distributed internationally required significant effort. Standardisation of the data being collected by countries for public health reporting for illnesses like COVID-19 and in general would reduce the analytical burden of this effort, in addition to strengthening the conclusions that could be drawn from it.
Discontinuation of open data initiatives relating to COVID-19 is a concern, 51 with large open data initiatives like the Google COVID-19 Open Database (COD) and Johns Hopkins University having ceased gathering new data in late 2022/early 2023.6,9 This will hinder our ability to assess the effectiveness of COVID-19 responses and thereby impair our ability to respond appropriately to future pandemics. While excess mortality is a very important metric, we also need to be able to track ongoing vaccination efforts and COVID-19 morbidity and mortality to avoid incorrect attributions.
In addition, open data initiatives can provide insights into a country’s overall healthcare status. The absence of developing countries in the World Mortality dataset also requires attention, 30 especially considering the severe impact of COVID-19 seen in developing countries with available data. 51 Improving data collection and reporting, which will improve our understanding of how countries are faring with respect to COVID-19 and other public health issues, is critical if we are to respond in a fairer and more equitable way to future public health challenges.52,53
Finally, while “the creation and dissemination of easy-to-navigate platforms with evidence-based data” has been highlighted as a means to counter misinformation, 54 these same platforms may also introduce risk of misinterpretation from the general public, such as beliefs that COVID-19 was “no more dangerous than the flu”. 55 In addition to navigating these challenges, the scientific community should avoid creating additional data sources and silos during an “infodemic”. 56 Larger coordinated efforts of transnational cooperation based on the method described herein could be rapidly deployed and coordinated to deliver refined results to participants and the general public quickly, which may help avoid some of the mistakes and misinformation witnessed during the COVID-19 pandemic.54–56
Conclusions
Our transnational collaboration, formed at the onset of a public health emergency, was enabled by teleconferencing and open data to generate timely insights pertaining to national and regional impacts of COVID-19. These insights, presented on a public-facing website, extended our reach and influence with regard to public health messaging and assessments of our respective health services.
Ongoing appraisal of national responses in an open way is scientifically sound, given the huge efforts made by the general public worldwide in response to COVID-19 in the past 4 years. Such exercises in transparency and accountability will assist in ensuring public confidence for future pandemics, which may again require large public-enacted responses.
Ensuring ongoing availability of COVID-19 and excess mortality data, ideally at regional level, in addition to other important disease and demographic trends, is critical for ongoing evaluation of national responses and excess mortality trends. Standardising data collection and reporting systems across countries and promoting information sharing and open source data harnessing tools will improve comparisons, while strengthening future pandemic preparedness.
Supplemental Material
Supplemental Material for COVID-19 open data: An ecological study and international collaboration examining pandemic trends in Northern Periphery arctic countries by Uet Michael E. O’Callaghan, Monica Casey, Dana Pearl, Olivia Hickey, Anette Fosse, Sigurður E. Sigurðsson, David W. Savage, Katri Vehviläinen-Julkunen, Kirsi Bykachev, Anndra Parviainen, Holly Parker, Joan Condell, Gerry Leavey, Nigel Hart, Pál Weihe, Maria S. Petersen, Liam Glynn in Health Informatics Journal.
Footnotes
Acknowledgements
Thank you to all our partner country expert participants for their input and feedback into this project and particularly our forum of clinical experts from across the NPA region. The NPA COVID-19 Rapid Response group was founded by Dr David Heaney who very sadly passed away during the pandemic before he could see this and many other related projects come to fruition. We miss our friend and colleague dearly and this work is dedicated to his memory.
Authors’ contributions
MO'C, MC, DP, OH, AF, SES, DS, KVJ, KB, AP, HP, JC, GL, NH, PW, MSP & LG all contributed to the study conception and design and participated in the collaboration’s teleconferences. All (open) data sources were appraised by DP, OH and MO'C and subsequently brought for consideration by the group. Local context was provided by MO’C, MC & LG for Ireland, AF for Norway, SES for Iceland, KVJ, KB & AP for Finland, JC, GL & NH for Northern Ireland and by PW & MSP for the Faroes. Data collection, analysis and visualisation were performed by MO’C, with feedback from the group during successive teleconferences. Code was written by MO’C. The first draft of the manuscript was written by MO’C. MO'C, MC, DP, OH, AF, SES, DS, KVJ, KB, AP, HP, JC, GL, NH, PW, MSP & LG commented on previous versions of the manuscript, in addition to all reading and approving the final manuscript.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
This work was supported by the Northern Periphery Arctic (NPA) Programme COVID-19 Rapid Response Call (Project #411, Reference 304-6311-2020, August 25, 2020). The NPA Programme was an initiative funded by the European Regional Development Fund (ERDF).
Ethical statement
Supplemental Material
Supplemental material for this article is available online.
References
