Abstract
Objectives
(a) To critically appraise the quality of data submitted by sub-Saharan African (SSA) cancer registries to GLOBOCAN 2020 and (b) to compare the quality of data of the registries common to GLOBOCAN 2008 and 2020.
Design
Critical appraisal of cancer registry data quality using the Parkin and Bray framework.
Setting and Participants
GLOBOCAN 2020 cancer registry estimates for 46 countries in SSA. Forty-three registries in 31 SSA countries were identified from the GLOBOCAN 2020 supplementary documents, of which data from 28 registries in 23 SSA countries were publicly available.
Main outcome measures
Data quality for 15 variables in four domains (comparability, validity, timeliness and completeness) was appraised using the Parkin and Bray framework. Results from the appraisal of GLOBOCAN 2020 sources were compared with previous findings for GLOBOCAN 2008.
Results
Compared with GLOBOCAN 2008, GLOBOCAN 2020 country coverage had increased from 21 to 31 countries with 15 countries having no established registries. Out of a total possible score of 15 for data quality, 18 of the 28 publicly available GLOBOCAN 2020 registries fulfilled a score of 5 or more compared with seven registries in GLOBOCAN 2008. Of the 17 registries common to GLOBOCAN 2008 and 2020, nine showed an improvement in data quality.
Conclusion
Country coverage and data quality have improved since GLOBOCAN 2008; however, overall data quality and coverage remain poor. GLOBOCAN 2020 estimates should be used with caution when allocating resources.
Background
Global estimates of cancer incidence, prevalence, mortality and disability-adjusted life years lost to cancer, known as GLOBOCAN, are published by the International Agency for Research on Cancer (IARC), a branch of the World Health Organisation (WHO). 1 The GLOBOCAN 2020 report published estimates for 36 cancers coded using the International Statistical Classification of Diseases and Related Health Problems 10th Revision Codes (ICD-10). 2 According to GLOBOCAN 2020, Africa accounts for 5.7% of global cancer incidences and 7.2% of deaths. In comparison, Asia, Europe and America account for 49.3%, 22.8% and 20.9% of incidences and 58.3%, 19.6% and 14.2% of deaths, respectively. 2 GLOBOCAN 2020 forecasts that the global cancer burden will continue to rise, with the sharpest increases in incidence experienced by Asia and Africa, primarily due to increases in life expectancy at birth. 2 GLOBOCAN is used by a range of international organisations to set global health priorities, 3 so transparent and accurate reporting of quantitative estimates based on high-quality data collected and recorded by cancer registries is crucial.
Parkin and Bray, cancer epidemiologists at IARC, developed an appraisal tool to assess the quality of cancer registry data based on four domains: comparability, validity, timeliness and completeness.4,5 An earlier study used the tool to appraise the quality of sub-Saharan African (SSA) cancer registries submitting data to GLOBOCAN 2008. It found reliable comprehensive countrywide data were lacking due to limited or no population registry coverage in many countries, a strong urban bias in all countries and poor-quality data, with only seven of 26 registries fulfilling five or more of 15 data quality criteria. 3
The aim of this article is to examine whether the quality of registration data underpinning the 2020 estimates has improved since GLOBOCAN 2008 by (a) appraising the quality of cancer registry data in SSA in GLOBOCAN 2020 and (b) comparing the quality of data in SSA registries common to both GLOBOCAN 2008 and 2020 with previously reported findings for GLOBOCAN 2008.
Methods
Data sources
At the time of data collection, GLOBOCAN estimates were available for 46 countries in SSA. The 43 cancer registries listed as sources by GLOBOCAN 2020 correspond to 31 different SSA countries. There are 15 countries with no established registries and these countries rely on the population-based cancer registries of neighbouring countries for cancer estimates.2,6,7 Our request to the African Cancer Registry Network (AFCRN) for access to the 43 cancer registries was denied on the grounds that a contribution to cancer registration and review by the research committee are conditions for data sharing. We were thus limited to publicly available information from two key IARC sources: Cancer in SSA and Cancer Incidence in Five Continents Volume XI.8,9 These sources provide data on cancer incidences and registration practices. We retrieved data on 24 of the 43 registries via these publications and found a further four registries in online publications using references cited by the GLOBOCAN 2020 report.2,10–13 Despite an extensive literature search for the remaining registries on Google Scholar, PubMed, Embase and Medline (Annex 2), 15 registries could not be accessed. A total of 28 out of 43 registries were accessed for 23 countries (Figure 1). All data sources were downloaded during April 2021.

Figure 1. Data sources.
The GLOBOCAN 2008 appraisal analysed 26 registries in 21 countries; 17 of these registries were included in the current analysis. Of the nine registries from GLOBOCAN 2008 that were not included in the comparison, four were inaccessible and five contributed data to GLOBOCAN 2008 but did not contribute to GLOBOCAN 2020.
Registry populations GLOBOCAN 2020
Of 28 registries, 24 were population-based, two were hospital-based and two were pathology registries.8–13 Of the 24 population-based registries, five had national population coverage (the Gambia National Cancer Registry, Botswana National Cancer Registry, Namibia National Cancer Registry, Mauritius National Cancer Registry and France, Reunion Cancer Registry), 18 had city coverage and one had province coverage. Luanda Angola Cancer Registry and Agostinho Neto Cancer Registry (Cabo Verde) both contributed hospital registry data from their respective city. South Africa National Cancer Registry and Lomé, Togo both consisted of only pathology data.
Analysis and results
Quality of registry data
The Parkin and Bray framework (Box 1) was used to appraise the quality of available cancer registries.4,5 Each data item was given a score of 1 if fulfilled, with a maximum score of 15 for all variables across the four domains (comparability, validity, timeliness and completeness). The fulfilment of criteria was assessed based on the practices reported for each registry; these were extracted and entered into an Excel spreadsheet by an initial researcher (EA). A second researcher (AP) independently checked a random sample of the data sources relating to the registries.
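The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the spreadsheet actually used: the criterion labels follow the footnote to Table 3, while the function name `score_registry` and the example registry are hypothetical.

```python
from typing import Dict, Set, Tuple

# Criteria grouped by domain; labels follow the footnote to Table 3.
CRITERIA: Dict[str, Tuple[str, ...]] = {
    "comparability": (
        "named diagnostic code",
        "definition of diagnosis date",
        "incidence date criteria used",
        "coding of multiple primaries",
        "incidental diagnosis through screening",
    ),
    "validity": (
        "MV%",
        "DCO%",
        "re-abstracting and recoding audits",
        "demographic completeness",
        "internal consistency software",
    ),
    "timeliness": ("process time", "time until receipt"),
    "completeness": (
        "M:I ratio",
        "number/source of notifications",
        "childhood cancer incidences",
    ),
}


def score_registry(fulfilled: Set[str]) -> Dict[str, int]:
    """Give 1 point per fulfilled criterion; return per-domain and total scores."""
    known = {c for items in CRITERIA.values() for c in items}
    unknown = fulfilled - known
    if unknown:
        raise ValueError(f"Unrecognised criteria: {sorted(unknown)}")
    scores = {
        domain: sum(1 for c in items if c in fulfilled)
        for domain, items in CRITERIA.items()
    }
    scores["total"] = sum(scores.values())
    return scores


# Hypothetical registry, invented for illustration only.
example = score_registry(
    {
        "named diagnostic code",
        "MV%",
        "internal consistency software",
        "childhood cancer incidences",
    }
)
print(example)
```

Counting one point per criterion, with no weighting between domains, mirrors the equal-weight approach described in the text; the maximum possible total is 15.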
Box 1. Domains of data quality with respective quality criteria, from Bray and Parkin 4 and Parkin and Bray 5
Comparability
Details of the standardised coding system used to classify cancers were not given by 16 registries; eight registries used the ICD-O-3 classification, two used ICD-10 and two used both ICD-10 and ICD-O-3 (Table 1). For the definition of diagnosis date, only one registry named the criteria used: the Reunion registry used ENCR/IARC rules. For incidence date, four of the 28 registries used standardised criteria: Reunion used ENCR/IARC rules; Nairobi, Eastern Cape province and Harare registries used IARC/International Association of Cancer Registries (IACR) rules. For the coding of multiple primaries, only two registries used named international guidelines: Reunion used ENCR/IARC and Harare used IARC/IACR rules. No cancer registries reported the proportion of diagnoses of cancer cases from screening programmes.
Table 1. Comparability domain for 28 SSA cancer registries in GLOBOCAN 2020.
SSA: sub-Saharan African; ICD-10: International Statistical Classification of Diseases and Related Health Problems 10th Revision Codes; ENCR/IARC: European Network of Cancer Registries/International Agency for Research on Cancer; IACR: International Association of Cancer Registries.
*Reunion registry, **Nairobi, Eastern Cape Province and Harare registries, ***Harare registry.
Validity
All 28 registries gave the proportion of cancers verified via histology or cytology (morphological verification [MV]; Table 2) and almost all registries (26 out of 28) provided separate figures for males and females. The MV rate from the National Cancer Registry of South Africa (NCR-SA) and the Togo registry cannot be used to appraise validity because the dataset only consisted of pathology data.
Table 2. Validity and completeness domains for 28 SSA cancer registries in GLOBOCAN 2020.
SSA: sub-Saharan African; M:I ratio: mortality:incidence ratio; IARC: International Agency for Research on Cancer.
*M:I ratios were not reported by individual registries but were calculated and reported in the IARC publications.
Only the Ibadan cancer registry (Nigeria) conducted a re-abstracting and recoding audit (Table 2) to assess validity; however, audit results were not published. None of the SSA registries provided quantitative information to evaluate demographic completeness.
Except for Beira (Mozambique) and Togo cancer registries, all the SSA registries used reporting software to maintain electronic database consistency. CanReg versions 4 and 5 software, developed by IARC, was used by 25 out of 28 registries while the Reunion registry used software adapted from the French cancer registration system.
Completeness
It is not known whether registries reported mortality:incidence (M:I) ratios. The IARC publications calculate and report M:I ratios for three SSA countries: Reunion, Mauritius and South Africa. Reunion M:I ratios (53% males and 45% females) are similar to values from the GLOBOCAN 2012 figures for Europe (54% males and 49% females).8,9 M:I ratios for Mauritius (65% males, 44% females) and South Africa (68% males, 62% females) are higher than the European values.8,9
Of 28 registries, seven received notifications of cancer cases from multiple sources; however, none have published data on the average number of notifications received per case. 8 All registries reported on childhood cancer incidence.
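For readers unfamiliar with the measure, the M:I ratio is the number of cancer deaths divided by the number of new cancer cases registered in the same population and period, expressed here as a percentage. The sketch below shows only the arithmetic; the function name and the counts are invented for illustration and are not taken from any registry.

```python
def mi_ratio_percent(deaths: int, incident_cases: int) -> float:
    """Mortality:incidence ratio as a percentage for one population and period."""
    if incident_cases <= 0:
        raise ValueError("incident_cases must be positive")
    if deaths < 0:
        raise ValueError("deaths cannot be negative")
    return 100.0 * deaths / incident_cases


# Hypothetical counts, for illustration only.
hypothetical_deaths = 650
hypothetical_cases = 1000
ratio = mi_ratio_percent(hypothetical_deaths, hypothetical_cases)
print(f"M:I ratio: {ratio:.0f}%")
```

When the mortality data are reliable, a ratio well above that of a comparable population can indicate that incident cases are being missed, which is how the ratio serves as a completeness indicator.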
Timeliness
There are no agreed international guidelines for timeliness. Of 28 registries, 25 submitted data which had been collected more than five years previously (i.e. date of the data collection period was pre-2015) and, of these, three submitted data from before 2010.
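The five-year cut-off used above can be expressed as a simple check on the last year of each registry's data collection period. This is a sketch under stated assumptions: the reference year of 2020 is inferred from the "pre-2015" threshold, and the registry labels and years are hypothetical.

```python
from typing import Dict

REFERENCE_YEAR = 2020  # assumption: GLOBOCAN 2020 taken as the reference point


def is_older_than_five_years(last_data_year: int,
                             reference_year: int = REFERENCE_YEAR) -> bool:
    """True if the data period ended more than five years before the reference year."""
    return reference_year - last_data_year > 5


# Hypothetical end years of data collection, for illustration only.
last_years: Dict[str, int] = {"Registry A": 2009, "Registry B": 2014, "Registry C": 2016}
outdated = sorted(name for name, year in last_years.items()
                  if is_older_than_five_years(year))
print(outdated)
```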
Although registries should report the time to availability, that is, the time taken between a cancer being diagnosed and its entry on a register,4 none of the SSA registries reported this.
Overall scores and comparison between GLOBOCAN 2008 and GLOBOCAN 2020
Scores for data quality by domain and comparison with GLOBOCAN 2008 are shown in Table 3. In the GLOBOCAN 2020 analysis the three highest-scoring SSA cancer registries, Reunion Cancer Registry, Harare National Cancer Registry and Bulawayo African Cancer Registry, met nine of the 15 data quality criteria. Most registries (18 out of 28) fulfilled five or fewer criteria across the four domains.
Table 3. Time period of data submitted to GLOBOCAN 2008 and 2020 and comparison of data quality scores by registry and domain.
HBCR: Hospital-Based Cancer Registry. Comparability criteria: named diagnostic code, definition of diagnosis date, incidence date criteria used, coding of multiple primaries, incidental diagnosis through screening. Validity criteria: morphological verification rate (MV%), death certificate only rate (DCO%), re-abstracting and recoding audits, demographic completeness, and internal consistency software. Timeliness criteria: process time, time until receipt. Completeness criteria: mortality:incidence ratio (M:I ratio), number of notifications/source of notifications, and childhood cancer incidences.
There were 17 registries common to both the GLOBOCAN 2008 and 2020 analyses. Of these, 13 and 11 fulfilled five or fewer data quality criteria in 2008 and 2020, respectively. Only one registry scored nine or more out of 15 in the 2008 analysis and three in the 2020 analysis.
Overall, nine out of the 17 common registries showed an improvement in data quality between GLOBOCAN 2008 and GLOBOCAN 2020. Data comparability scores improved for three registries, remained the same for nine registries and deteriorated for five registries. In the validity domain, seven registries improved, five registries maintained the same standards of practice and three registries deteriorated. There was no change in the timeliness of the data. Of the 17 common registries, nine improved their score and seven maintained their score in the completeness domain.
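The comparison summarised above amounts to classifying each common registry's score as improved, unchanged or deteriorated between the two appraisals. A minimal sketch follows, assuming scores are held as dictionaries keyed by registry name; the function names and all scores shown are hypothetical.

```python
from collections import Counter
from typing import Dict


def classify_change(score_2008: int, score_2020: int) -> str:
    """Label the direction of change between the two appraisals."""
    if score_2020 > score_2008:
        return "improved"
    if score_2020 < score_2008:
        return "deteriorated"
    return "unchanged"


def summarise_changes(scores_2008: Dict[str, int],
                      scores_2020: Dict[str, int]) -> Counter:
    """Count change categories for registries present in both appraisals."""
    common = scores_2008.keys() & scores_2020.keys()
    return Counter(
        classify_change(scores_2008[name], scores_2020[name]) for name in common
    )


# Hypothetical scores, for illustration only.
scores_2008 = {"Registry A": 3, "Registry B": 5, "Registry C": 6}
scores_2020 = {"Registry A": 5, "Registry B": 5, "Registry C": 4, "Registry D": 7}
summary = summarise_changes(scores_2008, scores_2020)
print(dict(summary))
```

The same function can be applied to total scores or to the scores for a single domain; registries present in only one appraisal (Registry D here) are excluded from the comparison.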
Discussion
GLOBOCAN 2020 estimates of cancer prevalence, incidence and mortality for sub-Saharan Africa are based on incomplete population and country coverage. Of 46 countries covered by estimates, 31 had established registries. Coverage has improved since GLOBOCAN 2008 when there were 21 countries with cancer registries. In GLOBOCAN 2020, only five registries reported national population coverage.
For most registries, the main sources of data are hospital and pathology records. Since countries in SSA have few diagnostic and treatment facilities for cancer and these are in major cities, this will result in disproportionate registration of cancers in urban areas and failure to capture true cancer incidences nationally.
Our analysis using the Parkin and Bray criteria highlights continuing data quality concerns: most registries (18 out of 28) scored poorly for quality, meeting five or fewer of a possible 15 criteria. Nevertheless, there was a marginal improvement in the quality of registration since GLOBOCAN 2008: of the 17 registries common to the 2008 and 2020 analyses, nine improved their overall score for data quality. Of the 28 publicly available GLOBOCAN 2020 registries, 18 fulfilled five or more of 15 data quality criteria, compared with seven out of 26 in GLOBOCAN 2008. The remaining 10 registries from the GLOBOCAN 2020 analysis all scored four for data quality.
High-quality registration is contingent upon funding, functioning healthcare systems, medical records and civil registration and vital statistics.4,14 International funders have heavily invested in the control of communicable diseases and established centralised pools of financial and technical resources (e.g. PEPFAR and UNAIDS); however, similar resources do not exist for non-communicable diseases. 15 Many registries are established on a voluntary basis with little to no government funding. 8 The cost of cancer registration is relatively low: a cost analysis study of four registries (Kampala, Zimbabwe National, Nairobi and Seychelles) found that the cost per person ranged from $0.01 to $0.17 at a population level. 16 However, severe shortages in human health resources in most countries in the SSA region 17 directly impact the capacity not just for medical record keeping18,19 but also for pathology services, which are necessary for the accurate diagnosis of cancer cases. Most SSA regions have very few pathologists, ranging from 1 per 84,000 in Mauritius to 1 per 9,264,500 in Niger. 20
The majority of registries continued to score poorly in the comparability domain because they did not declare the use of standardised systems for diagnostic coding, definition of diagnosis date, incidence date and coding of multiple primaries. No registries reported the proportion of diagnoses of cancer cases from screening programmes despite established population-level cancer screening programmes in some countries.21–23
Positive findings in the validity, completeness and timeliness domains were that all registries published MV rates and the publication date of the most recent dataset, and 26 out of 28 used reporting software to maintain electronic database consistency. However, no registries reported time to availability or demographic completeness and just one registry conducted a re-abstracting and recoding audit.
Death registration systems with the cause of death medically certified are necessary for reporting of DCO and M:I ratio; however, most SSA countries do not have these.24,25 Of the three countries reporting M:I ratios, the Mauritian and South African M:I ratios are higher than expected, which suggests incomplete ascertainment of cancer incidences.
Although all registries reported childhood cancer incidences, a study of childhood cancer rates in 16 SSA registries, 15 of which were included in our analysis, found that 10 registries reported considerably lower rates of childhood cancer than expected, suggesting under-registration. 26
The Global Cancer Observatory acknowledges the limitations of the GLOBOCAN estimates, stating ‘caution must be exercised when interpreting these estimates, given the limited quality and coverage of cancer data worldwide at present, particularly in low- and middle-income countries’. 27 In response to the need for investment in cancer registries in low- and middle-income countries, IARC has launched the Global Initiative for Cancer Registry Development, with the aim of informing cancer control through improved registration practices worldwide.
Limitations
The analysis was limited to 28 out of 43 SSA registries submitted to GLOBOCAN 2020 because the remaining registries do not make their data publicly available. It is possible that the data quality of the registries that were not publicly available differed from those included in our analysis and therefore our results may not be representative of all the registries contributing to GLOBOCAN 2020.
For 24 registries, we were reliant on registry data presented in the Cancer in SSA publication. The time period of registry data included in this publication did not always correspond to the time period of registry data submitted to GLOBOCAN 2020, so it was not possible to assess whether changes in registration practices and data reporting occurred. All registry data we analysed were more recent than those submitted to GLOBOCAN 2008. Of these 24 registries, the time period of the registry data in the Cancer in SSA publication covered the time period of data submitted to GLOBOCAN 2020 for three, partially covered it for 11 and, for 10, fell completely outside the time period but was more recent than the GLOBOCAN 2008 data. A further limitation is that registries are not required to use the Parkin and Bray appraisal criteria; therefore, some items may not be reported when registry data and registration practices are published.
Conclusion
Although limited to publicly available data, our analysis of the cancer registries underpinning the GLOBOCAN 2020 estimates suggests an improvement in country coverage and data quality since 2008. However, overall data quality and coverage remain poor. Robust cancer registration data in sub-Saharan Africa are lacking; establishing these alongside civil registration and vital statistics is an urgent priority for the region. Given the lack of reliable data, GLOBOCAN estimates should be used with caution when allocating resources nationally and internationally. The challenges faced by cancer registration systems in SSA are complex and require investment within and across the health system. All cancer registry data should be publicly available and reports should include the Parkin and Bray criteria.
Supplemental Material
Supplemental material, sj-docx-1-shr-10.1177_20542704231217888 for A critical appraisal of the quality of data submitted by sub-Saharan African cancer registries to GLOBOCAN 2020 by Ereel Ayubi, Rosanna Lyus, Petra Brhlikova and Allyson M. Pollock in JRSM Open
Footnotes
Competing interests
None declared.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Ethics approval
Ethical approval was not required as the information appraised was available in the public domain and did not contain sensitive or confidential information.
Guarantor
AMP.
Contributorship
EA carried out the research and wrote the paper. RL assisted in writing the paper. AMP provided guidance on research and assisted in drafting the paper. PB provided guidance on research and assisted in drafting the paper.
Acknowledgements
We thank the African Cancer Registry Network for providing guidance on accessing sources in the public domain.
Provenance
Not commissioned; peer-reviewed by Padmanabhan Badrinath.
Supplemental material
Supplemental material for this article is available online.
