Abstract
Background:
Cancer is typically a notifiable disease, with notifications captured in population-based cancer registries (PBCR) to inform public health cancer control. Despite the importance of PBCRs, a knowledge gap exists regarding the impact of novel and advanced data engineering technologies on PBCR health information, data quality and utility.
Objective:
To examine the impact of electronic reporting, machine learning and automation on PBCR data quality and utility.
Method:
A mixed-methods, participatory, performance story evaluation was conducted in 2022 to examine data quality (completeness, coverage, timeliness and efficiency) and utility (real-time dashboards and research) of health information collected between 2012 and 2021 by the New South Wales Cancer Registry (NSWCR), a PBCR in New South Wales, Australia.
Results:
A two-fold increase in cancer notifications was observed between 2012 and 2021 (n = +171,841; 103% increase). Electronic data receipt increased by 63 percentage points between 2015 and 2021 (12% to 75%), and the number of services that provided electronic data also increased during this time. Timeliness of data receipt improved between 2012 and 2021, with 87% (n = 293,544) received on time in 2021. Manual requests for missing reports decreased from 4791 to 483 (2012–2021), and data extraction and processing time decreased from 921 to 63 days (2017–2021). Utility was enhanced, supported by collaboration. Greater confidence in population-level quality improvement initiatives was reported, and an increase in research activity and functionality was observed.
Conclusion:
Electronic reporting, machine learning and automation can improve data quality, utility and cancer control capability, with collaboration remaining essential.
Implications for health information management practice:
Innovative technologies and collaboration can improve PBCRs and strengthen health care, policy, research and health system capability.
Keywords
Introduction
Population-based cancer registries (PBCRs) are the gold standard for population health cancer information (North American Association of Central Cancer Registries, 2008). They enable the collection, analysis and reporting of cancer incidence, prevalence, mortality, survival and other cancer characteristics (Parkin, 2006). Central to public health programs, PBCRs help reduce cancer surveillance inaccuracies, inform health system and service planning (Division of Cancer Control & Population Sciences, 2024), estimate future cancer burden and trends and strengthen research infrastructure (Ahmed et al., 2024; Hernandez-Boussard et al., 2020). Linked PBCR information can advance precision medicine (Lancet, 2021), improve health outcomes, enrich our understanding of health more broadly (Musa et al., 2022; Parkin, 2006) and enable health systems to achieve superior outcomes at a lower cost (Institute of Medicine, 2001).
Reporting timeliness is a major priority for registry users (Bray and Parkin, 2009), driven, in part, by the expanding utility and value of PBCRs. Novel and advanced digital data engineering technologies can improve PBCR health information management systems and the timeliness of registry reporting. Traditionally, registries have relied on manual data collection and analysis. However, these methods are costly, time-consuming, and prone to error; they deliver limited transparency and traceability and result in reporting time lags (or longer data latency periods) (Offodile et al., 2021; Sutton et al., 2020; Tangka et al., 2021). In contrast, the application of advanced digital technologies – such as cloud-based platforms, electronic pathology (ePath) reporting (Rollison et al., 2022), machine learning, natural language processing and electronic data submission portals – can help improve quality, save time and increase utility (Bi et al., 2019; Hassell et al., 2009).
While the benefits of PBCRs and advanced and novel technologies are evident (Bi et al., 2019; Hassell et al., 2009; Hernandez-Boussard et al., 2020; Jones et al., 2021; Offodile et al., 2021; Rollison et al., 2022), knowledge regarding the impact of the application of novel technologies on PBCR data quality and collaboration is limited. Commonly cited articles regarding PBCR developments have largely concentrated on artificial intelligence or machine learning regarding the prevention, detection or treatment of breast, colorectal or oesophageal cancer, with most of this research originating from the United States (US), China or Europe (Musa et al., 2022). Also, cancer registry articles have traditionally concentrated on publishing incidence, prevalence and mortality data. While these developments have been useful, a knowledge gap remains regarding the impact of data engineering advances on the quality and utility of population-based registry data. We aimed to help address this knowledge gap by examining outcomes associated with the consolidation of data submission processing from manual to electronic pathology (ePath) reporting; machine learning to extract ePath report data; and dashboard automation to allow real-time monitoring, quality assurance and data visualisation. Our a priori evaluation questions were (1) to what extent have the completeness and coverage of cancer data increased within NSW?; (2) to what extent have the NSW Cancer Registry’s (NSWCR) technological advances and innovations led to improved registry efficiency and timeliness?; and (3) to what extent have NSWCR data been used to underpin cancer control activities, in particular real-time dashboards to improve patient outcomes and research to reduce mortality and morbidity and improve quality of life?
Method
Study setting and context
Our study was conducted by the Cancer Institute New South Wales (Cancer Institute NSW) in Australia. The Cancer Institute NSW is NSW’s cancer control agency, which aims to lessen the impact of cancer across the state. NSW is the most populous state in Australia, with an anticipated increase in older adults from 17% in 2022 to 25%–27% in 2071 (Australian Bureau of Statistics, 2022). The total number of people diagnosed with cancer is estimated to increase as the NSW population grows (estimated growth is 0.4%–1.2% per annum) (Australian Bureau of Statistics, 2022). Cancer incidence is expected to remain stable (483.7 per 100,000 people living in NSW with 51,641 people living with cancer in NSW in 2024) (Cancer Institute NSW, 2024a). The Cancer Institute NSW is the custodian of the NSWCR (Ministry of Health, NSW Government, 2022), which is the first Australian population-based cancer registry to include data on cancer stage, treatment and care quality. The registry contains demographic, incidence and death details for people who were diagnosed with or treated for cancer, and/or who died with cancer, in NSW since 1972. All primary malignant neoplasms and in situ melanoma and breast cancer are included (Cancer Institute NSW, 2024c). Information submitted to the NSWCR (under legislative requirements) includes health information from public and private radiotherapy, medical oncology and pathology services, as well as from private hospitals, forensic medicine, day procedure centres and affiliated health organisations (for example, not-for-profit religious, charitable or other non-government organisations that provide health services and are recognised as part of the public health system) (Ministry of Health, NSW Government, 2022).
Data are submitted to the NSWCR under two legal mechanisms. The first is the Public Health Act 2010 (NSW), which mandates the reporting of notifiable diseases (including cancer) (Public Health Act, 2010). It is mandatory for services that diagnose cancer to notify the NSWCR; however, these notifications are often incomplete, and the NSWCR spends time following up to obtain all notifications required by law. Several of the initiatives evaluated here were introduced to address these missing mandatory reports. The second is the Cancer Institute (NSW) Act (2003), under which data are collected to assist with quality assurance and system improvement. Under this Act, increasing automation has supported engagement with facilities beyond the mandatory notifiers, extending the reach and scope of the collection of treatment and clinical data.
For the purposes of the PBCR, an eligible report is any diagnosis of cancer within the notifiable list of cancer types in the relevant policy directive. A report is missing when, despite being mandatory, it is not supplied in full; it must then be followed up by a letter (manual follow-up) to facilities and doctors to confirm the information. At the time the innovative technologies evaluated in this article were introduced, these manual requests were not automated. Clinical coding forms another important component of the NSWCR and is done by in-house clinical coders. Within the program, completion occurs when the data for a year of notification have been fully processed and are ready for reporting. Data extraction and quality assurance are done by the data collections team of the NSWCR.
The advanced and novel data engineering interventions introduced since 2012: Electronic reporting, machine learning and automation
Since 2012, the Cancer Institute NSW has been working strategically to reduce reliance on manual data collection for PBCR management and the stakeholders who provide data to the Cancer Institute NSW. Collaborative strategic improvements have been implemented to increase the ability to capture high-quality data electronically using advanced technologies, through (1) ePath reporting and natural language processing software to support digital data extraction; (2) machine learning that involved the development of cluster algorithms to group and standardise free-text fields to reduce the manual labour involved in submissions to the registry; (3) the design, development and implementation of automated Radiotherapy Management Information System dashboards to provide radiotherapy facilities with rapid-view functionality, including utilisation and treatment data; and (4) a Cancer Notification Portal enhancement to establish web-based electronic portal capabilities to facilitate the submission of cancer notifications and clinical information in an electronic format (Cancer Institute NSW, 2024b), allowing for manual methods to be replaced (Figure 1, general overview). The manual methods to be replaced by these technological advancements included the PBCR team opening postal mail, manually sorting information, scanning documents into an electronic format and manual data entry.

Figure 1. Flowchart details of the introduction of electronic reporting, machine learning and automation technologies.
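The grouping and standardising of free-text fields described above can be illustrated with a minimal sketch. The source does not specify the clustering algorithm used by the NSWCR, so the greedy string-similarity approach below, the function names, the 0.85 threshold and the facility names are all assumptions for illustration only, not the registry's method.

```python
from difflib import SequenceMatcher


def normalise(text: str) -> str:
    """Lower-case the text and replace punctuation with spaces so trivial variants compare equal."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(cleaned.split())


def cluster_free_text(values, threshold=0.85):
    """Greedy clustering: each value joins the first cluster whose representative
    is at least `threshold` similar; otherwise it starts a new cluster."""
    clusters = []  # list of (representative, [original values])
    for raw in values:
        key = normalise(raw)
        for representative, members in clusters:
            if SequenceMatcher(None, key, representative).ratio() >= threshold:
                members.append(raw)
                break
        else:
            clusters.append((key, [raw]))
    return clusters


# Hypothetical free-text facility names, not registry data.
facility_names = [
    "St. Example Hospital",
    "st example hospital",
    "St Example Hosptial",
    "Northside Pathology",
    "Northside  Pathology.",
]
clusters = cluster_free_text(facility_names)
for representative, members in clusters:
    print(representative, "<-", members)
```

In this sketch, spelling and punctuation variants of the same name fall into one group, which a coder could then map to a single standard value. A production approach would need validation against coded data.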
Evaluation study design and approach
A modified version of the “performance story report evaluation study” design was used, which includes the collection and participatory interpretation of “stories” about change (Dart and Davies, 2003). The key components of a performance story evaluation include the development of program logic, monitoring, evaluation and reporting, and improvement and adaptive management (Roughley and Dart, 2009). Performance story evaluations can be participatory or non-participatory in approach. The active involvement of program staff determines which approach is used (Roughley and Dart, 2009).
All standard components of a performance story methodology were completed for our study. An internal working group was established to determine our evaluation questions. The evaluation questions were mapped against an established program logic (supplementary online appendix) and evaluation plan. An expert panel discussed the extent to which the quantitative and qualitative evidence was adequate for assessing program outcomes. A working group summit meeting was held to consider the mixed-methods results, select stories of change and review materials for the final performance story results. Program staff were actively involved in our evaluation (meaning a participatory methodology was used). The expert panel and working group involved program staff and evaluators external to the program (n = 7). Combined, the skills of the panel and group involved registry program management, implementation and systems expertise, epidemiology, data analysis, strategic and program evaluation design and implementation and communications. The involvement of program staff was assessed as useful as the evaluation outcomes were to support ongoing decision-making regarding PBCR operational and strategic direction and a range of program outcomes, along with aspects of the health information management more broadly within the cancer control community. As is important with a participatory methodology, our major findings were communicated back to stakeholders within the cancer control community via website communications. A range of methods were used to share the findings, including the use of short presentations of findings and vox pops. Vox pops involve the sharing of concise information from key stakeholders whose opinions may in turn influence the opinions of others (Beckers, 2019).
Data sources, collection and analysis
Quantitative data
The information contained within and generated by the NSWCR was analysed for this study. Information was provided by medical record department staff and clinical coders, managers and directors (health information, nursing, data, pathology laboratories), and clinical staff (radiation oncology, oncology outpatient, cancer care, pathologists, haematologists) (Ministry of Health, NSW Government, 2022). A notification was considered “received on time” if it was received within 6 weeks of the pathology collection date, date of death or final determination of cause of death, or within 12 weeks of treatment completion or cancellation. Data were to be supplied at least quarterly, with variations between admitted and non-admitted patient data (Box 1 and Box 2).
Box 1. The quantitative dimensions examined, and associated data types/sources, analysis and period.
Box 2. NSWCR cancer notifications: Notifiers, notification items and timelines for notifications to be received by the NSWCR.
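The “received on time” rule described above can be expressed as a short check. This is a sketch under stated assumptions: the 6-week and 12-week windows come from the text, while the event-type labels, the function name, the inclusive boundary and the treatment of a receipt date earlier than the event date are hypothetical choices.

```python
from datetime import date, timedelta

# Windows from the text: 6 weeks for pathology and death notifications,
# 12 weeks for treatment notifications. The labels are hypothetical.
ON_TIME_WINDOWS = {
    "pathology": timedelta(weeks=6),
    "death": timedelta(weeks=6),
    "treatment": timedelta(weeks=12),
}


def received_on_time(event_type: str, event_date: date, received_date: date) -> bool:
    """Return True if the notification arrived within the window for its event type.

    Assumptions: the last day of the window counts as on time, and a receipt
    date earlier than the event date is treated as not on time (likely a data error).
    """
    window = ON_TIME_WINDOWS[event_type]
    return event_date <= received_date <= event_date + window


# Pathology collected on 1 March 2021: 12 April is exactly 42 days later.
print(received_on_time("pathology", date(2021, 3, 1), date(2021, 4, 12)))
print(received_on_time("pathology", date(2021, 3, 1), date(2021, 4, 13)))
print(received_on_time("treatment", date(2021, 3, 1), date(2021, 5, 20)))
```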
Data analysis
Data quality: Completeness, coverage, timeliness, efficiencies and utility
A range of items were examined to investigate completeness, coverage, timeliness, efficiencies and utility (Box 1). Descriptive statistical analysis was completed to examine changes over time.
Qualitative data
Key informant experts were purposively sampled and invited by email to participate in a short interview conducted by a member of the evaluation team. Key informants were defined as data collectors and data users who were able to provide insights into how the technological advances had (or had not) improved aspects of the NSWCR related to data quality, timeliness and acceptability during 2012–2021, or a partial period during 2012–2021. A topic guide was used (Supplemental File: 2). Probes were used to support thematic saturation. Interviews were video-recorded. Qualitative analysis involved two procedures: (1) the recording was reviewed to identify stories of change, as relevant to the program logic and evaluation questions; and (2) quotes from the interviews that captured and communicated change clearly and concisely were noted and deductively extracted for integration with the quantitative results to elaborate upon the results. Quotes that also extended knowledge and understanding regarding the area of enquiry were included (Roughley and Dart, 2009).
Sample size
No sample size was calculated for the quantitative part of the study due to population-based registry data being used. We adopted the approach that up to 10 interviews would be sufficient for our mixed-methods study. This ceiling number has been suggested as sufficient for thematic saturation with careful selection of key informants to interview and probing during interviews (Wutich et al., 2024). Key informants were defined as individuals who held specialist knowledge through their professional experience in relation to PBCR data collection, data analysis and those able to communicate intelligibly and meaningfully and in a way that would add value to the evaluation aims, not otherwise able to be addressed via the quantitative component of the evaluation.
Synthesis
The quantitative and qualitative findings were synthesised to answer each evaluation question. Brief stories of change (quotations) were integrated alongside the quantitative findings to complement, explain or extend them and to provide a cohesive performance story outcome (Roughley and Dart, 2009). In this respect, an explanatory sequential mixed-methods design was used as part of the performance story evaluation methodology (Creswell and Plano Clark, 2017). As is usual in mixed-methods evaluation, triangulation of data sources and techniques was a key analytical approach used in the study.
Ethics approval and reporting
While this study did not require ethics approval as determined by the NSW Health policy directive (Quality Improvement & Ethical Review: A Practice Guide for NSW, available from: https://www1.health.nsw.gov.au/pds/ActivePDSDocuments/GL2007_020.pdf), the evaluation was conducted in accordance with the National Health and Medical Research Council’s (NHMRC) National Statement on Ethical Conduct in Human Research. Media release consents were obtained from the interviewees. The reporting of our study fulfils internationally agreed quality reporting requirements for observational studies using routinely collected health data (von Elm et al., 2007).
Results
A total of 3,094,161 cancer notifications (from all sources) were received during 2012–2021. Seven key informants were invited for interview and all seven were interviewed. Two informants spoke to their professional experience regarding data collection, four were data users and one held a strategic role that cut across data collections, data use and data analysis. Five were employed by the Cancer Institute NSW, one was employed within the broader Ministry of Health in the state, and one was from a peak charity organisation.
Overall synthesised findings
The synthesised results were that the novel and advanced engineering tools corresponded with the NSWCR data being used more efficiently and for broader purposes in NSW-wide reporting (including novel reporting), service planning and research. Enhanced models of collaboration and feedback to support improved outcomes were evident, alongside increased confidence in population-level quality improvement initiatives, real-time surveillance activity at local and state levels, and increased NSWCR-reliant research activity and functionality.
Data quality
Completeness
In 2015, approximately 12% of pathology reports were received electronically (n = 14,605/122,232). In 2021, most pathology reports were received electronically (n = 139,353/184,888; 75%). A 63 percentage point increase in electronic reporting was observed over the seven-year period (Figure 2). Qualitative analysis revealed that collaboration towards shared goals between the PBCR team and the stakeholders that submitted notifications was critically important. A model of regular and frequent feedback that supported change management was described as useful because it supported the uptake and sustained use of the new technology.

Figure 2. Proportion of scanned and electronic pathology reports received by the NSWCR during 2015–2021.
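The distinction between a percentage-point change and a relative change in the electronic share can be checked with the counts reported above. The function and variable names below are illustrative assumptions; the counts are those given in the text.

```python
def share(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


share_2015 = share(14_605, 122_232)   # electronic / all pathology reports, 2015
share_2021 = share(139_353, 184_888)  # electronic / all pathology reports, 2021

# Percentage-point change: the difference between the two (rounded) shares.
point_change = round(share_2021) - round(share_2015)

# Relative change in the share itself, which is a different quantity.
relative_change = 100.0 * (share_2021 - share_2015) / share_2015

print(f"2015: {share_2015:.1f}%  2021: {share_2021:.1f}%")
print(f"percentage-point change (rounded shares): {point_change}")
print(f"relative change in share: {relative_change:.0f}%")
```

The rounded shares (12% and 75%) reproduce the reported 63 percentage point increase. The relative change is much larger because the 2015 baseline share was small, which is why the two measures should not be used interchangeably.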
The number of NSW radiation oncology centres submitting data electronically increased from 17 in 2019 to 39 in 2022 (129% increase). The number of NSW medical oncology centres submitting data electronically increased from 25 in 2019 to 41 in 2022 (64% increase). By 2021, nearly all metropolitan and regional pathology providers submitted data electronically (n = 18/19 providers). Nearly all private sector admitted patient facilities and day procedure clinics in NSW provided electronic cancer notifications (<1% of sites, which did not have the capacity to report electronically, continued to report via paper, explaining that the benefits of transitioning to electronic reporting did not outweigh the costs for their service). Qualitative analysis also revealed an increase in the breadth of information provided:
The most significant changes have been probably improved data quality, which has meant we’ve been able to include a greater breadth of indicators - some new indicators or new data variables that may not have been available or may not have been robust [in the past]. (Data analyser and user ID11, Agency program staff)
Coverage
Between 2015 and 2022, the electronic coverage of public radiation oncology services increased from 15 to 20 services (+33%). For private radiation oncology services, the electronic coverage of the program increased from 2 to 19 services (+850%). This represented 100% coverage of all radiation oncology services in NSW at the time of the evaluation (i.e. n = 39). In relation to medical oncology services, the electronic coverage for public medical oncology services expanded from 25 to 35 services (+40%). An increase in the electronic coverage of private medical oncology services, from zero to six, was also observed.
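The coverage changes above are relative changes from a baseline count, and can be reproduced as follows. The helper name and dictionary layout are illustrative assumptions; the service counts are those reported in the text.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, as a percentage."""
    if before == 0:
        raise ValueError("relative change is undefined when the baseline is zero")
    return 100.0 * (after - before) / before


# (before, after) counts of services reporting electronically, from the text.
coverage = {
    "public radiation oncology": (15, 20),
    "private radiation oncology": (2, 19),
    "public medical oncology": (25, 35),
    "private medical oncology": (0, 6),
}

for sector, (before, after) in coverage.items():
    if before == 0:
        print(f"{sector}: {before} -> {after} (no percentage; baseline is zero)")
    else:
        print(f"{sector}: {before} -> {after} ({percent_change(before, after):+.0f}%)")

# Public plus private radiation oncology services at the end of the period.
total_radiation_oncology = (
    coverage["public radiation oncology"][1] + coverage["private radiation oncology"][1]
)
print("radiation oncology services covered:", total_radiation_oncology)
```

The zero baseline for private medical oncology is why that change is reported as a count (zero to six) rather than as a percentage.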
Timeliness
A notification was coded as received on time if it was received within 6 weeks of the pathology collection date. The timeliness of cancer notifications improved between 2012 and 2021, with improvements observed from 2013. In 2021, an increase was observed (in comparison with 2012–2017), with 293,544 electronic notifications (87%) received on time (Figure 3). In 2017, the process of data extraction, quality assurance and reporting for the Radiotherapy Management Information System dashboard data collections was completed in 921 days (30 months); by 2021, it was completed in 63 days (2 months) (Figure 4). The time taken from the diagnosis date to the commencement of clinical coding for cancer incidence decreased from 33 months in 2015 to 19 months in 2021 (−42%). The decrease ranged from 1 to 3 months in each consecutive year, except in 2019 (as COVID-19 preparations occurred within the NSW health system), which corresponded with a temporary increase (25 months to 28 months) in clinical coding duration. Overall, a plateau in efficiencies was not evident (Figure 4).

Figure 3. Proportion of pathology report notifications received on time (defined as within 6 weeks of pathology collection date) and the proportion delayed by year during 2012–2021.

Figure 4. The number of notifications received on time from public and private sector-operated pathology laboratories (defined as within 6 weeks of pathology collection date), the number of requests to assist with data processing and the linear trend in the number of requests, by year, during 2012–2021.
The qualitative findings helped explain how the clinical coding of the cancer data was made possible, highlighting how the new technologies resulted in substantial gains:
Cancer information from the Registry comes from carefully coded cases. It’s necessarily high quality and is considered the source of truth/fact in NSW. Recently the Registry demonstrated it can also provide valuable insights. In order to track cancer incidence in a timely manner as COVID-19 brought pressure on the NSW health system, Registry staff advance coded a random selection of 100 cases per month for key cancers. These smaller, but rapidly coded samples gave us timely insights into cancer incidence in the midst of the 2020 and 2021 COVID-19 disruptions. (Data analyser, Cancer Institute NSW)
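The rapid-sampling approach described in the quotation above could be sketched as a random draw within each month and cancer type. The source does not describe how the sample was drawn, so the grouping by month and cancer type, the field names, the fixed seed and the synthetic records below are all assumptions.

```python
import random
from collections import defaultdict


def sample_cases_per_month(cases, per_month=100, seed=0):
    """Draw up to `per_month` cases at random from each (month, cancer type) group."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    groups = defaultdict(list)
    for case in cases:
        groups[(case["month"], case["cancer"])].append(case)
    return {
        key: rng.sample(members, min(per_month, len(members)))
        for key, members in groups.items()
    }


# Synthetic case records for illustration only, not registry data.
cases = (
    [{"id": f"B{i}", "month": "2020-04", "cancer": "breast"} for i in range(250)]
    + [{"id": f"L{i}", "month": "2020-04", "cancer": "lung"} for i in range(80)]
)
samples = sample_cases_per_month(cases)
for (month, cancer), chosen in samples.items():
    print(month, cancer, len(chosen))
```

Groups with fewer than 100 cases are taken in full, so the sample size per group never exceeds the number of available cases.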
Efficiencies
Between 2012 and 2021, a 90% decrease in the need for manual requests to gather missing reports was observed. In 2012, a total of 4791 requests from NSWCR were issued to gather missing reports. This was followed by an increase in the number of requests, which peaked in 2016. This peak corresponded with an increase in the volume of digital information received. Over the 10-year period the volume of digital information received on time continued to increase; however, the need for requests in relation to missing information decreased (Figure 4). Qualitative analysis revealed stories indicative of a substantial return on investment over the longer term.
Utility
Real-time cancer dashboards
Before 2020, no publicly funded facilities had a real-time dashboard. Between 2016 and 2021, 16 Radiotherapy Management Information System dashboards were produced for the facilities, and by 2021, they were implemented and shared. The average number of days to analyse data reduced from 566 to 213 (−62%). In addition, a state-wide centralised dashboard was produced in 2 minutes, versus 2 weeks of effort previously. Private facilities opted out of hosting local dashboards as they had their own dashboards. Qualitative data highlighted that the utility of the dashboards improved at local clinical levels and at a state-wide, health-service planning level:
Well now it’s fantastic as I can actually access the data myself in a dashboard... I’ve got time to interrogate it when I need it, which is most important... There’s a whole bunch of information there I use... like... it lists all private and public radiotherapy centres in NSW delivering radiotherapy treatment... and I can interrogate it by public or private, or the patients, or the councils where the patient is living. It also gives information about the machines, the numbers and their ages, of those machines used across the state, which is important for capital planning purposes and funding. And it also helps us identify whether there is any need for new or expanded radiotherapy services. (Data user, Ministry of Health staff)
Improved timeliness and accuracy of the information also enhanced collaboration between government entities. Prior to the dashboards, data were manually entered into spreadsheets and sent through to be aggregated and reports produced, with variable completion and supply observed. With the implementation of the dashboards, each measure for collection was standardised and automated, leading to these improvements in quality and timeliness. The qualitative data also revealed that the new technologies had an impact on strategic health systems monitoring and planning at the state-wide level:
A state-wide aggregate radiotherapy operational data dashboard has been developed to support the Ministry of Health in radiation service planning across NSW. Prior to the delivery of the dashboard, the data had to be extracted, and quality checked manually which impacted on data timeliness and currency. Now, data can be accessed straight away, thus providing accurate and timely information. (Data user, Ministry of Health staff)
Stories of change regarding population-level quality improvement initiatives, cancer monitoring and surveillance also emerged. The work of analysts was enabled by more timely and accurate data. Confidence increased in the quality of the NSWCR data for population-level quality improvement initiatives and reporting, including for the Reporting for Better Cancer Outcomes (RBCO) Program outputs, which present the latest information regarding NSW cancer control (Cancer Institute NSW, 2022):
New technologies in the data engineering and data collection space allows for rapid collection and quality assurance of clinical treatment data across the state. This provides for more timely and accurate data made available for the Reporting for Better Cancer Outcomes Program analysis team. Indicators that were unable to be produced previously are now being included and the confidence in the data being used has improved significantly. (Data user, Cancer Institute NSW)
The application of novel and advanced technologies was also described as enabling increased responsivity regarding local clinical, health service planning and operational needs:
In 2021, all data are received within 30 days of end of quarter and within 23 working days of the end of the calendar year reporting period. Quality assurance is completed within two days and ready for reporting use and linkages within 60 days of the end of the quarter. Since these data are used to facilitate operations, service planning and resource utilisation at the local level, CINSW now can provide the data quickly back out to the cancer services and be responsive to their local needs. (Data collector, Cancer Institute NSW)
Research
An increased use of the registry data for research studies and a broadening of the types of research supported were observed. Twenty-two prospective patient recruitment studies were conducted in 2011. This increased to 44 studies in 2019 (+100%, a doubling). Retrospective radiotherapy data were provided to ten researchers for data linkage in 2021. PBCR data also continued to inform quality improvement across the NSW health system through non-traditional research outputs. Qualitative data also showed that the gains made from the use of artificial intelligence were characterised by increased functionality and capability in relation to research activity:
For retrospective data, artificial intelligence can check whether the data for answering the research question exists. In this way, researchers can amend their research questions prior to ethics approval, thus making the research cycle more efficient. Retrospective radiotherapy data was provided to 10 researchers for linkage in 2021. (Data user, Cancer Institute NSW staff)
Key informants also shared that the technological advances offered continued value in driving quality improvements across clinical settings, services and the health system through non-traditional research outputs. These included outputs of the Reporting for Better Cancer Outcomes (RBCO) Program, which monitors and reports on the NSW cancer health system, including data reported for each local health district (region) in the state and for the Aboriginal Health and Medical Research Council membership regions throughout NSW.
Discussion
Our study has shown that novel data engineering tools can improve PBCR data quality and utility, and that collaboration remains helpful in ensuring responsivity to local needs, along with the uptake of new technologies and their sustained use. Our results are in line with existing hypotheses within the peer-reviewed literature that digital technologies can assist in improving the completeness, timeliness, accuracy and utility of population-level data, including cancer registry data (Bi et al., 2019; Hassell et al., 2009; Hernandez-Boussard et al., 2020; Jones et al., 2021; Offodile et al., 2021; Rollison et al., 2022). We contribute new knowledge as we have quantified the substantial gains that can be made for a population-health registry, including in relation to data latency periods, which is particularly important in relation to timeliness of reporting and data quality. Our qualitative results also help explain how these achievements occurred and their significance. An unexpected finding from our study is that population health digital technologies can build health system capabilities even during times of stress on health systems. This is because the gains were observed during the COVID-19 pandemic, and the NSWCR data doubled (that is, a two-fold increase) in volume over the period of the study (cancer notifications increased by 103% between 2012 and 2021). The focus on pathology reporting in our study is important given that most cancers (85%–95%) are diagnosed through investigation of human tissue reported by pathology services (Soerjomataram et al., 2022), meaning the findings may have high relevance to cancer registries internationally.
In relation to our findings regarding coverage, the number of radiation oncology services within NSW grew from 17 to 39 during the period examined. Coverage of the sector remained constant at 100%, meaning that growth in the sector was absorbed, enabled in part by the new technologies introduced. Also, our findings show that some of the gains from the introduction of novel technologies may take more than a decade to plateau, meaning that substantial gains can continue to be made over the longer term.
The most highly cited machine learning or artificial intelligence focussed research articles regarding cancer concentrate on the prevention, detection or treatment of breast, colorectal or oesophageal cancer. Most of this research originated from the USA and China, followed by Europe. Far fewer articles have originated from Australia or focus on population-based cancer registry developments beyond prevention, detection and treatment (Musa et al., 2022). While some cancer control agencies have reported on their development and piloting of cloud-based computing platforms to increase cancer surveillance activities (Jones et al., 2021), our study goes further by quantifying and demonstrating that machine learning can be used to support comprehensive data analysis, research and population-level quality improvement initiatives (e.g. the RBCO program). We show how data engineering tools can help strengthen research capabilities, health system performance and health learning systems. We demonstrate how the change from manual to digital methods may initially result in additional work; however, the increased demand (e.g. requests for digital submissions) may decrease over time, even though the volume of digital notifications may substantially increase. Our mixed-methods findings showed that collaboration between government agencies and service providers is central to the successful application of new technologies in cancer control. Most studies have not had this breadth of focus. Our study also demonstrates the value of analysing data from a period of 10 years or longer, as the gains made by the introduction of novel technologies may be sustained for more than 10 years. This is a novel finding.
Improvements in the timeliness of data receipt, along with a corresponding decrease in the need for manual requests to gather missing information and notifications, were two additional significant findings. These findings were demonstrated during the COVID-19 pandemic, which was a time of major disruption in health, including for cancer registries (Jazieh et al., 2020). The development and application of AI algorithms was a principal component investigated in this evaluation. Data were extracted in advance of clinical coding and used to monitor volumes and activity and to assess the impacts of COVID-19 and other disruptions on health-service activity across NSW. Some cancer groups were identified as having risks to detection and treatment, and NSWCR staff therefore used AI algorithms to identify and fast-track a subset of real-time cases to enable more timely surveillance. The Cancer Institute NSW also fed back to each local health district regarding their cancer rates for the three top high-volume cancers in the state (breast, colorectal and lung). This was done to provide an indication of cancer detection in 2020, by modelling data available two to three months after diagnosis. Variables included pathology, surgery and radiation therapy reports, and the number of breast screens, colonoscopies, prostate-specific antigen (PSA) tests and melanoma excisions recorded by the universal, national Medicare Benefits Schedule (MBS). Compared with full registry processing, modelled data for 2020 had >95% accuracy overall (Ahmed et al., 2024). These findings are important given that reporting delays can change the relationship between the reported disease incidence and the “true” disease incidence. Reporting delays have been found to result in an under-reporting of approximately 4% of people diagnosed with cancer who are eventually reported.
Delays of up to 22 months have been reported as standard practice in some high-income countries (Division of Cancer Control & Population Sciences, 2024). Our results support the claim that the new technologies used in the NSWCR have the potential to strengthen policy and population-level public health cancer control interventions by reducing reporting delays, including during times of major health system disruption. Our findings demonstrate that the introduction of novel technologies can correspond with substantial reductions in reporting delays, which in turn increases the utility of registry data.
Limitations
Our evaluation needs to be read in the context of some limitations, and within the context of the shifting demands on registries. A modified performance story report methodology was applied due to limited resourcing for the evaluation. This means the evaluation was conducted by the NSWCR custodian. This allowed us to draw upon the internal context, design, knowledge and expertise of the PBCR team, who are uniquely placed in their perspective on the application of registry information to facilitate and drive system change. At the same time, this limited an independent perspective. While efforts were made to minimise the potential for bias, some bias remains. The appraisal of bias is relevant to all research, and bias can never be totally eradicated. However, to mitigate this risk, the performance story methodology we used is in line with available evaluation guidelines and mixed-methods design (Dart and Davies, 2003). The conduct of the evaluation involved a range of quality indicators that help address bias (e.g. data triangulation, recording of interviews and a-priori evaluation questions). In addition, in line with the design used, the results we report are observations, and we make no claim regarding causality, noting that contributing factors could include policy and legislative impacts. Further, only a limited number of interviews were conducted with external stakeholders, and the views of frontline clinical staff or pathologists were not explored. Qualitative information is not representative of the views of a population; rather, it provides concise data regarding opinions, which may in turn influence the opinions of others, and a more in-depth understanding of the phenomenon being investigated. While the interviews did provide stories of change and of the NSWCR's success, our qualitative findings (like other similar qualitative accounts) are not generalisable across the entire NSW cancer system (Beckers, 2019).
Nevertheless, as with any qualitative enquiry, it is possible to make plausible or “logical generalisations” regarding the application of these technologies in similar scenarios.
Also of note, the quantitative data presented here did constitute population-level reporting, which is a strength of our study. In essence, our mixed-methods approach has enhanced the credibility of the study and enriched the understanding of the use of new technologies through the triangulation of data sources and types of data to produce a coherent performance story. Triangulation and integration are quality indicators in mixed-methods studies (Creswell and Plano Clark, 2017). The mixed-methods approach has also allowed for the examination of both the innovative technological aspects and an emerging finding that collaboration is key (although this could not be more fully explored because the evaluation questions were established a-priori). It provided a breadth of evaluation that would not be possible with quantitative methods alone. At the same time, a cost–benefit analysis was not completed, and a comparison of the relative costs and outcomes associated with the manual versus new technology approach is not presented here. Also, we only present a limited range of outcomes, as is the case with all studies. That acknowledged, the evaluation represents value for money and extends existing literature, which largely consists of studies relying on quantitative methods alone. Our article also provides evidence that is different to most traditional registry papers, which only examine incidence, mortality and survival rates for high-volume tumours (Musa et al., 2022). We report at a population level across tumour types.
Conclusion
With more than 700 PBCRs existing internationally, differences in cancer registration practice, completeness and data quality have been observed (Andersson et al., 2021). Our study demonstrates that the NSWCR has been at the forefront in using new and advanced digital technologies, and the registry has demonstrated the potential of these technologies for PBCRs. With a focus on improving data quality in respect to completeness, coverage, timeliness and efficiency, the NSWCR has taken an essential step in enhancing health care and outcomes for people with cancer across the state by facilitating timely and accurate PBCR data to investigate cancer burden, cancer treatments and cancer control interventions, including with respect to health disparities. Our evaluation has demonstrated that the use of electronic reporting, machine learning and automation corresponded with a substantial increase in registry data completeness and coverage and in the volume of data extraction, and with a decrease in manual requests to capture missing data. There was evidence that this enabled NSWCR data to be used for broader purposes in NSW-wide reporting, service planning, novel reporting and research, all of which have the potential to strengthen health care, policy and research involving people with cancer across NSW. Collaboration remained key to ensuring the responsivity, relevance and impact of the PBCR. Also of note, these technological innovations helped build health system capability during a major stressor on the health system (COVID-19) and a doubling in the volume of cancer cases, which occurred due to population growth.
Supplemental Material
Supplemental material, sj-docx-1-him-10.1177_18333583251370057 for Evaluating the use of new and advanced technologies in a population-based cancer registry by Brooke Stapleton, Sheena Lawrance, Penny Perry, Barbara Daveson, David Roder, Shelley Rushton and Tracey O’Brien in Health Information Management Journal
Footnotes
Acknowledgements
The contribution of the radiation and medical oncology services is acknowledged and appreciated, along with the contribution of the Cancer Institute NSW program staff, and all those that were interviewed. Andrea Lammel’s significant contribution is also acknowledged.
Authors' contributions
SR and BS conceived the evaluation idea and BS and Andrea Lammel (AL) created the design. Data collection was managed by SL and PP. PP and AL conducted quantitative analyses and AL conducted qualitative interviews. AL, BS, SL and PP were involved in drafting the original version of the report. DR edited the paper, provided input into its development, and made academic contributions to align the paper with emerging developments in cancer control. TO contributed to the interpretation of the data and significance of the study, reviewed the article for intellectual content, and provided leadership regarding the acquisition of data, and similarly to the other authors has agreed to be accountable for all aspects of the work. BD analysed, quality checked and interpreted data, led the drafting of this version of the article and added intellectual content. All authors approved the final version, with SR as the executive sponsor and BS as overall evaluation lead.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship and/or publication of this article: The Cancer Institute NSW is a statutory health corporation. The objectives of the Cancer Institute NSW are to increase the survival rate for cancer patients, reduce the incidence of cancer in the community, improve the quality of life of cancer patients and their carers, and operate as a source of expertise on cancer control for the government, health service providers, medical researchers and the general community.
Ethical approval and consent to participate
While this study did not require ethics approval as determined by the NSW Health policy directive (Quality Improvement & Ethical Review: A Practice Guide for NSW), the evaluation was conducted in accordance with the National Health and Medical Research Council’s (NHMRC) National Statement on Ethical Conduct in Human Research. Media release consents were obtained from the interviewees.
Supplemental material
Supplemental material for this article is available online.
References
