Impact of data sources and ascertainment methods on reporting paediatric genetic condition prevalence: A scoping review

Abstract

Background: Genetic conditions significantly impact health and contribute to paediatric morbidity and mortality. Despite diagnostic advancements, accurate estimation of the burden of genetic conditions remains complex. Objective: To determine how different data sources and ascertainment methods influence the reported prevalence of paediatric monogenic and chromosomal conditions in Australia and internationally. Method: Following Arksey and O’Malley’s framework for scoping reviews, a systematic search of Medline, CINAHL, Scopus and Google Scholar, supplemented by snowballing of references, identified peer-reviewed studies published between 2004 and 2024. Studies were included if they reported on at least one monogenic and/or chromosomal condition, involved children under 6 years of age, identified the data source, reported prevalence, and were conducted in Australia, New Zealand, Europe or North America. Data sources, type of case ascertainment and prevalence of genetic conditions were extracted from eligible studies. Descriptive analysis was used to summarise study characteristics, including year of publication, region, condition type, data sources and ascertainment methods. Results: Of 58 included studies, 58.6% originated in Europe, 5.2% in Australia and 77.6% were published from 2010 onwards. Overall, 34.5% examined chromosomal conditions, 29.3% monogenic conditions and 36.2% complex or mixed genetic conditions. Registries were the most common data source (62.1%), with 77.6% of studies using active case ascertainment. Main strategies included medical record abstraction (29.7%), genetic testing (27.5%) and International Classification of Diseases (ICD)-coded data (27.5%). In Australia, genetic testing and medical records yielded higher prevalence than ICD-coded data; internationally, disease-specific registries that use active ascertainment approaches reported greater prevalence than sources relying on passive ascertainment.
Conclusion: Findings highlight how data source selection and ascertainment methods influence prevalence estimates, risking under-ascertainment when relying solely on ICD-coded data. In Australian studies, disease registries were not utilised, reflecting the need to address Australia’s fragmented surveillance infrastructure by integrating Orphanet nomenclature of rare diseases (ORPHAcodes) with ICD-coded data and expanding registries. Implications for health information management practice: Strengthening national coordination, training in genetic coding, nomenclature and inheritance mechanisms, and broader workforce competency will improve prevalence estimates of genetic conditions.

Keywords

public health surveillance; international classification of diseases; hospital records; registries; health information management; genetic disease; monogenic conditions; chromosomal disorders; disease frequency; case ascertainment

Introduction

Genetic conditions, caused partially or entirely by variations in the DNA sequence, can significantly affect the growth, development and health of individuals (National Human Genome Research Institute, 2015; O’Malley and Hutcheon, 2007). These conditions are broadly categorised as chromosomal, monogenic (single gene) and multifactorial (Nussbaum et al., 2015). Advances in genomic technologies, such as next-generation sequencing, have reshaped the understanding and classification of genetic conditions, facilitating improved diagnostics and the enhanced identification of genetic variants, including incidental findings and variants of uncertain significance (Mattick et al., 2014; Stark and Scott, 2023; Stranneheim and Wedell, 2016). The introduction of genomic newborn screening has significantly expanded the range of conditions detectable early in a child’s life, supporting earlier interventions (Downie et al., 2021, 2024; Lunke et al., 2024; Lynch et al., 2024). Similarly, developments in prenatal diagnostics, such as non-invasive prenatal testing and chromosomal microarray, have further enhanced the early detection of chromosomal anomalies, thereby contributing to increased notifications in congenital anomaly registers (Hui et al., 2017; MacArthur et al., 2023).

Despite diagnostic advancements, accurately estimating the prevalence and burden of genetic conditions remains complex. Reliable prevalence data are essential for effective public health planning, resource allocation and the development of targeted interventions aimed at reducing morbidity and mortality. Globally, around 6% of children are born with a serious congenital anomaly of genetic or partially genetic origin, contributing significantly to paediatric hospitalisations (Christianson et al., 2006). While Australia reports lower rates compared to some countries, congenital anomalies still account for a substantial proportion of perinatal deaths in Australia, with over 30% attributed to genetic or congenital defects (Australian Institute of Health and Welfare, 2024).

One major challenge in quantifying the burden of genetic conditions lies in ascertainment, that is, how cases are identified and recorded in official data. Globally, genetic conditions may be captured using various methods, including medical record reviews, International Classification of Diseases (ICD)-coded administrative data, and clinical or disease-specific registries (Peng et al., 2018; Ruseckaite et al., 2023). Each method has distinct strengths and limitations. Medical record reviews provide detailed information but often use smaller samples, limiting generalisability (Dye et al., 2011; McCandless et al., 2004). ICD-coded administrative databases cover larger populations, but clinical coding practices may result in under-ascertainment or misclassification, particularly when genetic conditions are secondary diagnoses (Assareh et al., 2016; Riley et al., 2024). Clinical registries offer longitudinal tracking of specific conditions, yet their effectiveness varies due to differences in participation, scope and data-entry standards (Ruseckaite et al., 2023). Collectively, these differing methodologies and data sources can produce inconsistent prevalence figures, complicating efforts to monitor trends, plan services or compare results across regions.

Understanding disease frequency (prevalence and incidence) has significant implications. These measures not only inform the public health impact of genetic conditions but are crucial for targeted resource allocation, effective screening and the development of tailored preventive and therapeutic strategies (Abouelhoda et al., 2016; Colburn and Lapidus, 2024). Given these implications, a clear understanding of how genetic conditions are ascertained, and how their prevalence is reported, is paramount.

Aims

The purpose of this scoping review was to determine how different data sources and ascertainment methods influence the reported prevalence of paediatric monogenic and chromosomal conditions in Australia, New Zealand, Europe and North America. Specifically, the review aimed to:

Identify the key data sources used to capture monogenic and chromosomal conditions.

Examine how varying ascertainment methods shape the reported prevalence of these conditions.

Describe how disease prevalence is presented in the peer-reviewed literature.

Highlight gaps in ascertainment and notification practices and propose areas for improvement in Australia relative to international findings.

Method

Study design

A scoping review of the literature was conducted using the Arksey and O’Malley (2005) framework, supplemented by the Joanna Briggs Institute guidelines (Peters et al., 2020) and a modified Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) checklist (Tricco et al., 2018). A protocol (non-registered) outlining the objectives, inclusion criteria and planned methods was developed a priori.

Search strategy

To address the research objectives, a structured population, context and concept search strategy (Peters et al., 2020) was developed in consultation with a specialist librarian to systematically identify relevant peer-reviewed literature (Box 1). During January 2025, an independent search of the Medline, CINAHL and Scopus electronic databases was undertaken by one author (SG) using the agreed search terms. These databases were chosen for their coverage of biomedical, epidemiological and health services research. Google Scholar was also utilised; however, due to the high number of results, only the first 200 entries were considered. Reference lists of literature that met the inclusion criteria were manually searched (SG) to identify additional relevant publications. Titles that included any of the following terms were assessed for suitability for inclusion: “prevalence,” “frequency,” or “burden.”

Box 1.

Key search terms.

Concept 1 – population	Concept 2 – context	Concept 3 – concept
genetic disease* OR genetic disorder* OR genetic condition* OR monogenic disease* OR mendelian disorder* OR single gene disorder* OR chromosom* disorder* OR congenital malformation* OR congenital anomal*	Hospit* OR P?ediatric hospital OR Hospital admission* OR medical record* OR Clinical documentation OR Electronic medical record OR Population database OR Administrative database OR clinical registr* OR disease registr* OR population registr* OR Patient registr* OR reporting system OR notification system OR surveillance system OR hospital based ascertainment OR ascertainment method OR case ascertainment OR international classification of disease* OR ICD	frequency OR disease frequency OR magnitude OR health burden OR disease burden OR public health impact OR incidenc* OR prevalen* OR epidemiolog*

* Used to retrieve variations on a distinctive word stem or root (e.g. chromosom* = chromosome and/or chromosomal).
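As an illustration of how the three concept blocks in Box 1 might be assembled into a single query, the sketch below joins terms within a concept using OR and joins the concepts using AND. The AND combination across concepts, the quoting of multi-word phrases and the helper name `build_query` are assumptions made for illustration; the exact syntax submitted to each database is not reported here, and only a subset of the Box 1 terms is shown.

```python
def build_query(concepts):
    """Join terms within each concept with OR, then join the concepts with AND."""
    blocks = []
    for terms in concepts:
        # Quote multi-word phrases so they are treated as a single unit.
        quoted = [f'"{term}"' if " " in term else term for term in terms]
        blocks.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(blocks)


# Subset of the Box 1 terms, one list per concept (population, context, concept).
concepts = [
    ["genetic disease*", "monogenic disease*", "congenital anomal*"],
    ["medical record*", "disease registr*", "case ascertainment", "ICD"],
    ["prevalen*", "incidenc*", "disease burden"],
]

query = build_query(concepts)
print(query)
```

Wildcard and phrase handling differ between Medline, CINAHL and Scopus, so a string built this way would usually need adjusting for each interface.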

Selection of genetic conditions

The foci of this scoping review were chromosomal and monogenic conditions (Box 2), rather than all genetic conditions, as advances in disease knowledge have blurred the distinction between genetic and non-genetic conditions (Dye et al., 2011). Multifactorial conditions fall outside the scope of the review, as they involve non-Mendelian inheritance, with traits influenced by both genetic and environmental factors (Nussbaum et al., 2015). To confirm that conditions met the criteria, inheritance patterns were verified using Online Mendelian Inheritance in Man (OMIM, 2025), Orphanet (2024a) and supplementary material from Gjorgioski et al. (2020). OMIM (2025) is a continuously updated, comprehensive database of human genes and genetic phenotypes, focusing particularly on the relationship between phenotype and genotype. Orphanet (2024a) is a European-based reference portal that provides detailed information on rare diseases, including their clinical presentation, inheritance patterns and classification.

Box 2.

Classification and definition of monogenic and chromosomal conditions.

Inheritance	Definition	Condition example
Chromosomal	A disorder caused by an abnormal chromosome constitution in which there is duplication, loss or rearrangement of chromosomal material (Nussbaum et al., 2015)	Trisomy 21, chromosome deletions
Monogenic conditions	Conditions caused by mutations in a single gene, following a Mendelian inheritance pattern (Nussbaum et al., 2015)	Varies by inheritance pattern (see below)
Autosomal recessive	A condition in which both copies of the gene in each cell have mutations. Carriers are typically asymptomatic, and both parents of an affected individual typically carry one mutated copy of the gene. Autosomal recessive disorders are usually not seen in every generation of an affected family (National Library of Medicine, 2021)	Cystic fibrosis, thalassemia, sickle cell anaemia
Sex (X) linked	A disorder that results from the presence of a mutated gene on the X chromosome. X-linked disorders can have either a dominant or recessive inheritance (Nussbaum et al., 2015)	Duchenne muscular dystrophy, haemophilia
Autosomal dominant	A condition in which one mutated copy of the gene in each cell is sufficient to cause disease. Affected individuals usually inherit the mutation from an affected parent, though new mutations can also occur. These disorders often appear in every generation of an affected family (National Library of Medicine, 2021)	Crouzon syndrome, neurofibromatosis type 1

Eligibility criteria

Peer-reviewed articles published between 2004 and 2024, inclusive, were considered to ensure a focus on the relevant contemporary literature. No language restrictions were applied initially; however, only studies available in English proceeded to full-text screening. To be eligible for inclusion, studies needed to report on at least one monogenic or chromosomal condition. This could be through individual analysis, grouping under categories such as “monogenic” or “chromosomal,” or as part of broader research on rare diseases or birth defects, even if not all conditions met strict genetic definitions. Studies that did not clearly distinguish genetic conditions from broader birth defects (regardless of genetic contribution) were excluded from the review. Furthermore, studies had to include data on children aged under 6 years, as many genetic and congenital conditions are diagnosed beyond infancy and emerge during early childhood (Bower et al., 2010; Gibson et al., 2016). Studies involving broader age groups, including older children or adults, were included only if they reported extractable data specific to children under 6. This criterion ensured the review focused on prevalence and ascertainment during the early years of life. To maximise the comparability and applicability of findings to the Australian context, only studies from Australia, New Zealand, Europe and North America were included. New Zealand, Europe and North America have socio-economic conditions, healthcare systems and demographic characteristics sufficiently similar to those of Australia (Organisation for Economic Co-operation and Development, 2019) to enhance the relevance and transferability of the review’s conclusions.

In addition, research focusing on specific patient groups or subpopulations was excluded, as such studies do not reflect general population prevalence. For instance, a study examining obstructive sleep apnoea in individuals with Down syndrome was not included because it did not represent the overall prevalence of the genetic condition itself. Finally, to align with the study’s objectives, articles were required to report prevalence or frequency estimates. Studies not meeting these criteria were excluded. The screening criteria are summarised in Box 3.

Box 3.

Inclusion and exclusion criteria.

Inclusion

• Reports at least one monogenic and/or chromosomal disorder
• Population under study comprises children <6 years of age
• Identifies the source of data (e.g. health records, national registries, etc.)
• Quantitatively reports on prevalence and/or incidence
• The study was undertaken in Australia, New Zealand, Europe or North America
• Peer-reviewed literature
• Published in English

Exclusion

• Studies that do not distinguish between genetic disorders and all rare diseases or birth defects (with or without genetic contribution)
• Studies focusing solely on diseases without a genetic or inherited component
• Studies focusing on associated conditions alongside genetic diagnoses
• Reviews and meta-analyses
• Protocols
• Editorials
• Letters to the editor; short communications; bulletins
• Reports; other monographs
• Grey literature
• Conference abstracts and conference full papers
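
The criteria in Box 3 can be read as a conjunction of checks applied to each record. The sketch below is only an illustration of that logic: the `Record` structure, its field names and the publication-type labels are hypothetical, and in the review itself screening was carried out by human reviewers in Covidence rather than by code. Studies spanning several regions would also need a richer representation than the single `region` field assumed here.

```python
from dataclasses import dataclass, replace

ELIGIBLE_REGIONS = {"Australia", "New Zealand", "Europe", "North America"}
# Hypothetical labels for the publication types excluded in Box 3.
EXCLUDED_TYPES = {
    "review", "meta-analysis", "protocol", "editorial", "letter",
    "report", "grey literature", "conference paper",
}


@dataclass(frozen=True)
class Record:
    year: int
    region: str
    publication_type: str
    peer_reviewed: bool
    english: bool
    monogenic_or_chromosomal: bool   # at least one such condition reported
    data_for_under_6_years: bool     # extractable data for children <6 years
    data_source_identified: bool
    prevalence_or_incidence: bool    # reported quantitatively


def is_eligible(r: Record) -> bool:
    """Return True only if every inclusion criterion is met."""
    return (
        2004 <= r.year <= 2024
        and r.region in ELIGIBLE_REGIONS
        and r.publication_type not in EXCLUDED_TYPES
        and r.peer_reviewed
        and r.english
        and r.monogenic_or_chromosomal
        and r.data_for_under_6_years
        and r.data_source_identified
        and r.prevalence_or_incidence
    )


# Illustrative records, not drawn from the included studies.
example = Record(2015, "Europe", "original research", True, True, True, True, True, True)
as_review = replace(example, publication_type="review")
too_early = replace(example, year=2001)

print(is_eligible(example), is_eligible(as_review), is_eligible(too_early))
```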

Eligibility screening

Eligibility screening was conducted in two stages using the online reference management tool, Covidence Systematic Review Software (2024). First, two reviewers (SG, MT) independently assessed titles and abstracts against the inclusion criteria. Discrepancies were resolved by a third reviewer (MR). All records not excluded were passed on for full-text review. Two reviewers (SG, MR) independently reviewed full-text records for potentially eligible studies and any disagreements were resolved via discussion (SG, MR). Records deemed ineligible at full-text screening were excluded with the reason recorded.

Data extraction

Following full-text review, data extraction was undertaken by three reviewers (SG, MR and MT) using Covidence Systematic Review Software (2024). Where there was uncertainty about what information to extract, a second author reviewed the article, and discrepancies were resolved through discussion among the reviewers. A data extraction form was developed and pilot-tested on a subset of articles to ensure clarity and comprehensiveness. To minimise bias, data extraction for the article authored by members of the review team (SG, MR) was conducted by an independent reviewer (MT) who was not involved in its authorship. The following bibliometric and research details were extracted from all eligible studies: study identification number; title; name of lead author; (first) year of publication (online or journal issue); title of journal; country; geographical region; study aims; study design; study setting; study time period; participants’ age; sample size; genetic condition(s); genetic inheritance type; source of data (i.e. hospital records, administrative dataset, disease registry, etc.); name of the data source; case ascertainment (active/passive); case ascertainment methods (i.e. medical record abstraction, ICD coding system, genetic testing); how prevalence is reported and prevalence data.

In this review, surveillance approaches were categorised as either active or passive, depending on how data collection was conducted. Active surveillance refers to a process where personnel systematically contact healthcare providers or individuals to obtain information about health conditions. This approach yields highly accurate and timely data but is resource-intensive and costly (Nsubuga et al., 2006). Conversely, passive surveillance involves the receipt of health data from external sources such as hospitals, clinics or public health units. While more economical and capable of covering large populations, passive systems may suffer from variability in data quality and timeliness due to reliance on voluntary reporting (Nsubuga et al., 2006).

Data analysis

Descriptive analyses, supported by Microsoft Excel (Microsoft, 2025), were utilised to summarise the results. Frequencies and percentages were calculated to present the distribution of studies by year of publication, geographic region, genetic condition type, data sources and ascertainment methods. Articles that focused on more than one condition and could be classified as either chromosomal or monogenic were grouped into a “Complex and Mixed Genetic Condition” category.
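
The review reports that these summaries were produced in Microsoft Excel. Purely as an illustration of the same frequency and percentage calculation, the Python sketch below tallies one label per article and divides by the number of articles. The helper name `summarise` is hypothetical, and the labels are reconstructed from the counts reported for genetic condition type in Table 1 rather than taken from the extraction sheet.

```python
from collections import Counter


def summarise(labels):
    """Return {category: (count, percentage of all labels)}."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        category: (n, round(100 * n / total, 1))
        for category, n in counts.items()
    }


# One label per included article, rebuilt from the Table 1 counts.
condition_types = (
    ["Chromosomal"] * 20
    + ["Monogenic"] * 17
    + ["Complex and mixed genetic conditions"] * 21
)

summary = summarise(condition_types)
for category, (n, pct) in summary.items():
    print(f"{category}: n = {n} ({pct}%)")
```

Because each article carries exactly one condition-type label, the percentages here share the article count (N = 58) as their denominator.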

Ethical approval

Ethics approval was not required because this scoping review did not involve animal or human participants.

Results

Selection of articles

The search of Medline, CINAHL, Scopus and Google Scholar yielded a total of 2429 articles (Figure 1). A further 20 articles were identified through manual reference review of eligible studies. After removing duplicates, 1894 abstracts were reviewed, and 302 full-text articles were assessed for eligibility. A total of 58 articles met the inclusion criteria.
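
The counts in this paragraph imply the number of records excluded at each screening stage. The short sketch below derives those figures by subtraction; the variable names are hypothetical and the derived values are computed from the text above, not read from Figure 1, which may break exclusions down differently (for example, by reporting records that could not be retrieved).

```python
# Counts as stated in the text.
abstracts_screened = 1894
full_text_assessed = 302
included = 58

# Derived by subtraction (assumes every record leaving a stage was excluded there).
excluded_at_title_abstract = abstracts_screened - full_text_assessed
excluded_at_full_text = full_text_assessed - included
pct_full_text_included = round(100 * included / full_text_assessed, 1)

print(excluded_at_title_abstract, excluded_at_full_text, pct_full_text_included)
```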

Figure 1.

Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flowchart of scoping review (Covidence generated).

Characteristics of studies

Table 1 summarises the key characteristics of the within-scope articles. Most of the articles (77.6%) were published from 2010 onwards, with 34.5% (n = 20) published between 2019 and 2024. Based on the source of the data, most publications originated from Europe (58.6%, n = 34), followed by North America (31.0%, n = 18) and Australia (5.2%, n = 3). There were no studies originating from New Zealand. Within Europe, four studies focused on England and Wales and three on Denmark, whereas seven spanned multiple European countries. In North America, the United States of America (USA) accounted for the majority (17 articles, or 29.3% of all included studies) and also represented the highest number of publications from a single country. A smaller fraction (5.2%, n = 3) of articles involved data or collaborations across multiple regions (e.g. United Kingdom (UK) and Australia; Europe and Israel).

Table 1.

Summary of key characteristics of studies reported in within-scope articles.

Key characteristics	Articles (N = 58)	%
Year of publication
2019–2024	20	34.5
2014–2018	9	15.5
2009–2013	17	29.3
2004–2008	12	20.7
Region/country
Australia	3	5.2
Europe	34	58.6
North America	18	31.0
Combined-multiple regions*	3	5.2
Genetic condition type
Chromosomal	20	34.5
Monogenic	17	29.3
Complex and mixed genetic conditions#	21	36.2

UK: United Kingdom; USA: The United States of America.

* Includes Australia, Canada, Czech Republic, England, Finland, France, Israel, Italy, Mexico, The Netherlands, Norway, Sweden, USA, UK.

# Articles focus on a range of chromosomal, monogenic and multifactorial conditions.

Types of genetic conditions

Of the 58 included articles, 34.5% (n = 20) focused on chromosomal conditions, 29.3% (n = 17) on monogenic conditions and 36.2% (n = 21) on complex or mixed genetic conditions (Table 1). Among the chromosomal disorder studies, more than half (n = 11) investigated Down Syndrome exclusively. Articles categorised as monogenic examined conditions such as X-linked agammaglobulinemia and Mucopolysaccharidosis type 1. Studies grouped into the complex and mixed genetic condition category encompassed a broad range of chromosomal, monogenic and multifactorial conditions. Key examples of conditions for which data were extracted include Trisomy 21, 18 and 13, as well as more common single-gene disorders such as Cystic Fibrosis, Thalassemia and Osteogenesis Imperfecta.

Sources of data

Data sources used to assess the prevalence of monogenic and chromosomal conditions in the included studies highlighted a strong reliance on disease registries; these constituted the most utilised source, accounting for 63.1% (n = 41) of the data sources reported (Table 2). Among the most frequently used registries were the European Registration of Congenital Anomalies and Twins (EUROCAT; n = 10), a European network for monitoring congenital anomalies, and the National Birth Defects Prevalence Network (NBDPN; n = 4), which collects and analyses data on birth defects in the USA. Hospital and health service records accounted for 18.5% (n = 12), while population-based administrative datasets accounted for 12.3% (n = 8) of the data sources reported. A smaller proportion (6.2%, n = 4) comprised other sources, including clinical laboratories, vital statistics and newborn screening programs. In the three Australian studies, data sources included hospital and health service records (n = 1), population-based administrative datasets (n = 2) and clinical laboratories (n = 1). Only one study using Australian data utilised a disease registry, and this was part of an international collaboration with researchers from the UK. Because authors could use more than one data source, some studies incorporated multiple sources (n = 8), combining, for example, disease registries with administrative datasets. Most studies (n = 50), however, relied upon a single source of data, with disease registries being the sole source in 36 articles, followed by hospital and health service records (n = 10) and population-based administrative datasets (n = 3).

Table 2.

Sources of data used for case ascertainment.

Sources of data	Number	%
Disease Registries	41	63.1
Hospital and health service records	12	18.5
Population-based administrative datasets	8	12.3
Other*	4	6.2
Total#	65	100.0

* Clinical laboratories, vital statistics, newborn screening programs.

# The denominator is based on the total number of data sources, not the number of articles, as individual articles may report multiple data sources. The number of sources exceeds the number of articles included.
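
The choice of denominator described in the Table 2 footnote changes the percentages noticeably. As a hedged illustration, the sketch below computes each source's share using the total number of data sources (65), as Table 2 does, and, for contrast, using the number of articles (58). The per-article figures are not reported in the review; they are shown only to make the difference visible and assume each article is counted at most once per source category.

```python
source_counts = {
    "Disease registries": 41,
    "Hospital and health service records": 12,
    "Population-based administrative datasets": 8,
    "Other": 4,
}
n_articles = 58


def percentages(counts, denominator):
    """Percentage of `denominator` represented by each count, to one decimal place."""
    return {name: round(100 * n / denominator, 1) for name, n in counts.items()}


total_sources = sum(source_counts.values())              # 65 data sources in total
per_source = percentages(source_counts, total_sources)   # denominator used in Table 2
per_article = percentages(source_counts, n_articles)     # alternative, for contrast only

for name in source_counts:
    print(f"{name}: {per_source[name]}% of sources, {per_article[name]}% of articles")
```

With the source-based denominator the shares sum to 100%; with the article-based denominator they sum to more than 100% because some articles used several sources.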

Case ascertainment methods

Case ascertainment methods varied across studies, with the majority employing active ascertainment (77.6%, n = 45). Notably, active case ascertainment was used in most studies that relied upon disease registries, perhaps reflecting the more structured and systematic approach that these registries employ for data collection. Passive ascertainment was less common (6.9%, n = 4) and was predominantly used in studies relying on population-based administrative datasets. A mixed approach combining both active and passive methods was used in 13.8% (n = 8).

In implementing these ascertainment strategies, study authors used various case identification methods, often applying more than one method to improve case ascertainment (Table 3). Medical record abstraction was the most frequently used method (29.7%, n = 27), followed closely by ICD and other coded data (27.5%, n = 25) and genetic testing with laboratory confirmation (27.5%, n = 25). Health service or clinician-reported cases (8.8%, n = 8) and registry linkage (4.4%, n = 4) were used less frequently. While 28 studies relied on a single ascertainment method, a substantial proportion (n = 30) incorporated multiple methods, such as combining genetic test results with administrative data or medical record review. The three Australian studies demonstrated variability in how genetic conditions were ascertained. Dye et al. (2011) relied on passive ascertainment through ICD-coded data, whilst Gjorgioski et al. (2020) and Hui et al. (2020) employed active ascertainment methods incorporating medical record abstraction, genetic testing and laboratory confirmation.

Table 3.

Methods of case ascertainment.

Ascertainment method	Number	%
Medical record abstraction	27	29.7
ICD and other coded data	25	27.5
Genetic testing and laboratory confirmation	25	27.5
Health service/clinician reported	8	8.8
Registry linkage	4	4.4
Multiple sources NOS	2	2.2
Total*	91	100.0

ICD: International Classification of Diseases, or a more complex, country-specific modification; NOS: not otherwise specified.

* The denominator is based on the total number of ascertainment methods, not the number of articles, as individual articles may report multiple methods. The number of ascertainment methods exceeds the number of articles included.

Studies that utilised coded data mostly used ICD-based coding systems (ICD-9 and ICD-10), including the respective clinical modifications (CMs) and the Australian modification (AM). The British Paediatric Association (BPA) extension, used to code congenital anomalies, was frequently mentioned alongside ICD-9 and ICD-10. Additionally, some studies incorporated genetic condition-specific databases such as OMIM and Orphanet, often alongside ICD-coded data.

Prevalence of chromosomal and monogenic conditions as reported in the literature

The reporting of prevalence varied across studies, reflecting differences in data sources, study populations, and case ascertainment methods. The sample size reported varied, ranging from 31 to 12,886,464 participants/admissions. Studies based on large-scale administrative datasets or national registries tended to have substantially larger sample sizes than those relying on single disease registries.

Prevalence estimates were commonly expressed in relation to population size, most frequently per number of births or per total population (Table 4). Reporting per number of births was the most frequent approach (n = 22) and was typical of studies using disease registries, particularly those focusing on congenital or early-onset genetic conditions. For example, Fraser Syndrome was reported at 0.2 per 100,000 births (Barisic et al., 2013), while Holt-Oram Syndrome had a prevalence of 0.7 per 100,000 births (Barisic et al., 2014), both derived from EUROCAT. Similarly, studies examining Down Syndrome reported a prevalence of between 11.8 and 22.0 per 10,000 births using disease registries (Loane et al., 2013; Shin et al., 2009), and 7.01 per 10,000 births using population-based administrative datasets (Glivetic et al., 2015).
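
Because the studies express prevalence against different multipliers (per 1000, per 10,000, per 100,000 or per million), comparing them requires rescaling to a common base. The sketch below rescales the examples quoted above to a per-10,000 basis. The helper name `per_10000` and the choice of 10,000 as the common base are assumptions for illustration; rescaling changes only the multiplier and does not reconcile differences in the underlying denominator (births, live births, total population or hospitalisations).

```python
def per_10000(value, base):
    """Rescale a prevalence reported per `base` individuals to per 10,000."""
    return value * 10_000 / base


# (reported value, reporting base) as quoted in the text above.
reported = {
    "Fraser syndrome (EUROCAT)": (0.2, 100_000),
    "Holt-Oram syndrome (EUROCAT)": (0.7, 100_000),
    "Down syndrome, registries, lower bound": (11.8, 10_000),
    "Down syndrome, registries, upper bound": (22.0, 10_000),
    "Down syndrome, administrative datasets": (7.01, 10_000),
}

rescaled = {
    name: round(per_10000(value, base), 3)
    for name, (value, base) in reported.items()
}

for name, value in rescaled.items():
    print(f"{name}: {value} per 10,000 births")
```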

Table 4.

General description and key characteristics of studies identified in the scoping review.

Article	Country-source of data	Inheritance	Condition	Data source	Name of source	Case ascertainment (CA)	Ascertainment method	Method of reporting prevalence in articles	Prevalence as reported in articles
Ariceta et al. (2023)	Europe and Israel	Monogenic	X-linked hypophosphatemia (XLH)	Disease Registry	International XLH Registry	Active	Medical record abstraction; genetic testing and laboratory confirmation	Range per X people	1 in 20,000–70,000
Barisic et al. (2013)	Multiple European countries	Monogenic	Fraser syndrome	Disease Registry	EUROCAT	Active	ICD and other coded data; medical record abstraction	Per 100,000 births	0.2 per 100,000
Barisic et al. (2014)	Multiple European countries	Monogenic	Holt-Oram syndrome	Disease Registry	EUROCAT	Active	ICD and other coded data; medical record abstraction	Per 100,000 births	0.7 per 100,000
Baujat et al. (2017)	France	Monogenic	Fibrodysplasia ossificans	Population-Based Administrative Datasets; Disease Registry	Programme de Médicalisation des Systèmes d’Information (PMSI) and rare disease registry (CEMARA)	Mixed	ICD and other coded data; medical record abstraction; health service/clinician reported	Per million people	1.36 per million
Boyd et al. (2011)	Multiple European countries	Chromosomal	Sex-chromosome trisomies	Disease Registry	EUROCAT	Active	Genetic testing and laboratory confirmation	Per 10,000 births	1.88 per 10,000
Cocchi et al. (2010)	Multiple countries	Chromosomal	T21	Disease Registry	International Clearinghouse for Birth Defects Surveillance and Research (ICBDSR)	NS	Multiple sources, NOS	Per 10,000 births	13.1–18.2 per 10,000
Coi et al. (2017)	Italy	Complex and mixed genetic conditions	Multiple genetic conditions	Disease Registry	Tuscany Registry of Rare Diseases and Tuscany Registry of Congenital Anomalies	Active	ICD and other coded data	Per 100,000 births	26.15 per 100,000 Turner syndrome; 13.27 per 100,000 Klinefelter syndrome; 13.27 per 100,000 OI; 2.34 per 100,000 tuberous sclerosis
Coi et al. (2019)	Multiple European countries	Monogenic	Achondroplasia	Disease Registry	EUROCAT	Active	ICD and other coded data; medical record abstraction	Per 100,000 births	3.72 per 100,000
Dănescu et al. (2015)	Romania	Monogenic	Epidermolysis bullosa	Disease Registry	Romanian Epidermolysis Bullosa Registry	Active	Medical record abstraction; genetic testing and laboratory confirmation	Per million	4.42 per million
Doidge et al. (2020)	England and Wales	Chromosomal	T21	Population-based administrative datasets; Disease Registry	National Down Syndrome Cytogenetic Register (NDSCR) and Hospital Episode Statistics (HES), England	Mixed	ICD and other coded data; registry linkage	Per 10,000 live births	12.3 per 10,000
Dolk et al. (2005)	Multiple European countries	Chromosomal	T21	Disease Registry	EUROCAT	Active	Genetic testing and laboratory confirmation	Per 1000 births	1–3 per 1000
Dongarwar et al. (2022)	USA	Monogenic	Cystic fibrosis	Population-Based Administrative Datasets	National Inpatient Sample (NIS) database	Passive	ICD and other coded data	Per 10,000 hospitalisations	0–2 years: 3.51 per 10,000; 3–6 years: 44.02 per 10,000
Dye et al. (2011)	Australia	Monogenic	Multiple monogenic and chromosomal conditions	Population-Based Administrative Datasets	Western Australian Hospital Morbidity Data System	Passive	ICD and other coded data	Percentage of admissions	2.60%
Feldkamp et al. (2017)	USA	Complex and mixed genetic conditions	Multiple conditions	Disease Registry	Utah Birth Defect Network, Utah	Active	ICD and other coded data; medical record abstraction; genetic testing and laboratory confirmation	Percentage of births	20.20% (T21, T18, T13, Turner syndrome, structural chromosomal abnormalities, and monogenic disorders)
Frøslev-Friis et al. (2011)	Denmark	Chromosomal	T21, T18, T13, other	Disease Registry	EUROCAT registry for Funen County, Denmark	Active	ICD and other coded data; Genetic Testing & laboratory confirmation; medical record abstraction	Per 10,000 births	35.6 per 10,000
Giampaolo et al. (2017)	Italy	Complex and mixed genetic conditions	Congenital bleeding disorders	Disease Registry	Italian Registry of Congenital Bleeding Disorders	Active	Health service/Clinician Reported	Percentage of patients	18.0% haemophilia A; 21.3% haemophilia B; 10.6% VonWD
Gjorgioski et al. (2020)	Australia	Complex and mixed genetic conditions	Multiple monogenic and chromosomal conditions	Hospital and Health Service Records	Electronic Medical Records (EMRs) from Royal Children’s Hospital, Melbourne	Active	Medical record abstraction; genetic testing and laboratory confirmation	Percentage of patients	16%
Glivetic et al. (2015)	Croatia	Chromosomal	T21	Population-Based Administrative Datasets	Croatian medical birth database and perinatal mortality database	Passive	ICD and other coded data; medical record abstraction	Per 10,000 births	7.01 per 10,000 births
Hagen et al. (2022)	USA	Complex and mixed genetic conditions	Multiple	Hospital and Health Service Records	Cincinnati Children’s Hospital Medical Center (CCHMC) NICU Data	Active	Medical record abstraction; genetic testing and laboratory confirmation	Percentage of admissions	10% confirmed genetic diagnosis (includes monogenic and chromosomal)
Heinke et al. (2021)	USA	Chromosomal	T21	Disease Registry	National birth defects prevention network (NBDPN)	Mixed	ICD and other coded data	Per 10,000 live births	12.7 per 10,000
Herlin et al. (2024)	Denmark	Monogenic	Incontinentia pigmenti	Population-Based Administrative Datasets; Disease Registry	Danish National Patient Registry (DNPR), Danish National Database of Rare Genetic Diseases (RareDis), Danish Genodermatosis Database	Active	ICD and other coded data; medical record abstraction	Per 100,000	2.37 per 100,000
Hughes-McCormack et al. (2020)	Scotland	Chromosomal	T21	Clinical laboratories	Scottish Regional Genetics Centres	Active	Registry linkage; medical record abstraction	Per 1000 births	1.0 per 1000
Hui et al. (2020)	Australia	Chromosomal	22q11 deletion syndrome	Population-Based Administrative Datasets; Other: Clinical Laboratories	Perinatal Record Linkage (PeRL) collaboration, Victorian Clinical Genetics Services, Melbourne Pathology, Monash Medical Centre and Australian Clinical Laboratories	Active	Genetic testing and laboratory confirmation	Ratio (1 in X births)	1 in 4558
Irving et al. (2008)	England and Wales	Chromosomal	T21	Disease Registry	Northern Congenital Abnormality Survey (NorCAS)	Active	Genetic testing and laboratory confirmation	Per 1000 total births	1.72 per 1000
Jones et al. (2008)	England and Wales	Monogenic	Chronic granulomatous disease	Disease Registry	Chronic Granulomatous Disorder (CGD) research trust registry	Active	Medical record abstraction; health service/clinician reported	Per million births	15.9 per million males; 9.7 per million females
Kristensen et al. (2024)	Norway	Monogenic	POLG disease	Disease Registry	Norwegian POLG Patient Registry	Active	Genetic testing and laboratory confirmation	Ratio (1 in X people)	1:149,253
Lialiaris et al. (2010)	Greece	Complex and mixed genetic conditions	Multiple conditions	Hospital and Health Service Records	Medical records from the University General Hospital of Alexandroupolis, Greece	Active	Medical record abstraction	Percentage of admissions	8.9% of admissions are monogenic or chromosomal
Loane et al. (2013)	Multiple European countries	Chromosomal	T21	Disease Registry	EUROCAT	Active	ICD and other coded data; genetic testing and laboratory confirmation	Per 10,000 live births	22.0 per 10,000 T21; 5.0 per 10,000 T18; 2.0 per 10,000 T13
Mai et al. (2019)	USA	Complex and mixed genetic conditions	Multiple conditions	Disease Registry	National Birth Defects Prevention Network (NBDPN)	Mixed	ICD and other coded data	Per 10,000 live births	15.74 per 10,000 T21; 3.43 per 10,000 T18; 1.49 per 10,000 T13
Marouane et al. (2022)	The Netherlands	Complex and mixed genetic conditions	Multiple conditions	Hospital and Health Service Records	Electronic Medical Records (EMRs) from Radboud University Medical Center NICU	Active	Medical record abstraction; genetic testing and laboratory confirmation	Percentage of admissions	14% T21; 1% T13; 3% T18; 3% Turner syndrome
McCandless et al. (2004)	USA	Complex and mixed genetic conditions	Multiple conditions	Hospital and Health Service Records	Electronic Medical Records (EMRs) from Rainbow Babies and Children’s Hospital, Cleveland	Active	ICD and other coded data; medical record abstraction	Percentage of admissions	10.8% single gene and chromosomal
McDonnell et al. (2017)	Ireland	Chromosomal	T21, 18 and 13	Disease Registry; Hospital and Health Service Records	Dublin EUROCAT Registry, Irish Maternity and Paediatric Hospitals	Active	Medical record abstraction; genetic testing; laboratory confirmation	Per 10,000 births	35.7 per 10,000 T21; 9.3 per 10,000 T18; 3.7 per 10,000 T13
Métneki and Czeizel (2005)	Hungary	Chromosomal	T21	Disease Registry	Hungarian Congenital Abnormality Registry (HCAR)	Active	Genetic testing; laboratory confirmation	Per 1000 live births	1.17 to 1.50 per 1000
Moffitt et al. (2011)	USA	Monogenic	Multiple conditions	Disease Registry	Texas Birth Defects Registry (TBDR)	Active	ICD and other coded data	Per 100,000 live births	4.54 per 10,000 OI
Mukhina et al. (2020)	Russia	Complex and mixed genetic conditions	Primary immunodeficiencies (PID)	Disease Registry	Russian Primary Immunodeficiency (PID) Registry	Active	Medical record abstraction	Per 100,000	1.3 per 100,000
Murphy et al. (2009)	Ireland	Monogenic	Mucopolysaccharidosis type 1	Hospital and Health Service Records	National Centre for Inherited Metabolic Disorders, Children’s University Hospital, Dublin	Active	Medical record abstraction	Per 10,000 births	0.08 per 10,000
Nowaczyk et al. (2004)	Canada	Monogenic	Smith-Lemli-Opitz disease	Disease Registry	Canadian Paediatric Surveillance Program (CPSP)	Active	Health service/clinician reported	Ratio (1 in X births)	1 in 70,358 births
O’Malley and Hutcheon (2007)	USA	Complex and mixed genetic conditions	Multiple conditions	Hospital and Health Service Records	Medical records from Elizabeth Seton Paediatric Center, New York	Active	Medical record abstraction	Percentage of admissions	5.9% chromosomal; 12.2% monogenic
Óskarsdóttir et al. (2004)	Sweden	Chromosomal	22q11 deletion syndrome	Hospital and Health Service Records	Queen Silvia Children’s Hospital, Sweden	Active	Genetic testing; laboratory confirmation	Per 100,000 live births	13.2 per 100,000
Parker et al. (2009)	USA	Complex and mixed genetic conditions	Multiple conditions	Disease Registry	National Birth Defects Prevention Network (NBDPN)	Mixed	ICD and other coded data	Per 10,000 live births	14.47 per 10,000 T21; 2.66 per 10,000 T18; 1.26 per 10,000 T13
Quental et al. (2010)	Portugal	Monogenic	Maple syrup urine disease	Other: Newborn screening program	Newborn Screening Program in Portugal	Active	Genetic testing; laboratory confirmation	Ratio (1 in X births)	1 in 86,800
Rafaelsen et al. (2016)	Norway	Monogenic	Hereditary hypophosphatemia	Hospital and Health Service Records	Norwegian Paediatric Hospital Departments	Active	ICD and other coded data; genetic testing and laboratory confirmation; medical record abstraction	Ratio (1 in X children)	1 in 45,000
Salemi et al. (2012)	USA	Complex and mixed genetic conditions	Multiple conditions	Disease Registry	Florida Birth Defects Registry (FBDR)	Mixed	ICD and other coded data; registry linkage	Per 10,000	13.3 per 10,000 T21; 2.6 per 10,000 T18; 1.3 per 10,000 T13
Savva et al. (2010)	UK and Australia	Chromosomal	T21, 18 and 13	Disease Registry	UK regional Congenital Anomaly Registers and Australian Registers	Active	Multiple sources, NOS	per 10,000 births	1.4 per 10,000 T13; 2.3 per 10,000 T18
Shai et al. (2020)	Germany	Complex and mixed genetic conditions	Severe combined immunodeficiency disorders	Disease Registry	German Paediatric Surveillance Unit (ESPED) and European Society for Immunodeficiencies (ESID)	Active	Medical record abstraction; health service/clinician reported	Per 100,000	1.6 per 100,000
Shin et al. (2009)	USA	Chromosomal	T21	Disease Registry	10 USA population-based birth defects registries	Mixed	ICD and other coded data	per 10,000 births	11.8 per 10,000
Sousa et al. (2023)	Portugal	Monogenic	Wilson disease	Population-Based Administrative Datasets; Hospital and Health Service Records	Portuguese National Health Service's Clinical Coding System, 13 Northern Portuguese Hospitals	Active	ICD and other coded data	Ratio (1 in X people)	1 in 37,000 people
Stallings et al. (2024)	USA	Complex and mixed genetic conditions	Multiple conditions	Disease Registry	National Birth Defects Prevention Network (NBDPN)	Mixed	ICD and other coded data	Per 10,000 live births	17.19 per 10,000 T21; 3.44 per 10,000 T18; 1.60 per 10,000 T13
Stanclift et al. (2022)	USA	Monogenic	N-glycanase 1 (NGLY1)	Disease Registry	NGLY1 Registry	Active	Genetic testing and laboratory confirmation	Incidence per year	12
Stochholm et al. (2010)	Denmark	Chromosomal	47, XYY	Disease Registry	Danish Cytogenetic Central Registry	Active	Genetic testing and laboratory confirmation	Per 100,000 population	14.2 per 100,000
Swaggart et al. (2019)	USA	Complex and mixed genetic conditions	Multiple conditions	Hospital and health service records	Electronic Medical Records (EMR) from Cincinnati Children's Hospital Medical Center (CCHMC)	Active	Medical record abstraction; genetic testing and laboratory confirmation	Percentage of admissions	9%
Tonks et al. (2013)	England and Wales	Chromosomal	T13 and 18	Disease Registry	West Midlands Congenital Anomaly Register (WMCAR)	Active	Genetic testing and laboratory confirmation; registry linkage	Per 10,000 births, percentages and raw numbers	3.95–6.94 per 10,000 T18; 2.45 per 10,000 T13
Waller et al. (2008)	USA	Monogenic	Achondroplasia and thanatophoric dysplasia	Disease Registry	Seven US population-based birth defects monitoring programs	Active	Medical record abstraction	Per 10,000 livebirths and 95% confidence intervals	0.36–0.60 per 10,000 achondroplasia; 0.21–0.30 per 10,000 thanatophoric dysplasia
Weijerman et al. (2008)	The Netherlands	Chromosomal	T21	Disease Registry	Dutch Paediatric Surveillance Unit (DPSU), Dutch Neonatal Registry (LNR), and Dutch Obstetric Registry (LVR)	Active	Genetic testing and laboratory confirmation; health service/clinician reported	Per 10,000 births	16 per 10,000
Wellesley et al. (2012)	Multiple European countries	Chromosomal	Multiple conditions	Disease Registry	EUROCAT	Active	ICD and other coded data; medical record abstraction	Per 10,000 births	43.8 per 10,000 chromosome abnormality; 23.0 per 10,000 T21; 5.9 per 10,000 T18; 2.3 per 10,000 T13; 2.0 per 10,000 sex chromosome trisomy; 3.3 per 10,000 45, X
Winkelstein et al. (2006)	USA	Monogenic	X-linked agammaglobulinemia (XLA)	Disease Registry	United States Immune Deficiency Network (USIDNET), Immune Deficiency Foundation Registry	Active	Health service/clinician reported	Ratio (1 in X births)	1 in 379,000
Yanni et al. (2010)	USA	Complex and mixed genetic conditions	Multiple conditions	Disease Registry; Other: Vital Statistics	Michigan Birth Defects Registry (MBDR); Michigan Department of Community Health	Passive	ICD and other coded data; genetic testing and laboratory confirmation; health service/clinician reported	Per 10,000 live births and	4.60 per 10,000 MSUD; 3.05 per 100,000 CF; 7.40 per 100,000 G6PD deficiency; 1.70 per 100,000 thalassemia; 0.28 per 100,000 SCD; 1.09 per 100,000 T21; 1.05 per 100,000 T13; 1.40 per 100,000 T18
Zinchenko et al. (2024)	Russia	Complex and mixed genetic conditions	Multiple conditions	Disease Registry	Russian Federation Rare Disease Registry	Active	Medical record abstraction; genetic testing and laboratory confirmation	Per 100,000	0.10 per 100,000 MSUD; 1.50 per 100,000 OI; 0.49 per 100,000 WD; 0.25 per 100,000 FD

CA: case ascertainment; CF: cystic fibrosis; EUROCAT: the European registration of congenital anomalies and twins; FD: Fabry disease; G6PD: glucose-6-phosphate dehydrogenase; ICD: international classification of diseases or a more complex, country-specific modification; MSUD: maple syrup urine disease; NS: not specified; OI: Osteogenesis Imperfecta; POLG: DNA polymerase gamma; SCD: sickle cell disease; T13: trisomy 13 (Patau Syndrome); T18: trisomy 18 (Edwards Syndrome); T21: trisomy 21 (Down Syndrome); UK: United Kingdom; USA: the United States of America; VonWD: Von Willebrand disease; WD: Wilson disease.

Correspondingly, some studies expressed prevalence relative to the general population, particularly for conditions that affect individuals across the lifespan. For example, Fibrodysplasia Ossificans Progressiva was reported at 1.36 per million people, using data from a population-based administrative dataset and disease registry in France (Baujat et al., 2017), while Epidermolysis Bullosa was reported at 4.42 per million people using a disease registry in Romania (Dănescu et al., 2015). Moreover, some studies reported prevalence using ratio-based estimates, expressing the frequency of genetic conditions as 1 in X births, children or people. For instance, Nowaczyk et al. (2004), using data from the Canadian Paediatric Surveillance Program, estimated the prevalence of Smith-Lemli-Opitz Disease as 1 in 70,358 births, whilst Hui et al. (2020) estimated the prevalence of 22q11 Deletion Syndrome as 1 in 4558 births in Victoria, Australia (excluding miscarriages but including stillbirths and terminations), reflecting differences in study populations and ascertainment methods.

Prevalence estimates were also reported at times using all pregnancy outcomes rather than live births alone. Heinke et al. (2021) reported a Down Syndrome prevalence in the USA of 12.7 per 10,000 live births, increasing to 13.3 per 10,000 when terminations and stillbirths were included. Similarly, Cocchi et al. (2010) found that the total prevalence of Down Syndrome from 14 participating countries, including live births, stillbirths and terminations, increased from 13.1 to 18.2 per 10,000 births between 1993 and 2004. In contrast, live-birth prevalence remained stable at 8.3 per 10,000. This increase in total prevalence was accompanied by a rise in terminations of pregnancy for Down Syndrome, from 4.8 to 9.9 per 10,000 births over the same period.
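The arithmetic behind the Cocchi et al. (2010) figures can be checked in a few lines. The sketch below uses only the four rates quoted in the paragraph above; the variable names are illustrative, and the subtraction assumes that total prevalence is the sum of its outcome-specific components, with stillbirths not reported separately here.

```python
# Rates per 10,000 births, as quoted in the text for Cocchi et al. (2010).
total_prevalence = {1993: 13.1, 2004: 18.2}   # live births + stillbirths + terminations
terminations = {1993: 4.8, 2004: 9.9}         # terminations of pregnancy only

# Change over the period in total prevalence and in terminations.
rise_total = round(total_prevalence[2004] - total_prevalence[1993], 1)
rise_terminations = round(terminations[2004] - terminations[1993], 1)

# Prevalence among pregnancies that did not end in termination (assumed additive).
non_terminated = {
    year: round(total_prevalence[year] - terminations[year], 1)
    for year in total_prevalence
}

print("Rise in total prevalence:", rise_total)
print("Rise in terminations:", rise_terminations)
print("Total minus terminations by year:", non_terminated)
```

Under these assumptions, the rise in total prevalence equals the rise in terminations, and the remainder is the same in both years, which is consistent with the stable live-birth prevalence reported.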

Eleven studies reported the prevalence of conditions as a percentage of admissions or patients. The majority of these studies (n = 7/10) utilised health or hospital records, primarily through medical record abstraction. Studies conducted in Greece and the USA examining the impact of monogenic and chromosomal conditions collectively via medical record review found that these conditions accounted for 8.9%–18.1% of admissions (Lialiaris et al., 2010; McCandless et al., 2004; O’Malley and Hutcheon, 2007). In Australia, Gjorgioski et al. (2020) reported a 16% prevalence of admissions for these conditions in a Victorian hospital using a similar approach. In contrast, Dye et al. (2011), who analysed ICD-10-AM coded data from a population-based administrative dataset in Western Australia, found a substantially lower prevalence of 2.6% of admissions.

Discussion

This scoping review explored the data sources used to identify and record monogenic and chromosomal conditions in Australia, New Zealand, Europe and North America, as reported in the peer-reviewed literature between 2004 and 2024, inclusive. The review examined how different data sources influence prevalence estimates, and highlighted variations in case ascertainment and notification systems. Nearly 60% of the within-scope studies originated from European countries and 30% from the USA, illustrating the concentration of genetic disease research in these regions. In contrast, Australian authors accounted for only four studies, one of which involved international collaboration with researchers from the UK. The dominance of European and USA-based studies is likely driven by well-established surveillance networks and large-scale registries, such as EUROCAT, the NBDPN and the Genetic and Rare Diseases Information Centre. Additionally, Orphanet plays a key role in European rare disease surveillance, offering a structured and standardised framework for data collection across multiple countries (Orphanet, 2024a). Central to this system is the use of ORPHAcodes, which are unique, stable identifiers assigned to rare diseases within the Orphanet nomenclature.

Widely recognised as the most appropriate coding system for rare diseases, ORPHAcodes offer comprehensive coverage across Europe and are increasingly being integrated into broader classification systems, such as the ICD (Orphanet, 2024b; Rare Disease Awareness Rare Portal, 2024). These centralised systems facilitate consistent case ascertainment, national and cross-national level prevalence reporting and robust data linkage across healthcare systems and settings (Bascom et al., 2023; Dolk, 2005).

Australia, by comparison, does not have a centralised national genetic or rare disease registry. Instead, surveillance relies on a fragmented system of specialised registries, including the Australian Congenital Anomalies Monitoring System, the Australian Paediatric Surveillance Unit Database and state-based registries such as the Victorian Congenital Anomalies Register. Ruseckaite et al. (2023) highlight the variability in Australian rare disease registries in terms of data entry, scope, outputs and funding, and associated challenges of interoperability. There exist Australian disease-specific registries for some conditions, such as the Australian Cystic Fibrosis Data Registry and the Australian Rett Syndrome Database; however, many genetic conditions (particularly monogenic and chromosomal) lack dedicated national registries. This arrangement contributes to fragmented data, underrepresentation in surveillance and difficulties in estimating true prevalence (Elliot et al., 2024). In their scoping review, Ruseckaite et al. (2023) found only 24 Australian-only rare disease registries and five Australian jurisdiction-based registries. Given that over 7000 rare genetic conditions have been identified, around one-third of which affect children (Lee et al., 2020), a more coordinated effort is required to ensure comprehensive data collection (Rare Voices Australia and Monash University, 2023; Ruseckaite et al., 2023) to achieve more accurate prevalence estimates.

The current review identified considerable heterogeneity in data sources, case ascertainment methods and approaches to reporting prevalence for monogenic and chromosomal conditions. The reliance on disease registries as the primary data source for determining prevalence, evident in 62.1% of the within-scope studies, underscores their recognised role in systematically capturing rare conditions (Hageman et al., 2023; Ruseckaite et al., 2023). High usage of registries such as EUROCAT and the NBDPN reflects their extensive reach, standardised data collection protocols and longstanding contributions to congenital anomaly surveillance (Dolk, 2005; Mai et al., 2019). Nevertheless, heavy dependence on a single registry can introduce limitations, including variations in case definitions, incomplete coverage and under-ascertainment of milder or late-onset conditions (Nassar et al., 2007). Despite the widespread use of disease registries internationally, only one within-scope study incorporated disease-specific registry data from Australia (Savva et al., 2010), highlighting the limited integration of local registry data in prevalence estimates. Notably, over three-quarters of the within-scope studies relied upon a single data source, thereby limiting opportunities for data triangulation (Rutherford et al., 2010) and consistency checks (Molster et al., 2012). Notwithstanding, or perhaps in recognition of, these challenges, a substantial proportion of studies (n = 30) adopted multiple ascertainment methods to enhance data robustness, such as combining genetic testing with administrative data or medical record review. This mixed approach facilitates improved sensitivity and specificity in case identification (Saczynski et al., 2013).

Case ascertainment methods also varied across studies, with active ascertainment being the predominant approach (77.6%). This aligns with the high proportion of studies using disease registries, which employ structured and systematic methods for case identification (Boyle et al., 2018). Active ascertainment, including medical record abstraction (29.7%) and genetic testing with laboratory confirmation (27.5%), improves case completeness and diagnostic accuracy, particularly for conditions requiring molecular confirmation (Molster et al., 2016). In contrast, passive ascertainment was less common (6.9%) and typically linked to population-based administrative datasets that rely on existing coded data rather than proactive case identification, which can result in underreporting (Reichard et al., 2016).

These methodological differences contribute to variations in prevalence estimates, as studies using disease registries from Europe and the USA tend to report higher disease estimates due to more comprehensive case capture and larger sample sizes. For example, the reported prevalence of Down Syndrome ranged from 4.79 to 14.99 per 10,000, depending on whether population administrative datasets or disease registries were used. This discrepancy is likely because disease registries actively identify and verify cases through multiple data sources. Administrative datasets may miss cases that are asymptomatic, not coded as a primary diagnosis or diagnosed outside hospital settings. Moreover, administrative datasets in Australia primarily use ICD-coded data for disease surveillance, which can lead to underestimation (Liu et al., 2022; Ryan et al., 2021). These findings underscore the importance of integrating multiple data sources to improve the accuracy and reliability of genetic disease surveillance.
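As a minimal sketch of what integrating multiple data sources means in practice, the example below pools case identifiers from three sources and shows how much of the pooled case list each source would capture on its own. The identifiers, source names and counts are entirely hypothetical and are not drawn from any included study; real linkage would also need to resolve records that refer to the same child under different identifiers.

```python
# Hypothetical, already-linked case identifiers per data source (illustration only).
sources = {
    "disease_registry": {"c01", "c02", "c03", "c04", "c05"},
    "icd_coded_admissions": {"c02", "c03", "c06"},
    "genetics_laboratory": {"c01", "c03", "c04", "c07"},
}

# Pool all sources; the set union removes cases reported by more than one source.
all_cases = set().union(*sources.values())

# Share of the pooled case list that each source captures by itself.
capture_share = {
    name: len(case_ids) / len(all_cases) for name, case_ids in sources.items()
}

for name, share in capture_share.items():
    print(f"{name}: {len(sources[name])} of {len(all_cases)} pooled cases ({share:.0%})")
```

The pooled count is only a lower bound on the true number of cases, because a case missed by every source remains invisible to this approach.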

Furthermore, prevalence estimates of genetic conditions can differ notably based on whether only live births or all pregnancy outcomes (including terminations and stillbirths) are considered. For example, Heinke et al. (2021) reported an increase in the prevalence of Down Syndrome when terminated pregnancies and stillbirths were included. This discrepancy underscores how the inclusion of non-live-birth outcomes can substantially influence reported prevalence, particularly for conditions frequently detected through prenatal screening (Hui et al., 2016; Hui and Halliday, 2023). Moreover, the extent of this variation may depend on factors such as maternal age and access to genetic testing (Hui et al., 2016). Additionally, comparing prevalence estimates across studies is further complicated by differences in denominator definitions, for example, where prevalence is reported per 10,000 or per 100,000 births, or as a percentage of patients. This inconsistency can obscure direct comparisons and limit the interpretability of findings, particularly when determining prevalence estimates for Australia. Consequently, when interpreting and comparing prevalence rates, it is essential to account for both the population denominator and whether all pregnancy outcomes are included.
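To illustrate the scale part of the denominator problem, the sketch below rescales four estimates quoted in this review to a common "per 10,000" basis. The helper name `to_per_10k` is hypothetical, and a ratio of 1 in X is treated as one case per X. Rescaling does not reconcile differences in the underlying population (live births, total births or all pregnancy outcomes), so the converted figures are still not directly comparable.

```python
def to_per_10k(cases: float, denominator: float) -> float:
    """Rescale `cases` per `denominator` to a rate per 10,000."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return cases * 10_000 / denominator


# (label, cases, denominator) as reported in the summary table and text.
reported = [
    ("Smith-Lemli-Opitz, 1 in 70,358 births (Nowaczyk et al., 2004)", 1, 70_358),
    ("22q11 deletion, 1 in 4558 births (Hui et al., 2020)", 1, 4_558),
    ("Achondroplasia, 3.72 per 100,000 births (Coi et al., 2019)", 3.72, 100_000),
    ("T21, 1.72 per 1000 total births (Irving et al., 2008)", 1.72, 1_000),
]

for label, cases, denominator in reported:
    print(f"{label}: {to_per_10k(cases, denominator):.2f} per 10,000")
```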

Studies employing medical record abstraction offered a more detailed and nuanced approach to identifying genetic conditions by reviewing full patient records rather than relying solely on coded discharge diagnoses. Approximately half of these studies were conducted within hospitals and health services (n = 10) while the rest were based in disease registries. This method can uncover underlying genetic conditions that are not explicitly recorded via the coded data, resulting in more complete case ascertainment and improved disease estimates. For example, McCandless et al. (2004) and Gjorgioski et al. (2020) identified genetic components missed in ICD-coded data (Dye et al., 2011), showing that administrative datasets underestimate the burden of monogenic and chromosomal diseases. McCandless et al. (2004) reported that 25% of genetic conditions were not captured in coded data, a pattern echoed in Gjorgioski (2017, unpublished thesis) and other validation studies across different diseases such as stroke (Ryan et al., 2021).

In contrast, disease registries provide a systematic approach to genetic disease surveillance through active ascertainment, ensuring that diagnosed cases are recorded and followed over time. Registries often use standardised data collection protocols, improving consistency in case reporting. Unlike medical record reviews, however, registries are usually limited to specific diseases and may not capture undiagnosed or incidental cases, particularly if they rely on voluntary reporting or do not integrate genetic testing results.

Despite their advantages, medical record reviews are resource-intensive, typically confined to single health services and have smaller sample sizes compared to registries and administrative datasets. Their accuracy also depends heavily on clinician documentation. Issues such as copy-pasting of clinical notes, outdated patient histories, or missing updates to genetic diagnoses in electronic medical records (EMRs) can introduce biases (Al Bahrani and Medhi, 2023; Casey et al., 2016). These factors may lead to over- or under-estimation of prevalence, depending on how genetic information is recorded and updated.

While ICD-based coding systems were widely used (27.5%), including ICD-9 and ICD-10 (with country-specific modifications such as the ICD-10-AM), coding-based approaches may be subject to misclassification or incomplete case capture, especially for conditions with complex diagnostic criteria (Southern et al., 2016). The incorporation of the British Paediatric Association (BPA) extension for congenital anomalies, alongside rare disease-specific databases such as OMIM and Orphanet, emphasises the need for coding frameworks that can handle the wide range of genetic conditions (Hageman et al., 2023; Walker et al., 2017).

Nevertheless, Australia’s reliance on ICD-10-AM for coded health data poses significant challenges for the accurate capture of genetic conditions. Historically, many genetic conditions lack specific codes or are categorised under broad groupings (Bowker and Star, 1999), leading to underreporting and poor surveillance. For instance, while Cystic Fibrosis (E84) has a distinct classification, Long QT Syndrome (I49.8) is included under “Other specified cardiac arrhythmias” in the ICD, thus limiting detailed monitoring. Likewise, a patient with Osteogenesis Imperfecta who is admitted for a broken wrist may have only the fracture coded and not the underlying genetic diagnosis (Lujic et al., 2014). In Australia, co-morbid conditions are coded only if they meet Australian Coding Standard 0002, which stipulates that the condition must impact upon patient care by requiring therapeutic treatment, diagnostic procedures or increased nursing care and/or monitoring (Independent Hospital Pricing Authority, 2019). Thus, genetic diagnoses that do not affect the current admission need not be coded (Assareh et al., 2016; Lujic et al., 2014).

In 2015, supplementary codes were introduced in the 9th edition of the ICD-10-AM to capture chronic conditions present on admission but not actively treated during an episode of care (Lujic et al., 2017). These codes, however, are not comprehensive and do not sufficiently capture the burden of many genetic conditions, which perpetuates their underrepresentation in administrative datasets. The increasing use of genomic testing and the multisystemic nature of many genetic conditions (Mellis et al., 2022) further complicate their classification within a statistical coding framework. Australia’s National Strategic Action Plan for Rare Diseases notes that only 517 of the nearly 7000 Orphanet-recognised rare diseases can be coded in Australian hospitals (Department of Health, 2020). While Action 3.1.1 proposes the integration of ORPHAcodes and ICD-11, implementation remains uncertain because ICD-11 adoption is still several years away in Australia. The absence of a clear pathway for incorporating ORPHAcodes into the existing ICD-10-AM framework leaves a gap in the immediate integration of genetic disease data. Consequently, until the ICD-11 is adopted, a structured approach to addressing limitations in the Australian clinical coding system remains elusive, thus creating ongoing barriers to accurate genetic disease surveillance, research and policy development.

Strengthening clinical coder and Health Information Manager (HIM) expertise is equally critical, given the complexity of genetic diagnoses and diverse inheritance patterns. In the future, HIMs will likely be responsible for assigning ORPHAcodes at the hospital level, thereby contributing directly to national genetic disease estimates. As the frontline professionals responsible for the quality and accuracy of clinically coded health data, HIMs must be equipped with the knowledge and training necessary to ensure that genetic conditions are appropriately classified and captured within national health datasets. Targeted education in genetic nomenclature, inheritance mechanisms, and coding frameworks can empower HIMs and clinical coders to accurately identify, document, and classify genetic conditions to ensure the comprehensive capture of relevant information.

Strengths and limitations

This study has several strengths. These include its international scope and comparison of like systems, which have allowed for a broader understanding of how different countries capture monogenic and chromosomal disorders. Unlike many studies that focus on all genetic conditions or rare diseases (some of which are non-genetic), this review has specifically examined monogenic and chromosomal disorders, thereby providing a more targeted analysis of prevalence estimation challenges. The study limitations include its exclusion of grey literature and white papers which may have led to the omission of relevant reports containing prevalence data. Additionally, as this was a scoping review rather than a systematic review, there is a possibility that some studies were missed. Another potential limitation is the risk of misclassifying disease inheritance, which could have led to the incorrect inclusion or exclusion of certain conditions, although this was mitigated by using established genetic databases such as OMIM and Orphanet to guide classification. Although data extraction was performed by three independent reviewers, no formal statistical analysis or audit was conducted to evaluate inter-rater reliability or the consistency of data extraction. This may potentially introduce variability or extraction bias into the results. Despite these limitations, the study provides valuable insights into the complexities of disease surveillance and data collection for monogenic and chromosomal conditions.

Conclusion

The findings highlight the significant variability in the methods used to ascertain cases and report prevalence of monogenic and chromosomal conditions. This variability underscores the need for a multipronged approach to case ascertainment that integrates registry data, ICD-coded administrative data, and rare disease-specific classification systems such as ORPHAcodes. Such an approach would enhance the accuracy, consistency and comparability of prevalence estimates across settings. In Australia, addressing the fragmented surveillance infrastructure is essential to ensure the reliability of prevalence data for genetic conditions. Key opportunities include developing multi-jurisdictional registries to support uniform data collection nationwide and leveraging Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) within EMRs to better capture genetic phenotypes (Mellis et al., 2022). While these priorities are outlined in the National Strategic Action Plan for Rare Diseases (Department of Health, 2020), progress (particularly in relation to ORPHAcodes) remains limited. Achieving a more integrated surveillance system will require coordinated national leadership, sustained funding and investment in workforce capabilities. Strengthening these foundations will support the generation of robust, policy-relevant prevalence data for genetic conditions in Australia.

Footnotes

Acknowledgements

The authors thank Hannah Buttery, Librarian at La Trobe University Library, Melbourne, for expert advice during the database searches.

Accepted for publication June 10, 2025.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Stephanie Gjorgioski, BHSc(MedClass), BHthInfoMgt, BHSc(Hons), GradCertHECTL

Melanie Tassos, BSc, GradCertHECTL

Monique F Kilkenny, PhD

Kerin Robinson, BAppSc(MRA), BHA, MHP, PhD, CHIM

Merilyn Riley, BAppSc(MRA), BTh, GradDipEpi&Biost, PhD

References

Abouelhoda, Sobahy, El-Kalioby, et al. (2016) Clinical genomics can facilitate countrywide estimation of autosomal recessive disease burden. Genetics in Medicine 18(12): 1244–1249.

Al Bahrani, Medhi (2023) Copy-pasting in patients’ electronic medical records (EMRs): Use judiciously and with caution. Cureus 15(6): e40486.

Ariceta, Beck-Nielsen, Boot, et al. (2023) The International X-linked hypophosphatemia (XLH) registry: First interim analysis of baseline demographic, genetic and clinical data. Orphanet Journal of Rare Diseases 18(1): 304.

Arksey, O’Malley (2005) Scoping studies: Towards a methodological framework. International Journal of Social Research Methodology 8(1): 19–32.

Assareh, Achat, Stubbs, et al. (2016) Incidence and variation of discrepancies in recording chronic conditions in Australian hospital administrative data. Public Library of Science One 11(1): e0147087.

Australian Institute of Health and Welfare (2024) Congenital anomalies in Australia. Available at: https://www.aihw.gov.au/reports/mothers-babies/congenital-anomalies-in-australia/contents/congenital-anomalies-in-australia (accessed 6 January 2025).

Barisic, Boban, Greenlees, et al. (2014) Holt Oram syndrome: A registry-based study in Europe. Orphanet Journal of Rare Diseases 9: 156.

Barisic, Odak, Loane, et al. (2013) Fraser syndrome: Epidemiological study in a European population. American Journal of Medical Genetics Part A 161A(5): 1012–1020.

Bascom, Stephens, Lupo, et al. (2023) Scientific impact of the National Birth Defects Prevention Network multistate collaborative publications. Birth Defects Research 116(1): e2225.

Baujat, Choquet, Bouée, et al. (2017) Prevalence of fibrodysplasia ossificans progressiva (FOP) in France: An estimate based on a record linkage of two national databases. Orphanet Journal of Rare Diseases 12(1): 123.

Bower, Rudy, Callaghan, et al. (2010) Age at diagnosis of birth defects. Birth Defects Research (Part A): Clinical and Molecular Teratology 88(4): 251–255.

Bowker, Star (1999) Sorting Things Out: Classification and Its Consequences. Cambridge, MA: MIT Press.

Boyd, Loane, Garne, et al. (2011) Sex chromosome trisomies in Europe: Prevalence, prenatal detection and outcome of pregnancy. European Journal of Human Genetics 19(2): 231–234.

Boyle, Addor M-C, Arriola, et al. (2018) Estimating global burden of disease due to congenital anomaly: An analysis of European data. Archives of Disease in Childhood - Fetal and Neonatal Edition 103(1): F22–F28.

Casey, Schwartz, Stewart, et al. (2016) Using electronic health records for population health research: A review of methods and applications. Annual Review of Public Health 37(1): 61–81.

Christianson, Howson, Modell (2006) March of Dimes Global Report on Birth Defects: The Hidden Toll of Dying and Disabled Children. New York: March of Dimes Birth Defects Foundation. Available at: https://www.marchofdimes.org/global-report-on-birth-defects-the-hidden-toll-of-dying-and-disabled-children-full-report.pdf (accessed 7 January 2025).

Cocchi, Gualdi, Bower, et al. (2010) International trends of Down syndrome 1993–2004: Births in relation to maternal age and terminations of pregnancies. Birth Defects Research (Part A): Clinical and Molecular Teratology 88(6): 474–479.

Coi, Santoro, Garne, et al. (2019) Epidemiology of achondroplasia: A population-based study in Europe. American Journal of Medical Genetics Part A 179A(9): 1791–1798.

Coi, Santoro, Pierini, et al. (2017) Prevalence estimates of rare congenital anomalies by integrating two population-based registries in Tuscany, Italy. Public Health Genomics 20(4): 229–234.

Colburn, Lapidus (2024) An analysis of Pompe newborn screening data: A new prevalence at birth, insight and discussion. Frontiers in Pediatrics 11: 1221140.

Covidence Systematic Review Software (2024) Veritas health innovation. Available at: https://www.covidence.org/ (accessed 5 January 2025).

Dănescu, Has, Senila, et al. (2015) Epidemiology of inherited epidermolysis bullosa in Romania and genotype–phenotype correlations in patients with dystrophic epidermolysis bullosa. Journal of the European Academy of Dermatology and Venereology 29(5): 899–903.

Department of Health (2020) National strategic action plan for rare diseases. Australian Government. Available at: https://www.health.gov.au/sites/default/files/documents/2020/03/national-strategic-action-plan-for-rare-diseases.pdf (accessed 15 January 2025).

Dolk (2005) EUROCAT: 25 years of European surveillance of congenital anomalies. Archives of Disease in Childhood - Fetal and Neonatal Edition 90(1): F355–F358.

Dolk, Loane, Garne, et al. (2005) Trends and geographic inequalities in the prevalence of Down syndrome in Europe, 1980–1999. Revue d’Épidémiologie et de Santé Publique 53(Suppl 2): 2S87–2S95.

Doidge, Morris, Harron, et al. (2020) Prevalence of Down’s syndrome in England, 1998–2013: Comparison of linked surveillance data and electronic health records. International Journal of Population Data Science 5(1): 14.

Dongarwar, Garcia, Miller, et al. (2022) Assessment of hospitalization rates, factors associated with hospitalization and in-patient mortality in pediatric patients with cystic fibrosis. Journal of the National Medical Association 113(6): 683–690.

Downie, Halliday, Lewis, et al. (2021) Principles of genomic newborn screening programs: A systematic review. The Journal of the American Medical Association Network Open 4(7): e2114336.

Downie, Bouffler, Amor, et al. (2024) Gene selection for genomic newborn screening: Moving toward consensus? Genetics in Medicine 26(5): 101077.

Dye, Brameld, Maxwell, et al. (2011) The impact of single gene and chromosomal disorders on hospital admissions of children and adolescents: A population-based study. Public Health Genomics 14(3): 153–161.

Elliott, Teutsch, Nunez, et al. (2024) Improving knowledge of rare disorders since 1993: The Australian Paediatric Surveillance Unit. Archives of Disease in Childhood 109(12): 967–979.

Feldkamp, Carey, Byrne, et al. (2017) Etiology and clinical presentation of birth defects: Population-based study. British Medical Journal 357(1): j2249.

Frøslev-Friis, Hjort-Pedersen, Henriques, et al. (2011) Improved prenatal detection of chromosomal anomalies. Danish Medical Bulletin 58(8): A4293.

Giampaolo, Abbonizio, Arcieri, et al. (2017) Italian registry of congenital bleeding disorders. Journal of Clinical Medicine 6(3): 34.

Gibson, Scott, Haan, et al. (2016) Age range for inclusion affects ascertainment by birth defects registers. Birth Defects Research (Part A): Clinical and Molecular Teratology 106(9): 761–766.

Gjorgioski (2017) The burden of genetic disorders in an Australian paediatric hospital: Comparison over 1985, 1995, 2007 and 2017. Unpublished Bachelor of Health Science with Honours (Public Health) Thesis, La Trobe University, Australia.

Gjorgioski, Halliday, Riley, et al. (2020) Genetics and pediatric hospital admissions, 1985 to 2017. Genetics in Medicine 22(11): 1777–1785.

Glivetic, Rodin, Milosevic, et al. (2015) Prevalence, prenatal screening and neonatal features in children with Down syndrome: A registry-based national study. Italian Journal of Pediatrics 41: 81.

Hagen, Khattar, Whitehead, et al. (2022) Detection and impact of genetic disease in a level IV neonatal intensive care unit. Journal of Perinatology 42(1): 580–588.

Hageman, van Rooij IALM, de Blaauw, et al. (2023) A systematic overview of rare disease patient registries: Challenges in design, quality management, and maintenance. Orphanet Journal of Rare Diseases 18(1): 106.

Heinke, Isenburg, Stallings, et al. (2021) Prevalence of structural birth defects among infants with Down syndrome, 2013–2017: A US population-based study. Birth Defects Research 113(1): 189–202.

Herlin, Schmidt SAJ, Mogensen, et al. (2024) Prevalence and clinical characteristics of incontinentia pigmenti: A nationwide population-based study. Orphanet Journal of Rare Diseases 19(1): 454.

Hughes-McCormack, McGowan, Pell, et al. (2020) Birth incidence, deaths and hospitalisations of children and young people with Down syndrome, 1990–2015: Birth cohort study. British Medical Journal Open 10(1): e033770.

Hui, Halliday (2023) A decade of non-invasive prenatal screening in Australia: National impact on prenatal screening and diagnostic testing. Australian and New Zealand Journal of Obstetrics and Gynaecology 63(2): 264–267.

Hui, Hutchinson, Poulton, et al. (2017) Population-based impact of noninvasive prenatal screening on screening and diagnostic testing for fetal aneuploidy. Genetics in Medicine 19(12): 1338–1345.

Hui, Muggli, Halliday (2016) Population-based trends in prenatal screening and diagnosis for aneuploidy: A retrospective analysis of 38 years of state-wide data. BJOG: An International Journal of Obstetrics & Gynaecology 123(1): 90–97.

Hui, Poulton, Kluckow, et al. (2020) A minimum estimate of the prevalence of 22q11 deletion syndrome and other chromosome abnormalities in a combined prenatal and postnatal cohort. Human Reproduction 35(3): 694–704.

Independent Hospital Pricing Authority (2019) Australian Coding Standards, 11th edn. Sydney, NSW: IHPA.

Irving, Basu, Richmond, et al. (2008) Twenty-year trends in prevalence and survival of Down syndrome. European Journal of Human Genetics 16(12): 1336–1340.

Jones LBKR, McGrogan, Flood, et al. (2008) Special article: Chronic granulomatous disease in the United Kingdom and Ireland: A comprehensive national patient-based registry. Clinical and Experimental Immunology 152(2): 471–486.

Kristensen, Mathisen, Berland, et al. (2024) Epidemiology and natural history of POLG disease in Norway: A nationwide cohort study. Annals of Clinical and Translational Neurology 11(7): 1819–1830.

Lee, Singleton, Wallin, et al. (2020) Rare genetic diseases: Nature’s experiments on human development. iScience 23(5): 101123.

Lialiaris, Mantadakis, Kareli, et al. (2010) Frequency of genetic diseases and health coverage of children requiring admission in a general pediatric clinic of northern Greece. Italian Journal of Pediatrics 36(1): 3.

Liu, Hadzi-Tosev, Liu, et al. (2022) Accuracy of International Classification of Diseases, 10th revision codes for identifying sepsis: A systematic review and meta-analysis. Critical Care Explorations 4(11): e0788.

Loane, Morris, Addor, et al. (2013) Twenty-year trends in the prevalence of Down syndrome and other trisomies in Europe: Impact of maternal age and prenatal screening. European Journal of Human Genetics 21(1): 27–33.

Lujic, Simpson, Zwar, et al. (2017) Multimorbidity in Australia: Comparing estimates derived using administrative data sources and survey data. Public Library of Science One 12(8): e0183817.

Lujic, Watson, Randall, et al. (2014) Variation in the recording of common health conditions in routine hospital data: Study using linked survey and administrative data in New South Wales, Australia. British Medical Journal Open 4(9): 1–11.

Lunke, Bouffler, Downie, et al. (2024) Prospective cohort study of genomic newborn screening: BabyScreen+ pilot study protocol. British Medical Journal Open 14(4): e081426.

Lynch, Best, Gaff, et al. (2024) Australian public perspectives on genomic newborn screening: Which conditions should be included? Human Genomics 18(4): 45.

MacArthur, Hansen, Baynam, et al. (2023) Trends in prenatal diagnosis of congenital anomalies in Western Australia between 1980 and 2020: A population-based study. Paediatric and Perinatal Epidemiology 37(7): 596–606.

Mai, Isenburg, Canfield, et al. (2019) National population-based estimates for major birth defects, 2010–2014. Birth Defects Research 111(18): 1420–1435.

Marouane, Olde Keizer RACM, Frederix GWJ, et al. (2022) Congenital anomalies and genetic disorders in neonates: A single-center observational cohort study. European Journal of Pediatrics 181(3): 359–367.

Mattick, Dziadek, Terrill, et al. (2014) The impact of genomics on the future of medicine and health. Medical Journal of Australia 201(1): 17–20.

McCandless, Brunger, Cassidy (2004) The burden of genetic disease on inpatient care in a children’s hospital. American Journal of Human Genetics 74(1): 121–127.

McDonnell, Monteith, Kennelly, et al. (2017) Epidemiology of chromosomal trisomies in the East of Ireland. Journal of Public Health 39(4): e145–e151.

Mellis, Oprych, Scotchman, et al. (2022) Diagnostic yield of exome sequencing for prenatal diagnosis of fetal structural anomalies: A systematic review and meta-analysis. Prenatal Diagnosis 42(6): 662–685.

Métneki, Czeizel (2005) Increasing total prevalence rate of cases with Down syndrome in Hungary. European Journal of Epidemiology 20(6): 525–535.

Microsoft (2025) Microsoft Excel. Available at: https://www.microsoft.com/en-au/microsoft-365/excel (accessed 18 January 2025).

Moffitt, Abiri, Scheuerle, et al. (2011) Descriptive epidemiology of selected heritable birth defects in Texas. Birth Defects Research Part A: Clinical and Molecular Teratology 91(12): 990–994.

Molster, Youngs, Hammond, et al. (2012) Key outcomes from stakeholder workshops at a symposium to inform the development of an Australian national plan for rare diseases. Orphanet Journal of Rare Diseases 7: 50.

Molster, Youngs, Hammond, et al. (2016) Survey of healthcare experiences of Australian adults living with rare diseases. Orphanet Journal of Rare Diseases 11: 30.

Mukhina, Kuzmenko, Rodina, et al. (2020) Primary immunodeficiencies in Russia: Data from the National Registry. Frontiers in Immunology 11: 1491.

Murphy, Lambert, Treacy, et al. (2009) Incidence and prevalence of mucopolysaccharidosis type 1 in the Irish republic. Reproductive Toxicology 94(1): 52–54.

Nassar, Bower, Barker (2007) Increasing prevalence of hypospadias in Western Australia, 1980–2000. Archives of Disease in Childhood 92(7): 580–584.

National Human Genome Research Institute (2015) Frequently asked questions about genetic disorders. National Human Genome Research Institute. Available at: https://www.genome.gov/19016930/faq-about-genetic-disorders/ (accessed 12 January 2025).

National Library of Medicine (2021) Patterns of inheritance. U.S. Department of Health and Human Services. Available at: https://medlineplus.gov/genetics/understanding/inheritance/inheritancepatterns/ (accessed 12 January 2025).

Nowaczyk MJM, Zeesman, Waye, et al. (2004) Incidence of Smith-Lemli-Opitz syndrome in Canada: Results of three-year population surveillance. The Journal of Pediatric Genetics 145(4): 530–535.

Nsubuga, White, Thacker, et al. (2006) Public health surveillance: A tool for targeting and monitoring interventions. In: Jamison, Breman, Measham, et al. (eds.) Disease Control Priorities in Developing Countries, 2nd edn. Washington, DC: The International Bank for Reconstruction and Development/The World Bank.

Nussbaum, McInnes, Willard (2015) Thompson & Thompson Genetics in Medicine, 8th edn. Amsterdam, The Netherlands: Elsevier.

Organisation for Economic Co-operation and Development (2019) Health at a Glance 2019: OECD Indicators. Paris, France: OECD Publishing.

O’Malley, Hutcheon (2007) Genetic disorders and congenital malformations in pediatric long-term care. Journal of the American Medical Directors Association 8(5): 332–334.

Online Mendelian Inheritance in Man (OMIM) (2025) OMIM®. Baltimore, MD: McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University. Available at: https://omim.org/ (accessed 15 January 2025).

Orphanet (2024a) Rare disease registries. Available at: https://www.orpha.net/pdfs/orphacom/cahiers/docs/GB/Rare_Disease_Registries.pdf (accessed 23 January 2025).

Orphanet (2024b) Orphanet nomenclature for coding. Available at: https://www.orphadata.com/orphanet-nomenclature-for-coding/ (accessed 6 May 2025).

Óskarsdóttir, Vujic, Fasth (2004) Incidence and prevalence of the 22q11 deletion syndrome: A population-based study in Western Sweden. Archives of Disease in Childhood 89(2): 148–151.

Parker, Mai, Strickland (2009) Multistate study of the epidemiology of clubfoot. Birth Defects Research Part A: Clinical and Molecular Teratology 88(12): 897–904.

Peng, Sundararajan, Williamson, et al. (2018) Exploration of association rule mining for coding consistency and completeness assessment in inpatient administrative health data. Journal of Biomedical Informatics 79(1): 41–47.

Peters MDJ, Godfrey, McInerney, et al. (2020) Chapter 11: Scoping reviews (2020 version). In: Aromataris (eds.) JBI Manual for Evidence Synthesis. Adelaide, SA: JBI, pp. 407–451.

Quental, Vilarinho, Martins, et al. (2010) Incidence of maple syrup urine disease in Portugal. Molecular Genetics and Metabolism 100(4): 385–387.

Rafaelsen, Johansson, Ræder, et al. (2016) Hereditary hypophosphatemia in Norway: A retrospective population-based study of genotypes, phenotypes, and treatment complications. European Journal of Endocrinology 174(2): 125–136.

Rare Disease Awareness Rare Portal (2024) Rare disease classifications. Available at: https://rareportal.org.au/rare-disease-classifications/#about1 (accessed 6 May 2025).

Rare Voices Australia and Monash University (2023) Recommendations: National Approach to rare disease data: Findings from an audit of Australian Rare Disease Registries. Available at: https://rarevoices.org.au/wp-content/uploads/2023/07/Recommendations_NationalApproachtoRareDiseaseData.pdf (accessed 25 January 2025).

Reichard, McDermott, Ruttenber, et al. (2016) Testing the feasibility of a passive and active case ascertainment system for multiple rare conditions simultaneously: The experience in three US states. Journal of Medical Internet Research Public Health and Surveillance 2(2): e151.

Riley, Lee, Richardson, et al. (2024) The applications of Australian-coded ICD-10 and ICD-10-AM data in research: A scoping review of the literature. Health Information Management Journal 53(1): 41–50.

Rutherford, McFarland, Spindler, et al. (2010) Public health triangulation: Approach and application to synthesizing data to understand national and local HIV epidemics. BioMed Central Public Health 10: 447.

Ruseckaite, Mudunna, Caruso, et al. (2023) Current state of rare disease registries and databases in Australia: A scoping review. Orphanet Journal of Rare Diseases 18(1): 216.

Ryan, Riley, Cadilhac, et al. (2021) Factors associated with stroke coding quality: A comparison of registry and administrative data. Journal of Stroke and Cerebrovascular Diseases 30(2): 105469.

Saczynski, McManus, Goldberg (2013) Commonly used data-collection approaches in clinical research. The American Journal of Medicine 126(11): 946–950.

Salemi, Tanner, Block, et al. (2012) A comparison of two surveillance strategies for selected birth defects in Florida. Public Health Reports 127(3): 391–400.

Savva, Walker, Morris (2010) The maternal age-specific live birth prevalence of trisomies 13 and 18 compared to trisomy 21 (Down syndrome). Prenatal Diagnosis 30(1): 57–64.

Shai, Perez-Becker, Anders, et al. (2020) Incidence of SCID in Germany from 2014 to 2015: An ESPED (Erhebungseinheit für Seltene Pädiatrische Erkrankungen in Deutschland; German Paediatric Surveillance Unit) survey on behalf of the API (Arbeitsgemeinschaft Pädiatrische Immunologie). Environmental Research 40(1): 708–717.

Shin, Besser, Kucik, et al. (2009) Prevalence of Down syndrome among children and adolescents in 10 regions of the United States. Pediatrics 124(6): 1565–1571.

Sousa, Magalhães, Pinto, et al. (2023) Wilson’s disease: A prevalence study in a Portuguese population. Cureus 15(8): e43718.

Southern, Hall, White, et al. (2016) Opportunities and challenges for quality and safety applications in ICD-11: An international survey of users of coded health data. International Journal for Quality in Health Care 28(1): 129–135.

Stallings, Isenburg, Rutkowski, et al. (2024) National population-based estimates for major birth defects, 2016–2020. Birth Defects Research 116: e2301.

Stanclift, Dwight, Lee, et al. (2022) NGLY1 deficiency: Estimated incidence, clinical features, and genotypic spectrum from the NGLY1 Registry. Orphanet Journal of Rare Diseases 17: 440.

Stark, Scott (2023) Genomic newborn screening for rare diseases. Nature Reviews Genetics 24(1): 755–766.

Stochholm, Juul, Gravholt (2010) Diagnosis and mortality in 47,XYY persons: A registry study. Orphanet Journal of Rare Diseases 5(1): 15.

Stranneheim, Wedell (2016) Exome and genome sequencing: A revolution for the discovery and diagnosis of monogenic disorders. Journal of Internal Medicine 279(1): 3–15.

Swaggart, Swarr, Tolusso, et al. (2019) Making a genetic diagnosis in a Level IV neonatal intensive care unit population: Who, when, how, and at what cost? The Journal of Pediatrics 213(1): 211–217.e4.

Tonks, Gornall, Larkins, et al. (2013) Trisomies 18 and 13: Trends in prevalence and prenatal diagnosis – population based study. Prenatal Diagnosis 33(8): 742–750.

Tricco, Lillie, Zarin, et al. (2018) PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and explanation. Annals of Internal Medicine 169(7): 467–473.

Walker, Mahede, Davis, et al. (2017) The collective impact of rare diseases in Western Australia: An estimate using a population-based cohort. Genetics in Medicine 19(5): 546–552.

Waller, Correa, et al. (2008) The population-based prevalence of achondroplasia and thanatophoric dysplasia in selected regions of the US. American Journal of Medical Genetics Part A 146A(18): 2385–2389.

Weijerman, van Furth, Vonk Noordegraaf, et al. (2008) Prevalence, neonatal characteristics, and first-year mortality of Down syndrome: A national study. The Journal of Pediatrics 152(1): 15–19.

Wellesley, Boyd, Dolk, et al. (2012) Rare chromosome abnormalities, prevalence and prenatal diagnosis rates from population-based congenital anomaly registers in Europe. European Journal of Human Genetics 20(5): 521–526.

Winkelstein, Marino, Lederman, et al. (2006) X-linked agammaglobulinemia: Report on a United States registry of 201 patients. Medicine 85(4): 193–202.

Yanni, Copeland, Olney (2010) Birth defects and genetic disorders among Arab Americans—Michigan, 1992–2003. Journal of Immigrant and Minority Health 12(3): 408–413.

Zinchenko, Tebieva, Gabisova, et al. (2024) Orphan diseases in the Republic of North Ossetia–Alania: Structure, population genetic features, issues and prospects. Bulletin of RSMU 3: 43–51.