Identifying Algorithms to Improve the Accuracy of Unverified Diagnosis Codes for Birth Defects

Abstract

Objectives:

We identified algorithms to improve the accuracy of passive surveillance programs for birth defects that rely on administrative diagnosis codes for case ascertainment and in situations where case confirmation via medical record review is not possible or is resource prohibitive.

Methods:

We linked data from the 2009-2011 Florida Birth Defects Registry, a statewide, multisource, passive surveillance program, to an enhanced surveillance database with selected cases confirmed through medical record review. For each of 13 birth defects, we calculated the positive predictive value (PPV) to compare the accuracy of 4 algorithms that varied case definitions based on the number of diagnoses, medical encounters, and data sources in which the birth defect was identified. We also assessed the degree to which accuracy-improving algorithms would affect the Florida Birth Defects Registry’s completeness of ascertainment.

Results:

The PPV generated by using the original Florida Birth Defects Registry case definition (ie, suspected cases confirmed by medical record review) was 94.2%. More restrictive case definition algorithms increased the PPV to between 97.5% (identified by multiple codes or encounters in 1 data source) and 99.2% (identified in >1 data source). Although PPVs varied by birth defect, alternative algorithms increased accuracy for all birth defects; however, alternative algorithms also resulted in failing to ascertain 58.3% to 81.9% of cases.

Conclusions:

We found that surveillance programs that rely on unverified diagnosis codes can use algorithms to dramatically increase the accuracy of case finding, without having to review medical records. This can be important for etiologic studies. However, the use of increasingly restrictive case definition algorithms led to a decrease in completeness and the disproportionate exclusion of less severe cases, which could limit the widespread use of these approaches.

Keywords

birth defects, congenital malformations, accuracy, surveillance, positive predictive value

Florida has the second-largest population-based birth defects surveillance system in the United States that relies predominantly on passive case ascertainment. Operated by the Florida Department of Health and a consortium of partners, the Florida Birth Defects Registry (FBDR) has monitored a population of more than 3 million live births since January 1, 1998, to detect structural, functional, and biochemical abnormalities in infants. Data from the FBDR and other state-based surveillance programs are used to support local investigations, epidemiologic studies,^1-4 health outcomes research,^5-8 and national collaborative projects.^9,10 However, for passive programs, in which cases are defined by unverified administrative diagnosis codes present in hospital discharge or service-related databases, the utility of surveillance data is hampered by the likelihood of under-ascertainment and reporting errors.^11-15

Previous analyses have suggested that International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) diagnosis codes,¹⁶ upon which many birth defect diagnoses are established, capture the general occurrence of a birth defect well but often fail to identify the birth defect with high accuracy.¹⁷ For example, the overall accuracy of the FBDR, estimated by using the positive predictive value (PPV) for 13 selected birth defects during 2007-2011, was 93.3%; that is, 6.7% of cases identified by the ICD-9-CM codes were deemed to be false positives when validated by a diagnostic gold standard of medical record review with evidence of clinical confirmation of each birth defect. Despite a relatively high overall PPV, variation by birth defect ranged from 96.0% for gastroschisis to <55% for reduction deformities of the lower limb.¹⁷

In light of funding and other resource restrictions for population-based disease registries, which limit quality improvement efforts (including diagnosis verification through medical record review), we concluded that an exploratory study of cost-effective alternative strategies for improving diagnostic accuracy was warranted. Our investigation determined the extent to which case definition algorithms—those case inclusion criteria that leverage the number of codes, encounters, and various data sources in which birth defects diagnoses appear—can improve accuracy in the absence of medical record review. We also explored the trade-off between accuracy and completeness associated with various algorithms.

Methods

Study Design and Population

We linked data from a statewide passive surveillance database containing unverified diagnosis codes to data from an enhanced surveillance project with medical record–confirmed diagnoses for selected birth defects. The source population included all infants born alive to Florida-resident mothers between January 1, 2009, and December 31, 2011.

Passive surveillance: FBDR

To be included in the registry, 3 criteria must be met: (1) the biological mother must be a resident of Florida, (2) the infant must be born alive (spontaneous and elective terminations are not included), and (3) the infant must receive a diagnosis in the first year of life with 1 or more structural, genetic, or other specified birth outcomes that can adversely affect the infant’s health and development. The FBDR’s passive case ascertainment methodology identifies cases by collecting information from multiple extant data sources. The source population is first defined by Florida-resident birth certificates. Then, using a validated deterministic data linkage strategy,¹⁸ each data source is linked to the birth certificate record. Although data sources have changed over time¹⁹ because of funding restrictions or the identification of new, more reliable data sets, those used during the years of this evaluation did not change: (1) inpatient, (2) outpatient, and (3) emergency department hospital discharge records from the Florida Agency for Health Care Administration; and (4) infant death certificates from the Florida Bureau of Vital Statistics. Records from each linked data source contain ICD-9-CM codes or International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) diagnosis codes²⁰ that are used to identify birth defects. The final FBDR cohort consists of an unduplicated list of cases with 1 or more birth defect codes. If a case has multiple birth defects, all are documented. Because of resource constraints, the FBDR is unable to verify the entire cohort with an examination of medical records; therefore, ICD-9-CM and ICD-10-CM diagnosis codes that are indicative of birth defects are taken as valid and not subjected to case verification protocols.

Enhanced surveillance: statewide case confirmation project

In 2007, the Florida Department of Health and the University of South Florida Birth Defects Surveillance Program began operating an Environmental Public Health Tracking enhanced surveillance project that sought to implement more active case finding by contacting hospitals and by reviewing data from labor and delivery logs, neonatal intensive care units’ admission lists, and other clinical records.¹⁷ The Environmental Public Health Tracking project also implemented medical record review by trained abstractors to confirm each suspected case, defined as an infant having 1 or more of 13 birth defects (Table 1). These birth defects were initially selected based on public health importance, consistency of reporting across states in the United States, and the potential for causal links to environmental exposures. In 2010, additional funding allowed the Environmental Public Health Tracking project to be expanded to include statewide case verification, retroactively applied to birth cohorts beginning in 2007, for all cases initially identified by the passive case-finding system (FBDR). During the process, surveillance staff members queried the passive FBDR to identify all records (ie, medical encounters) from all data sources (eg, inpatient, outpatient, and emergency department discharge databases) in which 1 or more Environmental Public Health Tracking birth defects was diagnosed.²² All confirmed birth defects were recoded by using a Centers for Disease Control and Prevention–modified version of the British Pediatric Association coding system,²¹ which provides higher specificity than ICD diagnosis codes. Reasons for false-positive diagnoses were captured in prose.

Table 1.

Diagnosis codes used to identify individual birth defects and birth defect categories included in the Florida Birth Defects Registry’s Statewide Case Confirmation Project, 2007-2011^a

Birth Defect/Birth Defect Group	ICD-9-CM Codes^a,b	ICD-10-CM Codes^a,b	Modified British Pediatric Association Codes^a,c
Central nervous system
Anencephaly	740.0, 740.1	Q00.0, Q00.1	740.0x, 740.1x
Spina bifida without anencephaly	741.x without 740.0, 740.1	Q05.x, Q07.01, Q07.03 without Q00.0, Q00.1	741.x without 740.0x, 740.1x
Congenital heart defects
Hypoplastic left heart syndrome	746.7	Q23.4	746.700
Tetralogy of Fallot	745.2	Q21.3	745.200
Transposition of the great arteries	745.1x	Q20.3, Q20.5	745.1x
Limb anomalies
Reduction deformities, upper limb	755.2x	Q71.x	755.2x
Reduction deformities, lower limb	755.3x	Q72.x	755.3x
Orofacial clefts
Cleft palate without cleft lip	749.0x without 749.1x, 749.2x	Q35.1-Q35.9 without Q36.0-Q36.9, Q37.0-Q37.9	749.0x without 749.1x, 749.2x
Cleft lip without cleft palate	749.1x without 749.0x, 749.2x	Q36.0-Q36.9 without Q35.1-Q35.9, Q37.0-Q37.9	749.1x without 749.0x, 749.2x
Cleft palate with cleft lip	749.2x or 749.0x with 749.1x	Q37.0-Q37.9 or Q35.1-Q35.9 with Q36.0-Q36.9	749.2x or 749.0x with 749.1x
Other birth defects
Gastroschisis	756.79 with 54.71^d or 756.73	Q79.3	756.710
Hypospadias	752.61	Q54.0-Q54.9 (excluding Q54.4)	752.60x, 752.620, 752.625-752.627
Trisomy 21 (Down syndrome)	758.0	Q90.x	758.0x

Abbreviations: ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification; ICD-10-CM, International Classification of Diseases, Tenth Revision, Clinical Modification.

^aPresence of 1 or more of the listed codes serves as positive indication of the birth defect. Similarly, presence of 1 or more of the listed codes after “without” indicates absence of the birth defect regardless of other listed codes. An “x” indicates that all subcodes with the listed code prefix are part of the birth defect definition.

^bICD-9-CM¹⁶ and ICD-10-CM²⁰ codes were used by the statewide, passive surveillance system to identify cases of birth defects. ICD-10-CM codes were present only on infant death certificates; all other data sources relied on ICD-9-CM codes. All cases identified by this system underwent medical record review as part of the statewide confirmation project to confirm or rule out each included birth defect.

^cModified British Pediatric Association codes²¹ were used by trained abstractors to identify cases of birth defects after medical record review.

^dBeginning October 1, 2009, gastroschisis is identified exclusively by a single ICD-9-CM diagnosis code (756.73). Previously, cases of gastroschisis were identified by using the presence of both the nonspecific 756.79 ICD-9-CM diagnosis code (other congenital anomalies of abdominal wall) and the 54.71 ICD-9-CM procedure code indicating repair of gastroschisis.¹⁶
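The matching rules in footnote a (an "x" suffix standing for all subcodes, and codes after "without" ruling the birth defect out) can be illustrated with a short Python sketch. This is not the FBDR's software: the function names and the include/exclude representation are assumptions made for illustration, and the sketch does not handle ICD-10-CM code ranges or the code-plus-procedure rule for gastroschisis in footnote d.

```python
def code_matches(code, pattern):
    """Return True if a diagnosis code satisfies one Table 1 pattern.

    A pattern ending in "x" matches any code that begins with the
    preceding prefix; any other pattern must match exactly.
    """
    if pattern.endswith("x"):
        return code.startswith(pattern[:-1])
    return code == pattern


def has_defect(codes, include, exclude=()):
    """Apply an 'include ... without exclude ...' definition to one infant's codes."""
    # Codes listed after "without" rule the defect out regardless of other codes.
    if any(code_matches(c, p) for c in codes for p in exclude):
        return False
    return any(code_matches(c, p) for c in codes for p in include)


# Spina bifida without anencephaly (ICD-9-CM): 741.x without 740.0, 740.1
SPINA_BIFIDA = {"include": ["741.x"], "exclude": ["740.0", "740.1"]}

print(has_defect(["741.91"], **SPINA_BIFIDA))           # True
print(has_defect(["741.91", "740.0"], **SPINA_BIFIDA))  # False
```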

Exploration of algorithms to improve accuracy

In an effort to improve the accuracy of selected birth defects when medical record review was not possible, the FBDR Consortium’s analysts workgroup hypothesized that the diagnostic accuracy of ICD codes in the statewide passive system would be substantially higher when a birth defect was (1) identified by more than 1 code for the same birth defect (eg, 2 codes for reduction deformity of the lower limbs), (2) diagnosed during more than 1 medical encounter (eg, 2 inpatient hospitalizations), or (3) diagnosed in more than 1 database (eg, inpatient and emergency department). To evaluate this hypothesis, we compared several case definition algorithms (not mutually exclusive):

1. Original case definition in which any ICD code for the birth defect listed during any encounter, regardless of data source, was considered a positive indication of the birth defect. Example: for an infant, a single inpatient birth hospitalization has a documented ICD-9-CM code of 741.91 (spina bifida without mention of hydrocephalus, cervical region) and no other indication of spina bifida in any other data source. This would be a positive case of spina bifida.

2. Case definition that requires either multiple ICD codes for the same birth defect or multiple encounters during which the birth defect is diagnosed.

Example 1: for an infant, a single inpatient birth hospitalization has a documented ICD-9-CM code of 741.91 and no other indication of spina bifida in any other data source. This would not be considered a case of spina bifida.

Example 2: for an infant, a single inpatient birth hospitalization has a documented ICD-9-CM code of 741.91. The 741.91 code is documented during another inpatient hospitalization later in the infant’s first year of life. This would be a positive indication of the code during more than 1 distinct encounter and would be considered a case.

Example 3: for an infant, a single inpatient birth hospitalization has a documented ICD-9-CM code of 741.91 as well as a documented ICD-9-CM code of 741.01 (spina bifida with hydrocephalus, cervical region). Because 2 codes indicate spina bifida, this would be considered a case.

3. Case definition that requires either multiple ICD codes for the same birth defect or multiple encounters during which the birth defect is diagnosed (same case definition as case definition 2), but with the added restriction that all codes or encounters must come from the same data source. Therefore, although we would come to the same conclusion for examples 1, 2, and 3 by using case definition 3 as we did by using case definition 2, we would come to a different conclusion if the birth defect were identified in more than 1 data source (eg, inpatient and emergency department): this would not be considered a case.

4. Case definition that requires more than 1 data source in which the birth defect is diagnosed.

Example 2: for an infant, a single inpatient birth hospitalization has a documented ICD-9-CM code of 741.91 as well as a documented ICD-9-CM code of 741.01. This would not be considered a case of spina bifida, because the birth defect was diagnosed in only 1 data source.

Example 3: for an infant, a single inpatient birth hospitalization has a documented ICD-9-CM code of 741.91. The ICD-9-CM code of 741.91 is documented during an emergency department visit later in the infant’s first year of life. This infant would be considered a case of spina bifida because the birth defect was diagnosed in more than 1 data source.
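The 4 case definitions can be sketched as a single classification function. The Python below is an illustrative sketch only, not the code used in the study (which was conducted in SAS). The `Diagnosis` record, its field names, and the encounter identifiers are hypothetical, and the sketch assumes that "multiple codes" means more than 1 distinct ICD code for the same birth defect.

```python
from typing import NamedTuple


class Diagnosis(NamedTuple):
    source: str     # data source, eg "inpatient" or "emergency" (hypothetical labels)
    encounter: str  # hypothetical encounter identifier, unique within a source
    code: str       # ICD code already matched to the birth defect of interest


def classify(diagnoses):
    """Evaluate the 4 case definitions for one infant and one birth defect."""
    sources = {d.source for d in diagnoses}
    encounters = {(d.source, d.encounter) for d in diagnoses}
    codes = {d.code for d in diagnoses}

    multiple = len(codes) > 1 or len(encounters) > 1
    return {
        1: len(diagnoses) > 0,              # any code, any encounter, any source
        2: multiple,                        # multiple codes or multiple encounters
        3: multiple and len(sources) == 1,  # as 2, but all from 1 data source
        4: len(sources) > 1,                # diagnosed in more than 1 data source
    }


# One inpatient stay with a single code: a case only under definition 1.
single = [Diagnosis("inpatient", "A", "741.91")]
# The same code in an inpatient stay and an emergency department visit.
two_sources = single + [Diagnosis("emergency", "B", "741.91")]

print(classify(single))       # {1: True, 2: False, 3: False, 4: False}
print(classify(two_sources))  # {1: True, 2: True, 3: False, 4: True}
```

Under these assumptions, every infant who meets definition 2 meets exactly one of definitions 3 and 4, which agrees with the overall counts in Table 2 (1988 = 1123 + 865).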

Statistical Analyses

We primarily used PPV to compare the accuracy of ICD codes in the FBDR data under various case definition algorithms. We defined PPV as the proportion of FBDR-identified (passive) cases that were confirmed by medical record review to have the same birth defect. We calculated 95% confidence intervals (CIs) by using the asymptotic (Wald) method for binomial proportions. We calculated PPV overall, for each of the 13 birth defects, and for several groups of birth defects classified by body system. For each birth defect or birth defect group, we compared PPV by using the 4 case definition algorithms described previously. For these analyses, infants were unduplicated at the level of the birth defect or birth defect group (eg, an infant with more than 1 code for spina bifida was counted only once as having spina bifida). An infant with more than 1 FBDR-identified birth defect considered in this study was included in more than 1 birth defect-specific analysis. All inferential tests were 2-tailed with a 5% type I error rate, and we conducted analyses by using SAS version 9.4.²³ Because this project was an evaluation of the FBDR data, whose reporting and surveillance are under the authority of Florida Statute 381.0031, the Florida Department of Health Institutional Review Board determined that this study was exempt from review.
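As an illustration of the PPV and Wald interval calculation, the following Python sketch uses the overall counts reported in the Results (4497 of 4772 suspected cases confirmed). It is not the SAS code used in the study, and the function name is hypothetical. Truncating the limits to the range 0% to 100% is an assumption, inferred from the upper limits of 100.0 reported in Table 2.

```python
import math


def ppv_wald(confirmed, reported, z=1.959964):
    """Return (PPV, lower, upper) as percentages, using a Wald 95% CI.

    Limits are truncated to [0, 100]; this is an assumption, not a
    documented step of the study's analysis.
    """
    p = confirmed / reported
    half_width = z * math.sqrt(p * (1 - p) / reported)
    lower = max(0.0, p - half_width)
    upper = min(1.0, p + half_width)
    return tuple(round(100 * v, 1) for v in (p, lower, upper))


# Overall counts under the original case definition.
print(ppv_wald(4497, 4772))  # (94.2, 93.6, 94.9)
```

When every reported case is confirmed, the Wald standard error is 0, which is why a PPV of 100.0% is accompanied by an interval of 100.0% to 100.0%.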

Results

From 2009 through 2011, the FBDR’s passive surveillance system identified 4772 infants with 1 or more of the 13 birth defects studied. The PPV under the original case definition was 94.2% (95% CI, 93.6%-94.9%), with 4497 of the 4772 suspected cases confirmed by medical record review. However, case definition algorithms requiring identification of the birth defect across multiple codes, encounters, or data sources increased the PPV to between 97.5% (95% CI, 96.6%-98.4%) if the birth defect was identified by multiple codes or multiple encounters from 1 data source and 99.2% (95% CI, 98.6%-99.8%) if the birth defect was identified by more than 1 data source. Although overall, the PPV for the original case definition was high, PPVs across birth defects varied from 61.2% (95% CI, 50.8%-71.5%) for reduction deformities of the lower limb to 96.3% (95% CI, 94.2%-98.5%) for gastroschisis (Table 2). Alternative case definition algorithms again improved accuracy, even for birth defects based on ICD-9-CM codes with low PPVs. For example, the PPV for reduction deformities of the lower limb increased to 90.0% (95% CI, 71.4%-100.0%) when the birth defect was identified by multiple codes or multiple encounters and 100.0% (95% CI, 100.0%-100.0%) when the birth defect was identified by more than 1 data source. Moreover, the algorithm requiring identification by multiple codes or multiple encounters resulted in a PPV ≥90% for all except 2 birth defects (anencephaly, 88.2%, and cleft lip without cleft palate, 84.6%). When identification of the birth defect in more than 1 data source was required, the PPV became 100.0% for 7 of the 13 birth defects and 2 of the 3 birth defect groups (Table 2).

Table 2.

Birth defect-specific PPV of the FBDR for selected birth defects using various passive ascertainment algorithms to identify cases, 2009-2011

Birth Defect^a,b	Original Case Definition		Birth Defect Identified by Multiple Codes or Multiple Encounters		Birth Defect Identified by Multiple Codes or Multiple Encounters, but From Only 1 Data Source		Birth Defect Identified by >1 Data Source
Birth Defect^a,b	No. of Cases Reported by FBDR	Birth Defect- Specific PPV^c of FBDR (95% CI)	No. of Cases Reported by FBDR	Birth Defect- Specific PPV^c of FBDR (95% CI)	No. of Cases Reported by FBDR	Birth Defect-Specific PPV^c of FBDR (95% CI)	No. of Cases Reported by FBDR	Birth Defect-Specific PPV^c of FBDR (95% CI)
Individual birth defects
Anencephaly	39	82.1 (70.0-94.1)	17	88.2 (72.9-100.0)	2	100.0 (100.0-100.0)	15	100.0 (100.0-100.0)
Cleft lip without cleft palate	162	71.0 (64.0-78.0)	65	84.6 (75.8-93.4)	43	86.0 (75.7-96.4)	22	81.8 (65.7-97.9)
Cleft palate without cleft lip	367	79.0 (74.9-83.2)	160	95.6 (92.5-98.8)	123	97.6 (94.8-100.0)	37	89.2 (79.2-99.2)
Cleft palate with cleft lip	267	85.8 (81.6-90.0)	150	97.3 (94.8-99.9)	112	96.4 (93.0-99.9)	38	100.0 (100.0-100.0)
Gastroschisis	300	96.3 (94.2-98.5)	82	97.6 (94.2-100.0)	68	97.1 (93.0-100.0)	14	100.0 (100.0-100.0)
Hypoplastic left heart syndrome	185	77.8 (71.9-83.8)	115	93.0 (88.4-97.7)	68	88.2 (80.6-95.9)	47	100.0 (100.0-100.0)
Hypospadias	1999	95.5 (94.6-96.4)	479	99.0 (98.0-99.9)	89	96.6 (92.9-100.0)	390	99.5 (98.8-100.0)
Reduction deformities, upper limb	129	76.0 (68.6-83.3)	26	96.2 (88.8-100.0)	21	95.2 (86.1-100.0)	5	100.0 (100.0-100.0)
Reduction deformities, lower limb	85	61.2 (50.8-71.5)	10	90.0 (71.4-100.0)	6	83.3 (53.5-100.0)	4	100.0 (100.0-100.0)
Spina bifida without anencephaly	173	87.3 (82.3-92.2)	107	97.2 (94.1-100.0)	63	96.8 (92.5-100.0)	44	97.7 (93.3-100.0)
Tetralogy of Fallot	316	91.1 (88.0-94.3)	189	98.9 (97.5-100.0)	136	98.5 (96.5-100.0)	53	100.0 (100.0-100.0)
Transposition of the great arteries	286	92.0 (88.8-95.1)	148	95.9 (92.8-99.1)	122	95.9 (92.4-99.4)	26	96.2 (88.8-100.0)
Trisomy 21 (Down syndrome)	776	93.3 (91.5-95.1)	407	99.3 (98.4-100.0)	244	99.6 (98.8-100.0)	163	98.8 (97.1-100.0)
Birth defect groups
Any orofacial cleft	702	96.9 (95.6-98.2)	411	99.0 (98.1-100.0)	302	99.3 (98.4-100.0)	109	98.2 (95.6-100.0)
Any included congenital heart defect	722	91.4 (89.4-93.5)	456	97.1 (95.6-98.7)	332	96.1 (94.0-98.2)	124	100.0 (100.0-100.0)
Any limb reduction deformity	195	72.8 (66.6-79.1)	52	82.7 (72.4-93.0)	43	79.1 (66.9-91.2)	9	100.0 (100.0-100.0)
Any included birth defect	4772	94.2 (93.6-94.9)	1988	98.2 (97.7-98.8)	1123	97.5 (96.6-98.4)	865	99.2 (98.6-99.8)

Abbreviations: FBDR, Florida Birth Defects Registry; PPV, positive predictive value.

^aResults are presented at the birth defect level. Because infants may have more than 1 birth defect included in the case confirmation project, the sum of individual birth defects in any birth defect group and overall may add to more than group/overall totals. Ordering of birth defects in the individual birth defects group is alphabetical.

^bCase confirmation protocol included medical record review by a trained abstractor followed by documentation of each confirmed birth defect using modified British Pediatric Association diagnosis codes.²¹

^cCalculated as (number of FBDR cases confirmed with the same birth defect / number of cases of the birth defect reported by the FBDR) × 100.

Although the alternative case definition algorithms significantly improved PPV, they decreased the ability to identify cases because they had more stringent inclusion criteria. The alternative algorithms failed to identify 58.3% to 81.9% of the 4772 cases identified by the original case definition. Even for conditions such as gastroschisis, which had a high PPV based on the original case definition, 72.7% to 95.3% of cases were lost by requiring identification of the birth defect by multiple codes, multiple encounters, or multiple data sources (Table 3).
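The loss of completeness is a simple percentage decrease relative to the original case definition. A minimal Python sketch, using the overall case counts reported in Tables 2 and 3 (the function name and dictionary labels are hypothetical):

```python
def percent_decrease(original, restricted):
    """Percentage of originally identified cases lost under a stricter definition."""
    return round(100 * (original - restricted) / original, 1)


ORIGINAL_CASES = 4772  # any included birth defect, original case definition
ALTERNATIVES = {
    "multiple codes or multiple encounters": 1988,
    "multiple codes or encounters, 1 data source": 1123,
    ">1 data source": 865,
}

for label, n_cases in ALTERNATIVES.items():
    print(label, percent_decrease(ORIGINAL_CASES, n_cases))
# multiple codes or multiple encounters 58.3
# multiple codes or encounters, 1 data source 76.5
# >1 data source 81.9
```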

Table 3.

Birth defect-specific completeness of ascertainment (relative to the original FBDR case definition) for selected birth defects using various passive ascertainment algorithms to identify cases, 2009-2011

Birth Defect^a,b	Original Case Definition, No. of Cases Reported by FBDR	Birth Defect Identified by Multiple Codes or Multiple Encounters		Birth Defect Identified by Multiple Codes or Multiple Encounters, but From Only 1 Data Source		Birth Defect Identified by >1 Data Source
Birth Defect^a,b	Original Case Definition, No. of Cases Reported by FBDR	No. of Cases Reported by FBDR	% Decrease in No. of Cases	No. of Cases Reported by FBDR	% Decrease in No. of Cases	No. of Cases Reported by FBDR	% Decrease in No. of Cases
Individual birth defects
Anencephaly	39	17	56.4	2	94.9	15	61.5
Cleft lip without cleft palate	162	65	59.9	43	73.5	22	86.4
Cleft palate without cleft lip	367	160	56.4	123	66.5	37	89.9
Cleft palate with cleft lip	267	150	43.8	112	58.1	38	85.8
Gastroschisis	300	82	72.7	68	77.3	14	95.3
Hypoplastic left heart syndrome	185	115	37.8	68	63.2	47	74.6
Hypospadias	1999	479	76.0	89	95.5	390	80.5
Reduction deformities, lower limb	85	10	88.2	6	92.9	4	95.3
Reduction deformities, upper limb	129	26	79.8	21	83.7	5	96.1
Spina bifida without anencephaly	173	107	38.2	63	63.6	44	74.6
Tetralogy of Fallot	316	189	40.2	136	57.0	53	83.2
Transposition of the great arteries	286	148	48.3	122	57.3	26	90.9
Trisomy 21 (Down syndrome)	776	407	47.6	244	68.6	163	79.0
Birth defect groups
Any included congenital heart defect	722	456	36.8	332	54.0	124	82.8
Any limb reduction deformity	195	52	73.3	43	77.9	9	95.4
Any orofacial cleft	702	411	41.5	302	57.0	109	84.5
Any included birth defect	4772	1988	58.3	1123	76.5	865	81.9

Abbreviation: FBDR, Florida Birth Defects Registry.

Discussion

This study used medical record–verified cases to assess the degree to which various case definition algorithms for birth defects can improve the likelihood that cases identified by unverified ICD codes are true cases. Our findings suggest that these strategies can be used to substantially improve the PPV of ICD diagnosis codes for selected birth defects without the need to confirm cases via medical record review. However, as expected, although algorithms that require multiple codes, encounters, or data sources to meet the case definition can greatly increase PPV, they also significantly decrease the completeness of ascertainment, because only a subset of infants with birth defects has multiple medical encounters outside of routine care.

Approximately 60% of birth defects surveillance programs in the United States rely predominantly or exclusively on a passive strategy to ascertain cases,²² as do many other disease registries. Our results should encourage programs with limited resources for case verification to explore strategies to maximize various quality metrics (eg, accuracy, completeness, timeliness) for case ascertainment in accordance with programmatic goals. Our findings suggest that if a passive surveillance program were contributing data to a national study on the etiology of birth defects and the inclusion of false-positive diagnoses were a substantial threat to internal validity, the false-positive rate for selected birth defects could be minimized without medical record review by using alternative case definition algorithms. However, depending on the definition used, the cases selected for such a study, although being true cases, may not be representative of all cases. For example, in our study, the PPV for the ICD-9-CM code for hypoplastic left heart syndrome (746.7) increased from 77.8% to 100.0% by requiring the code to appear in more than 1 data source. Infants whose diagnosis of hypoplastic left heart syndrome was documented in more than 1 data source had more severe complications that required more inpatient hospitalizations, outpatient surgeries, and emergency department visits than infants whose diagnosis of the syndrome was documented in only 1 data source. Thus, any study based on this subset of more severe cases may overestimate morbidity, mortality, costs, and epidemiologic associations, because the case definition requires more frequent contact with the medical system.

Conversely, for fundamental activities such as routine monitoring, aimed at detecting changes in the environment or health status of populations or for planning and resource allocation, overly restrictive case definition algorithms are not recommended, because the limitations of the excessively low completeness of ascertainment outweigh the benefits of smaller-scale increases in accuracy. For example, in our study, requiring the ICD-9-CM code for Down syndrome (758.0) to appear in more than 1 medical encounter (ie, >1 discharge record) improved PPV from 93.3% to 99.3%, but such a case definition would capture barely half of all confirmed cases and substantially underestimate the prevalence and burden of cases. Therefore, when exploring alternative algorithms for capturing data on birth defects or any other health outcome, decision makers should consider these quality metric tradeoffs, including timeliness,²⁴ so that adopted strategies align with intended goals.

Limitations

This study had several limitations. First, despite considering algorithms that evaluate plausible case definitions based on diagnoses appearing in multiple ICD codes, medical encounters, and data sources, we did not explore all possible definitions. For example, Smith et al²⁵ found that including in their algorithm variables related to demographic characteristics and specialty providers in addition to the number of times a diagnosis code was used improved the PPV of a passive surveillance system for muscular dystrophy. Second, although the FBDR links to a broad range of data sources to establish an annual cohort of infants born with birth defects, the FBDR does not link to educational services, government insurance data, or data sources that contain information on terminations or stillbirths. These data sources might capture data on cases that would otherwise have been missed or may not have multiple occurrences in hospital administrative data sources. The availability of these data sources alone may have improved the accuracy and completeness beyond that attainable by varying case definition algorithms based on current FBDR data sources. Third, the FBDR does not use medical specialists to review records. A pediatric cardiologist, for example, does not review the entire chart of a child who receives a diagnosis of hypoplastic left heart syndrome, and in some registries, approximately 35% of hypoplastic left heart syndrome cases are misdiagnosed.²⁶

Despite these limitations, this study had several strengths. The data source was large and statewide, providing case information on a diverse population of nearly 5000 children with birth defects of major public health importance. Unlike other passive registries, the FBDR has an enhanced surveillance component that allows for medical record review to verify selected birth defects, providing a gold standard diagnosis upon which an evaluation of the accuracy and completeness of the passive statewide program could be based. The ultimate goal of this study was to determine whether the accuracy of a passive surveillance program could be improved, specifically in situations in which case confirmation via medical record review either would not be possible or would be resource prohibitive. Our results suggest that excellent accuracy of ICD codes for selected birth defects can be achieved but that decreases in completeness and generalizability of cases are inevitable if increasingly restrictive case definition algorithms are applied.

Conclusions

Surveillance programs and disease registries use their data for various purposes, including incidence and/or prevalence estimation, temporal trends analysis, local investigations, epidemiologic association studies, health outcomes research, and etiological examinations.²⁷ Moreover, for registries based on rare outcomes, regional or national collaborative projects based on pooled data may be necessary to achieve sufficient statistical precision to contribute meaningfully to the evidence base. Improving the diagnostic accuracy and completeness of registries using passive case-finding strategies is essential to their inclusion in pooled data sources. The methods described in this study have broader applicability to public health surveillance and research using plan- or population-based claims data as well as administrative public health data collected from hospitals for conditions that require multiple health encounters.

Footnotes

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The analysis for this project was supported by an award from the Centers for Disease Control and Prevention (CDC) (Grant No. 1 NU50DD004946-01-00). The content of this article is solely the responsibility of the authors and does not necessarily represent the official views of CDC, the Florida Department of Health, the University of South Florida, or Baylor College of Medicine.

References

1. Block, Watkins, Salemi. Maternal pre-pregnancy body mass index and risk of selected birth defects: evidence of a dose-response relationship. Paediatr Perinat Epidemiol. 2013;27(6):521–531.

2. Nembhard, Salemi, Hauser, Kornosky. Are there ethnic disparities in risk of preterm birth among infants born with congenital heart defects? Birth Defects Res A Clin Mol Teratol. 2007;79(11):754–764.

3. Kucik, Cassell, Alverson. Role of health insurance on the survival of infants with congenital heart defects. Am J Public Health. 2014;104(9):e62–e70.

4. Salemi, Pierre, Tanner. Maternal nativity as a risk factor for gastroschisis: a population-based study. Birth Defects Res A Clin Mol Teratol. 2009;85(11):890–896.

5. Dawson, Cassell, Oster. Hospitalizations and associated costs in a population-based study of children with Down syndrome born in Florida. Birth Defects Res A Clin Mol Teratol. 2014;100(11):826–836.

6. Delmelle, Cassell, Dony. Modeling travel impedance to medical care for children with birth defects using geographic information systems. Birth Defects Res A Clin Mol Teratol. 2013;97(10):673–684.

7. Peterson, Dawson, Grosse. Hospitalizations, costs, and mortality among infants with critical congenital heart disease: how important is timely detection? Birth Defects Res A Clin Mol Teratol. 2013;97(10):664–672.

8. Radcliff, Cassell, Tanner. Hospital use, associated costs, and payer status for infants born with spina bifida. Birth Defects Res A Clin Mol Teratol. 2012;94(12):1044–1053.

9. Wang, Liu, Canfield. Racial/ethnic differences in survival of United States children with birth defects: a population-based study. J Pediatr. 2015;166(4):819–826.

10. Canfield, Mai, Wang. The association between race/ethnicity and major birth defects in the United States, 1999-2007. Am J Public Health. 2014;104(9):e14–e23.

11. Boulet, Correa-Villasenor, Hsia, Atrash. Feasibility of using the National Hospital Discharge Survey to estimate the prevalence of selected birth defects. Birth Defects Res A Clin Mol Teratol. 2006;76(11):757–761.

12. Callif-Daley, Huether, Edmonds. Evaluating false positives in two hospital discharge data sets of the Birth Defects Monitoring Program. Public Health Rep. 1995;110(2):154–160.

13. Cronk, Malloy, Pelech. Completeness of state administrative databases for surveillance of congenital heart disease. Birth Defects Res A Clin Mol Teratol. 2003;67(9):597–603.

14.

Frohnert

Lussky

Alms

Mendelsohn

Symonik

Falken

. Validity of hospital discharge data for identifying infants with cardiac defects. J Perinatol. 2005;25(11):737–742.

15.

Hexter

Harris

. Bias in congenital malformations information from the birth certificate. Teratology. 1991;44(2):177–180.

16.

National Center for Health Statistics. International classification of diseases, ninth revision, clinical modification (ICD-9-CM). https://www.cdc.gov/nchs/icd/icd9cm.htm. Published 2015. Accessed September 19, 2017.

17.

Salemi

Tanner

Sampat

. The accuracy of hospital discharge diagnosis codes for major birth defects: evaluation of a statewide registry with passive case ascertainment. J Public Health Manag Pract. 2016;22(3):E9–E19.

18.

Salemi

Tanner

Bailey

Mbah

Salihu

. Creation and evaluation of a multi-layered maternal and child health database for comparative effectiveness research. J Registry Manag. 2013;40(1):14–28.

19.

Salemi

Tanner

Block

. The relative contribution of data sources to a birth defects registry utilizing passive multisource ascertainment methods: does a smaller birth defects case ascertainment net lead to overall or disproportionate loss? J Registry Manag. 2011;38(1):30–38.

20.

National Center for Health Statistics. International classification of diseases, tenth revision, clinical modification (ICD-10-CM). https://www.cdc.gov/nchs/icd/icd10cm.htm. Published 2017. Accessed September 19, 2017.

21.

Centers for Disease Control and Prevention. Birth defects and genetic diseases branch 6-digit code: for reportable congenital anomalies. https://www.cdc.gov/ncbddd/birthdefects/documents/MACDPcode0807.pdf. Published 2007. Accessed November 3, 2017.

22.

Nembhard

Wang

Loscalzo

Salemi

. Variation in the prevalence of congenital heart defects by maternal race/ethnicity and infant sex. J Pediatr. 2010;156(2):259–264.

23.

SAS Institute, Inc. SAS Version 9.4 for Windows. Cary, NC: SAS Institute, Inc; 2016.

24.

Salemi

Tanner

Anjohrin

. Evaluating difficult decisions in public health surveillance: striking the right balance between timeliness and completeness. J Registry Manag. 2015;42(2):48–61.

25.

Smith

Royer

Mann

McDermott

. Using administrative data to ascertain true cases of muscular dystrophy: rare disease surveillance. JMIR Public Health Surveill. 2017;3(1):e2.

26.

Hirsch

Copeland

Donohue

Kirby

Grigorescu

Gurney

. Population-based analysis of survival for hypoplastic left heart syndrome. J Pediatr. 2011;159(1):57–63.

27.

Anderka

Mai

Romitti

. Development and implementation of the first national data quality standards for population-based birth defects surveillance programs in the United States. BMC Public Health. 2015;15:925.