Abstract
Objectives:
We identified algorithms to improve the accuracy of passive surveillance programs for birth defects that rely on administrative diagnosis codes for case ascertainment, in situations where case confirmation via medical record review is not possible or is resource prohibitive.
Methods:
We linked data from the 2009-2011 Florida Birth Defects Registry, a statewide, multisource, passive surveillance program, to an enhanced surveillance database with selected cases confirmed through medical record review. For each of 13 birth defects, we calculated the positive predictive value (PPV) to compare the accuracy of 4 algorithms that varied case definitions based on the number of diagnoses, medical encounters, and data sources in which the birth defect was identified. We also assessed the degree to which accuracy-improving algorithms would affect the Florida Birth Defects Registry’s completeness of ascertainment.
Results:
The PPV generated by using the original Florida Birth Defects Registry case definition (ie, suspected cases confirmed by medical record review) was 94.2%. More restrictive case definition algorithms increased the PPV to between 97.5% (identified by >1 code or encounter in 1 data source) and 99.2% (identified in >1 data source). Although PPVs varied by birth defect, alternative algorithms increased accuracy for all birth defects; however, alternative algorithms also resulted in failing to ascertain 58.3% to 81.9% of cases.
Conclusions:
We found that surveillance programs that rely on unverified diagnosis codes can use algorithms to dramatically increase the accuracy of case finding, without having to review medical records. This can be important for etiologic studies. However, the use of increasingly restrictive case definition algorithms led to a decrease in completeness and the disproportionate exclusion of less severe cases, which could limit the widespread use of these approaches.
Florida has the second-largest population-based birth defects surveillance system in the United States that relies predominantly on passive case ascertainment. Operated by the Florida Department of Health and a consortium of partners, the Florida Birth Defects Registry (FBDR) has monitored a population of more than 3 million live births since January 1, 1998, to detect structural, functional, and biochemical abnormalities in infants. Data from the FBDR and other state-based surveillance programs are used to support local investigations, epidemiologic studies, 1-4 health outcomes research, 5-8 and national collaborative projects. 9,10 However, for passive programs, in which cases are defined by unverified administrative diagnosis codes present in hospital discharge or service-related databases, the utility of surveillance data is hampered by the likelihood of under-ascertainment and reporting errors. 11-15
Previous analyses have suggested that
In light of funding and other resource restrictions for population-based disease registries, which limit quality improvement efforts (including diagnosis verification through medical record review), we concluded that an exploratory study of cost-effective alternative strategies for improving diagnostic accuracy was warranted. Our investigation determined the extent to which case definition algorithms—those case inclusion criteria that leverage the number of codes, encounters, and various data sources in which birth defects diagnoses appear—can improve accuracy in the absence of medical record review. We also explored the trade-off between accuracy and completeness associated with various algorithms.
Methods
Study Design and Population
We linked data from a statewide passive surveillance database containing unverified diagnosis codes to data from an enhanced surveillance project with medical record–confirmed diagnoses for selected birth defects. The source population included all infants born alive to Florida-resident mothers between January 1, 2009, and December 31, 2011.
Passive surveillance: FBDR
To be included in the registry, 3 criteria must be met: (1) the biological mother must be a resident of Florida, (2) the infant must be born alive (spontaneous and elective terminations are not included), and (3) the infant must receive a diagnosis in the first year of life with 1 or more structural, genetic, or other specified birth outcomes that can adversely affect the infant’s health and development. The FBDR’s passive case ascertainment methodology identifies cases by collecting information from multiple extant data sources. The source population is first defined by Florida-resident birth certificates. Then, using a validated deterministic data linkage strategy, 18 each data source is linked to the birth certificate record. Although data sources have changed over time 19 because of funding restrictions or the identification of new, more reliable data sets, those used during the years of this evaluation did not change: (1) inpatient, (2) outpatient, and (3) emergency department hospital discharge records from the Florida Agency for Health Care Administration; and (4) infant death certificates from the Florida Bureau of Vital Statistics. Records from each linked data source contain ICD-9-CM codes or
Enhanced surveillance: statewide case confirmation project
In 2007, the Florida Department of Health and the University of South Florida Birth Defects Surveillance Program began operating an Environmental Public Health Tracking enhanced surveillance project that sought to implement more active case finding by contacting hospitals and by reviewing data from labor and delivery logs, neonatal intensive care units’ admission lists, and other clinical records. 17 The Environmental Public Health Tracking project also implemented medical record review by trained abstractors to confirm each suspected case, defined as an infant having 1 or more of 13 birth defects (Table 1). These birth defects were initially selected based on public health importance, consistency of reporting across states in the United States, and the potential for causal links to environmental exposures. In 2010, additional funding allowed the Environmental Public Health Tracking project to be expanded to include statewide case verification, retroactively applied to birth cohorts beginning in 2007, for all cases initially identified by the passive case-finding system (FBDR). During the process, surveillance staff members queried the passive FBDR to identify all records (ie, medical encounters) from all data sources (eg, inpatient, outpatient, and emergency department discharge databases) in which 1 or more Environmental Public Health Tracking birth defects was diagnosed. 22 All confirmed birth defects were recoded by using a Centers for Disease Control and Prevention–modified version of the British Pediatric Association coding system, 21 which provides higher specificity than ICD diagnosis codes. Reasons for false-positive diagnoses were captured in prose.
Diagnosis codes used to identify individual birth defects and birth defect categories included in the Florida Birth Defects Registry’s Statewide Case Confirmation Project, 2007-2011a
Abbreviations: ICD-9-CM,
aPresence of 1 or more of the listed codes serves as positive indication of the birth defect. Similarly, presence of 1 or more of the listed codes after “without” indicates absence of the birth defect regardless of other listed codes. An “x” indicates that all subcodes with the listed code prefix are part of the birth defect definition.
bICD-9-CM 16 and ICD-10-CM 20 codes were used by the statewide, passive surveillance system to identify cases of birth defects. ICD-10-CM codes were present only on infant death certificates; all other data sources relied on ICD-9-CM codes. All cases identified by this system underwent medical record review as part of the statewide confirmation project to confirm or rule out each included birth defect.
cModified British Pediatric Association codes 21 were used by trained abstractors to identify cases of birth defects after medical record review.
dBeginning October 1, 2009, gastroschisis is identified exclusively by a single ICD-9-CM diagnosis code (756.73). Previously, cases of gastroschisis were identified by using the presence of both the nonspecific 756.79 ICD-9-CM diagnosis code (other congenital anomalies of abdominal wall) and the 54.71 ICD-9-CM procedure code indicating repair of gastroschisis. 16
Exploration of algorithms to improve accuracy
In an effort to improve the accuracy of selected birth defects when medical record review was not possible, the FBDR Consortium’s analysts workgroup hypothesized that the diagnostic accuracy of ICD codes in the statewide passive system would be substantially higher when a birth defect was (1) identified by more than 1 code for the same birth defect (eg, 2 codes for reduction deformity of the lower limbs), (2) diagnosed during more than 1 medical encounter (eg, 2 inpatient hospitalizations), or (3) diagnosed in more than 1 database (eg, inpatient and emergency department). To evaluate this hypothesis, we compared several case definition algorithms (not mutually exclusive):
Case definition 1: Original case definition in which any ICD code for the birth defect listed during any encounter, regardless of data source, was considered a positive indication of the birth defect.
Example: for an infant, a single inpatient birth hospitalization has a documented ICD-9-CM code of 741.91 (spina bifida without mention of hydrocephalus, cervical region) and no other indication of spina bifida in any other data source. This would be a positive case of spina bifida.
Case definition 2: Case definition that requires either multiple ICD codes for the same birth defect or multiple encounters during which the birth defect is diagnosed.
Example 1: for an infant, a single inpatient birth hospitalization has a documented ICD-9-CM code of 741.91 and no other indication of spina bifida in any other data source. This would not be considered a case of spina bifida.
Example 2: for an infant, a single inpatient birth hospitalization has a documented ICD-9-CM code of 741.91. The 741.91 code is documented during another inpatient hospitalization later in the infant’s first year of life. This would be a positive indication of the code during more than 1 distinct encounter and would be considered a case.
Example 3: for an infant, a single inpatient birth hospitalization has a documented ICD-9-CM code of 741.91 as well as a documented ICD-9-CM code of 741.01 (spina bifida with hydrocephalus, cervical region). Because 2 codes indicate spina bifida, this would be considered a case.
Case definition 3: Case definition that requires either multiple ICD codes for the same birth defect or multiple encounters during which the birth defect is diagnosed (same case definition as case definition 2), but with the added restriction that all codes or encounters must come from the same data source. Therefore, although we would come to the same conclusion for examples 1, 2, and 3 by using case definition 3 as we did by using case definition 2, we would come to a different conclusion if the birth defect were identified in more than 1 data source (eg, inpatient and emergency department): this would not be considered a case.
Case definition 4: Case definition that requires more than 1 data source in which the birth defect is diagnosed.
Example 1: for an infant, a single inpatient birth hospitalization has a documented ICD-9-CM code of 741.91 and no other indication of spina bifida in any other data source. This would not be considered a case of spina bifida.
Example 2: for an infant, a single inpatient birth hospitalization has a documented ICD-9-CM code of 741.91 as well as a documented ICD-9-CM code of 741.01. This would not be considered a case of spina bifida, because the birth defect was diagnosed in only 1 data source.
Example 3: for an infant, a single inpatient birth hospitalization has a documented ICD-9-CM code of 741.91. The ICD-9-CM code of 741.91 is documented during an emergency department hospitalization later in the infant’s first year of life. This infant would be considered a case of spina bifida because the birth defect was diagnosed in more than 1 data source.
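The 4 case definitions above can be expressed as a small classification routine. The following Python sketch is illustrative only and is not the code used in this study (analyses were conducted in SAS). The `Encounter` structure, its field names, and the `classify` function are hypothetical, and the sketch assumes that "multiple ICD codes" means 2 or more distinct codes for the same birth defect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encounter:
    """One linked record carrying codes for a single birth defect (hypothetical structure)."""
    encounter_id: str
    data_source: str   # eg, "inpatient", "outpatient", "emergency", "death_certificate"
    codes: frozenset   # ICD codes for the birth defect of interest on this record


def classify(encounters):
    """Return, for one infant and one birth defect, which case definitions are met."""
    # Keep only records that actually carry a code for the birth defect.
    hits = [e for e in encounters if e.codes]
    n_encounters = len({e.encounter_id for e in hits})
    n_sources = len({e.data_source for e in hits})
    # Assumption: "multiple codes" means 2 or more distinct codes.
    n_codes = len(set().union(*(e.codes for e in hits)))

    multiple = n_codes > 1 or n_encounters > 1
    return {
        1: n_encounters >= 1,              # original: any code, any encounter
        2: multiple,                       # multiple codes or multiple encounters
        3: multiple and n_sources == 1,    # as 2, but all from the same data source
        4: n_sources > 1,                  # diagnosed in more than 1 data source
    }


# Spina bifida coded at the birth hospitalization and again in the emergency department.
infant = [
    Encounter("e1", "inpatient", frozenset({"741.91"})),
    Encounter("e2", "emergency", frozenset({"741.91"})),
]
print(classify(infant))
```

Under these assumptions, every infant meeting case definition 4 also meets case definition 2, whereas case definitions 3 and 4 cannot both be met by the same infant for the same birth defect.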
Statistical Analyses
We primarily used PPV to compare the accuracy of ICD codes in the FBDR data under various case definition algorithms. We defined PPV as the proportion of FBDR-identified (passive) cases that were confirmed by medical record review to have the same birth defect. We calculated 95% confidence intervals (CIs) by using the asymptotic (Wald) method for binomial proportions. We calculated PPV overall, for each of the 13 birth defects, and for several groups of birth defects classified by body system. For each birth defect or birth defect group, we compared PPV by using the 4 case definition algorithms described previously. For these analyses, infants were unduplicated at the level of the birth defect or birth defect group (eg, an infant with more than 1 code for spina bifida was counted only once as having spina bifida). An infant with more than 1 FBDR-identified birth defect considered in this study was included in more than 1 birth defect-specific analysis. All inferential tests were 2-tailed with a 5% type I error rate, and we conducted analyses by using SAS version 9.4. 23 Because this project was an evaluation of the FBDR data, whose reporting and surveillance are under the authority of Florida Statute 381.0031, the Florida Department of Health Institutional Review Board determined that this study was exempt from review.
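As an illustration of the PPV and asymptotic (Wald) interval described above, the following Python sketch reproduces the overall estimate from the counts reported in the Results (4497 confirmed of 4772 identified). It is a minimal sketch, not the SAS code used in the study. The function name `ppv_wald_ci` is hypothetical, and capping the interval to the range 0 to 1 is an assumption made here.

```python
import math


def ppv_wald_ci(confirmed, identified, z=1.959964):
    """Return (PPV, lower, upper) as proportions, using the Wald interval."""
    if identified <= 0:
        raise ValueError("identified must be positive")
    p = confirmed / identified
    # Wald half-width: z * sqrt(p * (1 - p) / n)
    half_width = z * math.sqrt(p * (1 - p) / identified)
    # Assumption: bounds are capped to [0, 1].
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


p, lo, hi = ppv_wald_ci(4497, 4772)
print(f"PPV = {100 * p:.1f}% (95% CI, {100 * lo:.1f}%-{100 * hi:.1f}%)")
# prints "PPV = 94.2% (95% CI, 93.6%-94.9%)"
```

When every identified case is confirmed, the Wald standard error is 0, so the interval collapses to a single point (100.0%-100.0%); this is a known property of the method and does not by itself indicate high precision.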
Results
From 2009 through 2011, the FBDR’s passive surveillance system identified 4772 infants with 1 or more of the 13 birth defects studied. The PPV under the original case definition was 94.2% (95% CI, 93.6%-94.9%), with 4497 of the 4772 suspected cases confirmed by medical record review. However, case definition algorithms requiring identification of the birth defect across multiple codes, encounters, or data sources increased the PPV to between 97.5% (95% CI, 96.6%-98.4%) if the birth defect was identified by multiple codes or multiple encounters from 1 data source and 99.2% (95% CI, 98.6%-99.8%) if the birth defect was identified by more than 1 data source. Although the overall PPV for the original case definition was high, PPVs across birth defects varied from 61.2% (95% CI, 50.8%-71.5%) for reduction deformities of the lower limb to 96.3% (95% CI, 94.2%-98.5%) for gastroschisis (Table 2). Alternative case definition algorithms again improved accuracy, even for birth defects based on ICD-9-CM codes with low PPVs. For example, the PPV for reduction deformities of the lower limb increased to 90.0% (95% CI, 71.4%-100.0%) when the birth defect was identified by multiple codes or multiple encounters and 100.0% (95% CI, 100.0%-100.0%) when the birth defect was identified by more than 1 data source. Moreover, the algorithm requiring identification by multiple codes or multiple encounters resulted in a PPV ≥90% for all except 2 birth defects (anencephaly, 88.2%, and cleft lip without cleft palate, 84.6%). When identification of the birth defect in more than 1 data source was required, the PPV became 100.0% for 7 of the 13 birth defects and 2 of the 3 birth defect groups (Table 2).
Birth defect-specific PPV of the FBDR for selected birth defects using various passive ascertainment algorithms to identify cases, 2009-2011
Abbreviations: FBDR, Florida Birth Defects Registry; PPV, positive predictive value.
aResults are presented at the birth defect level. Because infants may have more than 1 birth defect included in the case confirmation project, the sum of individual birth defects in any birth defect group and overall may add to more than group/overall totals. Ordering of birth defects in the individual birth defects group is alphabetical.
bCase confirmation protocol included medical record review by a trained abstractor followed by documentation of each confirmed birth defect using modified British Pediatric Association diagnosis codes. 21
cCalculated as (number of FBDR cases confirmed with the same birth defect / number of cases of the birth defect reported by the FBDR) × 100.
Although the alternative case definition algorithms significantly improved PPV, they decreased the ability to identify cases because they had more stringent inclusion criteria. The alternative algorithms failed to identify 58.3% to 81.9% of the 4772 cases identified by the original case definition. Even for conditions such as gastroschisis, which had a high PPV based on the original case definition, 72.7% to 95.3% of cases were lost by requiring identification of the birth defect by multiple codes, multiple encounters, or multiple data sources (Table 3).
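The loss in ascertainment described above can be computed as the percentage of cases identified by the original case definition that an alternative algorithm fails to identify. The Python sketch below assumes that the denominator is the count under the original case definition; the function name `percent_lost` is hypothetical, and the counts in the usage line are invented for illustration and are not study data.

```python
def percent_lost(n_original, n_alternative):
    """Percentage of original-definition cases not identified by an alternative algorithm."""
    if n_original <= 0:
        raise ValueError("n_original must be positive")
    if not 0 <= n_alternative <= n_original:
        raise ValueError("n_alternative must be between 0 and n_original")
    return 100 * (n_original - n_alternative) / n_original


# Hypothetical counts, for illustration only (not taken from the study).
print(f"{percent_lost(1000, 400):.1f}% of cases lost")
```

Reporting this figure next to the PPV for each algorithm makes the trade-off between accuracy and completeness explicit.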
Birth defect-specific completeness of ascertainment (relative to the original FBDR case definition) for selected birth defects using various passive ascertainment algorithms to identify cases, 2009-2011
Abbreviation: FBDR, Florida Birth Defects Registry.
aResults are presented at the birth defect level. Because infants may have more than 1 birth defect included in the case confirmation project, the sum of individual birth defects in any birth defect group and overall may add to more than group/overall totals. Ordering of birth defects in the individual birth defects group is alphabetical.
bCase confirmation protocol included medical record review by a trained abstractor followed by documentation of each confirmed birth defect using modified British Pediatric Association diagnosis codes. 21
Discussion
This study used medical record–verified cases to assess the degree to which various case definition algorithms for birth defects can improve the likelihood that cases identified by unverified ICD codes are true cases. Our findings suggest that these strategies can be used to substantially improve the PPV of ICD diagnosis codes for selected birth defects without the need to confirm cases via medical record review. However, as expected, although algorithms that require multiple codes, encounters, or data sources to meet the case definition can greatly increase PPV, they also significantly decrease the completeness of ascertainment, because only a subset of infants with birth defects has multiple medical encounters outside of routine care.
Approximately 60% of birth defects surveillance programs in the United States rely predominantly or exclusively on a passive strategy to ascertain cases, 22 as do many other disease registries. Our results should encourage programs with limited resources for case verification to explore strategies to maximize various quality metrics (eg, accuracy, completeness, timeliness) for case ascertainment in accordance with programmatic goals. Our findings suggest that if a passive surveillance program were contributing data to a national study on the etiology of birth defects and the inclusion of false-positive diagnoses were a substantial threat to internal validity, the false-positive rate for selected birth defects could be minimized without medical record review by using alternative case definition algorithms. However, depending on the definition used, the cases selected for such a study, although being true cases, may not be representative of all cases. For example, in our study, the PPV for the ICD-9-CM code for hypoplastic left heart syndrome (746.7) increased from 77.8% to 100.0% by requiring the code to appear in more than 1 data source. Infants whose diagnosis of hypoplastic left heart syndrome was documented in more than 1 data source had more severe complications that required more inpatient hospitalizations, outpatient surgeries, and emergency department visits than infants whose diagnosis of the syndrome was documented in only 1 data source. Thus, any study based on this subset of more severe cases may overestimate morbidity, mortality, costs, and epidemiologic associations, because the case definition requires more frequent contact with the medical system.
Conversely, for fundamental activities such as routine monitoring aimed at detecting changes in the environment or health status of populations, or planning and resource allocation, overly restrictive case definition algorithms are not recommended, because the limitations of excessively low completeness of ascertainment outweigh the benefits of smaller-scale increases in accuracy. For example, in our study, requiring the ICD-9-CM code for Down syndrome (758.0) to appear in more than 1 medical encounter (ie, >1 discharge record) improved PPV from 93.3% to 99.6%, but such a case definition would capture barely half of all confirmed cases and substantially underestimate the prevalence and burden of cases. Therefore, when exploring alternative algorithms for capturing data on birth defects or any other health outcome, decision makers should consider these quality metric tradeoffs, including timeliness, 24 so that adopted strategies align with intended goals.
Limitations
This study had several limitations. First, despite considering algorithms that evaluate plausible case definitions based on diagnoses appearing in multiple ICD codes, medical encounters, and data sources, we did not explore all possible definitions. For example, Smith et al 25 found that including in their algorithm variables related to demographic characteristics and specialty providers in addition to the number of times a diagnosis code was used improved the PPV of a passive surveillance system for muscular dystrophy. Second, although the FBDR links to a broad range of data sources to establish an annual cohort of infants born with birth defects, the FBDR does not link to educational services, government insurance data, or data sources that contain information on terminations or stillbirths. These data sources might capture data on cases that would otherwise have been missed or may not have multiple occurrences in hospital administrative data sources. The availability of these data sources alone may have improved the accuracy and completeness beyond that attainable by varying case definition algorithms based on current FBDR data sources. Third, the FBDR does not use medical specialists to review records. A pediatric cardiologist, for example, does not review the entire chart of a child who receives a diagnosis of hypoplastic left heart syndrome, and in some registries, approximately 35% of hypoplastic left heart syndrome cases are misdiagnosed. 26
Despite these limitations, this study had several strengths. The data source was large and statewide, providing case information on a diverse population of nearly 5000 children with birth defects of major public health importance. Unlike other passive registries, the FBDR has an enhanced surveillance component that allows for medical record review to verify selected birth defects, providing a gold standard diagnosis upon which an evaluation of the accuracy and completeness of the passive statewide program could be based. The ultimate goal of this study was to determine whether the accuracy of a passive surveillance program could be improved, specifically in situations in which case confirmation via medical record review either would not be possible or would be resource prohibitive. Our results suggest that excellent accuracy of ICD codes for selected birth defects can be achieved but that decreases in completeness and generalizability of cases are inevitable if increasingly restrictive case definition algorithms are applied.
Conclusions
Surveillance programs and disease registries use their data for various purposes, including incidence and/or prevalence estimation, temporal trends analysis, local investigations, epidemiologic association studies, health outcomes research, and etiological examinations. 27 Moreover, for registries based on rare outcomes, regional or national collaborative projects based on pooled data may be necessary to achieve sufficient statistical precision to contribute meaningfully to the evidence base. Improving the diagnostic accuracy and completeness of registries using passive case-finding strategies is essential to their inclusion in pooled data sources. The methods described in this study have broader applicability to public health surveillance and research using plan- or population-based claims data as well as administrative public health data collected from hospitals for conditions that require multiple health encounters.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The analysis for this project was supported by an award from the Centers for Disease Control and Prevention (CDC) (Grant No. 1 NU50DD004946-01-00). The content of this article is solely the responsibility of the authors and does not necessarily represent the official views of CDC, the Florida Department of Health, the University of South Florida, or Baylor College of Medicine.
