Abstract
Background:
Administrative data are commonly used to study clinical outcomes in renal disease. Race is an important determinant of renal health delivery and outcomes in Canada but is not validated in most administrative data, and the correlation with census-based definitions of race is unknown.
Objectives:
Validation of self-reported race (SRR) in a Canadian provincial renal administrative database (Patient Records and Outcome Management Information System [PROMIS]) and comparison with the Canadian census categories of race.
Design:
Prospective patient survey study to validate SRR in PROMIS.
Setting:
British Columbia, Canada.
Patients:
Adult patients registered in PROMIS.
Measurements:
Survey SRR was used as gold standard to validate SRR in PROMIS. Self-reported race in PROMIS was compared with census race categories.
Methods:
This is a cross-sectional telephone survey of a random sample of all adults in PROMIS conducted between February 2016 and November 2016. Responders selected a race category from PROMIS and from the Canadian census. Sensitivity (Sn) and specificity (Sp) were calculated with 95% confidence intervals (CIs).
Results:
A total of 21 039 patients met inclusion criteria, 1677 were selected for the survey, and 637 participated (38% response rate). There were no differences between the PROMIS, sampled, and responder populations. PROMIS SRR had an accuracy of 95.3% (95% CI: 94.2%-97.0%) when validated against the survey SRR, with Sn and Sp ≥90% in all race groups except the Aboriginal group (Sn 87.5%). The positive and negative predictive values were ≥95%, except in very low- and high-prevalence groups, respectively. The Canadian census had an accuracy of 95.7% (95% CI: 94.4%-97.6%) when validated against PROMIS SRR, with Sn and Sp ≥90%. The results did not differ in subgroups based on age, sex, birth outside Canada, or renal group (glomerulonephritis, chronic kidney disease, hemodialysis, peritoneal dialysis, transplant recipients, or live donors).
Limitations:
Analysis of minority groups and lower prevalence groups is limited by sample size. Results may not be generalizable to other administrative databases.
Conclusions:
We have shown high accuracy of PROMIS SRR, which validates its use in the secondary analysis of administrative data for research. There is high correlation between PROMIS and census race categories, which allows linkage with other data sources that use census-based definitions of race.
What was known before
Administrative data are being used for research purposes, but their use is limited by the accuracy of these data. Race is an important determinant of renal health delivery and outcomes and is studied extensively in Canadian health research.
What this adds
The race variable in our renal administrative data set is validated, allowing its use in clinical research. The race variable also shows good accuracy when compared with the Canadian census definitions of race, allowing linkage to other data sets.
Introduction
Race is an important determinant of patient outcome and health care utilization across all categories of kidney disease, including chronic kidney disease (CKD), end-stage kidney disease (ESKD), and transplantation.1,2 In recent years, the role of race has been studied extensively in these populations in Canada, including rates of transplantation and allograft failure,3-8 kidney donation,9,10 outcomes on dialysis,11-13 and progression of CKD and glomerulonephritis (GN).14-19 Studying the impact of race in kidney disease has identified disparities in access to care that have been targeted by health policies, resulting in improved patient outcomes.20
There is increasing use of administrative data to study clinical outcomes in kidney disease across large geographically and racially diverse populations.21-23 However, administrative databases often either do not collect race or collect it using methods that have not been validated, unlike other important variables that have undergone extensive validation of case definitions.24-30 In British Columbia, the Patient Records and Outcome Management Information System (PROMIS) is the administrative database that captures all patients with CKD, patients on dialysis, and transplant recipients and donors in the province. Because PROMIS was designed for administrative purposes, there is no standardized mechanism to capture self-reported race (SRR). As such, the accuracy of race in PROMIS or in other renal administrative databases remains unknown. Furthermore, broad race categories often include a mixed demographic of patients of varying age and country of birth, making it more difficult to draw meaningful conclusions from health outcome associations in these groups.
Therefore, we sought to characterize patients within race groups captured in PROMIS, validate the capture of race in PROMIS against the gold standard of patient SRR, and compare it with the Canadian Census definitions of race.31
Methods
Study Design
This is a prospective validation study that used a cross-sectional survey of a randomly sampled population of patients from PROMIS between February 2016 and November 2016. PROMIS is the provincial administrative database in British Columbia (BC) for patients with kidney disease and is managed by the BC Provincial Renal Agency. Registration in PROMIS is mandatory when patients have advanced all-cause CKD needing renal-specific medications or multidisciplinary clinics, at the time of renal biopsy diagnosis of GN, or at the time of kidney transplantation, live kidney donation, or commencing dialysis (including hemodialysis [HD] and peritoneal dialysis [PD]). As such, PROMIS captures all living kidney donors in BC and all patients with GN, ESKD, and advanced CKD. At the time of registration in PROMIS, patient demographics are captured during the usual processes of clinical care, including SRR (PROMIS SRR).
Sampling Strategy
The source population consisted of all patients registered in PROMIS who were alive and ≥18 years old at the time of survey completion. Children were not sampled due to the sensitive nature of the survey questions. Eligible patients were randomly selected until the desired number of patients had responded to the survey. The sample size was chosen to allow 90% power to validate a 2-level categorical race variable with a prevalence of 15% at a sensitivity of 0.95 using an alpha of .05. The prevalence rate was based on the prevalence of different race groups in PROMIS (11% East Asian, 9% South Asian, etc).15 The prevalence of patients with GN in PROMIS is 7%. To ensure sufficient power to validate race in a priori–defined renal subgroups, patients with GN were oversampled from 7% to 15%. The calculated sample size was 487, which was increased to 600 to broaden sampling and ensure an adequate mix of race and renal groups.
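The text does not state the formula behind the figure of 487, and it describes the calculation in terms of 90% power. As an illustration only, the sketch below uses a common precision-based formula for sensitivity, with an assumed 95% CI half-width of 0.05 that is not given in the text. With the stated sensitivity of 0.95 and prevalence of 15%, this formula returns 487, but that agreement should not be read as confirmation of the authors' method. The function name and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_sensitivity(sensitivity, prevalence, half_width, alpha=0.05):
    """Total sample size so that a Wald CI for sensitivity has the given half-width.

    Precision-based formula (an assumption, not the method stated in the text):
        cases needed = z^2 * Sn * (1 - Sn) / d^2
        total n      = cases needed / prevalence
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    cases_needed = z ** 2 * sensitivity * (1 - sensitivity) / half_width ** 2
    return math.ceil(cases_needed / prevalence)


# Sn = 0.95 and prevalence = 15% come from the text; half_width = 0.05 is assumed.
n_total = sample_size_for_sensitivity(sensitivity=0.95, prevalence=0.15, half_width=0.05)
print(n_total)  # 487
```

Because the required total scales inversely with prevalence, the same call with a prevalence of 7% gives a much larger number, which is consistent with the decision to oversample patients with GN to 15%.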
Survey Details and Definitions
The survey was developed by the investigators and included questions on demographics, including age, sex, and country of birth, as these have been associated with differential accuracy of race measurements.32-34 Patients were asked to report their race using 2 different categorizations. First, they were asked to report their race based on the usual categorization available in PROMIS (see Table 1). Second, they were asked to report their race based on the categorization from the 2011 Canadian Census (see Table 1). Only one selection was allowed for the PROMIS categorization of race; this was defined as the Survey SRR. The Census race categories, in contrast, allowed multiple entries. When patients chose multiple answers to the Census question, the first response was defined as the Census SRR. Mapping of Census to PROMIS SRR categories is shown in Supplementary Table 1. The survey was pilot tested on healthy volunteers of different racial backgrounds to ensure content validity prior to its administration.
Table 1. Race Categories for PROMIS and the 2011 Canadian Census.
Note. PROMIS = Patient Records and Outcome Management Information System.
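The first-response rule for the Census question and the mapping of Census to PROMIS categories can be sketched as follows. The mapping shown here is hypothetical and only illustrates the mechanics; the actual mapping is in Supplementary Table 1 and is not reproduced. Treating unmapped categories as "Other" and empty responses as "Unknown" are also assumptions.

```python
# Hypothetical mapping for illustration only; the study's actual mapping is in
# Supplementary Table 1 and is not reproduced here.
CENSUS_TO_PROMIS = {
    "Chinese": "East Asian",
    "Japanese": "East Asian",
    "Korean": "East Asian",
    "Southeast Asian": "East Asian",
    "South Asian": "South Asian",
    "Filipino": "Filipino",
    "White": "Caucasian",
    "Aboriginal": "Aboriginal",
}


def census_srr(responses):
    """Return the Census SRR: the first response when several were selected."""
    return responses[0] if responses else None


def to_promis_category(census_category, mapping=CENSUS_TO_PROMIS):
    """Map a Census category to a PROMIS category.

    Assumptions: a missing response maps to 'Unknown' and any category
    absent from the mapping falls into 'Other'.
    """
    if census_category is None:
        return "Unknown"
    return mapping.get(census_category, "Other")


# A respondent who selected two Census categories is classified by the first.
example = to_promis_category(census_srr(["Chinese", "White"]))
```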
Eligible patients were contacted by multilingual trained research coordinators via telephone and consenting patients completed the questionnaire over the telephone. Surveys were completed in English, Cantonese, Mandarin, and Punjabi by research coordinators. For patients speaking other languages, interpreters were used for consent and to complete the survey.
Statistical Analysis
Continuous variables were summarized as mean (SD) and compared across groups using the t test, and categorical variables were summarized as count (frequency) and compared across groups using the χ² test. When validating the capture of race in PROMIS, survey SRR was considered the gold standard and was compared with PROMIS SRR. When validating the Canadian Census capture of race, PROMIS SRR was considered the gold standard and compared with the Census SRR. This approach was specifically designed so that the results could inform future research that merges PROMIS with other administrative data sets that only capture Census definitions of race. Sensitivity and specificity were calculated based on the presence or absence of each category of race, with 95% confidence intervals generated by the simple asymptotic method.
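The per-category calculation described above can be sketched as a one-vs-rest comparison with a simple asymptotic (Wald) interval. The study used SAS; this Python version is only an illustration, the function names are hypothetical, and the labels are made up rather than taken from study data. Clipping the interval to [0, 1] is a choice made here, not something the text specifies.

```python
import math
from statistics import NormalDist


def wald_ci(successes, total, level=0.95):
    """Simple asymptotic (Wald) CI for a proportion, clipped to [0, 1]."""
    p = successes / total
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half), min(1.0, p + half)


def sensitivity_specificity(reference, test, category):
    """One-vs-rest sensitivity and specificity of `test` against `reference`.

    Each result is a tuple (estimate, lower, upper). The category must occur
    at least once in `reference`, and at least one other category must too.
    """
    pairs = list(zip(reference, test))
    tp = sum(r == category and t == category for r, t in pairs)
    fn = sum(r == category and t != category for r, t in pairs)
    tn = sum(r != category and t != category for r, t in pairs)
    fp = sum(r != category and t == category for r, t in pairs)
    return wald_ci(tp, tp + fn), wald_ci(tn, tn + fp)


# Made-up labels for illustration; not study data.
survey = ["A", "A", "A", "A", "B", "B", "B", "B", "B", "B"]
promis = ["A", "A", "A", "B", "B", "B", "B", "B", "B", "A"]
sens, spec = sensitivity_specificity(survey, promis, "A")
```

With small groups the Wald interval can be wide or run past the boundary before clipping, which is one reason estimates in low-prevalence categories should be read with caution.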
The analyses were repeated in a priori–determined subgroups based on age, sex, country of birth other than Canada, and type of renal program (CKD, GN, transplant recipient, PD, HD, and living donors) to investigate consistency of results.
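Repeating the comparison within subgroups amounts to grouping records before computing agreement. A minimal sketch follows, using made-up records and hypothetical field names (`promis_srr`, `survey_srr`, `sex`) that are not drawn from PROMIS.

```python
from collections import defaultdict


def accuracy_by_subgroup(records, subgroup_key):
    """Proportion of records where PROMIS SRR equals survey SRR, per subgroup."""
    agree = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[subgroup_key]
        total[group] += 1
        agree[group] += rec["promis_srr"] == rec["survey_srr"]  # True counts as 1
    return {group: agree[group] / total[group] for group in total}


# Made-up records; field names are hypothetical.
records = [
    {"sex": "F", "promis_srr": "A", "survey_srr": "A"},
    {"sex": "F", "promis_srr": "B", "survey_srr": "A"},
    {"sex": "M", "promis_srr": "B", "survey_srr": "B"},
    {"sex": "M", "promis_srr": "A", "survey_srr": "A"},
]
by_sex = accuracy_by_subgroup(records, "sex")
```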
All analyses were performed using SAS version 9.4. P values <.05 were considered statistically significant. The study was approved by the University of British Columbia research ethics board. The Standards of Reporting of Diagnostic Accuracy Studies (STARD) guidelines and checklists were followed for this validation study.35
Results
Description of the Cohort
There were 21 039 patients in PROMIS who met criteria for inclusion. A total of 1677 patients were approached to participate and 640 consented to participate (survey response rate 38.0%). Three individuals did not answer key questions and were therefore excluded (see Supplementary Figure 1). Within each renal program, there were no qualitative differences in age, sex, and PROMIS SRR between those in the source population, those selected for the survey, and those who responded (see Supplementary Table 2).
A description of the cohort is provided in Table 2. According to the PROMIS categorization of SRR, of the 637 patients included in the analysis, 20 were Filipino, 62 were South Asian, 65 were East Asian, 404 were Caucasian, 19 were Aboriginal, 27 were multiracial or other (including Black, Latin American, and Middle Eastern), and 40 were listed as having unknown race.
Table 2. Characteristics of the Patients Included in the Survey Based on PROMIS Self-Reported Race Categories.
Note. Other and unknown categories have been omitted for clarity. GN = glomerulonephritis; CKD = chronic kidney disease; HD = hemodialysis; PD = peritoneal dialysis; TX = transplant.
The mean age was 65 years, and patients in the Filipino group were significantly younger (56 years, P < .001). Overall, 36% of patients were born outside Canada, which was more common in the East Asian, South Asian, and Filipino groups (87%-100%, P < .001). There were also significant differences in renal groups, with Filipinos being more commonly on PD (25%) or HD (30%), Aboriginal and South Asian patients more commonly on HD (46% and 44%, respectively), and fewer South Asian, East Asian, and Aboriginal living donors (6%, 9%, and 0%, respectively). The frequency of patients in the lowest income bracket was 21% and was significantly higher in the East Asian, South Asian, and Aboriginal groups (28%, 26%, and 32%, respectively) than in the Caucasian and Filipino groups (18% and 5%, respectively). It is important to note, however, that the sample sizes in some race groups were quite small (n = 20 in the Filipino group and n = 19 in the Aboriginal group). Most (61%) of the cohort lived in the Vancouver Metropolitan Area, but this was less common in the Caucasian and Aboriginal groups (48% and 40%, respectively, P < .001).
Validation of PROMIS SRR
Compared with the gold standard of SRR reported in the survey, SRR captured in PROMIS classified 95.3% (95% confidence interval [CI]: 94.2%-97.0%) of patients into the correct race group. The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for each race category are shown in Table 3. The sensitivity and specificity were both >90% in all race groups, except in the Aboriginal group where the sensitivity was 87.5% (95% CI: 48.8%-90.9%). The sensitivity for the Other category was 54.8%, lower than the other race categories.
Table 3. Sensitivity, Specificity, Positive Predictive Value, and Negative Predictive Value for Each Race Category in PROMIS (PROMIS SRR) Compared With the Survey SRR as the Gold Standard.
Note. Other PROMIS SRR category includes Black, Middle Eastern, and Latin American. PROMIS = Patient Records and Outcome Management Information System; SRR = self-reported race; CI = confidence interval; PPV = positive predictive value; NPV = negative predictive value.
The PPV for each group was >95% except for the Aboriginal and Other groups. The NPV for each group was >95% except for the Caucasian group. In the Caucasian group, the PPV was 96.5%, whereas the NPV was 84.5%; in the Aboriginal group, the PPV was 73.7%, whereas the NPV was 99.0%. These patterns likely reflect the high prevalence of the Caucasian group and the low prevalence of the Aboriginal group. Similarly, the Other group had a lower PPV (85.2%) due to low prevalence.
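The dependence of PPV and NPV on prevalence follows from Bayes' rule and can be illustrated with fixed sensitivity and specificity. The values of 0.90 and 0.95 below are illustrative inputs, not the study's estimates. The two prevalences are loosely modeled on the shares of the Aboriginal group (19 of 637, about 3%) and the Caucasian group (404 of 637, about 63%) in the cohort.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV from sensitivity, specificity, and prevalence (Bayes' rule)."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    fn = (1 - sensitivity) * prevalence
    return tp / (tp + fp), tn / (tn + fn)


# Illustrative inputs only (not the study's estimates): same test, two prevalences.
ppv_low, npv_low = predictive_values(0.90, 0.95, prevalence=0.03)
ppv_high, npv_high = predictive_values(0.90, 0.95, prevalence=0.63)
```

At low prevalence the PPV falls while the NPV stays near 1, and at high prevalence the reverse holds, which matches the direction of the pattern reported for the Aboriginal and Caucasian groups.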
Supplementary Figure 2 outlines the accuracy for SRR captured in PROMIS across a priori–defined subgroups. There were no major differences in the accuracy according to subgroups based on age, sex, birth outside of Canada, or renal program.
Comparing the Canadian Census to PROMIS Categorization of Race
When validated against SRR captured in PROMIS, the Canadian Census categorization of SRR had an overall accuracy of 95.7% (95% CI: 94.4%-97.6%). Of the individuals who identified as Caucasian in PROMIS, 99% selected Caucasian on the Census and 1% selected South Asian. Most (97%) of South Asians in PROMIS selected South Asian on the Census, whereas 1.5% selected each of Caucasian and Middle Eastern. Among East Asians from PROMIS, 82% reported Chinese origin on the Canadian Census, compared with 4.7% who reported each of Japanese, Korean, and South East Asian (Vietnamese, Cambodian, Laotian, Thai, etc) ethnic origins. Among the Middle Eastern/Arabian population in PROMIS, 60% reported West Asian and 20% reported each of Caucasian and Arab origins on the Canadian Census. Most (95%) of Filipinos in PROMIS selected Filipino on the Census and 5% selected Caucasian. Only 11 patients reported multiple responses to the ethnic origins Census question.
The sensitivity, specificity, PPV, and NPV of the Census SRR compared with PROMIS SRR as the gold standard are shown in Table 4. The specificity for most groups was very good (>90%), except for Caucasians (specificity 89.1%). The sensitivity was also >90% in most groups, except in the Aboriginal, Latin American, and West Asian groups (sensitivity: 66.7%-89.5%). The accuracy was not different in subgroups based on age, sex, birth outside of Canada, or renal program, as shown in Supplementary Figure 2.
Table 4. Sensitivity, Specificity, Positive Predictive Value, and Negative Predictive Value of SRR Captured Using the Canadian Census Categorization (Census SRR) Compared With PROMIS SRR as the Gold Standard.
Note. SRR = self-reported race; PROMIS = Patient Records and Outcome Management Information System; CI = confidence interval; PPV = positive predictive value; NPV = negative predictive value.
Discussion
We surveyed 637 patients from a large provincial renal administrative database in British Columbia (PROMIS) and demonstrated 95% accuracy for the capture of race. The PPV and NPV for individual categories of race were ≥90% except in very low and very high–prevalence populations, respectively. We additionally compared the categorization of race in PROMIS with that taken from the 2011 Canadian Census and demonstrated an accuracy of 96%. Our results were similar in subgroups based on renal program, age, sex, and non-Canadian country of birth.
There are several characteristics of our cohort that may have affected the accuracy of SRR capture in PROMIS. Immigration patterns in BC have led to large populations of certain minority groups, such as the South Asian, East Asian, and Filipino populations. The survey results identified these as relatively homogeneous populations, the majority of whom were born outside of Canada, predominantly in India, China, Hong Kong, and the Philippines. For example, 89% of East Asians, 87% of South Asians, and 100% of Filipinos were born outside of Canada. The proportion of patients within these groups who were born outside of Canada is much higher than that reported in the 2011 Canadian census, in which comparatively fewer South Asian (69.3%) and Chinese (73.3%) individuals were born outside Canada.36 This suggests that outcomes for individuals with renal disease may be associated with immigration status and other social determinants of health.37 In addition, the capture of race may be less accurate in administrative data sets servicing more heterogeneous populations with a larger admixture of immigrant, Canadian-born, and multiethnic groups.
Additional factors may also influence the accuracy of PROMIS SRR. Race entry in PROMIS occurs on registration and may be done by clerks or other health care workers. While the standard is to confirm race with the patient, assumptions are often made, which can lead to inaccuracies in the data. This has been shown to be especially true for minority groups.38 Barriers also exist in obtaining race information. For example, asking patients about their racial background may be perceived as race playing a role in their care, leading to a breakdown of trust in the relationship between patient and health care worker, which may be particularly relevant to underserviced ethnic groups such as the Aboriginal population.38 Furthermore, the “other” SRR category may have been affected by errors related to multiracial or multiethnic backgrounds or in individuals for whom the categories presented do not accurately reflect their race.38-41
The secondary analysis of health administration data frequently requires linkage of multiple data sets to improve the capture of variables relevant to health research.42,43 Canada has been a leader in this regard, with an established infrastructure in many provinces for linking regional, provincial, and national administrative databases to support health research that could not have been conducted otherwise.23,43-45 One commonly used source of data on race in linked administrative data sets is the Canadian Census, which may not be interchangeable with the categorization of race in other administrative data sets.42 As such, we sought to compare the Census capture of race with that from PROMIS. Our results show good overall accuracy with sensitivity and specificity greater than 90% for most categories of race. There were no differences noted between subgroups based on age, sex, renal program, and birth outside of Canada. These results justify using linkages between PROMIS and the Canadian Census databases for future studies investigating race in kidney disease.
There are several limitations to consider when interpreting our results. The analysis of certain minority groups, such as Aboriginals, may be limited by small sample size that increases error. Additional research is required to further validate other racial groups that also have low prevalence in PROMIS, including Black, South American, and West Asian populations. Our results may not generalize to other administrative databases, but might nonetheless apply to systems with similar methods of primary capture of race. Finally, the purpose of this study was to validate the capture of SRR and was not intended to address whether the measurement of race in administrative data captures true differences in biologic, ethnic, cultural, or environmental factors between racial groups.
In conclusion, this study validated the capture of race in PROMIS, a large provincial renal administrative database, and demonstrated excellent correlation with the Canadian Census capture of race. Our results justify the secondary use of PROMIS data alone or linked to other administrative databases in future health research exploring race in kidney disease.
Supplemental Material
Supplemental material, Supplementary_Tables_SRR for Validation of Self-Reported Race in a Canadian Provincial Renal Administrative Database by Aiza Waheed, Ognjenka Djurdjev, Jianghu Dong, Jagbir Gill and Sean Barbour in Canadian Journal of Kidney Health and Disease
Footnotes
Ethics Approval and Consent to Participate
The study was approved by the University of British Columbia ethics board. Participants consented to participate prior to administration of the survey.
Consent for Publication
Consent for publication was obtained from all authors.
Availability of Data and Materials
Data in PROMIS can be accessed through the BC Renal Agency. Results of the survey are stored on UBC research servers.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This project is funded by the BC Provincial Renal Agency.
