
Abstract

Objective

To analyze the 2014 results of neonatal screening external quality assessment (EQA) performed by the Chinese National Centre for Clinical Laboratories.

Methods

EQA test panels, each consisting of five dried blood spots (three panels for phenylalanine (Phe) and thyroid stimulating hormone (TSH), two for glucose-6-phosphate dehydrogenase (G6PD) and 17-alpha-hydroxy progesterone (17-OHP)), were distributed to laboratories, and the results were collected and evaluated. The chi-square test was used to compare correct recognition rates.

Results

Test results were received from 170 laboratories for Phe, 176 for TSH, 65 for G6PD and 65 for 17-OHP. The total numbers of effective quantitative and qualitative results were 2520 and 2370 for Phe, 2605 and 2450 for TSH, 645 and 530 for G6PD, and 645 and 645 for 17-OHP. The overall correct recognition rates for qualitative tests of Phe, TSH, G6PD and 17-OHP were 99.79 %, 99.67 %, 93.40 % and 99.84 %, and the proportions of acceptable quantitative results were 94.48 %, 98.31 %, 84.65 % and 99.84 %, respectively. The rates of acceptable quantitative results differed significantly between the two measurement systems for Phe, TSH and G6PD (p<0.001); the χ² test also showed significant differences in correct recognition and acceptable rates among programmes (p<0.001).

Conclusion

Most of the quantitative results were acceptable and the overall correct recognition rates in qualitative results approached 100%. Distributing more challenging samples and increasing the range of concentrations of EQA samples will improve standards in future assessments.

Keywords

Neonatal screening, External quality assessment, Phenylalanine, Thyroid stimulating hormone, Glucose-6-phosphate dehydrogenase, 17-alpha-hydroxy progesterone

Introduction

Following a pilot study in 1981, neonatal screening was extended to all provinces in China over the next two decades.¹ The first external quality assessment (EQA) of neonatal screening laboratories testing phenylalanine (Phe) and thyroid stimulating hormone (TSH) was performed in 1999, and of laboratories testing for glucose-6-phosphate dehydrogenase (G6PD) deficiency and 17-alpha-hydroxy progesterone (17-OHP) in 2010, by the National Centre for Clinical Laboratories (NCCL). We analyzed the EQA results from these qualitative and quantitative neonatal screening tests.

Methods

EQA programme materials and scheme design

In March 2014, EQA panels, each consisting of five dried blood spots, were prepared, labelled, and distributed to hospitals and maternal and child health centres that provided neonatal screening: three panels (15 samples) each for Phe and TSH, and two panels (10 samples) each for G6PD and 17-OHP. Blood spot homogeneity and stability were guaranteed by a manufacturer approved by the Chinese Food and Drug Administration (CFDA). The concentrations of all the spots were different, and panels were stored at 2–8 °C until distribution. Each sample was coded to make analysis clear: the first four digits represented the year in which the samples were distributed, the fifth digit identified the lot (panel), and the last digit gave the serial number of the sample within its panel. To ensure confidentiality, code numbers for data reporting and further communication details were included in the instruction manuals sent with the test panels. Participant laboratories were given two weeks to test the EQA panels using their standard laboratory protocols and to return the results to NCCL by express mail or through a data submission form on the NCCL website. Participants in the qualitative schemes were required to report each sample as screen positive or screen negative. Participants in the quantitative schemes reported results in µmol/L for Phe, mIU/L for TSH, U/gHb for G6PD and nmol/L for 17-OHP.
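The coding scheme described above can be illustrated with a minimal Python sketch. It assumes a six-digit sample code (four digits for the year, one for the lot, one for the serial number); the names `SampleCode`, `parse_sample_code` and `short_label` are hypothetical and are not part of the NCCL system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleCode:
    year: int    # year of distribution (first four digits)
    lot: int     # lot / panel number (fifth digit)
    serial: int  # serial number of the sample within its panel (last digit)


def parse_sample_code(code: str) -> SampleCode:
    """Split a six-digit sample code into year, lot and serial number."""
    if len(code) != 6 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected a six-digit sample code, got {code!r}")
    return SampleCode(year=int(code[:4]), lot=int(code[4]), serial=int(code[5]))


def short_label(code: str) -> str:
    """Label with the year omitted, e.g. '13' for lot 1, sample 3."""
    parsed = parse_sample_code(code)
    return f"{parsed.lot}{parsed.serial}"


if __name__ == "__main__":
    print(parse_sample_code("201413"))
    print(short_label("201413"))
```

Under this reading, the five-digit panel ID (for example 20141) is the sample code without its final digit.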

Target values assigned and evaluation of the results

Results were assessed against consensus values.² For quantitative tests, the assigned value was the robust average of the results reported by all participants in a subgroup, calculated using Algorithm A of ISO 13528.³ For qualitative tests, the result (positive or negative) reported by more than 80 % of participants was the assigned value for that group.⁴
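For readers unfamiliar with Algorithm A, the following Python sketch shows the usual iterative form of the procedure: start from the median and the scaled median absolute deviation, pull values lying more than 1.5 robust standard deviations from the robust average back to that limit, and recompute until the estimates stabilise. The function name, the convergence tolerance and the iteration cap are assumptions made for illustration; this is not the Clinet-EQA implementation.

```python
import statistics


def algorithm_a(values, tol=1e-6, max_iter=100):
    """Robust average and robust SD in the style of ISO 13528 Algorithm A.

    `tol` and `max_iter` are illustrative choices, not values from the paper.
    """
    values = list(values)
    if len(values) < 2:
        raise ValueError("at least two results are needed")

    # Initial estimates: median and scaled median absolute deviation.
    x = statistics.median(values)
    s = 1.483 * statistics.median([abs(v - x) for v in values])

    for _ in range(max_iter):
        delta = 1.5 * s
        # Winsorise: clip results to the interval [x - delta, x + delta].
        clipped = [min(max(v, x - delta), x + delta) for v in values]
        new_x = statistics.fmean(clipped)
        new_s = 1.134 * statistics.stdev(clipped, new_x)
        converged = abs(new_x - x) < tol and abs(new_s - s) < tol
        x, s = new_x, new_s
        if converged:
            break
    return x, s


if __name__ == "__main__":
    # Hypothetical results with one gross outlier.
    results = [10.0, 10.2, 9.8, 10.1, 9.9, 50.0]
    robust_avg, robust_sd = algorithm_a(results)
    rcv = 100 * robust_sd / robust_avg  # robust coefficient of variation (%)
    print(robust_avg, robust_sd, rcv)
```

The outlier barely moves the robust average, whereas the ordinary mean of the same hypothetical data would be pulled far above 10.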

For quantitative results, scores were awarded on the basis of bias against the assigned value: 20 points were awarded if a Phe result fell within ±30 % or ±60 µmol/L (whichever was larger) of the assigned value, or if a TSH, G6PD, or 17-OHP result fell within ±30 % of the assigned value. Results outside these limits scored no points. For qualitative results, 20 points were awarded for each result that matched the assigned value, and non-matching results scored no points. Equivocal or borderline results were not accepted: every dried blood spot had to be tested and assigned a definitive conclusion of positive or negative. The maximum possible score for each five-sample panel was therefore 100 points in both quantitative and qualitative assessment. As with other EQA programmes, the NCCL EQA programme considered 80 points or more acceptable performance, and fewer than 80 points unacceptable.⁵,⁶ The overall correct recognition rate (qualitative results) or acceptable rate (quantitative results) for each programme was calculated as (number of acceptable results)/(overall number of effective results).
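The quantitative scoring rule can be sketched in Python as follows. The function names are hypothetical, and treating a result that lies exactly on the limit as acceptable is an assumption, since the text does not say how boundary values were handled.

```python
POINTS_PER_SAMPLE = 20
PASS_MARK = 80


def is_acceptable(analyte: str, reported: float, assigned: float) -> bool:
    """Check a quantitative result against the +/-30 % criterion.

    For Phe the limit is +/-30 % or +/-60 umol/L, whichever is larger.
    Results exactly on the limit count as acceptable (an assumption).
    """
    tolerance = 0.30 * abs(assigned)
    if analyte == "Phe":
        tolerance = max(tolerance, 60.0)
    return abs(reported - assigned) <= tolerance


def panel_score(analyte, reported_values, assigned_values) -> int:
    """Sum 20 points for every acceptable result in a five-sample panel."""
    return sum(
        POINTS_PER_SAMPLE
        for reported, assigned in zip(reported_values, assigned_values)
        if is_acceptable(analyte, reported, assigned)
    )


def panel_passes(score: int) -> bool:
    """A panel score of 80 points or more is acceptable performance."""
    return score >= PASS_MARK


if __name__ == "__main__":
    # Hypothetical Phe panel: the last result is far outside the limit.
    reported = [100, 350, 700, 60, 500]
    assigned = [120, 400, 720, 110, 300]
    score = panel_score("Phe", reported, assigned)
    print(score, panel_passes(score))
```

With these hypothetical values, four of the five results fall inside the limits, so the panel scores 80 points and is just acceptable.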

Statistical analysis

Data were analyzed using SPSS 13.0 (SPSS Inc.) and the Clinet-EQA evaluation system V 1.0, designed by NCCL and used in the national EQA programme (see http://www.clinet.com.cn/shop/shop). Median, standard deviation (SD), robust average, robust standard deviation (RSD), robust coefficient of variation (RCV) and bias were calculated and used to evaluate the performance of participant laboratories. The chi-square (χ²) test was used to compare performance between measurement systems and to compare overall correct recognition rates among the four neonatal screening EQA programmes. A p value <0.05 was considered significant.

Results

Participant laboratories

Hospitals and Maternal and Child Health centres providing neonatal screening services participated in the 2014 EQA programme. There were 170, 176, 65, and 65 laboratories enrolled in Phe, TSH, G6PD and 17-OHP EQA programmes, respectively. Two mainstream measurement systems (PerkinElmer and Ani labsystem) were used. For Phe, TSH, G6PD, and 17-OHP EQA programmes 133, 151, 52, and 58 participants used PerkinElmer; 37, 25, 13, and 7 participants used Ani labsystem, respectively. In 2014, we collected 2520, 2605, 645, and 645 quantitative EQA test reports and 2370, 2450, 530, and 645 qualitative reports for Phe, TSH, G6PD, and 17-OHP (12 laboratories submitted only quantitative results for Phe, TSH, and G6PD).

Analysis of the qualitative results

Qualitative results were analyzed and scored according to the criteria described above (Table 1). The proportion of laboratories which correctly reported all five test samples ranged from 67.92 % (G6PD, panel 20141) to 100 % (Phe, panels 20141 and 20142; TSH, panel 20141; 17-OHP, panel 20142). The overall correct recognition rates for qualitative tests of Phe, TSH, G6PD, and 17-OHP were 99.79 % (2365/2370), 99.67 % (2442/2450), 93.40 % (495/530), and 99.84 % (644/645), respectively. The χ² test showed significant differences in overall correct recognition rates for qualitative tests among different programmes (p<0.001, data not shown).
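Since the underlying test is reported as "data not shown", the sketch below offers one plausible reconstruction of it: a Pearson χ² test on a 4 × 2 table of correct and incorrect qualitative results, built from the counts quoted above. The helper name is hypothetical, and the critical value 16.266 (3 degrees of freedom, α = 0.001) is taken from standard χ² tables rather than from the paper.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Correct and incorrect qualitative results per programme (counts from the text).
qualitative = {
    "Phe": (2365, 2370),
    "TSH": (2442, 2450),
    "G6PD": (495, 530),
    "17-OHP": (644, 645),
}
table = [[correct, total - correct] for correct, total in qualitative.values()]

CRITICAL_VALUE_DF3_P001 = 16.266  # standard table value, df = 3, alpha = 0.001

if __name__ == "__main__":
    statistic, dof = chi_square_statistic(table)
    print(statistic, dof, statistic > CRITICAL_VALUE_DF3_P001)
```

The statistic far exceeds the critical value, which is consistent with the reported p<0.001. Because one expected count (incorrect G6PD results) is slightly below 5, an exact test might be preferred in a formal reanalysis.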

Table 1.

Analysis of the qualitative results.

Disease	Round (Panel ID)	No. of labs reporting effective results	EQA score* = 100: No. of labs	EQA score* = 100: Proportion (%)	80 ≤ EQA score* < 100: No. of labs	80 ≤ EQA score* < 100: Proportion (%)	EQA score* < 80: No. of labs	EQA score* < 80: Proportion (%)
Phe	20141	158	158	100	0	0	0	0
	20142	158	158	100	0	0	0	0
	20143	158	154	97.47	3	1.90	1	0.63
TSH	20141	164	164	100	0	0	0	0
	20142	162	160	98.77	2	1.23	0	0
	20143	164	159	96.95	4	2.44	1	0.61
G6PD	20141	53	36	67.92	14	26.42	3	5.66
	20142	53	51	96.23	0	0	2	3.77
17-OHP	20141	64	63	98.44	1	1.56	0	0
	20142	65	65	100	0	0	0	0

*Twenty points were awarded for each result that matched the assigned value; non-matching results scored no points. The maximum score for each EQA panel was 100 points.

The chi-square test was used to evaluate possible differences in the performance of the two mainstream measurement systems (PerkinElmer and Ani labsystem), including reagents and instruments. Table 2 shows the relevant numbers and scores for the qualitative results of both systems. The proportion of panels in which all five test samples were correctly reported ranged from 80.95 % (G6PD) to 100 % (17-OHP) for PerkinElmer, and from 86.36 % (G6PD) to 100 % (Phe) for Ani labsystem. The χ² test showed no relationship between EQA results and measurement system (for Phe, p = 0.563; for TSH, p = 0.353; for G6PD, p = 0.811; for 17-OHP, p = 0.109).

Table 2.

Qualitative results of PerkinElmer and Ani labsystem measurement systems in 2014.

Disease	Measurement system	No. of panels reported effective results	EQA score* = 100: No. of panels	EQA score* = 100: Proportion (%)	80 ≤ EQA score* < 100: No. of panels	80 ≤ EQA score* < 100: Proportion (%)	EQA score* < 80: No. of panels	EQA score* < 80: Proportion (%)
Phe	PerkinElmer	369	365	98.92	3	0.81	1	0.27
	Ani labsystem	105	105	100	0	0	0	0
TSH	PerkinElmer	422	417	98.82	4	0.95	1	0.24
	Ani labsystem	68	66	97.06	2	2.94	0	0
G6PD	PerkinElmer	84	68	80.95	12	14.29	4	4.76
	Ani labsystem	22	19	86.36	2	9.09	1	4.55
17-OHP	PerkinElmer	115	115	100	0	0	0	0
	Ani labsystem	14	13	92.86	1	7.14	0	0

*Twenty points were awarded for each result that matched the assigned value; non-matching results scored no points. The maximum score for each EQA panel was 100 points.

Analysis of the quantitative results

The quantitative results are summarized in Table 3, which shows the EQA score, number of laboratories, and relative proportion for each test panel. The proportion of participant laboratories with full marks ranged from 67.19 % (G6PD, panel 20141) to 95.93 % (TSH, panel 20141). The proportion of laboratories with unacceptable reports ranged from 1.74 % (TSH, panel 20141) to 18.75 % (G6PD, panel 20141). The overall acceptable rates of quantitative tests for Phe, TSH, G6PD, and 17-OHP were 94.48 % (2381/2520), 98.31 % (2561/2605), 84.65 % (546/645), and 99.84 % (644/645), respectively. The χ² test showed significant differences in the rates of acceptable quantitative results between different programmes (p<0.001, data not shown).
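As a quick arithmetic check, the overall acceptable rates quoted above can be recomputed from the stated counts using the formula (number of acceptable results)/(overall number of effective results). The dictionary and function names below are illustrative only.

```python
# (acceptable results, effective results) for the quantitative programmes,
# taken from the counts quoted in the text.
quantitative_counts = {
    "Phe": (2381, 2520),
    "TSH": (2561, 2605),
    "G6PD": (546, 645),
    "17-OHP": (644, 645),
}


def acceptable_rate(acceptable: int, effective: int) -> float:
    """Acceptable rate as a percentage, rounded to two decimal places."""
    return round(100 * acceptable / effective, 2)


rates = {
    programme: acceptable_rate(acceptable, effective)
    for programme, (acceptable, effective) in quantitative_counts.items()
}

if __name__ == "__main__":
    for programme, rate in rates.items():
        print(f"{programme}: {rate:.2f} %")
```

The recomputed percentages agree with the published values to two decimal places.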

Table 3.

Analysis of the quantitative results.

Disease	Round (Panel ID)	No. of labs reporting effective results	EQA score* = 100: No. of labs	EQA score* = 100: Proportion (%)	80 ≤ EQA score* < 100: No. of labs	80 ≤ EQA score* < 100: Proportion (%)	EQA score* < 80: No. of labs	EQA score* < 80: Proportion (%)
Phe	20141	166	142	85.54	4	2.41	20	12.05
	20142	170	155	91.18	1	0.59	14	8.24
	20143	168	151	89.88	4	2.38	13	7.74
TSH	20141	172	165	95.93	4	2.33	3	1.74
	20142	175	164	93.71	7	4.00	4	2.29
	20143	174	166	95.40	2	1.15	6	3.45
G6PD	20141	64	43	67.19	9	14.06	12	18.75
	20142	65	50	76.92	3	4.62	12	18.46
17-OHP	20141	64	55	85.94	4	6.25	5	7.81
	20142	65	59	90.77	2	3.08	4	6.15

*Twenty points were awarded if a Phe result fell within ±30 % or ±60 µmol/L (whichever was larger) of the assigned value, or if a TSH, G6PD or 17-OHP result fell within ±30 % of the assigned value. Results outside these limits scored no points.

Table 4 shows the numbers and scores of quantitative results for both measurement systems. The proportion of panels with full marks ranged from 83.65 % (G6PD) to 96.44 % (TSH) for PerkinElmer, and from 24 % (G6PD) to 99.07 % (Phe) for Ani labsystem. The proportion of panels scoring below 80 points (more than one unacceptable result) ranged from 1.78 % (TSH) to 11.62 % (Phe) for PerkinElmer, and from 0.93 % (Phe) to 52 % (G6PD) for Ani labsystem. The χ² test showed that the distribution of scores (100 points, 80 points, and <80 points) differed significantly between the two measurement systems for Phe (p<0.001), TSH (p<0.001) and G6PD (p<0.001), but not for 17-OHP (p = 0.378).

Table 4.

Quantitative results of PerkinElmer and Ani labsystem measurement systems in 2014.

Disease	Measurement system	No. of panels reported effective results	EQA score* = 100: No. of panels	EQA score* = 100: Proportion (%)	80 ≤ EQA score* < 100: No. of panels	80 ≤ EQA score* < 100: Proportion (%)	EQA score* < 80: No. of panels	EQA score* < 80: Proportion (%)
Phe	PerkinElmer	396	341	86.11	9	2.27	46	11.62
	Ani labsystem	108	107	99.07	0	0	1	0.93
TSH	PerkinElmer	450	434	96.44	8	1.78	8	1.78
	Ani labsystem	71	61	85.92	5	7.04	5	7.04
G6PD	PerkinElmer	104	87	83.65	6	5.77	11	10.58
	Ani labsystem	25	6	24.00	6	24.00	13	52.00
17-OHP	PerkinElmer	115	102	88.70	6	5.22	7	6.09
	Ani labsystem	14	12	85.71	0	0	2	14.29

The assigned value (robust average) and RSD for each Phe, TSH, G6PD, and 17-OHP sample for both measurement systems are shown in Figures 1–4. For 6 of 15 samples, the robust average of Phe was lower than 120 µmol/L; for the others it was higher than 300 µmol/L (Figure 1). The robust average ranged from about 60 to 720 µmol/L and the RCV from about 5 % to 23 %. The error bars represent RSD, which ranged from 0.14 to 0.809 for PerkinElmer and from 0.149 to 0.988 for Ani labsystem. The RCV was higher for EQA samples with lower concentrations. At lower concentrations, the PerkinElmer RCV was lower than that of Ani labsystem. At higher concentrations, the robust average of Ani labsystem was higher than that of PerkinElmer.

Figure 1.

Robust average concentration and RCV of each EQA sample for the two measurement systems for Phe. (Panel IDs omit the "2014" prefix and error bars represent RSD; the same applies to the following figures.)

Figure 2.

Robust average concentration and RCV of each EQA sample for the two measurement systems for TSH.

Figure 3.

Robust average concentration and RCV of each EQA sample for the two measurement systems for G6PD.

Figure 4.

Robust average concentration and RCV of each EQA sample for the two measurement systems for 17-OHP.

For 6 of 15 samples, the robust average concentration of TSH was less than 10 mIU/L; for the others it was higher than 25 mIU/L (Figure 2). At higher concentrations, the robust average and RSD using Ani labsystem were larger than those using PerkinElmer. At lower concentrations, the RCV was higher. The RCV using Ani labsystem was higher than that using PerkinElmer except for samples 13, 21, and 35. The RCV using Ani labsystem ranged from about 5 % to 25 % and varied considerably between samples. The RCV using PerkinElmer was about 5–10 %, with little difference among samples.

For the G6PD EQA programme, the concentration of G6PD increased uniformly from <1 to 9 U/gHb. The robust average and RSD of G6PD using Ani labsystem were larger than those using PerkinElmer. The RCV using PerkinElmer was considerably smaller than that using Ani labsystem for every sample. The RCV ranged from 0 to 15 % using PerkinElmer and from about 23 % to 50 % using Ani labsystem (Figure 3).

Figure 4 shows that the robust average of 17-OHP was less than 20 nmol/L for 4 of 10 EQA samples, and higher than 70 nmol/L for the others. At higher concentrations, the robust average and RSD using PerkinElmer were lower than those using Ani labsystem. The RCV using PerkinElmer was higher at lower concentrations and lower at higher concentrations. The RCV using Ani labsystem varied among samples.

Discussion

We here provide the first analysis of EQA results for the performance of neonatal screening in China. Tens of thousands of infants are born in China every day, and only government accredited laboratories can legally provide neonatal screening services. All neonatal screening laboratories in China were enrolled in the NCCL EQA scheme.

The overall correct recognition rate in qualitative tests for G6PD was lower than for the other programmes, but there was no relationship between EQA results and measurement system. The robust average concentration for all samples of Phe, TSH, and 17-OHP was not near the equivocal or borderline value for either the PerkinElmer or Ani labsystem measurement systems, but the robust average concentration of more than one sample of G6PD was around 2 U/gHb, identified by many laboratories as a borderline result.

In the quantitative results, there was a significant difference in the rate of acceptable results among different programmes. The distribution of scores (100 points, 80 points, and lower than 80 points) differed significantly between the two measurement systems for Phe, TSH, and G6PD (but not 17-OHP). The RCV of TSH and G6PD using Ani labsystem was markedly higher than that using the PerkinElmer measurement system, suggesting that the dispersion of results using Ani labsystem was greater than that using PerkinElmer, thus leading to a lower rate of acceptable results. For higher concentrations of Phe, the RCV using Ani labsystem was slightly lower than that using PerkinElmer, while the robust average using Ani labsystem was higher than that using PerkinElmer. Because the acceptable range is defined relative to the assigned value, the acceptable range for Ani labsystem was therefore wider than that for PerkinElmer, while the relative dispersion of its results was smaller. At lower concentrations, although the PerkinElmer measurement system had a smaller RCV, the range of acceptable results, calculated as the robust average (approximately 120 µmol/L) ±60 µmol/L, was wide enough to make almost all results acceptable. Overall, the rate of acceptable Phe results using Ani labsystem was higher than that using PerkinElmer.

Although variation within the two measurement systems led to discrepancies, and the RCV and RSD using Ani labsystem were larger than those using PerkinElmer for most samples, the overall correct recognition rate for qualitative results did not differ significantly between the two systems. For neonatal screening, qualitative results are much more useful than quantitative results, so the performance of the two measurement systems could be considered equivalent.

A limitation of our study was that the test panel blood spots were not from infant blood, but simulated infant blood spots, which may have caused matrix effects. In addition, it may be difficult to score or evaluate the performance of participant laboratories if the concentration of the G6PD EQA sample was close to equivocal or borderline values.

In conclusion, most of the quantitative results were acceptable and the overall correct recognition rate in qualitative results approached 100%. The EQA programme is vital, but distributing more challenging samples and increasing the range of concentration of EQA samples may improve the quality of screening in the future.

Footnotes

Acknowledgements

This study was funded by Beijing Natural Science Foundation in 2014 (No. 7143182). The authors thank the participating neonatal screening laboratories, and Clinet for providing NCCL with computer technology support to establish a network reporting platform and relevant services.

References

1. Chen. Current status of neonatal screening in China. J Med Screen 1999; 6: 186–7.

2. ISO (1997) Proficiency testing by interlaboratory comparisons. Part 1: development and operation of proficiency testing schemes. ISO/IEC Guide. International Standards Organisation, Geneva.

3. International Standard Organization. Statistical methods for use in proficiency testing by interlaboratory comparisons, ISO 13528, First edition, Switzerland, 2005: 18-65.

4. International Standard Organization. Conformity assessment – General requirements for proficiency testing, ISO 17043, First edition, 2010.

5. Jean Louis Frantz, Renette Anselme, Ndongmo Clement, Buteau Josiane, Boncy Jacques, Dahourou Georges, Vertefeuille John, Marston Barbara, Balajee Arunmozhi. Evaluation of an External Quality Assessment Program for HIV Testing in Haiti, 2006-2011. Am J Clin Pathol 2013; 140: 867–871.

6. Wang Lunan, Pan Yang, Zhang Kuo, Zhang Rui, Sun, Xie Jiehong, Jinming. A 10-year human hepatitis B virus nucleic test external quality assessment in China: continual improvement. Clinica Chimica Acta 2013; 425: 139–14.

Neonatal screening external quality assessment in China, 2014
