Abstract
Background
The European Union directive on in vitro diagnostic medical devices requires traceability to internationally recommended reference procedures. Applying these reference procedures is necessary to evaluate the accuracy of γ-glutamyltransferase (GGT) assays on routine measurement systems in China.
Methods
Five frozen patient-pooled serum samples were assigned values by the International Federation of Clinical Chemistry and Laboratory Medicine (IFCC) reference procedure in order to evaluate the traceability of the results of GGT catalytic activity from six homogeneous systems. One of the serum samples was used to calibrate seven non-homogeneous systems.
Results
All of the homogeneous systems, except the Dade system (Dade Behring Inc, IL, USA), achieved traceability within the measurement range. The Roche and Hitachi systems performed better than the other systems. After calibration, the coefficient of variation of the non-homogeneous systems decreased dramatically from between 14.50% and 25.23% to between 1.25% and 3.09% and the bias decreased from between −11.4% and −4.1% to between 0.5% and 3.5%.
Conclusion
Manufacturers in China should ensure that their calibration systems correspond to the IFCC reference procedures. Fresh frozen pooled patient serum assigned by reference laboratories can be used to calibrate non-homogeneous systems in order to achieve traceability.
Introduction
Serum enzyme measurement is of great significance in the diagnosis, therapy and prognosis of diseases. γ-Glutamyltransferase (EC 2.3.2.2; GGT) is an important serum enzyme that has been mainly used as a marker of alcohol consumption and hepatobiliary disease. 1,2 Recently, it has been proposed as a marker of oxidative stress and is associated with an increased risk of cardiovascular diseases, including coronary heart disease and stroke. 3–6 The accuracy of the measurements, which provide reference and evidence for diagnoses and treatment, is critical. The key to accurate measurement is to make the results traceable to an agreed or proposed common reference material or procedure.
A measurement system comprises an analyser, a reagent, a calibrator and an operating procedure. The primary reference measurement procedures at 37°C recommended by the International Federation of Clinical Chemistry and Laboratory Medicine (IFCC) for the standardization of enzyme assays were published in 2002 for five enzymes, including GGT. 7 Directive 98/79/EC on in vitro diagnostic medical devices 8 requires that the values obtained in clinical laboratories be traceable to internationally recognized and accepted reference materials and/or reference measurement procedures. Routine measurement systems using validated calibrators should ensure that the values are traceable to the IFCC primary procedures. 9 Calibrators with matrix effects should be used with matched reagents and analysers; 10 such a combination is called a homogeneous system. Jansen et al. 11 reported a significant bias in the measurements of trueness verification material taken from six homogeneous systems when compared with the target values assigned by the IFCC reference system. However, the material was enriched with recombinant human enzymes, whose properties did not exactly match those of human serum. In China, many clinical laboratories still use calibrators that are not matched with reagents and analysers. This is known as a non-homogeneous system, and the traceability of its results cannot always be ensured by the manufacturers of the calibrators. The lack of traceability in both homogeneous and non-homogeneous systems is a problem that must be solved. Fresh human serum may be used for common calibration: Franck et al. used six patient-pool serum samples assigned by a routine laboratory to calibrate the methods in 18 other laboratories. After calibration, the maximum interlaboratory coefficient of variation (CV) dropped dramatically from 61.6% to 9.5%. 12
In our study, we assessed the traceability of results for GGT catalytic activity from commercial systems used in China, using five pools of patient serum with values assigned by the IFCC reference procedure. First, the five frozen patient pools were assigned values in three candidate reference laboratories. Then the assigned values were used to investigate the bias and variability of six homogeneous and seven non-homogeneous well-established systems. Furthermore, one of the pools, with an intermediate activity, was used to calibrate the non-homogeneous systems. Interlaboratory CVs and bias were compared before and after calibration.
Materials and methods
Collection and value assignment of patient-pooled serum samples
Fasting serum samples from leftover patient specimens were selectively collected at five concentration levels of GGT, all within the IFCC measurement range (upper limit 4.8 μkat/L). Lipaemic and haemolysed samples were excluded. The pooling was carried out with the permission of the Clinical Research Ethics Committee of Peking University First Hospital. The pooled samples were filtered through a 0.2-μm pore size membrane and dispensed in 0.5 mL aliquots into 1.8 mL vials. The vials were then sealed to preserve the properties of the samples and stored frozen at −70°C. GGT activities in the serum samples were measured according to the standard procedure. First, the samples stood at room temperature for 40 minutes; they were then swirled five times, centrifuged at 1000
Five frozen serum pools were assigned values by three candidate reference laboratories. These laboratories had established GGT reference procedures and achieved good results in the IFCC External Quality Assessment Scheme for Reference Laboratories in Laboratory Medicine (RELA) in 2006. Their laboratory numbers were 48, 49 and 36 (for more information go to
Investigation of routine measurement systems
The six homogeneous systems investigated were Abbott (Architect C8000 n = 5), Beckman (Synchron DXC 800 n = 2; LX20 n = 3), Dade Behring (Dimension Rxl n = 2), Hitachi (7170A n = 2; 7180 n = 3), Roche (Modular n = 5) and Ortho (Vitros 950 n = 3; 350 n = 2). All the manufacturers stated that their procedures were traceable to the IFCC reference procedure. All the instruments had been well maintained. The two controls provided by each manufacturer, at normal and abnormal concentrations, were tested 20 times in one day to assess the precision of the systems. The analysis of the five pools was undertaken in duplicate by each of the laboratories. The means of the replicate measurements were used to calculate the interlaboratory CV and the bias relative to the target values.
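The interlaboratory CV and relative bias described above can be illustrated with a minimal sketch. The duplicate results below are hypothetical, the function name is ours, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not specify which estimator was used.

```python
from statistics import mean, stdev

def interlab_cv_and_bias(duplicates, target):
    """Return (CV %, relative bias %) for one serum pool.

    duplicates: one (first, second) pair of results per laboratory.
    target: value assigned by the reference laboratories.
    """
    # Each laboratory contributes the mean of its duplicate measurements.
    lab_means = [mean(pair) for pair in duplicates]
    grand_mean = mean(lab_means)
    # Sample SD across laboratories (assumed estimator).
    cv = stdev(lab_means) / grand_mean * 100
    bias = (grand_mean - target) / target * 100
    return cv, bias

# Hypothetical duplicate results (μkat/L) from three laboratories,
# compared with the 1.227 μkat/L pool.
results = [(1.20, 1.22), (1.25, 1.27), (1.18, 1.20)]
cv, bias = interlab_cv_and_bias(results, target=1.227)
print(f"CV = {cv:.2f}%, bias = {bias:.2f}%")
```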
Seven non-homogeneous systems were included in the investigation (Table 1). The five pools were measured twice. The serum pool with an assigned GGT activity of 1.227 μkat/L was used to calibrate the non-homogeneous systems, and the CVs and bias for the measurements of the five frozen serum samples were compared before and after calibration.
Table 1. Information on non-homogeneous systems used in this study
*Manufactured in China
Statistical methods
Outlier exclusion
Extreme outliers detected by box-and-whisker plot and data outside the mean ± 2 SD (standard deviation) were discarded from the results of one reference laboratory.
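As an illustration of the mean ± 2 SD criterion only (the box-and-whisker step is not reproduced here), a minimal sketch might look as follows. The readings are hypothetical and the sample standard deviation is an assumed choice.

```python
from statistics import mean, stdev

def outside_2sd(values):
    """Return the values lying outside mean ± 2 SD of the data set."""
    m = mean(values)
    sd = stdev(values)  # sample SD (assumed estimator)
    lower, upper = m - 2 * sd, m + 2 * sd
    return [v for v in values if v < lower or v > upper]

# Hypothetical replicate readings (μkat/L) for one pool in one laboratory.
readings = [0.71, 0.72, 0.70, 0.73, 0.72, 0.71, 0.72, 0.73, 0.71, 0.95]
outliers = outside_2sd(readings)
print(outliers)
```

With small data sets a single extreme value inflates the SD considerably, so this rule is conservative unless enough replicates are available.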
Traceability assessment
The analytical performance of each commercial system was assessed according to quality goals based on biology. 13 A desirable performance was defined by bias $B_A < 0.25\,(CV_I^2 + CV_G^2)^{1/2}$, where $CV_I$ was the within-subject variation and $CV_G$ was the between-subject variation. An optimum performance was defined by $B_A < 0.125\,(CV_I^2 + CV_G^2)^{1/2}$. A minimum performance was defined by $B_A < 0.375\,(CV_I^2 + CV_G^2)^{1/2}$. The results of the commercial systems (y) and target values (x) were compared using regression analysis (y = ax + b) according to Passing and Bablok. 14
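The three bias goals differ only in their multiplying factor, which the following sketch makes explicit. The function name is ours, and the input values in the usage line are purely hypothetical (chosen for easy arithmetic); they are not the biological variation data for GGT.

```python
import math

# Multiplying factors for the three performance levels defined above.
FACTORS = {"optimum": 0.125, "desirable": 0.25, "minimum": 0.375}

def bias_goals(cv_within, cv_between):
    """Return the allowable bias (%) for each performance level."""
    # sqrt(CV_I^2 + CV_G^2): combined biological variation.
    combined = math.hypot(cv_within, cv_between)
    return {level: factor * combined for level, factor in FACTORS.items()}

# Hypothetical within- and between-subject CVs (%), not GGT data.
goals = bias_goals(cv_within=3.0, cv_between=4.0)
print(goals)
```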
Results
Target values
The five frozen serum samples were first analysed by the three candidate reference laboratories. The results of each laboratory were inspected for outliers and three outliers from one laboratory were excluded. Finally, the means (±SD) of the results for GGT at five levels were 0.717 μkat/L (±0.013), 1.227 μkat/L (±0.027), 2.070 μkat/L (±0.030), 3.064 μkat/L (±0.047) and 4.314 μkat/L (±0.095). The sample with 1.227 μkat/L was chosen to be the calibrator for the non-homogeneous systems.
Precision and bias of measurements from homogeneous systems
Precision measurements taken before the start of the experiment are listed in Table 2. CVs of these results complied with the requirements of the instrument operation manual.
Table 2. Precision of γ-glutamyltransferase measurement in six homogeneous systems before the investigation
CV, coefficient of variation
There was a significant variation in the catalytic activities of the five frozen serum samples among the six homogeneous systems (Table 3). The interlaboratory CVs were 2.17–5.07% (Abbott), 4.21–10.98% (Beckman), 0.52–2.38% (Dade), 1.35–2.59% (Hitachi), 0.23–1.54% (Roche) and 1.83–2.38% (Ortho). All the homogeneous systems showed good precision except for the Beckman system. However, the intersystem CVs of the five samples were higher than the CVs within a single system.
Table 3. Variance and bias of γ-glutamyltransferase activity measurements (μkat/L) of fresh frozen serum samples from six homogeneous systems in a traceability investigation
SD, standard deviation; CV, coefficient of variation
The relative biases of measurements from the six homogeneous systems compared with the target values were 0.43% to 8.41% (Abbott), −13.04% to 9.83% (Beckman), 11.2% to 17.73% (Dade), −1.65% to 4.62% (Hitachi), −2.63% to 2.0% (Roche) and −5.37% to 7.90% (Ortho) (Table 3). Hitachi and Roche achieved the optimum performance goal (bias <5.4%). Abbott and Ortho met the desirable performance goal (bias <10.8%). Beckman achieved the minimum performance goal (bias <16.2%). However, measurements taken at a high concentration level from Dade exceeded the minimum performance goal.
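The classification above can be reproduced with a short sketch that compares the largest absolute bias of each system with the three goals. The bias ranges and thresholds are those quoted in the text; the function name and the decision to judge each system by its worst-case bias are our assumptions.

```python
# Performance goals for bias (%), as quoted in the text, strictest first.
GOALS = [("optimum", 5.4), ("desirable", 10.8), ("minimum", 16.2)]

# Relative bias ranges (%) for each homogeneous system, from the text.
BIAS_RANGES = {
    "Abbott": (0.43, 8.41),
    "Beckman": (-13.04, 9.83),
    "Dade": (11.2, 17.73),
    "Hitachi": (-1.65, 4.62),
    "Roche": (-2.63, 2.0),
    "Ortho": (-5.37, 7.90),
}

def performance_level(bias_range):
    """Return the strictest goal met by the worst-case absolute bias."""
    worst = max(abs(b) for b in bias_range)
    for level, limit in GOALS:
        if worst < limit:
            return level
    return "exceeds minimum goal"

levels = {system: performance_level(r) for system, r in BIAS_RANGES.items()}
for system, level in levels.items():
    print(f"{system}: {level}")
```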
A regression analysis (Table 4) revealed that all the homogeneous systems correlated with the IFCC procedures. The slopes of the regression lines were between 0.817 and 1.167. The slopes and intercepts from Hitachi, Roche and Ortho were not significantly different from 1 and 0, respectively (the 95% confidence intervals included the values 1 and 0 15 ). However, the slopes and intercepts from the other three systems deviated significantly from 1 and 0, respectively.
Table 4. Regression analysis between International Federation of Clinical Chemistry and Laboratory Medicine procedures (x-axis) and six homogeneous systems (y-axis)
Values in parentheses are 95% confidence interval
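For readers unfamiliar with Passing and Bablok regression, the point estimates of slope and intercept can be sketched as below. This is a simplified illustration under stated assumptions: it omits the confidence intervals reported in the table, assumes positively correlated data, and is not the software used in the study. The routine results in the usage example are hypothetical; the x values are the target values reported for the five pools.

```python
from statistics import median

def passing_bablok(x, y):
    """Return (slope, intercept) point estimates; confidence intervals omitted."""
    slopes = []
    n = len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx, dy = x[j] - x[i], y[j] - y[i]
            if dx == 0 and dy == 0:
                continue  # identical points carry no slope information
            if dx == 0:
                slope = float("inf") if dy > 0 else float("-inf")
            else:
                slope = dy / dx
            if slope == -1:
                continue  # slopes of exactly -1 are excluded
            slopes.append(slope)
    slopes.sort()
    count = len(slopes)
    # Offset that shifts the median to correct for slopes below -1.
    k = sum(1 for s in slopes if s < -1)
    if count % 2:
        slope = slopes[(count - 1) // 2 + k]
    else:
        slope = 0.5 * (slopes[count // 2 - 1 + k] + slopes[count // 2 + k])
    # Intercept: median of the residual offsets y - slope * x.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept

targets = [0.717, 1.227, 2.070, 3.064, 4.314]   # target values (μkat/L)
routine = [0.74, 1.25, 2.10, 3.12, 4.40]        # hypothetical routine results
slope, intercept = passing_bablok(targets, routine)
print(f"y = {slope:.3f}x + {intercept:.3f}")
```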
Comparison of non-homogeneous systems before and after calibration with an assigned serum
The results of the non-homogeneous systems are listed in Table 5. The intersystem CVs were clearly higher than those of the homogeneous systems. However, the variation between the non-homogeneous systems fell dramatically after calibration, from between 14.50% and 25.23% to between 1.25% and 3.09%, and the bias was reduced from between −11.4% and −4.1% to between 0.5% and 3.5%. Figure 1 shows the regression lines before and after calibration. Slopes after calibration were closer to 1 and intercepts closer to 0 than those obtained before calibration.

Figure 1. Regression lines of γ-glutamyltransferase (GGT) results between reference systems and non-homogeneous systems before and after calibration. (a, b) GGT catalytic activities (μkat/L) of five patient-pool serum samples measured by International Federation of Clinical Chemistry and Laboratory Medicine reference laboratories (x-axis) and seven non-homogeneous systems (y-axis) before and after calibration with the serum assigned a value of 1.227 μkat/L. Each line represents one of laboratories 1 to 7.
Table 5. Variance and bias of γ-glutamyltransferase activity measurements of fresh frozen serum samples from seven non-homogeneous systems in a traceability investigation
SD, standard deviation; CV, coefficient of variation
Discussion
Preparation and application of the pooled frozen serum samples
Pooled fresh human serum is the preferred matrix for calibration purposes, because it is free of reconstitution errors, mimics patient specimens more closely and produces superior intra-assay imprecision. 16 In this study, minimally processed patient-pool serum samples were used to evaluate the traceability of the results of commercial systems. Five serum pools at concentrations covering the linear measurement range of the IFCC procedure reflect the accuracy of routine methods better than a single serum. Pooled serum samples were filtered through a membrane filter with a pore size of 0.2 μm and stored at −70°C to ensure a high level of stability without changing their properties, in accordance with the method described by Henriksen et al. 17
However, the commutability of the serum samples in our study between the reference procedure and the various routine methods still needs to be verified in accordance with the recommended Clinical and Laboratory Standards Institute procedure. 18
Traceability of results from homogeneous systems
The bias of the homogeneous systems investigated varied from the target values assigned by reference laboratories. All of the systems, except the Dade system, achieved traceability in the measurement range according to the minimum performance goals. Two performed optimally and two achieved desirable performance goals. Dade had a greater bias than the other systems, one reason for which may be the smaller number of laboratories investigated. The precision of the Beckman system was poor. Its interlaboratory CV was 4.66% for the pool with target value at 2.674 μkat/L, which was much higher than the 1.38% obtained in the pre-experiment at a similar enzyme activity; this did not occur in other systems. We therefore suspect there may have been some measurement errors in the assay of the five serum samples on the Beckman system. There were also some differences in the bias of the measurement of GGT activity between our study and a study from Europe by Jansen et al. 11 If a serum with a target value between level 3 (2.070 μkat/L) and level 4 (3.064 μkat/L) were tested, the bias for the Abbott system would be between 1.63% and 4.19% (−0.77% in Jansen's report). For Beckman, the bias would be between −9.07% and −12.72% (Jansen 6.3%); for Dade, it would be between 11.54% and 13.70% (Jansen 3.6%); for Roche, between −0.72% and −0.30% (Jansen −7.3%); for Ortho, between −0.46% and 3.36% (Jansen −9.8%). There is no definite explanation for these differences; they may result from the different analysis and survey materials used. The material used in Europe was prepared from fresh human serum and enriched with recombinant human enzymes provided by the Asahi Chemical Industry in Japan. Its commutability was not established in any authoritative report. Nevertheless, we conclude that all manufacturers should ensure that the traceability of calibration in their systems corresponds to the IFCC primary reference measurement procedure.
Improvement of traceability for non-homogeneous systems after calibration
The results of the non-homogeneous systems deviated significantly from the target value. Biases varied from −11.4% to −4.1%, which was similar to the −9.0% to −14.2% seen in a previous study. 19 However, the interlaboratory CVs of 14.50% to 25.23% were much greater than the 6.9% to 11.6% seen in previous reports. This may be due to the variation in the constitution of the non-homogeneous systems. The best solution would be to ensure that all systems use a common calibration. Fresh frozen pooled patient serum samples assigned by reference laboratories may be suitable as calibrators. Different types of calibration are available, for example correction functions based on homogeneous linear regression analysis, normal linear regression analysis or one-point calibration. Baadenhuijsen et al. 20 suggested that the reduction in variation was not related to the type of correction function and that a sensible choice of calibrator concentration for one-point calibration was feasible. In our study most systems showed a high bias when samples at levels one and two were measured. We therefore chose a pooled serum sample with a GGT activity of 1.227 μkat/L, whose concentration was closer to the GGT decision level, as the common calibrator of the non-homogeneous systems. The feasibility of this decision is confirmed by the reduction of the maximum interlaboratory variation from 25.23% to 3.09% and of the bias from −11.4% to 2.8%. However, it is evident that the traceability of non-homogeneous systems needs to be verified. Harmonization of their results can be achieved by calibration with an assigned serum, but the frequency of the calibration needs further investigation.
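To make the idea of one-point calibration concrete, the sketch below rescales a system's results by the ratio of the assigned value to the value that system measured for the calibrator pool. This proportional correction is our assumption of how a one-point calibration acts on the results; the text does not describe how the calibrator was applied on each analyser. The raw results are hypothetical, and only the assigned value of 1.227 μkat/L comes from the study.

```python
def one_point_calibrate(results, measured_calibrator, assigned_calibrator=1.227):
    """Rescale results so that the calibrator pool reads its assigned value."""
    factor = assigned_calibrator / measured_calibrator
    return [r * factor for r in results]

# Hypothetical uncalibrated results (μkat/L) for the five pools on one system;
# the second entry is the pool later used as calibrator.
raw = [0.65, 1.10, 1.85, 2.75, 3.90]
corrected = one_point_calibrate(raw, measured_calibrator=raw[1])
print([round(v, 3) for v in corrected])
```

Because a single factor is applied across the range, this approach corrects a proportional bias but cannot remove a constant offset, which is one reason the choice of calibrator concentration matters.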
Conclusion
Manufacturers in China must ensure that the traceability of calibration of their systems corresponds to the IFCC reference procedure. Fresh frozen pooled patient serum assigned by reference laboratories may be suitable for calibrating non-homogeneous systems in order to achieve traceability.
Footnotes
Acknowledgement
Our study was supported by the National 863 Plan Projects of China (2006AA020909). The authors gratefully acknowledge the cooperation of the Abbott, Beckman, Dade, Roche, Hitachi and Ortho organizations.
