Abstract
Background
Neutrophil gelatinase-associated lipocalin (NGAL) is a promising biomarker for acute kidney injury that is beginning to be used in clinical practice in addition to research studies. The current study describes an independent validation and comparison of five commercially available NGAL assays, focusing on urine samples. This is an essential step in the translation of this marker to clinical use in terms of allowing valid inter-study comparison and generation of robust results.
Methods
Two CE (Conformité Européenne)-marked assays, the NGAL Test (BioPorto) on Siemens ADVIA® 1800 and the ARCHITECT Urine NGAL assay on i2000SR (Abbott Laboratories), and three research-use-only (RUO) ELISAs (R&D Systems, Hycult and BioPorto) were evaluated. Imprecision, parallelism, recovery, selectivity, limit of quantitation (LOQ), vulnerability to interference and hook effect were assessed and inter-assay agreement was determined using 68 urine samples from patients with various renal diseases and healthy controls.
Results
The Abbott and R&D Systems assays demonstrated satisfactory performance for all parameters tested. However, for the other three assays evaluated, problems were identified with LOQ (BioPorto/ADVIA®), parallelism (BioPorto ELISA) or several parameters (Hycult). Between-method agreement varied, with the Hycult assay in particular being markedly different, highlighting issues with standardization and the form of NGAL measured.
Conclusions
Variability exists between the five NGAL assays in terms of their performance and this should be taken into account when interpreting results from the various clinical or research studies measuring urinary NGAL.
Introduction
The incidence of acute kidney injury (AKI) is increasing and AKI is recognized as a major health problem. 1 Occurring as a complication of sepsis, following major surgery or as a direct result of nephrotoxins, AKI results in increased mortality and need for critical care. AKI is also associated with other long-term health problems, for example an increased risk of chronic kidney disease. Serum creatinine is relatively insensitive and has poor specificity, and new biomarkers are urgently needed to aid in diagnosis and management.2,3
One of the most promising emerging markers is neutrophil gelatinase-associated lipocalin (NGAL). Originally isolated from neutrophil granules 2 it is expressed at low levels in several organs and has been implicated as a marker of AKI following transcriptomic profiling studies in animal models of ischaemic renal injury. 4 Using an NGAL reporter mouse model, clear association of urinary NGAL and a renal origin has been demonstrated. 3 NGAL is known to exist in at least three forms, a 25 kDa monomer thought to be the predominant form released by renal tubules, a 45 kDa homodimer predominantly secreted by neutrophils and a 135 kDa NGAL/matrix metalloproteinase-9 (MMP-9) covalently complexed heterodimer.2,5 In terms of function, in addition to binding and protecting against MMP-9-mediated degradation, NGAL-mediated shuttling of iron through binding siderophores is key to its role in bacteriostasis and in influencing proliferation, differentiation and apoptosis of a variety of cell types. 6 Urinary, plasma and serum NGAL have been reported to be superior to the use of creatinine in AKI, either diagnostically or in predicting severity or outcome.7–11 Recently a large prospective study has also shown urinary NGAL to outperform urinary kidney injury molecule-1 (KIM-1), cystatin C, interleukin-18 and liver-type fatty acid binding protein for AKI, both diagnostically and prognostically. 12 Additionally an association with delayed graft function following renal transplantation is now emerging.13,14
Critical evaluations of the literature highlighting the potential of NGAL and other markers now recommend larger, clearly designed prospective studies and with attention to issues such as thorough statistical analysis, assays used and confounding factors such as chronic kidney disease.8,11 Currently (as at 1 September 2012) on the
Materials and methods
Patient samples
Mid-stream urine samples (n = 68) were obtained from several patient groups treated at St James's University Hospital during 2011, to ensure a range of NGAL concentrations and potential sources and varying urinary matrix backgrounds. These included patients with AKI diagnosed according to Acute Kidney Injury Network criteria 20 (n = 25, from 21 hospitalized patients, with acute tubular injury, glomerulonephritis, tubulointerstitial or obstructive), renal cancer (n = 10), benign urological conditions such as renal stones and recurrent urinary tract infections (UTI) (n = 10) and diabetic albuminuria (n = 13). Samples were also obtained from healthy volunteers recruited within a similar time period as the patients and with no known illnesses (n = 10). All samples were obtained following ethical approval and with informed consent (with the exception of the diabetic albuminuric samples which were anonymized surplus diagnostic samples). Within two hours of collection (with the exception of the surplus diagnostic albuminuric samples which were kept at 4℃ and not obtained until up to eight hours after collection), samples were centrifuged for 10 min at 2000 ×
NGAL assays
The CE-marked assays evaluated were the ARCHITECT® Urine NGAL assay (Abbott Laboratories, North Chicago, IL, USA), a two-step chemiluminescent microparticle assay for use on the ARCHITECT® analysers (the i2000SR was the model used in this study) and the NGAL Test™ (BioPorto Diagnostics A/S, Gentofte, Denmark), a particle-enhanced turbidimetric immunoassay, available for use on several automated platforms. The ADVIA® 1800 (Siemens Healthcare, Erlangen, Germany) was used in our study although the original application note for this platform has been withdrawn since this study. The RUO assays evaluated were the Human NGAL ELISA (HK330) Kit (Hycult Biotech, Uden, Holland), the Human NGAL ELISA Kit 036 (BioPorto Diagnostics A/S, Gentofte, Denmark) and the Quantikine Human Lipocalin-2/NGAL (DLCN20) Immunoassay (R&D Systems, Minneapolis, MN, USA), all of which are sandwich ELISAs.
Assay validation was based upon the USA Food and Drug Administration 21 and Clinical and Laboratory Standards Institute22,23 guidelines. Measurements were performed in singlicate on the ARCHITECT® and ADVIA® platforms and in duplicate on all ELISAs in accordance with the manufacturers’ instructions, with coefficients of variation (CVs) of <10% within replicates considered acceptable. Samples used for the various stages of the validation were the same for all assays unless otherwise indicated.
Quality control and imprecision
Low and high quality control (QC) samples supplied by the appropriate assay manufacturer were analysed in each analytical run. For the Hycult and BioPorto ELISAs, where none were provided, QCs supplied with the BioPorto NGAL Test were used. Internal low and high QC urine samples were also prepared in-house from pooled urine from AKI patients and healthy controls at concentrations likely to be encountered in clinical practice (Supplementary Table S1; please see
Parallelism
Urine samples from three patients with AKI were serially diluted in two-fold steps over and above the required initial dilution for analysis, to evaluate parallelism based on dilution-adjusted (back-calculated) concentrations. 24 Parallelism was considered to be acceptable if the CV of the four concentrations was ≤15%.
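The parallelism criterion above can be illustrated with a minimal Python sketch. It assumes that the CV is calculated from the sample standard deviation (n − 1) of the dilution-adjusted concentrations; the function names and the example readings are hypothetical and are not taken from the study data.

```python
from statistics import mean, stdev


def parallelism_cv(measured, dilution_factors):
    """Return the %CV of dilution-adjusted (back-calculated) concentrations.

    measured: concentrations read off the standard curve (ng/mL)
    dilution_factors: total dilution applied to each reading (e.g. 2, 4, 8, 16)
    """
    if len(measured) != len(dilution_factors):
        raise ValueError("one dilution factor is needed per measurement")
    adjusted = [m * d for m, d in zip(measured, dilution_factors)]
    # Assumption: sample standard deviation (n - 1 denominator)
    return 100 * stdev(adjusted) / mean(adjusted)


def is_parallel(measured, dilution_factors, limit=15.0):
    """Apply the <=15% CV acceptance limit stated in the text."""
    return parallelism_cv(measured, dilution_factors) <= limit


# Hypothetical readings for two samples at four two-fold dilutions
factors = [2, 4, 8, 16]
parallel_sample = [100.0, 51.0, 24.0, 12.8]    # adjusted values stay close together
drifting_sample = [100.0, 60.0, 40.0, 30.0]    # adjusted values rise with dilution

print(round(parallelism_cv(parallel_sample, factors), 1), is_parallel(parallel_sample, factors))
print(round(parallelism_cv(drifting_sample, factors), 1), is_parallel(drifting_sample, factors))
```

A sample whose back-calculated concentration rises or falls steadily with dilution, as in the second hypothetical case, exceeds the limit even if each individual reading is precise.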
Recovery
Recovery was assessed by spiking recombinant (r) NGAL (SCIPAC Ltd, Sittingbourne, UK; >96% pure) in phosphate-buffered saline (PBS; pH 7.2–7.6) containing 0.1% (w/v) human serum albumin (HSA) as a carrier protein, into four urine samples (three AKI, one healthy). The concentrations of the low and high spikes were tailored to the low and mid-range of the assay standard curves and produced theoretical increases in NGAL ranging from 34.2 to 2287 ng/mL depending on the assay (Supplementary Table S1). Percentage recovery was calculated as: ([final concentration–initial concentration]/added concentration) × 100 with acceptable limits set at 80–120%.
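As a worked illustration of the recovery formula and the 80–120% acceptance limits, a short Python sketch follows. The function names and the example concentrations are hypothetical; only the formula and the limits come from the text.

```python
def percent_recovery(final_conc, initial_conc, added_conc):
    """([final - initial] / added) x 100, all concentrations in ng/mL."""
    if added_conc <= 0:
        raise ValueError("added concentration must be positive")
    return (final_conc - initial_conc) / added_conc * 100


def recovery_acceptable(recovery, low=80.0, high=120.0):
    """Check a percentage recovery against the 80-120% limits."""
    return low <= recovery <= high


# Hypothetical sample: 50 ng/mL endogenous NGAL, spiked with 100 ng/mL rNGAL
good = percent_recovery(final_conc=142.0, initial_conc=50.0, added_conc=100.0)
poor = percent_recovery(final_conc=123.6, initial_conc=50.0, added_conc=100.0)

print(round(good, 1), recovery_acceptable(good))
print(round(poor, 1), recovery_acceptable(poor))
```

In this sketch the added concentration is whatever value is assigned to the spike, so the result depends on how the spiking material itself is quantified.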
Selectivity
To assess whether the NGAL/MMP-9 complex 2 was detected by the assays or affected measurement of free NGAL, human recombinant matrix metalloproteinase 9 (rMMP-9; R&D Systems; final sample concentrations 0.5 and 30 ng/mL) and rMMP-9-NGAL complex (Merck KGaA, Darmstadt, Germany; final sample concentrations 1 and 200 ng/mL) were spiked separately into assay diluents and four urine samples (three AKI, one healthy).
Limit of quantitation
Due to the lack of a certified reference material and the varying standard curve ranges making it difficult to use the same urine samples with low NGAL concentrations across assays, the limit of quantitation (LOQ) was determined from repeat analysis of the two lowest standards and serial dilutions of the lowest standard for each NGAL assay, on five separate days. The criteria of a CV of <20% and accuracy of 80–120% were used. 21
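The two LOQ criteria can be combined as in the following Python sketch. It assumes that the LOQ is taken as the lowest nominal concentration at which both criteria are met, with every higher level tested also passing, and that the CV uses the sample standard deviation. The function names and the replicate values are hypothetical.

```python
from statistics import mean, stdev


def meets_loq_criteria(replicates, nominal, max_cv=20.0, accuracy=(80.0, 120.0)):
    """True if repeat measurements of one level have CV <20% and accuracy 80-120%."""
    avg = mean(replicates)
    cv = 100 * stdev(replicates) / avg          # assumption: n - 1 denominator
    acc = 100 * avg / nominal                   # mean result as % of nominal
    return cv < max_cv and accuracy[0] <= acc <= accuracy[1]


def estimate_loq(results):
    """results maps nominal concentration (ng/mL) -> replicate measurements.

    Walk down from the highest level and stop at the first failure, so that
    an isolated pass below a failing level is not reported as the LOQ.
    """
    loq = None
    for nominal in sorted(results, reverse=True):
        if not meets_loq_criteria(results[nominal], nominal):
            break
        loq = nominal
    return loq


# Hypothetical data: one result per day over five days at each level
results = {
    10.0: [9.8, 10.3, 10.1, 9.6, 10.4],
    5.0: [4.6, 5.3, 5.1, 4.8, 5.4],
    2.5: [1.5, 3.4, 2.9, 1.8, 3.6],
}
loq = estimate_loq(results)
print(loq)
```

Any sample dilution factor required by the assay would still need to be applied to the value returned here.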
Haemoglobin interference
Haemolysate was prepared from washed erythrocytes, lysed by freeze-thawing in distilled water. Haemolysate was spiked into four urine samples (three AKI, one healthy control) to haemoglobin concentrations of 5 mg/mL (as recommended by CLSI 22 ) and also at concentrations of 2.25 μg/mL, 1.125 μg/mL and 0.75 μg/mL representing +++/++/+ by dipstick urinalysis (Multistix® 8SG; Siemens Diagnostics, Marburg, Germany).
High-dose hook effect
The possible presence of a hook effect was examined by spiking rNGAL into three urine samples from AKI patients and also into each assay diluent, at maximum concentrations of 25,000–100,000 ng/mL.
Inter-assay comparisons
Urinary NGAL concentrations of all 68 samples were compared across all assays using a modified Bland–Altman plot 25 and Passing–Bablok analysis.
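For readers unfamiliar with the approach, the mean bias and limits of agreement underlying a Bland–Altman comparison can be sketched in Python as below. This is the standard calculation (mean difference ± 1.96 SD of the differences) and does not reproduce whatever modification was applied in this study; the function name and the paired values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(assay_a, assay_b):
    """Return (bias, lower limit, upper limit) for paired results.

    Differences are taken as assay_a - assay_b, i.e. the second assay
    subtracted from the first.
    """
    if len(assay_a) != len(assay_b):
        raise ValueError("paired results are required")
    diffs = [a - b for a, b in zip(assay_a, assay_b)]
    bias = mean(diffs)
    sd = stdev(diffs)                           # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired NGAL results (ng/mL) from two assays
assay_a = [10.0, 20.0, 30.0, 40.0]
assay_b = [12.0, 18.0, 33.0, 41.0]

bias, lower, upper = bland_altman(assay_a, assay_b)
print(round(bias, 2), round(lower, 2), round(upper, 2))
```

Samples below the LOQ of either assay would need to be excluded before the pairs are passed to such a function.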
Results
Summary of the performance data for the five NGAL assays evaluated in this study using urine samples
Full details of all results and spike concentrations are provided in Supplementary Table S1. The recovery results are based on assay-specific assigned concentrations for the spiked rNGAL given the lack of a reference standard
LOQ, limit of quantitation; CV, coefficient of variation; MMP-9, matrix metalloproteinase-9
*Not determined or determined in limited samples due to endogenous NGAL concentrations being <LOQ as determined by this study
†The same samples were used across all assays for each parameter investigated with the exception of the low spike recovery study with the R&D Systems assay due to problems with the initial run and inadequate initial samples for the repeat analysis
Imprecision
Intra-assay imprecision was generally acceptable across all platforms, with the single exception of the low NGAL QC urine sample in the Hycult assay, which had a CV of 34.4%. Inter-assay imprecision was also generally acceptable with the exception of the Hycult assay; additionally, almost one-third of replicate CVs were >10% with this assay.
Parallelism assessment
Using the Abbott and R&D Systems assays, parallelism was demonstrated (Figure 1). However, all samples failed to dilute in parallel in the Hycult assay and 2/3 samples failed to dilute in the BioPorto ELISA even when repeated. One sample failed to dilute on the BioPorto/ADVIA® assay, potentially due to issues relating to the LOQ (see below).
Assessment of parallelism, comparing dilution-adjusted NGAL concentrations (log scale) against serial double dilutions of the samples. The initial dilution factor used is indicated next to each sample dilution. Fine dashed lines illustrate ±15% of the mean for each sample. ‘<limit of quantitation (LOQ)’ highlights sample dilutions that fall below the LOQ determined by this study
Limit of quantitation
The LOQ values (including any sample dilution factors where relevant) for the Abbott assay and R&D Systems, BioPorto and Hycult ELISAs were 5, 2, 5 and 20 ng/mL respectively (Supplementary Figure S2; please see
Recovery
Near quantitative recovery was observed when a stock solution of rNGAL (1 mg/mL; ∼80–90% monomeric form on non-reducing 1D-PAGE) was assayed using the BioPorto ELISA. However, significantly lower results (67–75%) were obtained on the other assays and only 31% with the Hycult assay, possibly reflecting differences in standardization or specificity for the different NGAL forms. In order to allow assessment of relative recovery, the recovery of spiked material in urine samples was related to the assay-specific assigned concentrations for the rNGAL stock. Using this approach, the Abbott assay and R&D Systems, BioPorto and Hycult ELISAs all showed acceptable recovery, with the exception of a low recovery (73.6%) for one of the samples in the Hycult assay.
Selectivity
The rNGAL/MMP-9 complex was not detected by any of the assays. Addition of rMMP-9 and rNGAL/MMP-9 complex to urine samples did not affect measured NGAL concentrations with the exception of the Hycult assay where effects were seen in most cases, but resulting variably in both increased and decreased NGAL concentrations.
Haemoglobin interference
All assays showed evidence of interference at a haemoglobin concentration of 5 mg/mL 22 but at the lower concentrations examined, only the Hycult assay was affected (11/13 samples).
Hook effect
All spiked samples and controls measured above the upper limit of the assay ranges and diluted back into the range of the assay at the expected point with the Abbott assay and the R&D Systems and BioPorto ELISAs. A typical high-dose hook effect was observed for the BioPorto/ADVIA® assay but only at rNGAL concentrations >∼70,000 ng/mL (Figure 2a). Results for the Hycult assay (Figure 2b) appeared to plateau at 20–25 ng/mL, although the standard curve spanned up to 100 ng/mL.
Hook effect analysis: (a) Point-to-point line illustrates high dose hook effect seen in the BioPorto (IVD) assay with dotted line showing the upper limit of the assay range; (b) Hycult assay data illustrating plateauing/saturation effect with dotted line showing the upper limit of the assay range
Inter-assay NGAL comparison
Over half of the 68 urine samples assayed using the BioPorto/ADVIA® and Hycult assays were below our determined LOQ values (Supplementary Table S3; please see
Modified Bland–Altman plots for the comparisons (samples < limit of quantitation omitted from the relevant plots) of (a) the Abbott assay, the BioPorto/ADVIA® assay and the R&D Systems and BioPorto ELISAs, and (b) the comparison of the Hycult ELISA assay with each assay; note the expanded y-axis due to the differences seen. The solid red line indicates no bias, with the dotted lines indicating the mean bias and the limits of agreement (mean difference ±1.96 SD of the differences). In each case the difference referred to in the y-axis is the second assay in the plot title subtracted from the first named assay
Discussion
The process of clinical implementation of new biomarker tests is recognized as being inefficient, with various hurdles and bottlenecks.26–28 An important aspect is regulatory approval based on technical evaluation of biomarker test performance, with clinical validity/utility typically being characterized through multiple observational studies and systematic reviews. With certified assays on clinical chemistry platforms and RUO immunoassays being used at various stages in the process, it is often difficult to interpret data across studies and to assess the technical validity of the measurements. This is exemplified by NGAL, one of the most promising emerging markers for AKI,7,11,12,29 and our study illustrates the importance of assay characterization and validation as an early phase of the biomarker translational pathway.
Numerous published studies involving measurement of NGAL exist (for example >250 analysing urinary NGAL as listed in PubMed at March 2012). Several commercially available NGAL assays exist, mostly RUO ELISAs but more recently assays have also been introduced for clinical chemistry analysers although the numbers of studies using these is as yet small. Examination of assay performance and comparability is limited. A published study evaluating the Abbott NGAL assay and aspects of sample stability reported excellent reproducibility and precision although recovery, linearity and selectivity were not examined. 16 A small study confirmed the acceptable variability over the range of the Abbott assay 19 although interestingly also highlighted potential issues with the measurement range of the Biosite Triage NGAL assay for plasma NGAL. An extensive validation of the BioPorto ELISA reported generally excellent performance with urine and plasma, although problems with inter-batch variability were highlighted. 18
Our findings indicated that the Abbott and R&D Systems assays performed acceptably throughout all stages of the validation and produced comparable results. Our findings for the Abbott assay were similar to those previously reported and our LOQ of <5 ng/mL is similar to the previously reported value of <2 ng/mL 16 for functional sensitivity (based on within-run imprecision alone). The BioPorto assays were also broadly comparable but showed a bias towards higher concentrations, albeit lower than the 65% bias reported previously for the BioPorto assay on the Coulter platform compared with the Abbott. 17 This bias was seen with both rNGAL and endogenous urinary NGAL in our study and may indicate slight differences in standardization, or in assay design and the forms of NGAL detected. Due to issues with the LOQ of the BioPorto/ADVIA® assay, absolute comparison of the samples with lower NGAL concentrations was not possible, but all samples falling below the LOQ on this assay (<150 ng/mL) were measured as being <150 ng/mL on the other assays.
The form of NGAL detected by the assays and used in standardization is likely to have contributed to some of the findings, as NGAL exists as a monomer, a homodimer and a heterodimeric complex,2,5 and rNGAL at least may contain various glycoforms. 30 The rNGAL we used for recovery studies was predominantly monomeric but did contain dimers and this may account for the differences between assays depending on the specificity and affinity of the antibodies used for the various forms.5,31 This would also be the case when measuring endogenous NGAL where, although good agreement was largely seen between most assays, differences were apparent in some samples. The results clearly showed that the NGAL/MMP-9 complex was neither detected by the Abbott, BioPorto and R&D Systems assays nor affected their measurement of NGAL.
With the Hycult assay, markedly different results were obtained and the assay appeared to saturate at about 20 ng/mL, whether with endogenous urinary NGAL or rNGAL, although the standard curve ranged up to 100 ng/mL. These effects are likely to be due to differences in standardization in terms of protein amount but also the different forms of NGAL may bind and be detected or interfere to varying extents. Variations in relative amounts of NGAL forms have been reported depending on assay and antibody configurations in assays developed for research studies.31–33 As there is evidence that the relative proportions of the different forms of NGAL change over time, e.g. following surgery 31 or during the menstrual cycle, 34 additional clinical insight may be obtained if assays measuring specifically each of the monomeric, dimeric and NGAL/MMP-9 complex were used.
The main issue encountered with the BioPorto/ADVIA® assay was the LOQ. BioPorto report the ‘measuring range’ of the assay as 25–5000 ng/mL although we found the LOQ to be 150 ng/mL. Reports of optimal cut-off values for NGAL in diagnosing AKI range from 100 to 270 ng/mL, with 150 ng/mL appearing to be optimal. 7 Ideally the LOQ of an assay should not be close to a clinical cut-off due to the greater variability in results obtained in that region. The higher than expected LOQ also made a full evaluation of this assay impossible as many of the samples used contained NGAL concentrations <150 ng/mL. The BioPorto/ADVIA® assay also demonstrated a very distinct high-dose hook effect at concentrations of NGAL > 70,000 ng/mL although such concentrations may not be encountered clinically. Acceptable performance of this assay on the Beckman Coulter AU5822 platform has been described 17 although the LOQ, evidence of any hook effect and recovery were not evaluated. The problems identified in our study may be limited to its use on the Siemens ADVIA® 1800 and since undertaking this study, BioPorto have withdrawn the original application note for this platform and issued a new version.
With the BioPorto ELISA, non-parallelism was observed in two of the three samples tested. This might suggest that the assay is influenced by urinary matrix effects or by differences between endogenous NGAL forms and the recombinant material used for the standards. A previously published study also reported some evidence of non-parallelism (2–22%). 18 The more pronounced non-parallelism (44.6–75.4%) seen in our study could reflect differences in urinary matrices used in the two studies (paediatric versus adult), endogenous NGAL forms and/or the use of different dilution ranges and should be investigated further.
Interference from haemoglobin was observed at the very high haemoglobin concentration of 5 mg/mL; 22 the threshold at which the assays are affected, and whether it falls within the urinary haemoglobin range found clinically, should be determined. A previous study using haemoglobin spiked into urine at 10 mg/mL found no interference on the Abbott assay. 16 However, induced haemolysis of blood affected plasma NGAL concentrations markedly, 18 although this was probably due at least in part to NGAL released from neutrophils 32 rather than solely to interference by haemoglobin per se. One of the samples in our study with a very high NGAL concentration was from a patient with a UTI, and a recent study defining a reference range for urinary NGAL found leukocyturia to be associated with higher NGAL concentrations and additionally found age- and gender-related differences in urinary NGAL, 35 issues supported by other studies.34,36
This study clearly illustrates the need for independent systematic appraisals of biomarker assays, even at the research stage, to ensure appropriate standardization and validity of the results and to enable valid inter-study comparisons. Additionally addressing the lack of uniformity in the presentation of performance data by manufacturers would be of value. A recent meta-analysis of studies examining the diagnostic utility of NGAL for AKI took into account the assay used 7 and it is important that interpretation of studies examining the clinical utility of NGAL take into account the findings described here, facilitating future progression of this biomarker towards use in defined clinical scenarios.
