Error models for immunoassays

Abstract

Background

For nearly 20 years, we and others have used a three-parameter power function as a direct estimation error model for immunoassays. The main application is imprecision profile plots (after translating from variance to coefficient of variation) but other uses include weighting functions for regression analysis and variance stabilizing transformations. Although generally successful, the intrinsic monotonicity of the function means that it fails to describe small but distinct increases in variance that occasionally occur near the assay detection limit.

Methods

A systematic comparison of five variance functions was undertaken, using randomly drawn samples from a large body of real immunoassay data.

Results

Variance function accuracy (hence imprecision profile accuracy) can be markedly improved, particularly near the assay detection limit, by employing a pair of complementary three-parameter power functions, together with a constrained four-parameter function, which provides for a variance turning point.

Conclusions

A set of rules, based on an objective goodness-of-fit statistic, can be used to automate presentation of the most appropriate function for any particular data-set. Flexibility is easily incorporated into the selection rules and is actually highly desirable to encourage ongoing evaluation with a wider variety of data. A Win32 computer program that performs the variance function estimation and plotting is freely available.

Introduction

Immunoassay practitioners have estimated and plotted the relationship between variability and concentration for more than 30 years. The original precision (imprecision) profile method, due to Ekins,^1–3 combined the assay response-error relationship (RER) with standard curve slope to yield a smooth function relating variability of assay results to concentration. Despite the popularity and utility of this approach for estimating intra-batch profiles, there are practical limitations as we pointed out previously.⁴ First, the widely used family of logistic standard curve models produce slopes that asymptotically tend to either zero or infinity and therefore predicted variances tending to either infinity or zero as concentration approaches zero. These spurious predictions are easily recognized in plots of variance or standard deviation (SD) versus concentration but are invariably completely concealed in plots of coefficient of variation (CV) versus concentration. Secondly, RER parameters are not readily available for singleton measurement assays and, compounding that, manufacturers do not always disclose standard curve parameter values from automated assays or even the mathematical model in some cases. Finally, it has never been clear how this general approach can be applied to inter-batch or inter-laboratory (or higher) variance relationships.

A grand error model encompassing all immunoassay contexts could no doubt be constructed but, complexity aside, parameter estimation is likely to be intractable in most situations because of incomplete knowledge. It now seems well accepted that the only practical solution is to directly fit simple variance models to data. Baxter⁵ suggested a parabolic variance model,

fitted by least squares regression, where U and σ ² denote mean and variance, respectively, and β ₁, β ₂ and β ₃ are the parameters. We suggested a three-parameter power function derived from Ekins'⁶ method,

with estimation by likelihood methods.^7–9 Equation (2) is monotone and guarantees positive-predicted variances over the range of the data. We have used it continuously since 1989 for internal quality control (QC) and method evaluations. Berweger et al. ¹⁰ also adopted Equation (2) but used a different estimation method.

Our long-term experience supports the general utility of Equation (2), but two complications have become apparent. First, when the data exhibit approximately constant variance Equation (2) reliably produces the expected horizontal shape, but occasionally a sharp upward or downward ‘hook’ occurs at one end or the other. This spurious behaviour simply indicates that Equation (2) is over-parameterized (too flexible) when variance is approximately constant. Secondly, immunoassay data almost always exhibit a smooth decrease in variance as concentration decreases toward zero, but occasionally a small but distinct upturn in variance occurs near the assay detection limit. Monotone Equation (2) does not provide for a turning point. Thus, a simpler model may be required when variance is approximately constant, and a more complex model is required in cases of a variance turning point near zero. This paper describes a systematic evaluation of some three- and four-parameter variance models and suggests a scheme that automates variance function selection but also offers scope for user intervention and experimentation.

Methods

Data

From January 1989, all duplicate clinical specimen results and all QC results from the following in-house RIAs were systematically archived into a database; total Thyroxine (T4) radioimmunoassay (RIA) using magnetic T4 antibody (Ciba-Corning, Medfield MA, USA; product 472299) from 1989 to 2000, and liquid-phase T4 antibody with magnetic second antibody (Cortex Biochem, San Leandro, CA, USA; product CA3178) separation thereafter; total Triiodothyronine (T3) RIA using magnetic T3 antibody (Ciba-Corning; product 100029901) from 1989 to 2000 and liquid-phase T3 antibody and magnetic second antibody thereafter; Thyrotropin (TSH) RIA (1989–1991 only); β ₂-microglobulin (β ₂m) RIA using cellulose-coupled second antibody (Immunodiagnostic Systems, Boldon, UK; product AA-SAC1) from 1989 to 2000 and magnetic second antibody thereafter. All results were indexed by batch, date and operator, and QC results were further indexed by position (start or end of the batch) and measurement order (which was systematically rotated). Similarly, all QC results from a singleton measurement TSH immunoradiometric assay (IRMA) (Serono Maiaclone, Rome, Italy; 1991–1997 only) and automated TSH immunoluminometric assay (ILMA) (Sanofi/Beckman/Coulter Access instrument, Fullerton, CA, USA; 1997 onwards) were also archived. The database contained 403,356 clinical specimen duplicates and 41,472 QC results at the commencement of the experiments described below. A set of C-reactive protein (CRP) inter-batch evaluation results¹¹ that gave an excellent example of a variance turning point near zero (12 specimens, 265 degrees-of-freedom (df); Beckman-Coulter Immage™ instrument), were kindly provided by Jill Tate, QHPS Chemical Pathology, Royal Brisbane and Women's Hospital, Brisbane, Australia.

Sampling experiments

For each combination of assay and protocol, 1000 sets of intra-batch duplicates with sizes n = 50, 100, 250 and 1000 (where possible) were randomly drawn (without replacement) from the archived data, reflecting the distribution of mean values found in clinical practice. These experiments were repeated with selection organized to yield distributions of mean values that were approximately uniform, centrally located and concentrated at the lower end. Likewise, for each combination of assay and protocol, groups of QC specimens of sizes n = 5, 10, 20 (where possible) and 30 (where possible) were selected to produce distributions of mean values that were approximately uniform, centrally located and concentrated at the lower end. In each case, 1000 n × r sets were generated, where r = 5, 10, 20 (where possible) and 50 (where possible) replicates were randomly drawn (without replacement) from the runs of available results. In each case, five variance functions were estimated by conditional likelihood;⁹ Equations (1) and (2) and the following,

Equation (3) was previously suggested by Daniels¹² as an immunoassay RER model. Equations (2) and (3) are monotone within the data range, whereas Equations (1), (4) and (5) have a single turning point that may or may not occur within the data range. A mapping approach to parameter estimation¹³ was used throughout to ensure convergence on global solutions. Goodness-of-fit was determined by the log-likelihood ratio (γ) developed by Raab.⁸ Variance function curvature (K) at any particular mean value was calculated by the standard formula, which expresses the rate of change of the angle subtended by the tangent at that point;

where f ′(U) and f ″(U) are, respectively, the first and second differentials of the function with respect to U.

Results

Monotone models

A very large number of sampling sets with strictly monotone data showed Equations (2) and (3) to be almost equally successful in terms of goodness-of-fit. This was unexpected because a small evaluation in 1987 (prior to Sadler et al. ⁴) demonstrated that Equation (2) had better overall curvature properties than Equation (3), leading us to conclude that it ought to be a superior general immunoassay model. Consequently, the entire sampling experiment was repeated with calculation of curvature at each 10th percentile of the range. This vastly more detailed evaluation showed that Equation (2) consistently yielded larger ratios of upper-end curvature to lower-end curvature relative to Equation (3); that is, Equation (2) shows relatively greater curvature at the upper end and Equation (3) relatively greater curvature at the lower end. This explained findings that, on average, Equation (2) gave superior fits with uniform distributions, whereas Equation (3) performed better when mean values were concentrated at the lower end. It now seems clear that Equations (2) and (3) should be regarded as complementary. Typical curvature patterns are illustrated in Figure 1, where Equation (2) (γ = 1.06) gave a better fit than Equation (3) (γ = 1.36) in this particular case.

Figure 1

Thyroxine 4 RIA specimens (magnetic second antibody protocol) used for generating sampling sets with N = 5 and a uniform distribution. The five data points are based on 131–143 replicates (681 df in total). The fits of monotone Equations (2) and (3) illustrate their complementary curvature patterns

Turning point models

Figure 2 shows that the first 25 sampling profiles for parabolic Equation (1) where the sampling data (5 × 20 design; 95 df) were drawn from the ‘population’ data illustrated in Figure 1. Despite moderate curvature and a very moderate 20-fold relative change in variance, there is clearly a positive bias in the middle of the range and negative bias at the upper end. Equations (4) and (5) both rectified the systematic bias illustrated in Figure 2 but they were nevertheless hopeless as general error models. Equation (4) produced relatively high frequencies of spurious maxima at the upper end of the range and, on rare occasions, both models produced examples of negative-predicted variance at the lower end. Constraints are required in order to exploit the turning point capability of Equations (4) and (5). As the aim is solely to provide for a small U-shaped turning point near the assay detection limit, it is logical to reject Equations (4) and (5) outright unless they produce a minimum at the lower end. The lower end region might be formally defined as follows,

where U _min and U _max are the minimum and maximum mean values, respectively. This immediately requires U _max/U _min >10 to ensure upper limit >U _min. Although there is no obvious way of guaranteeing that the turning point will be located near the detection limit, Equation (7) does guarantee that it must always lie inside the lowest 10th percentile of the range, and well inside it when U _max/U _min is large. In practice, mean values closer than about 2 SDs from zero should be excluded from variance function analysis in order to avoid negative replicate values. As is well known, immunoassay data reduction programs invariably convert negative results to zero and retaining such replicate values would cause systematic underestimation of the sample variance and overestimation of the mean. Given that U _min should be at least 2 SDs from zero, Equation (7) also implies that the smallest possible range will be defined as approximately 2–20 SDs from zero (and assuming U _max/U _min > 10).

Figure 2

Repeat of the ‘population’ data from Figure 1 and the fit of parabolic Equation (1) to the first 25 of the 1000 randomly drawn sampling sets (N = 5, r = 20; 95 df)

A further constraint is required to control negative-predicted variances. The estimation methods given in Finney and Philips,⁷ Rabb⁸, and Sadler and Smith⁹ are not identical, but the underlying likelihood functions all contain the term log (σ^_i ²) where σ^;_i ² is predicted variance at the ith data point; that is, the likelihood methods have the important built-in constraint that variance functions must predict positive variance at all data points. Positive-predicted variance is therefore guaranteed over the entire range for monotone functions, such as Equations (2) and (3). Figure 3 illustrates a sample of CRP data and shows that despite positive-predicted variances at all data points, Equation (5) produced a meaningless result. As the intention is to allow turning point functions only when they produce a minimum inside the Equation (7) limits, and as we also expect the function to pass through the low-end data points, it seems reasonable to also require,

where σ^_TP ² is predicted variance at the turning point and S _min ² is the smallest sample variance within the range defined by Equation (7). The resulting constrained fit of Equation (5), i.e. ‘5c’, is also shown in Figure 3. The free- and constrained fits produced γ values 1.17 and 2.05, respectively. This might seem a disconcertingly large relative difference but, to put it into perspective, γ values for monotone Equations (2) and (3) were 17.7 and 13.1, respectively. In this particular case, the monotone functions also gave poor fits at the upper end. Arbitrary constraints such as those defined by Equations (7) and (8) might seem unsatisfying but they are commonplace in modelling situations and essential in this particular context. Constrained fits should be regarded best-fits subject to (hopefully) well-founded restrictions.

Figure 3

Blow-up of the low concentration region of a randomly drawn sample of C-reactive protein data (N = 10, r = 10; 90 df) showing the fits of monotone Equations (2) and (3) and the free and constrained fits of Equation (5), where ‘5c’ indicates the constrained fit. The inset shows the full range view

Constrained fits

A repeat of the sampling experiments with constrained estimation showed Equations (4) and (5) to generally produce turning points in parallel, but goodness-of-fit results favoured Equation (5) in a ratio of approximately 4:1. Moreover, the Equation (4) numerical process consistently required more iterations and revealed higher frequencies of ‘local’ (spurious) solutions. In fact, the general form of Equation (4) means that pairs of exactly equivalent global solutions are possible, i.e. [β ₁ = a, β ₂ = b, β ₃ = 0, J = 2] and [β ₁ = a², β ₂ = 2ab, β ₃ = b², J = 1]. In short, the evidence, derived from many samples of real data, suggested Equation (5) as the superior immunoassay turning point model.

Discussion

Following the pioneering work of Ekins in the 1970s, there has been a recent revival of interest^14–16 in using error functions to characterize the performance of analytical methods. Recent work appears to be restricted to chemical or biochemical tests that have a linear calibration relationship, leading to a simple two-parameter error model,

It follows that Equations (1) and (9) are both likely to be reasonable models for immunometric (excess antibody) assays because of their near linear calibration relationships. Our long-term experience has shown that parameter J is usually close to 2 when Equation (2) is applied to immunometric assay data. Unfortunately, limited reagent immunoassays produce a wide continuum of non-linear calibration ‘shapes’ and therefore require more complicated error models than Equations (1) and (9). Figure 2 illustrates the failure of a parabolic variance model with data from a perfectly ordinary RIA.

Equation (2) is easily derived as a simplification of Ekins'⁶ method, specifically by assuming a two-parameter power function as the RER model and a four-parameter logistic function for the standard curve. Note that if a straight line or parabola is used as the RER model, an equivalent simplification leads to a quartic polynomial, which is unsuitable as a direct estimation variance function because it may produce up to three turning points. The general form of Equation (2) implies that it can provide, via parameter J, for a continuum of monotone variance relationships, effectively corresponding to the continuum of immunoassay calibration ‘shapes’. Equation (3) is a complementary alternative.

As we pointed out previously,⁴ the high flexibility of Equation (2) means that it can produce some absurd sampling shapes with small quantities of data, and especially when the range of mean values is small. Over the last 20 years, we have circumvented that complication by accumulating moderate to large quantities of data prior to analysis (minimum: 100 df). It could be argued that analysis of RER and standard curve parameters ought to yield an approximate ‘characteristic’ value of parameter J for a particular assay, which could be held fixed thereafter (or at least until assay reagents and/or conditions are modified), thereby potentially allowing more reliable estimation from smaller quantities of data. That approach is probably worth investigating but, in the real world, it would not only require immunoassay manufacturers to show greater willingness to disclose internal details of assays, but also a certain level of mathematical expertise within laboratories.

Variance turning points add a further complication. Early theoretical work by Ekins et al. ¹ showed that turning points can be predicted for limited reagent assays. The main factor is the increase in response error as concentration tends to zero. The reverse applies for immunometric assays, which implies strictly monotone variance relationships for that assay type. However, it is important to appreciate that these considerations apply only to intra-batch (baseline) error. The turning point observed in a CRP immunometric assay (Figure 3) could hardly be more clearly defined, but it contradicts theoretical expectation. It is almost certainly due to issues related to calibration of the instrument. We previously reported a turning point in an automated TSH immunometric assay,¹⁷ which was almost certainly due to daily single point recalibration of the instrument using a specimen significantly removed from zero (as expected, the corresponding intra-batch relationship showed no evidence of a turning point). Any calibration ‘mapping’ that adjusts low-end positioning to a greater degree than would be predicted by local response error is virtually guaranteed to produce a variance turning point. More generally, a turning point could never be ruled out for inter-laboratory data, regardless of baseline theoretical predictions or the efficacy of day-to-day calibration. Equations (4) and (5) are the simplest extensions of Equations (2) and (3) that provide for a single turning point but they have no obvious theoretical justification. They also require constrained estimation. Nevertheless, Equation (5) appears to provide a reasonable empirical solution.

It is clear that no simple variance model provides for all immunoassay data patterns. The aims here were not only to evaluate variance functions but also to define a logical automated selection scheme. The sampling experiment results suggest the following as one possibility;

Estimate Equation (2) as the ‘standard’ immunoassay variance function.

If the fit is essentially indistinguishable from a straight line fit, or exhibits a vertical ‘hook’ (over-parameterization case), then adopt the straight line as the appropriate model.

Otherwise, use Equations (7) and (8) constraints to estimate Equation (5) and if it produces an arbitrarily large (>25%) reduction in γ then adopt it as the appropriate model, but retain Equation (2) as a selectable option.

Otherwise, estimate Equation (3) and if it produces an arbitrarily large (>10%) reduction in γ then adopt it as the appropriate model, but retain Equation (2) as a selectable option.

Otherwise, adopt Equation (2).

A reasonably large reduction in γ must be used in step (3) to prevent acceptance of a turning point fit that is essentially just following a single data point at the low end. When the data have a genuine turning point, such as illustrated in Figure 3, the reduction is likely to be spectacularly large (>50%). Collectively, steps (3) and (4) offer greatly improved accuracy at the low end of the range, which is often of particular interest.

Although immunoassay variance functions are mainly used in constructing imprecision profile plots (usually after translating to CV versus mean), they have other statistical applications. For example, Bartlett¹⁸ showed that if the form of the variance function is known, say σ ²(U) = ϕ(U), then transforming the raw data values by the integral of [ϕ(U)]^−½ should stabilize the variance. Equation (2) and a straight line model yield simple integral expressions (hence transformation functions) for all parameter values,¹⁹ but Equations (3)–(5) yield integrals only when parameter J has a small integer value (a vanishingly small frequency in practice). That is the main reason for ensuring the availability of Equation (2) irrespective of whether it is the adopted function.

A Win32 computer program that implements the estimation scheme described above can be freely downloaded from the Australasian Association of Clinical Biochemists website (www.aacb.asn.au). A link to the program can be found under Resources > Tools. The arbitrary criteria in Equations (7) and (8) and selection steps (3) and (4) are program default settings. Each can be independently adjusted in either direction to allow experimentation with the acceptance conditions for Equations (3) and (5).

Footnotes

ACKNOWLEDGEMENT

I am most grateful to Jill Tate, Royal Brisbane and Women's Hospital, Australia, for providing an excellent example of a variance turning point.

References

Ekins

, Newman

, Piyasena

, Banks

, Slater

JDH

. The radioimmunoassay of aldosterone in serum and urine: Theoretical and practical aspects. J Steroid Biochem 1972;3:289–304

Ekins

. Basic concepts in quality control. In: Radioimmunoassay and Related Procedures in Medicine 1977. Vienna: IAEA, 1978:6–20

Ekins

. The precision profile: its use in assay design, assessment and quality control. In: Hunter

, Corrie

JET

eds. Immunoassays for Clinical Chemistry. 2nd edn. Edinburgh: Churchill Livingston, 1983:75–105

Sadler

, Smith

, Legge

. A method for direct estimation of imprecision profiles, with reference to immunoassay data. Clin Chem 1988;34:1058–61

Baxter

. Simplified approach to confidence limits in radioimmunoassay. Clin Chem 1980;26:763–5

Sadler

, Smith

. Estimation of imprecision in immunoassay quality assessment programmes. Ann Clin Biochem 1987;24:98–102

Finney

, Phillips

. The form and estimation of a variance function with reference to immunoassay. Appl Stat 1977;26:312–20

Raab

. Estimation of a variance function, with application to immunoassay. Appl Stat 1981;30:32–40

Sadler

, Smith

. A reliable method of estimating the variance function in immunoassay. Comp Stat Data Anal 1986;3:227–39

10.

Berweger

, Muller-Plathe

, Hanseler

, Keller

. Estimating imprecision profiles in biochemical analysis. Clin Chim Acta 1998;277:107–25

11.

Tate

, Koerbin

, Potter

, Hickman

. Evaluation of the Beckman-Coulter IMMAGE high sensitivity C-reactive protein (CRPH) assay. Clin Chem Lab Med 2002;40:pS226

12.

Daniels

. The fitting, acceptance, and processing of standard curve data in automated immunoassay systems as exemplified by the Serono SR1 analyzer. Clin Chem 1994;40:513–17

13.

Sadler

, Smith

. A computer program for estimating imprecision characteristics of immunoassays. Comput Biomed Res 1990;23:105–14

14.

Rocke

, Lorenzato

. A two-compartment model for measurement error in analytical chemistry. Technometrics 1995;37:176–84

15.

Rocke

, Durbin

, Wilson

, Kahn

. Modelling uncertainty in the measurement of low-level analytes in environmental analysis. Ecotoxicol Environ Saf 2003;56:78–92

16.

Thompson

, Wood

. Using uncertainty functions to predict and specify the performance of analytical methods. Accred Qual Assur 2006;10:471–8

17.

Sadler

, van der Splinter

, Murray

, Turner

. Calibration of automated thyrotropin assays. Clin Chem [Letter] 1993;39:904–5

18.

Bartlett

. The use of transformations. Biometrics 1947;3:39–52

19.

Sadler

, Smith

. Applications of a three-parameter variance function in statistical analysis of immunoassay data. In: Developments in Radioimmunoassay and Related Procedures. Vienna: IAEA, 1992:383–92