Abstract
Despite efforts to standardize molecular diagnostic tests, performance differences are not rare. Laboratories are challenged in situations where treatment rules have been established using a reference assay that is different from the assays being used in daily practice. Assessing the viral load status of patients with chronic hepatitis C under modern triple therapy is a recognized example. We demonstrate the use of the range of uncertainty as an easy and efficient tool that can provide information on the certainty of a test result's interpretation in the context of making hepatitis C virus treatment decisions.
Introduction
Diagnostic tests that are used as reference standards do not provide error-free results. The level of error contributed by a reference standard (reference assay) can either positively or negatively bias the sensitivity, specificity, and overall accuracy of a comparison test.1 Despite various efforts to standardize assays, performance differences are not particularly rare among molecular diagnostic assays.2,3 The pertinent question is how the analytical differences between diagnostic assays affect patient management or, ultimately, patient outcomes. The diagnostic risk, which is the uncertainty about the true health status of a patient, might become unpredictable in situations where treatment rules have been established based on test results obtained with a reference assay that is different from the assays being used in daily practice.4 The reference assay does not necessarily reflect the true disease status, but rather a reference status that can be used to evaluate and define treatment rules. This scenario does not appear to be rare in cases where advanced treatment algorithms require regular monitoring of a biomarker in order to balance patient benefit and harm and to avoid the unnecessary use of resources.
Modern triple therapy algorithms for patients with chronic hepatitis C virus (HCV) infection provide a clear example of this scenario.5 Combination therapy with pegylated interferon alpha (PEG-IFN) and ribavirin serves as the basis of treatment in patients infected with HCV. Following the recent approval of the HCV NS3/4 protease inhibitors boceprevir and telaprevir in the US and Europe, triple therapy that adds one of these inhibitors to PEG-IFN and ribavirin has become the new standard of care in many countries for HCV genotype 1-infected patients.6 A key parameter for the management of individualized antiviral therapy is HCV ribonucleic acid (RNA). The HCV genotype, in combination with HCV RNA concentrations based on single HCV RNA measurements at baseline and on treatment, is used to terminate treatment early in virologic nonresponders and to define the treatment duration for patients responding to antiviral therapy (response-guided therapy).5 The clinical decision points are located at the low end of the linear dynamic range of quantitative HCV RNA assays (for example, at 1,000 IU/mL, at 100 IU/mL, and at the limit of quantification and/or the limit of detection, where HCV RNA “detected” is distinguished from HCV RNA “undetected”). The rules for response-guided therapy were established using a single assay for the measurement of HCV RNA.7 This assay does not appear to be widely used in clinical practice due to the availability of newer commercial assays with a higher level of automation. Differences in the analytical performance between the HCV RNA tests available on the market have been reported previously, with precision, accuracy, and sensitivity being the main performance indicators.8,9 However, little is known about the impact of the analytical differences in HCV RNA assay performance on response-guided therapy and sustained virological response assessments at the low end of the measurement range.
Three comparative studies evaluated low HCV RNA viremia using various commercially available HCV RNA assays.10–12 Although the real-time polymerase chain reaction (PCR) assays used in these studies were standardized against the World Health Organization International Standard for Hepatitis C Virus RNA, and although they all have broad linear measurement ranges and low detection limits, head-to-head comparisons revealed differences in precision, quantitation, and detectability between the methods.10–12 Different clinical decisions would be made from single HCV RNA determinations yielding “HCV RNA detected” by one method and “HCV RNA not detected” by another.10,11 In one study, low HCV viremic samples were tested in ten independent runs using one replicate each across five different assays.11 At HCV RNA concentrations around 100 IU/mL and 1,000 IU/mL, both the quantification and the interrun variation, expressed as the percentage coefficient of variation, differed considerably between the assays.11 Accordingly, HCV RNA could be measured above or below the 100 IU/mL or 1,000 IU/mL clinical decision point, depending on the assay used.11 Taken together, these results suggest that the choice of method has implications for clinical decisions based on response-guided therapy and sustained virological response assessments.
Since current treatment regimens frequently cause severe adverse effects, stopping treatment early is intended to avoid toxicity in patients who are unlikely to achieve treatment success. On the other hand, effective selection of patients who are likely to respond to treatment could help to direct health care resources to patients with a high chance of viral clearance and cure.13
It has been recommended that stopping rules be applied with particular caution when HCV RNA values fall within the assay's variability range around the decision thresholds.7 Given that these HCV drugs were approved using specific rules based on one specific HCV RNA test system, it remains a challenge for laboratories that use different HCV RNA test systems to provide physicians with health status information that can be interpreted with high confidence.
In this short communication, we applied a conceptual approach in addressing the range of uncertainty (RoU) in a hepatitis C monitoring scenario, and we presented the formula used to calculate assay-specific RoU intervals with the objective of providing additional information on the certainty of a test result's interpretation.
Methods
We assumed six hypothetical assays with different analytical performances (Table 1). The clinical cutoff (CO) was set to 100 IU/mL, using a drug-specific therapy-stopping rule as an example.5 The assay used for establishing the stopping rules was assigned the name “Reference.” The hypothetical results of the other assays were interpreted in relation to the “Reference.” The statistical analysis for evaluating the RoU was based on the precision of each test system and on its mean difference from the “Reference” assay results. The limits of the range of uncertainty (RL) were derived from a general confidence limit formula that assumes a normal distribution and approaches the clinical decision point from either lower or higher viral loads.14 A significance level of 0.05 (corresponding to a 95% confidence limit) was chosen for all calculations.
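The confidence limit formula itself is not reproduced in the text above. As a minimal sketch of how such limits might be computed, the following Python function assumes a one-sided normal confidence limit centered on the value that the assay is expected to report when the reference reads exactly the cutoff. The function name `rou_limits`, its parameters, and the example inputs (a mean difference of -10 IU/mL and a coefficient of variation of 20%) are hypothetical and are not taken from Table 1. Treating the standard deviation as constant across the interval is a further simplifying assumption of this sketch.

```python
from math import sqrt
from statistics import NormalDist


def rou_limits(cutoff, mean_diff, cv_percent, n_replicates=1, alpha=0.05):
    """Return (lower, upper) limits of a range of uncertainty in IU/mL.

    Assumptions of this sketch (not confirmed by the source):
    - mean_diff is the assay result minus the reference result near the cutoff
    - the standard deviation is constant and derived from the CV at the
      assay's expected reading at the cutoff
    - a one-sided normal quantile is applied on each side of that reading
    """
    z = NormalDist().inv_cdf(1 - alpha)       # about 1.645 for alpha = 0.05
    center = cutoff + mean_diff               # expected assay reading at the cutoff
    sd = center * cv_percent / 100            # absolute SD from the CV
    half_width = z * sd / sqrt(n_replicates)  # shrinks with more replicates
    return center - half_width, center + half_width


# Hypothetical example values, not the values of Table 1.
lower, upper = rou_limits(cutoff=100, mean_diff=-10, cv_percent=20)
print(f"LL = {lower:.1f} IU/mL, UL = {upper:.1f} IU/mL")
```

With these inputs, the interval is centered on 90 IU/mL rather than on the cutoff of 100 IU/mL, which illustrates how a quantification difference shifts the RoU while imprecision sets its width.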
Table 1. Test characteristics of hypothetical HCV RNA assays.
Results and Discussion
Wachtel et al 15 introduced a “range of uncertainty,” which reflects a diagnostic overlap zone between populations with and without a suspected disease. In order to reduce the risk of false positive and false negative results, the authors suggested two cutoff points beyond which the risk of false results was very low, and between which the result was not diagnostic or uncertain. The interval between these two cutoffs was called the “range of uncertainty.” The concept proposed by Wachtel et al 15 implied that sequential testing is needed only for patients whose results fall within the RoU. The authors demonstrated the usefulness of the concept in the workup of patients who may not require a bone marrow examination for the diagnosis of iron deficiency. They also raised the problem of defining an appropriate RoU. Cutoffs for defining the RoU were based on the distributions for iron-deficient and non-iron-deficient cases and on an arbitrarily chosen sensitivity and specificity of 90% each. The RoU concept identified three groups of patients: those with the highest probability of having the disease; those with the highest probability of not having the disease; and uncertain cases that require further workup.
The idea of the RoU can be transferred from a lack of discrimination between populations to other diagnostic scenarios that show uncertainty or variation around a cutoff point; to the best of our knowledge, this transfer is shown here for the first time. Here, we applied the RoU concept to the stopping rules of hypothetical HCV RNA assay performances, and we provided a formula to calculate a RoU based on assay-specific analytical performance information (imprecision and accuracy, relative to a reference).
The RoU comprises an interval determined by an upper limit (UL) and a lower limit (LL) in relation to the CO. Each limit marks the value beyond which a test result can be assigned to one side of the clinical decision point with a confidence level of 95%. Therefore, any test result that falls within the RoU does not provide sufficient confidence to guide a course of action (if 95% confidence is assumed to be appropriate). Following this concept, it has been recommended that results that fall into the RoU require further workup to evaluate the true response according to treatment rules.15 The principles of the RoU approach are illustrated in Figure 1. The width and position of the RoU are driven by test system imprecision and by quantification differences relative to the reference. While assay-specific imprecision is a performance parameter that is typically assessed during the product development of an in vitro diagnostic assay, quantification differences can only be obtained from direct comparison studies against the reference. Discrepancies in viral load results between different molecular diagnostic assays might be driven by the use of different international standards, different nucleic acid extraction methods, different target regions for PCR primers and probes, different cycling conditions for target amplification, and different calibration methods.

Figure 1. Schematic illustration of the range of uncertainty concept.
The RoU for different assays in relation to the stopping rules evaluated with the reference system is shown in Table 2 and Figure 2. The reference points (indicated by black diamonds) reflect assay-specific quantification differences when compared to the reference assay value. The bars in the chart show three categories for viral load test interpretation. Results that fall into the blue areas indicate a high probability that the viral load status is truly below the clinical threshold. Results observed in the red areas are highly indicative of a viral load status above the threshold. If results fall into the grey zone, which represents the assay-specific RoU, the true viral load status remains unclear because random error might have caused a result to cross the threshold in either direction. Although a single measurement might be affected by the random error (imprecision) of the test system, as long as the observed result falls into either the blue or the red area, the position of the result relative to the cutoff (hence, the viral load status) is not misleading. Consequently, and according to response-guided therapy schemes, a clinical decision to continue or to stop treatment would be highly appropriate for results that fall in the blue or red area, respectively.
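The three interpretation categories described above can be expressed as a simple decision rule. The following Python sketch is illustrative only: the function name `interpret_result`, the category labels, and the limits of 80 and 125 IU/mL are hypothetical and are not taken from Table 2 or Figure 2.

```python
def interpret_result(result, lower_limit, upper_limit):
    """Assign a single viral load result (IU/mL) to one of three zones.

    lower_limit and upper_limit are the assay-specific limits of the
    range of uncertainty (LL and UL), supplied by the caller.
    """
    if lower_limit > upper_limit:
        raise ValueError("lower_limit must not exceed upper_limit")
    if result < lower_limit:
        return "below threshold"  # blue area: viral load likely below the cutoff
    if result > upper_limit:
        return "above threshold"  # red area: viral load likely above the cutoff
    return "uncertain"            # grey zone: result lies within the RoU


# Hypothetical assay-specific limits in IU/mL (not the values of Table 2).
LOWER_LIMIT, UPPER_LIMIT = 80.0, 125.0
labels = [interpret_result(r, LOWER_LIMIT, UPPER_LIMIT) for r in (60.0, 110.0, 150.0)]
print(labels)
```

In this sketch, results that coincide exactly with a limit are treated as uncertain, which is a conservative choice rather than a rule stated in the source.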
Table 2. Range of uncertainty intervals for different assay characteristics, as described in Table 1.

Figure 2. Range of uncertainty for different hypothetical assays in relation to a reference.
According to Wachtel et al 15, the RoU is used to select patients who need to undergo subsequent investigation (Fig. 1). Retesting or additional sampling at another time point has been suggested to improve the assessment of treatment response.12 These could also be regarded as methods of choice for assessing unclear results in the grey zone or RoU. The RoU formula shown in this paper indicates that the UL and LL of the RoU are affected by the number of replicates. If the precision and the quantification differences relative to the reference are known, a laboratory could easily adjust the RoU to the requested level of confidence.
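The dependence on the number of replicates and on the confidence level can be illustrated with a short calculation. The following Python sketch assumes that replicate measurements are independent and averaged, so that the standard error falls with the square root of the number of replicates; this is a common statistical assumption and is not confirmed as the exact formula used here. The function name `rou_half_width` and the standard deviation of 20 IU/mL are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def rou_half_width(sd, n_replicates, confidence=0.95):
    """Half-width of the range of uncertainty for the mean of n replicates.

    Assumes independent, normally distributed replicate results and a
    one-sided normal quantile at the requested confidence level.
    """
    if n_replicates < 1:
        raise ValueError("n_replicates must be at least 1")
    z = NormalDist().inv_cdf(confidence)
    return z * sd / sqrt(n_replicates)


# Hypothetical assay standard deviation near the cutoff, in IU/mL.
SD = 20.0
widths = {n: rou_half_width(SD, n) for n in (1, 2, 4)}
for n, width in widths.items():
    print(f"{n} replicate(s): half-width = {width:.1f} IU/mL")
```

Under these assumptions, quadrupling the number of replicates halves the width of the RoU, whereas requesting a higher confidence level widens it.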
Accuracy and precision systematically affect diagnostic test results. If not factored in, both could lead to significantly misleading interpretations in response-guided therapy schemes (Fig. 2, “Test A–E” versus “Reference”): a result of 110 IU/mL might be interpreted as a viral load status above the clinical threshold. Observing this result with “Test B,” the RoU, however, points to a high probability that the viral load truly falls below the threshold. A “Test D” result of 110 IU/mL would accurately reflect a viral load status that falls above the established cutoff, whereas a “Test A” or “Test E” result of 110 IU/mL would fall into the RoU, and the patient or sample should undergo further investigation. These scenarios exemplify potentially controversial clinical decisions that might be made based on the results obtained from different test systems.
Evaluating the RoU with its limits below and above the clinical threshold would introduce two additional assay-specific decision points. Results below the assay-specific LL would represent a health status that falls below the reference threshold with high confidence, indicating a therapy responder in the current example. Results above the UL would, on the other hand, reflect a viral load that falls above the reference threshold, indicating a nonresponder status. The RoU concept would mitigate uncertainty issues with different test systems in comparison to a reference. With known precision data and known quantification differences that are compared to a reference assay used for defining clinical decision rules, the RoU would estimate a laboratory's specific probability of supporting appropriate decisions and mitigating the risk of inappropriate decisions. Thus, the RoU concept might be used as a practical guide for laboratories in the context of defined decision rules.
The clinical consequences ultimately depend on the number of affected patients. Evidently, narrowing the RoU (for example, by improving precision; compare “Test B” versus “Test C” or “Test D” versus “Test E” in Figure 2) would reduce the number of potentially affected patients, thus reducing the need for subsequent action or sequential testing and improving diagnostic and therapeutic yield.16 Using viral load information at the respective time points, a future model could simulate clinical and economic consequences, which may help to generate a hypothesis for a future study validating the RoU concept. The RoU may also be used to define clinical specifications for diagnostic tests and to facilitate the comparability of results across laboratories.
Conclusion
In areas where specific test systems have been used to develop treatment rules, differences in assay performance in terms of precision or quantification could result in an additional risk of indicating an inaccurate patient condition (for example, a viral load status). Since inappropriate clinical decisions based on single measurements could have dramatic consequences or cause harm for patients under HCV treatment, test results should be reported only with a high level of confidence. An assay-specific RoU could be calculated in each molecular laboratory using a simple formula and performance information for the tests actually used in that laboratory. The RoU described in this paper graphically exhibits assay-specific precision, the deviation from the reference assay used for evaluating the treatment rules, and confidence limits for guiding the appropriate or preferred course of clinical action. Given the established treatment guidelines for specific drugs, and until assay-specific decision points have been fully evaluated, the RoU might be an easy and efficient tool to support treatment decisions, while providing additional information regarding the certainty of a test result's interpretation.
