Abstract
Background:
Systems for self-monitoring of blood glucose (SMBG) have to provide accurate and reproducible blood glucose (BG) values in order to ensure adequate therapeutic decisions by people with diabetes.
Materials and Methods:
Twelve SMBG systems were compared in a standardized manner under controlled laboratory conditions: nine systems were available on the German market and were purchased from a local pharmacy, and three systems were obtained from the manufacturer (two systems were available on the U.S. market, and one system was not yet introduced to the German market). System accuracy was evaluated following DIN EN ISO (International Organization for Standardization) 15197:2003. In addition, measurement reproducibility was assessed following a modified TNO (Netherlands Organization for Applied Scientific Research) procedure. Comparison measurements were performed with either the glucose oxidase method (YSI 2300 STAT Plus™ glucose analyzer; YSI Life Sciences, Yellow Springs, OH) or the hexokinase method (cobas® c111; Roche Diagnostics GmbH, Mannheim, Germany) according to the manufacturer's measurement procedure.
Results:
The 12 evaluated systems showed between 71.5% and 100% of the measurement results within the required system accuracy limits. With the evaluated test strip lot, 10 systems fulfilled the minimum accuracy requirements specified by DIN EN ISO 15197:2003. In addition, accuracy limits of the recently published revision ISO 15197:2013 were applied; between 54.5% and 100% of the systems' measurement results fell within these limits. Regarding measurement reproducibility, each of the 12 tested systems met the applied performance criteria.
Conclusions:
In summary, with the evaluated test strip lot, 83% of the systems fulfilled the minimum system accuracy requirements of DIN EN ISO 15197:2003. Each of the tested systems showed acceptable measurement reproducibility. In order to ensure sufficient measurement quality of each distributed test strip lot, regular evaluations are required.
Background
The international norm DIN EN ISO (International Organization for Standardization) 15197:2003 4 is one of the most widely accepted standards to assess the accuracy of SMBG systems. In DIN EN ISO 15197:2003, system accuracy is defined as closeness of agreement between a measurement result and the accepted reference value determined by the manufacturer's measurement procedure. According to this norm, at least 95% of the system measurement results shall fall within ±15 mg/dL of the results of the manufacturer's measurement procedure at BG concentrations <75 mg/dL and within ±20% at BG concentrations ≥75 mg/dL. In the recently published revision ISO 15197:2013, 5 the criteria for system accuracy are more stringent, with at least 95% of the system measurement results within ±15 mg/dL of the results of the manufacturer's measurement procedure at BG concentrations <100 mg/dL and within ±15% at BG concentrations ≥100 mg/dL. In this revision, mandatory compliance is recommended after a 36-month transition period.
In order to obtain the CE (Conformité Européenne) mark for SMBG meters, a minimum requirement for being marketed in Europe, manufacturers have to provide evidence of conformity with DIN EN ISO 15197:2003. However, studies have repeatedly shown that individual test strip lots of available systems do not comply with accuracy requirements stated in DIN EN ISO 15197:2003, notwithstanding their certification with the CE mark. 6–11 Currently, application of the CE mark on a BG meter is a one-time procedure; regular and independent quality controls are not mandatory after the market approval.
In this study, the measurement quality of 12 SMBG systems was compared in a standardized manner under controlled laboratory conditions. The main objective was the assessment of system accuracy following the requirements stated by DIN EN ISO 15197:2003. In addition, reproducibility of each system's measurement results was assessed following a modified TNO (Netherlands Organization for Applied Scientific Research) procedure. 12
Materials and Methods
The study was performed between July and October 2012 at the Institut für Diabetes-Technologie Forschungs- und Entwicklungsgesellschaft mbH an der Universität Ulm, Ulm, Germany in compliance with the German Medical Devices Act. The study protocol was approved by the Ulm University Ethics Committee, and the applicable authority was notified. Informed consent forms were signed by all participants prior to study procedures.
Study population
Male and female subjects (≥18 years old) with diabetes type 1 or type 2 and subjects without diabetes were included. Exclusion criteria were as follows: pregnancy or lactation period, severe acute disease, and/or severe chronic disease. Each subject's medical history and medication were reviewed by a physician to determine any usage of interfering substances (e.g., acetaminophen, salicylates, ascorbic acid, dopamine) given in the manufacturers' labeling.
BG monitoring systems for self-testing
In this study, 12 systems were evaluated (Table 1): Accu-Chek® Aviva by Roche Diagnostics GmbH, Mannheim, Germany; BGStar™ by AgaMatrix Inc., Salem, NH; Contour® XT by Bayer Consumer Care AG, Basel, Switzerland; GE100, GE200, mylife™ Pura®, and mylife Unio™ by Bionime Corp., Taichung City, Taiwan; GL40 from Beurer GmbH, Ulm; Omnitest® 3 by B. Braun Melsungen AG, Melsungen, Germany; OneTouch® Verio™ Pro by LifeScan Europe, Division of Cilag GmbH, Zug, Switzerland; and Systems x and y (owing to legal reasons, the authors/study site decided not to show brand names of two systems that missed accuracy requirements specified by DIN EN ISO 15197:2003 [see Results]). The systems displayed plasma BG values in mg/dL. mylife Unio was not yet introduced to the German market. GE100 and GE200 were available on the U.S. market. GE100, GE200, and mylife Unio were obtained from the manufacturer (Bionime Corp.); the other nine systems were available on the German market and were purchased from a local pharmacy. These nine systems were selected in order to assess the performance of systems of different manufacturers and price ranges. The systems were stored, used, and maintained in compliance with the manufacturers' labeling. In order to ensure the proper function of each tested system, control measurements according to the manufacturer's labeling were performed daily prior to the test procedure.
The systems are listed alphabetically. Brand names of systems that missed accuracy requirements specified by DIN EN ISO 15197:2003 are not shown (Systems x and y). Comparison measurement methods (glucose oxidase [GOD] or hexokinase [HK]), test strip enzyme (glucose dehydrogenase [GDH] or GOD), calibration, measurement range, and measurement conditions are given according to the manufacturers' labeling. Test strip lot 1 was used for system accuracy evaluation; test strip lot 1 and test strip lot 2 were used for evaluation of measurement reproducibility.
Distributor is given if different from manufacturer (according to the manufacturer's labeling).
Obtained from the manufacturer, not purchased commercially (mylife Unio is not yet introduced into the German market; GE100 and GE200 are available on the U.S. market).
After the performance of the study, a hematocrit value of 10–70% was specified by the manufacturer.
Comparison measurement
Comparison measurements were performed with the following two different methods, according to the manufacturer's labeling. For systems using a glucose oxidase method to measure reference values (Table 1), the YSI 2300 STAT Plus™ glucose analyzer (YSI Life Sciences, Yellow Springs, OH) was used. The glucose oxidase method was also used for GE100 and mylife Pura according to the manufacturer's information. For systems using a hexokinase method to measure reference values (Table 1), the cobas® c111 (Roche Diagnostics GmbH) was used. The hexokinase method was also used for GE200 according to the manufacturer's information. Trueness of the glucose oxidase method (YSI 2300 STAT Plus) was assessed by assaying bioanalytical glucose standards (YSI). Precision was assessed by assaying quality controls (liquid assayed Multiqual®; Bio-Rad Laboratories GmbH, Munich, Germany). Trueness and precision of the hexokinase method (cobas c111) were verified by assaying control material provided by the manufacturer (Precipath® U/Precinorm® U; Roche Diagnostics GmbH). Additionally, regular internal and external quality control measurements were performed, as required by the German national standard. 13
Comparison measurements were performed from capillary plasma. Both methods used for comparison measurements provided BG values in mg/dL.
Evaluation procedures
Test procedures were performed by clinical personnel well trained in the limitations of the tested systems, the manufacturer's labeling, the safety practices, and the test protocol. Test procedures were carried out in a laboratory setting with controlled room temperature (23±5°C) and humidity. Each participant washed his or her hands with soap and water and dried them before the finger puncture and the measurement procedure were performed. The hematocrit value of each blood sample was checked to be between 30% and 55% (based on the smallest range indicated in the manufacturers' labeling). BG results of comparison measurements had to be within the range of 20–600 mg/dL (based on the smallest range indicated in the manufacturers' labeling).
System accuracy evaluation
In this study, system accuracy evaluation was performed following the procedures prescribed in detail in DIN EN ISO 15197:2003. 4 Deviations from this standard are described in the following section.
Each system was tested on at least 100 capillary blood samples from different subjects over at least 10 days. Samples were measured with one test strip lot and two meters for each system. Test strips were taken from at least 10 different vials, using test strips of the same vial for the two measurements of an individual sample. The vials were changed after approximately 10 subjects. In addition, a sample for determination of hematocrit in duplicate was collected using heparinized capillaries. Capillaries were centrifuged, and the hematocrit was read on an alignment chart.
DIN EN ISO 15197:2003 prescribes the distribution of the blood samples into different BG concentration categories. Concentration categories with the following distribution were used: 5% had to be ≤50 mg/dL; 15% between >50 and 80 mg/dL; 20% between >80 and 120 mg/dL; 30% between >120 and 200 mg/dL; 15% between >200 and 300 mg/dL; 10% between >300 and 400 mg/dL; and 5% had to be >400 mg/dL. Samples were assigned to the respective category according to the mean BG result of the respective comparison measurement (manufacturer's measurement procedure).
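The category assignment described above can be illustrated with a minimal Python sketch. The function names and the treatment of each upper bound as inclusive are assumptions made for illustration; they are not taken from the standard or from the study's data management.

```python
# Upper bound of each BG concentration category (mg/dL) and the share of
# samples (in percent) required in it, as listed in the text.
# Assumption: each upper bound is inclusive (e.g., 80 mg/dL belongs to >50-80).
CATEGORIES = [
    (50, 5),             # <=50 mg/dL
    (80, 15),            # >50 to 80 mg/dL
    (120, 20),           # >80 to 120 mg/dL
    (200, 30),           # >120 to 200 mg/dL
    (300, 15),           # >200 to 300 mg/dL
    (400, 10),           # >300 to 400 mg/dL
    (float("inf"), 5),   # >400 mg/dL
]


def category_index(mean_comparison_bg):
    """Return the 0-based category for a mean comparison result in mg/dL."""
    for index, (upper_bound, _) in enumerate(CATEGORIES):
        if mean_comparison_bg <= upper_bound:
            return index
    raise ValueError("not a valid BG concentration")


def required_counts(n_samples=100):
    """Number of samples required per category for n_samples in total."""
    return [percent * n_samples // 100 for _, percent in CATEGORIES]


print(required_counts(100))   # counts per category for 100 samples
print(category_index(135.0))  # index of the >120 to 200 mg/dL category
```

With 100 samples per system, the percentages translate directly into sample counts per category.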
Samples were collected from fingertips by skin puncture. Before measurement with each system and before each sample collection for comparison measurements, residual blood was wiped off from the finger. For BG concentrations >50 mg/dL and ≤400 mg/dL only native blood samples were used. For native samples, BG measurements were performed with up to three systems (two meters per system, respectively) directly from fingertips. Before and after the measurements with the systems, samples (200 μL) for comparison measurements were collected in lithium heparin tubes.
The performance of a controlled and safe human study that ensures the availability of sufficient native blood samples in the lowest and highest concentration categories is very difficult. In this study, samples could be adjusted to evaluate the systems' accuracy at BG concentrations of ≤50 mg/dL and >400 mg/dL. Fresh capillary blood samples designated for adjustment were collected in lithium heparin tubes. Adjustment was performed either by incubation to allow for glycolysis or by supplementation with glucose (stock solution; 40% glucose in 0.9% NaCl). Adjusted blood samples were gently mixed, and BG measurements were performed with up to six systems. Immediately after the sampling and measurement procedure, the partial pressure of oxygen (pO2) in adjusted blood samples was checked (Opti™ Check; OPTI Medical Systems Inc., Roswell, GA) in order to ensure a pO2 that is comparable to the pO2 in capillary blood. 14 Before and after the measurements with the systems, samples for comparison measurements were collected.
Comparison measurements of native and adjusted blood samples were performed after the measurements with the systems. For this purpose, samples for comparison measurements were centrifuged, plasma was separated, and measurements were performed immediately after centrifugation.
In order to verify sample stability, the drift between the first and the second comparison measurement had to be ≤4 mg/dL at BG concentrations ≤100 mg/dL and ≤4% at BG concentrations >100 mg/dL.
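The stability criterion above can be written as a short check. The following Python sketch is illustrative only: the text does not state which value serves as the reference for the 100 mg/dL threshold and for the percentage, so the first comparison measurement is used here as an assumption, and the function name is hypothetical.

```python
def drift_acceptable(first_mg_dl, second_mg_dl):
    """Check the drift between the two comparison measurements of a sample.

    Assumption: the first comparison measurement decides which limit applies
    and is the base for the relative drift.
    """
    drift = abs(second_mg_dl - first_mg_dl)
    if first_mg_dl <= 100:
        # Absolute limit: drift of at most 4 mg/dL
        return drift <= 4
    # Relative limit: drift of at most 4% (written without division)
    return 100 * drift <= 4 * first_mg_dl


print(drift_acceptable(90, 94))    # 4 mg/dL drift at a low concentration
print(drift_acceptable(200, 209))  # 4.5% drift at a high concentration
```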
Evaluation of reproducibility
The precision of measurement is defined as the closeness of the agreement between independent measurement results obtained under the stipulated conditions. 15
In this study, evaluation of measurement reproducibility was performed following the TNO guideline 12 with slight modifications as follows. Ten venous blood samples were collected in lithium heparin tubes each from one of 10 different subjects. These samples were distributed to five glucose concentration categories: 50–80 mg/dL, 80–120 mg/dL, 120–200 mg/dL, 200–300 mg/dL, and 300–400 mg/dL, with two samples within each category. Hematocrit values were checked to be within the ranges given by the manufacturers. Following the TNO guideline, a minimum of three samples with a BG value <6.5 mmol/L (<117 mg/dL) was required. Measurements were performed with two lots of test strips and two meters per system. For each concentration category sample 1 was measured with meter 1 and test strip lot 1; sample 2 was measured with meter 2 and test strip lot 2. For each sample and each system, 10 measurements were performed within 30 min. In order to assign the samples to the respective BG concentration category, BG comparison measurements were performed before and after the measurement with the system (as described above for system accuracy evaluation).
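The unit conversion behind the TNO threshold quoted above (6.5 mmol/L, approximately 117 mg/dL) follows from the molar mass of glucose (about 180.16 g/mol). A minimal Python sketch, with a hypothetical function name:

```python
# Molar mass of glucose (C6H12O6) in g/mol.
GLUCOSE_MOLAR_MASS_G_PER_MOL = 180.156


def mmol_per_l_to_mg_per_dl(value_mmol_per_l):
    """Convert a glucose concentration from mmol/L to mg/dL."""
    # mmol/L * g/mol = mg/L; dividing by 10 gives mg/dL.
    return value_mmol_per_l * GLUCOSE_MOLAR_MASS_G_PER_MOL / 10


print(round(mmol_per_l_to_mg_per_dl(6.5)))  # TNO threshold in mg/dL
```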
Data analysis
Data management and evaluation were performed at the study site.
System accuracy
Data were excluded from analysis if
• the drift between the first and the second comparison measurement exceeded the specified acceptance criteria (as mentioned above).
• the required number of samples in a BG concentration category was already reached.
• the pO2 of adjusted samples was >100 mm Hg.
In this study, 200 results of 100 subjects were included for each system according to DIN EN ISO 15197:2003.
System accuracy was assessed by comparison of the system's measurement results with the respective mean result of the comparison measurement (obtained immediately before and after the measurements with the system). At BG concentrations <75 mg/dL, the relative number of system results within ±15 mg/dL, ±10 mg/dL, and ±5 mg/dL of the comparison measurement was calculated. At BG concentrations ≥75 mg/dL, the relative number of system results within ±20%, ±15%, ±10%, and ±5% of the comparison measurement was calculated. Acceptability of a system was determined by adding the number of system results within ±15 mg/dL at BG concentrations <75 mg/dL to the number of system results within ±20% at BG concentrations ≥75 mg/dL. The agreement between each system and the mean comparison result was plotted in a difference-plot as recommended in DIN EN ISO 15197:2003. In the difference-plot, the deviation of a single BG system measurement from the respective mean comparison measurement result is shown (Fig. 1).

Difference plots of the 12 systems evaluated. Solid lines indicate criteria for system accuracy following DIN EN ISO 15197:2003. Dashed lines indicate criteria for system accuracy following the recently published revision ISO 15197:2013. Brand names of systems that missed accuracy requirements specified by DIN EN ISO 15197:2003 are not shown (Systems x and y).
In addition, accuracy limits of the recently published revision ISO 15197:2013 5 were applied, and the relative numbers of system results within ±15 mg/dL, ±10 mg/dL, and ±5 mg/dL of the comparison measurement at BG concentrations of <100 mg/dL and within ±15%, ±10%, and ±5% of the comparison measurement at BG concentrations of ≥100 mg/dL were calculated.
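The accuracy calculation described in this section can be sketched in Python as follows. This is an illustrative sketch, not the analysis software used in the study: the class and function names are hypothetical, the example values are invented for demonstration, and only the 95% limits quoted in the text are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccuracyCriteria:
    threshold_mg_dl: float    # below: absolute limit; at or above: relative limit
    abs_limit_mg_dl: float    # allowed deviation below the threshold
    rel_limit_percent: float  # allowed deviation at or above the threshold


# Limits as quoted in the text for the two versions of the standard.
ISO_2003 = AccuracyCriteria(threshold_mg_dl=75, abs_limit_mg_dl=15, rel_limit_percent=20)
ISO_2013 = AccuracyCriteria(threshold_mg_dl=100, abs_limit_mg_dl=15, rel_limit_percent=15)


def within_limits(system_result, mean_comparison, criteria):
    """True if one system result lies within the limits for its concentration."""
    deviation = abs(system_result - mean_comparison)
    if mean_comparison < criteria.threshold_mg_dl:
        return deviation <= criteria.abs_limit_mg_dl
    return 100 * deviation <= criteria.rel_limit_percent * mean_comparison


def percent_within(pairs, criteria):
    """Percentage of (system result, mean comparison result) pairs within limits."""
    hits = sum(within_limits(s, c, criteria) for s, c in pairs)
    return 100 * hits / len(pairs)


def meets_requirement(pairs, criteria, required_percent=95):
    """True if at least required_percent of the results are within limits."""
    return percent_within(pairs, criteria) >= required_percent


# Invented example values (mg/dL): (system result, mean comparison result)
example = [(96, 80), (100, 100), (120, 100), (60, 74)]
print(percent_within(example, ISO_2003))
print(percent_within(example, ISO_2013))
```

The pair (96, 80) shows the effect of the shifted threshold: a deviation of 16 mg/dL at 80 mg/dL equals 20% and is therefore within the 2003 limits, whereas under the 2013 limits the ±15 mg/dL criterion applies and the result falls outside.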
In order to assess potential clinically significant deviations, consensus error grid analysis was performed for each system. 16 For consensus error grid analysis, the agreement between individual measurement results and the respective mean comparison result was determined. Then, the number of a system's results within five zones of different clinical significance was counted: Zone A, no effect on clinical action; Zone B, altered clinical action but little or no effect on clinical outcome; Zone C, altered clinical action likely to affect clinical outcome; Zone D, altered clinical action that could have significant medical risk; and Zone E, altered clinical action that could have dangerous consequences.
Reproducibility according to TNO quality guideline
For evaluation of reproducibility, results of 10 subjects (one sample per subject, 10 measurements per sample) were included for each system. For each sample, the coefficient of variation (CV) of 10 independent measurements and the SD were calculated. Acceptance criteria were defined as follows: the permissible CV had to be ≤5% for BG values ≥100 mg/dL, and the SD had to be ≤10 mg/dL for BG values <100 mg/dL.
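The acceptance criteria above can be expressed as a short Python sketch. Two points are assumptions, because the text does not specify them: the sample SD (n-1 denominator) is used, and the comparison measurement result decides which criterion applies. The function names and example values are hypothetical.

```python
from statistics import mean, stdev


def sd_and_cv(measurements):
    """Return the sample SD (mg/dL) and the CV (%) of repeated measurements."""
    sd = stdev(measurements)  # assumption: sample SD with n-1 denominator
    cv = 100 * sd / mean(measurements)
    return sd, cv


def reproducibility_acceptable(measurements, comparison_bg):
    """Apply the SD criterion below 100 mg/dL and the CV criterion otherwise."""
    sd, cv = sd_and_cv(measurements)
    if comparison_bg < 100:
        return sd <= 10
    return cv <= 5


# Invented example: 10 repeated measurements of one sample (mg/dL)
sample = [150, 152, 148, 151, 149, 150, 153, 147, 150, 150]
sd, cv = sd_and_cv(sample)
print(round(sd, 2), round(cv, 2))
print(reproducibility_acceptable(sample, comparison_bg=150))
```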
Results
In this study, 10 of the 12 evaluated systems fulfilled the accuracy requirements specified by DIN EN ISO 15197:2003 with at least 95% of the system measurement results within ±15 mg/dL of the results of the manufacturer's measurement procedure at BG concentrations <75 mg/dL and within ±20% at BG concentrations ≥75 mg/dL (Table 2 and Fig. 1). The 12 evaluated systems showed between 71.5% and 100% of the measurement results within the required accuracy limits.
Brand names of systems that missed accuracy requirements specified by DIN EN ISO 15197:2003 are not shown (Systems x and y).
BG, blood glucose.
According to ISO 15197:2013 with the BG concentration threshold of 100 mg/dL (instead of 75 mg/dL as described in the ISO 15197:2003 standard), accuracy limits for BG concentrations between 75 mg/dL and 100 mg/dL are more stringent. In this study, eight of the 12 systems showed at least 95% of the system measurement results within ±15 mg/dL of the results of the manufacturer's measurement procedure at BG concentrations of <100 mg/dL and within ±15% at BG concentrations of ≥100 mg/dL (Table 2 and Fig. 1). The systems showed between 54.5% and 100% of the measurement results within the required accuracy limits.
Consensus error grid analysis showed that all tested systems had 100% of the results within Zone A (no effect on clinical action) and Zone B (altered clinical action but little or no effect on clinical outcome). For six systems (Contour XT, BGStar, Omnitest 3, GE100, GE200, and mylife Unio) 100% of the results fell within Zone A.
Each of the 12 tested systems met the modified TNO performance criteria for reproducibility with ≤5% CV at BG values of ≥100 mg/dL or ≤10 mg/dL SD for BG values of <100 mg/dL (Table 3) for both test strip lots. CV for BG concentrations of ≥100 mg/dL ranged from 0.77% to 4.84%; SD for BG concentrations of <100 mg/dL ranged from 0.88 mg/dL to 4.33 mg/dL. Half of the tested systems' reproducibility results showed a CV <3% and an SD <3 mg/dL. For two systems, measurement reproducibility with a CV <2% and an SD <2 mg/dL was found (Table 3).
The SD is given for comparison of measurement results of <100 mg/dL; the coefficient of variation (CV) is given for comparison of measurement results of ≥100 mg/dL. Brand names of systems that missed accuracy requirements specified by DIN EN ISO 15197:2003 are not shown.
Discussion
In this study, system accuracy and measurement reproducibility of 12 systems for SMBG were evaluated in a standardized manner over the clinically relevant BG concentration range. The study was performed under controlled laboratory conditions in which influencing factors with potential impact on BG measurements were reduced to a minimum.
The main objective was the assessment of system accuracy following DIN EN ISO 15197:2003. 4 In this study, 10 of the 12 systems fulfilled the minimum system accuracy requirements of this norm with the evaluated test strip lot. The 12 evaluated systems showed between 71.5% and 100% of the measurement results within the required accuracy limits; comparable results for some of the investigated systems have been reported in other studies. 6,8,17 Considering the recently published revision ISO 15197:2013, 5 the systems' measurement results within the required limits varied between 54.5% and 100%. Because many currently available SMBG systems presumably were developed and marketed in compliance with ISO 15197:2003, a 36-month transition period is recommended in ISO 15197:2013 until mandatory compliance with this standard is required.
Assessment of system accuracy according to criteria stated in DIN EN ISO 15197:2003 should ensure an evaluation procedure that follows as closely as possible the recommendations of this norm. However, the evaluation of system accuracy is complex, and the designs of most of the published studies show remarkable variations from the recommendations of DIN EN ISO 15197:2003. 18 Evaluation procedures in our study were performed under standardized conditions; however, several factors may contribute to variations of the systems' accuracy results. A possible bias of the comparison method and reported differences between the widely used reference methods (glucose oxidase, hexokinase) contribute to inaccuracies of a system's measurement results. 19,20
In this study, system accuracy of only one lot of test strips of each of the 12 systems was evaluated. As system accuracy results can remarkably vary between different test strip lots, 7,11,21 the results of the test strip lot presented here do not allow for an overall conclusion for a given system. In order to consider test strip variations more closely, the recently published revision ISO 15197:2013 prescribes the performance of system accuracy evaluation studies with at least three different lots of test strips. Nevertheless, each lot of test strips that is distributed and available to people with diabetes has to provide sufficient measurement quality.
Adequate storage conditions of test strips are also an important aspect that potentially affects measurement quality 22 and should be ensured over the entire transport chain from the manufacturer to the pharmacies and from the pharmacies to the patient. In this context it should be mentioned that in this study, nine systems were purchased commercially, whereas three systems were obtained from the manufacturer.
When interpreting accuracy results of SMBG systems it should be taken into account that several factors potentially affect accuracy results, and it should also be recognized that not only a bias that is as small as possible but also high precision is important. Both the precision and the trueness of the measurement results contribute to a system's accuracy. Systems with a high trueness can be imprecise, and vice versa. According to DIN EN ISO 15197, system accuracy (trueness and precision) is sufficient if 95% of single measurement results are within the funnel of the difference plot (Fig. 1). Precision is shown as the closeness of agreement between independent test results under stipulated conditions. 15 In this study, each of the tested systems showed acceptable measurement reproducibility (precision under reproducible conditions). Other studies in which measurement reproducibility was investigated showed comparable results. 10,23 Trueness of a system is shown as the closeness of agreement between the average value obtained from a large series of test results and an accepted reference value. 15 In daily routine, both low precision and low trueness bear the risk of inadequate therapeutic decisions.
Conclusions
In this study, 83% of the systems evaluated fulfilled minimum system accuracy requirements of DIN EN ISO 15197:2003. Considering the more stringent criteria of the recently published revision ISO 15197:2013, 67% of the evaluated test strip lots showed at least 95% of the values within accuracy limits. In order to ensure reliable SMBG results, sufficient measurement quality should be provided for each BG meter and each test strip lot distributed on the market. We recommend the establishment of regular independent evaluations of SMBG systems in a standardized manner.
Footnotes
Acknowledgments
This study was funded by a grant from Ypsomed AG, Burgdorf, Switzerland.
Author Disclosure Statement
All authors are employees of the Institut für Diabetes-Technologie Forschungs- und Entwicklungsgesellschaft mbH an der Universität Ulm (IDT), Ulm, Germany. G.F. is general manager of the IDT, which carries out studies evaluating BG meters and medical devices for diabetes therapy on behalf of various companies. G.F./IDT have received speakers' honoraria or consulting fees from Abbott, Bayer, Menarini Diagnostics, Roche Diagnostics, Sanofi, and Ypsomed.
