Abstract
Continuous subcutaneous insulin infusion pumps and continuous glucose monitors enable individuals with type 1 diabetes to achieve tighter blood glucose control and are critical components in a closed-loop artificial pancreas. Insulin infusion sets can fail and continuous glucose monitor sensor signals can suffer from a variety of anomalies, including signal dropout and pressure-induced sensor attenuations. In addition to hardware-based failures, software and human-induced errors can cause safety-related problems. Techniques for fault detection, safety analysis, and remote monitoring that have been applied in other industries and applications, such as chemical process plants and commercial aircraft, are discussed and placed in the context of a closed-loop artificial pancreas.
Prelude
This article is an outgrowth of a presentation titled “Algorithms to Detect Glucose Sensor and Infusion Pump Anomalies” at the NIDDK-sponsored workshop on Innovations Towards an Artificial Pancreas, held in Bethesda, Maryland, on April 9-10, 2013. The goals of that presentation were to review fault detection algorithms used in other industries, such as air travel and chemical processing and manufacturing, and suggest how lessons learned in these other application areas might be relevant to closed-loop artificial pancreas development. The primary goals of this article are largely the same, with additional citations to ongoing efforts in the broader field of safety science as well as the artificial pancreas.
Background and Motivation
Advanced automated systems that include measurement devices (sensors) and actuators (eg, valves, pumps, motors) have helped lead to more efficient and safe manufacturing processes and commercial transportation vehicles and systems. An important consideration in any of these processes or systems is the response or actions that need to be taken when 1 or more components fail; this, of course, requires that the “faults” be detected. For a closed-loop artificial pancreas there is an initial body of research that has been conducted to handle sensor and pump-related anomalies. An objective of this article is to provide an overview of fault detection algorithms that have been applied in other critical industries, and place these in the context of faults that can arise in a closed-loop artificial pancreas.
The structure of the article follows. First, we provide an overview of possible artificial pancreas (AP) faults, followed by a review of chemical process and commercial aircraft fault detection and safety. We then review developments in fault detection and safety science in general followed by specific applications of AP fault detection algorithms. Finally, we discuss the possible connections between safety and remote monitoring techniques used in the chemical process and aviation industries, and those that can possibly be used in the context of the closed-loop AP.
Artificial Pancreas: Overview of Potential Faults/Failures
An AP is composed of at least 3 major components: (1) a continuous glucose monitor (CGM; sensor), (2) a continuous insulin infusion pump (actuator), and (3) a controller with an embedded algorithm that uses the glucose sensor reading to output a signal to the insulin infusion pump. Each of these components can be faulty. The most obvious is the infusion pump, which can fail to deliver the commanded amount of insulin due to blockages or leakage back to the infusion site; see Figure 1 for an example of an infusion set failure. 1 The continuous glucose sensor can fail to provide satisfactory information due to miscalibration, fouling (slow sensor signal attenuation), dislodging of the sensor from underneath the skin, or pressure-induced sensor attenuations (resulting in rapid, but intermittent, signal degradation; see Figure 2 for an example 1 ). Also, there can be communication losses between the sensor and controller or the controller and pump. Furthermore, the control device can lose power or the operating system can “crash.” In addition, errors can be human-induced: for example, the individual may provide an incorrect estimate of the amount of carbohydrate in a meal (if using meal announcement), not announce the meal at all, or announce the meal and not consume it. The general systems engineering methodology for detecting these events is known as fault detection.

Figure 1. Glucose concentration and insulin infusion during the life of an insulin infusion set. “Set Failure” indicates the likely start of the failure, since the boluses that follow fail to reduce the glucose level. Figure reproduced from Baysal et al 1 with permission of the American Automatic Control Council.

Figure 2. Illustration of a pressure-induced sensor attenuation. Several consecutive attenuations are characterized by rapid, nonphysiologic decreases with first-order behavior. Figure reproduced from Baysal et al 1 with permission of the American Automatic Control Council.
Chemical Process Systems
Complex chemical manufacturing processes are composed of a large number of pieces of equipment for the reactions, heat exchange, and separations necessary to produce products; this includes pumps and compressors to move fluids between pieces of equipment, and many sensors and actuators to operate these processes. An example process and instrumentation diagram for a typical process unit (1 of many in a petroleum refinery) is shown in Figure 3. 2 The numerous measurements present opportunities and challenges related to fault detection and safety. On one hand, some of the sensor readings are redundant, being related through known material and energy balance relationships; this analytical redundancy allows the use of data reconciliation techniques to reconcile measurement errors, as well as fault detection algorithms to detect the failure of sensors and/or actuators. On the other hand, the sheer number of sensors increases the likelihood of 1 or more failing or being miscalibrated. Usually, there is a control room where operators monitor and control several similar units simultaneously; a typical control room is shown in Figure 4. 3 When process variables are outside an expected range, alarms are activated (using sounds and flashing icons) to warn operators to take action. The advantages and challenges of fault detection and safety in a chemical process plant are summarized in Table 1.
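The idea of analytical redundancy can be illustrated with a minimal sketch of data reconciliation for a single material balance. The splitter, the flow values, and the sensor variances below are hypothetical and chosen only for illustration; the function names are not taken from any cited work. The measured flows are adjusted by weighted least squares so that the balance closes, and a normalized imbalance gives a rough indication of whether a gross sensor error may be present.

```python
import math


def reconcile(measured, variances, balance):
    """Weighted least-squares reconciliation under one linear balance.

    Finds the flows closest to `measured` (each weighted by 1/variance)
    that satisfy sum(balance[i] * flow[i]) == 0, and returns them together
    with the imbalance normalized by its standard deviation.
    """
    imbalance = sum(a * m for a, m in zip(balance, measured))
    scale = sum(a * a * v for a, v in zip(balance, variances))
    reconciled = [m - v * a * imbalance / scale
                  for m, v, a in zip(measured, variances, balance)]
    z_score = imbalance / math.sqrt(scale)
    return reconciled, z_score


# Hypothetical splitter: feed = product_1 + product_2 (arbitrary flow units).
measured = [100.0, 61.0, 42.0]   # raw sensor readings (assumed values)
variances = [4.0, 1.0, 1.0]      # assumed sensor error variances
balance = [1.0, -1.0, -1.0]      # feed - product_1 - product_2 = 0

reconciled, z_score = reconcile(measured, variances, balance)
print(reconciled)  # the least trusted sensor (feed) receives the largest correction
print(z_score)     # a large magnitude would suggest a faulty sensor
```

A real plant has many coupled balances and would use a matrix formulation; the single-constraint case is shown only because it has a closed-form solution that is easy to check by hand.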

Figure 3. Toluene hydrodealkylation (HDA) process and instrumentation diagram. 2 The diagram illustrates the many unit operations (reactors, heat exchangers, separation columns), and the measurements and manipulated inputs (control valves) available in a typical chemical process.

Figure 4. Typical control room for a chemical manufacturing process. 3
Table 1. Chemical Process Applications of Fault Detection and Safety.
Overall, the chemical process industry has a good safety record. Major disasters in the chemical process industries are often due to a sequence of events rather than any single factor. The well-known release of methyl isocyanate (MIC) at the Union Carbide plant in Bhopal, India in 1984 occurred largely because of a number of bad management and operating decisions, in addition to actions of a disgruntled employee; details are in the appendix. The explosion that occurred at the BP Texas City refinery in 2005 was due to miscommunication, failure to meet blowdown drum safety standards, miscalibrated and nonfunctioning sensors and alarms, a history of violating proper startup operating procedures, and poor siting of temporary office trailers; details of this accident are also in the appendix.
The assurance of chemical process safety is based on a number of factors. First of all, there is redundancy in components that can fail; primary examples include bypass lines around control valves, and additional pumps that are started up when 1 fails. Also, the proper state of a failure is chosen—for example, a fuel gas valve to a furnace is designed to fail-closed, while a cooling water valve to a reactor to remove heat is designed to fail-open. There are also pressure relief valves that release gases to a flare header/tower before a vessel becomes overpressurized. The automation and control system usually has lower level control loops that remain active even if there are problems with higher-level optimization and control strategies.
Finally, it is important to discuss industry regulation. Process equipment must meet certain codes—ASME standards for vessel design, for example. The Environmental Protection Agency (EPA) regulates emissions to the atmosphere. In the event of a catastrophic failure the Chemical Safety Board (CSB) immediately investigates and prepares a detailed report; this often leads to significant changes in chemical process design and operation.
Aircraft Systems
Commercial aircraft have advanced control systems, with a significant number of sensors and actuators (see, eg, Figure 5). 4

Figure 5. Boeing 787 aircraft schematic to illustrate the large number of components, including multiple engines, and the control (cockpit) layout. Control surfaces include aileron (25), outboard flap (28), flaperon (29), inboard spoilers (30), rudder (50). 4
Since many units of the same aircraft model are produced, a significant effort in detailed mathematical modeling and control system design is easily justified, because the cost is spread across numerous identical aircraft. Expert pilots receive extensive training on simulators, and are tested on a large number of fault scenarios. The strengths and weaknesses of aircraft are summarized in Table 2.
Table 2. Aircraft and Air Transportation Fault Detection and Safety.
The commercial air traffic industry has an outstanding safety record, with fewer than 1 fatal accident per 4.7 million flights for 78 major world airlines. 5 There are certainly numerous reasons for this outstanding safety record. One is that this is a heavily regulated industry where safety is the highest priority. Air traffic control is a hierarchical system, with the regulating agencies at the top, and an air traffic control system as the next layer.
Aircraft are built with physical and analytical redundancy. For example, there are generally 3 to 6 flight control computers. Also, important control surfaces have multiple actuators, in case one actuator fails. Furthermore, there are redundant sensors, such as altitude. 6
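One common way to exploit redundant sensors is mid-value selection, sketched below under simple assumptions. The readings, the tolerance, and the function name `vote` are hypothetical, and the sketch is not a description of how any particular aircraft handles redundant altitude sensors.

```python
import statistics


def vote(readings, tolerance):
    """Return the median reading and the indices of readings far from it."""
    voted = statistics.median(readings)
    suspects = [i for i, r in enumerate(readings) if abs(r - voted) > tolerance]
    return voted, suspects


# Three redundant altitude readings in feet; the third sensor has failed.
altitudes_ft = [35010.0, 34990.0, 12000.0]
voted, suspects = vote(altitudes_ft, tolerance=200.0)
print(voted, suspects)
```

With three sensors the median is unaffected by a single arbitrary failure, which is why triplicated sensors are a natural minimum for this kind of voting. A basic closed-loop AP with a single CGM has no such physical redundancy.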
Unless an engine is lost during takeoff, an aircraft can generally be flown safely on one engine. An example of a case where a pilot (Captain Sullenberger) was able to land safely (on the Hudson River) after the loss of both engines on takeoff, US Airways 1549 in January 2009, is discussed in the appendix. An example of a sensor failure that led to the loss of an aircraft is Air France flight 447 in June 2009, when the icing of pitot tubes (speed sensor) led to the disengagement of the automated flight control system. As detailed in the appendix, the pilots were not able to keep the proper speed of the aircraft, and were confused about the aircraft orientation. Finally, the crash of Asiana flight 214 (again presented in the appendix) in San Francisco in July 2013 occurred partially because a pilot assumed that the plane was under automatic speed control when it was not. It is important to note that the majority of fatal accidents occur during takeoff (20%) and landing (36%), while an additional 12% occur before takeoff (while parked, towed, during taxi phase).
The above analysis has focused on the control strategies for any particular aircraft. A major reason, however, for the outstanding safety record of the air traffic industry, is the air traffic control structure. Air traffic controllers are responsible for keeping proper distances between aircraft that are taking off and landing, as well as those that are at cruising altitude. Radar provides continuous feedback of the location and speed of aircraft within the region controlled by a particular air traffic controller.
Finally, in the event of a disaster or near-disaster, the National Transportation Safety Board (NTSB) performs a detailed investigation, including analysis of data from so-called black boxes as well as cockpit communications, data from aircraft engines, and so on. Again, the results of these investigations will often lead to recommendations for new procedures and operating protocols.
Closed-Loop Artificial Pancreas
A closed-loop AP differs significantly from chemical process manufacturing and air traffic control in a number of ways. A primary difference is in the number of sensors and actuators, since a basic closed-loop AP can be developed with a single sensor (CGM) and actuator (insulin pump); generally it is easier to design controllers for single-input, single-output systems than for multivariable systems. On the other hand, physiological systems are more difficult to mathematically model than physical and chemical systems. Also, physiological systems are much more variable, with both intra- and intersubject variability. Another major challenge is that subcutaneous insulin pharmacokinetics and pharmacodynamics inherently limit the possible dynamic performance of glucose when manipulating insulin. It should be noted, however, that there are many chemical process systems that operate with such a time scale, or often with much longer timescales. The advantages and challenges of fault detection and safety in a closed-loop AP are shown in Table 3.
Table 3. Closed-Loop Artificial Pancreas Fault Detection and Safety.
Fault Detection Algorithms and Safety Science
Fault detection is a well-established area of dynamic systems and control. Frank 7 and Isermann 8 provide reviews of fault detection, with selected applications, of quantitative model-based fault detection and diagnosis techniques. Much of the fault detection literature alludes to controller performance under failure. While often not directly discussed, an important performance attribute is safety. Venkatasubramanian 9 provides a nice perspective on systemic failures, using examples from financial (Enron, Madoff), pharmaceutical (recall of inhalers), electric power grid (northeast blackout of 2003), and the mining and chemical process industries. Leveson 10 presents a new accident model for engineering safer systems, and Leveson and Stephanopoulos 11 provide a control-inspired approach to process safety. Most disasters occur due to a number of causes and not any individual event; Leveson and Stephanopoulos 11 argue that too often the focus on disaster analysis is on the chain of events that lead to the disaster, whereas the overarching cause was systemic, and the risk was increasing over a period of time. That is, too often the accident is viewed as some unfortunate sequence of independent events that happened to occur at a point in time, rather than understanding that systemically, the “accident was waiting to happen” due to the increasing risk over time.
Alarm fatigue occurs when an individual is faced with alarms that occur too frequently, so that the alarms are either ignored or met with an incorrect response. Shivers et al 12 provide a comprehensive overview of alarm fatigue with CGM devices, focused primarily on open-loop (manual) control; they also provide a review of general health care alarm systems and note that alarm fatigue in hospitals has been linked to over 200 patient deaths over an 8-year period. Alarm “flooding” is a common problem in complex chemical plants, where a control room may have tens or hundreds of alarms, of various degrees of importance, activated simultaneously, particularly during process upset conditions. Laberge et al 13 describe these problems and a new alarm tracker summary display that led to fewer false responses from the operators.
Artificial-Pancreas-Related Anomalies and Failures
Infusion Set Failures
Heinemann and Krinelke 14 refer to insulin infusion sets as the “Achilles heel” of continuous insulin infusion, since they are a frequent source of problems. Cope et al 15 performed a 10-year Food and Drug Administration (FDA) retrospective study of adverse events in adolescents using insulin pump technology; of the 1674 reported incidents identified there were 987 (61.9%) reports with patient problems of hyperglycemia, and 46.6% of these indicated that the patient had ketoacidosis. Insulin infusion set failure and infection of infusion site are the most frequent events according to a report of pump malfunctions recorded between 2001 and 2004 in 376 pumps used by patients treated with continuous subcutaneous insulin infusion therapy in Brittany. 16 Patel et al 17 performed a trial comparing steel and Teflon catheters and found that both had a 64% failure rate after 7 days of wear. Heinemann et al 18 discuss the Patel et al studies in their commentary on the need for better insulin infusion sets.
Vega-Hernandez et al 19 develop a model-based approach to detect pump over- or underdelivery of insulin, assuming that insulin boluses are given at mealtime. The in silico model of Hovorka et al 20 is used to simulate two fault-related scenarios: (A) 40% overdosing and 5% parameter variation and (B) 40% underdosing and 5% parameter variation. For both scenarios, the proposed observer-based strategy detects actuator faults based on large differences between the estimated and measured subcutaneous glucose values. Vega-Hernandez et al 21 further consider 10% variations in parameters, with meal content and timing variations of 15% and 15 minutes, respectively.
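The general principle of residual-based detection, in which a fault is declared when measured glucose departs persistently from a model estimate, can be sketched as follows. This is not the observer of Vega-Hernandez et al; the model predictions are simply supplied as a list, and the threshold, persistence count, and function name are hypothetical values chosen for illustration.

```python
def detect_delivery_fault(predicted, measured, threshold=40.0, persistence=3):
    """Return (index, label) for the first persistent residual, or None.

    The residual is measured minus predicted glucose (mg/dl). A fault is
    flagged after `persistence` consecutive residuals beyond `threshold`
    with the same sign.
    """
    run_sign, run_length = 0, 0
    for i, (p, m) in enumerate(zip(predicted, measured)):
        residual = m - p
        sign = 1 if residual > threshold else -1 if residual < -threshold else 0
        if sign != 0 and sign == run_sign:
            run_length += 1
        else:
            run_sign, run_length = sign, (1 if sign != 0 else 0)
        if run_length >= persistence:
            # Glucose above the model estimate suggests too little insulin arrived.
            label = "possible underdelivery" if sign > 0 else "possible overdelivery"
            return i, label
    return None


# Assumed model predictions and CGM readings (mg/dl) at successive samples.
predicted = [120, 125, 130, 135, 140, 145, 150]
measured = [122, 130, 150, 180, 195, 210, 230]
result = detect_delivery_fault(predicted, measured)
print(result)
```

Requiring several consecutive exceedances trades detection delay for fewer false alarms caused by model mismatch, which matters given the intra- and intersubject variability of physiological systems.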
Rojas et al 22 use bivariate classification, principal component analysis (PCA) and a combined approach to detect simulated faults in 10 subjects. Cameron et al 23 use an interactive multiple model (IMM) approach to detect 27 set failures in 120 weeks of outpatient data; the infusion sets, on average, failed after 5.3 days. Cameron et al 24 use a threshold-based approach (using an alarm silencing period) to detect 80% of set failures, with a false positive rate of 0.3/day. Herrero et al 25 use an interval analysis based technique to detect faults in simulation studies involving 10 scenarios on 10 subjects. Baysal et al, 1 in a retrospective analysis of the Patel et al 17 data using real-time algorithms, found that a model-based approach had a median time to detection of 181 minutes, and a glucose value at detection of 277 mg/dl.
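A threshold rule combined with an alarm silencing period can be sketched generically as below. The published threshold-based algorithm is not reproduced here; the glucose threshold, the 5-minute sampling interval, the 60-minute silencing period, and the function name are all assumptions made for illustration.

```python
def alarms_with_silencing(glucose, step_min=5, threshold=250.0, silence_min=60):
    """Return the sample indices at which an alarm is raised.

    An alarm fires when glucose exceeds `threshold`, after which further
    alarms are suppressed for `silence_min` minutes.
    """
    alarms = []
    silenced_until = 0  # time in minutes before which no alarm may fire
    for i, g in enumerate(glucose):
        t = i * step_min
        if g > threshold and t >= silenced_until:
            alarms.append(i)
            silenced_until = t + silence_min
    return alarms


# Assumed CGM trace (mg/dl): one normal sample, then sustained hyperglycemia.
glucose = [240.0] + [260.0] * 15
alarm_indices = alarms_with_silencing(glucose)
print(alarm_indices)
```

Without the silencing period every one of the 15 elevated samples would raise an alarm, which is the kind of repetition that contributes to alarm fatigue.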
Continuous Glucose Sensor Anomalies
Faults associated with continuous glucose sensors can include the slow degradation of signals due to fouling, intermittent loss of signal due to communication dropouts, and intermittent degradation of signals due to a pressure-induced sensor attenuation (PISA). A PISA can occur when an individual rolls over on their sensor. Helton et al 26,27 provide a physiological basis for the sensor attenuations. Mensh et al 28 perform a detailed study by placing 4 sensors on individuals without diabetes overnight, and using video to determine their sleeping position. While the median sensor values were relatively constant, individual readings would occasionally rapidly attenuate; this attenuation was directly correlated to the sleeping position.
During a PISA, CGM signals tend to attenuate for roughly 15-30 minutes before returning to near “pre-attenuation” values. Baysal et al 1,29 develop a rule-based method, based on CGM signals processed by a Kalman filter, specifically applicable to overnight conditions, when PISAs are most likely to occur. The real-time PISA detection technique was tested on over 1125 nights of outpatient data from a predictive low-glucose suspend trial; 88.34% of the PISAs were successfully detected by the algorithm, and the percentage of false detections could be reduced to 1.70% by altering the algorithm parameters. 29
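To give a feel for how a Kalman filter and a simple rule can be combined, the sketch below runs a scalar random-walk Kalman filter over a CGM trace and flags samples whose innovation (measurement minus prediction) is a large, sudden drop. It is not the algorithm of Baysal et al; the noise variances, the drop threshold, the sample trace, and the function name are hypothetical.

```python
def flag_rapid_drops(cgm, process_var=4.0, meas_var=9.0, drop_threshold=-15.0):
    """Return indices of samples that fall abruptly below the filtered estimate.

    A scalar Kalman filter with a random-walk glucose model tracks the
    signal. A sample is flagged when its innovation is below
    `drop_threshold` (mg/dl), and flagged samples do not update the estimate.
    """
    estimate, variance = float(cgm[0]), meas_var
    flagged = []
    for i in range(1, len(cgm)):
        variance += process_var            # predict step (estimate unchanged)
        innovation = cgm[i] - estimate
        if innovation < drop_threshold:
            flagged.append(i)
            continue                       # keep the pre-attenuation estimate
        gain = variance / (variance + meas_var)
        estimate += gain * innovation      # update step
        variance *= (1.0 - gain)
    return flagged


# Assumed overnight CGM trace (mg/dl, 5-minute samples) with a 15-minute dip.
cgm = [110, 109, 108, 85, 70, 68, 105, 107]
flagged = flag_rapid_drops(cgm)
print(flagged)
```

Because flagged samples are excluded from the update, a genuine and sustained fall in glucose would keep being flagged, so a practical detector would need an additional rule, such as a maximum attenuation duration, to distinguish a PISA from true hypoglycemia.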
Fail-Safe Behavior
Safety must be the highest priority of any closed-loop AP system. Control valves in a chemical process are designed to either fail-open or fail-closed depending on the valve service. Since the greatest short-term danger in a closed-loop AP is hypoglycemia, any loss of CGM signal or control computation failure should result in either zero insulin infusion, or a reversion to the basal delivery rate, in addition to the activation of alarms. The long-term pharmacodynamic effect of insulin makes insulin onboard (IOB) an important consideration in any control algorithm. IOB is explicitly used in most model-based AP algorithms; 30 Revert et al 31 also show how IOB can be added as a constraint to any form of control algorithm. Indeed, consideration of IOB and current glucose state could determine whether an insulin delivery system should revert to basal or to zero-infusion. Human factors are an important consideration in any system design, whether under manual or closed-loop operation. Schaeffer 32 provides a nice discussion of the role of human factors in medical device design, with a focus on the design and development of an insulin pump. In the sections that follow we focus on fault detection algorithms as part of an AP system.
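The choice between reverting to basal and reverting to zero infusion can be written as a small decision rule, sketched below. The IOB limit, the glucose limit, and the function name are placeholders chosen for illustration only; they are not clinical recommendations and are not taken from any cited algorithm.

```python
def fallback_rate(basal_rate, iob_units, last_glucose,
                  iob_limit=2.0, low_glucose=100.0):
    """Insulin rate (U/h) to use when the CGM signal or controller is lost.

    Returns zero infusion when the last known glucose is unavailable or
    low, or when insulin onboard is high; otherwise returns the basal rate.
    """
    if last_glucose is None:
        return 0.0          # assumption: unknown glucose is treated conservatively
    if last_glucose < low_glucose or iob_units > iob_limit:
        return 0.0          # hypoglycemia is the greatest short-term danger
    return basal_rate


print(fallback_rate(0.8, 0.5, 150.0))  # reverts to basal
print(fallback_rate(0.8, 3.0, 150.0))  # high IOB, so zero infusion
```

In either branch an alarm would also be raised, since a prolonged period of zero infusion carries its own risk of hyperglycemia.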
Stress Faults
Exercise and stress, while not AP faults per se, have major impacts on insulin sensitivity and blood glucose levels. Finan et al 33 use PCA of CGM, insulin infusion, and recorded meal data to detect “stress days” (when prednisone was given), with 89% classification accuracy.
Multiple Faults
Facchinetti et al 34 present a model-based method, using insulin infusion and CGM signals, that detects faults when the CGM predictions fall outside confidence intervals; a limitation is that it assumes overnight operation with no meals or exercise. This approach is extended by del Favero et al 35 to include meal announcement during the daytime.
Major challenges to glucose control include meals (unannounced in particular) and exercise, but these should be viewed more as disturbances, particularly since they occur on a frequent basis. There are numerous other possible disturbances (and faults) that occur on an infrequent basis, and estimation and fault detection algorithms could be developed for those specific cases. It should be clear that very infrequent or unlikely events cannot (or should not) be explicitly detected, but could be flagged as an unknown event. For example, Buckingham 36 reports the case of a patient who accidentally drilled a hole in her thumb with a drill bit, and the rapid decrease in the CGM reading enabled her to avoid hypoglycemia; it is unlikely that it would be worth the time and effort to develop an algorithm to explicitly detect a “drill bit through the thumb” fault.
Software-Related Failures
While much of the discussion in this article has involved hardware failures, it is worth discussing the possibility of software-related failures. Certainly, at the turn of the century there was much concern about the so-called Y2K problem, where the transition from 1999 to 2000 could lead to problems due to the use of only the last 2 digits (99 to 00) in the date field of many software systems. Fortunately, the tremendous focus on this issue resulted in minimal systemwide problems. A problem with the operating system on Apple iPhones caused wake-up alarms to fail to activate on the morning of January 1, 2011, causing thousands of people throughout the world to miss airline flights and other important appointments.
Welsh et al 37 provide an overview of the engineering aspects of software for insulin infusion pumps, and include a risk analysis of the hazards associated with insulin pumps under current manual operation. They note that the software that drives cardiac pacemakers contains over 80 000 lines of code and that some hospital infusion pumps contain over 170 000 lines of code. Picton et al 38 note that proprietary data and communication protocols of diabetes devices have made the integration of these components challenging. They report that IEEE standards for glucose meters have been approved, and that standards for insulin pumps and CGM are under development.
Remote Monitoring
Chemical Processes and Commercial Aircraft
Remote monitoring has been used in chemical process plants, power plants and aircraft for safety and performance monitoring. General Electric combustion turbines have a number of sensors that are monitored at a central site in Atlanta that includes monitoring of systems from over 1600 power plants. 39 Maggiore and Kinney 40 provide an overview of the Airplane Health Management (AHM) system by Boeing that collects in-flight information and relays it to the ground in real time. The 3 types of decision support include (1) real-time fault management, (2) custom alerting and analysis, and (3) performance monitoring. The fault management system communicates in-flight faults to the ground and diagnoses them in real time, and the custom alerting and analysis system can deliver alerts and notifications through the internet, fax, email, text, and pagers. Performance monitoring results are available within hours and can be used by airlines to reduce fuel consumption and improve operation.
Artificial Pancreas
Remote monitoring is being used in a number of outpatient AP clinical trials, 41-43 and the 2014 Advanced Technologies and Treatments for Diabetes (ATTD) conference included a debate about the potential of remote monitoring of AP devices. 44,45 Place et al 41 provide an overview and report web monitoring results from 3 clinical trials using the DiAs web monitoring tool, which is based on the DiAs platform presented by Keith-Hynes et al. 46 Dassau et al 47 report the development of a safety system for hypoglycemia prevention with several layers, including user alarms at the lowest level, followed by emails and text messages to caregivers, and finally a call to a call center with GPS coordinates. The use of this safety system, within the context of a clinical trial, is reported by Harvey et al. 48
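A layered alerting scheme of the kind described above can be sketched as a simple escalation table. The assumption that escalation is driven by how long an alert has gone unacknowledged, the delay values, and the names used are all hypothetical; the cited safety system may use different triggers.

```python
# (minutes unacknowledged, action); delays are assumed values for illustration.
ESCALATION_LEVELS = [
    (0, "alarm to user"),
    (15, "email and text message to caregivers"),
    (30, "call to call center with GPS coordinates"),
]


def actions_due(minutes_unacknowledged):
    """Return every escalation action that should have been taken so far."""
    return [action for delay, action in ESCALATION_LEVELS
            if minutes_unacknowledged >= delay]


print(actions_due(0))
print(actions_due(20))
```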
Discussion
So what are the most appropriate analogies between chemical process and aircraft safety and the safety of a closed-loop AP? The most dangerous times occur during start-ups and shut-downs during chemical plant operation, and during takeoffs and landings with aircraft. The start-up/shut-down of a chemical process is probably most analogous to the insertion of a new infusion set and/or CGM (along with the sensor calibration); with current sensor technology, there is a “break-in” period of at least 2 hours, so an AP must be in “open-loop” during this time period. The takeoff and landings of aircraft are probably most analogous to meals and exercise, which cause the most “stress” for an AP; aircraft certainly have the advantage that both of these events are “announced” and the dynamic behavior is well described (major disturbances would include wind gusts). We have also seen airplane disasters occur because the pilot(s) assumed that the autopilot (automatic controller) was functioning, when the system was actually under manual (open-loop) control. Similar problems could develop with an AP if an individual assumes that the loop is closed when, either due to a component failure or an accidental switch to manual mode, insulin delivery is actually under manual control.
We have seen that accidents/incidents in the chemical process and air transportation industries usually involve human error. In some cases a human is so accustomed to the automatic control system that it can be difficult to detect and compensate for failures/faults; is there potential for this to occur with an AP as well? The number of people directly impacted by an AP incident is low, compared with the tens and hundreds impacted by chemical process and aircraft accidents, unless the individual with an AP is piloting a plane or operating a chemical process plant—and even then there are other safety measures and pilots or operators available to take control.
In addition, there is increasing awareness that “near misses” should be given a high level of scrutiny, since there was some luck involved in the near miss not becoming an actual accident. In the context of the AP, this would indicate that there be a method of analyzing situations where hypo- or hyperglycemic events occurred even if there was no “disastrous” outcome; that is, there should be a greater focus on certain individuals that may be struggling more with these events. In addition to more “tutoring,” perhaps a higher degree of remote monitoring (through error/warning messages to health care providers, etc) can be provided for specific individuals.
The types of faults that are most likely to cause short-term safety problems include extreme positive sensor bias (sensor reading higher than the actual glucose value), which would cause too much insulin to be delivered, resulting in a danger of hypoglycemia, and a false meal announcement, which could result in a large meal bolus, again resulting in a danger of hypoglycemia. On the other hand, a large negative sensor bias (sensor reading lower than the actual glucose value) combined with an infusion set failure could lead to extreme hyperglycemia for a period of time.
We have also seen the important role of government regulation, including extensive accident investigation and reporting. In the case of the US chemical process industry this is the CSB, and with US aircraft it is the NTSB. With medical devices the regulating agency is the FDA. The AP clinical trials conducted in the US have involved the development of investigational device exemptions (IDEs), and all trials have a Data and Safety Monitoring Board (DSMB) responsible for reviewing all unanticipated problems and serious device experiences.
Summary
Methods to ensure the safe operation of a closed-loop AP have been placed in the context of safety in the chemical process and commercial air traffic industries. Primary advantages to assuring safety in these other industries include highly trained operators, continuously available maintenance staff, sophisticated alarms and control panels, and redundancy in fault susceptible equipment; engineering specialists are also available to analyze complex problems, and tuning technicians can retune malfunctioning controllers. An individual with a closed-loop AP has roles in operations and maintenance, in addition to serving as the “system” being controlled.
Footnotes
Appendix
Abbreviations
AP, artificial pancreas; ASME, American Society of Mechanical Engineers; CGM, continuous glucose monitor; CSB, Chemical Safety Board; DSMB, Data and Safety Monitoring Board; EPA, Environmental Protection Agency; FDA, Food and Drug Administration; GPS, global positioning system; IDE, investigational device exemption; IMM, interactive multiple model; IOB, insulin onboard; MIC, methyl isocyanate; NIDDK, National Institute of Diabetes and Digestive and Kidney Diseases; NTSB, National Transportation Safety Board; PCA, principal component analysis; PISA, pressure-induced sensor attenuation.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Partial support for this work has been provided by grants from National Institute of Diabetes and Digestive and Kidney Diseases (R01DK085591-03 and R01DK102188-01), and JDRF (22-2013-266 and 22-2011-647).
