Fault diagnosis of PLC-based discrete event systems using Petri nets

Abstract

This paper addresses the fault diagnosis problem of PLC-based systems that can be modeled as Petri nets under a certain level of abstraction. Existing Petri-net-based fault diagnosis approaches often associate transitions and/or places with sensors and require that every change in sensor readings be processed by the PLC, so that the PLC may be too busy processing these changes to perform other tasks. This paper assumes that a PLC does not continuously monitor changes in sensor readings, but instead reads the sensor values periodically when needed. The system output is defined as a marking sequence interleaved with possible observed transitions. A fault diagnosis algorithm is developed by defining and solving integer linear programming (ILP) problems whose size is independent of the length of the system output. The proposed approach enjoys high computational efficiency compared with other ILP-based approaches and is more suitable for fault diagnosis of PLC-based systems with low computing power.

Keywords

Petri net, fault diagnosis, programmable logic controller, discrete event system

Introduction

Position of this paper

The rapid development of electronic technology has made computer-integrated systems, which can be abstracted as discrete event systems (DESs),¹ increasingly complex, both structurally and functionally. Fault diagnosis aims to ensure the safe and stable operation of these systems by detecting and isolating faults as soon as possible. (In this paper, “failure” and “fault” have the same meaning and are used interchangeably, although they may have slightly different meanings in some publications.) More specifically, fault diagnosis comprises three tasks: fault detection, fault isolation, and fault identification, which correspond to determining whether a system is running normally, revealing the locations and numbers of faults, and identifying the specific nature of faults, respectively.^2–5

In real industrial systems, the control of a system can often be implemented by a programmable logic controller (PLC). In general, a PLC consists of a central processing unit, a storage unit, and an input-output unit, and can receive, send and process various electronic signals. A PLC is connected to the sensors distributed on the components of a system such that it can collect the control parameters of the system by reading these sensors. Many existing fault diagnosis approaches in the literature can be implemented with PLCs.

In the domain of DESs, many existing fault diagnosis approaches have two main disadvantages: (i) a number of approaches^6–9 have exponential computational complexity in the worst case, so their computational cost may be prohibitive for large systems, and (ii) some approaches^10–12 require that every change in sensor readings be processed by a PLC, so that the PLC may be too busy processing these changes to perform other tasks.

To overcome these disadvantages, we propose a fault diagnosis approach which equips a system with more sensors and reads sensors at a fixed time interval. The proposed approach has higher computational efficiency and is more appropriate for use in real-world systems with low computing power. On the other hand, the main drawback of the proposed approach is the higher cost of purchasing sensors and the higher difficulty of deploying them.

In addition to Petri-net-based approaches such as the one proposed in this paper, approaches based on structural analysis^13,14 or fault trees^15,16 have also been reported. Instead of actively detecting and isolating faults as in this paper, the stability property of a system is explored in Foroozanfar et al.¹⁷ and Lutz-Ley and Lopez-Mellado.¹⁸ A stable system can automatically return to a normal state from a fault state in a finite number of steps. Before presenting the details of the proposed approach, we review some typical publications on fault diagnosis of DESs.

Some original theoretical results on fault diagnosis were developed in the framework of automata. In Sampath et al.,⁶ a plant is modeled as an automaton in which faults are represented by unobservable events and each observable event is associated with a sensor. A diagnoser for fault diagnosis is constructed offline by an algorithm with exponential complexity. When an event is observed, inspecting the diagnoser yields one of three possible fault states of the plant: no fault has occurred, a fault may have occurred, or a fault must have occurred. This fault diagnosis approach is applied to a realistic HVAC system in Sampath et al.¹⁹

In contrast to automaton models, Petri nets, as a typical mathematical model of DESs, have been widely used in supervisory control,^20–23 performance optimization,^24–26 and model identification^10,11,27,28 because of their intrinsic distributed features and efficiency in handling large systems. Simultaneously, in recent decades, many fault diagnosis approaches based on Petri nets have also been reported, which can be generally categorized into two classes.

In the first class, observable transitions are equipped with sensors and the output of a system is defined as a sequence of observable transitions. Integer linear programming (ILP) problems often need to be built and solved to perform fault diagnosis. In Dotoli et al.,⁷ an online fault diagnosis approach is reported that builds an ILP problem based on the observed transition sequence. By assigning two different objective functions to the ILP problem and solving it, whether a fault has occurred can be determined. This approach is extended to the case of labeled Petri nets in Fanti et al.⁸ and Zhu et al.²⁹

Basile et al.³⁰ deal with the fault diagnosis problem by means of the notion of generalized markings, that is, markings in which the number of tokens can be a negative integer. By solving an ILP problem built according to the negative elements of a generalized marking, the occurrence of a fault can be deduced. Wang et al.³¹ employ generalized markings to backward conflict-free Petri nets and propose a more efficient fault diagnosis approach by constructing an ILP problem with smaller size.

In Giua and Seatzu,³² the notion of basis markings is first defined and a fault diagnosis approach based on basis markings is proposed. Given an observed transition sequence, the corresponding basis markings are computed by an algorithm using only algebraic manipulations. The diagnosis result of a fault is obtained by checking the basis markings and, if necessary, solving some ILP problems. The notion of basis markings is further extended to basis reachability graphs in Cabasino et al.^9,33 where a basis reachability graph is first constructed offline and then an online diagnosis algorithm is developed by inspecting the basis reachability graph at each step.

In the second class, sensors are associated with observable transitions as well as with certain places of a Petri net. The system dynamics are represented as an observable transition sequence or a transition-marking sequence. In Zhu et al.,^10,11 faults are described by unobservable transitions not contained in the initial net model of a plant and the output of a system is defined as an observed evolution, that is, a transition-marking sequence. By solving an ILP problem constructed from the observed evolution, the unobservable transitions that characterize the number and locations of faults are identified.

Ru and Hadjicostis¹² model systems as partially observed Petri nets (i.e. nets equipped with transition and place sensors) and define the system output as an ordered sequence of three-tuples $(M, t, M')$ , where $M$ and $M'$ are two markings and $t$ is a transition. After transforming the partially observed Petri net into a labeled net, they propose an online fault diagnosis algorithm based on reachability graph analysis of the unobservable subnet of the labeled net. The study in Lefebvre³⁴ also utilizes partially observed Petri nets and provides an efficient fault diagnosis algorithm by solving a number of linear matrix inequalities. The number of inequality systems to be solved is linear in the length of an observed sequence.

In Ramirez-Trevino et al.,³⁵ a bottom-up modeling methodology is explored to build an interpreted Petri net (IPN) model of a system. By dividing the set of places into a subset of places modeling fault states and a subset of places modeling normal states, the diagnosability of the built IPN is formally defined and a scheme for detecting and locating fault states is developed. By assuming that all places are observable, Genc and Lafortune³⁶ extend the automaton-based diagnoser approach of Sampath et al.⁶ to the context of Petri nets. A Petri net diagnoser is constructed to perform online diagnosis. This approach has high computational complexity since markings must be enumerated each time a transition is observed.

The study in Pencole and Subias³⁷ explores the diagnosability of safe labeled time Petri nets based on the notion of event patterns. An efficient approach to determining the diagnosability of a safe Petri net is reported by means of model-checking techniques. Al-Ajeli and Parker propose a fault diagnosis approach for labeled Petri nets based on Fourier-Motzkin elimination. A diagnoser in the form of sets of inequalities is first constructed offline and then online fault diagnosis is performed by checking the constructed diagnoser when an event is observed. This approach provides a better trade-off between the size of the diagnoser and diagnosis time. In recent years, some extended versions of fault diagnosis have been studied, such as robust diagnosis,³⁸ diagnosability enforcement,³⁹ and synchronous diagnosis.^40,41 Two comprehensive surveys on fault diagnosis can be found in Al-Ajeli and Parker⁴² and Lafortune et al.⁴³

Contributions

As mentioned above, the first class of fault diagnosis approaches often needs to solve ILP problems whose size is linear or polynomial in the length of the observed sequence. When the observed sequence is very long, these approaches become computationally prohibitive. On the other hand, the second class of approaches generally assumes that all or some of the places are observable, that is, these places are equipped with sensors. Moreover, the approaches in the second class usually require that every change in the readings of place sensors be processed by a PLC. A Petri net model of a real-world system may include a large number of places, and there is a deluge of changes in sensor readings during the operation of the system. The PLC may then be too busy processing these changes to perform other tasks. Consequently, in some cases, the existing fault diagnosis approaches are not applicable to real-world systems.

In this paper, we assume that some transitions and all places are equipped with sensors. In addition, due to the limited processing power of a PLC, we assume that it does not continuously monitor changes in the readings of place sensors, but reads the values of the sensors periodically when needed. The system output is defined as a marking sequence interleaved with possible observed transitions.

Based on the observed system output, a fault diagnosis algorithm is developed by defining and solving ILP problems whose size is independent of the length of the system output. Thus, the size of each ILP problem to be solved does not grow as the system output becomes longer. Namely, the algorithm enjoys high computational efficiency for those systems that satisfy the assumptions made in this paper. On the other hand, since the PLC only needs to read place sensors once per time period and does not continuously monitor changes in sensor readings, the proposed algorithm is more suitable for fault diagnosis of some PLC-based systems.

This paper is organized as follows. In Section 2, some basic definitions on Petri nets are provided. In Section 3, the problem to be addressed is formally defined, and a solution to the problem is reported in Section 4. In Section 5, we extend the fault diagnosis algorithm for Petri nets to the case of labeled Petri nets. To show the effectiveness of the proposed approach, in Section 6, we apply the approach to a case study. Finally, a conclusion is drawn in Section 7.

Basic definitions

In this section, we introduce the definitions and notations used throughout the paper. For the complete details of Petri nets, the reader is referred to Giua and Silva⁴⁴ and Murata.⁴⁵

A Petri net is a bipartite graph represented as a four-tuple $N = (P, T, Pre, Post)$ , where $P$ is a set of $m$ places, $T$ is a set of $n$ transitions with $P \cup T \neq \emptyset$ and $P \cap T = \emptyset$ , and $Pre : P \times T \to N$ and $Post : P \times T \to N$ are two functions, represented as two matrices, that specify the arcs of the graph (when used as a codomain, $N$ denotes the set of non-negative integers). We denote by $C = Post - Pre$ the incidence matrix of net $N$ . The sets of input places and of output places of a transition $t$ are denoted by ${}^{•}t = \{p \in P \mid Pre (p, t) > 0\}$ and $t^{•} = \{p \in P \mid Post (p, t) > 0\}$ , respectively. Similarly, for a place $p \in P$ , we define ${}^{•}p = \{t \in T \mid Post (p, t) > 0\}$ and $p^{•} = \{t \in T \mid Pre (p, t) > 0\}$ .

A marking of a Petri net is a mapping $M : P \to N$ . The number of tokens in a place $p \in P$ at a marking $M$ is represented by $M (p)$ . A marking $M = [x_{1}, \dots, x_{m}]^{T}$ is written as $M = x_{1} p_{1} + \dots + x_{m} p_{m}$ (the items whose coefficients are zero are omitted) for simplicity. By associating a Petri net $N$ with an initial marking $M_{0}$ , a Petri net system $〈 N, M_{0} 〉$ is defined.

A transition $t \in T$ is said to be enabled at marking $M$ , denoted by $M [t 〉$ , if the number of tokens in each input place of $t$ is greater than or equal to the weight of the corresponding arc, that is, $M (p) \geq Pre (p, t)$ for all $p \in {}^{•}t$ . With a slight abuse of notation, we use $M [σ 〉$ to denote that a transition sequence $σ \in T^{*}$ is enabled at marking $M$ . A marking $M_{α}$ is reachable from $M$ after firing $σ$ , denoted by $M [σ 〉 M_{α}$ . In addition, $M_{α}$ is computed by the state equation of a net

$$M_{α} = M + C \cdot π (σ), \tag{1}$$

where $π$ is a function, defined by $π : T^{*} \to N^{n}$ , which computes the Parikh vector of a transition sequence $σ$ . If a transition $t$ is in $σ$ , we write $t \in σ$ for simplicity.
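The enabling rule, the Parikh vector, and the state equation (1) can be illustrated with a minimal sketch. The two-place net below is a hypothetical example and is not the net of Figure 1; the function names are likewise chosen only for illustration.

```python
from collections import Counter

# Hypothetical net (assumption): t1 moves a token from p1 to p2, t2 moves it back.
TRANSITIONS = ["t1", "t2"]
PRE = [[1, 0],   # row p1
       [0, 1]]   # row p2
POST = [[0, 1],
        [1, 0]]

def incidence(pre, post):
    """C = Post - Pre, entry by entry."""
    return [[b - a for a, b in zip(pre_row, post_row)]
            for pre_row, post_row in zip(pre, post)]

def enabled(marking, j, pre):
    """Transition j is enabled when M(p) >= Pre(p, t_j) for every place p."""
    return all(marking[i] >= pre[i][j] for i in range(len(marking)))

def fire_sequence(marking, sigma, pre, post, transitions):
    """Fire sigma step by step; raise if some transition is not enabled."""
    current = list(marking)
    for name in sigma:
        j = transitions.index(name)
        if not enabled(current, j, pre):
            raise ValueError(f"{name} is not enabled at {current}")
        current = [current[i] - pre[i][j] + post[i][j]
                   for i in range(len(current))]
    return current

def parikh(sigma, transitions):
    """Parikh vector: number of occurrences of each transition in sigma."""
    counts = Counter(sigma)
    return [counts[name] for name in transitions]

def state_equation(marking, c, x):
    """Right-hand side of equation (1): M + C * x."""
    return [marking[i] + sum(c[i][j] * x[j] for j in range(len(x)))
            for i in range(len(marking))]

M0 = [1, 0]
SIGMA = ["t1", "t2", "t1"]
C = incidence(PRE, POST)
REACHED = fire_sequence(M0, SIGMA, PRE, POST, TRANSITIONS)
PREDICTED = state_equation(M0, C, parikh(SIGMA, TRANSITIONS))
```

For an enabled sequence, step-by-step firing and the state equation give the same marking. The converse does not hold in general: a vector that satisfies the state equation need not correspond to a firable sequence, which is why Theorem 1 below is restricted to acyclic nets.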

The set of transition sequences that are enabled at the initial marking $M_{0}$ is defined as

$$L (N, M_{0}) = \{σ \in T^{*} \mid M_{0} [σ 〉\} . \tag{2}$$

The reachability set of a net system, denoted by $R (N, M_{0})$ , contains all markings that are reachable from $M_{0}$ by firing a transition sequence $σ \in L (N, M_{0})$ , that is, $R (N, M_{0}) = \{M \in N^{m} \mid (\exists σ \in T^{*}) \ M_{0} [σ 〉 M\}$ .

A transition is said to be observable if it is equipped with a sensor that monitors its firing; otherwise, it is called unobservable. We denote by $T_{o}$ and $T_{u}$ the set of observable transitions and the set of unobservable transitions, respectively. In addition, the cardinalities of $T_{o}$ and $T_{u}$ are represented as $n_{o}$ and $n_{u}$ , respectively. Thus, the set $T$ of transitions is partitioned into two disjoint subsets $T_{o}$ and $T_{u}$ with $T = T_{o} \cup T_{u}$ . A place of a Petri net is said to be measurable if the number of tokens residing in it can be detected by a sensor. In this paper, we assume that all places of a Petri net are measurable. A Petri net is said to be acyclic if it does not contain any directed cycle.

Theorem 1.⁴⁵ A marking $M \in N^{m}$ is reachable from $M_{0}$ in an acyclic Petri net system $〈 N, M_{0} 〉$ if and only if there exists a column vector $x \in N^{n}$ such that $M = M_{0} + C \cdot x$ , where $m$ and $n$ are the numbers of places and transitions, respectively.

Definition 1. Given a Petri net $N = (P, T, Pre, Post)$ and a transition subset $\hat{T} \subseteq T$ , the $\hat{T}$ -induced subnet of $N$ is defined as $N_{\hat{T}} = (P, \hat{T}, Pre_{\hat{T}}, Post_{\hat{T}})$ , where $Pre_{\hat{T}}$ and $Post_{\hat{T}}$ are the restrictions of $Pre$ and $Post$ to $P \times \hat{T}$ , respectively.

According to Def. 1, it is straightforward to obtain the $T_{u}$ -induced subnet of $N = (P, T, Pre, Post)$ , denoted by $N_{u} = (P, T_{u}, Pre_{u}, Post_{u})$ . We also call the $T_{u}$ -induced subnet the unobservable subnet of $N$ . The incidence matrix of $N_{u}$ is represented by $C_{u} = Post_{u} - Pre_{u}$ . Unless otherwise stated, the unobservable subnet of every Petri net considered in this paper is assumed to be acyclic.
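As a minimal sketch of Def. 1, the induced subnet can be obtained by keeping only the columns of $Pre$ and $Post$ that belong to the chosen transitions. The three-place net, the function names, and the use of Python's `graphlib` for the acyclicity check are assumptions made for illustration only.

```python
from graphlib import CycleError, TopologicalSorter

def induced_subnet(pre, post, transitions, subset):
    """Restrict Pre and Post to the columns of the transitions in `subset`."""
    keep = [j for j, name in enumerate(transitions) if name in subset]
    names = [transitions[j] for j in keep]
    pre_sub = [[row[j] for j in keep] for row in pre]
    post_sub = [[row[j] for j in keep] for row in post]
    c_sub = [[b - a for a, b in zip(r1, r2)]
             for r1, r2 in zip(pre_sub, post_sub)]
    return names, pre_sub, post_sub, c_sub

def is_acyclic(pre, post):
    """True if the bipartite place/transition graph has no directed cycle."""
    predecessors = {}
    for i, (pre_row, post_row) in enumerate(zip(pre, post)):
        predecessors.setdefault(("p", i), set())
        for j, (a, b) in enumerate(zip(pre_row, post_row)):
            predecessors.setdefault(("t", j), set())
            if a > 0:  # arc from place i to transition j
                predecessors[("t", j)].add(("p", i))
            if b > 0:  # arc from transition j to place i
                predecessors[("p", i)].add(("t", j))
    try:
        list(TopologicalSorter(predecessors).static_order())
        return True
    except CycleError:
        return False

# Hypothetical net (assumption): t1: p3 -> p1, u1: p1 -> p2, f1: p1 -> p3.
TRANSITIONS = ["t1", "u1", "f1"]
PRE = [[0, 1, 1],    # row p1
       [0, 0, 0],    # row p2
       [1, 0, 0]]    # row p3
POST = [[1, 0, 0],
        [0, 1, 0],
        [0, 0, 1]]

NAMES_U, PRE_U, POST_U, C_U = induced_subnet(PRE, POST, TRANSITIONS, {"u1", "f1"})
```

In this toy net the full net contains the cycle through $p_1$, $f_1$, $p_3$, and $t_1$, while its unobservable subnet is acyclic, which is the situation assumed throughout the paper.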

To characterize failures in a real-world system, the unobservable transitions in $T_{u}$ are further divided into regular unobservable transitions, denoted by $T_{reg}$ , and fault transitions, denoted by $T_{f}$ , that is, $T_{u} = T_{reg} \cup T_{f}$ and $T_{reg} \cap T_{f} = \emptyset$ . The fault transition set $T_{f}$ is partitioned into $r$ fault classes, that is, $T_{f} = T_{f}^{1} \cup T_{f}^{2} \cup \dots \cup T_{f}^{r}$ , and the partition is represented as $Π_{f} = \{T_{f}^{1}, \dots, T_{f}^{r}\}$ . For simplicity, we say that $T_{f}^{i}$ occurs if a fault $f \in T_{f}^{i}$ takes place in a system. Given a sequence $σ \in T^{*}$ , we use $t \in σ$ to denote that $t$ is contained in $σ$ and $T_{f}^{i} \in σ$ to denote that there exists at least one transition $f \in T_{f}^{i}$ such that $f \in σ$ .

A Petri net system with a labeling function $λ : T \cup \{ε\} \to E \cup \{ε\}$ is called a labeled Petri net (LPN) system, denoted by $〈 N, M_{0}, E, λ 〉$ , where $E$ is an alphabet and $ε$ represents the empty string. The label of the empty string $ε$ is itself, that is, $λ (ε) = ε$ . The set of transitions that have the same label $e$ is denoted by $T_{e} = \{t \in T \mid λ (t) = e\}$ . The labeling function $λ$ is extended to a transition sequence $σ = t_{1} t_{2} \dots t_{h} \in T^{*}$ with $h$ being its length by defining $λ (σ) = λ (t_{1}) λ (t_{2}) \dots λ (t_{h})$ . We obtain the underlying Petri net of an LPN by removing all labels of transitions of the LPN.
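The extension of $λ$ to sequences amounts to concatenating labels, with $ε$ contributing nothing. The following sketch uses a hypothetical labeling (the transition names and labels are assumptions) and represents $ε$ by the empty Python string.

```python
EPSILON = ""  # the empty string ε

# Hypothetical labeling (assumption): t1 and t2 share label "a", u1 is silent.
LABELING = {"t1": "a", "t2": "a", "t3": "b", "u1": EPSILON}

def label_of_sequence(sigma, labeling):
    """λ(σ) = λ(t_1) λ(t_2) ... λ(t_h); silent transitions add nothing."""
    return "".join(labeling[t] for t in sigma)

def transitions_with_label(label, labeling):
    """T_e: the set of transitions carrying label e."""
    return {t for t, e in labeling.items() if e == label}

WORD = label_of_sequence(["t1", "u1", "t3", "t2"], LABELING)
T_A = transitions_with_label("a", LABELING)
```

Here `WORD` is the observed word for the sequence $t_1 u_1 t_3 t_2$, and `T_A` collects the transitions that cannot be distinguished from one another by observing label `a` alone.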

Problem definition

In general, faults in a system can be divided into three types: incipient, intermittent, and permanent. The approach proposed in this paper is based on Petri nets (a typical discrete event model) and is appropriate for faults that cause a distinct change in the state of system components but do not necessarily bring the system to a halt. Thus, the proposed approach can be used to detect intermittent or permanent faults. These faults may originate from different system components, such as sensors, actuators, or controllers.

The problem of fault diagnosis consists in determining whether faults have occurred according to the behavior of a system. In general, the behavior of a system is described by its output, that is, sensor-readings at each step during the system evolution.

This paper deals with the fault diagnosis problem of PLC-based Petri nets. Though all places of the considered net are measurable, we assume that a PLC does not monitor the changes of readings of place sensors all the time due to the limited processing power of the PLC. To define the output of a Petri net system, the following assumptions hold for the systems under consideration:

(A1) Reading place sensors and firing any transition can be done instantly.

(A2) Place sensors are read once in $K$ time units.

In a real PLC-controlled system, reading sensors is usually implemented by a hardware interrupt and performing an action (i.e. firing a transition) is usually carried out by a callback function of a programming language. In this way, the controller of the system can know whether the operation of reading sensors or performing an action is successful and, if so, obtain the corresponding sensor data. However, in this paper, we mostly focus on the logic steps of a fault diagnosis algorithm, without explicitly considering its software implementation. Thus, we make Assumption (A1) for the convenience of discussion. In addition, we make Assumption (A2) to avoid a situation in which the PLC of a system is too busy processing the changes in the readings of place sensors to perform other tasks.

On the basis of Assumptions (A1) and (A2), the output of a system at each step is represented by two possible cases:

(1) A marking pair $(M, M')$ with $M, M' \in R (N, M_{0})$ .

(2) A three-tuple $(M, t_{o}, M')$ with $M, M' \in R (N, M_{0})$ and $t_{o} \in T_{o}$ .

In the first case, $M'$ is reachable from $M$ by firing an unobservable transition sequence $σ_{u} \in T_{u}^{*}$ , that is, $M [σ_{u} 〉 M'$ , and $M'$ is obtained by reading place sensors after $K$ time units from the starting point $M$ .

In the second case, starting from $M$ , an observable transition $t_{o}$ fires before $K$ time units have elapsed and the output is represented as $(M, t_{o}, M')$ satisfying $M [σ_{u} 〉 M_{α} [t_{o} 〉 M'$ , where $σ_{u} \in T_{u}^{*}$ and $M_{α} \in R (N, M_{0})$ . Namely, marking $M_{α}$ is reached from $M$ by firing an unobservable transition sequence. After observing transition $t_{o}$ , the timer is reset to zero and the elapsed time units are recounted. Subsequently, we provide an example to clarify this.

Example 1. Consider the Petri net shown in Figure 1, where $T_{o} = \{t_{1}\}$ , $T_{u} = \{f_{1}, f_{2}, f_{3}, f_{4}, u_{1}, u_{2}, u_{3}\}$ , and the initial marking is $M_{0} = p_{1} + p_{2}$ . If the time interval to read place sensors is four time units (i.e. $K = 4$ ), then a possible evolution of the net is shown in Figure 2. The timeline shows the moments at which place sensors are read. The output sequence obtained along this timeline is represented as a marking-transition sequence, that is, $M_{0} M_{1} t_{1} M_{2} M_{3} t_{1} M_{4} M_{5}$ , which contains five steps $(M_{0}, M_{1})$ , $(M_{1}, t_{1}, M_{2})$ , $(M_{2}, M_{3})$ , $(M_{3}, t_{1}, M_{4})$ , and $(M_{4}, M_{5})$ , as shown in the lower part of Figure 2.

Figure 1. A Petri net for Example 1.

Figure 2. A possible output sequence of the net in Figure 1.

In Step 1, starting from $M_{0}$ , no observable transition is observed during four time units. Then, at the fourth time unit, place sensors are read and the marking $M_{1} = p_{2} + p_{3}$ is obtained. Thus, the output in Step 1 corresponds to the first case mentioned above, that is, a marking pair $(M_{0}, M_{1})$ . In Step 2, the timer restarts from marking $M_{1}$ and we observe transition $t_{1}$ before four time units elapse. In addition, by reading place sensors, the marking reached after firing $t_{1}$ is $M_{2} = p_{3} + p_{4}$ . Thus, the output in Step 2 corresponds to the second case defined previously, that is, a three-tuple $(M_{1}, t_{1}, M_{2})$ . The outputs in Steps 3–5 can be similarly explained and are illustrated in Figure 2. ∇
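The way Assumptions (A1) and (A2) shape the output can be sketched as a small simulation. Everything below is illustrative: the three-place net, the timed firing log, the `horizon` parameter, and the tie-breaking rule (a periodic read that falls due at the same instant as a firing is taken before the firing) are assumptions, not part of the paper.

```python
def fire(marking, transition, net):
    """Fire one transition; `net` maps a name to its (Pre, Post) columns."""
    pre, post = net[transition]
    if any(m < a for m, a in zip(marking, pre)):
        raise ValueError(f"{transition} is not enabled at {marking}")
    return tuple(m - a + b for m, a, b in zip(marking, pre, post))

def observe(m0, firings, k, observable, net, horizon):
    """Return the steps (M, b, M') seen by a PLC that reads place sensors
    every k time units and restarts its timer at each observed transition.
    b is None for a marking pair (M, M')."""
    steps = []
    current = tuple(m0)      # true marking of the plant
    step_start = tuple(m0)   # marking at the start of the current step
    timer_start = 0
    for time, transition in sorted(firings):
        # Periodic reads that fall due no later than this firing.
        while timer_start + k <= time:
            steps.append((step_start, None, current))
            step_start = current
            timer_start += k
        current = fire(current, transition, net)
        if transition in observable:
            # (A1): sensors are read instantly once the transition is seen.
            steps.append((step_start, transition, current))
            step_start = current
            timer_start = time
    while timer_start + k <= horizon:
        steps.append((step_start, None, current))
        step_start = current
        timer_start += k
    return steps

# Hypothetical net (assumption): u1: p1 -> p2, t1: p2 -> p3, u2: p3 -> p1.
NET = {
    "u1": ((1, 0, 0), (0, 1, 0)),
    "t1": ((0, 1, 0), (0, 0, 1)),
    "u2": ((0, 0, 1), (1, 0, 0)),
}
FIRINGS = [(1, "u1"), (6, "t1"), (7, "u2"), (9, "u1")]
STEPS = observe((1, 0, 0), FIRINGS, k=4, observable={"t1"}, net=NET, horizon=14)
```

With $K = 4$, the firing of `u1` at time 1 is only noticed at the read at time 4, the observation of `t1` at time 6 restarts the timer, and the two unobservable firings at times 7 and 9 are merged into a single marking pair at time 10.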

Definition 2. A trace of a Petri net system $〈 N, M_{0} 〉$ with $T = T_{o} \cup T_{u}$ is a sequence $σ = M_{0} b_{1} M_{1} b_{2} M_{2} \dots b_{L} M_{L}$ such that there exist unobservable transition sequences $σ_{u_{1}}, σ_{u_{2}}, \dots, σ_{u_{L}} \in T_{u}^{*}$ satisfying $M_{0} [σ_{u_{1}} b_{1} 〉 M_{1} [σ_{u_{2}} b_{2} 〉 M_{2} \dots [σ_{u_{L}} b_{L} 〉 M_{L}$ , where $L \geq 1$ is an integer, $b_{i} \in T_{o} \cup \{ε\}$ , and $M_{i} \in R (N, M_{0})$ for $i = 1, \dots, L$ . ∇

The set of traces of a net system $〈 N, M_{0} 〉$ is defined as $T (N, M_{0}) = \{σ \mid σ \text{ is a trace of } 〈 N, M_{0} 〉\}$ . Obviously, the output sequence of a Petri net in this paper, defined under Assumptions (A1) and (A2), is a trace. For example, the output sequence, say $σ_{1}$ , in Example 1 is a trace and it holds $σ_{1} \in T (N, M_{0})$ . Now, we are ready to formally define the fault diagnosis problem addressed in this paper.

Problem 1. Given a Petri net system $〈 N, M_{0} 〉$ and a partition $Π_{f} = \{T_{f}^{1}, \dots, T_{f}^{r}\}$ of $r$ fault classes, the problem consists in determining the occurrence of each fault class $T_{f}^{i}$ ( $i = 1, \dots, r$ ) up to the observation of a trace $σ = M_{0} b_{1} M_{1} b_{2} M_{2} \dots b_{L} M_{L}$ that contains multiple steps, each in the form of a marking pair $(M, M')$ or a three-tuple $(M, t_{o}, M')$ .

Fault diagnosis algorithm for Petri nets

In this section, we define a local diagnoser and a global diagnoser for solving Problem 1 and develop a fault diagnosis algorithm based on integer linear programming techniques. To formally present the algorithm, several new definitions are first provided.

Definition 3. Given a net system $〈 N, M_{0} 〉$ with $T = T_{o} \cup T_{u}$ and two markings $M, M' \in R (N, M_{0})$ , the set of unobservable transition sequences whose firings bring the system from $M$ to $M'$ is defined by $Λ (M, M') = \{σ_{u} \in T_{u}^{*} \mid M [σ_{u} 〉 M'\}$ . ∇

It is clear that if $M'$ is reachable from $M$ by firing a sequence $σ_{u} \in T_{u}^{*}$ , that is, $M [σ_{u} 〉 M'$ , then $Λ (M, M') \neq \emptyset .$ When observing a pair $(M, M')$ during the evolution of a step of a system, the following local diagnoser describes the diagnosis decision of each fault class $T_{f}^{i} \in Π_{f}$ after observing $(M, M') .$

Definition 4. A local diagnoser is a function $Δ : N^{m} \times N^{m} \times Π_{f} \to \{0, 1, 2\}$ which associates a pair $(M, M')$ and each fault class $T_{f}^{i} \in Π_{f}$ with a diagnosis state such that

• $Δ (M, M', T_{f}^{i}) = 0$ if for all $f \in T_{f}^{i}$ and for all $σ_{u} \in Λ (M, M')$ it holds $f \notin σ_{u}$ , that is, there may be multiple paths from $M$ to $M'$ in the reachability graph, but none of them contains a transition $f \in T_{f}^{i}$ . In such a case, no fault in $T_{f}^{i}$ occurs between $M$ and $M'$ .

• $Δ (M, M', T_{f}^{i}) = 1$ if there exist two unobservable transition sequences $σ_{u_{1}}, σ_{u_{2}} \in Λ (M, M')$ such that (i) there exists a fault $f \in T_{f}^{i}$ satisfying $f \in σ_{u_{1}}$ , and (ii) for all $f \in T_{f}^{i}$ , it holds $f \notin σ_{u_{2}}$ , that is, there are at least two possible paths from $M$ to $M'$ in the reachability graph, one passing a fault $f \in T_{f}^{i}$ but another passing none of them.

• $Δ (M, M', T_{f}^{i}) = 2$ if for all $σ_{u} \in Λ (M, M')$ there exists a fault $f \in T_{f}^{i}$ satisfying $f \in σ_{u}$ , that is, each path from $M$ to $M'$ in the reachability graph contains a fault $f \in T_{f}^{i} .$ ∇
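Def. 4 can be evaluated directly on small nets by enumerating $Λ (M, M')$ . The sketch below does this by depth-first search on a hypothetical unobservable subnet (the net, the function names, and the `max_len` cut-off are assumptions); it is meant only to make the three diagnosis states concrete and does not scale to large nets.

```python
def unobservable_paths(net, m, target, max_len):
    """All firing sequences over `net` (unobservable transitions only)
    of length at most max_len that lead from m to target."""
    found = []

    def dfs(marking, prefix):
        if marking == target:
            found.append(tuple(prefix))
        if len(prefix) == max_len:
            return
        for name, (pre, post) in net.items():
            if all(a >= b for a, b in zip(marking, pre)):
                nxt = tuple(a - b + c for a, b, c in zip(marking, pre, post))
                dfs(nxt, prefix + [name])

    dfs(tuple(m), [])
    return found

def local_diagnoser_by_enumeration(net, m, target, fault_class, max_len=10):
    """Diagnosis state 0, 1, or 2 of Def. 4, computed from the explicit paths."""
    paths = unobservable_paths(net, tuple(m), tuple(target), max_len)
    if not paths:
        raise ValueError("target is not reachable by unobservable transitions")
    faulty = [any(t in fault_class for t in path) for path in paths]
    if not any(faulty):
        return 0   # no path contains a fault of the class
    if all(faulty):
        return 2   # every path contains a fault of the class
    return 1       # some paths do, some do not

# Hypothetical unobservable subnet (assumption):
# u1: p1 -> p2, f1: p1 -> p2 (same effect as u1), f2: p2 -> p3.
NET_U = {
    "u1": ((1, 0, 0), (0, 1, 0)),
    "f1": ((1, 0, 0), (0, 1, 0)),
    "f2": ((0, 1, 0), (0, 0, 1)),
}
PATHS = unobservable_paths(NET_U, (1, 0, 0), (0, 0, 1), max_len=10)
```

In this toy net, moving from $p_1$ to $p_3$ necessarily involves `f2` but may or may not involve `f1`, so the class `{f2}` is diagnosed with state 2 and the class `{f1}` with state 1.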

Example 2. By specifying $T_{f}^{1} = \{f_{1}, f_{4}\}$ and $T_{f}^{2} = \{f_{2}, f_{3}\}$ , let us consider the Petri net shown in Figure 1 again. Consider two markings $M_{0} = p_{1} + p_{2}$ and $M_{1} = p_{2} + p_{3}$ as shown in Figure 2. Figure 3 is a part of the reachability graph, describing all possible paths from $M_{0}$ to $M_{1}$ that contain unobservable transitions only.

Figure 3. A part of the reachability graph of the net in Figure 1.

It is easy to verify that there are four paths from $M_{0}$ to $M_{1}$ , as shown in the following:

$$M_{0} \to M_{1} : \left\{ \begin{matrix} σ_{u_{1}} & = & f_{4} f_{1} \\ σ_{u_{2}} & = & f_{4} u_{1} \\ σ_{u_{3}} & = & f_{1} f_{4} \\ σ_{u_{4}} & = & u_{1} f_{4} . \end{matrix} \right. \tag{3}$$

Since $f_{4} \in T_{f}^{1}$ is contained in $σ_{u_{1}}, σ_{u_{2}}, σ_{u_{3}},$ and $σ_{u_{4}}$ , it holds $Δ (M_{0}, M_{1}, T_{f}^{1}) = 2$ according to Def. 4. Following a similar reasoning, the diagnosis decision $Δ (M_{0}, M_{1}, T_{f}^{2}) = 0$ is obtained. ∇

It is not realistic to compute the diagnosis decision of each fault class via reachability graph analysis because of the state explosion problem. We next provide a linear algebraic characterization of all possible unobservable transition sequences from one marking to another.

For two markings $M, M' \in R (N, M_{0})$ and a fault class $T_{f}^{i} \in Π_{f}$ , we construct the following two ILP problems

$$\text{ILPP 1}: \quad Γ_{1} (M, M', T_{f}^{i}) = \left\{ \begin{matrix} M' = M + C_{u} \cdot y \\ \sum_{f \in T_{f}^{i}} y (f) = 0 \\ y \in N^{n_{u}} \end{matrix} \right. \tag{4}$$

$$\text{ILPP 2}: \quad Γ_{2} (M, M', T_{f}^{i}) = \left\{ \begin{matrix} M' = M + C_{u} \cdot y \\ \sum_{f \in T_{f}^{i}} y (f) \geq 1 \\ y \in N^{n_{u}}, \end{matrix} \right. \tag{5}$$

where $y$ is an $n_{u}$ -dimensional non-negative integer vector. The term $\sum_{f \in T_{f}^{i}} y (f)$ stands for the sum of the components of $y$ corresponding to all fault transitions in $T_{f}^{i}$ . On the basis of these two ILP problems, we obtain the following theorem:

Theorem 2. Let $〈 N, M_{0} 〉$ be a Petri net system with $T = T_{o} \cup T_{u}$ and $T_{f} = T_{f}^{1} \cup \dots \cup T_{f}^{r} \subseteq T_{u} .$ For two markings $M, M' \in R (N, M_{0})$ with $M'$ being reachable from $M$ by firing an unobservable transition sequence, and a fault class $T_{f}^{i} \in Π_{f}$ , it holds

(1) $Δ (M, M', T_{f}^{i}) = 2$ if and only if ILPP 1 has no feasible solution,

(2) $Δ (M, M', T_{f}^{i}) = 0$ if and only if ILPP 2 has no feasible solution,

(3) $Δ (M, M', T_{f}^{i}) = 1$ if and only if both ILPP 1 and ILPP 2 admit a solution.

Proof. We prove the results (1)–(3) in turn.

(1): (if) Since $M'$ is reachable from $M$ by firing an unobservable transition sequence, there necessarily exists a column vector $y$ satisfying $M' = M + C_{u} \cdot y$ . On the other hand, if ILPP 1 has no feasible solution, then there does not exist an unobservable transition sequence that contains no fault transition $f \in T_{f}^{i}$ and whose firing brings the system from $M$ to $M'$ , that is, for all $σ_{u} \in Λ (M, M')$ , $T_{f}^{i} \in σ_{u}$ . According to Def. 4, we have $Δ (M, M', T_{f}^{i}) = 2$ . (only if) By contradiction, suppose that $Δ (M, M', T_{f}^{i}) = 2$ and ILPP 1 admits a solution. Since the unobservable subnet is acyclic, by Theorem 1 there exists a sequence $σ_{u} \in T_{u}^{*}$ such that $M [σ_{u} 〉 M'$ and $T_{f}^{i} \notin σ_{u}$ . Namely, it holds $Δ (M, M', T_{f}^{i}) = 1$ or $0$ , which contradicts $Δ (M, M', T_{f}^{i}) = 2$ .

(2): (if) If ILPP 2 has no feasible solution, then no unobservable transition sequence from $M$ to $M'$ in the reachability graph contains a fault transition $f \in T_{f}^{i}$ , that is, for all $σ_{u} \in Λ (M, M')$ , $T_{f}^{i} \notin σ_{u}$ . Thus, we conclude $Δ (M, M', T_{f}^{i}) = 0$ by Def. 4. (only if) We prove this result by contradiction. If ILPP 2 has a solution, by Theorem 1, there is a sequence $σ_{u}$ satisfying $M [σ_{u} 〉 M'$ and $T_{f}^{i} \in σ_{u}$ , that is, $Δ (M, M', T_{f}^{i}) \neq 0$ . This violates the condition $Δ (M, M', T_{f}^{i}) = 0$ .

(3): (if) If both ILPP 1 and ILPP 2 have a solution, by Theorem 1, there exist two sequences $σ_{u_{1}}, σ_{u_{2}} \in Λ (M, M')$ such that $T_{f}^{i} \notin σ_{u_{1}}$ and $T_{f}^{i} \in σ_{u_{2}}$ . Thus, it holds $Δ (M, M', T_{f}^{i}) = 1$ by Def. 4. (only if) By contradiction, we first assume that ILPP 1 has no solution. Then, we have $Δ (M, M', T_{f}^{i}) = 2$ according to result (1). This violates the condition $Δ (M, M', T_{f}^{i}) = 1$ . Thus, ILPP 1 necessarily has a solution. Similarly, if we assume that ILPP 2 has no solution, it holds $Δ (M, M', T_{f}^{i}) = 0$ according to result (2). This violates $Δ (M, M', T_{f}^{i}) = 1$ . Thus, both ILPP 1 and ILPP 2 admit a solution. □
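Theorem 2 reduces the local diagnoser to two feasibility checks. The sketch below mirrors that logic on a hypothetical unobservable subnet; as a loudly stated assumption, it replaces a real ILP solver with a brute-force search over $y \in \{0, \dots, bound\}^{n_{u}}$ , which is only adequate for tiny examples. In practice, the two problems (4) and (5) would be handed to an ILP solver.

```python
from itertools import product

def feasible(c_u, m, m_prime, fault_idx, want_fault, bound):
    """Brute-force stand-in for an ILP solver (assumption).
    want_fault=False checks ILPP 1 (sum of fault components equals 0);
    want_fault=True checks ILPP 2 (sum of fault components is at least 1)."""
    n_u = len(c_u[0])
    for y in product(range(bound + 1), repeat=n_u):
        satisfies_state_equation = all(
            m[i] + sum(c_u[i][j] * y[j] for j in range(n_u)) == m_prime[i]
            for i in range(len(m))
        )
        if not satisfies_state_equation:
            continue
        fault_count = sum(y[j] for j in fault_idx)
        if (fault_count >= 1) == want_fault:
            return True
    return False

def local_diagnoser(c_u, m, m_prime, fault_idx, bound=3):
    """Diagnosis state of Def. 4 obtained through Theorem 2."""
    ilpp1 = feasible(c_u, m, m_prime, fault_idx, False, bound)
    ilpp2 = feasible(c_u, m, m_prime, fault_idx, True, bound)
    if not ilpp1 and not ilpp2:
        raise ValueError("no solution of the state equation within the bound")
    if not ilpp1:
        return 2
    if not ilpp2:
        return 0
    return 1

# Hypothetical acyclic unobservable subnet (assumption), columns u1, f1, f2:
# u1: p1 -> p2, f1: p1 -> p2, f2: p2 -> p3.
C_U = [[-1, -1, 0],   # row p1
       [1, 1, -1],    # row p2
       [0, 0, 1]]     # row p3
F1, F2 = [1], [2]     # column indices of the fault classes {f1} and {f2}
```

For example, `local_diagnoser(C_U, (1, 0, 0), (0, 0, 1), F2)` evaluates to 2, because every solution of the state equation has a positive `f2` component. Note that the two problems involve only $M$ , $M'$ , and $C_{u}$ , so their size does not depend on how many steps have been observed before.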

Given two markings $M, M' \in R (N, M_{0})$ and a fault class $T_{f}^{i} \in Π_{f}$ , the local diagnoser $Δ (M, M', T_{f}^{i})$ can be computed by solving ILPPs 1 and 2 according to Theorem 2. However, the final aim of fault diagnosis is to determine the occurrence of faults till observing a trace $σ = M_{0} b_{1} M_{1} b_{2} M_{2} \dots b_{L} M_{L}$ , as described in Problem 1. Thus, we provide the following definitions to achieve the aim.

Definition 5. Given a trace $σ = M_{0} b_{1} M_{1} b_{2} M_{2} \dots b_{L} M_{L} \in T (N, M_{0})$ of a net system $〈 N, M_{0} 〉$ , the set of its associated marking pairs is defined as

$$M_{σ} = \{ (M_{i}, M) \mid M = M_{i + 1} - C \cdot π (b_{i + 1}) \land M \neq M_{i}, \ i = 0, \dots, L - 1 \},$$

where $π (b_{i}) = {\vec{0}}^{n}$ if $b_{i} = ε$ . ∇

Example 3. Consider the net shown in Figure 1 and one of its traces $σ = M_{0} b_{1} M_{1} b_{2} M_{2} b_{3} M_{3} b_{4} M_{4} b_{5} M_{5}$ , as shown in Figure 4 . In this trace, we have $b_{1} = b_{3} = b_{5} = ε$ and $b_{2} = b_{4} = t_{1}$ . In addition, it is clear that $C \cdot π (t_{1}) = [0\ 0\ {-1}\ 1\ 0]^{T}$ . Thus, by Def. 5, it holds

\begin{matrix} M_{σ} = \{(p_{1} + p_{2}, p_{2} + p_{3}), (p_{2} + p_{3}, 2 p_{3}), \\ (p_{3} + p_{4}, 2 p_{3}), (p_{3} + p_{4}, p_{2} + p_{3})\} . \end{matrix}

Figure 4.

A trace of the net in Figure 1.

When computing the set $M_{σ}$ , we delete a marking pair $(2 p_{3}, 2 p_{3})$ since $2 p_{3} = 2 p_{3}$ (i.e. $M = M_{i}$ ).
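The construction of $M_{σ}$ in Def. 5 can be sketched as a short Python function. The names `marking_pairs` and `effect` are hypothetical; `effect` is assumed to map each observable transition to its column $C \cdot π (t)$ , and `None` encodes $ε$ . The markings below are read off the pairs listed in Example 3 and written over five places, as the vector $C \cdot π (t_{1})$ suggests.

```python
def marking_pairs(markings, observed, effect):
    """Return the marking pairs of Def. 5 in order of appearance.

    markings: [M_0, ..., M_L] as tuples of token counts.
    observed: [b_1, ..., b_L]; None stands for the empty string.
    effect:   transition name -> column C * pi(t) (assumed to be given).
    """
    pairs = []
    for i, b in enumerate(observed):
        nxt = markings[i + 1]
        if b is not None:
            # Undo the observed transition: M = M_{i+1} - C * pi(b_{i+1}).
            nxt = tuple(x - c for x, c in zip(nxt, effect[b]))
        pair = (markings[i], nxt)
        # Def. 5 drops pairs with M = M_i; repeats are dropped since M_sigma is a set.
        if nxt != markings[i] and pair not in pairs:
            pairs.append(pair)
    return pairs


# Trace of Example 3, markings as vectors over (p1, ..., p5).
markings = [
    (1, 1, 0, 0, 0),  # M0 = p1 + p2
    (0, 1, 1, 0, 0),  # M1 = p2 + p3
    (0, 0, 1, 1, 0),  # M2 = p3 + p4
    (0, 0, 2, 0, 0),  # M3 = 2 p3
    (0, 0, 1, 1, 0),  # M4 = p3 + p4
    (0, 1, 1, 0, 0),  # M5 = p2 + p3
]
observed = [None, "t1", None, "t1", None]
effect = {"t1": (0, 0, -1, 1, 0)}
pairs = marking_pairs(markings, observed, effect)
```

The fourth step yields the pair $(2 p_{3}, 2 p_{3})$ , which the function discards, so four pairs remain.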

Definition 6. A global diagnoser of a net $〈 N, M_{0} 〉$ is a function $Θ : T (N, M_{0}) \times Π_{f} \to {0, 1, 2}$ that associates a trace $σ \in T (N, M_{0})$ and a fault class $T_{f}^{i} \in Π_{f}$ with a diagnosis decision such that

• $Θ (σ, T_{f}^{i}) = 0$ if for all pairs $(M, M') \in M_{σ}$ , it holds $Δ (M, M', T_{f}^{i}) = 0$ ,

• $Θ (σ, T_{f}^{i}) = 1$ if (i) there exists a pair $(M, M') \in M_{σ}$ such that $Δ (M, M', T_{f}^{i}) = 1$ and (ii) there does not exist a pair $(M, M') \in M_{σ}$ such that $Δ (M, M', T_{f}^{i}) = 2$ ,

• $Θ (σ, T_{f}^{i}) = 2$ if there exists a pair $(M, M') \in M_{σ}$ such that $Δ (M, M', T_{f}^{i}) = 2 .$ ∇

In plain terms, the diagnosis decision $Θ (σ, T_{f}^{i}) = 0$ means that no fault in the class $T_{f}^{i}$ occurs in any step of the trace $σ$ . The decision $Θ (σ, T_{f}^{i}) = 1$ means that the trace is consistent both with evolutions that contain a fault in $T_{f}^{i}$ and with evolutions that contain none. The decision $Θ (σ, T_{f}^{i}) = 2$ implies that a fault in $T_{f}^{i}$ has necessarily occurred in at least one step of $σ$ . Consequently, every path from $M_{0}$ to $M_{L}$ in the reachability graph that includes the transitions of that step necessarily passes a fault transition $f \in T_{f}^{i}$ .

In Def. 6, we observe that the global diagnoser of a trace $σ$ can be computed by inspecting all local diagnosers of steps in the trace. Next, we develop an algorithm to detect faults that have occurred till the observation of a trace.

The main logical flow of Algorithm 1 is illustrated by the flowchart shown in Figure 5. More specifically, in Line 1, the global diagnoser $Θ$ is initialized to an $r$ -dimensional column vector $Θ = {\vec{0}}^{r} \in {0, 1, 2}^{r}$ such that $Θ (i)$ stands for the diagnosis decision of fault class $T_{f}^{i}$ , where $r$ is the number of fault classes and 0, 1, and 2 are the diagnosis decisions. We compute the diagnosis decisions of all fault classes in turn, as shown in Line 3. For a fault class $T_{f}^{i}$ , the fault decision $Θ (i)$ is computed in Lines 5–18. The variable $D = 0$ in Line 5 represents the global diagnosis decision and $d = 0$ in Line 7 denotes a local decision. For each marking pair $(M, M') \in M_{σ}$ , the local diagnosis decision $d$ is obtained by solving ILPP 1 and, if necessary, ILPP 2. The global diagnosis decision is the maximum of the local decisions over all marking pairs, as shown in Line 17.

Algorithm 1: Fault diagnosis of a trace in a Petri net system
Input: A trace $σ = M_{0} b_{1} M_{1} b_{2} M_{2} \dots b_{L} M_{L}$ in a net system $〈 N, M_{0} 〉$
Output: Diagnosis decision $Θ (σ, T_{f}^{i})$ of each fault class $T_{f}^{i} \in Π_{f}$
1 $Θ = {\vec{0}}^{r} \in {0, 1, 2}^{r};$
2 Compute the set $M_{σ}$ of marking pairs associated with $σ$ according to Def. 5;
3 for $i = 1, \dots, r$ do
4   Consider the fault class $T_{f}^{i}$ ;
5   $D = 0;$
6   for all $(M, M') \in M_{σ}$ do
7     $d = 0;$
8     Build and solve ILPP 1, i.e., $Γ_{1} (M, M', T_{f}^{i});$
9     if ILPP 1 has no feasible solution then
10      $d = 2;$
11    else
12      Build and solve ILPP 2, i.e., $Γ_{2} (M, M', T_{f}^{i});$
13      if ILPP 2 has no feasible solution then
14        $d = 0;$
15      else
16        $d = 1;$
17    $D = \max (D, d);$
18  $Θ (i) = D;$
19 Output $Θ;$

Figure 5.

The flowchart of Algorithm 1.

The main computational cost of Algorithm 1 stems from the solution of ILPP 1 and/or ILPP 2. It is well known that integer linear programing is NP-hard in general and that the effort required to solve a problem is closely related to its size (i.e. the numbers of unknown variables and constraints). However, in practice, many integer linear programing problems can be efficiently solved using commercial solvers (such as the Gurobi solver⁴⁶ used in this paper). Since ILPP 1 and ILPP 2 have the same size, we analyze the size of ILPP 1 only. The number of unknown variables is denoted by

I = n_{u},

(6)

where $n_{u}$ is the number of unobservable transitions; the number of constraints is denoted by

J = m + 1,

(7)

where $m$ is the cardinality of the place set. We observe that the size of ILPP 1 is independent of the length $L$ of a trace $σ$ .

Proposition 1. Given a trace $σ = M_{0} b_{1} M_{1} b_{2} M_{2} \dots b_{L} M_{L}$ of a net system $〈 N, M_{0} 〉$ , for all fault classes $T_{f}^{i} \in Π_{f}$ , Algorithm 1 outputs the correct diagnosis decision $Θ (σ, T_{f}^{i})$ .

Proof. According to Theorem 2, the local diagnosis decision $Δ (M, M', T_{f}^{i})$ of each marking pair $(M, M') \in M_{σ}$ is correctly computed in Lines 6–16. We next prove that it is correct to compute the global diagnosis decision by selecting the maximal local diagnosis decision, that is, $D = \max (D, d)$ in Line 17. If for all $(M, M') \in M_{σ}$ , we have $Δ (M, M', T_{f}^{i}) = 0$ , that is, $d = 0$ , then the algorithm outputs $D = Θ (σ, T_{f}^{i}) = 0$ . This is correct according to Def. 6. On the other hand, if there exists a pair $(M, M') \in M_{σ}$ with $d = Δ (M, M', T_{f}^{i}) = 1$ and there does not exist a pair $(\bar{M}, \bar{M}')$ with $Δ (\bar{M}, \bar{M}', T_{f}^{i}) = 2$ , the maximal value is 1, that is, $D = \max (D, d) = 1$ . This result can be readily verified by Def. 6. Finally, if there exists a pair $(M, M') \in M_{σ}$ with $d = Δ (M, M', T_{f}^{i}) = 2$ , it is clear that $D = \max (D, d) = 2$ . This is also consistent with Def. 6. □

Example 4. Let us consider the net system shown in Figure 6 , where $T_{o} = {t_{1}, t_{2}}$ , $T_{u} = {u_{1}, u_{2}, f_{1}, f_{2}}$ , $T_{f}^{1} = {f_{1}}$ , and $T_{f}^{2} = {f_{2}}$ . Assume that one of the traces of the net is

\begin{matrix} σ = M_{0} t_{2} M_{1} t_{1} M_{2} t_{2} M_{3} ε M_{4} \\ = [\begin{matrix} 2 \\ 0 \\ 0 \\ 0 \\ 0 \\ 2 \end{matrix}] t_{2} [\begin{matrix} 1 \\ 1 \\ 0 \\ 0 \\ 1 \\ 1 \end{matrix}] t_{1} [\begin{matrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 2 \end{matrix}] t_{2} [\begin{matrix} 0 \\ 1 \\ 0 \\ 1 \\ 1 \\ 1 \end{matrix}] ε [\begin{matrix} 0 \\ 0 \\ 1 \\ 1 \\ 0 \\ 2 \end{matrix}] . \end{matrix}

Figure 6.

A Petri net for Example 4.

We compute the global diagnosers $Θ (σ, T_{f}^{1})$ and $Θ (σ, T_{f}^{2})$ using Algorithm 1. The global diagnosis decisions are represented as a two-dimensional column vector $Θ$ such that $Θ (1) = Θ (σ, T_{f}^{1})$ and $Θ (2) = Θ (σ, T_{f}^{2})$ . Initially, $Θ$ is set to $Θ = [0\ 0]^{T}$ . According to Def. 5, we have $M_{σ} = \{P_{1}, P_{2}, P_{3}, P_{4}\}$ with $P_{1} = (2 p_{1} + 2 p_{6}, p_{1} + p_{2} + 2 p_{6})$ , $P_{2} = (p_{1} + p_{2} + p_{5} + p_{6}, p_{2} + p_{3} + 2 p_{6})$ , $P_{3} = (p_{1} + p_{2} + 2 p_{6}, p_{2} + p_{4} + 2 p_{6})$ , and $P_{4} = (p_{2} + p_{4} + p_{5} + p_{6}, p_{3} + p_{4} + 2 p_{6})$ .

For the pair $P_{1} \in M_{σ}$ , the local diagnosis decision for $T_{f}^{1}$ is $d = 0$ by Lines 7–16. The global diagnosis decision of $T_{f}^{1}$ is updated by $Θ (1) = \max (Θ (1), d) = 0$ . Using the same reasoning, we have $Θ (2) = \max (Θ (2), 0) = 0$ . For the pair $P_{2} \in M_{σ}$ , we have $Θ (1) = \max (Θ (1), 1) = 1$ and $Θ (2) = \max (Θ (2), 1) = 1$ . Analogously, for $P_{3}$ , the local diagnosis decision is $\vec{d} = [Δ (M, M', T_{f}^{1}) Δ (M, M', T_{f}^{2})]^{T} = [1 2]^{T}$ and the global diagnosis decision is updated by $Θ = [\max (Θ (1), 1) \max (Θ (2), 2)]^{T} = [1 2]^{T}$ . Finally, for $P_{4}$ , we have $\vec{d} = [2 2]^{T}$ and $Θ = [\max (Θ (1), 2) \max (Θ (2), 2)]^{T} = [2 2]^{T} .$ ∇
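The max-aggregation of Line 17 can be replayed on the local decisions reported in Example 4. The sketch below is not the full algorithm: it takes the local decisions as given (they would come from solving ILPP 1 and ILPP 2) and only reproduces how $Θ$ evolves pair by pair. The variable names are hypothetical.

```python
from itertools import accumulate

# Local decisions for the pairs P1..P4 as stated in Example 4.
local_f1 = [0, 1, 1, 2]  # fault class T_f^1
local_f2 = [0, 1, 2, 2]  # fault class T_f^2

# Running value of D = max(D, d) after each marking pair.
running_f1 = list(accumulate(local_f1, max))
running_f2 = list(accumulate(local_f2, max))

# Theta after each pair, then the final global decision vector.
theta_history = list(zip(running_f1, running_f2))
theta = [running_f1[-1], running_f2[-1]]
```

Because the decision values are ordered so that 2 dominates 1 and 1 dominates 0, a single pass with `max` is enough and no pair has to be revisited.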

Fault diagnosis algorithm for LPNs

This section extends the fault diagnosis algorithm for Petri nets to the case of labeled Petri nets. In LPNs, two or more observable transitions may share the same label and an observation often corresponds to multiple traces of the underlying Petri nets. Thus, the algorithm for LPNs needs to enumerate all possible transitions with the same label and combine diagnosis results for different traces.

Definition 7. A labeled trace of an LPN system $〈 N, M_{0}, E, λ 〉$ is a sequence $ω = M_{0} e_{1} M_{1} e_{2} M_{2} \dots e_{L} M_{L}$ such that there exists a trace $σ = M_{0} b_{1} M_{1} b_{2} M_{2} \dots b_{L} M_{L} \in T (N, M_{0})$ , where $L \geq 1$ is a positive integer, $e_{i} \in E \cup {ε}, b_{i} \in T_{e_{i}} \cup {ε},$ and $M_{i} \in R (N, M_{0}), i = 1, \dots, L .$ ∇

In plain words, if we observe a labeled trace $ω$ , the underlying evolution of a system is a trace $σ$ such that $λ (b_{i}) = e_{i}$ with $i = 1, \dots, L$ . We use $e_{i} M_{i} \in ω$ to denote that $e_{i} M_{i}$ is contained in $ω$ , $i \in {1, \dots, L}$ . To conveniently present the fault diagnosis algorithm for LPNs, we define the following operator.

Definition 8. The operator $\otimes : {0, 1, 2} \times {0, 1, 2} \to {0, 1, 2}$ associates two operands $a, b \in {0, 1, 2}$ with a value $c \in {0, 1, 2}$ according to Table 1, which is written as $a \otimes b = c$ .

Table 1.

Binary operator ⊗.

⊗	0	1	2
0	0	1	1
1	1	1	1
2	1	1	2
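Table 1 admits a compact encoding: the result equals the common value when both operands agree and is 1 otherwise. A minimal Python sketch follows; the function names `otimes` and `otimes_all` are hypothetical.

```python
from functools import reduce


def otimes(a: int, b: int) -> int:
    """Operator of Table 1 on decisions in {0, 1, 2}."""
    if a not in (0, 1, 2) or b not in (0, 1, 2):
        raise ValueError("operands must be 0, 1 or 2")
    # Agreeing decisions are kept; any disagreement yields the uncertain value 1.
    return a if a == b else 1


def otimes_all(decisions):
    """Fold a non-empty list of decisions with the operator."""
    return reduce(otimes, decisions)


# Rebuild Table 1 row by row for checking.
table = [[otimes(a, b) for b in (0, 1, 2)] for a in (0, 1, 2)]
```

The operator is commutative and associative, so the order in which candidate decisions are folded does not affect the result.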

Given a labeled trace $ω$ , there often exist multiple traces $σ$ that produce the same observation. We use the operator ⊗ to combine the diagnosis decisions of these traces to form the diagnosis result of the labeled trace. Algorithm 2 illustrates the usage of ⊗. Before presenting the algorithm, we define an ILP problem:

Γ (M, M') = \left\{ \begin{matrix} M' = M + C_{u} \cdot y \\ y \in \mathbb{N}^{n_{u}}, \end{matrix} \right.

(8)

which characterizes whether there exists an unobservable transition sequence $σ_{u}$ such that $M [σ_{u} 〉 M'$ and $y = π (σ_{u})$ . If $Γ (M, M')$ admits a solution, such a sequence necessarily exists by Theorem 1. Otherwise, no such sequence exists. We are now ready to present the fault diagnosis algorithm for LPNs.

Algorithm 2: Fault diagnosis of a labeled trace in an LPN system
Input: A labeled trace $ω = M_{0} e_{1} M_{1} e_{2} M_{2} \dots e_{L} M_{L}$ of an LPN $〈 N, M_{0}, E, λ 〉$
Output: Diagnosis decision $Φ (ω, T_{f}^{i})$ of each fault class $T_{f}^{i} \in Π_{f}$
1 $Φ = {\vec{0}}^{r} \in {0, 1, 2}^{r};$
2 $M = M_{0};$
3 for $i = 1, \dots, L$ do
4   Consider $e_{i} M_{i} \in ω;$
5   $D = {\vec{0}}^{r} \in {0, 1, 2}^{r};$
6   if $e_{i} = ε$ then
7     $M' = M_{i};$
8     for $j = 1, \dots, r$ do
9       Compute the local diagnosis decision $d = Δ (M, M', T_{f}^{j})$ by solving ILPP 1 and/or ILPP 2;
10      $D (j) = d;$
11  else
12    for all $t_{o} \in T_{e_{i}}$ do
13      $M' = M_{i} - C \cdot π (t_{o});$
14      if $Γ (M, M')$ admits a solution then
15        for $j = 1, \dots, r$ do
16          Compute $d = Δ (M, M', T_{f}^{j})$ by solving ILPP 1 and/or ILPP 2;
17          $D (j) = D (j) \otimes d;$
18  $Φ (k) = \max (Φ (k), D (k))$ with $k = 1, \dots, r;$
19  $M = M_{i};$
20 Output $Φ;$

In Algorithm 2, the global diagnosis decisions of a labeled trace $ω$ are stored in an $r$ -dimensional vector $Φ$ , as shown in Line 1. All steps $e_{i} M_{i} \in ω$ ( $i = 1, \dots, L$ ) are considered in turn in Line 3. For each $e_{i} M_{i}$ , the local diagnosis decisions are recorded in $D = {\vec{0}}^{r} \in {0, 1, 2}^{r}$ . If $e_{i}$ is the empty string in Line 6, we compute the diagnosis decision $d = Δ (M, M', T_{f}^{j})$ of each fault class $T_{f}^{j}$ by solving ILPP 1 and/or ILPP 2. Otherwise, we check all transitions $t_{o}$ with label $e_{i}$ in Line 12 and update $M'$ by $M' = M_{i} - C \cdot π (t_{o}) .$ If the problem $Γ (M, M')$ has a solution in Line 14, there necessarily exists an unobservable transition sequence $σ_{u}$ such that $M [σ_{u} 〉 M' [t_{o} 〉 M_{i} .$ Then, the diagnosis decision $d = Δ (M, M', T_{f}^{j})$ can be computed by solving ILPPs 1 and 2, and the diagnosis decisions for different transitions $t_{o}$ are combined by the operator ⊗ in Line 17. Finally, the global diagnosis vector $Φ$ is obtained by combining the diagnosis vectors $D$ of all steps $e_{i} M_{i}$ using the max operation in Line 18.
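The control flow of Algorithm 2 can be sketched in Python with the two solver-dependent parts passed in as functions. Everything named here is hypothetical: `gamma_feasible` stands for the feasibility check of $Γ (M, M')$ , `local` for the local diagnoser $Δ$ , and `effect` for the columns $C \cdot π (t_{o})$ . One interpretation is made explicit: the sketch seeds $D (j)$ with the decision of the first feasible candidate transition, so that a step with two candidates yields $d_{1} \otimes d_{3}$ as in Example 5, rather than combining with the initial zero. The numbers in the usage lines are invented.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Marking = Tuple[int, ...]


def otimes(a: int, b: int) -> int:
    """Operator of Table 1: equal operands are kept, anything else gives 1."""
    return a if a == b else 1


def diagnose_labeled_trace(
    M0: Marking,
    steps: Sequence[Tuple[Optional[str], Marking]],
    transitions_by_label: Dict[str, List[str]],
    effect: Dict[str, Marking],
    r: int,
    gamma_feasible: Callable[[Marking, Marking], bool],
    local: Callable[[Marking, Marking, int], int],
) -> List[int]:
    phi = [0] * r                                # Line 1
    M = M0                                       # Line 2
    for label, M_i in steps:                     # Line 3; label None encodes epsilon
        D: List[Optional[int]] = [None] * r      # None = no candidate seen yet
        if label is None:                        # Lines 6-10
            D = [local(M, M_i, j) for j in range(r)]
        else:
            for t in transitions_by_label[label]:                      # Line 12
                M_prev = tuple(x - c for x, c in zip(M_i, effect[t]))  # Line 13
                if not gamma_feasible(M, M_prev):                      # Line 14
                    continue
                for j in range(r):                                     # Lines 15-17
                    d = local(M, M_prev, j)
                    D[j] = d if D[j] is None else otimes(D[j], d)
        # Line 18; a class with no feasible candidate contributes 0 here.
        phi = [max(p, 0 if d is None else d) for p, d in zip(phi, D)]
        M = M_i                                  # Line 19
    return phi


# Toy data shaped like Example 5 (one-place markings, made-up decisions).
effect = {"t1": (1,), "t2": (2,), "t3": (3,), "t4": (4,)}
labels = {"a": ["t1", "t2"], "b": ["t3", "t4"]}
steps = [("a", (10,)), (None, (20,)), ("b", (30,))]
decisions = {
    ((0,), (9,)): [2, 0],    # candidate t1 in Step 1
    ((0,), (8,)): [2, 1],    # candidate t2 in Step 1
    ((10,), (20,)): [0, 1],  # Step 2, no observed label
    ((20,), (27,)): [1, 1],  # candidate t3 in Step 3; t4 has no entry
}
phi = diagnose_labeled_trace(
    (0,), steps, labels, effect, 2,
    gamma_feasible=lambda M, Mp: (M, Mp) in decisions,
    local=lambda M, Mp, j: decisions[(M, Mp)][j],
)
```

In the toy run, Step 1 gives $D = [2 \otimes 2, 0 \otimes 1] = [2, 1]$ , the candidate $t_{4}$ is skipped because its feasibility check fails, and the final vector is the component-wise maximum over the three steps.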

Example 5. Let us consider an LPN $〈 N, M_{0}, E, λ 〉$ with $T = T_{o} \cup T_{u}$ , $T_{o} = {t_{1}, t_{2}, t_{3}, t_{4}}$ , and $E = {a, b}$ . The labels of all unobservable transitions are the empty string and the labels of observable transitions are $λ (t_{1}) = λ (t_{2}) = a$ and $λ (t_{3}) = λ (t_{4}) = b$ (i.e. $T_{a} = {t_{1}, t_{2}}$ and $T_{b} = {t_{3}, t_{4}}$ ). Given a labeled trace $ω = M_{0} a M_{1} ε M_{2} b M_{3}$ , we show how Algorithm 2 works by taking $ω$ as an input. The fault diagnosis process is illustrated by Figure 7.

Figure 7.

An example of fault diagnosis process of Algorithm 2.

For Step 1 “ $a M_{1}$ ” of $ω$ , we consider $t_{1}, t_{2} \in T_{a}$ in turn, as shown in Line 12. In Figure 7 , for both $t_{1}$ and $t_{2}$ , $Γ (M, M')$ admits a solution. The corresponding diagnosis decisions are denoted by $D_{1} = [d_{1}, d_{2}]$ and $D_{2} = [d_{3}, d_{4}]$ , respectively. According to Line 17, the diagnosis decision vector $D$ for Step 1 is obtained by combining $D_{1}$ and $D_{2}$ using the operator ⊗, that is, $D = [d_{1} \otimes d_{3}, d_{2} \otimes d_{4}]$ , as shown in Figure 7 . For Step 2 “ $ε M_{2}$ ,” the condition in Line 6 holds and the diagnosis vector in this step is represented by $D = [d_{5}, d_{6}]$ . Similar to Step 1, the two observable transitions $t_{3}$ and $t_{4}$ with the same label $b$ in Step 3 “ $b M_{3}$ ” are inspected in turn. For $t_{3} \in T_{b}$ , the problem $Γ (M, M')$ has a solution. However, it has no feasible solution for $t_{4} \in T_{b}$ ; this case is marked with the symbol “×” in Figure 7 . Finally, the global diagnosis result vector $Φ$ is obtained by combining all local diagnosis vectors of Steps 1–3 using the max operation in Line 18. ∇

Case study

In this section, we show the application and computational efficiency of the proposed approach by means of a plant example. Let us consider the plant shown in Figure 8, which contains five machines (M1–M5), four robotic arms (R1–R4), two buffers with capacity 4 (B1 and B2), two entrances (I1 and I2), two exits (O1 and O2), and two automatic guided vehicles (AGV1 and AGV2).

Figure 8.

A plant including five machines and four robotic arms.

This plant has two production lines, as shown by the left part and right part of Figure 8, respectively:

In production line 1 (i.e. the left part of Figure 8), robotic arm R1 first takes a raw part from entrance I1 and then loads it into machine M1 or M2. After machine M1 or M2 finishes processing the part, robotic arm R3 unloads the part and puts it into buffer B1. When buffer B1 is not empty, R3 takes a part from B1 and loads it into machine M3. Robotic arm R2 unloads the finished part from M3 and puts it into AGV1, which transfers the part to exit O1.

In production line 2 (i.e. the right part of Figure 8), robotic arm R1 takes a raw part from entrance I2 and loads it into machine M4. Robotic arm R4 unloads the processed part from M4 and puts it into buffer B2. When B2 is not empty, R4 takes a part from B2 and puts it into machine M5. Robotic arm R2 takes the finished part from M5 and puts it into AGV2. Finally, the part is transferred to exit O2 by AGV2.

Each production line contains nine events, as indicated by the numbers 1–9 in Figure 8. The Petri net model of the plant is shown in Figure 9. The nine events of production line 1 and their corresponding transitions in Figure 9 are listed in Table 2. The events of production line 2 and the corresponding transitions can be explained analogously. Apart from the events in each production line, the components (machines, robotic arms, etc.) of the plant shown in Figure 8 are also explicitly characterized by the places of the Petri net model shown in Figure 9, as listed in Table 3.

Figure 9.

A labeled Petri net model of an automated manufacturing system.

Table 2.

Nine events of production line 1 and their corresponding transitions.

Event no.	Transition	Explanation
1	$t_{1}$	R1 takes a part from I1
2	$u_{1}$	R1 puts a part into M1
2′	$t_{2}$	R1 puts a part into M2
3	$t_{3}$	R3 takes a part from M1
3′	$u_{2}$	R3 takes a part from M2
4	$t_{4}$	R3 puts a part into B1
5	$u_{3}$	R3 takes a part from B1
6	$t_{5}$	R3 puts a part into M3
7	$t_{6}$	R2 takes a part from M3
8	$u_{4}$	R2 puts a part into AGV1
9	$t_{7}$	AGV1 transfers a part to O1

Table 3.

Explanations of some places of the net shown in Figure 9.

Place	Explanation	Token(s) in the place
$p_{1}$	entrance I1	number of raw parts in I1
$p_{11}$	entrance I2	number of raw parts in I2
$p_{20}$	machine M1	availability of M1
$p_{21}$	machine M2	availability of M2
$p_{22}$	buffer B1	number of free positions in B1
$p_{23}$	machine M3	availability of M3
$p_{24}$	AGV 1	availability of AGV1
$p_{25}$	machine M4	availability of M4
$p_{26}$	buffer B2	number of free positions in B2
$p_{27}$	machine M5	availability of M5
$p_{28}$	AGV 2	availability of AGV2
$p_{29}$	robotic arm R1	availability of R1
$p_{30}$	robotic arm R2	availability of R2
$p_{31}$	robotic arm R3	availability of R3
$p_{32}$	robotic arm R4	availability of R4

We assume that there are two failures in the plant, which are described by two fault transitions $f_{1}$ and $f_{2}$ in Figure 9, respectively. Transition $f_{1}$ characterizes a failure that a part is stored into buffer B1 before being machined by machine M1 and $f_{2}$ describes that a part is directly processed by machine M5 before being stored into buffer B2.

The labeled Petri net model in Figure 9 has 32 places and 32 transitions, where $T_{o} = {t_{1}, \dots, t_{12}}, T_{u} = {f_{1}, f_{2}, u_{1}, \dots, u_{8}}, T_{f}^{1} = {f_{1}}$ , and $T_{f}^{2} = {f_{2}}$ . The initial marking is $M_{0} = 2 p_{1} + 2 p_{11} + p_{20} + p_{21} + 4 p_{22} + p_{23} + p_{24} + p_{25} + 4 p_{26} + p_{27} + p_{28} + p_{29} + p_{30} + p_{31} + p_{32} .$ Given an alphabet $E = {a, b, c, d, g}$ , the labels of the observable transitions are given by $T_{a} = {t_{1}, t_{8}}$ , $T_{b} = {t_{2}, t_{3}, t_{9}}$ , $T_{c} = {t_{4}, t_{10}}$ , $T_{d} = {t_{5}, t_{6}, t_{11}}$ , and $T_{g} = {t_{7}, t_{12}}$ . In Figure 9, the label of each observable transition is shown in parentheses beside the transition.

In this case study, we compare the computational efficiency of the proposed approach with the ones in Fanti et al.⁸ and Zhu et al.²⁹ by following the steps below:

Step 1: Simulate the operation of the net system shown in Figure 9 and generate an evolution, that is, a sequence that records all reached markings and fired transitions in order.

Step 2: Compute three observations of the evolution generated in Step 1, which correspond to this paper, Fanti et al.,⁸ and Zhu et al.,²⁹ respectively.

Step 3: Run the diagnosis algorithms proposed in this paper, Fanti et al.,⁸ and Zhu et al.,²⁹ respectively, by taking as input the corresponding observation, and record the running time at each step of the observation.

The generated evolution in Step 1 is shown in Figure 10. For this paper, the corresponding observation of the evolution is demonstrated in Figure 11, which is represented as $ω_{1} = M_{0} e_{1} M_{1} e_{2} M_{2} \dots e_{31} M_{31}$ . Meanwhile, for the approaches in Fanti et al.⁸ and Zhu et al.,²⁹ the observation of the evolution is $ω_{2} = aaabcdgdbdgaaabcbcdddgaccddg$ , that is, these two approaches have the same observation. Since the length of $ω_{2}$ is 28 and the approaches in Fanti et al.⁸ and Zhu et al.²⁹ use an online manner to perform diagnosis, we take $ω'_{1} = M_{0} e_{1} M_{1} \dots e_{i} M_{i}$ with $i = 1, \dots, 28$ as an input of Algorithm 2 in turn.

Figure 10.

An evolution recording all reached markings and fired transitions in order.

Figure 11.

The observation of the evolution shown in Figure 10 in this paper.

The running times of the three algorithms at each step of the corresponding observation are reported in Tables 4 and 5. They were measured on a laptop with an Intel Core i7-1165G7 processor and 16 GB of RAM using the Gurobi solver (academic license).⁴⁶ The running times of the three approaches at each step are visualized in Figure 12.

Table 4.

Running times (in seconds) for steps 1–14.

Step index	1	2	3	4	5	6	7	8	9	10	11	12	13	14
This paper	0.6	0.7	0.5	0.4	0.6	0.3	0.3	0.6	0.4	0.4	0.7	0.6	0.7	0.9
Fanti et al.⁸	0.2	0.3	0.6	0.4	0.5	0.8	1.0	1.1	2.0	2.3	2.9	5.1	5.7	7.7
Zhu et al.²⁹	0.5	0.6	0.2	0.1	0.2	0.2	0.3	0.3	0.4	0.5	0.7	0.8	1.1	1.7

Table 5.

Running times (in seconds) for steps 15–28.

Step index	15	16	17	18	19	20	21	22	23	24	25	26	27	28
This paper	0.6	0.7	1.2	0.9	0.9	1.4	0.9	0.8	1.3	1.1	0.7	0.8	0.7	0.8
Fanti et al.⁸	14.1	14.8	20.0	35.8	39.2	54.7	96.3	105.7	148.6	217.5	215.1	213.6	422.2	561.7
Zhu et al.²⁹	2.3	2.4	4.7	4.6	5.9	9.9	11.1	15.7	28.7	40.0	21.8	44.7	111.2	154.4

Figure 12.

Comparison of the approaches in this paper, Fanti et al.,⁸ and Zhu et al.²⁹

We observe that the running times of the approaches in Fanti et al.⁸ and Zhu et al.²⁹ increase significantly as the observation becomes longer. The main reason is that the size of the ILP problems built in Fanti et al.⁸ and Zhu et al.²⁹ grows linearly with the length of an observation. In contrast, the size of the ILP problem constructed in this paper is independent of the length of an observation, as indicated by equations (6) and (7). Thus, the running time of our approach at each step remains roughly the same (between 0.3 and 1.4 s in Tables 4 and 5) and does not grow quickly when the observation becomes longer. In this case study, our approach is more efficient than the ones in Fanti et al.⁸ and Zhu et al.²⁹
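As a worked instance of equations (6) and (7), the size of each ILP problem in the case study can be written out explicitly. The figures assume that the net has exactly the ten unobservable transitions listed in $T_{u}$ and the 32 places stated above:

```latex
I = n_{u} = |\{f_{1}, f_{2}, u_{1}, \dots, u_{8}\}| = 10, \qquad J = m + 1 = 32 + 1 = 33 .
```

Under these assumptions, every problem solved at any of the 28 steps has 10 unknowns and 33 constraints, regardless of how many steps have already been observed.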

Conclusion

In this paper, we solve the fault diagnosis problem using Petri net models. Two fault diagnosis algorithms are developed, one for Petri nets and one for LPNs. In the reported case study, the proposed approach achieves higher computational efficiency than those in Fanti et al.⁸ and Zhu et al.²⁹ and is therefore more suitable for real-world systems with low computing power. Future work is twofold. First, we plan to apply the proposed approach to a real-world system and report its effectiveness. Second, the proposed approach will be extended to stochastic Petri nets by associating a firing probability with each transition.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded in part by the National Natural Science Foundation of China under Grant No. 62103349, in part by the Science and Technology Development Fund, Macau SAR under Grant 0012/2019/A1, and in part by the Young and Middle-aged Scientific Research Basic Ability Promotion Project of Guangxi under Grant No. 2020KY02031.

ORCID iD

Guanghui Zhu

References

1. Cassandras, Lafortune. Introduction to discrete event systems. New York, NY: Springer, 2009.
2. Zad, Kwong, Wonham. Fault diagnosis in discrete-event systems: Framework and model reduction. IEEE Trans Automat Contr 2003; 48(7): 1199–1212.
3. López-Estrada, Rotondo, Valencia-Palomo. A review of convex approaches for control, observation and safety of linear parameter varying and Takagi-Sugeno systems. Processes 2019; 7(11): 814.
4. Witczak. Fault diagnosis and fault-tolerant control strategies for non-linear systems. New York, NY: Springer, 2014.
5. Blanke, Kinnaert, Lunze, et al. Diagnosis and fault-tolerant control. New York, NY: Springer, 2006.
6. Sampath, Sengupta, Lafortune, et al. Diagnosability of discrete-event systems. IEEE Trans Automat Contr 1995; 40(9): 1555–1575.
7. Dotoli, Fanti, Mangini, et al. On-line fault detection in discrete event systems by Petri nets and integer linear programming. Automatica 2009; 45(11): 2665–2672.
8. Fanti, Mangini, Ukovich. Fault detection by labeled Petri nets in centralized and distributed approaches. IEEE Trans Autom Sci Eng 2013; 10(2): 392–404.
9. Cabasino, Giua, Seatzu. Fault detection for discrete event systems using Petri nets with unobservable transitions. Automatica 2010; 46(9): 1531–1539.
10. Zhu, et al. Fault identification of discrete event systems modeled by Petri nets with unobservable transitions. IEEE Trans Syst Man Cybern Syst 2019; 49(2): 333–345.
11. Zhu. Model-based fault identification of discrete event systems using partially observed Petri nets. Automatica 2018; 96: 201–212.
12. Hadjicostis. Fault diagnosis in discrete event systems modeled by partially observed Petri nets. Discrete Event Dyn Syst 2009; 19(4): 551–575.
13. Commault, Dion, Yacoub Agha. Structural analysis for the sensor location problem in fault detection and isolation. Automatica 2008; 44(8): 2074–2080.
14. Doostmohammadian, Rabiee. On the observability and controllability of large-scale IOT networks: reducing number of unmatched nodes via link addition. IEEE Control Syst Lett 2021; 5(5): 1747–1752.
15. Shu, Guo, Yang, et al. Reliability study of motor controller in electric vehicle by the approach of fault tree analysis. Eng Fail Anal 2021; 121: 105165.
16. Vásquez, Pérez-Zuñiga, Sotomayor-Moriano, et al. New concept of safeprocess based on a fault detection methodology: Super alarms. IFAC-PapersOnLine 2019; 52(14): 231–236.
17. Foroozanfar, Doustmohammadi, Nikravesh. Exponential stability of Petri net systems. In: Proceedings of the 6th international conference on electrical engineering ICEENG, Cairo, Egypt, May 2008, pp.1–9. New York: Springer.
18. Lutz-Ley, Lopez-Mellado. Stability analysis of discrete event systems modeled by Petri nets using unfoldings. IEEE Trans Autom Sci Eng 2018; 15(4): 1964–1971.
19. Sampath, Sengupta, Lafortune, et al. Failure diagnosis using discrete-event models. IEEE Trans Control Syst Technol 1996; 4(2): 105–124.
20. Chen, Barkaoui, et al. Compact supervisory control of discrete event systems by Petri nets with data inhibitor arcs. IEEE Trans Syst Man Cybern Syst 2017; 47(2): 364–379.
21. Demongodin, Koussoulas. Differential Petri net models for industrial automation and supervisory control. IEEE Trans Syst Man Cybern Part C 2006; 36(4): 543–553.
22. Giua. Characterization of admissible marking sets in Petri nets with conflicts and synchronizations. IEEE Trans Automat Contr 2017; 62(3): 1329–1341.
23. Ammour, Leclercq, Sanlaville, et al. State estimation of discrete event systems for RUL prediction issue. Int J Prod Res 2017; 55(23): 7040–7057.
24. Yin. Verification of prognosability for labeled Petri nets. IEEE Trans Automat Contr 2018; 63(6): 1828–1834.
25. Giua. Cycle time optimization for deterministic timed weighted marked graphs under infinite server semantics. IEEE Trans Automat Contr 2018; 63(8): 2573–2580.
26. Tong, Seatzu, et al. Verification of state-based opacity using Petri nets. IEEE Trans Automat Contr 2017; 62(6): 2823–2837.
27. Basile, Chiacchio, Coppola. A novel model repair approach of timed discrete-event systems with anomalies. IEEE Trans Autom Sci Eng 2016; 13(4): 1541–1556.
28. Yin, Lafortune. On the decidability and complexity of diagnosability for labeled Petri nets. IEEE Trans Automat Contr 2017; 62(11): 5931–5938.
29. Zhu, Feng, et al. An efficient fault diagnosis approach based on integer linear programming for labeled Petri nets. IEEE Trans Automat Contr 2021; 66(5): 2393–2398.
30. Basile, Chiacchio, De Tommasi. An efficient approach for online diagnosis of discrete event systems. IEEE Trans Automat Contr 2009; 54(4): 748–759.
31. Wang, Zhu. Fault diagnosis of backward conflict-free Petri nets by generalized markings. IEEE Access 2020; 8: 154871–154880.
32. Giua, Seatzu. Fault detection for discrete event systems using Petri nets with unobservable transitions. In: Proceedings of the 44th IEEE conference on decision and control, Seville, Spain, 15–15 December 2005, pp.6323–6328. New York, NY: IEEE.
33. Cabasino, Giua, Pocci, et al. Discrete event diagnosis using labeled Petri nets. An application to manufacturing systems. Control Eng Pract 2011; 19(9): 989–1001.
34. Lefebvre. On-line fault diagnosis with partially observed Petri nets. IEEE Trans Automat Contr 2014; 59(7): 1919–1924.
35. Ramirez-Trevino, Ruiz-Beltran, Rivera-Rangel, et al. Online fault diagnosis of discrete event systems. A Petri net-based approach. IEEE Trans Autom Sci Eng 2007; 4(1): 31–39.
36. Genc, Lafortune. Distributed diagnosis of discrete-event systems using Petri nets. In: Proceedings of international conference on application and theory of Petri nets, Eindhoven, The Netherlands, June 2003, pp.316–336. New York: IEEE.
37. Pencole, Subias. Diagnosability of event patterns in safe labeled time Petri nets: a model-checking approach. IEEE Trans Autom Sci Eng 2022; 19: 1151–1162.
38. Yin, Chen, et al. Robust fault diagnosis of stochastic discrete event systems. IEEE Trans Automat Contr 2019; 64(10): 4237–4244.
39. et al. Diagnosability enforcement in labeled Petri nets using supervisory control. Automatica 2021; 131: 109776.
40. Cabral, Moreira. Synchronous diagnosis of discrete-event systems. IEEE Trans Autom Sci Eng 2020; 17(2): 921–932.
41. Veras MZM, Cabral, Moreira MV. Distributed synchronous diagnosis of discrete event systems modeled as automata. Control Eng Pract 2021; 115: 104892.
42. Al-Ajeli, Parker. Fault diagnosis in labelled Petri nets: a Fourier–Motzkin based approach. Automatica 2021; 132: 109831.
43. Lafortune, Lin, Hadjicostis. On the history of diagnosability and opacity in discrete event systems. Annu Rev Control 2018; 45: 257–266.
44. Giua, Silva. Petri nets and automatic control: a historical perspective. Annu Rev Control 2018; 45: 223–239.
45. Murata. Petri nets: properties, analysis and applications. Proc IEEE 1989; 77(4): 541–580.
46. Gurobi Optimization. Gurobi Optimizer, http://www.gurobi.com/ (2021, accessed 25 October 2021).