Abstract
Faults can have significant negative impacts on the operation and performance of simple and complex dynamic systems. Based on the integration of Bayesian network diagnostic features with Petri net formalism, the existing Bayesian-supported Petri net tool has demonstrated the flexibility of the Petri net approach for diagnosing failure scenarios of a dynamic system. However, the use of the hybrid Petri net approach for condition monitoring and early detection and diagnosis of single and multiple failures in a dynamic system with feedback control loops has yet to be investigated. Thus, this paper presents a methodology to address this research gap using the operation of a water tank level control system as a case study. The method combines the constructed Generalised Stochastic Petri Net (GSPN) model of the system operation with its corresponding fault diagnostic Petri net model, created using the proposed modified Bayesian Stochastic Petri Net (mBSPN) formalism. The GSPN model establishes the causal relationships between the system’s components and/or subsystems. It further identifies deviations in the sensor measurements of the observable process variables characterising the system operation. The information provided by the sensors in the system model is then fed into the mBSPN model to diagnose the root cause of the observed deviations. The results demonstrate the capability of the proposed integrated Petri net methodology for system condition monitoring, early fault detection and diagnosis of single and multiple failures in a dynamic system with feedback control loops.
Introduction
Dynamic systems are systems whose states change over time. They are characterised by complex system topologies comprising many interacting units and equipment, 1 with component-to-component and subsystem-to-subsystem interactions. The high complexity of dynamic systems, including aircraft engines, nuclear reactors and safety critical systems found in many technological industries, means that the failure of one or more components can have significant negative impacts on the operation and overall performance of the system.2–5 One way to detect and diagnose component failures is to use sensors to monitor the operational status of dynamic systems. However, using many sensors increases cost and complicates the system.6–8 Therefore, in addition to using a reasonable number of monitoring sensors, a technique that supports rapid and accurate detection and diagnosis of the root cause of faults in a dynamic system is crucial to prevent consequences that may result in costly incidents such as loss of life, fire outbreaks, explosions and lost production.
Many approaches have been reported in the literature for system reliability modelling and fault diagnosis. 9 Among these approaches, Bayesian Networks (BNs) and Petri Nets (PNs) have gained wide adoption due to their strengths in modelling time, dependency and dynamic operations in detail. For example, a BN uses conditional probabilities and Bayes theorem for modelling failure probability and fault diagnosis for a particular scenario. On the other hand, PNs and their variants, such as Generalised Stochastic Petri Nets (GSPN), are flexible modelling tools used to study the dynamic behaviour of complex systems. Unlike BNs, which can be used in both predictive and diagnostic capacities, 10 PNs lack the ability to model uncertainty directly or to compute the updated probability of a random variable given the observation of other specific variables.11–13 This limits the modelling capacity of PNs mostly to predictive analysis, that is, computing the system failure probability due to the failure of system components, similar to the failure analysis of well-known conventional probabilistic approaches such as fault trees, reliability block diagrams and event trees.
To address the aforementioned limitation of PN for fault diagnosis, some researchers have conducted studies whose primary aim is to enhance the modelling power of PN by incorporating probabilistic modelling features such as reasoning algorithms, conditional probability and Bayes theorem into Petri net formalisms. Among these studies are the works of Chiachío et al.,11,14 Taleb-Berrouane et al.15,16 and Zhou and Reniers, 16 who proposed the Plausible Petri Net (PPN), Bayesian Stochastic Petri Net (BSPN) and Probabilistic Petri Net (PPN) formalisms respectively, for modelling and analysing uncertainty and forward and backward reasoning. Despite the advantages of BNs over PNs for fault diagnosis, many discrete BNs, such as the Discrete Dynamic BN (DDBN) proposed for diagnosing faults in complex dynamic systems, use the time-slice discretisation approach. However, this approach negatively affects the computing time and accuracy of the fault diagnostic technique. To eliminate the need for time discretisation, a Generalised Continuous-Time BN (GCTBN) incorporating a continuous model of time was proposed by Codetta-Raiteri and Portinale. 17 The GCTBN combined the strong features of both BNs and PNs to improve the performance of the fault diagnostic process. 18
Andrews and Fecarotti 19 combined the PN and BN approaches in a method they named BP-Net to investigate the safety effects of design and maintenance features on the performance of a remote unmanned wellhead platform operating over its design life. BP-Net is a model-to-model framework in which a PN is used to populate the conditional probability tables (CPT) of the BN model of the system. The hybrid formalism overcomes limitations of conventional safety and risk analysis techniques, such as constant failure rates and the requirement to assume independence among the system component failures. Taleb-Berrouane et al. 15 also proposed a hybrid formalism known as Bayesian Stochastic Petri Net (BSPN) to enhance the modelling capabilities of Stochastic PN (SPN) by integrating the continuous data updating capabilities of BN with SPN. The efficiency of the BSPN formalism for fault diagnosis was examined on a simple system with one failure scenario (pump failure). Further work is required to investigate how the modelling tool could be extended for effective fault diagnosis of large-scale and complex systems characterised by feedback control loops and multiple failure scenarios.20,21 In dynamic systems whose operational performance is monitored and controlled by a supervisory control system (sensors, controllers and actuators), many component failures can lead to different unexpected system behaviours while the system is in operation. Moreover, the presence of feedback control loops adds further complexity to the operational behaviour of such systems. Thus, this paper proposes a novel Petri net methodology suitable for condition monitoring and early detection and diagnosis of single and multiple failure scenarios in a complex dynamic system.
The methodology is based on the fusion of a Generalised Stochastic Petri Net (GSPN) model of the system operation with its fault diagnostic Petri net model, developed using the proposed modified Bayesian Stochastic Petri Net (mBSPN) approach. The modelling capability of the GSPN-mBSPN methodology was examined on a water tank level system whose operating conditions are monitored and controlled by feedback control loops located at some sections of the system. The contributions of this paper can be summarised as follows:
(i) An integrated system operation and fault diagnostic Petri net methodology for single and multiple faults diagnosis in a dynamic system with control loops is proposed.
(ii) New Petri net modelling features are proposed to further improve the modelling power of the Generalised Stochastic Petri Net and Bayesian Stochastic Petri Net formalisms for studying the reliability and fault diagnosis of complex dynamic systems with feedback control loops.
(iii) The results show the accuracy of using the proposed integrated Petri net methodology (GSPN-mBSPN) for single and multiple faults diagnosis in a complex dynamic system.
The remainder of the paper is structured as follows. Section 2 presents details of the methodology for the condition monitoring and early detection and diagnosis of failures in a dynamically controlled system. The steps of the methodology, such as the integration of the system operation and fault diagnostic models and the Monte Carlo simulation output analysis of the integrated model, are described in detail. The application of the proposed methodology to a water tank level control system is discussed in Section 3. Lastly, Section 4 gives the conclusion and outlines future work.
Proposed methodology
Figure 1 depicts the main steps of the Petri net methodology for the condition monitoring and early detection and diagnosis of faults in a dynamic system proposed in this paper. Each step is described in detail in the following sub-sections.

Proposed methodological framework for fault detection and diagnosis of dynamic systems.
Step 1: System description and analysis
The system is studied to identify its structure, functional behaviour, operating states/modes, information about the states/modes (working and failure) and the failure data of components that make up the system for the purpose of defining actions and interactions among the system components, including the monitoring components (sensors) deployed on the system.
Step 2: System operational and fault diagnostic PN module construction
When modelling complex dynamic systems, sectional and component-based approaches are often employed, as described in the works of Bartlett et al.22,23 and Remenyte-Prescott and Andrews. 23 A typical illustration of such an approach is depicted in Figure 2. The system is first divided into sections/subsystems comprised of two or more components such that each one of the sections only affects a single system process variable (e.g. flow or level). A component PN model describing the component’s normal working and failure state/mode(s) is developed for each component that makes up the section. After that, an operational propagation PN model is constructed to establish an input-output dependency structure among connected components in a section. The dependency model structures are developed considering the normal working and failure state/modes of the connected components in the section. All the section models are combined to produce an overall system operation model. Within the system’s operational GSPN model, sensor observation and fault detection PN models are developed to detect faults or anomalies while the system is in operation. In the event of unexpected system behaviour, a fault diagnostic PN module based on modified BSPN (mBSPN) would be triggered to analyse the cause of any abnormality observed in the system.

Framework for constructing system operational PN model for a dynamic system.
Overview of the GSPN-mBSPN approach
The conventional GSPN formalism 24 is an extension of a Stochastic Petri Net with ‘inhibitor arcs’ and ‘immediate transitions’. The formal definitions and the descriptions of the elements of the GSPN formalism used in this paper are given in the work of Nourredine et al. 25 Despite the usefulness of the GSPN formalism and other extended PN features proposed in the literature,9,14,20,26 such as different arc and place types and their corresponding transition types and firing rules, they have limitations in modelling the condition monitoring, early fault detection and diagnostic processes of a dynamic system with feedback control loops. To address this limitation, additional PN features such as conditional output reset place (CORP), arc (COA), transition (CRT), conditional probabilistic transitions (CPT) and new firing rules are introduced to extend the standard GSPN approach.
In addition, BSPN is one of the new PN extensions proposed by Taleb-Berrouane et al. 15 It is a Stochastic Petri Net model extended with Bayesian Network (BN) features such as conditional probabilities, Bayes theorem and data updating features. BSPN supports the use of the PN approach for fault diagnosis. Details on the methodology for converting a BN graph to a BSPN model can be found in the literature. 15 Motivated by the work of Taleb-Berrouane et al., 15 the primary aim of this paper is to propose an integrated Petri net method for the detection and diagnosis of abnormalities or faults in a dynamic system with feedback control loops by incorporating inference sampling algorithms of a Bayesian network in a GSPN system simulation model. To achieve this aim, new probabilistic transition types (Independent and Dependent Non-Observable and Observable Conditional Probabilistic Transitions; ICPT, DCPT and DCPT*), firing rule, places and arc types (evidence places and arcs) are further proposed in this paper. These new features make it possible to use an integrated Petri net model based on GSPN and modified BSPN approaches for condition monitoring, fault detection and diagnosis of a dynamic system with feedback controls during the system operation. Table 1 shows the graphical representations and descriptions of the proposed new PN features together with the existing Petri net symbols essential for modelling operational behaviour and fault diagnostic processes of a dynamic system with feedback control loops using GSPN-mBSPN approach.
Description of standard and new Petri net symbols.
The new Petri net modelling elements features.
Formal definition of the GSPN-mBSPN approach
A formal definition of the GSPN-mBSPN methodology is presented in definition 1, while the enabling and firing rules of the new transition types in a GSPN-mBSPN are given in definitions 2 and 3.
[Definition 1: the enumerated items (1 to 10) of the formal GSPN-mBSPN definition are missing from the source text.]
Enabling and firing rules of transitions in the GSPN-mBSPN
The enabling and firing rules of the new transition types in the GSPN-mBSPN are given in definitions 2 and 3.

A generalised structure of a fault detection module of a GSPN-mBSPN.
Places pobi, psys, pmp and pss in Figure 3 symbolise, respectively, the time interval of sensor observation, the system state, the abnormal state of the monitoring parameter and the state of the sensor related to pmp. Compared with the CRT transition, the firing rule of a CPT transition is more complex because it depends on the type of the CPT transition and the selected inference sampling algorithm. The generalised structures of the CPT transitions in the fault diagnostic module (mBSPN) of a GSPN-mBSPN are depicted in Figure 4, and the firing rule for the different types of CPT transitions in the mBSPN module is given by the pseudo-code in Algorithm 1. As shown in Figure 4, suppose t2 is a distinct operational process that depends on the states of a component represented by the CPT transition t1. Then, the causal places

Generalised structure of: (a) independent conditional probabilistic transition, (b) dependent non-observable conditional probabilistic transition and (c) dependent observable conditional probabilistic transition.
Illustrative example of the fault diagnostic process of the GSPN-mBSPN methodology
A simple system is used to illustrate how the fault diagnostic process of the proposed GSPN-mBSPN methodology works. The system has three components: a valve, an operator and a flow sensor. Figure 5 shows the Bayesian network graph of this system. The valve can be in three states: normal, stuck open or stuck closed. The operator can perform two actions: open or close the valve. The flow sensor measures the flow rate through the valve. The probability distributions for the states of the components are given as marginal probability tables (MPT) or input conditional probability tables (iCPT), as depicted in Figure 5. Based on the generalised structure of a conditional probabilistic transition (CPT), the mBSPN equivalent of the BN graph of Figure 5 is shown in Figure 6.

A Bayesian network graph of a simple system. 29

mBSPN model of the BN graph in Figure 5.
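To make the diagnostic inference on such a three-node network concrete, the following is a minimal sketch of rejection sampling in Python. It is not the paper's Algorithm 1 or its C++ implementation. The network structure (valve state and operator action as parents of the sensor reading), all probability values and all function names are assumptions made for illustration only, since the MPT and iCPT entries of Figure 5 are not reproduced in the text.

```python
import random

# Hypothetical marginal tables; not the values from Figure 5.
P_VALVE = {"normal": 0.90, "stuck_open": 0.05, "stuck_closed": 0.05}
P_OPERATOR = {"open": 0.5, "close": 0.5}


def p_flow(valve, action):
    """Assumed P(sensor reads 'flow' | valve state, operator action)."""
    if valve == "stuck_open":
        return 0.95
    if valve == "stuck_closed":
        return 0.05
    return 0.95 if action == "open" else 0.05


def draw(table, rng):
    """Sample one state from a {state: probability} table."""
    states = list(table)
    return rng.choices(states, weights=[table[s] for s in states])[0]


def rejection_sample(evidence, n_samples, seed=0):
    """Estimate P(valve | evidence) by discarding samples that contradict the evidence."""
    rng = random.Random(seed)
    counts = {state: 0 for state in P_VALVE}
    kept = 0
    for _ in range(n_samples):
        valve = draw(P_VALVE, rng)
        action = draw(P_OPERATOR, rng)
        reading = "flow" if rng.random() < p_flow(valve, action) else "no_flow"
        sample = {"operator": action, "sensor": reading}
        if all(sample[name] == value for name, value in evidence.items()):
            kept += 1
            counts[valve] += 1
    if kept == 0:
        raise ValueError("no sample matched the evidence")
    return {state: count / kept for state, count in counts.items()}


# Evidence: the operator closed the valve, yet the sensor still reports flow.
posterior = rejection_sample({"operator": "close", "sensor": "flow"}, 200_000)
```

With these assumed tables, the observation of flow after a close command shifts most of the posterior mass away from the stuck-closed state, which is the kind of backward reasoning the observable CPT transitions are meant to support.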
When a fault occurs in the system, places
Subsequently, Compute_Rejection_Sampling_One() checks if the firing CPT transition has evidence using the function Has_Evidence(). If it is an observable CPT transition, evidence is set for it based on its evidence place,
Steps 3 and 4: Monte Carlo simulation analysis and results of a GSPN-mBSPN model
A GSPN-mBSPN model can be analysed using Monte Carlo simulation (MCS). Several simulation runs are needed to evaluate the performance metrics of a GSPN-mBSPN simulation model. During each simulation run, the marking of the places and the number of times each transition fires are recorded at each time step. These recorded values are used to calculate performance metrics such as average failure, reliability and posterior probabilities of components at each time step over all simulation runs. Figure 7 depicts a flowchart for performing a time-stepped MCS analysis of the GSPN-mBSPN integrated model proposed in this paper. The embedded functions tagged A, B and C are self-contained processes within the simulation. Owing to space limitations, the flowcharts of these processes are omitted from this paper. The flowchart can be implemented in a programming language such as C++, which is adopted in this paper.
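The averaging over runs described above can be illustrated with a minimal Python sketch for a single component. This is not the flowchart of Figure 7: the exponential time-to-failure law, the failure rate, the number of runs and the function names are all assumptions chosen for illustration.

```python
import random


def simulate_run(failure_rate, n_steps, dt, rng):
    """One run: return a 0/1 flag per time step, 1 once the component has failed."""
    time_to_failure = rng.expovariate(failure_rate)  # assumed exponential failure law
    return [1 if (step + 1) * dt >= time_to_failure else 0 for step in range(n_steps)]


def monte_carlo(failure_rate, n_steps, dt, n_runs, seed=0):
    """Average the recorded per-step failure flags over all runs."""
    rng = random.Random(seed)
    totals = [0] * n_steps
    for _ in range(n_runs):
        for step, failed in enumerate(simulate_run(failure_rate, n_steps, dt, rng)):
            totals[step] += failed
    unreliability = [total / n_runs for total in totals]
    reliability = [1.0 - u for u in unreliability]
    return unreliability, reliability


# Hypothetical settings: 400 steps of 0.25 s (100 s in total), 5000 runs.
unreliability, reliability = monte_carlo(failure_rate=0.01, n_steps=400, dt=0.25, n_runs=5000)
```

In a full model the per-step record would hold the marking of every place rather than a single flag, but the averaging step is the same.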

Flowchart of the Monte Carlo simulation analysis of a GSPN-mBSPN model.
Application of the Proposed Methodology
The system description
Figure 8 shows a diagram of a simple water tank level control system used to test the methodology proposed in Section 2 of this paper. This system was taken from the work of Hurdle et al. 30 To diagnose the water tank system for any abnormality, the faults that could occur in each of its components need to be defined. Table 2 lists the possible states of the system components, including their failure states/modes. The water tank system can operate in either of two modes: active or dormant. In the active mode, which is the mode considered in this paper, valve V2 is opened manually to draw water from the tank, and valve V1 is opened by controller C1 to replenish the tank if the water level detected by sensor S1 falls below the required level and closed once the required level has been reached. Valve V3, on the other hand, is opened only automatically, by controller C2, if sensor S2 detects a critical water level in the tank that can cause overflow. A more detailed description of this system can be found in the published articles by Hurdle et al. 30 and Lampis and Andrews. 31
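The fault-free behaviour of the two control loops described above can be summarised in a short Python sketch. The function name and the numeric thresholds in the usage lines are hypothetical; the text does not state the required or critical level values.

```python
def controller_commands(level, required_level, critical_level):
    """Return the (V1, V3) commands for a measured tank level, all components working."""
    # C1 acts on the reading of sensor S1: refill below the required level.
    v1 = "open" if level < required_level else "close"
    # C2 acts on the reading of sensor S2: relieve the tank at the critical level.
    v3 = "open" if level >= critical_level else "close"
    return v1, v3


# Hypothetical thresholds in metres, for illustration only.
low_case = controller_commands(1.2, required_level=1.6, critical_level=1.8)
high_case = controller_commands(1.85, required_level=1.6, critical_level=1.8)
```

A component failure mode then corresponds to one of these outputs being replaced by a fixed, spurious value, which is what the fault diagnostic module has to explain.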

Schematic diagram of the water tank level control system. 30
States and failure modes of the system components.
Developing the GSPN module of a GSPN-mBSPN model of a dynamic system
A GSPN-mBSPN model of a dynamic system, in this case the water tank level control system, is created based on the type of the modelling elements (places and transitions) and the arcs connecting them. Timed transition(s) with input and output places of type ‘component’ and without test/inhibitor places form a system component GSPN module. Besides, immediate transitions with common input and output places of type ‘component’ and test/inhibitor places of type ‘normal’ form a GSPN module for a system monitoring parameter. However, immediate transitions having common input and output places of type ‘component’ and test/inhibitor places equivalent to the input and output places of a lower-level GSPN module form an interconnection GSPN module for the propagation of a process variable. The input and output places of a GSPN module represent either the states of a component, monitoring parameter or propagated variable in the GSPN-mBSPN model of the water tank system. An incremental hierarchical approach was used to build the whole GSPN-mBSPN model. However, due to the mutual inter-dependencies between GSPN modules of the component, system monitoring parameter and propagated variables, the Petri net models were drawn using the following pattern filled notations for shared places similar to the colour-coded notations for shared places used by Boussif and Ghazel. 32 The subsequent section describes the GSPN modules developed for the various components and subsystems of the water tank level control system.
The system operating mode Petri Net model
The Petri net module for the operating mode of the water tank system is depicted in Figure 9 and represents part of the initial conditions required before a simulation can begin. A token in place p1 means that an operator is present and attempting to manually open valve V2 to demand water from the tank. The system then changes from the dormant state (place p2) to the active state (place p3). Conversely, if no operator is present to demand water from the tank (no token in place p1), the system switches from the active to the dormant state.

The system operating mode Petri net module.
The component Petri Net model
Following the system modelling steps discussed in step 2 of section 2, Figure 10 depicts the Petri net models for the working and failure states/modes of system components with single (e.g. pipes) and multiple (e.g. sensors, controllers and valves) failure states. Using the generalised component Petri net constructions depicted in Figure 10, the Petri net models for the components of the water tank system, including the states of an operator, comprise 37 component state/mode places and 22 failure state/mode transitions. For example, in Figure 10(b), a token in place pw means that a component is in its normal working state, and the movement of the token from place pw to pfm1 when transition tttf fires after time ttf1 signifies the change in the component state from working to failure mode 1. At the beginning of the system operation, it is assumed that all components are working properly, with valves V1, V2 and V3 in their closed states and the other components in their normal working states. In addition, constant failure times are assumed for all the components.

Petri net models for single (a) and multiple (b) operational states components.
The process variable (water level) state changes PN
In this paper, discrete places are used to represent the states/conditions of the modelling entities, whereas the operation of the water tank level control system involves both discrete (component states) and continuous (process variable states) variables. Thus, there is a need to develop a Petri net model that represents the continuous process variable states in discrete form, based on the readings of the level sensors monitoring the system process variable (water level in the tank). Figure 11 shows a generalised Petri net structure for the system process variable state changes. One token in the place ‘Tank portion with water’ represents a unit volume of water in the tank. One token in the place ‘Tank portion filled with air’ represents a free space of unit volume in the tank not occupied by water. Places p1 to pn correspond to the discretised states of a system process variable. The switching between the states of a system process variable is represented by the transitions t1–tn and is governed by the weights (a1–an) of the test arcs between the places ‘Tank portion with water’ and ‘Tank portion filled with air’ and the transitions t1–tn. To construct the Petri net model for the system tank level states, the following assumptions were made:
i. The height of the tank is assumed to be equal to 2 m, and its cross-sectional area equal to 3.1415 m².
ii. It is assumed that the flow rate in section 1 is the same as the flow rate in section 2 of the system, but the flow rate in section 3 is twice the flow rate in section 1. The flow rate in section 1 is assumed to be Q = 0.006283 m³/s.
iii. It is assumed that overflow starts when the water level in the tank rises above 95% of the tank height (i.e. above 95% of 2 m = 1.9 m).
iv. The initial water level in the tank is assumed to be in the normal range and equals 80% of the tank height (i.e. 80% of 2 m = 1.6 m).
v. The level of water in the tank is discretised as empty (TLE), low (TLL), normal (TLN), high (TLH), very high (TLVH) and full (TLF).
vi. It is assumed that at every time step of 0.25 s, the increase or decrease in the tank level or volume is equal to 0.0005 m or 0.00157075 m³, which in turn is equivalent to a 1 token increase or decrease in the number of tokens in a particular tank level place. Thus, for the tank level state discretisation Petri net model, the configurations depicted in Table 3 were used to increase and decrease the water level in the tank.
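The token arithmetic implied by assumptions i to vi can be checked with a few lines of Python. The per-token volume and height reproduce the figures stated in assumption vi; the token counts for the full tank, the initial level and the overflow threshold are derived here from the stated assumptions and are not quoted from Table 3. The variable names are illustrative.

```python
TANK_HEIGHT = 2.0      # m, assumption i
TANK_AREA = 3.1415     # m^2, assumption i
FLOW_RATE = 0.006283   # m^3/s in section 1, assumption ii
TIME_STEP = 0.25       # s, assumption vi

# Volume and level change carried by one token per time step.
volume_per_token = FLOW_RATE * TIME_STEP           # 0.00157075 m^3
height_per_token = volume_per_token / TANK_AREA    # 0.0005 m

# Derived token counts (not taken from Table 3).
total_tokens = round(TANK_HEIGHT / height_per_token)
initial_water_tokens = round(0.80 * total_tokens)  # assumption iv
initial_air_tokens = total_tokens - initial_water_tokens
overflow_tokens = round(0.95 * total_tokens)       # assumption iii
```

Under these assumptions a full tank corresponds to 4000 tokens, the initial marking to 3200 tokens of water and 800 of air, and overflow begins once the water place holds more than 3800 tokens.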

The generalised Petri net structure for the system process variable state changes.
Tank level discretisation configurations.
Using the generalised Petri net structure for the system process variable state changes depicted in Figure 11, the Petri net module for the water level state changes of the water tank system comprises eight places (two auxiliary places for the system process variable states and six discretised tank level state places) and ten transitions for switching between the discretised tank level states.
The operational states changes PN
The operational states changes models are developed by considering the working or failure modes of a component and the states of the inputs connected to the component (e.g. the states of the immediate upstream components or a process variable status change). Figures 12 to 14 show the generalised Petri net modules for the operational states changes of a level sensor, controller and valve in the water tank system. The PN modules in Figure 12 are developed by considering the state of the water level in the tank and the working or failure modes of the sensor monitoring the water level. The presence of a token in either place p1, p2 or p3 and in the two shared places on the left-hand side of Figure 12 means that a sensor (e.g. S1) has not failed and reads the water level corresponding to the actual tank level state. If a level sensor fails (e.g. S1 failed low, S1FL), the sensor will ignore the true level of water in the tank and produce a reading corresponding to its current failure mode, as depicted by the Petri net structure on the right-hand side of Figure 12.

Working and failure operational state change PN models of a sensor.
A token in either place p1 or p2 in Figure 13 means that a controller (e.g. C1) is sending a command signal (open or close, depending on the operating state of the sensor and the state of the controller) to a valve (e.g. V1). If the controller fails (e.g. C1 failed high, C1FH), it will ignore the actual command and send a spurious command to the valve based on its failure mode, as depicted by the Petri net structure on the right-hand side of Figure 13. Likewise, a token in either place p1 or p2 in Figure 14 means that a valve (e.g. V1) is in an operating mode (e.g. open or close, depending on the command received from the controller and the current mode of the valve). If the valve has failed (e.g. V1 failed close, V1FC), it will ignore the controller’s actual command and will either spuriously switch from the current operating mode to another or remain in the current mode, depending on the failure mode that occurred (i.e. firing of either transition t3 or t4 in Figure 14). The Petri net models for the remaining level sensor (S2), controller (C2) and valves (V2 and V3) in the water tank system have structures similar to those depicted in Figures 12 to 14, respectively, and are therefore omitted from this paper. Based on the generalised PN structures depicted in Figures 12 to 14, the Petri net models for the operational state changes of the sensors, controllers and valves in the water tank system comprise 16 operating state places and 40 operating state change transitions.

Working and failure operational state change PN models of a controller.

Working and failure operational state change PN model of a valve.
The flow propagation PN
Figure 15 shows the generalised PN structure for flow propagation, for example, the inflow of water through one of the flow propagation components (i.e. pipeline P1) in section 1 of the water tank system. The model is developed by considering the working and failure states of the component and the states/operational modes of the immediate upstream component(s) connected to it. Places are created to represent all possible states of the material flowing through the component (e.g. flow or no flow of water), and immediate transitions are created for switching between these places depending on the state of the component and/or the state/mode of the immediate upstream components connected to each transition. For example, in Figure 15, the shared place representing the state of the current component (e.g. pipe P1) and the shared place denoting the state of the immediate upstream component (e.g. water in the main supply) are connected to transition t1. A token in each of the shared places and in place p1 enables transition t1. When t1 fires, the status of the current component changes from propagation variable state 1 (e.g. no flow) to state 2 (e.g. flow, place p2). The same Petri net structure and procedure are employed for the flow of the process variable (water) through the remaining pipes and valves in the water tank system. Based on the generalised PN structure depicted in Figure 15, the Petri net models for the flow propagation in the water tank system comprise 18 flow state places and 27 flow state change transitions describing flows in the input and output flow sections of the water tank system.
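The enabling and firing of transition t1 described above can be sketched in Python as follows. The sketch assumes that the shared places are connected to t1 by test arcs, so their tokens are checked but not consumed; the place names and function names are illustrative and do not come from Figure 15.

```python
def enabled(marking, inputs, tests):
    """An immediate transition is enabled when every input and test place holds a token."""
    return all(marking.get(place, 0) >= 1 for place in inputs + tests)


def fire(marking, inputs, outputs, tests):
    """Fire the transition: consume input tokens, leave test places untouched, fill outputs."""
    if not enabled(marking, inputs, tests):
        return marking
    new_marking = dict(marking)
    for place in inputs:
        new_marking[place] -= 1
    for place in outputs:
        new_marking[place] = new_marking.get(place, 0) + 1
    return new_marking


# Pipe P1 is working and the main supply has water, so 'no flow' switches to 'flow'.
marking = {"p1_no_flow": 1, "p2_flow": 0, "P1_working": 1, "supply_has_water": 1}
after = fire(marking, inputs=["p1_no_flow"], outputs=["p2_flow"],
             tests=["P1_working", "supply_has_water"])

# With the pipe failed, the transition is not enabled and the marking is unchanged.
failed = {"p1_no_flow": 1, "p2_flow": 0, "P1_working": 0, "supply_has_water": 1}
blocked = fire(failed, inputs=["p1_no_flow"], outputs=["p2_flow"],
               tests=["P1_working", "supply_has_water"])
```

Keeping the shared places on test arcs lets the same component state place condition several propagation transitions without being emptied by any of them.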

The flow propagation PN model at a section of a system.
Petri Net models for a system process variable (e.g. tank level) state updating and the section 4 (overspill tray) of the water tank system
The first three Petri net modules in Figure 16 depict generalised PN models of the flow of a process variable (e.g. water) into a monitored process system (e.g. flow of water into the tank through the inlet pipe P2: the test place to transition t1 when transition t2 fires) or out of the system through the auxiliary downstream flow propagation components (e.g. flow of water out of the tank through the normal outlet pipe P4: the test place to transition t3, and the safety outlet pipe P6: the test place to transition t5, when transitions t4 and t7 fire, respectively) connected to flow control components (e.g. valves V1, V2 and V3). Transition t6 in the third Petri net module in Figure 16 will fire if there is still at least one token in place p2 and no process variable is left in the process system to flow out of the safety out-flow section of the system. The first three Petri net models in Figure 16 are constructed such that the expected amount of a process variable to flow into and out of a monitored process system at each time step is first determined (token(s) in places p1, p2 and p3) based on the cross-sectional areas of the inlet and the auxiliary outlet flow propagation components. For the case study system, a unit of water (1 token), based on the water flow rate (0.00157075 m³ per 0.25 s), flows into or out of the tank depending on whether there is free space in the tank (at least one token in place ‘Tank portion filled with air’ of the system process variable state changes PN module) or water is left in the tank (at least one token in place ‘Tank portion with water’ of the same module). However, if deterministic transitions t1, t3 and t5 do not fire at the current time step, there is no possibility of water flowing via the auxiliary flow propagation components (i.e. in-flow via pipe P2, and out-flows via pipes P4 and P6).
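The per-time-step update of the two tank places can be sketched in Python as below. The order in which in-flow and out-flow are applied, the function name and the `safety_tokens` parameter are assumptions; the default of two tokens for the safety outlet assumes that the safety out-flow section is the section whose flow rate is twice that of section 1.

```python
def update_tank(water, air, inflow, normal_outflow, safety_outflow, safety_tokens=2):
    """Advance the 'water' and 'air' tank places by one time step."""
    if inflow and air >= 1:
        # Water enters only if there is free space in the tank.
        water, air = water + 1, air - 1
    if normal_outflow and water >= 1:
        # Water leaves only if some is left in the tank.
        water, air = water - 1, air + 1
    if safety_outflow:
        # Assumed larger flow through the safety outlet, limited by the water available.
        moved = min(safety_tokens, water)
        water, air = water - moved, air + moved
    return water, air


filling = update_tank(3200, 800, inflow=True, normal_outflow=False, safety_outflow=False)
full = update_tank(4000, 0, inflow=True, normal_outflow=False, safety_outflow=False)
```

The total number of tokens in the two places stays constant, which mirrors the fixed tank volume.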

PN models for the process variable state updating and overspill tray.
The Petri net model at the bottom right of Figure 16 models the presence of water in the overspill tray as a result of a fault that caused an overflow. The transition t8 will fire if a process variable is not leaving the monitored process system through the auxiliary outlet flow propagation components (e.g. no out-flow via pipes P4 and P5 of the water tank system when no token is present in both the flow states shared places from the operational state changes PN modules of pipes P4 and P5), and the shared place ‘state of a monitoring component in an in-flow section’ from the system process variable state updating PN model has a token while the system is already filled with the process variable (i.e. a token in a shared place from the system process variable state changes PN model denoting the full state of the process system). Thus, the firing of transition t8 implies that an overflow has occurred, and place p4 will store the level of the spilled process variable (e.g. water) contained in the tray underneath the tank. With the generalised PN modules depicted in Figure 16, the Petri net models for the system process variable (tank level) state updating and section 4 (overspill tray) of the water tank system comprise four places (three places monitoring the water flow state and one place monitoring the overspilled water level) and eight transitions (three deterministic transitions for computing the possible flow rates, three immediate transitions for tank level state updating, one auxiliary immediate transition and one immediate transition for the occurrence of overflow).
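The token bookkeeping described above can be illustrated with a minimal Python sketch. It is not the authors' C++ implementation: the class `TankMarking`, the function `step`, the boolean flags standing in for the flow-state places, and the order in which out-flow and in-flow are applied within a step are all assumptions made for illustration. Only the unit volume per token and the place names in the comments come from the text.

```python
from dataclasses import dataclass

# One token is one unit of water moved per 0.25 s time step (value from the text).
UNIT_VOLUME_M3 = 0.00157075


@dataclass(frozen=True)
class TankMarking:
    """Hypothetical marking of the level-updating places."""
    water: int     # tokens in 'Tank portion with water'
    air: int       # tokens in 'Tank portion filled with air'
    tray: int = 0  # tokens in p4 (water collected in the overspill tray)


def step(m, inflow_open, normal_out_open, safety_out_open):
    """Advance the marking by one time step (assumed order: out-flows, then in-flow)."""
    water, air, tray = m.water, m.air, m.tray
    # Each open outlet removes one token, provided water is left in the tank.
    for outlet_open in (normal_out_open, safety_out_open):
        if outlet_open and water > 0:
            water -= 1
            air += 1
    if inflow_open:
        if air > 0:
            # Free space available: one unit of water enters the tank.
            water += 1
            air -= 1
        elif not normal_out_open and not safety_out_open:
            # Tank full, in-flow present, no out-flow: overflow (analogue of t8).
            tray += 1
    return TankMarking(water, air, tray)


if __name__ == "__main__":
    full = TankMarking(water=4, air=0)
    after = step(full, inflow_open=True, normal_out_open=False, safety_out_open=False)
    print(after, "spilled volume [m3]:", after.tray * UNIT_VOLUME_M3)
```

With balanced in-flow and normal out-flow the marking is unchanged from step to step, which mirrors the constant tank level reported for normal operation.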
Sensor readings observation and fault detection Petri Net models
The Petri net module for the observation operation is used to reveal the actual condition of the sensor readings at each section of the system. To continuously monitor failures in the system, the sensor readings observation Petri net modules are executed at every time step. Figure 17 shows an example Petri net structure for the observation operation. The time step observation is carried out using the loop p1-t1-p2-t2. Transition t1 is a deterministic transition with an observation time interval equal to the time step (0.25 s). Following the development of the sensor reading observation PN model, the fault detection PN models for the states of the observable monitoring components (flow and level sensors) in the system are developed using the generalised PN structure in Figure 3, which was described in the sub-section on the formal definition of the GSPN-mBSPN approach in section 2. In summary, the fault detection Petri net modules for the water tank system comprise 4 evidence places, 36 conditional output places and 12 conditional reset transitions (6 each for the system active and dormant operational modes).
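As a rough illustration of the periodic observation and the comparison of sensor states against their expected values, the following Python sketch may help. The function names, the dictionary keys and the comparison logic are assumptions, not the structure of the PN modules in Figures 3 and 17; the state labels (CF, F, NF, NW) are those used for the active-mode behaviour later in this section.

```python
TIME_STEP_S = 0.25  # observation interval of deterministic transition t1 (from the text)


def observation_times(horizon_s, step_s=TIME_STEP_S):
    """Return the instants at which the observation loop samples the sensors."""
    n_steps = int(round(horizon_s / step_s))
    return [k * step_s for k in range(1, n_steps + 1)]


def detect_deviations(observed, expected):
    """Return {sensor: (observed_state, expected_state)} for every mismatch."""
    return {
        name: (observed.get(name), state)
        for name, state in expected.items()
        if observed.get(name) != state
    }


if __name__ == "__main__":
    # Hypothetical sensor names; expected states for the active mode.
    expected = {"section1_flow": "CF", "section2_flow": "F",
                "section3_flow": "NF", "tray": "NW"}
    observed = dict(expected, section1_flow="NF")  # e.g. no flow seen in section 1
    print(len(observation_times(60.0)), "observations in the first 60 s")
    print(detect_deviations(observed, expected))
```

Any non-empty result would then be passed as evidence to the diagnostic model.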

The sensor observation PN model.
The developed modified BSPN fault diagnostic PN model of the water tank system
The first part of the proposed GSPN-mBSPN fault detection and diagnostic Petri net methodology for a dynamic system entails the development of the system’s behavioural model (i.e. the GSPN module of the GSPN-mBSPN approach). This has been presented in the first sub-section of this section for a simple case study (water tank level control system). The second part of the method involves the development of the fault diagnostic model (i.e. the mBSPN module of the GSPN-mBSPN approach) for the case study system. The mBSPN model is used to obtain the components most likely to be responsible for the observed faults/abnormalities in the operation of the system. All the transitions in the developed mBSPN module are conditional probabilistic transitions (CPTs), developed using the generalised structures of the different types of CPT shown in Figure 4 under the sub-section on the formal definition of the GSPN-mBSPN approach in section 2. The parameters characterising a CPT in the mBSPN module depend on its type. The independent CPTs are characterised by the prior failure probabilities of the states of system components. The dependent non-observable and observable CPTs, on the other hand, are described by input conditional probability tables (iCPTs). The prior failure probabilities of the independent CPTs in all the system sections are listed in Table 4. The iCPTs for the dependent non-observable and observable CPTs are determined from the logic gates (AND, OR and NOT) representing the dependency structures of the system operational model. As stated in section 2 of this paper, the firing rules of the CPTs are governed by the approximate inference algorithm selected before the model simulation. The developed mBSPN module of the water tank system contains 85 conditional probabilistic places and 36 CPTs.
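How a logic gate translates into an iCPT can be sketched as follows. This is a minimal Python illustration under the assumption that every gate is deterministic and every variable is binary (True meaning the failure state is present); the function `gate_icpt` and the table layout are hypothetical and are not taken from the paper.

```python
from itertools import product

# Deterministic truth functions for the three gate types named in the text.
GATES = {
    "AND": lambda bits: all(bits),
    "OR": lambda bits: any(bits),
    "NOT": lambda bits: not bits[0],
}


def gate_icpt(gate, n_parents):
    """Return {parent_states: P(child = True | parent_states)} for a gate."""
    if gate not in GATES:
        raise ValueError(f"unknown gate: {gate}")
    if n_parents < 1 or (gate == "NOT" and n_parents != 1):
        raise ValueError("invalid number of parents for this gate")
    fn = GATES[gate]
    return {
        states: 1.0 if fn(states) else 0.0
        for states in product((False, True), repeat=n_parents)
    }


if __name__ == "__main__":
    # Example: a child state that occurs if either of two parent faults occurs.
    for states, prob in gate_icpt("OR", 2).items():
        print(states, prob)
```

A table of this kind has one row per combination of parent states, so its size grows as 2^n with the number of binary parents.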
Prior probabilities of the system mode, components and process variable states.
System simulation and fault diagnostic analysis of the water tank system
To assess the correctness and validity of the GSPN-mBSPN model of the water tank system, which aims to facilitate condition monitoring and enable early detection and diagnosis of single and multiple component failures, a custom C++ Monte Carlo Simulation (MCS) programme was created. The GSPN module of the GSPN-mBSPN model simulates system behaviour during normal operation and operation during the considered fault scenarios. Based on the description of the water tank system’s normal behaviour, the simulation results obtained (water level and flow rates) demonstrate the correctness of the GSPN-mBSPN model of the water tank system and validate the effectiveness of the developed programme in modelling and simulating a GSPN-mBSPN model of a dynamic system similar to the one described in this paper. The diagnostic accuracy of the GSPN-mBSPN approach will be assessed by comparing its results with those obtained from the HUGIN software 33 when applied to the water tank system model. The mBSPN module of the GSPN-mBSPN model of the water tank system diagnoses the cause of the faults using the sensor readings (evidence) from the GSPN simulation model describing the system operation. Therefore, the effects that the component faults would have on the system process variable (water level) and the input-and-output variables (flow) were observed by simulating the operation of the water tank level control system under single and multiple component failure scenarios using the simulation input parameters depicted in Table 5. The fault scenarios were investigated when the system was in the active mode of operation.
Simulation parameters
Simulation and diagnostic results of single component failure scenarios
One of the cases of single component failure scenarios is used to test the accuracy of the diagnostic results of the inference algorithms: rejection sampling (RSA) and likelihood weighting (LWA) inference algorithms implemented in this paper. As depicted in Figure 18, with the system in the ACTIVE mode and the initial volume of water inside the tank set to the normal required level, it is expected that starting from time t = 0.25 s, there should be flow (F) of water out of the tank in section 2, constant flow (CF) of water into the tank at section 1 and no water flow (NF) at section 3 of the system since the current level of water in the tank has not reached the safety level. Also, as shown in the figure, no water (NW) is inside the TRAY due to overflow, leaking or fracture and the level of water inside the tank remains constant from time t = 0.25–60 s.

Flow and level sensors measurements when valve V1 failed closed at 60 s.
However, starting from time t = 60 s, the failure of one of the components in section 1 of the system (for example, valve V1 failing closed) causes no flow in the section, as depicted in Figure 18. Consequently, the volume of water in the tank starts decreasing at time t = 60 s, since water is flowing out of the tank via valve V2 at section 2 and no more water is entering the tank through valve V1 at section 1.
To demonstrate the diagnostic capability of the implemented inference sampling algorithms, Figure 19 depicts the average posterior probabilities, obtained using the LWA, of the failure states/modes of the components in section 1 of the water tank system that could be responsible for no flow at the section when valve V1 failed closed at t = 60 s. Similar trends were observed for the RSA and are thus omitted from this paper. The diagnostic results were generated with an average time of 3 min when the model was simulated 30,000 times on a Windows 10 64-bit Intel(R) Core(TM) i3 system with a 3.60 GHz processor and 8.00 GB of RAM. Running 30,000 simulations ensures the accuracy of the results presented in Figure 19 to within ±5% precision at the 95% confidence level, as indicated by the convergence graph in Figure 20 for the posterior probability of no water in the main supply, being the component with the lowest failure probability (0.001). The 95% confidence interval (CI) for the average posterior probability of no water in the main supply is [0.0017, 0.0019]. As shown in Figure 19, the posterior probabilities of blockage in pipes P1 and P2, valve V1 failed closed and controller C1 failed high are significantly higher than the posterior probabilities of the remaining components in the section. The presence of valve V1 failed closed among the possible causes of no flow in section 1 of the system shows that the proposed methodology is accurate and could be used to diagnose single component failures with known observable effects on the system process variable (water level).
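For readers unfamiliar with likelihood weighting, the idea can be sketched on a two-cause toy network in Python. This is an illustration only and not the authors' implementation: the component names and prior values are hypothetical (they are not the Table 4 values), the evidence model is a plain OR gate, and the confidence half-width uses a normal approximation with the effective sample size, which is an assumed choice rather than the procedure used in the paper.

```python
import math
import random


def likelihood_weighting(priors, p_evidence_given, n_samples, seed=0):
    """Estimate P(cause failed | evidence) for independent root causes.

    priors: {cause: prior failure probability}
    p_evidence_given: function(sample dict) -> P(evidence | sampled causes)
    Returns {cause: (posterior estimate, approximate 95% CI half-width)}.
    """
    rng = random.Random(seed)
    total_w = 0.0
    total_w2 = 0.0
    weighted_hits = {name: 0.0 for name in priors}
    for _ in range(n_samples):
        # Sample every non-evidence variable from its prior ...
        sample = {name: rng.random() < p for name, p in priors.items()}
        # ... and weight the sample by the likelihood of the fixed evidence.
        w = p_evidence_given(sample)
        total_w += w
        total_w2 += w * w
        for name, failed in sample.items():
            if failed:
                weighted_hits[name] += w
    if total_w == 0.0:
        raise ValueError("no sample was consistent with the evidence")
    ess = total_w ** 2 / total_w2  # effective sample size
    result = {}
    for name, hits in weighted_hits.items():
        p = hits / total_w
        half_width = 1.96 * math.sqrt(max(p * (1.0 - p), 0.0) / ess)
        result[name] = (p, half_width)
    return result


if __name__ == "__main__":
    # Hypothetical priors for two causes of 'no flow' in one section.
    priors = {"valve_failed_closed": 0.1, "pipe_blocked": 0.2}
    no_flow = lambda s: 1.0 if any(s.values()) else 0.0  # OR-gate evidence model
    for name, (p, hw) in likelihood_weighting(priors, no_flow, 30_000).items():
        print(f"{name}: {p:.4f} +/- {hw:.4f}")
```

For this toy case the exact posterior of the first cause is 0.1 / (0.1 + 0.2 - 0.1 × 0.2) ≈ 0.357, which gives a simple check on the sampled estimate. With a deterministic evidence model the weights are 0 or 1, so the estimate coincides with rejection sampling; the two algorithms differ once the evidence likelihood takes intermediate values.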

Timestep posterior probability of section 1 components failure state/modes with the no flow observation via section 1 of the system.

The posterior probability of no water in the main supply over the 30,000 simulations.
Table 6 summarises the posterior probabilities of all the single component failure scenarios tested using the LWA. Similar results were obtained for fault diagnosis with the RSA but are not presented for brevity. As shown in the table, single component failure scenarios whose effects match the expected sensor symptoms at the sections of the system remain hidden. Some of these failures could only be revealed if the component failure causes the level of water inside the tank to fall below the predefined low set point (<1.5 m) or rise above the high or very high set points (>1.7 m or >1.9 m, respectively).
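The role of the set points in revealing a failure can be made concrete with a small Python sketch. The three thresholds are the ones quoted above; the function name and the returned labels are hypothetical, and treating a level exactly on a threshold as still inside the band is an assumption.

```python
LOW_SET_POINT_M = 1.5
HIGH_SET_POINT_M = 1.7
VERY_HIGH_SET_POINT_M = 1.9


def classify_level(level_m):
    """Map a tank level in metres to a qualitative state (labels are illustrative)."""
    if level_m > VERY_HIGH_SET_POINT_M:
        return "VERY_HIGH"
    if level_m > HIGH_SET_POINT_M:
        return "HIGH"
    if level_m < LOW_SET_POINT_M:
        return "LOW"
    return "NORMAL"


if __name__ == "__main__":
    for level in (1.4, 1.6, 1.8, 1.95):
        print(level, classify_level(level))
```

A failure whose flow symptoms match the expected ones stays hidden for as long as the level remains in the normal band.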
Average posterior probabilities of single component failure scenarios based on failure diagnosis using likelihood weighting inference algorithm.
Hidden failure cases.
Note that in each case there were other components that had an increase in their posterior probability, but the failure that was inserted appeared at the top of that list, with the highest posterior probability.
Validation of the fault diagnostic capability of the GSPN-mBSPN model
The fault diagnostic capability of the GSPN-mBSPN model was validated using a case involving the absence of flow through section 1 of the water tank system. The Bayesian network (BN) graph for the entire water tank system was drawn using the HUGIN software. However, only the BN graph for section 1 of the tank system is presented in Figure 21 for simplicity. To assess the effectiveness of the proposed GSPN-mBSPN method for fault diagnosis, evidence was applied to the ‘FV1_Status’ node in the BN graph, indicating a state of ‘no flow’ in section 1 of the tank system. The HUGIN software generated posterior probabilities (in percentage) for the states of the components in section 1, as shown on the left-hand side of Figure 21. As expected, there is an increase in the posterior probabilities of the states of the components that could be responsible for no flow in section 1 of the water tank system. There are, however, some differences between the posterior probabilities of the GSPN-mBSPN simulation output in Table 6 and the results obtained from the HUGIN software. These can be attributed to the finite precision and confidence interval of the Monte Carlo Simulation employed in the GSPN-mBSPN approach. Nevertheless, the percentage differences are small, as shown in Table 7. Consequently, it can be concluded that the proposed GSPN-mBSPN fault detection and diagnostic methodology is reliable, and its features hold promise for enhancing fault detection and diagnosis in complex systems.
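One plausible way to compute such a percentage difference is shown in the Python sketch below. The paper does not state the formula behind Table 7, so taking the exact (HUGIN) value as the reference is an assumption, and the numbers in the usage example are placeholders rather than values from Tables 6 or 7.

```python
def percentage_difference(estimate, reference):
    """Absolute difference of `estimate` from `reference`, as a percentage of `reference`."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return abs(estimate - reference) / abs(reference) * 100.0


if __name__ == "__main__":
    # Placeholder posteriors: a sampled estimate versus an exact inference result.
    sampled, exact = 0.36, 0.35
    print(f"{percentage_difference(sampled, exact):.2f} %")
```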

The Bayesian network graph of section 1 of the water tank system.
Comparison between the posterior probabilities of the GSPN-mBSPN model and the HUGIN posterior probability results.
Simulation and diagnostic results of multiple components failure scenarios
Several cases of multiple component failures in the water tank level control system were tested. To demonstrate the capability of the proposed methodology for multiple fault diagnosis, ten cases of multiple component failures (one component fault from each of Sections 1, 2 and 3) that could cause a continuous rise in the tank’s water level are taken as examples. Note that all faults occur at the same time. For illustration, Figure 22 shows the observed pattern of the flow sensor rates and the volume of water in the tank and overspill tray when sensor S1 failed low and valve V2 failed closed at time t = 60 s of the system operating time. The concurrent failure of sensor S1 (failed low) and valve V2 (failed closed) at t = 60 s causes continuous flow at section 1 and flow stoppage at section 2 of the system. This is evident from the observed flow patterns depicted in Figure 22. Consequently, as shown in the figure, these failures cause the volume of water in the tank to start increasing at time t = 60 s, since water is not flowing out of the tank via section 2 and there is a continuous supply of water into the tank at section 1 of the system.

Flow and level sensors measurements when sensor S1 failed low and valve V2 failed closed at 60 s.
Table 8 summarises the average posterior probabilities of the ten cases of multiple component failure scenarios taken as examples to illustrate the capability of the proposed Petri net methodology for multiple fault diagnosis. Similar results were obtained for other possible combinations of multiple failure scenarios but are omitted from this paper. As shown in the table, the average posterior probabilities of the components in Sections 1 and 2 of the tank system that could be responsible for the rise in the water level in the tank have increased.
Average posterior probabilities of multiple component failure scenarios based on failure diagnosis using likelihood weighting inference algorithm.
Besides, no significant difference was observed between the prior and the posterior probabilities of the Section 3 component (pipe P5 blocked) listed among the inserted faults in all the cases tested, excluding case 2, where the inserted fault from Section 3 is sensor S2 failed low, which also follows a similar pattern to the other instances where pipe P1 is blocked. This is because this section only starts its functional operation if one or more failures have caused the level of water in the tank to rise above the very high set-point, which will occur from time t = 210 s based on the simulation input parameters assumed in this study. However, if any of the components in section 3 have failed, it is expected that the average posterior probability of the failed component (e.g. pipe P5 blocked in this case) will significantly increase. The small changes between the prior and the posterior probabilities of pipe P3 blocked in the example cases listed in Table 8 show the capability of the proposed Petri net-based fault diagnostic methodology for computing marginal probabilities of system variables in a dynamic system with some unrevealed faults.
Conclusion
This paper proposes a condition monitoring, early fault detection and diagnostic Petri net methodology for diagnosing single and multiple faults in a dynamic system using a water tank level control system as a case study. The method was based on integrating Generalised and modified Bayesian Stochastic Petri Nets (GSPN-mBSPN) formalisms. First, a model describing the operational behaviour of the system was constructed using GSPN formalism. Then, a diagnostic model was constructed for the water tank system using a modified BSPN approach proposed in this paper. The GSPN and the mBSPN models were then integrated to aid real-time detection and diagnosis of faults in the system. This was achieved using newly proposed Petri net modelling features such as conditional reset and probabilistic transitions. The obtained posterior probabilities of the root causes of system failure scenarios showed the effectiveness and accuracy of using the proposed integrated Petri net methodology for single and multiple faults diagnosis in the considered water tank level control system. Due to the limited number of monitoring sensors and the points where they were deployed on the case study system, the proposed method fails to detect some cases of hidden faults. Thus, future research aims to improve the approach used for fault detection in the proposed methodology. Furthermore, additional component failures could also be considered, such as tank rupture and leakages at different tank levels. The methodology could be further enhanced by automatically generating the conditional probability tables required in the fault diagnostic model through the system simulation model.
Footnotes
Acknowledgements
The authors would like to thank the Petroleum Technology Development Fund (PTDF), Abuja, Nigeria, for funding and supporting Taofeeq Alabi Badmus’s PhD research through the Fund’s Overseas Scholarship Scheme (OSS) programme [award number P6797054741222445].
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Petroleum Technology Development Fund (PTDF), Abuja, Nigeria, under its PhD Overseas Scholarship Scheme award to Taofeeq Alabi Badmus [award number P6797054741222445].
