Abstract

With the advent of new sensor technologies and communication solutions, the availability of data for discrete event systems has greatly increased. This motivates research on data assimilation for discrete event simulations that has not yet fully matured. This paper presents a particle filter-based data assimilation framework for discrete event simulations. The framework is formally defined based on the Discrete Event System Specification formalism. To effectively apply particle filtering in discrete event simulations, we introduce an interpolation operation that considers the elapsed time (i.e., the time elapsed since the last state transition) when retrieving the model state (which was ignored in related work) in order to obtain updated state values. The data assimilation problem finally boils down to estimating the posterior distribution of a state trajectory with variable dimension. This seems to be problematic; however, it is proven that in practice we can safely apply the sequential importance sampling algorithm to update the random measure (i.e., a set of particles and their importance weights) that approximates this posterior distribution of the state trajectory with variable dimension. To illustrate the working of the proposed data assimilation framework, a case is studied in a gold mine system to estimate truck arrival times at the bottom of the vertical shaft. The results show that the framework is able to provide accurate estimation results in discrete event simulations; it is also shown that the framework is robust to errors both in the simulation model and in the data.

Keywords

Data assimilation, discrete event simulations, particle filters, state interpolation

1 Introduction

Enabled by the increased availability of data, the data assimilation technique,¹ which incorporates measured observations into a dynamical system model to produce a time sequence of estimated system states, is gaining popularity. The main reason is that it can produce more accurate estimation results than using a single source of information from either the simulation model or the measurements. Due to this benefit, the data assimilation technique has been applied in many continuous systems applications, but very little data assimilation research has been found for discrete event simulations. With the application of new sensor technologies and communication solutions, such as smart sensors, or Internet of Things,² the availability of data for discrete event systems has increased as well, such as data from machines and processes,³ or high-resolution event data in traffic.⁴ The increased data availability for discrete event systems but the lack of related data assimilation techniques thus motivates this work on data assimilation in discrete event simulations.

1.1 Characteristics of discrete event simulations

Discrete event systems are usually man-made dynamic systems, for example, production or assembly lines, computer/communication networks, or traffic systems. These systems are not easily described by (partial) differential equations or difference equations; instead, they are modeled and simulated by the discrete event approach.⁵ This approach abstracts the physical time and the state of the physical system as a continuous simulation time and a collection of state variables, respectively. A point on this continuous time axis at which at least one state variable changes is called an instant.⁶ State changes are only captured at discrete, but possibly random, instants,⁷ and such a change in state occurring at an instant is called an event.⁶ Since the discrete event approach jumps from one event to the next, omitting the behavior in between, it can be very efficient.⁸

The key characteristics of discrete event simulations can be summarized as follows. Firstly, the model state is defined as a collection of atomic model states, each of which is represented by a combination of continuous and discrete variables. Take the case study in the gold mine system (see Section 3) as an example. The position of the elevator is a continuous state variable; the number of trucks that are waiting for loading is a discrete state variable; and the status of the miner, that is, busy or idle, is also a discrete state variable. Secondly, the behavior of discrete event simulations is highly nonlinear, non-Gaussian. In a discrete event simulation, the state evolution is usually based on rules, which define what the next state will be when the time advance expires, how to react when external events occur, etc. These functions are highly nonlinear step functions, because state changes in a discrete event simulation happen instantaneously at the event. The Gaussian error assumption is easily violated, since both state variables and measurements can be non-numerical. Finally, state updates in a discrete event simulation happen locally and asynchronously within each atomic model component; for each atomic model component, its state is updated at time instants lying irregularly on a continuous time axis, and the duration between two consecutive state updates is usually not fixed. The state trajectory of a discrete event simulation model is thus piecewise constant, as shown in Figure 1, which only captures changes of interest in the real state evolution.

Figure 1.

Discrete event simulation of continuous and discrete state variables. (Color online only.)

1.2 Data assimilation in discrete event simulations

The aim of data assimilation is to incorporate measured (noisy) observations into a dynamical system model in order to produce accurate estimates of all the current (and future) state variables of the system.⁹ Data assimilation therefore relies on three elements: the system model that describes the evolution of the state over time, the measurement model that relates noisy observations to the state, and the data assimilation technique that carries out state estimation based on information from both the model and the measurements, and in the process addresses measurement and modeling errors.¹ In the literature, many data assimilation techniques exist, such as the Kalman filter,¹⁰ the extended Kalman filter,¹¹ and the ensemble Kalman filter.¹² However, they rely on certain assumptions, such as the linear model assumption or the Gaussian error assumption.¹³ Another powerful data assimilation technique is the particle filter.^10,14 Particle filters approximate a probability density function by a set of particles and their associated importance weights, and therefore they put no assumption on the properties of the system model. As a result, they can effectively deal with nonlinear and/or non-Gaussian applications.^15–17

As explained in Section 1.1, discrete event simulations are highly nonlinear, non-Gaussian systems, and therefore particle filters are in principle applicable to discrete event simulations. However, applying particle filtering in discrete event simulations still encounters several theoretical and practical problems. In discrete event simulations, state updates happen locally and asynchronously within each (atomic) model component, and the system state takes a new value when one of its components has a state update. Consequently, the time between two consecutive state updates is usually not fixed, that is, the discrete event state process is asynchronous with the measurement process, which usually feeds data at fixed times. The mismatch between the two processes incurs two problems that hinder the application of particle filtering in discrete event simulations. The first problem is the state retrieval problem, which means that the model state retrieved from a discrete event simulation model is a combination of sequential states (without the elapsed time; see Figure 1) of atomic components that were updated at past time instants. The consequence of ignoring the elapsed time is that the particles will be evaluated inaccurately, since the measurements are wrongly related to the states that were updated at past time instants. This effect is evident for continuous states (see Figure 1(a)); for discrete states (see Figure 1(b)), in order to compute the weight of a particle, one probably needs the elapsed time to define a proper measurement model that relates the discrete state to the measurement. However, ignoring the elapsed time will also make this definition and computation inaccurate. The second problem is the variable dimension problem. The “dimension” refers to the dimension of a discrete event state trajectory during a fixed time interval, which is defined as the number of state points contained in the discrete event state trajectory during that time interval. 
Since the duration between two consecutive state updates in a discrete event simulation is not fixed, the dimension of a discrete event state trajectory during a fixed time interval is a random variable. This will lead to inapplicability of the standard sequential importance sampling algorithm.^18,19 Other practical problems, which mainly relate to data issues, such as non-numerical data, for example, event sequences, also make particle filtering in discrete event simulations highly problematic.

The research closely related to the topic of this paper is the work on data assimilation in wildfire spread simulations.^15,16,20,21 However, the two problems explained above were not explicitly considered. In their work, the simulation model for wildfire spread is a cellular automaton-based discrete event simulation model called DEVS-FIRE^22,23; the measurements are temperature values from sensors deployed in the fire field; particle filters are employed to assimilate these measurements into the DEVS-FIRE model to estimate the spread of the fire front. Since the measurement in the wildfire application is the temperature at a time instant, and it is only related to the system state (fire front) at the same time, their system model can be formalized as a discrete time state space model that only focuses on the state evolution at time instants when measurements are available, and the detailed evolution in between (not of interest in their application) is done with the DEVS-FIRE model. However, when retrieving the system state at the time instant when a measurement is available, the retrieved state is only a combination of sequential states of all atomic components (i.e., cells), which do not reflect any elapsed time information. As a result, errors exist, as explained in Figure 1.

1.3 Contribution and outline of this paper

In this paper, we propose a particle filter-based data assimilation framework for discrete event simulations, in which we assume that model components do not change over time (i.e., closed systems). The measurements fed at time step $k \in {1, 2, \dots}$ are assumed to be distributed over the last measurement interval (i.e., data fed at time step k can contain observations occurring at any time instant during $[(k - 1) Δ T, k Δ T]$ , where $Δ T$ is the measurement interval), implying that the measurements are dependent on the state transitions during that interval. To define the data assimilation framework formally, we adopt the Discrete Event System Specification (DEVS) formalism⁸; in this framework, we solve the state retrieval problem and the variable dimension problem explained in Section 1.2. To illustrate the working of the proposed data assimilation framework, we study a case in a gold mine system in which noisy data (partial event sequences, entity positions with Gaussian errors) is assimilated into the discrete event gold mine simulation model in order to estimate truck arrival times at the bottom of the vertical shaft. The results show that the proposed data assimilation framework is able to provide accurate estimation results in discrete event simulations; it is also shown that the proposed framework is robust to errors both in the simulation model and in the data.

The rest of this paper is organized as follows. Section 2 presents the particle filter-based data assimilation framework, which includes the system model, the measurement model, and the particle filtering algorithm for discrete event simulations. The case in the gold mine system is studied in Section 3 (tailoring the generic data assimilation framework to the specific estimation problem), Section 4 (qualitative analysis), and Section 5 (quantitative analysis). Finally, the paper is concluded in Section 6.

2 The particle filter-based data assimilation framework for discrete event simulations

In this section, the proposed data assimilation framework for discrete event simulations is presented. In order to formalize the data assimilation problem, we need to formalize the state transitions in a discrete event model as an integer indexed state process (i.e., in the same form as a discrete time model); Section 2.1 shows how to achieve such a formalization. In Section 2.2, the interpolation operation is introduced in order to obtain updated state values, and the measurement model is formalized accordingly. On the basis of the integer indexed state process and the measurement model, the particle filtering algorithm is formalized in Section 2.3, in which the variable dimension problem is addressed. Finally, some practical remarks that can help simplify the application of the data assimilation framework are given in Section 2.4.

2.1 System model

In order to describe discrete event simulations formally, we need to adopt a discrete event modeling and simulation formalism. Therefore, in Section 2.1.1, we briefly introduce the DEVS formalism,⁸ which is widely adopted in the simulation community. Subsequently, in Section 2.1.2, we describe how the state evolves in a DEVS model. Finally, in Section 2.1.3, we show how to formalize the state transitions in a DEVS model as an integer indexed state process.

2.1.1 Discrete Event System Specification

DEVS⁸ allows for the description of system behavior at two levels: the atomic level and the coupled level. An atomic DEVS model describes the autonomous behavior of a discrete event system as a sequence of deterministic transitions between sequential states over time as well as how it reacts to external input (events) and how it generates output (events). Formally, an atomic DEVS model M is defined by the following structure:

M = < X, Y, S, δ_{int}, δ_{ext}, λ, ta >

where X and Y are the sets of input and output events, S is a set of sequential states, $δ_{int} : S \to S$ is the internal state transition function, $δ_{ext} : Q \times X \to S$ is the external state transition function, $Q = \{(s, e) | s \in S, 0 \leq e \leq ta (s)\}$ is the total state set, e is the time elapsed since the last transition, $λ : S \to Y$ is the output function, and $ta : S \to R_{0, \infty}^{+}$ is the time advance function, where $R_{0, \infty}^{+}$ is the set of positive reals together with 0 and $\infty$ .
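As a minimal illustration of this structure (not code from the paper), the tuple can be written as a small Python container. The class name `AtomicDEVS`, the field names, and the single-server example with its `SERVICE_TIME` are all hypothetical. The sets X, Y, and S are left implicit, and the remaining service time is stored inside the sequential state so that an external event arriving while the server is busy does not restart the service.

```python
import math
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class AtomicDEVS:
    """Holds the four functions of <X, Y, S, delta_int, delta_ext, lambda, ta>."""
    delta_int: Callable[[Any], Any]              # S -> S
    delta_ext: Callable[[Any, float, Any], Any]  # (s, e, x) -> S, with 0 <= e <= ta(s)
    output: Callable[[Any], Any]                 # lambda: S -> Y
    ta: Callable[[Any], float]                   # S -> [0, inf]


SERVICE_TIME = 3.0  # hypothetical service duration


def _delta_ext(s, e, x):
    phase, sigma = s
    if phase == "idle" and x == "job":
        return ("busy", SERVICE_TIME)
    # Otherwise keep the phase and reduce the remaining time by the elapsed time.
    return (phase, sigma - e)


# Sequential state s = (phase, sigma), where sigma is the time left in the phase.
server = AtomicDEVS(
    delta_int=lambda s: ("idle", math.inf),
    delta_ext=_delta_ext,
    output=lambda s: "done" if s[0] == "busy" else None,
    ta=lambda s: s[1],
)
```

Storing the remaining time in the state is a common DEVS modeling idiom; other encodings of the same behavior are equally valid.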

Atomic models can be coupled to form a larger model. A DEVS coupled model N is defined by the following structure:

N = < X, Y, D, {M_{i}}, {I_{i}}, {Z_{i, j}}, Select >

(1)

where

X and Y are the sets of input and output events of the coupled model,

D is a set of component names, and for each $i \in D$ , $M_{i}$ is an atomic DEVS model defined as follows:

M_{i} = < X_{i}, Y_{i}, S_{i}, δ_{int, i}, δ_{ext, i}, λ_{i}, ta_{i} >, \forall i \in D

for each $i \in D \cup \{N\}$ , $I_{i}$ is the set of components that are influenced by component i, and $I_{i} \subseteq D \cup \{N\}, i \notin I_{i}$ ,

for each $j \in I_{i}$ , $Z_{i, j}$ is the output-to-input translation function, where:

Z_{i, j} : \begin{cases} X \to X_{j} & \text{if } i = N \text{ and } j \in D \\ Y_{i} \to Y & \text{if } i \in D \text{ and } j = N \\ Y_{i} \to X_{j} & \text{if } i \in D \text{ and } j \in D \end{cases}

$Select : 2^{D} \to D$ is a tie-breaking function with $Select (E) \in E$ to arbitrate the occurrence of simultaneous events.

DEVS models are closed under coupling, that is, the coupling of DEVS models defines an equivalent atomic DEVS model.²⁴

2.1.2 State evolution in a coupled DEVS model

Consider a coupled DEVS model N defined in Equation (1). The state evolution of its atomic component $M_{i}$ is achieved by executing internal state transition $δ_{int, i} (s_{i})$ and external state transition $δ_{ext, i} (s_{i}, e_{i}, x_{i})$ . In this section, we clarify how state evolution of the coupled DEVS model N is driven by state evolutions of its atomic components.

Since DEVS models are closed under coupling,²⁴ the coupled DEVS model N is equivalent to an atomic DEVS model $M = < X, Y, S, δ_{int}, δ_{ext}, λ, ta >$ (the construction of M can be found in Vangheluwe²⁴). The sequential state of M (equivalent to the coupled DEVS model N) can be represented as follows:

s = (\dots, (s_{i}, e_{i}), \dots) \in S = \times_{i \in D} Q_{i}

(2)

where $Q_{i} = \{(s_{i}, e_{i}) | s_{i} \in S_{i}, 0 \leq e_{i} \leq ta_{i} (s_{i})\}$ . The state evolution of the coupled DEVS model is triggered by either an internal state transition of the selected imminent component $i^{*}$ ,²⁴ which transforms the different parts of the total state as follows:

\begin{array}{l} δ_{int} (s) = (\dots, (s'_{i}, e'_{i}), \dots), \text{ where} \\ (s'_{i}, e'_{i}) = \begin{cases} (δ_{int, i} (s_{i}), 0) & \text{if } i = i^{*} \\ (δ_{ext, i} (s_{i}, e_{i} + ta (s), Z_{i^{*}, i} (λ_{i^{*}} (s_{i^{*}}))), 0) & \text{if } i \in I_{i^{*}} \\ (s_{i}, e_{i} + ta (s)) & \text{otherwise} \end{cases} \\ \text{and } ta (s) = \min \{σ_{i} = ta_{i} (s_{i}) - e_{i} | i \in D\} \end{array}

or an external state transition, which transforms the different parts of the total state as follows:

\begin{array}{l} δ_{ext} (s, e, x) = (\dots, (s'_{i}, e'_{i}), \dots), \text{ where} \\ (s'_{i}, e'_{i}) = \begin{cases} (δ_{ext, i} (s_{i}, e_{i} + e, Z_{N, i} (x)), 0) & \text{if } i \in I_{N} \\ (s_{i}, e_{i} + e) & \text{otherwise} \end{cases} \end{array}

2.1.3 Formalize discrete event state evolution as an integer indexed state process

In order to formalize the data assimilation problem, we need to formalize the state transitions in a DEVS model as an integer indexed state process:

\begin{matrix} x_{\tilde{k}} = (s_{\tilde{k}}, t_{\tilde{k}}) \\ = ((\dots, (s_{i, {\tilde{k}}_{i}}, e_{i, {\tilde{k}}_{i}}), \dots), t_{\tilde{k}}), i \in D; \\ \tilde{k} = 0, 1, 2, \dots; {\tilde{k}}_{i} = 0, 1, 2, \dots \end{matrix}

(3)

where $s_{\tilde{k}} \in S$ is a sequential state of a coupled DEVS model as defined in Equation (2), and $t_{\tilde{k}} \in R_{0, \infty}^{+}$ is the time instant when the model transfers to state $s_{\tilde{k}}$ , and we assign $t_{0} = 0$ . Furthermore, $s_{i, {\tilde{k}}_{i}} \in S_{i}$ is the sequential state of component $i \in D$ ; $e_{i, {\tilde{k}}_{i}} = t_{\tilde{k}} - t_{i, {\tilde{k}}_{i}}$ is the time elapsed since component i made a state transition to state $s_{i, {\tilde{k}}_{i}} \in S_{i}$ at time $t_{i, {\tilde{k}}_{i}} \in R_{0, \infty}^{+}$ . Essentially, $x_{i, {\tilde{k}}_{i}} = (s_{i, {\tilde{k}}_{i}}, t_{i, {\tilde{k}}_{i}})$ also defines an integer indexed state process for atomic DEVS component $i \in D$ . Since state evolutions of different components are again asynchronous with each other, the state index differs from component to component at the same time; therefore, the state index $\tilde{k}$ is associated with the component index i, that is, ${\tilde{k}}_{i}$ . Obviously, $\forall t_{\tilde{k}}, \exists i \in D \text{ s.t. } t_{\tilde{k}} = t_{i, {\tilde{k}}_{i}}$ , which means that a coupled model takes a new state value when one of its atomic components has a state update. The integer indexed state process is illustrated in Figure 2.

Figure 2.

The integer indexed state process (each red circle represents a state point $x_{\tilde{k}} = (s_{\tilde{k}}, t_{\tilde{k}})$ ). (Color online only.)

We denote the input event segment for the coupled DEVS model as $w : (t_{\tilde{k}}, t_{\tilde{k}} + ta (s_{\tilde{k}})] \to X^{\emptyset} = X \cup \{\emptyset\}$ , where $ta (s_{\tilde{k}}) = ta_{i^{*}} (s_{i^{*}, {\tilde{k}}_{i^{*}}}) - e_{i^{*}, {\tilde{k}}_{i^{*}}} = \min \{σ_{i, {\tilde{k}}_{i}} = ta_{i} (s_{i, {\tilde{k}}_{i}}) - e_{i, {\tilde{k}}_{i}} | i \in D\}$ , that is, $i^{*}$ is the selected imminent component. Based on $x_{\tilde{k}} = (s_{\tilde{k}}, t_{\tilde{k}})$ and the input segment w, the next state $x_{\tilde{k} + 1} = (s_{\tilde{k} + 1}, t_{\tilde{k} + 1})$ is defined as follows:

if there is no external event during $(t_{\tilde{k}}, t_{\tilde{k}} + ta (s_{\tilde{k}})]$ , that is, $∄ t \in (t_{\tilde{k}}, t_{\tilde{k}} + ta (s_{\tilde{k}})] \text{ s.t. } w (t) \neq \emptyset$ , $x_{\tilde{k} + 1} = (s_{\tilde{k} + 1}, t_{\tilde{k} + 1})$ is determined as follows:

\begin{matrix} s_{\tilde{k} + 1} = δ_{int} (s_{\tilde{k}}) = (\dots, (s_{i, \tilde{k}'_{i}}, e_{i, \tilde{k}'_{i}}), \dots) \\ t_{\tilde{k} + 1} = t_{\tilde{k}} + ta (s_{\tilde{k}}) \end{matrix}

(4)

where $(s_{i, \tilde{k}'_{i}}, e_{i, \tilde{k}'_{i}})$ is defined as follows:

(s_{i, \tilde{k}'_{i}}, e_{i, \tilde{k}'_{i}}) = \begin{cases} (δ_{int, i} (s_{i, {\tilde{k}}_{i}}), 0) = (s_{i, {\tilde{k}}_{i} + 1}, 0) & \text{if } i = i^{*} \\ (δ_{ext, i} (s_{i, {\tilde{k}}_{i}}, e_{i, {\tilde{k}}_{i}} + ta (s_{\tilde{k}}), Z_{i^{*}, i} (λ_{i^{*}} (s_{i^{*}, {\tilde{k}}_{i^{*}}}))), 0) = (s_{i, {\tilde{k}}_{i} + 1}, 0) & \text{if } i \in I_{i^{*}} \\ (s_{i, {\tilde{k}}_{i}}, e_{i, {\tilde{k}}_{i}} + ta (s_{\tilde{k}})) & \text{otherwise} \end{cases}

if there exist external events during $(t_{\tilde{k}}, t_{\tilde{k}} + ta (s_{\tilde{k}})]$ , that is, $\exists t \in (t_{\tilde{k}}, t_{\tilde{k}} + ta (s_{\tilde{k}})] \text{ s.t. } w (t) \neq \emptyset \wedge ∄ t' \in (t_{\tilde{k}}, t) \text{ s.t. } w (t') \neq \emptyset$ (i.e., t is the time of the first external event in the interval), $x_{\tilde{k} + 1} = (s_{\tilde{k} + 1}, t_{\tilde{k} + 1})$ is determined as follows:

\begin{matrix} s_{\tilde{k} + 1} = δ_{ext} (s_{\tilde{k}}, t - t_{\tilde{k}}, w (t)) = (\dots, (s_{i, \tilde{k}'_{i}}, e_{i, \tilde{k}'_{i}}), \dots) \\ t_{\tilde{k} + 1} = t \end{matrix}

(5)

where $(s_{i, \tilde{k}'_{i}}, e_{i, \tilde{k}'_{i}})$ is defined as follows:

(s_{i, \tilde{k}'_{i}}, e_{i, \tilde{k}'_{i}}) = \begin{cases} (δ_{ext, i} (s_{i, {\tilde{k}}_{i}}, e_{i, {\tilde{k}}_{i}} + t - t_{\tilde{k}}, Z_{N, i} (w (t))), 0) = (s_{i, {\tilde{k}}_{i} + 1}, 0) & \text{if } i \in I_{N} \\ (s_{i, {\tilde{k}}_{i}}, e_{i, {\tilde{k}}_{i}} + t - t_{\tilde{k}}) & \text{otherwise} \end{cases}

Finally, we can formalize the state evolution of a coupled DEVS model as an integer indexed state process:

x_{\tilde{k} + 1} = SIM (x_{\tilde{k}}, w) + ν_{\tilde{k}}, \tilde{k} = 0, 1, 2, \dots

(6)

where w is the input event segment and SIM is a discrete event simulation model that transfers state $x_{\tilde{k}}$ to $x_{\tilde{k} + 1}$ based on Equations (4) and (5); $ν_{\tilde{k}}$ is the process noise. Notice that the time duration between two consecutive state points, that is, $t_{\tilde{k} + 1} - t_{\tilde{k}}$ , is not a constant, but a random variable. In this paper, we focus on closed systems; therefore, $w = \emptyset$ .
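To make the closed-system case ($w = \emptyset$) concrete, the following sketch steps a coupled model through internal transitions only and records the resulting state points $(s_{\tilde{k}}, t_{\tilde{k}})$. It is an illustration under simplifying assumptions, not the authors' simulator: the process noise is omitted, the translation functions $Z_{i, j}$ are the identity, Select breaks ties by component name, and the three components (`gen`, `counter`, `clock`) are hypothetical.

```python
import math


def step(components, influences, state, t):
    """One internal transition of a closed coupled model, following Equation (4)."""
    # sigma_i = ta_i(s_i) - e_i is the time left until component i's next event.
    sigma = {i: components[i]["ta"](s) - e for i, (s, e) in state.items()}
    # Select: ties are broken by component name (an assumption of this sketch).
    imminent = min(sorted(sigma), key=sigma.get)
    dt = sigma[imminent]
    if math.isinf(dt):
        return None  # every component is passive
    y = components[imminent]["out"](state[imminent][0])
    nxt = {}
    for i, (s, e) in state.items():
        if i == imminent:
            nxt[i] = (components[i]["int"](s), 0.0)
        elif i in influences.get(imminent, set()):
            nxt[i] = (components[i]["ext"](s, e + dt, y), 0.0)
        else:
            nxt[i] = (s, e + dt)  # only the elapsed time advances
    return nxt, t + dt


def simulate(components, influences, state, t_end):
    """Return the state points (s, t) with t <= t_end, starting at t_0 = 0."""
    t, trajectory = 0.0, [(state, 0.0)]
    while True:
        result = step(components, influences, state, t)
        if result is None or result[1] > t_end:
            return trajectory
        state, t = result
        trajectory.append((state, t))


# Hypothetical model: a generator emits a job every 2 time units, a passive
# counter counts the jobs, and an unconnected clock ticks every 5 time units.
components = {
    "gen": {"int": lambda s: s, "ext": lambda s, e, x: s,
            "out": lambda s: "job", "ta": lambda s: 2.0},
    "counter": {"int": lambda s: s, "ext": lambda s, e, x: s + 1,
                "out": lambda s: None, "ta": lambda s: math.inf},
    "clock": {"int": lambda s: s, "ext": lambda s, e, x: s,
              "out": lambda s: "tick", "ta": lambda s: 5.0},
}
influences = {"gen": {"counter"}}
x0 = {"gen": ("on", 0.0), "counter": (0, 0.0), "clock": ("on", 0.0)}
trajectory = simulate(components, influences, x0, t_end=6.0)
print([t for _, t in trajectory])  # [0.0, 2.0, 4.0, 5.0, 6.0]
```

The printed times are irregularly spaced, and at t = 5.0 the generator carries an elapsed time of 1.0 because only the clock made a transition at that instant.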

2.2 Measurement model

The (discrete time) measurement model relates noisy observations to the system state:

m_{k} = g_{k} (s_{k}) + ε_{k}, k = 1, 2, \dots

(7)

where $ε_{k}$ is the measurement noise. Notice that the measurement process is assumed to feed data at fixed times, that is, every $Δ T$ time units, and therefore the time of the measurement process can be represented as an integer k (the corresponding simulation time is $t = k Δ T$ ; see Figure 3). The state points in the discrete event state process can also be indexed by an integer $\tilde{k}$ (see Section 2.1.3), but since these state points lie irregularly on the continuous axis, we need to explicitly represent the time instants (i.e., $t_{\tilde{k}}$ , which is a continuous variable) when the system transfers to these states.

Figure 3.

Time representation of the discrete event state process (each black dot indicates a state update) and the (discrete time) measurement process (each black dot represents the arrival of a measurement).

In a discrete event simulation, the state values are only updated when events happen. As shown in Figure 1, if we directly retrieve the model state at a time instant t, the retrieved value will be a combination of sequential states of all atomic components, which were updated at past time instants. If these retrieved states (updated at past time instants) were used for estimation, inaccurate estimation results would be obtained. This will incur the state retrieval problem, as introduced in Section 1.2. To get an updated (thus more accurate) state value at a time instant t, we need to consider the time elapsed since the model transfers to the current (sequential) state as well. Therefore, we introduce an interpolation operation to obtain the updated state value, which infers the state value at a time instant t based on the states lying around that time (i.e., neighborhood of t). How many states are involved in the interpolation is determined by the interpolation method we use. In the measurement model, the time is represented by an integer k; therefore, we define how to obtain the state value at time k (i.e., $k Δ T$ ) given the integer indexed state process $x_{\tilde{k}}$ . To this end, we first define a neighborhood of states around time k:

x_{N_{k}} = \{x_{\tilde{k}}, \tilde{k} \in N_{k} (x_{0 : \infty})\}

where $x_{0 : \infty} = \{x_{i}, i = 0, 1, 2, \dots\}$ is a sequence of state points defined in Equation (3); $N_{k} (x_{0 : \infty})$ defines a set of indexes of states that are required for the interpolation operation in order to compute the state at time k. For example, in Figure 3, if we use linear interpolation, $N_{k} (x_{0 : \infty}) = \{\tilde{k} - 1, \tilde{k}\}$ . Then we can compute an updated state by interpolation: ${\hat{s}}_{k} = \text{interpolate} (x_{N_{k}})$ . Based on ${\hat{s}}_{k}$ , we can now formalize the measurement model between ${\hat{s}}_{k}$ and $m_{k}$ :

m_{k} ~ p (m_{k} | {\hat{s}}_{k}) = p (m_{k} | x_{N_{k}})

(8)

which is just a reformulation of Equation (7).
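The interpolation operation can be sketched for a single state variable as follows. This is an assumed, minimal form: a continuous variable (for example, a position) is interpolated linearly between the two neighboring state points, and a discrete variable is returned together with its elapsed time. The function names and the example values are hypothetical.

```python
from bisect import bisect_right


def neighborhood(times, t):
    """Indexes of the last state point at or before t and the first one after t.

    This plays the role of N_k for linear interpolation. Both indexes coincide
    when t lies at or beyond the last recorded state point, or before the first.
    """
    hi = bisect_right(times, t)
    lo = max(hi - 1, 0)
    return lo, min(hi, len(times) - 1)


def interpolate_linear(times, values, t):
    """Updated value of a continuous state variable at time t."""
    lo, hi = neighborhood(times, t)
    if lo == hi:
        return float(values[lo])
    frac = (t - times[lo]) / (times[hi] - times[lo])
    return values[lo] + frac * (values[hi] - values[lo])


def retrieve_discrete(times, values, t):
    """Discrete state at time t together with the time elapsed in that state."""
    lo, _ = neighborhood(times, t)
    return values[lo], t - times[lo]


times = [0.0, 4.0, 10.0]       # instants of the state updates
position = [0.0, 8.0, 8.0]     # a continuous variable recorded at those instants
status = ["idle", "busy", "idle"]

print(interpolate_linear(times, position, 3.0))  # 6.0
print(retrieve_discrete(times, status, 6.0))     # ('busy', 2.0)
```

Retrieving the position at t = 3.0 without interpolation would return the stale value 0.0 recorded at t = 0.0, which is the state retrieval problem described above.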

In this research, we want to generalize the measurement model to include situations where measurements are dependent on the state trajectory, that is, the history of state transitions, which means that $m_{k}$ will contain observations that are distributed over the last measurement interval $[(k - 1) Δ T, k Δ T]$ . This assumption holds in many applications, such as vehicle passages (event data) collected at a loop detector during 1 minute.⁴ In this case, the measurement $m_{k}$ is not only related to a specific state at a time instant, but also related to a sequence of states over a period of time. Therefore, we define a generalized form of the measurement model:

m_{k} ~ p (m_{k} | x_{N_{k - 1}^{+} + 1 : N_{k}^{+}})

(9)

where $N_{k}^{+} = \max \{i \in N_{k}\}$ , and $x_{N_{k - 1}^{+} + 1 : N_{k}^{+}}$ represents a sequence of states that are indexed from $N_{k - 1}^{+} + 1$ to $N_{k}^{+}$ .

2.3 State estimation using particle filters

2.3.1 Principles of particle filters

A general discrete time state evolution can be expressed by the following:

s_{k} = f_{k} (s_{k - 1}) + ν_{k - 1}, k = 1, 2, \dots

where $f_{k}$ is a possibly nonlinear function of the state $s_{k - 1}$ and $ν_{k - 1}$ is the process noise. The measurement at time k is given by the following:

m_{k} = g_{k} (s_{k}) + ε_{k}, k = 1, 2, \dots

where $g_{k}$ is possibly a nonlinear function that maps the state to the measurement and $ε_{k}$ is the measurement noise.

The objective of the particle filter is to estimate the conditional distribution of all states up to time k given all available measurements up to k, that is, $p (s_{0 : k} | m_{1 : k})$ , where $s_{0 : k} = {s_{i}, i = 0, 1, 2, \dots, k}$ , $m_{1 : k} = {m_{j}, j = 1, 2, \dots, k}$ . Since an analytic solution of $p (s_{0 : k} | m_{1 : k})$ is usually intractable, we generate a set of Monte Carlo samples (particles) with their associated weights to approximate this posterior distribution. If the number of particles is sufficiently large, the posterior can be approximated to an arbitrary accuracy.^10,14 With this sample of particles all relevant statistical moments can be obtained using standard Monte Carlo integration techniques.

Let $χ_{k} = {s_{0 : k}^{i}, w_{k}^{i}}_{i = 1}^{N_{p}}$ represent a random measure that characterizes the posterior distribution $p (s_{0 : k} | m_{1 : k})$ , where ${s_{0 : k}^{i}}_{i = 1}^{N_{p}}$ is a set of support points (particles) and ${w_{k}^{i}}_{i = 1}^{N_{p}}$ the set of associated weights. Then $p (s_{0 : k} | m_{1 : k})$ can be approximated as follows:

p (s_{0 : k} | m_{1 : k}) \approx \sum_{i = 1}^{N_{p}} w_{k}^{i} δ (s_{0 : k} - s_{0 : k}^{i})

(10)

where $δ (\cdot)$ is the Dirac delta function. A very important concept in particle filtering is the principle of importance sampling. If we can generate the particles ${s_{0 : k}^{i}}_{i = 1}^{N_{p}}$ from $p (s_{0 : k} | m_{1 : k})$ , each of them will be assigned a weight equal to $1 / N_{p}$ . However, direct sampling from $p (s_{0 : k} | m_{1 : k})$ is usually intractable. An alternative (i.e., importance sampling) is to generate the particles from a distribution $q (s_{0 : k} | m_{1 : k})$ , known as importance density,^10,14 and assign weights according to the following:

w_{k}^{i} = \frac{p (s_{0 : k}^{i} | m_{1 : k})}{q (s_{0 : k}^{i} | m_{1 : k})}
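The weighting rule can be tried on a one-dimensional toy problem. In the sketch below, the target p is a normal density with mean 1 and standard deviation 1, and the importance density q is a wider normal density with mean 0 and standard deviation 2. Both choices are hypothetical and serve only to show that samples from q, weighted by p/q and normalized, recover an expectation under p.

```python
import math
import random


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def importance_mean(n, rng):
    """Self-normalized importance sampling estimate of the mean under p."""
    xs = [rng.gauss(0.0, 2.0) for _ in range(n)]                           # draw from q
    ws = [normal_pdf(x, 1.0, 1.0) / normal_pdf(x, 0.0, 2.0) for x in xs]   # w = p / q
    total = sum(ws)
    return sum(w * x for w, x in zip(ws, xs)) / total


estimate = importance_mean(20000, random.Random(0))
print(estimate)  # close to the target mean of 1
```

Normalizing the weights by their sum means that p and q only need to be known up to a constant factor, which is why the proportionality signs in the later weight updates are sufficient.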

Based on Bayes’ theorem, $p (s_{0 : k} | m_{1 : k})$ can be expressed as $p (s_{0 : k} | m_{1 : k}) = \frac{p (s_{0 : k}) p (m_{1 : k} | s_{0 : k})}{p (m_{1 : k})}$ . Similarly, we have $p (s_{0 : k - 1} | m_{1 : k - 1}) = \frac{p (s_{0 : k - 1}) p (m_{1 : k - 1} | s_{0 : k - 1})}{p (m_{1 : k - 1})}$ . Therefore, we can obtain a sequential update equation as follows:

\begin{array}{l} p (s_{0 : k} | m_{1 : k}) = \frac{p (m_{k} | s_{k}) p (s_{k} | s_{k - 1}) p (s_{0 : k - 1} | m_{1 : k - 1})}{p (m_{k} | m_{1 : k - 1})} \\ \propto p (m_{k} | s_{k}) p (s_{k} | s_{k - 1}) p (s_{0 : k - 1} | m_{1 : k - 1}) . \end{array}

(11)
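Equation (11) relies on two assumptions that are implicit in the state and measurement equations above: the state process is first-order Markov, and the measurement $m_{k}$ depends only on the current state $s_{k}$ . A sketch of the intermediate steps, using Bayes' theorem conditioned on $m_{1 : k - 1}$ :

\begin{array}{l} p (s_{0 : k} | m_{1 : k}) = \frac{p (m_{k} | s_{0 : k}, m_{1 : k - 1}) p (s_{0 : k} | m_{1 : k - 1})}{p (m_{k} | m_{1 : k - 1})} \\ = \frac{p (m_{k} | s_{0 : k}, m_{1 : k - 1}) p (s_{k} | s_{0 : k - 1}, m_{1 : k - 1}) p (s_{0 : k - 1} | m_{1 : k - 1})}{p (m_{k} | m_{1 : k - 1})} \\ = \frac{p (m_{k} | s_{k}) p (s_{k} | s_{k - 1}) p (s_{0 : k - 1} | m_{1 : k - 1})}{p (m_{k} | m_{1 : k - 1})} \end{array}

The last line substitutes $p (m_{k} | s_{0 : k}, m_{1 : k - 1}) = p (m_{k} | s_{k})$ and $p (s_{k} | s_{0 : k - 1}, m_{1 : k - 1}) = p (s_{k} | s_{k - 1})$ .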

In the case that the importance density is chosen to factorize such that $q (s_{0 : k} | m_{1 : k}) = q (s_{k} | s_{0 : k - 1}, m_{1 : k}) q (s_{0 : k - 1} | m_{1 : k - 1})$ , the random measure $χ_{k - 1} = {s_{0 : k - 1}^{i}, w_{k - 1}^{i}}_{i = 1}^{N_{p}}$ can be updated sequentially whenever new measurements $m_{k}$ become available. The procedure then becomes the following:

obtain samples $s_{0 : k}^{i} ~ q (s_{0 : k} | m_{1 : k})$ by augmenting samples from the previous time step $s_{0 : k - 1}^{i} ~ q (s_{0 : k - 1} | m_{1 : k - 1})$ with the new state $s_{k}^{i} ~ q (s_{k} | s_{0 : k - 1}^{i}, m_{1 : k})$ ;

update weights by the following:

\begin{array}{l} w_{k}^{i} = \frac{p (s_{0 : k}^{i} | m_{1 : k})}{q (s_{0 : k}^{i} | m_{1 : k})} \propto \frac{p (m_{k} | s_{k}^{i}) p (s_{k}^{i} | s_{k - 1}^{i}) p (s_{0 : k - 1}^{i} | m_{1 : k - 1})}{q (s_{k}^{i} | s_{0 : k - 1}^{i}, m_{1 : k}) q (s_{0 : k - 1}^{i} | m_{1 : k - 1})} \\ = \frac{p (m_{k} | s_{k}^{i}) p (s_{k}^{i} | s_{k - 1}^{i})}{q (s_{k}^{i} | s_{0 : k - 1}^{i}, m_{1 : k})} w_{k - 1}^{i} \end{array}

If we assume that $q (s_{k} | s_{0 : k - 1}, m_{1 : k}) = q (s_{k} | s_{k - 1}, m_{k})$ , that is, the importance density depends on $s_{k - 1}$ and $m_{k}$ only, we have the following:

w_{k}^{i} \propto \frac{p (m_{k} | s_{k}^{i}) p (s_{k}^{i} | s_{k - 1}^{i})}{q (s_{k}^{i} | s_{k - 1}^{i}, m_{k})} w_{k - 1}^{i}

(12)

A pragmatic choice for the importance density is the system transition density, that is, $q (s_{k} | s_{k - 1}, m_{k}) = p (s_{k} | s_{k - 1})$ . As a result, Equation (12) simplifies to the following:

w_{k}^{i} \propto p (m_{k} | s_{k}^{i}) w_{k - 1}^{i}

(13)
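A single update with the transition density as importance density, that is, Equation (13), can be sketched as below. The function names, the Gaussian likelihood, and the random-walk transition are illustrative assumptions; any transition sampler and likelihood with the same call signatures would fit.

```python
import math
import random


def gaussian_likelihood(m, s, sd=1.0):
    """Unnormalized p(m | s) for a measurement equal to the state plus Gaussian noise."""
    return math.exp(-0.5 * ((m - s) / sd) ** 2)


def random_walk(s, rng, sd=0.5):
    """Draw the next state from p(s_k | s_{k-1}) for a random-walk model."""
    return s + rng.gauss(0.0, sd)


def sis_step(particles, weights, transition, likelihood, m_k, rng):
    """Propagate each particle, then reweight by the likelihood of m_k."""
    new_particles = [transition(s, rng) for s in particles]
    unnormalized = [w * likelihood(m_k, s) for w, s in zip(weights, new_particles)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("all particle weights are zero")
    return new_particles, [u / total for u in unnormalized]


rng = random.Random(1)
particles = [rng.gauss(0.0, 1.0) for _ in range(500)]
weights = [1.0 / len(particles)] * len(particles)
particles, weights = sis_step(particles, weights, random_walk,
                              gaussian_likelihood, m_k=0.8, rng=rng)
posterior_mean = sum(w * s for w, s in zip(weights, particles))
```

Because the weights are normalized after each step, the likelihood only needs to be known up to a constant factor.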

A major problem of particle filters is that the discrete random measure degenerates quickly.^10,14 In other words, most particles except for a few are assigned negligible weights. The solution is to resample the particles after they are updated. Different resampling algorithms and methods exist to determine when resampling is necessary.^10,14,25 A simple and often adopted resampling method is to replicate particles in proportion to their weights. It has been shown that a sufficiently large number of particles are able to converge to the true posterior distribution even in nonlinear, non-Gaussian dynamic systems.^10,14
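The simple resampling scheme mentioned above (replicating particles in proportion to their weights) can be sketched as multinomial resampling. The trigger used here, an effective sample size below half the number of particles, is a common heuristic from the particle filtering literature and is an assumption of this sketch rather than a choice stated in the text.

```python
import random


def effective_sample_size(weights):
    """1 / sum(w_i^2) for normalized weights; equals N when all weights are equal."""
    return 1.0 / sum(w * w for w in weights)


def resample(particles, weights, rng):
    """Draw N particles with probability proportional to weight; reset weights to 1/N."""
    n = len(particles)
    chosen = rng.choices(particles, weights=weights, k=n)
    return chosen, [1.0 / n] * n


def maybe_resample(particles, weights, rng, threshold=0.5):
    """Resample only when the random measure has degenerated."""
    if effective_sample_size(weights) < threshold * len(particles):
        return resample(particles, weights, rng)
    return particles, weights


rng = random.Random(0)
particles, weights = maybe_resample(["a", "b", "c", "d"], [0.97, 0.01, 0.01, 0.01], rng)
```

Other schemes, such as systematic or stratified resampling, draw the same expected number of copies of each particle with lower variance and are often preferred in practice.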

2.3.2 Application in discrete event simulations

Consider a discrete event system with sensors deployed to monitor its operation. The measurement fed at time k, that is, $m_{k}$ , contains the partial observations of the system collected during the last measurement interval $[(k - 1) Δ T, k Δ T]$ . We are interested in the conditional distribution of the state trajectory $x_{0 : N_{k}^{+}}$ , given all measurements, that is, $p (x_{0 : N_{k}^{+}} | m_{1 : k})$ . Based on Bayes’ theorem, $p (x_{0 : N_{k}^{+}} | m_{1 : k})$ can be expressed as $p (x_{0 : N_{k}^{+}} | m_{1 : k}) = \frac{p (x_{0 : N_{k}^{+}}) p (m_{1 : k} | x_{0 : N_{k}^{+}})}{p (m_{1 : k})}$ . Similarly, we have $p (x_{0 : N_{k - 1}^{+}} | m_{1 : k - 1}) = \frac{p (x_{0 : N_{k - 1}^{+}}) p (m_{1 : k - 1} | x_{0 : N_{k - 1}^{+}})}{p (m_{1 : k - 1})}$ . Therefore, we have the following:

\frac{p (x_{0 : N_{k}^{+}} | m_{1 : k})}{p (x_{0 : N_{k - 1}^{+}} | m_{1 : k - 1})} = \frac{p (m_{k} | x_{N_{k - 1}^{+} + 1 : N_{k}^{+}}) p (x_{N_{k - 1}^{+} + 1 : N_{k}^{+}} | x_{N_{k - 1}^{+}})}{p (m_{k} | m_{1 : k - 1})}

Consequently, we can obtain a sequential update equation:

\begin{matrix} p (x_{0 : N_{k}^{+}} | m_{1 : k}) & = \frac{p (m_{k} | x_{N_{k - 1}^{+} + 1 : N_{k}^{+}}) p (x_{N_{k - 1}^{+} + 1 : N_{k}^{+}} | x_{N_{k - 1}^{+}})}{p (m_{k} | m_{1 : k - 1})} \\ \times p (x_{0 : N_{k - 1}^{+}} | m_{1 : k - 1}) \\ \propto p (m_{k} | x_{N_{k - 1}^{+} + 1 : N_{k}^{+}}) p (x_{N_{k - 1}^{+} + 1 : N_{k}^{+}} | x_{N_{k - 1}^{+}}) \\ \times p (x_{0 : N_{k - 1}^{+}} | m_{1 : k - 1}) \end{matrix}

(14)

This sequential update equation is similar in form to that in Equation (11), but an important difference is that $N_{k}^{+}$ is a random variable, which means that the dimension of $x_{0 : N_{k}^{+}}$ , that is, the number of state points in $x_{0 : N_{k}^{+}}$ , is also random. Because of this variable dimension, the standard sequential importance sampling algorithm (see Section 2.3.1) cannot be applied directly.^18,19

In Godsill et al.,¹⁹ the authors proposed a solution to the variable dimension problem. Instead of estimating $p (x_{0 : N_{k}^{+}} | m_{1 : k})$ directly, they estimate $p (x_{0 : K} | m_{1 : k})$ , where $x_{0 : K}$ consists of two segments: $x_{0 : N_{k}^{+}}$ (our interest) and $x_{N_{k}^{+} + 1 : K}$ (extension). We say that the neighborhood $x_{N_{k}}$ is complete if it contains all state points that are required for interpolation at time k, and K is a sufficiently large constant integer such that $x_{N_{k}}$ is complete for every k. Since $x_{0 : K}$ has fixed dimension, the standard sequential importance sampling algorithm can be applied. Once samples from the joint distribution $p (x_{0 : K} | m_{1 : k})$ are available, samples from its marginal $p (x_{0 : N_{k}^{+}} | m_{1 : k})$ can be obtained from the original joint samples by simply discarding the components (i.e., $x_{N_{k}^{+} + 1 : K}$ ) that are not of interest and retaining the original weights. Finally, the weight is updated by the following:

\begin{matrix} w_{k} & = \frac{p (x_{0 : K} | m_{1 : k})}{q (x_{0 : K} | m_{1 : k})} \\ \propto \frac{p (m_{k} | x_{N_{k - 1}^{+} + 1 : N_{k}^{+}}) p (x_{N_{k - 1}^{+} + 1 : N_{k}^{+}} | x_{N_{k - 1}^{+}})}{q (x_{N_{k - 1}^{+} + 1 : N_{k}^{+}} | x_{0 : N_{k - 1}^{+}}, m_{1 : k})} \times w_{k - 1} \end{matrix}

(15)

where $q (\cdot)$ is the importance density. The weight update is independent of states $x_{N_{k}^{+} + 1 : K}$ and, as a result, the extension $x_{N_{k}^{+} + 1 : K}$ is never generated in practice. A more detailed proof can be found in Godsill et al.¹⁹ and Godsill and Vermaak.¹⁸

Suppose we have a large number $N_{p}$ of weighted samples $χ_{k - 1} = {x_{0 : N_{k - 1}^{+}}^{i}, w_{k - 1}^{i}}_{i = 1}^{N_{p}}$ , which approximate the posterior distribution $p (x_{0 : N_{k - 1}^{+}} | m_{1 : k - 1})$ at the previous time step; when new measurement $m_{k}$ is available, samples $χ_{k} = {x_{0 : N_{k}^{+}}^{i}, w_{k}^{i}}_{i = 1}^{N_{p}}$ , which approximate the posterior distribution $p (x_{0 : N_{k}^{+}} | m_{1 : k})$ at time k, can be obtained by Algorithm 1.

Algorithm 1. A generic particle filter for discrete event simulations.
1. % initialization of particles at $k = 0$
2. for $i = 1 : N_{p}$ do
3. generate the i-th sample $x_{0}^{i} = (s_{0}^{i}, t_{0}^{i})$ , where $s_{0}^{i} ~ p (s_{0})$ ( $p (s_{0})$ is the probability distribution of the initial state), and $t_{0}^{i} = 0$
4. set weight $w_{0}^{i} = 1 / N_{p}$
5. end
6. % the sampling step for any time $k \geq 1$
7. for $i = 1 : N_{p}$ do
8. sample particles according to the importance density $q (\cdot)$ :
• set $j = N_{k - 1}^{+ i}$
• while $N_{k}^{i}$ is incomplete:
- set $j = j + 1$
- sample $x_{j}^{i} ~ q (x_{j} | x_{0 : j - 1}^{i}, m_{1 : k})$
• set $N_{k}^{+ i} = j$ , and append the newly generated states to particle: $x_{0 : N_{k}^{+}}^{i} = (x_{0 : N_{k - 1}^{+}}^{i}, x_{N_{k - 1}^{+} + 1 : N_{k}^{+}}^{i})$ , where $N_{0}^{+} \equiv 0$
• update weight:
$w_{k}^{i} \propto \frac{p (m_{k} | x_{N_{k - 1}^{+} + 1 : N_{k}^{+}}^{i}) p (x_{N_{k - 1}^{+} + 1 : N_{k}^{+}}^{i} | x_{N_{k - 1}^{+}}^{i})}{q (x_{N_{k - 1}^{+} + 1 : N_{k}^{+}}^{i} | x_{0 : N_{k - 1}^{+}}^{i}, m_{1 : k})} \times w_{k - 1}^{i}$
9. end
10. normalize the weights, such that $\sum_{i = 1}^{N_{p}} w_{k}^{i} = 1$
11. % the resampling step
12. resample particles ${x_{0 : N_{k}^{+}}^{i}, w_{k}^{i}}_{i = 1}^{N_{p}}$ based on the chosen resampling method, which can be found in Douc et al.²⁵
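
The sampling loop in line 8 of Algorithm 1 can be illustrated with a small sketch. It assumes, purely for illustration, that the neighborhood is complete as soon as one state point lies at or beyond the target time (which is what linear interpolation between two state points would require); the names `extend_until_complete` and `make_toy_sampler` are hypothetical, and the toy sampler merely stands in for a draw from the importance density.

```python
import random

def extend_until_complete(trajectory, t_target, sample_next):
    """Append state points (state, time) to `trajectory` in place until the
    last point has time >= t_target. Returns the newly generated points.

    Assumption: 'complete' means one state point at or beyond t_target exists.
    """
    new_points = []
    while trajectory[-1][1] < t_target:
        nxt = sample_next(trajectory[-1])
        trajectory.append(nxt)
        new_points.append(nxt)
    return new_points

def make_toy_sampler(seed):
    """Toy stand-in for sampling x_j ~ q(x_j | ...): random positive time steps."""
    rng = random.Random(seed)
    def sample_next(point):
        state, t = point
        return (state + 1, t + rng.uniform(0.5, 2.0))
    return sample_next
```

Note that the number of generated points differs from particle to particle, which is exactly the variable dimension discussed above.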

2.4 Practical remarks

2.4.1 The sampling procedure

As shown in Algorithm 1, once $N_{k}^{+}$ is complete, one can stop generating new state points. This stopping condition is quite straightforward to check in simple models, for example, equation-based models. However, in discrete event simulations that involve a large number of interacting components, this stopping condition is not easy to capture since the model is separated from its simulator. One possible solution is to put a little more effort into modeling by adding certain attributes that make the interpolation operation conducted at a time instant independent of the states beyond that time. This solution is reasonable since causality should be obeyed in the modeling process, which means that the current state should not be influenced by events that will happen in the future. For example, in the gold mine case that will be studied in subsequent sections, we have a speed attribute for moving entities; as a consequence, when we need to get an entity position at a time instant, we only need the last updated state (which contains speed and location) and the elapsed time to perform linear interpolation and obtain the updated entity position.

The two state generation processes are compared in Figure 4. The blue and red dots represent state points in a discrete event state process. Specifically, the blue dots represent state points generated in the $(k - 1)$ -th data assimilation iteration, while the red dots are generated in the k-th iteration. Suppose now we need to obtain the state value at time instant $k Δ T$ using linear interpolation. In the state generation process in Algorithm 1 (Figure 4(a)), the discrete event simulation needs to generate one more state point beyond time instant $k Δ T$ to apply linear interpolation. In contrast, if the interpolation operation at a time instant is independent of state points beyond that time (Figure 4(b)), we can simply stop the simulation at time instant $k Δ T$ , since we only need one state point that lies on the left-hand side of $k Δ T$ and the elapsed time to fulfill linear interpolation. The benefit of the state generation process in Figure 4(b) is that we do not need to check the stopping condition any more, and we can simply stop state generation (e.g., the simulation execution) at time instant $k Δ T$ and all information is already sufficient for interpolation. In follow-on iterations, new states will then be generated from the interpolated state. In such a case, the sequential update rule in Equation (14) will be simplified to the following:

\begin{matrix} p (x_{0 : N_{k}^{+}} | m_{1 : k}) = p (s_{0 : k} | m_{1 : k}) \\ \propto p (m_{k} | s_{k - 1 : k}) p (s_{k - 1 : k} | {\hat{s}}_{k - 1}) p (s_{0 : k - 1} | m_{1 : k - 1}) \end{matrix}

(16)

where the partial state trajectory $s_{k - 1 : k}$ and the full state trajectory $s_{0 : k}$ are defined as follows:

\begin{matrix} s_{k - 1 : k} = {s_{\tilde{k}} | x_{\tilde{k}} = (s_{\tilde{k}}, t_{\tilde{k}}) \cap (k - 1) Δ T \leq t_{\tilde{k}} \leq k Δ T} \cup {{\hat{s}}_{k}} \\ s_{0 : k} = s_{0 : k - 1} \cup s_{k - 1 : k} \end{matrix}

(17)

The weight update in Equation (15) will thus be modified to the following:

w_{k} = \frac{p (x_{0 : K} | m_{1 : k})}{q (x_{0 : K} | m_{1 : k})} \propto \frac{p (m_{k} | s_{k - 1 : k}) p (s_{k - 1 : k} | {\hat{s}}_{k - 1})}{q (s_{k - 1 : k} | s_{0 : k - 1}, m_{1 : k})} \times w_{k - 1}

(18)

Figure 4.

The state points generation process (the blue (generated in the $(k - 1)$ -th iteration) and red (generated in the k-th iteration) dots represent state points in a discrete event state process, while the green dots represent interpolated state points; color online only).

2.4.2 Generating initial particles

Generating initial particles boils down to generating initial model states. For a discrete event simulation model, we cannot generate its initial state arbitrarily (i.e., we cannot generate the initial state of each atomic model independently), since an arbitrary combination (of atomic model states) might be infeasible in reality. For example, in the gold mine case that will be studied in subsequent sections, if we generate initial states arbitrarily, we might generate a system state which indicates that the miner is drilling while no trucks are present. Therefore, initial states should be generated from a set of feasible combinations of atomic model states. Suppose the state of an atomic model can be represented as $s = {p, θ}$ , where p denotes the phase and $θ$ indicates the corresponding parameters (state variables). Note that $θ$ can be a combination of discrete and continuous variables. Let $FS \subseteq \times_{i \in D} P_{i}$ denote a set of feasible combinations of phases of atomic components, where D is the set of names of components of the discrete event model (i.e., a coupled DEVS model) and $P_{i}$ is the set of possible phases of component i. We denote the combination of initial phases of all atomic components as a random variable $P_{0}$ , and it should take value from FS. Since $P_{0}$ is a discrete random variable, we formalize its probability distribution as follows:

P (P_{0} = p_{0}^{j}) = p_{j}, p_{0}^{j} \in FS, p_{j} \in (0, 1)

(19)

and $\sum_{j = 1}^{| FS |} p_{j} = 1$ . Notice that $p_{0}^{j} = (\dots, p_{0, i}^{j}, \dots), i \in D, p_{0, i}^{j} \in P_{i}$ . Based on this discrete probability distribution, generating an initial model state is done as follows.

Generate a feasible combination of initial phases of all atomic components, $p_{0}^{j} = (\dots, p_{0, i}^{j}, \dots) \in FS, i \in D, p_{0, i}^{j} \in P_{i}$ , by sampling the discrete probability distribution $P (P_{0})$ .

For each atomic component $i \in D$ , its initial phase is $p_{0, i}^{j}$ , and we now need to generate values for its corresponding parameters $θ_{0, i}^{j}$ :

- For a discrete variable in $θ_{0, i}^{j}$ , its value can be generated by sampling a suitable discrete probability distribution.

- For a continuous variable in $θ_{0, i}^{j}$ , its value can be generated by sampling a suitable continuous probability distribution. For example, in the gold mine case that will be studied in Section 3, if the initial phase of the elevator is GO_DOWN_EMPTY, one of its continuous parameters, pos (the position of the elevator; see the more detailed definition in Table 1), can be generated by sampling a Uniform distribution $U (bottom, top)$ , where bottom and top represent the bottom position and top position of the vertical elevator shaft, respectively.

Table 1.

State variables of key components in the Discrete Event System Specification gold mine model.

Component type	Phases	Parameters	Description
Miner	TRANSIENT_PHASE	serving_truck	The truck that is being loaded
	HAVE_REQUEST
	DRILLING
Truck	TRAVEL_TO_MINER
	TRAVEL_TO_ELEVATOR	pos	The position of the truck
	TRANSIENT_PHASE	v	The velocity of the truck
	WAITING
Elevator	IDLE_AT_TOP
	GO_DOWN_EMPTY
	TRANSIENT_PHASE	pos	The position of the elevator
	HAVE_REQUEST	v	The velocity of the elevator
	UNLOAD_TRUCK_AT_BOTTOM	serving_truck	The truck that is being unloaded
	GO_UP_WITH_ORE	hasUnprocessedRequest	If there is any unprocessed request from miner
	UNLOAD_ORE_AT_TOP

Then the initial state of atomic component i can be represented as $s_{0, i}^{j} = {p_{0, i}^{j}, θ_{0, i}^{j}}$ and the initial state of the coupled model can be represented as $s_{0}^{j} = (\dots, s_{0, i}^{j}, \dots), i \in D$ .

Once we have generated the initial state $s_{0, i}^{j}$ for atomic component $i \in D$ , we can compute its time advance, which is denoted as $ta (s_{0, i}^{j})$ . For each atomic component i, we can simply set its elapsed time $e_{i} = 0$ .

Finally, combining the generated state $s_{0, i}^{j}$ , the time advance $ta (s_{0, i}^{j})$ , and the elapsed time $e_{i}$ for each atomic component $i \in D$ , we can initialize the discrete event simulation model.
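
The procedure above can be sketched in a few lines. The feasible set `FS`, the probabilities, the shaft coordinates, and all function names below are hypothetical placeholders chosen for illustration (only two components are listed), and the computation of the time advance is omitted because it depends on the model.

```python
import random

# Hypothetical feasible set FS: each entry maps component name -> initial phase.
FS = [
    {"Miner": "TRANSIENT_PHASE", "Elevator": "IDLE_AT_TOP"},
    {"Miner": "TRANSIENT_PHASE", "Elevator": "GO_DOWN_EMPTY"},
]
PROBS = [0.5, 0.5]          # p_j in Equation (19); must sum to 1
BOTTOM, TOP = -100.0, 0.0   # assumed shaft coordinates in metres

def sample_parameters(component, phase, rng):
    """Sample the parameters theta for one component, given its phase."""
    if component == "Elevator":
        if phase == "GO_DOWN_EMPTY":
            return {"pos": rng.uniform(BOTTOM, TOP)}  # continuous variable
        return {"pos": TOP}
    return {}

def generate_initial_state(rng):
    """Sample a feasible phase combination, then the parameters per component."""
    phases = rng.choices(FS, weights=PROBS, k=1)[0]
    state = {}
    for name, phase in phases.items():
        theta = sample_parameters(name, phase, rng)
        # The elapsed time is set to 0 for every component.
        state[name] = {"phase": phase, "theta": theta, "elapsed": 0.0}
    return state
```

Sampling the phase combination jointly from `FS`, rather than per component, is what rules out infeasible combinations such as a drilling miner without a truck.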

3 Case study – estimating truck arrivals in a gold mine system

In this section and subsequent sections, we study a case in a gold mine system, to illustrate the working of the particle filter-based data assimilation framework introduced in Section 2. In this section, we focus on how to tailor the generic data assimilation framework to the specific estimation problem in the gold mine system.

3.1 Scenario description

A gold mine system is shown in Figure 5, and its operation is based on the coordination among miners, two trucks, and an elevator.

Miners drill at the mine shaft end, and they can only drill when an empty truck is present. The loading time varies considerably: creating a full truckload takes at least 15 min and at most 30 min.

Two trucks are available to transport ore; each truck travels 250/3 m/min when full through the mine shaft, and 500/3 m/min when empty. The current mine shaft is 400 m long.

An elevator can take a batch of gold ore up. The depth of the elevator shaft is 100 m; it takes the elevator 8 min to go up with ore and 3 min to go down empty.

When a truck is full, the miners ask the elevator to come down, so it will be at the bottom of the vertical shaft when the full truck arrives. When a truck of ore arrives at the bottom of the vertical shaft, it needs to be unloaded from the truck before the elevator can go up. Unloading takes between 5 and 10 min. After that, the elevator can go up, and the truck can go back. Unloading at the top of the vertical shaft takes between 2 and 4 min before the load can be put on a 100-m long conveyor belt that transports the gold ore to a processing plant. The conveyor belt has a speed of 10 m/min.
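
To give a feeling for the time scales involved, the deterministic travel times implied by the figures above can be computed directly. The sketch below assumes that a truck traverses the full 400 m of the mine shaft on each trip; the variable names are illustrative.

```python
from fractions import Fraction

SHAFT_LENGTH_M = 400
FULL_SPEED = Fraction(250, 3)    # m/min, loaded truck
EMPTY_SPEED = Fraction(500, 3)   # m/min, empty truck
BELT_LENGTH_M = 100
BELT_SPEED = 10                  # m/min

full_trip = SHAFT_LENGTH_M / FULL_SPEED         # minutes, shaft end -> elevator
empty_trip = SHAFT_LENGTH_M / EMPTY_SPEED       # minutes, elevator -> shaft end
belt_trip = Fraction(BELT_LENGTH_M, BELT_SPEED) # minutes on the conveyor belt

print(float(full_trip), float(empty_trip), float(belt_trip))  # 4.8 2.4 10.0
```

These travel times are short compared with the 15 to 30 min needed to load a truck, so loading is the slowest single activity in the cycle.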

Figure 5.

The gold mine system. (Color online only.)

The gold mine is monitored by multiple sensors, which can provide partial observations of the gold mine system (the available data will be explained in detail in Section 3.4). The question is whether, given these partial observations, we can estimate when the trucks arrive at the bottom of the vertical shaft. The arrival information is important for efficient operation of the elevator, which may improve the overall performance of the gold mine system.

3.2 Modeling the gold mine system in the DEVS formalism

The scenario described in Section 3.1 is a typical discrete event system, and therefore we model it using the DEVS formalism,⁸ as shown in Figure 6. Notice that the gold mine simulation model has no external inputs. We model each component as a set of phases,²⁶ where each phase has a name and a life time: the name indicates the activity that the component is undergoing, and the life time tells how long the component will stay in that phase. The phases and associated parameters (i.e., state variables) of several key components (i.e., Miner, Truck, and Elevator) are listed in Table 1, while other components (such as Queue, Conveyor, Observer) are quite simple, and therefore we do not describe them in detail due to space limitations.

Figure 6.

The Discrete Event System Specification model of the gold mine system. FIFO: first-in, first-out. (Color online only.)

As shown in Table 1, each component has a transient phase, that is, TRANSIENT_PHASE, which has zero length of life time and is used to request resources or jobs. For example, when Miner finishes drilling and loading, it will first make a transition from DRILLING to TRANSIENT_PHASE; since TRANSIENT_PHASE has zero length of life time, a message is immediately sent to TruckQueueShaftEnd to say that Miner is idle and can drill and load other trucks if there are any; then Miner transfers to HAVE_REQUEST (i.e., idle) to wait for new trucks. Truck and Elevator work in a similar way. The elevator and the trucks are assumed to move at constant speed (although this is not realistic).

The unloading times at the bottom and the top of the vertical shaft are modeled as Uniform distribution $U (5.0, 10.0)$ and Uniform distribution $U (2.0, 4.0)$ , respectively. The drilling time of the Miner is modeled as a Triangular distribution with varying modes (shown in Figure 7). The purpose of varying modes is to simulate miners’ tiredness: the longer the miners have been working, the longer they take to load a truck. In the beginning ( $t = t_{s}$ ), the mode $c = c_{t_{s}}$ ; while in the end ( $t = t_{e}$ ), the mode will increase to $c = c_{t_{e}}$ ; for any time instants $t_{1}, t_{2} \in (t_{s}, t_{e})$ , if $t_{1} < t_{2}$ , we have $c_{t_{1}} < c_{t_{2}}$ . In our simulation, the run length is 480; therefore, we set $a = 15, b = 30, t_{s} = 0, t_{e} = 480, c_{t_{s}} = a + \frac{1}{4} (b - a), c_{t_{e}} = a + \frac{3}{4} (b - a)$ ; for any $t \in (t_{s}, t_{e})$ , we have $c_{t} = a + (\frac{1}{4} + \frac{1}{2} \times \frac{t - t_{s}}{t_{e} - t_{s}}) \times (b - a)$ . The unit of time is minutes.

Figure 7.

Triangular distribution with varying modes.
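
The time-varying mode and the resulting drilling time can be sketched as follows, using the parameter values given above. The function names are hypothetical, and the standard-library triangular sampler is used here only as a convenient stand-in for whatever sampler the simulation actually employs.

```python
import random

A, B = 15.0, 30.0              # minimum and maximum drilling time (min)
T_START, T_END = 0.0, 480.0    # start and end of the simulation run (min)

def mode_at(t):
    """Mode c_t of the Triangular distribution at simulation time t."""
    frac = (t - T_START) / (T_END - T_START)
    return A + (0.25 + 0.5 * frac) * (B - A)

def sample_drilling_time(t, rng=random):
    """Draw one drilling time (min) for a truck whose loading starts at time t."""
    return rng.triangular(A, B, mode_at(t))
```

With these values the mode moves linearly from 18.75 min at the start of the run to 26.25 min at the end.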

We denote the set of component names as D = {TruckQueueShaftEnd, TruckQueueElevatorBottom, Miner, Truck_0, Truck_1, Elevator, Conveyor, Observer}. For any component $i \in D$ , the (sequential) state of component i can be represented as $s_{i} = {p_{i}, θ_{i}}$ , where $p_{i}$ is the phase (name) and $θ_{i}$ is the corresponding state parameters (variables). Consequently, the sequential state of the gold mine model can be represented as follows:

s = (\dots, (s_{i}, e_{i}), \dots) \in S = \times_{i \in D} Q_{i}

(20)

where $Q_{i} = {(s_{i}, e_{i}) | s_{i} \in S_{i}, 0 \leq e_{i} \leq t a_{i} (s_{i})}$ . Based on the derivation shown in Section 2.1.3, we can easily formalize the state evolution of the gold mine model as an integer indexed state process (i.e., the system model of the gold mine system):

\begin{matrix} x_{\tilde{k}} = ((\dots, (s_{i, {\tilde{k}}_{i}}, e_{i, {\tilde{k}}_{i}}), \dots), t_{\tilde{k}}), i \in D \\ x_{\tilde{k} + 1} = GoldMineSim (x_{\tilde{k}}) + ν_{\tilde{k}}, \tilde{k} = 0, 1, 2, \dots \end{matrix}

(21)

where GoldMineSim is the (discrete event) gold mine simulation model and $ν_{\tilde{k}}$ is the system noise, such as position uncertainty incurred by small deviations in speed.

3.3 Interpolation operation

In this section, we introduce the interpolation method used in our gold mine case, and show the difference between the simulated state trajectory and the interpolated state trajectory. Considering that discrete state variables cannot be interpolated, we distinguish continuous states from discrete states as shown in Figure 1.

3.3.1 Continuous state

Continuous states can be interpolated. We take the elevator as an example, whose (sequential) state is represented as $s = (phase, pos, v)$ (the component index is omitted here), where phase is the phase name and pos and v are its position and velocity, respectively. Although the state contains a string-type variable (phase name), we still consider it as a continuous state since our focus is the elevator’s movement.

As introduced in Section 3.2, the elevator moves with constant speed. Therefore, we use linear interpolation to update the elevator’s state. Suppose that the last state update was at time $t_{l}$ due to the occurrence of an internal or external event, and the state was updated to $s (t_{l}) = (phase_{l}, pos_{l}, v_{l})$ ; in that event handler, the next state update was scheduled at time $t_{n}$ , i.e., $ta (s_{l}) = t_{n} - t_{l}$ . Since we have velocity in the state definition, we can obtain the updated state at time $t \in (t_{l}, t_{n})$ based on the state at $t_{l}$ and the elapsed time e:

\begin{matrix} \hat{s} (t) = interpolate (s (t_{l}), e), \text{ where} \\ \begin{cases} phase_{t} = phase_{l} \\ pos_{t} = pos_{l} + v_{l} \times e = pos_{l} + v_{l} \times (t - t_{l}) \\ v_{t} = v_{l} \end{cases} \end{matrix}

(22)

which is independent of the states beyond time t.
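
Equation (22) translates almost literally into code. The following is a minimal sketch; the class name `ElevatorState`, the sign convention for the velocity, and the numbers in the usage note are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ElevatorState:
    phase: str   # phase name, carried over unchanged
    pos: float   # position (m) at the last state update
    v: float     # signed velocity (m/min), constant within a phase

def interpolate(state, elapsed):
    """Equation (22): advance the position by v * e; phase and v stay the same."""
    if elapsed < 0:
        raise ValueError("elapsed time must be non-negative")
    return ElevatorState(state.phase, state.pos + state.v * elapsed, state.v)
```

For instance, a state `ElevatorState("GO_DOWN_EMPTY", 0.0, -20.0)` queried with an elapsed time of 2.5 min yields a position of -50 m, using only information available at the last state update.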

3.3.2 Discrete state

Discrete states cannot be interpolated. For example, the (sequential) state of the miner is $s = (phase, serving_truck)$ , where phase is the phase name and serving_truck is the name of the truck that is being loaded. Suppose that the last state update was at time $t_{l}$ , and the state was updated to $s (t_{l})$ ; in that event handler, the next state update was scheduled at time $t_{n}$ . Since the discrete state cannot be interpolated, the interpolation operation gives the following:

\hat{s} (t) = interpolate (s (t_{l}), e) = (s (t_{l}), e)

(23)

where the elapsed time $e = t - t_{l}$ . We still denote $(s (t_{l}), e)$ as $\hat{s} (t)$ , that is, $(s (t_{l}), e)$ is equivalent to those continuous states that can be interpolated (e.g., Equation (22)). Since $s (t_{l})$ cannot be interpolated, we need an elapsed time e to reflect the state evolution. If the measurement is related to the discrete state, one probably needs the elapsed time to define a measurement model that relates the discrete state to the measurement.

3.3.3 Interpolated state

Suppose that the (sequential) state of the coupled model at time instant $t_{l}$ is $s (t_{l}) = (\dots, (s_{i}, e_{i}), \dots), i \in D$ , and $ta (s (t_{l})) = \min {σ_{i} = t a_{i} (s_{i}) - e_{i}, i \in D}$ . At any time $t \in (t_{l}, t_{l} + ta (s (t_{l})))$ , the interpolated state can be represented as follows:

\begin{matrix} \hat{s} (t) = interpolate (s (t_{l}), e) = (\dots, (s'_{i}, e'_{i}), \dots), \text{ where} \\ (s'_{i}, e'_{i}) = \begin{cases} (interpolate (s_{i}, e_{i}), 0) & \text{if } s_{i} \text{ can be interpolated (see Equation (22))} \\ (s_{i}, e_{i} + t - t_{l}) & \text{if } s_{i} \text{ cannot be interpolated (see Equation (23))} \end{cases} \end{matrix}

(24)

Notice that the time advance of state $interpolate (s_{i}, e_{i})$ will be $ta (s_{i}) - e_{i}$ . In Section 2.2, when computing ${\hat{s}}_{k}, k = 1, 2, \dots$ , we essentially compute ${\hat{s}}_{k} = \hat{s} (k Δ T)$ based on Equation (24).

3.3.4 Simulated state trajectory versus interpolated state trajectory

In this section, we show the difference between the simulated state trajectory and the interpolated state trajectory. We take the state of the elevator in terms of position as an example. As shown in Figure 8, the positions of the elevator in the discrete event simulation are captured in blue, while the interpolated state trajectory is depicted in red. Since states only change when events occur, the simulated state trajectory of the elevator in terms of position is a piecewise constant curve, while the interpolated state trajectory is a piecewise linear curve since the velocity is constant and we adopt the linear interpolation method. Note that the piecewise constant segments between the elevator top and the elevator bottom in Figure 8 are the result of the elevator processing external events, for example, when miners ask the elevator to come down.

Figure 8.

The state trajectory of the elevator in terms of position (the piecewise constant segments between the elevator top and the elevator bottom are the result of the elevator processing external events; each black triangle represents a time instant when a noisy observation of the elevator position is available; color online only).

As explained in the previous section, the elevator moves with constant speed. Therefore, the true state trajectory of the elevator in terms of position is also a piecewise linear curve, which overlaps the interpolated state trajectory. In this specific case, the state trajectory resulting from interpolation is equivalent to the one obtained by simulating the continuous state variable (the position of the elevator) using Generalized Discrete Event Specification (GDEVS)²⁷ with the degree of the polynomial equal to 1. Notice that if the elevator has a different speed profile, for example, accelerate–constant speed–decelerate, the true state trajectory in terms of position and the interpolated state trajectory will no longer overlap. From Figure 8, we can clearly see that if we retrieve the state of a discrete event simulation model without interpolation, the retrieved state is only a past state that was updated at a past time instant, which cannot reflect the real-time evolution of the state; therefore, errors would be incurred if the outdated states are used for estimation. This will be shown in Section 5.

3.4 Available data and measurement model

The simulated data is generated by running the gold mine simulation (Section 3.2) for 480 min. During the run, all events are recorded; the states of the elevator and the trucks are sampled (using interpolation) and recorded very densely (every 0.01 min) in order to obtain their detailed evolutions; the data recorded for the elevator and the trucks includes phase names and their real-time positions. This ground-truth data is then processed as follows.

We extract the event sequence that only contains the following types of events (as shown in Figure 5): trucks arriving at the shaft end (Truck_Arrived_ShaftEnd); the elevator arriving at the top or the bottom of the vertical shaft (Elevator_Arrived_Top, Elevator_Arrived_Bottom); and a batch of ore arriving at the plant (Ore_Arrived_Plant). This event sequence is partial, but accurate (i.e., no missed events, and occurrence times are accurate).

We add Gaussian noise to the positions of the elevator and the trucks, respectively; specifically, we add noise drawn from $N (0, σ_{e}^{2})$ for the elevator, and add noise drawn from $N (0, σ_{t}^{2})$ for the trucks.

The noisy dataset is used for data assimilation, and we set the measurement interval to $Δ T = 30$ min. The measurement at time k is denoted as $m_{k}^{o}$ , which contains the following noisy data collected during $[(k - 1) Δ T, k Δ T]$ :

Event sequence $E_{k} = {(t_{1}, e_{1}), (t_{2}, e_{2}), \dots, (t_{n}, e_{n})}, (k - 1) Δ T \leq t_{1} \leq t_{2} \leq \dots \leq t_{n} \leq k Δ T; e_{i} \in$ {Truck_Arrived_ShaftEnd, Elevator_Arrived_Top, Elevator_Arrived_Bottom, Ore_Arrived_Plant}.

$P X_{k} = {(phase^{j} (t_{j}), pos^{j} (t_{j})) | j \in {Elevator, Truck_0, Truck_1}, t_{j} \in [(k - 1) Δ T, k Δ T]}$ , which represents the phase and position of the elevator and the trucks, where $phase^{j} (t_{j})$ indicates the name of the phase of component j at time $t_{j}$ , while $pos^{j} (t_{j})$ is the noisy position of component j at time $t_{j}$ . Notice that during $[(k - 1) Δ T, k Δ T]$ , there is only one observation for each component in ${Elevator, Truck_0, Truck_1}$ ; the times of observation for different components are not necessarily the same. As shown in Figure 8, the black triangles represent the time instants when noisy observations from the elevator are available. These observation times are randomly chosen, but in order to illustrate the effect of interpolation, we choose time instants when the component (either the elevator or the trucks) is moving, since when a component is still, its position does not change and interpolation makes no difference.

To summarize, the measurement available at time k can be represented as follows:

m_{k}^{o} = {E_{k}, P X_{k}}

(25)

and the measurement model can be formalized as follows:

m_{k}^{o} ~ p (m_{k}^{o} | x_{N_{k - 1}^{+} + 1 : N_{k}^{+}})

where $x_{\tilde{k}} = (s_{\tilde{k}}, t_{\tilde{k}}), \tilde{k} = 0, 1, 2, \dots$ is defined in Equation (21). As introduced in Section 3.3, the interpolation operation is independent of states beyond the time instant when the operation is invoked; therefore, the measurement model can be modified to the following:

m_{k}^{o} ~ p (m_{k}^{o} | s_{k - 1 : k})

(26)

where $s_{k - 1 : k} = {s_{\tilde{k}} | x_{\tilde{k}} = (s_{\tilde{k}}, t_{\tilde{k}}) \cap (k - 1) Δ T \leq t_{\tilde{k}} \leq k Δ T} \cup {{\hat{s}}_{k}}$ , and ${\hat{s}}_{k}$ is computed based on Equation (24) ( ${\hat{s}}_{k} = \hat{s} (k Δ T)$ ).
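
Because the position noise is Gaussian, the contribution of a single position observation to the measurement model can be illustrated as a Gaussian density centered on the interpolated position. The sketch below only illustrates this idea under that assumption; it is not the weight computation of Section 3.5.3, and the function names and numbers are hypothetical.

```python
import math

def gaussian_pdf(x, mean, sigma):
    """Density of N(mean, sigma^2) evaluated at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def position_likelihood(observed_pos, last_pos, velocity, elapsed, sigma):
    """Score a noisy position observation against the interpolated position.

    The predicted position follows Equation (22): last_pos + velocity * elapsed.
    Passing elapsed = 0 corresponds to using the outdated, non-interpolated state.
    """
    predicted = last_pos + velocity * elapsed
    return gaussian_pdf(observed_pos, predicted, sigma)
```

If a component has moved since its last state update, evaluating the density with `elapsed = 0` can assign a very small likelihood to a particle whose underlying trajectory is in fact correct, which is why the elapsed time matters when relating states to measurements.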

3.5 Estimating truck arrivals using particle filters

Having formalized the system model (Section 3.2) and the measurement model (Section 3.4), in this section, we implement (on the algorithmic level) the particle filtering framework (Section 2) in the (discrete event) gold mine simulation to illustrate the working of the framework by estimating the truck arrivals at the bottom of the vertical shaft.

3.5.1 Particle filtering for truck arrivals estimation

Algorithm 2 describes in detail how the generic particle filter shown in Algorithm 1 is applied in the specific gold mine case to fulfill the truck arrival estimation task. Since the interpolation operation at any time instant t is independent of states beyond that time, the formalization of Algorithm 2 is focused on system states $s_{\tilde{k}}$ , where $x_{\tilde{k}} = (s_{\tilde{k}}, t_{\tilde{k}})$ . The main steps of the proposed algorithm are summarized below.

Initialization. In the initialization step (lines 2–5 in Algorithm 1), the i-th sample $x_{0}^{i}$ is actually a guess of possible initial states (i.e., $s_{0}^{i}$ ) of the gold mine model. The process of generating initial particles is detailed in Section 3.5.2.

Sampling. In this case, we adopt the system transition density (a reformulation of $GoldMineSim (\cdot)$ in Equation (21)) as the importance density. Therefore, generating state points is done by running the gold mine simulation (line 8 in Algorithm 2). Since the interpolation operation at a time instant t is independent of state points beyond that time (see the explanation in Section 3.3), we just stop the simulation at time $t = k Δ T$ , and then update the weight of the particle based on newly available data $m_{k}^{o}$ (line 9 in Algorithm 2); detailed computation of the weight is presented in Section 3.5.3.

Resampling. To solve the degeneracy problem, we resample the particles using the standard resampling scheme, which samples particles in proportion to their weights.

Estimation. We scan the state trajectory $s_{k - 1 : k}^{i}$ and record the time instants when event Truck_Arrived_ElevatorBottom occurs. Each particle gives an estimation of the truck arrival, and estimations from all particles will form a distribution of truck arrival. These (raw) estimations will be processed to give more informative results in Section 5.

Algorithm 2. The particle filter for truck arrival estimation.
1. % initialization of particles at $k = 0$
2. for $i = 1 : N_{p}$ do
3. generate the i-th sample $x_{0}^{i} = (s_{0}^{i}, t_{0}^{i})$ where $t_{0}^{i} = 0$
4. set weight $w_{0}^{i} = 1 / N_{p}$
5. end
6. % the sampling step for any time $k \geq 1$
7. for $i = 1 : N_{p}$ do
8. run the gold mine simulation to time $t = k Δ T$ with initial state ${\hat{s}}_{k - 1}^{i}$ , where ${\hat{s}}_{k - 1}^{i}$ is obtained based on Equation (24) ( $t = (k - 1) Δ T$ ); the newly generated partial state trajectory is $s_{k - 1 : k}^{i} = {s_{\tilde{k}}^{i} | x_{\tilde{k}}^{i} = (s_{\tilde{k}}^{i}, t_{\tilde{k}}^{i}), (k - 1) Δ T \leq t_{\tilde{k}}^{i} \leq k Δ T} \cup {{\hat{s}}_{k}^{i}}$ ; the full state trajectory is thus updated to $s_{0 : k}^{i} = (s_{0 : k - 1}^{i}, s_{k - 1 : k}^{i})$
9. compute weight: $w_{k}^{i} = p (m_{k}^{o} | s_{k - 1 : k}^{i}) \times w_{k - 1}^{i}$
10. end
11. normalize the weights, denote them as ${s_{0 : k}^{i}, w_{k}^{i}}_{i = 1}^{N_{p}}$
12. % the resampling step
13. resample ${s_{0 : k}^{i}, w_{k}^{i}}_{i = 1}^{N_{p}}$ using the standard resampling method, which samples particles in proportion to their weights; the resampled results are again denoted as ${s_{0 : k}^{i}, w_{k}^{i}}_{i = 1}^{N_{p}}$
14. for $i = 1 : N_{p}$ do
15. $w_{k}^{i} = 1 / N_{p}$
16. end
17. % record data for estimation
18. for $i = 1 : N_{p}$ do
19. scan $s_{k - 1 : k}^{i}$ , and record the time instants when event Truck_Arrived_ElevatorBottom occurs
20. end
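
The sampling, normalization, and resampling steps of Algorithm 2 (lines 7–16) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the case study: `assimilation_step`, `toy_simulate`, and `toy_likelihood` are hypothetical names, the toy model is a scalar random walk rather than the gold mine simulation, and resampling is done with plain multinomial sampling.

```python
import math
import random


def assimilation_step(particles, weights, simulate, likelihood, observation, rng):
    """One pass through the sampling, normalization and resampling steps."""
    # Sampling: propagate each particle with the simulator (the transition
    # density acts as importance density), then multiply its weight by the
    # likelihood of the new observation.
    propagated = [simulate(p, rng) for p in particles]
    new_weights = [w * likelihood(observation, p)
                   for w, p in zip(weights, propagated)]

    # Normalization of the weights.
    total = sum(new_weights)
    if total == 0.0:
        raise ValueError("all particle weights are zero")
    new_weights = [w / total for w in new_weights]

    # Resampling in proportion to the weights, then reset to uniform weights.
    n = len(propagated)
    resampled = rng.choices(propagated, weights=new_weights, k=n)
    return resampled, [1.0 / n] * n


# Toy stand-ins (hypothetical; they do NOT model the gold mine).
def toy_simulate(state, rng):
    return state + rng.gauss(0.0, 0.5)


def toy_likelihood(obs, state, sigma=1.0):
    return math.exp(-0.5 * ((obs - state) / sigma) ** 2)


rng = random.Random(0)
n_p = 500
particles = [rng.gauss(0.0, 5.0) for _ in range(n_p)]
weights = [1.0 / n_p] * n_p
particles, weights = assimilation_step(
    particles, weights, toy_simulate, toy_likelihood, 3.0, rng)
posterior_mean = sum(particles) / n_p
```

In the gold mine case, `simulate` would correspond to running the simulation from the interpolated state up to $t = k Δ T$ and `likelihood` to the weight computation of Section 3.5.3.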

3.5.2 Generating initial particles

In this case study, initial particles are generated based on the procedure introduced in Section 2.4.2. For illustration purposes, we only enumerate two feasible combinations of phases, which are listed in Table 2, although there are many more feasible choices. We assume $P (p_{0}^{1}) = P (p_{0}^{2}) = 0.5$ . Note that we assume the maximum speed of the elevator is 200/3 m/min; we generate values of pos (the position) and v (the speed) for the elevator by sampling from uniform distributions, and the time advance can thus be computed as $(100 + pos) / v$ min (see the last row in Table 2). The other atomic components in the gold mine model, that is, TruckQueueShaftEnd, TruckQueueElevatorBottom, Conveyor, and Observer, are initialized as passive (i.e., their time advance is $+ \infty$ ).

Table 2.

Initial states of the gold mine simulation model.

Elements in FS	Component name	Phase	Parameters	Value	Time advance (min)
$p_{0}^{1}$	Miner	TRANSIENT_PHASE	serving_truck	NULL	0
	Truck_0	TRANSIENT_PHASE	pos	0 m	0
			v	0 m/min
	Truck_1	TRANSIENT_PHASE	pos	0 m	0
			v	0 m/min
	Elevator	IDLE_AT_TOP	pos	0 m	$+ \infty$
			v	0 m/min
			serving_truck	NULL
			hasUnprocessedRequest	NULL
$p_{0}^{2}$	Miner	TRANSIENT_PHASE	serving_truck	NULL	0
	Truck_0	TRANSIENT_PHASE	pos	0 m	0
			v	0 m/min
	Truck_1	TRANSIENT_PHASE	pos	0 m	0
			v	0 m/min
	Elevator	GO_DOWN_EMPTY	pos	$pos \sim U (- 100, 0)$ m	$(100 + pos) / v$
			v	$v \sim U (0, 200 / 3)$ m/min
			serving_truck	NULL
			hasUnprocessedRequest	NULL
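
As a rough illustration of how the Elevator part of an initial particle could be drawn from the two rows of Table 2, consider the following sketch. The function name `sample_initial_elevator` and the dictionary layout are hypothetical; the guard against a zero speed is our own assumption, since the time advance $(100 + pos) / v$ is undefined for $v = 0$.

```python
import math
import random

MAX_ELEVATOR_SPEED = 200.0 / 3.0  # m/min, as assumed in the text


def sample_initial_elevator(rng):
    """Draw the Elevator state of one initial particle (Table 2)."""
    if rng.random() < 0.5:
        # p_0^1: the elevator is idle at the top and therefore passive.
        return {"phase": "IDLE_AT_TOP", "pos": 0.0, "v": 0.0,
                "time_advance": math.inf}
    # p_0^2: the elevator is on its way down without a truck.
    pos = rng.uniform(-100.0, 0.0)              # position in m
    v = rng.uniform(0.0, MAX_ELEVATOR_SPEED)    # speed in m/min
    # Remaining distance to the bottom (-100 m) divided by the speed.
    # Assumption: treat v == 0 as "never arrives" to avoid dividing by zero.
    time_advance = (100.0 + pos) / v if v > 0.0 else math.inf
    return {"phase": "GO_DOWN_EMPTY", "pos": pos, "v": v,
            "time_advance": time_advance}
```

Calling `sample_initial_elevator(random.Random(1))` repeatedly yields a mix of both phase combinations with roughly equal frequency.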

3.5.3 Weight computation

In this section, we detail how the weight is computed, that is, how we evaluate $w_{k}^{i} = p (m_{k}^{o} \mid s_{k - 1 : k}^{i}) \times w_{k - 1}^{i}$ . The measurement at time step k is $m_{k}^{o} = \{E_{k}, PX_{k}\}$ , where $E_{k}$ is the observed event sequence during time interval $[(k - 1) Δ T, k Δ T]$ and $PX_{k} = \{(phase^{j} (t_{j}), pos^{j} (t_{j})) \mid j \in \{Elevator, Truck\_0, Truck\_1\}, t_{j} \in [(k - 1) Δ T, k Δ T]\}$ represents phase and position observations from the elevator and the trucks. Since the two types of observations are conditionally independent given $s_{k - 1 : k}^{i}$ , we have $p (m_{k}^{o} \mid s_{k - 1 : k}^{i}) = p (E_{k} \mid s_{k - 1 : k}^{i}) \, p (PX_{k} \mid s_{k - 1 : k}^{i})$ .

3.5.3.1 Event sequences

Given state points $s_{k - 1 : k}^{i}$ , it is straightforward to retrieve an event sequence that only contains the four types of events shown in Figure 5 (i.e., the types of observed events). We denote the event sequence retrieved from the i-th particle as $E_{k}^{i}$ , so that $p (E_{k} | s_{k - 1 : k}^{i}) = p (E_{k} | E_{k}^{i})$ . In what follows, we first define a distance measure between two event sequences and then use it to define $p (E_{k} | E_{k}^{i})$ .

An event can be modeled as a two-tuple $(t, e)$ , where e is the event type and t is the occurrence time. An event sequence S is an ordered sequence of events:

S = \{(t_{1}, e_{1}), (t_{2}, e_{2}), \dots, (t_{n}, e_{n})\}, \quad t_{1} \leq t_{2} \leq \dots \leq t_{n}

We adopt the edit distance²⁸ to define the “distance” between two event sequences. The edit distance is defined as “the amount of work that has to be done to convert one sequence to another,” and the amount of work is quantified by a set of transformation operations and their associated costs (more details can be found in Mannila and Ronkainen²⁸). Suppose $O = \{o_{1}, o_{2}, \dots, o_{n}\}$ is an operation sequence that transforms S to T, and the cost of O is defined as follows:

c (O) = \sum_{i = 1}^{n} c (o_{i})

then the edit distance between event sequence S and event sequence T is defined as the minimum cost that is needed to transform S to T, that is:

d (S, T) = \min \{c (O_{j}) \mid O_{j}\}

where $O_{j}$ is an arbitrary operation sequence that transforms S to T.

With the distance between two event sequences in hand, we define $p (E_{k} | E_{k}^{i})$ as follows:

p (E_{k} | s_{k - 1 : k}^{i}) = p (E_{k} | E_{k}^{i}) = e^{- \frac{d (E_{k}, E_{k}^{i})}{d_{m}}}

(27)

where $d_{m} = d (E_{k}, \emptyset)$ .
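
To make Equation (27) concrete, the sketch below computes an edit distance with dynamic programming and turns it into a likelihood. The operation set and costs are assumptions chosen for illustration (unit cost for inserting or deleting an event, and a cost proportional to the time shift for moving an event of the same type); the text does not state the costs used in the case study. The handling of an empty observed sequence, for which $d_{m} = 0$ , is likewise our own choice, and `Elevator_Requested` is a placeholder event name.

```python
import math


def edit_distance(s, t, move_cost_per_min=0.1, indel_cost=1.0):
    """Minimum cost of turning event sequence s into t.

    Both sequences are time-ordered lists of (time, event_type) tuples.
    """
    n, m = len(s), len(t)
    # r[i][j]: cost of transforming the first i events of s
    # into the first j events of t.
    r = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        r[i][0] = i * indel_cost          # delete everything
    for j in range(1, m + 1):
        r[0][j] = j * indel_cost          # insert everything
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(r[i - 1][j], r[i][j - 1]) + indel_cost
            (ts, es), (tt, et) = s[i - 1], t[j - 1]
            if es == et:
                # Align two events of the same type by shifting one in time.
                best = min(best,
                           r[i - 1][j - 1] + move_cost_per_min * abs(ts - tt))
            r[i][j] = best
    return r[n][m]


def event_likelihood(observed, simulated):
    """p(E_k | E_k^i) = exp(-d(E_k, E_k^i) / d_m) with d_m = d(E_k, empty)."""
    d = edit_distance(observed, simulated)
    d_m = edit_distance(observed, [])
    if d_m == 0.0:
        # Assumption: with no observed events the formula is undefined, so
        # only a particle that also produced no events gets full likelihood.
        return 1.0 if d == 0.0 else 0.0
    return math.exp(-d / d_m)


observed = [(1.0, "Truck_Arrived_ElevatorBottom"), (5.0, "Elevator_Requested")]
shifted = [(2.0, "Truck_Arrived_ElevatorBottom"), (5.0, "Elevator_Requested")]
```

With these costs, a particle that reproduces the observed sequence exactly receives likelihood 1, and the likelihood decays as events have to be moved, inserted, or deleted.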

3.5.3.2 Phases and positions

Given state points $s_{k - 1 : k}^{i}$ , we can straightforwardly obtain the phase (name) and position of any component at any time based on interpolation explained in Section 3.3. We denote such phase and position pairs for entities in $D_{c}$ = {Elevator, Truck_0, Truck_1} as ${PX}_{k}^{i}$ , then $p (P X_{k} | s_{k - 1 : k}^{i}) = p (P X_{k} | {PX}_{k}^{i})$ .

For phase and position data, we need to consider them as a whole. For example, assume that the observation from the elevator is $\{\text{GO\_DOWN\_EMPTY}, -10\}$ ; in the first particle, we have $\{\text{GO\_DOWN\_EMPTY}, -10\}$ , while in the second particle, we have $\{\text{GO\_UP\_WITH\_ORE}, -10\}$ . Clearly, the first particle should be assigned a larger weight than the second one, given the observation. However, if we ignore the phase difference, we cannot differentiate the two particles. Therefore, we propose a phase match method to define a distance measure for phases.

The phase match method works as follows. Suppose the phase is represented as $\{p_{i}, θ_{i}\}$ , where $p_{i}$ is the name of the phase and $θ_{i}$ denotes the corresponding parameters. The distance between phases is defined based on the phase transition graph shown in Figure 9. The phase transition graph is a simplified version of the model of the corresponding component. For convenience, we assume that one of the two phases we want to compare has index 0 and the other has index n; their distance is then defined as follows:

d (0, n) = \min \left\{\sum_{i = 0}^{n - 1} d (i, i + 1), \sum_{i = n}^{N - 1} d (i, i + 1) + d (N, 0)\right\}

where $d (i, j)$ is the distance between phase i and phase j. The distance function can be defined in many ways, for example, we can define $d (i, i + 1)$ as the time that the system stays in phase i before it makes a transition to phase $i + 1$ . In our case, we choose a simple distance function as $d (i, i + 1) = 1$ .
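
The distance above can be sketched as follows, assuming (as the formula suggests) that the phases form a single cycle $0 \to 1 \to \dots \to N \to 0$ . The function name `phase_distance` and the representation of the graph as a list of edge costs are hypothetical; the generalization to two arbitrary indices is simply a relabeling of the formula.

```python
def phase_distance(a, b, edge_costs):
    """Distance between phases a and b on a cyclic phase transition graph.

    edge_costs[i] holds d(i, i + 1); the last entry holds d(N, 0).
    """
    if a == b:
        return 0.0
    lo, hi = min(a, b), max(a, b)
    # Path lo -> lo + 1 -> ... -> hi.
    forward = sum(edge_costs[lo:hi])
    # Path hi -> ... -> N -> 0 -> ... -> lo (wrapping around the cycle).
    backward = sum(edge_costs[hi:]) + sum(edge_costs[:lo])
    return min(forward, backward)


# Six phases with the simple choice d(i, i + 1) = 1 used in the text.
unit_costs = [1.0] * 6
```

For example, with six phases and unit costs, phases 0 and 5 are adjacent through the wrap-around edge, so their distance is 1 rather than 5.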

Figure 9.

The phase transition graph.

In our case, the parameter is the position with Gaussian noise, and therefore we define $p (P X_{k} | {PX}_{k}^{i})$ as follows:

\begin{array}{l} p (PX_{k} \mid s_{k - 1 : k}^{i}) = p (PX_{k} \mid PX_{k}^{i}) \\ = \prod_{j \in D_{c}} p (\{phase^{j}, pos^{j}\} \mid \{phase^{i, j}, pos^{i, j}\}) \end{array}

(28)

where $D_{c}$ = {Elevator, Truck_0, Truck_1}; $p (\{phase^{j}, pos^{j}\} \mid \{phase^{i, j}, pos^{i, j}\})$ is defined as follows:

p (\{phase^{j}, pos^{j}\} \mid \{phase^{i, j}, pos^{i, j}\}) = \begin{cases} \max \left\{p_{\min}, \frac{1}{\sqrt{2 π σ_{j}^{2}}} e^{- \frac{(pos^{i, j} - pos^{j})^{2}}{2 σ_{j}^{2}}}\right\} & \text{if } phase^{i, j} = phase^{j} \\ \frac{p_{\min}}{d (phase^{i, j}, phase^{j}) + 1} & \text{if } phase^{i, j} \neq phase^{j} \end{cases}

We argue that the weight of a particle whose phase matches the observed phase (i.e., $phase^{i, j} = phase^{j}$ ) should always be larger than that of a particle whose phase differs from the observed phase (i.e., $phase^{i, j} \neq phase^{j}$ ). Therefore, we define a threshold value $p_{\min}$ to guarantee this: the first branch is never below $p_{\min}$ , whereas the second is at most $p_{\min} / 2$ , because with the unit distance chosen above two different phases are at least a distance of 1 apart.
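
The following sketch evaluates Equation (28) for the elevator example given earlier in this section. The names `component_likelihood` and `px_likelihood`, the value chosen for $p_{\min}$ , and the constant phase distance passed in are all assumptions for illustration only.

```python
import math


def component_likelihood(obs_phase, obs_pos, sim_phase, sim_pos,
                         sigma, p_min, phase_dist):
    """Likelihood of one component's observed (phase, position) pair."""
    if sim_phase == obs_phase:
        gauss = (math.exp(-((sim_pos - obs_pos) ** 2) / (2.0 * sigma ** 2))
                 / math.sqrt(2.0 * math.pi * sigma ** 2))
        return max(p_min, gauss)        # never drops below p_min
    # Different phase: strictly below p_min, shrinking with phase distance.
    return p_min / (phase_dist(sim_phase, obs_phase) + 1.0)


def px_likelihood(observed, simulated, sigmas, p_min, phase_dist):
    """Product over components; both inputs map name -> (phase, pos)."""
    result = 1.0
    for name, (obs_phase, obs_pos) in observed.items():
        sim_phase, sim_pos = simulated[name]
        result *= component_likelihood(obs_phase, obs_pos, sim_phase, sim_pos,
                                       sigmas[name], p_min, phase_dist)
    return result


P_MIN = 1e-6                            # hypothetical threshold value
two_apart = lambda a, b: 2.0            # hypothetical phase distance
observation = {"Elevator": ("GO_DOWN_EMPTY", -10.0)}
particle_1 = {"Elevator": ("GO_DOWN_EMPTY", -10.0)}
particle_2 = {"Elevator": ("GO_UP_WITH_ORE", -10.0)}
sigmas = {"Elevator": 3.0}

w1 = px_likelihood(observation, particle_1, sigmas, P_MIN, two_apart)
w2 = px_likelihood(observation, particle_2, sigmas, P_MIN, two_apart)
```

Here `w1` exceeds `w2` even though both particles report the same position, which is the behavior the phase match method is meant to produce.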

4 Case study of the gold mine system – qualitative analysis

In this section, a qualitative analysis is conducted to compare the estimation results obtained without and with assimilating noisy observations; the objective of this comparison is to demonstrate the need to assimilate observations into discrete event simulations in order to obtain better estimation results.

If we do not assimilate noisy observations, we can run the simulation multiple times with different random seeds to generate data for estimation. We therefore run the gold mine simulation 2000 times with different random seeds and record the time instants when trucks arrive at the bottom of the vertical shaft. The estimation results are shown in Figure 10(a). They show that if no real-time data from the real system are assimilated, the discrepancy between the simulation and the real system grows as time advances. Consequently, the simulation without data assimilation gradually loses its predictive ability. In our example, from $t = 150$ min onwards, the gold mine simulation can no longer provide any useful information about truck arrivals at the bottom of the vertical shaft.

Figure 10.

A general view of the estimation results of truck arrivals at the bottom of the vertical shaft with and without assimilating noisy data (each red triangle represents a truck arrival in ground truth; color online only).

In contrast, we use the same simulation model to assimilate the noisy dataset ( $σ_{e} = 3.0, σ_{t} = 3.0$ ) every $Δ T = 30$ min with 2000 particles to estimate truck arrival times. The estimation results are depicted in Figure 10(b). The results show that if we assimilate noisy observations into the same simulation model using similar effort (i.e., 2000 particles versus 2000 runs), the simulation can provide reasonable estimates of truck arrivals during the whole simulation period (480 min). Therefore, whenever data are available, they should be assimilated into the discrete event simulation in order to obtain better estimates of the variable of interest.

We present the estimation results of truck arrival times at the bottom of the vertical shaft in one time step (i.e., $[(k - 1) Δ T, k Δ T]$ ) in Figure 11. Since the minimal drilling time is 15 min, there are at most two arrivals during one time step of duration $Δ T = 30$ min. Notice that the estimation results actually give a distribution of truck arrival times. In order to know how accurate the estimation results are and also to explore the influences of factors, such as data errors, model errors, and the number of particles employed, in Section 5.2 we define a set of performance indicators and conduct quantitative analysis accordingly.

Figure 11.

Histogram of estimated truck arrival times at the bottom of the vertical shaft during one step $[(k - 1) Δ T, k Δ T]$ , where $Δ T = 30 \min$ (each red triangle represents a truck arrival in ground truth; color online only).

5 Case study of the gold mine system – quantitative analysis

The particle filtering method shown in Algorithm 2 gives us raw estimation results of truck arrivals, which are depicted in Figure 10(b). In this section, we show how these raw data are processed in order to conduct a more informative analysis; based on the processed data, a set of performance indicators is proposed to quantify how accurate the estimation results are; finally, the results computed based on these performance indicators are presented and analyzed.

5.1 Data processing for estimating truck arrival times

As shown in Figure 11(b), the estimated truck arrival times clearly fall into two groups, each of which approximates the distribution of one truck arrival. Therefore, we cluster the estimated arrival times into groups (for example, using the k-means clustering algorithm²⁹), and each group estimates one truck arrival. Suppose that there are m such clusters: $\{C_{c} \mid C_{c} = \{t_{1}^{c}, t_{2}^{c}, \dots, t_{n_{c}}^{c}\}\}_{c = 1}^{m}$ ; based on the data in each cluster, we can fit a probability distribution of truck arrival times by any suitable method. In our case, we fit a kernel distribution using the Normal kernel to the data in each cluster; for example, in Figure 12, we show the kernel distribution fitted to the data belonging to the cluster on the right-hand side in Figure 11(b).
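
For scalar arrival times, the clustering step can be illustrated with a small version of Lloyd's k-means algorithm. This is only a sketch: the function name `kmeans_1d`, the quantile-based initialization, and the sample arrival times are assumptions, and the number of clusters k has to be supplied by the caller.

```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm on scalars; returns k clusters as lists of values."""
    data = sorted(values)
    # Assumption: start from evenly spaced quantiles of the sorted data.
    centers = [data[(2 * c + 1) * len(data) // (2 * k)] for c in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda c: abs(x - centers[c]))
            clusters[nearest].append(x)
        # Update step: move each center to the mean of its cluster.
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return clusters


# Hypothetical raw arrival-time estimates (min) from one assimilation step.
raw_times = [28.4, 15.1, 27.9, 14.8, 28.8, 15.6, 28.2, 15.3]
groups = kmeans_1d(raw_times, k=2)
```

Each returned group then plays the role of one cluster $C_{c}$ to which a distribution is fitted.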

Figure 12.

Fitting a kernel probability distribution using the Normal kernel to truck arrival times in one cluster (this group of data belongs to the cluster on the right-hand side in Figure 11(b); the red triangle represents a truck arrival in ground truth; color online only).

If we denote the probability density function fitted to the data in cluster $C_{c}$ as $f_{c} (t)$ and the corresponding cumulative distribution function as $F_{c} (t)$ , the probability of a truck arriving at the bottom of the vertical shaft during a very small interval $[t - ε, t + ε]$ can be computed as follows:

\text{Prob} (\text{arriving during } [t - ε, t + ε]) = F_{c} (t + ε) - F_{c} (t - ε)

and for convenience, we denote this probability as $P_{c} (t, ε)$ . $P_{c} (t, ε)$ thus represents the probability of a truck arriving at the bottom of the vertical shaft during $[t - ε, t + ε]$ ; the subscript c indicates that the probability is computed from the probability distribution fitted to the data in cluster $C_{c}$ .
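
A minimal sketch of $P_{c} (t, ε)$ for a Normal-kernel density estimate is given below. The bandwidth is not specified in the text, so the normal reference rule $h = 1.06 \, \hat{σ} \, n^{- 1 / 5}$ used here is an assumption, as are the function names and the sample data.

```python
import math
import statistics


def kde_cdf(t, samples, bandwidth):
    """CDF F_c(t) of a Normal-kernel density estimate."""
    return sum(
        0.5 * (1.0 + math.erf((t - x) / (bandwidth * math.sqrt(2.0))))
        for x in samples
    ) / len(samples)


def arrival_probability(t, eps, samples, bandwidth=None):
    """P_c(t, eps) = F_c(t + eps) - F_c(t - eps)."""
    if bandwidth is None:
        # Assumption: normal reference rule of thumb for the bandwidth.
        bandwidth = 1.06 * statistics.stdev(samples) * len(samples) ** (-0.2)
    return (kde_cdf(t + eps, samples, bandwidth)
            - kde_cdf(t - eps, samples, bandwidth))


cluster = [27.9, 28.2, 28.4, 28.8]   # hypothetical arrival times (min)
p_near = arrival_probability(28.3, 0.05, cluster)
p_far = arrival_probability(40.0, 0.05, cluster)
```

As expected, the probability is noticeably larger near the bulk of the cluster than far away from it.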

5.2 Evaluation criteria

Assume that the ground-truth set of truck arrivals is $A = \{t_{1}, t_{2}, \dots, t_{n}\}$ . After data processing, we obtain m clusters $\{C_{c} \mid C_{c} = \{t_{1}^{c}, t_{2}^{c}, \dots, t_{n_{c}}^{c}\}\}_{c = 1}^{m}$ ; from each cluster, we have a fitted probability density function. The format of the ground-truth data and the estimated data is illustrated in Figure 13. The performance indicators are defined as follows.

Figure 13.

Format of the ground-truth data and estimated data. (Color online only.)

For each arrival $t_{i} \in A$ , if there exists a cluster $C_{c_{i}}$ such that:

\frac{P_{c_{i}} (t_{i}, ε)}{\max_{t} P_{c_{i}} (t, ε)} > δ

(29)

we consider the arrival $t_{i}$ to be successfully estimated by $C_{c_{i}}$ . $P_{c_{i}} (t, ε)$ should attain its maximum value (i.e., $\max_{t} P_{c_{i}} (t, ε)$ ) around the time instant when the probability density function $f_{c_{i}} (t)$ reaches its peak; $δ \in [0, 1)$ is a threshold value that we are free to choose, that is, if the probability $P_{c_{i}} (t_{i}, ε)$ is larger than a certain fraction (i.e., $δ$ ) of the maximum probability, we regard the arrival $t_{i}$ as successfully estimated by $C_{c_{i}}$ . For any $t_{i} \in A$ , there is at most one cluster in $\{C_{c} \mid C_{c} = \{t_{1}^{c}, t_{2}^{c}, \dots, t_{n_{c}}^{c}\}\}_{c = 1}^{m}$ that can successfully estimate $t_{i}$ .

Clearly, the more arrivals in A that are successfully estimated, the better the estimation performance. Thus, we define the success rate as follows:

SR = \frac{n_{m}}{n} \times 100\%

(30)

where $n_{m}$ is the number of arrivals in A that are successfully estimated. The best value of SR is $100\%$ , which means that all arrivals are successfully estimated.

If a cluster $C_{c}$ cannot estimate any $t_{i} \in A$ , we regard $C_{c}$ as wasted. Clearly, the lower the number of wasted clusters, the better the estimation performance. Therefore, we define the waste rate as follows:

WR = \frac{| m - n_{m} |}{m} \times 100\%

(31)

The best value of WR is $0\%$ , which means that all clustered groups can be used to estimate a truck arrival.

Suppose that $t_{i} \in A$ is estimated by the cluster $C_{c_{i}}$ ; as shown in Figure 13, we certainly want $t_{i}$ to be as close as possible to the time instant when the probability distribution is peaked. Therefore, we define two measures to quantify such closeness.

Average distance to the time instant when the probability density function is peaked:

\bar{d} = \frac{1}{n_{m}} \sum_{j = i_{1}}^{i_{n_{m}}} | t_{j} - t_{c_{j}}^{*} |

(32)

where $t_{c_{j}}^{*}$ is the time instant when $f_{c_{j}} (t)$ is peaked.

Average percentage that $P_{c_{j}} (t_{j}, ε)$ accounts for $P_{c_{j}} (t_{c_{j}}^{*}, ε)$ :

\begin{array}{l} \bar{P} = 100\% \times \frac{1}{n_{m}} \sum_{j = i_{1}}^{i_{n_{m}}} \frac{P_{c_{j}} (t_{j}, ε)}{\max_{t} P_{c_{j}} (t, ε)} \\ = 100\% \times \frac{1}{n_{m}} \sum_{j = i_{1}}^{i_{n_{m}}} \frac{P_{c_{j}} (t_{j}, ε)}{P_{c_{j}} (t_{c_{j}}^{*}, ε)} \end{array}

(33)

5.3 Results

In this section, we present the estimation results of assimilating the noisy dataset ( $σ_{e} = 3.0, σ_{t} = 3.0$ ) with $N_{p} = 2000$ particles. The model into which we assimilate the noisy data is the same as the one used to generate the simulated data, which means that we use a perfect model of the gold mine system; when retrieving the simulation state at any time t, we use linear interpolation, which was introduced in Section 3.3, to obtain the updated state value.

5.3.1 The estimated truck arrival times

The raw estimation results shown in Figure 10(b) are clustered using the k-means clustering algorithm,²⁹ and the results are shown in Table 3. The k-means clustering algorithm outputs 20 clusters, that is, $\{C_{c}\}_{c = 1}^{20}$ , as shown in the first column of the table; the second column gives the time instant ( $t_{c}^{*}$ ) at which the fitted probability distribution peaks; the third column gives the probability ( $P_{c} (t_{c}^{*}, ε)$ ) that a truck arrives at the bottom of the vertical shaft during $[t_{c}^{*} - ε, t_{c}^{*} + ε]$ . In this dataset, there are 20 arrivals during the simulation period, that is, $A = \{t_{1}, t_{2}, \dots, t_{20}\}$ . The probability $P_{c} (t_{i}, ε), c = 1, 2, \dots, 20; i = 1, 2, \dots, 20$ is computed and presented from the fourth column to the 23rd column. The results show that every arrival lies within some cluster, that is, $\forall t_{i} \in A, \exists C_{c_{i}} \in \{C_{c}\}_{c = 1}^{20} \text{ s.t. } t_{i} \in [\min C_{c_{i}}, \max C_{c_{i}}]$ .

Table 3.

The data assimilation results ( $σ_{e} = 3.0, σ_{t} = 3.0; N_{p} = 2000; ε = 0.05$ min).

Data processing results			Truck arrivals ground truth
Cluster $C_{c}$	$t_{c}^{*}$	$P_{c} (t_{c}^{*}, ε)$	$t_{1}$	$t_{2}$	$t_{3}$	$t_{4}$	$t_{5}$	$t_{6}$	$t_{7}$	$t_{8}$	$t_{9}$	$t_{10}$	$t_{11}$	$t_{12}$	$t_{13}$	$t_{14}$	$t_{15}$	$t_{16}$	$t_{17}$	$t_{18}$	$t_{19}$	$t_{20}$
			26.8918	52.0784	71.6300	96.5060	114.2494	136.1225	154.6024	177.2472	198.4608	219.3735	246.4805	273.5331	296.9505	320.6399	346.2515	373.8774	393.4647	421.3536	439.5858	459.0086
$C_{1}$	26.9086	0.0819	0.0816	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–
$C_{2}$	52.0880	0.0228	–	0.0228	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–
$C_{3}$	71.6378	0.0308	–	–	0.0308	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–
$C_{4}$	96.2479	0.0324	–	–	–	0.0296	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–
$C_{5}$	115.9603	0.0287	–	–	–	–	0.0284	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–
$C_{6}$	136.0023	0.0226	–	–	–	–	–	0.0224	–	–	–	–	–	–	–	–	–	–	–	–	–	–
$C_{7}$	154.6680	0.0283	–	–	–	–	–	–	0.0280	–	–	–	–	–	–	–	–	–	–	–	–	–
$C_{8}$	178.2611	0.0239	–	–	–	–	–	–	–	0.0186	–	–	–	–	–	–	–	–	–	–	–	–
$C_{9}$	198.7183	0.0214	–	–	–	–	–	–	–	–	0.0207	–	–	–	–	–	–	–	–	–	–	–
$C_{10}$	221.2574	0.0235	–	–	–	–	–	–	–	–	–	0.0226	–	–	–	–	–	–	–	–	–	–
$C_{11}$	246.4534	0.0233	–	–	–	–	–	–	–	–	–	–	0.0233	–	–	–	–	–	–	–	–	–
$C_{12}$	273.3759	0.0289	–	–	–	–	–	–	–	–	–	–	–	0.0280	–	–	–	–	–	–	–	–
$C_{13}$	297.0031	0.0253	–	–	–	–	–	–	–	–	–	–	–	–	0.0252	–	–	–	–	–	–	–
$C_{14}$	320.7218	0.0289	–	–	–	–	–	–	–	–	–	–	–	–	–	0.0286	–	–	–	–	–	–
$C_{15}$	346.1563	0.0384	–	–	–	–	–	–	–	–	–	–	–	–	–	–	0.0375	–	–	–	–	–
$C_{16}$	372.1869	0.0309	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	0.0191	–	–	–	–
$C_{17}$	393.3903	0.0307	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	0.0304	–	–	–
$C_{18}$	420.1895	0.0733	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	0.0120	–	–
$C_{19}$	441.2210	0.0196	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	0.0139	–
$C_{20}$	459.8954	0.0105	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	0.0094

We compute the match criterion (see Equation (29)) for each truck arrival in A, and the results are depicted in Figure 14. With threshold value $δ = 50\%$ , 19 of the 20 truck arrivals in $A = \{t_{1}, t_{2}, \dots, t_{20}\}$ are successfully estimated by the clusters $\{C_{c}\}_{c = 1}^{20}$ . Therefore, we have the following:

success rate $SR = \frac{n_{m}}{n} \times 100\% = \frac{19}{20} \times 100\% = 95.00\%$ ;

waste rate $WR = \frac{| m - n_{m} |}{m} \times 100\% = \frac{| 20 - 19 |}{20} \times 100\% = 5.00\%$ ;

average distance $\bar{d} = \frac{1}{n_{m}} \sum_{j = i_{1}}^{i_{n_{m}}} | t_{j} - t_{c_{j}}^{*} | = 0.53$ min;

average percentage $\bar{P} = 100\% \times \frac{1}{n_{m}} \sum_{j = i_{1}}^{i_{n_{m}}} \frac{P_{c_{j}} (t_{j}, ε)}{P_{c_{j}} (t_{c_{j}}^{*}, ε)} = 92.66\%$ .
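
These indicators can be recomputed from the entries of Table 3, as in the sketch below. It pairs each arrival $t_{i}$ with cluster $C_{i}$ (the only non-empty entries in the table) and applies Equations (29)–(33) with $δ = 0.5$ ; the variable names are our own. Because the table entries are rounded to four decimals, the average percentage obtained this way cannot be expected to match the reported 92.66% exactly.

```python
# Ground-truth arrivals t_i and peak times t_c^* (min), from Table 3.
t_true = [26.8918, 52.0784, 71.6300, 96.5060, 114.2494, 136.1225, 154.6024,
          177.2472, 198.4608, 219.3735, 246.4805, 273.5331, 296.9505,
          320.6399, 346.2515, 373.8774, 393.4647, 421.3536, 439.5858,
          459.0086]
t_peak = [26.9086, 52.0880, 71.6378, 96.2479, 115.9603, 136.0023, 154.6680,
          178.2611, 198.7183, 221.2574, 246.4534, 273.3759, 297.0031,
          320.7218, 346.1563, 372.1869, 393.3903, 420.1895, 441.2210,
          459.8954]
# P_c(t_c^*, eps) and P_c(t_i, eps) for the matching cluster, from Table 3.
p_peak = [0.0819, 0.0228, 0.0308, 0.0324, 0.0287, 0.0226, 0.0283, 0.0239,
          0.0214, 0.0235, 0.0233, 0.0289, 0.0253, 0.0289, 0.0384, 0.0309,
          0.0307, 0.0733, 0.0196, 0.0105]
p_true = [0.0816, 0.0228, 0.0308, 0.0296, 0.0284, 0.0224, 0.0280, 0.0186,
          0.0207, 0.0226, 0.0233, 0.0280, 0.0252, 0.0286, 0.0375, 0.0191,
          0.0304, 0.0120, 0.0139, 0.0094]
delta = 0.5

# Match criterion of Equation (29) for every arrival.
ratios = [pt / pp for pt, pp in zip(p_true, p_peak)]
matched = [j for j, r in enumerate(ratios) if r > delta]

n, m, n_m = len(t_true), len(t_peak), len(matched)
success_rate = 100.0 * n_m / n                       # Equation (30)
waste_rate = 100.0 * abs(m - n_m) / m                # Equation (31)
avg_distance = sum(abs(t_true[j] - t_peak[j]) for j in matched) / n_m  # (32)
avg_percentage = 100.0 * sum(ratios[j] for j in matched) / n_m         # (33)
```

The only arrival that fails the criterion is $t_{18}$ , whose ratio is well below the 50% threshold.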

Figure 14.

The match criterion $100\% \times P_{c_{i}} (t_{i}, ε) / P_{c_{i}} (t_{c_{i}}^{*}, ε)$ (each red triangle represents a truck arrival in ground truth; color online only).

In the current operation of the gold mine system, the elevator only comes down when it receives a request from the miner, and therefore it will always arrive at the bottom of the vertical shaft at least 1.8 min (the difference between the travel time of a truck full of ore and the time the elevator needs to go down empty) later than the trucks do. In other words, a truck will always wait at least 1.8 min before it can be served. However, using data assimilation, we can estimate 95% of all truck arrivals with an average error of 0.53 min (which is much smaller than 1.8 min). If these estimation results can be incorporated into the operation of the gold mine system (especially into the operation of the elevator), the overall performance of the gold mine system could be improved.

5.3.2 The effect of the interpolation operation

In this section, we explore the influence of interpolation on the estimation results. To this end, we run the data assimilation experiment 10 times with different random seeds, and draw box plots of the four error measures (i.e., SR in Equation (30), WR in Equation (31), $\bar{d}$ in Equation (32), and $\bar{P}$ in Equation (33)) in Figure 15. The results show that although the estimation results obtained from data assimilation without interpolation are already accurate, they can be improved significantly (in the statistical sense) if the interpolation operation is used. Although retrieving the model state without interpolation is not accurate enough, the retrieved state still reflects reality to a certain degree; therefore, the estimation results are much better than those without data assimilation. With interpolation, the time elapsed since the last state transition is considered, and therefore the real-time evolution, which is not captured in the discrete event simulation model but does happen in reality, is reflected through the measurement model. Consequently, the estimation results obtained from data assimilation with interpolation are more accurate than those obtained without interpolation.

Figure 15.

The influence of interpolation on the data assimilation results (noisy dataset ( $σ_{e} = 3.0, σ_{t} = 3.0$ ); $N_{p} = 2000$ ; 10 independent runs; color online only).

5.4 Sensitivity analysis

In this section, we explore the influence of several key factors on the data assimilation results based on the simple gold mine case. These factors include data quality, modeling errors, and the number of particles used. For each set of parameters, we run the experiment 10 times with different random seeds.

We should note that many factors can influence the quality of the data assimilation results. Besides the factors mentioned above, other factors may include the sensor deployment information, such as the number of sensors and data collection frequency. In this simple gold mine case, we do not consider these extra factors. For a more comprehensive analysis of the particle filter-based data assimilation methods, please refer to Gu and Hu.³⁰

5.4.1 Effect of the data quality

In the gold mine case, only position data of entities (Elevator and Truck) are noisy, and the quality of the noisy position data is characterized by the standard deviation of the zero-mean Gaussian noise, that is, $σ_{e}$ (for Elevator) and $σ_{t}$ (for Truck). We vary $σ_{e}$ and $σ_{t}$ from 3.0 to 20.0; the model state is retrieved through interpolation; for all experiments, we set $N_{p} = 2000$ . The results are shown in Table 4. They are in line with our expectation that the performance improves as the data become more accurate. We can conclude that the proposed method is quite robust to data errors. Even with a 20-m standard deviation on entity positions, the performance degrades only moderately. Specifically, the performance indicators of estimating the truck arrivals are 85.00% (success rate), 15.00% (waste rate), 0.83 min (average distance), and 85.82% (average percentage).

Table 4.

The influence of data quality, that is, $σ_{e}, σ_{t}$ , on the data assimilation results (states are retrieved through interpolation; $N_{p} = 2000$ ). In each table cell the median error over the 10 simulations is shown along with (in brackets underneath) the 25th and 75th percentiles.

$σ_{e}, σ_{t}$	Success rate	Waste rate	Average distance	Average percentage
	$SR$ (%)	$WR$ (%)	$\bar{d}$ (min)	$\bar{P}$ (%)
$σ_{e} = 3.0, σ_{t} = 3.0$	90.00	10.00	0.54	89.00
	(90.00, 95.00)	(5.00, 10.00)	(0.52, 0.61)	(87.56, 91.77)
$σ_{e} = 5.0, σ_{t} = 5.0$	90.00	10.00	0.62	88.53
	(90.00, 90.00)	(10.00, 10.00)	(0.59, 0.75)	(86.91, 90.48)
$σ_{e} = 10.0, σ_{t} = 10.0$	90.00	10.00	0.73	86.40
	(90.00, 95.00)	(5.00, 10.00)	(0.70, 0.85)	(85.31, 88.60)
$σ_{e} = 15.0, σ_{t} = 15.0$	85.00	15.00	0.79	86.50
	(85.00, 90.00)	(10.00, 15.00)	(0.77, 0.93)	(83.77, 87.98)
$σ_{e} = 20.0, σ_{t} = 20.0$	85.00	15.00	0.83	85.82
	(85.00, 90.00)	(10.00, 15.00)	(0.76, 0.93)	(83.61, 87.04)

5.4.2 Effect of the model errors

In the experiments presented so far in Section 5, the model used to carry out data assimilation is the same as the one used to generate the ground-truth data, which implies that we have a perfect model of reality. This is a very strong assumption. In this section, we investigate the data assimilation results in the case that the model has errors. We build an imperfect model by simply changing the distribution of the drilling time of the miner from a Triangular distribution with varying modes (i.e., the perfect model) to a standard Triangular distribution with lower bound 15 min, upper bound 30 min, and mode 20 min (acting as the imperfect model). For all experiments, we set $σ_{e} = 3.0, σ_{t} = 3.0$ , and $N_{p} = 2000$ ; states are retrieved through interpolation. The results are shown in Table 5.

Table 5.

The influence of model quality on the data assimilation results (states are retrieved through interpolation; $σ_{e} = 3.0, σ_{t} = 3.0; N_{p} = 2000$ ). In each table cell the median error over the 10 simulations is shown along with (in brackets underneath) the 25th and 75th percentiles.

Model	Success rate	Waste rate	Average distance	Average percentage
	$SR$ (%)	$WR$ (%)	$\bar{d}$ (min)	$\bar{P}$ (%)
Perfect model (drilling time: Triangular distribution with varying mode)	90.00	10.00	0.54	89.00
	(90.00, 95.00)	(5.00, 10.00)	(0.52, 0.61)	(87.56, 91.77)
Imperfect model (drilling time: standard Triangular distribution)	90.00	12.14	0.63	88.83
	(85.00, 90.00)	(10.00, 15.00)	(0.57, 0.69)	(86.65, 90.95)

The results in Table 5 suggest that the proposed method is robust with respect to model errors, although with the single case involved, we cannot claim to have tested this exhaustively. When we model one component incorrectly (i.e., with a different distribution), the overall performance is not significantly different from that obtained with a perfect model. Clearly, the accuracy of the data assimilation results largely depends on the validity of the simulation models used. In our case, this validity is evident, since the ground-truth data is produced by a similar model.

5.4.3 Effect of the number of particles

The influence of the number of particles ( $N_{p}$ ) used on the data assimilation results is summarized in Table 6. As expected, the overall performance has an upward tendency as the number of particles increases. With more particles, components can explore more possibilities on their time advance values, and this results in different event sequences and entity positions/phases, which will lead to a better coverage of the system state space.

Table 6.

The influence of the number of particles on the data assimilation results (states are retrieved through interpolation; $σ_{e} = 3.0, σ_{t} = 3.0$ ). In each table cell the median error over the 10 simulations is shown along with (in brackets underneath) the 25th and 75th percentiles.

$N_{p}$	Success rate	Waste rate	Average distance	Average percentage
	$SR$ (%)	$WR$ (%)	$\bar{d}$ (min)	$\bar{P}$ (%)
100	70.00	30.00	0.81	82.29
	(60.00, 70.00)	(30.00, 40.00)	(0.75, 1.08)	(80.76, 83.64)
400	80.00	20.00	0.58	85.96
	(75.00, 80.00)	(20.00, 25.00)	(0.50, 0.74)	(82.52, 89.62)
700	82.50	17.50	0.62	87.62
	(80.00, 90.00)	(10.00, 20.00)	(0.47, 0.76)	(86.82, 88.65)
1000	87.50	12.50	0.67	88.33
	(85.00, 90.00)	(10.00, 15.00)	(0.58, 0.80)	(87.99, 89.38)
1500	87.50	12.50	0.64	88.17
	(85.00, 90.00)	(10.00, 15.00)	(0.56, 0.75)	(87.36, 89.93)
2000	90.00	10.00	0.54	89.00
	(90.00, 95.00)	(5.00, 10.00)	(0.52, 0.61)	(87.56, 91.77)

Figure 16 depicts the error measures relative to those at $N_{p} = 2000$ (the ensemble size chosen in the gold mine case). The plot shows that performance does not improve in proportion to the number of particles. Reducing the ensemble size from $N_{p} = 2000$ to $N_{p} = 100$ (i.e., by a factor of 20) leads to a deterioration in the error metrics ranging from around 7% (average percentage $\bar{P}$ ) to around 200% (waste rate WR); it seems that we could have safely decreased the number of particles in the gold mine case from $N_{p} = 2000$ to $N_{p} = 1000$ without a significant loss of accuracy in terms of all error measures.
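
The quoted relative changes can be checked against the medians in Table 6 with a few lines of arithmetic. The sketch assumes that the relative measures are computed from the medians; the dictionary keys are shorthand of our own.

```python
# Medians from Table 6.
baseline = {"SR": 90.00, "WR": 10.00, "d": 0.54, "P": 89.00}   # N_p = 2000
small = {"SR": 70.00, "WR": 30.00, "d": 0.81, "P": 82.29}      # N_p = 100

# Relative change in percent with respect to the N_p = 2000 baseline.
relative_change = {
    key: 100.0 * (small[key] - baseline[key]) / baseline[key]
    for key in baseline
}
```

The waste rate triples (a change of about 200%), whereas the average percentage drops by only about 7.5%; the success rate and the average distance lie in between.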

Figure 16.

The influence of $N_{p}$ on the data assimilation results (states are retrieved through interpolation; $σ_{e} = 3.0, σ_{t} = 3.0$ ); the performance indicators are relative to those at $N_{p} = 2000$ .

6 Conclusions

In this paper, we presented a particle filter-based data assimilation framework for discrete event simulations (of closed systems), in which we assume that the measurements fed at time step $k \in {1, 2, \dots}$ are distributed over the last measurement interval $[(k - 1) Δ T, k Δ T]$ , implying that the measurements are dependent on the state transitions during that interval. The data assimilation framework was formally defined based on the DEVS formalism. In this framework, two key problems (i.e., the state retrieval problem and the variable dimension problem) that hinder the application of particle filtering in discrete event simulations were addressed. The state retrieval problem was solved by introducing an interpolation operation, which takes the elapsed time (i.e., the time elapsed since the last state transition) into account when retrieving the state of a discrete event simulation model in order to obtain updated state values. The variable dimension problem was addressed based on the results of Godsill et al.,¹⁹ which imply that in practice we can safely apply the standard sequential importance sampling algorithm to update the posterior distribution $p (s_{0 : k} | m_{1 : k})$ , where $s_{0 : k}$ (see the definition in Equation (17)) has a variable dimension.

To illustrate the working of the proposed data assimilation framework, a case was studied in a gold mine system, in which noisy data (partial event sequences, entity positions with Gaussian errors) were assimilated into a discrete event gold mine simulation model to estimate truck arrival times at the bottom of the vertical shaft. The experimental results show that the proposed data assimilation framework is able to provide accurate estimation results in discrete event simulations. Assimilating (with interpolation) the noisy dataset with Gaussian error $N (0, 3^{2})$ added on entity positions, the performance indicators of estimating the truck arrivals are 95.00% (success rate), 5.00% (waste rate), 0.53 min (average distance), and 92.66% (average percentage) (see Section 5.3.1). In contrast, the simulation without data assimilation lost its predictive ability from $t = 150$ min onwards. The experimental results also show that a proper interpolation operation can significantly improve the estimation results compared to those obtained without interpolation. With a linear interpolation to obtain entity positions in the gold mine case, all performance indicators improved in the statistical sense compared with those obtained without interpolation (see Section 5.3.2), since the interpolation operation can capture the real-time state evolution, which is not described in the discrete event model but does happen in reality.

Sensitivity analysis reveals that the proposed data assimilation framework is robust to measurement errors. In the gold mine case, even with a 20-m standard deviation on entity positions, the performance degrades only moderately. Specifically, the performance indicators for estimating the truck arrivals are 85.00% (success rate), 15.00% (waste rate), 0.83 min (average distance), and 85.82% (average percentage) (see Section 5.4.1). Similarly, the framework is robust to model errors (i.e., differences between the model generating the ground-truth data and the model used in the case study), although we cannot claim to have tested this exhaustively. The results show that using a model with errors does not significantly affect the estimation results (see Section 5.4.2). This result does, however, emphasize an important underlying point. Unless we have actual evidence (data), the accuracy of the estimation results depends on the validity of the simulation models used in the framework for the specific case at hand. In our case, this validity is evident, since the ground-truth data is produced by a similar model. In real life, when the predictions given by the simulation model diverge too much from the real behavior of the system, it stands to reason that the estimation results will be farther away from the ground truth.

The results of sensitivity analysis also imply several possible future research directions in order to improve the quality of the estimation results, such as developing simulation models that can make more valid predictions of the real system behavior, developing more advanced sensor technologies that can provide more accurate measurement data of the real systems, and developing a parallel and distributed version of the proposed data assimilation framework in order to deal with more complex scenarios.

Footnotes

Funding

This research was supported by the China Scholarship Council (Grant no. 201306110027) and the National Natural Science Foundation of China (Grant nos. 61374185 and 61403402).

Author biographies

Xu Xie was a PhD student at the Department of Multi Actor Systems, Faculty of Technology, Policy and Management, Delft University of Technology, the Netherlands. He is now working as an assistant professor at the Department of Modeling and Simulation, College of System Engineering, National University of Defense Technology, Changsha, China.

Alexander Verbraeck is a professor at the Department of Multi Actor Systems, Faculty of Technology, Policy and Management, Delft University of Technology, the Netherlands.

References

1. Bouttier, Courtier. Data assimilation concepts and methods. Meteorological Training Course Lecture Series. Reading: ECMWF (European Centre for Medium-Range Weather Forecasts), 1999.

2. Atzori, Iera, Morabito. The Internet of Things: a survey. Comput Network 2010; 54: 2787–2805.

3. Lee, Lapira, Bagheri, et al. Recent advances and trends in predictive manufacturing systems in big data environment. Manuf Lett 2013; 1: 38–41.

4. Liu HX. Using high-resolution event-based data for traffic modeling and control: an overview. Transp Res C Emerg Technol 2014; 42: 28–43.

5. Y-C. Introduction to special issue on dynamics of discrete event systems. Proc IEEE 1989; 77: 3–6.

6. Nance RE. The time and state relationships in simulation modeling. Commun ACM 1981; 24: 173–179.

7. Schriber, Brunner, Smith JS. How discrete-event simulation software works and why it matters. In: Laroque, Himmelspach, Pasupathy, Rose, Uhrmacher (Eds.) Proceedings of the 2012 winter simulation conference, Berlin, Germany, 9–12 December 2012, pp.1–15. Piscataway, NJ: IEEE.

8. Zeigler, Praehofer, Kim. Theory of modeling and simulation: integrating discrete event and continuous complex dynamic systems. 2nd ed. New York: Academic Press, 2000.

9. Nichols NL. Data assimilation: aims and basic concepts. Dordrecht: Springer Netherlands, 2003, pp.9–20.

10. Arulampalam, Maskell, Gordon, et al. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Trans Signal Proc 2002; 50: 174–188.

11. Gillijns, Mendoza, Chandrasekar, et al. What is the ensemble Kalman filter and how well does it work? In: Proceedings of the 2006 American control conference, Minneapolis, MN, USA, 14–16 June 2006, pp.4448–4453. Piscataway, NJ: IEEE.

12. Evensen. The ensemble Kalman filter: theoretical formulation and practical implementation. Ocean Dynam 2003; 53: 343–367.

13. Yuan. Lagrangian multi-class traffic state estimation. PhD Thesis, Delft University of Technology, 2013.

14. Djurić, Kotecha, Zhang, et al. Particle filtering. IEEE Signal Proc Mag 2003; 20: 19–38.

15. Towards applications of particle filters in wildfire spread simulation. In: Mason, Hill, Mönch, Rose, Jefferson, Fowler (Eds.) Proceedings of the 2008 winter simulation conference, Miami, FL, USA, 7–10 December 2008, pp.2852–2860. Piscataway, NJ: IEEE.

16. Xue. Data assimilation using sequential Monte Carlo methods in wildfire spread simulation. ACM Trans Model Comput Simulat 2012; 22: 23:1–23:25.

17. Wang. Data assimilation in agent based simulation of smart environments using particle filters. Simulat Model Pract Theor 2015; 56: 36–54.

18. Godsill, Vermaak. Variable rate particle filters for tracking applications. In: IEEE/SP 13th workshop on statistical signal processing, Bordeaux, France, 17–20 July 2005, pp.1280–1285. Piscataway, NJ: IEEE.

19. Godsill, Vermaak, et al. Models and algorithms for tracking of maneuvering objects using variable rate particle filters. Proc IEEE 2007; 95: 925–952.

20. Long. Data assimilation for spatial temporal simulations using localized particle filtering. PhD Thesis, Georgia State University, 2016.

21. Sequential Monte Carlo based data assimilation framework and toolkit for dynamic system simulations. PhD Thesis, Georgia State University, 2017.

22. Ntaimo, Sun. DEVS-FIRE: Towards an integrated simulation environment for surface wildfire spread and containment. Simulation 2008; 84: 137–155.

23. Sun, Ntaimo. DEVS-FIRE: design and application of formal discrete event wildfire spread and suppression models. Simulation 2012; 88: 259–279.

24. Vangheluwe HL. The Discrete EVent System specification (DEVS) formalism. Technical report, McGill University, School of Computer Science, Montreal, Quebec, Canada, 2001.

25. Douc, Cappé, Moulines. Comparison of resampling schemes for particle filtering. In: Proceedings of the 4th international symposium on image and signal processing and analysis, Zagreb, Croatia, 15–17 September 2005, pp.64–69. Piscataway, NJ: IEEE.

26. Honig, Seck. ϕDEVS: phase based discrete event modeling. In: Proceedings of the 2012 symposium on theory of modeling and simulation, Orlando, FL, USA, 26–30 March 2012, pp.39:1–39:8. Orlando, FL: ACM.

27. Giambiasi, Carmona JC. Generalized discrete event abstraction of continuous systems: GDEVS formalism. Simulat Model Pract Theor 2006; 14: 47–70.

28. Mannila, Ronkainen. Similarity of event sequences. In: Fourth international workshop on temporal representation and reasoning, Daytona Beach, FL, USA, 10–11 May 1997, pp.136–139. Piscataway, NJ: IEEE.

29. Kanungo, Mount, Netanyahu, et al. An efficient k-means clustering algorithm: analysis and implementation. IEEE Trans Pattern Anal Mach Intell 2002; 24: 881–892.

30. Analysis and quantification of data assimilation based on sequential Monte Carlo methods for wildfire spread simulation. Int J Model Simulat Sci Comput 2010; 1: 445–468.

A particle filter-based data assimilation framework for discrete event simulations