Abstract
This article proposes a hybrid susceptible–infected–removed model which takes into consideration the spatiotemporal dynamics of the individuals. The model is based on a system of discrete stochastic diffusion equations. To build these equations, the two-dimensional diffusion equations coming from a balanced method are coupled with the human displacement probability law pattern through a discretization made by the finite volume method for complex geometries. The validation of this model is applied to COVID-19 spread. Since it is actual and the statistics are available elsewhere. The case of a developed country is used for simulation under some assumptions. Firstly to fit the chosen displacement pattern, then the accuracy of the statistics provided helps to analyze the sensitivity of the parameters of the model. The results explain the influence of the population movement on the evolution of the spread.
Keywords
Introduction
The evolution of transportation nowadays and confinement as a restriction measure during the COVID-19 pandemic show that the displacement of individuals is a key factor in the evolution of a spread. Many researchers in the mathematical field and beyond have proposed epidemiological models increasingly suitable for the study of the evolution of epidemics such as coronavirus, for the setup of the best strategies for the control of spread. Most of the proposed models are rooted in the compartmental epidemiological modeling proposed by Kermack and McKendrick1,2 based on the systems of ordinary differential equations (ODEs), which describe only the temporal evolution of the spread.3–10 Another class of models is the dynamical epidemic model called susceptible–infected–removed (SIR) model 13 and its variants such as susceptible–exposed–infected–recovered (SEIR) and SIS. These later models are based on dividing the host population into a small number of compartments, each containing individuals that are identical in terms of their status with respect to the disease in question. 14 Despite mathematical modeling having gained more scientific attention and awareness in epidemiology and medical science in general,11,12 one notes that, in general, the spatial dynamic aspect of hosts is not explicitly highlighted in epidemiological studies.
Within the framework of prediction models, some studies focus on the estimation of the basic reproduction number
Many authors have recently extended the SIR model to capture the spatiotemporal dynamics of individuals (see Gatto et al., 20 Qianyue et al., 21 Giannone et al., 22 Chen et al. 23 and the references therein). Gatto et al. 20 examined the effects of prompted drastic measures for transmission containment in Italy. Based on modeling, they estimated the parameters of a meta-community SEIR-like transmission model that includes a network of 107 provinces connected by mobility at high resolution. Chen and co-authors 21 built a data-driven epidemic simulator with COVID-19-specific features, which incorporated real-world mobility data capturing the heterogeneity in urban environments. Giannone et al. 22 studied the effect of the economic exchange of populations due to their displacements, the economic impact, and possible optimization of management policies for the conservation of the economy and the reduction of loss of human life between the states of the US. Chen et al. 23 introduced a new model which incorporated asymptomatic infections and population migration into SEIR, hence, referred to as the SEIAR (susceptible-exposed-infected-asymptomatic-recovered) model with population migration. However, in all these approaches, there's no concrete modeling of the random aspect of the movement of individuals coupled with the impact of administrative divisions.
The purpose of this study is to build a hybrid SIR model named SIR-D which combines the traditional SIR with a stochastic diffusion term that represents the random spatiotemporal dynamics of individuals. The approach sets certain hypotheses which, on the one hand, consider the random aspect of the dynamic of individuals and, on the other hand, takes into account geographical conditions like administrative divisions. The motivation comes from the fact that the forecasts made by the statistics are not always representative due to the complexity of human interactions and geographical conditions. 20 To integrate the random aspect in the proposed model, we gained inspiration from existing studies relating to the development of human mobility patterns.10,24 This extension is possible for any variant of the SIR model (SIS, SEIR, SEAIR, etc.) and other phenomena which involve the movement of the hosts during their evolution.
According to Currie et al. 25 who stated that simulation modeling can help to support decision-makers in making the most informed decision, this work is a precursor to the design of a decision support system that will allow the decision-maker to simulate the evolution of an epidemic according to the displacement profile of a population in order to make the right decision. In this regard, the case of the COVID-19 pandemic is used to validate the proposed model. The simulation is done on a selected country that fits with the hypothesis of probability law's pattern and the results are confronted with the existing statistics data coming from www.kaggle.com. The formulation of the model uses the 2D diffusion equation. For numerical resolution, a discretization is made using the finite volume method for complex geometries to capture administrative divisions. A GIS software was subsequently used to extract the spatial data used as input in our model. Data retrieved are processed using algorithms and data science techniques found by Igual and Segui 26 and implemented in the Python environment to numerically solve the system of equations obtained. The simulation results are presented alongside the trend of the true data as a way of validating assumptions on the input parameters.
The remainder of this paper is structured as follows: the “Model formulation and assumptions” section presents our methodology, focusing on the manner in which the model is built with finite volume discretization for complex geometries and diffusion matrix structure to obtain a hybrid SIR. The “Simulation and validation” section describes the application of the model to the COVID-19 data from Italy. The “Conclusion” section gives some concluding remarks and the direction of further works.
Model formulation and assumptions
Background
The general form of the equation modeling the evolution of mobile entities that interact with each other is given by:
Our attention is focused mainly on a single type of reaction–diffusion equation where the dispersion operator
Contextualization and meshing
A finite volume scheme on unstructured staggered grids
The finite volume scheme is found by integrating the Stokes problem 29 or Laplace problem 30 on a control volume of a discretization mesh and finding an approximation of the fluxes on the control volume boundary in terms of the discrete unknowns.
To give the assumptions needed on the mesh, we consider here the Laplace problem in an open bounded polygonal subset
The finite volume scheme is, therefore, written
Under boundary conditions, each
Contextualization and meshing
Let Ω be a bounded open set of

The selected regions.

Polygonal approximation of four regions.
Henceforth, we will consider:
XH XP XN XK
The quantity
With the finite volume scheme above,
10
the term
With the above considerations applied to,
14
one obtains the following finite volume scheme
Because of meshing, D is no longer a simple coefficient but a diffusion matrix depending on individual movement from one mesh to another. Recall the general form
So Site P Site N Site K
Similarly, for the rest of the sites (P, N, K) we obtain the following equations:
Boundary conditions are implicitly considered because the quantities above are zero everywhere else except on the boundaries between our experimental environments.
For a general formulation for any mesh H, we can write the numerical scheme as follows: |H| represents the area of H
Definition of the diffusion matrix
The diffusion part of the basic equation is related to the human displacement which is random and complex. Thus it requires complementary effort to its formulation. That is why we find in the literature the relevant studies on this topic independently of the discipline. In physics, random walk processes with a power-law single-step distribution are known as Lévy flights;31,32 Lévy flights are qualitatively different from ordinary random walks. Another study in telecommunications used subscribers' telephone data to estimate human mobility in developed countries (24).33,34 According to their thinking, the pervasive usage and the high penetration rates of mobile phones have made mobile network data the largest mobility data source ever. Many other theories are used to explain human displacement as the trajectory distance, also called jump length, which corresponds to the traveled distance during a trip. Brockmann et al.
10
state that the jump length Δr follows a power-law distribution:
Based on the above observation, we will build the diffusion matrix D which will be composed of two terms; an average diffusion speed, and a probabilistic displacement matrix between regions. The computation of the coefficients of the probability matrix is based on humans’ movement probability law pattern. There are several displacement patterns in the literature depending on the context of the study. In general, we will have
Let
Here we are interested in the probability
Interpretation of de model applied to SIR
In the traditional SIR model, there are three compartments represented in Figure 3:
Susceptible: individuals who might become infected if exposed assuming that they have no immunity to the infectious agent. Infectious: infected individuals who can transmit the infection to susceptible individuals that they contact. Removed: individuals recovered from the disease.

State chart of the dynamic of susceptible–infected–removed (SIR).
The transitions between states are interpreted as follows:
Simulation and validation
Assuming that the mortality rate due to COVID-19 is not negligible, we add to the traditional SIR model a fourth state (death state) that takes into consideration the number of deaths due to the illness (Figure 4).

State chart of the dynamic of susceptible–infected–removed (SIR).
This consideration modifies the equations system as follows:
Input data
The simulation is focused on Italy before confinement from 1 to 6 March 2020. The choice of Italy is motivated by two major reasons. First, because the probability law used was set up in the context of human interaction in developed countries, second, the availability and the accuracy of the official data related to the pandemic. The COVID-19 pandemic data used in this study come from the www.kaggle.com website where we extracted the dataset (Table 1). The map of Italy is extracted from Google Earth, and the demographic data are collected on Wikipedia. We used the ArcGIS software to compute the distance between the centroids of the regions and to approximate the perpendicular border length of the adjacent regions as in the “Contextualization and meshing” section
Pandemic input data for simulation.
Figure 5 shows some discretization traces on the GIS software for the extraction of spatial data.

Distance between centroids as spatial input data.
Table 1 gives the initial values used in the simulation (1 March 2020).
- N(0) = the total population in a region, - S(0) = the number of people likely to be contaminated, - I(0) = the number of positive cases on this date, - R(0) = the number of cases healed on this date, - M(0) = the number of deaths. - N = total human population in a given region - I = total infected given by the test cases - Ir = supposed total infected - M = total death - Rc = recovered by cure - α = recovered rate by cure - β = the infection rate - γ = death rate The adjustment of the coefficients α, β, and γ is mainly based on the number of deaths. This is true because all death causes are known (including the COVID-19 pandemic). The other data, namely the number of infected or recovered seem not representative because tests are generally performed on samples made up of suspicious people with a high body temperature, which is low compared to the real size of the population
The population will be assumed to be homogeneous where everyone is likely to contract the disease. A short time interval is chosen for the simulation (1–6 March) due to the time-varying property of the rates of the SIR model. In the case of a simulation over a long period of time, it would be necessary to take into account these variations as a time function in the definition of the SIR model parameters. The infection rate (β), removal rate (α), and death rate (γ) are obtained empirically from the available official statistics (17).
Table 2 gives the list of parameters which are time-based functions used in the equations. Their initial values are found in Table 1.
Model variables and parameters.
Simulation results
In every region, four discrete equations are computed. That makes a total of 80 equations by considering the 20 regions of Italy. The following figures present the result from some regions. The graphics at the left show the simulation of the pandemic, assuming that the entire population is susceptible and the graphics at the right are the representation of the statistical data collected during the same period.
Figure 6(a) shows the evolution of the epidemic in the “Lombardia” region, assuming that the entire population is susceptible (10,011,861 people). Figure 6(b) is the visualization of the data for a sample of 13,556 tests performed until 6 March 2020.

(a) Simulation in Lombardia and (b) statistics in Lombardia.
The statistical data shows that neighboring regions to Lombardia (“Emilia Romagna,” “Veneto,” and “Piemonte”) have a higher number of infected persons than others, this is because Lombardia was the most infected region. The results of our simulations fit with this reality given that our model considers the mixing of populations that share land borders. This statement is illustrated in the following diagrams.
Figure 7(a) is the evolution of the disease in the entire population (4,446,929 people) given that everyone is a suspect and Figure 7(b) is the data visualization of data obtained for tests carried out on 3136 persons.

(a) Simulation in Emilia Romagna and (b) statistics in Emilia Romagna.
In line with what precedes, Figure 8(a) is the evolution of the disease in the entire population (4,907,408) and Figure 8(b) is the visualization of data for tests carried out on 13,023 persons.

(a) Simulation in Veneto and (b) statistics in Veneto.
Figure 9(a) is the evolution of the disease in the entire population (4,395,363) and Figure 9(b) is a visualization of data obtained for tests carried out on 793 individuals.

(a) Simulation in Piemonte and (b) statistics in Piemonte.
In the following, we are commenting about some regions that are far from Lombardia. We have decided to choose any three of them (“Abruzzo,” “Basilicata,” and “Calabria”) (Figures 10 to 12).

(a) Simulation in Abruzzo and (b) statistics in Abruzzo.

(a) Simulation in Basilicata and (b) statistics in Basilicata.

(a) Simulation in Calabria and (b) statistics in Calabria.
The simulation results on the three regions show that those regions are less infected. This statement fits with reality.
Sardegna region is a special case, it is an island. The simulation begins on 1 March 2020 with no case for a population of 1,654,796 inhabitants. A total of 99 suspected cases were tested over the period from 1 to 6 March 2020 and five were positive. Figure 13(a) shows the evolution of the disease with our approach and Figure 13(b) is a visualization of the statistical data.

(a) Simulation in Sardegna and (b) statistics in Sardegna.
The study of the epidemic in the region of “Sardegna” gives us a concrete limit of the proposed approach due to the nonexistence of neighbors with whom there is a sharing of land border. The infection evolves in “Sardegna” because of the cases imported by the spatiotemporal dynamics of the mobile entities. An improvement of the proposed model would be to find a way of taking into account the interregional movements of people other than displacement on land.
We go ahead to present another family of curves where the calibration of simulation to available data on the dynamics of the real system is made. This second group of curves is made in two regions in order to validate the simulation by evaluating the “goodness of fit” of the simulation results to the real system's actual results for the time period of interest.
With this new family of curves, a comparison is made between the simulation results and the real data. For the Lombardia region (Figure 14), both observed and simulation data were quite close and consistent at that scale. With the Emilia Romagna region (Figure 15), and Veneto (Figure 16) data and simulation results were less close than in Lombardia. This can be explained by the fact that the simulation is for regions with a smaller number of tests. The model takes geographic data as a parameter. Hence, in the case of insufficient data, there is a big discrepancy in the simulator because of false zeros due to very low magnitudes. Based on these, one can affirm that the proposed model can be considered in the study of such a phenomenon. However, a better estimation can be made if the following are taken into account:
A good estimation of diffusion coefficients through a good knowledge of the displacement patterns of the populations is to be studied. Refine the meshing and then sum the contributions for the general behavior. A good estimation of areas, populations, and inter-mesh distances.

Simulation and data in Lombardia.

Simulation and data in Emilia Romagna.

Simulation and data in Veneto.
From Figures 6 to 13, we note a discrepancy between data and their estimations. The explanation is due to the fact that in the simulation, the entire population of a region is susceptible whereas, the data come from the results of tests carried out on suspect individuals and representing < 10% of the population in each region. On the other hand, according to the WHO, 80% of infected individuals do not show any symptoms. 36
From Figure 5(a) (right), between 1 and 6 March 2020, out of 13,556 tested there are 2612 positive cases, around 135 deaths, and 469 recovered. With these data, the mortality rate in the “Lombardia” region is 5%. We find this rate very high according to the WHO statement. Also, the recovered rate is around 18% but with the WHO report, it would be possible to have > 80% of cases recovered. This motivates our approach, assuming that the entire population is tested and that at least 80% of the cases are asymptomatic and heal on their own while being contagious. This will justify the proximity of the black curve representing the total number of infected to the blue curve representing the number of recovered in the simulations. The only parameter that we keep identical coming from data is the number of deaths. On the other hand, with the explicit taking into account of the spatiotemporal dynamics implemented in equations the results of simulations are more realistic compared to the volume of the population and their interactions. Given that “Lombardia” is the epicenter of the pandemic, the model tells us why the regions neighbor to “Lombardia” are more infected than the others.
Another reality that emerges from this study is that not only the dynamics between regions are difficult to control but also the instant decision-making of decision-makers distorts the linear character of the prediction of such a pandemic. To better help decision-makers, it would be better to make prediction models dynamic and contextual according to spontaneous situations. The model proposed in this article goes in this direction because it is possible to change the patterns of population displacement, to modify the coefficients α, β, and γ of the dynamics of the epidemic during its evolution, and to take decisions according to a given simulation frequency. Equally, the proposed model allows us to work on an effective population after the balance sheets of inputs and outputs are implemented by diffusion.
The study of the epidemic in the region of “Sardegna” which is an island gives us a concrete limit of the proposed approach due to the nonexistence of neighbors with whom there is sharing of land border. This specific case also shows the limit of the traditional SIR model when the study of an epidemic begins with no cases of infection. The infection evolves in “Sardegna” because of the cases imported by the spatiotemporal dynamics of the mobile entities. An improvement of the proposed model would be to find a way of taking into account the interregional movements of people other than displacement on land.
The calibration of the simulation to available data on the dynamics of the real system is made in three regions in order to validate the simulation by evaluating the “goodness of fit” of the simulation's results to the real system's actual results for the time period of interest. The results show that this approach can be considered to model such a phenomenon with some adjustments.
Nevertheless, it should be noted that the simulation for regions with less data during this period did not give good results with the proposed model. Since the model takes geographic data as a parameter, it turns out that when the number of tests carried out is not enough, they create a big discrepancy in the simulator because of false zeros due to very low magnitudes. This model was designed to provide a decision support tool on a population in the real world whose displacement modes are known, thus simulating the impact of a phenomenon like COVID-19 day after day by varying inputs (probability laws patterns, coefficients, etc.) to make good decisions.
Conclusion
This study presents a hybrid SIR model built by introducing the individual's dynamics through a probability law pattern. The pattern of displacement is inspired by the studies carried out by (10) and (24) which are based on the models of movement of individuals in urban areas in developed countries. In this way, one of the displacement laws has been combined with the diffusion equations for modeling the interactions of individuals in developed countries. To build the proposed hybrid SIR models, the probabilistic diffusion equations obtained are combined with the traditional SIR models. For the validation of the proposed approach, a simulation is made in the case of the novel COVID-19 spread. With the displacement pattern hypothesis for developed countries, the simulation is made for the case of Italy which represents for us an interesting study sample. Data from the 20 regions of Italy were available thus facilitating the implementation of the spatiotemporal dynamics between regions. To bring out the impact of the dynamics of individuals in the simulation, the regions of Italy are divided into three zones namely, the epicenter region and its surrounding regions, the regions far from the epicenter, and isolated regions without land borders with other regions. It emerges from this division with regard to the proposed model and supported by the data that the regions close to the epicenter are among the regions which register the highest infection rate compared to those which are distant. Also, the infection rate is very low or even zero for isolated regions. This approach is proposed to the decision-makers for a global view considering the whole population as susceptible, while the official data collected from the tested cases are partial and can distort decision-making. Despite the fact that the results are encouraging, the model doesn’t walk when the pattern of displacement is not known and when there are no land borders between regions.
As a perspective to improve this model, it is necessary to add the time-varying property of the propagation coefficients rates of the epidemic. It would also be important to find a displacement pattern that integrates different types of displacement in order to take into account the exchanges of individuals among the regions which do not share land borders.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
