Abstract

This paper focuses on forecasting Military Action-type events by both state and non-state actors. Here we demonstrate that the dynamics of these types of events can be adequately described by a Hidden Markov Model (HMM) where the hidden states correspond to different operational regimes of an actor, and observations correspond to event frequency—and the HMM effectively predicts events with different lead times. We also demonstrate that one can enrich statistical time series-based methods that work only on historical data by exploiting predictive signals in real-time external data streams. We demonstrate the superior predictive power of the proposed models with evaluation of recent data capturing activities over two groups, ISIS and the Syrian Arab Military, two countries, Syria and Iraq, and two cities, Aleppo and Mosul. We also present an approach to converting predictions of the proposed models to real-world warnings.

Keywords

Event forecasting, Hidden Markov Model, autoregressive models, external signals

1. Introduction

There has been significant recent interest in modeling and predicting violent events, such as Military Action by state actors, or terrorist attacks by non-state actors,¹ collectively referred to as MANSA events. There are certain characteristics of MANSA event dynamics that make them particularly challenging to model. For instance, it is well established that the dynamics of terrorist attacks have distinctly non-Poissonian characteristics.² In particular, the inter-event duration distribution (which is exponential for the Poisson process) has been shown to be heavy tailed and bursty for a number of different event types. Thus, we need different mechanisms for adequately reproducing and predicting activity patterns with highly non-Poissonian statistics.

In this paper we study the problem of forecasting violent events using historical and external open source data (e.g., news articles, blogs, and tweets), given that the historical data may not be up-to-date. Specifically, we focus on predicting violent events in the Middle East and North Africa (MENA) region over the 14-month period from 1 August 2016 to 30 September 2017. For evaluation we use manually curated violent events with rich features (such as actor, target, time, and location) as well as news article data collected over the MENA region by Arabia Inform.³

It has been previously demonstrated that the bursty dynamics of terrorist activity can be well-captured by an appropriately designed d-state Hidden Markov Model (HMM), where a hidden state characterizes a specific operational mode of an organization.² The simplest setting of $d = 2$ corresponds to the case where the dynamics are coarsely quantized as low-activity and high-activity regimes, respectively. Here we show that even a simple two-state HMM can be used to adequately describe the daily patterns of violent events by ISIS, the Syrian Arab Military, and other actors, providing better predictive capabilities over simple baseline models.

Another important challenge for developing high-fidelity models for MANSA events is the availability of reliable, up-to-date historical data for generating real-time predictions. Indeed, recent studies, such as Raghavan et al.² and Porter et al.,⁴ try to predict the number of terrorist attacks at time $t + 1$ , assuming that one has access to the historical data up to time t. This assumption does not hold in many realistic situations. Indeed, in a typical scenario, one usually has access to historical data up to time $t - τ$ , where $τ$ is a scenario-dependent time lag, so one has to make a prediction without relying on the most recent historical data. Figure 1 illustrates the realistic settings for event forecasting.

Figure 1.

Overview of event forecasting without recent historical data. The proposed model takes historical data and indicators from external sources as inputs, and the model makes forecasts without recent historical data. GSR: gold standard report.

Here we address this shortcoming of existing models by proposing to use additional (surrogate) data sources to compensate for the lack of most recent event data. In particular, we focus on a scenario where in addition to historical event counts, we also have a time-stamped set of documents that contains potentially relevant information about events. Our results indicate that the signals extracted from streaming news sources can indeed lead to more accurate forecasts.

The rest of the paper is organized as follows: Section 2 discusses relevant research on event forecasting and Section 3 presents models that we exploit for forecasting MANSA events. Finally, we present an evaluation of our models in Section 4 and discuss our findings in Section 5.

2. Related work

There has been significant interest in modeling the activities of terrorist groups.^1,5,6 Enders and Sandler^7,8 proposed a threshold autoregressive (TAR) model to study both short- and long-run spurts in terrorist activities. Dugan et al.^9,10 suggested group-based trajectory analysis techniques (Cox proportional hazards model or zero-inflated Poisson model) to identify regional terrorism trends with similar developmental paths. More recently, Porter et al.⁴ suggested the two-component self-exciting hurdle model (SEHM) and Raghavan et al.² proposed a d-state HMM for describing the activity profile of terrorist groups.

Developing a precise model for the dynamic behavior of time series is a challenging problem and an essential one for the success of forecasting methods. Researchers have extensively studied and used time series analysis in many domains, such as finance,¹¹ epidemiology,^12,13 geophysics,¹⁴ and sociology.¹⁵ A popular strategy for analyzing time series data is using classical autoregressive models, such as AR, ARMA, ARIMA, and ARIMAX.^14,16,17 Autoregressive models are widely used in intrusion detection, detecting denial-of-service (DoS) attacks, and network monitoring.¹⁸ These models assume that the underlying data-generating process is linear, that is, the value at a time point is a linear combination of the past values. However, real-world time series exhibit volatility and nonlinearity. A way to deal with the problem of volatility is to employ ARCH and GARCH, which are extensions of classical autoregressive models.¹⁹

The generation of temporal features from text corpora for event forecasting is a diverse practice in the prediction of civil unrest,^20–22 crime,^23–25 political violence,^1,26,27 and epidemics.²⁸ Using datasets of social media or news articles, domain-relevant information is typically extracted using expert-generated keywords as a starting point. Techniques that generate features from social media text using some form of supervised learning—keyword counting, manual document filtering, document classification, etc.—include work in spatio-temporal forecasting of civil unrest by Zhao et al.^29,30 using keywords to filter relevant information from social media posts. In the same domain, Compton et al.³¹ use keywords and geographical terms to filter Twitter posts, performing manual annotation on a small set of tweets in order to produce detailed forecasts of the demographic, spatial, and temporal information of civil unrest events.

Emphasizing the role news articles can play as precursors to particular events, Ning et al.³² propose a nested, multi-task learning approach to discover news articles that have a high impact on future event outcomes—whether or not a protest event occurs in a certain city. In this model, documents are represented as bag-of-words or a similarly unsupervised method of representation.

Forecasting military events has gained attention in recent years, as datasets have become more available. Zammit-Mangion et al.³³ apply a point process model to conflict events from the Afghan War Diary. Yonamine³⁴ models military events in Afghanistan using the Autoregressive Fractionally Integrated Moving Average (ARFIMA) to predict time series of district-level event counts. For a comprehensive review of datasets and models for the prediction of political violence, we refer the reader to Schrodt et al.¹

3. Models

The intuition behind time series models is that when events are correlated in time, one can learn patterns from a sequence of past events that are useful for predicting future events. Time series prediction techniques use historical data about events (with optional surrogate data) to learn a model of the process that produced these events. The model can, in turn, be used to predict new events. In this section, we describe how we apply two types of models, the HMM and autoregressive models, to address the challenge of modeling events executed by military and non-state actors.

3.1. Hidden Markov Models

We first present the HMM-based approach for modeling terrorist activities. In our context, the key idea of the HMM is that the current number of events (e.g., terrorist activities) depends on the past history of events through N dominant hidden states, which represent different operational phases of the terrorist activities. For example, the hidden states of a two-state HMM correspond to “low-activity” and “high-activity” processes, as shown in Figure 2. The process transitions probabilistically between low-activity and high-activity states. While in a particular state, the process outputs some events according to a state-dependent probability distribution.

Figure 2.

(a) Two-state Hidden Markov Model (HMM) for predicting terrorist activities. (b) Rolled-out HMM with hidden states and observations.

Let $Y = (y_{1}, y_{2}, \dots, y_{T})$ be the observed sequence of events, for example, the daily number of terrorist attacks, and $Z = (z_{1}, z_{2}, \dots, z_{T})$ be the underlying states of the process giving rise to the events $Y$ . Here T denotes the length of the time series, that is, the sequence of events. An HMM is described by a set of hidden states ( $S = \{S_{1}, S_{2}, \dots, S_{N}\}$ ), transition probabilities between the states ( $η_{ij} = P (z_{t} = S_{j} | z_{t - 1} = S_{i})$ ), initial probabilities of the states ( $π_{i} = P (z_{1} = S_{i})$ ), and the emission probabilities of events conditioned on the hidden state ( $ϕ_{i} (k) = P (y_{t} = k | z_{t} = S_{i})$ ). The hidden states $Z$ are discrete-valued random variables. A transition between the states is Markovian, that is, the future state is conditionally independent of the past states given the current state. In our problem setting, we take the emission distribution to be one of four possible distributions: Poisson, Gaussian, geometric, or Hurdle geometric. The generative process for the model is shown in Algorithm 1.

Algorithm 1. Generator( $η, π$ ) for HMM

Input: A set of parameters.
Output: Sequence of event counts $Y$ .
1. Choose the initial state $z_{1} \sim Mult (π)$
2. Draw each row of $η_{i}$ using $Dir (α)$ ▹ Transition matrix for a user-defined $α$
3. Choose the emission probability distribution $ϕ \in \{Poisson, Gaussian, Geometric, Hurdle Geometric\}$
4. for each time $1 \leq t \leq T$ do
5.     if not the 1st day then
6.         $z_{t} \leftarrow Mult (η_{z_{t - 1}})$
7.     Draw $y_{t} \sim ϕ_{z_{t}}$

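The generative process of Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes Poisson emissions only, and the function names (`sample_dirichlet_row`, `generate_hmm`) and all numeric values used with it are hypothetical.

```python
import math
import random


def sample_categorical(probs, rng):
    """Draw an index from a discrete distribution given as a list of probabilities."""
    u = rng.random()
    cumulative = 0.0
    for index, p in enumerate(probs):
        cumulative += p
        if u < cumulative:
            return index
    return len(probs) - 1  # guard against floating-point round-off


def sample_dirichlet_row(n_states, alpha, rng):
    """Step 2: draw one transition-matrix row from a symmetric Dirichlet(alpha)."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(n_states)]
    total = sum(draws)
    return [d / total for d in draws]


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for small rates."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def generate_hmm(pi, eta, rates, T, seed=0):
    """Steps 1 and 4-7: sample hidden states and Poisson-distributed daily counts."""
    rng = random.Random(seed)
    states, counts = [], []
    for t in range(T):
        if t == 0:
            z = sample_categorical(pi, rng)               # initial state
        else:
            z = sample_categorical(eta[states[-1]], rng)  # Markov transition
        states.append(z)
        counts.append(sample_poisson(rates[z], rng))      # state-dependent emission
    return states, counts
```

For instance, `generate_hmm([0.5, 0.5], [[0.9, 0.1], [0.2, 0.8]], [5.0, 13.0], T=30)` returns 30 hidden states and 30 daily counts; the transition matrix and rates here are made-up values chosen only to produce a low-activity and a high-activity regime.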

3.1.1. Estimating HMM Parameters

The unknown parameters of the proposed HMM are $H = {π, η, ϕ}$ . No analytical solution exists for this model that maximizes the probability of the observed sequence (i.e., likelihood).³⁵ Hence, we applied an Expectation Maximization (EM)-based algorithm (also known as Baum–Welch reestimation) to estimate the parameters of the model.

3.1.2. Predicting with the HMM

To predict the number of new events, we adopt a sliding window approach. We train our model with data from a user-defined time window (e.g., four months), estimate the expected number of events for a gap period (e.g., one month), and forecast for the following month. The expected number of events at time t given $z_{t - 1}$ is as follows:

{\bar{y}}_{t} = \sum_{j = 1}^{N} η_{z_{t - 1} j} \cdot E [S_{j}]

where $E [S_{j}]$ is the expected number of events at state $S_{j}$ .
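As a rough illustration of this expectation, the sketch below computes the one-step forecast from a transition matrix and per-state mean counts. The second function extends it to longer lead times by propagating the state distribution through the transition matrix; that extension is a standard Markov chain property and an assumption here, not a formula stated in the text. Function names and numbers are hypothetical.

```python
def expected_count(eta, state_means, previous_state):
    """One-step forecast: sum_j eta[z_{t-1}][j] * E[S_j]."""
    return sum(p * m for p, m in zip(eta[previous_state], state_means))


def expected_count_ahead(eta, state_means, current_state, lead):
    """Forecast `lead` steps ahead by propagating the state distribution."""
    n = len(eta)
    # Start from a point mass on the known (or decoded) current state.
    dist = [1.0 if j == current_state else 0.0 for j in range(n)]
    for _ in range(lead):
        dist = [sum(dist[i] * eta[i][j] for i in range(n)) for j in range(n)]
    return sum(p * m for p, m in zip(dist, state_means))
```

With a long gap between the last observation and the forecast date, the propagated distribution approaches the chain's stationary distribution, so the forecast flattens toward a long-run average.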

3.2. Autoregressive models

We propose RARE—regularized autoregression with exogenous variables—for predicting terrorist activities. RARE is based on the ARX model—the autoregressive model with external variables³⁶—and Lasso.³⁷ The key idea is to use penalized regression (e.g., Lasso) for selecting autoregressive terms as well as covariates. The model is robust to the absence of historical data and requires limited history for prediction.

Let $Y = (y_{1}, y_{2}, \dots, y_{T})$ be the observed sequence of events over T time units, that is, the length of the time series. Formally, RARE(p,k) defines an autoregressive model with p autoregressive lags and k external variables. Given the observed series of events $Y = (y_{1}, y_{2}, \dots, y_{T})$ , the predicted value $y_{t}$ at time point t is expressed as follows:

y_{t} = μ_{y} + \sum_{i = 1}^{p} α_{i} y_{t - i} + \sum_{j = 1}^{k} β_{j} X_{j, t} + w_{t}

(1)

Here $μ_{y}$ is a constant, $α_{i}$ is the autoregressive (AR) coefficient at lag i, $β_{j}$ is the regression coefficient for external variable $X_{j}$ , and $w_{t} ~ N (0, σ^{2})$ is the white noise at time point t. The model exploits $ℓ_{1}$ -regularization for selecting k external variables. We estimate the model parameters $μ_{y}$ , $α = (α_{1}, \dots, α_{p})$ , $β = (β_{1}, \dots, β_{k})$ , $λ_{α}$ ( $ℓ_{1}$ -penalty for AR terms), $λ_{β}$ ( $ℓ_{1}$ -penalty for external variables) by minimizing the following objective function:

\begin{matrix} \sum_{t} {\left(y_{t} - μ_{y} - \sum_{i = 1}^{p} α_{i} y_{t - i} - \sum_{j = 1}^{k} β_{j} X_{j, t}\right)}^{2} + \\ λ_{α} {\| α \|}_{1} + λ_{β} {\| β \|}_{1} \end{matrix}

(2)
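To make the penalized objective concrete, the following sketch evaluates it for given coefficients, taking the residual from Equation (1): the observed count minus the constant, the autoregressive part, and the external-variable part. It only evaluates the objective and does not perform the minimization; the function name and data layout (`X[j][t]` for external variable j on day t) are assumptions made for illustration.

```python
def rare_objective(y, X, mu, alpha, beta, lam_alpha, lam_beta):
    """Squared residuals of Equation (1) plus separate l1 penalties.

    y      : list of daily counts
    X      : list of k external series, each the same length as y
    alpha  : AR coefficients, alpha[i] multiplies lag i + 1
    beta   : coefficients of the external variables
    """
    p = len(alpha)
    squared_error = 0.0
    for t in range(p, len(y)):  # the first p days lack a full lag history
        ar_part = sum(alpha[i] * y[t - 1 - i] for i in range(p))
        exo_part = sum(beta[j] * X[j][t] for j in range(len(beta)))
        residual = y[t] - mu - ar_part - exo_part
        squared_error += residual ** 2
    penalty = (lam_alpha * sum(abs(a) for a in alpha)
               + lam_beta * sum(abs(b) for b in beta))
    return squared_error + penalty
```

Larger penalty weights push more coefficients to exactly zero at the minimizer, which is how the l1 terms act as a selection mechanism for lags and external signals.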

For comparison, we also apply the widely used ARIMA model for forecasting events. ARIMA stands for autoregressive integrated moving average (MA). The key idea is that the number of current events ( $y_{t}$ ) depends on the past counts and forecast errors. Formally, ARIMA(p, d, q) defines an autoregressive model with p autoregressive lags, d difference operations, and q MA lags (see Shumway and Stoffer¹⁴). Given the observed series of events $Y = (y_{1}, y_{2}, \dots, y_{T})$ , ARIMA(p, d, q) applies d ( $\geq 0$ ) difference operations to transform $Y$ to a stationary series $Y'$ . Then the predicted value $y_{t}'$ at time point t can be expressed in terms of past observed values and forecasting errors, which is as follows:

y_{t}' = c + \sum_{i = 1}^{p} α_{i} y_{t - i}' + \sum_{j = 1}^{q} β_{j} e_{t - j} + w_{t}

(3)

Here $c$ is a constant, $α_{i}$ is the autoregressive (AR) coefficient at lag i, $β_{j}$ is the MA coefficient at lag j, $e_{t - j} = y_{t - j}' - {\hat{y}}_{t - j}'$ is the forecast error at lag j, and $w_{t}$ is assumed to be white noise ( $w_{t} \sim N (0, σ^{2})$ ). The AR model is essentially an ARIMA model without MA terms.

We use maximum likelihood estimation for learning the parameters; more specifically, parameters are optimized with the L-BFGS method.³⁸ These models assume that $(p, d, q)$ are known and the series is weakly stationary. To select $(p, d, q)$ , we run a grid search over candidate values and select the combination with the minimum Akaike information criterion (AIC) score.

We also compare our proposed methods against a base rate model, which predicts the number of future events as the average number of past events over a time window W. Formally:

y_{t} = \frac{1}{W} \sum_{i = 1}^{W} y_{t - i}

(4)

3.3. Evaluation of time series models

We use three error measures for quantitative evaluation of our time series models: (a) mean absolute error (MAE); (b) root mean squared error (RMSE); and (c) mean absolute scaled error (MASE).³⁹ These measures are defined as follows in terms of forecasting error, $e_{t} = y_{t} - {\hat{y}}_{t}$ , at time point t, where $y_{t}$ and ${\hat{y}}_{t}$ are the true and predicted values, respectively.

MAE:

MAE = \frac{1}{T} \sum_{t = 1}^{T} | e_{t} |

RMSE:

RMSE = \sqrt{\frac{1}{T} \sum_{t = 1}^{T} | e_{t} |^{2}}

MASE:

MASE = \frac{\frac{1}{T} \sum_{t = 1}^{T} | e_{t} |}{\frac{1}{T - 1} \sum_{t = 2}^{T} | y_{t} - y_{t - 1} |}
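The three measures translate directly into code. The sketch below follows the definitions above, with MASE scaled by the mean absolute one-step difference of the true series; the function names are illustrative.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mase(actual, predicted):
    """MAE scaled by the MAE of the naive 'repeat the previous value' forecast."""
    naive = sum(abs(actual[t] - actual[t - 1]) for t in range(1, len(actual)))
    naive /= len(actual) - 1
    return mae(actual, predicted) / naive
```

A MASE below 1 means the forecast has a smaller average error than the naive one-step forecast on the same series, while a value above 1 means it is worse. Note that `mase` is undefined for a constant series, since the scaling term is then zero.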

4. Experiments

We now present a case study for the proposed models using data on military and non-state actor events in the MENA region. Our goals are to answer the following questions.

Can the HMM capture latent structures in activities executed by various actors?

How do the proposed models perform with MANSA events at actor, country, and city levels?

Which external signals are good indicators for forecasting MANSA events?

How can we generate warnings given predicted event counts? How does the model perform in terms of quantitative evaluation of generated warnings?

4.1. Datasets

The ground truth information about MANSA events, called the gold standard report (GSR), is provided exclusively by the Center for Analytics at New Haven. The GSR is a list of MANSA events created manually by domain experts. Each event in the dataset has 22 different attributes: actor, actor status, approximate location, casualties, country, earliest reported date, encoding comment, event date, event id, event subtype, event type, first reported link, gold standard source link, latitude, longitude, news source, other links, revision date, state, target, target name, and target status. While much care was taken to address the attribution and duplication problems in the manual event documentation step, we also remove any remaining duplicates in preprocessing using these attributes.

For evaluation we use ground truth time series of daily event counts based on manually extracted, structured reports on events, at actor, city, and country level (see Table 1). We use two actors (ISIS and the Syrian Arab Military), two countries (Syria and Iraq), and two cities (Aleppo and Mosul). In addition, we use surrogate data, which is generated from Arabic news articles originating from MENA countries.

Table 1.

Aggregates for countries and top-eight cities for MANSA events in the time period from August 2016 to October 2017.

Country	Num. of events	City	Num. of events	Actor	Num. of events
Syria	56,696	Mosul	2535	Syrian Arab Military	22,961
Iraq	14,475	Aleppo	2366	Unspecified	16,368
Egypt	1679	Deir ez-Zor	1516	ISIS	8324
Lebanon	1089	Tadmur	884	Iraqi Military	3975
Saudi Arabia	233	Jawbar	776	Iraqi Security Forces	2063
Yemen	148	Mintaqat Dar’a al Balad	748	Russian Military	1988
Jordan	49	Ar Raqqah	667	Iraqi Police	1460

In order to generate potentially predictive signals, we apply a temporal topic-based feature extraction approach to Arabia Inform news articles,³ a corpus of news documents originating from MENA countries (see Figure 3), over a time span co-occurring with our GSR event time series. We consider the subset of the corpus that has at least one of our countries of interest (Iraq, Syria, Saudi Arabia, Lebanon, Yemen, Jordan) “tagged” as part of the meta-data provided from each document’s URL. As the corpus consists mostly of articles published in Egypt, and thus the majority of articles have “Egypt” as a tagged location, we exclude articles about places in Egypt. This largely Arabic corpus has approximately 20,000 documents per day, including a variety of topics spanning entertainment, politics, reporting articles, and general purpose news items.

Figure 3.

Monthly aggregates of Arabia Inform news articles in our corpus, after filtering for documents that have one or more of the following countries “tagged” in the article meta-data: Lebanon, Jordan, Yemen, Saudi Arabia, Iraq, Syria.

4.1.1. Topic-based temporal feature generation

To learn latent shifts in the news corpus that carry information about our events of interest, we chose two topic modeling techniques. Firstly, we train Latent Dirichlet Allocation (LDA) models with 100, 150, and 200 topics on the whole corpus and aggregate (see below for details) the posterior distributions of each topic over a given day’s documents. Secondly, we pre-train an LDA model on a set of 10,000 Arabic news articles reporting MANSA events, which were used to generate the ground truth event dataset, and repeat the temporal feature generation on the entire Arabia Inform corpus,⁴⁰ inferring topic posterior distributions over the same corpus. We use the Mallet LDA package,⁴¹ performing light stemming and stop-word removal as preprocessing. We found the Mallet LDA package to produce more coherent and consistent topics compared to the Corex⁴² and Gensim LDA packages.⁴³ We perform light stemming because the news articles in our dataset are predominantly in Arabic (90%), Arabic is a highly inflecting language,⁴⁴ and the development of a proper Arabic lemmatizer is still an active area of research. In our experiments, we did not achieve significant results with the first method (not pre-trained), and thus present only our findings with the pre-trained topic model.

In order to generate daily features given a trained topic model and a set of time-stamped documents, we denote $D_{t}$ as the set of Arabia Inform documents on day t, $N_{t}$ as the number of documents on day t, and $d_{t, i}$ as the ith document on day t, represented by a bag-of-words. Then we have $v_{t, i}$ as the k-dimensional vector of estimated topic distributions for a learned LDA topic representation with k topics, applied to document $d_{t, i}$ , and $v_{t, i, j}$ represents the jth coordinate of the feature vector $v_{t, i}$ of document $d_{t, i}$ . Finally, to generate a temporal feature vector using a document corpus, we define the jth coordinate of the feature vector $V_{t}$ , for $j \in \{1, \dots, k\}$ , as the average over the day’s documents:

V_{t, j} = \frac{1}{N_{t}} \sum_{i = 1}^{N_{t}} v_{t, i, j}
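The daily aggregation amounts to averaging per-document topic vectors within each day. The sketch below assumes the per-document topic distributions have already been inferred by a topic model and are supplied as plain lists grouped by day; the function name, the dictionary layout, and the choice to skip days without documents are assumptions for illustration.

```python
def daily_topic_features(doc_topics_by_day):
    """Average the k-dimensional topic vectors of all documents on each day.

    doc_topics_by_day maps a day label to a list of topic vectors,
    one vector per document published that day.
    """
    features = {}
    for day, doc_vectors in doc_topics_by_day.items():
        n_docs = len(doc_vectors)
        if n_docs == 0:
            continue  # no documents, so no feature vector for this day
        k = len(doc_vectors[0])
        features[day] = [sum(v[j] for v in doc_vectors) / n_docs for j in range(k)]
    return features
```

Each resulting coordinate is a daily time series for one topic, which is the form in which the topics enter the regression as external variables.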

4.2. Structures in actors’ activities using a two-state Hidden Markov Model

We hypothesize that the number of activities performed by an actor might have hidden structures (e.g., high-activity and low-activity periods), which may not be well-captured using a simple counting process, such as the Poisson process. We employ the HMM for capturing hidden structures in activities by various actors in our dataset, such as ISIS, the Syrian Arab Military, the Iraqi Military, and the Russian Military. Figure 4 illustrates the result of the HMM for ISIS with various settings. Specifically, the datasets contain daily counts of terrorist events by ISIS in Iraq and Syria. We used the initial 80% of the data for training, using the Baum–Welch algorithm to estimate the model parameters. Here we report results using the Gaussian observation model, so that the total number of free parameters is six: two transition probabilities, from L state to H and vice versa, and four parameters for the observation model (two for each hidden state).

Figure 4.

(a) The two-state Hidden Markov Model (HMM) trained on ISIS data. (b) The predictive performance of the method, measured in mean squared error, for different observation models and different lead times.

Figure 4 depicts the model learned via the Baum–Welch method. We observe that both hidden states have significant inertia, that is, the actor is more likely to stay in the same hidden state than to transition to a new one. Perhaps more importantly, the rate of events (as characterized by the mean of the Gaussian model) differs significantly between the states: the average number of attacks per day is $13.4$ when the actor is in the high-activity state H, compared to $5.7$ in the low-activity state L. Figure 4(b) shows the prediction error of the model for different choices of observation models with different lead times.

Next, we focus on the task of reconstructing the hidden trajectory of the actor. Toward that goal, we run the Viterbi algorithm, which returns a single (maximum a posteriori) hidden state sequence that best explains the observed counts. Figure 5 shows the event count together with the reconstructed hidden dynamics. Remarkably, even this simple two-state model is able to capture the spurts in the activity.
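For readers unfamiliar with the decoding step, here is a compact log-domain Viterbi sketch for an HMM with Gaussian emissions. It is a generic textbook version, not the authors' code. In the usage note, the state means 5.7 and 13.4 come from the text, while the standard deviations, transition probabilities, and count sequence are invented for illustration.

```python
import math


def safe_log(p):
    """Logarithm that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def gaussian_logpdf(x, mean, std):
    """Log-density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def viterbi(counts, pi, eta, means, stds):
    """Return the most probable hidden state sequence for the observed counts."""
    n = len(pi)
    score = [safe_log(pi[s]) + gaussian_logpdf(counts[0], means[s], stds[s])
             for s in range(n)]
    backpointers = []
    for y in counts[1:]:
        new_score, pointers = [], []
        for j in range(n):
            # Best predecessor of state j at this time step.
            best_prev = max(range(n), key=lambda i: score[i] + safe_log(eta[i][j]))
            pointers.append(best_prev)
            new_score.append(score[best_prev] + safe_log(eta[best_prev][j])
                             + gaussian_logpdf(y, means[j], stds[j]))
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

Calling `viterbi([5, 6, 5, 14, 13, 15, 6, 5], [0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [5.7, 13.4], [2.0, 2.0])` labels the three middle days as the high-activity state (index 1) and the rest as low-activity (index 0), since two state switches cost less than explaining counts near 14 with a mean of 5.7.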

Figure 5.

Reconstructed hidden state sequence (dotted red lines) together with the observed count sequence (blue) for three different time windows. The trajectory given by the dotted red line switches between the high-activity (upper line) and low-activity (lower line) states. (Color online only.)

4.3. Predictive performance of HMM, ARIMA, and RARE models

The GSR represents the occurrence of an event on a given day at a specific location by a specific actor. The GSR is typically lagged (e.g., by a month), which poses a challenge for the prediction algorithm. In our evaluation settings, we assume a gap of a month between the last day of the training period and the first day of the testing period. We keep the test period to a month, as the GSR is updated each month. For the RARE model, we use topic-based temporal features as external signals (see Section 4.1). Before applying temporal features, we first align them with the GSR using correlation analysis: we determine the lag at which the correlation between a temporal feature and the GSR is maximal, and use that lag for alignment. We tested the RARE model with 50 and 100 external features and $p = 30$ .
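The lag-alignment step can be sketched as a search over candidate lags for the one that maximizes the Pearson correlation between a feature series and the event counts. The text does not specify the lag range, the correlation measure, or whether negative lags are considered, so the choices below (Pearson correlation, non-negative lags where the feature leads the events, equal-length series) are assumptions, as are the function names.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def best_lag(feature, target, max_lag):
    """Find the lag at which feature[t] correlates best with target[t + lag]."""
    best, best_corr = 0, float("-inf")
    for lag in range(max_lag + 1):
        shifted_feature = feature[: len(feature) - lag]
        shifted_target = target[lag:]
        corr = pearson(shifted_feature, shifted_target)
        if corr > best_corr:
            best, best_corr = lag, corr
    return best, best_corr
```

Once the lag is known, the feature series is shifted by that many days before it enters the regression, so that each day's event count is paired with the feature value that best anticipates it.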

Figure 6(a) and Table 2 illustrate the models’ predictions and performance measures for ISIS activities over the month of January 2017, respectively. Here the models are trained with the data from 1 August 2016 to 30 November 2016, and we assume there is no GSR for the month of December 2016. Although the HMM performed better than the RARE model for this period in terms of MAE and MASE, the RARE model captures the trends better than other models. Figure 6(b) and Table 3 present the models’ predictions and performance measures for Syrian Arab Military activities over the month of March 2017, respectively. Here we assume the absence of a GSR in February 2017, and the models are trained with the data from 1 August 2016 to 31 January 2017. The RARE model clearly outperforms the other models in terms of capturing the trends and performance measures.

Figure 6.

Forecasting ISIS and Syrian Arab Military activities using the Hidden Markov Model (HMM), autoregressive integrated moving average (ARIMA) model, regularized autoregression with exogenous variables (RARE) model, and a base rate model: (a) models forecast ISIS activities over the period (January 2017) wherein the models are trained with the data from 1 August 2016 to 30 November 2016, and the month of December 2016 is considered as the gap period; (b) forecasting of Syrian Arab Military activities over the period (March 2017) wherein the models are trained with the data from 1 August 2016 to 31 January 2017, and the month of February 2017 is considered as the gap period. For both settings, the HMM with two hidden states and Gaussian emission probability is used, and the ARIMA and RARE models are identified using a grid search over parameters.

Table 2.

Forecasting of ISIS activities using the Hidden Markov Model (HMM) and autoregressive models (autoregressive integrated moving average (ARIMA) and regularized autoregression with exogenous variables (RARE)). Methods are compared in terms of different performance metrics: mean absolute error (MAE), root mean squared error (RMSE), and mean absolute scaled error (MASE).

Method	MAE	RMSE	MASE
Base rate	9.25	10.85	0.87
HMM_Gaussian	8.25	11.33	0.78
ARIMA	28.95	30.82	2.73
RARE	9.33	11.01	0.88

Table 3.

Forecasting of Syrian Arab Military activities using the Hidden Markov Model (HMM) and autoregressive models (autoregressive integrated moving average (ARIMA) and regularized autoregression with exogenous variables (RARE)). Methods are compared in terms of different performance metrics: mean absolute error (MAE), root mean squared error (RMSE), and mean absolute scaled error (MASE).

Method	MAE	RMSE	MASE
Base rate	76.70	78.73	4.33
HMM_Gaussian	35.71	38.13	2.02
ARIMA	56.74	59.45	3.21
RARE	27.93	32.85	1.58

We also evaluate our models for country-level event activities. We use six months of the GSR as training data starting from 1 August 2016, and use the model for predicting over a month, where there is a gap of a month between training and forecasting spans. We then shift the training period by a month and repeat the forecasting up to the month of September 2017. Tables 4 and 5 show the comparison between methods with different average metrics over seven months. We observed that the HMM and RARE model perform better compared to the other models. Figures 7(a) and (b) illustrate the models’ predictions over activities in Syria and Iraq for the months of August 2017 and May 2017, respectively.
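The rolling evaluation scheme described above (six training months, a one-month gap, a one-month test window, shifted forward a month at a time) can be enumerated as follows. The sketch works at calendar-month granularity with `(year, month)` tuples, which is an assumption made for simplicity; the function names are hypothetical.

```python
def add_months(year, month, offset):
    """Shift a (year, month) pair by a number of months."""
    index = year * 12 + (month - 1) + offset
    return index // 12, index % 12 + 1


def rolling_splits(start, last_test, train_months=6, gap_months=1):
    """List train/test month ranges until the test month passes last_test."""
    splits = []
    shift = 0
    while True:
        train_start = add_months(start[0], start[1], shift)
        train_end = add_months(train_start[0], train_start[1], train_months - 1)
        test_month = add_months(train_end[0], train_end[1], gap_months + 1)
        if test_month > last_test:  # tuples compare by year, then month
            break
        splits.append({"train": (train_start, train_end), "test": test_month})
        shift += 1
    return splits
```

Starting at `(2016, 8)` with a last test month of `(2017, 9)`, this yields seven splits, the first training on August 2016 through January 2017 and testing on March 2017, which matches the seven-month averages reported in the tables.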

Table 4.

Forecasting of MANSA events in Syria using the Hidden Markov Model (HMM) and autoregressive models (autoregressive integrated moving average (ARIMA) and regularized autoregression with exogenous variables (RARE)). Methods are compared in terms of different performance metrics with average over seven months: mean absolute error (MAE), root mean squared error (RMSE), and mean absolute scaled error (MASE).

Method	MAE	RMSE	MASE
Base rate	32.69	37.62	1.44
HMM_Gaussian	23.48	29.12	1.04
ARIMA	32.76	37.95	1.45
RARE	25.91	31.45	1.14

Table 5.

Forecasting of MANSA events in Iraq using the Hidden Markov Model (HMM) and autoregressive models (autoregressive integrated moving average (ARIMA) and regularized autoregression with exogenous variables (RARE)). Methods are compared in terms of different performance metrics with average over seven months: mean absolute error (MAE), root mean squared error (RMSE), and mean absolute scaled error (MASE).

Method	MAE	RMSE	MASE
Base rate	11.85	14.00	1.09
HMM_Gaussian	13.38	15.70	1.20
ARIMA	11.95	14.23	1.09
RARE	11.73	14.03	1.08

Figure 7.

Forecasting events in Syria and Iraq using the Hidden Markov Model (HMM), autoregressive integrated moving average (ARIMA) model, regularized autoregression with exogenous variables (RARE) model, and a base rate model. (a), (b) Models forecast activities in Syria and Iraq over the period August 2017 and May 2017, respectively. For both settings, the HMM with two hidden states and Gaussian emission probability is used, and the ARIMA and RARE models are identified using a grid search over parameters.

Finally, we also evaluate our models against city-level events with two cities—Mosul and Aleppo. Similar to country-level event data, we use six months of the GSR as training data starting from 1 August 2016, and use the model for predicting over a month, where there is a gap of a month between training and forecasting spans. We then shift the training period by a month and repeat the forecasting up to the month of September 2017. Tables 6 and 7 show the comparison between methods over different metrics across seven months. We observed that the HMM and RARE model perform better compared to other models for Aleppo, but the ARIMA model outperformed others for Mosul. The reason could be the sparsity in the city-level events. Figures 8(a) and (b) illustrate the models’ predictions over activities in Aleppo and Mosul for the months of August 2017 and May 2017, respectively.

Table 6.

Forecasting of MANSA events in Aleppo using the Hidden Markov Model (HMM) and autoregressive models (autoregressive integrated moving average (ARIMA) and regularized autoregression with exogenous variables (RARE)). Methods are compared in terms of different performance metrics with average over seven months: mean absolute error (MAE), root mean squared error (RMSE), and mean absolute scaled error (MASE).

Method	MAE	RMSE	MASE
Base rate	3.69	3.88	3.01
HMM_Gaussian	1.56	1.78	1.32
ARIMA	3.40	3.74	2.22
RARE	1.57	1.83	1.24

Table 7.

Forecasting of MANSA events in Mosul using the Hidden Markov Model (HMM) and autoregressive models (autoregressive integrated moving average (ARIMA) and regularized autoregression with exogenous variables (RARE)). Methods are compared in terms of different performance metrics with average over seven months: mean absolute error (MAE), root mean squared error (RMSE), and mean absolute scaled error (MASE).

Method	MAE	RMSE	MASE
Base rate	4.37	4.89	2.07
HMM_Gaussian	3.85	4.52	1.77
ARIMA	3.64	4.17	1.59
RARE	4.25	4.77	2.03
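For readers who want to reproduce the three error metrics reported in these tables, a minimal sketch follows. It assumes the common definition of MASE, in which the forecast MAE is scaled by the in-sample MAE of a naive one-step forecast on the training series. The paper's exact scaling may differ, and the numbers below are toy values rather than GSR data.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mase(actual, predicted, train):
    """Forecast MAE divided by the in-sample MAE of the naive forecast.

    The naive forecast predicts each training value with the previous one.
    Values below 1 mean the model beats that naive baseline on average.
    """
    naive_mae = sum(
        abs(train[t] - train[t - 1]) for t in range(1, len(train))
    ) / (len(train) - 1)
    return mae(actual, predicted) / naive_mae


# Toy daily counts, for illustration only.
train = [3, 5, 4, 6]
actual = [5, 7]
predicted = [4, 9]
scores = (mae(actual, predicted),
          rmse(actual, predicted),
          mase(actual, predicted, train))
```

The table values are averages of such per-month scores over the seven forecast months.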

Figure 8.

Forecasting events in Aleppo and Mosul using the Hidden Markov Model (HMM), autoregressive integrated moving average (ARIMA) model, regularized autoregression with exogenous variables (RARE) model, and a base rate model. (a), (b) Models forecast activities in Aleppo and Mosul over the period May 2017. For both settings, the HMM with two hidden states and Gaussian emission probability is used, and the ARIMA and RARE models are identified using a grid search over parameters.

4.4. Important predictors in forecasting MANSA events

The RARE model identifies a subset of autoregressive variables and external variables that are predictive of the target, which is the number of events occurring each day. We analyze the topics selected by the algorithm. As an example, Tables 8 and 9 show some of the features identified by the RARE model for ISIS and Syrian Arab Military activities, respectively. We observe that many of the identified topics are meaningful and relevant to the events associated with ISIS and the Syrian Arab Military.
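To make the variable selection step concrete, the sketch below fits an L1-penalized linear regression by cyclic coordinate descent on a design matrix built from lagged counts and lagged external signals. It is an assumption-laden stand-in for the RARE model, not the authors' implementation. The one-day lag on external signals, the absence of an intercept, the penalty value, and the names `build_design` and `lasso_cd` are all chosen here for illustration.

```python
def build_design(counts, signals, n_lags):
    """Rows hold [y(t-1), ..., y(t-n_lags), s_1(t-1), s_2(t-1), ...]; target is y(t)."""
    X, y = [], []
    for t in range(n_lags, len(counts)):
        row = [counts[t - k] for k in range(1, n_lags + 1)]
        row += [s[t - 1] for s in signals]   # assumed one-day lag on each signal
        X.append(row)
        y.append(counts[t])
    return X, y


def soft_threshold(x, lam):
    """Shrink x toward zero by lam, returning exactly 0.0 inside [-lam, lam]."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n) * ||y - Xw||^2 + lam * ||w||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)                          # residual y - Xw for w = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Toy check with two hypothetical, non-overlapping features: the first carries
# a strong signal, the second only a weak one that the penalty removes.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.1, 3.0, 0.1]
weights = lasso_cd(X, y, lam=0.1)
selected = [j for j, w in enumerate(weights) if w != 0.0]
```

Features whose weight is driven exactly to zero are dropped. The surviving indices play the role of the selected topics and lags, and in practice the penalty strength would be tuned, for example by the grid search mentioned in the figure captions.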

Table 8.

Representative features selected by the regularized autoregression with exogenous variables model with a training set for ISIS activities from 1 August 2016 to 30 November 2016.

Topic 0	Topic 9	Topic 38	Topic 45	Topic 48
Damascus	Forces	Lattakia	General	Security
Forces	Brive	Forces	Government	Capture
Eastern	Aleppo	Sham	Big	Brotherhood
Al Gouta	Syrian conflict	Syrian	Region	Prosecution
Syrian conflict	East	North	Necessity	Accused
Rights	Islamic	Support	Stressing	Security
West	Rights	Al-Qaeda	Areas	Police
Human	Clashes	Front	Work	Director
Insurgents	Loyal	group	Shadow	Major General
Bombing	Human	Nationalities	Entities	Elements
Factions	Insurgents	Syrian conflict	Including	Investigation
Islamic	Ocean	The Kurds	Operation	Group

Table 9.

Representative features selected by the regularized autoregression with exogenous variables model with a training set for Syrian Arab Military activities from 1 August 2016 to 31 January 2017.

Topic 0	Topic 9	Topic 41	Topic 47
Damascus	Forces	Saudi Arabia	National
Forces	Brive	Border	Iraq
Eastern	Aleppo	Sanafir	North
Al Gouta	Syrian conflict	Alliance	Kirkuk
Syrian conflict	East	United Nations	Baquba
Rights	Islamic	Demarcation	Security
West	Rights	The kingdom	Capture
Human	Clashes	The Houthis	Diyala
Insurgents	Loyal	Party	Reporter
Bombing	Human	agreement	Mosul
Factions	Insurgents	Countries	East
Islamic	Ocean		Security

4.5. Warning generation

The proposed models forecast event counts, but an intelligence analyst may need more details about the events for better understanding and dissemination. We propose a two-phase algorithm for generating real-world warnings. The event counts forecast by each model are converted to warnings by sampling each event detail field from the empirical distribution of that field. To assess the efficacy of this approach, we generate warnings at the country level (Syria and Iraq) for two types of events (military action and non-state actor events) over the months from March to September 2017. For each event count, we run six trials, generating six different sets of warnings. We match the generated warnings against GSR events using the Hungarian matching⁴⁵ algorithm as well as other numerical and string matching algorithms. If a warning falls within seven days of the corresponding true event, the warning is included for further analysis in terms of various metrics. Figures 9–11 show the evaluation of warnings generated by the base rate model, the HMM, and the RARE model in terms of precision, recall, and quality score. Each box in the plots spans the middle 50% of the data, and each vertical red line denotes the median. The RARE model performs better than the others in terms of precision, and slightly better than the base rate model in terms of warning quality.
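The conversion from a forecast count to concrete warnings can be sketched as below. This is a minimal illustration that samples every field independently from its empirical distribution in historical events. The field names (`actor`, `target`, `city`), the toy history, and the function names are hypothetical, and the paper's algorithm may condition fields on one another or on the forecast date.

```python
import random
from collections import Counter


def empirical_distribution(events, field):
    """Return the observed values of `field` and how often each occurred."""
    counts = Counter(event[field] for event in events)
    values = list(counts)
    weights = [counts[v] for v in values]
    return values, weights


def generate_warnings(predicted_count, history, fields, day, rng):
    """Turn a forecast count for `day` into that many warnings.

    Each field is drawn in proportion to its frequency in `history`.
    """
    dists = {f: empirical_distribution(history, f) for f in fields}
    warnings = []
    for _ in range(predicted_count):
        warning = {"date": day}
        for f in fields:
            values, weights = dists[f]
            warning[f] = rng.choices(values, weights=weights, k=1)[0]
        warnings.append(warning)
    return warnings


# Toy history, for illustration only.
history = [
    {"actor": "A", "target": "military", "city": "Aleppo"},
    {"actor": "A", "target": "civilian", "city": "Aleppo"},
    {"actor": "B", "target": "military", "city": "Mosul"},
]
fields = ["actor", "target", "city"]
rng = random.Random(0)   # fixed seed so one trial is repeatable
warnings = generate_warnings(4, history, fields, "2017-03-01", rng)
```

Repeating the call with different seeds gives several warning sets for the same count, in the spirit of the six trials per event count described above.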

Figure 9.

Evaluation of warnings generated using the base rate model for two types of events in Syria and Iraq from 1 March 2017 to 30 September 2017. (Color online only.)

Figure 10.

Evaluation of warnings generated using the Hidden Markov Model for two types of events in Syria and Iraq from 1 March 2017 to 30 September 2017. (Color online only.)

Figure 11.

Evaluation of warnings generated using the regularized autoregression with exogenous variables model for two types of events in Syria and Iraq from 1 March 2017 to 30 September 2017. (Color online only.)
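A simplified view of how warnings can be scored against GSR events is sketched below. It pairs each warning with at most one event that lies within seven days and then computes precision and recall from the number of pairs. Note the substitution: the sketch uses a plain augmenting-path bipartite matching on dates only, as a stand-in for the Hungarian algorithm used in the paper, which additionally optimizes a matching cost over several fields. The precision and recall definitions here are assumptions, and the quality score is not reproduced.

```python
from datetime import date


def match_warnings(warnings, events, window_days=7):
    """Maximum one-to-one matching of warning dates to event dates.

    A warning and an event may be paired only if their dates differ by at
    most `window_days`. Returns a dict {warning index: event index}.
    """
    warning_of_event = {}

    def try_assign(wi, seen):
        for ei, event_day in enumerate(events):
            if ei in seen or abs((warnings[wi] - event_day).days) > window_days:
                continue
            seen.add(ei)
            # Take a free event, or re-route the warning that currently holds it.
            if ei not in warning_of_event or try_assign(warning_of_event[ei], seen):
                warning_of_event[ei] = wi
                return True
        return False

    for wi in range(len(warnings)):
        try_assign(wi, set())
    return {wi: ei for ei, wi in warning_of_event.items()}


def precision_recall(warnings, events, window_days=7):
    """Share of warnings that were matched, and share of events that were matched."""
    matched = len(match_warnings(warnings, events, window_days))
    precision = matched / len(warnings) if warnings else 0.0
    recall = matched / len(events) if events else 0.0
    return precision, recall


# Toy dates, for illustration only.
warning_days = [date(2017, 3, 1), date(2017, 3, 3),
                date(2017, 3, 20), date(2017, 4, 15)]
event_days = [date(2017, 3, 5), date(2017, 3, 25), date(2017, 3, 26)]
matches = match_warnings(warning_days, event_days)
precision, recall = precision_recall(warning_days, event_days)
```

In this toy case the first two warnings compete for the same event and the last warning has no event within the window, so only two of the four warnings are matched.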

5. Discussion

We explore state-based (HMM) and autoregressive (ARIMA and RARE) models for generating event forecasts with external indicators. Both the HMM and RARE model perform quite well when a reasonable amount of data is available (actor and country-level events), but performance deteriorates when events are sparse. Low event density and rare event types both pose a challenge to the proposed models. Some of the countries (e.g., Saudi Arabia and Yemen) and most of the cities in our dataset have low event density, for which the HMM and the autoregressive models seem inadequate. In addition, some event types are inherently rare, such as epidemic diseases that occur far less often than flu epidemics. For these rare events, the HMM and the autoregressive models may not work well. To address these problems, we need predictive models that take the detailed event context in external sources into account.

In this study we model each actor independently of the others, although actors interact with each other in real-world scenarios. It would be interesting to model actors with more than two operational states as well as interactions between multiple actors.

This study explores one external source (Arabia Inform news articles) for event forecasting. Our methods can be extended to deal with signals from additional sources, such as Twitter and blogs, which we plan to explore in the future. It is also possible to develop models that consider each of the data sources separately and that select subsets of external signals from each group for prediction. In addition to event count prediction models, we plan to explore models that not only forecast events but also identify the precursors to events in external sources.

Footnotes

Acknowledgements

The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon.

Funding

This research is based upon work supported by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA).

ORCID iD

KSM Tozammel Hossain

Author biographies

KSM Tozammel Hossain is a Postdoctoral Research Associate in the Machine Intelligence and Data Science (MINDS) group at the Information Sciences Institute, University of Southern California (USC). He earned his doctoral degree in computer science at Virginia Tech. His research interests broadly lie in developing and applying data science and machine learning techniques to solve problems stemming from bioinformatics, social networks, and social science.

Shuyang Gao has recently received his PhD in computer science in the MINDS group at the Information Sciences Institute, USC. He has a BS in computer science from Fudan University, Shanghai, China. His research focuses on information theory and machine learning.

Brendan Kennedy is a PhD student in the Department of Computer Science at the USC. He received his BS in computer science at Gonzaga University. His research interests are in the areas of computational social science, interpretable machine learning, and natural language processing.

Aram Galstyan is the Director of the MINDS group at the Information Sciences Institute, USC. He is also a research associate professor at the USC Computer Science department. His work focuses on various problems at the intersection of machine learning, information theory, and statistical physics. He is the Co-PI for IARPA’s Mercury program, where he leads the algorithm development effort for machine-based forecasting. He was the PI for the DARPA SMISC project, where he developed methods that helped the USC ISI team achieve perfect accuracy on DARPA’s Social Bot Detection challenge.

Prem Natarajan is the Michael Keston Executive Director of Information Sciences Institute at the USC, a vice dean of the USC Viterbi School of Engineering, and a professor of computer science. At ISI, he leads institute-wide managerial and technical directions, including research, development, and the MOSIS electronic chip brokerage. He also heads teams in his areas of expertise: novel approaches to face, speech, handwriting, and optical character recognition (OCR), along with other deep learning and natural language processing directions.

References

1. Schrodt, Yonamine, Bagozzi BE. Data-based computational approaches to forecasting political violence. In: Subrahmanian (ed) Handbook of computational approaches to counterterrorism. New York: Springer, 2013, pp. 29–162.
2. Raghavan, Galstyan, Tartakovsky AG. Hidden Markov models for the activity profile of terrorist groups. Ann Appl Stat 2013; 7: 2402–2430.
3. Arabia Inform. http://arabiainform.com (accessed 30 December 2017).
4. Porter, White, et al. Self-exciting hurdle models for terrorist activity. Ann Appl Stat 2012; 6: 106–124.
5. Ward, Metternich, Carrington, et al. Geographical models of crises: evidence from ICEWS. Adv Des Cross Cult Activ 2012; pp. 429–438.
6. Lewis, Mohler, Brantingham, et al. Self-exciting point process models of civilian deaths in Iraq. Secur J 2012; 25: 244–264.
7. Enders, Sandler. The effectiveness of antiterrorism policies: a vector-autoregression-intervention analysis. Am Polit Sci Rev 1993; 87: 829–844.
8. Enders, Sandler. Is transnational terrorism becoming more threatening? A time-series investigation. J Conflict Resolut 2000; 44: 307–332.
9. Dugan, LaFree, Piquero. Testing a rational choice model of airline hijackings. Criminology 2005; 43: 1031–1065.
10. LaFree, Morris, Dugan. Cross-national patterns of terrorism: comparing trajectories for total, attributed and fatal attacks, 1970–2006. Br J Criminol 2009; 50: 622–649.
11. Lendasse, De Bodt, Wertz, et al. Non-linear financial time series forecasting-application to the bel 20 stock market index. Eur J Econ Soc Syst 2000; 14: 81–91.
12. Chakraborty, Khadivi, Lewis, et al. Forecasting a moving target: ensemble models for ILI case count predictions. In: Zaki, Obradovic, Tan, Banerjee, Kamath, Parthasarathy (eds.) Proceedings of the 2014 SIAM international conference on data mining, Philadelphia, PA, 24–26 April 2014, pp. 262–270. SIAM.
13. Wang, Chakraborty, Mekaru, et al. Dynamic Poisson autoregression for influenza-like-illness case count prediction. In: Cao, Zhang, Joachims, Webb, Margineantu, Williams (eds.) Proceedings of the 21st ACM SIGKDD international conference on knowledge discovery and data mining (KDD’15), Sydney, NSW, 10–13 August 2015, pp. 1285–1294. New York: ACM.
14. Shumway, Stoffer. Time series analysis and its applications: with R examples. 3rd ed. New York: Springer Science Business Media, 2011.
15. Box-Steffensmeier, Freeman, Hitt, et al. Time series analysis for the social sciences. New York: Cambridge University Press, 2014.
16. Box, Jenkins, Reinsel, et al. Time series analysis: forecasting and control. 4th ed. Hoboken, NJ: John Wiley Sons, 2016.
17. Prado, West. Time series: modeling, computation, and inference. Boca Raton, FL: CRC Press, 2010.
18. Viinikka, Debar, Mé, et al. Processing intrusion detection alert aggregates with time series modeling. Informat Fusion 2009; 10: 312–324.
19. Douc, Moulines, Stoffer. Nonlinear time series: theory, methods and applications with R examples. Boca Raton, FL: CRC Press, 2014.
20. Ramakrishnan, Butler, Muthiah, et al. ‘Beating the news’ with EMBERS: forecasting civil unrest using open source indicators. In: Macskassy, Perlich, Leskovec, Wang, Ghani (eds.) Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, New York, NY, 24–27 August 2014, pp. 1799–1808.
21. Muthiah, Butler, Khandpur, et al. Embers at 4 years: Experiences operating an open source indicators forecasting system. arXiv preprint arXiv:1604.00033, 2016.
22. Kallus. Predicting crowd behavior with big public data. In: Chung, Broder, Shim, Suel (eds.) Proceedings of the 23rd international conference on world wide web, Seoul, South Korea, 7–11 April 2014, pp. 625–630. ACM.
23. Wang, Gerber, Brown. Automatic crime prediction using events extracted from twitter posts. In: Yang, Greenberg, Endsley (eds.) Proceedings of the International conference on social computing, behavioral-cultural modeling, and prediction, College Park, MD, 3–5 April 2012, pp. 231–238. Springer.
24. Python, Illian, Jones-Todd, et al. A Bayesian approach to modelling fine-scale spatial dynamics of non-state terrorism: world study, 2002–2013. arXiv preprint arXiv:1610.01215, 2016.
25. Wang, Gerber. Using Twitter for next-place prediction, with an application to crime prediction. In: IEEE symposium series on computational intelligence, Cape Town, South Africa, 7–10 December 2015, pp. 941–948. IEEE.
26. Minhas, Ulfelder, Ward MD. Mining texts to efficiently generate global data on political regime types. Res Polit 2015; 2: 2053168015589217.
27. Ulfelder. A multimodel ensemble for forecasting onsets of state-sponsored mass killing. In: American Political Science Association 2013 annual meeting, Washington, D.C., August 2013.
28. Chen, Hossain, Butler, et al. Syndromic surveillance of flu on twitter using weakly supervised temporal topic models. Data Mining Knowl Discov 2016; 30: 681–710.
29. Zhao, Chen, et al. Spatiotemporal event forecasting in social media. In: Venkatasubramanian (eds.) Proceedings of the 2015 SIAM international conference on data mining, Vancouver, BC, 30 April–2 May 2015, pp. 963–971. SIAM.
30. Zhao, Sun, et al. Multi-task learning for spatiotemporal event forecasting. In: Zaki, Obradovic, Tan, Banerjee, Kamath, Parthasarathy (eds.) Proceedings of the 21st ACM SIGKDD international conference on knowledge discovery and data mining, Sydney, NSW, 10–13 August 2015, pp. 1503–1512. ACM.
31. Compton, Lee, et al. Using publicly visible social media to build detailed forecasts of civil unrest. Secur Informatic 2014; 3(1): 4.
32. Ning, Muthiah, Rangwala, et al. Modeling precursors for event forecasting via nested multi-instance learning. In: Krishnapuram, Shah, Smola, Aggarwal, Shen, Rastogi (eds.) Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, San Francisco, CA, 13–17 August 2016, pp. 1095–1104. ACM.
33. Zammit-Mangion, Dewar, Kadirkamanathan, et al. Point process modelling of the Afghan war diary. Proc Natl Acad Sci 2012; 109: 12414–12419.
34. Yonamine JE. Predicting future levels of violence in Afghanistan districts using gdelt. Unpublished manuscript, 2013.
35. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proc IEEE 1989; 77: 257–286.
36. Ljung. Black-box models from input-output measurements. In: Proceedings of the 18th IEEE instrumentation and measurement technology conference, volume 1, Budapest, Hungary, 21–23 May 2001, pp. 138–146. IEEE.
37. Tibshirani. Regression shrinkage and selection via the lasso. J R Stat Soc Ser B Methodolog 1996; 58: 267–288.
38. Seabold, Perktold. Statsmodels: econometric and statistical modeling with python. In: Walt, Millman (eds.) Proceedings of the 9th Python in science conference, volume 57, Austin, TX, 28 June–3 July 2010, p. 61.
39. Hyndman, Koehler. Another look at measures of forecast accuracy. Int J Forecast 2006; 22: 679–688.
40. Blei, Jordan. Latent Dirichlet allocation. J Mach Learn Res 2003; 3: 993–1022.
41. McCallum. Mallet: a machine learning for language toolkit, http://mallet.cs.umass.edu (2002, accessed 30 December 2017).
42. Steeg, Gao, Reing, et al. Toward interpretable topic discovery via anchored correlation explanation. In: ICML workshop on human interpretability in machine learning (WHI’16), New York City, 19–24 June 2016.
43. Řehůřek, Sojka. Software framework for topic modelling with large corpora. In: Proceedings of the LREC 2010 workshop on new challenges for NLP frameworks, Valletta, Malta, 22 May 2010, pp. 45–50. Valletta, Malta: ELRA.
44. Larkey, Ballesteros, Connell. Light stemming for Arabic information retrieval. In: Soudi, Bosch, Neumann (eds.) Arabic computational morphology. Dordrecht: Springer, 2007, pp. 221–243.
45. Caseau, Laburthe. Solving various weighted matching problems with constraints. Principl Pract Constraint Program CP97 1997; 1330: 17–31.

Forecasting violent events in the Middle East and North Africa using the Hidden Markov Model and regularized autoregressive models
