A Model of Dynamic Flows: Explaining Turkey’s Interprovincial Migration

Abstract

The flow of resources across nodes over time (e.g., migration, financial transfers, peer-to-peer interactions) is a common phenomenon in sociology. Standard statistical methods are inadequate to model such interdependent flows. We propose a hierarchical Dirichlet-multinomial regression model and a Bayesian estimation method. We apply the model to analyze 25,632,876 migration instances that took place between Turkey’s 81 provinces from 2009 to 2018. We then discuss the methodological and substantive implications of our results. Methodologically, we demonstrate the predictive advantage of our model compared to its most common alternative in migration research, the gravity model. We also discuss our model in the context of other approaches, mostly developed in the social networks literature. Substantively, we find that population, economic prosperity, the spatial and political distance between the origin and destination, the strength of the AKP (Justice and Development Party) in a province, and the network characteristics of the provinces are important predictors of migration, whereas the proportion of ethnic minority Kurds in a province has no positive association with in- and out-migration.

Keywords

hierarchical Bayesian modeling, Dirichlet-multinomial regression, Markov chain Monte Carlo, migration, Turkey

Flows of items such as people, resources, and information across a finite number of units such as places, institutions, and individuals over time constitute a common type of data in sociology and the wider social sciences. Migration is one of the most frequent types of such flows; other examples include financial transfers between institutions or individuals, passengers through transportation systems, mobility tables, and the number of times participants trust or allocate a fixed resource to others during an experiment. Due to the highly interdependent nature of such situations, certain modeling problems arise (e.g., destinations “compete” over a fixed amount of flow) that are not straightforward to address with standard statistical models (Block, Stadtfeld, and Robins 2022).

Here we aim to contribute methodologically to the analysis of such dynamic flows. We frame our model within a migration context. Indeed, we developed our model with a motivation to analyze real-world migration data. Nevertheless, our model can be applied to other situations that similarly involve dynamic discrete flows across a finite number of units. One of the most widely used methods of analyzing migration flows is the gravity model (Barthélemy 2011; Expert et al. 2011; Karemera, Oguledo, and Davis 2000). In this model, the (logarithm of the) number of people migrating from $i$ to $j$ is modeled as a function of the characteristics of $i$ and $j$ and the distance between $i$ and $j$ . However, this approach does not account for the fact that when a person migrates from $i$ to $j$ , they mechanistically cannot migrate to another destination $k$ . Likewise, the characteristics of $k$ may affect migration from $i$ to $j$ . For example, $k$ can be a competitor of $j$ in receiving migration from $i$ . This dependence in migration probabilities across destinations is ignored in the gravity model, whereas our model takes it into account.

In this study, we propose a Dirichlet-multinomial model, which is an extension of a class of econometric discrete-choice models (Alamá-Sabater, Alguacil, and Bernat-Martí 2017; Guimaraes and Lindrooth 2007). Our model differs from existing choice models in the treatment of nonmigration. The existing models base their estimates conditional on migration taking place. We believe any inference about the causes of migration would be problematic if one ignores those who choose to not move: that is, observations for which the causes of migration might have been present but did not generate migration, or nonmigration counts. Ignoring nonmigration results in “selection on the dependent variable.” In our model, the probabilities of migration to the available destinations as well as the probability of not migrating are directly modeled as functions of the covariates. We further extend our model with random intercepts for provinces to account for the longitudinal nature of migration flows. We fit our model within the Bayesian framework via a Markov chain Monte Carlo (MCMC) algorithm.

A long array of models, developed mostly in the social networks literature, can also be applied to dynamic flows. These include stochastic actor-oriented models (Snijders 1996, 2001), dynamic network actor and other similar relational event models (Block, Stadtfeld, and Snijders 2019; Stadtfeld and Block 2017; Stadtfeld, Hollway, and Block 2017b), exponential random graph models (Lusher, Koskinen, and Robins 2013) and their various extensions and forms (Almquist and Butts 2014; Block et al. 2022; Desmarais and Cranmer 2012; Krivitsky and Butts 2017; Krivitsky and Handcock 2014; Westveld and Hoff 2011), and latent space and latent factor models (Hoff, Raftery, and Handcock 2002; Minhas, Hoff, and Ward 2019). Some of these models are theoretically flexible and very general but may be difficult to adapt to the specific case; others are computationally demanding. The application of some requires significant programming. Nevertheless, these models offer a powerful way of dealing with relational data. We believe our approach provides a relatively straightforward and natural way to deal with dynamic flows while also offering computational simplicity and taking into account key features of the data. We discuss similarities and differences between our model and these network models in the “Related Methodology” section. We contribute to the methodological literature by enriching the methodological arsenal for dealing with dynamic flows.

This article also contributes to the migration literature. Prior work has identified various determinants of voluntary migration (Boyle, Halfacree, and Robinson 1998). These include characteristics of the origin and the destination, such as economic prospects (Borjas 1999; Levy 2010) and immigration legislation (Palmer and Pytlikova 2015), and characteristics of the origin-destination pair, such as linguistic, cultural, and physical distance (Expert et al. 2011; Levy 2010; Windzio 2018) between the origin and the destination. Migration can be international as well as intranational. Yet the migration literature is based mostly on international migration, and hence the dynamics of internal migration, particularly in non-Western countries, are understudied (Bell and Muhidin 2009; Kuhn 2015). The focus on international migration is understandable, given that it is highly consequential in shaping public opinion and politics (Chan et al. 2020; Gimpel and Schuknecht 2001). The scarcity of research on internal migration, however, is surprising. Internal migration may not be as economically advantageous for the migrant or as topical as international migration. However, it is far less costly and risky. Consequently, compared to international migration, much larger shares of populations are affected by internal migration, particularly in non-Western countries (Kuhn 2015). For example, according to the official statistics, in 2018, a total of 323,918 people emigrated abroad from Turkey, our study context, yet 3,057,606, nearly 10 times more, moved from one Turkish province to another. Internal migration is thus a strong determinant of local population structures, and it is important to understand its drivers.

We contribute to the migration literature by addressing this gap. Using a unique data set compiled from administrative, registry, and survey data, we analyze the dynamics of more than 25 million migration instances between the 81 provinces of Turkey from 2009 to 2018. Almost all empirical work on internal migration in Turkey, however scarce, has been conducted with data extending only up to 2000 (Filiztekin and Gökhan 2008; Gedik 1997; Yazgi et al. 2014). The more recent work is descriptive and focuses on larger interregional migration (Akın and Dökmeci 2015). Over this period, Turkey went through a large economic, demographic, and political transformation (Aksoy and Billari 2018), which has affected migration (Çoban 2013). We thus do not know if these earlier insights into Turkey’s internal migration will apply to more recent migration patterns. Internal migration in Turkey is relevant for Europe, too. Turkey’s accession to the European Union (EU) has stalled. One of the fears blocking expansion of the EU is large-scale immigration (Strasser 2008). Understanding the determinants of interprovince migration in Turkey will help predict the extent of potential migration from Turkey to Europe and its likely destinations within the EU, should Turkey ever be part of the EU (Filiztekin and Gökhan 2008).

A Model of Migration Flows

We now describe our model in a migration context. Note that our model can be applied to many other settings with minimal adjustment; for example, provinces can be replaced by individuals or institutions, and migration can be replaced by the number of financial transfers or interactions. In fact, in many of these alternative settings, one could directly proceed with modeling the $H_{t} (i, j)$s in the following. The migration case needs a slight adjustment, as we will describe, because the stock of people who could migrate but did not is dynamic due to changes in births and deaths.

Assume we have $n > 1$ provinces and $T > 1$ consecutive years, and for each year we have data on the population of and migration between the provinces. Specifically, we observe

P_{t} (i) = the population of province i at the beginning of year t

for all $i = 1, \dots, n$ and $t = 1, \dots, T$ ; and H_t(i,j) = the number of people who migrated from province i to province j in year t for all $1 \leq i, j \leq n$ pairs with $i \neq j$ , respectively. Let us define

E_{t, t + 1} (i) : = P_{t + 1} (i) - [P_{t} (i) - \sum_{j \neq i} H_{t} (i, j) + \sum_{j \neq i} H_{t} (j, i)], i = 1, \dots, n,

which is the net change in population for province $i$ after taking into account interprovince migration during year $t$ . The net change captures births, deaths, and people who are registered in the system for the first time for any other reason. Taking the net change into account, we define the adjusted population of province $i$ for year $t$ as

X_{t} (i) : = P_{t} (i) + E_{t, t + 1} (i) = P_{t + 1} (i) + \sum_{j \neq i} H_{t} (i, j) - \sum_{j \neq i} H_{t} (j, i), i = 1, \dots, n .

We consider this adjusted population to find the number of people who did not migrate from province $i$ during year $t$ as

H_{t} (i, i) = X_{t} (i) - \sum_{j \neq i} H_{t} (i, j), i = 1, \dots, n .

Therefore, given the assumptions, we can summarize the relation between the quantities $H_{t} (i, j)$s and $X_{t} (i)$s as

X_{t} (i) = \sum_{j = 1}^{n} H_{t} (i, j), i = 1, \dots, n .

(1)
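The bookkeeping that leads to equation (1) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `adjusted_populations` and the three-province toy numbers are hypothetical and are not taken from the study's data.

```python
def adjusted_populations(pop_next, flows):
    """Return (X, H_full) for a single year t.

    pop_next[i] : P_{t+1}(i), population of province i at the start of year t+1
    flows[i][j] : H_t(i, j) for i != j (diagonal entries are ignored)
    """
    n = len(pop_next)
    out_mig = [sum(flows[i][j] for j in range(n) if j != i) for i in range(n)]
    in_mig = [sum(flows[j][i] for j in range(n) if j != i) for i in range(n)]
    # X_t(i) = P_{t+1}(i) + out-migration from i - in-migration to i
    X = [pop_next[i] + out_mig[i] - in_mig[i] for i in range(n)]
    # H_t(i, i) = X_t(i) - out-migration from i: the people who stayed in i
    H_full = [row[:] for row in flows]
    for i in range(n):
        H_full[i][i] = X[i] - out_mig[i]
    return X, H_full


# Hypothetical toy numbers for three provinces (not from the paper's data).
pop_next = [1000, 500, 300]
flows = [[0, 20, 10],
         [5, 0, 15],
         [8, 2, 0]]
X, H_full = adjusted_populations(pop_next, flows)
print(X)       # adjusted populations X_t(i)
print(H_full)  # flow matrix with stayers on the diagonal
```

By construction, each row of `H_full` sums to the corresponding entry of `X`, which is the relation stated in equation (1).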

As a prelude to our proposed model, let us assume for now the existence of a probability of migrating from province $i$ to province $j$ in year $t$ , denoted by $ρ_{t} (i, j)$ , and that $ρ_{t} (i, j)$ is the same for all citizens in province $i$ . This yields a multinomial distribution for $(H_{t} (i, 1), \dots, H_{t} (i, n))$ for each $t$ and $i$ , specified as

(H_{t} (i, 1), \dots, H_{t} (i, n)) ~ Multinomial (X_{t} (i); ρ_{t} (i, 1), \dots, ρ_{t} (i, n)) .

A multinomial regression can be constructed by modeling $ρ_{t} (i, j)$ as a function of the factors about provinces $i$ and $j$ individually as well as the factors about the relation between $i$ and $j$ . Generically, we assume there are $d_{u} \geq 1$ and $d_{v} \geq 1$ factors that affect the likelihood of migrating from and to a province, respectively. These factors can vary across the years and provinces, hence they are denoted by $u_{t} (i) \in R^{d_{u}}$ and $v_{t} (i) \in R^{d_{v}}$ , respectively. Note that some factors can both push and pull migration (e.g., [un]employment) and hence appear as both $u_{t} (i)$ and $v_{t} (i)$ . Furthermore, for each $(i, j)$ pair, we also have $d_{z} \geq 1$ factors of sort $z_{t} (i, j) \in R^{d_{z}}$ , defined as pair-level factors that may affect migration from $i$ to $j$ . These factors determine the probability $ρ_{t} (i, j)$ of migrating from $i$ to $j$ in year $t$ as

ρ_{t} (i, j) = \frac{\exp \{θ_{0} + θ_{1} \cdot u_{t} (i) + θ_{2} \cdot v_{t} (j) + θ_{3} \cdot z_{t} (i, j)\}}{1 + \sum_{j' \neq i} \exp \{θ_{0} + θ_{1} \cdot u_{t} (i) + θ_{2} \cdot v_{t} (j') + θ_{3} \cdot z_{t} (i, j')\}} .

Here, $(θ_{0}, θ_{1}, θ_{2}, θ_{3})$ is the vector of model parameters. The components $θ_{1} \in R^{d_{u}}$ , $θ_{2} \in R^{d_{v}}$ , and $θ_{3} \in R^{d_{z}}$ correspond to the sending (→), receiving (←), and joint (pair-level) factors (↔), respectively. The scalar component $θ_{0} \in R$ is the “base” parameter, which solely determines the probability of migration when all factors are zero. The expression applies to destinations $j \neq i$ ; the remaining probability mass, which equals one over the same denominator, is the probability $ρ_{t} (i, i)$ of not migrating. Finally, we use exponentiation to ensure we have positive terms. The resulting model is an instance of the multinomial logistic regression model (Theil 1969).
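As a rough illustration of how these probabilities could be evaluated for one origin province, consider the following Python sketch. The function names, the single covariate per factor type, and all numerical values are hypothetical choices made for the example; staying is treated as the reference category with weight 1, matching the “1” in the denominator.

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def migration_probs(i, u, v, z, theta0, theta1, theta2, theta3):
    """Return [rho_t(i, 0), ..., rho_t(i, n-1)]; entry i is the probability of staying."""
    n = len(v)
    weights = []
    for j in range(n):
        if j == i:
            weights.append(1.0)  # reference category: not migrating
        else:
            eta = (theta0 + dot(theta1, u[i]) + dot(theta2, v[j])
                   + dot(theta3, z[i][j]))
            weights.append(math.exp(eta))
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical covariates for n = 3 provinces, one covariate of each type.
u = [[0.1], [0.2], [0.3]]             # sending factors u_t(i)
v = [[1.0], [0.5], [0.0]]             # receiving factors v_t(j)
z = [[[0.0], [1.0], [2.0]],
     [[1.0], [0.0], [1.5]],
     [[2.0], [1.5], [0.0]]]           # pair-level factors z_t(i, j), e.g., distance
rho = migration_probs(0, u, v, z, theta0=-4.0, theta1=[0.5],
                      theta2=[1.0], theta3=[-0.2])
print(rho)
```

With a strongly negative base parameter, as in this toy setting, most of the probability mass falls on not migrating.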

A limitation of the aforementioned expression may be the deterministic relation between the factors and the migration probability $ρ_{t} (i, j)$ . We would like to allow for variability in those probabilities such that even under the same factor values, the probabilities $ρ_{t} (i, j)$ s may differ across province-year pairs, for example, due to unsystematic residual factors that our model may fail to cover (Nelson 1984). This can be done by modeling the probabilities using a Dirichlet distribution for each of the province-year pairs. Specifically, we let $ρ_{t} (i, j)$ s be random variables themselves, with $(ρ_{t} (i, 1), \dots, ρ_{t} (i, n))$ having a Dirichlet distribution independently for each $t$ and $i$ , that is,

(ρ_{t} (i, 1), \dots, ρ_{t} (i, n)) ~ Dirichlet (α_{t} (i, 1), \dots, α_{t} (i, n)) .

This time, it is the parameters of the Dirichlet distribution that we model via regression. Specifically,

α_{t} (i, j) = \begin{cases} \exp \{θ_{0} + θ_{1} \cdot u_{t} (i) + θ_{2} \cdot v_{t} (j) + θ_{3} \cdot z_{t} (i, j) + θ_{4}\}, & i \neq j, \\ \exp \{θ_{4}\}, & i = j . \end{cases}

(2)

The resulting model is a Dirichlet-multinomial regression. Observing the expected probabilities

E (ρ_{t} (i, j)) = \frac{α_{t} (i, j)}{\sum_{j' = 1}^{n} α_{t} (i, j')},

we note that in this model, the vector $(θ_{0}, θ_{1}, θ_{2}, θ_{3})$ lends itself to a similar interpretation as in the multinomial logistic regression model introduced earlier. The extra scalar parameter $θ_{4} \in R$ is present in each $α_{t} (i, j)$ and hence does not affect the expected probabilities. The parameter $θ_{4}$ instead captures the variances of the probabilities: the larger $θ_{4}$ , the smaller the variances.
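The role of $θ_{4}$ can be checked numerically with the standard Dirichlet moment formulas, $E (ρ_{j}) = α_{j} / α_{0}$ and $Var (ρ_{j}) = E (ρ_{j}) (1 - E (ρ_{j})) / (α_{0} + 1)$ with $α_{0} = \sum_{j} α_{j}$ . The sketch below is illustrative only; the linear predictor values in `eta` are hypothetical, with 0 for the “stay” category.

```python
import math


def dirichlet_moments(eta, theta4):
    """Means and variances of Dirichlet probabilities with alpha_j = exp(eta_j + theta4)."""
    alpha = [math.exp(e + theta4) for e in eta]
    a0 = sum(alpha)
    mean = [a / a0 for a in alpha]
    var = [m * (1.0 - m) / (a0 + 1.0) for m in mean]
    return mean, var


eta = [0.0, -3.0, -4.0]  # hypothetical: stay, destination A, destination B
mean_lo, var_lo = dirichlet_moments(eta, theta4=0.0)
mean_hi, var_hi = dirichlet_moments(eta, theta4=5.0)
print(mean_lo, mean_hi)  # identical up to rounding
print(var_lo, var_hi)    # variances shrink as theta4 grows
```

The expected probabilities coincide for both values of $θ_{4}$ , whereas every variance is smaller under the larger value.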

Due to its flexibility, we proceed with the Dirichlet-multinomial regression model described previously. From an inference perspective, the Dirichlet-multinomial specification does not introduce additional complications compared with the multinomial logistic specification. After integrating out $ρ_{t} (i, j)$ s, $H_{t} (i, j)$ s follow a Dirichlet-multinomial distribution, which can easily be computed:

(H_{t} (i, 1), \dots, H_{t} (i, n)) ~ Dirichlet-multinomial (X_{t} (i); α_{t} (i, 1), \dots, α_{t} (i, n)) .

(3)

Figure 1 (left) depicts the Dirichlet-multinomial model for a single province and single time step. According to the Dirichlet-multinomial model, people’s decisions within a province are dependent even when conditioned on $θ$ and the covariates. This is because, given $θ$ and the covariates, the same random migration probability vector $ρ_{t} (i, :)$ , drawn from a Dirichlet distribution, applies to all individuals in the same province, introducing a positive dependency among their choices. This is in contrast to the simpler multinomial model (Figure 1, the model on the right), where the decisions are independent given $θ$ .

Figure 1.

Dirichlet-multinomial model (left) and multinomial model (right).

It is possible to quantify the dependency in the Dirichlet-multinomial model in terms of closed-form conditional distributions. For example, given that individual $I_{1}$ migrated to province $j$ from province $i$ during time period $t$ , the probability distribution of the decision of individual $I_{2}$ follows a categorical distribution with

\Pr (I_{2} \text{ migrates } i \to k \mid I_{1} \text{ migrates } i \to j) = \begin{cases} \frac{α_{t} (i, j) + 1}{1 + \sum_{j' = 1}^{n} α_{t} (i, j')}, & \text{if } k = j, \\ \frac{α_{t} (i, k)}{1 + \sum_{j' = 1}^{n} α_{t} (i, j')}, & \text{if } k \neq j, \end{cases} \quad k = 1, \dots, n .

As another example, given that among $P > 0$ individuals the migration counts to the $n$ provinces are $A_{1 : n} = a_{1 : n}$ (so $a_{1} + \dots + a_{n} = P$ ), the migration counts $B_{1 : n}$ of $R > 0$ other individuals in the same province are jointly distributed as

B_{1 : n} | (R, A_{1 : n} = a_{1 : n}) ~ Dirichlet-multinomial (R; α_{t} (i, 1) + a_{1}, \dots, α_{t} (i, n) + a_{n}) .

Both examples indicate positive dependency among individuals’ decisions in the same province at the same time period.

The presence of positive dependency between the decisions may be considered another positive feature of the Dirichlet-multinomial model because it captures the possibility that individuals living in the same province may be influenced by each other in deciding to migrate (or not) to a particular destination, for example, through peer effects or familial migration decisions. Furthermore, the strength of this dependency is determined by $θ_{4}$ : It decreases as $θ_{4}$ increases. Hence, we can gauge the amount of within-province interpersonal influence by estimating $θ_{4}$ using the data.
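The size of this dependency can be made concrete by comparing the conditional probability that a second individual chooses destination $j$ , given that a first individual did, with the corresponding marginal probability $α_{t} (i, j) / \sum_{j'} α_{t} (i, j')$ . The following sketch uses hypothetical $α$ values; multiplying all of them by $e^{3}$ mimics an increase of 3 in $θ_{4}$ .

```python
import math


def second_given_first(alpha, j):
    """Choice probabilities of a second individual, given the first chose category j."""
    a0 = sum(alpha)
    return [(a + (1.0 if k == j else 0.0)) / (a0 + 1.0) for k, a in enumerate(alpha)]


def lift(alpha, j):
    """Conditional minus marginal probability of choosing j."""
    return second_given_first(alpha, j)[j] - alpha[j] / sum(alpha)


alpha_small = [50.0, 2.0, 1.0]                          # hypothetical alpha_t(i, :)
alpha_large = [a * math.exp(3.0) for a in alpha_small]  # theta4 increased by 3

cond = second_given_first(alpha_small, 1)
print(cond)                  # a valid probability vector, tilted toward category 1
print(lift(alpha_small, 1))  # positive: a peer's choice raises the probability
print(lift(alpha_large, 1))  # still positive but smaller under the larger theta4
```

The lift is positive in both cases and shrinks as $θ_{4}$ increases, in line with the statement that the dependency weakens for larger $θ_{4}$ .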

Hierarchical Specification

Our data are multilevel, that is, migration from $i$ to $j$ for all $i$ and $j$ are observed over $T > 1$ years. Accordingly, we expand the specification in equation (2) to address this multilevel structure by assuming the base parameter $θ_{0}$ is a random variable defined at the province level (i.e., a random intercept). More specifically, we modify equation (2) as

α_{t} (i, j) = \begin{cases} \exp \{θ_{0} (i) + θ_{1} \cdot u_{t} (i) + θ_{2} \cdot v_{t} (j) + θ_{3} \cdot z_{t} (i, j) + θ_{4}\}, & i \neq j, \\ \exp \{θ_{4}\}, & i = j . \end{cases}

(4)

Next, in line with our aim of Bayesian inference, we specify the prior distributions for the model parameters introduced so far. The parameters $θ_{1}$ , $θ_{2}$ , $θ_{3}$ , and $θ_{4}$ are assumed a priori independent and have priors $θ_{j} ~ η_{j}$ , $j = 1, \dots, 4$ . Furthermore, independently from $θ_{1}$ , $θ_{2}$ , $θ_{3}$ , and $θ_{4}$ , the base parameters $θ_{0} (i)$ , $i = 1, \dots, n$ , are assumed independent with a common normal distribution

θ_{0} (i) \overset{\text{i.i.d.}}{\sim} N (μ_{0}, σ_{0}^{2}), i = 1, \dots, n .

(5)

The parameters $μ_{0}$ and $σ_{0}^{2}$ are themselves treated as random variables, which are independent with $μ_{0} ~ N (μ_{h}, σ_{h}^{2})$ and $σ_{0}^{2} ~ IG (a_{h}, b_{h})$ , where the latter is the inverse-Gamma distribution with shape and scale parameters $a_{h}$ and $b_{h}$ , respectively. We assume the hyperparameters $μ_{h}$ , $σ_{h}^{2}$ , $a_{h}$ , and $b_{h}$ are known.

The full parameter vector of the resulting hierarchical model is the $(1 + 1 + n + d_{u} + d_{v} + d_{z} + 1) \times 1$ vector

θ = (μ_{0}, σ_{0}^{2}, θ_{0} (1), \dots, θ_{0} (n), θ_{1}, θ_{2}, θ_{3}, θ_{4}) .

This completes the description of our model of migration in a closed system. We refer to our model whose Dirichlet-multinomial parameters are given by equation (4) as a hierarchical specification because it models the base parameter $θ_{0} (i)$ as random across provinces.

One could expand the aforementioned model by making $θ_{0} (i)$ , the base parameter, a polynomial function of time, of order $d_{b} \geq 0$ . In that case, the base parameter could be expressed as

θ_{0} (i) = \sum_{k = 0}^{d_{b}} θ_{0, k} (i) t^{k},

where the polynomial coefficients for province $i$ would be modeled as random with

ϑ_{i} : = (θ_{0, 0} (i), \dots, θ_{0, d_{b}} (i)) \overset{\text{i.i.d.}}{\sim} N (μ_{0}, Σ_{0}), i = 1, \dots, n,

and $μ_{0}$ , $Σ_{0}$ would be multivariate random variables with suitable priors, such as $μ_{0} ~ N (μ_{h}, Σ_{h})$ and $Σ_{0} ~ W^{- 1} (ν_{h}, Ψ_{h})$ , the inverse-Wishart distribution with $ν_{h}$ degrees of freedom and the scale matrix $Ψ_{h}$ . With $d_{b} = 1$ , for example, we would have a model with “random slopes” of time.

In our implementations, we will not present results with this polynomial specification. Our results will be restricted to the “random intercept” specification where the base parameters are random across provinces, as in equation (5). This is mainly because province log populations, which are included in the model as predictors, are increasing almost as a linear function of time, rendering the estimation of a separate time slope unnecessary and numerically difficult. We did fit specifications with linear as well as quadratic random time slopes, which did not make noticeable differences in our models, apart from creating convergence problems. That being said, the polynomial specification for the base parameter may be useful in other applications; hence, we will describe our inference method in general and consider the polynomial specification.

Inference

We now turn to our inference method. To quantify the uncertainties in the estimation of $θ$ conveniently, we aim for Bayesian inference of $θ$ given the data

D = \{H_{t} (i, j) ; i, j \in \{1, \dots, n\}, t \in \{1, \dots, T\}\} .

Note that the (adjusted) province populations ${X_{t} (i) : t = 1, \dots, T; i = 1, \dots, n}$ are implicitly contained in $D$ , thanks to the relation given in equation (1). More specifically, we aim to find the posterior distribution for the hierarchical model

p (θ | D) \propto p (θ) p (D | θ) .

Here, $θ = (μ_{0}, Σ_{0}, ϑ_{1}, \dots, ϑ_{n}, θ_{1}, θ_{2}, θ_{3}, θ_{4})$ is the parameter vector with the hierarchical specification for the polynomial coefficients of the province-dependent base parameters; $p (θ)$ is the probability density of the prior distribution given by

p (θ) = [Π_{j = 1}^{4} η_{j} (θ_{j})] [Π_{i = 1}^{n} N (ϑ_{i}; μ_{0}, Σ_{0})] N (μ_{0}; μ_{h}, Σ_{h}) W^{- 1} (Σ_{0}; ν_{h}, Ψ_{h});

and $p (D | θ)$ is the likelihood given by a product of $Tn$ Dirichlet-multinomial probabilities as

p (D | θ) = Π_{t = 1}^{T} Π_{i = 1}^{n} p (H_{t} (i, :) | α_{t}^{θ} (i, :)),

(6)

where the superscript $θ$ over $α_{t}^{θ} (i, :)$ is used from now on to indicate the dependency on $θ$ explicitly and $p (H_{t} (i, :) | α_{t}^{θ} (i, :))$ is the Dirichlet-multinomial probability

p (H_{t} (i, :) | α_{t}^{θ} (i, :)) = \frac{X_{t} (i)! Γ (\sum_{j = 1}^{n} α_{t}^{θ} (i, j))}{Γ (X_{t} (i) + \sum_{j = 1}^{n} α_{t}^{θ} (i, j))} Π_{j = 1}^{n} \frac{Γ (H_{t} (i, j) + α_{t}^{θ} (i, j))}{H_{t} (i, j)! Γ (α_{t}^{θ} (i, j))},

where $Γ (\cdot)$ is the gamma function.
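Because migration counts are large, the Dirichlet-multinomial probability is best evaluated on the log scale via the log-gamma function. The following is a minimal sketch of such an evaluation using only the Python standard library; the function name `dirmult_logpmf` and the test values are hypothetical.

```python
import math
from itertools import product


def dirmult_logpmf(h, alpha):
    """Log Dirichlet-multinomial probability of the count vector h given alpha."""
    x = sum(h)       # X_t(i): total number of individuals
    a0 = sum(alpha)  # sum of the Dirichlet parameters
    out = math.lgamma(x + 1) + math.lgamma(a0) - math.lgamma(x + a0)
    for hj, aj in zip(h, alpha):
        out += math.lgamma(hj + aj) - math.lgamma(hj + 1) - math.lgamma(aj)
    return out


# Sanity check 1: with two categories and alpha = (1, 1), every split of
# X = 4 individuals is equally likely, so the probability is 1 / 5.
print(math.exp(dirmult_logpmf([1, 3], [1.0, 1.0])))

# Sanity check 2: probabilities over all count vectors with total 3 sum to one.
alpha = [0.5, 1.2, 2.0]
total = sum(math.exp(dirmult_logpmf(list(h), alpha))
            for h in product(range(4), repeat=3) if sum(h) == 3)
print(total)
```

The full log-likelihood in equation (6) would then be the sum of such terms over all province-year pairs.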

Because the posterior distribution is analytically intractable, we develop an MCMC algorithm (see e.g., Tierney 1994) to sample from the posterior distribution. The developed MCMC algorithm is an instance of Metropolis-Hastings-within-Gibbs. The parameter vector $θ$ is divided into disjoint blocks of components, and these blocks are updated in turn by either a Gibbs move (if sampling from the full conditional distribution of the block is possible) or a Metropolis-Hastings move (otherwise). We describe this process in detail in algorithm 1, where the aforementioned blocks are taken as

{θ_{1}}; {θ_{2}}; {θ_{3}}; {θ_{4}}; {ϑ_{1}, \dots, ϑ_{n}}; {μ_{0}}; {Σ_{0}} .

We note some issues concerning the computational complexity of algorithm 1. While updating each of ${θ_{1}}; {θ_{2}}; {θ_{3}}; {θ_{4}}$ , a product of $nT$ Dirichlet-multinomial distributions, each with $n$ categories, is computed in the numerator and the denominator of the acceptance ratio. Given the rest of the components and the observations, the variables $ϑ_{1}, \dots, ϑ_{n}$ are conditionally independent; they can thus be updated in parallel, each requiring the computation of $T$ Dirichlet-multinomial distributions, each with $n$ categories, in the numerator and the denominator of the acceptance ratio. Finally, the blocks ${μ_{0}}$ and ${Σ_{0}}$ can be updated in turn with Gibbs moves because their full conditional distributions have closed forms, thanks to conjugacy.
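For the random-intercept case ( $d_{b} = 0$ ), the conjugate updates reduce to scalar draws under the priors $μ_{0} ~ N (μ_{h}, σ_{h}^{2})$ and $σ_{0}^{2} ~ IG (a_{h}, b_{h})$ . The sketch below illustrates these two Gibbs moves under that assumption; it relies on the standard normal-normal and normal-inverse-Gamma conjugacy results, and the function name, intercept values, and hyperparameter values are hypothetical.

```python
import math
import random


def gibbs_hyper_update(theta0, sigma0_sq, mu_h, sigma_h_sq, a_h, b_h, rng):
    """One Gibbs sweep over (mu_0, sigma_0^2) given the province intercepts theta0."""
    n = len(theta0)
    # mu_0 | rest ~ N(mu_post, var_post)
    var_post = 1.0 / (1.0 / sigma_h_sq + n / sigma0_sq)
    mu_post = var_post * (mu_h / sigma_h_sq + sum(theta0) / sigma0_sq)
    mu0 = rng.gauss(mu_post, math.sqrt(var_post))
    # sigma_0^2 | rest ~ IG(a_h + n/2, b_h + 0.5 * sum_i (theta0_i - mu_0)^2)
    shape = a_h + n / 2.0
    scale = b_h + 0.5 * sum((t - mu0) ** 2 for t in theta0)
    # If Y ~ Gamma(shape, rate=scale), then 1/Y ~ IG(shape, scale);
    # random.gammavariate takes a scale argument, hence 1/scale.
    sigma0_sq_new = 1.0 / rng.gammavariate(shape, 1.0 / scale)
    return mu0, sigma0_sq_new


theta0 = [-5.0, -4.5, -5.5, -4.8, -5.2]  # hypothetical current intercepts
rng = random.Random(0)
sigma0_sq, draws = 1.0, []
for _ in range(2000):
    mu0, sigma0_sq = gibbs_hyper_update(theta0, sigma0_sq, mu_h=0.0,
                                        sigma_h_sq=100.0, a_h=2.0, b_h=1.0, rng=rng)
    draws.append((mu0, sigma0_sq))
print(sum(d[0] for d in draws) / len(draws))  # close to the mean of theta0
```

In the full sampler, the intercepts would themselves change between sweeps; they are held fixed here only to isolate the conjugate step.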

Algorithm 1: Metropolis-Hastings-within-Gibbs for the Dirichlet-multinomial model with the hierarchical specification for its province-dependent base parameters

Input: Migration counts $H_{t}$ and external factors $u_{t}$ , $v_{t}$ , $z_{t}$ for $t = 1, \dots, T$ ; hyperparameters $μ_{h}$ , $Σ_{h}$ , $ν_{h}$ , $Ψ_{h}$ ; prior distributions $η_{j}$ for $j = 1, \dots, 4$ ; proposal distributions $q_{j}$ for $j = 1, \dots, 4$ ; initial value $θ^{(0)}$

Output: Samples $θ^{(0)}, \dots, θ^{(K)}$

1 Start with $θ^{(0)}$ .
2 for $k = 1, 2, \dots$ do
3     Set $θ = θ^{(k - 1)}$ .
4     for $j = 1, 2, 3, 4$ do
5         Propose $θ'_{j} ~ q_{j} (θ'_{j} | θ_{j})$ .
6         Construct the proposed parameter vector $θ'$ by replacing $θ_{j}$ by $θ'_{j}$ in $θ$ .
7         Update $θ$ by replacing $θ_{j}$ by $θ'_{j}$ with acceptance probability
          $\min \{1, \frac{q_{j} (θ_{j} | θ'_{j}) η_{j} (θ'_{j}) Π_{t = 1}^{T} Π_{i = 1}^{n} p (H_{t} (i, :) | α_{t}^{θ'} (i, :))}{q_{j} (θ'_{j} | θ_{j}) η_{j} (θ_{j}) Π_{t = 1}^{T} Π_{i = 1}^{n} p (H_{t} (i, :) | α_{t}^{θ} (i, :))}\}$ ;
          otherwise, keep $θ_{j}$ as before.
8     for $i = 1, \dots, n$ do
9         Propose $ϑ'_{i} ~ q_{0} (ϑ'_{i} | ϑ_{i})$ .
10        Construct $θ'$ by replacing $ϑ_{i}$ by $ϑ'_{i}$ in $θ$ .
11        Update $θ$ by replacing $ϑ_{i}$ by $ϑ'_{i}$ with acceptance probability
          $\min \{1, \frac{q_{0} (ϑ_{i} | ϑ'_{i}) N (ϑ'_{i}; μ_{0}, Σ_{0}) Π_{t = 1}^{T} p (H_{t} (i, :) | α_{t}^{θ'} (i, :))}{q_{0} (ϑ'_{i} | ϑ_{i}) N (ϑ_{i}; μ_{0}, Σ_{0}) Π_{t = 1}^{T} p (H_{t} (i, :) | α_{t}^{θ} (i, :))}\}$ ;
          otherwise, keep $ϑ_{i}$ as before.
12    Sample $μ_{0} ~ N (μ_{post}, Σ_{post})$ , where $Σ_{post} = (Σ_{h}^{- 1} + n Σ_{0}^{- 1})^{- 1}$ and $μ_{post} = Σ_{post} (Σ_{h}^{- 1} μ_{h} + Σ_{0}^{- 1} \sum_{i = 1}^{n} ϑ_{i})$ .
13    Sample $Σ_{0} ~ W^{- 1} (ν_{h} + n, Ψ_{h} + \sum_{i = 1}^{n} (ϑ_{i} - μ_{0}) (ϑ_{i} - μ_{0})^{T})$ .
14    Store the new sample $θ^{(k)} = (μ_{0}, Σ_{0}, ϑ_{1}, \dots, ϑ_{n}, θ_{1}, θ_{2}, θ_{3}, θ_{4})$ .
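The Metropolis-Hastings moves in steps 4 to 7 can be illustrated with a deliberately simplified sketch that updates a single scalar block, here $θ_{4}$ , while all other quantities are held fixed. This is not the authors' implementation: the Gaussian random-walk proposal (which is symmetric, so the $q$ terms cancel), the $N (0, 10^{2})$ prior, the toy count rows, and the fixed linear predictors are all assumptions made for the example.

```python
import math
import random


def dirmult_logpmf(h, alpha):
    """Log Dirichlet-multinomial probability of the count vector h."""
    x, a0 = sum(h), sum(alpha)
    out = math.lgamma(x + 1) + math.lgamma(a0) - math.lgamma(x + a0)
    for hj, aj in zip(h, alpha):
        out += math.lgamma(hj + aj) - math.lgamma(hj + 1) - math.lgamma(aj)
    return out


def log_target(theta4, rows, eta, prior_sd):
    """Log prior plus log likelihood (up to a constant) as a function of theta4."""
    alpha = [math.exp(e + theta4) for e in eta]
    loglik = sum(dirmult_logpmf(h, alpha) for h in rows)
    return loglik - 0.5 * (theta4 / prior_sd) ** 2  # assumed N(0, prior_sd^2) prior


def mh_update(theta4, rows, eta, prior_sd, step, rng):
    """One random-walk Metropolis-Hastings move; returns (value, accepted)."""
    proposal = theta4 + rng.gauss(0.0, step)  # symmetric proposal: q terms cancel
    log_ratio = (log_target(proposal, rows, eta, prior_sd)
                 - log_target(theta4, rows, eta, prior_sd))
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        return proposal, True
    return theta4, False


# Hypothetical toy counts: each row is (stayers, movers to A, movers to B).
rows = [[90, 6, 4], [85, 10, 5], [95, 3, 2], [88, 7, 5]]
eta = [0.0, -2.5, -3.0]  # fixed linear predictors; 0 for the "stay" category
rng = random.Random(1)
chain, n_accept = [0.0], 0
for _ in range(2000):
    value, accepted = mh_update(chain[-1], rows, eta, prior_sd=10.0, step=0.5, rng=rng)
    chain.append(value)
    n_accept += accepted
print(n_accept / 2000)  # empirical acceptance rate
```

In the full algorithm, the same accept-or-reject logic would be applied in turn to each of the blocks, with the Gibbs draws for $μ_{0}$ and $Σ_{0}$ closing every iteration.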

Related Methodology

Our model’s closest relative is the one proposed by Guimaraes and Lindrooth (2007). Guimaraes and Lindrooth (2007) developed a Dirichlet-multinomial regression as an extension of the multinomial logistic regression within a discrete-choice random utility framework. They used this to model counts from different groups to multiple destinations (e.g., hospital choice of different groups of patients), where for each group, multinomial probabilities for the destinations are modeled independently with a Dirichlet distribution. Note that our model can also be interpreted from a random utility framework. Consider, for example, the migration decisions of actors at province $i$ at time $t$ . The quantity $α_{t} (i, j)$ given in equation (2) can be interpreted as the deterministic part of a random utility an actor gets, had they moved from $i$ to $j$ at time $t$ . The Dirichlet-multinomial model corresponds to the assumption that each actor makes the choice that maximizes their utility, where the log-utility for the $k$ th actor at province $i$ at time $t$ had they moved to province $j$ is in the form

α_{t} (i, j) + ν_{t} (i, j) + u_{t} (i, j, k) .

Here, $ν_{t} (i, j)$ , $j = 1, \dots, n$ are the i.i.d. random utility components that affect identically all individuals at province $i$ at time $t$ , and, independently from those, $u_{t} (i, j, k)$ , $j = 1, \dots, n$ , $k = 1, \dots, X_{t} (i)$ are the i.i.d. random utility components at the individual level. (See Guimaraes and Lindrooth [2007] for the specific forms of the distributions of these random utility components.) In fact, the province-level random utility component $ν_{t} (i, j)$ , $j = 1, \dots, n$ can be interpreted as the source of the positive dependency between the decisions of actors within the same province, as mentioned earlier.

This Dirichlet-multinomial regression model is used by Alamá-Sabater et al. (2017) to explain migration to different parts of Spain from various regions outside of Spain. Our model and Alamá-Sabater et al.’s (2017) model have critical differences. First, Alamá-Sabater et al. (2017) aim to explain how much migration a certain factor “pulls,” hence they only focus on factors of the receiving regions (corresponding to our $v_{t} (j)$ s, or the ← factors). In this study, in contrast, we consider migration on a matrix of provinces. Consequently, our model includes factors in sending (→) and receiving (←) nodes as well as pair-level covariates (↔). A second crucial difference between our model and the model in both Guimaraes and Lindrooth (2007) and Alamá-Sabater et al. (2017) is that our model accounts for people who have not migrated, whereas the latter two omit “self-edges.” Alamá-Sabater et al. (2017) model flows of immigration to different provinces in Spain, conditional on immigration taking place. Likewise, Guimaraes and Lindrooth (2007) omit individuals who do not decide to migrate. We believe an omission of self-edges and no-choice cases would create a serious issue in our setting because the analysis would be restricted to cases for which a positive outcome is observed. A further difference between our model and those in Guimaraes and Lindrooth (2007) and Alamá-Sabater et al. (2017) is that our model is hierarchical, allowing for a random intercept. Finally, whereas Guimaraes and Lindrooth (2007) and Alamá-Sabater et al. (2017) implement maximum likelihood estimation, we use Bayesian inference to quantify the uncertainties in our estimates.

A common and notable feature of our model and those previously described is that they rely on the assumption of independence of irrelevant alternatives (IIA). This assumption implies that the ratio of the probabilities of a single actor moving from $i$ to $j$ and from $i$ to $k$ $(\frac{ρ_{t} (i, j)}{ρ_{t} (i, k)})$ depends only on the features of $i$ , $j$ , and $k$ and is independent of the features of all other destinations. Note that the IIA is about the ratio of two probabilities. The plain probability of moving to a particular destination does depend on the features of all possible destinations in our model. The IIA assumption can be violated in practice, but there are existing, however imperfect, diagnostic tests (Cheng and Long 2007). Moreover, extensions of the discrete-choice models relax this assumption (see e.g., Benson, Kumar, and Tomkins 2016). An accepted practice is to use models that rely on IIA when the alternatives are plausibly distinct and independently weighed by the decision maker (Benson et al. 2016); we believe the decision to migrate to a particular place plausibly fits these criteria.
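The distinction between the ratio of two probabilities and the plain probability can be seen in a small numerical sketch based on the multinomial logit form of the expected probabilities. The linear predictor values below are hypothetical; only the third destination's value is changed between the two scenarios.

```python
import math


def choice_probs(eta):
    """Probabilities for [stay, destination 1, destination 2, ...] given linear predictors."""
    weights = [1.0] + [math.exp(e) for e in eta]  # staying has weight 1
    total = sum(weights)
    return [w / total for w in weights]


base = choice_probs([-3.0, -3.5, -4.0])
boosted = choice_probs([-3.0, -3.5, -2.0])  # third destination becomes more attractive

ratio_base = base[1] / base[2]
ratio_boosted = boosted[1] / boosted[2]
print(ratio_base, ratio_boosted)  # the ratio for destinations 1 and 2 is unchanged
print(base[1], boosted[1])        # the plain probability of destination 1 decreases
```

The ratio for the first two destinations is unaffected by the third destination, which is the IIA property, while the probability of moving to the first destination itself falls once the third becomes more attractive.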

The most common alternative model used to analyze migration is the gravity model (Barthélemy 2011; Expert et al. 2011; Karemera et al. 2000; Poot et al. 2016). In this model, the logarithm of the number of people who moved from $i$ to $j$ at time $t$ is simply regressed on the features of $i$ and $j$ , such as population, and some measure of the distance between $i$ and $j$ . The estimation is straightforward: It can be performed within the standard maximum likelihood framework that is used for regression modeling. The disadvantages are as follows. First, the model does not incorporate systemic effects; for example, the fact that origin populations constrain the total size of out-migration is not directly incorporated in the model (Poot et al. 2016). In addition, destinations compete for migration: If an individual migrates to $j$ , they cannot migrate to $k$ . Naturally, the properties of $k$ should also affect the decision to migrate to $j$ , but these properties are typically ignored in the gravity model. This is a more binding assumption than the IIA discussed previously, which is about the ratio of two probabilities. Furthermore, nonmigration (zero cell counts or the diagonals [i.e., those who stay]) is mostly omitted in this model, conditioning the estimates on migration having taken place and creating a selection on the dependent variable, just as in the models of Guimaraes and Lindrooth (2007) and Alamá-Sabater et al. (2017). Nevertheless, this is a popular model of migration, hence we will compare the predictive power of our model with that of the gravity model.
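To make the gravity specification concrete, the following minimal sketch fits a log-linear regression by ordinary least squares, which coincides with maximum likelihood under Gaussian errors. The data are synthetic, and the variable names and coefficient values are hypothetical; this is not the gravity benchmark used in our comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # hypothetical number of origin-destination pairs

# Synthetic covariates on the log scale (illustrative values, not our data).
log_pop_origin = rng.normal(13.0, 1.0, n)
log_pop_dest = rng.normal(13.0, 1.0, n)
log_distance = rng.normal(6.0, 0.5, n)

# Synthetic log-flows generated from known coefficients plus Gaussian noise.
true_beta = np.array([-10.0, 0.8, 0.9, -1.2])
X = np.column_stack([np.ones(n), log_pop_origin, log_pop_dest, log_distance])
log_flow = X @ true_beta + rng.normal(0.0, 0.1, n)

# Least-squares fit of log-flow on intercept, log-populations, and log-distance.
beta_hat, *_ = np.linalg.lstsq(X, log_flow, rcond=None)
print(beta_hat)
```

Each pair is treated as a separate observation here, so nothing in the fit ties the flows out of one origin to that origin's population or to the attractiveness of competing destinations.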

A class of models that share the Bayesian nature of our model and also deal with migration can be found in Azose and Raftery (2015, 2018). The main aim of those models is to forecast country-level future net migration based on past migration via an autoregressive hierarchical model. Azose and Raftery (2018) improve those forecasts by developing a procedure that estimates cross-country correlations in net migration rates. These models differ from ours in that explaining the determinants of migration is not part of their purpose.

The literature on social networks, too, offers a long list of statistical models that can be applied to flows. A canonical model is the exponential random graph model (ERGM; Lusher et al. 2013), which takes the entire data as a network of nodes (e.g., provinces) and edges (e.g., migration flows). In ERGMs, the whole network is modeled as a function of node, edge, and network-level covariates. ERGM enjoys greater generality than our model; using it, one can naturally include various local or global network statistics as explanatory variables and incorporate complex contemporaneous dependencies in the data (i.e., network dependencies in a cross-sectional measurement). The standard ERGM deals with binary edges and cross-sectional settings, so it cannot be directly applied to our problem. ERGM, however, has been extended to deal with longitudinal data (Krivitsky and Handcock 2014) and with count or multilayered networks (Block et al. 2022; Krivitsky 2012; Krivitsky and Handcock 2014; Krivitsky, Koehly, and Marcum 2020) and the more general weighted edges as in generalized exponential random graph models (GERGMs; Desmarais and Cranmer 2012).

An ERGM type of model would be very flexible and general, but it can be challenging to perform inference in ERGMs (Almquist and Butts 2013, 2014), particularly in their various generalizations. The major challenge may be having to sample the whole network (typically several times) at every iteration, which can only be done approximately, for example, by using Gibbs sampling, unless the network is very small. Note that we also use a Gibbs sampler, but in our case, the inference is much simpler, as we will elaborate. Consequently, while ERGM is generalized to deal with longitudinal data and weighted edges, most applications are constrained to either longitudinal binary networks, cross-sectional networks with weighted edges, or relatively small networks. Indeed, ERGMs have been applied to the study of migration flow networks (Leal 2021; Windzio 2018; Windzio, Teney, and Lenkewitz 2021). However, likely due to those computational complexities, the authors had to simplify the flows so an ERGM can be fitted. For example, Windzio (2018) and Leal (2021) dichotomize flows, and Windzio et al. (2021) use valued edges but impose ordinal categories on migration flows and analyze the data only cross-sectionally even though they are longitudinal. Block et al. (2022) develop a weighted ERGM for mobility networks, but the application is restricted to cross-sectional observations. Abramski, Katenka, and Hutchison (2020) successfully apply the GERGM developed by Desmarais and Cranmer (2012) to study refugee migration patterns, but the setup is again simplified with only 12 countries and a cross-sectional analysis.

We also tried to fit an ERGM for count edges (Krivitsky 2012; Krivitsky et al. 2021) to our data. Unfortunately, it failed to converge even for a single year of our migration data, let alone the 10 years our data span. A further issue that needs to be tackled in an ERGM-type model is that the total flow of out-migration is bounded by the population of the origin, and this constraint may be difficult to implement in an ERGM. In our model, these “row-sums” are naturally handled through the Dirichlet-multinomial distribution. We must add that there are exciting recent developments in the ERGM literature: pseudo-likelihood parameter estimation procedures for weighted ERGMs (Huang and Butts 2021) and their applications to migration networks (Huang and Butts 2022) are developing fast. These developments can alleviate the computational and other practical constraints for count data and make ERGMs more feasible for the study of migration flows.

Another class of models developed in the social networks literature that have some similarities with our model comprise the stochastic actor-oriented model (SAOM; Snijders 2001), the dynamic network actor model (DyNAM; Stadtfeld and Block 2017), the relational event model (REM; Butts 2008), and other similar models (Almquist and Butts 2014). These models are designed to analyze longitudinal dynamic binary network data, although extensions for count data exist (Stadtfeld et al. 2017a). In the actor-based SAOM and DyNAM, two processes are modeled separately: the timing of an event (e.g., a tie to be formed) and the target of the event (e.g., whom a person sends a tie), which is modeled with a multinomial specification. In the tie-based REM, the timing and position of an event (e.g., which two nodes are connected) are modeled simultaneously. In our model, the exact timing of migration is unknown, apart from it taking place during a given year. Moreover, many migration flows take place during a given year. To apply the dynamic network models discussed here, one needs to be able to calculate the conditional distribution of the migration data of a year given the state of variables in the previous year. This is not available in our case because we do not know the times of individual migration events and integrating them out is analytically intractable. One can only approximate those conditionals, making this class of models inconvenient for us to apply in our case.

A final class of models, developed again mainly in the social networks literature, includes latent space (Hoff et al. 2002; Hoff and Ward 2004) and latent factor (Minhas et al. 2019) models. Most recently, and perhaps most relevantly, Minhas et al. (2019) described a latent factor model in which cell outcomes in a matrix (e.g., migration flows in our case) are modeled as a function of the sender and receiver characteristics plus a latent factor matrix. The elements of this matrix are the sender and receiver latent factors multiplied by a sender-receiver pair parameter. This latent multiplicative factor captures higher-order dependence structures that are not explained by the observed covariates included in the model. Minhas et al. (2019) then proposed a Bayesian estimation procedure that assumes certain prior distributions for the parameters, including the elements of the latent factor matrix. This class of models is very flexible, too. Applications, however, are often restricted to binary data, although an extension to multinomial and count data is possible. In addition, diagonals are often not modeled explicitly (see e.g., Minhas et al. 2019), and it is not straightforward to constrain the row-sums that are bounded in our data by the province populations.

Overall, we argue that the methodology and the modeling approach of these relational and network models are too general for our problem. In contrast, our model is directly related to the migration (and other similar) flows and can be derived from a small number of assumptions (e.g., multinomial probabilities and independence of the actors in different provinces). For example, staying within the notation in our article, the SAOM in Snijders (1996) takes $(H_{t}, Z_{t})$ as the state and assumes this is a Markov process. $H_{t}$ corresponds to the relation matrix in SAOM, and $Z_{t}$ to the exogenous variables. However, in our model, $(H_{t}, Z_{t})$ is not a Markov process in that $H_{t}$ depends on the cumulative effect of the previous $H_{1 : t - 1}$ s as those sum up to the populations. In this sense, our model is similar to the model discussed by Hanneke, Fu, and Xing (2010), which conditions the current network on the earlier network realizations. Consequently, the parameters of our model can be inferred with a fairly standard MCMC algorithm (see algorithm 1). This convenience is due to the assumption in our model that out-migration flows from two different provinces are independent, conditional on covariates that include factors related to migration in the previous year (however, recall that our model addresses the dependency of migration decisions among people who live in the same province). This assumption is plausible in the migration context: When a person in province $i$ is deciding to migrate to province $j$ (or not to migrate at all) at time point $t$ , this decision is likely independent of another person’s decision in province $k \neq i$ at the same time point, provided we take key relevant factors into account.

Note that in our model, a person’s decision in province $i$ can depend on another person’s decision in province $j$ that was made in the previous year because we include factors that are functions of past migration as predictors of future migration. For example, if we think a person in province $i$ is more likely to migrate to province $j$ if people from other provinces also migrate to $j$ , we can include this dynamic in the model, as we will do by adding a popularity measure of a province. Hence, we can ameliorate possible misfits due to the conditional independence assumption by including relevant covariates. Also note that in our model, a person’s decision in province $i$ can depend on another person’s decision in the same province and time period due to the Dirichlet-multinomial specification as discussed previously.
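The two properties just described, a fixed row-sum and dependence among residents of the same province, can be illustrated with a two-step draw. This is a minimal sketch under assumed values: the population, expected shares, and concentration `scale` are hypothetical, and the parameterization may differ from the one in our model equations.

```python
import numpy as np

rng = np.random.default_rng(1)

population = 10_000  # hypothetical number of residents in origin i

# Hypothetical expected shares: stay in i, move to j, move to k.
mean_shares = np.array([0.97, 0.02, 0.01])
scale = 50.0  # hypothetical concentration; larger values mean less overdispersion

# Step 1: draw one set of probabilities shared by all residents of the province.
probs = rng.dirichlet(scale * mean_shares)

# Step 2: allocate every resident to exactly one outcome given those probabilities.
row = rng.multinomial(population, probs)

print(row, row.sum())  # the counts always add up to the origin population
```

Because all residents of the origin share the same random draw of probabilities, their outcomes are correlated, while the counts for a different origin would come from a separate, independent draw.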

We should also add that the conditional independence assumption of our model is less problematic, the more frequent the temporal measurements are. We have annual migration data, which we believe is frequent enough given that most migration decisions are not made on a whim. If, however, the researcher has only cross-sectional data or data that are collected very infrequently, such as decades apart, then complex dependencies in migration flows between actors from different provinces may not be captured in our model. In such cases, however, the researcher would have a simpler data structure for which the more complex count ERGM-type models may work better.

We believe the models developed in the social networks literature reviewed previously are rich, very general, flexible, and in principle can be applied to the type of data we have here, especially given the rapidly developing work on estimation in valued ERGMs. A systematic comparison of all possible alternative modeling approaches, however, is beyond the scope of the current study.

Determinants of Migration

We now move to the substantive discussion of the expected drivers of migration and describe the Turkish context to which our data belong. Migration is a complex phenomenon with multiple causes, hence there is no single theory for it. A nonexhaustive list of the determinants of voluntary migration can be grouped into economic, demographic, geographic, cultural, and social network factors (Boyle et al. 1998). In classical economic theory, income maximization relative to the costs of moving is a key driver of migration (Borjas, Bronars, and Trejo 1992). Economic prospects, such as a high per capita income, employment opportunities, or wages, should pull migrants. Conversely, poor economic prospects should push migration.

Migration flows are also shaped by population. The larger the population of the origin and the destination, the larger the flow will be between the two (Barthélemy 2011; Levy 2010). This prediction is due to the empirical regularity that flows correlate positively with stocks in either direction (Poot et al. 2016). A large population means there is more capacity to send and receive migrants.

Increasing population may increase migration flows, but the spatial distance between the origin and the destination constrains it. Next to the gravity model, spatial network models, too, show that distance affects ties between individuals (see e.g., Butts et al. 2012). The effect of spatial distance operates via various channels (Schwartz 1973). First, it directly increases the cost of moving, a key factor in the economic model of migration. Second, it hinders maintaining contact with friends and family in one’s origin country, hence imposing a psychological cost on migration. Third, the larger the spatial distance, the lower the information flow between the origin and destination. A lower information flow means it is more difficult to hear about available opportunities at the destination. The combined effect of population and spatial distance on migration is sometimes referred to as the “gravity law”: The “mass” (i.e., population) of the origin and the destination relative to the spatial distance between the two determines flows (Barthélemy 2011; Levy 2010).

The gravity law focuses exclusively on spatial distance, but cultural or linguistic distances constrain migration, too. Similarity between the origin and destination concerning language and religion, for example, improves migrants’ labor market integration (Van Tubergen, Maas, and Flap 2004; Windzio 2018). Furthermore, cultural similarity facilitates migration due to homophily (McPherson, Smith-Lovin, and Cook 2001) and by reducing discrimination and acculturation costs for migrants (Van Tubergen et al. 2004).

Finally, social networks are highly consequential for migration (Massey and España 1987). This is true for the social network of the individual migrant (Massey and España 1987) as well as macro-level network features of the origin and destination (Levy 2010; Windzio 2018). Knowing people who migrated previously reduces the cost of migration for a prospective migrant. A social network in the destination can introduce a prospective migrant to possible employers, suggest places to live, offer information on vacancies, and provide social support that reduces the psychological costs of migration (Massey and España 1987). These effects make migration self-perpetuating: The more migration there is from $i$ to $j$ , the more migration is expected to take place from $i$ to $j$ in the future (Massey 1990).

The reverse should be true as well: The more migration there is from $i$ to $j$ , the more migration is expected to take place from $j$ to $i$ . In other words, one expects a reciprocity effect. This is first due to return migration (Borjas and Bratsberg 1996; Danchev and Porter 2018). But once a link is established between $i$ and $j$ through large-scale migration, natives in $j$ will be introduced to opportunities and networks in $i$ as well. This may facilitate migration from $j$ to $i$ even among natives of $j$ .

Other network characteristics, such as betweenness centrality (Freeman 1977) and in-degree assortativity (Newman 2002), will also affect migration. The betweenness centrality of a node in a network indicates how many (weighted) shortest paths pass through the node. A high level of betweenness centrality of an area implies that the area acts as a migration hub. This also means there is high diversity because people from different origins come to the area and likewise go to different destinations. This may offer new economic opportunities that attract further migration. Indeed, using an instrumental variable design, Damelang and Haas (2012) show that in Germany, cultural diversity enhances immigrants’ labor market success. Finally, if two provinces $i$ and $j$ attracted migration from similar other provinces, as measured by in-degree assortativity, more migration should take place between $j$ and $i$ , too. This is because both $j$ and $i$ will include networks of people from similar origins, which will, in turn, facilitate information flows and reduce migration costs between $j$ and $i$ .

Internal Migration in Turkey

Historically, Turkey has sent a large number of emigrants to Europe. Most research has thus focused on Turkish immigrants’ integration into Europe. Since the 1980s, however, emigration from Turkey to Europe has decreased (İçduygu and Sert 2009). Far larger shares of the population migrate internally. Our data show that annually, around 2.5 million people migrate between the 81 provinces of Turkey. Annual emigration abroad, on the other hand, is around 250,000.

Work on the determinants of internal migration in Turkey is scarce (Koramaz and Dökmeci 2017). Most of the existing research has been carried out by urban planners whose main focus is on spatial issues (Filiztekin and Gökhan 2008). Almost all existing studies analyze data up to the year 2000, that is, the last year a population census was conducted. Studies that use more recent migration data are mainly descriptive; for example, Akın and Dökmeci (2015) classify the 12 regions of Turkey based on interregional migration patterns.

Gedik (1997) shows that starting from the 1980s, the vast majority of migration in Turkey takes place from city to city as opposed to rural to urban. Evcil, Kiroglu, and Dokmeci (2006) confirm this pattern. As expected, economic factors such as per capita gross domestic product (GDP), wages, industrial workforce, and unemployment rates are important determinants of migration (Filiztekin and Gökhan 2008; Gezici and Keskin 2005). Evcil et al. (2006) argue that GDP differentials are one of the most important drivers of internal migration.

In line with the gravity law, populations in the origin and destination are positively associated with migration flows (Filiztekin and Gökhan 2008; Gezici and Keskin 2005). The findings on the second component of the gravity law, namely, spatial distance, are equivocal. Gedik (1997) argues that beyond the immediate neighboring cities, the effect of spatial distance on migration is minimal. Koramaz and Dökmeci (2017), however, report that after peaking at a distance of 200 to 400 kilometers, migration decreases rapidly as spatial distance increases. Koramaz and Dökmeci (2017) also report that in provinces in the east, most of which have large shares of Kurds, the effect of spatial distance is less pronounced.

Earlier studies also indicate the importance of social networks. Gedik (1997) shows that previous migrants who are friends or relatives from the same area are as effective as economic factors in facilitating migration. Filiztekin and Gökhan (2008) use the stock of earlier migration between $i$ and $j$ as an indicator of a network effect and find a strong effect of this earlier stock on future migration. To our knowledge, no study looks at further social network characteristics of Turkey’s provinces, such as betweenness centrality, reciprocity, and assortativity.

Other important social forces may affect internal migration in Turkey. The first is politics. Aksoy and Billari (2018) and Aksoy and Gambetta (2021) show that provinces and districts ruled by Erdoğan’s AKP (Justice and Development Party) are more efficient in providing local services and social assistance than are those controlled by the opposition. This can potentially lead to “welfare migration” from opposition to AKP municipalities. Aksoy and Billari (2018) do not find evidence for such welfare migration, but migration is not their focus, and hence their analysis is rather descriptive. Political polarization is also rife in Turkey. In a recent poll, 78 percent of respondents did not “approve of their daughter marrying a supporter” of a party other than their own (Erdoğan and Semerci 2017). We are not aware of any previous work that systematically focuses on the effect of politics and political distance on migration in Turkey. Yet, we expect Turkey’s political divides to play an important role in migration decisions.

A further potentially important factor is ethnicity. Sizeable Kurdish populations live in Turkey’s large cities. This is due to both economic and forced migration. Violent conflicts in the 1990s resulted in the forced displacement of Kurds (Ergin 2014). Kurds tend to have higher levels of unemployment, poverty, and fertility (Koc, Hancioglu, and Cavlin 2008), factors that traditionally push people to migrate. Due to historical and political reasons, data on ethnicity in Turkey are scarce; no population census since 1965 includes a question on ethnicity (Koc et al. 2008). Hence, it is unknown if Kurdish migration is particularly high once economic factors and population are accounted for. In this study, we will provide the first comprehensive test.

Data and Descriptive Results

Data and Variables

We compiled a new data set for this study. All variables are at the province level, and there are 81 provinces. We obtained all variables from the Turkish Statistical Institute (TurkStat), except the proportion of Kurds in a province, which we estimated from two representative surveys.

Dependent variable: migration counts

The number of individuals who moved from one province to another in a given year is available from TurkStat from 2009. These data come from Turkey’s Address Based Population Registration System (ABPRS), which replaced the census in 2007. By law, each Turkish citizen is required to be registered at a single primary address. Any change in one’s primary address must be updated in ABPRS within 20 working days, online or in person. Not complying results in penalties. There are other incentives to keep the primary address up to date; for example, school access is determined by residence, and any official communication takes place via the registered address. Migration that is not yet registered in the system is not captured by these data. Hence, our dependent variable should be interpreted as “official” internal migration.

We also calculate the number of people who did not migrate. We do so by subtracting net migration (out-migration – in-migration) from the population of a province in a year. We apply a correction by adjusting for the number of births, deaths, and those who register for the first time in the province in a given year. This gives us an $81 \times 81$ matrix for each year from 2009 to 2018, with off-diagonals indicating the number of people migrating between provinces and the diagonals indicating those who stay.

Explanatory variables

We predict migration with the following explanatory variables. In our model, all dynamic (i.e., time-varying) explanatory variables are lagged by one year. As indicators of economic prospects, we use annual provincial GDP per capita and the unemployment rate. Unemployment rates are only available at the NUTS-2 level, which comprises 26 large regions. As elements of the gravity law, we use the annual population of each province and the spatial distance between provinces. Spatial distance is measured as the driving distance in kilometers between province centers. To measure the political distance between provinces, we calculate the absolute difference in vote shares of the political parties in the 2004, 2009, and 2014 mayoral elections. Political distance is a dynamic variable.
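A pairwise political distance of this kind can be sketched as follows. The aggregation over parties (a sum of absolute differences) is an assumption made for illustration, since the exact aggregation is not spelled out here, and the vote shares are hypothetical.

```python
import numpy as np

# Hypothetical vote shares: rows are provinces, columns are parties (each row sums to 1).
vote_shares = np.array([
    [0.50, 0.30, 0.20],
    [0.40, 0.35, 0.25],
    [0.10, 0.20, 0.70],
])

# Absolute difference in each party's vote share for every pair of provinces.
diff = np.abs(vote_shares[:, None, :] - vote_shares[None, :, :])

# Assumed aggregation: sum the absolute differences over parties.
political_distance = diff.sum(axis=2)

print(np.round(political_distance, 2))
```

The resulting matrix is symmetric with zeros on the diagonal, so it is a pair-level (↔) covariate by construction.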

We calculate the following network characteristics using the (one-year lagged) migration matrix. The self-perpetuating nature of migration is captured by the number of people who migrated from $i$ to $j$ in the previous year and by total in-migration in $j$ , that is, popularity (or equivalently weighted in-degree). Reciprocity for migration from $i$ to $j$ is calculated as the number of migrants from $j$ to $i$ in the previous year. Betweenness centrality of a province is calculated as the normalized number of geodesics (shortest paths) going through a province in the migration network. In-degree assortativity is operationalized as the correlation between the migration in-flows of $i$ and $j$ . The higher this correlation between two provinces, the more similar the provinces are in terms of the origin and flows of incoming migrants, so this variable can also be interpreted as in-flow similarity.
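Several of these network covariates follow directly from the lagged flow matrix. The sketch below uses a hypothetical four-province matrix; it omits betweenness centrality, and the treatment of diagonal entries in the in-flow correlation is an assumption rather than a description of our replication code.

```python
import numpy as np

# Hypothetical lagged flow matrix: rows are origins, columns are destinations.
flows_prev = np.array([
    [0, 120, 30, 10],
    [90, 0, 45, 5],
    [20, 60, 0, 15],
    [8, 12, 25, 0],
], dtype=float)

previous_year = flows_prev           # entry (i, j): last year's flow from i to j
reciprocity = flows_prev.T           # entry (i, j): last year's flow from j to i
popularity = flows_prev.sum(axis=0)  # total in-migration per destination (weighted in-degree)

# In-flow similarity: correlation between the in-flow profiles (columns) of each pair.
inflow_similarity = np.corrcoef(flows_prev, rowvar=False)

print(popularity)
print(np.round(inflow_similarity, 2))
```

Previous-year flow, reciprocity, and in-flow similarity are pair-level quantities, whereas popularity is a single value per province.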

We also use the proportion of municipalities in a province that are controlled by the AKP after the 2004, 2009, and 2014 elections as a dynamic variable. This variable has a [0, 1] range, where 0 and 1 correspond to provinces in which the AKP controls none or all of the municipalities, respectively, in a given year.

We estimate the percentage of Kurds in a province as a dynamic variable. There are no official data on ethnicity, so we use Turkey’s Demographic and Health Survey 2008 and 2013 waves (Koc et al. 2008). Both surveys are representative and sample women of reproductive age (15–49; Hacettepe University Institute of Population Studies 2008–2013). The survey includes a question about the respondent’s mother tongue, which we use to identify if someone is Kurdish. We use the 2008 and 2013 waves to estimate the proportion of Kurds in a province in and before those years.

To facilitate estimation and interpretation, we normalize all explanatory variables, except time, to the [−1, 1] range by centering around the mean and dividing by the maximum absolute value of the centered values, so that each variable attains at least one of the boundaries $-1$ or $1$ . In this way, the coefficients of those factors concern a change from the average value to the absolute maximum observed value of a factor. Finally, we use popularity, population, previous year, and reciprocity in the log domain.
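The normalization can be written in a few lines. This is a minimal sketch; the function name `normalize_to_unit_range` and the example values are hypothetical.

```python
import numpy as np

def normalize_to_unit_range(values):
    """Center on the mean, then divide by the largest absolute centered value."""
    values = np.asarray(values, dtype=float)
    centered = values - values.mean()
    max_abs = np.abs(centered).max()
    if max_abs == 0:
        return centered  # a constant variable maps to all zeros
    return centered / max_abs

# Hypothetical GDP per capita values for five provinces.
gdp = np.array([4.0, 6.0, 7.0, 9.0, 24.0])
scaled = normalize_to_unit_range(gdp)

print(scaled)  # mean zero; the most extreme province sits on a boundary
```

With a right-skewed variable such as this one, the upper boundary is attained while the minimum stays above −1, so a coefficient describes the move from the average province to the most extreme one.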

Descriptive Results for Migration

Figure 2 shows out-migration and in-migration in the 81 provinces of Turkey. The figure shows the absolute level of migration, migration as the percentage of the number of inhabitants at the start of the year, and the 2018 to 2009 difference in the latter. We see that large provinces, such as İstanbul, Ankara, İzmir, Adana, and Antalya, have high levels of out-migration and in-migration (see Figure 3 for province names). This is in line with the gravity law. There is almost a perfect correlation between out-migration and in-migration levels across provinces.

Figure 2.

Absolute, per 100, and change in per 100 between 2018 and 2009 in out-migration and in-migration per province.

Figure 3.

Migration flows between provinces.

Migration as a percentage of the number of inhabitants, however, shows a different pattern. Proportional to their populations, large provinces (e.g., İstanbul, Ankara, İzmir, Adana, and Antalya) have relatively low migration. Smaller provinces in the center-east, such as Gümüşhane and Tunceli, have the highest migration per capita. The difference between absolute and relative migration will become important when we present our model estimates. The bottom panels of Figure 2 show that on average, out-migration per capita is rather stable between 2009 and 2018. However, the change is slightly negative in some places and slightly positive in others; the largest change is in Gümüşhane.

Figure 2 shows only the origins and destinations of migration. Figure 3 shows absolute migration flows. In Figure 3, we omit small flows for clarity. These small flows will be part of the analyses in the next section. Figure 3 shows that İzmir, Antalya, Tekirdağ, Ordu, Tokat, and Ankara are popular destinations, especially for individuals from İstanbul. Local clusters, such as Şanlıurfa, Konya, Gaziantep, Adana, and Mersin, also attract and send migrants. Finally, Figure 4 shows the bivariate correlations for the one-way (i.e., variables defined for a province, “node” characteristics) and two-way (i.e., variables defined for a pair of provinces, “edge” characteristics) factors. As expected, many predictors are correlated.

Figure 4.

Sample correlation matrix for the one-way and two-way factors.

Results

We now provide the results of the implementation of our methodology using the migration data described in the previous section. We performed the experiments in MATLAB, version R2021b. All data and the code that produce the results are available (anonymized for peer review) at https://github.com/SocNetMigration/MigNet-MATLAB-code.git. This replication package also includes guidelines for preparing other data for our model.

Model Estimates

We implemented the sampling method in algorithm 1 to estimate $θ$ . We use noninformative priors for $θ$ described in Appendix A. Algorithm 1 is run for $5 \times 10^{5}$ iterations, and the posterior distributions are estimated based on the draws obtained during the last 80 percent of the iterations (following a burn-in). Convergence is confirmed by visual inspection of the history of the draws from the chain.

Figure 5 shows the box plots for the estimated marginal posterior distributions of the coefficients for our predictors of migration. Table 1 displays the means, standard deviations, and 90 percent credible intervals of those parameters and of the random intercept and the “scale” parameter of the Dirichlet-multinomial model (see equations [2] and [4]). The correlation structure in the posterior distribution for $θ_{1}, θ_{2}, θ_{3}$ is shown in Figure 6. Appendix B.1 includes the histograms of the posterior distributions of the parameters.
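The summaries reported in Table 1 can be obtained from stored draws along the following lines. This is a minimal sketch: the chain is a synthetic stand-in rather than output of algorithm 1, and the use of equal-tailed intervals is an assumption, since the interval type is not stated here.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in for the stored draws of one coefficient.
n_iter = 5000
chain = rng.normal(loc=0.27, scale=0.006, size=n_iter)

burn_in = int(0.2 * n_iter)  # discard the first 20 percent of the iterations
kept = chain[burn_in:]       # keep the last 80 percent

post_mean = kept.mean()
post_sd = kept.std(ddof=1)
lower, upper = np.percentile(kept, [5, 95])  # equal-tailed 90 percent interval

print(post_mean, post_sd, (lower, upper))
```

A trace plot of `chain` against the iteration index is the kind of visual check referred to above for assessing convergence.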

Figure 5.

Box plots of marginal posterior distributions of factor coefficients in the hierarchical (random intercepts across provinces) Dirichlet-multinomial model.

Table 1.

Means, Standard Deviations, and 90 Percent Credible Intervals for the Marginal Posterior Distributions of the Parameters of the Dirichlet-Multinomial Model with the Hierarchical Specification

Parameter	Mean	S.D.	90% Credible Interval	Parameter	Mean	S.D.	90% Credible Interval
GDP →	.0088	.0155	(–.0175, .0335)	GDP ←	.2698	.0060	(.2601, .2796)
Unemployment →	.0406	.0079	(.0279, .0533)	Unemployment ←	–.0286	.0055	(–.0372, –.0192)
AKP →	–.0742	.0044	(–.0816, –.0671)	AKP ←	–.0124	.0028	(–.0167, –.0078)
Popularity →	.5691	.0237	(.5310, .6078)	Popularity ←	–.1511	.0164	(–.2064, –.1538)
Population →	.2935	.1187	(.0899, .4920)	Population ←	.1419	.0141	(.1177, .1642)
Betweenness centrality →	.3358	.0334	(.2876, .3892)	Betweenness centrality ←	.1408	.0199	(.1083, .1727)
Kurd →	–.0642	.0090	(–.0785, –.0487)	Kurd ←	–.0305	.0039	(–.0370, –.0240)
Political distance ↔	–.0376	.0040	(–.0441, –.0310)	Base mean	–8.4527	.1286	(–8.6645, –8.2419)
Spatial distance ↔	–.0213	.0037	(–.0275, –.0152)	Base variance	1.3366	.2274	(1.0079, 1.7467)
Previous year ↔	3.2927	.0153	(3.2675, 3.3177)	Scale	10.5472	.0064	(10.5368, 10.5576)
Reciprocity ↔	1.9850	.0140	(1.9616, 2.0079)
In-flow similarity ↔	.0982	.0040	(.0916, .1048)

Figure 6.

Posterior correlation matrix.

The correlation structure in the posterior in Figure 6 is in line with the correlation matrix of the predictors given in Figure 4. A comparison of these two figures shows the typical trend that a positive correlation between two predictors yields a negative correlation between their coefficients in the posterior and vice versa, as expected. This justifies the general cautionary rule, which applies to almost all multivariate models, that the effects of correlated predictors should not be considered in isolation from each other.

The results are mostly in line with theory. Unemployment (→) in the sender province is positively associated with out-migration, and unemployment (←) in the target has a negative association with in-migration. GDP (→) in the sending province does not seem to have a strong association with out-migration, whereas GDP (←) in the target is strongly and positively associated with in-migration.

Regarding the elements of the gravity law, a large population (→) is associated with a higher probability of out-migration: the posterior mean of its coefficient is .29. Population (←) in the target also has a very strong positive association with in-migration. We also find a negative association between spatial distance (↔) and migration. These results are in line with the gravity law. We also find a negative association between provincial political distance (↔) and migration. Interestingly, the size of the coefficient for political distance is slightly higher than that for spatial distance.

As expected, network characteristics are strongly associated with migration. Migration from $i$ to $j$ in the previous year (↔) is a very strong predictor of current migration. Its coefficient is the largest of all. The popularity (←) of a province (total in-migration) in a given year is associated negatively with in-migration the next year net of all other factors controlled in the model. We also find strong reciprocity (↔): the larger the migration from $j$ to $i$ in a given year, the larger the migration from $i$ to $j$ the next year. Betweenness centrality of a province in a given year also makes a province attractive the next year (←). Likewise, betweenness centrality (→) of the origin is associated strongly and positively with out-migration. Hence, central provinces seem to act as hubs, attracting and initiating further migration. In-flow similarity (↔), which captures in-degree assortativity, is also associated positively with migration.

We find that the strength of the AKP in a province is negatively associated with both out-migration (→) and in-migration (←), although its coefficient is small in both cases. This shows that migration out of and into AKP-dominated regions is low. Together with the strong negative coefficient for political distance, this finding further indicates the negative association of political divides with migration.

Finally, the proportion of Kurds in a province is associated negatively with both out-migration (→) and in-migration (←). Although its coefficients are low in the absolute sense, negative values are inconsistent with the common belief that migration is a Kurdish phenomenon. Note, however, that the outcome is all migration from and to a province, Kurdish or otherwise. Hence, the evidence here is indirect. Nevertheless, if Kurds were much more mobile than Turks and other ethnicities, as the common belief suggests, one would expect a positive effect of the proportion of Kurds in a province on out-migration.

The coefficients for ← and ↔ factors can be interpreted in terms of relative probabilities of attracting migration from a given province. For example, suppose the maximum observed GDP value is normalized to $1$ (recall that all factors are centered around the mean and normalized to the $[- 1, 1]$ range). Then the mean of the posterior of the coefficient for “ $GDP \leftarrow$ ” suggests that from a given province, the probability of migration to a province with the highest GDP per capita is 31 percent higher ( $\exp (0.27) = 1.31$ ) than the probability of migration to an average province, ceteris paribus. Likewise, the probability of migration to the spatially farthest province is 2 percent lower ( $\exp (- 0.02) = 0.98$ ) than that to an average province (provided the maximum spatial distance has been normalized to $1$ ). The coefficients for → factors can be interpreted in terms of the relative probability of migrating to any province versus not migrating. For example, a 1-unit increase in unemployment (e.g., the change from average to maximum observed value) in province $i$ is associated with a 4 percent ( $\exp (0.04) = 1.04$ ) increase in the relative probability of migration from $i$ , ceteris paribus.
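The arithmetic behind these interpretations can be reproduced with a few lines of code. This is only an illustrative sketch: the helper `relative_probability` is a hypothetical name, and the three posterior means are copied from Table 1.

```python
import math

# Posterior means taken from Table 1 (arrows written in ASCII).
posterior_means = {
    "GDP <-": 0.2698,
    "Spatial distance <->": -0.0213,
    "Unemployment ->": 0.0406,
}

def relative_probability(coef, delta=1.0):
    """Multiplicative change in the relative probability associated with a
    `delta`-unit increase in a factor, everything else held fixed."""
    return math.exp(coef * delta)

for name, coef in posterior_means.items():
    ratio = relative_probability(coef)
    print(f"{name}: ratio = {ratio:.2f}, change = {100 * (ratio - 1):+.0f}%")
```

Because all factors are normalized to the $[- 1, 1]$ range, `delta=1.0` corresponds to moving from the average to the maximum observed value of a factor.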

Table 1 also shows the mean and the variance of the intercept, which is random across provinces. We see considerable variability in the levels of out-migration probabilities across provinces, the largest in Bayburt and the smallest in Istanbul. Figure 7 shows the posterior distribution of the intercepts per province, estimated with our hierarchical specification. These provincial variations somewhat match the descriptive statistics given in Figure 2, although they are not equal because these intercepts are obtained after controlling for the predictors of migration.

Figure 7.

Box plots of the marginal posterior distributions of random intercepts for provinces in the hierarchical specification of the Dirichlet-multinomial model.

Additional estimation results

Some scholars argue that controlling for the lagged dependent variable accounts for path dependency or autocorrelation of residuals in panel regressions, although it may also bias downward the coefficients for other explanatory variables (Keele and Kelly 2006). This issue has been shown to apply to temporal ERGMs, too (Block et al. 2018). To address this potential issue, we fitted our models after excluding variables that are calculated from past migrations: popularity, betweenness, previous year, reciprocity, and in-flow similarity. The results after excluding these variables are given in Appendix B.2.

These additional results show that including the measures based on migration in the previous year generally does not suppress the coefficients of the other variables. In fact, for some variables, the estimated coefficients are somewhat smaller after excluding the lagged measures (e.g., AKP ←). Coefficients do increase, in absolute value, for some variables (e.g., GDP →, Kurd ←) after excluding lagged variables. But because of a lack of general suppression of coefficients due to including measures based on lagged values of migration, we conclude that this issue is not particularly problematic in our case. Note that one expects differences in the coefficients of variables that are estimated before and after the inclusion of lagged migration variables due to the conditional nature of the Dirichlet-multinomial regression, which we observe. Additionally, as shown in Appendix Table B1, the exclusion of lagged measures reduces the predictive performance substantially, further suggesting the need for including them in the model.

Comparison with the Gravity Model in Terms of Predictive Power

The gravity model assumes a linear regression for the nondiagonal log-migration counts. The essential factors of the gravity model are population and distance, but one could easily include other one-way and two-way factors. Therefore, for fairness in comparison, we constructed a gravity model that includes all the factors considered in our model. This corresponds to the following relation for $i, j \in \{1, \dots, n\}$ ; $i \neq j$ ; and $t \in \{1, \dots, T\}$ ,

\log H_{t} (i, j) = θ_{0} (i) + θ_{1} \cdot u_{t} (i) + θ_{2} \cdot v_{t} (j) + θ_{3} \cdot z_{t} (i, j) + e_{t} (i, j),

where the sum on the right side serves as the base parameter as before; $e_{t} (i, j)$ are uncorrelated deviation terms, assumed to have zero mean and common variance; and $u_{t} (i)$ , $v_{t} (j)$ , and $z_{t} (i, j)$ are as defined before. (Wherever $H_{t} (i, j) = 0$ , which happens only in 31 out of 81 × 81 × 10 = 65,610 cells, we substituted it with a $1$ to avoid $\log 0$ .) We fitted two versions of this model: the first with a global $θ_{0} (i) = θ_{0}$ , which is constant across provinces, and the second in which the $θ_{0} (i)$ are “random effects” that are assumed to be normally distributed across provinces. The latter version matches the Dirichlet-multinomial model we discussed previously, which also has “random effects” for provinces. To facilitate comparison, we also fit a Dirichlet-multinomial specification with a global $θ_{0}$ .
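The global-intercept version of this regression can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the reported results: the function name `fit_gravity`, the array layouts, and the synthetic data are all hypothetical, and the random-intercept version (which needs a mixed-model fit) is not covered.

```python
import numpy as np

def fit_gravity(H, u, v, z):
    """Least-squares fit of log H_t(i, j) on an intercept, origin factors u,
    destination factors v, and dyadic factors z, using off-diagonal cells only.

    H: (T, n, n) counts; u: (T, n, ku); v: (T, n, kv); z: (T, n, n, kz).
    Returns the stacked vector (theta_0, theta_1, theta_2, theta_3).
    """
    T, n, _ = H.shape
    off_diag = np.broadcast_to(~np.eye(n, dtype=bool), (T, n, n))
    t, i, j = np.nonzero(off_diag)
    y = np.log(np.maximum(H[t, i, j], 1))  # zero counts replaced by 1, as in the text
    X = np.column_stack([np.ones(y.size), u[t, i], v[t, j], z[t, i, j]])
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

# Synthetic check with one factor of each type (all values are made up).
rng = np.random.default_rng(1)
T, n = 4, 12
u = rng.uniform(-1, 1, (T, n, 1))
v = rng.uniform(-1, 1, (T, n, 1))
z = rng.uniform(-1, 1, (T, n, n, 1))
true_theta = np.array([6.0, 0.5, 0.8, -0.4])
log_mean = (true_theta[0] + true_theta[1] * u[:, :, None, 0]
            + true_theta[2] * v[:, None, :, 0] + true_theta[3] * z[..., 0])
H = np.round(np.exp(log_mean + rng.normal(0, 0.1, (T, n, n))))
theta_hat = fit_gravity(H, u, v, z)
print(np.round(theta_hat, 2))
```

With normally distributed errors, the least-squares fit coincides with the maximum likelihood estimate of the coefficient vector.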

We compared the gravity model and the Dirichlet-multinomial model in terms of their out-of-sample predictive performances with the following process. Recall that the total data consist of 10 consecutive years between 2009 and 2018. For every year, we excluded its migration data and used the rest of the years to train the model. The training results are then used to predict the migrations in the excluded year. For the Dirichlet-multinomial model, we ran algorithm 1 (or variants of this for the global and random effects intercepts) to obtain samples from the posterior distribution of $θ$ given the training data, and we predicted the migration counts in the test year by the Monte Carlo estimation of their expectations with respect to their posterior predictive probability distribution. For the gravity model, the maximum likelihood estimates for the parameter vector, obtained from the training data, were substituted to calculate the expectation of $\log H_{t} (i, j)$ for the test year $t$ . Then, for a test year $t$ , the nonmigration counts, that is, $H_{t} (i, i)$ , $i = 1, \dots, n$ are predicted by $P_{t} (i) - \sum_{j = 1, j \neq i}^{n} {\hat{H}}_{t} (i, j)$ , where ${\hat{H}}_{t} (i, j)$ is the predicted value for $H_{t} (i, j)$ . The prediction results for the 10 test years are then used to calculate squared errors [(predicted – observed)²] and averaged over the 10 years. We calculate separate mean squared errors for migration counts with and without the diagonals (in a typical gravity model, diagonals are excluded).
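The leave-one-year-out procedure can be summarized in a short sketch. The callables `fit` and `predict` are hypothetical placeholders for whichever model is being evaluated, and the naive "average of the training years" predictor in the usage lines is only there to make the example run; it is not one of the models compared in the text.

```python
import numpy as np

def fill_diagonal_with_stayers(H_hat_offdiag, population):
    """Predicted nonmigration counts: origin population minus predicted out-flows."""
    H_hat = np.array(H_hat_offdiag, dtype=float)
    np.fill_diagonal(H_hat, 0.0)
    np.fill_diagonal(H_hat, population - H_hat.sum(axis=1))
    return H_hat

def leave_one_year_out_mse(H, P, fit, predict, include_diagonal=True):
    """H: (T, n, n) observed flows; P: (T, n) origin populations.
    `fit(train_years)` returns a fitted model; `predict(model, year)` returns
    an (n, n) array whose off-diagonal entries are the predicted flows."""
    T, n, _ = H.shape
    mask = np.ones((n, n), dtype=bool) if include_diagonal else ~np.eye(n, dtype=bool)
    errors = []
    for test_year in range(T):
        train_years = [t for t in range(T) if t != test_year]
        model = fit(train_years)
        H_hat = fill_diagonal_with_stayers(predict(model, test_year), P[test_year])
        errors.append(np.mean((H_hat[mask] - H[test_year][mask]) ** 2))
    return float(np.mean(errors))

# Usage with made-up data and a naive predictor.
rng = np.random.default_rng(0)
T, n = 5, 6
H = rng.poisson(50, (T, n, n)).astype(float)
H[:, np.arange(n), np.arange(n)] += 1000   # most residents stay
P = H.sum(axis=2)                          # each row sums to the origin population

def fit(train_years):
    return H[train_years].mean(axis=0)

def predict(model, year):
    return model

mse = leave_one_year_out_mse(H, P, fit, predict)
print(round(mse, 1))
```

Setting `include_diagonal=False` gives the error over migration counts only, which is the convention in a typical gravity model.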

The results (see Table 2) show that the Dirichlet-multinomial model has better performance than the gravity model in predicting migration counts. Prediction errors for the Dirichlet-multinomial for count outcomes are consistently lower than those for the gravity model both when we include and exclude the diagonal (nonmigration) values in the predicted outcome. When the log-counts are considered, the gravity model performs slightly better than the Dirichlet-multinomial. This suggests the gravity model is generally mispredicting large flows. As an aside, the mean absolute error of our hierarchical model for all counts including nonmigration is only 108.6, which corresponds to less than .05 percent of the standard deviation of migration counts. This shows that our model predicts migration flows reasonably well. Also note that random province-specific intercept parameters improve the performance, and hence the random effects do not lead to overfitting.

Table 2.

Mean Squared Prediction Errors for the Dirichlet-Multinomial Model (DM) and the Gravity Model (Grav-re) with Random Intercepts for Provinces and with a Global Single Intercept (DM0 and Grav0) for Different Outcomes (Log Counts or Counts, Diagonals Excluded or Included)

Outcome	Diagonals	DM	Grav-re	DM0	Grav0
Log counts	Excluded	.1273	.1227	.1327	.1242
Log counts	Included	.1257	.1212	.1311	.1227
Counts ( $\times 10^{6}$ )	Excluded	.0927	.1054	.1096	.1165
Counts ( $\times 10^{6}$ )	Included	1.0847	1.5478	1.8661	2.1104

We also compared the predictive performance of our method with that of a simpler variant that excludes variables based on the previous year’s migration flows (see Appendix Table B1). The results show that the exclusion of lagged measures hampers predictive performance substantially, pointing to the need to include them in the model as we do.

Baseline Probability of Migration and the Predictive Accuracy of the Gravity Model

The aforementioned results show that our model outperforms the gravity model in predicting migration counts. We conjecture that one of the main reasons for the relatively poor performance of the gravity model in predicting migration counts is that it fails to capture the mechanistic relationship between migration out-flows to different targets from the same origin (e.g., due to competition between different targets for a constant number of possible migrants from a given origin). This drawback grows as the share of a population that migrates increases relative to the share that does not. The smaller the share of people in an origin who do not migrate (hence the larger the share of those who migrate), the larger the (negative) correlation between migration flows to different destinations, and because the gravity model does not capture those correlations naturally, its predictive accuracy will be lower.
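To make this mechanism concrete, consider a simplified worked case that is our illustration rather than part of the original analysis: a multinomial vector of counts for the $N$ residents of one origin, with probability $p_0$ of staying and an equal probability $p = (1 - p_0)/(n - 1)$ of moving to each of the $n - 1$ destinations. The symbols $N$, $p_0$, and $p$ are introduced here only for this illustration. The Dirichlet-multinomial covariance matrix equals the multinomial one multiplied by a common positive factor, so the correlation below is the same under both distributions.

```latex
\operatorname{Cov}(H_j, H_k) = -N p^{2}, \qquad
\operatorname{Var}(H_j) = N p (1 - p), \qquad
\operatorname{Corr}(H_j, H_k) = -\frac{p}{1 - p}, \quad j \neq k .
% With n = 10:
%   p_0 = 0.9  gives  p = 1/90  and  Corr = -1/89 \approx -0.011
%   p_0 = 0.1  gives  p = 1/10  and  Corr = -1/9  \approx -0.111
```

In this stylized case the negative correlation between flows to two destinations is roughly ten times stronger when 90 percent of residents migrate than when 10 percent do.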

To demonstrate the mechanism explained so far and help understand the predictive performance of the gravity model vis-a-vis ours better, we carry out a Monte Carlo simulation, where we test the predictive performances of the Dirichlet-multinomial model and the gravity model on simulated data sets. In our simulation, we randomly generate migration flows using two separate processes. In the first process, we generate migration flows from the Dirichlet-multinomial model with $10$ origins to $10$ destinations for five time periods. In generating these flows, we randomly simulate the origin populations (from the Poisson distribution with a mean of 1,000, independently for each origin) and 3 →, 3 ←, and 2 ↔ factors and draw migration counts from the respective Dirichlet-multinomial distributions as in equation (3). We fix $θ_{1} = (1, 1, 1), θ_{2} = (1, 1, 1)$ , and $θ_{3} = (1, 1)$ and control the baseline out-migration probability by varying a common $θ_{0}$ parameter for all origins. (For simplicity, we do not add sender random effects in the data-generation process.) We vary $θ_{0}$ from $- 5$ to $0$ with increments of .5 and generate $10$ data sets for each unique $θ_{0}$ value. Note that the parameter $θ_{0}$ controls the level of negative correlation between out-migration counts. As $θ_{0}$ increases, the negative correlation between the out-migration counts increases.
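The first data-generation process can be sketched as below. Several details are assumptions made only so that the example runs, because equation (3) is not reproduced in this section: factors are drawn uniformly from $[- 1, 1]$, staying is treated as the reference category with unit weight, and the Dirichlet parameters are the resulting probabilities multiplied by a hypothetical concentration `conc`. The function names are likewise hypothetical.

```python
import numpy as np

def simulate_dm_flows(theta0, n=10, T=5, conc=50.0, seed=0):
    """Simulate (T, n, n) flows; entry [t, i, j] counts moves from i to j,
    and the diagonal holds the stayers. Returns (flows, populations)."""
    rng = np.random.default_rng(seed)
    theta1, theta2, theta3 = np.ones(3), np.ones(3), np.ones(2)
    flows = np.zeros((T, n, n), dtype=int)
    pops = np.zeros((T, n), dtype=int)
    for t in range(T):
        pops[t] = rng.poisson(1000, size=n)   # origin populations
        u = rng.uniform(-1, 1, (n, 3))        # origin factors
        v = rng.uniform(-1, 1, (n, 3))        # destination factors
        z = rng.uniform(-1, 1, (n, n, 2))     # dyadic factors
        eta = theta0 + (u @ theta1)[:, None] + (v @ theta2)[None, :] + z @ theta3
        weights = np.exp(eta)
        np.fill_diagonal(weights, 1.0)        # assumed reference category: staying
        probs = weights / weights.sum(axis=1, keepdims=True)
        for i in range(n):
            p = rng.dirichlet(conc * probs[i])              # Dirichlet layer
            flows[t, i] = rng.multinomial(pops[t, i], p)    # multinomial layer
    return flows, pops

def out_migration_share(flows):
    """Share of all residents who leave their origin."""
    stayers = np.trace(flows, axis1=1, axis2=2).sum()
    return 1.0 - stayers / flows.sum()

for theta0 in (-5.0, -2.5, 0.0):
    flows, _ = simulate_dm_flows(theta0)
    print(theta0, round(out_migration_share(flows), 3))
```

Under these assumptions, raising `theta0` raises the share of residents who migrate, which is the quantity the simulation varies.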

In this simulation, our aim is to better understand how the predictive accuracy of the gravity model depends on the baseline probability of out-migration, rather than comparing our model with the gravity model in all possible data-generation scenarios. Nevertheless, because we generate the migration counts from a Dirichlet-multinomial model, one may argue that the data-generation process does not do justice to the gravity model. Hence, in a second process, we generate migration counts from a multivariate normal distribution obtained by randomly “perturbing” the Dirichlet-multinomial model. Specifically, for every $t$ and $i$ , let the mean vector and covariance matrix of $\text{Dirichlet-Multinomial} (X_{t} (i); α_{t} (i, 1), \dots, α_{t} (i, n))$ be $m_{t, i}$ and $S_{t, i}$ . We sample the counts as

(H_{t} (i, 1), \dots, H_{t} (i, n)) \sim N (m_{t, i}, \tilde{S}_{t, i}), where \tilde{S}_{t, i} \sim Wishart (S_{t, i} / n, n),

(7)

that is, we distort the covariance matrix to deviate from the Dirichlet-multinomial model while still keeping the negative correlation among the counts. The counts drawn from the multivariate normal distribution are then rounded to the nearest integers and truncated at the zero lower bound. Finally, we readjust the population according to the final values of the counts. As in the first process, in this second process, $θ_{0}$ controls the level of negative correlation between out-migration counts. For both processes, we then calculate mean squared errors as we described earlier, that is, by leaving out one time period in training the gravity and Dirichlet-multinomial models and predicting flows in the left-out period. For the gravity model, we also add the population as a factor.
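A minimal sketch of this second process for a single origin and time point is given below. The Dirichlet-multinomial mean and covariance are the standard closed forms; the way the Wishart draw is built (as a sum of outer products of Gaussian vectors, which also works when the scale matrix is singular) and the function names are our assumptions, not a description of the original implementation.

```python
import numpy as np

def dm_moments(N, alpha):
    """Mean vector and covariance matrix of a Dirichlet-multinomial
    distribution with N trials and parameter vector alpha."""
    a0 = alpha.sum()
    p = alpha / a0
    mean = N * p
    cov = N * (N + a0) / (1.0 + a0) * (np.diag(p) - np.outer(p, p))
    return mean, cov

def perturbed_normal_counts(N, alpha, rng):
    """Counts from a normal distribution whose covariance is a random
    Wishart perturbation of the Dirichlet-multinomial covariance."""
    n = alpha.size
    mean, S = dm_moments(N, alpha)
    # Wishart(S / n, n) draw: sum of n outer products of N(0, S / n) vectors.
    # S is singular because the counts sum to N, so a sampler that requires
    # a positive-definite scale matrix is avoided here.
    G = rng.multivariate_normal(np.zeros(n), S / n, size=n, method="svd")
    S_tilde = G.T @ G
    x = rng.multivariate_normal(mean, S_tilde, method="svd")
    # Round to integers and truncate at zero, as described in the text.
    return np.clip(np.rint(x), 0, None).astype(int)

rng = np.random.default_rng(0)
alpha = np.array([40.0, 3.0, 2.0, 5.0])   # made-up parameters; first entry = staying
counts = perturbed_normal_counts(1000, alpha, rng)
new_population = counts.sum()             # population readjusted to the drawn counts
print(counts, new_population)
```

The expected value of the perturbed covariance equals the original Dirichlet-multinomial covariance, so the draws deviate from that model at random while remaining centered on it.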

Figure 8 shows the results of our Monte Carlo simulations. The figure confirms our conjecture as to the reasons for the gravity model’s relatively poor predictive performance. As the baseline probability of migration increases (corresponding to larger $θ_{0}$ ), the predictive power of the gravity model vis-a-vis the Dirichlet-multinomial model decreases for both of the two data-generation processes. These results imply that as the ratio of migrants relative to nonmigrants from a given province approaches zero, the predictive accuracy of the Dirichlet-multinomial and the gravity model converge. This is an important observation that we will return to in the final section because the predictive drawbacks of the gravity model may be less problematic if, in a particular application, the migration counts (flows) relative to nonmigrant counts (stocks) are low.

Figure 8.

Predictive performance of the gravity model relative to the Dirichlet-multinomial model on simulated data.

Discussion and Conclusions

In this study, we proposed a Dirichlet-multinomial model and a Bayesian inference method to analyze interdependent flows. We applied the developed methodology to analyze the dynamic migration flows of the 81 provinces of Turkey from 2009 to 2018. Our study offers methodological and substantive contributions to the literature.

On the methodological front, our Dirichlet-multinomial model alleviates several shortcomings of the popular models applied to migration flows. First, our model naturally captures systemic effects. That is, unlike the gravity model, our model incorporates the effects of the characteristics of all alternative destinations on a decision to migrate to a specific destination. Second, our model accounts for nonmigration. The alternative, ignoring nonmigration and conditioning the results on migration having taken place, is a common practice in the literature and results in selection on the dependent variable. Third, our model adheres to the natural boundary of migration and the mechanistic relationship between migration and nonmigration. That is, out-migration and in-migration change the populations of the origin and the destination, and the total number of out-migrants cannot exceed the population of the origin in a given period. Our model naturally incorporates these phenomena. Fourth, our model lends itself to an estimation method that is computationally much simpler than many of its alternatives. The computational simplicity allows us to expand our model with a hierarchical specification that captures variations in the baseline of, and potentially also longitudinal changes in, migration probabilities.

To demonstrate these advantages, we carried out an out-of-sample prediction comparison of our model vis-a-vis the gravity model, one of the most common models used in the migration literature. The gravity model ignores systemic effects, and it typically ignores nonmigration and the mechanistic relationship between populations, nonmigration, and migration flows. We showed that our model consistently outperforms the gravity model in predicting migration flow counts.

We also showed through Monte Carlo simulations that the predictive accuracy of the gravity model worsens as the ratio of migrants to nonmigrants increases. Conversely, as the ratio of migrants to nonmigrants approaches zero, the predictive accuracy of the gravity model converges with that of our model. This implies that when migration is a much rarer event compared to nonmigration, the gravity model would perform reasonably well.

Note that while we framed our model within a migration setting, and indeed we developed it to understand migration flows and their determinants, the model is more general. It can be applied to any setting that involves discrete dynamic flows between a finite number of origins and destinations.

Several models developed in the social networks literature, which are very general and flexible, can, in principle, be applied to dynamic flows with some adjustments. These include ERGMs and their various extensions for count data or longitudinal data, REMs, DyNAMs, SAOMs, and latent space and latent factor models. We discussed those models in detail and compared them with ours. While they are indeed very powerful and flexible, we argued they may be too general for our problem at hand. Our method provides a straightforward approach to model dynamic flows while offering computational simplicity and at the same time taking into account key features of and dependencies in the data.

The computational simplicity of our model rests on the assumption that migrations from different origins at a given time point are independent, conditional on the covariates, which can include those based on previous migration flows. Note that this conditional independence assumption is about decisions in different provinces, and our model incorporates possible positive correlations between migration decisions within the same province. We believe this assumption is not unrealistic in our longitudinal context. When individuals are deciding whether to migrate in a given time point, they may not consider or indeed be aware of the migration decisions of others outside their provinces at the same time point. Moreover, dependencies among migration probabilities that may violate this assumption can be explained away, to a certain degree, by including in the model the predictors that are the likely causes of that dependence. For example, if one expects a “herding” dynamic for migration (i.e., people in different provinces imitating each other in migrating to a specific destination), this factor can be added to the model by adding a destination popularity factor based on previous migration flows into the destination, as we do in our analysis. In other words, because we have longitudinal data, we can condition current flows on earlier flows, which makes the conditional independence assumption more plausible.

Our model allows for further extensions. In particular, in our hierarchical specification, we include random intercepts for out-migration (i.e., for sending provinces). One can include such random effects for the receiving provinces, too. This would result in a cross-classified model (Snijders and Bosker 2011). In fact, we fitted this cross-classified version. The parameter estimates and the predictive performance of our model hardly changed. Yet the receiver random effects increased the computational complexity of the model substantially. This is because in a cross-classified model, each of the n random intercepts of the receiving provinces would affect the Dirichlet-multinomial probabilities for all provinces. Thus, the acceptance probability in the estimation, which is needed to accept or reject a proposed update for each of those parameters, would require the computation of the whole product in equation (6). Note that this is in stark contrast with the update of the intercept of the sending provinces, for which we calculate a single Dirichlet-multinomial probability for each $t$ in the numerator and the denominator of the acceptance ratio (see algorithm 1). Because receiver random effects were inconsequential to our results and the predictive performance of our model, we decided to proceed without the receiver random effects. Nevertheless, they could be incorporated should the researcher see a need to do so.

Overall, we contribute to the literature methodologically by providing a viable alternative for analyzing migration and other types of flows. Our study also offers substantive contributions to the migration literature. Far greater shares of populations are affected by internal migration than by international migration. Most past research, however, focuses on the latter. Using a new data set and a model, we analyze interprovincial migration in Turkey. Our results largely confirm the existing theories of migration: economic prospects, population, spatial distance, and network characteristics shape the flow of internal migration in Turkey. We also offer several novel findings.

First, we register that political distance between provinces (measured using electoral results) is negatively associated with migration flows. This negative association is even stronger than that between spatial distance and migration. Moreover, we find that the strength of the AKP (Justice and Development Party) in a province is negatively associated with out-migration and in-migration. These findings imply that political divisions in the country contribute to the sorting of province populations. This association between politics and migration, in turn, may accelerate the political polarization between Turkey’s provinces, which is already rife.

We also provide, to our knowledge, the first systematic test of whether the proportion of Kurds in a province is associated with migration. Kurds are affected by higher levels of unemployment and poverty and have higher fertility rates. These factors are traditionally associated with migration. Indeed, there has been large-scale Kurdish migration to western parts of Turkey. Our analysis, however, shows that conditional on the factors we include in our model, the share of Kurds in a province has a small negative association with migration. While these findings should be interpreted with caution, as the independent variable is the share of Kurds in a province and the outcome is migration across all ethnic groups, a lack of a positive coefficient suggests internal migration in Turkey is not a predominantly “Kurdish phenomenon.”

Acknowledgements

We thank Zsofia Boda and Burak Sönmez for their comments on earlier drafts.

Authors’ Note

The authors contributed equally to the manuscript.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Aksoy acknowledges financial support from the British Academy (Grant No. SRG20/200045).

ORCID iD

Ozan Aksoy

Data Accessibility Statement

An anonymized replication package with the data and the code that produce the results and with guidelines as to how our model can be fitted to other data is available at .

Author Biographies

Ozan Aksoy is associate professor of social science at University College London. His research interests include cooperation, trust, and religious behavior. He uses game theory, statistical and computational methods, and laboratory and natural experiments as research tools. He is the recipient of the 2019 Raymond Boudon Award for Early Career Achievement, and since 2022, he has been an elected fellow of the European Academy of Sociology. His recent work has been published in, among others, American Sociological Review, American Journal of Sociology, Social Forces, Nature Human Behaviour, Sociological Science, and European Sociological Review.

Sinan Yıldırım has been a faculty member since 2015 in engineering and natural sciences at Sabancı University, Turkey. He received his BS in 2007 and MS in 2009, both in electrical and electronic engineering at Boğaziçi University, Turkey. He holds a PhD in mathematical statistics from the University of Cambridge, UK, where he graduated in 2013. He then worked as a postdoctoral researcher at the University of Bristol from 2013 to 2015. His primary research areas are Bayesian statistics and Monte Carlo methods. Currently, he is also interested in data privacy.

References

Abramski

Katherine

Katenka

Natallia

Hutchison

Marc

. 2020. “A Network-Based Analysis of International Refugee Migration Patterns Using GERGMs.” Pp. 387–400 in Complex Networks and Their Applications VIII, edited by Cherifi

Gaito

Mendes

J. F.

Moro

Rocha

L. M.

Cham, Switzerland: Springer International Publishing.

Akın

Darçın

Dökmeci

Vedia

. 2015. “Cluster Analysis of Interregional Migration in Turkey.” Journal of Urban Planning and Development 141(3). doi:10.1061/(ASCE)UP.1943-5444.0000223.

Aksoy

Ozan

Billari

Francesco C.

2018. “Political Islam, Marriage, and Fertility: Evidence from a Natural Experiment.” American Journal of Sociology 123:1296–340.

Aksoy

Ozan

Gambetta

Diego

. 2021. “The Politics behind the Veil.” European Sociological Review 37:67–88.

Alamá-Sabater

Luisa

Alguacil

Maite

Bernat-Martí

Joan Serafí

. 2017. “New Patterns in the Locational Choice of Immigrants in Spain.” European Planning Studies 25:1834–55.

Almquist

Zack W.

Butts

Carter T.

2013. “Dynamic Network Logistic Regression: A Logistic Choice Analysis of Inter- and Intra-group Blog Citation Dynamics in the 2004 U.S. Presidential Election.” Political Analysis 21:430–48.

Almquist

Zack W.

Butts

Carter T.

2014. “Logistic Network Regression for Scalable Analysis of Networks with Joint Edge/Vertex Dynamics.” Sociological Methodology 44:273–321.

Azose

Jonathan J.

Raftery

Adrian E.

2015. “Bayesian Probabilistic Projection of International Migration.” Demography 52:1627–50.

Azose

Jonathan J.

Raftery

Adrian E.

2018. “Estimating Large Correlation Matrices for International Migration.” Annals of Applied Statistics 12:940–70.

10.

Barthélemy

Marc

. 2011. “Spatial Networks.” Physics Reports 499:1–101.

11.

Bell

Martin

Muhidin

Salut

. 2009. “Cross-National Comparison of Internal Migration.” United Nations Development Programme, Human Development Reports, Research Paper.

12.

Benson

Austin R.

Kumar

Ravi

Tomkins

Andrew

. 2016. “On the Relevance of Irrelevant Alternatives.” Pp. 963–73 in Proceedings of the 25th International Conference on World Wide Web. Republic and Canton of Geneva, Switzerland: International World Wide Web Conferences Steering Committee.

13.

Block

Per

Koskinen

Johan

Hollway

James

Steglich

Christian

Stadtfeld

Christoph

. 2018. “Change We Can Believe in: Comparing Longitudinal Network Models on Consistency, Interpretability and Predictive Power.” Social Networks 52:180–91.

14.

Block

Per

Stadtfeld

Christoph

Robins

Garry

. 2022. “A Statistical Model for the Analysis of Mobility Tables as Weighted Networks with an Application to Faculty Hiring Networks.” Social Networks 68:264–78.

15.

Block

Per

Stadtfeld

Christoph

Snijders

Tom A. B.

2019. “Forms of Dependence: Comparing SAOMs and ERGMs from Basic Principles.” Sociological Methods & Research 48:202–39.

16.

Borjas

George J.

1999. Economic Research on the Determinants of Immigration: Lessons for the European Union. Washington, DC: The World Bank.

17.

Borjas

George J.

Bratsberg

Bernt

. 1996. “Who Leaves? The Outmigration of the Foreign-Born.” The Review of Economics and Statistics 79:165–76.

18.

Borjas

George J.

Bronars

Stephen G.

Trejo

Stephen J.

1992. “Self-Selection and Internal Migration in the United States.” Journal of Urban Economics 32:159–85.

19.

Boyle

Paul

Halfacree

Robinson

Vaughan

. 1998. Exploring Contemporary Migration. Harlow, UK: Longman.

20.

Butts

Carter T.

2008. “A Relational Event Framework for Social Action.” Sociological Methodology 38:155–200.

21.

Butts

Carter T.

Acton

Ryan M.

Hipp

John R.

Nagle

Nicholas N.

2012. “Geographical Variability and Network Structure.” Social Networks 34:82–100.

22.

Chan

Tak Wing

Henderson

Morag

Sironi

Maria

Kawalerowicz

Juta

. 2020. “Understanding the Social and Cultural Bases of Brexit.” British Journal of Sociology 71(5):830–51.

23.

Cheng

Simon

Long

J. Scott

. 2007. “Testing for IIA in the Multinomial Logit Model.” Sociological Methods & Research 35:583–600.

24.

Çoban

Ceren

. 2013. “Different Periods of Internal Migration in Turkey from the Perspective of Development.” American International Journal of Contemporary Research 3:58–65.

25.

Damelang

Andreas

Haas

Anette

. 2012. “The Benefits of Migration.” European Societies 14:362–92.

26.

Danchev

Valentin

Porter

Mason A.

2018. “Neither Global nor Local: Heterogeneous Connectivity in Spatial Network Structures of World Migration.” Social Networks 53:4–19.

27.

Desmarais

Bruce A.

Cranmer

Skyler J.

2012. “Statistical Inference for Valued-Edge Networks: The Generalized Exponential Random Graph Model.” PLoS One 7:e30136. doi:10.1371/journal.pone.0030136.

28.

Erdoğan

Emre

Semerci

P. U.

2017. “Dimensions of Polarization in Turkey.”Istanbul Bilgi University Center for Migration Research, Istanbul, Türkiye, Report.

29. Ergin, Murat. 2014. “The Racialization of Kurdish Identity in Turkey.” Ethnic and Racial Studies 37:322–41.

30. Evcil, Ayse Nilay, Gulay Basarir Kiroglu, and Vedia Dokmeci. 2006. “Regional Migration in Turkey: Its Directions and Determinants.” Paper presented at the 46th Congress of the European Regional Science Association: “Enlargement, Southern Europe and the Mediterranean,” Volos, Greece, August 30–September 3.

31. Expert, Paul, Tim S. Evans, Vincent D. Blondel, and Renaud Lambiotte. 2011. “Uncovering Space-Independent Communities in Spatial Networks.” Proceedings of the National Academy of Sciences 108:7663–68.

32. Filiztekin, Alpay, and Ali Gökhan. 2008. “The Determinants of Internal Migration in Turkey.” Paper presented at the International Conference on Policy Modelling (EcoMod2008), Berlin, Germany, July 2–4.

33. Freeman, Linton C. 1977. “A Set of Measures of Centrality Based on Betweenness.” Sociometry 40(1):35–41.

34. Gedik, Ayse. 1997. “Internal Migration in Turkey, 1965–1985: Test of Conflicting Findings in the Literature.” Review of Urban & Regional Development Studies 9:170–79.

35. Gezici, Ferhan, and Berna Keskin. 2005. “Interaction between Regional Inequalities and Internal Migration in Turkey.” In ERSA Conference Papers. Louvain-la-Neuve, Belgium: European Regional Science Association.

36. Gimpel, James G., and Jason E. Schuknecht. 2001. “Interstate Migration and Electoral Politics.” The Journal of Politics 63:207–31.

37. Guimaraes, Paulo, and Richard C. Lindrooth. 2007. “Controlling for Overdispersion in Grouped Conditional Logit Models: A Computationally Simple Application of Dirichlet-Multinomial Regression.” The Econometrics Journal 10:439–52.

38. Hacettepe University Institute of Population Studies. 2008–2013. “Turkey Demographic and Health Survey, 2008–2013.” Ankara, Turkey: Hacettepe University Institute of Population Studies.

39. Hanneke, Steve, Wenjie, and Eric P. Xing. 2010. “Discrete Temporal Models of Social Networks.” Electronic Journal of Statistics 4:585–605.

40. Hoff, Peter D., Adrian E. Raftery, and Mark S. Handcock. 2002. “Latent Space Approaches to Social Network Analysis.” Journal of the American Statistical Association 97:1090–98.

41. Hoff, Peter D., and Michael D. Ward. 2004. “Modeling Dependencies in International Relations Networks.” Political Analysis 12:160–75.

42. Huang, Peng, and Carter T. Butts. 2021. “Parameter Estimation Procedures for Exponential-Family Random Graph Models on Count-Valued Networks: A Comparative Simulation Study.” arXiv. https://arxiv.org/abs/2111.02372.

43. Huang, Peng, and Carter T. Butts. 2022. “Rooted America: Immobility and Segregation of the Inter-county Migration Networks.” arXiv. https://arxiv.org/abs/2205.02347.

44. İçduygu and Sert. 2009. “Country Profile: Turkey.” Focus Migration 5. http://focus-migration.hwwi.de/typo3_upload/groups/3/focus_Migration_Publikationen/Laenderprofile/CP_05_Turkey.pdf.

45. Karemera, David, Victor Iwuagwu Oguledo, and Bobby Davis. 2000. “A Gravity Model Analysis of International Migration to North America.” Applied Economics 32:1745–55.

46. Keele, Luke, and Nathan J. Kelly. 2006. “Dynamic Models for Dynamic Theories: The Ins and Outs of Lagged Dependent Variables.” Political Analysis 14:186–205.

47. Koc, Ismet, Attila Hancioglu, and Alanur Cavlin. 2008. “Demographic Differentials and Demographic Integration of Turkish and Kurdish Populations in Turkey.” Population Research and Policy Review 27:447–57.

48. Koramaz, T. Kerem, and Vedia Dökmeci. 2017. “TÜRKİYE’DE İLLER ARASI GÖÇ (1995–2000).” Pp. 13–36 in Türkiye’de göç ve illerin demografik, ekonomik ve fiziksel dönüşümü, edited by T. K. Koramaz, Dökmeci, and Özdemir. Istanbul: Hiperlink Eğitim İletişim Yayın Gıda Sanayi ve Pazarlama Tic. Ltd. Şti.

49. Krivitsky, Pavel N. 2012. “Exponential-Family Random Graph Models for Valued Networks.” Electronic Journal of Statistics 6:1100–128.

50. Krivitsky, Pavel N., and Carter T. Butts. 2017. “Exponential-Family Random Graph Models for Rank-Order Relational Data.” Sociological Methodology 47:68–112.

51. Krivitsky, Pavel N., and Mark S. Handcock. 2014. “A Separable Model for Dynamic Networks.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 76:29–46.

52. Krivitsky, Pavel N., David R. Hunter, Martina Morris, and Chad Klumb. 2021. “ergm 4: New Features.” arXiv. https://arxiv.org/abs/2106.04997.

53. Krivitsky, Pavel N., Laura M. Koehly, and Christopher Steven Marcum. 2020. “Exponential-Family Random Graph Models for Multi-layer Networks.” Psychometrika 85:630–59.

54. Kuhn, Randall. 2015. “Internal Migration: Developing Countries.” International Encyclopedia of the Social & Behavioral Sciences 12:433–42.

55. Leal, Diego F. 2021. “Network Inequalities and International Migration in the Americas.” American Journal of Sociology 126:1067–126.

56. Levy, Moshe. 2010. “Scale-Free Human Migration and the Geography of Social Networks.” Physica A: Statistical Mechanics and its Applications 389:4913–17.

57. Lusher, Dean, Johan Koskinen, and Garry Robins. 2013. Exponential Random Graph Models for Social Networks: Theory, Methods, and Applications. Cambridge, UK: Cambridge University Press.

58. Massey, Douglas S. 1990. “Social Structure, Household Strategies, and the Cumulative Causation of Migration.” Population Index 56:3–26.

59. Massey, Douglas S., and Felipe García España. 1987. “The Social Process of International Migration.” Science 237:733–38.

60. McPherson, Miller, Lynn Smith-Lovin, and James M. Cook. 2001. “Birds of a Feather: Homophily in Social Networks.” Annual Review of Sociology 27:415–44.

61. Minhas, Shahryar, Peter D. Hoff, and Michael D. Ward. 2019. “Inferential Approaches for Network Analysis: AMEN for Latent Factor Models.” Political Analysis 27:208–22.

62. Nelson, James F. 1984. “The Dirichlet-Gamma-Poisson Model of Repeated Events.” Sociological Methods & Research 12:347–73.

63. Newman, M. E. J. 2002. “Assortative Mixing in Networks.” Physical Review Letters 89:208701. doi:10.1103/PhysRevLett.89.208701.

64. Palmer, John R. B., and Mariola Pytlikova. 2015. “Labor Market Laws and Intra-European Migration: The Role of the State in Shaping Destination Choices.” European Journal of Population 31:127–53.

65. Poot, Jacques, Omoniyi Alimi, Michael P. Cameron, and David C. Maré. 2016. “The Gravity Model of Migration: The Successful Comeback of an Ageing Superstar in Regional Science.” Investigaciones Regionales 2016:63–86.

66. Schwartz, Aba. 1973. “Interpreting the Effect of Distance on Migration.” Journal of Political Economy 81:1153–69.

67. Snijders, Tom A. B. 1996. “Stochastic Actor-Oriented Models for Network Change.” Journal of Mathematical Sociology 21:149–72.

68. Snijders, Tom A. B. 2001. “The Statistical Evaluation of Social Network Dynamics.” Sociological Methodology 31:361–95.

69. Snijders, Tom A. B., and Roel J. Bosker. 2011. Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling. Thousand Oaks, CA: Sage.

70. Stadtfeld, Christoph, and Per Block. 2017. “Interactions, Actors, and Time: Dynamic Network Actor Models for Relational Events.” Sociological Science 4:318–52.

71. Stadtfeld, Christoph, James Hollway, and Per Block. 2017a. “Dynamic Network Actor Models: Investigating Coordination Ties through Time.” Sociological Methodology 47:1–40.

72. Stadtfeld, Christoph, James Hollway, and Per Block. 2017b. “DyNAMs and the Grounds for Actor-Oriented Network Event Models: A Rejoinder to Snijders and Butts.” Sociological Methodology 47:56–67.

73. Strasser, Sabine. 2008. “Europe’s Other: Nationalism, Transnationals and Contested Images of Turkey in Austria.” European Societies 10:177–95.

74. Theil, Henri. 1969. “A Multinomial Extension of the Linear Logit Model.” International Economic Review 10:251–59.

75. Tierney, Luke. 1994. “Markov Chains for Exploring Posterior Distributions.” Annals of Statistics 22:1701–62.

76. Van Tubergen, Frank, Ineke Maas, and Henk Flap. 2004. “The Economic Incorporation of Immigrants in 18 Western Societies: Origin, Destination, and Community Effects.” American Sociological Review 69:704–27.

77. Westveld, Anton H., and Peter D. Hoff. 2011. “A Mixed Effects Model for Longitudinal Relational and Network Data, with Applications to International Trade and Conflict.” Annals of Applied Statistics 5:843–72.

78. Windzio, Michael. 2018. “The Network of Global Migration 1990–2013: Using ERGMs to Test Theories of Migration between Countries.” Social Networks 53:20–29.

79. Windzio, Michael, Céline Teney, and Sven Lenkewitz. 2021. “A Network Analysis of Intra-EU Migration Flows: How Regulatory Policies, Economic Inequalities and the Network-Topology Shape the Intra-EU Migration Space.” Journal of Ethnic and Migration Studies 47:951–69.

80. Yazgi, Burcin, Vedia Dokmeci, Kerem Koramaz, and Gulay Kiroglu. 2014. “Impact of Characteristics of Origin and Destination Provinces on Migration: 1995–2000.” European Planning Studies 22:1182–98.