
Abstract

Mediation analysis is increasingly used in the social sciences. Extension to social network data, however, has proved difficult because statistical network models are formulated at a lower level of analysis (the dyad) than many outcomes of interest. This study introduces a general approach for micro-macro mediation analysis in social networks. The author defines the average mediated micro effect (AMME) as the indirect effect of a network selection process on an individual, group, or organizational outcome through its effect on an intervening network variable. The author shows that the AMME can be nonparametrically identified using a wide range of common statistical network and regression modeling strategies under the assumption of conditional independence among multiple mediators. Nonparametric and parametric algorithms are introduced to generically estimate the AMME in a multitude of research designs. The author illustrates the utility of the method with an applied example using cross-sectional National Longitudinal Study of Adolescent to Adult Health data to examine the friendship selection mechanisms that indirectly shape adolescent school performance through their effect on network structure.

Keywords

networks, micro-macro, mediation, indirect effects, network selection

A large methodological literature develops mediation methods for observational and experimental data (Bollen and Stine 1992; Breen, Karlson, and Holm 2013; Imai, Keele, and Tingley 2010; Imai, Keele, and Yamamoto 2010; Karlson, Holm, and Breen 2012; Mackinnon 2008; Mize, Doan, and Long 2019; Pearl 2001; Sobel 1982, 2008). Mediation analysis involves identifying the intervening variables that explain the relationship between an explanatory variable and an outcome. It has gained increasing popularity in the social sciences as recent theoretical advances have provided clarity on the conditions necessary to identify, estimate, and interpret indirect effects (Pearl 2001). These advancements have contributed to a burgeoning body of research that uses mediation analysis to evaluate the mechanisms responsible for social outcomes, such as racial disparities in educational attainment (Zhou 2022), implicit racial biases (Melamed et al. 2019), and poverty (Desmond and Wilmers 2019).

Although social scientists often regard networks as an intervening variable in causal processes (e.g., An, Beauvile, and Rosche 2022; Chetty et al. 2022; DiMaggio and Garip 2012; Pedulla and Pager 2019), mediation methods for social networks are currently underdeveloped. Contemporary methods are limited to comparisons between two models where the same network acts as the dependent variable (Duxbury 2023).¹ Yet sociologists are often interested in how network selection affects individual, group, and organizational outcomes. Structuralist perspectives posit that networks exert contextual effects on social action (Burt 1992; Centola 2015; Coleman 1990; Granovetter 1973; Melamed, Harrell, and Simpson 2018; White 1992). To the extent that distinct selection mechanisms create unique network contexts, network selection is likely to indirectly influence individual, group, and organizational outcomes by altering network structure.

For example, Bearman, Moody, and Stovel (2004) motivated their analysis of adolescent dating networks by arguing that chain-link structures optimize sexually transmitted disease (STD) diffusion. This implies an indirect effect of relational dating norms (in their case, four-cycle avoidance) on individual STD risk and network-level STD diffusion because of a change in network topology. Schaefer, Kornienko, and Fox (2011) motivated their study of depression homophily by arguing that peer depression worsens mental health. Duxbury and Haynie (2020) examined how suspension can decrease school achievement by driving students into academically underperforming peer groups. Padgett and Ansell (1993) classically showed that the Medicis’ navigation into advantageous network positions enabled them to consolidate political power in fifteenth-century Florence. Each of these studies implies an indirect network selection effect on an individual or group outcome via network structure.

Although indirect network selection effects are often implicated in sociology, the lack of mediation methods for social networks has hampered statistical evaluation of indirect network selection effects on higher-order outcomes.² The main impediment to micro-macro network mediation analysis is that statistical network models are formulated at a lower level of analysis (the dyad) than the individual, group, and organizational outcomes that researchers often want to study. Because current mediation methods assume multiple models fit to the same unit of analysis, they cannot be used to address questions that implicate indirect effects of network selection processes on outcomes measured on higher level units.

In this study, I develop a general approach for micro-macro mediation analysis in social networks. In the framework, micro processes represent network selection effects that dictate whether two nodes are connected and are captured by the terms in a statistical network model. Here, “macro” structure encompasses all network statistics calculated above the dyad level, including node, subgraph, and global statistics. The approach allows researchers to estimate, test, and interpret the indirect effect of a network selection process on an individual, group, or organizational outcome acting through an intervening network variable.

I begin by outlining prior approaches to mediation analysis in regression. I then discuss three problems in social networks: the unit of analysis problem, interdependence, and posttreatment confounders. Next, I introduce the average mediated micro effect (AMME) and present identification results for the AMME under the assumption of conditional independence among multiple posttreatment confounders. I then introduce parametric and nonparametric algorithms for estimating the AMME. I conclude by providing an example using cross-sectional National Longitudinal Study of Adolescent to Adult Health (Add Health) data. The mediation approach is implemented in the netmediate R package available on the Comprehensive R Archive Network.

Prior Approaches to Mediation Analysis

Following contemporary mediation research (Imai, Keele, and Yamamoto 2010; Pearl 2001; Sobel 2008), I use notation and language common to the potential outcomes framework.³ Suppose we have a random sample of size n. Let $T_{i}$ be a binary treatment (explanatory) variable,⁴ $M_{i}$ be a mediating variable, and $Y_{i}$ be an outcome of interest for unit i. We write $M_{i} (t)$ to represent the potential mediator value under the treatment status t = 0,1 where the observed value $M_{i}$ equals the potential value of the mediator under the observed data $M_{i} (T_{i})$ . Similarly, we use $Y_{i} (t, m)$ to denote the potential outcome under the treatment status and the mediator value m where $Y_{i} = Y_{i} (T_{i}, M_{i} (T_{i}))$ is the value of the outcome under the observed data.

The indirect effect of $T_{i}$ for unit i given the treatment status t is defined as (Pearl 2001)

δ_{i} (t) \equiv Y_{i} (t, M_{i} (1)) - Y_{i} (t, M_{i} (0)) .

(1)

This represents the change in the outcome that can be attributed to a treatment induced change in a mediator when the treatment is held constant and the mediator is changed from $M_{i} (0)$ to $M_{i} (1)$ . The unit-level direct effect of the treatment is

ζ_{i} (t) \equiv Y_{i} (1, M_{i} (t)) - Y_{i} (0, M_{i} (t)),

(2)

which is the effect of the treatment on the outcome when the treatment is changed from 0 to 1 and the mediator is constant. The sum of the direct and indirect effects equals the total effect:

τ_{i} \equiv Y_{i} (1, M_{i} (1)) - Y_{i} (0, M_{i} (0)) \equiv δ_{i} (t) + ζ_{i} (1 - t) .

(3)

Given the unit-level quantities of interest, we define the population average effect for each quantity:

\bar{δ} (t) = E (δ_{i} (t)), \bar{ζ} (t) = E (ζ_{i} (t)), \bar{τ} = E (τ_{i}) .

(4)
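As a quick numerical check of equations (1) through (3), the following Python sketch evaluates the unit-level quantities for a single hypothetical unit. The functions `M` and `Y` and all coefficients are illustrative assumptions, not quantities from this study. The treatment-by-mediator interaction is included only to show why the decomposition pairs $δ_{i} (t)$ with $ζ_{i} (1 - t)$.

```python
# Toy potential-outcome functions for one unit (illustrative assumptions only).
def M(t):
    """Potential mediator value under treatment status t."""
    return 1.0 + 2.0 * t

def Y(t, m):
    """Potential outcome under treatment t and mediator value m."""
    return 0.5 + 1.5 * t + 3.0 * m + 0.5 * t * m

def indirect(t):
    # Equation (1): move the mediator from M(0) to M(1), hold the treatment at t
    return Y(t, M(1)) - Y(t, M(0))

def direct(t):
    # Equation (2): move the treatment from 0 to 1, hold the mediator at M(t)
    return Y(1, M(t)) - Y(0, M(t))

# Equation (3): total effect
total = Y(1, M(1)) - Y(0, M(0))

for t in (0, 1):
    # The total effect equals delta_i(t) + zeta_i(1 - t) for either value of t
    assert abs(total - (indirect(t) + direct(1 - t))) < 1e-12

print("indirect:", indirect(0), indirect(1))
print("direct:", direct(0), direct(1))
print("total:", total)
```

With an interaction present, $δ_{i} (0)$ and $δ_{i} (1)$ differ, so the indirect and direct effects must be evaluated at complementary treatment values for the sum to recover the total effect.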

The goal of mediation analysis is to decompose the total effect into direct and indirect effects. The conventional approach involves fitting two linear regressions separately (e.g., Bollen and Stine 1992; Sobel 1982):

M_{i} = β_{1} T_{i} + ξ_{1} X_{i} + ε_{i 1}

(5)

and

Y_{i} = β_{2} T_{i} + γ M_{i} + ξ_{2} X_{i} + ε_{i 2},

(6)

where $X_{i}$ represents a vector of pretreatment confounders. After fitting both models, the researcher computes the product of coefficients, $β_{1} γ$ , which identifies $\bar{δ} (t)$ under the sequential ignorability assumption—that is, conditional independence among M, T, and X—and no interactions between M and pretreatment confounders (Imai, Keele, and Tingley 2010).⁵
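The product-of-coefficients calculation can be sketched on simulated data. The example below is a minimal illustration that assumes a small hand-written least-squares helper (`ols`, hypothetical) and made-up data-generating values ($β_{1} = 2$, $γ = 3$); it fits equations (5) and (6) without intercepts, as written above.

```python
import random

def ols(rows, y):
    """Least-squares coefficients via the normal equations (no intercept)."""
    k = len(rows[0])
    # Augmented system [X'X | X'y]
    A = [[sum(r[a] * r[b] for r in rows) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(rows, y))]
         for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Simulated data (assumed values, for illustration only)
rng = random.Random(0)
n = 500
T = [i % 2 for i in range(n)]                       # binary treatment
X = [rng.gauss(0, 1) for _ in range(n)]             # pretreatment confounder
M = [2.0 * t + 0.5 * x + rng.gauss(0, 0.5) for t, x in zip(T, X)]
Y = [1.0 * t + 3.0 * m + 0.2 * x + rng.gauss(0, 0.5)
     for t, m, x in zip(T, M, X)]

# Equation (5): mediator model; equation (6): outcome model
beta1, xi1 = ols([[t, x] for t, x in zip(T, X)], M)
beta2, gamma, xi2 = ols([[t, m, x] for t, m, x in zip(T, M, X)], Y)

indirect = beta1 * gamma    # product of coefficients
print(round(beta1, 3), round(gamma, 3), round(indirect, 3))
```

Because both regressions are fit to the same n units, the two coefficients are directly comparable. That is the property that fails when one model is fit to dyads and the other to nodes.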

Formally, the sequential ignorability assumption holds that

{Y_{i} (t, m), M_{i} (t')} ⨿ T_{i} | X_{i} = x,

(7)

and

Y_{i} (t', m) ⨿ M_{i} | T_{i} = t, X_{i} = x,

(8)

where the notation $t'$ captures the counterfactual value of the treatment variable when $T_{i} = t$ , that is, $t'$ = 0 when t = 1 and $t'$ = 1 when t = 0. Recent studies have used the sequential ignorability assumption to develop estimation procedures for nonlinear outcome variables (Cheng et al. 2018; Duxbury 2023; Imai, Keele, and Yamamoto 2010; Karlson et al. 2012; Mize et al. 2019), posttreatment confounders (Acharya, Blackwell, and Sen 2018; Imai and Yamamoto 2013; Zhou 2022), and sensitivity analysis (Cheng et al. 2018; Imai, Keele, and Tingley 2010). Like the product of coefficients, each of these strategies assumes two models fit to the same data with the same unit of analysis.

Definition of the Problem and Issues with Prior Approaches

The conventional approach to mediation analysis encounters three problems when applied to indirect network selection effects. First, prior approaches assume the mediating variable, outcome, and treatment are measured on the same unit of analysis. Second, conventional strategies assume no posttreatment confounding. Third, current methods assume independent observations. I graphically introduce the direct and indirect pathways I aim to disentangle before elaborating on each problem in the following subsections.

Direct and Indirect Micro-Macro Pathways

Figure 1 outlines the direct and indirect micro-macro relationships that we seek to disentangle. As above, I use T, M, Y, and X to denote the explanatory variable, mediating variable, outcome variable of interest, and confounding pathways. However, now T represents a micro-level network selection process. By micro-level, I refer to network selection processes that determine whether two nodes are connected, such as reciprocity, triadic closure, or homophily, and that are commonly represented by the parametrized terms in statistical network models, such as the exponential random graph model (ERGM) or stochastic actor-oriented model (SAOM). These processes are often represented by dyadic attributes, but they may also be captured by nodal or contextual variables. For example, in friendship networks, sociality effects refer to node characteristics that determine whether actors are more or less likely to send or receive friendship ties compared with other actors with different attributes (e.g., girls tend to have more friends, on average, than boys) (see Goodreau, Kitts, and Morris 2009; Robins, Elliot, and Pattison 2001). Figure 1 also incorporates the micro-level pretreatment confounders Z that provide alternative network selection mechanisms.

Figure 1.

Model of direct and indirect micro-macro network effects.

M represents the macro-level mediating network variable focal to the analysis. M is a node, subgraph, or network-level measure. For example, M may represent a node measure like betweenness centrality, a subgraph measure like community membership, or a global measure like transitivity. Y is the outcome variable measured at the node, subgraph, or network level. In an in-school friendship network, Y may represent a student measure like grade point average (GPA), a group measure such as the percentage of same-race ties within a grade level, or a school measure such as the student-to-teacher ratio. Y and M may be measured on the same unit of analysis or Y may be measured at a lower unit of analysis. This may occur when there is a contextual network effect on a node outcome, such as the effect of in-school friendship network density on student delinquency (Kreager, Rulison, and Moody 2011). X represents a vector of posttreatment confounders, or macro statistics that confound the effect of a mediator on the outcome and that may also be affected by the explanatory network selection process of interest. Posttreatment confounders are the intervening variables not focal to the analysis, and mediators are the intervening variables of substantive interest.⁶

The research goal is to identify the indirect effect of T on Y acting through M, which I refer to as the AMME. The AMME is captured by the effect of T on Y that arises indirectly because of a change in M. For example, if we are interested in student school performance, we might posit a peer effect of alters’ GPAs on student GPA. A selection process, such as triadic closure, may indirectly affect GPA by increasing or decreasing GPA segregation in the network at large. Our goal would be to identify the indirect effect of triadic closure on student GPA because of a change in peer GPA (see Figure 2).

Figure 2.

Indirect selection effect of triadic closure on student grade point average (GPA) operating through GPA segregation.

Identification Problems in Network Mediation Settings

The first identification problem lies in the unit of analysis. Network selection processes are relational, and thus are captured by treating dyads as the unit of analysis in statistical network models (Butts 2008; Frank and Strauss 1986; Holland and Leinhardt 1981; Krackhardt 1987; Snijders 2001; Stadtfeld, Hollway, and Block 2017). However, many outcome variables of interest represent attributes of the individuals, groups, or organizations nested in social networks. This means the effect of T on M cannot be captured using separate models treating M and Y as outcome variables. The coefficients from such models are not comparable. Approaches that rely on models fit to the same data therefore cannot be used to identify indirect network selection effects.

The second problem relates to posttreatment confounding. Conventional mediation methods assume no posttreatment confounders. This permits the analyst to estimate the indirect effect by increasing the value of the treatment without worrying that an alternative mediator explains the indirect pathway. The assumption of no posttreatment confounding is usually violated in network mediation settings. Changing a treatment value for a selection process necessitates a change in the complete network, and thus a change in multiple network statistics. For example, if we are interested in the indirect effect of triadic closure on GPA acting through peer GPA, we must account for the confounding effect of transitivity because triadic closure necessitates an increase in transitivity. This means the assumption of no posttreatment confounding cannot be supported in many network mediation designs.
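A small numerical illustration makes the posttreatment confounding problem concrete. The sketch below uses a hypothetical four-student network with made-up GPAs; the helper names (`neighbors`, `transitivity`, `mean_peer_gpa`) are assumptions introduced for this example. Adding a single tie that closes an open triad changes both global transitivity and a student's mean peer GPA.

```python
from itertools import combinations

def neighbors(edges, n):
    """Adjacency sets for an undirected edge list."""
    nb = {i: set() for i in range(n)}
    for i, j in edges:
        nb[i].add(j)
        nb[j].add(i)
    return nb

def transitivity(edges, n):
    """Share of connected triples (two-paths) that are closed."""
    nb = neighbors(edges, n)
    closed = triples = 0
    for v in range(n):
        for a, b in combinations(sorted(nb[v]), 2):
            triples += 1
            closed += b in nb[a]
    return closed / triples if triples else 0.0

def mean_peer_gpa(edges, n, gpa, i):
    """Average GPA of node i's friends (NaN for isolates)."""
    nb = neighbors(edges, n)[i]
    return sum(gpa[j] for j in nb) / len(nb) if nb else float("nan")

gpa = [3.8, 3.6, 2.0, 2.4]                  # made-up GPAs
before = {(0, 1), (1, 2), (2, 3)}
after = before | {(0, 2)}                   # tie 0-2 closes the open triad 0-1-2

for label, edges in (("before", before), ("after", after)):
    print(label,
          "transitivity:", round(transitivity(edges, 4), 3),
          "peer GPA of student 0:", round(mean_peer_gpa(edges, 4, gpa, 0), 3))
```

Both statistics move in response to the same tie change, so an analysis of the peer GPA pathway that ignores transitivity would attribute part of the transitivity pathway to peer GPA.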

The third problem relates to nonindependence. Potential outcomes used in prior approaches are defined as the value obtained by the outcome when the mediator obtains a value implied by a prespecified treatment value. These potential outcomes assume independence among observations. They require that the value of an explanatory variable for an observation can be altered without changing the value of other variables for other observations. Network data violate this assumption because of relational dependencies between data points (for a discussion, see An 2018; VanderWeele and An 2013). For example, GPA segregation in school friendship networks is a function of the joint distribution of GPA and ties. Increasing triadic closure for a randomly chosen dyad thus breaks the covariance between GPA and friendship ties in the network at large. Yet if triadic closure produces GPA segregation, it is because the students who tend to befriend one another through triadic closure tend to have similar GPAs. This means the potential outcomes used to define conventional indirect effects do not adequately capture the relational dependencies of central interest in network mediation analysis.⁷

The AMME

With the above problems in mind, I define an indirect network selection effect that capitalizes on the distribution of attributes and network dependencies in the observed data. I then show that the AMME can be nonparametrically identified⁸ under the sequential ignorability assumption with multiple conditionally independent posttreatment confounders. This approach assumes that the outcome of interest and micro processes are measured at different units of analysis but that the micro units are “nested in” the macro units, such that the two data sets can be linked by a common identifier.

Micro-Macro Direct, Indirect, and Total Effects

Let i index the unit of analysis for the outcome variable. $M_{i}$ is the value of the macro mediator, $X_{i}$ is the value of the alternative macro mediators, and $Y_{i}$ is the value of the outcome. I use i for both mediators and outcomes to simplify notation, but note that M, X, and Y are not necessarily measured at the same unit of analysis. M and X may be measures of alters’ characteristics or contextual features of the network (e.g., density), and Y may be measured on the same unit as X and M or on a lower level unit. M, X, and Y are assumed to be measured at a higher level of analysis than the dyad. I index dyads with ij, such that $T_{ij}$ is the micro process of interest and $Z_{ij}$ is the vector of pretreatment confounders.⁹

The potential value of $X_{i}$ is assumed not to be a function of $M_{i}$ and thus the observed values of the mediators can be written $X_{i} (T_{ij})$ and $M_{i} (T_{ij})$. The potential outcomes of $Y_{i}$, however, depend on both $M_{i}$ and $X_{i}$ and may be directly affected by T. The observed value of $Y_{i}$ equals $Y_{i} (T_{i} (T_{ij}), M_{i} (T_{ij}), X_{i} (T_{ij}))$, where $T_{i} = T_{i} (T_{ij})$ is the unit-level measure of T under the observed distribution of $T_{ij}$. I emphasize that, although I focus on outcomes and mediators measured on the i units (i.e., the nodes), the approach can be used for any higher level of analysis so long as the macro units can be linked to the micro data.

Prior micro-macro network methods evaluate the contributions of network selection processes to network structure by comparing observed values to counterfactual conditions where a micro process is set to 0 (Block 2023; Duxbury forthcoming; Huang and Butts 2023; Indlekofer and Brandes 2013; McMillan, Kreager, and Veenstra 2022; Robins, Pattison, and Woolcock 2005). The logic behind this approach is that by “switching off” the contributions of a micro process holding all else constant, the change in a macro statistic can only be attributed to the selection process of interest. Extending this procedure to mediation analysis, I compare the potential outcomes in which a micro process makes no contribution to the potential outcomes expressed in the observed data. The advantage of this approach is that it preserves covariance between the ties and the distribution of node, edge, and dyad attributes in the observed data.¹⁰,¹¹

Under this setup, we can define two types of micro-macro indirect effects, one with respect to $M_{i}$ and one with respect to $X_{i}$:

ω_{i}^{M} (t) \equiv Y_{i} (T_{i} (t), M_{i} (T_{ij}), X_{i} (t)) - Y_{i} (T_{i} (t), M_{i} (0), X_{i} (t)),

(9)

and

ω_{i}^{X} (t) \equiv Y_{i} (T_{i} (t), M_{i} (t), X_{i} (T_{ij})) - Y_{i} (T_{i} (t), M_{i} (t), X_{i} (0)),

(10)

for t = 0, $T_{ij}$ . $ω_{i}^{M} (t)$ represents the unit-level indirect effect of the treatment on the outcome through the mediator $M_{i}$ while holding the treatment at t and the other mediators at the values that would be realized under the same treatment status. It captures the change in $Y_{i}$ that results from a change in $M_{i}$ when the contributions of $T_{ij}$ to $M_{i}$ are eliminated and other mediators are allowed to vary. $ω_{i}^{X} (t)$ represents an identical value for $X_{i} .$ For example, if $M_{i}$ captures peer GPA and $X_{i}$ captures transitivity, $ω_{i}^{M} (t)$ captures the portion of student GPA that can be explained as a function of triadic closure operating through peer GPA, and $ω_{i}^{X} (t)$ captures the portion of the triadic closure indirect selection effect that operates through transitivity.

The indirect effects defined in equations (9) and (10) capture the change in the potential outcome for i due to a broader change in network structure. They compare the observed outcome to the potential outcome that arises when an explanatory selection process makes no contribution. A key utility of these indirect effects is that they capture network spillover. To see this, note that setting $T_{ij}$ to 0 affects the network statistics calculated on i, j, and the connections of both i and j. For example, if triadic closure tends to concentrate students of similar GPA in the same friendship circles, triadic closure will not only increase GPA among high-achieving students through peer effects but can potentially reduce GPA among other students by limiting their connections to high-GPA peers. Thus, although equations (9) and (10) are defined for a focal unit, the unit-level quantities account for the selection decisions of other nodes.¹²

As in the conventional case, I capture the population average effect (i.e., the AMME) as ${\bar{ω}}_{i}^{M} (t) \equiv E (ω_{i}^{M} (t))$ or ${\bar{ω}}_{i}^{X} (t) \equiv E (ω_{i}^{X} (t))$ by averaging over the target population. I emphasize that both $ω_{i}^{M} (t)$ and $ω_{i}^{X} (t)$ are counterfactual quantities. For example, $Y_{i} (T_{i} (t), M_{i} (t'), X_{i} (t))$ cannot be observed directly unless $M_{i} (t') = M_{i} (t)$.

In some cases, T may exert a direct effect. For example, if T represents student gender, then gender may shape Y indirectly by shaping network structure as well as directly due to gendered expectations about school performance. The unit-level direct effect can be defined as

ζ_{i} (t, t') \equiv Y_{i} (T_{i} (T_{ij}), M_{i} (t), X_{i} (t')) - Y_{i} (0, M_{i} (t), X_{i} (t'))

(11)

for each t, $t'$ = 0, $T_{ij}$. In the case of student GPA, this represents the portion of student GPA explained by the direct effect of gender. The population average direct effect is ${\bar{ζ}}_{i} (t, t') \equiv E (ζ_{i} (t, t'))$. The total effect is

τ_{i} = Y_{i} (T_{i} (T_{ij}), M_{i} (T_{ij}), X_{i} (T_{ij})) - Y_{i} (0, M_{i} (0), X_{i} (0))

(12)

and can be calculated as the sum of the direct and indirect effects. This represents the effect of T on Y operating through all direct and indirect pathways. The percentage explained for direct and indirect effects can be calculated with reference to $τ_{i}$ . The percentage explained for the indirect and direct effects, respectively, are

{\bar{ω}}_{i}^{M} (t) / {\bar{τ}}_{i} \times 100, {\bar{ω}}_{i}^{X} (t) / {\bar{τ}}_{i} \times 100, {\bar{ζ}}_{i} (t, t') / {\bar{τ}}_{i} \times 100

(13)
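The decomposition in equations (9) through (13) can be illustrated with a hypothetical linear outcome model. In the Python sketch below, the function `Y`, its coefficients, and the mediator values for three units are all made up; t and $t'$ are set to the observed $T_{ij}$. The exact additivity shown here relies on the toy model having no interactions.

```python
from statistics import mean

def Y(t_i, m, x):
    """Hypothetical linear outcome model with no interactions."""
    return 2.0 + 0.4 * t_i + 0.5 * m + 0.3 * x

# Each tuple: (T_i, M_i(T_ij), M_i(0), X_i(T_ij), X_i(0)); values are made up.
units = [
    (1, 3.2, 3.0, 0.6, 0.2),
    (0, 2.5, 2.7, 0.4, 0.1),
    (1, 3.6, 3.1, 0.5, 0.3),
]

# Equation (9): switch off the contribution of T_ij to M_i only
omega_M = mean(Y(t, m1, x1) - Y(t, m0, x1) for t, m1, m0, x1, x0 in units)
# Equation (10): switch off the contribution of T_ij to X_i only
omega_X = mean(Y(t, m1, x1) - Y(t, m1, x0) for t, m1, m0, x1, x0 in units)
# Equation (11): switch off the unit-level treatment, hold both mediators fixed
zeta = mean(Y(t, m1, x1) - Y(0, m1, x1) for t, m1, m0, x1, x0 in units)
# Equation (12): switch off every pathway at once
tau = mean(Y(t, m1, x1) - Y(0, m0, x0) for t, m1, m0, x1, x0 in units)

# Equation (13): percentage of the total effect explained by each pathway
pct = {name: 100 * value / tau
       for name, value in (("omega_M", omega_M),
                           ("omega_X", omega_X),
                           ("zeta", zeta))}
print({k: round(v, 1) for k, v in pct.items()})
```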

Thus far, I have assumed a continuous outcome variable. I now extend the AMME to binary outcomes. I replace the observed outcome $Y_{i}$ with the probability of observing a positive case for the dependent variable $\Pr (Y_{i} = 1)$ , which connects to $Y_{i}$ through a link function $g (Y_{i})$ . I define the indirect effect as follows:

ω_{i}^{M} (t) \equiv g (Y_{i} (T_{i} (t), M_{i} (T_{ij}), X_{i} (t))) - g (Y_{i} (T_{i} (t), M_{i} (0), X_{i} (t))) .

(14)

In this case, the AMME is the change in the probability of realizing a positive case of the outcome variable because of the indirect effect of a network selection process operating through an intervening network variable. When $g (Y_{i})$ is the identity function, this expression reduces to equation (9).
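For a binary outcome, the calculation can be sketched by reading $g (\cdot)$ as the map from the linear predictor to the probability scale, so that the contrast is a difference in predicted probabilities. The logistic coefficients in `BETA` and the unit values below are hypothetical.

```python
import math
from statistics import mean

def inv_logit(eta):
    """Map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical logistic regression coefficients: intercept, T_i, M_i, X_i
BETA = (-1.0, 0.3, 0.8, 0.5)

def prob_y(t_i, m, x):
    """Pr(Y_i = 1) under the assumed logistic outcome model."""
    b0, bT, bM, bX = BETA
    return inv_logit(b0 + bT * t_i + bM * m + bX * x)

# Each tuple: (T_i, M_i(T_ij), M_i(0), X_i(T_ij)); values are made up.
units = [(1, 1.2, 0.4, 0.5), (0, 0.9, 0.6, 0.2), (1, 1.5, 0.7, 0.8)]

# Equation (14), averaged over units: change in Pr(Y_i = 1) when the
# contribution of T_ij to M_i is switched off
amme_prob = mean(prob_y(t, m1, x) - prob_y(t, m0, x)
                 for t, m1, m0, x in units)
print(round(amme_prob, 4))
```

Unlike the linear case, the size of the effect on the probability scale depends on where each unit sits on the logistic curve, so the same change in the mediator produces different unit-level effects.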

Identification Result

I now demonstrate that the AMME can be nonparametrically identified with the observed data under sequential ignorability with multiple conditionally independent mediators.

Assumption 1: Sequential Ignorability with Multiple Conditionally Independent Mediators

I assume that the following three conditions hold:

{Y_{i} (T_{i} (t), m, x), M_{i} (t'), X_{i} (t')} ⨿ T_{ij} | Z_{ij} = z,

(15)

Y_{i} (T_{i} (t'), m, X_{i} (t')) ⨿ M_{i} | T_{ij} = t, Z_{ij} = z

(16)

and

Y_{i} (T_{i} (t'), M_{i} (t'), x) ⨿ X_{i} | T_{ij} = t, Z_{ij} = z

(17)

where $0 < \Pr (T_{ij} = t, Z_{ij} = z)$ and $0 < \Pr (M_{i} = m, X_{i} = x | T_{ij} = t, Z_{ij} = z)$ for any z, t, $t'$ , x.

This assumption holds conditional independence for (1) the effect of T on M, (2) the effect of M on Y, (3) the effect of X on Y, (4) the effect of T on X, and (5) the effect of T on Y.

The assumption of sequential ignorability is a strong one and can be violated in two common ways. The first is because of unmeasured confounding variables on any pathway. The second is reverse causation, for example, when Y has an effect on T. This possibility is common in cross-sectional analyses of peer effects where observed levels of network autocorrelation generically blend processes of selection and influence. Because these possibilities are well studied and occur routinely in network analysis, not just mediation analysis (An 2015a; An et al. 2022; Centola 2010; Shalizi and Thomas 2011; Steglich et al. 2010), I do not belabor the point here. The main takeaway is that sequential ignorability does not hold in research settings where the direction of influence and the possibility of omitted variables are not adequately addressed by the modeling strategy chosen by the researcher.

Under sequential ignorability, ${\bar{ω}}_{i}^{M} (t)$ can be identified as (see the Appendix for the derivation)

{\bar{ω}}_{i}^{M} (t) = \int \int E (Y_{i} | M_{i} = m, T_{i j} = t, Z_{i j} = z) {d F_{M_{i} | T_{i j} = 1, Z_{i j} = z} (m) - d F_{M_{i} | T_{i j} = 0, Z_{i j} = z} (m)} d F_{Z_{i j}} (z),

(18)

for any t, $t'$ = 0, 1, m, $m'$, and x. The primary implication of equation (18) is that, even though calculation of the AMME requires counterfactual values, those counterfactuals can be expressed as a function of the observed data. The derivation is general and can accommodate nonbinary t (see the Appendix). It therefore applies to any real-valued $T_{ij}$. The same identification result can be obtained for ${\bar{ω}}_{i}^{X} (t)$ by substituting $X_{i}$ for $M_{i}$. Furthermore, because ${\bar{ζ}}_{i} (t, t') = {\bar{τ}}_{i} - {\bar{ω}}_{i}^{X} (t) - {\bar{ω}}_{i}^{M} (t)$, and because ${\bar{τ}}_{i}$ is identified under equation (15), ${\bar{ζ}}_{i} (t, t')$ is identified as well. Thus, even though the potential outcomes necessary to identify the AMME are not observed directly, they can be obtained using the observed data. This provides the basis of the estimation algorithm, which enables generic estimation of ${\bar{ω}}_{i}^{M} (t)$, ${\bar{ω}}_{i}^{X} (t)$, ${\bar{ζ}}_{i} (t, t')$, and ${\bar{τ}}_{i}$ in a wide range of models.
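When the mediator and the confounder are discrete, the integrals in equation (18) reduce to sums, and a plug-in version can be computed from cell means and cell proportions. The sketch below is a deliberate simplification: it treats every record as a single unit with its own treatment value, setting aside the dyad-versus-node distinction, and the function name `amme_plugin` and the data are hypothetical.

```python
from collections import Counter

def amme_plugin(records, t):
    """Plug-in analogue of equation (18) for discrete M and Z.

    records: iterable of (z, t, m, y) tuples.
    """
    cnt, ysum = Counter(), Counter()
    for z, tt, m, y in records:
        cnt[(m, tt, z)] += 1
        ysum[(m, tt, z)] += y
    m_values = {m for m, _, _ in cnt}
    z_values = {z for _, _, z in cnt}
    n = sum(cnt.values())
    total = 0.0
    for z in z_values:
        n_z = sum(c for (_, _, zz), c in cnt.items() if zz == z)
        n_1z = sum(cnt[(m, 1, z)] for m in m_values)
        n_0z = sum(cnt[(m, 0, z)] for m in m_values)
        if n_1z == 0 or n_0z == 0:
            raise ValueError("positivity fails: a treatment arm is empty")
        inner = 0.0
        for m in m_values:
            p1 = cnt[(m, 1, z)] / n_1z      # Pr(M = m | T = 1, Z = z)
            p0 = cnt[(m, 0, z)] / n_0z      # Pr(M = m | T = 0, Z = z)
            if p1 == p0:
                continue                    # this mediator value contributes 0
            if cnt[(m, t, z)] == 0:
                raise ValueError("positivity fails: empty (m, t, z) cell")
            e_y = ysum[(m, t, z)] / cnt[(m, t, z)]   # E(Y | M = m, T = t, Z = z)
            inner += e_y * (p1 - p0)
        total += inner * n_z / n            # weight by Pr(Z = z)
    return total

# Made-up data: (z, t) -> observed mediator values in that cell
cells = {(0, 1): [1, 1, 1, 0], (0, 0): [1, 0, 0, 0],
         (1, 1): [1, 1, 0, 0], (1, 0): [1, 0, 0, 0]}
records = [(z, t, m, 1 + 2 * m + 0.5 * t + z)
           for (z, t), ms in cells.items() for m in ms]

print(amme_plugin(records, t=1), amme_plugin(records, t=0))
```

The `ValueError` branches correspond to the positivity conditions stated with Assumption 1: every conditional expectation and conditional distribution in the formula must be estimable from the observed data.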

Estimation

I now introduce parametric and nonparametric algorithms for estimating the AMME. The procedures build on bootstrap and Monte Carlo algorithms commonly used in computational statistics (Duxbury forthcoming; Imai, Keele, and Yamamoto 2010; King, Tomz, and Wittenberg 2000).

Model Requirements

As in conventional mediation analysis, we require two models to estimate the AMME: a statistical model for a complete network and a statistical model for the outcome of interest. Denote the complete network with A such that $A_{ij}$ captures the individual cells in the adjacency matrix. The statistical network model is assumed to be fit at the dyad level, and the model for the outcome of interest may be fit at any higher level of analysis so long as the two data sets can be linked by a common identifier. I represent both models using general notation to allow either nonparametric (e.g., quadratic assignment) or parametric modeling strategies:

Model 1 : f (A_{ij} | T_{ij}, Z_{ij})

(19)

and

Model 2 : g (Y_{i} | M_{i}, X_{i}, T_{i}) .¹³

(20)

Model 1 is a statistical model for the network A, which may be either binary or valued. A may be a single network, a stacked adjacency matrix of multiple networks, or repeated measures of the same network over time. For example, if model 1 is an ERGM, then $f (A_{ij} | T_{ij}, Z_{ij}) = \log (\frac{\Pr (A_{ij} = 1 | A_{- ij}, T_{ij}, Z_{ij})}{\Pr (A_{ij} = 0 | A_{- ij}, T_{ij}, Z_{ij})})$, where T and Z may contain exogenous variables or endogenous change statistics when an ij tie changes from 0 to 1 and $A_{- ij}$ indicates the rest of the network is unchanged. If model 1 is an SAOM with a fixed rate function, then $f (A_{ij} | T_{ij}, Z_{ij})$ gives the transition probabilities $\frac{1}{n} \cdot \frac{\exp (h (T_{ij}, Z_{ij}))}{\sum \exp (h (T_{ij}, Z_{ij})')}$ for n nodes with objective function $h (T_{ij}, Z_{ij})$ and $h (T_{ij}, Z_{ij})'$ is the objective function for all other possible tie changes. Model 2 represents the effect of M, X, and T on Y. For example, if $g (.)$ is a logit function, then $g (Y_{i} | M_{i}, X_{i}, T_{i})$ is a logistic regression of the form $\log (\frac{\Pr (Y_{i} = 1 | M_{i}, X_{i}, T_{i})}{\Pr (Y_{i} = 0 | M_{i}, X_{i}, T_{i})})$. Note that T appears in model 2 only when it is possible for T to have both a direct and indirect effect on the outcome, such as in the case of node covariates.
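The two network model forms can be made concrete with a short Python sketch. It assumes the standard result that the ERGM conditional log-odds of a tie is the inner product of the coefficients and the change statistics; the function names and all numeric values are hypothetical.

```python
import math

def ergm_tie_probability(theta, change_stats):
    """Pr(A_ij = 1 | rest of network) from coefficients and change statistics."""
    log_odds = sum(th * d for th, d in zip(theta, change_stats))
    return 1.0 / (1.0 + math.exp(-log_odds))

def saom_transition_probs(objective_values, n):
    """(1/n) times a softmax over the objective function of each tie change."""
    mx = max(objective_values)                  # subtract the max for stability
    w = [math.exp(h - mx) for h in objective_values]
    s = sum(w)
    return [wi / s / n for wi in w]

# Hypothetical ERGM coefficients: edges, triadic closure (T), homophily (Z)
theta = (-2.0, 0.8, 0.5)
p_observed = ergm_tie_probability(theta, (1, 2, 1))    # two shared partners
p_switched_off = ergm_tie_probability(theta, (1, 0, 1))  # T contribution set to 0

# Hypothetical objective function values for one actor's candidate tie changes
probs = saom_transition_probs([0.2, -0.5, 1.1, 0.0], n=4)

print(round(p_observed, 3), round(p_switched_off, 3))
print([round(p, 3) for p in probs])
```

Setting the change statistic for T to 0 while leaving the other terms untouched is the dyad-level analogue of the "switching off" logic used to define the counterfactual networks.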

I emphasize again that this approach can accommodate contextual selection effects. For example, consider the case where network selection is modeled using hierarchical ERGM or a meta-regression of lower level network models with contextual covariates (e.g., An 2015b; Schweinberger and Handcock 2015; Slaughter and Koehly 2016; Snijders and Baerveldt 2003). The contextual effects are now captured by model 1. This means the AMME for the contextual effect can be identified under the conditions outlined above. In this instance, the AMME assesses whether the contextual covariate has an indirect effect on an outcome of interest because actors are more likely to select into specific network structures in some social contexts compared with others.

Algorithm

The algorithms proceed under the general logic that the AMME can be estimated using uncertainty in each model to create a distribution of mediating networks and outcome values. In the parametric algorithm, uncertainty is captured by assuming a multivariate normal distribution for the model parameters and using the covariance matrix of the estimator to approximate the variance of the parameter distribution. In the nonparametric algorithm, this is accomplished via bootstrapping. Tables 1 and 2 provide pseudocode for each algorithm.

Table 1.

Algorithm 1: Parametric Estimation Algorithm.

Input: $f (A_{ij} | T_{ij}, Z_{ij})$, $g (Y_{i} | M_{i}, X_{i}, T_{i})$, $P$, $t = [0, T_{ij}]$
Output: ${\hat{\bar{ω}}}^{M} (t)$

sample $f^{p} (A_{ij} | T_{ij}, Z_{ij}) \sim MVN (\hat{θ}, \hat{Ω} (\hat{θ}))$ P times
sample $g^{p} (Y_{i} | M_{i}, X_{i}, T_{i}) \sim MVN (\hat{β}, \hat{Ω} (\hat{β}))$ P times

for p = 1, 2, . . ., P do
    draw $A^{p} (T_{ij}, Z_{ij})$ from $f^{p} (A_{ij} | T_{ij}, Z_{ij})$
    draw $A^{p} (0, Z_{ij})$ from $f^{p} (A_{ij} | 0, Z_{ij})$
    calculate $M_{i}^{p} (T_{ij})$ and $X_{i}^{p} (T_{ij})$ using $A^{p} (T_{ij}, Z_{ij})$
    calculate $M_{i}^{p} (0)$ and $X_{i}^{p} (0)$ using $A^{p} (0, Z_{ij})$
    draw $Y_{i}^{p} (T_{i} (t), M_{i}^{p} (T_{ij}), X_{i}^{p} (t))$ from $g^{p} (Y_{i} | M_{i}^{p} (T_{ij}), X_{i}^{p} (t), T_{i} (t))$
    draw $Y_{i}^{p} (T_{i} (t), M_{i}^{p} (0), X_{i}^{p} (t))$ from $g^{p} (Y_{i} | M_{i}^{p} (0), X_{i}^{p} (t), T_{i} (t))$
    calculate ${\hat{\bar{ω}}}^{Mp} (t) = \frac{1}{2 n} \sum [Y_{i}^{p} (T_{i} (t), M_{i}^{p} (T_{ij}), X_{i}^{p} (t)) - Y_{i}^{p} (T_{i} (t), M_{i}^{p} (0), X_{i}^{p} (t))]$
end for

calculate ${\hat{\bar{ω}}}^{M} (t) = \sum_{p} \frac{{\hat{\bar{ω}}}^{Mp} (t)}{P}$
return ${\hat{\bar{ω}}}^{M} (t)$

Table 2.

Algorithm 2: Nonparametric Estimation Algorithm.

1: Input $f (A_{ij} | T_{ij}, Z_{ij})$, $g (Y_{i} | M_{i}, X_{i}, T_{i})$, $P$, $t = [0, T_{ij}]$
2: Output ${\hat{\bar{ω}}}^{M} (t)$
3:
4: sample $f^{p} (A_{ij} | T_{ij}, Z_{ij})$ from $f (A_{ij} | T_{ij}, Z_{ij})$ $P$ times with replacement
5: sample $g^{p} (Y_{i} | M_{i}, X_{i}, T_{i})$ from $g (Y_{i} | M_{i}, X_{i}, T_{i})$ $P$ times with replacement
6:
7: for $p = 1, 2, \ldots, P$ do
8:     draw $A^{p} (T_{ij}, Z_{ij})$ from $f^{p} (A_{ij} | T_{ij}, Z_{ij})$
9:     draw $A^{p} (0, Z_{ij})$ from $f^{p} (A_{ij} | 0, Z_{ij})$
10:     calculate $M_{i}^{p} (T_{ij})$ and $X_{i}^{p} (T_{ij})$ using $A^{p} (T_{ij}, Z_{ij})$
11:     calculate $M_{i}^{p} (0)$ and $X_{i}^{p} (0)$ using $A^{p} (0, Z_{ij})$
12:     draw $Y_{i}^{p} (T_{i} (t), M_{i}^{p} (T_{ij}), X_{i}^{p} (t))$ from $g^{p} (Y_{i} | M_{i}^{p} (T_{ij}), X_{i}^{p} (t), T_{i} (t))$
13:     draw $Y_{i}^{p} (T_{i} (t), M_{i}^{p} (0), X_{i}^{p} (t))$ from $g^{p} (Y_{i} | M_{i}^{p} (0), X_{i}^{p} (t), T_{i} (t))$
14:     calculate ${\hat{\bar{ω}}}^{Mp} (t) = \frac{1}{2 n} \sum_{i} \left[ Y_{i}^{p} (T_{i} (t), M_{i}^{p} (T_{ij}), X_{i}^{p} (t)) - Y_{i}^{p} (T_{i} (t), M_{i}^{p} (0), X_{i}^{p} (t)) \right]$
15: end for
16:
17: calculate ${\hat{\bar{ω}}}^{M} (t) = \sum_{p} \frac{{\hat{\bar{ω}}}^{Mp} (t)}{P}$
18: return ${\hat{\bar{ω}}}^{M} (t)$

Parametric Estimation

The parametric estimation algorithm proceeds in four steps. In step 1, fit two models, one of the form $f (A_{ij} | T_{ij}, Z_{ij})$ with parameter vector $\hat{θ}$ and covariance matrix $\hat{Ω} (\hat{θ})$, and one of the form $g (Y_{i} | M_{i}, X_{i}, T_{i})$ with parameter vector $\hat{β}$ and covariance matrix $\hat{Ω} (\hat{β})$. In step 2, draw $P$ parameter vectors for each model from a multivariate normal distribution, using $\hat{θ}$ and $\hat{β}$ to approximate the means and $\hat{Ω} (\hat{θ})$ and $\hat{Ω} (\hat{β})$ to approximate the variances. Denote the samples ${\hat{θ}}^{p}$ and ${\hat{β}}^{p}$ and the sampled models $f^{p} (A_{ij} | T_{ij}, Z_{ij})$ and $g^{p} (Y_{i} | M_{i}, X_{i}, T_{i})$.

Step 3 contains four subphases. In the first subphase, draw two networks, one using $f^{p} (A_{ij} | T_{ij}, Z_{ij})$, where $T_{ij}$ is held at its observed value, and one using $f^{p} (A_{ij} | 0, Z_{ij})$. Denote the networks $A^{p} (T_{ij})$ and $A^{p} (0)$. In the second subphase, calculate $M$ and $X$ on each network. Denote the values $M_{i}^{p} (t)$ and $X_{i}^{p} (t)$ for $t = 0, T_{ij}$. In the third subphase, draw the potential values of $Y_{i}$ from $g^{p} (Y_{i} | M_{i}, X_{i}, T_{i})$ by replacing $M_{i}$ with $M_{i}^{p} (t)$, $X_{i}$ with $X_{i}^{p} (t)$, and $T_{i}$ with $T_{i} (t)$. Denote the values $Y_{i}^{p} (T_{i} (t), M_{i}^{p} (T_{ij}), X_{i}^{p} (t))$ and $Y_{i}^{p} (T_{i} (t), M_{i}^{p} (0), X_{i}^{p} (t))$. In the fourth subphase, calculate ${\hat{\bar{ω}}}^{Mp} (t) = \frac{1}{2 n} \sum_{i} \left[ Y_{i}^{p} (T_{i} (t), M_{i}^{p} (T_{ij}), X_{i}^{p} (t)) - Y_{i}^{p} (T_{i} (t), M_{i}^{p} (0), X_{i}^{p} (t)) \right]$ and store the result. This provides one sample of the AMME. Step 3 repeats $P$ times, once for each value of $p$.

In step 4, calculate the AMME point estimate with ${\hat{\bar{ω}}}^{M} (t) = \frac{1}{P} \sum_{p} {\hat{\bar{ω}}}^{Mp} (t)$. The standard deviation of the sampling distribution provides the standard error. Confidence intervals and percentile p values are calculated using the sampling distribution.
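The four steps can be sketched in Python on a toy problem. Everything specific here is an assumption made for illustration: the network model is a two-parameter dyad-independent logit (a baseline term plus a same-group selection term standing in for $T_{ij}$), the mediator is degree, the outcome model is a linear regression with no direct effect or posttreatment confounder, and the "fitted" estimates and covariance matrices are hypothetical numbers rather than output from a real model. The sketch also averages a single contrast over nodes (dividing by n), whereas Table 1 writes a 1/(2n) factor; change the divisor to match the table's scaling.

```python
import numpy as np


def simulate_network(theta, same, rng, use_selection=True):
    """Draw an undirected network from a toy dyad-independent logit model.

    theta[0] is a baseline (edges) term; theta[1] multiplies the dyadic
    selection covariate (here a same-group indicator). Passing
    use_selection=False fixes the selection covariate at zero.
    """
    n = same.shape[0]
    eta = theta[0] + (theta[1] * same if use_selection else 0.0 * same)
    prob = 1.0 / (1.0 + np.exp(-eta))
    upper = np.triu(rng.random((n, n)) < prob, k=1)  # one draw per dyad
    return (upper | upper.T).astype(int)


def parametric_amme(theta_hat, cov_theta, beta_hat, cov_beta, sigma, same,
                    P=500, seed=0):
    """Monte Carlo AMME through degree under the toy models above."""
    rng = np.random.default_rng(seed)
    n = same.shape[0]
    # Step 2: draw P parameter vectors for each model.
    thetas = rng.multivariate_normal(theta_hat, cov_theta, size=P)
    betas = rng.multivariate_normal(beta_hat, cov_beta, size=P)
    draws = np.empty(P)
    for p in range(P):
        # Step 3a: one network with selection, one with selection fixed at 0.
        A_t = simulate_network(thetas[p], same, rng, use_selection=True)
        A_0 = simulate_network(thetas[p], same, rng, use_selection=False)
        # Step 3b: mediator (degree) on each network.
        M_t, M_0 = A_t.sum(axis=1), A_0.sum(axis=1)
        # Step 3c: potential outcomes from the sampled outcome model.
        Y_t = betas[p, 0] + betas[p, 1] * M_t + rng.normal(0.0, sigma, n)
        Y_0 = betas[p, 0] + betas[p, 1] * M_0 + rng.normal(0.0, sigma, n)
        # Step 3d: one sample of the AMME (mean contrast over nodes).
        draws[p] = np.mean(Y_t - Y_0)
    # Step 4: point estimate, standard error, percentile interval.
    return draws.mean(), draws.std(ddof=1), np.percentile(draws, [2.5, 97.5])


# Hypothetical inputs: 60 nodes in two equal groups, made-up estimates.
group = np.repeat([0, 1], 30)
same = (group[:, None] == group[None, :]).astype(float)
estimate, std_error, ci = parametric_amme(
    theta_hat=np.array([-2.0, 1.0]), cov_theta=0.01 * np.eye(2),
    beta_hat=np.array([1.0, 0.5]), cov_beta=0.01 * np.eye(2),
    sigma=1.0, same=same, P=200, seed=1)
print(f"AMME = {estimate:.3f}, s.e. = {std_error:.3f}, 95% CI = {ci.round(3)}")
```

In applied work the two simulators would be replaced by draws from the fitted network and outcome models; the loop structure is the part the sketch is meant to convey.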

Nonparametric Estimation

The nonparametric algorithm follows similar logic. It proceeds in three steps. In step 1, the researcher obtains $P$ bootstrap versions of $f (A_{ij} | T_{ij}, Z_{ij})$ and $g (Y_{i} | M_{i}, X_{i}, T_{i})$ by sampling with replacement. Denote the models $f^{p} (A_{ij} | T_{ij}, Z_{ij})$ and $g^{p} (Y_{i} | M_{i}, X_{i}, T_{i})$. Step 2 contains three subphases. First, draw $A^{p} (t)$ from $f^{p} (A_{ij} | T_{ij}, Z_{ij})$ and calculate $M_{i}^{p} (t)$ and $X_{i}^{p} (t)$ for $t = 0, T_{ij}$. Second, sample $Y_{i}^{p} (T_{i} (t), M_{i}^{p} (T_{ij}), X_{i}^{p} (t))$ and $Y_{i}^{p} (T_{i} (t), M_{i}^{p} (0), X_{i}^{p} (t))$ from $g^{p} (Y_{i} | M_{i}, X_{i}, T_{i})$. Third, calculate ${\hat{\bar{ω}}}^{Mp} (t) = \frac{1}{2 n} \sum_{i} \left[ Y_{i}^{p} (T_{i} (t), M_{i}^{p} (T_{ij}), X_{i}^{p} (t)) - Y_{i}^{p} (T_{i} (t), M_{i}^{p} (0), X_{i}^{p} (t)) \right]$ and store the result. This provides one bootstrap sample of the AMME. Step 2 repeats $P$ times. In step 3, calculate the AMME point estimate with ${\hat{\bar{ω}}}^{M} (t) = \frac{1}{P} \sum_{p} {\hat{\bar{ω}}}^{Mp} (t)$. Confidence intervals and percentile p values are calculated from the sampling distribution.
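Step 1 of the nonparametric algorithm can be illustrated for the outcome model alone. The sketch below bootstraps a linear regression by resampling nodes with replacement and refitting, which yields $P$ versions of the coefficient vector. The data, the single-mediator design, and the function name `bootstrap_ols` are hypothetical. Bootstrapping the network model is more involved because dyads are not independent units in general, and it is not shown here.

```python
import numpy as np


def bootstrap_ols(X, y, P=500, seed=0):
    """Refit an OLS outcome model on P resamples of the rows of (X, y)."""
    rng = np.random.default_rng(seed)
    n, k = X.shape
    coefs = np.empty((P, k))
    for p in range(P):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        coefs[p] = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
    return coefs


# Hypothetical data: one mediator with a true slope of 0.8.
rng = np.random.default_rng(42)
n = 500
mediator = rng.normal(size=n)
X = np.column_stack([np.ones(n), mediator])
y = 1.0 + 0.8 * mediator + rng.normal(size=n)

coefs = bootstrap_ols(X, y, P=500, seed=7)
full_sample = np.linalg.lstsq(X, y, rcond=None)[0]
ci_low, ci_high = np.percentile(coefs[:, 1], [2.5, 97.5])
print(f"slope = {full_sample[1]:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

Each row of `coefs` plays the role of one $g^{p}$; the percentile interval at the end mirrors how the algorithm summarizes the bootstrap distribution of the AMME.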

Simulation

I now provide simulation results to assess the consistency of the algorithmic estimates. Because the AMME relies on realistic distributions of edges and attributes, I initiate the simulation using the Faux Mesa High friendship network (Hunter, Goodreau, and Handcock 2008). The network contains 205 nodes representing high school students and 203 undirected friendship relationships. The simulation broadly entailed generating 1,000 synthetic networks from prespecified coefficients and then estimating the AMME on each network. I first fit a dyad independent ERGM to the Faux Mesa High network that included sex, race, and grade as node sociality effects and as homophily effects (same sex or same race; absolute difference in student grade) and stored the parameter vector. I then used a Metropolis-Hastings algorithm to simulate 1,000 synthetic networks treating the coefficients from the dyad independent ERGM as the “true” generative model parameters.
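A minimal sketch of the network-generation step follows. It relies on the fact that under a dyad-independent ERGM each tie is an independent Bernoulli draw whose log-odds are the coefficients times the dyad's change statistics, so the sketch samples ties directly rather than using the Metropolis-Hastings sampler described above. The coefficient values, the reduced set of terms, and the randomly generated node attributes are hypothetical; they are not the Faux Mesa High estimates.

```python
import numpy as np


def dyad_independent_probs(theta, sex, grade):
    """Tie probabilities for a toy dyad-independent ERGM."""
    sociality = sex[:, None] + sex[None, :]            # node sociality term
    same_sex = (sex[:, None] == sex[None, :]).astype(float)
    grade_diff = np.abs(grade[:, None] - grade[None, :])
    eta = (theta["edges"]
           + theta["sociality_sex"] * sociality
           + theta["same_sex"] * same_sex
           + theta["absdiff_grade"] * grade_diff)
    return 1.0 / (1.0 + np.exp(-eta))


def simulate_networks(probs, n_sims, rng):
    """Draw undirected networks by sampling each dyad independently."""
    n = probs.shape[0]
    nets = []
    for _ in range(n_sims):
        upper = np.triu(rng.random((n, n)) < probs, k=1)
        nets.append((upper | upper.T).astype(int))
    return nets


rng = np.random.default_rng(0)
n = 205                                   # same node count as Faux Mesa High
sex = rng.integers(0, 2, size=n)          # hypothetical attributes
grade = rng.integers(7, 13, size=n)
theta = {"edges": -4.0, "sociality_sex": 0.1,   # hypothetical coefficients
         "same_sex": 0.5, "absdiff_grade": -0.8}

probs = dyad_independent_probs(theta, sex, grade)
nets = simulate_networks(probs, n_sims=20, rng=rng)

iu = np.triu_indices(n, k=1)
expected_density = probs[iu].mean()
observed_density = np.mean([net[iu].mean() for net in nets])
print(f"expected density {expected_density:.4f}, simulated {observed_density:.4f}")
```

Direct sampling is only valid when no dyad-dependent terms (such as GWESP or mutuality) are in the model; with such terms an MCMC sampler is needed.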

Next, I generated a node-level outcome variable using a linear model of the form:

$$Y_{i} = β_{Sex} \mathrm{Sex}_{i} + β_{White} \mathrm{White}_{i} + β_{Hisp} \mathrm{Hisp}_{i} + β_{Grade} \mathrm{Grade}_{i} + β_{Deg} \mathrm{Deg}_{i} + β_{LCC} \mathrm{LCC}_{i} + e_{i}$$

$\mathrm{Sex}_{i}$, $\mathrm{White}_{i}$, $\mathrm{Hisp}_{i}$, and $\mathrm{Grade}_{i}$ capture student characteristics; $\mathrm{Deg}_{i}$ is degree; and $\mathrm{LCC}_{i}$ is the local clustering coefficient (Watts and Strogatz 1998) with population parameters $β_{Sex} = - 0.5$, $β_{White} = 1$, $β_{Hisp} = 0.5$, $β_{Grade} = 0.2$, $β_{Deg} = 0.8$, and $β_{LCC} = 0.5$. $e_{i}$ is a random normal error term with mean 0 and standard deviation 1. For each simulated network, I generated associated values of $Y_{i}$ by sampling the parameter vector from a multivariate normal distribution using the population parameters and covariance matrix as the mean and variance, respectively.
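The outcome-generation step can be sketched as follows. The population parameters are the ones stated above; the covariance matrix used to sample the parameter vector is not reported in the text, so the diagonal matrix here is a hypothetical placeholder, as are the random network and node attributes. Degree and the local clustering coefficient are computed from the adjacency matrix, with the coefficient set to 0 for nodes with fewer than two neighbors (a common convention, assumed here).

```python
import numpy as np


def degree_and_lcc(A):
    """Degree and local clustering coefficient for an undirected network."""
    deg = A.sum(axis=1)
    triangles = np.diag(A @ A @ A) / 2.0       # triangles through each node
    possible = deg * (deg - 1) / 2.0           # neighbor pairs per node
    lcc = np.divide(triangles, possible,
                    out=np.zeros(len(deg)), where=possible > 0)
    return deg, lcc


def generate_outcome(A, sex, white, hisp, grade, beta_pop, beta_cov, rng):
    """Sample a parameter vector, then draw Y from the linear model."""
    deg, lcc = degree_and_lcc(A)
    X = np.column_stack([sex, white, hisp, grade, deg, lcc])
    beta = rng.multivariate_normal(beta_pop, beta_cov)
    return X @ beta + rng.normal(0.0, 1.0, size=A.shape[0])


rng = np.random.default_rng(0)
n = 205
upper = np.triu(rng.random((n, n)) < 0.02, k=1)   # hypothetical network
A = (upper | upper.T).astype(int)
sex = rng.integers(0, 2, size=n)
race = rng.integers(0, 3, size=n)                  # 1 = white, 2 = Hispanic
white, hisp = (race == 1).astype(int), (race == 2).astype(int)
grade = rng.integers(7, 13, size=n)

# Order: Sex, White, Hisp, Grade, Deg, LCC (values from the text).
beta_pop = np.array([-0.5, 1.0, 0.5, 0.2, 0.8, 0.5])
beta_cov = 0.01 * np.eye(6)                        # hypothetical covariance
y = generate_outcome(A, sex, white, hisp, grade, beta_pop, beta_cov, rng)
print(y[:5].round(3))
```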

The primary task is to estimate the AMME with respect to $\mathrm{Deg}_{i}$ and $\mathrm{LCC}_{i}$. This captures the indirect effect of each ERGM parameter on $Y_{i}$ by acting through either students’ degree or local clustering coefficient, with each mediator acting as a posttreatment confounder in the alternative indirect pathway. I also consider the consistency of total and direct effect estimates. Algorithmic estimates were obtained with P = 500 simulations in each data set using linear regression to model each simulated value of $Y_{i}$ and ERGM to model each simulated network. With 1,000 simulated data sets, 8 selection processes, 2 mediators, and 4 sociality effects, the sample space of the simulation includes 16,000 AMME estimates (8,000 for $\mathrm{Deg}_{i}$, 8,000 for $\mathrm{LCC}_{i}$), 8,000 total effect estimates, and 4,000 direct effect estimates.

Beginning with parametric estimation, Figure 3 plots results from the micro-macro mediation analyses with density plots representing sample estimates and dashed vertical lines representing population values. The parametric algorithmic estimates converge to the population value for each effect measure: the AMME with respect to $\mathrm{LCC}_{i}$ (Figure 3A), the AMME with respect to $\mathrm{Deg}_{i}$ (Figure 3B), the direct effects for node covariates (Figure 3C), and the total effects (Figure 3D). This holds even when the effect distributions are nonnormal. For example, the $\mathrm{Grade}_{i}$ AMME with respect to $\mathrm{Deg}_{i}$ exhibits noteworthy left skew, but the median AMME (−0.853) approximates the population value (−0.844) even in this condition.

Figure 3.

Parametric estimates of (A and B) average mediated micro effects (AMMEs) (N = 16,000), (C) direct effects (N = 4,000), and (D) total effects (N = 8,000).

Turning to nonparametric estimation, results largely align with parametric estimation (see Figure 4). The nonparametric algorithmic estimates converge to the population value for each AMME, direct effect, and total effect. This includes both normally distributed variables as well as the left-skewed total effect and AMME with respect to $\mathrm{Deg}_{i}$ for grade. Collectively, these results provide preliminary evidence that micro-macro direct, indirect, and total network effects can be consistently estimated with the proposed estimation algorithms. Because the algorithms only require two statistical models but are agnostic about specific model choice, they can be widely applied in a range of research settings.

Figure 4.

Nonparametric estimates of (A and B) average mediated micro effects (AMMEs) (N = 16,000), (C) direct effects (N = 4,000), and (D) total effects (N = 8,000).

In summary, I introduced a general approach for micro-macro mediation analysis in social network data. The approach can be used to identify, estimate, and interpret the indirect effects of network selection processes on higher-order outcomes acting through network contexts. I now illustrate the utility of the approach in an example analyzing the friendship selection dynamics that indirectly shape adolescent school performance.

Empirical Example: Friendship Selection And School Performance In Adolescent Social Networks

Data

Research on adolescence documents contextual effects of in-school friendship networks. Peer effects shape participation in risky behaviors, mental health, and academic achievement, and overall network cohesion and clustering safeguard against substance use and delinquency (Copeland et al. 2020; Duxbury and Haynie 2020; Haynie 2001; Haynie and Osgood 2006; Kreager et al. 2011; Schaefer et al. 2011). Building on these studies, I examine the network selection mechanisms that indirectly affect adolescent GPA by acting through network structure. The analysis examines in-school friendship data from the largest network collected during the first wave of the Add Health. The network contains 1,167 students enrolled in 7th to 12th grade and 2,293 directed, binary friendship ties.

I conduct a cross-sectional analysis that combines ERGM with a linear network autocorrelation model (LNAM). Networks are modeled as a function of students’ grade, sex, race, parental income, and parental education.¹⁴ I include a mutuality term to account for reciprocity. Because Markov models are usually degenerate (Hunter et al. 2008), I include a geometrically weighted edgewise shared partnership (GWESP) term with a fixed decay parameter of 0.7. All node covariates are included as both sender and receiver effects. I also include a node match term for students’ race, parental education, and gender, and an attribute similarity (absolute difference) term for grade and parental income.¹⁵ The LNAM is formulated at the student level. Student GPA is modeled as a function of race, sex, grade, parental education, and parental income. The mediators of interest are indegree (popularity), outdegree (sociability), betweenness centrality (breadth), and the local clustering coefficient (embeddedness). To capture peer effects, I include a first-order autoregressive parameter that measures similarity between students’ GPAs and alters’ GPAs. I use row normalization such that the measure is the mean GPA among students’ outgoing friendship ties (Leenders 2002).¹⁶
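The row-normalized peer measure can be sketched in a few lines. Given a directed adjacency matrix whose rows index senders, dividing each row by the sender's outdegree and multiplying by the GPA vector gives the mean GPA among each student's nominated friends. The function name and the choice to assign 0 to students with no outgoing ties are assumptions of this sketch; the treatment of such students is not described in the text.

```python
import numpy as np


def peer_mean(A, y):
    """Mean outcome among each node's outgoing ties (row-normalized W @ y)."""
    out_deg = A.sum(axis=1, keepdims=True)
    W = np.divide(A, out_deg, out=np.zeros(A.shape, dtype=float),
                  where=out_deg > 0)           # rows with no ties stay at 0
    return W @ y


# Toy directed network: student 0 nominates 1 and 2; student 1 nominates 0.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [0, 0, 0]])
gpa = np.array([2.0, 3.0, 4.0])
print(peer_mean(A, gpa))   # student 0's peer GPA is the mean of 3.0 and 4.0
```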

Estimation and Assumptions

The primary goal is to estimate the total, direct, and indirect effects of each ERGM parameter on student GPA. To do so, I use parametric estimation with 500 Monte Carlo samples for each network selection process. For each iteration, this entails (1) drawing two new parameter vectors on the basis of the ERGM and LNAM estimates; (2) using the simulated ERGM parameter vector to generate two networks, one from the full model and one with the selection mechanism fixed at zero; (3) using the two networks to calculate two unique values of the mediating variable and posttreatment confounders; (4) generating four values of student GPA using the simulated values of the mediating variable, posttreatment confounders, selection mechanisms, and LNAM parameter vector; and (5) calculating the AMME from the resulting output. This process repeats 500 times with each iteration using a new pair of parameter vectors to provide the AMME point estimate.

The plausibility of the estimates depends on the sequential ignorability assumption. The ERGM framework assumes endogeneity among network effects and thus accounts for reverse causality for indirect pathways implicating outdegree, indegree, betweenness centrality, and the local clustering coefficient.¹⁷ However, sequential ignorability may be violated for indirect pathways implicating peer GPA because of simultaneous processes of selection into socially similar peer groups and influence from those groups. I use an approach inspired by An (2015a) to increase the plausibility of sequential ignorability for pathways implicating peer GPA. This entails using peer parental education as an instrument for peer GPA in the LNAM. Because it is unlikely that GPA has an upstream effect on parental education or that students select friends on the basis of parental education,¹⁸ this specification helps break the feedback loop between GPA selection and influence.¹⁹
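The logic of the instrument can be illustrated with a generic two-stage least squares sketch on simulated data. This is not the LNAM specification used in the analysis, and it is not taken from An (2015a); the variable names, the data-generating process, and the true peer effect of 0.5 are all hypothetical. The point is only that an unobserved common cause biases the naive regression slope, while an instrument that affects the outcome solely through peer GPA recovers it.

```python
import numpy as np


def two_stage_least_squares(y, endog, instrument):
    """2SLS with one endogenous regressor and one instrument (plus intercept)."""
    n = len(y)
    Z = np.column_stack([np.ones(n), instrument])
    first_stage = np.linalg.lstsq(Z, endog, rcond=None)[0]
    endog_hat = Z @ first_stage                  # instrumented regressor
    X_hat = np.column_stack([np.ones(n), endog_hat])
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]


rng = np.random.default_rng(3)
n = 20000
instrument = rng.normal(size=n)    # stands in for peer parental education
confounder = rng.normal(size=n)    # unobserved selection into similar peers
peer_gpa = instrument + confounder + rng.normal(size=n)
gpa = 0.5 * peer_gpa + confounder + rng.normal(size=n)

X = np.column_stack([np.ones(n), peer_gpa])
ols_slope = np.linalg.lstsq(X, gpa, rcond=None)[0][1]
iv_slope = two_stage_least_squares(gpa, peer_gpa, instrument)[1]
print(f"OLS slope = {ols_slope:.3f}, 2SLS slope = {iv_slope:.3f} (truth 0.5)")
```

The second-stage standard errors from this manual procedure are not valid as computed; a dedicated IV routine should be used for inference.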

Results

Tables 3 and 4 report results from ERGM, LNAM, and micro-macro network mediation analysis. ERGM results reveal positive effects from mutuality and GWESP, indicating that students are more likely to nominate other students as friends if it reciprocates an incoming friendship or if the two students share a third mutual affiliate (Table 3). Students are also more likely to form friendships if they are the same sex, same race, or in a similar grade. The only significant sociality effect is the receiver effect for parental income, indicating that parental income is positively associated with incoming friendship nominations. LNAM results reveal positive effects from indegree and peer GPA (Table 4). The negative coefficient for outdegree, however, indicates that students who nominate many other students as friends tend to have lower GPAs. The local clustering coefficient and betweenness centrality are both nonsignificant. The positive coefficient for sex indicates that girls tend to have higher GPAs than boys. Similarly, Asian students tend to have higher GPAs than black students, and white students tend to have lower GPAs than black students. Students whose parents completed college tend to have higher GPAs than students whose parents did not complete high school.

Table 3.

ERGM and Micro-Macro Network Mediation Results.

	ERGM Coefficient (s.e.)	LCC AMME (s.e.)	Betweenness AMME (s.e.)	In-Degree AMME (s.e.)	Out-Degree AMME (s.e.)	Peer GPA AMME (s.e.)	Direct Effect^a (s.e.)	Total Effect (s.e.)
Mutuality	1.687*** (.218)	.000 (.001)	.000 (.003)	.000 (.001)	.000 (.001)	–.010* (.006)		–.010 (.007)
GWESP	2.139*** (.068)	.010 (.021)	.006 (.016)	.052*** (.018)	–.045** (.019)	.036*** (.014)		.059** (.025)
Sex (male is reference)
Sender	.102 (.083)	.001 (.002)	.001 (.005)	.000 (.004)	–.002 (.003)	.003 (.009)	.287*** (.074)	.285*** (.073)
Receiver	–.006 (.099)	.000 (.001)	.001 (.004)	.003 (.003)	.000 (.003)	–.003 (.009)	.287*** (.074)	.276*** (.076)
Same	.356*** (.065)	–.001 (.002)	.001 (.005)	.004** (.002)	–.003* (.002)	.000 (.006)		.001 (.008)
Race (black is reference)
Asian sender	.029 (.061)	.000 (.001)	.000 (.003)	.001 (.001)	.000 (.001)	–.010 (.006)	.076*** (.013)	.067*** (.016)
Asian receiver	.029 (.061)	.000 (.001)	.000 (.003)	.000 (.002)	.000 (.001)	–.010 (.006)	.076*** (.013)	.067*** (.015)
White sender	–.038 (.082)	.000 (.001)	–.001 (.003)	–.001 (.001)	.001 (.001)	–.011 (.007)	–.028** (.009)	–.041*** (.012)
White receiver	.062 (.061)	.000 (.001)	.000 (.004)	.000 (.001)	.000 (.001)	–.013 (.007)	–.028** (.009)	–.040** (.011)
Other sender	.026 (.057)	.003 (.008)	.000 (.003)	.001 (.001)	–.001 (.001)	–.009 (.006)	–.009 (.012)	–.017 (.014)
Other receiver	–.007 (.057)	.000 (.001)	.001 (.004)	.001 (.001)	–.001 (.001)	–.009 (.007)	–.009 (.012)	–.017 (.015)
Same race	.649*** (.076)	–.001 (.002)	.003 (.006)	.005*** (.002)	–.004** (.002)	.000 (.007)		.003 (.009)
Grade
Sender	–.019 (.058)	.003 (.008)	–.004 (.011)	–.027 (.031)	.024 (.034)	–.052 (.039)	.156 (.338)	.100 (.345)
Receiver	.067 (.055)	.002 (.007)	–.001 (.011)	–.015 (.025)	.013 (.022)	–.033 (.042)	.156 (.338)	.121 (.340)
Absolute difference	–.681*** (.081)	.002 (.003)	–.003 (.006)	–.010*** (.004)	.008** (.004)	–.035*** (.010)		–.038*** (.012)
Parental education (no high school is reference)
Sender high school or trade	.038 (.133)	.000 (.001)	.000 (.003)	–.001 (.001)	.000 (.001)	–.010 (.007)	–.013 (.011)	–.023 (.014)
Receiver high school or trade	–.134 (.122)	.000 (.001)	.000 (.004)	.000 (.001)	.000 (.001)	–.013 (.007)	–.013 (.011)	–.025 (.014)
Sender some college	–.125 (.144)	.000 (.001)	.000 (.003)	–.001 (.001)	.000 (.001)	–.013 (.007)	.007 (.011)	–.006 (.013)
Receiver some college	–.075 (.113)	.000 (.001)	.000 (.003)	.000 (.001)	.001 (.001)	–.011 (.007)	.007 (.011)	–.005 (.014)
Sender college graduate	–.071 (.069)	.000 (.001)	.000 (.004)	.000 (.001)	.000 (.001)	–.011 (.007)	.024* (.010)	.014 (.014)
Receiver college graduate	–.094 (.145)	.000 (.001)	.000 (.003)	.000 (.001)	.000 (.001)	–.011 (.007)	.024* (.010)	.013 (.013)
Same parental education	.112 (.085)	.000 (.001)	.000 (.004)	.001 (.001)	–.001 (.001)	–.008 (.006)		–.008 (.007)
Parental income
Sender	–.003 (.002)	.000 (.001)	–.001 (.004)	–.003 (.002)	.002 (.002)	–.009 (.007)	.004 (.044)	–.014 (.045)
Receiver	.009*** (.002)	–.001 (.003)	.003 (.008)	.007*** (.000)	–.006** (.003)	–.017*** (.008)	.004 (.044)	–.015 (.045)
Absolute difference	–.002 (.003)	.000 (.001)	.000 (.003)	–.001 (.002)	.000 (.002)	–.012* (.007)		–.012* (.007)
Edges	−7.047*** (.848)
AIC	−104,312
BIC	−103,996

Note: ERGM fit with outdegree constrained to a maximum of 10. AMMEs, total effects, and direct effects are estimated parametrically with 500 Monte Carlo simulations. AIC = Akaike information criterion; BIC = Bayesian information criterion; AMME = average mediated micro effect; ERGM = exponential random graph model; GPA = grade point average; GWESP = geometrically weighted edgewise shared partnership; LCC = local clustering coefficient.

^a Direct effects are reported only for node covariates. Direct effects are identical for sender and receiver effects because only a single node covariate is specified for each attribute in the linear network autocorrelation model.

*p < 0.05. **p < 0.01. ***p < 0.001.

Table 4.

Network Autocorrelation Model of Student GPA.

	Coefficient (s.e.)
Local clustering coefficient	.062 (.093)
Betweenness centrality	.000 (.000)
Peer GPA	.120*** (.024)
Indegree	.038** (.014)
Outdegree	–.035* (.016)
Sex (male is reference)	.183*** (.044)
Race (black is reference)
Asian	.212*** (.038)
White	–.112** (.041)
Other	–.029 (.038)
Grade	.012 (.032)
Parental education (no high school is reference)
High school or trade	–.071 (.058)
Some college	.035 (.061)
College graduate	.177* (.069)
Parental income	.000 (.001)
Intercept	1.298*** (.347)
R²	.128
F	12.05***

Note: GPA = grade point average.

*p < 0.05. **p < 0.01. ***p < 0.001.

Table 3 shows direct, indirect, and total effects from micro-macro network mediation analysis. The direction and significance of all direct effects align with LNAM coefficients. The direct effect of sex is 0.287, meaning that female students’ mean GPA should decrease by 0.287 in the absence of the direct effect of sex. With a mean GPA of 2.63, this effect can account for 10.91 percent of the observed mean GPA. Similarly, the direct effects of Asian, white, and parental college education can explain 2.88 percent (0.076/2.63 = 0.029), −1.06 percent (−0.028/2.63 = −0.011), and 0.91 percent (0.024/2.63 = 0.009) of the mean GPA, respectively.
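As a check on the arithmetic, each percentage above divides a direct effect from Table 3 by the mean GPA of 2.63. The ratios below use the rounded table entries; the percentages in the text may rest on unrounded estimates, which is an assumption here and would explain small differences in the last digit.

$$\frac{0.287}{2.63} \approx 0.1091, \qquad \frac{0.076}{2.63} \approx 0.0289, \qquad \frac{-0.028}{2.63} \approx -0.0106, \qquad \frac{0.024}{2.63} \approx 0.0091.$$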

Turning to indirect effects, the AMMEs are nonsignificant for all selection processes when operating through betweenness centrality or the local clustering coefficient. This reflects the nonsignificant effects of the local clustering coefficient and betweenness centrality in LNAM (Table 4).²⁰ The AMMEs for GWESP indicate that triadic closure has a positive indirect effect on student GPA by increasing students’ indegree (0.052) and peer GPA (0.036), but these positive effects are partly offset by increases in outdegree, which have a negative effect on student GPA (AMME = −0.045). The total effect captures the contributions of triadic closure to GPA operating through all indirect pathways. The total effect is 0.059 and can explain roughly 2.2 percent of the observed mean GPA (0.059/2.63 = 0.022). In other words, the GPA benefits of network selection through triadic closure in the observed network are comparable with the total effect of being Asian compared with being black reported in Table 3 (0.059 vs. 0.067).

In contrast to GWESP, mutuality indirect effects are nonsignificant for both indegree and outdegree. The only significant mutuality indirect effect operates through peer GPA, where the AMME is −0.010. As a result, the total effect of mutuality is nonsignificant. Thus, although selection through triadic closure has positive indirect effects by increasing peer GPA, selection through reciprocity has a small negative indirect effect by decreasing peer GPA.

Formal analyses of the AMME also reveal how homophily can positively or negatively affect academic achievement. Race and sex homophily both have positive indirect effects on GPA by increasing indegree, but they have competing negative indirect effects through outdegree. Consequently, the total effects of both selection processes are nonsignificant. Grade similarity has a negative total effect on student GPA acting through all indirect pathways. Roughly 92.1 percent of the negative total effect is explained by relatively lower peer GPA among friendships between students in similar grade levels (−0.035/−0.038 = 0.921); the remaining portion of the total effect results from indirect effects of grade similarity operating through indegree and outdegree. Operating through all indirect pathways, the negative effect of network selection through grade similarity is comparable with the negative total effect of being white compared with being black in the studied network (−0.038 vs. −0.041 in Table 3).
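The decompositions discussed for GWESP and grade similarity can be checked directly against Table 3. The sketch below copies the rounded AMMEs for those two rows, sums them, and computes each pathway's share of the total. For these rows the rounded AMMEs sum to the reported total effect; treating the total as the sum of the pathway-specific AMMEs is an observation about these table entries, not a general identity claimed by the text.

```python
# Rounded AMMEs copied from Table 3 (rows: GWESP, grade absolute difference).
amme = {
    "GWESP": {"LCC": 0.010, "Betweenness": 0.006, "Indegree": 0.052,
              "Outdegree": -0.045, "Peer GPA": 0.036},
    "Grade absolute difference": {"LCC": 0.002, "Betweenness": -0.003,
                                  "Indegree": -0.010, "Outdegree": 0.008,
                                  "Peer GPA": -0.035},
}
reported_total = {"GWESP": 0.059, "Grade absolute difference": -0.038}


def pathway_shares(paths):
    """Sum the pathway AMMEs and express each as a share of that sum."""
    total = sum(paths.values())
    return total, {name: value / total for name, value in paths.items()}


for term, paths in amme.items():
    total, shares = pathway_shares(paths)
    print(f"{term}: summed AMMEs = {total:.3f} "
          f"(reported total {reported_total[term]:.3f}), "
          f"peer GPA share = {shares['Peer GPA']:.3f}")
```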

The indirect effects of parental income are also noteworthy. Although the parental income direct and total effects are nonsignificant, parental income has negative indirect receiver effects on student GPA by reducing peer GPA and outdegree. Although it is perhaps surprising for a receiver effect to shape outdegree, the relationship makes sense once we recognize it as network spillover. In the absence of a positive receiver effect, students less frequently nominate high parental income students as friends, decreasing the outdegree among students who would otherwise connect to high-income alters. Each of these indirect effects is partly offset by the positive receiver effect of parental income on indegree. Parental income similarity is also associated with decreases in GPA because of decreases in peer GPA. This likely reflects income segregation that concentrates relatively low academic achievement among low-income student peer groups.

In summary, micro-macro network mediation analyses reveal several interesting results about the selection processes that shape student GPA. Tendencies to form friends through triadic closure are associated with GPA increases because of increases in indegree and peer GPA that are partly offset by increases in outdegree. Weaker indirect effects were detected from mutuality and sex and race homophily operating through peer GPA, indegree, and outdegree. Parental income and grade similarity are both associated with indirect reductions in GPA because of decreases in peer GPA. Parental income also has significant indirect effects on student GPA by decreasing peer GPA that are partly offset by increases in indegree; most other node covariates only have direct effects on GPA but no indirect effect. Collectively, these results illustrate how the proposed approach can be used to evaluate how network selection processes indirectly shape student outcomes by altering social network structure.

Discussion

Social scientists are often interested in how networks act as an intervening mechanism in causal processes (e.g., Bearman et al. 2004; Chetty et al. 2022; DiMaggio and Garip 2012; Pedulla and Pager 2019), but researchers have been limited in their ability to statistically evaluate indirect network selection effects on individual, group, and organizational outcomes because of a lack of formal methods for micro-macro network mediation analysis. This study proposed a general methodological approach for evaluating indirect network selection effects with multiple conditionally independent posttreatment confounders. The AMME can be identified under sequential ignorability in a broad range of modeling frameworks. The approach thus provides a new tool for statistically evaluating indirect selection effects on outcome variables measured on the individuals, groups, and organizations nested in social networks.

The sequential ignorability assumption is crucial for mediation analysis. The stringency of this assumption, however, is pronounced in analyses of peer effects due to simultaneous selection and influence processes. Some modeling approaches are explicitly designed to grapple with this problem (Snijders et al. 2007; Steglich et al. 2010). However, researchers will likely have difficulty disentangling selection from influence in some empirical settings. One promising direction for future work is to develop sensitivity analyses for micro-macro network mediation. Although the sequential ignorability assumption is untestable, researchers can assess the sensitivity of indirect effect estimates to violations of sequential ignorability in some mediation frameworks (Cheng et al. 2018; Imai, Keele, and Tingley 2010; Zhou 2022). Extending sensitivity analysis to the micro-macro network mediation approach described here will help researchers evaluate how strong the violation of sequential ignorability would have to be to alter estimates of indirect network selection effects.

Another strategy to address sequential ignorability is randomization. Although I focused on observational research designs due to their common occurrence in sociology, a growing body of research conducts experiments that randomly assign actors to network locations (e.g., Melamed et al. 2018). In some instances, researchers randomize at the dyadic level, for example, by assigning nodes to either similar or dissimilar alters (Centola 2010). The proposed approach can be used in this type of experimental design. In such cases, sequential ignorability is partly supported by randomization and the researcher’s primary task is to ensure there is no omitted variable on the pathway from mediator to outcome.

Researchers have many options for disentangling the order of selection and influence mechanisms in observational research settings. One of the most powerful tools for parsing selection and influence dynamics is the SAOM (Steglich et al. 2010), which allows researchers to model the coevolution of network selection and influence processes in longitudinal network data. Outside the SAOM framework, researchers can use lagged predictor variables when modeling network selection to ensure selection dynamics are temporally prior to changes in network structure. An instrumental variable may also be available to break the feedback loop between selection and influence (e.g., An 2015a).

Researchers may be interested in estimating multiple indirect selection effects in a single analysis. In these cases, the risk of type 1 error increases. The nonparametric algorithm partly safeguards against type 1 errors by resampling with replacement, but parametric estimates may be vulnerable. One possible solution is to use Bayesian estimation techniques to seed the parametric algorithm. This would involve estimating models in a Bayesian framework and then using random draws from the posterior distribution in each algorithmic call.

A promising direction for further work is to consider multiple causally connected posttreatment confounders. The approach introduced here assumes conditional independence among posttreatment pathways, but it may be possible for a posttreatment variable to intervene on the causal path from micro mechanism to outcome through a second posttreatment variable (Imai and Yamamoto 2013). Future work should examine under what conditions the micro-macro network approach can accommodate posttreatment causal relationships.

In summary, I introduced a general methodological approach for evaluating indirect network selection effects on higher-order outcomes in social network data. The AMME can be identified and estimated with common regression and network models. It is broadly relevant for researchers interested in assessing the network selection processes that contribute to individual, group, and organizational outcomes by altering network structure. The approach thus offers a new tool for researchers to examine the indirect network selection effects that are often theoretically implied but rarely empirically tested in social networks research.

Appendix: Derivation of AMME Identification

The derivation is an extension of theorem 1 in Imai, Keele, and Yamamoto (2010) and the derivation for multiple conditionally independent mediators provided by Imai and Yamamoto (2013). I show that the sequential ignorability results reported in prior studies apply even when examining different units of analysis. I consider the identification of ${\bar{ω}}_{i}^{M}$, the AMME with respect to $M_{i}$. Note that equation (16) implies the following conditional independence:

$$Y_{i} (T_{i} (t), m, X_{i} (t)) \perp\!\!\!\perp T_{ij} \mid M_{i} (t') = m', Z_{ij} = z, \tag{A1}$$

for all $t, t' = 0, 1$, $m$, $m'$, and $z$. Now for any $t, t'$, we have

$$E (Y_{i} (T_{i} (t), M_{i} (t'), X_{i} (t)) \mid Z_{ij} = z) = \int E (Y_{i} (T_{i} (t), m, X_{i} (t)) \mid M_{i} (t') = m, Z_{ij} = z) \, d F_{M_{i} (t') \mid Z_{ij} = z} (m) \tag{A2}$$

$$= \int E (Y_{i} (T_{i} (t), m, X_{i} (t)) \mid M_{i} (t') = m, T_{i} = T_{i} (t'), Z_{ij} = z) \, d F_{M_{i} (t') \mid Z_{ij} = z} (m) . \tag{A3}$$

Using equations (15) and (16), we can write

$$\int E (Y_{i} (T_{i} (t), m, X_{i} (t)) \mid M_{i} (t') = m, T_{i} = T_{i} (t'), Z_{ij} = z) \, d F_{M_{i} (t') \mid Z_{ij} = z} (m) = \int E (Y_{i} (T_{i} (t), m, X_{i} (t)) \mid T_{i} = T_{i} (t'), Z_{ij} = z) \, d F_{M_{i} (t') \mid Z_{ij} = z} (m) \tag{A4}$$

$$= \int E (Y_{i} (T_{i} (t), m, X_{i} (t)) \mid T_{i} = T_{i} (t), Z_{ij} = z) \, d F_{M_{i} (t') \mid T_{i} = T_{i} (t'), Z_{ij} = z} (m) \tag{A5}$$

$$= \int E (Y_{i} (T_{i} (t), m, X_{i} (t)) \mid M_{i} (t) = m, T_{i} = T_{i} (t), Z_{ij} = z) \, d F_{M_{i} (t') \mid T_{i} = T_{i} (t'), Z_{ij} = z} (m) \tag{A6}$$

$$= \int E (Y_{i} \mid M_{i} = m, T_{i} = T_{i} (t), Z_{ij} = z) \, d F_{M_{i} (t') \mid T_{i} = T_{i} (t'), Z_{ij} = z} (m) \tag{A7}$$

$$= \int E (Y_{i} \mid M_{i} = m, T_{i} = T_{i} (t), Z_{ij} = z) \, d F_{M_{i} \mid T_{i} = T_{i} (t'), Z_{ij} = z} (m) . \tag{A8}$$

Finally, equation (A8) implies

$$E (Y_{i} (T_{i} (t), M_{i} (t'), X_{i} (t))) = \int \int E (Y_{i} \mid M_{i} = m, T_{i} = T_{i} (t), Z_{ij} = z) \, d F_{M_{i} \mid T_{i} = T_{i} (t'), Z_{ij} = z} (m) \, d F_{Z_{ij}} (z) . \tag{A9}$$

Substituting this expression into the definition of ${\bar{ω}}_{i}^{M} (t)$ gives equation (18). Note that these results apply to continuous $t$. The same derivation can be used when replacing $t = 0, 1$ with any desired real value. Thus, the identification result applies to any observed $T_{ij}$. And because $M$ and $X$ are assumed to be conditionally independent, the same identification result is obtained for ${\bar{ω}}_{i}^{X} (t)$ by substituting $X_{i}$ for $M_{i}$.

Acknowledgements

I thank David Melamed, Ken Bollen, Per Block, Ken Frank, Santiago Olivella, and Christian Steglich for helpful comments.

ORCID iD

Scott W. Duxbury

Author Biography

Scott W. Duxbury is an assistant professor of sociology at the University of North Carolina at Chapel Hill. His research focuses on social networks, drug markets, public opinion, race, and punishment. It has appeared in the American Sociological Review, the American Journal of Sociology, and Social Forces, among other outlets. He is also the author of Longitudinal Network Models, a SAGE green book on longitudinal network data analysis.

Micro-Macro Mediation Analysis in Social Networks
