Abstract
Collecting and analyzing observational data are essential to learning and implementing lessons in earthquake engineering. Historically, the methods that have been used to analyze and draw conclusions from empirical data have been limited to traditional statistics. The models developed using these techniques are able to capture associative relationships between important variables. However, the intervention decisions geared toward seismic risk mitigation should ideally be informed by an understanding of the causal mechanisms that drive infrastructure performance and community response. This article advocates for a paradigm shift in earthquake engineering where the language, tools, and models that have been (and continue to be) developed to draw causal conclusions from observational data are adopted. Several categories of data-driven earthquake engineering problems that can benefit from causal insights are examined. Two widely adopted frameworks from the broader causal inference literature are presented and linked to hypothetical earthquake engineering problems. The critical role of semi-parametric models and sensitivity analysis in justifying causal claims is also discussed. The article concludes with a discussion of specific opportunities and challenges toward the widespread use of causal inference as a tool for knowledge discovery in earthquake engineering. The ability to leverage the underlying physics of a problem within a causal inference framework is identified as both an opportunity and challenge for earthquake engineering researchers.
Introduction
Drawing insights from empirical data is essential to earthquake engineering research and practice. Statistical analysis of observational data (i.e., data that are not generated by a controlled experiment) permeates every area of earthquake engineering including (but not limited to) ground motion hazard characterization, analysis and design of new structures, evaluation and retrofit of existing structures, and regional risk and resilience assessments. In structural engineering (a subdomain of earthquake engineering), there is a long history of using data from physical experiments to develop statistical models to estimate response parameters such as component stiffness, strength, and/or deformation capacity. These models are then used in the design, analysis, retrofit (in the case of existing facilities), and performance evaluation of individual or distributed infrastructure. Post-earthquake field reconnaissance is also key to advancing earthquake engineering research and practice. Lessons learned from field investigation data range from identifying the presence and criticality of specific seismic vulnerabilities (e.g., captured columns, unbraced and/or unbolted cripple walls) to evaluating the effectiveness of different risk mitigation measures (e.g., seismic retrofits (SRs)). Field observations also inform the development of ground motion models (GMMs) and provide insights into the effects of different risk mitigation policies. Structural health monitoring (SHM) is another area where observational data and analysis are used in earthquake engineering. SHM uses different sensors to measure the response of structures to ground shaking, which is then used to infer the type, location, and extent of damage. Linking measured structural response to actual damage often involves some type of statistical analysis.
For the most part, the data-driven models that have been developed in earthquake engineering are based on statistical relationships where concepts such as correlation, regression, conditional independence, and association are adopted (Pearl, 2019). Linear and nonlinear regression are commonly used to develop models that predict the response parameters for structural components, including reinforced concrete (RC) (Dai et al., 2020; Haselton et al., 2016) and steel (El Jisr et al., 2022; Lignos and Krawinkler, 2011) beam-column elements, and RC shear walls (Abdullah, 2019; Abdullah and Wallace, 2019). The resulting equations allow engineers to estimate the parameters that are used in computational or analytical models. The GMM, a core component of seismic hazard characterization, uses regression (usually linear in logarithmic space) to estimate shaking intensity and duration parameters conditioned on features related to the earthquake formation process and geophysical properties of the path taken by the waveform and the local site of interest (Abrahamson et al., 2014; Boore et al., 2014; Campbell and Bozorgnia, 2014; Chiou and Youngs, 2014). Currently, there is a growing trend to use machine learning (ML) algorithms as the basis for developing predictive models within the aforementioned application areas of earthquake engineering. This has largely been motivated by the ability of ML to detect complex patterns in observational data and produce predictions that are often more accurate than those obtained from traditional statistical models. A detailed state-of-the-art review on the application of ML in earthquake engineering problems is provided in Xie et al. (2020).
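As a simplified illustration of regression in logarithmic space, the following Python sketch fits a toy attenuation model to a synthetic catalog. The functional form and coefficients are illustrative assumptions; real GMMs such as the NGA-West2 models cited above use far richer functional forms and mixed-effects regression.

```python
# Minimal sketch of a GMM-style fit: regression is linear in log space,
# conditioned on magnitude, distance, and a site proxy (Vs30).
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 500
catalog = pd.DataFrame({
    "magnitude": rng.uniform(4.0, 7.5, n),
    "rjb_km": rng.uniform(1.0, 200.0, n),
    "vs30_mps": rng.uniform(180.0, 760.0, n),
})
# Synthetic "observed" PGA consistent with a simple attenuation form
catalog["pga_g"] = np.exp(
    -1.0 + 1.1 * catalog["magnitude"]
    - 1.3 * np.log(catalog["rjb_km"] + 10.0)
    - 0.4 * np.log(catalog["vs30_mps"] / 760.0)
    + rng.normal(0.0, 0.6, n))  # aleatory variability

# ln(PGA) ~ M + ln(R + h) + ln(Vs30/ref)
X = np.column_stack([
    catalog["magnitude"],
    np.log(catalog["rjb_km"] + 10.0),
    np.log(catalog["vs30_mps"] / 760.0),
])
y = np.log(catalog["pga_g"])
gmm = LinearRegression().fit(X, y)
print(gmm.intercept_, gmm.coef_)  # recovered coefficients
```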
The models described in the previous paragraph provide associational relationships, which, by themselves, cannot be used to infer cause and effect. In fact, traditional statistics does not include a coherent language for describing causal effects, much less the methods and tools of analysis (Pearl, 2019). Within the earthquake engineering domain, while useful for predictions, such models are limited in their ability to provide fundamental causal insights on the phenomena being studied. A reasonable counterargument is that, when coupled with our understanding of the physical laws that generate the associated observational data, these statistical models do inform cause and effect. Yet, this approach to elucidating causal knowledge is typically unsystematic and not rigorously justifiable. However, the last two decades have seen significant advances in developing techniques that can be used to draw causal conclusions from observational data. This body of knowledge, which has been developed in fields that include statistics, computer science, and the social sciences (e.g., political science, economics), is at the heart of modern scientific research. This set of circumstances raises the question of whether the foundational principles and methods that have been advanced in the broad area of causal inference can be used to solve earthquake engineering problems.
The goal of this article is to begin the process of answering the question posed at the end of the previous paragraph. Specifically, we discuss some specific types of earthquake engineering inquiries that can potentially benefit from being viewed through the lens of causal inference. The following problem categories are considered:
Drawing fundamental insights from datasets generated by structural component and sub-assembly experiments.
Untangling the effects of specific ground motion characteristics (e.g. duration, pulses) on structural response and damage using computational models.
Quantifying the average (over the entire inventory) benefits of seismic retrofit interventions using observational data from real earthquakes.
Evaluating the factors (e.g., building- or site-related) that are most responsible for the damage observed in a real event.
The next section considers potential causal inference applications in earthquake engineering through the lens of these four problem categories. Within these specific areas of inquiry, the state-of-the-art in data-driven modeling and inference is summarized and the limitations of existing approaches in terms of their ability to produce causal conclusions are highlighted. Two foundational causal inference frameworks are then presented with an acute focus on their ability to provide fundamental insights into specific earthquake engineering questions. The same section also addresses the importance of semi-parametric models and sensitivity analysis tools for justifying causal conclusions. Subsequently, we discuss the opportunities and challenges in conducting causal investigations using observational data within the domain of earthquake engineering.
Potential causal inference applications in earthquake engineering
Infrastructure earthquake risk mitigation often takes the form of a physical intervention that is intended to enhance seismic behavior at the system or component level through some causal mechanism. The effectiveness of those interventions is often informed by data-driven models and analysis of empirical data from past events. However, in earthquake engineering, the state-of-the-art in modeling and analysis of empirical data is based on traditional statistical methods where the associative (or correlational) effects of the intervention are quantified. Ideally, such investigations should be underpinned by data-driven techniques that can be seamlessly integrated with our knowledge of physics to extract causal information about the effect(s) of interventional changes to the system.
This section discusses four categories of earthquake engineering problems that can benefit from causal inference. In particular, a review of the state-of-the-art in data-driven modeling and inference within each subdomain is presented while emphasizing the shortcomings in terms of the ability of existing methods to elucidate cause and effect from observational data.
Data-driven investigations based on physical structural experiments
The physical laboratory experiment is a primary tool in advancing our fundamental understanding of structural component and system-level behavior under seismic loading. For a given experimental project, the results from one or a small set of physical experiments are used to understand how different structural properties or design strategies affect behavior. The results from a given physical experiment can also be used to anecdotally validate a particular modeling strategy. Over the last few decades, hundreds (or, for some component types, more than 1000) of structural experiments have accumulated for various types of components and sub-assemblies. Also, with the creation of cyberinfrastructure systems such as DesignSafe (Rathje et al., 2017), these datasets are being widely shared and used among structural/earthquake engineers. A single dataset would typically include comprehensive information from experiments conducted on a particular type of component or sub-assembly. Examples of such components and datasets include RC and steel beam-column elements (Berry et al., 2004; Lignos and Krawinkler, 2013), RC shear walls (Abdullah, 2019), RC and steel frames with masonry infill (Huang et al., 2020; Liberatore et al., 2017), and RC and steel frame panel zones (Skiadopoulos and Lignos, 2021).
The increase in the collection and curation of data from physical experiments has spurred the development of data-driven approaches for estimating the parameters that are used in structural analysis models. Such studies use data from largely disparate experiments (i.e., each performed with a different goal in mind) and various forms of regression to develop predictive models, typically for macro-element modeling parameters. In fact, macro-elements are commonly used in nonlinear structural models because they are often easy to implement, and the associated parameters can be calibrated from the results of physical experiments. Linear and nonlinear regression models have been developed to predict the behavior of components, such as RC shear walls (Abdullah, 2019; Abdullah and Wallace, 2019), steel beam-column elements (El Jisr et al., 2022; Lignos and Krawinkler, 2011), RC beam-column elements (Dai et al., 2020; Haselton et al., 2016), and masonry-infilled frames (De Risi et al., 2018; Huang et al., 2020; Liberatore et al., 2018; Sirotti et al., 2021).
The typical workflow to develop a predictive model begins with a visual examination of any trends between the potential input variables (e.g., component material or geometric properties) and the parameter of interest (e.g., the capping strength of the element). Quantitative metrics, such as correlation and statistical significance tests, are also used to evaluate the feasibility of a particular structural property as an input into the statistical model. Once the input variables have been chosen, an appropriate statistical model is implemented. Finally, the developed model is evaluated using the plots of the predicted versus observed values of the parameter of interest and quantitative metrics, such as the coefficient of determination ($R^2$).
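This workflow can be summarized in a short script. The sketch below uses synthetic shear-wall data with illustrative feature names; in practice, the table would come from a curated experimental database rather than being simulated.

```python
# Sketch of the typical predictive-modeling workflow: screen inputs with
# correlations, fit a regression, evaluate predicted vs. observed (R^2).
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 160
walls = pd.DataFrame({
    "aspect_ratio": rng.uniform(0.5, 3.0, n),
    "axial_load_ratio": rng.uniform(0.0, 0.3, n),
    "fc_mpa": rng.uniform(25, 60, n),
})
# Synthetic "measured" drift capacity (illustrative relationship)
walls["drift_capacity_pct"] = (
    2.5 - 0.4 * walls["aspect_ratio"] - 3.0 * walls["axial_load_ratio"]
    + 0.01 * walls["fc_mpa"] + rng.normal(0, 0.2, n))

features = ["aspect_ratio", "axial_load_ratio", "fc_mpa"]
target = "drift_capacity_pct"

# Step 1: screen candidate inputs with simple correlations
print(walls[features + [target]].corr()[target])

# Step 2: fit the statistical model on a training split
X_tr, X_te, y_tr, y_te = train_test_split(
    walls[features], walls[target], test_size=0.25, random_state=0)
model = LinearRegression().fit(X_tr, y_tr)

# Step 3: evaluate predicted vs. observed (coefficient of determination)
print("R^2 =", r2_score(y_te, model.predict(X_te)))
```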
A more recent trend has been to use ML algorithms as the primary engine for developing predictive data-driven models using structural component data from physical experiments. Some studies have compared the predictive performance of models developed using a suite of algorithms (Jeon et al., 2014; Mangalathu and Jeon, 2018; Rahman et al., 2021). Others use a single algorithm that is targeted toward addressing a specific concern with data-driven models (e.g. small datasets) (Luo and Paal, 2019, 2021). ML has also been used to develop models that predict the failure mode of specific structural components. These are classification models that take in various structural features as input and predict the type of failure that would occur under seismic loading (Huang and Burton, 2019; Mangalathu and Jeon, 2018, 2019; Mangalathu et al., 2020).
The data-driven models summarized in this section are useful for predictions but are unable to provide fundamental insights about the behavior of the component or system of interest. For the purpose of illustration, consider the study by Abdullah and Wallace (2019) that used a dataset of 164 experiments on RC shear walls to develop a statistical model for predicting the drift capacity. One of the considered input variables was the configuration of the boundary transverse reinforcement. The dataset included various types of overlapping and non-overlapping hoops. The article provided a detailed discussion on the importance of overlapping hoops and provided supporting evidence from individual experiments. However, due to the presence of many confounding variables, there was no attempt to quantify the benefits of overlapping hoops using the assembled dataset. The principles and methods of causal inference described later in the article can be used for such an assessment.
Structural response simulation-based data-driven investigations
Nonlinear structural response analysis is widely used to simulate the behavior of different types of infrastructure under earthquake ground motions. In the performance-based earthquake engineering (PBEE) methodology (Moehle and Deierlein, 2004), nonlinear response analysis is used to evaluate the performance of structures in terms of metrics that are of interest to stakeholders, such as collapse risk, economic losses, and functional recovery. Within this type of assessment, questions arise about the effect of specific earthquake ground motion features (e.g., duration, spectral shape, and pulse effects) on seismic structural response and performance. These types of evaluations are typically achieved through statistical analysis of seismic response and performance data generated by simulation models. Note that while the data generated from a simulation model can be viewed as coming from a "controlled" experiment, the variable of interest (e.g., ground motion duration) is typically linked to the ground motion data, which is observational.
Consider an area of research that has been of growing interest to the earthquake engineering community—the effect of ground motion duration on the seismic structural collapse risk of buildings. This area of inquiry was spurred by large magnitude earthquakes that have occurred within the last two decades (e.g., the 2011 Tōhoku, Japan, earthquake), which produced recordings with unusually long durations. One approach that has been used to study this question is statistical trend analysis, whereby structural models are subjected to suites of recorded motions and the computed collapse capacities are statistically related to duration measures (e.g., significant duration).
Another approach that has been adopted to quantify the effect of duration on collapse performance is to subject the same structure (or set of structures) to "short" and "long" duration ground motions that are "spectrally equivalent" (Chandramohan et al., 2016a, 2016b). This strategy recognizes that the spectral shape (SS) of a ground motion is a possible confounding variable that would also influence collapse risk. The strategy is analogous to the "matching" approach (discussed more later in the article) that has been used in early causal inference literature (Stuart, 2010). However, there are several limitations to matching, both as a general strategy for dealing with the effects of confounders (discussed later in the article) and in the specific application in the aforementioned studies focused on the effect of duration on structural collapse risk. Regarding the latter, obtaining a one-to-one spectral match for a single pair of long and short duration motions is a difficult proposition, which only increases in complexity when trying to assemble a record-set for collapse performance assessment. This approach also brings the quality of the matching into question, an issue that is not explicitly addressed in the aforementioned studies. Finally, the univariate matching approach essentially ignores the potential effect of other confounders (e.g., pulse effects). Later in the article, we discuss techniques used in the broad area of causal inference that are designed to address these specific limitations.
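To illustrate the mechanics (and the matching-quality question), the following is a minimal sketch of one-to-one spectral matching under an assumed log-spectral distance metric. The spectra, the greedy pairing rule, and the distance metric are all illustrative assumptions, not the procedure used in the cited studies.

```python
# Sketch of one-to-one "spectral equivalence" matching between long- and
# short-duration record sets. Response spectra are assumed precomputed at a
# common set of periods; rows are records, columns are periods.
import numpy as np

def spectral_distance(sa_a, sa_b):
    """Mean squared difference of log spectral ordinates."""
    return np.mean((np.log(sa_a) - np.log(sa_b)) ** 2)

def match_records(sa_long, sa_short):
    """Greedy pairing: each long record gets the closest unmatched short one."""
    unmatched = list(range(len(sa_short)))
    pairs = []
    for i, sa in enumerate(sa_long):
        j = min(unmatched, key=lambda k: spectral_distance(sa, sa_short[k]))
        unmatched.remove(j)
        # Keep the distance so matching quality can be reported explicitly
        pairs.append((i, j, spectral_distance(sa, sa_short[j])))
    return pairs

# Hypothetical spectra for 20 long- and 60 short-duration records
rng = np.random.default_rng(1)
sa_long = np.exp(rng.normal(-1.0, 0.5, size=(20, 30)))
sa_short = np.exp(rng.normal(-1.0, 0.5, size=(60, 30)))
for i, j, d in match_records(sa_long, sa_short)[:5]:
    print(f"long #{i} <-> short #{j}, distance {d:.3f}")
```

Retaining the per-pair distances, as done here, is one simple way to address the matching-quality concern raised above.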
Statistical evaluations of structural response simulation data have also been used to study the effects of other types of ground motion features, including spectral shape (Eads et al., 2016; Haselton et al., 2011), pulses (Champion and Liel, 2012; Liapopoulou et al., 2020), and sedimentary basins (Bijelić et al., 2019; Marafi et al., 2017). In general, these studies have adopted one of the two approaches (matching or statistical trend analysis) described in the previous paragraphs. It is worth mentioning that these statistical analyses are supported by fundamental insights into the physics of ground motions and the structural response of infrastructure to earthquake-induced shaking. Together, these lines of inquiry provide justification for any causal claims that are made in the findings. However, it is also important to distinguish between qualitative and quantitative causal insights. The latter cannot be inferred from statistics alone (Pearl, 2019).
Evaluating the benefits of seismic structural mitigation using observational data
A review of the earthquake engineering literature found only a single study (discussed later in this section) that sought to systematically evaluate the overall (i.e., across a set of buildings) benefits of seismic retrofit using observational data (i.e., through field reconnaissance following real events). In other words, the overwhelming majority of field observational studies that have highlighted the benefits of seismic retrofit have been anecdotal. In a report written for a "lay audience," the Pacific Earthquake Engineering Research (PEER) Center documented the performance of two similar (based on pictures taken from the exterior) 2-story homes that experienced ground shaking from the 2014 South Napa earthquake, one of which had been seismically retrofitted; the retrofitted home reportedly fared far better than its unretrofitted counterpart.
For structural engineers, providing a causal link between seismic retrofit and earthquake damage may seem obvious since many laboratory experiments have been conducted to answer this question. However, making the link between the results of these experiments and a systematic quantification of the mitigating effect of the retrofit during an actual earthquake is not a trivial endeavor. This partly explains why the insights drawn about the effect of seismic retrofit on building performance during real earthquakes have mostly been either anecdotal (as described earlier in this section) or derived from detailed numerical analyses of individual buildings. A recently published study by Rabonza et al. (2022) used "counterfactual analysis" to quantify the benefit of seismic risk interventions. The authors combined observed data from the 2015 Gorkha, Nepal, earthquake with simulated no-intervention (counterfactual) scenarios to quantify the damage that was avoided as a result of prior retrofit programs.
Elucidating the factors that contribute to observed seismic building damage
Post-event field reconnaissance is a primary tool for understanding the factors that most influence seismic damage to infrastructure. The earliest missions relied almost exclusively on in-person, one-at-a-time inspections that enabled engineers and scientists to collect and document perishable data that could inform various aspects of seismic risk mitigation and resilience-building. More modern reconnaissance efforts, in contrast, combine in-person and virtual inspections with remote sensing at different scales (e.g., satellite, unmanned aerial vehicles). The Earthquake Engineering Research Institute (EERI) Learning from Earthquakes (LFE) (EERI-LFE, 2022) program alone has collected and curated datasets from over 300 earthquakes over a period of more than 70 years. More recently, the Structural Extreme Events Reconnaissance (StEER, 2022) and Geotechnical Extreme Events Reconnaissance (GEER, 2022) programs routinely spearhead earthquake field investigations. In addition, entities such as the RAPID Center (Berman et al., 2020) and the DesignSafe Cyberinfrastructure (Rathje et al., 2017) have advanced the use of modern technology in the collection and curation of earthquake reconnaissance data.
Detailed post-earthquake inspections of individual infrastructure (e.g., buildings) can reveal critical insights about the factors that influence seismic damage. These factors range from specific structural and non-structural deficiencies to soil–structure interaction and amplification of ground shaking triggered by a myriad of possible effects (e.g., geological features of the local site and/or seismic wave path, near-source and/or directivity effects). It is well understood that generalizing the findings from in-person (or virtual) inspections requires systematic analysis of the collected data. This has led to numerous studies in the literature on the statistical (or quantitative trend) analysis of the damage to specific system or component types, including bridges (Basöz et al., 1999), non-structural elements (Giaretton et al., 2016), and buildings (Aguilar et al., 1989; Boatwright et al., 2015; Jünemann et al., 2015; Scholl, 1974; Sun and Zhang, 2011). A typical analysis of this type would quantify the empirical distribution of specific types of damage and establish associative relationships (qualitative and quantitative) with potential causal factors (e.g., shaking intensity, structural/non-structural vulnerabilities). Depending on the size and quality of the data, empirical fragility curves may be developed to link the ground shaking intensity to the probability of different types and severity of damage (Basöz et al., 1999).
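For example, an empirical lognormal fragility curve of the type cited above can be fit by probit regression on the log intensity measure, since $P(\text{damage} \mid IM) = \Phi((\ln IM - \ln\theta)/\beta)$ is exactly a probit model in $\ln IM$. The sketch below uses synthetic observations with illustrative parameter values.

```python
# Sketch of fitting an empirical lognormal fragility curve from hypothetical
# post-event (PGA, damaged-or-not) observations via probit regression.
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(2)
pga = np.exp(rng.normal(np.log(0.3), 0.6, 400))  # hypothetical observed IMs
true_theta, true_beta = 0.45, 0.5                # illustrative "true" fragility
damaged = rng.random(400) < norm.cdf(np.log(pga / true_theta) / true_beta)

# Probit on ln(IM): P = Phi(b0 + b1*ln(IM)) => theta = exp(-b0/b1), beta = 1/b1
X = sm.add_constant(np.log(pga))
fit = sm.Probit(damaged.astype(float), X).fit(disp=0)
b0, b1 = fit.params
theta = np.exp(-b0 / b1)  # median capacity (g)
beta = 1.0 / b1           # lognormal dispersion
print(f"theta = {theta:.2f} g, beta = {beta:.2f}")
```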
Similar to other data-driven studies in earthquake engineering, there is a growing trend toward using ML to develop models for predicting the geographic distribution of damage following an event. Buildings (Harirchian et al., 2021; Mangalathu et al., 2020; Roeslin et al., 2020; Stojadinović et al., 2021) and the components that comprise a water distribution system (Bagriacik et al., 2018; Winkler et al., 2018) are the two types of infrastructure that have been considered in this regard. These types of studies primarily focus on the predictive accuracy of the considered model(s).
Like the other application areas, the analyses and models that are applied to infrastructure damage data from real earthquakes cannot provide explicit quantitative information on causal effects. For illustration purposes, we consider a study by Boatwright et al. (2015) that performed an investigation of the factors that most influenced the overall distribution of observed building damage during the 2014 South Napa earthquake. In the end, the authors were only able to conclude that “the distribution of red and yellow tags is well-correlated with the pre-1950 development of Napa and with the underlying sedimentary basin but poorly correlated with the most recent alluvial geology.” In other words, because of the methods adopted (from traditional statistics), no causal claims could be made. In another study by Mangalathu et al. (2020), ML models were developed to predict the distribution of building damage during the same earthquake. As part of that study, the authors quantified the relative importance of different features used in a Random Forest algorithm. This was done by computing importance scores for each input parameter, which is a measure of the amount of predictive power the parameter adds to the model. Note, however, that this is an associative metric that cannot be used to draw causal insights.
Foundational principles and frameworks in causal inference
This section describes two foundational frameworks that have dominated the causal inference literature. Sensitivity analysis is also discussed as a means of justifying key assumptions in causal investigations. The objective is to initiate a dialogue about which underlying principles, tools, and methods can be adapted (and how) to data-driven inquiries within the domain of earthquake engineering. To this end, the two frameworks (and sensitivity analysis discussion) are presented while simultaneously establishing links to some of the problem types described in the previous section.
Potential outcome framework
The potential outcome (PO) framework is one of the two methodological developments that have dominated research and practice in causal inference. The initial concepts are attributed to the work of Jerzy Neyman (1923) but have since been advanced by Don Rubin (2005). In any causal inference problem, the primary goal is to evaluate the effect of some "treatment" $T$ on an outcome of interest $Y$ for a set of units. As a running hypothetical example, consider evaluating the effect of seismic retrofit on building damage following an earthquake: each building $i$ (the unit) either received a retrofit ($T_i = 1$, the treatment condition) or did not ($T_i = 0$, the control condition), and the outcome $Y_i$ is a measure of the seismic damage it sustained.
POs are defined as the outcome variable values that each unit takes on under the treatment and control conditions. Specifically, for unit $i$, $Y_i(1)$ and $Y_i(0)$ denote the outcomes that would be realized under the treatment and control conditions, respectively. Because each unit can be observed under only one of the two conditions, one of the two POs is always missing, which is often referred to as the fundamental problem of causal inference. A typical causal estimand is the average treatment effect (ATE), $\tau = E[Y(1) - Y(0)]$; for the hypothetical retrofit problem, this is the expected difference in damage between the retrofitted and unretrofitted states of the building inventory.
Table 1. Causal estimand array for the hypothetical seismic retrofit problem.
One potential strategy for computing the causal estimand is to “match” pairs of treatment and control units with similar covariate values and compare their POs. Revisiting our earlier hypothetical problem, we would need to find pairs of retrofitted and unretrofitted buildings with similar structural properties that have also experienced comparable shaking characteristics. Several matching methods have been proposed in the causal inference literature, particularly in the social sciences (Stuart, 2010). In real observational studies, it is almost impossible to obtain pairs of treatment and control units that are exactly or even closely matched because it is often the case that we are dealing with large numbers of covariates that could potentially affect the outcome. As such, even after matching, there could still be significant covariate imbalance between treatment and control units. To address this issue, several techniques have been developed to statistically adjust the outcomes based on the covariate distribution of the treatment and control units through “weighting” or “balancing.” Examples of such methods include bias-corrected matching (Abadie and Imbens, 2011) and kernel balancing (Hazlett, 2020).
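As an illustration of the weighting idea, the following Python sketch estimates the average effect of retrofit on damage via inverse-probability weighting with a logistic propensity model. The data, variable names, and functional forms are synthetic assumptions; bias-corrected matching or kernel balancing could be substituted for the simple IPW estimator shown here.

```python
# Sketch of propensity-score weighting for the hypothetical retrofit problem:
# estimate P(retrofit | covariates), then reweight outcomes to balance the
# covariate distributions of the treated and control groups.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def ipw_ate(df, treatment, outcome, covariates):
    """Inverse-probability-weighted estimate of the average treatment effect."""
    ps_model = LogisticRegression(max_iter=1000).fit(
        df[covariates], df[treatment])
    e = ps_model.predict_proba(df[covariates])[:, 1]  # propensity scores
    t, y = df[treatment].to_numpy(), df[outcome].to_numpy()
    # Horvitz-Thompson style weighting of treated and control outcomes
    return np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))

# Hypothetical post-earthquake dataset with confounded retrofit assignment
rng = np.random.default_rng(3)
n = 2000
age = rng.uniform(0, 100, n)
pga = rng.lognormal(np.log(0.3), 0.4, n)
retrofit = (rng.random(n) < 1 / (1 + np.exp(-(0.03 * age - 2)))).astype(int)
damage = 0.02 * age + 2.0 * pga - 0.8 * retrofit + rng.normal(0, 0.5, n)
df = pd.DataFrame({"AGE": age, "PGA": pga, "SR": retrofit, "SD": damage})
print(ipw_ate(df, "SR", "SD", ["AGE", "PGA"]))  # ~ -0.8, the true effect
```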
The randomized controlled trial (RCT) is known to be the ideal approach for dealing with covariate mismatch. In RCTs, the treatment is randomly assigned to a portion of the units, and the difference between the mean outcome values of the treatment and control groups, $\hat{\tau} = \bar{Y}_{T=1} - \bar{Y}_{T=0}$, is an unbiased estimate of the average treatment effect because randomization balances the covariate distributions of the two groups (in expectation). RCTs are rarely feasible in earthquake engineering; for example, retrofits cannot be randomly assigned to buildings in advance of an earthquake. Methods for drawing causal conclusions from observational data are therefore needed.
Several important assumptions are embedded in the PO framework. The "consistency" assumption says that the observed outcome (denoted as $Y_i^{obs}$) of a unit equals the PO corresponding to the treatment it actually received, that is, $Y_i^{obs} = Y_i(T_i)$. The stable unit treatment value assumption (SUTVA) requires that the POs of one unit are unaffected by the treatment assignments of the other units. Finally, the "conditional ignorability" (or unconfoundedness) assumption states that, conditional on the observed covariates $X_i$, treatment assignment is independent of the POs, that is, $(Y_i(1), Y_i(0)) \perp T_i \mid X_i$. This last assumption, which is revisited in the sensitivity analysis section, is what permits causal effects to be estimated from observational data after adjusting for the covariates.
The “Pearl” framework
The second framework is primarily attributed to Judea Pearl (2019), who describes three hierarchical levels of causal inference. The first and most basic level is association, which reflects the act of "seeing," where a set of statistically related variables is merely observed. The relations are quantified based on conditional dependencies. For example, observing that older buildings tend to sustain more earthquake damage establishes an association (through the conditional probability of damage given age) but, by itself, says nothing about cause and effect. The second level is intervention, which reflects the act of "doing": it asks how the distribution of an outcome would change if a variable were deliberately set to a particular value, a question formalized through the do-operator introduced later in this section. The third and highest level is counterfactuals, which reflects the act of "imagining" and asks what would have happened to a specific unit had the treatment taken a value different from the one actually observed.
Central to the Pearl framework is the use of directed acyclic graphs (DAGs) and structural causal models (SCMs) to conceptualize both the causal and statistical dependencies between the variables in an observational dataset. A DAG is a graph that only contains directed edges and has no cycles (i.e., there is no path from a single node that leads back to itself) (Geiger et al., 1990). Consider the simple DAG shown in Figure 1 that is based on the "effect of seismic retrofit" problem introduced in the PO framework section. The goal is to isolate the causal effect of seismic retrofit (SR) (the treatment) on seismic damage (SD) (the outcome) using data collected after a real earthquake. For illustration purposes, we will consider the peak ground acceleration (PGA) (shaking intensity) and the building age (AGE) as the only two covariates or "confounders." Confounding refers to a situation where a common cause obscures the causal relationship between two or more variables. More precisely, the causal effect of SR on SD is confounded if a variable (e.g., AGE or PGA) influences both the likelihood of treatment (whether a building is retrofitted) and the outcome (the damage it sustains).
Figure 1. DAG corresponding to the seismic retrofit problem introduced in the PO framework section.
It should be noted that, within a DAG, the causal assumptions are encoded in the links (arrows) that are missing, not the ones that are shown (Pearl, 2019). The links that are included are used to represent the possibility of a causal influence, which is determined and quantified by the data. In Figure 1, for example, the missing arrows between the exogenous variables indicate the assumption of joint independence. Additional causal assumptions that are encoded in Figure 1 are (1) AGE does not influence PGA, (2) SR does not influence AGE or PGA, (3) PGA does not influence AGE, and (4) SD does not influence any of the other variables.
SCMs integrate features of structural equation models (Duncan, 2014; Goldberger, 1972), the PO framework, and graphical models for probabilistic reasoning and causal assessment (Pearl, 1988). An SCM contains a set of endogenous and exogenous variables linked by a set of functions, which capture the causal relationships between the variables. The SCM corresponding to the DAG in Figure 1 is shown in Equation 1:

$$\begin{aligned} \mathrm{AGE} &= f_{\mathrm{AGE}}(U_{\mathrm{AGE}}) \\ \mathrm{PGA} &= f_{\mathrm{PGA}}(U_{\mathrm{PGA}}) \\ \mathrm{SR} &= f_{\mathrm{SR}}(\mathrm{AGE}, \mathrm{PGA}, U_{\mathrm{SR}}) \\ \mathrm{SD} &= f_{\mathrm{SD}}(\mathrm{SR}, \mathrm{AGE}, \mathrm{PGA}, U_{\mathrm{SD}}) \end{aligned} \tag{1}$$

where $f_{(\cdot)}$ are the (unspecified) functions that encode the causal mechanisms, and $U_{\mathrm{AGE}}$, $U_{\mathrm{PGA}}$, $U_{\mathrm{SR}}$, and $U_{\mathrm{SD}}$ are the exogenous (error) variables, which are assumed to be jointly independent.
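For intuition, an SCM of this form can be read as a generative program. The sketch below simulates Equation 1 with arbitrary (assumed) functional forms and coefficients, and shows how an intervention $do(\mathrm{SR})$ corresponds to overriding the structural equation for SR (the "graph surgery" described by Pearl).

```python
# Sketch of the SCM in Equation 1 as a generative program: each endogenous
# variable is a function of its parents in Figure 1 plus an independent
# exogenous term. All functional forms here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(4)

def sample_scm(n, do_sr=None):
    """Draw n units from the SCM; do_sr=0/1 simulates the intervention do(SR)."""
    u_age, u_pga = rng.uniform(0, 100, n), rng.lognormal(np.log(0.3), 0.4, n)
    u_sr, u_sd = rng.random(n), rng.normal(0, 0.5, n)
    age = u_age                                    # AGE = f_AGE(U_AGE)
    pga = u_pga                                    # PGA = f_PGA(U_PGA)
    if do_sr is None:                              # SR = f_SR(AGE, PGA, U_SR)
        sr = (u_sr < 1 / (1 + np.exp(-(0.03 * age + 0.5 * pga - 2)))).astype(int)
    else:                                          # graph surgery: SR := do_sr
        sr = np.full(n, do_sr)
    sd = 0.02 * age + 2.0 * pga - 0.8 * sr + u_sd  # SD = f_SD(SR, AGE, PGA, U_SD)
    return age, pga, sr, sd

# Interventional contrast: E[SD | do(SR=1)] - E[SD | do(SR=0)]
sd1 = sample_scm(100_000, do_sr=1)[3].mean()
sd0 = sample_scm(100_000, do_sr=0)[3].mean()
print(sd1 - sd0)  # ~ -0.8 by construction
```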
A causal analysis using data, DAGs, and SCMs can be viewed as having two objectives: identification of a causal effect and quantification of the strength of that effect. Recall that the presence of an arrow in a DAG represents the possibility of a causal link that is determined by the analysis. For some problems, especially in domains such as earthquake engineering, the causal link may be identifiable based on our knowledge of the physical system being studied. For example, we know based on structural engineering principles that there is going to be some causal effect of the retrofit on damage, so we may want to focus on quantifying the strength of that effect. However, there are other problems, even within earthquake engineering, where the presence of a causal link is less obvious.
As noted earlier, the effect of ground motion duration on structural collapse risk is another problem that can be examined through this lens. Figure 2 shows a DAG for this problem, in which the duration (DR) is the treatment, the seismic damage (SD) is the outcome, and ground motion characteristics such as the spectral shape (SS) and the first-mode spectral acceleration act as confounders.

Figure 2. DAG corresponding to the ground motion duration problem.
Identifying and quantifying the strength of a causal effect in a graph is feasible when the Markovian assumption—error terms are jointly independent, and the graph is acyclic—is a reasonable one. Identification and effect quantification in non-Markovian models (e.g., with correlated errors) are also possible, but with additional complications (see Pearl (2019)). Here, we will deal with the former case. A model $M$ that satisfies the Markovian assumption induces a joint distribution that factorizes according to its graph $G$:

$$P(x_1, x_2, \ldots, x_n) = \prod_{i} P(x_i \mid pa_i)$$

where $pa_i$ denotes the values of the parents of variable $X_i$ in $G$. An intervention that fixes the treatment $T$ at a value $t$, written using the do-operator as $do(T = t)$, corresponds to deleting the arrows entering $T$ and is computed by removing the factor associated with $T$ from the product (the so-called truncated factorization):

$$P(x_1, x_2, \ldots, x_n \mid do(T = t)) = \prod_{i:\, X_i \neq T} P(x_i \mid pa_i)$$

where the right-hand side is evaluated consistently with $T = t$. Then, marginalizing over the confounders $X$ gives the post-intervention distribution of the outcome $Y$:

$$P(y \mid do(T = t)) = \sum_{x} P(y \mid t, x)\, P(x) \tag{7}$$
Equation 7, which is often described as “adjusting” for the confounding variables, coincides with the objectives of the “weighting” or “balancing” techniques mentioned in the PO section. Additional details on estimating causal effects using DAGs and SCMs are provided in Pearl (2019), where other complicating issues such as dealing with unmeasured confounders and counterfactual analysis (the third level of the causal hierarchy) are discussed.
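To make Equation 7 concrete, the following sketch applies the adjustment formula to a small synthetic binary example and contrasts it with the naive conditional contrast. All numbers and variable names are illustrative; the key point is that the interventional distribution averages $P(SD \mid SR, x)$ over the marginal $P(x)$ of the confounder, not over $P(x \mid SR)$.

```python
# Worked numeric sketch of the adjustment formula (Equation 7) with a binary
# confounder X ("old building"), binary treatment SR, and binary outcome SD.
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
n = 200_000
old = rng.random(n) < 0.5                      # confounder X
sr = rng.random(n) < np.where(old, 0.2, 0.6)   # older buildings retrofitted less
p_dmg = 0.5 * old + 0.3 - 0.2 * sr             # damage probability
dmg = rng.random(n) < p_dmg
df = pd.DataFrame({"X": old, "SR": sr, "SD": dmg})

# Naive (associational) contrast: P(SD=1 | SR=1) - P(SD=1 | SR=0)
naive = df.loc[df.SR, "SD"].mean() - df.loc[~df.SR, "SD"].mean()

# Adjustment formula: sum_x P(SD=1 | SR, x) P(x)
adjusted = {}
for t in (True, False):
    adjusted[t] = sum(
        df.loc[(df.SR == t) & (df.X == x), "SD"].mean() * (df.X == x).mean()
        for x in (True, False))
print(f"naive: {naive:.3f}, adjusted: {adjusted[True] - adjusted[False]:.3f}")
# adjusted ~ -0.20 (the true effect); the naive contrast is biased by X
```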
Semi-parametric models
Semi-parametric models have been applied to a broad range of causal inference problems (usually within the PO framework) and have been shown to be especially useful in dealing with high-dimensional confounding (i.e., a large number of covariates). This body of work is largely attributed to James Robins and his collaborators (Robins, 1994; Robins and Tsiatis, 1991; Robins et al., 1992). As described in the PO framework, the general problem is defined with the goal of determining the effect of some treatment $T$ on an outcome $Y$ in the presence of a (possibly large) set of measured confounders $\mathbf{X}$.
One approach to estimating the causal effect for the "ground motion duration" problem is to fit regression models with SD (outcome) as the dependent variable and the duration DR (treatment) and confounders as independent variables. A fully parametric model, however, requires the functional relationships between the confounders and the outcome to be correctly specified. The semi-parametric alternative is the partially linear model:

$$SD = \theta \cdot DR + g(\mathbf{X}) + \epsilon \tag{8}$$

$$DR = m(\mathbf{X}) + \eta \tag{9}$$

In Equations 8 and 9, $\theta$ is the causal effect of DR on SD, $\mathbf{X}$ is the vector of confounders, $g(\cdot)$ and $m(\cdot)$ are unspecified (nuisance) functions, and $\epsilon$ and $\eta$ are zero-mean error terms. Because no parametric form is imposed on $g(\cdot)$ and $m(\cdot)$, the model is semi-parametric: the causal parameter $\theta$ is estimated parametrically while the confounding relationships are modeled flexibly.
A strategy that has been receiving significant recent attention in the causal inference literature is to use ML models for the nuisance functions $g(\cdot)$ and $m(\cdot)$ in Equations 8 and 9. In the "double ML" procedure, the two functions are estimated with flexible ML algorithms using sample splitting (cross-fitting), and the causal parameter $\theta$ is then obtained by regressing the outcome residuals on the treatment residuals. This construction guards against the regularization and overfitting biases that would otherwise contaminate the causal estimate.
For our ground motion duration problem, we would generate our dataset of observations using response history analysis. Each ground motion in the record-set will be used to analyze the building structural model, which will yield a performance outcome SD. We will also be able to compute the duration, DR (treatment), SS, and first-mode spectral acceleration, $S_a(T_1)$, of each record, which serve as the confounders. The ML-based nuisance models are then trained on these variables, and the causal effect of duration on the performance outcome is estimated using the residual-on-residual regression described above.
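As a concrete (synthetic) illustration of this workflow, the following Python sketch simulates such a dataset and applies the cross-fitted, residual-on-residual procedure. The variable names follow the hypothetical problem above, but the data-generating functions are arbitrary assumptions rather than actual analysis results.

```python
# Sketch of a double-ML estimate of the duration effect (theta) in the
# partially linear model of Equations 8 and 9, with gradient boosting for the
# nuisance functions g and m and 5-fold cross-fitting.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(6)
n = 2000
ss = rng.normal(0, 1, n)                      # spectral shape proxy
sa_t1 = rng.lognormal(0, 0.5, n)              # Sa(T1)
dr = 10 + 3 * ss + 2 * np.log(sa_t1) + rng.normal(0, 1, n)   # duration (treatment)
theta_true = 0.15
sd = theta_true * dr + np.sin(ss) + sa_t1**0.5 + rng.normal(0, 0.3, n)  # outcome

X = np.column_stack([ss, sa_t1])              # confounders
# Cross-fitted nuisance predictions: E[SD | X] and E[DR | X]
g_hat = cross_val_predict(GradientBoostingRegressor(), X, sd, cv=5)
m_hat = cross_val_predict(GradientBoostingRegressor(), X, dr, cv=5)

# Residual-on-residual regression recovers the causal parameter theta
res_y, res_t = sd - g_hat, dr - m_hat
theta_hat = np.sum(res_t * res_y) / np.sum(res_t**2)
print(f"theta_hat = {theta_hat:.3f} (true {theta_true})")
```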
This section discussed a single specific approach (semi-parametric models) to estimating treatment effects in the presence of confounders within a broader causal inference framework. It is noted that several alternative approaches exist in the broader literature (Imbens and Rubin, 2015).
Sensitivity analyses
In general, causal inference on observational data (as opposed to a controlled experiment) requires the unavoidable assumption that there is "no unobserved confounding" or "ignorability" conditional on observables (also described as "conditional ignorability") (Imbens and Rubin, 2015; Pearl, 2019; Rosenbaum and Rubin, 1983). In fact, the need for assumptions that are unverifiable by the collected data is fundamental to causal inference (Pearl, 2019). In earthquake engineering problems, the ignorability assumption can be placed in two broad categories. The first is one where conditional ignorability is justifiable based on an understanding of the physical mechanisms that drive the problem. In this scenario, it is still advisable that the investigator provides as much evidence as possible to justify that the estimated causal effects are not due to confounding. This is especially true if the underlying physics of the problem is not fully understood. The second is the situation where the ignorability assumption is not fully justifiable but necessary simply because of a lack of data on potential additional confounders. In this case, it is even more important that the modeler investigates the implications of these unmeasured confounders for the causal estimates.
In the causal inference literature, sensitivity analysis is the primary means for assessing the implications of unobserved confounding for causal inference results. Specifically, sensitivity analyses are used to determine whether an unobserved confounder can substantively alter the conclusions about an estimated causal effect. Two metrics are widely used for this purpose: the robustness value (RV) and the partial coefficient of determination (or partial $R^2$) (Cinelli and Hazlett, 2020). The RV is the minimum strength of association, expressed as a partial $R^2$, that an unobserved confounder would need to have with both the treatment and the outcome to reduce the estimated effect by a fraction $q$ (e.g., $q = 1$ corresponds to reducing the effect to zero). It is computed as

$$RV_q = \frac{1}{2}\left(\sqrt{f_q^4 + 4f_q^2} - f_q^2\right)$$

where $f_q = q\,\lvert f_{Y \sim T \mid \mathbf{X}} \rvert$ and $f_{Y \sim T \mid \mathbf{X}} = \sqrt{R^2_{Y \sim T \mid \mathbf{X}} / (1 - R^2_{Y \sim T \mid \mathbf{X}})}$ is the partial Cohen's $f$ of the treatment with the outcome given the observed covariates $\mathbf{X}$.

The partial $R^2$ of the treatment with the outcome quantifies the fraction of the outcome variance that is left unexplained by the covariates but is explained by the treatment. It can be obtained directly from standard regression output as

$$R^2_{Y \sim T \mid \mathbf{X}} = \frac{t_T^2}{t_T^2 + df}$$

where $t_T$ is the $t$-statistic of the treatment coefficient and $df$ is the degrees of freedom of the fitted model. An RV close to 1 implies that only a confounder explaining nearly all of the residual variance of both the treatment and the outcome could overturn the estimated effect, whereas an RV near 0 implies that even a weak confounder could do so.
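A minimal sketch of these calculations, assuming only the treatment $t$-statistic and degrees of freedom from a fitted linear model, follows; in practice, packages such as sensemakr (by Cinelli and Hazlett) automate these computations and the accompanying benchmarking plots.

```python
# Sketch of the sensitivity metrics above: partial R^2 and robustness value
# computed from a treatment t-statistic and the model degrees of freedom.
import numpy as np

def partial_r2(t_stat, dof):
    """Partial R^2 of the treatment with the outcome."""
    return t_stat**2 / (t_stat**2 + dof)

def robustness_value(t_stat, dof, q=1.0):
    """Minimum confounder strength (as partial R^2 with both treatment and
    outcome) needed to reduce the estimated effect by 100*q percent."""
    r2 = partial_r2(t_stat, dof)
    f = np.sqrt(r2 / (1 - r2))   # partial Cohen's f
    fq = q * f
    return 0.5 * (np.sqrt(fq**4 + 4 * fq**2) - fq**2)

# Example: treatment t-statistic of 4.0 with 200 degrees of freedom
print(f"partial R^2 = {partial_r2(4.0, 200):.3f}")
print(f"RV (q=1)    = {robustness_value(4.0, 200):.3f}")
```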
Revisiting the "ground motion duration" problem described in the previous subsection, the argument can be made that spectral shape and first-mode spectral acceleration do not fully capture the confounding (e.g., near-fault pulse effects could influence both the duration distribution of the selected records and the computed structural response). In this situation, the RV and partial $R^2$ can be used to assess whether an omitted confounder of plausible strength could overturn the estimated effect of duration on collapse risk.
Opportunities and challenges
Statistical analysis of observational data is a primary tool in earthquake engineering research and practice. Furthermore, the present access to multiple data streams (e.g., field reconnaissance, simulation, and remote sensing), coupled with recent advancements in computing and statistical learning, has only intensified data-driven efforts in the domain. However, as a community, we are yet to broadly leverage the principles, tools, and techniques that have been developed and disseminated in the field of causal inference. While it is generally accepted that seismic risk mitigation should ideally be based on a causal understanding of the effects of earthquakes and any intervention strategies, the overwhelming majority of data-driven investigations do not rigorously address the issue of confounding. This can be viewed as an opportunity for the earthquake engineering community, which already has a long history of interdisciplinary (or more recently, transdisciplinary) research and practice.
A general area that is ripe for research is an overall investigation of what principles, tools, and methods in the broad area of causal inference (on observational data) are adaptable to solving problems in earthquake engineering. This article presented four categories of domain-specific problems and two frameworks that have dominated the causal inference literature. While an attempt has been made to link the two, understanding which method(s) is most appropriate for a given problem (or category of problems) is still an open question. For instance, one can inquire about whether there are common or distinct strategies needed when addressing causal problems where the data are generated from physical experiments, numerical simulations, or field reconnaissance after real earthquakes. Another open question concerns the most appropriate causal inference method(s) (e.g., PO, semi-parametric model-based) for binary (e.g., retrofitted or not retrofitted) versus continuous (e.g., ground motion duration) treatment variables. In contrast to the social sciences (e.g., economics and political science) where causal inference methods are widely used, earthquake engineering problems are often supported by an understanding of the underlying physics (to different degrees). This knowledge can serve as a base of support for the many unavoidable assumptions that are needed when trying to elucidate cause and effect.
The last 5 years have seen a broad embrace of the use of ML methods in the earthquake engineering domain. However, the applications have generally been toward developing predictive models, where capturing associative relationships between the features and outcome variable is sufficient. There is a need for a broad assessment within the domain to understand how ML algorithms can be leveraged in causal inference problems that involve high-dimensional confounding. For ease of illustration, the hypothetical problems presented in the previous section were limited to two confounders. However, a realistic appraisal of these problems can quickly lead to the recognition that the number of confounders is much greater. Consider the hypothetical seismic retrofit problem presented earlier. The confounders can be placed into two categories: those related to the building characteristics and those related to the ground shaking intensity. Besides the age, other building-related confounders include the number of stories, floor area, plan layout, and wall density. On the shaking intensity side, there may be a need to consider multiple types of intensity measures to ensure that the confounding is rigorously addressed. With that many confounders, using ML (specifically double ML) as the "adjustment" strategy almost becomes a necessity. In contrast to their use in predictive models, the application of ML within a causal context is much more suited to the advancement of fundamental knowledge and/or discovery of new insights.
So far, we have highlighted several sub-problems within earthquake engineering that can derive potential benefits when viewed through a causal lens. Some high-level questions were also presented as "opportunities" toward a full embrace of causal inference as a tool for advancing the field. However, like any new area of inquiry, this pursuit does not come without challenges. The proposed paradigm shift is centered around the use of observational data to extract causal information. Therefore, having access to high-quality data is paramount to realizing this goal. As emphasized in earlier sections, a causal claim that is derived from observational data relies heavily on the validity of the conditional ignorability assumption. In other words, the need to adjust for all significant confounders implies that the dataset should contain the necessary information. Because of differences in the data-generating process, this will be more of a challenge for some problem categories than others. Problems that rely on data from physical experiments will be limited by the information that was documented and provided by the investigators. However, for problems involving existing infrastructure, while collecting the necessary data may prove to be a challenge, it is usually possible with the appropriate amount of effort.
In general, the domains that have advanced the field of causal inference (i.e., social science, public health, and statistics) are much more reliant on purely data-driven methods. Comparatively speaking, mechanistic models play a much bigger role in developing solutions to earthquake engineering problems. A major challenge in this regard is establishing methods that integrate mechanistic modeling with causal inference principles and data-driven techniques. This strategy was the centerpiece of the Rabonza et al. (2022) study which used counterfactual analysis to quantify the benefit of disaster risk reduction interventions.
Summary and conclusion
This study is rooted in the hypothesis that the foundational principles, concepts, language, and methodological advancements in the broad area of causal inference can enhance the way that knowledge is gleaned from data-driven earthquake engineering models. To advance this proposition, four categories of problems that are prevalent within the domain are discussed, highlighting the limitations of the state-of-the-art, namely the inability to make quantitative and justifiable causal claims. The problems vary based on the data-generating process (e.g., physical experiments, numerical simulations, and field reconnaissance) and types of investigations (e.g., average benefit of interventions and effect of seismic and structural properties on damage). Next, two distinct but fundamentally related "schools of thought" in the broad area of causal inference are presented. Using hypothetical examples related to the aforementioned problem categories, we begin to forge direct connections with the earthquake engineering domain and highlight limitations and capabilities in terms of advancing the field. The importance of sensitivity analysis as a means to justifying the unavoidable conditional ignorability assumption is also examined.
Through the state-of-the-art review of domain-specific problems, we elucidate opportunities for using causal inference to advance earthquake engineering research and practice. In addition, more general questions that are agnostic to the discussed problem categories are presented as opportunities for future research. One such question is centered around the adaptability of the causal inference frameworks to earthquake engineering problems in general. For instance, some methods are more suitable to dichotomous treatment variables (e.g., retrofitted versus unretrofitted building) while others are more suited to continuous variables (e.g., significant duration of a ground motion). The potential for ML models as a tool for addressing the challenge of high-dimensional confounding in causal earthquake engineering problems is also discussed. Because of the need for information that captures all (or most) possible confounders, the current lack of high-quality data was recognized as a major challenge. Given that the methods from the broader causal inference literature are centered around fully empirical (or data-driven) analyses, we also identified the need to develop integrative causal frameworks that also incorporate engineering models.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
