Abstract
Simulated ground motions have the potential to advance seismic hazard assessments and structural response analyses, particularly for conditions with limited recorded ground motions such as large magnitude earthquakes at short source-to-site distances. However, rigorous validation of simulated ground motions is needed for hazard analysts, practicing engineers, or regulatory bodies to be confident in their use. A decade ago, validation exercises were mainly limited to comparisons of simulated-to-observed waveforms and median values of spectral accelerations for selected earthquakes. The Southern California Earthquake Center (SCEC) Ground Motion Simulation Validation (GMSV) group was formed to increase coordination between simulation modelers and research engineers with the aim of devising and applying more effective methods for simulation validation. Here, we summarize what has been learned in over a decade of GMSV activities, principally reflecting the views of the SCEC research community but also extending our findings and suggestions for a path forward to broader United States and worldwide simulation validation efforts. We categorize different validation methods according to their approach and the metrics considered. Two general approaches are to compare validation metrics from simulations to those from historical records or to those from semi-empirical models. Validation metrics are categorized into ground motion characteristics and structural responses. We discuss example validation studies that have been impactful in the past decade and suggest future research directions. Key lessons learned are that validation is application-specific, our outreach and dissemination need improvement, and much validation-related research remains unexplored.
Introduction
Motivation
Over the last few decades, empirical ground motion databases (Ancheta et al., 2014; Chiou et al., 2008; Goulet et al., 2021a; Mazzoni et al., 2022) have grown exponentially and continue to grow as instrumentation and computer networks enable lower-cost capture of earthquake shaking (Kuyuk and Allen, 2013). However, there remains a paucity of data for large magnitude events recorded at close source-to-site distances and for specific source–site geological features that may be important such as basins adjacent to faults (Graves et al., 2008). Ground motion simulations present an attractive means by which empirical databases for these and other cases can be supplemented. However, before they can be used for hazard assessment or practical engineering applications, these simulations should be carefully validated. The Southern California Earthquake Center (SCEC) Ground Motion Simulation Validation (GMSV) group was established for that purpose in 2011. The objective of this group has been to develop and implement, via collaboration among ground motion modelers and engineering users, testing and rating methodologies for the use of ground motion simulations in engineering applications. The focus of this group has been on validation (consistency with observations) of simulation methodologies, where a methodology consists of a simulation technique or formulation and its calibrated parameters, and not on their verification (simulation codes performing as intended; Bielak et al., 2010). In this article, we describe how over a decade, SCEC GMSV activities have evolved the concept of validation into a broad field of active multidisciplinary research. We summarize the lessons learned and our vision for future research. 
This article principally reflects the views of the SCEC GMSV research community; however, efforts were made to involve national and international researchers and to ensure that our findings and suggestions for a path forward extend to the broader United States and worldwide research communities.
Before the GMSV group was created, validation tended to be performed by the simulation modelers themselves. Such validation most often consisted of simulated-to-observed waveform comparisons for specific events recorded at specific stations. These waveform validations allow seismologists to identify specific wave arrivals over time and to evaluate the amplitudes and numbers of peaks for each type of waveform (Figure 1). Simulated-to-observed waveform comparisons are generally qualitative (e.g. Lee and Chen, 2016) and will likely remain part of validation activities due to their utility for simulation modelers. They do not, however, address the needs of the engineering community, for two main reasons: (1) waveform comparisons require recorded ground motions, which necessarily restricts the validation to a range of magnitudes and distances that may not encompass those required for use in seismic hazard assessments, structural response analyses, and other engineering applications; and (2) waveform comparisons are typically done on displacement or velocity time-series, which may be insufficient to capture important features of ground motions for engineering applications such as frequency content, duration, and relatively high-frequency components of ground shaking.

Qualitative comparisons of recorded (black) and simulated (red) velocity waveforms from the Loma Prieta earthquake, where “128,” “218,” and “ver” are the two orthogonal horizontal components and the vertical component of ground motion (figure from Graves and Pitarka, 2010).
Validation for engineering applications should, therefore, consider a different approach, aiming to capture the general trends of ground motions expected for future events while not rejecting past observations. In other words, one of the key goals of validation is to ensure that the simulations, in aggregate, have properties (as quantified by validation metrics) similar to those of recorded events for similar source-site configurations. This can be achieved in several ways that are described in the next section on a Proposed Categorization of GMSV Methods, examples of which are presented in the ensuing section on Validation Examples and Research Needs. It should also be noted that even after a simulation methodology has been verified and validated for a specific application based on appropriate validation metrics, the accuracy of resulting simulations still depends on the end-user’s skill and expertise to correctly use the numerical codes and produce physically reasonable synthetics. In other words, although simulation methodologies can be validated, there is always a risk of inaccurate results, and therefore the use of a validated and vetted set of simulated ground motions might be encouraged for engineering practitioners.
Evolution of the GMSV group
The SCEC GMSV group was originally created to support the needs of empirical ground motion model (GMM) developers in two major projects: the Southwestern United States (SWUS, GeoPentech, 2015) ground-motion characterization project and the Next Generation Attenuation (NGA-East, Goulet et al., 2021a; 2021b) project for central and eastern North America. This research focused on the validation of SCEC Broadband Platform (BBP) simulations (Maechling et al., 2015; https://github.com/SCECcode/BBP/wiki, last accessed on 13 April 2022), an open-source software system that can generate 0 to 20 Hz seismograms for historical and scenario earthquakes, using their 5%-damped median pseudo-spectral acceleration, Sa, as the validation metric (Dreger et al., 2015; Goulet et al., 2015). The GMSV group then evolved to provide a collaborative framework to facilitate the coordination of SCEC-supported projects by individual researchers interested in validation metrics beyond the median Sa and in other simulations, such as those from the SCEC CyberShake sets (Graves et al., 2011). Over the past decade, these projects have been coordinated through periodic web-conferences, in-person meetings, and workshops, with research engineers (engineers in academia or from agencies such as the U.S. Geological Survey (USGS)) constituting the majority of the group.
Starting in 2015, collaborative projects were designed to: (1) conduct multidisciplinary research that broadens validation methods and tools to meet engineering needs and (2) facilitate direct interaction between GMSV researchers, SCEC ground motion simulation modelers, and potential engineering users of simulations outside of SCEC. Examples of such projects include:
Implementation of various ground motion validation metrics on the SCEC BBP for use by ground motion simulation modelers and research engineers (https://github.com/SCECcode/BBP/wiki; https://files.scec.org/s3fs-public/15136report.pdf, last accessed on 8 June 2022).
Demonstration of the effectiveness of some of the implemented validation metrics on the BBP for two specific engineering applications related to building response analysis (i.e. ASCE 7 building code analysis, and collapse fragility analysis), for example, Zhong et al. (2020).
Development of a set of scenario-based simulations from the SCEC BBP vetted as appropriate for consideration by practicing engineers (see Goulet et al., 2018a; 2018b, for description of the data set and the tool, respectively).
Collaboration with several other SCEC groups, including the BBP and CyberShake simulation groups, and the committee for Utilization of Ground Motion Simulations (UGMS) to apply CyberShake motions for the seismic design of tall buildings in Los Angeles (Crouse et al., 2018) and to develop a vetted subset of records appropriate for practicing engineers (Baker et al., 2021).
In this article, we identify key lessons from the work completed to date, and on that basis, propose a categorization of different validation methods. This categorization is useful for conceptualizing the body of knowledge in this field and for guiding the planning of future work to meet critical user needs. Subsequent sections introduce the categorization, provide examples, and summarize the key lessons learned.
Proposed categorization of GMSV methods
As noted in the introduction, simulated ground motions are needed for scenarios poorly represented in empirical databases. To meet this need, simulation methods are devised that aim to capture the physics of ground motion generation based, in part, on previous observations. The simulation methods can be validated for the parameter space well-represented by the available recorded motions. Given that the need for simulations largely lies beyond the range of recordings, the value of ground motion simulations can ultimately be realized only by building confidence both that the simulations accurately capture ground motion characteristics within the range of empirical data and that they reliably extrapolate ground motion characteristics to unobserved earthquake scenarios. In the following, a validation method consists of a validation approach and one or more validation metrics.
Table 1 summarizes prior research by connecting validation approaches (columns) to validation metrics (rows). Examples of prior studies are provided in each of the cells; we have attempted to capture notable citations for diverse application regions but make no claim of capturing all applicable published works. Our intent in presenting Table 1 is to organize prior research on simulation validation in a systematic manner. Here, we categorize validation metrics into two main groups: ground motion characteristics and structural responses. Bradley et al. (2017) also discuss and categorize validation of simulations in a similar manner by considering ground motion and structural response characteristics, but with a focus on spatial extent (i.e. broad regions to site-specific). Their findings are presented in a matrix form whereby different validation metrics (waveforms to structural responses) are judged to be suitable for practice in problems having different spatial extents. Alternative categorizations of validation metrics that may involve validation applications have been suggested in the past. Although not adopted in this article, they are discussed at the end of this section for consideration in future expansions of Table 1, which could lead to the development of validation gauntlets as described below.
Categorizing GMSV methods by validation approaches and validation metrics and example publications
a Includes median, dispersion, and correlation of response spectral IMs (e.g. spectral acceleration, Sa) and peak parameters (e.g. peak ground acceleration, PGA).
b Includes IMs other than response spectral and peak parameters. These could be scalar (e.g. duration, Fourier amplitude), combination of scalars (e.g. goodness-of-fit), or evolutionary (i.e. time-varying) parameters.
c Includes engineering demand parameters (EDPs) from structural analyses and decision variables (DVs) such as failure probabilities.
Validation approaches
Validation can be undertaken in different ways. One approach for validation uses historical recorded ground motion data directly (see Column A in Table 1). This involves generating simulations for well-recorded events, computing motions at the locations of recording stations, and then comparing the two motion sets. For some validation metrics (e.g. median Sa), semi-empirical models are available that represent global or regional trends of ground motions for parameterized earthquake scenarios (e.g. magnitude, distance, site condition). Accordingly, a second validation approach involves comparing predictions from such semi-empirical models to metrics of simulated motions (Column B in Table 1). When conducting such validations, it is important to consider the range for which semi-empirical models are constrained by data. In summary, there have been two approaches for comparing simulations to “reality”: (1) direct comparison to historical records and (2) comparison to semi-empirical models.
Validation metrics
Different validation metrics are organized in the rows of Table 1 and follow two main categories: (1) ground motion characteristics and (2) structural responses. The ground motion characteristics category is relevant for many engineering applications, particularly for GMM development and hazard assessment. The structural response category is relevant for one of the most common applications of simulated ground motions, response history analyses performed for structural design or structural assessment purposes.
Ground motion characteristics
Ground motion characteristic metrics include: (1) waveforms (i.e. time-series, see row 1 in Table 1), as introduced above, (2) response spectral intensity measures (IMs) such as Sa and amplitude parameters such as peak ground acceleration (PGA) or velocity (PGV) (see row 2 in Table 1), and (3) any other IMs computed directly from the seismograms, such as duration (see row 3 in Table 1).
Waveform comparisons (see row 1 in Table 1) are usually applied by the seismologists who develop the simulation models, as mentioned above, and are generally qualitative. Row 2 refers to validation focused on Sa ground motion metrics, which have been the most commonly used parameters in validation studies because of their frequent utilization by both semi-empirical ground motion modelers and practicing engineers. Most often, these IM validations focus on the median values of Sa and its scaling with predictive parameters such as distance (Dreger et al., 2015; Lee et al., 2022; Pitarka et al., 2002, 2017, 2020; Star et al., 2011). However, researchers have also carried out validations of the dispersion (e.g. standard deviation) of Sa (Iwaki et al., 2017; Lee et al., 2022; Nweke et al., 2022; Star et al., 2011), inter-period correlations of Sa (Burks and Baker, 2014), and spatial correlations of Sa (Chen and Baker, 2019). These additional Sa metrics are important for certain engineering applications such as probabilistic seismic hazard analysis (PSHA), nonlinear inelastic structural response analysis, and distributed infrastructure risk assessment (DIRA).
Additional validation metrics that characterize ground motion waveforms beyond Sa have also been explored (see row 3 of Table 1). They can be grouped into three sub-categories: (3.1) scalar parameters such as Arias intensity (
Structural responses
A category of validation metrics considered by the GMSV group relates to the effects of ground motions on structural responses, commonly known as engineering demand parameters (EDPs). Because ground motions represent a complex combination of waveforms in three dimensions, their effects on structures provide a useful means by which one can evaluate critical ground motion attributes. In this context, the term “structure” includes: (1) relatively simple idealized structural models (see row 4 in Table 1) such as inelastic single-degree-of-freedom (SDoF) systems and (2) more complex and realistic structural models (see row 5 in Table 1) such as multi-degree-of-freedom (MDoF) models of buildings for which we may track story drift ratios, or models of bridges or slopes. Geotechnical structures could in principle be included as well, although validation research for such systems has been very limited to date. We recognize that elastic SDoF oscillators could also be classified as structures, but we elected to discuss Sa as a ground motion parameter in the previous section due to its frequent utilization in semi-empirical GMMs.
Studies utilizing the responses of idealized structures have compared simulated waveforms to both recorded waveforms (see column A in Table 1) and to semi-empirical models (see column B in Table 1) of those responses, while studies utilizing the responses of more realistic structures have only compared simulated to recorded waveforms (see column A in Table 1).
Quantitative statistics (e.g. median, dispersion, correlation) can be developed for any scalar validation metric listed above (e.g. an IM or an EDP) when using a suite of historical earthquake records and a comparable suite of simulations. Such suites of recorded and simulated motions can be selected based on a historical earthquake at specific locations, a hypothetical earthquake scenario at hypothetical sites (e.g. a given magnitude, distance, and site condition) to generalize the validation, or they can be based on motions that are conditioned on certain similarities (e.g. similar Sa values or similar durations of motions) for consistency with typical building response applications. For a limited number of EDP metrics (e.g. response of an inelastic SDoF oscillator with a particular post-yield stiffness), empirical models exist as an alternative validation approach. However, this is rare and typically the validation of structural responses involves comparing statistics of response metrics for a suite of simulated records, such as the median and dispersion (aleatory variability) of EDPs, to the corresponding values derived from recorded motions.
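The suite-level comparison described above can be sketched in a few lines. The following is a minimal illustration, not the procedure of any cited study: it assumes the metric is positive and approximately lognormal, so that the median is estimated as the exponential of the mean log value and the dispersion as the standard deviation of the logs. The function names and the drift values are hypothetical.

```python
import math
import statistics


def log_stats(values):
    """Return (median, dispersion) of a positive scalar metric.

    Assumption: the metric is treated as lognormal, so the median is the
    exponential of the mean of the logs and the dispersion is the sample
    standard deviation of the logs.
    """
    logs = [math.log(v) for v in values]
    return math.exp(statistics.mean(logs)), statistics.stdev(logs)


def compare_suites(recorded, simulated):
    """Ratios of simulated-to-recorded median and dispersion."""
    med_rec, sig_rec = log_stats(recorded)
    med_sim, sig_sim = log_stats(simulated)
    return {
        "median_ratio": med_sim / med_rec,
        "dispersion_ratio": sig_sim / sig_rec,
    }


# Hypothetical peak story drift ratios; illustrative numbers only.
recorded_edp = [0.010, 0.020, 0.040]
simulated_edp = [0.020, 0.030, 0.045]

result = compare_suites(recorded_edp, simulated_edp)
print(result)
```

For these made-up values the simulated median is 1.5 times the recorded one, while the simulated log dispersion is about 0.58 times the recorded one. A real validation would also need to account for sampling uncertainty in both statistics before judging such differences.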
Validation applications
In lieu of categorizing validation metrics into ground motion characteristics (see rows 1 to 3 in Table 1) and structural responses (see rows 4 and 5 in Table 1), we recognize that other categorizations might be equally useful. For example, one could instead categorize validation metrics according to the applications of simulated ground motions, such as: (1) GMM development and PSHA, (2) building response history analysis (RHA), and (3) DIRA. In this case, ground-motion characteristics such as Sa medians and dispersions could fall in the “GMM development and PSHA” category, inter-period Sa correlations in the “RHA” category to capture the spectral shape, and spatial Sa correlations in the “DIRA” category. Conversely, structural responses such as inelastic spectral displacement, Sdi, could be categorized in the “GMM development” category when the goal is a simulation-informed model for Sdi medians and dispersions; or in “RHA” or “DIRA” categories when Sdi serves as a proxy for more realistic structures. Other validation metrics such as waveforms might not purposefully address any application or might attempt to address multiple. While we have chosen to categorize validation metrics into ground motion characteristics and structural responses, we recognize the importance of the applications of simulated ground motions in selecting validation metrics. Categorization by application can be particularly useful when the goal is to develop validation gauntlets for specific applications, where a validation gauntlet consists of several validation metrics. Developing application-specific validation gauntlets is a topic that was explored by the GMSV group but was not fully developed in the past decade.
Validation examples and research needs
The past two decades have seen significant growth in the capabilities of ground motion simulation models and in the availability of ground motions derived from them. As described above, the validation work accompanying these developments has focused on two main applications—the estimation of ground motion characteristics and the assessment of structural responses. The sections that follow describe effective (and in some cases, not so effective) examples of validation in these domains. The procedures referenced here were summarized and categorized in the previous section and Table 1. This section provides more specific information about each reference and elaborates on the lessons learned.
Ground motion characteristics
An essential component of earthquake engineering is the prediction of ground shaking from future earthquakes. Tools for estimating ground motions, when combined with models of seismic sources, facilitate PSHA, which underpins much of earthquake engineering practice (McGuire, 2004). In most PSHA applications, ground motion IMs (typically, median and dispersion of Sa) are estimated using physics-informed regression equations that are developed, in part, using empirical data (i.e. semi-empirical models). Known as GMMs, these equations are available for diverse tectonic regimes (active crustal, stable continental, subduction) and are constrained using large empirical databases (Ancheta et al., 2014; Chiou et al., 2008; Goulet et al., 2021a; Mazzoni et al., 2022). While the databases are large and growing rapidly with time, there has long been a need for information from alternative sources (i.e. those not directly related to ground motion recordings). This need arises because GMMs are used to predict IMs for conditions that are critical to hazard but might not have been experienced yet. There are many data gaps in empirical databases, including large-magnitude earthquakes, earthquakes in particular regions with high hazard but sparse recordings (e.g. interface subduction events in the U.S. Pacific Northwest), and specific source–site configurations that may affect path and site effects (e.g. hanging wall effects, directivity effects, basin effects). 
Simulated ground motions hold great promise to fill those data gaps, and there are several examples where they have been applied (see Day et al., 2008, for modeling basin effects; Donahue and Abrahamson, 2014, for modeling hanging wall effects; Furumura and Chen, 2005; Iwaki et al., 2016a, 2016b; Pitarka et al., 2002; and Pitarka et al., 2020, for large-magnitude historical events in Japan; Frankel et al., 2018, for modeling large interface subduction events in Cascadia; Infantino et al., 2020, for modeling ground motions in Istanbul from large-magnitude events on the North Anatolian Fault; Goulet et al., 2021b, for modeling large magnitude events in central and eastern North America).
While the promise is clear, the ground motions provided by a particular simulation should meet certain basic criteria before they are relied upon to develop GMMs or used directly in seismic assessments. The remainder of this section will focus on the checks that have been used in past validation studies and will comment on their usefulness in engineering applications.
Waveforms
Waveform validations (see row 1 in Table 1) have traditionally been qualitative, as illustrated in Figure 1, and hence are less useful in this context. These comparisons are generally made to historical recorded waveforms (see column A in Table 1). The second validation approach (see column B in Table 1) has not been applied in past work, but in theory it could be applied using GMMs for stochastic processes that predict acceleration waveforms as functions of earthquake scenarios (Rezaeian and Der Kiureghian, 2010; Yamamoto and Baker, 2013).
Response spectral intensity measures and peak parameters
Response spectral IMs and peak parameters (see row 2 of Table 1) have been the subject of important validation exercises over the past decade (Dreger et al., 2015). A robust and adaptable validation technique is to calculate residuals for quantifiable validation metrics with respect to regionally applicable GMMs (see column B of Table 1):
where
where
An effective validation technique is to examine the scaling of IMs from simulated waveforms with respect to widely used predictor variables in GMMs, such as
with residuals partitioned in this manner (from mixed effects analysis; Bates et al., 2015), magnitude-scaling can be checked by plotting

Within-event residuals for

Derived site response for spectral acceleration (Sa) at 5.0 s from multiple modest-magnitude simulated events and sites in southern California.
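The residual-based checks discussed above can be illustrated with a small sketch. It assumes that a residual is the natural-log difference between a simulated IM and the GMM median for the same scenario, and it replaces the mixed-effects regression cited in the text with simple averaging: an overall bias, one term per event, and a within-event remainder. All names and numbers are hypothetical, and the notation is not taken from the article.

```python
import math
from collections import defaultdict


def ln_residuals(records):
    """records: iterable of (event_id, simulated_im, gmm_median_im)."""
    return [(ev, math.log(sim) - math.log(gmm)) for ev, sim, gmm in records]


def partition(residuals):
    """Split residuals into overall bias, event terms, and within-event parts.

    Assumption: simple averaging stands in for a true mixed-effects regression.
    """
    by_event = defaultdict(list)
    for ev, r in residuals:
        by_event[ev].append(r)
    bias = sum(r for _, r in residuals) / len(residuals)
    # Event term: mean residual of the event relative to the overall bias.
    event_terms = {ev: sum(rs) / len(rs) - bias for ev, rs in by_event.items()}
    # Within-event residual: what remains after removing bias and event term.
    within = [(ev, r - bias - event_terms[ev]) for ev, r in residuals]
    return bias, event_terms, within


# Hypothetical data: GMM median fixed at 1.0 so residuals equal ln(simulated).
records = [
    ("A", math.exp(0.2), 1.0),
    ("A", math.exp(0.4), 1.0),
    ("B", math.exp(-0.1), 1.0),
    ("B", math.exp(0.1), 1.0),
]
bias, event_terms, within = partition(ln_residuals(records))
print(bias, event_terms, within)
```

In this sketch, event terms would be plotted against magnitude to check magnitude scaling, and within-event residuals against distance or site parameters to check path and site scaling. With unbalanced data a proper mixed-effects fit would weight events differently than the plain averages used here.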
A closely related objective of validation is to check the dispersion of IMs from simulated waveforms. In principle, the dispersion of

Total standard deviations of simulated and recorded ground motions from small-magnitude events in New Zealand, plotted versus period T. The right-side sub-figure shows values for peak ground acceleration (PGA), peak ground velocity (PGV), cumulative absolute velocity (CAV), Arias intensity (AI), and two measures of duration
Besides the direct prediction of medians and dispersions of IMs, some engineering applications require information on the correlation structure of ground motions. Two examples of important correlations are: (1) those between Sa values for different periods and (2) spatial correlations for Sa for a given period at different sites. Models for these correlations are derived from data by, for example, Baker and Jayaram (2008) for inter-period Sa correlations and by Jayaram and Baker (2009) for spatial Sa correlations. Such empirical models provide baselines that can be used to validate the correlation structure of simulated ground motions (see column B in Table 1). Burks and Baker (2014) performed inter-period Sa correlation comparisons. Chen and Baker (2019) performed a comparison for the case of spatial Sa correlation, an example result of which is shown in Figure 5. These results show that the spatial correlation is stronger in the simulations than in empirical data for separation distances less than about 40 km. Similar inter-period and spatial correlation studies have also been performed for Fourier spectra, which fall under the category of other IMs in the next section.

Spatial correlation of Sa (3.0 s) ground motions as simulated for two scenario events (San Andreas, Puente Hills) and as provided by four empirical models (figure from Chen and Baker, 2019).
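As a rough illustration of the inter-period comparison, the sketch below estimates Pearson correlation coefficients between residuals at different periods for a set of simulated records. It assumes the residuals have already been computed and normalized; the residual values are invented, and the comparison against an empirical correlation model is left out because no model coefficients are reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def interperiod_correlations(residuals_by_period):
    """Map each period pair (T1, T2), T1 < T2, to its correlation coefficient."""
    periods = sorted(residuals_by_period)
    return {
        (t1, t2): pearson(residuals_by_period[t1], residuals_by_period[t2])
        for i, t1 in enumerate(periods)
        for t2 in periods[i + 1:]
    }


# Hypothetical residuals at three periods (s), one value per record.
sim_residuals = {
    0.5: [1.0, 2.0, 3.0, 4.0],
    1.0: [2.0, 1.0, 4.0, 3.0],
    2.0: [-1.0, -2.0, -3.0, -4.0],
}
rho = interperiod_correlations(sim_residuals)
print(rho)
```

A validation would then compare each estimated coefficient with the value given by an empirical correlation model for the same period pair. The same `pearson` helper could be applied to residuals at pairs of sites binned by separation distance to examine spatial correlation.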
Another important attribute of ground motions is the difference in ground motion amplitudes across all horizontal orientations of ground motion components (i.e. polarization of ground motions). This is needed in engineering applications because PSHA typically derives ground motion amplitudes for the median direction of ground motion, for example, SaRotD50 (Boore, 2010), whereas some building codes require the use of the maximum direction of ground motion, SaRotD100. Shahi and Baker (2014) derived maximum-to-median direction ratios from empirical data, SaRotD100/SaRotD50, finding values near 1.1 for short periods and 1.4 for long periods. Burks and Baker (2014) computed similar ratios from BBP simulations, finding that these simulations tend to be more polarized than recordings at long periods, but give varying results at short periods.
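The orientation dependence behind these ratios can be sketched as follows. This simplified example rotates two horizontal time series through non-redundant angles and takes the median and maximum of the peak amplitude over all orientations. It assumes the rotation is applied to the input series directly; for SaRotD50 and SaRotD100 the same rotation would be applied to oscillator response histories at each period. The function name and test signals are hypothetical.

```python
import math
import statistics


def rotd_peaks(comp1, comp2, step_deg=1):
    """Return (RotD50, RotD100) of the peak amplitude of two horizontal components.

    Assumption: rotation is applied to the input series directly; for spectral
    values the inputs would be oscillator response histories instead.
    """
    peaks = []
    for deg in range(0, 180, step_deg):
        theta = math.radians(deg)
        c, s = math.cos(theta), math.sin(theta)
        # Peak absolute amplitude of the motion rotated to this orientation.
        peaks.append(max(abs(a * c + b * s) for a, b in zip(comp1, comp2)))
    return statistics.median(peaks), max(peaks)


# Hypothetical motions sampled at 360 points over one cycle.
t = [2.0 * math.pi * k / 360 for k in range(360)]
linear = ([math.sin(x) for x in t], [0.0] * len(t))  # fully polarized
circular = ([math.cos(x) for x in t], [math.sin(x) for x in t])  # no preferred direction

for name, (h1, h2) in (("linear", linear), ("circular", circular)):
    rotd50, rotd100 = rotd_peaks(h1, h2)
    print(name, round(rotd100 / rotd50, 3))
```

In this idealized setting the ratio is the square root of 2 (about 1.41) for the fully polarized motion and 1 for the circular motion, which brackets the range in which the maximum-to-median ratio of a simulation can be compared with that of recordings.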
Other intensity measures
Several ground motion IMs other than Sa have also been considered as validation metrics (see row 3 of Table 1). As introduced in the previous section on proposed categorization of GMSV methods, they can be categorized into three sub-categories: (3.1) scalar parameters, (3.2) combinations of two or more scalar parameters, and (3.3) non-scalar evolutionary (i.e. time-varying) parameters.
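As an illustration of the first sub-category, two commonly used scalar parameters, Arias intensity and significant duration, can be computed directly from an acceleration time-series. The sketch below assumes uniform sampling, acceleration in m/s², rectangle-rule integration, and the 5% to 95% definition of significant duration; the function names and the test signal are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def arias_intensity(acc, dt):
    """Arias intensity (m/s) of an acceleration series in m/s^2."""
    return math.pi / (2.0 * G) * sum(a * a for a in acc) * dt


def significant_duration(acc, dt, lower=0.05, upper=0.95):
    """Time between reaching the lower and upper fractions of cumulative a^2."""
    total = sum(a * a for a in acc)
    cumulative = 0.0
    t_lower = t_upper = None
    for i, a in enumerate(acc):
        cumulative += a * a
        fraction = cumulative / total
        if t_lower is None and fraction >= lower:
            t_lower = i * dt
        if t_upper is None and fraction >= upper:
            t_upper = i * dt
            break
    return t_upper - t_lower


# Hypothetical record: 10 s of constant-amplitude shaking sampled at 100 Hz.
dt = 0.01
acc = [1.0 if k % 2 == 0 else -1.0 for k in range(1000)]

print(arias_intensity(acc, dt), significant_duration(acc, dt))
```

For this constant-amplitude signal the energy builds up linearly, so the 5% to 95% duration is 90% of the record length, about 9 s. Applying the same functions to paired simulated and recorded motions yields the scalar values whose residuals can then be examined in the same way as for Sa.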
An example of validation using a scalar parameter was presented by Afshari and Stewart (2016), who compared simulated-to-GMM scaling of ground motion significant duration with common predictor variables using a residuals analysis approach as described above (Equations 1 and 3). The study used BBP simulations and identified trends opposite to those for ground motion amplitudes. For example, positive biases were found in Sa (Star et al., 2011), whereas negative biases were found in duration (for the same simulation method). This reflects the negative correlation between ground motion duration and amplitude (Bradley, 2011). Validations of other scalar parameters such as
The scalar parameter
Fourier phase and Fourier amplitude spectrum (FAS) have been explored as validation metrics by Bayless and Abrahamson (2018). They found that FAS inter-period correlations are essential for accurate estimations of the variability of structural responses. They evaluated the SCEC BBP simulations and found that the inter-period correlations at short periods were generally too low; some BBP simulation models showed promise at long periods, but performance varied between simulation methods. The subsequent evaluations of 3D SCEC CyberShake (Los Angeles, California) and QuakeCore CyberShake (New Zealand) simulations by Bayless and Abrahamson showed substantial improvements in the inter-period FAS correlations at long periods relative to the one-dimensional (1D) BBP simulations (unpublished findings from SCEC and QuakeCore projects, obtained from personal communications with Bayless, 2022). Later, Song et al. (2021) investigated the effects of pseudo-dynamic source models on the inter-period correlation of ground motions by simulating the 1994 Northridge, California, earthquake, using the BBP. They found that the cross-correlation between earthquake source parameters in the pseudo-dynamic source models significantly affects inter-frequency ground motion correlations for frequencies around 0.5 Hz, whereas the effect is not apparent at lower and higher frequencies. Furthermore, Wang et al. (2019) imposed inter-period correlations on the SDSU BBP simulation methodology as a post-processing procedure and Wang et al. (2021) extended this to spatial correlations.
Two or more scalar parameters can be combined to develop validation metrics. An example is the goodness-of-fit (GOF) measure proposed by Anderson (2004), which is used for comparisons to historical records (see column A in Table 1). The IMs considered by Anderson consisted of 10 different characteristics: peak acceleration, peak velocity, peak displacement, Arias intensity, the integral of velocity squared, Fourier spectrum, and acceleration response spectrum on a frequency-by-frequency basis, the shape of the normalized integrals of acceleration and velocity squared, and the cross-correlation. This method has been used for event-specific validations by Smerzini and Villani (2012) and Paolucci et al. (2015). Later, Olsen and Mayhew (2010) built upon this GOF and applied it to the 2008 Chino Hills earthquake records. Rezaeian et al. (2015) proposed to use the ratio between
Evolutionary (non-scalar) validation metrics that reflect different ground motion characteristics at different times or at different frequencies can also be used. Rezaeian and Der Kiureghian (2008, 2010, 2012) defined several such ground motion metrics that describe variations of waveform intensity and frequency content with time. Three were considered for validation purposes by Rezaeian et al. (2015): (1) evolution of intensity (amplitude of acceleration), (2) evolution of the predominant frequency, and (3) evolution of motion bandwidth. The last two (predominant frequency and bandwidth) together represent the frequency content of the ground motion waveform. An example application of these parameters in a recorded-to-simulated comparison is presented in Figure 6, which shows that more complex ground motion features can be considered this way compared to the use of scalar parameters, without the subjectivity of waveform comparisons.

A recorded ground motion from the 1994 Northridge earthquake and a corresponding simulated motion, (a) original acceleration time-series, (b) simulation corrected to have the same time-sampling as the recorded motion and time-shifted to be synchronized with the recorded motion, (c) Fourier spectra of the recorded and simulated motions, (d) validation metric 1 (evolution of intensity), (e) validation metric 2 (evolution of predominant frequency), (f) validation metric 3 (evolution of motion bandwidth).
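To make the evolutionary metrics more concrete, the sketch below computes two time-dependent curves in the spirit of the first two metrics: a normalized running integral of squared acceleration (evolution of intensity) and a running count of zero-level up-crossings, whose slope approximates the predominant frequency. The exact definitions used by Rezaeian et al. (2015) may differ; the function names and the test signal are assumptions made for illustration only.

```python
import math

def cumulative_intensity(acc, dt):
    """Running integral of squared acceleration, normalized to end at 1."""
    total = 0.0
    running = []
    for a in acc:
        total += a * a * dt
        running.append(total)
    if total == 0.0:
        raise ValueError("record has zero energy")
    return [r / total for r in running]

def cumulative_upcrossings(acc):
    """Running count of zero-level up-crossings (negative to non-negative)."""
    count = 0
    counts = [0]
    for prev, cur in zip(acc, acc[1:]):
        if prev < 0.0 <= cur:
            count += 1
        counts.append(count)
    return counts

# Hypothetical test signal: a 2 Hz sine wave, 5 s long, sampled at 100 Hz.
dt = 0.01
acc = [math.sin(2.0 * math.pi * 2.0 * i * dt + 0.3) for i in range(500)]
intensity = cumulative_intensity(acc, dt)
counts = cumulative_upcrossings(acc)

# Average up-crossing rate over the record, an estimate of the
# predominant frequency in Hz for this stationary signal.
mean_rate = counts[-1] / (len(acc) * dt)
```

For a real record, the two curves would be plotted against time for the recorded and simulated motions, as in panels (d) and (e) of the figure above.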
A challenge associated with the use of evolutionary parameters and plots as shown in Figure 6 is how results for many ground motions can be synthesized to provide insights into the performance of a particular simulation method. To address this, Rezaeian et al. (2015) defined scalar parameters to serve as proxies for the three evolutionary metrics: (1)
Structural responses
The second category of validation metrics is based on responses of computational models of structures, whether the models are relatively idealized (see row 4 in Table 1) or more realistic (see row 5 in Table 1). Corresponding validation studies typically use simulated and recorded ground motions as inputs for the RHA of structural models. These studies are motivated in part by the possibility of using simulated motions in RHA in engineering practice. Such applications are permitted in the American Society of Civil Engineers (ASCE) Minimum Design Loads for Buildings and Other Structures (ASCE/SEI, 2022), which specifies: “where the required number of recorded ground motions [for RHA] is not available, it shall be permitted to supplement the available records with simulated ground motions.” The subsections below summarize representative studies that used structural responses directly as validation metrics, as a basis for the related lessons learned and future research needs offered in the subsequent section.
Responses of idealized structural models
As proxies for more realistic models, idealized structural models have been used for validation metrics in a number of GMSV studies. Unlike Sa, the validation metrics described here explicitly reflect the nonlinear inelastic or multimodal behavior of structures subjected to strong ground motions. Even so, their relative simplicity facilitates GMSV comparisons for a broad range of structural characteristics and larger numbers of ground motions than when more realistic (nonlinear inelastic and multimodal) models are used.
Most of the GMSV studies in this subcategory have used the peak displacement response of bilinear inelastic SDoF oscillators, generally referred to as inelastic spectral displacement, Sdi. Similar to Sa, Sdi is typically computed for a range of vibration periods and a standard damping ratio of 0.05. However, Sdi for a bilinear oscillator also requires specification of a yield displacement or strength, and a ratio of post-yield to elastic stiffness (the “strain-hardening” ratio). Typically, a range of yield displacements/strengths is specified, with a single strain-hardening ratio.
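The following sketch shows one way the peak displacement of a bilinear SDoF oscillator could be computed from an acceleration time series. It assumes a unit-mass oscillator with kinematic hardening, integrated with the explicit central-difference method; the function name, default parameter values, and the input pulse are hypothetical, and the cited studies may have used different integration schemes and hysteretic rules.

```python
import math

def peak_bilinear_displacement(ag, dt, period, uy, alpha=0.02, zeta=0.05):
    """Peak displacement of a unit-mass bilinear SDoF oscillator.

    ag     ground acceleration samples (m/s^2)
    dt     time step (s)
    period elastic vibration period (s)
    uy     yield displacement (m); math.inf gives the elastic response
    alpha  post-yield to elastic stiffness ratio (strain hardening)
    zeta   viscous damping ratio
    """
    if dt >= period / math.pi:
        raise ValueError("time step too large for central-difference stability")
    omega = 2.0 * math.pi / period
    k = omega ** 2                       # elastic stiffness per unit mass
    c = 2.0 * zeta * omega               # damping coefficient per unit mass
    k_hat = 1.0 / dt ** 2 + c / (2.0 * dt)
    a_coef = 1.0 / dt ** 2 - c / (2.0 * dt)
    q_max = (1.0 - alpha) * k * uy       # yield force of the elastoplastic part
    u_prev = 0.5 * dt ** 2 * (-ag[0])    # fictitious displacement at step -1
    u, q, peak = 0.0, 0.0, 0.0
    for ag_i in ag:
        # Restoring force: linear hardening spring plus elastoplastic spring.
        fs = alpha * k * u + q
        u_next = (-ag_i - a_coef * u_prev + 2.0 * u / dt ** 2 - fs) / k_hat
        # Update the elastoplastic force and clamp it at its yield level.
        q = q + (1.0 - alpha) * k * (u_next - u)
        q = max(-q_max, min(q_max, q))
        u_prev, u = u, u_next
        peak = max(peak, abs(u))
    return peak

# Hypothetical input: 4 s of a 1 Hz sine pulse followed by 2 s of free vibration.
dt = 0.01
ag = [2.0 * math.sin(2.0 * math.pi * i * dt) for i in range(400)] + [0.0] * 200

sde = peak_bilinear_displacement(ag, dt, period=0.5, uy=math.inf)  # elastic
uy = sde / 4.0                                                     # relative strength of 4
sdi = peak_bilinear_displacement(ag, dt, period=0.5, uy=uy)        # inelastic
ratio = sdi / sde
```

Repeating the calculation over a grid of periods and yield displacements for each ground motion in a suite would give the Sdi (or Sdi/Sde) values that are then compared between simulated and recorded suites.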
Early GMSV studies with Sdi compared values from suites of simulated and recorded ground motions (see column A in Table 1). For example, Bazzurro et al. (2004) compared Sdi values from seven different simulation methodologies with corresponding values from ground motions recorded at 20 stations during the 1994

Ratios of median bilinear inelastic spectral displacements from suites of simulated (according to seven different methodologies) and recorded (“real”) ground motions. The differences are largely caused by differences in Sa (figure from Bazzurro et al., 2004).
Following the development of empirical models for Sdi—including one for Sdi/Sde by Tothong and Cornell (2006) that is a function of earthquake magnitude and source-to-site distance—Burks and Baker (2014) compared Sdi/Sde values from suites of simulated ground motions to an empirical model (see column B in Table 1), in addition to comparisons with historical records (see column A in Table 1). Similar to the two example GMSV studies described in the preceding paragraph, Burks and Baker (2014) found Sdi/Sde differences at a relatively short vibration period (0.8 s) for two of three different simulation methodologies, as shown in Figure 8. They demonstrated that the differences could be at least partially explained by differences in the shape of the elastic response spectrum—the “spectral shape”—at longer periods. This is due to the effective period of an inelastic oscillator being longer than its initial (elastic) period. By subsequently selecting suites of ground motions with comparable median elastic response spectra, Burks and Baker (2014) found no substantial Sdi/Sde differences at the short period considered (0.8 s). However, at other short and long periods (0.3 s and 3 s in this case) where the variability of the spectral shape at longer periods differed between simulated and recorded ground motions, the Sdi/Sde values also differed.

Geometric mean bilinear inelastic-to-elastic spectral displacement ratios, Sdi/Sde, from suites of simulated and recorded ground motions and an empirical model (Tothong and Cornell, 2006), for two vibration periods (a) 0.8 and (b) 1.6 s, and a range of relative yield strengths on the horizontal axes (10 for the lowest strength; figure from Burks and Baker, 2014).
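The period lengthening of an inelastic oscillator can be illustrated with a short calculation. The sketch below uses the secant stiffness at peak displacement, which is one common definition of an effective period; the source does not state which definition Burks and Baker (2014) had in mind, so this choice, like the function name, is an assumption.

```python
import math

def secant_period(elastic_period, ductility, alpha):
    """Secant (effective) period of a bilinear oscillator.

    ductility is peak displacement divided by yield displacement, and
    alpha is the post-yield to elastic stiffness ratio.
    """
    if ductility <= 1.0:
        return elastic_period  # oscillator has not yielded
    # Ratio of secant stiffness at peak displacement to elastic stiffness.
    stiffness_ratio = (1.0 + alpha * (ductility - 1.0)) / ductility
    return elastic_period / math.sqrt(stiffness_ratio)

# An elastic-perfectly-plastic oscillator with a 0.8 s elastic period that
# reaches a ductility of 4 has a secant period twice as long.
t_eff = secant_period(0.8, ductility=4.0, alpha=0.0)
```

This is why Sa ordinates at periods longer than the elastic period, and hence the spectral shape, influence inelastic displacement demands.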
A more realistic but still idealized structural model is a trilinear inelastic SDoF oscillator with a peak displacement/strength beyond which the oscillator stiffness is negative. Burks and Baker (2014) used this trilinear model for an additional validation metric, structural collapse capacity (SaC), defined as the Sa at which a structure collapses. Unlike for Sdi and Sdi/Sde, no empirical models exist for SaC, and hence the GMSV study compared values from suites of simulated and recorded ground motions (see column A in Table 1). To search for SaC differences other than those resulting from differences in Sa, Burks and Baker (2014) used the suites with comparable median elastic response spectra mentioned in the preceding paragraph. The results across three trilinear inelastic SDoF oscillators representative of mid-rise concrete frame buildings are shown in Figure 9. As in the Sdi/Sde comparisons, they found SaC differences when the dispersions of the spectral shapes at longer periods differed between the simulated and recorded suites, but not when the dispersions (and medians) were comparable.

Structural collapse capacity distributions from suites of simulated and recorded ground motions with comparable median elastic response spectra, for trilinear inelastic SDoF oscillators of three periods (a) 0.3 s, (b) 0.8 s, and (c) 2 s, representative of mid-rise concrete frame buildings (figure from Burks and Baker, 2014).
Yet another idealized structural model was considered by Galasso et al. (2013), namely a linear elastic MDoF continuum system comprising a combination of a flexural cantilever beam coupled with a shear cantilever beam. Whereas the inelastic SDoF oscillators discussed above are better models for shorter buildings subjected to strong ground motions, elastic MDoF models are more useful for taller buildings subjected to moderate ground motions. In the latter case, multiple modes of vibration substantially contribute to the response, more so than inelasticity. As part of their GMSV study, Galasso et al. (2013) used the maximum interstory drift ratio (MIDR) response of 96 elastic MDoF continuum systems, and floor acceleration spectra of 4 additional elastic MDoF systems, as validation metrics. The validation consisted of comparing values from simulated and recorded ground motions (see column A in Table 1) from the 1979

Ratios of (a) logarithmic means and (b) standard deviations of maximum interstory drift ratios (MIDR) from suites of simulated and recorded ground motions, for linear elastic MDoF continuum systems of a range of fundamental vibration periods (figure from Galasso et al., 2013).
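Comparisons of this kind reduce to two summary statistics per structural model. The sketch below assumes that the "ratio of logarithmic means" is the ratio of geometric means (the exponential of the difference in mean log response) and that the dispersion ratio is the ratio of standard deviations of the log responses; the exact definitions in Galasso et al. (2013) may differ, and the function names and values are hypothetical.

```python
import math
import statistics

def log_stats(values):
    """Mean and sample standard deviation of the natural logs of values."""
    logs = [math.log(v) for v in values]
    return statistics.mean(logs), statistics.stdev(logs)

def simulated_to_recorded_ratios(simulated, recorded):
    """Return (geometric-mean ratio, log-standard-deviation ratio)."""
    mean_sim, std_sim = log_stats(simulated)
    mean_rec, std_rec = log_stats(recorded)
    return math.exp(mean_sim - mean_rec), std_sim / std_rec

# Hypothetical MIDR values for one structural model.
midr_simulated = [0.010, 0.020, 0.040]
midr_recorded = [0.005, 0.010, 0.020]
median_ratio, dispersion_ratio = simulated_to_recorded_ratios(
    midr_simulated, midr_recorded
)
```

Values of both ratios near 1 across the range of fundamental periods would indicate that the simulated suite reproduces both the central tendency and the variability of the recorded responses.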
Responses of more realistic structural models
Although idealized structural models facilitate consideration of a relatively comprehensive range of structural characteristics (e.g. periods and strengths) and input ground motions, it is the responses of more realistic structural models that are of primary interest as an application of ground motion simulations. Accordingly, some GMSV studies have directly used the latter for a few example structures as validation metrics in comparing simulated versus recorded ground motions (see column A in Table 1). As summarized below, these studies have demonstrated similarities and differences between the more realistic responses and have identified a variety of ground motion characteristics that explain them, for which GMSV and/or careful selection of simulated ground motions is needed.
An early study by Goulet and Haselton (2010) compared the MIDR responses of nonlinear inelastic MDoF models of multi-story buildings subjected to recorded and simulated ground motions, with both sets selected to have similar response spectral shapes. They used three special moment-frame reinforced-concrete designs consistent with building codes of the time. The three structural models were selected to capture different numbers of stories, natural first-mode periods, and different levels of nonlinearity. The set of recorded motions was representative of a magnitude 7 earthquake, rupturing within 20 km of a site in a shallow crustal tectonic environment such as California. Simulated motions for the same type of event and distance, with spectral shapes that were consistent with (but not precisely equivalent to) the motions in the recorded set, were selected from existing SCEC-based simulations (Graves and Somerville, 2006a, 2006b; Graves et al., 2008). The study found that the simulated ground motions led to MIDR results consistent with those derived from recorded motions, provided that the elastic spectral shapes of the ground motions are similar (Haselton et al., 2009).
Galasso et al. (2013) compared the responses of nonlinear inelastic MDoF structural models subjected to suites of simulated versus recorded ground motions from the 1994

Ratios of logarithmic means and standard deviations of interstory drift ratios (IDR) and peak floor accelerations (PFA) from suites of simulated and recorded ground motions, for (a) 6- and (b) 20-story nonlinear inelastic MDoF structural models (figure from Galasso et al., 2013).
Burks et al. (2019) compared responses of a nonlinear inelastic MDoF model of a real building designed by a structural engineering firm. From the 3D model with moderate fundamental periods (1.1 s and 1.8 s in the two orthogonal directions), Burks et al. (2019) investigated story drift ratios, story shears, column/beam/brace demand-to-limit ratios (shown in Figure 12), and ultimate design decisions that would be reached using simulated versus recorded ground motions having similar mean elastic response spectra. As shown in Figure 12, mean results from the two ground motion suites were similar. However, they did observe that the fault-normal responses of the 3D structural model were larger for the simulated than recorded ground motions; conversely, the fault-parallel responses to the simulated ground motions were smaller. These differences were partially explained by corresponding differences in the elastic response spectra of the simulated versus recorded ground motions, which were more polarized in the simulated suite. The polarization differences were seen via the maximum-to-median spectral ratio validation metric referenced in previous sections.

Mean column/beam/brace demand-to-limit ratios from suites of (a) recorded and (b) simulated ground motions with similar elastic response spectra, for a nonlinear inelastic MDoF three-dimensional model of a five-story building with steel special moment resisting frames (SMRFs) in the fault-normal direction and buckling-restrained braced frames (BRBFs) in the fault-parallel direction (figure from Burks et al., 2019).
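The polarization of a two-component motion can be summarized by a maximum-to-median ratio over orientations. The sketch below rotates two orthogonal horizontal histories through 180 degrees and compares the largest peak with the median peak. For a spectral version of the metric, the inputs would be oscillator response histories at a given period rather than raw accelerations; the function name and the two test motions are assumptions for illustration.

```python
import math
import statistics

def max_to_median_ratio(x, y, n_angles=180):
    """Ratio of maximum to median peak amplitude over rotation angles.

    x and y are two orthogonal horizontal histories of equal length.
    """
    peaks = []
    for j in range(n_angles):
        theta = math.pi * j / n_angles
        c, s = math.cos(theta), math.sin(theta)
        # Peak absolute amplitude of the motion resolved along angle theta.
        peaks.append(max(abs(c * a + s * b) for a, b in zip(x, y)))
    return max(peaks) / statistics.median(peaks)

angles = [math.radians(d) for d in range(360)]

# Hypothetical fully polarized motion: all shaking along the x axis.
linear_ratio = max_to_median_ratio(
    [math.sin(t) for t in angles], [0.0] * len(angles)
)

# Hypothetical unpolarized motion: equal amplitude in every direction.
circular_ratio = max_to_median_ratio(
    [math.cos(t) for t in angles], [math.sin(t) for t in angles]
)
```

The two limiting cases give ratios of about 1.41 (fully polarized) and 1.0 (no polarization); a simulated suite that is more polarized than the recorded suite would show systematically larger ratios.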
Similar to the studies by Goulet and Haselton (2010) and Burks et al. (2019), Zhong et al. (2020) compared structural responses from suites of simulated versus recorded ground motions with similar average elastic response spectra, for nonlinear inelastic MDoF two-dimensional (2D) models of two tall buildings with fundamental periods of 2.6 and 4.2 s. The resulting average MIDR, PFA, and maximum story shears were similar, despite ground-motion duration differences that had previously been demonstrated to influence other structural responses, namely collapse capacities (Chandramohan et al., 2016). Subsequently using suites of simulated and recorded ground motions with not only similar elastic response spectra but also comparable durations, Zhong et al. (2020) also found no statistically significant differences in the three structural responses and in collapse risks, the latter as shown in Figure 13. However, they did note differences in the PFAs that correlate with differences in the spectral shapes of the simulated versus recorded ground motions at periods from 0.5 to 1.0 s, as measured by average Sa values over the period range. The spectral shape differences were attributed to the splicing of short-period stochastic and moderate-to-long period deterministic ground motions in the simulation methodology.

Structural (a) collapse fragilities and (b) risks (λcol) from suites of simulated (“BBP”) and recorded (“NGA”) ground motions with similar elastic response spectra and durations, for a nonlinear inelastic MDoF two-dimensional model of a tall building (figure from Zhong et al., 2020).
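An average Sa over a period range, used above as a spectral shape measure, can be sketched as follows. The geometric mean is assumed here because it is a common convention; the source does not state whether Zhong et al. (2020) used a geometric or arithmetic mean, and the function name and spectrum values are hypothetical.

```python
import math

def average_sa(periods, sa, t_min, t_max):
    """Geometric mean of Sa ordinates with periods in [t_min, t_max]."""
    selected = [s for t, s in zip(periods, sa) if t_min <= t <= t_max]
    if not selected:
        raise ValueError("no spectral ordinates in the requested period range")
    return math.exp(sum(math.log(s) for s in selected) / len(selected))

# Hypothetical response spectrum (periods in s, Sa in g).
periods = [0.25, 0.5, 1.0, 2.0]
sa = [0.9, 0.4, 0.1, 0.05]
sa_avg = average_sa(periods, sa, 0.5, 1.0)
```

Computing this quantity for every motion in the simulated and recorded suites allows differences in responses such as PFA to be examined against differences in spectral shape over the period range of interest.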
Fayaz et al. (2021a) suggested a validation methodology that compares regressions of structural responses with respect to earthquake source, path, and site parameters from simulated versus recorded ground motions. They demonstrated the methodology with bridge column drift ratio (CDR) responses of a nonlinear inelastic MDoF model with a fundamental period of 0.6 s. Aside from the resulting regression coefficients for site and basin effects, the regressions from the simulated versus recorded ground motions were found to be statistically similar, as shown in Figure 14. CDR was also regressed against the waveform characteristics of Rezaeian et al. (2015), discussed in previous sections, with again similar regression results from recorded versus simulated ground motions. As a result, Fayaz et al. (2021b) proposed that these waveform characteristics be used as the first step in GMSV studies related to structural response. Munjy et al. (2022) advanced this proposal using realistic models of 2- and 12-story SMRF buildings and a two-span, cast-in-place concrete bridge.

Coefficients from regressions of structural responses to simulated (100 suites) and recorded (1 suite) ground motions on respective earthquake source, path, and site parameters, for a nonlinear inelastic MDoF model of a bridge.
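The idea of comparing regressions rather than raw responses can be sketched with a single predictor. The example below fits the log of a structural response against magnitude for a recorded and a simulated suite and forms a simple statistic for the difference in slopes. Fayaz et al. (2021a) used multiple source, path, and site predictors and their own statistical tests, so the single-predictor form, the statistic, the function names, and all data values here are simplifying assumptions.

```python
import math

def ols_line(x, y):
    """Ordinary least squares fit y = intercept + slope * x.

    Returns (intercept, slope, standard error of the slope).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    slope_se = math.sqrt(sse / (n - 2) / sxx)
    return intercept, slope, slope_se

def slope_difference_statistic(fit_a, fit_b):
    """Difference in slopes divided by its combined standard error."""
    return (fit_a[1] - fit_b[1]) / math.sqrt(fit_a[2] ** 2 + fit_b[2] ** 2)

# Hypothetical data: magnitudes and ln(column drift ratio) for two suites.
magnitudes = [5.0, 5.5, 6.0, 6.5, 7.0]
noise_rec = [0.01, -0.02, 0.00, 0.02, -0.01]
noise_sim = [-0.02, 0.01, 0.02, -0.01, 0.00]
ln_cdr_rec = [-7.0 + 0.50 * m + e for m, e in zip(magnitudes, noise_rec)]
ln_cdr_sim = [-7.3 + 0.55 * m + e for m, e in zip(magnitudes, noise_sim)]

fit_rec = ols_line(magnitudes, ln_cdr_rec)
fit_sim = ols_line(magnitudes, ln_cdr_sim)
z = slope_difference_statistic(fit_sim, fit_rec)
```

A statistic with a small absolute value would suggest that the simulated and recorded suites imply similar scaling of the response with the predictor; with many simulated suites, the distribution of fitted coefficients can be compared with the single recorded-suite estimate, as in the figure above.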
Summary of lessons learned and future research needs
Results from the GMSV-related studies described above that have been completed over more than a decade have led to several lessons learned. One general lesson learned from the body of research is that the current state of knowledge would benefit greatly from application-specific validation metrics or groups of metrics (i.e. gauntlets). This is extremely important because merely stating that a simulation set is validated does not provide much relevant information on its appropriate use. Moreover, as research has shown, simulations can have deficiencies for GMM development or PSHA and yet be suitable for RHA, or vice versa. The onus of explaining the appropriate use or application of the validated simulations should therefore reside with the analysts who perform the validation. Here, we summarize these lessons following the two general application categories of ground motion characterization and structural response prediction.
Key lessons from the GMSV research related to ground motion characterization are: (1) validations based on qualitative examination of waveforms or simple analyses of total residuals are not sufficiently robust to judge the efficacy of a simulation procedure; (2) proper validation should separate residuals such that scaling issues associated with source, path, and site effects can be examined, along with model bias; (3) when great care is applied in the generation of input parameters and their related uncertainties, realistic ground motion dispersion and minimal bias can be achieved; and (4) validation metrics that extend beyond Sa have the potential to relate complex ground motion characteristics to structural responses in a relatively simple way, but may require continued research to establish their correlation and value.
As a result, the following steps can be taken for validating a simulation platform for use in GMM development or PSHA (i.e. for validation metrics related to Sa):
Simulate a wide range of events that span from the conditions for which ample observations are available (generally small to moderate magnitudes) to relatively rare, but hazard-controlling, situations.
Compute residuals relative to suitable GMMs using Equation 1 and partition the residuals using Equation 3.
Evaluate trends of between-event residuals with respect to source parameters, which should include magnitude and potentially additional relevant source parameters (e.g. rupture depth).
Evaluate trends of within-event residuals with respect to path parameters (e.g. rupture distance).
As needed, address path misfits in the simulations to facilitate the next step.
Evaluate trends of within-event residuals (or site-terms derived from those residuals) against site parameters.
Compare standard deviation terms to empirical models.
In these studies, the simulations should be able to capture trends from GMMs over the parameter space where the GMMs are well-constrained. The same can be said for the Sa inter-period and spatial correlations as well as maximum-to-median direction ground motion ratios that are important for a variety of engineering applications.
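The residual steps listed above can be sketched in a few lines. Equations 1 and 3 are not reproduced in this section, so the sketch assumes their conventional forms: a total residual equal to the log of the simulated value minus the log of the GMM median, split into an overall bias, a between-event term for each event, and a within-event remainder. Event means are used here as a simple stand-in for the mixed-effects regression normally applied; the function name and data are hypothetical.

```python
import statistics
from collections import defaultdict

def partition_residuals(records):
    """Split total residuals into bias, between-event, and within-event terms.

    records is a list of (event_id, ln_simulated, ln_gmm_median) tuples.
    Returns (bias, between_event, within_event), where between_event maps
    event_id to its term and within_event lists one term per record.
    """
    totals = [(eid, ln_sim - ln_gmm) for eid, ln_sim, ln_gmm in records]
    by_event = defaultdict(list)
    for eid, residual in totals:
        by_event[eid].append(residual)
    event_means = {eid: statistics.mean(rs) for eid, rs in by_event.items()}
    # Overall bias taken as the mean of event means, so that events with
    # many records do not dominate.
    bias = statistics.mean(list(event_means.values()))
    between_event = {eid: m - bias for eid, m in event_means.items()}
    within_event = [residual - event_means[eid] for eid, residual in totals]
    return bias, between_event, within_event

# Hypothetical ln(Sa) values at one period for two simulated events.
records = [
    ("A", 0.30, 0.00),
    ("A", 0.10, 0.00),
    ("B", -0.10, 0.00),
    ("B", -0.30, 0.00),
]
bias, between_event, within_event = partition_residuals(records)
```

The between-event terms would then be plotted against magnitude and other source parameters, and the within-event terms against rupture distance and site parameters, to look for the trends described in the steps above.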
The validation metrics that go beyond Sa aim to capture additional ground motion features that impact structural responses. However, many of them are still unconventional and have not yet been explored to their full potential, likely due to lack of dissemination, follow-up projects, or resources to go beyond structural responses that are mainly sensitive to Sa. Nonetheless, many of the validation metrics mentioned in previous sections were implemented on the BBP (SCEC Report, 2016; the code can be accessed from https://github.com/SCECcode/bbp, last accessed on 30 January 2023) in the hope of improving dissemination. For those validation metrics, future efforts should be directed toward:
Research to establish the correlation between the validation metrics and structural responses.
Development of GMMs for ground motion metrics that are distinct from Sa and that impact structural responses.
Development of guidelines documenting the applicability of the validation metrics as proxies for realistic structural responses.
The GMSV research directly using structural responses has demonstrated the following: (1) validations that simply use simulated and recorded ground motions corresponding to historical earthquakes might observe differences in structural responses, but these differences might result to a large extent from differences in elastic response spectra. Such differences are not necessarily relevant in RHA practice, where ground motions are typically selected (and scaled) to match target elastic response spectra; (2) validations that use suites of simulated and recorded ground motions with similar median elastic response spectra (e.g. that match an ASCE 7 target) might still observe differences in structural responses. Such differences can be attributed to, for example, differences in the Sa dispersions, inter-period correlations (or other spectral shape metrics), and/or polarization (for 3D structures), as well as other ground motion metrics such as duration and waveform characteristics such as directivity pulse periods; and (3) validations that further condition the simulations and recordings on additional ground motion characteristics (e.g. Sa dispersion and duration) can practically eliminate differences in structural responses. The particular ground motion characteristics that influence the structural responses of interest are not necessarily known a priori for diverse structural typologies. Based on these three lessons, for structural RHA applications of simulated ground motions, we see the following as important knowledge gaps to address:
Continue to disseminate GMSV exercises that directly use responses of realistic structural models and demonstrate how such GMSV can be done efficiently by others in practice.
Correlate results of GMSV studies that use responses of idealized structural models with those from more realistic models, in order to identify idealized structural responses (e.g. SaC) that can be used in GMSV as proxies for realistic structural responses.
Proliferate GMSV research that uses ground motion characteristics, which have already been shown to explain some differences between structural responses to simulated and recorded ground motions (e.g. Sa dispersions, inter-period correlations, and polarizations).
Outreach, dissemination, and utilization of simulations in engineering practice
Since its inception, the GMSV group has enabled substantial progress to be made in the improvement of simulation methods and validation metrics for engineering applications. The sustained structure provided the continuity needed to solicit work on a yearly basis through SCEC’s collaboration plan (also known as call for proposals), and the coordination favored complementarity and collaboration among various projects and researchers. This led to the completion of several projects referenced in Table 1 and discussed above. Of notable importance and impact is the implementation of several validation metrics in the BBP software (Luco et al., 2016), which included several Sa-based validation metrics (Goulet et al., 2015), several parameters such as the duration of motion based on the study by Afshari and Stewart (2016), the GOF based on the study by Anderson (2004), and the scalar and evolutionary validation metrics based on the study by Rezaeian et al. (2015). These validation tools can currently be accessed by running the BBP in expert mode (https://github.com/SCECcode/bbp, last accessed on 30 January 2023), and a SCEC team is developing a stand-alone software package intended to make the tools more widely available to those interested in validation. Teams have also developed vetted sets of simulated ground motions deemed usable for practical applications, for scenarios from the BBP (Goulet et al., 2018a, 2018b; Zhong et al., 2020) and from CyberShake simulations (Baker et al., 2021).
As part of its activities, the GMSV group hosted yearly workshops and focus-group meetings to support consensus development within a broad community that includes seismologists, simulation modelers, computer scientists, and research engineers. Additional outreach activities were coordinated to further engage practicing engineers and to share knowledge on our current understanding of simulation methods and their performance (value and limitations) in specific engineering applications. In some cases, the engineers were invited to focus-group meetings or sessions held at engineering conferences, such as the 11th and 12th National Conferences on Earthquake Engineering.
As the GMSV activities unfolded, a number of issues with simulated ground motions were identified and shared with the community. Over time, many of these issues have been addressed. Nonetheless, having witnessed problems in earlier versions of simulations, or simply being wary of adopting new tools and resources, many engineers and regulatory bodies remain skeptical of using simulations in actual projects. This remains the case at the time of writing despite examples of successful structural RHA applications in which appropriate record selection criteria were applied.
A lesson learned from this experience is that additional sustained efforts are needed to develop consensus on the proper development, dissemination, and use of simulated ground motions. There are numerous existing and under-development simulated ground motion databases (D’Amico et al., 2017; Paolucci et al., 2021; Withers et al., 2023) and readily available software on the Internet, but few have been vetted with the rigor needed for engineering applications (as was done by Lee et al., 2022, for applications in New Zealand and by Baker et al., 2021, for a subset of the CyberShake simulations). Simulation results depend on many poorly constrained parameters, and while it is now relatively easy to create simulated time series, their scientific or engineering value is not guaranteed until proper validation has been conducted. With over a decade of sustained efforts, we have yet to find a single set of validation metrics that would be sufficient to declare a specific simulation methodology as “validated” for any and all engineering applications. The complexity of triaxial ground motions in time and frequency makes it very difficult to find such a validation gauntlet. For example, two time series with the same median Sa may look very different if their duration and their number of cycles at each frequency are not considered. Their impact on structures can also be very different. Hence, it is desirable to validate the simulations based on suites of metrics and on their direct impact on engineering models. We believe we must continue along this path, in collaboration with engineering practitioners, for a range of applications. The lessons learned from such applications may facilitate the later development of a comprehensive suite of validation gauntlets, each designed for a different engineering application.
Although simulations are currently being used in some engineering applications and there is an impressive increase in interest from the engineering community due to recent advances in simulation methodologies and availability of numerical codes, our discussions with practicing engineers have revealed that a practical impediment to further adoption of simulations in engineering practice is the lack of availability of vetted sets of simulations with clear documentation. The development of simulation databases that include their validation criteria is very much needed. Merely assembling simulations for distribution will not serve the needs of the engineering community. Hence, key needs to further the use of simulations in engineering practice are: (1) continued validation research, (2) improved means of communication to share knowledge on simulations, and (3) comprehensive simulation databases that include clear validation information, such as a list of metrics for which the simulations were validated (building on the concept of gauntlets). While these needs focus on practicing engineers, other simulation users include seismology and engineering researchers. As new simulations are developed, some of which include more complex physics and broadband motions, they enable research aimed at quantifying the impact of modeling assumptions on ground motions or structural responses. In future public databases, simulations could be labeled as either suitable for research, or for engineering analysis, or for both.
Conclusion and the path forward
In the past decade, the SCEC GMSV group has been active through various validation studies performed through individual investigations and collaborative projects, coordination with other SCEC groups including simulation modelers, and numerous workshops convened to interact with practicing engineers. Although these efforts were mainly focused on SCEC simulations, efforts to extend our reach to other national and international simulations were also made. Here we have proposed a categorization of GMSV methods in Table 1. This consists of two validation approaches and two categories of validation metrics. The two validation approaches are: (1) comparing simulations to historical records and (2) comparing simulations to semi-empirical models. The two categories of validation metrics are: (1) ground motion characteristics, consisting of waveforms, response spectral IMs, and other IMs and (2) structural responses, consisting of responses of idealized and more realistic structural models. Examples of existing validation exercises referenced in Table 1 were discussed in this article, along with lessons learned from these exercises. Here, we highlight three high-level key findings based on the past decade of GMSV activities.
First, validation is specific to the engineering or scientific application for which the simulations will be used. For example, in ground motion modeling, the critical issue for validation may be related to ground motion scaling with predictor variables, estimation of uncertainties, development of overall ground motion amplitudes, or other issues. In structural analysis, the appropriate validation metric depends on the type of structure and the EDP under consideration, among other considerations. In addition, extending beyond southern California, validation could be specific to earthquake sources, geological structures, regional tectonics, and the available empirical models.
Second, despite the expansion of GMSV research in the past decade, much research remains regarding validation and utilization of simulated ground motions for several of the categories shown in Table 1. Although we have touched on each of the possible combinations in this table, there is much yet to be achieved. Productive research—and future simulation method improvements—requires coordination and collaboration among research engineers and ground motion modelers.
Third, outreach and dissemination should be improved for simulations to be used in practice. Many simulation sets have been validated to a degree that makes them ready for use in some applications, such as those that only require reasonable median Sa estimates or that select and scale ground motions to match a target design spectrum in building code applications. However, the end users are either not aware of the existing simulation and validation resources or are discouraged from utilizing them due to a lack of accessibility or the overwhelming amount of simulation data. As discussed in this article, more comprehensive documentation and dissemination of already existing simulation databases are needed.
Acknowledgements
As leaders of the SCEC GMSV technical activity group, we acknowledge the invaluable participation and continued interactions of the group members, particularly appreciating the efforts of Thomas Jordan, Jack Baker, Gregory Deierlein, Farzin Zareian, Carmine Galasso, Paul Somerville, Andreas Skarlatoudis, Jeff Bayless, Nenad Bijelic, Ting Lin, Kuanshi Zhong, Jawad Fayaz, Peng Zhong, Huda Riadh Munjy, Luis Dalguer, Iunio Iervolino, Kioumars Afshari, Brendon Bradley, Caroline Holden, Leonardo Ramirez Guzman, Marco Stupazzini, Seok Goo, Jongwon Lee, John Vidale, and Brianna Birkel, even beyond the references herein. Special thanks go to SCEC simulation modelers and practicing engineers who have attended our workshops and provided feedback and advice to this group, including but not limited to Robert Graves, Kim Olsen, Ralph Archuleta, Philip Maechling, Fabio Silva, Scott Callaghan, C.B. Crouse, and Farzad Naeim. Noteworthy to mention are participants of a Special Interest Group session we held at the 2022 Seismological Society of America (SSA) annual meeting in Bellevue, WA, to obtain feedback from the community on publishing this article. Early reviews from Jeff Bayless, Kim Olsen, Morgan Moschetti, and Kyle Withers helped us to further improve this article. Technical reviews from Roberto Paolucci, Arben Pitarka, and Brendon Bradley were much appreciated and resulted in improving this article.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Development of this synthesis paper was supported by the SCEC Award #21144. SCEC is funded by the National Science Foundation (NSF) and U.S. Geological Survey (USGS) through cooperative agreements with the University of Southern California, with additional funding for this project provided by the Pacific Gas & Electric Company (PG&E). Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.
