Abstract
Earthquake site responses or site effects are the modifications of surface geology to seismic waves. How well can we predict the site effects (average over many earthquakes) at individual sites so far? To address this question, we tested and compared the effectiveness of different estimation techniques in predicting the outcrop Fourier site responses separated using the general inversion technique (GIT) from recordings. Techniques being evaluated are (a) the empirical correction to the horizontal-to-vertical spectral ratio of earthquakes (c-HVSR), (b) one-dimensional ground response analysis (GRA), and (c) the square-root-impedance (SRI) method (also called the quarter-wavelength approach). Our results show that c-HVSR can capture significantly more site-specific features in site responses than both GRA and SRI in the aggregate, especially at relatively high frequencies. c-HVSR achieves a “good match” in spectral shape at ∼80%–90% of 145 testing sites, whereas GRA and SRI fail at most sites. GRA and SRI results have a high level of parametric and/or modeling errors which can be constrained, to some extent, by collecting on-site recordings.
Introduction
Surface geology can modify incoming seismic waves, and these modifications are denoted as “site effects” or “site responses.” Site effects arise due to the presence of impedance (the product of wave velocity and density) contrasts, surface and subsurface topographies, and so on. Currently, various approaches are available to quantify site effects in seismic hazard and risk assessments.
One approach to capture site effects is to utilize generic prediction models relating amplifications to various predictor variables (Table 1). Predictor variables can be (a) site parameters from in-situ geotechnical/geophysical measurements, for example, average shear-wave velocity in the topmost 30 m (VS30), sediments thickness, and site resonant frequency (e.g. Hassani and Atkinson, 2018; Kwak and Seyhan, 2020); or (b) proxies from existing regional models or maps, for example, topographic slope, terrain, and geology (e.g. Wald and Allen, 2007; Weatherill et al., 2020; Yong et al., 2012). Models in the first group are often embedded in recent ground-motion models (GMMs; e.g. Boore et al., 2014), whereas those in the second category are applied in regional or global site-response mapping. One needs to balance the spatial coverage and precision when considering these models.
Various techniques to quantify earthquake site responses
In addition, site responses at specific sites can be estimated using one-dimensional (1D) ground response analysis (GRA) or the square-root-impedance (SRI) method (Table 1). GRA assesses site responses by modeling the propagation of vertically incident plane SH waves through horizontally stratified sedimentary layers (i.e. the “1DSH” assumptions). In contrast, SRI is based on the ray theory and only uses quarter-wavelength (QWL) velocities and densities corresponding to the frequency of interest (Boore, 2003; Joyner et al., 1981). Both approaches require a detailed site model, and GRA also needs soil nonlinear properties for equivalent linear or nonlinear analyses.
Another site-specific technique is the empirical correction to the ratio of the Fourier spectrum of the horizontal component to that of the vertical component, namely the horizontal-to-vertical spectral ratio (HVSR, Nakamura, 2019) (Table 1). We denote this correction to HVSR as the “c-HVSR” method. Based on the diffuse field theory, Kawase’ team (Ito et al., 2020; Kawase et al., 2018b) proposed this technique, and Zhu et al. (2020c) considered it as an empirical approach and evaluated its performance in comparison with GRA using surface-downhole recording pairs. c-HVSR does not require any site models but needs on-site recordings and pre-defined correction functions for the study region.
How well can we predict site effects so far using these estimation techniques (GRA, SRI, c-HVSR, and generic site-response models)? Addressing this question is crucial to identify the current best approach and then adopt and iterate on the best practice. However, there are a few hurdles in comparing various techniques. First, we need reliable site-response observations that could be utilized as “benchmark,”“gold standard,” or “ground-truth” data to evaluate or calibrate other estimation approaches/models. In addition, we need detailed site metadata (e.g. 1D site models and seismic recordings) to test all these approaches at common sites.
However, such a comprehensive and robust evaluation can be realized, thanks to the large volume of ground-motion data accumulated on the dense strong-motion networks, K-NET, and KiK-net, in Japan, with detailed site information (National Research Institute for Earth Science and Disaster Resilience, 2019). Thus, a reliable site-response dataset is established from the recordings for a large number of sites through the general inversion technique (GIT) (Nakano et al., 2015), empirical ground-motion modeling (site term δS2Ss, Loviknes et al., 2021), and surface-to-borehole spectral ratio (SBSR, Zhu et al., 2020b) (Table 1). These site-response data are unique in that they were derived in the Fourier domain, as to our knowledge, the first such dataset. This makes it possible to study the site-response variability at high frequencies, for example, f > 10 Hz. Furthermore, an open-source site database is also recently made available for these stations (Zhu et al., 2021b). Given these latest developments, the effectiveness of current site-response estimation techniques can be tested and compared on the same dataset under one framework. In this article, we focus on the evaluation of GRA, SRI, and c-HVSR. Results on site-response models by classic regressions and machine learning are presented in a separate article.
We first develop a framework to evaluate the effectiveness of different site-response estimation techniques. Then we present the benchmark site-response data, that is, Fourier site responses derived using GIT with reference to outcrop rock conditions (VS = 3.45 km/s), and the site metadata, at 1725 K-NET and KiK-net sites. Next, we introduce the three approaches, including GRA, SRI, and c-HVSR, before the testing and comparison of their performance in predicting observed site responses. Furthermore, we discuss two strategies to potentially improve the efficacy of GRA and SRI, including using HVSR-consistent VS profiles and site-specific attenuation measurements.
Evaluation framework
Site-to-site variability
If multiple recordings are available at multiple sites, observed site response at a site s during earthquake e can be expressed as:
where
In this study, we only focus on the repeatable site response at a given site, that is,
where
Though
Figure 1 further illustrates this issue. Figure 1a depicts the site responses

(a) Observed site responses
Goodness-of-fit metrics
where n is the sample size (the number of stations in this research); x and y are the vectors of observation and prediction, respectively; xi and yi denote the ith (i = 1, 2, ..., n) elements of x and y;
where
Fourier absolute site responses from GIT
We utilize the observed site responses derived by Nakano et al. (2015) using GIT. Nakano et al. (2015) selected strong-ground motions recorded by K-NET, KiK-net, and JMA networks from 1996 to 2011 using the following criteria: (a) JMA magnitude ≥ 4.5; (b) focal depth ≤ 60 km; (c) hypocentral distance ≤ 200 km; (d) peak ground acceleration (PGA) ≤ 2.0 m/s2; and (e) number of records per event ≥ 3. These selection criteria resulted in 77,213 surface seismographs (three components: NS, EW, and UD) recorded by 2105 sites during 967 events. Then, the wave section immediately following S-wave arrival was extracted from each waveform. The length of the S-wave window depends on the JMA magnitude: 5 s for 4.5 < MJMA ≤ 6.0, 10 s for 6.0 < MJMA ≤ 7.0, and 15 s for 7.0 < MJMA ≤ 8.0. After tapering and zero-padding, these acceleration time series were transformed to Fourier amplitude spectra (FAS) which were smoothed using a Parzen window of 0.1 Hz.
Following data processing, Nakano et al. (2015) separated the repeatable site responses at each site in the database using GIT. The KiK-net site YMGH01 was selected as the reference site. The surficial layer of this site is classified as late cretaceous granite (Zhu et al., 2021b). The S-wave velocities at the ground surface (VS0) and in the topmost 30 m (VS30) are 1000 m/s and 1388 m/s, respectively. Besides, this site has a relatively flat HVSR curve from 0.1 to 30 Hz. However, to derive site responses with reference to seismological bedrock, surface ground motions at YMGH01 were deconvolved to the top of seismological outcrop rock with VS = 3450 m/s before spectral modeling. Thus, the resultant site responses in horizontal and vertical directions, that is, HHbR(f) and VHbR(f), at each of the 2105 sites are referenced to horizontal ground motions on seismological outcrop rock (Hb) and thus can be considered to be “absolute” site responses. f is a frequency scalar with values ranging from 0.1 to 20 Hz. We note that vertical site responses are also referenced to Hb, rather than Vb in this study as will be explained in the following subsection “Empirical Correction to HVSR (c-HVSR).”
The whole site-response dataset contains average HHbR(f), VHbR(f), and HVSR(f) over many events at each of the 2105 sites. In this study, we only utilized stations (a) belonging to either K-NET or KiK-net network and (b) recorded at least five events. In total, 1725 sites remain in our database (Figure 2). Next, we divide these sites into two independent subsets: training (1580 sites) and testing (145 sites) datasets (Figure 3). The training set is to derive empirical correction functions for c-HVSR and proxy-based predictive models. The testing set, which is not utilized during the training, is randomly selected for an independent evaluation of models/approaches, for example, c-HVSR, proxy-based generic models, GRA, and SRI (Figure 3). The 145 testing sites are also considered to be representative of the full dataset (Figure 4).

Spatial distribution of the 1725 K-NET and KiK-net recording sites (Japan) used in this study. (a) 1580 training sites. (b) 145 testing sites.

Flowchart illustrating the evaluation of different site-response estimation approaches. SRI—the square-root-impedance method; GRA—1D ground response analysis; c-HVSR—corrected HVSR; Amp—parametric amplification predictive models by least-squares regressions; and RF—random forests (supervised machine learning) models. Ergodic approaches are evaluated in a separate study.

Histograms at the 1580 training and the 145 testing sites. (a) Site responses (from GIT) at f = 1.0 Hz. (b) Topographic roughness. (c) VS30. (d) Peak frequency on HVSR (fP, HV). (e) Depth to 2.5 km/s velocity horizon inferred from a 3D velocity model J-SHIS (Z2.5). (f) Amplitudes of HVSR at f = 1.0 Hz (AHV). All site parameters are collected from the site database by Zhu et al. (2021b).
Site metadata
Site data, including VS30, surface geology, topographic roughness (the largest difference in elevation between the target pixel and the surrounding cells on a digital elevation model (DEM)), and geomorphological category, at the 1725 selected sites are collected from an open-source site database developed by Zhu et al. (2021b). We note that in the open site database, VS30 was only derived for sites at which velocities were logged to a depth reaching or exceeding 30 m (Zhu et al., 2021b). Thus, for sites with borehole depth z < 30 m (e.g. most K-NET sites), we estimate VS30 from its correlation with VSz (average S-wave velocity to z = 20, 15, and 10 m) using the equations established in this study (Figure S1, see Data and Resources). In addition, for the very few sites without VSz measurements (no published Vs profiles), we use VS30 values from the Japan Seismic Hazard Information Station (J-SHIS, see Data and Resources), where VS30 is inferred from other easy-to-measure proxies, for example, surface geology. Distributions of site proxies at sites in the training and testing databases are presented in Figure 4.
In addition, we also derive the attenuation parameter kappa (κ) at K-NET and KiK-net recording sites (Figure 5a) using the same FAS dataset (Nakano et al., 2015) as described above. The spectra are further smoothed by a Parzen window of 1 Hz length to assure that the high-frequency decay is not influenced by dominant troughs or peaks. κ is computed as the slope of the high-frequency decay of acceleration FAS in 10–25 Hz in log-linear space. Then the κ values of two horizontal components are combined by their geometric mean if their difference is within 10 ms (otherwise rejected). κ is assumed to be a linear function of source-site distance (Anderson and Hough, 1984). The intercept (i.e. zero-distance κ) is denoted as κ0 which represents the attenuation of seismic waves within the geological structure immediately beneath the site.

(a) Histograms of kappa (κ0) values at the 1580 training and the 145 testing sites derived from surface recordings in this study. (b) Comparison of κ0 from different studies at common sites. (c) Pearson’s correlation of κ0 with various site parameters and HVSR amplitudes at different frequencies based on the training sites. Zx (x = 1.0, 1.1, 1.2, ..., 3.4) is from the J-SHIS 3D velocity model which has velocity reversals.
Our κ0 values are considered to be consistent with those in the literature (Cabas et al., 2017; Van Houtte et al., 2011; Xu et al., 2020) at common sites (Figure 5b). Some deviations are expected since κ0 results are affected by many factors in the computation (Ktenidou et al., 2013). κ0 is often used as a site parameter (e.g. Cotton et al., 2006) and it exhibits low correlations with other site parameters examined here (Figure 5c). The strongest correlation is achieved with depth parameters, for example, r = 0.38 for Z3.1. In this study, we use κ0 as an attenuation operator in SRI (Boore, 2003; Joyner et al., 1981) and will publish the κ0 data at K-NET and KiK-net sites in the open-source site database (Zhu et al., 2021b).
Site-specific approaches in site-response estimation
Empirical correction to HVSR (c-HVSR)
For any site on the ground surface, its HVSR can be expressed as:
where
To obtain the site amplification in the horizontal direction, Equation 11 can be rewritten as:
Equation 12 indicates that
Based on the 1580 training sites, we derive correction spectra
Clustering analysis to VHbR functions
In Scheme II (Table 2), we adopt a similar approach as Zhu et al. (2020c). Under this scheme, each site is characterized by fPv and log10VS30. fPv represents the peak frequency on VHbR and is determined by following an automatic procedure (Zhu et al., 2020a). Among the 1580 sites in the training dataset, 880 stations have their fPv identified. Then we utilize an unsupervised machine learning technique, k-means clustering, to partition the 880 stations into k number of mutually exclusive clusters based on a chosen distance metric. K-means clustering needs prior knowledge of the number of clusters. According to the within-cluster sum of distances (Figure 6a), we set the optimal number of clusters k to 5. The within-cluster mean VHbR, that is,

(a) Determination of the optimal number of clusters using the within-cluster sum of distances. (b) Generic correction spectra
1D outcrop site-response analyses (GRA and SRI)
In site-specific seismic hazard evaluations, GRA is widely adopted to assess the effects of surface geology on ground motions. In GRA, the propagation of complex wave fields (P-SV, SH, and surface waves) in 3D media is simplified as the propagation of vertically incident plane SH waves through 1D soil columns (i.e. “1DSH” assumptions). In contrast, SRIs are based on ray theory (Boore, 2003; Joyner et al., 1981) and utilize the QWL velocities and densities to compute attenuated amplifications:
where
For the 145 testing sites, TTFs are computed using both GRA and SRI. For GRA, the software Strata (Kottke et al., 2013) is adopted, whereas the Fortran package site_amp_batch (Boore, 2003) is utilized for SRI. Given the fact that site-response observations are obtained from recordings with PGA ≤ 2.0 m/s2 (i.e. linear site responses) with reference to outcrop bedrock (VS = 3450 m/s), GRA and SRI are thus conducted in the linear domain with the same reference condition, namely outcrop bedrock (VS = 3450 m/s).
Both GRA and SRI require a 1D physical model, including mass density (ρ), shear-wave velocity (VS), and damping profiles or κ0 which govern two competing effects in site responses: amplification and attenuation. Density information is unavailable at the 145 testing sites (KiK-net) and thus is estimated from VS (Zhu et al., 2020c). Velocity profiles at these sites are established from the boring data published by NIED (2019) and are referred to as the original “V0” profiles (Figure 7a).

1D velocity profiles for the KiK-net site FKSH06. (a) V0 (PS logging). (b) V1 (PS logging + J-SHIS).
Since KiK-net boreholes do not reach the seismological bedrock (VS = 3450 m/s), especially for sites in deep sedimentary basins, for example, the Kanto (Tokyo) basin, we thus combine the boring data with velocity structures from the J-SHIS three-dimensional (3D) velocity model, which is developed by NIED (Fujiwara et al., 2012) for the whole of Japan. When combining the two profiles, we give a priority to the KiK-net velocity structure down to the bottom of the boring data, from which to the 3450 m/s VS horizon we switch to the velocity structure of the J-SHIS model. These composite velocity models are denoted by “V1” profiles (Figure 7b).
Seismologists have long recognized that the amplitude decay of seismic waves within the Earth’s crust, defined herein as the effective attenuation of shear (S) waves, can be approximated by Equation 15 (Anderson and Hough, 1984):
where r is distance, f is frequency, and
The slope
where Q is the effective seismic quality factor of S waves. The above equation assumes that the value of Q within the sedimentary column is independent of frequency and represents the combined intrinsic and scattering attenuation.
The value of Q is often estimated from VS. Campbell (2009) proposed a Q-VS model (Equation 18) for sites in eastern North America. In this study, we also use this model and refer it to as damping model “D1.” In addition, we also select another Q-VS relation (Equation 19) established by NIED specifically for Japan. The NIED damping model is available on J-SHIS and is referred to as “D2” in the following.
D1: Campbell (2009) attenuation
D2: J-SHIS attenuation model
Comparing D1 and D2 damping models (Figure 8), D1 gives much lower Q estimates (stronger damping) than D2 by a factor of around 3.3. Another widely adopted relation

(a) Shear-wave quality factor (Q) models considered in this study, and (b) small-strain damping (Dmin) profiles for the KiK-net site FKSH06 with dashed and solid lines corresponding to damping models D1 (Equation 18, Campbell, 2009) and D2 (Equation 19, J-SHIS), respectively.
For GRA, small-strain soil damping (Dmin) is required as input and can be derived from Q (Goodman, 1988):
In SRIs, attenuation is accounted for by utilizing
where
Results
Intra-method comparison
First, we compare the effectiveness of the empirical correction to HVSR (c-HVSR, Equation 13) under two different schemes (Table 2) using the training dataset (1580 sites). In Scheme I, there is only one vertical correction spectrum, and thus site information is not needed. In contrast, there are five correction spectra in Scheme II under which site-specific parameters fPv and VS30 are prerequisites to determine the corresponding correction spectrum. “Scheme I & II” means the mixed use of Scheme I and II in which Scheme II is applied for the sites with site-specific data (i.e. fPv or log10VS30), otherwise Scheme I is utilized. Scheme I and Scheme I & II are applied to all the 1580 training sites, whereas Scheme II is only applied to 880 sites with fPv and VS30.
In terms of standard deviations of residuals between site-response observations (through GIT) and predictions (through c-HVSR under different schemes), Scheme II has the lowest level of dispersion and is followed by Scheme I & II and then Scheme I (Figure 9a). This demonstrates that utilizing site-specific information whenever available will lead to more accurate predictions. Considering both the performance and required input data, we use the combined Scheme I & II hereafter. As an example, c-HVSR is applied to one test site FKSH05 (Figure 9b), and the estimated site response shows a very good match with the observation from GIT.

(a) Standard deviation of residuals (
Next, we evaluate GRA predictions using different generic damping models, that is, D1 (Equation 18, Campbell, 2009) and D2 (Equation 19, J-SHIS). We utilize the MAE (Equation 10) between observation and GRA prediction as evaluation criteria. At the 145 testing sites, GRA results using D1 have larger MAE than those using D2 at 92% of the sites (Figure 10a). Considering that D1 gives higher damping values than D2 (Figure 8), the comparison implies that the D1 damping model overpredicts the attenuation effects at sites in Japan. The D1 damping model was initially proposed by Campbell (2009) for sites in eastern North America and may not apply to Japan. GRA and SRI predictions using D2 damping are illustrated for the site FKSH05 (Figure 10b). Both GRA and SRI results still underestimate observed site response in a broad frequency range at this site.

(a) MAE (Equation 10) of GRA predictions using generic damping models D1 (Equation 18, Campbell, 2009) and D2 (Equation 19, J-SHIS) at each of the 145 testing sites. (b) Site-response predictions from SRI and GRA using D2 damping at the test site FKSH05 in comparison with observed site response through GIT.
Inter-method comparison
In this section, we evaluate the effectiveness of different approaches (GRA, SRI, and c-HVSR) in reproducing the observed site response (GIT) at the 145 testing sites. We assess the overall performance of a given technique m at the testing sites as a whole, and its performance at the individual sites. The overall performance is measured using the reduction in the standard deviation of residual
Success rates of different methods/models in reproducing observed outcrop site responses at the 145 testing sites
GRA: ground response analysis; SRI: square-root-impedance; c-HVSR: horizontal-to-vertical spectral ratio.

Comparison of the standard deviation of prediction residuals (

Histograms of (a) Pearson’s r, (b) Spearman’s ρ, and (c) Kendal’s τ, between site-response observation (GIT) and predictions (from GRA and c-HVSR) at the 145 testing sites. Each GoF metric is computed in the frequency range between f0, TTF and 20 Hz where f0, TTF is the fundamental resonant frequency on TTF using GRA. Success rates (GoF > 0.6) of each approach are also displayed in each plot.
Generic damping models, that is, D1 (Equation 18, Campbell, 2009) and D2 (Equation 19, J-SHIS), have negligible impacts on
GRA is effective in capturing the site-specific features of site responses at frequencies lower than ∼3.0 Hz but performs poorly at frequencies higher than ∼4.0 Hz (Figure 11). It is known that GRA results should not be utilized for f < f0 due to the limited length of velocity profiles. In the mid-frequency range ∼0.4–4.0 Hz,
Comparing SRI with GRA,
c-HVSR achieves a better overall performance than both GRA and SRI (Figure 11 and Table 3).
Using single-station recordings to constrain site models
Besides the simplifications to wave propagations in 1D site-response analyses, another main reason for the poor performance of GRA and SRI is the erroneous or inaccurate velocity and damping profiles. For instance, some of the boring data at KiK-net sites are found to be unable to reproduce the resonant frequencies derived experimentally, for example, using SBSR or HVSR (e.g. Pilz and Cotton, 2019). Besides, accurate characterization of soil attenuation properties is crucial in wave propagation modelings, especially for high-frequency components, but remains a challenge. In the following, we discuss two strategies based on the single-station recordings to potentially improve predictions from 1D site-response analyses: using HVSR-consistent velocity structure and site-specific kappa (κ0) measurements.
HVSR-consistent velocity structure
HVSR-consistent velocity structures are defined as those of which theoretical HVSR (Kawase et al., 2011) can reproduce the one from earthquake weak-motion recordings. To obtain HVSR-consistent velocity profiles for the 145 testing sites, we invert the complete velocity structures from the ground surface down to the seismological bedrock (3.45 km/s) from HVSR of earthquakes using the program by Nagashima et al. (2014). We utilize the existing V1 velocity structure as a reference model and search for the optimal one using the hybrid heuristic search method. The inversion algorithm only searches among candidate profiles of which velocities are within the ± 50% range of the reference value for shallow portion (from surface to borehole) and thicknesses within the ± 50% range of the reference for the deeper part (borehole to 3.45 km/s).
The HVSR inversion is made to reproduce the observed HVSR by searching for the best values in terms of the S-wave velocity in the boring layer or the thickness in the J-SHIS structure for the whole frequency range of 0.12–20 Hz. Because we use the V1 structure as a reference to determine the searching range, we will get the V1 structure as the resultant structure from the inversion if it is the best, but it is quite unlikely for almost all the sites. Note that the inversion should have a direct constraint to the absolute velocity or thickness values because we use the fixed seismological bedrock velocity and the whole HVSR as spectral amplitudes as evidenced by Nagashima et al. (2014) and Kawase et al. (2018a). These newly inverted VS structures are referred to as “V2” profiles. As an example, the V2 profile for the same site FKSH06 as shown in Figure 7 is illustrated in Figure 13a. Its corresponding small-strain damping profiles estimated from VS (Equation 18 and 19) are given in Figure 13b.

(a) Earthquake HVSR-consistent VS structure (V2) for the KiK-net site FKSH06 of which the V1 profile (Figure 7b) is also displayed here, and (b) its small-strain damping (Dmin) profiles using models D1 (Campbell, 2009) and D2 (J-SHIS), respectively.
Comparing to V1 profiles, the newly inverted V2 structures results in much lower

Improvement in GRA and SRI predictions using HVSR-consistent velocity profiles (V2) at the 145 testing sites comparing with those using V1 profiles. The result for c-HVSR is also displayed.
Similarly, improvement can be also achieved in SRI predictions but to a less extent compared with GRA (Figure 14). This can be attributed to the insensitivity of SRI results to model details. In addition, it is interesting to see that utilizing the improved structures (V2), the advantage of SRI over GRA, as observed between ∼4.0 and ∼7.0 Hz when V1 is used (Figure 11), diminishes. If we examine the GoF measures for V2 profiles (Table 3), GRA achieves higher success rates than SRI using all three metrics, indicating GRA results have a better match with observations in shape than SRI predictions. Thus, considering both
Even if the improved V2 profiles are employed, GRA and SRI predictions, in the aggregate, are still outperformed by c-HVSR (Figure 14 and Table 3). The underperformance of GRA and SRI is due to their ineffectiveness at high frequencies (f > ∼8 Hz), which is primarily attributable to the inaccurate estimation of small-scale velocity variations and attenuation properties in 1D site models.
Site-specific kappa (κ0) measurements
We then examine the direct use of kappa (κ0) as an attenuation operator in SRI (Equation 21). κ0 at each site (Figure 5a) in this study is estimated from surface recordings and thus represents the attenuation effect of the complete geological structure beneath a site. However, κ0 in SRI (Equation 21) corresponds to the attenuation over the length of the soil column, namely from the ground surface to seismological bedrock (VS = 3.45 km/s). Thus, the contribution from the structure deeper than 3.45 km/s (κ0.rock) needs to be separated from surface κ0 (i.e. κ0, surface):
where κ0, surface is directly estimated from surface recordings using the approach of Anderson and Hough (1984), but site-specific value for κ0, rock is unavailable and thus is set to κ0.rock = 0.007 s considering the suggested value for hard-rock sites in other regions (Campbell et al., 2014). Hence, SRI predictions can be obtained using Equations 14, 21, and 22 at the 145 testing sites.
Comparing with the damping model D2 (J-SHIS), SRI results using κ0 lead to a higher

Comparison between SRI predictions using site-specific kappa (κ0) measurements (Figure 5a) and those using κ0 estimates from Q(VS) profiles (D2, Equation 19) through Equation 17 at the 145 testing sites.
The lack of reduction in
Discussion
Performance of GRA under within and outcrop conditions
Our results on GRA denote its performance under outcrop boundary conditions (i.e. TTFout) in predicting observed site responses with reference to motions on outcrop rock (Figure 16a). There exist many studies examining the efficacy of GRA at KiK-net sites (Table 4), but many of these investigations used empirical borehole responses (i.e. SBSR) as targets due to the availability of downhole recordings, and correspondingly TTF from GRA was computed with reference to the motions at borehole bedrock at the base of soil column (i.e. within boundary conditions, TTFin).
Effectiveness of 1D GRA at KiK-net sites, Japan
GRA: ground response analysis; TTF: theoretical transfer functions; GIT: general inversion technique; SBSR: surface-to-borehole spectral ratio.

(a) Illustration of borehole and outcrop site responses. An and Bn are the amplitudes of up- and downgoing waves, respectively, within bedrock, and A1 and B1 are those at the ground surface of soil site. Theoretical borehole (TTFin) and outcrop (TTFout) site responses from GRA in comparison with corresponding observations from SBSR (Zhu et al., 2020c) and GIT (Nakano et al., 2015), respectively at sites, (b) NIGH19, and (c) MYGH04. rnd. denotes the average site response over 30 random profiles at a given site. Note SBSR and GIT have different reference conditions.
In practice, outcrop site responses from GRA are needed to adjust the seismic hazards on reference outcrop rock conditions from probabilistic seismic hazard analyses. Thus, it is imperative to check whether GRAs under within and outcrop conditions have similar efficacy; in other words, whether the effectiveness inferred from borehole response analyses applies to outcrop conditions. Herein, we only compiled some results on KiK-net sites, Japan (Table 4) since velocity profiles of other regions may have a different level of parametric uncertainty.
The success rates of GRA borehole responses are lower than that of GRA outcrop responses (Table 4). This can be explained by the fact that GRA borehole responses significantly overpredict the effects of downgoing waves which would be largely scattered by lateral heterogeneities in reality (e.g. Bonilla et al., 2002). As a result, TTFins contain significant resonant peaks that either do not exist (i.e. pseudo-resonances, Figure 16b) or are less significant than the peaks on observed borehole responses, that is, SBSR (Figure 16c), as also noted by Tao and Rathje (2020). Thus, TTFins have a low success rate in matching SBSR in shape. However, when the effects of downgoing waves on TTFin are mitigated using 30 random profiles generated by Monte Carlo simulations at each site, Zhu et al. (2020c) realized a higher success rate than using only base-case profiles (Table 4).
One needs to account for the effects of downgoing waves on TTFins when inferring the intended effectiveness of GRA (outcrop conditions) from borehole response analyses. Nevertheless, outcrop response analyses (TTFout vs GIT) should be preferred. Despite the higher success rate of GRA under outcrop conditions than under within conditions, GRA using V1 structures still fails to reproduce the observed site responses at the majority of examined KiK-net sites (Table 4). In another similar study, Pilz and Cotton (2019) excluded sites at which f0, TTFs from GRA are inconsistent with the ones from observations, and then a higher success rate was found among the remaining sites.
Modeling and parametric uncertainties in GRA predictions
The poor performance of GRA is known to be associated with two types of errors: modeling and parametric uncertainties (Toro, 1995). Modeling uncertainty arises because of the simplifications/assumptions to the complex physical process, that is, the 1DSH assumptions. These assumptions are very likely to break down in cases where a site has prominent 2D or 3D features (non-1D) or is significantly affected by surface waves (non-vertical SH incidence). Even if the 1DSH assumptions hold at a site, we are still limited by our ability to perfectly measure all parameters used to define the site model for GRA, especially small-scale variations within sediments, attenuation parameters, and soil nonlinear properties, and these types of error constitute the parametric uncertainty.
Both parametric and modeling uncertainties in GRA results vary from site to site. In site-specific seismic hazard analyses of safety-critical buildings or infrastructure, the parametric uncertainty is accounted for through sensitivity analyses in which input parameters in GRA (e.g. VS profile) are varied according to a certain statistic distribution (EPRI, 2013). However, the modeling uncertainty in GRA is not explicitly considered (e.g. Rodriguez-Marek et al., 2021). The underestimation resulting from ignoring the modeling uncertainty may be balanced by the fact that the parametric uncertainty might be overestimated by considering the marginal distributions of parameters (e.g. Strasser et al., 2009).
If on-site recordings are available at a site of interest, both types of errors can be constrained:
To reduce the level of parametric uncertainty, preliminary checks on 1D models can be conducted to identify those of which site signatures (e.g. f0, TTF, theoretical HVSR or dispersion curves) are inconsistent with the ones derived empirically. For these sites, more works on site characterization are warranted.
Similarly, to reduce the degree of modeling uncertainty, prior investigations can be carried out to determine whether a target site has subsurface irregularities. This can be achieved using, for instance, the directionality of HVSR (Matsushima et al., 2017), variability of HVSR from different events (HVSR sigma σHV,s, Zhu et al., 2021a), or a combination of indicators (Pilz et al., 2021). Then decisions can be made about the suitability of 1D analyses or the necessity of multi-dimensional models.
Furthermore, to mitigate both types of errors simultaneously, Zhu et al. (2021a) suggested a dual-parameter framework: f0, TTF/f0, HV (the ratio of f0, TTF to f0, HV) and σHV,s, to constrain parametric and modeling uncertainty, respectively.
All the above strategies entail on-site recordings, but the reduced estimation uncertainties may incentivize stakeholders/practitioners to instrument the target site and collect records in the early planning stage of a safety-critical project.
The match or mismatch between GRA predictions and observations reflects the combined effects of both modeling and parametric uncertainties, which, though conceptually clear, are difficult to separate. We utilize different GoF metrics to quantify the match or mismatch, but we emphasize that a single GoF measure only tells part of the story. For instance, Pearson’s r only measures the strength of association in spectral shapes. Keeping this partiality in mind, we check the dependency of Pearson’s r on various site characterization proxies. For both borehole (Figure 17a to e) and outcrop (Figure S2, see Data and Resources) site responses, Pearson’s r shows some dependence on these site parameters. Softer and flatter sites with less variable HVSR tend to have higher r values (better matches). However, the strength of dependence is not very strong, though it may be worth further exploring their combinations, for example, σHV,s and f0, HV (Figure 17f) from single-station recordings.

Pearson’s r between SBSR and TTFin from 1D GRA in the frequency range between f0, TTF and 25 Hz versus various site proxies. (a) VS30. (b) f0, HV. (c) Slope gradient. (d) Topographic roughness. (e) σHV (ln) averaged over 0.25–25 Hz. (f) σHV and f0, HV. Black and red circles represent the results of Zhu et al. (2020c) and Kaklamanos and Bradley (2018), respectively.
Conclusion
To address the question of how well we can predict the average site responses (over many earthquakes) at individual sites, we tested and compared the effectiveness of different estimation techniques using a unique benchmark dataset. The benchmark dataset consists of detailed site metadata (Zhu et al., 2021b) and absolute Fourier outcrop site responses based on observations (Nakano et al., 2015) at 1725 K-NET and KiK-net sites. Evaluated approaches include 1D GRA, the SRI method (also called the QWL approach), the empirical c-HVSR of earthquakes, and regression and machine learning site-response models. The performance of each technique is assessed using the standard deviation of residuals, and GoF in shape, between observations and predictions. The assessment of the regression and machine learning models is presented in a separate paper. In this article, we focus on the evaluation of GRA, SRI, and c-HVSR. Our results show that:
In the aggregate, c-HVSR is very effective in the examined frequency range from 0.1 to 20 Hz and gives significantly more accurate site-response estimates than both GRA and SRI at relatively high frequencies. c-HVSR achieves a “good match” in shape at ∼80%–90% of the 145 testing sites, whereas GRA and SRI fail at most sites.
GRA is effective in capturing site-specific features of site responses at frequencies lower than ∼3.0 Hz but performs poorly at frequencies higher than ∼4.0 Hz. However, the dispersion of GRA predictions can be significantly lowered using ground models inverted from HVSR of weak earthquake motions.
Comparing with generic damping models, utilizing kappa (κ0) derived from surface recordings as an attenuation operator in SRI realizes a better match in spectral shape but a larger standard deviation of estimation residuals.
These findings suggest that GRA and SRI results have a high level of parametric and/or modeling errors which can be constrained, to a certain extent, by collecting on-site recordings. But additional development is needed in the use of κ0 to constrain site-specific attenuation properties. c-HVSR is promising in site-specific seismic hazard analyses, and the HVSR correction functions constructed in this study (Table S1, see Data and Resources) can be directly applied to estimate outcrop site responses in Japan. Future studies on developing more site-specific HVSR correction functions and better characterizing site-specific damping profiles are warranted.
Supplemental Material
sj-docx-1-eqs-10.1177_87552930211060859 – Supplemental material for How well can we predict earthquake site response so far? Site-specific approaches
Supplemental material, sj-docx-1-eqs-10.1177_87552930211060859 for How well can we predict earthquake site response so far? Site-specific approaches by Chuanbin Zhu, Fabrice Cotton, Hiroshi Kawase, Annabel Haendel, Marco Pilz and Kenichi Nakano in Earthquake Spectra
Supplemental Material
sj-docx-2-eqs-10.1177_87552930211060859 – Supplemental material for How well can we predict earthquake site response so far? Site-specific approaches
Supplemental material, sj-docx-2-eqs-10.1177_87552930211060859 for How well can we predict earthquake site response so far? Site-specific approaches by Chuanbin Zhu, Fabrice Cotton, Hiroshi Kawase, Annabel Haendel, Marco Pilz and Kenichi Nakano in Earthquake Spectra
Supplemental Material
sj-xlsx-1-eqs-10.1177_87552930211060859 – Supplemental material for How well can we predict earthquake site response so far? Site-specific approaches
Supplemental material, sj-xlsx-1-eqs-10.1177_87552930211060859 for How well can we predict earthquake site response so far? Site-specific approaches by Chuanbin Zhu, Fabrice Cotton, Hiroshi Kawase, Annabel Haendel, Marco Pilz and Kenichi Nakano in Earthquake Spectra
Footnotes
Acknowledgements
The authors thank the National Research Institute for Earth Science and Disaster Prevention (NIED), Japan for making the strong motion and PS logging data available. They also thank three anonymous reviewers who provided very constructive feedback on this work.
Authors Note
Fabrice Cotton is also affiliated from Institute for Earth Sciences, University of Potsdam, Potsdam, Germany.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.
Data and resources
Site parameters at K-NET and KiK-net stations used in this study are obtained from an open-source site database https://doi.org/10.5880/GFZ.2.1.2020.006. Seismograms and KiK-net PS logging data (V0 profiles) were downloaded from http://www.kyoshin.bosai.go.jp (profiles were accessed on 5 February 2018). J-SHIS is an integrated online platform developed by NIED for the Probabilistic Seismic Hazard Maps for Japan and various underlying data https://www.j-shis.bosai.go.jp/map/JSHIS2/download.html?lang = en. The Fortran program site_amp_batch for SRI computation is made available by David Boore at www.daveboore.com (last accessed 12 October 2019). The software Strata (Kottke et al., 2013) is used for GRA. Supplemental materials include one table (Table S1) and two figures (Figure S1 and S2). Table S1 tabulates the HVSR correction spectra. Figure S1 displays the relations between VS30 and VSz (z = 10, 15, and 20 m) established in this study.
depicts the Pearson’s r between outcrop site-response observation (from GIT) and prediction (from 1D GRA) against various site parameters.
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
