Abstract
Support vector machine (SVM) has been shown to be an effective classification tool for reliability analysis. Its training set governs the computational cost of the whole reliability analysis. To reduce this training set, researchers focus on the important region to decrease the number of samples and refine the SVM adaptively. Following this methodology, this article presents a more efficient algorithm in terms of both the sampling strategy and the adaptive manner. To reduce the number of simulated samples, only the important region is considered, using Monte Carlo samples drawn solely in that region. Moreover, the Hasofer–Lind reliability index and the physical meanings of the random variables are utilized to identify samples. To speed up the convergence of the adaptive procedure, it is proposed to add the most likely support vector to the training set at each step. The illustrative examples show that the proposed sampling strategy largely reduces the classification burden of the SVM, and that the new adaptive procedure converges quickly. The results of the examples demonstrate the proposed method to be accurate and efficient.
Introduction
Estimation of the probability of structural failure is a fundamental task in structural reliability analysis. Essentially, this probability is a multifold integral of the joint probability density function of the random variables over the failure domain, that is,
Simulation methods have received much attention from the research community for the assessment of the above integral. The fundamental one, Monte Carlo simulation (MCS), samples the entire design space according to the joint probability density function. It is rather robust but has to simulate a large number of samples for problems with small failure probabilities.3 Each sample needs to be evaluated on the performance function, where a time-demanding structural analysis is performed. To alleviate the simulation burden, many improved sampling methods have been developed, for example, Importance Sampling, stratified sampling,4 directional sampling,5 and subset simulation.6 Among them, Importance Sampling receives the most attention from the research community, as it can largely reduce the number of samples by focusing only on the important region, which carries the major failure probability content. On the other hand, the original performance function can be replaced by an explicit and analytical surrogate model, so that structural analysis is no longer needed. Earlier surrogates in this field are known as response surfaces.7,8 However, they lack flexibility because of the fixed regression model and are strongly dependent on the experimental design.9 Artificial neural networks10 and Kriging11 are two other surrogates used in this field. The former suffers from local minima, and its structure is difficult to determine.10 The latter has difficulty in finding a suitable value of the correlation parameter theta.11 More recently, developments in the field of statistical learning brought a flexible and powerful tool known as the support vector machine (SVM).12 It is constructed according to the principle of "Structural Risk Minimization," which guarantees a small generalization error. Another advantage is that SVM approximates the sign of the performance function directly, rather than through its value, whose approximation requires more computational effort.
A surrogate is built from a training set, which consists of points that have been evaluated on the performance function. The size of the training set therefore governs the computational cost of a surrogate-based reliability approach. So, after Hurtado and Alvarez13 introduced SVM to the field of structural reliability, works on this topic turned to reducing the training set of the SVM. Researchers commonly combine the focus on the important region with the use of a surrogate such as SVM. For example, Hurtado and Alvarez14 use particle swarm optimization to obtain training points in the important region, whereas Dai et al.15 use an adaptive Metropolis algorithm to draw training points in the region of most interest. Recently, Alibrandi et al.16 explored the important region by means of sampling cone(s), where training points are generated. Richard et al.17 propose the rotated star experimental design for SVM regression and adaptively move its position to the important region. Training points can also be reduced according to the fact that only the training points closest to the separating hyperplane of the SVM contribute to it. Upon this consideration, Jiang et al.18 take sample pairs (two close samples with different signs, given by a simple nonlinear finite element analysis) as the basic elements of the training set to reduce useless training points. Moreover, SVM is expressed as a linear combination of functions, each of which is centered on a training point and has only local activation. This flexible structure suggests an adaptive construction procedure in which the latest updated SVM is used to guide the selection of new training point(s). In fact, in the above works of Hurtado and Alvarez,14 Dai et al.,15 and Alibrandi et al.,16 the SVM is built adaptively. But in these works, the new training points are just newly generated samples drawn according to a fixed sampling function; the corresponding adaptive construction procedures are less aimed at SVM accuracy than the one suggested here.
Basudhar chooses a new sample point that is likely to modify the current SVM and solves the associated problem of SVM locking; the selection rests on multiple considerations and is implemented with a genetic algorithm.19,20 Hurtado,21 at each step, evaluates a sample in the separation margin of the current SVM and then updates the SVM. The samples are generated by Importance Sampling, so the approach combines focusing on the important region with constructing the SVM adaptively under the guidance of intermediate SVMs. However, the training points are selected more or less arbitrarily, as more than one sample lies in the margin. Bourinet et al.22 proposed another adaptive scheme in which the new training points at each step include cluster centers and points in unstable zones and in the margin. This adaptive scheme is designed specially for subset simulation, and the number of training points added in each simulation step is more than 150.
In this article, an explicit criterion for training point selection is proposed with the expectation of the most significant improvement in the current SVM. The proposed method considers only the specified important region by simulating only Monte Carlo samples in that region. It will be seen that, with this sampling strategy, a large number of samples can be classified directly and effortlessly. Moreover, the parameter of the sampling strategy is easily determined, which is not the case when Importance Sampling is used.
Principle of SVM for binary classification
SVM is a machine learning method for pattern recognition. It is established according to the rule of structural risk minimization.23,24 As a result, it generalizes well even when trained with a small training set.
First, the linearly separable case is considered. A binary classification problem consists in determining a hyperplane that separates the given training points. With the aim of structural risk minimization, the hyperplane has to maximize the separation margin, that is, the shortest distance from the training points to the separating hyperplane. Assume that the training set contains m points in n-dimensional space,
in which
Noticing that equation (1) is invariant under positive rescaling, the condition
This constrained optimization problem satisfies the Karush–Kuhn–Tucker conditions. It can therefore be converted to a dual problem, which is more easily handled. The corresponding Lagrange function is first constructed
where
In addition, one of the Karush–Kuhn–Tucker conditions is
Substitution of equations (4) and (5) into equation (3) yields
As the desired
This equation would provide the solution of
in which
It is noteworthy that equations (5) and (7) are of great significance. Equation (7) indicates that either
Next, we consider the general case where linear separation is impossible. The basic idea to deal with this case is to map training points to a higher dimensional space by a nonlinear projector
in which the parameter
The proposed method
Like FORM (first-order reliability method) and Hurtado's21 method, the proposed method aims to solve low-dimensional problems with a unique MPP (or comparable MPPs; MPP is short for most probable failure point). For this type of problem, the probability density decays exponentially with the distance from the origin in standard normal space. As the MPP is the failure point with the largest probability density, most of the failure probability content concentrates around the MPP. This region is thus the important region, which should receive more attention during failure probability estimation. Note that, in this case, accounting for the failure probability content only in the important region does not harm the accuracy of the simulated failure probability while largely reducing the number of samples. In view of this, it is proposed to perform MCS only in that region. Subsequently, to reduce the SVM training set, an adaptive scheme is developed with emphasis on the criterion for training point selection. According to the geometric interpretation of the MPP, some samples can be identified as safe, and, if possible, the physical meanings of all the random variables can also be utilized to identify samples.
MCS in important region
Important region is identified by the MPP, which should be searched first. MPP search is the fundamental task of FORM, which commonly adopts a gradient-based algorithm with gradients calculated by a finite difference scheme. This search algorithm is adopted here. As it is not fully robust (robustness here means a wide scope of application), other search algorithms can be substituted when necessary. Besides the search algorithm of FORM, existing alternatives include the Markov chain based algorithms of Dai et al.15 and Hurtado,21 and the non-gradient algorithm of Gong et al.25 Without loss of generality, the important region is specified as a hypercube centered on the MPP and with a range of

Illustration of Monte Carlo simulation in the important region with Example 1.
Next, sampling is performed according to the joint probability density function, but only the samples in the important region are retained, as shown in Figure 1. Once these samples are classified, the conditional failure probability in the important region can be estimated by
where
where
It is recalled that the MPP is the failure point closest to the origin of the standard normal space, so any sample inside the β-sphere (the hypersphere of radius β centered at the origin, as shown in Figure 1) can be identified as safe directly.
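As a concrete illustration, the restricted sampling and the β-sphere identification described above can be sketched as follows. This is a minimal Python sketch; the function names and the NumPy-based implementation are illustrative, not the paper's code:

```python
import numpy as np

def sample_important_region(mpp, k, n_samples, seed=0):
    # Draw standard-normal Monte Carlo samples and keep only those falling
    # inside the hypercube of half-width k centred on the MPP (the
    # important region).  Returns the accepted samples together with the
    # acceptance fraction, i.e. the estimated probability content of the
    # important region.
    rng = np.random.default_rng(seed)
    mpp = np.asarray(mpp, dtype=float)
    u = rng.standard_normal((n_samples, mpp.size))
    inside = np.all(np.abs(u - mpp) <= k, axis=1)
    return u[inside], inside.mean()

def safe_by_beta_sphere(samples, beta):
    # The MPP is the failure point closest to the origin, so every sample
    # strictly inside the beta-sphere (||u|| < beta) must be safe and needs
    # no performance function evaluation.
    return np.linalg.norm(samples, axis=1) < beta
```

Multiplying the acceptance fraction by the conditional failure probability estimated on the retained samples then gives the total failure probability, which is how the restriction to the important region reduces the simulation effort.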
In an engineering context, random variables (e.g. geometric sizes, loads, and material parameters) are usually associated with clear physical meanings that imply a positive or negative effect on structural safety. With this information, a known point (a point with a known class) can be used to identify unknown points of the same class as itself. To be specific, an unknown point can be inferred to be safe if it has larger geometric sizes, larger material parameters, and lower loads than a known safe point; conversely, given a failure point, points meeting the opposite conditions simultaneously are judged to have failed. It must be emphasized that this identification can be carried out only if every random variable has a deterministic effect on structural safety. Clearly, the β-sphere and the physical meanings of the random variables play a less important role for higher dimensional problems.
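The dominance argument above can be put in code. The following is an illustrative Python sketch; the function name, the sign convention, and the vectorized form are assumptions of this sketch, not the paper's implementation:

```python
import numpy as np

def identify_by_monotonicity(points, known_point, known_is_safe, signs):
    # Classify unevaluated points using one evaluated point, assuming every
    # random variable has a deterministic monotone effect on safety.
    # signs[i] = +1 if increasing variable i improves safety (e.g. a size
    # or a material strength), -1 if it degrades safety (e.g. a load).
    # Returns a boolean mask of the points whose class can be inferred to
    # equal that of the known point; the rest remain unknown.
    oriented = np.asarray(points, float) * signs      # larger value = safer
    ref = np.asarray(known_point, float) * signs
    if known_is_safe:   # dominating a safe point => also safe
        return np.all(oriented >= ref, axis=1)
    else:               # dominated by a failure point => also failed
        return np.all(oriented <= ref, axis=1)
```

For example, with variables (strength, load) and a known safe point (10, 5), a point (12, 4) is stronger and less loaded and is therefore identified as safe without any performance function call.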
Adaptive scheme to build SVM
Sampling is performed in standard normal space, which is obtained by transformation of the original design space. However, this transformation increases the nonlinearity of the limit state surface,9 which is especially detrimental for SVM, as it directly approximates the limit state surface. It is thus preferred to classify samples in the original space. Note that the space transformation does not change the implied effects of the random variables on structural safety. Hence, sample identification according to the physical meanings of all the random variables is not influenced by the space transformation.
SVM is applied in the original design space. According to the principle of SVM, the separating hyperplane is determined only by the support vectors; evaluating the remaining training points is thus a waste of effort. However, the support vectors can be recognized only once the separating hyperplane is available. As a compromise, an adaptive scheme is devised in which the SVM is continually refined by evaluating only the current most likely support vector at each step. The evaluated sample is then used to identify more samples according to the physical meanings of all the random variables, if possible. Next, all the newly known samples are added to the training set and the current SVM is updated. The initial training set of the adaptive scheme can be readily obtained from the points evaluated during the MPP search.
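The adaptive loop can be sketched as follows, here using scikit-learn's `SVC` in place of the MATLAB SVM toolbox mentioned later. The toy setting, the fixed kernel parameters, and the function names are illustrative only; the paper tunes the kernel parameters by cross-validated grid search rather than fixing them:

```python
import numpy as np
from sklearn.svm import SVC

def adaptive_svm(g, initial_points, pool, n_steps, gamma=1.0, C=100.0):
    # Adaptively refine an SVM classifier of sign(g): at each step evaluate
    # only the pool sample most likely to be a support vector, namely the
    # one closest to the current separating hyperplane (smallest absolute
    # decision value), then retrain on the enlarged training set.
    X = [np.asarray(p, float) for p in initial_points]
    y = [1.0 if g(p) > 0 else -1.0 for p in X]
    pool = [np.asarray(p, float) for p in pool]
    svm = SVC(kernel="rbf", gamma=gamma, C=C).fit(X, y)
    for _ in range(n_steps):
        scores = np.abs(svm.decision_function(pool))
        i = int(np.argmin(scores))            # most likely support vector
        x = pool.pop(i)                       # evaluate only this sample
        X.append(x)
        y.append(1.0 if g(x) > 0 else -1.0)
        svm = SVC(kernel="rbf", gamma=gamma, C=C).fit(X, y)
    return svm
```

The design choice is that each retraining costs only cheap classifier fits, while the expensive performance function is called exactly once per step, on the sample expected to change the separating hyperplane the most.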
A question that naturally arises is how to measure the likelihood of being a support vector in the absence of the exact separating hyperplane. Here, the measurement is made with the latest updated SVM instead. At intermediate steps, the sample
The adaptive scheme is directly aimed at the classification accuracy of SVM on the samples. The convergence condition of the adaptive scheme is further defined by the target of the entire analysis, that is, the failure probability. Once the SVM is updated, a new estimate of the failure probability is given by
in which
Finally, the application of the Gaussian RBF kernel during SVM construction needs to be detailed. The SVM toolbox in MATLAB is used to construct the SVM for each training set. Besides the kernel parameter, which plays an important role, a penalty parameter is commonly introduced to make the SVM model more flexible. To optimize these two parameters, the grid optimization approach, a proven technique, is adopted. Given a series of candidate values for each parameter, this approach tries all possible combinations and carries out cross-validation for each combination. Finally, the combination with the minimum mean error is adopted. For a careful consideration, the possible ranges of both
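A minimal sketch of this grid optimization, assuming scikit-learn's `GridSearchCV` as a stand-in for the MATLAB toolbox (the candidate values and function name are illustrative):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

def tune_rbf_svm(X, y, gammas, Cs, folds=5):
    # Grid optimization of the RBF kernel parameter gamma and the penalty
    # parameter C: every (gamma, C) combination is tried, each is
    # cross-validated, and the one with the best mean accuracy (minimum
    # mean error) is kept and refit on the full training set.
    search = GridSearchCV(SVC(kernel="rbf"),
                          {"gamma": list(gammas), "C": list(Cs)},
                          cv=folds)
    search.fit(X, y)
    return search.best_estimator_, search.best_params_
```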
Implementation procedure
For the sake of clarity, the whole implementation procedure of the proposed method is presented:
Step 1. Transform original design space to standard normal space by Nataf transformation. 27 In the new space, search MPP as FORM usually does. Points evaluated on the performance function in the search process are reserved.
Step 2. Specify the important region, where Monte Carlo samples are then drawn. The samples in the β-sphere are identified as safe samples directly. Each of the known points, including the safe samples and the evaluated points during MPP search, is used to identify samples according to the physical meanings of all the random variables if possible.
Step 3. Transform the standard normal space back to the original space. Correspondingly, all samples are transformed. Leave out the evaluated points not in the important region and transform the rest.
Step 4. Establish SVM using all the known points and then predict signs of all the unknown samples by the SVM. Subsequently, failure probability is estimated according to equation (14). Afterward, the convergence condition is examined. If it is satisfied, the whole procedure is ended with current estimate of the failure probability as the end result.
Step 5. Choose the unknown sample with the minimum
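Step 1's gradient-based MPP search is, in common practice, the HL-RF iteration with finite-difference gradients. A minimal sketch follows; the classical HL-RF update is assumed here, since the paper does not spell the algorithm out:

```python
import numpy as np

def hlrf_mpp(g, u0, h=1e-6, tol=1e-6, max_iter=100):
    # Gradient-based MPP search in standard normal space: the classical
    # HL-RF iteration with finite-difference gradients.  Returns the MPP,
    # beta = ||u*||, and every point at which g was evaluated, which can
    # seed the SVM training set (Step 1 of the procedure).
    u = np.asarray(u0, float)
    evaluated = []
    for _ in range(max_iter):
        g0 = g(u)
        evaluated.append((u.copy(), g0))
        grad = np.array([(g(u + h * e) - g0) / h for e in np.eye(u.size)])
        u_new = (grad @ u - g0) / (grad @ grad) * grad  # HL-RF update
        if np.linalg.norm(u_new - u) < tol:
            u = u_new
            break
        u = u_new
    return u, float(np.linalg.norm(u)), evaluated
```

For a linear limit state the iteration converges in one step; for nonlinear ones it may need several, and each iteration costs n + 1 performance function calls for the finite differences.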
Example validation
To validate the proposed method, four typical examples are analyzed. The accuracy and computational cost of the proposed method are compared with those of other methods. Among the methods used for comparison, Hurtado's21 method receives the most attention, as it deals with the same kind of reliability problems and shares a similar methodology with the proposed method. AK-IS (active learning Kriging and Importance Sampling)28 and Bourinet et al.'s22 method are considered, but not for all the following examples, since both are proposed specially for problems with small failure probabilities. As performance function evaluations govern the computational cost in a real engineering context, the number of calls to the performance function (denoted Ncall below) is used to quantify the computational cost. For all the examples, the proposed method, Hurtado's method, AK-IS, and Alibrandi's method search the MPP by the gradient-based algorithm; their costs therefore come on top of the cost of FORM, as the MPP search contributes the main computational cost of FORM.
Example 1: two-dimensional nonlinear example
For the sake of graphic visualization, a two-dimensional case is studied first, which is also used for validation in the works of Grandhi and Wang29 and Grooteman.30 It has a small failure probability and its nonlinear performance function reads
in which
Results are presented in Table 1. The reference failure probability is obtained by MCS with 1E7 samples. The MPP search starts at the space origin. It takes just six calls to the performance function, offering the MPP and
Results of Example 1.
MCS: Monte Carlo simulation; ARBIS: adaptive Radial-based Importance Sampling; FORM: first-order reliability method; AK-IS: active learning Kriging and Importance Sampling.
Figure 2 shows the evolution of the failure probability estimate in the proposed adaptive procedure. It can be seen that the estimate quickly reaches a stable stage. Table 1 indicates that the proposed method yields a failure probability very close to the reference value, but with much less computational cost than MCS, adaptive Radial-based Importance Sampling (ARBIS),30 Bourinet's method, and Hurtado's method. Its efficiency may be attributed to three aspects. First, the probability content of the important region is as little as 0.155, which implies that the proposed method needs only 0.155 times the samples required by MCS for a similar accuracy level. Second, among the total 1E4 samples, 9413 lie inside the β-sphere (see Figure 1), so they can be identified as safe effortlessly. Finally, the designed adaptive scheme to build the SVM, chiefly the criterion for training point selection, produces a fast convergence speed. AK-IS shows roughly equivalent performance to the proposed method under an overall consideration of efficiency and accuracy. Bourinet et al.'s22 method is the most accurate but requires a rather heavy computational burden. When applying Alibrandi et al.'s16 method, the suggested aperture

Example 1: history of the convergence of the proposed method.
Example 2: dynamic response of a nonlinear oscillator
The second example involves a nonlinear undamped system with a single degree of freedom (Figure 3). It has served as a validation example many times.31,32 The performance function is defined by
with

The nonlinear oscillator.
Distributions of the random variables in Example 2.
The results of FORM, Hurtado's method, and the proposed method are shown in Table 3; also shown are two methods combining Importance Sampling with different surrogates,31 AK-MCS (active learning Kriging and Monte Carlo simulation),32 AK-IS, and Alibrandi's method. FORM searches the MPP with 21 calls to the performance function and accordingly gives a failure probability with a relative error of 9.1%. Using the MPP, the proposed method decreases the relative error to 4.5%, at the expense of an additional 32 calls to the performance function. For the same purpose, Hurtado's method performs an additional 56 calls to the performance function and produces a relative error of 4.9%. Alibrandi's method performs an additional 30 calls to the performance function, but its relative error is as much as 31.6%. For the proposed method, the important region is defined by
Results of Example 2 (case 1: mean and standard deviation of F1 are 1 and 0.2, respectively).
MCS: Monte Carlo simulation; FORM: first-order reliability method; AK-MCS: active learning Kriging and Monte Carlo simulation; AK-IS: active learning Kriging and Importance Sampling.
The second case is taken from the work of Echard et al.,28 where the mean and standard deviation of F1 are 0.6 and 0.1, respectively. These values are adopted to produce a rather small failure probability. Results obtained by the related methods are presented in Table 4. The proposed method has the lowest computational cost except for FORM, and its accuracy is satisfactory. In its application, the important region is defined as in the previous case, and the corresponding probability content of the important region is 0.1478, so the required samples can be largely reduced compared with MCS. Still, 2E6 samples are used for such a small failure probability. However, most samples lie inside the β-sphere, and only 4522 samples remain to be classified. The role of the β-sphere in this case is thus significant, contributing largely to the efficiency of the proposed method.
Results of Example 2 (case 2: mean and standard deviation of F1 are 0.6 and 0.1, respectively).
MCS: Monte Carlo simulation; FORM: first-order reliability method; AK-IS: active learning Kriging and Importance Sampling.
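For reference, the performance function of this oscillator benchmark, in the form commonly used in the literature (failure corresponds to g < 0), can be written as follows. The article's own equation is elided above, so the standard benchmark form is assumed here:

```python
import numpy as np

def oscillator_g(c1, c2, m, r, t1, F1):
    # Performance function of the nonlinear undamped single-DOF oscillator
    # benchmark, in the form commonly used in the literature: three times
    # the displacement limit r minus the peak dynamic response of the
    # oscillator (spring stiffnesses c1, c2; mass m) to a rectangular
    # pulse of amplitude F1 and duration t1.
    omega0 = np.sqrt((c1 + c2) / m)                  # natural frequency
    return 3.0 * r - np.abs(2.0 * F1 / (m * omega0 ** 2)
                            * np.sin(omega0 * t1 / 2.0))
```

Evaluated at the mean values of case 1 (c1 = 1, c2 = 0.1, m = 1, r = 0.5, t1 = 1, F1 = 1), g is positive, i.e. the mean point is safe, as expected for a problem with a small failure probability.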
Example 3: 23-bar truss
This example deals with a 23-bar truss structure (Figure 4).33 It includes a moderate number of random variables, which are non-normally distributed. The safety requirement concerned is that the center deflection (

The 23-bar truss structure.
Statistical properties of all the random variables are shown in Table 5. The elastic modulus and cross-sectional areas are assumed to be perfectly correlated among all the horizontal bars, and likewise among all the diagonal bars. In Table 5, E1 and A1 denote the elastic modulus and cross-sectional area, respectively, of all the horizontal bars; for all the diagonal bars, they are E2 and A2, respectively.
Distributions of the random variables in Example 3.
Results are shown in Table 6, where RSMM is the abbreviation for the response surface augmented moment method.33 FORM performs 44 calls to the performance function and gives a failure probability with a 40.0% relative error. The proposed method reduces the relative error to 1.1% with an additional 32 calls. In this process, 1E3 samples are drawn in the important region specified with
Results of Example 3.
MCS: Monte Carlo simulation; RSMM: response surface augmented moment method; FORM: first-order reliability method; AK-IS: active learning Kriging and Importance Sampling.
Example 4: three-bay and 12-story frame
Finally, a frame structure34 (shown in Figure 5) is studied, based on the consideration that the implied finite element model involves another typical element type, the beam element. This structure is assumed to have six basic random variables: the wind load P and the member cross-sectional areas A1, A2, A3, A4, and A5. Ai corresponds to the members labeled with the number i (see Figure 5),
in which

The three-bay and 12-story frame structure.
Distributions of the random variables in Example 4.
The performance function is defined as
where
Results of this example are listed in Table 8. The reference failure probability is obtained by Importance Sampling with 2000 samples.34 Two surrogate-based simulation methods are also included for comparison.10,35 It can be seen from Table 8 that FORM needs the fewest performance function evaluations but is inaccurate. The cumulative formation of the response surface and Alibrandi's method do not require too much cost, but their relative errors are large. On the contrary, the RBF network requires much more cost but yields an almost exact result. An accurate result is also achieved by the proposed method, with only about twice the number of calls to the performance function needed by FORM. For this example, the important region is defined by
Results of Example 4.
FORM: first-order reliability method.
Discussion
AK-IS is not as efficient as the proposed method for this ten-dimensional example (Example 3). Moreover, it is specially designed to cope with small failure probabilities, which may not be easily identified beforehand in practical cases. Alibrandi's method can be considered efficient, but the accuracy of its result is hard to guarantee. On the contrary, Bourinet's method obtains accurate results at a large computational cost.
The proposed sampling strategy consists in restricting MCS to the important region so that the number of samples can be reduced. On the other hand, the failure probability content outside the important region is not counted in the failure probability integral. So, the side length of the important region, k, should be large enough to cover most of the failure probability content. However, an increase in k results in more samples to be simulated. There is clearly a trade-off between preserving accuracy and improving efficiency. In section "MCS in important region," the interval [1, 2] is recommended, based largely on experience; it is applicable for all the examples except the second one. In view of this, one can set the value of k adaptively when necessary: start with a small k and then increase it several times until the failure probability stabilizes. Each k is associated with a complete estimation process for the failure probability. During each estimation, sampling starts from the same seed so that the samples of the previous estimation can be reused. Therefore, the adaptive way does not bring a substantial increment of computational cost over the direct assignment of k. To illustrate the adaptive way, the second example is reanalyzed with k determined adaptively. The variation in the results due to the changes in k is shown in Figure 6, which indicates that the adaptive way is applicable.
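The adaptive choice of k can be sketched as follows (illustrative Python; `estimate_pf` stands for one complete estimation run with the given region size and seed, and its name and signature are assumptions of this sketch):

```python
def stabilize_region_size(estimate_pf, ks=(1.0, 1.25, 1.5, 1.75, 2.0),
                          rel_tol=0.02, seed=0):
    # Set the half-width k of the important region adaptively: rerun the
    # whole estimation with growing k (same random seed, so the samples of
    # the previous run are reused) until the failure probability estimate
    # stabilizes to within rel_tol.
    previous = None
    for k in ks:
        pf = estimate_pf(k, seed)
        if previous is not None and abs(pf - previous) <= rel_tol * previous:
            return k, pf
        previous = pf
    return ks[-1], previous
```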

Example 2: effect of size of the important region on the results of reliability analysis.
Also of concern is the sample amount, which is determined by the conditional failure probability of the important region for a given accuracy level. In the above examples, we directly set a whole number with a coefficient of variation of the failure probability between 0.05 and 0.1, just to show the performance of the proposed algorithm. As the conditional failure probability is the goal of the simulation in practical applications, an adaptive way is suggested to determine the sample amount when prior knowledge about the conditional failure probability is insufficient. Its implementation is quite simple. First, a small number of samples are generated. Then, the conditional failure probability is estimated, according to which the required number is recomputed. Sampling is then continued to reach the required number, which initiates the next cycle of estimation. In general, both the sample amount and the conditional failure probability converge after several steps.
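This adaptive determination of the sample amount can be sketched as follows. The coefficient-of-variation formula for a Monte Carlo probability estimate is standard; the function names and the `count_failures` interface are illustrative assumptions:

```python
import math

def required_samples(p, target_cov):
    # For a Monte Carlo estimate of a probability p, the coefficient of
    # variation is sqrt((1 - p) / (N * p)), so the sample amount needed to
    # reach a target value is N = (1 - p) / (p * target_cov**2).
    return math.ceil((1.0 - p) / (p * target_cov ** 2))

def adaptive_sample_amount(count_failures, n0, target_cov, max_rounds=20):
    # Start with a small number n0 of samples, estimate the conditional
    # failure probability, recompute the required sample amount, and keep
    # sampling until the current amount suffices.  count_failures(n) is a
    # user-supplied routine returning the number of failed samples among
    # the first n (so earlier samples are reused across rounds).
    n = n0
    for _ in range(max_rounds):
        p = count_failures(n) / n
        needed = required_samples(max(p, 1.0 / n), target_cov)
        if n >= needed:
            return n, p
        n = needed
    return n, p
```

For instance, a conditional failure probability near 0.1 with a target coefficient of variation of 0.05 settles at N = (1 - 0.1)/(0.1 x 0.05^2) = 3600 samples.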
Finally, to validate the efficiency of the proposed criterion for training point selection on its own, we compare it with that of Hurtado's method, which selects a sample in the SVM separation margin at each step. To this end, the two corresponding adaptive schemes are executed on the same sample population and from the same initial training set; the β-sphere and the physical meanings of the random variables are not considered here. Example 3 is taken for the validation, with the sample population and initial training set being those adopted by the proposed method in Example 3. The numbers of additional calls to the performance function for the proposed scheme and Hurtado's scheme are 29 and 73, respectively. These results indicate that the proposed criterion for training point selection is more efficient than that of Hurtado's method.
Conclusion
Regarding the use of SVM in reliability analysis, several papers have shown that it is efficient to iteratively refine the SVM with a focus on the important region. This article applies SVM in the same way for low-dimensional problems with a unique MPP. Its originality mainly lies in two aspects. First, simulation is performed only in the important region, using Monte Carlo samples drawn in that region. Second, for the adaptive scheme to build the SVM, a criterion for training point selection is proposed that adds the most likely support vector to the training set at each step. The examples indicate that the proposed sampling strategy can largely reduce the number of samples to be classified, aided by the geometrical interpretation of the Hasofer–Lind reliability index β. Moreover, the physical meanings of all random variables are utilized to identify samples where possible. The criterion for training point selection speeds up the convergence of the adaptive procedure for SVM construction compared with Hurtado's method. The four typical examples show that the proposed method is accurate and efficient. For a more flexible method, the important region and the sample amount can be specified in an adaptive manner without a significant increase in computational cost.
Footnotes
Academic Editor: Jianqiao Ye
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Fundamental Research Funds for the Central Universities (Grant No. HIT.MKSTISP. 201609).
