Abstract
The reliability-related design has been a crucial chain in complex and reliability-critical engineering systems. It serves as a preventive countermeasure to catastrophic failures caused by uncertainties. To keep up with this rapidly developing field, this paper presents a novel metaheuristic, based on the variational Bayesian inference (VBI) to efficiently solve the optimal reliability design. Specifically, the reliability-redundancy allocation problem (RRAP). The proposed metaheuristic starts from a primary population, then leverages VBI to fully excavate the information of feasible individuals from the previous generation, in order to produce the next-generation population. This process is iterated until the solution converges to the optimal decision scheme. In addition, we set up an automatic stratification strategy, so that the new individuals can approach the optimal solution faster. Furthermore, we divide RRAP into a reliability optimization problem (ROP) and a redundancy allocation problem (RAP). This not only reduces the dimension of decision variables, but also speeds up the convergence. ROP and RAP are solved sequentially and iteratively until the preset stopping condition is satisfied. The case studies showcase that the proposed approach can obtain the optimal or near-optimal solution within a reasonable period.
Keywords
Introduction
Nowadays, the advancement of high-tech industrial technology has increased the requirement of system functionalities. As feedback, engineering systems have experienced an ever-increasing level of sophistication. Moreover, the ever-harsher work environment together with the increasing awareness of high reliability, creates challenging optimal reliability design for complicated and expensive systems, forcing it to be a hot spot of intensive concern in both academia and industry.1,2 System reliability can be enhanced by providing redundancy at the component level and/or increasing component reliabilities.3–5 This, without any doubt, increases the system budget. Thus, it requires a smooth trade-off between budget constraint and system reliability to obtain a cost-effective design. The problem that maximizes the system reliability through redundancy or/and component reliability choice is dubbed as the reliability redundancy allocation problem. It can be further categorized into three streams according to the decision variables’ types, that is, the redundancy allocation problem (RAP, a pure integer nonlinear programming), reliability optimization problem (ROP, a continuous nonlinear programming), and reliability-redundancy allocation problem (RRAP, a mixed-integer nonlinear programming). They are non-deterministic polynomial hard (NP-hard) problems. The topic this paper is concerned with pertains to RRAP.
In the light of designers’ requirements, RRAP can be formulated either to maximize the system reliability under resource constraints or to minimize the total cost under the minimum demand on system reliability. For brevity, we only utilize the former to illustrate the proposed approach. Putting more clearly, the general mathematical formulation of RRAP is:
where
The solution of equation (1) includes two parts, that is, component (reliability) choices and redundancy levels. RRAP is classified as the hardest problem in the optimal reliability design community. This is because the involved decision variables are mixed-integer and the system reliability function is nonlinear, non-separable, and non-convex. Hence, most of the canonical mathematical methods have not been successful in finding an optimal or near-optimal solution for RRAP. In contrast, solving RRAP via metaheuristics has attracted increasing attention in recent years.
In the past few decades, the genetic algorithm (GA) has attracted the widest attention.6,7 GA is a numerical simulation approach coming from mimicking the natural selection and genetic mechanism. GA converts the process solving the optimization problem into a process similar to the crossover and mutation of chromosome genes in biological evolution. When solving complex combinatorial optimization problems, GA can usually rapidly obtain good optimization results. The GA-based metaheuristics for solving RRAP, for example, Ardakan et al. 8 investigated RRAP of hybrid-redundancy systems, and a modified version of GA was developed to solve the given optimization problem. Kim and Kim 9 investigated RRAP considering the optimal redundancy strategy (either active or cold-standby), and applied the parallel GA to solve the formed optimization problem. Particle swarm optimization (PSO) is another prevalent approach for solving RRAP. PSO is a random search algorithm that was constructed by imitating the foraging behavior of birds.10,11 PSO is initialized as a group of random particles (random solutions) within the feasible domain, and the optimal solution is found through subsequent iterations. In each iteration, the particles update themselves by tracking two optimal solutions, that is, the one found by the particle itself (individual best or personal best) and the other one found by the whole population (global best). For the PSO-based metaheuristics to solve RRAP, Wu et al. 12 proposed an improved PSO algorithm involving two position updating strategies to solve RRAP, in which a mutation operator was introduced after position updating. This not only prevents the algorithm from trapping into the local optimum, but also enhances its space developing ability. Ouyang et al. 13 introduced heterogeneous components into RRAP, and the formulated optimization problem was resolved by an improved PSO algorithm with stochastic perturbation nature.
Artificial bee colony algorithm (ABC) is an optimization method proposed via imitating the behavior of bees. It does not need to learn the special information of the problem, but only compare the advantages and disadvantages of the obtained information. Through the local optimization behavior of the individual artificial bee, the global optimum is finally emerged among these local optimums. The application of ABC algorithm for solving RRAP, see Refs.14–16Cuckoo search algorithm (CS), originated from imitating the brood parasitism of cuckoo, is a population-based intelligent optimization approach. It mainly depends on two strategies, namely, the brood parasitism of cuckoo and the Levy flight mechanism. 17 The main advantages of CS algorithm include: (1) small number of parameters, (2) easy implementation, (3) random search path, and (4) strong optimization ability. The applications of CS in RRAP, see Refs.18–20 The other metaheuristics for solving RRAP are, for example, the artificial fish swarm algorithm, 21 biogeography optimization approach, 22 and stochastic fractal search algorithm.23–25 In addition, many studies have devoted their vigour to hybrid metaheuristics, in hope of obtaining more efficient algorithms than the single metaheuristic, for example, the combination method of PSO and simplified swarm optimization, 26 the hybrid Jaya algorithm based on a time-variant acceleration coefficient, 27 and the bilevel evolutionary algorithm based on quadratic approximations. 28
Although fruitful achievements have been made in RRAP, obviously, the research on RRAP will not stop here. It remains an exciting topic with many new research challenges and opportunities. Motivated by this fact, a novel optimization algorithm based on the variational Bayesian inference (VBI) is proposed, aiming to search for the optimal decision scheme more quickly and globally. VBI is one of the variational methods in statistical inference, which can asymptotically obtain the posterior distribution of latent variables from a given variational family.29–31 VBI leverages the mean field theory (MFT) to expand the posterior joint distribution of latent variables into the product of the marginal distributions of latent variables, in order to obtain the calculation framework. It can be utilized as a low computational alternative to the Markov chain Monte Carlo (MCMC), and has been applied to some machine learning algorithms, such as the learning of variational autoencoder.
The proposed optimization approach starts from an initial population randomly generated from the design region. Then, the next-generation population is produced by VBI, through leveraging the information provided by feasible individuals of the previous-generation population. This process is iterated until the optimal decision scheme is found. It is worth mentioning that the proposed algorithm does not utilize the information of all feasible individuals in each generation to produce new individuals, but first layers feasible individuals according to their objective function values by setting an intermediate threshold, and carries out the optimization algorithm along the direction of the reduction of the objective function value (for the minimization problem), in order to search for the optimal solution faster. The proposed optimization approach is a numerical-simulation-based technique, which is suitable for optimization problems with discontinuous, discrete, and multiple feasible regions. To apply the proposed optimization algorithm to solve RRAP, we first divide RRAP into ROP together with RAP, and solve these two formulated problems sequentially and iteratively until the stopping condition is satisfied. Such segmentation strategy transforms the original
This paper is structured as follows. The problem formulation is briefly expounded in Section 2. Section 3 concretely presents the proposed optimization algorithm for solving RRAP. The performance of the proposed approach is validated by several examples in Section 4. Conclusions are drawn in Section 5.
Problem formulation
Conventionally, we first transform the RRAP in equation (1) (a maximization problem) into a minimization problem:
where
We then suppose that the decision variables in
To search for the optimal solution of design variables, the intuitive measure is to produce the next-generation better feasible individuals based on the information provided by current feasible individuals, and repeat this process to mine the feasible domain until the optimal solution comes to light. This is a general procedure of the greedy optimization algorithm. Undoubtedly, the quality of the next-generation population severely affects the convergence speed of the optimization process and the precision of the final solution. In other words, the strategy of producing the next-generation population is really crucial. The most classic strategy to generate the next-generation population should be the MCMC, but MCMC has two fatal defects: (1) the generation of the latter individual is reliant upon previous individuals, resulting in the generated individuals being related rather than independent; (2) the decision of acceptance or rejection of new individuals relies on the proposal distribution and a uniformly distributed variable. MCMC is a random approach, making it difficult to converge to a stable state. Hence, this paper adopts VBI instead of MCMC to produce the next-generation population. VBI is a deterministic approach, which is more stable than MCMC.
Proposed approach for solving RRAP
To facilitate understanding, we first list the general execution process of the proposed “crude” optimization algorithm. After showcasing the key techniques of the proposed optimization approach, we design an “elaborate” optimization algorithm for exploring the optimal decision scheme of RRAP.
Proposed non-directional optimization approach
This section first presents the general whole algorithm of the proposed “crude” optimization algorithm, as shown in Algorithm 1. This “crude” optimization algorithm is referred to as the non-directional approach, and the corresponding reason will be concretely explained later. After that, the key factors of the proposed approach are revealed, in order to pave the way for the subsequent design of an improved optimization algorithm.
From Algorithm 1, we can see that the key of the proposed approach is how to establish a generator to produce the next-generation population based on the information about the existing feasible individuals. This is to increase the number of feasible individuals and procure individuals that are better than those of previous generations. To address this problem, the first idea that comes to mind is to fit a conditional probability density function (PDF)
where
However, the conditional PDF
The process of the proposed non-directional approach for solving the optimal reliability design is depicted in Figure 1, where MCS denotes the Monte Carlo simulation. In Figure 1, MCS is first used to generate the primary-generation individuals (blue points) in the design space. These individuals are divided into infeasible individuals (circular points) that do not meet the constraints and feasible individuals (triangular points) that satisfy the constraints. By calculating the objective function values of these feasible individuals, we can obtain the current minimum of the objective function (

Illustration of the VBI-based non-directional approach for searching for the optimal design.
Variational Bayesian inference
This section first expounds the basic theory and general algorithm of VBI. Then, some details are illustrated using an example.
Basic theory of VBI
The KL-divergence of the asymptotical model
Then, the optimal asymptotical model
Furthermore, we convert equation (4) into:
where
Let
Since
Following the MFT, we can obtain
where
In addition,
where
Combining equations (8) and (9), the ELBO
where
Since
Because
Finally, the asymptotical model
Case study
We take the rosenbrock function (Banana function) as an example to demonstrate the process of VBI approximating a PDF. The 2D banana function is as follows:
where
It is seen that the global minimum of

Contour map of the 2D banana function.
Now, we need to explore this special valley to find the minimum of the 2D banana function. We suppose that the information of some samples located in or around this valley has been learned. Let
First of all, we need to allocate a prior distribution to the conditional PDF
After that, the optimal asymptotical model

The approximate valley zones of the 2D banana function: (a) Scenario 1, (b) Scenario 2, (c) Scenario 3, (d) Scenario 4, (e) Scenario 5, and (f) Scenario 6.
It can be concluded from Figure 3 that, different prior distributions will obtain different posterior distributions, causing the approximate valley zones to be different from each other. Moreover, even if the form of the prior distribution is same, the parameter settings also have an impact on the posterior distribution. The ELBO estimates and corresponding standard deviations (Sd. (ELBO)) and the estimated minimum values (Min) of the banana function are presented in Table 1. It should be pointed out that, the uncertainty of the ELBO estimate comes from the Bayesian quadrature. It is seen that the minimum of the banana function is related to the prior distribution. For this case study, the normal prior with mean values {0.5, 0.5} and standard deviations {5, 5} (i.e., Scenario 2) is the best choice. Furthermore, we consider a special situation, as follows: if the obtained minimum is smaller than
Results of the case study under different prior distributions.
Proposed directional optimization approach
Until now, we have presented the general process and key techniques of the proposed optimization approach for solving optimal reliability design. As shown in Figure 1, the first-generation individuals are generated randomly in the design region, then the feasible individuals are filtered. On the premise of the information provided by current feasible individuals, the next-generation individuals are produced by VBI, so the explored feasible areas are expanded and the number of feasible individuals are increased. The optimal solution can be procured through repeating this process until the termination condition is met. However, in this approach, although the number of feasible individuals increase iteratively, these feasible individuals do not necessarily increase along the direction of the reduction of the objective function value. In other words, the next-generation population is produced based only on the information provided by current feasible individuals. This increases the number of feasible individuals disorderly and aimlessly, and the optimal individual of the new-generation population may not be as good as that of the previous-generation population. It can be seen from Figure 1 that the feasible individuals of the next generation are not necessarily better than those of the previous generations (the objective function values of the new feasible individuals are not necessarily smaller than those of the previous feasible individuals). This fact may incur in slow convergence of the optimization algorithm. For the above reasons, this approach is dubbed as a non-directional method.
General whole algorithm of the proposed directional optimization approach
In response to the above issues, we will make an improvement to the non-directional approach, so that the improved approach could produce more feasible individuals along the direction of reducing the value of the objective function to speed up the convergence of the algorithm. The basic measure of the improved approach is to set a series of intermediate events, forcing new individuals to be produced along the direction that the objective function value is smaller than the threshold of the intermediate event. Then, the new-generation individuals will fall (as far as possible) in the area whose objective function value is smaller than the preset threshold. These intermediate events are automatically determined instead of being set in advance, so the improved approach is very easy to be implemented without any additional constraints. The specific execution steps of the proposed improved optimization algorithm are listed in Algorithm 3.
Comparing Algorithm 1 and Algorithm 3, we can observe that the improved algorithm has a filtering operation (Step 6). In this step, a fraction

Illustration of the VBI-based directional approach for searching for the optimal design.
Mapping function for integer decision variables
The proposed directional approach for solving the optimal reliability design involving real decision variables has been concretely presented. Here, we assume that the optimization problem only contains integer decision variables, that is,
Until now, the proposed optimization algorithm has been introduced in detail. To facilitate understanding, the flowchart of the proposed optimization algorithm for solving the optimal reliability design is depicted in Figure 5.

Flowchart of the proposed optimization algorithm.
Proposed two-stage approach for solving RRAP
As mentioned, the solution of RRAP contains two parts, namely, the reliability choices as well as the redundancy levels. Of course, the proposed optimization approach can simultaneously handle the integer and real design variables, but such a hybrid measure has two defects: the computational effort of handling the integer and real decision variables simultaneously is huge. This is because the sum of dimensions of these two types of design variables is high, leading to the longer process time in producing new individuals; moreover, this high dimension also reduces the accuracy of VBI, resulting in the produced new individuals to be sub-optimal.
In response to this fact, we propose a two-stage approach to efficiently and precisely solve RRAP. The first stage is to solve ROP, in order to determine the reliability of components when the redundancy level is fixed. The second stage is to solve RAP, aiming to determine the redundancy level of subsystems when the reliability choice is known. These two stages are conducted sequentially and iteratively until the termination criterion is satisfied. The concrete algorithm of the proposed two-stage approach for solving RRAP is presented in Algorithm 4.
Discussion and remarks
(1) The proposed optimization algorithm starts from the string set of individuals instead of a single initial point. Thus, it is not easy for it to be caught into the local optimum, and is conducive to global optimum.
(2) The proposed optimization algorithm processes multiple individuals in a population, simultaneously. This not only reduces the risk of falling into the local optimum, but also facilitates parallelization.
(3) The proposed optimization algorithm does not need the information of the design space or any auxiliary information in advance, but only utilizes the given constraints to screen feasible individuals, and uses the objective function to evaluate individuals to guide its search direction. The objective function or constraint function is not constrained by any assumption or premise. These characteristics greatly expand the scope of application of the proposed optimization algorithm.
(4) The proposed optimization algorithm generates a new population based on a group of excellent individuals rather than the most prominent individual. This avoids the emergence of super individual and eventually circumvents the local optimum.
(5) In the proposed optimization algorithm, each generation only retains the
(6) To get more excellent individuals at each generation, we do not require that the individuals strictly satisfy the constraints, but retain those individuals that meet the constraints and those that are close to the constraint boundary. This aims to avoid missing the individuals that contribute greatly to the accuracy of the optimization algorithm. Therefore, the “feasible” individuals of the
where
(7) The proposed two-stage approach converts the solution of RRAP into a process of sequentially solving ROP and RAP. In each iteration, only the reliability choice or redundancy level needs to be optimized. This reduces the dimension of design variables that need to be optimized and raises the optimization speed.
(8) This paper uses the uniform prior distribution in VBI, because the decision variables
Examples
To showcase the performance of the proposed optimization approach for solving RRAP, three examples, including a series system with 10 subsystems, an oil transportation system, and a 15-unit system, are presented in this section. The CPU type which this paper used is “11th Gen Intel(R) Core(TM) i7-11800H @ 2.30 GHz.”
A series system with 10 subsystems
A series system with 10 subsystems, modified from Refs.,12,17 is first used to validate the feasibility of the proposed algorithm. The optimization problem is to determine the reliability choice and redundancy level of each subsystem, as follows:
where the objective function equation (15) is the reliability of this series system. The first constraint equation (16) is the volume constraint related to the redundancy level. The second constraint equation (17) is the budget constraint that is related to the redundancy level and reliability choice. The third constraint equation (18) is the weight constraint.
For comparison, we first list the results obtained by GA under different population sizes (
Optimal solutions of example 1 obtained by GA under different population sizes (
Optimal solutions of example 1 obtained by GA under different choices of
Then, to illustrate the performance of the proposed approach for exploring the optimal design of this system, the results obtained by the proposed approach under different population sizes
Optimal solutions of example 1 obtained by the proposed approach under different population sizes (
Optimal solutions of example 1 obtained by the proposed approach under different choices of
The possible mechanism of population size
The possible principle of the parameter
An oil transportation system
Consider an oil transportation system shown in Figure 6, which is abstracted from practical applications. This system needs to transport the oil from the starting city to the destination city. This system possesses multiple transportation routes, and there are multiple transfer stations built along each route. The oil continues to be transported after passing through transfer stations till it is transported to the destination. Now the oil supplier needs to establish these transfer stations at the necessary 10 positions, that is, there are 10 subsystems in this oil transportation system. It is supposed that the reliability of this system is only reliant upon the reliabilities of transfer stations

Schematic view of example 2.
Then, the reliability of this system can be obtained according to its structure, as follows:
where
Here,
Similarly, we first apply GA to obtain comparison results. The optimal solutions obtained by GA under different population sizes and crossover fractions are listed in Tables 6 and 7, respectively. It can be concluded that the population size affects the final optimal solution as well as the computational efficiency. With the increased size of population, GA needs longer time to procure the optimal solution. Meanwhile, the crossover fraction also severely affects the computational precision and efficiency. When the crossover fraction is smaller than 0.5, the obtained optimal solution seems to be suboptimal.
Optimal solutions of example 2 obtained by GA under different population sizes (
Optimal solutions of example 2 obtained by GA under different choices of
The optimal solutions obtained by the proposed approach under different population sizes
Optimal solutions of example 2 obtained by the proposed approach under different population sizes (
Optimal solutions of example 2 obtained by the proposed approach under different choices of
A 15-unit system
Consider a 15-unit system shown in Figure 7, which is modified from Refs.17,32 The system reliability is:
where

Structure of the 15-unit system.
The optimization model of this 15-unit system is:
where
Moreover,
We first apply GA to solve this optimal reliability design problem, and the optimal solutions under different population sizes and choices of crossover fractions are listed in Tables 10 and 11, respectively. In this example, the population size affects the final optimal solution and CPU time. In addition, the crossover fraction severely influences the final optimal solution, but has little effect on the CPU time.
Optimal solutions of example 3 obtained by GA under different population sizes (
Optimal solutions of example 3 obtained by GA under different choices of
The optimal solutions obtained by the proposed approach under different population sizes and choices of
Optimal solutions of example 3 obtained by the proposed approach under different population sizes (
Optimal solutions of example 3 obtained by the proposed approach under different choices of
Conclusions
Two issues are addressed in this paper, that is, proposing an effective optimization algorithm and applying it to solve RRAP. The proposed optimization approach is initialized as a small size population, then new populations are generated by VBI through sufficiently utilizing the information of the existing feasible individuals. Moreover, an automatic stratifying strategy is proposed to accelerate the speed of convergence. Thus, the new population will be generated iteratively based on informative feasible individuals, rather than all feasible individuals, until the optimal solution comes to light. The proposed optimization approach can avoid being caught into local optimum. It does not make any additional assumptions or preconditions for the optimization problem, so it can be easily applied to various kinds of optimization problems. After that, a sequential two-stage approach based on the proposed optimization algorithm is developed to solve RRAP, which divides RRAP into two parts, i.e., ROP and RAP, and solves these two problems sequentially and iteratively. This reduces the number of decision variables, thus decreasing the computation complexity of solving RRAP. The listed examples showcase that the proposed approach tends to obtain more reliable systems than GA, although it may need longer CPU time.
Footnotes
Appendix
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The work described in this paper was supported by the Hong Kong Institute for Advanced Study.
