Abstract

Decomposing a large-scale problem into small-scale subproblems and optimizing them cooperatively are critical steps in solving large-scale optimization problems. This article proposes a cooperative differential evolution with utility-based adaptive grouping. The problem decomposition is executed adaptively by two mechanisms, a circular sliding controller and a relation matrix, which consider the variable interactions on the basis of short-term and long-term utilities, respectively. The circular sliding controller provides baselines for the subproblem optimizer. The size of the sliding window and the sliding speed in the controller are adjusted adaptively so that variables with higher activity can be optimized more extensively. The relation matrix–based grouping strategy enables interacting variables to be grouped into the same subproblem with higher probability. The novelty is that decomposition is conducted during the optimization process without extra computational burden. For subproblem optimization, we use a self-adaptive differential evolution operator that adjusts its parameters adaptively to guide the search toward the optimal solutions of the subproblems. Experiments on the CEC2008 and CEC2010 benchmarks and on practical problems show the effectiveness of the proposed algorithm.

Keywords

Cooperative coevolution, differential evolution, large-scale optimization, circular sliding controller

Introduction

High-dimensional optimization problems arise in many engineering fields, so effective and efficient optimization algorithms are always in demand.^1,2 Evolutionary computation has emerged as an intelligent computing discipline for optimization problems; it contains a large number of heuristics that have strong robustness and global search ability and do not require domain knowledge. Thus, evolutionary computation has been successful in many applications. However, many of these heuristics are still plagued by the “curse of dimensionality,”³ which means that their performance deteriorates in high-dimensional environments.

To solve high-dimensional optimization problems, a natural solution is to adopt cooperative coevolution (CC),^4,5 a framework based on divide-and-conquer. The CC algorithm has shown good performance on separable problems but degenerates on nonseparable problems.^6,7 Thus, research and development on CC algorithms for nonseparable optimization problems have attracted the attention of researchers.^8,9 Essentially, the CC algorithm reduces the complexity of a problem through decomposition,¹⁰ and its performance is sensitive to the decomposition strategy.¹¹ Based on how they deal with the variable interactions, the existing algorithms can be classified into three categories: those that ignore the interactions, those that explore them explicitly, and those that explore them implicitly. Generally, the algorithms that do not consider variable interactions are effective for separable problems but have difficulty in solving problems with interacting variables.^12,13 On the other hand, the algorithms that consider variable interactions implicitly or explicitly provide opportunities to solve problems with interacting variables.^5,14 However, object-oriented strategies that explore the variable interactions extensively while avoiding extra computational effort are still required. Moreover, as the interactions among subproblems affect the algorithm’s performance,¹⁵ strategies for co-adapting the subproblems are also required.

In order to address the aforementioned issues, we propose a cooperative differential evolution with utility-based adaptive grouping (CoDE-AG). For problem decomposition, two strategies are proposed: the circular sliding controller (CSC) and the relation matrix (RM). In the controller, the variables to be optimized are arranged in a loop, and a variable window covers a subset of the loop. During optimization, the variables covered by the window are optimized, and the variables outside the window are kept constant. So that all the variables are optimized, the window slides after the optimization of each subproblem. Since different subsets of the variables have different landscapes and require different computational efforts, the size of the window and the sliding speed are self-adapted. A window size that performs well has a higher probability of being adopted for the next cycle. Meanwhile, the variable subsets are identified as active, regular, and inactive regions. The active and regular regions imply that the variables are likely to interact, and the sliding speed is slow when the window covers them. The inactive regions imply that the variables do not interact, and the sliding speed is fast when the window covers them. So that these variables can be optimized together with latent interacting variables, the variables belonging to the inactive regions are permuted for the next cycle. Moreover, to improve the adaptability of grouping the subproblems, a reward-based RM is constructed to reflect the long-term utilities of variable interactions. The underlying assumption is that if a subset of variables optimized in the same window yields an improvement in the optimization results, then these variables are more likely to interact. For optimization of the subproblems, that is, the subsets of variables covered by the window, a self-adaptive differential evolution (SaDE) operator is adopted.

The rest of this article is organized as follows. The “Related works and background” section reviews the CC and DE (differential evolution). The “CoDE-AG” section describes the details of the CoDE-AG. The “Experimental studies” section presents the experimental studies, and the “Conclusion” section concludes this article.

Related works and background

The optimization problem considered in this article is formulated as follows

\begin{aligned} &\text{Find: } \vec{x}^{*} \in S \\ &\text{such that: } \forall \vec{x} \in S,\ f(\vec{x}^{*}) \leq f(\vec{x}) \end{aligned}

(1)

where $S = [b_{l 1}, b_{u 1}] \times [b_{l 2}, b_{u 2}] \times \dots \times [b_{l n}, b_{u n}] \subseteq \mathbb{R}^{n}$ is the bounded solution space, $\vec{x} \in S$ is the solution vector, and $f : S \to \mathbb{R}$ is the objective function.

This work is related to cooperative optimization and DE. In this section, we first review the prevailing cooperative optimization algorithms and then present the background with regard to DE.

CC

CC is performed by decomposing the high-dimensional problem into a set of small-scale subproblems and then optimizing each subproblem with a standard optimization metaheuristic. During the process, a context vector is constructed and updated by using a representative individual (e.g. the best individual) provided by each subcomponent. The cooperation takes place through the context vector.¹⁶ In the framework of CC, how to decompose the problem plays a crucial role.¹² Hereafter, we review the state-of-the-art cooperative optimization algorithms according to the different categories of decomposition strategies.
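The context-vector mechanism can be illustrated with a minimal sketch. It assumes a toy objective (`sphere`), a fixed grouping, and a deliberately simple perturbation search in place of a population-based subproblem optimizer. All names here are hypothetical and are not taken from the article.

```python
import random

def sphere(x):
    """Toy separable objective, used only as a stand-in for f."""
    return sum(v * v for v in x)

def optimize_group(f, context, group, bounds, rng, trials=50):
    """Search over the variables in `group`; all others stay at their context values."""
    best, best_val = list(context), f(context)
    for _ in range(trials):
        cand = list(best)
        for j in group:
            lo, hi = bounds[j]
            # Hypothetical subproblem optimizer: clipped Gaussian perturbation.
            cand[j] = min(hi, max(lo, cand[j] + rng.gauss(0.0, 0.1 * (hi - lo))))
        val = f(cand)
        if val < best_val:
            best, best_val = cand, val
    return best, best_val

def cooperative_coevolution(f, bounds, groups, rounds=20, seed=0):
    """Round-robin optimization of the subproblems through a shared context vector."""
    rng = random.Random(seed)
    context = [rng.uniform(lo, hi) for lo, hi in bounds]
    history = [f(context)]
    for _ in range(rounds):
        for group in groups:
            # The improved subcomponent is written back into the context vector.
            context, val = optimize_group(f, context, group, bounds, rng)
            history.append(val)
    return context, history

bounds = [(-5.0, 5.0)] * 6
groups = [[0, 1], [2, 3], [4, 5]]
context, history = cooperative_coevolution(sphere, bounds, groups)
```

Because each subproblem is evaluated with the remaining variables fixed at their context values, the recorded objective value never increases from one subproblem to the next.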

The algorithms belonging to the first category conduct decomposition without considering the variable interactions. This includes two original decomposition methods: the one-dimensional based and splitting-in-half strategies.^12,17 Some algorithms decompose the problem arbitrarily.^18,19 During optimization, the decomposition strategy is kept unchanged, and each subproblem is optimized using a round-robin strategy. Other algorithms decompose the problem dynamically.^3,20 During optimization, the decomposition strategy is changed adaptively, and each subproblem is optimized using a measured stagnation. In the preliminary work of this article, we proposed a simple circular sliding window strategy to decompose a large-scale problem into small-scale subproblems. However, it does not consider the variable interdependencies.²¹ It has been demonstrated that this category of algorithms is effective for separable problems, but the performance degenerates for problems with interacting variables.²²

The algorithms belonging to the second category explore variable interactions explicitly. In Sun et al.,¹⁴ a statistical model is proposed to quantify the degree of interaction among variables. With the interactions identified, the problem is optimized under a CC framework. In Omidvar et al.,¹⁵ a differential grouping algorithm is proposed for capturing the variable interactions. As differential grouping can only identify decision variables that interact directly, an extended differential grouping method is proposed to cater for different forms of interaction,²³ and a global differential grouping method is proposed to identify independent subproblems.²⁴ However, the process of differential discrimination requires intensive computational effort for fitness evaluations. Moreover, differential grouping is based on the assumption that the second-order derivatives of the problem do not oscillate violently across the feasible search space. The newly developed strategy DG2²⁵ reduces the burden to an acceptable level on problems with thousands of variables by increasing the utilization rate of the trial data. Yet, such attempts to reduce the resource requirements have not solved the problem of scalability. In Chen et al.,²⁶ an evolutionary learning algorithm is proposed to examine the interactions of each decision variable with the other decision variables in a pairwise fashion. The optimization is then conducted using a DE optimizer. The learning algorithm consumes up to 60% of the computational effort. The results show that a near-optimal decomposition is beneficial in solving large-scale global optimization problems with up to 1000 decision variables. Generally, the algorithms belonging to this category provide opportunities to solve nonseparable problems. However, an essential drawback is that they require intensive computational effort. More specifically, the expensive nature of the objective function in real-world problems and the difficult nature of large-scale nonseparable problems require the algorithm to avoid incurring extra computational burden.

The algorithms belonging to the third category explore variable interactions implicitly. Some algorithms capture the variable interactions by using a random grouping strategy. Representative works are EACC-G⁵ and MLCC.²⁷ The EACC-G uses a randomization method that divides the decision variables into several groups according to a predefined group size, with each group constituting a subproblem. For co-adaptation, weights are applied to the subproblems, and the subproblems and their weights are optimized separately. The MLCC applies several problem decomposers with different group sizes to construct a decomposer pool; the decomposers in the pool represent different interaction levels. The extended random grouping strategies perform random grouping more frequently and adopt self-adaptation of subproblem sizes.²⁸ Although the random grouping strategy achieves a measure of success, the probability of grouping interacting variables into one subproblem decreases as the scale of the problem increases. It has been shown that random grouping is ineffective when the number of interacting variables grows beyond five. Other algorithms capture the variable interactions by investigating the relationship between the objective functions and the candidate solutions.^29,30 During the optimization process, the decomposition and the optimization strategies are adapted according to the interactions obtained from the top 50% of solutions in the population. These algorithms might fail to capture the variable interactions effectively because they lack efficient guidelines and heuristics over the whole evolutionary process. They might also have limited scalability to large-scale problems. Therefore, it is desirable to design new heuristics that are capable of exploiting the structure of a problem to find a suitable decomposition without incurring extra computational burden.

DE algorithm

Under the CC framework, each subcomponent is optimized using a standard optimization metaheuristic. The widely used optimizers are inspired by natural phenomena, such as the genetic algorithm,^31,32 ant colony optimization,³³ DE,^34,35 particle swarm optimization,^36,37 and the memetic algorithm.³⁸ DE is a population-based stochastic algorithm that is easy to implement and efficient in performance. After the algorithm settings are initialized, DE repeatedly executes mutation, crossover, and selection until the termination criteria are satisfied. In each iteration, for each target individual, the algorithm first constructs a difference vector from two or more other individuals in the population. The difference vector is then multiplied by a scaling factor and added to a base individual to construct a mutant individual. Finally, the mutant individual and the target individual undergo crossover and selection operations to produce the next generation.

Population representation and initialization: The population is represented by $NP$ n-dimensional real-valued vectors, which encode the candidate solutions, that is, $pop = \{X_{i} \mid i = 1, 2, \dots, NP\}$, where $X_{i} = (x_{i} (1), x_{i} (2), \dots, x_{i} (n))$ denotes the ith individual in the population, $n$ is the dimension of the problem, and $NP$ is the population size. Generally, the population is initialized with random values in the search space.

Mutation: After initialization, mutation is applied to produce a mutant vector $V_{i, G}$ for each individual $X_{i, G}$. There are several mutation strategies. For example, the “DE/rand/1” mutation is formulated as follows

V_{i, G} = X_{r_{1}, G} + F \times (X_{r_{2}, G} - X_{r_{3}, G})

(2)

where $X_{r_{1}, G}$, $X_{r_{2}, G}$, and $X_{r_{3}, G}$ are mutually distinct individuals randomly taken from the population, and $F$ is the scaling factor in the interval $(0, 2]$.

Crossover: After mutation, crossover operation is applied to target vector $X_{i, G}$ and its corresponding mutant vector $V_{i, G}$ to generate a trial vector $U_{i, G}$ . In the basic version, the binomial crossover is employed

U_{i, G} (j) = \begin{cases} V_{i, G} (j), & \text{if } U (0, 1) < CR \text{ or } j = j_{rand} \\ X_{i, G} (j), & \text{otherwise} \end{cases}

(3)

where $U (0, 1)$ is a uniform random number in the range $(0, 1)$; $j_{rand}$ is a random integer between $1$ and $n$ that ensures that $U_{i, G}$ takes at least one component from $V_{i, G}$ and is therefore not identical to $X_{i, G}$; and $CR \in [0, 1]$ is the crossover probability, which controls how many decision variables the trial vector inherits from the mutant vector.

Selection: The selection determines whether $X_{i, G}$ or $U_{i, G}$ will survive to the next generation. The formulation is expressed as follows

X_{i, G + 1} = \begin{cases} U_{i, G}, & \text{if } f (U_{i, G}) < f (X_{i, G}) \\ X_{i, G}, & \text{otherwise} \end{cases}

(4)

where $f (\cdot)$ is the objective function, and $X_{i, G + 1}$ is the individual that replaces $X_{i, G}$ in the next generation.
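Equations (2) to (4) can be combined into a compact sketch of the classical “DE/rand/1” scheme with binomial crossover. This is an illustrative sketch, not the SaNSDE optimizer used later in the article. Clipping mutants to the bounds, the `sphere` test function, and the default parameter values are assumptions made for the example.

```python
import random

def de_rand_1_bin(f, bounds, NP=20, F=0.5, CR=0.9, generations=100, seed=0):
    """Classical DE/rand/1/bin minimizer following equations (2)-(4)."""
    rng = random.Random(seed)
    n = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(NP)]
    fit = [f(x) for x in pop]
    for _ in range(generations):
        new_pop, new_fit = [], []
        for i in range(NP):
            # Three mutually distinct individuals, all different from the target i.
            r1, r2, r3 = rng.sample([k for k in range(NP) if k != i], 3)
            # Mutation, equation (2).
            v = [pop[r1][j] + F * (pop[r2][j] - pop[r3][j]) for j in range(n)]
            # Assumption: clip the mutant to the bounds (not specified in the text).
            v = [min(hi, max(lo, vj)) for vj, (lo, hi) in zip(v, bounds)]
            # Binomial crossover, equation (3); index j_rand always comes from v.
            j_rand = rng.randrange(n)
            u = [v[j] if (rng.random() < CR or j == j_rand) else pop[i][j]
                 for j in range(n)]
            # Selection, equation (4).
            fu = f(u)
            if fu < fit[i]:
                new_pop.append(u)
                new_fit.append(fu)
            else:
                new_pop.append(pop[i])
                new_fit.append(fit[i])
        pop, fit = new_pop, new_fit
    best = min(range(NP), key=lambda i: fit[i])
    return pop[best], fit[best]

def sphere(x):
    """Toy objective used only for the demonstration."""
    return sum(v * v for v in x)

best_x, best_val = de_rand_1_bin(sphere, [(-5.0, 5.0)] * 5)
```

The next generation is built in a separate list, so every mutant in generation $G$ is constructed from generation-$G$ individuals only, matching the indexing in equation (4).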

Field search capability is an important indicator to evaluate an evolutionary algorithm.³⁹ DE is a competitive evolutionary optimizer for continuous search spaces. However, the performance of classical DE depends on the evolution strategies and their control parameters. It has been shown that the control parameters and learning strategies are highly problem-dependent.⁴⁰ Tuning the parameters manually for different problems is time-consuming. SaDE⁴¹ and SaNSDE (self-adaptive differential evolution with neighborhood search)⁴² select among multiple strategies adaptively, based on their success during previous generations, and adapt the values of $F$ and $CR$ using different schemes. Yang et al.⁴³ propose a generalized adaptive differential evolution (GaDE) based on a generalized parameter adaptation scheme. JADE⁴⁴ introduces a new mutation scheme that utilizes the best solution with an optional external archive, together with update schemes for adapting the control parameters. Self-adaptive DE techniques outperform the classical DE algorithms without adaptive control.⁴⁵ This article employs the SaNSDE optimizer to execute subproblem optimization.

CoDE-AG

CSC

In the controller, the window covers several adjacent variables. Only the variables in the window are optimized in each optimization process; the other variables are taken from the context vector and kept constant. After optimization, the context vector is updated with the optimized variables, and the window slides forward by $s$ variables (the sliding step) for the next optimization. This procedure implies that the context vector is dynamically updated as the subproblems are being optimized. The strategy focuses on the instant interaction and co-adaptation among subproblems. To tackle different problems, a pool of window sizes is designed. Each size in the pool implies an interaction level among the decision variables. At the beginning of each cycle, the optimization algorithm selects a window size from the pool based on the performance records of the sizes. Meanwhile, the variable subsets are identified as active, regular, and inactive regions. The active and regular regions should be assigned more computation budget; thus, the sliding speed is slow when the window covers them. On the other hand, the inactive regions should be assigned less computation budget; thus, the sliding speed is fast when the window covers them.

The details of the “circular sliding window” strategy are as follows. For an n-dimensional optimization problem, the variables are arranged in an end-to-end loop; the window starts from the beginning of the loop and slides $s$ (sliding step) variables after each optimization. These actions are repeated until the window reaches the ending position. The above process is called a cycle. The framework can be summarized as follows:

1. Set $i = 1$ to start a new cycle;

2. Initialize the starting position and select a window size from the window size pool;

3. Optimize the variables in the current window and record the numbers of successful optimizations $n_{s}$ and failed optimizations $n_{f}$;

4. Identify the activity of subregions and update the context vector;

5. If $i < n$, then $i = i + s$, and go back to step 3;

6. Record the best optimization results of the current cycle. Terminate the algorithm if the stop condition is satisfied; otherwise, go to step 7;

7. Re-calculate the selection probability for the selected window size;

8. Randomly permute the order of the decision variables in inactive regions and go to step 1 for a new evolutionary cycle.
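The index bookkeeping of one cycle can be sketched as follows. The sketch covers only how windows wrap around the loop and advance by the sliding step; optimization, activity tracking, and permutation are omitted. Indices are 0-based here, and `step_fn` is a hypothetical callback standing in for the adaptive sliding step.

```python
def window_indices(start, m, n):
    """Indices of the m adjacent variables starting at `start` on a loop of length n."""
    return [(start + k) % n for k in range(m)]

def cycle_windows(n, m, step_fn):
    """Yield the windows visited in one cycle (steps 3 to 5 of the framework)."""
    i = 0
    while i < n:
        idx = window_indices(i, m, n)
        yield idx
        # Hypothetical: step_fn returns the sliding step s for the current window.
        i += step_fn(idx)

# Example: 10 variables, window size 4, constant sliding step 2.
windows = list(cycle_windows(10, 4, lambda idx: 2))
```

With these settings consecutive windows overlap by two variables, and the last window wraps around the end of the loop to cover variables 8, 9, 0, and 1.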

Figure 1 shows an illustrative example of the “circular sliding controller.” The problem is regarded as an end-to-end n-dimensional vector $\tilde{X} = (\tilde{x}_{1}, \tilde{x}_{2}, \dots, \tilde{x}_{n})$. The order of the variables is permuted when they are located in inactive regions. A window contains $m$ (window size) variables $\{\tilde{x}_{i}, \tilde{x}_{i + 1}, \dots, \tilde{x}_{i + m - 1}\}$, and $\tilde{x}_{i + (m - 1) / 2}$ is the center of the window. The parameter $s$ determines the sliding speed. Concerning the circular sliding window strategy, the following two issues should be addressed:

How to determine the size of the window.

At the beginning, assign a candidate set of window sizes $W = \{w_{1}, w_{2}, \dots, w_{t}\}$, where each $w_{i}$ $(1 \leq i \leq t)$ represents a different window size.

Assign a parameter to each $w_{i} \in W$, expressed as a set $\{r_{w}^{1}, r_{w}^{2}, \dots, r_{w}^{t}\}$, where $r_{w}^{i}$ indicates the fitness of window size $w_{i}$ on the problem. Initially, $r_{w}^{i}$ is set to 1, and it is updated according to the following formula

r_{w}^{i} = \frac{| v - v' |}{| v |}

(5)

where $v$ is the best fitness value of the previous cycle, and $v'$ is the best fitness value of the current cycle.

Select the sliding window size of the next cycle using roulette strategy

p_{w}^{i} = \frac{10^{r_{w}^{i}}}{\sum_{j = 1}^{t} 10^{r_{w}^{j}}}

(6)

This strategy is inspired by the work of Yang et al.²⁷ It gives a window size with better performance a high probability of being selected while still giving the other window sizes opportunities to be selected, thus ensuring window size diversity.
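The window size selection in equations (5) and (6) can be sketched as follows. The function names are hypothetical, and returning 0 when the previous best fitness is 0 is an assumption, since the text does not address that case.

```python
import random

def update_window_fitness(v_prev, v_curr):
    """Equation (5): relative change of the best fitness between two cycles."""
    if v_prev == 0:
        return 0.0  # Assumption: the text does not cover v = 0.
    return abs(v_prev - v_curr) / abs(v_prev)

def selection_probabilities(r):
    """Equation (6): probabilities proportional to 10 ** r_w^i."""
    weights = [10.0 ** ri for ri in r]
    total = sum(weights)
    return [w / total for w in weights]

def select_window_size(sizes, r, rng=random):
    """Roulette-wheel selection of the window size for the next cycle."""
    return rng.choices(sizes, weights=selection_probabilities(r), k=1)[0]

sizes = [5, 10, 20, 50, 100]
r = [1.0] * len(sizes)                    # every r_w^i starts at 1
r[2] = update_window_fitness(10.0, 5.0)   # size 20 halved the best fitness
chosen = select_window_size(sizes, r, random.Random(0))
```

Because every weight $10^{r_{w}^{i}}$ is strictly positive, no window size is ever excluded from selection, which is how the scheme preserves diversity.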

How to identify the activity of subregions and determine the sliding step.

For each sliding window, that is, for each subregion $s_{i}$, record the number of successes $n_{s}^{i}$ and the number of failures $n_{f}^{i}$, and calculate the activity of the subregion using the following formula

r_{s}^{i} = \frac{n_{s}^{i}}{n_{s}^{i} + n_{f}^{i}}

(7)

Assign the activity to each variable in the current subregion, and accumulate the activity values for the variables covered by overlapping windows.

Partition the decision variables into three different regions according to their activity: active, regular, and inactive. The 30% of variables with the highest activity are denoted as active regions (A), the 30% of variables with the lowest activity are denoted as inactive regions (I), and the rest of the variables are denoted as regular regions (R).

When the window center covers the active region, the sliding step is taken as $\max \{1, \frac{m}{5}\}$; when the window center covers the regular region, the sliding step is taken as $\max \{1, \frac{2 m}{5}\}$; and when the window center covers the inactive region, the sliding step is taken as $\max \{1, \frac{3 m}{5}\}$.
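The activity bookkeeping and the region-dependent sliding step can be sketched as follows. Rounding the fractions $m/5$, $2m/5$, and $3m/5$ down to integers, breaking ties by index order, and returning 0 for a window with no recorded trials are assumptions. The function names are hypothetical.

```python
def subregion_activity(n_success, n_fail):
    """Equation (7): success ratio of the optimizations in one window."""
    total = n_success + n_fail
    return n_success / total if total else 0.0  # Assumption: 0 when nothing was tried.

def accumulate_activity(activity, window_idx, r):
    """Add the window's activity to every variable it covers (overlaps accumulate)."""
    for j in window_idx:
        activity[j] += r

def classify_regions(activity):
    """Label the top 30% of variables 'A', the bottom 30% 'I', and the rest 'R'."""
    n = len(activity)
    order = sorted(range(n), key=lambda j: activity[j], reverse=True)
    k = (3 * n) // 10
    labels = ["R"] * n
    for j in order[:k]:
        labels[j] = "A"
    for j in order[n - k:]:
        labels[j] = "I"
    return labels

def sliding_step(label, m):
    """Sliding step for a window of size m whose center lies in region `label`."""
    factor = {"A": 1, "R": 2, "I": 3}[label]
    return max(1, (factor * m) // 5)  # Assumption: fractions are rounded down.

activity = [0.9, 0.1, 0.5, 0.8, 0.2, 0.4, 0.7, 0.3, 0.6, 0.0]
labels = classify_regions(activity)
steps = [sliding_step(lab, 10) for lab in ("A", "R", "I")]
```

For a window of size 10 the three regions give steps of 2, 4, and 6, so a window in an active region advances three times more slowly than one in an inactive region.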

Figure 1. Illustrative example of the circular sliding controller.

Under the framework of CSC, the variables in inactive regions are randomly permuted because low activity suggests that their current adjacency relations may be inappropriate. Moreover, multiple individuals can be produced in this process of random permutation for the inactive regions. This method is a divide-and-conquer method, and the CSC is a grouping strategy. The main differences between this method and the methods that divide all variables into some subgroups and optimize each subgroup are summarized as follows:

The window covers a subset of variables, and only the variables in the window are optimized in each optimization process. Each variable is marked with an activity. As the window slides across the variable set, consecutive windows can overlap, so that interacting variables in active regions have higher probabilities of being optimized together. On the other hand, the variables in inactive regions are randomly permuted to reduce the probability of grouping unrelated variables together.

The entire decomposition process is conducted automatically. The sliding speed and the size of the window are adjusted dynamically, so that the computational effort is self-distributed according to the activities of the different regions and the performance of the previous cycle.

Grouping based on RM

In the CSC, the heuristics for selecting the window size, selecting the sliding step, and identifying the different regions are based on current evaluations and do not consider the variable interactions on the basis of long-term utility. This limits the performance of the algorithm.

In this section, an RM $R = (r_{ij})_{n \times n}$ is introduced to depict the interactions among all the variables. Its entry $r_{ij}$ denotes the interaction between variables $x_{i}$ and $x_{j}$. The matrix is updated while the circular sliding window is executed, and each successful evolution gives a fixed reward $(+ 1)$ to the corresponding variable relationships in the RM. After the circular sliding strategy has been executed for a given number of cycles, a nonoverlapping variable grouping is obtained according to the normalized RM. Similar to the process of circular sliding, the activities of these groups are identified individually. When the lowest activity among the RM-based subgroups falls below a given threshold $\theta$, the CSC strategy is executed again. The CSC and the RM-based grouping are thus carried out alternately. The advantage of this updating scheme is that it explores the variable interactions while the evolution process is going on and does not add extra computational burden to the algorithm.
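The reward update of the RM can be sketched as follows. The text does not state how the nonoverlapping grouping is derived from the normalized matrix, so `group_by_threshold` below is only one hypothetical rule (connected components above a cut-off `tau`) and should not be read as the authors' procedure. Normalizing by the largest entry is likewise an assumption.

```python
def reward_relation_matrix(R, window_idx, improved, reward=1.0):
    """Add a fixed reward to every pair of variables optimized in the same window."""
    if not improved:
        return
    for a in window_idx:
        for b in window_idx:
            if a != b:
                R[a][b] += reward

def normalize(R):
    """Assumption: scale all entries by the largest one so values lie in [0, 1]."""
    peak = max(max(row) for row in R)
    return [[v / peak if peak else 0.0 for v in row] for row in R]

def group_by_threshold(Rn, tau):
    """Hypothetical grouping: connected components of pairs with Rn[a][b] >= tau."""
    n = len(Rn)
    seen = [False] * n
    groups = []
    for start in range(n):
        if seen[start]:
            continue
        stack, comp = [start], []
        seen[start] = True
        while stack:
            a = stack.pop()
            comp.append(a)
            for b in range(n):
                if not seen[b] and Rn[a][b] >= tau:
                    seen[b] = True
                    stack.append(b)
        groups.append(sorted(comp))
    return groups

n = 4
R = [[0.0] * n for _ in range(n)]
reward_relation_matrix(R, [0, 1], improved=True)
reward_relation_matrix(R, [0, 1], improved=True)
reward_relation_matrix(R, [2, 3], improved=True)
reward_relation_matrix(R, [1, 2], improved=False)  # no reward without improvement
Rn = normalize(R)
groups = group_by_threshold(Rn, tau=0.4)
```

The update reuses the success information already produced by the sliding window, so it costs no additional function evaluations, which is the property the text emphasizes.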

Algorithm framework

In this subsection, we present the framework of the proposed CoDE-AG. The algorithm executes the CSC and the RM-based grouping alternately. When the circular sliding has been executed for a predefined number of cycles, the algorithm switches to the RM-based grouping. Conversely, when the activity of an RM-based subgroup becomes lower than a given threshold $\theta$, the algorithm switches back to the circular sliding strategy. The SaNSDE is applied to optimize the subproblems. During optimization, the RM and the context vector are updated simultaneously. The flowchart of CoDE-AG is shown in Figure 2.

Figure 2. Flowchart of cooperative differential evolution with utility-based adaptive grouping.

Complexity analysis

Consider an optimization problem with $n$ dimensions; the volume of the potential solution space is $\prod_{i = 1}^{n} (b_{ui} - b_{li})$, which increases exponentially as the number of decision variables increases. In the CoDE-AG, the optimization of the entire problem is achieved by iteratively optimizing the subproblem in the window using SaNSDE. Suppose the variables covered by the window form a variable set $S_{i}$; then the volume of the solution space searched by each SaNSDE run is $\prod_{j \in S_{i}} (b_{uj} - b_{lj})$. As the number of function evaluations (FEs) accounts for the computational effort, we use FEs to measure the complexity of the algorithm. Within the optimization process of each subproblem $S_{i}$, suppose the population size is $NP$ and the maximum number of iterations is $G$; then the complexity of the SaNSDE is $O (NP \times G)$. Suppose the minimum window size is $m$ and all the variables are identified as active; then the minimum sliding step of the window is $m / 5$. Consequently, the upper bound on the number of SaNSDE executions per cycle is $5 n / m$, and the complexity of the entire algorithm with respect to FEs is $O (NP \times G \times (n / m))$.
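As a worked instance of this bound, take the experimental settings reported later in this article ($n = 1000$, smallest window size $m = 5$ with $NP = 30$) and assume that every variable is labeled active. The number of iterations $G$ is left symbolic because the text does not state its value.

```latex
\begin{aligned}
s_{\min} &= \max\{1, m/5\} = \max\{1, 5/5\} = 1, \\
\text{SaNSDE runs per cycle} &\le \frac{n}{s_{\min}} = \frac{5n}{m} = \frac{5 \times 1000}{5} = 1000, \\
\text{FEs per cycle} &\le 1000 \times NP \times G = 1000 \times 30 \times G = 3 \times 10^{4}\, G .
\end{aligned}
```

Larger windows or less active regions give larger sliding steps and therefore fewer SaNSDE runs per cycle, so this case is the most expensive one under the stated assumptions.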

Experimental studies

Test case and experimental setup

To evaluate and analyze the performance of the proposed CoDE-AG, we perform experiments on the CEC2008 and CEC2010 benchmarks, which are designed to test the performance of algorithms on large-scale optimization problems. The benchmarks are intended to reflect the characteristics of various real-world optimization problems, so that algorithms that perform well on them can be expected to remain efficient in practical scenarios. The CEC2008 benchmark has seven functions: three of them ($f_{1}$, $f_{4}$, $f_{6}$) are separable and four of them ($f_{2}$, $f_{3}$, $f_{5}$, $f_{7}$) are nonseparable. Details of these functions are reported in Tang et al.⁴⁶ The CEC2010 benchmark has 20 functions: 3 of them ($f_{1}$ to $f_{3}$) are separable, 15 of them ($f_{4}$ to $f_{18}$) are partially separable, and 2 of them ($f_{19}$, $f_{20}$) are completely nonseparable. Details of these functions are reported in Tang et al.⁴⁷

In the experiments, the parameters are determined by considering both the quality of the results and the computational effort incurred. The candidate set of window sizes is taken as $W = \{5, 10, 20, 50, 100\}$. For the windows with sizes 5, 10, and 20, the population size is taken as $NP = 30$. For the windows with sizes 50 and 100, the population size is taken as $NP = 50$. The maximum number of cycles is taken as 8. The switch threshold $\theta$ from RM-based grouping to CSC is taken as $10^{- 2}$. The number of FEs is used to measure the computational effort. The maximum numbers of fitness evaluations are taken as $5.0 \times 10^{6}$ and $3.0 \times 10^{6}$ for CEC2008 and CEC2010, respectively. For the optimization operator SaNSDE, the initial scale factor and crossover rate are taken as $F = 0.5$ and $CR = 0.9$, respectively. To test the stability of the algorithm, the experiment on each function in each benchmark suite is repeated 25 times. The experimental results are presented following the instructions in the technical reports associated with the benchmark functions.

Simulated results and discussions

Results and discussions for CEC2008 benchmark

This subsection presents the results of the CoDE-AG for the CEC2008 benchmarks. The dimensionality of the functions is taken as 1000. The results are presented following the instructions reported in the literature associated with CEC2008.⁴⁶ For each function, the experiment is repeated 25 times, and the results after $5.0 \times 10^{4}$, $5.0 \times 10^{5}$, and $5.0 \times 10^{6}$ FEs are recorded. The mean value, the standard deviation (SD) of the 25 runs, and the 1st, 7th, 13th, 19th, and 25th values of the 25 runs are presented in Table 1. For functions $f_{1}$ to $f_{6}$, the values of $f (\vec{x}) - f (\vec{x}^{*})$ are presented. For function $f_{7}$, since the optimum value is unknown, the function values are presented directly. Figure 3 plots the evolution curves for functions $f_{1}$ to $f_{7}$. For each function, the curve is plotted using the results of the 25 runs. The abscissa is the number of FEs, and the vertical axis is the optimization error averaged over the 25 runs.
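The per-function statistics reported in Table 1 can be computed with a short sketch like the one below. It assumes that the 25 final values are ranked from best (smallest) to worst before the 1st, 7th, 13th, 19th, and 25th values are read off, and that SD is the sample standard deviation. Neither detail is stated explicitly in the text, and the function name is hypothetical.

```python
import statistics

def summarize_runs(values):
    """Order statistics, mean, and SD of 25 independent runs (minimization)."""
    if len(values) != 25:
        raise ValueError("expected the results of 25 runs")
    ordered = sorted(values)  # Assumption: best (smallest) value first.
    summary = {
        label: ordered[k - 1]
        for label, k in (("1st", 1), ("7th", 7), ("13th", 13), ("19th", 19), ("25th", 25))
    }
    summary["Mean"] = statistics.mean(ordered)
    summary["SD"] = statistics.stdev(ordered)  # Assumption: sample standard deviation.
    return summary

# Synthetic placeholder values, not results from the article.
demo = summarize_runs([float(k) for k in range(25, 0, -1)])
```

Under this ordering the 13th value is the median of the 25 runs, and the 1st and 25th values are the best and worst runs, respectively.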

Table 1. Results of the 25 independent runs for CEC2008 benchmark functions $(\dim = 1000)$.

FEs/functions		$f_{1}$	$f_{2}$	$f_{3}$	$f_{4}$	$f_{5}$	$f_{6}$	$f_{7}$
5.0E+04	1st	9.17E+01	8.98E+01	5.11E+05	4.87E+03	7.11E–00	1.33E+01	−1.23E+04
	7th	1.06E+02	9.22E+01	5.30E+05	5.25E+03	8.23E–00	1.79E+01	−1.19E+04
	13th	1.83E+02	9.51E+01	5.71E+05	6.12E+03	9.10E–00	1.98E+01	−1.17E+04
	19th	2.35E+02	1.02E+02	6.02E+05	6.69E+03	1.30E+01	2.12E+01	−9.79E+03
	25th	2.95E+02	1.27E+02	7.15E+05	7.04E+03	2.03E+01	2.37E+01	−8.89E+03
	Mean	1.85E+02	1.00E+02	5.80E+05	6.03E+03	1.12E+01	1.86E+01	−1.10E+04
	SD	6.97E+01	1.14E+01	5.86E+04	7.36E+02	3.98E–00	4.64E–00	1.16E+03
5.0E+05	1st	4.97E–08	5.72E+01	1.58E+03	5.13E+02	3.31E–07	5.62E–03	−1.39E+04
	7th	5.28E–08	5.94E+01	1.82E+03	5.36E+02	3.68E–07	8.75E–03	−1.38E+04
	13th	5.79E–08	6.18E+01	1.89E+03	5.89E+02	5.03E–07	1.22E–02	−1.38E+04
	19th	6.33E–08	6.60E+01	2.21E+03	6.25E+02	8.31E–07	2.53E–02	−1.33E+04
	25th	1.07E–07	6.93E+01	2.36E+03	7.30E+02	1.17E–06	4.16E–02	−1.33E+04
	Mean	6.24E–08	6.26E+01	1.97E+03	5.93E+02	6.18E–07	1.74E–02	−1.36E+04
	SD	1.47E–08	3.89E–00	2.47E+02	6.40E+01	2.54E–07	1.07E–02	2.31E+02
5.0E+06	1st	0.00E–00	1.49E–02	2.81E–00	3.26E–16	1.57E–18	8.76E–14	−1.60E+04
	7th	0.00E–00	2.66E–02	2.99E–00	4.17E–16	2.02E–18	1.11E–13	−1.52E+04
	13th	0.00E–00	3.14E–02	3.26E–00	4.44E–16	2.18E–18	2.27E–13	−1.47E+04
	19th	0.00E–00	3.57E–02	3.77E–00	7.29E–16	2.67E–18	3.91E–13	−1.47E+04
	25th	0.00E–00	5.09E–02	4.35E–00	1.16E–15	3.18E–18	4.54E–13	−1.40E+04
	Mean	0.00E–00	3.19E–02	3.41E–00	5.83E–16	2.31E–18	2.52E–13	−1.49E+04
	SD	0.00E–00	9.40E–03	4.78E–01	2.55E–16	4.49E–19	1.36E–13	5.44E+02

FEs: function evaluations; SD: standard deviation.

Figure 3. Plots of evolution curves of the CoDE-AG for CEC2008 benchmarks. The results were obtained from 25 independent runs of the algorithm: (a) f₁, dim = 1000; (b) f₂, dim = 1000; (c) f₃, dim = 1000; (d) f₄, dim = 1000; (e) f₅, dim = 1000; (f) f₆, dim = 1000; and (g) f₇, dim = 1000.

For the purpose of comparison, we first select the MTS⁴⁸ and the LSEDA-gl,⁴⁹ whose reported results are top-ranked in the CEC2008 competition. The MTS uses multiple agents to search the solution space concurrently. Each agent performs an iterated local search using one of three optimizers. It adopts a multi-optimizer strategy, in which the optimizer that best fits the landscape of a solution’s neighborhood is selected. The LSEDA-gl applies three strategies to the Estimation of Distribution Algorithm (EDA), that is, sampling under a mixed Gaussian and Levy probability distribution, standard deviation control, and restart. Then we select relatively new algorithms that yield high-quality results on the CEC2008 benchmarks, that is, the CSO⁵⁰ and the CCPSO2.⁵¹ The CSO introduces a pairwise competition mechanism, where the particle that loses the competition updates its position by learning from the winner. The CCPSO2 is a cooperative coevolving particle swarm optimization algorithm based on random grouping. Its position update rule relies on Cauchy and Gaussian distributions to sample new points. We compare the results of the CoDE-AG with the results reported for these algorithms in their original papers. We further select the LSHADE-cnEpSin,⁵² which is one of the winning DE variants in the CEC2017 competitions for moderate-scale optimization problems. The LSHADE-cnEpSin introduces two major operators to enhance the performance, that is, an ensemble of sinusoidal approaches based on performance adaptation and covariance matrix learning for the crossover. In the original paper of LSHADE-cnEpSin, only the results of 10-D, 30-D, 50-D, and 100-D problems of the benchmarks are reported. For comparison with our algorithm, we extend it to solve 1000-D problems.

Table 2 presents the results from the 25 runs of the CoDE-AG and the results of MTS, LSEDA-gl, CSO, CCPSO2, and LSHADE-cnEpSin. As recommended in the CEC2008 technical report, all the results are taken at $5.0 \times 10^{6}$ FEs and are averaged over the 25 runs. In the table, “Mean” represents the average optimization error of the 25 runs, and “SD” represents the standard deviation of the result. To illustrate the statistical differences between the CoDE-AG and the compared algorithms, the Friedman test and the Holm test are conducted. The results are presented in Table 3. The Friedman test results indicate that the differences among the six algorithms are statistically significant with 99.88% certainty. The CoDE-AG obtains the best overall rank. The Holm test results show that the differences of CoDE-AG with the MTS, LSEDA-gl, CSO, CCPSO2, and LSHADE-cnEpSin are statistically significant with 69.09%, 95.33%, 93.79%, 97.52%, and 99.99% certainty, respectively.

Table 2.

Comparison results of CoDE-AG and other algorithms for CEC2008 benchmark functions $(\dim = 1000)$.

Functions	CoDE-AG		MTS		LSEDA-gl		CSO		CCPSO2		LSHADE-cnEpSin
Functions	Mean	SD	Mean	SD	Mean	SD	Mean	SD	Mean	SD	Mean	SD
$f_{1}$	0.00E–00	0.00E–00	0.00E–00	0.00E–00	3.22E–13	2.84E–14	1.09E–21	4.20E–23	5.18E–13	9.61E–14	7.04E–09	9.96E–09
$f_{2}$	3.19E–02	9.40E–03	4.72E–02	8.58E–03	1.04E–05	5.11E–07	4.15E+01	9.74E–01	7.82E+01	4.25E+01	7.39E+01	9.12E–00
$f_{3}$	3.41E–00	4.78E–01	3.41E–04	1.38E–04	1.73E+03	1.40E+02	1.01E+03	3.02E+01	1.33E+03	2.63E+02	1.66E+03	1.35E+02
$f_{4}$	5.83E–16	2.55E–16	0.00E–00	0.00E–00	5.45E+02	1.80E+01	6.89E+02	3.10E+01	1.99E–01	4.06E–01	1.57E+03	9.66E+02
$f_{5}$	0.00E–00	0.00E–00	0.00E–00	0.00E–00	1.71E–13	0.00E–00	2.26E–16	2.18E–17	1.18E–03	3.27E–03	9.65E–02	1.31E–01
$f_{6}$	2.52E–13	1.36E–13	1.24E–11	4.78E–13	4.26E–13	0.00E–00	1.21E–12	2.64E–14	1.02E–12	1.68E–13	5.22E–00	4.56E–00
$f_{7}$	−1.49E+04	5.44E+02	−1.39E+04	2.94E+01	−1.35E+04	5.72E+01	−1.40E+04	3.37E+02	−1.43E+04	8.27E+01	−8.83E+03	5.65E+01

CoDE-AG: cooperative differential evolution with utility-based adaptive grouping; SD: standard deviation.

Note: The bold values are the best results obtained by these algorithms.

Table 3.

Results of the Friedman test and Holm test $(α = 0.05)$.

Friedman test				Holm test
Algorithm	Rank	$χ^{2}$	1 – p value	Algorithm	z	1 – p value
CoDE-AG	1.6429	20.000	99.88%	–	–	–
MTS	2.3571			CoDE-AG vs. MTS	0.7143	69.09%
LSEDA-gl	3.7143			CoDE-AG vs. LSEDA-gl	2.0714	95.33%
CSO	3.5714			CoDE-AG vs. CSO	1.9286	93.79%
CCPSO2	4.0000			CoDE-AG vs. CCPSO2	2.3571	97.52%
LSHADE-cnEpSin	5.7143			CoDE-AG vs. LSHADE-cnEpSin	4.0714	99.99%

CoDE-AG: cooperative differential evolution with utility-based adaptive grouping.
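The z column of Table 3 can be reproduced from the average ranks alone. The following is a minimal sketch, assuming the standard rank-based formulas for the Friedman statistic (without tie correction) and for the post hoc comparison against a control algorithm; the function names are illustrative and do not come from the article, and the sketch does not attempt to reproduce the reported "1 – p value" column.

```python
import math


def friedman_chi2(avg_ranks, n_problems):
    """Friedman chi-square computed from average ranks (no tie correction)."""
    k = len(avg_ranks)
    sum_sq = sum(r * r for r in avg_ranks)
    return 12.0 * n_problems / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4.0)


def z_against_control(avg_ranks, n_problems, control):
    """z statistic of each algorithm against the control, from average ranks."""
    k = len(avg_ranks)
    std_err = math.sqrt(k * (k + 1) / (6.0 * n_problems))
    base = avg_ranks[control]
    return {name: (rank - base) / std_err
            for name, rank in avg_ranks.items() if name != control}


# Average ranks copied from Table 3 (6 algorithms, 7 CEC2008 functions).
ranks_cec2008 = {
    "CoDE-AG": 1.6429,
    "MTS": 2.3571,
    "LSEDA-gl": 3.7143,
    "CSO": 3.5714,
    "CCPSO2": 4.0000,
    "LSHADE-cnEpSin": 5.7143,
}

chi2 = friedman_chi2(list(ranks_cec2008.values()), n_problems=7)
z_scores = z_against_control(ranks_cec2008, n_problems=7, control="CoDE-AG")

for name, z in z_scores.items():
    print(f"CoDE-AG vs. {name}: z = {z:.4f}")
print(f"Friedman chi-square = {chi2:.3f}")
```

With 6 algorithms and 7 functions the standard error is exactly 1, so each z value equals the rank difference to CoDE-AG, which agrees with the z column of Table 3 up to rounding. The chi-square from the rounded ranks comes out near 19.9, slightly below the reported 20.000; the gap may come from rank rounding or tie handling, which this sketch does not model.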

From Table 1, it can be seen that the CoDE-AG achieves consistently good results on all the functions with 1000 dimensions. From Figure 3, it can be seen that the algorithm improves the quality of the results steadily until it halts. From Table 2, it can be seen that the CoDE-AG achieves the best results on three functions ( $f_{1}$ , $f_{6}$ , $f_{7}$ ), and its results on $f_{2} - f_{5}$ rank second among the six algorithms. Table 3 shows that the CoDE-AG obtains the first overall rank, with the pairwise differences significant at varying confidence levels. The benchmark suite includes separable functions and nonseparable functions with different degrees of variable interdependency. The performance of the CoDE-AG indicates that the CSC can provide opportunities to implicitly capture the variable interdependencies and optimize the interdependent variables in one window.

Results and discussions for CEC2010 benchmark

This subsection presents the results of the CoDE-AG for CEC2010 benchmarks. The dimensionality of the functions is 1000. The results are presented following the instructions in the CEC2010 technical report.⁴⁷ For each function, the experiment is repeated 25 times, and the results after $1.2 \times 10^{5}$ , $6.0 \times 10^{5}$ , and $3.0 \times 10^{6}$ FEs are recorded. The mean, the SD, and the best, median, and worst values of the 25 runs are presented in Table 4. As an example, Figure 4 plots the evolution curves of the CoDE-AG for functions $f_{2}$ , $f_{5}$ , $f_{8}$ , $f_{10}$ , $f_{13}$ , $f_{15}$ , $f_{18}$ , and $f_{20}$ . Each curve is plotted from the results of the 25 runs. The abscissa is the number of FEs and the vertical axis is the average optimization error.
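The per-checkpoint statistics reported in Table 4 can be computed with a few lines. This is a minimal sketch, assuming the SD is the sample standard deviation (the article does not state which estimator is used); the error values below are hypothetical placeholders, not results from the article.

```python
import statistics


def summarize_runs(errors):
    """Best, median, worst, mean, and SD of the final errors of repeated runs."""
    return {
        "best": min(errors),
        "median": statistics.median(errors),
        "worst": max(errors),
        "mean": statistics.mean(errors),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "sd": statistics.stdev(errors),
    }


# Hypothetical optimization errors of five runs at one FE checkpoint.
example_errors = [1.0, 2.0, 3.0, 4.0, 5.0]
summary = summarize_runs(example_errors)
for key, value in summary.items():
    print(f"{key}: {value:.2E}")
```

In the article's setting the list would hold the 25 errors recorded for one function at one of the three FE checkpoints.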

Table 4.

Results of the 25 independent runs for CEC2010 benchmark functions $(\dim = 1000)$.

FEs/functions		$f_{1}$	$f_{2}$	$f_{3}$	$f_{4}$	$f_{5}$	$f_{6}$	$f_{7}$
1.2E+05	Best	3.21E+08	4.59E+03	7.39E–00	8.39E+12	1.87E+08	2.72E+04	7.87E+08
	Median	7.06E+08	5.12E+03	1.21E+01	2.51E+13	2.72E+08	3.80E+04	1.06E+09
	Worst	2.67E+09	8.13E+03	3.03E+01	4.19E+13	4.01E+08	5.36E+04	3.65E+09
	Mean	9.31E+08	5.58E+03	1.43E+01	2.51E+13	2.81E+08	3.77E+04	1.70E+09
	SD	4.70E+08	9.20E+02	6.68E–00	7.68E+12	5.79E+07	6.54E+03	9.47E+08
6.0E+05	Best	3.61E+01	6.77E+01	3.96E–02	1.79E+12	7.66E+07	8.19E+02	1.83E+06
	Median	2.15E+02	2.51E+02	5.72E–01	2.86E+12	9.52E+07	2.11E+03	2.65E+06
	Worst	8.11E+02	5.66E+02	8.83E–01	6.02E+12	2.36E+08	3.05E+03	3.95E+06
	Mean	2.80E+02	2.61E+02	4.67E–01	3.18E+12	1.22E+08	1.94E+03	2.56E+06
	SD	1.93E+02	1.39E+02	2.44E–01	1.18E+12	4.55E+07	6.62E+02	4.81E+05
3.0E+06	Best	5.37E–26	3.97E–06	1.32E–13	4.33E+11	2.38E+07	1.53E+02	8.02E+02
	Median	1.17E–25	2.26E–05	2.16E–13	8.94E+11	3.34E+07	3.73E+02	1.50E+03
	Worst	7.64E–25	7.78E–05	4.01E–13	2.10E+12	5.87E+07	6.08E+02	2.22E+03
	Mean	2.28E–25	3.08E–05	2.18E–13	8.80E+11	3.43E+07	3.46E+02	1.53E+03
	SD	1.83E–25	2.42E–05	6.03E–14	3.09E+11	7.89E+06	1.10E+02	4.55E+02
FEs/functions		$f_{8}$	$f_{9}$	$f_{10}$	$f_{11}$	$f_{12}$	$f_{13}$	$f_{14}$
1.2E+05	Best	3.65E+08	1.13E+09	8.06E+03	2.02E+02	6.15E+05	5.37E+06	1.82E+09
	Median	7.70E+08	2.58E+09	9.39E+03	2.06E+02	7.63E+05	6.25E+06	2.09E+09
	Worst	4.50E+09	4.81E+09	2.15E+04	2.14E+02	1.06E+06	9.54E+06	2.23E+09
	Mean	1.32E+09	2.62E+09	9.97E+03	2.06E+02	7.75E+05	6.69E+06	2.04E+09
	SD	9.83E+08	9.69E+08	2.52E+03	2.32E–00	1.13E+05	1.08E+06	1.09E+08
6.0E+05	Best	3.23E+07	1.82E+08	3.73E+03	6.01E+01	1.07E+05	3.18E+04	3.16E+08
	Median	5.11E+07	3.90E+08	4.41E+03	6.92E+01	1.51E+05	4.49E+04	4.11E+08
	Worst	9.82E+07	5.03E+08	7.17E+03	8.33E+01	2.55E+05	6.30E+04	7.08E+08
	Mean	5.44E+07	3.71E+08	4.61E+03	6.60E+01	1.64E+05	4.36E+04	4.47E+08
	SD	1.56E+07	8.51E+07	8.71E+02	1.31E+01	4.40E+04	7.07E+03	9.72E+07
3.0E+06	Best	2.58E+05	1.23E+07	1.05E+03	1.87E+01	5.58E+02	5.63E+02	1.77E+07
	Median	7.71E+05	2.08E+07	1.78E+03	2.29E+01	6.53E+02	6.75E+02	3.00E+07
	Worst	2.06E+06	4.31E+07	2.64E+03	3.21E+01	1.29E+03	9.72E+02	7.02E+07
	Mean	9.01E+05	2.21E+07	1.68E+03	2.41E+01	6.89E+02	6.83E+02	3.07E+07
	SD	5.43E+05	7.72E+06	3.47E+02	3.73E–00	1.55E+02	8.57E+01	1.05E+07
FEs/functions		$f_{15}$	$f_{16}$	$f_{17}$	$f_{18}$	$f_{19}$	$f_{20}$
1.2E+05	Best	7.02E+03	4.08E+02	1.83E+06	2.18E+07	4.14E+06	2.76E+09
	Median	8.21E+03	4.18E+02	2.07E+06	3.55E+07	5.58E+06	3.30E+09
	Worst	1.02E+04	4.27E+02	2.45E+06	6.31E+07	1.10E+07	5.05E+09
	Mean	8.37E+03	4.18E+02	2.11E+06	3.73E+07	5.64E+06	3.47E+09
	SD	8.48E+02	6.19E–00	1.71E+05	9.75E+06	1.52E+06	6.26E+08
6.0E+05	Best	5.21E+03	3.89E+02	1.32E+05	3.99E+04	1.21E+06	6.74E+05
	Median	6.88E+03	3.95E+02	1.59E+05	6.01E+04	2.10E+06	8.15E+05
	Worst	8.52E+03	4.08E+02	2.31E+05	7.08E+04	2.57E+06	1.08E+06
	Mean	6.84E+03	3.96E+02	1.61E+05	5.73E+04	1.94E+06	8.32E+05
	SD	9.12E+02	4.68E–00	2.31E+04	9.17E+03	4.69E+05	1.07E+05
3.0E+06	Best	9.35E+02	3.53E+02	1.23E+03	1.92E+03	6.58E+05	2.08E+02
	Median	1.76E+03	3.76E+02	2.16E+03	2.45E+03	7.32E+05	2.17E+02
	Worst	2.97E+03	3.80E+02	2.87E+03	2.94E+03	8.33E+05	2.85E+02
	Mean	1.65E+03	3.71E+02	2.06E+03	2.41E+03	7.33E+05	2.33E+02
	SD	4.27E+02	9.45E–00	4.59E+02	2.62E+02	4.47E+04	3.14E+01

FEs: function evaluations; SD: standard deviation.

Figure 4.

Plots of evolution curves of the CoDE-AG for CEC2010 benchmarks. The results were obtained from 25 independent runs of the algorithm: (a) f₂, dim = 1000; (b) f₅, dim = 1000; (c) f₈, dim = 1000; (d) f₁₀, dim = 1000; (e) f₁₃, dim = 1000; (f) f₁₅, dim = 1000; (g) f₁₈, dim = 1000; and (h) f₂₀, dim = 1000.

For the purpose of comparison, we first select MA-SW-Chains³⁸ and DMS-PSO-SHS,⁵³ whose reported results are top-ranked in the CEC2010 competition. The MA-SW-Chains assigns each individual a local search intensity, and several local search optimizers are applied to perform optimization tasks based on the intensity. The DMS-PSO-SHS divides the population into sub-swarms and then regroups them using regrouping strategies, so that information is exchanged among the particles of the whole swarm. Then we select relatively new algorithms that yield high-quality results on CEC2010 benchmarks, that is, the DECC-DG¹⁵ and the DECC-XDG.²³ The DECC-DG adopts a differential grouping strategy and uses DECC to perform CC. The DECC-XDG extends differential grouping to identify decision variables with indirect interactions and uses DECC to perform CC. We compare the results of the CoDE-AG with the results reported for these algorithms in their original papers. We further select the LSHADE-cnEpSin. For comparison with our algorithm, we extend it to solve 1000-D problems.

Table 5 presents the results from the 25 runs of the CoDE-AG and the results of MA-SW-Chains, DMS-PSO-SHS, DECC-DG, DECC-XDG, and LSHADE-cnEpSin. As recommended in the CEC2010 technical report, all the results are taken at $3.0 \times 10^{6}$ FEs and are averaged over the 25 runs. The item meanings of Table 5 are the same as those in Table 2. To illustrate the statistical differences between the CoDE-AG and the compared algorithms, the Friedman test and the Holm test are also conducted. The results are presented in Table 6. The Friedman test results indicate that the differences among the six algorithms are statistically significant with 99.98% certainty. The CoDE-AG obtains the best overall rank. When conducting pairwise comparison, the Holm test results show that the differences of CoDE-AG with the MA-SW-Chains, DMS-PSO-SHS, DECC-DG, DECC-XDG, and LSHADE-cnEpSin are statistically significant with 69.18%, 66.51%, 99.97%, 99.74%, and 97.57% certainty, respectively.

Table 5.

Comparison results of CoDE-AG and other algorithms for CEC2010 benchmark functions $(\dim = 1000)$.

Functions	CoDE-AG		MA-SW-Chains		DMS-PSO-SHS		DECC-DG		DECC-XDG		LSHADE-cnEpSin
	Mean	SD	Mean	SD	Mean	SD	Mean	SD	Mean	SD	Mean	SD
$f_{1}$	2.28E–25	1.83E–25	2.10E–14	1.99E–14	5.51E–15	4.00E–14	5.47E+03	2.02E+04	2.23E+04	8.01E+04	7.32E–12	4.75E–12
$f_{2}$	3.08E–05	2.42E–05	8.10E+02	5.88E+01	8.51E+01	2.06E+01	4.39E+03	1.97E+02	4.44E+03	1.64E+02	4.12E+03	2.29E+02
$f_{3}$	2.18E–13	6.03E–14	7.28E–13	3.40E–13	5.52E–11	3.25E–10	1.67E+01	3.34E–01	1.66E+01	4.12E–01	2.28E–00	2.11E–01
$f_{4}$	8.80E+11	3.09E+11	3.53E+11	3.12E+10	2.46E+11	3.31E+10	4.79E+12	1.44E+12	7.84E+11	1.67E+11	1.15E+11	4.77E+10
$f_{5}$	3.43E+07	7.89E+06	1.68E+08	1.04E+08	8.36E+07	6.10E+06	1.55E+08	2.17E+07	1.68E+08	1.74E+07	7.83E+07	6.10E+06
$f_{6}$	3.46E+02	1.10E+02	8.14E+04	2.84E+05	8.28E–02	9.96E–01	1.64E+01	2.71E–01	1.63E+01	3.28E–01	2.02E–00	8.66E–02
$f_{7}$	1.53E+03	4.55E+02	1.03E+02	8.70E+01	1.95E+03	1.56E+02	1.16E+04	7.41E+03	1.39E+03	2.61E+03	1.31E+12	0.00E–00
$f_{8}$	9.01E+05	5.43E+05	1.41E+07	3.68E+07	1.29E+07	1.91E+06	3.04E+07	2.11E+07	4.78E+05	1.32E+06	2.75E–04	2.08E–04
$f_{9}$	2.21E+07	7.72E+06	1.41E+07	1.15E+06	8.72E+06	6.51E+05	5.96E+07	8.18E+06	1.12E+08	1.13E+07	2.42E+07	1.26E+06
$f_{10}$	1.68E+03	3.47E+02	2.07E+03	1.44E+02	5.53E+03	5.18E+02	4.52E+03	1.42E+02	5.31E+03	1.55E+02	4.75E+03	8.67E+02
$f_{11}$	2.41E+01	3.73E–00	3.80E+01	7.35E–00	3.25E+01	3.00E–00	1.03E+01	3.00E–00	1.04E+01	1.15E–00	8.35E+01	4.46E+01
$f_{12}$	6.89E+02	1.55E+02	3.62E–06	5.92E–07	6.13E+02	6.00E+01	2.52E+03	4.86E+02	1.24E+04	2.23E+03	9.14E+03	3.30E+03
$f_{13}$	6.83E+02	8.57E+01	1.25E+03	5.72E+02	1.12E+03	1.05E+02	4.54E+06	2.13E+06	1.21E+03	2.25E+02	1.93E+04	1.25E+04
$f_{14}$	3.07E+07	1.05E+07	3.11E+07	1.93E+06	1.76E+07	1.55E+06	3.41E+08	2.41E+07	5.83E+08	4.11E+07	7.88E+07	2.79E+07
$f_{15}$	1.65E+03	4.27E+02	2.74E+03	1.22E+02	4.08E+03	2.17E+02	5.88E+03	1.03E+02	5.91E+03	7.56E+01	4.64E+03	8.94E+01
$f_{16}$	3.71E+02	9.45E–00	9.98E+01	1.40E+01	6.98E+01	4.24E–00	7.39E–13	5.70E–14	1.81E–08	1.57E–09	1.67E+02	7.65E+01
$f_{17}$	2.06E+03	4.59E+02	1.24E–00	1.25E–01	3.83E+03	4.15E+02	4.01E+04	2.85E+03	1.26E+05	7.47E+03	5.97E+04	1.21E+04
$f_{18}$	2.41E+03	2.62E+02	1.30E+03	4.36E+02	2.56E+03	1.16E+02	1.11E+10	2.04E+09	1.41E+03	1.88E+02	7.99E+03	6.37E+03
$f_{19}$	7.33E+05	4.47E+04	2.85E+05	1.78E+04	1.17E+06	1.06E+05	1.74E+06	9.54E+04	1.59E+06	4.96E+04	6.88E+05	2.66E+05
$f_{20}$	2.33E+02	3.14E+01	1.07E+03	7.29E+01	3.52E+02	4.02E+01	4.87E+07	2.27E+07	5.55E+05	1.75E+06	2.37E+03	1.16E+02

CoDE-AG: cooperative differential evolution with utility-based adaptive grouping; SD: standard deviation.

Note: The bold values are the best results obtained by these algorithms.

Table 6.

Results of the Friedman test and Holm test $(α = 0.05)$.

Friedman test				Holm test
Algorithm	Rank	$χ^{2}$	1 – p value	Algorithm	z	1 – p value
CoDE-AG	2.450	24.18	99.98%	–	–	–
MA-SW-Chains	2.875			CoDE-AG vs. MA-SW-Chains	0.7184	69.18%
DMS-PSO-SHS	2.800			CoDE-AG vs. DMS-PSO-SHS	0.5916	66.51%
DECC-DG	4.700			CoDE-AG vs. DECC-DG	3.8032	99.97%
DECC-XDG	4.3250			CoDE-AG vs. DECC-XDG	3.1693	99.74%
LSHADE-cnEpSin	3.8500			CoDE-AG vs. LSHADE-cnEpSin	2.3664	97.57%

CoDE-AG: cooperative differential evolution with utility-based adaptive grouping.

From Table 4, it can be seen that the CoDE-AG achieves consistently good results on $f_{1}$ , $f_{2}$ , $f_{3}$ , and $f_{11}$ . From Figure 4, for $f_{2}$ , $f_{8}$ , $f_{10}$ , and $f_{13}$ , it can be seen that the algorithm improves the quality of the results steadily until it halts. For $f_{5}$ , $f_{15}$ , $f_{18}$ , and $f_{20}$ , it can be seen that the algorithm is trapped in a pseudo-optimum and then escapes. From Table 5, it can be seen that the CoDE-AG achieves the best results on eight functions ( $f_{1}$ , $f_{2}$ , $f_{3}$ , $f_{5}$ , $f_{10}$ , $f_{13}$ , $f_{15}$ , and $f_{20}$ ). Table 6 shows that the CoDE-AG obtains the first overall rank, with the pairwise differences significant at varying confidence levels. The benchmark suite includes separable functions $f_{1} - f_{3}$ , partially separable functions $f_{4} - f_{18}$ , and completely nonseparable functions $f_{19}$ and $f_{20}$ . For partially separable and nonseparable functions, the CoDE-AG can become trapped in a pseudo-optimum because the circular sliding window strategy permutes the variables dynamically, so interacting variables might not be arranged in one group during the early iterations of the algorithm. However, as the algorithm proceeds, the interacting variables may be arranged in one group, which allows the algorithm to escape from the pseudo-optimum.
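The grouping effect described above can be illustrated with a toy model of a circular window. This is a simplified sketch, not the article's controller: it assumes a window that covers consecutive positions on a ring of variable indices, and it ignores the adaptive window size, sliding speed, and variable permutation. All names and values are hypothetical.

```python
def window_indices(start, size, dim):
    """Indices covered by a window of `size` starting at `start` on a ring of `dim` variables."""
    return [(start + offset) % dim for offset in range(size)]


def grouped_together(var_a, var_b, start, size, dim):
    """True if both variables fall inside the same window position."""
    covered = set(window_indices(start, size, dim))
    return var_a in covered and var_b in covered


# Toy setting: 10 variables, window of size 4, variables 8 and 1 interact.
DIM, SIZE = 10, 4
early = grouped_together(8, 1, start=0, size=SIZE, dim=DIM)  # window covers 0..3
later = grouped_together(8, 1, start=8, size=SIZE, dim=DIM)  # window wraps: 8, 9, 0, 1

print("grouped at start=0:", early)
print("grouped at start=8:", later)
```

In this toy case the two interacting variables are optimized separately while the window sits at position 0 and jointly once it reaches position 8, which mirrors the trapped-then-escaping behavior discussed for $f_{5}$ , $f_{15}$ , $f_{18}$ , and $f_{20}$ .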

Results for the RM-based grouping strategy

This subsection aims to show the performance of the RM-based grouping strategy. The experiments are conducted on CEC2010 benchmarks. For each function, the experiments are conducted with and without the RM-based grouping strategy. The experiments with the RM-based grouping strategy use three settings, with the maximum number of cycles taken as 5, 8, and 10, respectively. For each setting, the experiments are repeated 25 times. Table 7 presents the mean error and standard deviation of the 25 runs.

Table 7.

Results of experiments with different RM settings.

Functions	CoDE-AG (cycles = 5)		CoDE-AG (cycles = 8)		CoDE-AG (cycles = 10)		CoDE-AG (without RM)
	Mean	SD	Mean	SD	Mean	SD	Mean	SD
$f_{1}$	3.34E–25	1.77E–25	2.28E–25	1.83E–25	6.89E–25	2.12E–25	7.37E–25	3.52E–25
$f_{2}$	5.29E–05	3.02E–05	3.08E–05	2.42E–05	6.62E–06	2.75E–06	2.88E–06	1.07E–06
$f_{3}$	1.98E–13	3.83E–14	2.18E–13	6.03E–14	3.12E–12	2.98E–13	6.51E–13	7.13E–14
$f_{4}$	1.08E+12	5.10E+11	8.80E+11	3.09E+11	2.00E+12	5.72E+11	2.09E+12	6.61E+11
$f_{5}$	5.11E+07	1.30E+07	3.43E+07	7.89E+06	8.27E+07	2.06E+07	8.36E+07	1.21E+07
$f_{6}$	7.28E+02	2.51E+02	3.46E+02	1.10E+02	1.21E+03	5.66E+02	8.07E+04	1.17E+04
$f_{7}$	1.95E+03	4.96E+02	1.53E+03	4.55E+02	2.38E+03	4.78E+02	1.27E+04	4.33E+03
$f_{8}$	9.01E+05	4.87E+05	9.01E+05	5.43E+05	3.15E+06	7.32E+05	9.09E+05	2.27E+05
$f_{9}$	5.60E+07	1.73E+07	2.21E+07	7.72E+06	3.62E+07	8.91E+06	5.90E+07	7.10E+06
$f_{10}$	1.81E+03	4.15E+02	1.68E+03	3.47E+02	2.08E+03	4.76E+02	2.65E+03	6.22E+02
$f_{11}$	4.37E+01	9.16E–00	2.41E+01	3.73E–00	2.34E+01	3.77E–00	5.21E+01	1.00E+01
$f_{12}$	7.22E+02	2.01E+02	6.89E+02	1.55E+02	1.53E+03	8.21E+02	1.77E+03	5.96E+02
$f_{13}$	7.53E+02	1.13E+02	6.83E+02	8.57E+01	7.50E+02	1.52E+01	9.74E+02	2.18E+02
$f_{14}$	3.11E+07	8.98E+06	3.07E+07	1.05E+07	3.86E+07	1.23E+07	1.07E+08	6.36E+07
$f_{15}$	2.03E+03	4.25E+02	1.65E+03	4.27E+02	1.97E+03	5.04E+02	2.30E+03	5.23E+02
$f_{16}$	3.71E+02	9.61E–00	3.71E+02	9.45E–00	3.77E+02	1.06E–00	3.97E+02	1.06E+01
$f_{17}$	3.96E+03	1.11E+03	2.06E+03	4.59E+02	1.02E+04	8.75E+03	3.91E+04	4.80E+03
$f_{18}$	2.47E+03	4.04E+02	2.41E+03	2.62E+02	2.26E+03	5.12E+02	2.12E+03	4.31E+02
$f_{19}$	7.26E+05	4.19E+04	7.33E+05	4.47E+04	8.72E+05	2.66E+04	8.30E+05	3.17E+04
$f_{20}$	3.51E+02	5.08E+01	2.33E+02	3.14E+01	9.71E+02	7.96E+01	8.27E+02	7.35E+01

CoDE-AG: cooperative differential evolution with utility-based adaptive grouping; RM: relation matrix; SD: standard deviation.

Note: The bold values are the best results obtained by these algorithms.

It can be seen from Table 7 that for the separable functions $f_{1}$ , $f_{2}$ , and $f_{3}$ , and the nonseparable functions $f_{19}$ and $f_{20}$ , no significant difference can be found among the results. For the partially separable functions $f_{4} - f_{18}$ , the algorithm with the RM strategy yields better results. The results demonstrate that the RM strategy has a positive effect on decomposing a large-scale problem into small-scale subproblems. Moreover, among the RM settings, the algorithm yields the best overall performance when the maximum number of cycles is taken as 8; thus, we use this configuration in the other experiments of this article.

CoDE-AG for material optimization of dome structure

To validate the effectiveness of the proposed CoDE-AG for large-scale optimization problems in real-world scenarios, the CoDE-AG is applied to the material volume optimization of dome structures. In the example adopted in this article, the dome structure is composed of truss beams and some vertical pillars. It is a double-layer structure of two concentric spheres, with an inner spherical radius of $24 m$ and an outer spherical radius of $25 m$ . The distance from the vertex of the outer sphere to the horizontal section is $10 m$ , and the distance between the vertex of the inner sphere and the vertex of the outer sphere is $1 m$ . The projection radii of the outer sphere and the inner sphere on the horizontal plane are $20$ and $19.2 m$ , respectively. Each sphere is divided into eight layers from bottom to top. Each layer has 48 evenly distributed nodes, and each sphere has a vertex. Therefore, the structure has 770 joints and 2849 connecting rods. Although the lengths of these connecting rods are the same, they have 56 groups of different cross-sectional areas (53 groups with 48 connecting rods, 2 groups with 192 connecting rods, and 1 group with 49 connecting rods). The 48 nodes at the bottom of the dome inner layer and the 8 vertices forming the shape “⋇” of the outer layer are fixed nodes, and the other nodes possess 3 degrees of freedom. Deformation under stress is shown in Figure 5. The connecting rods of each section can be represented in different colors. The optimization objective is to obtain the design of connecting rod sections with the smallest total volume of connecting rod material under the condition that the structure can withstand a certain static load. This engineering problem can be transformed into an optimization problem of minimizing the total cross-sectional area of the 56 groups under static constraints.
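Two of the figures in the description above follow directly from the stated geometry and can be checked with a short calculation. This is a minimal sketch, assuming that each of the two spheres contributes eight layers of 48 nodes plus a single apex vertex, and that the outer projection radius is the base radius of a spherical cap; the function and constant names are illustrative.

```python
import math

LAYERS_PER_SPHERE = 8
NODES_PER_LAYER = 48
SPHERES = 2

# Assumption: one apex vertex per sphere in addition to the layer nodes.
nodes_per_sphere = LAYERS_PER_SPHERE * NODES_PER_LAYER + 1
total_joints = SPHERES * nodes_per_sphere


def cap_base_radius(sphere_radius, cap_height):
    """Base radius of a spherical cap of the given height."""
    center_to_base = sphere_radius - cap_height
    return math.sqrt(sphere_radius ** 2 - center_to_base ** 2)


outer_projection = cap_base_radius(sphere_radius=25.0, cap_height=10.0)

print("nodes per sphere:", nodes_per_sphere)
print("total joints:", total_joints)
print("outer projection radius (m):", outer_projection)
```

Under these assumptions the count gives 385 nodes per sphere and 770 joints in total, and the cap formula gives an outer projection radius of 20 m, in line with the values stated in the text.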

Figure 5.

The top view and flat view of dome after adding load and fixed nodes.

Assume that the specific design load is a layer of snow of thickness $0.5 m$ covering the dome. The average load on each node of the outer dome can be calculated from the surface area and the weight of the snow using the following formulas

S = 2 \times π \times R \times H

(8)

F = \frac{ρ \times T \times g \times S}{385}

(9)

where $S$ is the upper surface area of the dome, $R$ is the radius of the outer dome $(25 m)$ , $H$ is the height of the dome $(10 m)$ , $ρ$ is the density of snow $(87.67 kg / m^{3})$ , $T$ is the thickness of snow $(0.5 m)$ , $g$ is the acceleration of gravity $(9.8 N / kg)$ , and 385 is the number of nodes on the outer dome (8 layers of 48 nodes plus the vertex).
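Equations (8) and (9) can be evaluated directly with the stated parameter values. The following is a minimal sketch of that arithmetic; the variable and function names are illustrative, and the node count 385 is taken from the denominator of equation (9).

```python
import math

R = 25.0        # radius of the outer dome, m
H = 10.0        # height of the dome, m
RHO = 87.67     # density of snow, kg/m^3
T = 0.5         # thickness of snow, m
G = 9.8         # acceleration of gravity, N/kg
OUTER_NODES = 385  # divisor in equation (9)


def dome_surface_area(radius, height):
    """Equation (8): surface area of the spherical cap, S = 2 * pi * R * H."""
    return 2.0 * math.pi * radius * height


def load_per_node(density, thickness, gravity, area, nodes):
    """Equation (9): snow weight on the cap shared equally among the nodes."""
    return density * thickness * gravity * area / nodes


S = dome_surface_area(R, H)
F = load_per_node(RHO, T, G, S, OUTER_NODES)

print(f"S = {S:.2f} m^2")
print(f"F = {F:.2f} N per node")
```

With these values the surface area is about 1570.8 m² and the load on each outer node is about 1752.7 N.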

The objective is to optimize the distribution of the cross-sectional area of the materials under the condition that the stress does not exceed the maximum. The fitness function can be taken as follows

f = \frac{1}{\sum_{i = 1}^{N} A_{i}} + λ \sum_{i = 1}^{N} Q (F_{i} - θ)

(10)

where $N$ is the number of dome connecting rods $(N = 2849)$ , $λ$ is the penalty coefficient $(λ = 2)$ , $θ$ is the maximum stress that the connecting rod can support ( $θ = 2.35 \times 10^{9}$ N), and $A_{i}$ and $F_{i}$ are the cross-sectional area and stress of the $i$th connecting rod, respectively. $Q (F_{i} - θ)$ is defined as follows

Q (F_{i} - θ) = \begin{cases} e^{- \frac{{(F_{i} - θ)}^{2}}{2 σ^{2}}} & \text{when } F_{i} - θ \leq 0 \\ e^{- \frac{{(F_{i} - θ)}^{2}}{σ^{2}}} & \text{when } F_{i} - θ > 0 \end{cases}

(11)
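Equations (10) and (11) translate into a short function. This is a minimal sketch, assuming the stresses $F_{i}$ are already available (in the article they come from a finite element toolkit, which is not modeled here); the value of $σ$ is not given in the text, so `sigma` is a free parameter, and the numbers in the example are hypothetical.

```python
import math


def penalty_q(stress, theta, sigma):
    """Equation (11): Gaussian-shaped term that decays faster when stress exceeds theta."""
    diff = stress - theta
    denominator = 2.0 * sigma ** 2 if diff <= 0 else sigma ** 2
    return math.exp(-(diff ** 2) / denominator)


def fitness(areas, stresses, theta, lam, sigma):
    """Equation (10): inverse total cross-sectional area plus weighted stress terms."""
    area_term = 1.0 / sum(areas)
    stress_term = sum(penalty_q(s, theta, sigma) for s in stresses)
    return area_term + lam * stress_term


# Hypothetical two-rod example; only lam = 2 and theta = 2.35e9 come from the text.
THETA = 2.35e9
LAM = 2.0
SIGMA = 1.0e8  # assumption: sigma is not specified in the article

example = fitness(areas=[1.0, 1.0], stresses=[THETA, THETA],
                  theta=THETA, lam=LAM, sigma=SIGMA)
print(f"fitness at the stress limit: {example:.3f}")
```

Because the exponent uses $σ^{2}$ rather than $2 σ^{2}$ when $F_{i} - θ > 0$ , a rod that exceeds the limit by some amount contributes less to the fitness than a rod that stays below the limit by the same amount.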

Given the cross-sectional area of each material group, the maximum stress of the dome structure can be obtained from the displacement of the connecting rods using the finite element toolkit. The maximum number of fitness evaluations is taken as $10^{4}$ . The other parameters are the same as those in the “Test case and experimental setup” section. Table 8 presents the results from the 25 runs of the CoDE-AG and the results of CSO,⁵⁰ LSEDA-gl,⁴⁹ CCPSO2,⁵¹ and DMS-PSO-SHS.⁵³ In the table, “MeanFit” represents the average of the optimal fitness values over the 25 runs, and “SD” represents the standard deviation of the result. “Total-MaterVol” is the total volume of materials for the optimal scheme and “Max-DeforDist” is the maximum deformation distance of the connecting rod in this case. It can be seen that the proposed CoDE-AG obtains the best fitness value and the minimum total volume of materials compared with the other algorithms.

Table 8.

Comparison results on material optimization of dome structure.

Algorithm	MeanFit	SD	Total-MaterVol (m³)	Max-DeforDist (mm)
CSO	987.07	29.59	2.62	5.49
LSEDA-gl	913.26	53.18	2.96	5.75
CCPSO2	832.6	38.20	3.51	4.93
DMS-PSO-SHS	1013.44	42.76	2.27	6.71
CoDE-AG	1089.32	36.17	2.01	6.63

SD: standard deviation; Total-MaterVol: total volume of materials for the optimal scheme; Max-DeforDist: maximum deformation distance of the connecting rod; CoDE-AG: cooperative differential evolution with utility-based adaptive grouping.

Note: The bold values are the best results obtained by these algorithms.

Discussions

The results of the CoDE-AG can be summarized as follows:

On the CEC2008 and CEC2010 benchmarks, the Friedman test results indicate that the CoDE-AG obtains the best overall rank.

On the CEC2008 and CEC2010 benchmarks, the Holm test results indicate that the CoDE-AG yields better results than the compared algorithms, with the differences significant at varying confidence levels.

The RM strategy is effective in capturing the variable interactions without incurring extra computational burden, thus helping the algorithm to decompose a large-scale problem into small-scale subproblems.

The proposed CoDE-AG is effective for the practical engineering application problems, such as material optimization of dome structure.

These results can be attributed to the way the two mechanisms complement each other. The CSC considers the variable interactions on the basis of short-term utility, and the RM-based grouping strategy considers the variable interactions on the basis of long-term utility. Thus, the interacting variables have a high chance of being optimized in the same subproblem, which facilitates the divide-and-conquer of CC optimization. The experiments here illustrate the usefulness of the proposed algorithm and provide a guideline for researchers when designing related algorithms.

Conclusion

To solve optimization problems, especially large-scale optimization problems, it is essential to decompose the problem into small-scale subproblems and to optimize the subproblems cooperatively. For problem decomposition, a CSC and an RM-based grouping strategy are proposed. In the controller, the circular sliding window slides across different regions of the variable blocks and provides baselines for the subproblem optimizer. Moreover, the sliding speed and the size of the window are adaptively adjusted according to the performance of the latest optimization cycle and the activeness of different regions of the variable blocks. The RM strategy groups the variables and provides opportunities for interacting variables to be grouped adjacently. The two proposed strategies are executed alternately so that both the short-term and long-term utilities of the variable interactions can be exploited. The contribution is that the problem is decomposed with regard to the variable interactions without incurring extra computational burden. The performance of the proposed algorithm is tested on the CEC2008 and CEC2010 benchmarks and on a practical scenario. Simulation results show that the proposed algorithm performs well on large-scale optimization problems, especially when the problem is partially separable.

Footnotes

Handling Editor: Wanli Li

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors are grateful to the support of the National Natural Science Foundation of China (61572104, 61103146), the Fundamental Research Funds for the Central Universities (DUT17JC04), and the Project of the Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University (93K172017K03).

ORCID iD

Hongwei Ge

References

Mandavi

EShiri

Rahnamayan

. Metaheuristics in large-scale global continues optimization: a survey. Inform Sciences 2015; 295: 407–428.

Regis

. Evolutionary programming for high-dimensional constrained expensive black-box optimization using radial basis functions. IEEE T Evolut Comput 2014; 18: 326–347.

Van den Bergh

Engelbrecht

. A cooperative approach to particle swarm optimization. IEEE T Evolut Comput 2004; 8: 225–239.

Gong

Luo

et al . A multiobjective cooperative coevolutionary algorithm for hyperspectral sparse unmixing. IEEE T Evolut Comput 2017; 21: 234–248.

Yang

Tang

Yao

. Large scale evolutionary optimization using cooperative coevolution. Inform Sciences 2008; 178: 2985–2999.

Trunfio

Topa

Was

. A new algorithm for adapting the configuration of subcomponents in large-scale optimization with cooperative coevolution. Inform Sciences 2016; 372: 773–795.

Mei

Yao

. Cooperative coevolution with route distance grouping for large-scale capacitated arc routing problems. IEEE T Evolut Comput 2014; 18: 435–449.

Chao

Wang

Han

. Cooperative coevolution for large-scale optimization based on kernel fuzzy clustering and variable trust region methods. IEEE T Fuzzy Syst 2014; 22: 829–839.

Wei

Wang

Zong

. A novel cooperative coevolution for large scale global optimization. In: Proceedings of the IEEE international conference on systems, man, and cybernetics, San Diego, CA, 5–8 October 2014, pp.738–741. New York: IEEE.

10.

Chandra

Frean

Zhang

. On the issue of separability for problem decomposition in cooperative neuro-evolution. Neurocomputing 2012; 87: 33–40.

11.

Omidvar

Mei

. Effective decomposition of large-scale separable continuous functions for cooperative co-evolutionary algorithms. In: Proceedings of the IEEE congress on evolutionary computation, Beijing, China, 6–11 July 2014, pp.1305–1312. New York: IEEE.

12.

Potter

Jong

. Cooperative coevolution: an architecture for evolving coadapted subcomponents. Evol Comput 2000; 8: 1–29.

13.

Zamuda

Brest

Boskovic

et al . Large scale global optimization using differential evolution with self-adaptation and cooperative co-evolution. In: Proceedings of the IEEE congress on evolutionary computation, Hong Kong, China, 1–6 June 2008, pp.3718–3725. New York: IEEE.

14.

Sun

Yoshida

Cheng

et al . A cooperative particle swarm optimizer with statistical variable interdependence learning. Inform Sciences 2012; 186: 20–39.

15.

Omidvar

Mei

et al . Cooperative coevolution with differential grouping for large scale optimization. IEEE T Evolut Comput 2014; 18: 378–393.

16.

Sun

Yang

et al . Cooperative differential evolution with fast variable interdependence learning and cross-cluster mutation. Appl Soft Comput 2015; 36: 300–314.

17.

Potter

de Jong

. A cooperative coevolutionary approach to function optimization. In: Proceedings of the third conference on parallel problem solving from nature, Jerusalem, Israel, 9–14 October 1994, pp.249–257. Berlin; Heidelberg: Springer.

18.

Liu

Yao

Zhao

et al . Scaling up fast evolutionary programming with cooperative coevolution. In: Proceedings of the congress on evolutionary computation, Seoul, South Korea, 27–30 May 2001, pp.1101–1108. New York: IEEE.

19.

Shi

Teng

. Cooperative co-evolutionary differential evolution for function optimization. In: Proceedings of the advances in natural computation, Changsha, China, 27–29 August 2005, pp.1080–1088. Berlin; Heidelberg: Springer.

20.

Sofge

Jong

Schultz

. A blended population approach to cooperative coevolution for decomposition of complex problems. In: Proceedings of the congress on evolutionary computation, Honolulu, HI, 12–17 May 2002, pp.413–418. New York: IEEE.

21.

Sun

Yang

. Adaptive hybrid differential evolution with circular sliding window for large scale optimization. In: Proceedings of the 12th international conference on natural computation, fuzzy systems and knowledge discovery (ICNC-FSKD), Changsha, China, 13–15 August 2016, pp.87–94. New York: IEEE.

22.

Wiegand

. An analysis of cooperative coevolutionary algorithm. PhD Dissertation, George Mason University, Fairfax, VA, 2004.

23.

Sun

Kirley

Halgamuge

. Extended differential grouping for large scale global optimization with direct and indirect variable interactions. In: Proceedings of the genetic and evolutionary computation conference, Madrid, 11–15 July 2015, pp.313–320. New York: ACM.

24.

Mei

Omidvar

et al . A competitive divide-and-conquer algorithm for unconstrained large-scale black-box optimization. ACM T Math Software 2016; 42: 131–134.

25.

Omidvar

Yang

Mei

et al . DG2: a faster and more accurate differential grouping for large-scale black-box optimization. IEEE T Evolut Comput 2017; 21: 929–942.

26.

Chen

Weise

Yang

et al . Large-scale global optimization using cooperative coevolution with variable interaction learning. In: Proceedings of the 11th international conference on parallel problem solving from nature, Kraków, 11–15 September 2010, pp.300–309. Berlin; Heidelberg: Springer.

27.

Yang

Tang

Yao

. Multilevel cooperative coevolution for large scale optimization. In: Proceedings of the congress on evolutionary computation, Hong Kong, China, 1–6 June 2008, pp.1663–1670. New York: IEEE.

28. Omidvar, Yang, et al. Cooperative co-evolution for large scale optimization through more frequent random grouping. In: Proceedings of the congress on evolutionary computation, Barcelona, 18–23 July 2010, pp.1754–1761. New York: IEEE.

29. Ray, Yao. A cooperative coevolutionary algorithm with correlation based adaptive variable partitioning. In: Proceedings of the congress on evolutionary computation, Trondheim, 18–21 May 2009, pp.983–989. New York: IEEE.

30. Yang, Tang, Yao. Turning high-dimensional optimization into computationally expensive optimization. IEEE T Evolut Comput 2018; 22: 143–156.

31. Gong, Sun, Miao. A set-based genetic algorithm for interval many-objective optimization problems. IEEE T Evolut Comput 2018; 22: 47–60.

32. Hou, Zhou, et al. Pareto-optimization for scheduling of crude oil operations in refinery via genetic algorithm. IEEE T Syst Man Cy-S 2017; 47: 517–530.

33. Juang, Hung, Hsu. Rule-based cooperative continuous ant colony optimization to improve the accuracy of fuzzy system design. IEEE T Fuzzy Syst 2014; 22: 723–735.

34. Das, Mullick, Suganthan. Recent advances in differential evolution – an updated survey. Swarm Evol Comput 2016; 27: 1–30.

35. Shen, et al. Ensemble of differential evolution variants. Inform Sciences 2018; 423: 172–186.

36. Nouiri, Bekrar, Jemai, et al. An effective and distributed particle swarm optimization algorithm for flexible job-shop scheduling problem. J Intell Manuf 2018; 29: 603–615.

37. Sun, Tan, et al. Cooperative hierarchical PSO with two stage variable interaction reconstruction for large scale optimization. IEEE T Cybernetics 2017; 47: 2809–2823.

38. Molina, Lozano, Herrera. MA-SW-Chains: memetic algorithm based on local search chains for large scale continuous global optimization. In: Proceedings of the congress on evolutionary computation, Barcelona, 18–23 July 2010, pp.3153–3160. New York: IEEE.

39. Yao, Liu, Lin. Evolutionary programming made faster. IEEE T Evolut Comput 1999; 3: 82–102.

40. Penunuri, Cab, Carvente, et al. A study of the classical differential evolution control parameters. Swarm Evol Comput 2016; 26: 86–96.

41. Qin, Huang, Suganthan. Differential evolution algorithm with strategy adaptation for global numerical optimization. IEEE T Evolut Comput 2009; 13: 398–417.

42. Yang, Tang, Yao. Self-adaptive differential evolution with neighborhood search. In: Proceedings of the congress on evolutionary computation, Hong Kong, China, 1–6 June 2008, pp.1110–1116. New York: IEEE.

43. Yang, Tang, Yao. Scalability of generalized adaptive differential evolution for large-scale continuous optimization. Soft Comput 2011; 15: 2141–2155.

44. Zhang, Sanderson. Jade: adaptive differential evolution with optional external archive. IEEE T Evolut Comput 2009; 13: 945–958.

45. Islam, Das, Ghosh, et al. An adaptive differential evolution algorithm with novel mutation and crossover strategies for global numerical optimization. IEEE T Syst Man Cy B 2012; 42: 482–500.

46. Tang, Yao, Suganthan, et al. Benchmark functions for the CEC’2008 special session and competition on large scale global optimization. Technical report, Nature Inspired Computation and Applications Laboratory, University of Science and Technology of China, Hefei, China, November 2007, pp.1–18.

47. Tang, Suganthan, et al. Benchmark functions for the CEC’2010 special session and competition on large scale global optimization. Technical report, Nature Inspired Computation and Applications Laboratory, University of Science and Technology of China, Hefei, China, November 2009, pp.1–24.

48. Tseng, Chen. Multiple trajectory search for large scale global optimization. In: Proceedings of the congress on evolutionary computation, Hong Kong, China, 1–6 June 2008, pp.3052–3059. New York: IEEE.

49. Wang. A restart univariate Estimation of Distribution Algorithm: sampling under mixed Gaussian and Levy probability distribution. In: Proceedings of the congress on evolutionary computation, Hong Kong, China, 1–6 June 2008, pp.3917–3924. New York: IEEE.

50. Cheng, Jin. A competitive swarm optimizer for large scale optimization. IEEE T Cybernetics 2015; 45: 191–204.

51. Yao. Cooperatively coevolving particle swarms for large scale optimization. IEEE T Evolut Comput 2012; 16: 210–224.

52. Awad, Ali, Suganthan. Ensemble sinusoidal differential covariance matrix adaptation with Euclidean neighborhood for solving CEC2017 benchmark problems. In: Proceedings of the IEEE congress on evolutionary computation, San Sebastian, 5–8 June 2017, pp.372–379. New York: IEEE.

53. Zhao, Suganthan, Das. Dynamic multi-swarm particle swarm optimizer with subregional harmony search. In: Proceedings of the IEEE congress on evolutionary computation, Barcelona, 18–23 July 2010, pp.1983–1990. New York: IEEE.

