Abstract
The problem of QoS-aware multiobjective optimization is an important issue for Web services selection in distributed computing environment. In this paper, a novel algorithm called MOASS (multiobjective optimization algorithm for web service selection) is proposed through analyzing the genetic operators such as constraint handling, the initial population generation, fitness assignment, and diversity preservation. Compared with MOEAWP (Yu et al., 2007), simulation results show that the feasible objective region can be filled uniformly with the optimal solutions obtained by MOASS under different test applications. In the case of higher constraints especially, MOASS can obtain more high-quality and evenly distributed nondominated solutions than MOEAWP.
1. Introduction
Service-oriented architecture (SOA) has been a major software framework for distributed applications such as e-business and enterprise systems. These distributed applications, that is, composition services, can be dynamically and flexibly composed by integrating existing component services developed independently [1]. Usually, service providers can offer some different service versions with similar functionality and different QoS (Quality of Service) levels. QoS for Web services refer to various nonfunctional characteristics such as response time, execution price, availability, and reliability [2]. It is the way to select the best services to compose an efficient and complex service with QoS assurance that matters.
QoS-oriented service selection is essentially a multiobjective optimization problem, and it is usually an NP-hard problem. In the present study, multiple objectives are usually combined into a scalar objective function via a predetermined aggregating function for optimization. Wen et al. [3] presented a QoS objective function with high performance/cost ratio. Zeng et al. [4] transformed the five QoS attributes into a scalar linear weighted sum objective function; the above approach is also found in the literatures [5–7]. Although the method is simple to implement with low time complexity, it can only obtain a particular optimum point on the tradeoff surface. This drawback of such an approach is that the weights are difficult to determine precisely, especially when there is insufficient information or knowledge concerning the optimization problem.
Multiobjective optimization problems involve simultaneous optimization of several incommensurable and often competing objectives. Often, there is no single optimal solution, but rather a set of alternative solutions. They are known as Pareto-optimal solutions. Multiobjective evolutionary algorithms seem to be particularly suited for solving this kind of optimization problems [8]. Based on multiobjective evolutionary algorithm, several Web service selection algorithms are developed [9–11]. Yu et al. [12] proposed a workflow execution planning approach using multiobjective evolutionary algorithm (MOEAWP), which developed on basis of NSGAII [13]. It implemented simultaneously to optimize the execution time and service price while meeting time and cost constraints. This class of methods can generate a set of nondominated solutions in single run and it is easy for users to select the most suitable solution for the current issue.
Following the guideline of literature [12], we proposed a new Pareto-based multiobjective optimization algorithm for Web service selection (MOASS), which implements the simultaneous optimization of execution time and cost while meeting deadline and budget constraints. Experimental results show that the population can converge and uniformly distribute in the Pareto front.
The rest of this paper is organized as follows. Section 2 defines the optimization problem for Web service selection, the constraints definition, and the problem assumptions. Section 3 describes the proposed algorithm MOASS for Web service selection. Section 4 shows the performance evaluation and comparison of two algorithms. The paper is concluded in Section 5.
2. Problem Description
Following the literature [12], a Web service composition application considered in this paper can be described as a workflow application, which is modeled as DAG (directed acyclic graph). Let G = {V, E} denote the DAG graph, where node set V = {1, 2,…, n} denotes the set of tasks (or activities), and directed arc (edge) set E represents intertask data or control dependencies. Task 1 and n are dummy single-start and single-terminal nodes. Each arc (i, j) ∈ E represents the precedence constraint from i to j; that is, task j cannot start before i has finished.
For any activity i in graph G, there are many candidate services which can fulfill i. All candidate services for i are called the service pool of activity i, denoted as SP(i). Let l(i) denote the service pool length; that is, l(i) = |SP(i)|. In the same pool, different services have different processing times and costs. Generally, the faster a service executes, the more a user should pay [12]. Let s ij be the jth service name for conducting activity i, t ij the processing time on service s ij , and c ij the cost for executing i on service s ij . Thus, 3-tuple (s ij , t ij , c ij ,) denotes the jth service that can fulfill activity i. The service pool of activity i can be expressed as SP(i) = {(s ij , t ij , c ij ,) ∣ 1 ≤ j ≤ l(i)}. The service pool of dummy activities is empty.
For any application G = (V, E), one of its solutions can be obtained if each task allocates a service instance using some selection policy. Let S be the solution space of G. For ∀x ∈ S, the required execution time is denoted as T(x), and the required total execution cost is denoted as C(x). Let f
i
(1 ≤ i ≤ n) be the completion time of task i; then T(x) = f
n
. Let z
ik
be 0–1 variable. It means that its value is 1 when node i chooses service level k to perform it and 0, otherwise; then
Let D be the given time constraint (deadline), which is also the latest completion time of task n. Let B be the given cost constraint (budget), which is the upper limit of the required total cost. The objective of this problem is to select an appropriate service level for each task so that the duration and the total cost can be simultaneously minimized while meeting the given deadline and budget constraints.
Assume that the number of service providers is large enough to accommodate the number of tasks that need to be run simultaneously at any given time, so that the waiting time for an available server is negligible. Thus, each service can provide the expected execution time of any task at any service level. Users can choose properly service levels for these tasks. The constrained multiobjective optimization problem can be formulated as follows:
The objective function (1) corresponds to minimizing the duration and the total cost. Equation (2) ensures that exactly only one service should be chosen for each task. Constraint (3) meets precedence constraints. Constraint (4) meets the given deadline constraints. Constraint (5) guarantees that the required total cost is less than or equal to budget B. Constraint (6) guarantees that z ik is 0–1 variable.
3. MOASS Algorithm
A new approach, called MOASS (multiobjective optimization algorithm for web service selection) is proposed in this paper. MOASS uses a mixture of established and new techniques in order to find multiple Pareto-optimal solutions in parallel.
The flow of the MOASS algorithm is as follows.
Step (1). Handle constraints based on penalty functions.
Step (2). Encode the problem and determine the population size.
Step (3). Generate an initial population P and create the empty external nondominated set P′.
Step (4). Copy nondominated members of P to P′.
Step (5). Operate crossover and mutation on population P.
Step (6). Update population P′ and preserve its diversity.
Step (7). Calculate the fitness of each individual in P as well as in P′.
Step (8). Select individuals from populations P and P′ (multiset union), until the mating pool is filled.
Step (9). If the maximum number of generations is reached, then stop; otherwise go to Step (5).
In MOASS algorithm, every chromosome is encoded by a permutation of integer; each integer represents a selected service instance for executing its corresponding task. We apply two-point crossover and one-point mutation operators as usual. In Step (8), binary tournament selection with replacement is used. In the next four subsections, all technologies applied in MOASS are described in detail.
3.1. Handling Constraints
The most common way of handling constraints is by using penalty techniques like exterior penalty method [14]. Two penalty functions are developed to handle the deadline and budget constraints. Let C B (x) be the budget penalty function and let T D (x) be the deadline penalty function; they are defined as follows:
Since satisfying deadline and budget requirements is the primary goal of the Web service selection, the penalty is added separately to the optimization objectives. The new objective function is defined as follows:
where y1(x) = T(x) + T D (x) and y2(x) = C(x) + C B (x).
3.2. Determining the Population Size
In the MO evolution process, population size has great influence on the exploration efficiency of the solution space. It is difficult to determine an optimal population size. A fixed value of population size is set in literature [12]. For the optimization problem considered in this paper, let N be the number of nodes in a service composition application and let M = max{l(i) ∣ i ∈ V} be the maximum length of all service pools; the optimization problem may have N M solutions at most. It is shown that the solution space for each optimization problem is dependent on the composition scale and the length of service pool. Considering specific example of the optimization problem and the effective exploration of the solution space, the population size, denoted as L(P), can be defined as follows:
External population size, denoted as L(P′), can be defined by the user. In this paper, it depends on the test application; that is, L(P′) is equal to N.
3.3. Fitness Assignment
A fitness function is used to measure the quality of the solutions according to the given optimization objectives. Considering the optimum and diversity of solutions, a method called Hybrid Nondominated Sorting and Niche Technique (HNSNT) is proposed to calculate the fitness of individuals. Including the evolutionary population and external population, the procedure is a three-stage process and is described as follows.
Step 1 (pareto-based sorting). First, it selects nondominated solutions in population P∪P′ as the first level nondominated front, denoted as h i = 1. The nondominated solutions in the rest of population are selected as the next level nondominated front, denoted as h i = 2. The procedure is repeated until all individuals in the population are classified.
Step 2 (compute dynamic niche radius). Niche techniques aim at preserving diversity in population, and the key problem is to determine the niche radius. Based on literature [15], an approximate calculation method for dynamic niche radius is given. As pointed out in [15], the tradeoffs for an m-objective optimization problem is in the form of an (m − 1)-dimensional hypersurface. For the two-objective optimization problem in this paper especially, its one-dimensional surface is actually a curve.
Let
where

The tradeoff curve of two-objective optimization.
Let
Step 3 (compute the similarity of individuals). Let D i be the similarity; the processes are as follows.
For each j ∈ P∪P′,
for each j ∈ P∪P′&&j≠i,
D i = m/|P∪P′|.
Therefore, the fitness function of each individual can be defined as follows:
3.4. Preserving the Diversity of Population
The external population P′ stores the current nondominated solutions, which should be distributed evenly along the Pareto front. Meantime, it also participates in selection. So pruning the external nondominated set while maintaining its characteristics might be necessary or even mandatory.
An approach, called NTSC (niche technique of similarity based on crowding), is given. Let j be a nondominated individual in P, and let i be an individual in external population P′. All the individuals in P′ that are dominated by j are denoted as a set K and then the NTSC procedure is described as follows.
Step (1). Get any untreated individual j in P.
Step (2). Initialize set K empty.
Step (3). If ∃i ∈ P′ dominates j, then go to Step (11).
Step (4). Search all individuals in P′ dominated by i and add them to set K.
Step (5). Remove those individuals of set K from P′ and add j to P′.
Step (6). If P′ < L(P′), then go to Step (11).
Step (7). Compute the similarity D k of individual k ∈ P′ via Step (3) in Section 3.3.
Step (8). If there is an infeasible solution in P′, then go to Step (10).
Step (9). Delete the feasible individual with the highest value
Step (10). Delete the infeasible individual with the highest value
Step (11). If all individuals in P have been treated, then stop; otherwise, go to Step (1).
In this approach, nondominated infeasible solutions are stored when the number of feasible solutions is less than the upper limit of nondominated set. When updating, the infeasible solutions are preferentially eliminated from external population.
4. Experimental Results
The proposed algorithm (MOASS) is compared with MOEAWP algorithm [12], which is developed on NSGAII. The strategy of the population size in MOASS can also be adopted by MOEAWP. All these algorithms are coded in Java and performed on a Pentium IV with a 2.93 GHz processor and 512 MB RAM using the operating system Window XP.
4.1. Parameter Setting
Since no standard test instances are available to test the presented algorithms, a DAG graph random generator is developed to generate application DAG with different scales and different structures [16]. The number of nodes varies between 30 and 120 with an increment of 30; that is, V = {30, 60, 90, 120}. The maximum out-degree of the DAG ranges from 1 to 3; that is, out_degree = {1, 2, 3}. For each task in a DAG application, the length of its service pool length is 20. For those services in the same service pool, time, and cost quality parameters are randomly generated, but their execution costs are inversely proportional to their processing times.
The behaviors of algorithms are also observed at three constraint levels, namely, relaxed constraint, medium constraints, and tight constraint. The relaxed constraint level assumes that users require relatively large deadline and budget, while tight constraint level assumes that users require low deadline and budget. The relaxed/tight deadlines and budgets of an application are determined by the maximum and minimum time/cost for the service composition execution. The deadline D and budget B are defined as follows:
where T min is the minimum completion time of a composite instance and Tmax is the maximum completion time of a composite instance when all tasks selected are the slowest service for execution. The C min and Cmax have the same meaning of T min and Tmax. The value of θ varied from 0.3 to 0.7 to generate the tight, medium, and relaxed deadlines and budgets.
On all test applications, 2000 generations were simulated per optimization run. The probabilities of crossover (two-point) and mutation were fixed (0.8 and 0.1, resp.). The sizes of evolutionary population and external population were determined. These parameters are shown in Table 1.
The parameters setting and the experimental results of performance measures.
4.2. Performance Measures
In this section, three different quantitative interpretations of statistical MO optimization performances are applied to validate the proposed MOASS, together with its comparison with MOEAWP [12].
(1) Coverage of Two Sets (C) [15]. The coverage of two sets (C) is a measure to compare the domination of two populations in a pairwise manner. Let X′, X″⊆X be two sets of decision vectors, the function C maps the ordered pair (X′, X″) to the interval [0, 1]. Consider
The value C(X′, X″) = 1 means that all solutions in X″ are dominated by or equal to solutions X′. While C(X′, X″) = 0 means none of the points in X″ are covered by the population X′. Since C(X′, X″) is not necessarily equal to C(X″, X′), both C(X′, X″) and C(X″, X′) have to be considered.
(2) Uniform Distribution (U) [17]. An important metric in MO optimization concerns the distribution of individuals. The measure formula is referenced in literature [17]. The metric is able to measure the spread of nondominated individuals effectively. The smaller its value, the better uniformity for the nondominated set.
(3) The Number of Optimal Solutions (I). The metric concerns the numbers of nondominated individuals searched at the last generation. If the metric value is bigger, the algorithm should have better performance.
4.3. Experimental Results
Based on the three performance measures, the proposed MOASS has been compared with MOEAWP [12] for all above test applications. In our proposed method, four members of initial population were produced by MPBL [16], where θ ∈ {0, 0.7, 0.8, 1} in (16). The MOEAWP and MOASS were independently executed 40 times for each test application. The arithmetic means of these results using the metrics discussed above are shown in Table 1, where MOA denotes the new approach and MOE denotes the compared method.
For the measure of C, it can be seen from the results in Table 1 that MOASS has the higher mean value as compared to MOEAWP in different situations. The average value is up to 90.8%. Also, MOASS can almost dominate MOEAWP when the constraint is higher (middle, tight). The measure value, especially, is equal to 1 in case of N = 60 and higher (middle, tight) constraints. It clearly indicates that MOASS has the capability of providing useful nondominated solutions.
With respect to the number of optimal solutions (I), it can be seen that MOASS can obtain a large number of nondominated solutions when the constraint is tight. The MOEAWP algorithm obtained more solutions than the proposal with medium and relaxed constraints. This indicates that MOASS has better performance than MOEAWP in tighter constraint. Although the number of solutions obtained by MOEAWP is more than those of MOASS, the solutions can almost be dominated by MOASS.
In the context of U-measure of the nondominated population, MOASS gives the lower value of U than MOEAWP except for two test examples; that is, N = 30 and N = 60 with tight constraint. It is shown that MOASS has more capability to distribute the nondominated individuals evenly. This is because that MOEAWP does not have any operation to take care of the population distribution, whereas our approach applied the niche technologies to both fitness assignment and preservation of population diversity. So it can discover missing tradeoffs at each generation.
The running results of the second group and the fourth group of test problems in Table 1 are randomly sampled. The population distribution of Pareto optimal solutions in the objective domain is shown in Figures 2 and 3 (where EC denotes execution cost C(x) and ET denotes execution time T(x)). In general, the purpose of producing these figures is to visually inspect the performance of various algorithms in terms of the population distribution, since quantitative performance alone may provide insufficient information for full observation and understanding. As shown in Figures 2 and 3, MOASS is more capable of filling up the gap along the discovered Pareto front in all test application, especially those test examples with higher constraints. Although, MOEAWP has a good performance under the relaxed constraint, it is still slightly inferior to the MOASS.

The Pareto-front with N = 60 and different constraints.

The Pareto-front with N = 120 and different constraints.
5. Conclusion
A novel multiobjective optimization algorithm (named MOASS) is proposed for solving constrained multiobjective Web service selection problem. In handing constraints, two time and cost penalty functions are added separately to the two objective functions without liner summation. The algorithm generates initial population by existing heuristic algorithm to achieve broader explorations in boundary regions. The population size depended on the complexity of test applications. The algorithm also applies the niche technology to both fitness assignment and population diversity, which aims to discover missing tradeoffs at each generation.
Furthermore, extensive quantitative comparisons between MOASS and MOEWP on given test applications have been performed. Experimental results show that MOASS performed better than MOEAWP in searching and maintaining the uniform distribution of the nondominated solutions on all the test applications. Under tight and medium constraints especially, MOASS algorithm can discover more optimal solutions distributed uniformly along the tradeoff surface.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Footnotes
Acknowledgments
This work was supported by National Natural Science Foundation of China under Grant (no. 60873236) and Natural Science Foundation of Hebei Province in China under Grants (no. F2009000653 and no. F201204089). In addition, our research is also supported by the Youth Science Research Foundation of Hebei Province in China under Grant (no. 2010251).
