
Abstract

As a component of knowledgeable manufacturing systems, the structure of flow shop–like knowledgeable manufacturing cells is similar to that of a flow shop, thus representing an NP-hard issue. Here, we propose a self-evolutionary algorithm that exhibits learning ability and is composed of learning and scheduling modules. Unlike traditional scheduling algorithms, whose performances remain unchanged when the procedure is coded, the performance of the algorithm proposed in this study gradually improves as the learning process continues. The self-evolutionary ability is realized through the training of a hybrid kernel support vector machine. The hybrid kernel support vector machine was designed to approximate the value of the Q-function to select the appropriate action for the scheduling module and thus to obtain the optimal solution. An iterative process of value based on the Q-learning was adopted to train the hybrid kernel support vector machine to gradually enhance the algorithm’s efficiency and accuracy. The extracted state features of the flow shop–like knowledgeable manufacturing cells serve as inputs to hybrid kernel support vector machine for easy generalization of the learning results. The action exerted on a feasible solution is also defined as the input of the hybrid kernel support vector machine. The computational results show that the performance of the proposed procedure improves as the learning process progresses. Data from the computation and comparisons with other algorithms verify the validity and efficiency of the proposed algorithm.

Keywords

Scheduling, self-evolution, Q-learning, hybrid kernel support vector machine, flow shop, knowledgeable manufacturing system

Introduction

With increasingly customized demands and acute competition, manufacturing enterprises must adapt to new competitive environments more than ever. Several manufacturing systems, such as flexible manufacturing systems (FMSs) and agile manufacturing systems (AMSs), have been developed to meet such challenges. Yan and Liu¹ proposed knowledgeable manufacturing systems (KMSs) to account for this trend. Considering the optimal time, cost and quality as the primary goals, a KMS, as a truly intelligent manufacturing system, can be characterized by self-adaptation, self-learning, self-evolution and self-configuration. Studies that exploit knowledge in manufacturing systems have rapidly developed in recent years and have attracted significant attention.²⁻⁴ Similar to digital enterprise technology (DET) proposed by Maropoulos et al.,⁵⁻⁷ KMSs also emphasize the utilization of recent technological advances in manufacturing systems. Li et al.⁸ presented a review on the development of knowledge-based systems for manufacturing systems.

Various achievements have been made in recent studies that have focused on KMSs. Wan and Yan⁹ proposed a heuristic algorithm for integrated assembly job shop scheduling and self-reconfiguration in knowledgeable manufacturing. Yan et al.¹⁰ proposed a hybrid variable neighborhood search electromagnetism-like mechanism algorithm to overcome the two-stage assembly flow shop scheduling problem. Yang and Yan¹¹ proposed an adaptive scheduling control strategy based on B-Q learning for KMS. Yan et al.¹² proposed a new economic production lot size model for multi-cycle flexible production/inventory that incorporated an iterative learning algorithm. Unlike previous studies that focused primarily on the general frame properties of KMSs, such as knowledge presentation of KMS,¹³ the aforementioned studies focus on the subsystems of KMSs, such as scheduling decision systems. Flow shop–like knowledgeable manufacturing cells (KMCFS) represent a component of the scheduling decision system of KMSs that focus on problems in the scheduling algorithm, such as flow shop problems. Compared with general scheduling algorithms, which aim to optimize the solution, the scheduling algorithm for KMCFS prioritizes self-evolutionary performance. The performance of the scheduling algorithm proposed here for KMCFS can be improved gradually through off-line learning.

The structure of KMCFS is similar to that of a flow shop, which is strongly NP-hard according to Garey et al.¹⁴ The key module of KMCFS is the learning module rather than the scheduling module, which acts as the application of the former. Studies dedicated to flow shops have primarily focused on the design of heuristic and meta-heuristic algorithms to obtain a better solution, since flow shop problems were proved to be NP-hard in 1976. Ogbu and Smith¹⁵ adopted a simulated annealing method to identify a better solution. Reeves and Yamada¹⁶ applied a genetic algorithm to flow shops and obtained the best solutions of the time, which were characterized by many benchmarks. Ahmadizar¹⁷ proposed a new mechanism to initialize the pheromone trails whose intensities are limited by bounds that changed dynamically in the ant colony algorithm. Li and Yin¹⁸ designed composite mutation strategies in a discrete artificial bee colony algorithm. Grabowski and Wodecki¹⁹ developed a tabu search algorithm that rapidly obtained a desirable solution in light of block properties of flow shop. Kim and Lee²⁰ proposed two heuristic algorithms for a re-entrant flow shop with parallel machines. Marimuthu et al.²¹ proposed a simulated annealing algorithm and tabu search algorithm for a flow shop with lot sizing. Li et al.²² used a hybrid flow shop model to schedule one-of-a-kind products, and a genetic algorithm was designed to solve the problem. Chan²³ proposed three routing policies and four dispatching rules to schedule a manufacturing system. Zobolas et al.²⁴ presented a hybrid meta-heuristic algorithm that consisted of a genetic algorithm and variable neighborhood. Luo et al.²⁵ used a genetic algorithm to solve a two-stage hybrid flow shop problem with blocking and machine availability constraints. 
A quantum differential evolution algorithm was developed by Zheng and Yamashiro.²⁶ Rao and Kalyankar²⁷ developed an advanced optimization algorithm for the parameter optimization during the scheduling in a laser beam welding (LBW) process. Pan et al.²⁸ proposed a chaotic harmony search algorithm for a flow shop with limited buffers. The heuristic algorithms discussed here seek optimal or suboptimal solutions for certain benchmarks or some specialized problems within a short period of time. However, demerits do exist with these algorithms, such as sensitivity to certain parameters, dependence on initial solutions and so on, due to the random factors in these algorithms. Moreover, these algorithms have a fixed performance and cannot improve after the procedures are coded.

Due to these limitations, the development of highly intelligent algorithms has gained significant attention in recent years. The goal of this study was to develop an algorithm that can improve its searching performance using Q-learning to overcome the disadvantages of previously reported algorithms. Some flow shop algorithms with learning ability have been recently reported. Agarwal et al.²⁹ proposed an adaptive learning approach to address this issue. Lee and Michael³⁰ developed a method based on neural-net with two hierarchies for a real-time flow shop with sequencing knowledge learning. Solimanpur et al.³¹ developed a neural-net tabu search algorithm. Li and Yin³² developed a differential evolution algorithm with learning ability based on a memetic algorithm in which a new largest-ranked-value (LRV) rule was designed. Xie et al.³³ proposed a novel teaching-learning-based algorithm with a variable neighborhood. The idea of Q-learning based on value iteration has been widely used to solve scheduling problems as a type of reinforcement learning. Li et al.³⁴ modeled issues regarding the order decision making and scheduling in an order system and proposed an algorithm based on Q-learning. Xanthopoulos et al.³⁵ proposed a reinforcement learning–based approach for a single scheduling problem with stochastic arrivals and processing time. Aydin and Őztemel³⁶ adopted reinforcement learning to solve dynamic job shop problems. Stefan³⁷ developed a method based on reinforcement learning to solve flow shop problems. However, Q-learning converges markedly more slowly as the problem scale grows and may even become infeasible for large-scale problems. Therefore, a technique to extract the state features of a flow shop is used here to reduce the complexity of the state space, and a hybrid kernel support vector machine (HK-SVM) is proposed to approximate the Q-function through Q-learning. The proposed algorithm is composed of learning and scheduling modules. 
The scheduling module rapidly searches for the optimal solution using knowledge obtained from the learning module. The scheduling knowledge is generated from a process that iterates through values based on Q-learning by training the HK-SVM. More training during the off-line learning process corresponds to increased knowledge accumulation in the learning module, and thus, the algorithm’s efficiency and accuracy gradually increase as the amount of knowledge increases. Thus, the algorithm proposed in this study possesses a self-evolution feature in that the performance of searching for the optimal solution progressively improves due to the aggregate knowledge obtained through continuous off-line learning.

The remainder of this article is organized as follows. In section “Problem formulation,” the KMCFS is introduced and formulated. Q-learning and HK-SVM are presented and analyzed in section “Q-learning and HK-SVM.” The implementation steps of the proposed algorithm are given in section “Scheduling algorithm with evolutionary features for KMCFS.” Section “Numerical simulation” presents the numerical simulation. Section “Conclusion” presents the final conclusion.

Problem formulation

There are m processing units (PUs) to process n tasks, and the function of the KMCFS is to schedule the tasks to minimize the make-span. Similar to the flow shop structure reported in Brucker,³⁸ the tasks to be processed and the PUs need to satisfy the following requirements: (a) the sequences of all of the tasks flowing through the PUs are identical, (b) each task can be processed by at most one PU at a given time, (c) every PU can process at most one task at a given time, (d) the processing time of each task on each PU is determined in advance and (e) the process sequences of all of the tasks for each PU are identical.

A feasible scheduling given by the KMCFS can be represented as

ω = (ω (1), \dots, ω (i), \dots, ω (n))

(1)

where ω(i) denotes the task to be processed in the ith position in the sequence ω . Let $Π$ denote the set of feasible scheduling, that is, $ω \in Π$ , and let $C_{max} (ω)$ denote the make-span of ω . Then, the optimal scheduling $ω^{*}$ can be represented as follows:

ω^{*} = \arg min_{ω \in Π} (C_{max} (ω))

(2)

Let $p_{i ω (j)}$ denote the processing time of the jth task in ω (i.e. ω(j)) for the ith PU. Then, all of the processing times for m PUs and n tasks can be denoted by the time process matrix $P (ω)$ with m rows and n columns, that is

P (ω) = {(p_{i ω (j)})}_{m \times n}

(3)
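
The make-span in equation (2) can be evaluated with the standard permutation flow shop recursion. The following is a minimal Python sketch under stated assumptions: the function names `makespan` and `brute_force_optimum`, the 0-based indexing and the small example matrix are illustrative and do not come from the original study. Here `P[i][k]` is indexed by task identity rather than by position, so the matrix P(ω) of equation (3) corresponds to reading the columns of `P` in the order given by `omega`.

```python
from itertools import permutations


def makespan(P, omega):
    """Return C_max(omega) for an m x n processing-time matrix P.

    P[i][k] is the processing time of task k on PU i (0-based indices);
    omega is a permutation of range(n) giving the processing order.
    """
    finish = [0] * len(P)  # finish[i]: completion time of the latest task on PU i
    for task in omega:
        prev = 0  # completion time of this task on the previous PU
        for i in range(len(P)):
            # A task starts on PU i once the PU is free and the task has left PU i-1.
            finish[i] = max(finish[i], prev) + P[i][task]
            prev = finish[i]
    return finish[-1]


def brute_force_optimum(P):
    """Enumerate all sequences (only viable for very small n), as in equation (2)."""
    n = len(P[0])
    return min((makespan(P, w), w) for w in permutations(range(n)))


# Hypothetical instance with m = 2 PUs and n = 3 tasks.
P_example = [[3, 2, 4],
             [2, 5, 1]]
```

Exhaustive enumeration is shown only to make equation (2) concrete; it grows factorially with n, which is why the heuristic and learning-based approaches discussed in this article are needed.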

Q-learning and HK-SVM

The evolutionary characteristics of the KMCFS are realized via a support vector machine (SVM) with a hybrid kernel that learns the value of the Q-function for the master scheduling. A brief account of Q-learning and the construction of the HK-SVM is presented below.

Q-learning

Q-learning, which is a type of reinforcement learning, is widely used for system control, inventory control and scheduling, and is an important technique in the machine learning and intelligent control fields. The key principle of Q-learning is that an intelligent system can obtain knowledge from the environment by observing the feedback of the system’s action and modifying its strategy to reach the goal.

According to Mitchell,³⁹ the expected reward can be represented by equation (4) when a system adopts a strategy π in state s

v^{π} (s) = E [\sum_{i = 0}^{\infty} γ^{i} r_{t + i}]

(4)

The optimal strategy $π^{*}$ corresponds to the optimal value function, which is denoted as

v^{*} (s) = max_{π} v^{π} (s), \forall s \in S

(5)

The following relation holds between state $s_{i}$ and its successor state $s_{j}$

v^{*} (s_{i}) = max_{a} (r (s_{i}, a) + γ v^{*} (s_{j})), i = 1, \dots, n

(6)

According to equation (5), the optimal strategy $π^{*} (s)$ can be obtained from the optimal value function $v^{*} (s)$ , and the value of $v^{*} (s)$ can in theory be obtained by solving the system of equations in equation (6). However, this method is not feasible for the KMCFS because the number of states is excessive and grows rapidly with the problem scale. Therefore, an iterative method is adopted to obtain $v^{*} (s)$ , and the Q-function of state $s_{i}$ and action a is denoted as

q (s_{i}, a) = r (s_{i}, a) + γ v^{*} (s_{j})

(7)

Then, equation (6) can be transformed into

v^{*} (s_{i}) = max_{a} q (s_{i}, a), i = 1, \dots, n

(8)

where the value of $q (s_{i}, a)$ can be obtained by equation (9) (Haykin⁴⁰)

{\hat{q}}_{t + 1} (s_{i}, a) = {\hat{q}}_{t} (s_{i}, a) + η_{t} (s_{i}, a) (r (s_{i}, a) + γ v^{*} (s_{j}) - {\hat{q}}_{t} (s_{i}, a))

(9)

The computation process of $\hat{q} (s_{i}, a)$ in equation (9) converges under certain conditions. However, for the KMCFS, these conditions cannot be achieved for the above reason. Therefore, the HK-SVM is proposed here to approximate the value of $q (s_{i}, a)$ , which allows off-line training.
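
For small state spaces, the update in equation (9) can be applied directly to a lookup table. The sketch below is illustrative only: the function name `q_update`, the dictionary-based table and the constant learning rate `eta` are assumptions, and $v^{*} (s_{j})$ is replaced by its current estimate from equation (8).

```python
from collections import defaultdict


def q_update(q, s_i, a, reward, s_j, actions_j, eta=0.5, gamma=0.9):
    """Apply one step of equation (9) to the table q and return the new value."""
    # Current estimate of v*(s_j) = max_a q(s_j, a), as in equation (8).
    v_next = max((q[(s_j, b)] for b in actions_j), default=0.0)
    # Bracketed term of equation (9): target value minus current estimate.
    td_error = reward + gamma * v_next - q[(s_i, a)]
    q[(s_i, a)] += eta * td_error
    return q[(s_i, a)]


# Hypothetical two-state example; unseen (state, action) pairs default to 0.
q_table = defaultdict(float)
first = q_update(q_table, "s0", "a", 1.0, "s1", ["a", "b"])
q_table[("s1", "b")] = 2.0
second = q_update(q_table, "s0", "a", 1.0, "s1", ["a", "b"])
```

A table of this kind needs one entry per state-action pair, which is the practical obstacle for the KMCFS that motivates replacing the table with a function approximator.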

HK-SVM

SVMs are widely applied in fields such as pattern recognition⁴¹ and the regression of non-linear functions. The principle of SVMs is to map the input vector from a low-dimensional space into a high-dimensional linear space such that the problem can be treated as linear and solved. The results generalize well because, based on Vapnik–Chervonenkis (VC) dimension theory, the SVM minimizes the structural risk.

For a sample set $X = \{(x_{i}, d_{i}) | i = 1, \dots, n\}$ , there exists a functional relation between x and d. The approximated value of d (denoted as y) can be obtained by mapping x into a high-dimensional space, that is

y = \sum_{j = 0}^{m} w_{j} φ_{j} (x) = w^{T} φ (x)

(10)

The function regression in equation (10) can be transformed into a pattern classification problem using non-negative slack variables $ξ_{i}, ξ'_{i}$ , and the corresponding optimization problem with hyper-plane constraints in the high-dimensional space can be represented by the following equation⁴²

\begin{aligned} \min \; & L_{p} (w, ξ, ξ') = C \sum_{i = 1}^{n} (ξ_{i} + {ξ'}_{i}) + \frac{1}{2} w^{T} w \\ \text{s.t.} \; & d_{i} - w^{T} φ (x_{i}) \leq ε + ξ_{i}, \\ & w^{T} φ (x_{i}) - d_{i} \leq ε + {ξ'}_{i}, \quad i = 1, \dots, n \end{aligned}

(11)

where C and ε are given constants. The optimization in equation (11) can be solved using the Lagrange function to construct its dual problem, and the approximation function of d (denoted as y( x )) can be represented as follows

y (x) = \sum_{i = 1}^{n} (α_{i} - {α'}_{i}) k (x, x_{i})

(12)

where $α_{i}$ and $α'_{i}$ are Lagrange multipliers that can be obtained by solving the dual problem of the optimization in equation (11) and $k (x, x_{i})$ is the kernel function defined by the Mercer theorem.⁴²

The performance of an SVM depends heavily on the selection of the kernel functions. Among the most commonly used are polynomial functions, Gaussian functions and Sigmoid functions. A hybrid kernel function is proposed here to guarantee full performance of the SVM

k_{mix} (x, z) = λ {(x^{T} z + 1)}^{2} + (1 - λ) \exp (\frac{- {‖ x - z ‖}^{2}}{2 σ^{2}})

(13)

where λ denotes the optimal hybrid coefficient, $λ \in (0, 1)$ , and $x, z \in R^{d}$ .
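
Equation (13) translates directly into code. The following Python sketch is an illustration under assumptions: the function names, the default values of `lam` and `sigma` and the representation of inputs as plain lists are not taken from the original study.

```python
import math


def k_poly(x, z):
    """Second-degree polynomial kernel (x^T z + 1)^2."""
    return (sum(a * b for a, b in zip(x, z)) + 1.0) ** 2


def k_gauss(x, z, sigma):
    """Gaussian kernel exp(-||x - z||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def k_mix(x, z, lam=0.5, sigma=1.0):
    """Hybrid kernel of equation (13): a convex combination of both kernels."""
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie in the open interval (0, 1)")
    return lam * k_poly(x, z) + (1.0 - lam) * k_gauss(x, z, sigma)
```

Values of `lam` close to 1 weight the polynomial term more heavily, and values close to 0 weight the Gaussian term more heavily.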

We now prove that the proposed function $k_{mix} (x, z)$ is a valid kernel function for the SVM. A lemma is given below to simplify the proof.

Lemma 1

Let X be a finite input space with $k (x, z) (x, z \in X)$ a symmetric function on X. Then, $k (x, z)$ is a kernel function if and only if the matrix K (according to the following equation) is positive semi-definite⁴²

K = {(k_{ij})}_{n \times n}, k_{ij} = k (x_{i}, x_{j}), i, j = 1, \dots, n

(14)

Property 1

The function $k_{mix} (x, z)$ given by equation (13) is a kernel function defined in space $R^{d}$ .

Proof

Since the function $k_{1} (x, z) = {(x^{T} z + 1)}^{2}$ is a polynomial kernel function defined in space $R^{d}$ , the matrix $K^{1}$ can be denoted as

K^{1} = {(k_{ij}^{1})}_{n \times n}, k_{ij}^{1} = k_{1} (x_{i}, x_{j}), i, j = 1, \dots, n

(15)

By Lemma 1, for $\forall x \in R^{n}$

x^{T} K^{1} x \geq 0

(16)

Similarly, since the function denoted as $k_{2} (x, z) = \exp (- ‖ x - z ‖^{2} / 2 σ^{2})$ is a Gaussian kernel function defined in space $R^{d}$ , the matrix $K^{2}$ can be written as

K^{2} = {(k_{ij}^{2})}_{n \times n}, k_{ij}^{2} = k_{2} (x_{i}, x_{j}), i, j = 1, \dots, n

(17)

By Lemma 1, for $\forall x \in R^{n}$

x^{T} K^{2} x \geq 0

(18)

Therefore, matrix K related to the function $k_{mix} (x, z)$ can be expressed as

K = {(k_{ij})}_{n \times n}, k_{ij} = k_{mix} (x_{i}, x_{j}), i, j = 1, \dots, n

(19)

where

k_{ij} = λ k_{ij}^{1} + (1 - λ) k_{ij}^{2}

(20)

Then, for matrix K

K = λ K^{1} + (1 - λ) K^{2}

(21)

For $\forall x \in R^{n}$ and $λ \in (0, 1)$

\begin{aligned} x^{T} K x &= x^{T} (λ K^{1} + (1 - λ) K^{2}) x \\ &= λ x^{T} K^{1} x + (1 - λ) x^{T} K^{2} x \geq 0 \end{aligned}

(22)

Thus, matrix K is positive semi-definite. In addition, it can be inferred from equation (13) that the following equation holds

k_{mix} (x, z) = k_{mix} (z, x)

(23)

That is to say, the function $k_{mix} (x, z)$ is symmetrical, and the conclusion holds that the function $k_{mix} (x, z)$ is a kernel function in space $R^{d}$ . The proof ends here.
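
Property 1 can also be spot-checked numerically. The sketch below builds the matrix K of equation (19) for a few random points and evaluates the quadratic form of equation (22) for random vectors. It is a sanity check under assumptions (random data, fixed seed, illustrative values of `lam` and `sigma`, hypothetical function names), not a substitute for the proof.

```python
import math
import random


def k_mix(x, z, lam, sigma):
    """Hybrid kernel of equation (13)."""
    dot = sum(a * b for a, b in zip(x, z))
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return lam * (dot + 1.0) ** 2 + (1.0 - lam) * math.exp(-sq_dist / (2.0 * sigma ** 2))


def gram_matrix(points, lam, sigma):
    """Matrix K of equation (19) for the given sample points."""
    return [[k_mix(p, q, lam, sigma) for q in points] for p in points]


def quadratic_form(K, c):
    """Value of c^T K c, the left-hand side of equation (22)."""
    n = len(c)
    return sum(c[i] * K[i][j] * c[j] for i in range(n) for j in range(n))


rng = random.Random(0)
points = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(8)]
K = gram_matrix(points, lam=0.3, sigma=1.0)

# Smallest quadratic form over 200 random coefficient vectors in R^8.
lowest = min(
    quadratic_form(K, [rng.uniform(-1.0, 1.0) for _ in range(len(K))])
    for _ in range(200)
)
is_symmetric = all(K[i][j] == K[j][i] for i in range(len(K)) for j in range(len(K)))
```

Up to floating-point rounding, `lowest` should be non-negative and `is_symmetric` should be true, in agreement with equations (22) and (23).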

Scheduling algorithm with evolutionary features for KMCFS

Our scheduling algorithm for the KMCFS (denoted as KMCFSSAEF) differs from general scheduling algorithms in that it can rapidly obtain the solution to the current problem based on the knowledge stored in the HK-SVM, which is trained through off-line Q-learning. Various techniques were employed to represent the state and action in the algorithm. The details and full procedure of the algorithm are given in the following sections.

State representation

For a specific KMCFS problem, its state information can be precisely represented by its feasible scheduling ω and time process matrix P (ω), although it is difficult for a learning system to recognize such representation. Therefore, the key features of the problem must be extracted so that it can be easily recognized. The techniques used to represent the state of the problem are presented below:

1. Normalize the time process matrix $P (ω)$ to eliminate the numerical differences between particular KMCFS problems. The normalized matrix of $P (ω)$ is denoted as $\bar{P} (ω) = {({\bar{p}}_{i ω (j)})}_{m \times n}$ and can be obtained from the following equation

\bar{P} (ω) = \frac{P (ω)}{\sum_{i = 1}^{m} \sum_{j = 1}^{n} p_{i ω (j)}}

(24)

2. Extract the KMCFS state features. The parameters representing the state features are acquired as follows:

The column index $t = {(t_{j})}_{n \times 1}$ can be obtained as

t_{j} = \sum_{i = 1}^{m} {\bar{p}}_{i ω (j)}

(25)

Denoting the end time of the ith PU as $ac_{i}$ , the row index $a = {(a_{i})}_{m \times 1}$ can be expressed as

a_{i} = ac_{i} - \sum_{j = 1}^{n} {\bar{p}}_{i ω (j)}

(26)

The average slack $av$ can be calculated by the following equation

av = \frac{1}{m} \sum_{i = 1}^{m} a_{i}

(27)

In addition, we obtain the standard deviation of the slack $ad$ as

ad = {(\frac{1}{m} \sum_{i = 1}^{m} {(a_{i} - av)}^{2})}^{1 / 2}

(28)

Denoting the end time of the jth task as $tc_{j}$ , we obtain the average waiting $tw$ as

tw = \frac{1}{n} \sum_{j = 1}^{n} (tc_{j} - t_{j})

(29)

The corresponding standard deviation $td$ can be obtained using equation (30)

td = {(\frac{1}{n} \sum_{j = 1}^{n} {((tc_{j} - t_{j}) - tw)}^{2})}^{1 / 2}

(30)

With the above parameters, the state features of the KMCFS with a scale of m × n can be represented by the following vector

sc (ω) = (t, a, av, ad, tw, td)

(31)
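
Equations (24) to (31) can be assembled into a single feature-extraction routine. The Python sketch below rests on assumptions that the text does not state explicitly: the end times $ac_{i}$ and $tc_{j}$ are taken as completion times computed on the normalized matrix, $ac_{i}$ is the completion time of the last task on PU i, and $tc_{j}$ is the completion time of the jth task on the last PU. The function name `state_features` and the example data are illustrative.

```python
def state_features(P, omega):
    """Return sc(omega) = (t, a, av, ad, tw, td) as a flat list of length n + m + 4.

    P[i][k] is the processing time of task k on PU i; omega is the task order.
    """
    m, n = len(P), len(omega)
    total = sum(P[i][k] for i in range(m) for k in omega)
    # Equation (24): normalized matrix, columns ordered by omega.
    Pbar = [[P[i][k] / total for k in omega] for i in range(m)]

    # Completion time C[i][j] of the task in position j on PU i (normalized units).
    C = [[0.0] * n for _ in range(m)]
    for j in range(n):
        for i in range(m):
            pu_free = C[i][j - 1] if j > 0 else 0.0
            task_ready = C[i - 1][j] if i > 0 else 0.0
            C[i][j] = max(pu_free, task_ready) + Pbar[i][j]

    t = [sum(Pbar[i][j] for i in range(m)) for j in range(n)]   # equation (25)
    a = [C[i][n - 1] - sum(Pbar[i]) for i in range(m)]          # equation (26)
    av = sum(a) / m                                             # equation (27)
    ad = (sum((x - av) ** 2 for x in a) / m) ** 0.5             # equation (28)
    waits = [C[m - 1][j] - t[j] for j in range(n)]              # tc_j - t_j
    tw = sum(waits) / n                                         # equation (29)
    td = (sum((w - tw) ** 2 for w in waits) / n) ** 0.5         # equation (30)
    return t + a + [av, ad, tw, td]                             # equation (31)


# Hypothetical instance with m = 2 PUs and n = 3 tasks.
features = state_features([[3, 2, 4], [2, 5, 1]], [1, 0, 2])
```

Because every entry is computed from the normalized matrix, two instances that differ only by a common scaling of the processing times yield the same feature vector.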

Action representation

The definitions and computation steps of the actions performed for the feasible scheduling ω of the KMCFS are presented below.

Definition 1

When ω is a feasible scheduling of the KMCFS, an action is defined as swapping the task ω(f) at the fth position of ω with the task ω(g) at the gth position. This action is denoted as $sw (f, g)$ $(f, g \in \{1, \dots, n\}, f \neq g)$ , and its value is calculated by

sw (f, g) = \frac{f + g + \sqrt{fg}}{n}

(32)

When the action $sw (f, g)$ is performed on ω , another feasible scheduling $ω'$ can be obtained as follows:

ω' = (ω (1), \dots, ω (f - 1), ω (g), ω (f + 1), \dots, ω (g - 1), ω (f), ω (g + 1), \dots, ω (n))

(33)
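
The swap action and its scalar value can be sketched in a few lines of Python. The function names below are illustrative assumptions; positions f and g are 1-based, as in Definition 1.

```python
import math


def sw_value(f, g, n):
    """Scalar value of the action sw(f, g) according to equation (32)."""
    return (f + g + math.sqrt(f * g)) / n


def apply_swap(omega, f, g):
    """Return the sequence of equation (33) with positions f and g exchanged."""
    new_omega = list(omega)
    new_omega[f - 1], new_omega[g - 1] = new_omega[g - 1], new_omega[f - 1]
    return new_omega


def all_swaps(n):
    """All unordered position pairs (f, g) with f < g; there are n(n - 1)/2 of them."""
    return [(f, g) for f in range(1, n + 1) for g in range(f + 1, n + 1)]
```

Since equation (32) is symmetric in f and g, `sw_value(f, g, n)` equals `sw_value(g, f, n)`, so enumerating pairs with f < g covers every distinct swap.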

Algorithmic steps

The algorithm is composed of a training-based learning module and a scheduling module. The former module plays a key role in the entire procedure, and the latter module acts as the application. The purpose of the learning module is to ensure that the approximation accuracy of the Q-function gradually improves through the training of the HK-SVM weight parameters. The value of ${\hat{q}}_{t} (s_{i}, a)$ can be approximated by equation (12), and the error of the approximation can be computed as

Δ q_{t} (s_{i}, a) = r (s_{i}, a) + γ v^{*} (s_{j}) - {\hat{q}}_{t} (s_{i}, a)

(34)

where $r (s_{i}, a) = C_{max} (ω_{i}) - C_{max} (ω_{j}), v^{*} (s_{j}) = max_{a} \hat{q} (s_{j}, a)$ .

In equation (34), $r (s_{i}, a) + γ v^{*} (s_{j})$ is treated as the target value of the pair of $(s_{i}, a)$ , and can thus be denoted as

q_{t}^{tar} (s_{i}, a) = r (s_{i}, a) + γ v^{*} (s_{j})

(35)

States $s_{i}$ and $s_{j}$ corresponding to sequences $ω_{i}$ and $ω_{j}$ can be represented by the time processing matrices $P (ω_{i})$ and $P (ω_{j})$ , that is, $s_{i} = P (ω_{i})$ and $s_{j} = P (ω_{j})$ . Steps of the learning module are as follows:

Step 1: Initialization. Set the initial vector $w_{0}$ as the weight vector w of the HK-SVM, and select $ω_{0}$ from the set $Π$ as the random initial state $s_{0}$ of the KMCFS, that is, $s_{0} = P (ω_{0})$ . Set $s_{t} = s_{0}$ and $ω = ω_{0}$ , and set the upper and lower bounds of the number of iterations to N₁ and N₂, respectively.

Step 2: Extract the features of state $s_{t}$ by computing vector $sc (ω)$ . For all of the actions performed on ω , choose the action with the maximum $r (s_{t}, a)$ to be performed on ω and compute its $sw (f, g)$ .

Step 3: Calculate ${\hat{q}}_{t} (s_{t}, a)$ , $Δ q_{t} (s_{t}, a)$ and $q_{t}^{tar} (s_{t}, a)$ according to equations (12), (34) and (35), respectively, and if $| Δ q_{t} (s_{t}, a) | < Δ$ holds, then go to step 6; otherwise, go to step 4.

Step 4: If the number of iterations is greater than the upper bound N₁, then end the learning procedure; otherwise, add $(sc (ω), sw (f, g), q_{t}^{tar} (s_{t}, a))$ to the sample set of the HK-SVM for new training.

Step 5: Renew the current state $s_{t}$ with the following state ${s'}_{t}$ corresponding to $ω'$ after the action is performed, and then, go to step 2.

Step 6: If the number of iterations is greater than the lower bound N₂, then end the procedure; otherwise, go to step 5.
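
The control flow of the learning steps above can be sketched as follows. This is a simplified Python illustration under loudly stated assumptions: the class `TableApproximator` is a stand-in that merely memorizes targets and is not an HK-SVM, states are keyed by the sequence ω instead of the feature vector sc(ω), one pass through steps 2 to 5 is counted as one iteration, and all names and parameter values are hypothetical.

```python
import random


def makespan(P, omega):
    """C_max of a permutation flow shop; P[i][k] is the time of task k on PU i."""
    finish = [0] * len(P)
    for task in omega:
        prev = 0
        for i in range(len(P)):
            finish[i] = max(finish[i], prev) + P[i][task]
            prev = finish[i]
    return finish[-1]


def swapped(omega, f, g):
    """Sequence obtained by exchanging 0-based positions f and g."""
    new_omega = list(omega)
    new_omega[f], new_omega[g] = new_omega[g], new_omega[f]
    return tuple(new_omega)


class TableApproximator:
    """Stand-in for the HK-SVM (assumption): stores the latest target per pair."""

    def __init__(self):
        self.samples = {}

    def predict(self, omega, action):
        return self.samples.get((omega, action), 0.0)

    def add_sample(self, omega, action, target):
        self.samples[(omega, action)] = target


def learn(P, model, n1, n2, delta=0.05, gamma=0.9, seed=0):
    """Run the learning loop; n1 and n2 are the upper and lower iteration bounds."""
    n = len(P[0])
    actions = [(f, g) for f in range(n) for g in range(f + 1, n)]
    omega = tuple(random.Random(seed).sample(range(n), n))          # step 1
    iterations = 0
    while True:
        iterations += 1
        c_now = makespan(P, omega)
        # Step 2: choose the action with the maximum immediate reward.
        action = max(actions, key=lambda a: c_now - makespan(P, swapped(omega, *a)))
        nxt = swapped(omega, *action)
        reward = c_now - makespan(P, nxt)
        # Step 3: target value (35) and approximation error (34).
        q_tar = reward + gamma * max(model.predict(nxt, b) for b in actions)
        error = q_tar - model.predict(omega, action)
        if abs(error) < delta:
            if iterations > n2:                                     # step 6
                return iterations
        else:
            if iterations > n1:                                     # step 4
                return iterations
            model.add_sample(omega, action, q_tar)
        omega = nxt                                                 # step 5


model = TableApproximator()
used = learn([[3, 2, 4], [2, 5, 1]], model, n1=50, n2=10)
```

In the actual procedure, `add_sample` would trigger retraining of the HK-SVM on the enlarged sample set, and `predict` would evaluate equation (12) on the pair (sc(ω), sw(f, g)).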

In contrast to the learning procedure, the scheduling process is brief because it exploits the knowledge accumulated by the learning module after the HK-SVM has been trained sufficiently. The implementation proceeds as follows:

Step 1: Initialize a new problem. Choose $ω_{0}$ at random from the set $Π$ as the initial state $s_{0}$ , that is, $s_{0} = P (ω_{0})$ , and set $s_{t} = s_{0}$ and $ω = ω_{0}$ .

Step 2: Compute the vectors $sc (ω)$ and $sw (f, g)$ of all of the actions performed on ω .

Step 3: Compute the outputs of the HK-SVM ${\hat{q}}_{t} (s_{t}, a)$ for all actions and choose the action with the maximum ${\hat{q}}_{t} (s_{t}, a)$ to perform on ω . Denote the following sequence as $ω'$ for this action.

Step 4: Compute the reward $r (s_{t}, a)$ of the chosen action. If $r (s_{t}, a)$ is negative, go to step 5; otherwise, renew ω and s by setting $ω = ω'$ and $s_{t} = {s'}_{t} = P (ω')$ , and go to step 2.

Step 5: Compute $r (s_{t}, a)$ of all of the actions performed on ω. If they are all negative, end the procedure; otherwise, substitute the action with the maximum reward for the above-chosen action, and go to step 2 after renewing ω and $s_{t}$ .
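
The scheduling steps can be sketched in Python with the trained approximator passed in as a function. The names below are illustrative assumptions, and `q_hat` stands for any callable that returns the estimated Q-value of a swap; in the example it is an oracle equal to the immediate reward, not a trained HK-SVM. The `max_steps` guard is an addition of this sketch, because zero-reward swaps are accepted by step 4 and could otherwise repeat indefinitely.

```python
def makespan(P, omega):
    """C_max of a permutation flow shop; P[i][k] is the time of task k on PU i."""
    finish = [0] * len(P)
    for task in omega:
        prev = 0
        for i in range(len(P)):
            finish[i] = max(finish[i], prev) + P[i][task]
            prev = finish[i]
    return finish[-1]


def swapped(omega, f, g):
    """Sequence obtained by exchanging 0-based positions f and g."""
    new_omega = list(omega)
    new_omega[f], new_omega[g] = new_omega[g], new_omega[f]
    return new_omega


def schedule(P, omega, q_hat, max_steps=1000):
    """Follow steps 2 to 5 of the scheduling module from the initial sequence omega."""
    n = len(omega)
    actions = [(f, g) for f in range(n) for g in range(f + 1, n)]
    omega = list(omega)
    for _ in range(max_steps):
        c_now = makespan(P, omega)

        def reward(a):
            return c_now - makespan(P, swapped(omega, *a))

        # Step 3: action with the largest estimated Q-value.
        chosen = max(actions, key=lambda a: q_hat(omega, a))
        if reward(chosen) < 0:
            # Step 5: fall back to the action with the largest immediate reward.
            chosen = max(actions, key=reward)
            if reward(chosen) < 0:
                break  # every swap worsens the make-span
        omega = swapped(omega, *chosen)  # step 4: accept and continue
    return omega, makespan(P, omega)


P_example = [[3, 2, 4], [2, 5, 1]]


def oracle_q(omega, a):
    """Stand-in for the trained HK-SVM (assumption): the immediate reward."""
    return makespan(P_example, omega) - makespan(P_example, swapped(omega, *a))


best_omega, best_cmax = schedule(P_example, [0, 2, 1], oracle_q)
```

With the oracle, the loop reduces to steepest-descent local search over swaps; a trained approximator would instead rank actions by their estimated long-term value.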

Numerical simulation

The numerical simulation evaluates both the influence of varying off-line learning parameters on the performance of the procedure and the procedure’s scheduling performance in comparison with other algorithms. The procedure is coded in MATLAB 7.0 and run on a PC equipped with a Celeron M (1.6 GHz) CPU and 504 MB of RAM.

Impact of the variance of the learning parameters

Unlike traditional scheduling algorithms, the performance of the proposed KMCFSSAEF algorithm improves as the learning progresses. In this section, the simulation is performed to test the solution’s accuracy and efficiency with varying numbers of learning iterations and training samples and with varying values of the hybrid coefficient λ. The variation in the solution accuracy with the number of learning iterations and the number of 100 × 20 training samples is given in Table 1. The training samples are produced randomly using the method proposed by Taillard.⁴³ The data in Table 1 show that the make-span of the same problem decreases as the number of training samples and the number of iterations increase, indicating that the solution accuracy improves when the learning process is sufficiently long.

Table 1.

Make-span with different number of training samples and iterations.

Number of iterations	Make-span with training samples
	5	10	15	20
1000	7376	7184	7020	6956
2000	7232	6987	6840	6769
3000	6958	6756	6676	6604
4000	6745	6598	6469	6392
5000	6586	6441	6321	6271
6000	6430	6302	6286	6204

Figure 1 illustrates the trend in the time needed to search for solutions with an increasing number of learning iterations and 200 × 20 training samples. Overall, a decreasing trend is observed with increasing numbers of iterations and training samples, except for a few points. Figure 1 also shows that the search time fluctuates less once the learning process is sufficient. In the figure, the horizontal axis represents the number of iterations, and the vertical axis represents the average time needed to find the solution using the procedure.

Figure 1.

Time for solution search with the variance of iteration times.

As shown in Figure 2, the approximation performance of the HK-SVM is affected by the optimal hybrid coefficient λ. The horizontal axis represents the value of λ, and the vertical axis denotes the relative error of the HK-SVM approximation (REA), which is defined as follows:

REA = \frac{Δ q_{t} (s_{i}, a)}{{\hat{q}}_{t} (s_{i}, a)}

(36)

Figure 2.

REA with the optimal hybrid coefficient λ.

The REA for problems never trained using the HK-SVM gradually decreases with increasing λ, whereas the REA for those that have been trained gradually increases. This phenomenon occurs because the polynomial kernel function exhibits good global generalization, and thus, the REA is smaller for untrained problems with a greater λ. However, the local generalization ability of the polynomial kernel function is weaker than that of a Gaussian kernel function, and thus, the REA is larger for the trained problems. Therefore, we adopted the hybrid kernel function. The HK-SVM possesses the advantages of both the polynomial kernel function and the Gaussian kernel function, and each kernel compensates for the disadvantages of the other. The scale of the problems in Figure 2 is equal to that in Figure 1.

Comparison of algorithms

In this section, the accuracy and efficiency of the KMCFSSAEF algorithm are compared with those of the TSGW¹⁹ (Grabowski and Wodecki), RY¹⁶ (Reeves and Yamada), ODDE³² (Li and Yin) and HTLBO³³ (Xie et al.) algorithms for the benchmarks of Taillard⁴³ on the scales of 20 × 20, 50 × 10, 50 × 20, 100 × 10, 100 × 20, 200 × 20 and 500 × 20, and then with those of the QDEA²⁶ (Zheng and Yamashiro), CDABC¹⁸ (Li and Yin) and ODDE³² (Li and Yin) algorithms for the benchmarks of Carlier,⁴⁴ Reeves and Yamada.¹⁶ Off-line learning was completed before the comparison was made. The key parameters set for the learning process are listed in Table 2.

Table 2.

Key parameters for learning procedure.

N₁	N₂	C	ε	Δ	Number of samples
100	20,000	0.25	0.05	0.05	10

A comparison of the accuracies of the five algorithms is shown in Figure 3, where the horizontal axis represents the benchmarks for different scales, that is, T1: 20 × 20; T2: 50 × 10; T3: 50 × 20; T4: 100 × 10; T5: 100 × 20; T6: 200 × 20; and T7: 500 × 20, and the vertical axis represents the make-span. As can be seen from the figure, the solution accuracy obtained by our algorithm is slightly lower than that of the other four algorithms when the problem scale is small, but that gap is narrowed gradually with increasing scale, particularly in terms of some specific benchmarks. For the scale 500 × 20, the accuracy of KMCFSSAEF reaches that of TSGW, HTLBO and RY.

Figure 3.

Comparison of the accuracies of the five algorithms.

Table 3 presents a comparison of the efficiencies of the five algorithms. The time required by KMCFSSAEF to find the solution is less than that required by TSGW, RY, ODDE and HTLBO. In addition, this efficiency advantage becomes even more pronounced as the problem scale increases because the time spent by KMCFSSAEF increases slowly and nearly linearly with the problem scale.

Table 3.

Comparison of the efficiencies of the five algorithms.

Search time (s)		Problem scale
		T1	T2	T3	T4	T5	T6	T7
RY	aver	18	35.8	44.8	52.6	60.3	102.1	157.8
RY	min	15.3	29.8	38	48.3	52.1	89.7	140.3
TSGW	aver	17.1	29.2	37.1	46.5	54.2	96.5	134.2
TSGW	min	14.2	23.3	33.5	41.1	46.9	80.4	121.6
ODDE	aver	17.5	36.9	45.6	51.6	58.2	98.4	142.5
ODDE	min	15.5	32.2	37.6	47.2	50.3	90.1	133.2
HTLBO	aver	16.8	34.6	42.1	50.3	57.3	94.2	145.1
HTLBO	min	15.1	29.8	38.1	42.3	48.5	85.6	132.8
KMCFSSAEF	aver	15.6	16.8	17.6	21.2	23.7	30.2	42.1
KMCFSSAEF	min	13.4	14.5	15.4	18.4	20.1	26.7	36.8

KMCFSSAEF: scheduling algorithm for flow shop–like knowledgeable manufacturing cells; aver: average; min: minimum.

The accuracy performance of the KMCFSSAEF algorithm was also compared with three other algorithms, QDEA, CDABC and ODDE. The results are listed in Table 4. In Table 4, $M^{*}$ denotes the lower bound value or the optimal make-span known thus far. The best relative error to $M^{*}$ is denoted as $b_{re}$ , and the average relative error to $M^{*}$ is denoted as $a_{re}$ , which can be computed according to equations (37) and (38), respectively. In Table 4, each instance is run 20 times. The data in Table 4 show that our KMCFSSAEF algorithm is slightly better than the QDEA and CDABC algorithms, but is not as good as the ODDE algorithm for obtaining the optimal make-span of each instance. However, the KMCFSSAEF algorithm exceeds the other three algorithms regarding the average searching performance after repeated runs, indicating that KMCFSSAEF is more robust than the other three algorithms in terms of searching performance.

b_{re} = \frac{H_{best} - best_{know}}{best_{know}}

(37)

where $H_{best}$ denotes the best make-span obtained by algorithm H (H refers to QDEA, CDABC, ODDE and KMCFSSAEF) and $best_{know}$ denotes the optimal make-span, or the lower bound value known thus far

a_{re} = \frac{1}{R} \sum_{i = 1}^{R} \frac{H_{i} - best_{know}}{best_{know}}

(38)

Table 4.

Comparisons of the QDEA, CDABC, ODDE and KMCFSSAEF algorithms in terms of make-span.

Problem	m|n	M*	QDEA		CDABC		ODDE		KMCFSSAEF
			b_re	a_re	b_re	a_re	b_re	a_re	b_re	a_re
Car1	5|11	7038	0	0	0	0	0	0	0	0
Car2	4|13	7166	0	0	0	0	0	0	0	0
Car3	5|12	7312	0	0	0	0	0	0	0	0
Car4	4|14	8003	0	0	0	0	0	0	0	0
Car5	6|10	7720	0	0	0	0	0	0	0	0
Car6	9|8	8505	0	0	0	0	0	0	0	0
Car7	7|7	6590	0	0	0	0	0	0	0	0
Car8	8|8	8366	0	0	0	0	0	0	0	0
Rec01	5|20	1247	0	0.112	0	0.0321	0	0	0	0
Rec03	5|20	1109	0	0.009	0	0	0	0	0	0
Rec05	5|20	1242	0.242	0.242	0.242	0.242	0	0.170	0.186	0.167
Rec07	10|20	1566	0	0	0	0	0	0	0	0
Rec09	10|20	1537	0	0	0	0	0	0	0	0
Rec11	10|20	1431	0	0	0	0	0	0	0	0
Rec13	15|20	1930	0.104	0.225	0	0.135	0	0.124	0.046	0.122
Rec15	15|20	1950	0	0.158	0	0.133	0	0.062	0	0.059
Rec17	15|20	1902	0	0.126	0	0.073	0	0	0	0
Rec19	10|30	2093	0.287	0.435	0.287	0.392	0.239	0.373	0.258	0.37
Rec21	10|30	2017	0.149	1.041	0.149	1.056	0.149	0.999	0.148	0.982
Rec23	10|30	2011	0.348	0.597	0.149	0.428	0.149	0.438	0.152	0.428
Rec25	15|30	2513	0.119	0.995	0.0796	0.664	0.199	0.454	0.0682	0.454
Rec27	15|30	2373	0.253	0.954	0.253	0.615	0.126	0.607	0.242	0.603
Rec29	15|30	2287	0	0.824	0	0.791	0	0.770	0	0.768
Rec31	10|50	3045	0.263	0.565	0.263	0.348	0.263	0.302	0.263	0.303
Rec33	10|50	3114	0	0.297	0	0.025	0	0	0	0.013
Rec35	10|50	3277	0	0	0	0	0	0	0	0
Rec37	20|75	4951	1.717	2.771	1.555	1.902	1.737	2.165	1.555	1.912
Rec39	20|75	5087	0.845	1.485	0.904	1.089	0.649	0.869	0.836	0.864
Rec41	20|75	4960	1.190	1.965	1.189	1.682	1.008	2.085	1.172	1.68
Average			0.1902	0.441	0.1748	0.331	0.156	0.325	0.1698	0.3008

KMCFSSAEF: scheduling algorithm for flow shop–like knowledgeable manufacturing cells.

In equation (38), H_i denotes the make-span obtained by algorithm H in the ith run, and R denotes the total number of runs.
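As a worked illustration of equations (37) and (38), the following minimal Python sketch computes both relative errors from the make-spans of repeated runs. The function name `relative_errors` and the sample make-spans are hypothetical and are not taken from the experiments. The entries of Table 4 appear to be expressed as percentages (an assumption: a make-span of 1245 against a best-known value of 1242 gives about 0.242 once multiplied by 100, which matches the recurring Rec05 entry), so the sketch prints the values scaled by 100.

```python
def relative_errors(makespans, best_known):
    """Return (b_re, a_re) as plain fractions, following equations (37) and (38).

    makespans  -- make-spans H_i obtained in each of the R runs
    best_known -- optimal make-span or lower bound known thus far
    """
    if not makespans:
        raise ValueError("at least one run is required")
    if best_known <= 0:
        raise ValueError("best_known must be positive")

    # Equation (37): relative error of the best run.
    h_best = min(makespans)
    b_re = (h_best - best_known) / best_known

    # Equation (38): mean relative error over all R runs.
    runs = len(makespans)
    a_re = sum((h - best_known) / best_known for h in makespans) / runs
    return b_re, a_re


# Hypothetical make-spans from three runs (illustrative values only).
sample_runs = [1245, 1245, 1248]
b_re, a_re = relative_errors(sample_runs, best_known=1242)

# Scaling by 100 is an assumption about how Table 4 reports the errors.
print(f"b_re = {100 * b_re:.3f}%, a_re = {100 * a_re:.3f}%")
```

Because a_re averages over every run while b_re looks only at the best one, a_re can never be smaller than b_re for the same set of runs.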

Conclusion

In this study, we proposed KMCFSSAEF, a scheduling algorithm with self-evolutionary features for KMCFS that consists of a learning module and a scheduling module.

The self-evolutionary ability of KMCFSSAEF is realized by training an HK-SVM to approximate the value of the Q-function, which the scheduling module then uses to choose a suitable action in its search for the optimal solution. The HK-SVM is trained through an iterative process based on Q-learning. The scheduling performance of KMCFSSAEF gradually improves with the knowledge obtained during the learning process, and this feature overcomes the static search performance of conventional optimization algorithms.

The numerical simulations showed that KMCFSSAEF improves markedly once the off-line training is complete. When the number of training samples is increased from 5 to 20, the accuracy of the solution increases by 5.12% on average, and the accuracy is on average 11.25% higher after 6000 iterations than after 1000 iterations. The time spent searching for the optimal solution is reduced by 24.43% when the number of iterations is increased from 1000 to 6000 and the number of training samples from 5 to 15. Although the solution quality of KMCFSSAEF may seldom exceed that of a traditional scheduling algorithm, such as TSGW or RY, its efficiency far surpasses that of its peers, especially for large-scale problems. The linear growth of its search time gives the algorithm a conspicuous advantage over the other algorithms, making it feasible and effective for engineering applications, especially in situations that involve real scheduling, rescheduling and on-line scheduling.

Further studies should focus on improving the evolutionary capacity of the proposed KMCFSSAEF algorithm by adopting a more suitable representation of the system states, which affects the learning quality. Another interesting direction is to develop self-evolutionary algorithms for other extended types of flow shop problems, such as re-entrant and no-wait flow shops, based on the algorithm proposed in this article.
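The interplay between the Q-learning iteration and the value approximator summarized above can be pictured with a short Python sketch. It is only an illustration under stated assumptions and not the implementation used in this study: the hybrid kernel is assumed to be a convex combination of an RBF kernel and a polynomial kernel, `q_value` is a hypothetical callable standing in for the trained HK-SVM, and every name and parameter value (`weight`, `gamma`, `degree`, `discount`, the toy states and actions) is a placeholder.

```python
import math


def hybrid_kernel(x, y, weight=0.5, gamma=1.0, degree=2):
    """Assumed hybrid kernel: weight * RBF + (1 - weight) * polynomial."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    rbf = math.exp(-gamma * sq_dist)
    poly = (1.0 + sum(a * b for a, b in zip(x, y))) ** degree
    return weight * rbf + (1.0 - weight) * poly


def q_target(reward, next_state, actions, q_value, discount=0.9):
    """One-step Q-learning target: r + discount * max over a' of Q(s', a')."""
    best_next = max((q_value(next_state, a) for a in actions), default=0.0)
    return reward + discount * best_next


def greedy_action(state, actions, q_value):
    """Pick the action with the largest approximated Q-value in this state."""
    return max(actions, key=lambda a: q_value(state, a))


# Toy stand-in for the trained approximator (hypothetical values).
toy_q = {("s1", "swap"): 2.0, ("s1", "insert"): 3.0}


def q_value(state, action):
    return toy_q.get((state, action), 0.0)


moves = ["swap", "insert"]
target = q_target(reward=1.0, next_state="s1", actions=moves,
                  q_value=q_value, discount=0.5)
chosen = greedy_action("s1", moves, q_value)
```

In each training iteration, targets of this kind would serve as regression labels for refitting the approximator, after which the refitted approximator supplies the Q-values for the next round of action selection.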

Acknowledgements

The authors thank the Editor-in-Chief and Professor P. G. Maropoulos, the Associate Editor, and Mr Martin McDonald, the three anonymous reviewers and Professor Li Lu for their valuable comments and suggestions.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by a key program of the National Natural Science Foundation of China under grant 60934008, the Fundamental Research Funds for the Central Universities of China under grant 2242014K10031, a Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions and the Jiangsu Province University Natural Science Research Project under grant 13KJB460005.

References

1. Yan, Liu. Knowledgeable manufacturing system – a new kind of advanced manufacturing system. Comput Integr Manuf 2001; 7: 7–11 (in Chinese).
2. Christophe, Bernard, Coatanéa. RFBS: a model for knowledge representation of conceptual design. CIRP Ann: Manuf Techn 2010; 59: 155–158.
3. Alizon, Shooter, Simpson. Reuse of manufacturing knowledge to facilitate platform-based product realization. J Comput Inf Sci Eng 2006; 6: 170–178.
4. Cheung, Bramall, Maropoulos. Organizational knowledge encapsulation and re-use in collaborative product development. Int J Comp Integ M 2006; 19: 736–750.
5. Maropoulos, Bramall, Chapman. Digital enterprise technology in production networks. Int J Adv Manuf Tech 2006; 30: 911–916.
6. Maropoulos, Zhang, Chapman. Key digital enterprise technology methods for large volume metrology and assembly integration. Int J Prod Res 2007; 45: 1539–1559.
7. Maropoulos, Kotsialos, Bramall. A theoretical framework for the integration of resource aware planning with logistics for the dynamic validation of aggregate plans within a production network. CIRP Ann: Manuf Techn 2006; 55: 482–488.
8. Xie. Recent development of knowledge-based systems, methods and tools for one-of-a-kind production. Knowl-Based Syst 2011; 24(7): 1108–1119.
9. Wan, Yan. Integrated scheduling and self-reconfiguration for assembly job shop in knowledgeable manufacturing. Int J Prod Res 2015; 53: 1746–1760.
10. Yan, Wan, Xiong. A hybrid electromagnetism-like algorithm for two-stage assembly flow shop scheduling problem. Int J Prod Res 2014; 52: 5626–5639.
11. Yang, Yan. An adaptive approach to dynamic scheduling in knowledgeable manufacturing cell. Int J Adv Manuf Tech 2009; 42: 312–320.
12. Yan, Jiang, Shi. An iterative learning method for multi-cycle flexible production/inventory control under random demands. J Intell Fuzzy Syst 2013; 26: 2591–2607.
13. Yan. A new complicated knowledge representation approach based on knowledge meshes. IEEE T Knowl Data En 2006; 18: 47–62.
14. Garey, Johnson, Sethi. The complexity of flow-shop and job-shop scheduling. Math Oper Res 1976; 1: 117–129.
15. Ogbu, Smith. Simulated annealing for permutation flow shop problem. Omega: Int J Manage S 1990; 19: 64–67.
16. Reeves, Yamada. Genetic algorithms, path relinking, and the flow shop sequencing problem. Evol Comput 1998; 6: 45–60.
17. Ahmadizar. A new ant colony algorithm for make-span minimization in permutation flow shops. Comput Ind Eng 2012; 63: 355–361.
18. Yin. A discrete artificial bee colony algorithm with composite mutation strategies for permutation flow shop scheduling problem. Sci Iran 2012; 19: 1921–1935.
19. Grabowski, Wodecki. A very fast tabu search algorithm for the permutation flow-shop problem with make-span criterion. Comput Oper Res 2004; 31: 1891–1909.
20. Kim, Lee. Heuristic algorithms for re-entrant hybrid flow shop scheduling with unrelated parallel machines. Proc IMechE, Part B: J Engineering Manufacture 2009; 223: 433–442.
21. Marimuthu, Ponnambalam, Jawahar. Tabu search and simulated annealing algorithms for scheduling in flow shops with lot streaming. Proc IMechE, Part B: J Engineering Manufacture 2007; 221: 317–331.
22. Huang, Luo. Hybrid flowshop assembly scheduling for one-of-a-kind production. In: Proceedings of international conference on computers and industrial engineering, Hong Kong, 16–18 October 2013, pp.784–803. Hong Kong: Curran Associates, Inc.
23. Chan FTS. Evaluation of combined dispatching and routeing strategies for a flexible manufacturing system. Proc IMechE, Part B: J Engineering Manufacture 2002; 216: 1033–1331.
24. Zobolas, Tarantilis, Ioannou. Minimizing make-span in permutation flow-shop scheduling problems using a hybrid meta-heuristic algorithm. Comput Oper Res 2009; 36: 1249–1267.
25. Luo, Huang, Zhang. Two-stage hybrid batching flowshop scheduling with blocking and machine availability constraints using genetic algorithm. Robot CIM: Int Manuf 2009; 25: 962–971.
26. Zheng, Yamashiro. Solving flow shop scheduling problems by quantum differential evolutionary algorithm. Int J Adv Manuf Tech 2010; 49: 5–8.
27. Rao, Kalyankar. Multi-objective multi-parameter optimization of the industrial LBW process using a new optimization algorithm. Proc IMechE, Part B: J Engineering Manufacture 2012; 226: 1018–1025.
28. Pan, Wang, Gao. A chaotic harmony search algorithm for the flow shop scheduling problem with limited buffers. Appl Soft Comput 2011; 11: 5270–5280.
29. Agarwal, Colak, Eryarsoy. Improvement heuristic for the flow-shop scheduling problem: an adaptive-learning approach. Eur J Oper Res 2006; 169: 801–815.
30. Lee, Michael. A neural-net approach to real time flow-shop sequencing. Comput Ind Eng 2000; 38: 125–147.
31. Solimanpur, Vrat, Shankar. A neural-tabu search heuristic for flow-shop scheduling problem. Comput Oper Res 2004; 31: 2151–2164.
32. Yin. An opposition-based differential evolution algorithm for permutation flow shop scheduling based on diversity measure. Adv Eng Softw 2013; 55: 10–31.
33. Xie, Zhang, Shao. An effective hybrid teaching–learning-based optimization algorithm for permutation flow shop scheduling problem. Adv Eng Softw 2014; 77: 35–47.
34. Wang, Sawhney. Reinforcement learning for joint pricing, lead-time and scheduling decisions in make-to-order systems. Eur J Oper Res 2012; 221: 99–109.
35. Xanthopoulos, Koulouriotis, Tourassis. Intelligent controllers for bi-objective dynamic scheduling on a single machine with sequence-dependent setups. Appl Soft Comput 2013; 13: 4704–4717.
36. Aydin, Öztemel. Dynamic job-shop scheduling using reinforcement learning agents. Robot Auton Syst 2000; 33: 169–178.
37. Stefan. Flow-shop scheduling based on reinforcement learning algorithm. Prod Syst Inf Eng 2003; 1: 83–90.
38. Brucker. Scheduling algorithms. Berlin: Springer-Verlag, 2006.
39. Mitchell. Machine learning. Beijing, China: China Machine Press, 2003.
40. Haykin. Neural networks: a comprehensive foundation. Beijing, China: China Machine Press, 2004.
41. Zhang, Gong. Integrating grey relational analysis and support vector machine for performance prediction of modular configured products. Proc IMechE, Part B: J Engineering Manufacture 2013; 227: 1218–1231.
42. Cristianini, Shawe. An introduction to support vector machines and other kernel-based learning methods. Beijing, China: China Machine Press, 2005.
43. Taillard. Benchmarks for basic scheduling problems. Eur J Oper Res 1993; 64: 278–285.
44. Carlier. Ordonnancements a contraintes disjonctives. Rairo: Rech Oper 1978; 12: 333–351.

A scheduling procedure for a flow shop–like knowledgeable manufacturing cell with self-evolutionary features
