Abstract
This article proposes a proactive approach to the integrated problem of production planning and condition-based maintenance in an uncertain environment, that is, under the exogenous uncertainty of demand and the endogenous uncertainty of failure. Considering that the production system has limited capacity and degrades with usage, an integrated model of production planning and condition-based maintenance policy is proposed, in which the maintenance threshold and production quantities are decided proactively and simultaneously. A robust and stable solution of the complicated non-linear model is obtained by a simulation-based multi-population genetic algorithm. Numerical results show the impacts of different factors on the model and demonstrate the superiority of the developed model and algorithm in an uncertain environment.
Introduction
Manufacturing systems show a natural interdependence between production and maintenance activities, which raises the problem of how to jointly optimize production and maintenance to avoid conflicts and minimize total cost. The joint problem becomes more complicated in an uncertain environment, which manufacturers must face in pursuit of competitiveness and efficiency under increasing market pressure. Aghezzaf et al. 1 first developed an integrated model of the capacitated lot-sizing problem (CLSP) and an interval-based preventive maintenance (PM) plan for a single component, and demonstrated that the integrated model saves cost. Subsequently, several extensions based on CLSP and PM have been studied. Fitouhi and Nourelfath 2 and Lu et al. 3 modeled the integrated problem with a run-based PM plan and obtained further cost savings. Zhao et al. 4 proposed an iterative method that accounts for the expected capacity consumed by random failures. Recently, Wang et al. 5 explored the integration of CLSP and imperfect PM with consideration of nonconforming items in production. These studies generally use the expected number of failures to evaluate the impact of random failures and therefore ignore the stability of the plan in practice.
Many uncertainties exist in tactical production planning, including but not limited to uncertain demand and random failures, and both can affect the service level. On the one hand, random failures caused by system degradation may lead to low productivity and increased downtime. On the other hand, uncertain demand increases the number of uncompleted orders and lowers the service level. Appropriate maintenance actions and production plans can improve system availability and keep the manufacturer competitive. Thus, a robust production plan and a corresponding maintenance policy are needed to reduce the impact of the uncertainties and minimize the total cost.
To deal with these uncertainties, many production systems have adopted proactive approaches, which develop an initial plan to guide production activities by incorporating knowledge of the uncertainty at the decision stage, and adjust the initial plan through reactive procedures to maintain its feasibility and performance when unforeseen situations occur. A deterministic plan constructed through a predictive approach has the optimum expected performance, but it is unlikely to remain optimal once the uncertainties are realized. The objective of the reactive procedures is therefore to generate a plan that is feasible for the present situation and deviates as little as possible from the initial baseline plan, which is also referred to as the stability objective.
Several studies have dealt proactively with failure uncertainty in production problems. Goren and Sabuncuoglu 6 generated robust and stable schedules in a single-machine environment subject to machine breakdowns. The authors developed two surrogate measures for robustness and stability, which considered both busy and repair time distributions. Al-Hinai and ElMekkawy 7 addressed the problem of finding robust and stable solutions for the flexible job shop scheduling problem with random machine breakdowns. Cui et al. 8 and Lu et al. 9 proactively decided the production schedule and PM simultaneously to optimize the bi-objective of robustness and stability, considering that breakdowns affect the stability of the machine. However, relatively little attention has been given to failure uncertainty at the tactical level. In contrast, demand uncertainty in tactical production planning has received much attention. Jing et al. 10 developed a fuzzy mixed-integer linear programming model, which considers the uncertainties of market demands, to address a capacitated dynamic lot-sizing problem with remanufacturing. Guan et al. 11 proposed a branch and cut algorithm to solve stochastic uncapacitated lot-sizing problems under demand uncertainty. Aghezzaf et al. 12 presented and discussed three different models to generate robust tactical production plans in a multi-stage production system with random demand. An extensive review of production planning under uncertainty can be found in Mula et al. 13
The condition of a degrading system can be monitored continuously or discretely. Based on the observed condition, proper maintenance can be carried out to restore the system; this is called condition-based maintenance (CBM). The modeling and optimization of CBM have been widely investigated in the literature. In many models, the control limits and/or inspection intervals of CBM are optimized to minimize the long-run cost. 14–17 Cox’s proportional hazards model (PHM), which considers both system age and condition variables, is one of the most commonly used models for describing a degrading system in the CBM framework. Several works apply the PHM to derive the optimal control-limit maintenance policy. 18–22 Compared with PM, CBM monitors the actual system condition and is therefore more effective in reducing the cost and uncertainty of a stochastically degrading system. 23,24
The purpose of this article is to jointly optimize the CBM policy and production planning for a degrading system considering uncertain demand and random failures. The motivation for this work is the need for better production planning and maintenance policy under the uncertain environment in real industrial applications. In fact, it is more beneficial to make real-time PM decisions based on the inspection result every period. The main contribution of this article is to develop a proactive approach for the integration problem of production planning and CBM to obtain a robust plan under uncertain environment. As far as we know, no research integrating production and maintenance has considered the fact that the uncertainty of failure and demand can affect the stability of the production plan. In addition, a multi-population genetic algorithm (MPGA) with discrete event simulation is developed to solve this stochastic programming problem.
The remainder of this article is organized as follows. “Problem statement” presents the statement of the problem. The formulation of the integration model is developed in “Model formulation.” The solution method is proposed in “Solution algorithm.” “Numerical results and discussion” presents a case study and some comparison experiments to demonstrate the performance of the model. Conclusions are drawn in “Conclusion.”
Problem statement
Many manufacturing plants have to account for uncertainty in their tactical planning problems. Suppose that a production system is required to produce a set of products during a finite time horizon. The demands of the products are assumed to be mutually independent, and their probability distributions can be estimated from past experience. Furthermore, since the system is subject to failure, PM is needed to reduce failures and improve availability. Therefore, managers must determine not only the production quantity of each product in each period, but also the corresponding maintenance actions, in order to minimize production and maintenance costs under the uncertain environment.
The PHM is used to describe the degradation of the system and calculate the system hazard rate. The system is assumed to degrade with operating time. Thus, random failure in production depends on the production decisions, which means that it becomes endogenous uncertainty. 25 The relationship between system degradation and production quantity will be discussed in detail in the next section. Whenever a random failure occurs, a corrective maintenance (CM) is conducted immediately to restore it without changing the system’s usage and condition. Continuous monitoring of the system is impossible due to economic and technological constraints. So, inspections and PM decisions are made at the beginning of every period, and the inspections are assumed to be instantaneous and perfect. Based on the inspection information, the hazard rate is calculated. If the hazard rate exceeds a predetermined threshold, a PM is carried out to restore the system to the “as good as new” state.
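The inspection rule described above can be illustrated with a minimal Python sketch. It assumes the common Weibull PHM form, a Weibull baseline hazard multiplied by an exponential function of a single condition covariate; the parameter names (`beta`, `eta`, `gamma`), their default values and the reset of the covariate to zero after PM are illustrative assumptions, not values or definitions taken from this article.

```python
import math


def hazard_rate(usage, covariate, beta=2.0, eta=100.0, gamma=0.5):
    """Weibull PHM hazard: Weibull baseline hazard times exp(gamma * covariate).

    beta (shape), eta (scale) and gamma (covariate weight) are assumed values.
    """
    baseline = (beta / eta) * (usage / eta) ** (beta - 1)
    return baseline * math.exp(gamma * covariate)


def inspect_and_decide(usage, covariate, threshold):
    """Inspection at the beginning of a period.

    Returns (do_pm, usage_after, covariate_after). If the hazard rate exceeds
    the threshold, PM restores the system to the "as good as new" state,
    modeled here (as an assumption) by resetting usage and covariate to zero.
    """
    if hazard_rate(usage, covariate) > threshold:
        return True, 0.0, 0.0
    return False, usage, covariate


# Example: a heavily used system triggers PM, a lightly used one does not.
print(inspect_and_decide(100.0, 0.0, threshold=0.01))
print(inspect_and_decide(10.0, 0.0, threshold=0.01))
```

A corrective maintenance action would leave `usage` and `covariate` unchanged, which is why only the PM branch resets the state.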
In practice, exact demand and failures are discovered only after the production plan is put into execution. Unsatisfied demand will occur when the system capacity is not enough to execute the initial plan or the real demand is increased. Meanwhile, redundant items will be produced when the real demand is reduced. Managers must react quickly to the changes by carrying additional storage or increasing capacity by outsourcing, which would result in additional costs.
Model formulation
Degradation model and maintenance model
The hazard rate is widely used in reliability theory to represent the condition of a system. A PHM that combines a Weibull baseline hazard rate function with a system state covariate is adopted in this batch production model. The hazard rate is the product of a Weibull baseline hazard function of the system usage and an exponential function of the degradation process, which is shown as follows
where
where
To avoid disruptions in production, the inspection is carried out at the beginning of every period. The hazard rate can be calculated by equation (1) using the usage time
Here, the variable
Although the CBM policy can decrease failure probability and increase the system reliability, random failures may happen in production since the hazard rate is not 0. CM is conducted as soon as a failure occurs to resume production without changing the system’s usage time and degradation state. After CM, the production continues till the batch is completed.
Considering the random failure numbers in every period, the total maintenance time in each period can be defined as
Considering the inspection cost, the total maintenance cost in each period can be written as
Production model
In this problem, a set of products
Other constraints of production model are shown in the integrated model in next sub-section.
Integrated model
Let
Minimize
Subject to equations (1)–(6)
The objective of the joint model is to minimize the sum of expected total cost and expected deviation cost. The total cost is the sum of realized production cost and maintenance cost in every period in Constraint (8). The sum of deviation cost is calculated in Constraint (9). Constraints (10) and (11) are the formulas of cumulative deviation between the planned production quantity and realized demand. Constraints (12) and (16) are the setup constraints. The capacity constraints for the initial plan and the real plan are shown in Constraints (13) and (15), respectively. Constraint (14) is the formula of total production time in every period. Constraint (17) is the standard inventory balance equation of the real production plan. Constraint (18) represents the CBM policy. Constraint (19) is the iterative formula of the system usage time at the beginning of every period. Constraint (20) represents the mapping relationship between the initial plan and the realized plan based on the maintenance time and corresponding constraints, which is explained in detail in Algorithm 3. Constraints (21) and (22) represent non-negative, binary and initialization variables, respectively.
Solution algorithm
Preliminary analysis
The aim is to construct an initial production plan and a maintenance policy that minimize the cost (7) for a capacitated system with known demand distributions and degradation process. Obviously, the integrated problem of production and maintenance is very complicated even assuming the demand is deterministic. Although there are several algorithms in literature to solve the deterministic problem efficiently, it is difficult to find the optimal decisions to guarantee the robustness and stability of the plan at the same time.
Considering that demand variables are independent of each other, the stability of the plan is only decided by the planned production quantity. The predictive demands that minimize the deviation cost
If the predictive demand can be perfectly satisfied by the planned production quantity, in other words, the predictive demand is used as deterministic demand to decide the initial production plan, then the minimal deviation cost is achieved. The value for
After the predictive demand is determined, the only uncertainty in this problem when deriving an initial plan is the random failure. The hazard rate
where
The expected failure numbers
Since the degradation level
where
Based on the equations (25) and (27), the expected failure numbers, which are used to evaluate the maintenance cost and time to derive an initial plan on the basis of deterministic demand and known maintenance threshold, can be calculated. And, the system reliability function is written as
According to the property that has been proved by Yashin, 29 the failure cumulative distribution function becomes
Given the initial usage
Based on equation (30), the inversion sampling is used to sample the failure time to calculate the failure numbers in the simulation.
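As an illustration of inversion sampling, the sketch below samples the time to the next failure for a Weibull PHM, conditional on the current usage, and counts the failures within one production run. It makes a simplifying assumption that is not in the article: the covariate is held constant during the run, so the cumulative hazard has the closed form exp(gamma * z) * (t / eta)^beta. All parameter names and values are hypothetical.

```python
import math
import random


def sample_time_to_failure(usage, covariate, rng, beta=2.0, eta=100.0, gamma=0.5):
    """Inversion sampling of the time to next failure, given survival to `usage`.

    Solves 1 - exp(-(H(usage + t) - H(usage))) = U for t, where
    H(t) = exp(gamma * covariate) * (t / eta) ** beta (covariate assumed constant).
    """
    scale = math.exp(gamma * covariate)
    u = rng.random()  # uniform on [0, 1)
    target = (usage / eta) ** beta - math.log(1.0 - u) / scale
    return eta * target ** (1.0 / beta) - usage


def count_failures(usage, covariate, run_time, rng, **params):
    """Count failures during a run of length `run_time`.

    Corrective maintenance is treated as minimal repair: it does not change
    the usage, so the next failure is sampled from the usage at failure.
    """
    end = usage + run_time
    current = usage
    failures = 0
    while True:
        current += sample_time_to_failure(current, covariate, rng, **params)
        if current >= end:
            return failures
        failures += 1


rng = random.Random(42)
print(count_failures(usage=0.0, covariate=0.0, run_time=100.0, rng=rng))
```

Under these assumptions the number of failures in a run is Poisson distributed with mean H(usage + run_time) - H(usage), which gives a quick sanity check for the sampler.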
MPGA
As the formulated model is a complicated stochastic mixed-integer program with endogenous uncertainty, it cannot be solved effectively by commercial solvers. Thus, this article develops an MPGA, an approach that has proved effective for combinatorial optimization problems. 30–32 The flowchart of the proposed MPGA is shown in Figure 1.

Flowchart of the proposed MPGA.
Coding strategy
The computational complexity of the integrated model forces us to use simulation instead of an analytical method to evaluate the objective function (7). Before performing the simulation, an initial plan is needed to be a baseline, in which the planned production quantity and maintenance threshold are the needed decision variables. Based on the previous analysis, the initial plan, with the objective to minimize the expected total cost

An example of the chromosome representation.
Population initialization
In MPGA, three different subpopulations are initialized with different methods. Each subpopulation contains
Selection, crossover and mutation
The roulette wheel selection method is adopted to pick up the parent chromosomes to generate the next generation. The selecting probability of a chromosome
in which
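Roulette wheel selection can be sketched as follows. Because the problem is a cost minimization, the sketch assumes that the fitness of a chromosome is the reciprocal of its (positive) objective value; this fitness definition is an assumption made for illustration and may differ from the selection probability used in this article.

```python
import random


def roulette_select(costs, rng):
    """Pick one chromosome index with probability proportional to its fitness.

    `costs` holds positive objective values; fitness is assumed to be 1 / cost,
    so cheaper plans are selected more often.
    """
    fitness = [1.0 / cost for cost in costs]
    pick = rng.random() * sum(fitness)
    cumulative = 0.0
    for index, value in enumerate(fitness):
        cumulative += value
        if pick < cumulative:
            return index
    return len(costs) - 1  # guard against floating-point round-off


rng = random.Random(7)
parents = [roulette_select([120.0, 100.0, 300.0], rng) for _ in range(2)]
print(parents)
```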
Upgrading and immigration
The best chromosomes from all subpopulations are recorded as the elite population, which is updated in every generation to improve the fitness value. This is the upgrading process. To make the subpopulations co-evolve, an immigration strategy is used to exchange chromosome information between different subpopulations. For each subpopulation, the best chromosome is chosen to replace the worst one in the next subpopulation.
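The upgrading and immigration steps can be sketched in a few lines of Python. The sketch assumes that the subpopulations form a ring, so that the last subpopulation sends its best chromosome to the first, and that chromosomes are immutable tuples; both are illustrative assumptions, and `cost` is a hypothetical evaluation function.

```python
def migrate(subpopulations, cost):
    """Ring immigration: the best of subpopulation i replaces the worst of i + 1.

    The best chromosomes are recorded before any replacement, and are returned
    as the elite population of this generation.
    """
    bests = [min(population, key=cost) for population in subpopulations]
    for i, best in enumerate(bests):
        target = subpopulations[(i + 1) % len(subpopulations)]
        worst_index = max(range(len(target)), key=lambda j: cost(target[j]))
        target[worst_index] = best
    return list(bests)


# Toy example: the cost of a chromosome is simply its first gene.
subpopulations = [[(1,), (9,)], [(2,), (8,)], [(3,), (7,)]]
elite = migrate(subpopulations, cost=lambda chromosome: chromosome[0])
print(elite)
print(subpopulations)
```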
Stopping criteria
If the best fitness value in the elite population does not change for a given number of consecutive generations, or the total number of generations reaches a given upper bound, the algorithm terminates, and the latest best fitness value and its corresponding chromosome are considered as the outputs of the MPGA.
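The two stopping criteria can be expressed as a small predicate. In this sketch, `best_history` is assumed to hold the best elite fitness value of each generation so far; the parameter names `max_stall` and `max_generations` are hypothetical.

```python
def should_stop(best_history, max_stall, max_generations):
    """Return True when either stopping criterion is met.

    Criterion 1: the generation count has reached `max_generations`.
    Criterion 2: the best value is unchanged over the last `max_stall`
    consecutive generations (the last max_stall + 1 recorded values are equal).
    """
    generations = len(best_history)
    if generations >= max_generations:
        return True
    if generations > max_stall:
        recent = best_history[-(max_stall + 1):]
        return all(value == recent[0] for value in recent)
    return False


print(should_stop([5.0, 4.0, 4.0, 4.0], max_stall=2, max_generations=100))
print(should_stop([5.0, 4.0, 4.0], max_stall=2, max_generations=100))
```

Because the fitness is estimated by simulation, a practical implementation might compare values within a small tolerance instead of testing exact equality.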
Decoding algorithm
Because the minimum deviation cost in the objective function is obtained when the predictive demands are met, the main purpose of the decoding algorithm is to use the predictive demands and the fixed maintenance threshold to generate an initial plan
First, the original model is transformed into an integrated production and maintenance model (M1) under deterministic demand and random failures. The difficulty is that the expected maintenance time depends on the production time in every period, resulting in a chicken-and-egg conundrum between production quantities and maintenance time. With the objective of minimizing the expected total cost, M1 can be solved by using the iterative algorithm in Zhao et al. 4 to generate an initial plan. The procedure is described in Algorithm 2, in which the simplified model is solved by CPLEX. By this algorithm, a near-optimal plan can be found without constraint violation. The convergence of Algorithm 2 is illustrated in Zhao et al. 4
After determining the initial plan
Numerical results and discussion
Parameters selection
In this article, all experiments were performed on a computer with a 3.6 GHz Intel Core i7 processor and 12 GB of RAM, with the algorithms implemented in MATLAB. The investigated degradation system has the characteristics given in Table 1. The planning horizon is composed of
Parameters of the degradation system.
Parameters of the three products in production.
Since sampling and simulation are embedded in the procedure of MPGA, the computational time will depend mainly on the sample size
where
Three test instances with the parameters given in Tables 1 and 2 are generated to determine the sample size. For each instance with different decision variables, let

The cumulative mean value and relative deviation of the three instances.
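One way to read such a convergence plot is sketched below. The sketch assumes that the relative deviation of a sample size n is the relative gap between the cumulative mean after n samples and the mean of the full sample; this definition and the tolerance value are assumptions for illustration, since the criterion used in this article may differ.

```python
def cumulative_means(samples):
    """Running mean of simulated costs after 1, 2, ..., len(samples) samples."""
    means, total = [], 0.0
    for count, value in enumerate(samples, start=1):
        total += value
        means.append(total / count)
    return means


def smallest_stable_size(samples, tolerance=0.01):
    """Smallest n from which the cumulative mean stays within `tolerance`
    (relative) of the full-sample mean. The full-sample mean must be nonzero."""
    means = cumulative_means(samples)
    reference = means[-1]
    size = len(means)
    for n in range(len(means), 0, -1):
        if abs(means[n - 1] - reference) / abs(reference) > tolerance:
            break
        size = n
    return size


costs = [10.0, 12.0, 11.0, 11.0, 11.0, 11.0]
print(cumulative_means(costs))
print(smallest_stable_size(costs))
```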
Results and sensitivity analysis
In this proposed example, the capacity utilization

The evolution process and convergence of MPGA.
When considering the influence of system parameters on the expected costs, the main concerns are the parameters of demand variability, degradation rate and capacity tightness. The value of
Fifteen instances with different parameters are realized with the developed MPGA. The expected realized total cost, the expected deviation cost and the objective value are compared under different factors. In Figure 5, there is a visible trend that all costs are increasing with the increase of factor

The impact of different demand variability factor

The impact of different deterioration rate factor

The impact of different capacity tightness factor
Performance analysis on the algorithm
The performance of the proposed MPGA is compared with that of a standard genetic algorithm (GA), and a simulated annealing (SA) 33 on several instances. According to the results from the previous experiments, it is confirmed that the robustness and stability of the solution are mainly affected by demand variability and capacity tightness, so the comparison tests are conducted under different demand variabilities and capacity tightness. Three different levels of
Table 3 reports the expected realized total cost, the expected deviation cost and the objective function value of the three algorithms. MPGA outperforms the other algorithms in all comparative tests, and GA outperforms SA, which indicates that the global search ability of GA has a great influence on the solution of this problem. These results validate the good performance of the proposed MPGA in solving this model. With the increase of factor
Results of the three algorithms under different
MPGA: multi-population genetic algorithm; GA: genetic algorithm; SA: simulated annealing.
Improvement of MPGA over other algorithms in cost saving under different
Performance analysis on the model
This part of experiment is the numerical investigation on the proposed integrated model. It has two parts: the first part concerns the performance of the model in uncertain environment, and the second part compares the performance of the model with that of a separate decision model.
To investigate the performance under uncertainty, three methods are compared. The first is the proposed proactive approach (M1), which aims to obtain a robust and stable plan under an uncertain environment. The second method (M2) solves a deterministic model based on the mean demands to obtain a baseline plan and maintenance threshold, and these solutions are then simulated to calculate the expected real costs and deviation costs. The third method (M3) is a posterior method that provides a lower bound for the original problem: decisions are determined from the real demand and failure information of a particular instance. For comparison with the previous two methods, each test solves 100 instances with different perfect information and calculates the expected real costs (no deviation costs) of these instances. The corresponding total costs of the three methods are denoted as Z1, Z2 and Z3. The gap between Z1 and Z2 evaluates the value of the stochastic solution, and the gap between Z1 and Z3 evaluates the value of perfect information. 34
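For a minimization problem, these two gaps are conventionally written as follows. The abbreviations VSS and EVPI are standard in stochastic programming but are not used in this article, and the sign convention below is an assumption based on the expected ordering of the three costs.

```latex
\mathrm{VSS} = Z_2 - Z_1, \qquad \mathrm{EVPI} = Z_1 - Z_3, \qquad \text{with } Z_3 \le Z_1 \le Z_2 \text{ expected.}
```

A larger VSS indicates a larger benefit from modeling the uncertainty explicitly instead of planning on mean demands. Since Z3 contains no deviation costs, the EVPI computed this way also includes the cost of deviating from the baseline plan.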
Table 5 illustrates the experiment results for nine different tests under different factors of
Comparison results with different decision models under different
The second part of the experiment compares the performance of the proposed integration model with a separate decision (M4) method, in which the manufacturer decides the production plan based on a predetermined maintenance threshold. The maintenance threshold is individually optimized to minimize the average long-run maintenance cost with the objective function
Comparison results between integration model and separate decision method under different
From Table 6, it can be found that both robustness and stability of the integration model are better than the separate decision method. When comparing the relative improvement of the joint model that is calculated as
Conclusion
In this article, a stochastic optimization framework is proposed to deal with the integration problem of batch production and CBM with uncertain demand and random failures. The uncertainty of the input parameters and the information about the reactive policy followed at execution time are considered proactively at the decision stage. The integration model determines the production quantity for each product and the maintenance policy simultaneously to form the complete initial plan. The deterministic CLSP is already non-deterministic polynomial-time (NP)-hard, so the integration problem with CBM under an uncertain environment is even more complex. A simulation-based MPGA is developed to optimize the problem for high robustness and stability. The algorithm is implemented and tested against other algorithms in different experiments and is shown to be efficient in obtaining good initial plans. The economic benefits in both robustness and stability under an uncertain environment validate the necessity of considering the uncertainties in decision-making. This methodology is suitable for production and maintenance decisions in manufacturing systems with multi-product, multi-period uncertain demand, which are common in the semiconductor, automotive and electronics industries.
Other interesting directions for further research remain in the integration of production and maintenance. For example, if the reactive policy is changed to a re-planning policy, the decision procedure will be more complex, but the performance may be better.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by National Natural Science Foundation of China (No. 61473211, 71171130).
