
Abstract

Tool wear monitoring is critical for ensuring product quality and productivity. This article presents a novel tool wear prediction model based on an improved least squares support vector machine method, combined with the leave-one-out technique and the Nelder–Mead technique. Leave-one-out is applied to tune the regularization factor and radial basis function kernel parameter of the least squares support vector machine, enhancing the global search ability. Nelder–Mead is applied to raise the local search ability. The optimized least squares support vector machine based tool wear prediction model is constructed by learning the highly nonlinear relationships between tool cutting conditions and actual tool wear. The effectiveness of the proposed prediction model is validated by experiments. Compared with the particle swarm optimization algorithm-based least squares support vector machine and the basic least squares support vector machine, the Nelder–Mead-leave-one-out-based least squares support vector machine demonstrates better performance in prediction accuracy, generalization, robustness, and convergence. The average accuracy obtained in tests for tool wear prediction is above 97%. This model provides a theoretical basis for configuring machining conditions in actual processing.

Keywords

Tool wear, monitoring method, support vector machines

Introduction

Metal cutting plays a core role in manufacturing. Cutting tool wear is inevitable during metal cutting; it reduces workpiece quality and damages the cutting tool. The performance, quality, and management of tools directly affect the stability of the machining process, product reliability, processing time, production efficiency, and so on. In actual processing, the cutting tool and the workpiece undergo intense friction under high-temperature, high-pressure working conditions. Wear in the contact area takes a variety of complex forms, including front and rear flank wear, abrasion border, fatigue wear, and breakage. Tool wear directly affects machining quality and can even affect the normal operation of the whole processing system. Research shows that tool condition monitoring can not only improve the tool utilization rate but also avoid workpiece scrap and equipment failures caused by tool damage.¹ Both tool wear condition monitoring and the timely replacement of failed cutting tools are vital to machining system safety, cutting performance, production costs, and labor productivity. Therefore, tool wear conditions should be monitored, with useful information extracted and tool wear status analyzed, to reduce production costs and production failures and to improve production efficiency. Based on tool wear condition monitoring, the degree of tool wear can be predicted from the tool’s historical wear conditions, so that timely measures can be taken before the tool becomes blunt. An accurate tool wear prediction can help optimize the machining system and improve production efficiency by preventing damage to machine tools and workpieces.

In recent years, perspective monitoring technologies and industrial high-speed camera-based systems have been proposed for direct online tool condition monitoring.^2–4 This approach has the advantage in visually identifying appearance changes by the cutting tool geometry. However, implementations of these direct measurement systems in harsh industrial environments are restricted by machining cost and additional sensors. Indirect measurement systems monitor tool condition by modeling relationships between tool wear and sensory signals in machining processes (e.g. force, vibration, and acoustic emission).^5–7 Compared to the direct monitoring method, the biggest advantage of the indirect method is that machining is monitored in real time. In order to recognize tool wear occurrence in turning operations, Rangwala and Dornfeld⁸ adopted neural networks to integrate information of multiple cutting parameters. The superior learning and noise suppression abilities of these networks are effective in recognizing tool wear under a range of machining conditions. Sick⁹ evaluated 138 publications dealing with online and indirect tool wear monitoring in turning by means of artificial neural networks and compared the methods applied in these publications as well as methodologies to select methods. Boutros and Liang¹⁰ established a discrete hidden Markov model (HMM) to detect and diagnose mechanical faults. The success rate obtained in tests for fault severity classification was above 95%. In addition to the fault severity, a location index was developed to determine the fault location. Aliustaoglu et al.¹¹ studied development of a tool wear condition-monitoring technique based on a two-stage fuzzy logic scheme, whereby signals acquired from various sensors are processed to make a decision about tool status.

Nonetheless, most previous studies predict tool wear from cutting force and cutting vibration signals, for which it is difficult to determine monitoring thresholds and to adapt the feature information to changes in cutting parameters. Consequently, the prediction of tool condition is not ideal. Although these techniques achieve excellent results under limited conditions, cutting parameters that lead to successful predictions of tool wear under certain conditions are likely to change under different conditions.¹² Tool wear is very closely related to cutting parameters. Different cutting parameters reflect different machining conditions and also different tool wear conditions. Such a model can be trained easily with experimental data, but the requirements become much stricter in actual production, which results in large time consumption. Therefore, a quantitative mathematical model that relates tool wear condition to cutting parameters is required. This implies that the system must be able to monitor the tool wear condition effectively over the whole tool life cycle while improving flexibility, robustness, and generality.

Statistical learning theory–based support vector machines (SVM) is a new achievement in the data-driven modeling field, which has been implemented successfully in areas such as classification, regression, and function estimation.^13–15 Theoretically, SVM ensures the maximum generalization ability of the model, especially in dealing with small sample, nonlinear and high dimensional pattern recognition problems. To a large degree, SVM can overcome the “dimension disaster” and over fitting problem by minimizing structural risk, which has been widely used in pattern recognition, function fitting, time series modeling, and so on. Experiments show that convergence speeds of particle swarm optimization (PSO) algorithm and genetic algorithm (GA) are slow when optimizing the SVM parameters, albeit they can reach high prediction accuracy. Least squares support vector machines (LSSVM), as a kind of extension of SVM, operates rapidly and takes up less computing resources.

The objective of this article is to develop a tool wear condition-monitoring model for tool wear prediction. Based on the cutting condition parameters during turning operations, LSSVM is used to evaluate the tool wear condition. The rest of this article is organized as follows. LSSVM is adopted to model relationships between tool wear and cutting condition parameters in section “LSSVM-based tool wear prediction model.” In section “Leave-one-out-based LSSVM optimization,” leave-one-out (LOO) technique is used to optimize parameters c and $σ$ of LSSVM. The Nelder–Mead (NM) algorithm is introduced in section “NM simplex search method.” And NM algorithm is then used to further optimize the two parameters of LSSVM in section “NM LOO-based LSSVM-based tool wear prediction model.” The model of tool wear condition monitoring is established in section “Experiments and discussions,” and experiments are conducted to verify the algorithm’s effectiveness and robustness. Finally, section “Conclusion” concludes this article.

LSSVM-based tool wear prediction model

SVM is a novel machine-learning tool which is especially useful for classifications and predictions with small sample cases.¹⁶ It is inspired by statistical learning theory leading to a class of algorithms characterized using nonlinear kernels, high generalization ability, and the sparseness of the solution. Unlike the classical neural networks approach, the learning problem formulation of SVM leads to quadratic programming (QP) with linear constraints. However, the size of matrix involved in the QP problem is proportional to the number of training points. Hence, to reduce the complexity of optimization processes, a modified version called LSSVM is proposed by taking equality instead of inequality constraints to obtain a linear set of equations instead of a QP problem in the dual space.^17,18 Instead of solving a QP problem as in SVM, LSSVM can obtain the solutions of a set of linear equations. The formulation of LSSVM is introduced as follows. Consider a given training set ${(x_{i}, y_{i}), i = 1, 2, \dots, l}$ with input data $x_{i} (x_{i} \in R^{d})$ and output data $y_{i} (y_{i} \in R)$ , where l is the number of data samples. The following regression model¹⁸ can be constructed using nonlinear mapping function $φ (x)$ as equation (1)

y (x) = w^{T} φ (x) + b

(1)

where w is the weight vector and b is the bias term. By mapping original input data into a high-dimensional space, the nonlinear separable problem becomes linearly separable in space. Afterward, the following cost function is formulated in the framework of empirical risk minimization as equation (2)

min_{w, b, ξ} J (w, b, ξ) = \frac{1}{2} w^{T} w + \frac{1}{2} c \sum_{i = 1}^{l} ξ_{i}^{2}

(2)

which is subjected to equality constraints

y_{i} = w^{T} ϕ (x_{i}) + b + ξ_{i}, i = 1, 2, \dots, l

(3)

where $ξ_{i}$ is the random error and c is the regularization parameter that determines the trade-off between minimizing the training errors and minimizing the model complexity. The larger the value of c, the more severely the errors are penalized. To solve this optimization problem, the Lagrange function is constructed as follows

L (w, b, ξ_{i}, α_{i}) = J (w, ξ_{i}) - \sum_{i = 1}^{l} α_{i} [w^{T} ϕ (x_{i}) + b + ξ_{i} - y_{i}]

(4)

where $α_{i} \in R$ $(i = 1, 2, \dots, l)$ are Lagrange multipliers.

According to Karush–Kuhn–Tucker (KKT) rule, the solution of equation (4) can be obtained by partially differentiating with respect to w, b, $ξ_{i}$ , $α_{i}$

\begin{matrix} \frac{\partial L}{\partial w} = 0 \\ \frac{\partial L}{\partial b} = 0 \\ \frac{\partial L}{\partial ξ_{i}} = 0 \\ \frac{\partial L}{\partial α_{i}} = 0 \end{matrix} \Rightarrow \left\{ \begin{matrix} w = \sum_{i = 1}^{l} α_{i} ϕ (x_{i}) \\ \sum_{i = 1}^{l} α_{i} = 0 \\ α_{i} = c ξ_{i}, i = 1, 2, \dots, l \\ w^{T} ϕ (x_{i}) + b + ξ_{i} - y_{i} = 0, i = 1, 2, \dots, l \end{matrix} \right.

(5)

After eliminating w and $ξ_{i}$ , equation (5) is rewritten as follows

[\begin{matrix} 0 & {\vec{1}}^{T} \\ \vec{1} & Z Z^{T} + c^{- 1} I \end{matrix}] [\begin{matrix} b \\ α \end{matrix}] = [\begin{matrix} 0 \\ y \end{matrix}]

(6)

where

\begin{matrix} Z = {[ϕ (x_{1}), ϕ (x_{2}), \dots, ϕ (x_{l})]}^{T} \\ y = {[y_{1}, y_{2}, \dots, y_{l}]}^{T} \\ \vec{1} = {[1, 1, \dots, 1]}^{T} \\ α = {[α_{1}, α_{2}, \dots, α_{l}]}^{T} \end{matrix}

(7)

The inner-product operation $Z Z^{T}$ can be replaced by the kernel function $K (x_{i}, x_{j})$ for the nonlinear regression function. Introducing the kernel function allows LSSVM to operate in the input space directly instead of the potentially high-dimensional feature space, avoiding the “dimension disaster”. According to Mercer’s theorem, if $Ω = Z Z^{T}$ , then

Ω_{ij} = ϕ (x_{i})^{T} ϕ (x_{j}) = K (x_{i}, x_{j})

(8)

Optimal $b^{*}$ and $α^{*}$ can be obtained by solving the following linear system

\left\{ \begin{matrix} b^{*} = \frac{{\vec{1}}^{T} {(Ω + c^{- 1} I)}^{- 1} y}{{\vec{1}}^{T} {(Ω + c^{- 1} I)}^{- 1} \vec{1}} \\ α^{*} = {(Ω + c^{- 1} I)}^{- 1} (y - \vec{1} b^{*}) \end{matrix} \right.

(9)

and the result of LSSVM model can be expressed as follows

f (x) = \sum_{i = 1}^{l} α_{i} K (x_{i}, x) + b

(10)

In comparison with some other feasible kernel functions, the radial basis function (RBF) is a more compact supported kernel, and thus, it is able to reduce computational complexity of the training process and improve generalization performance of LSSVM. As a result, RBF kernel was selected as kernel function as follows

K (x_{i}, x_{j}) = \exp \left( - \frac{{\| x_{i} - x_{j} \|}^{2}}{2 σ^{2}} \right), i, j = 1, 2, \dots, l

(11)

where $σ$ is the scale factor to be tuned. If the scale parameter $σ$ approaches zero, all the components of the Lagrange multipliers are greater than 0, which means all sample points are support vectors. When $σ$ approaches infinity, the SVM discriminant function is often a constant function, and all sample points are assigned to the same class.

Finally, the structure of the regression function is shown in Figure 1. The output is a linear combination of intermediate nodes, and each intermediate node corresponds to a support vector. In this article, the input support vectors are tool machining conditions, and the output is tool wear.

Figure 1.

Structure of regression function.
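The training procedure of equations (6), (10), and (11) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the small Gaussian-elimination solver, the toy data, and the values c = 1000 and σ = 0.5 are all assumptions chosen to keep the block self-contained.

```python
import math

def rbf_kernel(a, b, sigma):
    """RBF kernel of equation (11): exp(-||a - b||^2 / (2 sigma^2))."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def solve_linear(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [A[i][:] + [rhs[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * z[k] for k in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def lssvm_fit(X, y, c, sigma):
    """Build and solve the linear system of equation (6); return (b, alpha)."""
    l = len(X)
    A = [[0.0] + [1.0] * l]  # first row: [0, 1^T]
    for i in range(l):
        row = [1.0] + [rbf_kernel(X[i], X[j], sigma) for j in range(l)]
        row[i + 1] += 1.0 / c  # the c^{-1} I term on the diagonal
        A.append(row)
    z = solve_linear(A, [0.0] + list(y))
    return z[0], z[1:]

def lssvm_predict(X, b, alpha, sigma, x_new):
    """Evaluate the regression function of equation (10) at x_new."""
    return sum(a * rbf_kernel(xi, x_new, sigma) for a, xi in zip(alpha, X)) + b

# Toy data made up for illustration: y = x^2 sampled on a small grid.
X = [[-1.0], [-0.5], [0.0], [0.5], [1.0]]
y = [1.0, 0.25, 0.0, 0.25, 1.0]
b, alpha = lssvm_fit(X, y, c=1000.0, sigma=0.5)
print(lssvm_predict(X, b, alpha, 0.5, [0.25]))
```

Solving the bordered system directly, instead of forming the explicit inverses of equation (9), gives the same b and α and makes the condition that the multipliers sum to zero easy to check on the result.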

LOO-based LSSVM optimization

In the LSSVM, regularization parameter c and RBF kernel parameter $σ$ influence the accuracy of regression prediction. Regularization parameter c controls the trade-off between model complexity and approximation error. A value of c preset from prior knowledge alone can neither track nor adapt to changes in the number and types of samples or in their mappings for regression prediction. The RBF kernel parameter reflects the distribution or range characteristics of the training data; it determines the width of the local areas and directly affects the number of support vectors.¹⁹ Therefore, it is necessary to optimize regularization parameter c and RBF kernel parameter $σ$ . Common parameter optimization methods include GA, PSO, cross-validation (CV) algorithms, grid search algorithm,^20–22 and so on. The convergence rates of GA- and PSO-based LSSVM are slow, whereas those of CV and grid search algorithms are fast. The prediction accuracy of PSO and GA is higher than that of CV and grid search algorithms.

Tuning of regularization parameter c and RBF kernel parameter $σ$ is usually accomplished by minimizing an estimate of the generalization error such as the LOO error²³ or the k-fold CV error,²⁴ or by a search method such as PSO.²⁵ The LOO error is an almost unbiased estimator of the true generalization error, whereas the k-fold CV error has a considerable bias when data are sparse; the price of LOO is its high computational cost.²⁶ This article uses LOO to optimize the two parameters of LSSVM, taking into account the sparse sample data sets and the accuracy and convergence of LSSVM, with the mean square error (MSE) as the optimality criterion. Data set D is divided into N segments $G_{n}$ $(n = 1, 2, \dots, N)$ . MSE can be defined as equation (12)

MS E_{CV} = \sqrt{\frac{1}{l} \sum_{n = 1}^{N} \sum_{i \in G_{n}} {(y_{i} - y (x_{i} | θ_{n}^{*}))}^{2}}

(12)

where $G_{n}$ is the test data and $θ_{n}^{*}$ is the optimal parameter vector obtained by training on the remaining segments with segment n held out. $MS E_{CV}$ reflects the distribution range of the error; a smaller value indicates higher accuracy.

When N = l, $MS E_{CV}$ is $MS E_{LOO}$

MS E_{LOO} = \sqrt{\frac{1}{l} \sum_{i = 1}^{l} {(y_{i} - y (x_{i} | θ_{i}^{*}))}^{2}}

(13)

The regression function of LSSVM in equation (10) can be rewritten as follows

y (x_{k} | θ_{k}) = \sum_{i = 1}^{l} α_{i} K (x_{i}, x_{k}) + b_{k}

(14)

Then c can be optimized by calculating the following derivative

\frac{\partial MS E_{LOO}}{\partial c} = - \frac{1}{l} \frac{1}{\sqrt{\frac{1}{l} \sum_{n = 1}^{N} \sum_{k \in G_{n}} {(y_{k} - y (x_{k} | θ_{n}^{*}))}^{2}}} \sum_{k = 1}^{l} (y_{k} - y (x_{k} | θ_{k})) \frac{\partial y (x_{k} | θ_{k})}{\partial c}

(15)

Finally, the optimal c and $σ$ values are obtained by minimizing $MS E_{LOO}$ , with equation (15) giving its gradient with respect to c.

Steps to optimize parameters c and $σ$ are described as follows:

Step 1: Set training points $D : {(x_{i}, y_{i}), i = 1, 2, \dots, l}$ , initialize c and $σ$ .

Step 2: Divide training data set into N segments, where the N − 1 pieces of data are used for training, and the remaining piece of data is used for testing. Get a decision function and the corresponding ${α_{k}, α_{k}^{*}, k = 1, 2, \dots, l}$ , then adjust and optimize c and $σ$ until optimal model parameters are found.

Step 3: Use the LOO-LSSVM technique for training and prediction.
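The three steps above can be sketched as follows. Note the swapped-in technique: instead of the gradient of equation (15), this sketch simply evaluates the LOO error of equation (13) on a small grid of candidate (c, σ) pairs and keeps the best pair. The helper names, the compact solver, the toy data, and the candidate values are all assumptions made for illustration.

```python
import math

def kernel(a, b, sigma):
    """RBF kernel of equation (11)."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2 * sigma ** 2))

def solve(A, rhs):
    """Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [A[i][:] + [rhs[i]] for i in range(n)]
    for col in range(n):
        p = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[p] = M[p], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [mr - f * mc for mr, mc in zip(M[r], M[col])]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][k] * z[k] for k in range(i + 1, n))) / M[i][i]
    return z

def lssvm_predict(X, y, c, sigma, x_new):
    """Train on (X, y) via equation (6), then evaluate equation (10) at x_new."""
    l = len(X)
    A = [[0.0] + [1.0] * l]
    for i in range(l):
        A.append([1.0] + [kernel(X[i], X[j], sigma) + (1.0 / c if i == j else 0.0)
                          for j in range(l)])
    z = solve(A, [0.0] + list(y))
    return z[0] + sum(a * kernel(xi, x_new, sigma) for a, xi in zip(z[1:], X))

def loo_error(X, y, c, sigma):
    """Equation (13): hold out each sample once and train on the other l - 1."""
    total = 0.0
    for k in range(len(X)):
        X_train, y_train = X[:k] + X[k + 1:], y[:k] + y[k + 1:]
        total += (y[k] - lssvm_predict(X_train, y_train, c, sigma, X[k])) ** 2
    return math.sqrt(total / len(X))

def grid_search(X, y, c_values, sigma_values):
    """Return (error, c, sigma) for the candidate pair with the smallest LOO error."""
    return min((loo_error(X, y, c, s), c, s) for c in c_values for s in sigma_values)

# Toy data made up for illustration: a smooth curve sampled at eight points.
X = [[i / 7.0] for i in range(8)]
y = [math.sin(2.0 * x[0]) for x in X]
best_err, best_c, best_sigma = grid_search(X, y, [1.0, 10.0, 100.0], [0.1, 0.5, 1.0])
print(best_err, best_c, best_sigma)
```

Each LOO evaluation retrains the model l times, which is the computational cost mentioned above; the pair returned here is only as good as the candidate grid, so it is best read as a starting point for further refinement.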

NM simplex search method

Although LOO-LSSVM can search for the optimum c and $σ$ globally, it may miss local optimum values. The NM simplex search method, proposed by Nelder and Mead,²⁷ is a local search method designed for unconstrained optimization without using derivatives. The method rescales the simplex based on the local behavior of the function using four basic procedures: reflection, expansion, contraction, and shrinkage. By comparing the objective function values at the n + 1 vertices of the simplex, the vertex with the highest objective function value is replaced by a new point. The simplex is updated in this stepwise, iterative manner and thus gets closer to the optimal solution. The optimization steps are described as follows:

1. Initialization. For minimization of an unconstrained function of n variables, let $P_{0}, P_{1}, \dots, P_{n}$ denote n + 1 points in n-dimensional space, constituting the initial “simplex,” and calculate the function value at each vertex of the simplex. Let $y_{i}$ denote the function value at point $P_{i}$ .

2. Reflection. Determine the vertices $P_{high}$ , $P_{\sec h}$ and $P_{low}$ with the highest, the second highest, and the lowest function values, respectively. Then find the center of the simplex $P_{cent}$ without $P_{high}$ in the minimization case. Generate a new vertex $P_{refl}$ by reflecting the worst point according to the following equation

P_{refl} = (1 + α) P_{cent} - α P_{high}

(16)

where $α$ is the reflection coefficient ( $α > 0$ ), which is set as $α = 1$ according to suggestions of Nelder and Mead. If $y_{low} \leq y_{refl} \leq y_{\sec h}$ , replace $P_{high}$ with $P_{refl}$ and repeat step 2; if $y_{refl} < y_{low}$ , proceed with expansion; otherwise proceed with contraction.

3. Expansion. If the reflection operation produces a lower vertex, namely $y_{low} > y_{refl}$ , the expansion of $P_{refl}$ is performed according to the following equation

P_{\exp} = γ P_{refl} + (1 - γ) P_{cent}

(17)

where $γ$ is the expansion coefficient ( $γ > 1$ ); it is set as $γ = 2$ . Two possible expansion cases need to be considered, as described below:

(1) If $y_{\exp} < y_{low}$ , the expansion operation is accepted by replacing $P_{high}$ with $P_{\exp}$ ;

(2) Exit the algorithm if the stopping criteria are satisfied; if $y_{\exp} > y_{low}$ , execute step 2 by replacing $P_{high}$ with $P_{refl}$ .

4. Contraction. It is described as follows:

(1) If $y_{\sec h} < y_{refl} < y_{high}$ , $P_{high}$ is first replaced with $P_{refl}$ , and the contraction point is then computed according to the following equation

P_{cont} = β P_{high} + (1 - β) P_{cent}

(18)

where $β$ is the contraction coefficient ( $0 < β < 1$ ); it is set as $β = 0.5$ . If $y_{refl} > y_{high}$ , the contraction point is computed according to equation (18) without replacing $P_{high}$ with $P_{refl}$ ;

(2) Exit the algorithm if the stopping criteria are satisfied; if $y_{cont} \leq y_{high}$ , then execute step 2 by replacing $P_{high}$ with $P_{cont}$ ; otherwise turn to the next step.

5. Shrinkage. Following step 4, when $y_{cont} > y_{high}$ and contraction has failed, this step shrinks all points except $P_{low}$ toward $P_{low}$ by the following equation

P_{i} \leftarrow δ P_{i} + (1 - δ) P_{low}

(19)

where $δ$ is the shrinkage coefficient ( $0 < δ < 1$ ); it is set as $δ = 0.5$ . Exit the algorithm if the stopping criteria are satisfied; otherwise go to step 2 after recalculating the function values of each vertex except $P_{low}$ .

The NM simplex algorithm is simple and places few analytic requirements on the objective function. However, it has two main shortcomings: it is very sensitive to the choice of the initial vertices, and it cannot guarantee convergence to the global optimum.²⁸
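The five steps can be condensed into a short Python routine. This is a sketch under stated assumptions, not a reference implementation: the function names, the stopping rule (spread of vertex values below `tol`, or an iteration cap), the test function, and the starting simplex are all hypothetical, while the update formulas follow equations (16) to (19).

```python
def mix(w, a, b):
    """Return w * a + (1 - w) * b component-wise."""
    return [w * ai + (1.0 - w) * bi for ai, bi in zip(a, b)]

def nelder_mead(f, simplex, alpha=1.0, gamma=2.0, beta=0.5, delta=0.5,
                tol=1e-10, max_iter=500):
    """Minimize f from a list of n + 1 starting vertices (lists of n floats)."""
    pts = [list(p) for p in simplex]
    vals = [f(p) for p in pts]
    for _ in range(max_iter):
        order = sorted(range(len(pts)), key=lambda i: vals[i])
        pts = [pts[i] for i in order]    # pts[0] is P_low, pts[-1] is P_high
        vals = [vals[i] for i in order]
        if vals[-1] - vals[0] < tol:     # assumed stopping criterion
            break
        n = len(pts) - 1
        cent = [sum(p[j] for p in pts[:-1]) / n for j in range(n)]
        # Equation (16): P_refl = (1 + alpha) P_cent - alpha P_high
        refl = [(1.0 + alpha) * c - alpha * h for c, h in zip(cent, pts[-1])]
        y_refl = f(refl)
        if y_refl < vals[0]:
            # Equation (17): try to expand beyond the reflected point
            expd = mix(gamma, refl, cent)
            y_exp = f(expd)
            if y_exp < vals[0]:
                pts[-1], vals[-1] = expd, y_exp
            else:
                pts[-1], vals[-1] = refl, y_refl
        elif y_refl <= vals[-2]:
            pts[-1], vals[-1] = refl, y_refl   # plain reflection is accepted
        else:
            if y_refl < vals[-1]:
                pts[-1], vals[-1] = refl, y_refl
            # Equation (18): contract the (possibly replaced) worst vertex
            cont = mix(beta, pts[-1], cent)
            y_cont = f(cont)
            if y_cont <= vals[-1]:
                pts[-1], vals[-1] = cont, y_cont
            else:
                # Equation (19): shrink every vertex except P_low toward P_low
                for i in range(1, len(pts)):
                    pts[i] = mix(delta, pts[i], pts[0])
                    vals[i] = f(pts[i])
    best = min(range(len(pts)), key=lambda i: vals[i])
    return pts[best], vals[best]

# Hypothetical test function with its minimum at (1, -2).
def bowl(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

best_point, best_value = nelder_mead(bowl, [[0.0, 0.0], [1.2, 0.0], [0.0, 0.8]])
print(best_point, best_value)
```

Because no derivatives are used, the routine only needs function values at the vertices, which is what makes it convenient for refining (c, σ) against an error estimate that is expensive to differentiate.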

NM LOO-based LSSVM-based tool wear prediction model

In order to improve the local search ability of the LOO-LSSVM algorithm, the NM LOO-based least squares support vector machine (NM-LOO-LSSVM) is proposed. The goal of integrating the NM simplex search method and LOO-LSSVM is to combine their advantages and avoid their disadvantages. The NM simplex method is a very efficient local search procedure, but it is very sensitive to the choice of initial points and is incapable of guaranteeing the global optimum. LOO-LSSVM belongs to the class of global search procedures; nonetheless, it lacks local search ability.

The NM-LOO-LSSVM algorithm has two stages in each cycle. The first stage is the global search: based on LOO-LSSVM, global optimum solutions are searched in the solution space of the optimized objective functions, and these solutions, which lie near the global optimal solution, serve as the initial points for the NM simplex method. The second stage is the local search, in which further optimization is carried out with the NM simplex algorithm to find the optimum solution. The whole algorithm process is shown in Figure 2.

Figure 2.

Procedures of NM-LOO-LSSVM model.

Experiments and discussions

Tool wear prediction model

According to prior knowledge of tool wear principles in machining, tool wear changes over machining time, and the wear speed differs under different cutting conditions. Under the same cutting condition, the rate of wear change is approximately a constant value K

\frac{Δ VB}{Δ t} = K

(20)

where $Δ VB$ is the tool wear variable quantity under the same cutting condition over tool cutting time $Δ t$ .

Due to the different interactions between the tool and the workpiece, the generating mechanisms of tool wear, which include abrasive wear, adhesive wear, and diffusion wear, are different. Abrasive wear occurs when the workpiece material is removed from one surface by the tool, leaving built-up edge or hard particles of debris between the two surfaces.²⁹ Abrasive wear is present under various cutting speeds, but it is the main wear mechanism at low cutting speed. Adhesive wear occurs when minute peaks of the two rough surfaces contact each other and weld or stick together, removing a wear particle. Adhesive wear is more serious at moderate cutting speed. When the cutting temperature is higher than the brittle temperature, the tool undergoes diffusion wear. Experiment showed that with the increase in the cutting speed, carbide cobalt element will decompose into tungsten and carbon diffusing into the steel when the cutting temperature is over 800 °C, which exacerbates tool wear.³⁰

The cutting tools, processing methods, and the macro- and micro-geometry parameters of the tool must constantly be changed in machining to suit different workpieces and different precision requirements, which makes the testing process very complicated. Within one processing step, however, factors other than the cutting conditions generally do not change. Therefore, the geometric parameters of the tool are fixed in the tool wear condition-monitoring experiment. Different cutting times correspond to different tool wear conditions. Consequently, cutting time, which reflects the entire life cycle of the tool, is taken into account in this model. Due to the complexity of the actual process, a linear relationship does not truly reflect the relationship between the degree of tool wear and the cutting time interval. Moreover, under different cutting conditions, the relationship between tool wear and cutting parameters is constantly changing. Equation (20) is therefore extended to the highly nonlinear relationship of equation (21)

Δ VB = K \cdot Δ t^{x} \cdot a^{y} \cdot f^{m} \cdot v^{n}

(21)

where a is the cutting depth, f is the feed rate, v is the spindle speed, and x, y, m, and n are the exponents of the respective factors.
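A small numerical sketch may help show how equation (21) generalizes equation (20). The function name and every numeric value below, including K and the exponents, are hypothetical and chosen only for illustration; none of them is fitted to the experimental data.

```python
def wear_increment(K, dt, a, f, v, x=1.0, y=0.0, m=0.0, n=0.0):
    """Equation (21): delta_VB = K * dt^x * a^y * f^m * v^n."""
    return K * dt ** x * a ** y * f ** m * v ** n

# Hypothetical constant and time interval (not taken from the experiments).
K, dt = 0.01, 3.0

# With x = 1 and y = m = n = 0 the model reduces to equation (20): delta_VB = K * dt.
linear = wear_increment(K, dt, a=0.5, f=0.05, v=21.0)

# Non-trivial (hypothetical) exponents make the wear depend on every cutting parameter.
nonlinear = wear_increment(K, dt, a=0.5, f=0.05, v=21.0, x=0.8, y=0.3, m=0.4, n=0.6)

print(linear, nonlinear)
```

Taking logarithms of equation (21) gives a relation that is linear in the exponents, but the exponents themselves are unknown in practice, which is one reason the article learns the mapping from data instead of fixing its form.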

Sample data collection

Experiments are performed on CK6143/100 computerized numerical control (CNC) machine with M10 carbide alloy tool and ZMn13 high manganese cast steel, and rake angle γ = 2°, relief angle α = 8°, cutting edge angle k_r = 35°. It is desirable that tool wear monitoring model reflects the conditional changes in cutting tools under diverse cutting conditions such as different levels of cutting speed, feed rate, and depth of cut. In experiments, four factors used for the design of experiment were cutting speed, feed rate, depth of cut, and cutting time, which are shown in Table 1.

Table 1.

Cutting conditions.

Number	Cutting speed, v (m/min)	Depth of cut, d (mm)	Feed rate, f (mm/r)
1	21	0.5	0.05
2	21	0.5	0.7
3	21	1.5	0.7
4	21	1.5	0.5
5	43	0.5	0.7
6	43	0.5	0.05
7	43	1.5	0.05
8	43	1.5	0.7

Tool wear limit is called tool life criterion. In the cutting process, the friction between the tool rake face, flank, and the workpiece causes high pressure and high temperature in the contact area, where wear occurs. Crater wear occurs on the rake face; flank wear occurs on the tool flank. In many cases, the two occur simultaneously and influence each other. Flank wear has an impact on the processing quality, cutting force, and cutting temperature, and the amount of flank wear can be more readily observed, measured, and controlled than crater wear. Therefore, the maximum tool flank wear VB is used as the tool wear standard in the model of this article.

The experimental method can be described as follows: after a slot is turned, the tool is removed from the lathe and the flank wear is observed under the microscope. Tool wear is measured every 3 min under each condition to obtain a total of 10 samples. When one slot turning is completed, another slot is ready. Each experiment is repeated three times. Each time, the tool follows the same cutting path before the amount of wear is measured, to avoid randomness in detection. It should be pointed out that a new cutting tool is used for each of the eight cutting conditions in each experiment. Experimental conditions, such as cutting fluid performance, the fluid volume, and tool corrected situation, should be strictly controlled for the repeatability of test results in cutting experiments.

In order to avoid scale differences between the components of the sample data, normalization was applied to all training and test data as a preprocessing step according to the following function

x^{*} = 2 \frac{x - x_{min}}{x_{max} - x_{min}} - 1

(22)

where x is the uncompressed value, $x^{*}$ is the compressed value, and $x_{max}$ and $x_{min}$ are the maximum and minimum value, respectively.
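Equation (22) maps each variable linearly onto the interval [−1, 1]. A minimal sketch follows, assuming one list of values per variable; the function name is hypothetical, and the value 32 m/min is an invented intermediate point added to the two cutting speeds of Table 1.

```python
def scale_to_unit_range(values):
    """Map values linearly onto [-1, 1] following equation (22)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("all values are identical; equation (22) is undefined")
    return [2.0 * (x - x_min) / (x_max - x_min) - 1.0 for x in values]

# Cutting speeds 21 and 43 m/min from Table 1 plus a hypothetical 32 m/min.
print(scale_to_unit_range([21.0, 32.0, 43.0]))  # [-1.0, 0.0, 1.0]
```

In practice the minimum and maximum would be taken from the training data and reused for the test data, so that both sets are scaled in the same way.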

The whole data set is further divided into two subsets, that is, training data and random test data. The NM-LOO-LSSVM-based tool wear model is then trained with the training data and the two tuning parameters c and $σ$ . Once the training stage is accomplished, the tool wear model is validated with the test data.

Evaluation criteria

To evaluate the performance of NM-LOO-LSSVM model, the following measures are taken:

1. Mean square error (MSE)

MSE = \frac{1}{m} \sum_{i = 1}^{m} {(y_{i}^{*} - y_{i})}^{2}

(23)

In regression analysis, the term MSE is sometimes used to refer to the unbiased estimate of error variance.³¹

2. Coefficient of determination

R^{2} = \frac{{(m \sum_{i = 1}^{m} y_{i}^{*} y_{i} - \sum_{i = 1}^{m} y_{i}^{*} \sum_{i = 1}^{m} y_{i})}^{2}}{(m \sum_{i = 1}^{m} {(y_{i}^{*})}^{2} - {(\sum_{i = 1}^{m} y_{i}^{*})}^{2}) (m \sum_{i = 1}^{m} {(y_{i})}^{2} - {(\sum_{i = 1}^{m} y_{i})}^{2})}

(24)

$R^{2}$ is a statistical value that gives some information about the fitness of a model. In regression, the $R^{2}$ coefficient of determination is a statistical measure of how well the regression line approximates the real data points. An $R^{2}$ of 1 indicates that the regression line perfectly fits the data.³²

3. Mean absolute percent error (MAPE)

MAPE = \frac{1}{m} \sum_{i = 1}^{m} | \frac{y_{i} - y_{i}^{*}}{y_{i}} | \times 100 %

(25)

MAPE measures the method’s accuracy for constructing fitted time series values in statistics, specifically in trend estimation. It usually expresses accuracy as a percentage.

4. Accuracy

Accuracy = (1 - \frac{1}{m} \sum_{i = 1}^{m} | \frac{y_{i} - y_{i}^{*}}{y_{i}} |) \times 100 %

(26)

where m is the number of test data, $y_{i}$ is the real tool wear value, and $y_{i}^{*}$ is the estimated tool wear value.
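The four measures of equations (23) to (26) can be computed as in the sketch below. The function names and the sample values are assumptions for illustration; MAPE is returned as a fraction (the form in which Table 4 lists it), and accuracy is returned as a percentage.

```python
def mse(actual, predicted):
    """Equation (23): mean of the squared errors."""
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """Equation (24): squared correlation between actual and predicted values."""
    m = len(actual)
    s_a, s_p = sum(actual), sum(predicted)
    s_ap = sum(a * p for a, p in zip(actual, predicted))
    s_aa = sum(a * a for a in actual)
    s_pp = sum(p * p for p in predicted)
    return (m * s_ap - s_p * s_a) ** 2 / ((m * s_pp - s_p ** 2) * (m * s_aa - s_a ** 2))

def mape(actual, predicted):
    """Equation (25) as a fraction; multiply by 100 for a percentage."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def accuracy(actual, predicted):
    """Equation (26): (1 - MAPE) expressed as a percentage."""
    return (1.0 - mape(actual, predicted)) * 100.0

# Made-up wear values (mm) for illustration only.
actual = [0.10, 0.20, 0.40]
predicted = [0.11, 0.19, 0.40]
print(mse(actual, predicted), r_squared(actual, predicted),
      mape(actual, predicted), accuracy(actual, predicted))
```

The relative-error measures divide by the actual wear, so they are only meaningful when every actual value is nonzero.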

Sensitivity analysis of NM simplex search parameters

In this section, parameter sensitivity of NM simplex search method will be investigated. The rate of successful minimization is used as the criterion for tuning NM parameters. For the sensitivity investigation, the study is conducted using the original coefficients for NM simplex search method, as shown in Table 2, which also includes ranges of parameters.^28,33

Table 2.

Parameter sensitivity ranges.

NM simplex coefficient	Original value	Minimum	Maximum	Increment
Reflection ( $α$ )	1.00	0.50	2.00	0.25
Expansion ( $γ$ )	2.00	1.50	3.00	0.25
Contraction ( $β$ )	0.50	0.25	0.75	0.25
Shrinkage ( $δ$ )	0.50	0.25	0.75	0.25

NM: Nelder–Mead.

Each time, one of the four parameters is varied over the range given in Table 2, while the other three parameters are fixed. For example, in the sensitivity study for reflection coefficient $α$ , it varies from 0.50 to 2.00 with an increment of 0.25, while the expansion $γ$ , contraction $β$ , and shrinkage $δ$ remain at 2.0, 0.5, and 0.5, respectively. Consequently, the value that results in the best performance in terms of the rate of successful minimization can be identified.
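The one-at-a-time design described above can be enumerated with a short script. The dictionary and function names are hypothetical; the original values and ranges are those of Table 2.

```python
def frange(start, stop, step):
    """Inclusive list of floats from start to stop in increments of step."""
    count = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(count)]

# Original NM coefficients and (minimum, maximum, increment) from Table 2.
ORIGINAL = {"alpha": 1.0, "gamma": 2.0, "beta": 0.5, "delta": 0.5}
RANGES = {
    "alpha": (0.5, 2.0, 0.25),
    "gamma": (1.5, 3.0, 0.25),
    "beta": (0.25, 0.75, 0.25),
    "delta": (0.25, 0.75, 0.25),
}

def one_at_a_time(original, ranges):
    """Vary one coefficient over its range while the others keep their original values."""
    configs = []
    for name, (low, high, step) in ranges.items():
        for value in frange(low, high, step):
            config = dict(original)
            config[name] = value
            configs.append((name, config))
    return configs

configs = one_at_a_time(ORIGINAL, RANGES)
print(len(configs))  # 7 + 7 + 3 + 3 = 20
```

Each configuration would then be passed to the NM search and scored by the rate of successful minimization; that scoring step is not shown because it depends on the experimental data.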

Figure 3 shows the sensitivity results for the reflection coefficient. As illustrated in the figure, the optimum solutions converge quickly to the minimum value when the reflection coefficient is greater than 0.75, and increasing the reflection coefficient does not result in larger oscillation. Additionally, a reflection coefficient of 1.5 reaches the best rate of successful minimization. According to equation (16), a larger reflection coefficient places the new vertex $P_{refl}$ farther away, which helps the NM simplex search method to expand further in the design space in search of the optimum solution.

Figure 3.

Reflection sensitivity.

Figure 4 shows the sensitivity results for the expansion coefficient. It can be seen that the optimum solutions converge quickly to the minimum value when the expansion coefficient is greater than 1.5, and increasing the expansion coefficient does not result in larger oscillation. Also, an expansion coefficient of 2.5 achieves the best rate of successful minimization. Equation (17) suggests that a larger expansion coefficient places the new vertex $P_{\exp}$ farther away, which helps the NM simplex search method to extend the search space.

Figure 4.

Expansion sensitivity.

Figure 5 shows the sensitivity results for the contraction coefficient. It is observed from Figure 5 that a contraction coefficient of 0.75 returns the best rate of successful minimization. Equation (18) suggests that a larger contraction coefficient in generating the new vertex $P_{cont}$ gives the NM simplex search method more flexibility to expand in the design space in search of the optimum solution.

Figure 5.

Contraction sensitivity.

Figure 6 shows the sensitivity results for the shrinkage coefficient. It shows that the optimum solutions quickly converge to the minimum value when the shrinkage coefficient is greater than 0.25 for test cases. It is observed from Figure 6 that a shrinkage coefficient of 0.5 returns the best rate of successful minimization. Equation (19) suggests that a larger shrinkage coefficient in generating the new vertices $P_{i}$ gives the NM simplex search method more flexibility to move in the design space.

Figure 6.

Shrinkage sensitivity.

The reflection coefficient $α$ , expansion coefficient $γ$ , contraction coefficient $β$ , and shrinkage coefficient $δ$ all affect the optimum solution. Appropriate settings give the best rate of successful minimization and improve the effectiveness and efficiency of the algorithm. The above investigations are summarized in Table 3, and the suggested parameters are applied in the NM-LOO-LSSVM model.

Table 3.

Best suggested NM-LOO-LSSVM parameters after sensitivity analysis.

NM simplex coefficient	Original value	Best suggested value
Reflection ( $α$ )	1	1.5
Expansion ( $γ$ )	2	2.5
Contraction ( $β$ )	0.5	0.75
Shrinkage ( $δ$ )	0.5	0.5

NM: Nelder–Mead.
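To make the role of the four coefficients concrete, the following is a minimal sketch of a Nelder–Mead simplex search with the suggested values from Table 3 as defaults. It is not the authors' implementation: the update formulas are the textbook forms from the original Nelder–Mead method and are only assumed to correspond to equations (16) to (19). The names `nelder_mead` and `quadratic`, the initial step size, and the stopping tolerance are hypothetical choices made for illustration.

```python
def nelder_mead(f, x0, alpha=1.5, gamma=2.5, beta=0.75, delta=0.5,
                step=0.5, max_iter=1000, tol=1e-12):
    """Textbook Nelder-Mead search (illustrative sketch, assumed update rules)."""
    n = len(x0)
    # Initial simplex: x0 plus one vertex displaced along each coordinate axis.
    simplex = [list(x0)]
    for i in range(n):
        vertex = list(x0)
        vertex[i] += step
        simplex.append(vertex)
    values = [f(p) for p in simplex]

    for _ in range(max_iter):
        # Sort vertices from best (lowest f) to worst (highest f).
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        best = simplex[0]

        # Stop when the simplex is tiny and its function values agree.
        size = max(abs(p[j] - best[j]) for p in simplex[1:] for j in range(n))
        if size < tol and values[-1] - values[0] < tol:
            break

        # Centroid of all vertices except the worst one.
        cent = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]

        # Reflection: P_refl = (1 + alpha) * P_cent - alpha * P_worst
        p_refl = [(1 + alpha) * c - alpha * w for c, w in zip(cent, simplex[-1])]
        f_refl = f(p_refl)

        if f_refl < values[0]:
            # Expansion: P_exp = gamma * P_refl + (1 - gamma) * P_cent
            p_exp = [gamma * r + (1 - gamma) * c for r, c in zip(p_refl, cent)]
            f_exp = f(p_exp)
            if f_exp < f_refl:
                simplex[-1], values[-1] = p_exp, f_exp
            else:
                simplex[-1], values[-1] = p_refl, f_refl
        elif f_refl < values[-2]:
            # Reflected point beats the second-worst vertex: accept it.
            simplex[-1], values[-1] = p_refl, f_refl
        else:
            # Keep the better of the worst vertex and the reflected point.
            if f_refl < values[-1]:
                simplex[-1], values[-1] = p_refl, f_refl
            # Contraction: P_cont = beta * P_worst + (1 - beta) * P_cent
            p_cont = [beta * w + (1 - beta) * c for w, c in zip(simplex[-1], cent)]
            f_cont = f(p_cont)
            if f_cont < values[-1]:
                simplex[-1], values[-1] = p_cont, f_cont
            else:
                # Shrinkage: P_i = delta * P_i + (1 - delta) * P_best
                for i in range(1, n + 1):
                    simplex[i] = [delta * p + (1 - delta) * b
                                  for p, b in zip(simplex[i], best)]
                    values[i] = f(simplex[i])

    i_best = min(range(n + 1), key=lambda i: values[i])
    return simplex[i_best], values[i_best]


def quadratic(p):
    # Hypothetical smooth test objective with its minimum at (1, -2).
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


# Suggested coefficients (Table 3) versus the original values.
x_suggested, f_suggested = nelder_mead(quadratic, [0.0, 0.0])
x_original, f_original = nelder_mead(quadratic, [0.0, 0.0],
                                     alpha=1.0, gamma=2.0, beta=0.5, delta=0.5)
```

In the NM-LOO-LSSVM setting, the objective passed as `f` would be the leave-one-out error as a function of the regularization factor and the kernel parameter. The quadratic here only stands in for it.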

Experimental results

To validate the effectiveness of the designed algorithm, LSSVM and PSO-LSSVM are used for comparison. The experiments comprise 10 tests, each with different initial parameters to account for data randomness.

The average results are shown in Table 4. NM-LOO-LSSVM has the smallest MSE, the largest $R^{2}$ , and the smallest MAPE, whereas the basic LSSVM model has the largest MSE, the smallest $R^{2}$ , and the largest MAPE. In addition, Table 4 shows that LSSVM tuned by the NM and LOO techniques achieves the highest accuracy, 97.29%.

Table 4.

Performance of LSSVM, PSO-LSSVM, NM-LOO-LSSVM.

Performance measures	LSSVM	PSO-LSSVM	NM-LOO-LSSVM
MSE	0.0048	0.0022	0.000218
$R^{2}$	0.9184	0.9850	0.9907
MAPE	0.0943	0.0489	0.0271
Accuracy (%)	90.57	95.11	97.29

NM-LOO-LSSVM: Nelder–Mead leave-one-out based least squares support vector machine; PSO: particle swarm optimization; MSE: mean square error; MAPE: mean absolute percent error.
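As a quick arithmetic check, the accuracy row of Table 4 is consistent with accuracy being the complement of MAPE expressed as a percentage. This relation is inferred from the tabulated values and should be read as an assumption, not as a definition quoted from the evaluation criteria.

```latex
\begin{aligned}
\text{Accuracy} &\approx (1 - \text{MAPE}) \times 100\% \quad \text{(assumed relation)} \\
\text{LSSVM:} \quad (1 - 0.0943) \times 100\% &= 90.57\% \\
\text{PSO-LSSVM:} \quad (1 - 0.0489) \times 100\% &= 95.11\% \\
\text{NM-LOO-LSSVM:} \quad (1 - 0.0271) \times 100\% &= 97.29\%
\end{aligned}
```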

One set of predictions from the LSSVM, PSO-LSSVM, and NM-LOO-LSSVM models is compared in Figure 7 with the actual tool wear measured by an optical scan microscope. Table 5 lists the cutting conditions of the test data. The estimated tool wear, the actual tool wear, and the fractional error between them are shown in Table 6. The figure and tables demonstrate that the NM-LOO-LSSVM and PSO-LSSVM predictions agree with the actual tool wear at every wear level.

Figure 7.

Comparisons between estimated tool wear and actual tool wear.

Table 5.

Cutting conditions of test data.

No.	Cutting speed, v (m/min)	Depth of cut, d (mm)	Feed rate, f (mm/r)	Cutting time, T (min)
1	21	1.5	0.5	18
2	21	0.5	0.7	21
3	43	0.5	0.7	12
4	21	0.5	0.05	30
5	43	1.5	0.7	3
6	43	1.5	0.05	6
7	43	0.5	0.05	30
8	21	1.5	0.7	27

Table 6.

Fractional error between estimated tool wear and actual tool wear.

No.	Real wear	NM-LOO-LSSVM estimated wear	NM-LOO-LSSVM fractional error (%)	PSO-LSSVM estimated wear	PSO-LSSVM fractional error (%)	LSSVM estimated wear	LSSVM fractional error (%)
1	0.4	0.4063	1.56	0.3949	1.27	0.4130	3.24
2	0.42	0.4426	5.39	0.4559	8.54	0.4643	10.56
3	0.35	0.3386	3.26	0.3269	6.59	0.3500	0.00
4	0.61	0.6083	0.28	0.5540	9.18	0.4651	23.75
5	0.25	0.2568	2.71	0.2684	7.36	0.2543	1.70
6	0.25	0.2405	3.78	0.2396	4.17	0.2246	10.16
7	0.75	0.7530	0.40	0.7098	5.36	0.5682	24.24
8	0.6	0.5911	1.49	0.6009	0.16	0.6217	3.62
Average			2.36		5.33		9.66

NM-LOO-LSSVM: Nelder–Mead leave-one-out based least squares support vector machine; PSO: particle swarm optimization.
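The averages in Table 6 can be recomputed from the tabulated wear values. The sketch below assumes the fractional error is $|\text{estimated} - \text{real}| / \text{real} \times 100\%$, which agrees with the table to within rounding. The variable and function names are hypothetical. Because the estimates are rounded to four decimals, individual recomputed entries can differ from the table in the second decimal place.

```python
# Real and estimated tool wear copied from Table 6 (tests 1 to 8).
real_wear = [0.40, 0.42, 0.35, 0.61, 0.25, 0.25, 0.75, 0.60]
estimated_wear = {
    "NM-LOO-LSSVM": [0.4063, 0.4426, 0.3386, 0.6083, 0.2568, 0.2405, 0.7530, 0.5911],
    "PSO-LSSVM": [0.3949, 0.4559, 0.3269, 0.5540, 0.2684, 0.2396, 0.7098, 0.6009],
    "LSSVM": [0.4130, 0.4643, 0.3500, 0.4651, 0.2543, 0.2246, 0.5682, 0.6217],
}


def fractional_errors(real, estimated):
    """Per-test fractional error in percent (assumed definition)."""
    return [abs(e - r) / r * 100.0 for r, e in zip(real, estimated)]


# Average fractional error per model, in percent.
average_error = {
    model: sum(fractional_errors(real_wear, est)) / len(real_wear)
    for model, est in estimated_wear.items()
}

for model, avg in average_error.items():
    print(f"{model}: average fractional error = {avg:.2f}%")
```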

The running times of the LSSVM, PSO-LSSVM, and NM-LOO-LSSVM models are compared in Figure 8. As illustrated, the NM-LOO-LSSVM model runs in less than 1 s, while PSO-LSSVM takes more than 10 s.

Figure 8.

Processing time of LSSVM, PSO-LSSVM, and NM-LOO-LSSVM model.

Results discussion

Table 4 summarizes the performance of the LSSVM, PSO-LSSVM, and NM-LOO-LSSVM models for tool wear prediction. The table shows that NM-LOO-LSSVM performs best. The MSE of the NM-LOO-LSSVM prediction is smaller than those of the other two models, which indicates a high generalization ability. The $R^{2}$ value of NM-LOO-LSSVM is the largest, which indicates that its regression line fits the data more closely than those of the other two. With respect to MAPE and accuracy, NM-LOO-LSSVM achieves the best performance, although the MAPE of PSO-LSSVM (0.0489) is also below 5%. The robustness and generalization are largely improved by the NM-LOO-LSSVM model.

It is observed from Figure 7 that the NM-LOO-LSSVM prediction is generally closest to the real tool wear across wear levels, while the LSSVM prediction deviates most at high wear (tests 4 and 7). Table 6 shows the $E_{f}$ between real tool wear and estimated tool wear. The average $E_{f}$ of NM-LOO-LSSVM (2.36%) is less than 5%, while those of PSO-LSSVM (5.33%) and LSSVM (9.66%) are more than 5%. The average $E_{f}$ of the NM-LOO-LSSVM prediction is the smallest, which means the model has the strongest robustness.

In Figure 7, condition #5 (v = 43 m/min; d = 1.5 mm; f = 0.7 mm/r; T = 3 min) results in lower wear than condition #1 (v = 21 m/min; d = 1.5 mm; f = 0.5 mm/r; T = 18 min). The cutting speed of condition #5 is higher than that of condition #1, while its cutting time is only one-sixth of that of condition #1. This suggests that tool wear is closely related to cutting time.

Based on the above, it can be concluded that the accuracies of NM-LOO-LSSVM and PSO-LSSVM are both higher than 95%. However, the PSO-LSSVM model is far more time-consuming than the NM-LOO-LSSVM model. Figure 8 shows that the convergence rate is also largely improved by the NM-LOO-LSSVM model. Therefore, of the models compared, the model established in this article is the most accurate and runs much faster than PSO-LSSVM.

The satisfactory results of the NM-LOO-LSSVM model demonstrate its feasibility for tool wear prediction.

Conclusion

This article establishes a reliable LSSVM model based on the NM and LOO techniques to predict tool wear, with a focus on accuracy, generalization, robustness, and convergence rate. To verify the effectiveness of the proposed NM-LOO-LSSVM model, experiments have been conducted. The major contributions of this work are summarized as follows:

A highly nonlinear relationship between tool wear and cutting condition parameters is established.

LSSVM techniques have been implemented to predict tool wear based on cutting parameters. LSSVM generalizes well even though the sample size is limited. Furthermore, it shows excellent global convergence ability owing to its basis in statistical learning theory.

A parameter sensitivity study is conducted on the NM simplex search method, and the relationships between the parameters and their impacts on the optimum solution are investigated. The results show that the sensitivity study can reduce NM's computational cost and thus achieve a faster convergence rate. This study also provides essential insights for the design of other intelligent algorithms.

The LOO and NM algorithms are used to tune the LSSVM model to improve its global and local search ability. Analysis of the experimental results shows that NM-LOO-LSSVM can improve the accuracy, generalization, robustness, and convergence of the regression. This is critical, since NM-LOO-LSSVM can reduce LSSVM's risk of falling into a local minimum.

This article not only studies tool wear prediction but also establishes a fast and accurate predictive model. The study is based on many cutting experiments under specific cutting conditions, so the conclusions of the experimental analysis have some inevitable limitations. Owing to practical constraints, the tool wear condition-monitoring model of this article has not yet been used for online real-time monitoring in practical applications. Future research in a real production environment is needed to further validate the study. Moreover, the predictive ability of the model depends on the existing knowledge base. When a new processing environment appears, the prediction may not match the actual situation well, which can lead to misjudgments. Nevertheless, this article offers guidance for online real-time tool wear monitoring and product applications.

In conclusion, this article studies tool wear prediction under different cutting conditions and aims to overcome the limitation of considering a single cutting condition. Mapping relationships between tool wear and four cutting conditions are explored, which indicates that further cutting conditions could be incorporated. Experimental results validate the effectiveness of the NM-LOO-LSSVM model, since the estimation error is acceptable. With its high accuracy and rapid convergence, the model provides an effective and reliable solution for optimizing tool machining conditions and can be used in real industrial applications. According to the desired tool life, the cutting parameters can be adjusted, and the tool and materials selected, so that the estimated tool life matches it. Production efficiency could therefore be improved.

Footnotes

Acknowledgements

The authors would like to express their thanks for the related financial support. The authors also express their gratitude to Yang Cao for the language help.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the National Science and Technology Support Plan Subsidization Project (Grant No. 2014BAF08B02), the National Science and Technology Major Project (Grant No. 2012ZX04011-031), National Outstanding Youth Science Foundation (Grant No. 50925518), and the Youth Science Foundation of National Natural Science Foundation (Grant No. 51005260).


A novel monitoring method for turning tool wear based on support vector machines
