Abstract
In-line assessment of descaler functionality was conducted using high-speed infrared (IR) video imaging of moving steel strips following laminar cooling. IR temperature measurements were obtained for fifteen (15) steel strips at a mill location between laminar cooling and coiling. Anomalous temperature values were observed in the IR images and were attributed to the presence of surface oxides. The location and spacing of these oxides were quantified using an outlier detection method consisting of a Gaussian process and Gaussian mixture model. The outlier detection method was validated by comparing both the morphology and emissivity of oxide regions to those reported in literature. The oxides were predominantly located at the mid-width of the strip, while their transverse spacing correlated well with the descaler nozzle spray patterns. The transverse spacing and location of these oxides indicate that the reheat furnace and/or rough-rolling entry descaler was operating with reduced functionality. The total fraction of surface oxides was also shown to be correlated with the amount of silicon in each strip and the total roll reduction.
Introduction
Thermomechanical controlled processing (TMCP) is commonly used to produce high strength low alloyed (HSLA) steels for the oil and gas and automotive industries. TMCP consists of four major process steps including homogenisation (
Oxides formed during TMCP can be classified as primary, secondary, or tertiary.1,8 Primary oxide scale is formed during homogenisation and generally exhibits a strong interfacial bond with steel. 1 A secondary oxide scale forms and grows during the rough rolling schedule. Finally, a tertiary oxide scale can form during finish rolling, cooling, and coiling. The scale that forms in each TMCP stage is generally a layered structure consisting of an inner layer of wüstite (FeO) overlaid with magnetite (Fe
Pressurised water descalers are typically used to remove surface oxides from steel strips. These descalers consist of flat-fan nozzles connected to a pressurised water header and are arranged in banks across the width of the strip. The arrangement and positioning of the individual nozzles in a bank are pre-set to maximise surface area coverage and oxide removal. 14 The descaler banks are located at specific positions in the hot strip mill, including between the homogenisation furnace and the roughing mill and between the roughing mill and the finishing mill. Decreased descaling efficiency has been observed15,16 and is attributed to nozzle spray interaction and obstruction. In addition, rolling mill parameters and steel chemistry17,18 can influence the functionality (i.e. ability to remove oxides) of the descalers on a hot strip mill. Oxides that are not removed prior to rolling deformation can exhibit increased adhesion to the steel surface and resist removal by subsequent descaler banks. 19
As the strip traverses the ROT from the finishing stands, it goes through various cooling mechanisms as a result of its interactions with both air and water. In the regions where there is no exposure to water, the primary cooling mechanisms are convection and radiation. Directly underneath each header nozzle, multiple cooling mechanisms are taking place, namely single phase convection, nucleate boiling, transition boiling, and film boiling. These various cooling mechanisms result in a non-uniform surface temperature of the strip. Studies such as Guedia Guemo et al. 20 illustrate the influences of these mechanisms on the subsequent strip surface temperature profile.
The use of machine learning algorithms for image analysis is a well established practice.21–23 Some studies have combined machine learning with thermography for the detection and classification of defects.24–26 Generally, these models consist of a preprocessing step followed by clustering. The purpose of preprocessing is to clean the data and prepare it for effective clustering. Examples of preprocessing models are KNNimputer, kernel ridge regression, and Gaussian processes (GPs). Many clustering models exist, but some of the most common are K-means, DBSCAN (density-based spatial clustering of applications with noise), and Gaussian mixture models (GMMs). In this work, a combination of GPs and GMMs was used. An overview of these methods is provided in Appendix A.
Assessment (and ultimately correction) of descaler functionality generally relies on visual inspection of the strip surface. This work combines in-line infrared (IR) thermography with statistical analysis to quantify the presence and location of surface oxides. A series of plant trials was carried out in which IR temperature imaging data were obtained from a hot steel strip mill at a camera position after laminar cooling and before coiling. For each steel strip studied,
Materials and methods
Materials
Fifteen industrially hot rolled steel strips were analysed using IR thermography and are labelled S1 to S15. The strip thickness, the homogenisation, rough rolling, and finish rolling temperatures, and the Si content (wt-%) of each strip are shown in Table 1.
Strip processing parameters and chemistry.
Methods
IR thermography
A Telops FAST M350 IR camera with a spectral detection band of 1.5–5.4

Schematic diagram of camera placement in thermomechanical controlled processing (TMCP) process. 27
Validation and emissivity determination of IR measurements was done through comparison with the coiler pyrometer. The imaging software for the IR camera allows for defining regions of interest (ROIs) in the videos, where only the temperature profile contained in the ROI is recorded. By positioning the ROI close to the sampling region of the pyrometer, and matching both temperature profiles, the emissivity of the strip recorded was determined. Figure 2 shows a comparison of temperature profiles produced from the IR camera versus the pyrometer for S15, where an emissivity of 0.86 was shown to produce the best match. For all strips, the emissivity showed little variance with values of

Calibration of S15 using an emissivity of 0.86.
Using the calibrated IR data, the mean temperature and standard deviation of each frame were plotted as a function of distance along the length of the strip (Figure 3). Temperature measurements are presented as scaled temperature to obscure proprietary process data; scaled temperature is calculated by dividing all measurements by the maximum temperature in the range. The darker blue shading indicates one standard deviation in temperature, and the light blue shading indicates two standard deviations. The large, consistent variation in temperature along the strip is attributed to the spatial variation in strip cooling.

Scaled mean and standard deviation of each frame for S15.
The large temperature variation observed along the length of the strip (Figure 3) is illustrated for a single IR image frame (frame 199 of S15, denoted S15-199) in the colour coded temperature heatmap shown in Figure 4. The temperature of the strip was observed to vary from <600

Infrared (IR) video frame for S15-199. Colour bar indicates the scaled temperature.
Outlier detection
Because abnormal temperature fluctuations were present in the IR measurements, an outlier detection method was developed to isolate these regions. The method combines GPs and GMMs. The GP was used to estimate the true underlying temperature of the strip surface, from which the emissivity of each pixel in the IR video could be calculated. These corrected emissivity values could then be clustered into distinct components. This section walks through the full analysis for S15; for all other strips, only the results are presented.
Accounting for differential cooling
While IR thermography has been proposed for the inline detection of abnormally cold regions on strip surfaces, 3 its applications are limited by its inability to account for variations in emissivity. This is especially relevant to TMCP, where oxides (scale) form throughout the process. As was shown by del Campo et al., 28 the emissivity of an iron surface can increase dramatically as it oxidises. While the difference may not be as extreme for the unpolished steel in TMCP, it cannot be assumed that variations in surface emissivity are negligible. While two-colour pyrometers are capable of accounting for this variation, they are limited by the small region that they measure. Furthermore, as even small regions of oxidation have been shown to form localised hard zones, 3 pyrometer-based monitoring systems may not capture these defects.
During the calibration of an IR camera, the emissivity of the subject is adjusted until the temperature observed by the camera matches its true temperature. If the true temperature is known for every pixel in the frame, this method can be inverted to transform an IR image into a map of emissivities. In the case of TMCP, the temperature of the subject (the surface of the strip) is much higher than the ambient temperature. By neglecting this ambient radiation, and assuming that the emissivity of the subject is constant within the waveband of the camera, equation (1) is derived.
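The inversion described above can be sketched numerically. This is an illustrative sketch only and is not necessarily the form of equation (1): it assumes a single effective wavelength (3.45, taken here to be in micrometres), Planck's law for the blackbody radiance, and no ambient contribution. The function names and the example temperatures are hypothetical.

```python
import math

# Physical constants (SI units)
H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
KB = 1.380649e-23    # Boltzmann constant, J/K


def planck_radiance(wavelength_m, temp_k):
    """Blackbody spectral radiance at a single wavelength."""
    x = H * C / (wavelength_m * KB * temp_k)
    return (2.0 * H * C**2 / wavelength_m**5) / math.expm1(x)


def estimate_emissivity(t_apparent_k, t_true_k, eps_camera, wavelength_m=3.45e-6):
    """Emissivity that reconciles a camera reading with the true temperature.

    The camera reports t_apparent_k while assuming eps_camera, so the radiance
    it received is eps_camera * B(t_apparent). Equating this to
    eps_true * B(t_true), with ambient radiation neglected, gives eps_true.
    """
    ratio = planck_radiance(wavelength_m, t_apparent_k) / planck_radiance(wavelength_m, t_true_k)
    return eps_camera * ratio


# Hypothetical pixel: the camera (set to 0.86) reads 900 K where the
# estimated true surface temperature is 880 K.
eps_hot_pixel = estimate_emissivity(900.0, 880.0, 0.86)

# A pixel whose reading equals the true temperature keeps the camera emissivity.
eps_same = estimate_emissivity(880.0, 880.0, 0.86)
```

Under these assumptions, a pixel that reads hotter than the estimated true temperature maps to an emissivity above the calibration value, which is the behaviour the clustering step relies on.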
To estimate the emissivity-independent surface temperature
As temperature variations measured along the width of the strip were assumed to be dominated by real variations in the true surface temperature, these measurements were isolated through the use of a GP. This was done by disregarding the length index of each pixel, and fitting a GP to the resulting temperature-width data for each frame of the IR video. This is shown in Figure 5 for frame 199 of S15. The mean of the GP was taken as the true temperature (

Gaussian process (GP) approximation of true temperature profile across the width of S15-199.

Scaled true temperature (
With

Surface plot of temperatures (left) and estimated emissivities (right) for S15-199.
This method was used in place of taking the average temperature along the width. Although it is considerably more complicated, it has two advantages. First, the resulting mean temperature profile is smooth; there are no discontinuities and it is infinitely differentiable. This is a more accurate representation of the underlying system than a set of discrete averages. Second, it minimises the number of assumptions about the shape of the true temperature profile, which eliminates the bias that would be introduced by fitting a parameterised function to the data.
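The per-frame fit described in this section can be illustrated with a small, self-contained GP regression. This is a sketch under stated assumptions rather than the authors' implementation: the squared-exponential kernel, the hyperparameter values, the normalised width axis, and the synthetic frame are all hypothetical, and a production analysis would more likely use an established GP library.

```python
import math
import random


def rbf(a, b, length_scale, variance):
    """Squared-exponential covariance between two width positions."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(mat):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(mat)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = mat[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def cho_solve(low, rhs):
    """Solve (L L^T) x = rhs by forward then backward substitution."""
    n = len(rhs)
    y = [0.0] * n
    for i in range(n):
        y[i] = (rhs[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_mean(x_train, y_train, x_query, length_scale=0.3, variance=1.0, noise=0.01):
    """Posterior mean of a GP with a constant offset equal to the data mean."""
    offset = sum(y_train) / len(y_train)
    centred = [y - offset for y in y_train]
    n = len(x_train)
    gram = [[rbf(x_train[i], x_train[j], length_scale, variance)
             + (noise**2 if i == j else 0.0) for j in range(n)] for i in range(n)]
    alpha = cho_solve(cholesky(gram), centred)
    return [offset + sum(rbf(xq, xt, length_scale, variance) * a
                         for xt, a in zip(x_train, alpha)) for xq in x_query]


def true_profile(w):
    """Hypothetical smooth scaled-temperature profile across the width."""
    return 0.90 + 0.05 * math.sin(math.pi * w)


# Synthetic frame: rows run along the strip length, columns across the width.
random.seed(0)
widths = [i / 11 for i in range(12)]
frame = [[true_profile(w) + random.gauss(0.0, 0.003) for w in widths] for _ in range(4)]

# Disregard the length index: every pixel becomes a (width, temperature) pair.
x_train = [w for _row in frame for w in widths]
y_train = [t for row in frame for t in row]

# The GP mean at each width position plays the role of the true temperature.
t_true = gp_mean(x_train, y_train, widths)
max_error = max(abs(t - true_profile(w)) for t, w in zip(t_true, widths))
```

Because every row of the frame contributes observations at the same width positions, the GP mean acts as a smoothed average across rows without imposing a parametric shape on the profile.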
Clustering IR data
Stemming from the distinct components observed in Figure 4, a mixture distribution of emissivities was hypothesised for each strip. It was assumed that a finite number of factors influence emissivity, resulting in a probability density function modelled as a combination of multiple Gaussian distributions. Each Gaussian component represents a distinct cluster of data, which may correspond to variations arising from specific factors.
The number of components in the IR videos for each strip is unknown, so a fixed value was not predetermined. Instead, the Akaike information criterion (AIC) was employed to determine the optimal number of components. As the number of components in the GMM increases, the model fit improves, but the risk of overfitting also increases. AIC serves as a selection criterion to identify the model that achieves the best balance between goodness of fit and overfitting. Equation (2) defines the AIC for a given model, where
A minimum of three components was assumed for each video. This was based on the observation of three distinct colours in the IR frames (Figure 4). An upper bound of 10 components was also assigned to minimise the risk of overfitting. For each strip, the data from individual frames were aggregated into a single dataset, to which equation (2) was applied. Figure 8 compares the fit of the GMM with the measured emissivity data and includes the individual Gaussian components.
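The model-selection procedure can be sketched as follows. This is a minimal illustration, not the authors' code: it fits a one-dimensional GMM by expectation-maximisation on synthetic emissivity values and evaluates the standard AIC, 2k − 2 ln L, with k = 3K − 1 free parameters for K components. The data, the initialisation scheme, and the reduced search range of 1 to 5 components (the study uses 3 to 10) are assumptions made for brevity.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_gmm_1d(data, n_components, n_iter=100, min_std=1e-3):
    """Fit a 1D Gaussian mixture by EM; returns (weights, means, stds, log_lik)."""
    ordered = sorted(data)
    n = len(ordered)
    # Initialise means at evenly spaced quantiles with a shared spread.
    means = [ordered[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    spread = max((ordered[-1] - ordered[0]) / (2.0 * n_components), min_std)
    stds = [spread] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
            total = max(sum(dens), 1e-300)
            resp.append([d / total for d in dens])
        # M-step: update weights, means and standard deviations.
        for k in range(n_components):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            stds[k] = max(math.sqrt(var), min_std)
    log_lik = sum(
        math.log(max(sum(w * normal_pdf(x, m, s)
                         for w, m, s in zip(weights, means, stds)), 1e-300))
        for x in data
    )
    return weights, means, stds, log_lik


def aic(log_lik, n_components):
    """AIC for a 1D mixture: K means, K variances and K - 1 free weights."""
    n_params = 3 * n_components - 1
    return 2.0 * n_params - 2.0 * log_lik


# Synthetic emissivities drawn from three hypothetical surface states.
random.seed(1)
emissivities = [random.gauss(mu, 0.01) for mu in (0.60, 0.72, 0.85) for _ in range(100)]

aic_by_k = {}
for k in range(1, 6):
    _, _, _, log_lik = fit_gmm_1d(emissivities, k)
    aic_by_k[k] = aic(log_lik, k)
```

Plotting `aic_by_k` against the number of components gives the curve from which an elbow would be read; the sketch does not automate the elbow choice, which the text describes as a case-by-case judgement.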

Gaussian mixture model (GMM) results for S15.
The optimal number of components was determined on a case-by-case basis by analysing the AIC curve. In most cases, the first or second elbow of the AIC curve was selected as the optimal value. However, for datasets exhibiting a smooth curve without a clear elbow, the maximum component value of 10 was used. For example, based on the AIC curve of strip S15, 10 components (
From the model results (Figure 8), the Gaussian component with the highest mean emissivity was assumed to correspond to oxides on the surface. To isolate all oxides from the IR videos, a threshold value based on this Gaussian component needed to be chosen. Since each Gaussian component provides the probability that a given emissivity value belongs to that component, the threshold was set at the point where the probability of the second-highest component approaches zero (the right tail). This approach ensures that only the emissivity values belonging exclusively to the highest component are isolated. Using this method, emissivity values for the oxide components ranged from 0.76 to 0.92, which is in agreement with previous studies. 28 With these emissivity thresholds, each frame in the IR videos was analysed.
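One possible reading of this thresholding rule is sketched below. It interprets "probability" as the posterior responsibility of the second-highest component and scans upward from that component's mean until the responsibility drops below a small tolerance. The tolerance, the step size, and the mixture parameters are hypothetical, and the authors may have defined the cut-off differently.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def oxide_threshold(weights, means, stds, tol=1e-3, step=1e-4):
    """Emissivity above which the second-highest component is negligible."""
    order = sorted(range(len(means)), key=lambda k: means[k])
    second, top = order[-2], order[-1]
    x = means[second]
    while x < means[top]:
        dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
        total = sum(dens)
        # Posterior probability that x belongs to the second-highest component.
        if total > 0.0 and dens[second] / total < tol:
            return x
        x += step
    # Fall back to the top component's mean if the tails never separate.
    return means[top]


# Hypothetical three-component mixture of emissivities.
weights = [0.5, 0.3, 0.2]
means = [0.60, 0.70, 0.85]
stds = [0.02, 0.03, 0.03]

threshold = oxide_threshold(weights, means, stds)

# Flag pixels whose estimated emissivity lies at or above the threshold.
pixel_emissivities = [0.62, 0.71, 0.83, 0.88]
is_oxide = [e >= threshold for e in pixel_emissivities]
```

A stricter tolerance moves the threshold further into the top component and flags fewer pixels, so the choice trades missed oxides against false detections.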
Oxide morphology
The morphologies of all oxides were analysed. Figure 9 shows the oxide morphologies for S15-199 (black regions). Each oxide displays a smooth teardrop shape, which correlates strongly with primary and secondary scales observed during rough rolling. 1 These morphologies were similar among all the strips analysed.

S15-199 oxides and their morphologies.
Results and discussion
Oxide spacing
For the TMCP line under investigation, six descaler systems are employed. Descaler 1 is positioned between the reheat furnace and the roughing mill. Descaler 2 is located at the entry to the roughing mill. Descaler 3 is situated between the roughing mill and the coil box. Descalers 4 and 5 are between the coil box and the finishing mill. Lastly, descaler 6 is in the finishing mill, between stands 1 and 2.
Figures 10 and 11 present the y-coordinates of the detected oxides for select strips. In each case, regions of high and low concentrations are evident from the primary maxima and minima in the histograms on the left side of each plot. Additionally, secondary maxima and minima are observed both within and outside the regions defined by the primary peaks. The y-coordinate plots for all other strips are provided in Appendix B.

Positions of detected oxides for each frame of strip S12.

Positions of detected oxides for each frame of strip S15.
To quantify the separation between regions of high oxide density, a 1D Gaussian kernel density estimator with a bandwidth value of 10 was applied to reduce background noise. Figure 12 presents the distribution of measured spacings between the minima from each plot. The spacings were found to have a median value of 59.4 mm and a mean value of 64.5 mm.
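The smoothing and spacing measurement can be sketched as below. This is an illustration on synthetic data, not the authors' pipeline: it assumes the bandwidth of 10 is an absolute kernel width in millimetres (some libraries instead treat the bandwidth as a scaling factor), uses a 1 mm evaluation grid, and the oxide positions are hypothetical.

```python
import math


def gaussian_kde_1d(samples, grid, bandwidth):
    """Gaussian kernel density estimate of the samples evaluated on the grid."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - s) / bandwidth) ** 2) for s in samples)
        for g in grid
    ]


def local_minima(grid, density):
    """Grid positions where the density is strictly lower than both neighbours."""
    return [
        grid[i]
        for i in range(1, len(grid) - 1)
        if density[i] < density[i - 1] and density[i] < density[i + 1]
    ]


# Hypothetical oxide y-coordinates (mm) clustered around three bands.
offsets = [-12, -8, -4, 0, 4, 8, 12]
oxide_y = [centre + o for centre in (100, 160, 220) for o in offsets]

grid = list(range(50, 271))  # 1 mm evaluation grid across the region of interest
density = gaussian_kde_1d(oxide_y, grid, bandwidth=10.0)

minima = local_minima(grid, density)
spacings = [b - a for a, b in zip(minima, minima[1:])]
```

For this symmetric example the minima fall midway between the bands, so the recovered spacing equals the 60 mm band separation; real data would yield a distribution of spacings such as the one summarised by the median and mean above.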

Histogram of measured spacings between minima from each strip.
When the spacing between high oxide density regions is compared to the spray pattern of the descaler nozzles, there is agreement between the two measurements. Individual nozzles for descalers 1–6 can contribute 79, 81, 114, 152, 152, and 83 mm of descaling, respectively, and there is a total of 26 mm of overlap (13 mm on either side) with neighbouring nozzles. Figure 13 shows a simplified schematic diagram of the descaling region for three nozzles in descaler 6, with the oxide density profile for S9 overlaid.

Rough rolling descaler schematic, with the oxide density plot of S9 overlaid. Red regions denote areas of descaler overlap that may cause washout.
If it is assumed that washout occurs in the overlap regions (reducing the descaling efficiency) and that the centre impact provides the most effective descaling, the expected distances between descaled regions for descalers 1–6 are approximately 53, 55, 88, 126, 126, and 57 mm, respectively. As shown in the overlaid oxide density profile in Figure 13, S9 demonstrates agreement with the descaler spacing values of 53, 55, and 57 mm, corresponding to descalers 1, 2, and 6. While S9 is a particularly strong example, this trend persists across most strips. Figure 12 shows the distribution of oxide peak spacings across all strips. The mode of this spacing distribution closely aligns with the 53 and 55 mm spacings of descalers 1 and 2.
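The expected spacings quoted above follow from subtracting the 26 mm of total overlap from each nozzle's coverage. The short sketch below reproduces that arithmetic and lists the descalers whose expected spacing lies within a tolerance of the measured median; the 10 mm tolerance is a hypothetical value chosen only for illustration.

```python
# Nozzle coverage per descaler (mm) and total overlap with neighbours (mm),
# as stated in the text.
coverage_mm = {1: 79, 2: 81, 3: 114, 4: 152, 5: 152, 6: 83}
overlap_total_mm = 26  # 13 mm on either side

# Expected distance between effectively descaled regions for each descaler.
expected_spacing_mm = {d: c - overlap_total_mm for d, c in coverage_mm.items()}

# Compare with the measured median spacing; the tolerance is an assumption.
median_spacing_mm = 59.4
tolerance_mm = 10.0
candidates = [
    d for d, s in expected_spacing_mm.items()
    if abs(s - median_spacing_mm) <= tolerance_mm
]
```

With these inputs the candidate list contains descalers 1, 2, and 6, the same three systems singled out in the comparison with S9.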
Oxide location
To further investigate the influence of the descaling systems on the observed oxide density profiles, the locations of oxide peaks were analysed. From the plots in Figures 10 and 11, it is evident that most strips exhibit a high density of oxides towards the mid-width.
Towards the outer regions (apart from S1, S3, S4, and S5) the presence of oxides was found to be minimal. This indicates that the nozzles positioned towards the edges of the strip provided better descaling than those at the centre. For the hot strip mill studied in this article, two different piping configurations are used: descalers 1, 3, 4, 5, and 6 are fed with a horizontal inlet, while descaler 2 is fed with a vertical inlet. Figure 14 provides a schematic diagram of descalers 2 and 4 to illustrate these differences.

Vertical (a) and horizontal (b) feed descaler configurations.
The performance of a descaling nozzle is correlated to the maximum impact force that it exerts on the strip surface. Maximum impact force can be calculated using equations (3) and (4), where
For piping systems such as those used for descaling, the nozzles closest to the entrance of the pipe are most at risk of experiencing increased turbulence and approach velocities. As the approach velocity in the nozzles increases, the turbulence levels also increase, resulting in irregular spray patterns that reduce the impact force. 31 Based on this, it is suggested that the high density of oxides detected towards the mid-width of the strip is the result of inadequate descaling from the centre nozzles of the entry descaler of the roughing mill (descaler 2).
Oxide fraction
Figure 15 shows the total oxide fractions for S1–S15, determined using the model. The oxide fraction varied widely between strips (0.5%–2.28%), with S1 exhibiting the largest value (2.28%). These differences may result from a combination of the Si content of the steel and the rough rolling of the slab. In general, the oxide fraction decreased with decreasing Si content and roll reduction (Figure 15). This also suggests that the measured oxides are related to the primary scale formed in the reheat furnace.

Total fraction of detected oxides for each strip. Fill colour and pattern indicate strip thickness.
Among the strips analysed, only S1 was expected to be influenced by Si. This is because even for Si-killed steels (
If the larger variations in oxide fraction are attributed to the Si content, the differences between the 10.97 and 11.51 mm strips may be due to the rough rolling process. These strips were rough rolled between 1097
Novelty and limitations
Surface defect detection in TMCP is an area of active study. Contemporary methods often utilise a combination of feature extraction and image classification to flag areas of concern in grayscale images.34,35 This approach typically utilises cameras that operate in visible-light wavebands (i.e. CCD 36 ). While these methods have demonstrated high classification accuracy, they are not able to utilise information about the temperature or emissivity of the surface. This limits their detection capabilities to defects that are visible to the human eye.
Local microstructural defects (i.e. localised hard zones) are not always coupled with visible surface defects. 3 If these local defects are caused by sufficiently thin oxide layers (on the order of 30
The primary limitation of this non-parametric approach is the method used to estimate the true surface temperature (Figure 6). The assumption of constant temperature along the length of each IR frame only provides a first-order approximation. Figure 16 shows the expected error in predicted emissivity value as a function of error in true temperature. If the true difference in emissivity is on the order of 10%, there is only a

Expected % error in predicted emissivity as a function of total error in prediction of true temperature. Calculated using average wavelength of 3.45

Positions of detected oxides in each frame along lengths of strips S1–S6.

Positions of detected oxides in each frame along lengths of strips S7–S12.

Positions of detected oxides in each frame along lengths of strips S13–S15.
Future work should investigate methods of measuring local emissivity directly, eliminating the need for true temperature predictions.
Conclusions
In-line assessment of descaler functionality was achieved using IR video imaging in combination with a GP and GMM. The GP removed the influences of differential cooling on the strip, and allowed for the approximation of a true underlying temperature profile. With the true underlying profile, emissivity values could be calculated and clustered using the GMM. The oxide cluster was taken as the highest GMM component and used to identify oxides on the strip surface. Detected oxides had emissivity values ranging from 0.76 to 0.92, and morphologies resembling a smooth teardrop. Oxide density plots were generated for each strip, and the corresponding spacings between minima were calculated to have a median and mean of 59.4 and 64.5 mm, respectively. The mean spacing value correlated strongly with the nozzle spacings of the reheat furnace (53 mm), roughing mill entry (55 mm), and finishing mill (57 mm) descalers. Furthermore, a large peak was present at the mid-width of most oxide density plots. This is suspected to be caused by the vertically fed descaler at the entry to the roughing mill. A large variation in total oxide fractions (0.5%–2.28%) was measured among the strips, with a general trend of decreasing oxide fraction with decreasing Si and roll reduction. This trend is attributed to the increased adhesion of the oxide layer caused by the formation of Fe
Footnotes
Acknowledgements
The authors of this article would like to thank Stelco Inc. for allowing us to conduct plant trials on their TMCP line and their cooperation throughout the duration of this work. The financial support from Stelco Inc. and the Natural Sciences and Engineering Research Council of Canada (Grant CRDPJ 538420-18) is gratefully acknowledged.
Funding
The authors disclosed receipt of the following financial support for the research, authorship and/or publication of this article: Funding for this work was provided by Stelco Inc. and the Natural Sciences and Engineering Research Council of Canada (Grant CRDPJ 538420-18).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Appendix A. Gaussian processes and mixture models
GP regression is a subset of a collection of machine learning models called kernel methods. 37 These methods work on the principle of predicting functions directly, rather than a parameterised approximation. GPs apply some statistical methods to these algorithms to create more robust predictions. The following section explains how these models work and some of the considerations that are made when constructing them. Most of the notation is taken from Murphy, 37 with additional insights from Rasmussen. 38 The reader is encouraged to see these texts for a more comprehensive overview of GP regression.
Every dataset has input points and response points. The input points can be thought of as a vector, where every entry is the value of the control variable for a given observation. If there are multiple control variables, each entry is itself a vector with length
To make predictions, it is first assumed that every data point is sampled from an
GP regression is most commonly used for time-series modelling 39 and the analysis of geospatial data. 40 More recently, GP regression has been applied to the analysis of IR images in the fields of food science, 41 forestry, 42 and astronomy. 43 In the field of physical metallurgy, GP regression has been used to forecast temperatures of molten steel in a ladle furnace, 44 predict austenite start temperatures, 45 and predict oxygen consumption in a converter steelmaking process. 46 In this work, GP regression is utilised in a similar fashion to signal denoising, which has been demonstrated extensively in the literature.47–50
A GMM is a probabilistic model that is used to deconvolute an overall distribution into several sub-populations that exist within it. In this model, the assumption that every data point belongs to a specific component is made. Equation (10) illustrates this relationship, where
