Abstract
Failure of pipe network components in so-called mixing zones due to high-cycle thermal fatigue (HCTF) can occur within nuclear power plants where fluids of different thermal and hydraulic properties interact. Given that the consequences of such failures are potentially deadly, a method to monitor HCTF non-invasively in real-time is expected to be of great use. This method may be realised by a technique to determine the inaccessible temperature distribution of a component, since thermal gradients drive HCTF. Previous work showed that a physics-based method called the inverse thermal modelling (ITM) method can obtain the temperature distribution from external temperature and ultrasonic time of flight (TOF) measurements. This study investigated whether the long short-term memory (LSTM) machine learning architecture could be a faster alternative to the ITM method for data inversion. On experimental data, a 25-member ensemble of LSTM networks achieved an ensemble median root mean square error (RMSE) of 1.04°C and an ensemble median mean error of 0.194°C (both relative to a resistance temperature detector measurement). These values are similar to those of the ITM method, which achieved a RMSE of 1.04°C and a mean error of 0.196°C. The single LSTM network and the ITM method achieved computation-to-real-world time ratios of 0.008% and 14%, respectively, demonstrating that both methods can invert data in real-time. Simulation studies revealed that LSTM performance is sensitive to small differences between the training and real-world parameters, leading to unacceptable errors. However, these errors can be detected via an ensemble of independent networks and corrected by simply adding a correction factor to the TOF before it is input into the networks. The results show that LSTM has the potential to be an alternative to the ITM method; however, the authors favour ITM for temperature distribution monitoring given its interpretability.
Introduction
Motivations
Pipe networks within nuclear power plants (NPPs) are susceptible to high-cycle thermal fatigue (HCTF) in so-called mixing zones where fluids of different thermal and hydraulic properties interact.1,2 This susceptibility is, in part, due to the high thermal expansion coefficient and low thermal conductivity of austenitic stainless steels (SSs) 3 that are used throughout different types of NPPs.4,5
In May 1998, a crack in a pipe elbow within the reactor heat removal system (RHRS) caused a leak at the French Civaux 1 pressurised water reactor after just 1500 h of operation. 6 The leak caused the release of radioactive steam at a rate of 30 m³ h⁻¹ into the reactor building. 7 The location of the 180 mm through-wall crack on the pipe elbow is shown in Figure 1. Following this incident, Civaux 1 and three other reactors of the same design were defueled, and a failure analysis was performed. 7 The analysis identified that the failure was caused by thermal-fatigue-initiated cracks. 8 This HCTF phenomenon had not been considered during the design of the NPPs since it was not captured in the design standards of the time. 6 Hence, redesign, requalification and replacement of the affected components of the RHRS in all four reactors had to be performed. Subsequent ultrasonic inspection in 1999 of all NPPs in France revealed that thermal fatigue cracking was not unique to the Civaux 1 reactor design (due to the limitations of the design standards 6 ). Further research identified that mixing zone HCTF is primarily caused by repeated exposure to temperature fluctuations arising from differences >50°C between hot and cold fluids. 6 It was also found that the fatigue is most severe for temperature fluctuations with frequencies in the range from 0.1 to 1 Hz.9,10

Schematic of the Civaux 1 RHRS pipe elbow illustrating the location of the fatigue crack.
Given the susceptibility of austenitic steels to thermal fatigue, 3 and that safety is paramount in the nuclear industry, it is desirable to investigate techniques that can monitor the progression of HCTF and as a result, relax the requirements on the inspection interval and remove operators from the hazardous environment.
Article structure
The remainder of the article is structured as follows: the section ‘Limitations of current HCTF monitoring methods’ details the shortcomings of current HCTF monitoring methods, and the physical reasons for this. A brief explanation of a physics-based ultrasonic temperature inversion method is presented in the section ‘Inverse thermal modelling method’. The section ‘Long short-term memory’ describes the machine learning network architecture considered in this work, the process for generating (simulated) training and testing data and the steps for training networks. The section ‘Simulation studies’ defines an initial test case and two additional test cases concerning data that have never before been seen by the trained networks. The section ‘Experimental studies’ introduces experimental data used to evaluate the machine learning networks on real-world data. The results of the machine learning networks are shown and discussed first for the simulated test data followed by the experimental test data, with results for the physics-based inversion method included for comparison. Finally, a summary of the key findings is provided in the conclusions.
Limitations of current HCTF monitoring methods
A component exposed to temperature fluctuations will develop thermal gradients that will generate thermal stresses. If sufficiently large, these stresses will impart damage leading to crack initiation/propagation and eventually cause component failure. This failure mechanism is known as thermal fatigue. 11 For a pipe carrying a thermally varying fluid, the inaccessible interior surface will experience the largest (compressive or tensile) stresses.
Since thermal fatigue progression is driven by thermal gradients, knowing how the through-thickness temperature profile evolves over time is vital for monitoring thermal fatigue. 10 However, this is a difficult task because traditional temperature measurement equipment (e.g. thermocouples) can only measure surface temperatures. Several techniques to overcome this issue have been developed; the following two sections will introduce and evaluate these techniques.
Embedded thermocouples
The obvious method to obtain the temperature profile in a component is to embed sensors into it. This has been demonstrated in two studies12,13 in which thermocouples were embedded throughout the thickness of a component via drilled holes. However, these holes create stress-raising features that will accelerate thermal fatigue progression. 14 Furthermore, resistance temperature detectors (RTDs) have been shown to lag the true temperature by the order of seconds due to thermal conduction into the device, 15 causing measurement errors.
FAMOSi
The FAtigue MOnitoring System (FAMOS) was developed by Siemens in the 1980s and later updated by Areva (a French multinational group with a focus on nuclear power) for thermal fatigue monitoring following the discovery of fatigue cracks in NPPs. 16 The resulting non-invasive system, called integrated FAMOS (FAMOSi), comprises seven temperature sensors mounted around one half of a pipe’s circumference, as shown in Figure 2. The system is based on a comparison of the outer surface temperature-time history to a pre-compiled reference database of ‘responses’ which are computed via a finite element model (FEM). It is likely to be a significant task to validate the FEM. Furthermore, the system is incapable of detecting thermal fluctuations >1 Hz. 17 Although the reason for this limitation is not explicitly stated in the literature, it is assumed to be due to the low thermal conductivity of austenitic steels preventing thermal energy from diffusing to the component’s outer surface, rather than a limitation in computational or data acquisition capabilities. This effect is demonstrated in the next section.

Schematic diagram of FAMOSi.
Thermal conduction: a low-pass filter
This section presents simulations to demonstrate that materials with low thermal conductivity effectively act as low-pass filters of temperature. This low-pass effect implies that the previously introduced HCTF monitoring methods, based on externally mounted temperature sensors, will be unsuitable for resolving sub-surface temperatures that fluctuate rapidly, especially for thick components.
An explicit 1D, finite difference heat transfer model with convective boundary conditions was developed based on the model used by Zhang et al. 18
The model simulated the temperature profile evolution of a section of 304 SS pipe carrying water at constant pressure, with the outer surface exposed to air, as shown in Figure 3. The fluctuation frequency of the water temperature was set to be a linear up-chirp: the instantaneous frequency increases linearly with time. The properties and parameters of the simulation are summarised in Table 1, where T, h and k denote temperature, convective heat transfer coefficient and thermal conductivity, respectively.

Schematic of the simulation setup. The colour bar denotes an arbitrary temperature gradient due to the difference in temperature between the air and water.
Parameters and material properties used in the simulation (304 SS).
SS: stainless steel.
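To make the scheme concrete, the following is a minimal sketch (not the authors' code) of one update step of an explicit 1D finite difference conduction model with convective (Robin) boundaries on both faces; the material and boundary values used below are illustrative placeholders rather than the Table 1 parameters.

```python
import numpy as np

def step_1d_conduction(T, dx, dt, alpha, k, h_in, T_water, h_out, T_air):
    """Advance an explicit 1D finite difference conduction model by one
    time step. Interior nodes use central differences; the two boundary
    nodes apply convective conditions via ghost-node elimination."""
    r = alpha * dt / dx**2
    assert r <= 0.5, "explicit scheme unstable: require alpha*dt/dx**2 <= 0.5"
    Tn = T.copy()
    # Interior nodes: standard FTCS update
    Tn[1:-1] = T[1:-1] + r * (T[2:] - 2.0 * T[1:-1] + T[:-2])
    # Water-side and air-side convective boundaries
    Tn[0] = T[0] + 2.0 * r * (T[1] - T[0] + (h_in * dx / k) * (T_water - T[0]))
    Tn[-1] = T[-1] + 2.0 * r * (T[-2] - T[-1] + (h_out * dx / k) * (T_air - T[-1]))
    return Tn
```

Repeatedly calling this function with a time-varying `T_water` reproduces the qualitative behaviour discussed below: rapid interior-surface fluctuations attenuate strongly before reaching the outer surface.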
The 500 min of simulated time yields a rate of change of chirp frequency of
In Equation (2)
Figure 4 shows the water temperature

Results of the finite difference simulation for a 10 mm block of 304 SS exposed to temperature-varying water that fluctuates according to a sine wave with linearly increasing frequency. Temperatures of the water, and at the inner and outer surface of the component, are shown in the top row; the bottom row shows the shear TOF. The left column shows the full 500 min of the finite difference simulation, whilst panels (b) and (d) in the right column are zoomed to show only the first 5 min of panels (a) and (c), respectively.
A spectrogram of the full 500 min of simulation for each variable is shown in Figure 5. The horizontal lines at 0.1 Hz (red dash-dot) and 1 Hz (white dashed) are superimposed on Figure 5 to show the range of critical fluctuation frequencies for HCTF of the tee-joint at Civaux.9,10 The spectrograms were computed to identify the maximum resolvable fluctuation frequency for each variable. To create a spectrogram, a given variable was segmented into periods of 2048 time steps. The Fourier transform was computed for each period and the resulting spectra stacked along the x-axis, that is, a visual representation of the frequency content of the signal over each consecutive 2048-time-step period. Each variable was normalised by its absolute range before the spectrogram was computed, that is, by Equation (4).
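The segmentation-and-FFT procedure described above can be sketched as follows; this is an illustrative implementation (non-overlapping windows, magnitude spectra), not the authors' code, and Equation (4) is taken here to mean division by the signal's peak-to-peak range.

```python
import numpy as np

def range_normalised_spectrogram(x, n_seg=2048):
    """Normalise the signal by its absolute (peak-to-peak) range, split it
    into consecutive periods of n_seg time steps, compute the FFT
    magnitude of each period and stack the spectra column-wise."""
    x = np.asarray(x, dtype=float)
    x = x / np.ptp(x)                       # normalise by absolute range
    n_periods = len(x) // n_seg
    cols = []
    for i in range(n_periods):
        seg = x[i * n_seg:(i + 1) * n_seg]
        cols.append(np.abs(np.fft.rfft(seg)))
    return np.column_stack(cols)            # shape: (n_seg//2 + 1, n_periods)
```

Applied to a linear up-chirp, each successive column's energy sits at higher frequency bins, which is exactly the rising ridge visible in the spectrograms of Figure 5.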

Spectrograms for the temperatures (a)–(d) and, shear TOF (e) computed by the finite difference simulation (Figure 4). Each variable was normalised according to Equation (4) before computing the spectrogram. The horizontal red dash-dot and white dashed lines denote 0.1 and 1 Hz, respectively. The solid white lines in panel (c) bound the zoom extents for panel (d).
The minimum of the colour bar
For temperature,
Figure 5(d) shows that a temperature sensor mounted to the external surface of a component cannot differentiate temperature fluctuation frequencies greater than approximately 0.29 Hz. Furthermore, the practical frequency limit is likely lower than 0.29 Hz due to the lag time of typical temperature sensors15,18,23 which is not considered in these simulations. In contrast, the shear TOFs (Figure 5(e)) remain sensitive up to 5 Hz. TOF should remain sensitive well beyond 5 Hz although the theoretical upper limit of the sensitivity was not investigated in this simulation. The upper limit is expected to be governed by the sampling rate of the acquisition hardware.
Inverse thermal modelling method
A feasibility study by Zhang et al. 18 utilised the thermal sensitivity of (shear) ultrasonic TOF (demonstrated in the previous section), achieving internal surface temperature estimation to within 2°C. The method couples TOF and outer (accessible) surface temperature measurements with a physics-based inversion model to obtain temperature estimates. Of the two inversion methods presented by Zhang et al., only the inverse thermal modelling (ITM) method, based on earlier work by Ihara et al., 24 will be considered in this article.
The ITM method is based on iterative optimisation of an explicit finite difference formulation of the inverse heat conduction problem, enabling it to obtain the full temperature profile. The explicit formulation requires that the time step of the data be sufficiently small to ensure a stable solution. Hence, interpolation of the data is usually necessary to ensure stability, which increases computational expense.
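The stability constraint and the interpolation it forces can be illustrated with a short sketch. This assumes the standard explicit-scheme limit (Fourier number α·Δt/Δx² ≤ 1/2) and simple linear resampling; the paper's actual solver and grid values are not reproduced here.

```python
import numpy as np

def stable_time_step(alpha, dx, safety=0.9):
    """Largest time step satisfying the explicit-scheme stability limit
    r = alpha*dt/dx**2 <= 0.5, reduced by a safety factor."""
    return safety * 0.5 * dx**2 / alpha

def interpolate_to_stable_rate(t, y, alpha, dx):
    """Linearly resample a measurement series y(t) onto a grid fine
    enough for a stable explicit solve; this oversampling is what
    inflates the computational cost of the ITM method."""
    dt = stable_time_step(alpha, dx)
    t_fine = np.arange(t[0], t[-1], dt)
    return t_fine, np.interp(t_fine, t, y)
```

For typical steel diffusivities and millimetre-scale grids the stable step is a fraction of a second, so data sampled every few seconds must be interpolated onto a grid tens of times denser before inversion.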
Long short-term memory
Machine learning is a technique that is being increasingly implemented in non-destructive evaluation scenarios, including:
Ultrasonic flaw classification 25
Deconvolution of ultrasonic signals 26
Artefact identification and suppression in ultrasonic images 27
Defect detection in guided wave signals 28
Noise quantification in ultrasonic images 29
Ultrasonic crack characterisation 30
Machine learning has several beneficial characteristics including:
A physics-based model of an underlying system is not required 31
The ability to implicitly create complex non-linear relationships
One drawback is that it is difficult to explain the mathematical operations inside the model, leading to machine learning networks often being described as a ‘black box’ method. 32 Very broadly, deep learning – a subset of machine learning – can use all the available information embedded within the data set by simply using the raw data as the input. In contrast, shallow learning requires hand selection of input features but requires less training data than deep learning. 33 Within deep learning, recurrent neural networks (RNNs) are well suited for problems involving time-series data; however, they can be susceptible to gradient vanishing and gradient exploding problems. 34 To overcome this issue, a novel and efficient gradient-based method, called long short-term memory (LSTM), was developed by Hochreiter and Schmidhuber. 35 This article explores whether machine learning networks can replace the (single shear wave) ITM inversion method for real-time temperature gradient monitoring using networks trained using simulation data only. Given the available scope of this article, it would not be possible to explore all potential machine learning architectures. The LSTM architecture was selected for investigation as it seemed a prominent candidate and showed promising results in initial evaluations. For a detailed discussion of other types of architectures that have been proposed for a range of non-destructive evaluation applications, beyond time-series data, the review paper by Cantero-Chinchilla et al. 36 is suggested to the interested reader.
Figure 6 shows a schematic of an LSTM cell that contains four layers. The four layers comprise three logistic sigmoid and one hyperbolic tangent (tanh) functions that interact to produce the output and the state of the cell, which are then passed on to the next hidden layer. An LSTM cell has three inputs: the current input value, the previous hidden state (output) and the previous cell state.

Graphical illustration of an LSTM cell. The weights are omitted for clarity.
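The interactions sketched in Figure 6 can be written out explicitly. The following is a minimal numpy forward pass of one LSTM cell step in the standard formulation (forget, input and output sigmoid gates plus a tanh candidate layer); the weight layout and sizes here are illustrative, not those of the trained networks.

```python
import numpy as np

def lstm_cell_step(x, h_prev, c_prev, W, U, b):
    """One time step of an LSTM cell: three sigmoid gates and one tanh
    candidate layer, as in Figure 6. W (4n x n_in), U (4n x n) and
    b (4n,) hold the stacked weights of the four layers."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    n = len(h_prev)
    z = W @ x + U @ h_prev + b      # stacked pre-activations for all four layers
    f = sigmoid(z[0:n])             # forget gate
    i = sigmoid(z[n:2*n])           # input gate
    o = sigmoid(z[2*n:3*n])         # output gate
    g = np.tanh(z[3*n:4*n])         # candidate cell update (tanh layer)
    c = f * c_prev + i * g          # new cell state
    h = o * np.tanh(c)              # new hidden state (cell output)
    return h, c
```

The cell state `c` is the pathway that lets gradients flow across many time steps, which is what mitigates the vanishing-gradient problem of plain RNNs.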
Train/test data generation and network training
The previously introduced finite difference heat transfer code was used to generate data sets for training and testing LSTM networks. However, the method for defining the water temperature fluctuations differed. The first method (later referred to as the sine method) defined the fluctuations to be sinusoidal (rather than a linear chirp) and were defined by three parameters:
1. Mean temperature (increasing, decreasing, or constant)
2. Temperature range (increasing, decreasing, or constant)
3. Fluctuation frequency (
Each of the parameters was selected for each ‘region’ in the data set using uniform random distributions; discrete forms of the distribution are used for the mean temperature and temperature range. Figure 7 shows nine representative regions demonstrating each of the possible mean temperature and temperature range combinations for the sine method. The training data set used to train all LSTM networks was generated using the parameters given in Table 2.

Representative example showing a region for each of the possible nine mean temperature and temperature range combinations used when generating
Parameters and material properties used to generate the training data.
The second method (later referred to as the square method), used to create a test set that mimics the experimental data introduced in a later section, defined the fluctuations as square waves. In both methods, the upper limit of
The networks were created in Python 3 40 using the Keras deep learning API. 41 Each network comprised a single LSTM layer with 180 neurons followed by a one-unit dense layer. An 85:15 train-validation split was used on training data sets, and separate unseen data sets were used for testing. All training sets were generated using the sine method. The input data were assembled such that for a given time step, the data passed to the network contained two values: the value at the current time step and the value at one previous time step. The Adam optimiser 42 was used in conjunction with Equation (6), which describes the exponentially decaying learning rate.
The mean square error was used as the loss function. Early stopping based on 25 epochs without reduction of the validation set loss was used to prevent the networks from over-fitting to the training data. The networks were trained with a 12th Gen Intel core i7 processor CPU of a desktop PC. This PC was used throughout the work presented in this article.
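The two training controls described above (an exponentially decaying learning rate and patience-based early stopping) can be sketched generically. The initial rate and decay constant below are placeholders, not the values of Equation (6), and the class is a stand-in for the equivalent Keras callback.

```python
import math

def exp_decay_lr(epoch, lr0=1e-3, decay=0.05):
    """Exponentially decaying learning rate (generic form; the constants
    of the paper's Equation (6) are not reproduced here)."""
    return lr0 * math.exp(-decay * epoch)

class EarlyStopping:
    """Stop training after `patience` epochs without improvement of the
    validation loss, mirroring the 25-epoch criterion in the text."""
    def __init__(self, patience=25):
        self.patience, self.best, self.bad_epochs = patience, math.inf, 0

    def should_stop(self, val_loss):
        if val_loss < self.best:
            self.best, self.bad_epochs = val_loss, 0  # improvement: reset counter
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

Early stopping of this kind caps training at the point where the validation loss plateaus, which is what prevents the networks from over-fitting to the simulated training set.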
A manual ‘trial and error’ approach was used instead of formal optimisation strategies for the LSTM networks for two reasons. Firstly, this work set out to determine whether machine learning might be a less computationally expensive alternative to physics-based methods, rather than to find the best (highest accuracy) machine learning method. Secondly, the first trial implementation (a single 50-neuron LSTM layer followed by a one-unit dense layer) already yielded good results. Furthermore, better performance (defined as lower RMSE/mean error) should only be judged on experimental data. However, this poses a limitation due to the lag of RTDs:15,18,23 during experiments, the LSTM predictions are compared to a reference measurement rather than a ground truth. Hence, a lower RMSE/mean error on experimental data would mean that the inherent lag of the RTDs is being learnt, which is not beneficial; a sensitivity study of changes in the performance metrics was therefore not attempted.
Simulation studies
It was expected that if an LSTM network could predict the temperature at a single distance from the water–metal interface, a more complex network would be able to predict the full temperature profile of a component. In this work, a single-distance network was investigated to confirm whether the LSTM architecture was a suitable choice for data inversion. The single distance was chosen to be 5 mm from the water–metal interface to match the reference RTD distance in the experimental data (introduced later). A training set comprising 100 regions was created using the sine method, simulating an EN32B mild steel block exposed to water on one side and to constant-temperature air on the other. A single shear wave was used since the changes in TOF were due to temperature changes only (the component thickness remained constant). The shear wave velocity–temperature relationship of the sample is given in Equation (7).
LSTM networks were then trained with this data set. Following training, the performance of the networks was evaluated using an initial simulated test set with the water temperature defined with the square method to switch between hot and cold, with the magnitudes and periods of exposure matching the experimental data set. The simulation parameters are shown in Table 3 where the symbols have the same definitions as in Table 1.
Properties used to generate the simulated training and test data (EN32B mild steel).
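The forward relationship between a temperature profile and the measured shear TOF can be sketched as follows. A linear velocity–temperature law v(T) = A + B·T is assumed here, with A and B as illustrative placeholders rather than the Equation (7) coefficients, and a pulse-echo (two-way) path through equal-width cells.

```python
import numpy as np

def shear_tof(T_profile, thickness, A=3200.0, B=-0.6):
    """Pulse-echo shear-wave time of flight through a component with a
    through-thickness temperature profile, assuming a linear
    velocity-temperature relationship v(T) = A + B*T (A, B here are
    illustrative placeholders, not the paper's Equation (7) values)."""
    T = np.asarray(T_profile, dtype=float)
    dx = thickness / len(T)          # equal-width cells across the thickness
    v = A + B * T                    # local shear velocity in each cell, m/s
    return 2.0 * np.sum(dx / v)      # two-way travel time, s
```

With B negative (velocity falls as the steel warms), heating any part of the profile increases the TOF, which is the sensitivity the inversion methods exploit.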
Deep ensemble
To observe the influence and minimise the impact of the random initialisation of the network weights each time a new network was trained, 100 networks (with identical architectures) were trained using the same training set. These networks were used to create a deep ensemble. Deep ensembles of machine learning networks have been shown to increase prediction accuracy and provide a measure of uncertainty. 33
To determine a sufficient number of networks (members) in the ensemble, increasing numbers of members were included in the ensemble and the SD of the RMSE was computed. This process was repeated with random (unique) shuffles of the order in which the networks were included in the ensemble. However, only 20,000 shuffles were considered since, for 100 networks, it would be unrealistic to consider all possible orderings.
The range of the RMSE SD was compared to the (peak-to-peak) tolerance range of a class A RTD, evaluated at the mean temperature of the simulated data set.
This comparison is shown in Figure 8. As expected, as the number of ensemble members reaches the maximum number, the range of the SD of the RMSE reduces to zero since a change in the inclusion order of networks becomes less significant. The ensemble SD range falls below that of the RTD after 10 members. While 10 members could be argued to be sufficient, the authors decided to be conservative by including 25 members so that variations due to the simulations would be smaller than those expected in experimental measurements. For the avoidance of doubt, 25 of the 100 networks were randomly selected, and the same 25 networks were subsequently used throughout this article for all simulation and experimental studies.
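One plausible reading of the member-count study can be sketched as follows: for each candidate ensemble size m, the inclusion order is shuffled many times, the SD of the first m members' RMSEs is recorded, and the spread of that SD across shuffles is examined. This is an illustrative interpretation, not the authors' code, and uses far fewer than 20,000 shuffles.

```python
import random
import statistics

def sd_range_vs_members(member_rmses, n_shuffles=200, seed=0):
    """For each ensemble size m, shuffle the order in which networks are
    added, record the SD of the first m members' RMSEs, and return the
    spread (max - min) of that SD across the shuffles."""
    rng = random.Random(seed)
    n = len(member_rmses)
    ranges = {}
    for m in range(2, n + 1):
        sds = []
        for _ in range(n_shuffles):
            order = member_rmses[:]
            rng.shuffle(order)                  # random inclusion order
            sds.append(statistics.stdev(order[:m]))
        ranges[m] = max(sds) - min(sds)
    return ranges
```

As in Figure 8, the spread collapses to zero once every network is included, because the inclusion order then no longer matters.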

The upper panel shows the standard deviation of RMSE of the ensembles for increasing numbers of members in the LSTM ensemble for predictions on the baseline simulation test set. The standard deviations of the RMSE are shown for 20,000 unique random shuffles of the order in which networks are considered for inclusion in the ensemble. Values that were more than 1.5 × the interquartile range are shown by the circles. The lower panel shows the range of standard deviations of the RMSE (which are shown in the upper panel). In the lower panel, the solid black line denotes the tolerance range of a class A RTD 22 as defined by Equation (9).
Robustness against out-of-distribution data
Out-of-distribution data (OODD) are data that a trained network has never been exposed to because the training set does not capture it. The response of the LSTM ensemble to OODD was explored to assess two areas. Firstly, to assess the magnitude of the impact of OODD on the ensemble, that is, how badly wrong the predictions become on previously unseen data. Secondly, to assess whether the deep ensemble could detect when the model is working outside of the predefined domain of operation using the SD of the ensemble predictions. The second area encompasses quantification of the epistemic uncertainty. Epistemic uncertainty arises from a lack of knowledge about the data-generation method, resulting in uncertain network parameters. The uncertainty can be reduced by increasing the amount of relevant training data, provided that the training data aligns closely with the test data. However, it is important to note that in this work the epistemic uncertainty cannot be fully eliminated because simulation data has been used to approximate real-world data, leading to an inherent difference between training and test data. 33
Two independent OODD scenarios that were deemed most likely to occur in the real world were explored: incorrect component thickness, and incorrect velocity–temperature relationship coefficients.
Values of the velocity–temperature relationship coefficients used to generate the velocity OODD test sets. The percentage changes are relative to the values used when generating the base test set.
OODD: out-of-distribution data. The base dataset is highlighted in bold.
Values of the thicknesses used to generate the thickness OODD test sets. The percentage changes are relative to the values used to generate the base test set using absolute magnitudes.
OODD: out-of-distribution data. The base dataset is highlighted in bold.
Experimental studies
The ‘temperature fluctuations only’ experiment data were provided by Zhang and Cegla 23 for use as a test set to evaluate the real-world performance of the 25 previously trained LSTM networks. The ITM MATLAB 43 code was also provided and then rewritten in Python. The experiment comprised a sample of EN32B mild steel
Prediction time
To investigate the computational time of the LSTM ensemble, single LSTM network and ITM method, their respective Python codes were run 20 times using the experimental test set. Their respective computation times were measured (using the perf_counter() function 44 ) and averaged. The snippets of code that were timed only included instructions explicitly related to making predictions.
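The timing procedure can be sketched generically. This is an illustrative harness (not the authors' code) that times only the prediction call, as described above, and converts the mean duration into the computation-to-real-world time ratio reported later.

```python
import time

def mean_prediction_time(predict_fn, data, n_runs=20):
    """Time only the prediction call, repeated n_runs times, using
    time.perf_counter(), and return the mean wall-clock duration."""
    durations = []
    for _ in range(n_runs):
        t0 = time.perf_counter()
        predict_fn(data)                       # only the prediction is timed
        durations.append(time.perf_counter() - t0)
    return sum(durations) / n_runs

def realtime_ratio(mean_time, data_duration):
    """Computation-to-real-world time ratio in percent; values below
    100% indicate the method can keep up in real time."""
    return 100.0 * mean_time / data_duration
```

`perf_counter()` is preferred over `time()` here because it is monotonic and has the highest available resolution for interval measurement.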
Results and discussion
Throughout this section, Equation (10) was used to define the error between the true (simulated) or reference (RTD) measurement and the inversion method (LSTM or ITM) predictions whilst Equation (11) was used to define the error between the true (simulated) or reference (RTD) measurement and the external surface temperature. The subscript 5 mm refers to the distance from the water–metal interface that the simulated or experimental temperature measurements are taken from as the true and reference measurement, respectively.
Simulation studies
Figure 9 shows the results of the initial simulated test set. The top panel of Figure 9 shows the true (simulated) temperature 5 mm from the water–metal interface and the mean temperature predictions of the 25 LSTM networks at the same spatial location. One SD above and below the mean of the 25 networks is also superimposed to demonstrate the aforementioned impact of the random initialisation of each network. The bottom panel shows the error relative to the true temperature for the mean LSTM predictions as well as for the temperature at the steel block’s outer surface (the air–metal interface).

Initial simulated test set results. Top panel: simulated temperature 5 mm from the water–metal interface
Figure 10 shows the RMSE and mean error for the velocity OODD test sets taken over all time steps for each network. The metrics based on the external temperature are also shown as a reference. As both

Performance metric boxplots for each trained network on the velocity OODD test sets. The percentage deviation refers to the
This can be explained intuitively with the following scenario. Consider a block of thickness
Suppose the TOF is measured at this given temperature, then the apparent temperature could be computed by rearranging Equation (12) using A+ and B+. The apparent temperature would underpredict the true temperature and hence this behaviour will be learnt by the network during training. The inverse of this effect was expected if
Figure 11 shows the RMSE and mean error for the thickness OODD test sets. The metrics based on the external temperature are also shown as a reference. As in the velocity OODD cases, the absolute magnitudes of both the RMSE and mean error increase as the parameter (thickness) of the test sets diverges from the value used during training. However, it was expected that for the thickness OODD a positive deviation (OODD cases 4–6, Table 5) would cause the LSTM networks to over-predict the true temperature and that the error (according to Equation (10)) would therefore be negative. This can again be explained by imagining a scenario similar to that described for the velocity OODD case, except that this block has true thickness

Performance metric boxplots for each trained network on the thickness OODD test sets. The base set (30.0, 0.0%) is included for reference. The box plots and crosses denote the metrics for the predictions by each of the LSTM networks and the external surface temperature, respectively. Prediction metrics for LSTM networks that were more than 1.5 × the interquartile range are shown by the circles. (a) RMSE, (b) mean error.
In both OODD scenarios, the RMSE and mean error of the temperature predictions both grow with increasing deviation of the test parameters from those used in training. This error manifests as a constant offset, which is shown by the mean error (Figures 10(b) and 11(b)). However, the RMSE box plots for both scenarios (Figures 10(a) and 11(a)) demonstrate that using an ensemble of many networks can help to diagnose this issue, since the SD becomes larger as the deviation of parameters grows. This behaviour has previously been exploited for uncertainty quantification. 33 The influence of the size of the LSTM ensemble was not investigated; a smaller ensemble may be equally informative whilst reducing computational expense. It should be noted that this issue of over/under-prediction is also expected to affect the ITM method, and it is not possible to apply the ensemble method to ITM.
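The ensemble-spread diagnostic described above can be sketched simply: at each time step, the SD across the independent members' predictions is compared against a threshold, and steps where it is exceeded are flagged as potentially out-of-distribution. The threshold and data below are illustrative assumptions.

```python
import statistics

def flag_out_of_distribution(member_predictions, sd_threshold):
    """Flag time steps at which the spread (standard deviation) of the
    independent ensemble members' predictions exceeds a threshold,
    indicating the networks may be operating on out-of-distribution
    data. member_predictions: one prediction series per network."""
    n_steps = len(member_predictions[0])
    flags = []
    for k in range(n_steps):
        sd = statistics.stdev(p[k] for p in member_predictions)
        flags.append(sd > sd_threshold)
    return flags
```

A threshold tied to the sensor tolerance (for example, the class A RTD range used earlier) would be a natural choice, though the paper does not prescribe one.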
Experimental studies
The top panel of Figure 12 shows the mean predicted temperatures across the 25 LSTM networks with one SD above and below this value superimposed. The predictions by the ITM method as well as the embedded RTD measurements are also shown as references. The bottom panel of Figure 12 shows the errors based on Equations (10) and (11). To achieve the predictions shown in Figure 12,

Experimental test set results. Top panel: RTD-measured temperature 5 mm from the water– metal interface
The corrected thickness,

Performance metrics for the LSTM networks, ITM and external surface temperature predictions made on the experimental test set. The box plots, tri-stars and triangles denote the metrics for predictions by each of the LSTM networks, the external surface temperature and ITM method, respectively. Prediction metrics for LSTM networks that were more than 1.5× the interquartile range are shown by the circles. (a) RMSE, (b) Mean error.
The ITM implementation used a similar method to adjust the assumed thickness using the period in which the block temperature is uniform and then rearranging Equation (12) to obtain
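A plausible form of this thickness adjustment can be sketched: while the block sits at a known uniform temperature, the velocity is known from the velocity–temperature law, so a pulse-echo TOF measurement yields the effective thickness directly. The linear law v(T) = A + B·T and its coefficients below are illustrative assumptions, not the paper's Equation (12) values.

```python
def corrected_thickness(tof_uniform, T_uniform, A=3200.0, B=-0.6):
    """Recover the effective component thickness from a pulse-echo TOF
    measured while the block is at a known uniform temperature,
    assuming v(T) = A + B*T (illustrative placeholder coefficients)."""
    v = A + B * T_uniform          # velocity is known at a uniform temperature
    return v * tof_uniform / 2.0   # two-way path: d = v * TOF / 2
```

A correction of this kind absorbs small errors in the assumed thickness into the forward model, which is what brings the LSTM and ITM predictions into agreement with the RTD reference.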
Prediction time
The percentage ratio,
If
As in Equation (14),
Value of each correction factors used in Equation (15) to compute the adjusted mean computation times.
LSTM: long-short-term memory; ITM: inverse thermal modelling.
Table 7 shows both the raw
Raw mean computation time
LSTM: long-short-term memory; ITM: inverse thermal modelling.
For the prediction of temperature-gradient-induced stresses, some form of spatial resolution is required. ITM satisfies this requirement as the full (151-node) spatial grid must be computed at each time step. In contrast, the presented LSTM networks only computed a single spatial point. Therefore, spatial resolution could be achieved using multiple LSTMs for different spatial locations (a spatial ensemble). Since the raw computation time for LSTM scaled (approximately) linearly with ensemble size, a spatial ensemble is expected to follow similar behaviour. This is not to say that a spatial LSTM ensemble would need 151 members (to match ITM), since this discretisation is expected to exceed the spatial resolution necessary for predicting the thermo-mechanical stresses that cause HCTF. Hence, increasing the number of LSTM spatial outputs is not expected to increase LSTM prediction times prohibitively, that is,
Another factor to consider is that the experimental data sampling rate (0.25 Hz) would alias the maximum critical HCTF frequency at Civaux (1 Hz). Therefore, the measurement sampling frequency would have to be increased to at least 2 Hz 39 to properly capture 1 Hz fluctuations. At 2 Hz, the ITM method prediction time and
The adjusted computation times for each of the methods are expected to be sufficient for real-time monitoring considering the critical range of HCTF frequencies (0.1–1 Hz) for Civaux. Nevertheless, further investigation is required to determine whether these speeds are achievable on lightweight, field-deployable hardware. During the development of the LSTM and ITM codes, speed and efficiency were not prioritised. Therefore, the execution speed of both codes might be improved through careful programming of the respective methods. For the LSTM ensemble such techniques might include:
Reduction of the number of ensemble members
Conversion of the ensemble into a single compact ‘multi-headed’ network45
Network simplification by pruning46,47 (also applicable to a single network)
Performing a hyperparameter search to reduce network complexity, for example, the number of neurons (also applicable to a single network)
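One of the options above, the ‘multi-headed’ compaction, can be sketched structurally as follows (Python/NumPy; an illustration under our own assumptions, since a real compaction would involve retraining or distilling the members): the shared recurrent body is run once and all member readouts become rows of a single weight matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
n_hidden, n_members = 32, 25

# Hidden state produced by ONE forward pass of the shared recurrent body
# (in the full method this would be the LSTM layers common to all heads).
h = rng.standard_normal(n_hidden)

# Stacking the 25 per-member readout layers into one matrix turns
# 25 separate forward passes into a single matrix-vector product.
W_heads = 0.1 * rng.standard_normal((n_members, n_hidden))
b_heads = np.zeros(n_members)
member_predictions = W_heads @ h + b_heads  # one prediction per 'member'
ensemble_estimate = np.median(member_predictions)
```

The expensive recurrent computation is then amortised across all heads, while the per-member diversity (useful for error detection) is retained in the separate readouts.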
It is important to highlight that the same 25-member LSTM ensemble was used throughout the simulation and experimental studies. The networks were trained purely on simulation data, yet made predictions on experimental data with accuracy similar to that of the ITM method (once a correction factor was applied to address the difference between the assumed training-data thickness and the true experimental thickness). Furthermore, the LSTM networks were trained on simulated data sampled at 0.5 Hz. When predictions were made on the experimental data, which were sampled at 0.25 Hz, no interpolation was applied, meaning the LSTM networks were predicting on sparse data.
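For reference, ensemble-median error metrics of the kind quoted above can be computed as in the sketch below (Python/NumPy; the exact aggregation, a median over per-member error statistics, is our assumed reading of the reported figures):

```python
import numpy as np

def ensemble_metrics(member_preds, reference):
    """member_preds: (n_members, n_samples) temperature predictions;
    reference: (n_samples,) RTD measurement. Returns the ensemble
    median RMSE and ensemble median mean error across members."""
    errors = member_preds - reference  # broadcast over members
    rmse_per_member = np.sqrt(np.mean(errors ** 2, axis=1))
    mean_err_per_member = np.mean(errors, axis=1)
    return np.median(rmse_per_member), np.median(mean_err_per_member)

# Toy check: three members offset from the reference by 1, 2 and 3 degC.
reference = np.zeros(100)
members = reference + np.array([[1.0], [2.0], [3.0]])
rmse, mean_err = ensemble_metrics(members, reference)
```

Using the median rather than the mean across members makes the reported figure robust to a single badly behaved network.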
Conclusions
HCTF in NPP mixing zones is driven by large temperature gradients, that is, large differences between the interior and exterior wall temperatures of a pipe. It was previously shown that, using the physics-based ITM method, the inaccessible pipe wall temperature can be estimated to within 2°C from an external temperature measurement and the ultrasonic TOF. However, the ITM method was perceived to be relatively slow, requiring 423 s to invert the full data set on a 12th Gen Intel Core i7 CPU in a desktop PC. For field deployment, less powerful processors would most likely be available, and therefore this study investigated whether the LSTM machine learning architecture would be less computationally intensive than the ITM method whilst achieving comparable accuracy.
It was found that, relative to a resistance temperature detector measurement, the 25-member LSTM ensemble achieved an ensemble median RMSE of 1.04°C and an ensemble median mean error of 0.194°C. This is almost identical to the performance of the ITM method, which achieved a RMSE and mean error of 1.04°C and 0.196°C, respectively. These key metrics demonstrate that LSTM networks can perform as well as the ITM method if parameters such as the component thickness and velocity–temperature relationship coefficients used during training are in perfect agreement with the (unseen) test set. However, small differences between the training and testing parameters were found to produce unacceptable errors; these errors can be detected via the disagreement within an ensemble of independently trained networks and corrected by simply adding a correction factor to the TOF before it is input into the networks.
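The detect-and-correct behaviour described in this work can be sketched as follows (Python/NumPy; `SPREAD_LIMIT_C` and the callable-network interface are hypothetical illustrations, not the study's implementation): wide member disagreement flags a parameter mismatch, which is then compensated by shifting the TOF before it enters the networks:

```python
import numpy as np

SPREAD_LIMIT_C = 2.0  # hypothetical acceptance threshold on member disagreement (degC)

def predict_with_check(networks, ext_temperature, tof, tof_correction=0.0):
    """Run every ensemble member on the (external temperature, corrected TOF)
    input. A wide spread across members signals a mismatch between training
    and real-world parameters (e.g. component thickness); re-running with an
    adjusted tof_correction compensates for it."""
    x = np.array([ext_temperature, tof + tof_correction])
    preds = np.array([net(x) for net in networks])
    return np.median(preds), bool(np.ptp(preds) < SPREAD_LIMIT_C)

# Toy members that agree closely -> prediction accepted.
members = [lambda x, b=b: float(x.sum() + b) for b in (-0.1, 0.0, 0.1)]
estimate, accepted = predict_with_check(members, ext_temperature=20.0, tof=5.0)
```

Because the check needs only the spread of the member outputs, it adds essentially no cost beyond running the ensemble itself.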
The aspects that affect the computation time of a temperature prediction using the LSTM and ITM methods were also discussed. While, for the implementations in this work, the LSTM appeared considerably faster at performing temperature estimates for a full data set, the ITM method actually had a lower computation time per temperature estimate. However, for stability the ITM method must perform predictions at very small time steps, which requires many interim computations when the sampling rate is relatively slow, that is, 0.25 Hz. This means that the ITM computation time will be unaffected by an increase in sampling rate unless the increase exceeds the stable ITM time step. On the other hand, the LSTM method would need to perform more computations for an increased sampling rate, increasing computation time. A similar argument applies in space: for the prediction of temperature-gradient-induced stresses some form of spatial resolution will be required. The ITM method requires the use of a spatial grid of 151 nodes at every time step, whereas the LSTM would require one network (or output head) per spatial location of interest.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors would like to acknowledge funding from EPSRC for project funding as part of the FIND CDT (EP/S023275/1).
