
Abstract

Recently, in the field of structural health monitoring, the detection of bolted connection looseness through percussion-based methods and machine learning has received much attention because it removes the need for sensor installation and has potential for automation. However, few such studies have been performed in the underwater environment. This paper proposes a new method, Feature-reduced Multiple Random Convolution Kernel Transform (FM-ROCKET), to identify the looseness level of underwater bolted connections based on the percussion-induced sound (audio signal). By integrating deep learning (DL) and shallow learning, the FM-ROCKET model uses a 1D convolutional layer (a DL method) to extract features from the percussion-induced audio signal and adopts a ridge classifier (a linear classifier, a shallow learning method) to classify the features. Five different preload levels of the bolted flange are considered. A hammer is used to tap the flange surface, and the continuous percussion-induced audio signal is collected by a smartphone in an underwater environment. After audio signal segmentation, single-hit audio signals are fed into the FM-ROCKET model. To verify the effectiveness of the proposed method, three case studies are conducted on two flanges. In case study I, the proposed method slightly outperforms other DL-based methods under different training/test splitting ratios. In case studies II and III, the proposed method is far more effective than other DL-based methods on independent and different test sets. The results demonstrate the superiority of the FM-ROCKET model in the underwater detection of bolted flange looseness. To the best of our knowledge, this article is the first attempt to address the detection of bolted flange looseness in the underwater environment by combining a percussion-based method, DL, and shallow learning.

Keywords

Bolted connection looseness; flange looseness detection; underwater bolt looseness detection; structural health monitoring; percussion-induced sound; Random Convolution Kernel Transform

Introduction

From the early industrial age to modern times, bolted connections have been a reliable way of joining structural components and are widely used in many industries.¹ For example, the offshore oil industry employs pipelines with flange connections to transport oil from the seabed to the land.² Underwater pipelines, which stretch for miles, are often joined by bolted flanges. However, such connections still suffer from problems such as bolt looseness, which may result from chemical erosion, mechanical vibration, impact from foreign objects, and improper installation.^3,4 That is, in a pipeline system, a bolted connection represents a point of vulnerability and is prone to self-loosening due to uncertainties, which might lead to disastrous consequences and economic losses. Therefore, it is important to detect bolt looseness in subsea flanges.

In the past decades, benefiting from the rapid development of structural health monitoring (SHM),^5–7 researchers have contributed to the detection of bolted connection looseness in the air.^8–10 Notably, detection approaches based on piezoelectric materials¹¹ have been developed. By analyzing the energy dissipation caused by tangential damping, Wang et al.¹² presented an active sensing method for quantitative monitoring of bolt looseness. Later, Wang et al.¹³ developed a novel electromechanical impedance model for monitoring bolt looseness, which illustrates the relationship between the mechanical impedance of the bolted joint and the electrical impedance of a piezoceramic patch mounted on the joint. Then, by combining spectral sidebands and high-order harmonics, Zhang et al.¹⁴ introduced a contact acoustic nonlinearity-based monitoring system for detecting bolt looseness. Meyer and Adams¹⁵ used the impacted-acoustic modulation method to detect bolt looseness, where the impact modulation results are quantified using an integration-based metric that increases as the preload force on the bolts decreases. Moreover, Li and Jing¹⁶ proposed a novel second-order output spectrum-based method to detect multi-bolt loosening faults in complex structures with a sensor chain. By introducing machine learning techniques, Wang and Song¹⁷ further developed a novel vibro-acoustic method for bolt looseness detection, which outperforms the traditional vibro-acoustic method. Although the above detection methods have demonstrated their effectiveness, they were all implemented in the air and might be limited under water because they require constant contact between structures and transducers. In addition, sensor installation requires additional human labor and financial costs in some complex situations.

Taking advantage of computer vision technology, researchers have developed non-contact detection methods for bolted connection looseness. Cha et al.¹⁸ applied image processing and a support vector machine (SVM) to bolt looseness detection, where feature classification is based on the horizontal and vertical lengths of the bolt heads. However, this method requires that the bolts be located in the middle of the image and that, for each test, the bolt connections have the same layout. Ramana et al.¹⁹ improved the method of Cha et al.¹⁸ by using the Viola–Jones algorithm to automatically localize the bolt in the image. Recently, deep learning-based (DL-based) methods^20–23 that discriminate the rotation angle of the nuts were proposed for identifying loose bolts. Nonetheless, vision-based studies all overlook the fact that, in the early stage of bolted connection looseness, there may be no visible change in the rotation angle or position of the nut. Moreover, because of refraction, reflection, and shadows in the underwater environment, the camera view may become blurred and fail to acquire accurate images.

To overcome the above drawbacks, the non-destructive percussion-based method,^24,25 an ancient but effective technique, has regained attention in recent years and has achieved promising performance in the detection of bolted connection looseness. Kong et al.²⁶ proposed a new percussion-based approach to determine the preload level of bolted joints with a decision tree. Zhang et al.²⁷ combined the mel-frequency cepstral coefficients (MFCC) feature matrix with the principal components obtained by principal component analysis (PCA) of the percussion-induced audio signal, and then used these feature representations to train and test an SVM model. Next, with the rapid development of artificial intelligence, more DL technologies have been applied to identify percussion-induced audio signals. Yuan et al.²⁸ employed multiscale-entropy analysis to extract underlying characteristics and fed them into a back propagation neural network for training and testing. Moreover, Wang and Song²⁹ developed the 1D training-interference CapsNet, which combines feature extraction and classification. Similarly, the percussion-based method has also been employed to detect damage in other structures, such as cup-lock scaffolds,³⁰ aluminum spatial structures,³ and timber columns.³¹

Although many studies have been reported on the detection of bolted connection looseness by the percussion-based method, most of them were conducted in the air. A few articles^32–34 focused on underwater bolt looseness detection; however, they did not use the percussion-based method and paid more attention to the design of the detection device than to the detection technology. Jiang et al.³⁵ conducted a feasibility study of subsea bolt looseness detection using active sensing enabled by piezoceramic transducers; however, they only analyzed the lead zirconate titanate (PZT) signals under different preload levels and revealed the differences between these signals. Subsequently, based on active PZT sensing and entropy theory, Wang et al.³⁶ proposed a stacking-based ensemble learning method to detect bolted connection looseness under water, but only two preload levels (tightened and loose) were considered and sensor installation was required in the underwater experiments.

Overall, the above literature shows that many novel methods, such as signal processing, machine learning, and DL technologies, have facilitated the development of bolted connection looseness detection. However, three issues in this field have not received the attention they deserve. First, most detection studies using percussion-induced sound achieved satisfactory results in the air, while their models might not obtain the desired results under water. Second, current sensor-based underwater detection considers only two preload levels (tightened and loosened). Third, most percussion-based detection studies train and test their models on a single dataset with a certain training/test splitting ratio, while the performance of their models on independent test sets has not been fully verified. In other words, neither the robustness of the models to environmental and operational variants nor their robustness to different detection objects with similar structures has been fully verified.

To address the above three problems, we propose a new strategy to detect bolted flange looseness in the underwater environment using the percussion-based method and Feature-reduced Multiple Random Convolution Kernel Transform (FM-ROCKET). The FM-ROCKET model is developed from the Multi-ROCKET model,³⁷ a newly emerged convolution-based model. The FM-ROCKET model uses a 1D convolutional layer to extract features from percussion-induced audio signals and employs a linear classifier to classify the features, achieving better performance than other state-of-the-art DL-based methods in SHM. In summary, the main contributions of this article are as follows:

This article, for the first time, studies the underwater detection of bolted flange looseness through an integrated percussion and DL method.

Based on the features computed by the Multi-ROCKET model, we modify two of the existing features with different scale factors. Experimental results from three case studies (I, II, III) show the effectiveness of the modified feature representation.

Compared to other advanced DL-based methods in SHM and other fields, the proposed FM-ROCKET model achieves better performance on independent datasets collected under different scenarios (assembly, operator, temperature, time, object), which shows the robustness of the proposed method to environmental and operational variants (case study II) and to different detection objects with similar structures (case study III).

Compared to the Multi-ROCKET model, the proposed FM-ROCKET model achieves similar classification performance in case studies I and III. However, the proposed method obtains better performance in case study II, where the training set and the corresponding test set are collected from the same flange but are independent of each other, which shows better robustness to environmental and operational variants.

The rest of this paper is organized as follows: Section ‘Feature extraction: MFCC’ introduces the MFCC, which we use as a baseline for comparison with the proposed method. Section ‘The proposed method: FM-ROCKET’ elaborates the relevant theoretical background and the proposed method. Section ‘Experimental setup’ describes the experimental setup. Section ‘Results and discussion’ presents the experimental results and the corresponding discussion, and Section ‘Conclusion’ concludes the paper.

Feature extraction: MFCC

In audio signal processing, many technologies can be used to extract representative features from the audio signal, such as bark band energy features,³⁸ power spectral density,^26,39 and so on. Among them, MFCC,^30,40,41 which is based on the human perceptual frequency range, is an effective and commonly used feature representation of audio signals. In this paper, MFCC is mainly adopted as a baseline for comparison with the proposed method in the underwater detection of bolted flange looseness, so we briefly introduce it here. Figure 1 illustrates the main steps of MFCC processing, and the corresponding formulas are as follows:

Figure 1.

The flow chart of mel-frequency cepstral coefficients (MFCC) processing.

(1) Pre-emphasis

The pre-emphasis is used to balance the spectrum of the audio signal that has a steep roll-off in the high-frequency region by a high-pass filter.

$$s(n) = s(n) - \alpha\, s(n-1), \quad 1 \leq n \leq N$$

(1)

where $s (n)$ is a digitized audio signal with length $N$ and $α$ is the pre-emphasis coefficient.
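A minimal Python sketch of Equation (1) follows. The coefficient value 0.97 is an assumed, commonly used default (the text does not state the value of $α$), and keeping the first sample unchanged is an illustrative boundary choice.

```python
def pre_emphasis(signal, alpha=0.97):
    """Apply y(n) = s(n) - alpha * s(n - 1); the first sample is kept unchanged."""
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out


# A constant (low-frequency) signal is strongly attenuated after the first sample.
flat = pre_emphasis([1.0, 1.0, 1.0, 1.0], alpha=0.97)
# An alternating (high-frequency) signal is amplified.
alternating = pre_emphasis([1.0, -1.0, 1.0, -1.0], alpha=0.97)
```

The two toy inputs illustrate the high-pass behavior: slowly varying content is suppressed while rapidly varying content is emphasized.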

(2) Framing and windowing

Next, according to the specific frame length and step, the signal is split into frames,

$$s = \{\, s_{1}^{L},\ s_{2}^{L},\ \dots,\ s_{R}^{L} \,\}$$

(2)

where $R$ is the number of frames and $L$ is the length of a single frame. Subsequently, each frame is processed by a Hamming window to enhance the harmonics and smooth the curves,

$$s_{i}^{L} = h \times s_{i}^{L}, \quad 1 \leq i \leq R$$

(3)

$$h(j) = 0.54 - 0.46 \cos\left(\frac{2\pi (j-1)}{L-1}\right), \quad 1 \leq j \leq L$$

(4)

where $s_{i}^{L}$ is the $i$th frame and $h$ is the Hamming window with length $L$.
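The framing and windowing steps of Equations (2) to (4) can be sketched as below. The frame length and step in the usage line are arbitrary illustrative values, and dropping a trailing incomplete frame is an assumption, since the text does not say how the signal tail is handled.

```python
import math


def hamming(length):
    """h(j) = 0.54 - 0.46 cos(2 pi (j - 1) / (L - 1)), written with 0-based j."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * j / (length - 1))
            for j in range(length)]


def frame_and_window(signal, frame_len, step):
    """Split `signal` into frames and weight each frame by a Hamming window."""
    window = hamming(frame_len)
    frames = []
    # Trailing samples that do not fill a whole frame are dropped (assumption).
    for start in range(0, len(signal) - frame_len + 1, step):
        frame = signal[start:start + frame_len]
        frames.append([w * s for w, s in zip(window, frame)])
    return frames


frames = frame_and_window([1.0] * 10, frame_len=4, step=2)
```

Here the product in Equation (3) is taken sample by sample, so each frame keeps its length $L$.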

(3) Fourier transform

In the third step, each frame is transformed into a spectrum by the discrete Fourier transform (DFT),

$$S_{i}(k) = \sum_{o=1}^{L} s_{i}^{L}(o)\, e^{-\frac{j 2\pi k o}{L}}, \quad 1 \leq i \leq R,\ 1 \leq k \leq L$$

(5)

where $S_{i}$ is the spectrum of the $i$th frame (its magnitude $|S_{i}(k)|$ is used in the following step), $j$ in the exponent denotes the imaginary unit, and the number of points used to compute the DFT is equal to the frame length $L$.
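As an illustration of Equation (5), the sketch below computes the squared magnitude of the DFT of one frame with a direct sum. It uses the conventional 0-based indices (an assumption; Equation (5) is written with 1-based indices) and is deliberately naive; in practice a fast Fourier transform routine would normally be used instead.

```python
import cmath


def power_spectrum(frame):
    """Return |S(k)|^2 for k = 0..L-1 using a direct O(L^2) DFT."""
    length = len(frame)
    spectrum = []
    for k in range(length):
        acc = sum(frame[o] * cmath.exp(-2j * cmath.pi * k * o / length)
                  for o in range(length))
        spectrum.append(abs(acc) ** 2)
    return spectrum


# A constant frame has all of its energy in the zero-frequency bin.
ps = power_spectrum([1.0] * 8)
```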

(4) Mel-filtering

In this step, we predefine a Mel-filter bank,

$$bank = \{\, Mel_{1},\ Mel_{2},\ \dots,\ Mel_{M} \,\}$$

(6)

$$Mel_{m}(k) = \begin{cases} 0 & g(k) < f(m) \\ \frac{g(k) - f(m)}{f(m+1) - f(m)} & f(m) \leq g(k) \leq f(m+1) \\ \frac{f(m+2) - g(k)}{f(m+2) - f(m+1)} & f(m+1) \leq g(k) \leq f(m+2) \\ 0 & g(k) > f(m+2) \end{cases}, \quad 1 \leq m \leq M,\ 1 \leq k \leq L$$

(7)

where $M$ is the total number of the Mel-filters. The functions $g (k)$ and $f (m)$ are defined as follows:

$$g(k) = \frac{(k-1) \cdot C}{L-1}$$

(8)

$$f(m) = 700 \times \left(10^{Mh(m)/2595} - 1\right)$$

(9)

where $C$ is the sampling frequency and $g(k)$ maps the $k$th DFT point to the corresponding frequency value in Hertz. $f(m)$ denotes the center frequency (Hertz) of the $m$th filter, converting the frequency from the Mel scale to the Hertz scale. $Mh(m)$ is the $m$th user-defined frequency on the Mel scale. Then, by passing each frame through the Mel-filter bank, the Mel-coefficient matrix (R × M) is obtained by the following computation:

$$z_{i}(m) = \sum_{k=1}^{L} \left[ {|S_{i}(k)|}^{2}\, Mel_{m}(k) \right], \quad 1 \leq m \leq M,\ 1 \leq i \leq R$$

(10)
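The Mel-filtering step of Equations (6) to (10) can be sketched as follows. This is a simplified illustration under stated assumptions: the $M + 2$ edge frequencies are spaced equally on the Mel scale between 0 Hz and half the sampling frequency (the text leaves $Mh(m)$ user-defined), indices are 0-based, and the function names are hypothetical.

```python
import math


def mel_to_hz(mel):
    """Equation (9): f = 700 * (10^(mel / 2595) - 1)."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def hz_to_mel(hz):
    """Inverse of Equation (9)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_filter_bank(num_filters, frame_len, sample_rate):
    """Return `num_filters` triangular filters, each with `frame_len` weights."""
    # M + 2 edge frequencies, equally spaced on the Mel scale (assumed spacing).
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (num_filters + 1))
             for i in range(num_filters + 2)]
    # Equation (8): frequency in Hz of each DFT point (0-based k here).
    freqs = [k * sample_rate / (frame_len - 1) for k in range(frame_len)]
    bank = []
    for m in range(num_filters):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        row = []
        for g in freqs:
            if lo <= g <= mid:
                row.append((g - lo) / (mid - lo))   # rising slope
            elif mid < g <= hi:
                row.append((hi - g) / (hi - mid))   # falling slope
            else:
                row.append(0.0)                     # outside the triangle
        bank.append(row)
    return bank


def mel_energies(power_spectrum, bank):
    """Equation (10): weighted sum of the power spectrum under each filter."""
    return [sum(p * w for p, w in zip(power_spectrum, row)) for row in bank]


bank = mel_filter_bank(num_filters=4, frame_len=64, sample_rate=48000)
energies = mel_energies([1.0] * 64, bank)
```

Applying `mel_energies` to every frame yields one row of the R × M Mel-coefficient matrix per frame.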

(5) Discrete cosine transform

Finally, the MFCC matrix (R × T) can be obtained by taking the logarithm of the Mel-coefficient matrix above and applying the discrete cosine transform to each frame.

$$e(m) = \ln(z(m)), \quad 1 \leq m \leq M$$

(11)

$$c(n) = \sum_{m=1}^{M} e(m) \cos\left(\frac{\pi n (m+0.5)}{M}\right), \quad 1 \leq n \leq T$$

(12)

where $T$ is often set to 12.
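Equations (11) and (12) translate almost directly into code. The sketch below follows the formulas as printed for a single frame; the function name is hypothetical, and the input energies are assumed to be strictly positive (a small floor would be needed otherwise before taking the logarithm).

```python
import math


def mfcc_from_mel(mel_energies, num_coeffs=12):
    """Log-compress the Mel energies, then apply the cosine transform."""
    num_filters = len(mel_energies)
    log_e = [math.log(z) for z in mel_energies]      # Equation (11)
    coeffs = []
    for n in range(1, num_coeffs + 1):               # Equation (12)
        c_n = sum(log_e[m - 1] * math.cos(math.pi * n * (m + 0.5) / num_filters)
                  for m in range(1, num_filters + 1))
        coeffs.append(c_n)
    return coeffs


# Unit energies have zero logarithm, so every coefficient vanishes.
coeffs = mfcc_from_mel([1.0] * 20)
```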

The proposed method: FM-ROCKET

Random Convolution Kernel Transform

Derived from the typical 1D convolutional kernel in the convolutional neural network, ROCKET⁴² introduces a very large number of 1D convolutional kernels with random length, bias, dilation, weights, and padding to capture feature maps of the input time series. In particular, the length of each kernel is selected randomly from three values (7, 9, 11) with equal probability. In addition, the weights are sampled from a normal distribution and the biases are sampled from a uniform distribution. The dilation is sampled on the following exponential scale,

$$d = \lfloor 2^{x} \rfloor, \quad x \sim U\left(0, \log_{2} \frac{l_{in} - 1}{l_{\ker} - 1}\right)$$

(13)

where $l_{in}$ is the length of the input time series and $l_{\ker}$ is the length of the 1D kernel. Moreover, to center the “middle” element of the kernel on every point of the input time series, zero padding is used at the start and end of the input time series. Subsequently, each kernel goes through the input time series to generate the corresponding feature map.
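The dilation sampling of Equation (13) can be sketched as below. The input length of 10080 samples in the usage lines is derived from the settings stated elsewhere in this paper (a 0.21 s single-hit signal at 48 kHz), and the helper `receptive_field` is a hypothetical name added only to show that the sampled dilation never stretches a kernel beyond the input.

```python
import math
import random


def sample_dilation(input_len, kernel_len, rng=random):
    """Equation (13): d = floor(2^x), x ~ U(0, log2((l_in - 1) / (l_ker - 1)))."""
    upper = math.log2((input_len - 1) / (kernel_len - 1))
    x = rng.uniform(0.0, upper)
    return int(math.floor(2.0 ** x))


def receptive_field(kernel_len, dilation):
    """Number of input samples spanned by a dilated kernel."""
    return (kernel_len - 1) * dilation + 1


rng = random.Random(0)
dilations = [sample_dilation(10080, 9, rng) for _ in range(1000)]
```

Because $x$ is uniform on a logarithmic axis, small dilations are drawn far more often than large ones, while the largest possible dilation still lets the kernel span almost the whole signal.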

After the convolution, ROCKET computes two features from each feature map, that is, it produces two real values per kernel. One is the Maximum Value (MV) of the feature map. The other is the Proportion of Positive Values (PPV), which captures the proportion of positive values in the feature map. Notably, in the literature,^42,43 PPV has been shown to be a significant feature that yields meaningfully higher accuracy than other features, such as the mean value, in classification problems.

Furthermore, the literature^37,42,43 demonstrates that, with the features produced by ROCKET, a linear classifier can achieve higher classification accuracy than other classifiers, even for datasets where the number of features dwarfs both the number and length of the samples (1D sequences).

Multi-ROCKET

Unlike ROCKET, Multi-ROCKET uses a fixed length (9) for all kernels, and the weights take only two values (−1, 2). In addition, to balance the classification accuracy with the computational advantages of a small set of kernels, Multi-ROCKET adopts a fixed group of 84 kernels. This group has been shown to produce high classification accuracy³⁷ and is kept as a default parameter of Multi-ROCKET.

In the fixed group (84 kernels), each kernel uses a fixed set of dilations sampled according to Equation (13). In terms of the bias, for each kernel/dilation combination, a sample is randomly selected from the training set, the corresponding convolution output is calculated, and the quantiles of the convolution output are taken as the bias values. Besides, zero padding is applied to alternating kernel/dilation combinations. Consequently, the randomness of Multi-ROCKET comes only from the bias, and the other parameters are fixed.
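The size of the fixed group can be reproduced with a short enumeration. The sketch assumes, following the MiniRocket kernel set that Multi-ROCKET reuses, that each length-9 kernel has exactly three weights equal to 2 and six equal to −1; the text above states only the length, the two weight values, and the count of 84.

```python
from itertools import combinations


def fixed_kernels(length=9, num_high=3, low=-1, high=2):
    """Enumerate every kernel with `num_high` weights `high` and the rest `low`."""
    kernels = []
    for positions in combinations(range(length), num_high):
        kernel = [low] * length
        for p in positions:
            kernel[p] = high
        kernels.append(kernel)
    return kernels


kernels = fixed_kernels()
```

Under this assumption there are $\binom{9}{3} = 84$ kernels, and the weights of each kernel sum to zero, so a constant offset added to the input does not change the convolution output before the bias is added.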

In addition, Multi-ROCKET removes the MV and adds three new features to increase the diversity and discriminative power of the features: Mean of Positive Values (MPV), Mean of Indices of Positive Values (MIPV), and Longest Stretch of Positive Values (LSPV). Therefore, a feature map is represented by four different features, namely, PPV, MPV, MIPV, and LSPV. An example of calculating the four features is shown in Figure 2. The detailed formulas of the four features are presented as follows:

Figure 2.

Single feature map.

First, the output of the 1D convolutional operation is computed below,

$$o = x \ast k + b$$

(14)

where $\ast$ denotes the 1D convolution, $x (1 \times m)$ is the input time series with $m$ sample points, namely the percussion-induced audio signal, $k (1 \times d)$ is the kernel, $b$ is the bias, and $o (1 \times n)$ is the feature map. Then, four kinds of features are extracted from the feature map:

(1) Proportion of Positive Values

The PPV is defined as,

$$PPV(o) = \frac{1}{n} \sum_{i=1}^{n} [o(i)]$$

(15)

$$[o(i)] = \begin{cases} 1 & \text{if } o(i) > 0 \\ 0 & \text{otherwise} \end{cases}$$

(16)

where $n$ is the length of the feature map and $o(i)$ is the value of the $i$th point of the feature map. PPV captures the proportion of positive values of the feature map and works together with the bias term. A positive bias helps PPV capture the proportion of the feature map reflecting “weak” matches between the feature map and the given pattern $[o(i)]$, while a negative bias helps PPV capture the proportion of the feature map reflecting “strong” matches between the feature map and the given pattern.

(2) Mean of Positive Values

The MPV is defined as

$$MPV(o) = \frac{1}{m} \sum_{i=1}^{m} o_{i}^{+}, \quad i \leq n$$

(17)

where $m$ is the number of positive values of the feature map and $o_{i}^{+}$ is the $i$th positive value of the feature map. MPV is the mean of all positive values of the feature map and captures the intensity of the matches between the feature map and a given pattern.

(3) Mean of Indices of Positive Values

The MIPV is defined as,

$$MIPV(o) = \begin{cases} \frac{1}{m} \sum_{i=1}^{m} I_{i}^{+} & \text{if } m > 0 \\ -1 & \text{otherwise} \end{cases}, \quad i \leq n$$

(18)

where $I_{i}^{+} (\leq n)$ is the index of the $i$th positive value of the feature map. MIPV captures information about the relative location of positive values in the feature map.

(4) Longest Stretch of Positive Values

The LSPV is defined as,

$$LSPV(o) = \max |j - i|, \quad (\forall\, i \leq k \leq j,\ o(k) > 0), \quad i, j, k \leq n$$

(19)

where $i$ , $j$ , $k$ are all indices of positive values of the feature map. LSPV returns the maximum length of any subsequence of successive positive values of the feature map. It is designed to distinguish between the feature map with many short sequences of successive positive values and that with few long sequences where MPV fails to discriminate.
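The four features of Equations (15) to (19) can be computed from a feature map as in the sketch below. Several details are assumptions made for illustration: indices are 0-based, MPV returns 0.0 when the feature map has no positive value (Equation (17) does not define this case), and LSPV counts the samples in the longest positive run, following the verbal description above.

```python
def ppv(o):
    """Proportion of Positive Values."""
    return sum(1 for v in o if v > 0) / len(o)


def mpv(o):
    """Mean of Positive Values (0.0 if there are none, by assumption)."""
    pos = [v for v in o if v > 0]
    return sum(pos) / len(pos) if pos else 0.0


def mipv(o):
    """Mean of Indices of Positive Values (-1 if there are none)."""
    idx = [i for i, v in enumerate(o) if v > 0]
    return sum(idx) / len(idx) if idx else -1.0


def lspv(o):
    """Length of the longest run of consecutive positive values."""
    longest = current = 0
    for v in o:
        current = current + 1 if v > 0 else 0
        longest = max(longest, current)
    return longest


feature_map = [0.5, -1.0, 2.0, 3.0, 1.0, -0.5, 4.0, -2.0]
features = (ppv(feature_map), mpv(feature_map), mipv(feature_map), lspv(feature_map))
```

For this toy feature map the positive values sit at indices 0, 2, 3, 4, and 6, so five of the eight values are positive and the longest positive run has three samples.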

Notably, Multi-ROCKET extracts the four features not only from the input time series but also from its first-order difference, which increases the diversity of the features. As a result, both the input time series and its first-order difference are convolved with the fixed group of 84 kernels. Finally, these feature representations are used to train or test a linear classifier.

The Proposed FM-ROCKET

Based on the above four features, we modify two of them and propose FM-ROCKET. Compared with the values of PPV and MPV, the values of MIPV and LSPV are considerably larger. This difference may bring more uncertainty into the model. Therefore, on the one hand, we introduce an alterable scale factor $2 / l_{in}$ to reduce the value of the LSPV. This new feature is called Scaled Longest Stretch of Positive Values (SLSPV),

$$SLSPV = \frac{2 \cdot LSPV}{l_{in}}$$

(20)

where $l_{in}$ is the length of the input audio signal. Clearly, this new feature is not applicable to a long input; therefore, the audio signal containing multiple percussion-induced sounds is divided into single-hit audio signals (0.21 s) to avoid this limitation. On the other hand, the segmentation of the original audio signal leads to a new problem: the tapping event does not begin at a fixed time point within each single-hit audio signal. In this situation, the feature MIPV brings much randomness and disorder into the model, so we introduce a scale factor $1 / l_{in}$ to reduce the value of MIPV considerably. This new feature is named Scaled Mean of Indices of Positive Values (SMIPV),

$$SMIPV = \frac{MIPV}{l_{in}}$$

(21)
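To make the effect of the two scale factors concrete, the sketch below applies Equations (20) and (21) to illustrative values. The input length of 10080 samples follows from the stated single-hit duration (0.21 s) and sampling frequency (48 kHz); the LSPV and MIPV values are made up for the example.

```python
def slspv(lspv_value, input_len):
    """Equation (20): SLSPV = 2 * LSPV / l_in."""
    return 2.0 * lspv_value / input_len


def smipv(mipv_value, input_len):
    """Equation (21): SMIPV = MIPV / l_in."""
    return mipv_value / input_len


L_IN = 10080                          # 0.21 s at 48 kHz
example_slspv = slspv(504, L_IN)      # hypothetical 504-sample positive run
example_smipv = smipv(5040.0, L_IN)   # hypothetical mean index at mid-signal
```

Both scaled features fall in a range comparable to that of PPV, whereas the unscaled values (504 and 5040 in this example) are several orders of magnitude larger.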

In addition, the features PPV and MPV are retained. The final feature representation vector of the single-hit audio signal is shown as follows,

$$X = \left\{ \begin{matrix} PPV_{1}, \dots, PPV_{\tau}, & SLSPV_{1}, \dots, SLSPV_{\tau}, \\ MPV_{1}, \dots, MPV_{\tau}, & SMIPV_{1}, \dots, SMIPV_{\tau}, \\ PPV_{1}^{diff}, \dots, PPV_{\tau}^{diff}, & SLSPV_{1}^{diff}, \dots, SLSPV_{\tau}^{diff}, \\ MPV_{1}^{diff}, \dots, MPV_{\tau}^{diff}, & SMIPV_{1}^{diff}, \dots, SMIPV_{\tau}^{diff} \end{matrix} \right\}$$

(22)

where $τ$ is the number of each kind of feature. In the experiment, we set $τ$ to 1176. The superscript $diff$ means that the feature is computed from the feature map of the first-order difference of the audio signal. Taking a single-hit audio signal as an example, the values of LSPV, SLSPV, MIPV, and SMIPV are shown in Figure 3. Compared with the values of LSPV and MIPV, those of SLSPV and SMIPV are relatively small. After transforming all single-hit audio signals into the corresponding feature vectors, we adopt a simple ridge classifier (a linear classifier) with built-in Leave-One-Out cross-validation, which ensures the robustness of the decision results. The cost function is:

$$L = \frac{1}{b} \sum_{i=1}^{b} \left( {\| y - wx \|}^{2} + \lambda {\| w \|}^{2} \right)$$

(23)

where $b$ is the batch size, $λ$ is the regularization coefficient, and $w$ is the weight.
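For reference, the cost function of Equation (23) is quadratic in $w$ and therefore has a closed-form minimizer. The short derivation below assumes that $x_{i}$ and $y_{i}$ denote the feature vector and target of the $i$th sample in the batch (Equation (23) does not write these indices out) and that $I$ is the identity matrix:

$$\frac{\partial L}{\partial w} = \frac{1}{b} \sum_{i=1}^{b} \left( -2\,(y_{i} - w x_{i})\, x_{i}^{\top} + 2 \lambda w \right) = 0 \;\Rightarrow\; w = \left( \sum_{i=1}^{b} y_{i} x_{i}^{\top} \right) \left( \sum_{i=1}^{b} x_{i} x_{i}^{\top} + b \lambda I \right)^{-1}$$

A larger $λ$ shrinks $w$ toward zero, which is what keeps the linear classifier stable when the number of features is large relative to the number of samples.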

Figure 3.

An example of Longest Stretch of Positive Values (LSPV), Scaled Longest Stretch of Positive Values (SLSPV), Mean of Indices of Positive Values (MIPV), and Scaled Mean of Indices of Positive Values (SMIPV) values.

The overall architecture of the underwater bolted flange looseness detection is illustrated in Figure 4. The percussion-induced sound is collected by a smartphone in a waterproof jacket and, after audio signal segmentation, the single-hit audio signal and its first-order difference are fed into the FM-ROCKET model. First, the single-hit audio signal and its first-order difference are each convolved with $n \times 84$ dilated kernels to generate two sets of $n \times 84$ feature maps. Afterward, four different features (PPV, MPV, SLSPV, SMIPV) are extracted from each feature map. Finally, the single-hit audio signal is transformed into a feature vector with $2 \times n \times 84 \times 4$ elements and fed into the ridge classifier for training or testing.
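As a quick consistency check on these dimensions, assume that the $τ$ of Equation (22) equals the number of kernel/dilation combinations $n \times 84$ (the text does not state $n$ directly). Then

$$\tau = n \times 84 = 1176 \;\Rightarrow\; n = 14, \qquad \dim(X) = 2 \times n \times 84 \times 4 = 2 \times 1176 \times 4 = 9408$$

With five preload classes, a linear classifier on this vector would have $9408 \times 5 = 47040$ weights (ignoring biases), which agrees with the roughly 47 K trainable parameters listed for the proposed method in Table 3.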

Figure 4.

Flowchart of the proposed method (FM-ROCKET).

Experimental setup

To demonstrate the effectiveness of the proposed FM-ROCKET model, we implemented the proposed method along with other methods on two stainless steel flanges (A and B). As depicted in Figure 5(b), the two flanges have the same dimensions and each one employs four pairs of bolts and nuts. By tightening the bolts with a torque wrench, we select five different preload levels (five classes) for each flange (Table 1). Under each preload level, a steel hammer is used to tap the surfaces of the flanges and a smartphone with a microphone (48 kHz sampling frequency, 16-bit resolution) is adopted to collect the percussion-induced audio signals (Figure 5(a)). At the data collection stage, for each pair of flanges, we tap around 120 times for each class. In particular, each tap is performed with a random force at a random point in the area encircled by red lines in Figure 5(b). As a result, each dataset includes around 600 single-hit audio signals. In the experimental stage, the percussion can be performed manually. However, in real scenarios that are not accessible to people, an unmanned underwater vehicle or an underwater robot can be adopted to execute the percussion task.

It is worth noting that, to account for environmental and operational variants, eight independent datasets (Table 2) are captured in different scenarios (assembly, operator, temperature, time). Specifically, the eight datasets are collected by two researchers within 1 week, and the operators first loosen the bolts before tightening them to each preload level in each dataset, which ensures that the classes are independent of each other and that the datasets are independent of each other. Datasets 1 to 4 are taken from flange A and datasets 5 to 8 are taken from flange B. Figure 6 exhibits single-hit audio signals under different datasets and preload levels. The received audio signals show no visually obvious relationship with the preload levels.

Moreover, to intuitively show the influence of the underwater environment on the detection, Figure 7 displays the percussion audio signals collected under the water and in the air. It is apparent that the underwater environment reduces the audio quality and introduces considerable noise. In addition, the FM-ROCKET model is implemented in Python (3.6) with the main library tensorflow-gpu (2.6.2). The computer is equipped with an Intel i7-11800H CPU, 32 GB of RAM, and an NVIDIA GeForce RTX 3070 GPU.

Figure 5.

Experimental setup: (a) apparatus and (b) two flanges.

Table 1.

Arrangement of five classes.

Flange	Class	Bolt 1 (ft-lbs)	Bolt 2 (ft-lbs)	Bolt 3 (ft-lbs)	Bolt 4 (ft-lbs)
A/B	1	0	0	0	0
	2	20	20	20	20
	3	40	40	40	40
	4	60	60	60	60
	5	80	80	80	80

Table 2.

Number of samples of eight independent datasets.

Flange	Dataset	0 ft-lbs	20 ft-lbs	40 ft-lbs	60 ft-lbs	80 ft-lbs	Total
A	Dataset 1	125	126	124	124	122	621
	Dataset 2	124	118	121	122	124	609
	Dataset 3	122	120	120	117	120	599
	Dataset 4	120	119	118	120	125	602
B	Dataset 5	122	119	123	119	120	603
	Dataset 6	123	121	122	120	120	606
	Dataset 7	127	120	121	122	156	646
	Dataset 8	120	122	120	120	121	603
Total							4889

Figure 6.

Percussion-induced audio signals under different preload levels and datasets.

Figure 7.

Percussion-induced audio signals collected under the water (left) and in the air (right).

Results and discussion

In real-world scenarios, bolted connection looseness can rarely be detected under identical conditions, since the environment involves randomness and uncertainty. Therefore, it is necessary to test the robustness of the model to environmental and operational variants. Additionally, it is impractical to build a dataset that covers enough data from countless underwater bolted connections, so the applicability of the trained model to other connections is another concern. These issues motivate three case studies. In case study I, we combine all eight datasets into a large dataset and split this large dataset into training and test sets. In case study II, one of the eight datasets is selected as the test set and the remaining seven datasets are integrated into a training set. In case study III, the datasets from flange A (B) are taken as the training set and those from flange B (A) as the test set.
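The data partitions used in case studies II and III can be sketched as follows. The dataset numbering follows Table 2 (datasets 1 to 4 from flange A, datasets 5 to 8 from flange B); the function names are hypothetical and only the dataset identifiers are handled, not the audio itself.

```python
FLANGE_A = [1, 2, 3, 4]   # datasets collected on flange A
FLANGE_B = [5, 6, 7, 8]   # datasets collected on flange B


def leave_one_dataset_out(dataset_ids):
    """Case study II: hold out one dataset for testing, train on the rest."""
    for test_id in dataset_ids:
        train_ids = [d for d in dataset_ids if d != test_id]
        yield test_id, train_ids


def cross_flange(flange_a, flange_b):
    """Case study III: train on one flange and test on the other, both ways."""
    return [(list(flange_a), list(flange_b)), (list(flange_b), list(flange_a))]


case2 = list(leave_one_dataset_out(FLANGE_A + FLANGE_B))
case3 = cross_flange(FLANGE_A, FLANGE_B)
```

In both partitions no dataset contributes samples to the training and test sets at the same time, which is what makes the test sets independent.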

Case study I: Performance under different training/test splitting ratios

In this case, we combine all the datasets into a single large dataset (4889 samples) and split it into training and test sets with different training/test ratios (8:2, 7:3, 6:4, 5:5). To keep both the training and test sets balanced, each class occupies the same percentage in the training and test sets. In addition, samples from each dataset account for the same proportion in both the training and test sets. Subsequently, to illustrate the effectiveness of the proposed method, we compare the FM-ROCKET model with several DL-based models from the SHM literature. Table 3 reports the classification accuracy, averaged over four repeated experiments. Since machine learning and DL methods randomly initialize their trainable parameters, the model may end up with different trained parameters in each repeated run. Hence, it is necessary to run the model several times and compute the average accuracy. The proposed method performs slightly better than the other DL-based methods while its computational cost is relatively low. The computational time for both the FM-ROCKET model and the Multi-ROCKET model is around 50 s, which is far less than that of the other methods. Overall, in the detection of underwater bolted flange looseness, the classification performances of all the methods are similar and there is no significant difference under the four training/test splitting ratios.
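A split that preserves the class and dataset proportions described above can be sketched as below. The sample layout `(dataset_id, label, item)`, the function name, and the fixed seed are assumptions made for illustration; the toy data use 10 items per group rather than the real sample counts.

```python
import random
from collections import defaultdict


def stratified_split(samples, train_fraction, seed=0):
    """Split so every (dataset, label) group keeps the same train/test ratio."""
    groups = defaultdict(list)
    for sample in samples:
        groups[(sample[0], sample[1])].append(sample)
    rng = random.Random(seed)
    train, test = [], []
    for key in sorted(groups):
        members = groups[key][:]
        rng.shuffle(members)
        cut = round(train_fraction * len(members))
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


# Toy data: 8 datasets x 5 classes x 10 items each.
samples = [(d, c, i) for d in range(1, 9) for c in range(5) for i in range(10)]
train, test = stratified_split(samples, train_fraction=0.8)
```

Splitting within each (dataset, class) group guarantees that both proportions are kept at once, which a plain random split of the pooled data would only achieve approximately.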

Table 3.

Comparison of test set accuracy (%) among various methods and training/test splitting ratios.

Method	Ratio (training/test)				Trainable parameters (K)
	8:2	7:3	6:4	5:5	
1D-CapsNet^44,45	98.47	97.89	97.8	97.46	3000
1D-ECapsNet⁴⁶	98.26	98.30	97.65	97.83	82
MFCC + CapsNet⁴⁷	99.39	99.39	99.49	99.10	1400
MFCC + ECapsNet	99.39	99.52	99.49	99.47	30
MFCC + deep CNN³¹	98.92	99.10	99.34	99.14	261
MFCC + deep LSTM⁴⁸	99.39	99.66	99.52	99.6	1613
Multi-ROCKET³⁷	99.69	99.45	99.79	99.71	47
The proposed method	99.79	99.72	99.7	99.71	47

CNN: convolutional neural network; MFCC: mel-frequency cepstral coefficients; ROCKET: Random Convolution Kernel Transform.

LSTM: long short-term memory; The proposed method: Feature-reduced Multiple Random Convolution Kernel Transform.

Bold values are used to highlight the classification accuracy of the proposed method.

Case study II: Robustness to environmental and operational variants

Since the eight independent datasets are collected under different scenarios (assembly, operator, temperature, time), eight types of experiments are designed in this case. In the $k$th type of experiment, we use the $k$th dataset as the test set and combine the remaining seven datasets into the training set, which ensures that the training and test sets are independent and collected under different scenarios. These eight experiments verify the robustness of the method to environmental and operational variants. As in case study I, Table 4 compares the classification accuracy of our method with that of the other methods listed in Table 3. For each type of experiment, we train the model four times and take the average accuracy on the test set as the final accuracy. Figure 8 presents the confusion matrices for some typical classification results of the proposed method and Multi-ROCKET. Confusion matrices for the other methods are not displayed in this paper because their classification results are unsatisfactory.
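The hold-one-dataset-out protocol above can be sketched in a few lines. The identifiers `D1` to `D8` follow the column labels of Table 4, while the function name `leave_one_dataset_out` is a hypothetical helper introduced only for illustration.

```python
def leave_one_dataset_out(dataset_ids):
    """Yield (test_id, train_ids) pairs, holding out one dataset at a time."""
    for test_id in dataset_ids:
        train_ids = [d for d in dataset_ids if d != test_id]
        yield test_id, train_ids


dataset_ids = [f"D{k}" for k in range(1, 9)]
folds = list(leave_one_dataset_out(dataset_ids))

for test_id, train_ids in folds:
    print(f"test={test_id}  train={'+'.join(train_ids)}")
```

Each of the eight folds trains on seven datasets and tests on the one left out, so no recording scenario appears on both sides of a split.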

Table 4.

Comparison of test set accuracy (%) among various methods.

Method	D1	D2	D3	D4	D5	D6	D7	D8
1D-CapsNet⁴⁴,⁴⁵	44.93	22.99	31.39	44.02	36.32	65.68	26.01	42.79
1D-ECapsNet⁴⁶	59.42	32.35	56.43	53.82	59.37	68.81	41.33	55.22
MFCC + CapsNet⁴⁷	69.89	62.40	77.13	39.04	36.82	76.40	50.00	63.68
MFCC + ECapsNet	71.50	66.17	72.95	41.03	42.29	71.78	56.66	58.54
MFCC + deep CNN³¹	66.83	44.33	56.26	51.50	59.37	71.62	47.52	61.36
MFCC + deep LSTM⁴⁸	70.69	33.66	60.10	51.83	54.89	61.06	52.01	50.58
Multi-ROCKET³⁷	78.98	69.06	81.05	81.22	96.76	76.48	72.94	94.56
The proposed method	82.56	76.18	81.97	85.96	97.63	82.84	73.76	96.68

CNN: convolutional neural network; MFCC: mel-frequency cepstral coefficients; ROCKET: Random Convolution Kernel Transform.

LSTM: long short-term memory; The proposed method: Feature-reduced Multiple Random Convolution Kernel Transform.

Bold values are used to highlight the classification accuracy of the proposed method.

Figure 8.

Confusion matrices of some typical classification results.

On independent test sets collected under different scenarios, the proposed method outperforms all the other methods, demonstrating its advantages in the detection of underwater bolted flange looseness. However, for test sets 2 and 7 the classification accuracy is relatively poor (76.18% and 73.76%, respectively). For dataset 2, our model cannot effectively classify the audio signals at 60 ft-lbs into their actual class; for dataset 7, the same holds for the signals at 80 ft-lbs. A possible explanation is that, as mentioned above, these datasets are collected independently under different scenarios and contain randomness and uncertainty, so the percussion-induced audio signals may suffer from a “saturation” or “similarity” problem at high preload levels in some datasets. The audio signal energy in Figure 9 reflects this possible problem: the signal energy distribution at high preload levels (60, 80 ft-lbs) is more compact than that at low preload levels (0, 20, 40 ft-lbs), and the signal energies at 60 and 80 ft-lbs are similar to each other. This “saturation” or “similarity” phenomenon also appears when the K-means clustering algorithm is applied to datasets 2 and 7 (Figure 10). K-means assigns most of the audio signals at 60 and 80 ft-lbs to the same cluster (red dashed box in Figure 10), which reveals the similarity between the audio signals at these two preload levels.
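The energy comparison discussed above can be illustrated with a small sketch. It assumes that signal energy means the sum of squared sample amplitudes, which this section does not define explicitly. The function `synthetic_hit` and the amplitudes in `assumed_amplitude` are invented stand-ins for real single-hit recordings, chosen so that the 60 and 80 ft-lbs classes are deliberately close; they are not measured data.

```python
import math
import random
import statistics


def signal_energy(samples):
    """Discrete signal energy, taken here as the sum of squared amplitudes."""
    return sum(x * x for x in samples)


def synthetic_hit(amplitude, rng, n=256):
    """Hypothetical decaying tone standing in for one single-hit recording."""
    return [
        amplitude * math.exp(-i / 40.0) * math.sin(0.3 * i) + rng.gauss(0, 0.01)
        for i in range(n)
    ]


rng = random.Random(0)

# Made-up amplitudes per preload level (ft-lbs); 60 and 80 are nearly equal
# to mimic the "saturation" effect discussed in the text.
assumed_amplitude = {0: 1.0, 20: 0.8, 40: 0.6, 60: 0.42, 80: 0.40}

for preload, amplitude in assumed_amplitude.items():
    energies = [
        signal_energy(synthetic_hit(amplitude * rng.uniform(0.9, 1.1), rng))
        for _ in range(20)
    ]
    print(preload,
          round(statistics.mean(energies), 3),
          round(statistics.pstdev(energies), 3))
```

When the per-class means of two preload levels lie within each other's spread, any classifier or clustering method that relies mainly on energy-like features would be expected to confuse them.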

Figure 9.

Energy of percussion-induced audio signals of (a) dataset 2 and (b) dataset 7.

Figure 10.

Clustering results: (a) true label of dataset 2, (b) clustered label of dataset 2, (c) true label of dataset 7, and (d) clustered label of dataset 7.

Case study III: Robustness to different detection objects with similar structure

In this case, we use datasets from different flanges as the training and test sets to study the method’s robustness and its potential for practical application. As mentioned above, datasets 1 to 4 are captured from flange A and datasets 5 to 8 from flange B. The trial settings are shown in Table 5, the performance of the different methods is illustrated in Figure 11, and the confusion matrices are given in Figure 12. As in the two case studies above, the classification accuracy is the average of four repeated experiments. As expected, the performance of all methods decreases because the test set comes from a different flange than the training set. The proposed method and the Multi-ROCKET model achieve similar performance, which is the best among the compared methods and far ahead of the rest. Although none of the methods performs well in this case, the results still demonstrate that, in the underwater environment, the FM-ROCKET and Multi-ROCKET models surpass the other DL-based methods in applicability to another similar bolted connection.

Table 5.

Settings for different trials.

	Training set (size)	Test set (size)
Trial 1	Flange A (2431)	Flange B (2458)
Trial 2	Flange B (2458)	Flange A (2431)

Figure 11.

Comparison of test set accuracy among significant works.

Figure 12.

Confusion matrices of Trial 1 and Trial 2: (a) FM-ROCKET (test set: B) and (b) FM-ROCKET (test set: A).

Conclusion

To monitor bolted flange looseness in the underwater environment, we propose a novel detection method that uses percussion-induced audio signals, DL, and shallow learning. Specifically, to process the percussion-induced audio signal, we develop FM-ROCKET, which achieves promising classification accuracy. Unlike current DL-based methods, the FM-ROCKET model combines a 1D convolutional layer (a DL method), which extracts features from the input audio signal, with a ridge classifier (a linear classifier and a shallow learning method), which classifies the resulting feature representation. Notably, many state-of-the-art DL-based models in SHM lack robustness to environmental and operational variants and to different detection objects with similar structure in the underwater environment. In the three case studies, by contrast, the proposed FM-ROCKET model demonstrates comparable or better performance in these two aspects than several significant DL-based models in SHM.

In case study I, the training set and the corresponding test set are dependent on each other because they are drawn from the same dataset and the same flange. In case study II, they are independent of each other since they come from different datasets on the same flange. In case study III, they are even more independent since they come from different datasets on different flanges. Case study I shows that all the methods have similar classification performance when the training and test sets are dependent on each other. In case study II, the proposed method and the Multi-ROCKET model surpass the other methods. Moreover, with the help of the reduced features, the proposed method achieves better performance than the Multi-ROCKET model on five independent test sets (D1, D2, D4, D6, D8) and similar performance on the remaining three (D3, D5, D7). Therefore, in case study II, the overall classification performance of the proposed method is better than that of the Multi-ROCKET model and far better than that of the other methods, which shows its robustness to environmental and operational variants. Finally, case study III indicates that, when the training and test sets are even more independent of each other, the proposed method obtains classification performance similar to that of the Multi-ROCKET model and outperforms the other methods, which shows its robustness to different detection objects with similar structure in the underwater environment. In summary, the proposed method performs similarly to the Multi-ROCKET model in case studies I and III, while in case study II it exceeds the Multi-ROCKET model by using reduced features.
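The comparison with Multi-ROCKET in case study II can be checked directly against the accuracies in Table 4. The sketch below copies the two relevant rows of that table. The cut-off `THRESHOLD` of one percentage point for calling two results "similar" is an assumption made here for illustration and is not stated in the paper.

```python
datasets = ["D1", "D2", "D3", "D4", "D5", "D6", "D7", "D8"]

# Test set accuracy (%) copied from Table 4.
multi_rocket = [78.98, 69.06, 81.05, 81.22, 96.76, 76.48, 72.94, 94.56]
fm_rocket = [82.56, 76.18, 81.97, 85.96, 97.63, 82.84, 73.76, 96.68]

THRESHOLD = 1.0  # assumed cut-off in percentage points; not from the paper

# Accuracy gain of the proposed method over Multi-ROCKET per test set.
gains = {d: round(f - m, 2) for d, f, m in zip(datasets, fm_rocket, multi_rocket)}

better = [d for d, g in gains.items() if g > THRESHOLD]
similar = [d for d, g in gains.items() if abs(g) <= THRESHOLD]
mean_gain = round(sum(gains.values()) / len(gains), 2)

print("gain per test set:", gains)
print("better:", better)    # D1, D2, D4, D6, D8
print("similar:", similar)  # D3, D5, D7
print("mean gain:", mean_gain)
```

Under this assumed threshold, the grouping matches the one given in the text: gains above one percentage point on D1, D2, D4, D6, and D8, and gains below one point on D3, D5, and D7, with a mean gain of about 3.3 percentage points.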

In future work, several issues will be investigated: (1) real-world noise will be taken into consideration; (2) the proposed method will be extended to cases with higher preload levels; (3) the proposed method will be further improved to determine the location of the loosened bolts on the flange; and (4) the robustness of the classification model to different detection objects with similar structure will be further enhanced. We will further improve our method to address these issues. Because it does not require the installation of constant-contact sensors, this easy-to-implement and low-cost detection method has great potential in future applications.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research is supported by Texas Commission on Environmental Quality through Subsea Systems Institute Award #582-15-57593. This project was paid for [in part] with federal funding from the Department of the Treasury through the State of Texas under the Resources and Ecosystems Sustainability, Tourist Opportunities, and Revived Economies of the Gulf Coast States Act of 2012 (RESTORE Act). The content, statements, findings, opinions, conclusions, and recommendations are those of the author(s) and do not necessarily reflect the views of the State of Texas or the Treasury.

ORCID iDs

Jian Chen

Gangbing Song

References

1. Wang, Song, Liu, et al. Review of bolted connection monitoring. Int J Distrib Sens Netw 2013; 2013: 8.

2. El-Borgi, Patil, et al. Inspection and monitoring systems subsea pipelines: a review paper. Struct Health Monit 2019; 19: 606–645.

3. Wang, Song. A novel percussion-based method for multi-bolt looseness detection using one-dimensional memory augmented convolutional long short-term memory networks. Mech Syst Signal Process 2021; 161: 107955.

4. Wang. Identification of multi-bolt head corrosion using linear and nonlinear shapelet-based acousto-ultrasonic methods. Smart Mater Struct 2021; 30: 085031.

5. Kong, Robert, Silva, et al. Cyclic crack monitoring of a reinforced concrete column under simulated pseudo-dynamic loading using piezoceramic-based smart aggregates. Appl Sci 2016; 6: 341.

6. Zhou, et al. Detecting damage size and shape in a plate structure using PZT transducer array. J Aerosp Eng 2018; 31(5): 04018075.

7. Song, Xiang, et al. Singular spectrum analysis and fuzzy entropy-based damage detection on a thin aluminium plate by using PZTs. Smart Mater Struct 2022; 31: 035015.

8. Jiang, Chen, Dai, et al. Multi-bolt looseness state monitoring using the recursive analytic based active sensing technique. Measurement 2022; 191: 110779.

9. Zhou, Chen, S-X, Y-Q, et al. EMI-GCN: a hybrid model for real-time monitoring of multiple bolt looseness using electromechanical impedance and graph convolutional networks. Smart Mater Struct 2021; 30: 035032.

10. Chen, Shen, et al. Coda wave interferometry-based very early stage bolt looseness monitoring using a single piezoceramic transducer. Smart Mater Struct 2022; 31: 035030.

11. Wang, Song. New entropy-based vibro-acoustic modulation method for metal fatigue crack detection: an exploratory study. Measurement 2020; 150: 107075.

12. Wang, Huo, Song. A piezoelectric active sensing method for quantitative monitoring of bolt loosening using energy dissipation caused by tangential damping based on the fractal contact theory. Smart Mater Struct 2018; 27: 015023.

13. Wang, SCM, Huo, et al. A novel fractal contact-electromechanical impedance model for quantitative monitoring of bolted joint looseness. IEEE Access 2018; 6: 40212–40220.

14. Zhang, Liu, Liao, et al. Contact acoustic nonlinearity (CAN)-based continuous monitoring of bolt loosening: hybrid use of high-order harmonics and spectral sidebands. Mech Syst Signal Process 2018; 103: 280–294.

15. Meyer, Adams. Using impact modulation to quantify nonlinearities associated with bolt loosening with applications to satellite structures. Mech Syst Signal Process 2019; 116: 787–795.

16. Jing. Fault diagnosis of bolt loosening in structures with a novel second-order output spectrum-based method. Struct Health Monit 2019; 19: 123–141.

17. Wang, Song. Monitoring of multi-bolt connection looseness using a novel vibro-acoustic method. Nonlinear Dyn 2020; 100: 243–254.

18. Cha, Y-J, You, Choi. Vision-based detection of loosened bolts using the Hough transform and support vector machines. Autom Constr 2016; 71: 181–188.

19. Ramana, Choi, Cha, Y-J. Fully automated vision-based loosened bolt detection using the Viola–Jones algorithm. Struct Health Monit 2018; 18: 422–434.

20. Sun, Xie, Cheng. A fast bolt-loosening detection method of running train’s key components based on binocular vision. IEEE Access 2019; 7: 32227–32239.

21. Huynh, T-C, Park, J-H, Jung, H-J, et al. Quasi-autonomous bolt-loosening detection method using vision-based deep learning and image processing. Autom Constr 2019; 105: 102844.

22. Huynh, T-C. Vision-based autonomous bolt-looseness detection method for splice connections: design, lab-scale evaluation, and field application. Autom Constr 2021; 124: 103591.

23. Pan, Dong, et al. A vision-based monitoring method for the looseness of high-strength bolt. IEEE Trans Instrum Meas 2021; 70: 1–14.

24. Zhou, Wang, Zhou, et al. Percussion-based bolt looseness identification using vibration-guided sound reconstruction. Struct Control Health Monit 2022; 29: e2876.

25. Yang, Huo. Bolt preload monitoring based on percussion sound signal and convolutional neural network (CNN). Nondestr Test Eval 2022; 37: 1–18.

26. Kong, Zhu, SCM, et al. Tapping and listening: a new approach to bolt looseness monitoring. Smart Mater Struct 2018; 27: 07LT02.

27. Zhang, Zhao, Sun, et al. Bolt loosening detection based on audio classification. Adv Struct Eng 2019; 22: 2882–2891.

28. Yuan, Kong, et al. Percussion-based bolt looseness monitoring using intrinsic multiscale entropy analysis and BP neural network. Smart Mater Struct 2019; 28: 125001.

29. Wang, Song. 1D-TICapsNet: an audio signal processing algorithm for bolt early looseness detection. Struct Health Monit 2021; 20: 2828–2839.

30. Wang, Song. Looseness detection in cup-lock scaffolds using percussion-based method. Autom Constr 2020; 118: 103266.

31. Chen, Xiong, Sang, et al. An innovative deep neural network-based approach for internal cavity detection of timber columns using percussion sound. Struct Health Monit 2022; 21: 1251–1265.

32. Manjunatha, Arockia Selvakumar, Godeswar, et al. A low cost underwater robot with grippers for visual inspection of external pipeline surface. Procedia Comput Sci 2018; 133: 108–115.

33. Rumson. The application of fully unmanned robotic systems for inspection of subsea pipelines. Ocean Eng 2021; 235: 109214.

34. Zhang, Wang, et al. Subsea pipeline leak inspection by autonomous underwater vehicle. Appl Ocean Res 2021; 107: 102321.

35. Jiang, SCM, Tippitt, et al. Feasibility study of a touch-enabled active sensing approach to inspecting subsea bolted connections using piezoceramic transducers. Smart Mater Struct 2020; 29: 085038.

36. Wang, Chen, Song. Smart crawfish: a concept of underwater multi-bolt looseness identification using entropy-enhanced active sensing and ensemble learning. Mech Syst Signal Process 2021; 149: 107186.

37. Chang, Angus, Christoph, et al. MultiRocket: multiple pooling operators and transformations for fast and effective time series classification. arXiv:2102.00457v3 [cs.LG], 2021.

38. Muhammad, Kaliappan. Deception detection in speech using bark band and perceptually significant energy features. In: 2013 IEEE 56th international midwest symposium on circuits and systems (MWSCAS), Columbus, OH, USA, 2013, pp.1212–1215. IEEE.

39. Uddin, Altaf, Bilal, et al. Amateur drones detection: a machine learning approach utilizing the acoustic signals in the presence of strong interference. Comput Commun 2020; 154: 236–245.

40. Muhammad, Masud. Neural network based classification of stressed speech using nonlinear spectral and cepstral features. In: 2014 IEEE 12th international new circuits and systems conference (NEWCAS), Trois-Rivieres, QC, Canada, 2014, pp.33–36. IEEE.

41. Mouaz, Abdelmajid, Abderrahim, B-h. Feature extraction of some Quranic recitation using Mel-Frequency Cepstral Coefficients (MFCC). In: 2016 5th International Conference on Multimedia Computing and Systems (ICMCS), Marrakech, Morocco, 2016. IEEE.

42. Dempster, Petitjean, Webb. ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels. Data Min Knowl Discovery 2020; 34: 1454–1495.

43. Dempster, Schmidt, Webb. MiniRocket: a very fast (almost) deterministic transform for time series classification. In: Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining, virtual event, 2021, pp.248–257. Association for Computing Machinery.

44. Liu, Xie, et al. P300 event-related potential detection using one-dimensional convolutional capsule networks. Expert Syst Appl 2021; 174: 114701.

45. Butun, Yildirim, Talo, et al. 1D-CADCapsNet: one dimensional deep capsule networks for coronary artery disease detection using ECG signals. Phys Med 2020; 70: 39–48.

46. Mazzia, Salvetti, Chiaberge. Efficient-CapsNet: capsule network with self-attention routing. Sci Rep 2021; 11: 14634.

47. Nassif, Shahin, Elnagar, et al. Emotional speaker identification using a novel capsule nets model. Expert Syst Appl 2022; 193: 116469.

48. Rejaibi, Komaty, Meriaudeau, et al. MFCC-based Recurrent Neural Network for automatic clinical depression recognition and assessment from speech. Biomed Signal Process Control 2022; 71: 103107.

Underwater bolted flange looseness detection using percussion-induced sound and Feature-reduced Multi-ROCKET model
