Sage Journals: Discover world-class research

Abstract

To address the complex multi-scale target detection environment of ultrasonic atlases of weld defects and the poor performance of existing algorithms on multiple small target defects, the Faster RCNN convolutional neural network is applied to weld defect detection, and a Faster RCNN deep learning network combined with an improved ResNet50 is proposed. Because multiple small targets and multi-scale targets coexist, this paper combines a deformable network, an FPN network and ResNet50 to improve the detection performance of the algorithm for multi-scale targets, especially small targets. To improve the efficiency and accuracy of candidate frame selection, the K-means clustering algorithm and the ROI Align algorithm are adopted, and anchors and candidate frames suited to the weld defect data set are customized for accurate positioning. Experimental verification on a self-made ultrasonic atlas data set of weld defects shows that the overall mean average precision of the improved algorithm reaches 93.72%, which is 4.8% higher than that of the original Faster RCNN algorithm, and that the average precision of small target defects such as “stoma” and “crack” reaches 92.5% and 88.9% respectively. In addition, ablation experiments and comparison experiments with other mainstream target detection algorithms show that the proposed improvements raise the detection performance and outperform the other algorithms. Tests in an actual industrial detection scene show that the method basically meets the requirements of weld defect detection and can provide a reference for the intelligent detection of weld defects.

Keywords

Nondestructive testing, feature extraction, defect detection, Faster RCNN

Introduction

Aluminum alloy is widely used in active army and navy weapons and equipment because of its small specific gravity, high strength, good fracture toughness, strong corrosion resistance, good processing performance, and excellent welding performance. To meet the lightweight design requirements of weapons and equipment, such equipment is mostly designed as an assembled and welded structure of high-strength aluminum alloy plates. In the production of some weapon and equipment parts, deviations in welding process parameters, clamping conditions, material surface conditions, operators and other factors may leave small defects in the weld seam of the workpiece that cannot be detected by the naked eye. As these undetected defects expand under the cyclic firing load, failures such as large deformation or instantaneous fracture will inevitably occur, which directly leads to the loss of fighting capacity of the weapons and equipment. Therefore, nondestructive testing must be carried out before the workpiece is machined. Traditional welding inspection mostly uses an ultrasonic phased array flaw detector to generate ultrasonic images of defects,¹ and the defect types are distinguished through manual observation. This approach is not only time-consuming but also has a high rate of missed detection. It is therefore necessary to study more effective defect detection methods and to strictly control the internal quality of workpiece welds.

In recent years, with the development of computer and Internet technology, neural network-based target detection in the field of artificial intelligence has gradually gained attention. At present, mainstream image-based target detection frameworks fall into two categories²: two-step target detection algorithms and single-step target detection algorithms. The former generate suggestion boxes for the target area and perform classification and regression twice. Their detection accuracy is high, but they run slowly. Representative algorithms are R-CNN,³ Fast R-CNN,⁴ and Faster R-CNN.⁵ The latter perform target classification and regression directly on the feature map, which reduces the amount of computation, so they run fast, but their detection accuracy is not as good as that of the former. Representative algorithms are the YOLO series⁶⁻⁹ and the SSD algorithm.¹⁰ The algorithms mentioned above achieve high accuracy on public data sets, which has allowed deep learning methods to gradually penetrate research fields such as face recognition, license plate recognition, target detection, and defect detection.

To meet the requirement of ultrasonic atlas target detection of workpiece weld defects, many scholars have conducted extensive research on the problem, especially on the classification of the defects to be detected. Richard et al.¹¹ trained a convolutional neural network to characterize crack defects in pipeline crack detection. Compared with the traditional image-based crack characterization method, the deep learning method significantly improved the accuracy of characterizing the length and angle of crack defects. Zhang et al.¹² proposed a classification scheme for welding defects based on convolutional neural networks, and classified defects with a multi-model integration framework to reduce the false detection rate of weld defects. To identify weld surface defects, Zhu et al.¹³ proposed an intelligent identification method composed of a convolutional neural network and a random forest. The CNN learns the advanced features, and the random forest predicts and classifies the extracted features. The experimental results show that the accuracy of this method can reach 98.75%. Compared with traditional image processing methods, the above research can learn advanced features of the target to be detected from sample data without preselecting any image features, and the recognition effect is much better than that of traditional methods. However, many small target defects coexist among the internal weld defects of aluminum alloy workpieces, and the above research does not perform satisfactorily on data sets containing multiple weld defects.

Guo et al.¹⁴ realized welding defect detection in X-ray images based on Faster RCNN, a classic model in the field of object detection. Liu et al.¹⁵ proposed a fully convolutional structure based on VGG16 to classify welding defect images, and trained on the defect images to be inspected by using the idea of transfer learning. Among various deep learning methods, this method achieves high-precision recognition with a relatively small data set. Oh et al.¹⁶ proposed an automatic welding defect detection method based on Faster R-CNN. They compared two internal feature extractors, ResNet and Inception-ResNet V2, and analyzed the anchor size to fit the defects. Performance evaluation shows that the detection accuracy is improved. Zhang and Shen¹⁷ proposed an improved Faster RCNN algorithm. Their experiments verified that the improved algorithm outperforms the original algorithm, with an average detection accuracy of up to 94%. Considering the diversity of defect shapes and sizes, Zhou et al.¹⁸ chose the K-Means algorithm to generate the aspect ratios of the anchor boxes according to the ground truth, and fused feature matrices with different receptive fields to improve the detection performance of the model. The experimental results show that the recognition accuracy of the improved model on the collected ceramic data set is 94.6%. Yin and Yang¹⁹ adopted the Faster R-CNN algorithm as the baseline of the training model. A feature pyramid network was added to the original Faster RCNN network, and the ROI Pooling module was replaced with ROI Align to reduce quantization error, which increased the mAP by about 0.8%.

Yang et al.²⁰ applied YOLOv5, the most advanced single-stage target detection algorithm, to steel pipe weld defect detection. Compared with the representative two-stage target detection algorithm Faster R-CNN, the detection speed is greatly improved, and multi-class real-time detection tasks can be completed. In this paper, the weld defects of the workpiece are detected in the following steps: first, the ultrasonic atlas of the defects is collected offline; second, the atlas is enhanced; finally, the defects are detected. This process has low requirements for real-time performance. Although the real-time performance of the Faster RCNN-based target detection methods proposed in the above research is not as good as that of the YOLO algorithm, these methods have higher detection accuracy, are more sensitive to multi-scale target defects in the weld, and are less likely to miss detections.

In this paper, a new Faster RCNN deep learning network combined with an improved ResNet50 is proposed to detect five kinds of defects in welds. A ResNet50 network reconstructed with deformable convolution serves as the feature extraction network of Faster R-CNN, which improves the ability to identify irregular weld defects of workpieces. By adding the feature pyramid network structure, the feature maps are fused across scales, and the feature extraction capability for small target defects in workpiece welds is enhanced. Subsequently, the K-means algorithm is used to cluster the defect data, and anchors that are more suitable for welding defects are obtained. Finally, the ROI Align algorithm replaces the coarse ROI Pooling, so that more accurate defect location information can be obtained. This research directly addresses the current technical problem that large-scale military workpieces cannot be effectively inspected on site. It can guide inspectors to carry out scientific inspections, reasonably evaluate manufacturing quality and ensure battlefield performance, and it is of significant military value. The “Intelligent Qualitative Classification Model of Ultrasonic Phased Array Detection Defect Atlas” constructed in this research provides a reference for the accurate qualitative analysis of defects in the field of ultrasonic detection. The effectiveness and superiority of the algorithm are verified on a self-built ultrasonic atlas data set of weld defects.

This article is organized as follows. The next section presents the Faster RCNN target detection technology: first, the structure of the proposed Faster RCNN target detection network is introduced; then, the improvements to the original network structure are described; next, the experimental system, the collected data set and the evaluation indexes of the network are introduced. The experimental tests and results are then provided, together with an analysis of the performance of the proposed method. Finally, the conclusions of this study are summarized.

Materials and methods

Improved Faster R-CNN

Faster RCNN mainly consists of the following four parts. The basic structure of Faster RCNN improved in this paper is shown in Figure 1.

Feature extraction network: Faster RCNN first extracts the features of the weld defects to be inspected by using a convolutional neural network (such as VGG or ResNet), and the obtained feature maps are used in the subsequent RPN layer and fully connected layer. This paper proposes a ResNet50 network reconstructed with deformable convolution to detect defects of various shapes, and uses a Feature Pyramid Network to improve the accuracy and robustness of defect detection for small targets.

Region proposal network: The RPN uses a sliding window to generate proposal regions, and takes each pixel of the input feature map as the center to generate nine anchors with different areas and aspect ratios. The suggested anchors are preliminarily screened to find the areas most likely to contain defects. Because the anchor scheme in Faster RCNN is based on the PASCAL VOC 2007 data set, it adapts poorly to the scale of the small target defects in the weld seam and takes a long time to generate the suggestion frames. A scheme that generates anchors with the K-means clustering algorithm is therefore proposed, which can speed up network convergence, generate better target candidate regions and improve the accuracy of target detection.

Region of interest pooling: ROI Pooling receives the feature map and the suggestion boxes at the same time, crops the feature map with the suggestion boxes, and adjusts the cropped feature maps of different sizes to the size required by the classifier. The result is normalized and then sent to the following fully connected layer. The cropping process involves two rounding operations, which cause the resulting box to deviate from the original target image. ROI Align instead keeps the floating-point boundaries of each candidate region rather than rounding them, and replaces the quantization operation with bilinear interpolation to reduce the quantization deviation. Through these improvements, the region mismatch problem can be avoided, and the ability of the detection network to detect small defects can be further enhanced.

Classification: The classification and regression network judges whether the cropped feature map contains a defect to be inspected and adjusts the suggestion box, which finally gives the accurate position of the detection box.
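The nine anchors per position mentioned for the RPN can be illustrated with a minimal sketch. It assumes the default scheme of the original Faster R-CNN (three scales of 128, 256 and 512 pixels and three aspect ratios of 1:2, 1:1 and 2:1); the function name `make_anchors` and the example center are hypothetical and are not taken from this paper.

```python
import math

def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) anchors centered at (cx, cy).

    Each anchor keeps an area of scale**2 while its width/height ratio
    varies, giving len(scales) * len(ratios) boxes per position.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s * math.sqrt(r)   # width grows with the ratio
            h = s / math.sqrt(r)   # height shrinks so that w * h == s * s
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

# Hypothetical center position, for illustration only
anchors = make_anchors(400, 300)
```

Replacing `scales` and `ratios` with values derived from the labeled boxes is what the K-means anchor customization described later amounts to.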

Figure 1.

The basic structure of Faster RCNN.

Improved ResNet50 extraction network

The selection of the feature extraction network has an extremely important influence on the performance of the model. Common feature extraction networks include VGG, ResNet, AlexNet, DenseNet, and MobileNet. According to existing test results for feature extraction networks,²¹⁻²⁴ the overall accuracy of the ResNet series as feature extraction networks is above 80%, which is better than that of the other networks. Within the ResNet series, ResNet50 and ResNet101 have the highest overall accuracy.²⁵⁻²⁷ The overall accuracy of ResNet50 is 0.75% lower than that of ResNet101, but its detection time for a single image is 25% shorter. Considering both detection accuracy and speed for the research object of this paper, ResNet50 is selected as the backbone feature extraction network.

When modeling geometric transformations, the ResNet50 network is limited by its fixed geometric structure. The convolution unit can only sample at fixed positions on the input image, so the features extracted by the convolution layers have weak representation ability and much feature information is lost, which in turn weakens the fitting ability of the loss function and lowers the detection accuracy of the network. The ultrasonic atlas of weld defects contains defects of various types and shapes. It is therefore difficult to make the convolution network completely “memorize” the various changes of welding defects through a large amount of data and data enhancement alone. To solve these problems, a deformable convolution module²⁸ is introduced in this paper, which can greatly enhance the ability of convolutional neural networks to model geometric transformations. The method adds an additional offset to each spatial sampling position in the module and is trained end to end by back propagation, which yields a deformable convolutional neural network. The module learns the offsets with a parallel network branch, so that the sampling points of the convolution kernel shift on the input feature map and concentrate on the region or target of interest.
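The offset mechanism described above can be written compactly. The notation below follows the usual formulation of deformable convolution and is an assumption, since the symbols do not appear in the original text: $x$ is the input feature map, $y$ the output, $w$ the kernel weights, $p_0$ an output position, $\mathcal{R}$ the regular 3 × 3 sampling grid, and $\Delta p_n$ the learned offset of the n-th sampling point.

```latex
% Standard convolution: sampling positions are fixed by the grid R
y(p_0) = \sum_{p_n \in \mathcal{R}} w(p_n) \, x(p_0 + p_n),
\qquad \mathcal{R} = \{(-1,-1), (-1,0), \ldots, (1,1)\}

% Deformable convolution: each sampling position is shifted by a learned offset
y(p_0) = \sum_{p_n \in \mathcal{R}} w(p_n) \, x(p_0 + p_n + \Delta p_n)
```

Because $\Delta p_n$ is generally fractional, the value $x(p_0 + p_n + \Delta p_n)$ is obtained by bilinear interpolation of the neighbouring pixels, which keeps the operation differentiable with respect to the offsets.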

Figure 2(a) shows a traditional standard convolution kernel with a size of 3 × 3 (the green dots in Figure 2(a)). Figure 2(b) shows deformable convolution: by adding a direction vector (the light green arrows in Figure 2(b)) to each sampling point of the convolution kernel in Figure 2(a), the convolution kernel can take any shape. Figure 2(c) and (d) are special forms of deformable convolution. Because weld defects have different shapes and no fixed geometric structure, the standard ResNet50 is limited in handling the geometric transformations of the features to be inspected. Deformable convolution is therefore introduced to reconstruct ResNet50, so as to improve the ability of the neural network to recognize irregular targets.

Figure 2.

Deformable convolution: (a) standard convolution, (b) deformable convolution, and (c), (d) special forms of deformable convolution.

Figure 3(a) shows the sampling process of standard convolution and Figure 3(b) that of deformable convolution. The top layer shows the activation units on objects of different sizes. The middle layer shows the sampling performed for the top activation units: the left picture shows standard 3 × 3 square sampling, and the right picture shows sampling with a nonstandard shape, although the number of sampling points is still 3 × 3. The bottom layer shows the sampling areas that feed the middle layer. Deformable convolution can thus follow the shape and size of the object more closely when sampling, which standard convolution cannot do.

Figure 3.

Convolution kernel comparison: (a) standard convolution and (b) deformable convolution.

Multiscale feature fusion

As shown in Figure 4, the feature pyramid network (FPN)²⁹ is composed of a bottom-up pathway, a top-down pathway, and lateral connections. The image is fed to a pre-trained feature network (such as ResNet), which forms the bottom-up pathway, and the top-down pathway is then built. As shown in the figure, the channel dimension of the second layer is reduced by a 1 × 1 convolution, the fourth layer is up-sampled, the two are added, and finally a 3 × 3 convolution is applied. FPN fuses the high-level semantic features of deep feature maps with the detailed features of shallow feature maps, so that the detection network has a stronger ability to detect small defects.

Figure 4.

The feature pyramid network.

In this paper, the feature extraction structure of ResNet50 combined with FPN is adopted, as shown in Figure 5. Bottom-up feature extraction is performed on the Conv2, Conv3, Conv4, and Conv5 convolution layers of ResNet50, and the outputs of Conv2–Conv5 are reduced to 256 channels by 1 × 1 convolution to obtain M2–M5. Then, from top to bottom, each deeper feature map is up-sampled by a factor of two and merged with the 1 × 1-convolved feature map of the next shallower layer, and the fused feature map is subjected to a 3 × 3 convolution to obtain P2–P5. P6 is obtained by max pooling down-sampling of the P5 feature map, which finally gives the P2–P6 feature maps. Combined with the FPN feature extraction network, richer anchors can be generated, and by using the abundant detail information at each scale the model performs better on defects of diverse sizes and on small defects.
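The top-down merge in the paragraph above can be sketched on a single-channel toy example. This is only an illustration under stated assumptions: feature maps are plain nested lists, up-sampling is nearest-neighbour, and the 1 × 1 and 3 × 3 convolutions are omitted; the names `upsample2x` and `merge` and the toy values are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x up-sampling of a 2-D list."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out

def merge(deeper, lateral):
    """Add the up-sampled deeper map to the shallower lateral map."""
    up = upsample2x(deeper)
    return [[a + b for a, b in zip(row_up, row_lat)]
            for row_up, row_lat in zip(up, lateral)]

m5 = [[1.0]]                       # deepest lateral map (1 x 1), toy values
m4 = [[0.1, 0.2], [0.3, 0.4]]      # next shallower lateral map (2 x 2)
fused4 = merge(m5, m4)             # would be followed by a 3 x 3 conv to give P4
```

The same merge is repeated level by level down to the shallowest map, so every output level carries both deep semantic information and shallow detail.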

Figure 5.

ResNet50 combined with FPN.

K-means clustering generates anchor boxes

As an unsupervised learning algorithm, the K-means clustering algorithm³⁰ was proposed as early as 1967 and is the most widely used algorithm in clustering analysis. It aims to divide the data set into k clusters, so that the data similarity within the same cluster is higher and the data similarity between different clusters is lower. The original K-means clustering algorithm usually uses Euclidean distance as the criterion function to evaluate similarity. When generating anchors, this paper instead uses the intersection over union (IOU) between the box corresponding to the cluster center and each labeled box as the measure of similarity. The algorithm is used to learn target anchors for weld defects, which makes it easier for the detection network to detect weld defects, thus speeding up the convergence of the network and improving the detection accuracy.

The clustering steps in this paper are as follows:

(1) Randomly initialize k cluster centers;

(2) Calculate the IOU between each label box and the k cluster centers, and assign each label box to the closest cluster center, that is, the one with which its IOU is highest;

(3) Recalculate the k cluster centers from the label boxes assigned in step (2);

(4) Stop if the cluster centers no longer change or the maximum number of iterations is reached; otherwise return to step (2).
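The clustering steps above can be sketched as follows. This is a minimal illustration rather than the implementation used in this paper: boxes are assumed to be (width, height) pairs compared as if aligned at one corner, each center is updated as the mean of its members (the text does not state the update rule), and the sample boxes and the names `iou_wh`, `kmeans_anchors` and `mean_iou` are hypothetical.

```python
import random

def iou_wh(box, center):
    """IOU of two (width, height) boxes aligned at one corner."""
    inter = min(box[0], center[0]) * min(box[1], center[1])
    union = box[0] * box[1] + center[0] * center[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(boxes, k)                  # random initial centers
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for b in boxes:                             # assign to highest-IOU center
            best = max(range(k), key=lambda i: iou_wh(b, centers[i]))
            clusters[best].append(b)
        new_centers = []
        for i, members in enumerate(clusters):      # recompute each center
            if not members:
                new_centers.append(centers[i])      # keep the center of an empty cluster
                continue
            w = sum(m[0] for m in members) / len(members)
            h = sum(m[1] for m in members) / len(members)
            new_centers.append((w, h))
        if new_centers == centers:                  # converged: centers unchanged
            break
        centers = new_centers
    return centers

def mean_iou(boxes, centers):
    """Average of each box's best IOU with any center."""
    return sum(max(iou_wh(b, c) for c in centers) for b in boxes) / len(boxes)

# Hypothetical labeled box sizes in pixels, for illustration only
boxes = [(20, 80), (22, 85), (60, 62), (66, 64), (118, 140), (120, 138)]
centers = kmeans_anchors(boxes, k=3)
quality = mean_iou(boxes, centers)
```

The value returned by `mean_iou` corresponds to the kind of average IOU figure that is commonly reported to judge how well a set of clustered anchors fits the labeled boxes.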

The original Faster RCNN uses nine anchors whose sizes and proportions are determined from the PASCAL VOC data set. The weld defect data set is very different from the PASCAL VOC data set, so these anchors do not match it. This study therefore uses K-means to analyze the weld defect data set and resets the anchor sizes and proportions according to the clustering results. The experimental results of defect detection accuracy under different k values are shown in Table 1:

Table 1.

k value experiment.

k	mAP (%)	Time (ms)
3	92.89	126
6	92.91	127
9	93.46	127
12	93.35	128

It can be seen from the table that, with the improved anchor generation method based on K-means clustering in Faster RCNN, choosing an appropriate value of k makes the anchors more suitable for weld seam defect detection. When k = 9, the model has the highest accuracy while the detection time remains almost unchanged, so the number of K-means clustering centers is set to 9.

The coordinates and aspect ratios of the nine weld defect anchors obtained by K-means clustering are shown in Table 2. The average IOU value is 78.58%, and the anchor sizes in this paper are re-customized accordingly. The anchors customized by the K-means clustering algorithm are more reasonable for the weld defect data set, which makes the detection network converge faster and gives the model better detection performance.

Table 2.

K-means clustering anchor coordinates.

No.	Anchor box coordinates	Ratios
1	33.9030837, 136.27586207	0.25
2	45.81497797, 96.82758621	0.47
3	65.05726872, 63.65517241	1.02
4	25.65638767, 109.37931034	0.23
5	46.73127753, 173.93103448	0.27
6	25.65638767, 41.24137931	0.62
7	20.15859031, 83.37931034	0.24
8	118.20264317, 140.75862069	0.84
9	35.73568282, 60.06896552	0.59
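As a quick consistency check on Table 2, the sketch below recomputes each ratio from the two listed coordinate values. It assumes, as the numbers suggest, that the ratio is the first value divided by the second, rounded to two decimals; the variable names are illustrative only.

```python
# (first value, second value, listed ratio), copied from Table 2
table2 = [
    (33.9030837, 136.27586207, 0.25),
    (45.81497797, 96.82758621, 0.47),
    (65.05726872, 63.65517241, 1.02),
    (25.65638767, 109.37931034, 0.23),
    (46.73127753, 173.93103448, 0.27),
    (25.65638767, 41.24137931, 0.62),
    (20.15859031, 83.37931034, 0.24),
    (118.20264317, 140.75862069, 0.84),
    (35.73568282, 60.06896552, 0.59),
]

# Recompute the ratio of every anchor from its two coordinate values
recomputed = [round(a / b, 2) for a, b, _ in table2]
matches = [r == listed for r, (_, _, listed) in zip(recomputed, table2)]
```

Seven of the nine anchors have a ratio well below 1, which reflects how different these boxes are from the default 1:2, 1:1 and 2:1 anchors.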

ROI align

Faster RCNN obtains a series of candidate frame regions of different sizes, which must be classified accurately and whose position coordinates must be determined. These candidate frame regions of different sizes need to be normalized to the same size before they can be input to the next layer of the convolutional neural network. The ROI Pooling layer in the original Faster R-CNN crops preselection boxes of different sizes into feature maps with a fixed scale. There are two rounding procedures in the whole network framework, as follows:

The candidate area on the input image is mapped to the corresponding position on the feature map, and the mapped floating-point coordinates are rounded down to integer matrix coordinates;

The obtained area is divided evenly into k × k bins, and the floating-point boundaries of each bin are again quantized by rounding.

After these two quantization errors, the resulting frame deviates considerably from the original image, and the impact is more prominent for smaller defects. In this paper, welding defects are usually smaller than 20 pixels. After the feature extraction network extracts features, such a defect is scaled down to only 0.7 pixels on the feature map, so the rounding easily causes the loss of defect information. This is known as the region mismatch problem.

ROI-Align³¹ uses bilinear interpolation to obtain image values at floating-point coordinates when mapping between the ROI region and the original image, thereby turning the whole feature aggregation process into a continuous operation and solving the region mismatch problem. The procedure is as follows:

Divide the candidate frame area into equal parts according to the required output size. The vertices after segmentation will probably not fall on real pixels;

Take four fixed points in each bin, and for each point, weight the values of the four nearest real pixels (bilinear interpolation) to obtain the value of that point;

Four new values are thus calculated in each bin, and their maximum is taken as the output value of the bin, which finally gives a 2 × 2 output.

As shown in Figure 6, the dashed grid represents the feature map (5 × 5), and the solid grid is the ROI with variable size, that is, the candidate area. For the four sampling points determined in each bin, bilinear interpolation at their floating-point coordinates gives the values at these four positions, and a max aggregation operation is then performed to obtain an output of fixed dimensions for the region of interest. After this processing, the model obtains more accurate candidate feature regions, and the detection network further strengthens its ability to detect small defects and achieves higher precision.
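The ROI-Align procedure described above can be sketched in pure Python. This is a minimal single-channel illustration under stated assumptions: the ROI is given directly in feature-map coordinates as floats, four sampling points are placed at regular positions inside each bin, and max aggregation is used as in the text. The function names and the example ROI are hypothetical, and library implementations may use different coordinate conventions.

```python
import math

def bilinear(fmap, y, x):
    """Bilinearly interpolate a 2-D list at the floating-point point (y, x)."""
    h, w = len(fmap), len(fmap[0])
    y = min(max(y, 0.0), h - 1.0)          # clamp to the feature map
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def roi_align(fmap, roi, out_size=2, samples=2):
    """roi = (y1, x1, y2, x2) in feature-map coordinates, never rounded."""
    y1, x1, y2, x2 = roi
    bin_h = (y2 - y1) / out_size
    bin_w = (x2 - x1) / out_size
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            values = []
            for sy in range(samples):
                for sx in range(samples):
                    # regularly spaced sampling points inside bin (i, j)
                    py = y1 + (i + (sy + 0.5) / samples) * bin_h
                    px = x1 + (j + (sx + 0.5) / samples) * bin_w
                    values.append(bilinear(fmap, py, px))
            row.append(max(values))        # max aggregation over the four samples
        out.append(row)
    return out

# 5 x 5 toy feature map whose value at (r, c) is 5 * r + c
fmap = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
pooled = roi_align(fmap, (0.6, 0.8, 3.4, 4.0))   # hypothetical ROI
```

Because no coordinate is rounded at any stage, a shift of a fraction of a pixel in the ROI changes the output smoothly, which is the property that matters for very small defects.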

Figure 6.

ROI-align.

Experimental system

Laboratory equipment

The hardware of the experiment consists of a computer, image acquisition equipment, and so on. The computer is configured with a 4.2 GHz Intel Core i7-10700K CPU, 16 GB of memory, and a GeForce RTX 2060 graphics card. The image acquisition equipment is a French M2M desktop ultrasonic phased array detector, and the imaging method is full-focus imaging, with 10 and 5 MHz probes. The experimental software system includes CUDA 10.1, Python 3.7.2, the OpenCV vision library, a Python deep learning framework, and the Windows 10 operating system. The data set marking tool is LabelImg.

Image acquisition and enhancement

This paper takes the weld defects in the welded structure of a certain piece of weapon equipment as the research object, and uses the M2M desktop ultrasonic phased array detector with 10 and 5 MHz probes. By changing the scanning angle, 2124 ultrasonic full-focus images of internal defects in the weld were collected. The acquisition scene is shown in Figure 7, and some defect samples are shown in Figure 8. According to the defect data, there are five types of internal defects in the weld: slag inclusion, cracks, stoma, incomplete penetration, and incomplete fusion.

Figure 7.

The experimental equipment.

Figure 8.

Some defect samples.

To obtain a better neural network model and avoid the low adaptability caused by too few samples, a larger number of data samples is usually required for training. In this paper, the defect data set is expanded by cropping, brightness adjustment, and random rotation. About 4950 sample images were obtained, which were divided into a training set, a validation set, and a test set at a ratio of 7:2:1. Finally, the LabelImg labeling tool is used to label all defect images in the data set according to the standard data set labeling format.
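The 7:2:1 division can be sketched as a shuffled split. The helper below is hypothetical and only illustrates the arithmetic: the sample count of 4950 is the approximate figure given above, and the fixed random seed is an assumption made so that the split is reproducible.

```python
import random

def split_dataset(items, ratios=(7, 2, 1), seed=0):
    """Shuffle items and split them into training, validation and test sets."""
    items = list(items)
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]        # remainder goes to the test set
    return train, val, test

# Image indices stand in for the roughly 4950 augmented samples
train, val, test = split_dataset(range(4950))
```

With 4950 samples this gives 3465 training, 990 validation and 495 test images.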

Evaluation index

In this paper, a detection is considered effective as long as the detection frame contains defect information and the overlapping area is larger than 75%. To evaluate the performance of the defect recognition network, precision, recall, F1 Score, average precision, and mean average precision are selected as evaluation indicators. Precision is the proportion of true positives ( $T_{p}$ ) among all samples that the model judges to be positive ( $T_{p}$ + $F_{p}$ ). Recall is the proportion of true positives ( $T_{p}$ ) among all positive samples in the data set ( $T_{p}$ + $F_{N}$ ). The F1 Score is a statistical index for measuring the accuracy of a binary classification model that takes both precision and recall into account. It is the harmonic mean of the precision and recall of the model, with a maximum value of 1 and a minimum value of 0. For a specific target category, the detection effect of the algorithm is expressed by the average precision, and the average over m categories is expressed by the mean average precision.

The equations are as follows:

P = \frac{T_{p}}{T_{p} + F_{p}}

R = \frac{T_{p}}{T_{p} + F_{N}}

F_{1} = \frac{2 PR}{P + R}

AP = \int_{0}^{1} P (R) dR

mAP = \frac{1}{m} \sum_{i = 1}^{m} A P_{i}

P is the precision; R is the recall; $F_{1}$ is the F1 Score; $T_{p}$ is the number of defects that are correctly identified; $F_{p}$ is the number of samples that are incorrectly identified as defects; $F_{N}$ is the number of defects that are not detected; AP is the average precision; mAP is the mean average precision; m is the number of defect categories.
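The equations above translate directly into code. The sketch below is illustrative only: the counts and the precision-recall points are hypothetical, and the integral for AP is approximated with the trapezoidal rule, since the text does not specify which interpolation scheme is used.

```python
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f1_score(p, r):
    return 2 * p * r / (p + r)

def average_precision(recalls, precisions):
    """Trapezoidal approximation of the integral of P(R) over R."""
    points = sorted(zip(recalls, precisions))
    area = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        area += (r1 - r0) * (p0 + p1) / 2
    return area

def mean_average_precision(aps):
    return sum(aps) / len(aps)

# Hypothetical counts: 90 correct detections, 10 false alarms, 30 misses
p = precision(tp=90, fp=10)
r = recall(tp=90, fn=30)
f1 = f1_score(p, r)

# Hypothetical precision-recall curve with three points
ap = average_precision([0.0, 0.5, 1.0], [1.0, 1.0, 0.5])
```

Because the F1 Score is a harmonic mean, it is pulled toward the lower of the two values, so a class with high precision but low recall still receives a modest score.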

Results and discussion

Test results

The algorithm proposed in this paper has been verified on the data set of internal weld defects. Some defect detection results are shown in Figure 9. The target detection network achieves good results in identifying weld defects. Even under slight noise interference, the detection frames accurately enclose the defects to be detected.

Figure 9.

Some defect detection results.

Table 3 shows the recognition accuracy of the five types of defects. The overall recognition effect on the ultrasonic atlas of the five kinds of welding defects is good: the precision, recall, and $F_{1}$ score mostly stay around 90%, the main exception being the recall of “crack” (73.1%).

Table 3.

The recognition accuracy of five types of defects.

Defects	Slag inclusion	Incomplete penetration	Incomplete fusion	Stoma	Crack
P (%)	91.6	89.6	89.6	89.6	91.5
R (%)	95.6	92.8	91.2	86.5	73.1
$F_{1}$ (%)	93.6	91.2	90.4	88.0	81.3

The average precision of “slag inclusion” is the highest, reaching 97.3%, while that of “crack” is the lowest, at 88.9%. The overall mean average precision is 93.72%. The algorithm in this paper can accurately detect five types of weld defects: “slag inclusion,” “cracks,” “stoma,” “incomplete penetration,” and “incomplete fusion.” Among them, the recognition effect for small-size defects is relatively poor. This is because small-size defects occupy a relatively small area of the whole image and their detail features are not obvious, which easily leads to missed detection. Although weld defects are small targets and many kinds of defects coexist, the algorithm can detect multiple small target defects at the same time, which supports the reliability of the method for detecting multiple small target weld defects and shows that it classifies the various defect types well.

The original Faster RCNN algorithm is evaluated on the test set, and its performance is compared with that of the improved algorithm in this paper. As shown in Table 4, the improved algorithm performs better in multi-scale target detection than the original algorithm. In particular, the average precision of small target defects such as “stoma” and “crack” increases by 6.9% and 7.5% respectively. The results show that the multi-scale feature extraction method benefits the detection of small targets, that the overall mean average precision is improved by 4.8%, and that the anchor scheme obtained by K-means clustering is more suitable for the defect data set in this paper. Overall, the accuracy of the improved algorithm on the weld defect data set is higher than that of the original algorithm.

Table 4.

Improved algorithm in this article compared with the original algorithm.

Algorithm	Slag inclusion AP (%)	Incomplete penetration AP (%)	Incomplete fusion AP (%)	Stoma AP (%)	Crack AP (%)	mAP (%)
Before improvement	95.6	91.7	90.3	85.6	81.4	88.92
After improvement	97.3	95.5	94.4	92.5	88.9	93.72
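
The mAP column of Table 4 is the arithmetic mean of the five per-class AP values, and the reported gains are differences between the two rows. The following minimal sketch reproduces that arithmetic from the table; the variable and function names are illustrative and do not come from the original work.

```python
# Per-class AP values (%) copied from Table 4; names are illustrative.
CLASSES = ["slag inclusion", "incomplete penetration",
           "incomplete fusion", "stoma", "crack"]
ap_before = [95.6, 91.7, 90.3, 85.6, 81.4]
ap_after = [97.3, 95.5, 94.4, 92.5, 88.9]


def mean_ap(ap_values):
    """Return mAP as the arithmetic mean of per-class AP, rounded to 2 decimals."""
    return round(sum(ap_values) / len(ap_values), 2)


map_before = mean_ap(ap_before)
map_after = mean_ap(ap_after)

# Per-class gain of the improved algorithm over the original one.
gains = {name: round(after - before, 1)
         for name, before, after in zip(CLASSES, ap_before, ap_after)}

print(map_before, map_after, round(map_after - map_before, 2))
print(gains)
```

The per-class gains make the pattern in the table explicit: the two smallest defect types, “stoma” and “crack,” gain the most.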

Ablation experiment

To explore how each improvement contributes to the overall performance of the algorithm, the experimental schemes shown in Table 5 are designed and tested. Option 1 is the original Faster RCNN model; Option 2 uses ResNet50 as the feature extraction network; Option 3 adds a deformable network to Option 2; Option 4 adds FPN to Option 3; Option 5 adds anchors customized by K-means clustering to Option 4; and Option 6 additionally replaces ROI pooling with ROI Align. The experimental results are shown in Table 6.

Table 5.

Ablation experiment scheme.

No.	ResNet50	Deformable networks	FPN	K-means	ROI align
1	×	×	×	×	×
2	√	×	×	×	×
3	√	√	×	×	×
4	√	√	√	×	×
5	√	√	√	√	×
6	√	√	√	√	√
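
The scheme in Table 5 is cumulative: each option keeps every component of the previous one and switches on one more. A minimal sketch of that structure, with hypothetical field names chosen only for illustration, could look like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AblationOption:
    """One row of Table 5; the field names are illustrative."""
    resnet50: bool = False
    deformable: bool = False
    fpn: bool = False
    kmeans_anchors: bool = False
    roi_align: bool = False


# Components in the order in which Table 5 switches them on.
ORDER = ["resnet50", "deformable", "fpn", "kmeans_anchors", "roi_align"]


def build_options():
    """Option n enables the first n-1 components, so Option 1 is the baseline."""
    return {n + 1: AblationOption(**{name: True for name in ORDER[:n]})
            for n in range(len(ORDER) + 1)}


options = build_options()
for number, option in options.items():
    marks = ["√" if getattr(option, name) else "×" for name in ORDER]
    print(number, *marks)
```

Because the options are nested, the difference between two consecutive rows of the results can be attributed to the single component that was added.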

Table 6.

Ablation experiment results.

Option	Slag inclusion AP (%)	Incomplete penetration AP (%)	Incomplete fusion AP (%)	Stoma AP (%)	Crack AP (%)	mAP (%)
1	95.6	91.7	90.3	85.6	81.4	88.92
2	96.1	92.4	91.7	87.4	84.3	90.38
3	96.7	93.5	93.2	90.6	86.9	92.18
4	96.8	94.0	93.9	91.6	87.7	92.8
5	97.1	94.4	94.3	91.8	88.1	93.18
6	97.3	95.5	94.4	92.5	88.9	93.72

The overall mean average precision of the original Faster RCNN model is 88.92%. When ResNet50 is used as the feature extraction network, it increases by 1.46%. When the deformable network and FPN are also used (Option 4), the increase over the original model reaches 3.88%, and the average precision of small target defects such as “stoma” and “crack” improves markedly, reaching 91.6% and 87.7%, respectively. When K-means anchor customization and ROI Align are added, the overall mean average precision rises by a further 0.92%, from 92.8% to 93.72% in Table 6. These results show that the improved scheme substantially improves on the original Faster RCNN model and that the complete improved algorithm (Option 6) achieves the highest detection performance among the tested options.
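
Since the options are cumulative, the contribution of each component can be read off as the difference between consecutive mAP values. The sketch below does this with the mAP column as reported in Table 6; the dictionary and function names are illustrative.

```python
# mAP (%) per option, as reported in Table 6.
reported_map = {1: 88.92, 2: 90.38, 3: 92.18, 4: 92.80, 5: 93.18, 6: 93.72}

# Component added by each option relative to the previous one (from Table 5).
added = {2: "ResNet50", 3: "deformable network", 4: "FPN",
         5: "K-means anchors", 6: "ROI Align"}


def stepwise_gains(values):
    """Gain of each option over the previous option."""
    return {n: round(values[n] - values[n - 1], 2)
            for n in sorted(values) if n - 1 in values}


gains = stepwise_gains(reported_map)
for n, gain in gains.items():
    print(f"Option {n} (+{added[n]}): +{gain:.2f}")
print("total:", round(reported_map[6] - reported_map[1], 2))
```

Read this way, the backbone changes (ResNet50 and the deformable network) account for most of the total gain, while FPN, K-means anchors and ROI Align each add a smaller increment.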

Algorithm comparison

The performance of the algorithm proposed in this paper is compared with that of other target detection algorithms: a YOLO-based algorithm, an SSD-based algorithm, and an improved Faster RCNN algorithm proposed by other researchers. The YOLO algorithm follows the method of Chen et al.,³² a deep learning method for B-scan image recognition of rail defects based on an improved YOLO V3 algorithm, which modifies the network structure of the YOLO V3 model and uses the Darknet-53 feature extraction network to expand the receptive field of the model. The SSD algorithm follows the method of Yang et al.,³³ a solid wood board surface defect detection method based on SSD, in which ResNet replaces the VGG part of the original SSD network and both the regression of the predicted bounding boxes and the input features of the classification task are optimized. The other improved Faster RCNN algorithm follows the method of Yan et al.,³⁴ an improved Faster RCNN method for prickly pear fruit recognition, which selects VGG16 as the feature extraction network and replaces ROI pooling with ROI Align feature aggregation. For a fair comparison, the same training set and test set are used for all algorithms, and the algorithms are compared in terms of detection accuracy. The experimental results are shown in Table 7.

Table 7.

Algorithm comparison experimental results.

Algorithm	Slag inclusion AP (%)	Incomplete penetration AP (%)	Incomplete fusion AP (%)	Stoma AP (%)	Crack AP (%)	mAP (%)	Time (ms)
YOLO	90.76	89.89	87.32	84.50	82.33	86.96	58
SSD	90.62	90.30	90.91	86.48	84.19	88.50	90
Improved faster RCNN (others)	91.15	90.19	90.20	88.30	87.23	89.41	200
Improved faster RCNN (this article)	97.3	95.5	94.4	92.5	88.9	93.72	127

The overall mean average precision of the YOLO algorithm is 86.96%, that of the SSD algorithm is 88.50%, and that of the improved Faster RCNN algorithm proposed by others is 89.41%. The algorithm in this paper exceeds these three algorithms by 6.76%, 5.22%, and 4.31%, respectively. The YOLO and SSD algorithms are less effective at detecting small objects and are prone to target positioning errors, so their ability to detect tiny defects such as “stoma” and “crack” is weak. The improved Faster RCNN algorithm proposed by others performs better than YOLO and SSD because VGG16 extracts the features to be detected and the RPN extracts the location information. However, because its feature extraction network has not been improved, its detection performance does not reach the level of the algorithm proposed in this paper. The improved algorithm in this paper retains the high precision of the Faster RCNN algorithm and improves its small-target detection ability.

However, the experiments also show that the detection time is 200 ms for the Faster RCNN algorithm improved by others and 127 ms for the method proposed in this paper, both significantly slower than the single-stage YOLO (58 ms) and SSD (90 ms) algorithms. In this paper, the ultrasonic atlas is first acquired, the image is then enhanced, and the defects are finally detected. This process places low demands on real-time performance but high demands on precision. Although the Faster RCNN-based method proposed in this paper is slower than the single-stage algorithms, its detection accuracy is higher, so it still meets the needs of practical application scenarios.
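
The trade-off between accuracy and speed in Table 7 can be made concrete by converting detection time into throughput and filtering the detectors against application limits. The sketch below assumes that the reported time is per image; the thresholds are hypothetical values chosen only for illustration, not requirements stated in this paper.

```python
# (mAP %, detection time in ms) from Table 7.
# Assumption: the reported time is the detection time for one image.
results = {
    "YOLO": (86.96, 58),
    "SSD": (88.50, 90),
    "Improved Faster RCNN (others)": (89.41, 200),
    "Improved Faster RCNN (this article)": (93.72, 127),
}


def images_per_second(time_ms):
    """Convert a detection time in milliseconds to throughput."""
    return 1000.0 / time_ms


def meets_requirements(name, min_map, max_time_ms):
    """Check one detector against hypothetical accuracy and latency limits."""
    map_value, time_ms = results[name]
    return map_value >= min_map and time_ms <= max_time_ms


for name, (map_value, time_ms) in results.items():
    print(f"{name}: {map_value:.2f}% mAP, {time_ms} ms, "
          f"{images_per_second(time_ms):.1f} images/s")

# Hypothetical limits for an offline inspection step.
selected = [name for name in results
            if meets_requirements(name, min_map=90.0, max_time_ms=150)]
print(selected)
```

Under these illustrative limits only the method of this article qualifies, which mirrors the argument that an offline inspection workflow can trade some speed for precision.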

Conclusions

To meet the needs of detecting internal weld defects, a Faster RCNN deep learning network combined with an improved ResNet50 is proposed. On the basis of the original Faster RCNN network, ResNet50 is adopted as the feature extraction network, and a deformable network and FPN are introduced into ResNet50 to enhance the detection of small target defects. K-means clustering is performed on the data set to generate anchors that are better suited to weld defects. The original ROI pooling algorithm is replaced with the ROI Align algorithm to make the candidate frames more accurate and to obtain higher positioning accuracy. A comparison with the original Faster RCNN algorithm verifies that the overall mean average precision of the proposed algorithm is improved by 4.8%, and that the detection precision of small target defects such as “stoma” and “crack” is improved by 6.9% and 7.5%, respectively. A comparison with other mainstream algorithms verifies the superiority of this algorithm in recognition accuracy; however, its detection time remains long. Although the detection time meets current requirements, the parameters and architecture of the network model can be further adjusted and optimized so that the model becomes simpler and faster, and future research will therefore focus on detection speed. This work has studied several common weld defects in detail. The model can effectively detect all kinds of defects in welds, especially when several small targets are present at the same time. On-site verification shows that it meets the needs of daily industrial detection and improves detection efficiency. The algorithm proposed in this paper therefore has reference value in the field of weld defect detection.
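
As an illustration of the anchor customization step summarized above, the following is a minimal sketch of K-means clustering over bounding-box widths and heights. It assumes the commonly used 1 − IoU distance and synthetic box sizes; the distance measure, the number of clusters and the data used in this paper are not restated in this section, so every name and value below is hypothetical.

```python
import numpy as np


def iou_wh(boxes, centroids):
    """IoU between (w, h) pairs, assuming all boxes share the same top-left corner."""
    inter_w = np.minimum(boxes[:, None, 0], centroids[None, :, 0])
    inter_h = np.minimum(boxes[:, None, 1], centroids[None, :, 1])
    inter = inter_w * inter_h
    area_b = boxes[:, 0] * boxes[:, 1]
    area_c = centroids[:, 0] * centroids[:, 1]
    return inter / (area_b[:, None] + area_c[None, :] - inter)


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) pairs into k anchors; the nearest centroid has the highest IoU."""
    rng = np.random.default_rng(seed)
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)].astype(float)
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, centroids), axis=1)
        new = centroids.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                new[j] = members.mean(axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    # Sort anchors by area so the output order is deterministic.
    return centroids[np.argsort(centroids[:, 0] * centroids[:, 1])]


# Synthetic box sizes in pixels: one group of small boxes and one of large boxes.
rng = np.random.default_rng(1)
small = rng.normal([20, 20], 2, size=(50, 2))
large = rng.normal([120, 60], 5, size=(50, 2))
boxes = np.abs(np.vstack([small, large]))

anchors = kmeans_anchors(boxes, k=2)
print(anchors.round(1))
```

On real annotations, the resulting widths and heights would replace the default anchor sizes and aspect ratios of the region proposal network, so that the initial anchors already resemble the small defects in the data set.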

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by the National Natural Science Foundation of China (NSFC), grant numbers (52075270); Science and Technology Plan Project of Inner Mongolia, grant numbers (2020GG0160); Young Science and Technology Talents Support Plan Project of Inner Mongolia, grant numbers (NJYT22063); and Natural Science Foundation of Inner Mongolia, grant numbers (2019MS05041); Technical Basic Research Project of National Defense Science and Industry Bureau, grant numbers (JSZL2018208C004).

ORCID iD

Changhong Chen

Data availability

The data used to support the findings of this study are available from the corresponding author upon request.

References

1. Wang, Zhao, et al. Non-destructive detection of GIS aluminum alloy shell weld based on oblique incidence full focus method. Mater Sci Forum 2020; 1007: 105–110.
2. Xiao, Kang SC. Development of an image data set of construction machines for deep learning object detection. J Comput Civ Eng 2021; 35(2): 05020005.
3. Girshick, Donahue, Darrell, et al. Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, 2014.
4. Girshick. Fast R-CNN. In: Proceedings of the IEEE international conference on computer vision, 2015.
5. Ren, Girshick, et al. Faster R-CNN: towards real-time object detection with region proposal networks. Adv Neural Inf Process Syst 2015; 28: 91–99.
6. Redmon, Divvala, Girshick, et al. You only look once: unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016.
7. Redmon, Farhadi. YOLO9000: better, faster, stronger. In: Proceedings of the IEEE conference on computer vision and pattern recognition, 2017.
8. Joseph, Farhadi. Yolov3: an incremental improvement. arXiv preprint arXiv:1804.02767, 2018.
9. Bochkovskiy, Chien-Yao, Hong-Yuan Mark. Yolov4: optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934, 2020.
10. Liu, Anguelov, Erhan, et al. SSD: single shot multibox detector. In: European conference on computer vision. Cham: Springer, 2016.
11. Pyle, Hughes, Ali AAS, et al. Uncertainty quantification for deep learning in ultrasonic crack characterization. IEEE Trans Ultrason Ferroelectr Freq Control 2022; 69(7): 2339–2351.
12. Zhang, Chen, Zhang, et al. Weld defect detection based on deep learning method. In: 2019 IEEE 15th international conference on automation science and engineering (CASE), 2019. New York: IEEE.
13. Haixing, Weimin, Zhenzhong. Deep learning-based classification of weld surface defects. Appl Sci 2019; 9(16): 3312.
14. Wen-ming GUO, Liu, Hui-fan QU. Welding defect detection of X-ray images based on faster R-CNN model. J Beijing Univ Posts Telecom 2019; 42(6): 20–28.
15. Liu, Zhang, Gao, et al. Weld defect images classification with vgg16-based neural network. In: International forum on digital TV and wireless multimedia communications. Singapore: Springer, 2017.
16. Jung, Lim, et al. Automatic detection of welding defects using faster R-CNN. Appl Sci 2020; 10(23): 8629.
17. Zhang, Shen. Solder joint defect detection in the connectors using improved faster-RCNN algorithm. Appl Sci 2021; 11(2): 576.
18. Zhou, Wang, et al. Detection of micro-defects on irregular reflective surfaces based on improved faster R-CNN. Sensors 2019; 19(22): 5000.
19. Yin, Yang. Detection of steel surface defect based on faster R-CNN and FPN. In: 2021 7th international conference on computing and artificial intelligence, 2021.
20. Yang, Cui, et al. Deep learning based steel pipe weld defect detection. Appl Artif Intell 2021; 35: 1237–1249.
21. Theivaprakasham. Identification of Indian butterflies using deep convolutional neural network. J Asia Pac Entomol 2021; 24(1): 329–340.
22. Ananda, Ngan, Karabağ, et al. Classification and visualisation of normal and abnormal radiographs; a comparison between eleven convolutional neural network architectures. medRxiv 2021; 21: 5381.
23. Ali, Muzammil, Haq, et al. Deep feature selection and decision level fusion for lungs nodule classification. IEEE Access 2021; 9: 18962–18973.
24. Anilkumar, Manoj, Sagi TM. Automated detection of leukemia by pretrained deep neural networks and transfer learning: a comparison. Med Eng Phys 2021; 98: 8–19.
25. Ahn, Kim, Rhim, et al. Multi-view convolutional neural networks in rupture risk assessment of small, unruptured intracranial aneurysms. J Pers Med 2021; 11(4): 239.
26. Shao, Zheng, Zhang. Deep convolutional neural networks for thyroid tumor grading using ultrasound B-mode images. J Acoust Soc Am 2020; 148(3): 1529–1535.
27. Fangyan, Meng, Yizhi, et al. Detection of diseased Takifugu rubripes based on ResNet50 and transfer learning. Fish Modern 2021; 48(4): 51.
28. Dai, Xiong, et al. Deformable convolutional networks. In: Proceedings of the IEEE international conference on computer vision, 2017.
29. Lin, Dollár, Girshick, et al. Feature pyramid networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, 2017.
30. Zhang, Song, Zhang. Deep learning-based object detection improvement for tomato disease. IEEE Access 2020; 8: 56607–56614.
31. Bai, Pang, Wang, et al. An optimized faster R-CNN method based on DRNet and RoI align for building detection in remote sensing images. Remote Sens 2020; 12(5): 762.
32. Chen, Wang, Yang, et al. Deep learning for the detection and recognition of rail defects in ultrasound B-scan images. Transp Res Rec 2021; 2675(11): 888–901.
33. Yang, Wang, Jiang, et al. Surface detection of solid wood defects based on SSD improved with ResNet. Forests 2021; 12(10): 1419.
34. Yan, Zhao, Zhang, et al. Recognition of Rosa roxburghii in natural environment based on improved faster RCNN. Trans Chin Soc Agricult Eng 2019; 35(18): 143–150.

An improved faster RCNN-based weld ultrasonic atlas defect detection method
