Abstract
Welding is a critical process in industries such as construction, manufacturing, and automotive, where weld quality directly impacts structural integrity and safety. Traditional manual inspection of weld defects via radiographic testing is time-consuming, subjective, and prone to error, underscoring the need for an automated solution. We propose Weld-CNN, a hybrid convolutional neural network that combines sequential convolutional layers with parallel blocks to effectively extract both low-level and high-level features from X-ray images. Trained on a comprehensive dataset of 24,407 X-ray images covering four weld defect categories (cracks, porosity, lack of penetration, and no defect), Weld-CNN achieved a test accuracy of up to 99.83%. The outstanding performance of Weld-CNN demonstrates its potential as a reliable tool for automated, non-destructive weld defect detection, offering significant improvements in efficiency and quality control over manual methodologies.
Introduction
Welding is a critical process in various industries, including construction, manufacturing, and automotive. The quality of welds directly impacts the structural integrity, safety, and longevity of welded products. 1 According to recent industry reports, weld defects account for approximately 50%–80% of all structural failures, highlighting the crucial need for effective quality control measures. 2
Non-Destructive Testing (NDT) methods have become indispensable in ensuring weld quality without compromising the integrity of the welded structure. Among these methods, Radiographic Testing (RT) has gained prominence due to its ability to detect internal defects that are not visible to the naked eye or through surface inspection techniques. RT utilizes X-ray or gamma-ray technology to produce detailed images of the weld’s internal structure, enabling the identification of defects such as porosity, cracks, and lack of penetration.3,4
Traditionally, the analysis of radiographic images has relied on human expertise, which presents several challenges. Manual inspection is time-consuming, subjective, and prone to human error. Studies have shown that even skilled inspectors achieve an accuracy of only about 80% under optimal conditions. 5 Moreover, the increasing demand for high-throughput production in modern industry has made manual inspection a bottleneck in the quality control process.
To address these limitations, there has been a growing interest in applying artificial intelligence and deep learning techniques to automate weld defect detection and classification. Recent advancements in Convolutional Neural Networks (CNNs) have shown promising results in image classification tasks, making them particularly suitable for analyzing radiographic images of welds.
This study presents Weld-CNN, a novel hybrid CNN architecture designed specifically for weld defect detection and classification. Our proposed model combines successive layers and parallel blocks to effectively extract and process features from radiographic images. The primary objectives of this research are:
To develop a robust and accurate deep learning model for automated weld defect detection and classification.
To evaluate the performance of Weld-CNN against existing state-of-the-art models in the field.
To contribute to the broader goal of enhancing quality control processes in welding applications through AI-driven solutions.
Our work builds upon previous research in this domain, addressing limitations such as low classification accuracy and the complexity of transfer learning from standard networks. We utilize a comprehensive dataset of 24,407 X-ray images, encompassing four types of weld conditions: lack of penetration, cracks, porosity, and no defect.
The rest of this paper is organized as follows: Section Related work provides an overview of related work in the field of automated weld defect detection. Section The proposed methodology describes the background of convolutional neural networks and details our proposed Weld-CNN architecture and methodology. Section Experimental setup presents the experimental setup and dataset. Section Result and analysis discusses the results and analysis of our model’s performance. Finally, Section Conclusions concludes the paper and outlines directions for future research.
Related work
In recent years, the rapid growth of deep learning has spurred significant interest in the detection and classification of weld defects from radiographic images. Traditionally, weld inspection relied on manual interpretation of radiographic testing results—a process that is both time-consuming and prone to subjective error. The shift toward automated, non-destructive testing (NDT) methods has led researchers to explore various convolutional neural network (CNN) architectures and hybrid models to improve both accuracy and efficiency.
Totino et al. 6 made an early contribution by introducing the RIAWELC dataset, which comprises 24,407 X-ray images labeled across four defect classes. Their work employed a customized SqueezeNet architecture with approximately 724,548 parameters, achieving a test accuracy of 93.33%. Although their study clearly demonstrated the potential of CNNs for weld defect classification, it also highlighted challenges in capturing fine-grained defect details with relatively shallow networks.
Building on these ideas, Pan et al. 1 proposed a transfer learning-based model, TL-MobileNet, that leverages pre-training on large-scale datasets such as ImageNet. By adding layers such as a fully connected layer and employing techniques including DropBlock and global average pooling, they achieved a prediction accuracy of 97.69% on weld defect data. Despite the performance gains, the reliance on pre-trained models raises concerns about generalizability in scenarios with a limited number of defect samples.
Stephen and Lalu 7 tackled the data scarcity issue by using a small dataset of only 200 images, which was augmented to generate 16,000 training examples. Their CNN model, comprising several convolutional layers and trained on the augmented data, reached a training accuracy of 99.2% and a validation accuracy of 95%. However, the limited diversity of the original dataset poses challenges for generalization across various defect types and imaging conditions.
Addressing the need for enhanced feature representation, Yang and Jiang 8 introduced a unified deep neural network that features multi-level feature fusion. Their approach integrated contrast-based features (such as histogram contrast, roughness, skewness, and kurtosis) to capture the nuanced differences between defect regions and their backgrounds. While this method improved robustness, it did not benchmark its performance against more established CNN architectures nor address the limitations imposed by small datasets.
Further efforts in the field include Say et al., 9 who developed an automated system that combines X-ray image processing with CNN-based classification, augmented by techniques such as random rotation, shearing, and zooming. Their method, applied to an enriched version of the GDXray dataset, achieved an average accuracy of approximately 92% across six defect categories. Complementarily, Palma-Ramírez et al. 10 employed transfer learning with a ResNet50-based model for classifying four weld defect types. Although they obtained high accuracies on different datasets, their approach was largely dependent on the RIAWELC dataset and did not explore defects beyond a limited set of classes.
More recently, Zhang et al. 11 tackled the challenge of limited defective samples by adopting a dual augmentation strategy: traditional image-processing techniques for porosity defects and a Wasserstein generative adversarial network (WGAN) for other defect types. Their multi-model ensemble, which utilized pre-trained Inception and MobileNet architectures, sought to reduce the training data requirement and speed up convergence, though it provided limited comparative statistics. Meanwhile, Yang et al. 12 proposed a recognition algorithm that combined multi-scale feature extraction using transfer learning from AlexNet with decision fusion techniques (SVM and Dempster-Shafer theory), addressing the issues posed by weak-textured and low-contrast images. In a different vein, Wang et al. 13 presented an automated approach using RetinaNet to simultaneously detect multiple defect types directly on original X-ray images. Their work focused on a narrower range of defects but underscored the potential for real-time applications in an industrial context.
Despite these advancements, several challenges remain. Many existing approaches either suffer from high computational complexity due to an excessive number of parameters or exhibit limited generalizability due to reliance on small, domain-specific datasets. Moreover, few studies have thoroughly addressed the practical considerations of deploying such systems in real-time industrial environments.
In this context, our proposed Weld-CNN model seeks to address these challenges by incorporating a hybrid architecture that combines sequential layers for low-level feature extraction with parallel blocks for capturing diverse high-level representations. With a significantly reduced parameter count yet a remarkably high classification accuracy (99.83% on test data), Weld-CNN offers a robust and computationally efficient solution for automated weld defect detection, making it well-suited for implementation in quality control processes across various industrial settings.
The proposed methodology
Weld defects
Welding defects are anomalies in the welded structure’s morphology, dimensions, and metallurgical composition that deviate from design specifications and technical standards, compromising the structure’s mechanical properties and functionality. The rigorous inspection of weld quality is paramount in ensuring the structural integrity, performance, and longevity of welded assemblies.
Figure 1 illustrates the process of weld quality inspection using Radiographic Testing (RT), a prominent Non-Destructive Testing (NDT) method. The figure underscores the challenge of identifying internal weld defects through visual examination alone, highlighting the necessity for advanced inspection techniques.

Figure 1. Overview of weld defects.
In Figure 1, each image has been carefully selected from distinct weld samples, with the dominant defect in each instance clearly annotated. While it is indeed possible for multiple defects to co-occur within a single weld, for clarity and to facilitate precise classification, our dataset labels each image based on the primary observed defect. This approach simplifies the analysis and ensures that our CNN model is trained with unambiguous examples for each defect category.
Traditionally, weld quality assessment has employed two primary methodologies: Destructive Testing (DT) and Non-Destructive Testing (NDT). While DT provides valuable insights, it inherently renders the test specimen unusable, incurring significant costs in sample preparation and destruction. Furthermore, DT is unsuitable for applications requiring the preservation of structural integrity or the evaluation of critical components in situ.
To address these limitations, NDT methods have gained prominence in weld quality assurance. NDT enables the comprehensive evaluation of materials and structures without compromising their integrity or functionality. In the context of weld defect diagnosis, NDT facilitates the detection, localization, and characterization of anomalies while maintaining the weld’s structural continuity. This non-invasive approach is particularly crucial in industries demanding high reliability, such as construction, aerospace, and maritime engineering.
Among the various NDT techniques available for weld inspection, Radiographic Testing (RT) stands out as a highly effective method, particularly relevant to the dataset utilized in this study. RT employs X-rays or gamma rays to generate high-resolution images of the weld’s internal structure, revealing intricate details such as voids, cracks, and inclusions that may elude visual or surface inspection methods.
The advantages of RT in weld inspection are multifaceted:
It enables the detection of subsurface defects not visible to the naked eye.
It allows for the inspection of in-service components without the need for separate sample preparation.
It provides detailed information on defect morphology, including location, size, and geometry.
It facilitates early defect detection, enabling prompt remediation and quality assurance measures.
By leveraging RT, manufacturers and inspectors can ensure comprehensive weld quality assessment, thereby enhancing product reliability and safety across various industrial applications.
Figure 1 illustrates the three common types of weld defects investigated and detected in this study: cracks, porosity, and lack of penetration.
Cracks (CR) represent the most critical defects in weld joints. These discontinuities can manifest on the weld surface, within the weld itself, or in the heat-affected zone (HAZ), as illustrated in Figure 2. Crack formation can occur during two distinct phases of the welding process: during weld metal solidification at elevated temperatures exceeding

Figure 2. Structure of a weld with a crack defect.
Macroscopic cracks can often be identified through visual inspection or with the aid of a magnifying glass. However, microscopic cracks within the weld structure necessitate more sophisticated non-destructive testing (NDT) methods for detection, such as ultrasonic testing, magnetic particle inspection, or radiographic examination.
The significance of crack detection cannot be overstated. Even minute cracks, if left undetected and subsequently exposed to high temperatures or stress, can propagate rapidly, potentially compromising the structural integrity of the entire welded component. This underscores the critical importance of thorough and timely inspection procedures in welding applications.
Weld cracking is predominantly attributed to excessive welding temperatures or substandard welding materials. To mitigate the risk of cracking and enhance weld integrity, experts recommend a multi-faceted approach:
Material Selection: Utilize high-quality welding consumables that are compatible with the base material and meet the specific requirements of the application.
Preheat Treatment: Implement a preheating procedure for the weld area to reduce thermal gradients and minimize the risk of sudden temperature fluctuations during the welding process.
Controlled Cooling: Apply post-weld heat treatment (PWHT) to regulate the cooling rate, thereby reducing residual stresses and the potential for hydrogen-induced cracking.
These preventive measures serve to control the thermal cycle of the welding process, from preheating through to post-weld cooling. By managing heat input and dissipation, the likelihood of crack formation is significantly reduced. Furthermore, these techniques can contribute to improved mechanical properties and overall weld quality.
Adherence to these recommended practices can substantially enhance weld integrity and minimize the occurrence of crack-related defects, thereby improving the overall reliability and performance of welded structures.
Porosity (PO) is a weld defect characterized by the entrapment of gases within the solidifying weld metal. This phenomenon occurs when gases fail to escape from the liquid metal before solidification. Porosity can manifest internally or on the surface of the weld, often localized at the interface between the base metal and filler metal. The distribution of pores can be uniform, clustered, or isolated within the weld structure, as illustrated in Figure 3.

Figure 3. Structure of a weld with a porosity defect.
Several factors contribute to the formation of porosity:
Inadequate shielding: Insufficient protective atmosphere or environmental exposure, particularly wind speeds exceeding 5 m/s, can lead to rapid porosity formation.
Surface contamination: Presence of dirt, moisture, or other contaminants on the weld surface can introduce gases into the weld pool.
Welding parameters: Excessive welding speed or inappropriate arc length can disrupt the stability of the weld pool, promoting gas entrapment.
The presence of porosity in weld joints can significantly compromise mechanical properties and performance:
Reduced tensile strength, ductility, and hardness.
Decreased load-bearing capacity and stress response of the structure.
Impaired tightness and overall integrity of the joint.
Increased susceptibility to crack initiation and propagation.
Potential for sudden failure under stress.
To mitigate porosity formation, the following preventive measures are recommended:
Surface preparation: Thorough cleaning and drying of the base material prior to welding.
Material selection: Use of low-hydrogen welding consumables.
Process control: Optimization of welding speed and arc length.
Environmental control: Adequate shielding and protection from drafts.
Implementation of these practices can substantially reduce the incidence of porosity, thereby enhancing weld quality and structural integrity.
Lack of penetration (LP) is a critical weld defect characterized by inadequate fusion between the base metal and the weld metal. This condition arises when the volume of molten weld metal is insufficient to fully penetrate and integrate with the base material, as illustrated in Figure 4. LP represents a severe deficiency in welded joints, resulting in significantly compromised mechanical properties compared to properly executed welds. The diminished bonding strength associated with LP increases the susceptibility to crack initiation and propagation under mechanical stresses or environmental influences.

Figure 4. Structure of a weld with a lack-of-penetration defect.
The occurrence of LP can be attributed to several interconnected factors. Inadequate preparation of the welding edge, particularly insufficient bevel angle, is a primary contributor. Welding parameters play a crucial role, with suboptimal welding current or excessive welding speed often leading to LP. The manipulation of the welding electrode, including its angle and insertion method, must be precisely controlled to ensure proper fusion. Additionally, maintaining an appropriate arc length and ensuring the correct movement of the welding electrode along the weld axis are essential for preventing LP.
To mitigate the risk of LP, a comprehensive approach to weld preparation and execution is necessary. Prior to welding, thorough cleaning of the weld area is essential to remove contaminants that might impede fusion. Increasing the bevel angle and optimizing the weld gap can significantly improve penetration. Furthermore, careful adjustment of welding parameters, particularly current and speed, in accordance with technical specifications, is crucial for achieving adequate penetration. These preventive measures, when implemented consistently, can substantially reduce the incidence of LP and enhance the overall integrity and performance of welded joints.
The implications of LP extend beyond mere structural concerns, potentially affecting the safety, reliability, and longevity of welded components. As such, understanding and addressing the root causes of LP is paramount in various engineering disciplines where welding is a critical process. By adopting a rigorous approach to weld quality management, encompassing proper preparation, parameter optimization, and skilled execution, the risk of LP can be minimized, thereby ensuring the production of high-quality, durable welded structures.
Convolutional neural network
CNNs have demonstrated good performance and have emerged as one of the most prominent neural network architectures in deep learning, garnering significant attention from both industry and academia in recent years. 14 CNNs are feedforward neural networks capable of extracting features from data through a convolutional structure. Typically, a CNN comprises three main components: convolutional layers, pooling layers, and fully connected layers.
The convolutional and pooling layers are primarily responsible for image feature extraction, while the fully connected layers map the extracted features to the model’s output, such as classification results. Figure 5 illustrates the basic architecture of a CNN.

Figure 5. An overview of a convolutional neural network architecture.
Convolutional layers consist of multiple kernels (or filters) used to compute different feature maps, with the number of feature maps corresponding to the number of filters. Each feature map represents a local feature learned from the input data. The output of these filters is then passed through an activation function.
Pooling layers aim to achieve translation invariance by reducing the resolution of the feature maps. 15 This is accomplished through mathematical operations such as max-pooling or average-pooling, which aggregate local features within specific regions of the feature map. Consequently, the size of the feature map is reduced while preserving the principal features, enabling CNNs to learn characteristic features independent of their location in the input images.
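The convolution, activation, and pooling steps described above can be illustrated with a minimal pure-Python sketch. The toy image, the 2 × 2 edge kernel, and the function names are hypothetical and chosen only for illustration; they are not taken from the Weld-CNN implementation, and a real network learns its kernels during training.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and sum the products.

    As in most CNN frameworks, this is a cross-correlation: the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Activation function: negative responses are clipped to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max-pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy 5x5 "image" with a bright vertical stripe in the middle column (made-up data).
image = [[0, 0, 9, 0, 0] for _ in range(5)]

# Hypothetical kernel that responds to a dark-to-bright transition from left to right.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = relu(conv2d_valid(image, kernel))  # 4x4 feature map
pooled = max_pool(feature_map)                   # 2x2 map after pooling

print(feature_map)  # [[0, 18, 0, 0], [0, 18, 0, 0], [0, 18, 0, 0], [0, 18, 0, 0]]
print(pooled)       # [[18, 0], [18, 0]]
```

The pooled map is a quarter of the size of the feature map but still records that an edge was found in the left half of the image, which is the sense in which pooling reduces resolution while preserving the principal features.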
In the context of our research, we employ a specialized CNN architecture, referred to as Weld-CNN, for data training and analysis of weld defects.
The proposed Weld-CNN
Figure 6 presents a comprehensive schematic of the proposed Weld-CNN architecture. In this figure, the input radiographic image, resized to the standardized

Figure 6. The architecture of the Weld-CNN model.
Table 1. Parameters of the Weld-CNN.
The hybrid structure employed in Weld-CNN, which integrates sequential convolutional layers with parallel blocks, is particularly well-suited for defect detection in radiographic images. Radiographic images often exhibit complex patterns with both subtle textures and prominent structural features. The sequential layers are effective in progressively extracting low-level features and refining them into more abstract representations, while the parallel blocks facilitate simultaneous learning of diverse feature sets at the same level. This combination allows the network to capture a broader range of defect characteristics, thus enhancing the model’s ability to differentiate between various types of weld defects. Moreover, the parallel structure helps in reducing the risk of overfitting by diversifying the learned features, which is especially beneficial when dealing with subtle variations and noise inherent in radiographic imaging.
Initially, the network comprises a series of convolutional layers with
Following the initial layers, the architecture integrates two consecutive blocks, each containing four parallel sub-blocks. Each sub-block consists of a
To reduce computational complexity, the network incorporates a single block with four parallel convolutional layers, accompanied by subsequent max-pooling operations. To mitigate the risk of overfitting, a dropout layer with a 20% dropout ratio is employed at the network’s output stage. This is followed by two fully connected layers and a softmax function, which facilitate the final classification output.
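The output stage described above (dropout with a 20% ratio, followed by the classifier and a softmax) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fully connected layers are omitted, the logit values are made up, the class ordering is hypothetical, and the "inverted" rescaling of kept activations is the common convention rather than something the text specifies.

```python
import math
import random


def dropout(values, rate=0.2, rng=None, training=True):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest.

    At inference time (training=False) the input passes through unchanged.
    """
    if not training or rate == 0.0:
        return list(values)
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical class order and made-up logits for one image (illustration only).
CLASSES = ["crack", "lack of penetration", "no defect", "porosity"]
logits = [2.0, 0.5, -1.0, 0.1]

probs = softmax(logits)
predicted = CLASSES[probs.index(max(probs))]
print(predicted)  # crack

# During training, roughly 20% of the activations are zeroed on each pass.
features = dropout([1.0] * 10, rate=0.2, rng=random.Random(0))
```

Because each training pass sees a different random subset of activations, the classifier cannot rely on any single feature, which is the mechanism by which dropout reduces overfitting.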
The Weld-CNN architecture represents an advanced approach to automated weld inspection. By leveraging sophisticated deep learning techniques, the model significantly enhances the reliability and accuracy of defect detection in welding processes, contributing to improved quality control and operational efficiency in industrial applications.
Experimental setup
Dataset
Radiographic images of weld defects were obtained from the RIAWELC dataset. 6 Representative samples from the training dataset are displayed in Figure 7. This dataset comprises four categories of welding defects: lack of penetration, cracks, porosity, and no defect, totaling 24,407 radiographic images with dimensions of

Figure 7. Representative samples of welding defects from the dataset: (a) lack of penetration, (b) cracks, (c) porosity, and (d) no defect.
Table 2. Total number of labeled weld defects.
To ensure consistency with the benchmark data and to facilitate a fair comparison with existing CNN architectures, all input images were resized to a resolution of
For the training procedure, the dataset was randomly partitioned into three subsets comprising 65% for training, 25% for validation, and 10% for testing. This allocation ensures an adequate distribution of data for model development, hyperparameter tuning, and unbiased evaluation of performance.
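A random 65/25/10 partition of this kind can be sketched as follows. The function name, the fixed seed, and the rounding rule (fractions rounded down, remainder assigned to the test set) are assumptions made for illustration; the study itself was carried out in MATLAB and its exact per-subset counts and any per-class stratification are not stated here.

```python
import random


def split_indices(n_items, train_frac=0.65, val_frac=0.25, seed=0):
    """Shuffle item indices and cut them into train/validation/test subsets.

    Assumption: train and validation sizes are rounded down, and every
    remaining item goes to the test subset.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    n_train = int(n_items * train_frac)
    n_val = int(n_items * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(24407)
print(len(train_idx), len(val_idx), len(test_idx))  # 15864 6101 2442
```

The three subsets are disjoint by construction, so no test image can leak into training or hyperparameter tuning.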
Setup and metrics
Table 3 presents an overview of the hardware and software configurations employed in this study. All experimental procedures, including design, computation, analysis, and evaluation, were conducted using MATLAB R2023a software. To evaluate the performance of the proposed model, 10 independent runs were executed under identical conditions. The final results utilized for evaluation and comparison purposes are derived from the average outcomes of these 10 runs.
Table 3. Hardware and software specifications employed in this study.
The model was trained using the Adam optimization algorithm, with a validation split of 10%, an initial learning rate of 0.0003, and a batch size of 128 over 50 epochs. To assess the effectiveness of the proposed model, four performance metrics were computed on the test dataset:
Accuracy quantifies the proportion of correct predictions out of all predictions made, indicating the percentage of samples accurately classified by the model. While accuracy provides a general measure of performance, it may not be the most informative metric in cases where the dataset is imbalanced, meaning one class is significantly more prevalent than others.
Recall measures the percentage of actual positive samples that are correctly identified by the model. This metric reflects the model’s ability to capture all relevant positive instances, which is particularly critical in industries where missing a positive sample can lead to severe consequences.
Precision assesses the model’s capability to accurately predict positive samples, thereby minimizing the occurrence of false positives. High precision is essential to avoid class confusion and ensures that the predicted positive instances are indeed relevant.
F1-Score is the harmonic mean of Precision and Recall, providing a balanced metric that accounts for both false positives and false negatives. The F1-Score is especially useful in scenarios with unbalanced datasets, as it offers a single measure to evaluate the model’s overall performance by balancing the trade-off between Precision and Recall.
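The four metrics above can all be derived from a confusion matrix. The sketch below computes overall accuracy and one-vs-rest precision, recall, and F1-Score per class; the function name and the confusion counts are hypothetical and made up for illustration, and they are not the results reported in this paper.

```python
def per_class_metrics(confusion, labels):
    """Compute accuracy plus per-class precision, recall, and F1.

    confusion[i][j] = number of samples whose true class is i and
    whose predicted class is j.
    """
    n = len(labels)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    report = {"accuracy": correct / total}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                       # missed samples of class k
        fp = sum(confusion[i][k] for i in range(n)) - tp  # wrongly assigned to class k
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


labels = ["CR", "LP", "ND", "PO"]
# Made-up counts for illustration only; NOT the paper's confusion matrix.
confusion = [
    [48, 0, 2, 0],   # true CR: 2 cracks predicted as "no defect"
    [0, 50, 0, 0],   # true LP
    [1, 0, 49, 0],   # true ND: 1 sound weld predicted as "crack"
    [0, 0, 0, 50],   # true PO
]

report = per_class_metrics(confusion, labels)
print(f"accuracy = {report['accuracy']:.3f}")  # accuracy = 0.985
for label in labels:
    m = report[label]
    print(f"{label}: P={m['precision']:.3f} R={m['recall']:.3f} F1={m['f1']:.3f}")
```

In this toy example the two cracks predicted as "no defect" lower the recall of the crack class to 0.96 while overall accuracy stays at 0.985, which illustrates why recall is reported alongside accuracy for safety-critical defect classes.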
Result and analysis
We developed the Weld-CNN model for the classification of weld defects using X-ray images. Weld-CNN is a hybrid architecture that integrates both sequential and parallel blocks to optimize the feature extraction process while reducing computational complexity. To maximize the representation of data characteristics and reduce dimensionality, multiple layers of max-pooling are employed within each parallel block. Concurrently, to mitigate the risk of overfitting, a dropout layer is incorporated prior to the classification stage.
To rigorously evaluate the model’s performance, we conducted 10 runs under identical conditions, with the results presented in Table 4. The classification outcomes of Weld-CNN achieved an average accuracy of 99.83%, demonstrating the model’s high efficacy in accurately classifying weld defects.
Table 4. Model performance metrics across 10 independent experimental runs, demonstrating the consistency and reliability of the proposed approach.
Figure 8 summarizes the training progress of our Weld-CNN model during the 10th experimental run. This figure presents key parameters, including a validation accuracy of 98.87% at the end of 50 epochs, a total of 6150 iterations (with 123 iterations per epoch), and an elapsed training time of 27 min and 14 s on a single GPU. The steady decrease in the loss metric and the plateauing of accuracy suggest that the network reached a robust state well before the final epoch.

Figure 8. Training process of the Weld-CNN model during the 10th experimental run.
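The iteration counts reported for the training run follow from the dataset size, the 65% training share, the batch size of 128, and the 50 epochs. The short check below reproduces them; it assumes that the training subset size is rounded down and that the incomplete final mini-batch of each epoch is discarded, neither of which is stated explicitly in the text.

```python
TOTAL_IMAGES = 24407
TRAIN_FRACTION = 0.65
BATCH_SIZE = 128
EPOCHS = 50
ELAPSED_SECONDS = 27 * 60 + 14  # 27 min 14 s reported for the 10th run

# Assumption: the fractional image count is rounded down.
train_images = int(TOTAL_IMAGES * TRAIN_FRACTION)

# Assumption: a final mini-batch with fewer than BATCH_SIZE images is dropped.
iters_per_epoch = train_images // BATCH_SIZE
total_iters = iters_per_epoch * EPOCHS

print(train_images)     # 15864
print(iters_per_epoch)  # 123
print(total_iters)      # 6150
print(f"{ELAPSED_SECONDS / total_iters:.2f} s per iteration")  # 0.27 s per iteration
```

Both 123 iterations per epoch and 6150 total iterations match the values shown in the training-progress figure, which is consistent with the 65% training share.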
Figure 9 displays the confusion chart that summarizes the classification performance of our CNN model across four weld conditions: Crack, Lack of penetration, No Defect, and Porosity. The matrix compares the predicted class against the true class, showing both the number of correctly classified and misclassified instances. For example, in the Crack category, the model misclassified only 2 out of 444 samples (an error rate of 0.4%), while the Lack of penetration and Porosity categories achieved 100.0% accuracy. The No Defect class demonstrated a similar level of high performance with an accuracy of 99.6%–99.7%, reflecting a minimal error rate (approximately 0.3%–0.4%). This detailed breakdown confirms the robustness and reliability of the CNN model in differentiating between subtle variations of weld defects, thereby ensuring dependable and efficient quality inspections.

Figure 9. Confusion matrix of the Weld-CNN model from the 10th experimental run, illustrating the classification performance across different defect categories.
Table 5 presents a comparative analysis of the Weld-CNN model against previously investigated models that use the same or similar X-ray image datasets. Differences in datasets inherently lead to variations in model complexity and accuracy, so the comparison should be read as an overview of reported performance rather than a strictly controlled benchmark. The Weld-CNN model achieves higher accuracy than prior studies, underscoring the suitability of the research dataset for developing and training an automatic classification model. Specifically, on the same dataset of 24,407 images across four classes, 6 previous approaches employed transfer learning with the standard SqueezeNet model, which has 724,548 parameters, and with ResNet50, which has 25.5 million parameters. In contrast, the Weld-CNN model uses only 259,100 parameters. Despite having significantly fewer parameters than these two comparison models, Weld-CNN demonstrates superior classification accuracy compared to all models listed in Table 5. These results highlight the efficiency and effectiveness of the Weld-CNN architecture in accurately classifying weld defects while maintaining a lower computational complexity.
Table 5. Comparison of results with previous studies.
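To put the parameter counts quoted above in proportion, the following small calculation expresses each comparison model's size relative to Weld-CNN. The counts are those given in the text (the ResNet50 figure is the rounded 25.5 million); the dictionary and function names are hypothetical.

```python
# Parameter counts as quoted in the text; ResNet50 is a rounded figure.
PARAMS = {
    "Weld-CNN": 259_100,
    "SqueezeNet": 724_548,
    "ResNet50": 25_500_000,
}


def size_ratio(model, baseline="Weld-CNN"):
    """How many times larger `model` is than `baseline`, by parameter count."""
    return PARAMS[model] / PARAMS[baseline]


for name in ("SqueezeNet", "ResNet50"):
    print(f"{name}: {size_ratio(name):.1f}x the parameters of Weld-CNN")
# SqueezeNet: 2.8x the parameters of Weld-CNN
# ResNet50: 98.4x the parameters of Weld-CNN
```

Parameter count is only a proxy for computational cost; inference time also depends on input resolution and layer types, which this ratio does not capture.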
While the proposed Weld-CNN model demonstrates outstanding classification performance, several limitations merit discussion. First, the decision to resize input images to 227 × 227 pixels, although beneficial for computational efficiency, may lead to a loss of fine details that are crucial for detecting subtle weld defects. Second, our model has been developed and evaluated on a controlled dataset of radiographic images; thus, its performance and generalizability under more variable real-world industrial conditions remain to be fully ascertained. Finally, despite the hybrid structure enhancing feature extraction, the increased architectural complexity poses challenges in hyperparameter tuning and may impact real-time deployment scenarios. Addressing these limitations will be the focus of future work, aimed at improving model robustness and optimizing its applicability in diverse industrial environments.
Conclusions
In this paper, we introduce Weld-CNN, a convolutional neural network designed for the detection and classification of weld defects using X-ray images. The architecture of Weld-CNN combines sequential layers for feature extraction with parallel blocks, each comprising consecutive layers of Convolution and ReLU activation. In the final parallel block, the ReLU activation is substituted with a max-pooling layer, which reduces the model size. To address overfitting, a dropout layer is incorporated at the output stage.
Employing this hybrid architecture that integrates both serial and parallel network structures, we conducted 10 experimental runs under consistent conditions. The average classification accuracies achieved were 99.81% on the training set and 99.83% on the test set. These results indicate that the Weld-CNN network exhibits exceptionally high classification efficiency compared to existing studies.
Future work will focus on collecting additional data samples from actual production environments to increase data diversity and further validate the model’s robustness. Given the high detection efficiency and predictive accuracy demonstrated by Weld-CNN, we aim to implement this classification system directly within industrial production processes by integrating it into mobile terminals. This integration will facilitate real-time defect detection and classification, thereby enhancing quality control and operational efficiency in welding applications.
Footnotes
Acknowledgements
This work was supported by Van Lang University.
Handling Editor: Sharmili Pandian
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
