Abstract
Objective
Breast cancer is a significant cause of death among females globally, so there is an urgent need for accurate, computerized methods of diagnosis. This study develops a robust computerized framework for the identification of breast lesions using multiresolution feature fusion based on the Dual-Tree Complex Wavelet Transform (DT-CWT) applied to histopathological images.
Methods
A total of 7909 histopathological images were sourced from the publicly available BreakHis dataset, and 594 clinical samples were acquired from the Department of Radiodiagnosis and Imaging at Graphic Era Institute of Medical Sciences to ensure that variations across breast cancer subtypes were represented. The proposed framework comprised image preprocessing, two-level DT-CWT decomposition using two-dimensional wavelet filters, and feature extraction from the low-frequency sub-bands. Handcrafted features were extracted as Laws' texture energy measures, Gabor-based texture descriptors, and statistical textural features, and fused into a comprehensive multiresolution feature set. These fused features were then used to classify benign and malignant breast lesions in a dual-path convolutional neural network.
Results
The experimental assessment showed that the Reverse Biorthogonal (Rbio-2.4) wavelet filter achieved the best classification performance, with a test accuracy of 97.32%, compared to other wavelet filters across various wavelet families. Moreover, the new method showed the highest values for precision, recall, F1 score, Matthews correlation coefficient, and Cohen’s kappa, which measure diagnostic performance.
Conclusion
The results obtained have confirmed that the designed feature fusion approach using the DT-CWT has superior capability for the precise identification of breast lesions in histopathologic images. By combining multiresolution wavelet analysis, hand-engineered texture features, and CNN-based learning models, the proposed approach has potential applications in healthcare for breast cancer detection.
Keywords
Introduction
Breast cancer, now believed to account for nearly one in four cancer diagnoses among women worldwide, is clearly on the rise. According to recent WHO statistics, this trend reflects the spread of Western lifestyles, increased life expectancy, and, to some extent, improved diagnostics. Although incidence has been increasing, advances in treatment options and early detection have markedly improved survival rates. Modern treatment approaches include surgery, radiation, and systemic therapies, including targeted treatments. Regular screening for early detection, such as mammography, has also been instrumental in detecting cancers at more treatable stages. Thus, many women diagnosed with breast cancer today have better survival rates and quality of life.1–16
Breast cancer still ranks first among causes of cancer death in women worldwide, 17 which makes it a significant global public health problem. Timely diagnosis is therefore necessary to avoid complications in advanced stages and to reduce morbidity among the female population. Breast cancer is a heterogeneous disease; different entities exhibit unique clinical and pathological features. The cancer starts in breast cells that can grow and invade nearby healthy tissue. Clinical diagnosis of breast cancer is a multi-step process. Clinical examination is the first step, followed by radiological imaging of the breast, such as mammography and magnetic resonance imaging (MRI). Although these non-invasive imaging techniques can differentiate between benign and malignant regions, the results may be misleading. 18 Hence, biopsy samples are used 19 along with different stains to generate histopathology images for professional cancer examination. However, manual examination of histopathological images requires medical expertise, is time-consuming, and is subjective. Computer-aided diagnosis (CAD) is a valuable tool for pathologists, helping them identify the presumed site and distinguish confusing histopathology images. It generally enhances diagnostic performance for breast cancer by reducing intra- and inter-pathologist variability in the final decision. 20
Histopathologic images 21 remain the most discriminative type of image in breast cancer, with resolution far exceeding that of any other modality. Mammograms and ultrasound show structure, but histopathological images capture cell- and tissue-level changes and therefore help pathologists detect cancer more accurately. Using these images, changes in cell shape, size, and organization that could indicate cancer can be observed in biopsy samples at the microscopic scale. 22 High-resolution histopathological images differentiate benign from malignant tumors and may indicate the grade of the tumor. 23 Examiners then assess how aggressive the cancer is and thereby determine the course of treatment. Given the increasing number of personalized therapies for breast cancer, histopathological evaluation should remain part of the armamentarium against the disease. BreakHis is a medical imaging dataset designed for analyzing and classifying histopathological breast cancer images. 24 The dataset consists of 7,909 images of breast tumor tissue samples captured at magnifications of 40×, 100×, 200×, and 400×. The primary groups are benign and malignant tumors; depending on the specific histopathological parameters, these categories can be further divided into various classes. 25 The BreakHis dataset is diverse, containing images from women of different ages and stages of cancer. This diversity is essential for creating and evaluating machine learning models that generalize across diverse populations. The high-resolution images in this dataset enable researchers to examine and extract finer details, which are crucial for robust classification and diagnosis.
Ultimately, BreakHis is a significant milestone in evaluating algorithm performance for automated breast cancer diagnosis and, consequently, enhancing computer-aided diagnosis systems. 26
Key contributions of the proposed work
• A multiscale framework for the analysis of breast lesions is established based on histopathological images obtained at various magnifications.
• A multiresolution feature extraction technique is proposed that uses the low-frequency sub-band images derived from the DT-CWT to calculate discriminative handcrafted image features, namely Gabor features, GLCM descriptors, and Laws' energy values.
• A comparative analysis of different wavelet families, such as Coiflet, Symlet, Daubechies, and biorthogonal, is conducted using two-level DT-CWT decomposition to select the appropriate wavelet transform structure for breast cancer classification.
• A novel hybrid multiresolution DT-CWT-CNN model is proposed that combines concepts from the wavelet transform domain with the power of deep convolutional learning.
Research gap and contribution
Careful attention to the existing literature reveals three significant limitations:
• Strong reliance on large labeled datasets and limited interpretability of CNN-only models.
• Lack of exploration of effective feature fusion for the analysis of breast histopathological images.
• Underutilization of the multiresolution and directional capabilities of DT-CWT to extensively integrate features.
The present work proposes a hybrid framework to address these gaps, combining DT-CWT-based multiresolution features with handcrafted texture descriptors and deep CNN embeddings for accurate, interpretable, and robust breast lesion classification from histopathological images.
The rest of the paper is organized as follows. Section 2 explains the motivation and the significant contributions of the proposed work. Section 3 covers related prior work. The DT-CWT-based proposed framework and model parameters are explained in Section 4. Experimental results and performance evaluation are presented in Section 5, followed by conclusions and future research directions in Section 6.
Motivation
Under the proposed framework, histopathological images are decomposed into low- and high-frequency sub-bands at multiple scales and orientations using the DT-CWT, enabling efficient modeling of structural and textural details. Effective feature fusion is employed to preserve prominent details within the sub-bands and reduce information loss.
To effectively leverage their different learning abilities, a dual-path CNN approach combining ResNet and DenseNet is used. Residual learning in ResNets enables more stable gradients for deeper feature learning, while DenseNet's dense connections facilitate feature reuse. Additionally, hand-engineered features are derived from the low-frequency sub-bands of the DT-CWT for breast cancer classification. They include Gabor filters for orientation-based texture analysis, the Gray-Level Co-occurrence Matrix (GLCM) for statistical texture description, and Laws' energy for spatial pattern description. These hand-engineered features are then stacked with features from a deeper CNN to produce a more robust representation that can effectively classify different types of breast cancer lesions, covering different aspects of the learning task using a more scalable approach.
Recent breakthroughs in breast cancer diagnosis have been catalyzed by convolutional neural network models applied across imaging modalities, including mammography, ultrasound, magnetic resonance imaging, and histopathology. Previous deep models trained on mammographic databases (for example, large-scale FFDM and public repositories) have reported good performance for the detection of masses and calcifications. However, mammograms capture the macroscopic projection of tissue and therefore lack the cellular and microarchitectural detail required to differentiate between benign and malignant lesions based on subtle histological differences.26,27,28 Recent analyses and reviews also note that mammography-based CNNs can be sensitive to factors such as breast density, projection artifacts, and scanner and vendor variability, which limits their generalization to fine-grained diagnostic tasks. On the other hand, deep learning research on histopathological image sets (such as patch-level and whole-slide histology images) indicates that models trained on images of cellular resolution can learn more directly the morphological features critical for lesion identification and cancer staging, in particular nuclear pleomorphism, mitotic rate, and glandular pattern.29,30 Accordingly, the present research focuses solely on histopathological images and on hybrid architectures that combine multi-directional feature maps with CNN features, reflecting both texture and higher-level patterns within images.
Bejnordi et al. 31 have also introduced a context-aware stacked CNN architecture for WSIs and demonstrated that integrating patch-level features and contextual features can lead to a dramatic improvement in classification accuracy. Nevertheless, this method demands high processing capacity and requires large, properly annotated datasets for accurate classification. Another example is Alom et al., 32 where a deep architecture of Inception Recurrent Residual CNN was used for histopathological classifications for BWS, and dramatic improvements have been seen in performance. Nevertheless, the increasing complexity, high risk of overfitting, and lack of interpretability made this work less significant for clinical applications.
Some recent review articles highlight that the accuracy achieved with CNN models on histopathological images is not robust to variability introduced by staining and extraction methods, as well as by the amount of training data. This context justifies basing this work primarily on histopathological images, while using additional representations for features that are not overly dependent on CNNs, especially in limited-data scenarios.
Feature fusion has attracted increasing attention for improving classification performance by leveraging complementary information from multiple representations. Fusion techniques can be generally categorized into data-level, feature-level, and decision-level fusion. Although decision-level fusion is easy to implement, it is generally inefficient at capturing feature correlations. On the other hand, early fusion tends to result in a high-dimensional feature space, especially in medical imaging analysis that involves significant feature diversity. 33
In breast histopathological image analysis, Sitaula & Hossain 34 have proven the effectiveness of fusing features derived from the entire image with those derived from parts in classifying images. Its effectiveness depends on proper detection of the area of interest, or foreground segmentation. This may pose challenges in different staining techniques. Similarly, Li et al. 35 have proposed the use of multi-stream CNN models. These models have enhanced features derived from different scales. The models have achieved improved classification results on histopathological images. However, this has been done with increased computational cost.
Color-based fusion techniques have also received attention in current research. Mallick 36 proposed a color-space ensemble fusion technique for the classification of breast histopathology images. However, this technique does not model features in terms of frequencies and orientations, and it is sensitive to stain normalization. Other recent approaches that apply fusion in CNNs are essentially based on combining multiple CNN outputs or ensemble classifiers.29,37 Though these approaches improve classification accuracy, they are likely to yield complex models that are less interpretable and increase execution time, which could limit their applicability in real-world clinical environments.
Hand-crafted textural feature extraction methods, like GLCM, Gabor filter-based, or wavelets, have proved to provide a good discriminative power for histopathological image applications, even with a limited number of cases.30,38
However, these handcrafted features, by themselves, lack the hierarchical semantic abstraction afforded by CNNs or may neglect fine-textured details and directional information, which are highly important for lesion discrimination. Addressing these complementary sets of weaknesses and strengths, this paper introduces a proposed method for feature-level fusion that combines handcrafted multiresolution features with CNN embeddings. Consequently, by incorporating direct texture, frequency, and direction cues alongside learned semantics, this proposed method aims to counteract potential overfitting, enhance interpretability, and maintain favorable accuracy.
Wavelet-based multiresolution analysis has long been used to extract frequency- and orientation-sensitive features from biomedical images. However, conventional discrete wavelet transforms suffer from shift variance and limited directional selectivity. The Dual-Tree Complex Wavelet Transform (DTCWT) overcomes these limitations by providing approximate shift invariance, improved directional selectivity, and complex-valued coefficients that encode both magnitude and phase information.
More recent works that combined DTCWT with deep learning architectures have demonstrated significant improvements in robustness and feature extraction capability. A new CNN with directional correlations that utilized DTCWT was proposed by Gao et al., 39 demonstrating effectiveness for resisting compression artifacts. Nevertheless, this process requires significant computation and a large-scale training database, making its practical use in real-world medical imaging applications challenging. Another attempt at applying CNNs with input DTCWT sub-bands for medical image segmentation was conducted by Darooei et al., 40 producing encouraging outcomes for boundary identification but not for classification of lesions. 41
DTCWT is also used for medical image denoising and fusion, and its effectiveness in removing noise while preserving structure is evident. However, these techniques are generally vulnerable to high-frequency noise and require optimal parameter adjustment. Moreover, most of these previous works incorporate DTCWT either as preprocessing or as part of a deep learning architecture. However, its multiresolution characteristics are not fully integrated into a unified framework for feature fusion in breast lesion detection. These shortcomings are addressed by the proposed approach, which uses the DTCWT as a multiresolution feature extractor that combines the channel-wise features in the directional and frequency subspaces with the feature representations learned by the CNN. These capabilities highlight the benefits of wavelet-based feature extraction over CNNs.
Methods
This section describes the approach for developing the proposed multi-path CNN framework. It provides a complete overview of the framework, detailing how each component is integrated to demonstrate the flow of data from input to output. The present study falls under the category of a retrospective experimental study carried out on publicly available histopathology datasets. This study aims to develop and evaluate the performance of the deep learning-based classification approach.
Dataset
The BreaKHis dataset is a challenging, large-scale histopathological breast cancer image dataset consisting of 7,909 images from 82 anonymous patients, representing eight subtypes of breast tumors. The images were obtained from the Laboratory of Pathological Anatomy and Cytopathology in Brazil and represent both benign and malignant samples taken at 40×, 100×, 200×, and 400× magnifications. These magnification levels reflect real-world scenarios in which pathologists use cell morphology to distinguish between different tumor types.
Additionally, a second dataset consisting of 594 histopathological images of breast cancer was obtained from patients at the Department of Radiodiagnosis and Imaging, Graphic Era Institute of Medical Sciences, Dehradun, Uttarakhand, India, between March 2024 and June 2024. All images were taken with the prior consent of the patients and after due clearance from the institutional medical ethics board. Each image was resized to 128 × 128 pixels.
Proposed work
In this study, a hybrid framework is developed that combines the Dual-Tree Complex Wavelet Transform (DT-CWT) with a hybrid architecture (ResNet and DenseNet) to enhance image classification accuracy. The DT-CWT provides multi-scale, directional analysis that extracts fine texture and edge details while preserving shift invariance. Handcrafted features such as Gabor filters, the Gray-Level Co-occurrence Matrix (GLCM), and Laws' energy are extracted from the processed sub-bands and fused with deep features learned by ResNet and DenseNet. The fusion enables both coarse and fine image details to be effectively represented. Dense connectivity in DenseNets helps preserve smooth gradient flow, alleviate gradient-related issues, and improve classification robustness. The four main phases of the suggested model are preprocessing, dual-tree complex wavelet decomposition, hybrid architecture training, and testing.
Algorithm 1 details the proposed multi-path dual DT-CWT-based framework shown in Figure 1. The proposed framework adopts a parallel, hierarchical approach to feature learning. To begin with, histopathological images undergo preprocessing and transformation using the Dual-Tree Complex Wavelet Transform (DT-CWT), where the low-pass (approximation) subband is retained for further processing.
Figure 1. Schematic representation of the proposed multi-path dual-DT-CWT-based deep learning framework for breast lesion classification.
The retained low-pass subband is processed in parallel along two paths. In the first path, the subband is fed into a hybrid CNN framework consisting of ResNet and DenseNet branches for deep semantic feature extraction. In the second path, handcrafted texture features, namely Gabor, Gray-Level Co-occurrence Matrix (GLCM), and Laws' energy features, are extracted from the same subband.
The features extracted from both paths are combined to form a comprehensive multiscale feature representation, which is then used as input to fully connected layers for classification. The proposed framework is trained end-to-end using backpropagation. With the proposed framework, the fusion of frequency-domain information, handcrafted texture features, and deep semantic features enables robust, discriminative representation learning for classification.
Preprocessing and augmentation
Most datasets include only raw images that require correction before analysis can proceed. Preprocessing is therefore crucial and is the first step toward an effective CAD system. In this work it comprises image scaling, data augmentation, and stain normalization. In all experiments, the whole image was used and resized from 700 × 460 to 128 × 128 pixels to match the input layer of the proposed model. A significant constraint of training a deep learning model, particularly a CNN, is that it requires a large number of data samples, and this effect is even greater in fine-tuning. As mentioned above, most breast cancer (BC) datasets are small in terms of the number of biological samples. To generate an adequate number of samples, the training set was augmented using affine transformations. The BreakHis dataset is also imbalanced: the malignant class has about twice as many images as the benign class, and the subclasses within the benign and malignant classes are further imbalanced. To address this problem, multi-scale data augmentation and up-sampling techniques were applied to increase the number of images for under-represented classes. First, the original images were rotated by 90°, 180°, and 270°. The resulting rotated images were then mirrored and added to the training set.
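The rotation-and-mirror step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical, and the mirror axis (horizontal) is an assumption, since the text does not specify it.

```python
from typing import List

import numpy as np


def rotate_and_mirror(image: np.ndarray) -> List[np.ndarray]:
    """Return the original image, its 90/180/270 degree rotations, and a mirror of each rotation."""
    # Rotate in the image plane (axes 0 and 1) so a trailing channel axis is untouched.
    rotations = [np.rot90(image, k=k, axes=(0, 1)) for k in (1, 2, 3)]
    # Horizontal mirroring is an assumption; the source only says "mirrored".
    mirrored = [np.fliplr(rotated) for rotated in rotations]
    return [image] + rotations + mirrored


# Usage on a synthetic 128 x 128 RGB image (random data, for illustration only).
rng = np.random.default_rng(0)
sample = rng.integers(0, 256, size=(128, 128, 3), dtype=np.uint8)
augmented = rotate_and_mirror(sample)
```

Under these assumptions each training image yields seven samples (one original, three rotations, three mirrored rotations); applying the step only to under-represented classes would be one way to reduce the imbalance mentioned above.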
Dual-tree complex wavelet transform
The Discrete Wavelet Transform (DWT), a basic signal processing and image analysis method, has played an important role in this field because it represents signals well in both the spatial (or temporal) and frequency domains. Although the standard DWT is well established, it has fundamental shortcomings that may impair its performance in specific applications, particularly shift variance and poor directional selectivity. The dual-tree complex wavelet transform (DT-CWT) was proposed to handle these problems. It is an extension of the DWT that uses two parallel filter banks to decompose signals into multiple subbands. The dual-tree structure enhances the ability to capture detailed directional information, making it effective for tasks involving texture analysis, edge detection, and feature detection. In the DT-CWT, the outputs of the two filter banks are combined as the real and imaginary parts of a complex coefficient, forming an approximately analytic signal representation. The DT-CWT introduces redundancy, which increases computational load while providing a robust signal representation. This redundancy allows the transform to be applied more effectively for noise reduction, texture classification, and image fusion. This work demonstrates the effectiveness of the DT-CWT for breast cancer recognition, addressing the challenges posed by traditional wavelet transforms.
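The shift-variance problem that motivates the DT-CWT can be seen in a small numerical experiment. The sketch below uses a single-level Haar DWT on a 1D step edge; the Haar filter and the test signal are chosen purely for illustration and are not the filters used in this study, and the code does not implement the DT-CWT itself.

```python
import numpy as np


def haar_detail(signal: np.ndarray) -> np.ndarray:
    """Single-level Haar DWT detail coefficients of an even-length 1D signal."""
    pairs = signal.reshape(-1, 2)
    return (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)


# A step edge aligned with the decimation grid.
step = np.zeros(16)
step[8:] = 1.0

# The same edge moved by one sample.
shifted = np.zeros(16)
shifted[9:] = 1.0

energy_original = float(np.sum(haar_detail(step) ** 2))
energy_shifted = float(np.sum(haar_detail(shifted) ** 2))
print(energy_original, energy_shifted)
```

The detail-band energy changes from 0 to 0.5 although the edge only moved by one sample, so features computed from decimated DWT coefficients depend on where a structure falls on the sampling grid. The approximate shift invariance of the DT-CWT is intended to reduce this dependence.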
Feature extraction
Laws' energy features
Laws' texture energy measures are used to analyze image texture and capture intricate texture patterns within an image. The image is convolved with small, fixed-size 5 × 5 kernels designed to detect specific texture characteristics, such as edges, spots, and ripples. These filters enhance intensity variations within local regions and identify specific patterns that contribute to the overall texture. The technique thus identifies texture patterns by analyzing pixel-intensity variation across multiple regions of an image. Convolving the kernels with the image produces energy maps that represent texture information; the energy in these maps is computed by squaring the filter response and summing over local neighborhoods. The resulting texture descriptors can be applied to tasks such as texture classification, segmentation, and pattern recognition in computer vision. Figure 2 shows Laws' texture masks of varying lengths and their associated one-dimensional filter coefficients used to calculate the energy features in Algorithm 3. The primitive 1D vectors, Level (L5), Edge (E5), Spot (S5), Wave (W5), and Ripple (R5), are combined using outer products to form two-dimensional texture filters. These texture masks are applied to the low-pass image derived from the DT-CWT decomposition to highlight unique texture patterns, such as edges, spots, and ripples. The energy of the filtered responses is then calculated to measure texture patterns, yielding a distinctive feature set for classification.
Figure 2. Laws' texture masks of different lengths along with their corresponding filter coefficients.
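The construction of the two-dimensional masks from the 1D vectors, and the energy computation, can be sketched as below. The 1D coefficients are the standard length-5 Laws vectors; the function names are hypothetical, and pooling the squared response into one mean value per mask is a simplifying assumption rather than the procedure of Algorithm 3, which is not reproduced here.

```python
from typing import Dict

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Standard 1D Laws vectors of length 5.
LAWS_VECTORS = {
    "L5": np.array([1, 4, 6, 4, 1], dtype=float),    # level
    "E5": np.array([-1, -2, 0, 2, 1], dtype=float),  # edge
    "S5": np.array([-1, 0, 2, 0, -1], dtype=float),  # spot
    "W5": np.array([-1, 2, 0, -2, 1], dtype=float),  # wave
    "R5": np.array([1, -4, 6, -4, 1], dtype=float),  # ripple
}


def laws_masks() -> Dict[str, np.ndarray]:
    """Build the 25 two-dimensional 5 x 5 masks as outer products of the 1D vectors."""
    return {
        name_a + name_b: np.outer(vec_a, vec_b)
        for name_a, vec_a in LAWS_VECTORS.items()
        for name_b, vec_b in LAWS_VECTORS.items()
    }


def laws_energy_features(subband: np.ndarray) -> Dict[str, float]:
    """Mean squared filter response per mask over all fully covered 5 x 5 windows."""
    windows = sliding_window_view(subband.astype(float), (5, 5))  # shape (H-4, W-4, 5, 5)
    features = {}
    for name, mask in laws_masks().items():
        # Correlation over the valid region; each mask is symmetric or antisymmetric
        # under a 180 degree flip, so the squared response equals that of a convolution.
        response = np.einsum("hwij,ij->hw", windows, mask)
        features[name] = float(np.mean(response ** 2))
    return features


# Usage on a synthetic low-pass subband (random data, for illustration only).
subband = np.random.default_rng(0).random((32, 32))
features = laws_energy_features(subband)
```

Every mask except L5L5 sums to zero, so those masks respond only to local variation and not to mean brightness.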
Gabor features
Gabor features are extracted with Gabor filters, which are used in image processing for texture analysis. A Gabor filter is a sinusoidal plane wave modulated by a Gaussian envelope. It is well suited to image representations that must capture both the frequency and the spatial information of an image. Gabor filters at different orientations and scales are widely used to extract multi-resolution, oriented texture information. Gabor features describe local texture information in an image and are computed by convolving the image with a set of Gabor filters. They can be used in applications such as texture classification, face recognition, and object detection.
Algorithm 4 describes the step-by-step process of Gabor-based texture feature extraction from a grayscale image. For a given orientation and scale, a Gabor filter is designed by modulating a sinusoidal plane wave using a Gaussian window, enabling joint localization in both space and frequency. The input image is then filtered using the designed Gabor filter to obtain a response that highlights texture patterns with specific orientations and frequencies. Statistical properties, mean, and variance of the filtered image are then calculated to represent the local texture information. By iteratively performing this process across different orientations and scales, a robust multi-resolution, orientation-selective feature representation is achieved for classification.
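The steps described above (design a filter per orientation and scale, filter the image, record the mean and variance) can be sketched as follows. All parameter values (kernel size, wavelengths, number of orientations, the choice of sigma as half the wavelength, and the aspect ratio gamma) are assumptions for illustration; the values used in Algorithm 4 are not stated in the text.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def gabor_kernel(size: int, wavelength: float, theta: float,
                 sigma: float, gamma: float = 0.5, psi: float = 0.0) -> np.ndarray:
    """Real Gabor kernel: a cosine plane wave multiplied by a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    y_rot = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_rot ** 2 + (gamma * y_rot) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_rot / wavelength + psi)
    return envelope * carrier


def gabor_features(image: np.ndarray, wavelengths=(4.0, 8.0),
                   n_orientations: int = 4, size: int = 15) -> np.ndarray:
    """Mean and variance of the filter response for every (scale, orientation) pair."""
    windows = sliding_window_view(image.astype(float), (size, size))
    features = []
    for wavelength in wavelengths:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            kernel = gabor_kernel(size, wavelength, theta, sigma=0.5 * wavelength)
            # Valid-region correlation; with psi = 0 the kernel is unchanged by a
            # 180 degree flip, so this equals convolution.
            response = np.einsum("hwij,ij->hw", windows, kernel)
            features.extend([response.mean(), response.var()])
    return np.asarray(features)


# Usage on a synthetic grayscale image (random data, for illustration only).
image = np.random.default_rng(0).random((32, 32))
feature_vector = gabor_features(image)
```

With two scales and four orientations, this sketch yields a 16-element vector ordered as (mean, variance) pairs.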
Figure 3 shows the Gabor filter bank used for texture feature extraction, with filters designed at different orientations and frequencies. Each filter is designed to capture directional texture information by strongly responding to image structures at a particular orientation and scale. The variety of filters allows for multi-resolution analysis of texture patterns, making it convenient to extract discriminative local features. The filtered responses from these filters are then used to calculate statistical properties, as described in Algorithm 4, to achieve a robust texture representation for image classification.
Figure 3. Gabor filter bank illustrating different orientations and frequencies used for texture feature extraction.
Textural features
Texture refers to the spatial distribution and arrangement of pixel intensities in an image and defines patterns such as repetitiveness, smoothness, or roughness. Textural features play a crucial role in image analysis, including segmentation, recognition, and classification. There are several texture feature extraction methods, including statistical methods such as the Gray Level Co-occurrence Matrix (GLCM), structural methods that detect repetitive patterns, and model-based techniques that employ mathematical representations, such as fractals. Important texture features include contrast, which measures intensity variation; correlation, which measures the relationship between pixels; energy, which measures uniformity; homogeneity, which measures smoothness; and entropy, which measures randomness. These characteristics make textural analysis valuable in applications such as medical imaging, remote sensing, and quality control.
Algorithm 5 outlines the process of extracting compact statistical textural features from a grayscale image using the Gray Level Co-occurrence Matrix (GLCM). A GLCM is generated by examining the spatial relationship between pairs of pixels at a predefined distance and angle. The co-occurrence matrix records how frequently each pair of gray levels occurs at that offset in the image. Several second-order statistical features are then derived from the normalized co-occurrence matrix to describe textural characteristics. Contrast indicates the intensity variation of the texture, correlation indicates the linear dependence between neighboring pixels, energy indicates the uniformity of the texture, homogeneity indicates the local smoothness of the texture, and entropy indicates the randomness of the texture.
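A minimal sketch of the GLCM computation and the five features named above follows. The function names, the single non-symmetric offset of one pixel to the right, and the definition of energy as the sum of squared probabilities are assumptions; some libraries define energy as the square root of that sum, and Algorithm 5 may use several distances and angles.

```python
from typing import Dict

import numpy as np


def glcm(image: np.ndarray, levels: int, dy: int = 0, dx: int = 1) -> np.ndarray:
    """Normalized co-occurrence matrix for one non-negative pixel offset (dy, dx)."""
    height, width = image.shape
    reference = image[: height - dy, : width - dx].ravel()
    neighbor = image[dy:, dx:].ravel()
    counts = np.zeros((levels, levels), dtype=float)
    np.add.at(counts, (reference, neighbor), 1.0)  # accumulate repeated index pairs
    return counts / counts.sum()


def glcm_features(p: np.ndarray) -> Dict[str, float]:
    """Second-order statistics of a normalized GLCM."""
    i, j = np.indices(p.shape)
    mu_i, mu_j = np.sum(i * p), np.sum(j * p)
    sd_i = np.sqrt(np.sum((i - mu_i) ** 2 * p))
    sd_j = np.sqrt(np.sum((j - mu_j) ** 2 * p))
    nonzero = p[p > 0]
    return {
        "contrast": float(np.sum((i - j) ** 2 * p)),
        # Undefined (division by zero) for a perfectly uniform image.
        "correlation": float(np.sum((i - mu_i) * (j - mu_j) * p) / (sd_i * sd_j)),
        "energy": float(np.sum(p ** 2)),
        "homogeneity": float(np.sum(p / (1.0 + (i - j) ** 2))),
        "entropy": float(-np.sum(nonzero * np.log2(nonzero))),
    }


# Usage on a small 4-level test image.
toy = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
toy_features = glcm_features(glcm(toy, levels=4))
```

For this toy image the twelve horizontal pixel pairs give a contrast of 7/12 and an energy of 1/6, which can be checked by hand from the pair counts.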
Proposed model
A multi-path Convolutional Neural Network (CNN) architecture is developed by integrating components from ResNet and DenseNet to improve feature extraction capabilities and classification performance. The proposed model processes the input image through three concurrent convolutional paths: a traditional CNN, a ResNet-based path, and a DenseNet-based path.
The input images are resized to 128 × 128 × 3 (height × width × RGB channels) before being fed into the network (Figure 4).
Figure 4. Schematic representation of the proposed multi-path dual-DWT model.
Convolutional layers
Each path begins with a set of convolutional layers, followed by batch normalization and max-pooling operations. The convolution operation can be mathematically represented as:
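A standard form of this operation is given here in assumed notation, which may differ from the symbols used by the authors: X is the input with C channels, K is an M × N kernel for output channel k, b_k is the bias, and unit stride with no padding is assumed. As in most deep learning frameworks, the kernel is applied without flipping.

```latex
Y_{i,j,k} = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} \sum_{c=0}^{C-1} X_{i+m,\, j+n,\, c} \, K_{m,n,c,k} + b_k
```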
Batch normalization is applied as:
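In its usual formulation, with assumed notation, each activation x is normalized by the mini-batch mean \mu_B and variance \sigma_B^2 and then rescaled by the learnable parameters \gamma and \beta, where \epsilon is a small constant for numerical stability:

```latex
\hat{x} = \frac{x - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad y = \gamma \hat{x} + \beta
```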
Max-pooling is performed as:
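A standard form, in assumed notation, takes the maximum over each p × p window of the feature map Y with stride s, independently for every channel k:

```latex
P_{i,j,k} = \max_{0 \le m,\, n < p} \; Y_{s i + m,\; s j + n,\; k}
```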
Residual blocks
In the ResNet path, residual blocks are used to improve gradient flow. Each residual block computes:
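In the usual residual formulation, with assumed notation, x is the block input, \mathcal{F} is the stacked convolutional mapping with weights \{W_i\}, and the identity shortcut adds the input to the output of that mapping:

```latex
y = \mathcal{F}\left(x, \{W_i\}\right) + x
```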
Dense blocks
The DenseNet path uses dense blocks, in which each layer receives feature maps from all previous layers. It is expressed as:
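In the usual DenseNet formulation, with assumed notation, x_\ell is the output of layer \ell, H_\ell is the composite function of that layer (typically batch normalization, activation, and convolution), and the square brackets denote channel-wise concatenation:

```latex
x_\ell = H_\ell\left([x_0, x_1, \ldots, x_{\ell-1}]\right)
```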
This dense connectivity promotes feature reuse and improves gradient propagation.
Feature fusion and classification
The outputs from all three paths are concatenated along the depth axis:
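One way to write this step, with symbol names assumed for illustration, denotes the outputs of the traditional CNN, ResNet, and DenseNet paths by F_{\mathrm{CNN}}, F_{\mathrm{ResNet}}, and F_{\mathrm{DenseNet}}:

```latex
F = \operatorname{Concat}\left(F_{\mathrm{CNN}},\; F_{\mathrm{ResNet}},\; F_{\mathrm{DenseNet}}\right)
```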
The concatenated features are passed to fully connected (dense) layers:
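A generic form of a fully connected layer, in assumed notation, applies a weight matrix W and bias b to the flattened fused feature vector F, followed by a nonlinearity \phi. The choice of \phi (for example, ReLU) is an assumption, since it is not stated in the text.

```latex
h = \phi\left(W F + b\right)
```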
Finally, a sigmoid activation is used in the last layer for binary classification:
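In standard form, with assumed notation, the output layer maps the last hidden representation h to a score z and then to a probability \hat{y} between 0 and 1. Which of the two classes (benign or malignant) corresponds to \hat{y} close to 1 depends on the label encoding, which is not stated in the text.

```latex
z = w^{\top} h + b_o, \qquad \hat{y} = \sigma(z) = \frac{1}{1 + e^{-z}}
```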
Model overview
The proposed model contains approximately 4.13 million parameters and leverages the benefits of CNNs, ResNets, and DenseNets. The ResNet path helps mitigate gradient issues, while the DenseNet path enhances feature reuse. This combination yields a robust architecture suitable for complex image classification tasks.
The experimental process flow shown in Figure 5 encompasses the entire workflow of the proposed framework, from raw histopathological images to final lesion classification. The process flow is intended to systematically combine multiresolution analysis, deep learning-based feature extraction, handcrafted texture analysis, and feature fusion.
Figure 5. Experimental flowchart.
Results
Dataset distribution of benign and malignant cases in training and testing sets.
Experiments
The dataset is first processed with Law’s energy features, textural features, and Gabor features for feature extraction. The extracted features are concatenated into a common representation, which is then transformed using the Dual-Tree Complex Wavelet Transform (DT-CWT) to improve robustness and preserve directional and frequency information. Several combinations of DT-DWT filters were applied to the dataset to verify the proposed model’s results. Experiments were performed with five DT-DWT filter-bank families: Coiflet, Symlet, Biorthogonal, Reverse Biorthogonal, and Daubechies. Each family was trained and tested at every number of vanishing moments available for it. The number of vanishing moments determines the degree of polynomial components a wavelet filter can remove (a wavelet with N vanishing moments annihilates polynomials of degree up to N − 1), allowing it to focus on the localized, high-frequency components of the signal rather than smooth or redundant information. These experiments aim to find the best-performing filter and to examine whether changes in the number of vanishing moments affect the training and testing performance of the proposed model. To evaluate each filter’s performance, the model was trained and tested for 50 epochs. The findings indicate that accuracy increases or decreases noticeably depending on the number of vanishing moments of each filter.
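The vanishing-moment property described above can be checked numerically. The following is a minimal sketch, not the authors' code: it uses the closed-form Daubechies db2 coefficients (two vanishing moments) as an illustrative filter, and the helper names `discrete_moment` and `detail_coefficients` are hypothetical. It shows that the high-pass filter gives zero response to a linear ramp but a non-zero response to a quadratic signal.

```python
import math

# Daubechies db2 low-pass (scaling) coefficients in closed form.
s3 = math.sqrt(3.0)
norm = 4.0 * math.sqrt(2.0)
h = [(1 + s3) / norm, (3 + s3) / norm, (3 - s3) / norm, (1 - s3) / norm]

# High-pass (wavelet) filter via the quadrature-mirror relation
# g[k] = (-1)^k * h[L - 1 - k].
L = len(h)
g = [(-1) ** k * h[L - 1 - k] for k in range(L)]


def discrete_moment(filt, p):
    """Return sum_k k^p * filt[k], the p-th discrete moment of the filter."""
    return sum((k ** p) * c for k, c in enumerate(filt))


def detail_coefficients(signal, filt):
    """Apply `filt` on fully overlapping windows and keep every second output."""
    n = len(filt)
    return [
        sum(filt[k] * signal[start + k] for k in range(n))
        for start in range(0, len(signal) - n + 1, 2)
    ]


# Moments 0 and 1 vanish for db2; moment 2 does not.
moments = [discrete_moment(g, p) for p in range(3)]

linear_ramp = [2.0 * t + 5.0 for t in range(16)]  # degree-1 polynomial
quadratic = [float(t * t) for t in range(16)]     # degree-2 polynomial

ramp_details = detail_coefficients(linear_ramp, g)  # all approximately zero
quad_details = detail_coefficients(quadratic, g)    # constant and non-zero

print("moments:", [round(m, 6) for m in moments])
print("max |ramp detail|:", max(abs(d) for d in ramp_details))
print("quadratic details:", [round(d, 4) for d in quad_details])
```

A filter with more vanishing moments would also suppress the quadratic signal, at the cost of a longer filter. This is the trade-off the per-family experiments explore.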
Coiflet filter
The Coiflet wavelet filter is evaluated together with its variants, which differ in the number of vanishing moments. The goal is to test each variant’s performance and determine which number of vanishing moments yields the best results for the task at hand. In this experiment, Coif2 performed better than all other Coiflet variants tested.
Performance comparison of different Coiflet filters based on various evaluation metrics.
Figure 6 shows the training and testing accuracy curves of various Coiflet wavelet filters. The Coif2-based classifier converges faster and shows a smaller gap between training and test accuracies, indicating improved generalization and reduced overfitting. Higher-order Coiflet wavelet filters exhibit less stable learning performance, possibly due to greater noise sensitivity and feature redundancy. These findings are consistent with the results shown in Table 2, reinforcing the superiority of the Coif2-based classifier in the proposed classification system.
Figure 6. Training and testing accuracy curves for the Coiflet wavelet filter, illustrating the improvement in model performance across training epochs.
Symlet filter
In this study, we use the Symlet wavelet family because it offers good time-frequency localization, making it efficient for feature extraction in signal and image processing applications. Symlet filters are a modification of Daubechies wavelets that provides better symmetry and reduced phase distortion, which is particularly useful for tasks that require precise preservation of features. To examine the effects of different wavelet functions in detail, we evaluate all 45 filters of the Symlet family. These filters differ in the number of vanishing moments, which directly affects their smoothness, support length, and ability to preserve high-frequency details while maintaining signal integrity. Because these differences affect performance on the task, the full comparison is necessary to determine the best filter.
Performance comparison of Symlet wavelet filters using various evaluation metrics.
Figure 7 shows the training and testing accuracy curves of Symlet wavelet filters during the training process. The figure shows the effect of vanishing moments on the convergence characteristics and stability of learning. Filters with higher vanishing moments exhibit smoother convergence curves and smaller differences between training and test accuracies, indicating stable learning and better generalization. Filters with lower vanishing moments exhibit slower convergence rates and larger differences, indicating poor feature representation. The convergence characteristics of learning provide qualitative information that supplements the quantitative information presented in Table 3.
Figure 7. Training and testing accuracy curves for the Symlet wavelet filters, illustrating the improvement in model performance across training epochs.
Biorthogonal filter
A comparative analysis was conducted to determine the best biorthogonal wavelet filter among 15 variants for the given task. Biorthogonal wavelet filters use a pair of decomposition and reconstruction filters that support perfect reconstruction and offer flexibility in smoothness, support size, and the number of vanishing moments. These factors are highly significant in tasks such as feature extraction, denoising, and image processing, where precise signal modeling and reconstruction are required.
Figure 8 shows the training and testing accuracy plots for different biorthogonal wavelet filters during training epochs. The plot demonstrates the impact of filter properties on convergence speed and testing accuracy. Biorthogonal wavelet filter 2.8 exhibits faster convergence and a smaller gap between training and testing accuracy, indicating better learning and improved generalization. On the other hand, poorer-performing filters exhibit slower convergence and larger oscillations, indicating poor feature modeling. Table 4 provides a quantitative analysis of the performance of biorthogonal wavelet filters based on training and testing accuracy, precision, recall, F1 score, MCC, and Cohen’s kappa values. Among all the filters tested, the Bior2.8 filter performs best in terms of test accuracy (96.28%) and F1 score (97.22%). Other filters, like Bior2.2 and Bior3.7, provide balanced results, while Bior3.5 performs relatively poorly in most cases. Higher-order biorthogonal filters perform better, underscoring the importance of selecting optimal wavelet filters for accurate classification.
Figure 8. Training and testing accuracy curves for the Biorthogonal filter, illustrating the improvement in model performance across training epochs.
Table 4. Performance metrics for different biorthogonal wavelet filters.
Rev-biorthogonal filter
In this research, 15 different reverse biorthogonal wavelet filters were systematically examined to determine the best variant. The decomposition function in each filter decomposed the input data, whereas the reconstruction function recomposed the wavelet coefficients. These filters are designed explicitly for symmetry and perfect reconstruction and are therefore widely used in image processing, denoising, and feature extraction. Their distinctions are due to differences in vanishing moments, support length, and wavelet smoothness, all of which affect their ability to extract details, edges, and textures from the data.
Figure 9 illustrates the training and testing accuracy curves achieved by using different reverse biorthogonal wavelet filters during the training epochs. This figure highlights the effects of symmetry, vanishing moments, and smoothness on the proposed model’s learning process. Rbio-2.4 shows faster convergence and a small gap between training and testing accuracy. In comparison, the less accurate reverse biorthogonal wavelet filters exhibit slower convergence and larger oscillations, indicating less accurate feature extraction. Table 5 shows the performance of the tested reverse biorthogonal wavelet filters on various criteria. Among all filters, Rbio-2.4 achieves the best test accuracy (97.32%) and F1 score (98.01%), validating its superior classification performance. Rbio-2.2 and Rbio-1.1 show balanced performance across various criteria, while Rbio-5.5 shows lower accuracy and MCC, indicating lower robustness. The Rbio-2.x filters consistently perform better than the other filters, which emphasizes the significance of proper wavelet choice for reliable classification performance.
Figure 9. Training and testing accuracy curves for various rev-biorthogonal wavelet filters during the training process.
Table 5. Performance metrics of different reverse biorthogonal wavelet filters.
Daubechies filter
Daubechies wavelets are preferred for their compact support and sound time-frequency localization, which are particularly useful for a wide range of signal processing applications, including compression, feature extraction, and noise reduction. A fundamental property of Daubechies wavelets is the number of vanishing moments, that is, the number of moments of the wavelet function that are equal to zero. This property determines the accuracy with which the wavelet represents polynomial signals and also affects its smoothness and frequency localization. Within the Daubechies family, the 45 filters differ in the number of vanishing moments, which influences their ability to capture fine details in signals without sacrificing orthogonality or incurring excessive time-domain overlap.
Figure 10 shows the training and testing accuracy curves achieved by using different Daubechies wavelet filters during the training process. This figure shows how the number of vanishing moments affects the convergence process and generalization ability of the proposed approach. The higher-order Daubechies wavelet filters exhibit smoother convergence curves and smaller differences between training and test accuracies, indicating that the proposed approach represents features and learns from data better with these filters. The lower-order Daubechies wavelet filters exhibit slower convergence rates and larger accuracy differences, indicating that the proposed approach models fine signal details poorly with these filters.
Figure 10. Training and testing accuracy curves for the Daubechies wavelet filter, illustrating the improvement in model performance across training epochs.
Performance comparison of Daubechies wavelet filters using various evaluation metrics.
During the experiments, a detailed assessment of wavelet filters from various families, including Coiflet, Symlet, Biorthogonal, Reverse Biorthogonal, and Daubechies wavelets, was conducted. Every wavelet family holds unique properties: the Coiflet wavelet filters offer compact support and smoothness; the Symlet wavelet filters offer symmetry and better-defined phases; Biorthogonal wavelet filters allow flexible and complementary analysis and synthesis; Reverse Biorthogonal wavelet filters promote the accuracy of the reconstruction process through the use of non-orthogonal wavelet pairs; and the Daubechies wavelet filters offer excellent time and frequency localization.
Although all wavelet families showed competitive performance, the experiments revealed that the Reverse Biorthogonal Rbio-2.4 filter outperformed the others across key performance parameters. When Rbio-2.4 is compared with the best-performing filters from the Coiflet, Symlet, Biorthogonal, and Daubechies wavelet families, Table 4 reveals that all parameters are in favor of the Reverse Biorthogonal Rbio-2.4 filter.
Discussion
Comparative analysis of deep learning models for breast cancer histopathology using BreakHis and multi-dataset evaluations (2019–2026).
For example, recent hybrid and attention-based models that leverage multi-scale analysis and feature fusion achieve accuracies close to 97%, but at the cost of greater complexity, reduced interpretability, and weaker generalization.46,47
In the performance comparison, the proposed DT-DWTCNN outperforms all other methods, achieving a test accuracy of 97.32%, an F1-score of 98.01%, and very high agreement measures (MCC: 93.80; Cohen’s kappa: 93.77). These results indicate robust classification performance with reliability well beyond chance agreement. This is attributed to incorporating multiresolution texture features derived from DT-DWT into CNN embeddings, which improves directional sensitivity, reduces data dependence, and enhances generalization to unseen datasets.
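For reference, the reported metrics can all be derived from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper's results. Values are returned as fractions, so multiplying by 100 gives the percentage-style figures quoted above.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute binary-classification metrics from confusion-matrix counts.

    Assumes a non-degenerate matrix (no zero denominators).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)

    # Matthews correlation coefficient: correlation between predicted and true labels.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den

    # Cohen's kappa: observed agreement corrected for agreement expected by chance.
    p_observed = accuracy
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (p_observed - p_expected) / (1 - p_expected)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only (not the paper's confusion matrix).
example = binary_metrics(tp=90, fp=5, fn=10, tn=95)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

MCC and kappa both stay near zero for a classifier that only reproduces the class proportions, which is why they are informative alongside accuracy on imbalanced data.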
Overall, the proposed framework demonstrates higher discriminative capability, robustness, and interpretability, thereby establishing its effectiveness in real-world breast cancer histopathological analysis.
Limitations and conclusion
A new multi-classification model is introduced by combining Dual-Tree Complex Wavelet Transform (DTCWT) with a multi-path Convolutional Neural Network (CNN) to diagnose breast lesions in histopathology images. The strengths of improved wavelet-based analysis and deep learning are combined to produce highly accurate results. The data is initially processed through feature extraction, where Law’s texture energy is used to measure image patterns and textures, and Gabor filters are used to capture features across specific frequency ranges and orientations. Through texture analysis, significant information about the spatial relationships between image pixels and their intensity variations is revealed, enabling a richer representation of the data. After extraction, the features are combined into a single Excel spreadsheet and then processed by DTCWT filters. During training and testing, the suggested framework is thoroughly evaluated using a histopathological image dataset. The presence of Wavelet Packet Decomposition (WPD) filters in the pipeline effectively enhances feature extraction efficiency by enabling the capture of fine image details. The method ensures that multiple aspects of feature extraction are considered to preserve subtle image details, thereby effectively enhancing model robustness. A comparative analysis of the short-listed wavelet packet decomposition filters was performed to assess their suitability for application in the proposed framework. Of all the filters tested, the Reverse Biorthogonal (Rbio-2.4) wavelet achieved the highest classification accuracy of 97.32%, outperforming the other wavelets. The Rbio-2.4 wavelet can extract better features from breast histopathological images.
The high values of precision, recall, F1-score, MCC, and Kappa statistics achieved during evaluation of the Rbio-2.4 filter reinforce the robustness and reliability of the technique for learning complex textural and structural information about breast lesions.
In conclusion, the results above demonstrate the efficacy of the proposed DT-DWTCNN framework for breast lesion identification. By combining the best-in-class feature extraction with optimal selection of the WPD filter, the proposed hybrid model performs exemplarily well, setting the stage for subsequent work. It is also clear that the proposed Rbio-2.4 filter performs satisfactorily in its application, suggesting its possible use in other image-related healthcare applications.
Future scope
Although the proposed DT-DWTCNN model achieves high classification accuracy, future studies will emphasize enhancing its explanation and interpretation capabilities through explainable artificial intelligence (XAI) tools. These include visualization tools such as Gradient-weighted Class Activation Mapping (Grad-CAM) and class-specific saliency maps, which identify the regions in histopathological images that contribute most to the model’s predictions. The addition of attention maps derived from XAI will not only improve interpretation but also enhance model trustworthiness and aid pathologists in understanding how the model reaches a decision when distinguishing tumor characteristics in breast cancer. Another area worth investigating is the classification of multiple breast cancer classes in large datasets.
Thus, the proposed DT-DWTCNN model integrates the best feature extraction algorithms with appropriate wavelet selection to achieve accurate breast lesion identification. The efficacy of the Rbio-2.4 filter in performing this task has been evidenced in this manuscript.
Footnotes
Ethical considerations
The BreaKHis and ICIAR2018 datasets are openly accessible to researchers. The use of both datasets adhered to the terms and conditions of both databases, and no further approval was required for their use in the current study. In the current study, public domain de-identified histopathological image datasets were used. An approval from the Institutional Ethics Committee (IEC) of Graphic Era Hill University was obtained for the experiment conducted. An ethics waiver (Waiver No.: GEHU/R&D/Eth/2026/02) has been issued by the Institutional Ethics Committee (IEC).
Author contributions
Ms. Manvi Bohra performed Conceptualization, Methodology, Software, Validation, Writing - Original Draft, Visualization, Formal analysis and data collection. Dr. Kamred Udham Singh performed Review & Editing, Supervision, Validation, and Formal Analysis. Dr. Indrajeet Kumar and Shambhu Mahato administered and conducted a formal analysis of the project. All authors reviewed the manuscript.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
