Abstract
The second most frequent malignancy in women worldwide is cervical cancer. In the transformation (transitional) zone, a region of the cervix, columnar cells continuously convert into squamous cells. The transformation zone is the most typical location on the cervix for the development of aberrant cells. This article proposes a 2-phase method that segments and classifies the transformation zone to identify the type of cervical cancer. In the initial stage, the transformation zone is segmented from the colposcopy images. The segmented images are then subjected to augmentation and classified with an improved inception-resnet-v2. Here, a multi-scale feature fusion framework that utilizes 3 × 3 convolution kernels from Reduction-A and Reduction-B of inception-resnet-v2 is introduced. The features extracted from Reduction-A and Reduction-B are concatenated and fed to an SVM for classification. In this way, the model combines the benefits of residual networks and Inception convolution, increasing network width and resolving the deep network's training issue. The multi-scale feature fusion allows the network to extract contextual information at several scales, which increases accuracy. The experimental results reveal 81.24% accuracy, 81.24% sensitivity, 90.62% specificity, 87.52% precision, 9.38% FPR, 81.68% F1 score, 75.27% MCC, and 57.79% Kappa coefficient.
Introduction
Human Papilloma Virus (HPV) infection acquired through sexual contact is the leading cause of cervical cancer, which begins in the cervix. It accounts for 12% of all malignancies and is the second leading cause of death for women globally.1 The progression of epithelial alterations from pre-cancerous to invasive malignancy takes several years. As a result, there is enough time for pre-cancerous phase screening, identification, and management. There are 2 basic diagnostic screening methods for cervical cancer: (1) cellular level and (2) tissue level. The Pap test, liquid-based cytology (LBC), HPV-DNA testing, and electromagnetic spectroscopies are used for cervical cancer screening at the cellular level. Tissue-level screening entails colposcopy, hyperspectral diagnostic imaging (HSDI), and visual inspection following the application of Lugol's iodine or acetic acid (VILI or VIA).2 Specimen collection is necessary for cellular-level diagnosis before the sample is investigated for expert analysis. For tissue-level diagnostic screening, on the other hand, specimen collection is not required. For third-world nations, the cellular-level screening methods might not be cost-effective enough.3 Additionally, the VIA technique incorporates the spraying of acetic acid.4 The uterine cervix is painted with 5% acetic acid and then viewed under a 100 W light after 1 minute. The cervix is painted with Lugol's iodine in the subsequent stage. Cases with changed acetic acid (VIA) and/or Lugol's (VILI) staining are noted as positive tests. This technique only separates a normal cervix region from an abnormal one; it cannot be used to classify the various types of cancer. Again, hyperspectral imaging-based approaches have many benefits, including the fact that they are non-invasive and may reduce the need for unnecessary biopsies.
However, the current spectral imaging technique collects data by performing successive scanning in the spatial domain to cover the area that must be diagnosed.5 Therefore, collecting the necessary diagnostic data takes time, because current spectral or hyperspectral imaging techniques must scan the entire cervix. Further drawbacks are the requirement for spatial registration, uneven lighting, an expensive and cumbersome setup, and sophisticated image processing.6 Due to severe limitations in infrastructure, resources, and funding, it is impossible to provide such setups in all rural health care centers for screening cervical cancer. Therefore, colposcopy is the only method that can be used to build low-cost screening strategies for cervical cancer. This will inevitably involve adopting basic techniques that can easily be taught to and implemented by paramedical staff in remote rural areas.
The lower genital tract (cervix, vulva, and vagina) is thoroughly evaluated visually during a colposcopic examination, with special attention paid to the subjective appearance of the metaplastic epithelium comprising the transformation zone (TZ) on the cervix. A colposcope, a low-power binocular microscope with an objective lens coupled to a support mechanism and a built-in white light source, is used for this purpose. A 3% to 5% acetic acid solution is applied to the cervix during the examination, turning the abnormal and metaplastic epithelia white. A colposcopic examination makes it possible to distinguish between cervical pre-cancer and invasive carcinoma owing to their distinct abnormal morphologic characteristics.7
Cancer of the cervix begins in the cells that line the cervix. The cervix is the lower part of the uterus, an organ in a woman's reproductive system. It links the vagina to the main body of the uterus and acts as a passageway between the two. The cervix comprises 2 parts: the ectocervix and the endocervical canal. The ectocervix is the part of the cervix that projects into the vagina. It is lined with non-keratinized stratified squamous epithelium. The endocervical canal, or endocervix, is the more proximal, "inner" part of the cervix. It is lined by mucus-secreting simple columnar epithelium. The endocervical canal ends at a narrow point called the internal os, where the uterine cavity starts. The external os, an opening in the ectocervix, marks the transition from the ectocervix to the endocervical canal. Figure 1 shows the anatomical and cross-sectional views.

Anatomical view of cervix (a), cross-sectional view of cervix (b).
A colposcopy image is a critical tool in the early detection of malignancy. Colposcopic examination of the transformation zone (TZ) is used to assess and identify persons with irregular cytology who require additional care or follow-up. More than 90% of pre-cancerous lesions originate in the TZ. The squamocolumnar junction (SCJ) and the TZ are critical dynamic markers in the transformation process. In the colposcopic perception of differentiating characteristics, intra- and inter-observer heterogeneity is relatively significant. Nonetheless, observer heterogeneity in TZ form and SCJ visibility appraisal, and the quantitative computation of intra- and inter-observer similarities of TZ contour tracing, have received little attention.8 The SCJ connects the squamous and columnar epithelium, and its position on the cervix varies. The SCJ develops through a constant remodeling process caused by uterine expansion, cervical enlargement, and hormonal state. During this process, the original SCJ everts and significant parts of columnar epithelium migrate from their initial site onto the ectocervix.9 Figure 2 shows an instance of the TZ and SCJ.

Illustration of transformation zone and squamocolumnar junction.
According to the visibility of the upper limit of the squamocolumnar junction (SCJ), the TZ is classified as type 1, type 2, or type 3.10 The TZ is considered type 1 when it contains only ectocervical components, that is, the whole TZ, including all the upper limits, is ectocervical. Types 2 and 3 have endocervical components. In type 2, the new SCJ is fully visible; if the new SCJ is not fully visible, the TZ is considered type 3. The biggest downside of using colposcopy as a diagnostic instrument is its dependence on the clinician's expertise and experience.11 Therefore, an automated method is required for the classification of cervical cancer types. The 3 TZ types in cervical colposcopy images are illustrated in Figure 3.

Samples of colposcopy images in the cervical region: (a) Type 1, (b) Type 2, and (c) Type 3.
The major contribution of this work is as follows.
This is the first work on segmentation and classification of transformation zone for cervical cancer diagnosis based on heterogeneity analysis of the cervix region.
An improved inception-resnet-v2 is proposed for the classification of the transformation zone and achieved 81.24% accuracy.
Comparative analysis is carried out against 13 other CNN models and existing work.
The proposed work performed better than the existing model for neoplasia classification, which implies the research is headed in the right direction.
The remaining article is organized as follows. The literature review is presented in Section 2. The materials and methodology are described in Section 3. Section 4 details the findings and discusses the remarkable results. Finally, Section 5 concludes the article.
Literature Review
Many studies have been conducted to detect cervical cancer from Pap smear images using machine learning and deep learning.12-16 However, only a limited number of works use colposcopy. Zhang et al. proposed a CAD approach for automatically diagnosing cervical pre-cancerous lesions, that is, determining CIN2 or higher-level lesions in cervical images. Initially, the image data are preprocessed with ROI extraction and data augmentation. The parameters of all layers are then fine-tuned using pre-trained DenseNet convolutional neural networks, achieving an accuracy of 73.08% (AUC 0.75) on 600 test images.17 Using time-lapsed colposcopic images, Li et al. suggested a deep learning framework for effectively diagnosing LSIL+ (including CIN and cervical cancer). The suggested framework comprises 2 primary parts: key-frame feature encoding networks and feature fusion networks. The feature encoding networks encode the features of the initial (pre-acetic acid) image and the colposcopic images acquired at around 60, 90, 120, and 150 seconds of the acetic acid test. Several fusion approaches were evaluated and found to surpass existing automated cervical cancer diagnosis systems in a single time window. Owing to its classification accuracy of 78.33% on 7668 images, a graph convolutional network with edge features (E-GCN) is the most appropriate fusion strategy in that study.18 Yu et al. introduced a gated recurrent convolutional neural network (C-GCNN) for colposcopy image processing that considers time series and combines multistate cervical images to grade CIN, achieving an accuracy of 96.87%, a sensitivity of 95.68%, and a specificity of 98.72%.19 Saini et al.20 describe a deep-learning-based technique for cervical cancer classification using colposcopy images. The suggested architecture, ColpoNet, is an upgraded version of the DenseNet model.
ColpoNet attained an accuracy of 81.353% utilizing 400 images of CIN1 type and 400 images of CIN2/CIN3/CIN4 type. According to the literature, only cervical intra-epithelial neoplasia (CIN) classification has been performed utilizing colposcopy images. Recently, many works have reported the utilization of deep learning in cervical cancer screening,21-23 breast cancer,24-26 and lung cancer.27 To date, there has been no work on TZ classification.
In this study, we are mainly interested in classifying TZ types based on the SCJ visibility, which supports clinical decisions. This study develops an automated TZ classification method and demonstrates that a computerized classification method can be used for colposcopy.
Material and Methodology
This section discusses the segmentation procedure of the TZ using image processing tools, the data augmentation of the segmented images, and finally the classification of the TZ using the improved inception-resnet-v2. The dataset was collected from the IARC (International Agency for Research on Cancer) on request. The dataset contains samples captured after applying normal saline, acetic acid, acetic acid with a green filter, and Lugol's iodine. In this research, the samples after acetic acid are considered.
Transformation zone segmentation
This sub-section describes the different steps of TZ segmentation, which are illustrated in Figure 4.
(a) Initially, the colposcopy image in RGB form is taken.
(b) The red channel is extracted, as the cervix is naturally red; thresholding is applied to this channel in the next step.
(c) A threshold value of 200 is set to detect the central part of the cervix and ignore the periphery.
(d) The largest object is detected using an area filter; since many objects are present in the cervical region, all except the central part are ignored.
(e) The boundary of the largest object is traced, and a binary mask is generated within the boundary.
(f) Finally, the binary mask is multiplied by the original colposcopy (RGB) image to obtain the TZ.
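The steps above can be sketched as follows. This is a minimal Python/SciPy illustration only; the original pipeline was executed in MATLAB, and the function name and SciPy-based area filtering are our own choices, not the authors' code.

```python
import numpy as np
from scipy import ndimage

def segment_transformation_zone(rgb, threshold=200):
    """Steps (a)-(f): red-channel threshold, area filter, boundary mask, multiply."""
    # (b) extract the red channel, as the cervix is predominantly red
    red = rgb[:, :, 0]
    # (c) threshold at 200 to keep the bright central cervix region
    binary = red > threshold
    # (d) area filter: keep only the largest connected component
    labels, n = ndimage.label(binary)
    if n == 0:
        return np.zeros_like(rgb)
    sizes = ndimage.sum(binary, labels, range(1, n + 1))
    largest = 1 + int(np.argmax(sizes))
    # (e) fill within the traced boundary to obtain a solid binary mask
    mask = ndimage.binary_fill_holes(labels == largest)
    # (f) multiply the mask with the original RGB image
    return rgb * mask[:, :, None].astype(rgb.dtype)
```

On a synthetic test image with one large bright-red region and a small noise blob, the function keeps only the large region and zeroes the rest.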

The proposed method to segment the transformation zone in colposcopy images.
Dataset augmentation
The dataset used in this research, collected from the IARC, comprises 292 colposcopy images: 169 of Type 1, 43 of Type 2, and 80 of Type 3. All these images are passed through the segmentation process described in section 2.1. The segmented images then undergo flipping and rotation operations, that is, horizontal and vertical flips and left and right rotations. In this way, the dataset is increased 5 times. To balance the dataset, the lowest sample count is considered, that is, 43 × 5 = 215, so all types have 215 samples. To increase the dataset further, random variations such as contrast variation, brightness variation, random rotation, and translation are introduced. The details of the dataset are noted in Table 1. Finally, each TZ type comprises 1075 samples. The original 292 images are considered for segmentation, the segmented images are passed through augmentation, and finally 3225 images are used for classification with a training-to-testing ratio of 80:20.
Details of dataset.
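The augmentation bookkeeping above can be sketched as follows. This is an illustrative NumPy sketch: the geometric variants follow the text, while the further 5× expansion per class (215 → 1075) via random contrast/brightness/rotation/translation is inferred from the reported counts.

```python
import numpy as np

# Class counts of the original IARC dataset (per Table 1)
counts = {"Type 1": 169, "Type 2": 43, "Type 3": 80}

def basic_augment(img):
    """Original image plus 4 geometric variants: a 5x expansion."""
    return [
        img,
        np.fliplr(img),     # flip horizontal
        np.flipud(img),     # flip vertical
        np.rot90(img, 1),   # rotate left
        np.rot90(img, -1),  # rotate right
    ]

# Balancing: take the smallest class after the 5x geometric expansion
per_class = min(counts.values()) * 5              # 43 * 5 = 215
# A further 5x per class via the random variations (inferred factor)
final_per_class = per_class * 5                   # 1075
total = final_per_class * len(counts)             # 3225
train, test = int(total * 0.8), int(total * 0.2)  # 80:20 split
```

With these counts, the split yields 2580 training and 645 testing images.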
Classification of transformation zone
For deep convolutional neural networks, researchers have discovered that the features extracted by shallow and deep layers are distinct. Shallow layers extract surface features such as edges and textures, while deeper layers extract complex semantic features that are not accessible to human intuition. The former helps locate the target, while the latter aids in recognizing it. Detection performance has improved since the introduction of the Inception-ResNet-v2 module. However, small-scale target identification suffers as the network deepens, because effective location information is lost. As a result, multi-scale feature fusion is required. To extract the features of the cervical area, we propose a multi-scale feature fusion framework that utilizes 3 × 3 convolution kernels from Reduction-A and Reduction-B of inception-resnet-v2. In this paper, the improved inception-resnet-v2 is suggested for the classification of the TZ, and its end-to-end convolutional neural network is depicted in Figure 5. The network comprises a backbone (Inception-ResNet-v2), multi-scale context information fusion (concatenation), and a linear SVM for classification. The Reduction-A and Reduction-B characteristics are combined in the Inception-ResNet-v2 structure. For the reader's reference, Figure 6 shows the inception-resnet-v2 core blocks and the architecture of Reduction-A and Reduction-B. The detailed architecture of Inception-ResNet-v2 can be found in Manna.28

Improved inception-resnet-v2 for transformation zone classification.

Core structure of inception-resnet-v2 and architecture of Reduction-A and Reduction-B.
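The fusion-then-SVM stage can be sketched as follows. This is a minimal sketch only: the feature arrays are random stand-ins for the globally pooled Reduction-A and Reduction-B outputs, and the channel widths (1088 and 2080 per the standard Inception-ResNet-v2 architecture) are our assumption, not stated in the paper.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_samples = 60

# Stand-ins for globally pooled feature maps of the two reduction blocks
feat_a = rng.normal(size=(n_samples, 1088))   # from Reduction-A (assumed width)
feat_b = rng.normal(size=(n_samples, 2080))   # from Reduction-B (assumed width)
labels = rng.integers(0, 3, size=n_samples)   # TZ types 1-3

# Multi-scale fusion: channel-wise concatenation of the two feature vectors
fused = np.concatenate([feat_a, feat_b], axis=1)   # shape (60, 3168)

# Linear SVM trained on the fused representation
clf = LinearSVC(max_iter=5000).fit(fused, labels)
pred = clf.predict(fused)
```

In the actual pipeline, `feat_a` and `feat_b` would come from forward passes of the trained backbone rather than a random generator.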
Result and Discussion
The proposed methodology is executed on an HP Pavilion notebook with a fifth-generation Core i5 processor, Windows 10, an inbuilt NVIDIA GPU, and MATLAB 2021a. The method is executed in 2 phases: (1) segmentation of the TZ and (2) classification of the TZ types.
Experimental results on transformation zone segmentation
A total of 292 colposcopy images are passed through the segmentation process illustrated in Figure 4. The detailed step-by-step outputs of the segmentation process are shown in Figure 7. Figure 7a shows the original colposcopy image in RGB, Figure 7b the extracted red channel, Figure 7c the image after applying the red-channel threshold of 200, Figure 7d the boundary of the transformation zone after applying the BW area filter, Figure 7e the mask after applying the BW boundary tracing, and Figure 7f the segmented TZ. Despite variations in brightness and illumination, the TZs are extracted from all 292 images.

Snapshot of the different steps of segmentation of the transformation zone in colposcopy images.
Experimental results on transformation zone classification
We compared different CNN classifiers by analyzing their confusion matrix measures, that is, the accuracy (ACC), sensitivity (Sen), specificity (Spe), precision (Pre), FPR, F1 Score, MCC, and kappa values. The enhanced dataset was randomly divided into training (80%) and testing (20%) sets. To ensure the classifier generalizes to unseen patients, the split of the dataset is randomized. The results presented in this work are the average of 10 runs. The results of our 10 rounds are used to generate receiver operating characteristic (ROC) curves. The performance of 13 CNN models and the proposed model is recorded in Table 2.
Performance comparison of CNN models for classification of transformation zone types in (%).
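The confusion matrix measures reported above can be computed as follows. This is an illustrative helper under the assumption that the per-class measures are macro-averaged over the 3 TZ types (the averaging convention is not stated in the paper); MCC and kappa are omitted for brevity.

```python
import numpy as np

def metrics_from_confusion(cm):
    """Macro-averaged measures from a KxK confusion matrix (rows: true class).

    Assumes every class appears at least once, so no division by zero occurs.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp          # predicted as class k but actually not
    fn = cm.sum(axis=1) - tp          # actually class k but predicted otherwise
    tn = cm.sum() - tp - fp - fn
    sen = np.mean(tp / (tp + fn))     # sensitivity / recall
    spe = np.mean(tn / (tn + fp))     # specificity
    pre = np.mean(tp / (tp + fp))     # precision
    fpr = np.mean(fp / (fp + tn))     # false positive rate
    f1 = 2 * pre * sen / (pre + sen)  # F1 from macro precision and recall
    acc = tp.sum() / cm.sum()
    return dict(acc=acc, sen=sen, spe=spe, pre=pre, fpr=fpr, f1=f1)
```

For a perfect 3 × 3 confusion matrix, all measures are 1.0 and the FPR is 0.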
It is observed from Table 2 that the proposed model performed well compared with the other 13 CNN models for the classification of TZ types. Further, a comparative analysis is carried out with the state-of-the-art and recorded in Table 3.
Comparison with the existing work.
It is observed from Table 3 that, to date, only 3 works using colposcopy images have been reported, and only for neoplasia classification. Two of them carried out 2-way classification, that is, cancer lesion present or absent, and achieved approximately 81% accuracy. The other work reported 5-way classification and achieved a very low classification accuracy of 51.7%. Consequently, the proposed work is the first on TZ classification, that is, analysis of the heterogeneity of the cervix for cancer diagnosis, and achieved a satisfactory 3-way classification accuracy of 81.24%.
Conclusion
A heterogeneity study of the cervix region based on the visibility of the SCJ has not been reported to date. Colposcopy screening of cervical cancer is feasible in rural areas of underdeveloped and low-income countries because of its small and economical setup. This cervical cancer screening method avoids biopsy, which requires a specimen for analysis. This study proposed a CNN model based on an improved inception-resnet-v2, considering the advantages of a wide network. The features from Reduction-A and Reduction-B are concatenated and fed to a linear SVM for classification. The classification of TZ types achieved an accuracy of 81.24%, sensitivity of 81.24%, specificity of 90.62%, precision of 87.52%, FPR of 9.38%, F1 score of 81.68%, MCC of 75.27%, and Kappa coefficient of 57.79%. Further, the proposed model performed better than the other CNN models and existing work, indicating that this research is headed in the right direction.
Footnotes
Acknowledgements
We are thankful to Eric Lucas of the Early Detection, Prevention & Infections Branch, International Agency for Research on Cancer, WHO, for providing us with the colposcopy images.
Funding:
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Declaration Of Conflicting Interests:
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Author Contributions
Srikanta Dash, Dr. Prabira Kumar Sethy, and Dr. Santi Kumari Behera contributed equally to designing the methodology, data analysis, and findings. Dr. Prabira Kumar Sethy drafted the manuscript, and Dr. Santi Kumari Behera responded to the reviewers' queries.
Ethical Statement
In this research, there is no direct or indirect involvement of any human beings or animals.
Data Availability
The image samples provided by the International Agency for Research on Cancer, WHO, are not transferable. The data samples are meant for this research only. Anyone who needs them for further research may contact the original source.
