Machine learning model for umbilical cord classification using combination coiling index and texture feature based on 2-D Doppler ultrasound images

Abstract

The umbilical cord is an organ that circulates oxygen and nutrition from mother to fetus during pregnancy. This study aims to classify the umbilical cord into three classes (Normocoiled, Hypocoiled, and Hypercoiled) based on ultrasound images. The similarity of shape and coiling between the classes makes this challenging, so the classifier requires feature values that are relevant to the characteristics of each class. The data set is also imbalanced, which degrades the classifier’s performance on the minority classes. Therefore, this study proposes a machine learning model that handles the imbalanced data set and recognizes the umbilical cord class.

Furthermore, this study proposes a new feature extraction method, namely, the umbilical coiling index (UCI), which directly adopts obstetricians’ knowledge. The proposed model consists of five stages: image preprocessing, feature extraction, feature selection, data oversampling using SMOTE, and classification. Machine learning methods were observed comprehensively on five base classifiers (Random Forest, KNN, Decision tree, SVM, and Naïve Bayes) and on a Multiclassifier that combines them. The results showed that the Random Forest and Multiclassifier methods provide the highest accuracy, precision, recall, and F-measure on the imbalanced data set.

Keywords

Umbilical cord, Machine learning, Imbalanced data sets, Multiclassifier, SMOTE

Introduction

The umbilical cord is a connective tissue or channel that connects the placenta and the fetus. It also serves as a source of life for the fetus by maintaining fetal viability (survival), facilitating its growth, aiding in the disposal of waste compounds, and transporting oxygen, nutrients, and antibodies in the womb.¹ This channel consists of three blood vessels: one umbilical vein and two arteries connecting fetal circulation with the placenta. Fetomaternal and obstetric research widely uses the umbilical cord to assess fetal growth and development. Abnormalities in its shape and morphology disrupt blood flow from the placenta to the fetus, as reported by Bosselmann et al.² Therefore, an obstetrician’s examination is critical, considering the risks of disrupted blood flow to the fetus, such as malnourishment. Ndolo et al.,³ Bosselmann et al.,² and Kulamani et al.⁴ stated that an assessment standard called the Umbilical Coiling Index (UCI) is used to determine the umbilical cord category. UCI measures the number of twists relative to the total length of the umbilical cord, where one twist is a 360-degree turn in the spiral shape of the umbilical vessel. The UCI value is obtained by dividing the number of coils by the length of the umbilical cord. The umbilical cord is classified into three forms: Normocoiled, Hypocoiled (when the UCI is less than the 10th percentile), and Hypercoiled (when it is greater than the 90th percentile).^1,5–10 The number of coils affects the flow of blood, oxygen, antibodies, and nutrients the fetus needs.^7,11–13

Figure 1 shows an example of the umbilical cord taken with an ultrasound machine using the Doppler effect. The colors distinguish the vessels: red indicates blood flow in the vein, while blue indicates the arteries. In hypocoiling, where the vein does not coil around the arteries, the umbilical cord becomes prone to true knots that obstruct blood flow in the vessels.

Figure 1.

Three categories of Umbilical cord; (a). Normocoiled; (b, c). Hypercoiled; (d). Hypocoiled¹.

The fetal umbilical cord is excluded from the mandatory routine examination by an obstetrician. However, when the obstetrician notices any abnormality, such as inappropriate fetal weight, the umbilical cord is one of the organs that need to be analyzed. The ultrasound machine does not provide the much-needed information about the anatomy and condition of the umbilical cord. Due to the essential role of the umbilical cord in fetal growth and development, it is imperative to solve this problem to provide supporting information to obstetricians when diagnosing umbilical cords that require prompt and appropriate medical actions.

Artificial intelligence technology, especially with the machine learning method, has been widely used to support this analysis and provide second opinions for clinicians in making a diagnosis.¹⁴ One of the important steps in the machine learning model for medical image classification is feature extraction. This stage extracts the information contained in the image to provide a characteristic representation of an object or organ. An example is a study carried out by Acharya et al.¹⁵ on breast cancer MRI images’ statistical and structural features. Moreover, there is an additional feature, namely, the Run Length Matrix, with a value based on the predetermined parameters’ gray tone, length, and direction.

Furthermore, Fajrin et al.^16,17 combined the GLCM method with wavelet decomposition and entropy characteristics. The combination of these methods provided additional information regarding texture diversity and image randomness. Textural characteristics obtained from a combination of the GLCM method and the Two-Dimensional Discrete Wavelet Transform (2D-DWT) were proposed by Beura et al.¹⁷ to capture accurate textural characteristics of MRI breast cancer images using the Multiresolution Analysis concept. A similar model combining the 2D-DWT method and GLCM was proposed by Mohanty et al.¹⁸ The difference lies in dividing the region of interest (ROI) of the image into several parts, called blocks, with a size of 64×64. The GLCM method was then applied to each sub-band resulting from the wavelet decomposition of the ROI block. In addition to MRI images, Belsare et al.^19,20 used biopsy images to classify cancer into benign and malignant. This differs from the research carried out by George et al.¹⁹ on textural features using the GLCM method, which was performed on four different color channels: red, green, blue, and gray. Wu et al.²¹ categorized breast cancer into Triple-Negative (TN) and Non-Triple-Negative (NTN), using several extracted features such as texture, shape, and vascularity to determine the best combination of features in the test.

Studies on machine learning methods for fetal umbilical cord image classification are limited, which makes this study a basis for subsequent work. Pradipta et al.²² specifically carried out a study on fetal umbilical cord classification using the GLCM feature extraction approach and several morphological features. However, that study did not use the specific characteristics adopted by clinical practitioners to assess the fetal umbilical cord ultrasound image. Applying obstetricians’ knowledge in the feature extraction process is expected to improve the machine learning models’ performance. This study proposes combining UCI and the GLCM method to improve the classification performance reported by previous studies. Furthermore, several base classifiers, namely Random Forest, Support Vector Machine (SVM), Naïve Bayes, Decision tree, and K-nearest neighbors (KNN), were observed. The observation was also made by combining all classifiers into an Ensemble Multiclassifier method based on voting decisions.

Proposed Model

Generally, the proposed machine learning model consists of four stages: Image preprocessing, Feature extraction and Selection, Data oversampling, and Classification, as shown in Figure 2.

Figure 2.

Proposed Single classifier architecture for umbilical cord classification.

The multiclassifier voting classification belongs to the ensemble learning category. Using several classifiers to predict the test data is intended to make the output label more reliable. The proposed method uses majority voting: each classifier casts one vote, and the output label is the class that receives the most votes for the input data, as shown in Figure 3.

Figure 3.

Proposed Multi classifier architecture for umbilical cord classification.
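The majority voting rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `hard_vote` and the tie-breaking rule (the first label to reach the top count wins) are assumptions, since the text does not say how ties are resolved.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the label predicted by the most classifiers.

    `predictions` holds one label per classifier. Ties are broken in
    favour of the tied label that appears first in the list (assumption).
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# One hypothetical prediction from each of the five base classifiers.
votes = ["Normocoiling", "Hypercoiling", "Normocoiling",
         "Hypocoiling", "Normocoiling"]
result = hard_vote(votes)
print(result)
```

With five classifiers and three classes a tie such as 2-2-1 is possible, which is why an explicit tie-breaking rule is needed in any concrete implementation.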

Image Dataset

The data comprise fetal umbilical cord images taken using a 2-dimensional Doppler ultrasound machine with specifications. The images were retrieved and collected at Kasih Medika Obstetrics and Gynecology Clinics in Bali, Indonesia. The data are divided into three categories, Normocoiling, Hypocoiling, and Hypercoiling, taken from 8 to 32 weeks of gestation. Figure 4 shows an example of the image data. The Normocoiling, Hypocoiling, and Hypercoiling classes contain 108, 34, and 9 images, respectively, and the entire image dataset was labeled by an obstetrician. These counts show that the data are imbalanced: the Hypercoiling class has very little data compared to the other classes. This imbalance requires separate handling before the classification algorithm is trained.

Figure 4.

Example of Doppler ultrasound fetal umbilical cord image data; (a). Normocoiling; (b). Hypocoiling; (c). Hypercoiling.

Image Preprocessing

The image segmentation process is carried out in the HSV (Hue, Saturation, Value) color space. The first process crops the original image to remove text or captions, because text and numbers tend to interfere with the segmentation and feature extraction processes. Cropping changes the image size to 744 × 522 pixels. The second process transforms the color space from RGB (Red, Green, Blue) to HSV. After the conversion, the image is segmented by thresholding each of the hue, saturation, and value channels. The optimal threshold values are determined by observing the image histogram of each channel.
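As a rough illustration of thresholding in HSV space, the sketch below converts each RGB pixel to HSV with Python's standard `colorsys` module and keeps pixels whose saturation reaches a threshold. The function name, the tiny test image, and the nested-list image representation are assumptions made for the example; the default threshold of 0.6 is the saturation value that appears in the UCI pseudocode in Table 1, not a value stated for this preprocessing stage.

```python
import colorsys


def saturation_mask(rgb_image, s_min=0.6):
    """Binary mask of pixels whose HSV saturation is at least s_min.

    rgb_image is a list of rows, each row a list of (r, g, b) tuples
    with channel values in 0..255.
    """
    mask = []
    for row in rgb_image:
        mask_row = []
        for r, g, b in row:
            # colorsys expects channels scaled to the range 0..1.
            _, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            mask_row.append(s >= s_min)
        mask.append(mask_row)
    return mask


# Hypothetical 2 x 2 image: saturated red and blue pixels (Doppler
# colour) next to grey, weakly saturated background pixels.
image = [
    [(255, 0, 0), (128, 128, 128)],
    [(0, 0, 255), (200, 190, 190)],
]
mask = saturation_mask(image)
print(mask)
```

Grey-scale background in a Doppler image has saturation close to zero, so a saturation threshold separates the colour-coded vessels from the surrounding tissue.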

After thresholding, the image edges are smoothed using the opening and closing morphological operations. The closing operation smooths the contours and eliminates small holes in the segmented image. The final step selects the largest region among the segmented objects by measuring each object's area with the regionprops function in MATLAB. The results of this preprocessing stage are shown in Figure 5.

Figure 5.

Umbilical cord image preprocessing process; (a). Image segmentation by thresholding; (b). Opening and closing results; (c). Labeling object; (d). After regionprops operation; (e). RGB image after final preprocessing; (f). Grayscale image after final preprocessing.
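Selecting the largest segmented region can be illustrated without an image-processing library. The sketch below labels connected regions with a breadth-first search and keeps the one with the most pixels. It is only an approximation of the MATLAB step described above: the function name is hypothetical and 4-connectivity is an assumption (MATLAB's labeling functions default to 8-connectivity for 2-D images).

```python
from collections import deque


def largest_region(mask):
    """Keep only the largest 4-connected foreground region of a mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected region.
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # The region's area is its pixel count.
            if len(region) > len(best):
                best = region
    out = [[False] * cols for _ in range(rows)]
    for y, x in best:
        out[y][x] = True
    return out


# Hypothetical mask with a 3-pixel object and a 2-pixel object.
mask = [
    [True, True, False, False],
    [True, False, False, True],
    [False, False, False, True],
    [False, False, False, False],
]
kept = largest_region(mask)
```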

Texture Analysis With Gray level Co-Occurrence matrix (GLCM)

GLCM is used to detect textures by calculating the probability of the relationship between 2 neighboring pixels at a certain distance and angle orientation.²³ This technique yields second-order statistics by calculating the probability of the co-occurrence of 2 pixels at a certain distance (d) and angle (θ). The angle θ takes one of four directions, namely, $0^{\circ}$, $45^{\circ}$, $90^{\circ}$, and $135^{\circ}$.

The first step involves forming a co-occurrence matrix and determining the spatial relationship between the reference and neighboring pixels based on the angle (θ) and distance (d). The co-occurrence matrix is constructed using a second-level histogram and is a joint probability distribution of pixel pairs at given gray levels. Figure 6 illustrates an image of size 4, in which the neighboring pixels are selected from the east (right), that is, at an angle of $0^{\circ}$ and distance $d = 1$. For example, the image matrix $A_{n \times n} = [a_{i,j}]$ has a size of $n \times n$, where $a_{i,j}$ is the element of matrix $A$ with $i, j = 0, \dots, n-1$. Let $p$ be the maximum value of the elements in $A$, namely, $p = \max_{i,j} a_{i,j}$. Matrix $B$ is formed as the composition of the pixel values contained in the image; its size is $(p+1) \times (p+1)$, with $B = [b_{k,l}]$, $k, l = 0, \dots, p$, and $b_{k,l} = (k, l)$, where $k$ and $l$ are the row and column indices of matrix $B$. A co-occurrence matrix $C = [c_{k,l}]$ with $k, l = 0, \dots, p$ is then formed with the same size as matrix $B$. The value $c_{k,l}$ is the number of pairs $(a_{i,j}, a_{i,j+1})$ with $a_{i,j} = k$ and $a_{i,j+1} = l$, where $i = 0, \dots, n-1$ and $j = 0, \dots, n-2$. Equations (1) and (2) are used to obtain the $c_{k,l}$ value

$$H_{k,l} = \{(a_{i,j}, a_{i,j+1}) \mid a_{i,j} = k,\ a_{i,j+1} = l,\ i = 0, \dots, n-1,\ j = 0, \dots, n-2\} \tag{1}$$

and

$$c_{k,l} = \#(H_{k,l}) = \text{number of members of } H_{k,l} \tag{2}$$

Figure 6.

Illustration of the co-occurrence matrix in GLCM.

The second step involves forming a symmetric matrix. The co-occurrence matrix $C$ from the first step serves as the framework and is made symmetric by adding its transpose, which gives $G = [g_{k,l}]$ of size $(p+1) \times (p+1)$ as in equation (3)

$$G = C + C^{T} \tag{3}$$

where $C^{T}$ is the transpose of matrix $C$.

The third step normalizes the symmetric matrix $G$ to eliminate the dependence on image size, so that the GLCM values sum to 1. Equations (4) to (6) show the normalization of each matrix element. After the normalization process, the feature values in the GLCM method are calculated

$$G_{\text{normal}} = [g_{k,l}^{n}] \tag{4}$$

where

$$g_{k,l}^{n} = \frac{g_{k,l}}{T} \tag{5}$$

and

$$T = \sum_{k=0}^{p} \sum_{l=0}^{p} g_{k,l} \tag{6}$$
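The three steps above (counting pairs, symmetrizing, and normalizing) can be traced on a small example. The sketch below is a plain-Python illustration for the $0^{\circ}$, $d = 1$ case only; the function name and the 4 × 4 sample image are assumptions chosen for the example and are not taken from Figure 6.

```python
def glcm_0deg(image):
    """Normalized symmetric GLCM for angle 0 degrees and distance d = 1.

    image is a list of rows of non-negative integer gray levels.
    """
    p = max(max(row) for row in image)
    size = p + 1
    # Step 1: C[k][l] counts horizontal pairs (a[i][j], a[i][j+1]) = (k, l),
    # as in equations (1) and (2).
    C = [[0] * size for _ in range(size)]
    for row in image:
        for j in range(len(row) - 1):
            C[row[j]][row[j + 1]] += 1
    # Step 2: G = C + C^T, equation (3).
    G = [[C[k][l] + C[l][k] for l in range(size)] for k in range(size)]
    # Step 3: divide every element by the total T, equations (4) to (6).
    T = sum(sum(row) for row in G)
    return [[g / T for g in row] for row in G]


# Hypothetical 4 x 4 image with gray levels 0..3.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
N = glcm_0deg(image)

# One of the texture features named in Table 2: the maximum probability.
max_probability = max(max(row) for row in N)
```

For this image there are 12 horizontal pairs, so $T = 24$ after symmetrizing, and the largest entry is $g_{2,2}^{n} = 6/24 = 0.25$.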

Umbilical Coiling Index (UCI)

UCI is the method obstetricians use to determine the umbilical cord type. It depends on two parameters, namely, the number of coils and the length of the cord. The UCI value is obtained by dividing the number of coils by the length of the umbilical cord.⁹ Table 1 shows the pseudocode of the UCI feature extraction process proposed in this study. The resulting coiling index is the standard for classifying the umbilical cord category. On a Doppler ultrasound machine, the length is determined by using a caliper to draw a line along the umbilical cord. UCI values less than 0.21 and greater than 0.59 fall into the Hypocoiled and Hypercoiled categories, respectively.

Table 1.

Pseudocode UCI Feature extraction.

Algorithm. Umbilical coiling index (UCI) feature extraction
Input : original umbilical cord image Img;
Output : UCI value P
// Creating the caliper line on the umbilical cord object
1. h ← imfreehand (‘Closed’, 0)
2. position ← wait (h)
// Creating the line for pixel calibration
3. h ← imline
4. position2 ← wait (h)
// RGB to HSV conversion for segmentation
5. HSV ← rgb2hsv (Img)
6. H ← HSV (:, :, 1)
7. S ← HSV (:, :, 2)
// Thresholding the saturation component
8. bw ← S ≥ 0.6
// Morphological operation
9. bw ← bwareaopen (bw, 100)
10. str ← strel (‘disk’, 5)
11. bw ← imclose (bw, str)
12. bw ← imfill (bw, ‘holes’)
// Labeling the objects in the segmentation result
13. [B, L] ← bwlabel (bw)
// Selecting the object that is traversed by the caliper line
14. for k = 1: size (position, 1)
A(k) = B (ceil (position (k,2)), ceil (position (k,1)))
15. end
16. bw ← B==mode (A)
// Segmentation of the red area of the umbilical cord
17. bw_red ← H ≥ 0.9 | (H ≥ 0 & H ≤ 0.2)
// Morphological operation
18. bw_red ← imfill (bw_red, ‘holes’)
19. bw_red ← bwareaopen (bw_red, 100)
// Segmentation of the blue area of the umbilical cord
20. bw_blue ← H ≥ 0.2 & H ≤ 0.9
// Morphological operation
21. bw_blue ← imfill (bw_blue, ‘holes’)
22. bw_blue ← bwareaopen (bw_blue, 100)
// Counting the red area objects
23. [B1, ∼] ← bwboundaries (bw_red, ‘noholes’)
24. red_area ← size (B1, 1)

Meanwhile, the UCI values between 0.21 and 0.59 are included in the Normocoiled category.² After the image input process, the first step is to determine the length of the umbilical cord object in the image by drawing a caliper line from the upper limit of the umbilical cord to the lower limit. The imfreehand function in MATLAB is used to draw the line, as shown in lines 1 and 2 of the pseudocode. The image segmentation follows the caliper drawing process. It applies thresholding to the HSV image based on the saturation (S) component, followed by morphological operations, as depicted in lines 5 to 12.

Furthermore, the ROI of the object is determined based on the object traversed by the caliper line. This is carried out because the UCI analysis is also dependent on this procedure. To eliminate irrelevant small objects in the feature extraction process, morphological operations need to be performed.

The opening and closing operations are carried out with a Disk-type structuring element and a size of 1000. This is followed by selecting the fetal umbilical cord object. First, each segmented object is labeled using the function for labeling connected components in 2-D binary images (bwlabel), as shown in line 13. The next process selects the object traversed by the caliper line, as in lines 14 to 16. The variable position stores the row and column coordinates of the points on the caliper line in the image, and the k value indexes the points that constitute the caliper line. For each point, the label found in the labeled image B at that coordinate is stored in variable A, so A(k) is the object in image B that contains the k-th caliper coordinate. Finally, the object traversed by the caliper line is selected using the mode function, that is, the label that occurs most often along the line.

Subsequently, the length of the caliper line on the fetal umbilical cord object is measured in pixels. The variable position stores the row and column coordinates of the points that constitute the caliper line. The Euclidean distance between each pair of consecutive points is calculated up to the last point, and all distances are summed and stored in the variable n. At this stage, n is still expressed in pixels. The pixel values are calibrated to centimeters (cm) using reference points in the information area to the right of the ultrasound image. Calibration is needed because the pictures are taken manually with a camera, which makes shooting distances and angles inconsistent, so each image has a different pixel size that is determined from the distance between the reference dots. The calibration line is symbolized by positioncal, consisting of the coordinates of its starting point (1,1) and ending point (2,2). This process is repeated for each input image; therefore, a different pixel-to-centimeter value is obtained for each image. The UCI calculation then starts from the number of coils in the umbilical cord, which is determined by counting the blue (blue_area) and red (red_area) objects in the ROI.

One coil of the cord consists of a pair of blue area and red area objects. The number of red and blue objects is determined by segmenting the regions that constitute the coil using the threshold method in the HSV color space. A blue object has a Hue value greater than 0.2 and less than 0.9. A red object has a Hue value greater than 0.9, or between 0 and 0.2. Figures 7(c) and (d) show the segmentation of the umbilical artery and vein objects. Boundary detection is applied to each object using the bwboundaries function in MATLAB to determine the number of objects detected as umbilical vein and artery, and the number of objects is obtained using the size function. In Figure 7, three umbilical vein objects and three umbilical artery objects are detected, so the number of coils is 3. When the two counts differ, the number of coils is taken as the smaller of the two.
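The length measurement and calibration described in this paragraph amount to summing segment lengths along the caliper line and dividing by a pixels-per-centimeter factor. The following sketch shows one way to express this; the function names, the sample coordinates, and the assumption that the reference segment spans 1 cm are all illustrative, since the physical distance between the reference dots is not stated in the text.

```python
import math


def polyline_length(points):
    """Sum of Euclidean distances between consecutive (x, y) points, in pixels."""
    return sum(math.dist(points[k], points[k + 1])
               for k in range(len(points) - 1))


def pixels_to_cm(length_px, ref_start, ref_end, ref_cm=1.0):
    """Convert a pixel length to cm using a reference segment of known length.

    ref_cm is the physical length of the reference segment (assumed 1 cm).
    """
    px_per_cm = math.dist(ref_start, ref_end) / ref_cm
    return length_px / px_per_cm


# Hypothetical caliper line made of three points.
caliper = [(0, 0), (3, 4), (6, 8)]
n = polyline_length(caliper)

# Hypothetical calibration segment: 5 pixels correspond to 1 cm.
length_cm = pixels_to_cm(n, (0, 0), (0, 5))
```

Because the calibration factor is recomputed from each image's own reference segment, the conversion stays valid even when the shooting distance differs between images.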

Figure 7.

UCI feature extraction; (a). Original USG doppler image; (b). ROI segmentation result; (c). red_area segmentation result; (d). blue_area segmentation result; (e). Final result for UCI image and value.
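Once the number of red and blue objects and the calibrated cord length are known, the UCI value and its category follow directly from the rules in this section. The sketch below is a minimal illustration: the function names and the 7.5 cm cord length are hypothetical, while the 0.21 and 0.59 cut-offs and the use of the smaller object count come from the text.

```python
def coil_count(red_area, blue_area):
    """One coil is a red/blue pair, so use the smaller of the two counts."""
    return min(red_area, blue_area)


def uci(coils, cord_length_cm):
    """Umbilical coiling index: number of coils divided by cord length."""
    return coils / cord_length_cm


def classify_uci(value, low=0.21, high=0.59):
    """Map a UCI value to its category using the thresholds in the text."""
    if value < low:
        return "Hypocoiled"
    if value > high:
        return "Hypercoiled"
    return "Normocoiled"


# Three red and three blue objects, as in Figure 7, on a hypothetical
# calibrated cord length of 7.5 cm.
coils = coil_count(3, 3)
value = uci(coils, 7.5)
category = classify_uci(value)
print(coils, value, category)
```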

Synthetic Minority Oversampling Technique (SMOTE)

The SMOTE algorithm was first proposed by Chawla et al.,²⁴ who combined oversampling of the minority class with undersampling of the majority class. The algorithm takes samples from the minority class and generates synthetic data between each sample and its k nearest minority-class neighbors. Oversampling adds new data to a class by resampling the minority class. Conversely, undersampling reduces the data in a class until the classes are balanced. With the SMOTE oversampling approach, data are added to the minority class until the desired ratio is reached. Neighbors are selected at random from the k nearest neighbors, and k = 5 is commonly used. A synthetic sample is created by taking the difference between the selected feature vector and one of its nearest neighbors, multiplying this difference by a random number between 0 and 1, and adding the result to the selected feature vector. The flow chart of the SMOTE algorithm is shown in Figure 8.

Figure 8.

SMOTE algorithm workflow diagram.
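The interpolation step can be written as $x_{new} = x + \delta\,(x_{nn} - x)$ with $\delta$ drawn uniformly from $[0, 1)$. The sketch below implements that step in plain Python as an illustration only; the function name, the fixed random seed, and the two-dimensional sample points are assumptions, and a maintained library implementation would normally be used in practice.

```python
import math
import random


def smote(minority, rate_percent, k=5, seed=0):
    """Generate rate_percent / 100 synthetic samples per minority sample."""
    rng = random.Random(seed)
    per_sample = rate_percent // 100
    synthetic = []
    for i, x in enumerate(minority):
        # k nearest minority-class neighbours of x, excluding x itself.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: math.dist(x, m))
        neighbours = others[:k]
        for _ in range(per_sample):
            nn = rng.choice(neighbours)
            gap = rng.random()  # random number in [0, 1)
            # New point on the segment between x and the chosen neighbour.
            synthetic.append([a + gap * (b - a) for a, b in zip(x, nn)])
    return synthetic


# Nine hypothetical 2-D feature vectors standing in for the nine
# Hypercoiling images.
minority = [[float(i), float(i % 3)] for i in range(9)]
new_400 = smote(minority, 400)
new_500 = smote(minority, 500)
print(len(minority) + len(new_400), len(minority) + len(new_500))
```

Under this reading of the sampling rate, 400% turns 9 samples into 45 and 500% into 54, which is consistent with the Hypercoiling row totals in the confusion matrices of Tables 8 and 9.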

Result and Discussion

Experiment on Single Classifier

In this section, the umbilical cord feature extraction results serve as input in developing a predictive machine learning model. The total number of features generated from the extraction process is 353, consisting of 88 and 264 GLCM texture features for the gray and RGB images, respectively, and the UCI feature. The first experiment evaluates feature selection using the Information Gain method. This experiment aims to determine the features that affect the performance of the machine learning model. Information Gain measures the relevance or influence of a feature on the results by measuring the reduction in entropy before and after a split, which allows the feature dimension to be reduced. The 353 features are ranked by their information gain value using the Sklearn Python library.

Table 2 shows the five features with the maximum gain value. The model is validated by dividing the data into training and test data using k-fold cross-validation, which separates the dataset into k subsets and iteratively uses each subset as test data and the remainder as training data. In this study, the number of folds is 3 because the Hypercoiling class data is very limited: with a larger number of folds, there is a risk that several folds of the original data do not contain this minority class.
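To make the entropy-reduction idea concrete, the sketch below computes the information gain of a single numeric feature for one threshold split. It is a simplified illustration and not the routine used in the study, which relied on the Sklearn library; the function names, the sample UCI values, and the chosen threshold are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting a feature at threshold."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    # Weighted entropy of the two parts after the split.
    after = sum(len(part) / n * entropy(part)
                for part in (left, right) if part)
    return entropy(labels) - after


# Hypothetical UCI values for six images, two per class.
values = [0.10, 0.15, 0.30, 0.40, 0.70, 0.80]
labels = ["Hypo", "Hypo", "Normo", "Normo", "Hyper", "Hyper"]
gain = information_gain(values, labels, threshold=0.21)
print(round(gain, 4))
```

Here the split isolates the two Hypocoiled samples, so the entropy drops from $\log_2 3$ to $2/3$ bit; features are then ranked by how large this drop is.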

Table 2.

Top five features with the highest gain value.

No	Features	Gain Value
1	Maximum_Probability_Red_135	0.1098010
2	Umbilical Coiling Index (UCI)	0.0673471
3	Difference_Variance_Blue_135	0.0634676
4	Sum_Of_Squares_Gray_135	0.0613738
5	Maximum_Probability_Gray_135	0.0580884
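The choice of three folds can be checked with simple counting. The sketch below distributes sample indices over k folds class by class and reports how many Hypercoiling samples each fold receives. It assumes a stratified, round-robin assignment, which the text does not confirm; the function names are hypothetical, while the class sizes 108, 34, and 9 are those given in the Image Dataset section.

```python
def stratified_folds(labels, k):
    """Assign sample indices to k folds, class by class, in round-robin order."""
    folds = [[] for _ in range(k)]
    per_class = {}
    for idx, y in enumerate(labels):
        per_class.setdefault(y, []).append(idx)
    for indices in per_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def hypercoiling_per_fold(labels, k):
    """Number of Hypercoiling samples that land in each of the k folds."""
    return [sum(labels[i] == "Hypercoiling" for i in fold)
            for fold in stratified_folds(labels, k)]


labels = (["Normocoiling"] * 108 + ["Hypocoiling"] * 34
          + ["Hypercoiling"] * 9)
print(hypercoiling_per_fold(labels, 3))   # [3, 3, 3]
print(hypercoiling_per_fold(labels, 10))  # [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
```

With three folds every test fold contains three Hypercoiling images, whereas with ten folds at least one fold has none even under ideal stratification, so per-class recall cannot be computed on it.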

After the feature selection process, the dataset is oversampled on the minority class, namely, the Hypercoiling category. Sampling rates from 100% to 500% were tested for the minority class to determine the appropriate sampling rate for the umbilical cord dataset. SVM, Naïve Bayes, Random Forest, KNN, Multilayer perceptron, and Decision tree (C4.5) classification methods were observed by analyzing the accuracy, recall, precision, and F-measure performance. The Random Forest uses 10 estimators (n_estimators) with the “entropy” criterion, an unlimited maximum tree depth, and a minimum of 2 samples to split a node.

Furthermore, model validation uses the k-fold cross-validation scheme with 3 folds. The performance of the Random Forest method in this experiment is shown in Table 3. In this second experiment, the Random Forest method performs best on the feature-selected dataset combined with 400% SMOTE, with an average accuracy, precision, recall, and F-measure of 96%, 95.3%, 96.3%, and 96%, respectively. In the subsequent experiment on the KNN method, the number of neighbors is five and the distance metric is the Euclidean distance. The performance of the KNN method on the fetal umbilical cord dataset is shown in Table 4. The KNN method failed to achieve good outcomes on the datasets without feature selection, where the accuracy, recall, precision, and F-measure values are all less than 55%. These results indicate that the KNN method could not achieve satisfactory results when faced with highly imbalanced data and a high feature dimension. This method uses the Euclidean distance to discern the data points, and the calculated distance becomes less discriminative with high-dimensional data and widely varying feature values. The results improve when feature selection and data oversampling are combined, with an average performance above 80%.

Table 3.

Performance of random forest method on fetal umbilical cord dataset.

Metric (average, %)	Random forest and SMOTE
	Without feature selection						Feature selection
	None	100 %	200 %	300 %	400 %	500 %	None	100 %	200 %	300 %	400 %	500 %
Accuracy	45.0	60.3	62.0	60.6	63.6	66.6	84.6	87.8	91.0	89.7	96.0	91.0
Precision	48.0	53.3	61.0	62.6	65.0	66.0	93.0	87.3	92.0	89.0	95.3	90.3
Recall	45.3	53.6	62.3	61.0	63.6	67.0	85.0	87.6	91.0	88.3	96.3	91.6
F-measure	45.6	53.6	61.6	61.6	64.0	66.3	88.3	87.6	91.6	88.6	96.0	91.0

Table 4.

Performance of the KNN method on the fetal umbilical cord dataset.

Metric (average, %)	KNN and SMOTE
	Without feature selection						Feature selection
	None	100 %	200 %	300 %	400 %	500 %	None	100 %	200 %	300 %	400 %	500 %
Accuracy	35.3	43.3	46.0	47.6	48.0	46.6	81.3	88.6	91.6	92.3	92.3	92.6
Precision	36.7	46.3	51.0	51.6	53.0	51.3	96.0	86.3	87.4	90.1	91.1	91.6
Recall	35.6	44.3	46.7	48.6	49.3	47.6	81.6	89.1	92.2	92.6	93.0	93.1
F-measure	35.6	44.6	48.3	49.3	50.3	48.3	86.0	87.6	89.2	91.3	92.1	92.3

The subsequent fetal umbilical cord classification test uses the Naïve Bayes method. The performance of Naïve Bayes on the data with and without feature selection and with SMOTE oversampling is shown in Table 5. The Naïve Bayes classification model tends to be less optimal when the feature dimension of the fetal umbilical cord data is higher. These results further indicate that the Naïve Bayes method requires additional processing to deal with the imbalanced fetal umbilical cord dataset. Its performance improves when the feature selection and data oversampling processes are included.

Table 5.

Performance of the Naïve Bayes method on the fetal umbilical cord dataset.

Metric (average, %)	Naïve Bayes and SMOTE
	Without feature selection						Feature selection
	None	100 %	200 %	300 %	400 %	500 %	None	100 %	200 %	300 %	400 %	500 %
Accuracy	50.7	62.4	60.8	62.4	62.3	62.4	80.2	81.3	85.2	85.1	77.9	84.2
Precision	42.1	50.3	50.6	52.6	53.6	54.2	70.6	74.3	79.3	81.6	83.6	84.6
Recall	50.6	62.6	61.2	62.6	62.6	62.6	82.1	81.3	85.1	85.1	84.5	83.6
F-measure	38.6	47.3	47.6	50.3	51.3	52.3	71.3	75.3	80.3	82.1	83.1	84.2

The next classification method is the Decision Tree (CART). Table 6 shows the Decision tree model performance on the umbilical cord dataset combined with the SMOTE method. The Decision tree method shows relatively good results, with an average accuracy greater than 80% on the dataset without feature selection and without SMOTE oversampling. This approach is effective for datasets with a high-dimensional feature space because it calculates the gain value to determine the root and nodes of the tree structure. In the second experiment, adding a feature selection process increased the performance of the Decision tree method. The best results are obtained on the SMOTE 500% dataset, with an average accuracy, precision, recall, and F-measure of 92.6%, 92.3%, 93.3%, and 92.3%, respectively. This also indicates that the Decision tree method, through the calculated gain value and the pruning concept, is appropriate for classifying data in both high and low dimensions. The tree structures formed with and without feature selection are relatively similar.

Table 6.

Performance of the Decision tree method on the fetal umbilical cord dataset.

Metric (average, %)	Decision tree and SMOTE
	Without feature selection						Feature selection
	None	100 %	200 %	300 %	400 %	500 %	None	100 %	200 %	300 %	400 %	500 %
Accuracy	82.6	84.2	89.4	89.2	88	90.3	87.2	90.6	92.1	90.1	93.3	92.6
Precision	79.3	82.6	89.3	88.6	87.6	91.3	84.2	85.3	92.1	88.3	91.2	92.3
Recall	83.3	84.3	89.3	89.6	88.6	91.1	87.3	91.2	92.1	90.3	93.3	93.3
F-measure	81.2	83.6	89.3	89.3	88.1	91.3	85.3	87.1	92.1	89.3	92.2	92.6

The final base classifier is the SVM. In this proposed model, the Multiclass SVM method is used to classify nonbinary classes, and the one-versus-all (OVA) approach is applied. The proposed model uses SVM with a Radial Basis Function (RBF) kernel to handle data distributions that are difficult to separate linearly. The overall performance of the SVM method in this experiment is shown in Table 7. In the experiment without feature selection, this method is unsatisfactory, with a classification performance of less than 40%. After feature selection, the SVM performance increases only modestly.

Table 7.

Performance of the SVM method on the fetal umbilical cord dataset.

Metric (average, %)	SVM and SMOTE
	Without feature selection						Feature selection
	None	100 %	200 %	300 %	400 %	500 %	None	100 %	200 %	300 %	400 %	500 %
Accuracy	33.3	33.3	33.3	33.3	33.3	33.3	33.3	38.9	53.6	57.6	57.6	59.2
Precision	24.2	22.6	21.3	20.3	19.3	18.3	24.4	56.3	48.3	49.6	49.3	50.6
Recall	33.3	33.3	33.3	33.3	33.3	33.3	33.3	39.5	53.6	58.2	57.6	59.3
F-measure	27.6	27.2	26.2	25.3	24.3	23.6	27.6	37.6	50.6	53.1	53.1	54.6

Experiment on Ensemble Multiclassifier Voting

The ensemble multiclassifier voting model consists of the five classifiers used in the previous experiment. SVM, Naïve Bayes, Random Forest, Decision Tree (CART), and K-Nearest Neighbors (KNN) methods are applied, with the parameters of each set according to the best results obtained earlier. The hard-voting method was used in this experiment, which means that each classifier has the same weight in predicting the class of each data item. As in the previous experiment, the model is evaluated before and after feature selection and with SMOTE oversampling, using three folds in the k-fold cross-validation method. The overall model performance of the Multiclassifier method is shown in Table 8. In the experiment without feature selection, this voting method showed its best performance with 400% SMOTE oversampling, with an average accuracy, precision, recall, and F-measure of 61%, 68.3%, 61%, and 63.6%, respectively. However, this first experiment showed that the multiclassifier model failed to categorize the Hypocoiling and Hypercoiling classes, as shown in the confusion matrix in Table 8.

Table 8.

Multiclass confusion matrix SMOTE 400% with multiclassifier voting and all features.

Umbilical cord multiclassifier SMOTE 400%		Output class			Total
		Normal	Hypocoiling	Hypercoiling
Target class	Normal	91	12	5	108
	Hypocoiling	23	11	0	34
	Hypercoiling	15	0	30	45
	Total	129	23	35

The failure on the minority classes is probably due to the high feature dimensionality and the small amount of training data. Therefore, a second experiment was carried out with feature selection to improve model performance. The feature selection method is the same as before: features are ranked by their gain value in the Decision Tree method, and the five features with the highest gain are retained. In this second experiment, the multiclassifier voting model on the dataset with SMOTE 500% and feature selection achieved the best performance, with an average accuracy, precision, recall, and F-measure of 95.2%, 93.6%, 93.3%, and 93.3%, respectively. The model’s ability to recognize the three classes improved significantly, as shown in the confusion matrix in Table 9.
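The gain-based feature ranking described above can be sketched as follows. This is a minimal illustration that assumes "gain" means entropy-based information gain over binary threshold splits; the exact criterion used in the experiments is not stated here. The feature names and values are hypothetical toy data.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_gain(values, labels):
    """Largest information gain over all binary threshold splits of one feature."""
    base, n = entropy(labels), len(labels)
    best = 0.0
    for t in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        split = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        best = max(best, base - split)
    return best

def top_k_features(table, labels, k):
    """Rank features by gain and keep the k highest."""
    gains = {name: best_gain(col, labels) for name, col in table.items()}
    return sorted(gains, key=gains.get, reverse=True)[:k]

# Toy data with hypothetical feature names and values.
labels = ["Hypo", "Hypo", "Normal", "Normal", "Hyper", "Hyper"]
table = {
    # Ordered by class, so one threshold isolates a whole class.
    "uci": [0.10, 0.12, 0.30, 0.32, 0.60, 0.65],
    # Classes are interleaved, so any split helps only a little.
    "glcm_contrast": [0.5, 0.9, 0.4, 0.8, 0.6, 0.7],
    # Constant feature: no split is possible, so the gain is zero.
    "glcm_energy": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
}
selected = top_k_features(table, labels, k=2)
```

With `k=5` on the full feature table, the same function would return the five highest-ranked features, as in the second experiment.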

Table 9.

Multiclass confusion matrix for multiclassifier voting with SMOTE 500% and feature selection.

Umbilical cord + SMOTE 500%		Output class			Total
		Normal	Hypocoiling	Hypercoiling
Target class	Normal	99	0	9	108
	Hypocoiling	2	32	0	34
	Hypercoiling	2	0	52	54
	Total	103	32	61
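To make the change between the two confusion matrices concrete, the sketch below computes per-class precision, recall, and F-measure directly from the counts in Tables 8 and 9. These values are computed from the pooled matrices, so they need not match the fold-averaged figures reported in the text.

```python
def per_class_metrics(matrix, labels):
    """Precision, recall, and F-measure per class.

    Rows are target classes and columns are output classes.
    """
    out = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        row_total = sum(matrix[i])                 # actual members of the class
        col_total = sum(row[i] for row in matrix)  # predictions of the class
        precision = tp / col_total if col_total else 0.0
        recall = tp / row_total if row_total else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        out[label] = (precision, recall, f1)
    return out

labels = ["Normal", "Hypocoiling", "Hypercoiling"]
table8 = [[91, 12, 5], [23, 11, 0], [15, 0, 30]]  # SMOTE 400%, all features
table9 = [[99, 0, 9], [2, 32, 0], [2, 0, 52]]     # SMOTE 500%, selected features

for name, cm in (("Table 8", table8), ("Table 9", table9)):
    print(name)
    for label, (p, r, f) in per_class_metrics(cm, labels).items():
        print(f"  {label:<13} precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

For the Hypocoiling class, recall rises from 11/34 in Table 8 to 32/34 in Table 9, which is the improvement on the minority classes that the text describes.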

Multiclassifier voting provided satisfactory results, although its improvement over several of the base classifiers in the previous experiments was small. The performance of the single-classifier and multiclassifier models for fetal umbilical cord classification, with and without feature selection and SMOTE, is compared in Tables 10, 11, 12, and 13. On the original data, that is, without feature selection or oversampling, the Decision Tree method achieved the best results, with an average accuracy, precision, recall, and F-measure of 82.6%, 79.3%, 83.3%, and 81.2%, respectively.

Table 10.

Performance of Multiclassifier voting with several SMOTE oversampling ratios and feature selection.

Metric (average)	Multiclassifier voting & SMOTE
	Without feature selection						Feature selection
SMOTE ratio	None	100%	200%	300%	400%	500%	None	100%	200%	300%	400%	500%
Accuracy	37.7	54.6	59.2	56.6	61.0	53.0	86.7	91.6	92.4	94.4	94.5	95.2
Precision	42.6	65.6	74.3	67.2	68.3	70.1	94.2	89.2	91.6	91.6	91.3	93.6
Recall	38.1	54.6	59.3	57.2	61.0	53.6	86.6	91.3	92.3	94.4	91.2	93.6
F-measure	37.3	58.3	63.6	59.1	63.6	55.1	89.6	90.3	92.2	92.3	93.6	93.6
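The SMOTE oversampling ratios in Table 10 can be read using the convention of Chawla et al. (reference 24), in which N% oversampling creates N/100 synthetic samples per minority sample. The following simplified sketch illustrates the idea; it is not the implementation used in the experiments. The sample values, `k`, and `seed` are hypothetical. The class size of nine is chosen only because it reproduces the Hypercoiling totals in Tables 8 and 9 (45 and 54); the original class size is not stated in this section.

```python
import math
import random

def smote(minority, percent, k=5, seed=0):
    """Create (percent / 100) synthetic points per minority sample.

    Each synthetic point lies on the segment between a sample and one of
    its k nearest minority-class neighbours.
    """
    rng = random.Random(seed)
    per_sample = percent // 100
    synthetic = []
    for i, x in enumerate(minority):
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        for _ in range(per_sample):
            nb = rng.choice(neighbours)
            gap = rng.random()  # position along the segment, in [0, 1)
            synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic

# Nine hypothetical two-feature minority samples.
minority = [[0.1 * i, 1.0 - 0.05 * i] for i in range(9)]
new_400 = smote(minority, 400)
new_500 = smote(minority, 500)
```

Under these assumptions, 400% adds 36 synthetic samples (45 in total) and 500% adds 45 (54 in total). Because every synthetic point is an interpolation between existing minority samples, the oversampled class stays within the region already covered by the original data.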

Table 11.

The best performance evaluation of the single-classifier and multiclassifier voting models on the original data.

Method	Average Accuracy (%)	Average precision (%)	Average Recall (%)	Average F-measure (%)
Random Forest	45.2	48.1	45.3	45.6
Decision Tree (CART)	82.6	79.3	83.3	81.2
KNN	35.3	36.6	35.6	35.3
Naïve Bayes	50.7	42.2	50.6	38.6
SVM	33.3	24.3	33.3	27.6
Multiclassifier	37.7	42.6	38.2	37.3

Table 12.

The best performance evaluation of the single-classifier and multiclassifier voting models on the umbilical cord dataset without feature selection + SMOTE.

Method	Average Accuracy (%)	Average precision (%)	Average Recall (%)	Average F-measure (%)
Random Forest	66.6	66.2	67.2	66.3
Decision Tree (CART)	90.3	91.3	91.3	91.2
KNN	47.6	51.6	48.6	49.3
Naïve Bayes	62.4	54.2	62.6	52.3
SVM	33.3	24.2	33.3	27.6
Multiclassifier	59.0	74.3	59.3	63.6

Table 13.

Evaluation of the best performance of the single classifier and multiclassifier voting models on the umbilical cord dataset with feature selection + SMOTE.

Method	Average Accuracy (%)	Average precision (%)	Average Recall (%)	Average F-measure (%)
Random Forest	96.1	95.3	96.3	96.1
Decision Tree (CART)	92.6	92.3	93.3	92.6
KNN	92.6	91.6	93.2	92.3
Naïve Bayes	84.0	84.6	83.6	84.2
SVM	59.2	50.6	59.3	54.6
Multiclassifier	94.5	93.3	94.6	93.6

The final evaluation uses the dataset with both feature selection and SMOTE oversampling. Overall, the methods show improved performance compared with the previous experiments. The SVM method is the exception: it benefits far less from feature selection and data oversampling than the other methods. This is probably because the SVM algorithm is not suitable for large data sets and does not perform well when the data are noisy, that is, when the target classes overlap. Methods based on ensemble learning, such as Random Forest and multiclassifier voting, showed the most significant improvement. These results indicate that ensemble learning combined with SMOTE oversampling succeeded in overcoming the imbalanced data problem in this study’s umbilical cord dataset.

Evaluation of the combination of features

In this section, the combination of UCI and texture features is tested to determine its effectiveness. Feature selection and oversampling are also included in the evaluation. The classification method is ensemble multiclassifier voting, validated with three-fold cross-validation. Table 14 compares the multiclassifier voting performance across the feature combinations. According to the table, multiclassifier voting performs better with the combination of texture and UCI features. Adding feature selection and data oversampling further improves the classification model, which achieves its best results with 94.5% accuracy, 93.3% precision, 94.6% recall, and a 93.6% F1 measure.

Table 14.

Performance comparison of texture and UCI feature combinations.

	Performance measurement parameters
Dataset	Accuracy (%)	Precision (%)	Recall (%)	F1 measure (%)
Texture	65.3	37.3	38.7	66.8
Texture & UCI	37.8	42.7	38.0	37.3
Texture & UCI + Feature Selection	86.8	94.0	86.7	89.7
UCI	92.7	86.0	71.0	75.0
UCI + SMOTE	92.6	93.0	91.1	92.0
Texture & UCI + Feature Selection + SMOTE	94.5	93.3	94.6	93.6
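The three-fold cross-validation used throughout these evaluations can be sketched as below. The sketch assumes stratified folds, meaning each class is spread evenly across folds; the text does not say whether stratification was used. The class sizes are the row totals of Table 9, and the classifier fitting step is left as a comment.

```python
import random
from collections import defaultdict

def stratified_folds(labels, n_folds=3, seed=0):
    """Assign each sample index to a fold, spreading every class evenly."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds

# Class sizes taken from the row totals of Table 9.
labels = ["Normal"] * 108 + ["Hypocoiling"] * 34 + ["Hypercoiling"] * 54
folds = stratified_folds(labels)

for k, test_idx in enumerate(folds):
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    # A classifier would be fit on train_idx and scored on test_idx here;
    # the reported metric is then the mean of the three fold scores.
    print(f"fold {k}: train={len(train_idx)} test={len(test_idx)}")
```

Each sample appears in exactly one test fold, so every image contributes once to the averaged accuracy, precision, recall, and F-measure.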

Conclusion

This study proposes a machine learning model for classifying fetal umbilical cord images from 2-D Doppler ultrasound. The experiments show that the classification models were unable to produce satisfactory performance on the high-dimensional feature set. However, after feature selection and data oversampling were added, the performance of each classification method increased significantly. Across the tests carried out before and after feature selection and data oversampling, methods based on ensemble learning, namely Random Forest and multiclassifier voting, gave the best classification results for the fetal umbilical cord. Random Forest achieved an average accuracy, precision, recall, and F-measure of 96%, 95.3%, 96.3%, and 96%, respectively.

Furthermore, the multiclassifier voting method achieved an average accuracy, precision, recall, and F-measure of 94.5%, 93.3%, 94.6%, and 93.6%, respectively. The results show that the combination of UCI and GLCM features can provide very satisfactory performance. In the future, the proposed model can serve as a reference for prototyping artificial intelligence-based ultrasound machines that provide supporting information to obstetricians when diagnosing umbilical cord conditions that require prompt and appropriate medical action.

Footnotes

Acknowledgements

The authors thank the Director of Research and Community Service (DPRM) Indonesia for funding this research through the 2021 Doctoral Dissertation Research scheme.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Director of Research and Community Service (DPRM) Indonesia through the 2021 Doctoral Dissertation Research scheme.

ORCID iDs

Gede A. Pradipta

Retantyo Wardoyo

References

1. Gupta, Faridi, Krishnan. Umbilical coiling index. J Obstet Gynecol India 2006; 56: 315–319.

2. Bosselmann, Mielke. Sonographic assessment of the umbilical cord. Donald Sch J Ultrasound Obstet Gynecol 2012; 6: 66–75.

3. Ndolo, Vinayak, Silaba, et al. Antenatal umbilical coiling index and newborn outcomes: cohort study. J Clin Imaging Sci 2017; 7: 21.

4. Kulamani, Ashima, Pramod, et al. Evaluation of umbilical coiling index as a predictor of pregnancy outcome. Int J Health Sci (Qassim) 2015; 5: 9.

5. De Laat MWM, Franx, Van Alderen, et al. The umbilical coiling index, a review of the literature. J Matern Neonatal Med 2005; 17: 93–100.

6. Chowdhury. Evaluation of umbilical coiling index as an indicator of perinatal outcome. Int J Med Heal Res 2018; 5: 208–210.

7. Strong, Jarles, Vega. The umbilical coiling index. Am J Obstet Gynecol 1994; 170: 29–32.

8. Predanic. Sonographic assessment of the umbilical cord. Int J Contin Educ Curr Aware ISSN 2005; 5: 105–110.

9. Chitra, Sushanth, Raghavan. Umbilical coiling index as a marker of perinatal outcome: an analytical study. Obstet Gynecol Int 2012; 2012: 1–6.

10. Rana, Ebert, Kappy. Adverse perinatal outcome in patients with an abnormal umbilical coiling index. Obstet Gynecol 1995; 85: 573–577.

11. Olaya-C, Gil, Salcedo, et al. Anatomical pathology of the umbilical cord and its maternal and fetal clinical associations in 434 newborns. Pediatr Dev Pathol 2018; 21: 1–8.

12. Mcclennen, Chamchad, et al. Hypercoiling of the umbilical cord in uncomplicated singleton pregnancies. J Perinat Med 2017; 46: 1–6.

13. Cicero, D’angelo, Racchiusa, et al. Antenatal umbilical coiling index and newborn outcomes: cohort study. J Clin Imaging Sci 2018; 7: 1–6.

14. Ayu PDW, Hartati, Musdholifah, et al. Amniotic fluid segmentation based on pixel classification using local window information and distance angle pixel. Appl Soft Comput 2021; 107: 107196.

15. Acharya, EYK, Tan. Thermography based breast cancer detection using texture features. J Med Syst 2012; 36: 1503–1510.

16. Fajrin, Nugroho, Soesanti. Prosiding SNST ke-6 Tahun 2015 fakultas teknik universitas wahid hasyim semarang 47. Pros SNST 2015; 1: 47–52.

17. Beura, Majhi, Dash. Mammogram classification using two dimensional discrete wavelet transform and gray-level co-occurrence matrix for detection of breast cancer. Neurocomputing 2015; 154: 1–14.

18. Mohanty, Rup, Dash, et al. Digital mammogram classification using 2D-BDWT and GLCM features with FOA-based feature selection approach. Neural Comput Appl 2019; 32: 7029–7043.

19. George. The using of gray level co-occurrence matrix for features extraction of the breast cancer biopsy image (GLCM). Int J Eng Res Sci Technol 2018; 5: 8–12.

20. Belsare, Mushrif, Pangarkar, et al. Classification of breast cancer histopathology images using texture feature analysis. IEEE Region 10 Annu Int Conf Proceedings/TENCON 2015; 12: e0177544. DOI: 10.1109/TENCON.2016.7372809

21. Sultan, Tian, et al. Machine learning for diagnostic ultrasound of triple-negative breast cancer. Breast Cancer Res Treat 2018; 2: 365–373.

22. Pradipta, Wardoyo, Musdholifah, et al. Improving classification performance of fetal umbilical cord using combination of SMOTE method and multiclassifier voting in imbalanced data and small dataset. Int J Intell Eng Syst 2020; 13: 441–454.

23. Vidya, EYK, Acharya, et al. Computer-aided diagnosis of myocardial infarction using ultrasound images with DWT, GLCM and HOS methods: a comparative study. Comput Biol Med 2015; 62: 86–93.

24. Chawla, Bowyer, Hall, et al. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 2002; 16: 321–357.