Abstract
Prostate cancer is the second leading cause of cancer deaths among men. Early detection can effectively reduce the mortality caused by prostate cancer. The high, multiresolution nature of prostate MRIs requires proper diagnostic systems and tools. In the past, researchers developed computer-aided diagnosis (CAD) systems that help the radiologist detect abnormalities. In this research paper, we have employed novel machine learning techniques such as the Bayesian approach, support vector machine (SVM) kernels (polynomial, radial base function (RBF) and Gaussian) and Decision Tree for detecting prostate cancer. Moreover, different feature-extraction strategies are proposed to improve the detection performance. The feature-extraction strategies are based on texture, morphological, scale-invariant feature transform (SIFT), and elliptic Fourier descriptor (EFD) features. The performance was evaluated using single features as well as combinations of features with the machine learning classification techniques. Cross validation (Jack-knife k-fold) was performed, and performance was evaluated in terms of the receiver operating characteristic (ROC) curve, specificity, sensitivity, positive predictive value (PPV), negative predictive value (NPV) and false positive rate (FPR). Based on single feature-extraction strategies, the SVM Gaussian kernel gives the highest accuracy of 98.34% with an AUC of 0.999, while, using combinations of feature-extraction strategies, SVM Gaussian kernel with texture
Keywords
Introduction
Prostate cancer is the most commonly diagnosed cancer among men and remains the second leading cause of cancer deaths in men globally. In 2017, there will be 161,360 new cases of prostate cancer, and approximately 26,730 men will die from prostate cancer in the United States [1]. In 2013, there were approximately 240,000 and 40,000 cases of prostate cancer reported in the USA and UK respectively, and the number is estimated to reach 1.7 million cases globally by 2030 [2]. The incidence of prostate cancer varies worldwide, with the highest rates found in the United States, Canada, and Scandinavia, and the lowest rates found in China and the rest of Asia. The risk of developing prostate cancer is related to advancing age, African American ethnicity, and a positive family history, and might be influenced by diet and other factors [3]. Detecting prostate cancer at an early stage can help nine out of ten men survive beyond five years. However, early detection of prostate cancer remains a source of controversy and uncertainty [4]. The detection of prostate cancer at an early stage is crucial to increase the likelihood of successful treatment. Conventional prostate cancer detection uses digital rectal examinations and serum prostate-specific antigen levels [5]. Brachytherapy represents one of the oldest techniques of radiation therapy for prostate cancer [6]. The techniques for transperineal permanent prostate brachytherapy are relatively modern, have been developed within the past years, and the selected cohorts of this study represent some of the largest series with longer follow-up [7].
Medical imaging has gained much importance within the last few decades, especially in analyzing different body parts [8]. Researchers have developed different clinical diagnostic tools, such as digital rectal examination (DRE), prostate-specific antigen (PSA), transrectal ultrasound (TRUS) and biopsy tests, which are the most widely used for detecting prostate cancer, though they do not always yield accurate results [9]. Schröder et al. [10] employed PSA to detect prostate cancer and reduced the death rate by 20%, but the benefit was associated with overtreatment and overdiagnosis. It was observed that the PSA test could not predict cancer aggressiveness. Thus, non-aggressive and slow-growing prostate cancer is frequently diagnosed in older patients [11]. Moreover, TRUS-guided biopsy did not detect all clinical cancers [4].
However, since the accuracy of TRUS is limited, magnetic resonance imaging (MRI) has been proposed as an alternative to TRUS because of its superior soft-tissue imaging capabilities [12, 13]. Many studies have demonstrated that MRI offers higher resolution to help detect smaller volumes of prostate cancer with higher accuracy than TRUS [12], and it may be considered a promising technique for prostate cancer localization. Prostate MRI has become an increasingly common adjunctive procedure in the detection of prostate cancer [14]. The development of computer-aided diagnosis (CADx) tools for MRI of the prostate holds great promise for improved detection and characterization of prostate cancer [15]. The texture of an image characterizes the appearance, structure and arrangement of the parts of an object within the image [15], and morphological analysis of medical images is used in many research and clinical studies that investigate the effects of diseases and treatments on anatomical structure [16].
In previous studies, researchers extracted different features from MRI images to detect prostate cancer. Perez et al. extracted texture features using Gabor filters, the grey-level co-occurrence matrix (GLCM), local binary patterns (LBP), the Haar transform, and Hu moments together with statistical features to classify prostate cancer. A high performance with AUC values of 0.81 to 0.85 was obtained with the union of texture features from the parametric maps [17]. Han et al. proposed a new prostate detection method using multiresolution autocorrelation texture features and clinical features such as the location and shape of the tumor. The cancerous tissues were detected efficiently with high specificity (about 90–95%) and high sensitivity (about 92–96%), measured by the number of correctly classified pixels; a support vector machine (SVM) was used to classify tissues based on the texture features [18]. De Rooij et al. proposed a hybrid morphological-textural model in which different texture and morphological features extracted from MRI are combined for classification, and obtained improved results with respect to specificity and sensitivity [19]. Doyle et al. [20] extracted texture features, first-order statistics, co-occurrence features and wavelet features to perform pixel-wise Bayesian classification at each image scale to obtain corresponding likelihood scenes. They applied the AdaBoost algorithm to combine the most discriminating features and found an overall classification accuracy of 88%. Daliri [21] extracted scale-invariant feature transform (SIFT) features from MRI, classified disease (Alzheimer's) using the SVM algorithm, and obtained 86% accuracy. Sahrim et al. [22] performed image analysis with a boundary description derived using Fourier descriptors to detect the presence of Alzheimer's disease, showing how image analysis using EFDs can be used to detect diseases.
The existing techniques have some limitations, as only a few feature-extraction strategies were employed, which may not properly capture the valuable information in prostate MRIs. Moreover, much of the knowledge hidden within the MR images can be extracted using complexity-based sample entropy and wavelet entropy features, because the complexity of healthy subjects is greater than that of pathological subjects and is reduced by the degradation of structural and functional coupling. Thus, these features offer important information to differentiate the normal and cancer subjects. In this study, different feature-extraction strategies such as texture, morphological, sample entropy, wavelet entropy, SIFT, and EFD features are proposed to extract the valuable information from the prostate cancer MRIs, which is then used as input to machine learning classifiers including the support vector machine (SVM) and its kernels, Decision Trees and the Bayesian approach.
Schematic diagram of machine learning (ML) classification techniques to classify the prostate cancer and brachytherapy subjects based on various feature-extraction strategies.
Fig. 1 shows the schematic diagram of the proposed system. In the first step, the images are taken as input from the relevant database. In the second step, features such as texture, morphological, scale-invariant feature transform (SIFT), elliptic Fourier descriptor (EFD) and entropy-based features are extracted. The extracted features (single and in different combinations) are then passed as input to machine learning (ML) classifiers such as SVM with polynomial, RBF and Gaussian kernels, Bayesian classifiers and Decision Tree classifiers. Finally, the training and target data split was made using Jack-knife 10-fold cross validation to classify the prostate and brachytherapy subjects.
Dataset
The dataset was taken from a publicly available database provided by Harvard University (National Center for Image Guided Therapy, Department of Radiology, Brigham and Women's Hospital, Harvard Medical School), funded by the National Institutes of Health, available at (
Features extraction strategies
The first and most important step in a classification problem is to extract and select the relevant features based on the type and characteristics of the problem. In the past, researchers extracted different features for classification and detection purposes. Rathore et al. [8, 23, 24] extracted geometric and hybrid features to detect and predict colon cancer. Moreover, Hussain et al. [25] extracted acoustic and Mel-frequency cepstral coefficient (MFCC) features for emotion recognition in human speech, geometric and texture features [26, 27] for detection and recognition of human faces, and complexity-based features [27, 28] for heart rate variability and to distinguish alcoholic and non-alcoholic subjects.
SVM (a) linear separation and (b) margin.
The morphology of tissue is important to determine whether tissues are normal or not. Morphological features are extracted from images by converting the morphology of the images into a set of quantitative values used in classification [29, 30, 31], segmentation [32] and so on. Shape-based features are most widely used to classify the masses present in medical images [33]. The shape-based (morphological) features extracted for binary images are: (a) Area (Ar), (b) Perimeter (PMT), (c) Maximum Radius (MAX_RD), (d) Minimum Radius (MIN_RD), (e) Eccentricity (ECTT), (f) Equivdiameter (EQDT), (g) Elongatedness (EGDN), (h) Entropy (ETP), (i) Circularity1 (CIR_1), (j) Circularity2 (CIR_2), (k) Compactness (CPN), (l) Dispersion (DPS), (m) Thinness Ratio (TN-R), (n) Standard Deviation of Image (SDV), (o) Standard Deviation of Edge (ESDV), (p) Shape Index (S-ID). The definitions and descriptions are taken from Surendiran and Vadivel [34] and Bresson and Chan [35].
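As an illustration, a few of the listed descriptors can be sketched in numpy as follows. This is not the paper's Matlab implementation: the function name, the 4-connectivity perimeter approximation and the circularity definition are our own assumptions, and the paper's exact formulas follow Surendiran and Vadivel [34].

```python
import numpy as np

def morph_features(mask):
    """Sketch of a few shape descriptors from a binary mask.
    Perimeter is approximated by counting object pixels that touch
    the background under 4-connectivity (a discrete approximation)."""
    mask = np.asarray(mask, dtype=bool)
    area = int(mask.sum())
    p = np.pad(mask, 1)                       # pad so border pixels behave uniformly
    core = p[1:-1, 1:-1]
    interior = p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    perimeter = int((core & ~interior).sum()) # object pixels with a background neighbour
    equiv_diameter = float(np.sqrt(4.0 * area / np.pi))       # EQDT
    circularity = float(4.0 * np.pi * area / perimeter ** 2)  # one common CIR definition
    return {"area": area, "perimeter": perimeter,
            "equiv_diameter": equiv_diameter, "circularity": circularity}
```

For example, a filled 5x5 square yields an area of 25 and a discrete perimeter of 16 boundary pixels.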
Texture features
In previous studies, texture features have been most widely used in solving classification problems [36, 37, 38], particularly to classify colon biopsies [39, 40]. The texture features are calculated from the grey-level co-occurrence matrix (GLCM), which captures the spatial relationship between the pixels of an image. Each entry (i, j) of the GLCM records how often a pixel with grey level i occurs in a given spatial relationship to a pixel with grey level j.
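As a sketch of this idea, a GLCM for a fixed offset and three common co-occurrence statistics can be computed as follows; the offset, the number of grey levels and the chosen statistics are illustrative assumptions, and the paper's full texture feature set may differ.

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=8):
    """Grey-level co-occurrence matrix: entry (i, j) counts how often
    grey level i co-occurs with grey level j at pixel offset (dy, dx),
    normalised to joint probabilities."""
    g = np.zeros((levels, levels), dtype=float)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                g[img[y, x], img[y2, x2]] += 1
    return g / g.sum()

def texture_features(g):
    """Three common statistics derived from a normalised GLCM."""
    i, j = np.indices(g.shape)
    return {
        "contrast":    float(((i - j) ** 2 * g).sum()),
        "energy":      float((g ** 2).sum()),
        "homogeneity": float((g / (1.0 + np.abs(i - j))).sum()),
    }
```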
Scale invariant feature transform (SIFT)
Lowe [42] proposed SIFT features, which have been used to analyze the problems of panorama reconstruction [43], face identification [44, 45, 46] and visual object tracking [47]. Due to their robustness to illumination changes, rotation, noise, scaling and blurring, SIFT features have been used in a wide area of research. These characteristics make SIFT features useful for the classification of prostate cancer samples. In the initial step of extracting SIFT features, the key points are localized in an image. The scale space is created by convolving the image with Gaussians, and neighbouring Gaussian-convolved images are subtracted to create the difference-of-Gaussian images. The original image is then downsampled by a factor of 2 after the differences at one scale are computed, and this process is repeated until the lowest possible scale is reached. A large number of key points is identified in this initial step, which is then reduced in the subsequent stages. In the second step, every pixel is compared with its 8 neighbours in its own scale and the 9 neighbours in the scale below and above it; after this step, the points whose values are smaller or greater than all the neighbouring pixels are retained. In the third step, the points which are poorly localized along edges or have poor contrast are discarded. Orientations and descriptors are assigned to the remaining key points.
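The first step described above (Gaussian scale space and difference of Gaussians) can be sketched in numpy as follows. The sigma values, kernel radius and function names are illustrative assumptions, not Lowe's exact parameters, and full SIFT adds octaves, extrema detection and descriptors.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur: 1-D convolution along rows, then columns."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    k /= k.sum()
    pad = np.pad(img, r, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, "valid"), 1, pad)
    return np.apply_along_axis(lambda v: np.convolve(v, k, "valid"), 0, rows)

def dog_pyramid(img, sigmas=(1.0, 1.6, 2.56)):
    """Difference-of-Gaussian images: adjacent blurred copies subtracted."""
    blurred = [gaussian_blur(img, s) for s in sigmas]
    return [b2 - b1 for b1, b2 in zip(blurred, blurred[1:])]
```

A constant image produces (numerically) zero DoG response, since the normalised Gaussian blur leaves it unchanged.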
(a) Error on margin using slack variable, (b, c) SVM non-linear separation.
The EFD features are useful for discriminating images containing elliptic shapes. EFD features were introduced in 1982 by Kuhl and Giardina [48] to classify solid objects such as cars, boxes, etc. These features have also been widely used in pattern recognition systems [49, 50]. Computing the EFD features requires two stages. In the initial stage, the elliptic objects are recognized in the white clusters of the images. In the second stage, the elliptic objects are sorted on the basis of their area, and the EFDs of the top L objects are computed up to the desired level X. EFDs are based on the chain code, approximating the shape of a closed contour by a sequence of line segments in eight standardized directions, and are invariant to translation, dilation, rotation and the starting point of a contour. To extract the EFDs, H harmonic levels are used and four Fourier coefficients i.e.
where
The final Fourier feature vector is obtained by combining the above average vectors
where
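The Kuhl-Giardina coefficients (a_n, b_n, c_n, d_n) can be computed from a closed contour as in the following sketch; the function name and default harmonic order are our own choices, and normalisation for rotation and starting point is omitted.

```python
import numpy as np

def elliptic_fourier_descriptors(contour, order=10):
    """Elliptic Fourier coefficients of a closed contour given as an
    (N, 2) array of (x, y) points, following Kuhl and Giardina (1982).
    Returns an (order, 4) array of rows (a_n, b_n, c_n, d_n)."""
    d = np.diff(np.vstack([contour, contour[:1]]), axis=0)  # close the contour
    dt = np.sqrt((d ** 2).sum(axis=1))                      # segment lengths
    t = np.concatenate([[0.0], np.cumsum(dt)])              # cumulative arc length
    T = t[-1]
    coeffs = np.zeros((order, 4))
    for n in range(1, order + 1):
        c = 2.0 * np.pi * n / T
        cos_t, sin_t = np.cos(c * t), np.sin(c * t)
        k = T / (2.0 * n ** 2 * np.pi ** 2)
        coeffs[n - 1, 0] = k * np.sum(d[:, 0] / dt * (cos_t[1:] - cos_t[:-1]))  # a_n
        coeffs[n - 1, 1] = k * np.sum(d[:, 0] / dt * (sin_t[1:] - sin_t[:-1]))  # b_n
        coeffs[n - 1, 2] = k * np.sum(d[:, 1] / dt * (cos_t[1:] - cos_t[:-1]))  # c_n
        coeffs[n - 1, 3] = k * np.sum(d[:, 1] / dt * (sin_t[1:] - sin_t[:-1]))  # d_n
    return coeffs
```

For a unit circle sampled densely, the first harmonic dominates with a_1 and d_1 close to 1, as expected for an ellipse with both semi-axes equal to the radius.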
Biological signals are the output of multiple interacting components of biological systems, which exhibit complex rhythmic patterns. These rhythms and patterns are altered by malfunctions in structural components and reduced interactions in coupling functions. The changes in pattern contain very useful information for understanding the underlying dynamics of these systems, which can be extracted in the form of complexity measures computed using information-theoretic approaches. Recently, researchers have used complexity-based measures [51, 52, 53, 54, 55, 56] and wavelet packet entropy [57, 58] methods to quantify and analyse the dynamics of physiological systems. In this study, the entropy features are computed by calculating the sample entropy and wavelet entropy measures, such as Shannon, norm, threshold, sure and log-energy entropy, to extract the useful information hidden in the MRIs of prostate cancer subjects.
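A common simplified implementation of sample entropy, one of the complexity measures named above, is sketched here; the template-counting details (and the default m and r) are our assumptions and differ slightly from some published variants.

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """Sample entropy: negative log of the conditional probability that
    sequences matching for m points (within tolerance r * std) also
    match for m + 1 points. Self-matches are excluded."""
    x = np.asarray(x, dtype=float)
    tol = r * x.std()

    def match_count(mm):
        templates = np.array([x[i:i + mm] for i in range(len(x) - mm)])
        c = 0
        for i in range(len(templates)):
            # Chebyshev distance to all later templates (no self-matches)
            dist = np.abs(templates[i + 1:] - templates[i]).max(axis=1)
            c += int((dist <= tol).sum())
        return c

    B, A = match_count(m), match_count(m + 1)
    return float(-np.log(A / B))
```

A regular signal such as a sine wave yields a low value, while white noise yields a markedly higher one, matching the text's point that reduced complexity distinguishes pathological from healthy dynamics.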
Classification
Classification is a process of categorization in which objects and ideas are distinguished, recognized and understood. Based on the extracted features, the accuracy and other performance evaluation parameters are estimated using the trained model: the known label of each test sample is compared with the result predicted by the model. 10-fold cross validation was used for training, testing and validation purposes.
Support vector machine (SVM)
Among supervised learning methods, one of the most robust and generalizable classifiers is the SVM, which is widely used for classification problems. SVM is used in many applications such as pattern recognition [59], medical diagnosis [60, 61] and machine learning [62]. Currently, SVM is used in a variety of applications such as text recognition, speech recognition, emotion recognition, facial expression recognition, content-based image retrieval, biometrics, etc. SVM constructs a hyperplane or a set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification; a good separation is achieved by the hyperplane that has the largest distance to the nearest training data point of any class (also known as the functional margin). Generally, a larger margin indicates that the classifier exhibits a lower generalization error. The SVM training procedure finds the hyperplane that gives the largest minimum distance to the training examples; in SVM theory this distance is known as the margin, and the optimal hyperplane is the one that maximizes it. Another important property of SVM is its strong generalization performance. SVM is basically a two-class classifier which, for non-linear training data, maps the data into a higher-dimensional space where it can be separated by a hyperplane.
Consider a hyperplane w·x + b = 0 separating the two classes, so that for training samples (x_i, y_i) with y_i ∈ {+1, −1}: w·x_i + b ≥ +1 for y_i = +1 and w·x_i + b ≤ −1 for y_i = −1, where w is the normal vector to the hyperplane and b is the bias. Combining the inequalities as: y_i(w·x_i + b) ≥ 1 for all i.
When the data is not linearly separable, a slack variable ξ_i ≥ 0 is introduced for each sample, and the soft-margin problem becomes: minimize (1/2)‖w‖² + C Σ_i ξ_i, subject to y_i(w·x_i + b) ≥ 1 − ξ_i and ξ_i ≥ 0 for all i. Here the first term, the regularization term, gives SVM the ability to generalize well on sparse data, while the empirical risk is computed using the second term, which accounts for samples that are misclassified or lie within the margin; C controls the trade-off between the two.
The corresponding dual problem is to maximize Σ_i α_i − (1/2) Σ_i Σ_j α_i α_j y_i y_j (x_i·x_j), subject to 0 ≤ α_i ≤ C and Σ_i α_i y_i = 0, in which the α_i are the Lagrange multipliers; the training samples with α_i > 0 are the support vectors.
SVM for non-linearly separable data
The kernel trick is used to deal with data which is not linearly separable. In this case, a non-linear mapping transforms the input space into a higher-dimensional feature space, and the dot product between two vectors in the input space is replaced by a kernel function evaluated in the feature space. The most commonly used kernel functions are the polynomial, Gaussian and radial base function (RBF) kernels.
Mathematically, the kernels can be defined as:
SVM Polynomial Kernel: K(x_i, x_j) = (x_i·x_j + 1)^d
SVM Gaussian (RBF) kernel: K(x_i, x_j) = exp(−‖x_i − x_j‖² / (2σ²))
SVM Fine Gaussian (RBF) kernel: the same Gaussian kernel with a small kernel scale σ, which produces finely detailed decision boundaries
where d is the degree of the polynomial and σ is the kernel width (scale) parameter.
With a kernel, the dual problem keeps the same form with the dot product replaced by the kernel: maximize Σ_i α_i − (1/2) Σ_i Σ_j α_i α_j y_i y_j K(x_i, x_j), subject to 0 ≤ α_i ≤ C and Σ_i α_i y_i = 0.
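The kernel functions themselves are simple to write down; the following numpy sketch shows the polynomial and Gaussian (RBF) forms and a Gram-matrix builder (function names and default parameters are our own, and the offset c = 1 in the polynomial kernel is one common convention).

```python
import numpy as np

def polynomial_kernel(x, y, degree=3, c=1.0):
    """K(x, y) = (x . y + c)^d"""
    return float((np.dot(x, y) + c) ** degree)

def gaussian_kernel(x, y, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / (2 sigma^2))"""
    diff = np.asarray(x) - np.asarray(y)
    return float(np.exp(-np.sum(diff ** 2) / (2.0 * sigma ** 2)))

def gram_matrix(X, kernel):
    """Kernel (Gram) matrix K[i, j] = kernel(x_i, x_j) over a list of vectors."""
    n = len(X)
    return np.array([[kernel(X[i], X[j]) for j in range(n)] for i in range(n)])
```

Note that the Gaussian kernel of any vector with itself is 1, which is why the diagonal of its Gram matrix is all ones.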
The performance of the SVM classifier depends on several parameters. The grid search method was used to select the optimal parameter values, with the grid range and step size of the optimization parameters set carefully. The linear kernel involves only one parameter (‘
In a Decision Tree, the similarities in the dataset are examined and the data are accordingly classified into distinct classes. DTs were used by [63] for classifying data based on the choice of the attribute that maximizes the separation of the data. The attributes are split into several branches until the termination criterion is met. Mathematically, the following equations are used to construct a DT algorithm: the entropy of a set S is Entropy(S) = −Σ_i p_i log₂(p_i), and the information gain of an attribute A is Gain(S, A) = Entropy(S) − Σ_v (|S_v|/|S|) Entropy(S_v), where p_i is the proportion of samples in S belonging to class i, and S_v is the subset of S for which attribute A takes the value v.
The purpose of DTs is to forecast the observations of
where
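The entropy and information-gain computations used to grow such a tree can be sketched as follows (a minimal illustration of the splitting criterion, not the paper's implementation; binary splits are assumed).

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a class-label array, in bits."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(labels, split_mask):
    """Entropy reduction from splitting the labels by a boolean mask
    (True -> left branch, False -> right branch)."""
    n = len(labels)
    left, right = labels[split_mask], labels[~split_mask]
    children = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - children
```

A split that perfectly separates two balanced classes yields a gain of exactly one bit, the maximum possible for a binary problem.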
In machine learning, the naïve Bayes (NB) [64] classifier belongs to the family of probabilistic classifiers based on Bayes' theorem with strong independence assumptions between the features. NB is very popular in classification tasks [65] and has been studied extensively since the 1950s. Due to its good behaviour [66], NB is extensively used in recent developments [67, 68, 69, 70, 71] that try to improve NB performance. Moreover, during the learning process, NB requires a number of parameters linear in the number of features and is efficiently trained in a supervised learning setting; the maximum likelihood method is used for parameter estimation. NB is a conditional probability model computed using Bayes' theorem: given a problem instance to be classified, represented by a vector x = (x₁, …, xₙ) of n feature values, it assigns probabilities P(C_k | x₁, …, xₙ) for each of the K possible classes C_k.
Bayes' theorem is mathematically expressed as: P(C_k | x) = P(C_k) P(x | C_k) / P(x), where P(C_k) is the prior probability of the class, P(x | C_k) is the likelihood and P(C_k | x) is the posterior probability, and where the evidence P(x) = Σ_k P(C_k) P(x | C_k) is a scaling factor that depends only on the feature values and is therefore constant across classes.
NB has complexity O(tn) to induce the classifier over a dataset having t training instances and n attributes.
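A minimal NB classifier under one common instantiation of the model, Gaussian per-class likelihoods, is sketched below; the paper does not specify its likelihood form, so the Gaussian choice, class name and variance floor are our assumptions.

```python
import numpy as np

class GaussianNB:
    """Naive Bayes with per-class Gaussian likelihoods, fitted by maximum
    likelihood (class means, variances and priors)."""

    def fit(self, X, y):
        self.classes = np.unique(y)
        self.mu = np.array([X[y == c].mean(axis=0) for c in self.classes])
        # small floor keeps zero-variance features from dividing by zero
        self.var = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes])
        self.logprior = np.log([np.mean(y == c) for c in self.classes])
        return self

    def predict(self, X):
        # log P(C_k) + sum_i log N(x_i | mu_ki, var_ki), argmax over classes k
        ll = -0.5 * (np.log(2 * np.pi * self.var)[None]
                     + (X[:, None, :] - self.mu[None]) ** 2 / self.var[None]).sum(-1)
        return self.classes[np.argmax(ll + self.logprior, axis=1)]
```

Training touches each of the t samples and n features once per statistic, consistent with the O(tn) induction cost stated above.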
The performance of the ML classifiers in detecting prostate cancer was measured by computing the sensitivity, specificity, PPV, NPV and total accuracy.
Confusion matrix:
Sensitivity
The sensitivity measures the proportion of people who test positive for the disease among those who actually have the disease. Mathematically, it is expressed as: Sensitivity = TP / (TP + FN),
i.e. the probability of a positive test given that the patient has the disease.
Specificity measures the proportion of negatives that are correctly identified. Mathematically, it is expressed as: Specificity = TN / (TN + FP),
i.e. the probability of a negative test given that the patient is well.
PPV is mathematically expressed as: PPV = TP / (TP + FP),
where TP denotes the event that the test makes a positive prediction and the subject has a positive result under the gold standard, while FP is the event that the test makes a positive prediction and the subject has a negative result.
The total accuracy is computed as: TA = (TP + TN) / (TP + TN + FP + FN).
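All of these measures follow directly from the four confusion-matrix counts, as in this short sketch (the function name and dictionary layout are our own):

```python
def classification_metrics(tp, tn, fp, fn):
    """Performance measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),          # true positive rate
        "specificity": tn / (tn + fp),          # true negative rate
        "ppv":         tp / (tp + fp),          # positive predictive value
        "npv":         tn / (tn + fn),          # negative predictive value
        "fpr":         fp / (fp + tn),          # 1 - specificity
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
    }
```

For example, counts of TP = 90, TN = 80, FP = 20, FN = 10 give a sensitivity of 0.90, a specificity of 0.80 and a total accuracy of 0.85.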
The Jack-knife k-fold cross validation technique was applied for training/testing data formulation and parameter optimization. In this research, 2-, 4-, 5- and 10-fold CVs were used to evaluate the performance of the classifiers for the different feature-extraction strategies. The highest performance was obtained using 10-fold CV, which is the most commonly used and well-established method for evaluating classifier performance. In 10-fold CV, the data is divided into 10 folds; 9 folds participate in training, and the classes of the samples in the remaining fold are predicted based on the training performed on the 9 folds. For the trained models, the test samples in the test fold are entirely unseen. The whole process is repeated 10 times so that each sample is predicted exactly once. A similar approach is applied for the other CVs. Finally, the predicted labels of the unseen samples are used to determine the classification accuracy. This process is repeated for each combination of the system's parameters, and the classification performance is reported as depicted in the Tables.
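The fold construction described above can be sketched as follows; the function name, shuffling and seed are our own choices, and the paper's Matlab implementation may partition differently.

```python
import numpy as np

def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and yield (train, test) index arrays
    for each of the k folds; every sample appears in exactly one test fold."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test
```

Concatenating the test folds recovers every sample exactly once, which is what makes the per-sample predictions "purely unseen" by the model that produced them.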
Classification performance based on single feature extracting strategy using 10-fold CV
The ROC is plotted using the true positive rate (TPR), i.e. sensitivity, and the false positive rate (FPR), i.e. 1 − specificity, values of the prostate and brachytherapy subjects. The mean feature values for brachytherapy subjects are labelled as 1 and those for prostate subjects as 0. This vector is then passed to the ROC function, which plots each sample value against the specificity and sensitivity values. ROC analysis is one of the standard ways to diagnose and visualize the performance of a classifier [73]. The TPR is plotted against the FPR for each decision threshold.
Classification performance based on combination of features using 10-fold CV
Performance evaluation using test data (holdout 0.10) based on texture feature
ROC analysis based on single feature sets using a) texture b) morphological c) entropy d) EFDs e) SIFT.
ROC analysis based on combined feature set using a) texture 
ROC analysis based on combined feature set using a) entropy 
The classification performance was measured using different classifiers such as Decision Tree, SVM with linear, polynomial, RBF (radial base function) and Gaussian kernels, and the Bayesian approach. The performance was evaluated by extracting features (texture, morphological, SIFT, EFDs and entropy-based features) as shown in Table 1. The performance was measured using sensitivity (Sens.), specificity (Spec.), PPV, NPV, TA, FPR, AUC, error and the 95% confidence interval (lower bound (L) and upper bound (U)), as reflected in Tables 1 to 3.
Based on the single feature-extraction methodology with 10-fold CV, the highest performance was obtained using texture features with the SVM Gaussian kernel, i.e. sensitivity (98.24%), specificity (96.34%), PPV (98.25%), NPV (98.67%), TA (98.24%) and AUC (0.999), followed by SVM polynomial with sensitivity (98.09%), specificity (97.45%), PPV (98.10%), NPV (97.17%), TA (98.09%) and AUC (0.9968). Moreover, based on texture features, the other classifiers performed as follows: Bayes gives a TA of 95.45%, followed by Decision Tree with a TA of 95.01% and SVM RBF with a TA of 91.06%. Likewise, the highest performance based on morphological features was obtained using SVM RBF with sensitivity (91.70%), specificity (88.64%), PPV (91.74%), NPV (87.90%) and AUC (0.9596), followed by SVM Gaussian with TA (90.83%), SVM polynomial with TA (90.25%), DT with TA (88.94%) and Bayes with TA (86.17%). The performance obtained by extracting the other features using the ML classification methods was: EFDs features using DT (TA
The cancer detection accuracy was enhanced by using combinations of the different extracted features, as depicted in Table 2. The highest performance using a combination of features, i.e. texture
Table 3 depicts the evaluation performance using test data (10% holdout) based on texture features with the ML classifiers. The overall highest accuracy was obtained using SVM RBF (TA
ROC analysis based on single features using SVM RBF at different fold CVs for features a) texture b) morphological c) entropy d) EFDs e) SIFT.
The AUC values using single features with the different ML (machine learning) classifiers are obtained as reflected in Fig. 4. The highest separation (AUC
Performance evaluation with different folds CVs on texture features by applying ML classifiers a) Bayes b) decision tree c) SVM Gaussian d) SVM RBF e) SVM polynomial.
Prediction model based on mean 
Mean values with 5% percentage error for selected morphological features to distinguish the prostate and brachytherapy subjects.
Figures 5 and 6 reflect the AUC values of the combined features. The AUC value for the combination of EFDs
The comparisons of the AUC values of single features with different cross-fold validations (i.e. 2, 4, 5 and 10) for the different classifiers are shown in Fig. 7. The highest separation (AUC
The performance was also evaluated with different cross-fold validations (i.e. 2, 4, 5 and 10), as depicted in Fig. 8. Based on texture features, the Bayes classifier gives a sensitivity of 94.87% (2-fold), 94.87% (4-fold), 95.01% (5-fold) and 95.45% (10-fold); a specificity of 94.95% (2-fold), 94.07% (4-fold), 93.84% (5-fold) and 95.19% (10-fold); a PPV of 95.08% (2-fold), 94.97% (4-fold), 95.07% (5-fold) and 95.58% (10-fold), and so on. Similarly, based on texture features, for the other classifiers the TA using DT was 95.31% (2-fold), 96.48% (4-fold), 95.01% (5-fold) and 95.01% (10-fold); using SVM RBF, 87.68% (2-fold), 90.32% (4-fold), 89.30% (5-fold) and 91.06% (10-fold); and using SVM polynomial, 97.95% (2-fold), 96.63% (4-fold), 98.53% (5-fold) and 98.09% (10-fold). It was observed that 10-fold CV gives higher performance in most cases than the other folds, and it is the most commonly used k-fold CV method for measuring the validation performance of classifiers. Thus, for the remainder of the work, the 10-fold CV method was employed.
In Fig. 9, the blue color denotes the means of the brachytherapy subjects and the red color denotes the prostate cancer subjects. The lines denote the correctly classified subjects, while
The performance was evaluated using ROC (receiver operating characteristic) analysis. The accuracy measure indicates the proximity of a measured value to the true value. The proportion of positive results, i.e. the percentage of patients correctly identified as having prostate cancer, is measured by the sensitivity, while the proportion of negative results, i.e. the percentage of patients correctly identified as normal, is measured by the specificity. In the ROC curve, the sensitivity is plotted as a function of (1 − specificity) for different operating points. Each operating point on the ROC plot denotes a specificity/sensitivity pair corresponding to a particular decision threshold. Perfect discrimination is an ROC curve passing through the coordinate (0, 1), the upper left corner of the ROC space. Therefore, the closer the ROC curve is to the upper left corner, the higher the overall accuracy of the test (i.e. maximum sensitivity and specificity). The threshold value for the operating point was selected closest to the coordinate (0, 1). Likewise, the NPV (negative predictive value) and PPV (positive predictive value) are computed, denoting the proportions of negative and positive test results for prostate cancer that were correctly diagnosed, respectively.
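The threshold sweep described above can be sketched as follows; the function names and the trapezoidal AUC rule are our choices, and real toolboxes handle score ties and class weights more carefully.

```python
import numpy as np

def roc_curve(scores, labels):
    """TPR/FPR pairs obtained by sweeping the decision threshold over the
    classifier scores sorted in descending order (labels: 1 = positive)."""
    order = np.argsort(-np.asarray(scores))
    labels = np.asarray(labels)[order]
    tpr = np.concatenate([[0.0], np.cumsum(labels) / labels.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(1 - labels) / (1 - labels).sum()])
    return fpr, tpr

def auc(fpr, tpr):
    """Trapezoidal area under the TPR-vs-FPR curve."""
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0))
```

A perfectly separable score set produces a curve through the upper left corner (0, 1) and an AUC of exactly 1.0, matching the notion of perfect discrimination described above.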
In this research, robust ML (machine learning) classification techniques such as SVM kernels, the Bayesian approach and Decision Tree are used to distinguish the prostate cancer subjects from the brachytherapy subjects. High-resolution images exhibit higher nonlinear dynamics and complexity, and due to the large variations in size and shape they require multidimensional feature-extraction strategies to effectively detect the cancer in an image. Thus, to handle this problem, different feature-extraction strategies are employed, such as scale-invariant feature transform (SIFT), texture, morphology and elliptic Fourier descriptors (EFDs). To distinguish the brachytherapy subjects from the prostate cancer subjects, the ML classification techniques such as SVM and its kernels, Decision Tree and the Bayes approach are developed in Matlab version 2016. Cross validation (Jack-knife 10-fold) was used to train and test the MR image database. The performance was evaluated using several measures (specificity, sensitivity, PPV, NPV, FPR and AUC). Both single and combined feature-extraction strategies were devised to evaluate the performance. The higher classification accuracies based on the single texture and morphological features were obtained using the SVM kernels, whereas combinations of different features, such as morphological with EFDs and texture, give higher accuracy than single features, followed by texture features with entropy and EFDs, using the SVM kernels, DTs and the Bayes approach. In the past, researchers used only a few single-feature strategies and few combined features to detect prostate cancer. However, the results reported in this study reveal that the present feature-extraction strategy is more effective for diagnosing and detecting prostate cancer, yielding higher specificity and sensitivity and a higher detection ratio for prostate cancer.
