
Abstract

Recently, many neural network models have been successfully applied to histopathological analysis, including cancer classification. While some of them reach human-expert-level accuracy in classifying cancers, most have to be treated as black boxes that offer no explanation of how they arrive at their decisions. This lack of transparency may hinder further applications of neural networks in realistic clinical settings, where not only the decision but also its explainability is important. This study proposes a transparent neural network that complements its classification decisions with visual information about the given problem. The auxiliary visual information allows the user to understand, to some extent, how the neural network arrives at its decision. This transparency can potentially increase the usability of neural networks in realistic histopathological analysis. In the experiments, the accuracy of the proposed neural network is compared against some existing classifiers, and the visual information is compared against some dimensional reduction methods.

Keywords

Cancer diagnosis, microarray gene expression data, neural network, visualization

Introduction

The recent surge in deep learning has naturally led to many applications in the medical field, mainly in histopathological diagnosis.^1–4 The increasing number of applications of deep neural networks in medical science is due to the availability of vast health records of many persons over many years, the availability of fast processors, and the increasing power of neural networks, mainly deep models,^5–7 to learn from those vast data. A good review of the applications of deep models to histopathological diagnosis is given in Ravi et al.⁸ It can be expected that in the near future, machine learning will play increasingly important roles in the medical and clinical sciences. Currently, in medical fields, neural networks are primarily trained with past patient data and then used to diagnose and predict the future pathological state of new patients. Some studies already report human-expert-level accuracy for cancer predictions.^9,10 Unfortunately, for most neural networks, the high prediction accuracy is not accompanied by transparency in explaining their decisions. They have to be treated as black boxes and thus do not offer any insight into how they generate their decisions from the given inputs. For many applications, such as game playing,¹¹ this lack of transparency is not important, but in the medical field, where explainability is required, the inability of neural networks to explain their decisions may hinder their further usage in realistic clinical settings.

In this research, a neural network model that offers transparency is applied to cancer microarray data sets. The proposed neural network, the Softmax restricted radial basis function network (S-rRBF),¹² is modified from the previously proposed restricted radial basis function network (rRBF).^13–15 The modification allows the new model to learn better and gives it a clearer mathematical description, although these are not the focus of this article. The objective of this article is to apply the S-rRBF to cancer classification and to explain its ability to offer auxiliary visual information that supports its classifications. The S-rRBF is a supervised hierarchical neural network whose internal-layer neurons are arranged in a two-dimensional (2D) grid, similar to Kohonen’s self-organizing maps (SOMs).^16,17 During the learning process, the internal layer is self-organized in a unique way: as opposed to the conventional SOM, the S-rRBF forms an internal representation that preserves the topological structure of the inputs in the context of their class labels. As the internal layer is organized to enable the network to optimize its classification accuracy, it provides information on how the network arrives at its classification decisions. Furthermore, due to its low dimensionality (2D), the internal layer can be visualized. The visualization of this internal representation, as a kind of map, complements the classification decision of the S-rRBF. As opposed to most supervised neural network models, the S-rRBF is transparent in that it provides an intuitive explanation of its decision process through visualization. It allows users to understand why a certain problem is easy or difficult.

In this research, the S-rRBF is trained with cancer microarray data sets obtained from Gene Expression Model Selector.^18–20 These data are multi-class, high-dimensional (ranging from a few thousand to over ten thousand dimensions), small in size, and imbalanced, which makes them challenging to learn. This article reports preliminary results on the ability of the proposed model to generalize from these data and to provide visual explanations about them. The experimental results show that the S-rRBF can potentially improve the usability of neural networks in real-world settings where accuracy should be paired with explainability, mainly in histopathological diagnosis.

The article is organized as follows. The next section briefly explains the dynamics of the S-rRBF. It is followed by a section presenting the experimental results, and the conclusions and future work are discussed in the final section.

S-rRBF

The S-rRBF¹² is modified from the previously proposed rRBF. It is a hierarchical supervised neural network, as illustrated in Figure 1. The internal layer of the S-rRBF is composed of a set of neurons arranged in a 2D grid, in a similar way to Kohonen’s SOMs. The low-dimensional internal layer allows the visualization of the internal structure of this hierarchical neural network and offers auxiliary intuitive information on its decision process. Hence, as opposed to many existing classifiers, the S-rRBF is not a black box: it provides a visual explanation for the given problem.

Figure 1.

Outline of S-rRBF.

Here, the S-rRBF is trained to classify a high-dimensional input, $X^{i} \in \mathbb{R}^{d}$, into its class $Y^{i} \in \{1, \dots, C\}$, where $i$ is the index of the input, $d$ is the dimension of the input, and $C$ is the number of classes. The dynamics are as follows.

Given input $X^{i}$ at time $t$, the $j$th hidden neuron generates output $h_{j}^{i}$ as

$$\begin{aligned} h_{j}^{i} &= \sigma(win^{i}, j, t)\, e^{-\| X^{i} - W_{j} \|^{2}} \\ win^{i} &= \underset{j}{\arg\min}\, \| X^{i} - W_{j} \|^{2} \end{aligned}$$

(1)

In equation (1), $\sigma(\cdot)$ is a neighborhood function defined as

$$\sigma(win^{i}, j, t) = e^{-\frac{dist(win^{i}, j)}{S(t)}}$$

(2)

$$S(t) = S_{start} \left( \frac{S_{end}}{S_{start}} \right)^{\frac{t}{t_{end}}} \qquad (0 \leq t \leq t_{end},\; S_{start} > S_{end})$$

(3)

where $dist(win^{i}, j)$ is the Euclidean distance between the winning neuron and the $j$th neuron on the 2D grid of the hidden layer, while $t$ and $t_{end}$ are the current epoch and the target epoch at which the learning process is terminated, respectively. $S_{start}$ and $S_{end}$ are empirically decided constants. $W_{j}$ is the reference vector associated with the $j$th hidden neuron.

The activation function of a hidden neuron in the S-rRBF is similar to that of radial basis function networks (RBF),²¹ except that in the S-rRBF it is topologically restricted by the neighborhood function $\sigma(win^{i}, j, t)$.
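As an illustration of equations (1) to (3), the hidden-layer response can be sketched in a few lines of standard-library Python. This is a minimal sketch and not the authors' implementation: the row-major indexing of the grid, the helper names, the toy reference vectors, and the values chosen for $S_{start}$ and $S_{end}$ are all assumptions made for this example.

```python
import math

def neighborhood_width(t, t_end, s_start, s_end):
    """Equation (3): the width decays geometrically from s_start to s_end."""
    return s_start * (s_end / s_start) ** (t / t_end)

def grid_distance(a, b, grid_width):
    """Euclidean distance between neurons a and b on the 2D grid.
    Neurons are assumed to be indexed in row-major order."""
    ra, ca = divmod(a, grid_width)
    rb, cb = divmod(b, grid_width)
    return math.hypot(ra - rb, ca - cb)

def squared_distance(x, w):
    """Squared Euclidean distance between an input and a reference vector."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))

def hidden_outputs(x, W, t, t_end, grid_width, s_start=3.0, s_end=0.5):
    """Equations (1) and (2): RBF response gated by the grid neighborhood.
    The default s_start and s_end are illustrative values only."""
    d2 = [squared_distance(x, w) for w in W]
    win = min(range(len(W)), key=d2.__getitem__)  # index of the winning neuron
    s = neighborhood_width(t, t_end, s_start, s_end)
    h = [math.exp(-grid_distance(win, j, grid_width) / s) * math.exp(-d2[j])
         for j in range(len(W))]
    return win, h

# Toy 2x2 grid with 2D reference vectors (hypothetical values).
W = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
win, h = hidden_outputs([0.1, 0.0], W, t=0, t_end=100, grid_width=2)
```

In this toy case the winner is the neuron whose reference vector is closest to the input, and its output is the largest because both the RBF term and the neighborhood term peak there.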

The outputs of the hidden neurons are then propagated to the output layer, where the $k$th output, $O_{k}$, is defined as follows

$$O_{k} = e^{V_{k}^{T} h^{i}}$$

(4)

Here, $V_{k}$ is the weight vector connecting the hidden layer with the $k$th output neuron, $T$ denotes the transpose, and $h^{i}$ denotes the output vector of the hidden layer given $X^{i}$ as the input.

While the original rRBF adopts sigmoidal neurons, the S-rRBF replaces them with a Softmax function, which gives the conditional probability that the S-rRBF classifies input $X^{i}$ into class $k$ as follows

$$P(Y^{i} = k \mid W, V, X^{i}) = \frac{e^{V_{k}^{T} h^{i}}}{\sum_{l} e^{V_{l}^{T} h^{i}}}$$

(5)
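The output layer in equations (4) and (5) can be sketched as follows. This is only an illustrative sketch: the function name and the toy values of $h^{i}$ and $V$ are assumptions, and subtracting the maximum score before exponentiation is a common numerical-stability step that is not part of the formulation above (it leaves the probabilities unchanged).

```python
import math

def class_probabilities(h, V):
    """Equations (4)-(5): Softmax over the scores V_k^T h."""
    scores = [sum(vkj * hj for vkj, hj in zip(v_k, h)) for v_k in V]
    m = max(scores)                           # stability shift (an assumption)
    exps = [math.exp(s - m) for s in scores]  # proportional to O_k
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical hidden outputs for four neurons and weights for three classes.
h = [0.9, 0.2, 0.1, 0.0]
V = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 1.0]]
p = class_probabilities(h, V)
```

The returned list sums to one, and the class whose weight vector is best aligned with the hidden outputs receives the highest probability.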

The S-rRBF is then trained to minimize the cross entropy defined as follows

$$\begin{aligned} J(W, V) &= -\sum_{i} P(Y^{i}) \log P(Y^{i} \mid W, V, X^{i}) \\ W &= \{W_{1}, W_{2}, \dots, W_{nhid}\} \\ V &= \{V_{1}, V_{2}, \dots, V_{C}\} \end{aligned}$$

(6)

In equation (6), $nhid$ is the number of hidden neurons and $C$ is the number of output neurons. Considering that $Y^{i} \in \{1, \dots, C\}$, equation (6) can be rewritten as

$$J(W, V) = -\sum_{i} \sum_{k} \Pi(Y^{i} = k) \log \frac{e^{V_{k}^{T} h^{i}}}{\sum_{l} e^{V_{l}^{T} h^{i}}}$$

(7)
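The double sum in equation (7) can be checked numerically with a short sketch. Here the indicator in equation (7) is 1 for the true class and 0 otherwise, so only the log-probability of the true class contributes for each sample. The function names, the zero-based class indices, and the probability values are assumptions made for this example.

```python
import math

def cross_entropy_indicator(probs, labels, n_classes):
    """Equation (7): double sum in which the indicator selects the true class."""
    total = 0.0
    for p, y in zip(probs, labels):
        for k in range(n_classes):
            indicator = 1.0 if y == k else 0.0
            total -= indicator * math.log(p[k])
    return total

def cross_entropy_true_class(probs, labels):
    """Equivalent shortcut: sum only the true-class log-probabilities."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels))

# Hypothetical Softmax outputs for three samples and three classes.
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]
labels = [0, 1, 2]  # zero-based class indices (an assumption of this sketch)
j_indicator = cross_entropy_indicator(probs, labels, n_classes=3)
j_shortcut = cross_entropy_true_class(probs, labels)
```

Both functions return the same value, which is why implementations usually evaluate only the true-class term.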

In equation (7), $\Pi(Y^{i} = k) = 1$ when $Y^{i} = k$ is true, and $\Pi(Y^{i} = k) = 0$ otherwise. The cross entropy is minimized by calculating the gradients $\partial J(W, V) / \partial V$, which is trivial, and $\partial J(W, V) / \partial W$, which generates the modification of the reference vectors as follows

$$W_{n}(t + 1) = W_{n}(t) + \eta \sum_{i} (v_{Kn} - \tilde{v}_{n}^{i})\, \sigma(win^{i}, n, t)\, e^{-\| X^{i} - W_{n} \|^{2}} (X^{i} - W_{n})$$

(8)

Here, it is assumed that the true class of the input $X^{i}$ is $K$, and $v_{Kn}$ is the weight connecting the $n$th hidden neuron with the $K$th output neuron, while $\tilde{v}_{n}^{i} = \sum_{l} v_{ln} P(Y^{i} = l \mid W, V, X^{i})$.
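For readers who wish to retrace the step from equation (7) to equation (8), a possible derivation is sketched below. It rests on two assumptions that are not stated explicitly in the text: the winning neuron $win^{i}$ is treated as fixed when differentiating, and the constant factor 2 from the derivative of the squared norm is absorbed into the learning rate $\eta$.

```latex
\begin{align*}
% Per-sample cost for input X^i with true class K, from equation (7)
J^{i} &= -V_{K}^{T} h^{i} + \log \sum_{l} e^{V_{l}^{T} h^{i}} \\
% Gradient with respect to the output weights
\frac{\partial J^{i}}{\partial V_{k}}
  &= -\left( \Pi(Y^{i} = k) - P(Y^{i} = k \mid W, V, X^{i}) \right) h^{i} \\
% Gradient with respect to the nth hidden output
\frac{\partial J^{i}}{\partial h_{n}^{i}}
  &= -v_{Kn} + \sum_{l} v_{ln} P(Y^{i} = l \mid W, V, X^{i})
   = -\left( v_{Kn} - \tilde{v}_{n}^{i} \right) \\
% Gradient of the hidden output in equation (1), with win^i held fixed
\frac{\partial h_{n}^{i}}{\partial W_{n}}
  &= 2\, \sigma(win^{i}, n, t)\, e^{-\| X^{i} - W_{n} \|^{2}} (X^{i} - W_{n}) \\
% Gradient descent step; the factor 2 is absorbed into \eta
W_{n}(t + 1)
  &= W_{n}(t) - \frac{\eta}{2} \sum_{i}
     \frac{\partial J^{i}}{\partial h_{n}^{i}}
     \frac{\partial h_{n}^{i}}{\partial W_{n}} \\
  &= W_{n}(t) + \eta \sum_{i} (v_{Kn} - \tilde{v}_{n}^{i})\,
     \sigma(win^{i}, n, t)\, e^{-\| X^{i} - W_{n} \|^{2}} (X^{i} - W_{n})
\end{align*}
```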

The reference vector modification in equation (8) differs significantly from that of the SOM in that it includes a term $(v_{Kn} - \tilde{v}_{n}^{i})$ that can be either positive or negative, whereas in the SOM this term always takes the value of 1. The inclusion of this term causes the label of the input to influence the self-organization of the internal representation, unlike the conventional SOM, which is influenced only by the topological structure of the inputs. Hence, the generated 2D internal representation differs significantly from that of the SOM in that it reflects the topographical structure of the inputs in the context of their class labels.
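The effect of the signed term can be made concrete with a small sketch of the update in equation (8). This is a minimal illustration under stated assumptions, not the authors' code: the function name is hypothetical, the neighborhood values are passed in precomputed as `sigma[i][n]`, class indices are zero-based, and the single-neuron, two-class toy values are chosen only to show the sign change.

```python
import math

def update_reference_vectors(X, labels, W, V, sigma, eta):
    """One batch step of equation (8).
    sigma[i][n] holds the neighborhood value for sample i and neuron n,
    assumed to be precomputed from equations (2) and (3)."""
    new_W = [list(w) for w in W]
    for i, (x, K) in enumerate(zip(X, labels)):
        # Hidden outputs, equation (1).
        d2 = [sum((a - b) ** 2 for a, b in zip(x, w)) for w in W]
        h = [sigma[i][n] * math.exp(-d2[n]) for n in range(len(W))]
        # Class probabilities, equation (5).
        scores = [sum(v * hn for v, hn in zip(row, h)) for row in V]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        p = [e / z for e in exps]
        for n in range(len(W)):
            v_tilde = sum(V[l][n] * p[l] for l in range(len(V)))
            coeff = eta * (V[K][n] - v_tilde) * h[n]  # signed step size
            for d in range(len(x)):
                new_W[n][d] += coeff * (x[d] - W[n][d])
    return new_W

# One hidden neuron, two classes; the neuron's weight favors class 0.
W = [[0.0, 0.0]]
V = [[1.0], [-1.0]]
x = [1.0, 0.0]
toward = update_reference_vectors([x], [0], W, V, sigma=[[1.0]], eta=0.5)
away = update_reference_vectors([x], [1], W, V, sigma=[[1.0]], eta=0.5)
```

In this toy setting the reference vector moves toward the input when the input belongs to the class that the neuron's weight favors, and away from it otherwise, whereas a conventional SOM update would always move it toward the input.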

Experiments

The data sets used for the experiments in this article are cancer microarray data sets obtained from Gene Expression Model Selector.^18–20 The data configurations are given in Table 1, which shows the number of samples, the dimension of the inputs, the number of classes, and the class distribution for each data set. It is clear that, with respect to class distribution, many data sets are imbalanced. For example, the Brain Tumor 1 data set should be classified into five classes, in which the first class accounts for 66.7 percent of the data, the second class accounts for 11.1 percent, and the fourth class accounts for only 4.4 percent of the data. The relatively small data sizes and the imbalanced class distributions make the learning task challenging.

Table 1.

Data configuration.

Data set	Number of samples	Input dimension	Number of classes	Class distribution (%)
Brain Tumor 1	90	5920	5	66.7, 11.1, 11.1, 4.4, 6.7
Brain Tumor 2	50	10,367	4	28.0, 14.0, 28.0, 30.0
DLBCL	77	5469	2	75.3, 24.7
Leukemia 1	72	5327	3	52.8, 12.5, 34.7
Leukemia 2	72	11,225	3	38.9, 33.3, 27.8
Lung Cancer	203	12,600	5	68.5, 8.4, 10.3, 9.9, 3.0
Prostate Tumor	102	10,509	2	51.0, 49.0
Small round blue cell tumor (SRBCT)	83	2308	4	34.9, 30.1, 13.3, 21.7
9 Tumors	60	5726	9	15.0, 11.7, 13.3, 10.0, 10.0, 13.3, 13.3, 3.3, 10.0
11 Tumors	174	12,533	11	15.5, 4.6, 14.9, 13.2, 6.9, 6.3, 4.0, 14.9, 3.4, 8.0, 8.0

DLBCL: diffuse large B-cell lymphoma.
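The class distribution column can be reproduced from per-class sample counts, as in the sketch below. The counts used here for Brain Tumor 1 are an assumption: they are inferred from the 90 samples and the percentages in Table 1 and are not stated in the article. The ratio between the largest and smallest class is added only as one simple, illustrative way to quantify the imbalance.

```python
def class_distribution(counts):
    """Percentage of samples per class, rounded to one decimal place."""
    total = sum(counts)
    return [round(100.0 * c / total, 1) for c in counts]

# Assumed per-class counts for Brain Tumor 1 (inferred, 90 samples in total).
counts = [60, 10, 10, 4, 6]
distribution = class_distribution(counts)

# Ratio of the largest class to the smallest class.
imbalance_ratio = max(counts) / min(counts)
```

With these assumed counts the largest class is fifteen times the size of the smallest, which illustrates why the learning task is challenging.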

The classification accuracy of the S-rRBF is compared against two simple deep models, stacked autoencoders (SAEs)^22–24 and multilayered perceptrons (MLPs) with rectified linear units (ReLUs) as the activation function,^25–27 and against the conventional nearest neighbor classifier²⁸ on a wide range of microarray-based cancer classification problems. Table 2 shows the average error rates, together with their standard deviations in brackets, obtained over 15-fold cross-validation tests. The error of the best performing classifier is highlighted in bold and that of the worst performing one in italic. Table 2 shows that the S-rRBF does not always outperform the other classifiers. However, for most of the problems, its performance is not far from that of the best performing classifier, showing the reliability and stability of the S-rRBF over a wide range of problems. According to the table, the S-rRBF and the SAE each offer the best classification rates for four of the problems. However, when they are not the best performers, the performance of the S-rRBF is in general better than that of the SAE. Furthermore, the internal layer of the S-rRBF offers auxiliary visual information that is not offered by the other methods.

Table 2.

Error rate (%) (standard deviation).

Data set	S-rRBF	SAEs	ReLU	k-NN
Brain Tumor 1	13.3 (14.4)	16.7 (17.8)	14.4 (12.4)	21.1 (25.6)
Brain Tumor 2	28.3 (25.4)	22.8 (19.3)	32.8 (11.2)	32.2 (24.6)
DLBCL	9.1 (10.1)	5.3 (11.9)	9.3 (16.7)	14.2 (17.6)
Leukemia 1	7.0 (12.8)	15.3 (16.3)	8.7 (16.4)	15.3 (14.5)
Leukemia 2	12.3 (12.9)	11.3 (13.4)	16.7 (12.3)	9.67 (10.8)
Lung Cancer	11.8 (8.5)	6.5 (16.3)	17.7 (10.4)	9.5 (8.8)
Prostate Tumor	14.6 (13.3)	15.1 (14.1)	33.3 (17.4)	16.5 (16.1)
SRBCT	6.9 (10.8)	14.4 (13.3)	4.9 (13.0)	17.1 (16.1)
9 Tumors	40.0 (20.7)	63.3 (28.1)	70.0 (31.6)	68.3 (22.1)
11 Tumors	27.5 (9.5)	22.8 (14.3)	51.1 (36.6)	25.9 (8.4)

S-rRBF: Softmax restricted radial basis function networks; SAE: stacked autoencoder; ReLU: rectified linear unit; DLBCL: diffuse large B-cell lymphoma.
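The way an entry of Table 2 (mean error rate with its standard deviation over folds) can be obtained is sketched below. Several details are assumptions, since the article does not describe them: the folds are stratified by class, a 1-nearest-neighbor rule stands in for the classifier, and the data are a deterministic toy set rather than microarray data.

```python
import math
import random
import statistics

def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds, class by class (assumed stratified)."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds

def nearest_neighbor_predict(train_X, train_y, x):
    """Label of the training sample closest to x (1-nearest-neighbor)."""
    dists = [math.dist(x, tx) for tx in train_X]
    return train_y[dists.index(min(dists))]

def cross_validated_error(X, y, k):
    """Mean and standard deviation of the per-fold error rate in percent."""
    errors = []
    for fold in stratified_folds(y, k):
        held_out = set(fold)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        wrong = sum(nearest_neighbor_predict(train_X, train_y, X[i]) != y[i]
                    for i in fold)
        errors.append(100.0 * wrong / len(fold))
    return statistics.mean(errors), statistics.stdev(errors)

# Deterministic toy data: two well-separated classes of 15 samples each.
X = [[0.1 * i, 0.0] for i in range(15)] + [[10.0 + 0.1 * i, 0.0] for i in range(15)]
y = [0] * 15 + [1] * 15
mean_err, std_err = cross_validated_error(X, y, k=5)
```

On real microarray data with very few samples per class, individual folds contain only a handful of test samples, which is one plausible reason for the large standard deviations reported in Table 2.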

To show the uniqueness of the visualization of the S-rRBF’s internal representation, some of the problems are also visualized using other dimensional reduction methods: kernel principal component analysis (K-PCA),^29,30 t-distributed stochastic neighbor embedding (t-SNE),^31,32 and Kohonen’s SOM.¹⁶

Figure 2 shows the 2D representations of the Leukemia 1 problem. It can be observed that the internal map of the S-rRBF in Figure 2(d) shows better separability between the samples belonging to the three classes than the other maps. In recent years, t-SNE has been considered one of the best dimensional reduction methods for reflecting the original structure of the data. The t-SNE map for this problem, shown in Figure 2(b), indicates that the ALL T-cell samples are distributed among the other two classes. This overlapping distribution is likely the cause of classification errors. The overlapping nature of the ALL T-cell samples is also reflected in the K-PCA and SOM representations shown in Figure 2(a) and (c). Due to its learning process, the S-rRBF generates a 2D internal representation that is at least sub-optimal for the classification task. Figure 2(d) shows that the three classes are better separated, and this map is also consistent with the low error rate of the S-rRBF for this problem. In those maps, the size of a marker indicates the number of samples represented by the marker, and a × indicates a representation of two or more samples from conflicting classes, hence some samples in its vicinity are likely to be misclassified.

Figure 2.

Leukemia 1: (a) kernel PCA, (b) t-SNE, (c) SOM, and (d) S-rRBF.

The validity of the visual information offered by the S-rRBF representation is evaluated against the confusion matrix shown in Table 3. The component $C(i, j)$ of this matrix indicates the percentage of inputs belonging to class $i$ that are classified as class $j$. The representation of the S-rRBF is consistent with this confusion matrix in that there are areas around the middle of the map where the three classes are assigned close to each other, causing misclassifications. Table 3 indicates that the true positive rates for all three classes are high, but the close positions of some of the points belonging to conflicting classes around the middle of the S-rRBF’s representation cause the false negatives in the second and third rows of the matrix.

Table 3.

Leukemia 1 confusion matrix (%).

Class	ALL B-cell	ALL T-cell	AML
ALL B-cell	94.6	2.6	2.6
ALL T-cell	0	88.8	11.2
AML	8.0	0	92.0
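A row-normalized confusion matrix of the kind shown in Table 3, where $C(i, j)$ is the percentage of class $i$ inputs classified as class $j$, can be computed as in the following sketch. The function name, the zero-based class indices, and the toy labels are assumptions for illustration and are not taken from the experiments.

```python
def confusion_matrix_percent(true_labels, predicted_labels, n_classes):
    """C[i][j]: percentage of class-i samples that were classified as class j."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        counts[t][p] += 1
    percent = []
    for row in counts:
        total = sum(row)
        # Each row is normalized by the number of samples of its true class.
        percent.append([round(100.0 * c / total, 1) if total else 0.0
                        for c in row])
    return percent

# Hypothetical labels for eleven samples and three classes.
true_labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2]
predicted = [0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 0]
C = confusion_matrix_percent(true_labels, predicted, n_classes=3)
```

Because each row is normalized separately, the diagonal entries give the per-class true positive rates, and the off-diagonal entries of a row give the false negatives of that class.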

The second example is the Brain Tumor 1 problem. From Figure 3(a), it can be observed that K-PCA does not generate a 2D representation with good separability. The representations of t-SNE and SOM in Figure 3(b) and (c) also indicate that there are many overlapping samples with conflicting classes. The internal representation of the S-rRBF shows that the lower part of the map is dominated by the medulloblastoma samples, hence classification of samples in this part is likely to be easy, which is consistent with the high classification rate of medulloblastoma apparent from the confusion matrix in Table 4. Most of the misclassifications occur in the upper half of the S-rRBF representation due to neighboring samples from conflicting classes, as indicated especially by the large confusions in the fourth and fifth rows of the confusion matrix in Table 4.

Figure 3.

Brain Tumor 1: (a) kernel PCA, (b) t-SNE, (c) SOM, and (d) S-rRBF.

Table 4.

Brain Tumor 1 confusion matrix (%).

Class	medulloblastoma	malignant glioma	AT/RT	normal cerebellum	PNET
medulloblastoma	93.3	1.7	3.3	0.0	1.7
malignant glioma	0.0	100.0	0.0	0.0	0.0
AT/RT	10.0	0.0	90.0	0.0	0.0
normal cerebellum	25.0	25.0	0.0	50.0	0.0
PNET	66.6	16.7	0.0	0.0	16.7

For the third example, the SRBCT problem, the S-rRBF is slightly outperformed by the ReLU MLP. However, the S-rRBF offers an informative visual representation, as shown in Figure 4(d). As captured by the K-PCA and t-SNE representations in Figure 4(a) and (b), respectively, the original structure of this problem includes many samples with overlapping classes, which is also consistent with the presence of some ×s on the SOM in Figure 4(c). During the learning process, the S-rRBF disentangles the original overlapping structure to form a more classifiable internal representation. However, there is a portion of the map where samples of Ewing Sarcoma (EWS), Rhabdomyosarcoma (RMS), and Burkitt Lymphoma (BL) are mapped very close to each other, causing EWS to be misclassified as RMS or BL, as shown in the first row of the confusion matrix in Table 5. The closely placed samples of conflicting classes in the top right of the map are also responsible for misclassifications of the first class, as indicated by the first row of the confusion matrix.

Figure 4.

SRBCT: (a) kernel PCA, (b) t-SNE, (c) SOM, and (d) S-rRBF.

Table 5.

SRBCT confusion matrix (%).

Class	EWS	RMS	BL	NB
EWS	82.7	6.9	6.9	3.5
RMS	0.0	96.0	0.0	4.0
BL	0.0	0.0	100.0	0.0
NB	0.0	0.0	0.0	100.0

The fourth example is the Leukemia 2 problem. The K-PCA, t-SNE, and SOM representations in Figure 5(a)–(c) show that the three classes overlap each other. It is also clear that the MLL samples are split into two clusters that sandwich the ALL samples between them. These characteristics are also nicely captured by the S-rRBF representation in Figure 5(d), which also shows an area where MLL and ALL samples overlap. The visualization of the S-rRBF is consistent with the confusion matrix in Table 6, where confusions are apparent especially in its second row.

Figure 5.

Leukemia 2: (a) kernel PCA, (b) t-SNE, (c) SOM, and (d) S-rRBF.

Table 6.

Leukemia 2 confusion matrix (%).

Class	AML	ALL	MLL
AML	89.3	3.6	7.1
ALL	0.0	87.5	12.5
MLL	10.0	5.0	85.0

The fifth example is the Prostate Tumor problem. The difficulty of this problem is illustrated by the K-PCA and t-SNE representations in Figure 6(a) and (b), where there are many overlapping samples from the two different classes. The problem’s difficulty is also obvious in the SOM in Figure 6(c), where there are many ×s depicting overlapping conflicting samples. As shown in Figure 6(d), the S-rRBF generates a map with better separability between the two conflicting classes. The visual information about the problem’s structure is consistent with the classification performances in Table 2. The internal representation of the S-rRBF is also consistent with the confusion matrix in Table 7: there are some adjacent representations from the two conflicting classes that may cause misclassification, but most of the samples are nicely separated.

Figure 6.

Prostate: (a) kernel PCA, (b) t-SNE, (c) SOM, and (d) S-rRBF.

Table 7.

Prostate confusion matrix (%).

Class	tumor	normal
tumor	88.5	11.5
normal	18.0	82.0

The next example is the Brain Tumor 2 problem, for which the S-rRBF is not the best classifier. The low-dimensional representations of this problem generated by K-PCA, t-SNE, and SOM are shown in Figure 7(a)–(c), respectively, where it can be observed that there are many overlapping samples belonging to different classes, indicating that this is a hard classification problem. The internal representation of the S-rRBF, shown in Figure 7(d), is consistent with the confusion matrix shown in Table 8, for example, in visualizing the imbalanced representation of classic anaplastic oligodendrogliomas and hence its low classification rate, as clearly indicated by the large confusions in the second row of the matrix.

Figure 7.

Brain Tumor 2: (a) kernel PCA, (b) t-SNE, (c) SOM, and (d) S-rRBF.

Table 8.

Brain Tumor 2 confusion matrix (%).

Class	CG	CAO	N-cG	N-cAO
CG	64.4	21.4	7.1	7.1
CAO	42.9	57.1	0.0	0.0
N-cG	7.1	0.0	71.5	21.4
N-cAO	0.0	0.0	13.3	86.7

CG: classic glioblastomas; CAO: classic anaplastic oligodendrogliomas; N-cG: non-classic glioblastomas; N-cAO: non-classic anaplastic oligodendrogliomas.

The final example is the diffuse large B-cell lymphoma (DLBCL) problem, for which the K-PCA, t-SNE, SOM, and S-rRBF representations are shown in Figure 8(a)–(d), respectively. In this example, the S-rRBF forms a relatively separable map of the two conflicting classes, in which the imbalance of the data is clearly shown. For this problem, the SAE outperforms the other methods. The confusion matrix in Table 9 shows that the samples for follicular lymphoma are harder to classify, likely because of the imbalance in the training data.

Figure 8.

DLBCL: (a) kernel PCA, (b) t-SNE, (c) SOM, and (d) S-rRBF.

Table 9.

DLBCL confusion matrix (%).

Class	DLBCL	FL
DLBCL	98.3	1.7
FL	31.6	68.4

DLBCL: diffuse large B-cell lymphoma; FL: follicular lymphoma.

In the examples above, the S-rRBF provides descriptive visual information that intuitively explains its decisions. The visualization of the internal layer of the S-rRBF is more informative than those of the other methods because of the learning algorithm of the proposed model. While other dimensional reduction algorithms reduce the dimensions of the inputs independently of their class labels, in the proposed S-rRBF the dimensional reduction is an integrated part of its learning mechanism and naturally takes the class labels into account. The primary mathematical property of this integrated dimensional-reduction-learning mechanism is expressed in the reference vector modification in equation (8). Here, the modification encodes the information of the class labels into the 2D internal organization of the neural network, generating visual maps that are highly relevant to its decision process.

Conclusion

In this study, a cancer classifier trained on small and imbalanced data sets is proposed. As opposed to most of the existing supervised neural networks, which offer no transparency in their decision process and thus have to be treated as black boxes, the proposed S-rRBF offers visual information on its internal layer. This auxiliary information complements the classification decision of the S-rRBF in an intuitive manner by explaining why a problem is hard or easy to classify. It is intuitive to see that a new input that falls into an area where there are overlapping samples belonging to conflicting classes is hard to classify, while one that falls within a distinctive cluster of a certain class is unlikely to be misclassified. The proposed visual transparency can potentially improve the usability of neural networks in medical diagnosis, in which not only the classification accuracy but also the explainability is important.

The experiments indicate that the S-rRBF does not outperform all of the compared models on all of the problems. However, when outperformed, its performance was generally close to that of the best classifier. Considering its generality and its transparency, which is not offered by the other classifiers, it is reasonable to choose the S-rRBF for a wide range of cancer diagnosis problems.

This article reports on a preliminary study to test the reliability and explainability of the S-rRBF on many cancer diagnosis problems. The immediate future task is to integrate the S-rRBF into a diagnosis system that can be used in realistic clinical settings. Currently, the proposed neural network provides visual information but not a logical explanation for its decisions. Thus, in the future, it is also important to convert the visual information into more understandable semantics for explaining the neural network.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Pitoyo Hartono

References

1. Chen, Mahjoubfar, Tai, et al. Deep learning in label-free cell classification. Sci Rep 2016; 6: 21471.

2. Luo, Wang, et al. A deep convolutional neural network for segmenting and classifying epithelial and stromal regions in histopathological images. Neurocomputing 2016; 191: 214–223.

3. Wahab, Khan, Lee YS. Two-phase deep convolutional neural network for reducing class skewness in histopathological images based breast cancer detection. Comput Biol Med 2017; 85: 86–97.

4. Abreu, Santos, Abreu, et al. Predicting breast cancer recurrence using machine learning techniques: a systematic review. ACM Comput Surv 2016; 49(3): 52.

5. LeCun, Bengio, Hinton. Deep learning. Nature 2015; 521: 436–444.

6. Hinton. Learning multiple layers of representation. Trend Cognit Sci 2007; 11(10): 428–434.

7. Salakhutdinov. Learning deep generative models. Ann Rev Stat Appl 2016; 2: 361–385.

8. Ravi, Wong, Deligianni, et al. Deep learning for health informatics. IEEE J Biomed Health 2017; 21(1): 4–21.

9. Esteva, Kuprel, Novoa, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017; 542: 115–118.

10. Litjens, Sánchez, Timofeeva, et al. Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis. Sci Rep 2016; 6: 26286.

11. Silver, Huang, Maddison, et al. Mastering the game of Go with deep neural networks and tree search. Nature 2016; 529: 484–489.

12. Hartono, Trappenberg. Topographic representation adds robustness to supervised learning. J Intel Fuzzy Syst 2019; 36(4): 3249–3262.

13. Hartono, Hollensen, Trappenberg. Learning-regulated context relevant topographical map. IEEE T Neur Net Lear 2015; 26(10): 2323–2335.

14. Hartono, Trappenberg. Classificability-regulated self-organizing map using restricted RBF. In: Proceedings of the 2013 international joint conference on neural networks (IJCNN), Dallas, TX, 4–9 August 2013, pp. 160–164. New York: IEEE.

15. Hartono. Classification and dimensional reduction using restricted radial basis function networks. Neural Computing and Applications 2018; 30(3): 905–915.

16. Kohonen. Self-organized formation of topologically correct feature maps. Biol Cybern 1982; 43: 59–69.

17. Kohonen. Essentials of the self-organizing map. Neur Netw 2013; 37: 52–65.

18. Statnikov, Tsamardinos, Dosbayev, et al. GEMS: a system for automated cancer diagnosis and biomarker discovery from microarray gene expression data. Int J Med Inf 2005; 74: 491–503.

19. Statnikov, Henaff, Narendra, et al. A comprehensive evaluation of multicategory classification methods for microbiomic data. Microbiome 2013; 1: 11.

20. Statnikov, Aliferis, Tsamardinos, et al. A comprehensive evaluation of multicategory classification methods for microarray gene expression cancer diagnosis. Bioinformatics 2005; 21(5): 631–643.

21. Poggio, Girosi. Networks for approximation and learning. Proc IEEE 1990; 87: 1484–1487.

22. Hinton, Zemel. Autoencoders, minimum description length and Helmholtz free energy. In: Cowan, Tesauro, Alspector (eds) Advances in neural information processing systems 6 (NIPS 1993). US: Morgan-Kaufmann, 1994, pp. 3–10.

23. Hinton, Salakhutdinov. Reducing the dimensionality of data with neural networks. Science 2006; 313(5786): 504–507.

24. Bourlard, Kamp. Auto-association by multilayer perceptrons and singular value decomposition. Biol Cybern 1988; 59: 291–294.

25. Nair, Hinton. Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th international conference on machine learning, Haifa, Israel, 21–24 June 2010, pp. 807–814. New York: ACM.

26. Goodfellow, Bengio, Courville. Deep learning. Cambridge, MA: The MIT Press, 2016.

27. Glorot, Bordes, Bengio. Deep sparse rectifier neural networks. In: Proceedings of the fourteenth international conference on artificial intelligence and statistics (AISTATS) (eds Gordon, Dunson, Dudik), Fort Lauderdale, FL, 11–13 April 2011, pp. 315–323. PMLR.

28. Cover, Hart. Nearest neighbor pattern classification. IEEE T Inform Theory 1967; 13: 21–27.

29. Weinberger, Sha, Saul. Learning a kernel matrix for nonlinear dimensionality reduction. In: Proceedings of the twenty-first international conference on machine learning (ICML ’04), Banff, AB, Canada, 4–8 July 2004. New York: ACM.

30. Schölkopf, Smola, Müller KR. Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput 1998; 10(5): 1299–1319.

31. Van Der Maaten, Postma, Van Den Herik. Dimensionality reduction: a comparative review. Technical report TiCC TR 2009–005, Tilburg University, 2009, https://www.tilburguniversity.edu/upload/59afb3b8-21a5-4c78-8eb3-6510597382db_TR2009005.pdf

32. Van Der Maaten. Visualizing high-dimensional data using t-SNE. J Mach Learn Res 2008; 9: 2579–2605.

A transparent cancer classifier
