Abstract
Feature extraction is an important preprocessing step in many research areas. For anomaly detection, feature extraction should not only extract the most important features hidden in the data, but also discriminate between different classes of samples; the latter property is usually referred to as discriminative ability. Data collected from production systems usually do not follow a Gaussian distribution and may correspond to nonlinear mixtures of independent components. To cope with non-Gaussian data and perform nonlinear feature extraction, this article proposes a feature extraction algorithm based on Supervised Independent Component Analysis with Kernel (termed SKICA). SKICA first adopts Kernel Principal Component Analysis (KPCA) to whiten the data. Then, by virtue of the within-cluster scatter matrix derived from Linear Discriminant Analysis (LDA), SKICA extends Independent Component Analysis (ICA) to the supervised setting by incorporating within-cluster information into the estimation of independent components. This improvement allows SKICA to obtain independent components that are more beneficial for separating different classes of samples. To quantitatively measure the discriminative ability of the feature extraction algorithms involved in the experiments, this article defines three kinds of average squared distance. Experiments on artificial datasets, the Cloud dataset, and the KDD Cup dataset evaluate the effectiveness of SKICA. The experimental results show that SKICA outperforms several popular supervised feature extraction algorithms, including LDA, LDA with kernel (KDA), and supervised ICA (SICA).
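The two building blocks named in the abstract, KPCA whitening and the LDA within-cluster scatter matrix, can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: an RBF kernel is assumed, the function names are illustrative, and the actual SKICA rotation step that uses the scatter matrix is omitted.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Pairwise RBF kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)
    d2 = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2 * X @ X.T
    return np.exp(-gamma * d2)

def kpca_whiten(X, n_components=2, gamma=1.0):
    """KPCA whitening: project into feature space and scale each
    principal direction to zero mean and unit variance (first SKICA step)."""
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    # Center the kernel matrix in feature space
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]  # top eigenpairs
    # For a centered kernel, sqrt(n) * eigenvectors are the whitened
    # training projections (zero mean, identity covariance).
    return np.sqrt(n) * vecs[:, idx]

def within_class_scatter(Z, y):
    """Within-cluster scatter matrix S_w from LDA; SKICA uses this kind of
    label information to bias ICA toward class-separating components."""
    Sw = np.zeros((Z.shape[1], Z.shape[1]))
    for c in np.unique(y):
        D = Z[y == c] - Z[y == c].mean(axis=0)
        Sw += D.T @ D
    return Sw
```

On whitened features `Z`, a standard ICA step would then search for a rotation maximizing non-Gaussianity; SKICA additionally penalizes directions with large within-cluster scatter, per the abstract.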
