Abstract
Feature extraction is an important preprocessing step in many research areas. For anomaly detection, feature extraction should not only extract the most important features hidden in the data, but also discriminate between different classes of samples; the latter property is usually referred to as discriminative ability. Data collected from production systems usually do not follow a Gaussian distribution and may correspond to nonlinear mixtures of independent components. To cope with non-Gaussian data and perform nonlinear feature extraction, this article proposes a feature extraction algorithm based on Supervised Independent Component Analysis with Kernel (termed SKICA). SKICA first adopts Kernel Principal Component Analysis (KPCA) to whiten the data. Then, by virtue of the within-cluster scatter matrix derived from Linear Discriminant Analysis (LDA), SKICA extends Independent Component Analysis (ICA) to the supervised setting by incorporating within-cluster information into the estimation of independent components. This improvement allows SKICA to obtain independent components that are more beneficial for separating different classes of samples. To quantitatively measure the discriminative ability of the feature extraction algorithms involved in the experiments, this article defines three kinds of average squared distance. Experiments on artificial datasets, the Cloud dataset, and the KDD Cup dataset evaluate the effectiveness of SKICA. The experimental results show that SKICA outperforms several popular supervised feature extraction algorithms, including LDA, LDA with kernel (KDA), and supervised ICA (SICA).
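The two building blocks named in the abstract, KPCA whitening and the LDA within-cluster scatter matrix, can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: an RBF kernel is assumed, the function names are illustrative, and the actual SKICA rotation step that uses the scatter matrix is omitted.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Pairwise RBF kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)
    d2 = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2 * X @ X.T
    return np.exp(-gamma * d2)

def kpca_whiten(X, n_components=2, gamma=1.0):
    """KPCA whitening: project into feature space and scale each
    principal direction to zero mean and unit variance (first SKICA step)."""
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    # Center the kernel matrix in feature space
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]  # top eigenpairs
    # For a centered kernel, sqrt(n) * eigenvectors are the whitened
    # training projections (zero mean, identity covariance).
    return np.sqrt(n) * vecs[:, idx]

def within_class_scatter(Z, y):
    """Within-cluster scatter matrix S_w from LDA; SKICA uses this kind of
    label information to bias ICA toward class-separating components."""
    Sw = np.zeros((Z.shape[1], Z.shape[1]))
    for c in np.unique(y):
        D = Z[y == c] - Z[y == c].mean(axis=0)
        Sw += D.T @ D
    return Sw
```

On whitened features `Z`, a standard ICA step would then search for a rotation maximizing non-Gaussianity; SKICA additionally penalizes directions with large within-cluster scatter, per the abstract.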
