An experimental study: An interpretative division method on principal component analysis

Abstract

Numerical relationships in multi-feature data receive wide attention in data preprocessing, whereas the semantic interpretation of the features has received far less. We fully acknowledge the importance of numerical relationships, but we argue that the interpretative relationships in the data are important as well. In this paper, we regard principal component analysis (PCA) as a special case of numerical relationships. We propose an interpretative division method for PCA and its improved algorithms, developed from an explanatory perspective. Our method integrates numerical data analysis with a semantic understanding of the problem. Experiments on real data sets show that our method performs well and outperforms the corresponding PCA algorithms. On these data sets, we also find that interpretative features with small eigenvalues are better choices than the principal components of PCA.

Keywords

Interpretative data analysis, multi-feature data analysis, feature reduction, data preprocess, principal component analysis

