Learning from examples with data reduction and stacked generalization

Abstract

Data reduction can improve the generalization ability of a learning model and shorten learning time. It can be particularly helpful in analyzing big data sets. This paper focuses on machine learning from examples with data reduction. Data reduction is carried out by selecting relevant instances, called prototypes. The approach is based on the assumption that prototypes are selected by a team of agents, and that they are drawn from clusters of instances under the constraint that each cluster yields a single prototype. A kernel-based fuzzy clustering algorithm is used for cluster initialization. The main feature of the proposed approach is the integration of data reduction with the stacking technique. Stacked generalization ensures diversification among prototypes and, hence, among base classifiers. To validate the proposed approach, we carried out a computational experiment. We also evaluated experimentally the influence of the clustering method and the number of stacking folds on classification accuracy.
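
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: plain k-means stands in for the kernel-based fuzzy clustering, the instance nearest to each cluster mean stands in for the agent-based prototype search, base classifiers are 1-NN over the selected prototypes and differ only in the number of clusters per class, and the meta-level classifier is a simple lookup table. All function and parameter names (`select_prototypes`, `fit_stacked`, `cluster_counts`, and so on) are hypothetical.

```python
import random
from collections import Counter

def dist2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(group):
    """Component-wise mean of a non-empty list of tuples."""
    return tuple(sum(col) / len(group) for col in zip(*group))

def kmeans(points, k, iters=10, seed=0):
    """Plain k-means (assumed stand-in for kernel-based fuzzy clustering).
    Returns the non-empty clusters as lists of points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    groups = []
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist2(p, centers[c]))
            groups[nearest].append(p)
        centers = [mean(g) if g else centers[c] for c, g in enumerate(groups)]
    return [g for g in groups if g]

def select_prototypes(X, y, k):
    """Cluster each class separately and keep one prototype per cluster:
    the instance closest to the cluster mean (assumed selection rule)."""
    protos = []
    for label in sorted(set(y)):
        pts = [x for x, t in zip(X, y) if t == label]
        for group in kmeans(pts, min(k, len(pts))):
            center = mean(group)
            protos.append((min(group, key=lambda p: dist2(p, center)), label))
    return protos

def predict_1nn(protos, x):
    """Label of the prototype nearest to x."""
    return min(protos, key=lambda pl: dist2(pl[0], x))[1]

def fit_stacked(X, y, folds=3, cluster_counts=(1, 2)):
    """Stacking: out-of-fold predictions of the base classifiers form the
    training data of a lookup-table meta-classifier."""
    n = len(X)
    parts = [list(range(i, n, folds)) for i in range(folds)]
    meta_rows = [[None] * len(cluster_counts) for _ in range(n)]
    for held in parts:
        held_set = set(held)
        Xtr = [X[i] for i in range(n) if i not in held_set]
        ytr = [y[i] for i in range(n) if i not in held_set]
        for j, k in enumerate(cluster_counts):
            protos = select_prototypes(Xtr, ytr, k)
            for i in held:
                meta_rows[i][j] = predict_1nn(protos, X[i])
    # Meta-classifier: majority true label for each pattern of base outputs.
    votes = {}
    for row, label in zip(meta_rows, y):
        votes.setdefault(tuple(row), Counter())[label] += 1
    meta = {key: c.most_common(1)[0][0] for key, c in votes.items()}
    # Final base classifiers use prototypes drawn from the full training set.
    bases = [select_prototypes(X, y, k) for k in cluster_counts]
    return bases, meta

def predict_stacked(model, x):
    """Combine base predictions through the meta-level table."""
    bases, meta = model
    outs = tuple(predict_1nn(b, x) for b in bases)
    # Unseen pattern of base outputs: fall back to a plain majority vote.
    return meta.get(outs, Counter(outs).most_common(1)[0][0])
```

With instances given as tuples of floats and labels as a parallel list, `fit_stacked(X, y)` returns a model that `predict_stacked(model, x)` can query. Each base classifier stores at most `k` prototypes per class rather than the full training set, which is where the data reduction enters.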

Keywords

Learning from big data, data reduction, stacked generalization, kernel-based clustering

