A Novel Approach to Model Ensembled-Based ANFIS for Big Data

Abstract

This study develops a new ANFIS-based ensemble modeling approach that targets high prediction accuracy and generalization capability on large datasets. The proposed approach uses the parallel processing capacity of the MapReduce programming model to divide a large dataset into smaller chunks and to train an independent ANFIS model on each chunk. The input and output membership functions obtained from the trained models are transferred directly to the new architecture, while the rule bases are integrated using a rule adjustment function. The number of rules is significantly lower than in the classical ANFIS structure, which reduces the computational cost and keeps model complexity manageable. Traditional ensemble approaches in the literature generally combine the output values of the individual models; in this study, by contrast, the ANFIS structures obtained from each subset of the data are combined into a single ANFIS-based ensemble model. The results demonstrate that a single ensemble architecture that covers the entire large dataset and has high generalization capability was successfully created.
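The split, train, and merge flow described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a one-dimensional input, replaces ANFIS training with a simple zero-order Sugeno fit, and uses a hypothetical merge rule (pool all rules, then average those whose membership function centers lie within a tolerance) as a stand-in for the rule adjustment function, whose details are not given in the abstract. The names `Rule`, `fit_chunk`, `merge_rules`, and `predict` are illustrative only.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Rule:
    center: float  # center of the Gaussian input membership function
    sigma: float   # width of the Gaussian input membership function
    output: float  # constant (zero-order Sugeno) consequent


def gauss(x: float, center: float, sigma: float) -> float:
    """Gaussian membership degree of x."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def fit_chunk(chunk: Sequence[Tuple[float, float]], n_rules: int = 3) -> List[Rule]:
    """Map step: build a small fuzzy model from one chunk.

    Assumption: this stands in for real ANFIS training. Centers are spaced
    evenly over the chunk's input range, and each consequent is the
    membership-weighted mean of the targets. Needs n_rules >= 2 and at
    least two distinct x values in the chunk.
    """
    xs = [x for x, _ in chunk]
    lo, hi = min(xs), max(xs)
    step = (hi - lo) / (n_rules - 1)
    rules = []
    for i in range(n_rules):
        center = lo + i * step
        weights = [gauss(x, center, step) for x in xs]
        output = sum(w * y for w, (_, y) in zip(weights, chunk)) / sum(weights)
        rules.append(Rule(center, step, output))
    return rules


def merge_rules(rule_sets: List[List[Rule]], tol: float) -> List[Rule]:
    """Reduce step: pool all rules and average those with nearby centers.

    Assumption: a hypothetical stand-in for the rule adjustment function.
    """
    pooled = sorted((r for rs in rule_sets for r in rs), key=lambda r: r.center)
    groups: List[List[Rule]] = []
    for rule in pooled:
        # Compare against the first rule of the group to avoid chaining.
        if groups and rule.center - groups[-1][0].center <= tol:
            groups[-1].append(rule)
        else:
            groups.append([rule])
    return [
        Rule(
            sum(r.center for r in g) / len(g),
            sum(r.sigma for r in g) / len(g),
            sum(r.output for r in g) / len(g),
        )
        for g in groups
    ]


def predict(rules: List[Rule], x: float) -> float:
    """Weighted-average defuzzification over the rule consequents."""
    weights = [gauss(x, r.center, r.sigma) for r in rules]
    return sum(w * r.output for w, r in zip(weights, rules)) / sum(weights)


# Toy data: y = 2x on [0, 10), split into four interleaved chunks.
data = [(i / 10, 2 * i / 10) for i in range(100)]
chunks = [data[k::4] for k in range(4)]

# In a real MapReduce job the map step would run in parallel, one task per chunk.
rule_sets = [fit_chunk(chunk) for chunk in chunks]
merged = merge_rules(rule_sets, tol=0.5)

print(sum(len(rs) for rs in rule_sets), "rules before merging,", len(merged), "after")
```

In this toy setup the four chunk models contribute three rules each, and the merge collapses the near-duplicate rules into a single three-rule model, which mirrors the rule reduction the abstract describes. The result is one fuzzy structure rather than an average of four model outputs.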

Plain Language Summary

This study presents a new intelligent prediction model developed to analyze large data sets more effectively. Today, the rapid growth of data volume makes it difficult for traditional methods to process this data. An artificial intelligence model called ANFIS was used to address this problem. ANFIS creates logical rules from data and makes predictions using these rules. With large data sets, however, it runs into limits on processing capacity and time.

In this study, large data sets were divided into smaller pieces, a separate model was created for each piece, and these models were then combined into a single model. This method reduces data processing time and increases prediction accuracy. In addition, a special adjustment function was designed to reduce the number of unnecessary rules in the model. As a result, the model made predictions both faster and more accurately.

The performance of the proposed model was compared with that of other widely used models, and it produced superior results. This indicates that the new model can make more accurate and reliable predictions on large data sets.

Keywords

ANFIS; MapReduce; ensemble; fuzzy rule base integration; big data; PSO-ANFIS; GA-ANFIS

