Abstract
This study aims to develop a new ANFIS-based ensemble modeling approach that provides high prediction accuracy and generalization capability on large datasets. The proposed approach utilizes the parallel processing capacity of the MapReduce algorithm to divide large datasets into smaller chunks and create and train independent ANFIS models for each chunk. While the input and output membership functions obtained from the trained structures are directly transferred to the new architecture, the rule bases are integrated using the rule adjustment function. The number of rules has been significantly reduced compared to the classical ANFIS structure. In this way, both the computational cost has been reduced and the model complexity has been effectively managed. In traditional ensemble approaches found in the literature, the output values of the models are generally combined, whereas in this study, the proposed approach combines the ANFIS structures obtained from each subset of the data to create a single ANFIS-based ensemble model. The obtained results demonstrate that a single ensemble system architecture, encompassing the entire large dataset and possessing high generalization capability, has been successfully created.
Plain Language Summary
This study presents a new intelligent prediction model developed to analyze large data sets more effectively. Today, the rapid growth of data volume causes traditional methods to experience difficulties in processing this data. An artificial intelligence model called ANFIS was used to solve this problem. ANFIS creates logical rules using data and makes predictions using these rules. However, working capacity and time interval restrictions occur with large data sets.
In this study, separate models were created for each piece by dividing large data sets into smaller pieces and then these models were combined into a single model. With this method, data processing time is reduced and prediction accuracy is increased. Additionally, a special editing function is designed that reduces the number of unnecessary rules in the model. In this way, the model made predictions both faster and more accurately.
In the study, the performance of the proposed model has been compared with other widespread and it has been proven that superior results are obtained. As a result, it has been proven that this new model can make more accurate and reliable predictions in large data sets.
Get full access to this article
View all access options for this article.
