Abstract
Network Traffic Classification (NTC) is an important technology for network management, traffic control, and security detection. With the development of high-speed, large-scale, complex networks, NTC faces challenges in storing and processing massive volumes of network traffic. Although a few NTC approaches are based on cloud computing, their parallel computing models have not received enough attention. In this paper, we propose a novel Parallelized Network Traffic Classification framework based on Selective Ensemble and Diversity Measures (PNTC-SE-DM), which processes large-scale network traffic data in parallel using the MapReduce architecture. In particular, PNTC-SE-DM includes a new method for selecting the classifiers used in ensemble classification; the selection takes into account both the prediction accuracy of each individual classifier and the diversity among the classifiers. The experimental results demonstrate that the new approach can handle large-scale network traffic data and performs favorably in terms of speedup, sizeup, and accuracy.
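To make the selection idea concrete, the following is a minimal sketch of choosing ensemble members by combining individual accuracy with diversity. It is not the PNTC-SE-DM algorithm itself, which the abstract does not specify. The greedy strategy, the pairwise disagreement measure, the `weight` parameter, and all function names are assumptions introduced for illustration only, and the toy predictions are made up.

```python
def accuracy(preds, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def disagreement(a, b):
    """Pairwise diversity: fraction of samples on which two classifiers differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def select_ensemble(predictions, labels, k, weight=0.7):
    """Greedily pick k classifiers (hypothetical scheme, not the paper's).

    Starts from the most accurate classifier, then repeatedly adds the one
    maximizing weight * accuracy + (1 - weight) * mean disagreement with
    the classifiers already selected.
    """
    remaining = sorted(predictions)  # sorted names make tie-breaking deterministic
    first = max(remaining, key=lambda n: accuracy(predictions[n], labels))
    selected = [first]
    remaining.remove(first)

    while remaining and len(selected) < k:
        def score(name):
            diversity = sum(
                disagreement(predictions[name], predictions[s]) for s in selected
            ) / len(selected)
            return weight * accuracy(predictions[name], labels) + (1 - weight) * diversity

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy validation-set predictions from four hypothetical base classifiers.
labels = [1, 0, 1, 1, 0, 0]
preds = {
    "c1": [1, 0, 1, 1, 0, 1],
    "c2": [1, 0, 1, 1, 0, 1],  # identical to c1: accurate but adds no diversity
    "c3": [0, 0, 1, 1, 0, 0],  # equally accurate, errs on a different sample
    "c4": [1, 1, 0, 1, 0, 0],  # more diverse but less accurate
}
chosen = select_ensemble(preds, labels, k=2)
print(chosen)  # ['c1', 'c3']
```

In this toy case the duplicate `c2` is passed over in favor of `c3`, which is equally accurate but makes different errors. In a MapReduce setting, one plausible arrangement would train or score the base classifiers on data partitions in map tasks and run a selection step like this in a reduce task, though that mapping is also an assumption.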
