Abstract
This paper studies the recognition and mining of encrypted network behavior in large-volume network data environments, and proposes a fast online recognition method that combines the correlation coefficient with the k-nearest neighbor (KNN) algorithm. Taking encrypted Twitter traffic as the research object, a large number of encrypted Twitter network behaviors, including message sending, picture sending, and other behaviors, were analyzed. Statistical features that characterize each encrypted network behavior were then extracted, and a correlation-coefficient-based sample library of encrypted network behaviors was established. Next, interactive network data were collected in real time, and the correlation coefficient between the collected data and the sample library was calculated in order to overcome noise interference from similar data traffic. The data packets that passed this similarity filtering were classified as true or false behavior by the KNN algorithm, and the encrypted network behavior was then identified automatically using the default threshold of the correlation coefficient in a big data environment. Compared with the traditional correlation coefficient method, the recognition efficiency of the proposed method was greatly improved, reaching about 94%. On this basis, combined with network vulnerability analysis, web crawling, and virtual identity mining, comprehensive mining of encrypted network behavior was successfully realized in a big data environment.
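The two-stage idea described above (correlation-based similarity filtering followed by KNN classification into true or false behavior) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the feature vectors (hypothetical packet-length sequences), the behavior labels, the threshold of 0.9, the value k = 3, the Euclidean distance, and all function names are assumptions made purely for illustration.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant sequence has no defined correlation
    return cov / (sx * sy)


def best_match(features, library):
    """Return (behavior, r) for the library sample most correlated with features."""
    scored = ((name, pearson(features, sample)) for name, sample in library)
    return max(scored, key=lambda pair: pair[1])


def knn_predict(features, training, k=3):
    """Majority vote among the k training vectors nearest in Euclidean distance."""
    nearest = sorted(training, key=lambda item: math.dist(features, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def recognize(features, library, training, threshold=0.9, k=3):
    """Stage 1: correlation filter. Stage 2: KNN true/false check (assumed order)."""
    behavior, r = best_match(features, library)
    if r < threshold:
        return None  # not similar enough to any known behavior
    if knn_predict(features, training, k) != "true":
        return None  # similar shape, but KNN judges it a false behavior
    return behavior


# Hypothetical sample library: one feature vector per behavior.
library = [
    ("send_message", [120, 300, 80, 60]),
    ("send_picture", [200, 1500, 1400, 900]),
]

# Hypothetical labelled flows for the KNN stage.
training = [
    ([120, 300, 80, 60], "true"),
    ([115, 305, 85, 62], "true"),
    ([125, 295, 78, 59], "true"),
    ([240, 600, 160, 120], "false"),  # same shape, different scale
    ([235, 610, 158, 118], "false"),
    ([60, 150, 40, 30], "false"),
]

genuine = recognize([118, 310, 82, 58], library, training)
lookalike = recognize([238, 604, 161, 119], library, training)
unrelated = recognize([900, 100, 950, 120], library, training)
```

In this toy data the look-alike flow is almost perfectly correlated with the `send_message` sample because correlation ignores scale, so only the KNN stage rejects it. That is one plausible reading of why the two techniques are combined, offered here as an interpretation and not as a claim from the paper.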
Keywords
