Abstract
Detecting sensitive geographic information on the Internet is hampered by the variety of geographic information types to be checked, the reliance on single detection methods, and low detection accuracy. This study therefore focuses on detecting classified geographic information in Internet maps. Libraries of single and combined sensitive words are first constructed. The similarity between words in the sensitive word library and feature words is then calculated, along with the sensitivity of each map file. Based on this geographic information sensitivity, the presence of classified geographic information in map files is detected, and the detection is implemented under the Spark computing framework. After the combined sensitive vocabulary was added, detection accuracy and recall rate improved significantly, and the F-measure value increased by 0.07. For map sensitive information detection, the optimal accuracy values of the proposed method, the sensitive text detection model based on conditional random fields, and the sensitive word detection algorithm are 0.74, 0.65, and 0.59, respectively. The optimal recall rates are 0.84, 0.76, and 0.68, respectively, and the optimal F-measure values are 0.78, 0.70, and 0.64, respectively. The detection accuracy, recall rate, and F-measure of the Spark cluster mode are better than those of the single-machine mode, indicating that parallelized sensitive geographic information detection has good overall performance. These results support the detection of sensitive geographic information on Internet maps.
