Abstract
Incremental methods for news topic detection depend heavily on the order in which documents arrive, which causes clustered topics to drift. This drift in turn impairs topic recognition. This article proposes a Dynamic Subtopic Detection Model for Temporal Text (DSDTT), which designs leader documents for the topics in each time window and uses them to establish both related and independent relationships between those topics. The proposed “Inflection Point Analysis Method” and “CrossMountain” pattern address the scattered clustering and low discrimination that arise when the number of topics is chosen by minimum perplexity. The model also automatically constructs subtopic evolution scenarios based on time windows, effectively alleviating topic drift. Experiments covered three aspects: the optimal number of topics versus topic perplexity, dynamic subtopic detection, and subtopic evolution relationship scenarios. The results verify that DSDTT outperforms the topic perplexity method in selecting the optimal number of topics and tracks subtopics more accurately in dynamic subtopic detection.
