Abstract
The operation platform of a power system holds a large volume of multisource data, and retrieving this data efficiently and accurately is difficult. To address this issue, this paper first studies data mining techniques based on term frequency-inverse document frequency (TF-IDF) and Word2Vec, which are used to extract keywords and features from the operational data of the power grid system. It then proposes an improved decision tree (IDT) algorithm based on mutual information and parallel computing, and constructs a decision tree (DT) model on this basis. Finally, simulation experiments on several system databases validate the effectiveness and advantages of the proposed IDT model. The experimental results show that the IDT algorithm achieves higher mining accuracy than the traditional ID3 algorithm, with accuracies of up to 99.72% across different datasets. The model also improves retrieval efficiency significantly, handling large-scale data with reduced processing time. In addition, a database for the power grid supervision and management (SM) system is introduced to verify the effectiveness of the data mining technology and the advantages of the proposed IDT algorithm in retrieving key information.
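The abstract names TF-IDF keyword weighting and a mutual-information split criterion but gives no formulas. As a rough illustration only, the sketch below uses the textbook definitions of both quantities; it is not the paper's IDT algorithm, it omits Word2Vec and the parallel computing step, and the function names and toy records are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses tf = count / document length and idf = ln(N / df),
    the plain textbook form (an assumption, not the paper's variant).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def mutual_information(feature, labels):
    """Mutual information I(X; Y) in bits between two discrete sequences."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    px = Counter(feature)
    py = Counter(labels)
    # Sum p(x, y) * log2(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


# Hypothetical tokenized grid operation records.
docs = [
    ["transformer", "fault", "alarm"],
    ["transformer", "load", "report"],
    ["breaker", "fault", "alarm"],
]
weights = tf_idf(docs)

# Hypothetical attribute columns and class labels for choosing a split.
labels = ["y", "y", "n", "n"]
informative = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]
best = max(
    [("informative", informative), ("uninformative", uninformative)],
    key=lambda item: mutual_information(item[1], labels),
)[0]
```

In a decision tree built this way, the attribute with the highest mutual information with the class label would be chosen as the split at each node, and terms with high TF-IDF weight in a record would be treated as its keywords.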
