Abstract
With the rapid advancement of information technology, the digitization and intelligent management of electronic teaching materials have become increasingly important. Traditional methods struggle with the complex formatting and structural diversity of textbooks, which limits their effectiveness in content extraction and classification. To address this, we propose VB-GCN, a framework that integrates BERT for semantic feature extraction, a variational autoencoder (VAE) for feature compression, and a graph convolutional network (GCN) for structural modeling. The framework uses OCR-extracted text to build graph-based representations of textbook content, enabling effective hierarchical classification. Experiments on both a public dataset and a real textbook corpus show that VB-GCN outperforms competitive baselines such as TextRCNN and XLNet in Precision, Recall, and F1-score. Ablation studies further confirm the importance of combining BERT with a VAE for robust feature learning.
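The data flow named in the abstract (semantic features, VAE compression, graph convolution) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the input feature vectors stand in for BERT embeddings of OCR-extracted text blocks, the VAE is reduced to a linear encoder with the reparameterization step (no decoder or KL loss), the two-layer GCN with symmetric normalization is the standard textbook formulation, and every function and parameter name (`vae_encode`, `gcn_layer`, `predict`, `w_mu`, `w_logvar`, `w_hidden`, `w_out`) is a hypothetical choice made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the propagation matrix of a standard GCN."""
    n = len(adj)
    # Add self-loops so each node keeps its own features during propagation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[a_hat[i][j] * inv_sqrt_deg[i] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]


def vae_encode(features, w_mu, w_logvar, rng):
    """Compress features to a latent sample z = mu + sigma * eps (assumed linear encoder)."""
    mu = matmul(features, w_mu)
    logvar = matmul(features, w_logvar)
    return [[m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu_row, lv_row)]
            for mu_row, lv_row in zip(mu, logvar)]


def gcn_layer(a_norm, h, w, relu=True):
    """One graph convolution: A_norm @ H @ W, optionally followed by ReLU."""
    out = matmul(matmul(a_norm, h), w)
    if relu:
        out = [[max(v, 0.0) for v in row] for row in out]
    return out


def predict(features, adj, params, rng):
    """Assumed pipeline: node features -> VAE latent codes -> two GCN layers -> class index."""
    z = vae_encode(features, params["w_mu"], params["w_logvar"], rng)
    a_norm = normalize_adjacency(adj)
    hidden = gcn_layer(a_norm, z, params["w_hidden"])
    logits = gcn_layer(a_norm, hidden, params["w_out"], relu=False)
    return [row.index(max(row)) for row in logits]
```

Calling `predict` with one feature row per text block, a square adjacency matrix linking related blocks, and weight matrices of compatible shapes returns one predicted class index per block. In a trained system the weights would be learned jointly rather than supplied by hand.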
