Abstract
The increasing availability of fine-grained student behavior data on smart campuses offers significant opportunities for personalized education. However, traditional clustering methods applied to such structured data often fail to capture the semantic complexity of behavioral features and the relational dependencies among individuals. To address both challenges, we propose a deep unsupervised clustering framework that integrates Bidirectional Encoder Representations from Transformers (BERT) and Graph Attention Networks (GAT). Because raw numerical features lack contextual depth, our approach first transforms structured data into natural language profiles and uses a pretrained BERT model to extract semantically rich embeddings from them. These per-student embeddings serve as node features in a student behavior graph, where a GAT module refines them by capturing relational structures and inter-student similarities. The combined embeddings improve the performance of multiple clustering algorithms in identifying distinct behavioral patterns across students. In addition, we introduce a hierarchical anomaly detection module that identifies unstable behavior clusters based on intra-cluster variance and outlier individuals based on local density. Experimental results on real-world campus datasets demonstrate the framework’s effectiveness, and further analysis highlights its practical utility in uncovering early indicators of academic risk through interpretable behavioral modeling.
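As an illustration only, two of the steps named in the abstract can be sketched in plain Python: rendering a structured record as a natural language profile, and a two-level anomaly check that uses intra-cluster variance at the cluster level and a nearest-neighbor distance as a stand-in for local density at the individual level. The field names (`library_visits`, `late_returns`), the sentence template, the thresholds `var_ratio` and `dist_ratio`, and the neighbor count `k` are all hypothetical choices made for this sketch; the abstract does not specify the actual profile format, density estimator, or decision rules. The BERT and GAT stages are omitted, so the points below stand in for the learned embeddings.

```python
import math
from statistics import mean, median


def record_to_profile(record):
    """Render one structured record as a sentence (hypothetical fields and template)."""
    return (
        f"This student visited the library {record['library_visits']} times "
        f"and returned to the dormitory late {record['late_returns']} times."
    )


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(mean(col) for col in zip(*points))


def intra_cluster_variance(points):
    """Mean squared Euclidean distance from each point to the cluster centroid."""
    c = centroid(points)
    return mean(math.dist(p, c) ** 2 for p in points)


def knn_mean_distance(points, i, k):
    """Mean distance from point i to its k nearest neighbors in the same cluster.

    A larger value means a lower local density around the point.
    """
    dists = sorted(math.dist(points[i], q) for j, q in enumerate(points) if j != i)
    return mean(dists[:k])


def detect_anomalies(clusters, var_ratio=3.0, dist_ratio=3.0, k=2):
    """Two-level check (assumed rules, not the paper's).

    Level 1: a cluster is 'unstable' if its variance exceeds var_ratio times
             the median variance over all clusters.
    Level 2: a point is an 'outlier' if its k-NN mean distance exceeds
             dist_ratio times the median k-NN mean distance in its cluster.
    """
    variances = {cid: intra_cluster_variance(pts) for cid, pts in clusters.items()}
    baseline = median(variances.values())
    unstable = [cid for cid, v in variances.items() if v > var_ratio * baseline]

    outliers = []
    for cid, pts in clusters.items():
        if len(pts) <= k:
            continue  # too few points to estimate local density
        scores = [knn_mean_distance(pts, i, k) for i in range(len(pts))]
        typical = median(scores)
        for i, s in enumerate(scores):
            if typical > 0 and s > dist_ratio * typical:
                outliers.append((cid, i))
    return unstable, outliers


if __name__ == "__main__":
    print(record_to_profile({"library_visits": 12, "late_returns": 3}))
    toy_clusters = {
        "a": [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)],
        "b": [(10.0, 10.0), (10.1, 10.0), (10.0, 10.1), (10.1, 10.1)],
        "c": [(20.0, 20.0), (30.0, 20.0), (20.0, 30.0), (30.0, 30.0)],
    }
    print(detect_anomalies(toy_clusters))
```

In this toy setup the widely spread cluster is flagged at the cluster level, while the isolated point in an otherwise tight cluster is flagged at the individual level, which mirrors the two granularities the abstract describes.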
