Abstract
Graph neural networks (GNNs) have achieved excellent results in various graph-based learning tasks. However, the redundant parameters of GNNs and the large-scale graphs used as inputs prevent GNNs from scaling to real-world large-scale graph applications. To address this problem, the graph lottery ticket hypothesis claims the existence of graph lottery tickets (GLTs), combinations of a sparse core subgraph and a sparse subnetwork that can be retrained to achieve performance similar to that of the original input graph and dense network. However, the GLTs identified in existing work lose valuable information due to irreversible pruning schemes. In addition, the performance of GNNs drops significantly when graph sparsity is high. In this paper, we propose a gradual pruning and knowledge distillation (GPKD) framework to compensate for the loss caused by pruning and thereby identify GLTs efficiently. Specifically, we first prune the input graph and the model parameters according to a gradual iterative magnitude pruning strategy and then reset the remaining parameters. After each round of pruning, the pre-trained network and the pruned network are treated as the teacher and student models, respectively, and a knowledge distillation scheme allows the student to mimic the output of the teacher. Experimental results demonstrate that our proposed GPKD framework significantly outperforms the state-of-the-art unified GNN sparsification (UGS) framework.
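The procedure described above combines three ingredients: magnitude-based pruning of both the graph and the weights, rewinding the surviving weights to their initial values, and a distillation loss that ties the pruned student to the pre-trained teacher. Below is a minimal PyTorch sketch of one such round; the function names, mask handling, model forward signature, and hyperparameters (`prune_ratio`, `temperature`, `alpha`) are illustrative assumptions, not the authors' exact implementation.

```python
# A hypothetical sketch of one GPKD-style round, assuming a GNN whose
# forward signature is model(features, adj). Not the paper's actual code.
import copy
import torch
import torch.nn.functional as F

def magnitude_mask(tensor, mask, prune_ratio):
    """Zero out a fraction of the smallest-magnitude entries still alive in `mask`."""
    alive = tensor[mask.bool()].abs()
    k = int(prune_ratio * alive.numel())
    if k == 0:
        return mask
    threshold = alive.kthvalue(k).values
    return mask * (tensor.abs() > threshold).float()

def gpkd_round(model, init_state, adj, adj_mask, weight_masks, prune_ratio,
               features, labels, train_idx, temperature=2.0, alpha=0.5,
               epochs=200, lr=0.01):
    # Teacher: frozen copy of the model as trained before this round's pruning.
    teacher = copy.deepcopy(model).eval()

    # Gradual magnitude pruning: drop a fraction of the remaining graph edges
    # and model weights, then rewind surviving weights to their initial values
    # (init_state captured once via copy.deepcopy(model.state_dict())).
    adj_mask = magnitude_mask(adj, adj_mask, prune_ratio)
    for name, p in model.named_parameters():
        weight_masks[name] = magnitude_mask(p.data, weight_masks[name], prune_ratio)
    model.load_state_dict(init_state)
    for name, p in model.named_parameters():
        p.data *= weight_masks[name]  # apply sparsity mask after rewinding

    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    sparse_adj = adj * adj_mask
    for _ in range(epochs):
        optimizer.zero_grad()
        student_logits = model(features, sparse_adj)
        with torch.no_grad():
            teacher_logits = teacher(features, adj)
        # KD loss: student mimics the teacher's temperature-softened outputs,
        # combined with ordinary cross-entropy on the labeled training nodes.
        kd = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=-1),
            F.softmax(teacher_logits / temperature, dim=-1),
            reduction="batchmean") * temperature ** 2
        ce = F.cross_entropy(student_logits[train_idx], labels[train_idx])
        loss = alpha * kd + (1 - alpha) * ce
        loss.backward()
        optimizer.step()
        for name, p in model.named_parameters():
            p.data *= weight_masks[name]  # keep pruned weights at zero
    return model, adj_mask, weight_masks
```

Repeating `gpkd_round` while the masks still satisfy a target sparsity yields the gradual iterative schedule: each round removes only a small fraction of the remaining edges and weights, and the distillation term compensates for the information lost by pruning.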