Abstract
Graph neural networks (GNNs) have shown impressive performance in a variety of biomedical applications owing to their powerful graph representation capabilities. Despite this success, the data noise and data scarcity commonly encountered in real psychiatric disease prediction scenarios can impair the training and prediction of graph learning models. At present, no existing work offers a satisfactory solution. Data augmentation, which lets limited data deliver value comparable to a larger dataset without substantially increasing the amount of data, is considered a practical approach to addressing noisy and scarce data. In this work, we propose a graph data augmentation method for addressing noisy and scarce data in mental illness prediction. To mitigate the negative effects of label noise, we use edge predictors to optimize the graph topology, strengthening links between highly similar nodes and removing erroneous noisy edges, and we improve model robustness by adding adversarial perturbations in the feature space. In addition, a confident self-checking mechanism yields accurate pseudo-labels, providing more supervision during model training and further reducing the effect of label noise. Extensive experiments on two real-world multimodal mental illness datasets show that the proposed approach achieves better performance. Ablation studies were conducted to assess the effectiveness of each component. The experimental results validate the effectiveness and scalability of our framework for population-based disease prediction, even under challenging conditions of data noise and sparsity. The implementation code is publicly available at: https://github.com/jiachengpan98/GDA-GCN.
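To make two of the ideas above concrete, the following is a minimal illustrative sketch, not the authors' implementation (which is available in the linked repository). It assumes that an edge predictor has already produced a probability per candidate edge and that a classifier has produced per-node class probabilities. The function names `refine_edges` and `confident_pseudo_labels`, the thresholds, and the use of a simple confidence cutoff to stand in for the self-checking mechanism are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


def refine_edges(edges: Set[Edge], edge_prob: Dict[Edge, float],
                 add_thr: float = 0.9, drop_thr: float = 0.1) -> Set[Edge]:
    """Drop existing edges scored below drop_thr; add candidates scored >= add_thr.

    edge_prob is a hypothetical edge-predictor output keyed by (u, v).
    Existing edges without a score are kept. Thresholds are illustrative only.
    """
    refined = {e for e in edges if edge_prob.get(e, 1.0) >= drop_thr}
    refined |= {e for e, p in edge_prob.items() if p >= add_thr}
    return refined


def confident_pseudo_labels(probs: List[List[float]], labeled: Set[int],
                            conf_thr: float = 0.95) -> Dict[int, int]:
    """Pseudo-label each unlabeled node whose top class probability >= conf_thr."""
    pseudo: Dict[int, int] = {}
    for node, p in enumerate(probs):
        if node in labeled:
            continue  # nodes with ground-truth labels are left untouched
        best = max(range(len(p)), key=p.__getitem__)
        if p[best] >= conf_thr:
            pseudo[node] = best
    return pseudo


# Toy usage: three subjects, one labeled, with made-up scores.
edges = {(0, 1), (1, 2)}
edge_prob = {(0, 1): 0.05, (0, 2): 0.95, (1, 2): 0.5}
new_edges = refine_edges(edges, edge_prob)

probs = [[0.99, 0.01], [0.6, 0.4], [0.02, 0.98]]
pseudo = confident_pseudo_labels(probs, labeled={0})
```

In this toy run, the low-scoring edge between subjects 0 and 1 is removed, a high-scoring edge between subjects 0 and 2 is added, and only subject 2 receives a pseudo-label because subject 1's prediction is not confident enough.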
