Abstract
This study used machine learning methods to identify risk and protective factors for bullying perpetration in adolescents. The sample consisted of 777 students aged 11 to 16 years. Multidimensional data were collected at both the individual and environmental levels. Individual factors included moral disengagement, normative beliefs about aggression, neuroticism, and self-control; environmental factors included parent–child relationships, deviant peer affiliation, school connection, and violent media exposure. Six machine learning algorithms were tested and compared: Logistic Regression, Random Forest, Gradient Boosting Decision Tree, XGBoost, LightGBM, and Stacking. The results showed that (a) the Random Forest algorithm performed best, with recall, F1 score, and area under the curve values of 0.9394, 0.8516, and 0.8043, respectively; and (b) both Gini importance and SHapley Additive exPlanations (SHAP) values identified self-control as the most important protective factor and moral disengagement as the most influential risk factor. The recommended model has practical value for bullying prevention and provides a scientific basis for developing targeted interventions.
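For readers unfamiliar with the three evaluation metrics reported above, the following minimal sketch shows how recall, F1 score, and area under the ROC curve are computed for a binary classifier. It is an illustration only: the labels, scores, and the 0.5 decision threshold are hypothetical toy values, not the study's data or settings, and the helper names are assumptions introduced here.

```python
# Toy illustration of recall, F1, and ROC AUC for a binary classifier.
# All data below are hypothetical and unrelated to the study's sample.

def recall_and_f1(y_true, y_pred):
    """Return (recall, F1) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = bullying perpetration) and predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

recall, f1 = recall_and_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"recall={recall:.4f}  F1={f1:.4f}  AUC={auc:.4f}")
```

Recall is the share of true positive cases the model catches, so a high recall such as the one reported for Random Forest means few actual perpetrators are missed; F1 balances that against precision, and AUC summarizes ranking quality across all thresholds.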
