Abstract
Sociability plays a critical role in students’ social interaction, collaboration, and emotional well-being. This study examines factors associated with student sociability using data from the 2021 OECD Survey on Social and Emotional Skills, focusing on Canadian students aged 10 and 15 (n = 5,440). A machine learning approach was employed to model sociability as a continuous outcome using multiple regression-based algorithms. Results indicate that cooperation, optimism, sense of belonging, and students’ energy levels are significant predictors of sociability. Among the models evaluated, Support Vector Regression and Lasso Regression demonstrated the strongest predictive performance, exhibiting the lowest error rates and highest explained variance. These findings highlight the close relationship between social and emotional skills and student sociability and underscore the importance of educational policies that support students’ social engagement and emotional development. Future research may extend this work by examining how social and emotional skills relate to academic outcomes.
Sociability, the ability to engage in social interactions, establish relationships, and maintain these connections over time, is an important aspect of students’ learning process (Boswell et al., 2020). Research shows that sociability influences not only students’ academic success but also their emotional well-being and personal growth (Latorre-Cosculluela et al., 2022). For example, a student who actively participates in classroom discussions and easily forms friendships exhibits high sociability (Kirschner & Kreijns, 2005). Conversely, students who withdraw from peers or avoid social interactions may struggle to form meaningful relationships. Social-emotional skills, such as motivation, cooperation, empathy, and communication, are necessary for successful interpersonal interactions among students (Lee & Shute, 2009). These skills enable students to understand others’ emotions, express themselves clearly, and maintain effective relationships. Many scholars have identified the importance of these skills for students’ cognitive development and academic success (e.g., Getahun Abera, 2023; Hachem et al., 2022; Taylor et al., 2017). Also, programs and interventions designed to target students’ social and emotional skills have been effective in improving academic success (Durlak et al., 2011).
In the Canadian education context, identifying key factors that affect students’ sociability is essential to creating supportive and inclusive school environments (Campbell, 2021; Ryan & Patrick, 2001). Canada’s multicultural society provides a unique context for studying sociability, as the diversity of students’ cultural backgrounds (i.e., international and domestic) can enrich interactions or pose challenges (Barrett, 2018; De Leersnyder et al., 2022). It is therefore important to explore how cultural diversity and other factors, such as school environment, family background, and student-teacher and peer interactions, influence sociability (Kutsyuruba et al., 2015). Supportive school and family environments encourage social engagement, whereas limited support may hinder students’ sociability (Allen et al., 2018; Smith et al., 2020). Also, positive student–teacher relationships further strengthen students’ confidence and willingness to interact with peers (Ibrahim & El Zaatari, 2020; Zheng, 2021). In sum, a better understanding of the factors that affect Canadian students’ sociability can help foster a supportive and inclusive educational environment for students, considering the multicultural nature of the education system in Canada.
Although several studies highlight the role of social and emotional skills in cognitive development and academic success (e.g., Franco et al., 2017; Hachem et al., 2022; Taylor et al., 2017), fewer studies focus directly on sociability as an outcome, especially within Canada’s multicultural context. In addition, much of the existing literature relies on traditional statistical approaches that examine isolated predictors, which may not adequately capture the complex, non-linear relationships among multiple social-emotional constructs. As a result, there remains limited empirical evidence identifying which social-emotional skills most strongly predict sociability and how predictive performance varies across different modeling approaches. Large-scale datasets, such as the Organization for Economic Co-operation and Development (OECD) Survey on Social and Emotional Skills, offer a unique opportunity to address these gaps through a more comprehensive, data-driven framework.
Our study aims to identify factors that predict students’ sociability levels in the Canadian context. We examine how various social-emotional skills influence students’ ability to form and maintain social relationships. Using large-scale data from the OECD survey on social and emotional skills, this study employs machine learning techniques to analyze the interactions between social-emotional factors and their impact on sociability. The OECD survey is an extensive international initiative aimed at measuring and comparing social and emotional skills among students across different countries (OECD, 2023). Using this large-scale dataset will enhance the robustness and reliability of the findings, enabling an in-depth analysis of the factors influencing students’ sociability. Although several studies have explored the effects of social and emotional skills on students’ academic performance (e.g., Ahmed et al., 2020; Corcoran et al., 2018; Hachem et al., 2022), the current study is the first attempt to utilize the OECD survey on social and emotional skills dataset with a specific focus on Canadian students. By utilizing this dataset, we aim to address a significant gap in the literature on the impact of social and emotional skills on students’ sociability within the Canadian context. Findings from this study will enhance our understanding of how students’ sociability evolves within Canada’s multicultural and diverse community. The primary research questions underlying this study are: (1) Which social-emotional factors best predict students’ sociability? and (2) Which machine learning model best predicts students’ sociability? Our study aims to fill the gaps in existing research by exploring the connections between sociability prediction, social-emotional skills, and predictive modeling using machine learning.
Literature Review
Social and Emotional Skills
Social and emotional skills play a central role in students’ academic, social, and developmental outcomes. Grounded in Social Cognitive Theory (Bandura, 2001), these skills include cooperation, empathy, communication, self-regulation, and relationship building, which shape how students interpret and respond to social environments (Snyder, 2014). Prior studies consistently show that social and emotional skills support academic achievement, emotional well-being, and long-term life outcomes (Bulut et al., 2020; Hachem et al., 2022; Taylor et al., 2017). As education systems increasingly emphasize holistic development, social and emotional learning has become a core focus beyond cognitive outcomes alone (Burroughs & Barkauskas, 2017; Devis-Rozental & Farquharson, 2020; Newman & Dusenbury, 2015). Social-emotional skills have also become essential for developing diverse individuals in the constantly shifting education scene (Devis-Rozental & Farquharson, 2020). Within the broad spectrum of social-emotional skills, socialization serves as a bridge, facilitating their practical application in real-world interactions. Through socialization, students can practice and enhance their communication, empathy, and relationship-building abilities, thereby reinforcing their social-emotional development.
Students’ socialization with one another is recognized in contemporary educational institutions as an important part of students’ overall growth (Kahu & Nelson, 2018). This includes how students develop healthy relationships with others, maintain emotional wellness, and perform well in school (Estrada et al., 2021). As a crucial skill, sociability refers to students’ ability to initiate social interactions, form relationships, and sustain social connections over time (Gao et al., 2010). This pertains to a desire for acceptable interaction, as well as to being absorbed or maintaining ethical principles when complying with obligations under the laws of nature (Heikki, 2018). Previous studies link sociability to classroom engagement, peer collaboration, emotional adjustment, and academic functioning (e.g., Brackett et al., 2012; Shao et al., 2024). Furthermore, sociability plays a vital role in developing successful, well-rounded students by improving their cognitive abilities and academic achievement (Hachem et al., 2022). Students with higher sociability tend to participate more actively in learning environments and develop stronger peer and teacher relationships, whereas lower sociability is associated with social withdrawal and reduced engagement (Yılmaz & Yılmaz, 2023).
Educational organizations attempt to create enabling conditions that promote both academic achievement and sociable individuals (Lodge et al., 2018). However, as Bloom et al. (2007) noted, it is important to identify and strengthen the connections between weaknesses in social skills and academic challenges, especially for elementary-aged students deemed at risk of emotional and social disorders. Despite strong evidence linking social and emotional skills to academic and developmental outcomes, sociability itself has received comparatively limited attention as a primary outcome, particularly within Canada’s multicultural educational context. Much of the existing literature examines isolated predictors such as cooperation or empathy, rather than modeling how multiple social-emotional skills jointly contribute to sociability (e.g., Wang et al., 2024).
Prior studies predominantly rely on traditional statistical approaches, which may not adequately capture complex, non-linear relationships among interrelated social-emotional constructs. Recent methodological advances highlight the value of machine learning and predictive modeling for studying complex educational phenomena, especially when relationships among predictors are interdependent and non-linear (Guo et al., 2025; Lavelle-Hill et al., 2025). However, few studies apply these approaches to sociability outcomes, and even fewer leverage large-scale international datasets to examine sociability using data-driven methods. As a result, empirical evidence remains limited regarding which social-emotional skills most strongly predict sociability and how predictive performance varies across modeling approaches, particularly for Canadian students. Also, utilizing a large-scale dataset enhances the robustness and reliability of the findings (Chen et al., 2014). The dataset includes a wide range of social and emotional factors across various demographic groups, enabling a comprehensive analysis of their impacts on sociability. Unlike small-scale studies, which may have limited applicability, this study’s findings can be applied to a broader student population, making them more relevant and impactful.
Furthermore, given the variety of social-emotional factors that affect students’ sociability, traditional statistical methods often fail to capture complex interactions and behaviors. Machine learning can uncover patterns and relationships among social-emotional skills and sociability that traditional methods might overlook, offering deeper insights into the dynamics of sociability. Our study seeks to address these gaps by examining the relationship between social-emotional skills and sociability outcomes through machine learning, a data-driven technique that can contribute to the theoretical foundation of social-emotional skills and sociability (Tuomi, 2022). Using survey data collected from a large sample of students in Canada, we aim to investigate the extent to which social and emotional skills can predict sociability using machine learning methods.
As a global organization, the OECD aims to develop stronger regulations to improve the quality of life (OECD, 2021a). Social-emotional competence (SEC), which can be applied across multiple cultural settings, has been shown to directly influence the OECD’s Big Five framework (Lee & Junus, 2024). By integrating social and emotional skills into their assessments and policies, the OECD aims to promote the holistic development of students globally. This approach highlights the importance of social and emotional skills in fostering not just individual students’ well-being but also societal development.
Sociability and Academic Performance
According to Yılmaz and Yılmaz (2023), an essential aspect of how students develop is sociability, which goes beyond school achievement to encompass emotional and social skills. In educational research, the value of social and emotional skills in developing well-rounded individuals has attracted greater attention (Van der Eecken et al., 2019). A study by Weissberg (2016) stated how schools are increasingly becoming more multicultural and multilingual, hence the need for sociability among students. Findings stress that sociability changes considerably across age groups, especially among adolescents, who are often at an important stage with greater susceptibility to interpersonal and peer relationships (McMahon et al., 2020).
Many initiatives have been implemented in the worldwide educational scene to boost student socialization. For example, social and emotional learning (SEL) projects are growing in popularity worldwide (Berg et al., 2021). To promote positive interactions and effective communication, bodies including the World Health Organization and UNESCO have come together to integrate psychological skills into school curricula (Sawyer et al., 2021). Primarily concentrating on Ottawa, the capital of Canada, research shows how individuals’ socialization mirrors and reveals the distinctive geographic makeup of Canada as a whole (Mitchell & Lovegreen, 2009). An extensive study by the Ottawa-Carleton District School Board examines the sociability pattern distinct to Ottawa, offering perspectives on the geographical factors that impact students’ interactions with others (Kutsyuruba et al., 2015). The primary objective of these local programs in Ottawa has constantly been to promote learner socialization (Forseille & Raptis, 2016). Youth mentoring programs in Ottawa, such as those offered by Big Brothers Big Sisters of Ottawa, further reveal an ongoing commitment to supporting children and youth through mentorship opportunities (Big Brothers Big Sisters of Ottawa, 2025). This provides an opportunity for the development of social skills and for cultivating positive interactions among students.
Previous studies have examined several factors that influence students’ sociability. Bores-García et al. (2021) examined the significance of connections between educators and learners in school settings, focusing on the effects of pleasant experiences on learners’ capacity to connect and collaborate. Bhargava and Witherspoon (2015) explored the impact of parental engagement on students’ social growth, offering insight into the role of family background in socialization. Nevertheless, despite these advancements, significant gaps remain in the body of knowledge that require further study. Prior studies focused on particular traits in isolation, with inadequate examination of the strong connections among multiple factors that influence students’ sociability. Machine learning models offer more than a single instrument: they provide a variety of methods that can reveal connections and trends hidden from the human eye (Hatcher & Yu, 2018).
By introducing new methods that explain and boost student sociability, incorporating machine learning models into educational research indicates a transition (Liu et al., 2023). Although individual variables impacting sociability have been researched and identified with considerable success by traditional research, the complex interactions among these factors are less well understood (Ungar et al., 2013). The incorporation of machine learning in educational environments spans beyond mere predictive modeling (Fedik et al., 2022). Machine learning involves discovering relationships, trends, and patterns that may not be readily evident to researchers. The findings of this extensive study can shed light on the connections among multiple variables, affording an understanding of how educational environments, instructional methods, and student backgrounds together affect sociability.
Several insights into students’ social and emotional skills in varied educational contexts could be gathered using this comprehensive OECD dataset (OECD, 2021b). A systematic review by Eime et al. (2013) indicates that building social and emotional skills positively affects students’ academic success. Gao et al. (2010) examined the relationship between students’ academic performance and their social and emotional skills using OECD data and found significant associations, underscoring the importance of these skills for academic success. Building on recent advances in social-emotional learning research and educational data science, the present study addresses these gaps by examining sociability as a central outcome, using machine learning models applied to a large-scale Canadian sample from the OECD Survey on Social and Emotional Skills. By modeling multiple social-emotional predictors simultaneously, this study moves beyond isolated effects and provides a data-driven assessment of how different skills jointly contribute to sociability. This approach offers a more nuanced understanding of sociability in Canada’s multicultural educational context and provides practical insights for educators and policymakers seeking to support students’ social development.
Methods
Data Source
The data for this study came from the OECD Social and Emotional Skills Survey for Canada (Ottawa). The OECD is a global organization focused on broadening students’ awareness of their social and emotional skills, improving students’ academic performance, and promoting the development of these skills. The data were collected in Ottawa across four school boards and grouped by language of education and school type (elementary or secondary). A total of 5,440 students from Ottawa, Canada, were sampled. Participants were primarily aged 10 to 15 years, consistent with the target age groups of the OECD SSES. The sample was almost evenly split by gender (49.4% male, 50.6% female). Most students were in elementary school, with a few 15-year-olds in middle schools. The survey took place in 2019, but the data were made public in September 2021. The survey assessed students’ socio-emotional skills, home-related factors, school-related factors, peer environment-related factors, and background-related factors, and how these factors affect life outcomes. The framework for this survey draws on the Big Five model (OECD, 2021a).
Measures
The OECD Social and Emotional Skills Survey was conducted to gather information on students’ social and emotional skills based on the OECD framework organized around the Big Five personality traits (Steponavičius et al., 2023). This framework differentiates five dimensions of social and emotional development: task performance, emotional regulation, collaboration, open-mindedness, and engagement with others. The framework contains 15 unique skills with 2 additional indices (achievement motivation and self-efficacy) created from items belonging to other skills. Information on students’ social and emotional skills was obtained from students, teachers, school principals, and, optionally, from parents. Each item on the survey was measured on a five-point Likert scale ranging from 1 (completely disagree) to 5 (completely agree; OECD, 2023). Student sociability served as the dependent variable. Sociability was measured using the OECD-derived weighted scale score, reflecting students’ tendencies to initiate, maintain, and engage in social interactions.
The predictor variables used in this study are OECD-derived weighted scale scores constructed using the SSES psychometric framework. They included constructs such as sense of belonging, cooperation, optimism, energy, emotional regulation, and engagement with others (see Table 1 for list of variables). All analyses were conducted using these OECD-provided scale-level scores rather than re-estimating measurement models. Evidence for the reliability and construct validity of these scales was therefore drawn from the OECD SSES technical report (see Table 2). The OECD reports acceptable to strong internal consistency for the student scales based on Cronbach’s α and McDonald’s ω, and confirmatory factor analyses indicate generally good model fit across scales, as documented in the OECD SSES Technical Report (OECD, 2021b).
List of Variables (Features) Used for the Study.
Note. This table presents the feature codes alongside their full variable names. An asterisk (*) next to a feature code indicates that the feature was used in the final analysis.
Reliability Coefficients for Student Direct Assessment Final Scales.
Note. Values reproduced from the OECD Social and Emotional Skills Survey technical report (OECD, 2021a). Reliability estimates were computed by the OECD using item-level data.
Data Pre-Processing
Data from 5,440 students on more than 30 variables were initially selected for this study. A set of data pre-processing steps was performed to prepare the dataset for subsequent analyses. For example, the descriptive analysis was performed in Python to analyze the selected variables. Duplicate rows were identified and removed. Missing data were minimal across all variables, ranging from 0% to 1.32%. Given the low level of missingness and the large sample size, missing values were handled using k-nearest neighbors (KNN) imputation, which replaces missing values with values from the k most similar observations in the dataset. This approach preserves the full sample size, reduces potential bias arising from listwise deletion, and avoids the oversimplification associated with mean substitution.
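The de-duplication and imputation steps described above can be sketched as follows. This is a minimal illustration on synthetic data, assuming scikit-learn's KNNImputer; the column names, the 1% missingness rate, and k = 5 are hypothetical, since the study does not report the value of k.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

# Synthetic stand-in for the survey scale scores (column names are hypothetical).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(200, 4)), columns=["SOC", "COO", "OPT", "ENE"])

# Blank out roughly 1% of cells, similar in magnitude to the reported missingness.
missing_mask = rng.random(df.shape) < 0.01
df = df.mask(missing_mask)

# Remove exact duplicate rows before imputing.
df = df.drop_duplicates().reset_index(drop=True)

# Fill each gap using the k most similar rows (k = 5 is an assumption).
imputer = KNNImputer(n_neighbors=5)
imputed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)

print("Remaining missing values:", int(imputed.isna().sum().sum()))
```

In a predictive workflow, fitting the imputer on the training portion only and then applying it to the test portion is one way to keep information from the test set out of model development.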
Using libraries such as Scikit-learn, a feature selection procedure was carried out in Python to identify variables with theoretical relevance to the outcome variable. This procedure yielded 28 features (i.e., predictors) in total (see Table 1 for a complete list). Prior to model estimation, all continuous predictors were standardized using z-score normalization. Standardization was applied to place variables on a common scale and to satisfy the assumptions of regularized and distance-based learning algorithms, including Lasso regression and support vector regression. This ensured that differences in variable scales did not unduly influence model coefficients and penalty terms.
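To illustrate the standardization step, the sketch below applies z-score normalization to two synthetic predictors on very different scales. It assumes scikit-learn's StandardScaler, which the study does not name explicitly, and checks the result against the formula z = (x - mean) / sd computed by hand.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two synthetic predictors on very different scales (values are illustrative only).
rng = np.random.default_rng(1)
X = rng.normal(loc=[500.0, 3.0], scale=[100.0, 0.5], size=(300, 2))

# z-score normalization: subtract the column mean, divide by the column SD.
scaler = StandardScaler()
X_z = scaler.fit_transform(X)

# The same transformation by hand (StandardScaler uses the population SD, ddof=0).
X_manual = (X - X.mean(axis=0)) / X.std(axis=0)

print("Column means after scaling:", X_z.mean(axis=0).round(6))
print("Column SDs after scaling:", X_z.std(axis=0).round(6))
```

After scaling, a Lasso penalty or an SVR kernel distance treats a one-unit change in either predictor as a change of one standard deviation, so neither variable dominates merely because of its original units.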
Feature selection followed a hybrid approach combining theoretical grounding and empirical model-based selection (Guyon & Elisseeff, 2003; Povak et al., 2014). While the initial feature pool was informed by the OECD social and emotional skills framework, the identification of key predictors was driven entirely by data-driven procedures. Specifically, Lasso regression performed embedded feature selection via coefficient shrinkage (Tibshirani, 1996), whereas the remaining algorithms used Recursive Feature Elimination and model-based importance measures (Guyon et al., 2002; Kilmen & Bulut, 2023). No predictors were manually retained or excluded; instead, variables were selected based on their contribution to predictive performance across models. Given the conceptual overlap among social and emotional skill constructs, multicollinearity was assessed prior to model estimation. Variance inflation factors (VIFs) and tolerance values were examined using a multiple regression framework in which all predictors were entered simultaneously.
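The multicollinearity check can be sketched as below. For predictor j, the VIF is 1 / (1 - R2_j), where R2_j comes from regressing predictor j on all other predictors, and tolerance is the reciprocal of the VIF. The helper name vif_and_tolerance and the synthetic data are hypothetical; the study does not specify which routine was used.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def vif_and_tolerance(X):
    """Return (vif, tolerance) arrays with one entry per column of X."""
    n_features = X.shape[1]
    vif = np.empty(n_features)
    for j in range(n_features):
        others = np.delete(X, j, axis=1)
        # R2 from regressing predictor j on all remaining predictors.
        r2 = LinearRegression().fit(others, X[:, j]).score(others, X[:, j])
        vif[j] = 1.0 / (1.0 - r2)
    return vif, 1.0 / vif

# Synthetic predictors: b overlaps with a, c is unrelated to both.
rng = np.random.default_rng(2)
a = rng.normal(size=500)
b = 0.8 * a + 0.6 * rng.normal(size=500)
c = rng.normal(size=500)

vif, tolerance = vif_and_tolerance(np.column_stack([a, b, c]))
print("VIF:", vif.round(2))
print("Tolerance:", tolerance.round(2))
```

Because tolerance is simply 1 / VIF, the two diagnostics always agree; they are usually reported together only because different fields favor different cut-offs.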
Predictive Modeling With Machine Learning
Machine learning is a discipline of computer science and artificial intelligence that employs data and algorithms to recreate human learning and gradually boost the accuracy of a prediction (Sarker, 2021). Machine learning is capable of delivering interesting and data-driven techniques to investigate the areas of social-emotional skills and sociability. In this study, as the research aimed at exploring factors that affect the sociability of students, several supervised machine learning algorithms, such as Gradient Boosting Regression (GBR), Random Forest (RF), Support Vector Regression (SVR), Decision Tree (DT), and Lasso Regressor (LR), were used. These algorithms can provide an easy interpretation of the model outcomes (DT and LR), handle non-linear relationships among the predictors of sociability (SVR, GBR, and RF), and reduce both prediction error and bias (RF and GBR). The use of machine learning methods is in alignment with current technological advances and allows for improving the accuracy and diversity of our expertise in the domains of psychology and education.
After handling missing data and standardizing the data, a Pearson correlation matrix was computed to examine how the selected features correlated with the target variable, student sociability. Variables weakly correlated with the target were removed, and variables with stronger correlations were retained (see Figure 1 for the correlation matrix of the top correlated features). As a result, 21 of the initial 28 variables were included in the analysis.

Correlation matrix.
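The correlation-based screening described above might look like the following sketch. The data are synthetic, the column names are hypothetical, and the cut-off of |r| >= .20 is an assumption made for illustration; the study does not state the threshold used to define a weak correlation.

```python
import numpy as np
import pandas as pd

# Synthetic data: one feature related to the target and one unrelated feature.
rng = np.random.default_rng(3)
n = 400
target = rng.normal(size=n)
df = pd.DataFrame({
    "sociability": target,
    "strong_feat": 0.7 * target + 0.3 * rng.normal(size=n),
    "weak_feat": rng.normal(size=n),
})

THRESHOLD = 0.20  # hypothetical cut-off; not reported in the study

# Pearson correlation of every candidate feature with the target.
corr_with_target = df.corr(method="pearson")["sociability"].drop("sociability")

# Keep only features whose absolute correlation reaches the threshold.
kept = corr_with_target[corr_with_target.abs() >= THRESHOLD].index.tolist()

print(corr_with_target.round(2))
print("Retained features:", kept)
```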
Before the model development phase, the dataset was split into the target variable (i.e., student sociability) and the selected predictor variables. The data were then divided so that 70% were used to train the machine learning models and 30% were held out to evaluate model performance. Several regression-based models were used in the analysis because student sociability was defined as a continuous variable. To further control for overfitting and assess model stability, 10-fold cross-validation was applied within the training data. For each fold, model performance was evaluated using R2 and RMSE, and the mean and standard deviation of these metrics across folds were computed. This cross-validation procedure provided a robust estimate of generalizability beyond a single train–test split. The most suitable alpha (regularization strength) for the Lasso Regression model was determined with LassoCV. The Decision Tree and Gradient Boosting Regression models were trained, and their feature importance was assessed. GridSearchCV was used to optimize the hyperparameters of the Decision Tree and Random Forest models by systematically testing different combinations of settings. For feature selection, Recursive Feature Elimination (RFE) was used to identify predictors that contributed most strongly to model performance, allowing the models to be simplified while retaining predictive accuracy.
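The split, alpha selection, and cross-validation steps can be sketched as follows for the Lasso model. The data are generated with make_regression as a stand-in for the 21 retained predictors, and the random seeds are arbitrary; none of these values come from the study.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LassoCV
from sklearn.model_selection import KFold, cross_validate, train_test_split

# Synthetic stand-in: 21 predictors that are already on a standard-normal scale.
X, y = make_regression(n_samples=500, n_features=21, n_informative=6,
                       noise=20.0, random_state=0)

# 70% of the rows for training, 30% held out for the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42)

# Choose the Lasso penalty (alpha) by internal cross-validation on training data.
alpha_search = LassoCV(cv=10, random_state=42).fit(X_train, y_train)

# 10-fold cross-validation within the training data, scored with R2 and RMSE.
folds = KFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(Lasso(alpha=alpha_search.alpha_), X_train, y_train,
                        cv=folds, scoring=("r2", "neg_root_mean_squared_error"))

r2_folds = scores["test_r2"]
rmse_folds = -scores["test_neg_root_mean_squared_error"]  # flip sign back to RMSE

print(f"alpha = {alpha_search.alpha_:.4f}")
print(f"R2:   mean = {r2_folds.mean():.3f}, SD = {r2_folds.std():.3f}")
print(f"RMSE: mean = {rmse_folds.mean():.3f}, SD = {rmse_folds.std():.3f}")
```

The held-out 30% is left untouched here, so it can still serve as an independent check once the model and its settings are fixed.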
Throughout the modeling process, we focused on refining the models (model optimization), understanding which features are key (feature importance), and training and testing the models. Cross-validation was used to test the effectiveness of each model. Cross-validation involves dividing the data into multiple segments (folds); in each round, all but one segment are used to train the model, and the held-out segment is used to test it. This process is repeated so that each segment serves once as the test set. The cross-validation process ensures that the model performs well not only on one set of data but across various sets, reducing the risk of fitting too closely to a particular subset of data. This process, in turn, provides a more accurate measure of how the model will perform on unseen data, which is crucial for building models that are robust and generalizable. The models were also fine-tuned by adjusting their hyperparameters to achieve better predictive performance.
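A minimal sketch of the hyperparameter search and Recursive Feature Elimination for the Decision Tree model is given below. The parameter grid, the number of features to retain, and the synthetic data are hypothetical choices for illustration; the study does not report the grids it searched.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data with 12 candidate predictors, 4 of which carry signal.
X, y = make_regression(n_samples=400, n_features=12, n_informative=4,
                       noise=10.0, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42)

# Hypothetical grid: every combination is scored with 10-fold cross-validation.
param_grid = {"max_depth": [3, 5, 8], "min_samples_leaf": [1, 5, 20]}
search = GridSearchCV(DecisionTreeRegressor(random_state=42), param_grid,
                      cv=10, scoring="neg_root_mean_squared_error")
search.fit(X_train, y_train)

# RFE repeatedly refits the tuned tree and drops the least important feature.
tuned_tree = DecisionTreeRegressor(random_state=42, **search.best_params_)
selector = RFE(tuned_tree, n_features_to_select=4).fit(X_train, y_train)

print("Best settings:", search.best_params_)
print("Cross-validated RMSE:", round(-search.best_score_, 2))
print("Features kept by RFE:", selector.support_.nonzero()[0].tolist())
```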
Results
As an initial diagnostic step, multicollinearity among predictors was assessed. VIFs ranged from 1.12 to 5.10, and tolerance values ranged from 0.20 to 0.89. Although one predictor slightly exceeded the conventional VIF threshold of 5, all tolerance values remained above 0.10, indicating that multicollinearity did not compromise model estimation. To further assess associations among predictors, Pearson correlation coefficients were examined. Correlations among the SSES variables were generally moderate, with most coefficients falling between .30 and .70. Although several conceptually related constructs showed stronger associations, no correlations exceeded conventional thresholds indicative of redundancy (e.g., r ≥ .90). The full correlation matrix is provided in Table 3.
Correlation Matrix for All Variables.
The models were evaluated using Mean Squared Error (MSE) and Root Mean Squared Error (RMSE). These metrics reflect the average deviation between the predicted and observed values within the dataset. Lower MSE and RMSE values indicate that the model’s predictions are closer to the actual data and therefore more accurate. In this study, models with lower MSE and RMSE are considered more reliable for predicting student sociability. Additionally, we used R-squared (R2), a metric that quantifies the proportion of variance in the dependent variable that can be predicted from the independent variables (Hassan et al., 2022). R2 values typically range from 0 to 1, where 1 represents perfect prediction; on held-out data, a model that predicts worse than the mean can yield a value below 0. A higher R2 value indicates that the model explains a larger portion of the variance in student sociability (see Table 4 and Figure 2 for a summary of the models’ performance). The baseline (mean-predictor) model predicts the average sociability score for all students and therefore has an expected R2 of 0, which serves as a reference point for evaluating the predictive performance of all substantive models.
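The evaluation metrics and the mean-predictor baseline can be illustrated with the sketch below. The evaluate helper, the synthetic data, and the Lasso penalty of 1.0 are hypothetical; the numbers it prints are unrelated to the results reported in Table 4.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

def evaluate(model, X_test, y_test):
    """Compute MAE, MSE, RMSE, and R2 for a fitted model on held-out data."""
    pred = model.predict(X_test)
    mse = mean_squared_error(y_test, pred)
    return {
        "MAE": mean_absolute_error(y_test, pred),
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),  # RMSE is the square root of MSE
        "R2": r2_score(y_test, pred),
    }

X, y = make_regression(n_samples=500, n_features=10, noise=30.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30, random_state=42)

# Baseline: always predict the mean of the training outcomes.
baseline = evaluate(DummyRegressor(strategy="mean").fit(X_tr, y_tr), X_te, y_te)
lasso = evaluate(Lasso(alpha=1.0).fit(X_tr, y_tr), X_te, y_te)

for name, result in [("baseline", baseline), ("lasso", lasso)]:
    print(name, {metric: round(value, 3) for metric, value in result.items()})
```

On a held-out set, the baseline R2 is at most 0 and usually slightly below it, because the training mean rarely equals the test mean exactly; this is why 0 is an expected rather than an exact value for the reference model.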
Model Performance.
Note. MAE = mean absolute error; RMSE = root mean squared error; R2 = R-squared

A summary of predictive model performance.

Feature importance for random forest.
Additionally, Figure 4 provides a visual assessment of model fit for the Lasso regression model by plotting predicted sociability scores against observed values. The predicted scores show a clear positive association with the observed values and tend to cluster around the line of perfect prediction. The dispersion of points around the diagonal reflects prediction error, particularly at higher levels of sociability. This pattern is consistent with the moderate predictive accuracy indicated by the RMSE and R2 values, because the best-performing models (Lasso Regression and Support Vector Regression) explain approximately half of the variance in students’ sociability (R2 ≈ 0.52 and 0.49, respectively), while still exhibiting non-trivial prediction error as reflected by their RMSE values. Together, these results indicate meaningful improvement over a baseline model, but also substantial unexplained variability, which is consistent with a moderate level of predictive accuracy.

Predicted versus observed sociability scores for the Lasso regression model.
LR and SVR have the lowest MSE and RMSE values, showing better predictive accuracy at 49% and 48.8%, respectively. These low error values indicate that the predictions made by LR and SVR are closer to the actual values, making them the most accurate models in this study. LR and SVR also have an R2 value of 52%, the share of variance in the dependent variable that the models can predict from the independent variables. Figure 2 shows the RMSE and R2 values, where the lowest RMSE appears in red and the highest R2 in deep green (indicating the best-performing model). These results suggest that LR and SVR can estimate student sociability from the selected variables with moderate accuracy, providing a foundation for further analysis and interventions. DT has an R2 value of 38%, indicating weaker predictive power. That is, DT explains less variance than LR and SVR, suggesting that the DT model is less effective in capturing the factors that influence student sociability. This result answers the second research question by showing that LR and SVR are the most effective models for predicting students’ sociability in this dataset. This analysis also provides a methodological pathway for further research.
Feature importance was examined to enhance model interpretability and to clarify how key predictors of sociability were identified. Importance estimates derived from Lasso, Support Vector Regression, and Random Forest models were compared to assess consistency across modeling approaches. Variables were interpreted as salient predictors only when they emerged as influential across multiple models rather than within a single algorithm. This convergence-based approach strengthens confidence in the robustness of the identified predictors and reduces the likelihood that results reflect model-specific artifacts. Although advanced interpretability techniques such as SHapley Additive exPlanations (SHAP) were not implemented, the consistency of feature importance across multiple algorithms provides a robust basis for interpreting the identified predictors.
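The convergence rule described above can be sketched in a few lines of Python. The source does not specify how agreement across models was operationalized, so the rule used here (a feature counts as salient when it ranks in the top k for at least a minimum number of models) is an assumption, as are the function names. The feature names are placeholders and the importance scores are invented for illustration; they are not the study's estimates.

```python
def top_k(importances, k):
    """Return the k feature names with the largest absolute importance."""
    ranked = sorted(importances, key=lambda name: abs(importances[name]), reverse=True)
    return set(ranked[:k])


def convergent_features(importances_by_model, k=3, min_models=2):
    """Features ranked in the top k by at least `min_models` models."""
    counts = {}
    for importances in importances_by_model.values():
        for name in top_k(importances, k):
            counts[name] = counts.get(name, 0) + 1
    return sorted(name for name, n in counts.items() if n >= min_models)


# Placeholder importance scores (invented; NOT results from the study).
importances_by_model = {
    "lasso": {"skill_a": 0.40, "skill_b": 0.25, "skill_c": -0.20, "skill_d": 0.05, "skill_e": 0.00},
    "svr": {"skill_a": 0.35, "skill_b": 0.10, "skill_c": 0.22, "skill_d": 0.18, "skill_e": 0.02},
    "random_forest": {"skill_a": 0.50, "skill_b": 0.20, "skill_c": 0.15, "skill_d": 0.03, "skill_e": 0.12},
}

salient = convergent_features(importances_by_model, k=3, min_models=2)
print(salient)
```

Comparing ranks rather than raw scores is a deliberate choice in this sketch, because importance values from different algorithms (for example, regression coefficients and tree-based importances) are not on a common scale.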
In terms of feature importance (see Figures 3, 5, and 6), RF, SVR, and Lasso attributed importance to largely the same features. Sense of belonging in school emerged as the most influential predictor across Lasso, Support Vector Regression, and Random Forest models, indicating a robust and model-consistent association with student sociability. This consistency underscores the importance of sense of belonging in understanding and predicting sociability among students. It suggests that interventions aimed at improving students’ sense of belonging could be effective in enhancing sociability and that sociability can be cultivated through targeted educational policies and practices. Students’ energy level, cooperation, and optimism were also identified as factors associated with student sociability.

Feature importance of Lasso.

Feature importance of support vector regressor.
These findings provide valuable insights into which areas should be prioritized in efforts to improve sociability among students, suggesting that schools might focus on fostering a supportive environment, promoting cooperative activities, and encouraging positive attitudes. Moreover, these factors collectively suggest that sociability is not only about feeling connected but also involves active engagement and a positive attitude among students. By prioritizing these factors, institutions can create a more nurturing and supportive environment that promotes all aspects of student development. This could involve strategies such as implementing cooperative learning tasks that require teamwork, promoting physical activities to enhance energy levels, and fostering an optimistic school climate through positive reinforcement and support systems (Alam, 2022). This result helps answer the first research question (RQ1) by identifying the factors that predict sociability among students using OECD survey data.
Discussion
This study was grounded in Social Cognitive Theory (Bandura & National Institute of Mental Health, 1986), a theoretical framework that asserts that social connections and individual qualities influence the way people act. Using data from the 2021 OECD Survey on Social and Emotional Skills, this study examined whether supervised machine learning models could meaningfully predict student sociability and identify key contributing factors. The findings illustrate the relationship between environmental factors (school environment), personal factors (energy level, optimism), and behaviors (sociability of students). This aligns with Bandura’s concept that the environment, individual characteristics (such as energy and optimism), and behavior interact to influence each other. This study also emphasizes the need for social-emotional learning in schools by showing how these skills relate to sociability. Developing social-emotional skills can help students navigate various social contexts and build resilience, which is important for their social development.
The results further emphasized the critical roles a strong sense of belonging, cooperation, energy, and optimism play in sociability. This result aligns with prior research emphasizing the importance of belonging for social engagement, psychological well-being and students’ educational experiences (Ahn & Davis, 2020; Haslam et al., 2022; Saeri et al., 2018; Su & Wang, 2022). From a theoretical perspective, this finding reinforces sociability as a socially embedded construct rather than an individual trait, shaped by students’ perceptions of their relational environment. The findings of this study have direct implications for school psychologists and educators seeking to support students’ social development through evidence-based social–emotional learning practices. The consistent identification of a sense of belonging as the strongest predictor of sociability suggests that school-level interventions should prioritize inclusive school climates, positive teacher–student relationships, and opportunities for peer connection. School psychologists can use these findings to inform needs assessments and guide the selection of interventions that strengthen students’ perceived acceptance and connectedness within the school environment.
In addition to a sense of belonging, cooperation and energy also emerged as salient predictors across multiple models, highlighting the importance of interactive and engaging learning contexts. Cooperation reflects students’ capacity to engage constructively with peers, while energy captures behavioral activation and engagement in social contexts. For educators, this suggests that cooperative learning structures, group-based problem-solving tasks, and classroom routines that encourage active participation may foster greater sociability among students. School psychologists can collaborate with teachers to design and evaluate classroom strategies that promote collaboration and sustained engagement, particularly for students who may struggle socially. Optimism was another meaningful predictor, indicating that students’ positive expectations and outlooks are linked to their social functioning. Together, these predictors highlight that sociability is not driven by a single dimension but by a constellation of modifiable social and emotional skills. This supports recent work suggesting that students’ social functioning is best understood through integrated, multi-dimensional frameworks rather than isolated predictors.
Methodologically, the findings demonstrate the value of machine learning approaches for modeling sociability. Lasso Regression and Support Vector Regression consistently showed superior predictive performance, confirming their capacity to handle correlated predictors and capture complex relationships among social-emotional constructs. These results are consistent with prior studies that have highlighted the advantages of regularized and kernel-based models in educational prediction tasks (Hassan et al., 2022; Rustam et al., 2020). Importantly, the convergence of results across different algorithms strengthens confidence in the robustness of the identified predictors. From an applied perspective, the findings carry meaningful implications for school psychologists, educators, and policymakers. Interventions aimed at strengthening school belonging, fostering cooperative learning, and promoting student engagement and optimism may be particularly effective for enhancing sociability. Rather than viewing sociability as a fixed disposition, the results support its conceptualization as a developable outcome that can be shaped through intentional, school-based practices.
Finally, the findings of this study position sociability as both a measurable and developable construct within educational settings. By demonstrating that sociability can be reliably predicted using social and emotional skill indicators, this study shows that sociability is not an abstract or incidental outcome, but a construct that can be systematically examined using empirical models. At the same time, the identification of modifiable predictors such as belonging, cooperation, energy, and optimism underscores the potential for intentional school-based practices to support students’ social development. Framing sociability in this way encourages educators and school psychologists to move beyond viewing social behavior as fixed and instead recognize it as an outcome that can be strengthened through targeted, data-informed interventions.
Conclusions
This study set out to identify the social-emotional factors that best predict student sociability and to determine which machine learning models most effectively capture these relationships. The results clearly indicate that sociability among Canadian students can be reliably predicted using social and emotional skill indicators, with sense of belonging, cooperation, energy, and optimism emerging as the most influential predictors. Among the tested models, Lasso Regression and Support Vector Regression demonstrated the strongest predictive performance, suggesting their suitability for modeling complex educational outcomes. Compared with prior research that has focused primarily on academic achievement or employed traditional statistical approaches, this study advances the literature by positioning sociability as a measurable and empirically tractable outcome. By leveraging a large-scale international dataset and data-driven modeling techniques, the study contributes new evidence on how multiple social-emotional skills jointly shape students’ social functioning within a Canadian context. The findings also carry implications for SEL-based educational policy and practice. Policies that promote inclusive school climates, encourage cooperative learning, and support students’ emotional engagement may yield benefits not only for well-being but also for students’ capacity to form and sustain social relationships. By explicitly linking sociability to modifiable school- and student-level factors, this study provides a foundation for evidence-informed interventions to strengthen students’ social development.
Implications for School Psychology
Our findings have implications for psychologists working in schools or other educational settings. The findings indicate that social and emotional skills play a significant role in students’ sociability. There is an interplay among students’ sense of belonging, cooperation, energy levels, and optimism, suggesting that programs designed to enhance student sociability should adopt a holistic approach that considers these variables together rather than in isolation. For school psychologists, our findings underscore the importance of nurturing these social-emotional skills in educational environments. Psychologists should consider interventions that boost students’ energy and optimism levels, such as incorporating physical activities into students’ daily routines and implementing positive reinforcement strategies. These interventions can create a more dynamic and supportive school environment that enhances student sociability. Programs that prioritize creating an empowering school environment should be encouraged.
Surprisingly, cooperation was also identified as a predictor of students’ sociability. Collaborative learning and group activities have been shown to enhance students’ ability to work together and build social skills (Berg et al., 2021). Psychologists can play an important role in designing and advocating for cooperative learning environments that encourage teamwork and positive social interactions among students. These environments can help students develop essential social skills that are critical for their overall development. By advocating for practices that build positive classroom environments and high-quality social interactions, school psychologists can contribute to the development of sociability among students. It may also be valuable for researchers to use machine learning models, rather than relying only on traditional methods, to analyze data with many variables, as proposed in previous studies (e.g., Ersozlu et al., 2024; Sarker, 2021).
Strengths and Limitations of the Study
A primary strength of this study is its innovative application of machine learning techniques to explore the factors contributing to student sociability. Traditional statistical methods often fall short of capturing complex interactions among multiple variables. Machine learning, on the other hand, can uncover hidden patterns and relationships, providing a more comprehensive understanding of the factors influencing sociability. Also, the study utilizes a large-scale dataset from the OECD Survey on Social and Emotional Skills, which includes a diverse sample of students from Ottawa. This extensive dataset enhances the robustness and reliability of the findings, offering a rich source of information on various social-emotional factors and their impact on sociability. Moreover, by incorporating multiple machine learning models, including Gradient Boosting Regression, Random Forest, Support Vector Regression, Decision Tree, and Lasso Regressor, the study provides a thorough analysis of the predictive power of different social-emotional factors. This multi-faceted approach allows for a more nuanced understanding of which factors are most influential in predicting student sociability.
However, the dataset is restricted to students from Ottawa, Canada, which may limit the generalizability of the findings to other regions or countries. Additionally, the study’s focus on two specific age cohorts (10- and 15-year-olds) and a single educational context may not fully capture the variability in sociability factors across different demographics and cultural settings. The study relies on cross-sectional data, which provides a snapshot of the relationships between social-emotional skills and sociability at a single point in time. This design limits the ability to draw causal inferences and to understand how these relationships may evolve over time. Longitudinal studies are needed to explore the development of sociability and the long-term impact of social and emotional skills. In addition, the data on social and emotional skills and sociability were collected through self-reports from students, teachers, and school principals. Self-reported data can be subject to biases, such as social desirability bias or inaccuracies in self-assessment. While efforts were made to ensure data quality, these potential biases should be considered when interpreting the findings.
Finally, although the study includes a comprehensive set of social-emotional factors, there may be other relevant variables influencing sociability that are not captured in the dataset. Future research should consider a broader range of factors, including environmental and contextual variables, to provide a more holistic understanding of student sociability. The machine learning models used in this study, while innovative, also introduce certain technical limitations. The performance of these models can be influenced by the quality and preprocessing of the data, the choice of hyperparameters, and the specific algorithms used. Although rigorous validation procedures were followed, the results may still be affected by these technical considerations.
Future Directions
The findings of this study offer several promising avenues for future research on social and emotional skills (SES). First, while this study employed supervised machine learning models for regression analysis, future studies could explore classification-based approaches to group students into sociability profiles (e.g., high, moderate, low) based on SES predictors. Such classification could help educators identify students who may benefit most from targeted interventions. Second, cross-national and cross-cultural comparative studies should be conducted using the OECD’s SSES dataset across different countries. Expanding the scope beyond Ottawa would enhance the generalizability of the findings and help uncover cultural and contextual variations in the development of sociability. Third, longitudinal designs should be adopted to examine how SES and sociability evolve over time. This would allow researchers to move beyond predictive correlations and begin identifying causal pathways and developmental trajectories. For instance, tracking changes in sociability from elementary to secondary school could reveal critical periods for intervention and growth. Lastly, qualitative and mixed-methods approaches are needed to enrich the current findings. While machine learning offers predictive power and pattern discovery, qualitative methods such as interviews and focus groups can illuminate the lived experiences and contextual nuances behind students’ social interactions. These methods could be particularly valuable for understanding how marginalized or underrepresented student groups perceive and experience sociability.
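As a starting point for the classification-based direction mentioned above, a continuous sociability score would first need to be converted into profile labels. The minimal Python sketch below uses tercile cut points for this step. The tercile rule, the function names, and the toy scores are all assumptions made for illustration; the source does not prescribe how profiles should be defined.

```python
def tercile_cutoffs(scores):
    """Return two cut points that split the sorted scores into three near-equal groups."""
    ordered = sorted(scores)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def label_profile(score, low_cut, high_cut):
    """Map a continuous score to a 'low', 'moderate', or 'high' profile."""
    if score < low_cut:
        return "low"
    if score < high_cut:
        return "moderate"
    return "high"


# Toy sociability scores (invented for illustration; not survey data).
scores = [420, 455, 480, 500, 510, 530, 560, 590, 610]

low_cut, high_cut = tercile_cutoffs(scores)
profiles = [label_profile(s, low_cut, high_cut) for s in scores]
print(list(zip(scores, profiles)))
```

The resulting labels could then serve as the target for a classifier trained on the same predictors. Theory-driven or externally validated thresholds may be preferable to sample-based terciles, since tercile cut points depend on the particular sample.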
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
