Determining suicide risk in an adolescent population: Questionnaire-based approaches

Abstract

Objective: In the context of assessing suicide risk using questionnaires as measurement instruments, the main goals are: (i) to compare the performance of the classification task with knowledge-based algorithms and inferred approaches, and (ii) to reduce the set of questionnaires. Methods: A classification task is performed on the set of questionnaires considering two methods: expert knowledge translated to algorithms and represented as diagrams, and data-inferred machine learning models. Feature ablation is performed to reduce the questionnaire items. Results: Machine learning models are able to detect risk with an F1 macro average score of up to 85%, significantly better than knowledge-based models. The number of questionnaire items can be reduced with no significant impact. Conclusions: Inferred models can be used to predict the level of suicide risk with a reduced set of questionnaires. Moreover, the Suicide Cognitions Scale-Revised questionnaire is seen to be the most impactful in the prediction.

Keywords

feature reduction, inferred approach, knowledge-based, machine learning, suicide risk

Introduction

Suicide, or completed suicide, is defined as the act of intentionally killing oneself.¹,² When the act is not fatal, the term ‘attempted suicide’ is used. According to the World Health Organization (WHO), suicide is one of the main causes of death, causing more than 700,000 deaths yearly.³ It is specifically the fourth main cause of death for young people globally, which is a major cause of concern.⁴ Early diagnosis and care are essential, as many mental disorders appear before adulthood and may decrease quality of life if untreated.⁵ Moreover, for every completed suicide, there are many more attempts, causing medical expenses and affecting communities.⁶ Analysis of the circumstances surrounding suicides and suicide attempts may reveal some of the risk factors involved,⁷ including ethnicity⁸ and gender.⁹ Suicide is also more prevalent in low- or middle-income countries,³ or in people who live alone or are unable to work.¹⁰ There is also a clear relationship between suicide risk and previous suicide attempts.¹¹

It has been recorded that most people who die by suicide consulted a specialist shortly beforehand.¹² This is why this type of death is considered preventable, yet difficult to detect. Among the methods for suicide ideation detection, machine learning methods applied to questionnaires, electronic health records or suicide notes have been used.¹³ In previous works, data extracted from verbal questionnaires answered by adolescents were analysed using a Support Vector Machine (SVM) to classify suicidal and non-suicidal patients.¹⁴ In related works, tabular features resulting from questionnaires or scales such as the Pierce Suicidal Intent Scale (PSIS)¹⁵ or the International Personality Disorder Examination screening questionnaire¹⁶ have been analysed with regression techniques for classification purposes.

In this article, we apply both knowledge-based and machine learning-based inferred approaches to tabular data obtained from a set of measuring instruments implemented as questionnaires in which adolescents and youth who have entered the child welfare system took part. Then, we propose and compare the results achieved with the two approaches. In addition, by learning which suicide-related concepts measured by the questionnaires are the most relevant to achieve a good classification performance, we propose a reduction of the original set of assessment instruments. The aim is to be able to detect a risk of suicide in an adolescent population with as few questions as possible and with a high performance.

The main contributions of our work are the following ones:

(1) By analysing the results of a set of questionnaires filled in by a group of young people, we have defined a novel diagrammatic, knowledge-based representation of an algorithm: a step-by-step approach that assesses the respondents’ level of suicide risk (“low”, “moderate”, or “high”). We have also measured its performance.

(2) Using the result of the set of questionnaires as input for various data-based inferred algorithms, we have measured their performance and determined the usefulness of machine learning in this particular domain.

(3) We have assessed the relevance of the questionnaire items to suicide-attempt ideation, contributing to questionnaire choice for suicide risk assessment by identifying the most important items and performing feature ablation to detect the redundant ones. This feature ablation approach identified a minimal yet highly informative core subset of items for risk prediction.

These contributions have been used to answer two research questions:

• Are all the measuring instruments employed in the study relevant to estimate the suicide risk? Are there redundant questions or irrelevant questions that are nearly uncorrelated with the suicide risk assessment?

• Utilizing the results of this set of questionnaires, do knowledge-based or data-inferred approaches better assess the level of suicide risk of the respondent?

Related work

Applications of machine learning in healthcare, and specifically in the field of mental health, include addiction treatment, cyber-harassment recognition, and detection of several mental health issues such as depression, bipolar disorder, and anxiety.¹⁷ Despite ongoing development and challenges, these applications hold immense potential. Another application is the use of machine learning to detect suicide risk.¹² Existing research usually lacks open data and is inconsistent in terms of dataset sizes, features, demographics, detection methods, metrics, and time spans.¹⁸

Even though the outcomes studied in the literature vary, binary risk estimation seems to be the main focus. Nevertheless, suicide attempt within a given time span, suicide ideation and suicidality are also of interest. In this work, the risk of suicide attempt is classified into three categories: “low risk”, “moderate risk”, and “high risk”.

The different works also differ in the choice of the population range. Most studies focus on high-risk adults, such as veterans.¹² Even though adults account for the highest number of suicides, the highest risk of suicidal ideation and behaviour corresponds to teenagers and young adults aged 15–29, and few studies address this group. We have striven to fill that gap by studying young people at high risk.

The reviews on this subject highlight that risk factors differ depending on the environment, which is why population-based investigations of suicide risk are needed. For adults, the use of alcohol and drugs, the presence of disabilities, the socio-economic environment and life events have a significant effect.¹⁹,²⁰ This may not be the case for other demographics. Some studies analyse the environmental factors exclusively.²¹ Although suicide in the youth population is a complex and multi-causal phenomenon, it generally occurs when certain life stressors and mental health factors converge to leave a young person with a sense of hopelessness, despair, and social isolation. Apart from demographic factors, including gender, vulnerability, negative affect, and feelings of inadequacy can lead to suicidal thoughts.²² Research suggests that the most important predictor of completed suicide is a history of suicide attempts.²³

Regarding feature ablation, there have been studies showing that feature reduction is possible thanks to the identification of the main factors contributing to a machine learning classification.²⁴ One of the main goals of this work is, therefore, to identify risk factors based on a predictive model.

We initiated our work by modelling a knowledge-based system involving experts. While the approach had not been made explicit before, it was employed by health practitioners. Furthermore, we explored data-driven inferred models due to their current impact. The literature rarely compares knowledge-based systems with inferred approaches,²⁵ even though the comparison allows interpreting and improving the existing model.

Most studies use electronic medical records (EMR) to calculate this risk of suicide. Recently, the importance of social media for the early detection of suicide risk has been highlighted, as online community members could be more likely to show indications they do not disclose to healthcare workers.²⁶ Some studies have been conducted in this direction, usually targeting X (formerly Twitter), and they rely on Natural Language Processing (NLP). There are also studies which include biological variables like urine samples or neuroimaging.²⁷ There is, seemingly, a lack of datasets involving heterogeneous types of features.²⁸ Another gap in research is the scarcity of works employing a mental health survey corpus.²⁹,³⁰ In this work, survey-based data will be used to develop the risk calculation.

With regard to the main techniques employed to assess suicide risk, random forests, decision trees and SVMs are normally used. NLP is also used in different techniques. Cross-validation, especially k-fold, is also commonly used to check the validity of the models,¹² with accuracy and AUC¹² being the main metrics. In this work, different models will be developed and contrasted.

Materials and methods

Measuring instruments

We have incorporated various measuring instruments to assess the level of suicidal ideation in the young and adolescent population.

In our work, the measuring instruments were presented to the target population of youth in residential care in Spain as a set of independent question items (I) and questionnaires (Q):

• The Adolescent Suicidal Behavior Assessment Scale (SENTIA)³¹ is a validated self-report tool that measures a range of suicide-related thoughts and behaviors in various timeframes. For the present study, we used three SENTIA items (I) designed to assess lifetime experiences of suicide ideation (“Have you ever had ideas about taking your life?”), previous suicidal attempt (“Have you tried to take your own life?”), and non-suicidal self-injury or self-harm (“Have you harmed yourself [self-injury: cuts, punctures, etc.] without intent to die?”).

• Number of previous suicide attempts (I). A number is given for the attempts to take one’s own life.

• Suicide Desire and Plan are used to assess whether the respondent has suicidal desire and/or a plan. These are yes/no items (I).

• Suicides of people near (I). A yes/no question asks whether people close to the respondent have died by suicide.

• The relationship with that person and how much the respondent identifies with them are also recorded (I).

• Suicide Cognitions Scale-Revised (SCS-R)³² questionnaire (Q). It is a 16-item self-report questionnaire designed to measure the Suicidal belief system, a range of beliefs, attitudes, expectations, and perceptions associated with the emergence of suicidal thoughts and behaviors (e.g., “Nothing can help solve my problems”). Respondents indicate on a 5-point Likert-type scale the degree to which they agree or disagree with each item statement (range 0, strongly disagree to 4, strongly agree). Item responses are summed to provide an overall metric of the suicidal belief system, with higher scores indicating increased vulnerability to suicidal thoughts and behaviors.

• Patient Health Questionnaire, or PHQ (Q),³³ consists of eight items (e.g., “Feeling down, depressed, or hopeless”) that assess the presence of depressed mood, anhedonia, sleep problems, fatigue, changes in appetite or weight, feelings of guilt or worthlessness, difficulty concentrating, and psychomotor slowing or restlessness during the past 2 weeks. Items are scored on a 4-point Likert-type scale, from 0 (never) to 3 (almost every day). A score of 10 or above is frequently used as a cut-off point to identify patients with major depression. We purposely opted to use the PHQ-8 rather than the PHQ-9 as the ninth item in the latter assesses thoughts of death and self-harm, which might potentially have confounded the results.

• The Spanish adaptation of the Psychache Scale³⁴ in young adults, PS-E (Q),³⁵ is used to measure psychological or mental pain. It consists of 13 items that assess mental pain and anguish on a Likert-type scale. Items 1–9 direct respondents to indicate how often they experience mental pain (e.g., “I feel psychological pain”) on a 5-point scale ranging from 1 (never) to 5 (always), while their task on items 10–13 is to indicate how much they disagree or agree with statements reflecting mental pain (e.g., “I can’t take my pain any more”), using a 5-point scale ranging from 1 (strongly disagree) to 5 (strongly agree). Item scores are summed, with higher scores indicating more intense and frequent (i.e., less bearable) mental pain.

• Tolerance for Mental Pain Scale (TMPS)³⁶ is a questionnaire (Q) to assess tolerance for mental pain. It consists of 10 items that assess negative and positive perspectives of mental pain: feeling unable to manage one’s mental pain (e.g., “I cannot get the pain out of my mind”) and perceiving that one’s pain will not endure (e.g., “I believe that my pain will go away”). Items are scored on a 5-point Likert-type scale ranging from 1 (not true) to 5 (very true). Higher scores on the manage subscale indicate reduced tolerance for mental pain, whereas higher scores on the endure subscale indicate stronger expectations that mental pain will resolve.

• Beck’s Hopelessness Scale, BHS, is a questionnaire (Q)³⁷ that consists of 4 items that measure the sensation of hopelessness during the last week (e.g., “My future looks dark to me”) using a true/false response option. Item scores are summed, with higher scores representing more severe hopelessness.

• Perceived burdensomeness and thwarted belongingness are measured using the Interpersonal Needs Questionnaire, INQ (Q).³⁸ It consists of 12 items that measure perceived burdensomeness (8 items; e.g., “These days, I feel like a burden on the people in my life”) and thwarted belonging (4 items; e.g., “These days, I feel disconnected from other people”), each rated on a 7-point Likert-type scale ranging from 1 (not true for me at all) to 7 (very true for me). Item scores on each subscale are summed, with higher scores indicating more severe perceived burdensomeness and thwarted belonging.

Almost all of the aforementioned instruments have been used with different thresholds in the knowledge-based model (specified in Section Knowledge-based model), and all of the numeric ones have been used in the data-inferred approaches (Section Inferred approaches).

This set of questionnaires and question items was answered by 197 adolescents.

With respect to the data format, the aforementioned information was gathered in a tabular format of 19 columns or features, as shown in Table 1. Most of the independent question items (I) were answered as yes/no, and the columns related to questionnaires (Q) hold the sum of the corresponding set of items (e.g., in the example, the sum of the answers to the SCS-R questionnaire, 16 items on a 0–4 scale, is 30). In order to preserve privacy, each adolescent is identified by a unique code, and the answers to specific questions in the questionnaires cannot be known, only the final score. Looking at the responses and drawing on their experience, two suicide experts (a psychologist and a psychiatrist) manually annotated each case as High, Moderate or Low risk (as in Table 1).

Table 1.

Example of the tabular data for a subject. The set of answered questionnaires and question-items are described in Section Measuring instruments. For the Questionnaires, the total score is gathered, not the response for each of the items.

Description of the sample

The final sample comprised 165 participants. It is important to note that collecting data from high-risk populations is challenging, and that the targeted group is both unique and often underrepresented in research, which makes this dataset valuable and insightful. In fact, the sample size is typical for a study of this kind,³⁹ even if sample sizes have increased in recent years.⁴⁰ There is research showing that machine learning models require a minimum sample size of 20 participants when using cross-validation to ensure reliable estimates,⁴¹ provided the feature-to-sample ratio does not exceed 1:10. With a feature set of 17 variables and a sample size of 165 participants (a ratio of roughly 1:9.7), our study is broadly in line with these recommendations, ensuring sufficient data diversity for statistical analyses and internal validation. The inclusion criteria for the subjects are the following:

• Adolescents aged 12–18 years.

• Being in residential child care.

• Having been at risk of child neglect.

On the other hand, the exclusion criteria are these:

• Adolescents who were in a situation of emotional crisis at the time of the assessment, given that this condition could affect the validity of the responses to the questionnaire.

• Adolescents with cognitive or developmental disabilities that made it difficult to understand the questionnaires used in the study.

• Participants with incomplete data in the questionnaires.

A quick analysis of the sample of adolescents aged 12–18 shows that more than 50% of the individuals report suicide desire. Most of the group had also engaged in self-harm. The group was fairly balanced in terms of gender, and the mean age was between 15 and 16. The demographics of the sample group are shown in Table 2.

In Spain, residential care is an alternative custody measure, either administrative or judicial, designed to provide comprehensive care for children and adolescents whose material, emotional, and educational needs cannot be temporarily met within their family environment. These units aim to create a safe environment that promotes the physical, psychological, social, and educational well-being of young people. Some residential care units are also designed to address specific mental health or behavioral needs, in collaboration with families and community resources. Adolescents in residential care often present multiple risk factors associated with suicidal behavior, and previous research has documented a high incidence of suicidal ideation and behaviors in this population.²³ In this study, adolescents were not placed in residential care specifically due to heightened suicide risk but rather because of broader protective or welfare needs. However, a significant proportion of participants reported a history of suicidal ideation and/or attempts, underscoring the importance of considering the context when interpreting the study’s findings.

Table 2.

Descriptive statistics for demographic characteristics.

Characteristic	Percentage (%)
Gender
Male	43.0
Female	55.8
Non-binary	1.2
Age (years)
12–14	32.7
15–16	44.2
17–18	23.1

Data collection process

We began the recruitment process by contacting child protection services across several provinces in northern Spain. The research objectives were explained, and these services were invited to participate. Those who agreed and granted approval for the study contacted the managers of youth residential care units in their area, informing them about the research and requesting their collaboration. Managers who agreed were then directly contacted by the research team to provide further details and coordinate the implementation of the study. Informed consent was obtained from all participants prior to any data collection.

Adolescents who met the inclusion criteria and agreed to participate completed the instruments individually in a private room within their residential care unit. Data collection was conducted electronically via a secure online platform, with each participant accessing the questionnaire on a computer. Research has demonstrated that online data collection is as reliable as face-to-face methods for both normative and clinical populations. Furthermore, online questionnaires are particularly advantageous for assessing stigmatized behaviors such as suicide and self-harm, as they reduce social desirability bias that can influence responses in face-to-face or group settings.⁴² Although existing evidence indicates that asking young people about suicide does not increase their risk of suicidal ideation or behavior,⁴² we implemented additional safeguards to ensure participants’ emotional well-being. Specifically, a staff member from the residential care unit was available during and after the completion of the questionnaire to provide emotional support if needed.

Missing data

Regarding missing data, any incomplete responses were excluded from the data wrangling and analysis phases. This approach ensured the reliability of the dataset and minimized the potential for biases arising from imputation or incomplete data.

Of the original 197 instances, 32 were missing some of the 19 data points described in the previous section. We only considered the fully completed sets of questionnaires. Of the remaining 165 instances, 83 were annotated as at low risk of suicidal behavior (50.3%), 43 as moderate risk (26.1%), and the remaining 39 (23.6%) as high risk (see Figure 1).

Figure 1.

Suicidal risk level class distribution (high, moderate or low) annotated by experts in the analysed sample.

The psychological characteristics of the three types of suicidal behavior risk levels are the following:

• High: Adolescents present high intensity of both psychological and behavioural alarm signals. There is high intensity of psychological pain, hopelessness, perception of being a burden and a high activation of suicidal beliefs. High risk suicidal behaviour includes an established suicidal plan and the intention to implement it. Previous attempts and depression should also be considered.

• Moderate: Adolescents may have expressed suicidal thoughts and thought of a plan, but such a plan would lack a high degree of precision or concreteness. They present warning signs of emotional suffering, as well as hopelessness, the perception of being a burden and a moderate suicidal ideation.

• Low: They may have expressed passing thoughts about death or suicide, but have no plan or intention to carry it out. Adolescents at this level can present, albeit with a much lower intensity, emotional pain and the perception of being a burden, and the suicidal ideation is low.

Each case in the sample was studied by clinicians to determine a suicidal risk level.

Methodology

As the questionnaire is the central instrument employed in this work, Section Questionnaire delves into critical aspects of questionnaire quality assessment. Next, in Section Knowledge-based model, clinician expertise is implemented as a knowledge-based model. Finally, in Section Inferred approaches, we explore simple machine learning approaches using the data collected. In this case, alternative models are inferred automatically from data without expert knowledge.

Questionnaire

The questionnaire proposed by expert clinicians is made up of tools typically used when assessing suicide risk. In this study, these tools have been selected and ordered in a flow diagram to classify each respondent into the previously mentioned risk levels. The questionnaire was assessed to determine the relation between each question and the expected outcome, i.e. the suicide risk. This gives a practical idea of the ability of each question to convey information about the target class, that is, the predictive ability of each item in the questionnaire. Moreover, the correlation between items in the questionnaire was assessed. While question-to-risk correlation is desirable, question-to-question correlation reveals redundancies in the questionnaire. In brief, the design of the questionnaire should include a minimal set of relevant and non-redundant questions.

In order to assess the questionnaire itself, two quantitative perspectives were explored: on the one hand, the Pearson correlation and, on the other hand, Mutual Information. Particularly, for Mutual Information calculations, entropy-based information gain was used as a key measure to evaluate the information contribution of each questionnaire feature regarding suicide risk. Entropy quantifies the uncertainty or disorder within the data, and information gain measures the reduction in this uncertainty when the dataset is split based on a given feature. Features with higher information gain are considered more informative, as they more effectively reduce uncertainty about the target variable.
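As an illustration of the two measures, the following minimal sketch computes the Pearson correlation and the entropy-based information gain of a feature with respect to the risk label. The toy data, the variable names and the 0/1/2 encoding of the classes are assumptions made for the example; they are not taken from the study dataset. For a numeric questionnaire total, the scores would first need to be discretised into bins, a step that is not shown here.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy of the labels minus their entropy after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy sample: 0 = low, 1 = moderate, 2 = high risk.
risk = [0, 0, 0, 0, 1, 1, 2, 2]
plan = [0, 0, 0, 0, 1, 1, 1, 1]    # yes/no item that separates low risk from the rest
gender = [0, 1, 0, 1, 0, 1, 0, 1]  # item that carries no information in this toy sample

gain_plan = information_gain(plan, risk)      # 1 bit of the 1.5 bits of label entropy
gain_gender = information_gain(gender, risk)  # no reduction in entropy
corr_plan = pearson(plan, risk)
```

In this toy sample the informative item removes one bit of uncertainty about the label, while the uninformative item removes none, which is the behaviour the ranking relies on.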

Knowledge-based model

The knowledge-based model is an expert system developed by two experts in the area, a psychologist and a psychiatrist, both specialized in suicide. In addition, modern explanatory theories of suicide risk such as the fluid vulnerability theory,⁴³ the interpersonal theory of suicide⁴⁴ or the three-step theory⁴⁵ were taken into account.

After years of manually analysing the results of these items, the experts established the thresholds for each questionnaire and identified which concepts are the most important to assess whether a person is at high, moderate or low risk of suicide. The full questionnaires (Q) and independent question items (I) were described in Section Measuring instruments. Table 3 describes the thresholds considered in the full questionnaires as positive for high and moderate suicide risks. These thresholds are defined considering the sum of the item scores in each questionnaire. In contrast, the yes/no answers to independent question items (I) are used directly.

Table 3.

Thresholds for interpreting the concepts measured in the questionnaires (Q) as positive. For “tolerance for mental pain” and “thwarted belongingness,” higher scores indicate a lower associated risk. Note that the “hopelessness” scale is a 4-item true/false questionnaire, with a score of 2 or more considered positive.

Questionnaire	High risk	Moderate risk
Mental pain (PS-E)	≥46	[36–45]
Suicidal belief system (SCS-R)	≥40
Depression (PHQ-8)	≥10
Tolerance for mental pain (TMPS)	≤25
Hopelessness (BHS)	≥2
Thwarted belongingness (INQ)	≤12

In our work, this knowledge has been encoded with the help of computer scientists in an algorithm that is shown in diagrammatic representations in Figures 2(a) and 2(b), respectively, for high and moderate suicidal risk. Based on the conditions for both moderate and high risk, if none of these were met, the patient was deemed low risk.
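The threshold checks in Table 3 can be sketched as follows. The cut-off values are copied from the table, but the dictionary keys, function names and example scores are hypothetical. The sketch only flags which concepts count as positive; it does not reproduce how the diagrams in Figure 2 combine those flags with the yes/no items into a final risk level, since that flow is not described in the text.

```python
import operator

# Cut-offs from Table 3 ("High risk" column). Key names are illustrative only.
THRESHOLDS = {
    "mental_pain_ps_e": (operator.ge, 46),
    "suicidal_beliefs_scs_r": (operator.ge, 40),
    "depression_phq8": (operator.ge, 10),
    "tolerance_tmps": (operator.le, 25),              # lower score = positive
    "hopelessness_bhs": (operator.ge, 2),
    "thwarted_belongingness_inq": (operator.le, 12),  # lower score = positive
}


def positive_concepts(scores):
    """Return the concepts whose total score meets the Table 3 cut-off."""
    return {
        name
        for name, (compare, cutoff) in THRESHOLDS.items()
        if name in scores and compare(scores[name], cutoff)
    }


def mental_pain_band(ps_e_total):
    """PS-E is the only scale in Table 3 with a separate moderate range (36 to 45)."""
    if ps_e_total >= 46:
        return "high"
    if 36 <= ps_e_total <= 45:
        return "moderate"
    return "below threshold"


# Hypothetical respondent (questionnaire totals only, as in Table 1).
example = {
    "mental_pain_ps_e": 40,
    "suicidal_beliefs_scs_r": 30,
    "depression_phq8": 12,
    "tolerance_tmps": 20,
    "hopelessness_bhs": 1,
    "thwarted_belongingness_inq": 15,
}
flags = positive_concepts(example)
```

For this hypothetical respondent, depression and tolerance for mental pain are flagged as positive, and mental pain falls in the moderate range.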

Figure 2.

Diagrammatic description of the knowledge-based approach model.

We observed that not all the measuring instruments described in Section Measuring instruments were used in these algorithms: based on their prior knowledge, experts have not considered ‘number of previous suicide attempts’, ‘identification with close suicide’, and ‘suicide desire’. The questionnaire about ‘perceived burdensomeness’ was not used either, as it is closely related to ‘thwarted belongingness’. The adolescent identifier, the age and the gender were not considered in this approach.

Inferred approaches

With the goal of (i) comparing the results to the knowledge-based method, and (ii) trying to reduce the number of measuring instruments, we use several machine learning-based inferred approaches to analyse the tabular data described in Section Measuring instruments. All the numeric data described in Table 1 have been used in the inferred approaches and, as is usual in this area, the term feature is used to describe each of the personal descriptors (‘age’, ‘gender’), individual question-items (I) or questionnaire results (Q). The machine learning models perform a classification task using these numerical data, assigning a suicide risk level to each sample by predicting a number between 0 (low risk) and 2 (high risk).

Machine learning models allow us to measure how much each of the features contributes to infer the correct class (high, moderate, low) annotated by the experts in the Gold Standard. That is, these techniques help to interpret whether the features have redundant information, or whether some of them are not important to reach the correct answer-type, and, as a consequence, could be removed from the set of questionnaires. Feature ablation helps quantify the significance of each feature by observing changes in predictive abilities.²⁴ This technique is crucial for enhancing model accuracy and interpretability by identifying which features contribute most significantly to predictions. Studies show that feature ablation can lead to better model performance.⁴⁶ This technique also estimates feature relevance and enables feature selection, as progressive ablation methods can refine feature sets without significant loss in accuracy.⁴⁷
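The idea of feature ablation can be illustrated with a small, self-contained sketch: remove one feature at a time, re-evaluate the model, and record the drop in score. The classifier used here (1-nearest-neighbour), the accuracy metric and the toy data are stand-ins chosen to keep the example short; they are assumptions and not the models, metric or data used in the study.

```python
def loo_accuracy(rows, labels, columns):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier using only `columns`."""
    correct = 0
    for i, row in enumerate(rows):
        best_dist, best_label = None, None
        for j, other in enumerate(rows):
            if i == j:
                continue
            dist = sum((row[c] - other[c]) ** 2 for c in columns)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[j]
        correct += best_label == labels[i]
    return correct / len(rows)


def ablation_drops(rows, labels, feature_names):
    """Score drop observed when each feature is removed in turn."""
    all_cols = list(range(len(feature_names)))
    baseline = loo_accuracy(rows, labels, all_cols)
    drops = {}
    for k, name in enumerate(feature_names):
        kept = [c for c in all_cols if c != k]
        drops[name] = baseline - loo_accuracy(rows, labels, kept)
    return baseline, drops


# Hypothetical toy data: the first column separates the classes, the second does not.
rows = [(0, 0), (1, 1), (2, 0), (10, 1), (11, 0), (12, 1)]
labels = [0, 0, 0, 1, 1, 1]
baseline, drops = ablation_drops(rows, labels, ["informative", "noise"])
```

A feature whose removal leaves the score unchanged is a candidate for elimination, whereas a large drop marks a feature the model depends on.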

To assess the impact of feature reduction on model performance, a statistical validation was performed using the Student’s t-test.⁴⁸ This analysis compared the performance metric F1-score of the Machine Learning models with all features against those of models with reduced feature sets (12, 8, and 1 features). The null hypothesis (H₀) assumes that the means of the performance distributions are equal between the full-feature model and each reduced-feature model. The alternative hypotheses are defined as follows: H₁ (the means are unequal), $H_{1}^{<}$ (the reduced-feature model performs worse), and $H_{1}^{>}$ (the reduced-feature model performs better). This statistical approach provides a formal evaluation of whether reducing the number of features leads to any significant loss or improvement in model performance.
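A minimal sketch of the test statistic is given below, assuming an independent two-sample Student's t-test with pooled variance; the text does not state whether the test was paired or how the F1 samples were obtained, so both the variant and the scores used here are assumptions. Only the statistic and the degrees of freedom are computed. In practice, a library routine such as SciPy's `scipy.stats.ttest_ind`, whose `alternative` argument covers the two-sided and one-sided hypotheses, could supply the p-values.

```python
import math
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Two-sample Student t statistic with pooled variance, plus degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    dof = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / dof
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, dof


# Hypothetical F1 scores from repeated runs (not results from the study).
f1_full = [0.84, 0.85, 0.83, 0.86, 0.82]
f1_reduced = [0.83, 0.84, 0.82, 0.85, 0.81]

t_stat, dof = student_t(f1_full, f1_reduced)

# Two-sided critical value of the t distribution for 8 degrees of freedom at the 5% level.
CRITICAL_T_8_DOF = 2.306
reject_h0 = abs(t_stat) > CRITICAL_T_8_DOF
```

With these hypothetical scores the statistic is close to 1 with 8 degrees of freedom, so the null hypothesis of equal means would not be rejected at the 5% level.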

These are the machine learning approaches used to perform the experiments for inferring the suicide ideation levels:

• Decision tree

• Random Forest (RF)

• Extra trees

• Boosted classifier

• Linear regression

• Logistic regression

• Support Vector Machine (SVM)

• Naïve Bayes

• Neural network

Given the small sample size (165 adolescents), the inferred methods were assessed by means of Leave-One-Out Cross-Validation⁴⁹ in an attempt to avoid evaluation biases. This technique maximizes the training data available for each model, with benefits also in robustness against overfitting, as the models are tested against diverse data points.⁵⁰
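The evaluation loop can be sketched as follows: each sample is held out once, the model is trained on the remaining samples, and the held-out predictions are pooled to compute the macro-averaged F1 score. The nearest-centroid classifier and the toy data are assumptions used only to keep the sketch self-contained; they are not among the models or data of the study.

```python
def macro_f1(y_true, y_pred, classes=(0, 1, 2)):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def nearest_centroid_predict(train_x, train_y, query):
    """Stand-in classifier: predict the class whose mean feature vector is closest."""
    sums, counts = {}, {}
    for row, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for k, value in enumerate(row):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1

    def distance(label):
        return sum((s / counts[label] - q) ** 2 for s, q in zip(sums[label], query))

    return min(sums, key=distance)


def leave_one_out(xs, ys):
    """Train on all samples but one, predict the held-out one, repeat for every sample."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        predictions.append(nearest_centroid_predict(train_x, train_y, xs[i]))
    return predictions


# Hypothetical toy data: two questionnaire totals per respondent, labels 0/1/2.
xs = [[5, 2], [8, 3], [6, 1], [25, 8], [28, 9], [26, 10], [45, 14], [48, 15], [46, 16]]
ys = [0, 0, 0, 1, 1, 1, 2, 2, 2]

preds = leave_one_out(xs, ys)
score = macro_f1(ys, preds)
```

Because every prediction is made on a sample the model has not seen, the pooled score uses all 165 cases for testing while each model is still trained on 164 of them.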

Moreover, to further reduce overfitting, we deliberately employed simpler machine learning models, an approach supported by evidence⁵¹ indicating that, for tabular data, simpler models, and particularly tree-based models, not only require minimal hyperparameter tuning but often perform comparably to or better than complex models, while offering greater interpretability. In line with this, we also used a very simple feedforward neural network architecture with minimal hyperparameter tuning.

Experimental results

Questionnaire assessment

An analysis of the elements in the questionnaire was carried out in an attempt to rank the relevance of the features described in Section Measuring instruments for estimating suicide risk and, in the same way, to determine whether feature pairs conveyed redundant information. Pearson correlation is shown in Figure 3.

Figure 3.

Pearson correlation matrix of the features and the suicidal risk level (gold standard).

From the correlation matrix, we found that ‘gender’ and ‘age’ features were the least correlated with risk. Moreover, these features also had low variance and cardinality, indicating lack of information. Therefore, these features, which are not used in the knowledge-based approach, are expected to have little importance for the machine learning approach. By contrast, elements from the questionnaire with the highest correlation with respect to suicidal risk are the features ‘SCS-R’, ‘mental pain’ and ‘previous suicidal attempt’. Thus, these features are expected to be relevant in the machine learning models.

Regarding redundant information, it was found that ‘suicidal desire’, ‘suicide ideation’, ‘suicide plan’ and ‘previous suicidal attempt’ were highly correlated with one another. The features ‘suicides of people near’ and ‘identification with close suicide’ were also highly correlated.

In parallel, Mutual Information was assessed to evaluate the relevance of each feature. Specifically, the information gain of each feature was calculated by measuring the reduction in entropy when the feature is known, reflecting how much uncertainty about the target variable is decreased. This measure provides a ranking of feature importance within the dataset. Therefore, features that produce a larger decrease in entropy compared to the original entropy (without any feature conditioning) are considered more informative and impactful for the prediction task.

The normalized results are gathered in Table 4.

Table 4.

Importance levels of the features based on class entropy information gain.

Importance level	Features	Information gain (%)
1	Suicidal belief system (SCS-R)	49.66
2	Perceived burdensomeness	11.06
	Previous suicidal attempts	7.96
	Suicide plan	6.43
	Mental pain	5.46
	Thwarted belongingness	5.30
	Identification with close suicide	3.43
	Number of previous suicide attempts	3.08
3	Tolerance for mental pain	1.94
	Age	1.71
	Self-harm	1.71
	Depression (PHQ-8)	1.44
4	Hopelessness	0.46
	Suicides of people near	0.21
	Suicide desire	0.05
	Suicide ideation	0.05
	Gender	0.04

In this analysis, the most informative feature was the ‘suicidal belief system’ (SCS-R survey). This is also the feature most correlated with the gold standard, as can be verified in Figure 3.

On the other hand, the features ‘suicide desire’ and ‘suicide ideation’, although highly correlated with the gold standard and with ‘suicide plan’, do not contribute significantly to the decision-making, which may be due to the redundant information they provide.
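As an illustration of the entropy-based ranking used in this section, the following minimal sketch computes the information gain of each feature and normalizes the gains so that they sum to 100, which is how the percentages in Table 4 appear to be scaled. It is a sketch under stated assumptions rather than the authors' implementation: features are assumed to be categorical (a continuous score would first need to be discretized), and the feature names and toy values are made up.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy once the feature is known."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def normalized_gains(features, labels):
    """Information gain per feature, rescaled to percentages summing to 100."""
    gains = {name: information_gain(col, labels) for name, col in features.items()}
    total = sum(gains.values())
    if total == 0:
        return gains
    return {name: 100 * g / total for name, g in gains.items()}


# Made-up toy data: three risk levels and two hypothetical features.
risk = [0, 0, 1, 1, 2, 2]
features = {
    "informative": ["a", "a", "b", "b", "c", "c"],    # determines the risk level
    "uninformative": ["x", "y", "x", "y", "x", "y"],  # independent of the risk level
}

gains = normalized_gains(features, risk)
for name, gain in sorted(gains.items(), key=lambda item: -item[1]):
    print(f"{name}: {gain:.2f}%")
```

A feature that fully determines the class removes all of the entropy, whereas a feature that is independent of the class removes none, so the former takes the whole normalized share in this toy case.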

Risk prediction

In this section, we gather the experimental results in terms of the predictive ability of the two alternative approaches presented, i.e. the knowledge-based model (section Knowledge-based model) and the inferred models (section Inferred approaches).

To begin with, we compare all the machine learning-based inferred models, providing them with fewer and fewer surveys from the questionnaire based on the feature importance level. The results with the performance of inferred models applying feature ablation are shown in Table 5. The higher the F1-score, the better, and the best approach by number of questions provided is boldfaced.

Table 5.

Performance of inferred models given a different number of features, assessed in terms of macro-averaged F1-score (%) obtained by means of leave-one-out cross-validation. The best-performing model for each feature quantity has been boldfaced.

	All features	12 features	8 features	Only SCS-R
Decision tree	79	78	80	66
RF	80	79	85	70
Extra trees	75	82	81	66
Boosted classifier	75	75	77	67
Linear regression	79	81	84	76
Logistic regression	76	73	73	63
SVM	77	76	81	73
Naïve Bayes	77	81	76	76
Neural network	78	81	80	76

Table 5 reveals that tree-based approaches attained superior predictive ability in all the scenarios, except for the scenario in which just a single feature is given to make the prediction. Random forest was one of the best-performing inferred approaches in terms of F1-score and required just 8 elements to make the prediction. Note that, even though we might intuitively have expected predictive ability to grow with the number of questions, the models attained their best performance with a subset of features selected according to the information gain, as shown in section Questionnaire assessment. This suggests that redundant or non-relevant questions are detrimental to the inference algorithm. This can be further verified with a statistical analysis, as explained in the Methodology.

The analysis reveals that reducing the feature set to 12 features does not significantly affect the performance metrics. Reducing it to 8 features, however, yields an improvement over the complete feature set, as confirmed by statistical significance tests. Taking p < 0.05 as the significance threshold for the Student’s t-test,⁴⁸ and comparing the complete set with the 8-feature subset under the one-sided alternative hypothesis $H_{1}^{>}$, the p-value is p = 0.020, which rejects the null hypothesis $H_{0}$ and indicates significantly better performance for the 8-feature subset. Under the two-sided alternative hypothesis $H_{1}$, p = 0.048 is obtained, which confirms a significant difference between using all features and using 8 features. Consistent results are obtained using the Mann-Whitney test.⁵²

Conversely, using only 1 feature yields statistically worse performance than the other subsets, with p = 0.001 for the $H_{1}^{<}$ hypothesis, indicating that using all features is better. This highlights the importance of the SCS-R questionnaire as a single feature, but also shows that it is insufficient on its own to maintain overall model performance.
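To make the comparison between the complete feature set and the 8-feature subset concrete, the sketch below runs a Student's t-test on the two corresponding columns of Table 5. It rests on an assumption that the text does not confirm: the test is taken to be paired across the nine models (8 degrees of freedom). The helper functions are hypothetical and use only the Python standard library, with the t-distribution integrated numerically, so the sketch is not guaranteed to reproduce the reported p-values exactly.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using the symmetry of the density."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    s = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = s * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def paired_t_test(baseline, candidate):
    """Return (t, one-sided p for candidate > baseline, two-sided p)."""
    diffs = [c - b for b, c in zip(baseline, candidate)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    df = n - 1
    p_greater = 1 - t_cdf(t, df)
    p_two_sided = 2 * (1 - t_cdf(abs(t), df))
    return t, p_greater, p_two_sided


# Macro-averaged F1-scores from Table 5, one value per model (same row order).
all_features = [79, 80, 75, 75, 79, 76, 77, 77, 78]
eight_features = [80, 85, 81, 77, 84, 73, 81, 76, 80]

t, p_greater, p_two_sided = paired_t_test(all_features, eight_features)
print(f"t = {t:.3f}, one-sided p = {p_greater:.3f}, two-sided p = {p_two_sided:.3f}")
```

Pairing by model removes the between-model variability from the comparison; an unpaired test on the same two columns would generally give a different p-value.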

These findings suggest that an 8-feature subset strikes an optimal balance between model simplicity and performance. Next, we compared the inferred approaches with the Knowledge-Based (KB) model. The predictions made by this model were contrasted with the expected outcome and summarized as a confusion matrix in Figure 4(a), alongside that of one of the inferred approaches, i.e. Random Forest (RF) with 8 features, in Figure 4(b). Together with the confusion matrices, the F1-score is provided for consistency with Table 5.

Figure 4.

Confusion matrices for the knowledge-based and the RF inferred approach. Notation: 0 = low risk, 1 = moderate risk, 2 = high risk.

A brief inspection of the matrices in Figure 4 reveals that the KB model predicted many moderate and high risk cases as low risk, while this discrepancy occurs less often with the RF approach. The F1-score provides an overall view that combines precision and recall (both reflected in the confusion matrices) and was 52 and 85 for KB and RF, respectively, revealing the large gap between the two approaches.
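For reference, the macro-averaged F1-score can be derived directly from a confusion matrix, as in the minimal sketch below. The matrix is a made-up three-class example and does not reproduce the counts in Figure 4; rows are assumed to hold the true classes and columns the predicted classes.

```python
def macro_f1(confusion):
    """Macro-averaged F1 from a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    k = len(confusion)
    scores = []
    for c in range(k):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true class c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(k)) - tp   # predicted c, true class differs
        denom = 2 * tp + fp + fn
        # Per-class F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    # Unweighted mean: every class counts equally, whatever its frequency.
    return sum(scores) / k


# Hypothetical counts for 0 = low, 1 = moderate, 2 = high risk.
toy_confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]

score = macro_f1(toy_confusion)
print(f"macro F1 = {100 * score:.1f}%")
```

Because the per-class scores are averaged without weighting, poor performance on an infrequent risk level lowers the macro average as much as poor performance on a frequent one.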

Discussion

The aim of this study was twofold. First, the performance of knowledge-based algorithms was compared with that of models inferred from data by machine learning. Second, we sought to reduce the set of assessment instruments (questionnaires) by identifying redundant and non-relevant items while maintaining or improving the predictive capacity of the model, and we proposed a simplified model for suicide risk detection with a reduced set of features that allows for efficient and high-performance assessment.

Regarding the first aim, results showed that data-inferred approaches, such as machine learning models, consistently outperformed knowledge-based models in terms of predictive capacity. In particular, the Random Forest-based model achieved a macro-averaged F1-score of 85% using a reduced feature set, while the knowledge-based model achieved only 52%. This difference reflects the fact that the machine learning models were more effective at correctly classifying risk levels (low, moderate and high), especially in the moderate and high risk cases. This can be seen in Figure 4, where knowledge-based and inferred approaches are directly compared.

These findings are consistent with previous research that has highlighted the capacity of machine learning to capture complex patterns in data and improve predictive accuracy in clinical settings, particularly in vulnerable populations such as adolescents, where traditional models may be limited due to the non-linear nature of suicide risk factors.⁵³

Among the selected variables, mental pain and previous suicide attempts were strongly correlated with the level of risk, reflecting their relevance in identifying high-risk adolescents. This finding is consistent with previous research highlighting the role of mental pain as a central marker in the conceptualisation of suicidal risk in at-risk adolescents.⁵⁴ The inclusion of this variable in a simplified model provides a more nuanced perspective tailored to the emotional realities faced by young people in vulnerable environments.

There have been numerous recent attempts to estimate adolescent suicide risk through machine learning, either by drawing on data associated with prior suicidal ideation and behaviour⁵³ or through indirect information associated with risk factors.²¹ A prominent feature of these investigations has been the large number of variables used. While these approaches have shown promising results in terms of predictive accuracy, their application in clinical or psychosocial settings poses significant challenges. The time and effort required for participants to complete such a large number of items can be problematic, especially in younger adolescents, whose capacity for concentration and attention tends to be more limited.

In this regard, one of the main contributions of this study has been the development of a robust system capable of predicting the potential risk of suicidal behaviour using a reduced set of characteristics. The results in Table 5 show that, after assessing the relevance and redundancy of the features, a subset of only 8 features not only maintained but in some cases improved the models' predictive performance, achieving a macro-averaged F1-score of 85%. This makes implementation in real-world environments more feasible, reducing the burden on both participants and the practitioners administering these tools. Furthermore, this simplification responds to the practical and ethical needs of tailoring assessments to the specific context and characteristics of adolescents, as noted in recent research.^21,55

Thus, the results indicate that it is possible to significantly reduce the number of features used in the model without compromising its performance. Furthermore, when only one feature was assessed, the SCS-R questionnaire proved to have the highest predictive value among the variables analysed, which reinforces its importance as a central tool in suicide risk assessment. Importantly, although the predictive capacity of the SCS-R has been previously validated in adult populations,⁵⁶ its use in adolescents has been less explored. This study provides additional evidence for the efficacy of this instrument in juvenile populations, thus broadening its applicability and utility in this age group. In a context where adolescent-specific tools are limited, these findings reinforce the value of the SCS-R as a key instrument for assessing suicide risk in vulnerable adolescents.

In summary, this study demonstrates that machine learning approaches can outperform knowledge-based models in predicting adolescent suicide risk, especially by optimising the number of features employed. The combination of clinical relevance, redundancy reduction and robust performance reinforces the usefulness of inferred models in real-world contexts. Furthermore, the use of simplified and specific questionnaires, such as the SCS-R, provides a practical and effective tool to identify potential risk for suicidal behaviour, promoting more efficient and accessible assessments in clinical and educational settings. These findings open the door to future work to extend the generalisability of the models and their application in different populations and contexts.

Conclusion

This work started with a self-designed questionnaire addressed to adolescents and aimed at suicide risk detection. The responses of 165 adolescents were considered. The questionnaire itself was assessed quantitatively, in an attempt to detect redundant questions and to gauge the relevance of each feature. Two approaches were applied to the questionnaire responses: a knowledge-based model and a machine learning-based inferred approach. Different conclusions can be drawn from different models, but the inferred approaches, based on machine learning, were shown to improve prediction significantly.

A key conclusion of this work is that the ‘SCS-R’ survey, which assesses the suicidal belief system, can be a strong metric for detecting suicide risk. Whilst other parameters seem to have little effect on the outcome, this survey can be used by itself to detect risk with a macro-averaged F1-score of up to 76%. This is much more interpretable than the models containing up to 17 features, though at the cost of reduced performance.

Even when more than one questionnaire is used, the number of features can be reduced to just 8 with no negative impact on the model, and doing so can even improve its performance. Thus, by performing feature ablation based on entropy-based information gain, we have been able to identify the most important features for suicide risk prediction. This minimizes the need for redundant questions and potentially leads to shortened questionnaires that can be more easily distributed.

Regarding the limitations of this investigation, the relatively small size of the dataset, although common in these kinds of studies, may have contributed to overfitting and limited the model’s generalizability. We used Leave-One-Out Cross-Validation and simple Machine Learning methods to mitigate this risk. However, a larger and more diverse dataset would likely improve the robustness of our conclusions. The use of a single dataset for both training and testing may also limit external validity.

Therefore, we acknowledge that future research should incorporate larger and more diverse datasets to explore the potential benefits of more complex models, including Deep Learning techniques, to determine whether they yield meaningful improvements in predictive accuracy. Future studies will also aim to include secondary or external datasets to support independent validation, improve reproducibility, and generalize the findings beyond this preliminary investigation.

Other limitations of our work include the lack of temporal assessment or clinical assessment tools, which have been found to be beneficial for the performance of the models.¹² Following up or monitoring the patients over a period of time would be interesting to pursue. Moreover, this study was conducted on a specific group of high-risk adolescents in residential care. We acknowledge that this may limit the generalisability of the findings to adolescents outside this particular context. The tools and models developed in this study have been designed specifically for this high-risk population, and their applicability to other contexts or populations should be interpreted with caution. However, the methodological approach of using entropy to assess information gain and verifying this with machine learning approaches is itself generalizable. The age span of the participants (12-18) is also a limitation, as the risk factors may vary with age. However, with the inferred models, we found that the feature ‘Age’ did not have a significant impact on the prediction of suicide risk. Finally, the fact that this tool has not been validated on another dataset must also be regarded as a limitation.

Despite these limitations, our findings provide valuable insights into suicide risk assessment in vulnerable adolescents. Importantly, through our feature ablation analysis, we identified a reduced set of eight features, and a particularly informative single questionnaire, that effectively capture critical risk factors. This reduction in assessment complexity represents a meaningful step toward more practical and accessible suicide risk screening tools. These findings lay important groundwork for adapting and validating these streamlined tools across broader populations and diverse clinical settings in future research.

Footnotes

Acknowledgements

The authors would like to express their gratitude to the research team members and collaborators who contributed their time and expertise to this study. Special thanks are extended to Osakidetza, whose guidance and support were instrumental in shaping this work. The computational resources provided by HiTZ are gratefully acknowledged.

ORCID iDs

Ane Varela

Maite Oronoz

Arantza Casillas

Alexander Muela

Jon García-Ormaza

Alicia Pérez

Ethical considerations

The study was approved by the Ethics Committee for Research on Human Beings of the University of the Basque Country (Ref.97/18). All the participants, and, if applicable, their legal representatives, gave written informed consent.

Author Contributions

The author contributions are highlighted using the CRediT taxonomy.

Conceptualization: Alexander Muela and Jon García-Ormaza contributed to the initial conceptualization of the study, integrating clinical expertise to ensure medical relevance.

Data Curation: Alexander Muela and Jon García-Ormaza were responsible for the collection and organization of clinical data.

Formal Analysis: Ane Varela conducted the primary data analysis.

Methodology: Maite Oronoz, Arantza Casillas and Alicia Pérez designed the study methodology, developed and validated computational models, and ensured methodological rigor.

Software: Ane Varela developed the computational code used in the analysis.

Supervision: Technical oversight was provided by Maite Oronoz, Arantza Casillas and Alicia Pérez, while Alexander Muela and Jon García-Ormaza supervised the medical aspects, ensuring alignment with clinical standards.

Writing – Original Draft Preparation: Ane Varela prepared the initial manuscript draft.

Writing – Review & Editing: All authors contributed to reviewing and editing the manuscript, ensuring accuracy, clarity, and alignment with clinical and technical standards.

All authors have read and approved the final manuscript.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partially funded by LOTU with code TED2021-130398B-C22 funded by the MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR. This work was also partially funded by the Spanish Ministry of Science, Innovation and Universities (EDHIA PID2022-136522OB-C22) and by the Basque Government (IXA IT-1570-22).

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

The dataset used in this study contains sensitive and confidential information from a vulnerable adolescent population in residential care and cannot be publicly shared due to ethical and privacy considerations. Access to the raw data is restricted to protect participant confidentiality in compliance with ethical guidelines. The used questionnaires are described and referenced in the article. For reproducibility purposes, the source code used for data analysis, feature selection, and machine learning modelling has been made publicly available in .

References

1. University of Manitoba. Concept: suicide and attempted suicide (intentional self inflicted injury); 2021. Accessed: 2024-02-06. Available from: https://mchp-appserv.cpe.umanitoba.ca/viewConcept.php?conceptID=1183

2. Martens, Fransoo, McKeen, et al. Patterns of regional mental illness disorder diagnoses and service use in Manitoba: a population-based study. Manitoba Centre for Health Policy, 2004.

3. World Health Organization. Suicide; 2023. Accessed: 2024-02-06. Available from: https://www.who.int/news-room/fact-sheets/detail/suicide

4. Van Meter, Knowles, Mintz. Systematic review and meta-analysis: international prevalence of suicidal ideation and attempt in youth. J Am Acad Child Adolesc Psychiatry 2023; 62(9): 973–986.

5. The Lancet Regional Health – Southeast Asia. Early intervention in mental health: the best bet. Lancet Reg Health Southeast Asia 2022; 5: 100090.

6. Shepard, Gurewich, Lwin, et al. Suicide and suicidal attempts in the United States: costs and policy implications. Suicide Life-Threatening Behav 2015; 46(3): 352–362.

7. World Health Organization. Suicide rates; 2021. Accessed: 2024-02-06. Available from: https://www.who.int/data/gho/data/themes/mental-health/suicide-rates

8. Wang, Yan, Imm, et al. Racial and ethnic disparities in prevalence and correlates of depressive symptoms and suicidal ideation among adults in the United States, 2017–2020 pre-pandemic. J Affect Disord 2024; 345: 272–283.

9. Zhu, Lang, Zhang. Gender differences in prevalence and clinical risk factors of suicide attempts in young adults with first-episode drug-naive major depressive disorder. BJPsych Open 2024; 10(1): e19.

10. Berkelmans, Schweren, Bhulai, et al. Identifying populations at ultra-high risk of suicide using a novel machine learning method. Compr Psychiatry 2023; 123: 152380.

11. de la Torre-Luque, Pemau, Ayad-Ahmed, et al. Risk of suicide attempt repetition after an index attempt: a systematic review and meta-analysis. Gen Hosp Psychiatry 2023; 81: 51–56.

12. Bernert, Hilberg, Melia, et al. Artificial intelligence and suicide prevention: a systematic review of machine learning investigations. Int J Environ Res Publ Health 2020; 17(16): 5929.

13. Pan, et al. Suicidal ideation detection: a review of machine learning methods and applications. IEEE Trans Comput Soc Syst 2021; 8(1): 214–226.

14. Venek, Scherer, Morency, et al. Adolescent suicidal risk assessment in clinician-patient interaction. IEEE Trans Affect Comput 2017; 8(2): 204–215.

15. Chattopadhyay. A study on suicidal risk analysis. 2007 9th International Conference on e-Health Networking, Application and Services, 2007, pp. 74–78.

16. Delgado-Gomez, Blasco-Fontecilla, Sukno, et al. Suicide attempters classification: toward predictive models of suicidal behavior. Neurocomputing 2012; 92: 3–8.

17. Pillai, Oza, Sharma. Review of machine learning techniques in health care. In: Shing (ed). Proc. of ICEIC 2019. Springer Nature Switzerland, 2019, vol 597, pp. 41–50.

18. Kessler, Bossarte, Luedtke, et al. Suicide prediction models: a critical review of recent research with recommendations for the way forward. Mol Psychiatr 2019; 25(1): 168–179.

19. Hunter, Farmer, Benny, et al. The association between social fragmentation and deaths attributable to alcohol, drug use, and suicide: longitudinal evidence from a population-based sample of Canadian adults. Prev Med 2023; 175: 107688.

20. García de la Garza, Blanco, Olfson, et al. Identification of suicide attempt risk factors in a national US survey using machine learning. JAMA Psychiatry 2021; 78(4): 398–406.

21. Haghish, Czajkowski, von Soest. Predicting suicide attempts among Norwegian adolescents without using suicide-related items: a machine learning approach. Front Psychiatr 2023; 14: 1216791.

22. Brzeziński, Fedorovich, Ziarek, et al. Suicidal thoughts and self-destructive tendencies among adolescents. Quarterly Journal Fides Et Ratio 2024; 58(2): 92–100.

23. Muela, García-Ormaza, Sansinenea. Suicidal behavior and deliberate self-harm: a major challenge for youth residential care in Spain. Child Youth Serv Rev 2024; 158: 107465.

24. Merrick. Randomized ablation feature importance. arXiv 2019.

25. Lindh ÅU, Beckman, Carlborg, et al. Predicting suicide: a comparison between clinical suicide risk assessment and the Suicide Intent Scale. J Affect Disord 2020; 263: 445–449.

26. Castillo-Sánchez, Marques, Dorronzoro, et al. Suicide risk assessment using machine learning and social networks: a scoping review. J Med Syst 2020; 44: 205.

27. Parsaei, Taghavizanjani, Cattarinussi, et al. Classification of suicidality by training supervised machine learning models with brain MRI findings: a systematic review. J Affect Disord 2023; 340: 766–791.

28. Aseltine, Doshi, et al. Machine learning for suicide risk prediction in children and adolescents with electronic health records. Transl Psychiatry 2020; 10(413): 413.

29. Chou, Wang, et al. A machine-learning model to predict suicide risk in Japan based on national survey data. Front Psychiatr 2022; 13: 918667.

30. Bryan, Rozek, Khazem. Prospective validity of the suicide cognitions scale among acutely suicidal military personnel seeking unscheduled psychiatric intervention. Crisis 2020; 41(5): 407–411.

31. Díez Gómez, Pérez Albéniz, Ortuño Sierra, et al. SENTIA: an adolescent suicidal behavior assessment scale. Psicothema 2020; 32: 382–389.

32. Bryan, May, Thomsen, et al. Psychometric evaluation of the suicide cognitions scale-revised (SCS-R). Mil Psychol 2021; 34(3): 269–279.

33. Kroenke, Strine, Spitzer, et al. The PHQ-8 as a measure of current depression in the general population. J Affect Disord 2009; 114(1–3): 163–173.

34. Holden, Mehta. The Psychache Scale. 1998.

35. Ordóñez-Carrasco, Cuadrado Guirado, Rojas Tejada. Escala de dolor psicológico: adaptación de la Psychache Scale al español en jóvenes adultos [Psychological pain scale: adaptation of the Psychache Scale into Spanish for young adults]. Rev Psiquiatría Salud Ment 2022; 15(3): 196–204.

36. Orbach, Gilboa-Schechtman, Johan, et al. Tolerance for mental pain scale. Ramat-Gan: Department of Psychology, Bar-Ilan University, 2004.

37. Beck ATBHS. Beck hopelessness scale. Psychological Corp., Harcourt Brace Jovanovich, 1988.

38. Van Orden, Cukrowicz, Witte, et al. Thwarted belongingness and perceived burdensomeness: construct validity and psychometric properties of the Interpersonal Needs Questionnaire. Psychol Assess 2012; 24(1): 197–215.

39. Schuster, Kaiser, Terhorst, et al. Sample size, sample size planning, and the impact of study context: systematic review and recommendations by the example of psychological depression treatment. Psychol Med 2021; 51(6): 902–908.

40. Sassenberg, Ditrich. Research in social psychology changed between 2011 and 2016: larger sample sizes, more self-report measures, and more online studies. Adv Methods Pract Psychol Sci 2019; 2(2): 107–114.

41. Khondoker, Dobson, Skirrow, et al. A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies. Stat Methods Med Res 2016; 25(5): 1804–1823.

42. Fox, Harris, Wang, et al. Self-injurious thoughts and behaviors interview—revised: development, reliability, and validity. Psychol Assess 2020; 32(7): 677–689.

43. Rudd. Fluid vulnerability theory: a cognitive approach to understanding the process of acute and chronic suicide risk. Cognition and suicide: theory, research, and therapy. American Psychological Association, 2006, pp. 355–368.

44. Van Orden, Witte, Cukrowicz, et al. The interpersonal theory of suicide. Psychol Rev 2010; 117(2): 575–600.

45. Klonsky, Pachkowski, Shahnaz, et al. The three-step theory of suicide: description, evidence, and some useful points of clarification. Prev Med 2021; 152: 106549.

46. Pansari, Srivastava, et al. Attack classification using machine learning on UNSW-NB 15 dataset using XGBoost feature selection & ablation analysis. 2024 IEEE 9th International Conference for Convergence in Technology (I2CT), 2024, pp. 1–9.

47. Duan, Liu, et al. AIS-based operational phase identification using Progressive Ablation Feature Selection with machine learning for improving ship emission estimates. J Air Waste Manag Assoc 2024; 74(2): 100–115.

48. Student. The probable error of a mean. Biometrika 1908; 6(1): 1–25.

49. Wong. Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation. Pattern Recogn 2015; 48(9): 2839–2846.

50. Reza SMS, Butman, Park, et al. AdaBoosted deep ensembles: getting maximum performance out of small training datasets. In: Liu, Yan, Lian, et al. (eds). Machine Learning in Medical Imaging. Springer International Publishing, 2020, pp. 572–582.

51. Somvanshi, Das, Javed, et al. A survey on deep tabular learning. arXiv 2024.

52. Mann, Whitney. On a test of whether one of two random variables is stochastically larger than the other. Ann Math Stat 1947; 18(1): 50–60.

53. John, Lin. Machine learning-based prediction for self-harm and suicide attempts in adolescents. Psychiatry Res 2023; 328: 115446.

54. Muela, Bryan, García-Ormaza, et al. Cross-cultural adaptation and psychometric validation of the Suicide Cognitions Scale-Revised (SCS-R) in Spanish adolescents in residential care. Spanish J Psychol 2024; 27: e30.

55. Zang, Hou, Lyu, et al. Accuracy and transportability of machine learning models for adolescent suicide prediction with longitudinal clinical records. Transl Psychiatry 2024; 14(1): 316.

56. Bryan, Thomsen, Bryan, et al. Scores on the suicide cognitions scale-revised (SCS-R) predict future suicide attempts among primary care patients denying suicide ideation and prior attempts. J Affect Disord 2022; 313: 21–26.