Abstract
Sexual grooming, a common facet of sexual abuse, involves preparing a child for abuse by gaining emotional and/or physical access. However, definitions of sexual grooming vary. The lack of consensus among experts contributes to the difficulty in identifying behaviors that are solely characteristic of sexual groomers, as opposed to behaviors in which a well-intentioned individual might also engage. There are various models and measures that attempt to define sexual grooming after the fact; however, a forensic measure to better detect the likelihood of grooming is necessary for improved prevention. Given that children spend the majority of their time away from home at school, we chose to focus on the educational setting. The current project is an initial stage in the development of the Trauma Research Institute Grooming Scale (TRIGS) and is intended for use in the assessment phase of a clinical or forensic assessment or investigation. The present study of 99 psychologists and educational professionals confirmed the presence of two subcategories of grooming behavior: Desensitization (DS) and Relationship Enhancement (RE). Results demonstrated DS behaviors as more clearly indicative of grooming, whereas RE behaviors were not only less likely to be identified as definite grooming indicators but also produced less agreement as to which items within this category should be seen as problematic. Additionally, all DS behaviors were consensually identified as “red flags” or reportable to an administrator, whereas only a little over a quarter of RE behaviors were seen as such. The initial findings of the TRIGS provide strong evidence for the two distinct categories of grooming behaviors and hold promise for prevention efforts against sexual grooming in schools.
Introduction
Sexual grooming is often argued to be a common facet of sexual abuse, describing a process by which the perpetrator gradually conditions the child to be a more vulnerable target for such abuse. As early as the 1930s, the concept of grooming was referenced in The Boys’ Club handbook, with a recommendation that leaders remain vigilant regarding the behavior of volunteers (Atkinson, 1939). Johnson and Robinson (1957) described parental “seductive practices” toward children that were believed to produce “future sexual deviant[s],” such as caressing the child’s genitals or allowing the child to touch the parents’ genitals (p. 1560). The concept of grooming took a more explicit form during the 1970s, with Federal Bureau of Investigation (FBI) Behavioral Science Unit profiler Kenneth Lanning noting that techniques such as offering compliments, money, or affection, or engaging in seductive activities (e.g., massaging, touching), formed a distinct cluster of common offender behaviors that often preceded abuse (Lanning, 2018). Although the historical record is not entirely clear, many experts who practiced during this period, including one of the authors of the present study (CJD), recall first hearing the term “grooming” from Lanning (Burgess & Hartman, 2017; Conte, 1984). Conte may have been the first to use the term in print, acknowledging that sexual abuse takes place along a continuum, beginning with a preparatory phase that may be hard to recognize.
Although research on sexual grooming has been increasing over the years, the definitions employed by researchers still vary greatly. For the present research, we gathered definitions by searching the literature on PsycINFO citing the terms “grooming” and “child sexual abuse,” reviewing each article for a clear definition of grooming. Table 1 includes a sampling of these definitions (but is not intended to be a systematic review of all definitions available). As noted in Table 1, some definitions specify the behaviors used to foster a specific emotional state in the child, such as trust (Gillespie, 2002) or affection (Salter, 1995). Other definitions are nonspecific but include “behaviors. . . [that] make the victim less resistant to the eventual sexual abuse” (Sheldon & Howitt, 2007, pp. 58–59), and still others open the category to any behavior that raises concern to an outside observer that unlawful activity is imminent (O’Connell, 2003).
Table 1. Sample Grooming Definitions Quoted from the Psychological Literature.
Jeglic et al. (2023) found a wide variety of behaviors that were more frequent in sexual abuse contexts than in non-abuse contexts, considering all of them to be relevant to the grooming concept and labeling them “red flags.” However, Metcalf et al.’s (2021) earlier use of the red flag label singled out those behaviors that are more clearly reportable (e.g., sexual language, nudity) and excluded more ambiguous and potentially innocuous behaviors meant to foster interpersonal connection (e.g., private tutoring sessions or nonsexual compliments).
Indeed, a close inspection of definitions in Table 1, together with our review of the literature, suggested two subcategories of grooming behavior that differ in the level of interpretive ambiguity but that are treated interchangeably in relevant research. Winters and Jeglic (2016), for instance, included behaviors such as asking a child (old enough to play on a team, and, therefore, likely aged 8–10 years old in the mind’s eye) to sit on an adult’s lap, or accompanying the child to the bathroom, as grooming. These behaviors, in our view, would be questionable for most adults, and are thus less ambiguous. We label these activities as Desensitization (DS). In addition to the DS behaviors, Winters and Jeglic (2016) also included such behaviors as offering a child a ride home or volunteering at child-centered organizations, behaviors characteristic of many well-meaning and caring individuals (and, admittedly, many pedophiles). We label this type of behavior Relationship Enhancement (RE), to describe its likely purpose, irrespective of underlying benevolent or malevolent motivations.
DS behaviors potentially increase a child’s vulnerability by attempting to directly change the child’s attitude or inclination toward sexuality, sexual expression, or nudity. Perpetrators may use DS behaviors to engage in such activities as gradually familiarizing the child with sexual images, accustoming the child to sexual language, or routinizing the perpetrator’s physical touch, leading to eventual sexual contact (Elliot, 2017). Such behaviors could also include discussing with a child one’s own sexual experiences or asking a child about the child’s sexual experiences (Winters et al., 2020). The appropriateness of most of these behaviors is not context-dependent—that is, the behaviors are inappropriate, or at least worth questioning, in virtually all settings (Kaufman et al., 2006). DS behaviors should arguably be seen as “red flags”—unacceptable behaviors, reportable to authorities—by most professionals who work closely with children.
In comparison, the appropriateness or inappropriateness of RE behaviors appears to be more context dependent. These behaviors, thought to be employed to engender trust or affection, include such activities as the giving of a small gift (not tied to enactment of taboo behaviors), listening to a child’s problems, helping a child in academic pursuits, or complimenting a child’s academic gifts. An educator spending time with a child alone in a classroom after school may be viewed as a caring adult interested in fostering the child’s academic potential. However, if allegations of impropriety were leveled against the educator, that same behavior may be viewed as purposely isolative. Biases regarding appropriate behavior of male versus female adults toward children may also come into play here, as many cultures show greater acceptability for women than for men regarding behaviors that involve touch or physical closeness with children (e.g., comforting a child with a playground injury; Christensen, 2018).
Over-inclusive definitions of grooming are not without potential cost. Research has shown that humans retrospectively overestimate their ability to recognize abuse once it is clear that abuse has occurred (Spenard & Cash, 2022). Such findings may be attributed to hindsight bias (Hawkins & Hastie, 1990)—the tendency to believe, in retrospect, that events are more predictable than they actually are. Scurich et al. (2023) tested hindsight bias specifically in grooming and found that the magnitude of the believed relationship between grooming behaviors and later offending doubled after the members of a large community sample had been told of the purported crime. Similar results, albeit from a college sample and absent data integrity controls, were found in Winters and Jeglic (2016). We would argue that such hindsight bias is much more likely if the behaviors are in the RE class.
It is important to note that the RE and DS behaviors are recognized by many, if not most, grooming researchers. Elliot’s Self-Regulation Model of Illicit Grooming identifies phases of the grooming process, including rapport building (akin to our RE category) and disinhibition (akin to our DS category; Elliot, 2017), but the model has not been formalized in a clinical instrument with established validity and reliability. Similarly, Kaufman’s (1994) Modus Operandi Questionnaire (MOQ) and DiLillo et al.’s (2010) Computer Assisted Maltreatment Inventory (CAMI) both include questions related to desensitizing the child and building trust, but these researchers treat the behaviors as interchangeable in meaning and confound grooming with the abuse itself.
The most recent measure of sexual grooming, and a clear contribution to the literature, is the Sexual Grooming Model (Winters et al., 2020). This measure categorizes grooming as a five-step process: victim selection, gaining access and isolation, trust development, desensitizing the child to sexual content and physical contact, and post-abuse maintenance. Child victims are asked about their impressions of each phase (Winters & Jeglic, 2022). Winters and Jeglic (2022) do not differentiate the potential stand-alone meaning of high scores on the individual “phase” scales, but they present evidence that virtually all sexual abuse victims endorsed items from both DS and trust development (closest to our RE) categories. The measure requires victim admission and insight, with items asking whether the child’s relatives were “manipulated” or whether the perpetrator “misstated moral standards.” Winters et al. (2020) developed their model based on a small number of experts (n = 18), without replication, eliminating 35 of their original 77 items.
The present project is an initial stage in the development of the Trauma Research Institute Grooming Scale-Education (TRIGS-E). The “E” specifier in TRIGS-E is an indication that potential future versions of the TRIGS should be context-specific, as norms for DS and RE behaviors will differ in various settings (educational, medical, caretaking, etc.). The TRIGS-E is intended for use in the assessment phase of a clinical or forensic assessment or investigation, to guide the evaluator in differentiating between behaviors that are clearly grooming-related (DS) and behaviors that should be interpreted with the educational context in mind (RE). We proposed to show that behaviors may be reliably classified into the two categories. Given that RE behaviors are expected to be context-dependent, we chose to begin within a specific context—school—the setting where children spend the majority of their time apart from home. In Assini-Meytin et al.’s (2024) large sample of sexual abuse and boundary violations retrospectively reported in youth-serving organizations, the largest number of offenses (37.09%) in their most recent cohort occurred in the K–12 school setting.
In support of our contention that grooming behaviors in the RE category are more ambiguous—and, interpreted without sufficient context, may lead to miscarriages of justice—we predicted that there would be substantial disagreement among experts as to which behaviors in the RE category are concerning in isolation, while no such disagreement would be seen in the DS category. Consequently, in Study 1, we predicted that the DS behaviors would more likely be seen specifically as grooming, while the RE behaviors would more likely be seen, contextually, as either Benign (BN) or grooming. Additionally, we predicted that both DS and RE items were more likely to be classified as potential or definite grooming behaviors than items that describe Benign educational behaviors—for example, “tells a joke in class” or “gives a gold sticker when the student does a good deed (e.g., cleans a classmate’s desk).”
In Study 2, we hypothesized that clinical and forensic experts could replicate our categorization of items in DS and RE categories. We also hypothesized that DS behaviors were more likely to be classified as Red Flag behaviors (reportable to authorities) than RE behaviors. Finally, in keeping with the societal expectations of female nurturance (Christensen, 2018) and the lower tolerance of sexual exploitation seen in many studies (Jones et al., 2021; Voogt & Klettke, 2017), we expected women to have higher Red Flag scores on DS and lower Red Flag scores on RE than men. An Open Science Framework (OSF) preregistration was completed for this study (https://osf.io/brymf).
Methodology
Participants
Participant experts on sexual abuse were solicited through two sources in 2022 and 2023. First, the search terms “sexual grooming,” “child sexual abuse,” and “school sexual grooming” were entered into PsycINFO to identify relevant publications and obtain corresponding author email addresses. Second, clinical psychologists specializing in child sexual abuse or sexual grooming were located through Google searches, crossing the terms “clinician” or “therapist” with the terms “sexual abuse,” “child sexual abuse,” or “sexual grooming.” Profiles on the Psychology Today website were also searched to locate clinicians who indicated a specialty in the treatment of sexual abuse and/or trauma. All potential experts were individually emailed and asked to complete the survey, evaluating the teacher behaviors with either a 5- or 8-year-old child as the prototype. The two ages were distinguished because the acceptability of behaviors differs with the age of the child. Experts were randomly chosen until 700 emails were sent, with 116 responses collected, a recruitment rate of 15% (a relatively high response rate for a survey of professionals). The first 52 respondents located by our methods were allocated to Study 1, and the remaining 64 were allocated to Study 2. This allocation allowed for paired comparisons between DS and RE behavior scores with power of .80 for effect sizes larger than f = .20. Participants were removed for excessive missing data (>10%). All participants were associated with an English-speaking institution. Although all participants were experts in sexual abuse, Study 2 respondents were also queried specifically regarding their expertise in grooming. Exclusions resulted in the removal of three respondents from Study 1 (final n = 49) and 14 from Study 2 (final n = 50).
Measures
Trauma Research Institute Grooming Scale-Education
We attempted to establish the key behaviors constituting grooming (Metcalf et al., 2021). The literature on grooming was reviewed by the current authors, each of whom contributed potential items to a joint database. The 118 articles crossing the terms “grooming” and “child sexual abuse” in the PsycINFO database were read, with additional items added by the two senior authors based on clinical experience. Items were retained only if 100% of a sample of 11 trauma-informed raters (graduate students) rated the potential DS and RE items as Plausible Grooming behaviors, and kappa exceeded .90 for categorization as DS or RE. Four categories of items were included in the final survey (with full item text in the Results section), although only the last two constitute the TRIGS scale.
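The text does not state which kappa statistic was computed for the 11 raters. As a minimal sketch, assuming Fleiss' kappa (a common choice when more than two raters classify each item) and entirely hypothetical ratings, the agreement check on DS versus RE categorization might look like the following. The function name and the example data are illustrative assumptions, not the authors' analysis code.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list with one label per rater."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    counts = [Counter(item) for item in ratings]
    categories = {label for item in ratings for label in item}

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    per_item = [
        (sum(n * n for n in c.values()) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement, from the overall share of ratings in each category.
    total = n_items * n_raters
    p_chance = sum((sum(c[cat] for c in counts) / total) ** 2 for cat in categories)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical data: 11 raters sort 4 candidate items into DS or RE.
example = [
    ["DS"] * 11,
    ["RE"] * 11,
    ["DS"] * 11,
    ["RE"] * 10 + ["DS"],  # one dissenting rater on this item
]
kappa = fleiss_kappa(example)
print(f"kappa = {kappa:.2f}")
```

In this toy example a single dissenting rating on one of four items is enough to bring kappa close to the .90 threshold, which suggests how strict that retention criterion is.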
Filler items were identified that represented BN behaviors that might occur in an educational setting. These items were included to disrupt method variance for those who might tend to rate all items as problematic. Content for BN items was collected by reviewing websites for educational organizations that provide advice for safe and effective connection with children in mainstream English, math, and physical education courses. Comparison of the results of ratings of these items to proposed grooming items also served as a check of the respondents’ commitment to the task. The Benign items were reviewed by our coders (along with the potential DS and RE items) with only two miscategorizations across all raters.
Four items were included in Study 1 that constituted actual sexual abuse in most, if not all, states in the United States. These items included “exposes naked body to the child,” “teaches the child about masturbation or sexual protection,” “takes sexual photos/videos of the child,” and “shows child pornography magazines/videos.” These sexual abuse items were a further response check, included to disrupt a style of affirming all items as benign, in that no activities in this category should be seen as benign by any serious respondent.
Thirteen items fit the definition of DS (with complete agreement among the three senior raters) and were tested in Study 1. The final scale in Study 2, after refinement from quantitative and qualitative feedback in Study 1, contains 14 DS items (see below for rationale for changes). DS behaviors were actions thought to increase the susceptibility of a child to future sexual abuse by engaging in behaviors that normalize greater comfort with inappropriate touch, sexuality, or nakedness.
Twenty-four items fit our definition of RE (with complete agreement among our three raters on 22 of 24 items) and were used in Study 1. The final scale in Study 2 contained 21 items (see Table 2). RE behaviors were defined as those that would be likely to increase the child’s feeling of closeness and/or dependency upon the adult by suggesting the adult’s interest in and care for the child.
Table 2. Classification of Items as Context Dependent or Definite Grooming and Red Flag Ratings.
Note. TRIGS-E item categories reflect Study 2 categorization. Italicized text changed from Studies 1 to 2. Items with missing data in first two columns were added in Study 2. Numbers = order presented in survey; Category history = category tested in Studies 1 and 2; DS = Desensitization; RE = Relationship Enhancement; Definite Grooming = coded by Study 1 respondents as a definite grooming behavior; Plausible Grooming = coded by Study 1 respondents as either definite or context-dependent grooming behavior (i.e., nonbenign); Red Flag = coded by Study 2 respondents as reportable.
Procedure
In both studies, participants were sent an email detailing the intent of the research, with a survey link to the online platform SurveyMonkey. The setting was specified as educational, with a child who is 5 or 8 years old. Participants were randomized into age group conditions. In addition, participants were asked for demographic information, years of practice in the specialization of child sexual abuse, and whether sexual abuse grooming was an area of expertise for the respondent (although all practiced and/or published in the area of sexual abuse). The DS, RE, and BN subscale items are presented as clusters in the tables below but were intermixed (see item numbers in Table 2) when given to respondents. The scale as seen by respondents is available on the OSF site for this project at https://osf.io/brymf.
Participants were not compensated. Study 1 participants rated the items as “definitely a grooming behavior,” “context dependent (could be benign or a grooming behavior),” or “Benign.” Experts were also encouraged to respond with their thoughts about the measure. Study 2 participants were asked to classify the behaviors as “Desensitization,” “Relationship Enhancement,” or “not grooming” and were given the option to say that, although the item fit the category of potential grooming, they were “unsure” of the category. In a second part of Study 2, the participants were asked whether each of the behaviors constituted a Red Flag, indicating reportable to an administrator (with options of Yes, No, and Unsure). Study 1 had a mean completion time of 7.73 minutes (SD = 3.38) with four positive outliers removed; in Study 2, which was double the length, the mean completion time was 16.69 minutes (SD = 8.01) with four positive outliers removed. We used the STROBE reporting guideline (von Elm et al., 2007) to draft this manuscript, and the STROBE reporting checklist is included in Supplement A.
Results
Approximately twice as many participants in Studies 1 and 2 identified as female (65.3% and 66%, respectively) as identified as male. All participants were defined as experts, either by their publication history in the sexual abuse field or by their self-reported clinical experience. The average number of years practicing for participants was 21.63 in Study 1 (SD = 11.96, all reporting practice) and 22.59 in Study 2 (SD = 12.52, excluding three retired participants and six participants who did not report practice). The two means were not significantly different. The majority of participants in Studies 1 and 2 held a doctoral degree in psychology or a related field of study (80% and 86%, respectively). One individual did not report their degree. Race and age were requested in Study 2 only, where participants were largely White (n = 42, 84%), with a mean age of 53.12 (SD = 14.61). Three participants did not disclose their race and one did not state their age.
Categorization in Study 1
In Study 1, all but three of the 37 DS and RE behaviors were classified as Plausible Grooming indicators (definite or context-dependent) by at least 75% (n = 37) of the sample (see Table 2). BN items, including categorization and percentages, are provided in Supplement B. As expected, the four sexually abusive items (detailed earlier and not shown in Table 2) showed no variance—that is, all respondents viewed the behaviors as grooming; thus, the items did capture behavior in the grooming domain.
As predicted, DS behaviors were more likely to be seen as Definite Grooming behaviors. The average proportion of items in each category classified as Definite Grooming by the respondents was highest in DS (M = 0.74, SD = 0.19), substantially lower in RE (M = 0.22, SD = 0.12), and lower still in BN (M = 0.05, SD = 0.06). A repeated measures ANOVA comparing the DS and RE categories, removing BN due to floor effects, was highly significant (Pillai’s Trace = 525.98, p < .001, η2 = .92). No main effect for gender emerged, but the Gender by Grooming Type interaction was significant (Pillai’s Trace = 6.27, p < .02, η2 = .12). Average classifications for the RE behaviors were virtually identical between genders, but female experts rated more of the DS behaviors as Definite Grooming compared to males, with means of 0.78 (SD = 0.14) and 0.65 (SD = 0.23), respectively, t = 2.44, p < .01, d = 0.76.
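As an illustration of how the per-category means reported here could be derived, the sketch below computes, for each respondent, the proportion of items in a category rated as definitely grooming, and then averages across respondents. The item numbers, ratings, and function name are hypothetical assumptions introduced for the example. They are not the study data or the authors' code.

```python
from statistics import mean

# Hypothetical mapping of item number to its proposed category.
ITEM_CATEGORY = {1: "DS", 2: "DS", 3: "RE", 4: "RE", 5: "RE", 6: "RE"}


def proportion_definite(ratings, item_category, category):
    """Share of one respondent's items in `category` rated 'definite'."""
    items = [item for item, cat in item_category.items() if cat == category]
    return sum(ratings[item] == "definite" for item in items) / len(items)


# Two hypothetical respondents using the three Study 1 response options.
respondent_a = {1: "definite", 2: "definite", 3: "context",
                4: "definite", 5: "benign", 6: "context"}
respondent_b = {1: "definite", 2: "context", 3: "context",
                4: "context", 5: "context", 6: "benign"}
respondents = [respondent_a, respondent_b]

# Category means are averages of the per-respondent proportions.
mean_ds = mean(proportion_definite(r, ITEM_CATEGORY, "DS") for r in respondents)
mean_re = mean(proportion_definite(r, ITEM_CATEGORY, "RE") for r in respondents)
print(mean_ds, mean_re)
```

Because each respondent contributes one DS proportion and one RE proportion, the two scores are paired, which is why a repeated measures comparison is the natural analysis.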
It is important to note that respondents not only were less likely to identify RE items as Definite Grooming indicators but also disagreed as to which items within the category should be seen in this way. Only 4 of the 49 respondents believed that all of the RE items were context-dependent; the respondents individually identified from 0 to 10 of the 24 RE items as definite signs of grooming. In contrast, 88% (n = 43) of the respondents agreed that the majority of the DS behaviors were definite indicators of grooming. The percentages of DS and RE behaviors deemed Definite Grooming are graphically represented in the violin plot in Figure 1.

Figure 1. Percentage of desensitization and relationship enhancement behaviors identified as definite grooming examples.
Response to Quantitative and Qualitative Feedback: Changes in Study 2 Categorizations
In response to both the quantitative results from Study 1 and qualitative feedback, minor scale changes were made and are reflected in the categorization history presented in Table 2. As outlined in the table, five items were reworded slightly, eight changed categories, six were removed, and three were added. “Engages in discussions about the adult’s other adult relationships to illustrate a point” was moved from DS to RE, given low rates of Definite Grooming categorization and comments by experts as to context-dependent situations in which such discussions might be nonsexual and appropriate. The two items referring to secrets had high rates of Definite Grooming categorizations and were moved to DS. Two BN items, “coaches the child individually in an extracurricular sport” and “tutors a child who is educationally behind” were reclassified to RE due to high Plausible Grooming ratings, while the item “massages a child’s foot when the child has a cramp” was moved from BN to DS, again due to high Plausible Grooming ratings (and content involving touching). Additionally, two RE items, “plays sports with the children as a player” and “is involved in youth-serving organizations,” which were outliers in the Plausible Grooming identification task, were moved to the BN category. These items received comments from experts that school administrations can at times include such activities as part of job requirements. The remaining item with a Plausible Grooming score below 75% (“focuses attention on a child who is depressed/anxious,” at 65.3%, an item receiving no comments) was tentatively retained, particularly in light of Jeglic et al.’s (2023) finding that “troubled” children were more likely to be victims of child sexual abuse. These classifications were then retested in Study 2.
Expert comments also led to the removal of six items, not shown in Table 2 or used in Study 2. For instance, “texts the child about school related topics” was seen as too dependent on school rules, undermining its general utility. Similarly, “asks questions about child’s sexual experiences/relationships” could be necessary under some circumstances (e.g., suspected abuse). Several individuals commented in Study 1 that the item “stays in room when child is undressing (e.g., for gym or swimming)” might be more or less appropriate depending on gender of child and/or staff. Consequently, the item was removed and two versions identifying a staff member of the same or different gender as the child were added. The item “offers a one-armed hug to child as an expression of pleasure to see the child” was added. Two items lacked clarity (“tells the child that his or her behavior is important to make the adult feel happy or sad” and “talks about sexual things they themselves have done”) and therefore were difficult to rate. Lastly, although the item, “makes threats about abandonment/rejection/family breaking up,” relates to relationships, the behavior itself is not enhancing or conducive to creating a closer relationship. The full text for each version of the scale is available at https://osf.io/brymf. The final scale tested in Study 2 consisted of 14 DS items, 21 RE items, and 13 BN items intermixed to disrupt response biases. It should be noted that BN items are included in the research version of the scale only and are not intended for use in a professional evaluation.
Categorization in Study 2
The average respondent classified 78.6% of the DS behaviors, 61.6% of the RE behaviors, and 34.6% of the BN behaviors in one of the two grooming categories. Table 3 shows full results. For those who believed that an item was a grooming behavior, 83.5% of the DS items were categorized as DS, and 83.2% of the RE items were categorized as RE. As the table shows, the experts typically demonstrated most uncertainty differentiating the RE and BN categories. A behavior in the Benign group was almost never classified as DS (M = 0.44 misclassified items per expert, or 3.4% of the 13 behaviors, SD = 1.19) but was classified as RE almost one third of the time (M = 4.06, SD = 4.55). Total correct classifications were not related to gender, degree, research status, or years of clinical experience. All individual grooming items were sorted into the predetermined category by the majority of respondents. These results support the hypothesis that experts could differentiate and reliably sort items into these categories, but they also indicate that the RE behaviors would be more likely to be seen as context-dependent. Note that respondents were unsure where to categorize an average of 4.10 (SD = 4.14) of the 21 RE items (19.5%).
Table 3. Mean Number of Categorizations of Behavior by Type.
Note. Parentheses indicate standard deviation. DS = Desensitization; RE = Relationship Enhancement; BN = Benign.
Within the RE behaviors, there was one missing value; regression replacement was used to determine the score.
Red Flag Results
The summary score for Red Flag rating was calculated by summing the ratings within category (0 = No Red Flag; 1 = unsure; 2 = Red Flag) and dividing by the number of items. The DS items, as predicted, were consensually seen as likely red flags (M = 1.73, SD = 0.20), while the RE category mean fell almost exactly midrange (M = 0.96, SD = 0.37). The Red Flag summary score thus significantly differentiated the DS and RE categories (with BN not included, given the floor effects for this variable) in a two-way mixed analysis of variance crossing type with gender, F(1, 49) = 200.42, p < .001, η2 = .83. The gender main effect and interaction were nonsignificant. As predicted, very few of the 13 BN items were seen as red flags (Table 3). Most (n = 38) of the respondents gave no Red Flag ratings to any of the BN items, while seven respondents rated one item and five respondents rated two items as problematic.
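The scoring rule described above (0 = No Red Flag, 1 = unsure, 2 = Red Flag, averaged over the items in a category) can be sketched as follows. The response labels, function name, and example answers are assumptions made for illustration, since the text gives only the numeric coding.

```python
# Numeric coding taken from the text; the string labels are hypothetical.
RATING_POINTS = {"no": 0, "unsure": 1, "yes": 2}


def red_flag_summary(responses):
    """Mean Red Flag rating for one respondent across one category's items."""
    if not responses:
        raise ValueError("at least one item rating is required")
    return sum(RATING_POINTS[r] for r in responses) / len(responses)


# Hypothetical respondent rating the 14 DS items:
# 11 "yes", 2 "unsure", 1 "no" gives (22 + 2 + 0) / 14.
ds_answers = ["yes"] * 11 + ["unsure"] * 2 + ["no"]
ds_score = red_flag_summary(ds_answers)
print(round(ds_score, 2))
```

Dividing by the number of items keeps the score on the same 0 to 2 scale for every category, so the 14-item DS scale and the 21-item RE scale can be compared directly.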
As seen earlier in Table 2, each item in the DS category was seen by the majority of respondents as a likely Red Flag, with the exception of “stays in room when child is undressing (e.g., for gym or swimming)” (30%) and “massages a child’s foot when the child has a cramp” (48%). No respondent rated fewer than 7 of the 14 DS items as Definite Red Flags, with an average number of 11.30 items identified this way (SD = 2.00). Further evidence of consensus was seen in that the same three items (the two undressing items and foot massage) were responsible for 90.2% of the 51 No Red Flag answers across items in this category.
The RE items, as predicted, presented the most complicated picture. An average of 5.9 of the 21 items (SD = 0.19) were identified as red flags. Further, as predicted, there was more disagreement among respondents as to which of the 21 items were most and least indicative of grooming, as well as most deserving of Red Flag status. Also, we note that experts were two to three times more likely to state that they were unsure whether the RE items were red flags (giving this rating to 40.1% of the items) as opposed to DS (12%) or BN (16%) items. As a class, RE items appeared to raise concern in some experts and not others and, therefore, may be more subject to context, while DS behaviors raised concern in most and BN behaviors, as expected, were concerning to few.
Individual Difference Results
None of the hypothesized gender differences were present for correct categorizations or for the use of the Red Flag category in Study 2. Exploratory correlations were conducted between the Red Flag summary scores for DS and RE and the variables of age, years of experience, clinical and/or research experience, and level of self-reported expertise in grooming. Only 1 of the 10 correlations was significant. Clinical experience (as a dichotomous variable) correlated at r = −.35, p < .02, with the RE Red Flag score, indicating that clinicians were more likely to view the behaviors as reportable. However, given that a z for the binomial revealed that the full pattern of correlations could be due to chance, this post hoc finding was considered marginal. There were also no significant differences between age groups (5- vs. 8-year-old).
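The reasoning behind treating a single significant result among 10 exploratory correlations as marginal can be illustrated with a short calculation. The sketch below uses an exact binomial tail rather than the normal (z) approximation mentioned in the text, and it assumes independent tests at a nominal alpha of .05. Both assumptions are ours, as the text does not report these details.

```python
from math import comb


def prob_at_least_k(n_tests, k, alpha):
    """Chance of k or more significant results among n independent null tests."""
    return sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )


# Assumed values: 10 correlations, 1 significant, nominal alpha of .05.
p_chance = prob_at_least_k(n_tests=10, k=1, alpha=0.05)
print(f"P(at least 1 of 10 significant by chance) = {p_chance:.2f}")
```

Under these assumptions, at least one nominally significant correlation would be expected by chance in roughly 40% of such sets of 10 tests, which is consistent with the cautious interpretation given above.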
Discussion
Although sexual grooming is a widely recognized aspect of sexual abuse, there is a lack of consensus in the field regarding which behaviors actually constitute such grooming. The variety of definitions, models, and measures highlights the ambiguity regarding the nature of grooming and, by extension, the behaviors that comprise it (Craven et al., 2006; Denne & Stolzenberg, 2023; McAlinden, 2006; Winters et al., 2020). Many behaviors that could be employed by individuals attempting to groom a child are also common among well-intentioned adults, including educators. Given the tendency to interpret an accused perpetrator’s ambiguous actions as evidence of guilt due to confirmatory bias (Neal et al., 2022; O’Donohue & Cirlugea, 2021; Scurich et al., 2023), without a clear consensus on what behaviors are indicative of sexual grooming, it is challenging both to protect at-risk children from abuse and to prevent wrongful accusations against educators or other adults who interact with children.
The present study lends support to the notion that there are at least two distinct categories of sexual grooming behaviors, DS and RE. Both types of behavior were seen as Plausible Grooming behaviors, in that they are behaviors that can be instituted for the purpose of rendering the child more open to sexual contact from an adult. Experts agreed that the items in both categories could be grooming behaviors. Thus, a high score in either category could be interpreted by a forensic expert as consistent with suspect motives.
Importantly, however, only the DS behaviors were consensually seen as “Red Flags”—that is, reportable behaviors that should raise the suspicions of an administrator. Respondents demonstrated a greater reluctance to categorize RE items as definitive grooming indicators, or as Red Flags, and categorization of the items was highly variable. An RE behavior might be seen as a red flag by one evaluator, equivocal by a second, and normal by a third. No two evaluators wholly agreed as to which RE behaviors might be red flags, although all but 2 of the 50 participants in Study 2 nominated at least one RE behavior in the definitive Red Flag category. An accused innocent teacher showing behaviors in the RE category might therefore be at greater risk of conviction, as the accusation casts a pall over the RE behavior history.
The TRIGS-E must be further evaluated by teachers themselves in an educational setting. If current results are replicated, the resulting scale could serve two purposes. First, the categories could form a foundation for more helpful teacher training. “Never hug a child” or “never give extra time to a needy teenager” may appear to be risk-management strategies (and certainly some teachers make such statements), but, as advice, these statements may be neither realistic nor compassionate. Preliminary data from a study with educator participants demonstrated that most educators do report having at least occasionally hugged a child or spent extra time with a child (Metcalf, 2025), but concerns around misperceptions can shut down necessary discussion about the contexts in which such behavior is appropriate.
Second, there is a need for a reliable measure to identify grooming behaviors in forensic settings, both to protect students from sexual abuse and to reduce bias by safeguarding educators from false accusations. The results from Study 2 show progress in the development of the TRIGS-E as a more consensual measure of whether an individual’s behavior was judged by a reasonable professional as grooming behavior. Combining the data herein on expert judgments with input from educators themselves, national norms could be made available to experts evaluating such cases. Rather than making a vague statement that one or more of the individual’s behaviors appeared to be grooming, a replicated version of the scale would potentially allow an evaluator to state whether an individual had (a) engaged in behaviors that most educators reject (DS) at rates above established cutoffs and/or (b) engaged in behaviors that are more ambiguous (RE) at levels that are close to the norm (a more defense-supportive result) or above the norms in frequency (a more plaintiff-supportive result, but not as problematic as the DS finding).
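The two-part statement described above can be sketched as a simple comparison of category scores against norms. The sketch is purely illustrative: no norms or cutoffs exist yet for the TRIGS-E, so the `Norms` fields, the `summarize` function, and every number in the usage example are hypothetical placeholders, and the output is a descriptive summary intended to raise hypotheses rather than a decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Norms:
    """Hypothetical normative values; none have been established for the TRIGS-E."""
    ds_cutoff: float      # DS score above which behavior exceeds the cutoff
    re_norm_upper: float  # upper bound of the normative range for RE scores


def summarize(ds_score: float, re_score: float, norms: Norms) -> list[str]:
    """Describe where an individual's DS and RE scores fall relative to norms."""
    findings = []
    if ds_score > norms.ds_cutoff:
        findings.append("DS behaviors above cutoff")
    else:
        findings.append("DS behaviors at or below cutoff")
    if re_score > norms.re_norm_upper:
        findings.append("RE behaviors above the norm in frequency")
    else:
        findings.append("RE behaviors close to the norm")
    return findings


# Placeholder values only; they are not drawn from any data set.
example_norms = Norms(ds_cutoff=2.0, re_norm_upper=6.0)
for line in summarize(ds_score=0.0, re_score=4.0, norms=example_norms):
    print(line)
```

Keeping the norms as explicit inputs, with no built-in defaults, reflects the fact that any cutoffs would have to come from replicated normative data.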
The last author has extensive forensic experience in which grooming behaviors in the RE category were used to justify an accusation of misconduct that came to trial, at times leading to exoneration but occasionally resulting in a conviction. These experiences underline the necessity to evaluate the instrument and its use carefully—to use the instrument as a way to raise alternative hypotheses that can be tested more thoroughly by external evidence rather than as a decision-making tool. The information that an educator is engaging in RE behaviors only, for instance, can suggest that the evaluators should resist a rush to judgment. We would argue, however, that the existence of a measure such as the TRIGS-E might begin to address the chilling effect of oversurveillance on teachers, many of whom already work in fear of false accusations (Anderson & Levine, 1999). Rawlinson (2015) found that fear of false accusations was a primary concern of teachers considering leaving the profession.
There is growing interest in attachment theory as an underpinning of the study of the damage that sexual abuse may cause. The ramifications of attachment theory are too complex to describe here, but it is important to note the harm that may be caused by arbitrarily curtailing RE behaviors in educators. The positive impact of the single compassionate adult in the life of a child deprived of social input has been strongly emphasized (Ashton et al., 2021; Bellis et al., 2017), and anecdotal reports abound in which teachers went well beyond the requirements of their job to enhance the likelihood of a child’s future success (Roberts, 2023). Relationships with adults other than parents can stabilize attachment in an otherwise poorly developing system (Joseph et al., 2014; Lamb et al., 2022), but they rely on behaviors that signal that attachment is present (such as RE behaviors).
Importantly, analysis of the data revealed that behaviors related to secret-keeping were frequently classified with the DS behaviors and seen as red flags. Consequently, the definition of DS behavior was expanded in Study 2 to include the fostering of secrecy between the adult and the child, leading to the recategorization of the two secrecy-related items (as noted earlier). These items appeared to serve multiple purposes for the perpetrators, both as desensitizers of more serious misbehavior and as RE devices. The fostering of secrecy functions more similarly to a DS behavior empirically, theoretically, and practically. Similar rates of rejection were demonstrated by both psychological experts and educators, with correlations reflecting more similarity to DS items as opposed to RE or BN items. By encouraging children to maintain secrecy, the perpetrator desensitizes the child to engaging in taboo behavior and limits disclosure to trusted adults, increasing the likelihood that future problematic behavior will occur. Importantly, van Delft et al. (2015) cite secrecy as a potential mediator in the prediction of the child’s reactive psychopathology after child sexual abuse. Secrecy is also itself characterized as a grooming strategy in Ringenberg et al.’s (2022) scoping review of child grooming strategies.
The expected gender differences did not reliably emerge, although one small effect was found showing that women were more definitive in rejection of DS items. The absence of predicted effects was unlikely to be the result of power insufficiency, given that nonsignificant differences had effect sizes well under η² = .05. However, the differences in the strength of the rejection that the significant effect implies may translate into differences in behaviors in situ (e.g., reporting a colleague). These possibilities are yet to be explored. Nonetheless, both male and female experts agreed that (a) DS behaviors were likely Red Flags, (b) RE behaviors should be seen contextually, and (c) both DS and RE behaviors could in some contexts be seen as grooming. The greater sensitivity of female raters to sexual misconduct often seen in jury research (McCoy & Gray, 2007) and the greater fear in male professionals of false accusations (Clyde, 1994; Fansher et al., 2022) did not affect expert judgments, although it may affect educators in situ.
Limitations and Future Directions
Though results are encouraging that the TRIGS-E may provide a useful forensic tool differentiating benevolent and malevolent behaviors, the sample size across studies made it difficult to conduct the exploratory analyses that might begin to characterize how elements such as cultural background and other identity issues may inform its use. The sample was predominantly female, for instance, as well as predominantly White. Although no reliable gender differences emerged in the analyses of experts, gender differences may well be found in analyses of educators themselves. Gender-specific expectations and norms deserve analysis.
Further, participants’ race and ethnicity may have impacted their view of grooming behavior. Cultures may vary in the acceptability and normalization of physical boundaries and touch (Beaulieu, 2004; Burleson et al., 2018), which may result in differences in how DS and RE behaviors are interpreted, underlining the need to conduct analyses cross-culturally. The nearly universal rejection of the DS behaviors across genders and races within our sample does imply that the cross-cultural variation of acceptability of these behaviors may be small, but the strong caveat here is that our sample lacked diversity (in race, although not in gender or region). Additionally, there may be barriers to reporting, such as fear, community passivity, prejudice, and general attitudes toward sexual grooming (Shafe & Hutchinson, 2014), that interact with gender, race, and sexual orientation. These interactional effects were not possible to explore, given the largely female and White sample. Cultural and race-related differences in homonegativity have also been reported (Richter et al., 2017), suggesting that Relationship Enhancement behaviors might be less accepted if the teacher–student pair is same-sex. Versions of the TRIGS that could address grooming in medical settings (where touching rules differ) or online (where almost all interaction is verbal or visual) may also be helpful. Further, regional differences in the acceptability and meaning of various behaviors could exist and should be explored.
Another clear consideration regarding the large-scale use of the TRIGS-E would be to explore the age group component to the interpretation of DS, RE, and BN behaviors. The current sample was asked to evaluate teacher behaviors with a 5- or 8-year-old child as a prototype. However, a behavior considered benign with elementary school children (e.g., sitting close to the child while reading) might take on more overtones of DS when performed with middle school- or high school-aged children. Further, the present study did not investigate the process of online grooming nor adapt the TRIGS-E with these considerations in mind. Future studies can contribute to the refinement of scale items to clarify behaviors (e.g., with more precise or nuanced descriptions), as well as integrate behaviors that may be present in online contexts. We are also currently collecting data on the utility of the TRIGS-E as a teaching tool for educators; the TRIGS-E could serve as both a potential source of school policy about appropriate and inappropriate behaviors and a standard by which to measure such behaviors.
In sum, the TRIGS-E shows promise as the first scale with practical applicability in an educational context. The views of psychological experts evaluated here must be supplemented by a regionally, racially, and gender-diverse sample of educators. A random sample of educators would also be more likely to be economically diverse, which is less likely true in our sample of experts. As psychologists attempt to contribute to justice in this contentious area of forensic work, development of the TRIGS could serve to increase clarity and decrease unfairness.
Supplemental Material
sj-docx-1-jiv-10.1177_08862605261428180 – Supplemental material for Development of the Trauma Research Institute Grooming Scale for Educational Settings (TRIGS-E)
Supplemental material, sj-docx-1-jiv-10.1177_08862605261428180 for Development of the Trauma Research Institute Grooming Scale for Educational Settings (TRIGS-E) by Katherine E. Metcalf, Kenneth J. Thompson, Lillian Mecum, Alberto Gomez and Constance J. Dalenberg in Journal of Interpersonal Violence
Supplemental Material
sj-docx-2-jiv-10.1177_08862605261428180 – Supplemental material for Development of the Trauma Research Institute Grooming Scale for Educational Settings (TRIGS-E)
Supplemental material, sj-docx-2-jiv-10.1177_08862605261428180 for Development of the Trauma Research Institute Grooming Scale for Educational Settings (TRIGS-E) by Katherine E. Metcalf, Kenneth J. Thompson, Lillian Mecum, Alberto Gomez and Constance J. Dalenberg in Journal of Interpersonal Violence
Footnotes
Ethical Considerations
This study was approved by the Institutional Review Board at Alliant International University-San Diego (approval #AY2024-2025-344) on June 19, 2025.
Consent to Participate
Study purpose and requirements were described. By completing and submitting the survey, participants provided consent.
Funding
The authors received no financial support for the research and/or authorship of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interests with respect to the authorship and/or publication of this article.
Data Availability Statement
Supplemental Material
Supplemental material for this article is available online.