Abstract
Substantial resources have been dedicated to designing and implementing training courses that focus on enhancing the interviewing skills of police officers. Laboratory research studies and real-world assessments of the effectiveness of interview training courses, however, have found notably mixed results. In this article, empirical studies (N = 30) that have assessed the effectiveness of police interview and interrogation training courses were systematically reviewed. We found a wide variation in terms of the type, length, and content of the training courses, the performance criteria used to assess the training effectiveness, and the impact of the training courses on interviewing performance. Overall, the studies found that basic interviewing skills can be developed to a certain level through even short evidence-based training courses. More cognitively demanding skills, such as question selection and meaningful rapport-building, showed less of an improvement post training. The courses that included multiple training sessions showed the most consistent impact on interviewing behavior. This review also indicated a need for more systematic research on training effectiveness with more uniform and longer-term measures of effectiveness. Our findings should help guide future research on this specific topic and inform the training strategies of law enforcement and other investigatory organizations.
The information gathered from victims, witnesses, and suspects is vital for the successful resolution of criminal investigations (Snook et al., 2010). Interviewees are often the only source of information regarding the event in question, and even if physical evidence does exist, questioning those involved helps provide context and explain how the available evidence fits into the overall event (Westera et al., 2016). The ability of investigators to generate the maximal amount of accurate information from interviewees is therefore a vital skill, a fact that led practitioners and academics to create evidence-based interviewing protocols to improve interviewing performance (Fisher and Geiselman, 1992; R Milne and Bull, 1999). The purpose of the current article was to systematically review the studies that assess the effectiveness of training on such protocols in improving interviewing practice.
Research on investigative interviewing practices within law enforcement samples has demonstrated that they consistently fall short of evidence-based best practices. In one early examination of real-world interviews, Fisher et al. (1987) analyzed 11 full-length adult victim interviews conducted by detectives in Florida. They found that the interviewers in their sample relied upon direct closed-ended questions, interrupted the interviewee frequently, and used a questioning sequence that did not match the interviewee’s mental representation of the event—all of which have been found to limit the amount and quality of information an interviewee is likely to provide. Wright and Alison’s (2004) analysis of 19 Canadian police interviews with adult witnesses found similar issues, as interviewers in their sample interrupted the interviewee frequently and relied heavily on closed as opposed to open questions (also see Snook and Keating, 2011 for similar findings). These poor questioning practices extend to the interviewing of children, as research has shown that interviewers rarely use open-ended prompts when questioning child victims and witnesses (see Lamb et al., 2007 for a review).
Poor interviewing practices have also been found within studies analyzing the interviewing of suspects. For example, in a seminal study on interview performance by Baldwin (1993), a large-scale review of 600 interviews from the UK found a series of inappropriate questioning practices, including the use of leading questions, exerting pressure on the interviewee, frequent interruptions, a lack of structure within the questioning sequence, and a general lack of confidence and control over the interview process. A review of questioning practices within 80 suspect interviews from a Canadian police organization found an almost exclusive reliance on closed and probing questions while open questions were used only rarely (MacDonald et al., 2017).
In response to the poor interviewing practices seen in many real-world contexts, a variety of evidence-based interviewing protocols have been created based on psychological science research. One of the earliest and most widely used examples of such an approach is the Cognitive Interview (CI; Fisher and Geiselman, 1992). This comprehensive interviewing approach was created using findings from the cognitive and social psychology literature and includes a number of memory-enhancing techniques (e.g., adapting questions to the interviewee’s unique perspective and mentally reinstating the context of the original event) and elements related to the social dynamics of the interview setting (e.g., building rapport). Meta-analytic reviews have shown that the CI is effective in generating more detailed and accurate information from interviewees compared with control conditions (Köhnken et al., 1999; Meissner et al., 2014; Memon et al., 2010).
More recently, a comprehensive interviewing approach for questioning all interviewee types, known as the PEACE framework, was created and implemented within the UK in the early 1990s (R Milne and Bull, 1999). Incorporating the CI and Conversation Management models (see Shepherd, 2007), PEACE is an inquisitorial and information-seeking approach that focuses on building rapport with the interviewee as a way of facilitating information disclosure (Snook et al., 2010). Research on PEACE-based information-gathering approaches has shown that they are effective at generating information while avoiding potential pitfalls associated with accusatory approaches such as false confessions (Meissner et al., 2014). A more comprehensive version of the PEACE framework was developed a decade after its launch and a five-tier training was designed to equip the officers at different knowledge and experience levels and those conducting different types of interviews (i.e., suspect, witness, and victim interviews; L Griffiths and Milne, 2006; B Milne et al., 2019). The PEACE framework is largely considered the current best practice for questioning suspects, and PEACE-based interviewing models have been incorporated into law enforcement organizations across the world (e.g., New Zealand, Norway, Canada; see Bull, 2018; Snook et al., 2010).
Given the developmental differences within young people and children, specialized protocols for questioning this population have been created as well. Although the approaches share many similarities, arguably the most well-known and researched is the National Institute of Child Health and Human Development (NICHD) protocol (Lamb et al., 2008). Developed by a group of researchers led by Michael Lamb, the NICHD protocol is a step-by-step process for moving through an interview with a child about a past event that they have witnessed or experienced, including building rapport, reviewing ground rules, and relying on open-ended questioning. The protocol has been tested extensively both within laboratory settings and within real-world interviews with actual child victims, and this research has shown that even children as young as 4 years old can provide relatively detailed and accurate accounts if interviewed using this method (Lamb et al., 2007). The protocol has been translated into many other languages and is widely used across the world, including in the UK, USA, Canada, Netherlands, Finland, Israel, Japan, Korea, Norway, Portugal, and Scotland (La Rooy et al., 2015).
The evidence-based interview models and frameworks developed in recent decades have some common principles that can be applied in any type of investigative interview. These principles include rapport-building (i.e., establishing and maintaining a positive relationship with the interviewee; Gabbert et al., 2021), a positive attitude towards the interviewee (Holmberg and Christianson, 2002), eliciting a free narrative (Kontogianni et al., 2020), appropriate questioning (using more open-ended and non-leading probing questions; Boon et al., 2020; Oxburgh et al., 2010), the 80–20 rule (allowing the interviewee to talk for 80% of the interview; Snook et al., 2012b), and the use of memory enhancement techniques (e.g., report everything, mental reinstatement of context, change temporal order, and change perspective; Fisher and Geiselman, 1992).
While research has demonstrated the potential utility of these evidence-based interviewing models, to have a true impact, the research needs to be put into practice within real-world investigative settings. In recognition of this fact, substantial resources have been dedicated to designing and implementing training courses that focus on enhancing the interviewing skills of police officers (R Milne and Bull, 1999; Smets, 2009). Recently, L Griffiths and Milne (2018) developed the Framework for Investigative Transformation (FIT) to address the question of how to effectively transfer research-based knowledge on criminal investigations to practitioners and make it work in the field. There are eight factors of FIT that can be implemented to enhance the capacity of law enforcement organizations in any investigative task: (a) leadership that will encourage the institutional change towards evidence-based approaches, (b) a legislative framework that will allow new interview approaches, (c) a mind-set or cognitive style that is open to new techniques among investigators, (d) a knowledge base to apply the appropriate methods, (e) an organizational training and knowledge regime, (f) quality assurance mechanisms, (g) a corresponding skill set among the investigators, and (h) the required technology.
A number of studies have been conducted to assess the effectiveness of training courses that were developed to transfer the theory and research findings into practice. For instance, the effectiveness of a 3-week suspect interview training based on the PEACE Model was examined by L Griffiths and Milne (2006). They analyzed the audiotapes of 60 interviews conducted by 15 experienced interviewers before and after the training, and found that training improved some simple skills of the interviewers such as delivering legal rights to suspects. However, their performance in some complex skills such as appropriate questioning, sequence of questioning, and topic structure did not improve. Although differing widely in content and structure, other studies including laboratory research studies (Köhnken et al., 1999; Memon et al., 2010) and real-world assessments (MacDonald et al., 2017) of the effectiveness of interview training courses have found notably mixed results (for a discussion on the observations and challenges in interview training, see St-Yves et al., 2014). Factors other than training, such as supervision (Clarke et al., 2011) and personality (Akca and Eastwood, 2019), have also been identified in the literature as predictors of the variance in the interview skills of police officers. To date, however, there has been no attempt to systematically analyze the extant research in the area to summarize outcomes and identify best practices for future training endeavors.
The goal of this article was to fill this gap in the interviewing literature by identifying and reviewing studies that measured the outcome of interview training initiatives. Specifically, we: (a) review the findings from the extant literature in the area including the type of design used, the measurements used to assess efficacy and the impact of the training courses on interview performance; (b) summarize trends arising from across the studies; and (c) make recommendations to increase efficacy of future training activities. The findings will guide future studies on this specific topic and inform the training strategies of law enforcement and other investigatory organizations.
Method
We conducted a narrative review to summarize the effectiveness of police interview and interrogation training programs. Narrative reviews synthesize the results of individual quantitative studies with no reference to the statistical significance of the findings. We preferred the narrative review method due to the variety in the methodologies and outcome measures of the studies in our review (Siddaway et al., 2019).
We reviewed the type, length, and content of the training courses, the performance criteria used to assess the training effectiveness, and the outcomes of the training courses.
The main research questions that guided our review were: Are current investigative interview training courses effective in improving the performance of interviewers? What outcome measures are most impacted by training courses? What characteristics of training courses are associated with improved outcome measures?
Search strategy
To find the relevant studies for this systematic review, we used the STARLITE search strategy developed by Booth (2006). The acronym STARLITE refers to the steps of the systematic review: S, sampling strategy; T, types of studies; A, approaches; R, range of years; L, limits; I, inclusion/exclusion criteria; T, terms used; and E, electronic sources. The sampling strategy used in this review was a purposive sampling from three databases, a type of nonprobability sample that aims to produce a sample of studies that can be logically assumed to be representative of the studies available in the literature (Lavrakas, 2008). The types of studies reviewed included those that analyze the outcomes of training courses on the investigative interview performance of professionals including police officers, military officers, human resources, and judiciary staff. We searched for the studies on three electronic databases: Scopus, Web of Science Core Collection, and ProQuest Dissertations & Theses Global. We also posted a call on social media to reach unpublished studies that met the inclusion criteria. The social media post reached a substantial number of forensic psychology researchers and the official accounts of forensic psychology labs. In addition, we sent emails to leading scholars in the area of investigative interviewing to ask whether there were any unpublished studies that they had conducted or knew about and would be willing to share with us.
Two different searches were made on these databases using the following terms: “Investigative interview” AND “Training”, and “Interrogation” AND “Training”. There were no restrictions in terms of the year of the study; however, we limited our search to studies in English. We had four inclusion criteria: (a) published and unpublished studies in English that quantitatively or qualitatively evaluate the impact of training on the performance in investigative interviewing/interrogations; (b) studies evaluating the impact of interview training for suspects, victims, witnesses, children and human intelligence gathering; (c) laboratory studies and studies evaluating real-life interviews; and (d) studies that evaluated the impact of training based on self-report of the interviewee. There were no exclusion criteria in our study.
Data coding and analysis
In accordance with the narrative analysis methodology, we reviewed the narrative text and tables in the published and unpublished papers to identify the findings of the studies on training efficacy. We created a codebook to classify the findings in the studies and coded the following variables: publication status; location of training; type of training (suspect, victim, witness, etc.); study design (pre–post, laboratory, real-life, etc.); length of training; number of trainees attending; content of training; performance criteria to measure the effectiveness of training; outcome of training; and limitations of study.
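The coded variables listed above could be represented as one record per study. The following is a minimal sketch only: the class name, the field names, and the example values are hypothetical and are not taken from the authors' codebook.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class StudyRecord:
    """One coded study. Field names are illustrative, not the authors' labels."""
    publication_status: str          # e.g. "published" or "unpublished"
    training_location: str           # country where the training was delivered
    training_type: str               # suspect, victim, witness, etc.
    study_design: str                # pre-post, laboratory, real-life, etc.
    training_length: str             # as reported in the study, e.g. "1 week"
    n_trainees: Optional[int]        # None when the study does not report it
    training_content: str            # protocol or techniques taught
    performance_criteria: List[str]  # measures used to assess effectiveness
    outcome: str                     # reported outcome of the training
    limitations: str                 # limitations noted for the study


# Hypothetical entry; every value below is invented for illustration.
example = StudyRecord(
    publication_status="published",
    training_location="Example country",
    training_type="witness",
    study_design="pre-post",
    training_length="1 week",
    n_trainees=20,
    training_content="Cognitive Interview",
    performance_criteria=["questioning style", "amount of information"],
    outcome="Partially successful",
    limitations="no control group",
)

# List the coded variables in codebook order.
print([f.name for f in fields(StudyRecord)])
```

Keeping the number of trainees optional reflects the possibility that a study does not report every variable; that choice is an assumption of the sketch rather than a feature described in the review.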
Results
As a result of our search and screening phase, we collected 30 studies that met our inclusion criteria (Table 1). One was an unpublished study sent to us by the authors in response to our call via social media. Twenty-nine studies that we accessed through the literature search were published between 1989 and 2020. The majority of the studies were published between 2011 and 2020 (n = 14). The remaining articles were published between 2001 and 2010 (n = 7) and between 1989 and 2000 (n = 9). Most studies were published in psychology journals (n = 20), whereas others were published in policing (n = 6) and developmental (child-related) research (n = 4) journals. The training courses evaluated in these studies were provided in various countries: UK (n = 9), USA (n = 9), Canada (n = 7), Israel (n = 2), Norway (n = 1), Scotland (n = 1) and Finland (n = 1).
Research designs in the reviewed studies also varied. Seventeen of the studies were field assessments in which real-life interviews were examined. Thirteen studies were based on laboratory research where the impact of training on mock interview performance of participants was evaluated. A control group was used to assess the effectiveness of the training courses in only nine (29%) studies. A pre–post testing design was used in 18 (60%) of the studies. One study assessed the effectiveness of interview training courses based on self-reports of the trainees. The length of training courses ranged widely, from 2.5 hours to 9 months. The longer training courses included follow-up sessions (n = 4). The majority of the courses were either 1 week (n = 13) or 1 day long (n = 8). The number of trainees who attended the courses ranged between 6 and 514 (M = 62.83, SD = 103.4). The most frequently taught interview protocols in the studies reviewed were the PEACE framework (n = 9), NICHD (n = 8) and CI (n = 6).
Table 1. Descriptive statistics for the reviewed studies.
More than half of the training courses evaluated in the studies were on adult interviews (n = 18), and the remaining were on child interviews (n = 15; see Table 2). The types of adult training courses were as follows: witness interviews (n = 6), suspect interviews (n = 3), all types (suspect, victim, and witness) of interviews (n = 8), and military information-gathering interviews (n = 1). Child interview training courses were on child victim interviews (n = 9), child witness interviews (n = 5), and juvenile suspect interviews (n = 1).
Table 2. Type of interview training in the reviewed studies.
Measures of success and outcome of training courses
Numerous measures were used to assess the effectiveness of the training courses in the reviewed studies (Table 3).
Table 3. Frequency of the measures used in the studies to assess training effectiveness.
Based on the various success measures used in the studies reviewed, 16 of the training courses were found to be successful at improving interview outcomes, 13 were partially successful, and 3 were found to be unsuccessful.
Table 4 shows the distribution of studies that found a positive effect or no effect for the most frequently used success measures. In general, the majority of the studies found a positive effect of the training on the amount of information elicited by the trainees, adherence to the interview protocol taught and the questioning style of the trainees. The least effect was found in the questioning style. Nine of 25 studies (36%) that measured the questioning style of trainees found no effect. Eleven studies found a positive effect of training on the amount of information, and 13 studies found a positive effect on adherence to training. Of five studies measuring the impact of training on rapport-building, three found no effect and two found a positive effect. In terms of the skills for eliciting a free narrative from interviewees, two studies found a positive effect of training and two found no effect.
Table 4. Types of success measures used in the reviewed studies to assess training efficacy and outcomes.
Course characteristics and outcome of training
We also examined whether and how the success of training varied according to the characteristics of the training such as the length of training, type of training, whether there was a follow-up training, and the interview protocol taught. We assigned numeric values to the outcomes of the training evaluated in the studies as follows: Unsuccessful = 0, Partially successful = 1, Successful = 2. The correlation between the length of training and the outcome of the training was not significant (p = .615). Table 5 shows the outcomes of the training courses that included follow-up sessions and those that did not. Training courses that included follow-up sessions were either successful or partially successful. Three training programs that did not include follow-up sessions were unsuccessful.
Table 5. Outcome of training courses with or without follow-up sessions.
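The numeric coding of outcomes and its correlation with training length can be illustrated with a short sketch. The review does not state which correlation coefficient was used, so the sketch assumes a Spearman rank correlation, which suits an ordinal outcome. The course records below are invented for illustration and are not data from the reviewed studies; the function names are likewise hypothetical.

```python
from math import sqrt

# Ordinal coding of training outcomes, as described in the text.
OUTCOME_CODES = {"Unsuccessful": 0, "Partially successful": 1, "Successful": 2}


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical records (training length in hours, outcome label).
# These values are NOT taken from the review.
courses = [
    (2.5, "Partially successful"),
    (8, "Successful"),
    (40, "Unsuccessful"),
    (40, "Successful"),
    (120, "Partially successful"),
]
lengths = [hours for hours, _ in courses]
outcomes = [OUTCOME_CODES[label] for _, label in courses]
rho = spearman(lengths, outcomes)
print(f"Spearman rho = {rho:.3f}")
```

With only three outcome levels, ties are unavoidable, which is why the sketch assigns average ranks rather than relying on the shortcut formula for untied data. A significance test for the coefficient is left out here.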
Table 6 shows the outcome of training courses by the types of training provided (i.e., child or adult interviews; victim, witness or suspect interviews). The three unsuccessful outcomes were found in the studies that assessed the following types of training, respectively: adult witness, child victim, and child witness.
Table 6. Outcome of training courses by type of training.
Table 7 shows the success of training courses based on the type of interview protocol taught. The NICHD protocol accounted for the highest number of successful training courses.
Table 7. Outcome of training courses by interview protocol taught.
Discussion
Field studies and laboratory research have been used over the last few decades to assess the effectiveness of interview techniques and protocols and to develop novel approaches. As summarized in this systematic review, a substantial number of studies have also explored the extent to which those techniques can be effectively trained and successfully applied by interviewers in real-life situations or laboratory paradigms. The substantial variation between studies, both in the structure and content of the included training programs and in the measures used to assess effectiveness, makes it difficult to draw specific conclusions. However, some general trends emerged that may be useful for guiding future training interventions and research; these are summarized below.
In general, our review indicated that training courses on evidence-based interview techniques such as the PEACE framework, the CI, and the NICHD protocol can enhance the effectiveness of interviewers. The majority of studies did find at least some positive impact of the training on subsequent interviewing performance. Although this may be partially a result of poor baseline interviewing abilities, it is promising to note that any type of evidence-based training intervention is likely to be effective in improving interviewing skills. However, most studies only included short-term follow-up and therefore the longer-term retention of the learned skills is unknown. In addition, training courses were less effective in developing certain skills required for successful interviewing, such as questioning style, eliciting a free narrative, and rapport-building.
The skill for which the highest number of studies found no effect of training was the use of proper questioning procedures in interviews. Effective questioning has been found to be one of the most challenging and complex skills in investigative interviewing, such that even highly trained interviewers have difficulties in asking proper questions throughout the course of an interview (A Griffiths et al., 2011). Wright and Powell (2006) asked interviewers about the difficulties they have in questioning. The emergent explanations were the detailed nature of the information obtained from the interviewees in a criminal investigation, the unfamiliarity of interviewers with the open question style, and the difficulties in making a distinction between open and closed questions. Similarly, A Griffiths et al. (2011) asked fully trained and experienced detectives to evaluate their own questioning style in suspect and witness interviews after they conducted simulated interviews. Participants consistently commented on the complexity of the questioning task because it requires interviewers “to frame the next question in their minds while simultaneously attempting to compare the current answers being given by the interviewees with other case information that the interviewers already knew” (A Griffiths et al., 2011: 261). These characteristics of questioning might make it a cognitively demanding task, and research has recently shown that high cognitive load in interviewing is a predictor of low performance (Hanway et al., 2021). This may partially explain why a relatively high number of studies did not find a positive effect of training on this task across different types of interviews.
Another potential explanation for the relatively few studies that found a positive training effect on some skills might be the impact of other factors such as the personality characteristics and experience levels of interviewers. In a laboratory study, Akca and Eastwood (2019) found that high scores on the Agreeableness dimension of the Big Five personality scale are correlated with positive witness perceptions of interviewers. This means that rapport-building skills might be related to traits under the Agreeableness dimension such as being courteous, flexible, trusting, and tolerant (Ono et al., 2011). Therefore, training alone might not develop the rapport-building skills of interviewers to the expected levels without the personality effect. Akca and Eastwood (2019) also found that extraverts score significantly lower on appropriate questioning although they are rated significantly higher in overall interviewing performance. The talkative nature of extraverts might lead them to talk more than the interviewees and ask more inappropriate questions (e.g., leading, complex and multiple questions). This might be another explanation of why fewer studies found a positive effect of training on questioning skills. Similarly, extraverts might be less patient when eliciting free narratives due to their assertive nature, so training might not be helpful for them in this skill.
One other important finding of this review was the impact of follow-up sessions on the effectiveness of training. All the training courses that included a follow-up session were either successful or partially successful. On the other hand, all courses that were found to be unsuccessful lacked a follow-up session. Although these findings were not conclusive, they indicate a need for multiple sessions to reinforce concepts and ultimately enhance the effectiveness of training on investigative interviewing. Research has found a significant impact of follow-up training activities such as action plans, performance assessment, peer meetings, supervisory consultations, and technical support on the improvement of job skills (Martin, 2010). Training programs should include follow-up sessions in the form of “feedback, continuing education, refresher courses, and supervision” to maximize the transfer of knowledge and skills in interviewing and “ensure that learning is consolidated and retained” (St-Yves et al., 2014: 31).
Conclusions
This study summarized the findings of real-life and laboratory research on training effectiveness in investigative interviewing. While keeping in mind the variability in study design and training structure, we can make the following tentative conclusions regarding interview training efficacy. Although studies reached mixed results on training effectiveness across various skills, it appears that basic interviewing skills can be developed up to a certain level through even short evidence-based training courses. Given the importance of effective interviewing within investigations, and recognizing the time and resource constraints present in many real-world contexts, we would encourage the continued use of evidence-based training protocols regardless of their exact structure and content. Conducting an effective interview requires a complex set of skills, which may be difficult to deliver and develop within the context of a single training course/session. In order to see a sustained improvement in the more cognitively demanding aspects of effective interviewing, such as question selection and meaningful rapport-building, multiple training sessions may be required. In addition, some individuals may be inherently better suited to interviewing tasks due to differences in relevant personality characteristics. There is a need for more systematic research on training effectiveness with more uniform and longer-term measures of effectiveness. Very few studies in our sample used a pre–post design with a control group or had multiple follow-up sessions and measures of effectiveness. Although we recognize the difficulty of conducting such research, it is necessary in order to allow for more conclusive statements to be made about how to best improve interviewing performance.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
