
Abstract

Trials of international crimes frequently rely on a complex type of witnesses: insiders or accomplices. While harnessing essential knowledge, insiders pose serious challenges to the decision-makers assessing their credibility. Prior research suggests that judges dismiss a sizeable proportion of insider testimony during trials of international crimes. While some reasons might lie with the witnesses, a closer look at the professional practices is warranted. This study aimed to examine the process of insider witness statement assessments by international criminal justice professionals and to analyze how they resolve the tension between the concerns about witness truthfulness and the quality of the testimony. One hundred sixty practitioners took part in an experimental vignette survey. Results of qualitative analyses demonstrate that the assessments of the witness and the statement contents are interrelated: across all experimental conditions, respondents drew inferences about the quality of the testimony based on their assessment of the witness and vice versa. Furthermore, the same indicators were given various, at times contradictory, meanings, highlighting individual differences in professional practice and the noise in decision-making.

Keywords

Witness evidence, experiment, international criminal justice, ICC, credibility

Introduction

Despite the increasing use of digital, forensic, and documentary sources of proof (Dubberley et al., 2019; Freeman, 2018), witness testimony is still frequently used in (international) criminal proceedings. Witness evidence, however, continues to challenge fact-finding. On the one hand, questions are raised regarding witnesses’ capabilities to provide accurate accounts of the events they had observed or experienced (Paulo et al., 2019; Wade et al., 2018). On the other, there is an increasing examination of the fact-finders’ abilities to reliably determine the witness's honesty and accuracy (McDermott, 2017; Sagana, 2018; Simon, 2012; Wistrich & Rachlinski, 2017). This determination is further complicated where fact-finders rely on accomplices or otherwise involved witnesses (Cryer, 2014; Nicolson & Auchie, 2017), commonly referred to as “insiders” in international crime cases (Kelsall, 2009; Whiting, 2009).

Assessing the accuracy, completeness, and objectivity of evidence provided by insider witnesses is a formidable task. Insiders are highly valuable. They harness privileged, unique information which might be crucial to establishing individual criminal responsibility of higher-ranking accused, commonplace at the International Criminal Courts and Tribunals (ICCTs) (Del Ponte, 2006; Fry, 2014; Wald, 2002). In this sense, insiders appear as quasi-experts on the organization in question, often providing evidence on the military and political structures, de facto functioning of the groups, and actions of the accused (Chlevickaitė et al., 2020; Wald, 2002). On the other hand, insiders present specific concerns regarding their objectivity and trustworthiness. The motivation of insider witnesses to testify, especially in the absence of plea agreements (Cook, 2005; Harmon, 2009), is a crucial issue in determining whether a witness can be expected to provide a complete and truthful account (Chlevickaitė et al., 2020; Stepakoff et al., 2014). Unlike victim-witnesses, insiders might, for instance, have been involved in the commission of crimes, have personal relationships with other members of the criminal organization, including the accused, or have related security concerns that might influence the extent to which they are willing or able to tell the whole truth (Cryer, 2014; Stover, 2005). Hence, the decision-makers are walking a tightrope between their concerns about the honesty of the witness and the significance of the evidence they might provide.

It is difficult, if not impossible, to estimate whether practitioners’ decisions about witness evidence are accurate, as no ground truth is readily available. Nevertheless, prior research has shown that judges find issues with up to 50% of insider witness evidence at ICCTs, to the extent that its probative value is profoundly compromised (Chlevickaitė et al., 2021). The implications of such a magnitude of loss of evidence at trial are considerable. Not only does it impact case outcomes, but it also means that many insider witnesses testify unnecessarily, without contributing to fact-finding, which is both expensive to the ICCTs and potentially life-altering to the witnesses.¹ Multiple explanations for this state of affairs are possible. External explanations include, among other things, investigative challenges in uncooperative environments (Çakmak, 2017), the multilingual and cross-cultural nature of the assessments (Kelsall, 2009; Nistor et al., 2020), time lapses (Bradfield, 2019), and trauma (Smeulers & Grunfeld, 2011). Internal explanations include, for example, the mismatch between investigative, prosecutorial, and judicial witness assessment practices, and inaccurate, inconsistent, or biased decision-making. While external factors are mainly out of the hands of ICCT practitioners, internal processes and standards of witness evidence assessment can be studied and evaluated.

Assessment of Source: (Insider) Witness Credibility

Credibility assessments in criminal justice contexts consist of evaluating a witness's objectivity (or honesty) and competence (Delisle, 1978; Schum & Morris, 2007; Sluiter et al., 2013). Similar to domestic settings (Brodsky et al., 2010; Cohen, 2013; Kane, 2007), at the ICCTs, objectivity is assessed in reference to the witness's character and background information on prior relationships, membership in particular groups, identity (ethnic, national, other), the harm suffered, and incentives to testify (Chlevickaitė et al., 2020). Furthermore, though largely discouraged based on current scientific knowledge (Snook et al., 2017; Vrij et al., 2019), objectivity assessments continue to take into account the witness's behavior on the stand as an indicator of truthfulness as well (Chlevickaitė et al., 2021). While, arguably, nonverbal communication might serve other informative purposes during fact-finding (Denault et al., 2019), the risks of inappropriate assessments in cross-cultural settings are substantial (Johannesson, 2012; Vrij et al., 2011).

Assessments of insider witness credibility also differ from those of other witnesses: victims, experts, or overview witnesses (ICC, n.d.a). This is evidenced by the ICCT judges developing insider-specific credibility criteria: conditions of the plea agreement, criminal record, circumstances of confessions, detention or case status, involvement in the crimes, and others (Chlevickaitė et al., 2020). Compared to general witness assessments, these indicators focus on the witness's possible motivations to avoid telling the truth, either overall or regarding specific areas of the testimony (e.g., personal involvement in the crimes or the involvement of comrades).

Another dimension of witness assessment is competence, characterized by the witness's ability to perceive, understand, and report the events witnessed. It includes evaluation of the witness's memory, medical issues, observation conditions, the ability to understand the language spoken, and related concerns (Agirre Aranburu, 2020; Brodsky et al., 2010). Researchers focusing on ICCT evidence have uncovered indications of serious witness competence issues arising from their conflict-related experiences, time lapse before the testimony, and similar reasons (Combs, 2009; Kelsall, 2009; Perrin, 2016; Swigart, 2017). However, judicial assessments of insider witnesses seldom mention competence concerns (Chlevickaitė et al., 2021).

The difficulty inherent to assessing witness credibility is the essentially subjective nature of the process (Brodsky et al., 2010) and the difficulty in distinguishing between matters of truthfulness and competence issues. While truthfulness is assessed to identify a witness's inclination to tell deliberate lies, competence might provide honest explanations for the shortcomings in witness evidence that might otherwise be attributed to lying.

Assessment of Information: Testimonial Quality

Assessment of whether the information provided by the witness is accurate and complete, also known as evidence reliability, can also be divided into two aspects: external and internal validation (Chlevickaitė et al., 2021). External validation relates to consistency with prior statements by the same witness and corroboration or contradiction with other evidence in the case. Since external validation requires additional evidence, it is not always feasible. Accordingly, practitioners must assess witness statements in and of themselves (internal validation), which is most relevant for the current study.

ICCT researchers have uncovered a widespread lack of detail, implausibilities, inconsistencies, and related shortcomings of international witness testimonies (Cohen, 2013; Combs, 2017; Kelsall, 2009). While, at least initially, judges seemed to be willing to explain some of these deficiencies away by reference to competence issues (Combs, 2009), recent studies uncovered a stricter approach towards witnesses, where issues in testimonial quality are closely linked to the negative outcomes of judicial witness assessments (Chlevickaitė et al., 2021; Combs, 2017). Such assessments tend to focus on the extent of detail, basis of knowledge, and consistency, both internally and as compared to other evidence in the case (Combs, 2010; Kelsall, 2009). Furthermore, judgments commonly refer to the “plausibility” of testimony, indicating that a witness's account is, on the face of it, reasonable and compelling (see, e.g., Prosecutor v Ndindabahizi, 2004: §23; Prosecutor v Bemba, 2016: §230). To note, the assessment of what is “plausible” is especially complicated in cross-cultural settings (Granhag et al., 2017; Maegherman et al., 2018).

Indicators of high-quality testimony thus depend on both the contents of the testimony and the context of each case. Where additional evidence is available, decision-makers might focus on its corroboration and contradiction by external sources (Combs, 2017). Where such evidence is unobtainable, factfinders have only internal quality factors to rely upon. Furthermore, while reliability factors appear to be more objective, and thus fewer differences between insider and non-insider witnesses may be expected, legal decision-makers might have different, or higher, expectations of the type and quality of information to be provided by insider witnesses due to their rank, involvement, and other aspects of their profile (Agirre Aranburu, 2009).

Source and Contents: Inevitable Dependencies?

The assessments of credibility and those of reliability may not be easily separated. Research has demonstrated that source credibility impacts the persuasiveness of the message (Brodsky et al., 2010; Mondak, 1990): the more credible the source is considered to be, the more persuasive the message appears (Pornpitakpan, 2004; Smith et al., 2013). Furthermore, legal reasoning is first and foremost reasoning by inference (Roberts & Redmayne, 2007), whereby the credibility of the evidence source “forms the foundation for cascaded reasoning” (Schum & Martin, 1982, p. 114), also known as “inference networks” (Schum, 2009, p. 198). Here, the assessment of each fact is dependent on the assessment of the witness's credibility, and the inference is directed from the witness to the information, as depicted in Figure 1 (De Smet, 2020, p. 627).

Figure 1.

Components of trustworthiness.

While the assessment begins with observing the information provided, reliance on this information depends on whether the witness is considered to be trustworthy. Thus, source assessment appears to mediate or otherwise influence the assessment of information. Importantly, such an understanding of reasoning from evidence introduces the possibility that factors other than the relevance/quality of the information may determine the decision-maker's confidence in the truthfulness of the account (Schum & Martin, 1982).

The opposite inference is also possible, whereby the source is assessed based on the information provided. According to Sobel's theory of credibility, “someone becomes credible by consistently providing accurate and valuable information or performing useful services” (Sobel, 1985, p. 557). This link is explicit in some of the intelligence analysis models (UNODC, 2011; US Army, 2012), and has been found in communications research: message quality can directly affect and partially mediate the effects of initial credibility assessments on subsequent source credibility assessments (Pornpitakpan, 2004). Prior research on insiders’ assessments at ICCTs also found instances of witnesses who had serious trustworthiness concerns that were alleviated by providing highly relevant, (self-)incriminatory information (Chlevickaitė & Holá, 2016; Kelsall, 2007).

The relationship between the assessment of the source (the witness) and the information in (international) criminal justice settings is unclear. On the one hand, the jurisprudence of ICCTs supports a dual approach to witness evidence assessments, whereby a credible witness may provide inaccurate information and vice versa (Kunarac et al., 2000: §8; Ntaganda, 2019: §53). Hence, in theory, the credibility of the witness ought not to determine the overall assessment of the information and the other way around. However, whether practitioners follow this approach is not known. Judges at ICCTs are allowed “free evaluation of evidence” (Caianiello, 2011; McIntyre, 2014), which unbinds them from any regulations on the factors or aspects of witnesses or their evidence to take into account. Others, such as prosecutors, defence, victims’ lawyers, investigators, and analysts, may resort to internal guidelines, if available, or rely on their individual, professional experience and expertise. The only publicly known existence of analysis guidelines in practice at ICCTs is the International Criminal Court (ICC) Office of the Prosecutor model for analytical source assessment (Agirre Aranburu, 2020).² In line with the jurisprudence, the guidelines divide the assessment criteria between source- and information-related, to be assessed independently from one another.

While the division of source and information assessments might be desirable to structure the decision-making process,³ both the possibility and the practicality of independent source and information assessments in criminal justice contexts are questionable. Irwin and Mandel found that assessors instructed to score information and source separately tended to pair source and information accuracy and to base decisions about accuracy more on information than on the source (2019). Conversely, where no other data on information reliability was provided, evaluators tended to base their content rating on source credibility, assuming that credible sources tend to produce reliable information (Irwin & Mandel, 2019; Volbert & Steller, 2014). Moreover, Samet showed that analysts estimate accuracy less reliably when basing their decisions on separate reliability and credibility metrics than when accuracy estimate is based on a single measure combining the two metrics (Samet, 1975). Finally, the assessors may well be unable to disregard the qualities of the source while assessing information and vice versa due to common heuristics and bounded rationality of human cognition (Cook et al., 2003; Nisbett & Wilson, 1977; Simon, 1990). Such difficulties are supported by studies on instructing juries to disregard specific evidence after it had been introduced (Lieberman, 2000; Steblay et al., 2006).

This study is a first attempt to assess the extent to which such a separation, or a lack of it, can be observed in international criminal justice practitioners’ assessments of insider witness statements by employing an experimental vignette study.

Methodology

Participants

One hundred sixty current and former practitioners of international criminal law completed the vignette survey. Respondents, all individuals with professional experience with witness evidence in at least one ICCT,⁴ were recruited via purposive and snowball sampling. While the contact with individuals was personal (thus, the details of the respondents are known to the author), the survey responses were collected and shall be reported anonymously. Table 1 presents an overview of the demographics of the respondents. The sample is nearly balanced among genders (42.5% female and 52.5% male). The majority of the respondents have either legal (Prosecution, Defence, Chambers, and Legal Representatives of Victims) or investigative (Investigator and Analyst) experience. The duration and range of professional experience are reflected in questions on institutional backgrounds and years of practice.

Table 1.

Overview of Respondents’ Characteristics.

	N	%
Professional background
Lawyer: prosecution	49	30.63
Lawyer: defence	36	22.50
Investigator	21	13.13
Lawyer: chambers	20	12.50
Analyst	18	11.25
Legal representative of victims	7	4.38
Mixed (legal)	5	3.13
Judge	2	1.25
Psychosocial expert	2	1.25
Gender
Gender: female	68	42.50
Gender: male	84	52.50
Years of experience
0–5	19	11.88
6–15	70	43.75
16–25	47	29.38
> 25	24	15.00
Educational background
Master's degree	71	44.40
Law degree	60	37.50
Doctorate	17	10.60
Bachelor's degree	7	4.40
Police academy or equivalent	4	2.50
Number of ICCTs worked at
One	47	29.38
Two	49	30.63
Three	37	23.13
Four	20	12.50
Five or more	7	4.38

Materials

An experimental vignette study with a 2×2 factorial design was used, in which two independent variables, source quality (credibility) and information quality (reliability), were manipulated in text-based vignettes depicting excerpts of fictitious insider witness statements in a hypothetical situation. Each vignette included basic witness information, an explanation of the witness's involvement in the armed forces/group, a description of the context, and a potentially criminal incident. Each respondent was exposed to two vignettes (thus response N = 320); therefore, two comparable witness statements were created: one depicting a military insider witness, the other a rebel group insider witness (see Appendix). Table 2 presents a visual overview of the factors.

Table 2.

Vignette Factors and Levels.

		Source quality (S0/S1)		Information quality (I0/I1)
Order (AB, BA)	Vignette A	Low	High	Low	High
Order (AB, BA)	Vignette B	Low	High	Low	High

Vignettes were chosen as the most appropriate method since witness assessment, and specifically the assessment of insider witness evidence, is a sensitive matter for practitioners (Aguinis & Bradley, 2014; Hughes & Huby, 2004). Furthermore, vignettes allow for manipulation of the evidence characteristics; thus, causal inferences may be drawn (Aguinis & Bradley, 2014). To ensure that the vignettes were true-to-life (Hughes & Huby, 2004), they were developed based on authentic witness statements retrieved from the evidence databases of the ICC and the International Criminal Tribunal for the Former Yugoslavia (ICTY) (ICC, n.d.b; ICTY, n.d.). To further test the internal validity, realism, and clarity (Taylor, 2006), the vignettes were piloted twice with four expert practitioners from ICCTs. The experts who took part in the pilot sessions were not invited to participate in the study. The vignettes were revised based on their feedback and piloted a third time with a group of 10 researchers at the Netherlands Institute for the Study of Crime and Law Enforcement (NSCR).

Factors: Source and Information Quality. To manipulate the two factors, quality of the source and quality of information, the most prevalent criteria appropriate for a text-based statement assessment were selected based on the literature and prior analyses of ICCT case law (Chlevickaitė et al., 2021; Combs, 2010). Regarding source (S0/S1, source quality low/high), the focus is on potential bias: motives, personal relationships, and (risk of) self-incrimination. For information (I0/I1, information quality low/high), the amount and extent of detail and the presence of privileged information were manipulated. Table 3 presents the source/information factor characteristics represented in the different settings. Notably, certain aspects of the statements’ quality were held constant throughout all the conditions so that the assessors would not dismiss the statements outright. Hence, the following characteristics were kept stable: coherence, the extent of detail not directly related to the conduct of the superiors, description of contextual events, direct observation, and insider status/role in the group.

Table 3.

Construction of Vignette Factors.

Source quality: High	Source quality: Low
No personal relationships	Family relationship with the commander
Motive to join group: professional, group-based	Motive to join group: personal
Acknowledging involvement	Distancing from involvement
Acknowledging crimes	Distancing from crimes
Information quality: High	Information quality: Low
Month and year, day of the week provided	No precise dates
Numbers of victims, members of armed group	No numbers
Three names of others in the group	One name of others in the group
Direct information on chain of command	No direct information on chain of command

Procedure

The study was designed with the online survey software LimeSurvey. Eight conditions, with two vignettes each, were created, and participants were randomly assigned to one of them while maintaining a balance in the sample (20 respondents per condition).
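The balanced random assignment described above can be illustrated with a minimal sketch. It is not the procedure implemented in LimeSurvey; it only assumes that the eight conditions arise from crossing source quality, information quality, and vignette order (as suggested by Table 2), and the function name `assign_balanced`, the condition labels, and the numeric participant IDs are hypothetical.

```python
import itertools
import random


def assign_balanced(participant_ids, conditions, seed=None):
    """Randomly assign participants to conditions with equal group sizes."""
    if len(participant_ids) % len(conditions) != 0:
        raise ValueError("participants must divide evenly among conditions")
    per_condition = len(participant_ids) // len(conditions)
    # One slot per participant: each condition appears equally often.
    slots = [c for c in conditions for _ in range(per_condition)]
    rng = random.Random(seed)
    rng.shuffle(slots)
    return dict(zip(participant_ids, slots))


# Assumed composition of the eight conditions (not confirmed by the text):
# source quality (S0/S1) x information quality (I0/I1) x order (AB/BA).
conditions = [
    f"S{s}I{i}-{order}"
    for s, i, order in itertools.product((0, 1), (0, 1), ("AB", "BA"))
]

# Hypothetical participant IDs 1..160; the seed only makes the sketch repeatable.
assignment = assign_balanced(list(range(1, 161)), conditions, seed=42)
```

Shuffling a list that already contains each condition exactly 20 times guarantees the balance by construction, whereas drawing a condition independently for each participant would only balance the groups on average.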

The respondents were first asked to complete an informed consent form,⁵ after which they answered a set of demographic questions. The instructions for the vignettes and the situation context followed. After that, respondents were presented with Vignette A alongside the questions (on the same page). To explore the process and the focus of witness assessments, practitioners were asked first to provide a numerical answer to the question: Q1: Indicate how useful you consider this witness statement to be for further fact-finding in this situation (investigation/trial) on a scale from 1 (not at all useful) to 10 (extremely useful), followed by an open-ended explanation of the score given. Following that, respondents were asked to list three problematic areas in the statement excerpt that they would like to follow up on during further investigative activities (question: Q2: Could you indicate the problematic areas in the statement excerpt that you would like to acquire additional information/clarifications on from the witness? Try to list at least three). The respondents could progress to Vignette B only after answering the questions, and they were not allowed to return to Vignette A, to avoid revisions and direct comparisons.

Analytical Approach

The answers to the open questions (Q1 and Q2) were analyzed qualitatively, using the theoretical thematic analysis approach (Braun & Clarke, 2006). All responses were read and coded by the author, using Atlas.ti software. The broadest level of themes was in line with the framework set out in the introduction: the analytical focus of the participants (manifest, semantic level) and the inferential framework, that is, the relationships between the assessment of the source and the content (latent analysis) (Kleinheksel et al., 2020). Additionally, the responses were coded for assessment factors, source, and information characteristics, mentioned by the respondents. After the initial coding and grouping of themes were completed, the themes and codes were reviewed, similar codes and sub-themes were grouped, and codes were verified for internal coherence, consistency and distinctiveness (Braun & Clarke, 2006).

Following qualitative analysis and coding, the themes and codes were quantified. In subsequent sections, the most common themes (profile of the witness, information provided) and codes are listed in Figures 2–3. Quantification was used for descriptive and comparative purposes and does not necessarily represent the most important themes.

Figure 2.

Inferences of bias and knowledge across conditions.

Figure 3.

Assessment of contents across conditions.

Results

Profile of the Witness: The Same Characteristics, Opposite Conclusions

This sub-section presents the results related to the assessment of witness profile: insider status and role/rank in the group, personal involvement in potentially criminal events, and personal relationship with a commander. The insider status and personal involvement in the events were kept stable across the conditions; personal relationship was included only in low source quality conditions. Figure 2 presents the distribution of the factors mentioned by the respondents across the conditions. The y-axis denotes the frequency at which a certain factor was mentioned across the 320 responses (in %). The responses were coded for the factors indicated (0 = not mentioned, 1 = mentioned). Source quality is denoted by S0/S1 (Low/High), information quality: I0/I1 (Low/High). The analysis begins with the factors present across all experimental conditions.
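As a rough illustration of how binary codes (0 = not mentioned, 1 = mentioned) can be turned into the percentages plotted per condition, consider the sketch below. The records, factor names, and the function `mention_rates` are hypothetical, and the choice to divide by the number of responses within each condition is an assumption; the study's own quantification was carried out on codes produced in Atlas.ti.

```python
from collections import defaultdict


def mention_rates(responses, factor):
    """Return, per condition, the percentage of responses mentioning `factor`."""
    mentioned = defaultdict(int)
    totals = defaultdict(int)
    for response in responses:
        condition = response["condition"]
        totals[condition] += 1
        # Codes are binary: 1 if the factor was mentioned, 0 (or absent) if not.
        mentioned[condition] += response["codes"].get(factor, 0)
    return {c: 100.0 * mentioned[c] / totals[c] for c in totals}


# Hypothetical coded responses, for illustration only (not study data).
responses = [
    {"condition": "S0I0", "codes": {"insider_knowledge": 1, "bias": 1}},
    {"condition": "S0I0", "codes": {"insider_knowledge": 0, "bias": 1}},
    {"condition": "S1I1", "codes": {"insider_knowledge": 1, "bias": 0}},
    {"condition": "S1I1", "codes": {"insider_knowledge": 1, "bias": 0}},
]

rates = mention_rates(responses, "insider_knowledge")
```

With these four made-up records, the factor is mentioned in half of the S0I0 responses and in all of the S1I1 responses.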

Assessment of Role/Rank: Knowledge or Bias? All statements assessed by the participants featured an insider witness, that is, a witness involved in the alleged perpetrator group, whose rank was explicitly mentioned. In total, 208 responses (65%) mention the insider profile as a factor in decision-making. As expected, responses demonstrate the dilemma of assessing the relevance of the information vis-à-vis the potential bias of the witness: respondents ascribed two different, competing meanings to the insider profile. Two-thirds (67.3%) of insider profile mentions were positive: identifying the witness as an insider, an individual in a position of authority and in command of others, was found to suggest additional, unreported knowledge of the structures, hierarchy, and individuals involved in the crime, familiarity with military matters, and other privileged information, for instance:

he is a high enough officer or in a position to have reliable information. (D24: S1I1 ID32⁶)

in particular because of his position within the DAF, where he could have been privy to information which may not be obtainable through other sources. (D113: S0I0 ID102)

The above suggest that the person would know the structure and membership of the armed group in question, have information about its modus operandi, insider knowledge about various military operations etc. (D6: S1I1 ID22)

While witness rank for many was indicative of additional, high-value information that the witness could provide, a third of insider profile mentions expressed concern regarding potential bias. For instance, the witness's membership in an armed group was seen as a direct indication of a lack of neutrality, and described in unequivocal terms:

clearly not a neutral witness (D67: S0I0 ID67)

his account of the events is clearly one-sided. (D113: S0I0 ID102)

This confidence in profile-based inferences was a common feature in the responses concerning witness rank/role, though some respondents provided for the possibility that the witness was not objective, but genuine. This observation links back to the jurisprudence on witnesses being credible (perceived as willing to tell the truth), but unreliable (not having or being able to convey accurate or complete information):

witness is a senior member of the ZTI, and therefore may have provided biased information (whether intentional or due to 'group think'). (…) Therefore I do not consider him an objective or reliable witness, and thus would treat all of the information as potentially tainted (but not necessarily completely inaccurate). (D63: S0I0 ID65, emphasis added)

The recognition that the witness might be mistaken, or unconsciously biased, was present but infrequent in the responses. Furthermore, even where the witness was considered not objective, some respondents observed explanatory factors or other reasons that would outweigh the possible bias. One of these was the transparency with which group membership was reported:

assumptions are made in DAF favor (e.g., no real knowledge of prisoner fate, but willing to say he ‘doesn't think anyone was killed’) - but this is mitigated because his biases are transparent and increase the credibility of inculpatory information. (D47: S0I0 ID50, emphasis added)

Finally, the challenging task of assessing insider witnesses is epitomized by contradictory inferences by the same respondents. On multiple occasions, respondents mentioned the dual nature of identifying the witness as an insider: both as a positive, and as a negative factor. From the quantitative overview in Figure 2, the positive inference is visibly more prevalent across all conditions: inferences of high knowledge (N = 209) were more common than inferences of bias (N = 113). However, without further analyses, it is difficult to tell whether it also had a stronger influence on the assessment outcomes. Based on the responses, respondents differ in both the conclusions drawn from the information presented and the relative confidence in their conclusions. Similar patterns were found regarding other factors.

Personal Involvement: Risk of Self-Incrimination, Inferring Honesty, and Direct Knowledge. Personal involvement in the events, as reported by the witness, might be a good indicator that the witness's knowledge is direct. Hence, where the witness acknowledged personal involvement, some assessors saw it as an additional positive sign regarding the immediacy and authenticity of the witness's knowledge. In 56 responses (17.5% of all responses) this was interpreted as indicating further knowledge and additional information that could be acquired from the witness.

Apparently, he was present for several hours and should be able to provide more details about the people involved, the precise conduct carried out and the identity of the victims. (D21: S0I1 ID31)

The witness was commanding a unit involved in fighting in NESRIDE and can provide a direct account of the events. (D123: S0I1 ID108)

The other side of reporting involvement in potentially criminal events was the inference of self-incrimination fear, which would reduce the likelihood that the witness would report the facts objectively and exhaustively. Though observed less commonly (6% of all responses), it provides another example of the duality of certain factors:

since he also participated in the commission of the crimes, his statement should be taken with the utmost caution as he could be undermining the reality to undermine his own responsibility. (D146: S1I1 ID122)

Hence, like the assessment of witness's rank and membership in the organization, personal involvement in crimes was assessed inconsistently. To some respondents, it indicated extensive knowledge, thus potentially highly relevant information. To others, the quality of the information notwithstanding, personal involvement meant the witness was not likely to deliver a truthful account due to self-incrimination fears and must be approached with caution. Finally, some respondents expressed the possibility that both inferences could be correct, directly demonstrating the complexity of the decision-making in this area.

Personal Relationships. Parallel to the opposing meanings ascribed to witness rank and involvement in potentially criminal actions, the assessment of a personal relationship with a senior commander in the group⁷ was twofold.

Out of 39 mentions of a personal relationship (12% of all responses), two-thirds (26/39) of the inferences were negative. For these respondents, a family/personal relationship with a commander of a group potentially under investigation clearly indicated bias and created an expectation that the witness will or may minimize the commander's responsibility for the events in question. Again, for some respondents, the relationship was clear and linear. Yet, the majority of the respondents tempered the conclusion of this type of bias, indicating a lower degree of confidence compared to the bias inferred from the witness's profile as an insider in the group and potential culpability for the crimes.

He is very close to LEFBEN, and would certainly be inclined to defend him. (D161: S0I0 ID134)

LEFBEN is the witness’ uncle, and the very reason he joined the military. He may be loyal to a person who is both his boss and a family member, and not inclined to incriminate the commander. (D221: S0I1 ID180)

One-third of all mentions inferred a different meaning from the personal relationship. First, it indicated that the witness was forthcoming, as the assessor would not expect the witness to volunteer information about family relations:

voluntary disclosure of family relations to Brig. LEFBEN (aka LETO) positively influences the score. (D99: S0I0 ID94)

Secondly, the relationship might lead the witness to be in a better position to acquire valuable information, and thus positively influence the assessment of the witness's knowledge:

I do think that this could be a relevant witness as his relationship to MOSO and role as an area chief places him in a very good position to know what happened. (ID125: S0I1 ID109)

Again, the same factor, a personal relationship with the accused, was seen, depending on the respondent, as an indicator of the witness's honesty, knowledge of relevant facts, or potential bias, paralleling the findings for other witness credibility-related indicators.

Information Provided: Who Sets the Standard?

This sub-section presents the results related to the assessment of information quality, specifically where the quality of the statement contents is linked to conclusions about the source (witness): admissions of being personally involved in the commission of a crime, lack and type of detail provided, hesitations (expressing uncertainty in the statement), and implausible or unclear assertions. Figure 3 below presents the distribution of factors mentioned by the respondents across experimental conditions (S0/S1—Source quality Low/High, I0/I1—Information quality Low/High).

Admission of Involvement in the Commission of Crime. As described above, indications that the witness was personally involved in the events were perceived as indicative of high-quality knowledge or as a possible cause for lack of objectivity. However, a third inference was found where the witness went beyond indicating personal involvement and into admitting that crimes were committed. Where the witness recounted events in first-person language and/or provided detail of ordering, planning, or otherwise being directly involved in criminal events, some assessors (in 61/320, or 19% of all responses) inferred that the witness was forthcoming, “willing to talk about the events” (D68: S1I1 ID67), and that his recollection was “sincere” (D152: S1I1 ID127). This inference of willingness to admit involvement in crime, though such admissions were manifestly included only in S1 (high source quality) scenarios, was observed across all four conditions:

He also admits that he could recognize who was DAF or not so he still chose to detain and question them. He incriminates himself there. (D24: S1I1 ID32, emphasis added)

appears truthful, to the point that he has provided information that may incriminate if not himself, at least other members of his group. (D33: S0I1 ID40, emphasis added)

Accidently admits to prolonged detention of persons who were not involved with DAF, as well as continued detention when ‘they annoyed’ detention officers. (D109: S0I0 ID100, emphasis added)

Noticeably, the respondents’ suspicion of the witness's dishonesty increased as the quality of information decreased, from readily taking the self-incriminatory information as a voluntary admission or a sign of truthfulness in the high source and information quality condition, to an “accidental” admission in the low source and information quality condition. Again, this indicates a link between the contents of the statement and the perceived credibility of the witness.

Assessing Hesitation. An interesting phenomenon occurred when respondents were faced with knowledge limitations and hesitations expressed by the witness. Certain qualities of speech, such as passive voice, correcting oneself, and qualifying the confidence in the information shared, resulted in twofold impressions. Hesitations were mentioned in 35 responses (11% of total responses). For some respondents (N = 9), hesitations indicated nuance or admissions that the person is unsure:

witness openly admits own uncertainty (D216: S1I0 ID176)

Recognises limitations in language terminology (think/maybe) (D142: S1I1 ID120)

nuance provided by terms such as (‘I think’) (D99: S0I0 ID94)

When interpreted in this light, uncertainty and limitations were seen to help the assessor know which parts of the reported facts the witness was confident of and which ones he was more hesitant about. Conversely, some respondents inferred hesitations to mean speculation and being passive about the events witnessed:

Sometimes he speculates – ‘I think…’ and ‘I don't think’ which I do not like in a statement which should be about things he knows (D107: S0I1 ID99)

he uses phrases such as ‘I think’ and ‘maybe’ which leads to uncertain in regard to what he is saying (D118: S1I0 ID105).

Why does he use the word 'maybe' twice in the last two sentences? To be explored (D221: S0I1 ID181)

Once again, this demonstrates the divergent inferences, where the same words are ascribed quite different meanings, with contradictory consequences. It also reflects the subjectivity in assessing the language and contents of the testimony. Furthermore, since these divergent conclusions were found across experimental conditions, the direction of this inference appears to depend on the individual decision maker.

Detail and Plausibility: Evasive or Genuine? Mentions of the level of detail, especially the lack of it, were relatively frequent across the responses (147/320, or 46% of total responses). Importantly, indications that the statement lacks detail were found across conditions, though the actual detail provided differed only across high and low information quality conditions (I0, I1). This indicates that the respondents might be considering additional factors when assessing the extent of detail provided, or might be assessing the same extent of detail differently. Similarly, information was determined to be implausible across conditions, again pointing toward the subjective interpretation of what is “implausible.” Beyond observing the presence of these issues, the respondents also considered whether they were intentional or genuine:

He either lacks the precision or the will to provide more information on the circumstances of the attack and post-attack arrest. (D199: S0I0 ID167, emphasis added)

information provided by the witness is unclear – possibly voluntarily – and should be clarified. (D196: S1I0 ID165, emphasis added)

Deciding whether omissions are genuine or intentional based on only the information provided is a difficult task. Hence, respondents sometimes relied on the witness's profile: role/rank, relationship, and involvement to make a decision. Some respondents (in 29/320, or 9% of all responses), faced with a shortcoming in the information provided by the witness, clearly searched for the explanation elsewhere, and combined the information quality with the source characteristics to decide between genuine or deliberate causes:

Most likely, the witness's family ties with Brig. LEFBEN and self-apprehensiveness of being involved in the commission of heinous crimes in April 2014 make him overlook crucial details in his statement. (D75: S0I1 ID71, emphasis added)

Witness is the direct participant in the events, so he is clearly trying to protect himself/his superiors and subordinates, by omitting details and information that he knows or must know based on the circumstances described in the excerpt. (D134: S1I0 ID114, emphasis added)

While it could be true that the witness does not possess the requisite knowledge and details, the respondents found the source profile sufficiently informative to make conclusions about the information provided. However, the difficulties did not end here. Even after deciding that the observed shortcomings in the information provided were likely due to deliberate actions by the witness, the assessors had to decide whether all information was to be dismissed, or whether some degree of truth remained:

It cannot be discounted that some of the account is true even if he is minimising his role or withholding some incriminating information. (D241: S0I1 ID198)

No further explanation of how this assessment was made, that is, how to determine which parts of the statement are true despite the witness's deliberate omissions, was found in the responses. While the experimental conditions did not allow for additional information to be consulted, this is one of the instances where, without supplementary sources, an accurate decision regarding the reasons for the lack of detail is doubtful.

Meeting the Assessor's Expectations

The sections above demonstrate that assessments of witness statements go beyond analyzing the information on the page. The profile of the witness, characterized by their rank/role in the group, personal relationships, and involvement in the events, appears to inform a “standard” for the quality of information expected of the witness. Hence, it can be cautiously concluded that there is a relationship between the assessment of the witness and the assessment of information, in line with the model depicted in Figure 1. Some respondents, perhaps based on their prior experience, demonstrated clear expectations of the type and amount of detail to be delivered by a witness with an insider profile:

Statement does not seem consistent with witness's military training (D43: S0I0 ID47, emphasis added)

Witness should have provided more details about the events/avoids doing so, considering he is a local (born in the area and resides there), and member of ZTI for over 13 years with family ties to ZTI commander. (D133: S0I1 ID114, emphasis added)

Should he not be fully aware, in his position, who was in charge? (D221: S0I1 ID180)

At times, this standard appeared to be based on the profile alone, while other respondents combined the rank/role with reported involvement. The extent of detail expected from the witnesses also differed across the responses. Some respondents expected quite extensive, verifiable evidence:

He was overseeing the arrest and transmitted the order to his men, mentioned some beatings, his brigade also helped with the interrogation, he should be able to provide a detailed account and provide with registration documentation on all the person arrested and then detained, and provided specific numbers and ID of detainee who died or required medical assistance following interrogation. He should also be able to describe interrogation technique used during the interviews. (D160: S1I0 ID131)

Other respondents did not define this information in such strict terms, but rather expressed dissatisfaction with the level of detail provided and indicated the need for more. For instance, some respondents felt that a witness related to a commander should have provided more information about their relationship, the commander himself, and his actions, failing which, the witness appeared in a more negative light:

He is a commanding officer. He should have access to much more detailed information to this about what occurred. (D242: S1I0 ID198, emphasis added)

His uncle is a commander in the area and he does not refer at all to his involvement in the relevant events. (D147: S0I1 ID124, emphasis added)

This combination, or direct mention, of the witness's profile and the assessor's expectations is a rather problematic aspect of witness assessments. The responses above indicate a lack of a standard for what is considered to be a “detailed” statement overall, and such assessment appears to be related to the profile of the witness providing the information. The flexibility is understandable, and likely necessary, as each witness will differ in abilities, extent of memory, and capability to testify in detail. However, basing expectations on a subjective evaluation of a witness is of questionable accuracy and likely inconsistent across different assessors. Finally, as touched upon above, the divergences in what the respondents considered to be a vague or low-detail statement show that different assessors might evaluate similarly detailed statements differently, based on their subjective expectations and tendencies (Bond & Depaulo, 2008).

Insider Witness Evidence Assessment Model

The insider witness evidence assessment process, including the inferential relations between the different assessment factors found in the vignette responses, is expressed visually in Figure 4. This model builds upon and incorporates prior quantitative analyses of judicial insider witness assessments (Chlevickaitė et al., 2021) and the formal inferential models described in the introduction (De Smet, 2020; Schum, 2009). Importantly, this model elucidates the process assessors seem to follow when determining whether the limitations observed in witness statements are genuine or not by introducing the “expectations of quality” determinant. While prior models have identified the distinct factors constituting witness objectivity (truthfulness) or competence, they have not explained the process of evidence assessment, thus missing an area key to understanding decision-making.

Figure 4. Model of insider witness evidence assessments.

Based on the study findings, insider witness evidence assessment starts either directly with the assessment of statement quality or with witness objectivity and competence. In the former process, any issues identified in the statement are followed by the determination of the reasons for these deficiencies: the qualities of the witness. In the latter process, witness objectivity and competence are assessed first. Then, based on assessing witness objectivity and competence, the assessors determine, whether explicitly or not, their expectations of statement quality and compare these expectations with the information provided by the witness. The expectations of quality also feed back into the determination of whether there are issues with quality or not, as demonstrated by the lowest arrow in the model. For instance, where initially lack of detail is observed (issue with quality), but the assessment of the witness indicates to the assessor that more detail cannot be expected of the particular witness (e.g., due to indirect observation, memory issues), the lack of detail is not taken to be an issue that needs to be addressed. Hence, expectations of quality form an integral part of witness evidence assessment. However, these expectations also introduce a layer of subjectivity, as everyone might have personal expectations based on their interpretation of witness objectivity and competence, as was demonstrated throughout this paper.

The model also demonstrates the links between the assessment of the witness and the assessment of the information provided. In the instances where no issues with the evidence are found (where evidence is of high quality), this is shown to positively impact the assessment of witness objectivity and competence and/or directly lead to the acceptance of witness evidence without explicit assessment of the witness qualities. Conversely, where issues are found with witness objectivity and/or competence, this links back, via the expectations of quality, to the determination of whether the evidence provided has deficiencies.

This model explains the process of witness assessments based on the data available to date. Understanding the process allows for its modification in practice, but it does not determine the approach that would lead to the most accurate, consistent, or reliable decision-making. Further studies exploring the causal relationships between the distinct factors and their clusters, as well as the comparative weight assigned to them, would be useful in determining which steps in the process are the most predictive of the final acceptance of witness evidence.

Discussion

The assessments of insider witness statements illuminate the challenging task practitioners face in reconciling their concerns of witness motivation, or potentially tainted credibility, with the quality of information contained in the testimony and vice versa. This study found clear indications of a bi-directional relationship between the quality of the witness (credibility) and the quality of the information provided (reliability) in the respondents’ decision-making. Inferences were also drawn from certain (observed) qualities of the witness to other (unobserved) qualities of the witness.

Regarding decision-makers’ focus, frequent mentions of potential motives or self-incrimination bias were found, which is unsurprising considering the profile of the witnesses and prior research findings (Chlevickaitė et al., 2021; Combs, 2017). Not only did respondents focus on the witness's insider profile and personal involvement, but these factors were also assigned multiple, contrasting meanings. Insider profile, personal involvement, and personal relationships were chiefly interpreted to imply bias, but also high-level knowledge or demonstrated honesty, uncovering the complexity obscured in jurisprudence analyses.

Similar observations were made with regard to the assessment of statement contents. Lack of detail and hesitations were assessed not only as indicators of lower statement quality but were also linked to witness credibility. Most prominently, respondents inferred either evasiveness or genuinely deficient knowledge from a lack of detail and hesitations. Here, the inference appeared to depend on the prior assessment of the witness's objectivity: where the witness presented objectivity concerns to the respondent, shortcomings in the statement could be attributed directly to bias and evasiveness. Relatedly, what constituted shortcomings was also informed by the respondents’ assessments of witness profile. The indications of the witness's rank in the group and role in the events were linked to a certain (individual) standard or expectation for the quality of the information to follow. Where witness statements failed to meet that standard, the shortfall tended to be attributed to evasiveness.

These findings tend to support the assertion that the assessments of the source (witness) and information (testimony) are not independent and have a complex relationship. While some legal decision-making models include witness trustworthiness as a mediating factor between a fact asserted and a fact confirmed (“if witness A trustworthy, information trustworthy”) (De Smet, 2020; Schum, 2009), the inferences appear to go the other way around as well: if information is trustworthy, witness A is trustworthy. This confirms prior research in communications studies, demonstrating that the quality of the source influences the persuasiveness of the message and vice versa (Pornpitakpan, 2004; Smith et al., 2013). It further confirms prior studies on judicial assessments of insider witnesses, where insiders of questionable credibility were relied upon due to the high quality of the information provided (Chlevickaitė & Holá, 2016; Kelsall, 2007). Importantly, these findings also support Sobel's theory of credibility, whereby “someone becomes credible by consistently providing valuable information” (Sobel, 1985, p. 557). Furthermore, the different approaches demonstrate that the formal analytical guidelines or formal, unidirectional, inferential networks, with the source being either a separate or a mediating factor in the assessment of information, might appear rather different in applied settings.

The reliance on inferences, as well as a certain extent of subjectivity, is an inevitable and largely unproblematic feature of legal reasoning. However, it becomes problematic where the assessment of the same information is widely inconsistent, and inferences are drawn in multiple directions. Whichever approach is taken by the assessors (source and content evaluated separately or not), like evidence should be treated alike, and different evidence should be treated differently. Likewise, certain factors, for instance, witness background, should have similar implications across the assessors. In other words, the expectation is of little subjectivity and of a discernible pattern of decision-making, which was not found regarding several salient assessment factors: for example, profile, involvement, personal relationships, hesitations, and lack of detail. This diversity of assessments indicates a certain degree of subjectivity and noise, which in the model is accounted for by “expectations of quality” criteria, hence formalizing the subjectivity inherent to the process. This finding is in line with prior research demonstrating that where information is complex, and decision-making is unstructured, it is vulnerable to bias and noise (Greene & Ellis, 2007; Kahneman et al., 2019; Sagana, 2018). “Noisy” decisions are observed in the diversity, or spread, of decision outcomes when different individuals are faced with a similar problem (Kahneman et al., 2016). Such decisions are prone to heuristic, subjective thinking, especially where multiple pieces of information have to be assessed concurrently (Dunstall & Reeson, 2009; Kahneman, 2003; Kahneman et al., 2016, 2019). Considering the complexity of the task and the diversity of ways the practitioners approached it, it appears that assessments of witnesses and their statements might be a perfect setting for sub-optimal decision-making (Brehmer, 1992; Chermack, 2004; Edland & Svenson, 1993).

This research has two main practical implications. First, the results show that assessments of the source and the information are, to an extent, not independent. Thus, the standard operating procedures or analytical guidelines used by organizations should either explicitly instruct the assessors on the extent to which such inferences are acceptable and in what circumstances they can be relied upon, or implement procedures where source and information can be assessed independently, if that is desirable.⁸ Ignoring the intrinsic links between the assessments of the witnesses and their statements is not the solution, as the assessors are still likely to base their decisions on both, in an inter-related manner, without making this explicit and thus subject to review.

A second practical implication relates to the diversity of decisions and inferences observed. Based on the analyses presented above, it appears that witness statement assessments might be worryingly “noisy.” Such a range of inferences is not desirable in situations where consistent and discernible decision-making is expected and might also be a precursor for individual subjectivity and bias. It could be useful to conduct an audit of diversity, or spread, of decision outcomes when different individuals are faced with a similar problem, to assess the consistency in their judgments (Kahneman et al., 2016). Unlike the assessment of bias, the assessment of noise does not require establishing a ground truth, so such an audit should be feasible in many criminal justice settings. The outcomes of the audit could inform the development of improved source evaluation guidelines and training techniques.
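The kind of audit described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that several assessors each rate the same statement on a 1 to 5 scale; the ratings, the scale, and the `noise_audit` function are hypothetical and are not taken from this study or from Kahneman et al. (2016). The sketch only shows that spread across assessors can be summarized without knowing which rating is correct.

```python
from statistics import mean, pstdev

# Hypothetical data: for each vignette, the scores that different
# assessors gave to the same statement on an assumed 1-5 scale.
ratings = {
    "vignette_A": [4, 2, 5, 3, 1],
    "vignette_B": [3, 3, 4, 3, 3],
}


def noise_audit(ratings_by_case):
    """Summarize the spread of scores per case; no ground truth is needed."""
    summary = {}
    for case, scores in ratings_by_case.items():
        summary[case] = {
            "mean": mean(scores),
            # Population standard deviation as one simple measure of spread.
            "sd": pstdev(scores),
            "range": max(scores) - min(scores),
        }
    return summary


audit = noise_audit(ratings)
for case, stats in audit.items():
    print(f"{case}: mean={stats['mean']:.2f} "
          f"sd={stats['sd']:.2f} range={stats['range']}")
```

In this illustrative data, the first vignette shows a much wider spread than the second, which is the pattern such an audit would flag for closer review.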

Finally, this research was not without limitations. First, the statement excerpts presented to the respondents were necessarily shorter than most of the real-world international criminal justice witness statements. This might have made the manipulated factors more explicit and easier to spot, though all effort was put into maintaining realism. It is also likely that the respondents did not pay as much attention to, or consider, all the factors to the extent they would in the real-world task of witness evidence assessment. The shorter attention span was mitigated by assigning respondents just two vignettes and including a progress bar to reduce anxiety about the length of the process. Even with the mitigation of these risks, some authors suggest that vignettes are artificial and may lack external and ecological validity (Sauer et al., 2014; Taylor, 2006). However, vignettes also allowed for manipulation and control of the factors presented to the respondents, which should outweigh the possible concerns with realism. To evaluate the extent to which these experimental conditions were realistic, it would be ideal to conduct further research into real-world witness assessments across parties and organs of the ICCTs. Such research would also help uncover whether there are systematic differences between diverse types of practitioners or the parties they represent. Finally, it is also possible that the diversity of responses was due to the respondents’ cultural and linguistic profiles. While this may be the case, this limitation is also present at the ICCTs, which is the context of interest. Additional research with homogeneous samples of ICCT practitioners could be conducted to assess whether cultural diversity indeed affects respondents’ perceptions.

Footnotes

Acknowledgements

I would like to thank my PhD supervisors Dr Barbora Hola and Prof Catrien Bijleveld for their extensive support in conducting this research. I am also grateful to all the respondents who took part in the study, the experts, and the NSCR colleagues who participated in the pilot of the vignettes, and everyone who helped reach the participants. Finally, my gratitude goes to the three anonymous reviewers for their considerate, insightful, and by all means, helpful, suggestions and comments.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article. This work was supported by the Nederlandse Organisatie voor Wetenschappelijk Onderzoek, (grant number 406.17.519).

Notes

Appendix Sample Vignettes

Author Biography

Gabrielė Chlevickaitė is Assistant Professor in Empirical and Normative Studies at the VU Amsterdam, where she also conducts research into fact-finding in international criminal investigations. Gabrielė is a board member of the Center for International Criminal Justice, an interdisciplinary research center at the VU Amsterdam, and a fellow at the Netherlands Institute for the Study of Crime and Law Enforcement (NSCR) in Amsterdam, where she conducted her NWO-funded PhD research in 2017–2021. In 2013–2017, Gabrielė was an analysis assistant at the International Criminal Court.

References

Agirre Aranburu (2009). Prosecuting the most responsible for international crimes: Dilemmas of definition and prosecutorial discretion. In Gonzalez (Ed.), Protección internacional de derechos humanos y estado de derecho (pp. 381–404). Grupo Editorial Ibáñez.

Agirre Aranburu (2020). The contribution of analysis to the quality control in criminal investigation. In Agirre, Bergsmo, De Smet, & Stahn (Eds.), Quality control in criminal investigation (pp. 117–272). Torkel Opsahl Academic EPublisher.

Aguinis, & Bradley, K. J. (2014). Best practice recommendations for designing and implementing experimental vignette methodology studies. Organizational Research Methods, 17(4), 351–371. https://doi.org/10.1177/1094428114547952

Bond, C. F., & Depaulo, B. M. (2008). Individual differences in judging deception: Accuracy and bias. Psychological Bulletin, 134(4), 477–492. https://doi.org/10.1037/0033-2909.134.4.477.supp

Bradfield (2019). Preserving vulnerable evidence at the international criminal court – the Article 56 milestone in Ongwen. International Criminal Law Review, 19(3), 1–39. https://doi.org/10.1163/15718123-01903001

Braun & Clarke (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3(2), 77–101. https://doi.org/10.1191/1478088706qp063oa

Brehmer (1992). Dynamic decision making: Human control of complex systems. Acta Psychologica, 81(3), 211–241. https://doi.org/10.1016/0001-6918(92)90019-A

Brodsky, S. L., Griffin, M. P., & Cramer, R. J. (2010). The witness credibility scale: An outcome measure for expert witness research. Behavioral Sciences and the Law, 28(2), 211–223. https://doi.org/10.1002/bsl.917

Buisman (2013). Evidential Collusion. A compendium on the legacy of the ICTR and the development of international law. http://unictr.irmct.org/sites/unictr.org/files/publications/compendium-documents/i-evidential-colusion-buisman.pdf

Caianiello (2011). Law of evidence at the international criminal court: Blending accusatorial and inquisitorial models. North Carolina Journal of International Law & Commercial Regulation, 36(2), 287–318. https://ssrn.com/abstract=1843304

Çakmak (2017). A brief history of International Criminal Law and International Criminal Court. Palgrave Macmillan. https://doi.org/10.1057/978-1-137-56736-9

Chermack, T. J. (2004). Improving decision-making with scenario planning. Futures, 36(3), 295–309. https://doi.org/10.1016/S0016-3287(03)00156-3

Chlevickaitė & Holá (2016). Empirical study of insider witnesses’ assessments at the international criminal court. International Criminal Law Review, 16(4), 673–702. https://doi.org/10.1163/15718123-01604002

Chlevickaitė, Holá, & Bijleveld (2020). Judicial witness assessments at the ICTY, ICTR and ICC. Journal of International Criminal Justice, 18(1), 185–210. https://doi.org/10.1093/jicj/mqaa002

Chlevickaitė, Holá, & Bijleveld (2021). Suspicious minds? Empirical analysis of insider witness assessments at the ICTY, ICTR and ICC. European Journal of Criminology, 1–23. https://doi.org/10.1177/1477370821997343

Cohen, C. R. (2013). Demeanor, deception and credibility in witnesses. ABA Section of Litigation Annual Conference. https://www.americanbar.org/content/dam/aba/administrative/litigation/materials/sac2013/sac_2013/33_demeanor_deception.authcheckdam.pdf

Combs, N. A. (2009). Testimonial deficiencies and evidentiary uncertainties in international criminal trials. UCLA Journal of International Law and Foreign Affairs, 14(09–167), 235. https://scholarship.law.wm.edu/facpubs/336/

Combs, N. A. (2010). Fact-finding without facts: The uncertain evidentiary foundations of international criminal convictions. Cambridge University Press.

Combs, N. A. (2017). Grave crimes and weak evidence: A fact-finding evolution in international criminal law. Harvard International Law Journal, 58(1), 47–125. https://scholarship.law.wm.edu/facpubs/1867

Cook, G. I., Marsh, R. L., & Hicks, J. L. (2003). Halo and devil effects demonstrate valenced-based influences on source-monitoring decisions. Consciousness and Cognition, 12(2), 257–278. https://doi.org/10.1016/S1053-8100(02)00073-9

Cook, J. A. (2005). Plea bargaining at the Hague. The Yale Journal of International Law, 30(2), 473–506. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=578581

Cryer (2014). Witness tampering and international criminal tribunals. Leiden Journal of International Law, 27(1), 191–203. https://doi.org/10.1017/S0922156513000691

23.

De Smet

(2020). Controlling the quality of reasoning about the link between evidence and factual findings. In Agirre

Bergsmo

De Smet

Stahn

(Eds.), Quality control in criminal investigation (pp. 611–638). Torkel Opsahl Academic EPublisher.

24.

Del Ponte

(2006). Investigation and prosecution of large-scale crimes at the international level: The experience of the ICTY. Journal of International Criminal Justice, 4(3), 539–558. https://doi.org/https://doi.org/10.1093/jicj/mql032

25.

Delisle

(1978). Witnesses: Competence and credibility. Osgoode Hall Law Journal, 16(2), 337–360. http://digitalcommons.osgoode.yorku.ca/ohlj/vol16/iss2/4

26. Denault, Dunbar, N. E., & Plusquellec (2019). The detection of deception during trials: Ignoring the nonverbal communication of witnesses is not the solution—A response to Vrij and Turgeon (2018). International Journal of Evidence and Proof, 24(1), 1–9. https://doi.org/10.1177/1365712719851133

27. Dror, I. E., Kassin, S. M., & Kukucka (2013). New application of psychology to law: Improving forensic evidence and expert witness contributions. Journal of Applied Research in Memory and Cognition, 2(1), 78–81. https://doi.org/10.1016/j.jarmac.2013.02.003

28. Dubberley, Koenig, & Murray (Eds.). (2019). Digital witness: Using open source information for human rights investigation, documentation, and accountability. Oxford University Press.

29. Dunstall & Reeson, A. F. (2009). Behavioural economics and complex decision-making implications for the Australian tax and transfer system. CSIRO. https://www.researchgate.net/publication/242762186_Behavioural_Economics_and_Complex_Decision-Making_Implications_for_the_Australian_Tax_and_Transfer_System

30. Edland & Svenson (1993). Judgment and decision making under time pressure. In Svenson & Maule, J. A. (Eds.), Time pressure and stress in human judgment and decision making (pp. 27–40). Springer US. https://doi.org/10.1007/978-1-4757-6846-6_2

31. Eskridge, W. N., & Ferejohn (2002). Structuring lawmaking to reduce cognitive bias: A critical view. Cornell Law Review, 87(2), 616–647.

32. Freeman (2018). Digital evidence and war crimes prosecutions: The impact of digital technologies on international criminal investigations and trials. Fordham International Law Journal, 41(2), 283–336. https://ir.lawnet.fordham.edu/ilj

33. Fry (2014). The nature of international crimes and evidentiary challenges: Preserving quality while managing quantity. In van Sliedregt & Vasiliev (Eds.), Pluralism in international criminal law (pp. 1–25). Oxford University Press.

34. Granhag, P. A., Landström, & Nordin (2017). Evaluation of oral statements. A scientifically based decision-aid for migration cases. https://moam.info/evaluation-of-oral-statements_5b78fde8097c47c8468b45a0.html

35. Greene & Ellis (2007). Decision making in criminal justice. In Carson, Milne, Pakes, Shalev, & Shawyer (Eds.), Applying psychology to criminal justice (pp. 183–200). John Wiley & Sons, Ltd. https://doi.org/10.1002/9780470713068.ch11

36. Harmon (2009). Plea bargaining: The uninvited guest at the international criminal tribunal for the former Yugoslavia. In Doria, Gasser, H.-P., & Bassiouni, M. C. (Eds.), The legal regime of the international criminal court (pp. 161–182). Martinus Nijhoff Publishers. https://doi.org/10.1163/ej.9789004163089.i-1122.45

37. Hughes & Huby (2004). The construction and interpretation of vignettes in social research. Social Work and Social Sciences Review, 11(1), 36–51. https://doi.org/10.1921/17466105.11.1.36

38. ICC (n.d.a). Witnesses. https://www.icc-cpi.int/about/witnesses

39. ICC (n.d.b). Legal Tools Database. https://www.legal-tools.org/

40. ICTY (n.d.). ICTY Court Records. http://icr.icty.org/

41. Irwin & Mandel, D. R. (2019). Improving information evaluation for intelligence production. Intelligence and National Security, 34(4), 503–525. https://doi.org/10.1080/02684527.2019.1569343

42. Johannesson (2012). Performing credibility: Assessments of asylum claims in Swedish migration courts. Retfaerd Årgang, 35(138), 3–138. https://docs.google.com/viewerng/viewer?url=http://retfaerd.org/wp-content/uploads/2014/08/Retfaerd_3_2012_5.pdf&hl=da

43. Kahneman (2003). Maps of bounded rationality: Psychology for behavioral economics. American Economic Review, 93(5), 1449–1475. https://doi.org/10.1257/000282803322655392

44. Kahneman, Lovallo, & Sibony (2019). A structured approach to strategic decisions. MIT Sloan Management Review, 60(3), 67–73. https://sloanreview.mit.edu/article/a-structured-approach-to-strategic-decisions/

45. Kahneman, Rosenfield, A. M., Gandhi, & Blaser (2016). Noise. How to overcome the high hidden cost of inconsistent decision making. Harvard Business Review, 3, 1–9.

46. Kane, J. L. (2007). Judging credibility. In ABA Practice Essentials, reprinted from Litigation Magazine, 33(3), 1–8.

47. Kelsall (2007). Truth vs. justice? Popular views on the Truth and Reconciliation Commission and the Special Court for Sierra Leone. The Online Journal of Peace and Conflict …, June. http://www.trinstitute.org/ojpcr/7_1SawKel.pdf

48. Kelsall (2009). Culture under cross examination. International justice and the Special Court for Sierra Leone. Cambridge University Press.

49. Kleinheksel, A. J., Rockich-Winston, Tawfik, & Wyatt, T. R. (2020). Demystifying content analysis. American Journal of Pharmaceutical Education, 84(1), 127–137. https://doi.org/10.5688/ajpe7113

50. Lieberman, J. D. (2000). Understanding the limits of limiting instructions: Social psychological explanations for the failures of instructions to disregard pretrial publicity and other inadmissible evidence. Psychology, Public Policy, and Law, 6(3), 677–711. https://doi.org/10.1037/1076-8971.6.3.677

51. Maegherman, Van Veldhuizen, & Horselenberg (2018). Dropping the anchor: The use of plausibility in credibility assessments. Oxford Monitor of Forced Migration, 7(2), 37–55.

52. McDermott (2017). Strengthening the evaluation of evidence in international criminal trials. International Criminal Law Review, 17(4), 682–702. https://doi.org/10.1163/15718123-01704005

53. McIntyre (2014). ICTR - Assessment of evidence. Symposium on the Legacy of the ICTR, 1–12. https://unictr.irmct.org/sites/unictr.org/files/publications/compendium-documents/ii-symposium-on-legacy-ictr-mcintyre_0.pdf

54. Mondak (1990). Perceived legitimacy of Supreme Court decisions: Three functions of source credibility. Political Behavior, 12(4), 363–384. https://doi.org/10.1007/BF00992794

55. Nicolson & Auchie, D. P. (2017). Assessing witness credibility and reliability: Engaging experts and disengaging Gage? In Duff & Ferguson, P. R. (Eds.), Scottish Criminal Evidence Law. Current Developments and Future Trends (pp. 1–25). Edinburgh University Press.

56. Nisbett, R. E., & Wilson, T. D. (1977). The halo effect: Evidence for unconscious alteration of judgments. Journal of Personality and Social Psychology, 35(4), 250–256. https://doi.org/10.1037/0022-3514.35.4.250

57. Nistor, A.-L., Merrylees, & Holá (2020). Spellbound at the ICC: The intersection of spirituality and international criminal law. In Fraser & McGonigle Leyh (Eds.), Intersections of law and culture at the international criminal court (pp. 147–168). Edward Elgar. https://doi.org/10.4337/9781839107306.00016

58. Paulo, R. M., Albuquerque, P. B., & Bull (2019). Witnesses’ verbal evaluation of certainty and uncertainty during investigative interviews: Relationship with report accuracy. Journal of Police and Criminal Psychology, 34(4), 341–350. https://doi.org/10.1007/s11896-019-09333-6

59. Perrin (2016). Memory at the international criminal tribunal for the former Yugoslavia (ICTY): Discussions on remembering and forgetting within victim testimonies. East European Politics and Societies and Cultures, 30(2), 270–287. https://doi.org/10.1177/0888325415581881

60. Pornpitakpan (2004). The persuasiveness of source credibility: A critical review of five decades’ evidence. Journal of Applied Social Psychology, 34(2), 243–281. https://doi.org/10.1111/j.1559-1816.2004.tb02547.x

61. Roberts & Redmayne (Eds.). (2007). Innovations in evidence and proof: Integrating theory, research and teaching. Hart Publishing.

62. Robertson, C. G., & Kesselheim (Eds.). (2016). Blinding as a solution to bias: Strengthening biomedical science, forensic science, and law. Academic Press. https://doi.org/10.1016/C2014-0-01237-3

63. Sagana (2018). The downward spiral of biases in criminal investigations: From eyewitnesses to forensic experts and judges. In Barton, Dubelaar, Kölbel, & Lindemann (Eds.), “Vom hochgemuten, voreiligen Griff nach der Wahrheit”: Fehlurteile im Strafprozess (pp. 133–146). Nomos Verlag.

64. Samet, M. G. (1975). Quantitative interpretation of two qualitative scales used to rate military intelligence. The Journal of Human Factors and Ergonomics Society, 17(2), 192–202. https://doi.org/10.1177/001872087501700210

65. Sauer, Auspurg, Hinz, Liebig, & Schupp (2014). Method effects in factorial surveys: An analysis of respondents’ comments, interviewers’ assessments, and response behavior. SOEP Papers on Multidisciplinary Panel Data Research, 629, 1–29. https://doi.org/10.2139/ssrn.2399404

66. Schum, D. A. (2009). A science of evidence: Contributions from law and probability. Law, Probability and Risk, 8(3), 197–231. https://doi.org/10.1093/lpr/mgp002

67. Schum, D. A., & Martin, A. W. (1982). Formal and empirical research on cascaded inference in jurisprudence. Law & Society Review, 17(1), 105–152. https://doi.org/10.2307/3053534

68. Schum, D. A., & Morris, J. R. (2007). Assessing the competence and credibility of human sources of intelligence evidence: Contributions from law and probability. Law, Probability and Risk, 6(1–4), 247–274. https://doi.org/10.1093/lpr/mgm025

69. Simon (2012). In doubt. Harvard University Press. https://doi.org/10.4159/harvard.9780674065116

70. Simon, H. A. (1990). Bounded rationality. In Eatwell, Milgate, & Newman (Eds.), Utility and probability (pp. 15–18). Palgrave Macmillan UK. https://doi.org/10.1007/978-1-349-20568-4_5

71. Sluiter (2005). The ICTR and the protection of witnesses. Journal of International Criminal Justice, 3(4), 962–976. https://doi.org/10.1093/jicj/mqi058

72. Sluiter, Friman, Linton, Vasiliev, & Zappala (Eds.). (2013). International criminal procedure. Oxford University Press.

73. Smeulers & Grunfeld (2011). International crimes and other gross human rights violations. A multi- and interdisciplinary textbook. Martinus Nijhoff Publishers. https://doi.org/10.4324/9780203083284

74. Smith, C. T., de Houwer, & Nosek, B. A. (2013). Consider the source: Persuasion of implicit evaluations is moderated by source credibility. Personality and Social Psychology Bulletin, 39(2), 193–205. https://doi.org/10.1177/0146167212472374

75. Snook, McCardle, M. I., Fahmy, & House, J. C. (2017). Assessing truthfulness on the witness stand: Eradicating deeply rooted pseudoscientific beliefs about credibility assessment by triers of fact. Canadian Criminal Law Review, 22(3), 297–306.

76. Sobel (1985). A theory of credibility. The Review of Economic Studies, 52(4), 557–573. https://doi.org/10.2307/2297732

77. Steblay, Hosch, H. M., Culhane, S. E., & Mcwethy (2006). The impact on juror verdicts of judicial instruction to disregard inadmissible evidence: A meta-analysis. Law and Human Behavior, 30(4), 469–492. https://doi.org/10.1007/s10979-006-9039-7

78. Stepakoff, Reynolds, G. S., Charters, & Henry (2014). Why testify? Witnesses’ motivations for giving evidence in a war crimes tribunal in Sierra Leone. International Journal of Transitional Justice, 8(3), 426–451. https://doi.org/10.1093/ijtj/iju019

79. Stover (2005). The witnesses: War crimes and the promise of justice in The Hague. University of Pennsylvania Press.

80. Swigart (2017). Linguistic and cultural diversity in international criminal justice: Toward bridging the divide. University of the Pacific Law Review, 48(2), 197–218. https://scholarlycommons.pacific.edu/uoplawreview/vol48/iss2/10

81. Taylor, B. J. (2006). Factorial surveys: Using vignettes to study professional judgement. British Journal of Social Work, 36(7), 1187–1207. https://doi.org/10.1093/bjsw/bch345

82. UNODC (2011). Criminal intelligence: Manual for analysts. https://www.unodc.org/documents/organized-crime/Law-Enforcement/Criminal_Intelligence_for_Analysts.pdf

83. US Army (2012). Open-source intelligence. http://www.fas.org/irp/doddir/army/atp2-22-9.pdf

84. Volbert & Steller (2014). Is this testimony truthful, fabricated, or based on false memory? Credibility assessment 25 years after Steller and Köhnken (1989). European Psychologist, 19(3), 207–220. https://doi.org/10.1027/1016-9040/a000200

85. Vrij, Granhag, P. A., & Porter (2011). Pitfalls and opportunities in nonverbal and verbal lie detection. Psychological Science in the Public Interest, 11(3), 89–121. https://doi.org/10.1177/1529100610390861

86. Vrij, Hartwig, & Granhag, P. A. (2019). Reading lies: Nonverbal communication and deception. Annual Review of Psychology, 70(1), 295–317. https://doi.org/10.1146/annurev-psych-010418-103135

87. Wade, K. A., Nash, R. A., & Lindsay, S. D. (2018). Reasons to doubt the reliability of eyewitness memory: Commentary on Wixted, Mickes, and Fisher (2018). Perspectives on Psychological Science, 13(3), 339–342. https://doi.org/10.1177/1745691618758261

88. Wald, P. M. (2002). Dealing with witnesses in war crime trials: Lessons from the Yugoslav Tribunal. Yale Human Rights and Development Journal, 5(1), 217–242. http://hdl.handle.net/20.500.13051/5827

89. Whiting (2009). In international criminal prosecutions, justice delayed can be justice delivered. Harvard International Law Journal, 50(2), 323–364.

90. Wistrich, A. J., Guthrie, & Rachlinski, J. J. (2005). Can judges ignore inadmissible information? The difficulty of deliberately disregarding. University of Pennsylvania Law Review, 153(4), 1251–1345. https://doi.org/10.2139/ssrn.2934295

91. Wistrich, A. J., & Rachlinski, J. J. (2017). Implicit bias in judicial decision making: How it affects judgment and what judges can do about it. In American Bar Association, Enhancing Justice (pp. 87–130). https://doi.org/10.2139/ssrn.2934295

92. Prosecutor v Ndindabahizi, ICTR-01-71, Trial Judgment (2004).

93. Prosecutor v Bemba, ICC-01/05-01/08, Trial Judgment (2016).

94. Prosecutor v Kunarac et al., IT-96-23 & 23/1, Trial Decision on Motion for Acquittal (2000).

95. Prosecutor v Ntaganda, ICC-01/04-02/06, Trial Judgment (2019).

Towards a Model of (Insider) Witness Assessments in International Crime Cases: An Experimental Vignette Study
