Abstract
This retrospective study reports on (a) the prevalence of malingering in a sample of 20 homicide defendants seen in jail settings for criminal responsibility evaluations, and (b) the feasibility of the Schedule for Nonadaptive and Adaptive Personality (SNAP) for malingering detection in this sample. Based on previous non-clinical simulation research, it was hypothesized that the SNAP validity scales would predict group membership for homicide defendants malingering psychopathology. Those with intellectual disabilities or psychotic disorders were excluded. Diagnostically, nearly one half of the sample had Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR) personality and substance use disorders. Point prevalence of malingering was 30%. Using the criterion of any SNAP validity scale score in the clinical range (T ≥ 65), a reasonable sensitivity was demonstrated in the detection of malingering (83%), yet this outcome was hindered by a high false positive rate (64%). This study suggests further exploration of the SNAP for assessing malingering in forensic populations is warranted.
Introduction
During criminal forensic assessment, it is essential to always consider the possibility that defendants are malingering psychopathology, that is, intentionally producing false or grossly exaggerated mental illness symptoms in pursuit of external incentives (American Psychiatric Association [APA], 2013). Criminal defendants malinger in 10% to 70% of cases (Bourget & Whitehurst, 2007; Grøndahl, Vaerøy, & Dahl, 2009; Mittenberg, Patton, Canyock, & Condit, 2002; Myers, Hall, & Tolou-Shams, 2013; Resnick, 1993; Rogers, 1986, 2008; Woodworth et al., 2009). In addition, about one half of persons in the criminal justice system have antisocial personality disorder (ASPD) and two thirds of them have any personality disorder, with the presence of a personality disorder being a risk factor for malingering (Fazel & Danesh, 2002). Being on the alert for malingering is all the more relevant when evaluating defendants accused of serious crimes, such as murder, given the harsh consequences they face if successfully prosecuted (Myers et al., 2013). The prospect of lengthy prison sentences, life imprisonment, or even execution in those states with the death penalty can be a powerful incentive for defendants to falsely produce or embellish mental illness symptoms.
Undetected malingering inflicts serious safety, justice, and financial consequences upon society. It has been estimated that the medical and legal costs of malingering exceed five billion dollars annually (Gouvier, Lees-Haley, & Hammer, 2003). Furthermore, successful malingering diverts funds from the deserving to the undeserving (Bordini, Chaknis, Ekman-Turner, & Perna, 2002). The administration of justice is hijacked when malingerers avoid criminal conviction through specious mental health defenses and divert valuable, limited treatment-based rehabilitation resources away from those who lack criminal responsibility due to genuine psychopathology (Frederick, Crosby, & Wynkoop, 2000).
Psychometric testing may be used by forensic evaluators to supplement clinical interview and collateral information in the detection of malingering (Archer, Buffington-Vollum, Vauter Stredny, & Handel, 2006; Hall & Hall, 2012b; Resnick & Knoll, 2005; Rogers, 2008). There has been a notable expansion over time of forensically relevant assessment instruments and related research (Denney & Sullivan, 2008; Otto & Heilbrun, 2002). Some commonly used instruments (presented alphabetically and not prioritized) include the Miller Forensic Assessment of Symptoms Test (M-FAST), Minnesota Multiphasic Personality Inventory–2 (MMPI-2), Personality Assessment Inventory (PAI), Rey 15-Item Test (FIT), Structured Interview of Reported Symptoms (SIRS), and Test of Memory Malingering (TOMM; Archer et al., 2006; Jackson, Rogers, & Sewell, 2005; Pelfrey, 2004; Slick, Tan, Strauss, & Hultsch, 2004). A majority of diplomates in forensic psychology surveyed by Lally (2003) rated these instruments as acceptable for the assessment of malingering (with the exception of the M-FAST, for which no opinion was provided). Of interest from a historical perspective is that some tests (i.e., the MMPI-2 and the PAI) were originally designed for clinical use and not malingering detection, yet with time their role expanded into the forensic realm (Buchanan, 1994; Graham, Watts, & Timbrook, 1991; Rogers, Sewell, Morey, & Ustad, 1996).
To date, however, no consensus has emerged among researchers as to which assessment instruments are of most value, which ones should be used for particular clinical settings, and to what extent they should even be used in forensic mental health evaluations (Archer et al., 2006; Myers et al., 2013). Moreover, the importance of new tests being developed for malingering has been stressed over time due to concerns that the instruments may become less effective the more a population becomes familiar with them. That is, the more knowledge about the content and mechanics of a test that leaks out over time, the easier it becomes for enlightened test takers to deceive psychometricians (Hall & Hall, 2012b; Inman & Berry, 2002; Youngjohn, 1995). This concern has some parallels with the Flynn effect, which refers to the increase over time in IQ scores for a particular test (Neisser, 1997). Presumably, the longer a test exists in society and the more it is relied upon, the less effective it may become due to rising scores with time. In correctional populations, for instance, inmates often have abundant free time to interact, and those with the knowledge or experience may take the opportunity to coach others about how to successfully deceive on psychometric testing. Having multiple measures and strategies for the detection of malingering helps to cast a wider net to uncover it, increases the likelihood of correctly identifying it when it occurs, and reduces the likelihood that those with the intention to deceive during assessment will be successful (Larrabee, 2008; Lynch, 2004). For example, Larrabee’s (2008) study, which manipulated multiple conditions, determined that the probability of detecting malingering was 35.7% to 97.8% with one symptom validity test, 73.5% to 99.6% with two, and 93.3% to 99.9% with three.
Hence, it is generally recognized that there is not only a need to optimize the accuracy and application of current malingering assessment measures but also to consider and develop novel and ideally more streamlined approaches. Our present study examining the Schedule for Nonadaptive and Adaptive Personality (SNAP) is consistent with these aims in that we examined in a novel fashion a clinical personality instrument that also has a broad spectrum of validity scales, potentially bridging clinical and malingering assessment goals.
The SNAP is a self-report measure that dimensionally assesses personality traits, temperament, and disorders (Clark, 1993; Tellegen, 1993). It has 12 trait scales (e.g., Aggression, Entitlement, Propriety), three temperament scales (Positive, Negative, and Disinhibition), and diagnostic scales for the Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM-IV; APA, 1994) personality disorders. The SNAP also has six validity scales to identify response biases and other types of invalid responding (Variable Response Inconsistency [VRIN], True Response Inconsistency [TRIN], Desirable Response Inconsistency [DRIN], Rare Virtues [RV], Deviance [DEV], and the Invalidity Index [II]). These scales work in a manner similar to the MMPI-2 in that some of them look for consistency of responses internally (VRIN, TRIN, DRIN), while others look for abnormalities in response patterns that signal over- or under-endorsement of traits to a degree not normally seen (RV, DEV). The II provides an overall measure of profile invalidity (Clark, 1993). Potentially most relevant to malingering are DEV scale items which reflect severe pathology rarely endorsed by normal participants, as high scores may indicate faking bad (Clark, 1996).
The SNAP (and, more recently, the SNAP-Y for adolescents) has demonstrated strong psychometric properties among adult and adolescent populations, and it has been used in a variety of settings to identify normal traits in adolescent, college-age, and community-based samples as well as to assess the nexus between aggression, criminal behavior, and personality (Fiedler, Oltmanns, & Turkheimer, 2004; Latzman, Vaidya, Clark, & Watson, 2011; Linde, Stringer, Simms, & Clark, 2013; Melley, Oltmanns, & Turkheimer, 2002; Morey et al., 2003; Myers, 2002; Myers & Monaco, 2000). For example, Melley et al. (2002) administered the SNAP in a college sample and found test–retest correlation coefficients over a 9-month period of time of .58 to .81 for the various scales. Yen et al. (2011) found that the SNAP demonstrated good predictive power for suicide attempts (hazard ratio = 1.28, p < .001) in nonpsychotic populations. For more background on SNAP validity scale development and psychometric properties, see Clark (1993), Clark and Watson (1995), and Simms and Clark (2001).
We are not aware of any study that has investigated the use of SNAP validity scales in the assessment of malingering with actual forensic populations. A study with possible relevance to this issue, however, was conducted by Simms and Clark (2001) who used naïve simulators to investigate the validity scales of the SNAP. One hundred ninety-two undergraduate students were randomly assigned as participants in positive distortion (appearing more virtuous or faking good), negative distortion (appearing more troubled or faking bad), and control groups (respond normally). The study determined that the negative distortion group was easier to identify than the positive distortion group. The scales which were found to be best for detecting the simulators were the RV and DEV validity scales, and they predicted group membership at least as well as MMPI-2 validity scales.
The purpose of the current study was to investigate the frequency of malingering in a sample of pretrial homicide defendants and to consider the feasibility of using the SNAP in the detection of malingering in this population. Given prior research that demonstrated the SNAP DEV scale may be a marker for malingering in non-forensic populations, we examined this scale separately from other validity scales to understand how it might be singularly associated with malingering in a forensic clinical population. This exploratory, retrospective study originated from the empirical observation that, during the assessment of homicide defendants, the SNAP validity scale results often complemented the findings from other evaluation methods and information sources regarding the presence or absence of malingering. Furthermore, research investigating clinical instruments for potential application in malingering assessment, as in this study, can ultimately lead to the development of more time- and cost-efficient forensic evaluation approaches (i.e., if single tests can be shown to serve both purposes; Inman & Berry, 2002).
Method
Sample
The sample consisted of 20 pretrial homicide defendants incarcerated in multiple states. All were seen over an approximately 12-year period for forensic psychiatric consultation in jail settings to evaluate mental state at the time of the offense as it related to criminal responsibility (e.g., “insanity defense”). Mean age at the time of committing homicide was 37.5 years (SD = 18.1, range = 14-67); 90% were male; 85% were White and 15% were Black. The average level of education was 11.0 years (SD = 2.6, range = 7-16).
Exclusion Criteria
Two exclusionary criteria were used to minimize the risk of cognitive limitations or major mental illness symptoms confounding the SNAP results for this feasibility study. The SNAP is designed to measure normal and abnormal personality features—not intellectual disabilities or major mental illness symptomatology—and requires a sixth-grade reading level. Thus, defendants with intellectual disabilities (defined as a pre-arrest diagnosis of mental retardation or a standard IQ test score <70), and defendants with an established diagnosis of schizophrenia, schizoaffective disorder, bipolar disorder, or other reality-impairing mental disorder, were excluded from the study.
Homicide Classification
The homicides were divided into three general types based on the most salient phenotype: argument/conflict (n = 8; 40%), domestic (n = 9; 45%), and felony (n = 3; 15%). The argument/conflict category encompassed killings related to a dispute between unrelated persons. Domestic homicides occurred between family members, spouses, or intimate partners living in the same household. Felony homicides took place during the commission of another felony. In 17 cases, there was a single victim, and in three cases, there were multiple victims (range = 3-5).
Mental Disorders in the Sample
Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR; APA, 2000) best estimate diagnoses incorporating all available clinical and collateral information (e.g., treatment records, jail records, arrest reports, school records), and present in more than 10% of the sample, are as follows: substance abuse/dependence disorders (n = 9; 45%), any personality disorder (n = 8; 40%), any depressive disorder (n = 5; 25%), ASPD (n = 4; 20%), posttraumatic stress disorder (n = 4; 20%), and borderline personality disorder (n = 3; 15%). Four out of 20 (n = 4; 20%) homicide defendants had co-occurring Axis I diagnoses (e.g., posttraumatic stress disorder and polysubstance dependence). (Incidentally, the high prevalence of substance use disorders in this sample was consistent with national survey data that showed approximately two thirds of jail inmates have such conditions; U.S. Department of Justice, 2005.)
Procedure
Each case was evaluated by a board-certified forensic psychiatrist over the course of multiple hours. A customary forensic mental health evaluation format was followed. After the defendant was informed about the nature and purpose of the evaluation and limits to confidentiality, a comprehensive clinical interview was conducted (information was obtained through a face-to-face interview regarding current adjustment, life history, detailed narrative of the crime[s], mental status, etc.). Following the interview, all subjects completed the SNAP. Based on clinical presentation, other psychometric tests were administered as indicated to further assess cognition, psychopathology, and response style (e.g., Mini-Mental State Examination, MMPI-2, Rey 15-Item Memory Test, Structured Interview of Reported Symptoms, TOMM, Trauma Symptom Inventory). In most cases, a forensic psychiatry fellow or psychiatric resident was also present. In an attempt to minimize external influences on the process, no examinations were conducted with a detention facility officer in the examination room or within hearing range.
Determination of Whether Malingering Was Present During the Evaluation
An initial determination of suspected malingering cases was made by the evaluating forensic psychiatrist, taking into account all available case data (clinical interview results, supplementary psychometric testing outcomes, and collateral data) except for the SNAP findings (dependent variable). Each of the six suspected cases of malingering was then reviewed with one co-author independent of the forensic evaluation process for external verification. Classification as a malingerer required both raters to be in agreement that malingering was present as defined by Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; APA, 2013; “the intentional production of false or grossly exaggerated physical or psychological symptoms, motivated by external incentives such as . . . avoiding criminal prosecution”). Full concordance was reached that malingering was present for all six cases. Given the “state dependent” quality of malingering and this study’s focus on the real-time utility of the SNAP, those subjects who were identified as having feigned mental illness in other evaluations or at other times per collateral data were not classified as malingerers.
Data Analysis
Data for the current analyses were compiled from forensic chart review. The Lifespan Institutional Review Board approved this chart review protocol. Frequencies and distributions of all numeric variables were examined for any issues of non-normality. Given the small sample size and resulting non-normal distributions, nonparametric statistics (i.e., the Fisher exact test and Mann–Whitney U test) were conducted to examine differences between malingerers and non-malingerers on demographics and any SNAP validity scale scores of T-score ≥65 (clinically relevant, per Clark, 1993) versus validity scale T-scores <65 (not clinically relevant). The Fisher exact test is used when a sample size is too small (e.g., n is less than 5 in a particular cell) to conduct chi-square tests for independent samples. The Mann–Whitney U test is a common nonparametric statistic used when the assumptions for an independent t test are not met (Pett, 1997); this test is based on group comparisons of summed rank scores.
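To illustrate how the Fisher exact test operates on small cell counts, the following minimal sketch computes a two-sided p value for a 2 × 2 table using only the Python standard library. The function name `fisher_exact_two_sided` and the choice of input table are illustrative assumptions. The counts are those given in the Results for the "any validity scale T-score ≥65" criterion, and the resulting p value is not an analysis reported in this article.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p value for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2 = a + b, c + d      # row totals (fixed)
    col1 = a + c                   # first column total (fixed)
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Probability that the top-left cell equals k, given the margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)   # smallest feasible top-left count
    highest = min(row1, col1)      # largest feasible top-left count
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )


# Illustrative input (hypothetical use of counts stated in the Results):
# rows = malingerers / non-malingerers, columns = elevated / not elevated.
p_value = fisher_exact_two_sided(5, 1, 9, 5)
print(f"two-sided Fisher exact p = {p_value:.3f}")
```

Enumerating every table with the observed margins is practical here only because the sample is small; for larger samples a chi-square test or a statistical library routine would normally be used instead.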
Results
Point Prevalence of Malingering
Six of the 20 defendants (30%) were determined to be actively malingering current or past psychopathology at the time of forensic evaluation. Their fabricated symptoms were claimed to exist at the time of the crime in two cases and at the time of the evaluation in four cases. The spectrum of malingering presentations was broad despite the small sample. It consisted of delirium at the time of the crime (one case), psychogenic amnesia for the crime (one case), psychotic symptoms endorsed during the evaluation (delusions or hallucinations; two cases), a combination of cognitive deficits and psychotic symptoms displayed during the evaluation (one case), and an MMPI-2 F scale score >110 from evaluation testing (one case).
SNAP Identification of the Malingering Defendants
Bivariate comparisons of malingerers (n = 6; 30%) versus non-malingerers (n = 14; 70%) indicated no group differences in age, education, biological sex, race, or presence of ASPD (all ps > .05).
The malingering versus nonmalingering groups were compared categorically using the presence or absence of one or more SNAP validity scale scores with a T-score ≥65 (VRIN, TRIN, DRIN, RV, DEV, II). We chose this approach versus identifying bivariate associations between continuous validity (e.g., mean) scale scores and malingering groups so that our results would be more clinically relevant and interpretable. Using this approach, five of the six malingerers were correctly classified, a sensitivity of 83% (see Table 1: “Any Invalidity Scale T-Score ≥65” column). In four of these cases, only one validity scale was elevated (two had an elevated RV score and two had an elevated TRIN score), and in the fifth case, two validity scales were elevated (DEV, TRIN). In contrast, nine of the 14 other defendants who were not malingering had one or more validity scale T-scores ≥65. Thus, for any defendant with a validity scale T-score ≥65, the positive predictive value was only 36% (probability of a false positive was 64%). Moreover, the negative predictive value for any defendant with no elevation in validity scales was 83% (probability of a false negative was 17%). Malingering and nonmalingering groups did not differ on the average number of elevated validity scale scores (Mann–Whitney U = 40.00, p = .86, significance ≤ .05, two-tailed).
Table 1. SNAP Identification of Malingering for the Sample.
Note. Sensitivity: proportion of defendants with malingering who tested positive. Specificity: proportion of defendants without malingering who tested negative. Positive predictive value: proportion of defendants with positive tests who were malingering. Negative predictive value: proportion of defendants with negative tests who were not malingering. SNAP = Schedule for Nonadaptive and Adaptive Personality; DEV = Deviance.
The DEV scale, which measures the extent to which a respondent answers in an atypical, deviant manner, and thus may indicate a “fake bad” response style when elevated, was of minimal usefulness in identifying malingering in this sample. It was positive (T-score ≥ 65) in just one of the six malingering cases, and positive in five of the 14 cases that were not classified as malingerers (see Table 1: “DEV Validity Scale T-Score ≥65” column).
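The accuracy indices defined in the table note follow directly from the counts given in the text above. As a minimal sketch, they can be recomputed as shown below. The class name `TwoByTwo` and its field names are illustrative assumptions and were not part of the original analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts for one screening criterion against the malingering classification."""
    tp: int  # malingerers with the criterion present
    fn: int  # malingerers with the criterion absent
    fp: int  # non-malingerers with the criterion present
    tn: int  # non-malingerers with the criterion absent

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


# Counts taken from the Results text (6 malingerers, 14 non-malingerers).
any_scale = TwoByTwo(tp=5, fn=1, fp=9, tn=5)   # any validity scale T >= 65
dev_scale = TwoByTwo(tp=1, fn=5, fp=5, tn=9)   # DEV scale T >= 65

for label, table in (("Any validity scale", any_scale), ("DEV scale", dev_scale)):
    print(
        f"{label}: sensitivity={table.sensitivity:.2f}, "
        f"specificity={table.specificity:.2f}, "
        f"PPV={table.ppv:.2f}, NPV={table.npv:.2f}"
    )
```

Run as written, the sketch reproduces the sensitivity (.83), positive predictive value (.36), and negative predictive value (.83) reported for the any-scale criterion. The same counts show why a criterion can be sensitive yet poorly predictive when most non-malingerers also screen positive.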
Discussion
This exploratory, retrospective study examined the frequency of malingering in 20 pretrial homicide defendants undergoing criminal responsibility evaluations. It also assessed the detection of malingering in this sample using the SNAP. The finding of a 30% point prevalence rate for malingering in this group was consistent with similarly high rates noted in other research studies on pretrial criminal defendants. For instance, Rogers (1986) reported that 20% of criminal defendants being evaluated for insanity were suspected of malingering. Likewise, Myers et al. (2013) reported a malingering prevalence of 17% in pretrial homicide defendants.
To the authors’ knowledge, this is the first study to evaluate the feasibility of using the SNAP to assess for malingering in forensic populations. Based on earlier SNAP research in which validity scales predicted distortion in a non-clinical simulation study (Simms & Clark, 2001), this work explored whether SNAP validity scale outcomes would predict group membership for those homicide defendants malingering psychopathology during clinical evaluations in “real world” pretrial settings (i.e., those with ecological validity). The preliminary results of this study proved to be modestly positive in that the SNAP identified five of six malingerers. The sensitivity of the SNAP was moderately good (.83), as was the negative predictive value (.83), using one or more elevated validity scales as a marker for malingering. To add context for these values, the two most often used malingering tests by forensic neuropsychologists, the Rey FIT and the TOMM, have reported sensitivities of .40 to .62 and .82, respectively, using generally agreed upon cutoff scores (Frederick, 2002; Slick et al., 2004; Tombaugh, 1996). This salutary finding was substantially tempered by a low positive predictive value (.36); nearly two out of three defendants with an elevated validity scale were not malingering. These results do not support the use of the SNAP as a solo instrument for the determination of malingering, but suggest it might have a role as part of a test battery or for screening purposes.
Somewhat unexpectedly given Simms and Clark’s (2001) study, the SNAP DEV scale was unhelpful in identifying malingering in the present study. It correctly classified only one of the six malingering cases (sensitivity = .17), and only one of the six defendants with an elevated DEV score was malingering (positive predictive value = .17). We had anticipated that the SNAP DEV scale would function analogously to the MMPI-2 F scale (“infrequency”) in its ability to detect malingering in criminal defendants (Toomey, Kucharski, & Duncan, 2009). This negative finding underscores the difficulty that can arise when attempting to translate findings from simulation studies to clinical settings (Batista & Myers, 2012; Shadish, Cook, & Campbell, 2002).
In short, this small feasibility study indicates that the SNAP may have some limited utility as a screening measure for the detection of malingering in nonpsychotic, intellectually intact homicide defendants. Five out of six (83%) of the malingerers had elevations (T ≥ 65) on one or more SNAP validity scales. Conversely, 83% of the defendants with no SNAP validity scale elevation were not malingering.
Several limitations in this work exist. First, the study was exploratory and retrospective in design. Second, the sample size was limited. Practically speaking, however, it can be challenging for researchers to access a large n of homicide defendants undergoing criminal responsibility evaluations due to the relatively infrequent use of the insanity defense and legal and ethical obstacles to research studies in this setting. Nevertheless, the small sample size compromised statistical power, thereby limiting our ability to detect certain group differences. Notably larger sample sizes would allow more in-depth statistical analysis of group differences, for example, comparison of means and standard deviations for SNAP scale T-scores. Third, our clinical sample excluded offenders with documented intellectual disabilities and major mental illness in an attempt to have idealized conditions for this feasibility study on a personality assessment instrument; therefore, these results may not apply to all homicide offenders. Researchers going forward might also consider broadening inclusion criteria to include individuals with serious mental illness (provided their conditions would not preclude evaluation) to more closely simulate the typical forensic evaluee population. Fourth, the exclusionary criteria may have influenced the results by increasing the proportion of subjects with personality disorders and thus the rate of malingering. Persons with personality disorders, especially ASPD, are believed to be more likely to feign psychiatric symptomatology in medicolegal circumstances. Fifth, the SNAP was not specifically designed for forensic populations, and this population is not included in the normative sample. Therefore, its applicability to the assessment of homicide defendants remains unclear. Sixth, we cannot be positive that the subjects identified as malingerers were in fact genuine malingerers. 
This is an inherent limitation in virtually all real-life malingering studies considering most mental disorder symptoms are subjective and can be feigned with success to varying degrees (Hankins, Barnard, & Robins, 1993; Harrison, Edwards, & Parker, 2007; Rosenhan, 1973). In short, knowing another’s mental state and motives with certainty remains elusive unless known simulators are used in controlled conditions, and then the question always arises as to whether such findings can be generalized to the real world. Despite this, we believe the conservative classification approach we used was clinically accurate, and if any errors did occur, they would have involved overlooking those subjects with less obvious expressions of malingering rather than the contrary.
It bears repeating that too much confidence can be placed in psychometric testing outcomes by forensic clinicians and courts. This is despite repeated warnings in the literature that a determination of malingering should be based on all available information and never on psychometric test results in isolation. Doing so runs the risk of erroneously diagnosing malingering and perpetuates the myth that psychometric testing is “objective.” While psychometric testing normally utilizes standardized administration and scoring procedures, its results are different from an X ray or blood test in that they are influenced by test-taker factors like motivation, attention, intent to deceive, stress, mood, energy level, cognitive ability, degree of reality testing, medication effects, physical illness, and nutritional status. Environmental factors can also influence test validity. Thus, psychometric testing merely allows for inferences, and not conclusions, to be drawn about the traits and abilities of a specific individual (Cassel, 1969; Tyson, 1979). Only after a comprehensive assessment of all relevant data has been completed and other possible explanations have been confidently ruled out (e.g., genuine pathology, unconscious motivation, medical disorders, clinician countertransference), should a diagnosis of malingering be made (Dean, Victor, Boone, Philpott, & Hess, 2008; Drob, Meehan, & Waxman, 2009; Hall & Hall, 2012a; Lee, Loring, & Martin, 1992; Lynch, 2004).
Larger forensic samples are needed to further explore the potential usefulness of the SNAP in the evaluation of malingering in criminal defendants. Research that leads to an increase in the armamentarium of malingering detection instruments available to forensic evaluators will ultimately strengthen the integrity of the forensic evaluation process. There is evidence that increased awareness about detection techniques can enhance malingering skills in evaluees (Bury & Bagby, 2002). Having more instruments in the field lessens the potential threat from coaching (by defense lawyers, legally savvy inmates, etc.), self-study of mental illness symptoms, and other negative external influences (e.g., Internet instruction on test-taking strategies) on the reliability of psychometric assessment outcomes (Hall & Hall, 2012b; Larrabee, 2008; Lynch, 2004). In summary, we believe further investigative efforts examining the feasibility of the SNAP for assessing malingering in forensic populations are warranted.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research and/or authorship of this article.
