Abstract
Objectives:
The aim of this study was to conduct a systematic review of the literature to collect, analyse and synthesise the evidence on skin picking disorder as defined by Arnold’s criteria or the
Method:
The databases CINAHL, Medline, Embase and PsycINFO were searched for articles published between January 2008 and May 2018. Eligible articles were empirical studies that used Arnold’s or DSM-5 criteria to diagnose skin picking disorder, published in English, with participants aged 18 years or older. The methodological quality of included studies was assessed according to the
Results:
A total of 20 studies were considered eligible out of 1554. Most of the papers were case-control studies with small clinical samples. Only one out of Blashfield’s five criteria was met; there were commonly accepted diagnostic criteria and assessment scales present in the literature. However, at the time of review, the criterion of 50 published articles (25 of which are required to be empirical) was not met; there had been no publication specifically assessing the clinical utility or validity of skin picking disorder and no studies addressing the differentiation of skin picking disorder from other obsessive-compulsive and related disorders.
Conclusion:
Only a small proportion of published studies on skin picking disorder have employed validated criteria. The current literature fulfils only one of Blashfield’s five criteria for the inclusion of skin picking disorder as a specific entity in psychiatric diagnostic manuals. Further empirical studies are needed to substantiate skin picking disorder as distinct from the other disorders in the obsessive-compulsive and related disorders category.
Skin picking disorder (SPD), also referred to as psychogenic or neurotic excoriation, pathological skin picking or dermatillomania, was introduced as a disorder in the fifth edition of the
Table 1. Diagnostic criteria for SPD as proposed by Arnold’s criteria (a) and the DSM-5 (b).
SPD: skin-picking disorder; DSM:
(a) Arnold et al. (2001); (b) American Psychiatric Association (2013)
The validity of SPD as a distinct disorder speaks to some of the broader issues surrounding the introduction of new diagnostic categories and needs to take the historical evolution of psychiatric nosology into account. Nosology has progressed from a ‘top-down’ approach based on a-priori features advocated by experts to a system increasingly based on empirical evidence (Kendler, 2009). The first three editions of the DSM used a descriptive approach towards classification based on the rationale that the creation of new diagnostic categories would lead to enhanced empirical research in otherwise neglected areas of psychopathology (Robins and Barrett, 1989). This strategy resulted in a proliferation of diagnostic categories that often had sparse empirical evidence to support their existence and that clinicians reported were poor at capturing the clinical complexities of patients (Kendler et al., 2009). Moreover, the door was opened for various interest groups to lobby for inclusion of specific disorders, with the eventual realisation that, once added, diagnostic categories are difficult to remove (Blashfield et al., 1990). To limit the influence of emotional and political discourse on the contents of the DSM and to ensure an evidence base, Blashfield et al. (1990) proposed a set of guidelines to be satisfied before a putative diagnostic category is recognised in the DSM, henceforth referred to as ‘Blashfield’s criteria’. The application of Blashfield’s criteria to the published literature has been used to assess the validity and clinical utility of other diagnostic categories including binge eating disorder (Striegel-Moore and Franko, 2008), night eating syndrome (Striegel-Moore et al., 2006), muscle dysmorphia (Santos Filho et al., 2016) and catatonia (Taylor and Fink, 2003). Other proposed criteria to assess diagnostic validity and clinical utility include those by Robins and Guze (1970) and Kendler et al. (2009). These criteria overlap significantly with Blashfield’s criteria; however, they lack the specificity of Blashfield’s and are therefore difficult to operationalise.
Given the above clinical and historical perspectives, the objective of this study was to ascertain the current status of SPD as a discrete category in the DSM-5 according to Blashfield’s criteria. To achieve this, we conducted a comprehensive search of the literature, assessed the retrieved articles against Blashfield’s criteria, and evaluated and summarised the nosological classification of SPD as proposed by various studies.
Methods
This study was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Moher et al., 2010).
We performed a systematic review of the literature searching CINAHL, Embase, Medline and PsycINFO for articles published between January 2008 and May 2018 using the following terms in title and abstract fields:
Inclusion criteria
Eligible studies were analytical or prevalence articles published in English in which participants were assessed for SPD using Arnold et al.’s (2001) or DSM-5 (American Psychiatric Association, 2013) criteria. The operational diagnostic criteria proposed by Arnold et al. (2001) prior to the inclusion of SPD in the DSM-5 are henceforth referred to as ‘Arnold’s criteria’. These criteria were based on the reported phenomenology of patients with skin picking behaviours in multiple studies and are akin to the criteria later included in the DSM-5 (see Table 1). Only articles with participants aged 18 or older were included.
Exclusion criteria
Theoretical articles and reviews, dissertations, studies with a qualitative design, tool validation articles, case studies and studies that focused exclusively on treatment of SPD were excluded.
Selection process
Sourced studies were imported into Covidence online software (www.covidence.org). Two independent reviewers (Z.J. and H.Z.) screened studies for relevance based on titles/abstracts and later on full texts, with disagreements resolved through discussion or by consulting the third reviewer (D.C.).
Data extraction
Relevant data were extracted from each study including country undertaken, study design, sample size (
Assessment of the methodological quality
The strength of evidence presented by the included articles was assessed according to levels of evidence and grades of recommendations provided by the National Health and Medical Research Council’s (NHMRC, 2008) ‘Additional Levels of Evidence and Grades for Recommendations’. The NHMRC Evidence Hierarchy determines six designations of ‘levels of evidence’ according to study design (see Table 2). These levels summarise study designs according to their generally perceived capacity to minimise or eliminate bias in the effect being measured. Articles were assessed independently by two reviewers (Z.J. and H.Z.) and disagreements were resolved through discussion or by consulting the third reviewer (D.C.).
Table 2. Description of included studies.
SPD: skin picking disorder; TTM: trichotillomania; F: female; M: male; N/A: not applicable; SCID-I: Structured Clinical Interview for DSM-IV Axis I Disorders; CGI: Clinical Global Impression Severity Scale; SDS: Sheehan Disability Scale; HCs: healthy controls; OCD: obsessive-compulsive disorder; CNB: compulsive nail biting; SP-SAS: Skin Picking Symptom Assessment Scale; SP-YBOCS: Yale Brown Obsessive-Compulsive Scale Modified for Skin Picking; NE-BOCS: Yale–Brown Obsessive-Compulsive Scale modified for neurotic excoriation; HARS/HAM-A: Hamilton Anxiety Rating Scale; HDRS/HAM-D: Hamilton Depression Rating Scale; SPS: Skin Picking Scale; HADS: Hospital Anxiety and Depression Scale; MINI: Mini International Neuropsychiatric Interview; PSWQ-8: Penn State Worry Questionnaire – 8-item version; UPPS: UPPS Impulsive Behaviour Scale; MIDI: Minnesota Impulsive Disorders Interview; PHQ-9: Patient Health Questionnaire; PSS: Perceived Stress Scale; BDD: body dysmorphic disorder; BDDQ: Body Dysmorphic Disorder Questionnaire; NS: not specified; SPQ: Skin Picking Questionnaire; BDI: Beck Depression Inventory; OCI: Obsessive-Compulsive Inventory; BIS-11: Barratt Impulsiveness Scale; QoLI: Quality of Life Inventory; OCRDs: obsessive-compulsive and related disorders; EIQ: Eysenck Impulsiveness Questionnaire; MIDAS: Milwaukee Inventory for the Dimensions of Skin Picking; SPS-R: Skin Picking Scale – Revised; BSI: Brief Symptom Inventory; PSQI: Pittsburgh Sleep Quality Index; BFRBDs: body-focused repetitive behaviour disorders; SADS: Scale for the Assessment of Disgust Sensitivity; QADP: Questionnaire for the Assessment of Disgust Proneness; QASD: Questionnaire for the Assessment of Self-Disgust; fMRI: functional magnetic resonance imaging; STADI: State Trait Anxiety Depression Inventory.
Blashfield’s criteria
To examine the validity of SPD as a diagnostic category, the selected articles were evaluated against Blashfield’s criteria: (1) there should be at least 50 journal articles published on the proposed diagnostic criteria in the last 10 years; (2) the diagnostic criteria should be clearly defined with the existence of assessment instruments; (3) the proposed syndrome should be reliably diagnosed by two or more assessors; (4) the category should represent a syndrome of frequently co-occurring symptoms; and (5) there should be evidence to demonstrate that the proposed category represents an independent disorder that can be differentiated from other (similar) disorders. Included articles were assessed against each criterion by two reviewers (Z.J. and H.Z.) and disagreements were resolved through discussion or by consulting the third reviewer (D.C.).
Results
A PRISMA flow diagram shows the selection of articles for inclusion and exclusion (Figure 1). A total of 1647 articles were retrieved. Of these, 93 were duplicates and 1308 were excluded from title/abstract screening as they were not focused on skin picking or were not empirical articles. A further 246 articles were assessed by full-text review and ultimately 20 articles satisfied the inclusion criteria.

Figure 1. Flow chart showing the retrieval process of articles included in the systematic review.
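The screening counts reported in the Results can be checked with simple arithmetic. The sketch below uses only the numbers stated in the text; the variable names are illustrative, and the number excluded at full-text review is not stated in the text but is implied by the other counts.

```python
# Counts as reported in the Results text (variable names are illustrative).
retrieved = 1647
duplicates = 93
excluded_title_abstract = 1308
included = 20

# Records remaining after duplicates are removed.
screened = retrieved - duplicates

# Records carried forward to full-text review.
full_text = screened - excluded_title_abstract

# Not stated in the text; implied by the counts above.
excluded_full_text = full_text - included

print(f"screened on title/abstract: {screened}")
print(f"assessed in full text: {full_text}")
print(f"excluded at full text (implied): {excluded_full_text}")
```

The derived values agree with the 1554 screened records and the 246 full-text articles reported in the text.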
Study characteristics are listed in Table 2. Three quarters of the studies (15 papers) were conducted in the United States, two in Austria, one in Iceland, one in Israel and another in South Africa. Five studies (25%) used Arnold’s criteria and 15 (75%) used the DSM-5 criteria. Most studies (18 papers; 90%) included clinical participants, with the two non-clinical participant studies using a university student population (Odlaug et al., 2013) and a sample of community members (Leibovici et al., 2015). Three studies (15%) included female-only samples, with sample sizes ranging from 12 to 2145 individuals.
Assessment against Blashfield’s criteria
Criterion 1: literature
Blashfield’s criteria recommended that, for a diagnostic category to be sufficiently recognised in the literature, there need to be at least 50 journal articles (of which 25 are empirical) published about the proposed category in the previous 10 years. A systematic search yielded 1554 published articles that mentioned SPD, with 254 related to SPD over the last decade (see Figure 1 for PRISMA diagram). Within these 254 articles, there were only 20 empirical articles that used Arnold’s or DSM-5 criteria. Therefore, based on the inclusion criteria of this review, the first of Blashfield’s criteria was not met.
Criterion 2: diagnostic criteria
The second guideline proposed by Blashfield et al. (1990) stipulated that there be a set of diagnostic criteria defining the disorder and a corresponding assessment device available to ensure the meaning associated with the diagnostic category is clear and readily assessable. Our search of published empirical articles found that Arnold’s criteria were commonly used as diagnostic criteria for SPD prior to the inclusion of SPD in the DSM-5. One assessment device and three self-report scales were used alongside Arnold’s or DSM-5 criteria to assess the symptomatology and severity of SPD:
The Yale–Brown Obsessive-Compulsive Scale (YBOCS) (Goodman et al., 1989) modified for neurotic excoriation or skin picking (NE-YBOCS/SP-YBOCS) was used in 10 studies to rate skin picking symptom severity (see Table 2). This modification of the YBOCS specifically for SPD was used in two treatment trials and was found to have good psychometric properties.
The Skin Picking Symptom Assessment Scale (SP-SAS) was modified from a reliable and valid self-report scale used to assess other impulse control disorders (ICDs) (Grant et al., 2007). The test–retest reliability in an SPD treatment trial was 0.73 (Spearman’s correlation,
A further self-report scale employed alongside Arnold’s or the DSM-5 criteria was the Skin Picking Scale (SPS) developed to measure the psychosocial impact of SPD. In one study, the SPS was used alongside the DSM-5 criteria for SPD and demonstrated good internal consistency (alpha = 0.80) and was capable of discriminating between the SPD patients and the non-pathological skin pickers. Snorrason et al. (2013) conducted a factor analysis of the original SPS which led to the introduction of a revised version (SPS-R) measuring two factors of SPD: impairment and severity. Two empirical studies included the SPS-R alongside the DSM-5 criteria. The SPS-R demonstrates high internal consistency, robust factor structure, and good convergent and discriminant validity (Snorrason and Olafsson, 2012).
Finally, the Milwaukee Inventory for the Dimensions of Skin Picking (MIDAS) assesses whether SPD is automatic (conducted without awareness) or focused (occurring with full awareness). The MIDAS was reported to have good convergent and divergent validity and adequate internal consistency in an Internet sample, with three empirical articles using the MIDAS scale alongside the DSM-5 criteria.
Overall, based on the 20 studies that used either Arnold’s or DSM-5 diagnostic criteria, together with the use and psychometric evaluation of assessment instruments, we found that the second criterion proposed by Blashfield et al. (1990) was met.
Criterion 3: reliability of diagnosis
Blashfield’s criteria recommended that there should be at least two empirical studies conducted by independent research groups where inter-clinician agreement levels (kappa values) of diagnosis are 0.70 or greater. Of the studies that used Arnold’s or DSM-5 criteria to diagnose SPD, no studies specifically assessed inter-rater reliability. Therefore, this criterion was not met.
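To illustrate the kind of inter-clinician agreement statistic this criterion calls for, the following is a minimal sketch of Cohen's kappa for two raters compared against the 0.70 threshold. The ratings are hypothetical and are not drawn from any included study; Cohen's kappa is assumed here as one common form of kappa, and the function name is illustrative.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every case
    return (observed - expected) / (1 - expected)


# Hypothetical ratings: 1 = SPD diagnosed, 0 = SPD not diagnosed.
clinician_1 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
clinician_2 = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

kappa = cohens_kappa(clinician_1, clinician_2)
meets_threshold = kappa >= 0.70
print(f"kappa = {kappa:.2f}, meets 0.70 threshold: {meets_threshold}")
```

In this made-up example the raters agree on 9 of 10 cases and chance agreement is 0.50, so kappa is about 0.80.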
Criterion 4: syndrome
Blashfield’s fourth criterion requires the proposed diagnostic category to represent a syndrome of frequently co-occurring symptoms. Specifically, there should be at least two independent studies that show that, if a patient meets one diagnostic criterion for SPD, there is at least a 0.50 probability that they will meet another criterion. Of the 20 studies that used Arnold’s or DSM-5 criteria to confirm diagnosis of SPD, we found no empirical studies that investigated the co-occurrence of symptoms; hence, the fourth criterion was not met.
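One possible way to operationalise the 0.50 co-occurrence requirement is to estimate, for every ordered pair of diagnostic criteria, the probability that a patient who meets the first also meets the second. The sketch below does this on hypothetical patient data; the pairwise reading of the guideline, the function name and the data are all assumptions made for illustration.

```python
from itertools import permutations


def conditional_cooccurrence(cases, n_criteria):
    """Return {(i, j): P(criterion j met | criterion i met)} for all i != j."""
    probs = {}
    for i, j in permutations(range(n_criteria), 2):
        met_i = [case for case in cases if case[i]]
        if not met_i:
            continue  # undefined when no patient meets criterion i
        probs[(i, j)] = sum(1 for case in met_i if case[j]) / len(met_i)
    return probs


# Hypothetical data: one row per patient, one column per diagnostic criterion
# (1 = criterion met, 0 = not met).
patients = [
    (1, 1, 1),
    (1, 1, 0),
    (1, 0, 1),
    (0, 1, 1),
    (1, 1, 1),
    (0, 0, 1),
]

probs = conditional_cooccurrence(patients, n_criteria=3)
all_pairs_meet = all(p >= 0.50 for p in probs.values())

for (i, j), p in sorted(probs.items()):
    print(f"P(criterion {j} | criterion {i}) = {p:.2f}")
print(f"all pairs >= 0.50: {all_pairs_meet}")
```

A study reporting such conditional probabilities from clinical data would be the kind of evidence this criterion requires.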
Criterion 5: differentiation
In order to ascertain that SPD is a distinct diagnostic category, is not redundant with existing categories and represents an independent disorder, Blashfield’s criteria recommended there be two independent empirical studies establishing differential diagnosis.
The included studies (see Table 2) contained a clinical comparison of SPD and OCD, which reported that the SPD sample had a significantly higher proportion of females than the OCD sample and that individuals with OCD spent more time engaging in obsessions and/or compulsions than those with SPD (Grant et al., 2010). Similar rates of general psychiatric comorbidity were observed in the SPD and OCD samples, but those with SPD were more likely to have an additional grooming disorder (such as TTM or nail biting), whereas the OCD participants had higher rates of BDD. Another study that attempted to disentangle the shared psychopathology between SPD and BDD investigated psychosocial dysfunction, quality of life and comorbid substance use disorders in patients with co-occurring SPD and BDD compared to SPD alone, but found no between-group differences (Grant et al., 2015).
TTM is also thought to share phenomenological and clinical similarities with SPD; evidence of clinical similarities includes age of onset, gender ratio, clinical severity and psychosocial functioning (Odlaug and Grant, 2008). Moreover, investigations into quality of life (Odlaug et al., 2010) and sleep disturbance (Ricketts et al., 2017) found no differences between individuals with SPD and those with TTM. However, differences in brain volume and cortical thickness between individuals with SPD and TTM have been demonstrated, suggesting potential neurobiological differences between the two disorders (Roos et al., 2015).
In the 20 empirical studies included in our review, no two independent studies clearly demonstrated differences between SPD and related disorders. Therefore, there is insufficient evidence to satisfy Blashfield’s fifth criterion.
Quality of methodological design
The methodological quality of evidence of each included study was critically appraised using the
Nosological classification of SPD
Of the 20 studies that defined a diagnosis of SPD as per Arnold’s or DSM-5 criteria, 13 postulated a nosological classification for SPD (see Table 2). Overall, 5 studies (25%) supported SPD as part of the obsessive-compulsive spectrum of disorders, 4 (20%) as a grooming disorder or body-focused repetitive behaviour (BFRB), 3 (15%) as an ICD and 1 (5%) as a body image disorder; the remaining 7 (35%) did not suggest a nosological classification of SPD.
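The proportions above can be recomputed from the reported counts. This is a small sketch using only the figures given in the text; the dictionary keys are shorthand labels chosen for illustration.

```python
# Study counts per proposed classification, as reported in the text.
classification_counts = {
    "obsessive-compulsive spectrum": 5,
    "grooming disorder / BFRB": 4,
    "ICD": 3,
    "body image disorder": 1,
    "no classification suggested": 7,
}

total = sum(classification_counts.values())
percentages = {
    label: 100 * count / total for label, count in classification_counts.items()
}
# Studies that proposed any classification at all.
postulated = total - classification_counts["no classification suggested"]

for label, pct in percentages.items():
    print(f"{label}: {classification_counts[label]} ({pct:.0f}%)")
print(f"studies postulating a classification: {postulated} of {total}")
```

The counts sum to the 20 included studies, and the 13 studies that postulated a classification follow from subtracting the 7 that did not.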
Discussion
The findings of the current review suggest that only one of the five Blashfield criteria has been met in support of SPD. Therefore, based on the current literature, we propose that there is insufficient empirical support for the inclusion of SPD as a discrete disorder. Other diagnostic categories that have failed to fulfil Blashfield’s criteria include muscle dysmorphia (Santos Filho et al., 2016) and night eating syndrome (Striegel-Moore et al., 2006), whereas the literature supports the criteria for binge-eating disorder (Striegel-Moore and Franko, 2008) and catatonia (Taylor and Fink, 2003).
In the current assessment of SPD against Blashfield’s first criterion relating to the existence of sufficient empirical literature, we identified a substantial number of studies on SPD; however, only a small proportion of these used validated diagnostic criteria to determine the presence of SPD. Moreover, studies investigating the prevalence of SPD have primarily been conducted using online surveys which rely on self-report measures to confirm the presence of SPD (Calikusu et al., 2012; Hayes et al., 2009; Keuthen et al., 2010; Machado et al., 2018; Prochwicz et al., 2016), limiting the validity of the findings. This is also reflected in the relatively low quality of methodological design employed by the studies. The prevalence reported in these studies may actually represent the prevalence of skin picking as a
Limitations
A limitation of the current review is the use of a single set of criteria to assess the validity of including SPD as a diagnostic construct; the authors acknowledge there are other assessment criteria, such as those of Robins and Guze (1970) and Kendler et al. (2009). The current review also excluded articles not in English and only included those published in the past 10 years, which potentially narrowed the number of included articles.
Conclusion
This review identified a substantial literature on SPD, yet only a small proportion of the studies used validated criteria to determine the presence of SPD, indicating a lack of consensus on what constitutes the putative disorder. Similarly, the literature proposed varying nosological classifications of SPD and reflected mixed opinions on the relationship of SPD with other disorders. Future research needs to substantiate SPD as distinct from the related disorders it accompanies under the OCRDs category. A potential advantage of the inclusion of SPD as a distinct category in psychiatric classification systems is increased clinical utility through enhanced clinician awareness of SPD and recognition of the close relationship between SPD and other OCRDs.
Footnotes
Acknowledgements
Z.J. and H.Z. are joint first authors.
Declaration of conflicting interests
D.C. has received grant monies for research from Eli Lilly, Janssen-Cilag, Roche, Allergen, Bristol-Myers Squibb, Pfizer, Lundbeck, AstraZeneca and Hospira; travel support and honoraria for talks and consultancy from Eli Lilly, Bristol-Myers Squibb, Astra Zeneca, Lundbeck, Janssen Cilag, Pfizer, Organon, Sanofi-Aventis, Wyeth, Hospira and Servier; and is a current Advisory Board Member for Lu AA21004: Lundbeck; Varenicline: Pfizer; Asenapine: Lundbeck; Aripiprazole LAI: Lundbeck; Lisdexamfetamine: Shire; Lurasidone: Servier; Brexpiprazole: Lundbeck; and treatment-resistant depression: LivaNova. He does not knowingly have stocks or shares in any pharmaceutical company.
Funding
The author(s) received no financial support for the research, authorship and/or publication of this article.
