Abstract
This research evaluated the efficacy of the Wechsler Intelligence Scale for Children—Fifth Edition (WISC-V) for gifted identification. Our sample included 390 gifted, highly gifted, and twice-exceptional (2e) children, referred by parents for testing at seven U.S. sites. We examined de-identified scoring data to determine mean performance patterns across WISC-V indexes and investigated which robust scoring options (each summarizing four to eight subtests) were sensitive to gifted and twice exceptional strengths. We found discrepant WISC-V primary index scores (≥1.5 SD differences) in a majority of our sample, undermining interpretability of Full Scale IQ scores for gifted identification. WISC-V mean scores ranged from very superior in untimed high-g Verbal Comprehension to average in low-g Processing Speed (irrelevant to gifted identification). A similar pattern emerged in our 2008 WISC-IV study of 334 gifted children. We show that the WISC-V performs effectively within multi-dimensional gifted and 2e identification approaches provided that the Full Scale IQ is not required. Instead, we demonstrate that any of six robust, high-g scores (ancillary and expanded indexes; FSIQ) may be used to document global strength or individual strength areas, satisfy gifted identification requirements, and guide advanced programming—even for 2e children, whose correspondingly low scores warrant further evaluation for co-existing weaknesses. We recommend best practices for use of these WISC-V scoring options for ethical identification of a broad, diverse range of gifted and 2e students, many of whom would be missed by averaging discrepant scores. Our study data prompted NAGC position statements on the WISC-V and WISC-IV.
Plain Language Summary
We researched the Wechsler Intelligence Scale for Children—Fifth Edition (WISC-V), a well-known individual ability test, to see how well it finds children who need gifted programs in school. We tested 390 gifted children and found their scores ranged from very high to much lower. This made their Full Scale IQ scores, which combine all the different skills tested, impossible to understand. The children did best when asked questions verbally, and also when doing mental math problems or copying designs. They were not as good at timed paper-and-pencil tasks. Yet, skills like handwriting speed are not important in gifted programs. Schools usually prefer a high Full Scale IQ score for entrance to gifted programs, but it was often too low to qualify these children. This pattern is not new; it happened earlier in our study of 334 gifted children taking an older version of this test: the WISC-IV (Gilman et al., 2008). Our research shows that the WISC-V is helpful for finding gifted children, but the Full Scale IQ should not be required. It is better to consider six possible broad scores, and report the ones that show the child’s strengths (e.g., strong verbal skills, advanced visual reasoning). This should be enough to meet the requirements of the gifted program. It also shows what the child’s strengths are so teachers can plan the child’s instruction. We provide a list of rules for using this test that make it easier to find the broad range of children who need gifted programs, without missing as many children. (See also NAGC WISC-V and WISC-IV position statements).
Keywords
Introduction
Comprehensive, individual intelligence tests are invaluable when used as part of a multi-dimensional selection process for the identification of gifted and twice exceptional (2e) children. With norms typically designed from U.S. Census demographic breakdowns, and multiple psychometric protections against bias, they are defensible measures of intellectual potential for children with high ability in one or more domains. Comprehensive tests provide a multifaceted assessment, recognizing that giftedness may be expressed in more ways than those identified by brief measures, screeners, and unidimensional ability tests. They are administered by highly trained examiners who can observe and address culturally and linguistically relevant concerns, as well as monitor salient behaviors, such as problem-solving processes. Comprehensive, individual intelligence scales are essential to clarify the complex interaction of strengths and weaknesses in gifted children with co-existing disabilities: the twice exceptional (National Association for Gifted Children [NAGC], 2008, 2018).
The Wechsler Intelligence Scale for Children (WISC) is the most widely used individual cognitive test to identify gifted children (Robertson et al., 2011). Introduced by David Wechsler in 1949, the test originally yielded Verbal, Performance, and Full Scale IQ scores, instead of the single IQ score derived from the competing Stanford-Binet Intelligence Scale. By the 1960s, the preference for Wechsler scales had become well established (Lubin et al., 1971), and the tests continued to gain in complexity as factor analyses elucidated stable structures across the child and adolescent age range (e.g., Kaufman, 1975). With successive editions (Wechsler, 1974, 1991, 2003a, 2014a), the WISC has added and redefined scoring indexes corresponding to its underlying factor structure, most recently yielding Verbal Comprehension, Visual Spatial, Fluid Reasoning, Working Memory, and Processing Speed.
As new subtests were added to strengthen these factors, test scores lacked cohesiveness for gifted students; a high score on one index did not necessarily predict high scores on others. Gifted children earned higher scores in Verbal Comprehension, Visual Spatial, and Fluid Reasoning, but lower scores in Processing Skills: Working Memory and especially Processing Speed. The addition of Processing Skills subtests changed the composition of Full Scale IQ (FSIQ) scores in the 2003 and 2014 editions, enhancing the role of Working Memory and Processing Speed, while reducing the importance of abstract reasoning. When testing gifted children, were some elements better indicators of giftedness than others? Would the FSIQ score represent a unitary construct or be uninterpretable due to excessive discrepancies among the scores used to calculate it? These were important questions for gifted identification in states and school districts mandating use of the FSIQ (Rimm et al., 2017).
This study of the Wechsler Intelligence Scale for Children—Fifth Edition (WISC-V) was conducted to better understand the scoring patterns of gifted, highly gifted, and twice exceptional children to optimize identification of the diverse range of gifted children. Comparisons were made with an earlier, unpublished study of the Wechsler Intelligence Scale for Children—Fourth Edition (WISC-IV), which highlighted challenges using the Full Scale IQ for gifted identification. Both studies were conducted by members of the Assessments of Giftedness Special Interest Group (“Assessment SIG”) of the National Association for Gifted Children (NAGC), a group primarily consisting of psychologists and other specialists in the evaluation of gifted children. Representing private practices, university testing centers, and schools, members with access to gifted databases provided data and shared observations from test administration. Our research findings served as the basis for two position statements approved by the NAGC Board of Directors recommending best practices for the administration and scoring of the WISC-IV and WISC-V with gifted children.
Background
In 2005, co-author and psychologist Sylvia Rimm assembled what became known as the NAGC Assessment SIG to explore problems affecting gifted identification when the WISC-IV Full Scale IQ was used as a primary identifier. Changes made to the WISC-IV appeared to have a substantial impact on student selection for gifted programs. Compared with the WISC-III, the weight of Working Memory and Processing Speed in the WISC-IV Full Scale IQ score had doubled from 20% to 40%, with a consequent reduction from 80% to 60% in the weight of subtests measuring general intelligence (g). Rimm noted that Working Memory and Processing Speed “are often weak areas for gifted children, and not considered to be as related to general intelligence (g) as Verbal Comprehension and Perceptual Reasoning are” (S. Rimm, personal communication, 2005). She added that for gifted children, especially gifted boys, processing speed scores (assessed by tasks of visual discrimination and fine motor speed) could be especially low. Psychologist and co-author Edward Amend recalled a twice exceptional boy with ADHD, who earned a WISC-III FSIQ of 145 and a WISC-IV FSIQ of 115. Some teachers said, “See! We knew he wasn’t that smart!” and he was subsequently removed from gifted services. In schools in which the FSIQ was the required score for gifted program admission, the redefined WISC-IV FSIQ prevented some qualified students from receiving gifted services.
Publisher data (Wechsler, 2003b, p. 77; Williams et al., 2003, p. 1) comparing a gifted group (N = 63) with a matched control group (N = 63) foreshadowed unusual WISC-IV scoring patterns for gifted children. The Control Group’s WISC-IV primary mean index scores varied from a high of 106.6 in Verbal Comprehension to a low of 102.8 in Processing Speed—a difference of just 3.8 points across the four indexes of the test. The Gifted Group’s mean index scores ranged from 124.7 (approximate 95th percentile) in Verbal Comprehension to 110.6 (approximate 77th percentile) in Processing Speed—a difference of 14.1 points, close to the standard deviation of 15.
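The percentile equivalents quoted for these index scores can be approximated from the standard-score metric (mean 100, SD 15). The following is a minimal sketch, assuming a normal distribution; the function name `percentile_rank` is ours for illustration, and results may differ by about a point from the publisher's conversion tables, which are based on whole-number scores.

```python
import math

MEAN = 100.0  # mean of the Wechsler index-score metric
SD = 15.0     # standard deviation of the index-score metric


def percentile_rank(standard_score: float) -> float:
    """Approximate percentile rank of a standard score under a normal curve."""
    z = (standard_score - MEAN) / SD
    # Normal cumulative distribution function expressed with the error function.
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # Gifted-group means reported in the publisher data cited above.
    for label, score in [("Verbal Comprehension", 124.7), ("Processing Speed", 110.6)]:
        print(f"{label}: {score} is near percentile {percentile_rank(score):.0f}")
```

The same conversion places a score of 130, a common gifted-program criterion, at roughly the 98th percentile.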
A study of 104 gifted children administered the WISC-IV at Gifted Development Center (GDC) showed a discrepancy of 27.4 points between VCI (131.7) and PSI (104.3; Silverman et al., 2004). Only four children (4%) earned “gifted” Processing Speed scores of 130 or higher. This center tests many highly gifted and twice exceptional children.
Flanagan and Kaufman (2004) stated that when WISC-IV index scores vary by 23 points or more (at least 1.5 standard deviations), the resulting Full Scale IQ score is not a unitary construct and is, therefore, “not interpretable” (p. 128). In 79% of GDC cases, the Full Scale IQ score was uninterpretable due to discrepancies of 23 points or more. In such cases, Flanagan and Kaufman (2004) recommended use of the General Ability Index (GAI), provided that the Verbal Comprehension (VCI) and Perceptual Reasoning (PRI) Index scores are less than 23 points apart. They reasoned that the WISC-IV GAI, calculated using the subtest scaled scores of the VCI and PRI (three verbal and three visual subtests), omitting Working Memory (WMI) and Processing Speed (PSI) subtests, provided a better overall indicator of the child’s reasoning abilities than the FSIQ. Separating the GAI from the WMI and PSI allowed examiners to differentiate a child’s strengths and weaknesses, documenting and supporting each individually. What should be done when the discrepancy between VCI and PRI is greater than 23 points? Flanagan and Kaufman (2004) recommended using either the VCI or PRI as evidence of giftedness to ensure that gifted programming efforts address the child’s strengths. These methods could prevent use of a Full Scale IQ score that masks one or both of a child’s exceptionalities and results in service denial.
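The decision sequence described above can be sketched as a short function. This is an illustrative reading of the Flanagan and Kaufman (2004) guidance, not a published algorithm: the function name, the dictionary layout, and the treatment of a gap of exactly 23 points between VCI and PRI (handled here as too large for the GAI) are our assumptions.

```python
from typing import Dict

THRESHOLD = 23  # points; roughly 1.5 SD on a scale with SD = 15


def recommended_score(indexes: Dict[str, float]) -> str:
    """Suggest which WISC-IV score to interpret for one child.

    `indexes` must hold the four WISC-IV index scores under the keys
    "VCI", "PRI", "WMI", and "PSI".
    """
    spread = max(indexes.values()) - min(indexes.values())
    if spread < THRESHOLD:
        return "FSIQ"        # index scores are cohesive; FSIQ is interpretable
    if abs(indexes["VCI"] - indexes["PRI"]) < THRESHOLD:
        return "GAI"         # reasoning indexes are cohesive; omit WMI and PSI
    return "VCI or PRI"      # document the stronger reasoning area on its own


if __name__ == "__main__":
    # Hypothetical child: strong reasoning, much lower processing speed.
    child = {"VCI": 135, "PRI": 128, "WMI": 120, "PSI": 104}
    print(recommended_score(child))
```

For the hypothetical profile shown, the 31-point spread rules out the FSIQ, while the 7-point gap between VCI and PRI supports the GAI.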
Information about the General Ability Index (GAI) was not included in the WISC-IV Administration and Scoring Manual. In response to concerns raised after release of the WISC-IV in 2003 (Flanagan & Kaufman, 2004; Weiss et al., 2006), the publishers released GAI scoring instructions online (WISC-IV Technical Report #4; Raiford et al., 2005). They sponsored training sessions for testers in use of the GAI, yet many users of the test remained unaware of these developments.
Members of the NAGC Assessment SIG resolved to explore the efficacy of the WISC-IV for gifted identification with a larger sample of gifted children. Their purpose was twofold:
To make recommendations about the interpretation of the WISC-IV for identifying gifted children, based on an analysis of data collected on gifted children;
To consider ways to discriminate among highly, exceptionally, and profoundly gifted children whose levels of giftedness were not differentiated by current individual IQ tests.
WISC-IV Gifted Study
Data collection began in 2006 when members of the Assessment SIG collected anonymized WISC-IV scoring data from 334 gifted children at eight U.S. sites, including private psychology practices, gifted testing centers, and public school gifted programs. All children earning at least one WISC-IV index score of 130 or higher (98th percentile; a typical criterion for gifted programs) were included. While these participants may be considered representative of parent-referred assessments for gifted and twice exceptional educational decision making, they should not be considered representative of gifted students as a whole. The emphasis of testing was optimal assessment of each child; thus, occasional modifications (e.g., subtest substitutions permitted by administration rules) were made to accommodate a child’s needs or disabilities. The resulting data were presented at conferences (Gilman et al., 2008; Gilman, 2009; Robinson et al., 2010), but never published.
As shown in Table 1, WISC-IV Verbal Comprehension (VCI) yielded the highest mean primary index score: 133.17 (99th percentile). VCI assesses verbal abstract reasoning, language, social judgment, and understanding of cause and effect through orally administered items. Perceptual Reasoning (PRI) yielded a mean index score of 127.84 (96th percentile). PRI assesses visual-spatial reasoning, visual abstract reasoning, pattern recognition, and attention to visual detail, utilizing oral instructions and visual prompts. Working Memory (WMI) yielded a mean index score of 121.58 (92nd percentile), assessing non-meaningful auditory-sequential memory of digit strings and letter-digit combinations read to the child. Processing Speed (PSI) scored lowest: 112.02 (79th percentile), measuring visual-motor speed and visual discrimination speed on paper-and-pencil tasks.
Table 1. WISC-IV Study, NAGC Task Force (N = 334).
Data presented as conference papers (Gilman et al., 2008; Gilman, 2009; Robinson et al., 2010).
The mean index scores of Verbal Comprehension (133.17) and Processing Speed (112.02) differed by 21 points. Index score discrepancies greater than or equal to 23 points occurred in 235 children. Using the Flanagan and Kaufman criterion, 235 of the 334 children (70%) did not qualify for calculation of the Full Scale IQ because such a score would be uninterpretable. Only 99 of 334 children (30%) qualified for calculation of the Full Scale IQ.
A discrepancy of 11.59 points was found between the mean Verbal Comprehension Index of 133.17 and mean Working Memory Index of 121.58. For all ages, a statistical difference of 11.18 points is significant at the 0.05 level (Wechsler, 2003b, p. 256). The discrepancy between the mean Verbal Comprehension Index and mean Processing Speed Index of 112.02 was 21.15 points—considerably larger than the 12.62 difference deemed significant at the 0.05 level. The mean Processing Speed score was high average for this sample. Only 36 children (less than 11%) earned PSI scores of 130 or above.
For the GAI to be a unitary construct and meaningful global estimate of intelligence, the VCI and PRI discrepancy must be less than 23 points (Flanagan & Kaufman, 2004). A total of 285 children (85%) qualified for calculation of the GAI. Due to large discrepancies, 51 (15%) did not qualify for either the FSIQ or GAI. Their strengths were evident in either the VCI or PRI.
The Full Scale IQ score was a meaningful assessment of intelligence for only 30% of the sample, while the GAI documented the high ability of the gifted group in 85% of cases. Working Memory and especially Processing Speed scores were lower. Our brightest students were not necessarily our fastest ones.
WISC-IV Extended Norms Developed in 2008
Some of our children earned more raw score points than needed to qualify for the maximum subtest scaled score (19), but these extra points were ignored because scoring was capped. Could publishers develop a scoring method reflective of their actual performance? A review of data from our highest scorers convinced publishers to extend current norms, addressing our second research goal.
Normative scaling was extended on the upper end of the test to reflect actual raw score points earned, and Extended Norms scoring information appeared online (Zhu et al., 2008). Global test score ceilings capped at 150 to 160 rose to 210. Subtest scaled scores capped at 19 were extended as high as 28, depending on the subtest and the child’s age. When a child earned a subtest scaled score of 19, examiners were instructed to consult an extended norms chart (Zhu et al., 2008) to determine the actual subtest scaled score and resulting global scores.
Extended norms did not change all scores; most scores that changed increased only a few points. Yet, some children’s scores rose dramatically, alerting educators to unusual learning needs (e.g., for significant acceleration) and allowing children to qualify for support groups and programs for exceptionally and profoundly gifted children. Extended norms were unavailable for our study of 334 gifted children, but their use would have increased some mean scores and the discrepancy between gifted and control group scores. Our study’s findings resulted in the following NAGC recommendations for best practices in the use of the WISC-IV when determining admission to gifted programs.
NAGC WISC-IV Position Statement
In January 2008, the Board of Directors of the National Association for Gifted Children (NAGC) approved the Position Statement, “Use of the WISC-IV for Gifted Identification.” Position Statements are distributed to state Departments of Education. Key recommendations included the following:
… Where comprehensive testing is available, NAGC recommends that WISC-IV Full Scale IQ scores not be required for admission to gifted programs…either the General Ability Index (GAI), which emphasizes reasoning ability, or the Full Scale IQ Score (FSIQ), should be acceptable for selection to gifted programs… If neither the FSIQ or GAI is appropriate…The Verbal Comprehension Index (VCI) and the Perceptual Reasoning Index (PRI) are also independently appropriate for selection to programs for the gifted, especially for culturally diverse, bilingual, twice exceptional students, or visual-spatial learners. (NAGC, 2008)
The Wechsler Intelligence Scale for Children—Fifth Edition (WISC-V)
Development and Structure
Released in October 2014, the WISC-V offers comprehensive evaluation of children ages 6 through 16. Designed to identify and qualify students with learning disabilities for services, identify intellectual disability or giftedness, evaluate cognitive processing, and assess the impact of brain injuries (Pearson, 2016), its comprehensive nature supports its use with gifted and twice exceptional students. The WISC-V can be used for high-stakes evaluations, appeals following denial of services, and for gifted identification, possibly following an intelligence screener.
Publishers examined WISC-V item content, relevance, and potential bias; instructions for the examiner and child; administration procedures; psychometric properties; and scoring criteria (Pearson, 2016). The normative sample consisted of 2,200 U.S. children of 11 different age groups, balanced with respect to gender and closely matched to 2012 U.S. Census data on race/ethnicity, parent education level, and geographic region. Special population studies of gifted children and children with a variety of disabilities (e.g., specific learning disorder in reading, written expression, and math; ADHD; Autism) undergird the development of WISC-V and prompted relevant scoring options. Publishers offer a more comprehensive discussion of validity and reliability in Wechsler (2014b).
Representing a five-factor model, the WISC-V has five primary indexes, each with two required core subtests (in parentheses): Verbal Comprehension (VCI; Similarities and Vocabulary), Visual Spatial (VSI; Block Design and Visual Puzzles), Fluid Reasoning (FRI; Matrix Reasoning and Figure Weights), Working Memory (WMI; Digit Span and Picture Span) and Processing Speed (PSI; Coding and Symbol Search). A WISC-V Full Scale IQ (FSIQ) can be derived administering seven subtests; 10 subtests are needed to compute all five primary index scores. The WISC-V features 16 subtests, including supplementary subtests in several indexes which facilitate additional scoring options. The addition of five diagnostic subtests brings the total number of WISC-V subtests to 21: 13 retained from the WISC-IV with modifications and eight new or borrowed from other editions. An Ancillary and Complementary analysis may be done using the following indexes: Quantitative Reasoning (QRI), Auditory Working Memory (AWMI), Nonverbal (NVI), General Ability (GAI), and Cognitive Proficiency (CPI). A Process Analysis is available to evaluate evidence of disability (e.g., using Immediate and Delayed Symbol Translation to evaluate learning disability; Wechsler, 2014a). The Wechsler Intelligence Scale for Children—Fifth Edition, Integrated (Wechsler & Kaplan, 2015) is available to provide additional diagnostic subtests when weaknesses are suspected.
Challenges to Structure
Challenges to the efficacy of the WISC-V primarily address structural validity. Should the four-factor WISC-IV structure have been altered to the five-factor WISC-V structure, separating the Perceptual Reasoning (PR) index into Visual Spatial (VS) and Fluid Reasoning (FR) indexes? Assessing goodness of fit using 2-, 3-, 4-, and 5-factor models, the publishers conclude, “All of the five-factor models have excellent fit that is significantly better than the fit of comparable four-factor models” (Wechsler, 2014b, p. 82).
Canivez et al. (2020) disagree, concluding that, while separate Visual Spatial and Fluid Reasoning domains ensure more consistency with the VS and FR factors of the popular CHC Theory (Schneider & McGrew, 2012), the standardization sample fails to provide support for five group factors. “Subtests thought to measure distinct VS and FR factors shared variance associated with a single PR dimension similar to the former WISC-IV” (Canivez et al., 2020, p. 15). Although Canivez and colleagues prefer the four-factor model of the WISC-IV, they conclude that a dominant general intelligence (g) factor renders the WISC-V individual group factors of poor interpretive value, warranting only minimal interpretation by clinicians beyond the Full Scale IQ (Canivez & Watkins, 2016; Canivez et al., 2016; Canivez et al., 2020).
Reynolds and Keith (2017, p. 31) reach a different conclusion: “Both g and five distinct broad abilities (Verbal Comprehension, Visual Spatial Ability, Fluid Reasoning, Working Memory, and Processing Speed) are needed to explain the covariances among the WISC-V subtests….” Interestingly, they note that the g-saturation (see discussion below) of the new seven-subtest WISC-V Full Scale IQ is strong, and only slightly lower than that of the 10-subtest WISC-IV FSIQ, allaying concerns about fewer subtests in the FSIQ. Compared with the WISC-IV’s FSIQ, 40% of which came from WMI and PSI, the weight of WMI and PSI in the WISC-V FSIQ is only 29%.
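The two percentages follow from simple subtest counts. As a rough sketch, assuming each subtest contributes equally to the FSIQ: four of the 10 WISC-IV FSIQ subtests (implied by the 40% figure) measure Working Memory or Processing Speed, while the seven-subtest WISC-V FSIQ includes one Working Memory subtest (Digit Span) and one Processing Speed subtest (Coding). The helper name `processing_share` is ours.

```python
from fractions import Fraction


def processing_share(n_processing: int, n_total: int) -> Fraction:
    """Share of FSIQ subtests drawn from Working Memory and Processing Speed.

    Assumes equal weighting of subtests, which is a simplification.
    """
    return Fraction(n_processing, n_total)


wisc_iv_share = processing_share(4, 10)  # WISC-IV: 4 of 10 FSIQ subtests
wisc_v_share = processing_share(2, 7)    # WISC-V: Digit Span and Coding of 7

if __name__ == "__main__":
    print(f"WISC-IV: {float(wisc_iv_share):.0%}")
    print(f"WISC-V:  {float(wisc_v_share):.0%}")
```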
Robust Scoring Options
WISC-V test authors have created additional expanded and ancillary index scores sensitive to gifted and twice exceptional abilities (Raiford et al., 2015; Raiford, Silverman, et al., 2019; Wechsler, 2014b). These index scores are robust, each summarizing four to eight subtests, and offer powerful documentation of strengths comparable to the Full Scale IQ, calculated from seven subtests. (Less powerful are the WISC-V VCI, VSI, and FRI primary index scores, each of which is calculated using just two core subtests.) Some of the robust indexes minimize or eliminate processing skills. Each has Extended Norms to measure performance beyond the normal scoring ceiling of the test (Raiford, Courville, et al., 2019). The following scoring choices are not all found in WISC-V Administration and Scoring Manuals (Wechsler, 2014a, 2014c), but are beneficial tools for gifted assessments (references are provided):
Verbal (Expanded Crystallized) Index (VECI) is composed of the four untimed subtests of the Verbal Comprehension composite: Similarities and Vocabulary (core); Information and Comprehension (supplementary; Raiford et al., 2015). The VECI documents advanced verbal intelligence, serving as a broad Verbal IQ that often proves important for gifted assessment and identification.
Nonverbal Index (NVI) is composed of Block Design, Matrix Reasoning, Coding, Figure Weights, Visual Puzzles, and Picture Span (Wechsler, 2014c). Four of six subtests are timed; a fifth limits the time each stimulus is presented to the child.
Expanded Fluid Index (EFI) is composed of the four subtests of the Fluid Reasoning domain: Matrix Reasoning and Figure Weights (core); Picture Concepts and Arithmetic (supplementary; Raiford et al., 2015). Two of four subtests are timed. The EFI offers a broad measure of Fluid Reasoning.
Full Scale IQ (FSIQ) score is composed of Block Design, Similarities, Matrix Reasoning, Digit Span, Coding, Vocabulary, and Figure Weights (Wechsler, 2014a). Now more visual than verbal, three nonverbal subtests are timed. The FSIQ integrates reasoning and processing skills (Working Memory or Processing Speed subtests), and is interpretable when scores are relatively consistent.
General Ability Index (GAI) is composed of Block Design, Similarities, Matrix Reasoning, Vocabulary, and Figure Weights (Wechsler, 2014c). The GAI measures Verbal, Visual Spatial, and Fluid Reasoning without processing skills. More visual than verbal, two nonverbal subtests are timed.
Expanded General Ability Index (EGAI) is composed of the four untimed Verbal Comprehension subtests, as well as Block Design (timed), Matrix Reasoning (untimed), Figure Weights (timed), and Arithmetic (timed). This particularly robust index was created by the publishers for gifted assessment (Raiford, Silverman, et al., 2019) and measures verbal, visual, and mathematical reasoning without processing skills.
Note: Quantitative Reasoning Index (QRI) is composed of Figure Weights and Arithmetic, both timed. While not robust, the QRI can be useful to document mathematical talent for development in schools (Wechsler, 2014c).
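The composition of the six robust scores listed above can be captured in a small lookup table, which makes it easy to compare how many timed and processing-skills subtests each one contains. This is an illustrative sketch only: the `TIMED` set is inferred from the timing notes in the list above, `PROCESSING` collects the Working Memory and Processing Speed core subtests, and the names are ours rather than the publisher's.

```python
from typing import Dict, List, Tuple

VERBAL = ["Similarities", "Vocabulary", "Information", "Comprehension"]

INDEXES: Dict[str, List[str]] = {
    "VECI": VERBAL,
    "NVI": ["Block Design", "Matrix Reasoning", "Coding",
            "Figure Weights", "Visual Puzzles", "Picture Span"],
    "EFI": ["Matrix Reasoning", "Figure Weights", "Picture Concepts", "Arithmetic"],
    "FSIQ": ["Block Design", "Similarities", "Matrix Reasoning", "Digit Span",
             "Coding", "Vocabulary", "Figure Weights"],
    "GAI": ["Block Design", "Similarities", "Matrix Reasoning",
            "Vocabulary", "Figure Weights"],
    "EGAI": VERBAL + ["Block Design", "Matrix Reasoning",
                      "Figure Weights", "Arithmetic"],
}

# Assumption: timed subtests as inferred from the descriptions above.
TIMED = {"Block Design", "Coding", "Figure Weights", "Visual Puzzles", "Arithmetic"}
# Working Memory and Processing Speed core subtests ("processing skills").
PROCESSING = {"Digit Span", "Picture Span", "Coding", "Symbol Search"}


def summarize(index: str) -> Tuple[int, int, int]:
    """Return (number of subtests, number timed, number of processing subtests)."""
    subtests = INDEXES[index]
    timed = sum(1 for s in subtests if s in TIMED)
    processing = sum(1 for s in subtests if s in PROCESSING)
    return len(subtests), timed, processing


if __name__ == "__main__":
    for name in INDEXES:
        total, timed, processing = summarize(name)
        print(f"{name}: {total} subtests, {timed} timed, {processing} processing")
```

A table like this shows at a glance that the VECI, EFI, GAI, and EGAI contain no processing-skills subtests, whereas the FSIQ and NVI each contain two.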
Wechsler Intelligence Scale for Children—Fifth Edition (WISC-V) Study
In 2017, members of NAGC’s Assessments of Giftedness SIG commenced research on the WISC-V with gifted, highly gifted, and twice exceptional children. The goal was to determine best practices for use of the WISC-V, as part of a multi-faceted gifted identification process in schools. We hoped to facilitate identification of diverse populations of gifted children with varying strengths and possible weaknesses.
The purpose of the study was to examine mean scores on the WISC-V among a sample of students referred for testing to various clinics/private practices specializing in assessment of high ability children. The following research questions were posed:
How do gifted and twice exceptional children perform on the WISC-V? (Performance patterns across WISC-V indexes; interpretability of the WISC-V Full Scale IQ; a comparison of WISC-V and WISC-IV scoring patterns.)
Which scoring options are sensitive to gifted and 2e strengths? (Robust expanded and ancillary indexes; extended norms.)
Method
Participants and Procedures
Participants included 390 gifted children with a mean age of 9.6 years (SD = 2.2). The sample was 71% male and 29% female. Test ages ranged from 6 years, 0 months to 16 years, 11 months. Participants were assessed between October 2014 (when the WISC-V was released) and fall 2017 at one of seven U.S. sites specializing in the assessment of gifted children. These included private and non-profit psychology practices and gifted educational centers. Parents most often requested assessment to determine a child’s gifted program eligibility, understand the child’s instructional needs, inform their advocacy with the school, confirm and address a child’s high level of giftedness, and understand why a child was denied entrance to the gifted program when advanced abilities were evident at home. We would expect our sample to include children missed by the school’s identification program, especially twice exceptional children and children with higher levels of giftedness than were realized at school. In this regard, our findings could enhance identification programs in schools to better represent all gifted children.
Because each site preferred to maintain its own client confidentiality, we utilized a two-stage system in which individual sites compiled their own anonymized data before submitting them to a separate site for analysis of aggregate mean scores for the entire sample. Each site submitted data on spreadsheets with only a site number and numbered cases. We utilized limited demographic data: gender, bilingual/multilingual status, and test age. We calculated aggregate mean scores for the following: subtest scaled scores, primary index scores (and discrepancies between them), FSIQ scores, and ancillary and expanded index scores. While including all children tested at these sites during this time period may reasonably describe gifted children referred by parents for testing, our sample does not reflect all gifted children. Limitations include non-random selection of participants, a lack of additional demographic data, data collected by independent examiners and researchers, and some non-standard administration to address individual children’s needs. Most sites administered 10 or more subtests, calculating five primary index scores and the Full Scale IQ, then adding subtests to demonstrate strengths or explore weaknesses specific to each child.
Performance at or above the 90th percentile (SS 119 or above) on at least one index was required for inclusion in the WISC-V study, based on the NAGC definition of giftedness at that time. The index scores considered for inclusion were the following: Verbal (Expanded Crystallized) Index (VECI), Nonverbal Index (NVI), Expanded Fluid Index (EFI), Full Scale IQ (FSIQ), General Ability Index (GAI), and Expanded General Ability Index (EGAI). Because information about expanded indexes appeared online after initial release of the WISC-V (Raiford et al., 2015; Raiford, Silverman, et al., 2019), and each index requires supplementary subtests that may not have been administered, we have fewer cases with expanded index scores. An effort was made to calculate these index scores later, provided that the required subtests had been given.
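The inclusion rule can be expressed as a simple check over whichever of the six index scores were available for a child. The sketch below is hypothetical: the function name and the use of `None` for an index that was not calculated are our assumptions about how such records might be stored.

```python
from typing import Mapping, Optional

INCLUSION_INDEXES = ("VECI", "NVI", "EFI", "FSIQ", "GAI", "EGAI")
CUTOFF = 119  # standard score at about the 90th percentile


def meets_inclusion(scores: Mapping[str, Optional[float]]) -> bool:
    """True if at least one available inclusion index is at or above the cutoff.

    Indexes that were not calculated may be absent or set to None.
    """
    return any(
        scores.get(name) is not None and scores[name] >= CUTOFF
        for name in INCLUSION_INDEXES
    )


if __name__ == "__main__":
    # Hypothetical case: expanded indexes were not calculated for this child.
    case = {"FSIQ": 112, "GAI": 121, "NVI": 108, "VECI": None}
    print(meets_inclusion(case))
```

Because only one qualifying score is needed, a child with a below-cutoff FSIQ can still be included on the strength of another index, such as the GAI in the hypothetical case shown.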
Instrument
Wechsler Intelligence Scale for Children, Fifth Edition (WISC-V; Wechsler, 2014a).
Results
Gifted and Twice Exceptional Performance on the WISC-V
Performance Patterns Across WISC-V Indexes
As shown in Table 2, the WISC-V yielded the following mean primary index scores for the study sample: Verbal Comprehension (VCI) 131.75, Visual Spatial (VSI) 120.34, Fluid Reasoning (FRI) 121.91, Working Memory (WMI) 115.12, and Processing Speed (PSI) 105.84. The highest (VCI) and lowest (PSI) means differed by 25.91 points. The mean WISC-V Full Scale IQ score was 126.59 and the mean General Ability Index (GAI) was 128.90.
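The size of the gap between the highest and lowest mean primary index scores can be checked directly and expressed in standard deviation units. This is a small illustrative calculation using the means reported above; note that a 1.5 SD discrepancy criterion is applied to individual children's profiles, so comparing it with a gap between group means is only a rough guide.

```python
MEANS = {"VCI": 131.75, "VSI": 120.34, "FRI": 121.91, "WMI": 115.12, "PSI": 105.84}
SD = 15.0  # standard deviation of the index-score metric

highest = max(MEANS, key=MEANS.get)   # index with the highest mean
lowest = min(MEANS, key=MEANS.get)    # index with the lowest mean
gap = round(MEANS[highest] - MEANS[lowest], 2)
gap_in_sd = gap / SD

if __name__ == "__main__":
    print(f"{highest} minus {lowest}: {gap} points ({gap_in_sd:.2f} SD)")
```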
Table 2. WISC-V Mean Primary, Expanded, and Ancillary Index Scores: NAGC Assessments of Giftedness SIG (N = 390).
Presented at conference sessions: Megan Foley Nicpon provided insights into the impact of timing (Gilman et al., 2018); inclusion was addressed (Gilman et al., 2021).
N = 390. Not all indexes were calculated for all children. Choices reflect additions to explore and document areas of strength and weakness for individual children (numbers of cases provided).
A small Bilingual/Multilingual subgroup of 34 children was identified, based on parent report of language competence in two or more languages (not language classes). These results should be considered with caution due to the small number of children. Like our sample of 390 children, our bilingual/multilingual subgroup scored highest in Verbal, with the same progressive lowering of scores to Processing Speed. Mean scores included VCI (N = 34) 134.74, VSI (N = 31) 126.26, FRI (N = 34) 125.50, WMI (N = 31) 118.26, PSI (N = 31) 108.35, FSIQ (N = 34) 131.21, VECI (N = 18) 138.11, NVI (N = 22) 126.50, EFI (N = 6) 126.33, GAI (N = 28) 133.82, and EGAI (N = 15) 136.40. Further research is warranted to determine whether the scoring pattern we found is typical of bilingual/multilingual gifted children with sufficient language skills to be assessed on a WISC-V in English.
WISC-V Mean Subtest Scaled Scores
Table 3 shows mean subtest scaled scores, rank ordered from highest to lowest, with degree of timing noted. Mean subtest scaled scores ranged from 15.89 in Similarities to 10.54 in Cancellation, a discrepancy of 5.35 points (6 points is two SDs). Untimed Verbal Comprehension subtests yielded mean scores above 15. Visual Spatial subtests and three of four Fluid Reasoning subtests yielded mean scores of 13+. All VSI and FRI subtests are presented with visual prompts except Arithmetic (mental math), which is read to the child. Working Memory subtests yielded mean scores of 12+. Processing Speed subtests (timed clerical-type tasks) yielded the lowest mean scores, ranging from 10.5 to 11.4. Degree of timing was notable in subtests yielding lower scores.
WISC-V Mean (SD) Subtest Scaled Scores, Psychometric g-Loadings, and Timing: NAGC Assessments of Giftedness Special Interest Group (N = 390).
Presented at conference sessions (Gilman et al., 2018; Gilman et al., 2021).
WISC-V g-loadings are based on the first unrotated factor coefficients from the total standardization sample (Canivez et al., 2016, p. 980). Respective g-loadings can be interpreted according to Kaufman (1994): factor loadings of 0.70 or greater indicate good measures of g; loadings of 0.50 to 0.69 define fair measures of g; and loadings below 0.50 are considered poor measures of g. WISC-V g-loadings range from 0.77 to 0.22—from good to poor.
The Diagnostic and Statistical Manual of Mental Disorders: Fifth Edition cautions that “…highly discrepant individual subtest scores may make an overall IQ invalid” (American Psychiatric Association, 2013, p. 37). The difference between the mean WISC-V primary index scores of Verbal Comprehension and Processing Speed is almost 26 points (30 points is 2 SDs), suggesting that invalid overall IQ scores would not occur for every child in our sample, but would be common for the gifted population. Such a discrepancy in an individual child would represent a statistically significant difference at the 0.01 level for every age from 6 through 16 (Wechsler, 2014a, pp. 350–351), and would yield an uninterpretable Full Scale IQ (Flanagan & Kaufman, 2004). A total of 286 children (73% of the sample) showed one or more differences between index scores of 23 points or more. Using the Flanagan and Kaufman criterion, these 286 children did not qualify for calculation of the Full Scale IQ on the basis that such a score would be uninterpretable. Only 104 of 390 children (27%) qualified for calculation of the Full Scale IQ using this 1.5 SD criterion.
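The arithmetic behind this criterion can be sketched briefly. The Python example below is a minimal illustration, not the procedure used in the study. It assumes index scores scaled to an SD of 15, so that 1.5 SD equals 22.5 points, applied here as the 23 whole points named in the text. The function names are hypothetical; the mean scores and case counts are those reported above.

```python
# Illustrative sketch of the 1.5 SD discrepancy screen.
SD = 15                # standard deviation of WISC-V index scores
THRESHOLD_POINTS = 23  # 1.5 SD = 22.5 points, applied in the text as 23 whole points


def largest_discrepancy(primary_scores):
    """Largest difference between any two primary index scores."""
    values = list(primary_scores.values())
    return max(values) - min(values)


def fsiq_interpretable(primary_scores, threshold=THRESHOLD_POINTS):
    """True when no pair of primary index scores differs by `threshold` or more."""
    return largest_discrepancy(primary_scores) < threshold


# Sample mean primary index scores reported in the text.
sample_means = {"VCI": 131.75, "VSI": 120.34, "FRI": 121.91,
                "WMI": 115.12, "PSI": 105.84}

gap = largest_discrepancy(sample_means)
print(round(gap, 2))                     # 25.91
print(round(gap / SD, 2))                # 1.73
print(fsiq_interpretable(sample_means))  # False

# Share of the sample meeting or exceeding the threshold, as reported: 286 of 390.
print(round(100 * 286 / 390))            # 73
```

Checking only the highest and lowest index is sufficient, because any pair of scores differing by 23 points or more implies that the largest gap also does.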
Notable were the children with large discrepancies between VCI and PSI. One child earned a VCI of 150 at the 99.9th percentile (“extremely high”), a PSI of 66 at the 1st percentile (“very low”), and a Full Scale IQ score of 117 at the 87th percentile. If only the Full Scale score is considered, this child’s discrepancy of 84 points—over 5 SDs—would be masked. The high average Full Scale score (117) would likely be deemed too low for gifted program admission and too high to warrant 504 Plan accommodations for weaknesses. Requiring the FSIQ as a child’s best score is not defensible.
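The relationship between these standard scores, their percentile ranks, and the size of the discrepancy can be checked with a normal-curve approximation. The sketch below is an assumption-laden illustration: it uses Python's standard-library `statistics.NormalDist` with a mean of 100 and an SD of 15, whereas published percentile ranks come from the test's normative tables and may differ slightly. The helper names `percentile` and `sd_units` are hypothetical.

```python
from statistics import NormalDist

# Index and Full Scale scores are scaled to mean 100, SD 15.
# Percentiles here are a normal-curve approximation, not the publisher's tables.
SCALE = NormalDist(mu=100, sigma=15)


def percentile(standard_score):
    """Approximate percentile rank for a standard score."""
    return 100 * SCALE.cdf(standard_score)


def sd_units(score_a, score_b):
    """Difference between two standard scores expressed in SD units."""
    return abs(score_a - score_b) / SCALE.stdev


# The case described in the text: VCI 150, PSI 66, FSIQ 117.
for label, score in (("VCI", 150), ("PSI", 66), ("FSIQ", 117)):
    print(label, score, round(percentile(score), 1))

print(sd_units(150, 66))  # 5.6
```

Under this approximation the three scores fall near the 99.9th, 1st, and 87th percentiles, consistent with the values reported for this child, and the 84-point gap corresponds to 5.6 SDs.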
A Comparison of WISC-V and WISC-IV Scoring Patterns
Figure 1 reveals strikingly comparable mean scoring profiles for our WISC-V study (N = 390) and WISC-IV study (N = 334). Mean index scores for both gifted groups were not only higher than average, but highly discrepant across cognitive domains. Both WISC-V and WISC-IV scores decline sequentially from a high in verbal comprehension to visual spatial/fluid reasoning, working memory, and processing speed. Timing appeared to negatively affect scoring for both. WISC-V mean index scores of Verbal Comprehension (131.75) and Processing Speed (105.84) varied by 25.91 points, while WISC-IV mean index scores of Verbal Comprehension (133.17) and Processing Speed (112.02) varied by 21.15 points. Gifted scoring similarities occurred despite differences in the test editions and inclusion criteria for the studies. Meaningful FSIQ scores were possible for only 27% of WISC-V and 30% of WISC-IV test takers, due to averaging highly discrepant scores.

A comparison of mean WISC-V and WISC-IV primary index scores for gifted and control groups: WISC-V (n = 390) versus WISC-IV (n = 334).
WISC-V Scoring Options Sensitive to Gifted and 2e Strengths
Robust Expanded and Ancillary Index Scores
Table 2 also denotes broad WISC-V index scores that proved sensitive to gifted strengths. The Verbal (Expanded Crystallized) Index (VECI; mean score 132.93), which summarizes all four Verbal Comprehension subtests, is a stronger indicator of verbal strength than the VCI when determining the need for programming for a verbally gifted child. The particularly robust Expanded General Ability Index (EGAI; 131.08) revealed multiple gifted strengths. Created by the publishers in response to a request by our co-authors, the EGAI represents eight subtests measuring verbal, visual, and mathematical reasoning—without processing skills (Working Memory or Processing Speed). The EFI and NVI may be helpful to document visual spatial, fluid, or mathematical abilities, provided the child performs well in timed situations. The mean Expanded Fluid Index (EFI) score was 124.56. The Nonverbal Index (NVI) yielded a mean score of 122.44, about 10 points below the VECI. Although not “robust,” the Quantitative Reasoning Index (QRI; 123.71) is useful for documenting math talent. The mean General Ability Index (GAI; 128.90), which does not include processing skills measures, was higher than the mean NVI, EFI, and QRI.
Extended Norms
Following completion of the WISC-V study, parents of children in our study who earned one or more subtest scaled scores at the 99.9th percentile were invited to contribute data to an extended norms study conducted by the publishers. The resulting WISC-V extended norms (Raiford, Courville, et al., 2019) credit actual raw score points earned, without artificial scoring caps, and locate exceptionally and profoundly gifted children requiring services. Substituting corrected extended norms scores improves the validity of test scores and the recommendations made from them.
Discussion
Verbal Scores Highest; Nonverbal Is Emphasized
The Verbal (Expanded Crystallized) Index yielded our highest WISC-V mean composite score (132.93), measuring verbal abstract reasoning, vocabulary, cause-effect reasoning, social judgment, and general knowledge. Children with verbal strengths require instruction that develops related skills.
Yet, the WISC-V now features only two verbal (and three nonverbal) reasoning subtests in both the GAI and FSIQ. The GAI has no WMI and PSI subtests; the FSIQ adds one WMI and one PSI subtest. In comparison, the WISC-IV balanced three verbal and three nonverbal subtests in the FSIQ with two Working Memory and two Processing Speed subtests. Although reducing the number of Working Memory and Processing Speed subtests could have raised the WISC-V FSIQ, the mean WISC-V FSIQ was 126.59—down from 131.58 on the WISC-IV. While the samples have differences, was reducing the influence of Verbal Comprehension, the group’s highest scoring area, one aspect of the lower FSIQ scores? The WISC-V GAI also dropped to 128.90, from 136.85 on the WISC-IV.
The assessment of verbal intelligence has sustained criticism for unfairness to diverse children, including English language learners, rural children, children of poverty, and children with cultural differences (Naglieri & Ford, 2003; Raven, 2000). Some schools have chosen only nonverbal intelligence screeners (e.g., the Naglieri Nonverbal Ability Test; Naglieri, 2008) to “level the playing field” for all children, regardless of native language and culture. Yet, our co-authors observe that gifted multilingual children, who are capable enough English speakers to be tested in English, may earn their highest scores in Verbal Comprehension—not unlike our small Bilingual/Multilingual subgroup. Their enhanced language experience, knowledge of root words, and gifted reasoning ability may confer unusually adept vocabulary and verbal reasoning in English.
The preference for nonverbal assessment is changing. “The nonverbal tests were specifically created to address the proportional gap in identification of underserved students (Naglieri & Ford, 2003), but the gap persists” (Hodges et al., 2018, p. 36). A meta-analysis of primarily nonverbal group tests found them inadequate to address inequity in identification for gifted programs (Hodges et al., 2018). No statistically significant difference was found in the risk ratio between verbal and nonverbal methods of identification. Far earlier, Lohman (2005) challenged Naglieri and Ford’s (2003) assertion that the NNAT was effective in identifying underrepresented groups, and questioned their methods and sample. Exploring the consequences of choosing verbal, quantitative, and nonverbal tests for talent identification, Lakin and Lohman (2011) concluded that nonverbal tests cause classification errors and fail to identify more English language learners and diverse minority students. A meta-analytic study by Lee et al. (2021) found that the NNAT identifies only 42% as many under-identified students as are identified from white and Asian populations, and that the NNAT better predicts students’ academic achievement than their intelligence.
Matching Strengths, Tests and Programming
Verbal and nonverbal ability tests are not interchangeable. “Gifted education as it currently stands is extremely verbal, saturated with meaning and spoken communication” (Wasserman, 2020, p. 340). Highly verbal children assessed on a nonverbal test will only be identified as gifted, and have access to such verbally-saturated classes, if they also have strong visual spatial, fluid or mathematical reasoning—and no vision weaknesses to lower their scores. The development of verbal strengths is essential to such fields as law, politics, literature, filmmaking, technical writing, religion and philosophy.
Likewise, children with nonverbal, visual spatial, and fluid reasoning strengths should perform well on a nonverbal test of visual pattern recognition, but a single verbal test may not pave the way for the advanced math, science, art and computer classes they need. The development of these strengths is essential for such fields as engineering, the sciences, the arts, computer science, and mathematics. Sadly, if the choice of testing doesn’t capture the strengths the teacher observed, the child may be missed indefinitely for gifted programming and the teacher may hesitate to refer future students for gifted assessment.
The careful choice of WISC-V verbal and/or nonverbal composite scores, or relevant portions of brief tests (Silverman & Gilman, 2020), can document a child’s strengths and match them to the attributes of the program. This allows children to flourish with an approach matched to their strengths.
Increased Timing
On both the WISC-V and WISC-IV, Processing Speed (PSI) at the average or high average level is considered an intra-individual weakness for gifted and twice exceptional students relative to other cognitive domains. Measured by paper-and-pencil tasks of visual motor and visual discrimination speed, PSI is the lowest scoring index in both gifted studies (WISC-V 105.84; WISC-IV 112.02). Processing Speed is not a good measure of general intelligence (g) and is not central to gifted identification or gifted programming. Teachers can easily accommodate the child with unusual advancement in language arts or math, whether or not the child’s handwriting or mathematical calculations are fast and efficient. Slow processing speed can lower Full Scale IQ scores substantially. Insistence on using the WISC Full Scale IQ score as the main or only criterion for identification inappropriately incorporates processing speed in gifted identification practices. Requiring Full Scale scores tends to shift gifted identification to high ability students without weaknesses, effectively eliminating more asynchronous gifted or twice exceptional students who may be harder to teach and motivate. This is inappropriate, if not unethical, for public schools that must respect civil rights and ensure equity.
Timing influences WISC-V subtests outside of the Processing Speed index. As shown in Table 2, timing affects both the Visual Spatial index (Block Design [BD] and Visual Puzzles [VP] are timed) and the Fluid Reasoning index (Figure Weights [FW] is timed). The WISC-V Visual Spatial index yielded a mean score of 120.3; Fluid Reasoning scored 121.9. This compares with a mean index score of 127.8 on the older WISC-IV Perceptual Reasoning index, which consisted of one timed task (Block Design) and two untimed tests (Picture Concepts and Matrix Reasoning).
Timing may have amplified scoring differences where short time limits interact with increasing item complexity. Visual Puzzles and Figure Weights (administered back-to-back), along with Arithmetic, feature 20- to 30-s time limits, regardless of item difficulty. Some co-authors reported children saying, “I don’t like being timed” as they progressed further than most children, tackling the most difficult items. Arithmetic, which loads on both Working Memory and Fluid Reasoning factors, presents a unique challenge. Not only is it timed, but Arithmetic requires children to recall from memory increasingly complex details of the problems read to them—while doing the calculations mentally. One item has 11 data elements to recall. One repetition of the harder items is allowed, stopping the timer to repeat, but the total time to calculate answers is only 30 s.
The assumption that our brightest mathematicians will be similarly fast warrants consideration. Stephen Hawking noted that “…the chemical messages responsible for our mental activity are relatively slow-moving—so further increases in the complexity of the brain will be at the expense of speed. We can be quick-witted or very intelligent, but not both” (Farndale, 2000).
Must items be timed because, otherwise, too many children would be able to do them? Is timing necessary to create the right score distribution? Both the Differential Ability Scales-II (DAS-II; Elliott, 2007) and the Stanford-Binet 5 (SB-5; Roid, 2003) feature visual and mathematical reasoning items that are largely untimed. The untimed DAS-II Sequential and Quantitative Reasoning subtest allows the mathematically-inclined child who loves math patterns to take several minutes to solve the hardest items, exhilarated by the accomplishment. This is important information for the teacher planning the child’s program.
It is noteworthy that the untimed Verbal Comprehension subtests that score well above all other subtests pose questions orally and allow the child unlimited time to ponder, answer, and even hone responses. The calm environment invites children to risk sharing their best thinking.
Twice exceptional children experience more difficulty under timed conditions because their ability to compensate is limited (Lovecky, 2023). A child with fine motor issues may drop the blocks in Block Design and exceed the time limit. Vision issues can undermine quantitative reasoning tests when the child must quickly count tiny pictured objects to answer the question. Timed diagnostic subtests may be helpful to document a weakness; for example, slow processing speed might allow the use of a keyboard in class. However, when multiple core subtests are timed, the FSIQ may underestimate the 2e child’s ability for years, until the effect of interventions becomes apparent. The best early indication of ability may be evident in a single strength area.
Shortened Discontinue Criteria
WISC-V discontinue criteria were shortened substantially. A succession of just three missed items terminates most WISC-V subtests, whereas the WISC-IV generally required four or five items missed in a row. Our co-authors noted instances in which children seemed to reach WISC-V discontinue criteria suddenly. When administration was continued to test the limits of the child’s ability, it was not uncommon for the child to answer several more items correctly. In GDC’s study of 104 children on the WISC-IV (Silverman et al., 2004), 43.3% of the WISC-IV sample met no discontinue criteria on 4 or more subtests (Robinson et al., 2010), suggesting a need for harder items at the ends of some subtests. Does shortening discontinue criteria substantially hardwire discontinuance into each subtest, perhaps at the cost of underestimating ability?
The Efficacy of g to Measure Reasoning Strengths
Examiners have long noted the apparent association between the variable performance of gifted children across the cognitive domains of an IQ test, and the degree to which individual subtests emphasize abstract reasoning. Charles Spearman (1904) examined correlations between academic performance measures and derived a fundamental “general intelligence” or “g” factor that could be distinguished from performance on a wide range of academic subjects and specific sensory skills. Spearman believed that g, which is derived statistically from positive intercorrelations between individual scores in batteries of mental test instruments (Wasserman, 2020), was exhibited most prominently in complex mental activities requiring reasoning, prior learning, or problem solving in novel situations.
Spearman (1927) later devised a two-factor theory, which states that every intellectual activity consists of a general (g) factor, which is shared with all other intellectual activities, and a specific (s) factor, which is unique. The (s) factor represents that portion of each ability not correlated with other variables. This two-factor model evolved into more complex hierarchical models of intelligence, with theorists and researchers accepting the basic principle of an overarching general factor, but focusing their attention on categorizing lower order broad and narrow ability factors. Spearman’s conceptualization forms a basis for current multifactor intelligence theories, such as the Cattell-Horn-Carroll (CHC) theory, upon which IQ tests are based (see Schneider & McGrew, 2018).
Not all agree with the deemphasis on g. Wasserman (2019) writes, “It is…ironic that advocates of CHC have seemingly turned away from g in favor of interpreting broad and narrow ability factors, given that g has well-established credibility in facilitating identification of at least two exceptionalities (i.e., intellectual disability and intellectual giftedness)” (p. 258). Research ties g to real world outcomes (Wasserman, 2020) such as educability (grades, achievement test scores, and years of education completed); occupational attainment and performance; and health and lifespan (better adaptive behavior, lower levels of anxiety, and lower risk of early death in adulthood). Gottfredson (1997) explores g as an important factor in the ability to deal with cognitive complexity in everyday tasks and work. The more complex the work, the more a higher level of g contributes to success. Wasserman (2020) concludes that g predicts educational and occupational success; thus, g is relevant to the identification of gifted children for programs designed to provide necessary modifications to their education.
WISC-V Subtest Scaled Scores and g
As seen in Table 3, WISC-V subtest scoring patterns for our gifted sample compare favorably with g-loadings (Canivez et al., 2016, p. 980; see also Reynolds & Keith, 2017, p. 42; Wechsler, 2014c, p. 84). Respective g-loadings can be interpreted according to the convention recommended by Kaufman (1994): factor loadings of 0.70 or greater define subtests that are good measures of g; loadings of 0.50 to 0.69 define fair measures of g; and loadings below 0.50 are considered poor measures of g. WISC-V g-loadings range from 0.77 to 0.22—from good to poor.
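The Kaufman (1994) convention amounts to a simple threshold rule. The Python sketch below is illustrative; the function name `classify_g_loading` is hypothetical, and the example values are loadings mentioned in this section rather than a full list of subtest loadings.

```python
def classify_g_loading(loading):
    """Label a subtest g-loading using the Kaufman (1994) convention."""
    if loading >= 0.70:
        return "good"
    if loading >= 0.50:
        return "fair"
    return "poor"


# Example loadings mentioned in this section, including the reported WISC-V range.
for value in (0.77, 0.70, 0.52, 0.48, 0.22):
    print(value, classify_g_loading(value))
```

Applied to the reported range, the highest WISC-V loading (0.77) is classified as a good measure of g and the lowest (0.22) as a poor one.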
The highest g-loadings correspond with the highest mean scores in Verbal Comprehension subtests, followed by Fluid Reasoning, Visual Spatial, Working Memory and Processing Speed. This pattern prevails on previous WISC editions (Cohen, 1959; Flanagan & Kaufman, 2004, p. 309; Kaufman, 1979, p. 110), with untimed, high-g Verbal subtests scoring highest (e.g., Vocabulary, g-loadings 0.77 or higher) and low-g processing speed subtests scoring lowest (e.g., Coding, g-loadings 0.48 or lower). Exceptions occurred with subtest modifications; for example, increasing the complexity of Digit Span (DS; previous g-loadings 0.52 or lower) raised the WISC-V DS g-loading to 0.70. Wasserman notes: “Measures of psychometric g may drive the FSIQ elevations associated with giftedness, but information processing efficiency is more often a discordant factor” (Personal Communication, January 4, 2021).
Silverman (2009) writes, “Abstract reasoning and general intelligence (g) are synonymous. Giftedness is high abstract reasoning. Therefore, g could as easily stand for giftedness as for general intelligence” (p. 967, italics in original).
Use of the WISC-V Full Scale IQ Score
Our studies of the WISC-V and WISC-IV offer little support for required use of the Full Scale IQ score for gifted identification. Our research contradicts the findings of Canivez et al. (2020) that the general intelligence factor (g) dominates the group factors, a finding that led them to recommend “extreme caution in WISC-V interpretation beyond the FSIQ” (p. 18). The best indicators of high intelligence are highly correlated with general intelligence (g), but the factors reflect the different ways in which intelligence is demonstrated. Furthermore, the way in which gifted children interact with the individual factors is diagnostically valuable.
The Full Scale IQ (117) of the child earning a WISC-V VCI of 150 at the 99.9th percentile (“extremely high”) and a PSI of 66 at the 1st percentile (“very low”), places the child at the 87th percentile. If viewed as the preeminent score, this FSIQ would convey that this is a bright child, well matched to the curriculum, who is sufficiently challenged and able to succeed.
Yet, a VCI of 150 documents a child with verbal reasoning abilities at the 99.9th percentile, who likely stands out in classroom discussion, is not easily challenged by typical curricula, and will need programming to develop high verbal abilities (e.g., advanced language arts). A better use of the WISC-V for such a child would be to prioritize calculation of the Verbal (Expanded Crystallized) Index (VECI), the broad indicator of Verbal IQ, and substitute the VECI for the FSIQ in the gifted identification process. If extended norms are applicable they must be used, and may document an even higher level of giftedness requiring support groups, acceleration, etc.
The PSI of 66 would suggest the child is twice exceptional and has significant weaknesses in visual-motor and visual discrimination speed. A 504 or special education evaluation would be needed to determine age-appropriate interventions and 504 accommodations (e.g., assistance with handwriting, use of a keyboard or voice-to-text). Advanced language arts courses are still appropriate and necessary, but must include options for getting thoughts onto paper. Using the more comprehensive VECI helps ensure that this verbally articulate child with poor written output is acknowledged as gifted, not “lacking in motivation,” and can receive help through a 504 plan for weakness.
The argument could be made that the Full Scale IQ (FSIQ) should be designed to measure g, instead of sampling from every area—good-g to poor-g subtests—on the WISC-V (Wasserman, Personal Communication, January 4, 2021). This is especially true as Robertson et al. (2011) report that 82.6% of surveyed school psychologists indicate that Score on an IQ Test is “Important or Very Important” in determining gifted program eligibility, and the Full Scale IQ score remains the best known composite, even as better alternatives emerge. A g-based FSIQ would help to avoid interpretive errors made by examiners unaware of alternative scoring choices. Measures of processing skills (working memory, processing speed, and other measures of executive functioning) can still be offered separately for diagnostic purposes without incorporating them into a FSIQ. The FSIQ has served as a global, overall composite score that essentially captures a linear average of whatever mix of subtests test developers choose to measure, a decision that makes little sense when psychometric g, with its massive predictive capacity, can be reliably measured (Wasserman, Personal Communication, January 4, 2021).
Alternatively, the FSIQ might be eliminated. The Differential Ability Scales-II (DAS-II; Elliott, 2007) provides no FSIQ; instead, it utilizes a General Conceptual Ability (GCA) score that summarizes verbal, nonverbal and spatial reasoning, without diagnostic processing skills measures (available separately). Silverman (2018) writes:
The reign of Full Scale IQ scores is no longer psychometrically defensible… (p. 204). …It is incongruous with a multifaceted view of intelligence. And it leads to questionable cutoff scores for admission to gifted programs embedded in state legislation and district regulations…. (p. 202)
The publishers maintain their support for the WISC-V Full Scale IQ score, but offer scoring options. The high-g WISC-V Verbal (Expanded Crystallized) Index (VECI) is a powerful indicator of broad Verbal IQ, was sensitive to the strengths of most of our gifted subjects, and scored highest (132.93). The WISC-V Expanded General Ability Index (EGAI) is more robust than the successful WISC-IV GAI, but similar—both omit Working Memory and Processing Speed. The WISC-V EGAI uses eight high-g subtests: four Verbal Comprehension subtests, two nonverbal visual reasoning subtests, and two mathematical reasoning subtests (Raiford, Silverman, et al., 2019). The EGAI is a more robust measure of giftedness than the FSIQ and scored higher (EGAI 131.08; FSIQ 126.59). The WISC-V mean GAI for our gifted sample was 128.90.
Implications for Educational and Psychological Practice
Most schools reserve the WISC-V for high stakes disability testing, but its use can improve gifted identification. The WISC-V identifies a broader range of children for gifted programs when used after universal screening or to understand a bright child with inconsistencies. Recommended scoring options that do not rely upon the FSIQ are essential.
All gifted children benefit from documentation of strength areas and programming to develop their strengths (Lovecky, 2023). High VECI scores can indicate students who need advanced instruction in areas such as language, writing, social sciences, biological sciences, philosophy, logic, and debate. Students with high NVI and EFI scores can benefit from advanced instruction in physical sciences, technology, mathematics, computer sciences, music, and art. (The QRI further documents math strength). High Expanded General Ability Index (EGAI) scores document global giftedness requiring comprehensive educational advancement. When extended norms confirm exceptionally or profoundly gifted scores, schools can confidently endorse accelerative and other advanced educational options (e.g., dual enrollment, early college) recommended for the well-being of such students (see accelerationinstitute.org; Assouline et al., 2015; Colangelo et al., 2004).
Twice-exceptional children are frequently missed for school services. Twice-exceptionality requires comprehensive assessment for diagnosis (Foley Nicpon et al., 2011): a comprehensive individual intelligence test (e.g., WISC-V), individual achievement test, and diagnostic tests in all suspected areas of disability. Access to comprehensive assessment was reduced with the 2004 reauthorization of the Individuals with Disabilities Education Act (IDEA 2004). Teachers were given the primary responsibility of locating children performing below/well below average through Response to Intervention (RTI). The NAGC White Paper “Twice-Exceptionality” (NAGC, 2009) examines the ramifications of these changes for 2e children; Gilman et al. (2013) explore five case studies of 2e children missed through this process. School psychologists must necessarily play a greater role (Foley Nicpon & Assouline, 2020), utilizing their knowledge to choose and interpret cognitive ability/intelligence tests for gifted children, and recognizing markers of twice exceptionality otherwise missed. The WISC-V allows examiners to separate and identify areas of strength and weakness, and determine the need for further evaluation. This pathway can be critical to the inclusion of 2e children in gifted programs with both advanced programming in their areas of strength and supports/interventions for their weaknesses.
Private psychologists can utilize WISC-V scoring options and extended norms to present a more authentic assessment of giftedness to schools and parents. Significant scoring discrepancies, regardless of level, must be explored and further assessment for other identified exceptionalities/disabilities done as needed. This approach provides parents more specific guidance in educational planning, needed intervention, and a roadmap for the child’s future.
Limitations and Future Research
Future research should improve controls over case selection and test administration. We recommend administration of 16 subtests (including supplementary) to calculate available robust indexes, and extended norms to compare scoring patterns at different levels of giftedness. Because diagnosis of twice exceptionality requires additional time, assessment, and possible specialists, including a cohort of pre-identified twice-exceptional children would facilitate comparison of 2e scoring patterns, possibly keyed to disability. A limitation is that the current study used only children tested in the U.S. Future studies could compare our findings with those of other countries to determine if there is consistency of results internationally. Such studies could also look at within-group differences. Our sample was limited to children with access to private assessment and professionals with knowledge of the markers of twice exceptionality. Samples taken from school gifted classrooms may reflect identification procedures that inadvertently cull out twice exceptional children. Both kinds of samples are needed to improve the representativeness of gifted identification practices.
Strength-Based Gifted Identification
Most appropriate, we believe, is the identification of domain-specific giftedness, or individual areas of strength which can inform gifted programming choices in schools. A child can receive relevant gifted programming in just one area—or several. Identification for the gifted program should require only ONE of six robust WISC-V expanded and ancillary index scores recommended below.
The ability to separate strengths from weaknesses and document each separately is essential for high ability children from all backgrounds. A robust score without processing skills may identify a child for the gifted program, while another score documents the need for classroom and testing accommodations for a disability. This is preferable to averaging disparate scores, confounding the Full Scale IQ, and causing 2e children with clear cognitive strengths and areas for growth to miss cut-offs for both gifted identification and service eligibility for weaknesses.
Conclusion
When used according to the guidelines below, the Wechsler Intelligence Scale for Children—Fifth Edition is an effective instrument to identify students with intellectual giftedness, including students with exceptionally high levels of giftedness (using extended norms) and gifted students with co-existing disabilities (the twice exceptional). Our study of 390 gifted children revealed a highly discrepant scoring pattern with the highest scores in the high-g factors of Verbal Comprehension, Visual Spatial, and Fluid Reasoning. Lower scores occurred in lower-g processing skills: Working Memory and especially Processing Speed. Processing Speed is not a good measure of general intelligence (g), and is not central to gifted identification or gifted programming. Large mean scoring discrepancies in the gifted population may render a gifted child’s WISC-V Full Scale IQ score uninterpretable—not a meaningful unitary construct. The FSIQ may mask both strengths and weaknesses, and may inappropriately incorporate processing speed into gifted identification practices. For these reasons, the WISC-V Full Scale IQ score should not be required. Instead, other robust expanded and ancillary index scores may be used, as needed, to document a child’s areas of individual strength, global strength, or weakness requiring services. Employing the recommendations below when using the WISC-V optimizes its flexibility and ensures a defensible basis for the identification of a richly diverse population of gifted children.
Recommendations for Use of the WISC-V
When using the Wechsler Intelligence Scale for Children—Fifth Edition (WISC-V) as part of a comprehensive assessment of gifted and twice exceptional children, we recommend the following:
The WISC-V Full Scale IQ score should not be required. Because gifted children earn a broader range of scores, with distinctive highs and relative lows, significant discrepancies may render the Full Scale IQ score uninterpretable. The Full Scale score is helpful when scores are consistent, but not as the sole or best identifier of giftedness in schools, districts, and state regulations. The Full Scale score may impede efforts to ensure that gifted classrooms, programs, and schools are accessible to children with disabilities.
Instead, any one of the following WISC-V scores (subtests in parentheses), representing areas of strength, should be acceptable for use in the selection process for gifted programs if it falls within the confidence interval of the required score for admission:
a. the Verbal (Expanded Crystallized) Index (VECI; SI, VC, IN, and CO),
b. the Nonverbal Index (NVI; BD, MR, CD, FW, VP, and PS),
c. the Expanded Fluid Index (EFI; MR, FW, PC, and AR),
d. the General Ability Index (GAI; BD, SI, MR, VC, and FW),
e. the Full Scale IQ Score (FSIQ; BD, SI, MR, DS, CD, VC, and FW), and/or
f. the Expanded General Ability Index (EGAI; SI, VC, IN, CO, BD, MR, FW, and AR)
* the Quantitative Reasoning Index (QRI; FW and AR)
If applicable, calculate Extended Norms scores (Raiford, Courville, et al., 2019) and report these and their resulting extended norms index scores in lieu of regular scores.
Source. Adapted from NAGC (2018). Information about these scores can be found in Wechsler (2014a, 2014c), Raiford et al. (2015), and Raiford, Silverman, et al. (2019).
*Less robust, the QRI is useful for the documentation of mathematical talent.
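The selection rule in the recommendations above can be sketched as follows. This is an illustration under stated assumptions, not a scoring procedure. It reads "falls within the confidence interval of the required score" as meaning that the obtained score plus its confidence-interval half-width reaches the program cut-off. The half-width values, the cut-off of 130, and the example scores are hypothetical; actual confidence intervals should be taken from the test publisher's materials.

```python
# Minimal sketch: which recommended scores could satisfy a program cut-off?
# Assumption: a score qualifies if score + CI half-width >= cut-off.

# The six robust scores listed in the recommendations (QRI excluded as less robust).
ACCEPTED_SCORES = ("VECI", "NVI", "EFI", "GAI", "FSIQ", "EGAI")


def qualifying_scores(scores, cutoff, ci_half_width):
    """Return the accepted scores that reach the cut-off within their CI.

    scores        -- dict of score name -> obtained standard score
    cutoff        -- required score for admission (set by the program)
    ci_half_width -- dict of score name -> CI half-width in points (hypothetical)
    """
    return [
        name
        for name in ACCEPTED_SCORES
        if name in scores
        and scores[name] + ci_half_width.get(name, 0) >= cutoff
    ]


# Hypothetical child: strong verbal scores, FSIQ lowered by processing skills.
scores = {"VECI": 138, "GAI": 127, "FSIQ": 118, "QRI": 140}
half_widths = {"VECI": 5, "GAI": 4, "FSIQ": 4}  # illustrative values only

print(qualifying_scores(scores, cutoff=130, ci_half_width=half_widths))
```

Because any one qualifying score is sufficient, a non-empty result would meet the identification requirement in this sketch even when the FSIQ itself falls short of the cut-off.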
Footnotes
Authors' Note
Barbara Gilman has retired from Gifted Development Center. Research data explored in this article prompted position statements by the National Association for Gifted Children (NAGC) recommending best practices for use of WISC-V and WISC-IV for gifted and twice exceptional identification. Portions of these data were presented by the authors at conferences between 2008 and 2021, and referenced in a related article, but have never been fully reported. The authors contributed their time to the collection of anonymized Wechsler Intelligence Scale for Children-Fifth Edition (WISC-V) data at seven assessment sites in the U.S. All identifying information about the children in our samples (including birthdates) remained with the contributing sites, which forwarded anonymized spreadsheet scoring data, gender, bilingual/multilingual status and test age to a central site for aggregate data analysis of the entire sample. Statistical analysis was completed by Frank Falk (now deceased), Research Director of the Institute for the Study of Advanced Development (ISAD), Westminster, Colorado, United States, with the assistance of Caelan Darnell for data entry. [In 2006, Frank Falk, with the assistance of Abra Havens, analyzed the Wechsler Intelligence Scale for Children—Fourth Edition (WISC-IV) data referenced in the Background section of this article and score comparisons.] Stephen Chou, former Summit Center Training and Research Director, and Charlotte Beard, former Research Associate, played key roles in the WISC-V data collection from Summit Center. We appreciate the assistance of Megan Foley-Nicpon for her insight into the effect of processing speed and timing on the WISC-V. We are grateful to John Wasserman for sharing his understanding of intelligence theory and g, invaluable to understanding our results. The fee for submission of the WISC-V study to SAGE Open was contributed by the authors.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request. Our data were contributed by experienced private assessment sites that scrupulously de-identified their data to meet parent concerns before sending it to a central site for analysis. To address parent concerns, we will not share birthdates, test dates, or, in some cases, test ages.
