Abstract
The aim of this study was to examine the reliability of the Gifted Rating Scales-Preschool/Kindergarten form (GRS-P) and its relationship with performance assessments in a sample of Greek preschool children. In the initial screening, 60 high-potential children were nominated by their teachers using the GRS-P. In a second phase, these children were assessed for nonverbal intelligence, verbal skills, and early numeracy, and 50 children were eventually identified after excluding those who did not meet the criteria. Findings revealed high reliability and internal consistency across all GRS-P subscales. Significant positive correlations were found between the GRS-P Intellectual and Academic Ability subscales and nonverbal intelligence, while verbal ability correlated positively with the GRS-P Academic Ability subscale. Additionally, the Motivation subscale correlated significantly with nonverbal intelligence, whereas the relationships of the Motivation and Academic Ability subscales with early numeracy were positive but nonsignificant. The study's findings highlight implications for educational policy supporting high-ability preschoolers in both Greek and international contexts.
Keywords
Introduction
The identification of high-ability/gifted children has been a controversial issue in the field (Acar et al., 2016; Pfeiffer, 2015a). Giftedness has been described through numerous terms and models, ranging from IQ-based definitions to multifactorial approaches considering broader traits and abilities (Carpenter, 2019; Ivleva & Ivlev, 2017; Pfeiffer, 2013, 2015a; Plucker et al., 2017; Smedsrud, 2020). This reflects a continuum from conservative, IQ-focused views to more inclusive perspectives aligned with contemporary practices (Renzulli, 2002). Recent conceptions emphasize dynamic person–environment interactions over fixed traits (Renzulli, 2002; Sternberg, 2007), with definitional variability continuing to fuel debate (Carman, 2013).
Gifted identification methods are generally divided into performance-based (e.g., intelligence and achievement tests) and nonperformance-based approaches (e.g., teacher/parent nominations, rating scales, self- and peer-ratings) (Hodges et al., 2018; Wiley, 2020). Contemporary guidelines recommend using multiple evidence sources and allowing flexibility across both types of methods (Dai, 2018; NAGC, 2015; Smedsrud, 2020).
In clinical contexts, high IQ remains the main criterion for identifying giftedness (Pfeiffer, 2021) and serves as a baseline for recognizing high ability students (Fernández et al., 2017; Renzulli & Gaesser, 2015). Gifted individuals are typically defined as those in the top 3% to 5% in intelligence, academics, the arts, and leadership. This means that inherently, intelligence is used as the measurement tool for identifying gifted students (McBee & Makel, 2019). Using an IQ cutoff of 130 excludes those with scores like 120 or 125, despite the arbitrary nature of such thresholds (Pfeiffer, 2015b). Since there is no clear boundary between gifted and nongifted, some propose broader criteria, including the top 10% to 15% (Pfeiffer, 2021; Worrell & Dixson, 2018), aligning with NAGC's “top 10% or rarer” definition (NAGC, 2010). Consequently, IQ cut scores of 120, 125, or 130 are widely used in practice (Pfeiffer, 2015a; Renzulli et al., 2002; Silverman, 2018).
High Academic Potential in the Preschool Educational Context and Formal Identification
Giftedness is typically associated with high academic achievement and above-average potential (Gagné, 2013; Miranda et al., 2013; Pfeiffer, 2012), with general intelligence considered the strongest predictor of academic performance (Cucina et al., 2016; Warne & Burton, 2020; Zaboski et al., 2018). As a result, intelligence remains the primary defining criterion in giftedness research (Baudson & Preckel, 2013; Pfeiffer, 2021), given its reliable measurement compared to less robust constructs like creativity or motivation (Erwin & Worrell, 2012).
Preschool children with high academic potential often show accelerated cognitive development as well as advanced vocabulary and numeracy skills (Kettler et al., 2017; Koshy & Robinson, 2006; Porter, 2005; Sternberg & Kaufman, 2018). However, early identification remains controversial due to definitional disagreements, the limited availability of reliable screening tools, and the instability of test scores in the early years (Borland, 2014; Gottfried et al., 2009; Grant, 2013; Hertzog, 2014; Perleth et al., 2000; Pfeiffer & Petscher, 2008; Walsh et al., 2010). Stability improves after age four, with measures predicting later achievement (Colombo et al., 2009), and test authors agree that giftedness can be identified before age five or six (Valler et al., 2017).
Identification of gifted children involves both formal and informal methods, with potential developing at different rates depending on the individual (Johnsen, 2024). Early identification enables alignment of strengths and interests with meaningful learning opportunities in inclusive settings (Johnsen et al., 2022). An international scoping review highlighted themes such as conceptualizing giftedness, equitable opportunities, identification strategies, stakeholder collaboration and teacher training as central to effective inclusion (Mossberg et al., 2024). The NAGC Pre-K–Grade 12 Standards emphasize recognizing students’ diverse abilities through varied assessments and supportive environments that foster social, emotional, and psychosocial growth (Ferguson, 2015; Johnsen et al., 2022).
Giftedness has evolved from intelligence-based definitions to multidimensional, contextual conceptions that include creativity, motivation, effort, and social skills alongside cognitive ability (Renzulli, 2016; Smedsrud, 2020; Sternberg, 2015; Subotnik et al., 2018; Westberg, 2012; Wiley, 2020). Contemporary theorists emphasize broader criteria beyond IQ, highlighting achievement, motivation, and domain-specific engagement (Heller et al., 2000; Renzulli et al., 2009; Sternberg, 2019; Sternberg & Davidson, 2005). Accordingly, the evolving definition of giftedness has shaped assessment practices, moving from intelligence and achievement tests (Pfeiffer, 2015a; Wiley, 2020) to broader methods such as teacher and parent rating scales, peer assessments, and student portfolios (Acar et al., 2016; Dai, 2018; Smedsrud, 2020). Researchers recommend staged identification, beginning with teacher assessments as brief screening tools and followed by standardized testing for more precise evaluation (Almeida et al., 2016; Renzulli & Gaesser, 2015).
With respect to teacher rating scales, these instruments provide structured insights into diverse characteristics of giftedness not captured by cognitive tests, making them valuable tools for identification (Benson & Kranzler, 2017; Makel et al., 2015; Pfeiffer & Jarosewich, 2007). Commonly used instruments include the SRBCSS (Renzulli et al., 2010), GRS (Pfeiffer & Jarosewich, 2003), GATES (Gilliam et al., 1996), SIGS (Ryser & McConnell, 2004), and HOPE Scale (Gentry et al., 2015).
Two prevailing perspectives dominate the use of teacher ratings in identifying giftedness: while some highlight teachers’ nuanced insights into student abilities (VanTassel-Baska, 2008), others stress risks of bias that may lead to disproportionality (Ford & King, 2014; Grissom & Redding, 2016; Martínez et al., 2009). Teacher conceptions, biases, and factors such as behavior, gender, prior achievement, and special education status influence ratings (Harlen, 2005; Miller, 2009). In addition, the “halo effect” may affect how teachers assess students (Rothenbusch et al., 2018). Fisicaro and Lance (1990) proposed three possible explanations for this phenomenon: (a) teachers may form an overall general impression of a student that colors all ratings; (b) a single prominent trait may disproportionately influence evaluations across various attributes; and (c) teachers might struggle to distinguish conceptually between different traits. Research has frequently found that the correlations between different characteristics rated by teachers are higher than the correlations between the corresponding student data, such as test scores (Li et al., 2008; Urhahne, 2011). Variability in how teachers conceptualize giftedness and apply rating scales further leads to inconsistencies across raters, as scores reflect both student characteristics and teacher-related factors, making them inherently nonindependent (McCoach et al., 2024). Reliability is also complicated by inter-rater variability and potential bias in equitable identification. Teacher effectiveness may also influence scores, since students of more effective educators often show higher achievement and cognitive development (McCoach et al., 2024).
To address these issues, teacher training and frequent professional development are needed to standardize scale use and reduce between-teacher variance, which contributes to inequity and underrepresentation in gifted education (McCoach et al., 2024; Renzulli et al., 2010).
With regard to the effectiveness of teacher rating scales, research suggests that when teachers possess a well-defined understanding of giftedness and its behavioral manifestations, their evaluations are more accurate and inclusive (Daglioglu & Suveren, 2013; Dal Forno et al., 2015; Harradine et al., 2014; Lee et al., 2022; McBee et al., 2016; Miranda et al., 2013). Accordingly, educators working with gifted learners need training in recognizing the characteristics of giftedness, supported by continuous, research-based professional development (Lee et al., 2022; Renzulli et al., 2010). Such development should include diversity-responsive and inclusive practices, access to sufficient resources, engagement in professional learning communities, application of research on psychosocial growth, development of culturally relevant curricula, critical examination of personal and historical biases, and commitment to ethical principles that advance equity and access (Johnsen et al., 2022). With this preparation, teachers can better nurture students’ strengths, interests, and developmental differences by creating instructional practices and environments that showcase varied abilities. Moreover, by integrating formative and summative assessments that are aligned with students’ interests, educators can create learning opportunities that foster the social, emotional, and psychosocial competencies essential for the development of giftedness (Ferguson, 2015).
The Theoretical Framework of the Study
Pfeiffer's tripartite model (2015), which integrates traditional psychometric, developmental, transformational, and ecological models, was used to frame the current study. It offers three different but complementary ways to conceptualize, identify, and program for children with high ability or extraordinary potential. The first uses IQ or cognitive ability tests to identify students significantly above average, based on either general intelligence or multidimensional models like the popular C-H-C model of cognitive abilities (Pfeiffer, 2015a). The second focuses on outstanding accomplishments, including academic excellence and creativity, emphasizing the need for enriched educational programs for consistently high achievers (Pfeiffer & Shaughnessy, 2020). The third considers unrealized potential in children who have not had the opportunity to develop their abilities. These students may not excel on standardized tests or meet typical IQ thresholds for giftedness, often scoring between 110 and 115. These categories are not mutually exclusive (Pfeiffer, 2013).
Typically, these children have IQs in the 120–130 range or higher and rank among the top performers in their class (Pfeiffer & Shaughnessy, 2020). Adopting a developmental view and an inclusionary approach aligned with Pfeiffer's tripartite model, this study uses the term “high ability”, with an emphasis on general intelligence and high academic potential, to describe the top 10% of the preschool population with the potential to perform or achieve beyond their age peers (Pfeiffer, 2015a, 2021; Wellisch, 2019; Worrell & Dixson, 2018). The terms “high ability” and “gifted” are sometimes used interchangeably in this study, as is common in the field of giftedness.
The Objectives of the Study
The structure and operation of the Greek educational system lack the administrative, instructional, and pedagogical frameworks necessary for the effective identification and support of gifted students in all dimensions of their development: cognitive, social, psychological, and emotional (Rizos, 2011). Moreover, the lack of formal gifted identification programs and organized enrichment initiatives underscores the pressing need for effective methods of assessing gifted students. Given that the Gifted Rating Scales-Preschool/Kindergarten form (GRS-P) was originally developed within a different educational and cultural setting, it is necessary to critically assess its relevance and applicability within the Greek context.
Therefore, the objectives of our study were: (a) to examine the reliability of the Greek version of the GRS-P and (b) to investigate the relationship between teachers’ nominations and traditional performance tests.
Method
Participants
A total of 36 Greek preschool teachers participated in the study. Of these, 34 (94.4%) were women and 2 (5.6%) did not declare their gender. Regarding how long they had known the student whose gifted characteristics they were to evaluate, 20 (55.6%) reported knowing the student for more than a year, while 16 (44.4%) had known the student for 4 to 6 months. Regarding teaching experience, 52.8% of the teachers (N = 19) had 16 to 20 years of experience, 30.6% (N = 11) had 26 to 30 years, and the remaining 16.7% (N = 6) had 6 to 10 years.
Fifty preschool children between 5 and 6 years of age (M = 62.50 months, SD = 2.65) were identified. The gender ratio in the group was approximately 1:1 (24 male, 26 female). The majority of parents had attained a university-level education (80% of mothers and 60% of fathers). With respect to occupational status, fathers were most frequently employed in the private sector (48%) or self-employed (38%), whereas mothers were predominantly employed in the private sector (38%), followed by self-employment (28%) or unemployment (16%). Socioeconomic classification indicated that most families (86%) belonged to the middle class, while a small proportion were categorized as low (n = 4) or upper class (n = 3).
Because the GRS-P has not been normed in the Greek population, raw scores were used. The mean raw scores on the GRS-P ability subscales were above the 84th percentile (Intellectual Ability: M = 94.20, SD = 7.10; Academic Ability: M = 91.16, SD = 6.87; Creativity: M = 80.78, SD = 10.21; Artistic Talent: M = 80.84, SD = 18.63), indicating a high probability of giftedness (GRS-P; Pfeiffer & Jarosewich, 2003). The mean raw Motivation score (M = 86.76, SD = 15.77) was also above the 84th percentile, indicating an above-average to high probability of being highly motivated (Table 1). Motivation is a key factor in the expression of giftedness (Pfeiffer, 2003; Renzulli, 1986).
According to the authors’ guidelines for the GRS-P, the Motivation score is not used as an eligibility criterion for identifying giftedness, as motivation is not considered a type of giftedness. Instead, the GRS-P Motivation scale measures a student's drive or persistence, desire to succeed, and willingness to work hard (GRS-P; Pfeiffer & Jarosewich, 2003).
Characteristics of the Identified High Ability Preschool Children.
Note. N = 50. Descriptive statistics are presented for age (in months), GRS-P subscales, CPM, CVS and Early Mathematical Competence.
Raw scores on the Colored Progressive Matrices (CPM) and the Crichton Vocabulary Scales (CVS) were used to gauge the children's actual abilities rather than their standing relative to the norming sample, and to avoid ceiling effects associated with t-scores. Moreover, the tests were not designed for high-ability children (Arffa, 2007; Bain & Bell, 2004).
In the Greek population, the mean raw scores on the CPM and CVS were at or above the 91st percentile, equivalent to a standardized score of IQ ≥ 120 (CPM: M = 24.24, SD = 2.11; CVS: M = 71.34, SD = 8.32), as this was one of the eligibility criteria for participants to be identified as having high ability. Likewise, the mean raw score on the Utrecht ENT (Barbas & Vermeulen, 2008; Van Luit et al., 1994) was at or above the 91st percentile, corresponding to Level A (M = 31.42, SD = 4.22), which was also an eligibility criterion (see Table 1).
The large ranges and standard deviations, particularly in the GRS-P Artistic Talent, Motivation, and Creativity scales, suggest that the group is not homogeneous. Indeed, the population of high-ability children is a diverse group with varying abilities and potentials in one or many domains (Bucaille et al., 2021; Hernandez Finch et al., 2014; Holocher-Ertl & Seistock, 2019).
In contrast, CPM, CVS, Early Mathematical Competence Test, GRS-P Intellectual Ability and GRS-P Academic Ability show tighter distributions with narrower ranges and lower standard deviations, implying more consistency. Owing to the wide range of scores and the potential influence of outliers—particularly at the lower end of certain scales—the data were transformed into z-scores (Banas, 2017) to facilitate direct comparisons across measures with differing scales and distributions (Table 2).
Characteristics of the Identified High Ability Preschool Children (Z-Scores).
Note. N = 50. Descriptive statistics are presented for age (in months), GRS-P subscales, CPM, CVS, and Early Mathematical Competence (Z-scores).
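As a minimal sketch of the transformation summarized in Table 2, the z-score of a raw score x is z = (x − M) / SD. The Python snippet below applies this to the reported CPM summary statistics (M = 24.24, SD = 2.11, range 20 to 28) and to a small made-up list of raw scores. The function names and the toy list are illustrative assumptions, not study data, and the sample standard deviation (n − 1 denominator) is assumed.

```python
from statistics import mean, stdev

def z_from_summary(x, m, sd):
    """z-score of raw score x given a mean m and standard deviation sd."""
    return (x - m) / sd

def standardize(scores):
    """Convert raw scores to z-scores using the sample mean and sample SD (n - 1)."""
    m, sd = mean(scores), stdev(scores)
    return [(x - m) / sd for x in scores]

# Reported CPM summary statistics: M = 24.24, SD = 2.11, range 20-28.
z_max = z_from_summary(28, 24.24, 2.11)   # highest observed CPM raw score
z_min = z_from_summary(20, 24.24, 2.11)   # lowest observed CPM raw score
print(round(z_max, 2), round(z_min, 2))   # 1.78 -2.01

# Hypothetical raw scores (not study data) to show the full transformation.
toy_scores = [20, 23, 24, 25, 28]
toy_z = standardize(toy_scores)
```

After standardization every measure has mean 0 and standard deviation 1, which is what allows scales with different raw-score ranges to be compared directly.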
Measures
Gifted Rating Scales-Preschool/Kindergarten Form
The GRS-P, standardized with the WPPSI-III (Wechsler, 2002), is designed for children aged 4:0 to 6:11 and aligns with Pfeiffer's (2015a) tripartite model of giftedness (Pfeiffer & Jarosewich, 2003). It comprises five scales—Intellectual Ability, Academic Ability, Creativity, Artistic Talent, and Motivation—each containing 12 teacher-rated items on a 9-point scale. Raw scores are converted to age-based t-scores, with higher scores indicating greater likelihood of giftedness. Specifically, a t-score below 55 (<69th percentile) suggests a low probability of giftedness, a score between 55 and 59 (69th–83rd percentile) indicates a moderate probability, a score between 60 and 69 (84th–97th percentile) reflects a high probability and a score of 70 or above (98th percentile and above) indicates a very high probability. Research has established the reliability and validity of the GRS-P (Benson & Kranzler, 2017; Karadag et al., 2016; Siu, 2010). The Greek version was adapted by Thomaidou et al. (2014) and validated in two samples by Sofologi et al. (2022), demonstrating excellent internal consistency and sound factorial, convergent, and discriminant validity. Permission for use and reproduction was granted by Multi-Health Systems Inc., Toronto, Canada.
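The classification bands above can be sketched in a few lines of Python. The snippet assumes the conventional T-score metric (mean 50, SD 10) and a normal distribution, under which the stated percentile boundaries follow directly; the function names are hypothetical, and integer T-scores are assumed.

```python
from statistics import NormalDist

T_DIST = NormalDist(mu=50, sigma=10)  # conventional T-score metric (assumption)

def t_to_percentile(t):
    """Percentile rank of a T-score under a normal distribution."""
    return 100 * T_DIST.cdf(t)

def giftedness_band(t):
    """Map an integer GRS-P T-score to the probability band described in the text."""
    if t < 55:
        return "low"
    if t < 60:
        return "moderate"
    if t < 70:
        return "high"
    return "very high"

for t in (55, 60, 70):
    print(t, round(t_to_percentile(t)), giftedness_band(t))
# 55 69 moderate
# 60 84 high
# 70 98 very high
```

The printed percentiles match the lower bounds of the moderate, high, and very high bands given above.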
Raven's Educational
The Greek standardized version of Raven's Educational (Sideridis et al., 2015) was used to assess the nonverbal and verbal ability of high-ability preschool children. It is based on the English version by Raven et al. (2008) and assesses nonverbal aspects of general cognitive ability, while its vocabulary test measures aspects of general cognitive ability in a verbal context. It consists of the CPM [α = .90] and the CVS [α = .98], which can be used separately. Both scales are administered individually and are intended for children up to 11 years old. Cronbach's alpha was calculated to assess the reliability of the CPM and CVS in our sample: .87 for the CPM and .85 for the CVS. The mean CPM raw score was M = 24.24, SD = 2.11, with scores ranging from 20 to 28, corresponding to standardized scores from 120 to 140. The mean CVS raw score was M = 71.34, SD = 8.32, with scores ranging from 57 to 97, corresponding to standardized scores from 120 to 140. Permission was granted by the publisher.
Utrecht Early Mathematical Competence Test [Early Numeracy Test]
The early numeracy of high-ability preschool children was assessed with the standardized psychometric Criterion of Early Mathematical Competence of Utrecht (Utrecht Early Mathematical Competence Test) for children aged 4.00–7.05 (Barbas & Vermeulen, 2008; Van Luit et al., 1994). It is an untimed, individually administered 40-item assessment of young children's math skills, covering eight domains: concepts of comparison, classification, one-to-one correspondence, seriation, use of number words, structured counting, resultative counting, and general understanding of numbers. Cronbach's alpha for the Utrecht Early Mathematical Competence Test was .87 in our sample. The mean raw score was M = 31.42, SD = 4.22, with scores ranging from 16 to 39.
Procedure
The study was conducted in eight public and seven private schools in the urban area of Thessaloniki in Central Macedonia, Greece during February and March 2023. Four hundred and forty-five (445) children attended public schools and 355 children attended private schools, forming a pool of 800 children, ranging between 5 and 6 years. A total of 36 teachers participated in the nomination process.
A multi-criteria evaluation in phases was applied (Almeida et al., 2016; Cao et al., 2017; Renzulli & Gaesser, 2015). Best practices for assessment of high intellectual and academic potential or potential to excel in young children recommend the use of developmentally appropriate assessment instruments as screeners before further assessment is implemented (Davis et al., 2014; Morrison, 2014).
Hence, in the first screening phase, teachers’ nominations were used as screening tools. Initially, the principals and teaching staff of all 15 schools were given clear written and verbal information about the research and the planned use of the data, and they provided written informed consent. Next, a letter introducing the purpose of the study and a written consent form were sent to parents; confidentiality and data protection, including for archiving, were also ensured. Following written parental consent, teachers were invited to complete the Greek version of the GRS-P (Pfeiffer & Jarosewich, 2003; Sofologi et al., 2022; Thomaidou et al., 2014) for children who were thought to display a potential for high ability.
In the second screening phase, further assessment was implemented. The 60 children underwent performance assessments evaluating their nonverbal intelligence (Raven's CPM: Raven et al., 2008; Sideridis et al., 2015), their ability in the verbal domain (Raven's CVS: Raven et al., 2008; Sideridis et al., 2015), and their early numeracy (Utrecht Early Numeracy Test [ENT]: Barbas & Vermeulen, 2008; Van Luit et al., 1994), thus assessing their academic potential. Two of the authors, both certified in administering the measures, conducted the assessments.
The aforementioned measures were used as the criteria for validation of the teachers’ nominations. Fluid intelligence reflects the most commonly accepted component of giftedness among the various giftedness conceptions (Peters et al., 2020; Sternberg et al., 2009), especially in early childhood (Bildiren, 2017; Silverman, 2009). The CVS and the ENT represent salient measures of academic potential and are also relevant indicators of early giftedness (Wilson, 2015). Moreover, all the distinguished test authors who participated in Valler et al.'s (2017) study, which examined their perspectives on giftedness, stated that giftedness can be identified before the age of 5 or 6 and provided explanations supporting this view. Five notable themes emerged from the authors’ open-ended responses: (a) strong verbal, communication and language skills are measurable in young high-ability learners; (b) the reliability of early childhood assessment is respected; (c) creativity and conative traits are important considerations in early childhood; (d) the authors’ theoretical perspectives and personal experiences support early identification of high-ability children; and (e) early identification depends on how giftedness is defined. Based on these findings, and considering the absence of ability tests for preschool children in Greece, we administered the available measures that were developmentally appropriate for this age group.
Children were assessed individually in a quiet area near their classroom across three sessions within a one-week period, with each session lasting approximately 45 to 60 minutes. The eligibility criterion was a total score at or above the 91st percentile on all three aforementioned measures, corresponding to standardized scores ranging from 120 to 140. According to Pfeiffer's tripartite model (2015), children whose IQ scores fall in the 120 to 130 range tend to excel in the classroom and thrive on learning and academic challenges. For the ENT, the same cut-off point was applied, as it corresponded to Level A, indicating very good mathematical competence. Ten children were excluded because they did not meet the criteria set on the performance assessments, and 50 children were eventually identified (see Figure 1).

Flow diagram of the identification process.
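The correspondence between a standardized score of 120 and the 91st percentile, and the all-measures eligibility rule, can be illustrated with the following Python sketch. It assumes standardized scores on the common mean-100, SD-15 metric (the text does not state the metric); the function names and the two example children are hypothetical and are not study data.

```python
from statistics import NormalDist

IQ_DIST = NormalDist(mu=100, sigma=15)  # assumed standard-score metric

def standard_score_percentile(score):
    """Percentile rank of a standardized score under a normal distribution."""
    return 100 * IQ_DIST.cdf(score)

def is_eligible(percentiles, cutoff=91):
    """True only if every measure is at or above the cutoff percentile."""
    return all(p >= cutoff for p in percentiles.values())

print(round(standard_score_percentile(120)))  # 91

# Hypothetical children (not study data); keys name the three measures.
child_a = {"CPM": 95, "CVS": 92, "ENT": 91}
child_b = {"CPM": 97, "CVS": 85, "ENT": 93}
print(is_eligible(child_a), is_eligible(child_b))  # True False
```

Because the rule is conjunctive, a child who falls below the cut-off on any single measure is excluded, as with the second hypothetical child.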
Data Analysis
Children's raw scores were used in our analyses to gauge their actual abilities rather than their standing relative to the norming sample, and to avoid ceiling effects associated with t-scores (Arffa, 2007; Bain & Bell, 2004). Data were analyzed using SPSS version 29. Given the variability in score ranges and the possible influence of lower-end outliers, the data were converted to z-scores (Banas, 2017) to ensure comparability across measures with distinct scales and distributions. Internal consistency reliability of the GRS-P scales was assessed using Cronbach's alpha. Intraclass correlation coefficients for all GRS-P scales were calculated using a two-way mixed-effects model with a consistency definition (Liljequist et al., 2019). Pearson’s correlation coefficient (r) was used to examine the correlations among the GRS-P subscales and the correlations between the GRS-P subscales and scores on the CPM, CVS, and ENT, based on the procedure outlined by Pfeiffer and Jarosewich (2003) in the Manual of the GRS.
Results
The Reliability of the Greek Version of the GRS-P
The reliability of all GRS-P scales was examined. Because the GRS-P has not been normed in the Greek population, raw scores were used. Cronbach's alpha was calculated and showed high reliability for all subscales (Table 3).
Reliability of all GRS-P Subscales.
Note. Cronbach's alpha (α) coefficients indicate the internal consistency reliability for each GRS-P subscale.
All subscales show high internal consistency, with Cronbach's alpha values above .80, indicating that each scale reliably measures its respective construct.
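For readers unfamiliar with the statistic, Cronbach's alpha for a scale with k items is α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The Python sketch below implements this formula on a small made-up table of ratings; it is an illustration under stated assumptions, not the SPSS procedure used in the study, and the data and function name are hypothetical.

```python
from statistics import variance

def cronbach_alpha(ratings):
    """ratings: list of rows (one per child), each a list of k item scores."""
    k = len(ratings[0])
    items = list(zip(*ratings))                        # one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in ratings])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical ratings on a 9-point scale: 5 children x 3 items (not study data).
toy = [[7, 8, 7], [5, 5, 6], [9, 9, 8], [4, 5, 4], [6, 7, 7]]
alpha = cronbach_alpha(toy)
print(round(alpha, 3))  # 0.965
```

Alpha approaches 1 when items rank children in the same order, so that most of the variance in total scores is shared across items.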
Furthermore, the statistical analysis demonstrated that the tool exhibits strong internal consistency, with all GRS-P subscales (Intellectual, Academic, Creativity, Artistic Talent, and Motivation) showing statistically significant inter-item correlations (Pearson's coefficient). Intraclass correlation coefficients (ICCs) were calculated using a two-way mixed-effects model with a consistency definition. “Single Measures” ICCs assessed the reliability of individual ratings, while “Average Measures” ICCs reflected reliability across multiple raters. The model assumed no rater-individual interaction and excluded rater differences from the error term, focusing on consistent ranking rather than identical scoring (Table 4).
Intraclass Correlation Coefficients for all GRS-P Scales.
Note. Intraclass correlation coefficients (ICCs) represent the consistency of ratings across raters for each GRS-P subscale. a = single measures; c = average measures; CI = confidence interval.
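The two consistency ICCs can be sketched from a two-way ANOVA decomposition, following the ICC(C,1) and ICC(C,k) forms in McGraw and Wong (1996): single = (MSR − MSE) / (MSR + (k − 1) MSE) and average = (MSR − MSE) / MSR, where MSR is the between-targets mean square and MSE the residual mean square. The Python code below is an illustrative implementation on made-up data, not a reproduction of the SPSS output; the function name and the toy table are assumptions.

```python
def icc_consistency(ratings):
    """Two-way consistency ICCs (single, average) for an n x k table.

    ratings: list of n rows (targets), each with k scores (raters or items).
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(c) / n for c in zip(*ratings)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ss_error = ss_total - ss_rows - ss_cols      # column effect is excluded
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    single = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    average = (ms_rows - ms_error) / ms_rows
    return single, average

# Hypothetical 5 x 3 table of ratings (not study data).
toy = [[7, 8, 7], [5, 5, 6], [9, 9, 8], [4, 5, 4], [6, 7, 7]]
single, average = icc_consistency(toy)
print(round(single, 3), round(average, 3))  # 0.902 0.965
```

Removing the column sum of squares from the error term is what makes this a consistency rather than an absolute-agreement coefficient. The average-measures consistency ICC is algebraically equal to Cronbach's alpha computed on the same table.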
The reliability analysis showed that all GRS-P subscales demonstrate statistically significant consistency among raters. Single-rater reliability varies, with Academic showing the lowest (ICC = .277) and Artistic Talent the highest (ICC = .809). When ratings are averaged, reliability improves substantially across all subscales, with ICCs ranging from .821 to .981, indicating strong agreement.
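The gain from averaging can be illustrated with the Spearman-Brown relation, average = k × single / (1 + (k − 1) × single). The text does not state how many ratings were averaged, so k = 12 in the Python sketch below is an assumption, chosen because each GRS-P scale has 12 items; under that assumption the reported single-measures extremes reproduce the reported average-measures range.

```python
def spearman_brown(single_icc, k):
    """Reliability of the mean of k ratings, given single-rating reliability."""
    return k * single_icc / (1 + (k - 1) * single_icc)

K = 12  # assumption: number of ratings averaged (equals the items per GRS-P scale)

print(round(spearman_brown(0.277, K), 3))  # 0.821 (Academic, lowest single ICC)
print(round(spearman_brown(0.809, K), 3))  # 0.981 (Artistic Talent, highest single ICC)
```

Even a modest single-measure coefficient such as .277 yields a high average-measure coefficient once a dozen ratings are combined.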
Furthermore, all GRS-P subscales were significantly intercorrelated (Pearson's coefficient) (Table 5).
Correlations Within the GRS-P Subscales (Pearson's Coefficient).
Note. **Correlation is statistically significant at the .01 level (2-tailed).
*Correlation is significant at the .05 level (2-tailed).
All GRS-P subscales are positively correlated, reaching statistical significance. This indicates that higher ratings in one domain of giftedness are generally associated with higher ratings in others. The strongest correlations were observed between GRS-P Intellectual and Academic (r = .60, p < .01) and GRS-P Motivation and Artistic Talent (r = .59, p < .01), suggesting particularly strong connections in these areas. GRS-P Intellectual scale was significantly related to all other subscales, with the strongest association found with the GRS-P Academic scale. Similarly, GRS-P Academic scale showed significant correlations with Creativity, Artistic Talent, and Motivation scales.
The Relationship Between Teachers’ Nominations and Traditional Performance Tests
Pearson correlation coefficients were calculated to examine the relationships between GRS-P subscales and CPM, CVS, and ENT. The results are presented in Table 6.
Correlations Between the GRS-P Subscales and the Performance Tests.
Note. **Correlation is statistically significant at the .01 level (2-tailed).
*Correlation is statistically significant at the .05 level (2-tailed).
Significant positive correlations were found between Intellectual Ability and CPM (r = .43, p < .01), and between Academic Ability and CPM (r = .53, p < .01). Academic Ability also showed a significant, moderate correlation with CVS (r = .33, p < .05) and a small, nonsignificant correlation with ENT (r = .27, p > .05). Motivation correlated significantly with CPM (r = .32, p < .05) and showed a nonsignificant positive relationship with ENT (r = .25, p > .05). No significant correlations were observed between Creativity or Artistic Talent and any of the performance measures.
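The significance levels reported above can be checked with the usual t test for a Pearson correlation, t = r × sqrt(df / (1 − r²)) with df = N − 2. The Python sketch below assumes that all 50 identified children contributed to every correlation (df = 48) and uses tabulated two-tailed critical t values for that df; the function names and dictionary labels are illustrative.

```python
from math import sqrt

N = 50                 # identified children (assumed complete data for each pair)
DF = N - 2             # degrees of freedom for a Pearson correlation
T_CRIT_05 = 2.0106     # tabulated two-tailed critical t, df = 48, alpha = .05
T_CRIT_01 = 2.6822     # tabulated two-tailed critical t, df = 48, alpha = .01

def t_statistic(r, df=DF):
    """t statistic for testing H0: rho = 0 given a sample correlation r."""
    return r * sqrt(df / (1 - r ** 2))

def significance(r):
    """Label a correlation using the two-tailed critical values above."""
    t = abs(t_statistic(r))
    if t >= T_CRIT_01:
        return "p < .01"
    if t >= T_CRIT_05:
        return "p < .05"
    return "n.s."

# Correlations as reported in the text.
reported = {"Intellectual-CPM": .43, "Academic-CPM": .53, "Academic-CVS": .33,
            "Academic-ENT": .27, "Motivation-CPM": .32, "Motivation-ENT": .25}
for pair, r in reported.items():
    print(pair, round(t_statistic(r), 2), significance(r))
```

With N = 50, a correlation needs to reach roughly .28 to be significant at the .05 level, which is why r = .27 and r = .25 fall just short while r = .32 and r = .33 do not.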
Discussion
The objectives of the present study were to examine the reliability of the Greek version of the GRS-P and the relationship between teachers’ nominations and traditional performance tests that measure cognitive abilities and aspects of intelligence.
Reliability of the Greek Version of the GRS-P
Results revealed strong reliability across all GRS-P subscales and high internal consistency, with each item within the subscales (GRS Intellectual, GRS Academic, GRS Creativity, GRS Artistic Talent, and GRS Motivation) showing statistically significant correlations (Pearson's coefficient). Specifically, internal consistency reliability indices for the five scales of the Greek version of the GRS-P were excellent, ranging from .82 to .98. These values are consistent with those reported by Sofologi et al. (2022) in a Greek sample of preschool children, as well as with results from the United States reported by Pfeiffer and Jarosewich (2003) and from adaptations for Turkish and Chinese samples by Karadag et al. (2016) and Siu (2010), respectively.
However, single-rater reliability varied across subscales. The Academic subscale showed the lowest reliability (ICC = .277), suggesting that raters were relatively inconsistent in their evaluations of academic traits. This inconsistency may stem from reliance on indirect or behaviorally inferred indicators, subjective interpretation, or insufficient rater training. In contrast, the Artistic Talent subscale showed the highest single-rater reliability (ICC = .809), reflecting much stronger agreement between raters. This higher reliability is likely due to the greater visibility and tangibility of artistic abilities (e.g., observable artifacts), as well as greater variability in artistic performances, which may make discrimination easier and promote consensus. Nonetheless, when the scores from multiple raters were averaged, the reliability of each subscale increased substantially (ICCs ranging from .821 to .981). This pattern is expected and reflects a fundamental principle of reliability theory: averaging ratings across multiple raters reduces the impact of individual rater bias or error, resulting in more stable and reliable composite scores (Hallgren, 2012; McGraw & Wong, 1996).
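The gain from averaging can be illustrated with the Spearman-Brown relation, which links a single-rater ICC to the reliability of the mean of k ratings. The sketch below is illustrative only: the number of ratings averaged is not reported in this section, so k is a hypothetical parameter, and pairing the lowest and highest single-rater values (.277, .809) with the lowest and highest averaged values (.821, .981) is an assumption. The function names are illustrative.

```python
def spearman_brown(single_icc: float, k: float) -> float:
    """Reliability of the mean of k ratings, given the single-rating ICC."""
    return k * single_icc / (1.0 + (k - 1.0) * single_icc)


def implied_k(single_icc: float, average_icc: float) -> float:
    """Number of ratings that would turn single_icc into average_icc."""
    return average_icc * (1.0 - single_icc) / (single_icc * (1.0 - average_icc))


# Values reported in the text; the pairing of single and averaged ICCs is assumed.
pairs = {
    "Academic Ability": (0.277, 0.821),
    "Artistic Talent": (0.809, 0.981),
}

for subscale, (single, averaged) in pairs.items():
    k = implied_k(single, averaged)
    print(f"{subscale}: implied number of ratings averaged = {k:.1f}")

# Hypothetical values of k, showing how a low single-rating ICC grows when averaged.
for k in (2, 4, 8, 12):
    print(f"k = {k:2d}: averaged ICC = {spearman_brown(0.277, k):.3f}")
```

Under these assumptions both pairs imply roughly the same number of averaged ratings (about 12), which suggests that the reported single and averaged ICCs are mutually consistent with the Spearman-Brown relation.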
Additionally, the correlations among the GRS-P subscales were positive and ranged from weak to strong, with GRS-P Intellectual Ability showing the strongest correlation with GRS-P Academic Ability (r = .60, p < .01). This correlation may suggest that the GRS-P primarily reflects a general cognitive ability and not a multidimensional conceptualization of giftedness (Benson & Kranzler, 2017; Sofologi et al., 2022). Our findings are consistent with those of recent studies sharing a similar focus, which suggest that teacher judgments, whether related to students’ cognitive abilities or academic performance, are largely influenced by students’ academic skills (Baudson & Preckel, 2013; Urhahne & Wijnia, 2021). Our findings also correspond with previous research that has assessed the reliability and validity of the GRS-P (Benson & Kranzler, 2017; Karadag et al., 2016; Pfeiffer & Jarosewich, 2003; Siu, 2010; Sofologi et al., 2022).
The Relationship Between Teachers’ Nominations and Traditional Performance Tests
Correlations of the GRS-P Intellectual Ability and GRS-P Academic Ability subscales with CPM were positive and significant (r = .43, p < .01 and r = .52, p < .01, respectively). In addition, CVS correlated positively with the GRS-P Academic Ability subscale (r = .33, p < .05). These findings support the criterion-related validity of the Greek-translated GRS-P, because scores on the Intellectual Ability and Academic Ability subscales were related to measures of nonverbal intellectual ability (CPM) and verbal ability (CVS). Indeed, an examination of the American standardization sample using evaluative efficiency statistics supported the accuracy of the GRS-P Intellectual Ability and Academic Ability subscales in identifying intellectual and academic giftedness (Pfeiffer & Petscher, 2008). Similarly, Siu (2010) found significant correlations between GRS-P subscale scores and children's school performance, including language. However, school performance was not assessed with standardized measures in that study. Likewise, Karadag et al. (2016) found the strongest relation to be between the intellectual ability subscale and the academic competence subscale. In addition, Arabic-, Czech-, Korean-, Chinese-, and Spanish-translated versions of the GRS, albeit in the school form (Hemdan Mohamed & Omara, 2020; Jabůrek et al., 2021; Lee & Pfeiffer, 2006; Li et al., 2008; Rosado et al., 2015), have yielded favorable results regarding academic performance.
With regard to advanced early numeracy, no significant correlations were found with the GRS-P subscales. The Academic scale of the GRS-P includes only two specific items related to mathematical competence, which may have limited teachers’ ability to accurately identify advanced early numeracy skills. This finding further supports the view that the GRS-P may primarily reflect general cognitive ability rather than a multidimensional conceptualization of giftedness (Benson & Kranzler, 2017; Sofologi et al., 2022).
In a classroom setting, high ability is conceptualized as the expression of above-average potential that affects the student's learning and school performance (Cross & Coleman, 2014; Gagné, 2013). Moreover, teachers assess a wide range of student traits (including cognitive, academic, creative, and social abilities) along with personality factors like motivation and leadership, forming an overall impression of each student (Benson & Kranzler, 2017). These judgments are often shaped by observable factors such as academic performance and classroom behavior, as well as broader attributes like intelligence and creativity (Golle et al., 2018). These interpretations align with the tripartite model of giftedness, which defines it as high intelligence, notable nonintellectual traits (e.g., creativity, motivation), and potential for excellence. Research consistently shows that teachers’ views on giftedness and their identification practices are strongly influenced by cognitive abilities, learning-related traits, and personality characteristics (Baudson & Preckel, 2013; Golle et al., 2018; Matheis et al., 2017). As such, teacher nominations mainly identify gifted students whose strengths align with the areas that schools promote and assess. Such students are often broadly gifted and display strong social competence (Preckel et al., 2024).
Nonetheless, careful consideration of Type I and Type II errors is essential when using teacher rating scales as preliminary screening tools for identifying gifted students. In our study, with a prevalence rate of 10%, it was estimated that 80 out of 800 children would demonstrate high-ability potential (IQ ≥ 120). This level of cognitive ability corresponds approximately to a t-score of 63 on the GRS-P, aligning with the 90th percentile of the population. Teachers ultimately nominated 60 children, suggesting that some gifted students were not identified, representing a Type II error. In screening for giftedness, there is always a balance between the risks of Type I and Type II errors. For instance, when the GRS-P is used as a screening tool to aid in identifying gifted students, setting a t-score cutoff at ≥ 60 minimizes the number of truly gifted students missed. However, this same cutoff may also result in overidentifying some students who, upon more comprehensive assessment, do not meet the criteria for intellectual giftedness based on IQ scores. This pattern was observed in our study when performance assessments were administered (Heller & Schofield, 2008; Pfeiffer & Blei, 2008).
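The screening arithmetic in this paragraph can be made explicit. The sketch below is a rough illustration under stated assumptions, not a reanalysis: it assumes IQ is normally distributed with mean 100 and SD 15 and t-scores with mean 50 and SD 10, it treats the 10% prevalence figure as exact, and it takes the count of 50 children confirmed after performance assessment from the abstract. The variable and function names are illustrative.

```python
from statistics import NormalDist


def percentile_rank(score: float, mean: float, sd: float) -> float:
    """Proportion of a normal population scoring below `score`."""
    return NormalDist(mu=mean, sigma=sd).cdf(score)


# Figures taken from the text and abstract.
screened = 800     # children in the screening pool
prevalence = 0.10  # assumed prevalence of high ability (IQ >= 120)
nominated = 60     # children nominated by teachers
confirmed = 50     # children retained after performance assessment

expected_high_ability = round(screened * prevalence)

# ASSUMPTION: IQ ~ N(100, 15) and t-scores ~ N(50, 10).
print(f"IQ 120 percentile:     {percentile_rank(120, 100, 15):.3f}")
print(f"t-score 63 percentile: {percentile_rank(63, 50, 10):.3f}")
print(f"t-score 60 percentile: {percentile_rank(60, 50, 10):.3f}")

# Nominated but not confirmed: over-identification at the screening stage.
not_confirmed = nominated - confirmed
# Expected but not confirmed: a lower bound on misses if the prevalence holds.
possibly_missed = expected_high_ability - confirmed

print(f"Expected high-ability children: {expected_high_ability}")
print(f"Nominated but not confirmed:    {not_confirmed}")
print(f"Expected but not identified:    {possibly_missed}")
print(f"Share of nominations confirmed: {confirmed / nominated:.2f}")
print(f"Share of expected cases found:  {confirmed / expected_high_ability:.2f}")
```

Because the figure of 80 is an expectation derived from an assumed prevalence rather than an observed count, the shares printed here are indicative only. They nonetheless show both error types discussed above: some nominated children were not confirmed, and the number confirmed falls short of the number expected.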
In the present study, Greek preschool teachers appeared to select children who showed promise of high intellectual and academic potential in general (Wright & Ford, 2017). Their judgment seems to be grounded mostly in children's intelligence, which is demonstrated indirectly in their academic achievement (Jabůrek et al., 2021). A possible explanation is the high correlation between the GRS-P Intellectual Ability and Academic Ability subscales (Pfeiffer & Petscher, 2008), which was also found in the present study (r = .60). Recent research has indicated that a general factor (latent variable) accounts for most of the variance captured by the five GRS-P ratings (measured variables) (Benson & Kranzler, 2017), a result also supported by the findings of Sofologi et al. (2022) for the Greek version of the GRS-P. Studies with a similar focus on teacher judgment (whether of cognitive abilities or academic performance) have demonstrated that teacher estimations are based primarily on students’ academic abilities (Baudson & Preckel, 2013; Urhahne & Wijnia, 2021).
At the same time, this judgment appears to extend to other areas, such as artistic ability and creativity: in the present study, all preschool children nominated by their teachers scored at or above the 84th percentile on all GRS-P subscales, indicating a high probability of artistic talent and creativity as well. A possible explanation of our results is the presence of a “halo effect” in teacher rating scales, a cognitive bias in which a teacher's overall impression of a student influences the ratings of that student's specific characteristics (Benson & Kranzler, 2017; Rothenbusch et al., 2018).
Finally, preschool learning opportunities and family socioeconomic status (SES) significantly influence students’ opportunity to learn (OTL) before formal schooling begins, often compromising talent identification processes. Without adjusting for prior educational exposure, assessments may reflect differences in educational opportunity rather than true ability. To ensure equity, identification practices should incorporate group-specific norms (Peters & Gentry, 2012), local norms, and universal screening to capture talent across SES groups (Peters & Engerrand, 2016). While such disparities are much less pronounced in Greece compared to the United States, a major challenge remains: the accurate identification of gifted students. Moreover, gifted education is virtually absent in the Greek public educational system, highlighting the need for systemic reform.
Limitations
There are some limitations in this study that need to be addressed. The sample size was rather small and from a single region in Greece, which limits generalizability. Larger, more diverse samples are needed to draw firmer conclusions. Research on giftedness is inherently challenging primarily because giftedness is relatively uncommon, making large, representative sampling difficult and costly. Using large-scale school achievement data, as suggested by Preckel et al. (2024), offers a promising method for identifying gifted students in Greece. Additionally, measures of creativity, artistic abilities, and motivation could also be included to enhance the test of discriminant validity.
Moreover, although teacher rating scales are widely supported for identifying high-ability students (Daglioglu & Suveren, 2013; Dal Forno et al., 2015; Harradine et al., 2014; McBee et al., 2016; Miranda et al., 2013), this approach is limited in that it relies on the skill of the observer. The insufficient training of teachers in both the identification of gifted preschoolers and the administration of the GRS-P may have influenced the study's outcomes. In addition, teachers’ stereotyped conceptions of high ability/giftedness, which are based on expectations of idealized behavioral characteristics and attitudes (McClain & Pfeiffer, 2012), and a “halo effect” bias might have influenced the teachers’ ratings (Benson & Kranzler, 2017; Rothenbusch et al., 2018).
Another limitation was the use of unweighted teacher rating scales. However, the internal consistency reliability of the subscales of the Greek version of the GRS-P was excellent in our sample, consistent with the findings of Pfeiffer and Jarosewich (2003) and Sofologi et al. (2022).
Conclusion
Research on giftedness in early childhood remains very limited, even though the early identification of high abilities in preschool children plays a decisive role in the direction their education will take (NAGC, 2019).
Each child possesses a genetic predisposition for potential abilities, which necessitates cognitive, social, emotional, motor, and various environmental stimulations to be fully realized and developed (Gagné, 2011; Mooij, 2013). As such, teachers should design comprehensive and cohesive learning plans tailored to students with diverse gifts and talents. These plans should incorporate differentiated instruction across all subject areas, utilize a balanced and effective assessment system, and integrate various technologies. Additionally, instructional practices should incorporate appropriate accommodations for exceptional learners. Learning experiences ought to be designed to nurture students’ social, emotional, and psychological development as an integral component of a strengths-based approach to gifted education, by incorporating culturally responsive curricula, diverse instructional strategies—such as critical and creative thinking, metacognitive and cognitive learning, problem-solving, and research-based models—and the use of high-quality resources to effectively support differentiation (Johnsen et al., 2022).
Beyond an appropriate legislative and institutional framework that favors flexible curricula, differentiated instruction, and evaluation of the effectiveness of implemented programs, the education of high-ability children presupposes, internationally as well as in Greece, valid identification and assessment procedures for children who need different instructional approaches because of their inclinations and cognitive abilities. Although more research is needed to replicate the current findings and to further validate and refine the Greek version of the GRS-P, the results of our study show that the GRS-P is a useful instrument for measuring cognitive and academic giftedness in the Greek cultural context. In turn, teachers who are aware of the cognitive and academic abilities of gifted children could provide learning experiences that go beyond those offered through the general education curriculum (Plucker & Callahan, 2014).
Footnotes
Acknowledgments
The authors confirm that no artificial intelligence (AI) tools were used in the preparation, writing, or editing of this manuscript.
Ethical Considerations
Ethical approval was not required. Principals, the teaching staff of all schools and the parents of the children gave their written informed consent.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
