
Abstract

General Cognitive Ability (GCA) is widely used in personnel selection, with meta-analyses from Europe and North America confirming its predictive validity for job performance. However, Swedish research remains limited, particularly in peer-reviewed literature. This study contributes to the field by summarizing mostly unpublished studies, broadening perspectives beyond academia. A psychometric meta-analysis of 25 studies (24 unpublished, one published) from 1949 to 2024 (N = 2875) applied two restriction of range correction strategies—“considered” and “conservative”—to examine their impact on results. Range restriction occurs when variability in predictor scores is artificially reduced due to pre-selection, leading to underestimated correlations. The observed mean correlation was .19, increasing to .32 after correcting for indirect range restriction and measurement error in job performance. Different correction strategies had minimal impact, reaffirming GCA as a strong predictor of job performance in Sweden. However, further validation studies using Swedish samples are needed to refine correction methods and improve generalizability.

Keywords

General Cognitive Ability, meta-analysis, personnel selection, psychological test

Introduction

General Cognitive Ability (GCA) is one of the most extensively studied constructs in psychology (Spearman, 1904) and has been central to personnel selection since the early development of intelligence tests (Binet and Simon, 1916). While World War I is often cited as the starting point for large-scale GCA testing in selection (Yerkes, 1921), research suggests that several European countries implemented these assessments before North America (Salgado et al., 2010). The widespread adoption of GCA tests in military settings later extended into professional selection systems, making them a key component of modern personnel selection globally (Steiner, 2012).

The predictive validity of General Cognitive Ability for job performance has been extensively documented in North America (Hunter and Hunter, 1984) and Europe (Hülsheger et al., 2007; Salgado et al., 2004). However, Scandinavian and Nordic countries have contributed only sporadically to this research, and Sweden, in particular, remains largely absent from the international research landscape. While studies exist on intelligence and occupational choices in Swedish samples (Hemmingsson et al., 2006) and on individual differences and job performance published by Swedish researchers (Sjöberg et al., 2012), only one peer-reviewed study based on original Swedish data has been identified in major academic databases (Annell et al., 2015). The lack of published research raises concerns about whether findings from North America and other European countries can be generalized to the Swedish labor market, given potential differences in selection practices, labor market policies, and educational systems.

Another key motivation for this meta-analysis is the replication crisis in psychology (Schmidt, 2009). Given increasing concerns about research reproducibility and robustness, broadening the empirical foundation of GCA studies is essential. While most previous meta-analyses rely predominantly on North American and Western European data, Sweden has been underrepresented, despite the widespread use of GCA tests in academic, military, and professional selection. Expanding the knowledge base is crucial, even if cultural differences are assumed to have a limited impact on GCA’s predictive validity.

Finding published studies on GCA and job performance in Sweden proved nearly impossible, necessitating the use of grey literature, including unpublished studies, technical reports, and dissertations. Without incorporating unpublished data, the evidence base would be extremely limited or nonexistent, making this study impossible to conduct. Since GCA’s predictive validity has been firmly established in other regions, assessing its relevance in Sweden requires utilizing all available data sources. Excluding grey literature would not only leave a significant knowledge gap but also risk introducing publication bias, as published studies tend to favor statistically significant findings while disregarding null results (Hopewell et al., 2007). Incorporating grey literature helps mitigate publication bias, ensuring a more accurate and representative estimate of GCA’s predictive validity. However, accessing unpublished data presents an obvious challenge. Fortunately, there is a strong tradition among practitioners in personnel selection to both conduct and document studies on the relationship between GCA and job performance. The willingness of these researchers and consultants to share their documentation has made it possible to carry out this study.

Thus, this study makes an important contribution by leveraging previously unpublished primary studies, adding to the broader body of North American and European research on GCA and job performance. It conducts a meta-analysis of Swedish studies on GCA and job performance to provide the first comprehensive estimate of this relationship in Sweden.

General Cognitive Ability

The fact that individuals differ in their levels of job performance makes it essential for organizations, and applicants, to identify and hire the highest performers. Individual differences in GCA contribute significantly to explaining differences between individuals in many vital areas of life (Gottfredson, 1997a; Jensen, 1998; Neisser, 1996).

Based on the positive correlation between students’ rankings across different subjects in school grades, Spearman (1904) suggested that this shared variance represents a general factor, the g factor, of cognitive ability. In his two-factor theory, Spearman proposed that individual differences in the true score of any ability measurement are attributed to two factors: the g factor, which is common to all GCA assessments, and a second factor specific to each individual measure of mental ability.

In 1939, Holzinger and Swineford proposed the first hierarchical model of intelligence with a general factor at the top and several uncorrelated specific ability factors below, and although Spearman’s g factor and the hierarchical model have been criticized (e.g., Thurstone, 1947), accumulated research has provided solid evidence for the robustness and soundness of the hierarchical model and for the relevance of the g factor (Carroll, 1993). Competing theories (e.g., Guilford, 1988; Sternberg, 1985) are certainly not absent, but suffer, at least for the time being, from a lack of empirical support (Jensen, 1998). Since the publication of Spearman’s paper in 1904, more than a century of empirical research has demonstrated the pervasive influence of GCA in such various areas as academic achievement, occupational attainment, socioeconomic status, divorce, and even age of death (e.g., Cucina et al., 2024; Gottfredson, 1997b; Hemmingsson et al., 2006; Sackett et al., 2024).

Today, it is fair to say that there is broad consensus in the scientific community concerning the hierarchical structure of cognitive ability, the existence of the GCA factor, and the definition of the construct. GCA does not represent a narrow academic intelligence, but will manifest itself in almost any realm of activity that involves active information processing. A definition proven to be useful in applied psychology is the one presented by Gottfredson (1997b), which was first published in the Wall Street Journal in 1994 as part of an editorial written by Gottfredson and signed by a number of colleagues. In their words, GCA ‘is a very general mental capability that, among other things, involves the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly and learn from experience’ (Gottfredson, 1997b: 13).

Thus, the degree to which an individual is able to learn, adapt, deal with complexity, and process job-relevant information appears to determine his or her work behavior in general. In addition, the relationship between GCA and job performance has been found to be linear, which implies that higher levels of GCA are consistently related to higher levels of job performance, and that there is no point where a higher level of GCA is negatively related to job performance (Sackett et al., 2008). Demographics such as ethnicity and gender as well as organizational, national, and cultural settings have not been found to moderate the relationship between GCA and job performance (Hunter and Hunter, 1984).

It should be mentioned that the empirical support for using lower-order factors for personnel selection purposes (i.e., for predicting job performance) is not as convincing. Specific abilities are generally less important for explaining behavior than GCA, and research suggests that the incremental validity of specific abilities (defined as ability factors unrelated to the general factor) in the prediction of performance and training outcomes is minimal when more general factors are taken into account (Ree and Carretta, 2022; Ree et al., 1994). The reason for this might be that most tasks that involve active information processing rely on a range of abilities rather than one specific ability.

Job performance

The main objective of collecting and combining data about applicants in personnel selection is to predict job performance. Accurately rank-ordering applicants based on predictions of their future job performance, and making hiring decisions based on this rank-ordering, constitutes the very essence of personnel selection (Schmidt and Hunter, 1998).

The conceptualization of job performance as a hierarchical construct, where general job performance represents the highest-order, most generalizable factor at the top of the performance taxonomy and specific performance domains are situated at lower levels, has received strong empirical support (Viswesvaran et al., 2005).

In occupational settings, the general factor of job performance is defined as ‘scalable actions, behaviors, and outcomes that employees engage in or bring about, which are linked with and contribute to organizational goals’ (Viswesvaran and Ones, 2000: 216). The general factor of job performance is an aggregation of the primary performance domains, and there is strong support for a structure of three primary job performance domains—task performance, contextual performance, and avoidance of counterproductive work behaviors (Rotundo and Sackett, 2002). All three domains contribute to the general factor of job performance and represent three distinctly different aspects of human behavior and performance in the workplace. Although many meta-analyses have focused on primary dimensions of job performance (e.g., Gonzalez-Mulé et al., 2014), this study investigates the overall construct of job performance.

GCA and job performance

Recently, Sackett et al. (2022) partially re-analyzed data from General Aptitude Test Battery (GATB) testing in the 1980s and substantially revised the earlier results. The authors suggested that the correlation reported in the 1980s was likely overestimated. Instead of .51 (Hunter, 1980), their new estimate was substantially lower at .31, primarily due to methodological issues related to range restriction corrections (e.g., Rydberg, 1968; Sackett et al., 2022), as described in more detail below. Building on this re-analysis, a completely new meta-analysis (Sackett et al., 2024), which includes studies from 2000 to 2021, suggests a current effect of only .22, which is substantially lower than in previous meta-analyses (Hunter and Hunter, 1984).

In the new meta-analysis by Sackett et al. (2024), it is stated that previous results (such as Schmidt and Hunter, 1998; Salgado et al., 2004) are based on “old” data, and that working life today looks different. It is postulated that we have transitioned from more traditional industrial jobs to performing more roles within the service sector, which may not require the same attributes. Sackett et al. (2024) also highlight that previous studies in North America primarily rely on a single GCA test, namely the General Aptitude Test Battery (GATB), making it difficult to generalize the results to GCA measured with other instruments.

Above all, sharp criticism is directed at the method used in earlier studies to calculate the so-called operational validity, specifically regarding whether and how to correct for what is known as range restriction. Restriction of range occurs when data for a predictor (in this case, GCA) are collected in a single validation study, where the same predictor has been directly or indirectly used for selection decisions on the same group that the study’s results are based on (Lang et al., 2010).

Direct range restriction occurs when a GCA test is used for selection decisions, such as when organizations set a threshold score on a GCA test for applicants to qualify. Indirect range restriction occurs when another predictor influences the distribution, such as when grades, highly correlated with GCA (Roth et al., 2015), are considered during the selection process. In both cases, when the relationship between GCA and job performance is examined, either in concurrent or predictive studies, the variation in GCA is reduced compared to what it would be if the study included all applicants, including those not selected.
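The effect of direct range restriction can be illustrated with a small simulation. The sketch below is illustrative only: the population correlation (.40), the cutoff at the applicant mean, the sample size, and all function names are assumptions and do not come from the studies analyzed here.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation for two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sd(x):
    """Sample standard deviation."""
    m = sum(x) / len(x)
    return math.sqrt(sum((a - m) ** 2 for a in x) / (len(x) - 1))


def simulate_direct_restriction(rho=0.40, n=20000, cutoff=0.0, seed=1):
    """Simulate applicants, then 'hire' only those above a GCA cutoff.

    rho, n, cutoff and seed are hypothetical illustration values.
    """
    rng = random.Random(seed)
    gca, perf = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        # Performance correlates rho with GCA in the full applicant pool.
        y = rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        gca.append(x)
        perf.append(y)
    hired = [(x, y) for x, y in zip(gca, perf) if x > cutoff]
    hired_gca = [x for x, _ in hired]
    hired_perf = [y for _, y in hired]
    return {
        "r_applicants": pearson(gca, perf),
        "r_hired": pearson(hired_gca, hired_perf),
        "u": sd(hired_gca) / sd(gca),  # incumbent SD / applicant SD
    }


result = simulate_direct_restriction()
```

In such a simulation the GCA standard deviation among those hired is smaller than among all applicants (u below 1), and the correlation in the hired group is lower than in the applicant pool, which is the pattern the corrections discussed in this study are meant to address.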

Schmidt and Hunter (1998) corrected for restriction of range in both predictive and concurrent studies using available estimates. In contrast, Sackett et al. (2024) did not apply range restriction corrections in concurrent studies, despite having relevant data. Their differing assumptions reflect a broader stance by Sackett et al. (2022), who argue that range restriction is often negligible in typical selection settings and therefore should not be corrected in concurrent or validity generalization studies based on concurrent designs. Criticism of the “new” method, which has not fully corrected correlations in all studies, quickly emerged. Ones and Viswesvaran (2023: 363) state that “many researchers consider it good science to be conservative, but conservative estimates are by definition biased. We believe it is more appropriate to strive for unbiased estimates since the research goal is to maximize the accuracy of the final estimates.” And Oh et al. (2023) simply argue that Sackett’s new way of correcting (or rather the lack of correction) is unreasonable and that more work is needed to fully understand the representativeness of estimates of restriction of range and optimal correction procedures under typical conditions.

Further, Bobko et al. (2025) argue against Sackett et al.’s conservative estimation approach, claiming that the method they use is fundamentally flawed and misleading because some values are corrected for range restriction while others are not when validities are directly compared. Based on these and other concerns, Bobko et al. (2025) outline an alternative “considered estimation” strategy for comparing predictors of job performance.

In this study, we report the positions we take to calculate the so-called operational validity, i.e., the practical validity of using a GCA test to predict overall job performance. Operational validity reflects the correlation between GCA scores and job performance in real-world settings, where range restriction and measurement error are factors that can underestimate validity.

By focusing on operational validity, we measure the strength of this relationship within actual hiring practices, providing a more accurate estimate of the GCA test’s predictive power under practical conditions. According to established guidelines, both the observed correlation between GCA and job performance and the corrected correlation, adjusted for range restriction in GCA and measurement error in job performance (i.e., operational validity), are reported (Binning and Barrett, 1989; Schmidt and Hunter, 2015).

Challenges of GCA testing in Sweden

GCA testing began in Sweden in the early 1900s (Jäderholm, 1914), and the Swedish Scholastic Aptitude Test (SweSAT), developed in the 1960s, was based on GCA principles. Such tests have also been widely used in military enlistment: since 1944, nine versions of the Enlistment Battery have been developed (Husén, 1948). Despite these advancements, GCA testing in Sweden has faced political and ideological opposition. During the 1960s and 1970s, it was criticized as a capitalist tool, limiting its acceptance in occupational contexts. This debate shaped the training of industrial-organizational (I-O) psychologists, creating a generational divide in attitudes toward GCA testing. Psychologists trained in the 1960s typically had a strong foundation in psychometrics, whereas those trained in later decades received limited education in this area, fostering hostility toward GCA testing, particularly in personnel selection. This resistance has also reduced the academic community’s interest in publishing studies on the relationship between GCA and job performance, making it impossible to conduct a meta-analysis based solely on Swedish peer-reviewed publications.

Consequently, this study, based on 25 studies conducted between 1949 and 2024, provides a unique contribution for practical recruiters in Sweden. It also complements the recently published meta-analysis of both published and unpublished studies (Sackett et al., 2024), which primarily focuses on North American samples.

Method

Sample and inclusion criteria

We adopted the same criteria as Sackett et al. (2024) for measuring GCA, which included diverse problem-solving items (e.g., excluding tests solely based on matrices). We included studies measuring overall job or task performance, rated by supervisors or through objective job-related behavior. Excluded were studies using self, peer, or subordinate ratings, those focusing on narrow performance aspects like citizenship or counterproductive behavior, and those targeting non-performance outcomes like turnover, job satisfaction, or training performance. As far as we know, no compilation of studies conducted in Sweden exists, so we expanded our search beyond the last 20 years to include all available Swedish and English databases.

A literature search was conducted across various publicly available electronic databases to gather relevant studies. Google Scholar and ProQuest Dissertations and Theses were utilized, using search terms such as “cognitive ability,” “job performance,” and “Swedish sample.”

To ensure a diverse dataset, outreach was conducted via LinkedIn, Facebook, and personal contacts to solicit studies directly from Swedish consultants and test publishers. Additionally, all Swedish university databases were searched for master’s theses or dissertations unavailable in other databases. Unfortunately, only one published peer-reviewed study was found (Annell et al., 2015). However, these efforts resulted in 25 observed GCA–job performance correlations, which formed the input for our analysis (see Appendix 1). The results were derived from four different types of sources: technical psychometric manuals for published psychological tests (e.g., Sjöberg et al., 2006), published books (e.g., Hubendick, 2023), internal reports from Swedish authorities (e.g., Andersson et al., 1968), and internal documents from private companies. Since almost all studies were unpublished, it was impossible to fully assess the quality of the data. However, even though they had not undergone peer review, there was extensive documentation in manuals and internal reports, which was made available to the authors. In cases of uncertainty, direct contact was made with the researchers who conducted the studies to clarify any ambiguities.

Analysis

In some studies, multiple correlations were reported rather than a single correlation. In cases where it was not possible to directly reproduce a correlation between GCA and job performance, we employed a composite formula instead of simply averaging the correlations. This approach ensures a more accurate representation of the correlation levels for each individual study, thereby avoiding potential underestimation of operational validity (Le et al., 2007; Schmidt and Hunter, 2015). In one study, we could not find the correlations between the three GCA tests used as predictors. In this specific case (Study 25), only the average was used to ensure the study’s inclusion in the analysis.
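To illustrate why a composite formula is preferred over simple averaging, the following minimal sketch computes the correlation between a unit-weighted sum of standardized predictors and a criterion. The function name and the example values (two subtests with validities of .20 and .24 and an intercorrelation of .50) are hypothetical and are not taken from any of the included studies.

```python
import math


def composite_correlation(r_xy, r_xx):
    """Correlation of a unit-weighted composite of standardized predictors
    with a criterion.

    r_xy: list of predictor-criterion correlations.
    r_xx: full predictor intercorrelation matrix (1.0 on the diagonal).
    """
    k = len(r_xy)
    if len(r_xx) != k or any(len(row) != k for row in r_xx):
        raise ValueError("r_xx must be a k x k matrix matching r_xy")
    # The variance of the composite is the sum of all matrix elements.
    composite_variance = sum(sum(row) for row in r_xx)
    return sum(r_xy) / math.sqrt(composite_variance)


# Hypothetical example: two GCA subtests from one study.
validities = [0.20, 0.24]
intercorrelations = [[1.0, 0.5],
                     [0.5, 1.0]]
composite_r = composite_correlation(validities, intercorrelations)
simple_average = sum(validities) / len(validities)
```

Whenever the predictors are less than perfectly intercorrelated, the composite correlation exceeds the simple average of the individual correlations, which is why averaging tends to understate the validity of the overall GCA measure.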

A psychometric meta-analysis was conducted using the “psychmeta” package in R (Dahlke and Wiernik, 2019). The Schmidt and Hunter approach, utilized in “psychmeta” and combined with a random effects model, offers a comprehensive framework for handling the complexities often encountered in meta-analytic research. In this study, we will report the positions we take to calculate the operational validity, based on the “considered estimation” strategy (Bobko et al., 2025), but to understand how this might affect the results, we complement the analysis with a “conservative estimation” (Sackett et al., 2022).

As previously described, restriction of range refers to a reduction in the observed score variance of a sample (e.g., the selected applicants) compared to the variance of the entire population (e.g., all applicants to the position). In other words, restriction of range occurs when the variance of a predictor in a sample is reduced due to some form of pre-selection (Sackett and Yang, 2000). In practice, this involves comparing the standard deviation (SD) of the applicant sample with that of the incumbent sample, allowing for the computation of a u ratio (incumbent SD/applicant SD). This u ratio is then applied in correction formulas, such as Thorndike’s (1949) Case II formula, to adjust the observed correlation for range restriction effects. By accounting for the variance reduction caused by pre-selection, these corrections provide a more accurate estimate of the true relationship between predictors (e.g., GCA) and job performance. This ensures that selection tools are evaluated based on their actual predictive validity, rather than an artificially deflated correlation due to restricted variance in the incumbent sample.
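The u ratio and the Case II correction described above can be sketched as follows. This is a minimal illustration of the standard formula, not the code used in the analysis; the function names and the example standard deviations are assumptions.

```python
import math


def u_ratio(sd_incumbents, sd_applicants):
    """u = restricted (incumbent) SD divided by unrestricted (applicant) SD."""
    return sd_incumbents / sd_applicants


def correct_case2(r, u):
    """Thorndike Case II correction for direct range restriction.

    r: correlation observed in the restricted (incumbent) sample.
    u: incumbent SD / applicant SD on the predictor.
    """
    if u <= 0:
        raise ValueError("u must be positive")
    big_u = 1.0 / u
    return (r * big_u) / math.sqrt(1.0 + r ** 2 * (big_u ** 2 - 1.0))


# Hypothetical example: incumbents show 70% of the applicant SD.
u = u_ratio(sd_incumbents=7.0, sd_applicants=10.0)
corrected = correct_case2(0.19, u)
```

With u equal to 1 (no restriction) the function returns the observed correlation unchanged; the smaller u becomes, the larger the upward correction.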

Range restriction is a widespread issue in selection research. While recent years have seen intense debate on this topic without a clear consensus, it remains essential to clearly describe our approach to this issue. Of the 25 studies, 13 were predictive, meaning that GCA test data were collected before performance measures. In these 13 predictive studies, researchers and/or the study documentation provided individual estimates of restriction of range, which were therefore used in the analysis. Among the remaining 12 studies (concurrent validations), five involved researchers who had access to the same applicant population as the incumbents in the validation study. These five studies were therefore corrected for restriction of range. In the remaining concurrent studies, where no information on restriction of range was available, the results were calculated without adjustment.

For the main analysis, the indirect restriction of range method was applied to the 13 predictive studies and the five concurrent validity studies (Hunter et al., 2006). Although the approach described by Hunter et al. may not be ideal under certain breached assumptions, it typically offers a more accurate assessment of operational validity than the alternatives (i.e., either omitting correction or presuming direct range restriction). Consequently, it is advisable to use this method when the information required for more sophisticated corrections is not available (Morris, 2023). For comparison, supplementary analyses will be presented in the results to examine the effects of our methodological decisions on the outcome (i.e., considered estimation) in contrast to more conservative estimation (i.e., Sackett et al., 2024).
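For readers unfamiliar with the indirect range restriction correction, the sketch below follows the stepwise procedure commonly attributed to Hunter et al. (2006). It is a hedged illustration and not the implementation used by the “psychmeta” package; the function name, the applicant-pool predictor reliability of .85, and the u value of .70 are assumptions chosen for the example.

```python
import math


def correct_indirect_rr(r_obs, u_x, rxx_applicant, ryy_incumbent):
    """Operational validity under indirect range restriction (sketch).

    r_obs: observed correlation in the incumbent sample.
    u_x: incumbent SD / applicant SD on observed predictor scores.
    rxx_applicant: predictor reliability in the applicant pool (assumed known).
    ryy_incumbent: criterion reliability in the incumbent sample.
    """
    # Step 1: predictor reliability in the restricted (incumbent) sample.
    rxx_incumbent = 1.0 - (1.0 - rxx_applicant) / u_x ** 2
    if rxx_incumbent <= 0:
        raise ValueError("u_x and rxx_applicant imply a non-positive reliability")
    # Step 2: correct for measurement error in predictor and criterion.
    rho_restricted = r_obs / math.sqrt(rxx_incumbent * ryy_incumbent)
    # Step 3: range restriction ratio on predictor true scores.
    u_t = math.sqrt((u_x ** 2 - (1.0 - rxx_applicant)) / rxx_applicant)
    # Step 4: Case II formula applied with u_t to the true-score correlation.
    rho_unrestricted = (rho_restricted / u_t) / math.sqrt(
        1.0 + rho_restricted ** 2 * (1.0 / u_t ** 2 - 1.0)
    )
    # Step 5: reintroduce predictor unreliability to obtain operational validity.
    return rho_unrestricted * math.sqrt(rxx_applicant)


# Hypothetical inputs: only .61 (criterion reliability) appears in this study.
operational_validity = correct_indirect_rr(
    r_obs=0.19, u_x=0.70, rxx_applicant=0.85, ryy_incumbent=0.61
)
```

For the same observed u, this procedure yields a larger correction than a direct (Case II) correction applied to observed scores, because restriction on true scores is more severe than restriction on observed scores. This is one reason why the considered and conservative strategies can produce different estimates.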

Of the 25 studies, 20 utilized supervisory ratings of overall job performance, while the remaining five employed objective criteria (i.e., production results). As an example of objective measures, time studies of tractor drivers and their efficiency in loading can be mentioned, where the ratio between the number of working hours and the number of stoppages during loading served as a measure of work performance. No studies presented reliability data (i.e., intraclass coefficients) that were usable for correcting the observed correlation. Estimates from the latest update of reliability levels for supervisory ratings were obtained (Zhou et al., 2024): .61 for non-managerial jobs and .48 for managerial jobs using Schmidt and Hunter’s meta-analysis method. For studies that had objective criteria, the reliability was set to 1.00. We chose these estimates even though later analyses have shown slightly higher reliability levels for direct supervisory ratings (Speer et al., 2024). Except for two, all studies included in the analysis involved non-managerial jobs. The descriptive data for all included studies are presented in Appendix 1.
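The correction for measurement error in the criterion can be sketched as a simple disattenuation step using the reliability values reported above. The dictionary keys and the function name are hypothetical labels introduced for illustration.

```python
import math

# Criterion reliabilities as stated in the text; the keys are illustrative.
CRITERION_RELIABILITY = {
    "supervisor_nonmanagerial": 0.61,
    "supervisor_managerial": 0.48,
    "objective": 1.00,
}


def disattenuate_criterion(r, criterion_type):
    """Correct a correlation for unreliability in the criterion only."""
    ryy = CRITERION_RELIABILITY[criterion_type]
    return r / math.sqrt(ryy)


# Example: the same observed correlation under each criterion type.
corrected_by_type = {
    name: disattenuate_criterion(0.19, name) for name in CRITERION_RELIABILITY
}
```

Because the reliability of objective criteria was set to 1.00, correlations from those studies are left unchanged by this step, whereas correlations based on supervisory ratings are adjusted upward.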

Results

Figure 1 presents a forest plot of the corrected correlation effect sizes and 95% confidence intervals for the 25 studies included in the meta-analysis. Each row represents a study, with the black square indicating the point estimate of the correlation between General Cognitive Ability (GCA) and job performance. Square size reflects the study’s weight, typically based on sample size or precision, and the horizontal line shows the confidence interval.

Figure 1.

Forest plot.

The vertical dashed line at zero indicates the null effect. Studies whose intervals do not cross this line are statistically significant at the .05 level. The diamond at the bottom represents the overall effect size from the random-effects model, with its center showing the mean estimate and its width the 95% confidence interval. Its position to the right of zero suggests a positive relationship between GCA and job performance.

Outliers may be influential if their inclusion or exclusion unduly affects the meta-analysis conclusions. To investigate whether any studies deviated from the main result, we used Viechtbauer’s (2010) approach within the R metafor package to identify outliers and potentially influential studies. This approach defines outliers as studies with absolute studentized deleted residuals greater than 1.96, meaning they deviate significantly from the mean effect size. The analysis identified two studies (Studies 16 and 20) as outliers. One of these shows a relatively low correlation and the other a relatively high correlation compared to what might be expected, indicating that both deviate significantly from the overall trend in the meta-analysis.
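The logic of studentized deleted residuals can be sketched as a leave-one-out procedure. The code below is a simplified illustration and not the metafor implementation: it assumes a DerSimonian-Laird estimate of between-study variance (metafor's default estimator differs), and the effect sizes and sampling variances are invented for the example.

```python
import math


def dl_random_effects(y, v):
    """DerSimonian-Laird random-effects fit: returns (mean, var_mean, tau2)."""
    w = [1.0 / vi for vi in v]
    sum_w = sum(w)
    mu_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, y))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    return mu, 1.0 / sum(w_re), tau2


def studentized_deleted_residuals(y, v):
    """For each study, compare it with the model fitted without that study."""
    y, v = list(y), list(v)
    residuals = []
    for i in range(len(y)):
        y_rest = y[:i] + y[i + 1:]
        v_rest = v[:i] + v[i + 1:]
        mu, var_mu, tau2 = dl_random_effects(y_rest, v_rest)
        # Residual scaled by its own sampling variance, the leave-one-out
        # between-study variance, and the uncertainty of the mean.
        residuals.append((y[i] - mu) / math.sqrt(v[i] + tau2 + var_mu))
    return residuals


def flag_outliers(y, v, cutoff=1.96):
    """Indices of studies whose absolute residual exceeds the cutoff."""
    z = studentized_deleted_residuals(y, v)
    return [i for i, zi in enumerate(z) if abs(zi) > cutoff]


# Hypothetical data: five similar studies and one clearly deviating study.
effects = [0.20, 0.22, 0.18, 0.21, 0.19, 0.80]
variances = [0.01] * 6
flagged = flag_outliers(effects, variances)
```

In this invented example only the last study is flagged, because the model fitted to the other five studies places the mean far from its value.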

Table 1 provides a detailed breakdown of the meta-analysis results. The column labeled k indicates the number of studies that contributed to the meta-analysis, while N represents the total sample size across those studies. The mean observed correlation is shown as $\bar{r}$, and its variability is captured by the observed standard deviation ($SD_{r}$). $SD_{res}$ represents the residual standard deviation of the observed correlations, that is, the variability that remains after sampling error has been accounted for. Next, $\bar{\rho}$ refers to the operational validity, the adjusted estimate of the corrected correlation. The variability of the corrected correlations is given by $SD_{r_{c}}$, while $SD_{\rho}$ represents the residual standard deviation of $\bar{\rho}$, showing the variability left after correcting for statistical artifacts.
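The observed-correlation quantities described above can be illustrated with a bare-bones computation in the Schmidt and Hunter tradition. This is a simplified sketch under stated assumptions: it uses plain sample-size weights and omits the small-sample adjustments that a package such as “psychmeta” may apply, so it is not expected to reproduce Table 1 exactly. The input correlations and sample sizes are invented.

```python
import math


def bare_bones(rs, ns):
    """Sample-size-weighted summary of observed correlations (sketch)."""
    total_n = sum(ns)
    r_bar = sum(n * r for r, n in zip(rs, ns)) / total_n
    # Observed (weighted) variance of the correlations.
    var_r = sum(n * (r - r_bar) ** 2 for r, n in zip(rs, ns)) / total_n
    # Expected sampling-error variance, weighted by sample size.
    var_e = sum(n * (1.0 - r_bar ** 2) ** 2 / (n - 1) for n in ns) / total_n
    # Residual variance cannot be negative.
    var_res = max(0.0, var_r - var_e)
    pct_var = 100.0 if var_r == 0 else min(100.0, 100.0 * var_e / var_r)
    return {
        "k": len(rs),
        "N": total_n,
        "r_bar": r_bar,
        "sd_r": math.sqrt(var_r),
        "sd_res": math.sqrt(var_res),
        "pct_var": pct_var,
    }


# Hypothetical input: three studies with 100 participants each.
summary = bare_bones([0.10, 0.20, 0.30], [100, 100, 100])
```

When the expected sampling-error variance is at least as large as the observed variance, the residual standard deviation is set to zero and the percentage of variance accounted for is capped at 100, which is the pattern seen in several rows of Table 1.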

Table 1.

Meta-analysis results.

GCA–Job performance	k	N	$\bar{r}$	$SD_{r}$	$SD_{res}$	$\bar{\rho}$	$SD_{r_{c}}$	$SD_{\rho}$	% Var	95% CI	80% CR
All data (considered estimation)	25	2875	.193	.101	.043	.325	.137	.000	100.000	[.268, .381]	[.325, .325]
Excluding influential studies (considered estimation)	23	2082	.220	.078	.000	.348	.110	.000	100.000	[.301, .396]	[.348, .348]
All data (conservative estimation)	25	2875	.193	.101	.051	.300	.139	.028	95.819	[.243, .357]	[.263, .338]
Excluding influential studies (conservative estimation)	23	2082	.222	.078	.000	.325	.114	.000	100.000	[.276, .374]	[.325, .325]

Note. k = number of studies contributing to meta-analysis; N = total sample size; $\bar{r}$ = mean observed correlation; $SD_{r}$ = observed standard deviation of $r$; $SD_{res}$ = residual standard deviation of $r$; $\bar{\rho}$ = operational validity; $SD_{r_{c}}$ = observed standard deviation of corrected correlations ($r_{c}$); $SD_{\rho}$ = residual standard deviation of $\rho$; % Var = percentage of variance in $\bar{\rho}$ accounted for by sampling error and measurement error; CI = confidence interval around $\bar{\rho}$; CR = credibility interval around $\bar{\rho}$. Correlations corrected individually. Considered estimation = Estimation with correction for indirect restriction of range and measurement error in job performance; Conservative estimation = Estimation with correction for direct restriction of range and for measurement error in job performance; no correction for restriction of range was applied to concurrent validation studies.

The percentage of variance in $\bar{\rho}$ explained by sampling and measurement errors is labeled as % Var. The table also provides the CI (95% confidence interval) around $\bar{\rho}$, which reflects the precision of the estimate, and the CR (80% credibility interval) around $\bar{\rho}$, indicating the range within which individual true corrected correlations are likely to fall across different samples.
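The distinction between the two intervals can be made concrete with a short sketch. It assumes that the standard error of $\bar{\rho}$ is the standard deviation of the corrected correlations divided by the square root of k, and it uses a normal approximation for the critical values; an analysis based on the t distribution would give somewhat wider intervals. The function names are illustrative.

```python
import math
from statistics import NormalDist


def confidence_interval(rho_bar, sd_rc, k, level=0.95):
    """Interval for the precision of the mean corrected correlation.

    Assumes SE = sd_rc / sqrt(k) and a normal critical value.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    se = sd_rc / math.sqrt(k)
    return rho_bar - z * se, rho_bar + z * se


def credibility_interval(rho_bar, sd_rho, level=0.80):
    """Interval for the spread of true validities across settings."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return rho_bar - z * sd_rho, rho_bar + z * sd_rho


# Values from the first row of Table 1 (all data, considered estimation).
ci_low, ci_high = confidence_interval(0.325, 0.137, 25)
cr_low, cr_high = credibility_interval(0.325, 0.000)
```

The confidence interval narrows as the number of studies grows, whereas the credibility interval depends only on the residual standard deviation. When that residual is zero, the credibility interval collapses to a single point, as in three of the four rows of Table 1.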

The meta-analysis results across 2875 individuals and 25 independent samples yielded an observed mean correlation ($\bar{r}$) of .19 between GCA and job performance using considered estimation. Correcting for unreliability in the criterion and for indirect range restriction in the predictor produced a higher mean corrected correlation ($\bar{\rho} = .32$).

The results with and without the outliers (i.e., after excluding the influential studies) are also presented in Table 1. Excluding these studies resulted in a higher observed mean correlation ($\bar{r} = .22$) with performance and an increased operational validity ($\bar{\rho} = .35$). Although the results differ in magnitude, the 95% confidence intervals overlap, indicating that the results do not differ significantly.

To test whether the results differ significantly when we follow Sackett’s recommendations not to correct for restriction of range in concurrent validation studies and, instead of correcting for indirect restriction of range, correct for direct restriction of range, we used a so-called conservative estimation (Bobko et al., 2014). Correcting for unreliability in the criterion and for direct restriction of range in the predictor produced, as expected, a slightly lower mean corrected correlation ($\bar{\rho} = .30$; 95% CI [.24, .36]), and when the influential studies were removed from the analysis, the effect increased ($\bar{\rho} = .32$; 95% CI [.28, .37]). In summary, the result can be interpreted as stable, especially because $SD_{\rho}$ is zero or near zero when the influential studies are excluded and when we apply a more conservative estimation.

As a supplementary analysis, earlier studies (1949–1958) were separated from later studies (2005–2024), referred to as old and new studies, respectively. We also categorized the studies as concurrent or predictive, as using supervisory ratings or objective performance measures, and as involving low- or high-complexity jobs.¹ These factors were considered potential moderators of the overall results. A meta-regression using a mixed-effects model, entering all moderators in the analysis, was conducted to assess significant differences among the moderators, accounting for both within-study sampling error and between-study variance, using the metafor package in R (Viechtbauer, 2010).

The meta-regression explained 12.96% of the heterogeneity, suggesting that the moderators accounted for a modest portion of the between-study variance. However, the overall test of moderators was not statistically significant, indicating that the moderators included in the analysis did not explain much of the variance. This suggests that unexamined factors may be influencing the differences, or that the selected moderators are weak predictors. Further exploration of other moderators or adjustments to the model may be necessary to explain the remaining heterogeneity. For descriptive purposes, we present the results of the moderation analyses in Appendix 2.

Discussion

The purpose of this study was to investigate whether previous findings on the relationship between GCA and job performance can be generalized to Swedish conditions. The almost complete absence of published studies on Swedish samples necessitated a search for lesser-known studies conducted outside academia. Our review identified studies dating back to the 1940s, which is a strength, as it enables the documentation of previously unknown research over an extensive period.

Earlier research, such as Hunter (1980), suggested a correlation of .51, but more recent analyses indicate that this was likely an overestimate. Sackett et al. (2022) re-evaluated historical data and found a substantially lower estimate ($\bar{ρ}$ = .31), primarily due to methodological considerations regarding range restriction corrections. Our findings align with this trend: the operational validity is considerably lower than in studies from the late 1980s and early 2000s, yet substantially higher than in the most recent major meta-analysis by Sackett et al. (2024), which estimated an operational validity of only .22.

These results highlight the importance of contextual factors and methodological choices when evaluating GCA’s predictive validity. Our study contributes by broadening the empirical base, particularly for Swedish conditions, and by emphasizing the need for further research on the methodological nuances that influence operational validity estimates. Based on this study’s data, a cautious conclusion is that GCA correlates positively with job performance, though perhaps not as strongly as previously thought (Richardson and Norgate, 2015).

The overall effect ($\bar{ρ}$ = .32; 95% CI [.28, .37]) affirms GCA’s utility in personnel selection. However, a comprehensive utility analysis is needed to assess the practical value of specific methods or instruments. Such an analysis weighs the benefits and costs of different selection methods and so grounds decisions in explicit reasoning. Without a utility framework, the choice of methods becomes arbitrary and risks financial losses for the organization (Cascio and Boudreau, 2008).
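One common way to frame such a utility analysis is the Brogden-Cronbach-Gleser model, in which the expected gain equals the number hired × tenure × validity × the monetary SD of job performance × the mean standardized predictor score of those hired, minus testing costs. The Python sketch below is an illustration under stated assumptions, not an analysis from this study: it assumes strict top-down selection from normally distributed applicant scores, and every input except the validity of .32 (monetary SD, tenure, selection ratio, costs, head counts) is a hypothetical placeholder.

```python
from statistics import NormalDist


def mean_z_of_selected(selection_ratio: float) -> float:
    """Mean standard score of those hired under top-down selection (normal scores)."""
    if not 0.0 < selection_ratio < 1.0:
        raise ValueError("selection_ratio must lie strictly between 0 and 1")
    std_normal = NormalDist()
    cutoff = std_normal.inv_cdf(1.0 - selection_ratio)
    return std_normal.pdf(cutoff) / selection_ratio


def selection_utility(n_hired: int, tenure_years: float, validity: float,
                      sd_y: float, selection_ratio: float,
                      n_applicants: int, cost_per_applicant: float) -> float:
    """Brogden-Cronbach-Gleser utility gain in the monetary unit of sd_y."""
    gain = (n_hired * tenure_years * validity * sd_y
            * mean_z_of_selected(selection_ratio))
    return gain - n_applicants * cost_per_applicant


# Hypothetical scenario: 50 applicants, 10 hired, 3 years of tenure,
# SD of yearly performance worth 100,000 (any currency), 500 per test.
print(selection_utility(10, 3.0, 0.32, 100_000.0, 0.20, 50, 500.0))
```

Re-running the scenario with a validity of .22 in place of .32 shows how strongly the conclusion of a utility analysis depends on which operational validity estimate is adopted.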

Interestingly, the largest study, based on police applicants, shows the lowest effect (an observed correlation of .08) and is also the only published study (Annell et al., 2015). It seems unlikely that police officers, in particular, would be unaffected by cognitive ability in their work. Instead, the unusually low effect may be due to the aggregation of all regional police groups in Sweden into a single sample. This aggregation could introduce average differences between groups, causing the correlation to approach zero. Other methods used in the same study, such as the structured interview, also showed implausibly low, near-zero results, which supports this interpretation.²
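The aggregation argument can be illustrated with a toy example: if test scores relate to ratings within each region, but the regions differ in their average rating level for reasons unrelated to test scores, then pooling the regions attenuates the correlation. The numbers below are invented for illustration and are not data from Annell et al. (2015). The perfect within-region correlation is deliberately unrealistic so that the attenuation is easy to see.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented data: identical test scores in two regions, with region B's
# ratings shifted upward by a constant (e.g., more lenient raters).
test_scores = [1, 2, 3, 4, 5]
region_a_ratings = [1, 2, 3, 4, 5]
region_b_ratings = [11, 12, 13, 14, 15]

r_region_a = pearson_r(test_scores, region_a_ratings)
r_region_b = pearson_r(test_scores, region_b_ratings)
r_pooled = pearson_r(test_scores + test_scores,
                     region_a_ratings + region_b_ratings)

print(r_region_a, r_region_b, r_pooled)
```

The between-region shift adds criterion variance that the predictor cannot explain, so the pooled coefficient falls well below either within-region coefficient.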

Another possible explanation is the criterion used in the study, which was based on supervisory ratings. Many supervisors evaluating their colleagues may not have had sufficient opportunities to closely observe their performance, potentially leading to less reliable assessments. This lack of direct observation could have weakened the observed relationship between GCA and job performance.

The highest correlation was found in an older study (Study 20) of tree fellers, with an observed correlation of .44. This study, conducted over 50 years ago and long before the advent of meta-analyses, is included as an appendix in a recently published book about Lennart Bergström (Hubendick, 2023), one of Sweden’s first occupational psychologists. The results were calculated manually, and it is possible that some test results were excluded from the internal report simply due to the time required for manual calculations. If only the highest correlations were reported, this effect may be somewhat overestimated. However, the study’s design was superior to that of many later studies. All tests were administered under highly controlled conditions, and the test quality was high, likely covering the entire intended construct, with as many as nine subtests constituting the GCA measure. The criterion measure was also collected under controlled conditions and served as a good marker for all tasks performed by the tree fellers, which may have contributed to the relatively strong effect.

However, when these two studies were removed from the analysis, the effect increased slightly, from .32 to .35. We can therefore be confident that they do not materially alter our conclusion that Swedish samples show results similar to North American data. This holds even though the more recent studies from the 2000s show a higher operational validity (corrected correlation .31, see Appendix 2) than the latest meta-analysis, which is based primarily on North American data (Sackett et al., 2024).

The overall result of this Swedish meta-analysis can be compared with the latest meta-analyses from North America. In the military domain, using hands-on military job proficiency measures as criteria, Cucina et al. (2024) found an average effect of .44, considerably higher than the .22 that Sackett et al. (2024) found in civilian occupations. This Swedish study falls between the two estimates, raising the question of why there is such a difference. While range restriction correction may influence the results, it is unlikely to be the sole explanation. Future research should examine the criteria used, with more primary studies on highly complex jobs. Older and newer studies should also be analyzed more thoroughly using consistent methods for range restriction correction in GCA and measurement error in job performance.

Limitations

The fact that most of the included studies have not undergone peer review is, of course, a weakness. The absence of a thorough review of the reports that formed the basis for this meta-analysis is unfortunate. On the other hand, these studies represent solid work by occupational psychologists in organizations focusing on recruitment and selection, and several studies are well-documented in internal reports from consulting firms, public authorities, and test providers, ensuring a quality-assured result. Another limitation, compared to other meta-analyses, is that this Swedish study is relatively small in terms of sample size and number of studies, which may call the results into question. Therefore, this meta-analysis should be supplemented in the future with studies from Sweden and perhaps our other Nordic neighbors.

This study, like others that have examined the relationship between cognitive ability and job performance, has focused on the general factor (e.g., Sackett et al., 2024). However, Nye et al. (2022) showed that the narrow cognitive abilities least correlated with GCA added significant incremental validity for predicting task performance, training performance, and organizational citizenship behavior. These findings can guide future studies to explore subfactors of GCA as well as different dimensions of job performance.

Furthermore, as an anonymous reviewer pointed out, the lack of reliability studies on supervisory ratings in Swedish samples is a limitation. Therefore, it is important that future studies collect data to assess whether these estimates can be generalized to Swedish conditions.

Conclusion

This Swedish meta-analysis provides valuable insights into the relationship between General Cognitive Ability and job performance in a Swedish context. While the results align with international findings, they indicate a slightly lower operational validity than earlier studies, though higher than the most recent meta-analysis by Sackett et al. (2024). The study reinforces GCA’s positive correlation with job performance but underscores the importance of adopting a utility framework for assessing practical application in different contexts. Our hope is that this study will inspire further validation studies in Sweden and that research funding will also support this important area of work and organizational psychology.

Appendix

Appendix 2.

Moderation analysis.

Moderators	k	N	$\bar{r}$	$SD_{r}$	$SD_{res}$	$\bar{ρ}$	$SD_{r_{c}}$	$SD_{ρ}$	% Var	95% CI	80% CR
Design
Concurrent	12	1250	.243	.008	.000	.367	.111	.000	100.000	[.296, .437]	[.367, .367]
Predictive	13	1625	.155	.101	.048	.281	.042	.023	96.450	[.189, .373]	[.242, .320]
Criteria
Supervisory ratings	20	2668	.184	.093	.039	.312	.127	.000	100.000	[.252, .372]	[.312, .312]
Objective production	5	207	.308	.140	.000	.452	.185	.000	100.000	[.222, .681]	[.452, .452]
Complexity
Low (Job Zone 1–3)	21	2526	.190	.102	.051	.319	.139	.000	100.000	[.256, .382]	[.319, .319]
High (Job Zone 4–5)	4	349	.217	.098	.000	.370	.137	.000	100.000	[.153, .588]	[.370, .370]
Time
Year 1949–1958	6	357	.282	.110	.000	.422	.147	.000	100.000	[.268, .576]	[.422, .422]
Year 2004–2024	19	2518	.180	.094	.042	.308	.131	.000	100.000	[.245, .371]	[.308, .308]

Acknowledgements

In light of the complexities involved, the authors would like to express their sincere appreciation to the practicing work psychologists and organizations who enabled the publication of this article by granting access to previously unavailable data. Special thanks are due to Gudrun Hubendick, whose permission to examine long-forgotten archival materials from the early 1950s has contributed valuable historical depth to the present study.

Declaration of conflicting interests

Both authors are affiliated with commercial organizations that provide psychological testing services on the Swedish market. This potential conflict of interest has been transparently acknowledged, and every effort has been made to ensure that the research has been conducted and presented with scientific integrity and objectivity.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Anders Sjöberg

Author biographies

Anders Sjöberg holds the title of Docent in Psychology from Stockholm University and is a co-owner of Psychometrics Sweden AB, which provides psychological assessment tools to private and public organizations in Sweden. His research focuses on psychometric methods within Classical Test Theory and Item Response Theory, and on developing selection tools for specialized roles in the military and police. He also studies interventions to enhance Psychological Capital (PsyCap)—hope, efficacy, resilience, and optimism—in organizational settings.

Sofia Sjöberg holds a PhD in Psychology from Stockholm University and is the CEO of Psychometrics Sweden AB, which provides psychological assessment tools to private and public organizations in Sweden. She has extensive experience in test development within the clinical domain and currently focuses her research on personnel selection, utility analysis, and the development of occupational psychological assessments. Sofia has a particular interest in the construct validation of psychological tests, with an emphasis on ensuring that instruments accurately capture the traits and abilities they are designed to measure.

References

Andersson, Bergström, Krantz, et al. (1968) Urval av skogstraktorförare med psykologiska test [Selection of forestry tractor operators using psychological tests]. Stockholm: Forskningstiftelsen skogsrabetaren.

Annell, Lindfors, Sverke (2015) Police selection – implications during training and early career. Policing: An International Journal 38(2): 221–238.

Bergvall (2016) Pioneering assessment centers within local government in Sweden: Gothenburg’s search for better leaders. In: Povah, Thornton GC III (eds) Assessment Centres and Global Talent Management. Routledge, pp. 129–138.

Binet, Simon (1916) The Development of Intelligence in Children. Arno Press.

Binning, Barrett (1989) Validity of personnel decisions: A conceptual analysis of the inferential and evidential bases. Journal of Applied Psychology 74(3): 478–494.

Bobko, Roth, Le, et al. (2025) The need for “considered estimation” versus “conservative estimation” when ranking or comparing predictors of job performance. International Journal of Selection and Assessment 33(1): e12489.

Carroll (1993) Human Cognitive Abilities: A Survey of Factor-Analytic Studies. Cambridge University Press.

Cascio, Boudreau (2008) Investing in People: Financial Impact of Human Resource Initiatives. Pearson Education.

Cucina, Burtnick, De la Flor Musso, et al. (2024) Meta-analytic validity of cognitive ability for hands-on military job proficiency. Intelligence 104: 101628.

Dahlke, Wiernik (2019) psychmeta: An R package for psychometric meta-analysis. Applied Psychological Measurement 43(5): 415–416.

Gonzalez-Mulé, Mount, Oh I-S (2014) A meta-analysis of the relationship between general mental ability and non-task performance. Journal of Applied Psychology 99(6): 1222–1243.

Gottfredson (1997a) Why g matters: The complexity of everyday life. Intelligence 24(1): 79–132.

Gottfredson (1997b) Mainstream science on intelligence: An editorial with 52 signatories, history, and bibliography. Intelligence 24(1): 13–23.

Guilford (1988) Some changes in the structure of intellect model. Educational and Psychological Measurement 48(1): 1–4.

Hemmingsson, Melin, Allebeck, Lundberg (2006) The association between cognitive ability measured at ages 18–20 and mortality during 30 years of follow-up: A prospective observational study among Swedish males born 1949–51. International Journal of Epidemiology 35(3): 665–670.

Holzinger, Swineford (1939) A study in factor analysis: The stability of a bifactor solution. Supplementary Educational Monograph 48. University of Chicago Press.

Hopewell, McDonald, Clarke, Egger (2007) Grey literature in meta-analyses of randomized trials of health care interventions. Cochrane Database of Systematic Reviews 2007(2). Art. No.: MR000010.

Hubendick (2023) Lennart Bergström 1912–1984. Den tillämpade psykologins pionjär i Sverige [Lennart Bergström 1912–1984. The pioneer of applied psychology in Sweden]. Gudrun Hubendick.

Hunter (1980) Validity generalization for 12,000 jobs: An application of synthetic validity and validity generalization to the General Aptitude Test Battery (GATB). U.S. Department of Labor, Employment Service.

Hunter, Hunter (1984) Validity and utility of alternative predictors of job performance. Psychological Bulletin 96(1): 72–98.

Hunter, Schmidt, Le (2006) Implications of direct and indirect range restriction for meta-analysis methods and findings. Journal of Applied Psychology 91(3): 594–612.

Husén (1948) Konstruktion och standardisering av svenska krigsmaktens inskrivningsprov. 1948 års version [Construction and standardization of the Swedish Armed Forces’ Enlistment Battery, 1948 version]. Lund: Håkan Ohlssons Boktryckeri.

Hülsheger, Maier, Stumpp (2007) Validity of general mental ability for the prediction of job performance and training success in Germany: A meta-analysis. International Journal of Selection and Assessment 15(1): 3–18.

Jäderholm (1914) Undersökningar över intelligensmätningarnas teori och praxis [Investigations concerning the theory and practice of measurement of intelligence]. Almqvist & Wiksell.

Jensen (1998) The g Factor: The Science of Mental Ability. Praeger Publishers.

Lang JWB, Kersting, Hülsheger (2010) Range shrinkage of cognitive ability test scores in applicant pools for German governmental jobs: Implications for range restriction corrections. International Journal of Selection and Assessment 18(3): 321–328.

Le, Oh I-S, Shaffer, Schmidt (2007) Implications of methodological advances for the practice of personnel selection: How practitioners benefit from recent developments in meta-analysis. Academy of Management Perspectives 21(3): 6–15.

Morris (2023) Meta-analysis in organizational research: A guide to methodological options. Annual Review of Organizational Psychology and Organizational Behavior 10(1): 225–259.

Neisser (1996) Intelligence: Knowns and unknowns. American Psychologist 51(2): 77–101.

Nye, Ma, Wee (2022) Cognitive ability and job performance: Meta-analytic evidence for the validity of narrow cognitive abilities. Journal of Business and Psychology 37(6): 1119–1139.

Oh I-S, Mendoza, Le (2023) To correct or not to correct for range restriction: That is the question. Industrial and Organizational Psychology 16(3): 322–327.

Ones, Viswesvaran (2023) A response to speculations about concurrent validities in selection: Implications for cognitive ability. Industrial and Organizational Psychology: Perspectives on Science and Practice 16(3): 358–365.

Ree, Carretta (2022) Thirty years of research on general and specific abilities: Still not much more than g. Intelligence 91: 1–4.

Ree, Earles, Teachout (1994) Predicting job performance: Not much more than g. Journal of Applied Psychology 79(4): 518–524.

Richardson, Norgate (2015) Does IQ really predict job performance? Applied Developmental Science 19(3): 153–169.

Roth, Becker, Romeyke, et al. (2015) Intelligence and school grades: A meta-analysis. Intelligence 53: 118–137.

Rotundo, Sackett (2002) The relative importance of task, citizenship, and counterproductive performance to global ratings of job performance: A policy-capturing approach. Journal of Applied Psychology 87(1): 66–80.

Rydberg (1968) Bias in Prediction: On Correction Methods. Almqvist & Wiksell.

Sackett, Yang (2000) Correction for range restriction: An expanded typology. Journal of Applied Psychology 85(1): 112–118.

Sackett, Borneman, Connelly (2008) High stakes testing in higher education and employment: Appraising the evidence for validity and fairness. American Psychologist 63(4): 215–227.

Sackett, Zhang, Berry, Lievens (2022) Revisiting meta-analytic estimates of validity in personnel selection: Addressing systematic overcorrection for restriction of range. Journal of Applied Psychology 107(11): 2040–2068.

Sackett, Demeke, Bazian, et al. (2024) A contemporary look at the relationship between general cognitive ability and job performance. Journal of Applied Psychology 109(5): 687–713.

Salgado, Anderson, Hülsheger (2010) Employee selection in Europe: Psychotechnics and the forgotten history of modern scientific employee selection. In: Farr, Tippins (eds) Handbook of Employee Selection. Routledge, pp. 921–942.

Salgado, Anderson, Moscoso, et al. (2004) A meta-analytic study of general mental ability validity for different occupations in the European community. Journal of Applied Psychology 88(6): 1068–1081.

Schmidt (2009) Shall we really do it again? The powerful concept of replication is neglected in the social sciences. Review of General Psychology 13(2): 90–100.

Schmidt, Hunter (1998) The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin 124(2): 262–274.

Schmidt, Hunter (2015) Methods of Meta-Analysis: Correcting Error and Bias in Research Findings (3rd edn). Sage.

Sjöberg, Forssén (2006) Predicting job performance. Swedish version. Manual. Psykologiförlaget AB.

Sjöberg, Näswall, Sverke (2012) Using individual differences to predict job performance: Correcting for direct and indirect range restriction. Journal of Scandinavian Psychology 55(4): 368–373.

Spearman (1904) General intelligence: Objectively determined and measured. American Journal of Psychology 15(2): 201–292.

Speer, Delacruz, Wegmeyer, Perrotta (2024) Meta-analytical estimates of interrater reliability for direct supervisor performance ratings: Optimism under optimal measurement designs. Journal of Applied Psychology 109(3): 456–467.

Steiner (2012) Personnel selection across the globe. In: Schmitt (ed.) The Oxford Handbook of Personnel Assessment and Selection. Oxford University Press, pp. 740–767.

Sternberg (1985) Beyond IQ: A Triarchic Theory of Human Intelligence. Cambridge University Press.

Thorndike (1949) Personnel Selection: Test and Measurement Techniques. Wiley.

Thurstone (1947) Multiple Factor Analysis. University of Chicago Press.

U.S. Department of Labor, Employment and Training Administration (n.d.) O*NET Resource Center. Available at: https://www.onetcenter.org (accessed 14 December 2024).

Viechtbauer (2010) Conducting meta-analyses in R with the metafor package. Journal of Statistical Software 36(3): 1–48. https://doi.org/10.18637/jss.v036.i03.

Viswesvaran, Ones (2000) Perspectives on models of job performance. International Journal of Selection and Assessment 8(4): 216–226.

Viswesvaran, Schmidt, Ones (2005) Is there a general factor in ratings of job performance? A meta-analytic framework for disentangling substantive and error influences. Journal of Applied Psychology 90(1): 108–131.

Yerkes (1921) Psychological Examining in the United States Army. Memoirs of the National Academy of Science, Vol. XV. U.S. Government Printing Office.

Zhou, Sackett, Shen, Beatty (2024) An updated meta-analysis of the interrater reliability of supervisory performance ratings. Journal of Applied Psychology 109(6): 949–970.

General Cognitive Ability and job performance in personnel selection in Sweden: A meta-analysis
