Abstract
The Ravens Advanced Progressive Matrices (APM) is a widely used measure of general intelligence (
Keywords
The Ravens Progressive Matrices Test, developed by Raven (1941) as a measure of general intelligence (
Due to its nonverbal format, the APM is purported to be a culturally fair, unbiased measure of fluid intelligence (Cattell, 1963), educative ability (J. Raven et al., 1993), or, as we will refer to it, general intelligence (
In answer to these various limitations, Arthur and Day (1994) developed a 12-item short form of the APM (which we call APM-12), with an administration time of 15 min. Several studies have shown that this 12-item form has acceptable psychometric properties (e.g., Cronbach’s alpha, test–retest reliability, convergent validity; see Arthur, Tubre, Paul, & Sanchez-Ku, 1999, for review). However, this short form shows relatively low and variable internal consistency (IC). For example, Cronbach’s alphas range from .58 to .66 for the short form itself and from .72 to .73 for the 12 short-form items extracted from the full 36-item version (Arthur & Day, 1994).
More recently, Hamel and Schmittmann (2006) have argued that the complete 36-item APM can be administered as a 20-min speed test. Scores on this speeded form of the APM show strong correlations with scores on slower timed (40 min,
The purpose of the current study was to develop a medium-form version of the APM that resulted in higher IC than the 12-item version (APM-12), but shorter administration time than the full 36-item APM (APM-36)—a combination of features that might be useful for time-constrained and mass-testing situations. Here, we report the development and construct validity of this 18-item scale.
Study 1: Scale Construction and Construct Validity
Method
Participants
A total of 633 students (198 male, 435 female) from three southwestern universities participated in this study as a partial requirement for experimental course credit. The mean age for participants was 20.92,
Measures
The Ravens Advanced Progressive Matrices 18-Item Short Form (APM-18)
This 18-item short-form version of the APM is printed in a booklet format on 8½″ by 11″ white paper, with each test item printed on a separate page. The first four pages of the test booklet contain three example items (Practice Items 1, 5, and 9 from APM-36) to explain the task.
The 18 actual test items were derived by adding six items from the longer 36-item version (J. Raven et al., 1993) to Arthur and Day’s (1994) published 12-item version. Arthur and Day selected Items 1, 4, 8, 11, 15, 18, 21, 23, 25, 30, 31, and 35 from the 36-item APM based on a set of three decision rules, which can be summed up as (a) dividing the APM into 12 three-item sections based on difficulty; (b) taking the item with the highest item-total correlation from each section; and (c) in the case of a tie, including the item whose exclusion from the full test resulted in the largest drop in IC. Following these same rules, we added six more items of increasing difficulty: two that were easy (96% and 75% of examinees from the normative sample answered correctly), two that were moderate (50% and 48% of examinees from the normative sample answered correctly), and two that were difficult (37% and 32% of examinees from the normative sample answered correctly). These items (2, 20, 22, 24, 34, and 32) were integrated in order of difficulty to mimic their presentation order in the original APM.
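The three decision rules summarized above can be sketched in code. The following is a minimal illustration, not the procedure actually run by Arthur and Day (1994) or by us: the function names are hypothetical, the data are simulated, sections are formed from observed proportions correct, and a corrected item-total correlation (item versus the sum of the remaining items) is assumed because the text does not say which variant was used.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (examinees x items) array of item scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

def select_items(scores, section_size=3):
    """Pick one item per difficulty section (hypothetical helper)."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    total = scores.sum(axis=1)
    # Rule (a): order items from easiest to hardest, then cut into sections.
    order = np.argsort(-scores.mean(axis=0), kind="stable")
    # Corrected item-total correlation (assumption: item vs. rest of test).
    item_total = np.array([
        np.corrcoef(scores[:, j], total - scores[:, j])[0, 1]
        for j in range(n_items)
    ])
    # Alpha of the full test with each item removed, used for tie-breaking.
    alpha_without = np.array([
        cronbach_alpha(np.delete(scores, j, axis=1)) for j in range(n_items)
    ])
    chosen = []
    for start in range(0, n_items, section_size):
        section = order[start:start + section_size]
        # Rule (b): highest item-total correlation within the section.
        best = item_total[section].max()
        tied = section[np.isclose(item_total[section], best)]
        # Rule (c): among ties, keep the item whose removal lowers alpha most.
        chosen.append(int(tied[np.argmin(alpha_without[tied])]))
    return sorted(chosen)

# Simulated 0/1 responses (300 examinees, 36 items); not data from this study.
rng = np.random.default_rng(0)
ability = rng.normal(size=(300, 1))
difficulty = np.linspace(-2, 2, 36)
p_correct = 1 / (1 + np.exp(-(ability - difficulty)))
simulated = (rng.random((300, 36)) < p_correct).astype(int)
selected = select_items(simulated)
print(selected)  # zero-based indices, one per three-item section
```

With 36 items and three-item sections, the sketch returns 12 indices, mirroring the structure of the APM-12 selection.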
Procedure
The new APM-18 test was given in classroom settings with several examinees at a time. This was done because this test was developed as a measure of
Analyses
All statistical analyses were conducted using SAS Version 8.2 (SAS Institute, 1999). Cronbach’s alphas and bivariate correlations were computed using the PROC CORR procedure. Tests for mean differences between sexes were calculated through
Results
IC estimates were computed using Cronbach’s alpha. The APM-18 scale showed moderate reliability (α = .79). This alpha is lower than normative IC reports for the APM-36 (α = .84; Forbes, 1964), but higher than those for the APM-12 (α = .58 to .66; see Arthur et al., 1999). Furthermore, the alpha of the APM-18 was larger than that of the embedded APM-12 (α = .73). Table 1 shows, for each of the APM-18 items, the item-total correlation, the item difficulty, and the alpha of the overall scale if the item is deleted. As seen, deleting any item reduces the overall reliability of the scale, suggesting that all items should be retained.
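For readers who want to reproduce this kind of item analysis outside SAS, the following is a minimal sketch of Cronbach's alpha together with the three per-item statistics reported in Table 1. It assumes dichotomous (0/1) item scores in an examinees-by-items array; the function names are illustrative, the item-total correlation is the corrected (item versus rest) variant by assumption, and the small array at the end is a toy example rather than study data.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (examinees x items) array of item scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)    # variance of total scores
    return k / (k - 1) * (1 - item_var / total_var)

def item_analysis(scores):
    """Per-item difficulty, item-rest correlation, and alpha if deleted."""
    scores = np.asarray(scores, dtype=float)
    rows = []
    for j in range(scores.shape[1]):
        rest = np.delete(scores, j, axis=1)
        rows.append({
            "item": j + 1,
            # Difficulty as the proportion of examinees answering correctly.
            "difficulty": scores[:, j].mean(),
            # Corrected item-total correlation (item vs. sum of other items).
            "item_total_r": np.corrcoef(scores[:, j], rest.sum(axis=1))[0, 1],
            # Alpha of the scale with this item removed.
            "alpha_if_deleted": cronbach_alpha(rest),
        })
    return rows

# Toy data: 6 examinees, 4 items (illustrative only).
toy = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 0],
])
overall_alpha = cronbach_alpha(toy)
table = item_analysis(toy)
```

An item is a candidate for removal only when its "alpha_if_deleted" value exceeds the overall alpha, which is the comparison described in the paragraph above.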
Item-Total Correlations and Item Difficulty for the APM-18.
The mean APM-18 score was 9.73,
Hierarchical GLMs were tested to explore whether the apparent differences in male and female APM-18 scores might have been indirectly attributable to the relationship between age and APM-18 scores. This model defined the APM-18 score as the criterion variable, with the ordered predictor variables being age and then sex. The hierarchical model was designed to allow age to absorb as much variance as possible, with sex entered into the model only afterward. Using this model, both GLMs indicated a significant effect for age (
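The ordered entry of predictors described above can be sketched as two nested least-squares fits. This is a minimal illustration under stated assumptions, not the SAS analysis we ran: the function names are hypothetical, sex is assumed to be coded 0/1, and the data are simulated. Age is entered first, and the contribution of sex is then assessed from the increment in explained variance.

```python
import numpy as np

def r_squared(predictors, y):
    """R-squared from an ordinary least-squares fit with an intercept."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    centered = y - y.mean()
    return 1 - (resid @ resid) / (centered @ centered)

def hierarchical_steps(y, age, sex):
    """Step 1: age only. Step 2: age plus sex (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    age = np.asarray(age, dtype=float)
    sex = np.asarray(sex, dtype=float)
    n = len(y)
    r2_age = r_squared(age[:, None], y)
    r2_full = r_squared(np.column_stack([age, sex]), y)
    delta = r2_full - r2_age
    # F test for one added predictor: df = (1, n - 3) with two predictors.
    f_sex = delta / ((1 - r2_full) / (n - 3))
    return {"r2_age": r2_age, "r2_age_sex": r2_full,
            "delta_r2_sex": delta, "f_sex": f_sex}

# Simulated scores (not study data): small negative age effect, small sex effect.
rng = np.random.default_rng(1)
sim_age = rng.normal(21, 4, 500)
sim_sex = rng.integers(0, 2, 500).astype(float)
sim_score = 10 - 0.1 * sim_age + 0.8 * sim_sex + rng.normal(0, 3, 500)
steps = hierarchical_steps(sim_score, sim_age, sim_sex)
print(steps)
```

Because age is entered first, any variance it shares with sex is credited to age, so the increment for sex is a conservative estimate of the sex effect.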
Discussion
The results presented here suggest that the APM-18 may serve as a useful compromise between the lower-reliability APM-12 and the much longer APM-36. The hierarchical GLMs identified both age and sex as significant predictors of APM-18 scores, with younger individuals and males generally scoring higher. These results are consistent with many previous studies of general intelligence (e.g., Jackson & Rushton, 2006). The results of Study 1, however, do not test the convergent validity of this scale relative to other measures of intelligence. Study 2 was designed to do this.
Study 2: Convergent Validity
Study 2 was conducted to assess the convergent validity of the APM-18 with other measures of intelligence, academic achievement, and personality. To do so, we tested two separate subsamples (
Method
Participants
Sample 1 comprised 193 students (94 male, 99 female) from an introductory psychology course at the University of Arizona. Mean age of participants was 19.11,
Sample 2 comprised 229 students (65 male, 164 female) from various undergraduate courses at the University of New Mexico. Mean age of participants was 20.19,
Measures
APM-18
The APM-18 consisted of the same items identified in Study 1. In Sample 1, the form was presented first in a series of measures examining adult intelligence. In Sample 2, it was presented in the middle of a questionnaire packet concerning personality, creativity, sexual behavior, and intelligence.
The SILS
The SILS (Zachary, 1986) is a timed (10 min per subscale), 60-item self-report measure that examines both verbal intelligence (40 items) and abstract intelligence (20 items). The test is considered appropriate for motivated, average English-speaking test takers aged 14 through adulthood. Validities and norms published in the manual were taken from a sample of 322 army recruits. Split-half reliabilities are reported as .87 for Vocabulary, .89 for Abstraction, and .92 for the total score.
The MHV-MC
The MHV-MC (J. Raven et al., 1997) is a 68-item self-administered multiple-choice vocabulary test designed to complement the APM-36. Whereas the APM aimed to measure an individual’s ability to solve novel problems and think in novel ways (i.e.,
Academic performance
Academic performance was measured by self-reported GPAs and SAT scores in Sample 1. Sample 2 participants were asked for SAT and ACT scores. A variety of studies have identified moderate to strong correlations between these academic achievement and aptitude measures and a range of other traits, including intelligence, personality, and psychopathology (Barton, Dielman, & Cattell, 1971; Brown, 1994; Dyer, 1987; Mouw & Khanna, 1993).
NEO-FFI
The NEO-FFI (Costa & McCrae, 1992) is the most widely used measure in research on the Five-Factor model of personality. It is a shortened version of the 240-item Revised NEO Personality Inventory (NEO-PI-R; Costa & McCrae, 1992), comprised of 60 items that measure five global personality factors (12 items per factor): Openness to Experience, Conscientiousness, Extraversion, Agreeableness, and Neuroticism. In our version, participants rated degree of agreement with statements about their personalities and behavioral propensities on a 5-point scale ranging from −2 (
Verbal and drawing creativity tasks
Participants completed six 2-min verbal creativity tasks and eight 1-min drawing creativity tasks (Miller & Tal, 2007). Because a mating-oriented mind-set promotes creativity (Griskevicius, Cialdini, & Kenrick, 2006), participants were asked to complete these tasks as creatively as possible with the intention of attracting a romantic partner. Examples of verbal tasks included writing answers to thought-provoking questions, such as “How would you keep a marriage exciting after the first couple of years?” “What do you hope the world will be like in a 100 years?” and “Imagine that all clouds had really long strings hanging from them—strings hundreds of feet long. What would be the implications of that fact for nature and society?” There were two types of drawing tasks, four abstract (e.g., “Please draw an abstract symbol, pattern, or composition that represents your happiness as a child doing a favorite activity”) and four representational (e.g., “In the space below, please draw an animal that you admire for its strength, grace, speed, or beauty”). Each participant’s responses to each of the 14 creativity tasks were scored independently by four raters on a 1- to 5-point creativity scale. The resulting composite verbal creativity and drawing creativity measures showed high interrater reliability and IC (Cronbach’s alphas = .91 and .90, respectively; Miller & Tal, 2007).
Results
Sample 1
IC estimates were computed using Cronbach’s alpha. The APM-18 showed moderate IC (α = .71), with the embedded APM-12 yielding a slightly lower value (α = .63). Although these internal consistencies are lower than those reported in Study 1, they are still moderate in strength.
The mean APM-18 score was 10.68,
Descriptive Statistics for Measures of Intelligence in Sample 1 (
Indicates mean differences as a function of sex.
As seen in Table 3, both the APM-18 and embedded APM-12 correlated significantly with most of the other measures of intelligence and academic achievement and aptitude used in this sample. Specifically, both APM scales correlated positively and significantly with the Shipley Abstraction scale and self-reported SAT scores, and these were their strongest correlations. This is not surprising as the APM is designed to be a measure of
Correlations Among APM-18, APM-12, and Other Intelligence and Academic Achievement Measures in Sample 1.
Sample 2
IC estimates were again computed using Cronbach’s alpha. As in Study 1 and Sample 1 of this study, the APM-18 showed moderate reliability (α = .79), whereas the embedded APM-12 again showed slightly lower reliability (α = .74). The mean APM-18 score was 9.53,
Descriptive Statistics for Measures of Intelligence, Academic Achievement, and Personality in Sample 2 (
Indicates mean differences as a function of sex.
As seen in Table 5, both the APM-18 and embedded APM-12 scores were significantly positively related to verbal creativity (
Correlations Among APM-18, APM-12, and Other Intelligence, Academic Achievement, and Personality Measures in Sample 2.
Discussion
Each sample in Study 2 used different methods of assessing the convergent validity of the APM-18. Sample 1 focused on relationships between the APM-18 and other standard measures of intelligence and academic achievement (e.g., verbal intelligence tests, self-reported GPA, and SAT scores), whereas Sample 2 examined the relationship between the APM-18, creativity, self-reported ACT scores, and Big Five personality traits. Both samples confirmed that the APM-18 is related to these measures in a predictable manner. Generally speaking, both the APM-18 and the embedded APM-12 showed the same pattern of correlations with the other measures used in these studies. However, the higher IC of the APM-18 suggests that it may be better at detecting individual variation in
Conclusion
Each of the 18 items used in this new APM-18 test was chosen to maintain the progressive difficulty of both the long form (APM-36) and the short form (APM-12). Unsurprisingly, although the APM-18’s reliability was lower than that of the APM-36, it was higher than that of the APM-12 developed by Arthur and Day (1994). Furthermore, the patterns of correlation with other measures of intelligence are virtually identical to those of the APM-12, which has, in previous studies, been shown to mimic the APM-36 results (Arthur & Day, 1994; Arthur et al., 1999). Combined with an average administration time of 17.53 min (25 min maximum), these findings suggest that the APM-18 may work well as a compromise for researchers who want a reasonably accurate measure of general intelligence in a relatively short amount of time. The cross-validation in the three samples reported here is an initial attempt to collect normative data for the APM-18. Our results may generalize only to other college students. However, the APM-18’s short administration time, high IC, reasonable validity, and ease of administration by paper and pencil in large college classroom settings make it ideal for behavioral science studies where researchers want a reasonably fast, accurate intelligence score as part of a larger questionnaire battery.
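The ordering of reliabilities across the three forms is what classical test theory would lead one to expect. As a hedged illustration only (this calculation was not part of the analyses reported here), the Spearman-Brown prophecy formula can be applied to the normative APM-36 alpha cited earlier (.84; Forbes, 1964). The formula assumes that the shortened forms are made of items parallel to the full set, which the selected APM items need not be.

```python
def spearman_brown(reliability, length_ratio):
    """Predicted reliability after changing test length by length_ratio.

    length_ratio is new length / old length (e.g., 18 / 36 = 0.5).
    """
    return length_ratio * reliability / (1 + (length_ratio - 1) * reliability)

alpha_36 = 0.84  # normative APM-36 alpha cited in the text (Forbes, 1964)

predicted_18 = spearman_brown(alpha_36, 18 / 36)  # about .72
predicted_12 = spearman_brown(alpha_36, 12 / 36)  # about .64

print(round(predicted_18, 2), round(predicted_12, 2))
```

Under these assumptions the predicted values fall near the observed ranges for the APM-18 (.71 to .79) and the APM-12 (.58 to .74), which is consistent with test length accounting for much of the difference between forms.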
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research and/or authorship of this article.
