Abstract
Even in multicultural nations, interracial relationships and marriages are quite rare, one reflection of assortative mating. A relatively unexplored factor that could explain part of this effect is that people may find members of their own racial group more attractive than members of other groups. We tested whether there is an own-race preference in attractiveness judgments, and also examined the effect of familiarity by comparing the attractiveness ratings given by participants of different ancestral and geographic origins to faces of European, East Asian and African origin. We did not find a strong own-race bias in attractiveness judgments, but neither were the data consistent with an effect of familiarity, suggesting an important role for other factors in determining the patterns of assortative mating observed.
Introduction
Despite increasing racial integration in many countries, interracial relationships remain quite rare. For example, even in the multicultural United States, in the 2010 US Census, only 12.5% of “Black” husbands, 2% of “White” husbands and 0.4% of “Asian” husbands were married to women of other racial backgrounds, although such pairings are on the increase. By “race” we simply mean broad ancestral region of origin (for example, European or African) rather than national or cross-national origins. This rarity is one manifestation of the fact that people have a strong tendency to form romantic relationships with people who are similar to them on a wide range of dimensions, a phenomenon typically referred to as assortative mating (Thiessen and Gregg, 1980; Buss, 1985; Little, Burt and Perrett, 2006). This tendency to pair-bond with similar individuals means that reproductive partners are often similar in background, political and religious beliefs, social and economic status, intelligence, attractiveness and personality characteristics, among others. If any of these traits is influenced by genes (and many are known to be, e.g., Jang, Livesley, and Vernon, 1996; Alford, Funk, and Hibbing, 2005; Gray and Thompson, 2004), then assortative mating can have major evolutionary consequences, by producing new couplings between otherwise unrelated genes (Buss, 1985), by exaggerating existing gene-based differences in the population (Thiessen and Gregg, 1980; Buss, 1985), and/or by increasing the probability that parents and offspring share genes (Thiessen and Gregg, 1980). For this reason it is important to understand the mechanisms underpinning assortative mating.
While much of the tendency for same-race couples to form romantic relationships is likely explained by shared social groups, interests, beliefs and geography, one potentially important factor that has not yet been extensively considered is that people may perceive members of their own race to be more physically attractive than members of other races. Research suggests that physical attractiveness plays a major role in partner choice, particularly for men, but also for women (for reviews see Rhodes, 2006; Gallup and Frederick, 2010; Little, Jones, and DeBruine, 2011), and so it is plausible that people tend to choose own-race partners partially because they perceive own-race partners to be more attractive than other-race partners.
Only two studies have directly investigated whether people perceive own-race faces to be more attractive than other-race faces using more than one race of participant (Rhodes et al., 2001; Rhodes et al., 2005), and these studies present conflicting findings. Rhodes et al. (2001) found that ethnically Chinese participants expressed an own-race face preference (compared to Caucasian faces), but this preference only emerged for mathematically averaged female faces (not averaged male faces). Rhodes et al. (2005) found that Caucasian participants expressed an own-race face preference, but this preference only emerged for averaged male faces (not averaged female faces) and individual female faces (not individual male faces). Mixed-race averages or unmanipulated faces of mixed-race individuals were rated highest of all.
A variety of studies have indirectly assessed whether there is an own-race face preference. These are studies that were designed with other aims in mind, but nonetheless required participants of one or more racial groups to rate the attractiveness of own-race and other-race faces. Each of these indirect studies found an own-race face preference for at least one participant group, but they frequently contained only one participant group, rendering all preferences inconclusive. The extent of the preference varied from study to study, and the stimuli and participant race were generally not well controlled, and so it is difficult to draw firm conclusions (Clark and Clark, 1947; Bernstein, Lin, and McClellan, 1982; Cross and Cross, 1971; Cunningham et al., 1995; Wade, Irvine, and Cooper, 2004; Zebrowitz, Montepare, and Lee, 1993; Lewis, 2010; Lewis, 2011).
Most studies, including the two direct studies, did not attempt to minimize potential participant response biases. Both race and attractiveness are sensitive issues (Jones, 1996), which may render participants more likely to overestimate the attractiveness of other-race faces, or underestimate the attractiveness of own-race faces, to appear egalitarian or non-superficial.
Most studies, including the two direct studies, required participants to rate only same-sex faces or to rate both same-sex and opposite-sex faces (Bernstein, Lin, and McClellan, 1982; Cross and Cross, 1971; Cunningham et al., 1995; Rhodes et al., 2001; Rhodes et al., 2005). It is possible that heterosexual participants rate same-sex faces according to an objective standard of attractiveness (what they believe others would regard as attractive) rather than a subjective impression of attractiveness (what they personally regard as attractive), and this tendency may generalize to opposite-sex faces presented alongside same-sex faces, potentially masking genuine differences. We attempted to minimize these potential problems in the current study by providing a plausible cover story designed to encourage unbiased personal ratings of attractiveness, and by having heterosexual participants rate only opposite-sex faces.
We also used a design that enabled us to examine the well-reported effect of familiarity with particular kinds of faces on attractiveness judgments (reviewed by Rhodes, 2006), a phenomenon that is thought to underpin the attractiveness of mathematical averages of multiple faces (Langlois and Roggman, 1990; Langlois, Roggman, and Musselman, 1994). If this familiarity effect incorporates all the faces one has seen, as is suggested by short-term shifts in attractiveness induced by aftereffects (Rhodes et al., 2003), then the most commonly seen race in a particular region should be perceived as most attractive. The influence of face familiarity can be measured at 3 months of age, with infants preferring to look at the race of face with which they are most familiar (Kelly et al., 2007), and there is some evidence that this early experience may have long-term consequences, with data showing that mixed-race people are more likely to marry a partner of the same race as their opposite-sex parent (Jedlicka, 1980), and correlations between partners' hair and eye color and opposite-sex parents' hair and eye color (Little, Penton-Voak, Burt, and Perrett, 2003).
The current study is designed to produce as fair a test as practicable of whether people perceive own-race faces to be more attractive (on average) than other-race faces, and to examine the possible role of experience in this. In order to achieve this, familiarity with different races was varied both between groups and within groups. Familiarity with different races was varied between groups by using two Australian born-and-raised participant groups (Australian East-Asians and Australian Europeans) and a Hong Kong born-and-raised participant group (Hong Kong East-Asians). In Australia, people of European descent are the racial majority and people of East-Asian descent are a racial minority, whereas in Hong Kong, people of East-Asian descent are the racial majority and people of European descent are the racial minority. Hence Australian East-Asians and Europeans should be more familiar with European than East-Asian faces, whereas Hong Kong East-Asians should be more familiar with East-Asian than European faces. Familiarity with different races was also examined within groups by presenting African faces alongside East-Asian and European faces. In both Australia and Hong Kong people of African descent are very rare (less than 1% of population). Hence both Australian and Hong Kong participant groups should be less familiar with African faces than with East-Asian or European faces.
For comparison with previous studies, we also included mixed-race averages in the current study. Four previous studies have assessed the attractiveness of mixed-race faces relative to single-race faces. In two separate studies, Lewis (2010, 2011) found that “Mixed” race faces were rated as more attractive than either “Black” or “White” faces (although this only held for female faces in the 2011 paper). These studies used a high number of judged faces, but the faces were taken from Facebook pages, and so varied (possibly systematically) in lighting, camera angle, hairstyle, jewelry, etc., and were defined as being representatives of particular races only by membership of British race-relevant Facebook groups. They were also judged by a small number of participants of only one race (“White” British): 10 females and 8 males in one study (2011), and 20 participants of unspecified sex in the other (2010), and so the results may be influenced by idiosyncratic preferences of some of the participants. Of the other two studies, Rhodes et al. (2001) found no compelling evidence of a preference for mixed-race averaged/composite faces (there was no main effect of kind of averaged face; Caucasian, Chinese and mixed were rated equally high), but Rhodes et al. (2005) found that mixed-race faces (both averaged and individual) were perceived to be more attractive than single-race averaged and individual faces, and suggested that this effect may reflect an evolved preference for cues of health. Because the latter study had only two types of faces, it used only one type of mixed-race face: Caucasian-Asian. It would be interesting to know to what extent a preference for mixed-race faces extends to mixed-race faces outside the participants' experience. The current study uses three types of mixed-race faces (including mixes of races other than the participant's own race), and so is well equipped to assess the generality of any mixed-race face effect.
In Experiment 1 we measured perceived attractiveness, and in Experiment 2 we measured perceived masculinity and femininity to attempt to shed light on the results obtained in Experiment 1.
Experiment 1 - Attractiveness
Materials and Methods
Participants
One hundred and twenty (58M, 62F) university students participated in the experiment. Participants fell into one of 3 pre-existing groups: (1) Australian European – born and raised in Australia with only European ancestry; (2) Australian East-Asian – born and raised in Australia with only Chinese, Japanese, or Korean ancestry; or (3) Hong Kong East-Asian – born and raised in Hong Kong with only Chinese, Japanese, or Korean ancestry. Participants were instructed to indicate their ancestry by considering their previous three generations, such that ancestry was defined by reference to genealogy rather than self-identity.
Forty-one (20M, 21F) Australian Europeans were recruited from an introductory psychology course at Macquarie University in Australia, by circulating an advertisement on the course website. Forty-two (21M, 21F) Australian East-Asians were recruited from introductory psychology, law, economics, and business courses and the general student population at Macquarie University. Thirty-seven (17M, 20F) Hong Kong East-Asians were recruited from an introductory psychology course at the Chinese University of Hong Kong. Data about number of people known of various backgrounds, countries visited, and amount of Western media exposure for Hong Kong participants is reported in Table 1.
Acquaintances with and exposure to people of differing backgrounds.
Note: n = 120; F = Female; M = Male; Mean time spent visiting (months) is provided in parentheses.
Australian Europeans and Hong Kong East-Asians participated in return for partial fulfillment of a course requirement. Australian East-Asians recruited from the introductory psychology course participated in return for partial fulfillment of a course requirement, whilst those recruited elsewhere participated in return for $10.
Stimuli
One hundred and forty-four individual digitized color photographs of faces were used in the experiment: 24 faces per single-race face category (East-Asian, European, African) per sex. The 24 East-Asian faces per sex were sourced from the Asian Face Image Database ‘Postech Faces 01’ (PF01). These faces were primarily Korean faces. The 24 European faces per sex were sourced from Colin Tredoux's (University of Cape Town, South Africa) database. These faces were white South African faces (white South Africans are descendants of primarily Irish, English, and Dutch settlers, and the long history of apartheid in South Africa means that it is highly unlikely that white South Africans are mixed-race). The 24 African faces per sex were also sourced from Colin Tredoux's database. These faces were actual African faces, and the long history of apartheid in South Africa suggests that Africans are much less likely to be mixed-race than African-Americans.
Adobe® Photoshop® Creative Suite (CS) was used to remove any characteristic attributes (moles, etc.) and to rotate the faces such that pupils were horizontally aligned. All faces were resized so that the distance between the mid-point of the pupils and the top of each photo, between the outermost edges of each side of the face, and between the outermost edge of each side of the face and the closest vertical edge of the photo were all standardized. All faces were standardized on hairstyle by placing grey rectangular masks around each face, which were 10 cm wide (the exact width of the face), 1 cm above the hairline and 1 cm below the chin. See Figure 1 for low-resolution examples.

The compound stimuli used in the experiment.
Compound faces
Thirty-six compound faces (computer-generated averages of the individual faces) were created: three single-race compound faces per single-race category (East-Asian, European, and African) per sex (see Figure 1), and three mixed-race compound faces per mixed-race category (East-Asian/European, East-Asian/African, African/European) per sex (see Figure 2). Each of the three single-race compound faces per single-race category per sex was a computer-generated average of eight individual faces that were randomly selected (without replacement) from the 24 individual faces within each single-race category and sex. Each of the three mixed-race compound faces per mixed-race category per sex was a computer-generated average of: (i) four individual faces that were randomly selected (without replacement) from the 24 individual faces within one single-race category and sex, and (ii) four individual faces that were randomly selected (without replacement) from the 24 individual faces within another single-race category but the same sex.
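The sampling scheme for the compound faces can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used: the face identifiers, the helper names `partition_faces` and `build_mixed_sets`, and the fixed seed are hypothetical. We also assume that the three sets drawn within a category are disjoint; the text specifies sampling without replacement but does not say whether faces could recur across different compounds.

```python
import random

RACES = ["East-Asian", "European", "African"]
MIXES = [("East-Asian", "European"), ("East-Asian", "African"), ("African", "European")]


def partition_faces(face_ids, n_sets=3, set_size=8, rng=random):
    """Randomly split face_ids into n_sets disjoint sets of set_size faces."""
    chosen = rng.sample(face_ids, n_sets * set_size)  # sampling without replacement
    return [chosen[i * set_size:(i + 1) * set_size] for i in range(n_sets)]


def build_mixed_sets(faces_a, faces_b, n_sets=3, per_race=4, rng=random):
    """Pair four faces from one race with four from another, n_sets times."""
    parts_a = partition_faces(faces_a, n_sets, per_race, rng)
    parts_b = partition_faces(faces_b, n_sets, per_race, rng)
    return [a + b for a, b in zip(parts_a, parts_b)]


rng = random.Random(0)  # hypothetical seed, only for reproducibility of the sketch

# Hypothetical identifiers for the 24 faces of one sex in each category
faces = {race: [f"{race}-{i:02d}" for i in range(1, 25)] for race in RACES}

single_sets = {race: partition_faces(faces[race], rng=rng) for race in RACES}
mixed_sets = {mix: build_mixed_sets(faces[mix[0]], faces[mix[1]], rng=rng) for mix in MIXES}

# 9 single-race + 9 mixed-race compounds per sex, i.e. 36 across both sexes
n_compounds_per_sex = (sum(len(s) for s in single_sets.values())
                       + sum(len(s) for s in mixed_sets.values()))
```

Each entry in `single_sets` and `mixed_sets` lists the eight faces that would then be averaged into one compound face.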

The mixed race compound faces used in the experiment.
All compound faces were created using 'Sqirlz Morph', a freeware morphing application. Two hundred and thirty-three control points were placed on key facial features on each of the eight individual faces that were to be averaged into each compound face, and then 'Sqirlz Morph' averaged all the control points across all eight individual faces to create each compound face.
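The shape-averaging step amounts to taking the mean position of each corresponding control point across the faces being combined. The sketch below illustrates only that arithmetic; it is not a description of how 'Sqirlz Morph' works internally, it ignores the image warping and blending that a morphing tool also performs, and the function name and toy coordinates are hypothetical.

```python
def average_control_points(faces):
    """Return the point-wise mean of corresponding (x, y) control points.

    faces: list of faces, each a list of (x, y) tuples given in the same order
    (233 points per face in the study; any consistent number works here).
    """
    n_points = len(faces[0])
    if any(len(face) != n_points for face in faces):
        raise ValueError("every face needs the same number of control points")
    n_faces = len(faces)
    return [
        (sum(face[i][0] for face in faces) / n_faces,
         sum(face[i][1] for face in faces) / n_faces)
        for i in range(n_points)
    ]


# Toy example: two faces with three control points each (made-up coordinates)
toy_faces = [
    [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)],
    [(2.0, 2.0), (12.0, 2.0), (7.0, 10.0)],
]
mean_shape = average_control_points(toy_faces)
```

With eight faces per compound, the same call would take a list of eight point lists.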
Procedure
Participants were not told the true purpose of the experiment, but rather were told that it was designed to assess whether faces that are perceived as good looking are remembered better than faces that are not perceived as good looking. This was designed to reduce the impact of socially desirable responding, since we emphasized that the memory effect we were interested in depended on their personal ratings.
Participants were tested individually on 15” Apple G3 iMac computers. In order to encourage participants to use a subjective standard of attractiveness - that which they personally found attractive - they judged opposite-sex faces only. In total, participants were presented with 90 opposite-sex faces: 72 opposite-sex individual faces and 18 opposite-sex compound faces. The order of presentation of the faces was randomized, and each face was presented on screen for five seconds with a five second inter-stimulus interval (such that participants had 10 seconds in total to respond to each face). Participants were instructed to rate faces on a 9-point Likert scale according to how good looking they personally thought they were. The following markers for this scale were placed under each face: 1 = not at all good looking, 3 = slightly good looking, 5 = moderately good looking, 7 = quite good looking, 9 = extremely good looking. After rating all the faces, participants were administered a questionnaire that assessed demographic characteristics and familiarity with different races. Upon completion participants were debriefed and were administered either course credit or $10 in return for their participation.
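The composition of a rating session can be checked with a short sketch. The constants come from the description above; the function name, the tuple layout of a trial, and the seed are hypothetical, and the duration is simply the number of trials multiplied by the 10 seconds available per face.

```python
import random

N_RACES = 3                  # East-Asian, European, African
INDIVIDUAL_PER_RACE = 24     # individual opposite-sex faces per race
N_COMPOUND_CATEGORIES = 6    # 3 single-race + 3 mixed-race categories
COMPOUNDS_PER_CATEGORY = 3
SECONDS_PER_TRIAL = 5 + 5    # presentation time + inter-stimulus interval


def build_trial_order(seed=None):
    """Return all opposite-sex faces for one participant in random order."""
    individual = [("individual", race, i)
                  for race in range(N_RACES)
                  for i in range(INDIVIDUAL_PER_RACE)]
    compound = [("compound", category, i)
                for category in range(N_COMPOUND_CATEGORIES)
                for i in range(COMPOUNDS_PER_CATEGORY)]
    trials = individual + compound
    random.Random(seed).shuffle(trials)  # hypothetical seed, for reproducibility only
    return trials


trials = build_trial_order(seed=1)
session_seconds = len(trials) * SECONDS_PER_TRIAL
```

Under these assumptions the rating phase comprises 90 trials and lasts 900 seconds (15 minutes), excluding instructions and the questionnaire.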
Results
Since male and female participants were judging different faces (only opposite sex), their judgments were analyzed separately.
A 2 × 3 × 3 (compound vs. individual × face race × participant group) mixed factorial ANOVA was used to analyze the female ratings of single-race faces. The data are plotted in Figure 3. This analysis revealed a significant main effect of compound vs. individual (F(1, 59) = 499.1, p < 0.001), reflecting the fact that averaged faces were judged as better looking than individual faces, a significant main effect of face race (F(2, 118) = 105.69, p < 0.001), and a significant main effect of participant group (F(2, 59) = 13.91, p < 0.001). There were also significant 2-way interactions between compound vs. individual and participant group (F(2, 59) = 4.99, p = 0.010), face race and participant group (F(4, 118) = 10.82, p < 0.001), and compound vs. individual and face race (F(2, 118) = 499.1, p < 0.001). The 3-way interaction between compound vs. individual, face race and participant group approached significance (F(4, 118) = 2.41, p = 0.053). Overall, these data do not show an obvious own-race preference, since the European faces were rated as most attractive by all participants.

Mean “good looks” ratings given by female participants judging male faces.
A 2 × 3 × 3 (compound vs. individual × face race × participant group) mixed factorial ANOVA was also used to analyze the male ratings of single-race faces. The data are plotted in Figure 4. This analysis revealed a significant main effect of compound vs. individual (F(1, 55) = 459.33, p < 0.001), reflecting the fact that averaged faces were judged as better looking than individual faces, a significant main effect of face race (F(2, 110) = 118.68, p < 0.001), and a significant main effect of participant group (F(2, 55) = 15.57, p < 0.001). There were also significant 2-way interactions between face race and participant group (F(4, 110) = 3.82, p = 0.006), and compound vs. individual and face race (F(2, 110) = 22.72, p < 0.001). The 3-way interaction between compound vs. individual, face race and participant group was also significant (F(4, 110) = 3.23, p = 0.015), but unlike the female data, the 2-way interaction between compound vs. individual and participant group was not significant (F(2, 55) = 1.79, p = 0.177). The 3-way interaction reflects the fact that for individual faces there is a slight own-race preference, but for averaged faces, there is an overall preference for European faces, as there was in the data from female participants.

Mean “good looks” ratings given by male participants judging female faces.
The data from the judgments of the mixed-race compound stimuli were analyzed in two separate 3 × 3 (face race mix × participant group) mixed factorial ANOVAs, one for the male participants, and one for the female participants. The data are plotted in Figure 5. The analysis of the female participant data revealed a significant main effect of face race mix (F(2, 118) = 10.79, p < 0.001), a significant main effect of participant group (F(2, 59) = 19.74, p < 0.001), and a significant interaction between these factors (F(4, 118) = 6.04, p < 0.001). The male data also revealed a significant main effect of face race mix (F(2, 110) = 81.23, p < 0.001), and of participant group (F(2, 55) = 16.01, p < 0.001), and a significant interaction between these factors (F(4, 110) = 2.77, p = 0.031).

Mean “good looks” ratings given to the mixed race compound faces.
Across all analyses, the data show some relatively uninteresting group differences in the use of the rating scales – Hong Kong participants gave higher scores overall – and some unsurprising effects of facial averaging – compound faces are uniformly rated as more attractive than individual faces - but also some interesting differences in the patterns of ratings by different groups, as reflected in the significant interactions. These are discussed in more detail below.
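The degrees of freedom in the analyses above follow from the sample sizes (62 female and 58 male participants, three participant groups) and the number of levels of each within-subjects factor. The sketch below works through that bookkeeping for a design with one between-subjects factor, assuming sphericity; the function name and the dictionary keys are hypothetical, and it does not compute any F ratios.

```python
from math import prod


def mixed_anova_df(n_participants, n_groups, within_levels):
    """Degrees of freedom (effect, error) for a mixed factorial ANOVA.

    within_levels: tuple with the number of levels of each within-subjects
    factor involved in the effect, e.g. (3,) for face race or (2, 3) for the
    compound vs. individual x face race interaction.
    """
    between_df = n_groups - 1
    error_between = n_participants - n_groups
    within_df = prod(k - 1 for k in within_levels)
    return {
        "between": (between_df, error_between),
        "within": (within_df, within_df * error_between),
        "within_x_between": (within_df * between_df, within_df * error_between),
    }


female_face_race = mixed_anova_df(62, 3, (3,))     # face race effects, female raters
female_compound = mixed_anova_df(62, 3, (2,))      # compound vs. individual effects
female_three_way = mixed_anova_df(62, 3, (2, 3))   # effects involving both within factors
male_face_race = mixed_anova_df(58, 3, (3,))       # face race effects, male raters
```

For example, `female_face_race` gives (2, 59) for participant group, (2, 118) for face race and (4, 118) for their interaction, and `female_three_way["within_x_between"]` gives (4, 118) for the 3-way interaction; the male values are (2, 55), (2, 110) and (4, 110).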
Discussion
Despite our attempts to obtain genuine, subjective evaluations of “good looks”, by having heterosexual participants rate only opposite-sex faces, and using a cover story that emphasized memorizing the faces to minimize the effects of social desirability, we did not actually find a strong own-race preference. For female participants there was a universal European face preference, and the Australian East Asian participants did not even rate own-race faces as second most attractive, instead rating African faces as equally attractive as (for individual faces) or slightly more attractive than (for compound faces) East Asian faces. For male participants European faces were also rated highly, and uniformly highest for compound faces, but there is some evidence of an own-race preference when rating individual faces, with both groups of East Asian participants showing a very slight own-race preference.
We did not find strong evidence for an own-race preference, since it emerged only for male participants judging unmanipulated faces, but the data overall are also not consistent with familiarity being the prime determinant of attractiveness, although the general preference for European faces (especially averages) may well reflect an effect of exposure. The mere fact that males and females differed in their ratings is very difficult to explain as a familiarity effect, since, in the samples we used, males and females of the various races are equally common (or uncommon). Hong Kong East Asians are exposed to far fewer European faces than East Asian faces, both in daily interactions and in the media, but they (especially females) nevertheless usually rated European faces as most attractive. Equally problematic for a simple effect of familiarity, the only evidence for an own-race preference that we obtained, when males were judging unaveraged faces, showed no signs of being affected by experience. East Asian males, regardless of where they grew up, showed a slight preference for East Asian faces, despite dramatically different levels of exposure to European and East Asian faces, and the European Australians showed only a slight preference for European faces (over East Asian faces) despite much greater exposure to European faces. The ratings by female participants are more like those expected on the basis of familiarity, since the patterns of preferences shown by the two Australian-raised groups are more similar to each other than either is to the pattern shown by the Hong Kong East Asian participants, but the mere fact of sex differences is not easily accommodated by this factor, and the fact that the Australian-born participants rated African faces as second most attractive, despite their rarity, is also problematic.
The female Australian-born participants rated the highly unfamiliar African faces as more attractive than the much more familiar East Asian faces, whereas all of the male participant groups rated the African faces as least attractive, which is consistent with the low familiarity of those faces but inconsistent with the males' own almost experience-independent ratings of the European and East Asian faces. There is obviously a complex set of factors underpinning the attractiveness ratings, which may differ between males and females, and which is not completely explained by a straightforward own-race preference or by an effect of familiarity.
Unlike Rhodes et al. (2005), but like Rhodes et al. (2001), we did not find that participants rated mixed-race faces as more attractive than single-race compound faces. In fact, for most of the mixes, the mixed-race face was rated at about the mean level of attractiveness for the two single-race composites it was a mixture of. Whether this difference is a consequence of procedural differences or stimulus differences is difficult to tell, but having participants rate each face singly in random order, and including three different compound faces for each category, mixed in with unaveraged faces, as in our procedure, is perhaps a more ecologically valid way of assessing uncontaminated attractiveness impressions than that used by Rhodes et al. (2005). In their study participants were presented with a single array of morphs for a given sex along a continuum (from exaggerated Japanese faces, through averaged Japanese faces, mixed Japanese/Caucasian, etc., to exaggerated Caucasian faces) printed on to cards which were shuffled for each participant, and asked to pick out the most attractive and give it a score out of 10, then to pick the next most attractive and score it, etc. This procedure seems more likely than ours to focus participants' attention on the blending between the races of the faces, and so perhaps more likely to elicit preferences for mixes via demand characteristics. The fact that Rhodes et al. (2001) also failed to find a preference for mixed-race composites, and followed a procedure more like ours, supports the possibility that the mixed-race preference found by Rhodes et al. (2005) might be a consequence of their procedure.
In common with both Rhodes et al. (2001) and Rhodes et al. (2005), we found that compound faces were rated as much more attractive than individual faces, independently of familiarity with the race of face being averaged. Although we found significant interactions between face race and averaging, this was clearly not a consequence of larger increases for more familiar faces. Indeed, the African faces, which were unfamiliar for all participants, showed some of the largest increases in attractiveness as a consequence of averaging. This suggests that compound faces are not rated as more attractive only because they are more like the mean of the faces with which we are familiar (Langlois and Roggman, 1990; Langlois et al., 1994), but also because averaging creates faces that intrinsically reflect preferred traits, like healthiness, as suggested by Rhodes et al. (2005). Our averaged faces were also more symmetrical than our unaveraged faces, which would have contributed to this effect.
A result that we did not predict was that the ratings for the African faces would differ so dramatically between males and females. Based on either familiarity or a possible own-race preference, our participants should have rated African faces as low in attractiveness. This is the pattern shown by male raters, but for female raters, the African faces (especially averages) were rated as more attractive than the much more familiar (and sometimes own-race) East Asian faces. One factor that we did not explicitly consider is that the different race faces may differ in perceived masculinity and/or femininity. Lewis (2011) asked his UK Caucasian participants to rate the “Masculinity” (among other attributes) of the “White”, “Mixed” and “Black” faces he used, and both male and female “Black” faces were rated as more masculine than “White” or “Mixed” faces. It is well established that, for both Caucasian and Asian faces, feminizing female faces makes them more attractive (Perrett et al., 1998) and that more masculine (or at least less feminine) male faces are rated as more attractive as short-term partners (e.g., Perrett et al., 1998), a rating similar to our “good looks”. So one possible explanation for the inconsistent ratings of the African faces, and the overall high ratings of European faces, in our study is that they show different levels of perceived dimorphism. We explored this possibility in Experiment 2.
Experiment 2 – Sexual Dimorphism
One factor that may account for some of the variance in attractiveness ratings in Experiment 1 is the perceived masculinity and femininity of the faces being rated. In order to examine this possibility, we had a new set of raters, from Japan and Australia, rate the masculinity and femininity of a subset of the unmanipulated faces used in Experiment 1.
Materials and Methods
Ten faces of each sex of each race (60 faces in total) from the unmanipulated faces used in Experiment 1 were judged for their perceived level of masculinity (for male faces) or femininity (for female faces) in a separate study, run online, using Inquisit v3.0.6 (for PC users) or Inquisit v4.0 (for Mac users). The experiment was made available on the website of one of the researchers, and participants completed it on any internet-connected computer to which they had access.
One hundred and forty-five (108 females and 37 males) participants took part in the study. They were divided into two groups based on whether the majority of their viewing experience was likely to have been of Asian or Caucasian faces. The Asian viewing experience group consisted mostly of Japanese nationals: 50 females (mean age 22.6) and 19 males (mean age 36.2). The Caucasian viewing experience group consisted mostly of Australian nationals: 58 females (mean age 30.6) and 18 males (mean age 31.8). In addition to their nationality (Australian, Japanese or other) participants were asked to write the name of the one or two countries where they had mostly lived during the past 10 years. Those who stated that they had lived in either Japan or China and two others (one of whom stated that she had lived in both China and Australia, and the other in Malaysia), were included in the Asian viewing experience group. Those who stated that they had lived in countries with a predominantly Caucasian population were included in the Caucasian viewing experience group. English and Japanese language versions of the experiments were available. The instructions for the latter were the same as the English version, but translated into Japanese using back translation to ensure equivalence.
All participants were asked to rate the masculinity of the 30 male faces and the femininity of the 30 female faces on a 10-point rating scale (1 = least, 10 = most). As a concrete anchor for these judgments, each face was presented in a pair with a randomly selected (unrated) opposite-sex face of the same race to serve as a comparison. The order of presentation and the side of the screen on which the rated face appeared were randomized within blocks. Faces were presented in three blocks, one per face race, and the order of blocks was randomized between participants.
Results
Mean masculinity and femininity ratings given by each participant to the male and female faces of each race were calculated (see Figure 6). These means were analyzed using a mixed factorial analysis of variance (ANOVA) with sex of the rated face (2 levels: male and female) and race of the rated face (3 levels: African, Asian and Caucasian) as within-subjects factors and sex of the rater (2 levels: male and female) and nationality of the rater (2 levels: Japanese and Australian) as between-subjects factors. In order to facilitate comparisons across sex of the rated face, we have included the femininity ratings of the female faces and the masculinity ratings of the male faces as equivalent dependent variables for the sake of the ANOVA – a rating of sex-typicality.

Figure 6. Mean masculinity/femininity ratings for male/female African, Asian and Caucasian faces.
As predicted, there was a significant main effect of race of rated face, F(2, 282) = 28.95, p < .001. Simple contrasts revealed significantly lower scores for the African than for the Caucasian faces, F(1, 141) = 53.15, p < .001, and for the African than for the Asian faces, F(1, 141) = 29.82, p < .001, but no significant difference between the Asian and Caucasian scores, F(1, 141) = 3.09, p = .081. The interaction between race of rated face and nationality of participant was not significant, F(2, 282) = 1.00, p = .369.
The main effect of sex of rated face was also significant, F(1, 141) = 13.72, p < .001, with the males being rated as more masculine than the females were feminine. The interaction between race and sex of rated face was also statistically significant, F(1.82, 256.46) = 64.96, p < .001 (degrees of freedom corrected for violation of sphericity using the Greenhouse-Geisser procedure). Simple contrasts revealed that the African female faces were rated significantly lower on femininity than the African males were on masculinity, F(1, 141) = 67.36, p < .001, and that the Asian male faces were rated as significantly lower than the Caucasian males on masculinity, F(1, 141) = 7.32, p = .008. There was also a significant three-way interaction between race of rated face, sex of rated face and nationality of rater, F(2, 282) = 3.8, p = .035.
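For readers unfamiliar with the Greenhouse-Geisser correction, the following sketch shows how the corrected degrees of freedom reported above (1.82 and 256.46) relate to the uncorrected ones. It assumes uncorrected degrees of freedom of 2 and 282 for the race by sex interaction, which is what the design implies ((3 - 1) x (2 - 1) = 2 for the effect, and 282 as the error term reported for the other 2-df race effects). The correction multiplies both values by the same estimate, epsilon, so epsilon can be recovered from either one. The helper name `gg_epsilon` is ours.

```python
def gg_epsilon(corrected_df, uncorrected_df):
    """Recover the Greenhouse-Geisser epsilon implied by a pair of dfs."""
    return corrected_df / uncorrected_df


# Uncorrected dfs are an assumption based on the design (see above).
eps_from_effect = gg_epsilon(1.82, 2)
eps_from_error = gg_epsilon(256.46, 282)

print(round(eps_from_effect, 4))  # 0.91
print(round(eps_from_error, 4))   # 0.9094

# Epsilon is bounded below by 1 / (uncorrected effect df) and above by 1.
lower_bound = 1 / 2
```

The two estimates agree to within rounding of the reported values, and an epsilon of about 0.91 lies much closer to 1 than to its lower bound of 0.5, which suggests that the departure from sphericity was modest.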
General Discussion
The results from Experiment 2 support the possibility that the difference in the ratings of the male and female African faces in Experiment 1 was at least partially due to a difference in perceived femininity/masculinity, since the male African faces were judged to be about as masculine as the male Caucasian and Asian faces, but the female African faces were judged to be significantly less feminine than the female Caucasian and Asian faces. Whether this is a genuine difference in femininity or simply a perceived difference due to our participants' lack of familiarity with African faces (and with the features that distinguish male and female African faces) is impossible to tell from the current data, but it does provide a potential explanation for the lower attractiveness ratings given to female African faces, since lower perceived femininity is known to correlate with lower attractiveness ratings (e.g., Burke and Sulikowski, 2010). To properly tease out the factors involved would require running experiments like those reported here but using European African and Native African participants. It is also true that while the subset of female African faces rated in Experiment 2 was judged to be lower in femininity than the Caucasian and Asian faces, and African female faces were also judged to be lower in attractiveness in Experiment 1, these judgments came from different raters, and so it is impossible to be sure that the variation in perceived attractiveness in Experiment 1 was driven by variation in perceived femininity.
Another dimorphic feature that may underpin part of the lower ratings for the female African faces (but increase those for male African faces) is skin color: adult males on average have darker skin than females, and this affects attractiveness judgments (van den Berghe and Frost, 1986; Lewis, 2011). Obviously skin color dimorphism is not the only factor that matters, or all of our participants would have rated European female faces as most attractive (which they did not always do), and European male faces as least attractive (which they never did), but this factor is likely to have interacted with other factors to produce the patterns we found. The ordering of the male preferences for mixed-race faces tracks variation in skin color, such that the darker-skinned mixes are rated as less attractive than the lighter-skinned mixes, so perhaps this is a more important cue for males.
The current study was designed to test, in as unbiased a way as could be managed, for an own-race preference in attractiveness judgments, which may contribute to race-based assortative mating. In fact we found only weak support for the possibility of an own-race preference, but even less support for a most-familiar-race preference. Some of the results are consistent with an experience-independent own-race preference (a slight male own-race preference that tracked the rater's racial origin rather than their country of birth), but the effect is insufficiently robust to account for much of the variance in attractiveness ratings that we measured. It is not clear, for example, why the male raters' own-race preference, evidenced from their ratings of individual faces, did not translate into an own-race preference for the compound faces, reverting instead to a uniform European-race preference.
Clearly other factors are at work in determining the perceived attractiveness of the same- and other-race faces. We tried to reduce demand characteristics, and to elicit genuine, subjectively accurate ratings with our cover story, and the significant differences across face-race suggest that we succeeded, at least in part. We also measured perceived femininity and masculinity of the faces used in Experiment 1, and these ratings (along with sexually dimorphic skin color preferences) may help to explain the lower ratings given to female African faces than male African faces, but none of the factors we have taken into account offer a straightforward explanation for the fact that our female participants in particular rated the European faces as most attractive, irrespective of their own racial background or the faces with which they were most familiar. Future studies, examining other variables (including different cultural contexts), will be necessary to determine what might underpin such preferences, but given that the effect was stronger for female than for male participants, one possibility is that the European faces were perceived to be (on average) more affluent, and so have better prospects, which we know is important, especially for females making judgments of potential long-term partners (Buss, 1989). This factor is also likely to be augmented by social learning/mate copying effects, in which targets of opposite-sex interest become more attractive, an effect that generalizes to similar individuals (Little et al., 2011). It may be that even when asked to judge faces only for “good looks” it is difficult for participants to disengage mechanisms involved in estimating important non-physical components of “attractiveness.”
Acknowledgements
We sincerely thank Colin Tredoux for access to, and permission to use, his face database, and two anonymous reviewers, who provided advice that greatly improved the focus of the paper.
