Abstract

Are associations between ratings of adolescents’ attractiveness and their adult health, cognitive functioning, and longevity plausibly causal, or are they confounded by factors correlated with judgements about attractiveness? How do these processes differ for women and men? Using data from the Wisconsin Longitudinal Study, the authors estimate the impact of judgements about adolescent facial attractiveness on 35 cognitive, health, and mortality outcomes through age 72. Ratings of adolescent facial attractiveness are predictive of later life outcomes among women, but mainly because ratings of young women’s attractiveness are closely connected with women’s socioeconomic standing and body mass in early life. The same is not true for men. People use different standards to evaluate the attractiveness of women and men; these differences induce largely noncausal associations between ratings of young women’s attractiveness and their cognition, morbidity, and mortality.

Keywords

facial attractiveness, gender, health, mortality, cognition

People routinely and critically evaluate the visible characteristics of other people’s bodies. In many cases—such as when consequential evaluations are based on skin tone, perceived ancestral origins, or perceived gender of the body—those judgements can intersect with cultural and political systems to support powerful structures of inequity and injustice (e.g., Eberhardt et al. 2006; King and Johnson 2016; Monk 2021a, 2021b). Other visible characteristics of bodies, such as height and weight, are also routinely and critically evaluated, often in ways that may affect socioeconomic and other life outcomes (e.g., Biener, Cawley, and Meyerhoefer 2018; Böckerman et al. 2019; Bossavie et al. 2017; Kim and Han 2017; Kim and von dem Knesebeck 2018).

An emerging strand of research has considered the impact of judgements about yet another visible characteristic of bodies—how beautiful or attractive their faces are—on social and economic outcomes like marriage prospects, wages, and occupational attainment (e.g., Deng, Li, and Zhou 2020; Gu and Ji 2019; Karraker, Sicinski, and Moynihan 2017; Monk, Esposito, and Lee 2021; Sala et al. 2013). A much smaller strand of research has estimated the impacts of how beautiful or attractive young people’s faces are judged to be on their subsequent morbidity and mortality outcomes (Henderson and Anglin 2003; Kalick et al. 1998; Kim 2014; Scholz and Sicinski 2015; Weeden and Sabini 2005).

Our goal is to expand and improve upon this body of scholarship on the effects of adolescent attractiveness on longevity and later life health and cognitive functioning. Although extant research is limited in crucial ways that we describe later, the conceptual premise that such effects may exist is sound. Isolating those effects, however, is complicated by the fact that in real life multiple characteristics of people’s bodies are viewed and judged simultaneously. Those judgements may be correlated with one another (e.g., such that in contemporary America heavier bodies may tend to be judged as less attractive) and may be biased by socioeconomic status, perceived intelligence, perceived healthiness, or other factors. As a result, identifying causal effects of attractiveness is challenging.

In this article we follow previous research by asking whether young people whose faces are judged to be more attractive tend to have better adult health, better adult cognitive functioning, and longer lives. We go beyond previous research by (1) focusing more carefully on whether those associations are plausibly causal or whether they may be confounded by factors correlated with judgements about facial attractiveness and (2) carefully considering whether and how these processes differ for women and men.

Ratings of attractiveness reflect cultural constructions of what counts as attractive and what does not; our ratings are discussed in detail in the data section of the article. Our goal in this study is to better understand whether people judged as attractive have better health and cognitive outcomes because of their perceived attractiveness alone or if attractiveness is merely acting as a proxy for other variables known to influence later life cognitive and physical health, namely, body weight and socioeconomic status.

Adolescent Attractiveness and Later Life Health and Mortality

There are two basic theoretical perspectives that support the premise that ratings of adolescents’ facial attractiveness are associated with their later life morbidity, cognition, and mortality. These theoretical perspectives each imply that we will see an empirical association between adolescent facial attractiveness and these outcomes. The perspectives differ, however, with respect to whether those associations are causal in nature.

First, from what we will call a beauty norm discrimination perspective, people with the power to allocate or withhold valued social and economic resources do so based in part on their judgements about other people’s attractiveness (e.g., Gu and Ji 2019; Mobius and Rosenblat 2006; Monk et al. 2021). People whose appearance fails to adhere to culturally defined beauty norms—either because their bodies simply differ from normative standards or because they do not or cannot afford to modify them in expected ways (e.g., through makeup, hair style, or fashionable clothing)—may be denied romantic affection, social network access, educational opportunities, job offers, salary increases, and other valued positions and resources. For example, Deng et al. (2020:1303) conducted a field experiment and found that “taste-based pure appearance discrimination exists at the pre-interview stage” in the Chinese labor market. Furthermore, those in power may inflict psychological distress upon those they judge to not meet cultural standards of beauty or attractiveness, thus exacerbating disparities between more and less attractive people (e.g., Gupta, Etcoff, and Jaeger 2016).

Above and beyond discrimination based on skin color, sex, height, weight, and other bodily attributes, people whose appearance better adheres to dominant cultural beauty norms may enjoy social and economic advantages at key stages of the life course. From this perspective, attractiveness in adolescence has causal effects on subsequent social positions, economic rewards, and psychological well-being, which in turn shape health, cognition, and longevity. This perspective has dominated research on the impact of attractiveness on socioeconomic and psychological outcomes (e.g., Biddle and Hamermesh 1998; Deng et al. 2020; Gu and Ji 2019; Gupta et al. 2016; Jæger 2011; Liu and Sierminska 2014; Scholz and Sicinski 2015). Yet if facial attractiveness is mostly a reflection of one’s (or one’s parents’) ability to buy or otherwise obtain a culturally preferred look, then we may not be seeing the effects of facial attractiveness on later life outcomes so much as we are seeing ratings of attractiveness act as a proxy for socioeconomic status.

Second, and in sharp contrast, the mate selection perspective suggests that bodily (including facial) attractiveness is a signal of strength, healthiness, virility, and fertility and thus has a more biological basis (e.g., Gallup and Frederick 2010; Jæger 2011; Rhodes 2006; Singh and Singh 2011; Thornhill and Gangestad 1999). As Gallup and Frederick (2010) noted, “Features we find attractive in members of the opposite sex signal important underlying dimensions of health and reproductive viability” (p. 240). In support of this perspective, for example, Law Smith et al. (2006) found strong correlations between ratings of women’s facial attractiveness and their estrogen and progesterone levels, and Jokela (2009) found that adolescents who were viewed as more attractive went on to have more children. As reviewed by Gallup and Frederick (2010) and Thornhill and Gangestad (1999), there is also suggestive evidence of a positive relationship between men’s attractiveness and the quality of their sperm.

From this perspective, evaluations of the attractiveness of another person are really evaluations of the person’s desirability as a mate (in either the sexual sense or the long-term partnership sense). Furthermore, although norms about beauty differ across cultural contexts (Broer et al. 2014), those norms are to some unknown degree driven by mate selection processes and an evolutionary drive to reproduce (Grammer et al. 2003). From the mate selection perspective, it is not necessarily the degree to which a person’s appearance adheres to beauty norms that ultimately causally impacts (via discrimination) their later life morbidity, cognition, and mortality. Instead, factors such as early life healthiness, robustness, physical fitness, strength, and vitality—which are unconsciously the basis on which evaluations of attractiveness are founded—ultimately shape those outcomes. That is, the empirical association between ratings of attractiveness in adolescence and morbidity, cognition, and mortality later in life is mostly spurious, confounded by adolescent healthiness, robustness, and other proxies for being a good procreative partner.

Of course, both the beauty norm discrimination perspective and the mate selection perspective may have partial merit: attractiveness may be mostly a proxy for other factors such as wealth, healthiness, and/or virility and at the same time there may still be some discrimination against people whose appearance does not conform to prevailing beauty norms. If either perspective has merit, we expect to see that people who are rated as more attractive early in life enjoy better later life health and cognitive functioning and live longer.

As we describe later, estimating the degree to which those associations are causal in nature requires attention to the possibility that ratings of young people’s attractiveness are influenced by or are merely proxies for other factors that may themselves matter for later life well-being. The mate selection perspective would lead us to expect associations between facial attractiveness and later life outcomes to be partially confounded by indicators of healthiness and virility. However, it may also be that ratings of attractiveness are a proxy for other things—perhaps especially socioeconomic status and body mass—such that associations between attractiveness and subsequent outcomes are confounded by those attributes. If this is true, then neither the beauty norm discrimination nor the mate selection perspective may have merit; instead, it may be the case that ratings of attractiveness simply reflect other adolescent attributes that are known to have long-term health consequences.

Finally, associations between ratings of young people’s attractiveness and later life morbidity, cognition, and mortality likely work differently for women and men. From the beauty norm discrimination perspective, the societal penalties associated with having bodies that deviate from beauty norms may be more pervasive and more severe for women. From the mate selection perspective, the manner and degree to which attractiveness proxies for healthiness, virility, and suitability as a mate may also differ for women and men. Likewise, the degree to which ratings of attractiveness serve as proxies for other attributes, such as socioeconomic status or body mass, may differ for women and men. For these reasons, we are attentive to gender differences in our analyses.

Current Evidence about the Association between Attractiveness and Health

Prior scholars have estimated the effects of adolescent attractiveness on later life health and mortality. Below we review the main research articles on this subject, describe their methodological limitations, and motivate the need for additional analyses.

In the earliest empirical work on this topic, Kalick et al. (1998) used longitudinal data on 164 male and 169 female (mostly working-class) residents of the Berkeley and Oakland, California, area to investigate the impact of young people’s facial attractiveness on medical professionals’ subsequent ratings of their overall healthiness. Using a model that adjusted for a measure of socioeconomic background, the authors found that baseline attractiveness did not predict doctors’ ratings of their healthiness in follow-up surveys for either women or men. Their longitudinal sample suffered from considerable attrition.

Henderson and Anglin (2003) had 20 undergraduate students view 50 yearbook photos from one high school in the 1920s; the students were asked to rate the healthiness and facial attractiveness of the person in each photo. The utility of those attractiveness ratings is potentially limited because the people doing the ratings came from a very different birth cohort than the people who they were asked to rate; the raters may thus have had different ideals of beauty and of healthiness. In any case, Henderson and Anglin then linked those attractiveness ratings to mortality records to ascertain longevity. Using a model that adjusted for no covariates, they found that ratings of facial attractiveness predicted longevity but that ratings of healthiness did not.

Three more recent studies have used data from the same cohort study that we use—the Wisconsin Longitudinal Study (WLS). The WLS, as we describe in more detail later, has followed a one-third random sample of the Wisconsin high school graduating class of 1957. In the mid-2000s, headshots from senior high school yearbooks were rated for facial attractiveness for 82% of the original cohort. Importantly, the individuals who did the rating were drawn from Wisconsin and from roughly the same birth cohort. First, Kim (2014) estimated the impact of ratings of facial attractiveness on mortality, self-assessed overall health, psychological distress, and a count of self-reported diseases decades later. Using models that controlled for family socioeconomic standing, IQ, religiosity, and personality, they found few effects on health outcomes; however, they did find that women who were rated more attractive lived longer. Unfortunately, their models inappropriately adjust for factors that are almost certainly mediators (e.g., midlife health behaviors), not confounders. It is thus difficult to discern the total estimated effects of facial attractiveness from their results. Second, Scholz and Sicinski (2015) likewise used WLS data to estimate the effects of adolescent facial attractiveness on longevity and self-assessed overall health from a model that adjusted for family socioeconomic background and IQ. They found no effects on either outcome. Third, Gupta et al. (2016) used WLS data to estimate the impact of facial attractiveness on psychological distress and well-being. Using a model that adjusted for midlife height and body mass index (BMI) and a measure of adolescent IQ, they found that people judged to be more attractive in adolescence had greater psychological well-being and less psychological distress in later life.

Although these articles have moved the field forward, they are severely limited in some important respects. First, only Kim (2014), Scholz and Sicinski (2015), and Gupta et al. (2016) used a large population-based sample. The external validity of the results of Kalick et al. (1998) and Henderson and Anglin (2003) is unclear. Like Kim, Scholz and Sicinski, and Gupta et al., we extend this field of study by using data from the WLS. The WLS cohort includes only high school graduates, and almost all respondents are white; about two-thirds of Americans in this cohort were white high school graduates (Herd, Carr, and Roan 2014).

Second, the ratings of facial attractiveness in the WLS were made by people in the same demographic group as the subjects of the ratings. If there are cohort differences in standards of attractiveness, then it is important to have raters be from similar birth cohorts as the people whose attractiveness is being rated. Although it is a virtue of the WLS’s coding procedures that the evaluators came from the same state and birth cohort as the WLS graduates, it is a potential weakness that cultural norms of attractiveness may have changed in unknown ways between 1957 and the early 2000s. It is unclear how the coders’ ratings might have been different had they evaluated the photographs in 1957. Given that ratings were not done in the 1950s, we feel that this method represents a strong estimate of relative facial attractiveness in 1957. As discussed in the “Measures” section, the coders were trained and had rating scales developed from contemporary yearbook photos (Figure 1).

Figure 1.

Wisconsin Longitudinal Study measure of adolescent facial attractiveness.

Third, none of these studies adjusted for an adequate set of confounding variables. As might be expected from the mate selection perspective—and as we show later—ratings of adolescents’ facial attractiveness are empirically correlated with adolescents’ relative body mass (RBM; Hume and Montgomerie 2001; Jæger 2011; Weeden and Sabini 2005), perceived healthiness (Foo, Simmons, and Rhodes 2017; Kalick et al. 1998; Shackelford and Larsen 1999; Weeden and Sabini 2005), family socioeconomic circumstances (Kalick et al. 1998), and intelligence (Langlois et al. 2000; Zebrowitz et al. 2002). None of these studies adjusted for adolescent health or body mass; only Kim (2014), Scholz and Sicinski (2015), and Gupta et al. (2016) adjusted for intelligence; and only Gupta et al. adjusted for adolescent socioeconomic standing. Our estimates, which adjust for all four, thus improve our ability to make inferences about the consequences of adolescent attractiveness.

Fourth, although many of the aforementioned studies analyzed sample data that include women and men, none gives adequate attention to gender differences. Scholz and Sicinski (2015) analyzed only data for men. Henderson and Anglin (2003) and Gupta et al. (2016) reported results only from “full sample” models that pool women and men and, thus, implicitly constrain the effects of facial attractiveness to be the same for women and men; they report no tests of the validity of this constraint. Kalick et al. (1998) and Kim (2014) estimated gender-specific models—thereby allowing the effects of adolescent attractiveness on health and mortality to differ for women and men—but they report no evidence about the statistical significance of differences in results across those models. In other words, none of the four studies performs formal statistical tests to support claims about gender differences or equivalencies in the effects of early life attractiveness. In our analyses, we estimate gender-specific effects and perform statistical tests to assess whether there are gender differences in the effects of early life attractiveness on later life morbidity and mortality.

Summary

Are societal judgements about people’s facial attractiveness—like judgements about people based on their skin pigmentation, their sex, their height, and their conformity to norms about body size and shape—associated with people’s long-term health and longevity? If so, are these associations causal in nature—perhaps because of life-course-long biases against people judged to be less attractive? How do these processes differ for women and men?

Research Design

The WLS is a long-term study of a random sample of 10,317 men and women who graduated from Wisconsin high schools in 1957. WLS graduates were interviewed by telephone, mail, and/or in person in 1957, 1975, 1993, 2004, and 2011. The WLS graduate sample is broadly representative of white, non-Hispanic Americans who have completed at least a high school education, a group that includes about two-thirds of all Americans of this generation (Herd et al. 2014). Response rates to WLS have been remarkably high. In 1993, when most of the surviving graduates were age 53 or 54, 87 percent responded to the telephone survey and 71 percent responded to the mail survey. The corresponding response rates were 81 percent and 76 percent in 2004 and 72 percent and 65 percent in 2011. The largest source of survey nonresponse is mortality.

In the early 2000s, WLS staff members obtained 1957 high school yearbooks for 8,434 sample members (4,018 men, 4,416 women). As described later, two of our key measures, facial attractiveness and RBM, are derived from photographs in those yearbooks. Thus, our analytic sample is initially restricted to those 8,434 sample members (representing 82 percent of the full cohort of 10,317). All our cognitive and health outcome measures, except mortality, were ascertained in 2011 when sample members were about 72 years old. Consequently, our analytic sample is further restricted to the 4,905 sample members (2,264 men, 2,641 women) who responded to the 2011 survey.

Measures

Facial Attractiveness

Each yearbook photograph was rated by six men and six women using a photo-labeled, 11-point, gender-specific rating scale like the one depicted in Figure 1, with end points labeled as “least attractive” (1) and “most attractive” (11). The example photographs in Figure 1 were selected from non-WLS sample members in Madison, Wisconsin, area 1957 high school yearbooks. The choice and ranking of the example photographs followed psychometric methods for paired comparisons (Bock and Jones 1968; Torgerson 1958). Trained coders rated each WLS graduate in relation to the gender-specific 11-point rating scale. Coders were recruited between 2004 and 2008 from Wisconsin and from roughly the same birth cohort as the WLS graduates; judges ranged in age from 63 to 91 years (with a mean of 78.5 years) and were (like the cohort) almost all white. See Meland (2002) for details about how the coding system was developed, tested, and implemented.

To account for average differences in ratings across coders (i.e., one coder may average 5.5 on the 11-point scale, whereas another may average 6.0), we rescaled each coder’s ratings by subtracting the coder-specific mean from each score (e.g., such that −2 represents 2 points below average regardless of each coder’s average score). We then averaged across those 12 coders to construct our final measure of attractiveness (called “meanrat” in the WLS codebook). As shown in Table 1, the mean attractiveness score was 0.01. Men and women in the bottom and top quartiles of gender-specific attractiveness distributions averaged about −1.5 and +1.6, respectively.
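The rescaling described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the WLS team's actual procedure: the function names, the dictionary layout, and the two-coder, three-photo ratings are all hypothetical.

```python
from statistics import mean

def center_within_coder(ratings):
    """Subtract each coder's own mean from that coder's raw 1-11 ratings.

    `ratings` maps coder id -> {photo id -> raw rating} (hypothetical layout).
    """
    centered = {}
    for coder, scores in ratings.items():
        coder_mean = mean(scores.values())
        centered[coder] = {photo: s - coder_mean for photo, s in scores.items()}
    return centered

def average_across_coders(centered):
    """Average the coder-centered ratings for each photo."""
    by_photo = {}
    for scores in centered.values():
        for photo, value in scores.items():
            by_photo.setdefault(photo, []).append(value)
    return {photo: mean(values) for photo, values in by_photo.items()}

# Hypothetical data: coder_a rates one point higher than coder_b on average.
raw = {
    "coder_a": {"p1": 4, "p2": 6, "p3": 8},  # coder mean is 6
    "coder_b": {"p1": 3, "p2": 5, "p3": 7},  # coder mean is 5
}
attractiveness = average_across_coders(center_within_coder(raw))
```

In this toy example both coders place photo p1 two points below their own average, so its final score is −2 even though their raw ratings differ.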

Table 1.

Descriptive Statistics for All Measures, by Gender and Quartile of Attractiveness.

	Full Sample					Men in the Lowest 25% of the Distribution of Facial Attractiveness			Men in the Top 25% of the Distribution of Facial Attractiveness			Women in the Lowest 25% of the Distribution of Facial Attractiveness			Women in the Top 25% of the Distribution of Facial Attractiveness
Variable	n	Avg/%	SD	Min	Max	n	Avg/%	SD	n	Avg/%	SD	n	Avg/%	SD	n	Avg/%	SD
Facial attractiveness rating
Mean normalized rating, age 18	8,434	.01	1.26	−4.09	4.00	1,007	−1.50	.55	1,003	1.63	.56***	1,104	−1.68	.57	1,103	1.61	.60***
Adolescent anthropometric measures
RBM, age 18	8,418	.00	.82	−3.31	3.62	1,002	−.02	1.07	1,000	.00	.63	1,104	.37	.99	1,103	−.34	.61***
Height (in)	6,621	67.33	3.86	55.00	78.00	748	70.46	2.65	766	70.48	2.41	867	64.44	2.64	909	64.75	2.41*
Childhood socioeconomic circumstances
Family income (logged), 1957 tax records	7,357	8.46	.73	.00	11.51	855	8.39	.73	863	8.50	.72**	965	8.35	.68	962	8.54	.73***
Father’s occupational SEI, 1957	8,249	34.31	23.02	2.00	96.00	987	32.28	22.37	976	36.22	23.11***	1,081	29.70	20.91	1,086	38.14	23.95***
Mother’s education (y)	8,308	10.49	2.80	.00	21.00	990	10.38	2.83	984	10.90	2.72***	1,090	9.81	2.73	1,091	10.82	2.77***
Father’s education (y)	8,164	9.80	3.34	.00	25.00	977	9.49	3.40	971	10.14	3.39***	1,068	9.13	3.19	1,072	10.31	3.26***
Adolescent IQ
Henmon-Nelson IQ (junior year)	7,873	100.66	14.88	61.00	145.00	928	99.96	15.32	934	101.02	15.16	1,045	98.80	14.89	1,039	102.63	14.00***
Adolescent health
Self-assessed overall health (% excellent/very good)	5,626	.83	.37	.00	1.00	617	.82	.38	628	.86	.34	762	.79	.41	787	.83	.37*
Count of childhood diseases	5,208	1.07	1.02	.00	8.00	570	1.01	.96	575	1.02	1.01	695	1.16	1.04	735	1.12	1.05
Childhood limitations from disease/injury	5,530	.23	.68	.00	3.00	609	.23	.64	618	.20	.61	744	.23	.69	776	.22	.69
Cognitive outcomes, age 72
Verbal fluency (letter F task)	4,447	11.20	4.16	.00	31.00	481	10.38	4.15	508	10.71	4.07	589	11.19	4.19	645	12.24	4.02***
Working memory (digit span task)	3,751	5.64	1.41	2.00	8.00	399	5.43	1.44	444	5.58	1.45	493	5.65	1.37	530	5.75	1.37
Short-term memory (immediate-recall task)	3,731	5.48	1.45	.00	10.00	401	5.09	1.37	436	5.22	1.37	493	5.62	1.47	522	5.92	1.40***
Short-term memory (delayed-recall task)	3,727	3.46	1.79	.00	10.00	401	2.90	1.57	435	3.10	1.58	492	3.82	1.89	521	4.01	1.80
Abstract reasoning (WAIS task)	4,780	6.32	2.33	.00	12.00	532	6.21	2.32	553	6.48	2.45	619	5.98	2.21	674	6.60	2.23***
Inductive reasoning (number-series task)	4,735	8.46	3.65	.00	15.00	527	8.78	3.82	546	8.97	3.63	605	7.85	3.63	672	8.34	3.47*
Psychological well-being outcomes, age 72
Psychological distress (CES-D)	4,205	15.48	14.49	.00	132.00	446	14.46	12.68	468	14.61	13.40	539	17.42	15.72	622	15.65	14.75*
Anxiety score	4,205	6.84	7.49	.00	47.00	447	6.09	7.06	475	6.71	7.76	539	7.57	7.55	616	7.37	7.92
Anger score	4,210	2.83	4.61	.00	49.00	447	2.77	4.62	475	3.15	5.34	538	3.08	4.74	620	2.69	4.81
Hostility score	4,195	1.05	2.11	.00	21.00	446	.96	1.88	466	1.32	2.59*	538	1.05	1.93	621	.94	1.99
Anthropometric outcomes, age 72
Lung function (peak flow measure)	4,640	361	132	60	850	523	434	131	539	453	127*	589	281	81	660	291	81*
Leg strength (chair rise task)	4,411	10.15	3.65	.72	129.58	493	10.21	3.16	514	9.80	3.23*	539	10.51	3.30	635	10.04	3.45*
Grip strength (dynamometer)	4,699	28.56	10.53	.00	84.00	524	35.95	8.97	546	36.85	9.15	599	21.35	6.08	671	21.89	5.83
Walking speed	4,643	2.93	4.77	1.18	318.00	518	2.75	.94	536	2.67	.74	586	3.11	1.29	669	2.96	1.54
Body mass index	4,666	28.70	5.29	16.01	54.81	528	28.87	4.61	548	28.84	4.27	586	30.00	6.48	657	27.82	5.59***
Other physical health outcomes, age 72
Self-assessed overall health	4,904	.58	.49	.00	1.00	544	.55	.50	566	.60	.49	629	.53	.50	691	.62	.49**
Interviewer-assessed overall health	4,799	.54	.50	.00	1.00	534	.51	.50	556	.53	.50	615	.49	.50	679	.59	.49***
Overall health (SF-12)	3,941	56.18	6.62	20.42	71.36	423	56.68	6.29	445	56.36	6.42	496	56.15	6.99	583	55.77	7.10
Days in bed because of illness/injury	4,296	2.34	16.85	.00	365.00	464	2.29	18.87	487	1.17	7.17	544	3.18	22.79	628	2.24	10.29
Hypertension (self-reported)	4,897	.61	.49	.00	1.00	542	.61	.49	564	.64	.48	629	.65	.48	690	.57	.50**
High blood sugar (self-reported)	4,886	.22	.41	.00	1.00	541	.25	.43	563	.26	.44	626	.23	.42	690	.14	.35***
Diabetes (self-reported)	4,894	.18	.38	.00	1.00	542	.21	.41	565	.19	.39	628	.19	.40	692	.11	.32***
Cancer (self-reported)	4,898	.18	.39	.00	1.00	543	.19	.40	563	.18	.39	629	.16	.37	692	.19	.39
Heart disease (self-reported)	4,894	.24	.43	.00	1.00	542	.32	.47	564	.32	.47	628	.19	.39	691	.16	.37
Stroke (self-reported)	4,895	.06	.23	.00	1.00	544	.05	.21	563	.07	.26	629	.05	.21	691	.05	.23
HUI summary score	4,520	.79	.22	.00	1.00	484	.80	.21	497	.81	.21	587	.76	.24	653	.78	.22
HUI vision score	4,968	.95	.06	.38	1.00	547	.95	.06	567	.96	.05	640	.95	.07	701	.95	.05
HUI hearing score	4,729	.98	.07	.00	1.00	510	.98	.09	512	.98	.07	614	.99	.07	681	.99	.06
HUI speech score	5,032	.99	.06	.00	1.00	553	.99	.07	575	.99	.07	649	.99	.07	715	1.00	.03**
HUI ambulation score	5,026	.94	.17	.00	1.00	554	.96	.15	575	.96	.15	648	.91	.22	713	.94	.17**
HUI dexterity score	5,055	.99	.08	.00	1.00	557	.99	.10	577	.99	.05	651	.99	.10	717	.99	.07
HUI emotion score	5,007	.97	.09	.00	1.00	552	.98	.07	575	.97	.08	643	.97	.10	713	.96	.10
HUI cognition score	5,050	.94	.13	.00	1.00	555	.93	.13	577	.93	.14	652	.93	.14	716	.95	.10*
HUI pain score	5,048	.83	.26	.00	1.00	557	.84	.27	577	.86	.24	652	.79	.29	716	.82	.26
Mortality
Deceased (yes = 1, 0 = no) by 2017	8,434	.35	.48	.00	1.00	1,007	.42	.49	1,003	.41	.49	1,104	.33	.47	1,103	.27	.44***

Note: The analytic sample is restricted to people who responded to the 2011 computer-assisted telephone interview survey and for whom attractiveness ratings of yearbook photographs are available. See text for more details. Avg = average; CES-D = Center for Epidemiologic Studies Depression Scale; HUI = Health Utilities Index; Max = maximum; Min = minimum; RBM = relative body mass; SF-12 = Short Form 12; WAIS = Wechsler Adult Intelligence Scale.

*p < .05, **p < .01, and ***p < .001 for hypothesis tests about gender-specific differences in measures between those in the top 25 percent and those in the bottom 25 percent of the attractiveness scale.

Confounders

For reasons described earlier, to estimate the causal impact of early life facial attractiveness on later life morbidity and mortality, it is essential to statistically adjust for factors that may be correlated with ratings of attractiveness and may themselves impact later life health and longevity. In our analyses, these include adolescent RBM, height, childhood socioeconomic circumstances, adolescent IQ, and adolescent health. All are described in Table 1, separately for the full sample and for people in the top and bottom quartiles of the gender-specific distributions of facial attractiveness.

Adolescent RBM was measured in a manner much like facial attractiveness; see Figure 2 for the gender-specific rating scales. Between 2005 and 2008, the WLS team coded the senior yearbook photographs for RBM, which is a proxy for BMI. For every photograph, coders recorded an RBM score ranging from “not at all heavy” (1) to “extremely heavy” (11). To account for differences across coders in the mean and variance of ratings, we standardize ratings within coders before averaging across them (Reither 2004). As noted by Reither, Hauser, and Swallen (2009), “the RBM scale is reliable (α = .91) and meets several criteria of validity as a measure of body mass.” For example, it is correlated at r = .31 with BMI at ages 53 and 54 and at r = .48 with maximum BMI between ages 16 and 30.
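Standardizing within coders differs from mean-centering in that it also removes differences in spread. The Python sketch below illustrates the idea. It is an assumption-laden illustration only: the helper name and the ratings are hypothetical, and the use of the population standard deviation is a choice made here, not something the text specifies.

```python
from statistics import mean, pstdev

def standardize_within_coder(scores):
    """Convert one coder's raw 1-11 RBM ratings to z-scores (hypothetical helper)."""
    values = list(scores.values())
    m, sd = mean(values), pstdev(values)  # population SD is an assumption
    if sd == 0:
        raise ValueError("coder gave identical ratings; z-scores are undefined")
    return {photo: (s - m) / sd for photo, s in scores.items()}

# Hypothetical coders who agree on the ordering but use the scale differently.
narrow = standardize_within_coder({"p1": 5, "p2": 6, "p3": 7})
wide = standardize_within_coder({"p1": 2, "p2": 6, "p3": 10})

# Average the standardized ratings across coders for each photo.
rbm = {photo: mean([narrow[photo], wide[photo]]) for photo in narrow}
```

After standardization the two hypothetical coders assign the same z-score to each photo, so the coder who uses more of the scale does not dominate the average.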

Figure 2.

Wisconsin Longitudinal Study measure of adolescent body mass index.

Sample members’ height was self-reported in 1992 (at age 53). If it was unavailable from 1992, we use self-reported height in 2004 (at age 65) instead; if it was unavailable in both 1992 and 2004, we use the interviewer-measured height from 2011 (at age 72). Using heights from these later surveys is potentially problematic because people start to lose height after age 40 at an average rate of 1 cm per decade. This process is even more rapid, for most, after age 70.
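The fallback order for height can be expressed as a simple first-available rule. The sketch below is illustrative only; the function name, argument names, and source labels are hypothetical and do not correspond to WLS variable names.

```python
from typing import Optional, Tuple

def resolve_height(
    self_report_1992: Optional[float],
    self_report_2004: Optional[float],
    measured_2011: Optional[float],
) -> Tuple[Optional[float], Optional[str]]:
    """Return the first available height (inches) and a label for its source."""
    candidates = (
        ("1992 self-report", self_report_1992),
        ("2004 self-report", self_report_2004),
        ("2011 interviewer-measured", measured_2011),
    )
    for source, value in candidates:
        if value is not None:
            return value, source
    return None, None  # height missing in all three waves

# Hypothetical respondent with no 1992 report.
height, source = resolve_height(None, 66.0, 65.5)
```

Keeping the source label alongside the value makes it possible to check how many heights come from the later waves, where age-related height loss is a concern.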

Childhood socioeconomic circumstances were measured in 1957. We include indicators of sample members’ father’s occupation and family income, both from 1957 Wisconsin income tax records, and of their mother’s and father’s educational attainments. The latter were reported by sample members across various surveys and aggregated into a consensus measure.

Adolescent IQ, measured in high school using the Henmon-Nelson test of cognitive abilities, was ascertained from records of the Wisconsin State Testing Service. Using national norms by grade level for the Henmon-Nelson test as well as a renorming of raw scores for graduates for whom there were test scores in both the freshman and junior year, WLS staff members estimated junior-year raw scores on the Henmon-Nelson test for all of the graduates for whom any test score has been obtained. Finally, WLS staff members renormed the raw scores to a set of IQ equivalents, based on the percentile distribution of scores that were observed among all Wisconsin high school juniors in 1957. Thus, their norming of the Henmon-Nelson test scores does not depend on the obsolete concept of mental age used in the construction of Henmon-Nelson IQ scores.

Finally, adolescent health was measured retrospectively in the 2004 WLS using three sets of survey questions. First, self-assessed overall childhood health was measured using a survey question that asked, “How would you rate your health as a child?” Response options included “poor,” “fair,” “good,” “very good,” and “excellent.” Second, number of childhood illnesses was measured by summing the number of affirmative responses to 11 questions about whether graduates had asthma, frequent ear infections, a tonsillectomy or adenoidectomy, chronic bronchitis, whooping cough/pertussis, polio, diphtheria, hepatitis, pneumonia, meningitis, and infectious mononucleosis. Third, activity limitations in adolescence were measured by summing the number of affirmative responses to questions about whether graduates ever missed one month or more of school because of a health condition, were ever confined to bed for one month or more because of a health condition, or had their sports or physical activities restricted for three months or more because of a health condition. Prior evidence suggests that adults can retrospectively report childhood health conditions and related circumstances with a reasonable degree of reliability and validity (Haas 2007; Havari and Mazzonna 2015; Smith 2009).

Cognitive and Health Outcomes

We estimate separate models for a series of cognitive outcomes, measures of psychological well-being, anthropometric measures, self-reported physical health measures, and mortality. Except for mortality, which was assessed using administrative records from 1957 through 2017, all were measured at the time of the 2011 WLS.

Cognitive outcomes included a verbal fluency (letter F) measure (Lezak et al. 2004), a working memory (digit span) measure (Conway et al. 2005), immediate and delayed-recall tasks measuring memory and attention (Morris et al. 1989), abstract reasoning (Wechsler 1955), and a measure of inductive reasoning (using a number-series task; Lachman et al. 2014).

We included four measures of psychological well-being. First, a modified 20-item version of the Center for Epidemiologic Studies Depression Scale expressed psychological distress and depression (Radloff 1977). Second, a seven-item anxiety index included questions such as “On how many days in the past week did you feel calm?” (Spielberger, Gorsuch, and Lushene 1970). Third, a seven-item anger index included questions such as “On how many days in the past week did you feel furious?” (Spielberger 1980). Finally, a three-item hostility index included questions such as “On how many days during the past week did you feel irritable, or likely to fight or argue?” (Spielberger 1988).

The 2011 in-person interview included a series of anthropometric measurements, including lung strength (using a peak flow meter), leg strength (using a chair rise task that timed how long it took graduates to rise from their chair into a standing position five times without using their arms), grip strength (using a dynamometer in their dominant hand), and walking speed (timed on a 2.5-m course while wearing shoes). Height and weight were also assessed; we include a measure of BMI, computed as weight in kilograms divided by the square of height in meters.
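The BMI calculation is the standard one and can be written as a short function. The function name and the example values are illustrative assumptions, not WLS variable names or data.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical respondent: 70 kg and 1.75 m tall.
example_bmi = body_mass_index(70.0, 1.75)
```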

We estimated models predicting five sets of physical health outcomes. First, subjective overall health was measured twice: once by asking sample members, “In general, would you say your health is: excellent, very good, good, fair, or poor?” and once by asking a parallel question of interviewers about the sample member. Second, we use the Short Form 12 (Jenkinson et al. 1997), a well-validated assessment of general health and the impact of health on everyday life. Third, we use a self-reported measure of whether sample members had spent more than half a day in bed because of illness or injury in the preceding year. Fourth, we estimate separate models predicting whether sample members had ever been diagnosed with hypertension, high blood sugar, diabetes, cancer, heart disease, and a stroke. Finally, we model Health Utilities Index–Mark 3 scores that measure general health status and health-related quality of life (Horsman et al. 2003); we model the overall summary score and the vision, hearing, speech, ambulation, dexterity, emotion, cognition, and pain component scores.

Finally, we measure mortality and timing of death. WLS records are periodically linked to the National Death Index; the most recent data capture deaths between 1957 and 2017. We begin by modeling a dichotomous measure that expresses whether sample members were alive as of 2017. We then estimate Cox proportional hazard models to estimate impacts on timing of death (Cox 1972).
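For readers less familiar with event history models, the proportional hazards specification can be written as follows. The notation is an assumption introduced here for illustration ($A_i$ for the standardized attractiveness rating, $X_i$ for the vector of confounders, $h_0(t)$ for the unspecified baseline hazard); it is not taken from the article.

```latex
h_i(t) = h_0(t)\,\exp\!\left(\beta A_i + \gamma^{\top} X_i\right)
```

Under this specification, $\exp(\beta)$ is the hazard ratio associated with a one-unit increase in the attractiveness rating, so a negative $\beta$ implies a lower hazard of death at any given age.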

Analytic Approach

To assess whether young people who are judged to be less attractive tend to have worse adult health and shorter lives, we relied on careful consideration of the descriptive patterns in Table 1. To understand whether bivariate associations between attractiveness and our outcomes are causal in nature, we estimated a series of ordinary least squares regression models (for continuous outcomes), linear probability models (for binary outcomes), and event history models (for timing of death) that adjust for each of the confounders described earlier. We prefer linear probability models over logistic regression or probit models because they are easier to interpret; results from logistic regression models, which show substantively the same results, are available in Appendix A.

All our models are estimated on pooled samples of women and men. We include interaction terms between all of our independent variables and gender; we then back out estimates of the effects of facial attractiveness for women and men separately and report the statistical significance of the interaction term between gender and facial attractiveness to inform our conclusion about whether the effect of attractiveness differs by gender.
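The step of backing out gender-specific effects from a pooled, fully interacted model can be sketched as follows. The coding of the gender indicator (men as the reference category), the function name, and the numbers in the usage example are assumptions for illustration; they are not estimates from the article.

```python
import math


def gender_specific_effects(b_attr, b_inter, var_attr, var_inter, cov_attr_inter):
    """Recover effects for men and women from a pooled interaction model.

    Assumes the model y = ... + b_attr * A + b_inter * (A * female) + ...,
    with female = 0 for men, so that:
      effect for men   = b_attr
      effect for women = b_attr + b_inter
    The variance of a sum of two coefficients includes twice their covariance.
    """
    var_women = var_attr + var_inter + 2.0 * cov_attr_inter
    if var_attr < 0 or var_inter <= 0 or var_women < 0:
        raise ValueError("variance inputs are not consistent")
    return {
        "men": (b_attr, math.sqrt(var_attr)),
        "women": (b_attr + b_inter, math.sqrt(var_women)),
        # Test statistic for the gender difference: the interaction term itself.
        "z_difference": b_inter / math.sqrt(var_inter),
    }


# Hypothetical coefficient estimates and (co)variances.
example = gender_specific_effects(
    b_attr=0.02, b_inter=-0.36,
    var_attr=0.0064, var_inter=0.0144, cov_attr_inter=-0.0064,
)
```

The significance of the gender difference is read directly from the interaction coefficient, whereas the women's standard error requires the covariance between the main effect and the interaction.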

As shown in Table 1, there is relatively little missing data on our confounding variables; we have no missing data on facial attractiveness (because of our sample selection criteria). To maximize sample size, we have imputed missing values on all confounding variables—but not on dependent variables—using chained equations as implemented in Stata’s ICE routine (Royston 2009; Royston and White 2011). We imputed 20 datasets. Also, to account for potentially selective patterns of nonresponse to the 2011 survey (from which we obtain most of our outcome measures), we construct and use in all multivariate analyses a poststratification weight; to construct the weight, we compute the inverse of the probability of responding in 2011 modeled as a function of gender and childhood socioeconomic background.
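The logic of an inverse-probability nonresponse weight can be sketched in a few lines. This is a simplified illustration, not the authors' procedure: it estimates response probabilities as raw response rates within cells defined by gender and a categorical socioeconomic group, whereas the article describes modeling the response probability. The record layout and field names are hypothetical.

```python
from collections import defaultdict


def inverse_probability_weights(records):
    """Return one weight per record: 1 / P(respond | cell) for respondents, 0 otherwise.

    records: list of dicts with keys "gender", "ses_group", and "responded" (bool).
    The response probability is estimated as the cell-level response rate.
    """
    totals = defaultdict(int)
    responders = defaultdict(int)
    for rec in records:
        cell = (rec["gender"], rec["ses_group"])
        totals[cell] += 1
        responders[cell] += int(rec["responded"])

    weights = []
    for rec in records:
        cell = (rec["gender"], rec["ses_group"])
        if rec["responded"]:
            # Inverse of the cell response rate; responders[cell] >= 1 here.
            weights.append(totals[cell] / responders[cell])
        else:
            weights.append(0.0)  # nonrespondents contribute no outcome data
    return weights


# Toy sample: half of low-SES women respond, all high-SES men respond.
toy_sample = [
    {"gender": "F", "ses_group": "low", "responded": True},
    {"gender": "F", "ses_group": "low", "responded": True},
    {"gender": "F", "ses_group": "low", "responded": False},
    {"gender": "F", "ses_group": "low", "responded": False},
    {"gender": "M", "ses_group": "high", "responded": True},
    {"gender": "M", "ses_group": "high", "responded": True},
]
toy_weights = inverse_probability_weights(toy_sample)
```

Respondents from cells with low response rates receive larger weights, so the weighted respondents reproduce the composition of the original sample on the variables used to define the cells.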

Results

All measures are described in Table 1, separately for the full sample and for women and men in the top 25 percent and bottom 25 percent of the gender-specific distributions of facial attractiveness; for ease of discussing Table 1, the top 25 percent will be referred to as the “attractive” and the bottom 25 percent will be referred to as the “unattractive.”

Among both women and men, the attractive and unattractive differ substantially and significantly with respect to all four measures of childhood socioeconomic circumstances. The attractive come from higher income families, better educated parents, and fathers with better jobs.

However, for the other potential confounders of the association between adolescent attractiveness and later life morbidity, cognition, and mortality—adolescent RBM, height, IQ, and adolescent health—Table 1 shows significant and sizable differences between attractive and unattractive women but not between attractive and unattractive men. For instance: Attractive women are almost a standard deviation lower in RBM than unattractive women; attractive and unattractive men have statistically equivalent RBMs. Attractive women’s IQ scores are about a quarter of a standard deviation higher than unattractive women’s IQ scores; there is no such difference among men. Attractive women were in better self-assessed overall health in adolescence than unattractive women; the same is not true among men.

In Figure 3 we explore this finding further. Separately for women and men, we report correlations (and 95 percent confidence intervals) between our measure of facial attractiveness and measures of RBM, selected measures of childhood socioeconomic circumstances, IQ, and selected measures of childhood health (specifically, the count of childhood illnesses). Facial attractiveness and childhood health are basically uncorrelated for both women and men. Facial attractiveness is positively and statistically significantly correlated with childhood socioeconomic circumstances and IQ; notably, those correlations are substantially larger (about twice as large) for women compared with men. Most strikingly, whereas the correlation between facial attractiveness and RBM is essentially zero for men, it is significant and large (−0.34) among women. As we explore later, this basic empirical finding has important implications for our understanding of associations between adolescent facial attractiveness and later life health.
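The article does not say how the confidence intervals in Figure 3 were computed. One common approach is the Fisher z transformation, sketched here as an assumption rather than a description of the authors' method; the sample size in the example is hypothetical.

```python
import math


def correlation_ci(r: float, n: int, z_crit: float = 1.959964):
    """Approximate confidence interval for a Pearson correlation.

    Uses the Fisher z transformation: atanh(r) is approximately normal
    with standard error 1 / sqrt(n - 3). The default z_crit gives a
    95 percent interval.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must exceed 3")
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# The correlation of -0.34 reported for women, with a hypothetical n of 2,000.
lower, upper = correlation_ci(-0.34, 2000)
```

Because the interval is built on the transformed scale and then mapped back, it is slightly asymmetric around the point estimate and always stays inside the range from −1 to 1.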

Figure 3.

Correlations between facial attractiveness and adolescent body mass index, childhood socioeconomic status, IQ, and childhood health, by sex.

The next sections of Table 1 report descriptive statistics for our measures of cognitive, psychological, physical health, and mortality outcomes. For men, only three of these many measures (hostility, lung capacity, and leg strength), about as many as we would expect by chance, differ significantly between attractive and unattractive men; if we applied a Bonferroni correction, none would achieve statistical significance. That is, at the bivariate level, male adolescents who are judged to be attractive fare no better (or worse) decades later than male adolescents who are judged to be unattractive.

In contrast, among women we see statistically significant and substantively meaningful differences on about half of the outcomes between those judged to be attractive and those judged to be unattractive; most achieve significance even when applying a Bonferroni correction. For example, women judged to be attractive in adolescence do much better decades later on four of the six cognitive measures; they have lower levels of psychological distress; they have lower adult BMIs; they have better self-assessed overall health; they are less likely to report having been diagnosed with hypertension, high blood sugar, or diabetes; and they are less likely to be dead. In other words, adolescent facial attractiveness is predictive of a wide range of cognitive, health, and mortality outcomes, but only among women.

Are these associations between adolescent attractiveness and later life morbidity and mortality, at least for women, confounded by adolescent RBM, socioeconomic circumstances, health, and/or IQ?

In Table 2, we report the results of multivariate models that estimate the independent effect of adolescent facial attractiveness on cognition, psychological well-being, and anthropometric outcome measures net of the set of confounders described earlier and in Table 1. In Tables 3 and 4, we estimate similar models for self-reported health outcomes and mortality. In all three tables, we report estimated effects for women, estimated effects for men, and the significance of gender differences in those estimated effects.

Table 2.

Regressions of Cognitive, Psychological Well-Being, and Anthropometric Outcomes at Approximately Age 72 on Ratings of Facial Attractiveness at Approximately Age 18.

		Women		Men
Outcome Measure	n	b	SE	b	SE
Cognitive outcomes, age 72
Verbal fluency (letter F task)	4,447	.136	.064*	.055	.076
Working memory (digit span task)	3,751	.005	.025	.027	.028
Short-term memory (immediate-recall task)	3,731	.060	.026*	.031	.028
Short-term memory (delayed-recall task)	3,727	.022	.033	.046	.032
Abstract reasoning (WAIS task)	4,780	.058	.034	.052	.039
Inductive reasoning (number-series task)	4,735	.040	.054	.017	.060
Psychological well-being outcomes, age 72
Psychological distress (CES-D)	4,064	.069	.268	.131	.247
Anxiety score	4,065	.160	.140	.110	.141
Anger score	4,069	−.042	.086	.118	.102
Hostility score	4,055	.005	.038	.083	.038*
Anthropometric outcomes, age 72
Lung function (peak flow measure)	4,640	1.529	1.353	6.463	2.289**
Leg strength (chair rise task)	4,411	−.094	.054	−.087	.064
Grip strength (dynamometer)	4,699	.196	.099*	.367	.153*
Walking speed	4,643	−.026	.024	−.020	.021
Body mass index	4,666	−.342	.096**^††	.015	.076

Note: The analytic sample is restricted to people who responded to the 2011 computer-assisted telephone interview survey and for whom attractiveness ratings of yearbook photographs are available. Coefficients represent the expected change in the outcome from a one-unit increase in the normalized mean attractiveness rating. All models include (1) controls for height, adolescent relative body mass, family socioeconomic background, IQ, and childhood health and (2) interactions between gender and all covariates. Analyses use a poststratification weight to account for selective patterns of nonresponse to the 2011 survey. See text for more details.

*p < .05 and **p < .01 for hypothesis tests about coefficients for attractiveness rating. ^††p < .01 for hypothesis tests about gender differences in coefficients for attractiveness rating.

Table 3.

Regressions of Physical Health and Mortality Outcomes at Approximately Age 72 on Ratings of Facial Attractiveness at Approximately Age 18.

		Women		Men
Outcome Measure	n	b	SE	b	SE
Other physical health outcomes, age 72
Self-assessed overall health	4,904	.003	.008	.009	.008
Interviewer-assessed overall health	4,799	.012	.035	−.155	.131
Overall health (SF-12)	3,941	.058	.036	−.124	.122
Days in bed because of illness/injury	4,296	.094	.305	−.328	.331
Hypertension (self-reported)	4,897	−.020	.008*	.000	.009
High blood sugar (self-reported)	4,886	−.017	.006**	.005	.008
Diabetes (self-reported)	4,894	−.017	.006**	−.002	.007
Cancer (self-reported)	4,898	.006	.006	.000	.007
Heart disease (self-reported)	4,894	−.006	.006	−.004	.008
Stroke (self-reported)	4,895	.002	.004	.009	.004*
HUI summary score	4,405	.004	.004	−.001	.004
HUI vision score	4,824	.002	.001	.000	.001
HUI hearing score	4,584	.001	.001	.000	.002
HUI speech score	4,883	.003	.001**	.001	.001
HUI ambulation score	4,880	.007	.003*	.000	.003
HUI dexterity score	4,904	.002	.002	.002	.001
HUI emotion score	4,870	−.002	.002	−.001	.001
HUI cognition score	4,901	.002	.002	−.003	.002
HUI pain score	4,901	.006	.004	.004	.005
Mortality and longevity
Deceased (yes = 1, no = 0) by 2017	8,434	−.012	.006*	−.007	.006
Age at death (hazard model)	8,434	−.050	.024*	−.023	.020

*p < .05 and **p < .01 for hypothesis tests about coefficients for attractiveness rating.

Table 4.

Regressions of Selected Morbidity and Mortality Outcomes at Approximately Age 72 on Ratings of Facial Attractiveness at Approximately Age 18, Net of Mediating Variables.

	Women						Men
	Unmediated Model (from Tables 2 and 3)			Mediation Model			Unmediated Model (from Tables 2 and 3)		Mediation Model
Outcome Measure	n	b	SE	b	SE	Change	b	SE	b	SE	Change
Cognitive outcomes, age 72
Verbal fluency (letter F task)	4,447	.136	.064*	.101	.064	−.035	.055	.076	.060	.074	.005
Short-term memory (immediate recall)	3,731	.060	.026*	.056	.026*	−.005	.031	.028	.033	.028	.003
Anthropometric outcomes, age 72
Lung function (peak flow measure)	4,640	1.529	1.353	1.244	1.365	−.284	6.463	2.289**	6.744	2.190**	.281
Grip strength (dynamometer)	4,699	.196	.099*	.135	.101	−.061	.367	.153*	.378	.152*	.011
Body mass index	4,666	−.342	.096**	−.327	.096**	.015^††	.015	.076	.002	.075	−.013
Other physical health outcomes, age 72
Hypertension (self-reported)	4,897	−.020	.008*	−.019	.008*	.001	.000	.009	.000	.009	.000
High blood sugar (self-reported)	4,886	−.017	.006**	−.015	.006*	.002^†	.005	.008	.006	.008	.000
Diabetes (self-reported)	4,894	−.017	.006**	−.015	.006**	.002	−.002	.007	−.002	.007	.000
HUI speech score	4,883	.003	.001**	.003	.001**	.000	.001	.001	.001	.001	.000
HUI ambulation score	4,880	.007	.003*	.007	.003*	.000	.000	.003	.000	.003	.001
Mortality and longevity
Deceased (yes = 1, no = 0) by 2017	8,434	−.012	.006*	−.012	.006*	.000	−.007	.006	−.008	.006	.000
Age at death (hazard model)	8,434	−.050	.024*	−.049	.024*	.001	−.023	.020	−.026	.021	−.003

Note: The analytic sample is restricted to people who responded to the 2011 computer-assisted telephone interview survey and for whom attractiveness ratings of yearbook photographs are available. Coefficients represent the expected change in the outcome from a one-unit increase in the normalized mean attractiveness rating. All models include (1) controls for height, adolescent relative body mass, family socioeconomic background, IQ, and childhood health; (2) interactions between gender and all covariates; and (3) mediators, including education and midlife socioeconomic status, family history, health behaviors, and psychological measures. Analyses use a poststratification weight to account for selective patterns of nonresponse to the 2011 survey. See text for more details. HUI = Health Utilities Index.

*p < .05 and **p < .01 for hypothesis tests about coefficients for attractiveness rating. ^†p < .05 and ^††p < .01 for hypothesis tests about gender differences in coefficients for attractiveness rating (in unmediated model).

Among men, Tables 2 through 4 show that (net of confounders) adolescent facial attractiveness is significantly associated with just four of 35 outcomes. For two of the four (the hostility scale and self-reported stroke), more attractive men had worse outcomes. Thus, for only two outcomes (lung function and grip strength) did we observe that men rated as attractive in adolescence enjoyed better health outcomes (all else equal) in later life. Neither coefficient is statistically significant if we apply a Bonferroni correction. Moreover, for these two outcomes the implied effect sizes are very small (with a full standard deviation increase in attractiveness leading to increases in lung function and grip strength of about 1/16th and 1/19th of a standard deviation, respectively). In short, for men we found little descriptive evidence in Table 1 that adolescent attractiveness was associated at the bivariate level with our outcomes; net of confounders, we find virtually no evidence of meaningful effects of adolescent attractiveness on men’s morbidity or mortality.

Among women, Tables 2 through 4 tell a somewhat different story. For 11 of the 35 outcomes, women rated as more attractive in adolescence enjoyed better cognitive, health, and mortality outcomes (all else equal). Net of confounders, more attractive women had better short-term memory; better verbal fluency; lower BMIs; better grip strength; less hypertension, high blood sugar, and diabetes; better HUI speech and ambulation scores; and lower rates of mortality. However, in contrast to the descriptive results in Table 1, (1) only one coefficient is statistically significant if we apply a Bonferroni correction, and (2) many of the implied effect sizes for women are modest. The lone exception, and the lone statistically significant result if we apply a Bonferroni correction, is that a 1 standard deviation increase in adolescent attractiveness is associated with a reduction in women’s BMI at age 72 (all else equal, including adolescent RBM). For only two outcomes (BMI at age 72 and self-reported high blood sugar) do our models suggest that (all else equal) the effects of adolescent attractiveness on these outcomes are statistically significantly different for women and men.
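The arithmetic behind a Bonferroni correction across 35 outcomes can be sketched as follows. The coefficients and standard errors are taken from Table 2; the p-values use a normal approximation to the test statistic, which is an assumption made here because the article reports only significance stars. Function and variable names are illustrative.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def expected_chance_findings(alpha: float, n_tests: int) -> float:
    """Expected number of significant results if every null hypothesis were true."""
    return alpha * n_tests


def two_sided_p(b: float, se: float) -> float:
    """Two-sided p-value for b / se under a standard normal approximation."""
    z = abs(b / se)
    return math.erfc(z / math.sqrt(2.0))


N_OUTCOMES = 35
threshold = bonferroni_threshold(0.05, N_OUTCOMES)
chance_findings = expected_chance_findings(0.05, N_OUTCOMES)

# Table 2 coefficients: women's body mass index and men's lung function.
p_women_bmi = two_sided_p(-0.342, 0.096)
p_men_lung = two_sided_p(6.463, 2.289)
```

Under this approximation the coefficient for women's BMI falls below the corrected threshold while the coefficient for men's lung function does not. Exact p-values from the weighted, multiply imputed models may differ somewhat from these approximations.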

In short, we find limited evidence of independent effects of adolescent attractiveness on cognitive, health, and mortality outcomes among women; those effects are generally small in magnitude, at least compared with the descriptive results in Table 1. This suggests that the associations we observe among women in Table 1 are mainly spurious, confounded by factors such as adolescent health, adolescent RBM, and IQ. We find almost no evidence that adolescent attractiveness affects men’s later life cognitive, health, or mortality outcomes.

Discussion

We investigated the long-term cognitive, health, and mortality consequences of people’s evaluations of how attractive other people’s faces are. These evaluations can have profound consequences for the individuals whose bodies are being evaluated, especially when those evaluations are connected to social, cultural, or political systems of racism, sexism, or ableism.

We went beyond previous research by (1) focusing more carefully on whether associations between ratings of facial attractiveness and life outcomes are plausibly causal or whether they may be confounded by factors correlated with judgements about attractiveness and (2) carefully considering how each of these processes differs for women and men.

At the outset we discussed two theoretical perspectives that suggest we might expect to find a correlation between facial attractiveness in adolescence and later life cognitive and health outcomes. From the beauty norm discrimination perspective—the idea that associations between adolescent attractiveness and later life health and well-being are due to lifelong patterns of discrimination against people whose appearance does not meet conventional standards of beauty and attractiveness—we expect that associations between adolescent attractiveness and later life outcomes should be at least partly causal. Through various discriminatory processes, those judged to be less attractive experience negative social and economic consequences; those consequences have negative long-term implications for health and well-being. We found little empirical support for this perspective, especially among men.

We also have reason to be skeptical that the second theoretical perspective, the mate selection perspective, explains why more attractive (female) adolescents had better health and cognitive outcomes decades later. For one thing, the correlations between ratings of attractiveness and things like healthiness, intelligence, and socioeconomic resources in Figure 3 were small in magnitude, especially among men. If attractiveness is a proxy for attributes upon which people base mate selection decisions, it is (empirically speaking) not a very good proxy.

We suspect that what is going on has more to do with (1) gendered differences in cultural expectations of attractiveness and the cost of buying that attractiveness and (2) gendered differences in the nature and size of biases against overweight or obese people. First, we suspect that men are less able and perhaps less motivated to buy culturally defined beauty. Whereas men in this cohort wore similar clothes and had similar hairstyles (see Figures 1 and 2), women’s fashion and hairstyles were more variable and more expressive—and thus more amenable to improvement through spending money.

Second, RBM may mean different things for young men and women. Among men, it is perhaps difficult to distinguish obesity from muscularity in a facial photograph; few women are visibly muscular, and so nonthin women are more likely than nonthin men to be perceived to be overweight. We also suspect that people evaluating attractiveness may more harshly penalize heavier women in their ratings than they do heavier men. This would explain the very strong correlation between RBM and adolescent facial attractiveness among women and the nonexistence of that correlation among men. Because adolescent obesity is predictive of later life health problems, and because ratings of attractiveness are so closely tied to weight among women, we should be surprised neither by the gender differences in the bivariate associations shown in Table 1 nor by the attenuation of those associations in Tables 2 and 3.

Our first main conclusion, then, as illustrated in Figure 3, is that when people evaluate women’s facial attractiveness, they also (at least implicitly) consider women’s socioeconomic status and their RBM. In contrast, when they evaluate men’s facial attractiveness, they consider men’s socioeconomic status (although not to the same degree as for women) but not their weight. Apparently, what raters evaluate when they look at women’s faces differs from what they evaluate when they look at men’s faces.

The standard that people use when evaluating young men’s attractiveness is modestly correlated with their socioeconomic circumstances: advantaged men tend to be judged more attractive, perhaps because they are able to purchase somewhat more fashionable clothes, afford better haircuts or dental care, or otherwise pay for style. However, ratings of adolescent men’s facial attractiveness do not depend on their intelligence, their healthiness, or their weight; the latter may have to do with the fact that it is difficult to discern a very muscular man from an overweight man in a facial photograph. A man can be heavier, less intelligent, or less healthy and ratings of his facial attractiveness will not suffer.

In contrast, the standard that people use when evaluating young women’s facial attractiveness is more highly correlated with those things, especially RBM. Women who are judged to be more attractive tend to come from socioeconomically advantaged families, but they also tend to have lower RBM. Heavier women are judged to be less attractive. Although we do not formally test this idea in our analyses, we speculate that socioeconomic status may ultimately drive many of these patterns. Adolescent intelligence, weight, and healthiness are all influenced by childhood family socioeconomic status; advantaged families can use their resources for improved nutrition, better educational opportunities and experiences, better pediatric medical and dental care, and health-inducing recreational activities. We also speculate that gender differences in the correlations between attractiveness and socioeconomic status may be related to differential investments in young boys and young girls in this era, especially in families with constrained resources.

These findings about how people differentially rate the attractiveness of men’s and women’s faces are fundamentally interesting, but they also have major implications for our main research objective: especially for women, bivariate associations between ratings of adolescent facial attractiveness and any health or mortality outcome later in life are confounded by factors such as early life health, IQ, family socioeconomic standing, and, especially, body weight. As we reviewed earlier, prior research on the impact of adolescent attractiveness on subsequent health and well-being has not adequately accounted or adjusted for these confounders; its results are thus of questionable validity, especially as they pertain to women.

Our second main conclusion is that associations between ratings of adolescent facial attractiveness and later life health and cognition are mostly nonexistent among men and mostly spurious among women. In our descriptive analyses, we see sizable and statistically significant associations between ratings of adolescent facial attractiveness and many cognitive, health, and mortality outcomes later in life, but only for women. When we adjust for childhood socioeconomic status, adolescent body mass, adolescent health, and intelligence, we observe that among women those associations are considerably attenuated and smaller in magnitude. Ratings of adolescent attractiveness have little or no causal impact on later life health and mortality, and then only among women. This basic conclusion stands in sharp contrast to earlier research, even using WLS data; those prior findings were driven by their failure to adjust adequately for the confounding roles of factors associated with ratings of women’s facial attractiveness.

Four limitations of our analyses are worth noting. First, our data include no high school dropouts; thus, the generalizability of our conclusions is somewhat limited. Second, because our data are from a predominantly white state, and because whites were overrepresented among high school graduates in this place and time, we can say nothing about how any of these processes work among people racialized as Black, Latine, or members of other groups. Third, it is not ideal that ratings of attractiveness happened so many years after 1957. Although it is a virtue that we rely on ratings by people from the same state and birth cohort, it is a possible weakness that beauty norms may have changed in ways that altered our findings. Fourth, there may have been some selective mortality prior to 2011 such that those who survived until that year exhibit different relationships between predictor and outcome variables. In the end, the virtues of our data outweigh their limitations, but we encourage others to collect and analyze data that overcome these shortcomings.

We conclude that ratings of adolescent facial attractiveness have little causal impact on later life health, mortality, or cognition; any causal effects are small in magnitude and seen only among women. Our analyses point instead to the pervasiveness of gender differences in the nature and bases of beauty norms. The structure and functioning of these beauty norms induce largely spurious correlations between adolescent attractiveness and later life outcomes.

Appendix

Appendix A.

Logistic Regressions of Physical Health Outcomes at Approximately Age 72 on Ratings of Facial Attractiveness at Approximately Age 18.

		Women		Men
Outcome Measure	n	exp(b)	SE	exp(b)	SE
Other physical health outcomes, age 72
Self-assessed overall health	4,904	1.025	.033	1.055	.036
Interviewer-assessed overall health	4,799	1.050	.033	1.039	.036
Hypertension (self-reported)	4,897	.919	.033*	.998	.036
High blood sugar (self-reported)	4,886	.898	.042*^†	1.026	.040
Diabetes (self-reported)	4,894	.874	.046**	.989	.044
Cancer (self-reported)	4,898	1.049	.042	1.000	.044
Heart disease (self-reported)	4,894	.955	.042	.986	.038
Stroke (self-reported)	4,895	1.039	.074	1.166	.072*

*p < .05 and **p < .01 for hypothesis tests about coefficients for attractiveness rating. ^†p < .05 for hypothesis tests about gender differences in coefficients for attractiveness rating.

Acknowledgements

This article was prepared for presentation at the 2021 meetings of the American Sociological Association. Generous support for this project has been provided by the Minnesota Population Center, which receives core funding (P2C HD041023) from the Eunice Kennedy Shriver National Institute for Child Health and Human Development. We sincerely thank Joe Savard, Kamil Sicinski, and Carol Roan at the WLS for help with technical support and Katie Berry, Eric Grodsky, Jonas Helgertz, and several anonymous reviewers for useful feedback. However, errors and omissions are our responsibility.

Data Availability Statement

The data underlying this article are freely available at .

ORCID iD

John Robert Warren

Author Biographies

John Robert Warren is director of the Institute for Social Research and Data Innovation and a professor of sociology at the University of Minnesota. He is also codirector of the Education Studies for Healthy Aging Research project, codirector of the Population Health Sciences training program at the Minnesota Population Center, and co–principal investigator of the IPUMS–Current Population Survey project. He has worked on the WLS since his first day of graduate school.

Gina Rumore is former director of the Minnesota Population Center’s Development Core. Her educational background is in biological research, writing, and science studies. She completed postdoctoral research in the ecology, evolution, and behavior program; has worked doing freelance science writing; consults on grant proposal writing; and developed a senior thesis course for the History of Medicine Program at the University of Minnesota.

References

Biddle, Jeff E., and Daniel S. Hamermesh. 1998. “Beauty, Productivity, and Discrimination: Lawyers’ Looks and Lucre.” Journal of Labor Economics 16(1):172–201.

Biener, Adam, John Cawley, and Chad Meyerhoefer. 2018. “The Impact of Obesity on Medical Care Costs and Labor Market Outcomes in the US.” Clinical Chemistry 64(1):108–17.

Bock, Richard Darrell, and Lyle V. Jones. 1968. The Measurement and Prediction of Judgment and Choice. San Francisco, CA: Holden-Day.

Böckerman, Petri, John Cawley, Jutta Viinikainen, Terho Lehtimäki, Suvi Rovio, Ilkka Seppälä, Jaakko Pehkonen, et al. 2019. “The Effect of Weight on Labor Market Outcomes: An Application of Genetic Instrumental Variables.” Health Economics 28(1):65–77.

Bossavie, Laurent, Harold Alderman, John Giles, and Cem Mete. 2017. “The Effect of Height on Earnings: Is Stature Just a Proxy for Cognitive and Non-cognitive Skills?” Washington, DC: World Bank.

Broer, Peter Niclas, Sabrina Juran, Yuen-Jong Liu, Katie Weichman, Neil Tanna, Marc E. Walker, Reuben Ng, et al. 2014. “The Impact of Geographic, Ethnic, and Demographic Dynamics on the Perception of Beauty.” Journal of Craniofacial Surgery 25(2):e157–61.

Conway, Andrew R. A., Michael J. Kane, Michael F. Bunting, D. Zach Hambrick, Oliver Wilhelm, and Randall W. Engle. 2005. “Working Memory Span Tasks: A Methodological Review and User’s Guide.” Psychonomic Bulletin & Review 12(5):769–86.

Cox, David R. 1972. “Regression Models and Life-Tables.” Journal of the Royal Statistical Society: Series B (Methodological) 34(2):187–202.

Deng, Weiguang, Dayang Li, and Dong Zhou. 2020. “Beauty and Job Accessibility: New Evidence From a Field Experiment.” Journal of Population Economics 33(4):1303–41.

Eberhardt, Jennifer L., Paul G. Davies, Valerie J. Purdie-Vaughns, and Sheri Lynn Johnson. 2006. “Looking Deathworthy: Perceived Stereotypicality of Black Defendants Predicts Capital-Sentencing Outcomes.” Psychological Science 17(5):383–86.

Foo, Yong Zhi, Leigh W. Simmons, and Gillian Rhodes. 2017. “Predictors of Facial Attractiveness and Health in Humans.” Scientific Reports 7:39731.

Gallup, Gordon G., Jr., and David A. Frederick. 2010. “The Science of Sex Appeal: An Evolutionary Perspective.” Review of General Psychology 14(3):240–50.

Grammer, Karl, Bernhard Fink, Anders P. Møller, and Randy Thornhill. 2003. “Darwinian Aesthetics: Sexual Selection and the Biology of Beauty.” Biological Reviews 78(3):385–407.

Gu, Tianzhu, and Yueqing Ji. 2019. “Beauty Premium in China’s Labor Market: Is Discrimination the Main Reason?” China Economic Review 57:101335.

Gupta, Nabanita Datta, Nancy L. Etcoff, and Mads M. Jaeger. 2016. “Beauty in Mind: The Effects of Physical Attractiveness on Psychological Well-Being and Distress.” Journal of Happiness Studies 17(3):1313–25.

Haas, Steven A. 2007. “The Long-Term Effects of Poor Childhood Health: An Assessment and Application of Retrospective Reports.” Demography 44(1):113–35.

Havari, Enkelejda, and Fabrizio Mazzonna. 2015. “Can We Trust Older People’s Statements on Their Childhood Circumstances? Evidence from SHARELIFE.” European Journal of Population 31(3):233–57.

Henderson, Joshua J. A., and Jeremy M. Anglin. 2003. “Facial Attractiveness Predicts Longevity.” Evolution and Human Behavior 24(5):351–56.

Herd, P., D. Carr, and C. Roan. 2014. “Cohort Profile: Wisconsin Longitudinal Study (WLS).” International Journal of Epidemiology 43(1):34–41.

Horsman, John, William Furlong, David Feeny, and George Torrance. 2003. “The Health Utilities Index (HUI®): Concepts, Measurement Properties and Applications.” Health and Quality of Life Outcomes 1(1):1–13.

Hume, Deborah K., and Robert Montgomerie. 2001. “Facial Attractiveness Signals Different Aspects of ‘Quality’ in Women and Men.” Evolution and Human Behavior 22(2):93–112.

Jæger, Mads Meier. 2011. “‘A Thing of Beauty Is a Joy Forever’? Returns to Physical Attractiveness over the Life Course.” Social Forces 89(3):983–1003.

Jenkinson, Crispin, Richard Layte, Damian Jenkinson, Kate Lawrence, Sophie Petersen, Colin Paice, and John Stradling. 1997. “A Shorter Form Health Survey: Can the SF-12 Replicate Results From the SF-36 in Longitudinal Studies?” Journal of Public Health 19(2):179–86.

Jokela, Markus. 2009. “Physical Attractiveness and Reproductive Success in Humans: Evidence from the Late 20th Century United States.” Evolution and Human Behavior 30(5):342–50.

Kalick, S. Michael, Leslie A. Zebrowitz, Judith H. Langlois, and Robert M. Johnson. 1998. “Does Human Facial Attractiveness Honestly Advertise Health? Longitudinal Data on an Evolutionary Question.” Psychological Science 9(1):8–13.

Karraker, Amelia, Kamil Sicinski, and Donald Moynihan. 2017. “Your Face Is Your Fortune: Does Adolescent Attractiveness Predict Intimate Relationships Later in Life?” Journals of Gerontology: Series B 72(1):187–99.

Kim, Keuntae. 2014. “Who Lives Longer and Healthier? The Role of Personality, Facial Attractiveness, and Intelligence.” Korean Journal of Sociology 48(6):1–30.

Kim, Tae Hyun, and Euna Han. 2017. “Height Premium for Job Performance.” Economics & Human Biology 26:13–20.

Kim, Tae Jun, and Olaf von dem Knesebeck. 2018. “Income and Obesity: What Is the Direction of the Relationship? A Systematic Review and Meta-analysis.” BMJ Open 8(1):e019862.

King, Ryan D., and Brian D. Johnson. 2016. “A Punishing Look: Skin Tone and Afrocentric Features in the Halls of Justice.” American Journal of Sociology 122(1):90–124.

Lachman, Margie E., Stefan Agrigoroaei, Patricia A. Tun, and Suzanne L. Weaver. 2014. “Monitoring Cognitive Functioning: Psychometric Properties of the Brief Test of Adult Cognition by Telephone.” Assessment 21(4):404–17.

Langlois, Judith H., Lisa Kalakanis, Adam J. Rubenstein, Andrea Larson, Monica Hallam, and Monica Smoot. 2000. “Maxims or Myths of Beauty? A Meta-analytic and Theoretical Review.” Psychological Bulletin 126(3):390.

Law Smith, Miriam J., David I. Perrett, Benedict C. Jones, R. Elisabeth Cornwell, Fhionna R. Moore, David R. Feinberg, Lynda G. Boothroyd, et al. 2006. “Facial Appearance Is a Cue to Oestrogen Levels in Women.” Proceedings of the Royal Society B: Biological Sciences 273(1583):135–40.

Lezak, Muriel Deutsch, Diane B. Howieson, David W. Loring, and Jill S. Fischer. 2004. Neuropsychological Assessment. New York: Oxford University Press.

Liu, Xing, and Eva Sierminska. 2014. “Evaluating the Effect of Beauty on Labor Market Outcomes: A Review of the Literature.” Working Paper Series 11. Luxembourg: Luxembourg Institute of Socio-economic Research.

Meland, Sheri A. 2002. “Objectivity in Perceived Attractiveness: Development of a New Methodology for Rating Facial Physical Attractiveness.” Master’s thesis, Department of Sociology, University of Wisconsin–Madison.

Mobius, Markus M., and Tanya S. Rosenblat. 2006. “Why Beauty Matters.” American Economic Review 96(1):222–35.

Monk, Ellis P., Jr. 2021a. “Colorism and Physical Health: Evidence from a National Survey.” Journal of Health and Social Behavior 62(1):37–52.

Monk, Ellis P., Jr. 2021b. “The Unceasing Significance of Colorism: Skin Tone Stratification in the United States.” Daedalus 150(2):76–90.

Monk, Ellis P., Jr., Michael H. Esposito, and Hedwig Lee. 2021. “Beholding Inequality: Race, Gender, and Returns to Physical Attractiveness in the United States.” American Journal of Sociology 127(1):194–241.

Morris, John C., Albert Heyman, Richard C. Mohs, J. P. Hughes, Gerald van Belle, G.D.M.E. Fillenbaum, E. D. Mellits, et al. 1989. “The Consortium to Establish a Registry for Alzheimer’s Disease (CERAD): I. Clinical and Neuropsychological Assessment of Alzheimer’s Disease.” Neurology 39(9):1159–65.

Radloff, Lenore Sawyer. 1977. “The CES-D Scale: A Self-Report Depression Scale for Research in the General Population.” Applied Psychological Measurement 1(3):385–401.

Reither, Eric N. 2004. “Why Are Our Waistlines Expanding? Age-Period-Cohort Analyses of the Obesity Epidemic and a Critical Examination of Mass Preparation Theory.” Doctoral dissertation, Department of Sociology, University of Wisconsin–Madison.

Reither, Eric N., Robert M. Hauser, and Karen C. Swallen. 2009. “Predicting Adult Health and Mortality From Adolescent Facial Characteristics in Yearbook Photographs.” Demography 46(1):27–41.

Rhodes, Gillian. 2006. “The Evolutionary Psychology of Facial Beauty.” Annual Review of Psychology 57:199–226.

Royston, P. 2009. “Multiple Imputation of Missing Values: Further Update of ICE, with an Emphasis on Categorical Variables.” Stata Journal 9(3):466–77.

Royston, P., and I. R. White. 2011. “Multiple Imputation by Chained Equations (MICE): Implementation in Stata.” Journal of Statistical Software 45(4):1–20.

Sala, Emanuela, Marco Terraneo, Mario Lucchini, and Gundi Knies. 2013. “Exploring the Impact of Male and Female Facial Attractiveness on Occupational Prestige.” Research in Social Stratification and Mobility 31:69–81.

Scholz, John Karl, and Kamil Sicinski. 2015. “Facial Attractiveness and Lifetime Earnings: Evidence from a Cohort Study.” Review of Economics and Statistics 97(1):14–28.

Shackelford, Todd K., and Randy J. Larsen. 1999. “Facial Attractiveness and Physical Health.” Evolution and Human Behavior 20(1):71–6.

Singh, Devendra, and Dorian Singh. 2011. “Shape and Significance of Feminine Beauty: An Evolutionary Perspective.” Sex Roles 64(9–10):723–31.

Smith, James P. 2009. “Reconstructing Childhood Health Histories.” Demography 46(2):387–403.

Spielberger, C. D. 1980. Preliminary Manual for the State-Trait Anger Scale (STAS). Tampa: University of South Florida Human Resources Institute.

Spielberger, C. D. 1988. State-Trait Anger Expression Inventory: Professional Manual. Odessa, FL: Psychological Assessment Resources.

Spielberger, C. D., R. L. Gorsuch, and R. E. Lushene. 1970. STAI: Manual for the State Trait Anxiety Inventory. Palo Alto, CA: Consulting Psychologists Press.

Thornhill, Randy, and Steven W. Gangestad. 1999. “Facial Attractiveness.” Trends in Cognitive Sciences 3(12):452–60.

Torgerson, Warren S. 1958. Theory and Methods of Scaling. New York: John Wiley.

Wechsler, David. 1955. Manual for the Wechsler Adult Intelligence Scale. New York: Psychological Corporation.

Weeden, Jason, and John Sabini. 2005. “Physical Attractiveness and Health in Western Societies: A Review.” Psychological Bulletin 131(5):635–53.

Zebrowitz, Leslie A., Judith A. Hall, Nora A. Murphy, and Gillian Rhodes. 2002. “Looking Smart and Looking Good: Facial Cues to Intelligence and Their Origins.” Personality and Social Psychology Bulletin 28(2):238–49.

Adolescent Facial Attractiveness and Later Life Morbidity, Cognition, and Mortality
