Abstract
There is an emerging interest in psychology in examining personality in history, yet the implicit personality models of past historical periods have received little attention. Drawing upon the lexical hypothesis, this study examined the personality concepts and implicit model of personality in early 20th-century American literature, focusing on the novels of F. Scott Fitzgerald. A list of 1,710 personality-descriptive adjectives and 100 marker terms of the Big Five model were used to code the descriptions of the novels’ characters (N = 169). Exploratory factor analysis identified a seven-factor structure that could be interpreted with reference to the cultural-historical context of the novels, encompassing themes of dynamism, agility, expressivity, self-regulation, and social-relational functioning. Direct comparisons with structures obtained using the same list of words in contemporary self-ratings and in similar text-based analyses of 19th-century English novels from earlier research indicated overall low congruence. Still, some correspondence with the Big Five was observed at the level of marker scales of individual factors, and some factors were qualitatively similar to those in the earlier authors’ models. The findings are discussed with reference to the potential of the presented approach to advance the study of personality structure in history.
Introduction
Personality has occupied people’s thoughts since the beginning of time. Personality judgments have been identified in ancient Chinese, Greek, Indian, and Middle Eastern texts from the Great Transformation (1000 – 200 BCE; Mayer et al., 2011). Contemporary personality models like the Big Five are often viewed as products of evolutionary processes rooted in our distant past (Buss, 1991). However, recent research and theoretical perspectives suggest that the Big Five may represent a modern personality framework shaped by cultural and environmental factors, which continue to influence the formation and evolution of personality constructs (Durkee et al., 2022; Fischer, 2017). Put differently, personality descriptions are known to evolve with time. Tracing such changes can shed light on the evolution of the underlying personality constructs (Du et al., 2024). The extent to which the salience of individual traits and the implicit personality models – the beliefs about how traits are interrelated – remain consistent or change across historical periods forms an intriguing question in personality research. Yet, the scientific study of personality through history has only recently started to emerge as part of a broader engagement with historical topics in psychology (Atari & Henrich, 2023). The current study introduces an approach to personality in historical context based on an examination of personality constructs and an implicit model of personality in early 20th-century American literature, focusing on the works of F. Scott Fitzgerald.
Personality Traits in the Past
Questions about personality characteristics in the past have received some attention in empirical research in recent years. Most research has employed modern models and measures to describe historical or mythological figures (e.g., McCrae et al., 2012; Passakos & De Raad, 2009; Ritzler & Singer, 1998) or the salience of various traits in specific historical periods (e.g., Johnson et al., 2011; Roivainen, 2013; Ye et al., 2018). For example, Passakos and De Raad (2009) examined Homer’s Iliad. The authors categorized the 1,057 trait-epithets they identified based on the Big Five factors, noting that most words aligned with dimensions of agreeableness and conscientiousness. Similarly, Johnson et al. (2011) obtained ratings by readers and experts on the characters of Victorian novels with reference to the Big Five model. They found agreeableness to be the most prominent Big Five dimension. While this previous work presents a first step into the exploration of personality characteristics in the past, it was built on the assumption that contemporary personality models (i.e., the Big Five) can be directly applied to earlier periods. This approach can be considered top-down as the researchers classify personality descriptions along pre-existing, theory-driven categories. However, the assumption that a modern model applies to earlier historical periods is not self-evident. Questions surrounding the implicit personality models of past eras have remained largely unexplored.
The Lexical Hypothesis: An Approach to Personality in Literature
A useful framework to guide the examination of personality concepts and models in written texts can be found in the psycho-lexical approach (Passakos & De Raad, 2009). The lexical hypothesis suggests that the important aspects of personality differences are likely to be expressed through language and to be codified in trait-descriptive words (Galton, 1884; Goldberg, 1982). People tend to articulate characteristics they deem important in their daily descriptions of both their own and others’ personalities. Furthermore, people form beliefs about which characteristics are likely to co-occur, leading to the formation of an implicit model of personality. Even though implicit personality models (also referred to as “internal structure”) do not correspond precisely to the actual trait co-occurrence patterns observed in a population (“external structure”), these structures have substantial similarity (Peabody & Goldberg, 1989). Examinations of implicit models can thus serve as a valuable route, and probably one of the few available, to understanding personality in the past.
The relevance of personality traits in a cultural-linguistic context can be gauged by the frequency with which trait descriptors appear in texts and these traits’ patterns of co-occurrence. This suggests that textual sources can offer a window into what personality traits were salient in a given period and, to some extent, how these traits were seen as linked, forming an implicit personality model. Fiction literature provides a promising area to examine. Literary sources, such as novels, can offer rich descriptions of the attributes, thoughts, and actions of their characters. Novels can be assumed to illustrate the Zeitgeist of the time they were written, offering “distillations of folk psychology” (Oatley, 1999, p. 115). The way characters are depicted in fictional stories depends on a shared understanding of human behavior because knowledge about people in the real world is a key ingredient for character perception (Lugea & Walker, 2023). Thus, literary works, although authored by single individuals, often reflect implicit beliefs about how traits connect, as expressed through diverse and sometimes autobiographically inspired characters. The information about individual differences that is inherent in fictional characters can be extracted, for example, by having external raters evaluate the characters’ traits using a predefined model (e.g., Johnson et al., 2011; McCrae et al., 2012) or, staying closer to the raw data, by analyzing the words used by the author to describe these characters (e.g., Fischer et al., 2020).
Recent research has sought to examine personality based on the lexical hypothesis in historical texts from a bottom-up perspective. Fischer et al. (2020) analyzed the use of personality-descriptive adjectives in literary works of the 19th-century English authors Jane Austen and Charles Dickens. The findings indicated that Austen’s fictional characters were described with a set of adjectives predominantly focusing on social and emotional attributes. In turn, Dickens’ character descriptions had a focus on power differences and social dimensions. Both authors’ character depictions could be captured by models with seven factors. The factors in Austen’s novels were labeled Civility, Intelligence, Sternness, Approachability, Egocentrism, and Vigor, whereas those found in Dickens’ novels were Dominance, Approachability, Sociability, Civility, Dynamism, Integrity, and Activity. Although some labels were similar, the factors were not defined by the same adjectives and the overall congruence between the personality structures in the two authors’ novels was low. Furthermore, these models had limited overlap with the Big Five model and were seen as reflecting the different social contexts of the two writers.
Shifts in Personality Models Across Time
An interesting question is whether implicit personality structures of texts written closer to the present time would show greater correspondence to contemporary models of personality. There are indications that words describing different trait domains have emerged at different points in time, with those for openness being most recent (Piedmont & Aycock, 2007). In turn, research on three distinct historical versions of the Gilgamesh epic, one of the earliest surviving pieces of literature, suggested a tendency toward increasing occurrence of terms recognizable from present-day personality models (Du et al., 2024). Finally, Cutler and Condon (2023) found evidence for factors resembling the Big Five (with the least support for neuroticism and intellect/openness) in large language models, where contemporary texts have the largest weight. These findings suggest that an increasing resemblance to familiar contemporary models may be expected with time. Still, questions of cross-time comparisons of personality models have received little attention so far.
The early 20th century in the US offers a relevant context for examining personality models in history. The first comprehensive list of 17,953 person-descriptive adjectives originated in lexical work on American English from this period (Allport & Odbert, 1936). This list served as a foundational resource for later lexical research, and several contemporary lists in English trace their origins to it (e.g., Ashton et al., 2004; De Raad et al., 2010). It is thus informative to examine patterns of person descriptions in literature from this period. Furthermore, this period can be seen as an intermediate time between the 19th century, which has received some attention in historical-psychological research (Fischer et al., 2020; Johnson et al., 2011), and the second half of the 20th century, when the Big Five model was established in data from self- and other-ratings (Goldberg, 1992). Finally, examining American literature extends the scope of earlier research, which has often focused on British literature (e.g., Fischer et al., 2020; Johnson et al., 2011), while allowing comparisons with both contemporary observed structures and historical implicit personality structures obtained in English. Within-language comparisons are informative as connotations of individual words (e.g., aggressive) may vary between cultural contexts sharing the same language, such as the UK and the US (Passakos & De Raad, 2009). In sum, American literature of the early 20th century is interesting to examine both on its own, being written in the period that saw the beginnings of the empirical lexical approach to personality, and as a point of comparison with other data across cultural and temporal lines.
The Present Study
The present study aims to examine the salient personality concepts and to identify the implicit structure of personality in early 20th-century America based on an analysis of the novels of F. Scott Fitzgerald (1896 – 1940). Describing the observed personality structure in a historical population is hardly achievable retrospectively for any period preceding the large-scale data collection using psychometric instruments in the late 20th century. This study instead seeks to reconstruct the implicit personality structure as it was culturally represented in literature through lexical co-occurrence patterns. These cultural models of personality can help us understand how people conceptualized and communicated about character and traits, offering insight into the evolution of personality constructs over time. Fitzgerald was selected as he is widely seen as one of the leading American writers of this period. For example, his first novel, This Side of Paradise (Fitzgerald, 1920/1997), gained wide popularity, becoming a best-seller and undergoing eight reprints within a year of publication (Mizener, 1965). The novel was regarded as a representation of the emerging moral values among young people in post-World War I America (Mizener, 1946). In turn, The Great Gatsby (Fitzgerald, 1925/2021) has been described as the most profoundly American novel of its time (Mizener, 1946).
Our study aims to identify the implicit personality dimensions in Fitzgerald’s novels based on the patterns of co-occurrence of trait descriptors applied to the novels’ characters. Furthermore, we examine the correspondence of these dimensions to contemporary personality models. The main focus is on the Big Five model, using established marker items (Goldberg, 1992) as well as a direct comparison with the observed personality dimensions emerging in self-ratings (Ashton et al., 2004). It should be clear that our data source is different from contemporary self-ratings, drawing on implicit personality theory inherent in literary sources rather than on population data from self- or peer-reports. We advance this comparison as a novel approach to understanding historically embedded personality structures relative to widely used contemporary frameworks. In addition to the Big Five, we examine comparisons with six- and seven-factor solutions from self-ratings, where the sixth and seventh factors are interpreted as honesty-humility and religiosity, respectively (Ashton et al., 2004). Finally, we compare the personality structure in Fitzgerald’s novels to those found in 19th-century English authors, Austen and Dickens (Fischer et al., 2020).
This study addresses the following research questions. First, what are the salient personality concepts, and what is the implicit personality structure in the works of F. Scott Fitzgerald? Second, to what extent does this structure resemble the Big Five model? Third, how similar is the personality structure found in Fitzgerald’s novels to those identified in the works of Austen and Dickens?
Methods
Materials
All four novels published by Fitzgerald were analyzed using the digitized copies available on Project Gutenberg (gutenberg.org): This Side of Paradise (1920/1997), The Beautiful and Damned (1922/2006), The Great Gatsby (1925/2021), and Tender is the Night (1934/2004). We coded the texts using the list of 1,710 trait-descriptive adjectives compiled by Goldberg (1982), which is a descendant of Allport and Odbert’s (1936) list of 17,953 adjectives. These 1,710 adjectives have been used both in lexical research on self-ratings (Ashton et al., 2004; Saucier & Iurino, 2020) and in previous work on text analysis of fiction literature (Fischer et al., 2020). In addition, we used the list of 100 marker terms by Goldberg (1992). This list is a subset of the 1,710 list, with the addition of the word neat. These terms have been used as benchmark indicators of the Big Five in psycho-lexical research (e.g., Thalmayer et al., 2020).
Coding Procedure
We used manual coding to identify personality-descriptive adjectives in the novels. Each adjective from the list of 1,710 was searched for electronically in the digitized text, separately per novel. The coder, who was familiar with the novels, then reviewed each instance in its context to confirm whether the adjective was used as a personality descriptor of a character. Similarly to previous research, fictional characters were treated as quasi-participants (e.g., Fischer et al., 2020). Each occurrence of an adjective from the list was uniquely matched to the character it referred to and tagged accordingly, based on the guidelines outlined below. The counts of each adjective per character were recorded in an Excel sheet and formed the input for factor analysis. While manual coding is labor-intensive, it was deliberately chosen in light of known shortcomings of existing algorithms for character matching in older fiction texts (e.g., Du et al., 2024; Fischer et al., 2020).
The following coding guidelines were applied: (1) If the narrator described a character or their actions, thoughts, cognitive or emotional states with an adjective from the list, the adjective was coded as referring to the respective character. (2) When an adjective was mentioned in dialog or a letter, it was assigned to the person being referred to. When the adjective referred to the speaker, it was assigned to the speaker. (3) When an adjective was used in an imperative sentence, such as “Don’t be morbid!” (Fitzgerald, 1925/2021, p. 116), it was considered descriptive for the person addressed and was assigned to the respective character. (4) Negated adjectives were recorded and combined with the positively stated adjectives in the analyses. (5) Adjectives referring to two or more identifiable characters were assigned to all respective characters. In contrast, adjectives referring to a group of people, where individual characters could not be identified, were not coded.
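The way coded instances translate into the character-by-adjective count matrix can be illustrated with a minimal Python sketch. This is an illustration only, not the procedure used in the study (where counts were recorded manually in a spreadsheet). The record layout, the function name `build_count_matrix`, the label "Character B", and the example instances are hypothetical.

```python
from collections import Counter

# Hypothetical coded instances (invented for illustration). Each record is one
# occurrence of a list adjective that a coder judged to describe a character.
coded_instances = [
    {"adjective": "nervous", "characters": ["Anthony Patch"], "negated": False},
    # Guideline 4: negated uses are counted together with non-negated uses.
    {"adjective": "nervous", "characters": ["Anthony Patch"], "negated": True},
    # Guideline 5: an adjective referring to several identifiable characters
    # is assigned to each of them.
    {"adjective": "restless", "characters": ["Anthony Patch", "Character B"], "negated": False},
]


def build_count_matrix(instances):
    """Return (characters, adjectives, matrix) with characters in rows and
    adjectives in columns; combinations that never occur receive a zero."""
    counts = Counter()
    for record in instances:
        for character in record["characters"]:
            counts[(character, record["adjective"])] += 1
    characters = sorted({character for character, _ in counts})
    adjectives = sorted({adjective for _, adjective in counts})
    matrix = [[counts.get((c, a), 0) for a in adjectives] for c in characters]
    return characters, adjectives, matrix


characters, adjectives, matrix = build_count_matrix(coded_instances)
```

In this toy example the rows are the two characters and the columns are the two adjectives. A matrix of this shape is the kind of input a factor analysis of term counts would take.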
Inter-Coder Reliability
To assess the reliability of the coding, a section of text was coded by the first author and two additional coders. The coders coded the first two chapters of The Beautiful and Damned using a shortened list of 300 adjectives from the 1,710 list (leaving out most adjectives that did not occur in the sample text). The first reliability question was concerned with the extent to which the three coders included the same adjectives in their coding. To examine this, we looked at all adjectives from the list of 300 that could have been coded in the sample text (184 units), including instances that were consistently not coded by any of the three coders as they were not deemed to refer to characters. We omitted homographic non-adjectives (e.g., learned, which is also a verb form). Fleiss' kappa was .45, indicating moderate agreement (Landis & Koch, 1977). Sources of differences among the coders included instances of adjectives describing groups, physical spaces, and momentary states, as well as negations. The second reliability question was to what extent the commonly included adjectives were assigned to the same character. To examine this question, we calculated Krippendorff’s alpha on the commonly coded adjectives (75 units), reflecting their assignment to characters (Hayes & Krippendorff, 2007). Alpha was .93, indicating high agreement among the three coders in assigning adjectives to specific characters.
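For readers unfamiliar with Fleiss' kappa, the following Python sketch shows the standard computation for a fixed number of coders and a binary decision (coded versus not coded). It is a generic illustration with invented ratings, not a reproduction of the reported value of .45. The function name `fleiss_kappa` and the example table are assumptions.

```python
def fleiss_kappa(table):
    """Fleiss' kappa for a units-by-categories table of rating counts.

    Each row gives, for one unit, how many raters chose each category.
    Every row must sum to the same number of raters.
    """
    n_units = len(table)
    n_raters = sum(table[0])
    if any(sum(row) != n_raters for row in table):
        raise ValueError("every unit must be rated by the same number of raters")
    n_categories = len(table[0])

    # Observed agreement: proportion of agreeing rater pairs per unit, averaged.
    per_unit = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ]
    p_observed = sum(per_unit) / n_units

    # Chance agreement from the overall share of ratings in each category.
    shares = [
        sum(row[j] for row in table) / (n_units * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(s * s for s in shares)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Invented example: 4 candidate adjective occurrences, 3 coders,
# columns = [coded as a personality descriptor, not coded].
example_table = [[3, 0], [0, 3], [2, 1], [1, 2]]
example_kappa = fleiss_kappa(example_table)
```

Two units with full agreement and two units with a 2-to-1 split give a kappa of one third in this example, which shows how partial disagreement on inclusion pulls the coefficient well below 1.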
Data Analysis
The analyses were conducted in R (v. 4.2.2; R Core Team, 2022) with the packages tidyverse (v. 1.3.2; Wickham et al., 2019), psych (v. 2.3.6; Revelle, 2018), GPArotation (v. 2023.3.1; Bernaards & Jennrich, 2005), and ccpsyc (v. 0.2.7; Fischer & Karl, 2019). The main data and R-code are available on the OSF for replication: https://osf.io/hpcyd/files/osfstorage.
Results
Big Five Markers Identified in the Novels
Note. Negative marker adjectives appear in italics.
Figure 1 presents an overview of the 20 most frequent adjectives in Fitzgerald’s novels, with those in Austen’s and Dickens’ novels (from Fischer et al., 2020) added for comparison.
The adjectives included words such as nervous, deep, quiet, restless, curious, and confused, suggesting a preoccupation with the inner world and a sense of unease. For a broad description of the data in the Big Five framework, we assigned each adjective to a factor based on the primary loadings from Ashton et al.’s (2004) five-factor solution and examined the distribution of the adjectives across factors (both the unique 456 adjectives and the instances of their occurrence, adding to 1,713). We similarly examined the frequencies of the 52 marker terms. The results are presented in Figure 2. The 456 adjectives’ distribution was similar to that of the source list of 1,710 terms and the general pattern in psycho-lexical research in English, where Agreeableness, Extraversion, and Conscientiousness are the most prominent factors (Ashton et al., 2004; De Raad et al., 2014). The distribution of the 52 marker terms also closely followed their source list of 100 marker terms (Goldberg, 1992), with approximately equal representation of unique terms across factors. Against this background, Neuroticism and Intellect marker terms were heavily overrepresented in instances of occurrence (Figure 2(d)), in line with the salience of these concepts among the 20 most frequent terms (Figure 1).
Figure 1. Percentage of the 20 Most Frequently Identified Adjectives of the 1,710 List in Fitzgerald, With Those of Austen and Dickens (From Fischer et al., 2020) for Comparison
Figure 2. Unique and Instance Counts of the 456 Adjectives (From the 1,710 List) and the 52 Markers (From the List of 100 Markers)
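The assignment of adjectives to factors by primary loading, and the two resulting tallies (unique terms and instances of occurrence), can be sketched as follows. All loadings and counts in the sketch are invented for illustration and are not taken from Ashton et al. (2004) or from the present data. The factor order and the function names are likewise assumptions.

```python
from collections import Counter

FACTORS = ["Extraversion", "Agreeableness", "Conscientiousness", "Neuroticism", "Intellect"]

# Invented loadings on five factors (NOT the published values).
loadings = {
    "nervous": [-0.05, 0.02, -0.10, 0.55, 0.01],
    "curious": [0.10, 0.05, 0.00, -0.02, 0.48],
    "quiet": [-0.60, 0.12, 0.08, 0.03, 0.02],
}
# Invented numbers of coded occurrences per adjective.
instance_counts = {"nervous": 5, "curious": 3, "quiet": 2}


def primary_factor(row):
    """Factor on which the adjective has its largest absolute loading."""
    index = max(range(len(row)), key=lambda i: abs(row[i]))
    return FACTORS[index]


def factor_distribution(loadings, instance_counts):
    """Count unique adjectives and total occurrences per primary factor."""
    unique_terms = Counter()
    occurrences = Counter()
    for adjective, row in loadings.items():
        factor = primary_factor(row)
        unique_terms[factor] += 1
        occurrences[factor] += instance_counts.get(adjective, 0)
    return unique_terms, occurrences


unique_terms, occurrences = factor_distribution(loadings, instance_counts)
```

Taking the absolute value means that a term with a strong negative loading, such as the invented value for quiet here, is still assigned to that factor. Comparing the two tallies separates how many different terms a factor contributes from how often those terms are actually used.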

Factor Structure
We performed an exploratory factor analysis on the 456 adjectives’ occurrences across the 169 characters. The input to this analysis was the counts of each term (in the columns) with reference to each character (in the rows), with an observed range from 0 to 10 (the term nervous occurred 10 times in reference to The Beautiful and Damned’s protagonist, Anthony Patch). Where a word was not used in reference to a character, this was reflected as a count of zero, in line with previous research extracting personality structure from written texts (Chung & Pennebaker, 2008; Fischer et al., 2020). The assumption in this approach is that, although the trait may not be absent in the respective character, it was not foregrounded as salient.
We used principal axis factoring and Varimax rotation, resulting in 85 factors with an eigenvalue of 1 or above.
The scree plot is presented in Figure 3. A parallel analysis using random simulated data suggested extracting 16 components; Velicer’s MAP test suggested seven. Based on these complementary indicators, we examined the content of solutions with 1 to 16 factors. The decision on the number of factors to retain within this range was based primarily on the interpretability and coherence of the solutions. After a thorough examination of the factor content of these solutions, we focused our interpretation on the seven-factor solution, which resulted in an optimal balance of interpretability, comprehensiveness, and parsimony. The higher-dimensional solutions included factors that were harder to distinguish in their content from already extracted factors. The seven-factor solution accounted for 50% of the variance. The highest loading terms per factor are presented in Table 2.
Figure 3. Scree Plot
Table 2. Highest Loading Adjectives in the Seven-Factor Solution
Note. N = 169 character entities. Adjective loadings above .40 are presented. The adjectives are listed in decreasing order of their loadings, and alphabetically across adjectives with equal loadings. The highest and lowest loadings in each factor are presented in parentheses.
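The logic of two of the retention criteria mentioned above, the eigenvalue-of-1 rule and parallel analysis, can be sketched in a few lines of Python. The sketch assumes the eigenvalues are already available and uses invented numbers. It does not perform the eigendecomposition or the random-data simulation, and it is not the code used in the study.

```python
def kaiser_count(eigenvalues):
    """Number of eigenvalues of 1 or above (eigenvalue-of-1 rule)."""
    return sum(1 for value in eigenvalues if value >= 1.0)


def parallel_analysis_count(observed, reference):
    """Number of leading observed eigenvalues that exceed the reference
    eigenvalue of the same rank, stopping at the first one that does not.

    `reference` would typically hold means or upper percentiles of
    eigenvalues from random data of the same dimensions (assumed here)."""
    retained = 0
    for observed_value, reference_value in zip(observed, reference):
        if observed_value <= reference_value:
            break
        retained += 1
    return retained


# Invented eigenvalues, sorted in decreasing order, for illustration only.
observed_eigenvalues = [4.0, 2.5, 1.6, 1.1, 0.9, 0.6]
random_eigenvalues = [1.8, 1.5, 1.3, 1.2, 1.0, 0.9]

n_kaiser = kaiser_count(observed_eigenvalues)
n_parallel = parallel_analysis_count(observed_eigenvalues, random_eigenvalues)
```

Because the eigenvalue-of-1 rule compares against a fixed threshold while parallel analysis compares against what random data of the same size would produce, the two criteria can disagree widely with many variables and few cases. This is one reason to treat them as complementary indicators alongside interpretability.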
The first factor was large and somewhat heterogeneous. It suggested a notion of courage and dynamism, with high loading adjectives such as daring, dynamic, independent, and industrious. Thus, the factor was labeled Dynamism. The second factor included terms related to intellectual flexibility, such as analytical, critical, and imaginative, and others related to persistence, such as tireless and persistent. The factor was labeled Intellectual Versatility. The third factor was defined by high-loading terms such as animated, flamboyant, and frivolous, suggesting expressive and somewhat histrionic personality characteristics. The factor was thus labeled Theatricality. The fourth factor was characterized by a social and emotional orientation, with adjectives such as complimentary, considerate, and forgiving. Hence, the factor was labeled Empathy. The fifth factor was described by the tendency to control oneself and hold back, including high-loading terms such as controlled, cautious, and deliberate, and was labeled Restraint. The sixth factor suggested a lack of various socially desirable qualities; high-loading terms included immature, illogical, and insincere. Therefore, this factor was labeled Immaturity. The seventh factor was characterized by elements of politeness and civic morality, including terms such as elegant, incorruptible, and civilized, and was labeled Civility. Considering the rich and varied content, especially of the larger factors, our labels cannot be entirely comprehensive and may be seen as provisional. The hierarchically nested factor solutions with 1 to 7 factors (cf. Goldberg, 2006) are displayed in Figure 4, and the solutions with up to 10 factors are presented in Figure S1.
Figure 4. Factor Cascades Displaying Correlations of .40 and Higher
Table 3. Correlations Between the Seven Factors and the Big Five Marker Scales (Based on Positive Markers Only)
Note. E = Extraversion, A = Agreeableness, C = Conscientiousness, N = Neuroticism, I = Intellect. Correlations significant at p < .001 are underlined and those at .01 are in italics. Correlations above .50 are presented in bold.
Table 4. Tucker’s Phi Congruence Coefficients After Procrustes Rotation of Factors Extracted From the Fitzgerald Novels to Self-Ratings (From Ashton et al., 2004) and to Factors Extracted From Novels by J. Austen and C. Dickens (From Fischer et al., 2020)
Note. For the comparison with Austen, k = 149 overlapping terms. For the comparison with Dickens, k = 291 overlapping terms.
Comparison Across Authors
To compare the overall content of character descriptions in Fitzgerald’s novels to those of Austen and Dickens, we first examined the correspondence across the three authors in the relative frequency of terms (the count of instances of occurrence of each adjective divided by the overall count of coded adjectives per author, which was 1,713 for Fitzgerald, 1,157 for Austen, and 5,006 for Dickens). Fitzgerald’s term frequencies correlated at r = .29 with those of Austen (based on 149 terms common to both authors) and r = .47 with those of Dickens (291 common terms); the correlation between Austen’s and Dickens’ term frequencies was r = .65 (198 common terms; all correlations were significant at .001 or lower). The pairwise correlation differences, examined using Fisher’s r-to-z transformation, were significant at .05 or lower, indicating that overall, authors closer in time had higher correspondence in term usage.
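A comparison of two correlations via Fisher's r-to-z transformation can be sketched as below. The sketch assumes the common two-sided test for correlations from independent samples. The text does not state which variant of the test was applied, and the three correlations here share authors and terms, so the independence assumption is a simplification. The function names are hypothetical; the r and k values are those reported above.

```python
import math


def fisher_z(r):
    """Fisher's r-to-z transformation (inverse hyperbolic tangent)."""
    return math.atanh(r)


def compare_correlations(r1, n1, r2, n2):
    """Two-sided z-test for the difference between two correlations,
    ASSUMING they come from independent samples of sizes n1 and n2."""
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (fisher_z(r1) - fisher_z(r2)) / standard_error
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return z, p


# Reported correlations of relative term frequencies and numbers of common terms.
z_fa_fd, p_fa_fd = compare_correlations(0.29, 149, 0.47, 291)  # Fitzgerald-Austen vs. Fitzgerald-Dickens
z_fd_ad, p_fd_ad = compare_correlations(0.47, 291, 0.65, 198)  # Fitzgerald-Dickens vs. Austen-Dickens
z_fa_ad, p_fa_ad = compare_correlations(0.29, 149, 0.65, 198)  # Fitzgerald-Austen vs. Austen-Dickens
```

Under these assumptions all three pairwise differences fall below the .05 threshold, which agrees with the pattern described in the text. A test for dependent correlations could yield somewhat different values.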
We subsequently performed target rotations of Fitzgerald’s factors (for completeness, including also the 5- and 6-factor solutions) to the factors obtained in the two earlier authors, using their solutions as targets. The results are displayed in the middle and bottom panels of Table 4. The congruence coefficients were in a similar range as those obtained in the target rotations to self-ratings, ranging from .32 to .41. Interestingly, Fitzgerald’s factors had lower congruence with Dickens’ factors than with Austen’s factors, which was the opposite of the trend observed for the correspondence of term frequencies.
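Tucker's phi, the congruence coefficient used in these comparisons, is the uncentered cosine between two columns of loadings on the same set of terms. The Python sketch below computes it for a single pair of factors and omits the preceding Procrustes rotation. The loadings are invented and the function name is an assumption.

```python
import math


def tucker_phi(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two loading vectors defined
    on the same terms, in the same order."""
    if len(loadings_a) != len(loadings_b):
        raise ValueError("loading vectors must cover the same terms")
    numerator = sum(a * b for a, b in zip(loadings_a, loadings_b))
    denominator = math.sqrt(
        sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b)
    )
    if denominator == 0.0:
        raise ValueError("congruence is undefined for an all-zero vector")
    return numerator / denominator


# Invented loadings of four shared terms on one factor in two solutions.
factor_in_solution_a = [0.6, 0.5, 0.1, -0.2]
factor_in_solution_b = [0.3, 0.25, 0.05, -0.1]  # same pattern, half the size

phi_same_pattern = tucker_phi(factor_in_solution_a, factor_in_solution_b)
phi_unrelated = tucker_phi([1.0, 0.0], [0.0, 1.0])
```

Because the coefficient depends on the pattern of loadings rather than their overall size, two factors with proportional loadings reach a phi of 1 even when one is uniformly weaker. Against that scale, values in the range of .32 to .41 indicate limited overlap.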
It is informative to engage in a qualitative comparison of the factor structures of the three authors. The first factors in all three authors were broad and heterogeneous, and several factors across all authors had complex relations with Big-Five indicators. At the same time, the links with the Big-Five marker scales were more pronounced in Fitzgerald (Table 3) than in the two earlier authors, where most marker-scale correlations were below .40 and only the correlation of Austen’s Intelligence factor with the Intellect marker scale (there labelled as Openness), at .68, exceeded .60 (Fischer et al., 2020, pp. 933–934). As for the content of individual factors, Fitzgerald’s Intellectual Versatility and Austen’s Intelligence had some similarities (e.g., imaginative in Fitzgerald and curious in Austen; rational in both authors), even including elements of social orientation (e.g., charitable in both authors), but the elements of flexibility and persistence (e.g., adjustable, unchangeable) were more extensive in Fitzgerald’s factor. Characteristics related to approachability, which defined large corresponding factors in both earlier authors, were spread across various factors in Fitzgerald. Finally, although all three authors had a factor with a focus on civility, the respective factors were larger in the two earlier authors.
Discussion
This study explored the personality concepts and implicit personality structure in the novels of early 20th-century American author F. Scott Fitzgerald. Frequently occurring adjectives, such as nervous, deep, wild, restless, curious, and confused, suggested a prevailing sense of uneasiness in Fitzgerald’s characters, combining features of intellect and neuroticism. Factor analysis identified a seven-factor personality structure that presents an overall individualistic focus. The most prominent factors featured themes of dynamism, agility, and expressivity. Social-relational aspects and elements of self-control were also identified. Two factors correlated highly with respective Big Five marker scales of agreeableness and intellect. However, the factors extracted from Fitzgerald’s novels had low congruence with factors obtained in self-rating data, such as the Big Five or six- and seven-factor structures, including honesty-humility and religiosity (Ashton et al., 2004). Comparisons with Austen’s and Dickens’ novels revealed greater consistencies in the usage of individual terms in authors closer in time. The factor structures had low congruence across authors, although some qualitative correspondence of individual factors was observed.
Personality in Context
Although elements of the Big Five are represented in our results, the overall structure that emerged differs from the canonical Big Five model. This suggests that familiar personality content is reconfigured in ways that reflect the distinct individual, contextual, and historical features of the source material. It also implies that previous studies that have used the Big Five model to describe the personality concepts of past periods (e.g., Johnson et al., 2011; McCrae et al., 2012) may have unduly imposed a presumed universal structure, in a way similar to the “imposed etic” approach in cross-cultural research (Cheung et al., 2011). On the other hand, our findings are in line with recent research that identified idiosyncratic models of personality in writers of the past, with a limited correspondence to presumed universal models (Fischer et al., 2020). These emerging historical models of personality can be interpreted in the cultural-historical context in which they were situated. Individualism and the pursuit of happiness and material success were central features of American culture of the early 20th century (Mizener, 1946; Thomson, 1989; Triandis, 2001). Topics of self-expression, self-determination, managing self-control, and rebelling against society’s expectations, traditional values, and strict moral codes were salient themes in that period, reflected in the preoccupations of Fitzgerald’s contemporaries and his novels’ protagonists, e.g., Amory, Anthony, and Gatsby (Mizener, 1946; Thomson, 1989). These themes are well represented in the Dynamism, Theatricality, Restraint, and Immaturity factors. From this perspective, the obtained personality structure in Fitzgerald’s novels can be seen as reflecting the distinctive character or ethos (McCrae, 2009) of the period.
The finding of a distinct personality structure with only partial correspondence to modern models is in line with recent theorizing that suggests that different patterns of personality trait covariation can emerge due to variations in social, cultural, and ecological constraints and affordances (e.g., Durkee et al., 2022; Fischer, 2017). For example, despite not being prominent among the marker terms, individual words related to conscientiousness were fairly well represented in the data (Figure 2), and elements of conscientiousness, such as achievement orientation and self-control, were present in various factors. The fact that these elements did not coalesce into a single factor points to the sociocultural specificity of the emerging structure.
Another aspect of the present data that is relevant for the interpretation of the obtained structure is that it represents the covariation of terms in the language of a single individual, Fitzgerald, channeled through his description of many individual characters. Our results also likely reflect emission, that is, the author’s selective emphasis on salient characteristics, which is inherent in literary descriptions and differs from the comprehensive elicitation of all potentially relevant traits applied in self- and peer-reports. While contemporary personality models, such as the Big Five, are derived from between-individual analyses of elicited ratings of target individuals, our analysis bridges the between- and within-individual levels as it involves numerous fictional characters, all originating from a single author. In turn, our approach expands classic psychobiographic and assessment-at-a-distance research, which focuses on the personality profile of single individuals, both living and historical figures, based on their language use (e.g., Hirsh & Peterson, 2009; Ritzler & Singer, 1998; Suedfeld et al., 2011). This research approach offers the prospect of bridging nomothetic and idiographic approaches to personality (McAdams & Pals, 2007), especially when data from more authors of the same period are examined.
Personality Across Contexts
Our study offers an initial example of the potential of text-based research to examine the evolution of personality concepts through time. The findings suggest complex patterns. On the one hand, authors closer in time (Fitzgerald and Dickens, and Dickens and Austen) had a greater similarity in their use of individual personality-descriptive adjectives. On the other hand, the congruence coefficients of Fitzgerald’s factor structure were somewhat larger with the temporally more distant author, Austen. Finally, the factors in Fitzgerald’s novels had similar congruence with self-ratings as those of the two earlier authors, yet Fitzgerald’s factors had some higher correlations with Big-Five marker scales compared with the correlations found in Austen and Dickens (Fischer et al., 2020). Our findings are partly in line with research on versions of the Gilgamesh epic, where versions closer in time tended to be more similar to each other both in the occurrence of individual terms and in overall structure, although the patterns of structure similarity were not entirely consistent (Du et al., 2024). Across the board, our results suggest that patterns of continuity and temporal shifts towards a greater resemblance to contemporary models, such as the Big Five, may be observed in a more straightforward manner at the level of individual words than of word co-occurrence.
Furthermore, our study suggests that it is informative to go beyond purely quantitative comparisons of factors obtained in individual authors’ works. Despite the overall low quantitative similarity of factor structures, qualitative comparisons of Fitzgerald’s model with those of the two earlier authors identified some common themes, such as intelligence, approachability, and civility, although with different emphases and factor compositions. Aggregating across more authors and time periods will allow future studies to develop a more comprehensive representation of personality concepts through time, in which some core elements may be shared while others vary locally or diachronically. Such an approach can be compared to the combined emic–etic approach to personality across cultures, which seeks to integrate universal with culturally specific elements of personality (Cheung et al., 2011).
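To make the quantitative comparisons discussed above more concrete, the following is a minimal sketch of how a congruence coefficient between two factor-loading patterns can be computed. It assumes Tucker's congruence coefficient, which is a common choice but is not named in this section; the loading values and the function names `tucker_phi` and `congruence_matrix` are hypothetical illustrations and not the study's data or code. Any rotation of one solution toward the other that might precede such a comparison is omitted.

```python
import math


def tucker_phi(x, y):
    """Tucker's congruence coefficient between two loading vectors.

    phi = sum(x_i * y_i) / sqrt(sum(x_i^2) * sum(y_i^2))
    Both vectors must hold loadings for the same terms in the same order.
    """
    if len(x) != len(y):
        raise ValueError("loading vectors must have the same length")
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if denominator == 0:
        raise ValueError("a loading vector contains only zeros")
    return numerator / denominator


def congruence_matrix(loadings_a, loadings_b):
    """Congruence of every factor in A with every factor in B.

    Each argument is a list of rows (one per term), each row holding
    that term's loadings on the factors of one solution.
    """
    factors_a = list(zip(*loadings_a))  # transpose: one tuple per factor
    factors_b = list(zip(*loadings_b))
    return [[tucker_phi(fa, fb) for fb in factors_b] for fa in factors_a]


# Hypothetical loadings of four adjectives on two factors in two solutions.
solution_a = [[0.7, 0.1], [0.6, 0.0], [0.1, 0.8], [0.0, 0.5]]
solution_b = [[0.6, 0.2], [0.5, 0.1], [0.2, 0.7], [0.1, 0.6]]

for row in congruence_matrix(solution_a, solution_b):
    print([round(value, 2) for value in row])
```

The coefficient depends only on the pattern of loadings and not on their overall size, so two factors with proportional loadings reach the maximum of 1 even when one solution's loadings are uniformly weaker.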
Recent advances in large-language models have demonstrated the potential to recover meaningful personality structures from large text corpora (Cutler & Condon, 2023). Research in this field has so far mostly focused on contemporary models, with little attention to historical changes in personality structure. When algorithms are reliably trained on texts from different historical periods, this will enable larger-scale comparisons building on the works of individual authors and periods. We see this as a promising area for future development.
Limitations and Future Directions
A limitation of our study is its focus on a single author, which entails some level of idiosyncrasy and limits generalizability. The reliance on emitted, rather than elicited, character descriptions constrains generalizability further. As discussed earlier, this approach offers the potential to bridge idiographic and nomothetic perspectives in text-based research. Yet, this potential will be more fully realized when data from a larger number of authors from the same cultural-historical context can be extracted. Moreover, it is important to recognize that literary authors may not merely reflect pre-existing cultural beliefs about personality but can also actively shape them, for example, by enhancing certain word connotations. We expect that developments in natural language processing and large language models will advance research in these directions.
An alternative method of extracting personality information from texts would involve using all trait-descriptive words that appear in reference to characters, rather than employing a predefined list of words. Such an approach would be more comprehensive and could be described as entirely bottom-up. Yet, this approach would have its own drawbacks, introducing a layer of subjectivity in determining which words should be included as trait-descriptive and potentially reducing comparability across data sets. Furthermore, previous research suggests that results obtained with word lists of different length and breadth, including Allport and Odbert’s (1936) list of 17,953 adjectives, are comparable, while results from the 1,710-adjective list were most clearly interpretable (Fischer et al., 2020).
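As an illustration of the predefined-list idea discussed above, the sketch below codes character descriptions against a fixed word list and yields a character-by-adjective matrix of 0/1 indicators. It is not the authors' procedure, which relied on human coders; the three-word list, the character labels, the description texts, and the function names `code_description` and `build_matrix` are all hypothetical.

```python
import re


def code_description(description, word_list):
    """Return a 0/1 indicator per listed adjective for one description."""
    # Lowercase the text and split it into word tokens, keeping hyphenated words whole.
    tokens = set(re.findall(r"[a-z]+(?:-[a-z]+)*", description.lower()))
    return {word: int(word in tokens) for word in word_list}


def build_matrix(descriptions, word_list):
    """Code every character's description against the same fixed list."""
    return {
        name: code_description(text, word_list)
        for name, text in descriptions.items()
    }


# Hypothetical word list and descriptions, for illustration only.
WORD_LIST = ["proud", "restless", "kind"]
DESCRIPTIONS = {
    "character_a": "He was a proud, restless man.",
    "character_b": "She seemed kind and rather proud.",
}

matrix = build_matrix(DESCRIPTIONS, WORD_LIST)
for name, row in matrix.items():
    print(name, row)
```

Exact matching of this kind ignores negation, inflected forms, and the question of which character a word refers to, which is one reason why human coding or more elaborate language processing would be needed in practice. Because every text is coded against the same list, the resulting matrices stay comparable across data sets, which is the advantage of the top-down approach noted above.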
Finally, the exclusive use of adjectives as the unit of analysis is a limitation worth noting. Literary characters are only partially described by the use of adjectives, and their actions, dialogues, and internal monologues are at least as important. Future research could expand the selection of word classes analyzed, in line with a broader need for research on personality structure in word classes beyond adjectives (Barelds & Raad, 2015; Garrashi et al., 2024).
Conclusion
In the context of the emerging field of personality in history, our study addressed the implicit structure of personality in early 20th-century American literature based on an analysis of the novels of F. Scott Fitzgerald. Applying a psycho-lexical approach to the study of fictional characters, we identified a seven-factor model that resonates with the cultural-historical context of the novels. Our study highlights the potential of text-based research to enhance the understanding of personality in the past through the perspective of people of the past in a way similar to cross-cultural research that integrates local and universal perspectives. Future research on personality in history stands to gain from applying a lexical lens at the levels of characters, writers, and time periods to exploit the still largely untapped resource of person descriptions in the world’s literature. Finally, this work highlights that good writers remain relevant beyond their era while also reminding us that engaging with authors of the past requires adjusting our interpretive lens to their historical context.
Supplemental Material
Supplemental Material - Personality Concepts in Early 20th-Century American Literature: Examining the Novels of F. Scott Fitzgerald
Supplemental Material for Personality Concepts in Early 20th-Century American Literature: Examining the Novels of F. Scott Fitzgerald by Sophie C. Bauditz, Velichko Fetvadjiev, Ronald Fischer in Personality Science
Footnotes
Author Note
Not applicable.
Acknowledgement
The authors would like to thank Andreas Pingouras and Anjulie Grimm for coding part of the literature to assess inter-rater reliability.
Author Contributions
The authors confirm their contribution to the paper as follows:
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
Supplemental Material
Supplemental material for this article is available online.
Notes
References
