Abstract
The structure of academic self-concept (ASC) is assumed to be multidimensional and hierarchical. This methodological review considers the most central models depicting the structure of ASC: a higher-order factor model, the Marsh/Shavelson model, the nested Marsh/Shavelson model, a bifactor representation based on exploratory structural equation modeling, and a first-order factor model. We elaborate on how these models represent the theoretical assumptions on the structure of ASC and outline their inherent psychometric properties. We analyzed these models using a data set of German 10th-grade students (
Academic self-concept (ASC) is defined as the mental representation of one’s own academic abilities in general and in different academic domains (e.g., Brunner et al., 2010). Researchers have also used the terms self-concept of ability (Helmke & van Aken, 1995) and perceived cognitive competence (Harter, 1982) with similar definitions. ASC has been an important construct in education research for several decades, as it relates to desirable outcomes such as higher educational aspirations, better attainment, and more favorable learning behavior (Marsh, 2007; Marsh & Craven, 2006).
For the purpose of this article, it is useful to separate two lines of research (Brunner et al., 2010; Edwards & Bagozzi, 2000; Marsh & Craven, 2006; Shavelson et al., 1976): The first line of research—also known as “within-network analyses” (Byrne, 1984, 1996)—addresses the internal structure of ASC. The second line of research—also known as “between-network analyses” (Byrne, 1984, 1996)—addresses the nomological network of ASC by investigating relations between ASC and outcomes such as academic achievement. These two lines of research are intertwined, given that studying between-network relations requires an appropriate structural model of ASC. In other words, before examining the theoretical and practical significance of ASC by relating it to other variables, it is essential to clarify the underlying conceptual characteristics of ASC. Consequently, the question of how to best capture the structure of ASC has stimulated much theoretical deliberation and empirical research.
Prior reviews on the structure of ASC (Byrne, 1984; Marsh & Hattie, 1996; Shavelson et al., 1976) are outdated and limited in scope because they did not cover the many currently available and applied ASC models. In addition, the purpose of previous work on different models on the structure of ASC (Brunner et al., 2010; Marsh, 1990b; Morin et al., 2016) was to introduce new models of ASC that addressed the challenges of previous models, rather than providing a comprehensive review and systematic comparison of existing models. The goal of this methodological review is therefore to provide an in-depth discussion of five central ASC models that have been most often applied or have recently been developed in contemporary ASC research: the higher-order factor model, the Marsh/Shavelson model, the nested Marsh/Shavelson model, a bifactor representation based on exploratory structural equation modeling (ESEM), and the first-order factor model.
This article consists of two parts. In Part 1, we elaborate on the five models on the structure of ASC within the framework of within-network analyses to describe the internal structure and components of ASC as assumed by these models. We also elaborate on the five models on the ASC structure within the framework of between-network analyses to discuss how these models test relations to outcome variables (e.g., academic achievement). In doing so, we highlight important similarities and differences concerning theoretical, conceptual, methodological, and psychometric characteristics of the ASC models. In Part 2, we illustrate the application of the five ASC models for a large data set of secondary school students from Germany. We investigate the different ASC models and examine how the correlations between academic achievement and ASC vary depending on the structural model of ASC applied. The syntax codes for all statistical analyses are provided in the supplements (in the online version of the journal) to facilitate the application of these models in future research. We conclude by discussing the empirical findings from a theoretical and methodological perspective. In addition, we give recommendations concerning which research questions can be best addressed with the different ASC models and outline the major shortcomings of each model. Given that other vital constructs in education research (e.g., academic anxiety, academic interest) share theoretical and psychometric underpinnings with ASC (e.g., domain specificity; Gogol et al., 2017), the present article may also be relevant for researchers focusing on such constructs.
Part 1: Within-Network and Between-Network Analyses on the Structure of ASC
The seminal review by Shavelson et al. (1976) marked the beginning of empirical and psychometric ASC research as it made several empirically testable assumptions about the structure of self-concept (SC). Shavelson et al. (p. 411) defined SC as “a person’s perception of himself [sic]” and assumed SC to be both multidimensional and hierarchical in nature. Multidimensionality means that SC consists of different domain-specific facets tapping several domains of an individual’s life and experiences. Hierarchy means that the domain-specific SC facets are located on different generality levels. General SC is assumed to be located on the highest and most general level of the hierarchy and separated into ASC and non-ASC on a subordinate level.
According to Shavelson et al. (1976), ASC itself has a multidimensional and hierarchical structure. That is, students form separate SCs related to different school subjects or academic domains and these domain-specific ASCs can be combined into a general ASC (Marsh, 1990b). Most of the ASC research has been conducted with students at school; thus, the academic connotation of ASC commonly refers to the school context (Marsh & Craven, 2006).
The model proposed by Shavelson et al. (1976) prompted a surge in research to empirically validate the assumptions of the multidimensional and hierarchical structure of ASC (i.e., within-network analyses; Byrne, 1996). The development and empirical evaluation of ASC models has benefitted from the evolution of structural equation modeling (SEM; e.g., Bollen, 1989; Kline, 2005), which differentiates between manifest or observed variables (items) and latent, unobserved constructs (factors). The manifest variables operate as indicators for the latent constructs. Methodological advancements such as confirmatory factor analyses (CFAs) and their refinements have also contributed to the development of various ASC models (MacCallum & Austin, 2000). In the following, we describe the five central ASC models that are considered in detail in this article: the higher-order factor model, the Marsh/Shavelson model (Marsh, 1990b), the nested Marsh/Shavelson model (Brunner et al., 2010), a bifactor model implemented in ESEM (Asparouhov & Muthén, 2009), and the first-order factor model.
Within-Network Analyses: Models on the Structure of ASC
Higher-order Factor Model
The assumptions by Shavelson et al. (1976) implied a higher-order factor model of ASC (Figure 1a). In this model, all domain-specific ASCs (e.g., verbal, math) form first-order factors that load on a single higher-order factor representing general ASC (i.e., students’ ASC across all domains). It is important to differentiate between two conceptualizations of general ASC. First, general ASC can be directly measured by items that assess how students perceive their abilities related to school in general and that are not tied to a specific domain or school subject (e.g., “I have always been good at school”). These items are then used to directly form general ASC either by deriving a latent first-order factor or by building a manifest scale score. Alternatively, general ASC can be extracted and aggregated from subordinate (first-order) factors of domain-specific ASCs. In this case, general ASC constitutes a higher-order factor that depicts the apex of a hierarchy of domain-specific ASCs. Such a higher-order factor of general ASC can either include or exclude a first-order general ASC factor (Yeung et al., 2000). In the following, we refer to “GASC” for a first-order factor of general ASC that is directly measured by items capturing students’ perceptions of their school-related abilities in general. We use the term “HGASC” to refer to a higher-order factor of general ASC composed of different first-order (domain-specific and GASC) factors.

Figure 1. Structural models of academic self-concept: (a) Higher-order factor model; (b) Marsh/Shavelson model; (c) Nested Marsh/Shavelson model; (d) Bifactor-ESEM representation; and (e) First-order factor model. Residual terms as well as residual correlations between items using the same wordings were omitted for reasons of clarity. The three items in the nested Marsh/Shavelson model that load only on the G-factor refer to GASC. In the bifactor-ESEM model, only cross-loadings for GSC and HSC are illustrated to reduce complexity of the figure. Dashed lines represent cross-loadings. ESEM = exploratory structural equation modeling; HGASC = higher-order factor of general academic self-concept; MATH = higher-order math self-concept; VERBAL = higher-order verbal self-concept; GSC = German self-concept; ESC = English self-concept; HSC = history self-concept; BSC = biology self-concept; CSC = chemistry self-concept; PSC = physics self-concept; MSC = math self-concept; GASC = general academic self-concept.
The higher-order factor model clearly depicts the theoretical assumptions of Shavelson et al. (1976), but it often shows an inferior fit to the data compared to the alternative ASC models described below (Brunner et al., 2010), or even a poor fit (Marsh, 1987, 1990b). This is because the first-order math and verbal ASCs are consistently found to be nearly uncorrelated (Marsh, 1990a; Möller et al., 2009, 2020). It is reasonable to assume a higher-order factor only when the subordinate domain-specific factors are substantially correlated. In this case, the higher-order factor can effectively explain the common variance of the domain-specific factors (Brunner et al., 2012).
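The argument above can be illustrated numerically. In a standardized solution with a single higher-order factor, the model-implied correlation between two first-order factors equals the product of their loadings on that higher-order factor. The following minimal sketch uses hypothetical loading values (they are not estimates from any study) to show why a near-zero math-verbal correlation is difficult to reconcile with substantial loadings of both factors on one HGASC.

```python
def implied_correlation(loading_a: float, loading_b: float) -> float:
    """Model-implied correlation between two first-order factors that both
    load on the same single higher-order factor (standardized loadings)."""
    return loading_a * loading_b


# Hypothetical standardized loadings on the higher-order factor.
both_strong = implied_correlation(0.8, 0.8)   # both factors load substantially
one_weak = implied_correlation(0.8, 0.05)     # one factor barely loads

print(round(both_strong, 2))  # 0.64
print(round(one_weak, 2))     # 0.04
```

Under these assumptions, an observed correlation close to zero can be reproduced only if at least one of the two factors has a loading close to zero, in which case the higher-order factor explains little of that factor's variance.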
Marsh/Shavelson Model
The Marsh/Shavelson model was developed in response to the observation of nearly uncorrelated math and verbal ASCs and thus to remedy the shortcomings of the higher-order factor model (Marsh, 1990b). The model (Figure 1b) replaces the HGASC with two higher-order factors that are assumed to be nearly uncorrelated—a higher-order math ASC and a higher-order verbal ASC. The domain-specific ASCs form first-order factors that are located on a continuum from a verbal endpoint to a math endpoint. The position of a certain ASC on this continuum is reflected by the loadings on the higher-order math ASC and the higher-order verbal ASC. Hence, domain-specific ASCs can be classified as being either “math-like” or “verbal-like.” The ASC related to students’ main language of instruction (e.g., German for students in the German educational system; English for students in the English educational system) is assumed to represent the verbal endpoint of the ASC continuum in its purest form, and it should therefore have the highest loadings on the higher-order verbal ASC. Other verbal-like ASCs, such as the ASCs related to students’ first foreign language or second foreign language, are located near the verbal endpoint of the ASC continuum. They are assumed to also load on the higher-order verbal ASC, but less strongly than the ASC related to students’ main language of instruction. At the other end of the continuum, math ASC is assumed to represent the math endpoint in its purest form, and therefore should have the highest loadings on the higher-order math ASC. Math-like ASCs such as science-related ASCs (e.g., ASCs in physics and chemistry) are assumed to be located close to the math endpoint of the ASC continuum, and therefore to load on the higher-order math ASC as well, but to a lesser degree than math ASC.
In some cases, domain-specific ASCs cannot be straightforwardly classified into either being math-like or being verbal-like. For example, history ASC and biology ASC are each assumed to have both math-like and verbal-like aspects and are thus located in the center of the continuum. Therefore, history and biology ASCs are assumed to load on both higher-order factors (Marsh et al., 2017). Finally, GASC is specified as a first-order factor that loads on both higher-order ASC factors, in line with its conceptualization as students’ ASC related to all school subjects.
Nested Marsh/Shavelson Model
The nested Marsh/Shavelson model (Brunner et al., 2010; Figure 1c) capitalizes on advanced CFA models (Eid et al., 2003, 2017) and was developed to simultaneously account for the hierarchy and multidimensionality of ASCs. It reverts to the assumption of a hierarchically superordinate ASC that was abandoned in the Marsh/Shavelson model. In the nested Marsh/Shavelson model, all indicators (i.e., items) of domain-specific ASCs form an HGASC represented by a general (G-) factor. In addition, the indicators of domain-specific ASCs form first-order factors representing domain-specific ASCs (S-factors) that are nested under the G-factor. Items measuring GASC do not build a separate first-order S-factor but load on the G-factor only. The G-factor is specified to be uncorrelated with all domain-specific ASC factors (i.e., the S-factors). The different S-factors for the domain-specific ASC factors are allowed to correlate.
The nested Marsh/Shavelson model was empirically validated with samples of German students from elementary school (Schmidt et al., 2017) and secondary school (Brunner et al., 2008), with secondary school students from Luxembourg (Brunner et al., 2010; Gogol et al., 2017), and with secondary school students from 26 different countries (Brunner et al., 2009).
Bifactor-ESEM Representation
All models presented so far rely on CFA, which typically builds on the independent cluster model (ICM; McDonald, 1985), according to which each manifest item loads only on one single target factor without allowing any cross-loadings on other factors. However, the ICM approach might be too restrictive for multidimensional constructs such as ASC. Multidimensional constructs consist of different facets with some conceptual overlap, making cross-loadings theoretically plausible (e.g., Howard et al., 2018; Litalien et al., 2017; Marsh et al., 2009, 2011).
ESEM (Asparouhov & Muthén, 2009; Morin et al., 2013) has been recently established as a methodological framework within SEM that allows cross-loadings. When applying target rotation in ESEM, it is possible to a priori specify whether an item should have a main loading or a cross-loading on a specific factor. ESEM also allows the implementation of bifactor models where each manifest variable loads on a G-factor as well as on one or more S-factors (Morin et al., 2016; Reise, 2012; Reise et al., 2011). Bifactor models are usually orthogonal models, meaning all S-factors are specified to be mutually uncorrelated and the correlations between the S-factors and the G-factor are also set to zero. Bifactor-ESEM representations have been successfully applied to model the joint structure of SC including academic and nonacademic facets (Arens & Morin, 2017; Morin et al., 2016). Still, bifactor-ESEM representations have not yet been applied to ASC only, although a bifactor-ESEM representation of ASC can well account for the hierarchy and multidimensionality of ASC as proposed by Shavelson et al. (1976). In a bifactor-ESEM representation of ASC, each ASC item has target loadings on a G-factor representing general ASC and on its corresponding S-factor. In addition, each ASC item has nontarget cross-loadings on the other S-factors (Figure 1d). The nested Marsh/Shavelson model can be conceptualized as an incomplete bifactor model because it does not include an S-factor for GASC items (Eid et al., 2017). The bifactor-ESEM representation, on the other hand, is a complete bifactor model because it comprises S-factors for all items including GASC items.
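The loading pattern of a complete bifactor-ESEM representation can be written down as a simple data structure. The sketch below is illustrative only: the scale labels follow the abbreviations used in Figure 1, the item counts (four per domain, three for GASC) follow the Measures section, and the item names as well as the "free"/0.0 encoding are hypothetical conventions that do not mirror the syntax of any particular SEM software.

```python
# Scale labels follow Figure 1; item names such as "MSC_1" are hypothetical.
DOMAIN_SCALES = ["GSC", "ESC", "HSC", "BSC", "CSC", "PSC", "MSC"]
FACTORS = ["G"] + DOMAIN_SCALES + ["GASC"]  # one G-factor plus eight S-factors


def build_target_pattern():
    """Return {item: {factor: 'free' or 0.0}} for a complete bifactor pattern.

    'free' marks a target (main) loading; 0.0 marks a nontarget cross-loading
    that is estimated but rotated to be as close to zero as possible.
    """
    scales = [(scale, 4) for scale in DOMAIN_SCALES] + [("GASC", 3)]
    pattern = {}
    for scale, n_items in scales:
        for i in range(1, n_items + 1):
            pattern[f"{scale}_{i}"] = {
                factor: "free" if factor in ("G", scale) else 0.0
                for factor in FACTORS
            }
    return pattern


target_pattern = build_target_pattern()
print(len(target_pattern))            # number of items
print(target_pattern["MSC_1"]["G"])   # target loading on the G-factor
print(target_pattern["MSC_1"]["PSC"]) # nontarget cross-loading
```

In this encoding, removing the "GASC" entry from the factor list while keeping the GASC items would give the incomplete bifactor pattern that corresponds to the nested Marsh/Shavelson model, apart from the cross-loadings, which that model does not allow.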
The First-Order Factor Model
A first-order factor model also builds on the ICM/CFA approach and assumes separate ASC factors for different domains or school subjects as well as for GASC (Figure 1e). The different ASC factors are allowed to correlate. However, the different ASC factors (including domain-specific ASCs as well as GASC) are not hierarchically related; instead, all are located on the same level of the hierarchy. Hence, the first-order factor model reflects the multidimensionality, but not the hierarchy, of ASC.
Between-Network Analyses: Outcome Relations of ASC
The importance of ASC in education research originates from its relations with outcome variables that are examined in the context of between-network analyses (Byrne, 1996). Academic achievement is probably the most frequently examined outcome variable of ASC (Hansford & Hattie, 1982; Marsh & Craven, 2006; Valentine et al., 2004). Academic achievement can be measured through standardized achievement test scores and school grades. Many empirical studies have found ASC to be more highly related to school grades than to standardized achievement test scores (Arens et al., 2017; Marsh et al., 2005; Möller et al., 2009, 2020). School grades are highly salient as they are directly and regularly communicated to the students (Marsh et al., 2014). Furthermore, students can easily compare their own school grades in one subject with other students’ school grades in the same subject (social comparisons), with their own school grades in other subjects (dimensional comparisons), and with their own school grades in the same subject at previous points in time (temporal comparisons). These comparison processes have been conceptualized as an important mechanism of ASC formation (Möller et al., 2009, 2020; Wolff et al., 2019). The findings from between-network analyses have supported the domain specificity of ASC. Stronger relations have been found between ASC facets and achievement in matching domains (e.g., math ASC and math achievement) than between ASC facets and achievement in nonmatching domains (Marsh & Craven, 2006; Marsh et al., 2017). Domain-specific ASCs are better able to explain and predict domain-specific achievement than indicators of general ASC such as GASC (Swann et al., 2007; Valentine et al., 2004).
Comparing Key Structural Characteristics Across ASC Models
In the following section, we discuss and compare the previously introduced models of ASC with respect to their assumptions about the nature and structure of ASC (within-network analyses) and with respect to the question of how achievement relations can be tested (between-network analyses). Table 1 summarizes the main aspects.
Table 1. Theoretical and psychometric characteristics of five central ASC models
Note. ASC = academic self-concept; HO = higher-order factor model; M/S = Marsh/Shavelson model; NM/S = nested Marsh/Shavelson model; Bi-ESEM = bifactor representation using exploratory structural equation modeling; FO = first-order factor model; NA = not applicable; GASC = general academic self-concept.
One model parameter (e.g., factor correlation) needs to be fixed to a predetermined value (e.g., zero).
Multidimensionality
All five models of ASC reviewed here capture the assumption of multidimensionality as proposed by Shavelson et al. (1976)—that is, all models differentiate between separate ASC facets representing different domains.
Hierarchy
The assumption of hierarchy as proposed by Shavelson et al. (1976) is included in all of the reviewed models except for the first-order factor model—they all include some form of superordinate constructs. However, the models differ in how they construe the hierarchical nature of ASC. The higher-order factor model includes an HGASC that is located at the apex of the ASC hierarchy and combines the common variance of the first-order ASC factors. The nested Marsh/Shavelson model and the bifactor-ESEM representation include a G-factor that combines the common variance of all manifest ASC indicators (items). The Marsh/Shavelson model includes two higher-order factors (i.e., a higher-order math ASC and a higher-order verbal ASC). As the higher-order factors in the Marsh/Shavelson model are domain-specific, only the higher-order factor model, the nested Marsh/Shavelson model, and the bifactor-ESEM representation meet the original assumption regarding one hierarchically superordinate domain-unspecific, general ASC construct.
Interpretation of Domain-Specific ASCs
All the models considered here reflect a multidimensional nature of ASC and thus include domain-specific ASCs. However, the domain-specific ASCs bear different meanings in the ASC models. In the higher-order factor model, the domain-specific ASC factors capture the residual variance not explained by the HGASC. In the nested Marsh/Shavelson model, the domain-specific ASC factors capture the residual variance not explained by the G-factor. Hence, in these two models, the domain-specific ASCs are interpreted against the background of controlling for a hierarchically superordinate ASC construct (i.e., HGASC or the G-factor). In the Marsh/Shavelson model, the domain-specific ASCs capture the residual variance not explained by the higher-order math and/or the higher-order verbal ASCs. In the bifactor-ESEM representation, the domain-specific ASCs are controlled for when estimating a hierarchically superordinate construct of general ASC (by stating a G-factor) and the other domain-specific ASCs and GASC (by allowing item cross-loadings and stating an S-factor for GASC). Therefore, in all of these models, individual students’ ratings on domain-specific ASCs provide information about students’ profiles (strengths and weaknesses), while keeping constant the level of one (in the higher-order factor model, nested Marsh/Shavelson model, and bifactor-ESEM representation) or two (in the Marsh/Shavelson model) hierarchically superordinate ASC constructs. In the first-order factor model, no hierarchically superordinate construct of ASC is included and cross-loadings are not allowed. Hence, the interpretation of domain-specific ASCs is more complex: Here, the variance of domain-specific ASCs reflects a mixture of the variances of domain-specific ASCs as well as superordinate ASC constructs.
Inspired by the Marsh/Shavelson model (Marsh, 1990b) and the related idea of a math-verbal continuum of domain-specific ASCs (Marsh et al., 1988, 2015), researchers have examined the conceptual closeness of domain-specific ASCs. The models of ASC outlined here differ as to whether and how they depict conceptual closeness among domain-specific ASCs. The Marsh/Shavelson model reflects the conceptual closeness of domain-specific ASCs by the pattern of factor loadings of domain-specific ASCs on the higher-order math and higher-order verbal ASC factors. In the nested Marsh/Shavelson and the first-order factor model, the pattern of correlations among the domain-specific ASCs allows for examining the conceptual closeness of domain-specific ASCs. Importantly, in the nested Marsh/Shavelson model but not in the first-order factor model, these correlations are controlled for the variance attributable to the G-factor. In the nested Marsh/Shavelson model, the correlations among domain-specific ASCs can therefore be clearly attributed to the common variance among domain-specific ASCs. By contrast, in the first-order factor model, the correlations among domain-specific ASCs may partly reflect the common variance attributable to a superordinate ASC in addition to the common variance attributable to domain-specific ASCs. In the bifactor-ESEM representation, the pattern of cross-loadings of items on ASC factors other than the target ASC factors reflects the conceptual overlap between ASC domains. The higher-order factor model cannot provide information on the conceptual closeness among domain-specific ASCs, because all domain-specific ASCs load on the HGASC only.
Interpretation of GASC
The different ASC models conceptualize the directly assessed GASC in different ways. In the higher-order factor model, GASC is a first-order factor that captures the common variance of all items assessing GASC and that is assumed to load on the higher-order factor of HGASC. In the bifactor-ESEM representation, GASC items load on a separate S-factor and on the G-factor (and may additionally have cross-loadings on the other S-factors). In the Marsh/Shavelson model, GASC is a first-order factor that loads on both the higher-order math and verbal ASC factors. GASC is thus residualized for both higher-order factors. In other words, in the higher-order factor model, the bifactor-ESEM representation, and the Marsh/Shavelson model, GASC items explain residual variance of domain-unspecific ASC ratings that is not captured by hierarchically superior constructs (i.e., HGASC, the G-factor, or higher-order math and verbal ASCs) and other domain-specific ASCs (in the bifactor-ESEM representation). By contrast, the nested Marsh/Shavelson model does not include an S-factor for the GASC items, which are only used to define a superordinate general ASC (the G-factor). Likewise, in the first-order factor model, GASC is a separate first-order factor with correlations to the other domain-specific ASCs.
Invariance of the Meaning of Hierarchically Superordinate ASC Constructs
The different conceptualizations of GASC outlined above affect the interpretation of hierarchically superordinate constructs. In the higher-order factor model, the bifactor-ESEM representation, and the Marsh/Shavelson model, the meaning of the hierarchically superior constructs is linked to the ASC domains included in the model because the corresponding higher-order factors are defined by the common variance across domain-specific ASCs and GASC. In contrast, in the nested Marsh/Shavelson model, as there is no S-factor for GASC, the G-factor primarily depicts the common variance among GASC items and therefore retains its meaning irrespective of the other ASC domains included in the model (Eid et al., 2017).
Correlations Between ASC and Achievement
In the nested Marsh/Shavelson model, one can probe for the correlations between the G-factor and outcome variables and between domain-specific ASCs (S-factors) and outcome variables. The correlations obtained between domain-specific ASCs (S-factors) and outcome variables can be interpreted as semipartial correlations that are controlled for the G-factor. This model does not permit testing the correlations between GASC and outcome variables because there is no S-factor for GASC. The bifactor-ESEM representation enables testing the correlations between outcome variables and the G-factor, the S-factors for domain-specific ASCs, and the S-factor for GASC. The resulting correlations for the S-factors are controlled not only for the G-factor as a hierarchically superordinate construct but also for shared variance with other ASCs given possible item cross-loadings across the S-factors for domain-specific ASCs and GASC (Asparouhov & Muthén, 2009; Morin et al., 2016). The first-order factor model allows researchers to examine the correlations between all domain-specific ASCs as well as GASC and outcome variables. When interpreting the correlations, one should keep in mind that the variances of the ASC factors represent a combination of the variance that is specific to the ASC considered and shared with other ASCs included in the model. Consequently, estimates of correlations with outcome variables obtained from the first-order factor model reflect correlations that are specific to a certain ASC but also due to shared variance across ASCs.
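A simplified numerical sketch may help to see how a first-order factor correlation mixes general and specific components. It rests on a strong assumption that is not part of the models as such: a standardized first-order factor F is written as F = a·G + b·S, with G and S uncorrelated, each with unit variance, and a² + b² = 1. Under this assumption, the correlation of F with an outcome is a·r(G, outcome) + b·r(S, outcome). All numeric values are hypothetical.

```python
import math


def first_order_correlation(general_share: float,
                            r_general: float,
                            r_specific: float) -> float:
    """Correlation of a standardized first-order factor with an outcome,
    assuming the factor is a weighted sum of an orthogonal general part G
    and a specific part S (general_share = proportion of variance due to G)."""
    a = math.sqrt(general_share)        # weight of the general part
    b = math.sqrt(1.0 - general_share)  # weight of the specific part
    return a * r_general + b * r_specific


# Hypothetical values: 36% general variance, r(G, grade) = .30, r(S, grade) = .50
r_mixed = first_order_correlation(0.36, 0.30, 0.50)
print(round(r_mixed, 2))  # 0.58
```

In this toy case the first-order correlation (.58) exceeds the correlation of the specific part alone (.50) because the general part contributes as well; models with a G-factor separate these two contributions.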
In the higher-order factor model and the Marsh/Shavelson model, one typically considers only the relations of the higher-order factors to outcome variables. When studying correlations between first-order ASCs and outcome variables, the higher-order factor model and the Marsh/Shavelson model suffer from an inherent psychometric restriction, that is, the proportionality constraint (Brunner et al., 2012; Gignac, 2016; see also Chen et al., 2006; Schmiedek & Li, 2004). This constraint affects the proportion of variance in the item scores explained by higher-order and first-order ASC constructs. Specifically, the ratio of variance attributable to the first-order ASCs to variance attributable to the HGASC (in the higher-order factor model) or to the higher-order math and verbal ASCs (in the Marsh/Shavelson model) is constrained to be the same across a given set of ASC items. The estimated relations of first-order and higher-order ASCs (HGASC or higher-order math and verbal ASCs) and outcome variables are thus linearly dependent (Schmiedek & Li, 2004). In other words, the proportionality constraint limits the value of these higher-order factor models in providing insights into the relation between first-order factors (domain-specific ASCs or GASC) and outcome variables. If the relations between first-order factors and outcome variables are of interest beyond the relations between higher-order factors and outcome variables, one has to rely on highly restrictive model assumptions. Specifically, the size of one model parameter such as a correlation between a first-order factor and the outcome variable has to be fixed to a predetermined value (usually zero) in order to achieve model identification. This model constraint may distort the results when the predetermined value for the ASC-outcome relation does not reflect the true empirical relation (Brunner et al., 2012; Christensen et al., 2001).
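The proportionality constraint can be made concrete with a Schmid-Leiman-type decomposition for the case of a single higher-order factor. In a standardized solution, an item with first-order loading λ on a factor that loads γ on the higher-order factor has an implied loading of λ·γ on the general construct and λ·√(1 − γ²) on the residualized first-order construct. The sketch below uses hypothetical loadings and shows that the ratio of specific to general variance is then identical for all items of that factor, whatever their individual loadings.

```python
import math


def schmid_leiman(item_loadings, higher_order_loading):
    """Decompose standardized item loadings of one first-order factor into
    (general loading, residualized specific loading) pairs."""
    gamma = higher_order_loading
    residual = math.sqrt(1.0 - gamma ** 2)
    return [(lam * gamma, lam * residual) for lam in item_loadings]


# Hypothetical values for four items of one domain-specific ASC factor.
item_loadings = [0.9, 0.7, 0.5, 0.8]
gamma = 0.6  # loading of the first-order factor on the higher-order factor

ratios = [
    (specific ** 2) / (general ** 2)
    for general, specific in schmid_leiman(item_loadings, gamma)
]
print([round(r, 3) for r in ratios])  # the same value for every item
```

The common ratio equals (1 − γ²)/γ², so it depends only on the higher-order loading. A bifactor-type model estimates the general and specific loadings of each item separately and therefore does not impose this restriction.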
Part 2: Empirical Illustration
In Part 1, we provided an overview of five central models of the structure of ASC. These models have been developed and tested using data sets that differ regarding the domain-specific ASCs included and the student samples used (e.g., age and origin of the samples). To our knowledge, no study has yet used the same data set to empirically test the models described above. Our overview of the different ASC models shows that an empirical evaluation of these models has to rely on a data set that includes a measure for GASC as well as measures for a broad variety of domain-specific ASCs. Therefore, we implemented the models using a large data set obtained from secondary school students in Germany with a measure for GASC as well as measures for seven domain-specific ASCs. We scrutinized and compared the properties of these models with within-network analyses, by considering the resulting structural characteristics, and then with between-network analyses, by examining how the obtained correlations between ASCs and academic achievement—operationalized by school grades—vary depending on the structural model of ASC applied.
Method
Sample
On behalf of the 16 federal states in Germany, the Institute for Educational Quality Improvement (IQB) conducts regular, sample-based, large-scale assessment studies to monitor the German educational system and compare student achievement across the federal states (e.g., Pant et al., 2013). We used data from a large field trial to pretest achievement test items to be used in these assessment studies. The field trial was conducted in 2014 with a total sample of 3,258 10th-grade students. As is common in large-scale assessments, different test booklets and student questionnaires were randomly assigned to the students. The versions were randomized at the class level (i.e., all students from a class received the same questionnaire). Our analyses are based on the random subsample of N = 1,232 students (N = 587 [47.6%] boys and N = 644 [52.3%] girls, N = 1 unreported) who received a questionnaire version with measures of ASC. Students’ age ranged from 14 to 19 years with a mean age of 15.43 years (SD = 0.65). The students came from 63 classes distributed across 48 schools. All common school types of the German secondary school system were included with a (slight) oversampling (50.8%) of the academic track (Gymnasium). The majority of students (N = 1,083; 87.9%) had German as their native language, N = 145 (11.8%) students did not have German as their native language, and respective information was missing for N = 4 (0.3%) students. Participation was mandatory at the school level but not at the student level. The participation rate at the student level was 79.6%. Students were not rewarded or graded for participation. Parental consent was given for all students in the present sample.
Measures
Students’ ASCs in the domains of German (students’ main language of instruction), math, physics, English (a foreign language taught in all secondary schools in the German educational system, most commonly as students’ first foreign language), chemistry, biology, and history were measured with four items each. The items were worded identically across the seven domains: “I learn quickly in [domain]”; “I have always been good at [domain]”; “Things in [domain] are easy for me”; “It is easy for me to understand new things in [domain].” The GASC scale consisted of three items that asked the students for their self-perceptions of school competence unrelated to a specific domain: “I learn things quickly in most school subjects”; “I have always been good at school”; “Things in most school subjects are easy for me.” Students responded to all items on a 5-point Likert-type scale ranging from “does not apply at all” to “completely applies.” Hence, high ratings consistently represented high levels of ASCs in all domains. The reliability estimates in terms of Cronbach’s alpha were good for all scales (α = .83 to .95; Supplemental Table S1 in the online version of the journal).
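For readers who want to reproduce a reliability estimate of this kind, the following is a minimal sketch of Cronbach’s alpha computed from a respondents-by-items score matrix. The function name `cronbach_alpha` and the six response rows are hypothetical illustrations, not the study data or the authors’ code.

```python
import numpy as np


def cronbach_alpha(items) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                               # number of items in the scale
    item_variances = items.var(axis=0, ddof=1)       # sample variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # sample variance of the sum score
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)


# Hypothetical ratings of six students on a four-item scale (1 to 5).
demo_scores = np.array([
    [5, 4, 5, 4],
    [2, 2, 1, 2],
    [4, 4, 4, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 1],
    [4, 5, 4, 4],
])
demo_alpha = cronbach_alpha(demo_scores)
print(round(demo_alpha, 2))
```

Alpha increases as the items covary more strongly relative to their individual variances; the sketch assumes complete data, whereas the study handled missing values within the model estimation.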
Students’ school grades related to the same domains as assessed for the domain-specific ASCs—that is, school grades in German, math, physics, English, chemistry, biology, and history (for descriptive statistics, see Supplemental Table S1 in the online version of the journal). The school grades were reported by the school officials based on the latest school report (end of the 9th-grade school report obtained in summer 2014). In Germany, school grades range from 1 to 6, with 1 representing the best grade. For ease of interpretation, school grades were reverse coded for all analyses so that higher numbers represented better achievement.
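The reverse coding can be sketched as follows. The text does not state the exact transformation, so the common choice of mapping a grade g to 7 − g (which keeps the 1 to 6 range) is an assumption here, and the helper name `reverse_code_grade` is hypothetical.

```python
def reverse_code_grade(grade: int, lowest: int = 1, highest: int = 6) -> int:
    """Reverse-code a German school grade so that higher values mean better achievement.

    Assumed transformation: lowest + highest - grade (i.e., 7 - grade for the 1-6 scale).
    """
    if not lowest <= grade <= highest:
        raise ValueError(f"grade must lie between {lowest} and {highest}")
    return lowest + highest - grade


# The best grade (1) becomes 6, and the worst grade (6) becomes 1.
recoded = [reverse_code_grade(g) for g in range(1, 7)]
print(recoded)
```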
Statistical Analyses
The analyses were all conducted within the SEM framework using Mplus 8.2 (Muthén & Muthén, 1998–2019). For all models, we used the robust maximum likelihood estimator that may account for nonnormally distributed manifest variables (Hox et al., 2010). As the data had a hierarchical structure (students nested in classes), all models were implemented by using the Mplus option “type=complex,” using students’ classes as cluster variables. This option corrects for possibly biased standard errors, which can result when not considering the hierarchical structure of the data (Stapleton, 2006). In all models, we further included correlated uniquenesses between parallel-worded ASC items across domains to account for common variance (Marsh et al., 2013). To obtain model identification, the unstandardized loading of the first item on a factor was set to 1 in the models relying on the CFA framework (i.e., the higher-order factor model, the Marsh/Shavelson model, the nested Marsh/Shavelson model, and the first-order factor model). In the bifactor-ESEM representation, we used target rotation. Here, each item was assumed to have target loadings on the G-factor as well as on the S-factor representing the matching domain-specific ASC (or GASC). All items also had nontarget loadings on all remaining S-factors for domain-specific ASCs or GASC.
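The target pattern described for the bifactor-ESEM representation can be illustrated with a small bookkeeping sketch. It only builds a Boolean item-by-factor matrix (True for target loadings, False for nontarget loadings that the rotation pulls toward zero) and counts the entries; it does not perform any rotation or estimation, and the variable names are hypothetical.

```python
import numpy as np

# Seven domain-specific scales with four items each, plus three GASC items.
domains = ["German", "math", "physics", "English", "chemistry", "biology", "history"]
items_per_scale = {domain: 4 for domain in domains}
items_per_scale["GASC"] = 3

scales = list(items_per_scale)
factors = ["G"] + scales                     # one G-factor plus eight S-factors
n_items = sum(items_per_scale.values())

# True = target loading; False = nontarget loading (cross-loading).
is_target = np.zeros((n_items, len(factors)), dtype=bool)
is_target[:, 0] = True                       # every item targets the G-factor
row = 0
for s_index, scale in enumerate(scales, start=1):
    n = items_per_scale[scale]
    is_target[row:row + n, s_index] = True   # items target their matching S-factor
    row += n

n_target_on_s = int(is_target[:, 1:].sum())
n_cross_loadings = int((~is_target[:, 1:]).sum())
print(n_items, n_target_on_s, n_cross_loadings)
```

Under this layout, the 31 items yield 31 target loadings on the S-factors and 31 × 7 = 217 nontarget loadings, which matches the counts reported for the bifactor-ESEM representation in the Results.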
In a first step, we analyzed the structural models of ASC (within-network analyses; Byrne, 1996); in a second step, we extended these models by including school grades to examine the correlations between ASC and achievement as an outcome variable (between-network analyses; Byrne, 1996). We used the school grades as single-item indicators to operationalize domain-specific achievement factors. The variance of the residual term of each achievement factor was fixed to zero. We inspected the zero-order correlations between the domain-specific school grades and ASC facets. The Mplus inputs for each ASC model are reported in the supplements in the online version of the journal.
Three students had missing values on all ASC variables and had thus to be excluded from the analyses that focused on the structure of ASC only; these students could, however, be included in the analyses that focused on the relations between ASCs and school grades. On the item level, missingness for the ASC items was very low (range from 0.4% to 1.2%). For the school grades, the amount of missing data was higher and ranged from 2.3% to 20.9%. The major reason for missing values on the school grades was that single subjects were not taught or graded in a specific school, school type, or federal state in Grade 10, which can be considered as a missing at random (MAR) process. Still, the considered school subjects are part of the standard curriculum for secondary schools and thus all students were taught in these school subjects during their school career at some time. Missing values on all variables were handled by the full information maximum likelihood (FIML) approach implemented in Mplus. The FIML approach is known to be reliable and to lead to unbiased parameter estimates when handling data that are MAR or missing completely at random (Enders, 2010; Graham, 2009). The assumption of MAR cannot be empirically tested (Schafer & Graham, 2002), but serious violations of the assumption of MAR are relatively rare (Graham et al., 1997; Schafer & Graham, 2002). School grades from other subjects and domain-specific ASCs were included in the analyses; school grades of different school subjects are highly correlated, and ASCs and school grades of matching domains are also highly correlated (Möller et al., 2009, 2020). Thus, even if the value of a certain school grade drove the missing data process (i.e., data are not missing at random), this process could be approximated by including school grades in other subjects or students’ ASCs in different subjects in the FIML estimation process. 
Simulation studies have shown that including such powerful covariates in the estimation process helps reduce bias in model parameters even when data are not missing at random (Collins et al., 2001). Taken together, missing data may not impose severe threats to the validity of our findings on the comparison of the structural models of ASC or on the pattern of relations between ASCs and achievement.
Two main criteria were applied for model fit evaluation. First, we referred to several commonly accepted descriptive goodness-of-fit indices (Marsh et al., 2004): the comparative fit index (CFI), the root mean square error of approximation (RMSEA), and the standardized root mean square residual (SRMR). For the CFI, values between .90 and .95 are commonly accepted as indicators of good model fit, although some authors suggest a stricter criterion of .95 (e.g., Hu & Bentler, 1998). Concerning the RMSEA, values below .05 are indicative of a close fit, values between .05 and .08 are indicative of a reasonable fit, and values greater than .10 are indicative of a poor fit (Browne & Cudeck, 1993). For the SRMR, values below .05 indicate good model fit (Diamantopoulos & Siguaw, 2000), while cutoff values of .08 (Hu & Bentler, 1998) and even of .10 (Kline, 2005) are still accepted as adequate. Second, we inspected the resulting model parameter estimates and how well they aligned with theoretical assumptions of ASC (Bollen, 1989; Kline, 2005; Marsh et al., 2004; West et al., 2012). Particularly, we evaluated the statistical significance and size of the standardized factor loadings (λ). Following Floyd and Widaman (1995), we considered standardized factor loadings of λ ≥ .30 to be substantial. For cross-loadings resulting from the bifactor-ESEM representation, we followed Mai et al. (2018), who proposed that cross-loadings of λ ≥ .10 are nonignorable. In addition, we evaluated factor correlations among the ASCs and the correlations between ASCs and academic achievement operationalized by school grades to judge the theoretical and empirical adequacy of the ASC models.
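The descriptive cutoffs above can be collected in a small helper, shown here as a sketch rather than as a decision rule used by the authors. The function name `describe_fit`, the handling of values that fall exactly on a boundary, and the example values are assumptions; the text does not classify RMSEA values between .08 and .10, so the helper reports that range as unclassified.

```python
def describe_fit(cfi: float, rmsea: float, srmr: float) -> dict:
    """Map fit indices onto the descriptive cutoffs cited in the text."""
    if cfi >= 0.95:
        cfi_label = "meets the stricter .95 criterion"
    elif cfi >= 0.90:
        cfi_label = "within the commonly accepted .90 to .95 range"
    else:
        cfi_label = "below .90"

    if rmsea < 0.05:
        rmsea_label = "close fit"
    elif rmsea <= 0.08:
        rmsea_label = "reasonable fit"
    elif rmsea > 0.10:
        rmsea_label = "poor fit"
    else:
        rmsea_label = "between .08 and .10 (not classified in the text)"

    if srmr < 0.05:
        srmr_label = "good fit"
    elif srmr <= 0.08:
        srmr_label = "adequate (.08 cutoff)"
    elif srmr <= 0.10:
        srmr_label = "adequate (.10 cutoff)"
    else:
        srmr_label = "above all cited cutoffs"

    return {"CFI": cfi_label, "RMSEA": rmsea_label, "SRMR": srmr_label}


# Hypothetical fit values, chosen only to exercise the different branches.
example = describe_fit(cfi=0.93, rmsea=0.06, srmr=0.11)
print(example)
```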
Results
For each ASC model, we first report the key results obtained for the structural model itself. Second, we report the findings regarding the correlations between ASCs and school grades.
Higher-Order Factor Model
Most goodness-of-fit indices obtained for the higher-order factor model with and without school grades were within an acceptable range, except for the SRMR that was greater than .10 (Table 2). All domain-specific first-order ASCs and GASC were well-defined as is evident from substantial and statistically significant positive factor loadings of the items on their matching first-order factors (Table 3). Among the first-order factors, GASC displayed the highest loading on the HGASC (λ = .79). The domain-specific first-order factors for English (λ = .19), German (λ = .21), and history ASCs (λ = .35) were relatively weakly related to the HGASC, whereas biology (λ = .54), math (λ = .64), chemistry (λ = .72), and physics (λ = .72) ASCs demonstrated substantial loadings on the HGASC.
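To make these loadings easier to interpret, the sketch below derives two quantities from the standardized loadings reported above. It assumes a standard higher-order structure in which the first-order factors are related only through the HGASC, so that the share of first-order factor variance explained by the HGASC is λ² and the model-implied correlation between two first-order factors is the product of their loadings. The dictionary and function names are hypothetical.

```python
from itertools import combinations

# Standardized loadings of the first-order factors on the HGASC, as reported in the text.
hgasc_loadings = {
    "GASC": 0.79, "English": 0.19, "German": 0.21, "history": 0.35,
    "biology": 0.54, "math": 0.64, "chemistry": 0.72, "physics": 0.72,
}

# Share of each first-order factor's variance explained by the HGASC.
explained_variance = {name: lam ** 2 for name, lam in hgasc_loadings.items()}


def implied_correlation(factor_a: str, factor_b: str) -> float:
    """Correlation between two first-order factors implied by the higher-order factor."""
    return hgasc_loadings[factor_a] * hgasc_loadings[factor_b]


all_implied = {
    pair: implied_correlation(*pair) for pair in combinations(hgasc_loadings, 2)
}
print(round(explained_variance["GASC"], 2), round(implied_correlation("math", "German"), 2))
```

Because all loadings are positive, every implied correlation in this sketch is positive as well. Under the stated assumption, a single higher-order factor therefore cannot reproduce a negative correlation between math and German ASCs.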
Table 2. Goodness-of-fit indices
Note. All χ² were statistically significant (p < .05). All models were conducted with the robust maximum likelihood estimator. ESEM = exploratory structural equation modeling; CFI = comparative fit index; RMSEA = root mean square error of approximation; SRMR = standardized root mean square residual.
Table 3. Standardized factor loadings (p values in parentheses) of the ASC models
Note. The parameter estimates before the slash were obtained from the models without school grades; the parameter estimates after the slash were obtained from the models including school grades. ASC = academic self-concept.
The HGASC was positively related to academic achievement as measured by school grades (Table 4; see also Supplemental Table S2 in the online version of the journal). The correlations ranged from r = .20 for the English grade to r = .55 for the chemistry grade. The pattern of correlations showed that the HGASC was more strongly related to grades obtained in math/science subjects (math: r = .47; physics: r = .50; chemistry: r = .55) than to school grades in verbal subjects (German: r = .26; English: r = .20).
Table 4. Correlations with school grades (p values in parentheses)
Note. For the full correlations including the correlations among school grades and academic self-concepts, see the supplements (Supplemental Tables S2 to S6 in the online version of the journal); MGR = math grade; GGR = German grade; PGR = physics grade; EGR = English grade; CGR = chemistry grade; HGR = history grade; BGR = biology grade; HGASC = higher-order factor for general academic self-concept; MATH = higher-order math self-concept; VERBAL = higher-order verbal self-concept; MSC = math self-concept; GSC = German self-concept; PSC = physics self-concept; ESC = English self-concept; CSC = chemistry self-concept; HSC = history self-concept; BSC = biology self-concept; GASC = general academic self-concept; ESEM = exploratory structural equation modeling.
Marsh/Shavelson Model
The Marsh/Shavelson model fitted the data well both with and without school grades (Table 2). The first-order, domain-specific ASC factors and GASC were well-defined given the statistically significant and substantial positive loadings of the items on the corresponding factors (Table 3). Both higher-order factors were also well-defined: The higher-order math ASC was defined by math-like ASCs such as math ASC (λ = .73), chemistry ASC (λ = .74), and physics ASC (λ = .76); the higher-order verbal ASC was defined by verbal-like ASCs such as German ASC (λ = .78) and English ASC (λ = .49). The first-order GASC factor showed substantial positive loadings of similar size on both higher-order factors (higher-order math ASC: λ = .68; higher-order verbal ASC: λ = .64). History ASC and biology ASC also displayed positive loadings on both higher-order factors. History ASC showed a higher loading on the higher-order verbal ASC (λ = .44) than on the higher-order math ASC (λ = .23). Biology ASC showed similarly high loadings on the higher-order verbal ASC (λ = .34) and the higher-order math ASC (λ = .45). The correlation between the higher-order math ASC and the higher-order verbal ASC factors was not statistically significant (r = −.05).
The higher-order math ASC was more strongly related to school grades in math/science subjects such as math (r = .50), physics (r = .49), and chemistry (r = .54) than to school grades in verbal subjects (German: r = .14; English: r = .08; Table 4; see also Supplemental Table S3 in the online version of the journal). The higher-order verbal ASC, in turn, was more strongly related to school grades in verbal subjects such as German (r = .47) and English (r = .47) than to grades obtained in math/science subjects (math: r = .05; physics: r = .12; chemistry: r = .05). Both higher-order ASC factors were positively related to the biology grade (r = .36 for the higher-order math ASC, and r = .21 for the higher-order verbal ASC) and the history grade (r = .24 for the higher-order math ASC, and r = .35 for the higher-order verbal ASC).
Nested Marsh/Shavelson Model
The nested Marsh/Shavelson model provided a good fit to the data both with and without school grades (Table 2). The loadings of the domain-specific ASC items on their corresponding S-factors for domain-specific ASCs were all positive and of substantial size (range from λ = .68 for math ASC to λ = .85 for English ASC; Table 3). The G-factor was well-defined as is evident from the substantial positive loadings of the GASC items on the G-factor (λs = .71 to .85). The items measuring domain-specific ASCs also showed positive loadings on the G-factor of a similar size across the different ASC domains (λs = .31 to .51). The factor correlations (Table 5) showed a clear separation between math-like and verbal-like ASCs. For instance, the correlation between math and German ASCs was r = −.47, the correlation between math and English ASCs was r = −.31, and the correlation between German and physics ASCs was r = −.32. Positive correlations were observed among math-like ASCs: math and physics ASCs (r = .40), math and chemistry ASCs (r = .31), and physics and chemistry ASCs (r = .47). The two language ASCs (i.e., German and English) were also positively correlated (r = .20).
Table 5. Factor correlations (with p values in parentheses) of the nested Marsh/Shavelson model and first-order factor model
Note. A G-factor is only modeled in the nested Marsh/Shavelson model; a first-order factor for general ASC is only included in the first-order factor model. NM/S = nested Marsh/Shavelson model; FO = first-order factor model; ASC = academic self-concept.
Each domain-specific ASC demonstrated the highest correlations with the school grade of the matching domain (e.g., math ASC and math grade: r = .33; Table 4; see also Supplemental Table S4 in the online version of the journal). Moreover, correlations between math-like ASCs and school grades in math/science subjects were positive and statistically significant (e.g., math ASC and physics grade: r = .12; math ASC and chemistry grade: r = .15). Correlations between verbal-like ASCs and school grades in verbal subjects were positive and statistically significant (e.g., German ASC and English grade: r = .10) or not statistically significant (English ASC and German grade: r = −.03). Negative correlations were observed between math-like ASCs and school grades in verbal subjects (e.g., math ASC and German grade: r = −.12; physics ASC and English grade: r = −.17) as well as between verbal-like ASCs and school grades in math/science subjects (e.g., German ASC and math grade: r = −.23; English ASC and physics grade: r = −.14). Finally, the G-factor demonstrated substantial positive relations to school grades in all subjects (rs = .40 to .48).
Bifactor-ESEM Representation
Despite the very good model fit obtained for the bifactor-ESEM representation (Table 2), an inspection of the factor loadings indicated several estimation problems. Out of 31 target loadings on the S-factors, 11 loadings did not reach statistical significance, although they were substantial in size (i.e., λ ≥ .30; Table 6). For the S-factor of physics ASC, there was one standardized factor loading above 1 that was not statistically significant. The target loadings of the domain-specific ASC items and GASC items on the G-factor were all nonsignificant. Nevertheless, the inclusion of item cross-loadings seemed to be warranted, as 52 of the 217 item cross-loadings exceeded λ = .10 and were therefore nonignorable, although most of these cross-loadings were not statistically significant. Finally, the G-factor did not seem to be adequately defined as a domain-unspecific ASC construct: The loadings of GASC items on the G-factor ranged from λ = .48 to .70 in the model without school grades, but some domain-specific ASC items showed even higher loadings on the G-factor (e.g., items for physics ASC and chemistry ASC had loadings of λ = .73 and λ = .83, respectively).
Table 6. Bifactor-ESEM representation: Standardized factor loadings (p values in parentheses)
Note. Target factor loadings are indicated in bold. The parameter estimates before the slash were obtained from the model without school grades; the parameter estimates after the slash were obtained from the model including school grades. ESEM = exploratory structural equation modeling; ASC = academic self-concept.
When adding the school grades as outcome variables to the bifactor-ESEM representation, the factor loadings of the ASC items on the target S-factors and on the G-factor did not change in sign and were of similar size; however, the standard errors of the factor loadings (indicating precision of estimation) became considerably smaller. Therefore, all target loadings of the ASC items on their matching S-factors and on the G-factor became statistically significant, along with many cross-loadings (Table 6). The problems regarding the definition of the G-factor also vanished, because none of the loadings of domain-specific ASC items on the G-factor exceeded the highest loading of a GASC item (λ = .74).
The S-factors for domain-specific ASCs demonstrated positive and statistically significant relations to the school grades of the matching domains (e.g., math ASC and math grade: r = .38; Table 4; see also Supplemental Table S5 in the online version of the journal). Across nonmatching domains, most of the correlations between ASCs and school grades were not statistically significant; six out of 42 correlations were statistically significant but small in size (rs = −.10 to .16). The S-factor for GASC was positively and substantially correlated with school grades in all subjects (rs = .32 to .43). Similarly, the G-factor was found to share positive correlations with all school grades (rs = .14 to .41).
First-Order Factor Model
The first-order factor model provided a good fit to the data (Table 2). All ASC factors were well-defined as indicated by substantially positive factor loadings ranging from λ = .71 for GASC to λ = .92 for chemistry ASC (Table 3). GASC showed statistically significant positive correlations with all domain-specific ASCs; these correlations ranged from r = .38 (for history ASC) to r = .52 (for math ASC; Table 5). Among different domain-specific ASCs, the correlations ranged from a small, yet statistically significant negative correlation between math and German ASCs (r = −.12; i.e., between a math-like and a verbal-like ASC) to a moderate and statistically significant positive correlation between physics and chemistry ASCs (r = .60; i.e., between two math-like ASCs).
When considering the relations to achievement, the highest correlations resulted between ASCs and school grades related to the same domain (e.g., math ASC and math grade: r = .51; Table 4; see also Supplemental Table S6 in the online version of the journal). GASC displayed positive and statistically significant correlations with all school grades (rs = .39 to .47).
Discussion
Part 1 of this article provided an in-depth review of vital theoretical and methodological characteristics of five central structural models of ASC: the higher-order factor model, the Marsh/Shavelson model, the nested Marsh/Shavelson model, a bifactor representation based on ESEM, and the first-order factor model. In Part 2, we illustrated the application of these models using data from a large field trial with secondary school students from Germany. In the following sections, we discuss our empirical findings from a theoretical and methodological perspective. We then conclude with recommendations suggesting which research questions can be addressed by the different ASC models and by outlining advantages and shortcomings of each model.
Review of Empirical Findings
Multidimensionality
The assumption of multidimensionality was empirically supported in all models because all factors representing domain-specific ASCs were well-defined. In particular, the domain-specific ASC items had substantial positive loadings on their respective ASC factors in all models. The domain specificity of ASC was further corroborated by the correlations between ASCs and school grades used as achievement indicators. The nested Marsh/Shavelson model, the bifactor-ESEM representation, and the first-order factor model allow for the examination of domain-specific outcome relations. In all these models, the strongest relations between ASC and school grades were observed in the same domain (i.e., school subject), whereas the correlations between ASCs and school grades of nonmatching domains were relatively lower. This indication of domain specificity was found irrespective of whether these relations involved first-order ASCs (in the first-order factor model) or residualized ASCs controlled for the G-factor (in the nested Marsh/Shavelson model and bifactor-ESEM representation).
Hierarchy
All models assuming a hierarchical ASC structure (i.e., the higher-order factor model, the Marsh/Shavelson model, the nested Marsh/Shavelson model, and the bifactor-ESEM representation) showed a satisfactory fit to the data. In addition, the constructs representing hierarchically superordinate ASC constructs (i.e., the HGASC in the higher-order factor model, the higher-order math and verbal ASCs in the Marsh/Shavelson model, and the G-factors in the nested Marsh/Shavelson model and the bifactor-ESEM representation) were well-defined by the respective subordinate domain-specific ASC factors.
In the higher-order factor model and in the bifactor-ESEM representation, the meaning of the HGASC and the G-factor, respectively, varies with the ASC domains included. This issue was clearly illustrated in our study, which was based on a data set containing two verbal-like ASC measures (German, English), three math-like ASC measures (math, physics, chemistry), and two presumably mixed ASC measures (history, biology). This overrepresentation of math-like ASCs may explain the relatively higher factor loadings of the three math-like ASCs on the HGASC and relatively lower factor loadings of the two verbal-like ASCs on the HGASC in the higher-order factor model. A similar pattern of findings emerged in the bifactor-ESEM representation, in that the items for German and English ASCs showed low loadings on the G-factor despite defining strong S-factors. The items for math, physics, and chemistry ASCs, on the other hand, had substantial loadings both on their matching S-factors and on the G-factor. In contrast, in the nested Marsh/Shavelson model, the GASC items displayed the highest factor loadings on the G-factor relative to the loadings of all domain-specific ASC items. This is a vital characteristic of the nested Marsh/Shavelson model: The G-factor retains its meaning irrespective of which and how many ASC domains are integrated (Eid et al., 2017).
The Meaning of GASC
Measures of GASC have been increasingly disregarded by ASC research given the strong domain specificity of ASC. However, the consideration and inclusion of GASC measures can be advantageous. First, GASC items are a direct and economical measure of students’ ASC across a variety of domains or school subjects. The findings from this study corroborate this conclusion: In the higher-order factor model, the GASC factor displayed the highest loading on the HGASC of all the first-order ASC factors. In the Marsh/Shavelson model, the GASC factor had similarly sized loadings on both the higher-order math and higher-order verbal ASCs. In the nested Marsh/Shavelson model, the GASC items showed higher loadings on the G-factor compared to the loadings of the domain-specific ASC items. In the bifactor-ESEM representation, the GASC items had substantial positive loadings on the G-factor. The first-order factor model treats GASC as another first-order factor that is positively correlated with the other domain-specific ASCs, and these correlations were of similar size for the various domain-specific ASCs.
The conclusion that GASC displays a cross-sectional representation of domain-specific ASCs is further supported by the inspection of the correlations to achievement. In the first-order factor model, the relations between the first-order GASC factor and the different school grades were all statistically significant, positive, and similar in size. The same pattern of relations could be observed for the G-factor that is defined by the GASC items in the nested Marsh/Shavelson model. Finally, in the bifactor-ESEM representation, the S-factor for GASC showed similarly sized relations to the different school grades. In sum, our findings consistently supported the capacity of GASC measures to capture how students perceive their school-related abilities in general.
We also observed a substantial proportion of residual variance of GASC in the higher-order factor model, the Marsh/Shavelson model, and the bifactor-ESEM representation. This residual variance suggests that GASC items capture some specific variance that is not shared with the domain-specific ASC items. This variance might refer to ASCs in noncore school subjects (e.g., physical education or arts), or might be related to secondary or side competences such as self-regulated learning or social/cooperative skills. Hence, a second advantage of GASC measures is that they seem to contain more information than what is reflected by domain-specific ASCs alone. Still, the residual variance of GASC might not only represent these secondary academic competences but also reflect students’ differential weighting of school subjects or domains when responding to GASC items. In other words, students might have different school subjects or domains in mind when asked for their self-perceptions related to “all school subjects.” The potential subjectivity of GASC items is a common characteristic of measures assessing general constructs not tied to specific domains. For example, Pavot and Diener (1993, p. 164) concluded that items assessing general life satisfaction (e.g., “In most ways my life is close to my ideal”) have the advantage that individuals can “weight domains of their lives in terms of their own values, in arriving at a global judgment of life satisfaction.” In a similar vein, GASC items have the advantage of allowing students to weight different school subjects according to what is important to them and to apply their own standards for success when being asked how they perceive their abilities related to “all school subjects.” Studies explicitly asking students about the domain(s) they think of when responding to GASC items might help to further illuminate the meaning and subjectivity of GASC measures.
A third advantage of GASC measures is related to the finding that ASC measures and outcome variables (e.g., achievement, learning behavior) are the most strongly related when both are located on the same level of hierarchy and when they address the same content (Swann et al., 2007). When the task is to explain and predict domain-unspecific outcomes (e.g., general academic achievement, persistence at school), GASC measures might be more useful than domain-specific ASC measures.
Conceptual Closeness of Domain-Specific ASCs
Apart from the higher-order factor model, all models presented here offer the possibility to inspect the conceptual closeness of domain-specific ASCs. The relevant models consistently demonstrated the separation between math and verbal ASCs as well as closer associations between different math-like ASCs and between different verbal-like ASCs, providing empirical support for a math-verbal continuum of ASCs. In the Marsh/Shavelson model, the higher-order math and verbal ASCs were nearly uncorrelated. In the nested Marsh/Shavelson model and the first-order factor model, math and German ASCs even showed a negative correlation. In addition, the correlations between math-like ASCs and verbal-like ASCs were considerably lower than the correlations within math-like ASCs or within verbal-like ASCs. In the nested Marsh/Shavelson model and in the first-order factor model, we found positive correlations between math, physics, and chemistry ASCs as math-like ASCs. Similarly, we found positive correlations between the verbal-like German and English ASCs. In the first-order factor model, the correlations between and among math-like and verbal-like ASCs partially reflect the common variance among domain-specific ASCs that can be attributed to a hierarchically superordinate construct of general ASC. In the nested Marsh/Shavelson model, however, these correlations are controlled for the G-factor. This might also explain the stronger (negative) correlation between math and German ASCs in the nested Marsh/Shavelson model as compared to the first-order factor model. In the bifactor-ESEM representation, the items measuring math ASC revealed negative cross-loadings on the German ASC factor, and vice versa. Furthermore, we found nonignorable negative cross-loadings between physics and German ASCs, and between chemistry and German ASCs, further illustrating the separation between math-like and verbal-like ASCs. 
Still, we found positive cross-loadings between math and physics ASCs and between German and English ASCs, illustrating the conceptual closeness among math-like and verbal-like ASCs, respectively.
History and biology ASCs have been placed at the center of the math-verbal continuum of ASC and have thus been assumed to have both math-like and verbal-like characteristics. The findings of the present study lent support to the idea of considering biology ASC to be both math-like and verbal-like, as reflected by similar substantial loadings on the higher-order math and the higher-order verbal ASCs in the Marsh/Shavelson model. History ASC, however, seems to be more verbal-like than math-like, given its substantial loading on the higher-order verbal ASC but nonsubstantial loading on the higher-order math ASC. This conclusion was also supported by the findings from the bifactor-ESEM representation, which revealed positive cross-loadings between history and German ASCs. Moreover, in the nested Marsh/Shavelson model, we found a positive correlation between the S-factors for German ASC and history ASC but a negative correlation between the S-factors for math ASC and history ASC. Finally, in the first-order factor model, the correlations between German or English ASCs and history ASC were higher than the correlation between math ASC and history ASC. Hence, findings from different structural models of ASC support the conclusion that history ASC is more verbal-like than math-like. This should be considered in future models on the structure of domain-specific ASCs; previous conceptualizations of history ASC as being both math-like and verbal-like should be revised.
Recommendations
“All models are approximations. Essentially, all models are wrong, but some are useful” (Box & Draper, 1987, p. 424). This common aphorism in statistics applies well to models of the ASC structure. All ASC models are associated with specific limitations that should be balanced against their advantages and usefulness when selecting the most appropriate model for a specific research question. We therefore conclude by offering researchers guidance in how to select an appropriate model contingent upon the specific research questions and study aims and by pointing out the main limitations associated with each ASC model.
Higher-Order Factor Model
Research questions
From a theoretical standpoint, the higher-order factor model is well suited to empirically testing the assumptions of Shavelson et al. (1976). From a methodological standpoint, the higher-order factor model is adequate to assess the variance shared across all ASC items included in a study as represented in the HGASC. In terms of application, the higher-order factor model might be useful in studies that aim to aggregate ASC scores across several domains in order to test the relation of this aggregated ASC score to outcome variables that are also measured on a general level, such as grade point average or general school satisfaction.
Limitations
First, in data sets that include a broad variety of domain-specific ASC measures like the one used in the present study, the higher-order factor model fits the data worse than alternative structural models of ASCs. That is, the higher-order factor model does not reflect the empirical relations among ASC measures as well as other models do. For instance, it does not incorporate the consistently observed differentiation between math and verbal ASCs. Second, the higher-order factor model suffers from the proportionality constraint. Hence, it is not well suited to study the relations between first-order ASCs (i.e., domain-specific ASCs or GASC) and outcome variables, because the ratio of variance attributable to a first-order ASC and to the higher-order factor is constrained to be the same across the ASC items. Third, the meaning of HGASC depends on the ASC domains under investigation, limiting the comparability of results across studies using data sets with different domain-specific ASC measures. If the goal for HGASC is to represent general ASC as accurately as possible, a high number of domains as well as a balance of math-like and verbal-like domains would be desirable. Finally, given its foundation on the ICM/CFA approach, the higher-order factor model does not take item cross-loadings into account, although they are plausible in multidimensional constructs such as ASC.
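The proportionality constraint can be illustrated numerically. The following minimal sketch applies a Schmid-Leiman-type decomposition to a higher-order factor model with standardized factors. All loading values, the two domain labels, and the function name are hypothetical and were chosen only for illustration; they are not estimates from the present study.

```python
import numpy as np

# Hypothetical standardized item loadings on two first-order ASCs
# (illustrative values, not estimates from the present study).
ITEM_LOADINGS = {"math": [0.9, 0.7, 0.5], "german": [0.8, 0.6]}
# Hypothetical loadings of each first-order ASC on the higher-order factor (HGASC).
HIGHER_ORDER_LOADINGS = {"math": 0.6, "german": 0.8}


def general_to_specific_ratio(item_loadings, gamma):
    """Return, per item, the ratio of general to domain-specific variance."""
    lam = np.asarray(item_loadings, dtype=float)
    # Indirect loading of each item on the higher-order factor.
    general = lam * gamma
    # Loading of each item on the residualized first-order factor.
    specific = lam * np.sqrt(1.0 - gamma**2)
    return general**2 / specific**2


ratios = {
    domain: general_to_specific_ratio(ITEM_LOADINGS[domain], HIGHER_ORDER_LOADINGS[domain])
    for domain in ITEM_LOADINGS
}
for domain, values in ratios.items():
    print(domain, np.round(values, 4))
```

Within each domain, every item yields the same ratio, namely gamma squared divided by one minus gamma squared, regardless of the size of its own loading. This is the sense in which the ratio of general to specific variance is constrained to be the same across the items of a first-order ASC.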
Marsh/Shavelson Model
Research questions
From a theoretical standpoint, the Marsh/Shavelson model is useful for examining the location of domain-specific ASCs along a math-verbal continuum. From a methodological standpoint, the Marsh/Shavelson model is well suited to examining the common variance across math-like ASCs and across verbal-like ASCs, as reflected in the higher-order math and verbal ASCs. For instance, based on the Marsh/Shavelson model, researchers have examined which science domains are most math-like (Jansen et al., 2015). The model may also be useful for classifying students based on their preference for either the math or verbal domain, using their scores on the higher-order math versus higher-order verbal ASCs. When the need for parsimony only allows the inclusion of higher-order math and higher-order verbal ASCs instead of a variety of domain-specific ASCs, the higher-order ASCs can be useful for predicting students’ educational choices. For example, one might predict that students’ major choice at college or university in the science, technology, engineering, and mathematics domain is associated with higher scores on the higher-order math ASC than on the higher-order verbal ASC (Guo et al., 2015).
Limitations
First, as the Marsh/Shavelson model assumes two higher-order factors (i.e., a higher-order math ASC and a higher-order verbal ASC), it suffers from the proportionality constraint. This means that correlations among all first-order and higher-order ASCs and outcome variables cannot be examined with this model. Second, the meanings of the (math and verbal) higher-order constructs depend on the ASC domains under investigation, which limits the comparability of results across studies that include varying domain-specific ASC measures. A high number and balanced selection of math-like and verbal-like domains will result in a higher substantive validity of higher-order math and verbal ASCs. Third, the Marsh/Shavelson model cannot account for differences between ASCs within the math or verbal domains, as all math-like or verbal-like ASCs are assumed to commonly load on the higher-order math ASC or higher-order verbal ASC, respectively. Finally, the Marsh/Shavelson model relies on the ICM/CFA approach and therefore does not consider possible item cross-loadings across domain-specific ASCs.
Nested Marsh/Shavelson Model
Research questions
From a theoretical standpoint, the nested Marsh/Shavelson model is well suited to simultaneously testing the multidimensionality and hierarchy of ASC following the theoretical assumptions of Shavelson et al. (1976). From a methodological standpoint, as an incomplete bifactor model, the nested Marsh/Shavelson model is an interesting refinement and advancement of previous CFA models. The meaning of the G-factor is invariant even in cases of varying numbers and scope of the domain-specific ASC measures. Hence, this model may be particularly useful in integrative data analyses (e.g., across several data sets based on independent samples; Curran, 2009), in longitudinal research (e.g., when different domain-specific ASC measures are considered over time to reflect the changing school curricula in different grade levels), or in studies that include an unbalanced number of domain-specific math-like and verbal-like ASCs. The nested Marsh/Shavelson model can further be used to examine the relations between all domain-specific ASCs and the G-factor and outcome variables (e.g., achievement). Finally, the nested Marsh/Shavelson model is ideally suited to depicting students’ profiles of self-perceived strengths and weaknesses. In other words, students’ scores on the S-factors for domain-specific ASCs represent their individual profiles of self-perceived strengths and weaknesses in different domains after controlling for the G-factor (Schmidt et al., 2017). Hence, the nested Marsh/Shavelson model can be used to better understand the importance of individual profiles for students’ developmental trajectories (e.g., course selection).
Limitations
First, the nested Marsh/Shavelson model does not include an S-factor for GASC, and thus cannot be used to study research questions that focus on GASC in terms of specific variance that is not shared across the domain-specific ASC measures included. Second, given its foundation on the ICM/CFA approach, the nested Marsh/Shavelson model does not take item cross-loadings into account, but these cross-loadings may explain at least some item variance over and above the target loadings on domain-specific ASCs and the G-factor.
Bifactor-ESEM Representation
Research questions
From a theoretical standpoint, the bifactor-ESEM representation simultaneously reflects the multidimensionality and hierarchy of ASC following the assumptions of Shavelson et al. (1976). With this model, researchers can test the relations between domain-specific ASCs, GASC, and the G-factor and outcome variables (e.g., achievement). From a methodological standpoint, it is a full bifactor model including an S-factor for GASC, which makes it possible to test whether GASC retains some specific variance over and above the G-factor and how this specific variance is related to outcome variables. Moreover, based on the ESEM approach, the bifactor-ESEM representation allows for item cross-loadings, which are theoretically plausible in multidimensional constructs such as ASC. This model characteristic might be particularly useful for the purposes of scale development and evaluation. In fact, item cross-loadings reflect the extent to which ASC items measure conceptually close but distinct ASCs over and above the target ASCs. This empirical knowledge may help construct ASC scales that better measure a certain target construct.
Limitations
First, the bifactor-ESEM representation cannot probe for correlations among domain-specific ASCs (including GASC), as the corresponding factors are specified to be orthogonal. Second, the meaning of the G-factor reflecting general ASC may change contingent upon the number and selection of domain-specific ASCs considered. Hence, studies using this model should include a large and balanced number of domain-specific math-like and verbal-like ASCs. Third, in our study, some estimation problems concerning the size and statistical significance of factor loadings occurred in models without school grades as outcome variables. Hence, considerably more knowledge (e.g., on sample size, number of factor loadings, ratio of number of indicators per factor) is needed to identify the conditions for obtaining reliable parameter estimates in the bifactor-ESEM representation.
First-Order Factor Model
Research questions
From a theoretical standpoint, the first-order factor model depicts the multidimensionality of ASC, and therefore reflects one core assumption of the ASC structure as proposed by Shavelson et al. (1976). Moreover, this model enables researchers to examine the relations between domain-specific ASCs and GASC and outcome variables (e.g., achievement). Finally, the first-order factor model can probe for the conceptual closeness of ASCs within and across domains. From a methodological standpoint, it is surely the simplest model of the ASC structure and can thus be used as a starting point for data evaluation. Practically, many studies are based on the first-order model, particularly when they include only domain-specific ASC measures (i.e., no measure of GASC) and when they are not interested in possible higher-order constructs of ASC or in the structure of ASC. The first-order factor model is frequently used to examine relations between ASC and outcome variables, such as studies on the cross-sectional and longitudinal relations between ASC and achievement within one domain (e.g., Arens et al., 2017; Marsh et al., 2015) or across several domains (e.g., Möller et al., 2009, 2020; Weidinger et al., 2019).
Limitations
First, the first-order factor model does not consider the hierarchy of ASC; thus, it does not fully represent the theoretical assumptions of the nature of ASC (Shavelson et al., 1976). Consequently, the resulting domain-specific ASCs represent a blend of variance attributable to a hierarchically superordinate ASC construct and to domain-specific ASC facets. Second, given its foundation on the ICM/CFA approach, the first-order factor model does not take into account the theoretically plausible and empirically validated item cross-loadings of the multidimensional ASC construct.
General Remarks
An ideal approach to each ASC study would surely be to comparatively test and evaluate the different ASC models using multiple criteria, including those applied in the present study. Researchers could thus learn about the consistency of empirical findings across different ASC models and gain insight into how the results vary depending on which ASC model is applied. Still, this might not always be possible because the different ASC models have different requirements for use. The higher-order factor model, the Marsh/Shavelson model, and the bifactor-ESEM representation need a broad array of domain-specific ASC measures as well as a GASC measure. Here, a balanced number of domain-specific math-like and verbal-like ASCs is recommended to prevent verbally or mathematically biased hierarchically superordinate constructs. Given the invariance of the G-factor in the nested Marsh/Shavelson model, a minimal specification of this model requires only the assessment of one domain-specific ASC in addition to GASC items. The first-order factor model, in turn, can be used even when only one domain-specific ASC is assessed. This might be the case in cross-sectional and longitudinal large-scale studies such as the Programme for International Student Assessment (PISA) or the Trends in International Mathematics and Science Study (TIMSS), which cover many student variables beyond ASC.
Concerning sample size requirements, the first-order factor model seems to have the least restrictive requirements. The simulation study by Wolf et al. (2013) implied that a sample of N = 150 students is sufficient to obtain reliable model parameters in a CFA model with three factors that are defined by item indicators with substantial loadings (λ > .80). Furthermore, the required sample size does not substantially increase when more than three well-defined factors are included. Hence, the required sample size depends on the measurement quality of the factor indicators (i.e., factor loadings of the item indicators), suggesting that the use of well-validated ASC instruments is important. Also, the required sample size varies contingent upon the number of indicators (e.g., items) per factor (Bollen, 1989), making it difficult to establish general rules on sample size requirements. With respect to more complex models involving at least one hierarchically superordinate ASC construct, it is even more challenging to postulate sample size requirements. Simulation studies would thus be helpful for the purpose of study planning (Muthén & Muthén, 2002) and can build on the empirical findings of the present study. For example, the simulation study by Morgan et al. (2015) suggested that a sample of N = 200 is required for analyzing higher-order factor models (with one higher-order and four first-order factors with two to three indicators per first-order factor) and bifactor-CFA models (with one G-factor and four S-factors with two to three indicators per S-factor).
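The logic of a simulation study for study planning can be sketched in a few lines. The example below is a deliberately simplified illustration and not the design used by Wolf et al. (2013) or Morgan et al. (2015): it assumes a single factor with three standardized indicators, for which a loading can be recovered in closed form from the sample covariances, so that no SEM software is needed. The population loadings of .80, the sample sizes, the number of replications, and all function names are hypothetical.

```python
import numpy as np


def simulate_loading_estimates(n, loadings=(0.8, 0.8, 0.8), reps=500, seed=0):
    """Simulate closed-form estimates of the first loading in a one-factor model."""
    rng = np.random.default_rng(seed)
    lam = np.asarray(loadings, dtype=float)
    # Population covariance of standardized indicators: lam lam' off the diagonal,
    # unit variances on the diagonal.
    sigma = np.outer(lam, lam)
    np.fill_diagonal(sigma, 1.0)
    estimates = np.empty(reps)
    for r in range(reps):
        data = rng.multivariate_normal(np.zeros(3), sigma, size=n)
        s = np.cov(data, rowvar=False)
        # With the factor variance fixed to 1: lam1^2 = s12 * s13 / s23.
        squared = s[0, 1] * s[0, 2] / s[1, 2]
        estimates[r] = np.sqrt(squared) if squared > 0 else np.nan
    return estimates


def summarize(n_values=(50, 150, 500)):
    """Return the mean and empirical standard deviation of the estimates per N."""
    summary = {}
    for n in n_values:
        est = simulate_loading_estimates(n)
        summary[n] = (np.nanmean(est), np.nanstd(est, ddof=1))
    return summary


results = summarize()
for n, (mean_est, sd_est) in results.items():
    print(f"N = {n}: mean estimate = {mean_est:.3f}, empirical SD = {sd_est:.3f}")
```

Comparing the empirical standard deviations across sample sizes shows how the precision of the loading estimate improves with N. A full planning study would instead fit the intended ASC model in SEM software and additionally track parameter bias, coverage, and power.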
Conclusion
Starting with the seminal theoretical model by Shavelson et al. (1976), researchers have been interested in investigating the relations among domain-specific ASCs, between domain-specific ASCs and GASC, and between ASC and outcome variables. In this context, researchers often face the question of which structural model of ASC should be used. We systematically compared and empirically illustrated those structural models that have been most often applied in past research or that have been recently established through methodological developments. To this end, we discussed in detail the inherent properties of the higher-order factor model, the Marsh/Shavelson model, the nested Marsh/Shavelson model, the bifactor-ESEM representation, and the first-order factor model. Moreover, we outlined how these models represent key theoretical assumptions and empirical findings concerning the structure of ASC, how the included factors can be interpreted, and how they can address substantive questions of ASC research and theory. Each ASC model has its advantages and limitations when it comes to answering different research questions. Hence, careful consideration is needed when selecting a specific ASC model. Other core constructs in education research (e.g., academic anxiety, academic interest) have theoretical underpinnings that are similar to those of ASC, as they can also be conceptualized as multidimensional and hierarchical in nature (Gogol et al., 2017). We therefore hope that our review provides helpful guidance not only to researchers choosing between structural models representing ASC but also to researchers focusing on other core constructs of education research and psychology.
Supplemental Material
Supplemental material, sj-docx-1-rer-10.3102_0034654320972186 for The Structure of Academic Self-Concept: A Methodological Review and Empirical Illustration of Central Models by A. Katrin Arens, Malte Jansen, Franzis Preckel, Isabelle Schmidt and Martin Brunner in Review of Educational Research
Footnotes
Preparation of this article was supported by a Heisenberg fellowship grant from the German Research Foundation to A. Katrin Arens (AR 877/3-1).
Authors
A. KATRIN ARENS, PhD, holds a Heisenberg postdoctoral position funded by the German Research Foundation (DFG) at the DIPF/Leibniz Institute for Research and Information in Education, Department on Research on Education and Human Development and Centre for Research on Individual Development and Adaptive Education of Children at Risk (IDeA), Rostocker Str. 6, D-60323 Frankfurt am Main, Germany.
MALTE JANSEN, PhD, is head of the research data center at the Institute for Educational Quality Development at the Humboldt University of Berlin, Unter den Linden 6, 10099 Berlin, Germany; email: malte.jansen@iqb.hu-berlin.de. The research data center is also part of the center for international student assessment (ZIB). Malte Jansen’s main research interests are student motivation, social and dimensional comparisons, and social networks.
FRANZIS PRECKEL, PhD, is a full professor and head of the Chair of Giftedness Research and Education at the Department of Psychology, University of Trier, Universitätsring 15, D-54296 Trier, Germany.
ISABELLE SCHMIDT, PhD, is a senior researcher at GESIS-Leibniz Institute for the Social Sciences, B2 1, 68159 Mannheim, Germany.
MARTIN BRUNNER, PhD, is a full professor and head of the Chair of Quantitative Methods in Educational Sciences at the Department of Educational Sciences, University of Potsdam, Karl-Liebknecht-Straße 24–25, Potsdam, Brandenburg 14476, Germany.