Abstract
This article addresses aggregation as a fundamental practice in educational psychology and ties it into the idiographic/nomothetic distinction, that is, distinguishing between studying what once was and studying what always is. I address the underlying assumptions of seminal educational research (OECD’s large-scale assessments and Hattie’s synthesizing meta-analyses). I argue that educational psychologists assume a priori general educational principles akin to nomothetic laws without sufficiently scrutinizing the limitations of aggregation. I then contextualize this assumption within the history of psychology, and address how these assumptions shape how educational psychologists view, collect, and examine data. Furthermore, I contextualize this assumption with an example showing a peculiarity of educational research: the existence of multiple perspectives on constructs. Finally, I argue that investing time and resources in the debate on aggregation and the epistemic nature of the insights that educational psychologists generate will ultimately advance the field and help bridge the theory–practice gap.
This article deals with the practice of aggregation—“combining and summarizing a set of scores into a smaller set of scores that capture an aspect of the original set” (American Psychological Association [APA], 2022, para. 2)—in educational psychology. I argue in this article that educational psychologists assume a priori general educational principles akin to nomothetic laws. This assumption, in turn, shapes how we view, collect, and examine data. Accordingly, educational psychologists should scrutinize their own assumptions and aggregate data to higher levels based on well-founded rationale. This scrutiny should also be accompanied by heeding the limitations of what type of insights we can achieve by only studying aggregates (Lamiell, 1998).
To develop this line of reasoning, I first discuss aggregation in educational psychology. I then go into the practice of aggregation in relation to striving for nomothetic laws. In doing so, I revisit the idiographic/nomothetic distinction and relate it to today’s educational psychology. Building on this, I address the underlying assumptions of seminal educational research. In doing so, I argue that educational psychologists assume a priori general educational principles akin to nomothetic laws. I then go into the epistemic implications of this approach to research and discuss the limitations inherently imposed by studying aggregates in samples and how aggregates are limited in yielding nomothetic insights. Furthermore, I contextualize these lines of reasoning by an example showing a peculiarity of educational research: the existence of multiple perspectives on constructs. Therein, I address how our assumptions on general educational principles can shape how educational psychologists view, collect, and examine data.
Lastly, I discuss the theory–practice gap in educational psychology. I do so by putting forth the notion that the transfer of educational findings into practice may be hindered by quantitative educational psychologists aggregating to ever higher levels without considering the type of insights this process generates. Before this backdrop, I argue that investing time and resources in this debate may advance the field by helping to bridge the divide between theory and practice.
I conclude the article with two recommendations for educational psychologists. I invite all psychological researchers to examine if these can be applied to their respective area of research.
On aggregation in educational psychology
The practice of aggregation is inherent to today’s quantitative educational psychology. In theory, we can aggregate data as soon as the number of observations exceeds one. From a statistical perspective, we can already aggregate the data of two participants, most commonly by averaging. While a mean of n = 2 is intuitively not meaningful, n = 2 already enables nonzero variances and nonzero degrees of freedom; consequently, commonly used statistical procedures relying on variance become available (e.g., from t-tests to structural equation models). Going from such minimum requirements for aggregation to the other extreme, educational studies have reached high aggregation levels of up to approximately 236 million students (Hattie, 2009), or entire countries (e.g., The Programme for International Student Assessment, i.e., PISA; Organisation for Economic Co-operation and Development [OECD], 2000). However, we can even observe aggregation within educational case studies (e.g., single-case experimental designs; Plavnick & Ferreri, 2013) or even research of individual teachers (cf. action research; Altrichter, 1990). Essentially, aggregation is present as soon as educational psychologists can use means.
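The minimum case of n = 2 can be made concrete with a short sketch. The two scores are hypothetical values chosen purely for illustration; the point is only that the mean, a nonzero sample variance, and one degree of freedom are all defined as soon as two observations exist.

```python
from statistics import mean, variance

# Hypothetical test scores of two students (illustrative values only).
scores = [4.0, 6.0]

n = len(scores)
avg = mean(scores)      # the aggregate: arithmetic mean of both scores
var = variance(scores)  # sample variance, i.e., divided by n - 1
df = n - 1              # degrees of freedom left after estimating the mean

print(avg, var, df)     # 5.0 2.0 1
```

With a single observation, the sample variance is undefined (`variance` raises an error), which is why variance-based procedures only become formally available from n = 2 onwards.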
This omnipresence of the aggregate is not a new phenomenon. Danzinger (1990) describes the historical perspective of how psychology shifted from experimentally studying individuals to making knowledge claims based on aggregation of either societal or artificially created groups. Therein, Danzinger highlights how presenting means became suggestive of general laws describing the individual. This shift was supported by aggregated social statistics repeatedly demonstrating regularities (e.g., crime rates in certain districts). Eventually, “psychologists took to presenting their data as the attributes of collective rather than individual subjects” (Danzinger, 1990, p. 87).
Furthermore, educational psychology is signified by closeness to other social sciences and their statistical methods (Gräsel, 2011). This closeness to statistical methods is reflected in the fact that among the psychological journals it was the Journal of Educational Psychology that most rapidly adopted the shift to means across the 1910s, 1920s, and 1930s (Danzinger, 1990, p. 81). For some subdisciplines of educational psychology, this development with its closeness to other areas of psychological research is even made explicit (e.g., comparative education research; Jornitz & Wilmers, 2021).
This triumph of the aggregate has also been the subject of critique in educational psychology. One side of this critique hails from a theoretical level. For example, Smeyers (2015) calls for a critical appraisal of common quantitative educational practices and the promise of simplicity (p. 1383). Questioning this promise can already be understood as criticism of the aggregate: it questions if the implicit simplicity of aggregates is warranted. After all, relying on aggregates can tempt us to dismiss the heterogeneity and thereby the variance that may be underlying the aggregate.
To illustrate with a concrete example, scholars have recently criticized PISA’s comparisons of countries’ average performances (Miller & Fonseca, 2021). The criticism is based on the fact that “an average is just an average” (Miller & Fonseca, 2021, p. 195). Consider, for example, that selected schools from so-called low performing countries can outperform selected schools from so-called high performing countries despite the average differences on the country level. Accordingly, aggregated country level differences in performance do not govern the discrete empirical entities underlying this aggregate.
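The point that “an average is just an average” can be sketched with invented numbers. The school-level scores below are made up and are not PISA data, and the country labels are hypothetical; the sketch only shows that a higher country mean does not order every pair of schools.

```python
from statistics import mean

# Hypothetical school-level mean scores (made-up values, not PISA data).
schools = {
    "Country A": [520, 540, 560, 580],  # so-called high performing country
    "Country B": [430, 450, 470, 550],  # so-called low performing country
}

# Aggregate each country's schools into one country-level mean.
country_means = {country: mean(s) for country, s in schools.items()}

# Cross-country school pairs in which the school from the lower-ranked
# country (B) nevertheless outperforms the school from country A.
reversals = [
    (b, a)
    for a in schools["Country A"]
    for b in schools["Country B"]
    if b > a
]

print(country_means)
print(reversals)
```

In this invented data set, Country A has the higher mean, yet one school from Country B outperforms two schools from Country A. The country-level difference therefore describes the aggregates, not each school underlying them.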
We can also view this above-mentioned critique of the aggregate in terms of criticizing reductionism in educational psychology. However, we need to clearly distinguish this critique from the overarching discussion on reductionism in educational research. Such a distinction is necessary, as this line of critique encompasses many other lines of reasoning (for an overview, see Allen, 1991; Wrigley, 2019). This specific critique on the aggregate better fits into the narrative of scholars criticizing meta-analyses (Pant, 2014; Wrigley, 2018) and how they tempt us to simplicity at the price of obscuring existing heterogeneity (see Feinstein, 1995; Glass & Robinson, 2004). This promise of simplicity may also be seen as a response to the complexity of educational psychology. After all, this complexity has led seminal scholars to designate educational research as the hardest science of all (Berliner, 2002).
Despite the criticism outlined above, currently, there is a strong tendency in educational psychology to aggregate data to ever higher levels. This tendency is illustrated by synthesizing meta-analyses (e.g., Hattie, 2009) or large-scale assessments such as PISA (see OECD, 2000). Scholars are repeatedly outlining the superiority of higher levels of aggregation over lower levels (as a prominent example, see Hattie, 2009). Moreover, this tendency is also reflected in calls for more data science methods in educational psychology (e.g., Singer, 2019) or even specific recommendations for statistical analyses geared towards handling more and more data (e.g., machine learning; Lindl et al., 2020).
Unfortunately, this tendency does not seem to be accompanied by a prominent debate among quantitative educational psychologists on what the underlying aggregation implies and what its inherent limitations for knowledge generation are. I argue that, to advance the field and its best practice, educational psychology needs to lead this debate on aggregation in close collaboration with quantitative educational psychologists.
To this end, I put forth the notion that educational psychologists’ tendency to aggregate data to ever higher levels is driven by the desire to find general educational principles, that is, educational psychologists strive for their generated knowledge to yield nomothetic laws in the educational context. In the quest for such general educational principles, researchers are trying to remove effects of the context, the situation, and the zeitgeist. At the same time, educational psychology seems to fall short of acknowledging the limitations inherent to analysing aggregates.
Aggregating to strive for nomothetic laws
On the nomothetic/idiographic distinction
Let us first revisit the distinction of studying what once was and studying what always is. Windelband (1921) termed these idiographic and nomothetic, respectively. Nomothetic research deals with laws for distinct classes of phenomena, whereas idiographic research describes principles bound by time, context, and culture (Grice et al., 2006). The distinction is part of a debate on our understanding of science and on what divides and unites different disciplines. Scholars often date Windelband’s distinction to his work in 1921, but, in fact, it goes back to a speech in 1894 (Hurlburt & Knapp, 2006; Kemmis, 1978; Lamiell, 1998; von Wright, 1971) or even 1876 (Salvatore & Valsiner, 2010). Therein, Windelband differentiated between nomothetic and idiographic research in an attempt to offer a common ground for the different fields of science.
Tracing back the steps, the idiographic/nomothetic distinction of Windelband was brought via Münsterberg to Stern to Allport and, finally, to psychology’s attention (Hurlburt & Knapp, 2006). The simplified notion in psychology seems to be that nomothetic laws apply to groups while idiographic laws explain the individual (Beck, 1953; Falk, 1956); a prevailing view (see, e.g., Diemer & Gore, 2009; Lazarus et al., 2020; Renner et al., 2020). Consequently, nomothetic psychological laws are derived by reducing an individual to a selected subset of traits, which we then study in groups of individuals. The credo is to study interindividual differences. By contrast, idiographic psychological laws are derived by studying an individual and all their traits. To do so, we acknowledge that individuality is made up of an entire universe of traits, which can all interact with each other. The credo is to study intra-individual differences.
The idiographic/nomothetic distinction has led to the assumption that idiographic and nomothetic are opposites. On the one hand, this assumption led to research aiming to bridge this divide (e.g., Grice et al., 2006) and scholars calling to reconsider existing paradigms (Molenaar, 2004). This distinction also gave rise to different approaches to research, each with distinct research aims concerned with either variables or persons (Howard & Hoffman, 2018). In the more extreme case, this view has led to scholars stressing that idiographic research is an “antiscience” (Nunnally, 1978).
On the other hand, the above-mentioned distinction is an inherent controversy which scholars have tried to homogenize. For example, Meehl (1954) proposed that laws of (human) behaviour are nomothetic for a given group, idiographic when applied to the individual, and finally, strongly idiographic when considering response properties of organisms. Others stress that idiographic and nomothetic research need to go hand in hand (Beck, 1953). Another approach is to find nomothetic laws in idiographic approaches, that is, through collecting more data on intra-individual differences to derive nomothetic laws (e.g., Bakan, 1955; Renner et al., 2020).
Going back to Windelband, he would likely be very surprised at how the debate in psychology around nomothetic and idiographic research deviates from what he intended. Lamiell (1998) gives an overview of how we can understand the original distinction of Windelband. In doing so, he draws a very different picture from that of contemporary psychology. Far beyond the simplified notion of “nomothetic = group & idiographic = individual,” he stresses that nomothetic was never limited to groups and interindividual differences, nor idiographic to the individual and intra-individual differences. Put more generally, nomothetic refers to the sciences of laws (i.e., what always is) and idiographic to the sciences of events (i.e., what once was). The former referred to laws for distinct classes of phenomena, the latter to principles bound by time, context, and culture (Grice et al., 2006).
When we ourselves as humans become the object of inquiry, we should look for general laws that we can apply to the individual, even though this may never reach the goal of explaining the individual. Windelband’s underlying assumption was that even after applying general laws to the individual, something will be left unexplained (Lamiell, 1998). Nevertheless, this view implies that there are nomothetic psychological laws. This implication offered psychologists the possibility to position themselves in a domain where they could claim to study human behaviour and derive general laws. Thereby, psychology could see itself as a special case of the natural sciences (Mos, 1998; Salvatore & Valsiner, 2010).
The nomothetic/idiographic distinction in educational psychology
Within the educational debate, we can also find the distinction between nomothetic and idiographic research. However, the nomothetic/idiographic debate is much less dominant in educational psychology and focuses on the idiographic aspect. This focus is rooted in the context and situation specificity of educational processes. For example, scholars have advocated shifting to case studies due to the shortcomings of quantitative educational psychology (Codd, 1981). Other evaluation theorists have called for a focus on the individual to understand the educational milieu in which learning takes place (Kemmis, 1978). Others stress that educational psychology needs to heed individual ways of learning, as learning appears in specific and unique forms (Eisner, 1993). Still others point to researchers treating idiographic research as synonymous with case studies and action research while delimiting these lines of research from “traditional–quantitative–empirical” research (Altrichter, 1990, p. 136). In the broader scope, we can also find this focus on the individual in comparative education research in which the individual is a whole education system (e.g., application of the idiographic function of Hörner in school leadership research; Brauckmann-Sajkiewicz et al., 2021).
However, these educational examples only partially reflect how the distinction between nomothetic and idiographic was intended by Windelband. These examples adequately reflect that idiographic is bound by time, context, and culture. In doing so, testimony is given to the specificities of educational processes. On the other hand, all above-mentioned examples suffer from the same distortion of the nomothetic/idiographic distinction that can be found in psychological research. This distortion is that idiographic is viewed as inherently limited to studying the individual. Consequently, educational psychology draws the same line psychological research does: either study groups (nomothetic) or study individuals (idiographic).
I argue that this divide has done and will do a disservice to educational psychology. Reconsidering how the nomothetic/idiographic distinction was intended by Windelband (i.e., studying what always is vs. what once was) may be an asset for educational research. I put this argument forth as such reconsideration may prompt educational psychologists to scrutinize their own assumptions on the type of insights we aim to gain. Doing so may, in turn, keep educational psychologists from hastily aggregating to ever higher levels, while erroneously striving to thereby distil general educational principles akin to nomothetic laws. On the contrary, such reconsideration may let us take a step back from the highest level of aggregation, without having to abandon aggregation in favour of henceforth only studying individuals.
The nomothetic striving of educational psychology
I will argue in this section that educational psychology is striving to find general educational principles akin to nomothetic laws by analysing aggregates. Educational psychologists readily assume such general educational principles without scrutinizing this assumption, how aggregates are supposed to unveil these principles, or how aggregates fail to unveil these principles for the discrete empirical entities underlying them. I want to first scrutinize this assumption by referring to the PISA studies of the OECD (2000) and the synthesis of meta-analyses by Hattie (2009).
Please bear in mind that even though the OECD’s (2000) PISA and Hattie’s (2009) synthesis of meta-analyses have been highly impactful in the educational field, they are of course examples. Accordingly, they are bound by their context, situation, and zeitgeist. Tapping into the idiographic/nomothetic distinction, these examples serve as studying what once was and are limited by not being able to express what always is. Therefore, conclusions derived from these examples should not be viewed as general educational principles akin to nomothetic laws. They should be viewed as examples of highly impactful studies in the field, which address two important stakeholders in education: students and teachers.
PISA has established itself as an important player for policy reforms and is widely used for benchmarking students’ performance worldwide (Breakspear, 2012). Thus, scholars view PISA as a story of success for the betterment of educational systems worldwide (e.g., Schleicher, 2018). Having said that, scholars also hotly debate and heavily criticize PISA in its basic assumptions (Zhao, 2020), its usefulness and implications (Sjøberg, 2015), and its methods (Fernandez-Cano, 2016). It seems the only thing that proponents and opponents of PISA can agree upon is that PISA wields great influence. At the same time, the stakeholders involved in debating the pros and cons of PISA have more differences than commonalities, resulting in a debate that is rarely conducted eye to eye or beyond one’s own respective field and view (Gorur, 2017).
Setting the controversies surrounding PISA aside, taking a closer look at PISA can be informative for discerning some explicit and implicit views of the educational psychologists involved. Schleicher (2018) strived to “compare the achievements of . . . school systems with those of other countries” (p. 18). This is aimed very high and, once achieved, undoubtedly offers unprecedented possibilities for the betterment of educational systems. At the same time, this ambitious scheme rests on underlying assumptions strongly grounded in what type of knowledge researchers think can be generated. A basic assumption underlying PISA and the educational psychologists involved is that it is possible to measure the quality of educational systems by common and comparable indicators (Sjøberg, 2015). In turn, this assumption is rooted in an even more fundamental assumption on education: the existence of a set of skills and knowledge universally valuable in societies, regardless of that society’s past and future (Zhao, 2020). These are what PISA claims to assess: not what once was but what always is. Accordingly, the striving of PISA is to tease out general educational principles from data, such as distinguishing features of high-performing school systems (Schleicher, 2018). Without assuming such general educational principles akin to nomothetic laws, the striving of PISA is unattainable from the start. Furthermore, PISA evidently assumes that analysing aggregates can yield such general educational principles.
Next, let us consider the synthesis of meta-analyses by Hattie (2009). Hattie’s synthesis is undoubtedly a milestone in the effort to understand successful learning in schools (Terhart, 2011). As such, much has been said and written about and in response to Hattie’s synthesis (for a recent overview, see Cramer, 2021). Going hand in hand with being well-received worldwide, Hattie’s synthesis was and is also the subject of heavy criticism on the methodological (Bergeron & Rivard, 2017) and theoretical (Rømer, 2019) level. Not unlike PISA, scholars have criticized Hattie’s synthesis as having created a cult (Eacott, 2017).
Without having to pick a side in this controversy, taking a closer look at Hattie (2009) can again be very informative for discerning some explicit and implicit views of the educational psychologists involved. Already at the beginning of his book, Hattie (2009) attests to differences in classrooms, pointing out that today’s classrooms are “busy, multifaceted, culturally invested, and changing” (p. 4) and that “every moment of learning is different” (p. 2). Despite appreciating the fact that each classroom is different in various ways, Hattie (2009) subscribes to the notion that “effective teaching can occur similarly for all students, all ethnicities, and all subjects” (p. 239). On the one hand, this notion may testify to believing in educability. On the other hand, this notion implies that studying effective teaching can go beyond studying what once was into discerning what always is. This notion of an underlying general educational principle is a necessary precondition to take a step towards the goal of Hattie’s (2009) synthesis: “we need a barometer of what works best” (p. ix).
Following the train of thought outlined above, the underlying notion in Hattie (2009) is that all differences across teachers, learners, schools, and so forth can be dealt with by aggregating enough student data. Educational psychologists seem to agree with this notion, as criticism is predominantly directed towards Hattie’s flawed methodology and lacking statistical rigour (e.g., Bergeron & Rivard, 2017). On the other hand, theorists have opposed this underlying notion avidly (Rømer, 2019). But, as with PISA, assuming that we can distil such general educational principles given enough data is also a necessary precondition for Hattie’s synthesis. Otherwise, Hattie’s aspiration would have been unattainable from the start and aggregation would not have been his method of choice.
Taken together, both the PISA studies of the OECD (2000) and the synthesis of meta-analyses by Hattie (2009) need to assume the existence of general educational principles akin to nomothetic laws. This assumption is rarely scrutinized within the field of quantitative educational psychology. Yet, it profoundly shapes the way we view, collect, and examine data. After all, to educational psychologists holding this view, such nomothetic laws are hidden within the data. To attain these nomothetic laws, educational psychologists are ever refining their methods and measurements and are collecting more and more data. Better measurement and more aggregation seem to be the chosen path to find these nomothetic laws in education, that is, to go beyond studying what once was and to gain insights into what always is. Herein lies the nomothetic striving of educational psychology.
On nomothesis in today’s educational psychology
This nomothetic striving of educational psychology can be considered as the discipline’s efforts towards developing its self-understanding as a natural science. Such a development in educational psychology is encouraged by the influence of other psychological disciplines. For subdisciplines of educational psychology, such as comparative education research, scholars have made this development with its closeness to other psychological disciplines explicit (for an overview, see Jornitz & Wilmers, 2021). This shift in self-understanding entails two distinct characteristics: the population-centredness (Salvatore & Valsiner, 2010) and the triumph of the aggregate (Danzinger, 1990). Both are as common in educational psychology as they seem unquestioned in common publishing practice.
First, let us consider population-centredness (Salvatore & Valsiner, 2010). Along this line, scholars have suggested binding nomothetic laws by context, situation, and zeitgeist. Doing so, scholars can claim for any finding that it is nomothetic in nature. To give this appearance, researchers must simply specify the exact population where the derived law applies. Such a population may not only be limited by its characteristics, but also by its context, situation, and zeitgeist. This line of reasoning, in particular, renders the nomothetic striving absurd. Consider, for example, Brown et al. (2022) studying what characteristics and circumstances are beneficial for research-informed educational practice in educators. With population-centredness, we can easily lift this claim up to the status of a nomothetic law. The presumably nomothetic law would be that the finding always holds in exactly the studied sample of N = 147 educators at the point in time when the study was carried out in Northern England, Southern England, Midlands, and London (UK). To alleviate this limitation, researchers could go a step further and study whether this finding holds repeatedly across multiple populations. However, following this path would be obstructed by the need for the finding to hold empirically for every particular instance. Taken together, such a law would not depict a general educational principle of a phenomenon that always is, but would be a description of a phenomenon that once was—or at best a phenomenon that was several times in aggregates.
The scientific community would of course reject a nomothetic law derived in the above-described manner. However, educational psychologists would do so because of the limited contribution stemming from its severely limited generalizability (for a similar line of reasoning, see Robinson et al., 2013). We would not explicitly reject the claim that it is nomothetic in nature, although this would be the correct approach. This is akin to educational psychologists specifying in publications the limitations in generalizability of their own findings and samples, thereby seemingly alleviating the criticism that their findings may not be nomothetic. Furthermore, in common publishing practice, this limitation of generalizability is often specified only after the research was conducted. In effect, we study what once was and claim a nomothetic nature by stating that what once was is a general law for what once was and thereby describes something that always is. Such a line of reasoning does not, of course, produce general educational principles akin to nomothetic laws.
Bearing population-centredness in mind, let us move on to the triumph of the aggregate (Danzinger, 1990). After all, assuming populations that can be studied by collecting sufficiently sizeable subsets of their individuals is already grounded in the triumph of the aggregate. Danzinger (1990) describes the historical perspective of this triumph. This development was strong within educational psychology: the Journal of Educational Psychology rapidly adopted this practice across the 1910s, 1920s, and 1930s (Danzinger, 1990, p. 81). To reiterate, researchers “took to presenting their data as the attributes of collective rather than individual subjects” (Danzinger, 1990, p. 87), while not evaluating critically enough that regularities on the collective level need not represent regularities on the individual level (Lamiell, 2019). Consequently, researchers lost interest in applying insights to individuals. For educational psychology, this lack hinders applicability of findings to specific problems and individual students. This is unfortunate, as it is exactly this lack of applicability that teachers criticize about educational psychology (Hinzke et al., 2020; Joram et al., 2020), and that scholars have suggested should be increased to facilitate educators’ research-informed educational practice (Brown et al., 2022).
Besides problems and challenges of aggregation discussed by Danzinger (1990), aggregation has a fundamental epistemic implication for the debate on nomothesis. As scholars have pointed out, studying aggregates is severely limited in the striving for discerning nomothetic laws, as insights derived from aggregation may be true for the aggregate but need not apply to every single instance (Lamiell, 1998). Consequently, it would not only be necessary that this transfer of the aggregate to the individual is given, but research would need to investigate this empirically for every particular instance. To be nomothetic in nature, a law derived from the aggregate would need to apply to every single individual from which the aggregate was formed—and more. One way to try to circumvent this issue would be to specify that a law derived from the aggregate is only nomothetic for the aggregate in the situation, the context, and the zeitgeist when it was assessed. Doing so would be as fallacious as described above for population-centredness and would not yield nomothetic laws. Another way to try to circumvent this issue would be to show that a law derived from the aggregate holds repeatedly for each of many aggregates. However, doing so would also be insurmountable, as researchers would need to investigate this empirically for every particular aggregate.
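The gap between a regularity in the aggregate and a regularity in every instance can be sketched with invented numbers. The per-student effects below are made up, and the sketch assumes, purely for illustration, that each student’s individual effect could be observed directly, which real studies typically cannot do.

```python
from statistics import mean

# Hypothetical per-student change in achievement after an intervention
# (made-up values; positive = improvement).
individual_effects = [0.9, 0.7, 0.8, -0.3, 0.6, -0.2, 0.5, 0.4]

# The aggregate summarizes all students in a single number.
aggregate_effect = mean(individual_effects)

# Students for whom the intervention did not work.
exceptions = [e for e in individual_effects if e <= 0]

# The "law" holds for the aggregate ...
law_holds_for_aggregate = aggregate_effect > 0
# ... but a nomothetic reading would require it to hold in every instance.
law_holds_for_every_student = all(e > 0 for e in individual_effects)

print(law_holds_for_aggregate, law_holds_for_every_student, len(exceptions))
# True False 2
```

Here the mean effect is positive although two of the eight invented students show no improvement. The aggregate statement remains true, yet it does not describe those two students.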
While this inherent nonnomothetic nature yields a strong limitation to aggregation, it also offers a very alluring benefit: laws derived from the aggregate are easily defensible against individuals not adhering to the law. Thereby, these laws are not refutable by individual cases. The exception to the rule does not refute the rule (Lamiell, 1998). The school for which PISA recommendations do not change student achievement cannot refute Schleicher (2018). Akin to this, the student for whom feedback does not work cannot refute Hattie (2009).
Now what are the implications for educational studies where researchers analyse aggregates? On an epistemic level, the implications are that analysing aggregate statistics such as means, standard deviations, correlations, and so forth cannot yield general educational principles akin to nomothetic laws. Consequently, researchers need to acknowledge having gained an insight that was bound by context, situation, and zeitgeist, and is thus inherently idiographic in nature.
Furthermore, we cannot overcome this idiographic nature by aggregating to higher levels. No matter how much researchers strive to nullify the boundedness of an aggregate statistic of a sample by more aggregation, the results will still not be nomothetic in nature.
Moreover, applying meta-analyses cannot alleviate this limitation either. Consider first that meta-analyses aggregate aggregates, for example, by estimating an underlying effect size based on multiple studies, each analysing standardized mean differences. What researchers are thereby doing is taking a study that studied what once was, and merging it with other studies that also studied what once was. This process cannot elevate studying what once was to having studied what always is; that is, the sum of multiple idiographic findings does not yield a nomothetic finding.
This idiographic nature of meta-analyses and their underlying primary studies becomes even clearer when considering that researchers, following best practice, analyse individual studies’ heterogeneity/homogeneity in meta-analyses (Feinstein, 1995). These analyses are geared towards discerning whether individual studies differ in their results and if, consequently, researchers should heed moderators in their meta-analyses (Song et al., 2001). Accordingly, any meta-analysis discerning moderators shows that an effect varied, for example, by context. Moreover, educational psychology is affected by powerful context effects and interactions even more than other fields, as is bound to be reflected by heterogeneity within meta-analyses (cf. Berliner, 2002; Pant, 2014). Consequently, we are not even tasked yet with contemplating if accumulating sufficient idiographic findings with sufficient homogeneity may discern something nomothetic, as the idiographic studies already differ heavily.
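The heterogeneity analyses mentioned above can be sketched as follows. This is a minimal illustration, assuming an inverse-variance (fixed-effect) pooled estimate, Cochran's Q, and the I² statistic as the heterogeneity measures; the effect sizes and variances are hypothetical, and the function name `heterogeneity` is an assumption of this sketch rather than part of any meta-analysis package.

```python
# Hypothetical primary studies: (standardized mean difference, sampling variance).
# The values are illustrative assumptions, not results from real studies.
studies = [(0.10, 0.01), (0.80, 0.02), (0.35, 0.015), (-0.05, 0.01), (0.60, 0.02)]


def heterogeneity(effects):
    """Return the pooled effect, Cochran's Q, and I^2 for (effect, variance) pairs."""
    weights = [1.0 / variance for _, variance in effects]
    # Inverse-variance weighted mean: the aggregate of the aggregates.
    pooled = sum(w * d for w, (d, _) in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations of each study from the pooled effect.
    q = sum(w * (d - pooled) ** 2 for w, (d, _) in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of the total variation attributable to between-study differences.
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, i_squared


pooled_effect, q_statistic, i_squared = heterogeneity(studies)
print(f"pooled effect: {pooled_effect:.2f}")
print(f"Cochran's Q: {q_statistic:.2f} (df = {len(studies) - 1})")
print(f"I^2: {i_squared:.0%}")
```

With these invented values, most of the observed variation stems from differences between the studies rather than from sampling error. The single pooled effect therefore summarizes findings that themselves differ by study, which is the situation described above.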
This challenge of results varying by context, situation, and zeitgeist also shines a light on another tradition in educational psychology: the lack of replications. This lack has been criticized in educational psychology (Makel & Plucker, 2014; Plucker & Makel, 2021; Rost & Bienefeld, 2019; Travers et al., 2016), and unfortunately ties into psychology’s tradition of preferring novel and statistically significant findings over thoroughly researched findings (Pigott et al., 2013; Sterling, 1959). This tradition goes hand in hand with an excessive number of positive findings (91.5% in psychiatry/psychology; Fanelli, 2010) and a lack of reproducibility (Open Science Collaboration, 2015). This tendency to not replicate findings and to generate findings that do not replicate shines a critical light on the idiographic/nomothetic distinction: how can we dare to make any nomothetic postulates against this background?
On a positive note, scholars have begun to advocate Open Science and to call for more robustness within psychological research (e.g., Wentzel, 2021). This call to Open Science is also echoed within educational psychology (Gehlbach & Robinson, 2021; Krammer & Svecnik, 2020; van der Zee & Reich, 2018). Heeding this call to Open Science and to therefore conduct research more transparently and openly while undertaking more replications may lead educational psychology to at least gauge the robustness of insights into educational issues.
At this point, I want to clarify that this article should not be misunderstood as a call to abandon or to intensify quantitative educational psychology. Neither is it a call to give up on striving to find universal insights into educational issues. However, educational psychology should scrutinize itself and its assumptions on which knowledge it strives to attain. After all, today’s educational psychology is studying what once was while making strong claims about what always is.
A concrete example: Students’ and teachers’ perspectives on teaching quality
After viewing the practice of aggregating in the context of striving for nomothetic laws in educational psychology, I want to place this debate into one specific field of educational psychology—students’ and teachers’ perspectives on teaching quality. Doing so highlights how assumptions on general principles in educational psychology can shape the research process, and thereby the type of light we shed onto educational issues.
A peculiarity of educational psychology is the existence of multiple perspectives on constructs. While there are many constructs in education and educational psychology with different perspectives, I want to limit this example to the perspectives of students and teachers. Scholars can and have assessed constructs from these two perspectives. To name a few: strictness, leadership, giving responsibility, differentiation, promoting students’ independence, uncertainty, rule clarity, task difficulty, and cohesiveness (see overview of den Brok et al., 2006), dimensions of mastery goal structures (Bardach et al., 2018), school climate (Konold & Shukla, 2017), teacher effectiveness (Hill et al., 2011), classroom management (Göllner et al., 2020), and aspects of quality of teaching (Fauth et al., 2020; Krammer et al., 2019, 2021).
The fact that we can view and assess constructs from different perspectives raises profound questions for educational psychology. These questions address the nature of the constructs themselves. Do different perspectives examine the same construct, but merely from different points of view? Or do different perspectives fabricate different constructs? Or are different perspectives a testimony to distinct and different constructs?
Answers to and assumptions on the questions posed above have implications for the level at which educational psychologists aggregate their data; that is, they shape how we handle and interpret data and thus the insights we derive. Not being wary of these assumptions may implicitly dictate the level of aggregation chosen by researchers. In this example, the question is whether one phenomenon is assumed, in which case researchers aggregate different perspectives, or whether two phenomena are assumed, in which case researchers do not aggregate different perspectives.
Regarding the questions raised above, we can place answers on a continuum between (a) supposing one phenomenon and thus striving for perspective-free insights and (b) supposing multiple phenomena and letting perspectives stand in their own right. I argue that aiming for perspective-free insights is a hallmark of striving for general educational principles akin to nomothetic laws. On the other hand, letting perspectives stand in their own right respects that educational phenomena may be bound by the context.
Let us further contrast these two poles of the continuum. Supposing one phenomenon means that a single phenomenon is viewed from different perspectives. Supposing multiple phenomena puts forward that there is one phenomenon for each perspective. Supposing one phenomenon assumes that different perspectives pose a challenge for statistical modelling, and that aggregating multiple perspectives can distil the shared basis of these perspectives. Supposing multiple phenomena assumes that perspectives stand in their own right, and does not superimpose a perspective-free trait. Supposing one phenomenon can be found in the vein of educational psychology pointing to the benefit of multiple perspectives for better measurement, for example, in researchers using mixed methods to assess teacher effectiveness (Hill et al., 2011) or multitrait-multimethod modelling to assess school climate (Konold & Shukla, 2017). Supposing multiple phenomena can be found in the vein of educational psychology that ascribes distinct value to each perspective, for example, in researchers emphasizing perspective-specific validities (Fauth et al., 2014; Wettstein et al., 2017) or the contribution of different perspectives to professional development (Fraser, 1998).
Assuming that there is one phenomenon, we may be inclined to affirm this assumption no matter what the data show us. After all, a null correlation between perspectives does not have to challenge the view of one phenomenon; it may just point towards flaws in measurement. Consequently, differences in measurement and a lack of measurement precision may explain null correlations between perspectives (e.g., not heeding items’ reference frame: Fauth et al., 2020). Thus, the aim of research becomes to find the assumed correlation; thereby, the null correlation becomes merely the absence of this correlation due to measurement issues. This train of thought becomes even more salient in the framework of null hypothesis testing. Within this framework, an H0 can never be proven (p-values cannot lend support to the H0; Aczel et al., 2018; Passon & von der Twer, 2020). Consequently, a null correlation can merely imply that no correlation was found.
On the other hand, a similar risk is present when we assume two phenomena. Researchers working under this assumption may not be persuaded otherwise by a perfect correlation between perspectives. After all, a perfect correlation can exist even between unrelated phenomena (cf. spurious correlations; Calude & Longo, 2017).
Consequently, educational psychology should tread very carefully regarding assumptions on different perspectives. These assumptions need to be questioned in research, and perspectives should be aggregated to a higher level only with full intent and purpose. Observing a classroom from different perspectives is not the same as watching an apple fall from a tree from different perspectives. Instead of aggregating to perspective-free traits, perspectives on educational phenomena can stand in their own right.
Aggregation’s impact on theory-to-practice
In the following, I want to outline an additional detrimental effect that aggregation can have for educational psychology. This specific detrimental effect feeds into the theory–practice gap of educational psychology and—to the best of my knowledge—has not been readily addressed in quantitative educational psychology: is the transfer of educational findings into practice hindered by quantitative educational psychologists aggregating to ever higher levels without considering the type of insights this process generates?
First, I argue that more aggregation may not always be advantageous. To this end, let us consider that aggregation in educational psychology inherently opens up the question of which observational unit should be chosen (Hayman et al., 1979). For example, effects may be different for students, classes, and schools. Consequently, educational psychologists need to address at what level an effect should be theorized and studied. Correspondingly, the most appropriate level of aggregation is determined by the research question. Therefore, more aggregation cannot by default be the appropriate option.
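That effects may differ by observational unit can be illustrated with a small sketch. The data are constructed for illustration only, and the names (`classes`, `pearson`) as well as the meaning of the two variables (say, a hypothetical predictor and an achievement outcome per student) are assumptions of this sketch. The same data yield different correlations depending on whether students within a class, class means, or all students pooled are analysed.

```python
def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Constructed (predictor, outcome) pairs per student, grouped by class.
# Illustrative assumption: not data from any study.
classes = {
    "class_a": [(1, 5), (2, 4), (3, 3)],
    "class_b": [(4, 8), (5, 7), (6, 6)],
    "class_c": [(7, 11), (8, 10), (9, 9)],
}

# Student level, analysed separately within each class.
within_r = {name: pearson(*zip(*students)) for name, students in classes.items()}

# Class level: correlate the class means of predictor and outcome.
class_means = [
    (sum(x for x, _ in students) / len(students),
     sum(y for _, y in students) / len(students))
    for students in classes.values()
]
between_r = pearson(*zip(*class_means))

# All students pooled, ignoring class membership.
pooled = [pair for students in classes.values() for pair in students]
pooled_r = pearson(*zip(*pooled))

print("within classes:", {name: round(r, 2) for name, r in within_r.items()})
print("between class means:", round(between_r, 2))
print("pooled students:", round(pooled_r, 2))
```

In this constructed case, the relation is negative within every class but positive between class means and in the pooled data. Which of these answers is the relevant one depends on the level at which the effect is theorized, not on which level aggregates the most.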
Second, I argue that striving for higher levels of aggregation may make findings less applicable to situations at hand. For instance, Hattie’s (2009) finding on the importance of teachers’ feedback is such a broad finding that it is readily applicable to today’s educational systems in general. At the same time, this broadness comes at the cost of generating a finding that may need to be applied very differently to different settings, that is, it does justice neither to the complexity of the subject researched nor to the power of the context (Berliner, 2002). In this example, educational psychologists are confronted with the trade-off of gaining insights that are widely generalizable but less applicable to a classroom at hand, or that are less generalizable but seem more readily applicable (for a similar line of reasoning in research on personnel selection, cf. the bandwidth-fidelity dilemma; Ones & Viswesvaran, 1996).
Consider in a next step two things: how do teachers embrace educational research and how can educational psychologists view this process? Addressing the first question, studies show that when teachers were confronted with the evidence, they very often rejected the evidence (cf. the theory–practice or research–practice gap in education; Brown et al., 2022; Cramer, 2014) or met it with open resistance (Terhart, 2013). One source of this reluctance is rooted in the difficulties that arise when applying the evidence to the classroom (for an overview of the difficulties, see Neuweg, 2002, 2015). Furthermore, teachers do not feel that the evidence is transferable to individual students (Hinzke et al., 2020; Joram et al., 2020), which in turn reduces educators’ willingness to implement research-informed educational practices (Brown et al., 2022).
At this point, please bear in mind that this lack of transferability should be considered a problem for educational research. After all, transferability is a fundamental characteristic of educational research (e.g., Elliott, 1986). In other words, educational research should aim at producing scientific insights that can help improve educational systems (Cain et al., 2019; Gräsel, 2011). Against this background, the lack of transferability has long been discussed and has not shone a favourable spotlight on quantitative educational psychology (for overviews, see Brown et al., 2022; Cain et al., 2019; Wrigley, 2018).
This problem is reinforced by educational psychologists interpreting teachers’ refusal to transfer theory to practice as a testimony to practitioners’ lack of appreciation and understanding for generalizable educational findings and the underlying research. Admittedly, it can be a problem that teachers do not readily revisit their beliefs when presented with contrary-to-belief evidence (cf. prevailing subjective theories in Neuweg, 2004; or devaluation of educational research in Thomm et al., 2021). However, viewing the theory–practice gap in this light inherently makes practitioners shoulder the burden of the theory–practice gap.
By contrast, it should not only be the practitioners’ responsibility to make sense of educational psychologists’ findings; it should also be educational psychologists’ responsibility to facilitate transferring educational insights into practice. Along this line, scholars have shown that carefully framing and preparing educational insights can indeed lead teachers to trust insights from educational research more than from other sources (Schmidt et al., 2022).
Having said that, educational psychologists seem to forget that educational psychology is studying what once was in aggregates to make strong claims about what always is, even on the individual level. Against this backdrop, I urge educational psychology to be more aware of the inherent limitation this places on the derived insights and their transfer to practice (for a summary of a similar critique regarding practising clinicians and clinical psychological research, see Lamiell, 2019). Furthermore, I ask educational psychologists to acknowledge that studying ever higher levels of aggregation does not facilitate transferring theory into practice.
It also stands to reason that educational psychology could profit from listening more to teachers. Arguably, teachers have a deep understanding of educational processes (cf. deriving a workable theory of the case; Elliott, 1986). At the same time, tapping into this understanding may be hindered by teachers not being able to communicate it explicitly in the framework of educational research (for the problem of teachers not knowing what teachers can do, see Neuweg, 2002). However, this should not discourage educational psychologists from meeting teachers on an equal footing.
Taking all of this together, I urge educational psychologists to strive for the most appropriate level of aggregation. Investing time and resources into this debate may advance the field by helping to bridge the divide between theory and practice.
Final note
I want to conclude with two recommendations. I invite all educational psychologists to examine if these can be applied to their respective area of research.
Furthermore, I see no reason not to extend the derived recommendations to other areas of quantitative psychological research. Therefore, I also invite all noneducational psychologists to assess if heeding these recommendations and regarding their basis can contribute to advancing psychological research.
First, educational psychologists should question the nomothetic striving. Educational psychologists should clarify their own assumptions on the existence of general educational principles akin to nomothetic laws. These assumptions profoundly shape the way we view, collect, and examine data.
Second, educational psychologists should scrutinize aggregation. In doing so, we should continuously scrutinize which level of aggregation is the most appropriate. Defaulting to the highest levels of aggregation may advance the field less than respecting the context, the situation, and the zeitgeist.
Footnotes
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
