Abstract
The debate about false positives in psychological research has led to a demand for higher statistical power. To meet this demand, researchers need to collect data from larger samples—which is important to increase replicability, but can be costly in both time and money (i.e., remuneration of participants). Given that researchers might need to compensate for these higher costs, we hypothesized that larger sample sizes might have been accompanied by more frequent use of less costly research methods (i.e., online data collection and self-report measures). To test this idea, we analyzed social psychology studies published in 2009, 2011, 2016, and 2018. Indeed, research reported in 2016 and 2018 (vs. 2009 and 2011) had larger sample sizes and relied more on online data collection and self-report measures. Thus, over these years, research improved in its statistical power, but also changed with regard to the methods applied. Implications for social psychology as a discipline are discussed.
Nearly 10 years ago, Simmons, Nelson, and Simonsohn (2011) initiated a debate about false positives (i.e., results supporting an effect that does not exist) in psychology. As a result of this debate, many psychology journals now devote more attention to appropriate statistical power (e.g., Cumming, 2014; Funder et al., 2014; Giner-Sorolla, 2016; Vazire, 2016). This clearly is a step toward greater replicability, as high statistical power is crucial to avoiding false positives (e.g., Simmons et al., 2011). Statistical power is contingent on alpha, the effect size, and the number of observations (Cohen, 1992). Increasing the number of observations is, thus, the most straightforward way for researchers to increase statistical power, given that the alpha level is fixed by convention and the effect size is a property of the phenomenon under study.
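How strongly the required number of observations depends on the effect size can be sketched with the standard normal-approximation formula for an a priori power analysis of a two-sample t test. The effect sizes and power level below are illustrative values, not figures from this article, and the normal approximation slightly underestimates the exact noncentral-t solution.

```python
# Approximate per-group n for a two-sided, two-sample t test:
#   n per group ≈ 2 * ((z_{1 - alpha/2} + z_{power}) / d)^2
# Illustrative values only; not taken from the article.
import math
from statistics import NormalDist

def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Normal-approximation sample size per group for effect size d."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided criterion
    z_power = NormalDist().inv_cdf(power)          # desired power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

# Halving the detectable effect size roughly quadruples the required n:
print(n_per_group(0.5))   # medium effect
print(n_per_group(0.25))  # small-to-medium effect
```

The key practical point for the argument above: powering a study for a small rather than a medium effect multiplies the required sample, and thus the required resources, severalfold.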
At the same time, individual researchers obviously face limits regarding available lab space, participants, and money for remunerating participants, as well as limits on the time they can spend on recruiting and testing. Given that larger sample sizes (and more measures within a study) require more of these resources, the demand for higher statistical power might motivate researchers to alter their research strategy. Indeed, our colleagues have frequently raised these ideas in discussions with us. Moreover, changes in research strategies would not be surprising given that policy changes in organizations are known to elicit strategy changes among their members as a side effect (Oliver, 1991).
What are researchers’ options for dealing with the demand for higher statistical power? Nelson, Simmons, and Simonsohn (2012) suggested that researchers should publish less—that is, focus on fewer, high-quality articles. This might be a good solution at the collective level (e.g., for a discipline). Yet individual researchers will likely not adopt this strategy, and if they do, they will risk jeopardizing their career success as long as a large number of (high-quality) publications is an important criterion on the job market. An alternative to publishing fewer articles would be to publish articles reporting fewer studies. Considering the high rejection rates of top journals in psychology, however, this also does not seem to be a viable option for individual researchers, as articles reporting fewer studies are more likely to be rejected.
To be able to publish a large number of articles with a large number of studies and larger sample sizes, researchers could apply two strategies: (a) using less resource-intensive means of data collection—such as online data collection—and (b) using less resource-intensive measures—such as self-reports. As is true for any method, online data collection and self-reports are good for some research questions and fields but problematic for others. Whereas online data collection makes it easier to recruit nonstudent samples, which has advantages for the generalizability of findings and for fields such as cross-cultural psychology, it has clear limitations for research in other fields, such as research on social interaction. Hence, assuming that journals continue to publish articles reporting studies using methods that were adequate for the target research question, changes in method choice due to demands for higher statistical power might change research content in the long run—and might even lead to the extinction of research fields of the highest societal relevance.
Therefore, we investigated whether researchers have indeed conducted studies with higher statistical power, more online data collection, and more self-report measures in recent years. To this end, we compared articles published before and after social psychology journals implemented new requirements regarding statistical power as a consequence of the discussion about false positives between 2012 and 2015. We predicted that in articles published after 2015, compared with those published before 2012, (a) sample sizes were larger, (b) more data were collected online, and (c) more studies relied exclusively on self-reports. In addition, we explored whether the number of studies reported per article changed over time.
Disclosures
Data, materials, and online resources
The data and the scripts for the analyses reported in this article have been made available via PsychArchives (http://dx.doi.org/10.23668/psycharchives.2367).
Reporting
We report how we determined our sample size, all data exclusions, and all measures in the study.
Ethical approval
This study did not involve human participants. It relied on coding publicly available materials and was thus not subject to ethical review by an institutional review board.
Method
Sample
Our sample consisted of studies reported in research articles published in the four top empirical social psychology journals:
Initially, we planned to compare work published in 2011 and 2016. We aimed to test for effects with a small to medium effect size (
For a given journal and year, the coders recorded data on articles reporting empirical studies with human research participants up through the end of the issue in which the goal of about 100 studies was reached. For
The final sample consisted of 1,300 studies (
Measures
Sample size
The number of participants entering the main analyses of each study served as the indicator of sample size. There were severe outliers, and this variable was skewed; 5% of the studies had more than 500 participants (maximum
Online data collection
If any data were collected off-line, a study was classified as off-line; otherwise, it was classified as an online study. We chose this criterion so that studies using online assessments only before or after (off-line) lab sessions would not be classified as online. Intercoder agreement for this variable was 93.8%.
Self-report measures
The coders recorded whether or not each study had employed one of the following types of non-self-report measures (intercoder agreement is reported in parentheses): behavioral measures (85.7%), response times (96.3%), memory measures (97.1%), performance measures (93.4%), coding of written materials (96.2%), and physiological measures (99.6%). In addition, the coders were instructed to note additional measures that did not fit any of the categories on the coding sheet, but no such cases were reported. Our self-report index indicated whether or not a study relied exclusively on self-report; it was set to 0 if one or more measures from these categories had been applied and to 1 if no measures from these categories had been applied.
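The coding rule for the self-report index can be sketched as follows; the category names are illustrative placeholders, not the authors' actual coding sheet.

```python
# Self-report index: 1 if a study relied exclusively on self-report,
# 0 if it used any measure from a non-self-report category.
# Category names are illustrative, not the authors' coding sheet.
NON_SELF_REPORT = {
    "behavioral", "response_time", "memory",
    "performance", "coded_materials", "physiological",
}

def self_report_index(measures_used: set) -> int:
    """Return 1 for self-report-only studies, 0 otherwise."""
    return 0 if measures_used & NON_SELF_REPORT else 1

print(self_report_index({"self_report", "response_time"}))  # mixed study
print(self_report_index({"self_report"}))                   # self-report only
```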
Number of studies
The number of studies reported in each article was counted.
Data analysis
To test for differences between publication years, differences between journals, and differences between years contingent on the journal (i.e., the Year × Journal interaction), we computed multiple regressions in Mplus (Version 8; Muthén & Muthén, 2017) for the dependent variables sample size, online data collection, and self-report measures. To account for the interdependence of multiple studies within a given article, we followed McNeish, Stapleton, and Silverman’s (2017) recommendation for clustered data and employed the “complex” analysis type in Mplus. Year and journal were effect coded using 3 orthogonal contrasts each. Given that the incomplete 4 (year) × 4 (journal) design contained data in 12 cells, 5 degrees of freedom remained (12 – 3 – 3 – 1 = 5). Therefore, 5 additional orthogonal contrasts representing the Journal × Year interaction were entered into the analyses (see Tables 1 and 2 for lists of all contrasts, with their labels). The contrasts did not perfectly match the predictions (as one would ideally aim for), because the incomplete design put restrictions on the ways a set of orthogonal contrasts could be generated. Given that number of studies was not clustered, we computed a (standard) multiple regression using SPSS (Version 25) for this dependent variable; the same 11 contrasts were the predictors in this analysis.
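The logic of the effect-coded orthogonal contrasts can be illustrated for the year factor. Only the first contrast (pre-reform vs. post-reform years, labeled C1(Y) in the Results) is described in the text; the remaining two codes are one plausible way to complete an orthogonal set, not necessarily the authors' exact codes.

```python
# Effect-coded contrasts for the four publication years.
# C1(Y) follows the article (2009/2011 vs. 2016/2018); C2(Y) and
# C3(Y) are hypothetical completions of an orthogonal set.
years = [2009, 2011, 2016, 2018]
contrasts = {
    "C1(Y)": [-1, -1, 1, 1],  # pre- vs. post-reform years
    "C2(Y)": [-1, 1, 0, 0],   # 2009 vs. 2011 (hypothetical)
    "C3(Y)": [0, 0, -1, 1],   # 2016 vs. 2018 (hypothetical)
}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

codes = list(contrasts.values())
# Effect codes sum to zero; orthogonality means pairwise dot products of 0.
assert all(sum(c) == 0 for c in codes)
assert all(dot(codes[i], codes[j]) == 0
           for i in range(3) for j in range(i + 1, 3))
print("contrast set is orthogonal")
```

With k = 4 levels, any such set has k − 1 = 3 contrasts, which is where the 3 degrees of freedom per factor in the text come from.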
Table 1. Orthogonal Contrasts for the Main Effects
Table 2. Orthogonal Contrasts for the Journal × Year Interaction
Results
Test of predictions
We predicted that the number of participants per study was higher in research published in 2016 and 2018 than in research published in 2009 and 2011. Indeed, sample sizes were larger in studies published in 2016 (
Mean Sample Size, Mean Percentages of Studies Using Online Data Collection and Only Self-Report Measures, and Mean Number of Studies per Article, by Journal and Publication Year
Note: Standard deviations are given in parentheses.
The next prediction was that in 2016 and 2018 (vs. 2011 and 2009), data were more frequently collected online. As predicted, the percentage of published studies that relied on online data collection was larger in 2016 (43.9%) and 2018 (49.8%) than in 2011 (11.4%) and 2009 (6.0%), C1(Y):
Finally, we expected that studies published in 2016 and 2018, compared with those published in 2009 and 2011, more often relied exclusively on self-report measures. As predicted, the percentage of published studies using only self-report measures was higher in 2016 (58.5%) and 2018 (68.0%) than in 2011 (38.8%) and 2009 (46.1%), C1(Y):
Exploratory analyses
To test whether larger sample sizes were accompanied by more online data collection and more use of self-reports, as we suggested in the introduction, we computed the correlations between sample size and each of the other two variables. Studies with larger sample sizes were indeed more often conducted online (
As we mentioned earlier, one way to help compensate for higher costs (due to higher statistical power) would be to reduce the number of studies per article. Therefore, we tested whether the number of studies per article differed between journals and years. We found that the number of studies per article was larger, not smaller, in 2016 (
Discussion
Our results suggest that the demand for higher statistical power has—like most policy changes (e.g., Oliver, 1991)—evoked strategic responses among researchers. In 2016 and 2018, when sample sizes larger than those in 2009 and 2011 were required, researchers used less costly means of data collection (namely, more online studies) and less effortful measures. In addition, studies with larger samples were more likely to be conducted online and more likely to use self-report measures only. Even though the current study provides only correlational evidence at the behavioral rather than the psychological level, our results suggest that researchers behaved in line with an individual-level cost-benefit analysis (even though they might not have explicitly conducted such an analysis). Studies in social psychology changed in line with the call for higher statistical power, but research methods changed as well. This development could have been anticipated from psychological theorizing. Therefore, we suggest that when policy changes in science and especially in psychology are to be implemented, psychological theorizing on the potential consequences of these changes should be considered (see also Fetterman & Sassenberg, 2015).
However, implicit cost-benefit analyses might not be the (only) reasons for the patterns we observed. An alternative explanation of our findings that cannot be ruled out with our data is that researchers used behavioral measures less over time because failed replications reduced trust in such measures. In addition, the increased use of online data collection after 2011 might have been driven by the greater availability and acceptance of this method among researchers. These and other alternative explanations should be addressed in further research, for instance, by studying researchers’ decision making rather than by using archival data.
The archival data we used, however, allowed us to uncover an interesting pattern.
The differences between
What are the implications of the trends summarized here for social psychology as a field? The good news is that social psychology has learned its lessons from the debate about false positives. At least regarding sample sizes, social psychology is moving in the desired direction. The change in methods of data collection is, however, not an unequivocally positive development. Both the increased reliance on online data collection and the more frequent use of self-reports are adequate for addressing some, but not all, research questions. The 2000s were the Decade of Behavior in psychological science and beyond (Fowler, Seligman, & Koocher, 1999), as it was acknowledged (a) that behaviors such as choices or performance are important for many research questions, and have particular societal relevance, and (b) that relying exclusively on self-reports is problematic because of substantial differences between actual and self-reported behavior (Baumeister, Vohs, & Funder, 2007). The current results suggest that what was a mission in the past decade no longer guides researchers’ choice of measures in this decade.
Assuming that journals continue to publish only articles with a high match between the research question and research methods, the trend toward self-reports and online data collection should lead to a change in research questions (see also Vazire, 2018). Whereas some areas might benefit, those that require more labor-intensive research methods might die out. Just as small-group research disappeared almost completely over the years, (partly) because of its resource-intensive nature (Levine & Moreland, 1990), other fields of research may be eliminated if the demand for high statistical power is compensated for only by researchers’ choice of research methods and not by other means, such as resources provided by funding agencies, lower numbers of publications, or lower numbers of studies per publication.
Supplemental Material
Supplemental material, SassenbergRevOpenPracticesDisclosure, for “Research in Social Psychology Changed Between 2011 and 2016: Larger Sample Sizes, More Self-Report Measures, and More Online Studies” by Kai Sassenberg and Lara Ditrich, Advances in Methods and Practices in Psychological Science.
Footnotes
Action Editor
Alexa Tullett served as action editor for this article.
Author Contributions
K. Sassenberg developed the idea for this article, was responsible for the data collection, and had the lead role in analyzing the data and writing the manuscript. L. Ditrich contributed to analyzing the data and writing the manuscript.
Declaration of Conflicting Interests
The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.
Open Practices
Open Materials: not applicable
Preregistration: no
All data have been made publicly available via PsychArchives and can be accessed at http://dx.doi.org/10.23668/psycharchives.2367. The complete Open Practices Disclosure for this article can be found at http://journals.sagepub.com/doi/suppl/10.1177/2515245919838781. This article has received the badge for Open Data.