Abstract
Journalists are often maligned for covering sensational or desirable research results at the expense of studies with stronger methods. In the present study, we aimed to test how journalists’ preferences shift when studies are selected based on their methods rather than results (results-blind selection). Practicing journalists and editors, journalism faculty, and journalism graduate students (N = 413) read summaries of real social-psychology studies and rated their interest in reporting on them. Participants were randomly assigned to read either “traditional” summaries that included the results or “results-blind” summaries that excluded the results. Summaries varied on three within-subjects dimensions: replication status, preregistration status, and belief consistency. Participants expressed more interest in replicable (vs. not replicable) and preregistered (vs. nonpreregistered) studies regardless of whether they learned the results, suggesting that these studies have features that are valued by journalists. Meanwhile, results-blind selection showed potential for reducing confirmation bias, suggesting it may be worth further exploration if feasibility challenges can be addressed.
In October 2024, the Pew Research Center asked Americans what they thought about scientists and their role in policymaking (Tyson & Kennedy, 2024). Respondents largely thought of scientists as “intelligent” (89%) and “focused on solving real world problems” (65%; Tyson & Kennedy, 2024). Fewer than half, however, said that scientists were “good communicators” (45%), suggesting that the public thinks scientific communication has room for improvement (Tyson & Kennedy, 2024). Typically, the public hears about scientific developments through the media. Scholars have documented an increasingly tight relationship between science and the media such that the latter plays a major role in disseminating, popularizing, and legitimizing scientific research (Väliverronen, 2021; Weingart, 1998). Perhaps, then, changes to the media’s coverage could provide one pathway to regaining public trust in science (Scheufele, 2014).
Some strategies for increasing public trust in science rely on rhetoric or persuasion (Dahlstrom, 2014; Goodwin & Dahlstrom, 2014). However, a more direct way to tackle the root of this problem is to address valid reasons for mistrust, such as bias (Scheufele & Krause, 2019). If, for example, journalists are more likely to report on sensationalistic findings over those supported by strong evidence, this could undermine the scientific record’s value as a source of public guidance. Furthermore, if journalists are more likely to report findings that reinforce their own views on politicized scientific issues, this could deepen public distrust in science.
In this project, we test a possible tool for reducing biased reporting of psychology research: Journalists could select research based on methods rather than results (results-blind selection). The idea that results-blind selection could reduce bias and improve the quality of featured research is the foundation for the registered reports academic-article format, first introduced at Cortex in 2013 (Chambers, 2013). With registered reports, editors and reviewers evaluate scientific projects based on their methods before the results are known (Chambers, 2019). Results-blind selection is promising because it prompts one to consider not whether one likes a finding but whether one thinks the scientific process of arriving at that finding was sound.
A similar approach, then, might improve the trustworthiness of the psychological findings that are communicated to the public. If journalists make reporting decisions while blinded to studies’ results, they may place greater emphasis on methodological rigor and less emphasis on whether the results tell an appealing story. In this project, we examined this possibility by asking the following: Does journalists’ use of results-blind selection improve the trustworthiness of reported psychology research?
Results-Blind Selection and Research Quality
Journalists share a set of professional norms and routines that govern story selection, sourcing, information gathering, writing, and editing (Singer, 2007; Tandoc & Duffy, 2019). However, market pressures often push journalists to report on research that is novel and surprising (Fitzpatrick, 1999; Galtung & Ruge, 1965; Munger, 2020; Myers, 1996; Siravuri & Alhoori, 2017). Ruhrmann’s (1989, 1997) work noted the importance of unexpectedness, controversy, and novelty and expressed concern that a focus on spectacular discoveries could cause distortion of the public’s understanding of science.
These journalistic incentives might not be concerning if all the novel and unexpected findings were trustworthy. However, in psychology, efforts to replicate published findings have often yielded disappointing success rates (Camerer et al., 2018; Ebersole et al., 2016, 2020; Klein et al., 2018; Open Science Collaboration, 2012). Many psychological scientists express concern about the rate of false positives in the literature; 95% of one sample said that it is “somewhat” or “much” higher than it should be in psychology (Miranda et al., 2022).
It is possible that the research seen as most newsworthy also tends to be high quality. Journalists may be skeptical of findings that seem implausible, examining the methods with greater scrutiny rather than taking shocking findings at face value. If the most eye-catching reports have outsize appeal for journalists, however, perceived newsworthiness could be negatively related to quality. Consistent with this possibility, previous work has shown that risky or counterintuitive hypotheses tend to have lower statistical power (Fraley & Vazire, 2014) and be less likely to replicate (Camerer et al., 2016; Dreber et al., 2015). More broadly, it is possible that highly surprising findings tend to rest on shakier methodological foundations (Chambers, 2019).
Recent work examined how journalists evaluated fictitious scientific findings: When all else was equal, journalists favored larger samples and were not swayed by the prestige of authors’ institutions (Bottesini et al., 2023). However, to our knowledge, no study has examined journalists’ interest in reporting on real psychological findings or whether that interest changes when they are blinded to the results.
Results-Blind Selection and Confirmation Bias
Although journalists are encouraged to report on unexpected discoveries, they also have incentives to affirm the political ideology of their newsrooms. Many popular media outlets are associated with a detectable political slant (Gentzkow & Shapiro, 2005; Stroud, 2011). For instance, laypeople, academics, and think tanks tend to agree that The New York Times leans left and that The Wall Street Journal leans right (AllSides, 2021; Groseclose & Milyo, 2005). A news organization’s partisan slant is reflected in both the stories it chooses to cover and the ways in which it covers them (Hassell et al., 2022). Although there are some strengths to this variability—the fact that diverse perspectives are represented, for instance, or that individual outlets are not held to a misleading standard of “balance” (Boykoff & Boykoff, 2004; Brüggemann & Engesser, 2017)—journalists may end up selectively covering the findings that align with their own beliefs and ignoring those that do not (Carvalho, 2007; Elsasser & Dunlap, 2013).
If journalists exhibit a confirmation bias—a tendency to preferentially report findings that support their preexisting beliefs—this would be a legitimate cause for distrust (Kunda, 1990; Nickerson, 1998). Indeed, conservatives tend to be less trusting of the media and scientists compared with liberals, in part because of perceptions of bias (Brenan, 2024; Hmielowski et al., 2014; Tyson & Kennedy, 2024; Weingart & Guenther, 2016). If media outlets ignore compelling data that tell an unpopular story, long-term exposure could cultivate false—but ostensibly “evidence-based”—beliefs among broad segments of the public (Potter, 2014; Scheufele & Krause, 2019). In the current project, we examined whether such a bias exists and whether it can be reduced by results-blind selection.
Overview
We asked journalists about their interest in reporting on real psychology studies that varied along three key dimensions: (a) whether the findings had been successfully replicated, (b) whether the studies were preregistered, and (c) whether the findings were consistent with journalists’ preexisting attitudes. Half of the journalists were blind to the results, and the other half were not. We chose to use real psychology studies, as opposed to fabricated studies, because we were interested in how journalists perceive the actual psychology literature. By taking this approach, our design gets closer to telling us what might happen if journalists were to adopt results-blind selection. For example, would we see a decrease in coverage of replicable studies or studies that have been preregistered? This would not be possible with an experiment that manipulates specific features via fabricated study descriptions—as valuable as such a contribution would be—because it would remain unknown how manipulated features are combined in real research.
Of course, natural experiments also have notable limitations. Using real studies introduces the possibility that our independent variables could be confounded with other study features. Our study is motivated by the suspicion that these confounds are naturally present in the literature and could result in different patterns of evaluation depending on whether the results are known. It seems useful to document these patterns—which have immediate real-world implications—even if the current project cannot yet elucidate the specific features driving journalists’ decisions.
To make the rationale for this investigation more concrete, consider the possibility that replicable studies tend to have more rigorous methods and less surprising findings than studies that fail to replicate. If journalists evaluate studies based on methods alone, they may be impressed by the rigor and show a preference for replicable over unreplicable studies. In contrast, if journalists are exposed to the results, the surprising results may sway their preference in favor of unreplicable studies. In sum, our hypotheses reflect the assumption that replicability and preregistration tend to be confounded with factors that make methods more impressive and results less impressive.
Based on this rationale, we aimed to answer three research questions:
Research Question 1: Do journalists express more interest in reporting on replicable research when they select studies in a results-blind (vs. traditional) fashion?
Research Question 2: Do journalists express more interest in reporting on preregistered studies when they select studies in a results-blind (vs. traditional) fashion?
Research Question 3: Do journalists exhibit less confirmation bias when they select studies in a results-blind (vs. traditional) fashion?
For Research Questions 1 and 2, we anticipated that the most “newsworthy” results would be associated with shakier methods. Thus, we hypothesized that unreplicable and nonpreregistered findings would have an advantage in the traditional condition compared with the results-blind condition. For Research Question 3, we expected that journalists would show a greater tendency to select belief-consistent studies in the traditional condition compared with the results-blind condition.
Disclosures
All aspects of this project are available on the main OSF page (https://doi.org/10.17605/OSF.IO/W9JG5), which links to our preregistration, ethical-approval forms, study materials, data, analysis scripts, and supplemental materials containing additional details about the power analysis, participant demographics, and alternative analyses.
Preregistration
The hypotheses, methods, and analysis plan were preregistered before data collection (https://osf.io/jurgv/overview).
Reporting
We report how we determined our sample size, all data exclusions, all manipulations, and all measures in the study.
Ethical approval
This research received approval from a local ethics board (Protocol No. 23-01-6253) and meets the ethical guidelines set forth in the World Medical Association Declaration of Helsinki.
Method
Design
Participants read a series of social-psychology-study summaries and rated their interest in reporting on them. Summary type (results-blind vs. traditional) was manipulated as a between-subjects variable. Replication status (replicated vs. did not replicate), preregistration status (preregistered vs. not preregistered), and belief alignment (belief-consistent vs. belief-inconsistent) were manipulated as within-subjects predictor variables.
Materials
We selected three sets of psychology studies—one for each research question—and created a results-blind and traditional summary of each one (summaries: N = 162). These were designed to provide information similar to that included in a press release. Results-blind summaries included four sections: the primary research question, a summary of the methods, a brief list of open-science practices (if any), and a statement about the work’s implications. Traditional summaries were identical to the results-blind summaries except they included a fifth section describing the results (Fig. 1). Participants in the results-blind condition were asked to evaluate the study based on the way it was designed and conducted. This was done to reduce confusion about the lack of results, but this may also have increased the salience of the methods.

Fig. 1. Example study summary from Set 2. Text in boxes explains how summaries were selected to address our research questions.
Summary Set 1: replication status
To address Research Question 1, we created results-blind and traditional summaries for each of the 21 findings that were targets of replication attempts in the Social Sciences Replication Project (SSRP; Camerer et al., 2018). Of the 21 findings, 13 replicated (e.g., people saw more positive emotion in the images of the bodies of winners vs. losers; Aviezer et al., 2012), and eight did not replicate (e.g., people rated job candidates as more suitable when they were holding a heavy clipboard compared with a lighter one; Ackerman et al., 2010). Studies in this set allowed us to examine how results-blind selection might influence journalists’ interest in reporting on replicable research (note that the summaries did not state the replication outcome). Because these studies vary on many dimensions aside from replication status, we can draw conclusions about journalistic preference for the kinds of psychology studies that do (vs. do not) replicate without being able to isolate the features that factor most heavily in participants’ decisions. One advantage of relying on SSRP is that all studies were published in Nature or Science, two primary sources relied on by science journalists (Franzen, 2012).
Summary Set 2: preregistration status
To address Research Question 2, we created results-blind and traditional summaries for 18 registered reports (e.g., people in a community-service-learning program did not show boosts in well-being compared with people on a waitlist; Whillans et al., 2017) and 18 journal-based comparison articles (e.g., person-university fit was linked to a higher sense of belonging, which was, in turn, related to a more positive university experience; Suhlmann et al., 2018) in Soderberg et al.’s (2021) investigation. For registered reports, summaries mentioned preregistration in the “Open Science” section. Registered reports and comparison articles were matched on several factors by Soderberg et al.; they addressed the same topic and often had the same first author and were published in the same journal. Nevertheless, like the summaries in Set 1, they still vary on dimensions besides preregistration status. In fact, many of these dimensions were identified in Soderberg et al.’s work, which showed that registered reports were evaluated as having more rigorous methods, posing higher quality questions, and having more to teach than comparison studies. Studies in this set allowed us to examine how results-blind selection might influence journalists’ interest in reporting on preregistered research.
Summary Set 3: belief alignment
To address Research Question 3, we created results-blind and traditional summaries for 24 findings—eight for each of three topics. We selected topics that have a large body of both confirmatory and disconfirmatory results in the literature. The first eight summaries examined whether people are more likely to shoot unarmed Black versus White targets when acting as a police officer in a simulated environment (“shooter bias”; selected from meta-analyses by Cesario & Carrillo, 2024, and Mekawi & Bresin, 2015). Another eight examined whether teenage girls do worse on math, science, and spatial-skills tests when negative stereotypes about girls’ math performance are activated (“stereotype threat”; selected from the meta-analysis by Flore & Wicherts, 2015). The last eight examined whether women have stronger preferences for stereotypical masculinity in potential mates during the fertile versus nonfertile periods of their menstrual cycles (selected from the meta-analysis by Wood et al., 2014). For each set of eight, we selected four studies that provided evidence for the phenomenon and another four that provided evidence challenging it. The studies in the “evidence for” versus “evidence against” categories were closely matched in that they satisfied the same meta-analytic inclusion criteria. Studies in this set allowed us to examine how results-blind selection might reduce journalists’ confirmation bias when choosing what to report.
Participants
Practicing journalists and journalism graduate students were invited via personal email to complete our Qualtrics study and were paid $20, commensurate with the average hourly wage for journalists (Tice, 2019). Email addresses of eligible participants were obtained from university journalism-program and media-outlet websites. A power analysis indicated that 400 participants would provide 92% power to detect an interaction effect of f = .06 for Research Question 1 (ANOVA_power, n.d.; for details, see the Supplemental Material available online). We anticipated that this would be the analysis requiring the most power and thus that other analyses would be powered at 92% or higher.
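The power analysis itself was conducted with the ANOVA_power simulation tool cited above; the sketch below is not that code but a rough Python illustration of the same simulation logic for the 2 (between) × 2 (within) interaction. The cell standard deviation, repeated-measures correlation, and interaction size are hypothetical placeholders, and the test exploits the equivalence between the mixed-design interaction F and an independent-samples t test on within-subject difference scores.

```python
# Rough, hypothetical simulation of power for the 2 (between) x 2 (within)
# interaction. The SD, repeated-measures correlation, and interaction size
# below are illustrative assumptions, not the values from the published
# ANOVA_power analysis.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2024)

def interaction_power(n_per_group=200, n_sims=2000, alpha=.05,
                      sd=1.5, rho=.5, interaction=0.18):
    # In a 2x2 mixed design, the interaction F equals the squared t from an
    # independent-samples t test on within-subject difference scores, so we
    # simulate those difference scores directly.
    sd_diff = np.sqrt(2 * sd**2 * (1 - rho))  # SD of a difference of correlated scores
    hits = 0
    for _ in range(n_sims):
        diff_a = rng.normal(+interaction / 2, sd_diff, n_per_group)  # e.g., traditional
        diff_b = rng.normal(-interaction / 2, sd_diff, n_per_group)  # e.g., results-blind
        _, p = stats.ttest_ind(diff_a, diff_b)
        hits += p < alpha
    return hits / n_sims

print(f"Estimated power: {interaction_power():.2f}")
```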
Four hundred sixty-eight participants began our study. Of those, five were excluded for completing the study in fewer than 4 min, and 50 were excluded for not completing the dependent variables. This left 413 participants whom we included in analyses (see Deviations from Preregistration below). Another six participants indicated neutral baseline attitudes for all three topics in Summary Set 3, preventing us from computing belief-consistent and belief-inconsistent scores. These participants were excluded from analyses for Research Question 3.
Procedure
Participants began by indicating their baseline belief in each of the three phenomena addressed in the Set 3 summaries (shooter bias, stereotype threat, and menstrual-cycle effects) using a 5-point scale (1 = almost certainly false, 3 = could go either way, 5 = almost certainly true). Half of the participants were randomly assigned to read only results-blind summaries; the other half read only traditional summaries. All participants read two summaries randomly selected from Set 1 with the constraint that one replicated and the other did not. Participants were not told the replication status of the studies. Participants also read two summaries randomly selected from Set 2 with the constraint that one was preregistered and the other was not. Finally, participants read two summaries from Set 3 with the constraint that one provided evidence for a phenomenon (e.g., demonstrating shooter bias) and the other provided evidence challenging the same phenomenon (e.g., failing to demonstrate shooter bias). The order of the six summaries was randomized.
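To make these selection constraints concrete, here is a minimal Python sketch of one way the assignment logic could be implemented; the stimulus labels are hypothetical, and the actual Qualtrics randomization may have worked differently.

```python
# Illustrative sketch of the summary-selection logic; stimulus labels and the
# exact randomization mechanics are hypothetical stand-ins for the Qualtrics setup.
import random

set1 = {"replicated": [f"S1-rep-{i}" for i in range(13)],
        "did_not_replicate": [f"S1-norep-{i}" for i in range(8)]}
set2 = {"preregistered": [f"S2-prereg-{i}" for i in range(18)],
        "not_preregistered": [f"S2-nonprereg-{i}" for i in range(18)]}
set3 = {topic: {"for": [f"S3-{topic}-for-{i}" for i in range(4)],
                "against": [f"S3-{topic}-against-{i}" for i in range(4)]}
        for topic in ("shooter_bias", "stereotype_threat", "cycle_effects")}

def assign_participant():
    condition = random.choice(["results-blind", "traditional"])
    topic = random.choice(list(set3))          # both Set 3 summaries address the same phenomenon
    summaries = [
        random.choice(set1["replicated"]),     # one finding that replicated
        random.choice(set1["did_not_replicate"]),
        random.choice(set2["preregistered"]),  # one preregistered study
        random.choice(set2["not_preregistered"]),
        random.choice(set3[topic]["for"]),     # evidence for the phenomenon
        random.choice(set3[topic]["against"]), # evidence challenging it
    ]
    random.shuffle(summaries)                  # six summaries presented in random order
    return condition, summaries

print(assign_participant())
```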
To assess interest in reporting, participants used a 7-point scale (1 = strongly disagree, 7 = strongly agree) to respond to three questions (“I would be interested in reporting on this research,” “I think readers would be interested in hearing about this study,” and “I think this research conveys an important message”).
Exploratory measures
For each summary, participants used a 7-point scale (1 = strongly disagree, 7 = strongly agree) to indicate skepticism (“I am skeptical of this research”). At the end of the study, participants answered questions about the potential value and feasibility of results-blind reporting. They also answered questions about what they look for in research they want to report on and what they think their audience wants to read about. We included two attention checks, one at the beginning of the study that directed people to check specific boxes and one at the end that asked people explicitly whether they took the time they needed with their responses. Finally, participants indicated their professional role, media-outlet type (if applicable), frequency of reporting on scientific research, ethnicity, race, gender, education level, and political orientation (for a breakdown of participant demographics, see the Supplemental Material).
Deviations from preregistration
First, our preregistration said that we would exclude participants who indicated that they were not a practicing journalist, journalism graduate student, faculty member in a journalism department, or editor. Because response rates to our emails were low (8%) and our recruitment method was targeted to eligible participants, we dropped this exclusion criterion after data access but before results were known. We conducted all analyses using our preregistered exclusion criterion and found no change in the significance of the main effects or interactions for our three research questions (see the Supplemental Material). This deviation could have a small impact on readers’ interpretations because the alternative analyses yield slightly different effect sizes.
Second, we preregistered a stopping rule of 400 participants but ended up with 413 participants who had usable data. This discrepancy reflects the fact that we could not continuously monitor all exclusion criteria and thus did not stop data collection at the exact point we reached our goal N. This deviation occurred after data access but before results were known. This deviation could have a small impact on readers’ interpretations because changes in sample affect effect sizes. Because this is not a data-dependent deviation, it is unlikely to increase the risk of bias (Hardwicke & Wagenmakers, 2023; Willroth & Atherton, 2024).
Unregistered steps
Our preregistration did not say that we would exclude participants who failed to complete the dependent variables. Because these variables are necessary for our preregistered analyses, we added this exclusion criterion after data access but before results were known. This deviation has little impact on readers’ interpretations because there is no clear alternative analysis.
Results
Descriptive statistics and psychometric properties
To evaluate the reliability of the three items used to measure interest in reporting, we calculated Cronbach’s alphas for each of the six summaries viewed by participants. These ranged from α = .885 to α = .917, indicating high internal consistency. To test for acquiescence bias, we computed correlations between interest in reporting and the exploratory skepticism item for each of the six summaries viewed by participants. These ranged from r = −.40 to r = −.51, suggesting that skepticism was negatively related to interest in reporting and that it was unlikely that participants were agreeing indiscriminately to all items. For means, standard deviations, and ranges for the baseline belief and interest-in-reporting ratings, see Table S2 in the Supplemental Material.
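For readers unfamiliar with the computation, a minimal Python sketch of Cronbach’s alpha for the three interest items is shown below; the column names and toy ratings are hypothetical.

```python
# Minimal sketch: Cronbach's alpha for the three interest-in-reporting items.
# Column names and the toy data are hypothetical placeholders.
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

ratings = pd.DataFrame({
    "interested_in_reporting": [6, 5, 7, 3, 4],
    "readers_interested":      [6, 4, 7, 3, 5],
    "important_message":       [5, 5, 6, 2, 4],
})
print(round(cronbach_alpha(ratings), 3))
```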
Research Question 1
To determine whether journalists express more interest in reporting on replicable research when they select studies in a results-blind (vs. traditional) fashion, we used responses to Set 1 summaries to conduct a 2 (summary type: results-blind vs. traditional) × 2 (replication status: replicated vs. did not replicate) mixed analysis of variance (ANOVA) with interest in reporting as the dependent variable. This analysis revealed a significant main effect of replication status, F(1, 411) = 54.89, p < .001, ηp² = .12, such that participants expressed more interest in studies that replicated (M = 4.39, SD = 1.53) than those that did not (M = 3.77, SD = 1.66). The main effect of summary type was not significant, F(1, 411) = 3.86, p = .05, ηp² = .01; participants expressed similar levels of interest when they read traditional summaries (M = 4.21, SD = 1.49) and results-blind summaries (M = 3.95, SD = 1.68). The interaction between replication status and summary type was not significant, F(1, 411) = .004, p = .95, ηp² = .00 (Fig. 2a).

Fig. 2. Interest in reporting as a function of summary type (results-blind vs. traditional) and (a) replication status (replicated vs. did not replicate), (b) preregistration status (preregistered vs. not preregistered), and (c) belief alignment (belief consistent vs. belief inconsistent). Error bars depict standard errors.
These findings do not provide support for the idea that results-blind (vs. traditional) selection leads to greater favoring of replicable research. Overall, we observed a medium-sized main effect (ηp² = .12; a 0.62 difference on a 7-point scale) showing that journalists were more interested in studies that replicated than those that did not despite not knowing the replication status of each study. This suggests that there may be features of replicable studies that appeal to journalists regardless of whether they learn the results.
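The authors’ analysis scripts are available on the OSF page; as a minimal, self-contained illustration of this type of 2 × 2 mixed ANOVA, the Python sketch below uses the pingouin package on simulated data. All variable names and the toy effect size are hypothetical.

```python
# Minimal illustration of a 2 (between) x 2 (within) mixed ANOVA using pingouin.
# The data are simulated and the effect size is arbitrary; see the OSF analysis
# scripts for the authors' actual code.
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(1)
rows = []
for cond in ("results-blind", "traditional"):
    for pid in range(40):  # toy sample size per between-subjects condition
        for status, bump in (("replicated", 0.6), ("did_not_replicate", 0.0)):
            rows.append({
                "participant": f"{cond}-{pid}",
                "summary_type": cond,
                "replication": status,
                "interest": rng.normal(3.8 + bump, 1.5),  # mean of the three interest items
            })
df = pd.DataFrame(rows)

aov = pg.mixed_anova(data=df, dv="interest", within="replication",
                     between="summary_type", subject="participant")
print(aov[["Source", "F", "p-unc", "np2"]])
```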
Research Question 2
To determine whether journalists express more interest in reporting on preregistered research when they select studies in a results-blind (vs. traditional) fashion, we used responses to Set 2 summaries to conduct a 2 (summary type: results-blind vs. traditional) × 2 (preregistration status: preregistered vs. not preregistered) mixed ANOVA with interest in reporting as the dependent variable. This analysis revealed a significant main effect of preregistration status, F(1, 411) = 7.42, p = .007, ηp² = .02, such that participants expressed more interest in preregistered studies (M = 4.82, SD = 1.42) than those that were not preregistered (M = 4.57, SD = 1.50). The main effect of summary type was not significant, F(1, 411) = 0.05, p = .82, ηp² = .00; participants expressed similar levels of interest when they read traditional summaries (M = 4.68, SD = 1.39) and results-blind summaries (M = 4.71, SD = 1.53). The interaction between preregistration status and summary type was not significant, F(1, 411) = .01, p = .91, ηp² = .00 (Fig. 2b).
These findings do not provide support for the idea that results-blind (vs. traditional) selection leads to greater favoring of preregistered research. Overall, we observed a small main effect (ηp² = .02; a 0.25 difference on a 7-point scale) showing that journalists were more interested in preregistered versus nonpreregistered research. Journalists, then, were responsive to the use of this open-science practice regardless of whether they learned the results.
Research Question 3
To determine whether journalists exhibit less confirmation bias when they select studies in a results-blind (vs. traditional) fashion, we used responses to Set 3 summaries to conduct a 2 (summary type: results-blind vs. traditional) × 2 (belief alignment: belief-consistent vs. belief-inconsistent) mixed ANOVA with interest in reporting as the dependent variable. This analysis did not reveal a significant main effect of belief alignment, F(1, 405) = .22, p = .64, ηp² = .00; participants expressed similar interest in reporting on belief-consistent (M = 4.97, SD = 1.52) and belief-inconsistent (M = 4.94, SD = 1.47) studies. The main effect of summary type was not significant, F(1, 405) = 1.32, p = .25, ηp² = .00; participants expressed similar levels of interest when they read traditional summaries (M = 5.32, SD = 1.39) and results-blind summaries (M = 4.88, SD = 1.58). The interaction between belief alignment and summary type was significant, F(1, 405) = 4.66, p = .03, ηp² = .01, such that the preference for belief-consistent (vs. belief-inconsistent) findings was larger in the traditional condition (difference: M = .18, SE = .10, p = .06) than in the results-blind condition (difference: M = −.11, SE = .09, p = .23; Fig. 2c).
These findings do provide support for the idea that results-blind (vs. traditional) selection leads to a reduction in confirmation bias. Put another way, journalists showed a reduced preference for studies confirming their preexisting beliefs when they evaluated studies without the results. Although the interaction was significant, the effect was small (ηp² = .01), and the simple effects were not significant; thus, it would be inaccurate to say that journalists were biased toward attitude-consistent studies in the traditional condition.
Discussion
Overall, the results tentatively supported one of our three hypotheses. We did not observe that results-blind (vs. traditional) selection led to greater favoring of replicable research or preregistered research. We did, however, find that participants’ preference for attitude-consistent findings was reduced in the results-blind condition compared with the traditional condition.
For Research Question 1, we saw that regardless of condition, journalists expressed more interest in reporting on studies that replicated compared with those that did not. This is particularly notable given that participants were not told the replication status of the studies and therefore were not simply favoring studies with the “replicated” stamp of approval. Presumably, then, journalists were picking up on features associated with the likelihood of successful replication, much the way that researchers have done in prediction-market studies (Gordon et al., 2021). Although we cannot isolate the features that journalists considered most heavily, one possibility is that they showed a preference for studies that had open data or materials because this was more common among studies that replicated (6/13) than those that did not replicate (2/8). Another possibility is that journalists were skeptical about topics that seemed implausible (e.g., social priming), contradicting the notion that the most counterintuitive ideas would be seen as most newsworthy. Consistent with this account, exploratory analyses revealed that journalists were more skeptical of studies that did not replicate compared with those that did replicate (see the Supplemental Material).
For Research Question 2, participants showed a preference for reporting on studies that were preregistered (vs. not) regardless of whether they knew the results. The use of this open-science practice, then, may give studies an advantage when it is explicitly communicated to journalists. On the other hand, it may be features linked with but separate from preregistration that are driving the main effect we observed. Past work (e.g., Soderberg et al., 2021) has demonstrated that preregistered studies tend to have characteristics—such as rigorous methods and strong research questions—that could make them more appealing than their nonpreregistered counterparts. More specifically, one feature that could have affected our results is sample size; across our summaries, preregistered studies had a median sample size of 280, compared with 205 for nonpreregistered studies. The idea that journalists notice and prefer larger samples has precedent in previous work (Bottesini et al., 2023). Another difference between our summary sets was that the frequency of significant results was lower among the preregistered (5/18) versus nonpreregistered (15/18) studies, reflecting a broader trend in the literature (Scheel et al., 2021). If we assume that significant results generally elicit more interest than null results, this difference would have worked against our findings, possibly reducing the preference journalists showed for preregistered studies.
Our results tentatively supported our hypothesis for Research Question 3: We observed a significant interaction such that journalists who selected studies using results-blind selection showed a reduced preference for belief-consistent findings compared with journalists who used traditional selection. This finding should be considered in light of two qualifications. First, this effect was small (ηp² = .01) and might be overwhelmed by competing considerations in a real science-journalism context. Second, we did not observe a significant preference for belief-consistent (vs. belief-inconsistent) findings in the traditional condition. An experiment that used fabricated studies specifically manipulating belief consistency would be better equipped to identify confirmation bias, if it is indeed operating, in traditional settings.
At first glance, it might seem inevitable that results-blind selection should reduce confirmation bias—journalists can be biased by results only when they are privy to them. Nevertheless, our results are inconsistent with two alternative outcomes. First, we could have found that journalists’ preferences were unaffected by the type of selection. This is what we should have observed if journalists’ decisions were determined by the quality of the methods and not influenced by the direction of the results. Second, we could have found that journalists in the results-blind condition were able to intuit which studies would confirm their beliefs and to favor those studies even without the results. This possibility appears unlikely given that in the results-blind condition, interest in belief-inconsistent findings was (nonsignificantly) higher than interest in belief-consistent findings. The support we observed for this research question is unlikely to be explained by confounds because the interaction cannot be accounted for by an underlying difference across conditions.
If results-blind selection could have beneficial effects, it is important to consider how this could work in practice. For studies covered in this fashion, researchers would first need to finalize the methods of their study before data collection and make these methods publicly available. This currently happens with registered reports, some big-team-science projects, grant press releases, and clinical trials. Journalists could then make a decision about whether to report on the study conditional on, say, the researchers following the methods faithfully and the study passing peer review. Then, once the researchers complete and publish the study, the journalists would report on the results. At this stage, the methods and results could be reported together in the usual fashion so long as there was a time-stamped record (similar to a preregistration) of the results-blind decision.
Despite the possible benefits of results-blind reporting, there are constraints in journalistic workflows and audience expectations that could limit its feasibility as a widespread practice. First, the fast pace of many newsrooms might leave little room for a two-step process like the one required by results-blind reporting. Moreover, audiences may lack the scientific literacy or level of interest required to engage with methodological minutiae. Although it seems unlikely (and undesirable) that results-blind reporting would ever replace traditional science reporting, this practice could bolster credibility in covering methodologically strong research that seeks to answer questions of high importance to the public.
Normative constraints on journalism, including the ideal of what makes for a good science story, are mutable and largely respond to or take advantage of dynamic market conditions (Reese & Shoemaker, 2018; Schudson, 1998). False balance in climate-change reporting provides a good example. In the past, it was common practice to include quotes on “both sides” of the climate-change issue to balance views on what was once a controversial topic (Brüggemann & Engesser, 2017). But reporting norms changed between 2010 and 2015, when many prominent journalists, including CNN’s Christiane Amanpour, made a concerted effort to change these practices, and the new approach quickly became standard in the field. This normative shift illustrates that it is at least possible that journalists will adopt results-blind reporting practices if they perceive that such a shift will be well received by their audiences and if the idea receives a “signal boost” from influential figures. Whether these conditions will materialize is beyond the scope of the current study, but future research could examine how audiences would receive the idea of results-blind reporting or how to persuade prominent journalists to adopt results-blind reporting practices.
Limitations
One limitation of this work was that we conducted a “natural experiment” in the sense that we used real psychology studies. We made this decision for two reasons. First, it bolstered external validity across all three of our research questions. Second, drawing on the actual published record was necessary for testing Research Questions 1 and 2 because these are questions about how journalists respond to the existing literature. A limitation of this approach is that we cannot identify the specific features that account for journalists’ preferences. Fabricated summaries could be useful in taking this next step because they could be used to manipulate specific characteristics, such as sample size or statistical significance.
Although we made external validity a high priority, we made some methodological decisions that deviate from a real-world reporting context. First, to reduce participant burden, we did not provide participants with the full articles corresponding to each summary and therefore cannot be sure how a more comprehensive review process might have shifted their responses. Second, our summaries explicitly mentioned open-science practices (e.g., preregistration) and provided brief explanations to ensure that our participants knew what they were. Because open-science practices are not typically featured in press releases (American Psychological Association, 2025), they were likely more salient than usual. Finally, participants in the results-blind condition were told that because they would not see the results, they should evaluate the study based on the way it was designed and conducted. Again, because results-blind selection is not currently practiced in science-reporting settings, these instructions lack ecological validity.
Interpretations of our results should also bear in mind limitations of our sample. More than 70% of our participants were academics, and faculty members in journalism departments made up the largest subgroup (n = 206). Thus, our sample may reflect different perspectives on “newsworthiness” and different standards for methodological rigor than one comprising solely journalists and editors without an academic background. Encouragingly, many journalism faculty have professional experience that could provide insight into journalistic decision-making. A further limitation of our sample is that we recruited participants from media outlets and schools based in the United States and Canada, making it difficult to generalize our findings beyond these two countries.
Conclusion
Probing journalists’ reactions to real psychological studies yielded some encouraging—and unanticipated—findings. For example, journalists showed a preference for findings that replicated over those that did not and for studies that were preregistered over those that were not. These results challenge the idea that the most “newsworthy” studies—as determined by journalists themselves—are those resting on shaky methodological foundations. We also observed that when journalists used results-blind reporting practices, they showed less favoritism toward findings that confirmed their preexisting beliefs. Results-blind selection, then, may be worth further investigation as a tool for bolstering the trustworthiness of the psychological science that gets communicated to the public.