Abstract

The conditions under which laypeople attribute knowledge to protagonists have long been debated by experimental philosophers (Colaço et al., 2014; Nagel et al., 2013; Weinberg et al., 2001). Consider the case of “Darrel,” who accurately recognizes the species of an animal in the woods even though it was the only one of its kind among many animals of an identical-looking species. Responses to this so-called “Gettier-type” case have been studied to examine whether laypeople consider luckily true beliefs as constituting actual knowledge. In their Experiment 1, Turri et al. (2015) compared knowledge attributions in this version of the “Darrel” case to a version in which his belief is a clear case of knowledge; they found no difference between these conditions and concluded that “a salient but failed threat to the truth of a judgment does not significantly affect whether it is viewed as knowledge” (p. 381). Hall et al. (2024) replicated and extended Experiment 1 of Turri et al., testing condition differences by using the Darrel case alongside two other counterfeit-object Gettier-type cases in a large multinational study. Hall et al. found that participants were less likely to attribute knowledge to the protagonists when beliefs were only luckily true (i.e., the Gettier conditions) than when the truth of the beliefs was not under threat (i.e., the knowledge conditions). This significant condition difference reported by Hall et al. stands in contrast to the null result reported by Turri et al.
In their commentary on Hall et al., Buckwalter and Friedman (2024) claimed that the replication should have been interpreted as successful, argued that the researchers’ conclusions were incorrect, and implied that the replication effort was misguided. As a subset of contributors to the Hall et al. replication, we appreciate the opportunity to respond to their comment. Although we recognize the potential for disagreement in the interpretation of research results, Buckwalter and Friedman’s critique ignored several key features of the research, and many of their arguments and proposed interpretations were already addressed by Hall et al. (2024). In response to their comment, we (a) explain why Hall et al. did not replicate all of the original findings, (b) emphasize how Hall et al. were accurate and nuanced in the description and interpretation of their results, and (c) caution against focusing on the mechanisms underlying a psychological phenomenon before it is clearly established.
Different Statistical and Interpretive Conclusions
Although Hall et al. (2024) found evidence for several of the original claims of Turri et al. (2015), they did not replicate one notable null result: Unlike Turri et al., they observed a difference in knowledge attribution between the Gettier condition and the knowledge control condition. Although small in magnitude, this difference was identified robustly across three tested vignettes, different measures of knowledge attribution, and multiple analyses. Hall et al. further examined possible moderators, none of which eliminated the condition difference that Turri et al. failed to find. Given two studies that differ on the statistical significance of their results, we contend that the study with the larger, more diverse sample and higher analytic power to detect differences is more likely to yield valid, replicable, and generalizable results. Because the difference between the aforementioned conditions is absolutely central to the philosophical debate over Gettier cases, and the experimental philosophical literature has accordingly focused on it, we are confident that Hall et al.’s failure to replicate the original null result is an important contribution.
Nonetheless, Buckwalter and Friedman (2024) concluded that the two discrepant findings were “basically the same.” This assertion is based on their interpretation of descriptive differences in the frequencies of knowledge attributions between conditions and enables them to generate a narrative (or reframing) that would align with that of Turri et al. (2015). However, Hall et al.’s (2024) preregistered analysis plan did not include this method of assessing the results. The purpose of preregistration is to reduce researcher degrees of freedom and to discourage exploratory analyses disguised as confirmatory, especially those that support preferred narratives (e.g., Nosek et al., 2018; Wicherts et al., 2016). In this case, Hall et al. anticipated a null result aligned with the original finding, conducted a prespecified analysis with any data-driven changes transparently noted, detected a nominally significant result, and reported those findings accordingly. Based on these prespecified conditions and consistent with the framework of null hypothesis significance testing (NHST), we must state that Hall et al.’s findings were different from those of the original study.
Although one can, and Hall et al. (2024) did, calculate an effect size from the null result reported in Experiment 1 of Turri et al. (2015; i.e., odds ratio [OR] = 2.00, 95% confidence interval [CI] = [0.77, 5.21]), interpreting the results of the statistical test beyond a failure to reject the null hypothesis would violate the logic of NHST (e.g., Frick, 1996). The null result merely indicates that Turri et al. had insufficient evidence to support a claim of a difference between conditions. Nonetheless, as Hall et al. acknowledged, the effect they observed (i.e., OR = 1.86, 95% CI = [1.78, 1.94]) was similar in size to the original. Accordingly, as both Hall et al. and Buckwalter and Friedman (2024) pointed out, the large sample in the replication study may be one explanation for the difference in statistical conclusions. Lacking this insight, Turri et al. concluded that knowledge attributions were insensitive to the difference between the Gettier and knowledge conditions in their Experiment 1, a claim that was not supported by the results of Hall et al. Nonetheless, Buckwalter and Friedman suggested that the percentage differences in the original, likely underpowered, Turri et al. study should have been the sole criterion for replication rather than the statistical conclusions or interpretations therefrom. However, the CI of the original effect is quite wide and includes effect sizes that would imply a greater likelihood of knowledge attribution in the knowledge condition than the Gettier condition (OR > 1), a null result (OR = 1), and a greater likelihood of knowledge attribution in the Gettier condition than the knowledge condition (OR < 1); therefore, their results were inconclusive. In contrast, the results from Hall et al. represent a clear, albeit small, Gettier-intuition effect. The results of the original study simply cannot be classified as “basically the same” as those presented by Hall et al.
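The contrast between the two intervals can be made concrete with a short calculation. The sketch below is illustrative only: it uses the two reported 95% CIs and assumes that each is symmetric on the log-odds scale (a Wald-type interval), which is an assumption on our part and not a description of the models either study fitted. The function names are hypothetical. Under that assumption, the implied standard error of the log odds ratio is roughly 0.49 for the original study and roughly 0.02 for the replication.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def log_or_standard_error(lower, upper, z=Z_95):
    """Back out SE(log OR) from a CI, ASSUMING the CI is symmetric on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def classify_interval(lower, upper):
    """Describe where a CI for an odds ratio sits relative to the null value OR = 1."""
    if lower > 1:
        return "excludes 1 (OR > 1)"
    if upper < 1:
        return "excludes 1 (OR < 1)"
    return "includes 1 (inconclusive)"


# 95% CIs as reported in the text above.
intervals = {
    "Turri et al. (2015), Experiment 1": (0.77, 5.21),
    "Hall et al. (2024)": (1.78, 1.94),
}

for study, (lower, upper) in intervals.items():
    se = log_or_standard_error(lower, upper)
    print(f"{study}: SE(log OR) ~ {se:.3f}; CI {classify_interval(lower, upper)}")
```

Similar point estimates can therefore coexist with very different evidential weight: the original interval is compatible with effects in either direction, whereas the replication interval is not.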
Accurate and Nuanced
Contrary to Buckwalter and Friedman’s (2024) reading, Hall et al. (2024) did not “claim to . . . show that people deny knowledge to lucky agents” or “infer from this finding that there is a common psychological tendency to deny that true-by-luck beliefs are knowledge.” Rather, Hall et al. accurately reported a nominally significant condition difference, which they then interpreted accordingly. For example, they stated the following: “This result did not correspond to that found by Turri et al., who failed to detect a significant difference in knowledge attribution between these two conditions . . . we did find effect sizes in the same range as Turri et al. when directly comparing like conditions; however, the null result did not replicate.”
Hall et al. (2024) further noted their sample’s baseline skepticism (i.e., that a “notable number of participants [43.41%; see Table 7] denied knowledge to protagonists even in clear cases of justified true belief”) and for this reason, were careful in their interpretation of the condition difference. Specifically, they acknowledged that “Gettier intuitions were by no means common,” “the small size of the effect suggests that Gettier intuitions were not prevalent in our research,” and that “given the small size of the observed effect, the theoretical significance of this result is debatable.” They also provided a nuanced discussion of how differences in methods, design, analytic approach, and geographical origin and composition of the samples could have affected results.
Constructs and Mechanisms
We agree with Buckwalter and Friedman (2024) that researchers who claim to answer vague questions about poorly defined constructs may very well lead readers to inappropriate conclusions. However, Hall et al.’s (2024) rhetorical use of “Gettier-type case” and “Gettier intuition” did not preclude the specificity of their claims or the recognition of their constraints on generality. Their limitations were readily acknowledged. Contrary to what Buckwalter and Friedman implied, Hall et al. did not make strong statements about the theoretical implications of their results.
Buckwalter and Friedman (2024) argued for the development of theories and identification of mechanisms in research on luckily true beliefs. We agree that the study of mechanisms is important; in fact, Hall et al. (2024) explored the moderating role of luck attributions directly in their analyses. However, the examination of mechanisms should, in our view, follow the confirmation of an effect’s replicability. The approach of Hall et al. was to first establish the presence of the phenomenon before (and alongside) identifying the factors that influence it. Turri et al. (2015) employed a particular strategy and methodology for studying knowledge attributions in their Experiment 1 and drew conclusions based on the results. The purpose of Hall et al.’s replication was to use similar strategies to determine whether Turri et al.’s findings were robust. The replication results demonstrate that prior to studying mechanisms, establishing agreement on what researchers are trying to understand is necessary. Examining “specific forms of luck” and theorizing about their importance, as Buckwalter and Friedman recommended, would have been premature. When results do not conform to hypotheses, researchers tend to avoid accepting the conclusion that their theory may not be correct (i.e., refuted; Popper, 1959). Instead, not infrequently, they embark on subtle searches for subgroups or moderators for which their theory may hold. Hall et al. found differences and highlighted methodological issues, which, had they not been identified, could have hindered progress on theoretical development.
Balancing Replication Goals
Buckwalter and Friedman (2024) suggested that Hall et al. (2024) did not appreciate the tension between two replication goals: that of assessing reliability and that of understanding phenomena. Hall et al. set out to do a simple, direct replication study with both pedagogical and scientific goals. Because of the partnership between the Psychological Science Accelerator (Moshontz et al., 2018) and the Collaborative Replications and Education Project (Grahe et al., 2024; Wagge et al., 2019), the research project increased in complexity and scope over the course of the Stage 1 Registered Report review process. As a consequence, additional piloted vignettes were incorporated, measures were added and changed, alternative response options were included, extension hypotheses were proposed by coauthors, and plans to assess moderating variables were formulated. The conceptual replication Hall et al. conducted emerged out of precisely those tensions noted by Buckwalter and Friedman and the additional consideration of pedagogical value. We hope that the findings of Hall et al. will contribute to the incremental understanding of epistemic intuitions and, in turn, inspire future inquiry along these lines, that is, examining subtypes, mechanisms, and moderators of so-called Gettier intuitions.
Footnotes
Transparency
Action Editor: David A. Sbarra
Editor: David A. Sbarra
Author Contributions
