Abstract
How do comprehenders interpret semantically implausible sentences? Previous studies proposed a noisy-channel framework of sentence comprehension, in which communication between a speaker and a comprehender takes place over a noisy channel. The comprehender rationally adopts an interpretation of a sentence based on how likely the interpretation is (the semantic prior) and how likely it is that the interpretation was corrupted into the perceived sentence by noise (the likelihood). The theory predicts that comprehenders are more likely to adopt a literal interpretation of an implausible sentence if their prior for implausible sentences is higher. To test this hypothesis, Gibson et al. manipulated the proportion of implausible test sentences in two sets of experiments, in which participants read a number of sentences and answered a comprehension question after each sentence. Although their results supported the hypothesis, the experiments could have been confounded (a) by participants’ adaptation (due to different experiment lengths) and (b) by different participants adopting different strategies for the task (due to the between-subject design). In our study, we manipulated the semantic prior and controlled for these potential confounds. We found that participants exposed to more implausible sentences were indeed more likely to interpret implausible sentences literally. Our results hence offer additional support for the noisy-channel framework.
Introduction
Messages are constantly conveyed between a speaker – who encodes their intended meaning in the message – and a comprehender – who decodes the message to recover the speaker’s intended meaning. This might sound like a trivial process: after all, as speakers of a human language, we are used to constantly alternating between the role of speaker and comprehender and exchanging thoughts and ideas using language. However, it is not as easy as one might think, as there is often noise in the communication process (Brehm, 2023). For example, the speaker might utter disfluent speech, the environment of the conversation might be noisy, or the comprehender might not be paying close attention to the speaker, and as a result, the meaning that the comprehender decodes at their end is not always the intended meaning of the speaker. Remarkably, despite the presence of noise, comprehenders often manage to recover the speaker’s intended meaning.
How do we as language users achieve this? Past studies have offered models on how a comprehender extracts meaning given a signal (e.g. Ferreira, 2003; Gibson, 2000; Hale, 2001; Levy, 2008a; Lewis et al., 2006; MacWhinney & Bates, 1989; Tabor et al., 2004). However, most of the proposals treat the communication as taking place in a noise-free environment. In contrast, some recent proposals (e.g., Gibson et al., 2013; Levy, 2008b) integrate the presence of noise in their models. These proposals model the communication between a speaker and a comprehender as happening in a noisy channel (Shannon, 1948). In particular, they model that the speaker has an intended utterance
The first term
This noisy-channel framework has been experimentally tested in numerous studies. Gibson et al. (2013) tested the framework by investigating comprehenders’ interpretation of different syntactic alternations, such as the active-passive alternation or the double-object/prepositional phrase object alternation as in (2) below. These are syntactic constructions that are very close to one another in meaning and in form, varying only in word order and some morphology. All languages have such alternations: they allow us to express the same idea in different word orders, depending on which elements are already introduced and which are yet to be introduced. People like to start with information that is already part of the discourse and proceed to new information (Chafe, 1970; Givón, 1984; Givón, 1987; Lambrecht, 1994; Birner and Ward, 1998; Clifton & Frazier, 2004).
Gibson et al. (2013) focused on syntactically well-formed but semantically implausible versions of these materials and examined how participants interpreted them. For example, when a comprehender encounters a sentence “the mother gave the candle to the daughter” (utterance
Gibson et al. (2013) tested the noisy-channel framework in several ways. First, they manipulated the plausibility of the test sentences: a plausible utterance has a higher prior probability than an implausible utterance. In the previous example, since the candle is inanimate and hence cannot receive anything or anyone, the utterance
(2) a.
b.
c.
d.
Participants read one version of the sentences in (2) and were asked the comprehension question. For each version, the proportion of responses where participants interpreted the sentence literally was calculated. Two predictions were made by Gibson et al. (2013): first,
Third, Gibson et al. (2013) also manipulated the presence of noise in the filler sentences. In one experiment, the filler sentences given to the participants were plausible and syntactically good English sentences, such as “The colonel was knighted by the queen because of his loyalty.” In contrast, in another experiment, half of the filler sentences were replaced with syntactically illicit sentences caused by various noise operations (e.g. “The colonel was knighted for by the queen because of his loyalty”). Gibson et al. (2013) predicted that these syntactically illicit filler sentences would raise the likelihood of noise operations
Fourth, and most relevant to the current study, Gibson et al. (2013) manipulated the semantic prior by changing the proportion of implausible sentences. A higher proportion of implausible sentences raises the prior of implausible utterances, and the framework predicts that participants will be more likely to interpret implausible sentences literally. In Experiments 1A-1E of Gibson et al. (2013), participants read 20 test sentences of one of five types of syntactic alternations, together with 60 filler sentences. For example, one of these sets of materials investigated the double-object/prepositional phrase object alternation as in (2). In Experiment 3 of Gibson et al. (2013), participants read 100 test sentences, consisting of materials from all five alternations in one experiment, together with the same 60 filler sentences. As a result, the proportion of implausible sentences was higher in Experiment 3 than in Experiment 1 (50/160, or 31.25% vs. 10/80, or 12.5%, since half of the test sentences were implausible). They found that, as predicted, participants were more likely to interpret implausible sentences literally in Experiment 3 than in Experiment 1.
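The proportions quoted above can be checked with a short calculation. The sketch below is only an illustration of that arithmetic: the function name `implausible_share` is hypothetical, and it assumes, as the text states, that half of the test sentences were implausible and that the fillers in Gibson et al. (2013) were all plausible.

```python
from fractions import Fraction


def implausible_share(n_test: int, n_filler: int) -> Fraction:
    """Share of implausible sentences among all sentences a participant read.

    Assumes half of the test sentences are implausible and every filler
    sentence is plausible (the design described for Gibson et al., 2013).
    """
    n_implausible = Fraction(n_test, 2)
    return n_implausible / (n_test + n_filler)


# Experiment 1: 20 test sentences + 60 fillers.
exp1 = implausible_share(n_test=20, n_filler=60)
# Experiment 3: 100 test sentences + the same 60 fillers.
exp3 = implausible_share(n_test=100, n_filler=60)

print(f"Experiment 1: {exp1} = {float(exp1):.2%}")  # 1/8 = 12.50%
print(f"Experiment 3: {exp3} = {float(exp3):.2%}")  # 5/16 = 31.25%
```

Note that the same calculation also makes the confound visible: the two experiments differ both in the share of implausible sentences and in total length (80 vs. 160 sentences).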
However, the manipulation of the semantic prior in Gibson et al. (2013) was potentially confounded. Their approach was to pool plausible and implausible sentences from different alternations into a single experiment (Experiment 3), which increased the proportion of implausible sentences, and they showed that this manipulation led to an increase in participants’ literal interpretation of implausible sentences. This approach introduces two potential confounds as to why the literal interpretation rate increased: experiment length and participant idiosyncrasy. The first potential confound results from an adaptation effect: as participants see more and more implausible test sentences over the course of an experiment, they could adjust their semantic prior to expect more implausible sentences and hence become more likely to interpret them literally. Indeed, this effect was found in Delaney-Busch et al. (2019): the change in participants’ N400 amplitude (a measure of a participant’s semantic processing) over a prime-target matching task can be largely predicted by a model of their by-trial target word probability estimate, suggesting that participants’ semantic predictions shifted throughout the experiment to adapt to the statistical structure of the task. It is possible that participants in Experiment 3 of Gibson et al. (2013) were exposed to more implausible sentences and adapted their semantic prior more than participants in Experiment 1. Because the results reported in the original study were averaged over the course of the experiment, they cannot show whether the reported effect was due to the difference in semantic prior or to the difference in experiment length.
The second potentially confounding factor was what we would label as idiosyncrasy of strategy across participants: each participant had a different strategy to complete the task. Some participants may choose to consistently interpret sentences literally, even though they are implausible (“consistently literal” henceforth); some participants may switch back and forth, interpreting some implausible sentences literally and others non-literally (“switching” henceforth); and others may choose to consistently interpret sentences non-literally (“consistently non-literal” henceforth). Since the experiment to measure the effect of semantic prior is between-participant in nature, it was unclear to what extent the effect observed in Gibson et al. (2013) was actually due to the effect of semantic prior manipulation or due to differences in participant strategies. To measure the effect of the semantic prior manipulation in a between-participant design, one should minimize the influence of individual differences by recruiting a large number of participants.
To see if the effect of semantic prior found in Gibson et al. (2013) was in fact confounded, we plotted the original results in Figure 1a and the average by-trial literal interpretation rate in Figure 1b. If semantic prior has an effect on sentence interpretation, we expect no noticeable difference between participants’ literal interpretation rates in the initial trials of Experiment 3 (blue line in Figure 1b) and Experiment 1 (red line in Figure 1b). Then, as trials progress, the effect of the semantic prior becomes more pronounced, and we should see a consistent difference in literal interpretation rate between the two conditions. Results in Figure 1b were not in line with this prediction: the difference in literal interpretation rate between the two conditions was already present in the beginning trials of the experiment. This implies that the results found in Gibson et al. (2013) could still be because of the idiosyncrasy of participant strategies in Experiment 3. This also implies that the difference in the literal interpretation rate in Gibson et al. (2013) could be due to an adaptation effect: as shown in Figure 1b, participants were more likely to interpret implausible sentences literally later in the experiment, as they encountered more implausible sentences. In addition, the adaptation effect was stronger in Experiment 3 than in Experiment 1 due to differences in experiment length.

Experiment length and participant idiosyncrasy could be confounds in (
To further examine the effect of participant idiosyncrasy, we compared four groups of participants who were presented with the same sentences and were asked to do the same task. The first two groups were taken from Experiment 1 in Gibson et al. (2013), where participants were presented with active/passive sentences and DO-goal/PO-goal sentences, respectively. The remaining two groups were taken from Experiment 3 in Chen et al. (2023), where participants were also presented with active/passive sentences and DO-goal/PO-goal sentences, respectively. For each type of sentence material, we calculated the proportions of different types of participants (i.e., consistently non-literal, switching, and consistently literal). The results are shown in Figure 2: despite being presented with the same sentences and asked to do the same task, different groups of participants showed different distributions of behaviors. A higher proportion of participants in Gibson et al. (2013) consistently interpreted implausible active/passive sentences literally, whereas a higher proportion of participants in Chen et al. (2023) switched between literal and non-literal interpretations of active/passive sentences. A lower proportion of participants consistently interpreted implausible DO-goal sentences non-literally in Chen et al. (2023) than in Gibson et al. (2013), whereas a higher proportion of participants switched back and forth when interpreting implausible DO-goal sentences in Chen et al. (2023). Therefore, to ensure that the effect observed in Gibson et al. (2013) is actually due to the effect of interest (semantic prior) rather than to participant idiosyncrasy, one needs to recruit enough participants that each participant’s idiosyncrasy is mostly smoothed out.

Different groups of participants had different distributions of responses when given the same materials. This plot shows how participants responded to implausible sentences across two identical experiments (Chen et al., 2023; Gibson et al., 2013). Red: those who always interpreted implausible sentences non-literally (i.e., 0% literal interpretation rate); blue: those who switched back and forth between literal and non-literal interpretation when presented with implausible sentences (i.e., between 0% and 100% literal interpretation rate); green: those who consistently interpreted implausible sentences literally (i.e., 100% literal interpretation rate). The numbers in each panel indicate the number of participants.
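The three participant types defined above (0%, between 0% and 100%, and 100% literal interpretation rate) can be derived mechanically from each participant's responses to implausible sentences. The following is a minimal sketch of that classification, not the authors' analysis script; the function name, the participant IDs, and the toy responses are all hypothetical.

```python
from collections import Counter


def classify_participant(literal_responses):
    """Label a participant by their responses to implausible sentences.

    literal_responses: list of booleans, True when the participant
    interpreted an implausible sentence literally.
    """
    n_literal = sum(literal_responses)
    if n_literal == 0:
        return "consistently non-literal"  # 0% literal interpretation rate
    if n_literal == len(literal_responses):
        return "consistently literal"  # 100% literal interpretation rate
    return "switching"  # strictly between 0% and 100%


# Toy data (hypothetical): five implausible trials per participant.
toy_responses = {
    "p01": [False, False, False, False, False],
    "p02": [True, False, True, True, False],
    "p03": [True, True, True, True, True],
    "p04": [False, True, False, False, False],
}

counts = Counter(classify_participant(r) for r in toy_responses.values())
for label, n in counts.items():
    print(f"{label}: {n}")
```

Comparing such counts across groups that saw identical materials is what reveals whether between-group differences could arise from participant strategy alone.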
The current study serves to address these confounds, and hence test the effect of semantic prior on noisy-channel sentence interpretation. As an overview, one group of participants is given the filler sentences in Gibson et al. (2013), whereas another group of participants is given filler sentences that are syntactically licit but semantically implausible. We address the potential confounds in Gibson et al. (2013) in two ways: first, participants in different experimental conditions read the same number of sentences. We also track the literal interpretation rate across participants throughout the course of the experiment. In addition, each experimental condition has the same number of participants, and in the second experiment, we recruited a large number of participants (200 per condition), hoping to smooth out the idiosyncrasy in their semantic prior. If participants are indeed more likely to interpret implausible sentences literally given a more implausible semantic prior, we should expect such a difference to be present once participants were adapted to the new semantic prior. Instead, if the results in Gibson et al. (2013) were indeed caused by a difference in experiment length, we should expect no difference in literal interpretation rate in the two groups of participants. In addition, if the results in Gibson et al. (2013) were due to participant idiosyncrasy, we should expect the literal interpretation rate in two groups of participants to be consistently different throughout the experiment.
Experiment 1
Methods
We followed the methods from previous studies (e.g., Gibson et al., 2013): participants read sentences varying in constructions and plausibility and were asked a comprehension question. One group of participants was given plausible filler items, and another group was given implausible filler items. In each condition, we calculated the proportion of the trials where participants interpreted the sentences literally. Below is a detailed account of the methods.
Participants in both groups were asked to read 80 sentences: 20 test sentences and 60 filler sentences. The 20 test sentences were taken from the DO-goal/PO-goal materials in Gibson et al. (2013), systematically varying in syntactic construction (DO-goal or PO-goal; for simplicity, we will refer to them simply by DO and PO) and plausibility (plausible or implausible), with 5 sentences in each combination [See (2)]. Critically,
Example Sentence Stimuli Used in This Study.
We recruited 120 participants from Prolific who were native English speakers located in the United States, with an approval rate higher than 95%. The study was hosted on Qualtrics. Sixty participants were in the plausible filler condition, and 60 participants were in the implausible filler condition. Before the experiment, participants were asked to complete 5 English sentences in a grammatical way as a proficiency check. After the proficiency check, all 80 trials were presented on the same webpage. Each trial contained a sentence, followed by a comprehension question and two buttons, one for “Yes” and another for “No”. Participants were free to edit their responses. There was no time limit for the experiment. The expected completion time for the experiment was 15 min, and participants were paid $3.00 for their submission, regardless of how long it took them to complete it. Only participants with a filler accuracy rate higher than 75% were included in our analysis.
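The inclusion criterion (filler accuracy strictly higher than 75%) can be sketched as a simple filter. This is an illustrative sketch under assumptions: the function names, the four-item answer key, and the participant data are hypothetical, and the real experiment used 60 filler items.

```python
def filler_accuracy(responses, answer_key):
    """Proportion of filler comprehension questions answered correctly."""
    n_correct = sum(given == expected for given, expected in zip(responses, answer_key))
    return n_correct / len(answer_key)


def included_participants(filler_responses, answer_key, threshold=0.75):
    """Keep participants whose filler accuracy is strictly above the threshold."""
    return [
        pid
        for pid, responses in filler_responses.items()
        if filler_accuracy(responses, answer_key) > threshold
    ]


# Toy data (hypothetical): four filler questions instead of 60.
answer_key = ["Yes", "No", "Yes", "Yes"]
toy_fillers = {
    "p01": ["Yes", "No", "Yes", "Yes"],  # 4/4 correct
    "p02": ["Yes", "No", "Yes", "No"],   # 3/4 correct: exactly 75%, not above it
    "p03": ["No", "Yes", "Yes", "No"],   # 1/4 correct
}

kept = included_participants(toy_fillers, answer_key)
print(kept)  # ['p01']
```

The strict inequality follows the wording "higher than 75%"; a participant at exactly 75% would be excluded under this reading.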
This study was not pre-registered. The data and the analysis scripts are available at https://osf.io/k5vqj.
Results
One hundred twenty-two participants in total were originally recruited, and two participants were excluded from the data analysis due to low filler accuracy. The median completion time of the experiment was 13 min. In all conditions, plausible sentences were interpreted literally in more than 90% of the trials and were hence not analyzed further.
We ran a Bayesian mixed-effects logistic regression using the MCMCglmm package (Hadfield, 2010) in R (R Core Team, 2013). We coded the sentence construction (DO vs. PO) and the filler condition (plausible vs. implausible fillers) as fixed effects. Following the maximum random effect structure under our experimental design (Barr et al., 2013), we included random intercepts for participants and items, random by-participant and by-item slope for construction, and random by-item slope for filler condition. In lme4 (Bates et al., 2014) syntax, the formula would be written as in (3):
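The formula itself did not survive in this copy of the text. As a hedged reconstruction from the verbal description only, the sketch below assembles an lme4-style formula string from the stated fixed and random effects. The column names (`literal`, `construction`, `filler`, `participant`, `item`) are assumptions, as is the helper `lme4_formula`; the interaction term is included because the analysis reports "each main effect parameter and the interaction".

```python
def lme4_formula(response, fixed_effects, random_slopes):
    """Build an lme4-style formula string.

    fixed_effects are crossed with '*' (main effects plus interaction);
    random_slopes maps each grouping factor to its random slopes, and a
    random intercept is always included for each grouping factor.
    """
    fixed_part = " * ".join(fixed_effects)
    random_terms = []
    for group, slopes in random_slopes.items():
        slope_part = " + ".join(["1"] + list(slopes))
        random_terms.append(f"({slope_part} | {group})")
    return f"{response} ~ {fixed_part} + " + " + ".join(random_terms)


# Hypothetical column names; structure follows the description in the text:
# by-participant slope for construction, by-item slopes for construction
# and filler condition (filler condition varies between participants).
formula = lme4_formula(
    response="literal",
    fixed_effects=["construction", "filler"],
    random_slopes={
        "participant": ["construction"],
        "item": ["construction", "filler"],
    },
)
print(formula)
# literal ~ construction * filler + (1 + construction | participant) + (1 + construction + filler | item)
```

There is no by-participant slope for filler condition because each participant saw only one filler condition, which is why the described structure is maximal for this design.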
In the analysis, the priors were set to be uninformative (Baayen et al., 2008), and the number of iterations was set to be 10000, with a thinning interval of 10, and a warm-up period of 3000 iterations. For each main effect parameter and the interaction, we report the 2.5% percentile, the mean, and the 97.5% percentile of the posterior distribution. We also report the value
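To make the reported quantities concrete, the sketch below shows how the 2.5% percentile, the mean, and the 97.5% percentile are obtained from a set of posterior draws. It assumes the 3000 warm-up iterations are counted within the 10000 total, which would leave 700 retained draws after thinning; the draws themselves are simulated placeholders, not the study's posterior.

```python
import random
import statistics

N_ITERATIONS = 10_000  # total MCMC iterations (from the text)
WARM_UP = 3_000        # warm-up iterations discarded (from the text)
THINNING = 10          # keep every 10th draw (from the text)

# Assumption: warm-up is part of the 10000 iterations.
n_retained = (N_ITERATIONS - WARM_UP) // THINNING


def summarize_posterior(draws):
    """Return the 2.5% percentile, mean, and 97.5% percentile of the draws."""
    # 40 equal-probability bins give cut points at 2.5%, 5%, ..., 97.5%.
    cuts = statistics.quantiles(draws, n=40, method="inclusive")
    return cuts[0], statistics.fmean(draws), cuts[-1]


# Placeholder draws for one coefficient (simulated, for illustration only).
rng = random.Random(0)
toy_draws = [rng.gauss(1.0, 0.5) for _ in range(n_retained)]

lower, mean, upper = summarize_posterior(toy_draws)
print(f"retained draws: {n_retained}")
print(f"2.5%: {lower:.3f}, mean: {mean:.3f}, 97.5%: {upper:.3f}")
```

An interval between the 2.5% and 97.5% percentiles that excludes zero is the usual informal indication that an effect is reliably different from zero.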
The results are presented in the upper facet of Figure 3, and the results of the statistical analyses are shown in Table 2. First, implausible PO sentences were more likely to be interpreted literally than implausible DO sentences (

Participants with higher implausible semantic prior are more likely to interpret implausible sentences literally. Percentage of literal interpretation of implausible double-object (DO, red) and implausible prepositional object (PO, blue) sentences, faceted by filler conditions (implausible vs. plausible fillers) and experiments. The numbers in each panel indicate the number of participants.
Results of the Mixed-Effect Logistics Regression in Experiment 1 and Experiment 2, Including the 2.5% Percentile, the Mean, the 97.5% Percentile, and the
Experiment 2
Methods
Experiment 2 is an exact replication of Experiment 1, except that the number of participants was 400. Those who participated in Experiment 1 were ineligible for this experiment. The data and the analysis scripts are available at https://osf.io/k5vqj.
Results
Four hundred twenty-nine participants in total were originally recruited, and 25 participants were excluded from the data analysis due to low filler accuracy. Two hundred three participants from the implausible filler condition and 201 participants from the plausible filler condition were included in the analysis. The median completion time for the study was 12.5 min. In all conditions, plausible sentences were interpreted literally in more than 90% of the trials and were hence not analyzed further. We adopted the same statistical analysis procedures as in Experiment 1.
The results are presented in the lower facet of Figure 3, and the statistical analysis results are shown in Table 2. First, just as in Experiment 1, implausible PO sentences were more likely to be interpreted literally than implausible DO sentences (
By-Trial Analysis
In Experiments 1 and 2, a group of participants was presented with plausible filler sentences, whereas another group was presented with implausible filler sentences. We found that, as predicted, participants given implausible filler sentences were overall more likely to interpret implausible sentences literally than those given plausible filler sentences. We also replicated results reported in Gibson et al. (2013): PO sentences were more likely to be interpreted literally than DO sentences. However, it remains unclear whether such a difference is consistent throughout the experiment. In this section, we analyzed participants’ responses by trial, under different conditions and sentence constructions.
Figure 4 shows the mean proportion of literal interpretation across trial numbers, grouped by sentence constructions and experiments. Similar to Experiments 1 and 2, we ran a Bayesian generalized linear mixed-effects regression in each sentence construction and experiment. Conditions (implausible vs. plausible fillers) and trial numbers are coded as fixed effects. We also included random intercepts of participants and items and random by-participant and by-item slopes for trial number and random by-item slope for conditions, as shown in Equation 4.
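The descriptive quantity plotted in Figure 4, the mean proportion of literal interpretations at each trial number within each filler condition, can be computed as in the sketch below. The record layout and the toy data are hypothetical; this illustrates only the descriptive aggregation, not the regression model.

```python
from collections import defaultdict


def by_trial_literal_rate(records):
    """Mean literal-interpretation rate per (filler condition, trial number).

    records: iterable of (condition, trial_number, literal) tuples, where
    literal is True when an implausible sentence was interpreted literally.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [n_literal, n_trials]
    for condition, trial, literal in records:
        cell = totals[(condition, trial)]
        cell[0] += int(literal)
        cell[1] += 1
    return {key: n_literal / n for key, (n_literal, n) in sorted(totals.items())}


# Toy records (hypothetical): two participants per condition, two trials each.
toy_records = [
    ("implausible fillers", 1, True),
    ("implausible fillers", 1, False),
    ("implausible fillers", 2, True),
    ("implausible fillers", 2, True),
    ("plausible fillers", 1, False),
    ("plausible fillers", 1, False),
    ("plausible fillers", 2, True),
    ("plausible fillers", 2, False),
]

rates = by_trial_literal_rate(toy_records)
for (condition, trial), rate in rates.items():
    print(f"{condition}, trial {trial}: {rate:.2f}")
```

Plotting these rates against trial number separates an adaptation effect (rates rising with trial number in both conditions) from an effect of the filler manipulation (a gap between the two conditions).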

The difference in literal interpretation rate is consistent under different filler conditions. The proportion of literal interpretation (y-axis) under different filler conditions (green for implausible fillers, orange for plausible fillers) is plotted against the trial number (x-axis).
The statistical analysis results are shown in Table 3. First, similar to Figure 1, we found that participants were adapting their semantic prior as the experiment progressed, as implausible sentences presented later in the experiment were more likely to be interpreted literally compared to those presented earlier in the experiment. This was also shown as a significant, positive effect of trial number in the statistical analysis (
Results of the Mixed-Effect Logistics Regression in the By-Trial Analysis, Including the 2.5% Percentile, the Mean, the 97.5% Percentile, and the
Second, apart from the adaptation effect, it could be observed from the graph that in both experiments, participants given implausible fillers were indeed consistently more likely to interpret implausible sentences literally, compared with those given plausible fillers, as the trend was consistent in both types of sentences and across different trials. This observation was partially supported by statistics, as there were no significant main effects of filler condition in Experiment 1 (
Distribution of Responses
Following the procedures in Figure 2, we plotted the distribution of participant responses from Experiment 1 (120 participants in total) and Experiment 2 (404 participants in total) in this study. Since Experiment 2 is a replication of Experiment 1 with more participants, this gives us a direct comparison to investigate participant idiosyncrasy. The results are presented in Figure 5, showing the distribution of different types of participants (i.e., consistently non-literal, switching, and consistently literal) in each filler condition and sentence construction.

Across the two experiments in this study, the distribution of participant responses is relatively stable. This plot shows how participants responded to implausible sentences in Experiment 1 (120 participants in total) and Experiment 2 (404 participants in total) in this study. The data are organized by construction (DO vs. PO, columns) and filler condition (implausible vs. plausible fillers, rows). Red: those who always interpreted implausible sentences non-literally (i.e., 0% literal interpretation rate); blue: those who switched back and forth between literal and non-literal interpretation when presented with implausible sentences (i.e., between 0% and 100% literal interpretation rate); green: those who consistently interpreted implausible sentences literally (i.e., 100% literal interpretation rate). The numbers in each panel indicate the number of participants.
The results suggest that across the two experiments, the distribution of participant responses is relatively stable within each combination of filler condition and sentence construction. In both experiments, most participants switched between interpreting implausible DO sentences literally and making inferences on them, regardless of filler conditions. A higher proportion of participants in the implausible filler condition consistently interpreted implausible PO sentences literally than those who switched between literal and non-literal interpretation, while the opposite was true for those in the plausible filler condition.
Discussion
Numerous studies have tested various aspects of the noisy-channel framework (e.g., Bader and Meng, 2018; Buxó-Lugo and Slevc, 2024; Cai et al., 2022; Chen et al., 2023; Gibson et al., 2013; Gibson et al., 2017; Liu et al., 2020; Paape, 2024; Poliak et al., 2024; Poppels & Levy, 2016; Ryskin et al., 2018; Zhan et al., 2023; also see Traxler, 2014 for a review), but few have tested the prior component by manipulating a comprehender’s semantic prior. Gibson et al. (2013) conducted such a test, but their experiments could be confounded: the experiment intended to elicit a higher prior for implausible interpretations was also longer, so their results could be due to experiment length rather than to a difference in semantic prior. In addition, it was unclear whether the difference was actually due to the difference in the proportion of implausible sentences, as the noisy-channel framework would predict, or simply due to participants in Experiment 3 being more likely to interpret implausible sentences literally, since Experiment 3 had far fewer participants.
Our study addressed these two issues. In each experiment, two groups of participants read the same test sentences in double-object (DO) and prepositional object (PO) constructions, but one group was presented with implausible filler sentences, while the other was presented with plausible filler sentences. This experimental design manipulated the semantic prior without also varying the experiment length. We also recruited a larger number of participants than Gibson et al. (2013), while keeping the number of participants the same in each condition, in order to mitigate the effects of participant idiosyncrasy. We predicted that exposing a comprehender to more implausible sentences would raise the comprehender’s prior for implausible utterances, and that they would therefore be more likely to interpret an implausible sentence literally when they encountered one. We also predicted that this effect should persist after participants had adapted to the new semantic prior. Our findings were consistent with the predictions: in both experiments, participants who were exposed to implausible filler sentences were more likely to interpret implausible test sentences literally than those who were exposed to plausible filler sentences. In addition, this difference was consistent at the trial level once participants had adapted to the semantic prior in their respective experimental condition. We also replicated the previous result in Gibson et al. (2013) that PO sentences were interpreted literally more often than DO sentences, plausibly because insertions are less likely to take place than deletions. Our results indicate that if a comprehender repeatedly receives messages that sound implausible, it might be a rational strategy for the comprehender to assume that the sender simply tends to send implausible messages, rather than continuing to assume that the sender is saying something plausible.
Our study shows the dynamicity of the human semantic prior and the rationality of comprehenders in sentence interpretation. This finding is also broadly in line with previous studies such as Ryskin et al. (2018), which showed the dynamicity of noise likelihood: comprehenders adapted their noise likelihood according to the noisy sentences they were exposed to. For example, comprehenders exposed to more sentences with deletion errors are more likely to infer deletion as the noise operation. Both this study and Ryskin et al. (2018) showed that comprehenders can quickly adapt to the semantic prior and the noise model of speakers and rationally interpret the sentences they perceive. This ability is critical because communication is always changing: different speakers have different semantic priors and noise likelihoods, depending on their native language and even the modality in which communication takes place, and different environments have different levels of noise. Therefore, comprehenders need to quickly adapt to the parameters specific to the setting where the conversation takes place, in order to maximally recover the speaker’s intended message.
Our study is the first to raise the issue of individual idiosyncrasy in noisy-channel comprehension: different participants may have different strategies for completing the task. In many previous noisy-channel studies (e.g., Gibson et al., 2013; Poliak et al., 2024; Poppels & Levy, 2016; Zhan et al., 2023), the effects of interest were mostly studied in within-participant experiments, where the same group of participants was presented with sentences manipulated in various ways and interpreted them accordingly. In these experiments, since the same group of participants read the different sentences, the effect of individual idiosyncrasy was small. However, in between-subject experiments (e.g., Chen et al., 2023; Gibson et al., 2017), where different groups of participants were presented with different sets of sentences, the effect of participant idiosyncrasy cannot be ignored, especially when the effect size of interest is small. There are two takeaways from this. First, future studies should ensure that the effect of participant idiosyncrasy is mitigated in between-subject experiments. The results in Figure 5 give some initial insight into how many participants are enough. Since the distribution of participant responses does not seem to change much when the number of participants per condition varies from 60 (Experiment 1) to 200 (Experiment 2), we speculate that 200 participants per between-participant condition could be an upper bound on the sample size needed to mitigate idiosyncrasy effects. Second, future studies could look into the causes of individual variation in noisy-channel comprehension. We speculate that there are at least two factors: different participants may rely on different ways to complete the experiment efficiently, and, given the same sentence, different participants may assign it different plausibility.
A limitation of the study is that so far our predictions on the effects of semantic prior are inexact: participants exposed to a higher proportion of semantically implausible sentences are more likely to interpret implausible sentences literally. However, the extent to which plausibility in the experiment may influence the comprehender’s semantic prior is still unclear (Ryskin & Fang, 2021). Future work should develop a more detailed account of how comprehenders update their semantic prior by integrating information at different timescales.
Acknowledgements
We would like to thank Rachel Ryskin for her feedback on the draft. We would also like to thank the audience of the 2022 MIT MSRP-bio poster session and the 36th Annual Conference on Human Sentence Processing.
Ethical Considerations
Experiments in this work have been approved by MIT’s Committee on the Use of Humans as Experimental Subjects, Protocol 403000040 (Title: Principles of Language Processing).
Consent to Participate
We obtained written informed consent from each participant at the beginning of the experiment.
Consent for Publication
We have removed identifying information from participants.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by NSF Award 2121074 “CompCog: Noisy-channel processing in human language understanding” to Gibson.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
