Abstract
Different studies suggest that collaborative argumentation among peers promotes school learning, especially the comprehension of concepts. However, the available evidence shows that the relationship between argumentation and learning is not direct but instead mediated by development processes that, in turn, promote learning. The goal of this study is to understand the mediating role that the development of argumentative inner speech may play in the process of constructing knowledge through collaborative argumentation. A case study was conducted in which one child (fourth grade) was tracked throughout an entire unit in which he and his peers argued collaboratively class after class. We assessed the students individually before and after in their learning (oral and written) and written argumentation skills. The collaborative work from all the classes was videoed and analysed through discourse analysis. The student showed significant progress in both delayed learning and written argumentation compared to the group. Furthermore, the analysis of oral tests shows that the argumentative interactions that initially appeared in the discussions among peers were internalized so he could understand the concepts involved on an individual level. The article discusses the implications of these results in understanding the role of discursive interactions in school learning processes.
Experimental research consistently shows that dialogue among peers within the context of collaborative work, especially discussions from different perspectives, benefits their understanding of concepts (Asterhan & Schwarz, 2007; Howe, 2009; Mercer & Littleton, 2007). Different studies show that the effects of peer discussion on conceptual gains are delayed; that is, they are detected up to several weeks after the collaboration (Asterhan & Schwarz, 2007; Howe et al., 2005; Rivard & Straw, 2000; Roy & Howe, 1990; Tolmie et al., 1993). Some studies show that the effects of peer discussion last over time, whereas the effects of other forms of peer interaction tend to decline more quickly (Asterhan & Schwarz, 2007; Howe et al., 1992; Rivard & Straw, 2000; Roy & Howe, 1990); others show that there may be gains even in tests taken just after the collaboration ends (Howe, 2009; Howe et al., 2005; Tolmie et al., 1993). What is more, in primary school the gains do not seem to be related to resolving the differences within each group, the quality of the group results or the achievement of group constructions (Howe, 2009). Even students who discussed ‘wrong’ ideas without reaching a group understanding may learn, as long as they debate different perspectives.
It is fairly unlikely that this effect is due to the mere presence of contradiction, given that exposure to disagreement does not necessarily result in conceptual development (Asterhan & Schwarz, 2007). Nor is it likely that the effect is the outcome of the group activity of negotiating meanings which are later internalized, because better ideas within group dialogues are not related to better delayed performance. Therefore, the evidence suggests two things. The first is that unlike many instructional activities, peer discussion somehow promotes more lasting (and probably deeper) gains. The second is that between the group discussion and the conceptual effect, individual post-collaboration cognitive processes occur which may explain the conceptual gains (see Howe, 2009). The problem is that the type of individual processes involved is not yet known, which makes it difficult to understand the potential of peer interaction for knowledge construction and limits its pedagogical use.
The role of collaborative argumentation
While in the studies cited above the authors call the type of language that has a delayed impact by different names, they all concur in describing this type of group dialogue as one in which students formulate their opinions (and/or explanations) and support them, challenge others’ ideas and share different perspectives. Following Asterhan and Schwarz (2007), we view this type of discourse as argumentative and consider argumentation a type of discourse that emerges when one (Greco, 2016) or more speakers use additional pieces of discourse to uphold a position in a rhetorical context of controversy (see Billig, 1987). If there are no alternatives, there is nothing to argue about; argumentation emerges when there is an implicit or explicit controversy — or a difference in perspectives — and the speakers are willing to participate in discursive interactions to reach an understanding (see Eemeren & Grootendorst, 1992).
In this regard, it is worth highlighting that argumentation has traditionally been viewed as having the potential to construct knowledge. Leitão (2000) posits that argumentation is a type of language that involves three semiotic mechanisms (justification, counter-argumentation and response) that as a whole promote negotiation and a revision of perspectives, thus promoting reflection on one’s own knowledge. There is also evidence showing that when students argue with each other they develop argumentation skills (Anderson et al., 2001; Chinn et al., 2000; Kuhn & Crowell, 2011; Kuhn et al., 1997; Reznitskaya et al., 2009) and reasoning skills (Wegerif et al., 1999), inasmuch as these ways of talking are transferred from a social to an individual level, and they are then able to use that language to reason (Kuhn et al., 1997; Mercer, 2013a; Reznitskaya & Gregory, 2013).
If this is taken seriously, it is possible to believe that this internalization process may be involved in the delayed effects of collaborative argumentation or argumentation among peers.
The role of inner speech: a possible hypothesis
Vygotsky (1934/2001) believed that the use of language with others transforms the way people think because people not only communicate but also psychologically collaborate with each other through language. What transforms psychological processes is the internalization of the social uses of language, which from then on occurs via inner speech. Thus, the interior or internal is not more psychological and less social; it simply means that the activity is targeted at regulating one’s own psychological activity.
Therefore, Vygotsky (1934/2001) believed that inner speech is not silent language but a special way to use language that is targeted not at another person but at oneself. The collaborative activity that occurs through the functional use of interpersonal language (inter-psychological activity) is internalized, that is, transformed and made available for use in the absence of those with whom it was initially used and geared at new purposes.
Rogoff (1995) defines appropriation as a personal process of transformation through participation in social activities, in which these activities become part of the individual. Internalizing language entails its appropriation, but it also implies more: the individual not only personally adopts a way of speaking that begins as foreign to them but then also uses it to regulate their own activity.
If argumentation among peers is believed to have benefits for both disciplinary learning and the development of argumentation and reasoning skills (Kuhn et al., 1997), then it is plausible to posit that these two benefits are related. This relationship would enable us to explain the delayed nature of disciplinary learning: through participation in specific thematic argumentative group discussions with other peers at different language skill levels, students may appropriate and later internalize a collaborative dynamic of asking for and giving additional support for a point of view, in addition to formulating alternative and challenging visions. This way of speaking with oneself regarding the topics discussed (school contents) may enable them to reason about these contents individually after that collaboration. The very structure of argumentation (argument, counter-argument and response) and the epistemic standards it involves (Reznitskaya & Gregory, 2013) would make it possible to think about school concepts and assess comprehension hypotheses which would lead the individual to better understand them.
To assess this hypothesis, we conducted a quasi-experimental study with two classes in two educational establishments in the Metropolitan Region of Chile in the field of the natural sciences. Within this study, we conducted an analysis of a single case with the goal of further exploring our understanding of this hypothesis.
Method
Design
The case study below is framed within an exploratory study with a quasi-experimental design conducted with the goals of: (1) ascertaining the impact that the use of argumentation has on the learning of scientific concepts; and (2) assessing the role of the development of individual argumentation skills in this impact. As part of this study, we tracked one particular case for two reasons: because the hypothesis we are seeking to understand is exploratory in nature, and because we needed to observe a longitudinal process in as much of its complexity as possible. According to Stake (2000), this justifies the use of a case study, which enables us to understand change processes using different types of data and sources.
Participants
Of the 61 students in their fourth year of general primary education in two schools funded by the Metropolitan Region (Chile) who participated in the quasi-experimental study outlined above (31 in the intervention group), we chose one student from the intervention group for the case study. The selection criteria were that it be a student who shows: (a) higher delayed learning than the group average; and (b) gains in individual argumentation. The idea was to explore whether there were any signs that these gains in individual argumentation were used to understand the concepts being studied, that is, whether they had an internal use. One student (male) who fulfilled these criteria was chosen. At the time of assessment, the student was 108 months (nine years) old.
Procedure
The quasi-experimental study in which this study is framed entailed the participation of two teachers with their classes, who were invited to participate after the authorization of the school directors had been secured. The schools were government-funded and served a high percentage of highly vulnerable students. After the teachers were invited to participate, they agreed by signing an informed consent form; all the students in each class were also asked to participate, and the consent of their parents/guardians and the written assent of the students were obtained. One class participated with an intervention, while the other class served as the control group. In both cases, the students were assessed once before and twice after (once immediately after and another time four weeks after) working on the Force and Motion unit in the Understanding the Natural Sciences discipline. Written assessments of what students had learned in the unit and their general written argumentation skills were given, along with an individual oral interview to assess their understanding of the relevant concepts from the unit. Regarding the class that participated with an intervention (which the student in the case study belonged to), the teacher was asked to work on the unit with certain class plans that were specially designed to promote argumentation in the classroom (both in whole-class discussion and in cooperative group work). The unit lasted a total of six 90-minute classes which were filmed to record both the whole-class interactions (with a camera and microphone) and the interactions among peer groups (with one camera per group).
Class plans for the intervention
The class plans were adapted from the curricular material of the Forces unit developed by the epiSTEMe® project (see Howe et al., 2015) in the Faculty of Education at Cambridge University. This project sought to design and assess modules to promote improvements in the sciences and maths based on pedagogical principles through dialogic teaching and the use of exploratory speech in secondary school (ages 12–13) in both whole-class and groupwork interactions (see Howe et al., 2015). The adaptation consisted of matching the classes to the level at which Forces was being taught in Chile (fourth year of primary school) and to the national curriculum. All told, five classes were redesigned, yielding a unit with nine 45-minute classes that covered the topics of motion, forces in everyday life, forces in balance and types of forces. Each class was divided into three or four parts so it could better fit the class periods planned by the teacher. The planning of the unit was outlined in a document for the teacher which had 48 printed, ring-bound pages complemented with PowerPoint images, and with student workbooks.
Measures
Written tests
Unit learning test
We used the learning test adapted (Larrain et al., 2014a) from the knowledge tests developed by epiSTEMe (Howe et al., 2015) in a prior study. Three versions were obtained which were equivalent in terms of difficulty and item reliability. The Cronbach alpha coefficients for each version were: pre-test, α = .89; immediate post-test, α = .92; and delayed post-test, α = .89. Each version contained 24 items on the topics covered by the module: nine multiple-choice items that sought to assess students’ mastery of key concepts (force, motion and balanced force, among others), maximum score for the section (MS) = 9; 12 items in which students were asked to apply the notions from the unit to everyday situations (MS = 17 points); and three questions to assess their understanding of the phases and aspects of an experiment (MS = 6 points). The total maximum score for each test was 32.
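The reliability coefficients reported above are Cronbach's alpha values. As an illustration only, the following minimal Python sketch shows how such a coefficient can be computed from a students-by-items score table. The function name and the example scores are hypothetical and are not the study's data; the only figures taken from the text are the section maxima, which are checked against the stated total.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a students-by-items score table.

    item_scores: list of rows, one row per student, one column per item.
    """
    n_items = len(item_scores[0])
    # Variance of each item (column) across students.
    item_variances = [
        variance([row[i] for row in item_scores]) for i in range(n_items)
    ]
    # Variance of each student's total score.
    total_variance = variance([sum(row) for row in item_scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data (NOT study data): four students, three dichotomous items
# that agree perfectly, so alpha reaches its upper bound.
example_scores = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
alpha = cronbach_alpha(example_scores)

# Section maxima reported for the learning test; they add up to the total.
SECTION_MAXIMA = {"key concepts": 9, "application": 17, "experiment": 6}
total_maximum = sum(SECTION_MAXIMA.values())
```

The sketch uses the sample variance throughout; because alpha depends only on the ratio of variances, using the population variance consistently would give the same value.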
Written argumentation test
The written argumentation test developed by the research team in a prior study was used (see Larrain et al., 2014b). The test was validated and piloted in the aforementioned study in three versions (pre-test, immediate post-test, delayed post-test) which were equivalent in terms of difficulty and item reliability, including a total of 13 items. The overall reliability of each version was: pre-test, α = .732; immediate post-test, α = .651; and delayed post-test, α = .760. Three items assessed the students’ ability to state a point of view (MS = 5); five assessed their ability to justify that point of view (MS = 14 points); and five assessed their ability to critically assess their own and others’ arguments (MS = 13 points). Two trained judges independently coded 30% of each of the versions of the tests in three consecutive rounds. The Cohen’s kappa scores were acceptable for all the questions (K > .64) except one (K > .4). The total maximum score for each test was 32.
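Inter-rater agreement is reported above as Cohen's kappa. The following Python sketch, offered as an illustration under stated assumptions, shows how kappa corrects raw agreement between two judges for the agreement expected by chance. The function name and the example codes are hypothetical; they do not reproduce the judges' actual coding.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same responses."""
    n = len(rater_a)
    # Proportion of responses on which the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    # Agreement expected by chance from each rater's marginal frequencies.
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1:
        # Both raters used a single identical code; kappa is undefined,
        # so this sketch returns 1.0 by convention (an assumption).
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes (NOT study data): scores given to four responses.
judge_1 = [1, 1, 0, 0]
judge_2 = [1, 1, 0, 1]
kappa = cohen_kappa(judge_1, judge_2)
```

In this toy example the judges agree on three of four responses (.75), chance agreement is .50, and kappa is therefore .50.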
Oral interview
An individual interview was conducted to inquire into the students’ level of understanding of the relevant concepts from the unit from a qualitative standpoint. To do so, a standardized script was developed. The interviews had pre, immediate post and delayed post versions. All the versions were designed by the research team based on the individual interviews reported by Howe et al. (1992) and were piloted with children at the same educational level before creating the definitive version. They deal with specific situations in which the students were asked to predict and then explain what happened. In the pre-interview, the situation involved letting two sheets of paper of the same size fall (one folded and the other unfolded) to see which one reached the ground first; the immediate post-interview sought to see whether two paper balls which had objects of different weights inside them (a paperclip and a stone) would reach the floor at the same time; and the delayed post-interview sought to keep a helium balloon from moving using paper clips. The interviews were conducted by educational psychologists who had been specifically trained by the research team and were held during class time at school, usually in the library. They lasted approximately 15–20 minutes per student and were filmed and later analysed. The responses were qualitatively analysed via content analysis to trace the appropriation of the elements from the group discussions in them.
Collaborative work dialogue
Of all the speech during the collaborative work, we only analysed the speech associated with the task that the students had to do. We obtained a total of 2,484 utterances. We used the coding guide for argumentation in collaborative work developed by Larrain et al. (2020), identifying spoken utterances and the type of argumentative utterance. The guide’s codes included arguments, counter-arguments, justifying questions and other argumentative questions (defined in Table 1). Two trained analysts analysed 30% of the videos in seven rounds with the software The Observer XT (Noldus ©). In the last round, the percentage of agreement in all the codes was over 96%, which was considered excellent, with the exception of counter-arguments (65%), which was only acceptable. The differences were discussed and resolved. After reaching agreement, one analyst coded the remaining videos. The total frequency of argumentative utterances per student, group and class was calculated.
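The frequency count described above can be sketched as a simple tally over coded utterances. The Python example below is illustrative only: the record layout (class number, student label, code), the student labels and the sample records are assumptions, not the study's coding export. Only the four argumentative codes are taken from the text.

```python
from collections import Counter

# The four argumentative codes named in the coding guide.
ARGUMENTATIVE_CODES = {
    "argument",
    "counter-argument",
    "justifying question",
    "other argumentative question",
}


def tally_argumentative(utterances):
    """Count argumentative utterances per (student, class number)."""
    counts = Counter()
    for class_number, student, code in utterances:
        if code in ARGUMENTATIVE_CODES:
            counts[(student, class_number)] += 1
    return counts


def totals_per_student(counts):
    """Collapse per-class counts into one total per student."""
    totals = Counter()
    for (student, _class_number), n in counts.items():
        totals[student] += n
    return totals


# Hypothetical coded utterances (NOT study data).
coded_utterances = [
    (5, "student_a", "counter-argument"),
    (6, "student_a", "argument"),
    (6, "student_a", "justifying question"),
    (6, "student_b", "counter-argument"),
    (6, "student_b", "task talk"),  # not an argumentative code, so ignored
]
per_class = tally_argumentative(coded_utterances)
per_student = totals_per_student(per_class)
```

Group and class totals can be obtained in the same way by keying the counter on a group or class identifier instead of the student label.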
Table 1. Definition of argumentative utterances for the group work.
Analysis
To choose the case to be analysed in depth, we performed descriptive analyses in order to locate the students compared to the group average on each written test. After that, we reviewed all the transcriptions of the group in which the chosen student (henceforth Ismael) participated, and, beyond the argumentative utterances, we analysed the position the student tended to take regarding the utterances and the presence of argumentative schemas or idea flows in the group dialogues which could be traced in the individual interviews. The interviews were analysed to assess the presence of argumentative dynamics.
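The descriptive step used to locate students relative to the group can be sketched as follows. This Python example is a minimal illustration under two assumptions that the text does not spell out: that delayed learning is operationalized as the gain from pre-test to delayed post-test, and that a gain in argumentation means a positive pre-to-delayed difference. The field names and scores are hypothetical and are not the study's data.

```python
from statistics import mean


def candidates_for_case_study(records):
    """Return students whose delayed learning gain exceeds the class mean
    gain and whose written argumentation score also increased."""
    learning_gain = {
        name: r["learning_delayed"] - r["learning_pre"]
        for name, r in records.items()
    }
    argumentation_gain = {
        name: r["argumentation_delayed"] - r["argumentation_pre"]
        for name, r in records.items()
    }
    mean_learning_gain = mean(list(learning_gain.values()))
    return sorted(
        name
        for name in records
        if learning_gain[name] > mean_learning_gain
        and argumentation_gain[name] > 0
    )


# Hypothetical scores (NOT study data) for three students.
example_records = {
    "student_a": {"learning_pre": 10, "learning_delayed": 14,
                  "argumentation_pre": 8, "argumentation_delayed": 11},
    "student_b": {"learning_pre": 12, "learning_delayed": 13,
                  "argumentation_pre": 9, "argumentation_delayed": 9},
    "student_c": {"learning_pre": 9, "learning_delayed": 12,
                  "argumentation_pre": 10, "argumentation_delayed": 9},
}
selected = candidates_for_case_study(example_records)
```

Here only the first hypothetical student meets both criteria: the third exceeds the mean learning gain but shows no gain in argumentation.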
Results
Results of written tests
As shown in Figures 1 and 2, Ismael gained 4 points between the pre-test and the delayed post-test on learning (the class gained an average of 2.5 points) and was one of the five of the 31 students who showed the most gains in the delayed post-test on argumentation (3 points vs. 0 points on average). Regarding argumentation, Ismael improved in the number of reasons he was capable of formulating and in his ability to anticipate his interlocutor’s objections.

Figure 1. Score on argumentation test.

Figure 2. Score on learning test.
Ismael’s contribution to group argumentation
Ismael worked with five classmates during the six classes that the Force and Motion unit lasted. As shown in Figure 3, which shows the number of argumentative utterances formulated by each student per session, Ismael did not significantly contribute to the discussions in the first few sessions. However, his contributions rose significantly in the second half of the unit.

Figure 3. Number of argumentative utterances formulated by each student per class.
When analysing the type of contributions that Ismael made, we found that not only did the number of argumentative utterances he contributed increase, but so did their relevance. In fact, in the last class he was one of the students who formulated the most reasons to support his point of view. However, generally speaking he is a student who formulated virtually no counter-arguments directly. Only in class 5 did he formulate a counter-argument directly, and in class 6 he asked two questions which could be considered counter-argumentative, inasmuch as they sought to weaken the position upheld. The excerpt from the group dialogue presented in Table 2 corresponds to this latter class (see Figure 3 for a summary of his participation in the group argumentation); it was chosen because aspects appear in the group discussion that later emerge individually in the delayed post-interview:
Table 2. Excerpt from group work.
At first glance, the excerpt above does not seem to involve constructive argumentation. The students throw out many ideas, most of which are added rather than being explored and discarded. However, a closer look reveals different moves involving justification and questioning that are more complex than they first appear. Thus, despite the fact that the argumentation in this excerpt is basic, it is the type of interaction that Anderson et al. (1997) assert may correspond to the type of discussion that students have in fourth grade.
In the excerpt, we see that Ismael makes the first argumentative move in turn 11 in which Student 4 (henceforth S4) asks him to justify the opinion he put forth in turn 9 (force of friction). In turn 11, when asked for justification, Ismael revises his initial opinion and implicitly proposes a new opinion (human force) by formulating a justification (It’s human force, when we throw with our hands). This move involves an interesting reflection and revision process which occurs through the use of argumentative language. S1 and S2 challenge his stance by questioning it in turns 15 and 23 respectively, and Ismael responds by anticipating their objection: There are many forces. But S3 insists and asks for additional support for Ismael’s position, implicitly opposing his justification (There are many forces?). To that, Ismael responds without further elaboration, discarding his previous justification (There aren’t many forces) and proposing a new opinion (or perhaps reformulating the notion of human force in engine power). Whereupon S2 once again casts doubt on Ismael’s opinion in turn 27 (Engine?), and in turn 28 S4 shows that the group has not yet agreed to include Ismael’s suggestions. In turn 31, S1 once again suggests one of Ismael’s suggestions with some hesitation (Human force?), which Ismael supports in turn 35. However, in turn 36 S3 asks for explicit justification and openly shows that he is not yet convinced and may never be. Ismael responds by insisting on his argument in turn 11, to which the student from turn 2 contributes more information and concludes that Ismael may be right (in fact, turn 43 shows that the justification seemed sufficient for the idea to be accepted). Just when the interaction on the forces involved in the car’s motion seems to be ending, S2 suggests a hypothetical scenario (turns 49 and 51): he asks what would happen if, after being moved backwards to activate the friction mechanism, the car were to fly.
The situation is crucial because he is suggesting that there may be two forces of friction involved: the one that gradually slows the car as it rubs against the air, and the friction on the table that impels the car and is necessary for it to move. This latter would be absent in the flying car, even when the first kind of friction (the car’s friction with the air) would remain present. S1 suggests that the car would fly (turn 52), but S2 disagrees and suggests that nothing could move it forward. The discussion ends without group agreement. It is worth highlighting the fact that S2, who is the one who makes the most utterances challenging Ismael’s stance, is the student that contributed the most to the group argumentation in the majority of classes.
Another noteworthy aspect is the flow of ideas that are accepted by the group: the group starts by accepting the role of the forces of gravity and friction and then accepts human force, and the participation of the force of air remains open at the end (Force of gravity → Force of friction → Human force → Force of air). We shall revisit this in the next section.
Individual interviews
Regarding the individual interviews, in the first two interviews Ismael is only partially capable of formulating a coherent explanation of why a folded piece of paper falls more quickly than one that is open, or why two paper balls with the same shape but different weights take the same amount of time to fall. In both cases, he clearly attributes these effects to the weight of the paper, even though the sheets of paper have the same mass and the paper balls fall at the same time despite their different weight. However, in the delayed post-interview, an excerpt of which can be seen in Table 3, when asked why the paper clips kept a helium balloon still, he managed to identify relevant factors. This does not mean that he formulated a response that was totally correct (it contained errors), but he did identify a key factor in understanding the activity: the force of the push of air. Given space limitations, below we present an excerpt from the last part of the interview in which the force of air is identified:
Table 3. Excerpt from the delayed post-interview.
In response to the question What other forces may act on the balloon to make it rise sometimes, fall sometimes and stay suspended in the air sometimes?, Ismael responds with brief reasoning aloud which leads him to conclude that it may be force of air. In this response, it is possible to see how Ismael uses an argumentative way of speaking to organize his thinking. This response resembles a critical or dialectical argumentation in which Ismael implicitly debates and anticipates positions dialectically. He proposes: it is gravity; he opposes: it is not gravity because; he proposes: so it’s magnetic force; he opposes: it is not magnetic force because; he proposes: so it’s human force; he opposes: it isn’t human force because; he concludes: it could be the force of air. Ismael’s thinking in this response clearly anticipates possible responses and debates them. If we compare this response with the type of dialogue to which Ismael contributed in the group discussions (see excerpt), we realize that although in the group he was the one who proposed alternatives and other students were the ones who opposed them (primarily S2 and S3), now he is the one who both proposes them and opposes his own discourse, taking the positions that the students who opposed him took, and this time thinking about the forces that are acting on the balloon.
It is interesting to note that the flow of ideas in his response is reminiscent of the flow of ideas in the group discussion outlined above. While in the group discussion the flow was Force of gravity → Force of friction → Human force → Force of air, in his response the flow of ideas was Force of gravity → Magnetic force → Human force → Force of air. This shows that more than simply activating related ideas, what we can see in Ismael’s response is the ability to reproduce an interpersonal dialogue, taking both the speaker’s position and the opposition, to think about the situation in the interview.
Discussion
The analysis of the case of Ismael enables us to understand what authors like Reznitskaya and Gregory (2013), Kuhn et al. (1997) and Mercer (2013a) posit: there may be a transfer of the argumentative forms of speech with peers internally, to direct one’s own thinking. In the case of Ismael, we can see how based on the collaborative discussion, something that was not initially there (in the previous interviews) gradually emerges: a collaborative dynamic of proposing and opposing ideas, which helps him to think. Here the goal is not merely to internalize group agreements, as the agreements reached in collaborative work may not necessarily help him to think about the new situation (the helium balloon). The goal instead is to adopt a structure that enables thinking to move in a productive way to conceive this new situation. Inasmuch as a structure is what directs thinking, even when it occurs in an interpersonal context, it could be a form of inner speech: inner argumentative speech (Greco, 2016). The fact that this dynamic came from the collaborative activity among peers cannot be assured, but this way of speaking did not appear in any of the previous measures (written and oral). What is more, the similarity of the flow of ideas between the peer discussion analysed and Ismael’s response in the interview leads us to believe that they may be genetically related, suggesting that more than general argumentative speech, we may see evidence that he developed argumentative speech specific to the topic of forces. What is interesting and novel is not only noting his appropriation and internalization of argumentation based on peer interactions, which has already been documented (Anderson et al., 2001; Kuhn & Crowell, 2011; Reznitskaya et al., 2009), but understanding the role that this internalization process may play in the effect of argumentation among peers and the construction of knowledge.
This entails joining two lines of research on development processes associated with argumentation among peers which until now were separate (see Mercer, 2013b): the development of argumentative skills and the construction of disciplinary knowledge.
What is more, the study suggests that more than the mere presence of contradiction, what is important in constructing knowledge is the use of argumentative language among peers because, in line with Vygotsky’s ideas (Vygotsky, 1934/2001), the internalization of this language is what allows individuals to continue the construction of knowledge and post-collaboration conceptual elaboration. Therefore, the goal is not simply to internalize a dialogic structure of assessing perspectives but also to internalize the standards of reasoning inherent to argumentative discourse (see Reznitskaya & Gregory, 2013). It is possible to hypothesize that the genetic process of development involved here is as follows: Collaborative argumentation among peers → Appropriation of argumentative forms of speech on the topic of discussion + elaboration of meanings → Development of general argumentative skills → Development of argumentative inner speech → Further elaboration of meanings.
Even though Howe (Howe, 2009; Howe & Zachariou, 2019) also stresses the importance of appropriating the peer dialogue for individual cognitive development that leads to delayed learning, she does not claim that these cognitive processes are linguistic per se, which is what we are claiming in this article. What is more, even though Reznitskaya and Gregory (2013) and Mercer (2013a) also posit the relationship between the internalization of interpersonal dialogue and the construction of knowledge, they do not specify the strictly discursive nature of individual knowledge-construction processes. In this sense, this study is putting forth a complex, controversial proposal. It is complex because it requires additional evidence which entails major methodological complexities, particularly associated with measuring inner argumentative speech and its mediating relationship between peer collaboration and disciplinary learning. And it is controversial because even though the sociocultural nature of learning is broadly accepted, there is little consensus regarding the sociocultural and discursive nature of the psychological processes involved. In consequence, it is essential to further explore this hypothesis with empirical studies in order to assess the sociocultural nature of school learning as a psychological process.
