Abstract
Background
The benefit of collaborative testing to learning has been examined via two-stage exams (individual then group) for high-stakes tests. However, group testing might be particularly beneficial to students when implemented during the earlier stages of learning (i.e.,
Objective
In a large Introductory Psychology course, we investigated whether low-stakes collaborative practice testing enhanced learning compared to individual practice testing.
Methods
Students completed collaborative and individual practice tests followed by two delayed retention tests. Across sections of the course, some students also engaged in group-building exercises prior to practice testing.
Results
Collaborative practice testing improved performance on surprise individual retention tests administered approximately one and two weeks later (
Conclusion
The present research suggests that collaborative practice testing can enhance long-term retention of course material.
Teaching Implications
This work provides a potential model for implementing collaborative practice testing in large undergraduate psychology classes and suggests that group-building exercises may not be necessary to produce durable learning from collaborative practice testing.
Retrieving information from memory—as students do when taking tests—has been shown to enhance learning relative to more passive study strategies like highlighting or rereading (i.e., the testing effect; Bjork, 1975; Carpenter et al., 2022; Roediger & Karpicke, 2006). Unsurprisingly, students tend to score higher on tests they take in groups (Garaschuk, 2022; Lusk & Conklin, 2003; Woody et al., 2008). More noteworthy, however, is that collaborative testing can bolster individual group members’ recall on a future assessment (Cortright et al., 2003; Gilley & Clarkston, 2014; Vázquez-García, 2018).
Much of the classroom research on collaborative testing has focused on collaboration during high-stakes or “two-stage exams,” in which students take an exam individually and then retake all or part of the exam in small groups. Students also report engaging in collaborative testing before exams during study sessions with their peers (Wissman & Rawson, 2016). However, it is unclear whether this type of low-stakes collaborative practice testing is more beneficial than individual practice testing when it is used in preparation for a later criterion exam. The present study addresses this issue in the context of a large Introductory Psychology course. Our primary aim was to assess whether collaborative practice testing could lead to better retention of course content compared to individual testing. Additionally, we examined whether facilitating constructive interactions among group members would enhance these potential benefits.
Evidence From Two-Stage Exams
One of the most common ways collaborative testing has been implemented in classrooms is through two-stage exams (e.g., Gilley & Clarkston, 2014). Some studies have found evidence that these two-stage collaborative exams can enhance individual performance on a delayed retention test (Cortright et al., 2003; Gilley & Clarkston, 2014), whereas other studies have not shown any long-term benefits (Cooke et al., 2019; Woody et al., 2008). Although the retention period in these studies varied from two days to four weeks, there was no clear association between the length of the delay and the benefit of the collaborative exam. However, even studies that do not demonstrate a positive effect of collaborative testing have found that it is as good as an individual re-test (i.e., not detrimental to learning).
Collaboration Benefits and Barriers
Testing collaboratively, whether on a two-stage exam or a practice test, could enhance learning for various reasons. For instance, there are opportunities for re-exposure when one group member retrieves a piece of information that no other group member would have retrieved on their own, and cross-cuing when information retrieved by one group member prompts another group member to retrieve something they otherwise would not have retrieved (Blumen & Rajaram, 2008; Nokes-Malach et al., 2019). Under ideal circumstances, these additional encoding opportunities can act as a form of restudy or scaffolded retrieval practice for the group, thereby fostering better long-term learning.
Working in groups can also facilitate error correction. When, for example, one group member produces an incorrect answer, others in the group—if they hold accurate prior knowledge—can correct that group member's mistake (Barber et al., 2010). Through this process, individuals can benefit from identifying and refuting their errors, and their fellow group members can reinforce their understanding of the material.
Working with others might foster additional unique processes that enrich the learning experience compared to working alone, such as the opportunity to explain answers to the group during constructive disagreement (see also the benefits of self-explanation; e.g., Chi et al., 1989). Such discussions can encourage students to reevaluate their knowledge, seek new information to reach a consensus, and refine their conclusions (Johnson & Johnson, 2009; Johnson et al., 1998).
If, however, students are not motivated to work together, or if they dislike group work, they might not engage in effective collaboration. Students, for example, often remark that group work can be plagued by dysfunctional group dynamics, such as ineffective communication, “free-riding,” or dominating group members (Gillespie et al., 2006; Hillyard et al., 2010; Woody et al., 2008). These experiences may lead students to develop negative views toward collaborative activities (Gillespie et al., 2006). Promisingly, however, positive group work experiences can improve attitudes towards group work among students who previously held negative attitudes towards in-class collaboration (Wosnitza & Volet, 2014) and increase excitement for future group work (Linnenbrink-Garcia et al., 2011; Reinig et al., 2011).
Facilitating Effective Collaboration
Johnson and colleagues (1998) proposed five key features of effective collaborative learning—features that, when promoted in students, could plausibly make collaborative testing more effective. According to their framework, effective collaboration occurs when group members rely on each other to produce the desired outcome (
In sum, previous work has demonstrated that collaborative testing—mainly in the context of two-stage exams—can facilitate retrieval during the collaborative stage of the exam. Evidence for how such benefits extend to longer-term individual learning is mixed and could depend on how well groups work together.
The Present Study
The present study examined the benefits of collaborative practice testing for retention of Introductory Psychology content. Due to the COVID-19 pandemic, students in two sections of the course completed a 1-hour practice testing activity during a synchronous online Zoom session. Whether practice tests were conducted in groups or alone was manipulated within-subjects such that all students completed two practice tests collaboratively and then two practice tests individually. The efficacy of collaborative versus individual practice testing on learning was then assessed via performance on individual retention tests that occurred approximately one week (Test 1) and two weeks (Test 2) after the initial activity. We hypothesized that students would retain information practiced collaboratively better than information practiced individually.
We also assigned one course section to complete group-building activities and another to complete neutral activities before the practice testing activity. Group-building activities might bolster cohesion between group members, in turn increasing the quality of group discussion. Thus, we explored whether students who completed group-building activities showed greater benefits of collaborative practice testing and improved attitudes towards group work.
Method
Participants
The participants were students enrolled in two large sections of an online Introductory Psychology course in the Fall of 2020. The final sample consisted of 569 students who completed the testing activity on Zoom (
The sample size in the current study was determined by course enrollment. Still, we conducted a power analysis using G*Power 3.1.9.6 (Faul et al., 2007) to evaluate the minimum sample size required to achieve 80% power for a two-tailed test at
Design
Course instruction consisted of asynchronous instructional modules and synchronous online laboratory sessions. During the lab sessions, students typically worked in small groups of four or five people to complete an activity on Zoom (https://zoom.us/). As shown in Figure 1, one section was assigned to complete the group-building activities, and the other was assigned to complete the neutral activities before the practice testing activity.

Figure 1. Experimental procedure.
During the practice testing activity, each small group of students completed one of six versions of the activity to counterbalance which topics were practice-tested individually and which topics were practice-tested collaboratively. Each version included two collaborative practice tests and two individual practice tests. Later, learning was assessed with two surprise individual retention tests.
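One way to arrive at six versions is to enumerate every choice of two collaborative topics out of the four practice-test topics, leaving the other two for individual testing. The sketch below illustrates that counting argument only. It assumes the versions correspond to these six assignments, which the text does not state explicitly, and the function name and data layout are hypothetical.

```python
from itertools import combinations

# The four practice-test topics named in the Materials section.
TOPICS = [
    "Research Methods",
    "Biological Psychology",
    "Sensation and Perception",
    "Learning",
]


def counterbalanced_versions(topics, n_collaborative=2):
    """Enumerate every split of topics into collaborative vs. individual.

    Assumption: a "version" is defined only by which topics are tested
    collaboratively; topic order within each format is ignored.
    """
    versions = []
    for collab in combinations(topics, n_collaborative):
        indiv = tuple(t for t in topics if t not in collab)
        versions.append({"collaborative": collab, "individual": indiv})
    return versions


versions = counterbalanced_versions(TOPICS)
for number, version in enumerate(versions, start=1):
    print(number, version)
```

Under this scheme, each topic is tested collaboratively in exactly half of the versions, so topic difficulty is balanced across the two practice formats.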
Materials
Study materials were developed to align with course content in Introductory Psychology. Samples of study materials are available online (Imundo et al., 2024).
Practice Tests
Four 10-item multiple-choice practice tests on core topics in Introductory Psychology (i.e., Research Methods, Biological Psychology, Sensation and Perception, and Learning; see APA, 2014) were used in the study. Each question contained the correct answer and four competitive lures.
Retention Tests
There were two retention tests (Tests 1 and 2). Each test contained a unique set of eight questions (two from each topic) taken directly from the practice testing activity.
Precourse and Postcourse Survey Items
The 17-item
Procedure
At the beginning of the course, students completed a precourse survey that included the
The practice testing activity occurred during the fifth lab of the course online via Zoom. To further foster group cohesion, students in the group-building activities section were instructed to have one group member share their screen and type for the group during the collaborative practice tests. In contrast, students in the section that completed neutral activities were instructed to discuss the questions together but have each member in their small group submit their own responses.
During the practice testing activity, the first two practice tests were taken as a group, and the second two practice tests were taken alone. Students were asked to take these tests without using external aids. After each practice test, students received corrective feedback in the form of a summary page presenting the items, the selected response, and the correct response. When practice testing was done collaboratively, students reviewed the feedback as a group, and when practice testing was done individually, students reviewed the feedback on their own.
About one week after the practice testing activity (
Results
Practice Test Performance
Figure 2 illustrates performance on the practice and retention tests. To assess whether performance on the practice tests was affected by the practice test format (individual or collaborative) or section activities (neutral or group-building), we conducted a 2 × 2 mixed ANOVA with practice test format as the within-subjects variable, section activity as the between-subjects variable, and practice test score as the dependent variable.
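In a 2 × 2 mixed design such as this one, the format × section interaction can be checked with per-student difference scores (collaborative minus individual): a pooled-variance independent-samples t-test on those differences gives a t whose square equals the interaction F, with N − 2 degrees of freedom. The following is a minimal sketch of that equivalence, not the authors' analysis code. The scores are invented for illustration, and the function and variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def interaction_t(diffs_a, diffs_b):
    """Pooled-variance t-test comparing difference scores between two sections.

    Returns (t, df). Squaring t gives the format x section interaction F
    from the corresponding 2 x 2 mixed ANOVA.
    """
    n_a, n_b = len(diffs_a), len(diffs_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(diffs_a) + (n_b - 1) * variance(diffs_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(diffs_a) - mean(diffs_b)) / standard_error, df


# Made-up (collaborative, individual) proportions correct, one pair per student.
neutral_scores = [(0.80, 0.60), (0.75, 0.65), (0.90, 0.70), (0.70, 0.60)]
building_scores = [(0.85, 0.65), (0.80, 0.60), (0.70, 0.65), (0.95, 0.75)]

neutral_diffs = [collab - indiv for collab, indiv in neutral_scores]
building_diffs = [collab - indiv for collab, indiv in building_scores]

t_value, df = interaction_t(neutral_diffs, building_diffs)
print(f"interaction: t({df}) = {t_value:.3f}, F(1, {df}) = {t_value ** 2:.3f}")
```

A nonsignificant interaction in this sense means the collaborative advantage was of similar size in both sections.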

Figure 2. The proportion of test items answered correctly on the practice tests, Retention Test 1, and Retention Test 2. Panels a, b, and c illustrate the proportions of questions answered correctly on the practice tests, Retention Test 1, and Retention Test 2, respectively, as a function of section activity (neutral vs. group-building) and collaborative versus individual practice test format. The significant main effect of collaborative versus individual practice testing in each panel demonstrates that practice testing in groups leads to greater practice performance (Panel a) and better individual retention of that content at a 1-week and 2-week delay (Panels b and c) than practice testing alone (
As expected, and shown in Panel a of Figure 2, scores were significantly higher on practice tests completed collaboratively (
Retention Test Performance
Of the 569 students who participated in the practice testing activity, 498 completed the first retention test (Test 1). Of those students who completed Test 1, 462 completed the second retention test (Test 2).
A 2 × 2 mixed ANOVA with practice test format and section activity as variables revealed that students performed better on the ∼1-week retention test for topics that they had previously practice tested collaboratively (
For the ∼2-week retention test, students again scored better on items previously practiced collaboratively (
Attitudes Towards Group Work
We hypothesized that participating in group-building activities might be associated with improved attitudes toward group work. To test this possibility, we conducted a multivariate analysis of variance (MANOVA) with section activity (group-building versus neutral) as the between-subjects variable and change scores (post–pre) on the positive attitudes towards group work, discomfort with group work, and preference in group work subscales from the
Notably, the effect of section (group-building vs. neutral) was nonsignificant for all subscales (all
Discussion
In the present classroom-based study, collaborative practice testing during a synchronous online learning session yielded better performance than individual practice testing. Even though corrective feedback was provided after each practice test, this benefit of collaborative practice testing extended to students’ performance on surprise individual retention tests administered one and two weeks later. That is, students performed better on items appearing in these delayed tests if they had previously practiced them collaboratively than if they had previously practiced them individually. Taken together, these findings suggest that a well-structured collaborative testing activity can enhance long-term retention of material.
Reconciling With Prior Work
We found that collaborative practice testing enhanced recall on delayed retention tests, whereas prior studies on collaborative testing have produced mixed results (Cooke et al., 2019; Cortright et al., 2003; Woody et al., 2008). There are multiple possible reasons why the practice testing activity used in the present study allowed us to observe the long-term effects of collaborative testing. For example, students completed the practice testing activity with their already-established lab groups; thus, familiarity with each other may have enhanced the quality of information exchange and collaborative processing.
Additionally, our collaborative testing activity occurred relatively early in the learning process as part of a low-stakes formative assessment. In contrast, much of the classroom-based literature on collaborative testing involves two-stage and typically high-stakes exams (e.g., LoGuidice et al., 2015). Experiencing collaborative testing long before a formal assessment may have created an environment where students feel more comfortable exchanging ideas. Furthermore, given the early occurrence of the practice testing activity, students were likely to have gaps in their knowledge or misconceptions that were resolved by exchanging ideas with peers during collaborative practice testing. For that reason, students might also have been motivated to spend more time reflecting on the corrective feedback provided after the practice tests when they tested collaboratively than when they tested individually, which may have further enhanced the effect of collaborative practice testing. Ultimately, given the design of the present study, it is not possible to know the role that learning from feedback played in the observed effects, but it is educationally appropriate to provide feedback following formative assessments (Pan & Rickard, 2018).
In interpreting the present pattern of results, it is important to consider that the study occurred during emergency online instruction due to the COVID-19 pandemic. In contrast to prior work on collaborative testing implemented in person, the students in this study engaged in the practice testing activity during a synchronous online class session. Although this difference adds to the novelty of the present study, future work should examine whether the results obtained here generalize to classes designed to be held online or to in-person courses. Another limitation is that the collaborative practice tests were always taken before the individual practice tests, so that no group members were left waiting on a Zoom call for others to finish their individual practice tests. This procedure, however, aligns with the order in which group and individual activities are often presented to students due to classroom logistics.
The Role of Group-Building Activities
Although we hypothesized that including group-building activities might enhance the potential benefits of collaborative practice testing, we did not find any effect of group-building (versus neutral) activities in the present study. It is possible that the general structure of multiple-choice questions (i.e., five answer options with the mandate to select just one) may have encouraged effective on-task discussion for students in both sections, or perhaps the impact of the group-building activities would have been stronger if all the group-building exercises had occurred immediately prior to the practice testing activity. We explored this latter possibility in a follow-up study but still did not observe any association between completing group-building activities and collaborative practice testing efficacy (available on OSF at Imundo et al., 2024).
In terms of attitudes towards group work, the pattern of findings suggests that students may view group work more favorably and feel less discomfort working in groups after participating in a course where the group work is structured, as was the case in both the group-building and neutral sections. Preference in how to engage in group work did not change, but this finding may have occurred because the structured group work in the course did not facilitate the exploration of different group work styles.
Educational Implications
Taken together, the present pattern of results suggests that collaborative practice testing can be an effective strategy for enhancing students’ learning and long-term retention of course content. Although the present research does not allow us to offer fine-grained disambiguation of what elements of collaborative learning (e.g., cross-cuing) are driving student performance, our work suggests that implementing a relatively short (∼45-min) in-class collaborative practice testing activity can produce sustained memory gains. Given our findings, group practice testing might be a valuable strategy for instructors to implement in their own courses. There are existing resources that offer high-quality practice test questions that could be used during collaborative practice testing in Introductory Psychology (e.g., www.testyourself.psych.ucla.edu; Paquette-Smith et al., 2023). Instructors could also explore having students self-generate practice test questions (e.g., Berry & Chew, 2008). Overall, the present work suggests that collaborative practice testing can be an effective method for increasing student learning, and thus we encourage instructors to consider how they might incorporate such collaborative activities into their classroom instruction.
Acknowledgments
An earlier version of this work is included in MNI's doctoral dissertation.
Author Contributions
MNI, CMC, and MP designed the study, developed the materials, and collected the study data. MNI analyzed and interpreted the data with input from CMC and MP. MNI drafted the manuscript and CMC, MP, and ELB provided substantial manuscript edits. All authors approved the manuscript for submission.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by Instructional Improvement Grants (IIP #19-02; IIP #20-09) from the UCLA Center for the Advancement of Teaching (CAT), as well as grants from the Society for the Teaching of Psychology Scholarship of Teaching and Learning Research Grant program and the Association for Psychological Science Teaching Fund Microgrants Program.
