Abstract
Instructional coaching is increasingly regarded as an essential feature of professional development, but no research exists on content-specific instructional coaching for history teachers. The study examines data from a coaching program in which history teacher leaders served as novice coaches for their colleagues. We found that coached teachers, as a whole, improved in discussion facilitation. Case analysis of two successful and two less successful coach–teacher pairings revealed that successful coaches were more likely to focus on eliciting students’ argumentative discourse whereas less successful coach planning sessions focused on discrete historical content knowledge and the disciplinary heuristics of historical thinking. It appears that coaching that emphasized a conceptual redirection toward inquiry and a pedagogical toolkit for fostering student discourse was more closely tied to growth in discussion facilitation and opportunities for student historical reasoning than a coaching approach more narrowly focused on historical concepts.
Student-centered discourse remains rare in social studies classrooms despite a powerful body of evidence demonstrating its contribution to a variety of essential learning outcomes across subject areas (e.g., Reisman, 2015; Hess & McAvoy, 2014; Kohlmeier & Saye, 2019; Parker, 2010; Saye & Social Studies Inquiry Research Collaborative [SSIRC], 2013). The persistent struggle to foster student discourse is a core problem for social studies teacher education in both preservice (e.g., Reisman et al., 2019) and in-service education (e.g., Saye et al., 2009) as teachers default to controlled discourse patterns that constrain opportunities for student interaction and argumentation. This study explores the relationship between in-service instructional coaching, which has received little attention in the social studies (Crocco & Livingston, 2017), and teacher growth in discussion facilitation in history classrooms. We share results from a year-long coaching program in which teachers were coached in the practices of document-based history, a form of inquiry instruction. The purpose of this study is to explore whether certain coaching moves are associated with an increase in teacher facilitation of student-centered discourse.
Coaching is increasingly viewed as a critical feature of effective professional development and has been tied to changes in teacher practice and gains in student learning (e.g., Kraft et al., 2018). Logistical and financial barriers, however, discourage widespread, sustainable in-person coaching because such models are resource- and time-intensive (Carter et al., 2017; Morgan & Bates, 2018). School-embedded coaches are often tasked with multiple roles and have limited time for focused, instructional feedback (Bean et al., 2010). These hurdles are especially pronounced for history teachers, whose subject matter is often deprioritized in favor of tested subjects such as language arts and mathematics (Fitchett et al., 2014). Rarely, if ever, are history teachers the recipients of subject-specific instructional coaching, despite near consensus that such support is essential for instructional growth.
The study examines data from the second year of a longer design-based study in which history teacher leaders served as novice coaches for their colleagues. Following the tenets of design-based implementation research, the broader project emphasized collaborative, iterative design geared toward developing theory on teaching and learning, building district capacity, and seeding continued collaboration (e.g., Penuel et al., 2011). The following questions framed our inquiry in this study: (a) Do teachers who receive coaching in the practices of document-based history instruction improve in their capacity to facilitate argumentative classroom discourse? (b) What differences exist between the coaching received by teachers who grew the most in discussion facilitation and the coaching received by teachers who grew the least?
Discourse and Inquiry Instruction in History Classrooms
We view the role and value of student-centered discourse in classrooms through the lens of social constructivism, a theory that posits individual learning to be socially and culturally situated and mediated (Cobb & Yackel, 1996; Palincsar, 1998; Vygotsky, 1987). The turn toward social constructivism, informed by cognitive and sociocultural perspectives on teaching and learning, inspired a rich body of classroom-based research across subject areas (see Palincsar, 1998). Synthesizing the literature on social constructivism in social studies, van Hover and Hicks (2017) identified six core principles that foster such learning, including (a) authentic learning tasks; (b) opportunities to deepen conceptual understandings; (c) opportunities to extend prior knowledge; (d) scaffolds to support complex learning; (e) social mediation (e.g., conversation among learners) to articulate ideas; and (f) metacognition to develop self-regulated learning. Both theoretically and empirically, classroom discourse emerges as the salient mediational tool to support cognitive development, and more specifically, knowledge construction, in studies that assume a social constructivist perspective (Palincsar, 1998).
Instructional designs that support such student-centered learning and knowledge construction are often referred to as “inquiry,” highlighting the critical role of questions and the indeterminate nature of knowledge in the learning process. The College, Career, & Civic Life (C3) Framework for Social Studies Standards (National Council for the Social Studies, 2013) features an inquiry arc comprising four dimensions: (a) developing questions and planning inquiries; (b) applying disciplinary concepts and tools; (c) evaluating sources and using evidence; and (d) communicating conclusions and taking informed action (p. 17). Whether designed as extended units of student-driven inquiry, or stand-alone document-based lesson plans with predetermined central questions (see Reading Like a Historian curriculum in Method), inquiry-based history instruction invites students to simulate the knowledge construction practices of historians as they strive to make sense of the past, grappling with an evidentiary record that, by necessity, remains partial and incomplete (Lévesque & Clark, 2018; Wineburg, 2001).
To navigate, interrogate, and interpret the evidentiary record, historians deploy disciplinary heuristics: they source or evaluate the reliability of historical evidence by identifying the author’s perspective, motive, and biases; they corroborate, or compare perspectives, arguments, and evidence across multiple sources; and they contextualize, or locate texts and artifacts within a particular time and place (see Reisman, 2012; Wineburg, 1991, 2001). In short, disciplinary historical inquiry centers students as the primary adjudicators of historical questions and inverts typical classroom instruction, where historical knowledge is usually presented in a fixed, authoritative narrative, and where textbooks rarely reveal the evidentiary basis for their claims (Cuban, 2016). Here again, classroom discourse operates as the key mediational tool, as students offer tentative interpretations of the available evidence, surface contradictions, and collaborate discursively to formulate plausible historical arguments that account for the available evidence. Although other national traditions, such as those in Canada or Germany, have constructed broader definitions of historical thinking that include a greater connection with the present, engagement with epistemic uncertainty is a characteristic of nearly all conceptions of disciplinary inquiry for history (Körber, 2011; Lévesque & Clark, 2018).
Prior research on historical thinking in the United States has found that students rarely engage in historical thinking spontaneously (e.g., Monte-Sano, 2010), though pedagogical supports can bolster their capacity to do so (e.g., Jay, 2021; Nokes et al., 2007). Still, existing research underscores the challenge of enacting inquiry instruction in history (Jay, 2021; Reisman, 2015; Reisman et al., 2019; Reisman & Jay, 2022). Teachers must have deep familiarity with the subject matter, facility with ways of organizing classroom discourse, and they must be responsive to the knowledge students bring to the endeavor (Cohen, 2011; Shulman, 1987). Well-designed inquiry lessons that frame meaningful problems and questions, provide accessible resources, and anticipate student answers can offload some, but not all, of the demands of inquiry instruction (Charalambous et al., 2012; Reisman & Fogo, 2016). It is equally apparent that such curricular materials can only go so far in supporting teacher enactment (Reisman, 2015; Reisman et al., 2020; Reisman & Fogo, 2016; Reisman & Jay, 2022). Successful implementation of disciplinary inquiry lessons in history requires that teachers hold two key conceptual understandings: first, that the fundamental purpose of the lesson is to engage students in collaborative knowledge construction around the focal question; and second, that student discourse mediates and drives learning throughout the lesson.
Research on Coaching Feedback
Teachers interested in transforming their classrooms to support historical inquiry often need coaching, given the conceptual and pedagogical demands of this instruction. The extant literature on instructional coaching typically refers to two approaches—responsive and directive—that align with different visions for the purpose of coaching. Most content-based models of instructional coaching embrace a directive approach; coaches are considered instructional experts and the goal of coaching is to reform teacher practice to align with a particular instructional approach (e.g., Gibbons & Cobb, 2016; Matsumura et al., 2010). Directive models are guided by the assumption that teachers will change their practice if they observe changes in student outcomes or achievement (Neufeld & Roper, 2003). Directive approaches stand in contrast with responsive approaches that focus on developing teachers’ self-efficacy and capacity for reflection (see Costa & Garmston, 1994).
Recent research indicates that effective coaches toggle between the two approaches (Borman & Feger, 2006; Gibbons & Cobb, 2017), prompting teachers to reflect on their assumptions or interpretive frames and, in doing so, supporting them in considering alternative instructional solutions to recurring problems. Across responsive and directive coaching approaches, a general portrait has emerged of effective feedback, namely that it be “timely, sufficient, concrete, specific and limited to a small number of performance problems” (Aikens & Akers, 2011; Veenman & Denessen, 2001, p. 410). Furthermore, the literature on content-based coaching suggests that feedback be deeply rooted in content and in relevant curricular structures (Gibbons & Cobb, 2016). Ultimately, for coaching feedback to be effective, it must be actionable to the extent that teachers perceive next steps they can take to improve instruction. In certain interventions, the focus on “next steps” is structured as part of a routine (e.g., Kraft & Hill, 2020). But the notion that feedback must be actionable appears widely across the broader span of literature (e.g., Hattie & Timperley, 2007).
To date, there exists no literature on content-specific instructional coaching for history teachers. In the absence of existing models, the current program sought to incorporate the tools and logic of practice-based teacher education (PBTE) into history teacher coaching, a move that has strong theoretical grounding but has not yet generated an extensive research literature (exceptions include Cohen et al., 2020; Reisman & Beckwith, 2023). As an approach to teacher education and professional development, PBTE argues that learning experiences be designed and organized around bounded, discrete instructional practices that can be analyzed, decomposed, and rehearsed (Ball & Forzani, 2009; Grossman et al., 2009). A coaching program designed around the principles of PBTE would focus teachers’ attention and coaches’ feedback on the enactment of specific instructional practices. Our hypothesis was that by decomposing and rehearsing the core instructional practices that comprise a document-based history lesson, coaches could help teachers see how the lesson’s inquiry should be sustained from the initial launch of the lesson through its culminating discussion.
Method
Project Context
The current project grew out of an effort to design a sustainable model of coaching that leverages the instructional expertise of teacher leaders and overcomes the logistical hurdles that limit the scalability of face-to-face coaching models. Data from the project come from the second year of a design-based implementation study in a well-resourced district in the mid-Atlantic region of the United States that had a long-standing commitment to document-based history instruction. The eight novice coaches were full-time classroom teachers selected for their proficiency in document-based history instruction, and their experience and familiarity with the Reading Like a Historian curriculum. We did not collect data on coaches’ incoming beliefs about inquiry instruction or student-centered discourse.
The Reading Like a Historian curriculum features document-based lessons organized around a central historical question (CHQ) that invite students into the processes of historical knowledge construction. The curriculum regularly appears on lists of best curricular materials for secondary students (e.g., https://www.commonsense.org/education/lists/best-us-history-websites-for-students). Each Reading Like a Historian lesson is designed as a sequence of core instructional activities: (a) establish background knowledge, (b) engage students in disciplinary reading of multiple historical texts, and (c) facilitate whole-class discussion around a CHQ. By establishing background knowledge, the teacher assists students in constructing a general schema about a particular historical event. Done effectively, the teacher elicits, attends to, and builds upon students’ incoming knowledge about the historical event or period. The purpose of engaging students with historical documents is to then offer conflicting interpretations that are sequenced to prompt students to change their minds and revise their interpretations. The third and final activity, the whole-class discussion, allows students to reconcile their newly acquired information about the past with their initial schemas. As such, the curriculum served as a fitting context to explore the relationship between instructional coaching and teacher facilitation of student discourse.
Influenced by the literature on PBTE, the first author invited the teacher leaders (hereafter called coaches) to design the existing coaching model around the practices that comprise the document-based lesson. Together, the coaches and first author worked to specify and design instructional tools around three core practices: (a) establishing background knowledge, (b) supporting historical reading, and (c) facilitating historical discussion. Each coach was paired with one to two teachers to pilot a coaching model in which teachers co-planned with coaches and then videotaped themselves enacting each practice in their classroom. Coaches gave teachers asynchronous feedback via an online video analysis platform (Reisman & Beckwith, 2023).
Coaches participated in over 30 hours of coach development sessions with the first author over the first 2 years of the larger study, but sessions were primarily dedicated to building out the coaching model, specifying the three instructional practices, and reflecting upon and revising the program. The final session in Year 1 and several sessions in Year 2 were dedicated to coaches sharing and debriefing practice using video representations (e.g., what would you have said to this teacher?). Coaches were encouraged to elicit teachers’ interpretive frames (e.g., Rudolph et al., 2006) and then offer concrete suggestions related to supporting student inquiry, but no single aspect of inquiry (e.g., questioning) was emphasized (see Reisman & Beckwith, 2023 for a discussion of the online feedback tools coaches developed).
Data Collection
Prior to the start of the coaching cycle, each teacher filmed themselves facilitating a historical discussion in their classroom. Teachers had access to the Reading Like a Historian curriculum and were encouraged, but not obligated, to use it in their baseline videos. After those videos were collected, the coaching cycles began. Coaches and teachers had three planning meetings over the course of the school year, dedicating one meeting to each of the three core practices: Establishing Background Knowledge (EBK), Supporting Historical Reading (SHR), and Facilitating Historical Discussion (FHD). Each session featured coaches and teachers collaboratively talking through lessons planned by the teachers, prior to their implementation. Each meeting was videotaped, and each teacher also filmed themselves twice: once implementing the co-planned lessons with their students, and once implementing the focal practice in a different lesson. In total, 13 middle and high school teachers submitted 92 videos of their instruction, accompanied by 37 videos of planning sessions with coaches. We interviewed coaches after their planning meetings, but we did not interview teachers.
Case Selection
We selected four teacher-coach pairs for closer analysis based on the teachers’ growth in discussion facilitation. The teachers coached by coaches Sarah and Steve grew the most; the teachers coached by Liz and Logan grew the least. Table 1 presents coaches’ and teachers’ relevant demographic information, including the demographics of the schools in which teachers taught. The classes that all four teachers taught were general education classes, meaning that they were not designated for serving a specific population such as honors students, students with specialized education plans, or students who were learning English, and can be understood to reflect, approximately, the demographic makeup of the school. All four coaches could be seen as roughly mid-career, except for Sarah who was a bit earlier in her career, and all four coaches were White. Still, we see several differences between the two pairs worth noting. Sarah and Steve coached teachers who were considerably less experienced than the teachers coached by Liz and Logan. Sarah and Steve’s teachers also taught high school and held graduate degrees in education and undergraduate history majors. The teachers coached by Liz and Logan taught middle school and were both experienced teachers; indeed, both had more teaching experience than their respective coaches. Although our analysis focuses on differences in the substance of the teacher-coach planning sessions, these additional factors likely informed the experiences of the focal teachers. We address these potential limitations in our discussion.
Table 1. Coach, Teacher, and School Demographics for Focal Cases.
Data Analysis
The dual research questions of this study ask how teachers’ facilitation of classroom discussion changed over their year of coaching and how those changes related to the coaching they received.
Video Coding
Our initial step was to code pre/post videos of classroom discussions. We coded all the baseline videos submitted prior to the first coaching cycle and all the final videos recorded after the facilitating historical discussion coaching session using the Science Discourse Instrument (SDI) (Fishman et al., 2017; Osborne et al., 2019). The SDI is a validated observation tool for evaluating the argumentative structure of disciplinary classroom discussions in science. To prepare for scoring with SDI, raters identify a 15-minute segment of a videotaped lesson that contains at least 5 minutes of whole-class discourse. If more than one segment of the lesson meets this criterion, the scorer identifies the segment with the most student talk about the content. SDI holistically scores six discursive moves on a 4-point scale, with 3 and 4 indicating proficiency, or “moves that demonstrate mastery of the target practices,” and 1 and 2 indicating emergent practice, or “moves that exhibit some characteristics of the target practices” described in the instrument (SDI, Stanford University, v2.0.4). The six discursive moves are evenly divided between teacher and student contributions. The dimensions ask, press, and link evaluate the teachers’ proficiency in initiating open-ended discourse, prompting students to substantiate claims with evidence, and connecting students’ ideas, respectively. The dimensions explain, co-construct, and critique score students’ proficiency in articulating claims with evidence, building on one another’s ideas, and productively disagreeing with their peers, respectively.
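The scoring structure described above can be illustrated with a short sketch. It assumes only what the text states: six moves, each rated 1 to 4, with ratings of 3 or higher counted as proficient. The function name, the dictionary layout, and the example ratings are hypothetical and are not taken from the instrument or from the study's data.

```python
# Minimal sketch of how one video's SDI ratings could be summarized.
# Assumption: each of the six moves receives an integer rating from 1 to 4,
# as described in the text; the example ratings below are invented.

TEACHER_MOVES = ("ask", "press", "link")
STUDENT_MOVES = ("explain", "co_construct", "critique")
ALL_MOVES = TEACHER_MOVES + STUDENT_MOVES
PROFICIENCY_THRESHOLD = 3  # ratings of 3 and 4 indicate proficiency


def summarize_sdi(ratings):
    """Return totals and the list of proficient moves for one scored video."""
    missing = [move for move in ALL_MOVES if move not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for move in ALL_MOVES:
        if not 1 <= ratings[move] <= 4:
            raise ValueError(f"{move} rating must be between 1 and 4")
    return {
        "total": sum(ratings[m] for m in ALL_MOVES),
        "max_total": 4 * len(ALL_MOVES),  # 6 moves x 4 points = 24
        "teacher_total": sum(ratings[m] for m in TEACHER_MOVES),
        "student_total": sum(ratings[m] for m in STUDENT_MOVES),
        "proficient": [m for m in ALL_MOVES
                       if ratings[m] >= PROFICIENCY_THRESHOLD],
    }


# Hypothetical ratings for a single video (not real data).
example = {"ask": 3, "press": 2, "link": 2,
           "explain": 3, "co_construct": 1, "critique": 1}
summary = summarize_sdi(example)
print(summary["total"], "of", summary["max_total"])
print("proficient moves:", summary["proficient"])
```

Separating teacher and student subtotals mirrors the instrument's even split between teacher and student contributions; whether the instrument itself reports such subtotals is not stated in the text.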
Although SDI was developed to score discussions in science classes, two considerations prompted us to use it for historical discussions. First, there is no comparable tool for assessing disciplinary discussion in history/social studies classrooms [existing tools like Huijgen et al.’s (2017) observation protocol limit their scope to specific historical heuristics rather than the entire practice of discussion]. Second, SDI is functionally content neutral as it only rates the forms of discourse, not their content. This design choice, a concession to the variety in scientific discourse, enables SDI to be transferred to history because historical argumentation also relies on explicit claim-evidence relationships. After scoring each of the videos with SDI, we conducted paired t-tests to assess teacher growth.
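For readers less familiar with the paired t-test, the following is a minimal sketch of the statistic for one SDI dimension, using only the Python standard library. The scores are invented for illustration, and the sign convention (baseline minus final, so that growth yields a negative t) is an assumption about how the differences were taken, not a detail confirmed by the text.

```python
import math
import statistics


def paired_t(baseline, final):
    """Return (t statistic, degrees of freedom) for paired observations.

    Differences are taken as baseline - final, so growth from baseline
    to final produces a negative t (an assumed convention).
    """
    if len(baseline) != len(final):
        raise ValueError("each teacher needs both a baseline and a final score")
    if len(baseline) < 2:
        raise ValueError("at least two pairs are required")
    diffs = [b - f for b, f in zip(baseline, final)]
    sd = statistics.stdev(diffs)  # sample standard deviation of differences
    if sd == 0:
        raise ValueError("t is undefined when all differences are identical")
    n = len(diffs)
    t_stat = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical ratings on one dimension for five teachers (not real data).
baseline_scores = [2, 1, 2, 3, 2]
final_scores = [3, 2, 2, 4, 3]
t_stat, df = paired_t(baseline_scores, final_scores)
print(f"t({df}) = {t_stat:.2f}")
```

The sketch stops at the test statistic; obtaining a p value requires the t distribution, which a statistics library (for example, `scipy.stats.ttest_rel`) provides directly.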
To account for the transfer of the SDI to a history/social studies context, as well as to reflect the emphasis placed on disciplinary history instruction, the videos of classroom discussions were also coded with a deductive codebook for historical thinking. Because Reading Like a Historian lessons are designed to elicit epistemological uncertainty about the past as students interrogate and corroborate documentary evidence, we sought to capture the extent to which such reasoning was present in classroom discourse. This codebook was applied to the same 15-minute segments that were used to code SDI. Videos were coded at the level of the utterance with categories drawn from Wineburg’s (1991) descriptions of the heuristics used by historians when reading texts: sourcing, corroboration, and contextualization (see Table 2). Although Wineburg initially defined these heuristics based on their absence in high school students’ habits of reading, these concepts have increasingly been utilized as explicit objects of teaching and students have been shown to utilize these concepts in contexts where they have received instruction on historical thinking (Jay, 2021). Videos were coded as demonstrating sourcing whenever teachers or students asked a question or made a comment questioning the reliability of a document or scrutinizing the author, date, or genre of a text. They were coded as corroborating whenever teachers or students asked a question or made a comment comparing two texts. Finally, videos were coded as contextualizing whenever teachers or students asked a question or made a comment that situated the text within a particular historical situation or zeitgeist. We hypothesized that instructional coaching would result not only in a higher frequency of historical thinking in classroom discourse, but also in cognitive responsibility for this thinking shifting from teachers to students over the course of the year.
Table 2. Historical Thinking Codes for Videos of Discussion.
Coach-Teacher Planning Transcripts Coding
To explore the relationship between coaching and instruction, we created a 2 × 2 comparative case study by identifying the two most productive and two least productive coach-teacher pairings as determined by the growth on SDI scores (Yin, 2017). Although this selection process resulted in our comparing different grade levels (e.g., teachers of highest scoring coaches worked in high schools; teachers of lowest scoring worked in middle schools), our interest in the interaction within the coaching dyads overrode concerns about the divergent contexts. We coded the transcripts of all three planning sessions conducted between these four coach-teacher pairings with an iterative inductive codebook informed by SDI (Miles et al., 2014). In this codebook, we were interested in the ways coaches initiated and sustained consideration of teacher and student contributions to discussions, as represented by the SDI’s domains. Coding proceeded at the level of the utterance and noted whenever coaches either discussed a teacher or student move within the SDI framework or modeled a move within the SDI framework by roleplaying as a teacher or student. Accordingly, utterances were coded as ask when teachers or coaches either discussed or modeled open-ended questions intended to prompt student discourse, as press when they discussed or modeled prompting students for elaborated responses connecting claim and evidence, and link when they discussed or modeled teacher moves that would orient students to one another’s ideas. Explain, co-construct, and critique codes were used to note coach and teacher utterances describing students’ elaborated responses, connections to one another’s ideas, or productive disagreements, respectively. Although the codes are closely related, ask, press, and link codes were characterized by a focus on teachers’ actions while explain, co-construct, and critique were distinguished by a focus on students’ thinking and talking (see Table 3). 
Each video was coded by the second author and a research assistant, and interrater reliability was calculated to be a Cohen’s Kappa of 0.76, indicating “good agreement.”
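Cohen’s kappa adjusts the raw proportion of agreement between two raters for the agreement expected by chance: kappa = (p_o − p_e) / (1 − p_e), where p_o is observed agreement and p_e is chance agreement computed from each rater’s code frequencies. The sketch below shows the calculation on invented utterance codes; the code labels and data are hypothetical and do not reproduce the study’s coding.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Compute Cohen's kappa for two raters coding the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty item set")
    n = len(rater_a)
    # Observed agreement: share of items given identical codes.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions,
    # summed over every code either rater used.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned to eight utterances by two raters.
codes_a = ["ask", "ask", "press", "link", "explain", "ask", "press", "link"]
codes_b = ["ask", "press", "press", "link", "explain", "ask", "press", "ask"]
kappa = cohens_kappa(codes_a, codes_b)
print(round(kappa, 2))
```

In this invented example the raters agree on six of eight utterances (0.75), but kappa is lower because some of that agreement would be expected by chance alone.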
Table 3. Discourse Codes for Transcripts of Coaching Meetings.
Findings
We found that coached teachers, as a whole, improved in discussion facilitation, but some coach-teacher pairings were more successful than others. Analyzing the comparative cases of two successful and two less successful coach-teacher pairings revealed that successful coach-teacher pairings were more likely to focus on eliciting students’ argumentative discourse whereas less successful coach planning sessions focused on lesson planning, discrete historical content knowledge, and the disciplinary heuristics of historical thinking. Yet, the growth of teachers’ facilitation capacity in the successful coach-teacher pairings did not appear to come at the expense of their students’ opportunities to learn historical content or historical thinking. To the contrary, disciplinary thinking occurred with relatively similar frequency in the final videos from all four of the case study classrooms but was more likely to be enacted by students in the classrooms of successful coach-teacher pairings. It appears that coaching that emphasized a conceptual redirection toward inquiry and a pedagogical toolkit for fostering student discourse was more closely tied to growth in discussion facilitation and opportunities for student historical reasoning than a coaching approach more narrowly focused on historical concepts.
Coaching Supported Teachers’ Capacity as Discussion Facilitators
At the cohort level (N = 13), teachers improved in the overall discursive quality of classroom discussions, though growth in students’ historical thinking was more mixed. Comparing the videos of teachers’ fall baseline discussions and their final spring discussions, the average teacher increased from an 11.8 to a 14.5 out of the possible 24 points on the SDI. T-tests determined that the cohort’s growth was significant in four of the six domains: ask, t(12) = −2.3, p = .04, press, t(12) = −2.5, p = .03, link, t(12) = −5.1, p < .001, and explain, t(12) = −2.9, p = .01. This is to say that over the course of the coaching cycles, teachers became significantly more likely to ask open questions, prompt students for extended elaborations, and connect multiple student ideas. At the same time, students became more likely to offer elaborated claims, although they did not make significant progress in co-constructing or critiquing one another’s ideas. The SDI scoring of teachers’ videos presents evidence suggesting that the coaching cycles may have been a successful intervention. The teachers’ practices shifted significantly in favor of becoming more dialogic and discursive and the changes in their teaching were correlated with a marked increase in students’ elaborated contributions within the lesson.
At the cohort level, there was no clear growth pattern in historical thinking moves. For instance, the amount of sourcing students performed increased in five of the classes, decreased in five classes, and stayed the same in three. Analysis of teachers’ historical thinking moves produced similar ambiguity, with 8 of the 13 teachers making fewer historical thinking moves in their final discussion than in their baseline. In short, although we observed an increase in the likelihood of teachers engaging in dialogic teaching, we did not observe concomitant growth in the presence of historical thinking.
Successful Coaching Emphasized Initiating and Sustaining Student-Centered Inquiry
Cohort-level analysis, however, necessarily homogenizes the individual interactions between coaches and teachers. To better understand how different coaching pairs engaged in this work, we created a 2 × 2 comparative case study by selecting the two most successful coaching dyads (with coaches Steve and Sarah) and the two least successful dyads (with coaches Liz and Logan) as measured by teachers’ growth in SDI scores from the baseline to the final video (see Table 4). Because SDI draws a line between emergent practice (scores 1–2) and proficient practice (scores 3–4), we see from this table that the most pronounced difference between the successful and less successful coaching pairs is the teachers’ growth in press and link.
Table 4. Teachers’ SDI Scores Displayed by Coach.
Note. SDI = science discourse instrument.
It is important to note that all four coaches engaged with teachers in a similar manner. They went through the same three cycles of coaching, drew on many of the same coaching prompts, and shared a tendency to alternate between questioning teachers about their plans and offering suggestions. In most cases, recommendations for practice were delivered monologically by coaches. What differentiated the successful coaches from the less successful ones was therefore not the form of their coaching, but the content. We found three trends in the substance of what successful coaches said to teachers that differentiated their coaching from that of the less successful coaches. Successful coaches were more likely (a) to emphasize the importance of open questioning, (b) to consistently prioritize student sensemaking, and (c) to do so across the arc of the lesson. By contrast, the coaching sessions of the less successful coaches were characterized by a lack of integration: the instructional practices and embedded content were consistently decoupled from the larger purpose and structure of the inquiry lesson.
Emphasizing Teacher Questioning
The biggest difference between the most and least successful coaching was the extent to which coaches emphasized the importance of asking open, rigorous, and text-based questions. In their five recorded coaching sessions, Steve and Sarah made 71 ask comments. Liz and Logan, however, made only 10 ask comments in their six sessions (see Figure 1). Across these dozens of comments, Steve and Sarah were able to develop a taxonomy of questioning (e.g., Steve: “You want to be able to ask two types of questions. One, to connect them to prior knowledge like, ‘Someone tell me why the United States has a problem with socialism.’ Then the other question is to access some new knowledge that might be applied to the documents. You want to think of this in terms of a good mix of the higher level questions and the lower level questions.”), provide advice about how to structure student discourse around a central question (e.g., Sarah: “So your prediction, frame it in terms of your CHQ [Central Historical Question]”), and give teachers granular advice about phrasing, timing, and enacting powerful questioning (e.g., Sarah: “I wouldn’t give them both documents at the same time if I were you. I think I would give the John F. Kennedy one first and then ask, ‘So how do you think most African Americans feel about this legislation? What do you expect?’”).

Figure 1. SDI Coding of Coaching Sessions by Successful (S) and Less Successful (L) Coach-Teacher Pairs
Liz and Logan, however, infrequently mentioned teachers’ questioning. On the rare occasions they did, their discussions tended to focus on the procedural relationship between the three coaching cycles (e.g., Liz: “The SHR [supporting historical reading] really can be anything from when you start modeling, through going over those guiding questions or comprehension questions . . . Technically the last part of these lessons are . . . whole-class discussion about the CHQ.” or Logan: “Okay. When do we get to the CHQ?”). Teachers coached by Steve and Sarah had many more prompts to think about their questioning and a much greater variety of examples, advice, and feedback about questioning to draw from than did teachers coached by Liz and Logan.
Emphasizing Student Sensemaking
As a complement to their focus on the teacher move ask, the successful coaches were also disproportionately attuned to students’ explain moves. Steve and Sarah discussed explain seven times more frequently (22 mentions) than Liz and Logan (3 mentions). Liz demonstrated a sound understanding of the importance of explain with her advice that, for classroom discussions, “the ideal is it is as student-driven as can be, it is full of evidence, multiple interpretations, both sides are being addressed,” but she simply did not develop this idea across multiple coaching cycles. Logan was never coded as discussing students’ explanations, demonstrating a lack of attunement to a critical element of classroom discussion. Steve and Sarah, however, helped teachers predict students’ arguments (e.g., Steve: “Do you think they'd try to answer that question just based on the stuff from the reading?”), plan specific moves to promote, ground, and extend students’ argumentation (e.g., Sarah: “Even one minute, whatever you can, but just trying to get them to use that evidence and be like, ‘Okay, well, don’t just say I trust Frick.’ Like, ‘Why, show me that you liked him.’ Just hold them to the documents, make them use their evidence”), and develop frameworks for evaluating students’ explanations (e.g., Sarah: “I think sometimes you’re making this be too nitpicky and too difficult. So it can just be something as simple as like how is Document B different than Document A? . . . That’s going to be the type of skill you want them to grow and develop as a student. You want them to look at and compare across documents.”). Steve and Sarah’s teachers were frequently prompted to think about the relationship between teacher questions and student answers.
Steve and Sarah’s teachers improved in the quality of their ask, but they also improved in domains that the coaches did not emphasize. Steve’s teacher improved in every domain except students’ critique, and Sarah’s teacher improved in every domain except co-construct and critique (see Table 4). This growth occurred even though Steve was the only coach to never mention press and only referred to link moves twice. Liz mentioned press as much as Sarah (three times each), and yet Sarah’s teacher moved from a two to a four in that domain and Liz’s teacher did not improve his press score at all. These findings suggest changes in teacher behavior cannot be entirely attributed to coaches drilling particular moves.
Orienting Toward Discussion Across the Arc of the Lesson
Not only did Steve and Sarah place greater emphasis on ask and explain, but they also structured their coaching conversations differently to unify the disparate practices around a continuous connection to discussion. All the coaches shared the same progression of three history instruction practices, moving from establishing background knowledge and supporting historical reading to facilitating historical discourse. The successful coaches made it clear that these practices built on each other such that establishing background knowledge and supporting historical reading were scaffolds in service of the whole class discussion. This formulation explicitly and repeatedly oriented teachers toward understanding that discourse was the true goal of the coaching cycle. In their establishing background knowledge and supporting historical reading coaching sessions, Steve (36 SDI comments in two planning conversations) and Sarah (45 SDI comments in one planning conversation) made frequent reference to the SDI domains. Liz (six SDI comments in two planning conversations) and Logan (three SDI comments in two planning conversations), however, rarely incorporated discussion facilitation into their initial coaching conversations.
Steve and Sarah both introduced the concept of student argumentation in their sessions on historical reading. For example, Sarah said, “I think you have good plans, you’re going to do that modeling with the first [document], then give them your graphic organizer . . . Kind of do the same thing for document B. The last thing is I know that this session isn’t focused on discussion, but I would really encourage you to try and teach this with discussion in mind and like what big questions are the kids going to be grappling with and how am I going to engage them in that? . . . Some of the things that have been most useful for me is when I had forced them to take a side, like, okay, so who is to blame? . . . Make them stand up for why they trust that particular person.”
Although this coaching conversation was focused on supporting students’ historical reading, Sarah self-consciously broke that frame to connect the reading scaffold (a Venn diagram) to student argumentation that would occur later in the lesson. Reading is not a task to support content knowledge as an end unto itself, but rather a way to prepare for the more important goal of formulating and critiquing peers’ arguments.
Liz and Logan, by way of contrast, did not make any co-construct or critique comments prior to their final facilitating historical discussion planning session. Not only did Steve and Sarah’s consistent emphasis on discussion provide more opportunity to develop teachers’ facilitation skills by virtue of not confining facilitation to a single session at the end of the cycle, but it also provided a schema that positioned establishing background knowledge and supporting historical reading as parts of a whole rather than ends unto themselves, allowing teachers to understand the motivation behind the embedded practices.
Avoiding the Decoupling of the Practices and the Content
In lieu of extended coaching on discursive moves, Liz and Logan devoted more time during planning sessions to content matter and historical thinking. This included longer didactic explanations of the historical thinking skills (Liz: “If you’re talking about the change over time and the powers of the military . . . to me that’s almost more like close reading or even maybe contextualization”), anecdotes of personal experiences teaching lessons on similar content even when they did not utilize the same lesson materials and structures (Logan: “I love battle paintings . . . We do the Battle of Little Bighorn and we do talk about a couple of different paintings that actually used to be in their old textbooks and they were great”), or narrations of key content points (Liz: “We know that the colonists win the Revolutionary War, that the patriots are successful. Why did some people support the Loyalists? . . . I always try to point out the fact that Great Britain was the one who already had not only a great military but a big chunk of that military was already in the colonies”). These coaching conversations prepared teachers to think about history and potentially to lecture about history but did not situate the students’ historical learning as necessarily occurring within discourse.
One result of Liz and Logan’s focus on content was the decontextualization of the practices of establishing background knowledge and supporting historical reading. Whereas Steve began his coaching session by telling the teacher “The EBK [establishing background knowledge] is supposed to motivate the CHQ,” Logan asked, “For the EBK itself, what do you think the essential knowledge is that the kids need to know?” Liz entirely set aside the language of EBK and instead asked her teacher about his “hook” and suggested, “Something that I like to do when I run into that issue of not understanding why people would come over to the Americas is I like to show a map of Europe . . . and ask the students, if I was a country and I was running out of resources or I wanted to be more powerful, what would I do? I feel like that could be your hook for the lesson because if you start it off with this idea of, hey, military strength, power, we’re running out of land in Europe. Come on over to the Americas.”
Liz and Logan’s focus on establishing knowledge without making an explicit connection to the underlying inquiry of the lesson decontextualized the EBK and may have deepened teachers’ existing schemas that placed information transference at the core of social studies learning.
Despite Liz and Logan’s emphasis on historical content and historical thinking, their teachers became less likely to encourage students’ historical thinking over the course of the year. Steve and Sarah’s teachers, however, improved both the frequency and quality of students’ historical thinking in their classes. The teacher Liz coached had less historical thinking in his classes than any other teacher. In his baseline video, students made three comments questioning the reliability of the Disney film Pocahontas, which were coded as sourcing. In the final video, a document-based Reading Like a Historian lesson about abolitionist John Brown, neither the teacher nor his students made any historical thinking comments. Logan’s teacher began the year with a sourcing activity that asked students to identify potential reliability issues with a variety of hypothetical sources. In this class, 79% (11/14) of the students’ utterances featured historical thinking moves but did not involve an actual historical inquiry, as students never actually read the sources they discussed and made no attempt to answer the historical questions presented. The teacher’s final video did feature a full inquiry lesson, the Reading Like a Historian lesson about the attack on Fort Sumter, but showed a decrease in the frequency of historical thinking, as 14% (4/29) of the students’ utterances contained historical thinking. In contrast, both Steve and Sarah’s teachers increased the amount of historical thinking done by students (from 4% of comments to 13% in Steve’s case and from 10% to 27% in Sarah’s).
In addition to the change in frequency, the quality of historical thinking in Steve and Sarah’s teachers’ classes also increased, in part because of the shift to a more student-centered form of discourse. In the class taught by the teacher Logan coached, historical thinking was dominated by the teacher who tended to explicitly prompt brief student answers and then complete the thinking herself. In her final lesson on Fort Sumter, the teacher facilitated the following exchange about corroboration:
How many of you would agree, give me a thumbs up, that these documents kind of say the same thing about who fired first?
(Ss raise thumbs)
Yeah, there’s no indication in the documents that the North fired first and, Gina, what was the phrase that you used?
It said, “They returned fire.”
It says “they returned fire”—and which document was that from, A or B?
A
So document A indicates that the South fired first and the North returned fire. In document B, that would be where we would look to see if there’s any accusation that the North fired first, it says they fired in self-defense because they felt threatened by the strong military force, but nowhere does it say that they had been fired upon. So just with these two documents, we could probably corroborate that the South fired first on Fort Sumter.
The teacher does the bulk of the thinking and the questions are presented as closed. She understood how to employ the strategy of corroboration to verify historical claims, but not how to invite students into this intellectual work.
By way of contrast, the teacher that Steve coached opened a much more interpretive space. In a Reading Like a Historian lesson on the United States’ refusal to ratify the Kyoto Protocol and join the international effort to combat the climate catastrophe, one student offered this summary of a trend she noticed across multiple texts: “We read these documents and figure out fallacies or things we might distrust in it. And the big outstanding thing was the lobbyists and how big business are very much involved in this conversation . . . If we know that . . . we can figure out who has been striving to work with us.”
This observation prompted the teacher to ask, “How did Aubrey figure out that big business has an interest that is unwavering, using her skills as a historian?” Students’ answers included “sourcing” and “look[ing] at trends,” but the teacher pressed further until a student explained Aubrey had been considering “their bias.” This pivot prompted an extended conversation about bias that led to a student pointing out that “We get these documents, but we don’t look [for them] for ourselves sometimes . . . So we only get a limited perspective,” introducing a meta-level understanding of sourcing by pointing out that the teacher curated their entire classroom experience. In this classroom, the historical thinking was being done by students, even to the point of questioning their teacher. The teacher did not enact the historical thinking for the students; she prompted students to name and explain the historical thinking enacted by their peers. The dialogic and student-centered nature of the classroom discourse mutually reinforced students’ historical thinking and strengthened the rigor of the discussion. While the cohort-level results did not show a consistent relationship between the increase in dialogic discourse and students’ historical thinking, the results of the comparative case study indicate that the most successful coaches were able to simultaneously increase both the centrality of student voice and the rigor of historical thinking.
Discussion
This study sheds light on the underexamined field of history teacher professional development and coaching. For over a century, reformers have called on history teachers to enliven classroom instruction so that students might see the relevance and interpretive nature of the discipline (Cuban, 2016), yet the field has produced scant empirical evidence of professional development interventions that support instructional change (Crocco & Livingston, 2017). In this study, however, we observed a significant shift toward dialogic instruction across the cohort of coached teachers. Our analysis identified specific coaching practices that lay at the juncture of conceptual change and practical application and that may have supported teachers’ capacity to facilitate dialogic and student-centered inquiry. When coaches focused on questioning and student explanation, teachers facilitated discussions that promoted greater student-centered discourse and historical analysis. Although these findings call for further exploration and verification, they contribute to theory-building around the elements of coaching that might promote meaningful instructional change.
Successful coaching for historical inquiry could engender conceptual change about the nature of learning history. History teachers often approach their subject monologically, relying on lectures and scaffolded textbook readings to transfer content information to students (Saye & SSRIC, 2013). Transforming that instruction requires more than a new curriculum, as teachers routinely modify curriculum to fit their existing instructional schemas (Fogo et al., 2019; Reisman & Jay, 2022). Three of the four focal teachers taught Reading Like a Historian lessons in their baseline and final videos, with Steve’s teacher as the lone exception whose baseline was a lesson on analyzing sources using a framework from the IB course (OVPL: Origins, Values, Purpose, Limitations). Their selection of these lessons, and indeed their voluntary participation in a coaching program focused on teaching document-based historical inquiry, indicated a shared interest in history instruction that promotes student analysis and interpretation rather than memorization and regurgitation. And yet, as the baseline videos made clear, teachers’ embrace of these inquiry lessons was not, in and of itself, sufficient. To enact instruction where students engage in genuine knowledge construction, teachers must begin to see dialogic student discourse as the key mechanism for learning.
This study took up the question of how to support teachers to make this conceptual shift. Building on prior work (Reisman et al., 2019; Reisman & Enumah, 2020) that highlighted the promise and limitations of practice-based methods focused solely on facilitating discussion, the coaching program developed for this study sought to decompose the full document-based lesson, so that teachers might see how a successful historical discussion represents the culmination of students’ learning and thinking over the course of the full lesson. The first author worked with all the coaches in the first year of the design-based study to specify and identify video exemplars of each of the three focal practices. For the coaches, the individual practices—especially establishing background knowledge and supporting historical reading—should have held no meaning outside their integration in the document-based lesson and as scaffolds for the final discussion. Yet, when Liz and Logan planned with their teachers, they upheld the practices as meaningful stand-alone activities, divorced from the broader arc of the lesson. Why? We can think of three potential reasons, none of which are entirely satisfying, and all of which underscore the need for further research.
Perhaps Liz and Logan held foggy “instructional visions” for historical inquiry (Gibbons & Cobb, 2016). Although Liz and Logan presented their teachers a vision of historical content whereby the subject was open to interpretation (via their emphasis on historical thinking skills), they decoupled it from the pedagogical work of inviting students to construct knowledge (e.g., through an emphasis on ask that is sustained over the arc of the lesson). This point is striking: by all counts, Liz and Logan were excellent history teachers, well-regarded in their schools and in the district writ large. They were tapped to serve as coaches for precisely these reasons. It is entirely conceivable that they simply were not conscious of all the ways they elicit and sustain student discourse and were therefore unable to adequately relay this vision to the teachers they coached. Because we relied in part on district administrators to tap teacher leaders for this project, we do not know. But this first interpretation would underscore not only the importance of an articulated instructional vision as the basis of successful coaching for inquiry, but also the importance of providing opportunities for coaches to reflect on their practice so that their visions might be made explicit to novices. Especially in our current political climate, where educators are increasingly wary of broaching controversial historical topics, it is essential to understand the root of coaches’ hesitance to encourage student discourse (Woo et al., 2023).
Alternatively, perhaps Liz and Logan did a perfect job helping their teachers understand and enact the focal practices, and the differences we observed between the two sets of teachers were not related to the coaching they received. Compared to Liz and Logan’s teachers, Steve and Sarah’s teachers taught high school, had more coursework in history, completed graduate work in education, were closer in age and experience to their coaches, and had just begun their careers. Any of these factors, or others of which we remain unaware, may have contributed to teachers’ relative growth in discussion facilitation or lack thereof. We should further acknowledge that because we selected our case teachers at the extremes, it is not clear if the differences in coaching exist on a continuum and if the association between coaching and practice would hold accordingly. In short, given our selection mechanism, we cannot make claims as to whether the changes in teacher practice can be attributed to the coaching they received. What we can say, empirically, is that the substance of Steve and Sarah’s coaching sessions differed from Liz and Logan’s in the ways we describe. But we cannot say with any certainty that these differences are meaningful.
A third possibility is that the substantive differences between coaching sessions were meaningful, and Liz and Logan were aware that the EBK and SHR should scaffold the final discussion and propel the lesson’s inquiry, but the structure of the three coaching cycles distorted or constrained the way they communicated their vision to teachers. Indeed, after the first year of the larger study, coaches had expressed concern that the structure and sequence of the cycles prevented teachers from seeing the full arc of the lesson. Our remedy in Year 2 was to offer—before launching the first coaching cycle—a full-day professional development led by the first author that engaged teachers in a full document-based lesson, highlighting how it comprised the three focal practices. It is certainly conceivable that this session was insufficient for helping teachers connect the focal practices to the broader lesson structure. If so, the structure and sequence of the three cycles militated against reintegration of the practices (as evidenced in Sarah’s awkward caveat during her SHR session, “I know that this session isn’t focused on discussion, but . . .”). An alternative coaching program, where teachers cycled through the three focal practices multiple times, might have more effectively revealed the arc of the lesson and the central role of discussion.
Beyond the structure of the program, it is also worth noting that none of the coaches showed evidence of developed “coaching visions” (Gibbons & Cobb, 2016). As novice coaches, they did not engage teachers in pedagogies that align with constructivist and sociocultural theories of teacher learning, or that helped teachers to reason through instructional dilemmas (Kavanagh et al., 2020). Across all four cases, coaches primarily told teachers what to do. It is plausible that with a more robust pedagogical toolkit for coaching, the structure of the program would be inconsequential because the coaches would be able to employ a range of techniques to help teachers arrive at insights about their instruction. Further research is needed to explore each of these possibilities.
What we do observe in these data is a shift among the teachers coached by Steve and Sarah. Over the course of the year, their practice in press and link (and co-construct for Steve’s teacher) grew from emergent to proficient. These moves, perhaps more than the others, suggest instructional attention to students’ thinking and argumentation. Interestingly, however, these practices were not emphasized in their coaching sessions. Rather, Steve and Sarah emphasized the practice of ask, while attending to students’ capacity to explain. We remain curious about exactly why a coaching focus on ask might have manifested in growth in teachers’ press and link and see this potential relationship as grounds for further study.
Practice-Based Teacher Coaching
From the perspective of PBTE, however, we might draw two inferences from Steve and Sarah’s emphasis on ask and explain and their teachers’ concomitant growth in press and link. We might, for example, consider ask and explain (or, more explicitly: ask open questions and elicit student explanations) to be alternative high-leverage practices that Steve and Sarah elevated above the other practices embedded in the coaching model. Operating at a small grain size, these practices could be integrated into any instructional activity and remain linked to the CHQ and to the final discussion. Once enacted, these high-leverage practices would presumably shift the discourse in the teachers’ classrooms, increasing student voice and putting teachers in a position where they could focus on sustaining student voice (e.g., press, link) rather than merely eliciting it. Perhaps, then, Steve and Sarah unwittingly landed upon two higher-leverage practices than the ones specified by the program, practices that they were able to revisit over the course of the three coaching cycles and that were effective in producing the desired conceptual change in their teachers. For those of us who have been working to infuse the secondary history classroom with student discourse, inquiry, and critical thinking, such a finding represents a beacon of hope. Perhaps instructional coaching on asking open questions and eliciting student explanations can do the work of shifting the history classroom from one of passive reception to active learning.
The second inference we could draw from the findings concerns the way Steve and Sarah worked on these practices with their teachers: the practices were never treated abstractly. Contrary to fears that practice-based approaches risk mechanizing and decontextualizing practice (Kennedy, 2016), Steve and Sarah conducted their coaching sessions around concrete curricular materials with which they were deeply familiar, and their suggestions never strayed from these materials and the specific historical content. This finding raises an interesting question about transfer: if the coach offers instructional suggestions about how to enact a lesson about a particular historical topic, why might we expect the teacher to carry those suggestions to a different lesson? In this bounded instructional space, where coaching was grounded in shared lesson plans and specified practices, it is possible that Steve and Sarah’s success may have lain in their ability to focus teachers’ attention even more narrowly on the conceptual core of inquiry: teachers’ questions and students’ claims.
Implications
These findings illustrate that the work of coaches is complex and raise important questions for districts and schools committed to developing and supporting ambitious instruction. As instructional coaching is increasingly recognized as an effective and essential component of professional development, we caution districts against simply adopting domain-general or content-agnostic models. We do not believe the demands of high-quality social studies instruction can be met by coaches unfamiliar with the subject matter. Recognizing that most districts cannot afford content-specific coaching for social studies teachers, we encourage schools and districts to leverage social studies teacher leaders as instructional coaches. Such an arrangement would require schools to be creative in offering reduced teaching time or other additional compensation, but we suspect these investments would pay dividends. At a time when social studies teachers are, once again, in the national spotlight, it is unconscionable that they remain last in line for subject-specific professional development in their schools and districts.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
