Abstract
The challenges inherent in mastering academic content in a new language are many. When it comes to learning science in U.S. high schools, English learners (ELs) confront these on a daily basis. In an effort to document expert language/content instructional strategies, we analyze Mrs. B’s sheltered high school biology class, made up of ELs from around the world and representing varying stages of emerging bilingualism. The aim of this 2-year case study was to detail effective teaching patterns in a high-functioning multicultural science class—a class where the myriad linguistic, cultural, and affective needs of students are expertly met—and to subsequently suggest a model for understanding and undertaking powerful language and content learning supported by multimodal referents. From a rich data set comprising class recordings, interviews, reflections from Mrs. B, course documents, student work, and survey responses emerged a model of the language/content multimodal interface for teaching ELs.
How teachers, of both science and English as a second language (ESOL), attend to these challenges has been investigated in terms of inquiry learning (Stoddart, Pinal, Latzke, & Canaday, 2002), language register (Mohan & Slater, 2006), and language development supports (Llosa et al., 2016; Rosebery, Warren, & Conant, 1992). How educators make use of classroom technologies to support comprehension and content mastery has been investigated as well (Ajayi, 2009; Kim, Hannafin, & Bryan, 2007; Meskill, Mossop, & Bates, 1999a; Oliveira, de Oliveira, & Meskill, 2019; Oliveira & Weinburgh, 2019; Yang & Walker, 2015). In an effort to document expert instructional strategies, we analyze the multimodal interactions in Mrs. B’s high school biology class, a sheltered classroom made up of ELs from around the world and representing varying stages of emerging bilingualism. We selected a high-functioning, multicultural biology class, a class where the myriad linguistic, cultural, and affective needs of students are expertly met. On the basis of our analysis of the instructional conversations in Mrs. B’s class, we suggest a model of language and content learning supported by multimodal mediation.
Theoretical Perspective
Our inquiry is guided by a social, interactionist view of language learning that sees productive use of the target language as central to its appropriation and mastery. Rooted in sociocultural perspectives on learning generally (Vygotsky, 1972) and decades of empirical studies on second-language acquisition (Anton, 1999; Gibbons, 2003; Lightbown, Spada, Ranta, & Rand, 1999; Poehner & Lantolf, 2014; Swain, 2000), a social, interactionist view of learning with instructional conversations serving practice (Elhassan & Adam, 2017; Meskill, 2013; Tharp & Gallimore, 1988) has emerged. With the goal of language education being the development of communicative competence—the ability to say or write the right thing, in the right way, with the desired result in a given context (Hymes, 1972)—development of communicatively viable language and literacy through social interaction is clearly indicated. Likewise, current theoretical developments underpinning content through language and language through content instruction have trended away from the monologic (teacher centered) and toward the dialogic or interactionist (Nystrand, Gamoran, Kachur, & Prendergast, 1997). Learning, as Vygotsky (1972) argues, is in part establishing links between what is known and unknown and requires that their internalization be played out on the social plane, something that language educators and, increasingly, content educators are coming to accept as foundational to successful learning (Lantolf, 2000; Vygotsky, 1972). Consequently, rather than monologic delivery of information, instruction is dialogic, with instructional conversations serving as the central mode of teaching and learning (Saunders & Goldenberg, 1999). A growing body of research supports this movement away from the monologic and lecture driven to the interactional.
We know, for example, from recent, large-scale classroom research that active, authentic use of language and content is the critical component for ELs’ success (Portes, González Canché, Boada, & Whatley, 2018). Our inquiry is guided by social, interactionist views of learning with special focus on the multimodally supported instructional conversations that pervade Mrs. B’s sheltered biology class for ELs.
Review of Literature
Multimodalities
Representations of ideas and events are critical to both language and science education (Meskill et al., 2014; Oliveira et al., 2013; Oliveira & Weinburgh, 2016). Multimodality understands communication and representation as including a variety of semiotic modes (speech, writing, image, gesture, and three-dimensional models) that are socially and culturally shaped for making meaning (Norton & Kress, 2000). Multimodal learning—learning with, through, and around content in multiple forms—has, in a digital age, become the seamless norm in most contemporary classrooms. Students are accustomed to encountering curricular content through images (still and moving), aurally, kinesthetically, and, of course, textually. In their study investigating the use of interactive whiteboards (IWBs) as pedagogical tools, Mercer, Warwick, Kershner, and Staarman (2010) found that teachers can mediate digital material for their students as a means of augmenting comprehension and stimulating oral and written production of content. When comparing the use of digital materials versus traditional whiteboards, Fernández-Cárdenas and Silveyra-De La Garza (2010) conclude that digital materials tend to stimulate the use of gesturing and pointing by teachers as they mediate meaning, and Hennessey’s (2011) case studies of classroom practice illustrate how teachers and students exploit “multiple modes of representation enabled by the IWB” to create a space for multimodally supported instructional conversations (p. 468). In short, multimodal referents can serve as common visual references, what Meskill, Mossop, and Bates (1999b) term public “anchored referents,” to facilitate comprehension and communication. In the context of this study, the term referent is used to describe the text, image, and/or gesture used to assist comprehension and production of new language.
Multimodal Science
Verbal language is only one of many modes of representation used by teachers and students to communicate scientific ideas and is often not the predominant one (Jewitt, Kress, Ogborn, & Tsatsarelis, 2001). This multimodal perspective on meaning making in science has been informed by the social semiotic theory of communication (Halliday, 1978) and the further development of that theory to include nonlinguistic forms of communication (Hodge & Kress, 1988). Each mode of communication (text, speech, facial expression, pantomime, image, video, graph, and gesture) constitutes an organized set of semiotic resources available to foster student conceptual understanding (Jewitt, 2009). Indeed, a growing number of science educators have shifted away from a monomodal view of classroom discourse, in which verbal language is considered the sole and central communicative component, to a multimodal perspective, wherein various modes are perceived as “semiotic hybrids”—concepts that are simultaneously verbal, visual, mathematical, and/or interactional (Lemke, 1998). In short, scientific discourse incorporates the use of simultaneous modes of communication to convey ideas and is, thereby, multimodal in nature (Gee, 2015; Gillies & Baffour, 2017).
Multimodal Mediation and ELs
For the growing number of ELs in U.S. schools, teacher mediation of content through multiple perceptual modalities has been well established as supportive to the development of language/content (August, Artzi, & Mazrum, 2010; Calderón et al., 2005; Carels, 1981; Case, 2002; Church, Ayman-Nolley, & Mahootian, 2010; Cummins, 2014; Meskill, 2005; Meskill et al., 1999a; Waring, Creider, & Box, 2013). Not only does integrating multimodal resources into science classrooms enable teachers to employ representations, but there is some evidence that this can also assist in the development of academic literacy for ELs (Early & Marshall, 2008; Meskill et al., 1999b; Zhang, 2016). Ajayi (2009), for example, suggests that visual presentations require students to interpret meaning and make connections with their identities and life experiences, thus employing and extending schema. Further, Choi and Yi (2016) suggest that multimodality can linguistically reinforce, scaffold, and connect subject-matter content to the lives of ELs in addition to serving as tools for culminating student projects. They report that visual representations accompanied with text facilitated ELs’ acquisition of content knowledge. Skilled use of multimodal representations of content allowed ELs to “revisit and practice content and linguistic knowledge repeatedly with more ease” (Choi & Yi, 2016, p. 320).
In a rare study of computer-screen influences on instructional conversations for language learning, more-competent peers were observed mediating what appeared on the screen when a peer was in need of scaffolding (Hsieh, 2017). In another examination of ELs and digital media, teachers used “point talk” to capitalize on specific digital learning features, such as its publicness, anarchy, instability, and malleability (Meskill et al., 1999b). Urmeneta and Evnitskaya’s (2014) case study of a Spanish/science classroom further illustrates how teacher-led discussions that employ multimodal sources lead students to co-construct meaning as part of their mastering target content/language. The authors contrast these multimodally supported discussions with a failed activity whereby students lacked multimodal resources to help them formulate extended content utterances, findings echoed by Robinson (2005). As a result of examining multimodal discourse in EL science classes, both Zhang (2016) and Urmeneta and Evnitskaya (2014) found that language teachers’ systematic use of multimodal resources led to improved comprehension of science vocabulary. Finally, Mortensen’s (2011) close analysis of conversationally integrated lexical items in a language/content learning context underscores the critical supportive role played by multimodal resources in comprehension and, ultimately, linguistic/conceptual mastery and illustrates in detail the ways “lexical items emerge from the ongoing interaction” (p. 137).
Multimodal mediation of science content with ELs is clearly a fruitful area of inquiry often generating practical strategies for language and content teachers alike. However, a sophisticated and multidimensional model of the integration of new and traditional multimodal classroom elements to support learning is needed (Jenkins, 2006; Zhang, 2016). Indeed, in a Delphi study on priorities for educational technology, models and strategies for effective integration and use by practitioners were at the top of the list (Pollard & Pollard, 2005). It is in this context that we undertook intensive examination and analysis of ELs and multimodal referring in Mrs. B’s sheltered biology class.
The Study
Grounded in a social interactionist view of learning new language and content with multimodal supports, and given the priorities and outcomes of the extant literature, the overarching question driving our inquiry became the following: What multimodal-supported teaching patterns lead to language/content acquisition opportunities in a sheltered high school biology class for ELs?
A parallel research focus originally developed as part of a larger, 5-year, federally funded initiative that examined the language/content teaching strategies devised by 40 paired ESOL and science and math educators. This portion of the study is a 2-year, detailed case study of a midsized, postindustrial Upstate New York high school biology class. Its selection was based on constant comparison with like and unlike classrooms using a system of multimodal amplification coding to determine patterns in the quality and effectiveness of teaching math and science content to ELs (Kolb, 2012). Data are composed of nine video-recorded classes, teacher-written reflections on these recorded classes, and recorded planning and debrief sessions with professional development staff. All recordings were transcribed and stored as text documents, the content of which was initially grouped by emerging themes and patterns using simple concordancing. In addition, our focal teacher completed two lengthy questionnaires. The first one pertained to her background, teaching philosophies, and the recorded classes, and the second contained in-depth follow-up questions regarding her multimodal practices (Appendices A and B). Mrs. B’s lesson plans, her written reflections about recorded lessons, two presentations on her work at two statewide professional development institutes, a multimodality questionnaire completed by Mrs. B’s students, and class artifacts make up the remainder of the case data set.
Using simple concordancing software, transcriptions were first analyzed to determine the contexts in which target science vocabulary co-occurred with multimodal referents. These contextualized instances were compiled, compared, and used to (a) illustrate the predominant pattern represented in our model and (b) construct. These were continually discussed with Mrs. B as part of these processes (Appendices A and B). The language—both verbal and gestural—used to describe her multimodal mediation strategies comprised in vivo coding that later led to specific, detailed patterns of the recorded instructional conversations (Yin, 2009). The breadth of our recorded data allows for a sense of the pacing, frequency, and the pervasiveness of the distinct conversational patterns of classroom interaction—the teaching patterns—that emerged. Iterative analysis of these contextualized patterns developed into our emerging model of what constitutes expertly taught language/content for ELs that capitalizes on carefully integrated multimodal referents. A detailed portrayal of Mrs. B’s instructional strategies, as well as the development of a fine-grained model for multimodal language and science learning for ELs, follows.
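The first analytic step described above, locating contexts in which target vocabulary co-occurs with multimodal referents, can be illustrated with a minimal sketch. It does not reproduce the concordancing software used in the study, which is not specified here. It assumes, purely for illustration, that multimodal referents are recorded in transcripts as bracketed annotations (for example, "[points to photo]"). The function name `find_cooccurrences`, the term list, and the sample turns are all hypothetical.

```python
import re

# Hypothetical target vocabulary; not the study's actual word list.
TARGET_TERMS = {"centenarian", "longevity"}


def find_cooccurrences(turns, terms):
    """Return (turn_index, term, annotations) for each turn in which a target
    term is spoken and at least one bracketed annotation is present."""
    hits = []
    for i, turn in enumerate(turns):
        # Bracketed annotations stand in for multimodal referents (assumption).
        annotations = re.findall(r"\[([^\]]+)\]", turn)
        # Remove annotations so terms are matched only against spoken words.
        spoken = re.sub(r"\[[^\]]*\]", " ", turn).lower()
        words = set(re.findall(r"[a-z']+", spoken))
        for term in sorted(terms):
            if term in words and annotations:
                hits.append((i, term, annotations))
    return hits


# Invented sample turns, written only to exercise the function.
turns = [
    "Mrs. B: She is a centenarian. [points to photo on smartboard]",
    "SS: Centenarian!",
    "Mrs. B: [chops with hands] Say it again.",
]
hits = find_cooccurrences(turns, TARGET_TERMS)
for turn_index, term, annotations in hits:
    print(turn_index, term, annotations)
```

In this sketch only the first sample turn is reported, because it is the only one that contains both a target term and an annotation. Matching at the level of the single turn is a simplifying choice; an analysis of conversational sequences would likely need a window spanning adjacent turns.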
Context and Participants
Our focal high school class is in an Upstate New York district where 16% of the district’s students are classified as ELs. One of the district’s many strategies to support immigrant and refugee families is to provide “sheltered” instruction. A sheltered content class here refers to EL-only classes where language and content are explicitly taught intensively and at the same time. It is a temporary, transitional learning space designed to make mainstream curricula accessible and comprehensible to ELs by offering them a safe, productive, and low-anxiety environment with many language supports (Fritzen, 2011). Mrs. B’s biology class is composed of 13 students from Yemen, Sudan, Libya, Thailand, Burma, Malaysia, Bhutan, the Dominican Republic, Puerto Rico, and Ukraine. Their English proficiency varies from three students not having literacy in their home language to the “emerging” and “entering” levels as determined by state assessments. These are the lowest two levels of English proficiency on the New Language Arts Progressions recently adopted by New York State. This policy conceives of student acquisition of a new language (not spoken at home) as a gradual progression along a sequence of five distinct developmental stages, namely, entering, emerging, transitioning, expanding, and commanding (New York State Education Department, 2012a, 2012b).
Mrs. B
Mrs. B’s path to becoming an ESOL professional and chair of her department is a multicultural one. She graduated from a Russian university with a degree in Germanic philology and began learning English as a foreign language at age 10. She went on to major in English at the university level. The short version of her language learning philosophy is an environmental one, with “instruction, plentiful opportunities to practice, accessible input, and an authentic purpose to produce output. So a teacher needs to create an environment for all those conditions to occur” (Questionnaire 1, Mrs. B). Such a philosophy suggests an ecological perspective on content language integration wherein learning is conceived as being environmentally mediated (Van Lier, 2004).
Classroom technologies contribute to a strong language/content learning environment, and Mrs. B uses a range: Quizlet for vocabulary practice, Kahoot for multiple-choice questions, NoRedInk for grammar exercises, NewsEla for level-appropriate readings, instructional videos with captions, Google Classroom for organizing materials and resources, and the like. “The variety is like a menu I can chose from when planning my instruction: It keeps students engaged, it can provide immediate feedback, assists in repetitive but necessary skills practice” (Questionnaire 1, Mrs. B). Mrs. B’s teaching illustrates the centrality of multimodal referents. She explains,
I am lucky to be able to teach in a classroom that has an interactive TV, document camera, and a cart of Chromebooks. I use all that “hardware” every day: I project material using interactive TV, write notes on whiteboard, use the document camera to model annotating text; students access Chrome books when we play Kahoot, practice vocabulary with Quizlet, or create presentations using Google Slides. All these tools allow students to access the material and minimize chances of being lost. For example, written notes are accompanied by verbal explanations, if I refer to or read a passage, then I project it on the interactive TV, if students are working on a presentation, they have a sample of it and directions on Google classroom that is accessible at home as well as in school. I would have to re-invent and re-imagine my teaching if I lose access to any of the technologies I currently use. (Questionnaire 1, Mrs. B)
Her overall aim and focus is to teach the language of science, specifically the new lexical items and syntactic forms that students need to productively understand and use concepts and ideas. She does so conversationally, integrating students’ interests and experiences along with new information in tantalizing and curiosity-provoking ways. As a skilled conversationalist, she establishes mutualities while referring to immediate visual and auditory supports available on walls, screens, and boards and via her physical body.
Mrs. B’s Classroom
Mrs. B’s classroom has two large screens; one is an older smartboard on which she projects from her laptop. The other is a newer mobile version with a touch screen.
I found it helpful to use both screens during class: for example, one screen is used to project text or video and to annotate that text, while the other screen is used for writing notes based on the text. . . . When I start a new unit, I place instructional materials around the room—magazines, books, posters from my previous year classes, printouts of the articles we will later use etc.—I want to use any opportunity to connect what students say to what we will study. For example, “You said that your grandfather lived longer than your grandmother. This chart shows life expectancies in different countries around the world for men and women. We can see who lives longer on average and try to find out the reasons.” . . . “You said that it is difficult for older people to move. This magazine shows pictures of 90 year old women doing yoga. I wonder if regular exercise helps one to stay active?” (Questionnaire 2, Mrs. B)
The Lessons
In our three focal lessons, each extending over three class periods, the topics were human audition, human biological systems, and human longevity. Mrs. B worked every unit vocabulary item throughout her lessons, which comprised eight language/content routines (Table 1).
The Anatomy of Mrs. B’s Eight Instructional Stages
Manipulating here means dragging and dropping text and visuals in and out of the foreground and resizing and relocating them on the screen. In Stage 1, student manipulation comprises dragging and dropping lexical items to visually align with their correct definitions and/or images.
Throughout, Mrs. B and her students converse about the focal topic, students’ questions and thoughts about that topic, and how aspects of the focal topic relate to their lives and their understanding of human biology. Integral and essential to the comprehensibility and, ultimately, to the success of these interactions are the multimodal elements physically at hand to which all participants continuously and conversationally refer, Mrs. B in particular. She employs a number of multimodal elements, usually on the smartboard, to generate interest and enthusiasm. If the referent is not readily available there, she will cross the room and point to visuals that depict what she is speaking about. In rare moments where a referent is not immediately present, Mrs. B will act out the word using gestures and facial expressions. For example, when talking about the fight-or-flight response, she mimed increased adrenaline by shaking her body energetically and feigned fighting and fleeing. When the word offer was puzzled over, she picked up a student’s handbag and offered it to another. These improvisations became permanent emblems for the remainder of the unit and beyond as students were witnessed using them weeks later in humorous conversations with one another.
Visuals to anchor the topic are on the smartboard, and Mrs. B continually references these, pulls them in and out of the foreground as they are addressed, resizes and repositions them according to prominence in the conversation, and calls on students to think and speak in depth. Students actively confer about the images relating what is familiar to their lives and questioning what is unfamiliar. Photographs of elderly people prompted comparisons with friends and relatives, their lifestyles, their probable life spans, and the like. As regards her use of video, Mrs. B reported
short videos that have models and explanations on the human ear parts and functions was the best way of teaching the material, as it combined visuals, models, explanations, closed captions, and I was able to pause it and explain parts students had trouble with. (Questionnaire 1, Mrs. B)
Each activity (1–8) is densely interactional and provides Mrs. B with ongoing measures of student progress:
I choose a discussion style, questions, calling on students who seemed confused, asking volunteers to help out, asking for reasons behind their answers. Sometimes I ask for students to answer in writing, but it requires a much longer time for students at beginning level of proficiency. Making lists, web diagrams, fill in the blanks, Kahoot assessments are effective and require less time investment. (Questionnaire 1, Mrs. B)
Rather than simply following a fixed script, Mrs. B employs what can be called a “choreography of teaching”—defined by Oser and Baeriswyl (2001) as “a [type of] choreography that binds, on the one side, freedom of method, choice of social form and situated improvisation, on the other, with the relative rigor of the steps that are absolutely necessary in inner learning activity” (p. 1043). Her approach is flexible at the surface level, thus allowing for variation and adaptation while retaining its deeper structure (her stable theoretical core). These choreographic aspects have been shown to be characteristic of expert teaching (Oser, Patry, Elsasser, Sarasin, & Wagner, 1997).
Multimodal Referents in Instructional Conversations
In this section we increase the granularity of our examination to identify specific multimodally supported conversational routines characteristic of Mrs. B’s classroom. We are specifically concerned with the roles that multimodal information is playing in student comprehension and production of the language of science. Such elements are continually and conversationally referenced in this class, and our purpose in examining this productive referring is to understand how, why, and with what instructional impact.
Unlike the written word on which so much instruction depends, speech is evanescent. The listener has to attend to, hear, and try to understand an utterance at the moment it is spoken. Doing so in a new, developing language is challenging to say the least, yet the aural mode is most often primary in language education, and experienced language educators integrate a number of supports in their classroom conversations to anchor, amplify, and elaborate meaning as it is being conversationally negotiated. For Mrs. B, learning depends on socially situated interaction whereby language is not restricted to being “in the head.” She consequently utilizes the environment for joint meaning making between interlocutors (Goodwin, 2000). In such conversations, referring is a collaborative process (Clark & Wilkes-Gibbs, 1986). Speakers bring their interlocutors into the referential process by the design of their utterance. Further, the act of referring—in our case, teachers and students making continual, seamless reference to multimodal elements in the classroom—plays a central role in this joint meaning making. In Mrs. B’s classroom, we see language/content integrated teaching operationalized as conversationally communicative multimodal referring and will focus on this aspect of her classroom throughout subsequent analysis and discussion.
Mrs. B engages her students conversationally throughout the class time period. Even though she mentions employing “direct teaching” (an instructional method defined as the direct telling of information), the tone, tenor, and manner of her speech are consistently conversational, and, as in noninstructional settings, conversations depend on common ground and mutual understandings (mutualities) of what is being talked about. What marks Mrs. B’s discourse is that, rather than halting the conversation to launch into direct explanations and explications, she expertly and seamlessly weaves in multimodal referents to assure comprehension. Further, as in conversations generally, she seamlessly assesses comprehension and does not move forward with the conversation until mutual comprehension is achieved (Figure 1).

Figure 1. Two-step instructional conversational strategy.
I favor a conversational style of formative assessment. In the department, we joke a lot that “we can see it in students’ eyes.” There is some truth to that though, as you get to know your students, you know if they stay with you, if they are confused, bored etc based on how they sit, how they track you with their eyes, how they smile and laugh at your jokes, how they repeat the words quietly after you, how excited they are to turn to their friend and comment to explain something. As a teacher, you learn to feel your audience. (Questionnaire 2, Mrs. B)
In the following sequence, during the longevity unit, comprehension and productive use of the word centenarian is Mrs. B’s goal within the larger goal of pushing her students to think about longevity, connect it to their own lives, and develop curiosities that will evolve into research hypotheses.
Mrs. B: What did you say, Sammy? Ladies
S: Yeah. Ladies
Mrs. B: What do you all think of this? Women live longer than men?
[points to
S: Asians
Mrs. B: Ha! You think? You will need to find information to back up your, your [motions with
S:
Mrs. B: Right! [points to
SS:
Mrs. B is quick to pick up on any gaps in comprehension and calls on her extensive repertoire of descriptions, gestures, analogies, and so on to fill these. For her, when conversations progress, this is a signal that shared referents have been established and successfully comprehended; in short, new lexical/conceptual items have been learned (Figure 1). Thus, rather than halting the conversation in the interest of assuring comprehension of new language, she integrates additional ways of knowing and understanding new items often by utilizing students’ prior academic and home-culture knowledge in conversationally fluid ways. The class sequence on longevity, where the issue of women living longer than men conversationally emerged, exemplifies this pattern. The class continues its discussion of whether it is a good thing to have a long life or not. All students have written down the new words in their notebooks, some using their phone translators and/or paper dictionaries. They will hear these words frequently in the next week and use them in their class activities, readings, writing, and assessments.
Subsequent activities consist of discussing why some people live longer than others, and students are quick to generate their lists of reasons. The two Muslim students, for example, emphasize fate and being in God’s hands as the main influences on longevity. The three young Hispanic women emphasize quality of life, including friends, family, good food, and dancing, and the Asian students underscore hard work and family care and dedication as essential in living a good, long life. All perspectives and ideas are respectfully and enthusiastically embraced as part of the conversation. Key words are repeated, looked up in dictionaries and on devices, translated by classmates, used actively in speaking and writing, and of course, multimodally referred to throughout.
There are two basic steps in the conversational referring process: presentation and acceptance (Clark & Wilkes-Gibbs, 1986). Through variously lively and affective means of referring and maintaining student attention, Mrs. B achieves the first step, presentation, by, for example, pointing to a photo and verbally generating target language that the photo illustrates. As in noninstructional environments, Mrs. B’s students are required to indicate mutual comprehension before the conversation continues. They use facial expressions, thumbs-up or thumbs-down gestures, shrugging shoulders, nodding, or smiling along with saying the word or words, all as means to indicate mutual comprehension. Indeed, Mrs. B often “sees in the students’ eyes” whether or not they understand (reflection on class recording, Mrs. B).
This two-step process and its requirements pervade classroom discourse. In the following sequence, the vocabulary item centenarian, in large digital form on the smartboard, is not only referred to as a whole; Mrs. B visually (chops with her hands) and verbally (exaggerated enunciation) divides the word into four phonetic chunks. This is accomplished with alternating taps and chopping motions to indicate segmenting. Mrs. B exaggerates this chunking by vocally lengthening each. She orchestrates the class (indicates all should say the word with a sweeping motion around the class that concludes with pointing to the word), repeating the word three times. The students do so in concert with her pointing and chopping motions at and around the text. Surrounding the word centenarian on the smartboard is a collection of elegant photographs depicting elders in various activities. One photo represents a woman celebrating her 100th birthday.
Mrs. B: How old is
SS: One hundred!
Mrs. B: How do you know this?
SS:
Mrs. B: How many
SS: [mix of “one hundred” and “one”]
Mrs. B: Ha! [
SS:
Mrs. B: Right. She is . . .
SS:
Mrs. B: Right. She is
SS: [most attempt]
Mrs. B: You are awesome. Say
Mrs. B: [
SS:
Mrs. B establishes common ground (Figure 2) by continually pointing to the photograph on the screen, thus achieving the first in the two-step referring process. Students chorally indicate mutual comprehension (Figures 1 and 2), thus satisfying the requirements of the second step and signaling that the conversation can now continue. Mrs. B points to each image on the screen in turn and the text of the ideas the class had earlier generated: good health, good food, exercise, being with family, being outside in nature, and having a youthful spirit. “Some of you put a hundred, right [pointing to whiteboard timeline],” Mrs. B says. “So we’ll be talking about people who live to that stage, we’ll be talking about how you still have bad habits and still have a long life [pointing to elderly smoker], she is still smoking and celebrating.” Students chime in, “One hundred!”—a clear signal for the conversation to continue.

Figure 2. A model of multimodal language/content instructional conversations.
In Mrs. B’s classroom, the instructional/conversational goal of mutual comprehension is readily achieved due to the centrality of the item, its referents, and its role in the conversation. Signaling comprehension is part of the conversational contract and the established routines that guide interaction. Other integral components of the multimodal referring process that lead to the instructional success of these conversations are
the immediacy, salience, and attractiveness of publisher- and teacher-generated images, and
mutual investment in successful conversation, that is, shared goals of science and language learning.
The predominant pattern is a two-step referring process represented in Figure 1.
Mrs. B and her students collaboratively construct shared meanings through gradual refinement of ambiguous, partial meanings while mapping the target language onto the natural world. Multimodal referents serve as anchors and sources of meaning making throughout. This two-step pattern of her instructional choreography leads to language/meaning convergence.
Reaching mutual agreement regarding what one is conversing about is a conversational requirement and is inherently collaborative, a key feature of Mrs. B’s classroom and one she nurtures for the learning outcomes and adolescent development it affords. Additionally, these interactional sequences
sustain common ground initially established when Mrs. B activates her students’ prior knowledge and sparks their interest,
adhere to the principle of least collaborative effort (Clark & Wilkes-Gibbs, 1986),
conversationally invite students to indicate that they are successfully co-referencing and thus participating,
indicate that uncertainty is tolerated,
promote mutual acceptance,
communicate that all instructional activity requires collaborative effort, and
signal that new conversational content (the newly learned language of science) will continue to be productively used in speech and writing and encountered in unit readings.
Mrs. B’s classroom conversations are primarily felicitous; that is, they adhere to unspoken contracts between interlocutors that ensure all are heard, attended to, and respected and that they enjoy themselves. Indeed, the atmosphere in Mrs. B’s classroom can best be described as joyful. There is ample gaiety around turn taking, transitions between activities, and the instructional conversations in which students enthusiastically engage. Students know the class routines well and enjoy the socializing aspect. The affective groundwork is thereby established for productive, authentic language/content mastery. The participation frameworks Mrs. B orchestrates integrate a “mix of semiotic fields” (Goodwin, 2000, p. 1517) to structure and support these communicative instructional conversations. Word meanings get interactionally co-constructed with an eye on students’ current level, potential background hooks, and immediate contextual multimodal supports.
As a class activity, 24 of Mrs. B’s students were asked to rate the importance of her multimodal teaching strategies along with their preferred ways to learn new language (Appendix C). Figure 3 shows that students prefer their teacher saying words and their own reading and repeating of the target vocabulary, preferences that underscore the centrality of both the aural/textual/oral and social dimensions of language acquisition. This was echoed in students’ responses to the open-ended questions, where activity that involved interaction with others (their teacher, partners, family members) was reported as their favored way to study English. Not surprisingly, when students rated the importance of their individual learning strategies, the importance of images was comparably high (Figure 4). In their open-ended responses, one third mentioned computer apps and videos as important while another third wrote that they preferred reading and saying new words.

Figure 3. Student ratings of teaching strategies.

Figure 4. Student ratings of their learning strategies.
A Model of Multimodal Language/Content Instructional Conversations
From our extensive and intensive investigation of Mrs. B’s choreography of instructional conversations with multimodal referents, we see an emerging model of multimodally supported instructional conversations, a model that aims to capture this teaching practice with all of its nuanced, interdependent components. In Mrs. B’s own words,
The vocabulary dictates the approach for teaching it. For some words, it is enough for students to see an image to be able to understand the meaning (ex. mobile home). Other words require image and explanation that sometimes means simplification or expansion (I notice that I use gestures and dramatic movements or facial expressions frequently when I explain vocabulary). Some words call for pointing out morphemes, so that students can figure out the meaning based on morphological analysis or etymology (eg. defenseless, indivisible). Other words are best understood through semantic mapping or semantic feature analysis. (Questionnaire 2, Mrs. B)
The calculus she employs to determine optimal approaches for individual lexical items drives her orchestration of the multimodal instructional conversations that make the new item accessible to ELs and, per her students’ survey responses (Figures 3 and 4), represent optimal language/content teaching and learning strategies. The overall recurring pattern of these conversations is represented in Figure 2 and responds directly to our overarching research question: What multimodal-supported teaching patterns lead to language/content acquisition opportunities in a sheltered high school biology class for ELs?
Figure 2 represents the conceptual integration of elements that constitute the multimodally supported instructional conversations that are the heart of Mrs. B’s practice. Reading left to right, the figure first designates the forms of digital and analog resources that are referred to throughout these conversations. One or more of these play a central role in establishing mutualities as Mrs. B makes use of them to track topics, focus, attract/steer attention, anchor, illustrate, and elicit student responses. Students’ indications of comprehension and uptake push the conversation forward (Figure 1), at which point Mrs. B orchestrates further opportunities to comprehend and use the new language/concept by repeating the multimodal instructional conversation routine. Clearly, the central and most important element in our emerging model is Mrs. B’s mediation (Figure 2). Indeed, her students rated strategies that involve communication with others as most important, underscoring students’ responsiveness to human mediation (Figures 3 and 4).
Digital Versus Analog Modalities
In Mrs. B’s biology class, multimedia materials are consistently employed to illustrate and conversationally anchor focal content. Digital materials, typically projected to the whole class, and also accessed on individual laptops and phones, are in many respects “supervisuals” in that temporal change can be represented along with much more than the naked eye can see (Alac, 2011), and Mrs. B is quick to exploit this dynamism. It is these and other objects in the classroom that she refers to and speaks about when engaging her students in instructional conversations about biology. Advantages of the digital in terms of size/publicness and malleability are significant and make referring much more seamless than when restricted to what appears on paper. Widely considered a “best practice,” teaching with interactive digital projection has been shown to be effective not only in making classroom discourse more dialogic (Kennewell & Beauchamp, 2007; Kennewell, Tanner, Jones, & Beauchamp, 2008) but also in improving ELs’ performance on standardized tests in content areas like mathematics (López, 2010). However, the constancy of wall charts, posters, and other analog information cannot be overlooked in the context of referring. The instability of digital resources, considered in other contexts as a positive feature as it provoked active, authentic student involvement (Meskill et al., 1999b), was bemoaned by Mrs. B, who expressed frustration when the digital resources do not go her way. She nonetheless praised digital technology for its attractiveness to students, the breadth and richness of resources she can have at her fingertips, and its malleability, which she exploits to great effect, for example, swiping what the class is referring to into the foreground, grouping, coloring, marking, enlarging, minimizing, and the like. Indeed, when asked what aspects of technology she most prized in her teaching, Mrs. B provided the following list:
Combining text, sound (audio of that text), and images that support the text.
Having a glossary (even better with translations in various languages)
Having models of a person working through a task (ex. annotating)
Providing immediate feedback (students can see the tasks they completed correctly and the mistakes they made; mistakes are linked to relevant rules or guidance)
Including different levels of complexity or difficulty of material or task (students need to attain certain mastery before moving to the next level)
Varying methods of assessments and tasks (eg. multiple choice, fill in the blanks, matching, short response, annotations etc). (Questionnaire 2, Mrs. B)
With classroom processes grounded in mutuality, Mrs. B employs any number of mediational moves in conjunction with multimodal referents to achieve the goals of student comprehension and competence in using the new word/concept in both spoken and written form in contextually appropriate ways.
We’ve had many conversations and laughs about this [teacher gestures] in our department—the habit of making the language accessible becomes part of us. The goal is to provide students with many entry points to be able to understand and remember the content: hear it, say it, see an image, use gestures etc. (Questionnaire 2, Mrs. B)
She uses digital elements as multimodal referents to illustrate, anchor, and focus talk in interaction as well as to attract and steer attention, elicit reactions, and/or track and reopen abandoned topics. Joint attention, moreover, makes available a great deal of information about objects by establishing reference and intention. Indeed, Yang and Walker (2015) make a strong case for multimodal referents, as these arouse student interest, allow for freely switching between languages as needed, and facilitate adaptive remedial instruction. They argue that the greatest promise for classroom technology is in providing new ways for teachers to interact with their students. The case of Mrs. B’s sheltered biology class for ELs is an exemplary response to this promise.
Conclusion
At the secondary level, the complexity of academic content increases, as do the demands for the language and literacy skills required for success with that content (Carrasquillo, Kucer, & Abrams, 2004). Around the United States, newcomers attend schools in large numbers and, like their U.S.-born counterparts, attend age-appropriate classes sometimes with the support of extra ESOL classes, sometimes with tutors and/or translators, and sometimes with nothing but their own will to master school content through sheer tenacity. Although the challenges for high school ELs are many, thoughtful and well-trained educators make a difference. As we illustrate through the case of Mrs. B’s biology class, instructional supports for language and content learning, especially those supported by multimodal instructional conversations, are viable and productive, and they turn such challenges into opportunities. Instructional conversations that render new, complex science content accessible and comprehensible for diverse learners, and that employ multimodal referents in the process, constitute a teaching model worth further exploration. They are also an important, heretofore absent feature in considering roles for classroom technologies.
The young people in Mrs. B’s biology class are particularly fortunate to have a teacher exquisitely talented at teaching language through content and content through language via solidly choreographed, multimodal-supported instructional conversations. Strong cases have been made that teaching and learning can truly be understood only via analysis of classroom interaction (Cazden, 2001; Seedhouse & Walsh, 2010). As regards teacher professional development generally, and what is specific to supporting ELs, models are critical tools and much needed. Mrs. B’s sheltered biology class is well poised to serve in this regard.
Appendix A
Appendix B
Appendix C
Acknowledgements
This study was supported in part by the U.S. Department of Education Office of English Language Acquisition (OELA) Program Grant No. T365Z120266.
Authors
CARLA MESKILL is a professor of educational theory and practice at the State University of New York, Albany. Her research explores language, technology, and new forms of teaching and learning at their intersections.
JENNIFER NILSEN is a doctoral student in the Department of Educational Theory and Practice at the State University of New York, Albany. Her research interests are in instructional improvements through the study of classroom discourse.
ALAN OLIVEIRA is an associate professor in the Department of Educational Theory and Practice at the State University of New York, Albany. His research interests include cooperative science learning, inquiry-based teaching, and classroom discourse and language use.
