Abstract
Mid-adolescence is a period of considerable potential growth in the language for academic writing. Yet, to date, few writing studies explore language development during this period and even fewer focus on longitudinal or diverse samples. In this study, we examined the development of language skills for academic writing in a socio-economically diverse sample followed from sixth to seventh grade (n = 124). In each grade, participants wrote summaries of a science text. Subsequently, summaries were scored for writing quality (WQ) and analyzed for productive language skills (lexico-syntactic and discourse features). Participants completed a receptive academic language assessment and a test that measured reading comprehension of the source text. First, we examined if WQ or productive language skills changed over time. Next, we tested if Grade 6 productive and receptive language skills predicted Grade 7 WQ. Results revealed syntactic growth over time. Grade 6 use of connectives and receptive language skills emerged as predictors of Grade 7 WQ.
Becoming a skilled writer is increasingly important inside and outside of school because it enables participation in today’s information-based society, where comprehending and communicating information in writing is commonplace (National Assessment of Educational Progress [NAEP], 2011; Schleicher, 2010). Writing to explain, that is, writing to communicate information that supports others’ understanding of a topic, represents the most frequently assigned communicative purpose for writing at school (Graham & Perin, 2007; NAEP, 2011). Throughout middle school and beyond, writing-to-explain tasks include, among others, summaries of explanatory texts, which usually function as study aids, as assessment tasks, or as components of larger reports or essays that require recounting information gleaned from other texts (Gelati, Galvan, & Boscolo, 2014). Particularly in science classes, students are often asked to summarize texts that explain conceptually complex scientific ideas or phenomena relevant to a lesson’s learning goal. Typically, however, summary tasks are used for students to learn or demonstrate content understanding, but without considering that students’ language skills may also play a role in their comprehension of the source text and their production of a written summary. This consideration is especially critical in the transition from the upper elementary years into the early middle school grades. The shift in curricular content and standards at the beginning of middle school suddenly requires students to engage in learning tasks that demand higher mastery of language, reading, and writing practices, a challenge for which not all students may be prepared (Christie, 2012). Whereas writing researchers are likely to be keenly aware of the linguistic challenges posed by advanced literacy tasks, U.S. educational practitioners and researchers are rarely presented with illustrative evidence to understand and act upon the diverse language needs crucial to support all public-school students’ content-area learning in this transition and beyond.
Reflective of today’s societal needs, recent educational standards aim to prepare students to be skillful writers of explanatory texts at earlier grades than before (e.g., Common Core State Standards Initiative [CCSSI], 2010; Next Generation Science Standards Lead States, 2013). Recent standards, as well as the “language facility” skills in the NAEP (2011) Writing Framework, emphasize the mastery of precise word choice, sentence structure, logical connectors, and linguistic expression of writers’ voice starting in the upper elementary grades. We contend that achieving this aim calls not only for innovative pedagogies but also for research that can inform evidence-based instruction. The need for research is urgent given that the challenges ahead are not minor. By the end of middle school, U.S. students’ performances are far behind the expected levels of writing proficiency (which, in the NAEP framework, includes proficiency in writing to explain). NAEP (2011) documented that only 27% of eighth graders performed at or above the expected level of writing proficiency, with students from low socioeconomic backgrounds performing significantly lower than their more privileged peers. The juxtaposition of the high expectations at earlier grades with the reality of low writing achievement and persistent socioeconomic disparities highlights the urgency of better understanding U.S. students’ writing development.
Whereas available studies show that mid-adolescence (i.e., middle school years) is a period of growth in academic writing, most research has focused either on the early years of learning to write or on composition processes during the college years. Comparatively, minimal research has examined writing development during mid-adolescence (Berman, 2009). The few available studies document developmental trends, but most are based on cross-sectional data or exemplary longitudinal case studies, and mostly on homogeneous middle-class samples (Christie & Derewianka, 2008; Nippold & Sun, 2010).
Motivated by the insights and gaps in prior research, in this study, we examine mid-adolescents’ language for academic writing by analyzing summaries of explanatory science texts (henceforth referred to as science summaries) produced by a socioeconomically diverse sample followed longitudinally from sixth to seventh grade. This study was driven by two goals. The first goal was to examine if participants’ science summaries changed over time, either in writing quality (WQ) or in the productive language skills displayed. The second goal was to test if participants’ receptive and productive language skills in Grade 6 (G6) predicted participants’ summary WQ in Grade 7 (G7). Our research aims to advance the understanding of developmental patterns and individual differences in samples that resemble the diversity of U.S. schools, with the ultimate goal of informing research-based pedagogies attuned to the language needs of today’s students.
Background
Language Skills in Psychological Models of Writing
The language skills that support skilled text generation have been consistently present in psychological models of writing development (Alamargot & Fayol, 2009; Berninger, Abbott, Abbott, Graham, & Richards, 2002; Hayes, 2006). Parallel to the theoretical model of reading comprehension known as the “Simple View” (Hoover & Gough, 1990), conceptualizations of writing development distinguish transcription (spelling, letter writing) from ideation (generation and organization of ideas) (Juel, Griffith, & Gough, 1986). Transcription skills are understood as lower level cognitive skills and linguistic knowledge as a higher level cognitive skill, acknowledging the role that word-, sentence-, and discourse-level skills play in text generation (Berninger et al., 2002). Linguistic knowledge is considered crucial in allowing the writer to encode meaning, or in Alamargot and Fayol’s (2009) terms, to encode “what must be said” (content) into “what can be said” (linguistic knowledge).
Paradoxically, mid-adolescents’ language skills have been understudied in writing research. As stated by Alamargot and Fayol (2009), “writing models remain silent on the questions related to the acquisition of linguistic, syntactic, lexical and textual representations” (p. 29). The prevalent characterization of text generation that continues to dominate the writing development literature is the shift from “knowledge-telling” to “knowledge-transforming” (Bereiter & Scardamalia, 1987). While this research captures an important shift from a focus on content with minimal planning to a focus on conceptual planning that considers audience and text coherence, the language skills that support this progress have not yet been systematically investigated.
Furthermore, while informative, the “simple view of writing” models do not incorporate insights from extensive sociocultural and functional linguistics research. Assuming an “additive-cumulative” perspective, these psychological models understand writing development as a sequential process in which transcription skills take precedence, and language, understood as discrete skills, is considered as playing a significant role only after transcription is mastered. In this view, writing is conceptualized just as spoken language transcribed (Tolchinsky, 2016). This perspective has studied interactions across language modalities (listening, speaking, reading, writing); yet, within modalities, language is understood as a general, one-dimensional proficiency (Abbott, Berninger, & Fayol, 2010; Berninger & Abbott, 2010). In contrast, findings from pragmatics, sociocultural studies, and functional linguistics reveal the multidimensional and context-dependent nature of language learning; illustrate how specific purposes for writing—or speaking—draw on particular constellations of language resources as pragmatic solutions to particular social contexts; and demonstrate how different ways of using oral and written language influence each other throughout development (Halliday, 2004; Heath, 2012). Thus, while psychological models offer a basic conceptualization of the cognitive architecture of writing development, we depart from them and align, instead, with “mutually enhancing-interactive views” informed by qualitative and functional linguistics analyses that understand oral and written language as reciprocal developmental processes (Tolchinsky, 2016). In our work, we understand oral and written language learning as learning particular ways of using language influenced by learners’ prior opportunities to participate in specific oral and written language practices throughout their life (Tolchinsky, 2016; Uccelli, Phillips Galloway, Barr, Meneses, & Dobbs, 2015).
In this study, we use quantitative methodologies and focus on selected language skills, yet we do not align with models in which context is dismissed and learning is understood as an accumulation of discrete skills. Instead, our work is grounded in a pragmatics-based sociocultural view of language learning as context driven and aligned with “mutually enhancing-interactive views” of writing development (Tolchinsky, 2016) and with a usage-based view in which “discourse drives grammar” (Nir & Berman, 2010). We focus on selected skills to test our hypothesis that mid-adolescents, even in the same grade and classroom, exhibit extensive individual differences in their mastery of the particular ways of using language for academic writing. As our examples at the end of this article illustrate, we do not conceptualize skills as discrete or learned in isolation. Instead, we focus on selected school-relevant skills as potential evidence to test our hypothesis, but we view them as proxies for a wider constellation of interrelated skills that developing writers learn concurrently as they participate actively in the oral and written discourses of school.
Linguistic Challenges of Science Summary Writing
Among the multiple factors that influence overall writing proficiency (e.g., working memory, background knowledge, reasoning) (Hayes, 2006; Kellogg, 2008), academic writing requires mastering specific language skills that overlap but at the same time differ predictably from more colloquial uses of language. To succeed at academic writing throughout middle school, developing writers need to progressively master new language forms and functions that enable precise, concise, cohesive, and assertive, yet reflective, communication about abstract and complex ideas (Nagy & Townsend, 2012; Schleppegrell, 2001). Consequently, mid-adolescents need to continue to learn the language resources that support academic reading and writing, but this linguistic learning—and students’ individual differences in language skills—often go unacknowledged in content-area curricula and assessments both for monolingual and bilingual students (Lee, 2018; Schleppegrell, 2004).
Science summaries are prevalent school writing tasks that pose double linguistic challenges. Writers’ language skills play a role, first, receptively in the comprehension of the source text and, then, productively in the composition of the summary. We define a science summary in this study as a one-paragraph-length written task in which the writer paraphrases and sums up the science content of an explanatory source text (Durst & Newell, 1989). Writing a summary entails selecting the most important ideas from the source text and communicating them in a shortened rendition that has a coherent organization of logically linked and precise propositions (Gelati et al., 2014). Different from other types of text responses, in a summary, writers are not expected to go beyond the information of the source text. Instead, writers need to communicate only the information contained in the source text as they repackage and condense its content using their own language (Durst & Newell, 1989; Hidi & Anderson, 1986). Thus, both receptive academic language skills (the language skills that support students in comprehending academic texts) and productive language skills (the language skills that support the writing of academic texts) are closely implicated in the production of a science summary (Phillips Galloway & Uccelli, 2019).
Academic language—or the language of science (Halliday, 2004) or the language of schooling (Schleppegrell, 2001)—refers to a repertoire of language resources used in educational and scientific writing and learning contexts. Even though there is no clear boundary between colloquial and academic language, the predictable prevalence of certain linguistic features typically distinguishes academic writing from more colloquial uses of oral or written language. When put to “good use” (Fairclough, 2008), academic language features in texts constitute pragmatic solutions to the need to convey complex information to distant audiences, without relying on the physical context of the interaction (gestures, pointing, intonation, etc.) to support communication (Halliday, 2004; Snow & Uccelli, 2009). Mastering academic language is relevant also for life outside of school as it underpins access to the public discourses of politics, health, and news media required for effective civic participation (LeVine, LeVine, Schnell-Anzola, Rowe, & Dexter, 2012).
Decades ago, Cummins’s (2008) work raised awareness of the educational consequences associated with bilingual students’ differential mastery of colloquial versus the more challenging academic language skills. Only recently, however, has the field begun to understand the central role academic language learning plays in the academic achievement of all students, including monolingual students (Uccelli, Phillips Galloway, et al., 2015).
Receptive Language Skills for Academic Reading
In spite of the widespread awareness of the relevance of academic language proficiency at school and beyond, this construct remained for a long time narrowly operationalized as academic vocabulary knowledge, with potentially additional sentence- or discourse-level skills only imprecisely delineated (Nagy & Townsend, 2012). Responding to the need for a more comprehensive and precise construct, we proposed the Core Academic Language Skills (CALS) construct and designed a research-based, theoretically grounded and psychometrically robust assessment to measure students’ CALS in Grades 4 through 8, the CALS-Instrument (CALS-I).
CALS refer to a constellation of high-utility language skills that correspond to linguistic features prevalent in academic texts across school content areas and infrequent in colloquial conversations. On the basis of extensive research syntheses that integrated insights from textual linguistics, functional developmental linguistics, and studies of the language demands of U.S. standards and curricula, we generated an initial construct that we tested iteratively through extensive qualitative and quantitative studies over multiple years (detailing the development of the construct and instrument goes beyond the scope of this article; for more information, see Uccelli, Barr, et al., 2015). CALS include the following skill sets:
Unpacking dense information: understanding morpho-syntactically complex words and sentences.
Connecting ideas logically: understanding logical connectives prevalent in academic texts (e.g., consequently).
Tracking participants: understanding expressions that refer to prior participants or themes in academic texts (e.g., Crop rotation means changing the kind of crop planted in a plot of land after one or two harvests. This technique. . .).
Interpreting writers’ viewpoints: understanding markers of a writer’s viewpoint (e.g., certainly. . .).
Understanding metalinguistic vocabulary: understanding terms that refer to reasoning and discussion processes (e.g., hypothesize, paraphrase).
Understanding text organization: understanding non-narrative academic texts’ organization (e.g., argumentative texts).
Recognizing academic register: identifying more academic versus more colloquial language.
Research shows that these skill sets are not learned as discrete skills but concurrently, presumably as learners engage with academic discourse in authentic practices, in which these lexico-syntactic and discourse skills co-occur (Uccelli & Phillips Galloway, 2017; Uccelli, Barr, et al., 2015). CALS are conceptualized as complementary to discipline-specific language skills (Fang, Schleppegrell, & Cox, 2006).
To date, considerable individual variability in mid-adolescents’ CALS has been documented and shown to significantly contribute to reading comprehension (in Grades 4–8) (see Uccelli, Phillips Galloway, et al., 2015). Extending this prior research to writing, we hypothesized that differences in mid-adolescents’ CALS would contribute significantly to the quality of science summaries, through their contribution to both source-text comprehension and their potential direct influence on text generation.
Productive Language Skills for Academic Writing
In contrast to the extensive literature on academic language and reading comprehension, research on mid-adolescents’ development of the productive language skills that support academic writing is still in its infancy (Alamargot & Fayol, 2009; Berman, 2009). Despite this scarcity, the emerging functional textual and developmental linguistics studies already offer relevant findings and promising measures to capture growth in the productive language skills that support writing throughout middle school (Berman, 2004, 2009; Christie & Derewianka, 2008; Schleppegrell, 2004). Different lines of textual linguistics (e.g., systemic functional linguistics, corpus analysis, metadiscourse analysis) reveal an inventory of linguistic features prevalent in experts’ academic writing across disciplines (Biber, 1988; Halliday, 2004; Hyland, 2005; Schleppegrell, 2004). Additionally, the comprehensive cross-sectional and cross-linguistic research program conducted by Berman and Nir-Sagiv (2007) provides quantifiable measures of lexico-syntactic and discourse skills, shown to capture upward developmental trends in expository writing throughout the pre-adolescent and adolescent years. Overall, cross-sectional research reveals that adolescents’ expository writing gradually becomes more lexically precise, syntactically concise, cohesively connected, and detached in stance (as opposed to involved). These studies, however, focus predominantly on middle-class samples and on average developmental trends, mostly inferred from cross-sectional research or exemplary case studies.
Building upon insights from functional linguistics research, in this study, we examine if we can detect growth in the language for academic writing. We focus on a specific task, science summary writing, and on the narrow but important developmental window from sixth to seventh grade. A second and central aim is to investigate the potential contribution of individual differences in the language for academic writing in G6 on the WQ of summaries in G7. Below we briefly review prior research on the productive lexico-syntactic and discourse skills of focus in this study. The selected language skills are not meant as an exhaustive list but are representative of skills shown by prior research to support adolescents’ academic writing.
Lexico-syntactic skills
Lexico-syntactic skills that support precision and conciseness will be analyzed in this study. These refer to specific vocabulary and syntactic skills deployed to serve the unambiguous and succinct communication valued in academic writing. Specifically, academic writing development entails using a broader range of academic vocabulary and packing more information in fewer words through the use of more content words per total words (lexical density) or through increasingly complex syntax. Illustrative of this phenomenon, Berman and colleagues across a series of studies found that in the writings produced by English-speaking participants in four age groups (10, 13, 17, and adults), the expository texts produced by the older students contained, on average, a higher number of abstract and Latinate words (typical characteristics of English academic words), as well as higher lexical density (Bar-Ilan & Berman, 2007; Berman & Nir-Sagiv, 2007). To capture the degree of information packing through syntactic skills, a widely used measure of syntactic complexity is words per clause (Beers & Nagy, 2011; Scott, 2004). A clause is defined as “a unified predicate describing a single situation” (Berman & Slobin, 1994, p. 660). Student writers tend to employ more words per clause with age, particularly in expository texts (Berman & Nir-Sagiv, 2007; Schleppegrell, 2004). More skilled writers also tend to use a higher number of relative and adverbial clauses (Scott, 2004) and more frequently use the passive voice (Reilly, Zamora, & McGivern, 2005). Cross-sectional research shows developmental trends; yet, minimal research follows students over time to test if development of these language skills indeed occurs during mid-adolescence within the same learners (Berman & Nir-Sagiv, 2007; Berman, 2009; Christie & Derewianka, 2008).
Research has also examined if lexico-syntactic skills support WQ. The use of content words has been found to be associated with higher quality in informational texts produced by upper elementary students (Olinghouse & Leaird, 2009), and mean length of clause has been found to contribute to the persuasive WQ of middle schoolers (Beers & Nagy, 2009) and high schoolers (Uccelli, Dobbs, & Scott, 2013). For the most part, research has focused on within-text relations (i.e., features of a text predicting quality of the same text). In this study, we investigate instead if summary-based lexico-syntactic skills in G6 predict summary WQ in G7.
Discourse skills
In this study, we analyze two discourse domains: text connectivity and writer’s stance.
Text connectivity
Text connectivity refers to how relations between ideas are explicitly marked in a text. Some writers begin to use linguistic markers to signal relations between ideas as early as second grade (King & Rentel, 1979), but this development continues well into late adolescence (McCutchen & Perfetti, 1982). Research on connectives (e.g., in contrast, therefore) shows, for instance, that sixth graders’ essays contain fewer connectives than eighth graders’ essays (McCutchen, 2000). The types of connectives employed by young writers also appear to change as a function of grade, with sixth graders employing more basic forms (e.g., and, so, then) than 12th graders, who made use of more academic connectives that provided more precise signposting for readers (e.g., for this reason, consequently) (Crowhurst, 1987).
Recent research found that middle school students’ use of adversative connectives was associated with more complex argumentative essays (Taylor, Lawrence, Connor, & Snow, 2018), and others found that connectives, functioning as both local interclausal links and as discourse markers, are associated with the quality of high schoolers’ persuasive essays (Uccelli et al., 2013). Some researchers have found, however, that throughout high school and college, the use of connectives and their impact on WQ decrease over time, typically as syntactic complexity and lexical intricacy continue to be important predictors (Crossley, Weston, Sullivan, & McNamara, 2011; Crowhurst, 1980; Halliday, 2004). There is still little research on middle school writers’ growth in connectives and their contribution to WQ in academic writing.
Writer’s stance
In contrast to the typical subjective and involved viewpoints of colloquial conversation (e.g., Let me tell you! I think. . .), academic discourse stance is encoded linguistically through a variety of later-acquired forms that express a distanced or detached attitude. Extensive qualitative analyses show that writers move from adopting a more colloquial involved stance to gradually internalizing a more detached perspective in academic writing (Berman, 2004; Christie & Derewianka, 2008; Hyland, 2005; Qin & Uccelli, 2019; Schleppegrell, 2004). This research, combined with findings from high school students (Uccelli et al., 2013), suggests that the type of stance marked linguistically in academic texts is also likely to relate to texts’ quality.
The Present Study
In this study, we examined change over time and individual differences in the language skills that support science summary writing. Two sets of questions guided this study: (RQ1) Do science summaries change over time from sixth to seventh grade in either (a) WQ or (b) productive language skills (i.e., lexico-syntactic or discourse features)? and (RQ2) Are G6 productive language skills or G6 receptive academic language skills, as captured by the CALS-I, associated with G7 science summary WQ?
Method
Participants
Data for this study come from participants who were recruited to participate in the control group of a large project on reading comprehension, Catalyzing Comprehension Through Discussion and Debate, Institute of Education Sciences, grant R305F100026 (Jones et al., in press), that involved collecting writing data from multiple cohorts of students followed longitudinally (Uccelli & Phillips Galloway, 2017). Using a stratified random sampling procedure, we selected a longitudinal sample approximately balanced by gender and socioeconomic status (SES) from the larger reading comprehension project. For the present study, we focused on a smaller analytic sample that includes only participants from a single cohort: those who were in sixth grade in Study Year 1 and in seventh grade in Study Year 2 and had complete data on the summary writing task and the source text reading comprehension test in both years. All participants read and summarized the same source text in G6 and G7. The sample for this study includes a total of 124 participants who attended public schools in two urban districts in the Northeastern United States. The sample was approximately balanced by gender (53% female) and SES (58% eligible for free or reduced-price lunch). The majority of students in this sample were White (56%), followed by Latinx (27%), and smaller proportions of Black/African American (10%), Asian (3%), Native American/Alaskan Islander students (1%), and students of mixed races/ethnicities (2%). English learners (an official U.S. designation for bilingual students identified as needing English services) comprised 15% of the sample.
Measures
Participants were administered the instruments described below as part of their regular school day. Two waves of data were collected using the same set of instruments: the first wave in sixth grade and the second in seventh grade.
Summary writing task: Explanatory science summary
As part of the Global Integrated Scenario-Based Assessment (GISA) computer-based test described below, participants were asked to produce written summaries of a 448-word science source text titled “Organic Farming.” The same task was administered in G6 and G7.
Source text reading comprehension test: The GISA
Developed by Educational Testing Service (ETS), the GISA is a computer-administered assessment designed to capture students’ skills in complex literacy tasks (Sabatini, O’Reilly, Halderman, & Bruce, 2014). Students are prompted to use reading comprehension and source-based writing skills in the service of completing an authentic task (e.g., creating a website). After reading the GISA source text “Organic Farming,” participants answered multiple-choice comprehension questions (including literal, inferential, and information-integration questions). Validation research on the GISA reading comprehension test has yielded adequate psychometric properties, including internal consistency (alpha) reliability (.89) and split-half reliability (.76) (Sabatini et al., 2014). Scaled scores were used in this analysis.
Receptive language skills: CALS-I
The CALS-I is a paper-and-pencil group-administered test for Grades 4 through 8 (Uccelli, Barr, et al., 2015). It measures the CALS construct described above. Two vertically equated CALS-I forms were administered: CALS-I-Form 1 to Grade 6 (α = .90, total items = 49) and CALS-I-Form 2 to Grade 7 (α = .86, total items = 46). Using Rasch item response theory analysis, factor scores were generated for analysis.
Science Summary: Scoring and Automated Language Analysis
Science summaries were scored for WQ (using human raters) and analyzed for productive language features (using automated analysis).
Data preparation
Prior to analysis, all spelling errors were corrected in the summary data files so that human scorers of WQ would not be negatively biased by misspellings and so that the automated linguistic analyses would not be affected by non-relevant orthographic features. Original files with misspellings were also preserved.
Summary WQ
Informed by prior research and by the NAEP (2011) Writing Framework, a researcher-developed holistic Science Summary Writing Quality Rubric was used to score summaries along four dimensions (see the appendix in the supplementary materials) and to subsequently generate a final holistic WQ rating. The following dimensions were assessed:
Organization: the extent to which the summary was coherently organized at the text and paragraph levels.
Accuracy: the extent to which the summary information was accurate in relation to the source text.
Coverage: the extent to which the summary covered the most important information from the source text.
Clarity: the extent to which the summary conveyed information in a precise and unambiguous manner.
To ensure reliable and valid scoring, two tools were generated: (a) a content idea unit map, which presented the source text color coded to indicate the main idea units in the passage, and (b) the minimal summary scheme, which described the main organizational structure of the source text. These tools were generated on the basis of summaries of the source text produced by eight skilled adults. These adults were all graduate students specializing in education-related areas with prior experience as classroom teachers. Human raters were trained to score the science summaries during group sessions in which each summary was scored by two raters and guided by the holistic writing rubric, which included anchor summaries to illustrate each level. Human raters were also graduate students in education, with prior experience as teachers, and were blind to the study questions. On the basis of 20% of scored summaries, a high percent agreement was achieved (96.1%) with a weighted Kappa of .72. A principal components analysis was conducted to examine the structural validity of the rubric’s dimensions. The first principal component weighted all dimensions positively and equally and accounted for 77% of the variance.
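The two agreement statistics reported above can be illustrated with a minimal sketch. The article does not state the weighting scheme used for the weighted Kappa, nor whether percent agreement was exact or adjacent, so the code below assumes exact agreement and linear weights by default; the function names and the rating scale in the usage note are hypothetical.

```python
from collections import Counter


def percent_agreement(rater1, rater2):
    """Share of summaries on which both raters gave exactly the same score."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("need two non-empty rating lists of equal length")
    return sum(a == b for a, b in zip(rater1, rater2)) / len(rater1)


def weighted_kappa(rater1, rater2, categories, power=1):
    """Cohen's weighted kappa for ordered categories.

    power=1 gives linear weights and power=2 quadratic weights.
    Assumption: the study's weighting scheme is not reported.
    """
    k = len(categories)
    if k < 2 or len(rater1) != len(rater2) or not rater1:
        raise ValueError("need at least two categories and matched ratings")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater1)
    observed = Counter((index[a], index[b]) for a, b in zip(rater1, rater2))
    margin1 = Counter(index[a] for a in rater1)
    margin2 = Counter(index[b] for b in rater2)
    disagreement_observed = 0.0
    disagreement_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power
            disagreement_observed += weight * observed[(i, j)] / n
            disagreement_expected += weight * (margin1[i] / n) * (margin2[j] / n)
    if disagreement_expected == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - disagreement_observed / disagreement_expected
```

As a usage illustration with invented scores on a hypothetical 1 to 4 scale, percent_agreement([1, 2, 3, 4, 2, 3], [1, 2, 3, 4, 3, 3]) gives the share of exact matches, and weighted_kappa with categories=[1, 2, 3, 4] corrects that agreement for chance while penalizing larger disagreements more heavily.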
Copied text ratio
Examination of the written science summaries revealed that estimating the amount of textual borrowing or copying from the source text was necessary. We developed an automatic system, using Natural Language Processing tool kits embedded in Python (complemented with manual checking), to quantify the proportion of textual borrowing in the summaries. Specifically, we divided the source text into consecutive 5-grams (five-word bundles). Then, the frequency of these exact 5-grams in each summary was computed, and a ratio of original to copied text was generated per summary (Phillips Galloway & Uccelli, 2019). Science summaries with more than 50% of copied text were excluded from analysis. Copied text ratio was used as a covariate in the regression analysis.
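The 5-gram matching described above can be sketched as follows. This is not the system used in the study. The tokenizer, the rule that a word counts as copied when it falls inside any exact 5-gram shared with the source, and the example sentences are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase word tokens; a deliberately simple, hypothetical tokenizer."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n=5):
    """All consecutive n-word bundles in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def copied_ratio(source, summary, n=5):
    """Proportion of summary words covered by an exact n-gram from the source."""
    source_grams = set(ngrams(tokenize(source), n))
    tokens = tokenize(summary)
    if not tokens:
        return 0.0
    copied = [False] * len(tokens)
    for start, gram in enumerate(ngrams(tokens, n)):
        if gram in source_grams:
            for position in range(start, start + n):
                copied[position] = True
    return sum(copied) / len(tokens)


def keep_for_analysis(source, summary, threshold=0.5):
    """Mirror the exclusion rule: drop summaries with more than 50% copied text."""
    return copied_ratio(source, summary) <= threshold


# Invented example texts.
SOURCE = "Crop rotation is a technique that keeps the soil healthy over time"
SUMMARY = "Crop rotation is a technique farmers like"
RATIO = copied_ratio(SOURCE, SUMMARY)
```

Marking covered word positions, rather than counting matching 5-grams, keeps overlapping matches from being counted twice.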
Summary-Based Productive Language Skills
Using the automated tools TAALES (Kyle & Crossley, 2015), TAASSC (Kyle, 2016), CLA (Kyle, Crossley, & Kim, 2015), and CLAN (MacWhinney, 2011), the following measures were generated:
Lexico-syntactic measures
Academic words: a normed count of words from the Academic Word List (AWL) corpus (Coxhead, 2000). This count reflects the use of words from the 570 word families that are most common in written academic corpora but infrequent in non-academic corpora (e.g., analyze, technique).
Lexical density: the number of content words divided by the number of total words (Berman & Nir-Sagiv, 2007).
Mean length of clause: the number of words divided by the number of clauses.
Passive nominal subjects: the use of passive nominal subjects per clause (e.g., these nutrients are taken with the plant).
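To make the two ratio definitions above concrete, here is a minimal sketch of lexical density and mean length of clause. It does not reproduce TAALES, TAASSC, or CLAN. The part-of-speech tags are assigned by hand, and treating nouns, verbs, adjectives, and adverbs as the content-word classes is an assumption.

```python
# Assumed content-word classes; the article does not list them.
CONTENT_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}


def lexical_density(tagged_tokens):
    """Content words divided by total words, given (word, tag) pairs."""
    if not tagged_tokens:
        return 0.0
    content = sum(1 for _, tag in tagged_tokens if tag in CONTENT_TAGS)
    return content / len(tagged_tokens)


def mean_length_of_clause(n_words, n_clauses):
    """Total words divided by total clauses."""
    if n_clauses == 0:
        raise ValueError("a text with no clauses has no mean clause length")
    return n_words / n_clauses


# The example clause quoted above, tagged by hand for illustration.
CLAUSE = [
    ("these", "DET"), ("nutrients", "NOUN"), ("are", "AUX"), ("taken", "VERB"),
    ("with", "ADP"), ("the", "DET"), ("plant", "NOUN"),
]
DENSITY = lexical_density(CLAUSE)               # 3 content words out of 7
CLAUSE_LENGTH = mean_length_of_clause(len(CLAUSE), 1)
```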
Discourse measures
For text connectivity and use of connectives, automated total frequencies of connectives per summary were generated first. Subsequently, human verification of the connectives’ function was facilitated by the AntConc tool kit (Version 3.5.6; Anthony, 2018). After human verification, the diversity of connectives (that is, the number of distinct connectives over total words per summary) was generated.
Writer’s stance: self-mention, an index of writers’ involved stance, was calculated as the ratio of self-mentioning pronouns (i.e., first-person references: I, me, we) per 100 words.
Given that science summaries are expected to have a detached stance, we expected the self-mention ratio to be negatively related to WQ; all other measures were anticipated to be positively related to WQ.
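The two discourse measures can be sketched as below. The connective list is a short hypothetical one, the human verification step is not reproduced, and the example text is invented. Only the two formulas (distinct connectives over total words, and self-mentions per 100 words) come from the description above.

```python
import re

# Hypothetical connective list; the study's actual list is not given here.
CONNECTIVES = ["such as", "while", "so that", "in addition", "because", "however"]
SELF_MENTIONS = {"i", "me", "we"}


def tokenize(text):
    """Lowercase word tokens; a simple, hypothetical tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def connective_diversity(text, connectives=CONNECTIVES):
    """Number of distinct connectives divided by total words."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    padded = " " + " ".join(tokens) + " "  # lets multi-word connectives match
    distinct = {c for c in connectives if f" {c} " in padded}
    return len(distinct) / len(tokens)


def self_mentions_per_100_words(text):
    """First-person references per 100 words."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return 100 * sum(token in SELF_MENTIONS for token in tokens) / len(tokens)


# Invented 18-word example with three distinct connectives and two self-mentions.
TEXT = ("Farmers rotate crops so that the soil stays healthy. "
        "In addition, we plant beans because I like them.")
DIVERSITY = connective_diversity(TEXT)
SELF_MENTION = self_mentions_per_100_words(TEXT)
```

An automated string match like this one cannot tell whether a word such as "while" is actually functioning as a connective, which is the gap the human verification step is meant to close.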
Results
RQ1a: Does Science Summary WQ Change Over Time?
Table 1 reports the descriptive statistics for science summary overall WQ and dimension-specific WQ scores, by grade. Pairwise t tests were conducted to compare WQ mean scores over time. Results revealed that WQ of participants’ summaries did not change from G6 to G7. Even though means in Table 1 show an upward trend in overall WQ, as well as higher means for text organization, coverage, accuracy, and clarity from G6 to G7, mean differences by grade were not large enough to be statistically significant.
Table 1. Descriptive Statistics for Science Summaries’ Writing Quality and Productive Linguistic Skills by Grade (n = 124).
Note. Summaries with more than 50% copied text were not included in the estimation of means and SD, and were excluded from later analyses.
RQ1b: Do Summary-Based Productive Language Skills Change Over Time?
Similar to the results for WQ, pairwise t tests showed that overall lexico-syntactic and discourse skills did not change significantly from G6 to G7. Only one syntax measure showed significant change. Mean length of clause grew significantly, from a mean of 8.17 words per clause in G6 to a mean of 8.78 words per clause in G7 (t = −2.3359, p < .05). No statistically significant differences between G6 and G7 were found for any of the other lexico-syntactic (academic words, lexical density, passive nominal subjects) or discourse measures (diversity of connectives, self-mention ratio).
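As a sketch of the comparison reported above, and assuming it is a paired-samples t test on each student's G6 and G7 values, the statistic can be computed as follows. The data are invented. Differences are taken as G6 minus G7, so growth yields a negative t, in line with the sign of the reported value.

```python
import math
import statistics


def paired_t(grade6, grade7):
    """Return (t, degrees of freedom) for paired scores, using G6 minus G7."""
    diffs = [g6 - g7 for g6, g7 in zip(grade6, grade7)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    t_value = mean_diff / (sd_diff / math.sqrt(n))
    return t_value, n - 1


# Invented mean-length-of-clause values for four students.
G6_MLC = [8.0, 7.5, 9.0, 8.2]
G7_MLC = [8.5, 8.5, 9.2, 9.0]
T_VALUE, DF = paired_t(G6_MLC, G7_MLC)
```

Obtaining a p value additionally requires the t distribution with the returned degrees of freedom, which a statistics package would supply.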
RQ2: Do G6 Productive or Receptive Language Skills Predict G7 WQ?
To answer RQ2, we first explored WQ correlations with productive and receptive language skills within and across both grades. Table 2 reveals that whereas all productive and receptive language skills were significantly associated with WQ in G6 (this was consistent in G7, even though these estimates are not reported in Table 2), only G6 diversity of connectives and receptive language skills (CALS) were positively associated with WQ in G7. In other words, within-grade analyses showed that summaries evaluated to be of higher quality in G6 (and in G7) were, on average, lexically and syntactically more sophisticated, as evidenced by a higher percentage of academic words, higher lexical density, longer clauses, and more use of passive subjects; they were also more aligned with the discourse expectations of academic writing, as evidenced by a higher diversity of connectives to logically connect ideas and fewer self-mentions. However, only two G6 measures emerged as significant predictors of summary WQ 1 year later.
Table 2. Zero-Order Correlations Between Grade 6 Productive Language Skills and Receptive Academic Language Skills (CALS) and Grade 6 and Grade 7 Science Summary Writing Quality.
Note. CALS = Core Academic Language Skills.
*p < .05. **p < .01. ***p < .001.
Guided by correlation results, we examined G6 diversity of connectives and CALS as predictors of summary WQ in G7 (Table 3). Guided by theory-based and empirically grounded predictions, our baseline control model included participants’ school district and sociodemographic characteristics (SES and English-learner designation). Although none of these variables were significant, we kept them as controls in subsequent models to account for the variation in participants’ sociodemographic characteristics. In Model 2, we added two text-based covariates: source text reading comprehension and copy ratio. Source text reading comprehension significantly predicted G7 WQ, after controlling for sociodemographic characteristics and copy ratio. Even though copy ratio was not a significant predictor in Model 2, we kept it in subsequent models to account for any minor fragments of copied text in participants’ summaries. In Model 3, we added our first question predictor: diversity of connectives. Confirming our hypothesis of the contribution of productive language skills to WQ over time, G6 diversity of connectives significantly contributed to G7 WQ, even after controlling for sociodemographic background and text-based covariates. Our final model, Model 4, included both question predictors and showed that both G6 diversity of connectives in summaries and G6 CALS were predictive of G7 WQ, controlling for sociodemographic characteristics and text-based covariates. The model suggests that, as anticipated, participants’ earlier receptive academic language skills contributed to their later written production. This final model accounted for about 31% of the variance in G7 WQ.
Table 3. A Series of Multiple Regression Models Predicting Grade 7 Writing Quality From Grade 6 Productive Language Skills (Diversity of Connectives) and Grade 6 Receptive Language Skills (CALS), Controlling for Sociodemographic Characteristics, Source Text Comprehension, and Ratio of Copied Text (n = 77).
Note. CALS = Core Academic Language Skills.
*p < .05. **p < .01. ***p < .001.
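The logic of entering predictors in successive blocks and tracking the variance explained can be sketched as follows. This is a simplified illustration, not the models reported in Table 3. The data are invented, the sociodemographic controls and copy ratio are omitted, and ordinary least squares is solved with a small hand-written routine so that the example has no dependencies.

```python
def ols_r2(predictors, outcome):
    """R-squared of a least-squares fit of outcome on predictors plus intercept."""
    rows = [[1.0] + [float(v) for v in row] for row in predictors]
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * y for r, y in zip(rows, outcome))] for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, p):
            factor = m[k][col] / m[col][col]
            for j in range(col, p + 1):
                m[k][j] -= factor * m[col][j]
    # Back substitution for the coefficients.
    coefs = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(m[i][j] * coefs[j] for j in range(i + 1, p))
        coefs[i] = (m[i][p] - tail) / m[i][i]
    mean_y = sum(outcome) / len(outcome)
    ss_total = sum((y - mean_y) ** 2 for y in outcome)
    ss_resid = sum((y - sum(b * x for b, x in zip(coefs, r))) ** 2
                   for r, y in zip(rows, outcome))
    return 1.0 - ss_resid / ss_total


def nested_r2(data, blocks):
    """R-squared after each block of predictor columns is added in turn."""
    outcome = [row[-1] for row in data]
    columns, results = [], []
    for block in blocks:
        columns = columns + block
        design = [[row[c] for c in columns] for row in data]
        results.append(ols_r2(design, outcome))
    return results


# Invented rows: (comprehension, connective diversity, CALS, G7 WQ).
DATA = [
    (3, 0.02, 40, 2), (5, 0.03, 55, 3), (4, 0.05, 60, 3), (6, 0.04, 70, 4),
    (7, 0.06, 80, 5), (2, 0.01, 35, 1), (5, 0.05, 65, 4), (6, 0.03, 50, 3),
]
# Step 1: comprehension; step 2: add connectives; step 3: add CALS.
R2_STEPS = nested_r2(DATA, [[0], [1], [2]])
```

Because each model contains all predictors of the one before it, R-squared cannot decrease from step to step. The substantive question is whether each increase is statistically significant, which this sketch does not test.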
Illustrative Examples
In this section, we illustrate our results by briefly analyzing two science summaries (see Figure 1) scored at both ends of the WQ continuum. Both summaries were produced in G6 by students with similar sociodemographic backgrounds: both girls, White, not designated as English learners, and not eligible for free or reduced-price lunch. In G7, each received the same WQ score they had received in G6. This reflects this sample’s overall trend: Participants maintained their relative ranking over time, scoring at generally the same level in both years. In the examples presented below, Summary A was evaluated as high quality by raters (WQ = 5), while Summary B received a low quality score (WQ = 2).

Figure 1. Science summaries scored at both ends of the WQ continuum.
Summary A displays a coherent organization and contains the most important information from the source text paraphrased accurately using the writer’s own language, rather than copied language. Student A also includes definitions of key terms (It is a process. . . Crop rotation is a technique) presented in the source text, an important component in fulfilling the communicative purpose of this writing-to-explain summarization task. Illustrative of our quantitative results, one salient feature of Summary A is the diversity of connectives used to explicitly signal how ideas are connected across and within sentences (such as, while, so that, in addition). Interestingly, this skilled use of connectives co-occurs in this summary with additional productive language skills that support text connectivity and cohesion, such as the use of conceptual anaphora (i.e., this technique). At the lexico-syntactic level, we observe the use of academic words (technique, conventional) and complex syntactic skills, as illustrated by embedded relative clauses (called organic farmers) and passive nominals (when the plants are harvested, these nutrients are taken with the plant). Finally, according to the expectations of this academic writing task, the writer’s stance is detached, with no self-mentions. It is worth highlighting that these productive language skills correspond to the receptive skill sets tested by the CALS-I (e.g., connecting ideas, tracking participants, unpacking complex sentences), which may suggest the concurrent development of the receptive and productive language skills that support academic writing. Also, as a clear illustration of our findings, Student A scored at the 99th percentile in the CALS-I, whereas Student B scored at the 48th percentile of the CALS-I norming sample.
Summary B lacks a clear organization, misses important information expected to be covered in a summary, and reveals some gaps in understanding the source text (e.g., the student misunderstood healthy as related to human health, not to soil health), which is at least partially explained by her low CALS-I performance. This led to the inclusion of some non-relevant details from outside the source text (chemicals. . .aren’t good for our body). Even though the summary demonstrates partial understanding of the source text, definitions of key terms are missing. In comparison to Summary A, the relations between ideas in Summary B are not always clearly marked by precise connectives; this writer uses the connective “and” five times in her short text. This summary also makes use of more colloquial expressions not conventionally expected in academic writing. Whereas more colloquial expressions certainly could enhance skillful academic writing, in this summary, they do not contribute to increasing precision in communication (it’s kind of like a pattern).
Finally, uncharacteristic of academic summary conventions, summaries rated as lower quality tended to contain a more involved writer’s stance, marked by self-mentions. Self-mentions were typically combined with non-relevant, often narrative-like, information from outside the source text. As an additional example, Student C ended her summary in the following way: In September I see lots of apples grown over and over again. My parents always grow watermelon but this year it did not grow as much as it did last year. Across the corpus, self-mentions appeared in 13% of the summaries in G6 and in 9% in G7.
In sum, skilled writers displayed the skills to succeed in this task, but they also seemed aware of the expectations of academic writing-to-explain summary tasks. In the same school district, often in the same school and classroom, less skilled summary writers seem to have learned fewer productive language resources to support academic writing and tend to also have low receptive language skills that compromise text understanding and presumably contribute to their lack of awareness of academic writing expectations. For educators, this analysis only offers an illustration of patterns and perhaps suggests that diversity of connectives is a proxy for a wider range of productive skills and awareness of the academic science summary expectations.
Discussion
In this study, we analyzed 248 science summaries produced in sixth grade and then, a year later, in seventh grade, by 124 students. Participants were also administered a reading comprehension test and a receptive academic language assessment. Our goals were to test if science summaries changed in quality or linguistic features from G6 to G7 and to examine whether G6 productive or receptive language skills predicted G7 science summary quality. We found that although science summary quality did not improve significantly after 1 year, one syntactic measure, mean length of clause, showed significant growth. We also found that G6 connective diversity and G6 CALS significantly predicted G7 WQ. Findings highlight that mid-adolescence is a period of continued language development and, especially, of considerable individual differences. Confirming cross-sectional research (McCutchen, 2000; Taylor et al., 2018) and textual analysis of exemplary longitudinal cases (Christie, 2012), our findings highlight that the ability to use connectives to create cohesive ties between ideas is particularly relevant to support academic writing in the middle school years. Furthermore, our findings suggest that this ability to use connectives is mastered at considerably different levels among sixth graders who might even be peers in the same class. These findings echo results from reading comprehension that document the contribution of individuals’ knowledge of connectives to their levels of text understanding (Crosson & Lesaux, 2013). Even though small in scope, this study contributes to advancing prior research in a number of ways.
First, whereas developmental research has been conducted mostly with cross-sectional samples from homogeneous middle-class backgrounds, our study focuses on a socioeconomically diverse sample drawn from metropolitan areas in the United States. Our results partially corroborate prior findings with cross-sectional samples in documenting that productive syntactic skills, specifically mean length of clause, continue to develop during mid-adolescence (Berman, 2009; Berman & Nir-Sagiv, 2007). In this short summary, however, we did not find, as others have found for persuasive essays, mean length of clause to be associated with WQ (Beers & Nagy, 2009). Yet, similar to prior studies on persuasive writing, our findings reveal the contribution of discourse skills to the quality of middle graders’ academic texts (Taylor et al., 2018; Uccelli et al., 2013). Future research needs to investigate which linguistic skills may be of particular relevance across a variety of writing tasks.
Second, findings reveal significant individual differences in mid-adolescents’ language skills for academic reading and writing. Minimal growth was detected through the analysis of mean performances. However, regression analyses revealed considerable individual differences in receptive and productive language skills in G6, which in turn predicted WQ one year later. These findings suggest that the analysis of mean performances—prevalent in the study of later writing development—may obscure individual differences, which are, in fact, of high relevance for advancing developmental theory and pedagogical practice.
Third, while the linguistic skills examined in this study have been the focus of prior research (Berman, 2009), our findings expand prior investigations by simultaneously examining students’ receptive language skills as captured through external assessment. Studies of productive writing are certainly insightful, yet in any given writing task, a writer’s full repertoire of language skills may not be on display. In this study, we managed this limitation by using a comprehensive test of academic language skills to predict summary WQ. Findings invite further research on the concurrent development of productive and receptive language skills, which ultimately may shed light on how best to scaffold school-relevant language skills through combined reading and writing pedagogical approaches (Barr, Uccelli, & Phillips Galloway, in press).
Even if not surprising, the present findings are highly relevant for beginning to delineate for U.S. educators which language skills require pedagogical attention. Today, educational practitioners and researchers may still be misled by the widespread erroneous assumption that language development is mostly complete by mid-adolescence. If middle school students struggle with writing, general school-relevant language skills are not often considered in designing instruction, especially in content-area teaching. This, we argue, is in part due to the lack of research evidence on the language needs of samples of students that resemble U.S. educators’ classrooms. One of the main motivations of our research is in fact to make visible to educators and researchers the ubiquitous, yet often invisible, linguistic demands of school reading and learning. Confirming prior findings (e.g., Schleppegrell, 2004), but with the relatively new contribution of conducting quantitative analysis on a diverse U.S. urban school sample, our study foregrounds that advancing students’ conceptual understanding needs to be complemented by pedagogical attention to the linguistic resources called upon in the expression, comprehension, and active processing of new learning. Given that the comprehension and expression of more abstract information and ideas requires context-specific language not previously learned by all students, elementary school practices are typically far from sufficient to equip all learners with the linguistic resources that support the critical transition to middle school writing, reading, and learning.
Whereas considerable research focuses on discipline-specific or genre-specific language demands (Fang et al., 2006; Qin & Uccelli, 2016), our research focuses on high-utility language skills relevant across content areas. In light of this study’s findings as well as prior results, we propose that the scaffolding of connectives’ use in authentic discipline-specific tasks be complemented with a common metalanguage and similar scaffolding moves across content areas and grades.
Far from advocating a focus on discrete skills, our research and that of others (Christie & Derewianka, 2008; Schleppegrell, 2004) foreground that the skills that support academic reading and writing are learned as constellations of resources to solve pragmatic demands. Learners tend to internalize these sets of resources concurrently as they participate actively in authentic academic reading and writing tasks, with the appropriate scaffolding. As our examples of students’ summaries illustrate, the mastery of the language for academic writing also develops with the implicit or explicit internalization of the expectations of academic discourse, more broadly, and of specific tasks, in particular. Within this understanding, we call for adding and testing the potential high-leverage pedagogical practice of emphasizing certain high-utility language skills, such as connectives use, across reading and writing tasks, content areas, and grades.
Limitations and Future Research
This study has several limitations. First, it focuses on a relatively small sample and only two waves of data. This is far from an optimal longitudinal design. Not only would a larger sample be desirable for generalizability purposes, but multiple waves of writing data are needed to examine variability in developmental trajectories over extended periods of time. We focused on the transition from G6 to G7 because of the sudden increase in language demands in transitioning to middle school, but it is possible that this was too short a developmental window to detect relevant changes; consequently, our findings cannot be generalized beyond the two grades studied.
The examination of a single writing product per time point, combined with the on-demand nature of the task, also limits our inferences. We cannot assume that findings from this study reflect participants’ writing profiles; rather, they reflect just an individual performance in an on-demand writing task. Furthermore, because we examined a single type of writing task, the science summary, our results cannot be extrapolated to other writing tasks. Research that examines writers’ performances through a wider range of linguistic measures and across a range of writing tasks is needed to advance the understanding of relations between language skills and writing development.
The use of connectives in these science summaries could also benefit from a more in-depth exploration. Further insights into individual differences can be gained through analysis of the conceptual relations expressed by the connectives deployed (e.g., addition, cause and effect), the functions served (i.e., inter-clausal connectors or discourse markers), and the productive skills that co-occurred with differences in connectives’ frequency or diversity.
In conclusion, our study documents syntactic growth in the transition from G6 to G7 and use of connectives and receptive academic language skills as predictors of later WQ. The results highlight the importance of attending to individual differences and of integratively investigating receptive and productive language skills that support academic writing development. This is, however, only a modest contribution in a field in need of longitudinal research on diverse samples to answer many pending questions about the relations of specific linguistic skills and writing development across a range of academic writing tasks throughout adolescence.
Supplemental Material
Supplemental material, OL_SUP_APP_1_Uccelli_(1) for The Role of Language Skills in Mid-Adolescents’ Science Summaries by Paola Uccelli, Ziyun Deng, Emily Phillips Galloway and Wenjuan Qin in Journal of Literacy Research
Footnotes
Acknowledgements
The authors would like to express their gratitude to the research assistants of the Language for Learning team at Harvard Graduate School of Education for their assistance with data cleaning, coding, and scoring. Special thanks to research assistants Young Eun Chan and Erqian Xu, and also to Kaity Kao and Brianne McGee.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research was supported by the Institute of Education Sciences (Grant No. R305A170185; Grant No. R305F100026), U.S. Department of Education. The opinions expressed are those of the authors and do not represent the views of the Institute or the U.S. Department of Education.
Supplemental material
The appendix referenced in this article and abstracts in languages other than English are available online.
References