Abstract
Writing discussion sections of research articles (RAs) is difficult for novice scientists. This study investigates patterns of linguistic characterization in discussion sections of RAs in chemical engineering. A corpus of around 240,000 words was compiled from 213 discussion sections extracted from 20 disciplinary journals. Multi-dimensional (MD) analysis, as proposed by Biber, was used to capture linguistic co-occurrence patterns based on a constellation of features across the collected texts. The MD results show six salient linguistic patterns: (1) Involvement and interactivity; (2) Description versus Narration; (3) Expression of attitude; (4) Informational production; (5) Framing scientific claims; and (6) Expression of denial. Discourse-based interviews were then conducted with eight professional scientists to elicit their perceptions of the MD findings in relation to their reading experience and understanding of established writing conventions. Implications for EAP professionals are proposed regarding explicit instruction that teaches novice writers how to employ stance expressions strategically in academic writing.
Keywords
Introduction
English has become the dominant language in international scientific communication. Recent statistics show that over 95% of high-ranking peer-reviewed international journals in science and engineering are published in English (Lillis & Curry, 2010). The decisive requirement for scientists to survive in modern academia is to achieve publication of research articles in well-established international journals (Hyland, 2016a). However, the dominance of English in academic communication has posed challenges for novice scientists who are unfamiliar with the writing conventions shared by disciplinary communities.
Among various forms of academic written outputs such as technical reports, reviews, and letters, published research articles (RAs) are “a codification of disciplinary knowledge” (Hyland, 2000, p. 64) legitimized by the community gatekeepers as the demonstration of disciplinary research competence. They are the primary means through which disciplinary knowledge is disseminated to an international readership. Research has found that academic writers report the discussion section as being challenging to write in both theses and RAs (Bitchener & Basturkmen, 2006; Flowerdew, 1999; Geng & Wharton, 2016; Hopkins & Dudley-Evans, 1988; Shen et al., 2019). Novice writers may not clearly understand the form and function of this section. They are less aware of how to present their stance on the findings by using metadiscoursal resources. The discussion section is a critical component of RA writing, and a range of topics in this area have been addressed, but existing studies focus primarily on the rhetorical structure following Swales’s (1990) tradition (Basturkmen, 2012; Peacock, 2002; Ruiying & Allison, 2003). Comprehensive linguistic descriptions of the discussion section have received less attention. However, a comprehensive account of language use would have important implications for novice writers, as suggested by Hyland (2000), in helping them understand “why particular features seem to be so useful to writers that they become regular practices” (p. 2). For writing pedagogy, awareness of such linguistic choices can help writing instructors develop appropriate teaching materials for learners who are unfamiliar with linguistic conventions in academic writing.
Statement of the Problem
A number of studies on RA discussion sections have emerged focusing either on analysis of rhetorical functions following Swales’s move-based analysis (e.g., Basturkmen, 2009, 2012; Hopkins & Dudley-Evans, 1988; Peng, 1987; Posteguillo, 1999) or description of the lexico-grammatical resources used in discussion sections (e.g., Geng & Wharton, 2016; Lee & Casal, 2014; Parkinson, 2011). These studies are meaningful in unpacking the linguistic characterization typical of discussion sections. However, several gaps can be identified in this area of inquiry. First, studies directly addressing the genre-specific language used in discussion sections remain scarce. Second, existing studies focus on the examination of a single lexico-grammatical feature in discussion sections. Although these linguistic elements are important in constructing discussion sections, reliance on a single feature cannot reveal discipline-specific language patterns. A comprehensive linguistic description must be validated by examining the relationships between a number of linguistic features within a large number of texts (Conrad, 2001; Csomay, 2015). Third, a growing number of studies have focused on combining corpus-based linguistic descriptions with semi-structured interviews (e.g., Hyland, 2000, 2002; McGrath & Kuteeva, 2012) asking disciplinary informants why they make the linguistic choices they do (Tardy, 2011, as cited in Basturkmen, 2012, p. 143). This inclusion of interviews can enable researchers to “glimpse the social situation and the context in which academic writing occurs” (Harwood, 2006, p. 429). However, relatively few studies have explored specialist informants’ interpretations of the corpus results and their discoursal practices in discussing research findings.
Research Questions
In response to the problems identified above, the present study investigates the rhetorical organization and genre-specific language patterns in discussion sections of RAs in an engineering discipline (specifically chemical engineering in this study). The chemical engineering discipline was selected for two reasons. First, this discipline is of substantial interest to scientists and engineering practitioners working in related engineering fields. Description of linguistic characterizations can be beneficial to engineering scientists and practitioners working in multiple related engineering disciplines. These include the hard-science disciplines (e.g., chemistry), applied sciences (e.g., energy studies and separation technology), and engineering sciences (e.g., environmental engineering and biochemical engineering). Second, it has been suggested that the selection of a single discipline may avoid the possibility of different modes of writing conventions arising from various engineering disciplines (Cominos, 2011).
This study is part of a larger doctoral research project (Jin, 2018b). In this article, the focus is on the multi-dimensional analysis of discussion sections in chemical engineering, followed by interviews with eight chemical engineering specialists to elicit their understanding of the corpus findings in relation to their reading and writing experience. Two research questions are addressed:
What linguistic dimensions characterize RA discussion sections in chemical engineering?
How do professional engineering scientists respond to the linguistic dimensions given their disciplinary writing practices?
Literature Review
Linguistic Analysis of Research Articles
The past several decades have seen a notable reliance on corpora and corpus-based approaches to linguistic analysis to uncover the distribution, forms, and functions of single lexico-grammatical features in discipline-specific published RAs. Examples of such features include adjective-controlled to- and that-clauses (Groom, 2005), verb-controlled that (Parkinson, 2013), multi-word lexical bundles (Gilmore & Millar, 2018), and this/these (Gray, 2010).
Corpus-based studies that address the lexico-grammatical features in discussion sections of engineering RAs have been surprisingly scarce. Parkinson (2011) explored lexico-grammatical features related to knowledge claims through student lab reports and physics RAs. This investigation employed both a clause-by-clause analysis and corpus analysis to identify ways of realizing these meanings. The results reveal the claims demonstrated in the student writing to be more congruent, more emphatic, and less closely argued than in the RA corpus.
Williams (2012) investigated the first-person reference in discussions of biochemical RAs in a bilingual English-Spanish corpus. Quantitative analysis demonstrated that Spanish writers choose both exclusive and mixed inclusive-exclusive perspectives equally, whereas the exclusive perspective is prevalent in the English-language articles. Major linguistic differences between the languages were identified for overall use and for statements of results, comparison of results with prior research findings, and metatext.
Millar et al. (2019) provide quantitative and qualitative descriptions of hyperbolic language in medical articles reporting randomized controlled trials (RCTs). From a small corpus of 24 RCT articles in orthopedic medicine, 161 uses of hype language were identified and categorized by function and linguistic realization. It was found that hype is most frequent in discussion sections and most frequently serves to glamorize the methodology and promote the results.
These corpus-based studies generally start from the construction of the corpus and then employ corpus analyses to study the salient patterns of lexico-grammatical features, usually supplemented by manual analysis for detailed interpretation. However, the entire linguistic profile is difficult to identify by relying on a single linguistic feature (Csomay, 2015). Comprehensive linguistic descriptions of this particular section can be obtained more advantageously by “documenting the relationships across many linguistic features and texts” (Csomay, 2015, p. 3).
Among the features under investigation, metadiscoursal repertoires have been extensively explored. Given the importance of these features to interpret linguistic dimensions informed by multidimensional (MD) analysis, a number of corpus-based studies that capture stance features in full-length RAs are reviewed in somewhat greater detail.
According to Hyland (2000), metadiscourse is defined as “the interpersonal resources used to organize a discourse or the writer’s stance toward either its content or the reader” (p. 109). The definition is based on the assumption that effective academic writing is built not only upon the solid scientific reality arising from experimental results, but also on the projection of a position toward the work under investigation and the awareness to appeal to readers (Abdollahzadeh, 2011; Hyland, 1999; Lee & Deakin, 2016). In line with Hyland (2005b), stance expressions are achieved through hedges, boosters, attitude markers, and self-reference. Strategic mastery of these features for stance presentation is essential in the development of advanced academic literacy (Hyland, 2004, 2005a; Lancaster, 2016; Lee & Deakin, 2016). Metadiscoursal features tend to appear more frequently in humanities and social science disciplines than in science and engineering, as evidenced by Hyland (2000, 2005b) and McGrath and Kuteeva (2012). However, a diachronic study conducted by Hyland and Jiang (2016) suggests that stance-taking expressions in science disciplines have gradually increased over the past 50 years.
The linguistic features typical to the academic written discourse introduced are investigated in isolation. These studies start with the examination of a single linguistic feature or functional linguistic construct (e.g., stance features) that may not provide a complete picture of the academic genre under investigation. In the next section, how a range of features interact with each other and form particular linguistic dimensions in academic texts is reviewed.
Multidimensional Analysis
Multidimensional (MD) analysis developed by Biber (1988) is a multivariate approach that considers linguistic characteristics of texts in a comprehensive way based on a range of linguistic features across texts. In Biber’s (1988) pioneering study, 67 lexico-grammatical features were automatically extracted within the spoken and written corpora. They were then reduced to several interpretable linguistic dimensions in the form of co-occurring linguistic features by factor analysis: (1) involved versus informational production; (2) narrative versus non-narrative concern; (3) explicit versus situation-dependent references; (4) overt expression of persuasion; (5) abstract versus non-abstract information; and (6) online informational elaboration. The theoretical assumption of this approach is that a set of linguistic features in a factor reflects shared communicative functions. These co-occurring linguistic features are then interpreted as dimensions according to “situational, social, and cognitive functions most widely shared by the linguistic feature” (Biber & Conrad, 2001, p. 6). For instance, in spoken discourse, interactivity demonstrates extensive use of first- and second-person pronouns, directives, and imperatives. The MD analytical approach provides a comprehensive identification of the “core structural and functional characteristics of a given genre of discourse” (Friginal et al., 2013, p. 286).
Extensive MD studies following Biber’s (1988) dimensions have been conducted to investigate how academic written or spoken texts fall along the identified linguistic dimensions (e.g., Conrad, 2001; Crosthwaite, 2016). Not restricted to Biber’s (1988) dimensions, some studies have sought to produce new dimensions by conducting new factor analyses replicating or expanding on the linguistic features in Biber’s (1988) study, enabling the specialized discourse domains to be represented in linguistic co-occurrence forms (Biber, 2006; Friginal & Weigle, 2014; Gray, 2011) and thus providing insights into linguistic choices made by writers. Hardy and Römer (2013) proposed four linguistic dimensions in the MICUSP (Michigan Corpus of Upper-level Student Papers) corpus before further exploring the discipline-specific linguistic variations on the identified dimensions. Friginal and Mustafa (2017) explored the linguistic differences between U.S.-based and Iraqi RA abstracts based on Hardy and Römer’s (2013) MDA-informed dimensions, and revealed the potential variation in information packaging, description of procedural discourse, and argumentation. Cao and Xiao (2013) found seven linguistic dimensions underlying RA abstracts in a range of science-related disciplines, and further compared the NS and NNS writers’ linguistic characterizations along each dimension. Gardner et al. (2019) provided comprehensive linguistic descriptions of a wide range of university written assignments mapped across disciplines, levels of study, and sub-genres. They interpreted the dimensions as (i) compressed procedural information versus stance toward the work of others, (ii) personal stance, (iii) possible events versus completed events, and (iv) informational density. Motivated by these studies, the present study uses the MD analytical approach to uncover the linguistic patterns in a corpus of texts; however, it does not follow the established six dimensions proposed by Biber (1988).
It utilizes the MD approach to identify new dimensions that account for linguistic characterizations in chemical engineering RAs. Beyond the statistical analysis, this study invited professional engineering practitioners to comment on the identified linguistic patterns. According to Hyland (2000), discourse-based interviews help to “make explicit the tacit knowledge or strategies that writers or readers bring to the acts of composing or assessing writing, allowing them to interpret meanings, reconstructing writer motivations, and evaluate rhetorical effectiveness” (p. 143), which sheds light on why published writers make such language choices in disciplinary writing. Such interviews have been employed by writing researchers to probe into writers’ rationale for making specific linguistic or rhetorical choices. For instance, Harwood (2005, 2006, 2009) employed discourse-based interviews to examine the use of personal pronouns, citations, and stance features in disciplinary writing. He first conducted corpus analysis to identify language patterns and then interviewed specialists to comment on target language features. In the current study, discourse-based interviews with disciplinary informants were adopted to elicit their perceptions of or reactions to the MD findings, revealing linguistic choices made by published writers in discussion sections.
Methods
The Corpus
A corpus of 213 discussion sections from published chemical engineering RAs was compiled for this investigation. Each RA has more than two authors. Fewer than 50% of the articles had a Chinese first author, and 50% were produced by authors from Anglo-American countries, European countries, and countries such as India, Korea, and Japan. However, a comparison between Chinese and “native” English-speaking authors is not the aim of the study.
Small-scale specialized corpora have been built widely to enable analyses of lexico-grammatical features in academic settings (e.g., Friginal, 2013; Lee & Swales, 2006). As Connor and Upton (2004a, 2004b) identify, “specialized corpora can be used to explore specific types of genres within specific contexts” (pp. 7–8). The investigation of language specificity might not be achieved with corpora built for general purposes.
The articles were selected from 20 SCI-indexed disciplinary journals. The journals ranged from Q1 to Q4 rankings based on the journal impact factor (IF) reported by Thomson Reuters Journal Citation Reports (2015). Review articles and purely theoretical RAs were excluded, as they may have different rhetorical organizations in comparison to empirical RAs (Ozturk, 2007; Ruiying & Allison, 2003). One disciplinary expert in chemical engineering was consulted for journal selections. His recommendations were equally important when deciding which journals to include in the analysis. The structure of the corpus is presented in Table 1.
SCI Journals Included in the Corpus of Discussion Sections.
The total number of words in the corpus was 246,146. Around two-thirds of the RAs did not follow the default IMRD structural patterns. Therefore, RAs were selected if they had a separate discussion section. Integrated Results and Discussion (R&D) sections were not included, as there was no clear boundary between presenting results and discussing them. The rhetorical organization of discussions in integrated R&D is different from those in stand-alone discussion sections (Ruiying & Allison, 2003). From a pedagogical point of view, it is also necessary to select RAs with a single discussion section, as novice scientists who are unfamiliar with the genre of RAs need to clearly understand the typical rhetorical pattern of a stand-alone discussion section first, before choosing to combine this section with the results section or not in actual writing practices (Stoller & Robinson, 2013).
Analytical Procedures
The step-by-step analytical procedures of conducting MD analysis can be found in Friginal and Hardy’s (2014) study. The general steps of conducting MD analysis are introduced briefly here. First, Nini’s (2015) publicly available Multidimensional Analysis Tagger (MAT) software was used to extract and tag a range of linguistic features (see Appendix A) in the corpus. This process replicates the features used in Biber’s (1988) MD analysis. This study follows a strict approach to factor analysis, in which the ratio between the total number of texts and features is around 5:1 (Gorsuch, 1983, as cited in Kanoksilapatham, 2003, p. 128). Therefore, around 40 features (42 in total) were selected corresponding to 213 discussion sections under investigation. Linguistic features sharing similar communicative functions were grouped into one superordinate category (also see Kanoksilapatham, 2003); for instance, wh- and that-controlled relative clauses function as the elaboration on an antecedent and thus were aggregated into relative clause as one single linguistic feature.
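The text-to-feature ratio described above can be checked with simple arithmetic. The sketch below uses the two figures reported in this study (213 texts and 42 features); the variable names and the tolerance used to read "around 5:1" are illustrative assumptions, not part of the original procedure.

```python
# Texts-per-feature ratio for the factor analysis (figures reported in this study).
n_texts = 213      # discussion sections in the corpus
n_features = 42    # linguistic features retained after aggregation

ratio = n_texts / n_features
print(f"{ratio:.2f} texts per feature")

# Illustrative assumption: read "around 5:1" as within 0.5 of the 5:1 guideline.
meets_guideline = abs(ratio - 5.0) <= 0.5
print("Approximately 5:1:", meets_guideline)
```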
The MAT software can automatically identify groups of features at various levels, including the semantic level (e.g., public/private verbs), the lexico-grammatical level (e.g., that-complement clauses controlled by verbs), the word level (e.g., nouns, adjectives), and the syntactic level (e.g., adverbial clauses). The rate of tagging accuracy is well above 90%.
Second, after tagging, Patcount software (Liang & Xiong, 2008) was used to count frequencies of the tagged features automatically. The software is freely downloadable (http://www.bfsu-corpus.org/channels/tools). The tool is designed to count the frequency of a particular linguistic feature.
Third, factor analysis was run in SPSS 22.0 to reduce the range of linguistic variables to several identifiable factors (i.e., dimensions). Before statistical processing, the frequencies of tagged linguistic features were normalized to a rate per 100 words, as two thirds of the discussion sections fell below 1,000 words. The number of factors to extract was determined on the basis of eigenvalues, which is a common way to identify the number of factors (Biber, 1988; Friginal & Hardy, 2014; Gorsuch, 1983). On a scree plot (with eigenvalues on the y-axis and the number of factors on the x-axis), the eigenvalues decrease steadily, and the point on the x-axis at which the slope of the curve changes markedly indicates the number of factors to extract. In this study, six factors were considered the optimal number.
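The normalization step can be illustrated with a minimal sketch. The function name, the feature labels, and the example counts below are hypothetical and serve only to show how a raw frequency is converted to a rate per 100 words; they are not taken from the corpus.

```python
def normalize_per_100(raw_counts, total_words):
    """Convert raw feature counts into rates per 100 words of text."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return {feature: count * 100 / total_words
            for feature, count in raw_counts.items()}

# Hypothetical counts for one 800-word discussion section (not real corpus data).
example_counts = {"possibility_modals": 12, "past_tense": 40, "amplifiers": 2}
rates = normalize_per_100(example_counts, total_words=800)
print(rates)
```

Normalizing in this way makes short and long discussion sections comparable before they enter the factor analysis.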
These six factors account for 39% of the variance in the corpus. A rotated factor analysis was subsequently conducted with Promax rotation, allowing for some correlations among factors. The strongest correlation is between Factors 1 and 4, which is understandable given that both are versions of the same chief distinction between orality and literacy proposed by Biber (1988).
With a rotation, “each linguistic feature tends to load on only one factor, and each factor is characterized by those relatively few features that are most representative of the underlying construct” (Biber, 1988, p. 102). Each linguistic feature is a component of a factor with a corresponding factor loading. Factor loadings represent correlations between features and factors and indicate how much of a feature’s variance is explained by a single factor. A cut-off factor loading should be determined to eliminate those features with a weak correlation to other features. In the majority of MD research, ±0.3 is used as the cut-off value for inclusion in MD analysis (Biber, 1988; Friginal & Weigle, 2014; Gray, 2011). Finally, the interpretation of the co-occurring linguistic patterns in each dimension is based on the functional meaning of each individual linguistic feature (see Biber, 1995, pp. 136–138), which can capture the “substantive nature of the factor” (Friginal & Hardy, 2014, p. 311), and on reference to previous MD research results to identify any similar clustering of features that can help with interpretation.
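Applying the cut-off can be sketched as follows. The loadings in the example are hypothetical and are not taken from this study's tables; the function name and the choice to treat the ±0.3 threshold as inclusive are likewise assumptions made for illustration.

```python
CUTOFF = 0.3  # conventional threshold cited in the text

def salient_features(loadings, cutoff=CUTOFF):
    """Split features into positive and negative poles, keeping |loading| >= cutoff."""
    positive = {f: v for f, v in loadings.items() if v >= cutoff}
    negative = {f: v for f, v in loadings.items() if v <= -cutoff}
    return positive, negative

# Hypothetical loadings for a single factor (illustrative values only).
example = {"present_tense": 0.62, "past_tense": -0.55,
           "nouns": 0.12, "conjuncts": -0.08}
pos, neg = salient_features(example)
print("Positive pole:", pos)
print("Negative pole:", neg)
```

Features whose loadings fall between the two thresholds are dropped from the interpretation of that factor, which is why each dimension is described by relatively few features.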
Semi-Structured Interviews
Interviews have the advantage of eliciting participants’ opinions and feelings about specific events (Denscombe, 2010). In second-language writing research, the qualitative interview is a useful data collection tool to obtain insider accounts from the disciplinary informants’ perspectives, that is, an “emic perspective on academic discourse [. . .], informed by the writer’s accounts instead of the corpus analyst’s account, [. . .] thus enabling the researcher to glimpse the social situation and the context in which academic writing occurs” (Harwood, 2006, pp. 426–429). Eight specialist informants (all associate professors) based in China from the School of Chemical Engineering at JT University (JTU) were interviewed individually and face-to-face in their offices between April and May 2017 (Table 2).
A Profile of Specialist Informants in Chemical Engineering for Interviews.
JT University (a pseudonym for privacy) is a renowned university in China which was established more than a century ago. In 2017, JTU was selected by the Ministry of Education in China as one of the prestigious institutions in the Double First-Class University Plan. The school was among the top 10 chemical engineering schools from 2000 to 2017, according to national annual research assessments. The school pushes academic staff to publish in recognized international journals either for professional advancement or for promotion of the school ranking in the international academic community and research assessments. From 2012 to 2017, 2,070 papers were published in SCI-indexed journals. Because of this, participants with rich international publication experience were recruited for the follow-up interview study.
The participants are all middle-aged associate professors in chemical engineering with strong publication records in international journals (see Table 2). The majority of the researchers obtained their PhD degree in mainland China, except P5, who pursued his doctoral study in Hong Kong. P1 and P3 both completed 1-year academic visits to Hong Kong, and P2 fulfilled a 1-year academic visit in the U.S. They received training during their doctoral and post-doctoral studies on how to write RAs in English, and they help their students write RAs in English during one-to-one supervision or group meetings.
Interviews were conducted with informants in Chinese to facilitate communication. Each informant was interviewed once only, in consideration of their schedules. Audio-recordings were made throughout the interviews with permission. The individual interviews ranged between 30 and 50 minutes, totaling 380 minutes. Each interview opened with some general warm-up questions, such as “What is the role of the RA discussion section in research articles?” and “What do you usually include when writing a discussion section in your own writing experience?”
Following this, discourse-based interviews (Odell et al., 1983) were conducted with the informants prompted by the text extracts selected from MD findings (see Appendix B). I explained the typical linguistic choices (i.e., stance features) in these extracts in an intelligible way to the engineering scientists and asked for their comments. The intended aim was to investigate participants’ responses, interpretations, and evaluations of the results of MD analysis by asking specific questions, such as “Would you like to comment on the use of these sentence segments in discussion sections in your discipline?” All the interviews were fully transcribed and checked against the recordings to ensure accuracy.
Eight interviews were transcribed in Chinese and translated into English. The interview transcripts were stored as Word files and analyzed manually. The interview transcripts were read repeatedly and closely to identify the segments of transcripts that reflected participants’ understanding of the writing process. In the first part of the interviews, where the specialist informants’ general understanding of the discussion sections in chemical engineering was discussed, two themes were identified: (1) the overall significance of discussion sections in chemical engineering RAs and (2) how the informants organize a discussion section. In the second part of the interviews, informants’ interpretations of stance expressions (hedges, boosters, adjective attitude markers, and first-person pronouns) were identified as follows: (1) whether a particular stance was obligatory or not in writing and (2) the rhetorical effects generated by the presentation of one’s stance in a discussion section. Due to space constraints, I report on the Part 2 interview results related to MD analysis.
Results
Descriptions of Linguistic Dimensions in Chemical Engineering Discussion Sections
Table 2 summarizes six factors in the form of six sets of co-occurring linguistic features with corresponding factor loadings (see Jin, 2018a). Each linguistic feature represents a component of a factor reflected by its factor loading. In absolute value, factor loadings generally range from 0 to 1 and indicate the degree of co-occurrence between an individual feature and the set of features loaded on a factor (Gray, 2011). The interpretation of each factor was determined by understanding the communicative functions of lexico-grammatical features and reading relevant results from previous research. The six dimensions with proposed descriptive labels are: (1) Commitment to scientific propositions; (2) Generalization versus narration; (3) Expression of attitude; (4) Informational production; (5) Framing scientific claims; and (6) The absence of supporting literature or negation of scientific propositions. Among them, Dimensions 1, 3, and 5 are related to stance expressions through hedges, boosters, attitude markers (i.e., adjectives), and author presence (i.e., first-person pronouns). Dimension 2 is associated with the narration and description of the discussions, Dimension 4 concerns informational production, and Dimension 6 involves the negation of findings and arguments.
Dimension 1: Commitment to scientific propositions
Dimension 1 consists of six features loaded on the positive continuum of the dimension. Split auxiliaries, adverbs, amplifiers, possibility modals, conjuncts, and the perfect aspect as tagged by MAT software are included. Adverbs are often used to highlight the delivery of information. Hedges extracted by the MAT software are signaled by lexico-grammatical items such as something like, maybe, sort of, kind of, and more or less. Possibility modals, which represent a kind of hedging device, include can, may, might, could, and would. The hedges in the form of possibility modals extracted by MAT are used to show a researcher’s tentativeness and uncertainty about a proposition or assertion, mitigating the force of claims and also demonstrating humility toward a research community. According to Biber (2006), these linguistic features with positive loadings, which include hedging and boosting devices, are also stance markers demonstrating writers’ attitudes toward and engagement with a proposition, serving as the core rhetorical strategies to manage interactions between readers and writers (see Hu & Cao, 2011). Therefore, this dimension is interpreted as the extent of commitment toward scientific propositions.
The following lengthy excerpt is representative of Dimension 1, where possibility modals and adverbs occur frequently and either moderate or strengthen the writers’ commitment to propositions. It should be pointed out that there is overlap between amplifiers and adverbs. In the MAT tagging software, amplifiers refer to a particular group of items, and adverbs refer to items tagged by the Stanford Tagger in order to have a final count of total adverbs (see the MAT manual developed by Nini, 2015).
(1) Doped alkali metal oxide with stronger alkalescence
Dimension 2: Generalization versus narration
Dimension 2 consists of positive and negative groupings of linguistic features. The positive and negative continuums represent two ends of the dimension identified by factor analysis, and they do not represent evaluative meanings (Friginal & Weigle, 2014). The opposite signs mean that texts relying heavily on features with positive loadings tend to rely less on features with negative loadings, and vice versa. The present tense, which has the most salient loading on the positive continuum, is used to account for the “natural course of events or actions, and then build interpretations upon those observations” (Gray, 2011, p. 148). The negative continuum of this factor is characterized by past tense verbs, which describe completed research activities or experimental procedures. The complementary distribution of these two features makes the communicative function of this dimension transparent. Therefore, the interpretative label non-narration versus narration is proposed here (Biber, 1988), as shown in the examples in Table 3. This label suggests that this linguistic dimension of RA discussions captures descriptions of the natural course of actions or presentations of results by means of the present tense, as well as descriptions of past events signaled by past tense verbs.
Factor Loadings of Dimension 2.
The following three text extracts capture past and present tenses in discussion sections. The positive side of this dimension uses the present tense to describe scientific phenomena discovered from the results, while the negative side incorporates explicit use of past tense verbs to report what was conducted in the research.
(2) . . .
(3) The growth of amorphous oxide during stage I oxidation
(4) Tang et al. [55]
Dimension 3: Expression of attitude
Dimension 3 includes predicative adjectives, be as the main verb, subordinators, and demonstrative pronouns (this, that, these, those). As shown in Table 4, the feature with the highest factor loading is predicative adjectives, followed by be as a copula verb but not as an intransitive main verb (e.g., I was in the basement). As Biber et al. (1999) argue, be + predicative adjective is a structure used to denote an epistemic or an evaluative stance through the choice of words such as difficult, important, likely, and possible. Subordinators (e.g., if, because, though, whereas, since) indicate “the meaning relationship between the dependent clause and the superordinate structure: time, reason, condition, comparison, etc.” (Biber et al., 1999, p. 85), and are marked by “greater elaboration and thus should be characteristic of informational discourse” (Biber, 1988, p. 107). Demonstrative pronouns are used to maintain cohesion by referring to preceding text. Considering these features, it appears that this dimension first captures researchers’ expression of attitudes toward findings and claims by means of predicative adjectives and then provides further elaboration and explanation of evaluative statements in discussion sections. The emergence of this dimension corroborates the finding that adjectival attitude markers are frequent in discussion sections (Crosthwaite et al., 2017).
Factor Loadings of Dimension 3.
Highlighted in the following extracts are predicative adjectives, subordinators, and demonstrative pronouns, in particular, this. Writers make use of adjectives to indicate their attitudes toward findings, and employ subordinators such as since, because, whereas, and if to elaborate their viewpoints.
(5) A deposit of KCl
(6) Examination of the overall reforming process as well as the surface chemistry of carbohydrates shows that the routes for synthesis gas
(7) This means that a concerted process by which two O atoms in the oxide will come together to form an O2 molecule and two vacancies will require a large activation energy.
Dimension 4: Informational production
Dimension 4 (Table 5) includes the features of word length, attributive adjectives, pro-verb do, and nouns. Greater average word length “marks a high density of information” and thus results in “an exact presentation of informational content” (Biber, 1988, p. 104). Nouns characterize the density of information. Attributive adjectives package information in a more condensed manner (Biber, 1988). These main features are associated with informational production, allowing a great deal of information to be expressed in dense noun phrases, which supports Biber and Gray’s (2013) finding that nominal style is pervasive in science writing. This dimension is in stark contrast to Dimension 1, which conveys a more involved and interactive language pattern.
Factor Loadings of Dimension 4.
The extracts below highlight how writers maintain an informational discourse, characterized by a heavy reliance on noun sequences (e.g., gene expression of bone sialoprotein attachment protein), attributive adjectives (e.g., specific activity), and relatively long average word length (e.g., phosphate composites).
(8) Then, it is interesting to check whether the present EOS model can predict such
(9) This effect is caused by the
(10) . . .and up-regulated the gene expression of bone sialoprotein attachment protein. Another study tried to improve bone tissue reconstitution with CH+calcium phosphate composites on the rabbit and sheep models [36]. (KCE14-CLA)
(11) The
Dimension 5: Framing scientific claims
Dimension 5 (Table 6) comprises four positive features: that-complement clauses, private verbs, public verbs, and first-person pronouns. The that-complement structure is salient in academic writing and has been studied extensively (e.g., Hyland & Tse, 2005; Parkinson, 2013). Biber (1988) argues that that-complement clauses function as an “elaboration of information relative to the personal stance of the speaker, introducing an affective component into this dimension” (p. 114). The private (e.g., recognize, suppose, realize, indicate, and conclude) and public verbs (e.g., claim, maintain, and report) are used to express one’s ideas and claims. First-person pronouns, mainly in the form of we, our, and us, indicate the collaborative nature of science, thus bringing out the presence of authors in scientific research. The co-occurring features in Dimension 5 suggest how scientists make explicit reference to themselves in the presentation of findings.
Factor Loadings of Dimension 5.
Sample extracts 12 to 15 feature a combination of first-person pronouns and verbs with a reporting function. Scientists tend to incorporate self-mention markers, emphasizing their commitment to an assertion. The announcement of authorial presence accords with Hyland’s (2002) argument that first-person pronouns highlight scientists’ unique role in “constructing a plausible interpretation for a phenomenon, thereby establishing a personal authority based on confidence and command of their arguments” (p. 1104).
(12) atoms located near the dopant and this will make the dopant less positive. [. . .]
(14) Similarly,
(15) The analysis described above enables
Factor Loadings of Dimension 1.
Dimension 6: The absence of supporting literature or negation of scientific propositions
This dimension is characterized by negation, existential there, and to-infinitives (Table 8). Negation, including analytic negation (not) and synthetic negation (no, neither, and nor), exhibits the highest factor loading, followed by the existential there structure that can “predict the occurrence of something” (Biber et al., 1999, p. 943).
Factor Loadings of Dimension 6.
It is surprising that prepositional phrases are not salient enough to constitute a dimension. As a single linguistic feature loaded on the negative side, prepositional phrases may not be powerful enough to be treated as a “dimension”; therefore, no functional interpretation is given. However, as in previous MD studies (e.g., Biber, 1988, 2006), a high frequency of prepositional phrases contributes to densely packed information in written academic texts.
The grouping of these features in this dimension, on the one hand, indicates the “denials and rejections in the reported reasoning processes” (Biber, 1988, p. 109), in particular, the negation of propositions and research findings. On the other hand, it highlights the insufficient amount of relevant research in a particular area. The extracts below highlight the use of negation in the form of no and not, combined with existential there and to-infinitives, suggesting that little previous research has been conducted in relation to the current research and that no expected research outcomes have been identified.
(16) The rate constant for the RLS, kCOH, also can be split up in a similar manner as the equilibrium constant, [equation] where h is Planck’s constant, [symbol] is the energy difference between the COH in the transition state COHTS, and the adsorbed COH species on the edge, and [symbol], because
(17) Results summarized in Table 1 also demonstrate that
(18) In the literature,
Overall, the six linguistic dimensions of discussion sections are: (1) Commitment to scientific propositions; (2) Generalization versus narration; (3) Expression of attitude; (4) Informational production; (5) Framing scientific claims; and (6) The absence of supporting literature or negation of scientific propositions. Among them, only Dimension 2 has both a positive and a negative continuum. This might be explained by the limited samples and linguistic features extracted, which may inadequately account for the dimensions (also see the Limitations in the Conclusion section).
Findings of Interviews With Disciplinary Informants: Perceptions of MD Results in Light of the Research Practices of Chemical Engineering
Drawing on interview data collected from eight experienced chemical engineering scientists, I aimed to go beyond the straightforward interpretation of the results of MD analysis (Conrad, 2014) by uncovering specialist perspectives on disciplinary research practices. However, due to space constraints, I am unable to provide detailed comments on all six dimensions. Therefore, I focus on reporting on Dimensions 1, 3, and 5, which are associated with stance-taking expressions and were salient in the MD results.
When referring to the hedges, boosters, and adjectival attitude markers of Dimensions 1 and 3, P2 and P3 supported the use of these devices in discussion sections. They identified that these markers can leave room for disciplinary readers to interpret “whether the findings are convincing or not, and soothe their claims for broader acceptance.” P2 commented that hedging devices are a strategic choice that makes scientific claims less assertive in case there are other potential explanations of scientific phenomena, allowing readers to carefully reflect on the likely impact of their results. Referring to the example “this result would also be an important proof of the decrease in the reducibility of surface vanadium species,” P2 appeared cautious to explain the reasons regarding the formation of N2O, which led to the reducibility of surface vanadium species. As a result, room is left for readers to consider alternative reasons for the formation (the selected extracts served as prompts). P3 noted the words efficiently and fully in the third extract and remarked that they can offer strong conviction regarding results to readers.
P6 likened the employment of adjectival attitude markers to “ornaments,” suggesting that the incorporation of attitude markers may leave a positive impression on readers, making results appear more reliable. P7 indicated that they should be used where appropriate; otherwise, they are likely to make a scientific paper subjective.
Other participants were not sure about the usage of these stance features when discussing findings, as they held the view that science writing is objective and neutral. P4 described an interesting supervision scenario with a graduate student to demonstrate how he asked the student to remove the boosters in order to maintain the objectivity of scientific findings:
I can tell you a scenario of this with my student W. She said she had obtained some ‘advanced’ findings, but I did not see the value and innovation from my perspective. However, she described her results as ‘remarkably contribute to. . .’. I deleted it. W was angry and asked, “Why are we so timid in the use of promotional language?” I said to her: “If your study is indeed a high-quality and cutting-edge research, even if you do not use remarkably, readers can sense your work is remarkable. There is no need to emphasize this. It will make the reviewers feel uncomfortable.”
P4’s scenario suggests that the stance devices may have a counter-effect on readers. To be specific, P5 suggested that if lexical expressions such as could inhibit, may be viewed, or demonstrate/strongly are used to discuss findings, this leaves readers with an impression that the researchers themselves are not confident or over-confident: “The reviewers tend to think that the researchers may devalue or hype their results, so how can we trust your study?”
When referring to first-person pronouns for stance presentation, most participants reported reservations about authorial presence in discussion sections. The majority of participants agreed with the author-evacuation style. P1 said that he was inclined to avoid emphasizing who did the work or who made the conclusion (e.g., we conclude), as “there is no need to emphasize who proposed or discovered it, but discussion should focus on the results themselves.” P6 said that he was repeatedly reminded by his supervisor to “remove yourself from the scientific reality. It is the results that can tell everything.”
P8 offered a different interpretation, identifying the possibility that top researchers might be in a better position to utilize first-person pronouns based on their command of expertise and authority. However, he felt uncomfortable using them, as he thought he was not a top researcher and thus lacked the confidence to claim this, despite his strong publication record. As P8 stated:
It seems that I am leading something, too confident, we speculate that, but I am not qualified to do that. I don’t want ourselves to be highlighted in the research, in case we are wrong. But expert scientists are confident enough to state their findings and thus assert their prominence.
This interpretation conveys the message that academic writers seek to “protect themselves against falsification by distancing themselves from their findings” (Harwood, 2005, p. 1209), and that even experienced writers are inclined to avoid explicit self-promotion.
However, contrasting views were also presented, highlighting the use of first-person pronouns. P4 was not surprised by the exemplars containing instances of first-person pronouns (we speculate that. . .). According to P4’s observations, the use of first-person pronouns might be accepted by some top journals (e.g., Energy and Environmental Science) and is often found in the work of well-established researchers. P4 aimed to publish in high-ranking journals and was therefore very likely to imitate their stylistic choices to increase the possibility of publication:
I am influenced by the trend of top journals and top researchers. If it is popular among them, I will follow it because I want to publish in those journals and follow the style of writing by those ‘big name’ researchers. [. . .] I have tried to adapt myself to that way because I am working on a paper which I want to submit to that top journal.
However, unlike P4, who tended to employ first-person pronouns following the stylistic trend led by top journals and researchers, P5 pointed out that the use of first-person pronouns is a rhetorical strategy to achieve stylistic variation, as he believed it is too rigid to use only passives or sentences beginning with it or the results. This use of first-person pronouns supports Harwood’s (2006) finding that pronouns help writers to “vary their style and make their text more interesting to their audience” (p. 444). As stated by P5:
Don’t you think a discussion section is full of expressions beginning with the subjects like these? The results. . ., it leads to. . ., A caused by B. Sometimes I deliberately choose to shift to first person, it can make your language expressions flexible and leave an impression that your style of writing is not rigid.
Discussion: Interpretation of Scientists’ Linguistic Choices Informed by MD Analysis
The linguistic dimensions obtained from the current study may not all be directly comparable to previous studies that have used an MD analytical framework to explore linguistic dimensions in academic written texts (Biber, 1988; Friginal & Mustafa, 2017; Friginal & Weigle, 2014; Gray, 2011). This is because the corpora in prior research under investigation varied in size, which made direct comparisons with the current study difficult. Additionally, the specialized corpora representing the particular genre (i.e., discussion sections in this study) are different from those of the studies observed. However, some obtained dimensions in this study can still partially allude to previous results.
Dimensions 1 and 4 are characterized by possibility modals, hedges, and adverbs, and by nouns and attributive adjectives, respectively, suggesting writers’ explicit involvement in relation to research findings in contrast to a strong focus on information delivery. The emergence of these two dimensions aligns with Dimension 1 and Dimension 4, which describe involved versus informational production and overt persuasion in texts, as identified in Biber’s (1988), Gray’s (2011), and Friginal and Weigle’s (2014) studies. The seemingly antithetical representations corroborated in discussion sections suggest that scientific discourse is not simply the construction of solid factual results derived from experiments. It also involves the use of language to demonstrate an authorial stance toward the issues discussed to persuade readers of the relevance and value of the research (Crosthwaite et al., 2017; Hyland, 2004). Dimension 2 features the present and past tenses to describe the “natural course of events or actions, and then build interpretations upon those observations” (Gray, 2011, p. 148) and the completed research activities or experimental procedures. These linguistic characterizations align with Dimension 2 of Biber’s (1988) and Gray’s (2011) studies, where the present tense and past tense are used to describe the current activities or completed research established in RAs. Dimension 5 captures the co-occurring patterns of first-person pronouns and verbs with a reporting function to refer explicitly to the role of researchers, partially corresponding to Friginal and Weigle’s (2014) interpretation of personal opinion characterized by the first person plus mental verbs (e.g., guess, feel). The only difference is that the pronoun I is prominent in their study, as their focus was on academic essays rather than experimental RAs where group work is common.
Academic writing is often portrayed as faceless and impersonal in writing manuals and style guides. Informed by MD analysis, the presence of stance expressions reveals that chemical engineering discourse is not purely objective. The results echo Hyland’s (2000) finding of the presence of these stance-related expressions in science and engineering fields. Hedges and boosters explicitly convey the researcher’s epistemic attitude toward an argument. Researchers employ hedging and boosting devices to either strengthen or moderate the degree of commitment to the claims (Hyland, 2005a). The employment of attitude markers exemplified by evaluative adjectives also supports McGrath and Kuteeva’s (2012) findings, suggesting that scientists rely on them to indicate positions and engage with the reader. Moreover, self-mention achieved by first-person pronouns plus verbs with a reporting function frames a scientific claim by giving explicit prominence to the role of researchers who conduct research and discover findings (Hyland, 2002) to demonstrate a promotional tenor (Harwood, 2005).
However, stance expressions do not fully align with the perceptions shared by the majority of the informants, who held mixed views on these highlighted features. Informants (P2 and P3) identified the strategic use of hedging devices to soften strong research claims and showed an awareness of how the message is received by readers. In particular, when referring to first-person pronouns, it is interesting to note that some informants (P4 and P5) appeared to demonstrate an awareness of using these features. However, their awareness seemed driven not by their recognition that stance expressions can highlight their unique role in making judgments and thus disciplinary competence (Hyland, 2002) but rather by the practical need to publish in high-ranking journals where first-person pronouns are welcomed (P4) or the purpose of using first-person pronouns to replace passive constructions or sentences beginning with it or the results as a way to seek stylistic variation (P5).
Multiple reasons could explain the results. First, chemical engineering research is a fact- and truth-based discipline derived from experimental activities. It is understandable that the majority of informants (as in the accounts of P4, P6, and P7) may develop a rooted mindset that the engineering science discipline is free from personal involvement and authorial presence. Second, according to Hyland (2002), they may be influenced by writing manuals and teacher instruction in which the objectivity of scientific results is advocated and may therefore naturally form a rigid understanding of stance-taking forms. Third, the avoidance of stance may be due to a lack of confidence to stake out claims explicitly, in particular, when using first-person pronouns. P8’s response to first-person pronouns indicates that we and I were for senior researchers’ exclusive use. The notion aligns with Chang and Swales’s (1999) finding that first-person pronouns are “only usable for senior scholars” (p. 164). Furthermore, participants in this study are all Chinese academic professionals. It seemed that disciplinary informants did not highly appreciate the use of first-person pronouns. Their comments reflect a culture-specific, implicit perception of collectivism that might contribute to their avoidance of direct self-reference, which is in accordance with Hyland’s (2002) and Scollon’s (1994) findings, suggesting that traditional Asian culture downplays the expression of an authoritative persona in writing.
Conclusion
Summary of Findings
The present study sought to uncover the linguistic characterizations of discussion sections. It was determined that three linguistic dimensions are related to stance expressions and the other three dimensions are associated with the generalization/narration of research, informational production, and expressions of negation. The presence of Dimensions 1, 3, and 5 suggests that discussion of findings in chemical engineering is not purely objective but relies on these features to convey varying degrees of commitment to the findings or to engage readers. The follow-up semi-structured interviews inquired about expert informants’ understanding of stance expressions in the context of disciplinary writing practices. The results revealed that the use of these highlighted linguistic choices is possibly associated with the influence of teacher instruction and writing manuals, a preference for the author-evacuated style, and culture-specific orientations toward authority.
Pedagogical Implications
Some pedagogical implications for engineering writing instruction can be drawn from the results of the MD and interview analyses. The interview-based findings suggest that engineering researchers may not be fully aware of the strategic employment of stance expressions that help to “organise the writer’s stance toward either its content or the reader” (Hyland, 2000, p. 109). These accounts are all notable omissions in prior academic writing instruction. However, the ability to project an appropriate stance on the findings under discussion is still essential, as advanced writers can resort to this to craft “more academic reader-friendly prose and make more concerted attempts to engage with readers” (Hyland, 2004, p. 141). Emphasis can be placed on implementing instruction regarding how to use stance expressions strategically so as to clearly position the writer toward the findings under discussion (Crosthwaite et al., 2017; Hyland, 2004a, 2005b; Lee & Deakin, 2016). Writing instructors can devise a series of consciousness-raising activities with an explicit focus on hedging and boosting devices, attitude markers, and first-person pronouns to increase novice writers’ sensitivity to these features, encouraging them to inquire about and discuss their thoughts on rhetorical questions about the presentation of stance in discussion sections: “How certain do I want to be about this? What is my attitude toward it? Do I want to make myself prominent here?” (Hyland, 2016b, p. 248). Instruction on these features makes these interpersonal dimensions explicit in the writing process (Lee & Casal, 2014).
Limitations
Several limitations should be acknowledged in this research. First, chemical engineering is an established broad discipline, including various areas such as energy, catalysis, and chemical reaction. These research areas are likely to have their own specific writing conventions. A close examination of the linguistic choices in different areas of research within chemical engineering may reveal how reliable the textual results are, ensuring that the textual features are not specific to one area of the disciplinary community.
Second, the linguistic features in the current study were extracted with Nini’s (2015) MAT tagging software, which replicates Biber’s (1988) study, while further grouping was made by combining features with similar communicative functions. Therefore, the features under investigation are considerably fewer than those in Biber’s (1988), Gardner et al.’s (2019), and Cao and Xiao’s (2013) studies. The limited set of features may help to explain the relatively weak presentation of the dimensions (i.e., only one seems to have a positive and a negative pole). Also, the feature extraction basically follows Biber’s (1988) early approach and does not incorporate the more recent progress made to include additional features, which specifically addressed what he saw as an absence of stance features in the original set.
Additionally, the semi-structured interviews were all conducted with Chinese scientists with good publication records in chemical engineering, which limits the claims to the Chinese population and context. While the role of cultural backgrounds in writing is beyond the scope of the current study, it would be interesting to investigate the understanding of American or British-based expert scientists in overseas universities who have published in top journals. Future research could focus on other salient linguistic features in discussion sections, in addition to stance expressions, so as to generate pedagogical insights into EAP teaching practice.
Appendix
Forty-Two Linguistic Features in MAT for Multi-Dimensional Analysis.
| Linguistic feature | Description/example |
|---|---|
| 1. Word count | overall number of words per text |
| 2. Word length | average number of letters in a word |
| 3. Type-token ratio | the ratio between number of different words and total words |
| 4. Coordination | e.g., and, or, but, so |
| 5. Subordinators | e.g., because, if, though, unless |
| 6. Amplifiers/emphatics | e.g., completely, fully, totally |
| 7. Be as main verb | all forms of be verb |
| 8. Passives | passive constructions with agent controlled by by/passive constructions with no agent identified |
| 9. Conjuncts | e.g., furthermore, consequently |
| 10. Demonstrative pronouns | e.g., this, that, these, those |
| 11. Downtoners/hedges | e.g., almost, merely, only, partly |
| 12. Gerunds | nominal forms ending in -ing, -ings |
| 13. First-person pronoun | e.g., I, we |
| 14. Attributive adjectives | adjectives occurring as a noun pre-modifier |
| 15. Possibility modals | e.g., can, may, might, would |
| 16. Common nouns | words identified as nouns |
| 17. Nominalizations | e.g., nouns ending in -tion, -ment, -ness, -ity |
| 18. Participle clauses | past participle clauses, e.g., Built in a single week, the house can stand for years. Present participle clauses, e.g., Stuffing his mouth with cookies, he left the room. |
| 19. Perfect aspect | e.g., has claimed, have shown |
| 20. Total prepositional phrase | any instances of prepositions |
| 21. Relative clauses | e.g., wh- and that- controlled relative clauses |
| 22. Pronoun it | all occurrences of it |
| 23. Place adverbials | e.g., above, around, behind |
| 24. Predicative adjectives | instances such as . . .is beautiful |
| 25. Private verbs | e.g., demonstrate, indicate, anticipate |
| 26. Public verbs | e.g., acknowledge, declare, claim |
| 27. Pro-verb ‘do’ | any instances of do used as a main verb |
| 28. Stranded preposition | e.g., instances like the problem that I was thinking of. . . |
| 29. Negation | all occurrences of no, not, n’t |
| 30. Complement clauses | that-complement clauses controlled by verbs/adjectives |
| 31. Time adverbials | e.g., immediately, previously, recently, once |
| 32. Infinitives | to- as an infinitive marker |
| 33. Subordinator that-deletion | instances where that is omitted |
| 34. Third person pronouns | e.g., he, she, they, her, him |
| 35. Past tense | e.g., reported, found |
| 36. Present tense | e.g., report, find, see |
| 37. Wh-clauses | wh-controlled clauses following public/private verbs, e.g., I believe what he told |
| 38. Post-nominal modifiers | e.g., the approach |
| 39. Existential ‘there’ | instance of there |
| 40. Total adverbs | all instances of adverbs |
| 41. Seem/appear | any instances of seem and appear |
| 42. Split auxiliaries | e.g., they are objectively shown. . .. |
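
To illustrate how some of the simpler features in the table can be operationalized, the following sketch computes word count, average word length, type-token ratio, and a negation rate from raw text. It is only a surface-level approximation under stated assumptions: MAT itself relies on part-of-speech tagging, whereas this sketch uses a hypothetical regular-expression tokenizer, and the normalization basis of 1,000 words, the function names, and the sample sentence are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (assumed tokenizer, not MAT's)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def surface_features(text):
    """Approximate features 1, 2, 3, and 29 of the table from raw text."""
    tokens = tokenize(text)
    n = len(tokens)
    if n == 0:
        return {"word_count": 0}
    # Count letters only, ignoring apostrophes inside contractions.
    letters = sum(len(tok.replace("'", "")) for tok in tokens)
    # Negation: no, not, and contracted n't (straight apostrophe only).
    negations = sum(
        1 for tok in tokens if tok in {"no", "not"} or tok.endswith("n't")
    )
    return {
        "word_count": n,
        "word_length": letters / n,
        "type_token_ratio": len(set(tokens)) / n,
        "negation_per_1000": 1000 * negations / n,
    }


# Invented sample sentence, not taken from the corpus.
sample = (
    "There is no evidence. The results do not support this, "
    "and we don't know why."
)
features = surface_features(sample)
print(features)
```

Features that depend on grammatical category, such as predicative adjectives or split auxiliaries, cannot be recovered this way and would require a tagger.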
Appendix B
Declaration of Conflicting Interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
