
Abstract

Current willingness to communicate (WTC) scales center on WTC in general second language (L2) learning, while L2 writing WTC is underrepresented. This study intended to close this gap by developing and validating an L2 writing WTC scale. A three-phase sequential embedded mixed-methods design was adopted to overcome the over-reliance on quantitative data and provide adequate evidence of validity. Nineteen items were generated based on our literature search and thematic analysis of the interview data (n = 10). With quantitative data collected from 288 learners of English as a foreign language (EFL), the psychometric properties of the initial scale were examined by exploratory factor analysis. After that, the revised 17-item questionnaire was validated by confirmatory factor analysis and other validation methods with data from 224 EFL learners. The results indicated that the underlying structure involved writing task traits, English language ideology, writing teacher support, interest in English language, and self-perception of English language proficiency. The scale was further validated through factor analysis of the quantitative data (n = 173) and thematic analysis of the immediate retrospective interview data (n = 12) from EFL learners to test its generalizability in other L2 learning contexts and for face validity evidence. The findings showcased a promising mixed-methods design for scale development and clarified the underlying factors of L2 writing WTC. Implications for scale development and the teaching and learning of L2 writing were discussed.

Keywords

scale development, second language writing, sequential embedded mixed-methods design, willingness to communicate

I Introduction

Willingness to communicate (WTC) in a second language (L2) indicates learners’ intention to actively engage in L2 communicative activities (Dörnyei, 2005; D.M. Kang, 2014; MacIntyre et al., 1998). Research has revealed that WTC positively contributes to learners’ L2 oral performance and output (Dörnyei & Kormos, 2000), willingness to engage in complex speaking and writing tasks (MacIntyre et al., 1999), and foreign language enjoyment (Botes et al., 2022). Given the benefits of WTC for L2 learning, especially in the context of communicative language learning and teaching, it is necessary to explore how WTC can be fostered and accurately measured. Our review of the literature shows that the individual constructs identified as contributors to L2 WTC include motivation (MacIntyre & Charos, 1996; Teimouri, 2017), anxiety (Baker & MacIntyre, 2000), self-perceived L2 communicative competence (Joe et al., 2017; Baker & MacIntyre, 2000; MacIntyre & Charos, 1996), attitudes towards L2 (MacIntyre & Charos, 1996; MacIntyre et al., 2001; Subtirelu, 2014; Yashima et al., 2004), enjoyment (Dewaele & Dewaele, 2018), self-perceived language proficiency (Sato, 2023), and L2 interest (Eddy-U, 2015). Contextual variables have also been found to modulate WTC significantly. The identified factors include, at least, teacher and peer support (Cao, 2011; Eddy-U, 2015; S.J. Kang, 2005; MacIntyre et al., 2001; Yashima et al., 2018; Zarrinabadi, 2014; Zhong, 2013), exposure to an L2 (D.M. Kang, 2014; MacIntyre & Charos, 1996), and communicative task features (Cao & Philp, 2006; J. Zhang et al., 2018).

Contributors to WTC in general language learning are well documented, but factors promoting L2 writing WTC are underexplored. Writing is a complex activity indicating learners’ general language proficiency and performance. Understanding and capturing L2 learners’ writing WTC will help them engage better in writing and enhance their language learning progress. Consequently, the investigation into L2 writing WTC’s underlying forces and measurement is warranted. Several scaling instruments have been developed to measure WTC (MacIntyre et al., 2001; McCroskey & Baer, 1985; Ryan, 2009; Weaver, 2005). However, few measure WTC in the L2 writing learning context or provide compelling validation. Therefore, we sought to develop a well-validated questionnaire of L2 writing WTC.

Previous scale development has emphasized factor analysis, neglecting the role of qualitative data and follow-up validation (Sudina, 2023). To pursue methodological rigor and robustness, we set out to develop and validate an L2 writing willingness to communicate scale (L2WWTCS) with a three-phase sequential embedded mixed-methods research design. We began by generating scale items based on our literature review and the interview data. The newly developed scale was then evaluated through statistical analyses. Finally, the revised scale was examined with both quantitative and qualitative data. This research allows us to discuss potential factors nurturing L2 learners’ writing WTC and to measure their variations.

1 Potential contributors to L2 writing WTC

Many researchers have reported the common phenomenon that some language learners with high-level communicative competence tend to avoid communication, while others with minimal linguistic competence seek chances to communicate (Dörnyei, 2005; MacIntyre et al., 1998). This issue has received much scholarly attention in an era emphasizing engagement in meaningful interaction to boost language learning (Swain, 1995). Originating from communication research (McCroskey, 1992; McCroskey & Richmond, 1991), the WTC construct was defined as the willingness to engage in communication when free to do so (McCroskey & Baer, 1985). It was conceptualized as static and subjected to personalities in the first language (L1) environment (McCroskey, 1992; McCroskey & Richmond, 1991). In L2 studies, WTC was considered as the final psychological step before actual communication (MacIntyre et al., 1998). MacIntyre et al. (1998, p. 547) defined WTC in L2 learning as ‘a readiness to enter into discourse at a particular time with a specific person or persons, using an L2’. It combined psychological, educational, linguistic, and communicative perspectives into L2 communication research (Clément et al., 2003). Given the vital role of WTC in learners’ engagement in L2 communication, WTC has been recognized as a critical individual difference factor in second language acquisition (Shirvan et al., 2019; Wang et al., 2021), and its multi-dimensional contributors have been profoundly investigated.

Early research followed McCroskey and Baer (1985), mainly examining WTC in oral contexts. MacIntyre et al. (1998) voiced the necessity of extending the WTC construct to other language skills. Similar to L2 speaking, L2 writing is a significant medium of communication. L2 learners write to inform, explain, or persuade, conveying information to the audience or exchanging it with them. On the other hand, L2 writing differs fundamentally from L2 speaking. For example, L2 writing, in most cases, is planned and non-interactive. These differences call for the exploration of L2 writing WTC. Unlike the broad definition of WTC (i.e. willingness to engage in communication when free to do so), the L2 writing WTC targeted in this article refers specifically to L2 learners’ willingness to engage in L2 writing in instructional contexts. Overall, although there is little research on L2 writing WTC so far, the contributors to L2 WTC may help us understand the construct of L2 writing WTC.

The contributing factors of L2 WTC have been extensively researched at the individual level. For example, MacIntyre and Charos (1996) indicated that global personality features and affective factors associated with language could significantly influence L2 WTC. Baker and MacIntyre (2000) found that the two strongest indicators of WTC were communication anxiety for immersion students and perceived communication competence for non-immersion students. Subtirelu (2014) found that the deficit language ideology adversely affected L2 users’ WTC and the lingua franca ideology positively. Teimouri (2017) linked motivation studies with L2 WTC and suggested ideal L2 self as a predictor of WTC. Joe et al.’s (2017) research revealed that L2 WTC was strongly predicted by the satisfaction of basic psychological needs (i.e. autonomy, relatedness, and competence). Sato (2023) revealed that differences existed within the fluctuations of WTC between low-intermediate and advanced English proficiency groups, although they were both influenced by self-perception of English proficiency. Other individual variables also include L2 enjoyment (Dewaele & Dewaele, 2018) and international posture (Yashima, 2002; Yashima et al., 2004).

As L2 WTC was increasingly understood as a situated concept, researchers began to scrutinize the potential contextual factors contributing to its fluctuation. The effect of teachers in instructional settings on learners’ WTC has been extensively researched (Derakhshan et al., 2023). S.J. Kang’s (2005) qualitative study suggested that teachers’ engagement and positive feedback in conversations could increase learners’ security and situational WTC. Cao’s (2011) research found that a favorable teacher–student relationship would facilitate classroom engagement. Zarrinabadi (2014) identified four factors concerning teachers that might influence learners’ WTC: teachers’ wait time for receiving responses, topic selection, error correction, and support. The social context was also believed to influence L2 learners’ WTC. In their early research, MacIntyre and Charos (1996) discovered that exposure to L2 in social settings would influence WTC. Meanwhile, study-abroad (SA) experiences were reported to affect language learners’ WTC (D.M. Kang, 2014). Cao and Philp’s (2006) research lent evidence to the dynamic nature of WTC by delving into situational WTC. Several elements, including group size, interlocutor conditions, and topic familiarity, were detected to impact actual WTC. Eddy-U (2015) analysed possible factors (de)motivating task-situated WTC from a dynamic systems model and found that good group partners and marks had potential influences. Zhang et al.’s (2018) systematic review showed that situation cues (i.e. interlocutors, classroom atmosphere, and tasks) were overt features that influenced WTC. However, more latent factors (i.e. task-interest, task-usefulness, and task-confidence) were underlying elements that also influenced WTC.

2 Current scales to measure willingness to communicate

As with other psychological or cognitive processes, WTC cannot be detected or observed directly through physiological manifestations or physical data. Previous research has used questionnaires, classroom observations, participant interviews, self-reporting, teachers’ ratings, or idiodynamic methods to measure WTC (e.g. Cao & Philp, 2006; de Saint Léger & Storch, 2009; MacIntyre & Legatto, 2011). Scale-based measurement is the most common method used to assess WTC. McCroskey and Baer’s (1985) scale is the first psychometric measurement of WTC, developed initially for L1 communication. The items examine participants’ WTC with three types of receivers (stranger, acquaintance, and friend) and in four communication contexts (public speaking, meetings, group discussion, and dyad). All of the items are set in ordinary life circumstances. Data suggested satisfactory reliability and validity of this scale (McCroskey, 1992). However, closer scrutiny revealed that the scale’s use and most of its validation took place in bilingual contexts, which weakens its face validity for L2 learning settings (J. Peng, 2013). Cao and Philp (2006) also questioned its application in educational settings because its items are worded for everyday situations.

MacIntyre et al. (2001) designed the first two questionnaires to measure WTC in and out of classrooms. Items in each questionnaire can be classified into four language skills: speaking, reading, writing, and comprehension. However, these questionnaires lack validity data, and their connection with the theoretical underpinning of WTC is obscure. Weaver’s (2005) L2 WTC scale, based on the Rasch model, is the first endeavor to examine L2 speaking and writing WTC. The speaking part of this scale has been modified and validated by J. Peng and Woodrow (2010). Nevertheless, since the wording of its writing part is outdated and the validation data are absent, further modification and validation are needed. Ryan (2009) developed eight items to measure WTC inside and outside classroom settings on a 6-point Likert scale. Other questionnaires are mostly modeled after the four aforementioned scales (J. Peng, 2013).

The review indicates the necessity to develop and validate a new scale assessing L2 learners’ writing WTC based on solid theoretical and empirical groundwork. By doing so, differences among learners in L2 writing WTC can be well discriminated and, in turn, the theoretical underpinnings of L2 WTC as a whole can be strengthened. Adopting a sequential embedded mixed-methods design, we attempted to develop and validate a scale for measuring L2 writing WTC. In addition to examining its validity through inferential statistics, we value the role of qualitative data in strengthening its theoretical framework and adding content and face validity. The research questions were formulated as follows:

• Research question 1: How reliable and valid is the L2WWTCS?

• Research question 2: What is the confirmed underlying factor structure of L2 learners’ writing WTC?

II Method

1 Research design

Given the research questions, a three-phase sequential embedded mixed-methods design was adapted from Creswell et al. (2008), as shown in Figure 1, where qualitative and quantitative data were triangulated to ensure a well-validated L2 writing WTC scale. Qualitative data were collected before quantitative data to help the development of a new scale and afterward to verify the validity of the new scale together with quantitative data.

Figure 1.

Sequential embedded mixed-methods design.

2 Participants

The participants included 10 interviewees involved in the preparation of L2WWTCS, 685 EFL learners in the quantitative examination of L2WWTCS, and 12 EFL learners in the follow-up interview recruited through convenience sampling on a voluntary basis. All the participants signed the consent letters after being informed of the research purposes and the anonymity and confidentiality of their personal information. The 10 interviewees were recruited after we posted the participant recruitment information on the notice board in a local tertiary educational institution. In this selection, we strove to represent different stakeholders. Among the 10 interviewees, two were university instructors responsible for teaching English and English writing, respectively. Their experiences and feelings regarding EFL learners’ writing WTC based on large-scale and long-term teaching contributed to the scale compilation. The other eight interviewees were undergraduates learning English writing in their relevant compulsory courses. Their majors represented various university academic disciplinary groupings, including Engineering (electrical engineering and computer science), Arts (Chinese language and literature), Law, and Media.

Given the three rounds of quantitative validation of L2WWTCS, the participants were divided into three groups. Their demographic information is shown in Table 1. In the first round, 288 undergraduates were recruited from a national university in southeast China. In the initial screening, three questionnaires were identified as careless responses (i.e. the same answer given throughout the questionnaire), and one participant reported a different first language. Consequently, these four questionnaires were excluded from the statistical analyses. None of the respondents reported long-term living or studying experiences (longer than one month) in English-speaking countries. In the second round, 224 undergraduates from the same university participated in the research. The initial screening identified seven careless responses, and one participant reported a one-year learning experience in an English-speaking country. These eight questionnaires were thus excluded from the statistical analysis. In the third round, 173 English-language-major undergraduates in another large project completed the new scale. All the data were suitable for further quantitative analysis.
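The screening rule described above (flagging questionnaires with the same answer throughout) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names and the sample ratings are hypothetical, and only the exclusion counts are taken from the text.

```python
def is_straightlined(responses):
    """Return True when every item received the same rating."""
    return len(set(responses)) == 1


def screen(questionnaires):
    """Split questionnaires into kept and flagged lists."""
    kept, flagged = [], []
    for answers in questionnaires:
        (flagged if is_straightlined(answers) else kept).append(answers)
    return kept, flagged


# Hypothetical 7-point ratings for three respondents (not study data).
sample = [
    [4, 4, 4, 4, 4, 4],
    [5, 3, 6, 2, 7, 4],
    [1, 2, 1, 2, 1, 3],
]
kept, flagged = screen(sample)

# Exclusion counts as reported in the text: same-answer questionnaires
# plus one further exclusion in each of the first two rounds.
first_round_n = 288 - 3 - 1
second_round_n = 224 - 7 - 1
```

The resulting counts (284 and 216) match the sample sizes given in the captions of Tables 5 and 6.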

Table 1.

Demographic information of participants in quantitative data collection.

Participant type	Number			Age		EFL learning years
Participant type	n (total)	n (female)	n (male)	M	SD	M	SD
First-round survey	288	219	69	18.95	.85	10.39	1.72
Second-round survey	224	134	90	18.94	.83	10.60	1.83
Third-round survey	173	123	50	19.39	.73	10.71	1.40

Note. EFL = English as a foreign language.

The 12 participants (4 males and 8 females) in the follow-up interview were from the third-round survey. They agreed to attend the follow-up interview voluntarily to discuss their relevant experiences.

3 Research procedure

The research procedure can be divided into three phases, during which principles laid out in Dörnyei and Taguchi (2009) were followed in our modification, administration, and analysis of questionnaire items. In Phase One, the scale items were generated based on our literature review and thematic analysis of the interview data. The content validation in item generation is the key to ensuring psychometric soundness (Cortina et al., 2020). However, previous research has not emphasized improving content validity or reporting the details of its endeavors to improve content validity. In this research, the literature review and the thematic analysis of the interview data could improve the content validity of the proposed questionnaire by attending to both the conceptual and operational definitions of L2 writing WTC (Sudina, 2023). In Phase Two, the initial scale was examined in two rounds through rigorous statistical analysis, including exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). In Phase Three, further quantitative and qualitative data yielded more evidence of the validity of the new questionnaire. Drawing on Ivankova et al.’s (2006, p. 15) graphic presentation of the mixed-methods design, we illustrate the research procedure in Table 2. We also provide more details on each phase in the following sections.

Table 2.

Visual model of the research procedure.

Phase	Procedure	Product
Phase 1: Qualitative data collection	Literature review	Review report
	Developing interview questions	Interview protocol
	Semi-structured interview (n = 10)	Audio data
Qualitative data analysis	Transcription	Text data
	Thematic analysis with NVivo	Codes, themes, thematic matrix
Connecting qualitative and quantitative phases	Item generation	Item pool
	Item selection	Initial questionnaire
Phase 2: Quantitative data collection	First-round survey (n = 288)	Numeric data
Quantitative data analysis	Data screening (univariate, multivariate)	Descriptive data, missing data, outliers, assumptions of inferential statistical analysis
	EFA	Factor loadings
	Questionnaire modification	Final questionnaire
Quantitative data collection	Second-round survey (n = 224)	Numeric data
	Data screening (univariate, multivariate)	Descriptive data, missing data, outliers, assumptions of inferential statistical analysis
Quantitative data analysis	EFA	Factor loadings
	CFA	Factor structure, model fit
	Other validation methods	Confirmed questionnaire
Phase 3: Quantitative data collection	Third-round survey (n = 173)	Numeric data
	Data screening (univariate, multivariate)	Descriptive data, missing data, outliers, assumptions of inferential statistical analysis
Quantitative data analysis	CFA	Factor structure
	Multigroup CFA	Model fit
Connecting quantitative and qualitative phases	Developing interview questions	Interview protocol
Qualitative data collection	Semi-structured interview (n = 12)	Audio data
Qualitative data analysis	Transcription	Text data
	Analysis	Qualitative results
Integration of quantitative and qualitative results	Interpretation of quantitative and qualitative results	Discussion
		Conclusion
		Implication

a Qualitative data collection and analysis

In the beginning, we thoroughly reviewed the existing literature exploring contributing factors to L2 learners’ WTC. Accordingly, we attempted to construct a multi-faceted model of L2 WTC to increase the content validity of the new scale. However, the existing literature primarily focused on speaking situations or WTC in general terms. No research paid exclusive attention to L2 writing WTC. As a result, qualitative data collected through semi-structured interviews were examined to tease out features related to the L2 writing WTC so that more questionnaire items could be generated to flesh out the key constructs of the new scale.

An interview protocol (see Appendix A) was developed in advance to ensure the interviews proceeded effectively and efficiently. We framed the initial questions based on the proposed research questions, and an experienced TESOL teacher helped revise them. The final protocol contained four sections with 10 questions, which were believed to elicit enough responses. The 10 questions were general questions for the interviewer to follow. In the interview, the interviewer asked the questions in more detail. If the interviewees were not clear about the questions and did not know how to respond, the interviewer would help them navigate through the questions with prompts. Moreover, the interviewer also asked additional questions based on the interviewees’ answers. Every interview session lasted no more than 30 minutes to avoid fatigue. Interviews were conducted in Chinese since Chinese was the interviewees’ mother tongue, which helped them express their experiences and ideas comfortably and thoroughly. Interviews were audio-recorded for later transcription, review, and thematic analysis.

The first author was responsible for analysing the interview data. Braun and Clarke’s (2006) six-phase approach to thematic analysis was adopted. The interview data were first transcribed from the verbal form into the written form. A transcription tool was employed to produce the initial transcripts, after which the researcher manually checked the transcripts against the audio recordings to improve their accuracy. In this process, the researcher became familiar with the data. The initial coding criteria were established through (re)reading the data using both theoretically driven (deductive) and data-driven (inductive) methods. The coding process was conducted in NVivo 12. The researcher coded not only segments related to ideas documented in the existing literature but also segments emerging from the data. After two rounds of consistent and systematic (re-)coding, 21 codes were established. Intra-coder reliability, calculated with the intraclass correlation coefficient, was 88.9%. After that, emerging codes were classified into potential overarching themes, during which the qualitative data were re-examined to justify codes if problems arose. An expert in the relevant field was consulted on the classification and definition of themes. Some of Nowell et al.’s (2017) suggestions on establishing trustworthiness for each phase of thematic analysis were adopted to ensure the credibility of thematic analysis in this study, including member checking, peer debriefing, and researcher triangulation.

An item pool with more than 30 items was generated in this process. Finally, 19 items were adopted to prepare the initial 7-point Likert scale, which was confirmed through consultations with an expert in second language education. Two English language experts were consulted for the wording. Both have studied in English-speaking countries (i.e. England and Canada) and have conducted survey research. Their suggestions made the questionnaire items more accurate and easier to understand. For example, Item 15 initially read ‘I have a good sense of logic’ and was then changed to ‘I can write logically’.

b Quantitative data collection and analysis

The three rounds of quantitative data were collected in the same way. We approached the potential participants after their classes with the lecturers’ permission. Each data collection session required 15 minutes in total. The questionnaires were distributed to the participants in person in paper-and-pen format. Since participants possessed different levels of English proficiency, the Chinese version of the L2WWTCS was provided to eliminate the potential unfairness arising from language proficiency. One of the researchers translated the questionnaire into Chinese. A university instructor with an accredited translation certificate then revised the draft translation following two principles: accuracy and readability. The Chinese version was also back-translated by a bilingual to ensure the equivalence of meaning.

Based on the results of EFA on the first-round data, the initial scale was modified (see Appendix B). In the second round, we conducted EFA, CFA, and other validation methods, which yielded the factor structure of L2 writing WTC. The measurement invariance of the final scale was examined with the third-round data through CFA and multigroup CFA.

c Follow-up qualitative data collection and analysis

Follow-up interview data were collected as part of a large research project involving the implementation of the newly developed scale. We interviewed the participants to gather face validity evidence for the scale and to investigate the sources of Chinese EFL learners’ writing WTC. The data were analysed following the same procedure as the earlier qualitative data. All the names used in the corresponding results section are pseudonyms.

III Results

1 Thematic analysis of precedent interview data

In total, 21 codes related to L2 writing WTC were identified. Their names, examples, and frequency numbers are shown in Table 3. After negotiation between the authors, five themes, defined as Writing Task Traits, Individual Differences, Teachers and Peers, Self-Perception of English Language Proficiency, and Miscellaneous, were generated to incorporate the 21 codes (see Figure 2). More than 30 items were then drafted to incorporate the 21 codes and constitute the item pool. After consultations with an expert in second language education, items related to prompts, interactive modes, physical and mental states, abroad experiences, and scores were judged to be less relevant to the construct of L2 writing WTC and thus deleted at the expert’s suggestion. Items that overlapped were also deleted. In total, 19 items were selected from the item pool to compile the initial scale.

Table 3.

Emerging codes through thematic analysis.

Code	Quote	References
Teachers’ scaffolding	My teacher’s guidance makes me think more.	21
Discussions with peers	Discussions with classmates give me new ideas.	10
Vocabulary	If I don’t know the word, I feel stuck.	12
Physical and mental states	If I feel tired, I don’t want to write.	2
Familiarity	I practice writing a lot, so I don’t feel it’s difficult.	4
Topics	I am more willing to write if the topic is related to my personal experience.	18
Language ideology	English is the lingua franca around the world.	19
Prompts	If the prompt gives more background information or requirements, it’s easy to write.	2
Interactive modes	I can do the writing on my own. I don’t have to interact with others.	2
Teachers’ feedback	I will consult my teacher for feedback on my grammar and logic.	11
Difficulty	It should be accurate in spelling in all dimensions.	7
Affectivity	They are positive, even if they have low English proficiency, and are not afraid of expressing ideas.	11
Time	If the teacher gives me enough time, I will be very patient and write all I want to write.	3
Experiences abroad	Students who have living or traveling experiences abroad seem to be more willing to write.	1
Writing purposes	I want to get a high score (in IELTS), so I am willing to learn writing.	21
English proficiency	She makes so many grammatical errors, so she dislikes writing.	18
Interest	They are not interested in learning English.	7
Background knowledge	If they have no background knowledge, they don’t know what to write.	9
Scores	If this writing features high in their final score, they will be more willing to do it.	3
Atmosphere	If everyone in the class treats it seriously, (I will too).	6
Sense of logic	She is incapable of sorting out the logic.	1

Figure 2.

Five themes identified to cluster emerging codes.

2 Descriptive statistics and distribution normality check of quantitative data

All the questionnaire data, including the demographic information and scaling data, were first imported into Excel. Descriptive data of each item, including mean score, standard deviation, skewness, and kurtosis, are presented in Appendices C, D, and E. Box-and-whisker plots showed no outliers in the collected data. Distribution normality data are shown in Table 4.

Table 4.

Descriptive and distribution normality statistics of quantitative data.

Data type	Shapiro–Wilk test	Mardia’s coefficient of skewness	Mardia’s coefficient of kurtosis
First-round survey	.99 (p = .17)	.82 (p = .37)	−.68 (p = .49)
Second-round survey	.99 (p = .09)	.0047 (p = .95)	−1.69 (p = .09)
Third-round survey	.99 (p = .62)	.06 (p = .80)	−.05 (p = .96)
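The item-level descriptives mentioned above (mean, standard deviation, skewness, kurtosis) can be illustrated with a short sketch. It assumes moment-based skewness and excess kurtosis without bias correction; statistical packages often apply corrections, so their output may differ slightly. The sketch does not implement the Shapiro–Wilk test or Mardia's coefficients reported in Table 4, and the ratings are hypothetical.

```python
import math


def describe(scores):
    """Mean, sample SD, moment-based skewness and excess kurtosis of one item."""
    n = len(scores)
    mean = sum(scores) / n
    # Central moments of order 2, 3, and 4 (population form, divisor n).
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    return {
        "mean": mean,
        "sd": math.sqrt(m2 * n / (n - 1)),  # sample SD (divisor n - 1)
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,     # excess kurtosis (normal = 0)
    }


# Hypothetical ratings on a 7-point scale (not study data).
item_scores = [1, 2, 3, 4, 5, 6, 7]
stats = describe(item_scores)
```

For a perfectly symmetric set of ratings such as this one, the skewness is zero, while the flat shape yields negative excess kurtosis.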

3 Exploratory factor analysis of the first- and second-round data

The Kaiser–Meyer–Olkin (KMO) measure of the first-round data was .80, larger than the cut-off value of .60. Thus, the sample was adequate for further statistical analysis. The result of Bartlett’s test of sphericity was χ² = 1662.01 (df = 105, p < .001), supporting the appropriateness of EFA. Before the EFA, parallel analysis was conducted to determine the number of factors to retain. The resulting scree plot suggested five factors in the model (see Figure 3).
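As a quick arithmetic check on the reported degrees of freedom, Bartlett's test of sphericity on p variables has df = p(p − 1)/2, and its statistic is commonly written as χ² = −[(n − 1) − (2p + 5)/6] ln|R|, where R is the correlation matrix. The sketch below assumes this standard form; the function names are hypothetical, and the correlation matrix itself is not available in the text, so the reported χ² cannot be reproduced here.

```python
def bartlett_df(n_items):
    """Degrees of freedom: number of unique off-diagonal correlations."""
    return n_items * (n_items - 1) // 2


def bartlett_chi2(n_cases, n_items, log_det_r):
    """Bartlett's statistic from the log-determinant of the correlation matrix."""
    return -((n_cases - 1) - (2 * n_items + 5) / 6.0) * log_det_r


df_first_round = bartlett_df(15)   # 15 items, as listed in Table 5
df_second_round = bartlett_df(17)  # 17 items in the revised questionnaire

# Hypothetical log-determinant of -1.0, only to show how the formula is used.
example_chi2 = bartlett_chi2(284, 15, -1.0)
```

The two df values (105 and 136) agree with those reported for the first- and second-round data.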

Figure 3.

Scree plot of eigenvalues of principal factors in the first round.

This research adopted principal axis factoring with promax rotation. Kaiser normalization was applied before the rotation. Considering the sample size in this round of data, the factor loading cut-off should be at least .35 (Hair et al., 1998). Items 9, 14, 15, and 19 were excluded since their factor loadings were lower than .35. The factor analysis results and the factor loadings of all the items are presented in Figure 4 and Table 5. This model accounted for 46.60% of the total variance.

Figure 4.

Graphical presentation of factor analysis in the first round.

Table 5.

Factor loadings of the first-round quantitative data (n = 284).

Factor	Item	Factor loading
Factor	Item	1	2	3	4	5
Factor 1	Item 4	.571
	Item 7	.589
	Item 11	.838
	Item 13	.512
	Item 16	.598
Factor 2	Item 3		.984
	Item 10		.616
Factor 3	Item 2			.510
	Item 8			.650
	Item 17			.578
Factor 4	Item 12				.770
	Item 18				.793
Factor 5	Item 1					.581
	Item 5					.495
	Item 6					.678

The possible underlying factor structures identified by EFA were then labeled thematically by analysing the contents of items grouped under each factor: Factor 1, Writing Task Traits (WTT); Factor 2, English Language Ideology (ELI); Factor 3, Writing Teacher Support (WTS); Factor 4, Interest in English Language (IEL); Factor 5, Self-Perception of English Language Proficiency (SPELP). Item 5 was deleted since it was theoretically unrelated to the other two items clustered around Factor 5. After the EFA and the deletion of items, this questionnaire was inappropriate for CFA since the number of items clustered under Factors 2, 4, and 5 was less than three, the minimum number of indicators per latent variable. As a result, another three items, believed to correlate with the corresponding factors, were extracted from the item pool. An updated questionnaire with 17 items was prepared for further examination.
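The item bookkeeping in this step can be traced with a small sketch. The factor-to-item mapping is copied from Table 5; the variable names are hypothetical, and the three-indicator minimum is the threshold stated in the text.

```python
# Items loading on each factor in the first round (from Table 5).
first_round = {
    "Factor 1 (WTT)": [4, 7, 11, 13, 16],
    "Factor 2 (ELI)": [3, 10],
    "Factor 3 (WTS)": [2, 8, 17],
    "Factor 4 (IEL)": [12, 18],
    "Factor 5 (SPELP)": [1, 5, 6],
}
deleted_items = {5}      # Item 5, removed on theoretical grounds
MIN_INDICATORS = 3       # minimum number of items per latent variable

# Drop the deleted item, then list factors that fall below the minimum.
remaining = {
    factor: [i for i in items if i not in deleted_items]
    for factor, items in first_round.items()
}
under_identified = [f for f, items in remaining.items() if len(items) < MIN_INDICATORS]

# Adding one item to each under-identified factor gives the revised length.
revised_length = sum(len(items) for items in remaining.values()) + len(under_identified)
```

Factors 2, 4, and 5 each retain two items, so one added item per factor yields the 17-item questionnaire, consistent with Items 20 to 22 appearing in Table 6.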

For the second-round data, the KMO measure was .87, above the cut-off value of .60, indicating sampling adequacy. Bartlett’s test of sphericity was significant, χ² = 1887.87 (df = 136, p < .001), indicating that the second-round data were suitable for EFA. As for the number of factors, parallel analysis and the resulting scree plot suggested four factors in the model (see Figure 5). Because parallel analysis is only one of several statistical methods for deciding the number of factors to retain in EFA, the most appropriate number can differ from its result. Consequently, this research tried both four and five factors to see which solution accounted for more variance. The 4-factor model, which combined IEL and SPELP into one factor (i.e. IP), explained 58.20% of the total variance, and the 5-factor model accounted for 61.20%. In addition, scrutiny of the two models revealed that the 5-factor model was theoretically more acceptable.

Figure 5.

Scree plot of eigenvalues of principal factors in the second round.

Given the sample size in this round, the minimum acceptable factor loading was set at .40 (Hair et al., 1998). The factor loadings of all items exceeded this threshold. The results of factor analysis and the factor loadings of all the items are shown in Figure 6 and Table 6.

Figure 6.

Graphical presentation of factor analysis in the second round.

Table 6.

Factor loadings of the second-round quantitative data (n = 216).

Factor	Item	Factor 1	Factor 2	Factor 3	Factor 4	Factor 5
Factor 1	Item 4	.727
	Item 7	.820
	Item 11	.763
	Item 13	.424
	Item 16	.554
Factor 2	Item 3		.848
	Item 10		.751
	Item 21		.763
Factor 3	Item 2			.699
	Item 8			.779
	Item 17			.492
Factor 4	Item 12				.849
	Item 18				.987
	Item 22				.745
Factor 5	Item 1					.826
	Item 6					.766
	Item 20					.704

4 Validity and reliability tests

a Construct validity

CFA was adopted to verify the factor structure identified by EFA. Since the parallel analysis suggested a 4-factor model, we tested both the 4-factor model (Model 1) and the 5-factor model (Model 2) with CFA, as shown in Figures 7 and 8. The model fit indices of both Model 1 (χ² = 265.694; df = 113; χ²/df = 2.351; TLI = .899; CFI = .916; RMSEA = .079 [.067, .092]; SRMR = .068) and Model 2 (χ² = 199.052; df = 109; χ²/df = 1.826; TLI = .938; CFI = .950; RMSEA = .062 [.048, .076]; SRMR = .062) met the threshold values. A chi-square difference test between the two models indicated that Model 2 fitted better (Δχ² = χ²(M1) − χ²(M2) = 66.642; Δdf = df(M1) − df(M2) = 4; p < .001). Considering the strong correlation between WTT and SPELP, hierarchical CFA was conducted to test whether including a second-order factor (Model 3, see Figure 9) improved model fit. The results indicated that Model 2 was better than Model 3 (Δχ² = χ²(M3) − χ²(M2) = 6.036; Δdf = df(M3) − df(M2) = 2; p = .049). The third-round data were also subjected to CFA to generate Model 4 (see Figure 10). The model fit indices of Model 4 (χ² = 226.530; df = 109; χ²/df = 2.078; TLI = .900; CFI = .898; RMSEA = .079 [.065, .094]; SRMR = .074) largely met the threshold values. The results of multigroup CFA between Model 2 and Model 4 (see Table 7) indicated acceptable measurement invariance.
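The model comparisons in this paragraph are chi-square difference tests between nested models. The Python sketch below is an illustration rather than the authors’ analysis script: it recomputes the χ²/df ratios and the difference tests from the reported statistics. The function names are hypothetical, and the p-value uses a closed-form expression that is valid only for even degrees of freedom, which is sufficient here because the reported differences have 4 and 2 df.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail p-value of a chi-square variate (closed form, even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def chi2_difference(chi2_a, df_a, chi2_b, df_b):
    """Return (delta chi2, delta df, p) for nested models; a is more constrained."""
    d_chi2, d_df = chi2_a - chi2_b, df_a - df_b
    return d_chi2, d_df, chi2_sf_even_df(d_chi2, d_df)


# Reported fit statistics as (chi-square, df).
model_1 = (265.694, 113)  # four-factor model
model_2 = (199.052, 109)  # five-factor model
model_4 = (226.530, 109)  # five-factor model, third-round data

for name, (chi2, df) in [("M1", model_1), ("M2", model_2), ("M4", model_4)]:
    print(name, round(chi2 / df, 3))  # 2.351, 1.826, 2.078

d_chi2, d_df, p = chi2_difference(*model_1, *model_2)
print(round(d_chi2, 3), d_df, p < 0.001)  # 66.642 4 True

# Model 3 vs Model 2: only the difference (6.036, df = 2) is reported.
print(round(chi2_sf_even_df(6.036, 2), 3))  # 0.049
```

For odd degrees of freedom a general routine such as the chi-square survival function in a statistics library would be needed instead of this closed form.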

Figure 7.

Four-factor model of the second language writing willingness to communicate scale (L2WWTCS).

Figure 8.

Five-factor model of the second language writing willingness to communicate scale (L2WWTCS).

Figure 9.

The model yielded by hierarchical confirmatory factor analysis (CFA).

Figure 10.

The model yielded in the third-round quantitative data.

Table 7.

Model fit results of measurement invariance models.

Model	χ ²	df	RMSEA	CFI	SRMR
Configural invariance	440.590	235	0.067	0.931	0.086
Metric invariance	433.643	230	0.067	0.932	0.069
Scalar invariance	444.579	242	0.066	0.932	0.066
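The step-to-step changes in the Table 7 fit indices can be tabulated as follows. This Python sketch is illustrative only: the variable and function names are hypothetical, and the tolerance of .01 for a drop in CFI is a commonly used rule of thumb assumed here, not a criterion stated in this article.

```python
# Fit indices copied from Table 7.
TABLE_7 = [
    ("configural", {"cfi": 0.931, "rmsea": 0.067, "srmr": 0.086}),
    ("metric", {"cfi": 0.932, "rmsea": 0.067, "srmr": 0.069}),
    ("scalar", {"cfi": 0.932, "rmsea": 0.066, "srmr": 0.066}),
]


def fit_changes(steps):
    """Return (from, to, delta CFI, delta RMSEA) for consecutive models."""
    changes = []
    for (name_a, a), (name_b, b) in zip(steps, steps[1:]):
        changes.append((
            name_a,
            name_b,
            round(b["cfi"] - a["cfi"], 3),
            round(b["rmsea"] - a["rmsea"], 3),
        ))
    return changes


def cfi_drop_acceptable(delta_cfi, tolerance=0.01):
    """Assumed rule of thumb: CFI should not fall by more than `tolerance`."""
    return delta_cfi >= -tolerance


for name_a, name_b, d_cfi, d_rmsea in fit_changes(TABLE_7):
    print(name_a, "->", name_b, d_cfi, d_rmsea, cfi_drop_acceptable(d_cfi))
```

Under that assumed tolerance, neither step shows a meaningful loss of fit: CFI does not decrease at either step, and RMSEA is unchanged or slightly lower.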

b Internal consistency reliability

Cronbach’s alpha values for the five subscales were .83, .82, .70, .89, and .86, indicating that the questionnaire had high internal consistency reliability. Unlike Cronbach’s alpha, McDonald’s omega takes account of the strength of association between items (McDonald, 1999) and is therefore considered a better alternative. In our data, McDonald’s omega values for the five subscales were .83, .82, .71, .90, and .85, all above the cut-off value of .70.
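For readers unfamiliar with the statistic, Cronbach’s alpha for a subscale can be computed from item variances and the variance of the summed score. The sketch below is a minimal Python illustration using the standard formula; the function name and the response data are hypothetical and are not taken from the study. McDonald’s omega is not shown because it requires factor loadings from a fitted model.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 7-point responses from five respondents to a three-item subscale.
demo = [
    [5, 6, 5],
    [3, 3, 4],
    [4, 4, 4],
    [6, 7, 6],
    [2, 3, 2],
]
print(round(cronbach_alpha(demo), 3))
```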

c Split-half reliability

The questionnaire was split into two halves. One half comprised odd-numbered items, and the other consisted of even-numbered items. The Pearson Correlation Coefficient between the scores for the two halves was .80 (p < .001), which showed that all parts of the questionnaire contributed equally to measuring participants’ writing WTC, indicating high split-half reliability of this scale.
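The odd-even procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, and items are split by their position in the questionnaire, which may differ from a split by the original item numbers because the final scale numbering is not contiguous.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def odd_even_split_half(rows):
    """Correlate odd-position and even-position half scores across respondents."""
    odd_half = [sum(row[0::2]) for row in rows]   # positions 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in rows]  # positions 2, 4, 6, ...
    return pearson_r(odd_half, even_half)


# Hypothetical responses from four respondents to a four-item questionnaire.
demo_rows = [[5, 4, 6, 5], [3, 3, 2, 4], [6, 6, 7, 6], [2, 3, 3, 2]]
print(round(odd_even_split_half(demo_rows), 3))
```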

d Inter-rater reliability

To examine inter-rater reliability, the average score of each item collected in this study was compared with the data collected in a following project. The Pearson Correlation Coefficient between the pairs was .96 (p < .001), indicating very high inter-rater reliability.

5 Qualitative results of participants’ L2 writing WTC

The follow-up interview conducted immediately after the participants finished the questionnaire was intended to delve deeper into their answers, allowing the researcher to enrich the quantitative interpretation and collect face validity evidence. With their answers, we explored the sources of L2 learners’ writing WTC in the Chinese context.

Among the five sub-scales of the L2WWTCS, Writing Task Traits is the most writing task-specific dimension. The interviewees expressed various opinions on the traits of writing tasks that influenced their WTC. Unlike tasks targeting other language skills, writing tasks usually require task-takers to establish a position, present reasons, and evaluate evidence logically, which increases the cognitive demands on learners. Learners therefore spend much more time planning before they write, which could impair their WTC. Once they have an idea, they need to marshal evidence, which cannot be improvised but must draw on prior knowledge accumulated in everyday life. As a result, learners need to employ prior knowledge and think critically about popular topics. Meanwhile, connecting opinions and evidence coherently and cohesively is challenging and requires learners to practice sound reasoning and argumentation. Other task-related factors mentioned by respondents included the structure of essays and how to realize that structure with varied syntax and accurate vocabulary. Cao expressed her ideas on this issue.

Cao (female, 19, learning English for 10 years): For writing, the principal thing is to have a clear viewpoint. Once you have one, you can continue writing by listing your reasons. You need to persuade others, just like debating. If I want to complete this task well, I really need to think a lot. I cannot start writing once I see the task.

The data revealed that interviewees’ interest in learning English could be categorized into two types. One type originated from a natural or nurtured interest in learning languages, regardless of whether English was their major. These learners favored acquiring English in natural settings and took the initiative in seeking out English materials, such as movies, books, and TV series. They had long been keen on learning foreign languages and chose their major out of personal inclination. The other type of interest was closely tied to their recognition of the importance of English. Some English-major interviewees confessed that, although they were initially apathetic toward English, they gradually developed an interest in it. They attributed this change to their immersion in the English-speaking environment created by their department and to their decision to work in English-related fields. Chen Z.’s and Chen Y.’s opinions represented these two types of interest.

Chen Z. (male, 20, learning English for 11 years): I have been interested in learning English since I was a kid. When I was an elementary school student, I watched English TV series on electronic devices at night without my parents’ notice. This has influenced me a lot. Now, I feel that classroom knowledge is not enough for me. Basically, I try to learn more by surfing the Internet.

Chen Y. (male, 20, learning English for 11 years): I am interested in learning English because this is my major. I will work in this field in the future. So, I certainly need to learn more stuff of English. And most of my classes are related to English.

Since participants scored lowest on self-perceived English proficiency, the sources of this perception were carefully investigated. The results revealed that the test-oriented educational system played a prominent role in their low confidence. When explaining why they lacked confidence in their English proficiency, interviewees frequently mentioned their English grades. Test takers can seldom achieve the highest scores consistently, and this shifted their attention to comparison with better performers while they overlooked their personal growth. Cheng’s and Pan’s descriptions echoed this phenomenon.

Cheng (female, 19, learning English for 14 years): Before I went to university, my English grades were not good, so I had no confidence in my English ability. In the university, teachers’ evaluations of me, my poor spoken English, and my roommates’ good grades all make me feel disappointed in myself.

Pan (male, 20, learning English for 11 years): I think my proficiency is maybe worse than pupils in the U.S. Oh, not maybe. It should be a sure thing. My proficiency is worse than native speakers. Because they have the environment. The environment to speak, to use. We don’t have the environment. In other words, our English knowledge is totally not from life but from the classroom or our self-study.

Interviewees’ English Language Ideology seems to be shaped by their recognition of English as a future survival skill. Many of them wanted to find English-related jobs, such as EFL teachers or job positions in multinational corporations. Writing as a fundamental language skill was deemed significant for them to be competitive in the job market. With this cognition, they were more determined to practice writing and more willing to communicate in writing. One typical illustration of this fact was found in Yang’s narrative.

Yang (female, 19, learning English for 12 years): English is my major and the most important thing I want to learn in the four-year study. Another reason is I want to choose a job related to English. So, it’s an essential skill for me, very important to me.

Several characteristics of writing teachers were important in improving EFL learners’ writing WTC. Teachers capable of arranging classroom activities properly and with distinctive charm seemed to be more favored by EFL learners. He’s idea was a generalization of these results.

He (female, 20, learning English for 11 years): Teachers’ teaching styles and their enthusiasm for English or their levels of professionalism influence my recognition of their teaching content. They also determine how willing I am to learn English.

IV Discussion

1 Validation of L2WWTCS

This study examined various aspects of validity and reliability in scale development (see Figure 11). In this section, we discussed how we achieved content, construct, and face validity, three fundamental types of validity evidence (DeVellis, 2017), in our scale development.

Figure 11.

Validity and reliability examined in the scale development.

Content validity refers to the extent to which the scale items adequately cover the content domain of the investigated construct (DeVellis, 2017), which was informed by the literature review and qualitative data in this study. Our literature review indicated that individual and contextual attributes should influence L2 writing WTC simultaneously. This finding was corroborated in the thematic analysis of interview data. The five themes that emerged in the first-round thematic analysis were Writing Task Traits, Individual Differences, Teachers and Peers, Self-Perception of English Language Proficiency, and Miscellaneous. The first, third, and fifth sub-constructs tap into contextual dimensions. In their systematic review, J. Zhang et al. (2018) argued that task features (e.g. topic, type of activity, preparation time, and assessment) were critical situational antecedents of L2 WTC. Meanwhile, teacher and peer factors are also significant cues for L2 WTC. Research has revealed that teacher support, teacher engagement, teacher feedback, teacher–student relationship, and peer support can affect L2 WTC (e.g. Cao, 2011; S.J. Kang, 2005; Zarrinabadi, 2014; J. Zhang et al., 2018). The second and fourth sub-constructs tap into individual characteristics. Language ideology (Subtirelu, 2014), interest (Eddy-U, 2015), and language proficiency (Cao & Philp, 2006; Yashima et al., 2018) are well-documented individual traits related to L2 WTC. An expert in L2 education also confirmed these themes.

Construct validity relates to how well a scale measures the underlying structure, which can be divided into discriminant and convergent validity (DeVellis, 2017). EFA and CFA, including hierarchical and multigroup CFA, were used to examine construct validity. In this study, factor loadings over .40, the absence of cross-loadings, and acceptable latent factor correlations supported discriminant validity; factor loadings over .40 and the placement of items in their theoretically posited latent variables supported convergent validity. The CFA confirmed a 5-factor model: WTT, ELI, WTS, IEL, and SPELP, which are the aforementioned critical cues to L2 WTC. Finally, the model’s measurement invariance was examined with the third-round quantitative data from a different language learning context, indicating a satisfactory fit of the model across second language learning contexts.

Face validity indicates the degree to which the measure appears to be related to the focal construct in the subjective judgment of non-experts (DeVellis, 2017). In this article, we confirmed the scale’s face validity by collecting follow-up qualitative data of EFL learners’ writing WTC sources. The interviewees’ narratives corresponded to the sub-structures confirmed in factor analysis and provided vivid and detailed explanations of each variable, which gave strength to the statistical validation we did previously.

2 Underlying factor structure of L2 writing WTC

The first factor is related to writing task features. In J. Zhang et al.’s (2018) review of the situational antecedents of L2 WTC, task was recognized as a critical overarching category of situational cues. According to their proposed model, task features, including time (e.g. Zarrinabadi, 2014; Zhong, 2013), types of activity (e.g. Cao, 2011; de Saint Léger & Storch, 2009; Eddy-U, 2015; J.-E. Peng, 2012) and topic (e.g. Cao, 2011; S.J. Kang, 2005), and thematic categories of topic (e.g. Cao, 2011), including content knowledge and relevant vocabulary, affect L2 WTC by regulating L2 learners’ confidence and motivation. In the final scale, time and topic were explicitly stated, and activity was implied.

The second factor, English Language Ideology, refers to L2 learners’ attitudes and beliefs about the roles of the English language in their social worlds. Language ideology has evolved as an essential concept in linguistic anthropology, in which its conceptualization is still under debate. In general, language ideology in linguistic anthropology could be roughly defined as people’s concepts concerning the roles of language in social experiences within a cultural group (see Kroskrity, 2004). Subtirelu (2014) imported this concept into the research on WTC, arguing that it would contribute to the theoretical underpinning of WTC. His research has indicated that positive language ideologies could promote WTC significantly. In this scale, items clustered under this factor presented statements on the importance of the English language in academic achievements and future careers.

The third factor, classified as Writing Teacher Support, deals with the role of teachers’ support in promoting writing WTC. Writing teachers’ support stated in this scale includes teachers’ scaffolding, feedback, and other behaviors that may stimulate a sense of appreciation. Scaffolding has been strongly recommended to be added to the instructional repertoire as it can help learners achieve learning targets with motivation (Cotterall & Cohen, 2003; Hammond, 2002). Research has confirmed that certain types of feedback could enhance the language learning or writing process (Hyland & Hyland, 2006; L.J. Zhang & Cheng, 2021) and promote writing motivation and engagement (Yu et al., 2020). Teacher support has been extensively recognized as a situational cue to promote WTC (J. Zhang et al., 2018). Fallah (2014) maintained that teacher immediacy, defined as teachers’ ‘nonverbal and verbal behaviors, which reduce psychological and/or physical distance between teachers and students’ (Christophel & Gorham, 1995, p. 292), influenced WTC by regulating motivation and confidence. Wen and Clément (2003) also pointed out that teachers’ immediacy, attitudes, and styles affected WTC from a sociocultural perspective.

The fourth factor tapped into L2 learners’ Interest in English Language. The research conducted by Amiryousefi (2018) confirmed that interest contributed significantly to L2 learners’ WTC. Interest is a considerable motivator for initiating WTC (Eddy-U, 2015) and can directly influence learners’ behaviors and involvement in learning (Amiryousefi, 2018). L2 learners with higher interest are thus more willing to engage in their L2 writing.

The fifth factor, Self-Perception of English Language Proficiency, has been documented as a highly correlated antecedent of L2 WTC. Some research acknowledged self-perceived communicative competence as the most crucial predictor of L2 WTC (MacIntyre & Charos, 1996; MacIntyre & Gardner, 1994). In MacIntyre et al. (2002), self-perceived communicative competence was the only variable significantly correlated with WTC. Shirvan et al.’s (2019) meta-analysis revealed that self-perceived communicative competence moderately correlated with L2 WTC.

In the two L2 learning contexts we investigated (i.e. L2 English undergraduate learners across majors and English major undergraduates), there were variations in the overall levels of L2 writing WTC and the five underlying constructs. English major undergraduates seemed to have higher L2 writing WTC. This result was mainly attributed to their higher scores in Writing Task Traits, Writing Teacher Support, and Interest in English Language. Surprisingly, although the English major undergraduates had higher English proficiency (i.e. upper-intermediate), they had similar low levels of self-perception of English proficiency as the L2 English undergraduate learners.

V Conclusions

This article contributes to developing and validating a scale for measuring L2 learners’ writing WTC with a sequential embedded mixed-methods design. To ensure theoretical soundness, literature was scrutinized to extract potential contributors to L2 writing WTC. These contributors were further empirically examined by interviewing the target population to generate the item pool. With large-scale data, carefully selected items were evaluated by EFA, CFA, and other scale validation methods. The results confirmed a 5-factor model of L2 writing WTC: Writing Task Traits, English Language Ideology, Writing Teacher Support, Interest in English Language, and Self-Perception of English Language Proficiency. Finally, this underlying model was examined through another round of quantitative testing and, once participants had completed the newly developed scale, immediate retrospective interviews with EFL learners in a different L2 learning context. The results indicated that the scale could accurately reflect their L2 writing WTC.

Validity is the key to the scale development in this article. A sequential embedded mixed-methods design can maximize the chance of achieving high validity. First, building literature reviews and qualitative data into item generation ensures the content validity of the new scale. Second, rigorous statistical examination further illuminates the proposed theoretical structure to increase its construct validity. Then, examining the model in a different context to test its generalizability across contexts strengthens its construct validity. Finally, a follow-up qualitative analysis gives more details of the confirmed model and further provides face validity.

This research has evident implications for scale development in L2 education and for L2 writing learning and teaching. The triangulation of qualitative and quantitative data in scale development should be not only present but also iterative. Considering the diversity and complexity of L2 learning contexts, a well-validated scale should be examined over several rounds with multi-source data. Moreover, to increase L2 learners’ writing WTC, educators and practitioners may focus on motivational predispositions, affective and cognitive factors, and writing-specific features. For motivational predispositions, L2 writing WTC could be enhanced by developing L2 learners’ interest in learning English. Furthermore, affective and cognitive features, including L2 learners’ attitudes toward English and their self-perceived English proficiency, also play indispensable roles. Finally, sufficient knowledge of writing tasks and teacher support also improve L2 learners’ writing WTC.

Appendix A

Appendix B

Appendix C

Descriptive statistics of the first-round quantitative data of the second language writing willingness to communicate scale (L2WWTCS).

Item	M	SD	Skewness	Kurtosis
1	3.37	1.28	0.07	−1.15
2	5.61	1.01	−1.33	2.27
3	5.84	1.04	−1.24	2.18
4	3.94	1.28	−0.20	−0.53
5	3.56	1.42	0.24	−0.83
6	3.10	1.31	0.24	−0.64
7	3.45	1.30	0.16	−0.79
8	4.65	1.23	−0.54	−0.34
9	4.98	1.32	−0.73	0.19
10	5.27	1.27	−0.66	0.06
11	3.29	1.30	0.15	−0.75
12	4.79	1.39	−0.67	0.35
13	3.66	1.26	−0.04	−0.59
14	5.11	1.24	−0.88	0.32
15	4.53	1.47	−0.36	−0.81
16	4.15	1.19	−0.20	−0.32
17	5.51	1.11	−1.04	1.65
18	5.21	1.29	−1.00	0.96
19	4.95	1.20	−0.66	0.17
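The four descriptive statistics reported for each item can be obtained as in the Python sketch below. It is an illustration only: the function name and the example responses are hypothetical, and skewness and kurtosis are computed here as moment-based coefficients (with excess kurtosis), which is one of several conventions. Statistical packages often apply small-sample corrections, so their output can differ slightly, and the article does not state which convention was used.

```python
from statistics import mean, stdev


def describe(scores):
    """Return M, sample SD, moment-based skewness, and excess kurtosis."""
    n = len(scores)
    m = mean(scores)
    m2 = sum((x - m) ** 2 for x in scores) / n
    m3 = sum((x - m) ** 3 for x in scores) / n
    m4 = sum((x - m) ** 4 for x in scores) / n
    return {
        "M": m,
        "SD": stdev(scores),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3,
    }


# Hypothetical 7-point responses to a single item.
item_scores = [3, 4, 2, 5, 3, 6, 4, 3, 2, 5]
print({key: round(value, 2) for key, value in describe(item_scores).items()})
```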

Appendix D

Descriptive statistics of the second-round quantitative data of the second language writing willingness to communicate scale (L2WWTCS).

Item	M	SD	Skewness	Kurtosis
1	3.38	1.37	0.10	−1.00
2	5.14	1.30	−1.01	0.69
3	5.49	1.25	−0.91	0.87
4	3.75	1.37	0.06	−0.70
6	3.17	1.36	0.25	−0.96
7	3.64	1.43	0.03	−0.86
8	4.75	1.31	−0.49	0
10	4.85	1.47	−0.45	−0.31
11	3.25	1.35	0.29	−0.64
12	4.12	1.52	−0.26	−0.66
13	3.36	1.26	0.30	−0.08
16	3.87	1.34	−0.08	−0.37
17	5.36	1.33	−0.93	0.75
18	4.61	1.53	−0.54	−0.47
20	3.01	1.29	0.33	−0.74
21	5.78	1.19	−1.39	2.69
22	4.06	1.62	−0.08	−0.91

Note. 0 is a rounded number.

Appendix E

Descriptive statistics of the third-round quantitative data of the second language writing willingness to communicate scale (L2WWTCS).

Item	M	SD	Skewness	Kurtosis
1	3.21	1.34	0.29	−0.80
2	5.17	1.35	−1.25	1.20
3	5.58	1.24	−0.93	0.97
4	3.74	1.30	0	−0.58
6	3.20	1.28	0.25	−0.90
7	3.78	1.41	−0.08	−1.02
8	4.88	1.34	−0.64	0.06
10	4.90	1.41	−0.41	−0.34
11	3.34	1.33	0.22	−0.73
12	4.39	1.36	−0.35	−0.30
13	3.37	1.15	0.14	−0.69
16	3.92	1.20	−0.25	0.25
17	5.62	1.09	−0.91	1.10
18	4.93	1.40	−0.86	0.28
20	3.05	1.21	0.22	−0.59
21	5.94	1.09	−1.56	4.07
22	4.24	1.53	−0.35	−0.67

Note. 0 is a rounded number.

Author contribution statement

Y. Zhang conceived and designed the study, collected and analysed the data, and drafted the manuscript. All the authors revised and approved the manuscript. L.J. Zhang finalized it for submission as the corresponding author.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The project was supported by a joint doctoral scholarship awarded to the first author by The University of Auckland, New Zealand and the China Scholarship Council (CSC NO. 202108250009).

Ethical approval

This research was approved by the Human Participants Ethics Committee of the University of Auckland, New Zealand (reference number UAHPEC22974).

ORCID iDs

Yujie Zhang

Lawrence Jun Zhang

References

Amiryousefi (2018). Willingness to communicate, interest, motives to communicate with the instructor, and L2 speaking: A focus on the role of age and gender. Innovation in Language Learning and Teaching, 12, 221–234.

Baker S.C. & MacIntyre P.D. (2000). The role of gender and immersion in communication and second language orientations. Language Learning, 50, 311–341.

Botes, Dewaele J.M., & Greiff (2022). Taking stock: A meta-analysis of the effects of Foreign Language Enjoyment. Studies in Second Language Learning and Teaching, 12, 205–232.

Braun & Clarke (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3, 77–101.

Cao (2011). Investigating situational willingness to communicate within second language classrooms from an ecological perspective. System, 39, 468–479.

Cao & Philp (2006). Interactional context and willingness to communicate: A comparison of behaviour in whole class, group and dyadic interaction. System, 34, 480–493.

Christophel D.M. & Gorham (1995). A test–retest analysis of student motivation, teacher immediacy, and perceived sources of motivation and demotivation in college classes. Communication Education, 44, 292–306.

Clément, Baker S.C., & MacIntyre P.D. (2003). Willingness to communicate in a second language: The effects of context, norms, and vitality. Journal of Language and Social Psychology, 22, 190–209.

Cortina J.M., Sheng, Keener S.K., et al. (2020). From alpha to omega and beyond! A look at the past, present, and (possible) future of psychometric soundness in the Journal of Applied Psychology. Journal of Applied Psychology, 105, 1351–1381.

Cotterall & Cohen (2003). Scaffolding for second language writers: Producing an academic essay. ELT Journal, 57, 158–166.

Creswell J.W., Plano Clark V.L., & Garrett A.L. (2008). Methodological issues in conducting mixed methods research designs. In Bergman (Ed.), Advances in mixed methods research (pp. 66–84). Sage.

de Saint Léger & Storch (2009). Learners’ perceptions and attitudes: Implications for willingness to communicate in an L2 classroom. System, 37, 269–285.

Derakhshan, Zhang L.J., & Zhaleh (2023). The effects of instructor clarity and non-verbal immediacy on Chinese and Iranian EFL students’ affective learning: The mediating role of instructor understanding. Studies in Second Language Learning and Teaching, 13, 71–100.

DeVellis R.F. (2017). Scale development: Theory and applications (4th ed.). Sage.

Dewaele J.M. & Dewaele (2018). Learner-internal and learner-external predictors of willingness to communicate in the FL classroom. Journal of the European Second Language Association, 2, 24–37.

Dörnyei (2005). The psychology of the language learner: Individual differences in second language acquisition. Lawrence Erlbaum.

Dörnyei & Kormos (2000). The role of individual and social variables in oral task performance. Language Teaching Research, 4, 275–300.

Dörnyei & Taguchi (2009). Questionnaires in second language research: Construction, administration, and processing. Routledge.

Eddy-U (2015). Motivation for participation or non-participation in group tasks: A dynamic systems model of task-situated willingness to communicate. System, 50, 43–55.

Fallah (2014). Willingness to communicate in English, communication self-confidence, motivation, shyness and teacher immediacy among Iranian English-major undergraduates: A structural equation modeling approach. Learning and Individual Differences, 30, 140–147.

Hair J.F., Tatham R.L., Anderson R.E., & Black (1998). Multivariate data analysis. Prentice Hall.

Hammond (2002). Scaffolding teaching and learning in language and literacy education. Primary English Teaching Association, Australia.

Hyland & Hyland (2006). Feedback on second language students’ writing. Language Teaching, 39, 83–101.

Ivankova N.V., Creswell J.W., & Stick (2006). Using mixed methods sequential explanatory design: From theory to practice. Field Methods, 18, 3–20.

Joe H.K., Hiver, & Al-Hoorie A.H. (2017). Classroom social climate, self-determined motivation, willingness to communicate, and achievement: A study of structural relationships in instructed second language settings. Learning and Individual Differences, 53, 133–144.

Kang D.M. (2014). The effects of study-abroad experiences on EFL learners’ willingness to communicate, speaking abilities, and participation in classroom interaction. System, 42, 319–332.

Kang S.J. (2005). Dynamic emergence of situational willingness to communicate in a second language. System, 33, 277–292.

Kroskrity P.V. (2004). Language ideologies. In Duranti (Ed.), A companion to linguistic anthropology (pp. 496–517). Blackwell.

MacIntyre P.D., Babin P.A., & Clément (1999). Willingness to communicate: Antecedents and consequences. Communication Quarterly, 47, 215–229.

MacIntyre P.D., Baker S.C., Clément, & Donovan L.A. (2002). Sex and age effects on willingness to communicate, anxiety, perceived competence, and L2 motivation among junior high school French immersion students. Language Learning, 52, 537–564.

MacIntyre P.D. & Charos (1996). Personality, attitudes, and affect as predictors of second language communication. Journal of Language and Social Psychology, 15, 3–26.

MacIntyre P.D., Clément, & Conrod (2001). Willingness to communicate, social support, and language-learning orientations of immersion students. Studies in Second Language Acquisition, 23, 369–388.

MacIntyre P.D., Clément, Dörnyei, & Noels K.A. (1998). Conceptualizing willingness to communicate in a L2: A situational model of L2 confidence and affiliation. Modern Language Journal, 82, 545–562.

MacIntyre P.D. & Gardner R.C. (1994). The subtle effects of language anxiety on cognitive processing in the second language. Language Learning, 44, 283–305.

MacIntyre P.D. & Legatto J.J. (2011). A dynamic system approach to willingness to communicate: Developing an idiodynamic method to capture rapidly changing affect. Applied Linguistics, 32, 149–171.

McCroskey J.C. (1992). Reliability and validity of the willingness to communicate scale. Communication Quarterly, 40, 16–25.

McCroskey J.C. & Baer J.E. (1985). Willingness to communicate: The construct and its measurement. Unpublished article presented at the Speech Communication Association convention, Denver, CO, USA.

McCroskey J.C. & Richmond V.P. (1991). Willingness to communicate: A cognitive view. In Booth-Butterfield (Ed.), Communication, cognition, and anxiety (pp. 19–37). Sage.

McDonald R.P. (1999). Test theory: A unified treatment. Lawrence Erlbaum.

Nowell L.S., Norris J.M., White D.E., & Moules N.J. (2017). Thematic analysis: Striving to meet the trustworthiness criteria. International Journal of Qualitative Methods, 16, 1–13.

41.

Peng

(2013). The challenge of measuring willingness to communicate in EFL contexts. The Asia-Pacific Education Researcher, 22, 281–290.

42.

Peng

J.-E.

(2012). Towards an ecological understanding of willingness to communicate in EFL classrooms in China. System, 40, 203–213.

43.

Peng

J.-E.

Woodrow

(2010). Willingness to communicate in English: A model in the Chinese EFL classroom context. Language Learning, 60, 834–876.

44.

Ryan

(2009). Self and identity in L2 motivation in Japan: The ideal L2 self and Japanese learners of English. In Dörnyei

Ushioda

(Eds.), Motivation, language identity and the L2 self (pp. 120–143). Multilingual Matters.

45. Sato (2023). Examining fluctuations in the WTC of Japanese EFL speakers: Language proficiency, affective and conditional factors. Language Teaching Research, 27, 974–994.

46. Shirvan M.E., Khajavy G.H., MacIntyre P.D., Taherian (2019). A meta-analysis of L2 willingness to communicate and its three high-evidence correlates. Journal of Psycholinguistic Research, 48, 1241–1267.

47. Sudina (2023). Scale quality in second-language anxiety and WTC: A methodological synthesis. Studies in Second Language Acquisition, 45, 1427–1455.

48. Subtirelu (2014). A language ideological perspective on willingness to communicate. System, 42, 120–132.

49. Swain (1995). Three functions of output in second language learning. In Cook, Seidlhofer (Eds.), Principles and practice in applied linguistics: Studies in honor of H.G. Widdowson (pp. 125–144). Oxford University Press.

50. Teimouri (2017). L2 selves, emotions, and motivated behaviors. Studies in Second Language Acquisition, 39, 681–709.

51. Wang, Derakhshan, Zhang L.J. (2021). Researching and practicing positive psychology in second/foreign language learning and teaching: The past, current status and future directions. Frontiers in Psychology, 12, 731721.

52. Weaver (2005). Using the Rasch model to develop a measure of second language learners’ willingness to communicate within a language classroom. Journal of Applied Measurement, 6, 396–415.

53. Wen W.P., Clément (2003). A Chinese conceptualisation of willingness to communicate in ESL. Language, Culture and Curriculum, 16, 18–38.

54. Yashima (2002). Willingness to communicate in a second language: The Japanese EFL context. Modern Language Journal, 86, 54–66.

55. Yashima, MacIntyre P.D., Ikeda (2018). Situated willingness to communicate in an L2: Interplay of individual characteristics and context. Language Teaching Research, 22, 115–137.

56. Yashima, Zenuk-Nishide, Shimizu (2004). The influence of attitudes and affect on willingness to communicate and second language communication. Language Learning, 54, 119–152.

57. Jiang, Zhou (2020). Investigating what feedback practices contribute to students’ writing motivation and engagement in Chinese EFL context: A large scale study. Assessing Writing, 44, 100451.

58. Zarrinabadi (2014). Communicating in a second language: Investigating the effect of teacher on learners’ willingness to communicate. System, 42, 288–295.

59. Zhang, Beckmann J.F. (2018). To talk or not to talk: A review of situational antecedents of willingness to communicate in the second language classroom. System, 72, 226–239.

60. Zhang L.J., Cheng (2021). Examining the effects of comprehensive written corrective feedback on L2 EAP students’ linguistic performance: A mixed-methods study. Journal of English for Academic Purposes, 54, 101043.

61. Zhong Q.M. (2013). Understanding Chinese learners’ willingness to communicate in a New Zealand ESL classroom: A multiple case study drawing on the theory of planned behavior. System, 41, 740–751.

Developing and validating an L2 writing willingness to communicate scale: A sequential embedded mixed-methods approach
