Abstract
This study examined how Chinese native speakers (NSs) and second language (L2) learners process compound words. The findings showed that both groups relied on a hybrid model in which whole-word and morpheme representations coexist, and that their processing was influenced by word frequency, semantic transparency, and word structure. Both groups used hybrid representation when identifying high-frequency words and whole-word representation when identifying low-frequency words. In addition, semantic transparency might affect word structure awareness, and subject-predicate words were the most difficult to process. The research also showed that the word frequency effect was more robust in L2 learners than in NSs; morpheme position information might affect NSs, but L2 learners could not process it. NSs recognized transparent and opaque words at different speeds, whereas L2 learners showed no such difference. Finally, L2 learners’ word recognition speed did not reach the level of NSs.
Plain Language Summary
Since the concept of the mental lexicon was put forward, researchers have studied how words are processed and represented. Compound words are an important part of the vocabulary, and their processing mechanism has received much attention. Many studies have examined how native speakers process compound words, but research on L2 learners is still in its infancy, and most of it concerns L2 learners of English, with little research on other languages such as Chinese as a second language. Compound words account for about 65% of the vocabulary in Mandarin Chinese, so research on the processing of Chinese compound words is of great significance. To address the disputes over compound word processing in native Chinese speakers and L2 learners, the present study designed two lexical decision experiments using the repetition priming paradigm with reaction time as the measure; the independent variables were word frequency, semantic transparency, and lexical structure. This work is the first to explore the interaction between semantic transparency and lexical structure in compound word processing. It gives insight into compound word processing in a non-English second language and enriches the theoretical knowledge of L2 word processing from a cross-language perspective. The findings show that both groups employ a hybrid model in which whole-word and morpheme representations coexist, and that they are affected by word frequency, semantic transparency, and word structure. Both Chinese native speakers (NSs) and L2 learners use hybrid representation when identifying high-frequency words and whole-word representation when identifying low-frequency words; word structure awareness is affected by semantic transparency, and subject-predicate words are the most difficult to process.
However, we also found that the word frequency effect is stronger in L2 learners than in NSs; morpheme position information may affect NSs, but L2 learners cannot process it; and NSs recognize transparent and opaque words at different speeds, whereas L2 learners show no such difference. In addition, L2 learners’ word recognition speed and aptitude cannot reach the levels of NSs. Thus, Chinese L2 learners differ from English L2 learners in their lexical representations, as well as in the effects of semantic transparency, which further enriches the theory of L2 lexical processing.
Introduction
Compound words are ubiquitous in human language, and compounding is an important means of word formation (Libben et al., 2020). The study of compound word processing can help explicate word representation in the mental lexicon, and psycholinguistic researchers have therefore conducted a series of studies on it (Libben, 1998; Pollatsek et al., 2000; Sandra, 1990; Smolka & Libben, 2017). Compound words account for about 65% of the vocabulary in Mandarin Chinese (Chen & Duanmu, 2016), so research on Chinese compound word processing is of great significance. Previous studies have explored compound processing in NSs, and their conclusions fall into two categories: decomposition representation (B. Zhang & Peng, 1992) and hybrid representation (Peng, Liu, & Wang, 1999; Tian, Liu, & Wang, 2009). In addition, researchers have studied the influence of word frequency (Yan et al., 2006), semantic transparency (C. Wang & Peng, 1999), and word structure (B. Zhang & Peng, 1992) on compound word processing.
Studying learners of Chinese as a second language (CSL) can extend the findings on compound word processing, but existing studies are few, the dimensions they address are limited, and their conclusions are controversial (Gao et al., 2022; Jiang et al., 2020). In contrast, there has been initial progress on the processing of L2 compound words in alphabetic languages, where most scholars support the morphological decomposition hypothesis (Cheng et al., 2011; Li et al., 2017; M. Wang et al., 2010). However, as a typical isolating language, Chinese differs greatly from alphabetic languages in its word-formation rules. Written Chinese uses characters as its carrier and does not mark word boundaries. Therefore, L2 learners may process compounds differently in Chinese than in alphabetic languages. Building on an investigation of NSs, the current study examined how L2 learners process compound words. It focuses on the representation of Chinese L2 compound words and on the effects of word frequency, semantic transparency, and word structure on their processing. In addition, the present study compared L2 learners with NSs to explore how closely L2 learners’ compound word processing can approach that of NSs, with the aim of developing a theory of L2 lexical processing.
Literature Review
How Native Chinese Speakers Process Compound Words
Controversy has surrounded compound word representation. The morphological decomposition hypothesis holds that a compound word is automatically decomposed into its morphemic constituents. B. Zhang and Peng (1992) investigated the processing of modifying and coordinative words. They found significant frequency effects for both the first and the last morphemes of coordinative words, while for modifying words the final morpheme showed a significant effect on error rate. The significant frequency effects of the initial and final morphemes indicate that words are accessed via decomposed morphemes, which supports the decomposition hypothesis. Peng et al. (1994) studied disyllabic compound words and binding words and argued that the facilitation effects of constituent primes and whole-word primes were equally robust, that is, a full priming effect. They also found that participants recognized words faster when the first constituent served as the prime than when the second constituent did. This result supports the morphological decomposition hypothesis. W. Wang et al. (2017) suggested that morpheme frequency affects compound word processing and provided evidence of sub-lexical representations in Chinese compound words.
However, the hybrid model holds that, in addition to morphemic access units, lexical representation also includes a whole-word form. Zhou and Marslen-Wilson (1994, 1995) provided evidence to support a two-layer model with whole-word and morphemic levels (the Multi-Level Cluster Representation Model). Peng, Liu, & Wang (1999) proposed the Inter/Intra Connection model (IIC), in which morphemes and whole words co-exist in the same layer and word properties affect the connections between the two kinds of units. Many experiments have supported the IIC model (Ding & Peng, 2006; Zhang, 2009).
Both hypotheses accept morphemic representation, but some scholars (Peng et al., 1994; W. Wang et al., 2017; B. Zhang & Peng, 1992) do not recognize whole-word representation. Therefore, evidence is still needed to determine whether Chinese NSs represent compound words with a whole-word unit and whether the hybrid hypothesis is valid.
In addition, lexical properties such as word frequency, semantic transparency, and word structure affect compound word recognition. Word frequency affects compound representation. Some researchers have found that high-frequency words are represented as wholes, while low-frequency words are represented by decomposition (Yan et al., 2006). Others have reached the opposite conclusion: C. Wang and Peng (1999) found that high-frequency words are processed by decomposition, while low-frequency words are processed via the whole-word route. Therefore, the influence of word frequency on compound word processing is still worth exploring. Semantic transparency also affects compound word representation, and transparent words are recognized faster than opaque words. It further affects the relationship between morphemes and the whole word: in transparent words the connection between them is positive, and in opaque words it is negative (C. Wang & Peng, 1999). Word structure is another factor affecting compound word recognition. B. Zhang and Peng (1992) found that both the initial and the final morphemes play a role in coordinative words, whereas only the final morpheme plays a role in modifying words. However, other structures have rarely been examined in previous studies.
How Chinese L2 Learners Process Compound Words
How are morphemes and whole words stored in the representation of L2 compounds? In studies of L2 compound words in English, Cheng et al. (2011) and M. Wang et al. (2010) examined Chinese–English bilingual children and adults, respectively, and found that English L2 compound words are processed by morphemic decomposition. Li et al. (2017) found that L2 learners produced robust and statistically equivalent priming effects with transparent and opaque compound primes. In addition, they observed a robust orthographic priming effect when the overlap was word-initial but no such effect when it was word-final. These results indicated that English L2 learners are not affected by semantics in the early processing stage and that they then decompose words into morphemes through a sublexical morpho-orthographic decomposition mechanism. L2 processing thus appears similar to L1 processing, with both adopting a decomposition model for compound words.
However, CSL compound word processing remains controversial. Jiang et al. (2020) examined the stroke-number effect of Chinese characters and found that only CSL speakers, and not NSs, showed a significant stroke-number effect. This result provided empirical support for L2 learners’ analytical processing strategies. However, Gao et al. (2022) provided evidence to the contrary. They found that, in both advanced CSL speakers and NSs, whole-word priming and initial morpheme priming were robust and the whole-word priming effect was larger than the initial morpheme priming effect. They suggested that advanced L2 learners and NSs might employ a whole-word pathway.
In addition, lexical properties such as word frequency and semantic transparency also affect compound word processing in L2. Many studies have confirmed the word frequency effect in L2 learners. M. Wang and Koda (2005) tested two groups of English L2 learners (Chinese and Korean) and found that their naming performance was faster and more accurate for high-frequency words than for low-frequency words. The same conclusion has been reached in the study of Chinese L2 learners’ compound word processing (Jiang et al., 2020). There is also an interesting phenomenon in the word frequency effect: with the same set of stimuli, L2 learners tend to produce a more robust frequency effect than NSs (de Groot et al., 2002; Van Wijnendaele & Brysbaert, 2002). Does this phenomenon occur in Chinese as a second language? No similar conclusions have been reached thus far.
Does semantic transparency play a role for L2 learners? Cheng et al. (2011) and M. Wang et al. (2010) found that Chinese-English bilingual children and adults use decomposition strategies when processing both transparent and opaque words, which suggests that semantic transparency does not affect English bilinguals’ processing of compound words. Uygun and Gürel (2017) and Li et al. (2017) also found that semantic transparency does not affect the early processing of compound words in English L2 learners with different mother tongues. However, some scholars hold the opposite view. Gan and Zhang (2013) studied English L2 learners whose native language is Chinese and pointed out that both the morphemes and the whole word promote access to transparent words. Thus, the role of semantic transparency in L2 learners’ compound word processing remains controversial, and the controversy centers on how transparent words are processed. Most studies on compound word processing by Chinese native speakers have concluded that semantic transparency affects compound word recognition (Tse & Yap, 2018; C. Wang & Peng, 1999). Meanwhile, most studies on the semantic transparency of Chinese L2 compound words have used paper-based tests; to our knowledge, there have been no online studies. Therefore, the role of semantic transparency in recognizing Chinese L2 compound words is worth exploring.
L2 learners’ awareness of word structure is essential in predicting their reading ability. Feng (2003) pointed out that Chinese L2 learners’ compound word recognition is influenced by word structure and that recognition patterns differ between modifying words and coordinative words. The initial and final morphemes of high-frequency coordinative words have the same effect on word recognition, whereas in modifying words the final morpheme plays a more significant role than the initial morpheme. Hao and Li (2015) found an effect only for the final morpheme of high-frequency modifying words. To date, scholarship has focused on coordinative and modifying words but not on other structures.
In summary, Chinese L2 compound word processing is a complex cognitive process influenced by many factors. However, existing studies are few, narrow in scope, and inconsistent in their conclusions. In terms of research methods, most studies use paper-based offline tasks, with little online research. In terms of research focus, they examine single factors in compound word processing, with little inquiry into multiple factors or the interactions between them. Furthermore, existing studies have reached disparate conclusions on issues such as morpheme decomposition versus whole-word representation, the influence of word frequency, and the influence of semantic transparency.
Method
Research Question
The primary purpose of this study was to explore compound word processing in native Chinese speakers and L2 learners and to observe the effects of word frequency, semantic transparency, and lexical structure on compound word recognition in both groups. In addition, this study compared the experimental results of Chinese NSs and L2 learners to explore how closely Chinese L2 learners’ compound word processing approaches that of NSs. Adopting the repetition priming paradigm, this study designed two lexical decision experiments (Jiang, 2013) to explore the following questions:
What units are used by native speakers and L2 learners of Chinese for representation during compound word recognition?
How do word frequency, semantic transparency, and word structure affect compound word processing in native speakers and L2 learners of Chinese?
Two experiments were conducted. Experiment 1 examined compound word processing and the effects of word frequency on word recognition in native speakers and L2 learners of Chinese. Experiment 2 examined the effects of semantic transparency and word structure on the recognition of compound words by native and L2 speakers of Chinese.
Participants
Thirty-two native Chinese speakers and 32 high-level Chinese L2 learners participated in Experiment 1. The native speaker group consisted of non-linguistics undergraduate students. The participants in the L2 group were all Indonesian students of non-Chinese origin who had begun formally learning Chinese after the age of 12 and had been studying Chinese in China for 3 to 4 years. They had passed HSK Level V (Hanyu Shuiping Kaoshi, an international standardized test of Chinese language proficiency). All participants were between the ages of 18 and 25. Because the error rate of three L2 participants exceeded 20%, three additional L2 participants were recruited, which brought the total number of participants in the experiment to 67.
Participants for Experiment 2 were selected using the same criteria as in Experiment 1, and the two experiments were separated by more than 2 weeks to prevent interference from Experiment 1. Because the error rate of four L2 participants exceeded 20%, four additional L2 participants were recruited, resulting in 68 participants in Experiment 2.
We obtained informed written consent from all participants prior to the experiment. This research was approved by the ethics committee of Huaqiao University (ethics approval code M2021014).
Design and Materials
Experiment 1 used a lexical decision task based on the repetition priming paradigm, with a 2 (Participant Type: L1 vs. L2) × 2 (Frequency: high vs. low) × 4 (Prime Type: initial morpheme, final morpheme, whole-word, and symbol) design. Taking the target word 管理 (manage) as an example, 管 (pipe), 理 (reason), and 管理 served as the initial-morpheme, final-morpheme, and whole-word primes, respectively, and “####” served as the symbol prime, the baseline against which the other prime types were measured (Peng et al., 1994).
The materials were selected from the primary and intermediate words in
Information on the Materials in Experiment 1.
Forty non-words were constructed by combining characters whose frequency was similar to that of the morphemes in the key materials. These character combinations did not form real words or make sense, and they were matched with the key words in stroke number and morpheme frequency (
A Latin square design was used to counterbalance the real-word materials and the non-words across lists. To rotate the key materials through the conditions, four lists were created, and each participant group (NSs and L2 learners) was divided into four subgroups of eight people. Each participant completed only one list, which contained 40 target words. Across the four lists, the same target word was paired with a different prime type.
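The list rotation described above can be sketched as follows. This is a minimal illustration, not the authors' actual list-construction script: the function name `build_lists` and the use of integer target indices are assumptions made for the example, and the sketch rotates prime type only, without modeling word frequency or the non-word fillers.

```python
# Hypothetical sketch of Latin-square counterbalancing of prime types.
# Prime-type labels follow the design statement; everything else is illustrative.
PRIME_TYPES = ["initial morpheme", "final morpheme", "whole word", "symbol"]


def build_lists(n_targets=40, prime_types=PRIME_TYPES):
    """Return one dict per list, mapping target index -> prime type.

    Target t in list i receives prime_types[(t + i) % k], so every target
    meets each prime type exactly once across the k lists.
    """
    k = len(prime_types)
    lists = []
    for list_idx in range(k):
        assignment = {
            target: prime_types[(target + list_idx) % k]
            for target in range(n_targets)
        }
        lists.append(assignment)
    return lists


lists = build_lists()
```

With 40 targets and four prime types, each list in this sketch contains 10 targets per prime type, and no participant sees the same target under more than one prime.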
Experiment 2 used a 2 (Participants Type: L1 vs. L2) × 2 (Semantic Transparency: transparent vs. opaque) × 4 (Prime Type: initial morpheme, final morpheme, whole-word, and symbol) × 5 (Word Structure: modifying word, coordinative word, verb-object word, verb-complement word, subject-predicate word) design.
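To make the factorial structure concrete, the following sketch enumerates the cells of the 2 × 2 × 4 × 5 design. The dictionary keys and the helper `design_cells` are illustrative names, not part of the original materials.

```python
from itertools import product

# Factor levels as listed in the design statement for Experiment 2.
FACTORS = {
    "participant_type": ["L1", "L2"],
    "semantic_transparency": ["transparent", "opaque"],
    "prime_type": ["initial morpheme", "final morpheme", "whole-word", "symbol"],
    "word_structure": [
        "modifying", "coordinative", "verb-object",
        "verb-complement", "subject-predicate",
    ],
}


def design_cells(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


cells = design_cells(FACTORS)

# Transparency and structure are properties of the words themselves,
# so they define the item-level categories of the key materials.
item_categories = {
    (c["semantic_transparency"], c["word_structure"]) for c in cells
}
```

The full crossing yields 80 design cells, while the words themselves fall into 10 item-level categories (2 transparency levels × 5 structures); prime type is rotated within items and participant type varies between subjects.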
The source of the material was the same as that of Experiment 1. First, 200 high-frequency words were randomly selected, and they were evaluated with the same semantic transparency scale to select transparent words and opaque words. Then, by determining the word structure of 200 high-frequency words, we found that verb-complement words were the least common, with only eight high-frequency opaque words. Therefore, we chose qualified words from the other four compound word categories based on semantic transparency and determined that the key material consisted of 80 words. These key materials were matched in the number of strokes, word frequency, initial and final morpheme frequency, the number of families of morphemes, vagueness, and concreteness (
Information on the Materials in Experiment 2.
Note. Word frequency: 1/1.31 million. Example meanings: 森林, forest; 范围, scope.
Procedure
The experiment was programmed in E-Prime 3.0 and conducted in a laboratory on a Dell laptop. Stimuli were presented in the center of the screen as black characters on a white background.
Each trial proceeded as follows. A fixation cross “+” was first presented for 500 ms to signal the beginning of the trial, followed by a prime in Song typeface (管) for 500 ms. The prime was followed by a mask (****) for 200 ms. The mask was then immediately replaced by the target word in regular script (管理), presented in the same size as the prime. The target word remained on the screen until the participant responded by pressing a key or until a 3,000 ms timeout. Participants judged the target by pressing the “J” key with the right hand for real words or the “D” key with the left hand for non-words. A blank screen was presented for 1,000 ms after the key press, followed by the next trial. If no judgment was made within 3,000 ms, the response was scored as an error. Participants were asked to respond as quickly and accurately as possible. Before the experiment, participants completed eight practice trials and did not enter the formal part of the experiment until they were familiar with the procedure. A typical trial looked like this: + - 管 - ****- 管理 (Figure 1).

The procedure of lexical decision task based on repetition priming paradigm.
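The trial timeline can be summarized in a short sketch that derives the stimulus onset asynchrony (SOA) from the reported durations. The event names and the helper `onset_of` are illustrative; the durations are those stated in the procedure, with the target duration being the maximum before timeout.

```python
# Trial events in presentation order with durations in ms, as reported above.
TRIAL_EVENTS = [
    ("fixation", 500),
    ("prime", 500),
    ("mask", 200),
    ("target", 3000),  # upper bound: the target disappears on key press
    ("blank", 1000),
]


def onset_of(event_name, events=TRIAL_EVENTS):
    """Return the onset time (ms from trial start) of the named event."""
    elapsed = 0
    for name, duration in events:
        if name == event_name:
            return elapsed
        elapsed += duration
    raise KeyError(event_name)


# SOA: interval between prime onset and target onset.
soa = onset_of("target") - onset_of("prime")
```

Under these durations the prime-to-target SOA is the 500 ms prime plus the 200 ms mask, that is, 700 ms.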
Data Analysis
Only correct responses to key words were included in the reaction time (RT) analysis. RTs shorter than 150 ms and RTs more than 2.5 standard deviations from the mean were excluded.
This study used mixed-effects modeling to estimate the effects of the experimental factors, with the lme4 package (Bates et al., 2015) in R used to analyze participants’ RT data. The mixed-effects model was as follows: M.01 <- lmer(RT ~ Category * Condition + (1 | participants) + (1 | items), data = Gjc). In this model, RT was the participants’ reaction time, the dependent variable of the experiment. Category was participant type (NSs vs. L2 learners), and Condition represented the experimental conditions. All factors were of interest, and the ANOVA() function was used to analyze their main effects. Finally, the
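The exclusion criteria can be sketched as a small filtering function. Several details here are assumptions, because the text does not specify them: the trial format (dicts with `rt` and `correct` keys), the order of the steps (the 150 ms floor is applied before the standard-deviation cutoff), and the use of a single overall mean and sample standard deviation rather than per-participant or per-condition values.

```python
from statistics import mean, stdev


def trim_rts(trials, min_rt=150, sd_cutoff=2.5):
    """Keep correct trials with plausible RTs.

    Assumed order: drop errors and RTs below min_rt first, then drop RTs
    farther than sd_cutoff sample SDs from the mean of the remaining trials.
    """
    kept = [t for t in trials if t["correct"] and t["rt"] >= min_rt]
    if len(kept) < 2:  # stdev needs at least two observations
        return kept
    m = mean(t["rt"] for t in kept)
    s = stdev(t["rt"] for t in kept)
    return [t for t in kept if abs(t["rt"] - m) <= sd_cutoff * s]


# Invented example data, not from the experiment.
example_trials = (
    [{"rt": 600, "correct": True} for _ in range(20)]
    + [{"rt": 5000, "correct": True}]   # extreme slow response
    + [{"rt": 100, "correct": True}]    # anticipatory response
    + [{"rt": 600, "correct": False}]   # error trial
)
trimmed = trim_rts(example_trials)
```

In this invented example the error trial, the 100 ms response, and the 5,000 ms response are removed, leaving the 20 ordinary trials.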
Results
Experiment 1: Word Frequency’s Influence on Transparent Compound Word Processing
Descriptive statistics for participants’ RT under different conditions are shown in Table 3 and Figure 2.
Mean Reaction Times Per Condition in Experiment 1 (ms).

Mean reaction times per condition in Experiment 1 (ms).
We used mixed-effects modeling to fit the participants’ RT data and report the β values as effect sizes. The fitted mixed-effects model was as follows: M.01 <- lmer(RT ~ Category * Condition + (1 | participants) + (1 | items), data = Gjc). In this model, the dependent variable was RT. Category and Condition were the independent variables and the fixed effects of the model, and participants and items were its random effects. The conditions included prime type and word frequency.
The main effect of the participant was significant. The RT in the NS group was shorter than that in the L2 group (χ2(1) = 777.89,
The main effect of the frequency of the NS group was not significant (χ2(1) = .68,
The main effect of the frequency of the L2 group reached significance (χ2(1) = 83.21,
Experiment 2: Semantic Transparency and Lexical Structure’s Influence on High-frequency Compound Word Processing
The descriptive statistics of the participants’ RT under different conditions are shown in Table 4.
Mean Reaction Times Per Condition in Experiment 2 (ms).
The data analysis was similar to that of Experiment 1. The fitted mixed-effects model was as follows: M.01 <- lmer(RT ~ Category * Condition + (1 | participants) + (1 | items), data = Gjc). In this model, the conditions included prime type, semantic transparency, and word structure.
The main effect of the participants was significant, and the NSs’ RT was shorter than that of the L2 learners (χ2(1) = 1,911.23,
In the NS group, the main effect of semantic transparency was significant (χ2(1) = 5.85,

Mean reaction times per condition in Experiment 2 (ms).
The interaction between semantic transparency and word structure was significant (χ2(4) = 19.16,
In the L2 group, the main effect of semantic transparency was not significant (χ2(1) = .12,
The interaction between priming type and word structure was significant (χ2(12) = 38.4,
The interaction between priming type, semantic transparency, and word structure was significant (χ2(12) = 32.24,
Discussion
The Representation of Chinese NS and L2 Learners’ Compound Words
The experimental logic of the repetition priming paradigm is as follows: if both the morpheme prime and the whole-word prime produce a priming effect relative to the symbol prime baseline, then both the morpheme and the whole word may be independent units of representation; if the whole word produces a priming effect but the morpheme does not, then the whole word, and not the morpheme, is the independent unit of representation. The results showed that when NSs processed high-frequency compound words, the effects of initial morpheme priming and whole-word priming were significant, and responses after whole-word primes were faster than those after initial morpheme and final morpheme primes. This indicates that both morpheme and whole-word units are involved in the recognition of high-frequency words and that the processing of high-frequency compound words follows a hybrid model. This model has been supported by many scholars (Ding & Peng, 2006; Zhou & Marslen-Wilson, 1995). Zhou and Marslen-Wilson (1995) adopted the auditory priming task, while Ding and Peng (2006) used the semantic priming paradigm with reversible words as experimental material. We provide new evidence for the hybrid model using the repetition priming paradigm.
However, this conclusion is inconsistent with the repetition priming experiment conducted by Peng et al. (1994). They found that morpheme priming and whole-word priming had equally significant priming effects, that is, a full priming pattern. They also observed left-to-right decomposition in word processing, which suggested that Chinese compound words are represented by morpheme decomposition. The discrepancy may stem from differences in SOA (stimulus onset asynchrony). Peng et al. (1994) set the SOA at 240 ms. Thus, the effect of the prime on target recognition may have arisen in the early stages of processing, where morpheme and whole-word priming effects may not differ significantly. In our experiment, however, the SOA was 700 ms, so the recognition time may have been sufficient for the NSs to reach the late recognition stage. Gao et al. (2022) also adopted the repetition priming paradigm and found no difference between morpheme and symbol priming in NSs. They attributed this to the use of whole-word representation for high-frequency and highly familiar materials. However, our work found a significant morpheme priming effect in the analysis of high-frequency words. The reason for this difference may be that they did not include the final morpheme as a prime, which may have led to sequential effects among the participants that counteracted the priming effect of the initial morpheme. In addition, without final morpheme priming, it is impossible to observe whether there is a morpheme position effect.
The RT data on high-frequency words in Chinese L2 learners showed significant morpheme and whole-word priming effects, and the whole-word priming effect was greater than the final morpheme priming effect. This result indicated the co-existence of morpheme and whole-word units, supporting the hybrid model. It differs from previous studies on compound word processing in English L2 (Cheng et al., 2011; Li et al., 2017; Uygun & Gürel, 2017; M. Wang et al., 2010), which suggested that English L2 compound words are represented in a decomposed manner. The discrepancy may stem from differences between Chinese and English. In English, even if a word contains multiple morphemes, the compound still forms a single visual unit, so it must be decomposed into its morphemes. Chinese compound words, however, consist visually of two separate characters, and word formation is more complex than in English.
This study also differed from the conclusions of Gao et al. (2022). They found significant initial morpheme and whole-word priming effects in high-frequency compound words, with whole-word priming faster than initial morpheme priming, which suggested that advanced Chinese L2 learners may rely on whole-word representation. Nevertheless, this statement is open to question. Our experiments found significant initial morpheme and whole-word priming but no difference between them, suggesting the coexistence of whole-word and morpheme units. Furthermore, there was no final morpheme priming in the Gao et al. (2022) experiment, so it was impossible to observe whether there was a morpheme position effect. Finally, in teaching Chinese L2 vocabulary, the morpheme teaching method can consolidate learned vocabulary and expand the range of new words. This method is often used at intermediate and advanced levels to help students improve their learning efficiency, and some studies have demonstrated its effectiveness (Y. Zhang & Zhang, 2021). Therefore, when advanced Chinese L2 learners recognize compound words, the morpheme still plays a role, and the representation remains hybrid.
In summary, the hybrid model is supported for compound word processing by both Chinese NSs and L2 learners, but there are slight differences between them. Chinese NSs rely more on whole-word units, while decomposed representation of compound words is present in L2 learners. Differences between Chinese and Indonesian may cause this. Although compounding is also used in Indonesian, its orthographic system (Muljani et al., 1998) and internal structure (Ji, 2019) differ from those of Chinese. The orthographic system of Indonesian is more similar to that of English, and its compound words, which take three forms (hutan rimba, jungle; jual-beli, transaction; siapsiaga, ready), differ more from Chinese. Through a lexical decision task, Muljani et al. (1998) found that Indonesian learners of English recognized words more readily than Chinese learners of English. This suggests that similarity between L1 and L2 can facilitate word processing in L2. Given that Li et al. (2017) demonstrated that native English speakers use the sublexical morpho-orthographic decomposition model for compound word processing, morphological decomposition may also operate in Indonesian. Therefore, the morphologically decomposed units present in Indonesian Chinese learners’ compound word processing may come from the influence of their native language, which further suggests that native language processing strategies can be transferred to a second language.
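The baseline comparison underlying this logic can be sketched as a simple subtraction of each prime condition's mean RT from the symbol-prime mean RT. The function name `priming_effects` and the mean RTs below are invented for illustration; they are not values from Table 3, and in the study itself the effects were evaluated with the mixed-effects model rather than from raw differences.

```python
def priming_effects(mean_rt, baseline="symbol"):
    """Return baseline RT minus each prime condition's RT (ms).

    A positive value means responses were faster than after the symbol prime.
    """
    base = mean_rt[baseline]
    return {
        condition: base - rt
        for condition, rt in mean_rt.items()
        if condition != baseline
    }


# Invented mean RTs (ms) for one hypothetical group and frequency level.
example_means = {
    "initial morpheme": 560,
    "final morpheme": 598,
    "whole word": 520,
    "symbol": 600,
}
effects = priming_effects(example_means)
```

In this invented pattern, facilitation is large after the whole-word prime, smaller after the initial morpheme prime, and negligible after the final morpheme prime. A pattern of this shape, if confirmed statistically, would be read as evidence that whole-word and initial-morpheme units are both involved.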
Word Frequency’s Influence on NSs’ and L2 Learners’ Compound Word Processing
Our work observed no word frequency effect in the NS group, which differs from previous studies (C. Wang & Peng, 1999; Yan et al., 2006). A possible reason is that the materials were drawn from elementary and intermediate vocabulary. These words are all familiar to NSs, so high-frequency and low-frequency words may not differ in recognition speed. However, the RT data showed a significant interaction between word frequency and priming type, indicating that the role of the representational units differed when native Chinese speakers recognized high- and low-frequency words. When participants identified high-frequency words, the whole-word unit played the largest role and the initial morpheme played a smaller role, whereas the final morpheme played no role. When participants identified low-frequency words, only the whole word played a role. This result is inconsistent with Yan et al. (2006), who found that high-frequency words tend to be processed as whole-word units and low-frequency words tend to be decomposed. Connectionist theory can explain this result. It holds that the more often a word appears, the stronger the connection between the word and its morphemes (C. Wang & Peng, 1999). The connection between morphemes and words is therefore stronger for high-frequency words and weaker for low-frequency words, so the morpheme priming effect is present in high-frequency words but not in low-frequency words. Why, then, is there only an initial morpheme effect in high-frequency words? This is related to the stimulus duration: previous studies have found that processing time affects compound word recognition (Peng, Ding, et al., 1999; J. Zhang, 2011). Considering the proficiency of the L2 learners, we set the stimulus duration to 500 ms, by which point the NSs have begun elaborate processing to confirm target words.
During this stage, the representation that matches the stimulus input is further activated, and the mismatched representation is suppressed. The whole-word and initial morpheme primes match the representation of the target word, so the whole-word and initial morpheme priming effects appear. In contrast, the final morpheme prime does not match the morpheme position information, so the activation of its representation unit is suppressed. Therefore, morpheme position information is vital in NSs’ compound word identification.
In the L2 group, the RT of high-frequency words was significantly shorter than that of low-frequency words, which further supported the existence of the word frequency effect (Lemhöfer et al., 2008). By analyzing the interaction between word frequency and priming type, we also found that the representations used by Chinese L2 learners differ with word frequency. In high-frequency word recognition, both the morphemes and the whole word play an important role, whereas in low-frequency word recognition only the whole word plays a significant role. Because this pattern is similar to the NSs’ results, the same theory can be applied to explain it. In contrast to the NSs, the L2 learners showed robust initial and final morpheme priming effects in high-frequency words, with no significant difference between the two, whereas the NS group showed only the initial morpheme priming effect. These results suggest that at 500 ms, the NSs had completed word recognition, and any representation that did not match the target word was suppressed. The L2 learners, by contrast, had not reached the later stages of processing: the morpheme position information had not been fully processed, and the mismatched information had not yet affected compound word recognition. Therefore, the L2 learners’ word processing has not reached the NSs’ level, and morpheme position information cannot affect their compound word processing.
Comparing the data from the NSs and L2 learners, we found that with a relatively long stimulus duration, the NSs showed no difference in processing speed between high-frequency and low-frequency words, whereas the L2 learners showed a significant difference. This shows that, given the same experimental tasks and materials, Chinese L2 learners may have a stronger word frequency effect than NSs. A similar phenomenon has been found in English L2 learners (de Groot et al., 2002; Lemhöfer et al., 2008; Van Wijnendaele & Brysbaert, 2002). Furthermore, word frequency affects the NSs’ and L2 learners’ compound word recognition differently. When native speakers recognize high-frequency words, it is primarily the whole word and the initial morpheme that play a role, while the whole word and both the initial and final morphemes play a role for L2 learners. This difference may be related to whether word recognition has reached the later stages of processing. The L2 learners cannot process the morpheme position information and have not reached the NSs’ level. When recognizing low-frequency words, only the whole word is involved in both participant groups, which differs from the conclusions of previous studies but can be better explained by connectionist theory.
Semantic Transparency’s Influence on NSs’ and L2 Learners’ Compound Word Processing
The semantic transparency effect was significant in the NS group: they recognized transparent words faster than opaque ones. This confirmed the existence of the semantic transparency effect in compound word processing (C. Wang & Peng, 1999). In addition, the interaction between semantic transparency and word structure suggested that semantic transparency may also play a role in the NSs’ word structure awareness, affecting the processing time of different structural types. Under initial morpheme priming, processing time varied across the structural types of transparent words, but there was no significant difference in speed among opaque words. This is the first study to find an interaction between word structure and semantic transparency.
The data from L2 learners showed no significant main effect of semantic transparency. At first glance, the finding that semantic transparency does not affect L2 learners’ recognition speed seems similar to the conclusions of Cheng et al. (2011), but this is not the case. There was a significant interaction between semantic transparency, priming type, and word structure, which suggested that semantic transparency may have influenced Chinese L2 learners’ compound word recognition. In transparent words, all three priming effects appeared in modifying and verb-object words. In opaque words, however, there was no initial morpheme priming effect in modifying words and no final morpheme priming effect in verb-object words. This result differs from the conclusions of Cheng et al. (2011) and Li et al. (2017), who found that English L2 learners use decomposition strategies when recognizing both transparent and opaque words. The reason may lie in the differences between the two languages. Morphemes in Chinese compound words have more robust ideographic functions than those in English, so semantic information could be involved in word processing earlier. Moreover, semantic transparency also affects word structure awareness in L2 compound word processing: under the same priming type, the RTs for compound words of different structures differed between transparent and opaque words.
There are similarities and differences in the effects of semantic transparency on compound word processing mechanisms in NSs and L2 learners. First, semantic transparency affects word structure awareness in both groups, and the speed order of recognizing compound words with different structures may be affected by semantic transparency. Under initial morpheme priming, the NSs recognized transparent modifying, coordinative, and verb-object words faster than transparent verb-complement and subject-predicate words, whereas there was no significant difference between compound word types among opaque words. Among transparent words, the L2 learners recognized the modifying, coordinative, and verb-object types faster than the verb-complement type; among opaque words, they recognized the coordinative type faster than the subject-predicate type. In both NSs and L2 learners, verb-complement word recognition was slow.
Second, semantic transparency affects the representation of compound words in L2 learners. When L2 learners recognize modifying words, a transparent word has a hybrid representation, whereas an opaque word is represented as a whole word. However, the NSs’ data did not show a significant interaction between semantic transparency and priming type.
Word Structure’s Influence on Compound Word Processing for NSs and L2 Learners
The results suggested that word structure affects NSs’ processing. This provides new evidence for the view that word-formation awareness is activated in word recognition (J. Zhang, 2011). Word structure also affected the recognition speed of compound words: modifying and coordinative words were recognized faster than the verb-complement type, and there was no significant difference between the other structures. Therefore, for NSs, modifying and coordinative words can be activated more quickly and easily. According to corpus research, modifying words account for the largest proportion of all compound words, followed in descending order by coordinative, verb-object, subject-predicate, and verb-complement words (Shen, 1998). Thus, the differences in recognition speed between structures may be related to how frequently each structure occurs. The recognition speed of modifying and coordinative words is consistent with this explanation, and the remaining structures may show no difference in recognition speed because of the limited number of materials. This needs further research.
Word structure also influences L2 compound word processing. As with the NSs, recognition time is influenced by word formation, with modifying, coordinative, and verb-object words recognized faster than verb-complement words. Additionally, the modifying type was faster than the subject-predicate type. Compared with the NSs’ results, the results of L2 learners are closer to the proportions of different types of compound words in the Shen (1998) corpus study. Moreover, the awareness of word formation also affects the representation of L2 compound words. Analysis of the interaction of priming type, semantic transparency, and word structure showed that initial and final morpheme activations were the same in opaque coordinative words. Feng (2003) used a masked priming experiment to study intermediate Chinese L2 learners and found that initial and final morphemes were equally activated. Our work studied advanced Chinese L2 learners via the repetition priming paradigm and provides new evidence for this conclusion. However, Hao and Li (2015) conducted an LDT study of elementary L2 learners and did not find equal activation of the initial and final morphemes in coordinative words. We believe that the discrepancy stems from differences in the L2 learners’ Chinese proficiency: elementary L2 learners’ word structure awareness may still be in its infancy, and their recognition of coordinative words is affected by sequential processing (Hyönä et al., 2004). With improved Chinese proficiency, however, their word structure awareness will continue to increase and change toward the NSs’ processing mode. In addition, regardless of whether semantic transparency is high or low, both morpheme and whole-word representations are present in coordinative and subject-predicate words. By contrast, modifying, verb-object, and verb-complement words show different priming effects depending on semantic transparency.
In summary, word structure influences compound word processing in both NSs and L2 learners, but the specific manifestations differ between the two groups. This experiment confirmed the psychological reality of word structure awareness in NSs and L2 learners. Combined with the studies of elementary L2 learners (Hao & Li, 2015) and intermediate L2 learners (Feng, 2003), our findings indicate that Chinese L2 word structure awareness develops gradually from weak to strong. Comparing the NSs’ and L2 learners’ data also indicated that the whole word and morpheme co-exist when L2 learners process coordinative and subject-predicate words, whereas the NSs mostly rely on the whole word. This also shows that advanced L2 learners’ word structure awareness has not reached the level of NSs. Furthermore, the two participant groups had similar processing time trends for the five different structures. There was no significant difference between the modifying, coordinative, and verb-object types, and the verb-complement type was the most difficult to process. Nevertheless, the NSs and L2 learners may adopt different strategies when processing words of the same structure. For example, L2 learners recognized subject-predicate words through hybrid representations, whereas NSs used the whole-word processing strategy. This difference may have been influenced by the Indonesian participants’ native language, as Ji (2019) pointed out the absence of subject-predicate and descriptive-complement forms in Indonesian compound words. The absence of these forms in the native language led to difficulties in L2 processing, while similarity between native and second language constructions facilitated the recognition of L2 compound words.
This study is not free from limitations. First, the participants in the L2 group were Indonesian learners, who belong to the Non-Chinese Culture Circle. The findings may therefore generalize only to learners whose native languages are alphabetic and have similar orthography and morphology, while learners belonging to the Chinese Culture Circle (e.g., Japanese and Korean learners) may differ in their recognition of compound words due to the influence of their native language. Future studies should therefore include learners from the Chinese Culture Circle as participants. The experimental results of Chinese native speakers, learners from the Chinese Culture Circle, and learners from the Non-Chinese Culture Circle can then be compared to obtain more comprehensive conclusions.
Second, there may be a Hawthorne effect because the participants knew they were taking part in an experiment. To limit this, the study used a lexical decision task under the priming paradigm, in which participants were not aware of the purpose of the primes during the experiment, so that all participants started from a comparable state. In further studies, the presentation time of the priming and masking items can be reduced to decrease the visibility of the priming stimuli, thus keeping participants unaware of the experimental manipulation.
Conclusion and Pedagogical Implication
The two experiments above provide answers to the research questions.
The morpheme and the whole word co-exist in compound word representation, and both the NSs and the L2 learners show a hybrid model. Under various conditions, L2 learners’ RT was longer than that of NSs, indicating that the advanced L2 learners still could not reach the NSs’ processing level.
Word frequency affects compound word processing in both NSs and L2 learners. When native speakers recognize high-frequency words, the whole-word unit is crucial; because of the mismatch in morpheme position information, the final morpheme effect is suppressed while the initial morpheme is activated. In L2 learners, by contrast, both the initial and final morphemes are activated, and the whole word is predominant. This result also shows that morpheme position information influences NSs, but advanced L2 learners cannot process it.
Semantic transparency affects compound word recognition in both groups. However, NSs recognize transparent and opaque words at significantly different speeds, whereas L2 learners show no such difference.
Word structure also influences the compound word processing of both NSs and L2 learners. The two groups of participants had similar processing time trends for the five different structures, and the verb-complement type was the most difficult to process. The two groups also differ in how words of different structures are represented. In addition, this is the first study to find an interaction between word structure and semantic transparency and to conclude that semantic transparency affects lexical structure awareness.
The first language can impact word recognition when processing the second language, with orthographic and morphological similarities facilitating recognition of second language words and differences creating recognition difficulties.
This study found that word frequency was a vital factor influencing Chinese L2 learners, implying that the more frequently words appear, the faster learners process them. This calls for textbooks to present words regularly and systematically, and for vocabulary teaching in the classroom to guide students through appropriate and systematic vocabulary practice. Most textbooks today are organized around texts and do not yet pay attention to the recurrence of words. Our experiments have shown that sublexical factors, such as initial and final morphemes, word structure, and semantic transparency, affect the processing of compound words by advanced learners from the Non-Chinese Culture Circle. This suggests that learners have accumulated knowledge of the structural and semantic relationships among the morphemes of compound words, providing experimental support for the morpheme teaching method. Textbooks and classroom vocabulary teaching can consciously use morphemes with strong word-formation ability, guide students to build awareness of morpheme-based word formation, encourage them to guess the meaning of new words during input learning, and then expand vocabulary through extensive reading. In addition, our work found that first language compound word processing affected second language word recognition, suggesting that teachers can use the similarities between the first and second languages to help students understand and master the second language. However, second language teaching should not rely exclusively on first language transfer. It should pay attention to linguistic differences and consider the socio-cultural environment of the target language (Kolodkina & Tan, 2008) to avoid negative first language transfer.
Author Notes
This research was conducted while [Chenxi Wu] was at [College of Chinese Language and Culture, Huaqiao University]. He is now at [University International College, Macau University of Science and Technology] and may be contacted at [
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by 2022 International Chinese Education Research Topic Key Project (22YH37B).
Data Availability Statement
The data supporting the findings of this study are available upon request from the corresponding author.
