Abstract
Aims and objectives:
The study provides new insights into how bilingual speakers process semantically complex novel meanings in their native (L1) and non-native language (L2).
Methodology:
The study employs an EEG method with a semantic decision task to novel nominal metaphors, novel similes, as well as literal and anomalous sentences presented in participants’ L1 and L2.
Data and analysis:
In total, 29 native speakers of Polish (L1) who were highly proficient in English (L2) took part in the study. The collected EEG signal was analyzed using the event-related potential (ERP) technique. The statistical analyses were based on behavioral data (reaction times and accuracy rates) as well as mean amplitudes for the four conditions in the two languages within the N400 and LPC time windows.
Findings:
The results revealed an N400 effect of utterance type modulated by language nativeness, where the brainwaves for anomalous sentences, novel nominal metaphors, and novel similes converged in L2, while in L1 a graded effect was observed from anomalous sentences to novel nominal metaphors, novel similes and literal sentences. In contrast, within the late time window, a more pronounced sustained negativity to novel nominal metaphors than novel similes was observed in both languages, thus indicating that meaning integration mechanisms might be of similar automaticity in L1 and L2 when bilingual speakers are highly proficient in their L2. Altogether, the present results point to more taxing mechanisms involved in lexico-semantic access in L2 than L1, yet such increased effort seems to be resolved within the meaning integration phase.
Originality:
The findings present novel insights into how bilinguals construct new unfamiliar meanings and show how and when cognitive mechanisms engaged in this process are modulated by language nativeness.
Significance:
The study might provide crucial implications for further research on bilingual semantic processing as well as human creativity.
Introduction
The role of semantic complexity in bilingual language processing can be well tapped into by studying novel metaphors, which are highly creative and rarely used in everyday language, and which can therefore be clearly contrasted with semantically simple meanings (i.e. literal utterances). Metaphors are defined as utterances whose meaning can be accessed as a result of cross-domain mappings, which involve recognizing common features of two presumably distinct concepts (De Grauwe et al., 2010; Gibbs & Colston, 2012). Importantly, the strength of the links that are formed as a result of such mappings is modulated by their frequency (Bowdle & Gentner, 2005). Consequently, metaphors with highly frequent connections between the two domains become conventionalized and lexicalized, resulting in a high familiarity of such utterances (i.e. conventional metaphors). In contrast, less familiar, more creative and poetic meanings (i.e. novel metaphors) require additional referential processes, as a result of which novel metaphor comprehension is more resource-intensive as well as more time-consuming, as it involves the process of meaning creation (Gibbs & Colston, 2006). It has been, however, postulated that, at least in the native language, novel metaphor processing involves the process of comparison and, therefore, novel metaphors are easier and faster to comprehend when presented as novel similes (A is like B) compared with novel nominal metaphors (A is B; Bowdle & Gentner, 2005). The present study aims to test this assumption in the context of both the native (L1) and the non-native language (L2), so as to show whether the processing of semantically complex meanings (novel similes and novel nominal metaphors) is dependent on language nativeness.
Specific mechanisms engaged in forming the links between the two metaphor domains when processing novel metaphors in L1 were thoroughly discussed in the Career of Metaphor Model (Bowdle & Gentner, 2005). The authors postulate that the vehicles of novel metaphors (e.g. Amnesia is a
Previous event-related potential (ERP) studies on metaphor processing in the monolingual context have repeatedly reported two language-related ERP components: the N400 and the late positive complex (LPC). The N400 is a negative-going brainwave that peaks in amplitude at around 400 ms after stimulus onset, is usually observed over centro-parietal electrode sites, and reflects the amount of information that needs to be retrieved from long-term memory (Kotz et al., 2012; Kutas & Federmeier, 2000, 2011). In metaphor research, a graded N400 effect is frequently observed, with the most pronounced N400 amplitudes for novel metaphors, followed by conventional metaphors, and finally literal utterances (e.g. Arzouan et al., 2007; Coulson & Van Petten, 2002; Lai et al., 2009). Such an effect is indicative of the more extended mechanisms involved in metaphoric mapping constructions, defined as the process of forming relational correspondences between different concepts (Arzouan et al., 2007; Coulson & Van Petten, 2002). The LPC, on the other hand, is a positive-going wave peaking in amplitude at around 500–900 ms post stimulus onset (Friedman & Johnson, 2000) and indexing processes engaged in meaning re-analyses or additional working memory load (e.g. Brouwer et al., 2012; Regel et al., 2011; Spotorno et al., 2013). Interestingly, within the LPC time frame, previous research on metaphor comprehension has also revealed sustained negativity in response to novel metaphoric meanings (Goldstein et al., 2012; Rataj et al., 2018b; Tang et al., 2017a). Such a sustained negativity for novel metaphors relative to literal utterances might reflect the continuing difficulty of novel metaphoric meaning integration and/or access to the non-literal route when comprehending metaphoric meanings (Jankowiak, 2019).
Notwithstanding the aforementioned studies into metaphor comprehension, previous research into novel simile and novel nominal metaphor processing has been limited to the monolingual context, and it thus remains under-investigated whether comparison mechanisms facilitate novel metaphor comprehension also in the non-native tongue.
It seems crucial to note, however, that in addition to providing insights into specific mechanisms engaged in novel metaphor processing in bilingualism, studies on metaphor comprehension offer a new and valuable perspective on the interaction between semantics and bilingualism. To date, bilingualism research has pointed to potentially more developed executive control functions in bilingual relative to monolingual populations (e.g. Adesope et al., 2010; Bialystok et al., 2009; López Zunini et al., 2019; Timmer et al., 2017; but see Kousaie et al., 2014; van den Noort et al., 2019), which indicates their better ability to direct and organize problem-solving mechanisms (Lezak et al., 2004). Possibly in line with and resulting from this bilingual advantage, neuroimaging studies have shown distinct patterns of brain activation when processing L1 compared with L2, with the non-native language recruiting brain regions that exceed the L1 semantic network and are related to executive control mechanisms (see Sulpizio et al., 2020 for a meta-analysis of functional neuroimaging studies on bilingual language processing). At the same time, however, Sulpizio et al. (2020) showed that at the level of lexico-semantic processing, it is the native language that recruits larger cortical and subcortical regions, which might thus indicate that L1 has access to a richer lexico-semantic system.
So far, a great majority of psycholinguistic studies into bilingual lexico-semantic processing have been devoted to investigating the effects of semantic priming or the comprehension of semantically congruous compared with incongruous stimuli. Previous studies have, however, rarely been devoted to investigating bilingual language processing at different levels of semantic complexity, which can be examined by means of studying novel metaphoric language comprehension. Behavioral research on other types of metaphoric expressions has suggested that metaphor comprehension lags behind in L2 compared with L1 (Heredia & Cieślicka, 2016; Littlemore et al., 2011; Vaid et al., 2015). However, experiments conducted thus far have rarely focused on the comprehension of semantically complex meanings (i.e. novel metaphors), have hardly ever employed electrophysiological methods, and have not been devoted to investigating the role of comparison mechanisms when processing novel metaphoric meanings in bilingualism (Rataj, 2020).
While some studies have tested how bilingual speakers comprehend conventional metaphoric meanings (e.g. Citron et al., 2020; Mashal et al., 2015; Su et al., 2019), to the best of our knowledge, thus far only one ERP study has examined novel and conventional metaphor comprehension in the context of bilingualism. Jankowiak et al. (2017) tested novel and conventional metaphoric word dyads, and observed attenuated LPC amplitudes for conventional metaphors in L2 but not in L1, thus suggesting a decreased sensitivity to the levels of metaphor conventionality in the non-native tongue, as L2 conventional metaphors and novel metaphors might be similarly cognitively taxing. Importantly, the authors used only word pairs with minimal contextual cues (e.g. to smell excuses), and did not examine whether a linguistic form in which metaphors are presented (i.e. a nominal metaphor vs. a simile) could modulate bilingual metaphor comprehension.
The present study aims to provide new insights into the comprehension of semantically complex meanings in bilingualism by examining the processing of novel metaphors presented as nominal metaphors and as novel similes, whose comprehension is hypothesized to require a high degree of cognitive effort engaged in meaning creation processes even in the native tongue. Furthermore, the study aims to show whether comparison mechanisms could facilitate novel metaphor comprehension in both languages, and thus to provide an in-depth examination of how semantic information is accessed from the semantic memory network when processing the native and non-native language.
Method
Participants
The original sample included 31 participants, but 2 were excluded from the analyses because their accuracy rates on literal and/or anomalous trials were below 50%. This resulted in a final sample of 29 participants (18 women; mean age = 27.18, SD = 2.97), who were MA students or graduates of the Faculty of English (Adam Mickiewicz University, Poznań). Participants received a gift card of PLN 200 for taking part in the experiment. An online Handedness Questionnaire (Cohen, 2008) based on the Edinburgh Inventory (Oldfield, 1971) indicated the right-hand preference of participants (mean right-hand preference = 77.86%, SD = 48.1). Participants were all native speakers of Polish (L1), and were late proficient unbalanced learners of English as their second language (L2), as they had acquired their L2 after their L1 (mean age of L2 acquisition = 8.82, SD = 2.59). They were highly proficient in English, as confirmed by the LexTale (Lemhöfer & Broersma, 2012) results (mean LexTale score = 89.55%, SD = 6.4). All participants had normal or corrected-to-normal vision, and none had any language or neurological disorder.
Materials
Materials used in the study were adopted from a database by Jankowiak (2020) that provides a set of 120 novel nominal metaphors (e.g. Love is a monastery), 120 novel similes (e.g. Love is like a monastery), 120 literal sentences (e.g. This monument is a monastery), and 120 anomalous utterances (e.g. A carpet is a monastery) in both Polish and English (see Table 1). All of the stimuli were highly controlled for and matched on their respective level of meaningfulness, familiarity, metaphoricity, and cloze probability by means of conducting a series of normative studies on native populations of the respective language. Additionally, critical (sentence-final) words were controlled for in terms of frequency (SUBTLEX-PL; Mandera et al., 2015; and SUBTLEX-UK; van Heuven et al., 2014), concreteness, number of letters (M = 6.57, SD = 1.45) and syllables (M = 2.34, SD = .48), as well as cognate status.
Examples of the experimental stimuli (after Jankowiak, 2020).
In the experiment proper, the stimuli were divided into 3 blocks in each language, each consisting of 17 novel nominal metaphors, 17 novel similes, 17 literal sentences, and 17 anomalous utterances. Additionally, 17 meaningful and 51 meaningless filler sentences were added to each block, as a result of which participants were presented with 136 sentences in each block, and 816 sentences (408 experimental and 408 filler trials) in the whole experiment, 50% of which were in Polish (L1), and 50% in English (L2). The presentation of each block was randomized and counterbalanced. Participants were not presented with nominal metaphors and similes sharing the same metaphor source and target domain in order to avoid a potential priming effect.
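The trial counts given above can be verified with a short calculation. The sketch below only illustrates the arithmetic; the variable names are ours, and the split into expected "meaningful" and "meaningless" responses assumes that novel metaphors and novel similes count as meaningful.

```python
# Trial-count check for the design described above.
# Variable names are illustrative, not taken from the study materials.

BLOCKS_PER_LANGUAGE = 3
LANGUAGES = ("Polish (L1)", "English (L2)")

# Experimental sentences per block, by condition.
experimental_per_block = {
    "novel nominal metaphor": 17,
    "novel simile": 17,
    "literal": 17,
    "anomalous": 17,
}
# Filler sentences per block.
fillers_per_block = {"meaningful filler": 17, "meaningless filler": 51}

experimental_block_total = sum(experimental_per_block.values())
filler_block_total = sum(fillers_per_block.values())
sentences_per_block = experimental_block_total + filler_block_total

total_blocks = BLOCKS_PER_LANGUAGE * len(LANGUAGES)
total_experimental = experimental_block_total * total_blocks
total_fillers = filler_block_total * total_blocks
total_sentences = sentences_per_block * total_blocks

# Assumption: metaphors, similes, literal sentences and meaningful fillers
# call for a "meaningful" response; anomalous sentences and meaningless
# fillers call for a "meaningless" response.
expected_meaningful = 3 * 17 + fillers_per_block["meaningful filler"]
expected_meaningless = 17 + fillers_per_block["meaningless filler"]

print(sentences_per_block, total_experimental, total_fillers, total_sentences)
# prints: 136 408 408 816
print(expected_meaningful, expected_meaningless)
# prints: 68 68
```

Under that assumption, the 1:3 ratio of meaningful to meaningless fillers balances the two expected response types within each block.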
Procedure
The procedures applied in the experiment were in accordance with the ethical guidelines for research with human participants, and were approved by the Adam Mickiewicz University Human Research Ethics Committee. Participants were informed about the procedures of the experiment and were asked to sign the informed consent form before the experiment began.
The experiment was conducted at the Neuroscience of Language Laboratory (Faculty of English, Adam Mickiewicz University, Poznań). Participants were seated in a dim and quiet testing cabin, 70 cm from the computer screen. The experiment was programmed and run in the Presentation software (Neurobehavioral Systems, Inc., Berkeley, CA). During the experiment, the sentences were randomly presented on a computer screen using black letters, and were centered on a gray background. The time sequence of stimuli presentation is provided in Figure 1.

Time sequence of stimuli presentation.
Participants decided whether the sentence presented on a computer screen was meaningful or meaningless by pressing a corresponding key, whose designation was counterbalanced. Prior to the experimental blocks, participants completed a practice block with 20 sentences not included in the experimental trials, in order to practice the task.
EEG signal acquisition and data preprocessing
EEG signals were recorded from 64 active actiCAP slim electrodes (Brain Products GmbH, Gilching, Germany): FP1, FP2, F7, F3, Fz, F4, F8, FC5, FC1, FC2, FC6, T7, C3, C2, C4, T8, TP9, CP5, CP1, CP2, CP6, TP10, P7, P3, Pz, P4, P8, PO9, O1, Oz, O2, PO10, AF7, AF3, AF4, AF8, F5, F1, F2, F6, FPz, FT7, FC3, FC4, FT8, FCz, C5, C1, Cz, C6, TP7, CP3, CPz, CP4, TP8, P5, P1, P2, P6, PO7, PO3, POz, PO4, PO8, placed at the standard extended 10/20 positions with the ground electrode placed at FPz. To monitor vertical eye movements, bipolar electrodes were placed above and below the left eye (vEOG). For horizontal eye movements, bipolar electrodes were placed next to the outer rims of the eyes (hEOG). The EEG signal was amplified by the BrainVision actiCHamp amplifier (Brain Products), referenced to the TP10 channel, and sampled at 500 Hz per channel. Data were stored on a computer for offline analyses. Impedances were kept below 10 kΩ for each electrode. ERPs were time-locked to the onset of the final word of a sentence.
The electrodes selected for the statistical analyses included: FC1, FC3, FCz, FC2, FC4, C1, C3, Cz, C2, C4, CP1, CP3, CPz, CP2, CP4, P1, P3, Pz, P2, P4. Offline data analyses were conducted using BrainVision Analyzer 2.1 software (Brain Products). The continuous EEG data were re-referenced to the common average reference (Luck, 2014; Nunez & Srinivasan, 2006), segmented from 200 ms before stimulus onset to 2000 ms afterward, and filtered offline (Butterworth zero phase filters) with a high-pass filter set at 0.1 Hz (slope 24 dB/octave) and a low-pass filter set at 30 Hz (slope 24 dB/octave). Next, data were baseline-corrected relative to the 100 ms preceding stimulus onset (−100 to 0 ms), and edited for artifacts (rejecting trials with zero lines, voltage differences higher than 150 µV, or voltage steps higher than 50 µV). Ocular artifacts were corrected by the Gratton & Coles method.
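The baseline correction and amplitude-based rejection criteria described above can be illustrated for a single-channel epoch. This is a minimal sketch and not the BrainVision Analyzer implementation: the function names and list-based data layout are hypothetical, a "zero line" is taken to mean an entirely flat epoch, and a "voltage step" is taken to mean the difference between adjacent samples.

```python
# Sketch of baseline correction and artifact flagging for one epoch.
# The original pipeline used BrainVision Analyzer; names here are illustrative.

SAMPLING_RATE_HZ = 500      # sampling rate from the recording description
EPOCH_START_MS = -200       # segments ran from -200 ms to 2000 ms
BASELINE_MS = (-100, 0)     # baseline interval before stimulus onset
MAX_DIFFERENCE_UV = 150.0   # largest allowed max-minus-min voltage
MAX_STEP_UV = 50.0          # largest allowed jump between adjacent samples (assumed)


def ms_to_sample(time_ms):
    """Convert a latency relative to stimulus onset into a sample index."""
    return round((time_ms - EPOCH_START_MS) * SAMPLING_RATE_HZ / 1000)


def baseline_correct(epoch):
    """Subtract the mean baseline voltage from every sample of the epoch."""
    start, stop = ms_to_sample(BASELINE_MS[0]), ms_to_sample(BASELINE_MS[1])
    baseline_mean = sum(epoch[start:stop]) / (stop - start)
    return [value - baseline_mean for value in epoch]


def is_artifact(epoch):
    """Flag flat epochs, excessive voltage ranges, and excessive steps."""
    voltage_range = max(epoch) - min(epoch)
    largest_step = max(abs(b - a) for a, b in zip(epoch, epoch[1:]))
    return (
        voltage_range == 0
        or voltage_range > MAX_DIFFERENCE_UV
        or largest_step > MAX_STEP_UV
    )


# Toy epoch (made-up values in microvolts, not study data).
n_samples = ms_to_sample(2000)
toy_epoch = [2.0 + (i % 2) for i in range(n_samples)]
corrected = baseline_correct(toy_epoch)
print(n_samples, is_artifact(corrected))
```

With a 500 Hz sampling rate, the −200 to 2000 ms segment holds 1100 samples, and the baseline interval corresponds to sample indices 50 to 99.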
Overview of statistical analyses
Both accuracy ratings and reaction times were analyzed using repeated measures ANOVAs, with 4 sentence type (novel nominal metaphor vs. novel simile vs. literal sentence vs. anomalous sentence) × 2 language (Polish–L1 vs. English–L2) as within-subject factors. For ERP analyses, mean amplitudes from 20 electrodes for each condition were selected. Along the anterior-posterior axis, the following electrodes were chosen: FC3, FC1, FCz, FC2, FC4 (fronto-central), C3, C1, Cz, C2, C4 (central), CP3, CP1, CPz, CP2, CP4 (centro-parietal), P3, P1, Pz, P2, P4 (parietal). Along the left-right axis, the following electrodes were selected: FC3, C3, CP3, P3 (left), FC1, C1, CP1, P1 (left medial), FCz, Cz, CPz, Pz (midline), FC4, C4, CP4, P4 (right), FC2, C2, CP2, P2 (right medial).
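The electrode groupings above form a 4 × 5 grid, which can be written down explicitly together with a helper that averages a waveform inside a latency window. The sketch below is illustrative only: the names are hypothetical, the epoch timing (500 Hz, starting at −200 ms) is taken from the preprocessing description, and the window is treated as excluding its final sample, which is an assumption.

```python
# Illustrative 4 x 5 layout of the analysis electrodes
# (anterior-posterior rows by left-to-right columns).

ROWS = ("fronto-central", "central", "centro-parietal", "parietal")
COLUMNS = ("left", "left medial", "midline", "right medial", "right")

GRID = (
    ("FC3", "FC1", "FCz", "FC2", "FC4"),
    ("C3", "C1", "Cz", "C2", "C4"),
    ("CP3", "CP1", "CPz", "CP2", "CP4"),
    ("P3", "P1", "Pz", "P2", "P4"),
)

# Map each electrode to its (anterior-posterior, laterality) factor levels.
FACTOR_LEVELS = {
    name: (ROWS[r], COLUMNS[c])
    for r, row in enumerate(GRID)
    for c, name in enumerate(row)
}


def mean_amplitude(waveform, window_ms, sampling_rate_hz=500, epoch_start_ms=-200):
    """Average the samples of one waveform that fall inside window_ms."""
    start = round((window_ms[0] - epoch_start_ms) * sampling_rate_hz / 1000)
    stop = round((window_ms[1] - epoch_start_ms) * sampling_rate_hz / 1000)
    segment = waveform[start:stop]
    return sum(segment) / len(segment)


print(FACTOR_LEVELS["FC2"])  # ('fronto-central', 'right medial')
```

For example, `mean_amplitude(waveform, (350, 450))` would average sample indices 275 to 324 of a 1100-sample epoch, which corresponds to the N400 window used in the analyses.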
Mean amplitudes were analyzed using repeated measures ANOVAs with 4 sentence type (novel nominal metaphor vs. novel simile vs. literal sentence vs. anomalous sentence) × 2 language (Polish–L1 vs. English–L2) × 4 anterior-posterior (fronto-central vs. central vs. centro-parietal vs. parietal) × 5 left-midline-right (left vs. left medial vs. midline vs. right medial vs. right) as within-subject factors. The ERP analyses were conducted on all responses. In all analyses, significance values for pairwise comparisons were corrected for multiple comparisons using the Bonferroni correction. When Mauchly’s tests showed that the assumption of sphericity was violated, the Greenhouse-Geisser correction was applied, and the original degrees of freedom were reported with the corrected p value.
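The two corrections mentioned above can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure of the statistical package used: Bonferroni correction is shown as multiplying each p value by the number of comparisons, and the Greenhouse-Geisser correction as scaling both degrees of freedom by the sphericity estimate ε. Function names are hypothetical.

```python
from itertools import combinations

CONDITIONS = ("novel nominal metaphor", "novel simile", "literal", "anomalous")


def bonferroni(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    n_comparisons = len(p_values)
    return [min(1.0, p * n_comparisons) for p in p_values]


def greenhouse_geisser_df(df_effect, df_error, epsilon):
    """Scale both degrees of freedom by the sphericity estimate epsilon."""
    return df_effect * epsilon, df_error * epsilon


# Four sentence types give six pairwise comparisons.
pairs = list(combinations(CONDITIONS, 2))
print(len(pairs))  # 6

# Example with values reported for Polish accuracy rates: F(3, 84), epsilon = .511.
corrected_dfs = greenhouse_geisser_df(3, 84, 0.511)
print(tuple(round(df, 3) for df in corrected_dfs))  # (1.533, 42.924)
```

The corrected p value is obtained by evaluating the same F ratio against the scaled degrees of freedom, which is why the text can report the original degrees of freedom alongside a corrected p.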
Results
Behavioral results
Accuracy rates
Accuracy ratings are reported as the percentage of correct responses in the semantic decision task. The results showed an interaction between language and sentence type, F(3, 84) = 6.97, p < .001, ηp² = .199. Follow-up analyses were carried out for each language separately. A repeated measures ANOVA with sentence type as a factor performed on accuracy rates for Polish utterances showed a main effect of sentence type, F(3, 84) = 69.16, p < .001, ε = .511, ηp² = .712. Pairwise comparisons confirmed that novel nominal metaphors (M = 53.48, SD = 18.01) were rated less accurately than novel similes (M = 69.17, SD = 16.66), as well as than literal (M = 89.18, SD = 8.08), and anomalous sentences (M = 92.1, SD = 6.47), ps < .001. Similarly, novel similes were rated less accurately than both literal and anomalous sentences, ps < .001. There was no statistically significant difference between literal and anomalous sentences, p > .05.
A repeated measures ANOVA with sentence type as a factor performed on accuracy rates for English sentences showed a main effect of sentence type, F(3, 84) = 32.24, p < .001, ε = .546, ηp² = .535. Pairwise comparisons revealed that novel nominal metaphors (M = 61.7, SD = 14.1) differed from novel similes (M = 70.52, SD = 16.12), p = .002, from literal sentences (M = 86.61, SD = 6.65), p < .001, as well as from anomalous sentences (M = 88.61, SD = 11.41), p < .001. Additionally, novel similes differed from literal, p < .001, and from anomalous sentences, p = .002. There was no statistically significant difference between accuracy ratings for literal and anomalous sentences, p > .05.
With regard to between-language differences, post-hoc tests revealed that novel nominal metaphors were more frequently categorized as meaningful in English than in Polish, p = .003. There was no statistically significant difference between accuracy rates for Polish and English novel similes, literal, and anomalous sentences, ps > .05.
In addition to the interaction, a main effect of sentence type was found, F(3, 84) = 60.83, p < .001, ε = .453, ηp² = .685. Pairwise comparisons confirmed that novel nominal metaphors (M = 57.6, SE = 2.8) differed from novel similes (M = 69.8, SE = 2.8), from literal sentences (M = 87.9, SE = 1.2), as well as from anomalous sentences (M = 90.3, SE = 1.5), ps < .001. Furthermore, novel similes differed from literal and anomalous sentences, ps < .001. There was no statistically significant difference between accuracy rates for literal and anomalous sentences, p > .05. Also, there was no main effect of language, p > .05. Mean accuracy rates for each sentence type in each language are provided in Figure 2.

Accuracy rates (%) for Polish (dark gray) and English (light gray) novel nominal metaphors, novel similes, literal, and anomalous sentences.
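The effect sizes reported for the accuracy analyses can be cross-checked against the F ratios, since partial eta squared follows from F and its degrees of freedom as ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies this identity to the values reported above; the function name and the tuple layout are ours.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, reported partial eta squared)
reported = [
    ("language x sentence type", 6.97, 3, 84, 0.199),
    ("sentence type, Polish", 69.16, 3, 84, 0.712),
    ("sentence type, English", 32.24, 3, 84, 0.535),
    ("sentence type, overall", 60.83, 3, 84, 0.685),
]

for label, f_value, df_effect, df_error, eta_reported in reported:
    eta_computed = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: computed {eta_computed:.3f}, reported {eta_reported:.3f}")
```

For all four accuracy effects the recomputed values agree with the reported ones to three decimal places.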
Reaction times
Reaction times (RTs) to correct responses were measured from the onset of the last word of a sentence. For statistical analyses, reaction times were log-transformed. The results showed a main effect of language, F(1, 27) = 14.65, p = .001, ηp² = .352, with longer RTs elicited by English (M = 1024.46 ms, SE = 28.11) than Polish sentences (M = 962.47 ms, SE = 29.43). Moreover, a main effect of sentence type was found, F(3, 81) = 33.04, p < .001, ε = .608, ηp² = .550. Pairwise comparisons confirmed that novel nominal metaphors (M = 1069.98 ms, SE = 30.1) elicited longer RTs compared with novel similes (M = 1021.11 ms, SE = 29.52), p = .003, literal sentences (M = 899.84 ms, SE = 26.02), p < .001, and anomalous sentences (M = 982.94 ms, SE = 32.6), p = .002. Additionally, novel similes evoked longer response times than literal sentences (p < .001), and anomalous sentences elicited longer RTs compared with literal sentences (p = .001). There was no statistically significant difference between anomalous sentences and novel similes, p > .05. Mean reaction times for each utterance type in each language are provided in Figure 3.

Reaction times (ms) for Polish (dark gray) and English (light gray) novel nominal metaphors, novel similes, literal, and anomalous sentences.
Event-related potentials
The N400 (350–450 ms)
Within the N400 time window (350–450 ms), an interaction between laterality and utterance type was found, F(12, 336) = 2.41, p = .038, ε = .442, ηp² = .079, as well as between language and utterance type, F(3, 84) = 3.16, p = .029, ηp² = .10, and between laterality, anterior-posterior electrode position, and language, F(12, 336) = 2.65, p = .017, ε = .514, ηp² = .096. To deconstruct the interactions, follow-up analyses were carried out for each language separately over individual electrodes. The results showed that in Polish (L1), the effect of utterance type was most pronounced over midline, right medial, and right fronto-central and central electrodes: FCz [F(3, 84) = 3.88, p = .012, ηp² = .122], FC2 [F(3, 84) = 3.02, p = .034, ηp² = .097], and Cz electrode [F(3, 84) = 2.72, p = .050, ηp² = .088]. Mean amplitudes of these electrodes were then averaged and subjected to an analysis by utterance type, whose results showed a statistically significant difference in the N400 amplitudes between literal and anomalous sentences, and a marginally significant difference between novel similes and anomalous sentences (see Table 2, Figure 4, and Figure 5). Additionally, the analysis revealed a graded effect across utterance types, F(1, 28) = 10.11, p = .004, ηp² = .265.
Mean amplitudes (amps) for literal (LIT), novel similes (SIM), novel nominal metaphors (NM), and anomalous (ANO) utterances in Polish (L1) and English (L2) within the 350–450 ms time window.

Topographic distribution of anomalous sentences, novel nominal metaphors, novel similes, and literal sentences in Polish (L1) in the 350–450 ms time window.

Grand averages for anomalous sentences (black solid line), novel nominal metaphors (gray solid line), novel similes (black dashed line), and literal sentences (black dotted line) over midline, right medial, and right fronto-central and central electrode positions in Polish (L1).
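One common way to test a graded effect such as the one reported for Polish (L1) is a within-subject linear contrast over the ordered conditions, which yields an F ratio with 1 and n − 1 degrees of freedom (1 and 28 for 29 participants). Whether the reported F(1, 28) was obtained exactly this way is an assumption; the sketch below only illustrates the technique, with hypothetical names and made-up toy amplitudes.

```python
from math import sqrt
from statistics import mean, stdev

# Linear trend weights for four ordered conditions (assumed ordering):
# anomalous, novel nominal metaphor, novel simile, literal.
LINEAR_WEIGHTS = (-3, -1, 1, 3)


def linear_trend_f(amplitudes_by_participant):
    """Return (F, (df1, df2)) for a within-subject linear contrast.

    Each row holds one participant's mean amplitudes in the four
    conditions, in the same order as LINEAR_WEIGHTS.
    """
    scores = [
        sum(w * x for w, x in zip(LINEAR_WEIGHTS, row))
        for row in amplitudes_by_participant
    ]
    n = len(scores)
    # One-sample t test of the contrast scores against zero; F equals t squared.
    t_value = mean(scores) / (stdev(scores) / sqrt(n))
    return t_value ** 2, (1, n - 1)


# Toy amplitudes in microvolts (made up for illustration, not study data).
toy_data = [
    (-2.0, -1.5, -1.0, -0.2),
    (-1.8, -1.0, -1.1, 0.1),
    (-2.5, -1.9, -0.8, -0.5),
    (-1.2, -1.4, -0.6, 0.3),
]
f_value, dfs = linear_trend_f(toy_data)
print(dfs)  # (1, 3)
```

A contrast of this kind asks whether amplitudes change monotonically across the ordered conditions, rather than whether any two particular conditions differ.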
In English (L2), on the other hand, the effect of utterance type was more widely distributed, and was most robust over centro-parietal and parietal electrodes: CP3 (F(3, 84) = 2.80, p = .045, ηp² = .091), CP1 (F(3, 84) = 3.64, p = .016, ηp² = .115), CPz (F(3, 84) = 3.82, p = .013, ηp² = .120), CP2 (F(3, 84) = 3.0, p = .035, ηp² = .097), CP4 (F(3, 84) = 4.10, p = .009, ηp² = .128), P3 (F(3, 84) = 6.66, p < .001, ηp² = .192), Pz (F(3, 84) = 5.15, p = .003, ηp² = .155), P2 (F(3, 84) = 4.73, p = .004, ηp² = .145), and P4 electrode (F(3, 84) = 3.26, p = .025, ηp² = .104). Mean amplitudes of these electrodes were then averaged and subjected to an analysis by utterance type, whose results showed statistically significant differences in the N400 amplitudes between literal sentences and novel nominal metaphors, literal and anomalous sentences, and a marginally significant difference between novel similes and literal sentences (see Table 2, Figure 6, and Figure 7).

Topographic distribution of anomalous sentences, novel nominal metaphors, novel similes, and literal sentences in English (L2) in the 350–450 ms time window.

Grand averages for anomalous sentences (black solid line), novel nominal metaphors (gray solid line), novel similes (black dashed line), and literal sentences (black dotted line) over centro-parietal and parietal electrode positions in English (L2).
Sustained negativity (600–800 ms)
Within the 600 to 800 ms time window, an interaction was found between laterality, anterior-posterior electrode position, and utterance type, F(36, 1008) = 2.18, p < .001, ηp² = .072. To deconstruct the interaction, we performed post-hoc 4 sentence type (novel nominal metaphor vs. novel simile vs. literal sentence vs. anomalous sentence) × 2 language (Polish–L1 vs. English–L2) repeated measures ANOVAs over individual electrodes. The results showed that the main effect of utterance type was strongest over the fronto-central and central midline and right medial electrodes: FCz (F(3, 84) = 19.08, p < .001, ηp² = .405), FC2 (F(3, 84) = 15.70, p < .001, ηp² = .359), Cz (F(3, 84) = 11.48, p < .001, ηp² = .291), and C2 electrode (F(3, 84) = 7.73, p < .001, ηp² = .216). Mean amplitudes of these electrodes were then averaged and subjected to an analysis by utterance type, whose results showed statistically significant differences between anomalous sentences and novel nominal metaphors, novel similes, and literal sentences, as well as between novel nominal metaphors and novel similes (see Table 3, Figure 8, and Figure 9). The analysis also yielded a marginally significant interaction between language and anterior-posterior electrode position, F(3, 84) = 3.12, p = .070, ε = .478, ηp² = .100, yet there was no interaction between language and utterance type (p > .05).
Mean amplitudes (amps) for literal (LIT), novel similes (SIM), novel nominal metaphors (NM), and anomalous (ANO) utterances over fronto-central and central midline and right medial electrode positions within the 600 to 800 ms time window.

Topographic distribution of novel nominal metaphors (NM), novel similes (SIM), literal (LIT), and anomalous (ANO) utterances in the 600–800 ms time window. Voltage maps were obtained for the averaged value of difference waves (anomalous sentences minus novel nominal metaphors, novel nominal metaphors minus novel similes, and novel similes minus literal sentences).

Grand averages for anomalous sentences (black solid line), novel nominal metaphors (gray solid line), novel similes (black dashed line), and literal sentences (black dotted line) over fronto-central and central midline and right medial electrode positions.
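The difference waves mentioned in the topographic map caption are pointwise subtractions of one condition's grand average from another's. A minimal sketch, assuming each grand average is stored as an equally long list of voltages; the function and dictionary names are hypothetical.

```python
# The three condition pairs named in the topographic map caption.
CONTRASTS = (
    ("anomalous", "novel nominal metaphor"),
    ("novel nominal metaphor", "novel simile"),
    ("novel simile", "literal"),
)


def difference_wave(wave_a, wave_b):
    """Pointwise subtraction of two equally long waveforms (a minus b)."""
    if len(wave_a) != len(wave_b):
        raise ValueError("waveforms must have the same number of samples")
    return [a - b for a, b in zip(wave_a, wave_b)]


def all_difference_waves(grand_averages):
    """Compute every contrast in CONTRASTS from a dict of grand averages."""
    return {
        f"{a} minus {b}": difference_wave(grand_averages[a], grand_averages[b])
        for a, b in CONTRASTS
    }


# Toy grand averages in microvolts (made up for illustration, not study data).
toy_averages = {
    "anomalous": [-2.0, -2.5, -3.0],
    "novel nominal metaphor": [-1.5, -2.0, -2.0],
    "novel simile": [-1.0, -1.0, -0.5],
    "literal": [-0.5, -0.5, 0.0],
}
waves = all_difference_waves(toy_averages)
print(waves["anomalous minus novel nominal metaphor"])  # [-0.5, -0.5, -1.0]
```

A negative value in "anomalous minus novel nominal metaphor" means that the anomalous condition was more negative-going than the nominal metaphor condition at that sample.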
Discussion
The present study examined electrophysiological correlates of novel metaphoric language comprehension in the native (Polish) and non-native language (English). The experiment addressed the question of whether novel meaning processing is facilitated by the comparison structure, and whether such a facilitatory effect is reflected in both L1 and L2.
Behavioral results
Reaction time results revealed a graded effect across the four utterance types, as novel nominal metaphors elicited the longest RTs, followed by novel similes, anomalous, and finally literal utterances. Importantly, such results were observed in both L1 and L2, thus suggesting that novel metaphor comprehension, as evidenced in response time patterns, is not modulated by language nativeness when bilingual speakers are highly proficient in their L2. The accuracy rate patterns further confirmed that in both languages, the lowest accuracy rates were elicited for novel nominal metaphors, followed by novel similes, literal, and finally anomalous sentences.
Such results are in line with previous monolingual studies on novel metaphor comprehension, showing that novel similes, whose linguistic form initiates comparison mechanisms, are faster and easier to comprehend (Bowdle & Gentner, 2005). The present results therefore suggest that comparison mechanisms initiated when processing similes might ease novel meaning comprehension. This in turn provides support for the Career of Metaphor Model (Bowdle & Gentner, 2005), according to which novel metaphors are comprehended by means of comparisons. Furthermore, both novel nominal metaphors and novel similes elicited significantly longer RTs and lower accuracy rates compared with literal sentences, which indicates that cognitive mechanisms engaged in novel meaning construction are more difficult and more time-consuming relative to retrieval mechanisms involved in literal meaning comprehension (e.g. Arzouan et al., 2007; Lai et al., 2009; Obert et al., 2018).
Interestingly, the same pattern of results observed in both L1 and L2 is in line with previous research on novel metaphor comprehension by bilingual speakers. Jankowiak et al. (2017) tested highly proficient Polish–English bilinguals who performed a semantic decision task to novel and conventional metaphoric word pairs, and found differences between L1 and L2, yet only for conventional metaphors, thus suggesting that the comprehension of novel metaphoric meanings might be independent of language nativeness. Therefore, the results observed in the present study confirm that novel metaphor comprehension is similarly challenging in both languages, possibly due to the low frequency of such meanings in both languages.
Finally, in line with previous behavioral studies (e.g. de Groot et al., 2002; Dijkstra et al., 1998, 1999), we found a general difference in RT patterns for L1 and L2, with significantly longer RTs for the non-native relative to the native tongue. This indicates that although participants were highly proficient in L2, as confirmed by the LexTale (Lemhöfer & Broersma, 2012) results, language processing mechanisms were still more automatic in their native language, possibly due to the fact that they were all L1-dominant and used their native language more frequently than L2 (Dijkstra & van Heuven, 2002).
Electrophysiological results
Within the N400 time window (350–450 ms), modulations by utterance type were observed, which showed a different pattern of L1 and L2 novel nominal metaphor and novel simile processing.
Namely, in Polish (L1), a graded effect across the utterance type was found, with the most pronounced N400 amplitudes for anomalous utterances, followed by novel nominal metaphors and novel similes, and finally literal sentences. Such results seem to be in line with the career of metaphor model (Bowdle & Gentner, 2005) and indicate that novel metaphor processing is facilitated by a comparison structure. Therefore, lexico-semantic mechanisms involved in the processing of novel metaphoric utterances, even when these express the same source and target domain, might be modulated by the linguistic structure, as an explicit comparison form provided in similes may lead to a decreased retrieval of information from long-term memory needed to assist meaning comprehension, which is reflected in attenuated N400 amplitudes for novel similes relative to novel nominal metaphors.
In English (L2), in contrast, both novel nominal metaphors and novel similes converged with anomalous sentences, and evoked more robust N400 amplitudes compared with literal sentences. Such results suggest that in L2, both types of novel metaphoric meanings are initially processed similarly to anomalous utterances. Consequently, this indicates that at the stage of lexico-semantic access, comparison mechanisms do not facilitate novel metaphor processing in L2. The results therefore suggest that although novel similes were faster and easier to respond to than novel nominal metaphors, as shown in the behavioral data, both utterance types required a considerable activation of long-term memory before arriving at their final meanings. Furthermore, a more pronounced N400 response to novel nominal metaphors and novel similes relative to literal sentences might reflect an increased difficulty of cross-domain mappings in novel metaphoric compared with literal meaning processing, along with more resource-intensive lexico-semantic processes engaged in novel meaning construction (Arzouan et al., 2007; Lai et al., 2009; Tang et al., 2017a).
Nevertheless, the between-language differences observed within the N400 time frame disappeared within the time window of 600–800 ms, where sustained negativity was observed in both languages. Namely, within the time window of the late positive complex (LPC), a component frequently observed in studies on figurative language comprehension (e.g. Arzouan et al., 2007; Regel et al., 2011; Rutter et al., 2012; Spotorno et al., 2013), prolonged negativity was observed in response to novel nominal metaphors and anomalous utterances. Novel similes, in contrast, converged with literal sentences, both of which elicited more positive amplitudes. Such findings are in line with previous studies on monolingual novel metaphoric meaning comprehension, in which a prolonged negativity was observed in response to novel metaphoric relative to literal meanings. This negativity was interpreted either as a continuation of the N400 response, reflecting the ongoing difficulty of meaning integration (Tang et al., 2017a, 2017b), or as overlapping with the LPC and indicating a prolonged activation of semantic information in creative language processing (Rataj et al., 2018a, 2018b). Alternatively, such sustained negativity might be indicative of an activation of the non-literal route during the comprehension of semantically complex, novel metaphoric meanings (Jankowiak, 2019).
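To make the window-based measure concrete, the following minimal sketch shows how a mean amplitude could be computed for the two time windows named above (350–450 ms and 600–800 ms). It is an illustration only and not the analysis pipeline used in the study: the function name `mean_amplitude`, the toy waveform, the 50 ms sampling step, and the assumption that time is counted from stimulus onset are all hypothetical.

```python
from statistics import fmean

# Time windows taken from the text, in ms.
# Assumption: time zero is stimulus onset (the time-locking point is not specified here).
N400_WINDOW = (350, 450)
LATE_WINDOW = (600, 800)

def mean_amplitude(times_ms, amplitudes_uv, window):
    """Average the amplitudes whose time stamps fall inside the window (inclusive)."""
    start, end = window
    selected = [a for t, a in zip(times_ms, amplitudes_uv) if start <= t <= end]
    if not selected:
        raise ValueError("no samples fall inside the requested window")
    return fmean(selected)

# Hypothetical averaged waveform sampled every 50 ms (made-up values, in microvolts).
times_ms = list(range(0, 1000, 50))
toy_erp = []
for t in times_ms:
    if N400_WINDOW[0] <= t <= N400_WINDOW[1]:
        toy_erp.append(-2.0)   # made-up negativity in the N400 window
    elif LATE_WINDOW[0] <= t <= LATE_WINDOW[1]:
        toy_erp.append(-1.0)   # made-up sustained negativity in the late window
    else:
        toy_erp.append(0.0)

n400_mean = mean_amplitude(times_ms, toy_erp, N400_WINDOW)
late_mean = mean_amplitude(times_ms, toy_erp, LATE_WINDOW)
```

In an analysis of this kind, one such value would be obtained per participant, condition, and language (and typically per electrode or region), and these values would then enter the statistical comparison.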
Importantly, sustained negativity in response to novel nominal metaphors relative to novel similes was observed in both L1 and L2, thus indicating that the comprehension of novel nominal metaphors involved continued difficulty in meaning integration and/or access to the non-literal route in both languages. Previously, sustained negativity for novel metaphors in the bilingual context was found in the study by Jankowiak et al. (2017), who also tested highly proficient late unbalanced Polish–English bilinguals, and who observed prolonged negativity in response to novel metaphors in L1, and to both novel and conventional metaphors in L2. In line with the study conducted by Jankowiak et al. (2017), the present study shows that novel metaphor meaning integration is similarly cognitively taxing in both L1 and L2. Yet, this continued effort seems to be modulated by comparison mechanisms, since the comparison form present in similes facilitated meaning integration, as indexed by attenuated sustained negativity for novel similes compared with novel nominal metaphors. Consequently, the observed findings might be interpreted in favor of the Career of Metaphor Model (Bowdle & Gentner, 2005), showing that novel, highly creative and unfamiliar metaphor construction might rely on comparison mechanisms between the source and target domains, and such an effect might be independent of language nativeness.
Interestingly, the ERP results seem to indicate that novel nominal metaphors occupy an intermediate point between anomalous utterances and literal meanings. This was also demonstrated by two behavioral findings: longer reaction times for novel nominal metaphors relative to anomalous sentences, pointing to an increased difficulty of categorizing them as meaningful or meaningless, and rather low accuracy rates for novel nominal metaphors, further indicating that they were often classified as anomalous. Given that the present investigation cannot provide a direct answer to the question of what makes a given novel metaphor perceived as more meaningful or meaningless, we believe that future studies should try to elucidate this issue by, for instance, employing a multiple-choice semantic decision task, whereby participants judge how much sense a particular utterance makes on a scale (e.g. a four-point scale reflecting perfect sense, some sense, little sense, and no sense; Lai & Curran, 2013; Lai et al., 2009). Unlike a binary semantic decision task, a multiple-choice semantic decision task allows for investigating the level of meaningfulness of a stimulus, and it could thus show how the degree of sensicality correlates with brain responses to novel metaphoric compared with anomalous utterances.
Finally, we also believe that research on cognitive mechanisms engaged in L1 and L2 metaphor processing could benefit greatly from future investigations that compare languages that are more distant, both in terms of their typology and culture-related phraseology. Such studies could consequently provide valuable insights into lexico-semantic processes that are more independent of a potential conceptual transfer between the two languages.
Conclusion
Altogether, the present results show that semantically complex meaning construction requires a more extended activation of long-term memory during the stage of lexico-semantic access (i.e. the N400 response) in L2 than L1. In contrast, meaning integration mechanisms, as indexed by late components, might engage a similar degree of cognitive effort in L1 and L2 when bilingual speakers are highly proficient in their L2. Consequently, though comparison mechanisms facilitate novel metaphor processing in both languages, in the case of L2, their role is more profound at the stage of meaning integration.
Footnotes
Acknowledgements
We thank Weronika Młodzikowska (Faculty of Psychology and Cognitive Science, Adam Mickiewicz University, Poznan, Poland) for her help in EEG data collection.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by the National Science Centre, Poland (Grant Number 2017/25/N/HS2/00615).
Data availability
The corresponding author can share the data collected and analyzed for the purpose of the present study upon request.
