Is Motor Simulation Involved During Foreign Language Learning? A Virtual Reality Experiment

Abstract

This article presents a study performed to investigate the role of simulation in second language learning while using a virtual environment. Participants were asked to explore a virtual park while learning 15 new Czech verbs (action verbs that describe movements performed with either the hand or the foot, and abstract verbs). This learning condition was compared with a baseline condition, where movements (either virtual or real) were not allowed. The goal was to investigate whether the virtual action (performed with the feet) would promote or interfere with the learning of verbs describing actions that were performed with the same or a different effector. The number of verbs correctly remembered in a free recall task was computed, along with reaction times and number of errors during a recognition task. Results show that the simulation per se has no effect in verbal learning, but the features of the virtual experience mediate it.

Keywords

second language learning, embodiment, simulation, verbal memory, enactment

Introduction

The link between action and language has been widely acknowledged by the scientific community. During the last decade, several articles have been published reporting the involvement of the motor (Buccino et al., 2005; Gerfo et al., 2008; Repetto, Colombo, Cipresso, & Riva, 2013) and premotor cortices (Boulenger, Hauk, & Pulvermuller, 2009; Hauk, Johnsrude, & Pulvermuller, 2004; Kemmerer, Castillo, Talavage, Patterson, & Wiley, 2008; Tettamanti et al., 2005; Willems, Labruna, D’Esposito, Ivry, & Casasanto, 2011) during language processing. Language learning is a cognitive ability that entangles language and memory, and is suitable for investigation within this theoretical frame. In fact, it has been highlighted that foreign language learning is enhanced by linking verbal and action information (see Macedonia & von Kriegstein, 2012, for an extensive review). How to investigate this link poses an interesting methodological question. We know that verbal information is usually communicated by words or sentences, whereas action information is derived from gestures. After a review of the key points of traditional studies on action information and language learning, the present study will focus on the feasibility of using simulation as an alternative, innovative method to explore these same processes.

The impact of gestures on verbal memory has been studied for decades as gestures are fundamental in second language acquisition. Engelkamp and Zimmer (1985), for example, reported that the recall of action words or sentences is improved if, during the learning phase, the subjects pantomime the corresponding action. These results emerged by comparing the language acquisition rate to a control condition in which subjects could only hear/read the action items. This “enactment effect” not only increased the number of items correctly remembered but also improved the accessibility of the memorized items, as proven using recognition tasks (Kronke, Mueller, Friederici, & Obrig, 2013; Masumoto et al., 2006; Noice & Noice, 2001). Recent neuropsychological studies provide further evidence. A recent functional magnetic resonance imaging (fMRI) study (Miyahara, Kitada, Sasaki, Okamoto, Tanabe, & Sadato, 2013) investigated the neural substrates involved in the effect of spontaneous verbal labeling when memorizing increasingly complex sequences of hand movements. Results showed that the use of verbal labels would reduce neuronal activity in imitation-related regions, such as the left inferior frontal gyrus. As further confirmation of these findings, Straube and colleagues (2014), relying on behavioral and fMRI data, reported the activation of different neural substrates (left temporal pole and middle cingulate cortex vs. posterior thalamic structures and anterior and posterior cingulate cortices) for processing related and unrelated coverbal gestures, leading to enhanced memory performance.

Enriching the study phase with action information, to promote verbal learning, has been used effectively in foreign language learning, a field where verbal memory has a crucial role (Taleghani-Nikazm, 2008). Several researchers have pointed out how gestures increase recall and prevent decay when used to accompany foreign language words (Kelly, McDevitt, & Esch, 2009; Macedonia, 2003; Tellier, 2008). Interestingly, abstract words also profit from the use of enactment, as demonstrated by Macedonia and Knösche (2011).

The reason why enacted items are better remembered and retained is still under debate. Different explanations have been proposed for the enactment effect. As they are not mutually exclusive, we can say they mirror different perspectives. Some authors (see Allen, 1995) refer to classical cognitive theories, such as the depth of processing principle (Craik & Tulving, 1975). According to this principle, the deeper the item is processed (i.e., in terms of semantic features), the more likely it will be recalled in the future; moreover, the memory trace will also last longer. Hence, item recall should benefit from enactment in the encoding phase as it deepens the level of processing. The dual code theory by Paivio (Paivio, 1971) is also referred to as a mechanism underlying this effect (Tellier, 2008): Those items that comprise not only verbal but also visual information are more efficiently remembered. Gestures, in this case, provide the second “code,” the motor trace that plays the same role as the visual code.

The hypothesis of the motor trace is also taken into account by Macedonia et al. (Macedonia, Muller, & Friederici, 2011) to explain learning of both concrete and abstract words during foreign language acquisition. According to the authors,

performing a gesture when learning a word . . . strengthens the connections to embodied features of the word that are contained in its semantic core representation. . . . in the case of abstract words such as adverbs, gesture constructs an arbitrary motor image from scratch that grounds abstract meaning in the learner’s body. (Macedonia & von Kriegstein, 2012, p. 398)

All of these positions are based on the idea that an enrichment of the semantic representation underlies the advantage of enactment.

However, other researchers explored the effect of action on language processing, while assuming the concept of simulation as a theoretical background. According to Barsalou (Barsalou, 2008), “simulation is the re-enactment of perceptual, motor, and introspective states acquired during experience with the world, body, and mind” (p. 618). The effects of motor simulation have been widely investigated in several behavioral experiments, addressing different issues about the interplay between language and action in different linguistic processes (Bergen & Wheeler, 2010; Ditman, Brunye, Mahoney, & Taylor, 2010; Frak, Nazir, Goyette, Cohen, & Jeannerod, 2010; Papeo, Corradi-Dell’Acqua, & Rumiati, 2011; Rueschemeyer, Lindemann, van Rooij, van Dam, & Bekkering, 2010; Springer & Prinz, 2010; Taylor & Zwaan, 2008; Tseng & Bergen, 2005; Zwaan & Taylor, 2006). Yet, the direction of the effect of simulation is still unclear: Does it help or interfere with the linguistic process? The literature reports opposing results related to comprehension and lexical decision tasks. Some authors (Myung, Blumstein, & Sedivy, 2006; Rueschemeyer et al., 2010) found a facilitation effect (faster reaction times [RTs]) due to a match between the action performed and the one described by the verb. Other authors (Buccino et al., 2005) observed an interference when the effector used to provide the answer and the one involved in the action word were the same.

It is quite interesting that, to our knowledge, few articles have aimed at applying the concept of simulation to learning processes by investigating the role of overt action execution during learning. Paulus and collaborators (Paulus, Lindemann, & Bekkering, 2009), for example, predicted that if the acquisition of functional information about an object requires a mental simulation of its use, then an overt motor interference during the encoding phase should affect the acquisition of the functional object knowledge by blocking motor simulation. To test this hypothesis, Paulus and collaborators (Paulus et al., 2009) constructed two sets of novel objects: Half of these objects related to the action of hearing, and the other half related to the action of smelling (both actions being performed by manipulating the object with one hand). Participants were shown pictures of the objects and instructed to verbally learn their functional properties by repeating them aloud. The learning settings were systematically varied according to four different interference conditions during the encoding phase: no interference, hand interference (participants had to squeeze a soft ball while performing the verbal learning task), foot interference (participants had to press a soft ball with their feet while performing the verbal learning task), and attention interference (the task concomitant to the learning one was an auditory oddball target detection task). As predicted, the performance, assessed in a subsequent test phase, decreased significantly in the hand interference condition. In this condition, the actual movement performed during the learning phase interfered with spontaneous and covert motor simulation of the functional object knowledge. The fact that action information contributes to conceptual processing was already identified by Kiefer et al. (Kiefer, Sim, Liebich, Hauk, & Tanaka, 2007), who found that, during a categorization task of previously learned novel objects, an early activation of the frontal motor regions and later activation of occipitoparietal visuomotor regions occurred only when a pantomime of the features of the objects was performed during the learning phase. However, these experiments deal with the learning processes linked to conceptual (i.e., functional) information and not with language learning per se, as addressed in the research previously described, which used gesture enrichment.

Starting from this theoretical background, the aim of the present work is to investigate the role of motor simulation (related to the match-mismatch between the effector that executes the action and that described by the action verb) during the acquisition of a foreign language. To reach this goal, an experimental setting was implemented in which participants had to learn foreign verbs—action (hand or foot actions) and abstract—with or without concomitant real and virtual motor tasks. The tasks had to be performed using virtual reality technology.

Virtual reality is a combination of technological devices that allows users to create, explore, and interact with 3-D environments. Typically, individuals entering a virtual environment feel part of this world, and they have the opportunity to interact with it almost as they would in the real world. The similarity of the virtual experience to the real world relies mostly on three features: sight, hearing, and interaction. In most cases, the visual input is provided by means of a computer monitor or a head-mounted display (HMD). The HMD is a visualization helmet that conveys computer-generated images to both eyes, giving the illusion of the third dimension in the surrounding space. Aural devices may be head-based, such as headphones, or stand-alone, such as speakers. The degree of interaction relies on multiple factors. Probably the most influential of these is the software that manages this virtual interaction: The more users see their actions affecting the virtual world, the more they will feel immersed and engaged. These features of the virtual experience have an impact on the sense of presence perceived by the user. Presence is usually defined as the “sense of being there” in a scene depicted by a medium (Barfield, Zelter, Sheridan, & Slater, 1995). In other words, the more a virtual environment is able to elicit a “perceptual illusion of non-mediation” in the user (Lombard & Ditton, 1997), the more the user will feel present in the environment. The determinants of presence are multiple, and refer, on one side, to the features of the environment and to the level of interaction it allows (IJsselsteijn, de Ridder, Freeman, Avons, & Bouwhuis, 2001; IJsselsteijn & Riva, 2003) and, on the other side, to the user’s characteristics (i.e., the tendency to use visual representations; Slater & Usoh, 1994).

In our experiment, the evaluation of the sense of presence was a critical issue: It has been proven that some key aspects linked to presence, such as the sense of spatial presence, correlate with the activation of different cortical regions, including motor and premotor areas (Baumgartner, Valko, Esslen, & Jancke, 2006). Hence, understanding the role of these areas during semantic comprehension of action words (Repetto et al., 2013) is also a fundamental prerequisite to understanding how presence modulates linguistic processes.

Our main hypothesis is that if the simulation of the action described by the verb is important for learning a verb’s meaning, then a concomitant action that involves the same effector as the verb should modulate its recall.

More specifically, we hypothesized three scenarios:

Hypothesis 1: Motor simulation is not involved: If so, the acquisition of action verbs will be equal, regardless of the effector described.

Hypothesis 2: Motor simulation is involved and triggered by actual motion: If so, as participants use their hands to explore the virtual environment, the memorization of hand-action verbs should be modulated.

Hypothesis 3: Motor simulation is involved and triggered by virtual motion: If so, as participants virtually walk/run in the environment, the memorization of the foot-action verbs should be modulated.

Method

Participants

Forty-two volunteers (16 males and 26 females; age: range = 19-49 years, M = 33.17, SD = 7.23; years of education: range = 13-21, M = 16.27, SD = 2.33) were recruited for the experiment by public advertisement. Participants were all native Italian speakers and had normal or corrected-to-normal vision. Exclusion criteria included history of traumatic brain injury or neurological diseases. No participant was aware of the specific purpose of the study. They were informed that one of the participants, chosen based on the best performance, would receive a coupon worth 50 euros. All participants signed an informed consent form (previously approved by the ethics committee of the university) to join the experiment.

Stimuli

Fifteen verbs in the Czech language were selected: Five of them described actions performed with the hand (e.g., to draw), five verbs described actions performed with the foot/leg (e.g., to jump), and five of the verbs were intellectual or symbolic activities (e.g., to forget). The complete set of items is listed in Table 1.

Table 1.

The Complete Set of Items Included in the Experiment.

Items	Czech verb	Italian verb (English)	Verb type
1	Kopat	Calciare (to kick)	Foot-action verb
2	Skok	Saltare (to jump)	Foot-action verb
3	Bruslit	Pattinare (to skate)	Foot-action verb
4	Pochod	Marciare (to march)	Foot-action verb
5	Bezet za	Rincorrere (to run after)	Foot-action verb
6	Kura	Sbucciare (to peel)	Hand-action verb
7	Prohlizet	Sfogliare (to leaf through)	Hand-action verb
8	Odzanotkovack	Stappare (to uncork)	Hand-action verb
9	Kreslit	Disegnare (to draw)	Hand-action verb
10	Hreben	Pettinare (to brush)	Hand-action verb
11	Provést	Intraprendere (to undertake)	Abstract
12	Zapomenout	Scordare (to forget)	Abstract
13	Usadìt	Dirimere (to settle)	Abstract
14	Ocenovat	Apprezzare (to appreciate)	Abstract
15	Oprit	Propendere (to have a propensity for)	Abstract

We chose the Czech language because, on one hand, it is almost unknown in Italy (and thus well suited to avoid familiarity effects), and, on the other hand, its phonology is quite comprehensible for Italian speakers. The three categories of verbs included items matched for length and frequency, according to the available database for spoken Italian (De Mauro, Mancini, Vedovelli, & Voghera, 1993). All the Czech verbs were audiotaped with an online voice synthesizer, and the corresponding Italian translations were recorded by a female human voice.

Each trial was composed of a Czech verb, followed by its Italian translation, and by the repetition of the same Czech verb, with 1 s of delay in between. The intertrial delay was set up at 3 s. Figure 1 summarizes the trial composition and timing.

Figure 1.

Trial composition and timing.

Five blocks were constructed and randomly presented. A particular trial was presented only once in each block, and the order of presentation of the trials was randomized. Thus, in total, the task included 75 trials.
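The block structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software used in the experiment: the function name `build_study_sequence`, the fixed seed, and the use of Python's `random` module are hypothetical. Only the counts (15 verbs, 5 blocks, each verb once per block in random order) come from the text.

```python
import random

# Czech verbs, spelled as listed in Table 1.
VERBS = [
    "Kopat", "Skok", "Bruslit", "Pochod", "Bezet za",               # foot-action
    "Kura", "Prohlizet", "Odzanotkovack", "Kreslit", "Hreben",      # hand-action
    "Provést", "Zapomenout", "Usadìt", "Ocenovat", "Oprit",         # abstract
]


def build_study_sequence(verbs, n_blocks=5, seed=None):
    """Return n_blocks blocks; each block holds every verb once, in random order.

    Hypothetical helper: the seed argument exists only to make the
    illustration reproducible.
    """
    rng = random.Random(seed)
    blocks = []
    for _ in range(n_blocks):
        block = list(verbs)   # copy, so the master list is never reordered
        rng.shuffle(block)    # independent random order within each block
        blocks.append(block)
    return blocks


blocks = build_study_sequence(VERBS, n_blocks=5, seed=0)
n_trials = sum(len(block) for block in blocks)
print(n_trials)  # 15 verbs x 5 blocks
```

Because every block contains the same 15 items, shuffling within each block is enough for this sketch; a separate shuffle of block order would not change the resulting design.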

Virtual Environment

The virtual environment employed the freeware software NeuroVr2 (www.neurovr2.org). It was designed to represent a park on a sunny day. When entering the virtual park, the participant started his or her exploration from a paved track, and the “first-person point of view” was set up as for an adult who was standing and ready to explore the park. On the sides of the track, green grass completely covered the ground, and trees and shrubs enriched the area. In addition to natural features, many artifacts were shown that typically would be seen in a park: for example, benches, streetlamps, and bins. A picnic area and a playground were displayed. No human beings were present in the scene. The interaction with the environment (when required, depending on the experimental condition) was regulated by manipulating the left knob of a joypad (Xbox 360) with the left thumb. The virtual environment was projected through an HMD, shaped as sunglasses (covering the eyes and resting on the ears).

Questionnaires

To measure the sense of presence, the ITC-Sense of Presence Inventory (ITC-SOPI) was employed (Lessiter, Freeman, Keogh, & Davidoff, 2001). This questionnaire was developed taking into account the key factors that predict the sense of presence; it focuses on the user’s experience of the media, both during and after the experience. It is based on four factors: Sense of Physical Space (Cronbach’s alpha = .94), Engagement (Cronbach’s alpha = .89), Ecological Validity (Cronbach’s alpha = .76), and Negative Effects (Cronbach’s alpha = .77).
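For readers less familiar with the reliability index quoted above, the following minimal sketch shows how Cronbach's alpha is conventionally computed from item-level ratings. The function name `cronbach_alpha` and the ratings are hypothetical and serve only to illustrate the formula; they are not ITC-SOPI data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(item_scores[0])
    columns = list(zip(*item_scores))                 # one tuple of ratings per item
    sum_item_vars = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum_item_vars / total_var)


# Hypothetical ratings: 5 respondents x 3 items on a 1-5 scale.
ratings = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 3],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 2))
```

Values closer to 1 indicate that the items of a subscale vary together, which is the sense in which the alphas reported for the four ITC-SOPI factors should be read.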

Participants also completed the UsoImm77 questionnaire (Antonietti & Colombo, 1997). This questionnaire aims at investigating the spontaneous occurrence of visualization and mental images in everyday life activities. The questionnaire comprises 77 items: Each item typifies a situation in which people may experience mental images. Subjects are requested to rate, on a 5-point scale, how frequently the visualization process described in the item occurs for them. The items concern different mental functions (memorizing, recalling, problem solving, daydreaming), involve different kinds of mental images (static and dynamic, single and interactive, personal and impersonal, spontaneously elicited by external stimuli and intentionally constructed and processed by the subjects), have different content (objects, persons, places), and concern different situations (e.g., study activities, leisure time, and so on).

Procedure

Before attending the experimental session, volunteers were contacted by email and requested to complete (using an online form) the UsoImm77 questionnaire. This had to be finished at least 1 day before the experimental session to prevent a priming effect on spontaneous imagery.

On the day of the experimental session, participants were welcomed into a quiet room by an experienced researcher. The materials used in the lab included a personal computer and the tools used to experience virtual reality (VR; joypad and HMD). These materials were arranged in front of the participant at a distance of approximately 50 cm.

As a first step, the participants wore the HMD and held the joypad, while the researcher launched the practice session. This first phase aimed at familiarizing participants with the environment and the commands needed to interact with it. Afterwards, the experimental session started. The main task was the verbal learning of the verbs, which were presented in an auditory manner. Participants were instructed to listen to the Czech verbs, trying to remember as many items as possible. In addition, participants had to follow different instructions according to the experimental condition to which they belonged (as explained below).

Participants were randomly assigned to one of two experimental conditions: the Run condition or the Baseline condition. In the Run condition, participants performed the main task while exploring the park as if they were walking or running through it. The instructions stressed that they had to keep walking, in whatever direction they were going, without stopping until the presentation of the verbs ended. The walk-like action inside the park was achieved by moving the left joypad knob with the left hand. This experimental condition required people to stand in front of the computer to assume a body position coherent with the virtual walk. No real walking movements were allowed during the session.

In the Baseline condition, the participants sat in front of the computer and started the virtual experience as if they were seated on a bench. In front of them, the playground of the park was displayed. Participants were instructed to pay attention to the Czech verbs: No action within the environment was allowed, with the only exception being the visual exploration of the scene (by turning the head around). This condition served as a baseline measure of the verbal learning.

After completing the study phase (which lasted about 12 min), participants were asked to perform a cued recall task: The experimenter presented the Czech verbs, in an auditory manner and one at a time, and the participants had to provide orally the corresponding Italian translation. The number of verbs correctly remembered was recorded. Immediately after the cued recall task, a recognition task was performed. Participants were instructed to listen to the Czech verbs and to select, as quickly as possible, one of the two possible translations written on the left and right side of the screen by pressing the corresponding left or right button on a button box. The correct responses were presented equally on the left or on the right side of the screen. The correct translation was always coupled with an incorrect, but plausible, translation (i.e., the translation of another presented verb). The RTs were recorded. At the end of the memory tasks, the participants completed the ITC-SOPI questionnaire (Lessiter et al., 2001).
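The construction of the recognition trials can be illustrated as follows. This is a sketch under assumptions rather than the procedure actually implemented: the function name `build_recognition_trials`, the random choice of the foil, and the way sides are balanced are hypothetical. The text only states that the foil was the translation of another presented verb and that correct responses appeared equally often on each side.

```python
import random

# A few Czech-Italian pairs taken from Table 1.
PAIRS = [
    ("Kopat", "Calciare"),
    ("Bruslit", "Pattinare"),
    ("Kreslit", "Disegnare"),
    ("Zapomenout", "Scordare"),
]


def build_recognition_trials(pairs, seed=None):
    """Pair each Czech verb with its translation and a foil on the other side."""
    rng = random.Random(seed)
    n = len(pairs)
    # Alternate left/right, then shuffle, so the correct side is as balanced
    # as the item count allows (an odd count leaves one extra "left").
    sides = (["left", "right"] * ((n + 1) // 2))[:n]
    rng.shuffle(sides)

    trials = []
    for (czech, italian), side in zip(pairs, sides):
        # Foil: the translation of another presented verb (assumed random pick).
        foil = rng.choice([it for cz, it in pairs if cz != czech])
        left, right = (italian, foil) if side == "left" else (foil, italian)
        trials.append(
            {"czech": czech, "left": left, "right": right, "correct_side": side}
        )
    rng.shuffle(trials)  # randomize presentation order
    return trials


trials = build_recognition_trials(PAIRS, seed=1)
for trial in trials:
    print(trial)
```

With the full set of 15 verbs, an exact left/right split is not possible, so a scheme of this kind would yield 8 correct responses on one side and 7 on the other.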

Results

Statistical analyses were conducted on 40 participants. Two of them presented either ceiling or floor effects in the cued recall tasks and, hence, were excluded as outliers.

First of all, we were interested in testing the impact of the different virtual experiences on the dependent variables (number of verbs correctly remembered in the cued recall task, RTs of verbs correctly recognized, and number of errors in the recognition task—see descriptive statistics in Table 2).

Table 2.

Descriptive Statistics for All the Considered Variables.

Variable	Experimental condition	M	SD	SE	N
Free_hand	Baseline	2.5000	1.57280	0.352	20
	Run	2.3810	1.20317	0.269	20
	Total	2.4390	1.37929	0.218	40
Free_foot	Baseline	2.9500	1.70062	0.380	20
	Run	2.3810	1.35927	0.304	20
	Total	2.6585	1.54288	0.244	40
Free_abstract	Baseline	1.3000	1.45458	0.325	20
	Run	1.1429	0.91026	0.204	20
	Total	1.2195	1.19399	0.189	40
Rt_hand	Baseline	1708.2670	931.34194	208.254	20
	Run	1406.7129	390.89834	87.408	20
	Total	1553.8124	715.33413	113.104	40
Rt_foot	Baseline	1683.2105	816.27008	182.524	20
	Run	1334.7014	496.74457	111.075	20
	Total	1504.7059	686.27533	108.510	40
Rt_abstract	Baseline	1894.8105	839.65249	187.752	20
	Run	1527.0495	462.62855	103.447	20
	Total	1706.4451	690.31301	109.148	40
Err_hand	Baseline	2.1000	1.58612	0.355	20
	Run	3.2857	2.41128	0.539	20
	Total	2.7073	2.11239	0.334	40
Err_foot	Baseline	1.8000	1.96281	0.439	20
	Run	2.1429	1.87845	0.420	20
	Total	1.9756	1.90378	0.301	40
Err_abstract	Baseline	2.3000	1.89459	0.424	20
	Run	2.7143	2.26148	0.506	20
	Total	2.5122	2.07511	0.328	40

Note. rt_hand/foot/abstract = reaction times during recognition task for each type of verb; err_hand/foot/abstract = number of errors during recognition task for each type of verb; free_hand/foot/abstract = number of items correctly remembered during free recall task for each type of verb; Rt = reaction time; Err = error.

We performed repeated measures ANOVAs, with “Verb” (three levels: hand–foot–abstract) as a within-subjects variable and “Condition” (two levels: baseline–run) as a between-subjects variable. Results highlighted that for the number of items recalled, there was an effect of the type of Verb, F(2, 78) = 27.261, mean square error (MSE) = 0.91, p < .001, η² = 0.41, but not of the Condition, F(1, 39) = 0.618, MSE = 3.94, p = .436. Contrasts computed on the variable Verb demonstrated that there were fewer abstract verbs correctly remembered than hand- or foot-action verbs, F(1, 39) = 66.751, MSE = 1.09, p < .001, η² = 0.631, but hand-action verbs and foot-action verbs did not differ, F(1, 39) = 0.952, MSE = 2.18, p = .335. Furthermore, the effect of the type of Verb did not change depending on the Condition, Verb × Condition: F(2, 78) = 0.703, MSE = 0.91, p = .498.

With respect to the recognition task, the number of errors was not influenced by the type of Verb, F(2, 78) = 2.035, MSE = 2.86, p = .14, nor by the Condition, F(1, 39) = 1.95, MSE = 6.61, p = .17, nor by the interaction between the two, F(2, 78) = 0.79, MSE = 2.86, p = .46. Consistent with the pattern found for the free recall measure, RTs were influenced by the Verb, F(2, 78) = 6.52, MSE = 69914.48, p < .05, η² = 0.14, but not by the Condition, F(1, 39) = 2.79, MSE = 421827.48, p = .1, nor by the interaction Verb × Condition, F(2, 78) = 0.17, MSE = 69914.48, p = .71. Contrasts showed that RTs for abstract verbs were higher (i.e., responses were slower) than for the other types of verbs, F(1, 39) = 15.061, p < .001, MSE = 85918.97, η² = 0.28, which were similar to each other, F(1, 39) = 0.59, MSE = 165099.31, p = .449.
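The effect sizes reported above can be cross-checked against the F values and their degrees of freedom. The sketch below assumes that the reported η² values are partial eta squared, which the text does not state explicitly; the function name `partial_eta_squared` is hypothetical, and the F values and degrees of freedom are copied from the two preceding paragraphs.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom.

    eta_p^2 = (F * df_effect) / (F * df_effect + df_error)
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, reported effect size) taken from the text.
reported = [
    ("Verb, items recalled",            27.261, 2, 78, 0.41),
    ("Abstract vs. action, recalled",   66.751, 1, 39, 0.631),
    ("Verb, reaction times",             6.52,  2, 78, 0.14),
    ("Abstract vs. action, RTs",        15.061, 1, 39, 0.28),
]

for label, f_value, df1, df2, eta_reported in reported:
    eta = partial_eta_squared(f_value, df1, df2)
    print(f"{label}: computed = {eta:.3f}, reported = {eta_reported}")
```

Under this assumption, the recomputed values agree with the reported ones to the precision given in the text.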

Afterward, the scores of the questionnaires were taken into account. First, we computed a MANCOVA using the responses (numbers of errors, response time, and free recall performance) as dependent variables, the Condition (baseline vs. run) as a fixed factor, and the subscales of ITC-SOPI questionnaire as covariates. We used the subscales as covariates as we were interested in examining if and how the levels of perceived presence could mediate participants’ overall responses.

The general model was significant for number of errors in recognizing the correct translation of hand-related verbs, F(5, 35) = 6.72, p < .001, MSE = 2.60, η² = 0.49, R² = .49: Participants in the Baseline condition committed fewer errors (M = 2.10, SD = 1.59) than those in the Run condition (M = 3.29, SD = 2.41). As covariates, the subscale Engagement, F(5, 35) = 6.37, MSE = 2.60, p < .05, η² = 0.15, and Negative Effects, F(5, 35) = 17.15, MSE = 2.60, p < .05, η² = 0.33, appear to have contributed to this difference. Data highlighted how the Engagement subscale had a negative relationship with this dependent variable (B = −1.68, t = −2.52, p < .05), and hence, the higher the level of Engagement, the lower the mistake rate. The opposite was true for the Negative Effects subscale: Higher scores in this subscale were positively related to higher number of errors (B = 1.46, t = 4.14, p < .001).

When examining the influence of specific covariates on our dependent variables, it was possible to highlight another interesting effect. The Ecological Validity subscale had a significant influence, F(5, 35) = 5.17, MSE = 3.42, p < .05, η² = 0.13, on the number of errors in translating foot-related verbs. This subscale had a negative relationship with the dependent variable (B = −1.10, t = −2.27, p < .05). This means that lower error rates in recognizing the correct translation of foot-related verbs were associated with higher scores on the Ecological Validity scale. Interestingly, mean estimates predicted by the effect of this subscale (Baseline = 2.091, Run = 1.866) are in the opposite direction with respect to the observed sample means (Baseline = 1.8, Run = 2.14; see Figure 2).

Figure 2.

The effect of the scale Ecological Validity.

As a second step, the same analysis was applied to the UsoImm77 questionnaire, but results revealed no significant effects for any variable, thus indicating that the individual tendency to use imagery did not influence the task.

Discussion

The present study aimed at investigating the role of motor simulation during foreign language learning. To achieve this goal, we used a virtual environment where participants, while learning Czech verbs, had to move as if they were running. They achieved this virtual run experience by manipulating a knob with their left hand. This procedure led participants to perform two kinds of action: one real (the movement of the hand on the knob) and one virtual (the virtual movement of the feet, which is necessary to run). By comparing the linguistic performance (in terms of learning) in this condition with that in the baseline condition (i.e., without any real or virtual action), we could test whether simulation is involved in this process and which movement triggers it.

When looking at the experiment’s results, it appears that, overall, the simulation of an action performed with the same/different effector does not play a role during second language learning. In fact, the number of items correctly recalled did not vary across conditions but depended only upon the type of verb: Abstract verbs were more difficult to remember than concrete ones. This finding is not surprising as the cognitive advantage of concrete words over abstract words has been recognized in several memory and language tasks (Nelson & Schreiber, 1992; Paivio, Walsh, & Bons, 1994).

Yet the fact that hand-action verbs and foot-action verbs did not differ from each other and, moreover, that the effect of the verb type did not depend on the condition seems to indicate that neither of the actions (real or virtual) affected the learning of verbs that described actions performed with the same or a different effector. Consistently, during the recognition task, the same pattern of effects was evident: The words that were previously better retained (hand- and foot-action verbs) were more quickly recognized, and the opposite was true for the words that were less well remembered (abstract verbs). In contrast, the number of errors in the recognition task did not appear to be influenced by any considered variable: One possible explanation is that the error rate did not rely on the learning process but on different variables, possibly linked to the specific setting or environment.

The fact that simulation is apparently not involved in verbal learning is a new and relevant finding. Data in the literature report an advantage in terms of language learning due to linking words or sentences with gestures (Kelly et al., 2009; Macedonia & von Kriegstein, 2012; Tellier, 2008). However, the enrichment of the action achieved by using gestures and by using the typical paradigm employed to test simulation differ substantially. In one case, the learner pairs a lexical item with a univocal pattern of movements, and the couple action + word is repeated over and over during the study phase. In the second case, a specific movement (virtual or real—in this study, respectively, the run and the manipulation of the knob) is performed for the duration of the study phase with one specific effector that either matches or does not match the one corresponding to the verb. Thus, there is not a specific combination between motion and semantics, but only a generic sharing versus not sharing of the effector (Buccino et al., 2005; Repetto, Cipresso, & Riva, 2015). Moreover, in the present study, volunteers underwent a single session of learning with a relatively small number of verb repetitions: This could account for the lack of differential recall performances regardless of different learning conditions and types of verbs (notice that Macedonia [Macedonia et al., 2011] found that training does not always have an impact on retention). However, it is easy to see that while the gesture paradigm promotes the grounding of the meaning in the learner’s body experience, the use of the same versus different effector is not enough to establish a link between the lexical item and the action.

Yet, when learning focuses on conceptual knowledge, the involvement of simulation has been reported in verbal learning tasks (Paulus et al., 2009). In that study, the learners were explicitly told to pay attention to the functional use of the objects; it is therefore possible that these specific instructions led them to imagine how each object could be used. In that case, imagery rather than simulation is likely to have been activated, and imagery relies on different cerebral networks (Willems, Toni, Hagoort, & Casasanto, 2010).

The fact that no effect of simulation emerged from our data in the foreign-language-learning paradigm, in contrast with its well-established involvement in other linguistic tasks such as comprehension (Ditman et al., 2010; Frak et al., 2010; Tseng & Bergen, 2005; Zwaan & Taylor, 2006), suggests that simulation is a relatively “automatic” mechanism, activated during online processes and sometimes guided by the context or by attentional focus (Bergen & Wheeler, 2010; Taylor & Zwaan, 2008), but never accompanied by the deliberate, conscious use of strategies. From this perspective, foreign language learning can instead be seen as a typical process in which individual strategies have a strong impact. In support of this interpretation, after the experimental session, participants spontaneously told the experimenter the tricks they had used to recall as many verbs as possible in the final test (Lawson & Hogben, 1996). For this reason, the absence of a simulation effect in a language-learning task does not rule out the involvement of the motor system in this linguistic process.

The second (and somewhat surprising) result is the effect of simulation in the recognition task. As discussed previously, the recognition measures do not seem to be influenced by the learning condition (with or without virtual/real movement); more interestingly, however, some effect arises from the components of presence, as assessed by the ITC-SOPI questionnaire. Specifically, the number of errors for hand-action verbs in the recognition task seems to be predicted globally by the set of subscales of the questionnaire, with Engagement and Negative Effects being the most important predictors. Hand-action verbs are more easily recognized if acquired without an interfering movement (Baseline condition), when the learner experienced a high level of Engagement and a low level of Negative Effects.

Even more interesting is the effect on the number of errors for foot-action verbs: This measure appeared to be influenced specifically by Ecological Validity, that is, the tendency to perceive the environment as real. When this index was higher, errors decreased; moreover, when controlling for the other subscales, Ecological Validity was predicted to yield fewer errors in the Run condition than in the Baseline condition. This effect is compatible with the following account in terms of simulation: In the Run condition, learners performed a virtual motion with their feet, and this action was simulated at exactly the same time as lexical access. Thus, the more the learner perceived the environment as real, the more effectively the virtual action acted on the cognitive representation of the verb, and the more he or she simulated the action during recognition. The foot-action simulation, in turn, facilitated lexical access to the verbs that shared the same effector.

Conclusion

The aim of the experiment was to extend the knowledge about the mechanism of simulation and its cognitive effects. In particular, we were interested in testing the occurrence of this process during a linguistic task in which, to our knowledge, simulation has never been investigated: second language learning. The first important finding is that simulation per se is not sufficient to establish a relationship between words and actions during learning, resulting in a null effect on the number of items recalled (at least with a low number of items and repetitions).

Nevertheless, and perhaps more interestingly, our paradigm allowed us to identify a new and relevant finding: Simulation can be mediated by other perceptual and cognitive processes induced by the context, especially the sense of presence. From this perspective, the use of virtual reality gave us the opportunity to highlight the role of some factors linked to the specific experience. These factors appear to be related to the concept of presence (which is linked to the specific characteristics of a VR experience) and can promote or interfere with the simulation process, which occurs even after the virtual experience, as evidenced during the recognition task. In fact, it is known that the sense of presence is mediated by the egocentric perspective in the environment (Bae et al., 2012), typical of the virtual experience.

This result suggests two reflections: On the one hand, it makes clear that simulation can occur when the lexical item must be accessed after being learned. On the other hand, it indicates that the occurrence of simulation during this process is mediated in different ways by the different components of presence. The latter observation raises interesting questions to be addressed by future research: Is it possible to modify the virtual environment to fit the parameters that promote simulation (according to the present findings: a reduction of Negative Effects and an enhancement of Ecological Validity and Engagement)? What happens when the environment is “optimized” in terms of presence? Could simulation also speed up access to the word (in the present study, RTs do not appear to be influenced)?

In the present study, the virtual environment had a very basic structure, and the virtual experience allowed a low level of interaction. We can hypothesize that implementing a virtual world that induces higher levels of presence could be a first step toward this goal.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research and/or authorship of this article.

Author Biographies

Claudia Repetto, PhD, is a postdoctoral researcher in General Psychology and Neuroscience at the Faculty of Psychology of the Catholic University of the Sacred Heart in Milan. Her research interests include memory, language, and new technologies for the study and rehabilitation of cognitive processes.

Barbara Colombo, PhD, is assistant professor of General Psychology at the Faculty of Psychology of the Catholic University of the Sacred Heart in Milan. She also serves as an associate professor in General Psychology and Cognitive Neuroscience at Champlain College (USA). Her theoretical and research background is in cognitive (neuro)psychology. Her research focuses on cognitive processes such as creativity, mental imagery, decision making, and problem solving. She has also been exploring the role of individual differences (such as cognitive styles and personality traits) and emotions in the elaboration of multimedia stimuli.

Giuseppe Riva, PhD, is full professor (tenure position) of General Psychology and Communication Psychology at the Catholic University of Milan, Italy, and head researcher of the Applied Technology for Neuro-Psychology Laboratory (ATN-P Lab), Istituto Auxologico Italiano, Verbania, Italy. He has conducted numerous studies and published many papers on methods and assessment tools in psychology and on the use of virtual reality and the Internet in medicine and in training.

References

Allen, L. Q. (1995). The effects of emblematic gestures on the development and access of mental representations of French expressions. The Modern Language Journal, 79, 521-529.

Antonietti & Colombo (1997). The spontaneous occurrence of mental visualization in thinking. Imagination, Cognition and Personality, 16, 415-428.

Bae, Lee, Park, Cho, Park, & Kim (2012). The effects of egocentric and allocentric representations on presence and perceived realism: Tested in stereoscopic 3D games. Interacting With Computers, 24, 251-264. doi:10.1016/j.intcom.2012.04.009

Barfield, Zelter, Sheridan, T. B., & Slater (1995). Presence and performance within virtual environments. In Barfield & Furness, T. A. (Eds.), Virtual environments and advanced interface design (pp. 473-513). Oxford: Oxford University Press.

Barsalou, L. W. (2008). Grounded cognition. Annual Review of Psychology, 59, 617-645. doi:10.1146/annurev.psych.59.103006.093639

Baumgartner, Valko, Esslen, & Jancke (2006). Neural correlate of spatial presence in an arousing and noninteractive virtual reality: An EEG and psychophysiology study. CyberPsychology & Behavior, 9, 30-45. doi:10.1089/cpb.2006.9.30

Bergen & Wheeler (2010). Grammatical aspect and mental simulation. Brain & Language, 112, 150-158. doi:10.1016/j.bandl.2009.07.002

Boulenger, Hauk, & Pulvermuller (2009). Grasping ideas with the motor system: Semantic somatotopy in idiom comprehension. Cerebral Cortex, 19, 1905-1914. doi:10.1093/cercor/bhn217

Buccino, Riggio, Melli, Binkofski, Gallese, & Rizzolatti (2005). Listening to action-related sentences modulates the activity of the motor system: A combined TMS and behavioral study. Cognitive Brain Research, 24, 355-363. doi:10.1016/j.cogbrainres.2005.02.020

Craik, F. I. M., & Tulving (1975). Depth of processing and the retention of words in episodic memory. Journal of Experimental Psychology: General, 104, 268-294.

De Mauro, Mancini, Vedovelli, & Voghera (1993). Lessico di frequenza dell’italiano parlato (LIP) [Frequency lexicon of the spoken Italian]. Milano, Italy: Etaslibri.

Ditman, Brunye, T. T., Mahoney, C. R., & Taylor, H. A. (2010). Simulating an enactment effect: Pronouns guide action simulation during narrative comprehension. Cognition, 115, 172-178. doi:10.1016/j.cognition.2009.10.014

Engelkamp & Zimmer, H. D. (1985). Motor programs and their relation to semantic memory. German Journal of Psychology, 9, 239-254.

Frak, Nazir, Goyette, Cohen, & Jeannerod (2010). Grip force is part of the semantic representation of manual action verbs. PLoS ONE, 5, e9728. doi:10.1371/journal.pone.0009728

Gerfo, E. L., Oliveri, Torriero, Salerno, Koch, & Caltagirone (2008). The influence of rTMS over prefrontal and motor areas in a morphological task: Grammatical vs. semantic effects. Neuropsychologia, 46, 764-770. doi:10.1016/j.neuropsychologia.2007.10.012

Hauk, Johnsrude, & Pulvermuller (2004). Somatotopic representation of action words in human motor and premotor cortex. Neuron, 41, 301-307.

IJsselsteijn, W. A., de Ridder, Freeman, Avons, S. E., & Bouwhuis (2001). Effects of stereoscopic presentation, image motion, and screen size on subjective and objective corroborative measures of presence. Presence: Teleoperators and Virtual Environments, 10, 298-311.

IJsselsteijn, W. A., & Riva (2003). Being there: The experience of presence in mediated environments. In Davide, Riva, & IJsselsteijn, W. A. (Eds.), Being there: Concepts, effects and measurements of user presence in synthetic environments (pp. 3-16). Amsterdam, The Netherlands: IOS Press.

Kelly, S. D., McDevitt, & Esch (2009). Brief training with co-speech gesture lends a hand to word learning in a foreign language. Language and Cognitive Processes, 24, 313-334.

Kemmerer, Castillo, J. G., Talavage, Patterson, & Wiley (2008). Neuroanatomical distribution of five semantic components of verbs: Evidence from fMRI. Brain & Language, 107, 16-43. doi:10.1016/j.bandl.2007.09.003

Kiefer, Sim, E. J., Liebich, Hauk, & Tanaka (2007). Experience-dependent plasticity of conceptual representations in human sensory-motor areas. Journal of Cognitive Neuroscience, 19, 525-542. doi:10.1162/jocn.2007.19.3.525

Kronke, K. M., Mueller, Friederici, A. D., & Obrig (2013). Learning by doing? The effect of gestures on implicit retrieval of newly acquired words. Cortex: A Journal Devoted To the Study Of the Nervous System and Behavior, 49, 2553-2568. doi:10.1016/j.cortex.2012.11.016

Lawson, M. J., & Hogben (1996). The vocabulary-learning strategies of foreign-language students. Language Learning, 46, 101-135.

Lessiter, Freeman, Keogh, & Davidoff (2001). A cross-media presence questionnaire: The ITC-Sense of Presence Inventory. Presence: Teleoperators and Virtual Environments, 10, 282-297.

Lombard & Ditton (1997). At the heart of it all: The concept of presence. Journal of Computer-Mediated Communication, 3(2). doi:10.1111/j.1083-6101.1997.tb00072.x

Macedonia (2003). Sensorimotor enhancing of verbal memory through “voice movement icons” during encoding of foreign language (Doctoral thesis). University of Salzburg, Germany.

Macedonia & Knösche, T. R. (2011). Body in mind: How gestures empower foreign language learning. Mind, Brain, & Education, 5, 196-211.

Macedonia, Muller, & Friederici, A. D. (2011). The impact of iconic gestures on foreign language word learning and its neural substrate. Human Brain Mapping, 32, 982-998. doi:10.1002/hbm.21084

Macedonia & von Kriegstein (2012). Gestures enhance foreign language learning. Biolinguistics, 6, 393-416.

Masumoto, Yamaguchi, Sutani, Tsuneto, Fujita, & Tonoike (2006). Reactivation of physical motor information in the memory of action events. Brain Research, 1101, 102-109. doi:10.1016/j.brainres.2006.05.033

Miyahara, Kitada, Sasaki, A. T., Okamoto, Tanabe, H. C., & Sadato (2013). From gestures to words: Spontaneous verbal labeling of complex sequential hand movements reduces fMRI activation of the imitation-related regions. Neuroscience Research, 75, 228-238. doi:10.1016/j.neures.2012.12.007

Myung, J. Y., Blumstein, S. E., & Sedivy, J. C. (2006). Playing on the typewriter, typing on the piano: Manipulation knowledge of objects. Cognition, 98, 223-243. doi:10.1016/j.cognition.2004.11.010

Nelson, D. L., & Schreiber, T. A. (1992). Word concreteness and word structure as independent determinants of recall. Journal of Memory and Language, 31, 237-260.

Noice (2001). Learning dialogue with and without movement. Memory & Cognition, 29, 820-827.

Paivio (1971). Imagery and verbal processes. New York, NY: Holt.

Paivio, Walsh, & Bons (1994). Concreteness effects on memory: When and why? Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 1196-1204.

Papeo, Corradi-Dell’Acqua, & Rumiati, R. I. (2011). “She” is not like “I”: The tie between language and action is in our imagination. Journal of Cognitive Neuroscience, 23, 3939-3948. doi:10.1162/jocn_a_00075

Paulus, Lindemann, & Bekkering (2009). Motor simulation in verbal knowledge acquisition. The Quarterly Journal of Experimental Psychology, 62, 2298-2305. doi:10.1080/17470210903108405

Repetto, Cipresso, & Riva (2015). Virtual action and real action have different impacts on comprehension of concrete verbs. Frontiers in Psychology, 6, 176. doi:10.3389/fpsyg.2015.00176

Repetto, Colombo, Cipresso, & Riva (2013). The effects of rTMS over the primary motor cortex: The link between action and language. Neuropsychologia, 51, 8-13. doi:10.1016/j.neuropsychologia.2012.11.001

Rueschemeyer, S. A., Lindemann, van Rooij, van Dam, & Bekkering (2010). Effects of intentional motor actions on embodied language processing. Experimental Psychology, 57, 260-266. doi:10.1027/1618-3169/a000031

Slater & Usoh (1994). Representations systems, perceptual position, and presence in immersive virtual environments. Presence: Teleoperators and Virtual Environments, 2, 221-233.

Springer & Prinz (2010). Action semantics modulate action prediction. The Quarterly Journal of Experimental Psychology, 63, 2141-2158. doi:10.1080/17470211003721659

Straube, Meyer, Green, & Kircher (2014). Semantic relation vs. surprise: The differential effects of related and unrelated co-verbal gestures on neural encoding and subsequent recognition. Brain Research, 1567, 42-56. doi:10.1016/j.brainres.2014.04.012

Taleghani-Nikazm (2008, October). Gestures in foreign language classrooms: An empirical analysis of their organization and function. Paper presented at the Second Language Research Forum, University of Illinois, IL.

Taylor, L. J., & Zwaan, R. A. (2008). Motor resonance and linguistic focus. The Quarterly Journal of Experimental Psychology, 61, 896-904. doi:10.1080/17470210701625519

Tellier (2008). The effect of gestures on second language memorisation by young children. Gesture, 8, 219-235.

Tettamanti, Buccino, Saccuman, M. C., Gallese, Danna, Scifo, . . . Perani (2005). Listening to action-related sentences activates fronto-parietal motor circuits. Journal of Cognitive Neuroscience, 17, 273-281. doi:10.1162/0898929053124965

Tseng & Bergen (2005). Lexical processing drives motor simulation. Paper presented at the Proceedings of the Twenty-Seventh Annual Conference of the Cognitive Science Society. July, Stresa (Italy).

Willems, R. M., Labruna, D’Esposito, Ivry, & Casasanto (2011). A functional role for the motor system in language understanding: Evidence from theta-burst transcranial magnetic stimulation. Psychological Science, 22, 849-854. doi:10.1177/0956797611412387

Willems, R. M., Toni, Hagoort, & Casasanto (2010). Neural dissociations between action verb understanding and motor imagery. Journal of Cognitive Neuroscience, 22, 2387-2400. doi:10.1162/jocn.2009.21386

Zwaan, R. A., & Taylor, L. J. (2006). Seeing, acting, understanding: Motor resonance in language comprehension. Journal of Experimental Psychology: General, 135, 1-11. doi:10.1037/0096-3445.135.1.1