Abstract
Objective:
A feasibility study was run on an embodied exergame designed to teach 4th–12th grade students about nutrition and several U.S. Department of Agriculture (USDA) MyPlate guidelines. The study assessed the efficacy of a new version of a game that was first implemented on an immersive platform and published in this journal in 2013. The earlier “Alien Health” game was redesigned for use with the Microsoft® (Redmond, WA) Kinect® sensor. Players learned about the amount of nutrients and optimizers in common food items and practiced making food choices while engaging in short cardio exercises.
Subjects and Methods:
Twenty 6th and 7th graders were randomly assigned to either the “Alien Health” game or a treated control condition. All engaged in “front of the classroom” performative activities. The “Alien Health” experimental group experienced the full game narrative of feeding the Alien and automated feedback on the quality of performed exercises. The control group experienced the same performative food choices at the interactive whiteboard but did no exercises. Two-week follow-up data were collected.
Results:
Both groups displayed statistically significant learning gains on the immediate nutrition knowledge posttest. The effect sizes from pretest to 2-week follow-up were 0.83 for the control group and 1.14 for the experimental group. Of interest is the crossover interaction from posttest to follow-up that approached significance (F19=3.96, P<0.058). Here, the experimental group outperformed the control group for knowledge retention.
Conclusions:
Results suggest acceptability, feasibility, and limited efficacy in a Kinect-based game to instruct in nutrition and the USDA MyPlate icon. The follow-up test revealed that nutrition knowledge continued to increase for the experimental group that performed short cardio exercises, suggesting that short exercises and perhaps a game narrative may have helped to consolidate content memory.
Introduction
Exergames could be extremely useful in the fight against childhood obesity. A recent meta-analysis on exergaming found that such games “… increase physical activity levels, energy expenditure, maximal oxygen uptake, heart rate, and…reduce waist circumference and sedentary screen time.” 2 Meshing an effective exergame with the instruction of nutritional knowledge has not been done for middle schoolers to our knowledge, and so the “Alien Health” game is filling a new niche. In addition, adding a skeletal tracking system like the Microsoft Kinect to give players immediate performance feedback and reinforce learning is an innovative add-on, and two goals were to test the prototype and usability with middle-school children.
Previously this journal published an accessibility study on the alpha version of the “Alien Health” game, demonstrating it was engaging for 4th graders and held the attention of the entire class for over an hour. Statistically significant learning gains on a nutrition test were also reported postintervention. 3 The previous version was played in SMALLab (Situated Multimedia Arts Learning Lab), a motion-capture educational platform that used 12 ceiling-mounted infrared cameras to track player movements. The experience was very immersive because the projected floor space was 15×15 feet and players could run short distances. The original game was also played in dyads in front of an entire class as peers sat around the periphery of the floor projection. That small, within-subjects study resulted in statistically significant learning gains on the same nutrition test used in this study 3 (see the Appendix for a version of the test or visit the Web site http://egl.lsi.asu.edu). With the advent of the Microsoft Kinect, the game no longer needed the expensive rigid-body camera-based platform of SMALLab to capture movement. It became obvious that a complete redesign was needed to accommodate the affordances of the new technology. The new game adhered to several design constraints that are followed in the EGL lab (e.g., make the game embodied with gross motor movements necessary for task completion, make it engaging for the entire class, make food items based on real world/authentic choices for teens, and design it to facilitate peer discourse).
Learning theories
The authors have created several educational games that are based on embodiment.4–6 Embodied design is premised on the hypothesis that cognition is tightly linked to the body and its physical affordances. 7 Several learning theories drive the lab's creation of games, including a new taxonomy on embodiment in the Journal of Educational Psychology. 6 The movements associated with the games' interactivity are all gesturally congruent, 8 that is, the gestures have structural or metaphoric overlap with content to be learned. In addition, there is solid gesture research over the past few decades supporting that when students learn with gesture (as opposed to having gestures constrained), they retain the new knowledge better. 9 Others have stated that gestures can serve as cognitive primes for learning.10,11 Goldin-Meadow 12 posited that additional motor traces in memory may be created during gesture-based instruction. Perhaps it is these multimodal traces that enhance the retention of new information? Our hypothesis is that practicing better food choices with active choosing gestures in a fun, game-like environment will lead to increases in learning and retention. Well-designed games can be engaging and intrinsically motivating. 13
Serious games also create space for students to reflect and to experience becoming experts. 14 Care was taken to insert opportunities for discourse and reflection in the “Alien Health” game. Discourse is embedded in the collaboration of the playing dyads as they must agree on a food item, but opportunities are also created for the observing students. The observers are encouraged to volunteer hypotheses and ideas during play. Collaborative learning in the classroom generates significantly higher achievement outcomes, higher-level reasoning, better retention, improved motivation, and better social skills15–17 than traditional didactics. One reason discourse may be a potent causal factor in learning is because students must be explicit—that is, they must explain why they are making certain choices. They must defend their current mental model. The act of explaining forces students to be evaluative, and in this manner students might uncover “unifying patterns or regularities and these may prompt discovery of broad generalizations.”18, p.800
Research questions
The Kinect for Xbox 360 sensor was used as the input device. With full-body exercise it is best to not have more than two players in the active space at one time. A large vertical display (an interactive whiteboard) was used so it was no longer possible to have players run down the long floor diagonal as in the previous SMALLab study. By consensus, six stationary exercises were prompted by the system. The teacher (experimenter in this study) told students that the short sets of exercises helped the Alien metabolize foods. A final goal of this study was to replicate the gains from the mixed reality platform using the new Kinect platform.
The study addressed two specific research questions:
1. Will middle-school children learn more nutrition content when the learning is accompanied with embodied exercise and an engaging game narrative? The hypothesis was that the experimental “Alien Health” game group would show greater learning gains on the immediate posttest because the game narrative would be more engaging and motivating. 13 Engagement is promoted for learners when games are “well-designed and appropriate.” 19
2. Will the learning gains be maintained after 2 weeks? The memory consolidation literature would suggest that a learning difference might be seen in follow-up studies. 20 (Note that follow-up tests were not given in the original SMALLab study.)
Subjects and Methods
Participants
Thirty-two 6th and 7th graders came to the university campus on a science field trip. The children attended a Title 1 middle school and received 100 percent free breakfast and lunch. Fifty-one percent of the participants were female; they were 80 percent Hispanic, 15 percent African American, and 5 percent other. The children at the school were 35 percent English Language Learners, primarily native Spanish speakers. Because of pairwise pretest matching, results were run on only 20 participants.
Experiment design
Students were randomly assigned to condition in two waves. In both the first and second waves, there were 6 control and 10 experimental students. The two groups went to two separate rooms (the control room was smaller). They took the pretests, went through the intervention, and then took the immediate posttest. When not active at the whiteboard, the peer students sat in a large semicircle around the whiteboard and watched the active dyad play the game. The same protocol occurred for the second wave. There were two experimenters (M.C.J.-G. and C.S.-R.). They randomly chose mixed-gender pairs to come to the front of the room to play on the whiteboard. Play lasted approximately 50 minutes for each wave, similar to a class period. The experimental dyads each physically played the “Alien Health” game for a shorter amount of time (approximately 9 minutes per dyad versus 15 minutes per dyad in the control condition). Overall, all participants were exposed to the same content for the same amount of time.
Gameplay
The game was designed to teach about the five nutrients/optimizers and also to encourage discourse. The backstory was, “You have awoken to find an Alien under your bed. He is hungry, but you cannot communicate. You must figure out which foods make him feel better. He is in charge of stopping an asteroid from hitting Earth.” In the opening shot the Alien is hanging his head and looking tired.
The five “nutrients”
Figure 1 shows the two food choices at the top with the constellation/profile of nutrients and optimizers below. A forced-choice paradigm was used so that students had to choose between two similar or matched food items (in this example, a bran muffin versus a cupcake). We wanted children to have practice quickly choosing food they might be exposed to during a typical day at home, in stores, or in a school lunch line. Two subject matter experts (a registered dietitian and a nutrition expert) were consulted extensively during the design of the game. We did not want to focus on simple calorie counts because children do not seem to make rapid decisions this way and because, overall, we were more interested in emphasizing food quality over caloric quantity per the USDA Guidelines. 21 Our goal was to get children to think about foods that satiate and were “more” nutritious in a comparative manner.

Screenshot of forced-choice task for real foods.
The five nutrients and optimizers are protein, fats, carbohydrates, fiber, and vitamins/minerals. They represent broad nutritional domains important for understanding food quality. For each factor, four levels of magnitude were included to represent varying concentrations of each within each food choice; these show up as vibrantly colored bars at the bottom of the screen. Note in Figure 1 there is more fiber and less fat in the bran muffin. The game was designed to encourage students to discourse about the somewhat similar food items and to make an optimal choice together. Thus, amount of display time for the paired items was under control of the players; decisions were made between 3 and 10 seconds depending on amount of discourse and how closely matched the food pairs were. When consensus was reached (e.g., for the bran muffin), then both players needed to place their hand icons over the item. When the Kinect sensed both players' hand icons hovering over a single food item, then that food item would be released, and the player on the same side would be able to move it into the Alien's mouth. Once the Alien ate the food, the system prompted the players for a series of short exercises to help the Alien “metabolize.” Shimmery sparkles coursed over the Alien as the players exercised following prompts from iconic stick figures. Many of the game mechanics are similar to the previous game in SMALLab. 22 A 3-minute video on the game is available at www.youtube.com/watch?v=V81VaKD79rY.
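The consensus mechanic described above (a food item is released only when both players hover over the same item) can be sketched roughly as follows. This is an illustration only, not the game's code: the `FoodItem` class, the `consensus_item` function, the use of 2D screen coordinates, and the simplification to one tracked hand icon per player are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]  # hypothetical screen coordinates of a hand icon


@dataclass(frozen=True)
class FoodItem:
    """A food choice with an axis-aligned on-screen hit box (hypothetical layout)."""
    name: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, p: Point) -> bool:
        px, py = p
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


def consensus_item(items: Sequence[FoodItem],
                   hand_p1: Point,
                   hand_p2: Point) -> Optional[FoodItem]:
    """Return the item both players are hovering over, or None if they disagree.

    Assumption: each player is represented by a single hand-icon position.
    """
    for item in items:
        if item.contains(hand_p1) and item.contains(hand_p2):
            return item
    return None
```

Requiring both hand icons on one item is what forces the dyad to talk and agree before anything is fed to the Alien; a single player cannot release an item alone.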
Figure 2 shows the exercising stick figures that appear where the food items had previously been. The figures animate a set of six randomly displayed exercises: arm circles, squats, dance, jumping jacks, jog in place, and simple jump up and down. The teacher can select ahead of time the level of difficulty: easy (2 reps), medium (5 reps), or hard (9 reps). For this experiment, reps were set at medium. If one player did not do any exercises, the other player in the dyad had to do twice as many, so both participants were motivated to perform.
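The repetition rule in this paragraph can be written down as a small sketch. The rep counts per difficulty level come from the text; the names `REPS_BY_DIFFICULTY` and `required_reps` are hypothetical, and how the game detected a non-participating player is not described, so it is passed in as a flag.

```python
# Rep counts per difficulty level, as stated in the text.
REPS_BY_DIFFICULTY = {"easy": 2, "medium": 5, "hard": 9}


def required_reps(difficulty: str, partner_is_exercising: bool) -> int:
    """Reps one player must complete for a single exercise prompt.

    If the partner performs no exercises, this player owes twice as many.
    """
    base = REPS_BY_DIFFICULTY[difficulty]
    return base if partner_is_exercising else 2 * base
```

At the medium setting used in this experiment, a player with an idle partner would owe 10 reps rather than 5 under this reading of the rule.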

Screenshot of exercising figures. Kinect assesses human match in real time and gives feedback.
The Kinect also allowed for immediate feedback on quality of the exercise. When an exercise was well performed, as assessed by our algorithm, the words “Good Job” flashed across the appropriate stick figure. As an example of scoring, the Kinect gathered data on the position of 20 joints on an interpolated skeleton at 30 frames/second. For arm circles, to calculate a “hit”, each arm must first be straight. The system must check the angle between the upper arm and shoulder blades and the angle between the lower arm and shoulder blades, and if both angles are close, then the arm is straight. It also assesses whether each hand has moved a minimum distance per second; at the end of 1 second the distance traveled against a minimum required distance is checked. If both arms are straight and both hands have moved more than the minimum criterion, then the player has performed an arm circle.
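A rough sketch of the arm-circle check described above is given here. It is not the authors' algorithm: the joint names, the use of the left-to-right shoulder vector as a stand-in for the "shoulder blades" line, the 20-degree tolerance, the 0.5 m minimum hand travel, and the choice to require straight arms in every frame of the 1-second window are all assumptions made for illustration.

```python
import math
from typing import Dict, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Frame = Dict[str, Vec3]  # joint name -> (x, y, z) in meters; names are hypothetical


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _angle_deg(u: Vec3, v: Vec3) -> float:
    """Angle between two vectors in degrees (0 if either has zero length)."""
    nu, nv = math.hypot(*u), math.hypot(*v)
    if nu == 0.0 or nv == 0.0:
        return 0.0
    cos = sum(a * b for a, b in zip(u, v)) / (nu * nv)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def arm_is_straight(frame: Frame, side: str, tol_deg: float = 20.0) -> bool:
    """Straight if upper and lower arm make similar angles with the shoulder line."""
    shoulder_line = _sub(frame["shoulder_right"], frame["shoulder_left"])
    upper = _sub(frame[f"elbow_{side}"], frame[f"shoulder_{side}"])
    lower = _sub(frame[f"hand_{side}"], frame[f"elbow_{side}"])
    diff = abs(_angle_deg(upper, shoulder_line) - _angle_deg(lower, shoulder_line))
    return diff <= tol_deg


def is_arm_circle(frames: Sequence[Frame],
                  min_hand_travel_m: float = 0.5,
                  tol_deg: float = 20.0) -> bool:
    """Evaluate one second of frames (about 30 at 30 frames/second).

    A "hit" requires both arms straight and both hands to have traveled
    at least the minimum distance over the window.
    """
    for side in ("left", "right"):
        if not all(arm_is_straight(f, side, tol_deg) for f in frames):
            return False
        hand_path = [f[f"hand_{side}"] for f in frames]
        travel = sum(math.dist(p, q) for p, q in zip(hand_path, hand_path[1:]))
        if travel < min_hand_travel_m:
            return False
    return True
```

Comparing the two arm-segment angles against a common reference line avoids needing a calibrated elbow angle, and checking distance once per second keeps the test cheap enough to run for two players in real time.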
As players exercised, bright shimmery lights traversed up and down the Alien's body to simulate him metabolizing the food. If the correct (healthier) choice (i.e., the bran muffin) had been chosen, the Alien would next stand upright and become a more vibrant green shade after metabolizing. Conversely, if the poorer choice had been given to him (the cupcake), he would fall into a lower state of fatigue and turn a shade of lighter yellow, and his antennae and body would droop further. There were five stages of fatigue feedback until the Alien reached full sleep (bright red and snoring). This was considered a fail state, and the game started over.
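The fatigue feedback can be thought of as a small state machine, sketched below. The five-stage limit and the fail state come from the text; the `Alien` class, the starting stage of 1, and the assumption that each choice moves the Alien exactly one stage are hypothetical details added for illustration.

```python
from dataclasses import dataclass

MAX_FATIGUE = 5  # from the text: five stages of fatigue before full sleep


@dataclass
class Alien:
    fatigue: int = 1  # assumed starting stage; the text only says he looks tired

    def feed(self, healthier_choice: bool) -> bool:
        """Update fatigue after metabolizing a food item.

        Returns True when the Alien has fallen fully asleep (the fail state),
        meaning the game should start over.
        """
        if healthier_choice:
            self.fatigue = max(0, self.fatigue - 1)  # more upright, more vibrant green
        else:
            self.fatigue = min(MAX_FATIGUE, self.fatigue + 1)  # droopier, yellower
        return self.fatigue == MAX_FATIGUE
```

Tying the Alien's visible state to the running history of choices gives players cumulative feedback rather than a per-item right/wrong signal.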
Description of game levels
Level 1
Level 1 (described above) featured forced-choice play using real foods. Level 1 contained 25 matched pairs that displayed randomly without replacement on random sides of the screen (i.e., the correct choice was not always on one side of the screen).
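A minimal sketch of this presentation scheme follows: the matched pairs are shuffled once (random order without replacement) and the healthier item is placed on a random side for each trial. The function name `level1_trials`, the dictionary layout, and the ordering of each pair as (healthier, poorer) are assumptions.

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple


def level1_trials(pairs: Sequence[Tuple[str, str]],
                  rng: Optional[random.Random] = None) -> List[Dict[str, str]]:
    """Build the Level 1 trial list from (healthier, poorer) food pairs.

    Each pair appears exactly once, in random order, with the healthier
    item randomly placed on the left or right side of the screen.
    """
    rng = rng or random.Random()
    order = list(pairs)
    rng.shuffle(order)  # random order, no pair repeated
    trials = []
    for healthier, poorer in order:
        if rng.random() < 0.5:
            trials.append({"left": healthier, "right": poorer, "correct_side": "left"})
        else:
            trials.append({"left": poorer, "right": healthier, "correct_side": "right"})
    return trials
```

Randomizing the side per trial prevents players from learning a positional shortcut instead of attending to the nutrient profile.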
Level 2: Nonfoods
In the 4th grade study, students were extremely savvy about what they “should” choose in a test situation. Given the choice of ice cream versus low fat frozen yogurt, they know the yogurt is better for them, but they often do not know why. Level 2 was included to force players to attend to the nutrient and optimizer profile to drive decision-making. We did not want them to merely rely on simple visual cues of known foods. This level contained multiple pairs of “nonfoods.” The following backstory text appeared before the level: “These were found in the Alien's backpack and you need to figure out which ones make him feel better.” Figure 3 shows what the nonfood molecules looked like. Similar to the real food level, these also have four levels of magnitude in the profile. An example of names might be “Ogg-Ogg” versus “Blumpf.” This nonfood concept was modeled from a construct used in the classic nonword reading task, 23 the gold standard for assessing dyslexia. In that task participants are asked to read nonwords so they cannot rely on sight reading or memory for the words, but must use their phonological awareness skills. We believe this level is crucial to force students to practice their nutrition comprehension skills.

Screenshot of nonfood choices.
Level 3: MyPlate and Build a Lunch
In the final level, Level 3, players were now able to individually drag items onto a cafeteria tray to create a balanced lunch for the Alien. Figure 4 shows a screen shot. The maximum number of items in the lunch was six. Dyads still needed to discourse to agree if the meal appeared balanced and when they should select the “Eat” button at the top of the screen. When “Eat” was selected, the Alien began to metabolize the lunch, and the players needed to exercise to metabolize all the items in a row.

Screenshot with MyPlate icon over a cafeteria tray.
At this level, the USDA MyPlate template was introduced. In our studies in Arizona we have queried our participants if they had ever been introduced to MyPlate. As of the fall of 2013, no student had ever raised his or her hand. Teachers confirmed that they had not covered this topic. After a player placed an item on the tray, the proportion of a daily serving for a 12-year-old boy filled the MyPlate icon. This proportion was chosen to match the preteen figure of the Alien body. Thus, for the three items on the tray in Figure 4 the quadrant for grains is one-third filled because of the small bowl of rice and the cherry Danish pastry, and the veggies and protein quadrants have been activated by the avocado. Note there is no dairy circle in this version of MyPlate because we wished to focus on solid food choices only. At the end of the final exercise sequence, fireworks are displayed to signal the end.
Pilot study control condition
Learning gains were assessed between the experimental group (“Alien Health”) and a control group that did not perform the short cardio exercises. The learning tasks needed to be somewhat similar in style, and the performative “front of the classroom” aspect needed to be maintained. Students take more ownership of learning when they know they must perform in front of their peers and make public choices. 24 Because of interface design and programming time issues, the Alien backstory was also removed from this condition; thus, there were two alterations in the control condition.
Control Level 1: Real foods
Students were simply asked, “Which do you think would be better to eat?” When the better choice of the two had been made by both students, the same fun, technobeat music from the experimental condition played, except now the task was to use the interactive whiteboard pen and tap a yellow ball as it bounced along a blue signal (Fig. 5). A stick figure danced in the middle of the screen, but players did not have to mimic the movements as in the experimental condition. All students smiled during this task and reported when asked that it “was fun”; thus we assume it was somewhat engaging. Students were standing and interacting with the whiteboard with arm movements, but they were not performing the same sort of gross body cardio exercise that the experimental group performed. Player dyads still needed to discourse to agree on the food choice.

Screenshot of the control condition main interface.
Control Level 2: Nonfoods
The same food labels and large molecule icons showed up for nonfoods but with no Alien in the middle. Players were told to “pick the healthiest item” based on the nutrient profile.
Control Level 3: Build a Lunch
Players were instructed to pick six of the healthiest foods for a balanced lunch, and they received feedback on their choices (Fig. 6).

Screenshot of the control condition MyPlate version.
Measure: The nutrition and food choice test
This study used the same experimenter-designed measure from Johnson-Glenberg and Hekler 3 (see the Appendix). We wanted to gather data on an older group of children and see if gains would replicate in a new platform. The measure contained a mixture of 31 forced-choice and open-ended items along with a blank template of the MyPlate icon for students to fill in. The maximum score on the test was 100. (In the previous study with 4th graders no participant scored over 95.) The maximum that could be scored on the MyPlate subsection was 16. “Protein” needed to be written in one of the smaller quadrants for 4 full points to be awarded. A common error was writing “meat.” This was awarded 3 points because meat is a type of protein and is certainly worth more than “candy,” which showed up on the occasional pretest and earned 0 points.
Results
Because of room size differences, the study contained more experimental than control participants; in addition, several of the players were recent immigrants and had such limited English language proficiency that the teachers assured us these students did not understand the written English pretest. These students were omitted from the analyses. Given the sample size differences, a pairwise match was performed to create equitable final groups. There were 10 English-proficient controls; these were matched with 10 experimental English-proficient students on the pretest. Thus, the study ended up with 20 participants equally divided across the visiting classrooms and two genders.
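The matching procedure is not specified beyond being pairwise on the pretest. One common way to do such matching is greedy nearest-score pairing, sketched below purely as an illustration; the function name `match_on_pretest` and the greedy strategy are assumptions, not the authors' documented method.

```python
from typing import Dict, List, Tuple


def match_on_pretest(controls: Dict[str, float],
                     candidates: Dict[str, float]) -> List[Tuple[str, str]]:
    """Pair each control with the unused candidate whose pretest score is closest.

    Both arguments map a participant ID to a pretest score. There must be
    at least as many candidates as controls. Ties are broken by candidate ID
    so that the result is deterministic.
    """
    if len(candidates) < len(controls):
        raise ValueError("need at least as many candidates as controls")
    available = dict(candidates)
    matches = []
    # Process controls from lowest to highest pretest score.
    for control_id, score in sorted(controls.items(), key=lambda kv: (kv[1], kv[0])):
        best = min(available, key=lambda cid: (abs(available[cid] - score), cid))
        matches.append((control_id, best))
        del available[best]
    return matches
```

Greedy matching is simple but not guaranteed to minimize the total pretest difference across all pairs; an optimal assignment method would be needed for that.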
Statistics
Statistics were run using SPSS version 19 software (SPSS, Inc., Chicago, IL). Both groups made significant gains in learning by posttest (paired t19=4.29, P<0.001). The comparative analysis at posttest was somewhat unexpected. The control group displayed a larger gain than the experimental group (Fig. 7), but the difference was not statistically significant (between-subjects t19<1.5, difference not significant). This may have occurred because the control dyads each had more hands-on time with their lesson, and this led to larger gains at immediate posttest. Table 1 lists the descriptives and effect sizes.

Graph of content knowledge at three test points.
Data are mean (standard deviation) values.
ES, effect size (or Cohen's d).
Follow-up test
However, when the experimenters went to the school 2 weeks later to re-administer the invariant follow-up test, a different story emerged. Now the experimental group (mean=77.40 [standard deviation 7.01]) did better than the control group (mean=75.40 [standard deviation 8.70]). The crossover effect approached statistical significance. An analysis of variance on the gains scores revealed that the effect between group and knowledge change (from test point time 2 to 3) was a trend that approached significance (F19=3.96, P<0.058). Given the small sample size and the fact that the gain score differences trended toward significance, this is worth reporting. The effect sizes, which ranged from moderate to large, are impressive in the cognitive change literature.
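For readers less familiar with effect sizes, the sketch below shows one standard way to compute Cohen's d, using the pooled standard deviation of two groups. It is applied here to the follow-up means and standard deviations quoted above, with 10 participants per group. Note the assumptions: the paper does not state which variant of d was used for its reported pretest-to-follow-up effect sizes, and the between-group value computed here is not a figure reported by the authors.

```python
import math


def cohens_d(mean_a: float, sd_a: float, n_a: int,
             mean_b: float, sd_b: float, n_b: int) -> float:
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


# Follow-up scores from the text: experimental 77.40 (7.01), control 75.40 (8.70).
d_followup = cohens_d(77.40, 7.01, 10, 75.40, 8.70, 10)
print(round(d_followup, 2))
```

Under these assumptions the between-group difference at follow-up works out to about 0.25, a small effect, which is consistent with the crossover being described as a trend rather than a significant difference.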
Discussion
This study was designed to answer two research questions regarding games and nutrition instruction. The first question focused on immediate gains: Will players learn more nutrition content when the learning is folded in with exercise and a game narrative? The answer was unexpected. Both groups made impressive and statistically significant gains on the content test immediately after their respective interventions, demonstrating that dyads of students coming to the front of the class and performing forced-choice decisions on food items is an acceptable paradigm for teaching about nutrition and MyPlate. For the experimental condition, it did not appear that adding the game and exercise components differentially affected immediate recall of content. Indeed, the gain was somewhat less for the experimental “Alien Health” group immediately postintervention, although not significantly so. This may be because more experimental students needed to cycle through the active, performative part of the game in a shorter amount of time, and so they experienced less hands-on time with the tasks, although total observational (or time-in-room) time was the same for both conditions. However, we have learned over the years of creating embodied educational content that important learning effects can sometimes emerge on delayed tests 25 ; thus a 2-week delayed test was administered at the school.
For the second question regarding retention effects, these data suggest that adding a narrative backstory and short cardiovascular exercises efficaciously affected retention. We have seen delayed retention effects using other embodied learning content (see Johnson-Glenberg et al. 26 ). This finding on delayed effects for embodiment is in line with the memory consolidation literature, 20 which stipulates that sleep strengthens new, labile memories. In the vernacular of production rules, people solve problems in new domains by applying weak problem-solving procedures to declarative knowledge, and new production rules need time to be compiled and to facilitate transfer among skills. 27 Our finding would also be supported by the more theoretical cognitive and discourse models, like construction integration, 28 that require time for activation to spread. In addition, neural network modeling researchers have posited that new knowledge must be “interleaved” with old knowledge, and this takes time; specifically, there is “… gradual incorporation of new knowledge into representational systems located primarily in the neocortical regions.…”29, p.453
This study held constant the content and much of the performative nature of the task. However, we note again that two components were altered in the control condition. First, the alien backstory was removed; this narrative may have been motivating for this age group, although it may also have added cognitive load to the learning, which might explain the stronger immediate gains seen in the control group. Second, the cardio exercises were not performed in the control condition. In some ways, it is embarrassing to perform exercises in front of your peers as they sit and watch. This may have affected immediate gains in learning as well. In future iterations we will ask the entire class to continue standing and exercising with the active dyad.
Limitations
This study suggests acceptability, feasibility, usability, and some efficacy for the game format. It is not clear if it is the narrative or the exercise that drives the crossover retention trend (i.e., the delayed gains interaction). This is a common design problem in educational research because games strive for authenticity, and it can be unnatural and nonexpeditious to vary one degree at a time when a multimedia product is being designed for real-world classrooms. The overall pretest to follow-up test effect size for the “Alien Health” game is an impressive 1.14, which is higher than the 0.83 achieved by the control group.
The relationship between exercise and cognition is undoubtedly complex; minimally, it appears to be dependent on type of exercise, type of test, and time of test. 30 Our pilot teachers refer to the “Alien Health” game as a great example of a break-time activity, something to just “get kids out of their seats during long lessons.” American children's schooldays have become increasingly sedentary; only 29 percent of students regularly attend physical education classes, down from 42 percent in 1991. 31 These results lend further support to the hypothesis that adding embodied exercises may aid long-term memory traces, and the next study will vary only the amount of exercise in the game. To that end we will also include biometrics and sensors that gather individual physical performance data.
This study's Institutional Review Board Human Subjects constraints did not allow us to gather video- or audiotape recordings of the students. We are interested in analyzing discourse and how the students reached food-choice decisions and plan to do so in future studies.
Future studies
Affordable motion capture technologies are opening a new field of research and require new methods of assessment. The lab plans to research the role of congruent gestures on learning. How does a movement's congruency, or overlap with the content to be learned, affect retention (e.g., placing the food in the Alien's mouth [congruent] compared with merely clicking on an item [not congruent to eating])? Several researchers have hypothesized that gesture can be a prime driver in learning because more physical movement activates complex motor neuron patterns, and these will be associated with the learning signal. Cook and Goldin-Meadow 10 and Goldin-Meadow 12 hypothesized that their significant delay test results may be due to gestures producing “stronger and more robust memory traces.” The latter study compared a group that expressed knowledge via speech only with one that used speech and gesture.
For future studies we would use an untreated control group to analyze test effects. In addition, students could be followed up for longer periods to see how much gameplay (dosage) would be most effective for how long. Other metrics like “Plate Waste” could also be used to assess authentic behavioral changes. Finally, M.C.J.-G. and C.S.-R. concur that this version with the interactive whiteboard display was not as continuously engaging as the more immersive, original SMALLab version on the larger mixed-reality platform. This suggests that either all students on the periphery should also exercise to stay engaged, or that teachers should use the game as one of several stations in a classroom with rotating small groups.
Conclusions
Games that merge exercise with learning of nutritional content can be powerful vehicles for instruction and may hold promise for combatting America's ongoing obesity epidemic. The “Alien Health” game is a Kinect-based game that meshes gesture with nutritional instruction and cardiovascular exercise. We are attempting to discover which gameplay components are most efficacious and believe such games should be designed for multiple environments: for use at home (facilitating intergenerational play), in informal learning settings (science museums, physicians' waiting rooms, etc.), and in more formal classroom and physical education settings.
Footnotes
Acknowledgments
Some of this work was made possible by an Obesity Solutions Grant from Arizona State University. Special thanks to Maureen Zimmerman, Tatyana Koziupa, Chris Dean, Sean Griffin, Elizabeth Phillips, Melanie Mosiman, Eric Clark, James Comstock, Trapper McFerron, and James Levine.
Author Disclosure Statement
M.C.J.-G. is also founder of a company called Embodied Games for Learning, LLC. C.S.-R. and H.H. declare no competing financial interests.
