Abstract
This article examines the multimodal native cultural content in two sets of English-language textbooks widely used in public junior high schools in China and Mongolia. A pre-existing analytical framework was adapted for this analysis. Through this adapted framework, this paper aims to analyze and compare the distribution of multimodal native cultural content in the chosen textbooks. The results reveal that the two sets of books contain diverse multimodal native cultural content, despite the significant imbalance between Big “C” and Small “c” categories in each set. The two sets of textbooks also have different emphases on representing these elements. We argue that previous studies overgeneralized language textbooks’ native cultural content. Learners’ intercultural communicative competence development may be impacted by the overemphasis and shortage of some cultural categories in the chosen books. This article concludes with implications for textbook design and classroom teaching and calls for further studies of teachers’ and learners’ perceptions of multimodal native cultural content.
Introduction
Textbooks play a pivotal role in education and language instruction. Thus, the representation of culture in textbooks is of great interest to critical scholars and educators (Canale, 2016). Textbooks for language learning may offer a diverse range of representations of the world and assist learners in developing their communication skills and language awareness (Risager, 2021). The English Curriculum Standards for Compulsory Education, issued by the Ministry of Education of China (Ministry of Education of PRC, 2012), highlighted students’ intercultural communicative competence (ICC) and awareness. Regulations on Secondary and Primary School Textbooks (2019) also contains numerous keywords related to native culture. Accordingly, interest has increased in culture and cultural awareness in China’s officially published EFL textbooks, which may have the largest number of users in the world (Lee & Li, 2020; Shen, 2019; Xiang & Yenika-Agbaw, 2021; Xiong, 2012). However, the representations of native culture have not received a detailed analysis. The composition and inner structure of the native cultural content in these textbooks, which previous studies have treated as monolithic, may be much more complicated.
This line of inquiry has also suggested some issues that should be examined further. In recent years, scholars in both the English as a Foreign Language (EFL) field and educational policy studies have contributed to establishing a substantial body of literature regarding multimodal cultural content in textbooks (Cremona & Arnaouti, 2019; Joo et al., 2020; Setyono & Widodo, 2019). In general, multimodality integrates different modes of communication (i.e., written, oral, or visual) to better illustrate what is being communicated. Multimodality is used across different social and cultural representations, in which an in-depth understanding of a specific context is enhanced through writing, images, and verbal language. The multimodal cultural content in English-language textbooks represents how specific ideas and reflexive turns are communicated through various signs within social environments (Kress, 2003; Kress & Van Leeuwen, 2001). However, studies on multimodal native cultural content in English textbooks are surprisingly scarce, even though current educational policies in China have explicitly emphasized it. Therefore, it is necessary to examine its composition in this representative country’s typical EFL textbooks in depth to reveal more details, analyze its pedagogical implications, and complement the critical analyses focusing on the inclusion and exclusion of certain cultural elements. Meanwhile, the existing frameworks, which analyze the cultural content of textbooks only inefficiently (e.g., Lee, 2009), must themselves be analyzed if they are to be applied to multimodal cultural content analysis within a complicated cultural context.
Additionally, comparing textbooks from more than one country is valuable for textbook research. Comparative education research can help people learn more about other cultures and societies, identify problems in their education systems, analyze the causes of problems, and find solutions (Bray, 2014). Instructive analysis can be carried out if the objects for comparison have sufficient features in common to analyze their differences meaningfully (Phillips & Schweisfurth, 2014). Therefore, comparing China with a country that resembles it in modern educational history and system, yet maintains its own traditional and contemporary culture, will reinforce the analysis of the native cultural content of the textbooks and improve the comparative results. Mongolia has been selected because it is a developing country in the East Asian cultural sphere that was once profoundly influenced by the Soviet Union in education and language teaching and has built social linkages with China through collaboration in education development (Reeves, 2018). Nevertheless, Mongolia possesses cultural traits different from those of China and other East Asian countries (Dillon, 2019; Li, 2018), making it an ideal counterpart to China.
This study will analyze and compare the multimodal native cultural content in the foreign language textbooks most widely used in China and Mongolia. Both sets are for sixth to ninth graders, and they consist of seven and four books, respectively. Because the existing analytical framework is incompatible with multimodal data, we will adapt it to overcome this weakness, quantify the multimodal native cultural content in the two sets of books, and reveal its composition. Comparison based on the statistical results will show the books’ differences and similarities and the adequacy or inadequacy of their cultural categories, which will indicate the hidden reasons for their present arrangement and provide important suggestions for textbook design and publication and for classroom teaching and learning.
Review of the Literature and Theoretical Framework
Native Culture in Foreign Language Textbooks
One approach to language textbook research is to consider it as a carrier of culture, where the content of native and foreign culture is closely connected with intercultural awareness. Byram (1997) proposes that intercultural communicative competence involves five factors that are connected to the concrete cultural content in language textbooks. As one purpose of a textbook is to communicate cultural values and ideologies, existing studies have examined diverse cultural elements in textbooks, like gender equality, language attitude, and the identity of a society (Canagarajah, 1993; Curdt-Christiansen & Weninger, 2015; Lee, 2018; Ruiz-Cecilia et al., 2020; Thompson, 2013). If mishandled, any of these can cause learners to feel bewildered in understanding the social roles (Bandura, 2003; Foroutan, 2012) or lose the opportunity to be exposed to multilingualism (Thompson, 2013).
Recent studies have been concerned with both the native and foreign cultural content in language textbooks. Researchers from Iran, Saudi Arabia, and Thailand have juxtaposed and compared the textual or discursive content of target culture and native culture (Rashidi & Meihami, 2016; Sultan, 2018; Thumvichit, 2018), while researchers from other countries focused on one aspect of local culture (Canh, 2018; Feng, 2019). In this context, the target culture is the culture that the language being learned comes from, and the native culture is the culture of the learners. According to Xu (2013), Chinese culture and world cultures are in ideal coexistence in the senior secondary ELT textbooks used nationwide. But Xiang and Yenika-Agbaw (2021) have found in a recent edition published by the same press that world cultures and multicultural variables are represented in an “unbalanced, stereotypical way and lack diversity.” Other studies on the textbooks of China (e.g., Lee & Li, 2020; Liu & Fang, 2022) also focused on similar contradictions. However, few discussions have examined the native cultural content in other modalities or analyzed the representation and distribution of various native cultural elements in textbooks to test these arguments or find explanations.
Additionally, observations of the imbalance mentioned above through content and discourse analysis are mostly based on critical theory, and take native culture as monolithic. Admiration of the West, native-speakerism, and a static perspective on culture in language textbooks have been discussed as a major concern (Joo et al., 2020; Liu et al., 2022; Sun & Kwon, 2020). Even critical discourse analysis of the “official knowledge” in textbooks from different publishers (Wu, 2021) emphasizes only political and economic topics. This research trend may have been created by conceptualizing a country’s culture as highly homogeneous (Canale, 2016; Herman, 2007). Such a perspective is compatible with multimodal research methods (e.g., Joo et al., 2020) but incompatible with further investigations into the structure of native cultural content.
Textbooks are among the core artifacts in language classrooms. Most of the previous investigations of the interaction between English textbooks and their users, mainly teachers and students, have been carried out as book evaluations (Mutiah & Albiansyah, 2021; Muzakky & Albiansyah, 2021; Orfan et al., 2021), stressing content related to linguistic knowledge and competence. Users’ perception of cultural content has yet to be examined. Since the focus of this study is multimodal native cultural content per se, the following sections will not discuss teachers’ and students’ perceptions.
A Perspective of Multimodality
Foreign language textbooks do not communicate only in plain text. Where visual communication is involved, researchers may pay attention to the available semiotic resources, their use, and ways to use them (Machin & Mayr, 2012). In paper-based textbooks, these resources or signs are text and images. Chapelle (2016) created several terms to describe the five essential roles of pictures in foreign language textbooks: task essential, text enhancing, generally orienting, theme building, and independent. Still, images alone can also create spatial and political dimensions, and the use of visual elements equates to the communication of meaning (Stocchetti, 2011). Therefore, textbook studies should also stress multimodal content about culture and notice the influence and complexity of visual elements.
Analysis of native and target cultural content through the multimodal lens has produced interpretations from a critical perspective. Multimodal content analysis has been applied to research on textbooks for the study of English adopted in countries like Germany, Indonesia, and South Korea (Joo et al., 2020; Motschenbacher, 2019; Setyono & Widodo, 2019). These studies began to pay attention to the visual representation of characters who shared the same cultural background with the learners as well as the political implications of the images. The previous researchers also maintained that indigenous cultural characters should play a critical role in language textbooks to provide cognitive and affective support. Studies on Poland’s English textbooks discussed local and world cultural content and analyzed the positive effect of multimodal elements in different contexts (Stec, 2017, 2019). These explorations were centered on the form, influence, and position of multimodal content that represented learners’ native culture. However, they left the multimodal complexity of native cultural content undiscussed.
Multimodal native cultural content has been used in a variety of ways in different studies. Weninger’s (2021) method of analyzing foreign-language textbooks was to study how multimodal content “encodes and communicates ideas about the world.” Thus, some critical studies sought to illuminate the ideological meaning of the multimodal native and target culture’s elements or their roles in the context of a specific social culture (Joo et al., 2020; Setyono & Widodo, 2019; Stec, 2019). Although these studies foreground native cultural content, they oversimplify its composition and therefore do not offer practical suggestions for selecting and systematically arranging multimodal content in textbooks. Cremona and Arnaouti (2019) analyzed selected foreign language textbooks used in Malta and Greece with an original framework based on Byram’s (1993) theory. Despite this framework, they relegated native culture to the background when looking into the structure of multimodal content and suggested compensating for the inadequate representation of target culture with video materials. Therefore, the native cultural content in EFL textbooks must be examined through the multimodal perspective using a systematized analytical framework.
Rationale for the Chosen Conceptual Maps Within the Context of Comparative Analysis
Some recent and prominent studies have addressed the cultural content in China’s EFL textbooks. Soon, the influence of the country’s current policies on native culture will be reflected in language textbooks. As institutionally sanctioned artifacts for formal education, language textbooks are carriers of “truth” (Weninger & Kiss, 2015). They facilitate the learners’ development of intercultural awareness by exploring the diversity and complexity of culture (Baker, 2012). Thus, existing cultural knowledge should be enhanced, especially the multimodal native cultural content ingrained in language textbooks. Then, native cultural content can be examined in an inclusive analytical framework. Meanwhile, performing a comparative study between the textbooks of China and a counterpart (Mongolia, in this case) will help the researcher find more details and avoid bias. Both the findings and the tool can contribute to the stakeholders’ improvement in educational policies and practices.
The conceptual maps in this study were chosen to conceptualize “culture” and operationalize the multimodal analysis of language textbooks. Halliday’s (1976) theory is used frequently in studies focused on the functions of multimodal elements or the interaction between visual and verbal content in EFL textbooks (e.g., Olatunji & Onipede, 2020). Other researchers developed analytical frameworks based on Byram’s (1993) theory to examine verbal or multimodal cultural representations in language textbooks using a set of criteria (Cremona, 2017; Wu, 2010). However, language learners need to acquire culture-specific and general knowledge, skills, and attitudes about culture (Paige et al., 1999), and multimodal elements should be culture-specific. The culture-specific aspect consists of “the products of civilization” (Big “C”) and “a particular group of people’s way of life” (Small “c”) (Brody, 2003; Lange & Paige, 2003; Paige et al., 1999), which can be embodied into specific categories for analysis. This cannot be achieved through the frameworks based on Halliday’s and Byram’s theories.
Some researchers have drawn on the “general-specific” model and expanded it for language textbooks’ cultural content analysis. Lee (2009) designed detailed categories and performed content analysis to evaluate cultural content in specific high-school EFL conversation textbooks used in South Korea. This evaluation demonstrated the textbooks’ lack of culture-general or Small “c” content, which was further confirmed when the expanded framework was applied to books from other countries (e.g., Raigón-Rodríguez, 2018). The present study aims to examine the complexity and composition of EFL textbooks’ multimodal native cultural content, which will be an investigation of culture-specific signs, both Big “C” and Small “c” included. For this study, Lee’s (2009) framework, which contains the culture-specific aspect, will be adopted after some categories are adapted or eliminated according to its flaws and the features of the multimodal elements.
A comparative analysis can help improve our understanding of the forces that shape education systems and their roles in social and economic development. Most extant investigations that use a multimodal perspective have focused on the textbooks used in only one country or region (e.g., Setyono & Widodo, 2019). By comparing the local case with a similar but unfamiliar one, researchers can find small but critical differences that might be neglected in studies of single cases (Phillips & Schweisfurth, 2014) and reduce bias. Although Mongolia has struggled with the impact of Western culture and suffered economic hardship in the past decades, its educational history and 9-year basic education system (elementary and junior high school) are similar to China’s (Dillon, 2019; Li, 2018). This makes Mongolia an ideal candidate to compare with China. In these two countries, officially published EFL textbooks are the most widely used, and students start formal foreign language learning in junior high school (Li, 2018). Junior high school textbooks are more inclusive in social and cultural information than elementary school textbooks, which are often dominated by visual content. So, comparing China’s and Mongolia’s junior high school EFL textbooks through content analysis will help us understand the two more comprehensively and reflect on language teaching at different levels of schooling. Therefore, based upon the chosen conceptual maps, the primary research questions guiding this study are as follows:
RQ1: How is multimodal native cultural content distributed in China and Mongolia’s junior high school EFL textbooks?
RQ2: What are the differences between the multimodal native cultural content of the two sets of EFL textbooks?
Methods and Data
We performed a content analysis to quantitatively examine multimodal native cultural content in China and Mongolia’s officially published EFL textbooks and thus answer our research questions. The main semiotic modes in textbooks are written text and printed visual images, which are most common in educational communication (Kress, 2003, 2010; Serafini & Reid, 2019). As written text is the dominant mode in printed textbooks, the multimodal exploration in this study paid more attention to visual images. To avoid potential biases, researchers with different cultural backgrounds collected and processed the data independently before further comparison and analysis.
One set of junior EFL textbooks from each country was selected. The set for Chinese students was approved by the Ministry of Education of China (in 2012–2013) and published by Shandong Education Press (“SEP textbooks”). It is a 4-year version of the most widely used 3-year textbook set published by the People’s Education Press. The set for Mongolian students was authorized for publication by the Ministry of Education, Culture, Science and Sport of Mongolia in 2015 (“MME textbooks”). Both sets, chosen due to being the most standard and widely used EFL textbooks in each country, are designed for use by 4-year public junior high school students (Grade 6–9) who are of the same age group and proficiency level (see Table 1).
Table 1. Textbooks Chosen for Analysis.
This study mainly investigated the multiple representations of the two countries’ native cultures through an adapted framework. Culture-specific Big “C” and Small “c” should be the focus of statistical and analytical practice. Lee (2009) included the culture-general aspect, as well as the Big “C” and Small “c” of the culture-specific context, in a framework for analyzing the conversation textbooks used in Korea. However, this framework should be simplified by keeping only the categories that may be related to visual elements, because the present study focuses on the static visual elements related to native culture, and these cannot express many of the abstract and procedural concepts that video and audio materials are rich in. Categories in the original framework that bore too broad or too narrow connotations, were unsuitable for analyzing a country of higher cultural diversity, or sat in an inappropriate hierarchy were adjusted in advance. For instance, informality was deleted because it can only be embodied in plain text, agriculture was replaced by traditional production mode to fit the analysis of cultures featuring husbandry and fishery, literature and arts were combined into one item, and food was transferred from dress/style/housing to (social) customs. The adapted framework for analyzing and comparing the two sets of English language textbooks is shown in Table 2.
Table 2. Categories of Cultural Elements in Foreign Language Textbooks.
The researchers marked and collected visual images related to native culture in the body part of the two sets of textbooks. The two sets of books include cartoon images and photos, which should be both included in the statistical process, according to Machin and Mayr (2012). All the pictures related to native culture were identified, with their position, content, and relevant text recorded. Then all these items were assigned to different categories in the adapted framework. After that, the assignment results were compared, with each inconsistent item discussed and assigned to a consensual category. Table 3 presents an excerpt from the item assignment, a part of the Grade 6 volume of the MME textbooks. The final assignments for each book were counted and further analyzed (see Tables 4 and 5). No outstanding ethical concerns exist as all data were collected from paper-based, officially published textbooks.
Table 3. An Excerpt of Item Assignment (Grade 6 Book of MME Textbooks).
Table 4. Categories of Pictures Related to Native Culture in SEP Textbooks.
Table 5. Categories of Pictures Related to Native Culture in MME Textbooks.
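The coding procedure described above (independent assignment, comparison, consensus on inconsistent items, and a final count) can be illustrated with a minimal Python sketch. It is not the authors' actual tooling: the item identifiers, the two coders' assignments, and the reduced category-to-aspect mapping are all hypothetical and serve only to show the bookkeeping involved.

```python
from collections import Counter

# Hypothetical subset of the adapted framework: category -> aspect.
ASPECT = {
    "customs": "Big C",
    "dress/style": "Big C",
    "education": "Big C",
    "traditional production mode": "Big C",
    "environment": "Small c",
    "time": "Small c",
}

# Hypothetical records: picture id -> category chosen by each coder.
coder_a = {"p1": "customs", "p2": "dress/style",
           "p3": "environment", "p4": "education"}
coder_b = {"p1": "customs", "p2": "dress/style",
           "p3": "traditional production mode", "p4": "education"}


def find_disagreements(a, b):
    """Return the ids of items the two coders categorized differently."""
    return sorted(item for item in a if a[item] != b[item])


def tally(final):
    """Count final assignments per category and per aspect."""
    by_category = Counter(final.values())
    by_aspect = Counter(ASPECT[c] for c in final.values())
    return by_category, by_aspect


disputed = find_disagreements(coder_a, coder_b)

# Consensus is reached by discussion; the outcome is entered by hand
# (hypothetical decision for the one disputed item).
consensus = {"p3": "environment"}
final = {**coder_a, **consensus}

by_category, by_aspect = tally(final)
print(disputed)
print(dict(by_aspect))
```

Keeping the disputed items in a separate list mirrors the step in which inconsistent assignments are discussed before counting, so the final tallies reflect only consensual categories.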
Findings
The statistics and analysis of the two sets of books reveal the features in their multimodal representation of native culture and show significant similarities and differences. Both sets of books are multimodally richer in Big “C” than Small “c” content. They also have different drawbacks, as elaborated on in the discussion section.
Pervasive Imbalance Between Big “C” and Small “c”
The SEP textbooks cover most of the categories of the pictures related to native culture (see Table 4). The most frequent category in Big “C” is customs, which appears 13 times in the 7 books. Customs pictures include festival activities like the Dragon Boat Race, Spring Festival, Lantern Festival, and Kite Festival in Weifang, the use of chopsticks, traditional food and drink such as zongzi, rice noodles, mooncakes, and tea, as well as the cartoon image of a Chinese restaurant. Additionally, arts/literature appears 10 times, mainly depicted in cartoons about Chinese historical stories and myths in Book 7. Moreover, metropolitan/transportation appears six times, with the urban landscapes of Beijing, Shanghai, Hong Kong, and Taipei, including the Forbidden City, a typical traditional building, and the Oriental Pearl Tower, a typical modern building. The dress/style images are primarily photos of students, while a group of photos featuring modern Chinese females (see Figure 1a) shows Chinese people’s dress and spiritual temperament. These diverse visual elements represent Chinese culture, overcoming the mindset of exclusively focusing on the target culture.

Figure 1. (a) Chinese characters in SEP textbook 6A and (b) Chinese and international students in one photo in SEP textbook 9.
The environment category has the highest frequency (n = 4) among the Small “c” visual elements. Three of these are photos of giant pandas, the symbol of China, and the fourth is a photo of Chinese students crossing a river by ropeway on their way to school. The single photo in the fairness/equality category (see Figure 1b) shows a group of students talking about cooking. It includes both Chinese and international students and both female and male students, thereby conveying rich semiotic information. The structure of this photo is markedly different from those of the pictures mentioned previously. From the perspective of visual elements, this is the opposite of the trend in South Korean textbooks criticized by Joo et al. (2020). Indeed, the richness of the native Big “C” and Small “c” categories, as embodied in visual images in the SEP textbooks, partly counters the findings of existing studies (e.g., Xiang & Yenika-Agbaw, 2021).
Table 5 illustrates the picture categories representing native cultural content in the MME textbooks. The category with the highest frequency in Big “C” is education. Most of the photos are not directly connected to the text near them, but rather show scenes of speaking practice and other activities in the classroom. These pictures play one of the five roles of pictures discussed in Chapelle (2016), namely the generally orienting role. However, as Stocchetti (2011) points out, these visual elements also construct meaning within a particular context. The category ranking second in Big “C” is dress/style, including traditional Mongolian costumes (see the traditional hat in Figure 2a), though there are fewer such pictures than there are images depicting modern dress worn by students (n = 3 vs. n = 19). Customs pictures (n = 5) include photos of traditional Mongolian food. As noted above, the category of “agriculture” in Lee’s (2009) framework has been replaced by traditional production mode, which contains pictures of “sheep herding” and the like. Ethnic groups, science, and family are scarce, while business/market and metropolitan/transportation are totally absent. The ethnic groups picture is a cartoon of a Kazakh boy (from Mongolia) in Unit 5 of the Grade 6 book (Figure 2b). Images like this have always been highlighted in multimodal critical analyses (e.g., Joo et al., 2020; Machin & Mayr, 2012).

Figure 2. (a) Traditional Mongolian costume in a photo in MME textbooks and (b) A Kazakh boy in MME textbooks.
Small “c” categories found in the MME textbooks include environment, time, competition, and rule-oriented (see Table 5). The photo of the National Park, the name of which is introduced in the text, embodies people’s attitude toward the environment, echoing with photos of horseback riding and sheep herding. The time picture is a cartoon depicting the daily life of a Mongolian child, while in the rule-oriented cartoon, a child is showing respect to the elderly, an expression similar to that which Canh (2018) explores in the study of moral education as a part of native culture. Some Small “c” categories overlap with the Big “C” ones. For example, in the traditional production mode and customs pictures, horseback riding and sheep herding also reflect the relationship between humans and nature.
In interpreting the first research question, we found that the two sets of textbooks cover a considerable number of categories of multimodal native cultural content. The imbalance between Big “C” and Small “c” is obvious in each set of books, whereas the distribution of the categories shows the designers’ different emphases. Such imbalance and distribution are also unsatisfactory from the outlook of cultural education. Comparing the data about the SEP and MME books will reveal more details.
Absence of “Ethnic Groups” in SEP and Modern Elements in MME
The two sets of textbooks share some features in terms of Big “C,” but these are outnumbered by the pervasive differences. Both sets of books include the customs, dress/style, arts/literature, education, and sports/leisure categories. Neither covers the Big “C” category business/market. This could be a coincidence, as no relevant discussion has been found in previous literature. The SEP books have neither images of ethnic groups, an absence consistent with what Herman (2007) describes as homogenized representations of a culture, nor politics-related native cultural elements. In contrast, these two categories are embodied in the MME textbooks through the cartoon images of the Kazakh child and the national flag of Mongolia. However, the MME books, lacking the standards of being “the carriers of truth” (Weninger & Kiss, 2015), do not include the metropolitan/transportation category, which is represented in the SEP textbooks by pictures of traditional and modern buildings in China.
The composition of concrete elements of the same Big “C” category varies significantly across the two sets. For instance, all the dress/style pictures in the SEP books present modern clothing, while traditional dress and modern clothing both appear in the MME books. Also, the sports/leisure activities in the SEP books are primarily popular modern pastimes (e.g., listening to music or playing basketball), but Mongolian chess, a typical traditional activity, embodies the same concept in the MME pictures. The “traditional production mode” pictures appear much more frequently in the MME textbooks (n = 4) than in their Chinese counterparts (n = 1). This provides the Mongolian book designers with more opportunities to display various representations of their traditions.
The two sets of textbooks contain a limited number of Small “c” elements but with different emphases. While the SEP books have pictures representing the fairness/equality, environment, competition, nurture, and self-improvement native cultural elements, the MME books have only time, environment, competition, and rule-oriented. Science has only one native-culture-related picture in each set of textbooks, whereas “novelty-oriented” appears only once, in one of the SEP textbooks. In all four MME textbooks, no pictures depict local and foreign people simultaneously. Neither of the two sets features pictures representing confrontation, which also reflects these countries’ ideological decisions (Curdt-Christiansen & Weninger, 2015).
In interpreting the second question, we have noticed that the two sets of textbooks have different inclinations to display distinct native culture when using Big “C” and Small “c” multimodal elements. The SEP multimodal elements stress customs and arts/literature more in Big “C” and environment more in Small “c,” but the MME textbooks seem to pay more attention to dress/style and education. Even in the same categories, like dress/style, the two sets of books represent images with totally different preferences and distributions. Despite the scarcity of Small “c” categories in both SEP and MME books, the listed categories exhibit different priorities, and as Weninger and Kiss (2015) argued, bear and communicate distinct information.
Discussion
To explore the multimodal native cultural content in the chosen textbooks, we adapted the existing framework Lee (2009) applied to the analysis of conversation coursebooks. We structurally analyzed the multimodal native cultural elements using this new framework, revealing the hidden facts that the previous “generalized” studies disagreed on. Comparing the two sets of textbooks has provided more details about what was highlighted or missed in the arrangement of the cultural content in these books.
The present study finds that the chosen Chinese textbooks covered diverse multimodal culture-specific concepts. This indicates two general facts. First, the criticism of the dominance of the target culture in the language textbooks of Expanding Circle countries (e.g., Xiang & Yenika-Agbaw, 2021) has not sufficiently covered all the content or variables related to ICC development. Ethnicity, sexuality, and social class were repeatedly mentioned as fundamental concepts of critical theory. However, our analysis shows that the textbooks might have included diverse native cultural elements, even in a multimodal form, which signifies an improvement in breaking free of the stereotype of the previous studies (e.g., Y. Liu et al., 2022). Second, the studies that found an ideal distribution of multicultural and multimodal content in Chinese ELT materials (e.g., Xu, 2013) failed to examine the deficiency in native cultural information in terms of both Big “C” and Small “c” in the chosen textbooks. It is open to question whether a textbook contains appropriate multicultural and multimodal content when the specific native and nonnative culture categories have not yet been analyzed. Comparing the two sets of textbooks has foregrounded their merits and demerits and made them explainable.
Comparing the Big “C” categories of multimodal native cultural content in the chosen books has highlighted what Chinese compilers should aim for in designing EFL textbooks. Big “C” is a direct representation of the concept of culture (Brody, 2003). In the SEP books, pictures with traditional and modern elements embody the metropolitan/transportation and customs categories, which reflect China’s economic and social development and measures taken to maintain its traditional culture. The past decades have witnessed rapid economic growth in Mongolia, although the scale of the cities and population of this country has not changed significantly (Dillon, 2019). The lack of metropolitan/transportation pictures in MME books partly reflects the current reality of this landlocked country, something also reflected in the textbooks’ shortage of business/market images. Arrangements like this put learners in Mongolia at risk of forming incomplete native cultural beliefs and values (Curdt-Christiansen & Weninger, 2015). On the other hand, the SEP books have shown what cities across the country look like by deploying multimodal content to present the genuine Big “C” of China. As the number of elements embodying this category is still small, the compilers could provide richer information for fostering learners’ ICC.
Comparing the Small “c” categories of multimodal native cultural content in SEP and MME, we have discovered the challenging tasks that all EFL compilers face. The number of Small “c” categories that appear in the two sets of books is modest, a finding that echoes existing studies (Lee, 2009; Raigón-Rodríguez, 2018). The comparison in the present study suggests two plausible causes: the Small “c” categories are difficult to convey graphically, or the designers of curricula and textbooks might regard a country’s culture as highly homogeneous (Canale, 2016; Herman, 2007). As a result, even if the designers have realized the complexity and diversity of their native culture, their design of multimodal cultural content still lacks systematicity. This could naturally lead to incomplete coverage of culture-specific categories, especially the possibly inadvertent neglect of Small “c” constructs in the two sets of chosen books.
Emphasizing some specific multimodal categories causes imbalance, which may pose a risk to language education and learners’ ICC development in the two countries. The SEP textbooks have integrated images of traditional customs and modern cities into the multimodal representation of authentic native culture, that is, the diverse and complex culture described by Baker (2012). The SEP compilers also seem to have started considering education about (inter)cultural practices from the perspective of multimodality, as more Small “c” categories have become represented. However, the scarcity of pictures in some categories, such as the absence of traditional production modes and ethnic groups in SEP books, does not align with social reality or the policies’ emphasis on cultural inheritance. Mainland China has enacted several new policies covering textbooks in three subjects, aiming to uphold students’ cultural identity, but EFL textbooks have not yet been covered. Even considering the visual elements alone, independent of the text (Chapelle, 2016), the learners can still make meaning to create a complete native cultural space with comprehensive cognitive dimensions (Stocchetti, 2011). The SEP compilers can refer to the MME textbooks, which present learners with a diverse native culture and a multicultural world to develop their cultural awareness. In comparison, the dominance of traditional elements puts the MME books at risk of being trapped in a native culture stereotype that may force the learners in both real and virtual environments of multimodality to struggle with ideological confusion (Canagarajah, 1993).
The analysis and comparisons in this study have also revealed the efforts the Expanding Circle countries have made in designing EFL textbooks. Language textbooks are highly wrought cultural artifacts with a preference for cultural values and ideologies (Curdt-Christiansen & Weninger, 2015), which may motivate compilers to stress certain visual elements in textbooks. Mongolia has been culturally influenced by Western countries (Dillon, 2019), but it is deliberately trying to shake off the stereotypes and worship of the West that Joo et al. (2020) criticized in South Korean coursebooks. The MME textbooks paid great attention to the inheritance and communication of their traditional culture, which is distinctive in the modern world. The multimodal content about traditional production modes, customs, and traditional dress/style demonstrates the textbook compilers’ intentions. The SEP books also acknowledged the importance of customs as well as traditional arts/literature, displaying a similar and unambiguous motivation, although content about target cultures might hold a quantitative lead.
Concluding Remarks
This study has analyzed and compared multimodal native cultural content in two sets of typical junior high school EFL textbooks used respectively in China and Mongolia. It has combined multimodality and content analysis and adapted a framework for analyzing the cultural ingredients in foreign language textbooks. The results show significant differences between the two sets of books, despite some similarities. As the comparison shows, possible causes of these differences include the status of socioeconomic development, cultural visibility, the designers’ intentional or unintentional positioning of the textbooks, and the varying tacit understanding of the diversity and complexity of culture.
As long as little or no attention is paid to the multimodal representation of culture, language textbooks cannot provide a rich inventory through which learners interact with native and target cultures. An imbalanced representation, furthermore, may lead the learners to a stereotypical or overgeneralized understanding of their native culture (Baker, 2012) or even cause the false cognition that Bandura (2003) and Foroutan (2012) have described. Inner ideological conflicts would make it impossible to achieve the “intercultural awareness development” mentioned in the policies, even though diverse target cultural content in various modes is involved in the textbooks. Therefore, in the phase of multimodal content design, for the sake of users’ ICC development, native and world culture distribution, native cultural Big “C” and Small “c” category coverage, and their relative proportions should be considered. Commercial publishers can also refer to these principles and the findings of this study when designing language textbooks, especially those customized for different countries or regions.
Multimodal native cultural content should also be stressed in classroom teaching. When preparing for a course, teachers may notice general native values or the relationship between native culture and the targeted world culture (Sultan, 2018; Thumvichit, 2018). However, it is challenging for teachers to analyze and calculate all the native cultural elements. Teachers in China and Mongolia using the two sets of books may prepare visual and verbal materials for their instruction according to the findings of this study. Most of the existing studies have also found sufficient target culture content in language textbooks. Therefore, teachers in other countries or regions should be encouraged, referring to this study’s analytical framework, to check their multimodal materials and identify what multimodal content of native culture is missing. Supplemental materials can be introduced into a class in the form of text, images, or even video clips (Cremona & Arnaouti, 2019). An emphasis on the Small “c” categories is also necessary, as the selected textbooks have shown a tendency to neglect multimodal elements in these categories.
Because the focus of this study is on the culture-specific native cultural content, it does not fully consider the target or regional cultural content, which represents a more expansive social space. Future studies could analyze “world cultural content” within an integrated framework involving both culture-general and culture-specific categories. This may contribute to a comprehensive understanding of multimodal cultural content distribution in EFL books and a complete evaluation tool. Textbook compilers, publishers, and teachers would also be able to adjust their practice accordingly to develop learners’ intercultural awareness and communication competence to make up for existing shortfalls.
Multimodal content plays a significant role in today’s EFL classrooms, creating spatial and political dimensions and communicating rich meanings (Chapelle, 2016; Stocchetti, 2011). As the most critical stakeholders who use language textbooks nowadays, teachers and learners also use all kinds of available semiotic resources whose roles, usage, and effects on users need to be further understood. However, existing studies of teachers’ and students’ perceptions of textbooks mainly focus on linguistic competence and knowledge by referring to book evaluation theories (e.g., Mutiah & Albiansyah, 2021). Therefore, future studies should investigate diverse stakeholders’ perceptions of and reactions to language textbooks’ systematized multimodal native, target, or world cultural content.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This paper is supported by Innovative Research Team of Shanghai International Studies University (Project NO: 2020114052).
