Abstract
What do the languages spoken by migrants reveal about the social forces that shape international migration flows? This visualization of languages spoken by unaccompanied alien children (UACs) detained in the United States between 2014 and 2016 demonstrates how linguistic diversity among UACs varies across language families, countries of origin, and world regions. The relationship between the languages represented in this dataset and UACs’ countries of origin reveals patterns that are invisible when focusing solely on country of origin, including migration patterns of ethnic and racial minorities, colonial legacies evident in the languages spoken, and prior international migration flows of groups before they reached the United States. Thus, this visualization warrants greater attention to migrant language as a key demographic indicator for understanding international migration dynamics.
When examining the demographics of international migration flows, sociologists have traditionally focused on migrants’ countries of origin, legal status, gender, and age to observe the social forces shaping migration patterns. Language has been a less frequently explored factor for explaining who migrates from one country to another, with the limited studies primarily examining the significance of language after migration. However, scholarship shows that language can reveal critical information for understanding the migration process. Scholars have shown, for example, that the presence and distribution of languages in a country often reflect that country’s political and economic development, such as colonialism, industrialization, and globalization (Massey et al. 1993), as well as ethnic and racial identity (Flores, Loría, and Casas 2023).
The analysis represented in the visualization (Figure 1) links these bodies of research and begins to explore how the languages migrants speak can further our understanding of international migration flows. Specifically, this work examines how demographic data on languages spoken by unaccompanied alien children (UACs) can reveal key patterns relevant to the broader processes of international migration, namely, how language intersects with the political, social, and cultural forces that drive current international migration flows. It focuses specifically on children as a group whose vulnerabilities and migration experiences are often at the center of efforts to explain their migration to the United States. The goal of this analysis is to demonstrate that language should be analyzed independently from and in addition to the conventional factors commonly believed to influence international migration because language can reveal social forces at play that are not captured by those other factors.

This figure is a Sankey chart displaying the languages spoken by unaccompanied alien children (UACs), also known as unaccompanied minor migrants, organized by national origin, world region, and language family code (International Organization for Standardization 639-5) from October 2014 to December 2016. The data were restricted to show only languages that represent more than 0.1 percent of the total number for each country. See Appendix Table A.1 for complete data, including the total number of minors per country by language spoken. Although the sample includes 114,858 minors, the total count of languages spoken is 128,249, reflecting that many minors are multilingual. The thickness of links between the nodes represents the number of minor migrants who speak each language. The figure displays 58 unique languages across all continents except Oceania (86 languages appear in the full unfiltered dataset). The most common spoken language is Spanish, which comes mainly from three Central American countries: Guatemala, El Salvador, and Honduras.
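The per-country filter described in the caption can be illustrated with a minimal sketch. The country names, language names, and counts below are hypothetical placeholders, not values from the dataset, and the sketch assumes that the 0.1 percent threshold is taken against the sum of language reports within each country; the source does not specify whether the denominator is language reports or minors.

```python
from collections import defaultdict

# Hypothetical (country, language, speakers) rows; NOT values from the dataset.
ROWS = [
    ("Country A", "Language 1", 9500),
    ("Country A", "Language 2", 495),
    ("Country A", "Language 3", 5),
    ("Country B", "Language 1", 40),
    ("Country B", "Language 4", 10),
]

# 0.1 percent, expressed as a fraction of a country's total.
THRESHOLD = 0.001


def filter_links(rows, threshold=THRESHOLD):
    """Keep rows whose share of their country's total exceeds the threshold.

    Assumption: the denominator is the sum of language reports per country.
    """
    totals = defaultdict(int)
    for country, _language, count in rows:
        totals[country] += count
    return [
        (country, language, count)
        for country, language, count in rows
        if count / totals[country] > threshold
    ]


kept = filter_links(ROWS)
```

In this toy input, the third row for Country A falls below the threshold and is dropped, while every row for Country B is retained. Because the threshold is applied within each country, a language with few speakers overall can still appear in the figure when its country's total is small.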
The basis for this visualization is a unique dataset obtained from the U.S. Citizenship and Immigration Services that captures the languages spoken by UACs apprehended in the United States between October 2014 and December 2016. The selected period covers the time in which the United States experienced its first two major surges in the number of UACs crossing the southwestern border with Mexico. The first surge occurred in mid-2014. In response, the United States and Mexico increased their detention efforts, and arrivals decreased. Arrivals then rebounded, and a second, even larger surge of UAC arrivals occurred in mid-2016 (Escamilla García 2022). The U.S. government estimates that during this period, it apprehended more than 168,000 UACs, more than 82 percent of whom hailed from four countries: Guatemala, Honduras, El Salvador, and Mexico (Kandel 2024). Given these surges, scholars and policymakers began to focus their research on UACs. Their work has largely concentrated on the conditions driving the departure of UACs from their countries of origin and their incorporation in the United States (Castañeda and Jenks 2024; Escamilla García 2020; Galli 2023; Heidbrink 2014). However, scholarship on the intersection of UAC migration and language is scarce. It has focused on the relationship between language and integration into U.S. institutions, especially schools, and has concentrated largely on Latin American populations (Canizales 2025; Heidbrink 2020). The broader demographics of the UAC population, particularly the different languages spoken within it, also known as the linguistic diversity of that population (Nichols 1992), remain unexamined.
Turning to the analysis and the visualization, Figure 1 shows the languages spoken by these UACs in relation to their language family, region, and country of origin. For purposes of clarity, the visualization shows only languages that correspond to more than 0.1 percent of the total number of speakers per country (see Appendix Table A.1 for complete data). The visualization reveals patterns that are often invisible when looking at other demographic factors. At its most descriptive level, Figure 1 shows that UACs’ linguistic diversity surpasses national diversity: 63 countries of origin correspond to 58 different languages and 19 language families, distributed across all the continents of the world except Oceania. The country with the most linguistic diversity in the sample is Guatemala, with Guatemalan UACs speaking 30 distinct languages, including Spanish; all 22 officially recognized Mayan languages in Guatemala; and Garifuna, an Afro-Caribbean language spoken primarily in Honduras (see Appendix Table A.1). This linguistic diversity underscores the geographic breadth of migration flows from Guatemala, where approximately 40 percent of the country’s population speaks a language from the Mayan family (Instituto Nacional de Estadística–Guatemala 2023). Scholars have documented the migration of Mayan language speakers to the United States through individual cases (Hagan 1994) but never on this demographic level. Moreover, the presence of Kohatanek in the data highlights how linguistic self-identification carries social and cultural meaning. Guatemala does not recognize Kohatanek as a distinct language, although its speakers have advocated for its recognition. Such self-identification shapes senses of belonging for migrant communities that are missed when considering only national origin (HCR-Guatemala 2003, 2019).
Furthermore, UACs from Mexico, Honduras, and El Salvador also reported speaking Mayan languages typically spoken in Guatemala, such as Q’anjob’al and K’iche’, potentially reflecting the regional mobility of indigenous populations within Central America prior to their onward migration to the United States. This pattern illustrates the depth of information about international migrants that becomes available when language spoken is considered alongside other typical demographic factors.
Beyond the descriptive level, the visualization also illuminates the global imprint of colonialism and its enduring effect on international migration flows. The visualization makes clear, for example, that in absolute numbers, Spanish, English, and French dominate the dataset across regions of origin, with 80 percent of all UACs speaking one of these three languages. The distribution of these three languages across countries of origin further reflects the colonial legacies of each region, with Spanish predominating in Latin America and French and English in Africa, Asia, and the Caribbean.
Finally, the visualization reveals contemporary patterns of internal and regional migration before UACs ever reached the United States. For example, among the relatively small population of UACs originating from European countries, Romanian and Romani were the most common languages regardless of country, reflecting both migration from Romania and the presence of Roma minorities across other European countries such as Poland, France, Italy, and the United Kingdom (Meseşan-Schmitz et al. 2025). Other regions of the world reflected similar patterns. Some Angolan and Guinean UACs reported speaking Spanish, which may reflect the increasing presence of African diasporas in Latin America or Spain before onward migration to the United States. And Haitian UACs also reported speaking Portuguese, which corresponds with the recent Haitian diaspora to Brazil over the past decade (París Pombo 2018; Yates and Bolter 2021). These linguistic patterns illustrate long-standing regional migration flows that remain invisible when focusing exclusively on country of origin.
Taken together, the spectrum of languages spoken by UACs from 2014 to 2016 and their distribution by country of origin reveal migration dynamics that would be overlooked by relying only on traditional demographic indicators. Although we know that poverty, violence, and family reunification are the main factors that drive UACs to the United States, examining the linguistic diversity of this group through this visualization shows how UAC migration is also embedded in regional forces, such as colonial legacies and interregional migration flows. At the same time, the visualization reveals racial and ethnic minorities moving to the United States who are otherwise unseen. This finding expands our understanding of the long processes that drive the international migration of UACs and offers a glimpse into future processes of UAC integration in the United States, such as the interaction of language and the education system. Future scholarship should continue to engage with multiple dimensions of human mobility, including languages spoken, to fully capture these multifaceted dynamics.
Supplemental Material
Supplemental material, sj-pdf-1-srd-10.1177_23780231251409235 for Language and International Migration: Visualizing the Linguistic Patterns among Unaccompanied Alien Children Detained in the United States (2014–2016) by Ángel A. Escamilla García in Socius
Supplemental material for this article is available online.
