Abstract
Cognates are orthographically, semantically, and syntactically identical (or similar) words in two languages. The English and Spanish languages share more than 20,000 cognates, and many are essential academic vocabulary. Research has shown that cognates facilitate vocabulary acquisition and reading comprehension for language learners (when compared to non-cognate words). In Experiment 1, orthographic transparency ratings for 440 English-Spanish cognate nouns drawn from the Paivio et al. imagery norms were collected from 41 college students. Our participants were presented with lists of English-Spanish cognate word pairs presented side-by-side and were asked to rate the orthographic similarity of the pairs on a Likert scale of 1 to 7. The analysis of the ratings suggests that the earlier an English word deviates from its Spanish equivalent (its “point of differentiation”), the lower the cognate transparency rating it is assigned (extending the generalizability of the “initial letter effect” previously reported). In Experiment 2, we validated these ratings by having 43 new participants quickly judge whether English-Spanish word pairs were or were not cognates. We found that reaction times were strongly correlated with transparency ratings and the points of differentiation, supporting the usefulness of the transparency ratings obtained in Experiment 1. A limitation of Experiment 1 was that the cognate pairs varied only by page number, and the individual cognate pairs were not ordered differently. Additionally, we recommend a larger participant sample to include persons other than college students.
Plain language summary
Why is this Topic Important? Cognates are words that are spelled the same or similarly, mean the same or similarly, and have the same grammatical part of speech in two languages. The English and Spanish languages share more than 20,000 cognates, and many are essential academic vocabulary. Research has shown that cognate words help second language learners acquire vocabulary and comprehend what they read (when compared to non-cognate words). Experiment 1. What did the researchers do? 440 English-Spanish cognate nouns from the Paivio et al. imagery word types were rated for how similarly the cognate word pairs are spelled. There were 41 college students who rated the cognate pairs. The college students were presented with lists of English-Spanish cognate word pairs presented side-by-side and were asked to rate from 1 to 7 their spelling similarity. What did the Researchers find? The results suggest that the earlier an English word differs from its Spanish word pair (its “point of differentiation”), the lower the rating the participants assigned it (this extends it generalizability of the “initial letter effect” reported previously). Experiment 2. What did the researchers do? We had 43 new participants quickly judge whether word pairs were or were not cognates. Our goal was to compare the results of ratings in Experiment 1 with how quickly these college students noted their spelling similarities. What did the Researchers find? We found that reaction times were strongly connected to the spelling similarity ratings. The position of the letters at which they differed, supported the usefulness of the spelling ratings found in Experiment 1. What were the limitations? In Experiment 1 the cognate pairs varied only by page number, and the individual cognate pairs were not in different orders. Additionally, we recommend a larger participant sample to include persons other than college students.
Introduction
English-Spanish cognates are words that are orthographically, semantically, and syntactically similar in both of these languages as a result of their common etymology (Baker & Wright, 2021). They are interlingual words with ancestral roots in appearance and meaning (García & Godina, 2017). For example, the English word, “product,” and its Spanish equivalent, “producto,” are cognates. There are more than 20,000 English-Spanish cognates (Nash, 1997). In addition to their large numbers, cognates may be found at all levels of English word frequency (e.g., Davies, 2008) and Spanish word frequency (e.g., Davies, 2016).
Cognates are an important language population in the field of education. For decades, linguists (e.g., Corson, 1997; Lado, 1957) have suggested that cognates be taught to English learners due to their orthographic transparency (similarity) and because of their ubiquity. Their resemblance to words in the second language makes cognates among the easiest vocabulary items for language learners to acquire (A. M. B. De Groot & Keijzer, 2000; Mallikarjun et al., 2017) and they have been shown to improve bilingual English-Spanish students’ abilities to read, spell, and write (García et al., 2020).
The majority of English-Spanish cognates are academic vocabulary words (Hiebert & Lubliner, 2008). Roughly one-third to one-half of an educated person’s vocabulary is comprised of cognates (Holmes & Guerra Ramos, 1995). They constitute most of the subject headings in the Dewey Decimal System used in the classification of books in public and school libraries (J. A. Montelongo, 2012). Moreover, they are highly represented in picture books for children (J. A. Montelongo et al., 2018, 2023). Cognates are the words most often encountered in content area textbooks such as those used in science classes (Bravo et al., 2007) and are the highlighted words defined in their glossaries. Equally important, cognates are a conduit to the acquisition of other classes of vocabulary words and phrases, such as phrasal verbs (Martínez, 2018). Of the 570 words on the Coxhead (2000) Academic Word List, 465 (82%) are English-Spanish cognates (Hout et al., 2023), the majority of which are the vocabulary words that require explicit teacher instruction (Beck et al., 2013; J. A. Montelongo et al., 2015).
Cognate recognition refers to the ability of a language learner to recognize that a particular word in the learner’s second language (L2) is a cognate of a known word in the learner’s native or home language (L1). Cognate recognition, then, is a helpful skill for making meaning of text, especially in the case of unfamiliar words. When confronted with an unknown L2 word, language learners can use their background knowledge to make an educated guess about the meaning of the unfamiliar word (Jiménez, 1997). In empirical studies of cognate recognition (i.e., Kelley & Kohnert, 2012), a particularly robust finding is that English-Spanish cognates are recognized at a higher level than non-cognates.
English-Spanish cognates possess different levels of orthographic transparency, as measured by subjective ratings (J. A. Montelongo et al., 2010, in press). Some cognate equivalents share identical spellings, as is the case of the English word, “natural,” and the Spanish word, “natural.” Language learners are more readily able to recognize such identical pairs as equivalent given their physical sameness. Other English-Spanish cognates are not as transparent. The word, “sacred,” is spelled somewhat differently than its Spanish equivalent, “sagrado,” and the spellings of “loyal” and “leal” are even more disparate, despite their common etymology and semantic meaning. Such pairs of cognates do not facilitate the learning of English vocabulary as cognates that are the same or nearly the same (J. Montelongo, 2002).
Calibrated Materials for Language Research
Language researchers require calibrated materials to advance the study the effects of the various word characteristics on learning and memory. In the past, measures such as those of English word frequency (Thorndike & Lorge, 1944), word association (Postman & Keppel, 1970), and imagery (Paivio et al., 1968), stimulated much of the research that laid the foundation for today’s cognitive psychology.
In studies that have investigated the role of cognate transparency in languages that share many cognates, researchers have collected ratings of transparency to create their stimulus materials. In the research on the effects of Dutch-English cognates on learning and memory, for example, A. M. B. de Groot and Nas (1991) collected cognate transparency ratings for 280-word pairs prior to beginning their experiment. In order to study the correlation between transparency and translation, Friel and Kennison (2001) collected ratings for 563 English-German cognates using the A. M. B. de Groot and Nas (1991) ratings methodology. J. A. Montelongo et al. (2009, 2010) collected orthographic transparency ratings for more than 2,400 English-Spanish cognate nouns and adjectives drawn from the Julliand and Chang-Rodríguez’s (1964) Spanish frequency word count. These ratings provide a measure of cognate orthographic transparency together with a built-in measure of frequency in Spanish. More recently, J. A. Montelongo et al. (in press) have collected orthographic transparency ratings for over 4,500 English–Spanish cognates culled from over 3,000 award-winning picture books for use by cognate researchers and curriculum writers.
Currently, researchers are opting to use measures that don’t involve the time-consuming task of having to collect and tabulate cognate ratings of orthographic similarity. In a study of the effects of cognates on the processing of anaphors, for example, Lauro and Schwartz (2019) used the van Orden (1987) formula for calculating the orthographic overlap between every noun and its translation. The overlap represents the ratio between the word’s graphemic similarity with another word and its graphemic similarity with itself. According to the authors, the van Orden formula accounts for the number of letters in both words, whether first or last letters are shared, the number of adjacent letter pairs shared in both forward and backward directions, as well as the relative length of both words. An example of a van Orden rating for the English cognate word, “computer,” and its Spanish equivalent, “computadora,” is presented in Figure 1.

Example of the van Orden formula for the cognate pair “computer-computadora.”
As may be seen in Figure 1, the van Orden formula for the cognate pairs: computer-computadora yields middling orthographic transparency rating of 0.56. Since, the van Orden ratings range from “0 to 1,” the cognate pair, computer-computadora, belongs to the cognates in the mid-range of orthographic transparency.
Researchers have employed other objective rating methodologies. In a study of pedagogical priming for developing cognate awareness, Cenoz et al. (2022) used the normalized Levenshtein distance (NLD) which objectively measures the differences between two words or strings of letters. It is the “edit distance” measured by the minimum number of edits whether it be an inserted letter, a deleted letter, or substituted letter needed to change one word into the other. For example, the orthographic distance in the pairs “natural” in English and “natural” in Spanish is a “0” because no letters were inserted, deleted, or substituted; the distance between the pairs “problem” and “problema” is a “1” because one letter was inserted to change one word to the other.
Costa et al. (2023) have more recently proposed PHOR-in-One, a multilingual lexical database containing phonological and orthographic normalized Levenshtein distance (NLD) estimates for over 6,000 translation equivalents in English, Portuguese, German, and Spanish, for approximately 31,000 words. The authors further proposed a “phonographic NLD” which is a pooled index of orthographic and phonological similarity. Phor-in-One also includes “a comprehensive of its lexical entries, namely Part-of-Speech-dependent and independent frequency counts, number of letters and phonemes, and phonetic transcription” (p. 3699).
Recently, Hout et al. (2023) conceded that while these computational ratings techniques seem promising for the study of cognates, further studies comparing these formulaic ratings with phenomenological ratings need to be conducted. In their investigation, Hout et al. found that orthographic similarity ratings strongly correlated with reaction times, lending support to the usefulness of such ratings. The authors further suggested that computational approaches do not necessarily align with the subjective impressions of human raters, and thus a more person-centric approach to quantifying orthographic similarity should be viewed as a useful complement to more objective ratings.
Imagery Ratings
Translation studies have demonstrated that cognate orthographic transparency alone does not facilitate performance (A. M. B. De Groot & Keijzer, 2000). The studies by de Groot and her associates (i.e., A. M. de Groot, 1992; A. M. B. de Groot & Poot, 1997) suggested that imageability (imagery) and word frequency also exerts a powerful influence on translation performance.
For measuring the effects of imagery and orthographic transparency on learning and memory, researchers thus require ratings of both of these variables in addition to word frequency measures to conduct their studies. While word frequency counts in English (Davies, 2008) and Spanish (Davies, 2016) are available, collecting both imagery ratings and orthographic transparency ratings is an essential but arduous time-consuming process. To simplify the situation for a researcher desiring to conduct such a research project, the present authors resolved to collect orthographic transparency ratings for the nouns in the Paivio et al. (1968) study which have already been rated for imagery and whose frequency rating in English and Spanish can be readily ascertained.
A secondary purpose of this study was to test the generalizability of the initial letters effect reported in previous investigations (Hout et al., 2023; J. A. Montelongo et al., 2009, 2010). In those studies, it was found that the initial letters of the English- Spanish cognate pairs were important determinants of the orthographic transparency ratings for each pair. That is, the earlier an English cognate differed orthographically from its Spanish equivalent, the lower the transparency rating for that cognate pair. For instance, the ratings of noun cognate equivalents which differed in their first letters, such as “eagle” and “aguila,” averaged 2.7 rating (on a scale of 1–7). Contradistinctively, cognate equivalents having first letter deviations occurring later in the words, like “context” and “contexto,” averaged 5.8 rating. A Pearson correlation of .68 was obtained between the point at which the English–Spanish nouns began to differ orthographically and the transparency rating for the pair in that study.
Experiment 1 Method
Participants
All participants (n = 41) were recruited from undergraduate psychology classes at a California State University and were compensated with partial course credit awarded toward a course requirement. Those who consented to participate in the study were asked demographic backgrounds questions: gender, age, and first and second language or foreign language information. The undergraduates were adults ranging in ages 19 to 52, with the majority between the ages of 19 to 23. Of the 41 participants, nine spoke English as a second language, with Spanish as their native language. The remaining 29 participants had taken minimally one Spanish as a foreign language classes in either high school or in college. While all students were familiar with Spanish, this was not a necessary requirement since participants were rating the word pairs only for the orthographic similarity between the two words. They were not rating the word pairs for meaning. The data were aggregated and to ensure confidentiality individuals were not named.
Materials
The ratings instrument was developed by first identifying the English-Spanish cognate nouns in the Paivio et al. (1968) imagery norms. To ensure accuracy, all cognate words were cross-checked for semantic meaning and etymology (Coromines, 2012; Gómez de Siva, 1998; Real Academia Española, 2001a, 2001b; Roberts, 2014a, 2014b; Thomas et al., 2006). The ratings instrument consisted of a nine-page booklet made up of all of the identified 440 English-Spanish cognate pairs and 118 noncognate pairs. The noncognate pairs consisted of an English word and a synonymous Spanish word that was semantically equivalent, but orthographically different from the English word. The non-cognate pairs were randomly intermixed among the cognate pairs. Each of the nine pages was divided into two columns and there were 62 items on each page. Each item contained an English word and its Spanish equivalent. The English word was printed on the left side and its Spanish cognates occupied the space to the right side. The cognate pairs were separated by a black underlined space on which the participant could rate the orthographic transparency of the pair. The printed pages were shuffled and every participant received a different page order. This was done to reduce order effects.
To assess the degree of orthographic transparency, the procedures used the 7-point scale by A. M. B. de Groot and Nas (1991). Participants were asked to assign a score ranging from “1” (for those word pairs possessing low similarity between the English and Spanish words) to “7” for word pairs having identical orthographic similarity. The participants were asked to rate the English-Spanish cognates solely on the basis of orthographic similarity. That is, the ratings should reflect only an assessment of the word’s spelling.
Procedures
The participants who consented to be part of the study, were asked to meet for 1 hr in one of the larger classrooms in the library on a designated day and time. The participants were told that the purpose of the study was to rate the orthographic similarity of word pairs. Specifically, they were told that the study was about the orthographic similarity of English-Spanish cognates. The Experimenter provided the participants with the following definition of English-Spanish cognates:
English-Spanish cognates are words in English and Spanish that are spelled the same or nearly same, possess the same meaning, and come from the same language root.
The participants were further advised that the results of the study would be used by researchers conducting language experiments and by curriculum writers wishing to create a vocabulary curriculum.
Prior to the start of the ratings, the Experimenter provided a practice for the group as a whole. Five pairs of cognates and two pairs of noncognates were printed on a whiteboard. The pairs of words included a set of cognates ranging from “1” to “7” in orthographic transparency (from low similarity to identical). The participants were asked to rate the example words as a group one word at a time. None of the word pairs in the practice were drawn from the Paivio et. al norms. The participants were informed that there were no “right” or “wrong” ratings and that they were to complete the ratings individually without asking the experimenter or the other participants for help. After the practice items had been rated as a whole group, the experimenter passed out the booklets to all 41 participants containing the set of cognate pairs to be rated in one 50-min session. At the end of the session, all participants were thanked, debriefed, and given credit for participation.
Scoring
Each of the participant’s ratings was entered into a Microsoft Excel spreadsheet. Before the mean and standard deviations were calculated, the three highest and the three lowest scores for each cognate pair were eliminated in order to reduce the effect of outliers.
Results
The transparency ratings for each of the English-Spanish cognate pairs are available in the data file provided on our OSF site (https://osf.io/g9yaq/?view_only=c0ffc0110a764667bfcbb0d59f61867a). The mean orthographic transparency rating for the cognates was 4.8 (SD = 1.2). The mean orthographic transparency rating for the noncognates was 1.2 (SD = 0.4).
An analysis of the distribution of the cognate nouns revealed that 29 of the 440 (7%) English-Spanish cognate pairs were rated as identical or nearly identical. Typically, these were cognate pairs that were exactly identical or those for which the Spanish equivalent possessed an accent mark as in the pair, such as piston-pistón. The largest percentage (47%) of cognate pairs were those in the range between 5.01 and 6.00. These tended to be cognate pairs that were orthographically similar except for the final letters. Such pairs as “infant-infante” exemplify these cognate pairs which include an additional “a,”“e,” or “o” at the end of the Spanish equivalent. The cognate pairs in the 4.01 to 5.00 range were those that showed slightly more prominent orthographic dissimilarities at the ends of words and comprised the second highest percentage (23%) of instances. Examples of pairs in this category were “salute-saludo.” Cognate pairs in the 3.01 to 4.00 range possessed differences as early as in their second letters, as in the cognates “market-mercado.” These accounted for 15% of the cases. Cognate pairs in the 2.01 to 3.00 range tended to orthographically diverge early as in the pair, “beverage-brebaje,” and these accounted for 6% of the cognate pairs. Finally, the lowest ratings, 1.0 to 2.0, were those that were orthographically dissimilar from the initial letter as in “sugar-azúcar.” These accounted for 2% of the 440 pairs. An analysis of the distribution is presented in Table 1.
Distribution of Ratings for the English-Spanish Cognate Nouns.
Initial-Letter Effect
The distribution of the transparency ratings indicates that the earlier the cognate equivalents diverged orthographically from each other, the lower the rating tended to be. Raters attached more significance to differences occurring at the beginning of words than those occurring later. Many of the cognate pairs which contained letter changes in the initial positions of the words tended to have lower ratings than those Spanish cognates which had letter changes at the ends of words. For instance, the word “photograph” and its cognate equivalent, “fotografía” differ in their first letters. The cognate pair “painter” and “pintor” differ in their second letters, “yacht” and “yate” in the third letter, and so on. Correlations were calculated to examine the possibility that the letter position at which the first difference between an English word and its Spanish equivalent occurs makes a difference. The Pearson product moment correlation between the position at which the first difference occurs and the mean rating of the cognate equivalents was r (458) = +.7, (p < .0001).
A summary of these means and standard deviations is presented in Table 2. The table clearly shows that the ratings of those cognates which diverged from their cognate equivalents in the first three letters received the lowest ratings. Those with changes occurring at the fourth and fifth letter were at the mid-range (4.55 and 5.02), while those with later-occurring changes had the highest ratings. See Table 2.
The Letter Position at which English-Spanish Cognate Pairs Differ.
Discussion and Implications
The purpose of Experiment 1 was to provide a set of orthographic transparency ratings for English-Spanish cognate pairs taken from the Paivio et al. (1968) imagery norms. English-Spanish cognates with the initial fourth, fifth, sixth, and seventh letters that are identical received a moderate to higher rating—making them easier to recognize. This initial-letter effect was also observed in earlier studies of cognates drawn from the Julliand and Chang-Rodríguez (1964) norms (J. A. Montelongo et al., 2009, 2010).
The implications for language learners reading in their second language is that cognates that have identical letters at the beginning and in the middle of a word will more likely be easier to recognize. However, recognizing cognates is not an automatic reading strategy, but one that can be taught to help language learners improve their reading of words in their second language (Jiménez & Gámez, 1996; Mallikarjun et al., 2017; Martínez, 2018). Moreover, language researchers can thus use the resultant ratings reported here to develop experimental stimulus materials or use them to create educational materials without having to spend valuable time collecting their own phenomenological imagery and orthographic transparency ratings.
To provide further evidence that these ratings are valid, we performed a second experiment in which orthographic transparency was assessed differently. Specifically, we used the ratings obtained in Experiment 1 to predict the speed at which a pair of words are recognized to be cognates. If the ratings obtained in Experiment 1 are valid, then word pairs given higher transparency ratings should tend to be recognized as cognates more quickly (and vice versa).
Experiment 2 Method
Participants
All participants (n = 43) were recruited via the New Mexico State University’s research information and scheduling tool, SONA, and were compensated with partial course credit awarded toward a course requirement.
Materials
Apparatus
Data collection was performed using E-Prime vs3 software (Psychology Software Tools, Pittsburgh, PA), run on identical desktop PCs with Intel Core i5 processors. Stimuli were displayed on 24″ monitors, and responses were made using standard keyboards.
Stimuli
Cognate word pairs were identical to those of Experiment 1. Additionally, there were many non-cognate word pairs. All stimuli can be found in the Excel spreadsheet on our OSF page (see Experiment 1 Results for the link); there are two tabs for the cognate stimuli and non-cognate word pairs.
Demographic Questionnaire
Prior to the start of the experiment (but following informed consent), participants were asked several questions regarding background demographics. Each question allowed the participant to indicate “no response” if they felt uncomfortable answering the prompt. Most questions allowed the participant to enter their response by keyboard (and thus, participants could simply press the ENTER key to indicate a preference not to respond). For those questions with multiple-choice answer selections, we have indicated the options in parentheses.
The demographic questions were as follows:
What is your country of birth?
How many years have you lived in an English-speaking country?
How many years have you lived in a Spanish-speaking country?
How many years of schooling did you have in an English-speaking country?
How many years of schooling did you have in a Spanish-speaking country?
What word best describes your ability to write English? (1 = fluent; 2 = above average; 3 = average; 4 = below average; 5 = not fluent)
What answer best describes your ability to read English text? (1 = fluent; 2 = above average; 3 = average; 4 = below average; 5 = not fluent)
What word best describes your ability to write Spanish? (1 = fluent; 2 = above average; 3 = average; 4 = below average; 5 = not fluent)
What answer best describes your ability to read Spanish text? (1 = fluent; 2 = above average; 3 = average; 4 = below average; 5 = not fluent)
Indicate the phrase that describes your ability to read in both English and Spanish. (1 = English dominant; 2 = equally fluent in both English and Spanish; 3 = Spanish dominant)
Indicate the phrase the describes your ability to write in both English and Spanish. (1 = English dominant; 2 = equally fluent in both English and Spanish; 3 = Spanish dominant). See Table 3.
Demographic Questionnaire Date from Experiment 2.
Procedures
Participants viewed one pair of words at a time, and conducted as many trials as they could in two blocks of 10-min duration (separated by a short break). The computers collected millisecond accurate response times (RTs) and all responses were coded for accuracy. The procedure began by asking participants to fill out informed consent, after which the experiment began. The experiment started with the collection of demographic data. Then, participants were given the following instructions for the task.
The general instructions given to participants were as follows:
“You are being tested for how similar words are in English and Spanish with regard to the way they are spelled. During the experiment, you may be shown words that possess the same meaning but are not spelled similarly. For instance, perro/dog, sol/sun, and libro/book.
For the purposes of this study, a cognate can be defined as words in English and Spanish that are the same/similar orthographically (i.e., cognates normally share a common etymology too, usually Latin, Greek, and Arabic). Etymology is the origin of a word and the historical development of its meaning.
For example, the English word “experiment” and its Spanish cognate “experimento” are cognates because they are spelled similarly AND they mean the same thing. However, the English word “sugar” and its Spanish equivalent “azúcar” are cognates even though they are spelled very differently.
Cognate transparency refers to this relationship; specifically, how orthographically similar the pair of words is.
Specific instructions were then provided on how to complete the task, as follows: “In the following, you will be shown a pair of words, one in English and one in Spanish. Sometimes the words will be cognates and sometimes they will not. Your task is simple. If you think the word pair are cognates, press the “F” key to indicate that they are. If you think they are NOT cognates, press the “J” key.
It is important that you respond as quickly and as accurately as possible. Please keep your left and right index fingers on the “F” and “J” keys, respectively, so that you can respond easily.”
Each trial began with a prompt asking the participant to press any key when ready to begin the trial. This allowed them to take a break in between trials if they needed one. Upon initiation, a fixation cross—an enlarged “+” symbol centered in the computer screen—would appear to center the participant’s gaze, with two randomly selected word stimuli appearing directly above and below the fixation cross. In the bottom corners of the display were the response choices: “Cognate: Press F” in green text, and “Not a Cognate: Press J” in red text. The instructions text was provided as a reminder to participants of how to respond, but participants were asked to place their index fingers on the keys so that they could quickly and easily respond to the task once the response pairings were learned.
All trial pairs consisted of one Spanish and one English word; stimuli were selected at random (without replacement) in equal proportions from the cognate and non-cognate word lists (in order to avoid any response biases that might result in more swift selection of cognate/non). The pair of stimuli persisted on screen until a response was made. After responses were made, feedback was provided. For incorrect responses, participants were shown “Incorrect” in red text and given a reminder of what cognates are, as follows:
“Remember… cognates are words in English and Spanish that are the same/similar orthographically (i.e. how they are spelled) AND semantically (i.e., what they mean). For example: experiment/experimento and sugar/azúcar.”
For correct responses, participants were given the display “Correct!” in green text. Both types of feedback were displayed until the participant pressed any key, thus terminating the trial. Participants continued in this manner, working at their own pace, for a total of 50 min. Upon termination of the experiment, all participants were thanked, debriefed, given credit for participation in the SONA system, and released.
Results
We examined the time it took for participants to render their responses as a function of the word pair’s mean transparency rating (from Experiment 1), and the pair’s point of differentiation. We predicted that higher transparency ratings between the word pairs would result in shorter RTs, and that later points of differentiation would also result in shorter RTs. We examined RTs only for accurate responses; overall accuracy on this task was high: participants responded accurately to 87% of cognate pairs, and 88% of non-cognate pairs. The accuracy of responses to each pair of cognates is reported in our full data sheets, available at our OSF page.
Demographic Information
Table 3 presents the results of our demographics questionnaire. The survey results suggest that our sample is broadly representative of the multi-cultural (and largely bilingual) population in the Las Cruces, New Mexico and El Paso, Texas region. Furthermore, the demographic data suggest that we have a range of linguistic and cultural experiences represented in our sample participants. Participants ranged in age from 18 to 25, and gender identities included 13 “male,” 26 “female,” and 1 “other” response.
Data Cleaning
Prior to data analysis, the data were cleaned to remove participants whose performance indicated that they were likely not completing the task as instructed (or who were otherwise inattentive), and to remove outlier RTs on trials in which the participant may have had an attentional lapse or erroneously responded too quickly. One participant was removed for having response times that were more than 2.5 standard deviations longer than the group mean, and two participants were removed for having accuracy more than 2.5 standard deviations below the group mean. These participants were clearly not attending properly to the task or following instructions. This procedure is standard “data cleaning” in cognitive psychology literatures and resulted in removal of only 7% of participants, leaving a total sample of 40 participants.
Individual trial RTs (from the remaining set of participants, on trials with correct responses) were then cleaned to remove RTs less than 200 ms (which would indicate an erroneous early button-press) or more than 2.5 standard deviations beyond the group mean (i.e., those longer than 3,285 ms, which would indicate a clear attentional lapse). This resulted in the loss of only 7.77% of remaining RTs.
Because word pairs were randomly selected across trials, there were an unequal number of observations across pairs. Figure 2 presents a histogram of the number of observations (after data cleaning) obtained. As can be seen, each pair received as many as 37 (correct) observations. The mean number of observations was 22.20, indicating that most pairs received a large amount of data from which average RTs were computed; the majority of pairs received more than 10 observations each. Our full data sheet reports the number of correct responses to each cognate word pair.

Histogram of the number of observations received by word pairs following data cleaning.
Inferential Statistics
We next performed a pair of correlational analyses (on cleaned data) examining the relationship between mean transparency ratings (obtained in Experiment 1) and RT, and between point of differentiation and RT. We found a strong negative correlation between mean transparency rating and RT—r(440) = −.703, p < .001—indicating that word pairs with high transparency ratings tended to be responded to more quickly. We also found a strong negative correlation between point of differentiation and RT—r(440) = −.559, p < .001—indicating that words with later points of differentiation were responded to more quickly. Both of these analyses confirmed our a priori hypotheses. Figure 3 presents these correlations.

The correlations between RT and mean transparency rating (left plot) and between RT and point of differentiation (right plot).
General Discussion
The purpose of the present study was to produce a set of orthographic transparency ratings for English-Spanish cognate words previously rated for imagery. Transparency ratings for 440 pairs of English-Spanish cognates were collected. The result of this study is a new list of 440 English-Spanish cognates rated for levels of orthographic transparency. This new list can be accessed at https://osf.io/g9yaq/?view_only=c0ffc0110a764667bfcbb0d59f61867a. These ratings of English-Spanish cognate nouns can be used by language researchers wanting to conduct English-Spanish cognate studies without having to collect their own imagery and orthographic transparency ratings. Educators, too, can avail themselves of these ratings to create curriculum materials (Kelley & Kohnert, 2012; J. A. Montelongo et al., 2013, 2023).
In Experiment 1, an initial-letter effect was found. Simply put, the earlier an English word orthographically deviated from its Spanish equivalent, the lower the transparency rating for the pair. This suggests that the participant raters weighted the similarity of the initial letters of the word more heavily than later deviations. These results serve to generalize the findings of initial letter-effect in the nouns and adjectives reported by J. A. Montelongo et al. (2009, 2010).
In Experiment 2, we cross-validated transparency ratings by asking a new group of participants to provide speeded judgments regarding whether or not a pair of words were cognates. We found that reaction times were strongly correlated with transparency ratings and with the point of differentiation for the pairs. These findings validate the transparency ratings obtained in Experiment 1 by showing that they can be used to predict behavior in another domain. The rating speed correlated with the presentation of identical or different letters. Thus, the more identical initial letters a word has, the quicker and easier for participants to rate the word pairs with higher scores. While human ratings are more time consuming than objective methods, they likely better reflect what people actually think.
Implications
The findings have implications for second-language learners reading in either language (Mallikarjun et al., 2017; Martínez, 2018). Language learners can improve their fluency speed when reading passages with unfamiliar cognate words; however, their speed and ease depend on students and their teachers knowing the concept of cognates and using cognate recognition strategies while they read texts in the second language (Martínez, 2018; J. A. Montelongo et al., 2023; Nagy et al., 1993).
Curriculum designers play a pivotal role in the effective acquisition of cognates. They can strategically scope and sequence the cognates based on their orthographic transparency (J. A. Montelongo et al., 2013). The sequence should commence with the introduction of cognates that are either identical or highly transparent, as these are easier for language learners to recognize and acquire. Subsequent grades should see the introduction of cognates with mid-range transparency, such as those rated between 4.0 and 5.0. For advanced grades, curriculum designers can introduce cognates with lower transparency, such as those in the 3.0 to 2.0 range.
It is crucial for language educators to be cognizant of the fact that cognate orthographic transparency can vary significantly, ranging from identical to very dissimilar. Equipped with this knowledge, educators can be more effective in incorporating cognate vocabulary into their lessons (Hiebert & Lubliner, 2008; J. A. Montelongo et al., 2023). They should start by designing lessons with cognates that are identical or have high transparency ratings. As students grasp the concept of cognates, educators can gradually introduce those with mid-range orthographic transparency.
Limitations
A limitation in Experiment 1 of this study was that the cognate word pairs varied only by page number. In other words, while the pages were shuffled to obtain different page orders, the individual cognate pairs were not ordered differently. In future studies, the goal would be to have a random word order for every participant.
A second limitation was the size of the participant pool in both experiments. The study should be replicated with a larger sample of participants to check on stability across samples, including persons other than college students. The participant sample should be large enough to examine data for differences in Spanish-speaking abilities and gender differences. Another limitation was the focus on only orthographic ratings. A future study would include the confidence level the participants had when rating the orthographic similarity of the word pairs and examine the meaning of cognates, whether identical or similar. Future studies might also measure phonological similarities between English and Spanish.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethics Statement
This research has been conducted under approval from a university institutional review board (IRB) and meets all ethical requirements.
