UniPseudo: A universal pseudoword generator

Abstract

Pseudowords are letter strings that look like words but are not words. They are used in psycholinguistic research, particularly in tasks such as lexical decision. In this context, it is essential that pseudowords respect the orthographic statistics of the target language: pseudowords that violate them are too easy to reject in a lexical decision task and therefore do not force participants to fully recognize the real words. We propose a new pseudoword generator, UniPseudo, whose algorithm is based on Markov chains of orthographic n-grams. It generates pseudowords from a customizable database, which allows users to control the characteristics of the items. It can produce pseudowords in any language, in orthographic or phonological form. Pseudowords can be generated with specific characteristics, such as the frequency of letters, bigrams, trigrams, or quadrigrams, the number of syllables, the frequency of biphones, and the number of morphemes. Thus, from a list of words composed of verbs, nouns, adjectives, or adverbs, UniPseudo can create pseudowords resembling verbs, nouns, adjectives, or adverbs in any language that uses an alphabetic or syllabic writing system.
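To make the idea of a Markov chain over orthographic n-grams concrete, the following is a minimal Python sketch. It is not UniPseudo's actual implementation, which the abstract does not detail. The function names (`build_chain`, `sample_string`, `generate`), the boundary markers `^` and `$`, the length cap, and the toy word list are all assumptions made for illustration. Each letter is drawn given the previous n-1 letters, with probabilities proportional to how often that n-gram occurs in the source word list, and candidates that are real words are discarded.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # hypothetical boundary markers, not taken from the article


def build_chain(words, n=3):
    """Map each (n-1)-letter context to the list of letters that follow it in `words`."""
    chain = defaultdict(list)
    for word in words:
        padded = START * (n - 1) + word + END
        for i in range(len(padded) - n + 1):
            context = padded[i:i + n - 1]
            chain[context].append(padded[i + n - 1])
    return chain


def sample_string(chain, n, rng, max_len=30):
    """Walk the chain from the start context until END is drawn; None if too long."""
    context, letters = START * (n - 1), []
    while len(letters) <= max_len:
        nxt = rng.choice(chain[context])  # duplicates in the list weight by frequency
        if nxt == END:
            return "".join(letters)
        letters.append(nxt)
        context = (context + nxt)[1:]  # slide the (n-1)-letter window
    return None


def generate(chain, lexicon, n=3, length=None, rng=random, max_tries=10_000):
    """Return a sampled string that is not in `lexicon`, or None if none is found."""
    for _ in range(max_tries):
        candidate = sample_string(chain, n, rng)
        if not candidate or candidate in lexicon:
            continue  # empty, too long, or a real word
        if length is None or len(candidate) == length:
            return candidate
    return None


# Toy word list, chosen only for illustration.
words = ["table", "cable", "fable", "stable", "tablet",
         "cabin", "robin", "candle", "handle"]
chain = build_chain(words, n=3)
pseudoword = generate(chain, set(words), n=3, rng=random.Random(0))
print(pseudoword)
```

Because the chain is built only from the supplied list, restricting that list (for example to verbs, or to words of a given length) shapes the output accordingly, which is the kind of control the abstract attributes to a customizable database. The same sketch would apply to phonological forms if the input strings were phoneme transcriptions rather than spellings.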

Keywords

Pseudoword generator, lexical decision


