Abstract
This study aims to establish what lexical factors make it more likely for dictionary users to consult specific articles in a dictionary using the English Wiktionary log files, which include records of user visits over the course of 6 years. Recent findings suggest that lexical frequency is a significant factor predicting look-up behavior, with the more frequent words being more likely to be consulted. Three further lexical factors are brought into focus: (1) age of acquisition; (2) lexical prevalence; and (3) degree of polysemy operationalized as the number of dictionary senses. Age of acquisition and lexical prevalence data were obtained from recent published studies and linked to the list of visited Wiktionary lemmas, whereas polysemy status was derived from Wiktionary entries themselves. Regression modeling confirms the significance of corpus frequency in explaining user interest in looking up words in the dictionary. However, the remaining three factors also make a contribution whose nature is discussed and interpreted. Knowing what makes dictionary users look up words is both theoretically interesting and practically useful to lexicographers, telling them which lexical items should be prioritized in lexicographic work.
Plain Language Summary
This study aims to establish what factors make it more likely for dictionary users to consult specific articles in a dictionary using the English Wiktionary log files, which include records of user visits over the course of six years. Recent findings suggest that word frequency is a significant factor predicting look-up behaviour, with the more frequent words being more likely to be consulted. Three further factors are brought into focus: (1) age of acquisition, which is the age at which a word is learned; (2) lexical prevalence, which is how many people know the word; and (3) degree of polysemy calculated as the number of dictionary senses. Age of acquisition and lexical prevalence data were obtained from recent published studies and linked to the list of visited Wiktionary lemmas, whereas polysemy status was derived from Wiktionary entries themselves. Our study confirms the significance of word frequency in explaining user interest in looking up words in the dictionary. However, the remaining three factors also make a contribution whose nature is discussed and interpreted. Knowing what makes dictionary users look up words is both theoretically interesting and practically useful to lexicographers, telling them which words should be prioritized in lexicographic work.
Introduction
The Role of Dictionaries in Today’s World
Dictionaries have been part and parcel of literate societies for many centuries. The most prominent role of dictionaries in society has been to assist in communication (be it in one language or across different languages): to aid in understanding, creating, and translating texts. Thousands of languages are spoken around the world, and communication problems arise whenever a native speaker of one language comes into contact with a speaker of another language. In today’s global village, marked by the ubiquity of long-distance travel, increased human mobility, and modern communication technology (the Internet and mobile telephony), frequent contacts between speakers of different languages have become a major part of our daily experience. At the same time, English has established itself as a lingua franca of international communication: the language of choice in communication between people speaking different languages natively. This marked tendency gives lexicography of English a particular significance, as dictionaries with English are used intensively and extensively by huge numbers of people worldwide. For the English Wiktionary (the dictionary we are using as our primary data source here), the relevance of English language resources around the world is reflected in the page impression statistics for different countries: the majority of page impressions (52.7%) originate from countries other than the five countries with the most native speakers of English (USA, Great Britain, Canada, Nigeria, and Australia).
The Role of Corpora in Lexicography
In the not-so-distant past, lexicographers conceived and compiled dictionaries by relying primarily on two sources of data: their own introspection and past practice. Because of its reliance on introspection and largely uncritical copying from earlier dictionaries, pre-modern lexicography was anti-empirical as well as strongly conservative. Empirical evidence first came into lexicography in the 19th century with citation slips and reading programs (Atkins & Rundell, 2008, p. 50). This method was prone to human bias, as humans tend to notice the unusual and ignore the habitual. More objectivity was only brought in with the corpus revolution pioneered by the COBUILD project (Sinclair, 1987).
Introduction of systematic corpus data was important in that it offered information, among other things, on which words are most frequent in the language, to the extent that the corpora employed were representative of the language. This allowed lexicographers—always contending with limited resources and time strictures—to focus their efforts on words that corpus analysis found to be most frequently used. In this fashion, corpora offered relatively objective data on language use, but had nothing to say about how people used dictionaries, or to what extent their interest in words aligned with corpus data. In particular, there was no telling whether words found to be frequent text-wise were also the ones that people wanted to consult most often in the dictionary. For that kind of insight, lexicographers needed real data on dictionary use.
Dictionary Log Files
The study of how people use dictionaries began in the late 20th century, and mostly consisted in recording a limited group of study subjects in the process of dictionary use, or in looking at the product of such dictionary use, such as lexical choices, sentences, or texts produced with the help of dictionaries. While this approach offered details relevant to the design of entry structure and organization, the low volume of the data as well as the frequent artificiality of the context and tasks meant that there was little information there on which words dictionary users would normally wish to look up in dictionaries under naturalistic circumstances.
New opportunities to collect such information on a larger scale came with the transition of lexicography to the digital medium (Lew & de Schryver, 2014), as some digital dictionaries, particularly those accessible online, log details of user visits. Such logs may be subsequently mined for information on which parts of the dictionary were accessed and with what frequency. Such data will sometimes be used by the publisher, though not usually shared outside due to their commercial value. However, for English there exists the popular and substantial English Wiktionary, which is a non-commercial crowd-sourced resource. For this dictionary, extensive log files are available for download, thus providing an excellent opportunity for study.
Potentially Relevant Lexical Factors
Corpus Frequency
Corpus-based lexical frequency has been an important consideration in determining the coverage of a dictionary, beginning with the pioneering work completed as part of the COBUILD project (Sinclair, 1987). Still, quite surprisingly, the positive relationship between dictionary look-up and corpus frequency did not turn out to be apparent at all in early studies looking into this issue (De Schryver & Joffe, 2004; De Schryver et al., 2006; Verlinde & Binon, 2010), and has only been established empirically with some confidence fairly recently (De Schryver et al., 2019; Koplenig et al., 2014; Müller-Spitzer et al., 2015), although frequency-based heuristics had been tested much earlier in a contrastive setting with two dictionaries (Verlinde & Selva, 2001). Armed with more sophisticated data analysis tools unavailable to early studies, De Schryver et al. (2019) found a clear positive relationship between corpus frequency and user interest, indicating that words with higher corpus frequency tend to be more frequently looked up by users. This effect was found for both English and Swahili in an online English-Swahili dictionary.
Word frequency also turns out to be an important factor in predicting behavior in reading text (Kliegl et al., 2006), lexical decision tasks (Morrison & Ellis, 1995), and a wide range of other tasks during language processing (N. C. Ellis, 2002). Another concept that is very tightly linked to corpus frequency is orthographic familiarity (White, 2008), which is an operationalization of how often a word with a similar “shape” (same initial letters and same length) as the target word can be observed in everyday language.
However, researchers are increasingly looking—especially in psycholinguistics if not so much (yet) in lexicography (Brysbaert et al., 2018, 2019; Mandera et al., 2017, 2020)—at the possibility that there are other non-trivial aspects of word knowledge, beyond mere frequency of occurrence, possibly playing a role in how “interesting” a word is found by speakers (Bialystok et al., 2009; Gerhand & Barry, 1998; Goodman et al., 2008; Mandera et al., 2015). We would like to explore this avenue insofar as it is evidenced in dictionary look-up behavior. Whilst the relationship between corpus frequency and look-up behavior has received some attention, we see a clear advantage in including further variables. Additional metrics describing other properties of words (some of them closely related—but not identical—to corpus frequency) can also help us understand better the effect of corpus frequency and the relationships between predictor variables. Three of those candidate predictors that appear most promising and will be examined in this contribution are briefly introduced below.
Word Prevalence
The prevalence of a word is the extent to which it is known amongst the native-speaking population. Words which occur with relatively higher frequency in texts and discourse should be more likely to be known by a large proportion of speakers (Longobardi et al., 2015; Weizman & Snow, 2001). Conversely, it would not be reasonable to expect quite rare words to be known to a broad majority of the speakers of a language. All this does not, however, preclude words of moderate frequency being more or less widely known, perhaps due to the relative ubiquity of the concepts that some of them might convey. Another complication is distribution across texts of different type, modality, or genre: words can be frequent, but with most tokens concentrated in a limited range of texts, which would not be conducive to universal prevalence. The state-of-the-art approach to collecting word prevalence information is through crowdsourcing, employing large-scale online surveys (Brysbaert et al., 2019), asking speakers if they are familiar with specific words.
Age of Acquisition
Age of acquisition is the age at which a word is, on average, acquired by native speakers in the process of (naturalistic) L1 acquisition. One might expect that this could play a role in how often words acquired earlier, possibly being more deeply entrenched in the mental lexicon, are looked up. Age of acquisition has been found to have important and long-lasting effects on language behavior (A. W. Ellis & Lambon Ralph, 2000; Garlock et al., 2001; Juhasz, 2005; Kuperman et al., 2012; Morrison et al., 1992; Weizman & Snow, 2001). Of special interest is a study by Navarrete et al. (2015), which suggests that words that are acquired later in life are more likely to elicit “tip-of-the-tongue” phenomena (a sensation in which a known word is momentarily inaccessible). It is not unlikely that such a state could trigger dictionary consultation. Another study by Picard et al. (2010) suggests that the words forming the “core” of the dictionary, that is, words that are strongly interconnected and are key when it comes to learning a new language, are the ones that native speakers tend to acquire at a significantly younger age.
Number of Senses (Degree of Polysemy)
Words can (and often do) have more than one meaning, or sense. The concept of word sense is not without problems (Hanks, 2000; Kilgarriff, 1997), and there has been a long-drawn-out debate about the boundaries between polysemy and homonymy. Some lexicographers even explicitly identify as either lumpers or splitters (Van der Meer, 2004). To steer clear of the essentialist debate of whether words “have” senses, we will adopt a pragmatic approach of considering lexicographic senses, that is, the separate blocks of meaning description as given in a dictionary, in our case operationalized as the number of dictionary senses in the English Wiktionary. We have known for more than 70 years (Zipf, 1949) that the more frequent words tend to have more senses. However, the degree of polysemy may hold predictive potential above and beyond that of mere word frequency.
To wrap up this overview section, Table 1 summarizes the essential parameters of previously published log-file-based studies examining the role of lexical factors in dictionary consultation, along with analogous data for the present study.
Table 1. Parameters of Previous Log-File Studies of the Role of Lexical Factors in Dictionary Consultation Compared to the Present Study.
Aim
As argued above, when people elect to use dictionaries, they make choices about which words to look up, and our present aim is to try to identify the lexical variables that affect the likelihood of those choices by using the log files of a popular crowd-sourced dictionary: the English Wiktionary. At the same time, there is some controversy as to the relationship between lexical frequency and dictionary user look-up frequency, with some studies finding no such relationship, and others reporting a positive relationship. While at this time this difference in findings appears to stem from the use of more sophisticated methods of exploring this relationship, we still see the need to confirm the finding that the more frequent words are indeed looked up more often. However, even if lexical frequency is a useful predictor, it seems clear that other factors are involved. While not seeking a single lexical processing or representation model, we are interested in what drives people’s decisions to look up a specific word in terms of language experience. This leads us to the following research question with four sub-questions:
How do the following lexical factors affect dictionary users’ decisions to look up specific words:
(1) corpus frequency (verify the positive relationship)
(2) word prevalence
(3) age of acquisition
(4) degree of polysemy
Methodology
Data Sources and Data Integration
Age-of-acquisition (AoA) ratings and corpus frequencies were extracted from the supplementary material made available by Kuperman et al. (2012). AoA ratings are represented as the average rating of 1,960 responders on Amazon Mechanical Turk. Kuperman et al. (2012) show that their data of 842,438 ratings “are as valid and reliable as those collected in laboratory conditions” (p. 978).
Corpus frequency is given as standardized frequency values expressed as hits per 1 million tokens, computed from raw frequency figures given in the SUBTLEX-US corpus (as described in Brysbaert & New, 2009).
Prevalence values were extracted from the supplementary material published as part of Brysbaert et al. (2019). The prevalence data are based on the authors’ original web-based survey and cover responses from a total of 221,268 English-speaking participants living in the US and UK; the complete dataset of prevalence values comprises 61,855 data points.
For information on polysemy, we extracted the number of senses for each word directly from its dictionary entry in the English Wiktionary itself. For this, we used a custom R (R Core Team, 2022) function which accesses the edit page of each article. For example, for the entry “dictionary,” the URL https://en.wiktionary.org/w/index.php?title=dictionary&action=edit is scraped by the function. The extraction script is available upon request.
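To illustrate what such an extraction step might look like, here is a minimal Python sketch that counts sense lines in a piece of wikitext. It is not the authors' R function. The function name `count_english_senses`, the sample entry text, the counting rule (top-level lines starting with `#` inside the English section, skipping example, quotation, and subsense lines), and the "more than one sense" threshold for polysemy are all assumptions made for illustration.

```python
import re

def count_english_senses(wikitext: str) -> int:
    """Count top-level definition lines in the English section of an entry.

    Assumed convention: definitions start with '#', while '#:' (examples),
    '#*' (quotations) and '##' (subsenses) are not counted.
    """
    in_english = False
    senses = 0
    for line in wikitext.splitlines():
        # Level-2 headings such as '==English==' mark language sections.
        heading = re.fullmatch(r"==([^=].*?)==\s*", line)
        if heading:
            in_english = heading.group(1).strip() == "English"
            continue
        if in_english and re.match(r"#(?![:*#])", line):
            senses += 1
    return senses

# Hypothetical, heavily shortened entry text (not the real Wiktionary entry).
sample_wikitext = """==English==
===Noun===
{{en-noun}}
# A reference work listing words.
#: {{ux|en|Look it up in the dictionary.}}
# {{lb|en|computing}} An associative array.
#* 2010, some quotation
===Verb===
# To look up in a dictionary.
==French==
===Noun===
# Something else.
"""

n_senses = count_english_senses(sample_wikitext)
is_polysemous = n_senses > 1  # assumed threshold for the true/false predictor
print(n_senses, is_polysemous)
```

Real entries contain further complications (multiple etymology sections, templates that generate senses), so a production script would need additional rules.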
We extracted part-of-speech (POS) information directly for each word from its English Wiktionary entry, just as we did for the number of senses. Here too we used a custom R function (also available upon request), but this time we used the entry page itself for extraction. Our final dataset includes two types of POS information: (1) a list of all parts-of-speech given in the entry; and (2) the first part-of-speech listed. There were a total of four entries in our dataset (disrobement, iceskate, liquescence, and polloi) for which we had to assign POS information manually due to the different structure of these entries. For three entries, we had to extract POS information from their spelling variants (Capricorn, gokart, and plutonian).
The criterion we attempt to predict using the above variables is the number of look-ups for each of the online entries. To collect this information, we used the R package pageviews (Keyes & Lewis, 2020). We restricted page view information to non-automatic look-ups. Separate look-up counts are available for desktop access, mobile access via a web browser, and mobile access via the Wiktionary app. The time span that we collected daily look-up data for is 01-01-2016 to 31-10-2021. Aggregated figures are available for monthly, yearly, and overall look-ups for each dictionary entry.
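For readers who do not use R, the request behind such a query can be sketched in Python. The study itself relied on the R package pageviews. The sketch below only builds request URLs for the Wikimedia REST per-article pageviews endpoint as we understand it, and makes no network call. The helper name `pageviews_url` and its defaults are assumptions, and the parameter values should be checked against the current API documentation before use.

```python
from urllib.parse import quote

# Assumed base path of the Wikimedia REST per-article pageviews endpoint.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"

def pageviews_url(title, access="all-access", agent="user",
                  start="20160101", end="20211031",
                  project="en.wiktionary", granularity="daily"):
    """Build a request URL for the daily page views of one entry.

    agent="user" is meant to exclude automated traffic; the default dates
    correspond to the collection period stated in the text.
    """
    # Page titles use underscores instead of spaces and must be URL-encoded.
    encoded_title = quote(title.replace(" ", "_"), safe="")
    return "/".join([BASE, project, access, agent,
                     encoded_title, granularity, start, end])

# One URL per access channel mentioned in the text.
ACCESS_TYPES = ("desktop", "mobile-web", "mobile-app")
urls = {access: pageviews_url("dictionary", access=access)
        for access in ACCESS_TYPES}

for access, url in urls.items():
    print(access, url)
```

Summing the daily counts returned for each channel would then give the monthly, yearly, and overall totals described above.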
To integrate all data described above in one single dataset, we first identified all intersecting lexical items from the AoA, frequency, and prevalence lists. We then checked which of these items had a corresponding entry in the English Wiktionary. For each of these entries, we extracted sense, POS, and look-up data. All in all, our final dataset contains approximately 780 million look-ups distributed over 30,750 entries. Table 2 gives an overview of all the variables in our dataset relevant for the present study.
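The integration logic described here amounts to an intersection of word lists followed by a filter. A minimal Python sketch under that reading follows. The function name `integrate`, the dictionary layout, and all values in the toy data are made up for illustration and do not come from the study's datasets.

```python
def integrate(aoa, frequency, prevalence, wiktionary_titles):
    """Keep words present in all three source lists and in the dictionary."""
    shared = set(aoa) & set(frequency) & set(prevalence)
    kept = sorted(word for word in shared if word in wiktionary_titles)
    return {
        word: {
            "aoa": aoa[word],
            "frequency": frequency[word],
            "prevalence": prevalence[word],
        }
        for word in kept
    }

# Toy inputs with invented values.
toy_aoa = {"school": 4.0, "angle": 8.5, "huzza": 13.0}
toy_frequency = {"school": 200.0, "angle": 9.0, "zzz": 0.1}
toy_prevalence = {"school": 2.5, "angle": 2.4, "huzza": 0.9}
toy_titles = {"school", "huzza"}

dataset = integrate(toy_aoa, toy_frequency, toy_prevalence, toy_titles)
print(dataset)
```

In this toy run only "school" survives: "huzza" and "zzz" are missing from one source list each, and "angle" has no entry in the toy title set.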
Table 2. Information on the Variables Used in the Present Study.
Data Analysis
Data Transformation
The distribution of look-up data (as measured by the number of views for each article) is heavily skewed toward low values. After log-transforming, the variable approaches normal distribution, and so we adopted the transformed variable (log views) as the criterion variable in our statistical model. Likewise, we log-transformed the standardized corpus frequency predictor variable to normalize the distribution of the residuals of the linear model, as per a general assumption of linear regression models.
We then standardized all continuous predictors (AoA, log frequency, and prevalence) to z-scores by subtracting the respective mean from each value and then dividing by the respective standard deviation. This maps all the continuous predictors on the same scale with a mean value of 0 and a standard deviation of 1, making linear regression model estimates comparable.
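The two transformation steps can be sketched in Python as follows. The numbers are invented, the natural logarithm is an assumption (the text does not state the base), and the sample standard deviation is used, as R's `scale()` does by default.

```python
import math
import statistics

def z_scores(values):
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(value - mean) / sd for value in values]

# Invented example values, not taken from the study's dataset.
views = [310, 4200, 18000, 250000, 1500000]
frequency_per_million = [0.02, 0.4, 3.1, 57.0, 1200.0]
aoa_ratings = [13.2, 10.5, 8.1, 5.4, 3.0]

# Criterion: log-transformed only (it is not standardized in the text).
log_views = [math.log(v) for v in views]

# Predictors: frequency is log-transformed first, then all are standardized.
z_log_frequency = z_scores([math.log(f) for f in frequency_per_million])
z_aoa = z_scores(aoa_ratings)

print([round(z, 2) for z in z_log_frequency])
```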
Model Specification
We predicted log views by standardized AoA, log frequency, and prevalence. Polysemy (true/false) entered the model as a categorical predictor. The corresponding R formula is:
The coefficient of determination for the full model is R² = .5228. Further details on this model are given in Table 3.
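Written out as an equation, the model described in this section would take roughly the following form. This is a reconstruction from the verbal description, not the authors' R formula: the coefficient labels, the notation z(·) for standardization, and the 0/1 coding of polysemy are our assumptions.

```latex
\log(\mathrm{views}_i) = \beta_0
  + \beta_1\, z(\mathrm{AoA}_i)
  + \beta_2\, z(\log \mathrm{frequency}_i)
  + \beta_3\, z(\mathrm{prevalence}_i)
  + \beta_4\, \mathrm{polysemy}_i
  + \varepsilon_i
```

Here i indexes dictionary entries, polysemy is 1 for polysemous entries and 0 otherwise, and ε is the residual term.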
Table 3. Results for the Linear Regression Model.
Note. Predictor: name of the predictor; Estimate: beta estimate from the linear regression model; t-value: associated t value (Estimate divided by associated standard error); p-value: indicator of statistical significance; VIF: variance inflation factor for the predictor; ΔR²: difference between full model R² and R² of a model without the predictor (see further explanation in the text).
To spot potential problems with collinearity in the model (for example, log frequency and prevalence are correlated at Pearson’s r = .62 and Spearman’s r = .72), we checked the variance inflation factors (VIF) of the predictors. As can be seen in Table 3, none of the VIFs approaches or exceeds a value (5 or 10) that could indicate “a problematic amount of collinearity” (James et al., 2013, p. 101f).
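As a reminder of how the VIF relates to correlation: the VIF of a predictor is 1 / (1 − R²), where R² comes from regressing that predictor on all other predictors. The Python sketch below applies this to the simplified case of a model with only two predictors, where that R² equals the squared Pearson correlation. The reported model has four predictors, so the VIFs in Table 3 will differ from this illustrative value. The function name is ours.

```python
def variance_inflation_factor(r_squared):
    """VIF for a predictor, given the R^2 of regressing it on the others."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)

# Pearson correlation between log frequency and prevalence, as reported.
pearson_r = 0.62

# Two-predictor special case: R^2 of one predictor on the other is r^2.
vif_pair = variance_inflation_factor(pearson_r ** 2)
print(round(vif_pair, 2))  # about 1.62, well below the thresholds of 5 or 10
```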
We also tested whether model results are crucially influenced by outliers in the log views, our criterion variable. After excluding 24 data points (i.e., less than 0.1% of all data points) lying more than 1.5 times the interquartile range beyond the hinges of a boxplot (r = 1.5, which is a rather strict criterion), none of the effects reported above changed in a meaningful manner, that is, the overall effect pattern stayed the same.
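The boxplot criterion can be sketched in Python as below, assuming it corresponds to the usual rule of flagging values more than r times the interquartile range beyond the quartiles. The function name and the toy values are ours. Quartile conventions differ slightly between tools (R's boxplot uses Tukey's hinges), so counts on real data may not match exactly.

```python
import statistics

def boxplot_outliers(values, r=1.5):
    """Return values lying more than r * IQR below Q1 or above Q3."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    lower, upper = q1 - r * spread, q3 + r * spread
    return [value for value in values if value < lower or value > upper]

# Invented log-view values with one extreme entry.
toy_log_views = [7.2, 7.9, 8.1, 8.4, 8.8, 9.0, 9.3, 9.7, 10.1, 10.4, 16.5]

outliers = boxplot_outliers(toy_log_views)
cleaned = [value for value in toy_log_views if value not in outliers]
print(outliers, len(cleaned))
```

Refitting the model on the cleaned data and comparing the estimates with the original fit corresponds to the robustness check described above.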
We do not present models with any interactions here. However, we also computed an alternative model which included all possible two-way interactions. Several of these interaction terms did not reach statistical significance and were excluded. Another interaction effect was excluded because it led to inflated variance in the model (highest VIF: 9.22). After this, two two-way interaction terms remained in the alternative model (AoA:Prevalence and AoA:Polysemy). Even so, as the gain in explained variance from including these two interaction terms was close to none (R² = .5247 for the interaction model, compared to R² = .5228 for the original no-interaction model), there was little justification for retaining the two additional terms in the model. For reference, we include this alternative model in the Supplemental Material.
A Preliminary Look at Part of Speech Labels
In a separate preliminary analysis in response to a suggestion by an anonymous reviewer, we investigated whether look-ups vary by part of speech. We did not include POS as a predictor in the regression model, because the POS information is not as reliable as the other predictors: here, we only used the first part-of-speech label on the entry page in the English Wiktionary. In some cases, this seems rather arbitrary. For example, the first POS given at the entry “a” is “Letter,” whereas “Article” would have been a more appropriate category to represent the most salient use of the word “a.” We grouped these POS labels into the following four categories (starting with the most numerous group): (1) Nouns and proper nouns (n = 19,258); (2) Adjectives and adverbs (n = 7,689); (3) Verbs (n = 3,577); and (4) Others (n = 226). In this preliminary analysis, we compared the (log-transformed) views with pairwise t-tests between all groups.
Results
All predictors are highly significant in their contribution toward predicting article views in the English Wiktionary. Frequency and AoA show a positive relationship with article views. This means that words that are found more often in a corpus tend to be looked up more often, and words that are acquired later in life are also more likely to receive more views.
Words that are more prevalent in the population, that is, are known to more people, are looked up less often. This is indicated by the negative estimate for the standardized prevalence variable in Table 3.
To assess the relative importance of the predictors, we can compare the absolute values of the estimates for the continuous predictors, which are on a common scale thanks to their prior standardization. This shows a clear hierarchy of importance: corpus frequency is by far the most important predictor in the model, followed by AoA and prevalence (recall that polysemy is not a continuous predictor). As a second measure of relative importance, we refer to the proportion of variance in the criterion that is explained by the predictors in the model (R²). Here, we drop each variable from the model in turn, calculate the difference between the R² of the full model (as indicated above: R² = .5228) and that of the model without the respective variable, and call this measure ΔR². Put differently, ΔR² can be interpreted as a measure of how much more variance in the number of views is explained by the model if the respective predictor is included. As expected, this shows the same hierarchy as the comparison of estimates. In addition, we can include polysemy in the hierarchy because ΔR² does not rely on standardized predictors.
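The drop-one-predictor procedure can be sketched in plain Python. This is a minimal illustration on synthetic data, not the authors' R code. The helper names, the simulated predictors ("strong", "weak", "flag") and their effect sizes are invented, and the least-squares fit uses a small hand-written solver to keep the sketch dependency-free.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def r_squared(columns, y):
    """R^2 of a least-squares fit of y on the columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)]
           for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def delta_r_squared(predictors, y):
    """Full-model R^2 and the loss in R^2 when each predictor is dropped."""
    full = r_squared(list(predictors.values()), y)
    deltas = {}
    for name in predictors:
        reduced = [col for other, col in predictors.items() if other != name]
        deltas[name] = full - r_squared(reduced, y)
    return full, deltas

# Synthetic data: one strong and one weak continuous predictor, one 0/1 flag.
rng = random.Random(42)
n = 500
strong = [rng.gauss(0, 1) for _ in range(n)]
weak = [rng.gauss(0, 1) for _ in range(n)]
flag = [float(rng.random() < 0.5) for _ in range(n)]
y = [2.0 * s - 0.3 * w + 0.5 * f + rng.gauss(0, 1)
     for s, w, f in zip(strong, weak, flag)]

full_r2, deltas = delta_r_squared(
    {"strong": strong, "weak": weak, "flag": flag}, y)
print(round(full_r2, 3), {k: round(v, 3) for k, v in deltas.items()})
```

Because the measure only compares R² values, it works for the binary flag in the same way as for the continuous predictors, which is the property used above to place polysemy in the hierarchy.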
Figure 1 visualizes all effects from the linear regression model. The importance ranking of the variables is here also apparent in the range spanned by the values of predicted views. For example, the very strong effect of frequency leads to an increase from near-zero views to nearly 2 million predicted views for entries for very frequent words. In contrast, the effect of prevalence is visually apparent, but from one end of the standardized prevalence scale to the other, predicted views only change by a factor of 2.

Figure 1. Visualization of the estimated effects of the linear regression model.
Figure 2 shows the distribution of the (log-transformed) look-ups of entries grouped in four POS categories. Pairwise t-tests indicate highly significant differences (all Holm-adjusted p-values < .0001) between all groups except between (proper) nouns and verbs (p = .059). It is quite obvious that the “Others” category in particular stands out from the rest. We propose two alternative reasons for this. The “Others” group contains high-frequency function words that are also looked up very often (e.g., “a,” “I,” “what,” “the,” “for,” “in,” “more”). Alternatively, the size of the group could lie at the heart of the difference: while the other groups contain several thousand entries each, only 226 words fall into the “Others” category. That is because this category captures non-productive, closed syntactic classes of the vocabulary (unlike nouns, verbs, or adjectives). But this also means fewer entries that could drag the distribution of look-ups down. To illustrate this point: the lower end of the rightmost violin (“Others”) in Figure 2 is at 2,469 look-ups for the entry “huzza” (POS tag “Interjection”). This is considerably higher than the entry with the fewest look-ups in the “Verbs” violin (“prewashed,” 156 look-ups).

Figure 2. Violin plots for the (log-transformed) look-ups of four part-of-speech categories.
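The Holm adjustment applied to the pairwise comparisons can be sketched in Python. With four groups there are six pairwise tests. The raw p-values below are invented placeholders, not the study's results, and the function name is ours.

```python
def holm_adjust(p_values):
    """Holm step-down adjustment of a list of p-values (original order kept)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, index in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, etc.
        candidate = min(1.0, (m - rank) * p_values[index])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[index] = running_max
    return adjusted

# Hypothetical raw p-values for the six pairwise group comparisons.
raw_p = [0.00001, 0.00002, 0.00001, 0.059, 0.00003, 0.00001]
adjusted_p = holm_adjust(raw_p)
print([round(p, 5) for p in adjusted_p])
```

The largest raw p-value is multiplied by 1, so a comparison with the weakest evidence keeps its raw p-value unless an earlier adjusted value is larger.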
Discussion
The present study confirms the crucial role that lexical frequency plays in driving interest in words for the purpose of dictionary consultation. This finding tallies well with previous studies (De Schryver et al., 2019; Müller-Spitzer et al., 2015), restating that the more frequent words tend to be looked up more often than the less frequent words. We believe a large part of this effect is rather mechanical: corpus frequency represents textual frequency (reflecting the type and proportion of texts represented in the particular corpus), which is also the probability of encountering a given lemma in running text. Now, some dictionary look-ups must be shots in the dark, without specific semantic motivation or well-formed assumptions. Dictionary look-ups that would be so classified would then be expected to reflect the textual frequency of word forms. Thus, the mere higher frequency of occurrence would drive consultation behavior, making it more likely, in a fairly superficial way, for frequent forms to be looked up more often. Put simply, we cannot help looking up some words just because they are so frequent in the linguistic material that we encounter.
Our regression model indicates that the other three factors considered also play a role, as suggested by the significance level of these effects and the reduction in explained variance (columns 4 and 6 in Table 3). Judging by the latter parameter (ΔR²), second in importance would be polysemy.
Polysemous words tend to attract more views. One possible explanation for this effect might be straightforward: encountering words in text, language users pay attention not just to form but also to meaning. One might even say that meaning is usually the prime goal of communication, with form being a mere vehicle to get at the meaning. This is where the effect of polysemy comes in. A polysemous word could (and, given our results, does) present a greater obstacle when it comes to figuring out the correct meaning of a given word in its context. This might simply be the case because there are several potential “meaning candidates” (some of them might be quite rarely used) from which the correct one must be selected and integrated into the overall meaning representation of the text. This selection process is exactly where dictionaries can help. Compared to that, monosemous words present less of a challenge to the reader/listener. This contrast is reflected in the effect of polysemy in our data. For example, take a word like school, whose most common meaning is quite generally known. However, the sense “group of fish” is not so well known, and when this use is encountered in the context of marine life, users may be puzzled by the idea of fish attending an institution of learning. This type of experience of semantic difficulty may drive dictionary consultation for the less transparent meaning extensions marked as separate senses in the dictionary. This finding again confirms results by Müller-Spitzer et al. (2015) for the German Wiktionary: polysemous words are looked up more often than words with a single meaning.
Next in line in terms of ΔR² is age of acquisition (AoA). Our best model indicates a positive effect, which means that words acquired later in life are in general more likely to be looked up. Given that this is an effect adjusted for frequency, we might offer an interpretation of this effect in terms of typical progression of lexical acquisition as well as consultation behavior. A significant part of the core vocabulary of the language is acquired in the early years of life (e.g., Anglin et al., 1993, p. 62, estimate the mean number of main entries in a dictionary known by fifth-graders at around 40,000). Under a typical language acquisition scenario, children would get a good grasp of these words before they start school. On the other hand, pre-school children are not yet literate and would not be expected to use dictionaries such as the English Wiktionary. Conversely, the typical user of Wiktionary already knows (most of) the early-AoA words and does not have to look them up as often as words that they do not know at their current point in life. This might explain why such relatively early-acquisition words are relatively underrepresented in the Wiktionary logs (after correcting for frequency). Note the point about polysemy above, though: rarer senses of early-acquisition words might still drive dictionary consultation.
The final predictor in our model—one with the least impact on the number of views of all—is lexical prevalence: an indicator of how widespread in the population the knowledge of a given word is. One might expect that if many people know word X, few people would want to look it up. Conversely, if few people know word Y, there would be many who might seek lexical help for it. The direction of the effect in our model agrees with this rationale: words that are known to more people are looked up less often (as indicated by the negative estimate).
The final hierarchy of importance that emerges is as follows: frequency > polysemy > AoA > prevalence, and may be rendered graphically as in Figure 3.

Figure 3. Relative hierarchy of importance of the four predictors of the frequency of dictionary consultation.
Implications for Lexicography
The main implication of our findings for lexicography is one of reassurance: the modern approach to the determination of a lemma list, which underlies state-of-the-art corpus-based methodology, is essentially correct, in that attention and effort should be primarily graduated in relation to corpus frequency. Our results tally with those of previous research looking at different types of dictionaries for other languages (De Schryver et al., 2019; Koplenig et al., 2014). However, we have shown that while frequency is a very important predictor of consultation behavior, it is not the only one. While Müller-Spitzer et al. (2015) already showed that short-term effects of social relevance can have an impact on look-up behavior, we showed here that there are other long-term variables besides frequency, namely polysemy, age of acquisition, and word prevalence, that exert their influence on dictionary users. These factors can (and maybe should) now also be taken into consideration when devising lemma lists or, of course, when expanding on research into dictionary use.
By its very nature, an analysis of web server records cannot shed light on either the context of an instance of dictionary use, or on the personal characteristics of the dictionary user. However, considering the practical application of dictionary-making, specific look-up context as well as idiosyncratic user characteristics cannot be determined at the stage of lexicographic design either, so it seems appropriate to ignore these factors, as in our approach.
Limitations and Future Work
Inevitably in a web-based log-file study with anonymous users, it was not possible to consider the language background and proficiency of the Wiktionary visitors, nor would any other personal characteristics of our dictionary users be known. Likewise, we could not obtain information on such details of the look-up context as the main activity (e.g., was it reading for comprehension, writing, translation work, or perhaps recreational dictionary browsing), or what sort of problem prompted the dictionary look-up. While “know your user” remains a valid principle in lexicography, it is also true that a general-purpose dictionary such as the English Wiktionary attracts a very broad variety of visitors trying to use it for all sorts of purposes. In view of that, the varied log-file data may actually not be such a bad source of information, especially if we consider that the dictionary users whose look-ups we are using are people who came to use the English Wiktionary out of choice, rather than being a captive audience in a controlled experiment.
In the present study, we used a multiple regression model with no interaction terms (though see the Supplemental Material for an interaction model), but other analysis protocols might be employed to yield corroborative or more nuanced results, such as bootstrapped models (or other analyses based on repeated sub-sampling of the dataset).
Another avenue for future research could also include part-of-speech information in the analyses: it might well be that, for example, nouns attract more views than adjectives. At the suggestion of one anonymous reviewer, we tried looking at POS. However, it is still unclear exactly which part-of-speech information should be used. Many entries have more than one part of speech listed; for example, the entry for angle includes both noun and verb uses, and it is not clear that the ordering of the POS sections is systematically motivated, nor do we know whether a nominal or verbal use was being looked up. Also, corpus frequency (or other predictors) might affect different parts of speech in different ways.
Furthermore, it might be worth exploring methods from the field of artificial intelligence (AI) or machine learning (ML) to corroborate or extend the present findings, which are based on regression modeling.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded in part by the National Science Centre, Poland, award 2020/39/B/HS2/00923. For the purpose of Open Access, the authors have applied a CC-BY public copyright license to any Author Accepted Manuscript (AAM) version arising from this submission.
