Abstract
High-frequency words, which are key to a text, must be mastered to achieve minimum levels of reading proficiency. However, knowledge about the frequency of items in a language is very limited. Given this consideration, WordSift (www.wordsift.org), a word cloud tool based on high-frequency and key words, can assist intermediate-level English as a Foreign Language learners who have difficulty reading a text because of limited vocabulary knowledge. Its distinctive features can be applied at various stages of reading instruction.
Introduction
If an item naturally occurs frequently in the language, it is likely to be important (Leech, 2001). Therefore, high-frequency words must be mastered to achieve minimum levels of reading proficiency (Gardner, 2004). However, knowledge about the frequency of items in a language is very limited (Leech, 2001). Given this consideration, WordSift (www.wordsift.org), a word cloud tool based on high-frequency and key words, can assist intermediate-level English as a Foreign Language learners who have difficulty reading a text because of limited vocabulary knowledge.
WordSift was created by Kenji Hakuta with the help of Diego Roman and Karen Thompson, two doctoral students at Stanford University, California, USA, and former teachers. The website homepage provides a large box for pasting text and several green tabs at the top that link to background information about the website (Figure 1). Right below these tabs are two important windows, “Cloud View” and “Text View”, which lead to the main features of the website. Cloud View generates a word cloud whose style and words can be edited with Cloud Styles, Sort Words and Mark Words. It also provides three further tools, namely WordNet Visualization, Images, and Key Word in Context. Additionally, Text View presents a readability measurement of the target text. All these distinctive features can be applied at various stages of reading instruction.

Figure 1. The tabs at the top and a sample word cloud in Tag Cloud.
Pre-Reading: Tag Cloud
Words occurring frequently will be key to a text (Scott, 1997). Knowing key words helps students understand the text they read (Filatova, 2016). According to Filatova (2016), a word cloud presenting key words can be used for pre-reading discussion on what the article might be about. WordSift has a generator of word clouds based on word frequency statistics, called Tag Cloud, with which learners can easily identify word frequency and unknown key words. When users paste a text into the box, WordSift will sift it quickly and generate a word cloud which visualizes the 50 most frequent words from the text. As Figure 1 shows, the more frequently a word occurs, the larger it becomes.
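The frequency-to-size idea behind such a word cloud can be illustrated with a minimal Python sketch. It is not WordSift's actual implementation: the stopword list, the font size range and the function names `top_words` and `font_sizes` are assumptions made for illustration only, and the sample sentence is invented.

```python
import re
from collections import Counter

# Hypothetical stopword list; WordSift's own filtering is not described in this review.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "on",
             "was", "is", "he", "his", "it", "that"}

def top_words(text, n=50):
    """Return the n most frequent non-stopword tokens with their counts."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)

def font_sizes(ranked, min_pt=10, max_pt=48):
    """Map each count linearly onto a font size: the more frequent, the larger."""
    if not ranked:
        return {}
    highest = ranked[0][1]   # most_common() sorts by descending count
    lowest = ranked[-1][1]
    span = max(highest - lowest, 1)  # avoid division by zero when all counts are equal
    return {word: min_pt + (count - lowest) * (max_pt - min_pt) / span
            for word, count in ranked}

# Invented sample text, used only to exercise the functions.
sample = ("The river shaped Twain. The river was his school, "
          "and the river was his road to writing.")
ranked = top_words(sample)
sizes = font_sizes(ranked)
```

In this sketch the most frequent word receives the largest size and words that occur only once receive the smallest, which mirrors the visual effect described for Figure 1.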
During-Reading: Key Word in Context
The production of concordances with key-words-in-context lists is an important method shared by many text analysts (Bernard and Ryan, 1998). With “Key Word in Context”, which provides the same functionality as Concordance in AntConc, a classic tool in corpus linguistics, teachers can guide students to focus on the location of key words for text analysis during reading. For example, as Figures 2 and 3 show, “river” appears frequently in the first part of the text, suggesting that the river played an important role in Mark Twain's childhood, while “writing” mainly appears in the latter half, reflecting some genre characteristics of a writer's biography. Moreover, students can use this tool to see how key words are used in the target text, which promotes language learning by allowing learners to discover patterns and correct their misconceptions by observing extensive, naturally occurring examples in real texts (Hill, 2000).

Figure 2. The results of “river” in context.

Figure 3. The results of “writing” in context.
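A key-word-in-context listing of the kind discussed in this section can be approximated in a few lines of Python. The sketch below is only an illustration under stated assumptions: the function name `kwic`, the 30-character context window and the sample text are hypothetical, and neither WordSift nor AntConc is claimed to work this way internally.

```python
import re

def kwic(text, keyword, width=30):
    """Return (relative_position, left_context, match, right_context) for each hit."""
    rows = []
    # Whole-word, case-insensitive match of the keyword.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Relative position: 0.0 is the start of the text, values near 1.0 the end.
        rows.append((m.start() / len(text), left, m.group(0), right))
    return rows

# Invented sample text, used only to exercise the function.
sample = ("As a boy he lived beside the river. The river carried steamboats past the town. "
          "Years later he turned to writing. His writing drew on those early days.")

river_hits = kwic(sample, "river")
writing_hits = kwic(sample, "writing")

# Print a simple concordance with the keyword aligned in a column.
for pos, left, word, right in river_hits + writing_hits:
    print(f"{pos:4.0%} {left:>30} [{word}] {right}")
```

The relative position in each row makes it easy to see whether a key word clusters in the first or the latter part of a text, which is the kind of observation made about “river” and “writing” above.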
Post-Reading: WordNet Visualization and Images
WordNet Visualization and Images help to teach key words during post-reading.
By displaying a word web for each selected key word, WordNet Visualization allows students to develop depth of vocabulary knowledge, which includes synonymy, polysemy and collocations and plays an important role in reading comprehension (Zhang and Anual, 2008). For example, clicking “writing” generates a word web that shows its synonyms and common collocations (Figure 4). This has been demonstrated to be a good strategy for learners to develop the depth and dimension of word knowledge in reading (Johnson and Rasmussen, 1998).

Figure 4. “Writing” in WordNet Visualization.
To facilitate the acquisition of key words, Images provides multimodal resources from Google Search about the word chosen as the search term. In Figure 5, the search results of Google Images for “twain + river” are displayed. In a reading class, students can better understand the meaning of key words and comprehend the text with these materials. Such dual glossing modes combining definitions of target words with associated verbal and visual representations are effective for vocabulary learning (Ramezanali and Faez, 2019).

Figure 5. Google Images of “Twain River”.
It is also worth mentioning that WordSift provides data for readability measurement. When choosing reading materials online before class, teachers can refer to the indicators of readability in Text View and judge if the target text is suitable (Figure 6).

Figure 6. The indicators of readability in Text View.
Despite the features described above, WordSift also has some shortcomings. First, long texts cannot be analyzed because the text box accepts only texts of fewer than 10,000 words. Second, restricted by Google's Image Search policy, the images and videos cannot be shown directly and are instead presented in pop-up windows. Finally, unlike some other word cloud tools, WordSift does not allow users to customize the word cloud into more attractive and vivid shapes and patterns (Figure 7), which can better support students in making preliminary predictions about text topics while reading.

Figure 7. A vivid word cloud made by another tool.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Center for Language Cognition and Assessment, Guangdong, People’s Republic of China. It was also the result of the Guangdong “13th Five-Year” Plan Project of Philosophy & Social Science (Grant Number: GD20WZX01-02).
