Abstract
Wikipedia is one of the most widely accessed sources of information worldwide, playing a crucial role in global knowledge dissemination across languages and cultures. Understanding how information complexity varies among different language editions is essential to identify gaps and promote equitable access to knowledge. This study investigates differences in the informational complexity of Wikipedia articles across three language editions: English, Spanish and Portuguese. By analysing key quantitative features such as word count, number of topics, figures and references in articles from five thematic categories, we provide empirical evidence of significant variation in content depth. Our results reveal that both language and thematic category significantly affect Wikipedia article complexity across variables. In addition, a significant interaction between language and category was observed for the number of topics. These findings highlight the linguistic disparities in knowledge representation on Wikipedia and underscore the importance of promoting linguistic diversity and inclusivity in global knowledge dissemination.
