Abstract
This paper embarks on a methodical exploration of TVTropes.org, a comprehensive crowd-sourced wiki that documents storytelling conventions across various forms of media, with a special focus on representations of artificial intelligence and future scenarios. The study delineates the multifaceted nature of the platform, asserting that its offerings extend beyond mere television tropes, and explores its potent applicability for futurists and scholars. Through the strategic application of web scraping and data analysis, this research quantifies diverse aspects of science fiction media and the embedded tropes therein. The paper underscores the pivotal role of tropes in shaping societal perceptions and influencing discourse, particularly in the realm of futuristic thinking, using instances such as the impact of “The Terminator” franchise on military strategic thinking as a case study. Furthermore, the research engages in a quantitative analysis of trope density and frequency within science fiction works, offering a nuanced understanding and uncovering patterns in storytelling conventions. This study thus offers a foundational framework for understanding, categorizing, and analyzing the influential images of the future that permeate today’s media environment, thereby providing a valuable resource for futurists, researchers, and media scholars in their respective endeavors.
Introduction
Within the field of futures studies, the intersection of futures methods and science fiction is a frequent topic of discussion. For many futurists and foresight practitioners, science fiction serves as an inspiration, a tool for stretching people’s imaginations, or a literary genre worth analyzing for insights into how futures work can be more influential or engaging. This paper hopes to build on the long tradition in the futures field of digging into science fiction for new ideas, but with special and focused attention to tropes.
While Raven and Elahi covered the benefits of applying narratology to the shaping of futures outputs (Raven and Elahi 2015) and this research deals with tropes, a topic typically reserved for narratology, the paper does not focus on the tropes typically explored by the literary tradition, such as metaphor, irony, synecdoche, or how to deploy them in crafting more effective narratives in futures outputs. Instead, it focuses on the database of tropes generated by the community at TVTropes.org. While not focused on how TVTropes.org fits within narratology, the paper does hope to give the futures community more concrete exposure to the depth and breadth of futures-oriented tropes and media available on TVTropes.org.
While the idea of science fiction “tropes” has come up in the futures literature in the past, a comprehensive engagement with science fiction and speculative fiction tropes is currently lacking. This research hopes to provide the following:
• An introduction to TVTropes.org and its applicability to futures studies
• An overview of using web scraping to pull information from TVTropes.org for analysis
• An initial mapping of the science fiction tropesphere on TVTropes.org by looking at science fiction media on the site and speculative fiction and technology tropes
• An example of how to use TVTropes.org data to compare two works of science fiction
Why TVTropes.org
TVTropes.org is a community wiki created in 2004 to document storytelling conventions people had been discussing and debating online before the site’s creation. While the site’s name may lead people to believe it focuses on storytelling conventions and cliches strictly on television, it covers a wide range of media types and pages that document instances of these tropes bleeding over into the “real world.”
Given this paper’s attempts to situate TVTropes.org in conversations about how narratology can be beneficial to futures studies, it is worth pointing out that TVTropes.org defines a trope as “a conceptual figure of speech, a storytelling shorthand for a concept that the audience will recognize and understand instantly” (Trope n.d.). TVTropes.org also provides a great resource, the “Playing with a Trope” page, for those curious about different ways to use tropes in storytelling.
While TVTropes.org is a community wiki that may deviate from strict definitions of a trope, Rughiniş provides an overview of where TVTropes.org, in particular, fits within the citizen science space and demonstrates that the community of practice on TVTropes.org conducts a variety of qualitative work that other academic fields can build upon (Rughiniş 2016).
Studies of TVTropes.org have been taken up by different disciplines, namely computer science, for use in machine learning to automate the generation of stories. While this paper is not interested in using TVTropes.org to automate the writing of futures narratives, it does seek to map part of the science fiction tropesphere, a general overview of the tropes and media within a network (García-Sánchez et al. 2021), using a more conservative methodology.
TVTropes has also been acknowledged within the futures community as a potential means of understanding the myriad reading protocols of science fiction (Raven and Elahi 2015), and this paper attempts to map the footprint of science fiction within TVTropes.org to help non-science fiction readers begin to understand the nuances that can exist within some of the more familiar tropes associated with the genre.
Most pages on TVTropes.org contain information about either a trope or a creative work. On trope pages, readers are introduced to the trope and its definition, with some examples scattered throughout. The remainder of the page allows readers to browse a list of works in which that trope occurs. The pages of creative works are structured similarly: a brief introduction to the work, followed by a list of tropes that occur within the work.
By counting the number of works listed on an individual trope’s page, we can get a raw number of how frequently that trope has occurred within media according to TVTropes.org’s community of contributors. Tropes with higher frequency can be assumed to be more widespread and, therefore, have a higher probability of being encountered by people as part of their regular media consumption habits. Alternately, by counting the tropes listed on an individual work’s page, we can get a raw number for how densely packed that work is with storytelling conventions and cliches. Works with a high density of tropes can be rich avenues for seeing how multiple storytelling conventions can come together in a story or image of the future.
Counting these attributes on a trope-by-trope or work-by-work basis can be enjoyable when wanting to dive deep into a specific topic or work, which we will explore later. However, we can also aggregate the individual page counts to get an expanded view of trope frequency or density across the science fiction corpus documented by the community on TVTropes.org.
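The two counts described above can be illustrated with a minimal sketch. It assumes the scraped data has already been reduced to (work, trope) pairs; the work and trope names below are hypothetical placeholders, not data from the study.

```python
from collections import Counter

# Hypothetical (work, trope) pairs standing in for scraped data.
pairs = [
    ("Film/WorkA", "Main/TropeX"),
    ("Film/WorkA", "Main/TropeY"),
    ("Literature/WorkB", "Main/TropeX"),
    ("Film/WorkA", "Main/TropeX"),  # repeated mention; should only count once
]

# Collapse repeated mentions so each work/trope pairing is counted once.
unique_pairs = set(pairs)

# Trope frequency: number of distinct works listed for each trope.
trope_frequency = Counter(trope for _, trope in unique_pairs)

# Trope density: number of distinct tropes listed for each work.
work_density = Counter(work for work, _ in unique_pairs)

print(trope_frequency.most_common())
print(work_density.most_common())
```

Aggregating across the corpus then amounts to building these two counters over every pair collected from the site.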
Methodology
The study begins by looking at the structure of webpages hosted on TVTropes.org. Because the site uses wiki software to manage content, pages tend to follow a very similar structure. The site is also consistent in how URLs are structured: pages that focus on tropes are always housed at URLs of the form https://tvtropes.org/pmwiki/pmwiki.php/Main/[TropeName], while works’ URLs are based on the medium, such as https://tvtropes.org/pmwiki/pmwiki.php/Film/[FilmName] for films and https://tvtropes.org/pmwiki/pmwiki.php/Literature/[Title] for literature. Because sci-fi media tend to be adapted across mediums (book to film, video game to film, video game to book, etc.), this URL structure makes it possible to differentiate the presence or absence of tropes within a story by medium. For example, there are pages for the novel Dune, each of the film adaptations of Dune, as well as the 1992 video game adaptation of Dune.
For the purposes of this research, one wants to pull three types of information from a page: a list of tropes associated with a work, a list of works associated with a trope, or a list of works associated with a medium. Each trope, work, and index page uses unordered lists to aggregate a main list of links to examples of the trope in different works or examples of tropes contained within a specific work. If we view the source for a trope’s page, we could take down a list of all URLs on that page that follow the URL convention for works that we noted above (https://tvtropes.org/pmwiki/pmwiki.php/Literature/, https://tvtropes.org/pmwiki/pmwiki.php/Film/, etc.). By taking the sum of all links that follow that structure, we get the number of example works associated with the trope. We could also take the sum of only a specific subset of URLs, like only URLs linking to films, and take that as the number of examples of films associated with the trope. By specifically targeting URLs to works of media, and the anchor tags within the HTML in particular, this method reduces the risk of errors that could be introduced by using regular expressions that try to match text.
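As a minimal sketch of how this URL convention could be used in code, the function below splits a TVTropes.org URL into its namespace (Main, Film, Literature, and so on) and page name. The function name `classify` and the example page names are assumptions made for illustration; only the URL pattern itself comes from the description above.

```python
from urllib.parse import urlparse

PREFIX = "/pmwiki/pmwiki.php/"
HOSTS = {"tvtropes.org", "www.tvtropes.org"}


def classify(url):
    """Return (namespace, page_name) for a TVTropes wiki URL, or None."""
    parsed = urlparse(url)
    if parsed.netloc not in HOSTS:
        return None
    if not parsed.path.startswith(PREFIX):
        return None
    parts = parsed.path[len(PREFIX):].split("/")
    # Expect exactly two non-empty segments: the namespace and the page name.
    if len(parts) != 2 or not all(parts):
        return None
    return parts[0], parts[1]


# Hypothetical page names, used only to show the shape of the result.
print(classify("https://tvtropes.org/pmwiki/pmwiki.php/Film/ExampleFilm"))
print(classify("https://tvtropes.org/pmwiki/pmwiki.php/Main/ExampleTrope"))
```

Grouping pages by the returned namespace is what allows the same story to be compared across its film, literature, and video game pages.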
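The anchor-tag counting described above can be sketched as follows. This is not the study's Scrapy spider: it swaps in Python's built-in `html.parser` so the example runs without extra dependencies, and the HTML snippet, page names, and the `WORK_NAMESPACES` subset are all assumptions made for illustration.

```python
from html.parser import HTMLParser

PREFIX = "/pmwiki/pmwiki.php/"
WORK_NAMESPACES = {"Film", "Literature"}  # assumed subset; extend as needed


class WorkLinkCollector(HTMLParser):
    """Collect hrefs of anchor tags that point at work pages."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if PREFIX not in href:
            return
        # The namespace is the path segment right after the wiki prefix.
        namespace = href.split(PREFIX, 1)[1].split("/", 1)[0]
        if namespace in WORK_NAMESPACES:
            self.links.append(href)


# Hypothetical fragment of a trope page's example list.
sample_html = """
<ul>
  <li><a href="/pmwiki/pmwiki.php/Film/ExampleFilm">Example Film</a></li>
  <li><a href="/pmwiki/pmwiki.php/Literature/ExampleNovel">Example Novel</a></li>
  <li><a href="/pmwiki/pmwiki.php/Main/SomeTrope">Some Trope</a></li>
</ul>
"""

collector = WorkLinkCollector()
collector.feed(sample_html)

total_works = len(collector.links)  # all work links on the page
film_works = sum("/Film/" in link for link in collector.links)  # films only
print(total_works, film_works)
```

Because the filter operates on the `href` attribute rather than on link text, a work whose title is phrased differently in the prose is still counted.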
We can conduct a similar exercise when counting the number of tropes associated with a work. By looking at the source of a work’s list of examples of tropes, we can count the number of URLs that follow the structure – https://tvtropes.org/pmwiki/pmwiki.php/Main/[TropeName] – to get the number of tropes associated with the work. Unfortunately, trope pages are not the only pages that follow that URL structure, and other issues arise due to the presence of index links at the bottom of each trope and works page that introduce URLs that have a similar structure but are actually unassociated with the work. Due to the common use of this URL structure and the presence of multiple indices in the index section of each page, there is also a higher risk of duplicate links to trope pages being present, requiring an additional cleanup phase to remove duplicates so that the final counts only contain the number of unique tropes listed on the page.
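A minimal sketch of this cleanup step is shown below. It assumes the hrefs have already been collected from a work's page and that a set of known index page names is available to exclude; the function name `unique_tropes`, the `known_index_pages` parameter, and all page names are hypothetical.

```python
from urllib.parse import urlparse

TROPE_PREFIX = "/pmwiki/pmwiki.php/Main/"


def unique_tropes(hrefs, known_index_pages=frozenset()):
    """Return unique trope page names in first-seen order, skipping index pages."""
    seen = {}
    for href in hrefs:
        # urlparse drops any #fragment or ?query from the path component.
        path = urlparse(href).path
        if not path.startswith(TROPE_PREFIX):
            continue
        name = path[len(TROPE_PREFIX):]
        if not name or name in known_index_pages:
            continue
        seen.setdefault(name, None)
    return list(seen)


# Hypothetical links as they might appear on a work's page.
hrefs = [
    "/pmwiki/pmwiki.php/Main/TropeX",
    "https://tvtropes.org/pmwiki/pmwiki.php/Main/TropeX#anchor",  # duplicate
    "/pmwiki/pmwiki.php/Main/TropeY",
    "/pmwiki/pmwiki.php/Main/SomeIndex",  # index link at the bottom of the page
    "/pmwiki/pmwiki.php/Film/ExampleFilm",  # not a trope URL
]

tropes = unique_tropes(hrefs, known_index_pages={"SomeIndex"})
print(len(tropes), tropes)
```

The final count for the work is the length of the returned list, so duplicate and index links no longer inflate it.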
The following set of pages formed the science fiction tropesphere for this study:
• Two lists contained on the “Science Fiction” main page:
a. Fifteen Science Fiction “medium” pages: Science Fiction Animated Films, Science Fiction Anime & Manga, Science Fiction Comic Books, Science Fiction Fanfic, Science Fiction Films, Science Fiction Literature, Science Fiction Podcasts, Science Fiction Radio, Science Fiction Series, Science Fiction Tabletop Games, Science Fiction Video Games, Science Fiction Visual Novels, Science Fiction Web Originals, Science Fiction Webcomics, and Science Fiction Western Animation.
b. Twenty-five Science Fiction “subgenre” pages: Alien Works, Alien Invasion, Apocalyptic and Post-Apocalyptic Fiction, Cyberpunk, Cyberpunk for Flavor, Kaiju, Mecha Show, Military Science Fiction, Mutant Media, New Wave Science Fiction, New Weird, Pastoral Science Fiction, Planetary Romance, Post-Cyberpunk, Punk Punk, Real Robot Genre, Robot and AI Works, Science Fantasy, Sci-Fi Horror, Space Opera, Space Western, Steampunk, Super Robot Genre, Transhumans in Space, and Video Game Stories.
• Speculative fiction tropes index:
a. This page is inclusive of not only science fiction tropes but also other speculative fiction genres such as horror and fantasy. The reason for using this page is that it is the most exhaustive list of tropes consistent within science fiction while also providing space for capturing tropes that, while associated with other genres, may have consistent themes as they relate to the speculative arts.
• Futuristic tech index:
a. The Futuristic Tech Index page contains a list of futuristic tech tropes as well as a list of subcategories. For the sake of the study, we pulled the list of tropes from both the primary index page and the subcategory pages.
b. The 12 Futuristic Tech Index subcategories are Autonomous and Artificial Appendage Index, Faster than Light Index, Magical Computer, Mecha Tropes, Our Clones Are Identical, Radioactive Tropes, Ranged Energy Attack Tropes, Robot Roll Call, Spacecraft, Teleportation Tropes, Time Travel Tropes, and Transhuman.
Due to the consistency in the site URLs and the HTML structure of each page, web scraping tools can pull the aggregated information across all of the works and tropes within these pages focused on science fiction and speculative fiction.
Selecting a Web Scraping Tool
All attempts at this research used Python as the primary programming language due to its proven efficacy in extracting web content and data analysis and exporting. Numerical operations performed on the data relied on Python's native math functions and the numpy and pandas libraries. ChatGPT’s GPT-4 model was used to recommend and troubleshoot code throughout the process to make up for any limitations in the author’s coding skills.
For web scraping, the Scrapy library was used to build web spiders to extract the relevant information described above. The extracted data was added to a CSV file containing the name of the trope or work in one column and the URL for that item in the other column. A spider can then visit each URL stored in the spreadsheet and pull the examples of tropes or works listed on that page, along with the URL of each, so that they are available as a reference. The spider also records the total number of examples in a “count” column in the CSV. Because the list of works or tropes from each page is stored alongside its count, the counts can be double-checked later by manual counting, and one can quickly confirm that a work was pulled from the page by searching within the contents of that cell.
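The CSV layout described above might look like the following sketch. The column names, the "; " separator for the stored list, and the sample row are assumptions; an in-memory buffer stands in for the CSV file so the example is self-contained.

```python
import csv
import io

# Hypothetical scraped record: one trope page and the works listed on it.
rows = [
    {
        "name": "Example Trope",
        "url": "https://tvtropes.org/pmwiki/pmwiki.php/Main/ExampleTrope",
        "examples": ["Film/ExampleFilm", "Literature/ExampleNovel"],
    },
]

buffer = io.StringIO()  # stand-in for an open CSV file
writer = csv.DictWriter(buffer, fieldnames=["name", "url", "examples", "count"])
writer.writeheader()
for row in rows:
    writer.writerow(
        {
            "name": row["name"],
            "url": row["url"],
            # Keep the full list in one cell so it can be searched later.
            "examples": "; ".join(row["examples"]),
            "count": len(row["examples"]),
        }
    )

# Re-read the data and check each stored count against its stored list.
records = list(csv.DictReader(io.StringIO(buffer.getvalue())))
for record in records:
    listed = record["examples"].split("; ") if record["examples"] else []
    assert int(record["count"]) == len(listed)
```

Storing the list and the count side by side is what makes the later manual spot checks possible.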
The process is repeated for each of the medium pages to get a comprehensive list of tropes associated with each work by medium. The same process is then repeated for the speculative fiction tropes index and the futuristic tech index and its subcategories to get a list of works associated with each trope contained in those indices.
As mentioned, some data cleanup is required after extraction, for example to handle pages that split examples across multiple subpages and pages that mention works or tropes multiple times, which leads to overcounting. Once the cleanup steps are finished, the CSVs are ready for analysis.
Ethical Considerations of Web Scraping
When employing web scraping techniques, as exemplified by the use of the Scrapy framework in our study, it is crucial to address the ethical considerations and compliance aspects. Web scraping, while a powerful tool for data collection, intersects with several ethical and legal issues, primarily related to data privacy, website integrity, and intellectual property rights.
• Adherence to robots.txt Files: A fundamental aspect of ethical web scraping practices involves respecting the directives specified in the robots.txt file of a website. In our study, Scrapy was configured to comply with these directives (‘ROBOTSTXT_OBEY’: True), ensuring that our data collection process did not infringe upon the website's designated scraping policies.
• Managing Server Load: To mitigate any potential negative impact on the target website's server, measures were taken to prevent overloading. This was achieved through the implementation of a download delay (‘DOWNLOAD_DELAY’: 2).
• Data Privacy and Use: In the context of data privacy, especially when dealing with personally identifiable information (PII), ethical considerations are paramount. Our study, however, focuses on publicly available data that does not include PII, thus mitigating privacy concerns.
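For reference, the two settings named above can be written as a plain Python dictionary of the kind Scrapy accepts in a project's settings or in a spider's `custom_settings` attribute. The helper `estimated_crawl_seconds` is a hypothetical addition that gives only a rough sense of how the delay stretches a crawl; it ignores response times and any other throttling.

```python
# Settings quoted in the text; DOWNLOAD_DELAY is expressed in seconds.
SCRAPY_SETTINGS = {
    "ROBOTSTXT_OBEY": True,  # respect the site's robots.txt directives
    "DOWNLOAD_DELAY": 2,     # wait between consecutive requests to the site
}


def estimated_crawl_seconds(n_pages, settings=SCRAPY_SETTINGS):
    """Rough crawl duration from the delay alone (hypothetical helper)."""
    return n_pages * settings["DOWNLOAD_DELAY"]


# Example: a hypothetical crawl of 1,000 pages.
print(estimated_crawl_seconds(1000))
```

The trade-off is deliberate: a longer crawl in exchange for a lighter load on the community-run site.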
A Quantitative Exploration of Science Fiction Media and Tropes on TVTropes.org
TVTropes.org also has pages that are neither tropes nor works of media; another type of page is an “Index” page that lists all of the works or tropes that fit specific criteria. For example, the “Speculative Fiction Tropes” page indexes all the tropes users have categorized as speculative fiction on the site. Another page, “Futuristic Tech Index,” provides an index of tropes related to futuristic technology.
Using the “Futuristic Tech Index” page, we can count the number of tropes related to futuristic tech on the site. The Futuristic Tech Index page itself already has a list of tropes, but the page also contains a list of futuristic tech subcategories.
Science Fiction Subgenres on TVTropes.org.
On the “Science Fiction” page of TVTropes.org, there are two lists: a list of science fiction subgenres and a list of science fiction mediums. If we list each of the subgenres and then pull the list of works listed on each of these pages, we see the number of works linked to each subgenre on TVTropes.org:
It’s worth noting here that Punk Punk contains Cyberpunk and Steampunk, but also other “punk” subgenres of speculative fiction like solarpunk and dieselpunk. Mecha and Kaiju are primarily focused on the robot and monster genres, respectively, as they show up in Asian media. Other interesting nuances between subgenres involve Super Robot versus Real Robot, which are primarily subgenres differentiated by the origin of the mecha featured in each work.
The intersection of science fiction and horror is also worth noting with Sci-Fi Horror specifically having the fifth highest number of example works among science fiction subgenres. As demonstrated later, the concerns about abuse and misuse of technologies that are often depicted in sci-fi horror stories can become the primary public concerns or debates surrounding a technology or the primary trope upon which specific sectors of society inform their worldview. Polak pointed out that science fiction in particular can foster skepticism toward new technology by making “crystal clear what the fatal consequences of the continued development of science and technology might be that it revives the old idea of a moratorium on further scientific research” (Polak and Boulding 1973). Science fiction horror often confronts the viewer with the “fatal consequences” Polak mentions, and later on the paper will explore how this is already manifesting in current public discourse about artificial intelligence.
By counting the list of works on each science fiction medium page, we get the following assessment of how many science fiction works are documented on the site for each medium:
Science Fiction Media on TVTropes.org.
Exploring Artificial Intelligence & Robotics Tropes
AI & Robot Tropes by Frequency.
The Robot Roll Call page has roughly 200 AI and robotics tropes listed. The top 25 tropes by number of works featuring those tropes come out as follows:
As mentioned earlier, Sci-Fi Horror as a trope gives way to sub-tropes that focus on fears or skepticism towards specific technologies. AI Is A Crapshoot is one such subtrope of Sci-Fi Horror. AI Is A Crapshoot, in particular, is worth calling out as artificial intelligence gains special prominence in media and popular culture following the release of publicly available large language models. The trope concerns itself with the inherent unpredictability of artificial intelligence and the somewhat predictable storytelling outcome of the artificial intelligence misbehaving in some way. This trope lies at the center of the debate around existential risks and the ethics of pursuing artificial general intelligence. The high frequency of this trope in popular media, as evidenced by how frequently it appears on TVTropes.org, demonstrates that the ubiquity of concerns about existential risk should not be surprising.
However, evidence exists to demonstrate that tools currently falling under the “artificial intelligence” umbrella, like large language and diffusion models, are already causing actual harm to individuals and society (Bender and Hanna 2023). In particular, deepfake technologies that digitally replicate the likeness of real people are already being used to spread disinformation at scale. The robot and AI tropes index on TVTropes.org does not contain any tropes related to misinformation. The closest we get to tropes of AI-based deception is “The Computer Is a Cheating Bastard” or “My Rules Are Not Your Rules.” However, these do not speak to ideas of artificial intelligence spreading or being used to spread mass misinformation.
AI & Robots Works by Trope Count & Medium.
While this could be evidence of a blind spot in the TVTropes.org community’s knowledge of an AI work that does address this trope, it also seems to demonstrate the always-looming disconnect between the real world and science fiction, and the limitations this disconnect places on science fiction in particular as a predictive or imaginative tool for futurists. This speaks to Nikolova’s point that futures studies can contribute back to science fiction through its more rigorous focus on plausibility as a critical component of futures works produced for clients (Nikolova 2021).
Alternatively, by exploring the robot and AI works page, we can get a sense of media properties that have a higher density of tropes and may provide a rich jumping off point for exploring how different AI and robotics tropes play together in weaving a story. However, when we look at the 25 works with the highest trope densities from the Robot and AI works page, we see the following:
When accounting for the animated media that are TV series, we see that TV series tend to have the highest density of tropes associated with them. This higher density makes intuitive sense: while a film may only have a 70-minute to 3-hour runtime, a single season of a TV series typically runs longer than a film, which lends itself to telling more stories and using more tropes to sustain them. Video games having high trope densities also makes sense, since a single playthrough of a game can often extend beyond the typical runtime of a film. However, this points to an essential limitation of the total trope count as a metric for assessing science fiction works for applicability in futures work: the number of non-science fiction tropes on TVTropes.org is so high that a high total trope count for an individual work says little about its usefulness as a narrative of futurity or about the concentration of tropes related to the topic of study.
Instead, filtering the total tropes in a work down to a subset of tropes relevant to the study one is conducting is often a more relevant approach. By using a filtered merge between different datasets collected from the site, a work can be analyzed by how many tropes it contains within a specific subset.
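The filtering step can be sketched as a simple intersection between a work's trope list and a trope index. The version below uses plain Python sets rather than a dataframe merge, which is a swapped-in simplification and not necessarily how the study's own merge was written. Apart from “AI Is A Crapshoot,” which is named in this paper, the trope names and both works are hypothetical placeholders.

```python
def filter_tropes(work_tropes, index_tropes):
    """Return the tropes of a work that also appear in a trope index.

    Order follows the work's own list, and repeated tropes are kept once.
    """
    index_set = set(index_tropes)
    seen = set()
    kept = []
    for trope in work_tropes:
        if trope in index_set and trope not in seen:
            seen.add(trope)
            kept.append(trope)
    return kept


# Hypothetical index of AI and robot tropes.
ai_index = ["AI Is A Crapshoot", "Hypothetical Robot Trope", "Hypothetical AI Trope"]

# Hypothetical trope lists for two works.
work_a = ["Hypothetical Chase Trope", "AI Is A Crapshoot"]
work_b = [
    "AI Is A Crapshoot",
    "Hypothetical Robot Trope",
    "Hypothetical AI Trope",
    "Hypothetical Chase Trope",
]

relevant_a = filter_tropes(work_a, ai_index)
relevant_b = filter_tropes(work_b, ai_index)
print(len(relevant_a), len(relevant_b))
```

Comparing the lengths of the filtered lists, rather than the total trope counts, shows which work is denser in the subset of tropes under study.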
Filtering Tropes for Greater Clarity in the Terminator Films
TVTropes.org can also be used, and often is used, to dive deeper into a specific trope or work. Following the theme of AI/Robot-based works and tropes, we can reuse our script to pull content from a single page to gather a list of all tropes associated with the film The Terminator as a foundation for analysis.
AI and Robot Tropes Present in the Film The Terminator.
AI and Robot Tropes Present in the Film Terminator 2.
However, how useful is The Terminator for exploring tropes about robotics and artificial intelligence? In the previous section, the limitations of the total trope count demonstrated the need for filtering a work’s trope list to dive deeper into its relevance to a topic. If we conduct a filtered merge between the list of tropes in The Terminator and the list of tropes contained in the “Robot Roll Call” trope index of roughly 200 AI and robotics tropes, we see that The Terminator only contains seven robot/AI tropes:
The Terminator really made the most of the few robot and AI tropes it actually contained, but the list above demonstrates that there isn’t much nuance in how robots and artificial intelligence are depicted in the original film. However, doing the same analysis of Terminator 2, we find that the second film has much more to offer when exploring storytelling around robots and AI. Here are the 26 Robot and AI tropes that occur in Terminator 2:
One can see that by filtering works down on specific subsets of tropes, it becomes easier to see whether a work of science fiction may communicate or expose the reader/viewer to a greater number of relevant tropes for whatever analysis is being conducted. In the example demonstrated with The Terminator and Terminator 2, someone seeking a deep dive into artificial intelligence and robotics tropes in science fiction films may get more value out of a viewing of Terminator 2 than The Terminator, for instance.
Conclusion
The above research demonstrates how TVTropes.org information can be aggregated and explored for insights that can be aligned with futures work. By exploring TVTropes.org and using web spiders to target science fiction content on the site specifically, a science fiction tropesphere is developed for exploring works and tropes in the genre.
An introduction to TVTropes.org was provided, along with an account of how it serves as an example of citizen science that has created a database of storytelling tropes useful to researchers outside of the TVTropes community. Web scraping using Python’s Scrapy library helped extract data from the TVTropes website in an ethical manner, and a simple methodology for extracting the data was described.
After gathering the science fiction data for analysis, the paper explored the outputs of the different trope and media counts for science fiction works and tropes throughout the TVTropes site. From here, the paper went from discussing science fiction subgenres like Sci-Fi Horror, to the specific Sci-Fi Horror trope of “AI Is a Crapshoot” within AI and robotics tropes, and finally provided a comparative look at the prevalence of AI and robot tropes in the first films of the Terminator franchise.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
