Universality and diversity in world fictions

What is the origin of the massive cultural success of imaginary worlds in modern societies, from The Lord of the Rings to Harry Potter and Star Wars? When and where did fictions about the struggle between good and evil become so popular? Does the evolution of literary fictions, video games, and films follow similar patterns in terms of content and narrative devices? These questions are difficult to answer, largely because we lack a reliable way to compare thousands of fictions across media, time, space, and languages. Despite important advances in artificial intelligence, fictions remain difficult to analyze automatically (Min & Park, 2016), and the large-scale study of plots, narrative arcs, and characters is still in its infancy (Jänicke et al., 2015).


This project aims to bring together the expertise of fan communities to create an extensive, collaborative inventory of fictional works. To do so, we will proceed through two steps of densification through crowdsourcing: 1) a web-scraping step and 2) a community-based step (see Figure 1). First, we will extract thousands of fictional works distributed across all countries and time periods, from late antiquity to modernity, from online databases (e.g., Wikidata, Bibliothèque Nationale de France, Library of Congress). We will then densify this database and its basic metadata (e.g., genres, plot keywords, length in words) by web-scraping user-generated websites (e.g., IMDb, Goodreads). Second, we will involve the relevant communities, mostly literary scholars and fan communities, and ask them to enrich this public inventory, presented as an online platform, with fine-grained tags that only they can provide, describing the formal characteristics and content features of fictional works (e.g., ‘first-person narration’, ‘imaginary world’, ‘flashback’, ‘gothic horror’). Finally, we will develop specific algorithms using supervised and unsupervised methods to study the impact of a range of factors (cultural inheritance, cultural diffusion, structural constraint, etc.) on the evolution of fictions.
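As a rough illustration of the two densification steps, the sketch below shows how a single inventory entry might be enriched first with scraped metadata and then with community tags. It is a minimal sketch under stated assumptions, not the project's actual data model: the class FictionWork, the functions densify, add_tag, and consensus_tags, the field names, the two-vote threshold, and the sample values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class FictionWork:
    """One inventory entry. All field names are hypothetical."""
    work_id: str
    title: str
    year: Optional[int] = None
    genres: Set[str] = field(default_factory=set)
    keywords: Set[str] = field(default_factory=set)
    # Maps a community tag to the number of contributors who proposed it.
    community_tags: Dict[str, int] = field(default_factory=dict)


def densify(work: FictionWork, scraped: dict) -> FictionWork:
    """Step 1: merge scraped metadata into a catalogue record."""
    work.genres |= {g.strip().lower() for g in scraped.get("genres", [])}
    work.keywords |= {k.strip().lower() for k in scraped.get("keywords", [])}
    if work.year is None:
        # Only fill the year when the catalogue record lacks one.
        work.year = scraped.get("year")
    return work


def add_tag(work: FictionWork, tag: str) -> None:
    """Step 2: record one contributor's vote for a fine-grained tag."""
    tag = tag.strip().lower()
    work.community_tags[tag] = work.community_tags.get(tag, 0) + 1


def consensus_tags(work: FictionWork, min_votes: int = 2) -> List[str]:
    """Keep tags proposed by at least `min_votes` contributors (assumed rule)."""
    return sorted(t for t, n in work.community_tags.items() if n >= min_votes)


# Illustrative values only; the identifier and the scraped record are made up.
work = FictionWork(work_id="example-1", title="The Lord of the Rings")
densify(work, {"genres": ["Fantasy"], "keywords": ["Imaginary world"]})
for proposed in ["imaginary world", "Imaginary World ", "flashback"]:
    add_tag(work, proposed)
accepted = consensus_tags(work)
```

Normalizing tags before counting them is one possible design choice: it lets spelling variants from different contributors accumulate into a single vote count, which a simple agreement threshold can then filter.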

