BUNKA, a data mining engine using collective and artificial intelligence.
Charles de Dampierre, Nicolas Baumard, Andrei Mogoutov
The objective of BUNKA is to map and quantify the distribution of cultural representations (beliefs, opinions, preferences, etc.), an approach that Dan Sperber has called the epidemiology of representations, using natural language processing and machine learning tools.
These maps, first produced for social science research, can be of great help to everyone. What others think about an event or a piece of cultural content, in other words collective intelligence, is crucial to information retrieval.
The advent of the participatory Web, or Web 2.0 (Twitter, Reddit, Wikipedia, etc.), has greatly increased the amount of collective intelligence available through the reviews, opinions, and critiques posted online. However, this collective intelligence remains under-exploited, notably for technical reasons: the data is massive and extremely heterogeneous.
BUNKA is developing a new way of organizing collective intelligence using recent advances in computational sciences (natural language processing and machine learning). More precisely, we use "embedding" algorithms to project data into an abstract space according to chosen rules (semantic similarity, topological similarity, common dimension, etc.).
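To make the idea of projecting data into a space by semantic similarity concrete, here is a minimal sketch. It is not BUNKA's actual pipeline, whose embedding models are not specified here: it assumes a deliberately simple word-count embedding and cosine similarity, and the documents and the names `embed`, `cosine_similarity`, and `nearest` are hypothetical.

```python
import math
from collections import Counter

def embed(text, vocabulary):
    """Map a text to a vector of word counts over a fixed vocabulary.

    Assumption: a toy stand-in for a learned embedding model.
    """
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical toy documents, invented for illustration only.
documents = {
    "review_a": "a dark and slow science fiction film",
    "review_b": "a slow and dark science fiction novel",
    "review_c": "a cheerful cooking show for families",
}

# The shared vocabulary defines the axes of the abstract space.
vocabulary = sorted({w for text in documents.values() for w in text.lower().split()})
vectors = {name: embed(text, vocabulary) for name, text in documents.items()}

def nearest(name):
    """Return the (name, similarity) of the document closest to `name`."""
    others = [
        (other, cosine_similarity(vectors[name], vectors[other]))
        for other in vectors
        if other != name
    ]
    return max(others, key=lambda pair: pair[1])

print(nearest("review_a"))
```

In this toy space, the two science fiction reviews land close together and the cooking review far from both. A real system would presumably replace the word counts with a learned embedding, but the principle of placing similar items near each other is the same.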
The second key idea of BUNKA is that visualization is essential for discoverability. To improve autonomy, diversity, transparency, and serendipity, users need to see all the available options and their relationships to each other. In other words, for each query and each piece of content, we need a map of the Web. We aggregate the reviews, opinions, and critiques posted on the Web by users, and we create maps and contextual representations that, for each query about a work or a theme, show all the associated cultural contents, as well as the opinions, popularity, and relationships between these contents. With all the options visible on the map, in particular the minority and less visible ones, users can free themselves from search algorithms and explore on their own the themes and dimensions that interest them.
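The aggregation step described above can be sketched as follows. This is an illustrative assumption, not BUNKA's implementation: it supposes reviews arrive as (user, content, rating) triples, measures popularity as the number of reviews, opinion as the mean rating, and the relationship between two contents as the Jaccard overlap of their reviewers. The data and the function name `build_map` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (user, content, rating) triples, invented for illustration.
reviews = [
    ("ana", "Film X", 5),
    ("ana", "Novel Y", 4),
    ("ben", "Film X", 3),
    ("ben", "Novel Y", 5),
    ("ben", "Album Z", 2),
    ("cleo", "Album Z", 4),
]

def build_map(reviews):
    """Aggregate reviews into map nodes (contents) and edges (relationships)."""
    ratings = defaultdict(list)
    reviewers = defaultdict(set)
    for user, content, rating in reviews:
        ratings[content].append(rating)
        reviewers[content].add(user)

    # One node per content: popularity and average opinion.
    nodes = {
        content: {"popularity": len(r), "mean_rating": sum(r) / len(r)}
        for content, r in ratings.items()
    }

    # One edge per pair of contents sharing at least one reviewer,
    # weighted by Jaccard similarity of the two reviewer sets.
    edges = {}
    for a, b in combinations(sorted(reviewers), 2):
        shared = reviewers[a] & reviewers[b]
        if shared:
            edges[(a, b)] = len(shared) / len(reviewers[a] | reviewers[b])
    return nodes, edges

nodes, edges = build_map(reviews)
for content, stats in sorted(nodes.items()):
    print(content, stats)
for pair, weight in sorted(edges.items()):
    print(pair, round(weight, 2))
```

The nodes and weighted edges form a small graph that a layout algorithm could then draw as a map. Less popular contents remain present as nodes rather than being ranked out of view.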
Ultimately, BUNKA is an "exploration engine": a tool that does not return a single result, as a search engine does, but a set of information that lets users explore on their own, autonomously and transparently.