
In the digital humanities, the word “crowdsourcing” has become synonymous with getting work done by a crowd of people. By drawing on human contributors, we can achieve far better results on certain tasks than computers allow. Yet there still exist tasks that require so much specialized knowledge that they can be neither spread across a huge crowd nor executed by a computer. These call for a very specific kind of crowdsourcing that relies on “knowledge communities”.

In the first abstract, “What Do You Do With A Million Readers?” [1], the crowdsourcing was done in a rather unusual way. Instead of recruiting a crowd to do the work, the authors used data that was already available, online book reviews, to learn how people read. After isolating the most frequently reviewed books (over half a million ratings) and selecting sixteen of them for the broad disparity in their structures, characters, and relationships, the study aimed to summarize each book, based only on those reviews, in the form of a character graph like the one below.


The study also aimed to visualize how readers interact with a book, and to use the metadata available in the reviews to differentiate classes of readers (e.g., female vs. male, old vs. young) and analyse which aspects of a story different types of reviewers tend to comment on.
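As a rough illustration of what a review-derived character graph could look like, here is a minimal Python sketch. It is not the authors' method, which the abstract summary above does not detail. It simply assumes that two characters are linked whenever they are mentioned in the same review, and that the edge weight is the number of such reviews. The function name `character_graph`, the character names, and the toy reviews are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def character_graph(reviews, characters):
    """Count how often each pair of characters is mentioned in the same review.

    Assumption for illustration: co-mention in one review means a relationship.
    Matching is a naive case-insensitive substring test.
    """
    edges = Counter()
    for review in reviews:
        text = review.lower()
        # Sort the mentioned names so every pair has one canonical order.
        mentioned = sorted(name for name in characters if name.lower() in text)
        for pair in combinations(mentioned, 2):
            edges[pair] += 1
    return edges


# Hypothetical toy reviews, invented for this example only.
reviews = [
    "Anna and Boris carry the whole story.",
    "I liked Boris, but Clara steals every scene she shares with Anna.",
    "Clara is the most tragic figure.",
]
graph = character_graph(reviews, ["Anna", "Boris", "Clara"])

for (a, b), weight in sorted(graph.items()):
    print(f"{a} -- {b}: mentioned together in {weight} review(s)")
```

Under the same assumption, the reader classes mentioned above could be compared by calling the function once per subset of reviews (for example, once per age group) and comparing the resulting edge weights.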

The second abstract, “From Crowdsourcing to Knowledge Communities: Creating Meaningful Scholarship Through Digital Collaboration” [2], aims to distinguish between crowdsourcing and knowledge communities, that is, essentially between large-scale and small-scale crowdsourcing. The authors describe and analyse three projects they were involved in, along with the approaches they took and the communities they worked with. Those projects engaged different crowds: the paid community of Amazon Mechanical Turkers, the expert community of train enthusiasts, and the heterogeneous public of people interested in the Bay Area. Interestingly, each project was only partially successful. The study suggests that when researchers have flexibility in their projects, collaboration with knowledge communities can produce original results. Without this flexibility, crowdsourcing remains possible but is much more limited, and works only with a crowd unrelated to the project (e.g., Amazon Turkers).

The third abstract, “Nichesourcing The Uralic Languages For The Benefit Of Linguistic Research And Lingual Societies” [3], focuses on those knowledge communities and on how they were established during a project that aimed to support both linguistic research and linguistic diversity. When digitizing documents in various endangered languages, it seems obvious that a “random crowd” cannot be used, hence the need to appeal to niche communities that can help with the research work. Although those communities provide smaller pools to draw resources from, their very specific skills and the high expectations placed on the output yield high-quality results. Moreover, those communities benefit from the results, so they identify with the task and find a purpose in it, unlike Amazon Turkers, for example. The paper’s goal is to explain why the work cannot be done by other means and how the author established crowdsourcing methods that involve the niche communities.

Crowdsourcing is growing stronger and stronger, making it possible to tackle vast tasks efficiently. But there exist very different approaches to it, from huge random crowds, with neither flexibility nor interaction with the researchers, to niche communities that benefit from the research and can carry out diverse, highly skilled tasks. In the first abstract, one advantage was that the crowd's work had already been done. The task did not require any particular knowledge, so a large random crowd was the best choice. In the third abstract, by contrast, the task is highly knowledge-dependent and can only be carried out by very specific small crowds. The second abstract brings the two observations together: it emphasises this trade-off between quantity (crowd) and quality (niche), and the fact that no approach produces perfect answers to the given problems.

To conclude, crowdsourcing is a very powerful tool that can be applied to various digital humanities problems. When considering this approach, however, it is crucial to determine, from the problem at hand, what kind of community to involve and what kind of tasks to ask of it.



[1] R. Bandari, T. Tangherlini, and V. Roychowdhury: What Do You Do With A Million Readers?

[2] J. Voss, G. Wolfenstein, F. Zephyr, R. Heuser, K. Young, and N. Stanhope: From Crowdsourcing to Knowledge Communities: Creating Meaningful Scholarship Through Digital Collaboration

[3] J-P. Hakkarainen: Nichesourcing The Uralic Languages For The Benefit Of Linguistic Research And Lingual Societies