
One of the current trends (and possibly one of the ultimate goals) of the science of “Digital Humanities” is identifying the “story” in readily available sources and reconstructing it in an organized manner for availability on the global digital platform. The main components of a story, its events, are formed by the characters, their interactions, and the settings of those interactions. Once these events are placed in temporal order, what we have is a story.

Trend Breakdown and Major Problems

For the extraction and graphical construction of stories, humanists employ graph theory and machine learning strategies to derive from the texts the story’s characters (graph vertices) and their interactions (graph edges). These are then temporally indexed and grouped into events (sub-graphs), completing the story (the full graph).
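As a minimal sketch of this representation (with invented character names and events, not data from any of the articles), a story can be held as a temporally ordered list of sub-graphs, each a set of interaction edges between character vertices:

```python
from collections import defaultdict

# A story graph: characters are vertices, interactions are edges.
# Events are temporally ordered sub-graphs (lists of interaction edges).
story = [
    [("Alice", "Bob"), ("Bob", "Carol")],  # event 0
    [("Alice", "Carol")],                  # event 1
]

def characters(story):
    """Collect the vertex set: every character seen in any event."""
    verts = set()
    for event in story:
        for a, b in event:
            verts.update((a, b))
    return verts

def interaction_counts(story):
    """Count how often each unordered pair interacts across the story."""
    counts = defaultdict(int)
    for event in story:
        for edge in event:
            counts[frozenset(edge)] += 1
    return counts

print(sorted(characters(story)))  # ['Alice', 'Bob', 'Carol']
```

The temporal order is carried implicitly by the list index, which is what makes the sequence of sub-graphs a story rather than a bare interaction network.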

Three major challenges that may be encountered when doing such research toward the aforementioned goal of story construction are as follows:

  1. Attributing respective degrees of impact to the events relative to the story
  2. Abundance of comprehensive source candidates
  3. Difficulty in morphological analysis for name dictionary construction

To address these challenges, we will discuss three articles (one for each challenge) presented at the Digital Humanities conference DH2015. Afterwards, we will discuss how tackling these challenges simultaneously in a given task could let the approaches support each other for improved comprehensiveness.


  • Attributing respective degrees of impact to the events relative to the story

An article titled “What Do You Do With A Million Readers?” addresses this issue. [1]

The authors benefit from the virtually limitless bulk of information available on crowd-sourced sites such as blogs, forums, and social networks. In their research, they first selected the 16 most popular books of fiction according to the Goodreads site and, for each, downloaded 3,000 reviews from that website. From the reviews, they created story graphs that act as summaries of their respective books.

After the construction, each graph’s performance was measured by its precision (completeness) and its false-detection rate (accuracy). As an example, The Hobbit scored 0.6 in precision and 0.13 in false-detection rate.

We can see that the precision is sufficiently high considering that the gathered texts promised to review the books, not to summarize them; it is reasonable for them not to be comprehensive. For the false-detection rate, comparing the graphs with the actual storytelling reveals a pattern: some acts (interactions from events) were attributed not to the actual active subject but to the subject with whom the passive object in question had the most interactions over the whole story. The primary example is the mistaken attribution of Bilbo as the slayer of Smaug in the article’s results.

The distinguishing aspect of this research’s results was that the width of each graph edge was scaled in proportion to the confidence in the occurrence of that exact action, derived from the overall frequency with which it was mentioned in the reviews. Thus, the application can place more emphasis on certain events than on others. Furthermore, with access to information about the reviewers, graphs specialized to certain demographic segments (age, gender, …) could also be generated. It is even possible to weight the review content by the seniority of the reviewers. These are all potential future improvements on the current results seen in the article.
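The frequency-to-width scaling can be sketched as follows. The mention counts here are invented for illustration, not figures from the article; the idea is only that the most-mentioned interaction receives the maximum edge width and the rest scale proportionally:

```python
from collections import Counter

# Hypothetical counts of how many reviews mention each interaction.
mentions = Counter({
    ("Bilbo", "Gollum"): 240,
    ("Bilbo", "Smaug"): 180,
    ("Thorin", "Smaug"): 30,
})

def edge_widths(mentions, max_width=10.0):
    """Scale edge widths in proportion to mention frequency,
    a proxy for confidence that the interaction occurred."""
    top = max(mentions.values())
    return {edge: max_width * n / top for edge, n in mentions.items()}

widths = edge_widths(mentions)
# widths[("Bilbo", "Gollum")] == 10.0, the most-mentioned interaction.
```

A demographic-specific graph would simply restrict `mentions` to the reviews of one reader segment before scaling.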

  • Abundance of comprehensive source candidates

For this issue, we will consult the following article: “Automated Comparison of Narrative and Character Function Similarity Using Graph Theory”. [2]

In this article, the researchers collected a multitude of variations of a story and, after analyzing them, attempted to deduce whether the main characters in each variation could be matched correctly. They used natural language processing tools to extract character and event information. They then separated the events temporally into sub-graphs and generated the adjacency matrix of each sub-graph (set of events); the adjacency matrix is the 0-1 indicator matrix recording whether a pair of nodes is connected by an edge. Afterwards, they stacked these 2-D matrices into 3-D matrices, with the third dimension being the temporal order. They cross-referenced the matrices obtained from each narrative via a set of matrix calculations to generate a correlation matrix with similarity scores for each possible match-up of characters.
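A toy version of this pipeline, under simplifying assumptions not made by the article (equal character counts and a shared character ordering across narratives, and a naive agreement count standing in for their matrix calculations), might look like:

```python
def adjacency(event_edges, chars):
    """0-1 adjacency matrix of one event (sub-graph)."""
    idx = {c: i for i, c in enumerate(chars)}
    n = len(chars)
    A = [[0] * n for _ in range(n)]
    for a, b in event_edges:
        A[idx[a]][idx[b]] = A[idx[b]][idx[a]] = 1
    return A

def stack(events, chars):
    """3-D stack: one adjacency matrix per temporal slice."""
    return [adjacency(e, chars) for e in events]

def similarity(stack_a, i, stack_b, j):
    """Fraction of (time slice, co-character) slots on which character i
    of narrative A and character j of narrative B agree — a toy stand-in
    for the article's cross-referencing of the 3-D matrices."""
    agree = total = 0
    for A, B in zip(stack_a, stack_b):
        for x, y in zip(A[i], B[j]):
            agree += x == y
            total += 1
    return agree / total
```

Evaluating `similarity` over every cross-narrative character pair fills the correlation matrix described above.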

We can analyze their results as follows. From a naive Bayesian standpoint, the derived correlation parameters did not immediately collapse into a complete one-to-one correspondence between the main characters of the separate variations. However, the results can potentially be fully utilized by deriving an overall similarity parameter to decide whether the stories overlap. Once that decision is made, we can match the characters by maximum likelihood, with the imposed condition of at most one match per character in a pair-wise matching setting.
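The one-match-per-character condition can be enforced with an assignment step. The sketch below uses a simple greedy strategy over hypothetical similarity scores; a full maximum-likelihood treatment would instead use an optimal assignment method such as the Hungarian algorithm:

```python
def match_characters(scores):
    """Greedy one-to-one matching: repeatedly take the highest remaining
    similarity score, enforcing at most one match per character.
    `scores` maps (char_a, char_b) -> similarity in [0, 1]."""
    pairs = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    used_a, used_b, matching = set(), set(), {}
    for (a, b), s in pairs:
        if a not in used_a and b not in used_b:
            matching[a] = b
            used_a.add(a)
            used_b.add(b)
    return matching

# Hypothetical scores between characters of two story variations.
scores = {
    ("hero_A", "hero_B"): 0.9,
    ("hero_A", "villain_B"): 0.4,
    ("villain_A", "hero_B"): 0.3,
    ("villain_A", "villain_B"): 0.8,
}
# → {'hero_A': 'hero_B', 'villain_A': 'villain_B'}
```

An overall similarity parameter for the story-overlap decision could then be, for instance, the mean score of the matched pairs.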

Before switching to the next challenge, let us note the following. In the article, it was claimed, within reason, that the method can also compare unrelated texts and display a poor match. The reason they chose to compare variations of the same story was to demonstrate the system’s performance in the presence of a correct match.

  • Difficulty in morphological analysis for name dictionary construction

The article titled “Detection of People Relationship Using Topic Model from Diaries in Medieval Period of Japan” will be used to discuss this challenge in the remainder of this section. [3]

Personal names, the textual representations of story characters, can appear quite arbitrary from the analyzer’s perspective when morphological analysis is impractical (as was the case for medieval Japanese). In such a case, an exhaustive search over a name dictionary could work; however, no such dictionary was available either. What the authors did instead was extract names by string-sequence pattern matching.

It seems that, for this purpose, they created a supervised learning system with a training data set that included patterns of personal and non-personal names in context. They then fed it randomly chosen test data from a collection of texts, and whenever they succeeded in finding a match with a pattern in the training set, they predicted a personal name from the text. In the opposite situation, when they failed to find a match, meaning they came across a potential event but were unable to extract the personal name due to a lack of knowledge of that particular event’s textual representation style, they manually decomposed it and added it to the training data set. After a few such feedback iterations, they were able to extract 95% of the personal names.
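The feedback loop can be sketched with context patterns expressed as regular expressions. Everything here is invented for illustration: the patterns, the example sentences, and the names are stand-ins in English, not the article’s actual Japanese diary data or method details:

```python
import re

# Hypothetical context patterns learned from the training set:
# each regex captures a personal name from its surrounding context.
patterns = [re.compile(r"Lord (\w+)"), re.compile(r"(\w+) wrote")]

def extract_names(texts, patterns):
    """Apply the known patterns; return extracted names and the texts
    where no pattern matched (candidates for manual decomposition)."""
    names, unmatched = set(), []
    for t in texts:
        hits = [m for p in patterns for m in p.findall(t)]
        if hits:
            names.update(hits)
        else:
            unmatched.append(t)
    return names, unmatched

texts = ["Lord Sanetaka attended", "Tokitsugu wrote a letter",
         "a visit by Harumoto"]
names, unmatched = extract_names(texts, patterns)
# `unmatched` holds "a visit by Harumoto"; a human decomposes it
# and adds the new context pattern, then the loop is rerun:
patterns.append(re.compile(r"visit by (\w+)"))
names2, unmatched2 = extract_names(texts, patterns)
```

Each iteration shrinks the unmatched pile, which is how the reported coverage climbs toward the 95% mark.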

This research is of significant importance for cases where the language in question is unconventional and few generalizing rules can be derived for computerized systems.


Even though, ultimately, all of the articles share the same goal of constructing the story, and the tools they employ are not very divergent, they tackle different problems.

With the applications from the first article [1], we are able to create a summary based on the reviews of a large set of readers. These reviews are no more than comments, and the overall frequency with which events are mentioned shows how memorable those events were to the readers. This can be an indication of the events’ importance in the overall story, their density of sensation, and so on. In essence, the application here partially creates a story from a plentiful quantity of secondary sources that greatly lack completeness, and additionally gives insight into the impact on the readers (the consumers of the literature), which is not found in the original text and is not attempted by the other methods.

In the second article [2], the problem is defined quite differently: we try to see whether different literary sources are telling the same story without manually going through them. The available data sets are more appropriate sources for story construction than the user reviews in [1], and once a match is found, the comparison between variations could be used to identify events missing from one variation but present in another.

With the method from the third article [3], we can include literature from languages that are more challenging in morphological terms, like the medieval Japanese exemplified above. This can be utilized to expand the subject universes of articles [1] and [2].


We can combine the applications of all three to reach new heights. Via [3], independent of the language present, we can obtain a story almost fully (approximately more than 95%) divisible into events, characters, and interactions. Additionally, we can make a preliminary construction of the story graph, including the unique contribution of the readers’ emphasis, with [1], and afterwards compile such constructions for comparison and correspondence with [2]. Consequently, we can possibly create larger, more detailed stories.


  1. “What Do You Do With A Million Readers?” by Roja Bandari, Timothy Roland Tangherlini, Vwani Roychowdhury.
  2. “Automated Comparison of Narrative and Character Function Similarity Using Graph Theory” by Ben Miller, Ayush Shrestha, Jennifer Olive, Shakthidhar Gopavaram.
  3. “Detection of People Relationship Using Topic Model from Diaries in Medieval Period of Japan” by Taizo Yamada, Satoshi Inoue.