Automated analysis of narrative text using network analysis in large corpora

In recent years there has been increased interest in the computational social sciences, digital humanities and political sciences in performing automated quantitative narrative analysis (QNA) of text at large scale, by studying the actors, actions and relations in a given narrative. Social scientists have long relied on news media content to study opinion biases and to extract socio-historical relations and events. Yet to perform such analysis they have had to face labour-intensive coding, in which basic narrative information is extracted from text and annotated by hand. This PhD thesis addresses this problem using a big-data approach based on automated information extraction with state-of-the-art natural language processing, text mining and artificial intelligence tools. A text corpus is transformed into a semantic network formed of subject-verb-object (SVO) triplets, and the resulting network is analysed using various theories and techniques such as graph partitioning, network centrality, assortativity, hierarchy and structural balance. Furthermore, we study the position of actors in the network of actors and actions; generate scatter plots describing the subject/object bias and positive/negative bias of each actor; and investigate the types of actions each actor is most associated with. Apart from QNA, SVO triplets extracted from text can also be used to summarize documents. Our findings are demonstrated on two corpora of English news articles, about US elections and crime, and a third corpus of ancient folklore stories from Project Gutenberg. Among potentially interesting findings, we found that the 2012 US election campaign was very much focused on 'Economy' and 'Rights', and that overall the media reported positive statements more frequently for the Democrats than for the Republicans. In the crime study we found that the network identified men as frequent perpetrators, and women and children as victims, of violent crime.
A network approach to text based on semantic graphs is a promising way to analyse large corpora of texts. By retaining relational information pertaining to actors and objects, it can reveal latent and hidden patterns, and therefore has relevance in the social sciences and humanities.
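
The idea of turning SVO triplets into a network and scoring each actor's subject/object bias can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy triplets are invented rather than drawn from the thesis corpora, the function names are hypothetical, and the bias formula (subject count minus object count, divided by their sum) is one plausible definition, not necessarily the one used in the thesis.

```python
from collections import Counter, defaultdict

# Hypothetical toy triplets (subject, verb, object); not taken from the thesis corpora.
triplets = [
    ("Obama", "support", "Economy"),
    ("Obama", "criticise", "Romney"),
    ("Romney", "criticise", "Obama"),
    ("Romney", "support", "Rights"),
    ("Media", "praise", "Obama"),
]


def build_network(triplets):
    """Return a directed multigraph as {subject: {object: [verbs]}}."""
    graph = defaultdict(lambda: defaultdict(list))
    for subj, verb, obj in triplets:
        graph[subj][obj].append(verb)
    return graph


def subject_object_bias(triplets):
    """Assumed score in [-1, 1]: +1 means always subject, -1 means always object."""
    as_subject = Counter(s for s, _, _ in triplets)
    as_object = Counter(o for _, _, o in triplets)
    actors = set(as_subject) | set(as_object)
    return {
        a: (as_subject[a] - as_object[a]) / (as_subject[a] + as_object[a])
        for a in actors
    }


network = build_network(triplets)
bias = subject_object_bias(triplets)

# Print actors from most subject-like to most object-like.
for actor, score in sorted(bias.items(), key=lambda kv: -kv[1]):
    print(f"{actor:8s} {score:+.2f}")
```

In this toy data an actor that only ever acts scores +1, while one that is only acted upon scores -1. A real pipeline would obtain the triplets from a dependency parser and would typically normalise actor names before counting.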

Identifier oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:685924
Date January 2015
Creators Sudhahar, Saatviga
Publisher University of Bristol
Source Sets Ethos UK
Detected Language English
Type Electronic Thesis or Dissertation