Statistical Text Analysis for Social Science

What can text corpora tell us about society? How can automatic text analysis algorithms efficiently and reliably analyze the social processes revealed in language production? This work develops statistical text analyses of dynamic social and news media datasets to extract indicators of underlying social phenomena, and to reveal how social factors guide linguistic production. This is illustrated through three case studies: first, examining whether sentiment expressed in social media can track opinion polls on economic and political topics (Chapter 3); second, analyzing how novel online slang terms can be very specific to geographic and demographic communities, and how these social factors affect their transmission over time (Chapters 4 and 5); and third, automatically extracting political events from news articles, to assist analyses of the interactions of international actors over time (Chapter 6). We demonstrate a variety of computational, linguistic, and statistical tools that are employed for these analyses, and also contribute MiTextExplorer, an interactive system for exploratory analysis of text data against document covariates, whose design was informed by the experience of researching these and other similar works (Chapter 2). These case studies illustrate recurring themes toward developing text analysis as a social science methodology: computational and statistical complexity, and domain knowledge and linguistic assumptions.

computational social science

natural language processing

text mining

quantitative text analysis

machine learning

probabilistic graphical models

Identifier	oai:union.ndltd.org:cmu.edu/oai:repository.cmu.edu:dissertations-1574
Date	01 August 2014
Creators	O'Connor, Brendan T.
Publisher	Research Showcase @ CMU
Source Sets	Carnegie Mellon University
Detected Language	English
Type	text
Format	application/pdf
Source	Dissertations
