The rise of big data, especially social media data (e.g., Twitter, Facebook, YouTube), offers new opportunities for understanding human behavior. Consequently, novel computing methods for mining patterns in social media data are needed. By applying such methods, it has become possible to aggregate publicly available data to capture the triggers underlying events, detect ongoing trends, and forecast future happenings. This thesis focuses on developing methods for social media analysis. Specifically, five directions are proposed: 1) semi-supervised detection of targeted-domain events, 2) the study of topical interactions among multiple datasets, 3) discriminative learning to identify common and distinctive topics, 4) epidemic modeling for flu forecasting, with simulations driven by signals from social media data, and 5) storyline generation from massive collections of unorganized documents. / Ph. D. / The rise of “big data”, especially social media data (e.g., Twitter, Facebook, YouTube), offers new opportunities for understanding human behavior. Consequently, novel computing methods for mining patterns in social media data are needed. By applying such methods, it has become possible to aggregate publicly available data to capture the triggers underlying events, detect ongoing trends, and forecast future happenings.
This dissertation presents comprehensive studies of social media data analysis. Its goals include early event detection, future event prediction, and event chain organization. Specifically, these goals are pursued through the following efforts: (1) semi-supervised and unsupervised methods are developed to collect early signals from social media data and detect ongoing events; (2) graphical models are proposed to model the interactions among multiple datasets and to compare them; (3) traditional computational methods are combined with emerging social media data analysis for fast epidemic prediction; (4) events at different timestamps are organized into event chains via novel probabilistic models. The effectiveness of these approaches is evaluated on various datasets, such as Twitter posts and news articles. In addition, case studies are provided to show the models’ capabilities in real-world exploration.
Identifier | oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/82029 |
Date | 05 February 2018 |
Creators | Hua, Ting |
Contributors | Computer Science, Lu, Chang-Tien, Reddy, Chandan K., Li, Zhenhui, Chen, Ing-Ray, Ramakrishnan, Naren |
Publisher | Virginia Tech |
Source Sets | Virginia Tech Theses and Dissertation |
Detected Language | English |
Type | Dissertation |
Format | ETD, application/pdf, application/pdf |
Rights | In Copyright, http://rightsstatements.org/vocab/InC/1.0/ |