Semisupervised sentiment analysis of tweets based on noisy emoticon labels

There is high demand for computational tools that can automatically label tweets (Twitter messages) as having positive or negative sentiment, but great effort and expense would be required to build a large enough hand-labeled training corpus on which to apply standard machine learning techniques. Going beyond current keyword-based heuristic techniques, this paper uses emoticons (e.g. ':)' and ':(') to collect a large training set with noisy labels using little human intervention and trains a Maximum Entropy classifier on that training set. Results on two hand-labeled test corpora are compared to various baselines and a keyword-based heuristic approach, with the machine learned classifier significantly outperforming both.
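The idea in the abstract can be illustrated with a minimal sketch; it is not the thesis's implementation. It assumes a tiny hand-written emoticon list, a handful of made-up example tweets, binary bag-of-words features, and a plain stochastic-gradient logistic regression, which is the two-class form of a Maximum Entropy model. The names `noisy_label`, `train`, and `predict` are hypothetical and chosen only for this example.

```python
import math
from collections import defaultdict

# Assumed emoticon lists; the thesis may use a different set.
POSITIVE = {":)", ":-)", ":D"}
NEGATIVE = {":(", ":-("}
EMOTICONS = POSITIVE | NEGATIVE


def noisy_label(tweet):
    """Return (words, label) using emoticons as noisy labels, or None.

    label is 1 for positive and 0 for negative. Tweets with no emoticon,
    or with both kinds, are skipped. Emoticons are removed from the
    features so the classifier cannot simply memorize them.
    """
    tokens = tweet.split()
    has_pos = any(t in POSITIVE for t in tokens)
    has_neg = any(t in NEGATIVE for t in tokens)
    if has_pos == has_neg:
        return None
    words = sorted({t.lower() for t in tokens if t not in EMOTICONS})
    return words, (1 if has_pos else 0)


def train(examples, epochs=50, lr=0.5, l2=0.01):
    """Fit a binary logistic regression (two-class MaxEnt) with SGD."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for words, label in examples:
            feats = words + ["<bias>"]
            score = sum(weights[f] for f in feats)
            prob = 1.0 / (1.0 + math.exp(-score))
            grad = label - prob  # gradient of the log-likelihood
            for f in feats:
                weights[f] += lr * (grad - l2 * weights[f])
    return weights


def predict(weights, text):
    """Classify emoticon-free text as 'positive' or 'negative'."""
    words = set(text.lower().split())
    score = weights.get("<bias>", 0.0) + sum(weights.get(w, 0.0) for w in words)
    return "positive" if score > 0 else "negative"


# Made-up tweets standing in for a large automatically collected corpus.
raw_tweets = [
    "I love this phone :)",
    "great day with friends :D",
    "love the new album :-)",
    "I hate waiting in line :(",
    "terrible traffic today :-(",
    "hate this awful weather :(",
    "just landed in Austin",   # no emoticon: skipped
    "mixed feelings :) :(",    # conflicting emoticons: skipped
]

examples = [ex for ex in map(noisy_label, raw_tweets) if ex is not None]
weights = train(examples)

print(predict(weights, "I love it"))
print(predict(weights, "I hate it"))
```

Because the emoticon labels are noisy, a model trained this way would still need to be evaluated on hand-labeled data, as the abstract describes.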

Identifier: oai:union.ndltd.org:UTEXAS/oai:repositories.lib.utexas.edu:2152/ETD-UT-2011-08-3823
Date: 02 February 2012
Creators: Speriosu, Michael Adrian
Source Sets: University of Texas
Language: English
Detected Language: English
Type: thesis
Format: application/pdf