Catweetegories : machine learning to organize your Twitter stream

We set out to build a web service that uses machine learning to help users organize the flood of tweets they receive every day. We first experimented with ways to obtain labeled training sets of tweets, including manual classification through Amazon’s Mechanical Turk and crawling the Internet for large quantities of tweets. Once we had acquired good training data, we began building a classifier. We tried both NLTK and Stanford NLP as libraries, and we ultimately created a classifier that is 87.5% accurate. We then built a web service that exposes this classifier and allows any user on the Internet to organize their tweets. The service is built from many open source tools, and we discuss how we integrated them into a production-quality web service. We run the service in the Amazon cloud and review the associated costs. Finally, we review the lessons we learned and share our thoughts on future work.

http://hdl.handle.net/2152/23987

Identifier	oai:union.ndltd.org:UTEXAS/oai:repositories.lib.utexas.edu:2152/23987
Date	14 April 2014
Creators	Simoes, Christopher Francis
Source Sets	University of Texas
Detected Language	English
Type	Thesis
Format	application/pdf
