Return to search

Catweetegories : machine learning to organize your Twitter stream

We want to create a web service that will help users better organize the flood of tweets they receive every day by using machine learning. This was done by experimenting with ways to manually classify training sets of tweets such as using Amazon’s Mechanical Turk and crawling the Internet for large quantities of tweets. Once we acquired good training data, we began building a classifier. We tried NLTK and Stanford NLP as libraries for creating a classifier, and we ultimately created a classifier that is 87.5% accurate. We then built a web service to expose this classifier and to allow any user on the Internet to organize their tweets. We built our web service by using many open source tools, and we discuss how we integrated these tools to create a production quality web service. We run our web service in the Amazon cloud, and we review the costs associated with running in Amazon. Finally we review the lessons we learned and share our thoughts on further work we would like to do in the future. / text

Identiferoai:union.ndltd.org:UTEXAS/oai:repositories.lib.utexas.edu:2152/23987
Date14 April 2014
CreatorsSimoes, Christopher Francis
Source SetsUniversity of Texas
Detected LanguageEnglish
TypeThesis
Formatapplication/pdf

Page generated in 0.0017 seconds