
Rich Linguistic Structure from Large-Scale Web Data

The past two decades have shown an unexpected effectiveness of Web-scale data in natural language processing. Even the simplest models, when paired with unprecedented amounts of unstructured and unlabeled Web data, have been shown to outperform sophisticated ones. It has been argued that the effectiveness of Web-scale data has undermined the necessity of sophisticated modeling or laborious data set curation. In this thesis, we argue for and illustrate an alternative view, that Web-scale data not only serves to improve the performance of simple models, but also can allow the use of qualitatively more sophisticated models that would not be deployable otherwise, leading to even further performance gains. / Engineering and Applied Sciences

Identifier: oai:union.ndltd.org:harvard.edu/oai:dash.harvard.edu:1/11181110
Date: 18 October 2013
Creators: Yamangil, Elif
Contributors: Shieber, Stuart M.
Publisher: Harvard University
Source Sets: Harvard University
Language: en_US
Detected Language: English
Type: Thesis or Dissertation
Rights: open
