Rich Linguistic Structure from Large-Scale Web Data

The past two decades have shown an unexpected effectiveness of Web-scale data in natural language processing. Even the simplest models, when paired with unprecedented amounts of unstructured and unlabeled Web data, have been shown to outperform sophisticated ones. It has been argued that the effectiveness of Web-scale data has undermined the necessity of sophisticated modeling or laborious data set curation. In this thesis, we argue for and illustrate an alternative view: Web-scale data not only improves the performance of simple models, but can also allow the use of qualitatively more sophisticated models that would not be deployable otherwise, leading to even further performance gains.

Engineering and Applied Sciences

statistical inference

syntactic parsing

wikipedia

Identifier	oai:union.ndltd.org:harvard.edu/oai:dash.harvard.edu:1/11181110
Date	18 October 2013
Creators	Yamangil, Elif
Contributors	Shieber, Stuart M.
Publisher	Harvard University
Source Sets	Harvard University
Language	en_US
Detected Language	English
Type	Thesis or Dissertation
Rights	open

