  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Web-assisted anaphora resolution

Li, Yifan 06 1900
This dissertation investigates the utility of the web for anaphora resolution. Aside from offering a highly accurate, web-based method for pleonastic "it" detection, which eliminates up to 4% of errors in pronominal anaphora resolution, it also introduces a web-assisted model for definite description anaphoricity determination and a prototype anaphora resolution system that uses the web for virtually all subtasks. The thesis starts with a thorough analysis of the relationship between anaphora and definiteness, a study that bridges the gap between previously reported empirical studies of definite description anaphora and the linguistic theories developed around the concept of definiteness. Various naturally occurring definite descriptions found in the WSJ corpus are analyzed from the perspectives of both familiarity and uniqueness, and a new classification scheme for definite descriptions is developed. With the fundamental issues addressed, the rest of the thesis focuses on the various ways the web can be exploited for anaphora resolution. This thesis presents methods of high-precision, high-recall anaphoricity determination for both pronouns and definite descriptions. Evaluation results suggest that the performance of the pleonastic "it" identification module is on par with casually trained human annotators. When used together with a pronominal anaphora resolution system, the module offers a statistically significant performance gain of 4%. The performance of the anaphoricity determination module for definite descriptions, which benefits from both the insight gained from the study on anaphora and definiteness and the significantly expanded coverage offered by the web, is also among the highest in existing studies. The thesis also introduces a web-centric anaphora resolution system.
Aside from serving as the information source for implementing selectional restrictions and discovering hyponym/synonym relationships, the web is additionally used for gender/number determination and many other auxiliary tasks, such as determining the semantic subjects of as-prepositions, identifying antecedents for certain empty categories, and assigning appropriate labels for proper names using information available from the text itself. With a design that specifically leaves room for the application of verb-argument and genitive co-occurrence statistics, the web-based features provide statistically significant gains to the system's performance. / Software Engineering and Intelligent Systems
2

Web-assisted anaphora resolution

Li, Yifan Unknown Date
No description available.
3

Large-scale semi-supervised learning for natural language processing

Bergsma, Shane A 11 1900
Natural Language Processing (NLP) develops computational approaches to processing language data. Supervised machine learning has become the dominant methodology of modern NLP. The performance of a supervised NLP system crucially depends on the amount of data available for training. In the standard supervised framework, if a sequence of words was not encountered in the training set, the system can only guess at its label at test time. The cost of producing labeled training examples is a bottleneck for current NLP technology. On the other hand, a vast quantity of unlabeled data is freely available. This dissertation proposes effective, efficient, versatile methodologies for 1) extracting useful information from very large (potentially web-scale) volumes of unlabeled data and 2) combining such information with standard supervised machine learning for NLP. We demonstrate novel ways to exploit unlabeled data, we scale these approaches to make use of all the text on the web, and we show improvements on a variety of challenging NLP tasks. This combination of learning from both labeled and unlabeled data is often referred to as semi-supervised learning. Although lacking manually-provided labels, the statistics of unlabeled patterns can often distinguish the correct label for an ambiguous test instance. In the first part of this dissertation, we propose to use the counts of unlabeled patterns as features in supervised classifiers, with these classifiers trained on varying amounts of labeled data. We propose a general approach for integrating information from multiple, overlapping sequences of context for lexical disambiguation problems. We also show how standard machine learning algorithms can be modified to incorporate a particular kind of prior knowledge: knowledge of effective weightings for count-based features. 
We also evaluate performance within and across domains for two generation and two analysis tasks, assessing the impact of combining web-scale counts with conventional features. In the second part of this dissertation, rather than using the aggregate statistics as features, we propose to use them to generate labeled training examples. By automatically labeling a large number of examples, we can train powerful discriminative models, leveraging fine-grained features of input words.
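The idea of using counts of unlabeled patterns as classifier features can be illustrated with a minimal sketch. It is not the dissertation's implementation: the count table, the function names (`count_features`, `sum_baseline`), the feature naming, and the confusable-word example are all assumptions made for illustration, and the counts are invented toy numbers rather than web-scale statistics. For each candidate word, the sketch collects every n-gram that overlaps the target position and turns its count into a log-count feature; a supervised classifier would learn weights over these features, while the unweighted sum below is only a stand-in baseline.

```python
import math
from typing import Dict, List, Tuple

NGramCounts = Dict[Tuple[str, ...], int]

# Toy counts standing in for a large unlabeled corpus (invented numbers).
TOY_COUNTS: NGramCounts = {
    ("choose", "between"): 900,
    ("between", "the"): 5000,
    ("choose", "between", "the"): 400,
    ("between", "the", "two"): 1200,
    ("choose", "among"): 150,
    ("among", "the"): 4000,
    ("choose", "among", "the"): 60,
    ("among", "the", "two"): 30,
}


def count_features(tokens: List[str], slot: int, filler: str,
                   counts: NGramCounts, max_n: int = 3) -> Dict[str, float]:
    """Log-count features for every n-gram (2 <= n <= max_n) covering the slot."""
    filled = tokens[:slot] + [filler] + tokens[slot + 1:]
    feats: Dict[str, float] = {}
    for n in range(2, max_n + 1):
        # Slide a window of length n over all positions that include the slot.
        for start in range(slot - n + 1, slot + 1):
            if start < 0 or start + n > len(filled):
                continue
            gram = tuple(filled[start:start + n])
            # Feature name records the n-gram order and where the slot falls in it.
            name = f"n{n}_offset{slot - start}"
            feats[name] = math.log(counts.get(gram, 0) + 1)
    return feats


def sum_baseline(tokens: List[str], slot: int, candidates: List[str],
                 counts: NGramCounts) -> str:
    """Pick the candidate with the largest unweighted sum of log-counts."""
    scores = {
        c: sum(count_features(tokens, slot, c, counts).values())
        for c in candidates
    }
    return max(scores, key=scores.get)


sentence = ["choose", "_", "the", "two"]
features = count_features(sentence, 1, "between", TOY_COUNTS)
best = sum_baseline(sentence, 1, ["between", "among"], TOY_COUNTS)
```

In a supervised setting, the feature dictionaries for each candidate would be passed to a trained classifier in place of `sum_baseline`, so that the weight given to each overlapping context pattern is learned from labeled data.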
4

Prospects for a Deflationary Account of the Ontology of Propositions

McCracken, Michael 09 March 2010
A proposition ontology occupies a potentially rich and foundational place in a good deal of contemporary philosophical theorizing. Some of the biggest roadblocks to a wider acceptance and employment of propositions have been legitimate worries about their nature, or ontological "explanatory" power of theories that employ them. This dissertation attempts to understand and construct a deflationary or minimalist understanding of the notion of a proposition and its theoretical roles. On the basis of this understanding, following Stephen Schiffer (2003), I attempt to construct an ontology of propositions (focusing on general propositions) which avoids or dissolves the most pressing worries about their ontological nature, and the epistemological and explanatory statuses of propositions. In chapter one, I discuss the primary theoretical motivations for positing propositions, and argue for a general set of ontological constraints that fall out of a consideration of entities posited according to these motivations. In chapter two, after arguing that propositions are substantially ontologically independent of mind and language, I argue that propositions are conceptually mind-dependent, but that conceptual dependence of this kind does not amount to any sort of ontological dependence. In chapter three, drawing heavily on the work of Stephen Schiffer (2003), I substantially address the epistemological worries about propositions, arguing that propositions are pleonastic entities whose natures and existence we can know simply by reflecting on our proposition-introducing linguistic practices.
In chapter four, I argue that propositions may or may not, in virtue of their status as pleonastic entities, play any substantial explanatory role, but that utilizing the notion of a proposition, which, according to the pleonastic conception, guarantees their existence independent of our practices, is useful and perhaps indispensable to certain of our communicative and epistemic practices. Our propositional linguistic practices, involving essentially our reference to propositions, are thus pragmatically justified.
5

Large-scale semi-supervised learning for natural language processing

Bergsma, Shane A Unknown Date
No description available.
