Global ETD Search

61	Mapping geospatial events based on extracted spatial information from web documents Rock, Nathaniel Robert 01 May 2011 Web documents such as news articles, social feeds, and blogs provide an abundant and readily available data source of spatial information relating to dynamic events such as wildfires, storms, and chemical spills. Research in the fields of geographic information retrieval and natural language processing uses methods to extract place-names from web documents that can be used to geocode these events. However, much of the spatial information in these articles is difficult to use because of the inherent vagueness of natural language. This thesis aims to develop methods to handle the vagueness of natural language descriptions of events by integrating precise spatial information (landmarks and geographic coordinates) with imprecise spatial information to provide a map-based visualization of the likely spatial extent and location of web document events. events GIR GIS natural language vague Geography
62	A framework and evaluation of conversation agents os.goh@murdoch.edu.au, Ong Sing Goh January 2008 This project details the development of a novel and practical framework for building conversation agents (CAs), or conversation robots. CAs are software programs that provide a natural interface between humans and computers. In this study, conversation refers to real-time dialogue exchange between human and machine, which may range from web chatting to on-the-go conversation through mobile devices. In essence, the project proposes a smart and effective communication technology where an autonomous agent is able to carry out simulated human conversation via multiple channels. The CA developed in this project is termed Artificial Intelligence Natural-language Identity (AINI), and AINI is used to illustrate the implementation and testing carried out in this project. Up to now, most CAs have been developed with the short-term objective of convincing users that they are talking with real humans, as in the case of the Turing Test. Traditional designs have mainly relied on ad hoc approaches and hand-crafted domain knowledge. Such approaches make it difficult for a fully integrated system to be developed and modified for other domain applications and tasks. The framework proposed in this thesis addresses such limitations. Overcoming the weaknesses of previous systems has been the key challenge in this study. The research has provided a better understanding of the system requirements and a systematic approach for the construction of intelligent CAs based on agent architecture using a modular N-tiered approach. This study demonstrates an effective implementation and exploration of the new paradigm of Computer Mediated Conversation (CMC) through CAs. 
The most significant aspect of the proposed framework is its ability to re-use and encapsulate expertise such as domain knowledge, natural language query and human-computer interface through plug-in components. As a result, the developer does not need to change the framework implementation for different applications. The proposed system provides interoperability among heterogeneous systems and has the flexibility to be adapted for other languages, interface designs and domain applications. A modular design of knowledge representation facilitates the creation of the CA knowledge bases. This enables easier integration of open-domain and domain-specific knowledge, with the ability to provide answers for broader queries. In order to build the knowledge base for the CAs, this study has also proposed a mechanism to gather information from commonsense collaborative knowledge and online web documents. The proposed Automated Knowledge Extraction Agent (AKEA) has been used for the extraction of unstructured knowledge from the Web. Because it is also important to establish the trustworthiness of the sources of information, this thesis introduces a Web Knowledge Trust Model (WKTM) for that purpose. In order to assess the proposed framework, relevant tools and application modules have been developed, and their effectiveness has been evaluated to validate the performance and accuracy of the system. Both laboratory and public experiments with online users in real time have been carried out. The results have shown that the proposed system is effective. In addition, it has been demonstrated that the CA could be implemented on the Web, mobile services and Instant Messaging (IM). In the real-time human-machine conversation experiment, it was shown that AINI is able to carry out conversations with human users by providing spontaneous interaction in an unconstrained setting. 
The study observed that AINI and humans share common properties in linguistic features and paralinguistic cues. These human-computer interactions have been analysed and contribute to the understanding of how users interact with CAs. Such knowledge is also useful for the development of conversation systems utilising the commonalities found in these interactions. AINI was found to have difficulty responding to some forms of paralinguistic cues, which suggests directions for further work to improve CA performance. Conversation Agent Artificial Intelligence Natural Language Processing
63	Natural language interaction with robots Walker, Alden. January 2007 Thesis (B.A.)--Haverford College, Dept. of Computer Science, Swarthmore College. Dept. of Linguistics, 2007. / Includes bibliographical references.
64	Advanced Intranet Search Engine Narayan, Nitesh January 2009 Information retrieval has been a pervasive part of human society since its existence. With the advent of the internet and the World Wide Web it became an extensive area of research and a major focus, which led to the development of various search engines to locate the desired information, mostly for globally connected computer networks, viz. the internet. But there is another major part of computer networks, viz. the intranet, which has not seen much advancement in information retrieval approaches, in spite of being a major source of information within a large number of organizations. The most common technique for intranet-based search engines is still merely database-centric. Thus, in practice, intranets are unable to avail themselves of the benefits of sophisticated techniques that have been developed for internet-based search engines without exposing the data to commercial search engines. In this Master's-level thesis we propose a "state of the art architecture" for an advanced search engine for intranets which is capable of dealing with the continuously growing size of an intranet's knowledge base. This search engine employs lexical processing of documents, where documents are indexed and searched based on standalone terms or keywords, along with semantic processing of the documents, where the context of the words and the relationships among them are given more importance. Combining lexical and semantic processing of the documents gives an effective approach to handling navigational queries along with research queries, in contrast to modern search engines, which use either lexical processing or semantic processing (or one as the major) of the documents. 
We give equal importance to both approaches in our design, considering the best of both worlds. This work also takes into account various widely acclaimed concepts like inference rules, ontologies and active feedback from the user community to continuously enhance and improve the quality of search results, along with the possibility to infer and deduce new knowledge from the existing one, while preparing for the advent of the semantic web. semantic lexical search engine natural language processing
65	A Three-Step Procedure for Language Generation Katz, Boris 01 December 1980 This paper outlines a three-step plan for generating English text from any semantic representation by applying a set of syntactic transformations to a collection of kernel sentences. The paper focuses on describing a program which realizes the third step of this plan. Step One separates the given representation into groups and generates from each group a set of kernel sentences. Step Two must decide, based upon both syntactic and thematic considerations, the set of transformations that should be performed upon each set of kernels. The output of the first two steps provides the "TASK" for Step Three. Each element of the TASK corresponds to the generation of one English sentence, and in turn may be defined as a triple consisting of: (a) a list of kernel phrase markers; (b) a list of transformations to be performed upon the list of kernels; (c) a "syntactic separator" to separate or connect generated sentences. Step Three takes as input the results of Step One and Step Two. The program which implements Step Three "reads" the TASK, executes the transformations indicated there, combines the altered kernels of each set into a sentence, performs a pronominalization process, and finally produces the appropriate English word string. This approach subdivides a hard problem into three more manageable and relatively independent pieces. It uses linguistically motivated theories at Step Two and Step Three. As implemented so far, Step Three is small and highly efficient. The system is flexible; all the transformations can be applied in any order. The system is general; it can be adapted easily to many domains. language generation parsing transformations natural language
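The TASK triple described in entry 65 can be pictured with a small sketch. This is a minimal Python illustration under stated assumptions, not Katz's program: the names `Task`, `passive`, `PARTICIPLES`, and `run_step_three` are hypothetical, a kernel is simplified to a (subject, verb, object) tuple rather than a phrase marker, clauses are joined with "and" purely for illustration, and the pronominalization step is omitted.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical kernel representation: (subject, verb, object).
Kernel = Tuple[str, str, str]
Transformation = Callable[[Kernel], Kernel]

# Toy lookup of past participles, invented for this example only.
PARTICIPLES = {"saw": "seen"}


@dataclass
class Task:
    """One TASK element: (a) kernels, (b) transformations, (c) separator."""
    kernels: List[Kernel]
    transformations: List[Transformation]
    separator: str


def passive(kernel: Kernel) -> Kernel:
    """Toy stand-in for a syntactic transformation: active to passive."""
    subject, verb, obj = kernel
    return (obj, "was " + PARTICIPLES[verb] + " by", subject)


def run_step_three(task: Task) -> str:
    """Apply every transformation to every kernel, then emit a word string."""
    clauses = []
    for kernel in task.kernels:
        for transform in task.transformations:
            kernel = transform(kernel)
        clauses.append(" ".join(kernel))
    # Joining clauses with "and" is an assumption made for this sketch.
    return " and ".join(clauses) + task.separator


task = Task(
    kernels=[("Mary", "saw", "John")],
    transformations=[passive],
    separator=".",
)
print(run_step_three(task))  # John was seen by Mary.
```

Keeping the transformations as plain functions in a list mirrors the abstract's point that Step Three only executes what the TASK specifies; the decision about which transformations to apply belongs to Step Two.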
66	Prepositional Phrase Attachment Disambiguation Using WordNet Spitzer, Claus January 2006 In this thesis we use a knowledge-based approach to disambiguating prepositional phrase attachments in English sentences. This method was first introduced by S. M. Harabagiu. The Penn Treebank corpus is used as the training text. We extract 4-tuples of the form VP, NP1, Prep, NP2 and sort them into classes according to the semantic relationships between parts of each tuple. These relationships are extracted from WordNet. Classes are sorted into different tiers based on the strictness of their semantic relationship. Disambiguation of prepositional phrase attachments can be cast as a constraint satisfaction problem, where the tiers of extracted classes act as the constraints. Satisfaction is achieved when the strictest possible tier unanimously indicates one kind of attachment. The most challenging kind of problems for disambiguation of prepositional phrases are ones where the prepositional phrase may attach to either the closest verb or noun. We first demonstrate that the best approach to extracting tuples from parsed texts is a top-down postorder traversal algorithm. Following that, the various challenges in forming the prepositional classes utilizing WordNet semantic relations are described. We then discuss the actions that need to be taken towards applying the prepositional classes to the disambiguation task. A novel application of this method is also discussed, by which the tuples to be disambiguated are also expanded via WordNet, thus introducing a client-side application of the algorithms utilized to build prepositional classes. 
Finally, we present results of different variants of our disambiguating algorithm, contrasting the precision and recall of various combinations of constraints, and comparing our algorithm to a baseline method that falls back to attaching a prepositional phrase to the closest left phrase. Our conclusion is that our algorithm provides improved performance compared to the baseline and is therefore a useful new method of performing knowledge-based disambiguation of prepositional phrase attachments. Computer Science Natural language processing disambiguation semantics
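The tiered constraint check described in entry 66 can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the thesis's implementation: the tier contents are invented by hand rather than derived from WordNet and the Penn Treebank, matching is by exact string keys rather than semantic relations, and the baseline's "closest left phrase" is assumed here to mean noun attachment. The names `TIERS` and `disambiguate` are hypothetical.

```python
from typing import Callable, Dict, Hashable, List, Tuple

PPTuple = Tuple[str, str, str, str]  # (verb, noun1, preposition, noun2)

# Each tier pairs a key function with the attachment votes recorded for that
# key in (toy) training data. Tiers are ordered strictest first.
TIERS: List[Tuple[Callable[[PPTuple], Hashable], Dict[Hashable, List[str]]]] = [
    # Strictest: the whole 4-tuple must match.
    (lambda t: t, {("eat", "pizza", "with", "fork"): ["verb"]}),
    # Looser: only verb and preposition must match.
    (lambda t: (t[0], t[2]), {("eat", "with"): ["verb", "noun"]}),
    # Loosest: only the preposition must match.
    (lambda t: t[2], {"of": ["noun", "noun"]}),
]


def disambiguate(candidate: PPTuple, fallback: str = "noun") -> str:
    """Return the attachment from the strictest tier whose votes are unanimous."""
    for key_of, votes_by_key in TIERS:
        votes = votes_by_key.get(key_of(candidate), [])
        if votes and len(set(votes)) == 1:
            return votes[0]
    # No tier was unanimous: fall back to the baseline attachment.
    return fallback


print(disambiguate(("eat", "pizza", "with", "fork")))
print(disambiguate(("see", "picture", "of", "dog")))
print(disambiguate(("eat", "salad", "with", "spoon")))
```

The second tier in this toy data has conflicting votes for ("eat", "with"), so the third call skips it, finds no entry in the last tier, and ends at the fallback. That is the situation in which the baseline method decides the attachment.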
69	Flexible speech synthesis using weighted finite-state transducers / Bulyko, Ivan. January 2002 Thesis (Ph. D.)--University of Washington, 2002. / Vita. Includes bibliographical references (p. 110-123).
70	Minimally supervised induction of morphology through bitexts Moon, Taesun, Ph. D. 17 January 2013 A knowledge of morphology can be useful for many natural language processing systems. Thus, much effort has been expended in developing accurate computational tools for morphology that lemmatize, segment and generate new forms. The most powerful and accurate of these have been manually encoded, such endeavors being without exception expensive and time-consuming. There have consequently been many attempts to reduce this cost through the development of unsupervised or minimally supervised algorithms and learning methods for the acquisition of morphology. These efforts have yet to produce a tool that approaches the performance of manually encoded systems. Here, I present a strategy for dealing with morphological clustering and segmentation in a minimally supervised manner, but one that will be more linguistically informed than previous unsupervised approaches. That is, this study will attempt to induce clusters of words from an unannotated text that are inflectional variants of each other. Then a set of inflectional suffixes by part-of-speech will be induced from these clusters. This level of detail is made possible by a method known as alignment and transfer (AT), among other names, an approach that uses aligned bitexts to transfer linguistic resources developed for one language (the source language) to another language (the target). This approach has a further advantage in that it allows a reduction in the amount of training data without a significant degradation in performance, making it useful in applications targeted at data collected from endangered languages. In the current study, however, I use English as the source and German as the target for ease of evaluation and for certain typological properties of German. 
The two main tasks, clustering and segmentation, are approached as sequential tasks, with the clustering informing the segmentation to allow for greater accuracy in morphological analysis. While the performance of these methods does not exceed the current roster of unsupervised or minimally supervised approaches to morphology acquisition, the study attempts to integrate more learning methods than previous studies. Furthermore, it attempts to learn inflectional morphology as opposed to derivational morphology, which is a crucial distinction in linguistics. Morphological clustering Inflectional morphology Natural language processing
