Global ETD Search

111	Primary semantic type labeling in monologue discourse using a hierarchical classification approach Larson, Erik John 20 August 2010 The question of whether a machine can reproduce human intelligence is older than modern computation, but has received a great deal of attention since the first digital computers emerged decades ago. Language understanding, a hallmark of human intelligence, has been the focus of a great deal of work in Artificial Intelligence (AI). In 1950, mathematician Alan Turing proposed a kind of game, or test, to evaluate the intelligence of a machine by assessing its ability to understand written natural language. But nearly sixty years after Turing proposed his test of machine intelligence (pose questions to a machine and a person without seeing either, and try to determine which is the machine), no system has passed the Turing Test, and the question of whether a machine can understand natural language cannot yet be answered. The present investigation is, firstly, an attempt to advance the state of the art in natural language understanding by building a machine whose input is English natural language and whose output is a set of assertions that represent answers to certain questions posed about the content of the input. The machine we explore here, in other words, should pass a simplified version of the Turing Test and by doing so help clarify and expand our understanding of machine intelligence. Toward this goal, we explore a constraint framework for partial solutions to the Turing Test, propose a problem whose solution would constitute a significant advance in natural language processing, and design and implement a system adequate for addressing the problem proposed. The fully implemented system finds primary specific events and their locations in monologue discourse using a hierarchical classification approach, and as such provides answers to questions of central importance in the interpretation of discourse.
/ text Machine learning Hierarchical classification Natural language processing Discourse interpretation
112	Personality and alignment processes in dialogue : towards a lexically-based unified model Brockmann, Carsten January 2009 This thesis explores approaches to modelling individual differences in language use. The differences under consideration fall into two broad categories: Variation of the personality projected through language, and modelling of language alignment behaviour between dialogue partners. In a way, these two aspects oppose each other – language related to varying personalities should be recognisably different, while aligning speakers agree on common language during a dialogue. The central hypothesis is that such variation can be captured and produced with restricted computational means. Results from research on personality psychology and psycholinguistics are transformed into a series of lexically-based Affective Language Production Models (ALPMs) which are parameterisable for personality and alignment. The models are then explored by varying the parameters and observing the language they generate. ALPM-1 and ALPM-2 re-generate dialogues from existing utterances which are ranked and filtered according to manually selected linguistic and psycholinguistic features that were found to be related to personality. ALPM-3 is based on true overgeneration of paraphrases from semantic representations using the OPENCCG framework for Combinatory Categorial Grammar (CCG), in combination with corpus-based ranking and filtering by way of n-gram language models. Personality effects are achieved through language models built from the language of speakers of known personality. In ALPM-4, alignment is captured via a cache language model that remembers the previous utterance and thus influences the choice of the next. This model provides a unified treatment of personality and alignment processes in dialogue.
In order to evaluate the ALPMs, dialogues between computer characters were generated and presented to human judges who were asked to assess the characters’ personality. In further internal simulations, cache language models were used to reproduce results of psycholinguistic priming studies. The experiments showed that the models are capable of producing natural language dialogue which exhibits human-like personality and alignment effects. 410
113	Automated question answering for clinical comparison questions Leonhard, Annette Christa January 2012 This thesis describes the development and evaluation of new automated Question Answering (QA) methods tailored to clinical comparison questions that give clinicians a rank-ordered list of MEDLINE® abstracts targeted to natural language clinical drug comparison questions (e.g. “Have any studies directly compared the effects of Pioglitazone and Rosiglitazone on the liver?”). Three corpora were created to develop and evaluate a new QA system for clinical comparison questions called RetroRank. RetroRank takes the clinician’s plain text question as input, processes it and outputs a rank-ordered list of potential answer candidates, i.e. MEDLINE® abstracts, that is reordered using new post-retrieval ranking strategies to ensure the most topically-relevant abstracts are displayed as high in the result set as possible. RetroRank achieves a significant improvement over the PubMed recency baseline and performs equal to or better than previous approaches to post-retrieval ranking relying on query frames and annotated data such as the approach by Demner-Fushman and Lin (2007). The performance of RetroRank shows that it is possible to successfully use natural language input and a fully automated approach to obtain answers to clinical drug comparison questions. This thesis also introduces two new evaluation corpora of clinical comparison questions with “gold standard” references that are freely available and are a valuable resource for future research in medical QA. 020
114	Using a rewriting system to model individual writing styles Lin, Jing January 2012 Each individual has a distinctive writing style, but natural language generation systems produce text with much less variety. Is it possible to produce more human-like text from natural language generation systems by mimicking the style of particular authors? We start by analysing the text of real authors. We collect a corpus of texts from a single genre (food recipes), with each text identified with its author, and summarise a variety of writing features in these texts. Each author's writing style is the combination of a set of features. Analysis of the writing features shows that not only does each individual author write differently, but the differences are consistent over the whole of their corpus. Hence we conclude that authors do keep a consistent style consisting of a variety of different features. When we discuss notions such as the style and meaning of texts, we are referring to the reaction that readers have to them. It is important, therefore, in the field of computational linguistics to experiment by showing texts to people and assessing their interpretation of the texts. In our research we move beyond simple discussion and statistical analysis of the properties of text and NLG systems and perform experiments to verify the actual impact that lexical preference has on real readers. Through experiments that require participants to follow a recipe and prepare food, we conclude that it is possible to alter the lexicon of a recipe without altering the actions performed by the cook, hence that word choice is an aspect of style rather than semantics; and also that word choice is one of the writing features employed by readers in identifying the author of a text. Among all writing features, individual lexical preference is very important both for analysing and generating texts. So we choose individual lexical choice as our principal topic of research.
Using a modified version of distributional similarity (CDS) helps us to choose words used by individual authors without the limitation of many other solutions, such as pre-built thesauri. We present an algorithm for analysis and rewriting, and assess the results. Based on the results we propose some further improvements. 025.410285635
115	Domain independent generation from RDF instance data Sun, Xiantang January 2008 The next generation of the web, the Semantic Web, integrates distributed web resources from various domains by allowing data (instantial and ontological data) to be shared and reused across applications, enterprise and community boundaries based on the Resource Description Framework (RDF). Nevertheless, RDF was not developed for casual users who are unfamiliar with RDF but interested in data represented using it. NLG may be a possible solution to bridging the gap between casual users and RDF data, but the cost of separately applying fine-grained NLG techniques for every domain in the Semantic Web would be extremely high, and hence not realistic. 006.35
116	Automatic Tagging of Communication Data Hoyt, Matthew Ray 08 1900 Globally distributed software teams are widespread throughout industry. But finding reliable methods that can properly assess a team's activities is a real challenge. Methods such as surveys and manual coding of activities are too time consuming and are often unreliable. Recent advances in information retrieval and linguistics, however, suggest that automated and/or semi-automated text classification algorithms could be an effective way of finding differences in the communication patterns among individuals and groups. Communication among group members is frequent and generates a significant amount of data. Thus having a web-based tool that can automatically analyze the communication patterns among global software teams could lead to a better understanding of group performance. The goal of this thesis, therefore, is to compare automatic and semi-automatic measures of communication and evaluate their effectiveness in classifying different types of group activities that occur within a global software development project. In order to achieve this goal, we developed a web-based component that can be used to help clean and classify communication activities. The component was then used to compare different automated text classification techniques on various group activities to determine their effectiveness in correctly classifying data from a global software development team project. Tagging machine learning global software development natural language processing
117	Content-Based Geolocation Prediction of Canadian Twitter Users and Their Tweets Metin, Ali Mert 13 August 2019 The last decade witnessed the rise of online social networks, especially Twitter. Today, Twitter is a giant social platform with over 250 million users, who produce massive amounts of data every day. This creates many research opportunities, specifically for Natural Language Processing (NLP), in which text is utilized to extract information that could be used in many applications. One problem NLP might help solve is geolocation inference, or geolocation detection, from online social networks. Detecting the location of Twitter users based on the text of their tweets is useful since not many users publicly declare their locations or geotag their tweets. Location information is crucial for a variety of applications such as event detection, disease and illness tracking, and user profiling. These tasks are not trivial, because online content is often noisy; it includes misspellings, incomplete words or phrases, idiomatic expressions, abbreviations, acronyms, and Twitter-specific literature. In this work, we attempted to detect the location of Canadian users (and tweets sent from Canada) at the metropolitan-area and province levels; to the best of our knowledge, this had not been done before. To do this, we collected two different datasets and applied a variety of machine learning methods, including deep learning. We also attempted to geolocate users based on their social graph (i.e., a user's friends and followers) as a novel approach. Geolocation Canadian Twitter users Twitter Natural language processing
118	Geographic referring expressions : doing geometry with words Gomes de Oliveira, Rodrigo January 2017 No description available. 004
119	GLR parsing with multiple grammars for natural language queries. January 2000 Luk Po Chui. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. / Includes bibliographical references (leaves 97-100). / Abstracts in English and Chinese. / Chapter 1 --- Introduction / Chapter 1.1 --- Efficiency and Memory / Chapter 1.2 --- Ambiguity / Chapter 1.3 --- Robustness / Chapter 1.4 --- Thesis Organization / Chapter 2 --- Background / Chapter 2.1 --- Introduction / Chapter 2.2 --- Context-Free Grammars / Chapter 2.3 --- The LR Parsing Algorithm / Chapter 2.4 --- The Generalized LR Parsing Algorithm / Chapter 2.4.1 --- Graph-Structured Stack / Chapter 2.4.2 --- Packed Shared Parse Forest / Chapter 2.5 --- Time and Space Complexity / Chapter 2.6 --- Related Work on Parsing / Chapter 2.6.1 --- GLR* / Chapter 2.6.2 --- TINA / Chapter 2.6.3 --- PHOENIX / Chapter 2.7 --- Chapter Summary / Chapter 3 --- Grammar Partitioning / Chapter 3.1 --- Introduction / Chapter 3.2 --- Motivation / Chapter 3.3 --- Previous Work on Grammar Partitioning / Chapter 3.4 --- Our Grammar Partitioning Approach / Chapter 3.4.1 --- Definitions and Concepts / Chapter 3.4.2 --- Guidelines for Grammar Partitioning / Chapter 3.5 --- An Example / Chapter 3.6 --- Chapter Summary / Chapter 4 --- Parser Composition / Chapter 4.1 --- Introduction / Chapter 4.2 --- GLR Lattice Parsing / Chapter 4.2.1 --- Lattice with Multiple Granularity / Chapter 4.2.2 --- Modifications to the GLR Parsing Algorithm / Chapter 4.3 --- Parser Composition Algorithms / Chapter 4.3.1 --- Parser Composition by Cascading / Chapter 4.3.2 --- Parser Composition with Predictive Pruning / Chapter 4.3.3 --- Comparison of Parser Composition by Cascading and Parser Composition with Predictive Pruning / Chapter 4.4 --- Chapter Summary / Chapter 5 --- Experimental Results and Analysis / Chapter 5.1 --- Introduction / Chapter 5.2 --- Experimental Corpus / Chapter 5.3 --- ATIS Grammar Development / Chapter 5.4 --- Grammar Partitioning and Parser Composition on ATIS Domain / Chapter 5.4.1 --- ATIS Grammar Partitioning / Chapter 5.4.2 --- Parser Composition on ATIS / Chapter 5.5 --- Ambiguity Handling / Chapter 5.6 --- Semantic Interpretation / Chapter 5.6.1 --- Best Path Selection / Chapter 5.6.2 --- Semantic Frame Generation / Chapter 5.6.3 --- Post-Processing / Chapter 5.7 --- Experiments / Chapter 5.7.1 --- Grammar Coverage / Chapter 5.7.2 --- Size of Parsing Table / Chapter 5.7.3 --- Computational Costs / Chapter 5.7.4 --- Accuracy Measures in Natural Language Understanding / Chapter 5.7.5 --- Summary of Results / Chapter 5.8 --- Chapter Summary / Chapter 6 --- Conclusions / Chapter 6.1 --- Thesis Summary / Chapter 6.2 --- Thesis Contributions / Chapter 6.3 --- Future Work / Chapter 6.3.1 --- Statistical Approach on Grammar Partitioning / Chapter 6.3.2 --- Probabilistic modeling for Best Parse Selection / Chapter 6.3.3 --- Robust Parsing Strategies / Bibliography / Chapter A --- ATIS-3 Grammar / Chapter A.1 --- English ATIS-3 Grammar Rules / Chapter A.2 --- Chinese ATIS-3 Grammar Rules
Parsing (Computer grammar) Computational linguistics
120	Natural language understanding across application domains and languages. January 2002 Tsui Wai-Ching. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2002. / Includes bibliographical references (leaves 115-122). / Abstracts in English and Chinese. / Chapter 1 --- Introduction / Chapter 1.1 --- Overview / Chapter 1.2 --- Natural Language Understanding Using Belief Networks / Chapter 1.3 --- Integrating Speech Recognition with Natural Language Understanding / Chapter 1.4 --- Thesis Goals / Chapter 1.5 --- Thesis Organization / Chapter 2 --- Background / Chapter 2.1 --- Natural Language Understanding Approaches / Chapter 2.1.1 --- Rule-based Approaches / Chapter 2.1.2 --- Stochastic Approaches / Chapter 2.1.3 --- Mixed Approaches / Chapter 2.2 --- Portability of Natural Language Understanding Frameworks / Chapter 2.2.1 --- Portability across Domains / Chapter 2.2.2 --- Portability across Languages / Chapter 2.2.3 --- Portability across both Domains and Languages / Chapter 2.3 --- Spoken Language Understanding / Chapter 2.3.1 --- Integration of Speech Recognition Confidence into Natural Language Understanding / Chapter 2.3.2 --- Integration of Other Potential Confidence Features into Natural Language Understanding / Chapter 2.4 --- Belief Networks / Chapter 2.4.1 --- Overview / Chapter 2.4.2 --- Bayesian Inference / Chapter 2.5 --- Transformation-based Parsing Technique / Chapter 2.6 --- Chapter Summary / Chapter 3 --- Portability of the Natural Language Understanding Framework across Application Domains and Languages / Chapter 3.1 --- Natural Language Understanding Framework / Chapter 3.1.1 --- Semantic Tagging / Chapter 3.1.2 --- Informational Goal Inference with Belief Networks / Chapter 3.2 --- The ISIS Stocks Domain / Chapter 3.3 --- A Unified Framework for English and Chinese / Chapter 3.3.1 --- Semantic Tagging for the ISIS domain / Chapter 3.3.2 --- Transformation-based Parsing / Chapter 3.3.3 --- Informational Goal Inference with Belief Networks for the ISIS domain / Chapter 3.4 --- Experiments / Chapter 3.4.1 --- Goal Identification Experiments / Chapter 3.4.2 --- A Cross-language Experiment / Chapter 3.5 --- Chapter Summary / Chapter 4 --- Enhancement in the Belief Networks for Informational Goal Inference / Chapter 4.1 --- Semantic Concept Selection in Belief Networks / Chapter 4.1.1 --- Selection of Positive Evidence / Chapter 4.1.2 --- Selection of Negative Evidence / Chapter 4.2 --- Estimation of Statistical Probabilities in the Enhanced Belief Networks / Chapter 4.2.1 --- Estimation of Prior Probabilities / Chapter 4.2.2 --- Estimation of Posterior Probabilities / Chapter 4.3 --- Experiments / Chapter 4.3.1 --- Belief Networks Developed with Positive Evidence / Chapter 4.3.2 --- Belief Networks with the Injection of Negative Evidence / Chapter 4.4 --- Chapter Summary / Chapter 5 --- Integration between Speech Recognition and Natural Language Understanding / Chapter 5.1 --- The Speech Corpus for the Chinese ISIS Stocks Domain / Chapter 5.2 --- Our Extended Natural Language Understanding Framework for Spoken Language Understanding / Chapter 5.2.1 --- Integrated Scoring for Chinese Speech Recognition and Natural Language Understanding / Chapter 5.3 --- Experiments / Chapter 5.3.1 --- Training and Testing on the Perfect Reference Data Sets / Chapter 5.3.2 --- Mismatched Training and Testing Conditions - Perfect Reference versus Imperfect Hypotheses / Chapter 5.3.3 --- Comparing Goal Identification between the Use of Single-best versus N-best Recognition Hypotheses / Chapter 5.3.4 --- Integration of Speech Recognition Confidence Scores into Natural Language Understanding / Chapter 5.3.5 --- Feasibility of Our Approach for Spoken Language Understanding / Chapter 5.3.6 --- Justification of Using Max-of-max Classifier in Our Single Goal Identification Scheme / Chapter 5.4 --- Chapter Summary / Chapter 6 --- Conclusions and Future Work / Chapter 6.1 --- Conclusions / Chapter 6.2 --- Contributions / Chapter 6.3 --- Future Work / Bibliography / Chapter A --- Semantic Frames for Chinese / Chapter B --- Semantic Frames for English / Chapter C --- The Concept Set of Positive Evidence for the Nine Goals in English / Chapter D --- The Concept Set of Positive Evidence for the Ten Goals in Chinese / Chapter E --- The Complete Concept Set including Both the Positive and Negative Evidence for the Ten Goals in English / Chapter F --- The Complete Concept Set including Both the Positive and Negative Evidence for the Ten Goals in Chinese / Chapter G --- The Assignment of Statistical Probabilities for Each Selected Concept under the Corresponding Goals in Chinese / Chapter H --- The Assignment of Statistical Probabilities for Each Selected Concept under the Corresponding Goals in English
Machine learning Automatic speech recognition
