1 |
A logical approach to schema-based inference / Wobcke, W. R. January 1988
No description available.
|
2 |
Probabilistic grammar induction from sentences and structured meanings / Kwiatkowski, Thomas Mieczyslaw January 2012
The meanings of natural language sentences may be represented as compositional logical-forms. Each word or lexicalised multiword-element has an associated logical-form representing its meaning. Full sentential logical-forms are then composed from these word logical-forms via a syntactic parse of the sentence. This thesis develops two computational systems that learn both the word-meanings and the parsing model required to map sentences onto logical-forms from an example corpus of (sentence, logical-form) pairs. One of these systems is designed to provide a general-purpose method of inducing semantic parsers for multiple languages and logical meaning representations. Semantic parsers map sentences onto logical representations of their meanings and may form an important part of any computational task that needs to interpret the meanings of sentences. The other system is designed to model the way in which a child learns the semantics and syntax of their first language. Here, logical-forms are used to represent the potentially ambiguous context in which child-directed utterances are spoken, and a psycholinguistically plausible training algorithm learns a probabilistic grammar that describes the target language. This computational modelling task is important as it can provide evidence for or against competing theories of how children learn their first language. Both of the systems presented here are based upon two working hypotheses. The first is that the correct parse of any sentence in any language is contained in a set of possible parses defined in terms of the sentence itself, the sentence’s logical-form and a small set of combinatory rule schemata. The second is that, given a corpus of (sentence, logical-form) pairs that each support a large number of possible parses according to the schemata mentioned above, it is possible to learn a probabilistic parsing model that accurately describes the target language.
The algorithm for semantic parser induction learns Combinatory Categorial Grammar (CCG) lexicons and discriminative probabilistic parsing models from corpora of (sentence, logical-form) pairs. This system is shown to achieve at or near state-of-the-art performance across multiple languages, logical meaning representations and domains. As the approach is not tied to any single natural or logical language, this system represents an important step towards widely applicable black-box methods for semantic parser induction. This thesis also develops an efficient representation of the CCG lexicon that separately stores language-specific syntactic regularities and domain-specific semantic knowledge. This factorised lexical representation improves the performance of CCG-based semantic parsers in sparse domains and also provides a potential basis for lexical expansion and domain adaptation for semantic parsers. The algorithm for modelling child language acquisition learns a generative probabilistic model of CCG parses from sentences paired with a context set of potential logical-forms containing one correct entry and a number of distractors. The online learning algorithm used is intended to be psycholinguistically plausible and to assume as little information specific to the task of language learning as possible. It is shown that this algorithm learns an accurate parsing model despite making very few initial assumptions. It is also shown that the manner in which both word-meanings and syntactic rules are learnt is in accordance with observations of both of these learning tasks in children, supporting a theory of language acquisition that builds upon the two working hypotheses stated above.
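To make the idea of composing a sentential logical-form from word logical-forms concrete, here is a minimal sketch of the two CCG function-application rules. It is not the thesis's induction algorithm: the lexicon is written by hand rather than learned, there is no probabilistic model, and the names `Entry`, `forward_apply`, `backward_apply`, `derive_transitive` and the toy `LEXICON` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Entry:
    """A lexical or derived item: a CCG category plus its logical form."""
    category: str  # e.g. "NP" or "(S\\NP)/NP"
    meaning: Any   # a constant, or a Python callable standing in for a lambda term


def forward_apply(left: Entry, right: Entry) -> Optional[Entry]:
    # Forward application: X/Y followed by Y yields X.
    # Bracket handling is deliberately minimal (outer parentheses only).
    suffix = "/" + right.category
    if left.category.endswith(suffix):
        result = left.category[: -len(suffix)].strip("()")
        return Entry(result, left.meaning(right.meaning))
    return None


def backward_apply(left: Entry, right: Entry) -> Optional[Entry]:
    # Backward application: Y followed by X\Y yields X.
    suffix = "\\" + left.category
    if right.category.endswith(suffix):
        result = right.category[: -len(suffix)].strip("()")
        return Entry(result, right.meaning(left.meaning))
    return None


def derive_transitive(subj: Entry, verb: Entry, obj: Entry) -> Optional[Entry]:
    """Combine verb with object first, then the subject with the verb phrase."""
    verb_phrase = forward_apply(verb, obj)
    return backward_apply(subj, verb_phrase) if verb_phrase else None


# Hypothetical hand-written lexicon; a learner would have to induce these entries.
LEXICON = {
    "oklahoma": Entry("NP", "oklahoma"),
    "texas": Entry("NP", "texas"),
    "borders": Entry("(S\\NP)/NP", lambda y: lambda x: f"borders({x},{y})"),
}

sentence = derive_transitive(
    LEXICON["oklahoma"], LEXICON["borders"], LEXICON["texas"]
)
print(sentence.category, sentence.meaning)  # S borders(oklahoma,texas)
```

In the setting the abstract describes, many such derivations are possible for each (sentence, logical-form) pair, and the learned parsing model is what ranks them.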
|
3 |
An approach to Natural Language understanding / Marlen, Michael Scott January 1900
Doctor of Philosophy / Department of Computing and Information Sciences / David A. Gustafson / Natural language understanding over a set of sentences or a document is a challenging problem. We approach this problem using semantic extraction and an ontology for answering questions based on the data. A sentence carries more information than can be found by extracting its visible terms and the obvious relations among them. It is this hidden information that gives our solution its advantage over alternatives. The methodology was tested against the FraCaS Test Suite with near-perfect results (correct answers) for the sections that are the focus of this paper (Generalized Quantifiers, Plurals, Adjectives, Comparatives, Verbs, and Attitudes). The results indicate that extracting both the visible and the unseen semantics, together with their interrelations, and reasoning over them with an ontology provides reliable and provable answers to questions, validating this technology.
|
4 |
All Purpose Textual Data Information Extraction, Visualization and Querying / January 2018
Since the advent of the internet, and even more so since the rise of social media platforms, the explosive growth and availability of textual data have made analysis a tedious task. Information extraction systems are available, but they are generally too specific and often extract only the kinds of information they deem necessary and extraction-worthy. With data visualization theory and fast, interactive querying methods, leaving out information might not be necessary. This thesis explores textual data visualization techniques, intuitive querying, and a novel approach to all-purpose textual information extraction that encodes large text corpora to improve human understanding of the information present in textual data.
This thesis presents a modified traversal algorithm over the dependency parse output of text that extracts all subject-predicate-object triples while ensuring that no information is missed. To support full-scale, all-purpose information extraction from large text corpora, a data preprocessing pipeline is recommended before the extraction is run. The output format is designed specifically to fit a node-edge-node model and to form the building blocks of a network, which makes understanding the text and querying information from the corpus quick and intuitive. The approach attempts to reduce reading time and enhance understanding of the text using an interactive graph and timeline. / Dissertation/Thesis / Masters Thesis Software Engineering 2018
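As a rough illustration of extracting subject-predicate-object triples from a dependency parse and storing them in a node-edge-node structure, consider the sketch below. It is not the thesis's modified traversal algorithm: the parse is supplied by hand instead of coming from a parser, the relation labels (`nsubj`, `obj`, `dobj`, `conj`) follow common dependency conventions, and the names `Token`, `subjects_of`, `extract_triples` and `to_graph` are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class Token(NamedTuple):
    text: str
    head: int  # index of the governing token; the root points to itself
    dep: str   # dependency relation to the head


def subjects_of(tokens: List[Token], i: int) -> List[str]:
    """Subjects attached to verb i; a conjoined verb inherits its head's subjects."""
    found = [
        child.text
        for j, child in enumerate(tokens)
        if child.head == i and j != i and child.dep == "nsubj"
    ]
    if not found and tokens[i].dep == "conj":
        return subjects_of(tokens, tokens[i].head)
    return found


def extract_triples(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Emit one (subject, predicate, object) triple per subject/object combination."""
    triples = []
    for i, token in enumerate(tokens):
        objects = [
            child.text
            for j, child in enumerate(tokens)
            if child.head == i and j != i and child.dep in ("obj", "dobj")
        ]
        for subject in subjects_of(tokens, i) if objects else []:
            for obj in objects:
                triples.append((subject, token.text, obj))
    return triples


def to_graph(triples: List[Tuple[str, str, str]]) -> Dict[str, List[Tuple[str, str]]]:
    """Node-edge-node model: each node maps to its outgoing (edge, node) pairs."""
    graph: Dict[str, List[Tuple[str, str]]] = {}
    for subject, predicate, obj in triples:
        graph.setdefault(subject, []).append((predicate, obj))
        graph.setdefault(obj, [])
    return graph


# Hand-written parse of "Alice founded Acme and hired Bob".
PARSE = [
    Token("Alice", 1, "nsubj"),
    Token("founded", 1, "ROOT"),
    Token("Acme", 1, "obj"),
    Token("and", 4, "cc"),
    Token("hired", 1, "conj"),
    Token("Bob", 4, "obj"),
]

print(extract_triples(PARSE))
```

Inheriting the subject for the conjoined verb is one small example of the kind of rule needed so that the second clause is not lost; a full system would need many more such cases.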
|
5 |
Natural Language Understanding for Multi-Level Distributed Intelligent Virtual Sensors / Papangelis, Angelos, Kyriakou, Georgios January 2021
In our thesis we explore Automatic Question/Answer Generation (AQAG) and the application of Machine Learning (ML) to natural language queries. Initially we create a collection of question/answer tuples conceptually based on processing data received from (virtual) sensors placed in a smart city. Subsequently we train a Gated Recurrent Unit (GRU) model on the generated dataset and evaluate the accuracy we can achieve in answering those questions. This in turn will help to address the problem of automatic sensor composition based on natural language queries. To this end, the contribution of this thesis is two-fold: on the one hand, we provide an automatic procedure for dataset construction based on natural language question templates; on the other hand, we apply an ML approach that establishes the correlation between natural language queries and their virtual sensor representation via their functional representation. We consider virtual sensors to be entities as described by Mihailescu et al., where they provide an interface constructed with certain properties in mind. We use those sensors for our application domain of a smart city environment, thus constructing our dataset around questions relevant to it.
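Template-based dataset construction of the kind described can be sketched in a few lines. Everything specific here is an assumption for illustration: the templates, the measures and places, and the `read(...)` string used as a stand-in for a functional representation are not taken from the thesis.

```python
from itertools import product
from typing import List, Tuple

# Hypothetical question templates with named placeholders.
TEMPLATES = [
    "What is the {measure} in {place}?",
    "Tell me the {measure} at {place}.",
]

# Hypothetical smart-city vocabulary.
MEASURES = ["temperature", "noise level"]
PLACES = ["the main square", "the harbour"]


def generate_pairs(
    templates: List[str], measures: List[str], places: List[str]
) -> List[Tuple[str, str]]:
    """Fill every template with every (measure, place) combination.

    Each question is paired with a functional target string that a
    sequence model could be trained to predict.
    """
    pairs = []
    for template, measure, place in product(templates, measures, places):
        question = template.format(measure=measure, place=place)
        target = f"read({measure}, {place})"
        pairs.append((question, target))
    return pairs


dataset = generate_pairs(TEMPLATES, MEASURES, PLACES)
print(len(dataset))  # 8
print(dataset[0])
```

Because several templates map to the same target, a model trained on such data has to learn that differently worded questions share one functional representation.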
|
6 |
Deriving A Better Metric To Assess the Quality of Word Embeddings Trained On Limited Specialized Corpora / Munbodh, Mrinal January 2020
No description available.
|
7 |
Quality Assessment of Conversational Agents : Assessing the Robustness of Conversational Agents to Errors and Lexical Variability / Kvalitetsutvärdering av konversationsagenter : Att bedöma robustheten hos konversationsagenter mot fel och lexikal variabilitet / Guichard, Jonathan January 2018
Assessing a conversational agent’s understanding capabilities is critical, as poor user interactions could seal the agent’s fate at the very beginning of its lifecycle, with users abandoning the system. In this thesis we explore the use of paraphrases as a testing tool for conversational agents. Paraphrases, which are different ways of expressing the same intent, are generated from known working input by performing lexical substitutions and by introducing multiple spelling divergences. As the expected outcome for this newly generated data is known, we can use it to assess the agent’s robustness to language variation and detect potential understanding weaknesses. As demonstrated by a case study, we obtain encouraging results: this approach appears to help anticipate potential understanding shortcomings, and these shortcomings can be addressed by the generated paraphrases.
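The testing idea can be sketched as follows: generate variants of an utterance whose intent is known, then check which variants the agent no longer recognises. This is a minimal sketch under stated assumptions, not the thesis's generator: the synonym table, the single letter-transposition rule and the keyword-based `toy_agent` are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def lexical_substitutions(utterance: str, synonyms: Dict[str, List[str]]) -> List[str]:
    """Replace one word at a time with each of its listed alternatives."""
    words = utterance.split()
    variants = []
    for i, word in enumerate(words):
        for alternative in synonyms.get(word.lower(), []):
            variants.append(" ".join(words[:i] + [alternative] + words[i + 1:]))
    return variants


def spelling_divergences(utterance: str) -> List[str]:
    """Swap the 2nd and 3rd letters of each word of four or more letters."""
    words = utterance.split()
    variants = []
    for i, word in enumerate(words):
        if len(word) >= 4:
            typo = word[0] + word[2] + word[1] + word[3:]
            if typo != word:  # skip words such as "book" where the swap changes nothing
                variants.append(" ".join(words[:i] + [typo] + words[i + 1:]))
    return variants


def assess(
    agent: Callable[[str], str],
    utterance: str,
    expected_intent: str,
    synonyms: Dict[str, List[str]],
) -> Tuple[List[str], List[str]]:
    """Return all generated variants and the subset the agent misclassifies."""
    variants = lexical_substitutions(utterance, synonyms) + spelling_divergences(utterance)
    failures = [v for v in variants if agent(v) != expected_intent]
    return variants, failures


def toy_agent(text: str) -> str:
    # Stand-in for a real conversational agent: brittle keyword matching.
    return "book_flight" if "flight" in text else "unknown"


SYNONYMS = {"book": ["reserve"], "flight": ["plane ticket"]}
variants, failures = assess(toy_agent, "book a flight to paris", "book_flight", SYNONYMS)
print(failures)
```

The failing variants double as candidate training examples, which is how generated paraphrases can also be used to address the weaknesses they reveal.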
|
8 |
Low-Resource Natural Language Understanding in Task-Oriented Dialogue / Louvan, Samuel 11 March 2022
Task-oriented dialogue (ToD) systems need to interpret the user's input to understand the user's needs (intent) and corresponding relevant information (slots). This process is performed by a Natural Language Understanding (NLU) component, which maps the text utterance into a semantic frame representation, involving two subtasks: intent classification (text classification) and slot filling (sequence tagging). Typically, new domains and languages are regularly added to the system to support more functionalities. Collecting domain-specific data and performing fine-grained annotation of large amounts of data every time a new domain and language is introduced can be expensive. Thus, developing an NLU model that generalizes well across domains and languages with less labeled data (low-resource) is crucial and remains challenging.
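A semantic frame of the kind mentioned above can be assembled from an intent label and per-token slot tags. The sketch below assumes the widely used BIO tagging scheme (`B-` begins a slot, `I-` continues it, `O` is outside any slot); the function name `to_frame`, the slot names and the example utterance are hypothetical, and the classifier and tagger that would produce the labels are not shown.

```python
from typing import Dict, List


def to_frame(tokens: List[str], tags: List[str], intent: str) -> Dict[str, object]:
    """Collapse BIO slot tags into a frame: one intent plus slot name/value pairs."""
    slots: Dict[str, str] = {}
    name, buffer = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if name:  # close the slot that was open
                slots[name] = " ".join(buffer)
            name, buffer = tag[2:], [token]
        elif tag.startswith("I-") and name == tag[2:]:
            buffer.append(token)  # continue the open slot
        else:
            if name:
                slots[name] = " ".join(buffer)
            name, buffer = None, []
    if name:  # close a slot that runs to the end of the utterance
        slots[name] = " ".join(buffer)
    return {"intent": intent, "slots": slots}


tokens = "book a flight to new york tomorrow".split()
tags = ["O", "O", "O", "O", "B-destination", "I-destination", "B-date"]
print(to_frame(tokens, tags, "book_flight"))
```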
This thesis focuses on investigating transfer learning and data augmentation methods for low-resource NLU in ToD. Our first contribution is a study of the potential of non-conversational text as a source for transfer. Most transfer learning approaches assume labeled conversational data as the source task and adapt the NLU model to the target task. We show that leveraging similar tasks from non-conversational text improves performance on target slot filling tasks through multi-task learning in low-resource settings. Second, we propose a set of lightweight augmentation methods that apply data transformations at the token and sentence levels through slot value substitution and syntactic manipulation. Despite their simplicity, these methods perform comparably to deep learning-based augmentation models and are effective on NLU tasks in six languages. Third, we investigate the effectiveness of domain-adaptive pre-training for zero-shot cross-lingual NLU. In terms of overall performance, continued pre-training in English is effective across languages. This result indicates that the domain knowledge learned in English is transferable to other languages. In addition, domain similarity is essential: we show that intermediate pre-training data that is more similar, in terms of data distribution, to the target dataset yields better performance.
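Slot value substitution, one of the lightweight augmentation methods mentioned above, can be sketched as follows. This is an illustrative assumption about how such a transformation might look, not the thesis's implementation: it assumes BIO-tagged utterances, and the function name `substitute_slot_values` and the value inventory are hypothetical.

```python
import random
from typing import Dict, List, Tuple


def substitute_slot_values(
    tokens: List[str],
    tags: List[str],
    slot_values: Dict[str, List[str]],
    rng: random.Random,
) -> Tuple[List[str], List[str]]:
    """Replace each slot span with another value of the same slot type.

    Tags are regenerated so the output stays a valid BIO sequence even
    when the new value has a different number of tokens.
    """
    new_tokens: List[str] = []
    new_tags: List[str] = []
    i = 0
    while i < len(tokens):
        tag = tags[i]
        if tag.startswith("B-"):
            slot = tag[2:]
            j = i + 1
            while j < len(tokens) and tags[j] == "I-" + slot:
                j += 1  # extend to the end of this slot span
            original = " ".join(tokens[i:j])
            candidates = [v for v in slot_values.get(slot, []) if v != original]
            value = rng.choice(candidates) if candidates else original
            value_tokens = value.split()
            new_tokens.extend(value_tokens)
            new_tags.extend(["B-" + slot] + ["I-" + slot] * (len(value_tokens) - 1))
            i = j
        else:
            new_tokens.append(tokens[i])
            new_tags.append(tag)
            i += 1
    return new_tokens, new_tags


# Hypothetical value inventory, e.g. collected from the training data itself.
VALUES = {"destination": ["new york", "san francisco"]}
augmented = substitute_slot_values(
    "fly to new york".split(),
    ["O", "O", "B-destination", "I-destination"],
    VALUES,
    random.Random(0),
)
print(augmented)
```

Passing in a `random.Random` instance keeps the augmentation reproducible, which matters when comparing results across low-resource settings.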
|
9 |
Numerical Reasoning in NLP: Challenges, Innovations, and Strategies for Handling Mathematical Equivalency / 自然言語処理における数値推論:数学的同等性の課題、革新、および対処戦略 / Liu, Qianying 25 September 2023
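Kyoto University / New system, course-based doctorate / Doctor of Informatics / Kō No. 24929 / Jōhaku No. 840 / 新制||情||140 (University Library) / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / (Chief examiner) Program-Specific Professor Sadao Kurohashi, Professor Tatsuya Kawahara, Professor Ko Nishino / Qualified under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM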
|
10 |
Towards Informal Computer Human Communication: Detecting Humor in a Restricted Domain / Taylor, Julia Michelle January 2008
No description available.
|