91

Implementation of Constraint Propagation Tree for Question Answering Systems

Palavalasa, Swetha Rao 01 January 2009 (has links)
Computing with Words based Question Answering (CWQA) systems provide a foundation for developing futuristic search engines in which information retrieval relies more on reasoning and less on pattern matching and statistical methods. To perform successful reasoning, these systems must analyze the semantics of the query and of the related information in the knowledge base. Computing with Words (CW), a form of perception-based reasoning in which perceptions are manipulated by means of fuzzy set theory and fuzzy logic, plays a key role in recognition, decision and execution processes and can be utilized for this purpose. Two concepts introduced by Computing with Words are the Generalized Constraint Language (GCL) and the Generalized Theory of Uncertainty (GTU). In GCL, propositions, i.e. perceptions in natural language, are denoted using generalized constraints. GTU uses GCL to express propositions drawn from natural language as generalized constraints; GCL thus plays a fundamental role in GTU by serving as a precisiation language for propositions, commands and questions in natural language. In GTU, deduction rules are used to propagate generalized constraints in order to accomplish reasoning under uncertainty. Previous work introduced a CW-based QA-system methodology that uses a knowledge tree data structure, called a Constraint Propagation Tree (CPT), built on the concepts outlined above. The main goal of this work is the realization of the Constraint Propagation Tree (the first phase) and a partial implementation of constraint propagation and node combination (the second phase).
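To make the idea concrete, here is a minimal sketch of a constraint propagation tree node: a generalized constraint "X is R" whose fuzzy relation R is combined with its children's constraints during propagation. The class names and the trapezoidal membership function are illustrative assumptions, not taken from the thesis itself.

```python
# Illustrative sketch only: a minimal Constraint Propagation Tree node.
# The names and the trapezoidal membership function are assumptions.
from dataclasses import dataclass, field
from typing import Callable, List

Membership = Callable[[float], float]  # maps a value to a degree in [0, 1]

def trapezoid(a: float, b: float, c: float, d: float) -> Membership:
    """Classic trapezoidal fuzzy membership function."""
    def mu(x: float) -> float:
        if x <= a or x >= d:
            return 0.0
        if b <= x <= c:
            return 1.0
        return (x - a) / (b - a) if x < b else (d - x) / (d - c)
    return mu

@dataclass
class CPTNode:
    """A generalized constraint 'X is R' with child constraints."""
    variable: str
    relation: Membership
    children: List["CPTNode"] = field(default_factory=list)

    def degree(self, x: float) -> float:
        # Combine this node's constraint with its children using min
        # (the standard fuzzy conjunction); propagation walks the tree.
        d = self.relation(x)
        for child in self.children:
            d = min(d, child.degree(x))
        return d

# "Age is young" with a child constraint "Age is about 25"
young = CPTNode("Age", trapezoid(0, 0, 25, 35))
young.children.append(CPTNode("Age", trapezoid(20, 24, 26, 30)))
print(young.degree(25.0))  # -> 1.0
```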
92

Answering Deep Queries Specified in Natural Language with Respect to a Frame Based Knowledge Base and Developing Related Natural Language Understanding Components

January 2015 (has links)
abstract: Question Answering has been under active research for decades, but it has recently taken the spotlight following IBM Watson's success in Jeopardy! and the arrival of digital assistants such as Apple's Siri, Google Now, and Microsoft Cortana on every smartphone and browser. However, most Question Answering research targets factual questions rather than deep ones such as "How" and "Why" questions. In this dissertation, I suggest a different approach to this problem: the answers to deep questions need to be formally defined before they can be found. Because these answers must be defined with respect to something more structured than natural language text, I define Knowledge Description Graphs (KDGs), a graphical structure containing information about events, entities, and classes. I then propose formulations and algorithms to construct KDGs from a frame-based knowledge base, define the answers to various "How" and "Why" questions with respect to KDGs, and show how to obtain the answers from KDGs using Answer Set Programming. Moreover, I discuss how to derive missing information when constructing KDGs from an under-specified knowledge base, and how to answer many factual question types with respect to the knowledge base. Having defined the answers to various questions with respect to a knowledge base, I extend the research to specifying deep questions and the knowledge base in natural language, and to generating natural language text from those specifications. Toward these goals, I developed NL2KR, a system that helps translate natural language to formal languages. I show NL2KR's use in translating "How" and "Why" questions and in generating simple natural language sentences from a natural language KDG specification. Finally, I discuss applications of the developed components in Natural Language Understanding. / Dissertation/Thesis / Doctoral Dissertation Computer Science 2015
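The core intuition, that a "Why" answer is defined over a graph of events and entities, can be sketched in a few lines. The edge labels and toy facts below are invented for illustration; the dissertation itself defines answers via Answer Set Programming rather than this plain traversal.

```python
# Minimal sketch of the idea behind a Knowledge Description Graph (KDG):
# events, entities and classes as nodes, with labeled edges between them.
# Labels and facts are invented; this is not the dissertation's formalism.
from collections import defaultdict

kdg = defaultdict(list)  # node -> list of (label, node)

def add_edge(src: str, label: str, dst: str) -> None:
    kdg[src].append((label, dst))

add_edge("ice_melts", "caused_by", "temperature_rises")
add_edge("temperature_rises", "caused_by", "sun_heats_air")

def answer_why(event: str) -> list:
    """Answer a 'Why' question by chaining 'caused_by' edges."""
    causes, frontier = [], [event]
    while frontier:
        node = frontier.pop()
        for label, dst in kdg[node]:
            if label == "caused_by":
                causes.append(dst)
                frontier.append(dst)
    return causes

print(answer_why("ice_melts"))  # -> ['temperature_rises', 'sun_heats_air']
```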
93

Um sistema inteligente baseado em ontologia para apoio ao esclarecimento de dúvida / An ontology-based intelligent system to support the clarification of doubts

Amorim, Marta Talitha Carvalho Freire de 31 August 2012 (has links)
When people want to learn a concept, the most common approach is to use a search engine such as Google, Yahoo, or Bing. A natural language query is submitted to the search tool, which returns a large number of pages related to the concept being studied. The returned pages are usually listed and ranked mainly by keyword matching rather than by the interpretation and relevance of the terms found. The user must then read many pages and select the ones most appropriate to his or her needs. This kind of behavior takes time, and the learner's focus is dispersed from the goal. Intelligent systems that support the clarification of doubts aim to solve this problem by presenting the most accurate answers to questions or sentences posed in natural language; examples of such systems are question-answering systems and intelligent help desks. This work uses an architecture for a question-answering system based on three steps: question analysis, answer selection and extraction, and answer generation. One of the merits of this architecture is that it combines complementary techniques: ontologies, information retrieval techniques, and a knowledge base written in the AIML language to extract answers quickly. The focus of this work is answering English WH-questions (What, Who, When, Where, Which).
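For readers unfamiliar with AIML, the sketch below shows how an AIML-style knowledge base pairs question patterns with answer templates. The category content is invented for illustration; real AIML additionally supports wildcards, <srai> redirection and context tags.

```python
# Minimal sketch of matching a question against an AIML-style category.
# The category shown is invented; this is not the thesis's knowledge base.
import re
import xml.etree.ElementTree as ET

AIML = """
<aiml>
  <category>
    <pattern>WHAT IS AN ONTOLOGY</pattern>
    <template>An ontology is a formal specification of a shared
    conceptualization of a domain.</template>
  </category>
</aiml>
"""

def load_categories(aiml_text: str) -> dict:
    root = ET.fromstring(aiml_text)
    return {c.findtext("pattern").strip(): " ".join(c.findtext("template").split())
            for c in root.iter("category")}

kb = load_categories(AIML)

def answer(question: str) -> str:
    key = re.sub(r"[^\w\s]", "", question).upper().strip()
    return kb.get(key, "No answer found.")

print(answer("What is an ontology?"))
```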
94

Uma arquitetura de question-answering instanciada no domínio de doenças crônicas / A question-answering architecture instantiated in the domain of chronic diseases

Luciana Farina Almansa 08 August 2016 (has links)
The information recorded in medical records registers the patient's state of health and assists the professionals directly involved in clinical treatment. Investigating this clinical information in biomedical research can support the development of patterns for the prevention and treatment of diseases. However, reading scientific articles demands time and effort: searching for specific information is not a simple task, and the medical and healthcare field is constantly being updated. In addition, most professionals in this area have a stressful routine, holding several jobs and seeing many patients in a single day. The goal of this project is to develop a Question Answering Framework (QASF) to support the construction of QA systems that help healthcare professionals search quickly for information, specifically on epigenetics, chronic diseases, and thyroid images. Two frameworks previously developed by the author's research group, SisViDAS and FREDS, are reused to compose the document processing module, while the remaining question- and answer-processing modules are developed from scratch. The QASF was evaluated with a reference collection and statistical performance measures; the results show precision around 0.7 at recall 0.3 when 200 articles were retrieved and analyzed. Considering that the questions submitted to the QASF are long (70 terms per question on average) and complex, these results are satisfactory. This project intends to reduce the time healthcare professionals spend searching for information of interest, since QA systems provide direct and precise answers to a question asked by the user in natural language.
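The figures quoted above (precision around 0.7 at recall 0.3) come from the standard set-based retrieval measures. A short sketch of that computation follows; the document identifiers are hypothetical, and in practice the relevant set comes from the reference collection.

```python
# Sketch of the precision/recall computation behind the quoted figures.
# Document identifiers are hypothetical placeholders.
def precision_recall(retrieved: set, relevant: set) -> tuple:
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

retrieved = {f"doc{i}" for i in range(10)}  # what the system returned
relevant = {f"doc{i}" for i in range(7)} | {f"rel{i}" for i in range(16)}

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # -> precision=0.70 recall=0.30
```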
95

Unsupervised Morphological Segmentation and Part-of-Speech Tagging for Low-Resource Scenarios

Eskander, Ramy January 2021 (has links)
With the high cost of manually labeling data and the increasing interest in low-resource languages, for which human annotators might not even be available, unsupervised approaches have become essential for processing a typologically diverse set of languages, whether high-resource or low-resource. In this work, we propose new fully unsupervised approaches for two tasks in morphology: unsupervised morphological segmentation and unsupervised cross-lingual part-of-speech (POS) tagging, both essential subtasks for several downstream NLP applications, such as machine translation, speech recognition, information extraction and question answering. We propose a new unsupervised morphological-segmentation approach that utilizes Adaptor Grammars (AGs), nonparametric Bayesian models that generalize probabilistic context-free grammars (PCFGs), where a PCFG models word structure in the task of morphological segmentation. We implement the approach as a publicly available morphological-segmentation framework, MorphAGram, that enables unsupervised morphological segmentation through the use of several proposed language-independent grammars. In addition, the framework allows for the use of scholar knowledge, when available, in the form of affixes that can be seeded into the grammars. The framework handles the cases where the scholar-seeded knowledge is either generated from language resources, possibly by someone who does not know the language, as weak linguistic priors, or generated by an expert in the underlying language as strong linguistic priors. Another form of linguistic priors is the design of a grammar that models language-dependent specifications. We also propose a fully unsupervised learning setting that approximates the effect of scholar-seeded knowledge through self-training. Moreover, since no single grammar works best across all languages, we propose an approach that picks a nearly optimal configuration (a learning setting and a grammar) for an unseen language, one that is not part of development. Finally, we examine multilingual learning for unsupervised morphological segmentation in low-resource setups. For unsupervised POS tagging, two cross-lingual approaches have been widely adopted: 1) annotation projection, where POS annotations are projected across an aligned parallel text from a source language, for which a POS tagger is available, to the target language prior to training a POS model; and 2) zero-shot model transfer, where a model of a source language is directly applied to texts in the target language. We propose an end-to-end architecture for unsupervised cross-lingual POS tagging via annotation projection in truly low-resource scenarios that do not assume access to parallel corpora that are large in size or represent a specific domain. We integrate and expand the best practices in alignment and projection and design a rich neural architecture that exploits non-contextualized and transformer-based contextualized word embeddings, affix embeddings and word-cluster embeddings. Additionally, since parallel data might be available between the target language and multiple source languages, as in the case of the Bible, we propose different approaches for learning from multiple sources.
Finally, we combine our work on unsupervised morphological segmentation and unsupervised cross-lingual POS tagging by conducting unsupervised stem-based cross-lingual POS tagging via annotation projection, which relies on the stem as the core unit of abstraction for alignment and projection and is beneficial for low-resource, morphologically complex languages. We also examine morpheme-based alignment and projection, the use of linguistic priors towards better POS models, and the use of segmentation information as learning features in the neural architecture. We conduct comprehensive evaluation and analysis to assess the performance of our approaches to unsupervised morphological segmentation and unsupervised POS tagging, and show that they achieve state-of-the-art performance for the two morphology tasks when evaluated on a large set of languages of different typologies: analytic, fusional, agglutinative and synthetic/polysynthetic.
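The first of the two cross-lingual strategies, annotation projection, reduces to a simple copy along word-alignment links. The toy sketch below illustrates only that core step; the sentences, tags and alignment are invented, and the thesis's full pipeline adds alignment filtering and a rich neural tagger on top.

```python
# Toy sketch of POS annotation projection across a word-aligned parallel
# sentence pair. All data here is invented for illustration.
src_tokens = ["the", "house", "is", "red"]
src_tags   = ["DET", "NOUN", "VERB", "ADJ"]    # from a source-language tagger
tgt_tokens = ["das", "Haus", "ist", "rot"]
alignment  = [(0, 0), (1, 1), (2, 2), (3, 3)]  # (source index, target index)

def project(src_tags, tgt_len, alignment):
    """Copy each aligned source tag onto its target position."""
    tgt_tags = ["UNK"] * tgt_len               # unaligned tokens stay UNK
    for s, t in alignment:
        tgt_tags[t] = src_tags[s]
    return tgt_tags

projected = project(src_tags, len(tgt_tokens), alignment)
print(list(zip(tgt_tokens, projected)))
# -> [('das', 'DET'), ('Haus', 'NOUN'), ('ist', 'VERB'), ('rot', 'ADJ')]
```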
96

Textual Inference for Machine Comprehension / Inférence textuelle pour la compréhension automatique

Gleize, Martin 07 January 2016 (has links)
With the ever-growing mass of published text, natural language understanding stands as one of the most sought-after goals of artificial intelligence. In natural language, not every fact expressed in a text is explicit: human readers naturally infer what is missing through intuitive linguistic skills, common sense or domain-specific knowledge, and life experience. Natural Language Processing (NLP) systems do not have these capabilities. Unable to draw the inferences that fill the gaps in a text, they cannot truly understand it. This dissertation focuses on this problem and presents our work on the automatic resolution of textual inferences in the context of machine reading. A textual inference is defined as a relation between two fragments of text: a human reading the first can reasonably infer that the second is true. Many NLP tasks more or less directly evaluate systems on their ability to recognize textual inference. Within this multiplicity of evaluation frameworks, the inferences themselves present a wide variety of types. We reflect on inferences for NLP from a theoretical standpoint and present two contributions addressing these levels of diversity: an abstract contextualized inference task encompassing most NLP inference-related tasks, and a novel hierarchical taxonomy of textual inferences based on their difficulty.
Automatically recognizing textual inference almost always involves a machine learning model, trained to use various linguistic features on a labeled dataset of textual inference samples. However, data specific to complex inference phenomena is not yet abundant enough for systems to directly learn world knowledge and commonsense reasoning. Instead, systems focus on learning how to use the syntactic structure of sentences to align the words of two semantically related sentences. To extend what they know of the world, they include external background knowledge, which often improves their results; but this knowledge is usually added on top of other features and rarely well integrated with sentence structure. The main contributions of this thesis address the preceding concern, with the aim of solving complex natural language understanding tasks. Starting from the hypothesis that a simpler lexicon should make it easier to compare the meaning of two sentences, we present a passage retrieval method using structured lexical expansion backed by a simplification dictionary. This simplification hypothesis is tested again in a contribution on recognizing textual entailment: syntactic paraphrases are extracted from the same dictionary and applied repeatedly to the first sentence to turn it into the second. We then present a kernel-based machine learning method for recognizing sentence rewritings, with a notion of types able to encode lexical-semantic knowledge. This approach is effective on three tasks: paraphrase identification, textual entailment and question answering.
In our last contribution, we address its lack of scalability while keeping most of its strengths. Reading comprehension tests are used for evaluation: these multiple-choice questions on short texts constitute the most practical way to assess textual inference within a complete context. Our system is founded on an efficient tree edit algorithm, and the features extracted from edit sequences are used to build two classifiers for the validation and invalidation of answer candidates. This approach reached second place in the "Entrance Exams" challenge at CLEF 2015.
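The last step, turning an edit sequence into classifier features, can be sketched compactly. In the snippet below, difflib's token-level opcodes stand in for the thesis's tree edit algorithm, and the sentences are invented; the real system extracts richer features from tree edits.

```python
# Rough sketch of deriving features from an edit sequence between a
# passage and an answer candidate. difflib is a simplified stand-in
# for the tree edit algorithm described above; the text is invented.
from collections import Counter
from difflib import SequenceMatcher

passage = "the committee approved the proposal last week".split()
candidate = "the committee rejected the proposal".split()

opcodes = SequenceMatcher(None, passage, candidate).get_opcodes()
features = Counter(tag for tag, *_ in opcodes)  # counts per edit type

print(dict(features))  # e.g. {'equal': 2, 'replace': 1, 'delete': 1}
# A validation classifier would consume such counts (plus lexical and
# syntactic cues) to score each answer choice.
```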
97

Question Answering on RDF Data Cubes

Höffner, Konrad 26 March 2021 (has links)
The Semantic Web, a Web of Data, is an extension of the World Wide Web (WWW), a Web of Documents. A large amount of such data is freely available as Linked Open Data (LOD) for many areas of knowledge, forming the LOD Cloud. While this data conforms to the Resource Description Framework (RDF) and can thus be processed by machines, users need to master a formal query language and learn a specific vocabulary. Semantic Question Answering (SQA) systems remove those access barriers by letting the user ask natural language questions that the systems translate into formal queries. The research area of SQA thus plays an important role in the acceptance and benefit of the Semantic Web. The original contributions of this thesis to SQA are as follows. First, we survey the current state of the art of SQA. We complement existing surveys by systematically identifying SQA publications in the chosen timeframe: 72 publications describing 62 different systems are systematically and manually selected, using predefined inclusion and exclusion criteria, out of 1960 candidates from the end of 2010 to July 2015. The survey identifies common challenges, structured solutions, and recommendations on research opportunities for future systems. From that point on, we focus on multidimensional numerical data, which is immensely valuable as it influences decisions in health care, policy and finance, among others. With the growth of the open data movement, more and more of it is becoming freely available, and a large amount of such data is included in the LOD cloud using the RDF Data Cube (RDC) vocabulary. However, consuming multidimensional numerical data requires experts and specialized tools, and traditional SQA systems cannot process RDCs because their meta-structure is opaque to applications that expect facts to be encoded in single triples. This motivates our second contribution: the design and implementation of the first SQA algorithm on RDF Data Cubes. We kick-start this new research subfield by creating a user question corpus and a benchmark over multiple data sets. The evaluation of our system on the benchmark, which is included in the public Question Answering over Linked Data (QALD) challenge of 2016, shows the feasibility of the approach, but also highlights challenges, which we discuss in detail as a starting point for future work in the field. The benchmark is based on our final contribution: the addition of 955 financial government spending data sets to the LOD cloud by transforming data sets of the OpenSpending project to RDF Data Cubes. Open spending data has the power to reduce corruption by increasing accountability, and it strengthens democracy because voters can make better informed decisions. An informed and trusting public also strengthens the government itself, which becomes more likely to commit to large projects. OpenSpending.org is an open platform that provides public finance data from governments around the world.
The transformation result, called LinkedSpending, consists of more than five million planned and carried out financial transactions in 955 data sets from all over the world as Linked Open Data and is freely available and openly licensed.
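The "meta-structure" issue mentioned above is easiest to see in code: in an RDF Data Cube, a fact is an observation node with several attached dimension and measure triples, not a single triple. The sketch below, which assumes the rdflib package is installed and uses invented data set, dimension and measure URIs, shows how such an observation is queried.

```python
# A minimal sketch (assuming rdflib is installed) of querying an RDF Data
# Cube observation. The ex: URIs and values are invented for illustration.
from rdflib import Graph

TURTLE = """
@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix ex: <http://example.org/> .

ex:obs1 a qb:Observation ;
    qb:dataSet ex:spending2013 ;
    ex:refYear 2013 ;
    ex:country "Germany" ;
    ex:amount  1200000 .
"""

QUERY = """
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX ex: <http://example.org/>
SELECT ?amount WHERE {
    ?obs a qb:Observation ;
         qb:dataSet ex:spending2013 ;
         ex:country "Germany" ;
         ex:amount  ?amount .
}
"""

g = Graph()
g.parse(data=TURTLE, format="turtle")
for row in g.query(QUERY):
    print(row.amount)  # -> 1200000
```

An SQA system for data cubes must generate queries of this shape, with one triple pattern per dimension constraint, rather than the single fact triple that traditional SQA systems expect.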
98

Znalec encyklopedie / Encyclopedia Expert

Krč, Martin January 2009 (has links)
This project focuses on a system that answers questions formulated in natural language. First, the report discusses problems associated with question answering systems and some commonly employed approaches. Emphasis is placed on shallow methods, which do not require many linguistic resources. The second part describes our work on a system that answers factoid questions, utilizing Czech Wikipedia as its source of information. Answer extraction is based partly on specific features of Wikipedia and partly on pre-defined patterns. Results show that for answering simple questions, the system provides a significant improvement over a standard search engine.
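To illustrate what pattern-based factoid extraction looks like, here is a sketch that pulls a birth date from an encyclopedia-style definition sentence. The pattern and the example text are invented; the actual system applies such pre-defined patterns, together with Wikipedia-specific features, to Czech text.

```python
# Illustrative sketch of pattern-based factoid extraction from an
# encyclopedia-style first sentence. Pattern and text are invented.
import re

article_first_sentence = ("Johannes Kepler (27 December 1571 - 15 November "
                          "1630) was a German astronomer and mathematician.")

# "When was X born?" -> first date inside the parenthesized life span
BORN_PATTERN = re.compile(r"\((\d{1,2} \w+ \d{4})\s*[-–]")

match = BORN_PATTERN.search(article_first_sentence)
if match:
    print(match.group(1))  # -> 27 December 1571
```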
99

Neural Network Models for Tasks in Open-Domain and Closed-Domain Question Answering

Chen, Charles L. 01 June 2020 (has links)
No description available.
100

On Advancing Natural Language Interfaces: Data Collection, Model Development, and User Interaction

Yao, Ziyu January 2021 (has links)
No description available.
