Global ETD Search

1	Validation de réponses dans un système de questions réponses / Answer validation in question answering system Grappy, Arnaud 08 November 2011
As the amount of knowledge available on the Internet has grown, finding a specific piece of information has become harder. Search engines return Web pages that are supposed to contain the desired information, based on keywords. However, the user still has to find the right query and examine the returned documents. Question answering systems aim to return a concise answer directly from a question asked in natural language. The answer is usually accompanied by a text passage that is supposed to justify it. For example, for the question "Who is the director of Avatar?" the answer "James Cameron" may be returned together with "James Cameron directed Avatar." This thesis focuses on answer validation, which automatically determines whether an answer is valid. An answer is valid if it is correct (it actually answers the question) and justified by the text passage. This validation improves question answering systems by returning only valid answers to the user. Approaches for recognizing valid answers fall into two broad categories: approaches that use a particular representation formalism for the question and the passage and compare the resulting structures; and learning-based approaches that combine various lexical or syntactic criteria. To identify the phenomena underlying answer validation, we took part in the creation of a manually annotated corpus. These phenomena are of various kinds, such as paraphrase or coreference. 
It can also be observed that the pieces of information are spread over several sentences, or are even missing from the passages containing the answer. A second study, on a corpus of questions, looked at the different pieces of information that must be verified to detect that an answer is valid. This study showed that the three most frequent phenomena are the verification of the answer type and of the date and the place contained in the question. These studies made it possible to develop our answer validation system, which relies on a combination of criteria. Some criteria deal with the presence of the question words in the passage, which indicates whether the information in the question is present. Date information received special treatment: an answer is marked invalid if the passage does not contain the date given in the question. Other criteria, including the proximity in the passage of the question words and the answer, concern the link between the different question words in the passage. The second main type of verification measures the compatibility between the answer and the question. A number of questions expect an answer of a particular type. The question in the previous example thus expects a director as its answer. If the answer is not of this type, it is incorrect. Because this information may not be found in the justifying passage, it is looked up in other documents using the structure of Wikipedia pages, using syntactic patterns, or using the frequencies with which the type and the answer appear in documents. Type verification is particularly effective, since it makes 80% correct detections. 
Validating answers is also worthwhile: when the system took part in the AVE 2008 evaluation campaign, it ranked among the best across all languages. The last contribution was to integrate the validation module into a question answering system, QAVAL. In this setting, QAVAL extracts many answers and orders them using the answer validation module. The module is no longer used to detect valid answers but to provide a confidence score for each answer. QAVAL can thus be used to search newspaper articles as well as articles from the Web. The results are fairly good, since they exceed those obtained by a simple ranking of the answers by nearly 50%. / Question answering systems extract precise answers from a set of documents and return them along with text snippets that justify them. For example, to the question "Who is the director of Avatar?" the answer "James Cameron" may be returned with "Avatar by James Cameron." Answer validation automatically detects whether an answer is valid, i.e. whether it is correct (it answers the question) and justified by the text passage. This validation improves question answering systems by producing only valid answers. Two kinds of methods can be used to detect correct answers: approaches that use a specific representation formalism for the question and the passage and compare the resulting structures; and learning approaches that combine lexical and syntactic features. To identify the phenomena that characterize answer validation, we built a manually annotated corpus. Different phenomena can be observed, such as paraphrasing, coreference, or information that is spread over different sentences or documents. A second corpus aims to identify the different pieces of information that must be checked to validate an answer. 
This study showed that the three main phenomena are the answer type and the date and place given in the question. These studies helped us develop our answer validation system, which is based on a combination of features. The first kind of feature concerns the terms of the question: one feature estimates the proportion of terms shared by the snippet and the question, and another measures the proximity of these terms to the answer. The second kind of feature measures the compatibility between the answer and the question. Many questions expect answers of an explicit type. For example, the question “Which president succeeded Jacques Chirac?” requires an instance of president as its answer. If the answer is not of this type, it is incorrect. The method aims to verify that an answer given by a system corresponds to the expected type. This verification combines features provided by different methods. The first features are statistical and compute the rate at which both the answer and the type occur in documents; other features rely on named entity recognizers; and the last ones are based on Wikipedia. Type checking is particularly effective, making 80% correct detections. The final contribution was to integrate the validation module into a question answering system, QAVAL. Many answers are retrieved by QAVAL and ordered by the answer validation module. The module provides a confidence score for each answer. QAVAL can be used to search for information both in newspaper articles and in articles from the Web. The results are good, exceeding those obtained by a simple answer ranking by nearly 50%.
Keywords: question answering system, answer validation, textual entailment
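The two lexical features mentioned in this abstract (the share of question terms found in the snippet, and the proximity of those terms to the answer) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the naive tokenizer, the function names `overlap_ratio` and `min_distance`, and the absence of stop-word filtering are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (deliberately naive)."""
    return re.findall(r"\w+", text.lower())


def overlap_ratio(question, passage):
    """Share of distinct question terms that also occur in the passage."""
    q_terms = set(tokenize(question))
    p_terms = set(tokenize(passage))
    if not q_terms:
        return 0.0
    return len(q_terms & p_terms) / len(q_terms)


def min_distance(question, answer, passage):
    """Smallest token distance in the passage between a question term
    and an answer term; None if either kind of term is absent."""
    p_tokens = tokenize(passage)
    q_terms = set(tokenize(question))
    a_terms = set(tokenize(answer))
    # Positions of question terms (excluding words that belong to the answer).
    q_pos = [i for i, t in enumerate(p_tokens) if t in q_terms and t not in a_terms]
    a_pos = [i for i, t in enumerate(p_tokens) if t in a_terms]
    if not q_pos or not a_pos:
        return None
    return min(abs(i - j) for i in q_pos for j in a_pos)


if __name__ == "__main__":
    q = "Who is the director of Avatar?"
    p = "James Cameron directed Avatar."
    print(overlap_ratio(q, p), min_distance(q, "James Cameron", p))
```

In a learning-based validator of the kind the abstract describes, values like these would be only two of several features fed to a classifier; a real system would likely also normalize word forms so that "director" and "directed" can match.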
2	Low-resource Language Question Answering System with BERT Jansson, Herman January 2021
Staying at the forefront of information retrieval systems is becoming increasingly complex. BERT, a recent natural language processing technology, has reached superhuman performance on reading comprehension tasks in high-resource languages. However, several researchers have stated that multilingual models are not enough for low-resource languages, since they lack a thorough understanding of those languages. Recently, a Swedish pre-trained BERT model has been introduced, trained on significantly more Swedish data than the multilingual models currently available. This study compares multilingual and Swedish monolingual BERT-derived models for question answering, using both an English and a machine-translated Swedish SQuADv2 data set during fine-tuning. The models are evaluated on the SQuADv2 benchmark and within an implemented question answering system built on the classical retriever-reader methodology. This study introduces a naive and a more robust prediction method for the proposed question answering system, and finds a sweet spot for each individual model approach integrated into the system. The question answering system is evaluated and compared against another question answering library at the leading edge of the area, using a custom-crafted Swedish evaluation data set. The results show that the model fine-tuned from the Swedish pre-trained model on the Swedish SQuADv2 data set was superior in all evaluation metrics except speed. The comparison between the systems showed a higher evaluation score but a slower prediction time for this study's system.
Keywords: BERT, Question Answering system, Reading Comprehension, Low resource language, SQuADv2, Computer Systems
3	Automated question answering : template-based approach Sneiders, Eriks January 2002
The rapid growth in the development of Internet-based information systems increases the demand for natural language interfaces that are easy to set up and maintain. Unfortunately, the problem of understanding natural language queries is far from being solved. Therefore this research proposes a simpler task of matching a one-sentence-long user question to a number of question templates, which cover the knowledge domain of the information system, without in-depth understanding of the user question itself.
The research started with development of an FAQ (Frequently Asked Question) answering system that provides pre-stored answers to user questions asked in ordinary English. The language processing technique developed for FAQ retrieval does not analyze user questions. Instead, analysis is applied to FAQs in the database long before any user questions are submitted. Thus, the work of FAQ retrieval is reduced to keyword matching without understanding the questions, and the system still creates an illusion of intelligence.
Further, the research adapted the FAQ answering technique to a question-answering interface for a structured database, e.g., a relational database. The entity-relationship model of the database is covered with an exhaustive collection of question templates - dynamic, parameterized "frequently asked questions" - that describe the entities, their attributes, and the relationships in the form of natural language questions. Unlike a static FAQ, a question template contains entity slots - free space for data instances that represent the main concepts in the question. In order to answer a user question, the system finds matching question templates and data instances that fill the entity slots. The associated answer templates create the answer.
Finally, the thesis introduces a generic model of template-based question answering which is a summary and generalization of the features common to the above systems: they (i) split the application-specific knowledge domain into a number of question-specific knowledge domains, (ii) attach a question template, whose answer is known in advance, to each knowledge domain, and (iii) match the submitted user question to each question template within the context of its own knowledge domain.
Keywords: automated question answering, FAQ answering, question-answering system, template-based question answering, question template, natural language based interface
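The idea of matching a user question to question templates with entity slots can be sketched in a few lines of Python. Everything below is hypothetical and only illustrates the general technique: the `PRODUCTS` table, the `TEMPLATES` list, the keyword-subset matching rule and the function name `answer` are assumptions, not the matching procedure developed in the thesis.

```python
import re

# Hypothetical data instances from a tiny "product" table.
PRODUCTS = {
    "laptop": {"price": "900 EUR", "stock": 4},
    "phone": {"price": "500 EUR", "stock": 0},
}

# Hypothetical question templates: keywords that must all appear in the
# user question, plus an answer template with slots to fill.
TEMPLATES = [
    {"keywords": {"price"}, "answer": "The {product} costs {price}."},
    {"keywords": {"stock"}, "answer": "There are {stock} units of {product} in stock."},
]


def answer(question):
    """Return a filled answer template, or None if nothing matches."""
    tokens = set(re.findall(r"\w+", question.lower()))
    # Find a data instance that can fill the entity slot.
    product = next((name for name in PRODUCTS if name in tokens), None)
    if product is None:
        return None
    # Pick the first template whose keywords all occur in the question.
    for template in TEMPLATES:
        if template["keywords"] <= tokens:
            return template["answer"].format(product=product, **PRODUCTS[product])
    return None


if __name__ == "__main__":
    print(answer("What is the price of the laptop?"))
```

Note that the sketch never parses the question: it only checks for keywords and known data instances, which is the spirit of the keyword-matching approach the abstract describes.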
4	A data mining approach to ontology learning for automatic content-related question-answering in MOOCs Shatnawi, Safwan January 2016
The advent of Massive Open Online Courses (MOOCs) allows a massive volume of registrants to enrol in these courses. This research aims to offer MOOC registrants automatic content-related feedback to fulfil their cognitive needs. A framework is proposed which consists of three modules: the subject ontology learning module, the short text classification module, and the question answering module. Unlike previous research, a regular expression parser approach is used to identify relevant concepts for ontology learning. Also, the relevant concepts are extracted from unstructured documents. To build the concept hierarchy, a frequent pattern mining approach is used, guided by a heuristic function to ensure that sibling concepts are at the same level in the hierarchy. As this process does not require specific lexical or syntactic information, it can be applied to any subject. To validate the approach, the resulting ontology is used in a question-answering system which analyses students' content-related questions and generates answers for them. Textbook end-of-chapter questions and answers are used to validate the question-answering system. The resulting ontology is compared with the use of Text2Onto for the question-answering system, and it achieved favourable results. Finally, different indexing approaches based on a subject's ontology are investigated for classifying short text in MOOC forum discussion data; the investigated indexing approaches are unigram-based, concept-based and hierarchical concept indexing. The experimental results show that the ontology-based feature indexing approaches outperform the unigram-based indexing approach. Experiments are done in binary classification and multi-label classification settings. 
The results are consistent and show that hierarchical concept indexing outperforms both concept-based and unigram-based indexing. The bagging and random forest classifiers achieved the best results among the tested classifiers. 371.33
5	Um sistema inteligente baseado em ontologia para apoio ao esclarecimento de dúvida Amorim, Marta Talitha Carvalho Freire de 31 August 2012
When people want to learn a concept, the most common way is to use a search engine such as Google, Yahoo or Bing. A natural language query is submitted to the search tool, which returns a large number of pages related to the concept. The returned pages are usually listed and organized mainly by keyword matching rather than by the interpretation and relevance of the terms in the query. The user has to read many pages and select the one most appropriate to their needs. This behavior takes time and distracts the learner from their goal. Intelligent systems that support the clarification of doubts are intended to solve this problem by presenting the most accurate answers, or sentences, for questions in natural language. Examples of such systems include question answering systems and intelligent help desks. This work uses an architectural approach to a question answering system based on three steps: question analysis, answer selection and extraction, and answer generation. One of the merits of this architecture is that it uses techniques that complement each other, such as ontologies, information retrieval techniques and a knowledge base written in AIML, to extract the answer quickly. The focus of this work is on answering WH-questions (What, Who, When, Where, Which) in the English language.
Keywords: question answering system, ontology and information retrieval
6	Automated question answering : template-based approach Sneiders, Eriks January 2002
The rapid growth in the development of Internet-based information systems increases the demand for natural language interfaces that are easy to set up and maintain. Unfortunately, the problem of understanding natural language queries is far from being solved. Therefore this research proposes a simpler task of matching a one-sentence-long user question to a number of question templates, which cover the knowledge domain of the information system, without in-depth understanding of the user question itself.
The research started with development of an FAQ (Frequently Asked Question) answering system that provides pre-stored answers to user questions asked in ordinary English. The language processing technique developed for FAQ retrieval does not analyze user questions. Instead, analysis is applied to FAQs in the database long before any user questions are submitted. Thus, the work of FAQ retrieval is reduced to keyword matching without understanding the questions, and the system still creates an illusion of intelligence.
Further, the research adapted the FAQ answering technique to a question-answering interface for a structured database, e.g., a relational database. The entity-relationship model of the database is covered with an exhaustive collection of question templates - dynamic, parameterized "frequently asked questions" - that describe the entities, their attributes, and the relationships in the form of natural language questions. Unlike a static FAQ, a question template contains entity slots - free space for data instances that represent the main concepts in the question. In order to answer a user question, the system finds matching question templates and data instances that fill the entity slots. The associated answer templates create the answer.
Finally, the thesis introduces a generic model of template-based question answering which is a summary and generalization of the features common to the above systems: they (i) split the application-specific knowledge domain into a number of question-specific knowledge domains, (ii) attach a question template, whose answer is known in advance, to each knowledge domain, and (iii) match the submitted user question to each question template within the context of its own knowledge domain.
Keywords: automated question answering, FAQ answering, question-answering system, template-based question answering, question template, natural language based interface
7	Recommending best answer in a collaborative question answering system Chen, Lin January 2009
The World Wide Web has become a medium for people to share information. People use Web-based collaborative tools such as question answering (QA) portals, blogs/forums, email and instant messaging to acquire information and to form online communities. In an online QA portal, a user asks a question and other users can provide answers based on their knowledge, with the question usually being answered by many users. It can become overwhelming and time- or resource-consuming for a user to read all of the answers provided for a given question. Thus, there is a need for a mechanism to rank the provided answers so that users can focus on reading only good-quality answers. The majority of online QA systems use user feedback to rank users’ answers, and the user who asked the question can decide on the best answer. Other users who did not participate in answering the question can also vote to determine the best answer. However, ranking the best answer via this collaborative method is time-consuming and requires the continuous involvement of users to provide the needed feedback. The objective of this research is to discover a way to recommend the best answer automatically, as part of a ranked list of answers for a posted question, without the need for user feedback. The proposed approach combines a non-content-based reputation method and a content-based method to solve the problem of recommending the best answer to the user who posted the question. The non-content method assigns a score to each user which reflects the user’s reputation level in using the QA portal system. Each user is assigned two types of non-content-based reputation scores: a local reputation score and a global reputation score. The local reputation score plays an important role in deciding the reputation level of a user for the category in which the question is asked. 
The global reputation score indicates the prestige of a user across all of the categories in the QA system. Due to the possibility of user cheating, such as awarding the best answer to a friend regardless of the answer quality, a content-based method for determining the quality of a given answer is proposed, alongside the non-content-based reputation method. Answers for a question from different users are compared with an ideal (or expert) answer using traditional Information Retrieval and Natural Language Processing techniques. Each answer provided for a question is assigned a content score according to how well it matched the ideal answer. To evaluate the performance of the proposed methods, each recommended best answer is compared with the best answer determined by one of the most popular link analysis methods, Hyperlink-Induced Topic Search (HITS). The proposed methods are able to yield high accuracy, as shown by correlation scores: Kendall correlation and Spearman correlation. The reputation method outperforms the HITS method in terms of recommending the best answer. The inclusion of the reputation score with the content score improves the overall performance, which is measured through the use of Top-n match scores. Yahoo! Answers
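The combination of a reputation score with a content score can be illustrated by a short Python sketch. The abstract does not state how the two scores are combined or which similarity measure is used, so the bag-of-words cosine similarity, the weighted sum, the `weight` parameter and the function names below are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def bag(text):
    """Bag-of-words term counts for a text (naive tokenization)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_answers(answers, ideal, weight=0.5):
    """Rank users by a weighted sum of reputation and content score.

    answers: list of (user, reputation in [0, 1], answer text).
    ideal:   the ideal (expert) answer the content score compares against.
    weight:  hypothetical share given to reputation; the rest goes to content.
    """
    ideal_bag = bag(ideal)
    scored = [
        (weight * rep + (1 - weight) * cosine(bag(text), ideal_bag), user)
        for user, rep, text in answers
    ]
    return [user for _, user in sorted(scored, reverse=True)]


if __name__ == "__main__":
    candidates = [
        ("ann", 0.9, "Paris is the capital of France"),
        ("bob", 0.2, "I think it is Lyon"),
    ]
    print(rank_answers(candidates, "The capital of France is Paris"))
```

Setting `weight` to 1.0 ranks purely by reputation and 0.0 purely by content, which makes it easy to see how each signal affects the final ordering.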
8	Odpovídání na otázky nad strukturovanými daty / Question Answering over Structured Data Birger, Mark January 2017
This thesis deals with question answering over structured data. In most cases structured data is represented as linked graphs; however, hiding the underlying structure of the data is essential for using such systems as part of natural language interfaces. A question answering system was designed and developed as part of this thesis. In contrast to traditional question answering systems, which are based on linguistic analysis or statistical methods, our system explores the provided graph and, as a result, generates semantic links based on input question-answer pairs. The developed system is independent of the data structure, but for evaluation purposes we used data sets from Wikidata and DBpedia. The quality of the resulting system and of the examined approach was evaluated using a prepared dataset and standard metrics.

Search results