• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 103
  • 8
  • 5
  • 4
  • 2
  • 2
  • 1
  • 1
  • 1
  • 1
  • Tagged with
  • 153
  • 153
  • 73
  • 61
  • 53
  • 52
  • 44
  • 39
  • 36
  • 29
  • 26
  • 26
  • 20
  • 17
  • 17
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
131

Vícejazyčný systém pro odpovídání na otázky nad otevřenou doménou / Multilingual Open-Domain Question Answering

Slávka, Michal January 2021 (has links)
Táto práca sa zaoberá automatickým viacjazyčným zodpovedaním na otázky v otvorenej doméne. V tejto práci sú navrhnuté prístupy k tejto málo prebádanej doméne. Konkrétne skúma, či: (i) použitie prekladu z angličtiny je dostačujúce, (ii) multilinguálne systémy vedia využiť preklad otázky do iných jazykov (iii) alebo je výhodnejšie nepoužívať žiaden preklad. Porovnávam použitie anglického systému založeného na modeli T5, ktorý využíva strojový preklad s natívne viacjazyčnými systémami založenými na viacjazyčnom modeli MT5. Anglický systém so strojovým prekladom mierne prekonáva svoje jednojazyčné náprotivky vo viacerých úlohách. Napriek tomu, že tento model bol natrénovaný na väčšom množstve dát zlepšenie nie je dostatočne signifikantné. To ukazuje, že použitie natívne viacjazyčných systémov je sľubným prístupom pre budúci výskum. Tiež prezentujem metódu získavania dokumentov v rôznych jazykoch pomocou algoritmu BM25 a porovnávam ju s anglickým retrievalom. Používanie viacjazyčných dôkazov sa javí ako prospešné a zlepšuje výkonnosť systému systémov.
132

Mitigation of Data Scarcity Issues for Semantic Classification in a Virtual Patient Dialogue Agent

Stiff, Adam January 2020 (has links)
No description available.
133

Reducing Training Time in Text Visual Question Answering

Behboud, Ghazale 15 July 2022 (has links)
Artificial Intelligence (AI) and Computer Vision (CV) have brought the promise of many applications along with many challenges to solve. The majority of current AI research has been dedicated to single-modal data processing meaning they use only one modality such as visual recognition or text recognition. However, real-world challenges are often a combination of different modalities of data such as text, audio and images. This thesis focuses on solving the Visual Question Answering (VQA) problem which is a significant multi-modal challenge. VQA is defined as a computer vision system that when given a question about an image will answer based on an understanding of both the question and image. The goal is improving the training time of VQA models. In this thesis, Look, Read, Reason and Answer (LoRRA), which is a state-of-the-art architecture, is used as the base model. Then, Reduce Uni-modal Biases (RUBi) is applied to this model to reduce the importance of uni- modal biases in training. Finally, an early stopping strategy is employed to stop the training process once the model accuracy has converged to prevent the model from overfitting. Numerical results are presented which show that training LoRRA with RUBi and early stopping can converge in less than 5 hours. The impact of batch size, learning rate and warm up hyper parameters is also investigated and experimental results are presented. / Graduate
134

[en] IMPROVING THE QUALITY OF THE USER EXPERIENCE BY QUERY ANSWER MODIFICATION / [pt] MELHORANDO A QUALIDADE DA EXPERIÊNCIA DO USUÁRIO ATRAVÉS DA MODIFICAÇÃO DA RESPOSTA DA CONSULTA

JOAO PEDRO VALLADAO PINHEIRO 30 June 2021 (has links)
[pt] A resposta de uma consulta, submetida a um banco de dados ou base de conhecimento, geralmente é longa e pode conter dados redundantes. O usuário é frequentemente forçado a navegar por uma longa resposta, ou refinar e repetir a consulta até que a resposta atinja um tamanho gerenciável. Sem o tratamento adequado, consumir a resposta da consulta pode se tornar uma tarefa tediosa. Este estudo, então, propõe um processo que modifica a apresentação da resposta da consulta para melhorar a qualidade de experiência do usuário, no contexto de uma base de conhecimento RDF. O processo reorganiza a resposta da consulta original aplicando heurísticas para comprimir os resultados. A consulta SPARQL original é modificada e uma exploração sobre o conjunto de resultados começa através de uma navegação guiada sobre predicados e suas facetas. O artigo também inclui experimentos baseados em versões RDF do MusicBrainz, enriquecido com dados do DBpedia, e IMDb, cada um com mais de 200 milhões de triplas RDF. Os experimentos utilizam exemplos de consultas de benchmarks conhecidos. / [en] The answer of a query, submitted to a database or a knowledge base, is often long and may contain redundant data. The user is frequently forced to browse thru a long answer, or to refine and repeat the query until the answer reaches a manageable size. Without proper treatment, consuming the query answer may indeed become a tedious task. This study then proposes a process that modifies the presentation of a query answer to improve the quality of the user s experience, in the context of an RDF knowledge base. The process reorganizes the original query answer by applying heuristics to summarize the results. The original SPARQL query is modified and an exploration over the result set starts thru a guided navigation over predicates and its facets. The article also includes experiments based on RDF versions of MusicBrainz, enriched with DBpedia data, and IMDb, each with over 200 million RDF triples. The experiments use sample queries from well-known benchmarks.
135

Designing a Question Answering System in the Domain of Swedish Technical Consulting Using Deep Learning / Design av ett frågebesvarande system inom svensk konsultverksamhet med användning av djupinlärning

Abrahamsson, Felix January 2018 (has links)
Question Answering systems are greatly sought after in many areas of industry. Unfortunately, as most research in Natural Language Processing is conducted in English, the applicability of such systems to other languages is limited. Moreover, these systems often struggle in dealing with long text sequences. This thesis explores the possibility of applying existing models to the Swedish language, in a domain where the syntax and semantics differ greatly from typical Swedish texts. Additionally, the text length may vary arbitrarily. To solve these problems, transfer learning techniques and state-of-the-art Question Answering models are investigated. Furthermore, a novel, divide-and-conquer based technique for processing long texts is developed. Results show that the transfer learning is partly unsuccessful, but the system is capable of perform reasonably well in the new domain regardless. Furthermore, the system shows great performance improvement on longer text sequences with the use of the new technique. / System som givet en text besvarar frågor är högt eftertraktade inom många arbetsområden. Eftersom majoriteten av all forskning inom naturligtspråkbehandling behandlar engelsk text är de flesta system inte direkt applicerbara på andra språk. Utöver detta har systemen ofta svårt att hantera långa textsekvenser. Denna rapport utforskar möjligheten att applicera existerande modeller på det svenska språket, i en domän där syntaxen och semantiken i språket skiljer sig starkt från typiska svenska texter. Dessutom kan längden på texterna variera godtyckligt. För att lösa dessa problem undersöks flera tekniker inom transferinlärning och frågebesvarande modeller i forskningsfronten. En ny metod för att behandla långa texter utvecklas, baserad på en dekompositionsalgoritm. Resultaten visar på att transfer learning delvis misslyckas givet domänen och modellerna, men att systemet ändå presterar relativt väl i den nya domänen. Utöver detta visas att systemet presterar väl på långa texter med hjälp av den nya metoden.
136

Self-Reflection on Chain-of-Thought Reasoning in Large Language Models / Självreflektion över Chain-of-Thought-resonerande i stora språkmodeller

Praas, Robert January 2023 (has links)
A strong capability of large language models is Chain-of-Thought reasoning. Prompting a model to ‘think step-by-step’ has led to great performance improvements in solving problems such as planning and question answering, and with the extended output it provides some evidence about the rationale behind an answer or decision. In search of better, more robust, and interpretable language model behavior, this work investigates self-reflection in large language models. Here, self-reflection consists of feedback from large language models to medical question-answering and whether the feedback can be used to accurately distinguish between correct and incorrect answers. GPT-3.5-Turbo and GPT-4 provide zero-shot feedback scores to Chain-of-Thought reasoning on the MedQA (medical questionanswering) dataset. The question-answering is evaluated on traits such as being structured, relevant and consistent. We test whether the feedback scores are different for questions that were either correctly or incorrectly answered by Chain-of-Thought reasoning. The potential differences in feedback scores are statistically tested with the Mann-Whitney U test. Graphical visualization and logistic regressions are performed to preliminarily determine whether the feedback scores are indicative to whether the Chain-of-Thought reasoning leads to the right answer. The results indicate that among the reasoning objectives, the feedback models assign higher feedback scores to questions that were answered correctly than those that were answered incorrectly. Graphical visualization shows potential for reviewing questions with low feedback scores, although logistic regressions that aimed to predict whether or not questions were answered correctly mostly defaulted to the majority class. Nonetheless, there seems to be a possibility for more robust output from self-reflecting language systems. / En stark förmåga hos stora språkmodeller är Chain-of-Thought-resonerande. Att prompta en modell att tänka stegvis har lett till stora prestandaförbättringar vid lösandet av problem som planering och frågebesvarande, och med den utökade outputen ger det en del bevis rörande logiken bakom ett svar eller beslut. I sökandet efter bättre, mer robust och tolk bart beteende hos språkmodeller undersöker detta arbete självreflektion i stora språkmodeller. Forskningsfrågan är: I vilken utsträckning kan feedback från stora språkmodeller, såsom GPT-3.5-Turbo och GPT-4, på ett korrekt sätt skilja mellan korrekta och inkorrekta svar i medicinska frågebesvarande uppgifter genom användningen av Chainof-Thought-resonerande? Här ger GPT-3.5-Turbo och GPT-4 zero-shot feedback-poäng till Chain-ofThought-resonerande på datasetet för MedQA (medicinskt frågebesvarande). Frågebesvarandet bör vara strukturerat, relevant och konsekvent. Feedbackpoängen jämförs mellan två grupper av frågor, baserat på om dessa besvarades korrekt eller felaktigt i första hand. Statistisk testning genomförs på skillnaden i feedback-poäng med Mann-Whitney U-testet. Grafisk visualisering och logistiska regressioner utförs för att preliminärt avgöra om feedbackpoängen är indikativa för huruvida Chainof-Thought-resonerande leder till rätt svar. Resultaten indikerar att bland resonemangsmålen tilldelar feedbackmodellerna fler positiva feedbackpoäng till frågor som besvarats korrekt än de som besvarats felaktigt. Grafisk visualisering visar potential för granskandet av frågor med låga feedbackpoäng, även om logistiska regressioner som syftade till att förutsäga om frågorna besvarades korrekt eller inte för det mesta majoritetsklassen. Icke desto mindre verkar det finnas potential för robustare från självreflekterande språksystem.
137

Distilling Multilingual Transformer Models for Efficient Document Retrieval : Distilling multi-Transformer models with distillation losses involving multi-Transformer interactions / Destillering av flerspråkiga transformatormodeller för effektiv dokumentsökning : Destillering av modeller med flera transformatorer med destilleringsförluster som involverar interaktioner mellan flera transformatorer

Liu, Xuecong January 2022 (has links)
Open Domain Question Answering (OpenQA) is a task concerning automatically finding answers to a query from a given set of documents. Language-agnostic OpenQA is an increasingly important research area in the globalised world, where the answers can be in a different language from the question. An OpenQA system generally consists of a document retriever to retrieve relevant passages and a reader to extract answers from the passages. Large Transformers, such as Dense Passage Retrieval (DPR) models, have achieved state-of-the-art performances in document retrievals, but they are computationally expensive in production. Knowledge Distillation (KD) is an effective way to reduce the size and increase the speed of Transformers while retaining their performances. However, most existing research focuses on distilling single Transformer models, instead of multi-Transformer models, as in the case of DPR. This thesis project uses MiniLM and DistilBERT distillation methods, two of the most successful methods to distil the BERT model, to individually distil the passage and query model of a fined-tuned DPR model comprised of two pretrained MPNet models. In addition, the project proposes and tests Embedding Similarity Loss (ESL), a distillation loss designed for the interaction between the passage and query models in DPR architecture. The results show that using ESL results in better students than using MiniLM or DistilBERT loss alone and that combining ESL with any of the other two losses increases their student models’ performances in most cases, especially when training on Information-Seeking Question Answering in Typologically Diverse Languages (TyDi QA) instead of The Stanford Question Answering Dataset 1.1 (SQuAD 1.1). The best resulting 6-layer student DPR model retained more than 90% of the recall and Mean Average Precision (MAP) in Cross-Lingual Transfer (XLT) tasks while reducing the inference time to 63.2%. In Generalised Cross-Lingual Transfer (G-XLT) tasks, it retained only around 42% of the recall and MAP using 53.8% of the inference time. / Domänlöst frågebesvarande är en uppgift som handlar om att automatiskt hitta svar på en fråga från en given uppsättning av dokument. Språkagnostiska domänlöst frågebesvarande är ett allt viktigare forskningsområde i den globaliserade världen, där svaren kan vara på ett annat språk än själva frågan. Ett domänlöst frågebesvarande-system består i allmänhet av en dokumenthämtare som plockar relevanta textavsnitt och en läsare som extraherar svaren från dessa textavsnitt. Stora transformatorer, såsom DPR-modeller (Dense Passage Retrieval), har uppnått toppresultat i dokumenthämtning, men de är beräkningsmässigt dyra i produktion. KD (Knowledge Distillation) är ett effektivt sätt att minska storleken och öka hastigheten hos transformatorer samtidigt som deras prestanda bibehålls. För det mesta är den existerande forskningen dock inriktad på att destillera enstaka transformatormodeller i stället för modeller med flera transformatorer, som i fallet med DPR. I det här examensarbetet används MiniLM- och DistilBERT-destilleringsmetoderna, två av de mest framgångsrika metoderna för att destillera BERT-modellen, för att individuellt destillera text- och frågemodellen i en finjusterad DPRmodell som består av två förinlärda MPNet-modeller. Dessutom föreslås och testas ESL (Embedding Similarity Loss), en destilleringsförlust som är utformad för interaktionen mellan text- och frågemodellerna i DPRarkitekturen. Resultaten visar att användning av ESL resulterar i bättre studenter än om man enbart använder MiniLM eller DistilBERT-förlusten och att kombinationen ESL med någon av de andra två förlusterna ökar deras studentmodellers prestanda i de flesta fall, särskilt när man tränar på TyDi QA (Typologically Diverse Languages) istället för SQuAD 1.1 (The Stanford Question Answering Dataset). Den bästa resulterande 6-lagriga student DPRmodellen behöll mer än 90% av återkallandet och MAP (Mean Average Precision) för XLT-uppgifterna (Cross-Lingual Transfer) samtidigt som tiden för inferens minskades till 63.2%. För G-XLT-uppgifterna (Generalised CrossLingual Transfer) bibehölls endast cirka 42% av återkallelsen och MAP med 53.8% av inferenstiden.
138

Developing a Framework for Geographic Question Answering Systems Using GIS, Natural Language Processing, Machine Learning, and Ontologies

Chen, Wei 02 June 2014 (has links)
No description available.
139

Active Learning for Extractive Question Answering

Marti Roman, Salvador January 2022 (has links)
Data labelling for question answering tasks (QA) is a costly procedure that requires oracles to read lengthy excerpts of texts and reason to extract an answer for a given question from within the text. QA is a task in natural language processing (NLP), where a majority of recent advancements have come from leveraging the vast corpora of unlabelled and unstructured text available online. This work aims to extend this trend in the efficient use of unlabelled text data to the problem of selecting which subset of samples to label in order to maximize performance. This practice of selective labelling is called active learning (AL).  Recent developments in AL for NLP have introduced the use of self-supervised learning on large corpora of text in the labelling process of samples for classification problems. This work adapts this research to the task of question answering and performs an initial exploration of expected performance.  The methods covered in this work use uncertainty estimates obtained from neural networks to guide an incremental labelling process. These estimates are obtained from transformer-based models, previously trained in a self-supervised manner, by calculating the entropy of the confidence scores or with an approximation of Bayesian uncertainty obtained through Monte Carlo dropout. These methods are evaluated on two different benchmarking QA datasets: SQuAD v1 and TriviaQA.  Several factors are observed to influence the behaviour of these uncertainty-based acquisition functions, including the choice of language model used, the presence of unanswered questions and the acquisition size used in the incremental process. The study produces no evidence to support that averaging or selecting maximal uncertainty values between the classification of an answer’s starting and ending positions affects sample acquisition quality. However, language model choice, the presence of unanswerable questions and acquisition size are all identified as key factors affecting consistency between runs and degree of success.
140

Neural Sequence Modeling for Domain-Specific Language Processing: A Systematic Approach

Zhu, Ming 14 August 2023 (has links)
In recent years, deep learning based sequence modeling (neural sequence modeling) techniques have made substantial progress in many tasks, including information retrieval, question answering, information extraction, machine translation, etc. Benefiting from the highly scalable attention-based Transformer architecture and enormous open access online data, large-scale pre-trained language models have shown great modeling and generalization capacity for sequential data. However, not all domains benefit equally from the rapid development of neural sequence modeling. Domains like healthcare and software engineering have vast amounts of sequential data containing rich knowledge, yet remain under-explored due to a number of challenges: 1) the distribution of the sequences in specific domains is different from the general domain; 2) the effective comprehension of domain-specific data usually relies on domain knowledge; and 3) the labelled data is usually scarce and expensive to get in domain-specific settings. In this thesis, we focus on the research problem of applying neural sequence modeling methods to address both common and domain-specific challenges from the healthcare and software engineering domains. We systematically investigate neural-based machine learning approaches to address the above challenges in three research directions: 1) learning with long sequences, 2) learning from domain knowledge and 3) learning under limited supervision. Our work can also potentially benefit more domains with large amounts of sequential data. / Doctor of Philosophy / In the last few years, computer programs that learn and understand human languages (an area called machine learning for natural language processing) have significantly improved. These advances are visible in various areas such as retrieving information, answering questions, extracting key details from texts, and translating between languages. A key to these successes has been the use of a type of neural network structure known as a "Transformer", which can process and learn from lots of information found online. However, these successes are not uniform across all areas. Two fields, healthcare and software engineering, still present unique challenges despite having a wealth of information. Some of these challenges include the different types of information in these fields, the need for specific expertise to understand this information, and the shortage of labeled data, which is crucial for training machine learning models. In this thesis, we focus on the use of machine learning for natural language processing methods to solve these challenges in the healthcare and software engineering fields. Our research investigates learning with long documents, learning from domain-specific expertise, and learning when there's a shortage of labeled data. The insights and techniques from our work could potentially be applied to other fields that also have a lot of sequential data.

Page generated in 0.1641 seconds