Global ETD Search

21	Methods for data and user efficient annotation for multi-label topic classification / Effektiva annoteringsmetoder för klassificering med multipla klasser Miszkurka, Agnieszka January 2022 Machine Learning models trained using supervised learning can achieve great results when a sufficient amount of labeled data is used. However, the annotation process is a costly and time-consuming task. Many methods have been devised to make the annotation pipeline more user and data efficient. This thesis explores techniques from the Active Learning, Zero-shot Learning, and Data Augmentation domains, as well as pre-annotation with revision, in the context of multi-label classification. Active Learning's goal is to choose the most informative samples for labeling. Contrastive Active Learning, a state-of-the-art Active Learning technique, was adapted to the multi-label case. Once there is some labeled data, we can augment samples to make the dataset more diverse. English-German-English Backtranslation was used to perform Data Augmentation. Zero-shot learning is a setup in which a Machine Learning model can make predictions for classes it was not trained to predict. Zero-shot via Textual Entailment was leveraged in this study and its usefulness for pre-annotation with revision was reported. The results on the Reviews of Electric Vehicle Charging Stations dataset show that it may be beneficial to use Active Learning and Data Augmentation in the annotation pipeline. Active Learning methods such as Contrastive Active Learning can identify samples belonging to the rarest classes, while Data Augmentation via Backtranslation can improve performance, especially when little training data is available. The results of the Zero-shot Learning via Textual Entailment experiments show that this technique is not suitable for a production environment. / Classification models trained with supervised learning can achieve good results when a sufficient amount of annotated data is used.
However, the annotation process is a costly and time-consuming task. Many methods have been devised to make the annotation pipeline more user and data efficient. This thesis explores techniques from the areas of Active Learning, Zero-shot Learning, and Data Augmentation, as well as pre-annotation, where the annotator's role is to verify or revise a class proposed by the system. The goal of Active Learning is to select the most informative data points for annotation. Contrastive Active Learning was extended to the case where a data point can belong to several classes. If some annotated data already exists, the dataset can be extended with artificial data points in order to make it more diverse. English-German-English translation was used to construct such artificial data points. Zero-shot learning is a technique in which a machine learning model can make predictions for classes it was not trained to predict. Zero-shot via Textual Entailment was used in this study to extend the dataset with artificial data points. Results on the “Reviews of Electric Vehicle Charging Stations” dataset show that it can be advantageous to use Active Learning and Data Augmentation in the annotation pipeline. Active Learning methods such as Contrastive Active Learning can identify data points belonging to the rarest classes, while Data Augmentation via Backtranslation can improve classifier performance, especially when little training data is available. The results for Zero-shot Learning show that this technique is not suitable for a production environment. Natural Language Processing; Multi-label text classification; Active Learning; Zero-shot learning; Data Augmentation; Data-centric AI; Computer and Information Sciences
22	Deep Neural Networks for Multi-Label Text Classification: Application to Coding Electronic Medical Records Rios, Anthony 01 January 2018 Coding Electronic Medical Records (EMRs) with diagnosis and procedure codes is an essential task for billing, secondary data analyses, and monitoring health trends. Both speed and accuracy of coding are critical. While coding errors could lead to a greater financial burden on patients and misinterpretation of a patient’s well-being, timely coding is also needed to avoid backlogs and additional costs for the healthcare facility. Therefore, it is necessary to develop automated diagnosis and procedure code recommendation methods that can be used by professional medical coders. The main difficulty with developing automated EMR coding methods is the nature of the label space. The standardized vocabularies used for medical coding contain over 10,000 codes. The label space is large, and the label distribution is extremely unbalanced: most codes occur very infrequently, while a few codes occur several orders of magnitude more often than others. A few codes never occur in the training dataset at all. In this work, we present three methods to handle the large unbalanced label space. First, we study how to augment EMR training data with biomedical data (research articles indexed on PubMed) to improve the performance of standard neural networks for text classification. PubMed indexes more than 23 million citations. Many of the indexed articles contain relevant information about diagnosis and procedure codes. Therefore, we present a novel method of incorporating this unstructured data in PubMed using transfer learning. Second, we combine ideas from metric learning with recent advances in neural networks to form a novel neural architecture that better handles infrequent codes. Third, we present new methods to predict codes that have never appeared in the training dataset.
Overall, our contributions constitute advances in neural multi-label text classification with potential consequences for improving EMR coding. Natural Language Processing; Machine Learning; Neural Networks; Multi-label Classification; Biomedical Informatics; Zero-shot Learning; Artificial Intelligence and Robotics; Computer Sciences
23	On Transfer Learning Techniques for Machine Learning Debasmit Das (8314707) 30 April 2020 Recent progress in machine learning has been mainly due to the availability of large amounts of annotated data used for training complex models with deep architectures. Annotating this training data becomes burdensome and creates a major bottleneck in maintaining machine-learning databases. Moreover, these trained models fail to generalize to new categories or new varieties of the same categories. This is because new categories or new varieties have a data distribution different from the training data distribution. To tackle these problems, this thesis proposes to develop a family of transfer-learning techniques that can deal with different training (source) and testing (target) distributions, under the assumption that the availability of annotated data is limited in the testing domain. This is done by using the auxiliary data-abundant source domain, from which useful knowledge is transferred that can be applied to the data-scarce target domain. This transferable knowledge serves as a prior that biases target-domain predictions and prevents the target-domain model from overfitting. Specifically, we explore structural priors that encode relational knowledge between different data entities, which provide a more informative bias than traditional priors. The choice of the structural prior depends on the information availability and the similarity between the two domains. Depending on the domain similarity and the information availability, we divide the transfer learning problem into four major categories and propose different structural priors to solve each of these sub-problems. This thesis first focuses on the unsupervised-domain-adaptation problem, where we propose to minimize domain discrepancy by transforming labeled source-domain data to be close to unlabeled target-domain data.
For this problem, the categories remain the same across the two domains and hence we assume that the structural relationship between the source-domain samples is carried over to the target domain. Thus, a graph or hyper-graph is constructed as the structural prior from both domains, and a graph/hyper-graph matching formulation is used to transform samples in the source domain to be closer to samples in the target domain. An efficient optimization scheme is then proposed to tackle the time and memory inefficiencies associated with the matching problem. The few-shot learning problem is studied next, where we propose to transfer knowledge from source-domain categories containing abundant labeled data to novel categories in the target domain that contain only a few labeled samples. The knowledge transfer biases the novel category predictions and prevents the model from overfitting. The knowledge is encoded using a neural-network-based prior that transforms a data sample to its corresponding class prototype. This neural network is trained on the source-domain data and applied to the target-domain data, where it transforms the few-shot samples to the novel-class prototypes for better recognition performance. The few-shot learning problem is then extended to the situation where we do not have access to the source-domain data but only to the source-domain class prototypes. In this limited-information setting, parametric neural-network-based priors would overfit to the source-class prototypes, and hence we seek a non-parametric prior using manifolds. A piecewise linear manifold is used as a structural prior to fit the source-domain class prototypes. This structure is extended to the target domain, where the novel-class prototypes are found by projecting the few-shot samples onto the manifold.
Finally, the zero-shot learning problem is addressed, which is an extreme case of the few-shot learning problem where we do not have any labeled data in the target domain. However, we have high-level information for both the source and target domain categories in the form of semantic descriptors. We learn the relation between the sample space and the semantic space using a regularized neural network, so that classification of the novel categories can be carried out in a common representation space. This same neural network is then used in the target domain to relate the two spaces. In case we want to generate data for the novel categories in the target domain, we can use a constrained generative adversarial network instead of a traditional neural network. Thus, we use structural priors like graphs, neural networks and manifolds to relate various data entities like samples, prototypes and semantics for these different transfer learning sub-problems. We explore additional post-processing steps like pseudo-labeling, domain adaptation and calibration, and enforce algorithmic and architectural constraints to further improve recognition performance. Experiments on standard transfer learning image recognition datasets produced competitive results with respect to previous work. Further experimentation and analyses of these methods provided a better understanding of machine learning as well. Transfer learning; Computer Vision; Machine Learning; Domain Adaptation; Few-shot Learning; Zero-shot Learning
24	Automatisk Summering av Cybersäkerhetsdiskussioner på Onlineforum : En prototyp för abstraktiv textsummering med en Zero-shot modell [Automatic Summarization of Cybersecurity Discussions on Online Forums: A Prototype for Abstractive Text Summarization with a Zero-shot Model] Ununger, Andreas January 2022 The number of cyberattacks is constantly increasing, and with it the number of attack methods and defense techniques. This means that people working in cybersecurity need to spend more and more time keeping up to date with the latest developments in the field. It is therefore of interest to find ways to speed up this gathering of information. In this study, a prototype is developed with the aim of automatically summarizing, in a new way, one of the many kinds of news sources that exist within cybersecurity discussions on online forums. The prototype uses abstractive text summarization with the zero-shot model GPT-3. The prototype was evaluated by scoring the generated summaries with SUPERT. The measurement gave a score of 0.269 against the original texts and 0.358 against a dataset cleaned of text unrelated to cybersecurity. From these values, the conclusion is drawn that the development of the prototype succeeded. Machine learning; pre-trained; GPT-3; automatic summarization; abstractive summarization; Zero-shot; SUPERT; Natural language processing; Information Systems
25	Image-Guided Zero-Shot Object Detection in Video Games : Using Images as Prompts for Detection of Unseen 2D Icons / Bildstyrd Zero-Shot Objektdetektering i Datorspel : Användning av Bilder för att Diktera Detektion av Osedda 2D-ikoner Larsson, Axel January 2023 Object detection deals with localization and classification of objects in images, where the task is to propose bounding boxes and predict their respective classes. Challenges in object detection include the need for large-scale annotated datasets and the re-training of models for specific tasks. Motivated by these problems, we propose a zero-shot object detection (ZSD) model in the setting of user interface icons in video games. Being able to quickly and accurately analyze the state of a game, with potentially millions of people watching, would greatly benefit the large and fast-growing video game sector. Our resulting model is a modification of YOLOv8, which, at inference time, is prompted with the specific object to detect in an image. Many existing ZSD models exploit semantic embeddings and high-dimensional word vectors to generalize to novel classes. We hypothesize that using only visual representations is sufficient for the detection of unseen classes. To train and evaluate our model, we create synthetic data to reflect the nature of video game icons and in-game frames. Our method achieves performance similar to YOLOv8 on bounding box prediction and detection of seen classes while retaining the same average precision and recall for unseen classes, where the number of unseen classes is in the order of thousands. / Object detection concerns the localization and classification of objects in images, where the task is to propose bounding boxes and predict their respective classes. Challenges in object detection include large-scale annotated datasets and retraining of models for specific tasks.
Motivated by these problems, we propose a zero-shot model for object detection aimed at user interface icons in video games. Being able to quickly and precisely analyze the state of a game, with potentially millions of people watching, would be of great benefit to the fast-growing video game sector. Our final model is a modification of YOLOv8 that, at inference time, is given the specific object to detect in a given image. Many existing zero-shot object detection models exploit semantic embeddings and high-dimensional word vectors to generalize to new classes. We hypothesize that visual representations are sufficient for detecting unseen classes. To train and evaluate our model, we create synthetic data that reflects in-game frames and icons from video games. Our method achieves performance similar to YOLOv8 on bounding box prediction and on seen classes where the number of classes is low. At the same time, we maintain the same precision and recall for unseen classes where the number of classes is in the thousands. Computer Vision; Deep learning; Machine learning; Object detection; Zero-shot; Computer and Information Sciences
26	GENERATING SQL FROM NATURAL LANGUAGE IN FEW-SHOT AND ZERO-SHOT SCENARIOS Asplund, Liam January 2024 Making information stored in databases more accessible to users inexperienced in structured query language (SQL) by converting natural language to SQL queries has long been a prominent research area in both the database and natural language processing (NLP) communities. Numerous approaches have been proposed for this task, such as encoder-decoder frameworks, semantic grammars, and, more recently, large language models (LLMs). When training LLMs to generate SQL queries from natural language questions, three methods are notable: pretraining, transfer learning, and in-context learning (ICL). ICL is particularly advantageous in scenarios where the available hardware is limited, time is a concern, and large amounts of task-specific labeled data do not exist. This study evaluates two ICL strategies, zero-shot and few-shot prompting, using the Mistral-7B-Instruct LLM. The few-shot scenarios were evaluated with two example-selection techniques, random selection and Jaccard similarity. The zero-shot scenarios served as a baseline for the few-shot scenarios to beat. The outcome was as anticipated: few-shot scenarios using Jaccard similarity performed best, few-shot scenarios using random selection came second, and the zero-shot scenarios performed worst. Evaluation results based on execution accuracy and exact matching accuracy confirm that leveraging similarity when choosing demonstration examples enhances the model's knowledge of the database schema and table names used during the inference phase, leading to more accurately generated SQL queries than leveraging diversity in the demonstration examples.
In-context learning; Few-shot scenarios; Zero-shot scenarios; Large language model; Prompt engineering; Jaccard Similarity; Computer Sciences
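The Jaccard-based selection of few-shot demonstrations described in the abstract above can be illustrated with a minimal sketch. It assumes whitespace tokenization and a small hypothetical pool of question/SQL pairs; the function names (`jaccard`, `select_demonstrations`, `build_prompt`), the example schema, and the prompt layout are illustrative assumptions and are not taken from the thesis.

```python
def tokenize(text):
    """Lowercase and split on whitespace (an assumed, very simple tokenizer)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between the token sets of two strings."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def select_demonstrations(question, pool, k=2):
    """Return the k pool examples whose questions are most similar to `question`."""
    ranked = sorted(pool, key=lambda ex: jaccard(question, ex["question"]), reverse=True)
    return ranked[:k]


def build_prompt(question, schema, demos):
    """Assemble a few-shot prompt: schema, demonstrations, then the new question."""
    parts = [f"-- Schema: {schema}"]
    for ex in demos:
        parts.append(f"-- Question: {ex['question']}\n{ex['sql']}")
    parts.append(f"-- Question: {question}")
    return "\n\n".join(parts)


# Hypothetical demonstration pool and schema, for illustration only.
pool = [
    {"question": "How many singers are there", "sql": "SELECT COUNT(*) FROM singer"},
    {"question": "List the names of all stadiums", "sql": "SELECT name FROM stadium"},
    {"question": "How many concerts are there", "sql": "SELECT COUNT(*) FROM concert"},
]
question = "How many stadiums are there"
demos = select_demonstrations(question, pool, k=2)
print(build_prompt(question, "singer(id, name); stadium(id, name); concert(id)", demos))
```

With `k=0` the same `build_prompt` call yields a zero-shot prompt, which is one way to set up the baseline comparison the abstract mentions.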
27	„Auch heute war die Stimmung im Allgemeinen fest.“ Zero-Shot Klassifikation zur Bestimmung des Media Sentiment an der Berliner Börse zwischen 1872 und 1930 [“Today, too, the mood was generally firm.” Zero-shot classification for determining media sentiment at the Berlin Stock Exchange between 1872 and 1930] Burghardt, Manuel; Niekler, Andreas; Borst, Janos; Wehrheim, Lino 04 July 2024 No description available. info:eu-repo/classification/ddc/006 ddc:006
28	Breaking Language Barriers: Enhancing Multilingual Representation for Sentence Alignment and Translation / 言語の壁を超える：文のアラインメントと翻訳のための多言語表現の改善 Mao, Zhuoyuan 25 March 2024 Kyoto University / New-system doctorate by coursework / Doctor of Informatics / 甲第25420号 / 情博第858号 / 新制||情||144 (University Library) / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / (Chief examiner) Program-Specific Professor Sadao Kurohashi, Professor Tatsuya Kawahara, Professor Hisashi Kashima / Conferred under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM Multilingual Representation; Multilingual Sentence Embedding; Multilingual Neural Machine Translation; Low-resource Languages; Zero-shot Translation; Training and Inference Efficiency
29	Apprentissage automatique en ligne pour un dialogue homme-machine situé / Online learning for situated human-machine dialogue Ferreira, Emmanuel 14 December 2015 A dialogue system gives the machine the ability to interact naturally and efficiently with humans. In this thesis we are interested in the development of a dialogue system based on statistical approaches, in particular the formal framework of the Partially Observable Markov Decision Process (POMDP), which to date serves as the reference in the literature for statistical dialogue management. This model allows both better handling of the uncertainty inherent in processing user input (speech in particular) and automatic optimization of the interaction policy from data through Reinforcement Learning (RL). However, one issue with statistical approaches is that they require a large amount of training data to reach acceptable performance levels. Collecting such data is a long and costly process that, in the case of dialogue, generally requires building functional prototypes with the involvement of experts and/or developing alternative solutions such as user simulation. Indeed, very few studies to date consider the possibility of learning the machine's strategy by placing it, from scratch (without prior learning), in front of real users.
Yet this solution is of great interest: for example, it makes the learning process an integral part of a system's life cycle, giving the system the ability to adapt to new conditions dynamically and continuously. In this thesis, we therefore endeavour to provide solutions that make this cold start of the system possible and also improve its ability to adapt to new conditions (domain extension, change of user, etc.). To do so, we first consider using domain expertise (expert rules) to guide the initial learning of the system's interaction policy. Likewise, we study the impact of taking into account subjective judgments expressed by the user during the interaction, notably in a context of user profile change where the previously learned policy must be able to adapt to new conditions. The results obtained on a reference task show that a (quasi-)optimal policy can be learned in a few hundred interactions, and also that the additional information considered in our proposals can significantly accelerate learning and improve tolerance to noise in the processing chain. Second, we aim to reduce the development costs of a spoken language understanding module used for the semantic labeling of a dialogue turn. To this end, we exploit recent advances in techniques for projecting words into continuous vector spaces that preserve syntactic and semantic properties, in order to generalize from limited initial knowledge of the task to understand the user.
We also endeavour to propose solutions for dynamically enriching this knowledge and to study how this technique relates to state-of-the-art statistical methods. Here again, our experimental results show that state-of-the-art performance can be reached with very little data and that these models can then be refined with user feedback whose cost can itself be optimized. / A dialogue system should give the machine the ability to interact naturally and efficiently with humans. In this thesis, we focus on the issue of the development of stochastic dialogue systems. Thus, we especially consider the Partially Observable Markov Decision Process (POMDP) framework, which yields state-of-the-art performance on goal-oriented dialogue management tasks. This model enables the system to cope with the communication ambiguities due to a noisy channel and also to optimize its dialogue management strategy directly from data with Reinforcement Learning (RL) methods. Considering statistical approaches often requires the availability of a large amount of training data to reach good performance. However, corpora of interest are seldom readily available and collecting such data is both time consuming and expensive. For instance, it may require a working prototype to initiate preliminary experiments with the support of expert users or to consider other alternatives such as user simulation techniques. Very few studies to date have considered learning a dialogue strategy from scratch by interacting with real users, yet this solution is of great interest.
Indeed, considering the learning process as part of the life cycle of a system offers a principled framework to dynamically adapt the system to new conditions in an online and seamless fashion. In this thesis, we endeavour to provide solutions that make this dialogue system cold start (nearly from scratch) possible and also improve its ability to adapt to new conditions in operation (domain extension, new user profile, etc.). First, we investigate the conditions under which initial expert knowledge (such as expert rules) can be used to accelerate the policy optimization of a learning agent. Similarly, we study how polarized user appraisals gathered throughout the course of the interaction can be integrated into a reinforcement learning-based dialogue manager. More specifically, we discuss how this information can be cast into socially-inspired rewards to speed up the policy optimisation for both efficient task completion and user adaptation in an online learning setting. The results obtained on a reference task demonstrate that a (quasi-)optimal policy can be learnt in just a few hundred dialogues, but also that the considered additional information is able to significantly accelerate the learning as well as improve the noise tolerance. Second, we focus on reducing the development cost of the spoken language understanding module. For this, we exploit recent word embedding models (projection of words in a continuous vector space representing syntactic and semantic properties) to generalize from a limited initial knowledge about the dialogue task to enable the machine to instantly understand the user utterances. We also propose to dynamically enrich this knowledge with both active learning techniques and state-of-the-art statistical methods. Our experimental results show that state-of-the-art performance can be obtained with a very limited amount of in-domain and in-context data.
We also show that we are able to refine the proposed model by exploiting user feedback about the system outputs as well as to optimize our adaptive learning with an adversarial bandit algorithm to successfully balance the trade-off between user effort and module performance. Finally, we study how the physical embodiment of a dialogue system in a humanoid robot can help the interaction in a dedicated Human-Robot application where dialogue system learning and testing are carried out with real users. Indeed, in this thesis we propose an extension of the previously considered decision-making techniques to be able to take into account the robot's awareness of the users' belief (perspective taking) in an RL-based situated dialogue management optimisation procedure. Situated dialogue system; Online reinforcement learning; Zero-Shot learning; Perspective-taking
30	FAZT: FEW AND ZERO-SHOT FRAMEWORK TO LEARN TEMPO-VISUAL EVENTS FROM LITTLE OR NO DATA Naveen Madapana (11613925) 20 December 2021 Supervised classification methods based on deep learning have achieved great success in many domains and tasks that were previously unimaginable. Such approaches build on learning paradigms that require hundreds of examples in order to learn to classify objects or events. Thus, their immediate application to domains with few or no observations is limited. This is because they lack the ability to rapidly generalize to new categories from a few examples or from high-level descriptions of categories. This can be attributed to the significant gap between the way machines represent knowledge and the way humans represent categories in their minds and learn to recognize them. In this context, this research represents categories as semantic trees in a high-level attribute space and proposes an approach to utilize these representations to conduct N-Shot, Few-Shot, One-Shot, and Zero-Shot Learning (ZSL). This work refers to this paradigm as the problem of general classification (GCP) and proposes a unified framework for GCP referred to as the Few and Zero-Shot Technique (FAZT). The FAZT framework is an end-to-end approach that uses trainable 3D convolutional neural networks and recurrent neural networks to simultaneously optimize for both the semantic and the classification tasks. Lastly, the problem of systematically obtaining semantic attributes by utilizing domain-specific ontologies is presented. The proposed framework is validated in the domains of hand gesture and action/activity recognition; however, this research can be applied to other domains such as video understanding, the study of human behavior, and emotion recognition. First, an attribute-based dataset for gestures is developed in a systematic manner by relying on literature in gestures and semantics, and crowdsourced platforms such as Amazon Mechanical Turk.
To the best of our knowledge, this is the first ZSL dataset for hand gestures (ZSGL dataset). Next, our framework is evaluated in two experimental conditions: (1) within-category, to test the attribute recognition power, and (2) across-category, to test the ability to recognize an unknown category. In addition, we conducted experiments in zero-shot, one-shot, few-shot and continuous learning conditions in both open-set and closed-set scenarios. Results showed that our framework performs favorably on the ZSGL, Kinetics, UIUC Action, UCF101 and HMDB51 action datasets in all the experimental conditions. Computer Engineering; transfer learning; machine learning; deep learning; zero-shot learning; few-shot learning; lifelong learning; gesture recognition; activity recognition; semantic description; agreement analysis
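Zero-shot recognition from attribute descriptions, as outlined in the abstract above, can be illustrated with a much simplified sketch. It is not the FAZT framework: the semantic trees and neural networks are replaced here by flat attribute vectors compared with cosine similarity, and the attribute names, gesture classes, and scores are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def zero_shot_classify(predicted_attributes, class_descriptors):
    """Pick the class whose attribute descriptor is closest to the predicted attributes.

    `predicted_attributes` stands in for the output of an attribute recognizer;
    `class_descriptors` maps class names (possibly never seen in training)
    to their attribute vectors.
    """
    return max(
        class_descriptors,
        key=lambda name: cosine(predicted_attributes, class_descriptors[name]),
    )


# Hypothetical attributes: [moves_left, moves_apart, one_hand, two_hands]
descriptors = {
    "swipe_left": [1, 0, 1, 0],
    "zoom_in": [0, 1, 0, 1],
}
scores = [0.9, 0.1, 0.8, 0.2]  # assumed attribute scores for one unseen-class sample
print(zero_shot_classify(scores, descriptors))
```

Because classes are described only by attributes, a new gesture can be added by supplying its descriptor, without any training samples for that class.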
