41
Composable, Distributed-state Models for High-dimensional Time Series. Taylor, Graham William, 03 March 2010.
In this thesis we develop a class of nonlinear generative models for high-dimensional time series. The first key property of these models is their distributed, or "componential" latent state, which is characterized by binary stochastic variables which interact to explain the data. The second key property is the use of an undirected graphical model to represent the relationship between latent state (features) and observations. The final key property is composability: the proposed class of models can form the building blocks of deep networks by successively training each model on the features extracted by the previous one.
We first propose a model based on the Restricted Boltzmann Machine (RBM) that uses an undirected model with binary latent variables and real-valued "visible" variables. The latent and visible variables at each time step receive directed connections from the visible variables at the last few time steps. This "conditional" RBM (CRBM) makes on-line inference efficient and allows us to use a simple approximate learning procedure. We demonstrate the power of our approach by synthesizing various motion sequences and by performing on-line filling in of data lost during motion capture. We also explore CRBMs as priors in the context of Bayesian filtering applied to multi-view and monocular 3D person tracking.
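One way to see why on-line inference stays cheap: given the current frame and its recent history, the hidden units of a CRBM are conditionally independent, so their posterior is one affine map followed by a sigmoid. Below is a minimal sketch of that computation; the shapes and variable names are illustrative assumptions, not the thesis code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def crbm_hidden_probs(v_t, v_hist, W, B, b_h):
    """Posterior p(h_j = 1 | current frame, history) in a CRBM (sketch).

    v_t    : (nv,)        current real-valued visible frame
    v_hist : (k * nv,)    concatenation of the last k visible frames
    W      : (nv, nh)     undirected visible-to-hidden weights
    B      : (k * nv, nh) directed history-to-hidden weights
    b_h    : (nh,)        hidden biases
    """
    # Hidden units are conditionally independent given the data, so
    # on-line inference is a single matrix-vector product per term.
    return sigmoid(v_t @ W + v_hist @ B + b_h)
```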
We extend the CRBM in a way that preserves its most important computational properties and introduces multiplicative three-way interactions that allow the effective interaction weight between two variables to be modulated by the dynamic state of a third variable. We introduce a factoring of the implied three-way weight tensor to permit a more compact parameterization. The resulting model can capture diverse styles of motion with a single set of parameters, and the three-way interactions greatly improve its ability to blend motion styles or to transition smoothly among them.
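The compactness gained by factoring can be made concrete in a few lines. In the sketch below (shapes and names are assumptions), the full three-way tensor W[i, k, j] is never formed; three factor matrices meet at nf factors, and the context vector z multiplicatively gates the effective visible-to-hidden weights.

```python
import numpy as np

def factored_gated_input(v, z, U_v, U_z, U_h):
    """Hidden input under a factored three-way interaction (sketch).

    v : (nv,) visible vector; z : (nz,) modulating context (e.g. style)
    U_v : (nv, nf), U_z : (nz, nf), U_h : (nf, nh) are factor matrices.
    Equivalent to contracting v and z with the implied tensor
    W[i, k, j] = sum_f U_v[i, f] * U_z[k, f] * U_h[f, j],
    but with O(nf * (nv + nz + nh)) parameters instead of O(nv * nz * nh).
    """
    return ((v @ U_v) * (z @ U_z)) @ U_h
```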
In separate but related work, we revisit Products of Hidden Markov Models (PoHMMs). We show how the partition function can be estimated reliably via Annealed Importance Sampling. This enables us to demonstrate that PoHMMs outperform various flavours of HMMs on a variety of tasks and metrics, including log likelihood.
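Annealed Importance Sampling in its generic form is what makes such likelihood comparisons possible when partition functions are intractable. A self-contained sketch follows; the base distribution, transition kernel and schedule are placeholders rather than the ones used for PoHMMs.

```python
import numpy as np

def ais_log_ratio(log_p0, log_p1, sample_p0, mcmc_step, betas, n_chains=100):
    """Estimate log(Z1 / Z0) between an unnormalized target p1 and a
    tractable base p0 via Annealed Importance Sampling (generic sketch).

    sample_p0(n)       -> n exact samples from the base distribution
    mcmc_step(x, beta) -> one transition leaving p0^(1-beta) * p1^beta invariant
    betas              -> increasing annealing schedule from 0.0 to 1.0
    """
    x = sample_p0(n_chains)
    log_w = np.zeros(n_chains)
    for beta_prev, beta in zip(betas[:-1], betas[1:]):
        # Weight update for moving from temperature beta_prev to beta.
        log_w += (beta - beta_prev) * (log_p1(x) - log_p0(x))
        x = mcmc_step(x, beta)
    # Average the importance weights in log space for numerical stability.
    return np.logaddexp.reduce(log_w) - np.log(n_chains)
```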
42
Incremental generative models for syntactic and semantic natural language processing. Buys, Jan Moolman, January 2017.
This thesis investigates the role of linguistically motivated generative models of syntax and semantic structure in natural language processing (NLP). Syntactic well-formedness is crucial in language generation, but most statistical models do not account for the hierarchical structure of sentences. Many applications requiring natural language understanding rely on structured semantic representations to enable querying, inference and reasoning, yet most semantic parsers produce domain-specific or inadequately expressive representations. We propose a series of generative transition-based models for dependency syntax which can be applied as both parsers and language models while being amenable to supervised or unsupervised learning. Two models are based on Markov assumptions commonly made in NLP: the first is a Bayesian model with hierarchical smoothing; the second is parameterised by feed-forward neural networks. The Bayesian model enables careful analysis of the structure of the conditioning contexts required for generative parsers, but the neural network is more accurate. As a language model the syntactic neural model outperforms both the Bayesian model and n-gram neural networks, pointing to the complementary nature of distributed and structured representations for syntactic prediction. We propose approximate inference methods based on particle filtering (see the sketch after this abstract). The third model is parameterised by recurrent neural networks (RNNs), dropping the Markov assumptions; exact inference with dynamic programming is made tractable here by simplifying the structure of the conditioning contexts.

We then shift the focus to semantics and propose models for parsing sentences to labelled semantic graphs. We introduce a transition-based parser which incrementally predicts graph nodes (predicates) and edges (arguments); this approach is contrasted against predicting top-down graph traversals. RNNs and pointer networks are key components in approaching graph parsing as an incremental prediction problem, and the RNN architecture is augmented to condition the model explicitly on the transition-system configuration. We develop a robust parser for Minimal Recursion Semantics, a linguistically expressive framework for compositional semantics which has previously been parsed only with grammar-based approaches. Our parser is much faster than the grammar-based model, while the same approach improves the accuracy of neural Abstract Meaning Representation parsing.
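The particle-filtering approach referenced above can be sketched generically: partial parse states are advanced word by word, reweighted, and resampled, which also yields an estimate of the sentence's marginal probability for language modelling. The callbacks below are invented placeholders, not the thesis implementation.

```python
import numpy as np

def particle_filter_loglik(words, propose, weight, n_particles=128, seed=0):
    """Estimate log p(words) by propagating weighted partial parses (sketch).

    propose(state, word) -> successor state sampled from a proposal
    weight(state, word)  -> nonnegative incremental weight (target / proposal)
    """
    rng = np.random.default_rng(seed)
    states = [None] * n_particles            # None stands for the empty parse
    log_lik = 0.0
    for word in words:
        states = [propose(s, word) for s in states]
        w = np.array([weight(s, word) for s in states], dtype=float)
        log_lik += np.log(w.mean())          # incremental evidence estimate
        # Resample to concentrate particles on high-probability parses.
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        states = [states[i] for i in idx]
    return log_lik
```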
43
Methods for statistical inference on correlated data: application to genomic data. Leonardis, Eleonora De, 26 October 2015.
The availability of huge amounts of data has changed the role of physics with respect to other disciplines. In this dissertation I explore the innovations that statistical physics approaches have brought to molecular biology. In the last 20 years the size of genome databases has increased exponentially, so the exploitation of raw data, with the aim of extracting information, has become a major topic in statistical physics. After the success of protein structure prediction, surprisingly good results have also been achieved in the related field of RNA structure characterisation. However, recent studies have revealed that, even as databases grow, inference is often performed in the undersampling regime, and new computational schemes are needed to overcome this intrinsic limitation of real data. This dissertation discusses inference methods and their application to RNA structure prediction. We examine some heuristic approaches that have been applied successfully in past years, even though they are poorly understood theoretically. The last part of the work focuses on the development of a tool for the inference of generative models, which we hope will pave the way towards novel applications.
44
Deep generative neural networks for novelty generation: a foundational framework, metrics and experiments. Cherti, Mehdi, 26 January 2018.
In recent years, significant advances in deep neural networks enabled the creation of groundbreaking technologies such as self-driving cars and voice-enabled personal assistants. Almost all successes of deep neural networks are about prediction, whereas the initial breakthroughs came from generative models. Today, although we have very powerful deep generative modeling techniques, these techniques are essentially used for prediction or for generating known objects (i.e., good-quality images of known classes): any generated object that is a priori unknown is considered a failure mode (Salimans et al., 2016) or spurious (Bengio et al., 2013b). In other words, when prediction seems to be the only possible objective, novelty is seen as an error that researchers have been trying hard to eliminate. This thesis defends the point of view that, instead of trying to eliminate these novelties, we should study them and the generative potential of deep nets to create useful novelty, especially given the economic and societal importance of creating new objects in contemporary societies. The thesis sets out to study novelty generation in relationship with data-driven knowledge models produced by deep generative neural networks. Our first key contribution is the clarification of the importance of representations and their impact on the kind of novelties that can be generated: a key consequence is that a creative agent might need to re-represent known objects to access various kinds of novelty. We then demonstrate that traditional objective functions of statistical learning theory, such as maximum likelihood, are not necessarily the best theoretical framework for studying novelty generation, and we propose several alternatives at the conceptual level. A second key result is the confirmation that current models, with traditional objective functions, can indeed generate unknown objects. This also shows that even though objectives like maximum likelihood are designed to eliminate novelty, practical implementations do generate novelty. Through a series of experiments, we study the behavior of these models and the novelty they generate. In particular, we propose a new task setup and metrics for selecting generative models suited to novelty generation. Finally, the thesis concludes with a series of experiments clarifying the characteristics of models that can exhibit novelty. Experiments show that sparsity, the noise level, and restricting the capacity of the net eliminate novelty, and that models that are better at recognizing novelty are also better at generating it.
45
Fully Unsupervised Image Denoising, Diversity Denoising and Image Segmentation with Limited Annotations. Prakash, Mangal, 06 April 2022.
Understanding the processes of cellular development and the interplay of cell shape changes, division and migration requires investigation of developmental processes at the spatial resolution of single cells. Biomedical imaging experiments enable the study of dynamic processes as they occur in living organisms. While biomedical imaging is essential, a key component of exposing unknown biological phenomena is quantitative image analysis. Biomedical images, especially microscopy images, are usually noisy owing to practical limitations such as the available photon budget, sample sensitivity, etc. Additionally, microscopy images often contain artefacts due to optical aberrations in microscopes or imperfections in the camera sensor and internal electronics. The noisy nature of images, as well as the artefacts, hinders accurate downstream analysis such as cell segmentation. Although countless approaches have been proposed for image denoising, artefact removal and segmentation, supervised Deep Learning (DL) based content-aware algorithms are currently the best performing for all these tasks.
Supervised DL based methods are plagued by many practical limitations. Supervised denoising and artefact removal algorithms require paired corrupted and high-quality images for training. Obtaining such image pairs can be very hard and virtually impossible in most biomedical imaging applications, owing to photosensitivity and the dynamic nature of the samples being imaged. Similarly, supervised DL based segmentation methods need copious amounts of annotated data for training, which is often very expensive to obtain. Owing to these restrictions, it is imperative to look beyond supervised methods. The objective of this thesis is to develop novel unsupervised alternatives for image denoising and artefact removal, as well as semi-supervised approaches for image segmentation.
The first part of this thesis deals with unsupervised image denoising and artefact removal. For the unsupervised denoising task, this thesis first introduces a probabilistic approach for training DL based methods using parametric models of imaging noise. Next, a novel unsupervised diversity denoising framework is presented which addresses the fundamentally non-unique inverse nature of image denoising by generating multiple plausible denoised solutions for any given noisy image. Finally, interesting properties of the diversity denoising methods are presented which make them suitable for unsupervised spatial artefact removal in microscopy and medical imaging applications.
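One way to picture the diversity-denoising idea is a VAE trained only on noisy images, where the decoder output is scored by a known noise model; repeated decoding of posterior samples then gives several plausible clean versions of one input. The sketch below makes the strong simplifying assumptions of fully connected layers and additive Gaussian noise; it illustrates the principle, not the thesis architecture.

```python
import torch
import torch.nn as nn

class DiversityDenoiser(nn.Module):
    """Toy VAE-style diversity denoiser; assumes additive Gaussian noise."""

    def __init__(self, dim=256, latent=32, sigma=0.1):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(),
                                 nn.Linear(128, 2 * latent))
        self.dec = nn.Sequential(nn.Linear(latent, 128), nn.ReLU(),
                                 nn.Linear(128, dim))
        self.sigma = sigma

    def loss(self, x_noisy):
        mu, logvar = self.enc(x_noisy).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        s = self.dec(z)  # estimate of the underlying clean signal
        # Likelihood of the *noisy* observation under the noise model,
        # so clean training targets are never needed.
        rec = 0.5 * ((x_noisy - s) ** 2).sum(-1) / self.sigma ** 2
        kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1.0).sum(-1)
        return (rec + kl).mean()

    @torch.no_grad()
    def sample_denoisings(self, x_noisy, n=10):
        """Draw n diverse plausible clean signals for one noisy input."""
        mu, logvar = self.enc(x_noisy).chunk(2, dim=-1)
        std = (0.5 * logvar).exp()
        return [self.dec(mu + torch.randn_like(std) * std) for _ in range(n)]
```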
In the second part of this thesis, the problem of cell/nucleus segmentation is addressed. The focus is especially on practical scenarios where ground truth annotations for training DL based segmentation methods are scarcely available. Unsupervised denoising is used as an aid to improve segmentation performance in the presence of limited annotations. Several training strategies are presented in this work to leverage the representations learned by unsupervised denoising networks to enable better cell/nucleus segmentation in microscopy data. Apart from DL based segmentation methods, a proof-of-concept is introduced which views cell/nucleus segmentation from the perspective of solving a label fusion problem. This method, through limited human interaction, learns to choose the best possible segmentation for each cell/nucleus using only a pool of diverse (and possibly faulty) segmentation hypotheses as input.
In summary, this thesis seeks to introduce new unsupervised denoising and artefact removal methods, as well as semi-supervised segmentation methods, which can be easily deployed to directly benefit biomedical practitioners in their research.
46
Generating synthetic golf courses with deep learning: Investigation into the uses and limitations of generative deep learning. Lundqvist, Carl, January 2022.
The power of generative deep learning has increased rapidly in the past ten years, and modern models can now generate human faces that are indistinguishable from real ones. This thesis project investigates the uses and limitations of this technology by attempting to generate very specific data: images of golf holes. Generative adversarial networks (GANs) were used to solve this problem. Two different GAN models were chosen as candidates and trained on several datasets extracted from the project provider Topgolf Sweden AB's virtual golf game, which contains data for many different types of golf holes from all over the world. The best-performing model was the Progressive Growing GAN (ProGAN), which works by iteratively increasing the size of the generated images until the desired resolution is reached. This model produced results of very high quality and with large variety. To further investigate the quality of the results, a survey was sent out to the employees of Topgolf Sweden AB; it showed that participants found it difficult to determine correctly whether a given image was real or had been generated by the model, further indicating that the generated samples were of high quality. The project also investigated how height data could be incorporated in the process; the results showed that the ProGAN model was able to generate height maps that capture the most important aspects of a golf hole. Overall, the results showed that the generative model had learned a good representation of the data's underlying probability distribution. More work is needed before a model like the one presented here can generate complete golf holes usable in a virtual golf game, but this project clearly shows that GANs are a worthwhile investment for this purpose.
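The core mechanic of progressive growing can be stated in a few lines: when a new, higher-resolution block is added, its output is faded in against an upsampled copy of the previous stage. A minimal PyTorch sketch with assumed variable names:

```python
import torch.nn.functional as F

def progressive_blend(x_prev_stage, x_new_stage, alpha):
    """Fade a newly added resolution block into the generator output (sketch).

    x_prev_stage : (N, C, H, W) output of the last stable stage
    x_new_stage  : (N, C, 2H, 2W) output of the new block
    alpha        : ramps linearly from 0 to 1 while the new stage trains
    """
    up = F.interpolate(x_prev_stage, scale_factor=2, mode="nearest")
    return alpha * x_new_stage + (1.0 - alpha) * up
```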
47
Understanding people movement and detecting anomalies using probabilistic generative models. Hansson, Agnes, January 2020.
As intelligent access solutions begin to dominate the world, the statistical learning methods behind them deserve attention, since there is no clear answer to how an algorithm could learn and predict exactly how people move. This project investigates whether, with the help of unsupervised learning methods, it is possible to distinguish anomalies from normal events in an access system, and whether the cylinder a user is most likely to try to unlock can be predicted. The input is a data set of previous events in an access system, together with the access configurations; the algorithms used consisted of an auto-encoder and a probabilistic generative model. The auto-encoder successfully encoded the high-dimensional data set into one of significantly lower dimension, and the probabilistic generative model, chosen to be a Gaussian mixture model, identified clusters in the data and assigned a measure of unexpectedness to each event. Lastly, the probabilistic generative model was used to compute the conditional probability that a user, given all attributes of an event except the chosen cylinder, would choose each particular cylinder. This yielded a correct guess in 65.7% of the cases, a satisfactory figure for an unsupervised problem.
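The scoring step described above maps directly onto standard tooling. A self-contained sketch with synthetic stand-in codes (the component count and threshold are illustrative choices, not values from the thesis):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
codes = rng.normal(size=(1000, 8))          # stand-in for auto-encoder codes

gmm = GaussianMixture(n_components=10, random_state=0).fit(codes)
unexpectedness = -gmm.score_samples(codes)  # negative log-likelihood per event
flagged = unexpectedness > np.percentile(unexpectedness, 99)
print(f"{flagged.sum()} events flagged as anomalous")
```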
48
AI-assisted Image Manipulation with Eye Tracking. Karlander, Rej; Wang, Julia, January 2023.
Image editing tools can pose a challenge for motor-impaired individuals who wish to perform image manipulation. The process involves many steps and can be difficult without tactile input devices such as a mouse and keyboard. To increase the accessibility of image editing for motor-impaired individuals, the potential of new tools and modalities has to be explored. In this project, a prototype was developed that allows the user to edit images using eye tracking and deep learning models, specifically the DALL-E 2 model. The prototype was then tested on users who rated its functionality against a set of human-computer interaction principles. The quality of the results varied considerably depending on the eye movements of the user and the prompts provided. The user testing found potential for an editing tool combining eye tracking and AI assistance, but the tool requires further iteration and takes time to learn. Most users enjoyed using the prototype and felt that continued experimentation would lead to improved results.
49
Technology Acceptance for AI implementations: A case study in the Defense Industry about 3D Generative Models. Arenander, Michael, January 2023.
Advancements in Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) have entered 3D object creation processes through the rise of 3D Generative Adversarial Networks (3D GANs). These networks contain 3D generative models capable of analyzing and constructing 3D objects. 3D generative models have therefore become an increasingly important consideration for the automation of design processes in the manufacturing and defense industries. This case study explores areas of automation enabled by 3D generative models for an incumbent in the Swedish defense industry. The study additionally evaluates the discovered implementation types of 3D generative models from a socio-technical perspective through qualitative interviews with employees, applying the Unified Theory of Acceptance and Use of Technology (UTAUT) to understand the adoption of, and intention to use, 3D generative models. The study describes 3D objects, CAD, 3D generative models, and point cloud data, and reviews the literature in three fields (AI, technology acceptance, and the defense industry) to narrow it to the context of this study. Twenty-one types of implementations are discovered and categorized into four distinct groups. In conclusion, considerable potential is found for the adoption of 3D generative models, especially in AI simulation processes, but challenges with data collection and security are identified as the most significant obstacles to overcome.
50
From specialists to generalists: inductive biases of deep learning for higher level cognition. Goyal, Anirudh, 10 1900.
Current neural networks achieve state-of-the-art results across a range of challenging problem domains.
Given enough data and computation, current neural networks can achieve human-level results on almost any task. In that sense, we have been able to train specialists that perform a particular task very well, whether it is the game of Go, playing Atari games, Rubik's cube manipulation, image captioning, or drawing images from captions. The next challenge for AI is to devise methods to train generalists that, when exposed to multiple tasks during training, can quickly adapt to new, unknown tasks. Without any assumptions about the data-generating distribution, it may not be possible to achieve better generalization and adaptation to new (unknown) tasks.
A fascinating possibility is that human and animal intelligence could be explained by a few principles rather than an encyclopedia of facts. If that were the case, we could more easily both understand our own intelligence and build intelligent machines. Just as in physics, the principles themselves would not be sufficient to predict the behavior of complex systems like brains, and substantial computation might be needed to simulate human intelligence. In addition, we know that real brains incorporate detailed task-specific a priori knowledge that could not fit into a short list of simple principles. So we think of that short list as explaining instead the brain's ability to learn and adapt efficiently to new environments, which is a great part of what we need for AI. If this simplicity-of-principles hypothesis were correct, it would suggest that studying the kind of inductive biases that humans and animals exploit (another way to think about design principles and priors in learning systems) could both help clarify these principles and provide inspiration for AI research.
Deep learning already exploits several key inductive biases, and my work considers a larger list, focusing on those that mostly concern higher-level cognitive processing. My work focuses on designing such models by incorporating in them strong but general assumptions (inductive biases) that enable high-level reasoning about the structure of the world. This research program is both ambitious and practical, yielding concrete algorithms as well as a cohesive vision for long-term research towards generalization in a complex and changing world.