Global ETD Search

261	TEMPORAL DIET AND PHYSICAL ACTIVITY PATTERN ANALYSIS, UNSUPERVISED PERSON RE-IDENTIFICATION, AND PLANT PHENOTYPING Jiaqi Guo (18108289) 06 March 2024 Both diet and physical activity are known to be risk factors for obesity and chronic diseases such as diabetes and metabolic syndrome. We explore a distance-based approach for clustering daily physical activity time series to find temporal physical activity patterns among U.S. adults (ages 20-65). We further extend this approach to integrate both diet and physical activity, and find joint temporal diet and physical activity patterns. Our experiments indicate that the integration of diet, physical activity, and time has the potential to discover joint patterns associated with health. Unsupervised domain adaptive (UDA) person re-identification (re-ID) aims to learn identity information from labeled images in source domains and apply it to unlabeled images in a target domain. We propose a deep learning architecture called Synthesis Model Bank (SMB) to deal with illumination variation in unsupervised person re-ID. In our experiments, the proposed SMB outperforms other synthesis methods on several re-ID benchmarks. Recent technological advances have introduced modern high-throughput methodologies such as Unmanned Aerial Vehicles (UAVs) to replace traditional, labor-intensive phenotyping. For many UAV phenotyping analyses, the first step is to extract the smallest groups of plants that have the same genotype, called “plots”. We propose an optimization-based, rotation-adaptive approach for extracting plots in a UAV RGB orthomosaic image.
In our experiments, the proposed method achieves better plot extraction accuracy than existing approaches and does not require training data. Computer vision Image processing Multimodal analysis and synthesis Deep learning Neural networks Semi- and unsupervised learning computer vision deep learning physical activity diet time series analysis time series clustering generative model image synthesis diffusion model GAN CUDA segmentation
262	Quantifying Gait Characteristics and Neurological Effects in people with Spinal Cord Injury using Data-Driven Techniques / Kvantifiering av gångens egenskaper och neurologisk funktionens effekt hos personer med ryggmärgsskada med hjälp av datadrivna metoder Truong, Minh January 2024 (has links) Spinal cord injury, whether traumatic or nontraumatic, can partially or completely damage sensorimotor pathways between the brain and the body, leading to heterogeneous gait abnormalities. Mobility impairments also depend on other factors such as age, weight, time since injury, pain, and walking aids used. The ASIA Impairment Scale is recommended to classify injury severity, but is not designed to characterize individual ambulatory capacity. Other standardized tests based on subjective or timing/distance assessments also have only limited ability to determine an individual's capacity. Data-driven techniques have demonstrated effectiveness in analysing complexity in many domains and may provide additional perspectives on the complexity of gait performance in persons with spinal cord injury. The studies in this thesis aimed to address the complexity of gait and functional abilities after spinal cord injury using data-driven approaches. The aim of the first manuscript was to characterize the heterogeneous gait patterns in persons with incomplete spinal cord injury. Dissimilarities among gait patterns in the study population were quantified with multivariate dynamic time warping. Gait patterns were classified into six distinct clusters using hierarchical agglomerative clustering. Through random forest classifiers with explainable AI, peak ankle plantarflexion during swing was identified as the feature that most often distinguished most clusters from the controls. By combining clinical evaluation with the proposed methods, it was possible to provide comprehensive analyses of the six gait clusters. 
The aim of the second manuscript was to quantify sensorimotor effects on walking performance in persons with spinal cord injury. The relationships between 11 input features and 2 walking outcome measures - distance walked in 6 minutes and net energy cost of transport - were captured using 2 Gaussian process regression models. Explainable AI revealed the importance of muscle strength on both outcome measures. Use of walking aids also influenced distance walked, and cardiovascular capacity influenced energy cost. Analyses for each person also gave useful insights into individual performance. The findings from these studies demonstrate the large potential of advanced machine learning and explainable AI to address the complexity of gait function in persons with spinal cord injury. / Skador på ryggmärgen, oavsett om de är traumatiska eller icke-traumatiska, kan helt eller delvis skada sensoriska och motoriska banor mellan hjärnan och kroppen, vilket påverkar gången i varierande grad. Rörelsenedsättningen beror också på andra faktorer såsom ålder, vikt, tid sedan skadan uppstod, smärta och gånghjälpmedel. ASIA-skalan används för att klassificera ryggmärgsskadans svårighetsgrad, men är inte utformad för att karaktärisera individens gångförmåga. Andra standardiserade tester baserade på subjektiva eller tids och avståndsbedömningar har också begränsad möjlighet att beskriva individuell kapacitet. Datadrivna metoder är kraftfulla och kan ge ytterligare perspektiv på gångens komplexitet och prestation. Studierna i denna avhandling syftar till att analysera komplexa relationer mellan gång, motoriska samt sensoriska funktion efter ryggmärgsskada med hjälp av datadrivna metoder. Syftet med den första studien är att karaktärisera de heterogena gångmönster hos personer med inkomplett ryggmärgsskada. Multivariat dynamisk tidsförvrägning (eng: Multivariate dynamic time warping) användes för att kvantifiera gångskillnader i studiepopulationen. 
Hierarkisk agglomerativ klusteranalys (eng: hierarchical agglomerative clustering) delade upp gång i sex distinkta kluster, varav fyra hade lägre hastighet än kontroller. Med hjälp av förklarbara AI (eng: explainable AI) identifierades det att fotledsvinkeln i svingfasen hade störst påverkan om vilken kluster som gångmönstret hamnat i. Genom att kombinera klinisk undersökning med datadrivna metoder kunde vi beskriva en omfattande bild av de sex gångklustren. Syftet med den andra manuskriptet är att kvantifiera sensoriska och motoriska faktorerans påverkan på gångförmåga efter ryggmärgsskada. Med hjälp av två Gaussian process-regressionsmodeller identifierades sambanden mellan 11 beskrivande faktorer och 2 gång prestationsmått, nämligen gångavstånd på 6 minuter samt metabola energiåtgång. Med hjälp av förklarbar AI påvisades det stora påverkan av muskelstyrka på både gångsträckan och energiåtgång. Gånghjälpmedlet samt kardiovaskulär kapaciteten hade också betydande påverkan på gångprestation. Enskilda analyser gav insiktsfull information om varje individ. Resultaten från dessa studier visar på potentiella tillämpningar av avancerad maskininlärning och AI metoder för att analysera komplexa relationer mellan funktion och motorisk prestation efter ryggmärgsskada. / QC 20240221 gait analysis pathological gait biomechanics health informatics metabolic cost unsupervised learning nonparametric regression Shapley additive explanations gånganalys funktionsnedsättning gångpatologi energiförbrukning maskininlärning hälsoinformatik biomekanik AI Health Sciences Hälsovetenskaper Clinical Medicine Klinisk medicin
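The two-step analysis described in entry 262 (multivariate dynamic time warping to quantify dissimilarity between gait patterns, followed by hierarchical agglomerative clustering) can be illustrated with a minimal sketch. This is not the thesis's implementation: the Euclidean frame cost, the average-linkage rule, the function names `dtw_distance` and `agglomerative`, and the toy data are all assumptions made here for illustration.

```python
import math
from itertools import combinations


def dtw_distance(a, b):
    """Multivariate DTW between two sequences of equal-dimension feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean cost between frames (assumed)
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def agglomerative(series, n_clusters):
    """Average-linkage agglomerative clustering on pairwise DTW distances."""
    dist = {(i, j): dtw_distance(series[i], series[j])
            for i, j in combinations(range(len(series)), 2)}
    clusters = [[i] for i in range(len(series))]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters
        total = sum(dist[min(p, q), max(p, q)] for p in c1 for q in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters (x < y, so deleting y is safe)
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda xy: linkage(clusters[xy[0]], clusters[xy[1]]))
        clusters[x] += clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical toy data: four short 2-D trajectories (e.g. two joint angles per frame)
toy = [
    [(0, 0), (1, 1), (2, 2)],
    [(0, 0), (0, 0), (1, 1), (2, 2)],  # same shape as the first, time-stretched
    [(5, 5), (6, 6), (7, 7)],
    [(5, 5), (6, 6), (6, 6), (7, 7)],
]
groups = agglomerative(toy, 2)
```

Because DTW allows one frame to align with several frames of the other sequence, the time-stretched trajectories have zero distance to their originals and end up in the same cluster.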
263	DETERMINING STRUCTURE AND GROWTH CHARACTERISTICS OF OXIDE HETEROSTRUCTURES THROUGH DEPOSITION AND DATA SCIENCE: TOWARDS SINGLE CRYSTAL BATTERIES Fraser, Kimberly 27 January 2023 No description available. Materials Science Statistics Chemistry Chemical Engineering Computer Science Engineering Pulsed laser deposition thin film data science battery lithium ion transmission electron microscopy RHEED supervised learning unsupervised learning
264	Intelligence Extraction Using Machine Learning for Threat Identification Purposes : An Overview / Inhämtande av underrättelseinformation genom maskininlärning för identifikation av hot Lindgren, Jonatan January 2022 (has links) Radar is an invaluable tool for detecting and assessing threats on land, on the seas and in the air. To properly evaluate threats, radar operators construct threat libraries where the signal characteristics of emitters are stored and mapped to specific types of platforms. In this project, methods for constructing these threat detection libraries from data obtained during real-life scenarios are investigated. A number of machine learning approaches are investigated and validated using general and method specific scoring methods. Using density based clustering methods and non-linear data transformation it is shown that Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) and spatial consistency metrics can be used to deinterleave and group signals to radar trace emitting platforms, from which suitable library parameters can be extracted. The results show that traditional metrics for evaluating cluster methods are not suited for evaluating data containing spatial information. / Radar är ett ovärderligt verktyg för att upptäcka och identifiera hot på land, till havs och i luften. För att kunna utvärdera olika former av hot använder sig radaroperatörer av hotbibliotek, vilka består av olika radarplattformers signalparametrar. I det här projektet undersöks olika metoder för att bygga hotbibliotek med hjälp av verkliga data insamlat under flygningar i Sverige. Olika maskininlärningsmetoder undersöks och utvärderas med hjälp av både generella och specifika utvärderingsmetoder. 
Genom att använda sig av densitets- baserade klustringsmetoder och olinjära metoder för att transformera data så visas att hierarkisk densitetsbaserad spatial klustring för tillämningar med störningar (HDBSCAN) och utvärderingsmetoder som baseras på spatial karaktäristik kan användas för att separera och gruppera radarkällor, vilka kan användas för att finna parametrar för att bygga hotbibliotek. Det visas även att traditionella metoder för att utvärdera klustringsresultat inte lämpar sig för att utvärdera spatiala data. Machine learning Radar threat identification Clustering Feature scaling Electronic warfare Maskininlärning Identifikation av radarhot Klustring Skalning av dataparametrar Elektronisk krigsföring Engineering and Technology Teknik och teknologier
265	Matching Sticky Notes Using Latent Representations / Matchning av klisterlappar med hjälp av latent representation García San Vicent, Javier January 2022 This project addresses the issue of accurately identifying repeated images of sticky notes. Due to environmental conditions and the 3D location of the camera, different pictures of the same sticky note may look distinct enough that it is hard to determine whether they belong to the same note. More specifically, this thesis aims to create latent representations of these pictures that encode their content, so that all pictures of the same note have similar representations by which they can be identified. Thus, those representations must be invariant to light conditions, blur and camera position. To that end, a Siamese neural architecture is trained based on data augmentation methods. The method consists of learning to embed two augmented versions of the same image into similar representations. This architecture has been trained with unsupervised learning and fine-tuned with supervised learning to detect whether two representations belong to the same note. The performance of ResNet, EfficientNet and Vision Transformers in encoding the images into their representations has been compared with different configurations. The results show that, while the most complex models overfit small amounts of data, the simplest encoders are capable of properly identifying more than 95% of the sticky notes in grey scale. Those models can create invariant representations that are close to each other in the latent space for pictures of the same sticky note. Gathering more data could improve the performance of the model and make it possible to apply it to other fields such as handwritten documents. / Detta projekt tar upp frågan om att identifiera upprepade bilder av klisterlappar. 
På grund av miljöförhållanden och kamerans 3D-placering kan olika bilder som tagits till klisterlappar se tillräckligt distinkta ut för att det ska vara svårt att avgöra om de faktiskt tillhör samma klisterlappar. Mer specifikt är syftet med denna avhandling att skapa latenta representationer av bilder av klisterlappar som kodar deras innehåll, så att alla bilder av en klisterlapp har en liknande representation som gör det möjligt att identifiera dem. Sålunda måste representationerna vara oföränderliga för ljusförhållanden, oskärpa och kameraposition. För det ändamålet kommer en enkel siamesisk neural arkitektur att tränas baserad på dataförstärkningsmetoder. Metoden går ut på att lära sig att göra representationerna av två förstärkta versioner av en bild så lika som möjligt. Genomatt tillämpa vissa förbättringar av arkitekturen kan oövervakat lärande användas för att träna nätverket. Prestandan hos ResNet, EfficientNet och Vision Transformers när det gäller att koda bilderna till deras representationer har jämförts med olika konfigurationer. Resultaten visar att även om de mest komplexa modellerna överpassar små mängder data, kan de enklaste kodarna korrekt identifiera mer än 95% av klisterlapparna. Dessa modeller kan skapa oföränderliga representationer som är nära i det latenta utrymmet för bilder av samma klisterlapp. Att samla in mer data kan resultera i en förbättring av modellens prestanda och möjligheten att tillämpa den på andra områden som till exempel handskrivna dokument. Pattern matching Image matching Image recognition Representation learning Unsupervised learning Semisupervised learning Siamese architecture Deep learning Transfer learning Mönstermatchning Bildmatchning Bildigenkänning Representationsinlärning Oövervakat lärande Halvövervakat lärande Siamesisk arkitektur Djup lärning Överfört lärande Computer and Information Sciences Data- och informationsvetenskap
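The idea in entry 265 that pictures of the same note should lie close together in the latent space can be illustrated with a small matching sketch. The thesis fine-tunes a network to decide whether two representations belong to the same note; the version below swaps in a plain cosine-similarity threshold instead, and the function names, the threshold of 0.9, and the 2-D toy embeddings are assumptions for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def match_notes(embeddings, threshold=0.9):
    """Greedily group embeddings by similarity to each group's first member."""
    groups = []
    for idx, emb in enumerate(embeddings):
        for group in groups:
            if cosine_similarity(embeddings[group[0]], emb) >= threshold:
                group.append(idx)  # close enough: treat as the same note
                break
        else:
            groups.append([idx])  # no match: start a new note group
    return groups


# Hypothetical encoder outputs for four photos of two different notes
toy_embeddings = [(1.0, 0.0), (0.99, 0.05), (0.0, 1.0), (0.1, 0.95)]
matched = match_notes(toy_embeddings)
```

A fixed threshold is the simplest possible decision rule; a learned classifier on pairs of representations, as the abstract describes, can adapt the decision boundary to the data.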
266	DEVELOPING A RESPONSIBLE AI INSTRUCTIONAL FRAMEWORK FOR ENHANCING AI LEGISLATIVE EFFICACY IN THE UNITED STATES Kylie Ann Kristine Leonard (17583945) 09 December 2023 Artificial Intelligence (AI) is anticipated to exert a considerable impact on the global Gross Domestic Product (GDP), with projections estimating a contribution of 13 trillion dollars by the year 2030 (IEEE Board of Directors, 2019). In light of this influence on economic, societal, and intellectual realms, it is imperative for policymakers to acquaint themselves with the ongoing developments and consequential impacts of AI. Their preparedness is urgent because AI may evolve in unpredicted directions if proactive measures are not promptly instituted. This paper addresses a pivotal research question: "Do United States policymakers have a sufficient knowledge base to understand Responsible AI in relation to Machine Learning to pass Artificial Intelligence legislation; and if they do not, how should a pedagogical instructional framework be created to give them the necessary knowledge?" This question was pursued through a systematic review, a gap analysis, and the formulation of an instructional framework specifically tailored to explain the intricacies of Machine Learning. The findings of this study underscore the need for policymakers to undergo educational initiatives in the realm of artificial intelligence. Such educational interventions are deemed essential to give policymakers the understanding required to formulate effective regulatory frameworks that ensure the development of Responsible AI. 
The ethical dimensions inherent in this technological landscape warrant consideration, and policymakers must be equipped with the necessary cognitive tools to navigate these ethical quandaries adeptly. In response, the present study has undertaken the design and development of an instructional framework. This framework is conceived as a strategic intervention to address the evident knowledge gap among policymakers concerning the nuances of AI. By imparting an understanding of AI-related concepts, the framework aims to cultivate more informed and discerning governance among policymakers, thus contributing to the responsible and ethical deployment of AI technologies. Educational technology and computing Research, science and technology policy Semi- and unsupervised learning Artificial intelligence ethics Responsible AI governance semi-supervised methods instructional curriculum development Artificial Intelligence Legislation policymaker concern
267	Advancing Keyword Clustering Techniques: A Comparative Exploration of Supervised and Unsupervised Methods : Investigating the Effectiveness and Performance of Supervised and Unsupervised Methods with Sentence Embeddings / Jämförande analys av klustringstekniker för klustring av nyckelord : Undersökning av effektiviteten och prestandan hos övervakade och oövervakade metoder med inbäddade ord Caliò, Filippo January 2023 (has links) Clustering keywords is an important Natural Language Processing task that can be adopted by several businesses since it helps to organize and group related keywords together. By clustering keywords, businesses can better understand the topics their customers are interested in. This thesis project provides a detailed comparison of two different approaches that might be used for performing this task and aims to investigate whether having the labels associated with the keywords improves the clusters obtained. The keywords are clustered using both supervised learning, training a neural network and applying community detection algorithms such as Louvain, and unsupervised learning algorithms, such as HDBSCAN and K-Means. The evaluation is mainly based on metrics like NMI and ARI. The results show that supervised learning can produce better clusters than unsupervised learning. By looking at the NMI score, the supervised learning approach composed by training a neural network with Margin Ranking Loss and applying Kruskal achieves a slightly better score of 0.771 against the 0.693 of the unsupervised learning approach proposed, but by looking at the ARI score, the difference is more relevant. HDBSCAN achieves a lower score of 0.112 compared to the supervised learning approach with the Margin Ranking Loss (0.296), meaning that the clusters formed by HDBSCAN may lack meaningful structure or exhibit randomness. 
Based on the evaluation metrics, the study demonstrates that supervised learning utilizing the Margin Ranking Loss outperforms unsupervised learning techniques in terms of cluster accuracy. However, when trained with a BCE loss function, it yields less accurate clusters (NMI: 0.473, ARI: 0.108), highlighting that the unsupervised algorithms surpass this particular supervised learning approach. / Klustring av nyckelord är en viktig uppgift inom Natural Language Processing som kan användas av flera företag eftersom den hjälper till att organisera och gruppera relaterade nyckelord tillsammans. Genom att klustra nyckelord kan företag bättre förstå vilka ämnen deras kunder är intresserade av. Detta examensarbete ger en detaljerad jämförelse av två olika metoder som kan användas för att utföra denna uppgift och syftar till att undersöka om de etiketter som är associerade med nyckelorden förbättrar de kluster som erhålls. Nyckelorden klustras med hjälp av både övervakad inlärning, träning av ett neuralt nätverk och tillämpning av algoritmer för community-detektering, t.ex. Louvain, och algoritmer för oövervakad inlärning, t.ex. HDBSCAN och KMeans. Utvärderingen baseras huvudsakligen på mått som NMI och ARI. Resultaten visar att övervakad inlärning kan ge bättre kluster än oövervakad inlärning. Om man tittar på NMI-poängen uppnår den övervakade inlärningsmetoden som består av att träna ett neuralt nätverk med Margin Ranking Loss och tillämpa Kruskal en något bättre poäng på 0,771 jämfört med 0,693 för den föreslagna oövervakade inlärningsmetoden, men om man tittar på ARI-poängen är skillnaden mer relevant. HDBSCAN uppnår en lägre poäng på 0,112 jämfört med den övervakade inlärningsmetoden med Margin Ranking Loss (0,296), vilket innebär att de kluster som bildas av HDBSCAN kan sakna meningsfull struktur eller uppvisa slumpmässighet. 
Baserat på utvärderingsmetrikerna visar studien att övervakad inlärning som använder Margin Ranking Loss överträffar tekniker för oövervakad inlärning när det gäller klusternoggrannhet. När den tränas med en BCEförlustfunktion ger den dock mindre exakta kluster (NMI: 0,473, ARI: 0,108), vilket belyser att de oövervakade algoritmerna överträffar denna speciella övervakade inlärningsmetod. Keyword Clustering Supervised Learning Unsupervised Learning Cluster Labels Natural Language Processing Sentence Embeddings Nyckelord Klustring övervakad inlärning oövervakad inlärning klustermärkning naturlig språkbehandling Inbäddning av meningar Computer and Information Sciences Data- och informationsvetenskap
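Entry 267 compares clusterings by NMI and ARI. As a rough illustration of what those scores measure, the sketch below computes both from two label lists using their standard definitions. The abstract does not say which NMI normalization was used, so the arithmetic-mean variant here is an assumption, as are the function names and the toy labels.

```python
import math
from collections import Counter


def _comb2(n):
    """Number of unordered pairs among n items."""
    return n * (n - 1) // 2


def adjusted_rand_index(truth, pred):
    """Adjusted Rand Index: pair-counting agreement, corrected for chance."""
    n = len(truth)
    if n < 2:
        return 1.0
    sum_cells = sum(_comb2(c) for c in Counter(zip(truth, pred)).values())
    sum_rows = sum(_comb2(c) for c in Counter(truth).values())
    sum_cols = sum(_comb2(c) for c in Counter(pred).values())
    expected = sum_rows * sum_cols / _comb2(n)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are a single cluster
    return (sum_cells - expected) / (max_index - expected)


def normalized_mutual_info(truth, pred):
    """NMI with arithmetic-mean normalization (assumed variant)."""
    n = len(truth)
    ct, cp = Counter(truth), Counter(pred)
    joint = Counter(zip(truth, pred))
    mi = sum(c / n * math.log(c * n / (ct[t] * cp[p])) for (t, p), c in joint.items())
    h_t = -sum(c / n * math.log(c / n) for c in ct.values())
    h_p = -sum(c / n * math.log(c / n) for c in cp.values())
    denom = (h_t + h_p) / 2
    return mi / denom if denom > 0 else 1.0


# Hypothetical labels: a relabeled perfect clustering and an uninformative one
truth = [0, 0, 1, 1]
perfect = [1, 1, 0, 0]
crossed = [0, 1, 0, 1]
```

Both scores ignore the label names themselves, so a relabeled but otherwise identical clustering scores 1. ARI can be negative when agreement is worse than chance, which is one reason the two metrics can rank methods differently.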
268	Generative Image-to-Image Translation with Applications in Computational Pathology Fangda Li (17272816) 24 October 2023 Generative Image-to-Image Translation (I2IT) involves transforming an input image from one domain to another. Typically, this transformation retains the content of the input image while adjusting the domain-dependent style elements. Generative I2IT finds utility in a wide range of applications, yet its effectiveness hinges on adaptations to the unique characteristics of the data at hand. This dissertation pushes the boundaries of I2IT by applying it to stain-related problems in computational pathology. In particular, the main contributions span two major applications of stain translation, H&E-to-H&E and H&E-to-IHC, each with its own requirements and challenges. The first contribution addresses the generalization challenge that the high variability in H&E stain appearances poses to task-specific machine learning models. To this end, the Generative Stain Augmentation Network (G-SAN) is introduced to augment the training images in any downstream task with random and diverse H&E stain appearances. Experimental results demonstrate G-SAN’s ability to enhance model generalization across stain variations in downstream tasks. The second key contribution of this dissertation focuses on H&E-to-IHC stain translation. The major challenge in learning accurate H&E-to-IHC stain translation is the frequent and sometimes severe inconsistencies in the ground-truth H&E-IHC image pairs. To make training more robust to these inconsistencies, a novel contrastive-learning-based loss, named the Adaptive Supervised PatchNCE (ASP) loss, is presented. Experimental results suggest that the proposed ASP-based framework outperforms the state of the art in H&E-to-IHC stain translation by significant margins. 
Additionally, a new dataset for H&E-to-IHC translation, the Multi-IHC Stain Translation (MIST) dataset, is released to the public, featuring paired images from H&E to four different IHC stains. As a future direction for generative I2IT in stain translation problems, a proof-of-concept study applying the latest diffusion-model-based I2IT methods to the problem of virtual H&E staining is presented. Adversarial machine learning Deep learning Neural networks Semi- and unsupervised learning GAN Histopathology Digital Pathology H&E Staining IHC Staining Generative Models Stain Translation Image-to-Image Translation Computational Pathology
269	On discovering and learning structure under limited supervision Mudumba, Sai Rajeswar 08 1900 (has links) Les formes, les surfaces, les événements et les objets (vivants et non vivants) constituent le monde. L'intelligence des agents naturels, tels que les humains, va au-delà de la simple reconnaissance de formes. Nous excellons à construire des représentations et à distiller des connaissances pour comprendre et déduire la structure du monde. Spécifiquement, le développement de telles capacités de raisonnement peut se produire même avec une supervision limitée. D'autre part, malgré son développement phénoménal, les succès majeurs de l'apprentissage automatique, en particulier des modèles d'apprentissage profond, se situent principalement dans les tâches qui ont accès à de grands ensembles de données annotées. Dans cette thèse, nous proposons de nouvelles solutions pour aider à combler cette lacune en permettant aux modèles d'apprentissage automatique d'apprendre la structure et de permettre un raisonnement efficace en présence de tâches faiblement supervisés. Le thème récurrent de la thèse tente de s'articuler autour de la question « Comment un système perceptif peut-il apprendre à organiser des informations sensorielles en connaissances utiles sous une supervision limitée ? » Et il aborde les thèmes de la géométrie, de la composition et des associations dans quatre articles distincts avec des applications à la vision par ordinateur (CV) et à l'apprentissage par renforcement (RL). Notre première contribution ---Pix2Shape---présente une approche basée sur l'analyse par synthèse pour la perception. Pix2Shape exploite des modèles génératifs probabilistes pour apprendre des représentations 3D à partir d'images 2D uniques. Le formalisme qui en résulte nous offre une nouvelle façon de distiller l'information d'une scène ainsi qu'une représentation puissantes des images. 
Nous y parvenons en augmentant l'apprentissage profond non supervisé avec des biais inductifs basés sur la physique pour décomposer la structure causale des images en géométrie, orientation, pose, réflectance et éclairage. Notre deuxième contribution ---MILe--- aborde les problèmes d'ambiguïté dans les ensembles de données à label unique tels que ImageNet. Il est souvent inapproprié de décrire une image avec un seul label lorsqu'il est composé de plus d'un objet proéminent. Nous montrons que l'intégration d'idées issues de la littérature linguistique cognitive et l'imposition de biais inductifs appropriés aident à distiller de multiples descriptions possibles à l'aide d'ensembles de données aussi faiblement étiquetés. Ensuite, nous passons au paradigme d'apprentissage par renforcement, et considérons un agent interagissant avec son environnement sans signal de récompense. Notre troisième contribution ---HaC--- est une approche non supervisée basée sur la curiosité pour apprendre les associations entre les modalités visuelles et tactiles. Cela aide l'agent à explorer l'environnement de manière autonome et à utiliser davantage ses connaissances pour s'adapter aux tâches en aval. La supervision dense des récompenses n'est pas toujours disponible (ou n'est pas facile à concevoir), dans de tels cas, une exploration efficace est utile pour générer un comportement significatif de manière auto-supervisée. Pour notre contribution finale, nous abordons l'information limitée contenue dans les représentations obtenues par des agents RL non supervisés. Ceci peut avoir un effet néfaste sur la performance des agents lorsque leur perception est basée sur des images de haute dimension. Notre approche a base de modèles combine l'exploration et la planification sans récompense pour affiner efficacement les modèles pré-formés non supervisés, obtenant des résultats comparables à un agent entraîné spécifiquement sur ces tâches. 
Il s'agit d'une étape vers la création d'agents capables de généraliser rapidement à plusieurs tâches en utilisant uniquement des images comme perception. / Shapes, surfaces, events, and objects (living and non-living) constitute the world. The intelligence of natural agents, such as humans, goes beyond pattern recognition. We excel at building representations and distilling knowledge to understand and infer the structure of the world. Critically, the development of such reasoning capabilities can occur even with limited supervision. On the other hand, despite its phenomenal development, the major successes of machine learning, in particular deep learning models, are primarily in tasks that have access to large annotated datasets. In this dissertation, we propose novel solutions to help address this gap by enabling machine learning models to learn structure and reason effectively in weakly supervised settings. The recurring theme of the thesis is the question "How can a perceptual system learn to organize sensory information into useful knowledge under limited supervision?" It discusses the themes of geometry, compositions, and associations in four separate articles with applications to computer vision (CV) and reinforcement learning (RL). Our first contribution, Pix2Shape, presents an analysis-by-synthesis based approach (also referred to as inverse graphics) for perception. Pix2Shape leverages probabilistic generative models to learn 3D-aware representations from single 2D images. The resulting formalism allows us to perform novel view synthesis of a scene and produce powerful representations of images. We achieve this by augmenting unsupervised learning with physically based inductive biases to decompose a scene structure into geometry, pose, reflectance and lighting. Our second contribution, MILe, addresses the ambiguity issues in single-labeled datasets such as ImageNet. 
It is often inappropriate to describe an image with a single label when it is composed of more than one prominent object. We show that integrating ideas from the cognitive linguistics literature and imposing appropriate inductive biases helps in distilling multiple possible descriptions using such weakly labeled datasets. Next, moving into the RL setting, we consider an agent interacting with its environment without a reward signal. Our third contribution, HaC, is a curiosity-based unsupervised approach to learning associations between visual and tactile modalities. This helps the agent explore the environment in an analogous self-guided fashion and further use this knowledge to adapt to downstream tasks. In the absence of reward supervision, intrinsic motivation is useful for generating meaningful behavior in a self-supervised manner. In our final contribution, we address the representation learning bottleneck in unsupervised RL agents, which has a detrimental effect on performance with high-dimensional pixel-based inputs. Our model-based approach combines reward-free exploration and planning to efficiently fine-tune unsupervised pre-trained models, achieving results comparable to task-specific baselines. This is a step towards building agents that can generalize quickly on more than a single task using image inputs alone. Representation learning Weakly labeled data Intrinsic control Unsupervised Learning Generative modeling 3D scene understanding Apprentissage des représentations Apprentissage non supervisé Modélisation générative Perception de scènes 3D Contrôle intrinsèque Modèles du monde Données faiblement supervisées
270	Identification of Fundamental Driving Scenarios Using Unsupervised Machine Learning / Identifiering av grundläggande körscenarier med icke-guidad maskininlärning Anantha Padmanaban, Deepika January 2020 (has links) A challenge to release autonomous vehicles to public roads is safety verification of the developed features. Safety test driving of vehicles is not practically feasible as the acceptance criterion is driving at least 2.1 billion kilometers [1]. An alternative to this distance-based testing is the scenario-based approach, where the intelligent vehicles are exposed to known scenarios. Identification of such scenarios from the driving data is crucial for this validation. The aim of this thesis is to investigate the possibility of unsupervised identification of driving scenarios from the driving data. The task is performed in two major parts. The first is the segmentation of the time series driving data by detecting changepoints, followed by the clustering of the previously obtained segments. Time-series segmentation is approached using a Deep Learning method, while the second task is performed using time series clustering. The work also includes a visual approach for validating the time-series segmentation, followed by a quantitative measure of the performance. The approach is also qualitatively compared against a Bayesian Nonparametric approach to identify the usefulness of the proposed method. Based on the analysis of results, there is a discussion about the usefulness and drawbacks of the method, followed by the scope for future research. / En utmaning att släppa autonoma fordon på allmänna vägar är säkerhetsverifiering av de utvecklade funktionerna. Säkerhetstestning av fordon är inte praktiskt genomförbart eftersom acceptanskriteriet kör minst 2,1 miljarder kilometer [1]. Ett alternativ till denna distansbaserade testning är det scenaribaserade tillväga-gångssättet, där intelligenta fordon utsätts för kända scenarier. 
Identifiering av sådana scenarier från kördata är avgörande för denna validering. Syftet med denna avhandling är att undersöka möjligheten till oövervakad identifiering av körscenarier från kördata. Uppgiften utförs i två huvuddelar. Den första är segmenteringen av tidsseriedrivdata genom att detektera ändringspunkter, följt av klustring av de tidigare erhållna segmenten. Tidsseriesegmentering närmar sig med en Deep Learningmetod, medan den andra uppgiften utförs med hjälp av tidsseriekluster. Arbetet innehåller också ett visuellt tillvägagångssätt för att validera tidsserierna, följt av ett kvantitativt mått på prestanda. Tillvägagångssättet jämförs också med en Bayesian icke-parametrisk metod för att identifiera användbarheten av den föreslagna metoden. Baserat på analysen av resultaten diskuteras metodens användbarhet och nackdelar, följt av möjligheten för framtida forskning. Time-series Segmentation Time-series Clustering Stacked Sparse Autoencoders Unsupervised Learning Autonomous Driving Feature Extraction Segment av tidsserier Tidsserie-kluster Staplade autokodare Oövervakat lärande Autonom körning Särdragsextraktion Computer and Information Sciences Data- och informationsvetenskap

Search results