1 |
Learning to Adapt Neural Networks Across Visual Domains / Roy, Subhankar, 29 September 2022
In the field of machine learning (ML), a commonly encountered problem is the lack of generalizability of learnt classification functions when they are subjected to new samples that are not representative of the training distribution. The discrepancy between the training (a.k.a. source) and test (a.k.a. target) distributions is caused by several latent factors, such as changes in appearance, illumination and viewpoint, and is popularly known as domain shift. To make a classifier cope with such domain shifts, a sub-field of machine learning called domain adaptation (DA) has emerged that jointly uses the annotated data from the source domain and the unlabelled data from the target domain of interest. Adapting a classifier to an unlabelled target data set is of tremendous practical significance because it incurs no labelling cost and allows for more accurate predictions in the environment of interest. However, the majority of DA methods address the single-source, single-target scenario and are not easily extendable to many practical DA settings. As the focus on making ML models deployable increases, improved methods are needed that can handle the inherently complex DA scenarios that arise in the real world.
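The accuracy drop caused by domain shift can be illustrated with a toy sketch (entirely illustrative, not from the thesis): a nearest-centroid classifier fitted on labelled source data loses accuracy on a target domain whose samples are displaced by a latent factor.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n, shift=(0.0, 0.0)):
    """Two Gaussian classes; `shift` models a latent domain factor."""
    x0 = rng.normal(-2.0, 1.0, (n, 2)) + shift
    x1 = rng.normal(+2.0, 1.0, (n, 2)) + shift
    return np.concatenate([x0, x1]), np.array([0] * n + [1] * n)

# Fit a nearest-centroid classifier on the labelled source domain.
Xs, ys = sample(500)
centroids = np.stack([Xs[ys == c].mean(axis=0) for c in (0, 1)])

def predict(X):
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
    return d.argmin(axis=1)

# Target domain: same classes, but every sample displaced by a shift.
Xt, yt = sample(500, shift=(2.5, 2.5))

acc_source = (predict(Xs) == ys).mean()
acc_target = (predict(Xt) == yt).mean()
print(f"source acc: {acc_source:.2f}  target acc: {acc_target:.2f}")
```

With this shift the source classifier is nearly perfect on source data but drops sharply on the target, which is exactly the gap that DA methods aim to close.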
In this work we build towards the goal of addressing more practical DA settings and help realize novel methods for real-world applications: (i) We begin by analyzing and addressing the single-source, single-target setting, proposing whitening-based embedded normalization layers to align the marginal feature distributions between the two domains. To better utilize the unlabelled target data, we propose an unsupervised regularization loss that encourages both confident and consistent predictions. (ii) Next, we build on top of the proposed normalization layers and use them in a generative framework to address multi-source DA by posing it as an image translation problem. This framework, TriGAN, allows a single generator to be learned from all the source domain data, leading to better generation of target-like source data. (iii) We address multi-target DA by learning a single classifier for all of the target domains. Our proposed framework exploits feature aggregation with a graph convolutional network to align the feature representations of similar samples across domains. Moreover, to counteract noisy pseudo-labels, we propose a co-teaching strategy with a dual classifier head. To enable smoother adaptation when domain labels are available, we propose domain curriculum learning, which adapts to one target domain at a time, in order of increasing domain gap. (iv) Finally, we address the challenging source-free DA setting, where the only source of supervision is a source-trained model. We propose to use the Laplace approximation to build a probabilistic source model that can quantify the uncertainty of the source model's predictions on the target data. This uncertainty is then used as an importance weight during target adaptation, down-weighting target data that do not lie on the source manifold.
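The unsupervised regularization of contribution (i) combines two intuitions: predictions on unlabelled target data should be confident (low-entropy) and consistent across stochastic views of the same batch. A schematic NumPy sketch of such a loss follows; the function and variable names are illustrative and the exact formulation in the thesis may differ.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def unsup_regulariser(logits_a, logits_b):
    """Sum of an entropy (confidence) term and a consistency term.

    `logits_a` / `logits_b` are class logits for two stochastic views of
    the same unlabelled target batch (names are illustrative).
    """
    p, q = softmax(logits_a), softmax(logits_b)
    entropy = -(p * np.log(p + 1e-8)).sum(axis=-1).mean()   # be confident
    consistency = ((p - q) ** 2).sum(axis=-1).mean()        # be consistent
    return entropy + consistency
```

A sharp prediction made identically for both views yields a loss near zero, while flat or disagreeing predictions are penalised, which is the behaviour the regularizer is meant to encourage.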
|
2 |
Learning from Synthetic Data: Towards Effective Domain Adaptation Techniques for Semantic Segmentation of Urban Scenes / Valls I Ferrer, Gerard, January 2021
Semantic segmentation is the task of predicting predefined class labels for each pixel in a given image. It is essential in autonomous driving, but also challenging, because training accurate models requires large and diverse datasets, which are difficult to collect due to the high cost of annotating images at pixel level. This raises interest in using synthetic images from simulators, which can be labelled automatically. However, models trained directly on synthetic data perform poorly in real-world scenarios due to the distributional misalignment between synthetic and real images (domain shift). This thesis explores the effectiveness of several techniques for alleviating this issue, employing Synscapes and Cityscapes as the synthetic and real datasets, respectively. Some of the tested methods exploit a few additional labelled real images (few-shot supervised domain adaptation), some have access to plentiful real images but not their associated labels (unsupervised domain adaptation), and others do not take advantage of any image or annotation from the real domain (domain generalisation). After extensive experiments and a thorough comparative study, this work shows the severity of the domain shift problem by revealing that a semantic segmentation model trained directly on the synthetic dataset scores a poor mean Intersection over Union (mIoU) of 33.5% when tested on the real dataset. It also demonstrates that this performance can be boosted by 25.7% without accessing any annotations from the real domain, and by 17.3% without leveraging any information from the real domain. Nevertheless, these gains are still inferior to the 31.0% relative improvement achieved with as few as 25 supplementary labelled real images, which suggests that there is still room for improvement in the fields of unsupervised domain adaptation and domain generalisation.
Future work should focus on developing better algorithms and on creating synthetic datasets with a greater diversity of shapes and textures in order to reduce the domain shift.
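The mIoU metric used above is computed per class as the intersection over union of the predicted and ground-truth masks, averaged over the classes. A minimal sketch on tiny label maps (illustrative, not the thesis's evaluation code):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union over classes present in either map."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:          # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

# Tiny 3x3 "segmentation maps" with three classes.
target = np.array([[0, 0, 1],
                   [0, 1, 1],
                   [2, 2, 2]])
pred   = np.array([[0, 0, 1],
                   [0, 0, 1],
                   [2, 2, 1]])
print(f"mIoU: {mean_iou(pred, target, num_classes=3):.3f}")
```

Benchmark implementations (e.g. the Cityscapes evaluation scripts) additionally handle ignore labels and accumulate intersections and unions over the whole dataset before dividing, but the per-class ratio is the same.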
|