  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Domain adaptation for classifying disaster-related Twitter data

Sopova, Oleksandra January 1900
Master of Science / Department of Computing and Information Sciences / Doina Caragea / Machine learning is the subfield of artificial intelligence that gives computers the ability to learn without being explicitly programmed, as it was defined by Arthur Samuel, the American pioneer in the field of computer gaming and artificial intelligence who was born in Emporia, Kansas. Supervised machine learning is focused on building predictive models given labeled training data. Data may come from a variety of sources, for instance, social media networks. In our research, we use Twitter data, specifically user-generated tweets about disasters such as floods, hurricanes, and terrorist attacks, to build classifiers that could help disaster management teams identify useful information. A supervised classifier trained on data (training data) from a particular domain (i.e., a disaster) is expected to give accurate predictions on unseen data (testing data) from the same domain, assuming that the training and test data have similar characteristics. Labeled data is not easily available for a current target disaster. However, labeled data from a prior source disaster is presumably available and can be used to learn a supervised classifier for the target disaster. Unfortunately, the source disaster data and the target disaster data may not share the same characteristics, and the classifier learned from the source may not perform well on the target. Domain adaptation techniques, which use unlabeled target data in addition to labeled source data, can be used to address this problem. We study single-source and multi-source domain adaptation techniques, using a Naïve Bayes classifier. Experimental results on Twitter datasets corresponding to six disasters show that domain adaptation techniques improve the overall performance as compared to basic supervised learning classifiers.
Domain adaptation is crucial for many machine learning applications, as it enables the use of unlabeled data in domains where labeled data is not available.
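The abstract above does not say which adaptation algorithm is paired with Naïve Bayes, so the following is only a minimal sketch of one common option, self-training: a multinomial Naïve Bayes model is fit on labeled source tweets, used to pseudo-label the unlabeled target tweets, and then refit on both. All function names, the label strings, and the toy tokens are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def train_nb(docs, labels, vocab, alpha=1.0):
    """Fit multinomial Naive Bayes with Laplace smoothing (alpha)."""
    prior, word_logp = {}, {}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        prior[c] = math.log(len(class_docs) / len(docs))
        counts = Counter(w for d in class_docs for w in d)
        total = sum(counts.values()) + alpha * len(vocab)
        word_logp[c] = {w: math.log((counts[w] + alpha) / total) for w in vocab}
    return prior, word_logp


def predict(model, doc):
    """Return the class with the highest log posterior; unknown words are skipped."""
    prior, word_logp = model
    scores = {
        c: prior[c] + sum(word_logp[c][w] for w in doc if w in word_logp[c])
        for c in prior
    }
    return max(scores, key=scores.get)


def self_train(source_docs, source_labels, target_docs, rounds=3):
    """Assumed adaptation scheme: pseudo-label the target data, then refit on source + target."""
    vocab = {w for d in source_docs + target_docs for w in d}
    model = train_nb(source_docs, source_labels, vocab)
    for _ in range(rounds):
        pseudo = [predict(model, d) for d in target_docs]
        model = train_nb(source_docs + target_docs, source_labels + pseudo, vocab)
    return model


# Toy usage: a labeled "flood" source and an unlabeled "hurricane" target.
source = [["flood", "water", "rescue"], ["flood", "help", "rescue"],
          ["sunny", "beach", "fun"], ["movie", "fun", "night"]]
source_y = ["relevant", "relevant", "other", "other"]
target = [["hurricane", "rescue", "help"], ["hurricane", "wind", "rescue"],
          ["fun", "party", "night"], ["party", "beach", "music"]]
adapted = self_train(source, source_y, target)
print(predict(adapted, ["hurricane", "wind"]))
```

In this toy setup the words "hurricane" and "wind" never occur in the source data, so only the pseudo-labeled target tweets give the model any evidence about them.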
2

Domain adaptive learning with disentangled features

Peng, Xingchao 18 February 2021
Recognizing visual information is crucial for many real artificial-intelligence-based applications, ranging from domestic robots to autonomous vehicles. However, the success of deep learning methods on visual recognition tasks is highly dependent on access to large-scale labeled datasets, which are expensive and cumbersome to collect. Transfer learning provides a way to alleviate the burden of annotating data, which transfers the knowledge learned from a rich-labeled source domain to a scarce-labeled target domain. However, the performance of deep learning models degrades significantly when testing on novel domains due to the presence of domain shift. To tackle the domain shift, conventional domain adaptation methods diminish the domain shift between two domains with a distribution matching loss or adversarial loss. These models align the domain-specific feature distribution and the domain-invariant feature distribution simultaneously, which is sub-optimal towards solving deep domain adaptation tasks, given that deep neural networks are known to extract features in which multiple hidden factors are highly entangled. This thesis explores how to learn effective transferable features by disentangling the deep features. The following questions are studied: (1) how to disentangle the deep features into domain-invariant and domain-specific features? (2) how would feature disentanglement help to learn transferable features under a synthetic-to-real domain adaptation scenario? (3) how would feature disentanglement facilitate transfer learning with multiple source or target domains? (4) how to leverage feature disentanglement to boost the performance in a federated system? To address these needs, this thesis proposes deep adversarial feature disentanglement: a class/domain identifier is trained on the labeled source domain and the disentangler generates features to fool the class/domain identifier. 
Extensive experiments and empirical analysis demonstrate the effectiveness of the feature disentanglement method on many real-world domain adaptation tasks. Specifically, the following three unsupervised domain adaptation scenarios are explored: (1) domain agnostic learning with disentangled representations, (2) unsupervised federated domain adaptation, (3) multi-source domain adaptation.
3

Multi-Source and Source-Private Cross-Domain Learning For Visual Recognition

Peng, Qucheng 05 1900
Indiana University-Purdue University Indianapolis (IUPUI) / Domain adaptation is one of the most active directions for addressing the annotation insufficiency problem of deep learning. However, general domain adaptation is not consistent with practical industry scenarios. In this thesis, we focus on two concerns. The first is that labeled data are generally collected from multiple domains; in other words, multi-source adaptation is the more common situation. Simply extending single-source approaches to multi-source cases can cause sub-optimal inference, so specialized multi-source adaptation methods are essential. The main challenge in the multi-source scenario is a more complex divergence situation: not only does the divergence between the target and each source play a role, but the divergences among distinct sources matter as well. However, the significance of maintaining consistency among multiple sources has not gained enough attention in previous work. In this thesis, we propose an Enhanced Consistency Multi-Source Adaptation (EC-MSA) framework to address it from three perspectives. First, we mitigate feature-level discrepancy by cross-domain conditional alignment, narrowing the divergence between each source and the target domain class by class. Second, we enhance multi-source consistency via dual mix-up, diminishing the disagreements among different sources. Third, we deploy a target distilling mechanism to handle the uncertainty of target prediction, aiming to provide high-quality pseudo-labeled target samples to benefit the previous two aspects. Extensive experiments conducted on several common benchmark datasets demonstrate that our model outperforms the state-of-the-art methods. The second concern is that data privacy and security are necessary in practice. That is, we hope to keep the raw data stored locally while still obtaining a satisfactory model. In such a case, the risk of data leakage greatly decreases.
Therefore, it is natural for us to combine the federated learning paradigm with domain adaptation. Under the source-private setting, the main challenge is to expose information from the source domain to the target domain while making sure that the communication process is safe enough. In this thesis, we propose a method named Fourier Transform-Assisted Federated Domain Adaptation (FTA-FDA) to alleviate the difficulties in two ways. We apply the Fast Fourier Transform to the raw data and transfer only the amplitude spectra during communication. Frequency-space interpolations between the two domains are then conducted, minimizing the discrepancies while keeping the domains in contact and the raw data safe. In addition, we perform prototype alignment using the model weights together with target features, aiming to reduce the class-level discrepancy. Experiments on Office-31 demonstrate the effectiveness and competitiveness of our approach, and further analyses show that our algorithm can help protect privacy and security.
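The amplitude-sharing idea in the abstract above can be illustrated with a small sketch. It assumes a 1-D signal instead of an image, a plain discrete Fourier transform instead of an optimized FFT, and a single interpolation weight `lam`; the function names and the weighting scheme are hypothetical and are not taken from FTA-FDA itself. Only the source amplitude spectrum is passed to the target side, which keeps its own phase.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT; returns the real part, which is exact for conjugate-symmetric spectra."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def mix_amplitude(target, source_amplitude, lam):
    """Interpolate amplitudes in frequency space while keeping the target phase.

    lam = 0 returns the target unchanged; lam = 1 adopts the source amplitudes.
    """
    mixed = []
    for coeff, src_amp in zip(dft(target), source_amplitude):
        amp = (1 - lam) * abs(coeff) + lam * src_amp
        mixed.append(cmath.rect(amp, cmath.phase(coeff)))
    return idft(mixed)


# The source side shares only amplitudes, never the raw signal or its phase.
source_signal = [0.0, 1.0, 4.0, 2.0]
shared_amplitude = [abs(c) for c in dft(source_signal)]

target_signal = [1.0, 2.0, 0.5, 3.0]
print(mix_amplitude(target_signal, shared_amplitude, lam=0.5))
```

Because the phase never leaves the source side, the raw source signal cannot be reconstructed from the shared amplitudes alone, which is the privacy argument the abstract relies on.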
4

Deep Domain Fusion for Adaptive Image Classification

January 2019
Endowing machines with the ability to understand digital images is a critical task for a host of high-impact applications, including pathology detection in radiographic imaging, autonomous vehicles, and assistive technology for the visually impaired. Computer vision systems rely on large corpora of annotated data in order to train task-specific visual recognition models. Despite significant advances made over the past decade, the fact remains that collecting and annotating the data needed to successfully train a model is a prohibitively expensive endeavor. Moreover, these models are prone to rapid performance degradation when applied to data sampled from a different domain. Recent works in the development of deep adaptation networks seek to overcome these challenges by facilitating transfer learning between source and target domains. In parallel, the unification of dominant semi-supervised learning techniques has illustrated unprecedented potential for utilizing unlabeled data to train classification models in defiance of discouragingly meager sets of annotated data. In this thesis, a novel domain adaptation algorithm, Domain Adaptive Fusion (DAF), is proposed, which encourages a domain-invariant linear relationship between the pixel-space of different domains and the prediction-space while being trained under a domain adversarial signal. The thoughtful combination of key components in unsupervised domain adaptation and semi-supervised learning enables DAF to effectively bridge the gap between source and target domains. Experiments performed on computer vision benchmark datasets for domain adaptation endorse the efficacy of this hybrid approach, outperforming all of the baseline architectures on most of the transfer tasks. / Dissertation/Thesis / Masters Thesis Computer Science 2019
5

Low-Resource Automatic Speech Recognition Domain Adaptation: A Case-Study in Aviation Maintenance

Nadine Amr Mahmoud Amin (16648563) 02 August 2023
With timeliness and efficiency being critical in the aviation maintenance industry, the need has been growing for smart technological solutions that help in optimizing and streamlining the different underlying tasks. One such task is the technical documentation of the performed maintenance operations. Instead of paper-based documentation, voice tools that transcribe spoken logbook entries allow technicians to document their work right away in a hands-free and time-efficient manner. However, an accurate automatic speech recognition (ASR) model requires large training corpora, which are lacking in the domain of aviation maintenance. In addition, ASR models which are trained on huge corpora in standard English perform poorly in such a technical domain with non-standard terminology. Hence, this thesis investigates the extent to which fine-tuning an ASR model, pre-trained on standard English corpora, on limited in-domain data improves its recognition performance in the technical domain of aviation maintenance. The thesis presents a case study on one such pre-trained ASR model, wav2vec 2.0. Results show that fine-tuning the model on a limited anonymized dataset of maintenance logbook entries brings about a significant reduction in its error rates when tested on not only an anonymized in-domain dataset, but also a non-anonymized one. This suggests that any available aviation maintenance logbooks, even if anonymized for privacy, can be used to fine-tune general-purpose ASR models and enhance their in-domain performance. Lastly, an analysis on the influence of voice characteristics on model performance stresses the need for balanced datasets representative of the population of aviation maintenance technicians.
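The abstract above reports reductions in "error rates" without naming the metric. A common choice for ASR is the word error rate (WER), so the sketch below assumes that metric; the function name and the logbook-style sentences are hypothetical. WER is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from the hypothesis
            insertion = curr[j - 1] + 1  # extra word in the hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of six reference words.
reference = "replaced left main landing gear tire"
hypothesis = "replaced left main landing your tire"
print(word_error_rate(reference, hypothesis))
```

Comparing this score for the same test transcripts before and after fine-tuning is one way the in-domain improvement described above could be quantified.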
6

Contributions to Document Image Analysis: Application to Music Score Images

Castellanos, Francisco J. 25 November 2022
This thesis advances the frontier of knowledge in several relevant processes within the typical workflow of optical music recognition (OMR) systems. Document analysis is a key, early stage of this workflow, whose goal is to provide a simplified version of the incoming information, that is, of the images of music documents. The remaining processes involved in OMR can take advantage of this simplification to solve their corresponding tasks more easily, focusing only on the information they need. A clear example is the process dedicated to recognizing the areas where the different staves are located. Once their coordinates are obtained, the individual staves can be processed to retrieve the symbolic music sequence they contain and thus build a digital version of their content. The research carried out to complete this thesis is backed by a series of contributions published in high-impact journals and international conferences. Specifically, this thesis contains a set of 4 articles published in journals indexed in the Journal Citation Reports and ranked in the top quartiles by impact factor, with a total of 58 citations according to Google Scholar. It also includes 3 papers presented at different editions of an international conference rated Class A according to the GII-GRIN-SCIE classification. The publications address closely related topics, focusing mainly on document analysis oriented to OMR but with touches of music sequence transcription and domain adaptation techniques.
There are also publications showing that some of these techniques can be applied to other types of document images, which makes the proposed solutions more interesting because of their capacity to generalize and adapt to other contexts. In addition to document analysis, the thesis also studies how these processes affect the final transcription of the music notation, which is, after all, the ultimate goal of OMR systems but had not been investigated until now. Finally, because of the vast amount of data that neural networks require to build a sufficiently robust model, the use of domain adaptation techniques is also studied, in the hope that their success will open the door to the future applicability of OMR systems in real environments. This is especially interesting in the context of OMR because of the large number of documents that lack the reference data needed to train neural network models, so a solution that exploits the limited labeled collections to process documents of a different kind would allow a more practical use of these automatic transcription tools. After completing this thesis, it is apparent that OMR research has not reached the limit of what the technology can achieve and that there are still several avenues to keep exploring. In fact, the work carried out has even opened new horizons that could be studied so that one day these systems can be used to automatically digitize and transcribe the written or printed musical heritage at large scale and in a reasonable time. Among these new lines of research, we can highlight the following:
· This thesis includes published contributions that use a domain adaptation technique to perform document analysis with good results. Exploring new domain adaptation techniques could be key to building robust neural network models without the need to manually label part of every musical work to be digitized.
· Applying domain adaptation techniques to other processes, such as music sequence transcription, could make it easier to train models capable of performing this task. Supervised learning algorithms require qualified personnel to manually transcribe part of the collections, but the time and economic costs of this process amount to a considerable effort if the final goal is to transcribe all of this cultural heritage. It would therefore be interesting to study the applicability of these techniques in order to drastically reduce this need.
· During the thesis, we studied how the scale factor of the documents affects the performance of several OMR processes. Besides scale, another important factor to address is orientation, since document images will not always be perfectly aligned and may undergo some rotation or deformation that causes errors in detecting the information. It would therefore be interesting to study how these deformations affect transcription and to find viable solutions for the context at hand.
· As a general and most basic case, we studied how staves could be extracted for subsequent processing with different general-purpose object detection models. These elements were assumed to be rectangular and unrotated, but this will not always be the case. Another possible line of research would therefore be to study other types of models that can detect polygonal rather than only rectangular elements, as well as the possibility of detecting tilted objects without introducing overlap between consecutive elements, as happens in some manual labeling tools such as the one used in this thesis to obtain labeled data for experimentation: MuRET.
These lines of research are, a priori, feasible, but an exploration process is needed to identify the techniques that can usefully be adapted to the OMR field. The results obtained during the thesis indicate that these lines may yield new contributions to the field and thus move one step closer to the practical, real-world application of these systems at large scale.
7

Improving NLP Systems Using Unconventional, Freely-Available Data

Huang, Fei January 2013
Sentence labeling is a type of pattern recognition task that involves the assignment of a categorical label to each member of a sentence of observed words. Standard supervised sentence-labeling systems often generalize poorly: because they use only words as features in their prediction tasks, it is difficult to estimate parameters for words that appear in the test set but seldom (or never) appear in the training set. Representation learning is a promising technique for discovering features that allow a supervised classifier to generalize from a source domain dataset to arbitrary new domains. We demonstrate that features which are learned from distributional representations of unlabeled data can be used to improve performance on out-of-vocabulary words and help the model to generalize. We also argue that it is important for a representation learner to be able to incorporate expert knowledge during its search for helpful features. We investigate techniques for building open-domain sentence labeling systems that approach the ideal of a system whose accuracy is high and consistent across domains. In particular, we investigate unsupervised techniques for language model representation learning that provide new features which are stable across domains, in that they are predictive in both the training and out-of-domain test data. In experiments, our best system with the proposed techniques reduces error by as much as 11.4% relative to the previous system using traditional representations on the Part-of-Speech tagging task. Moreover, we leverage the Posterior Regularization framework, and develop an architecture for incorporating biases from prior knowledge into representation learning. We investigate three types of biases: entropy bias, distance bias and predictive bias. Experiments on two domain adaptation tasks show that our biased learners identify significantly better sets of features than unbiased learners.
This results in a relative reduction in error of more than 16% for both tasks with respect to existing state-of-the-art representation learning techniques. We also extend the idea of using additional unlabeled data to improve the system's performance on a different NLP task, word alignment. Traditional word alignment only takes a sentence-level aligned parallel corpus as input and generates the word-level alignments. However, with the growing integration of different cultures, more and more people are competent in multiple languages, and they often use elements of multiple languages in conversations. Linguistic Code Switching (LCS) is such a situation, where two or more languages show up in the context of a single conversation. Traditional machine translation (MT) systems treat LCS data as noise, or just as regular sentences. However, if LCS data is processed intelligently, it can provide a useful signal for training word alignment and MT models. In this work, we first extract constraints from this code switching data and then incorporate them into a word alignment model training procedure. We also show that by using the code switching data, we can jointly train a word alignment model and a language model using co-training. Our techniques for incorporating LCS data improve by 2.64 in BLEU score over a baseline MT system trained using only standard sentence-aligned corpora. / Computer and Information Science
8

Active Learning Under Limited Interaction with Data Labeler

Chen, Si January 2021
Active learning (AL) aims at reducing labeling effort by identifying the most valuable unlabeled data points from a large pool. Traditional AL frameworks have two limitations: First, they perform data selection in a multi-round manner, which is time-consuming and impractical. Second, they usually assume that a small number of labeled data points is available in the same domain as the data in the unlabeled pool. In this thesis, we initiate the study of one-round active learning to solve the first issue. We propose DULO, a general framework for the one-round setting based on the notion of data utility functions, which map a set of data points to some performance measure of the model trained on the set. We formulate the one-round active learning problem as data utility function maximization. We then propose D²ULO on the basis of DULO as a solution that solves both issues. Specifically, D²ULO leverages the idea of domain adaptation (DA) to train a data utility model on source labeled data. The trained utility model can then be used to select high-utility data in the target domain and, at the same time, provide an estimate for the utility of the selected data. Our experiments show that the proposed frameworks achieve better performance compared with state-of-the-art baselines in the same setting. In particular, D²ULO is applicable to the scenario where the source and target labels have mismatches, which is not supported by existing works. / M.S. / Machine Learning (ML) has achieved huge success in recent years. Machine learning technologies such as recommendation systems, speech recognition and image recognition play an important role in daily human life.
This success mainly builds upon the use of large amounts of labeled data: compared with traditional programming, an ML algorithm does not rely on explicit instructions from humans; instead, it takes the data along with the labels as input and aims to learn, by itself, a function that correctly maps data to the label space. However, data labeling requires human effort and can be time-consuming and expensive, especially for datasets that involve domain-specific knowledge (e.g., disease prediction). Active Learning (AL) is one solution for reducing data labeling effort. Specifically, the learning algorithm actively selects the data points that provide more information for the model, so a better model can be achieved with less labeled data. While traditional AL strategies do achieve good performance, they require a small amount of labeled data for initialization and perform data selection in multiple rounds, which poses a great challenge to their application, as no platform provides timely online interaction with the data labeler and the interaction is often time-inefficient. To deal with these limitations, we first propose DULO, in which a new setting of AL is studied: data selection is allowed to be performed only once. To further broaden the application of our method, we propose D²ULO, which builds upon DULO and domain adaptation techniques to avoid the use of initial labeled data. Our experiments show that both proposed frameworks achieve better performance than state-of-the-art baselines.
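The formulation of one-round selection as data utility maximization can be sketched with a greedy loop. This is only an illustration under stated assumptions: the abstract does not say how the maximization is carried out, and the toy `coverage_utility` below is a hypothetical stand-in for the learned utility model, which in DULO maps a set of data points to a performance measure of the model trained on that set.

```python
def greedy_select(pool, utility, budget):
    """Pick `budget` items in a single round by greedily maximizing marginal utility gain."""
    selected = []
    remaining = list(pool)
    for _ in range(min(budget, len(remaining))):
        base = utility(selected)
        # Item whose addition increases the utility of the selected set the most.
        best = max(remaining, key=lambda item: utility(selected + [item]) - base)
        selected.append(best)
        remaining.remove(best)
    return selected


def coverage_utility(points):
    """Hypothetical utility: number of distinct features covered by the chosen points."""
    return len({feature for point in points for feature in point})


# Each unlabeled point is described by a tuple of (toy) features.
unlabeled_pool = [("a", "b"), ("a",), ("c", "d", "e"), ("b", "c")]
chosen = greedy_select(unlabeled_pool, coverage_utility, budget=2)
print(chosen, coverage_utility(chosen))
```

The selected set is sent to the labeler once, which matches the one-round constraint; the utility value of that set also serves as the kind of estimate the abstract mentions.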
9

Evaluating use of Domain Adaptation for Data Augmentation Applications : Implementing a state-of-the-art Domain Adaptation module and testing it on object detection in the landscape domain. / Utvärdering av användningen av domänanpassning för en djupinlärningstillämpning : Implementering av en toppmodern domänanpassningsmodul och testning av den på objektdetektion i en landskapsdomän.

Jamal, Majd January 2022
Machine learning models are becoming popular in the industry since the technology has developed to solve numerous problems, such as classification [1], detection [2], and segmentation [3]. These algorithms require training with a large dataset which includes correct class labels to perform well on unseen data. One way to get access to large sets of annotated data is to use data from simulation engines. However, this data is often not as complex and rich as real data, and for images, for example, there can be a need to make them look more photorealistic. One approach to this is known as domain adaptation. In collaboration with SAAB Aeronautics, which funds this research, this study aims to explore available domain adaptation frameworks, implement a framework and use it to make a transformation from simulation to real life. A state-of-the-art framework, CyCADA, was re-implemented from scratch using Python and TensorFlow as a deep learning package. The CyCADA implementation was successfully verified by reproducing the digit adaptation result demonstrated in the original paper, making domain adaptations between MNIST, USPS, and SVHN. CyCADA was used to domain adapt landscape images from simulation to real life. Domain-adapted images were used to train an object detector to evaluate whether CyCADA allows a detector to perform more accurately on real-life data. Statistical measurements, unfortunately, showed that domain-adapted images became less photorealistic with CyCADA, 88.68 FID on domain-adapted images compared to 80.43 FID on simulations, and object detection performed better on real-life data without CyCADA, 0.131 mAP with a detector trained on domain-adapted images compared to 0.681 mAP with simulations. Since CyCADA produced effective domain adaptation results between digits, there remains a possibility to try multiple hyperparameter settings and neural network architectures to produce effective results with landscape images.
/ This study was carried out in collaboration with SAAB Aeronautics and concerns the development of a domain adaptation module that improves the performance of an object detection network. When an object detection network is trained on data from one domain, there is no guarantee that the same network performs well on another domain, for example drawings versus photographs of fruit. Researchers address the problem by collecting data from each domain and training several machine learning algorithms, a solution that demands time and energy. This is called the domain shift problem. Solving it is a hot topic in deep learning, and a range of algorithms fall into the category of domain adaptation. This study implements CyCADA as a way of evaluating a state-of-the-art domain adaptation algorithm. The re-implementation of CyCADA was successful, since several results from the original paper were reproduced. CyCADA produced effective domain shifts on images of digits. CyCADA was then applied to landscape images from a simulator to make the images more realistic. The domain-shifted landscape images became blurry with CyCADA. The FID score of the domain-shifted images, a metric that evaluates the photorealism of images, was worse than that of the purely simulated images. The object detection network performed better without CyCADA. Given that CyCADA performed well at transforming images of digits from one domain to another, there is hope that the framework can perform well on landscape images with further attempts at tuning the hyperparameters.
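For readers unfamiliar with the FID figures quoted in this abstract, the standard definition of the Fréchet Inception Distance is given below. The symbols are introduced here for illustration and do not appear in the abstract: \mu_r, \Sigma_r are the mean and covariance of Inception features of the real images, and \mu_g, \Sigma_g those of the generated (here, simulated or domain-adapted) images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

A lower FID means the two feature distributions are closer, which is why 88.68 on the domain-adapted images is the worse result compared with 80.43 on the plain simulations.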
10

Semi-Supervised Domain Adaptation for Semantic Segmentation with Consistency Regularization : A learning framework under scarce dense labels / Semi-Superviced Domain Adaption för semantisk segmentering med konsistensregularisering : Ett nytt tillvägagångsätt för lärande under brist på täta etiketter

Morales Brotons, Daniel January 2023
Learning from unlabeled data is a topic of critical significance in machine learning, as the large datasets required to train ever-growing models are costly and impractical to annotate. Semi-Supervised Learning (SSL) methods aim to learn from a few labels and a large unlabeled dataset. In another approach, Domain Adaptation (DA) leverages data from a similar source domain to train a model for a target domain. This thesis focuses on Semi-Supervised Domain Adaptation (SSDA) for the dense task of semantic segmentation, where labels are particularly costly to obtain. SSDA has not received much attention yet, even though it has great potential and represents a realistic scenario. The few existing SSDA methods for semantic segmentation reuse ideas from Unsupervised DA, despite the differences between the two settings. This thesis proposes a new semantic segmentation framework designed particularly for the SSDA setting. The approach followed was to forego domain alignment and focus instead on enhancing clusterability of target domain features, an idea from SSL. The method is based on consistency regularization, combined with pixel contrastive learning and self-training. The proposed framework is found to be effective not only in SSDA, but also in SSL. Ultimately, a unified solution for SSL and SSDA semantic segmentation is presented. Experiments were conducted on the target dataset of Cityscapes and source dataset of GTA5. The method proposed is competitive in both SSL and SSDA, and sets a new state-of-the-art for SSDA achieving a 65.6% mIoU (+4.4) on Cityscapes with 100 labeled samples. This thesis has an immediate impact on practical applications by proposing a new best-performing framework for the under-explored setting of SSDA. Furthermore, it also contributes towards the more ambitious goal of designing a unified solution for learning from unlabeled data. / Learning from unlabeled data is an area of great importance in machine learning.
This is because the large datasets that have become necessary to train constantly growing models are both costly and impractical to annotate. The goal of Semi-Supervised Learning (SSL) is to combine a few labels with a large amount of unlabeled data for learning. As another approach, Domain Adaptation (DA) uses data from a similar domain to train a model for a target domain. This thesis uses Semi-Supervised Domain Adaptation (SSDA) for semantic segmentation, where labels are particularly costly to obtain. SSDA has not yet attracted much attention, even though it has great potential and represents a realistic scenario. The few SSDA methods that exist for semantic segmentation reuse ideas from Unsupervised DA, despite the differences between the two settings. This thesis proposes a new framework for semantic segmentation designed specifically for the SSDA setting. It does so by forgoing domain alignment and focusing instead on improving the clusterability of the target domain's features, an idea taken from SSL. The method is based on consistency regularization, combined with pixel contrastive learning and self-training. The proposed framework proves effective not only for SSDA but also for SSL. Finally, a unified solution for semantic segmentation with SSL and SSDA is presented. Experiments were carried out on target data from Cityscapes and source data from GTA5. The proposed method is competitive for both SSL and SSDA and sets a new state of the art for SSDA by achieving 65.6% mIoU (+4.4) on Cityscapes with 100 labeled samples. This thesis has an immediate effect on practical applications by proposing a new best-performing framework for the under-explored setting of SSDA. Furthermore, it contributes to the more ambitious goal of designing a unified solution for machine learning from unlabeled data.
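The mIoU figure quoted in this abstract is the mean intersection-over-union across classes. The sketch below shows the computation on flat lists of per-pixel class labels; the function name and the toy labels are hypothetical, and benchmark evaluations typically accumulate the counts over a whole dataset rather than a single image.

```python
def mean_iou(pred, truth, num_classes):
    """Mean intersection-over-union over the classes that occur in pred or truth."""
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:  # skip classes absent from both prediction and ground truth
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class from range(num_classes) occurs in pred or truth")
    return sum(ious) / len(ious)


# Six pixels, three classes: per-class IoU is 1/3, 2/3 and 1/2.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, truth, num_classes=3))
```

Averaging per class rather than per pixel keeps rare classes from being drowned out by large ones such as road or sky, which matters for street-scene benchmarks like Cityscapes.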
