  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Time-domain Deep Neural Networks for Speech Separation

Sun, Tao 24 May 2022 (has links)
No description available.
12

Exploration of Semi-supervised Learning for Convolutional Neural Networks

Sheffler, Nicholas 01 March 2023 (has links) (PDF)
Training a neural network requires a large amount of labeled data that has to be created by either human annotation or by very specifically created methods. Currently, there is a vast abundance of unlabeled data that is neglected sitting on servers, hard drives, websites, etc. These untapped data sources serve as the inspiration for this paper. The goal of this thesis is to explore and test various methods of semi-supervised learning (SSL) for convolutional neural networks (CNN). These methods will be analyzed and evaluated based on their accuracy on a test set of data. Since this particular neural network will be used to offer paths for an autonomous robot, it is important for the networks to be lightweight in scale. This paper will then take this assortment of smaller neural networks and run them through a variety of semi-supervised training methods. The first method is to have a teacher model that is trained on properly labeled data create labels for unlabeled data and add this to the training set for the next student model. From this base method, a few variations were tried in the hopes of getting a significant improvement. The first variation tested by this thesis is the effects of having this teacher and student cycle run more than one iteration. After that, the effects of using the confidence values that the models produced were explored by both including only data with confidence above a certain value and in a different test, relabeling data below a confidence threshold. The last variation this thesis explored was to have two teacher models concurrently and have the combination of those two models decide on the proper label for the unlabeled data. Through exploration and testing, these methods are evaluated in the results section as to which one produces the best results for SSL.
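The confidence-filtering variation described in this abstract can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function name `select_pseudo_labels`, the threshold value of 0.9, and the toy probabilities are all assumptions made for the example.

```python
def select_pseudo_labels(probs, threshold=0.9):
    """Keep only teacher predictions whose top-class probability meets the threshold.

    probs: list of per-sample class-probability lists produced by a teacher model.
    Returns a list of (sample_index, pseudo_label) pairs for the confident samples.
    The threshold of 0.9 is an illustrative assumption, not a value from the thesis.
    """
    selected = []
    for i, row in enumerate(probs):
        # Index of the most probable class for this sample.
        label = max(range(len(row)), key=row.__getitem__)
        if row[label] >= threshold:
            selected.append((i, label))
    return selected


# Hypothetical teacher outputs for three unlabeled samples.
teacher_probs = [[0.95, 0.05], [0.60, 0.40], [0.08, 0.92]]
confident = select_pseudo_labels(teacher_probs, threshold=0.9)
print(confident)
```

The confident pairs would then be appended to the labeled training set before training the next student model; the second sample is left out because neither class reaches the threshold.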
13

Self-supervised Representation Learning in Computer Vision and Reinforcement Learning

Ermolov, Aleksandr 06 December 2022 (has links)
This work is devoted to self-supervised representation learning (SSL). We consider both contrastive and non-contrastive methods and present a new loss function for SSL based on feature whitening. Our solution is conceptually simple and competitive with other methods. Self-supervised representations are beneficial for most areas of deep learning, and reinforcement learning is of particular interest because SSL can compensate for the sparsity of the training signal. We present two methods from this area. The first tackles partial observability by providing the agent with a history, represented with temporal alignment, and improves performance in most Atari environments. The second addresses the exploration problem. The method employs a world model of the SSL latent space, and the prediction error of this model indicates novel states that need to be explored. It shows strong performance on exploration-hard benchmarks, especially on the notorious Montezuma's Revenge. Finally, we consider the metric learning problem, which has much in common with SSL approaches. We present a new method based on hyperbolic embeddings, vision transformers and contrastive loss. We demonstrate the advantage of hyperbolic space over the widely used Euclidean space for metric learning. The method outperforms the current state-of-the-art by a significant margin.
14

Self-supervised Representation Learning for Visual Domains Beyond Natural Scenes

Chhipa, Prakash Chandra January 2023 (has links)
This thesis investigates the possibility of efficiently adapting self-supervised representation learning on visual domains beyond natural scenes, e.g., medical imaging and non-RGB sensory images. The thesis contributes to i) formalizing the self-supervised representation learning paradigm in a unified conceptual framework and ii) proposing a hypothesis based on a supervision signal from data, called data-prior. Method adaptations following the hypothesis demonstrate significant progress in downstream task performance on microscopic histopathology and 3-dimensional particle management (3DPM) mining material non-RGB image domains. Supervised learning has proven to achieve higher performance than unsupervised learning on computer vision downstream tasks, e.g., image classification, object detection, etc. However, it imposes limitations due to human supervision. To reduce human supervision, end-to-end learning, i.e., transfer learning, remains proven for fine-tuning tasks but does not leverage unlabeled data. Representation learning in a self-supervised manner has successfully reduced the need for labelled data in the natural language processing and vision domains. Advances in learning effective visual representations without human supervision through a self-supervised learning approach are thought-provoking. This thesis performs a detailed conceptual analysis, method formalization, and literature study on the recent paradigm of self-supervised representation learning. The study’s primary goal is to identify the common methodological limitations across the various approaches for adaptation to visual domains beyond natural scenes. The study finds a common component in transformations that generate distorted views for invariant representation learning. 
A significant outcome of the study suggests that this component depends closely on human knowledge of the real world around the natural scene, which fits the visual domain of natural scenes well but remains sub-optimal for other visual domains that are conceptually different. A hypothesis is proposed to use the supervision signal from data (data-prior) to replace the human-knowledge-driven transformations in self-supervised pretraining to overcome the stated challenge. Two separate visual domains beyond the natural scene are considered to explore this hypothesis, namely breast cancer microscopic histopathology and 3-dimensional particle management (3DPM) mining material non-RGB images. The first research paper explores breast cancer microscopic histopathology images by actualizing the data-prior hypothesis in terms of multiple magnification factors as the supervision signal from data, which is available in the public microscopic histopathology image dataset BreakHis. It proposes a self-supervised representation learning method, Magnification Prior Contrastive Similarity, which adapts the contrastive learning approach by replacing the standard image view transformations (augmentations) with magnification factors. The contributions of the work are manifold. It achieves significant performance improvement in the downstream task of malignancy classification in label-efficiency and fully supervised settings. Pretrained models show efficient knowledge transfer on two additional public datasets, supported by qualitative analysis of representation learning. The second research paper investigates the 3DPM mining material non-RGB image domain, where the material’s pixel-mapped reflectance image and height (depth map) are captured. It actualizes the data-prior hypothesis by using depth maps of mining material on the conveyor belt. 
The proposed method, Depth Contrast, also adapts the contrastive learning method while replacing standard augmentations with depth maps for mining materials. It outperforms ImageNet transfer learning on material classification in fully supervised settings, in both fine-tuning and linear evaluation. It also shows consistent improvement in performance in label-efficiency settings. In summary, the data-prior hypothesis shows one promising direction for optimal adaptations of contrastive learning methods in self-supervision for visual domains beyond the natural scene. However, a detailed study of the data-prior hypothesis is required to explore other non-contrastive approaches of recent self-supervised representation learning, including knowledge distillation and information maximization.
15

Knowledge transfer and retention in deep neural networks

Fini, Enrico 17 April 2023 (has links)
This thesis addresses the crucial problem of knowledge transfer and retention in deep neural networks. The ability to transfer knowledge from previously learned tasks and retain it for future use is essential for machine learning models to continually adapt to new tasks and improve their overall performance. In principle, knowledge can be transferred between any type of task, but we believe it to be particularly challenging in the field of computer vision, where the size and diversity of visual data often result in high compute requirements and the need for large, complex models. Hence, we analyze transfer and retention learning between unsupervised and supervised visual tasks, which form the main focus of this thesis. We categorize our efforts into several knowledge transfer and retention paradigms, and we tackle them with several contributions for the scientific community. The thesis proposes settings and methods based on knowledge distillation and self-supervised learning techniques. In particular, we devise two novel continual learning settings and seven new methods for knowledge transfer and retention, setting new state-of-the-art in a wide range of tasks. In conclusion, this thesis provides a valuable contribution to the field of computer vision and machine learning and sets a foundation for future work in this area.
16

Self-learning for 3D segmentation of medical images from single and few-slice annotation

Lassarat, Côme January 2023 (has links)
Training deep-learning networks to segment a particular region of interest (ROI) in 3D medical acquisitions (also called volumes) usually requires annotating a lot of data upstream because of the predominantly fully supervised nature of the existing state-of-the-art models. To alleviate this annotation burden for medical experts and the associated cost, leveraging self-learning models, whose strength lies in their ability to be trained with unlabeled data, is a natural and straightforward approach. This work thus investigates a self-supervised model (called “self-learning” in this study) to segment the liver as a whole in medical acquisitions, which is very valuable for doctors as it provides insights for improved patient care. The self-learning pipeline utilizes only a single-slice (or a few-slice) ground-truth annotation to propagate the annotation iteratively in 3D and predict the complete segmentation mask for the entire volume. The segmentation accuracy of the tested models is evaluated using the Dice score, a metric commonly employed for this task. Conducting this study on Computed Tomography (CT) acquisitions to annotate the liver, the initial implementation of the self-learning framework achieved a segmentation accuracy of 0.86 Dice score. Improvements were explored to address the drifting of the mask propagation, which eventually proved to be of limited benefit. The proposed method was then compared to the fully supervised nnU-Net baseline, the state-of-the-art deep-learning model for medical image segmentation, using fully 3D ground-truth (Dice score ∼ 0.96). The final framework was assessed as an annotation tool. This was done by evaluating the segmentation accuracy of the state-of-the-art nnU-Net trained with annotation predicted by the self-learning pipeline for a given expert annotation budget. 
While the self-learning framework did not generate sufficiently accurate annotation from a single-slice annotation (average Dice score of ∼ 0.85), it demonstrated encouraging results when two ground-truth slice annotations per volume were provided for the same annotation budget (Dice score of ∼ 0.90).
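For readers unfamiliar with the Dice score reported in this abstract, a minimal sketch of the standard definition, 2|A∩B| / (|A| + |B|), on flattened binary masks is given below. The function name and the toy masks are illustrative assumptions; returning 1.0 for two empty masks is a common convention, not something stated in the thesis.

```python
def dice_score(pred, truth):
    """Dice coefficient between two flat binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Sum of the foreground sizes of each mask.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention (assumed here): two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: the prediction covers the true voxel plus one false positive.
predicted_mask = [1, 1, 0, 0]
expert_mask = [1, 0, 0, 0]
print(round(dice_score(predicted_mask, expert_mask), 3))
```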
17

Transformer-based Model for Molecular Property Prediction with Self-Supervised Transfer Learning

Lin, Lyu January 2020 (has links)
Molecular property prediction has a vast range of applications in the chemical industry. A powerful molecular property prediction model can promote experiments and production processes. The idea behind this work is to use transfer learning to predict molecular properties. The project is divided into two parts. The first part is to build and pre-train the model. The model, which is constructed purely from attention-based Transformer layers, is pre-trained through a Masked Edge Recovery task on large-scale unlabeled data. Then, the performance of this pre-trained model is tested on different molecular property prediction tasks to verify the effectiveness of transfer learning. The results show that after self-supervised pre-training, the model has excellent generalization capability. It can be fine-tuned in a short time and performs well in downstream tasks. The effectiveness of transfer learning is reflected in the experiments as well. The pre-trained model not only shortens the task-specific training time but also obtains better performance and avoids overfitting caused by too little training data for molecular property prediction.
18

Exploring adaptation of self-supervised representation learning to histopathology images for liver cancer detection

Jonsson, Markus January 2024 (has links)
This thesis explores adapting self-supervised representation learning to visual domains beyond natural scenes, focusing on medical imaging. The research addresses the central question: “How can self-supervised representation learning be specifically adapted for detecting liver cancer in histopathology images?” The study utilizes the PAIP 2019 dataset for liver cancer segmentation and employs a self-supervised approach based on the VICReg method. The evaluation results demonstrated that the ImageNet-pretrained model achieved superior performance on the test set, with a clipped Jaccard index of 0.7747 at a threshold of 0.65. The VICReg-pretrained model followed closely with a score of 0.7461, while the model initialized with random weights trailed behind at 0.5420. These findings indicate that while ImageNet-pretrained models outperformed VICReg-pretrained models, the latter still captured essential data characteristics, suggesting the potential of self-supervised learning in diverse visual domains. The research attempts to contribute to advancing self-supervised learning in non-natural scenes and provides insights into model pretraining strategies.
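The clipped Jaccard index quoted in this abstract can be illustrated with a short sketch. It assumes that "clipped" means a per-case Jaccard score below the threshold is set to zero, which is how the metric is commonly described for the PAIP 2019 challenge; the abstract itself does not define it, and the function names and toy masks are hypothetical.

```python
def jaccard(pred, truth):
    """Jaccard index (intersection over union) of two flat binary masks."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Convention (assumed here): two empty masks agree perfectly.
    return 1.0 if union == 0 else intersection / union


def clipped_jaccard(pred, truth, threshold=0.65):
    """Assumed definition: scores below the threshold are clipped to zero."""
    score = jaccard(pred, truth)
    return score if score >= threshold else 0.0


truth_mask = [1, 1, 1, 1]
print(clipped_jaccard([1, 1, 1, 0], truth_mask))  # overlap of 3/4 survives the clip
print(clipped_jaccard([1, 0, 0, 0], truth_mask))  # overlap of 1/4 is clipped to zero
```

Under this assumed definition, the reported figures would be averages of such per-case scores over the test set.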
19

Online Unsupervised Domain Adaptation

Panagiotakopoulos, Theodoros January 2022 (has links)
Deep learning models have seen great application in demanding tasks such as machine translation and autonomous driving. However, building such models has proved challenging, both from a computational perspective and due to the requirement of a plethora of annotated data. Moreover, when challenged with new situations or data distributions (the target domain), those models may perform inadequately. Examples include transitioning from one city to another, different weather situations, or changes in sunlight. Unsupervised Domain Adaptation (UDA) exploits unlabelled data, which is easy to obtain, to adapt models to new conditions or data distributions. Inspired by the fact that environmental changes happen gradually, we focus on Online UDA. Instead of directly adjusting a model to a demanding condition, we constantly perform minor adaptations to every slight change in the data, creating a soft transition from the current domain to the target one. To perform gradual adaptation, we utilized state-of-the-art semantic segmentation approaches on increasing rain intensities (25, 50, 75, 100, and 200 mm of rain). We demonstrate that deep learning models can adapt substantially better to hard domains when exploiting intermediate ones. Moreover, we introduce a model-switching mechanism that allows adjusting back to the source domain, after adaptation, without dropping performance.
20

Self-supervised Representation Learning via Image Out-painting for Medical Image Analysis

January 2020 (has links)
In recent years, Convolutional Neural Networks (CNNs) have been widely used not only in the computer vision community but also within the medical imaging community. Specifically, the use of CNNs pre-trained on large-scale datasets (e.g., ImageNet) via transfer learning for a variety of medical imaging applications has become the de facto standard within both communities. However, to fit the current paradigm, 3D imaging tasks have to be reformulated and solved in 2D, losing rich 3D contextual information. Moreover, models pre-trained on natural images never see any biomedical images and have no knowledge of the anatomical structures present in medical images. To overcome the above limitations, this thesis proposes an image out-painting self-supervised proxy task to develop pre-trained models directly from medical images without utilizing systematic annotations. The idea is to randomly mask an image and train the model to predict the missing region. It is demonstrated that by predicting missing anatomical structures when seeing only parts of the image, the model learns a generic representation that yields better performance on various medical imaging applications via transfer learning. Extensive experiments demonstrate that the proposed proxy task outperforms training from scratch in six out of seven medical imaging applications covering 2D and 3D classification and segmentation. Moreover, the image out-painting proxy task offers performance competitive with state-of-the-art models pre-trained on ImageNet and with other self-supervised baselines such as in-painting. Owing to its outstanding performance, out-painting is utilized as one of the self-supervised proxy tasks to provide generic 3D pre-trained models for medical image analysis. / Dissertation/Thesis / Masters Thesis Computer Science 2020
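The masking step behind an out-painting proxy task can be sketched as below. This is only an illustration under stated assumptions: the abstract says the image is randomly masked, and the sketch assumes that a randomly placed rectangular window is kept while everything outside it is zeroed; the function name, window size, and fill value are hypothetical and not taken from the thesis.

```python
import random


def outpaint_mask(image, keep_h, keep_w, rng):
    """Keep a randomly placed keep_h x keep_w window and zero out the rest.

    image: 2D list of pixel values. Returns (masked_image, (top, left)).
    A model would then be trained to reconstruct the original image from
    the masked one; the training itself is outside this sketch.
    """
    height, width = len(image), len(image[0])
    # Random top-left corner such that the window fits inside the image.
    top = rng.randint(0, height - keep_h)
    left = rng.randint(0, width - keep_w)
    masked = [
        [
            image[r][c]
            if top <= r < top + keep_h and left <= c < left + keep_w
            else 0
            for c in range(width)
        ]
        for r in range(height)
    ]
    return masked, (top, left)


# Toy 4x4 "image" of ones; keep a 2x2 window at a seeded random position.
toy_image = [[1] * 4 for _ in range(4)]
masked_image, corner = outpaint_mask(toy_image, 2, 2, random.Random(0))
for row in masked_image:
    print(row)
```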
