• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 47
  • 2
  • 1
  • Tagged with
  • 55
  • 55
  • 55
  • 30
  • 26
  • 22
  • 18
  • 17
  • 13
  • 12
  • 11
  • 11
  • 11
  • 10
  • 10
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Self-supervised Representation Learning in Computer Vision and Reinforcement Learning

Ermolov, Aleksandr 06 December 2022 (has links)
This work is devoted to self-supervised representation learning (SSL). We consider both contrastive and non-contrastive methods and present a new loss function for SSL based on feature whitening. Our solution is conceptually simple and competitive with other methods. Self-supervised representations are beneficial for most areas of deep learning, and reinforcement learning is of particular interest because SSL can compensate for the sparsity of the training signal. We present two methods from this area. The first tackles the partial observability providing the agent with a history, represented with temporal alignment, and improves performance in most Atari environments. The second addresses the exploration problem. The method employs a world model of the SSL latent space, and the prediction error of this model indicates novel states required to explore. It shows strong performance on exploration-hard benchmarks, especially on the notorious Montezuma's Revenge. Finally, we consider the metric learning problem, which has much in common with SSL approaches. We present a new method based on hyperbolic embeddings, vision transformers and contrastive loss. We demonstrate the advantage of hyperbolic space over the widely used Euclidean space for metric learning. The method outperforms the current state-of-the-art by a significant margin.
12

Self-supervised Representation Learning for Visual Domains Beyond Natural Scenes

Chhipa, Prakash Chandra January 2023 (has links)
This thesis investigates the possibility of efficiently adapting self-supervised representation learning on visual domains beyond natural scenes, e.g., medical imagining and non-RGB sensory images. The thesis contributes to i) formalizing the self-supervised representation learning paradigm in a unified conceptual framework and ii) proposing the hypothesis based on supervision signal from data, called data-prior. Method adaptations following the hypothesis demonstrate significant progress in downstream tasks performance on microscopic histopathology and 3-dimensional particle management (3DPM) mining material non-RGB image domains. Supervised learning has proven to be obtaining higher performance than unsupervised learning on computer vision downstream tasks, e.g., image classification, object detection, etc. However, it imposes limitations due to human supervision. To reduce human supervision, end-to-end learning, i.e., transfer learning, remains proven for fine-tuning tasks but does not leverage unlabeled data. Representation learning in a self-supervised manner has successfully reduced the need for labelled data in the natural language processing and vision domain. Advances in learning effective visual representations without human supervision through a self-supervised learning approach are thought-provoking. This thesis performs a detailed conceptual analysis, method formalization, and literature study on the recent paradigm of self-supervised representation learning. The study’s primary goal is to identify the common methodological limitations across the various approaches for adaptation to the visual domain beyond natural scenes. The study finds a common component in transformations that generate distorted views for invariant representation learning. A significant outcome of the study suggests this component is closely dependent on human knowledge of the real world around the natural scene, which fits well the visual domain of the natural scenes but remains sub-optimal for other visual domains that are conceptually different. A hypothesis is proposed to use the supervision signal from data (data-prior) to replace the human-knowledge-driven transformations in self-supervised pretraining to overcome the stated challenge. Two separate visual domains beyond the natural scene are considered to explore the mentioned hypothesis, which is breast cancer microscopic histopathology and 3-dimensional particle management (3DPM) mining material non-RGB image. The first research paper explores the breast cancer microscopic histopathology images by actualizing the data-prior hypothesis in terms of multiple magnification factors as supervision signal from data, which is available in the microscopic histopathology images public dataset BreakHis. It proposes a self-supervised representation learning method, Magnification Prior Contrastive Similarity, which adapts the contrastive learning approach by replacing the standard image view transformations (augmentations) by utilizing magnification factors. The contributions to the work are multi-folded. It achieves significant performance improvement in the downstream task of malignancy classification during label efficiency and fully supervised settings. Pretrained models show efficient knowledge transfer on two additional public datasets supported by qualitative analysis on representation learning. The second research paper investigates the 3DPM mining material non-RGB image domain where the material’s pixel-mapped reflectance image and height (depth map) are captured. It actualizes the data-prior hypothesis by using depth maps of mining material on the conveyor belt. The proposed method, Depth Contrast, also adapts the contrastive learning method while replacing standard augmentations with depth maps for mining materials. It outperforms material classification over ImageNet transfer learning performance in fully supervised learning settings in fine-tuning and linear evaluation. It also shows consistent improvement in performance during label efficiency. In summary, the data-prior hypothesis shows one promising direction for optimal adaptations of contrastive learning methods in self-supervision for the visual domain beyond the natural scene. Although, a detailed study on the data-prior hypothesis is required to explore other non-contrastive approaches of recent self-supervised representation learning, including knowledge distillation and information maximization.
13

Knowledge transfer and retention in deep neural networks

Fini, Enrico 17 April 2023 (has links)
This thesis addresses the crucial problem of knowledge transfer and retention in deep neural networks. The ability to transfer knowledge from previously learned tasks and retain it for future use is essential for machine learning models to continually adapt to new tasks and improve their overall performance. In principle, knowledge can be transferred between any type of task, but we believe it to be particularly challenging in the field of computer vision, where the size and diversity of visual data often result in high compute requirements and the need for large, complex models. Hence, we analyze transfer and retention learning between unsupervised and supervised visual tasks, which form the main focus of this thesis. We categorize our efforts into several knowledge transfer and retention paradigms, and we tackle them with several contributions for the scientific community. The thesis proposes settings and methods based on knowledge distillation and self-supervised learning techniques. In particular, we devise two novel continual learning settings and seven new methods for knowledge transfer and retention, setting new state-of-the-art in a wide range of tasks. In conclusion, this thesis provides a valuable contribution to the field of computer vision and machine learning and sets a foundation for future work in this area.
14

Self-learning for 3D segmentation of medical images from single and few-slice annotation

Lassarat, Côme January 2023 (has links)
Training deep-learning networks to segment a particular region of interest (ROI) in 3D medical acquisitions (also called volumes) usually requires annotating a lot of data upstream because of the predominant fully supervised nature of the existing stateof-the-art models. To alleviate this annotation burden for medical experts and the associated cost, leveraging self-learning models, whose strength lies in their ability to be trained with unlabeled data, is a natural and straightforward approach. This work thus investigates a self-supervised model (called “self-learning” in this study) to segment the liver as a whole in medical acquisitions, which is very valuable for doctors as it provides insights for improved patient care. The self-learning pipeline utilizes only a single-slice (or a few-slice) groundtruth annotation to propagate the annotation iteratively in 3D and predict the complete segmentation mask for the entire volume. The segmentation accuracy of the tested models is evaluated using the Dice score, a metric commonly employed for this task. Conducting this study on Computed Tomography (CT) acquisitions to annotate the liver, the initial implementation of the self-learning framework achieved a segmentation accuracy of 0.86 Dice score. Improvements were explored to address the drifting of the mask propagation, which eventually proved to be of limited benefits. The proposed method was then compared to the fully supervised nnU-Net baseline, the state-of-the-art deep-learning model for medical image segmentation, using fully 3D ground-truth (Dice score ∼ 0.96). The final framework was assessed as an annotation tool. This was done by evaluating the segmentation accuracy of the state-of-the-art nnU-Net trained with annotation predicted by the self-learning pipeline for a given expert annotation budget. While the self-learning framework did not generate accurate enough annotation from a single slice annotation yielding an average Dice score of ∼ 0.85, it demonstrated encouraging results when two ground-truth slice annotations per volume were provided for the same annotation budget (Dice score of ∼ 0.90). / Att träna djupinlärningsnätverk för att segmentera en viss region av intresse (ROI) i medicinska 3D-bilder (även kallade volymer) kräver vanligtvis att en stor mängd data kommenteras uppströms på grund av den dominerande helt övervakade karaktären hos de befintliga toppmoderna modellerna. För att minska annoteringsbördan för medicinska experter samt den associerade kostnaden är det naturligt och enkelt att utnyttja självlärande modeller, vars styrka ligger i förmågan att tränas med omärkta data. Detta arbete undersöker således en självövervakad modell (“kallas ”självlärande” i denna studie) för att segmentera levern som helhet i medicinska skanningar, vilket är mycket värdefullt för läkare eftersom det ger insikter för förbättrad patientvård. Den självlärande pipelinen använder endast en enda skiva (eller några få skivor) för att sprida annotationen iterativt i 3D och förutsäga den fullständiga segmenteringsmasken för hela volymen. Segmenteringsnoggrannheten hos de testade modellerna utvärderas med hjälp av Dice-poängen, ett mått som vanligtvis används för denna uppgift. Vid genomförandet av denna studie på CT-förvärv för att annotera levern uppnådde den initiala implementeringen av det självlärande ramverket en segmenteringsnoggrannhet på 0,86 Dice-poäng. Förbättringar undersöktes för att hantera driften av maskutbredningen, vilket så småningom visade sig ha begränsade fördelar. Den föreslagna metoden jämfördes sedan med den helt övervakade nnU-Net-baslinjen, den toppmoderna djupinlärningsmodellen för medicinsk bildsegmentering, med hjälp av helt 3D-baserad sanning (Dice-poäng ∼ 0, 96). Det slutliga ramverket bedömdes som ett annoteringsverktyg. Detta gjordes genom att utvärdera segmenteringsnoggrannheten hos det toppmoderna nnU-Net som tränats med annotering som förutspåtts av den självlärande pipelinen för en given budget för expertannotering. Det självlärande ramverket genererade inte tillräckligt noggranna annoteringar från baserat på endast en snittannotering och resulterade i en genomsnittlig Dice-poäng på ∼ 0, 85, men uppvisade uppmuntrande resultat när två verkliga snittannoteringar per volym tillhandahölls för samma anteckningsbudget (Dice-poäng på ∼ 0, 90).
15

Exploring adaptation of self-supervised representation learning to histopathology images for liver cancer detection

Jonsson, Markus January 2024 (has links)
This thesis explores adapting self-supervised representation learning to visual domains beyond natural scenes, focusing on medical imaging. The research addresses the central question: “How can self-supervised representation learning be specifically adapted for detecting liver cancer in histopathology images?” The study utilizes the PAIP 2019 dataset for liver cancer segmentation and employs a self-supervised approach based on the VICReg method. The evaluation results demonstrated that the ImageNet-pretrained model achieved superior performance on the test set, with a clipped Jaccard index of 0.7747 at a threshold of 0.65. The VICReg-pretrained model followed closely with a score of 0.7461, while the model initialized with random weights trailed behind at 0.5420. These findings indicate that while ImageNet-pretrained models outperformed VICReg-pretrained models, the latter still captured essential data characteristics, suggesting the potential of self-supervised learning in diverse visual domains. The research attempts to contribute to advancing self-supervised learning in non-natural scenes and provides insights into model pretraining strategies.
16

Self-Supervised Representation Learning for Early Breast Cancer Detection in Mammographic Imaging

Kristofer, Ågren January 2024 (has links)
The proposed master's thesis investigates the adaptability and efficacy of self-supervised representation learning (SSL) in medical image analysis, focusing on Mammographic Imaging to develop robust representation learning models. This research will build upon existing studies in Mammographic Imaging that have utilized contrastive learning and knowledge distillation-based self-supervised methods, focusing on SimCLR (Chen et al 2020) and SimSiam (Chen et al 2020) and evaluate approaches to increase the classification performance in a transfer learning setting. The thesis will critically evaluate and integrate recent advancements in these SSL paradigms (Chhipa 2023, chapter 2), and incorporating additional SSL approaches. The core objective is to enhance robust generalization and label efficiency in medical imaging analysis, contributing to the broader field of AI-driven diagnostic methodologies. The proposed master's thesis will not only extend the current understanding of SSL in medical imaging but also aims to provide actionable insights that could be instrumental in enhancing breast cancer detection methodologies, thereby contributing significantly to the field of medical imaging and cancer research.
17

Online Unsupervised Domain Adaptation / Online-övervakad domänanpassning

Panagiotakopoulos, Theodoros January 2022 (has links)
Deep Learning models have seen great application in demanding tasks such as machine translation and autonomous driving. However, building such models has proved challenging, both from a computational perspective and due to the requirement of a plethora of annotated data. Moreover, when challenged on new situations or data distributions (target domain), those models may perform inadequately. Such examples are transitioning from one city to another, different weather situations, or changes in sunlight. Unsupervised Domain adaptation (UDA) exploits unlabelled data (easy access) to adapt models to new conditions or data distributions. Inspired by the fact that environmental changes happen gradually, we focus on Online UDA. Instead of directly adjusting a model to a demanding condition, we constantly perform minor adaptions to every slight change in the data, creating a soft transition from the current domain to the target one. To perform gradual adaptation, we utilized state-of-the-art semantic segmentation approaches on increasing rain intensities (25, 50, 75, 100, and 200mm of rain). We demonstrate that deep learning models can adapt substantially better to hard domains when exploiting intermediate ones. Moreover, we introduce a model switching mechanism that allows adjusting back to the source domain, after adaptation, without dropping performance. / Deep Learning-modeller har sett stor tillämpning i krävande uppgifter som maskinöversättning och autonom körning. Att bygga sådana modeller har dock visat sig vara utmanande, både ur ett beräkningsperspektiv och på grund av kravet på en uppsjö av kommenterade data. Dessutom, när de utmanas i nya situationer eller datadistributioner (måldomän), kan dessa modeller prestera otillräckligt. Sådana exempel är övergång från en stad till en annan, olika vädersituationer eller förändringar i solljus. Unsupervised Domain adaptation (UDA) utnyttjar omärkt data (enkel åtkomst) för att anpassa modeller till nya förhållanden eller datadistributioner. Inspirerade av att miljöförändringar sker gradvis, fokuserar vi på Online UDA. Istället för att direkt anpassa en modell till ett krävande tillstånd, gör vi ständigt mindre anpassningar till varje liten förändring i data, vilket skapar en mjuk övergång från den aktuella domänen till måldomänen. För att utföra gradvis anpassning använde vi toppmoderna semantiska segmenteringsmetoder för att öka regnintensiteten (25, 50, 75, 100 och 200 mm regn). Vi visar att modeller för djupinlärning kan anpassa sig betydligt bättre till hårda domäner när man utnyttjar mellanliggande. Dessutom introducerar vi en modellväxlingsmekanism som tillåter justering tillbaka till källdomänen, efter anpassning, utan att tappa prestanda.
18

Self-supervised Representation Learning via Image Out-painting for Medical Image Analysis

January 2020 (has links)
abstract: In recent years, Convolutional Neural Networks (CNNs) have been widely used in not only the computer vision community but also within the medical imaging community. Specifically, the use of pre-trained CNNs on large-scale datasets (e.g., ImageNet) via transfer learning for a variety of medical imaging applications, has become the de facto standard within both communities. However, to fit the current paradigm, 3D imaging tasks have to be reformulated and solved in 2D, losing rich 3D contextual information. Moreover, pre-trained models on natural images never see any biomedical images and do not have knowledge about anatomical structures present in medical images. To overcome the above limitations, this thesis proposes an image out-painting self-supervised proxy task to develop pre-trained models directly from medical images without utilizing systematic annotations. The idea is to randomly mask an image and train the model to predict the missing region. It is demonstrated that by predicting missing anatomical structures when seeing only parts of the image, the model will learn generic representation yielding better performance on various medical imaging applications via transfer learning. The extensive experiments demonstrate that the proposed proxy task outperforms training from scratch in six out of seven medical imaging applications covering 2D and 3D classification and segmentation. Moreover, image out-painting proxy task offers competitive performance to state-of-the-art models pre-trained on ImageNet and other self-supervised baselines such as in-painting. Owing to its outstanding performance, out-painting is utilized as one of the self-supervised proxy tasks to provide generic 3D pre-trained models for medical image analysis. / Dissertation/Thesis / Masters Thesis Computer Science 2020
19

New Computational Methods for Literature-Based Discovery

Ding, Juncheng 05 1900 (has links)
In this work, we leverage the recent developments in computer science to address several of the challenges in current literature-based discovery (LBD) solutions. First, LBD solutions cannot use semantics or are too computational complex. To solve the problems we propose a generative model OverlapLDA based on topic modeling, which has been shown both effective and efficient in extracting semantics from a corpus. We also introduce an inference method of OverlapLDA. We conduct extensive experiments to show the effectiveness and efficiency of OverlapLDA in LBD. Second, we expand LBD to a more complex and realistic setting. The settings are that there can be more than one concept connecting the input concepts, and the connectivity pattern between concepts can also be more complex than a chain. Current LBD solutions can hardly complete the LBD task in the new setting. We simplify the hypotheses as concept sets and propose LBDSetNet based on graph neural networks to solve this problem. We also introduce different training schemes based on self-supervised learning to train LBDSetNet without relying on comprehensive labeled hypotheses that are extremely costly to get. Our comprehensive experiments show that LBDSetNet outperforms strong baselines on simple hypotheses and addresses complex hypotheses.
20

Applicability of Detection Transformers in Resource-Constrained Environments : Investigating Detection Transformer Performance Under Computational Limitations and Scarcity of Annotated Data

Senel, Altan January 2023 (has links)
Object detection is a fundamental task in computer vision, with significant applications in various domains. However, the reliance on large-scale annotated data and computational resource demands poses challenges in practical implementation. This thesis aims to address these complexities by exploring self-supervised training approaches for the detection transformer(DETR) family of object detectors. The project investigates the necessity of training the backbone under a semi-supervised setting and explores the benefits of initializing scene graph generation architectures with pretrained DETReg and DETR models for faster training convergence and reduced computational resource requirements. The significance of this research lies in the potential to mitigate the dependence on annotated data and make deep learning techniques more accessible to researchers and practitioners. By overcoming the limitations of data and computational resources, this thesis contributes to the accessibility of DETR and encourages a more sustainable and inclusive approach to deep learning research. / Objektigenkänning är en grundläggande uppgift inom datorseende, med betydande tillämpningar inom olika domäner. Dock skapar beroendet av storskaliga annoterade data och krav på datorkraft utmaningar i praktisk implementering. Denna avhandling syftar till att ta itu med dessa komplexiteter genom att utforska självövervakade utbildningsmetoder för detektions transformer (DETR) familjen av objektdetektorer. Projektet undersöker nödvändigheten av att träna ryggraden under en semi-övervakad inställning och utforskar fördelarna med att initiera scenegrafgenereringsarkitekturer med förtränade DETReg-modeller för snabbare konvergens av träning och minskade krav på datorkraft. Betydelsen av denna forskning ligger i potentialen att mildra beroendet av annoterade data och göra djupinlärningstekniker mer tillgängliga för forskare och utövare. Genom att övervinna begränsningarna av data och datorkraft, bidrar denna avhandling till tillgängligheten av DETR och uppmuntrar till en mer hållbar och inkluderande inställning till djupinlärning forskning.

Page generated in 0.1692 seconds