Global ETD Search

111	Indoor scene verification : Evaluation of indoor scene representations for the purpose of location verification / Verifiering av inomhusbilder : Bedömning av en inomhusbilder framställda i syfte att genomföra platsverifiering Finfando, Filip January 2020 (has links) When a person looks at two pictures taken in an indoor location, it is fairly easy to tell whether they were taken in exactly the same place, even if the person has never visited the location. This is possible because people attend to multiple factors, such as spatial properties (window shape, room shape), common patterns (floor, walls), or the presence of specific objects (furniture, lighting). Changes in camera pose, illumination, or furniture location, or digital alteration of the image (e.g. watermarks), have little influence on this ability. Traditional approaches to measuring the perceptual similarity of images have struggled to reproduce this skill. This thesis defines the Indoor Scene Verification (ISV) problem as determining whether or not two indoor scene images were taken in the same indoor space. It explores the capabilities of state-of-the-art perceptual similarity metrics by introducing two new datasets designed specifically for this problem. Perceptual hashing, ORB, FaceNet and NetVLAD are evaluated as baseline candidates. NetVLAD gives the best results on both datasets and is therefore chosen as the baseline for the experiments that aim to improve on it. Three experiments are carried out, testing the impact of using a different training dataset, changing the deep neural network architecture, and introducing a new loss function. Quantitative analysis of the AUC score shows that switching from VGG16 to MobileNetV2 improves on the baseline. computer vision perceptual similarity visual place recognition indoor scene localization deep neural networks Computer and Information Sciences
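The AUC evaluation mentioned in entry 111 can be illustrated with a minimal sketch. It assumes each image pair has already been reduced to a single embedding distance (smaller meaning more likely the same place); the function name `pairwise_auc` and the sample values are hypothetical and are not taken from the thesis.

```python
import numpy as np


def pairwise_auc(distances, same_place):
    """ROC AUC for a pair-verification task, scoring each pair by negative distance."""
    scores = -np.asarray(distances, dtype=float)
    labels = np.asarray(same_place, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one same-place and one different-place pair")
    # AUC equals the probability that a random same-place pair scores higher
    # than a random different-place pair; ties count as one half.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)


# Hypothetical distances between image embeddings and ground-truth pair labels.
distances = [0.2, 0.4, 0.9, 0.7, 0.3, 0.8]
same_place = [1, 1, 0, 0, 1, 1]
auc = pairwise_auc(distances, same_place)
print(f"AUC = {auc:.3f}")
```

Because AUC depends only on the ranking of pairs, it allows descriptors with different distance scales to be compared without choosing a decision threshold.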
112	Supervised Speech Separation And Processing Han, Kun January 2014 (has links) No description available. Computer Science Supervised learning Speech separation Speech processing Machine learning Deep Learning Pitch estimation Speech Dereverberation Deep neural networks Support vector machines
113	Supervised Speech Separation Using Deep Neural Networks Wang, Yuxuan 21 May 2015 (has links) No description available. Computer Science Engineering Speech separation time-frequency masking computational auditory scene analysis acoustic features deep neural networks training targets generalization speech intelligibility speech quality
114	On Generalization of Supervised Speech Separation Chen, Jitong 30 August 2017 (has links) No description available. Computer Science Engineering Speech separation speech intelligibility computational auditory scene analysis mask estimation supervised learning deep neural networks acoustic features noise generalization SNR generalization speaker generalization
115	Efficient Continual Learning in Deep Neural Networks Gobinda Saha (18512919) 07 May 2024 (has links) Humans exhibit a remarkable ability to adapt continually and learn new tasks throughout their lifetime while retaining the knowledge gained from past experiences. In stark contrast, artificial neural networks (ANNs) under such a continual learning (CL) paradigm forget the information learned in past tasks upon learning new ones. This phenomenon is known as ‘Catastrophic Forgetting’ or ‘Catastrophic Interference’. The objective of this thesis is to enable efficient continual learning in deep neural networks while mitigating this forgetting phenomenon. Towards this, first, a continual learning algorithm (SPACE) is proposed in which a subset of network filters or neurons is allocated to each task using Principal Component Analysis (PCA). Such task-specific network isolation not only ensures zero forgetting but also creates structured sparsity in the network, which enables energy-efficient inference. Second, a faster and more efficient training algorithm for CL is proposed by introducing Gradient Projection Memory (GPM). Here, the most important gradient spaces (GPM) for each task are computed using Singular Value Decomposition (SVD), and new tasks are learned in the direction orthogonal to GPM to minimize forgetting. Third, to improve new learning while minimizing forgetting, a Scaled Gradient Projection (SGP) method is proposed that, in addition to orthogonal gradient updates, allows scaled updates along the important gradient spaces of past tasks. Next, for continual learning on an online stream of tasks, a memory-efficient experience replay method is proposed. This method uses saliency maps that explain the network’s decisions to select the memories that are replayed during new tasks to prevent forgetting. 
Finally, a meta-learning based continual learner - Amphibian - is proposed that achieves fast online continual learning without any experience replay. All the algorithms are evaluated on short and long sequences of tasks from standard image-classification datasets. Overall, the methods proposed in this thesis address critical limitations of DNNs for continual learning and advance the state of the art in this domain. Computer vision Continual learning deep neural networks (DNNs) Machine Learning Optimization
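The orthogonal-update idea described for Gradient Projection Memory in entry 115 can be sketched as follows. This is an illustrative reading of the abstract, not the thesis implementation: it assumes the important subspace is estimated by an SVD of a matrix of values sampled on an earlier task, and the names `important_subspace`, `project_out` and the `energy` threshold are hypothetical.

```python
import numpy as np


def important_subspace(samples, energy=0.95):
    """Orthonormal basis for the leading directions of `samples` (features x samples).

    Keeps the smallest number of left singular vectors whose squared
    singular values account for at least `energy` of the total.
    """
    u, s, _ = np.linalg.svd(samples, full_matrices=False)
    ratio = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(ratio, energy)) + 1, s.size)
    return u[:, :k]


def project_out(grad, basis):
    """Remove from `grad` its component inside span(basis)."""
    return grad - basis @ (basis.T @ grad)


# Toy data: the earlier task only uses the first two feature directions.
old_task_samples = np.array([[3.0, 0.0, 0.0],
                             [0.0, 2.0, 0.0],
                             [0.0, 0.0, 0.0],
                             [0.0, 0.0, 0.0]])
basis = important_subspace(old_task_samples)

new_task_grad = np.array([1.0, 2.0, 3.0, 4.0])
safe_grad = project_out(new_task_grad, basis)
print(safe_grad)
```

An update along `safe_grad` has no component in the protected subspace, which is the sense in which new tasks are learned orthogonally to the stored directions.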
116	Multimodal Deep Learning for Multi-Label Classification and Ranking Problems Dubey, Abhishek January 2015 (has links) In recent years, deep neural network models have been shown to outperform many state-of-the-art algorithms. The reason is that unsupervised pretraining with multi-layered deep neural networks has been shown to learn better features, which in turn improves many supervised tasks. These models not only automate the feature extraction process but also provide robust features for various machine learning tasks. However, unsupervised pretraining and feature extraction using multi-layered networks are restricted to the input features and do not extend to the output. The performance of many supervised learning algorithms (or models) depends on how well the output dependencies are handled by these algorithms [Dembczyński et al., 2012]. Adapting standard neural networks to handle these output dependencies for any specific type of problem has been an active area of research [Zhang and Zhou, 2006, Ribeiro et al., 2012]. On the other hand, inference on multimodal data is considered a difficult problem in machine learning, and recently ‘deep multimodal neural networks’ have shown significant results [Ngiam et al., 2011, Srivastava and Salakhutdinov, 2012]. These models have been shown to perform very well on several problems, such as classification with complete or missing modality data and generating the missing modality. In this work, we consider three nontrivial supervised learning tasks: (i) multi-class classification (MCC), (ii) multi-label classification (MLC) and (iii) label ranking (LR), listed in order of increasing complexity of the output. While multi-class classification deals with predicting one class for every instance, multi-label classification deals with predicting more than one class for every instance, and label ranking deals with assigning a rank to each label for every instance. 
All the work in this field centers on formulating new error functions that can force the network to identify the output dependencies. The aim of our work is to adapt the neural network to handle the feature extraction (dependencies) for the output implicitly in the network structure, removing the need for hand-crafted error functions. We show that multimodal deep architectures can be adapted to these types of problems (or data) by considering labels as one of the modalities. This also brings unsupervised pretraining to the output along with the input. We show that these models can outperform not only standard deep neural networks but also standard adaptations of neural networks for individual domains, under various metrics and over the several data sets we consider. We observe that the advantage of our models over other models grows as the complexity of the output/problem increases. Neural Networks Deep Neural Network Models Neural Network Architecture Multimodal Deep Neural Networks Multimodal Deep Learning Multi-Label Classification (MLC) Multi-class Classification (MCC) Label Ranking Multimodal Neural Networks Supervised Learning Multilayer Neural Network Perceptron Model Computer Science
117	Sequential modeling, generative recurrent neural networks, and their applications to audio Mehri, Soroush 12 1900 (has links) No description available. Artificial intelligence Machine learning Deep neural networks Representation learning Sequential modeling Generative models Audio generation
118	Deep Learning for Whole Slide Image Cytology : A Human-in-the-Loop Approach Rydell, Christopher January 2021 (has links) With cancer being one of the leading causes of death globally, and with oral cancers being among the most common types of cancer, it is of interest to conduct large-scale oral cancer screening among the general population. Deep Learning can be used to make this possible despite the medical expertise required for early detection of oral cancers. A bottleneck of Deep Learning is the large amount of data required to train a good model. This project investigates two topics: certainty calibration, which aims to make a machine learning model produce more reliable predictions, and Active Learning, which aims to reduce the amount of data that needs to be labeled for Deep Learning to be effective. In the investigation of certainty calibration, five different methods are compared, and the best method is found to be Dirichlet calibration. The Active Learning investigation studies a single method, Cost-Effective Active Learning, which is found to produce poor results in the given experimental setting. These two topics inspire the further development of the cytological annotation tool CytoBrowser, which is designed with oral cancer data labeling in mind. The proposed evolution integrates into the existing tool a Deep Learning-assisted annotation workflow that supports multiple users. machine learning deep learning cytology whole slide imaging deep neural networks convolutional neural networks certainty calibration calibration metrics active learning deep active learning human-in-the-loop Other Computer and Information Science
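What "certainty calibration" in entry 118 measures can be illustrated with a short sketch of the expected calibration error (ECE), a widely used calibration metric. The abstract does not say which metrics the thesis uses, so the choice of ECE, the function name and the sample values are all assumptions made for illustration.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy over equal-width bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges give bin indices 0 .. n_bins - 1; confidence 1.0 lands in the last bin.
    bin_ids = np.digitize(confidences, edges[1:-1])
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # mask.mean() is the fraction of samples in this bin
    return ece


# Hypothetical overconfident model: 90% stated confidence, 50% actual accuracy.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 0, 0])
print(f"ECE = {overconfident:.2f}")
```

A well-calibrated model has an ECE close to zero, meaning its stated confidence matches how often it is right.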
119	Vizuální systém pro detekci obsazenosti parkoviště pomocí hlubokých neuronových sítí / Visual Car-Detection on the Parking Lots Using Deep Neural Networks Stránský, Václav January 2017 (has links) The concept of smart cities is inherently connected with efficient parking solutions based on knowledge of individual parking space occupancy. The subject of this paper is the design and implementation of a robust system for analyzing parking space occupancy from a multi-camera system in which the cameras' fields of view may overlap. The system is designed and implemented in the Robot Operating System (ROS), and its core consists of two separate classifiers. The more accurate but slower option is detection by a deep neural network. A fast response is provided by a less accurate motion classifier that uses a background model. The system is capable of working in real time on a graphics card as well as on a processor. The success rate of the system on a test data set from real operation exceeds 95 %.
120	Zlepšování systému pro automatické hraní hry Starcraft II v prostředí PySC2 / Improving Bots Playing Starcraft II Game in PySC2 Environment Krušina, Jan January 2018 (has links) The aim of this thesis is to create an automated system for playing the real-time strategy game Starcraft II. Supervised learning from replays and reinforcement learning techniques are used to improve the bot's behavior. The proposed system should be capable of playing the whole game using the PySC2 framework for machine learning. The performance of the bot is evaluated against the built-in scripted AI in the game.
