21

Adapting deep neural networks as models of human visual perception

McClure, Patrick January 2018 (has links)
Deep neural networks (DNNs) have recently been used to solve complex perceptual and decision tasks. In particular, convolutional neural networks (CNNs) have been extremely successful for visual perception. In addition to performing well on the trained object recognition task, these CNNs also model brain data throughout the visual hierarchy better than previous models. However, these DNNs are still far from completely explaining visual perception in the human brain. In this thesis, we investigated two methods with the goal of improving DNNs’ capabilities to model human visual perception: (1) deep representational distance learning (RDL), a method for driving representational spaces in deep nets into alignment with other (e.g. brain) representational spaces and (2) variational DNNs that use sampling to perform approximate Bayesian inference. In the first investigation, RDL successfully transferred information from a teacher model to a student DNN. This was achieved by driving the student DNN’s representational distance matrix (RDM), which characterises the representational geometry, into alignment with that of the teacher. This led to a significant increase in test accuracy on machine learning benchmarks. In the future, we plan to use this method to simultaneously train DNNs to perform complex tasks and to predict neural data. In the second investigation, we showed that sampling during learning and inference using simple Bernoulli- and Gaussian-based noise improved a CNN’s representation of its own uncertainty for object recognition. We also found that sampling during learning and inference with Gaussian noise improved how well CNNs predict human behavioural data for image classification. While these methods alone do not fully explain human vision, they allow for training CNNs that better model several features of human visual perception.
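The RDM-alignment idea described above can be sketched as comparing the student's and teacher's representational distance matrices. The following is a minimal illustration with hypothetical function names and a plain squared-error mismatch; it is not code from the thesis:

```python
import numpy as np

def rdm(features):
    """Representational distance matrix: pairwise Euclidean
    distances between row-vector representations of N stimuli."""
    diffs = features[:, None, :] - features[None, :, :]
    return np.sqrt((diffs ** 2).sum(-1))

def rdl_loss(student_feats, teacher_feats):
    """Squared-error mismatch between the two RDMs, averaged over
    the off-diagonal entries (the diagonal is zero by construction)."""
    d_s, d_t = rdm(student_feats), rdm(teacher_feats)
    n = d_s.shape[0]
    mask = ~np.eye(n, dtype=bool)
    return ((d_s - d_t)[mask] ** 2).mean()
```

In practice a term like this would be added to the task loss, so the student both performs the task and matches the teacher's representational geometry.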
22

Training Data Generation Framework For Machine-Learning Based Classifiers

McClintick, Kyle W 14 December 2018 (has links)
In this thesis, we propose a new framework for the generation of training data for machine learning techniques used for classification in communications applications. Machine learning-based signal classifiers do not generalize well when training data does not describe the underlying probability distribution of real signals. The simplest way to accomplish statistical similarity between training and testing data is to synthesize training data passed through a permutation of plausible forms of noise. To accomplish this, a framework is proposed that implements arbitrary channel conditions and baseband signals. A dataset generated using the framework is considered, and is shown to be appropriately sized by having 11% lower entropy than state-of-the-art datasets. Furthermore, unsupervised domain adaptation can allow for powerful generalized training via deep feature transforms on unlabeled evaluation-time signals. A novel Deep Reconstruction-Classification Network (DRCN) application is introduced, which attempts to maintain near-peak signal classification accuracy despite dataset bias, or perturbations on testing data unforeseen in training. Together, feature transforms and diverse training data generated from the proposed framework, covering a range of plausible noise, can train a deep neural net to classify signals well in many real-world scenarios despite unforeseen perturbations.
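The core of such a framework is passing clean baseband signals through randomly drawn channel impairments. A minimal sketch follows; the impairments (phase offset, AWGN) and the SNR range are illustrative assumptions, not the thesis's actual parameter set:

```python
import numpy as np

def augment(signal, snr_db_range=(0.0, 20.0), rng=None):
    """Pass a complex baseband signal through one random draw of
    plausible channel conditions: a uniform phase offset plus AWGN
    at an SNR sampled from snr_db_range."""
    if rng is None:
        rng = np.random.default_rng()
    phase = np.exp(1j * rng.uniform(0.0, 2 * np.pi))
    snr_db = rng.uniform(*snr_db_range)
    sig_power = np.mean(np.abs(signal) ** 2)
    noise_power = sig_power / (10 ** (snr_db / 10))
    noise = np.sqrt(noise_power / 2) * (
        rng.standard_normal(signal.shape)
        + 1j * rng.standard_normal(signal.shape))
    return signal * phase + noise
```

Applying many such draws to each synthesized waveform yields a training set spanning a range of plausible channel conditions.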
23

Deep neural networks for music tagging

Choi, Keunwoo January 2018 (has links)
In this thesis, I present my hypothesis, experiment results, and discussion that are related to various aspects of deep neural networks for music tagging. Music tagging is a task to automatically predict the suitable semantic label when music is provided. Generally speaking, the input of music tagging systems can be any entity that constitutes music, e.g., audio content, lyrics, or metadata, but only the audio content is considered in this thesis. My hypothesis is that we can find effective deep learning practices for the task of music tagging that improve the classification performance. As a computational model to realise a music tagging system, I use deep neural networks. Combined with the research problem, the scope of this thesis is the understanding, interpretation, optimisation, and application of deep neural networks in the context of music tagging systems. The ultimate goal of this thesis is to provide insight that can help to improve deep learning-based music tagging systems. There are many smaller goals in this regard. Since using deep neural networks is a data-driven approach, it is crucial to understand the dataset. Selecting and designing a better architecture is the next topic to discuss. Since the tagging is done with audio input, preprocessing the audio signal becomes one of the important research topics. After building (or training) a music tagging system, finding a suitable way to re-use it for other music information retrieval tasks is a compelling topic, in addition to interpreting the trained system. The evidence presented in the thesis supports that deep neural networks are powerful and credible methods for building a music tagging system.
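Since only audio content is used, preprocessing typically starts from a time-frequency representation of the signal. A minimal log-magnitude spectrogram sketch follows (a mel filterbank and the tagging CNN itself would come after; the parameters here are illustrative, not the thesis's settings):

```python
import numpy as np

def log_spectrogram(y, n_fft=512, hop=256):
    """Log-magnitude STFT of a mono signal, a common CNN input
    representation for audio tagging. Returns (freq bins, frames)."""
    window = np.hanning(n_fft)
    frames = [y[i:i + n_fft] * window
              for i in range(0, len(y) - n_fft + 1, hop)]
    mag = np.abs(np.fft.rfft(np.array(frames), axis=1))
    return np.log1p(mag).T
```

The log compression mirrors the roughly logarithmic loudness perception of human hearing, one of the preprocessing choices a tagging system has to make.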
24

Chromosome 3D Structure Modeling and New Approaches For General Statistical Inference

Rongrong Zhang (5930474) 03 January 2019 (has links)
This thesis consists of two separate topics: the use of piecewise helical models for the inference of 3D spatial organizations of chromosomes, and new approaches for general statistical inference. The recently developed Hi-C technology enables a genome-wide view of chromosome spatial organizations, and has shed deep insights into genome structure and genome function. However, multiple sources of uncertainty make downstream data analysis and interpretation challenging. Specifically, statistical models for inferring three-dimensional (3D) chromosomal structure from Hi-C data are far from maturity. Most existing methods are highly over-parameterized, lack clear interpretations, and are sensitive to outliers. We propose a parsimonious, easy-to-interpret, and robust piecewise helical curve model for the inference of 3D chromosomal structures from Hi-C data, for both individual topologically associated domains and whole chromosomes. When applied to a real Hi-C dataset, the piecewise helical model not only achieves much better model fitting than existing models, but also reveals that geometric properties of chromatin spatial organization are closely related to genome function.

For potential applications in big data analytics and machine learning, we propose to use deep neural networks to automate the Bayesian model selection and parameter estimation procedures. Two such frameworks are developed under different scenarios. First, we construct a deep neural network-based Bayes estimator for the parameters of a given model. The neural Bayes estimator mitigates the computational challenges faced by traditional approaches for computing Bayes estimators. When applied to generalized linear mixed models, the neural Bayes estimator outperforms existing methods implemented in R packages and SAS procedures. Second, we construct a deep convolutional neural network-based framework to perform simultaneous Bayesian model selection and parameter estimation. We refer to the neural networks for model selection and parameter estimation in the framework as the neural model selector and parameter estimator, respectively; both can be properly trained using labeled data systematically generated from candidate models. A simulation study shows that both the neural selector and estimator demonstrate excellent performance.

The theory of Conditional Inferential Models (CIMs) has been introduced to combine information for efficient inference in the Inferential Models framework for prior-free and yet valid probabilistic inference. While the general theory is subject to further development, the so-called regular CIMs are simple. We establish and prove a necessary and sufficient condition for the existence and identification of regular CIMs. More specifically, it is shown that for inference based on a sample from continuous distributions with unknown parameters, the corresponding CIM is regular if and only if the unknown parameters are generalized location and scale parameters, indexing the transformations of an affine group.
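A single helical piece of such a curve model can be parameterized by a radius, a pitch, and a number of turns, as in the toy sketch below; the thesis's actual parameterization, continuity constraints between pieces, and fitting procedure are not reproduced here:

```python
import numpy as np

def helix_points(radius, pitch, turns, n=100):
    """3D coordinates along one helical piece; a piecewise model
    would concatenate several such pieces, with continuity
    constraints where the pieces join."""
    t = np.linspace(0.0, 2 * np.pi * turns, n)
    return np.stack([radius * np.cos(t),
                     radius * np.sin(t),
                     pitch * t / (2 * np.pi)], axis=1)
```

A parsimonious model like this has only a few parameters per piece, which is what makes it interpretable and robust compared with models that place every genomic locus freely in 3D.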
25

Indonésko-anglický neuronový strojový překlad / Indonesian-English Neural Machine Translation

Dwiastuti, Meisyarah January 2019 (has links)
Title: Indonesian-English Neural Machine Translation Author: Meisyarah Dwiastuti Department: Institute of Formal and Applied Linguistics Supervisor: Mgr. Martin Popel, Ph.D., Institute of Formal and Applied Linguistics Abstract: In this thesis, we conduct a study on neural machine translation (NMT) for an under-studied language, Indonesian, specifically for English-Indonesian (EN-ID) and Indonesian-English (ID-EN) in a low-resource domain, TED talks. Our goal is to implement domain adaptation methods to improve the low-resource EN-ID and ID-EN NMT systems. First, we implement a model fine-tuning method for EN-ID and ID-EN NMT systems by leveraging a large parallel corpus containing movie subtitles. Our analysis shows the benefit of this method for the improvement of both systems. Second, we improve our ID-EN NMT system by leveraging English monolingual corpora through back-translation. Our back-translation experiments focus on how to incorporate the back-translated monolingual corpora into the training set, in which we investigate various existing training regimes and introduce a novel 4-way-concat training regime. We also analyze the effect of fine-tuning our back-translation models with different scenarios. Experimental results show that our method of implementing back-translation followed by model...
26

Object Recognition in Videos Utilizing Hierarchical and Temporal Objectness with Deep Neural Networks

Peng, Liang 01 May 2017 (has links)
This dissertation develops a novel system for object recognition in videos. The input of the system is a set of unconstrained videos containing a known set of objects. The output is the locations and categories for each object in each frame across all videos. Initially, a shot boundary detection algorithm is applied to the videos to divide them into multiple sequences separated by the identified shot boundaries. Since each of these sequences still contains moderate content variations, we further use a cost optimization-based key frame extraction method to select key frames in each sequence and use these key frames to divide the videos into shorter sub-sequences with little content variations. Next, we learn object proposals on the first frame of each sub-sequence. Building upon the state-of-the-art object detection algorithms, we develop a tree-based hierarchical model to improve the object detection. Using the learned object proposals as the initial object positions in the first frame of each sub-sequence, we apply the SPOT tracker to track the object proposals and re-rank them using the proposed temporal objectness to obtain object proposals tubes by removing unlikely objects. Finally, we employ the deep Convolution Neural Network (CNN) to perform classification on these tubes. Experiments show that the proposed system significantly improves the object detection rate of the learned proposals when comparing with some state-of-the-art object detectors. Due to the improvement in object detection, the proposed system also achieves higher mean average precision at the stage of proposal classification than the state-of-the-art methods.
27

Contribution au développement de l’apprentissage profond dans les systèmes distribués / Contribution to the development of deep learning in distributed systems

Hardy, Corentin 08 April 2019 (has links)
L'apprentissage profond permet de développer un nombre de services de plus en plus important. Il nécessite cependant de grandes bases de données d'apprentissage et beaucoup de puissance de calcul. Afin de réduire les coûts de cet apprentissage profond, nous proposons la mise en œuvre d'un apprentissage collaboratif. Les futurs utilisateurs des services permis par l'apprentissage profond peuvent ainsi participer à celui-ci en mettant à disposition leurs machines ainsi que leurs données sans déplacer ces dernières sur le cloud. Nous proposons différentes méthodes afin d'apprendre des réseaux de neurones profonds dans ce contexte de système distribué. / Deep learning enables the development of a growing number of services. However, it requires large training databases and a lot of computing power. In order to reduce the costs of this deep learning, we propose a distributed computing setup to enable collaborative learning. Future users can participate with their devices and their data without moving private data into datacenters. We propose methods to train deep neural networks in this distributed system context.
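One way to aggregate models trained collaboratively on separate devices is to average their parameters, weighting each participant by its local dataset size, in the style of federated averaging. This is a minimal illustrative sketch, not one of the thesis's specific methods:

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Average per-layer parameter arrays from several clients,
    weighting each client by its local dataset size. Raw data
    never leaves the clients; only parameters are exchanged."""
    total = sum(client_sizes)
    n_layers = len(client_weights[0])
    return [sum((n / total) * w[i]
                for w, n in zip(client_weights, client_sizes))
            for i in range(n_layers)]
```

A central server (or a peer-to-peer protocol) would alternate between local training on each device and this aggregation step.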
28

Deep Neural Network Acoustic Models for ASR

Mohamed, Abdel-rahman 01 April 2014 (has links)
Automatic speech recognition (ASR) is a key core technology for the information age. ASR systems have evolved from discriminating among isolated digits to recognizing telephone-quality, spontaneous speech, allowing for a growing number of practical applications in various sectors. Nevertheless, there are still serious challenges facing ASR which require major improvement in almost every stage of the speech recognition process. Until very recently, the standard approach to ASR had remained largely unchanged for many years. It used Hidden Markov Models (HMMs) to model the sequential structure of speech signals, with each HMM state using a mixture of diagonal covariance Gaussians (GMM) to model a spectral representation of the sound wave. This thesis describes new acoustic models based on Deep Neural Networks (DNN) that have begun to replace GMMs. For ASR, the deep structure of a DNN as well as its distributed representations allow for better generalization of learned features to new situations, even when only small amounts of training data are available. In addition, DNN acoustic models scale well to large vocabulary tasks significantly improving upon the best previous systems. Different input feature representations are analyzed to determine which one is more suitable for DNN acoustic models. Mel-frequency cepstral coefficients (MFCC) are inferior to log Mel-frequency spectral coefficients (MFSC) which help DNN models marginalize out speaker-specific information while focusing on discriminant phonetic features. Various speaker adaptation techniques are also introduced to further improve DNN performance. Another deep acoustic model based on Convolutional Neural Networks (CNN) is also proposed. Rather than using fully connected hidden layers as in a DNN, a CNN uses a pair of convolutional and pooling layers as building blocks. 
The convolution operation scans the frequency axis using a learned local spectro-temporal filter while in the pooling layer a maximum operation is applied to the learned features utilizing the smoothness of the input MFSC features to eliminate speaker variations expressed as shifts along the frequency axis in a way similar to vocal tract length normalization (VTLN) techniques. We show that the proposed DNN and CNN acoustic models achieve significant improvements over GMMs on various small and large vocabulary tasks.
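The frequency-axis pooling described above can be illustrated with a toy max-pool over a (frequency, time) feature map: a small frequency shift that stays inside one pooling window leaves the output unchanged. This is a sketch of the idea, not the thesis's implementation:

```python
import numpy as np

def freq_max_pool(feature_map, pool=3):
    """Max-pool along the frequency axis of a (freq, time) feature
    map. Shifts smaller than the pool width (e.g. speaker-dependent
    formant shifts) within a window do not change the output,
    similar in spirit to VTLN."""
    f, t = feature_map.shape
    f_trim = f - f % pool  # drop bins that don't fill a window
    pooled = feature_map[:f_trim].reshape(f_trim // pool, pool, t)
    return pooled.max(axis=1)
```

This is why the smoothness of log mel-frequency features matters: max-pooling only absorbs shifts along an axis where neighbouring bins carry similar energy.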
30

Towards robust conversational speech recognition and understanding

Weng, Chao 12 January 2015 (has links)
While significant progress has been made in automatic speech recognition (ASR) during the last few decades, recognizing and understanding unconstrained conversational speech remains a challenging problem. In this dissertation, five methods/systems are proposed towards a robust conversational speech recognition and understanding system. I. A non-uniform minimum classification error (MCE) approach is proposed which can achieve consistent and significant keyword spotting performance gains on both English and Mandarin large-scale spontaneous conversational speech tasks (Switchboard and HKUST Mandarin CTS). II. A hybrid recurrent DNN-HMM system is proposed for robust acoustic modeling and a new way of backpropagation through time (BPTT) is introduced. The proposed system achieves state-of-the-art performances on two benchmark datasets, the 2nd CHiME challenge (track 2) and Aurora-4, without front-end preprocessing, speaker adaptive training or multiple decoding passes. III. To study the specific case of conversational speech recognition in the presence of competing talkers, several multi-style training setups of DNNs are investigated and a joint decoder operating on multi-talker speech is introduced. The proposed combined system improves upon the previous state-of-the-art IBM superhuman system by 2.8% absolute on the 2006 speech separation challenge dataset. IV. Latent semantic rational kernels (LSRKs) are proposed for spotting the semantic notions on conversational speech. The proposed framework is generalized using tf-idf weighting, latent semantic analysis, WordNet, probabilistic topic models and neural network learned representations and is shown to achieve substantial topic spotting performance gains on two conversational speech tasks, Switchboard and AT&T HMIHY initial collection. V. 
Non-uniform sequential discriminative training (DT) of DNNs with LSRKs is proposed, which directly links the information of the proposed LSRK framework to the objective function of the DT. Experimental results on a subset of Switchboard show that the proposed method makes the acoustic model more robust with respect to the semantic decoder.
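Of the weighting schemes the LSRK framework generalizes, tf-idf is the simplest: terms frequent in a document but rare across the collection get high weight. A toy sketch (illustrative only, not the thesis's implementation):

```python
import math
from collections import Counter

def tfidf(docs):
    """Per-document tf-idf weights for tokenized documents:
    term frequency (normalized by document length) times the log
    inverse document frequency across the collection."""
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    return [{w: (tf / len(d)) * math.log(n / df[w])
             for w, tf in Counter(d).items()}
            for d in docs]
```

In keyword/topic spotting, weights like these score how characteristic a recognized term is of a semantic notion, before the richer representations (LSA, WordNet, topic models, neural embeddings) take over.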
