  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

FPGA acceleration of CNN training

Samal, Kruttidipta 07 January 2016
This thesis presents the results of an architectural study on the design of FPGA-based architectures for convolutional neural networks (CNNs). We have analyzed the memory access patterns of a convolutional neural network (one of the biggest networks in the family of deep learning algorithms) by creating a trace of a well-known CNN architecture and by developing a trace-driven DRAM simulator. The simulator uses the traces to analyze the effect that different storage patterns, and the speed mismatch between memory and processing elements, can have on the CNN system. This insight is then used to create an initial design for a layer architecture for the CNN using an FPGA platform. The FPGA is designed to have multiple parallel-executing units. We design a data layout for the on-chip memory of the FPGA such that we can increase parallelism in the design, since the number of these parallel units (and hence the parallelism) depends on the memory layout of input and output, particularly on whether parallel read and write accesses can be scheduled. The on-chip memory layout minimizes access contention during the operation of the parallel units. The result is an SoC (System on Chip) that acts as an accelerator and can have more parallel units than previous work. The improvement in design was also observed by comparing post-synthesis loop latency tables between our design and a single-unit design. This initial design can help in designing FPGAs targeted at deep learning algorithms that can compete with GPUs in terms of performance.
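The abstract does not describe the actual on-chip layout, so the following is only an illustrative sketch of why layout limits parallelism. It assumes a cyclic (interleaved) banking scheme, which is a common technique swapped in here for illustration and not necessarily the author's design; the names `bank_of` and `has_conflict` are hypothetical.

```python
def bank_of(address, num_banks):
    """Map a word address to a memory bank using cyclic interleaving (assumed scheme)."""
    return address % num_banks


def has_conflict(addresses, num_banks):
    """Return True if two parallel units would hit the same bank in one cycle."""
    banks = [bank_of(a, num_banks) for a in addresses]
    return len(set(banks)) < len(banks)


# Four parallel units reading consecutive words land in four different banks.
print(has_conflict([8, 9, 10, 11], num_banks=4))
# Four units reading with a stride of 4 all land in bank 0 and must serialize.
print(has_conflict([0, 4, 8, 12], num_banks=4))
```

Under this assumption, the number of units that can run in the same cycle is bounded by how many of their accesses fall into distinct banks, which is the kind of dependency between layout and parallelism the abstract refers to.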
2

Understanding Perceived Sense of Movement in Static Visuals Using Deep Learning

Kale, Shravan 11 January 2019
This thesis introduces the problem of learning the representation and the classification of the perceived sense of movement, defined as dynamism, in static visuals. To solve this problem, we study the definition, degree, and real-world implications of dynamism within the field of consumer psychology. We employ Deep Convolutional Neural Networks (DCNN) as a method to learn and predict dynamism in images. The novelty of the task led us to collect a dataset, which we synthetically augmented for spatial invariance using image processing techniques. Because the size of our dataset was deemed inadequate, we study transfer learning methods to transfer knowledge from another domain. We train on our dataset across different network architectures and transfer learning techniques to find an optimal method for the task at hand. To show a real-world application of our work, we observe the correlation between two visual stimuli, dynamism and emotions.
3

Domain Adaptation on Semantic Segmentation with Separate Affine Transformation in Batch Normalization

Yan, Junhao 06 June 2022
Domain adaptation on semantic segmentation generally refers to the procedures for narrowing the distribution gap between source and target data, which is vital for developing automatic vehicle systems. Developing such a system requires a large amount of data with well-labelled ground truth at the pixel level. Labelling data at this scale is extremely costly because of the large amount of human effort required. Manual labelling also often introduces label noise that is harmful to automatic vehicle system development. One way to address this problem is to use computer-generated data and ground truth for development. However, a notorious problem arises when a system is trained with synthetic data but deployed in a real-world environment; it results from the distribution (domain) difference between these two kinds of data, and domain adaptation helps solve this issue. In this thesis, the limitation of the conventional batch normalization layer in adversarial learning-based domain adaptation methods is discussed. In view of this limitation, we propose replacing the Sharing Affine Transformation with our proposed Separate Affine Transformation (SEAT) to improve the domain adaptation performance. The proposed SEAT is simple, easily implemented, and easily integrated into existing adversarial learning-based unsupervised domain adaptation methods. Also, to further improve the adaptation quality on lower-level features, we introduce multi-level adaptation by adding the lower-level features to the higher-level ones before feeding them to the discriminator, which differs from other approaches that add extra discriminators. Finally, a simple training strategy, self-training, is adopted to improve the model performance further. Extensive experiments show that our proposed method achieves results comparable with other domain adaptation methods with a simpler design.
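As a rough illustration of the idea of a separate affine transformation, the sketch below normalizes a batch of scalar features and then applies a scale and shift chosen by domain. It is not taken from the thesis: the class name `SeparateAffineBatchNorm`, the use of per-batch statistics, and the scalar (single-channel) features are all assumptions made for the example.

```python
from statistics import fmean


class SeparateAffineBatchNorm:
    """Toy 1-D batch norm with one (gamma, beta) pair per domain (assumed design)."""

    def __init__(self, domains=("source", "target"), eps=1e-5):
        self.eps = eps
        # In a shared affine transformation there would be a single gamma and beta.
        self.gamma = {d: 1.0 for d in domains}
        self.beta = {d: 0.0 for d in domains}

    def __call__(self, batch, domain):
        mean = fmean(batch)
        var = fmean([(x - mean) ** 2 for x in batch])
        normalized = [(x - mean) / (var + self.eps) ** 0.5 for x in batch]
        # The affine step is selected by the domain of the current batch.
        return [self.gamma[domain] * x + self.beta[domain] for x in normalized]


bn = SeparateAffineBatchNorm()
bn.gamma["target"], bn.beta["target"] = 2.0, 1.0  # stand-ins for learned values
source_out = bn([1.0, 2.0, 3.0], "source")
target_out = bn([1.0, 2.0, 3.0], "target")
print(source_out, target_out)
```

In a real network the per-domain parameters would be learned per channel by backpropagation; here they are set by hand only to show that the same normalized features can be mapped differently for each domain.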
4

Novel computational methods for promoter identification and analysis

Umarov, Ramzan 02 March 2020
Promoters are key regions that are involved in differential transcription regulation of protein-coding and RNA genes. The gene-specific architecture of promoter sequences makes it extremely difficult to devise a general strategy for their computational identification. Accurate prediction of promoters is fundamental for interpreting gene expression patterns, and for constructing and understanding genetic regulatory networks. In the last decade, the genomes of many organisms have been sequenced and their gene content has mostly been identified. Promoters and transcriptional start sites (TSS), however, are still left largely undetermined, and efficient software able to accurately predict promoters in newly sequenced genomes is not yet available in the public domain. While there have been many attempts to develop computational promoter identification methods, reliable tools to analyze long genomic sequences are still lacking. In this dissertation, I present the methods I have developed for prediction of promoters for different organisms. The first two methods, TSSPlant and PromCNN, achieved state-of-the-art performance for discriminating promoter and non-promoter sequences for plant and eukaryotic promoters respectively. For TSSPlant, a large number of features were crafted and evaluated to train an optimal classifier. PromCNN was built using a deep learning approach that extracts features from the data automatically. The trained model demonstrated the ability of a deep learning approach to grasp complex promoter sequence characteristics. For the latest method, DeeReCT-PromID, I focus on prediction of the exact positions of the TSSs inside the eukaryotic genomic sequences, testing every possible location. This is a more difficult task, requiring not only an accurate classifier, but also appropriate selection of unique predictions among multiple overlapping high-scoring genomic segments.
The new method significantly outperforms the previous promoter prediction programs by considerably reducing the number of false positive predictions. Specifically, to reduce the false positive rate, the models are adaptively and iteratively trained by changing the distribution of samples in the training set based on the false positive errors made in the previous iteration. The new methods are used to gain insights into the design principles of the core promoters. Using model analysis, I have identified the most important core promoter elements and their effect on the promoter activity. Furthermore, the importance of each position inside the core promoter was analyzed and validated using a large single nucleotide polymorphism data set. I have developed a novel general approach to detect long-range interactions in the input of a deep learning model, which was used to find related positions inside the promoter region. The final model was applied to the genomes of different species without a significant drop in performance, demonstrating the high generality of the developed method.
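The iterative retraining on false positives described above can be sketched with a toy one-dimensional classifier. Everything here is a stand-in chosen for illustration: the real models are deep networks over DNA sequences, while this example uses scalar scores, a threshold "model", and the hypothetical names `train_threshold`, `predict`, and `iterative_training`.

```python
def train_threshold(positives, negatives):
    """Toy 'model': midpoint between the lowest positive and highest negative score."""
    return (min(positives) + max(negatives)) / 2


def predict(threshold, score):
    """Classify a score as positive (True) or negative (False)."""
    return score > threshold


def iterative_training(positives, negative_pool, initial_negatives, rounds=5):
    """Retrain repeatedly, adding the previous round's false positives to the training set."""
    train_negatives = list(initial_negatives)
    threshold = train_threshold(positives, train_negatives)
    for _ in range(rounds):
        false_positives = [
            x for x in negative_pool
            if predict(threshold, x) and x not in train_negatives
        ]
        if not false_positives:
            break  # no new errors to learn from
        # Shift the training distribution toward the negatives the model got wrong.
        train_negatives.extend(false_positives)
        threshold = train_threshold(positives, train_negatives)
    return threshold, train_negatives


positives = [0.9, 1.0, 1.1]
negative_pool = [0.0, 0.1, 0.2, 0.7, 0.8]
threshold, used = iterative_training(positives, negative_pool, [0.0, 0.1, 0.2])
print(threshold, used)
```

In this toy setting the first model accepts the hard negatives 0.7 and 0.8; once they are added to the training set, the retrained threshold rejects them while still accepting every positive.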
5

Privacy-Preserving Facial Recognition Using Biometric-Capsules

Phillips, Tyler S. 05 1900
Indiana University-Purdue University Indianapolis (IUPUI) / In recent years, developers have used the proliferation of biometric sensors in smart devices, along with recent advances in deep learning, to implement an array of biometrics-based recognition systems. Though these systems demonstrate remarkable performance and have seen wide acceptance, they present unique and pressing security and privacy concerns. One proposed method which addresses these concerns is the elegant, fusion-based Biometric-Capsule (BC) scheme. The BC scheme is provably secure, privacy-preserving, cancellable and interoperable in its secure feature fusion design. In this work, we demonstrate that the BC scheme is uniquely fit to secure state-of-the-art facial verification, authentication and identification systems. We compare the performance of unsecured, underlying biometrics systems to the performance of the BC-embedded systems in order to directly demonstrate the minimal effects of the privacy-preserving BC scheme on underlying system performance. Notably, we demonstrate that, when seamlessly embedded into state-of-the-art FaceNet and ArcFace verification systems, which achieve accuracies of 97.18% and 99.75% on the benchmark LFW dataset, the BC-embedded systems achieve accuracies of 95.13% and 99.13%, respectively. Furthermore, we demonstrate that the BC scheme outperforms or performs as well as several other proposed secure biometric methods.
6

Applications of Deep Learning to Video Enhancement

Shi, Zhihao January 2022
Deep learning, usually built upon artificial neural networks, was proposed in 1943, but poor computational capability restricted its development at that time. With the advancement of computer architecture and chip design, deep learning has gained sufficient computational power and has revolutionized many areas in computer vision. As a fundamental research area of computer vision, video enhancement often serves as the first step of many modern vision systems and facilitates numerous downstream vision tasks. This thesis provides a comprehensive study of video enhancement, especially in the sense of video frame interpolation and space-time video super-resolution. For video frame interpolation, two novel methods, named GDConvNet and VFIT, are proposed. In GDConvNet, a novel mechanism named generalized deformable convolution is introduced in order to overcome the inaccurate flow estimation issue in the flow-based methods and the rigidity issue of kernel shape in the kernel-based methods. This mechanism can effectively learn motion information in a data-driven manner and freely select sampling points in space-time. Our GDConvNet, built upon this mechanism, is shown to achieve state-of-the-art performance. As for VFIT, the concept of local attention is first introduced to video interpolation, and a novel space-time separation window-based self-attention scheme is further devised, which not only saves costs but also acts as a regularization term to improve the performance. Based on the new scheme, VFIT is presented as the first Transformer-based video frame interpolation framework. In addition, a multi-scale frame synthesis scheme is developed to fully realize the potential of Transformers. Extensive experiments on a variety of benchmark datasets demonstrate the superiority and reliability of VFIT.
For space-time video super-resolution, a novel unconstrained space-time video super-resolution network is proposed to solve the common issues of the existing methods that either fail to explore the intrinsic relationship between temporal and spatial information or lack flexibility in the choice of final temporal/spatial resolution. To this end, several new ideas are introduced, such as integration of multi-level representations and generalized pixshuffle. Various experiments validate the proposed method in terms of its complete freedom in choosing output resolution, as well as superior performance over the state-of-the-art methods. / Thesis / Doctor of Philosophy (PhD)
7

IMAGE RESTORATIONS USING DEEP LEARNING TECHNIQUES

Chi, Zhixiang January 2018
Conventional methods for solving image restoration problems are typically built on an image degradation model and on some priors of the latent image. The model of the degraded image and the prior knowledge of the latent image are necessary because the restoration is an ill-posed inverse problem. However, for some applications, such as those addressed in this thesis, the image degradation process is too complex to model precisely; in addition, mathematical priors, such as low rank and sparsity of the image signal, are often too idealistic for real-world images. These difficulties limit the performance of existing image restoration algorithms, but they can be, to a certain extent, overcome by the techniques of machine learning, particularly deep convolutional neural networks. Machine learning allows large sample statistics far beyond what is available in a single input image to be exploited. More importantly, big data can be used to train deep neural networks to learn the complex non-linear mapping between the degraded and original images. This circumvents the difficulty of building an explicit realistic mathematical model when the degradation causes are complex and compounded. In this thesis, we design and implement deep convolutional neural networks (DCNN) for two challenging image restoration problems: reflection removal and joint demosaicking-deblurring. The first problem is one of blind source separation; its DCNN solution requires a large set of paired clean and mixed images for training. As these paired training images are very difficult, if not impossible, to acquire in the real world, we develop a novel technique to synthesize the required training images that satisfactorily approximate the real ones.
For the joint demosaicking-deblurring problem, we propose a new multiscale DCNN architecture consisting of a cascade of subnetworks so that the underlying blind deconvolution task can be broken into smaller subproblems and solved more effectively and robustly. In both cases extensive experiments are carried out. Experimental results demonstrate clear advantages of the proposed DCNN methods over existing ones. / Thesis / Master of Applied Science (MASc)
8

Generic Model-Agnostic Convolutional Neural Networks for Single Image Dehazing

Liu, Zheng January 2018
Haze and smog are among the most common environmental factors impacting image quality and, therefore, image analysis. In this paper, I propose an end-to-end generative method for the single image dehazing problem. It is based on a fully convolutional network and effective network structures to recognize haze structure in input images and restore clear, haze-free ones. The proposed method is agnostic in the sense that it does not explore the atmosphere scattering model; instead, it makes use of the advantage of convolutional networks in feature extraction and transfer. Somewhat surprisingly, it achieves superior performance relative to all existing state-of-the-art methods for image dehazing even on SOTS outdoor images, which are synthesized using the atmosphere scattering model. In order to address its weakness on indoor hazy images and enhance the visual quality of the dehazed image, a lightweight parallel network is put forward. It employs a different convolution strategy that extracts features with a larger receptive field to generate a complementary image. With the help of the parallel stream, the fusion of the two outputs performs better in PSNR and SSIM than other methods. / Thesis / Master of Applied Science (MASc)
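For readers unfamiliar with the PSNR metric mentioned above, the following minimal sketch computes it from its standard definition, 10 * log10(MAX^2 / MSE). The function name `psnr` and the flat-list image representation are choices made for this example only; the thesis's own evaluation code is not shown in the abstract.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


clear = [52, 55, 61, 59]
dehazed = [54, 55, 60, 57]
print(psnr(clear, dehazed))
```

A higher PSNR means the restored image is closer to the haze-free reference in the mean-squared-error sense; SSIM, the other metric named in the abstract, measures structural similarity and is not covered by this sketch.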
9

GridDehazeNet: Attention-Based Multi-Scale Network for Image Dehazing

Ma, Yongrui January 2019
We propose an end-to-end trainable Convolutional Neural Network (CNN), named GridDehazeNet, for single image dehazing. The GridDehazeNet consists of three modules: pre-processing, backbone, and post-processing. The trainable pre-processing module can generate learned inputs with better diversity and more pertinent features than the derived inputs produced by hand-selected pre-processing methods. The backbone module implements a novel attention-based multi-scale estimation on a grid network, which can effectively alleviate the bottleneck issue often encountered in the conventional multi-scale approach. The post-processing module helps to reduce the artifacts in the final output. Experimental results indicate that the GridDehazeNet outperforms the state-of-the-art on both synthetic and real-world images. The proposed dehazing method does not rely on the atmosphere scattering model, and we provide an explanation as to why it is not necessarily beneficial to take advantage of the dimension reduction offered by the atmosphere scattering model for image dehazing, even if only the dehazing results on synthetic images are concerned. / Thesis / Master of Applied Science (MASc)
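The atmosphere scattering model that the abstract refers to is commonly written as I = J * t + A * (1 - t), with transmission t = exp(-beta * d) for scene depth d. The sketch below applies that standard formula to synthesize haze on a few pixel intensities; the function name `add_haze` and the default values of `beta` and `airlight` are assumptions for illustration and are not taken from the thesis.

```python
import math


def add_haze(clear, depth, beta=1.0, airlight=1.0):
    """Synthesize hazy intensities from clear intensities in [0, 1] and per-pixel depth."""
    hazy = []
    for j, d in zip(clear, depth):
        t = math.exp(-beta * d)  # transmission: fraction of scene light that survives
        hazy.append(j * t + airlight * (1 - t))
    return hazy


clear_pixels = [0.2, 0.5, 0.8]
depths = [0.0, 1.0, 5.0]
print(add_haze(clear_pixels, depths))
```

The model describes a whole hazy image with only a transmission map and a global airlight, which is the dimension reduction the abstract mentions; a model-agnostic network instead maps the hazy image to the clear one directly.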
10

Constellation Design for Multi-user Communications with Deep Learning

Sun, Yi-Lin January 2019
In its simplest form, a communication system includes a transmitter and a receiver. The transmitter transforms a one-hot vector message to produce a transmitted signal. In general, restrictions are imposed on the transmitted signal. The channel is defined by a conditional probability distribution function. On receiving the transmitted signal with noise, the receiver applies a transformation to generate an estimate of the one-hot vector message. From a deep learning perspective, we can regard this simplest communication system as a specific case of an autoencoder. In our case, the autoencoder is used to learn representations of the one-hot vector that are robust to the noisy channel and can be recovered at the receiver with the smallest probability of error. Our task is to make some improvements to the autoencoder systems. We propose different schemes depending on the case. We propose a method based on optimization of softmax and introduce L1/2 regularization in the MSE loss function, for the SISO case and the MIMO case separately. The simulation shows that both our optimized softmax function method and the L1/2 regularization loss function perform better than the original neural network framework. / Thesis / Master of Applied Science (MASc)
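As an illustration of what adding L1/2 regularization to an MSE loss can look like, the sketch below adds the penalty lam * sum(|w|^(1/2)) to a mean squared error. The abstract does not say which quantities are penalized or how the term is weighted, so applying it to a flat list of weights, the coefficient `lam`, and the function name `mse_with_l_half` are all assumptions.

```python
def mse_with_l_half(predicted, target, weights, lam=1e-3):
    """Mean squared error plus an L1/2 penalty on the given weights (assumed form)."""
    mse = sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)
    # L1/2 penalty: sum of square roots of absolute values, which favors sparse weights.
    penalty = sum(abs(w) ** 0.5 for w in weights)
    return mse + lam * penalty


one_hot = [0.0, 1.0, 0.0, 0.0]
estimate = [0.1, 0.7, 0.1, 0.1]
print(mse_with_l_half(estimate, one_hot, weights=[0.25, -4.0, 0.0], lam=0.01))
```

Compared with an L1 penalty, the square root penalizes small nonzero weights relatively more, so minimizing this loss tends to drive more weights exactly to zero.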
