1 |
FPGA acceleration of CNN training
Samal, Kruttidipta 07 January 2016
This thesis presents the results of an architectural study on the design of FPGA-based architectures for convolutional neural networks (CNNs).
We analyzed the memory access patterns of a convolutional neural network (one of the biggest networks in the family of deep learning algorithms) by creating a trace of a well-known CNN architecture and developing a trace-driven DRAM simulator. The simulator uses the traces to analyze the effect that different storage patterns, and the speed mismatch between memory and processing elements, can have on the CNN system. This insight is then used to create an initial design for a layer architecture for the CNN on an FPGA platform. The FPGA design has multiple units that execute in parallel. We design a data layout for the on-chip memory of the FPGA that increases the parallelism in the design, because the number of parallel units (and hence the parallelism) depends on the memory layout of the input and output, and in particular on whether parallel read and write accesses can be scheduled. The on-chip memory layout minimizes access contention during the operation of the parallel units. The result is an SoC (System on Chip) that acts as an accelerator and can have more parallel units than previous work. The improvement was also observed by comparing post-synthesis loop latency tables for our design and for a design with a single unit. This initial design can help in designing FPGAs targeted at deep learning algorithms that can compete with GPUs in terms of performance.
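As a rough illustration of why the on-chip data layout limits how many units can run in parallel, the sketch below counts bank conflicts for two layouts. It is not the thesis's design: the bank count, bank depth, single-ported banks, and the function names are all assumptions made for this example.

```python
# Hypothetical on-chip memory: 4 single-ported banks of 16 words each (assumed values).
NUM_BANKS = 4
BANK_DEPTH = 16


def blocked_bank(addr):
    """Blocked layout: consecutive addresses fill one bank before moving to the next."""
    return addr // BANK_DEPTH


def interleaved_bank(addr):
    """Interleaved layout: consecutive addresses rotate across the banks."""
    return addr % NUM_BANKS


def count_bank_conflicts(addresses, bank_of):
    """Count the extra accesses that must wait because another access hit the same bank."""
    hits = {}
    for addr in addresses:
        bank = bank_of(addr)
        hits[bank] = hits.get(bank, 0) + 1
    return sum(n - 1 for n in hits.values())


# Four parallel units each read one of four neighbouring words in the same cycle.
addresses = [0, 1, 2, 3]
blocked_conflicts = count_bank_conflicts(addresses, blocked_bank)
interleaved_conflicts = count_bank_conflicts(addresses, interleaved_bank)
print(blocked_conflicts, interleaved_conflicts)  # 3 0
```

Under these assumptions the blocked layout serializes all four reads on one bank, while the interleaved layout lets every unit read in the same cycle.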
|
2 |
Understanding Perceived Sense of Movement in Static Visuals Using Deep Learning
Kale, Shravan 11 January 2019
This thesis introduces the problem of learning the representation and the classification of the perceived sense of movement in static visuals, defined as dynamism. To address this problem, we study the definition, degree, and real-world implications of dynamism within the field of consumer psychology. We employ Deep Convolutional Neural Networks (DCNN) as a method to learn and predict dynamism in images. The novelty of the task led us to collect a dataset, which we synthetically augmented for spatial invariance using image processing techniques. Because the size of our dataset was deemed inadequate, we study transfer learning methods to transfer knowledge from another domain. We train different network architectures on our dataset with different transfer learning techniques to find an optimal method for the task at hand. To show a real-world application of our work, we observe the correlation between two visual stimuli, dynamism and emotions.
|
3 |
Domain Adaptation on Semantic Segmentation with Separate Affine Transformation in Batch Normalization
Yan, Junhao 06 June 2022
Domain adaptation on semantic segmentation generally refers to the procedures for narrowing the distribution gap between source and target data, which is vital for developing automatic vehicle systems. Developing such a system requires a large amount of data with well-labelled ground truth at the pixel level. Labelling data at this scale is extremely costly because of the human effort required. Manual labelling also often introduces label noise that is harmful to automatic vehicle system development. One way around this problem is to use computer-generated data and ground truth for development. However, a notorious problem arises when a system is trained with synthetic data but deployed in a real-world environment; it results from the distribution (domain) difference between these two kinds of data, and domain adaptation helps solve this issue.
In this thesis, we discuss the limitation that the conventional batch normalization layer imposes on adversarial-learning-based domain adaptation methods. To address this limitation, we propose replacing the Sharing Affine Transformation with our Separate Affine Transformation (SEAT) to improve domain adaptation performance. SEAT is simple, easy to implement, and easy to integrate into existing adversarial-learning-based unsupervised domain adaptation methods. To further improve the adaptation quality on lower-level features, we introduce multi-level adaptation by adding the lower-level features to the higher-level ones before feeding them to the discriminator, in contrast to other approaches that add extra discriminators. Finally, a simple training strategy, self-training, is adopted to further improve model performance.
Extensive experiments show that our proposed method achieves results comparable to those of other domain adaptation methods with a simpler design.
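The idea of giving each domain its own affine transformation inside batch normalization can be sketched as follows. This is a minimal illustration only, not the thesis's SEAT implementation: the class name, the use of scalar features, the choice to normalize each batch with its own statistics, and the initial parameter values are all assumptions.

```python
import math


class SeparateAffineBatchNorm:
    """Toy batch norm over a batch of scalars with one (gamma, beta) pair per domain."""

    def __init__(self, domains=("source", "target"), eps=1e-5):
        self.eps = eps
        # Assumption: each domain keeps its own scale and shift parameters.
        self.affine = {d: {"gamma": 1.0, "beta": 0.0} for d in domains}

    def forward(self, batch, domain):
        # Normalize with the statistics of the batch that was passed in.
        mean = sum(batch) / len(batch)
        var = sum((x - mean) ** 2 for x in batch) / len(batch)
        normalized = [(x - mean) / math.sqrt(var + self.eps) for x in batch]
        # Apply the affine transformation that belongs to this batch's domain.
        params = self.affine[domain]
        return [params["gamma"] * x + params["beta"] for x in normalized]


bn = SeparateAffineBatchNorm()
bn.affine["target"]["beta"] = 0.5  # stand-in for a value learned during training
source_out = bn.forward([1.0, 2.0, 3.0], "source")
target_out = bn.forward([1.0, 2.0, 3.0], "target")
```

With a shared affine transformation both calls would return the same values; here the target output is shifted by its own beta while the normalization step stays identical.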
|
4 |
Novel computational methods for promoter identification and analysis
Umarov, Ramzan 02 March 2020
Promoters are key regions that are involved in differential transcription regulation of protein-coding and RNA genes. The gene-specific architecture of promoter sequences makes it extremely difficult to devise a general strategy for their computational identification. Accurate prediction of promoters is fundamental for interpreting gene expression patterns, and for constructing and understanding genetic regulatory networks. In the last decade, genomes of many organisms have been sequenced and their gene content has mostly been identified. Promoters and transcriptional start sites (TSS), however, are still left largely undetermined, and efficient software able to accurately predict promoters in newly sequenced genomes is not yet available in the public domain. While there have been many attempts to develop computational promoter identification methods, reliable tools to analyze long genomic sequences are still lacking.
In this dissertation, I present the methods I have developed for prediction of promoters for different organisms. The first two methods, TSSPlant and PromCNN, achieved state-of-the-art performance for discriminating promoter and non-promoter sequences for plant and eukaryotic promoters, respectively. For TSSPlant, a large number of features were crafted and evaluated to train an optimal classifier. PromCNN was built using a deep learning approach that extracts features from the data automatically. The trained model demonstrated the ability of a deep learning approach to grasp complex promoter sequence characteristics.
For the latest method, DeeReCT-PromID, I focus on prediction of the exact positions of the TSSs inside the eukaryotic genomic sequences, testing every possible location. This is a more difficult task, requiring not only an accurate classifier, but also appropriate selection of unique predictions among multiple overlapping high-scoring genomic segments. The new method significantly outperforms the previous promoter prediction programs by considerably reducing the number of false positive predictions. Specifically, to reduce the false positive rate, the models are adaptively and iteratively trained by changing the distribution of samples in the training set based on the false positive errors made in the previous iteration.
The new methods are used to gain insights into the design principles of the core promoters. Using model analysis, I have identified the most important core promoter elements and their effect on the promoter activity. Furthermore, the importance of each position inside the core promoter was analyzed and validated using a large single nucleotide polymorphism data set. I have developed a novel general approach to detect long-range interactions in the input of a deep learning model, which was used to find related positions inside the promoter region. The final model was applied to the genomes of different species without a significant drop in performance, demonstrating the high generality of the developed method.
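The abstract's step of selecting unique predictions among overlapping high-scoring segments can be illustrated with a greedy one-dimensional non-maximum suppression. This is a sketch of one common technique, not necessarily the procedure used in DeeReCT-PromID: the function name, the score threshold, and the minimum distance between kept positions are assumptions.

```python
def select_unique_predictions(scores, min_distance, threshold):
    """Greedy 1-D non-maximum suppression over per-position classifier scores.

    Keeps the highest-scoring positions at or above `threshold` and discards
    any candidate closer than `min_distance` to a position already kept.
    """
    candidates = sorted(
        (i for i, s in enumerate(scores) if s >= threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = []
    for pos in candidates:
        if all(abs(pos - k) >= min_distance for k in kept):
            kept.append(pos)
    return sorted(kept)


# Hypothetical scores for nine consecutive genomic positions.
scores = [0.1, 0.8, 0.95, 0.9, 0.2, 0.1, 0.7, 0.85, 0.3]
selected = select_unique_predictions(scores, min_distance=3, threshold=0.6)
print(selected)  # [2, 7]
```

Positions 1 and 3 are dropped because they fall within the assumed distance of the stronger peak at position 2, and position 6 is dropped in favour of position 7.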
|
5 |
Privacy-Preserving Facial Recognition Using Biometric-Capsules
Phillips, Tyler S. 05 1900
Indiana University-Purdue University Indianapolis (IUPUI) / In recent years, developers have used the proliferation of biometric sensors in smart devices, along with recent advances in deep learning, to implement an array of biometrics-based recognition systems. Though these systems demonstrate remarkable performance and have seen wide acceptance, they present unique and pressing security and privacy concerns. One proposed method which addresses these concerns is the elegant, fusion-based Biometric-Capsule (BC) scheme. The BC scheme is provably secure, privacy-preserving, cancellable and interoperable in its secure feature fusion design.
In this work, we demonstrate that the BC scheme is uniquely fit to secure state-of-the-art facial verification, authentication and identification systems. We compare the performance of unsecured, underlying biometrics systems to the performance of the BC-embedded systems in order to directly demonstrate the minimal effects of the privacy-preserving BC scheme on underlying system performance. Notably, we demonstrate that, when seamlessly embedded into state-of-the-art FaceNet and ArcFace verification systems, which achieve accuracies of 97.18% and 99.75% on the benchmark LFW dataset, the BC-embedded systems achieve accuracies of 95.13% and 99.13%, respectively. Furthermore, we demonstrate that the BC scheme outperforms or performs as well as several other proposed secure biometric methods.
|
6 |
Applications of Deep Learning to Video Enhancement
Shi, Zhihao January 2022
Deep learning, usually built upon artificial neural networks, was proposed in 1943, but poor computational capability restricted its development at that time. With the advancement of computer architecture and chip design, deep learning has gained sufficient computational power and has revolutionized many areas in computer vision. As a fundamental research area of computer vision, video enhancement often serves as the first step of many modern vision systems and facilitates numerous downstream vision tasks. This thesis provides a comprehensive study of video enhancement, especially in the sense of video frame interpolation and space-time video super-resolution.
For video frame interpolation, two novel methods, named GDConvNet and VFIT, are proposed. In GDConvNet, a novel mechanism named generalized deformable convolution is introduced in order to overcome the inaccurate flow estimation issue in the flow-based methods and the rigidity issue of kernel shape in the kernel-based methods. This mechanism can effectively learn motion information in a data-driven manner and freely select sampling points in space-time. Our GDConvNet, built upon this mechanism, is shown to achieve state-of-the-art performance. As for VFIT, the concept of local attention is introduced to video interpolation for the first time, and a novel space-time separation window-based self-attention scheme is further devised, which not only saves costs but also acts as a regularization term to improve the performance.
Based on the new scheme, VFIT is presented as the first Transformer-based video frame interpolation framework. In addition, a multi-scale frame synthesis scheme is developed to fully realize the potential of Transformers. Extensive experiments on a variety of benchmark datasets demonstrate the superiority and reliability of VFIT.
For space-time video super-resolution, a novel unconstrained space-time video super-resolution network is proposed to solve the common issues of the existing methods that either fail to explore the intrinsic relationship between temporal and spatial information or lack flexibility in the choice of final temporal/spatial resolution. To this end, several new ideas are introduced, such as integration of multi-level representations and generalized pixshuffle. Various experiments validate the proposed method in terms of its complete freedom in choosing output resolution, as well as superior performance over the state-of-the-art methods. / Thesis / Doctor of Philosophy (PhD)
|
7 |
INTELLIGENT RESOURCE PROVISIONING FOR NEXT-GENERATION CELLULAR NETWORKS
Yu, Lixing 07 September 2020
No description available.
|
8 |
Non-competitive and competitive deep learning for imaging applications
Zhou, Xiao 05 July 2022
While generative adversarial networks (GANs) have been widely applied in various settings, competitive deep learning frameworks such as GANs have been less popular in medical image processing, and even less widely applied to high-resolution data, because of issues related to their stability. In this dissertation, we examined optimal ways of modeling a generalizable competitive framework that can alleviate the inherent stability issues while still meeting additional objectives, such as achieving prediction accuracy on a classification task or satisfying other performance metrics on high-dimensional data sets.
The first part of the thesis is focused on exploring better network performance in a non-competitive setting with a closed-form solution. (1) We introduced Pyramid Encoder in seq2seq models and observed a significant increase in computational and memory efficiency while achieving a similar repair rate to their non-pyramid counterparts. (2) We proposed a mixed spatio-temporal neural network for real-time prediction of crimes, establishing the feasibility of a convolutional neural network (CNN) in the spatio-temporal domain. (3) We developed and validated an interpretable deep learning framework for Alzheimer’s disease (AD) classification as a clinically adaptable strategy to generate neuroimaging signatures for AD diagnosis and as a generalizable approach for linking deep learning to pathophysiological processes in human disease. (4) We designed and validated an end-to-end survival model for prediction of progression from mild cognitive impairment (MCI) to AD, and identified regions salient to predicting progression from MCI to AD. (5) Additionally, we applied a supervised learning framework in Parrondo's Paradox that maps playing history directly to the decision space, and learned to combine two individually-losing games to have a positive expectation.
The second part is focused on the design and analysis of neural models in a competitive setting without a closed-form solution. We extended the models from tackling a single objective to multiple tasks, while also moving from two-dimensional images to three-dimensional magnetic resonance imaging scans of the human brain. (1) We experimented with domain-specific inpainting with a concurrently pre-trained GAN to recover noised or cropped images. (2) We developed a GAN model to enhance MRI-driven AD classification performance using generative adversarial learning. (3) Finally, we proposed a competitive framework that could recover 3D medical data from 2D slices, while retaining disease-related information. / 2023-07-04T00:00:00Z
|
9 |
Proposing a Three-Stage Model to Quantify Bradykinesia on a Symptom Severity Level Using Deep Learning
Jaber, R., Qahwaji, Rami S.R., Buckley, John, Abd-Alhameed, Raed 23 March 2022
No / Typically characterised as a movement disorder, bradykinesia can be represented according to the degree of motor impairment. The assessment criteria for Parkinson’s disease (PD) are therefore well defined due to its symptomatic nature. Diagnosing and monitoring the progression of bradykinesia is currently heavily reliant on clinicians’ visual judgment. One of the most common forms of examining bradykinesia involves rapid finger tapping and aims to determine the patient’s ability to initiate and sustain movement effectively. This consists of the patient repeatedly tapping their index finger and thumb together. An object detection algorithm, YOLO, was trained to track the separation between the index finger and thumb. Bounding boxes (BB) were used to determine their relative position on a frame-to-frame basis to produce a time series signal. Key movement characteristics were extracted to determine regularity of movement in finger tapping amongst Parkinson’s patients and controls.
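Turning per-frame bounding boxes into a finger-thumb separation signal might look like the sketch below. It is an illustration under stated assumptions rather than the authors' pipeline: the box format `(x_min, y_min, x_max, y_max)`, the use of the distance between box centres as the separation measure, and the function names are all hypothetical.

```python
import math


def box_center(box):
    """Return the centre of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def separation_series(finger_boxes, thumb_boxes):
    """Euclidean distance between finger and thumb box centres, one value per frame."""
    series = []
    for finger, thumb in zip(finger_boxes, thumb_boxes):
        (fx, fy), (tx, ty) = box_center(finger), box_center(thumb)
        series.append(math.hypot(fx - tx, fy - ty))
    return series


# Two hypothetical frames: fingers touching, then apart.
finger_boxes = [(0, 0, 2, 2), (0, 3, 2, 5)]
thumb_boxes = [(0, 0, 2, 2), (4, 0, 6, 2)]
signal = separation_series(finger_boxes, thumb_boxes)
print(signal)  # [0.0, 5.0]
```

Characteristics such as tap amplitude or tap interval could then be read off the resulting time series.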
|
10 |
IMAGE RESTORATIONS USING DEEP LEARNING TECHNIQUES
Chi, Zhixiang January 2018
Conventional methods for solving image restoration problems are typically built on an image degradation model and on some priors of the latent image. The model of the degraded image and the prior knowledge of the latent image are necessary because the restoration is an ill-posed inverse problem. However, for some applications, such as those addressed in this thesis, the image degradation process is too complex to model precisely; in addition, mathematical priors, such as low rank and sparsity of the image signal, are often too idealistic for real-world images. These difficulties limit the performance of existing image restoration algorithms, but they can be, to a certain extent, overcome by the techniques of machine learning, particularly deep convolutional neural networks. Machine learning allows large-sample statistics, far beyond what is available in a single input image, to be exploited. More importantly, big data can be used to train deep neural networks to learn the complex non-linear mapping between the degraded and original images. This circumvents the difficulty of building an explicit realistic mathematical model when the degradation causes are complex and compounded.
In this thesis, we design and implement deep convolutional neural networks (DCNN) for two challenging image restoration problems: reflection removal and joint demosaicking-deblurring. The first problem is one of blind source separation; its DCNN solution requires a large set of paired clean and mixed images for training. As these paired training images are very difficult, if not impossible, to acquire in the real world, we develop a novel technique to synthesize the required training images that satisfactorily approximate the real ones. For the joint demosaicking-deblurring problem, we propose a new multiscale DCNN architecture consisting of a cascade of subnetworks so that the underlying blind deconvolution task can be broken into smaller subproblems and solved more effectively and robustly. In both cases extensive experiments are carried out. Experimental results demonstrate clear advantages of the proposed DCNN methods over existing ones. / Thesis / Master of Applied Science (MASc)
|