1 |
Wide Activated Separate 3D Convolution for Video Super-Resolution
Yu, Xiafei 18 December 2019
Video super-resolution (VSR) aims to recover a realistic high-resolution (HR) frame from its corresponding center low-resolution (LR) frame and several neighbouring supporting frames. The neighbouring supporting LR frames provide extra information that helps recover the HR frame. However, these frames are not aligned with the center frame because of object motion. With the rapid development of neural networks, many deep-learning-based video super-resolution methods have recently been proposed. Most of these methods use motion estimation and compensation models as preprocessing to handle the spatio-temporal alignment problem. The accuracy of these motion estimation models is therefore critical for predicting the high-resolution frames: inaccurate motion compensation leads to artifacts and blur, which in turn damage the recovery of high-resolution frames. We propose an effective wide activated separate three-dimensional (3D) convolutional neural network (CNN) for video super-resolution that avoids the drawbacks of motion compensation models. Separate 3D convolution factorizes the 3D convolution into convolutions in the spatial and temporal domains, which benefits the optimization of the spatial and temporal convolution components. Our method can therefore capture the temporal and spatial information of the input frames simultaneously, without an additional motion estimation and compensation model. Experimental results demonstrate the effectiveness of the proposed wide activated separate 3D CNN.
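The factorization described above can be made concrete by counting weights. The sketch below compares a full 3D convolution with a spatial convolution followed by a temporal one. The channel counts and kernel sizes are hypothetical values chosen for illustration; the abstract does not state the layer configuration used in the thesis.

```python
def conv3d_params(c_in, c_out, k_t, k_h, k_w):
    """Weight count of one 3D convolution layer (bias terms ignored)."""
    return c_in * c_out * k_t * k_h * k_w


def separate_conv3d_params(c_in, c_mid, c_out, k_t, k_h, k_w):
    """Weight count when the 3D kernel is split into a spatial
    1 x k_h x k_w convolution followed by a temporal k_t x 1 x 1 one."""
    spatial = conv3d_params(c_in, c_mid, 1, k_h, k_w)
    temporal = conv3d_params(c_mid, c_out, k_t, 1, 1)
    return spatial + temporal


# Hypothetical layer: 64 channels throughout, 3 x 3 x 3 kernel.
full = conv3d_params(64, 64, 3, 3, 3)
separate = separate_conv3d_params(64, 64, 64, 3, 3, 3)
print(full, separate)
```

Under these assumed sizes the factorized layer needs fewer than half the weights of the full 3D layer, and the spatial and temporal parts remain separate sets of parameters.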
|
2 |
OBJECT DETECTION IN DEEP LEARNING
Haoyu Shi (8100614) 10 December 2019
Advances in computing and the availability of GPUs (graphics processing units) for mathematical computation have made deep learning increasingly popular and prevalent. Object detection with deep learning, a part of image processing, plays an important role in autonomous driving and computer vision. Object detection comprises object localization and object classification. In object localization, the computer scans the image and returns the coordinates that localize the object. In object classification, the computer assigns targets to different categories. The traditional object detection pipeline comes from Fast/Faster R-CNN [32] [58]. The region proposal network generates the regions that contain objects and passes them to a classifier. The first step is object localization and the second is object classification. This two-step pipeline is not time-efficient. To address this problem, the You Only Look Once (YOLO) [4] network was introduced. YOLO is a single-network, end-to-end pipeline that processes images at 45 frames per second in real time during prediction. In this thesis, convolutional neural networks are introduced, including state-of-the-art convolutional neural networks of recent years. The YOLO implementation details are illustrated step by step. We adopt the YOLO network for our applications because it converges faster in training, provides high accuracy, and is an end-to-end architecture, which makes the network easy to optimize and train.
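Localization quality, that is, how well predicted coordinates match the true object, is commonly scored with intersection over union (IoU). The abstract does not say which measure the thesis uses, so the sketch below is an illustration under that assumption, with boxes given as hypothetical `(x_min, y_min, x_max, y_max)` tuples.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes,
    each given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 2 x 2 boxes overlapping in a 1 x 1 square.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A value of 1 means the predicted and true boxes coincide; a value of 0 means they do not overlap.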
|
3 |
DeepCNPP: Deep Learning Architecture to Distinguish the Promoter of Human Long Non-Coding RNA Genes and Protein-Coding Genes
Alam, Tanvir, Islam, Mohammad Tariqul, Househ, Mowafa, Belhaouari, Samir Brahim, Kawsar, Ferdaus Ahmed 01 January 2019
The promoter regions of protein-coding genes are gradually becoming well understood, yet no comparable studies exist for the promoters of long non-coding RNA (lncRNA) genes, which have emerged as potential global regulators in multiple cellular processes and various human diseases. To understand the difference in the transcriptional regulation patterns of these genes, we previously proposed a machine-learning-based model to classify the promoters of protein-coding genes and lncRNA genes. In this study, we present DeepCNPP (deep coding non-coding promoter predictor), an improved model based on a deep learning (DL) framework to classify the promoters of lncRNA genes and protein-coding genes. We used a convolutional neural network (CNN) based deep network to classify the promoters of these two broad categories of human genes. Our computational model, built on sequence information only, classified these two groups of human promoters with 83.34% accuracy and outperformed the existing model. Further analysis and interpretation of the output of the DeepCNPP architecture will enable us to understand the difference in transcriptional regulatory patterns between these two groups of genes.
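A model built on sequence information only needs the DNA sequence in numeric form before a CNN can process it. One-hot encoding is a common choice; the abstract does not say which encoding DeepCNPP uses, so the sketch below is an assumption, and the `one_hot` helper is a hypothetical name.

```python
BASES = "ACGT"


def one_hot(sequence):
    """Encode a DNA string as a list of 4-element rows, one row per base.
    Characters outside A, C, G, T (for example N) become all-zero rows."""
    rows = []
    for base in sequence.upper():
        rows.append([1.0 if base == b else 0.0 for b in BASES])
    return rows


for row in one_hot("ACGTN"):
    print(row)
```

The resulting length-by-4 matrix can be fed to a one-dimensional convolution, where each filter acts as a learned sequence-motif detector.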
|
4 |
Forecasting retweet count during elections using graph convolution neural networks
Vijayan, Raghavendran 31 May 2018
Indiana University-Purdue University Indianapolis (IUPUI)
|
5 |
The Automated Prediction of Solar Flares from SDO Images Using Deep Learning
Abed, Ali K., Qahwaji, Rami S.R., Abed, A. 21 March 2021
In the last few years, there has been growing interest in near-real-time solar data processing, especially for space weather applications. This is due to the impact of space weather on both space-borne and ground-based systems and industries, which subsequently affects our lives. In the current study, a deep learning approach is used to establish an automated hybrid computer system for short-term forecasting, based on the complexity level of the sunspot group on SDO/HMI Intensitygram images. The suggested system can generate a forecast of solar flare occurrences within the following 24 h. The input data for the proposed system are SDO/HMI full-disk Intensitygram images and SDO/HMI full-disk magnetogram images. The system output is a “Flare or Non-Flare” prediction of daily flare occurrences (C, M, and X classes). The system integrates an image processing system, which automatically detects sunspot groups on SDO/HMI Intensitygram images using active-region data extracted from SDO/HMI magnetogram images (presented by Colak and Qahwaji, 2008), with deep learning to generate these forecasts. Our deep learning-based system is designed to analyze sunspot groups on the solar disk and to predict whether each sunspot group is capable of releasing a significant flare. The system introduced in this work is called ASAP_Deep. Its deep learning model integrates a Convolutional Neural Network (CNN) and a Softmax classifier to extract special features from the sunspot group images detected from SDO/HMI (Intensitygram and magnetogram) images. Furthermore, a CNN training scheme is suggested that integrates a back-propagation algorithm for weight updates with a mini-batch AdaGrad optimization method for modifying learning rates. The images of the sunspot regions are cropped automatically by the imaging system and processed using deep learning rules to provide near real-time predictions.
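The AdaGrad method named above adapts the learning rate of each weight by dividing by the square root of that weight's accumulated squared gradients. The sketch below shows one such update on a toy problem. The learning rate, the epsilon value, and the objective are illustrative assumptions and are not taken from the study.

```python
import math


def adagrad_step(weights, grads, accum, lr=0.1, eps=1e-8):
    """One AdaGrad update. Returns the new weights and the new
    per-weight sums of squared gradients."""
    new_accum = [a + g * g for a, g in zip(accum, grads)]
    new_weights = [
        w - lr * g / (math.sqrt(a) + eps)
        for w, g, a in zip(weights, grads, new_accum)
    ]
    return new_weights, new_accum


# Toy objective (hypothetical): f(w) = w^2, whose gradient is 2w.
w0, acc0 = [1.0], [0.0]
w1, acc1 = adagrad_step(w0, [2.0 * w0[0]], acc0)
w2, acc2 = adagrad_step(w1, [2.0 * w1[0]], acc1)
print(w1, w2)
```

Because the accumulated sum only grows, the effective step for a frequently updated weight shrinks over time; here the second step is smaller than the first.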
The major results of this study are as follows. Firstly, the ASAP_Deep system builds on the ASAP system introduced in Colak and Qahwaji (2009) but improves it with an updated deep learning-based prediction capability. Secondly, we successfully apply a CNN to the sunspot group images without any pre-processing or feature extraction. Thirdly, our system results are considerably better, especially for the false alarm ratio (FAR); this reduces the losses resulting from the protection measures applied by companies. Also, the proposed system achieves relatively high scores for the True Skill Statistic (TSS) and the Heidke Skill Score (HSS).
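The three verification measures named above are all derived from the 2 x 2 contingency table of a binary forecast. The sketch below uses their standard definitions; the counts are made up for illustration and are not results from the study.

```python
def forecast_scores(tp, fp, fn, tn):
    """Standard skill scores for a binary (Flare / Non-Flare) forecast.
    tp: hits, fp: false alarms, fn: misses, tn: correct rejections.
    No guard against empty rows or columns is included in this sketch."""
    far = fp / (tp + fp)                   # false alarm ratio
    tss = tp / (tp + fn) - fp / (fp + tn)  # True Skill Statistic
    hss = 2.0 * (tp * tn - fn * fp) / (
        (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    )                                      # Heidke Skill Score
    return {"FAR": far, "TSS": tss, "HSS": hss}


# Hypothetical counts, for illustration only.
scores = forecast_scores(tp=40, fp=10, fn=10, tn=40)
print(scores)
```

A lower FAR is better, while TSS and HSS are better when higher, with 1 indicating a perfect forecast.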
|
6 |
Big Data Analysis of Bacterial Inhibitors in Parallelized Cellomics - A Machine Learning Approach
January 2016
Identifying chemical compounds that inhibit bacterial infection has recently gained considerable attention, given the increasing number of highly resistant bacteria and the serious health threat they pose around the world. With the development of automated microscopy and image analysis systems, the process of identifying novel therapeutic drugs can generate an immense amount of data, easily reaching terabytes of information. Despite the vast and growing amount of data currently generated, traditional analytical methods have not increased the overall success rate of identifying active chemical compounds that eventually become novel therapeutic drugs. Moreover, multispectral imaging has become ubiquitous in drug discovery because it provides valuable information on cellular and sub-cellular processes using fluorescent reagents. These reagents are often costly and toxic to cells over an extended period of time, which limits experimental design. Thus, there is a significant need for a more efficient process of identifying active chemical compounds.
This dissertation introduces novel machine learning methods based on parallelized cellomics to analyze interactions between cells, bacteria, and chemical compounds while reducing the use of fluorescent reagents. Machine learning analysis using image-based high-content screening (HCS) data is compartmentalized into three primary components: (1) Image Analytics, (2) Phenotypic Analytics, and (3) Compound Analytics. A novel software analytics tool called the Insights project is also introduced. The Insights project fully incorporates distributed processing, high performance computing, and database management that can rapidly and effectively utilize and store massive amounts of data generated using HCS biological assessments (bioassays). It is ideally suited for parallelized cellomics in high dimensional space.
Results demonstrate that a parallelized cellomics approach increases the quality of a bioassay while vastly decreasing the need for control data. The reduction in control data leads to less fluorescent reagent consumption. Furthermore, a novel proposed method that uses single-cell data points is proven to identify known active chemical compounds with a high degree of accuracy, despite traditional quality control measurements indicating the bioassay to be of poor quality. This, ultimately, decreases the time and resources needed in optimizing bioassays while still accurately identifying active compounds. / Dissertation/Thesis / Doctoral Dissertation Computer Science 2016
|
7 |
BRAIN-INSPIRED MACHINE LEARNING CLASSIFICATION MODELS
Amerineni, Rajesh 01 May 2020
This dissertation focuses on the development of three classes of brain-inspired machine learning classification models. The models attempt to emulate (a) multi-sensory integration, (b) context-integration, and (c) visual information processing in the brain. The multi-sensory integration models are aimed at enhancing object classification through the integration of semantically congruent unimodal stimuli. Two multimodal classification models are introduced: the feature integrating (FI) model and the decision integrating (DI) model. The FI model, inspired by multisensory integration in the subcortical superior colliculus, combines unimodal features which are subsequently classified by a multimodal classifier. The DI model, inspired by integration in primary cortical areas, classifies unimodal stimuli independently using unimodal classifiers and classifies the combined decisions using a multimodal classifier. The multimodal classifier models are implemented using multilayer perceptrons and multivariate statistical classifiers. Experiments involving the classification of noisy and attenuated auditory and visual representations of ten digits are designed to demonstrate the properties of the multimodal classifiers and to compare the performances of multimodal and unimodal classifiers. The experimental results show that the multimodal classification systems exhibit an important aspect of the “inverse effectiveness principle” by yielding significantly higher classification accuracies than the unimodal classifiers. Furthermore, the flexibility offered by the generalized models enables the simulation and evaluation of various combinations of multimodal stimuli and classifiers under varying uncertainty conditions.
The context-integrating model emulates the brain’s ability to use contextual information to uniquely resolve the interpretation of ambiguous stimuli. A deep learning neural network classification model that emulates this ability by integrating weighted bidirectional context into the classification process is introduced. The model, referred to as the CINET, is implemented using a convolutional neural network (CNN), which is shown to be ideal for combining target and context stimuli and for extracting coupled target-context features. The CINET parameters can be manipulated to simulate congruent and incongruent context environments and to manipulate target-context stimuli relationships. The formulation of the CINET is quite general; consequently, it is not restricted to stimuli in any particular sensory modality nor to the dimensionality of the stimuli. A broad range of experiments is designed to demonstrate the effectiveness of the CINET in resolving ambiguous visual stimuli and in improving the classification of non-ambiguous visual stimuli in various contextual environments. The fact that performance improves through the inclusion of context can be exploited to design robust brain-inspired machine learning algorithms. It is interesting to note that the CINET is a classification model inspired by a combination of the brain’s ability to integrate contextual information and the CNN, which is itself inspired by the hierarchical processing of visual information in the visual cortex.
A convolutional neural network (CNN) model, inspired by the hierarchical processing of visual information in the brain, is introduced to fuse information from an ensemble of multi-axial sensors in order to classify strikes such as boxing punches and taekwondo kicks in combat sports. Although CNNs are not an obvious choice for non-array data nor for signals with non-linear variations, it will be shown that CNN models can effectively classify multi-axial multi-sensor signals. Experiments involving the classification of three-axis accelerometer and three-axis gyroscope signals measuring boxing punches and taekwondo kicks showed that the performance of the fusion classifiers was significantly superior to that of the uni-axial classifiers. Interestingly, the classification accuracies of the CNN fusion classifiers were significantly higher than those of the DTW fusion classifiers. Through training with representative signals and the local feature extraction property, the CNNs tend to be invariant to latency shifts and non-linear variations. Moreover, by increasing the number of network layers and the size of the training set, the CNN classifiers offer the potential for even better performance as well as the ability to handle a larger number of classes. Finally, due to the generalized formulations, the classifier models can be easily adapted to classify multi-dimensional signals of multiple sensors in various other applications.
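The DTW classifiers used as a baseline above rely on dynamic time warping, which compares two signals while tolerating shifts and stretches in time. The sketch below is the textbook one-dimensional formulation with an absolute-difference cost. It is not the dissertation's implementation, and the sample signals are invented.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences,
    using |x - y| as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b holds
                cost[i][j - 1],      # b advances, a holds
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The second signal is the first one delayed by one sample.
print(dtw_distance([0, 1, 2, 1, 0], [0, 0, 1, 2, 1, 0]))
```

A delayed copy of a signal has zero DTW distance to the original, which is the kind of latency invariance the abstract attributes to the CNN classifiers as a learned property.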
|
8 |
Využitie pokročilých segmentačných metód pre obrazy z TEM mikroskopov / Using advanced segmentation methods for images from TEM microscopes
Mocko, Štefan January 2018
This master's thesis deals with the use of convolutional neural networks for segmentation in the field of transmission electron microscopy. It also describes the chosen neural network topology (U-NET), the augmentation techniques used, and the software environment. Thermo Fisher Scientific (formerly FEI Czech Republic s.r.o) provided the image data for this work. The segmentation results are presented as curves (ROC, PRC) and as numerical values (ARI, DSC, confusion matrix). The chosen U-NET topology achieved excellent results in pixel-wise segmentation. These results will most likely serve as a stepping stone for internal company research.
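Of the numerical measures listed above, the Dice similarity coefficient (DSC) is the simplest to state: twice the overlap of the predicted and reference masks, divided by the sum of their sizes. The sketch below computes it for binary masks stored as nested lists. The masks are invented examples, and returning 1.0 for two empty masks is a convention assumed here.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient of two binary masks
    given as equally sized 2-D lists of 0/1 values."""
    intersection = 0
    pred_sum = 0
    target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += p * t
            pred_sum += p
            target_sum += t
    denom = pred_sum + target_sum
    # Assumed convention: two empty masks count as a perfect match.
    return 2.0 * intersection / denom if denom else 1.0


pred = [[1, 1], [0, 0]]
target = [[1, 0], [0, 0]]
print(dice_coefficient(pred, target))
```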
|
9 |
Anticurtaining - obrazový filtr pro elektronovou mikroskopii / Anticurtaining - Image Filter for Electron Microscopy
Dvořák, Martin January 2021
Tomographic analysis with a focused ion beam (FIB) produces 3D images of the examined material at the nanoscale. This thesis presents a new machine learning approach to eliminating the curtain effect. A convolutional neural network, trained by supervised learning, is proposed for restoring the damaged images. The designed network works with features of the damaged image that are produced by the wavelet transform. The output is a visually clean image. The thesis also designs a synthetic data set for training the neural network, created by simulating the physical process that forms the real image. The simulation covers the preparation of the examined material by FIB milling and the imaging of its surface by a scanning electron microscope (SEM). The new approach works accurately on real images. The results are evaluated qualitatively by both amateurs and experts in the field, who anonymously compare this solution with another method of eliminating the curtaining effect. The solution is a new and promising approach to eliminating the curtaining effect and contributes to better processing of the images created during material analysis.
|
10 |
Hardware Acceleration of Video analytics on FPGA using OpenCL
January 2019
With the exponential growth in video content over the last few years, video analysis is becoming more crucial for many applications such as self-driving cars, healthcare, and traffic management. Most of these video analysis applications use deep learning algorithms such as convolutional neural networks (CNNs) because of their high accuracy in object detection. Enhancing the performance of CNN models therefore becomes crucial for video analysis. CNN models are computationally expensive and often require high-end graphics processing units (GPUs) for acceleration. However, for real-time applications in energy- and thermally-constrained environments such as traffic management, GPUs are less preferred because of their high power consumption and limited energy efficiency, and because they are difficult to fit in a small space.
To enable real-time video analytics in emerging large scale Internet of things (IoT) applications, the computation must happen at the network edge (near the cameras) in a distributed fashion. Thus, edge computing must be adopted. Recent studies have shown that field-programmable gate arrays (FPGAs) are highly suitable for edge computing due to their architecture adaptiveness, high computational throughput for streaming processing, and high energy efficiency.
This thesis presents a generic OpenCL-defined CNN accelerator architecture optimized for FPGA-based real-time video analytics on the edge. The proposed CNN OpenCL kernel adopts a highly pipelined and parallelized 1-D systolic array architecture, which exploits both spatial and temporal parallelism for energy-efficient CNN acceleration on FPGAs. The large fan-in and fan-out of computational units to the memory interface are identified as the limiting factor that causes scalability issues in existing designs, and solutions are proposed to resolve the issue with compiler automation. The proposed CNN kernel is highly scalable and parameterized by three architecture parameters, namely pe_num, reuse_fac, and vec_fac, which can be adapted to achieve 100% utilization of the coarse-grained computation resources (e.g., DSP blocks) for a given FPGA. The proposed CNN kernel is generic and can be used to accelerate a wide range of CNN models without recompiling the FPGA kernel hardware. The performance of Alexnet, Resnet-50, Retinanet, and Light-weight Retinanet has been measured with the proposed CNN kernel on an Intel Arria 10 GX1150 FPGA. The measurements show that the proposed CNN kernel, when mapped with 100% utilization of computation resources, achieves latencies of 11 ms, 84 ms, 1614.9 ms, and 990.34 ms for Alexnet, Resnet-50, Retinanet, and Light-weight Retinanet, respectively, when the input feature maps and weights are represented in 32-bit floating point. / Dissertation/Thesis / Masters Thesis Electrical Engineering 2019
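The reported latencies can be turned into approximate frame rates to judge real-time suitability. The conversion below assumes that frames are processed one at a time with no overlap between consecutive frames. The abstract does not state this, so the resulting frame rates are derived estimates and not measured results.

```python
# Latencies in milliseconds, as reported in the abstract.
latency_ms = {
    "Alexnet": 11.0,
    "Resnet-50": 84.0,
    "Retinanet": 1614.9,
    "Light-weight Retinanet": 990.34,
}


def frames_per_second(ms_per_frame):
    """Frame rate if each frame takes ms_per_frame and frames do not overlap."""
    return 1000.0 / ms_per_frame


fps = {name: frames_per_second(ms) for name, ms in latency_ms.items()}
for name, value in fps.items():
    print(f"{name}: {value:.2f} frames/s")
```

Under this assumption, only the Alexnet configuration exceeds a typical 30 frames/s video rate.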
|