121.
ESTIMATION OF DEPTH FROM DEFOCUS BLUR IN VIRTUAL ENVIRONMENTS COMPARING GRAPH CUTS AND CONVOLUTIONAL NEURAL NETWORK
Prodipto Chowdhury (5931032), 17 January 2019 (has links)
Depth estimation is one of the most important problems in computer vision. It has attracted a lot of attention because it has applications in many areas, such as robotics, VR and AR, and self-driving cars. Using the defocus blur of a camera lens is one method of depth estimation. In this thesis, we have investigated this technique in virtual environments, and virtual datasets have been created for this purpose.
In this research, we have applied graph cuts and a convolutional neural network (DfD-Net) to estimate depth from defocus blur using a natural (Middlebury) and a virtual (Maya) dataset. Graph cuts showed similar performance for both the natural and the virtual dataset in terms of NMAE and NRMSE. However, with regard to SSIM, the performance of graph cuts is 4% better for Middlebury than for Maya.
We have trained the DfD-Net using the natural dataset, the virtual dataset, and then a combination of both. The network trained on the virtual dataset performed best for both datasets.
The performance of graph cuts and DfD-Net has also been compared. Graph cuts performs 7% better than DfD-Net in terms of SSIM for Middlebury images, while for Maya images DfD-Net outperforms graph cuts by 2%. With regard to NRMSE, graph cuts and DfD-Net show similar performance for Maya images; for Middlebury images, graph cuts is 1.8% better. The algorithms show no difference in performance in terms of NMAE. DfD-Net generates depth maps 500 times faster than graph cuts for Maya images and 200 times faster for Middlebury images.
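For reference, the three comparison metrics used above can be computed as in the following minimal sketch, assuming depth maps are given as 2-D float arrays. Normalizing by the ground-truth depth range is an assumption here (the thesis may use a different normalization); SSIM comes from scikit-image.

```python
import numpy as np
from skimage.metrics import structural_similarity

def nmae(pred, gt):
    # Normalized mean absolute error, normalized by the ground-truth depth range.
    return np.mean(np.abs(pred - gt)) / (gt.max() - gt.min())

def nrmse(pred, gt):
    # Normalized root-mean-square error, same normalization convention.
    return np.sqrt(np.mean((pred - gt) ** 2)) / (gt.max() - gt.min())

def ssim(pred, gt):
    # Structural similarity between predicted and ground-truth depth maps.
    return structural_similarity(pred, gt, data_range=gt.max() - gt.min())
```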
122.
Low-Cost and Scalable Visual Drone Detection System Based on Distributed Convolutional Neural Network
Hyun Hwang (5930672), 20 December 2018 (has links)
Recently, with the advancement in drone technology, more and more hobby drones are being manufactured and sold across the world. However, these drones can be repurposed for use in illicit activities such as hostile-load delivery. At the moment, not many systems are readily available for detecting and intercepting hostile drones. Although researchers at Purdue University have built a working prototype of a drone interceptor system, it was not ready for the general public due to its proof-of-concept nature and the high price of the military-grade RADAR used in the prototype. It is essential to substitute such high-cost elements with low-cost ones to make a drone interception system affordable enough for large-scale deployment.

This study aims to provide an affordable alternative that substitutes an expensive, high-precision RADAR system with a Convolutional Neural Network based drone detection system, which can be built using multiple low-cost single-board computers. The experiment will evaluate the feasibility of the proposed system and the accuracy of its drone detection in a controlled environment.
123.
Deep learning based approaches for imitation learning
Hussein, Ahmed. January 2018 (has links)
Imitation learning refers to an agent's ability to mimic a desired behaviour by learning from observations. The field is rapidly gaining attention due to recent advances in computational and communication capabilities as well as rising demand for intelligent applications. The goal of imitation learning is to describe the desired behaviour by providing demonstrations rather than instructions. This enables agents to learn complex behaviours with general learning methods that require minimal task-specific information. However, imitation learning faces many challenges. The objective of this thesis is to advance the state of the art in imitation learning by adopting deep learning methods to address two major challenges of learning from demonstrations.

The first challenge is representing the demonstrations in a manner that is adequate for learning. We propose novel Convolutional Neural Network (CNN) based methods to automatically extract feature representations from raw visual demonstrations and learn to replicate the demonstrated behaviour. This alleviates the need for task-specific feature extraction and provides a general learning process that is adequate for multiple problems. The second challenge is generalizing a policy to situations unseen in the training demonstrations. This is a common problem because demonstrations typically show the best way to perform a task and do not offer any information about recovering from suboptimal actions. Several methods are investigated to improve the agent's generalization ability based on its initial performance. Our contributions in this area are threefold. Firstly, we propose an active data aggregation method that queries the demonstrator in situations of low confidence. Secondly, we investigate combining learning from demonstrations and reinforcement learning: a deep reward shaping method is proposed that learns a potential reward function from demonstrations. Finally, memory architectures in deep neural networks are investigated to provide context to the agent when taking actions; using recurrent neural networks addresses the dependency between the state-action sequences taken by the agent.

The experiments are conducted in simulated environments on 2D and 3D navigation tasks that are learned from raw visual data, as well as a 2D soccer simulator. The proposed methods are compared to state-of-the-art deep reinforcement learning methods. The results show that deep learning architectures can learn suitable representations from raw visual data and effectively map them to atomic actions. The proposed methods for addressing generalization show improvements over using supervised learning and reinforcement learning alone. The results are thoroughly analysed to identify the benefits of each approach and the situations in which it is most suitable.
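The active data aggregation idea above can be sketched as a confidence-gated DAgger-style loop. The `policy`, `expert`, and `env` interfaces and the confidence threshold below are hypothetical stand-ins, not the thesis's actual implementation:

```python
import numpy as np

def active_aggregation(policy, expert, env, dataset,
                       confidence_threshold=0.8, episodes=10):
    """Run the learned policy and query the demonstrator only in
    low-confidence states, aggregating the new labels into the dataset."""
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            probs = policy.predict(state)        # action distribution from the CNN
            if np.max(probs) < confidence_threshold:
                action = expert.label(state)     # query the demonstrator
                dataset.append((state, action))  # aggregate the new labeled example
            else:
                action = np.argmax(probs)
            state, done = env.step(action)
        policy.train(dataset)                    # retrain on the aggregated data
    return policy
```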
124.
Hardware Acceleration of Deep Convolutional Neural Networks on FPGA
January 2018 (has links)
The rapid improvement in computation capability has made deep convolutional neural networks (CNNs) a great success in recent years on many computer vision tasks with significantly improved accuracy. During the inference phase, many applications demand low-latency processing of one image with strict power consumption requirements, which reduces the efficiency of GPUs and other general-purpose platforms and brings opportunities for specialized acceleration hardware, e.g. FPGAs, by customizing the digital circuit for deep learning inference. However, deploying CNNs on portable and embedded systems is still challenging due to large data volumes, intensive computation, varying algorithm structures, and frequent memory accesses. This dissertation proposes a complete design methodology and framework to accelerate the inference process of various CNN algorithms on FPGA hardware with high performance, efficiency, and flexibility.
As convolution contributes most of the operations in CNNs, the convolution acceleration scheme significantly affects the efficiency and performance of a hardware CNN accelerator. Convolution involves multiply-and-accumulate (MAC) operations with four levels of loops. Without fully studying the convolution loop optimization before the hardware design phase, the resulting accelerator can hardly exploit data reuse or manage data movement efficiently. This work overcomes these barriers by quantitatively analyzing and optimizing the design objectives (e.g. memory access) of the CNN accelerator based on multiple design variables. An efficient dataflow and hardware architecture for CNN acceleration are proposed to minimize data communication while maximizing resource utilization to achieve high performance.
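To make the "four levels of loops" concrete, here is a naive software sketch of one convolutional layer; an accelerator would tile and unroll these loops in hardware rather than execute them sequentially. Array layouts are assumed, and the kernel-window and output-pixel index pairs are each counted as one loop level:

```python
import numpy as np

def conv_layer(ifmap, weights):
    """Naive convolution showing the four loop levels an accelerator must tile:
    output feature maps, input feature maps, output pixels, and kernel window."""
    num_out, num_in, K, _ = weights.shape   # output channels, input channels, kernel
    _, H, W = ifmap.shape
    ofmap = np.zeros((num_out, H - K + 1, W - K + 1))
    for no in range(num_out):               # level 4: output feature maps
        for ni in range(num_in):            # level 3: input feature maps
            for y in range(H - K + 1):      # level 2: output pixels (rows, cols)
                for x in range(W - K + 1):
                    for ky in range(K):     # level 1: kernel-window MACs
                        for kx in range(K):
                            ofmap[no, y, x] += (ifmap[ni, y + ky, x + kx]
                                                * weights[no, ni, ky, kx])
    return ofmap
```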
Although great performance and efficiency can be achieved by customizing the FPGA hardware for each CNN model, significant effort and expertise are required, leading to long development times, which makes it difficult to keep up with the rapid development of CNN algorithms. In this work, we present an RTL-level CNN compiler that automatically generates customized FPGA hardware for the inference tasks of various CNNs, in order to enable fast high-level prototyping of CNNs from software to FPGA while keeping the benefits of low-level hardware optimization. First, a general-purpose library of RTL modules is developed to model different operations at each layer. The integration and dataflow of the physical modules are predefined in the top-level system template and reconfigured during compilation for a given CNN algorithm. The runtime control of layer-by-layer sequential computation is managed by the proposed execution schedule, so that even highly irregular and complex network topologies, e.g. GoogLeNet and ResNet, can be compiled. The proposed methodology is demonstrated with various CNN algorithms, e.g. NiN, VGG, GoogLeNet, and ResNet, on two different standalone FPGAs, achieving state-of-the-art performance.
Based on the optimized acceleration strategy, there remain many design options, e.g. the degree and dimension of computation parallelism, the size of on-chip buffers, and the external memory bandwidth, which impact the utilization of computation resources and the efficiency of data communication, and ultimately the performance and energy consumption of the accelerator. The accelerator's large design space makes it impractical to explore the optimal design choice during the real implementation phase. Therefore, a performance model is proposed in this work to quantitatively estimate the accelerator's performance and resource utilization. By this means, the performance bottleneck and design bounds can be identified, and the optimal design option can be explored early in the design phase.
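A performance model of this kind can, at its simplest, bound each layer's latency by whichever is slower: compute on the parallel MAC units or external memory traffic. The sketch below is a roofline-style estimate under assumed parallelism, clock, and bandwidth figures; all numbers are illustrative, not the dissertation's:

```python
def layer_latency(macs, data_bytes, num_pes, freq_hz, bandwidth_bytes_s):
    """Roofline-style estimate: a layer is bound by the slower of
    computation on the MAC array and external memory traffic."""
    compute_time = macs / (num_pes * freq_hz)      # seconds of MAC work
    memory_time = data_bytes / bandwidth_bytes_s   # seconds of DRAM traffic
    return max(compute_time, memory_time)

# Illustrative example: a layer with 1e9 MACs and 4 MB of off-chip traffic,
# on 1024 MAC units at 200 MHz with 12.8 GB/s external bandwidth.
print(layer_latency(1e9, 4e6, 1024, 200e6, 12.8e9))  # ~4.9 ms, compute-bound
```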
125.
Non-Contact Evaluation Methods for Infrastructure Condition Assessment
Dorafshan, Sattar. 01 December 2018
United States infrastructure, e.g. roads and bridges, is in critical condition. Inspection, monitoring, and maintenance of this infrastructure in the traditional manner can be expensive, dangerous, time-consuming, and tied to the human judgment of the inspector. Non-contact methods can help overcome these challenges. In this dissertation, two aspects of non-contact methods are explored: inspections using unmanned aerial systems (UASs), and condition assessment using image processing and machine learning techniques. This work presents a set of investigations to determine a guideline for remote autonomous bridge inspections.
126.
Human Activity Recognition and Prediction using RGBD Data
Coen, Paul Dixon. 01 August 2019
Being able to predict and recognize human activities is an essential element for us to effectively communicate with other humans during our day-to-day activities. A system that can do this has a number of appealing applications, from assistive robotics to health care and preventative medicine. Previous work in supervised video-based human activity prediction and detection fails to capture the richness of the spatiotemporal data that these activities generate. Convolutional Long Short-Term Memory (Convolutional LSTM) networks are a useful tool for analyzing this type of data, showing good results in many other areas. This thesis focuses on utilizing RGB-D data to improve human activity prediction and recognition, and introduces a modified Convolutional LSTM network to do so. Experiments are performed on the network and compared to other models in use as well as the current state-of-the-art system. We show that our proposed model for human activity prediction and recognition outperforms the current state-of-the-art models on the CAD-120 dataset without being given bounding frames or ground truths about objects.
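For orientation, a minimal Keras sketch of a Convolutional LSTM stack over RGB-D clips follows. The input shape (16 frames of 64x64 pixels with 4 channels: RGB plus depth), layer sizes, and class count are assumptions, not the thesis's actual architecture:

```python
import tensorflow as tf

# Assumed input: clips of 16 frames, 64x64 pixels, 4 channels (RGB + depth).
model = tf.keras.Sequential([
    tf.keras.layers.ConvLSTM2D(32, kernel_size=3, padding="same",
                               return_sequences=True,
                               input_shape=(16, 64, 64, 4)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.ConvLSTM2D(32, kernel_size=3, padding="same"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),  # 10 hypothetical activities
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```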
127.
Advanced Imaging Analysis for Predicting Tumor Response and Improving Contour Delineation Uncertainty
Mahon, Rebecca N. 01 January 2018
Radiomics, an advanced form of imaging analysis, is a growing field of interest in medicine. Radiomics seeks to extract quantitative information from images through the use of computer vision techniques to assist in improving treatment. Early prediction of treatment response is one way of improving overall patient care. This work explores the feasibility of building predictive models from radiomic texture features extracted from magnetic resonance (MR) and computed tomography (CT) images of lung cancer patients. First, repeatable primary tumor texture features from each imaging modality were identified to ensure a sufficient number of repeatable features existed for model development. Then a workflow was developed to build models that predict overall survival and local control using single-modality and multi-modality radiomic features. The workflow was also applied to normal tissue contours as a control study. Multiple significant models were identified for the single-modality MR- and CT-based approaches, while the multi-modality models were promising, indicating that exploration with a larger cohort is warranted.
Another way advances in imaging analysis can be leveraged is in improving the accuracy of contours. Unfortunately, the tumor can be close in appearance to normal tissue on medical images, creating high uncertainty in the tumor boundary. As the entire defined target is treated, providing physicians with additional information when delineating the target volume can improve the accuracy of the contour and potentially reduce the amount of normal tissue incorporated into it. Convolutional neural networks were developed and trained to identify the tumor's interface with normal tissue, with one network also trained to identify the tumor location. A mock tool was presented that uses the networks' output to provide the physician with the uncertainty in the predicted interface type and the probability that the contour delineation uncertainty exceeds 5 mm for the top three predictions.
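The repeatability screen that precedes model building can be sketched as below: texture features whose values are stable across repeat (test-retest) scans are kept for modeling. Scoring with Lin's concordance correlation coefficient and the 0.9 threshold are assumptions, not necessarily the dissertation's choices:

```python
import numpy as np

def concordance_cc(x, y):
    """Lin's concordance correlation coefficient between two measurement runs."""
    mx, my = x.mean(), y.mean()
    cov = np.mean((x - mx) * (y - my))
    return 2 * cov / (x.var() + y.var() + (mx - my) ** 2)

def repeatable_features(scan1, scan2, threshold=0.9):
    """scan1, scan2: (patients, features) arrays of texture features from
    test and retest images. Returns indices of repeatable features."""
    ccc = np.array([concordance_cc(scan1[:, j], scan2[:, j])
                    for j in range(scan1.shape[1])])
    return np.where(ccc >= threshold)[0]
```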
128.
Metadata Validation Using a Convolutional Neural Network: Detection and Prediction of Fashion Products
Nilsson Harnert, Henrik. January 2019
In the e-commerce industry, importing data from third-party clothing brands requires validation of this data. If the validation is done manually, it is a tedious and time-consuming task. Part of this task can be replaced or assisted by using computer vision to automatically find clothing types, such as T-shirts and pants, within imported images. Once a clothing type is detected, it is possible to recommend the products most likely to correspond to the imported data, with a certain accuracy. This was done alongside a prototype interface that can be used to start training, find clothing types in an image, and mask annotations of products. Annotations are areas describing different clothing types and are used to train an object detector model. A model for finding clothing types is trained with the Mask R-CNN object detector and achieves 0.49 mAP. A detection takes just over one second on an Nvidia GTX 1070 8 GB graphics card. Recommending one or several products based on a detection takes 0.5 seconds, using the k-nearest neighbors algorithm. If prediction is done on products that were used to build the prediction model, almost perfect accuracy is achieved, while images of other products do not achieve nearly as good results.
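The recommendation step described above can be sketched as follows: feature vectors for catalog products are indexed, and each detected garment is matched to its nearest products. The feature representation (e.g. pooled CNN features) and catalog size are assumptions for illustration:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Assumed: one 256-dim feature vector per catalog product (placeholder data).
product_features = np.random.rand(1000, 256)
index = NearestNeighbors(n_neighbors=5).fit(product_features)

def recommend(detection_feature):
    """Return ids and distances of the 5 catalog products closest
    to the feature vector of a detected garment."""
    distances, ids = index.kneighbors(detection_feature.reshape(1, -1))
    return ids[0], distances[0]
```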
129.
Iterative cerebellar segmentation using convolutional neural networks
Gerard, Alex Michael. 01 December 2018
Convolutional neural networks (ConvNets) have quickly become the most widely used tool for image perception and interpretation tasks over the past several years. The single most important resource needed for training a ConvNet that will successfully generalize to unseen examples is an adequately sized labeled dataset. In many interesting medical imaging cases, the available training data is not of sufficient size or quality for directly training a ConvNet, and access to the expertise needed to manually label such datasets is often infeasible. To address these barriers, we investigate a method for iterative refinement of ConvNet training. Initially, unlabeled images are obtained, minimal labeling is performed, and a model is trained on the sparse manual labels. At the end of each training iteration, full images are predicted, and additional manual labels are identified to improve the training dataset.
In this work, we show how to use patch-based ConvNets to iteratively build a training dataset for automatically segmenting MRI images of the human cerebellum. We construct this training dataset using a small collection of high-resolution 3D images and transfer the resulting model to a much larger, much lower-resolution collection of images. Both T1-weighted and T2-weighted MRI modalities are utilized to capture the additional features that arise from the differences in contrast between modalities. The objective is to perform tissue-level segmentation, classifying each volumetric pixel (voxel) in an image as white matter, gray matter, or cerebrospinal fluid (CSF). We present performance results on the lower-resolution dataset, reporting a 12.7% improvement in accuracy over the existing segmentation method, expectation maximization. Further, we present example segmentations from our iterative approach that demonstrate its ability to detect white matter branching near the outer regions of the anatomy, which agrees with the known biological structure of the cerebellum and has typically eluded traditional segmentation algorithms.
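The iterative refinement loop described above can be sketched schematically. The `convnet` interface and the `review_and_correct` step (a stand-in for the manual relabeling pass) are hypothetical:

```python
def iterative_refinement(convnet, images, sparse_labels, rounds=3):
    """Iterative training-set construction: train on the current labels,
    predict full volumes, then fold manually corrected labels back in."""
    labels = dict(sparse_labels)             # voxel -> {WM, GM, CSF} class
    for _ in range(rounds):
        convnet.train(images, labels)
        predictions = {i: convnet.predict(img) for i, img in images.items()}
        # A rater reviews the predictions and supplies additional manual
        # labels; review_and_correct is a hypothetical stand-in for that step.
        labels.update(review_and_correct(predictions))
    return convnet
```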
130.
Human Activity Recognition Based on Transfer Learning
Pang, Jinyong. 06 July 2018
Human activity recognition (HAR) based on time-series data is the problem of classifying various patterns. Its wide application in health care carries huge commercial benefit. With the increasing spread of smart devices, people have a strong desire for services and products customized to their own characteristics. Deep learning models can handle HAR tasks with satisfactory results. However, training a deep learning model consumes a great deal of time and computational resources, so developing a HAR system efficiently becomes a challenging task. In this study, we develop a solid HAR system using a Convolutional Neural Network based on transfer learning, which can eliminate those barriers.
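A minimal Keras sketch of the transfer-learning idea: convolutional layers pretrained on a large source HAR corpus are frozen, and only a small classification head is retrained on the target data. The window shape (128 timesteps of 3-axis accelerometer data), layer sizes, and activity count are assumptions:

```python
import tensorflow as tf

# Source feature extractor: 1-D CNN assumed pretrained on a large HAR corpus.
base = tf.keras.Sequential([
    tf.keras.layers.Conv1D(64, 5, activation="relu", input_shape=(128, 3)),
    tf.keras.layers.Conv1D(64, 5, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
])
base.trainable = False                    # freeze the pretrained layers

# Target model: reuse frozen features, retrain only the small head.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(6, activation="softmax"),  # 6 hypothetical activities
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```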