131 |
3D Visualization of MPC-based Algorithms for Autonomous Vehicles
Sörliden, Pär January 2019
Autonomous vehicles are an active research topic in both academia and industry worldwide. Linköping University is no exception, and some of its research uses Model Predictive Control (MPC) to plan paths for and control autonomous vehicles. Additionally, different methods (for example, deep learning or likelihood) are used to calculate collision probabilities for obstacles. These algorithms are very complex, and it is not always easy to see how they work. It is therefore interesting to study whether a visualization tool that presents the algorithms in three dimensions can help in understanding them and in developing them. This project consisted of implementing such a tool using a 3D library and evaluating it both analytically and empirically. The evaluation showed positive results: the proposed tool is helpful when developing algorithms for autonomous vehicles, but some aspects of the algorithms need more research on how they could be visualized. This concerns the neural networks, which proved difficult to visualize, especially given the available data. More information about the internal variables of the networks would be needed to visualize them better.
|
132 |
Skin lesion segmentation and classification using deep learning
Unknown Date
Melanoma, a severe and life-threatening skin cancer, is commonly misdiagnosed
or left undiagnosed. Advances in artificial intelligence, particularly deep learning,
have enabled the design and implementation of intelligent solutions to skin lesion
detection and classification from visible light images, which are capable of performing
early and accurate diagnosis of melanoma and other types of skin diseases. This work
presents solutions to the problems of skin lesion segmentation and classification. The
proposed classification approach leverages convolutional neural networks and transfer
learning. Additionally, the impact of segmentation (i.e., isolating the lesion from the
rest of the image) on the performance of the classifier is investigated, leading to the
conclusion that there is an optimal region between “dermatologist segmented” and
“not segmented” that produces the best results, suggesting that the context around a
lesion is helpful as the model is trained and built. Generative adversarial networks,
in the context of extending limited datasets by creating synthetic samples of skin
lesions, are also explored. The robustness and security of skin lesion classifiers using
convolutional neural networks are examined and stress-tested by implementing
adversarial examples. / Includes bibliography. / Thesis (M.S.)--Florida Atlantic University, 2018. / FAU Electronic Theses and Dissertations Collection
|
133 |
Parallel Distributed Deep Learning on Cluster Computers
Unknown Date
Deep Learning is an increasingly important subdomain of artificial intelligence.
Deep Learning architectures, artificial neural networks characterized by having both
a large breadth of neurons and a large depth of layers, benefit from training on Big
Data. The size and complexity of the model combined with the size of the training
data makes the training procedure very computationally and temporally expensive.
Accelerating the training procedure of Deep Learning using cluster computers faces
many challenges ranging from distributed optimizers to the large communication overhead
specific to a system with off-the-shelf networking components. In this thesis, we
present a novel synchronous data parallel distributed Deep Learning implementation
on HPCC Systems, a cluster computer system. We discuss research that has been
conducted on the distribution and parallelization of Deep Learning, as well as the
concerns relating to cluster environments. Additionally, we provide case studies that
evaluate and validate our implementation. / Includes bibliography. / Thesis (M.S.)--Florida Atlantic University, 2018. / FAU Electronic Theses and Dissertations Collection
|
134 |
Using Deep Learning Semantic Segmentation to Estimate Visual Odometry
Unknown Date
In this research, image segmentation and visual odometry estimations in real time
are addressed, and two main contributions were made to this field. First, a new image
segmentation and classification algorithm named DilatedU-NET is introduced. This deep
learning-based algorithm is able to process seven frames per second and achieves over
84% accuracy on the Cityscapes dataset. Secondly, a new method to estimate visual
odometry is introduced. Using the KITTI benchmark dataset as a baseline, the visual
odometry error was more significant than could be accurately measured. However, the
method's robust frame rate compensated for this: it processes 15 frames per second. / Includes bibliography. / Thesis (M.S.)--Florida Atlantic University, 2018. / FAU Electronic Theses and Dissertations Collection
|
135 |
Encoder-decoder neural networks
Kalchbrenner, Nal January 2017
This thesis introduces the concept of an encoder-decoder neural network and develops architectures for the construction of such networks. Encoder-decoder neural networks are probabilistic conditional generative models of high-dimensional structured items such as natural language utterances and natural images. Encoder-decoder neural networks estimate a probability distribution over structured items belonging to a target set conditioned on structured items belonging to a source set. The distribution over structured items is factorized into a product of tractable conditional distributions over individual elements that compose the items. The networks estimate these conditional factors explicitly. We develop encoder-decoder neural networks for core tasks in natural language processing and natural image and video modelling. In Part I, we tackle the problem of sentence modelling and develop deep convolutional encoders to classify sentences; we extend these encoders to models of discourse. In Part II, we go beyond encoders to study the longstanding problem of translating from one human language to another. We lay the foundations of neural machine translation, a novel approach that views the entire translation process as a single encoder-decoder neural network. We propose a beam search procedure to search over the outputs of the decoder to produce a likely translation in the target language. Besides known recurrent decoders, we also propose a decoder architecture based solely on convolutional layers. Since the publication of these new foundations for machine translation in 2013, encoder-decoder translation models have been richly developed and have displaced traditional translation systems both in academic research and in large-scale industrial deployment. In services such as Google Translate these models process in the order of a billion translation queries a day. 
In Part III, we shift from the linguistic domain to the visual one to study distributions over natural images and videos. We describe two- and three-dimensional recurrent and convolutional decoder architectures and address the longstanding problem of learning a tractable distribution over high-dimensional natural images and videos, where the likely samples from the distribution are visually coherent. The empirical validation of encoder-decoder neural networks as state-of-the-art models of tasks ranging from machine translation to video prediction has a two-fold significance. On the one hand, it validates the notions of assigning probabilities to sentences or images and of learning a distribution over a natural language or a domain of natural images; it shows that a probabilistic principle of compositionality, whereby a high-dimensional item is composed from individual elements at the encoder side and whereby a corresponding item is decomposed into conditional factors over individual elements at the decoder side, is a general method for modelling cognition involving high-dimensional items; and it suggests that the relations between the elements are best learnt in an end-to-end fashion as non-linear functions in distributed space. On the other hand, the empirical success of the networks on the tasks characterizes the underlying cognitive processes themselves: a cognitive process as complex as translating from one language to another that takes a human a few seconds to perform correctly can be accurately modelled via a learnt non-linear deterministic function of distributed vectors in high-dimensional space.
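The factorization described in this abstract, in which the distribution over a target item is a product of tractable conditionals over its elements, can be written out as follows. The symbols are assumptions introduced here for illustration and do not come from the thesis: $s$ is the source item, $t = (t_1, \dots, t_n)$ is the target item, and $t_i$ are its individual elements (for example, words or pixels).

```latex
p(t \mid s) \;=\; \prod_{i=1}^{n} p\bigl(t_i \mid t_1, \dots, t_{i-1},\, s\bigr)
```

Under this reading, the encoder summarizes $s$, and the decoder estimates each conditional factor explicitly, one element at a time.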
|
136 |
Towards Personalized Learning using Counterfactual Inference for Randomized Controlled Trials
Zhao, Siyuan 26 April 2018
Personalized learning considers that the causal effects of a studied learning intervention may differ for the individual student (e.g., maybe girls do better with video hints while boys do better with text hints). To evaluate a learning intervention inside ASSISTments, we run a randomized controlled trial (RCT) by randomly assigning students to either a control condition or a treatment condition. Making inferences about the causal effects of studied interventions is a central problem. Counterfactual inference answers “What if” questions, such as "Would this particular student benefit more if the student were given the video hint instead of the text hint when the student cannot solve a problem?". Counterfactual prediction provides a way to estimate individual treatment effects and helps us assign students to the learning intervention that leads to better learning. A variant of Michael Jordan's "Residual Transfer Networks" was proposed for counterfactual inference. The model first uses feed-forward neural networks to learn a balancing representation of students by minimizing the distance between the distributions of the control and the treated populations, and then adopts a residual block to estimate the individual treatment effect. Students in the RCT have usually done a number of problems before participating in it, so each student has a sequence of actions (a performance sequence). We proposed a pipeline that uses the performance sequence to improve the performance of counterfactual inference. Since deep learning has achieved considerable success in learning representations from raw logged data, student representations were learned by applying a sequence autoencoder to the performance sequences. These representations were then incorporated into the model for counterfactual inference. Empirical results showed that the representations learned by the sequence autoencoder improved the performance of counterfactual inference.
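For readers unfamiliar with the terminology, the quantity estimated here and the balancing idea can be sketched in standard potential-outcomes notation. All symbols are assumptions introduced for illustration: $x$ is a student's features, $Y_1$ and $Y_0$ are the outcomes under treatment and control, $\Phi$ is the learned representation, $h$ is the outcome predictor, $T_i$ is the assigned condition, $\mathrm{dist}$ is some distance between distributions, and $\alpha$ is a weighting constant. The abstract does not state the exact objective, so the second line is a generic form and not the thesis's loss.

```latex
\tau(x) \;=\; \mathbb{E}\bigl[\,Y_1 - Y_0 \mid X = x\,\bigr]
\qquad
\min_{\Phi,\,h}\;\; \frac{1}{N}\sum_{i=1}^{N} \ell\bigl(h(\Phi(x_i), T_i),\, y_i\bigr)
\;+\; \alpha \,\mathrm{dist}\bigl(\{\Phi(x_i)\}_{T_i = 0},\; \{\Phi(x_i)\}_{T_i = 1}\bigr)
```

The first expression is the individual treatment effect; the second penalizes representations under which control and treated students look different.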
|
137 |
A Machine Learning approach to Febrile Classification
Kostopouls, Theodore P 25 April 2018
General health screening is needed to decrease the risk of pandemic in high-volume areas. Thermal characterization via infrared imaging is an effective technique for fever detection; however, strict use requirements in combination with highly controlled environmental conditions compromise the practicality of such a system. Applying advanced processing techniques to thermograms of individuals can remove some of these requirements, allowing for more flexible classification algorithms. The purpose of this research was to identify individuals with febrile status using modern thermal imaging and machine learning techniques in a minimally controlled setting. Two methods were evaluated on data that contained environmental and acclimation noise due to the data gathering technique. The first was a pretrained VGG16 convolutional neural network, which achieved an F1 score of 0.77 (accuracy of 76%) on a balanced dataset. The second was a VGG16 feature extractor whose outputs are passed to a principal component analysis, followed by a support vector machine for classification. This technique obtained an F1 score of 0.84 (accuracy of 85%) on balanced datasets. These results demonstrate that machine learning is an extremely viable technique for classifying febrile status independent of the associated noise.
|
138 |
Why did they cite that?
Lovering, Charles 26 April 2018
We explore a machine learning task, evidence recommendation (ER), the extraction of evidence from a source document to support an external claim. This task is an instance of the question answering machine learning task. We apply ER to academic publications because they cite other papers for the claims they make. Reading cited papers to corroborate claims is time-consuming and an automated ER tool could expedite it. Thus, we propose a methodology for collecting a dataset of academic papers and their references. We explore deep learning models for ER and achieve 77% accuracy with pairwise models and 75% pairwise accuracy with document-wise models.
|
139 |
Parameter Continuation with Secant Approximation for Deep Neural Networks
Pathak, Harsh Nilesh 03 December 2018
Non-convex optimization of deep neural networks is a well-researched problem. We present a novel application of continuation methods for deep learning optimization that can potentially arrive at a better solution. In our method, we first decompose the original optimization problem into a sequence of problems using a homotopy method. To achieve this in neural networks, we derive the Continuation (C)-Activation function. First, C-Activation is a homotopic formulation of existing activation functions such as Sigmoid, ReLU or Tanh. Second, we apply a method which is standard in the parameter continuation domain but, to the best of our knowledge, novel to the deep learning domain. In particular, we use Natural Parameter Continuation with Secant approximation (NPCS), an effective training strategy that may find a superior local minimum for a non-convex optimization problem. Additionally, we extend our work on Step-up GANs, a data continuation approach, by deriving a method called Continuous (C)-SMOTE, which is an extension of standard oversampling algorithms. We demonstrate the improvements made by our methods and establish a categorization of recent work done on continuation methods in the context of deep learning.
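The continuation idea above can be illustrated with a toy sketch. The abstract does not give the definition of C-Activation, so the convex combination of the identity and a base activation used below is an assumed form, and `c_activation`, `solve`, and `continuation` are hypothetical names. The sketch fits a single weight while the homotopy parameter `lam` moves from 0 (linear model) to 1 (tanh model), and uses a secant predictor to warm-start each stage.

```python
import math


def c_activation(x, lam, base=math.tanh):
    """Assumed homotopy: identity at lam=0, the base activation at lam=1."""
    return (1.0 - lam) * x + lam * base(x)


def loss_grad(w, lam, data):
    """Gradient w.r.t. w of the mean squared error of c_activation(w*x, lam)."""
    g = 0.0
    for x, y in data:
        z = w * x
        pred = c_activation(z, lam)
        # Derivative of the assumed homotopy with tanh as the base activation.
        dpred = (1.0 - lam) + lam * (1.0 - math.tanh(z) ** 2)
        g += 2.0 * (pred - y) * dpred * x
    return g / len(data)


def solve(w, lam, data, steps=200, lr=0.5):
    """Corrector: plain gradient descent at a fixed homotopy parameter."""
    for _ in range(steps):
        w -= lr * loss_grad(w, lam, data)
    return w


def continuation(data, lambdas):
    """Natural parameter continuation with a secant predictor (uniform steps)."""
    history = []
    w = 0.0
    for lam in lambdas:
        if len(history) >= 2:
            # Secant predictor: extrapolate through the last two solutions.
            w = 2.0 * history[-1] - history[-2]
        w = solve(w, lam, data)
        history.append(w)
    return history


# Toy data generated by tanh(1.5 * x); the final stage should recover w near 1.5.
data = [(x, math.tanh(1.5 * x)) for x in (-1.0, -0.5, 0.5, 1.0)]
path = continuation(data, [0.0, 0.25, 0.5, 0.75, 1.0])
print(path)
```

The design choice worth noting is that each stage starts from an extrapolation of earlier solutions instead of from scratch, which is the role the secant approximation plays in a continuation scheme.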
|
140 |
Image processing and forward propagation using binary representations, and robust audio analysis using deep learning
Pedersoli, Fabrizio 15 March 2019
The work presented in this thesis consists of three main topics:
document segmentation and classification into text and score,
efficient computation with binary representations, and deep learning
architectures for polyphonic music transcription and classification.
In the case of musical documents, an important
problem is separating text from musical score by detecting the
corresponding bounding boxes. A new algorithm is
proposed for pixel-wise classification of digital documents into musical
score and text. It is based on a bag-of-visual-words approach and
random forest classification. A robust technique for identifying
bounding boxes of text and music score from the pixel-wise
classification is also proposed.
For efficient processing of learned models, we turn our attention to
binary representations. When dealing with binary data, the use of
bit-packing and bit-wise computation can reduce computational time and
memory requirements considerably. Efficiency is a key factor when
processing large scale datasets and in industrial applications.
We propose a bit-packed representation for binary images that encodes
both pixels and square neighborhoods, and design SPmat, an optimized
framework for binary image processing, around it.
Bit-packing and bit-wise computation can also be used for efficient
forward propagation in deep neural networks. Quantized deep neural
networks have recently been proposed with the goal of improving
computation time and memory requirements while
preserving classification performance as far as possible. A particular
type of quantized neural network is the binary neural network, in which
the weights and activations are constrained to $-1$ and $+1$. In this
thesis, we describe and evaluate Espresso, a novel optimized framework
for fast inference of binary neural networks that takes advantage of
bit-packing and bit-wise computations. Espresso is self-contained,
written in C/CUDA and provides optimized implementations of all the
building blocks needed to perform forward propagation.
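The bit-packing idea can be sketched in a few lines. This is a minimal illustration of the general XNOR-and-popcount technique for vectors with entries in $-1$ and $+1$, not Espresso's C/CUDA implementation; `pack_bits` and `binary_dot` are hypothetical names, and the mapping of $+1$ to bit 1 and $-1$ to bit 0 is an assumption.

```python
def pack_bits(values):
    """Pack a sequence of -1/+1 values into one integer (+1 -> bit 1, -1 -> bit 0)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
        elif v != -1:
            raise ValueError("values must be -1 or +1")
    return word


def binary_dot(a_word, b_word, n):
    """Dot product of two packed {-1, +1} vectors of length n."""
    mask = (1 << n) - 1
    # XNOR: a bit is 1 exactly where the two vectors agree.
    agree = ~(a_word ^ b_word) & mask
    matches = bin(agree).count("1")  # popcount
    # Each agreement contributes +1 and each disagreement -1.
    return 2 * matches - n


a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]
packed = binary_dot(pack_bits(a), pack_bits(b), len(a))
reference = sum(x * y for x, y in zip(a, b))
print(packed, reference)
```

A whole vector of multiply-accumulate operations collapses into one XOR, one negation, and one population count per machine word, which is where the time and memory savings come from.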
Motivated by their recent success, we further investigate deep neural
networks, which have achieved state-of-the-art results and
outperformed traditional machine learning methods in many applications,
such as computer vision, speech recognition, and machine translation.
However, in the case of music information retrieval (MIR) and audio
analysis, shallow neural networks are commonly used. The
effectiveness of deep and very deep architectures for MIR and audio
tasks has not been explored in detail. It is also not clear what is
the best input representation for a particular task. We therefore
investigate deep neural networks for the following audio analysis
tasks: polyphonic music transcription, musical genre classification,
and urban sound classification. We analyze the performance of common
classification network architectures using different input
representations, paying specific attention to residual networks. We
also evaluate the robustness of these models in case of degraded audio
using different combinations of training/testing data. Through
experimental evaluation we show that residual networks provide
consistent performance improvements when analyzing degraded audio
across different representations and tasks. Finally, we present a
convolutional architecture based on U-Net that can improve polyphonic
music transcription performance of different baseline transcription
networks. / Graduate
|