71

Efficient and Robust Deep Learning through Approximate Computing

Sanchari Sen (9178400) 28 July 2020 (has links)
Deep Neural Networks (DNNs) have greatly advanced the state-of-the-art in a wide range of machine learning tasks involving image, video, speech and text analytics, and are deployed in numerous widely-used products and services. Improvements in the capabilities of hardware platforms such as Graphics Processing Units (GPUs) and specialized accelerators have been instrumental in enabling these advances, as they have allowed more complex and accurate networks to be trained and deployed. However, the enormous computational and memory demands of DNNs continue to increase with growing data size and network complexity, posing a continuing challenge to computing system designers. For instance, state-of-the-art image recognition DNNs require hundreds of millions of parameters and hundreds of billions of multiply-accumulate operations, while state-of-the-art language models require hundreds of billions of parameters and several trillion operations to process a single input instance. Another major obstacle to the adoption of DNNs, despite their impressive accuracies on a range of datasets, has been their lack of robustness. Specifically, recent efforts have demonstrated that small, carefully-introduced input perturbations can force a DNN to behave in unexpected and erroneous ways, which can have severe consequences in safety-critical DNN applications such as healthcare and autonomous vehicles. In this dissertation, we explore approximate computing as an avenue to improve the speed and energy efficiency of DNNs, as well as their robustness to input perturbations.

Approximate computing involves executing selected computations of an application in an approximate manner, while generating favorable trade-offs between computational efficiency and output quality. The intrinsic error resilience of machine learning applications makes them excellent candidates for approximate computing, allowing us to achieve execution time and energy reductions with minimal effect on the quality of outputs. This dissertation performs a comprehensive analysis of different approximate computing techniques for improving the execution efficiency of DNNs. Complementary to generic approximation techniques like quantization, it identifies approximation opportunities based on the specific characteristics of three popular classes of networks - Feed-forward Neural Networks (FFNNs), Recurrent Neural Networks (RNNs) and Spiking Neural Networks (SNNs) - which vary considerably in their network structure and computational patterns.

First, in the context of feed-forward neural networks, we identify sparsity, or the presence of zero values in the data structures (activations, weights, gradients and errors), to be a major source of redundancy and therefore an easy target for approximations. We develop lightweight micro-architectural and instruction set extensions to a general-purpose processor core that enable it to dynamically detect zero values when they are loaded and skip future instructions that are rendered redundant by them. Next, we explore LSTMs (the most widely used class of RNNs), which map sequences from an input space to an output space. We propose hardware-agnostic approximations that dynamically skip redundant symbols in the input sequence and discard redundant elements in the state vector to achieve execution time benefits. Following that, we consider SNNs, an emerging class of neural networks that represent and process information in the form of sequences of binary spikes. Observing that spike-triggered updates along synaptic connections are the dominant operation in SNNs, we propose hardware and software techniques to identify connections that minimally impact the output quality and deactivate them dynamically, skipping any associated updates.

The dissertation also delves into the efficacy of combining multiple approximate computing techniques to improve the execution efficiency of DNNs. In particular, we focus on the combination of quantization, which reduces the precision of DNN data structures, and pruning, which introduces sparsity in them. We observe that the ability of pruning to reduce the memory demands of quantized DNNs decreases with precision, as the overhead of storing non-zero locations alongside the values starts to dominate in different sparse encoding schemes. We analyze this overhead and the overall compression of three different sparse formats across a range of sparsity and precision values, and propose a hybrid compression scheme that identifies the optimal sparse format for a pruned low-precision DNN.

Along with improved execution efficiency of DNNs, the dissertation explores an additional advantage of approximate computing in the form of improved robustness. We propose ensembles of quantized DNN models with different numerical precisions as a new approach to increase robustness against adversarial attacks. It is based on the observation that quantized neural networks often demonstrate much higher robustness to adversarial attacks than full-precision networks, but at the cost of a substantial loss in accuracy on the original (unperturbed) inputs. We overcome this limitation to achieve the best of both worlds, i.e., the higher unperturbed accuracies of the full-precision models combined with the higher robustness of the low-precision models, by composing them in an ensemble.

In summary, this dissertation establishes approximate computing as a promising direction to improve the performance, energy efficiency and robustness of neural networks.
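To make the ensemble-of-precisions idea concrete, the sketch below averages the predictions of the same toy network evaluated at full precision and at two quantized weight precisions. It is illustrative only, not the dissertation's code: the network, the uniform quantizer and all sizes are assumptions.

```python
# Illustrative sketch: an ensemble that averages predictions of the same network
# evaluated at several weight precisions. Model, quantizer and data are stand-ins.
import numpy as np

def quantize(w, bits):
    """Uniform symmetric quantization of a weight tensor to the given bit width."""
    scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

def forward(x, w1, w2):
    """A toy two-layer network: ReLU hidden layer followed by a linear output."""
    h = np.maximum(0.0, x @ w1)
    return h @ w2

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
w1, w2 = rng.normal(size=(16, 32)), rng.normal(size=(32, 10))
x = rng.normal(size=(4, 16))  # a small batch of hypothetical inputs

# Ensemble over two low-precision copies of the weights plus the full-precision model.
probs = [softmax(forward(x, quantize(w1, b), quantize(w2, b))) for b in (8, 4)]
probs.append(softmax(forward(x, w1, w2)))
ensemble_prediction = np.mean(probs, axis=0).argmax(axis=1)
print(ensemble_prediction)
```

The intended effect, as described in the abstract, is that the full-precision member preserves clean accuracy while the low-precision members contribute robustness to small input perturbations.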
72

Rekurentní neuronové sítě pro klasifikaci textů / Recurrent Neural Network for Text Classification

Myška, Vojtěch January 2018 (has links)
The thesis proposes neural networks for classifying texts as positive or negative. Development took place in the Python programming language. The deep neural network models were designed using the Keras high-level API and the TensorFlow numerical computation library, and the computations were performed on a GPU with support for the CUDA architecture. The final outcome of the thesis is a language-independent neural network model that classifies texts at the character level, reaching up to 93.64% accuracy. Training and testing data were provided by multilingual and Yelp databases. The experiments were performed on 1,200,000 English and 12,000 Czech, German and Spanish texts.
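As a rough illustration of a character-level classifier of this kind, the Keras sketch below maps raw character codes through an embedding and an LSTM to a positive/negative output. Layer sizes, the character vocabulary and the toy data are assumptions, not the thesis's exact configuration.

```python
# A minimal sketch of a character-level sentiment classifier in Keras; sizes and the
# vocabulary are assumptions chosen for illustration.
import numpy as np
import tensorflow as tf

MAX_LEN = 256      # characters per text (assumed)
VOCAB = 128        # ASCII codes used as the character vocabulary (assumed)

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=VOCAB, output_dim=32),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # positive vs. negative
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

def encode(text: str) -> np.ndarray:
    """Map a text to a fixed-length sequence of character codes (0 = padding)."""
    codes = [min(ord(c), VOCAB - 1) for c in text[:MAX_LEN]]
    return np.array(codes + [0] * (MAX_LEN - len(codes)))

x = np.stack([encode("great product"), encode("terrible service")])
y = np.array([1, 0])
model.fit(x, y, epochs=1, verbose=0)  # toy data; real training would use the Yelp corpus
```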
73

Exploring Contextual Information in Neural Machine Translation

Jon, Josef January 2019 (has links)
This thesis deals with incorporating cross-sentence context into neural machine translation (NMT). Today's common NMT systems translate one source sentence into one target sentence, without any regard for the surrounding text. This approach is insufficient and does not match the way human translators work. For many language pairs, NMT output is nowadays indistinguishable from human translation, provided certain (strict) conditions are met. One of these conditions is that evaluators score the translated sentences independently, without knowledge of the context. When whole documents are evaluated, NMT output is still rated worse than human translation, even in cases where it was preferred at the level of individual sentences. These findings motivate research into document-level context in NMT, since there may be little room left for improvement at the sentence level, at least for language pairs and domains rich in training data. This thesis surveys current approaches to incorporating context into translation; several of them are implemented and evaluated both in terms of general translation quality and on the translation of specific context-related phenomena. To assess the quality of the individual systems, a test set for English-to-Czech translation was created manually.
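One widely used baseline for document-level NMT is simply concatenating a few previous source sentences to the current one, separated by a break token. The sketch below shows only this input-preparation step as an illustration of the general idea; the thesis evaluates several approaches, not necessarily this one, and the `<brk>` token name is an assumption.

```python
# A sketch of a common document-level NMT baseline: concatenating preceding source
# sentences to the current one with a sentence-break token.
from typing import List

SEP = "<brk>"  # sentence-break token added to the vocabulary (assumed name)

def with_context(document: List[str], index: int, context_size: int = 2) -> str:
    """Build the source-side input for sentence `index` with preceding context."""
    start = max(0, index - context_size)
    context = document[start:index]
    return f" {SEP} ".join(context + [document[index]])

doc = ["He picked up the bat.", "It was made of wood.", "He swung it hard."]
print(with_context(doc, 2))
# -> "He picked up the bat. <brk> It was made of wood. <brk> He swung it hard."
```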
74

Aktivní učení pro rozpoznávání textu / Active Learning for OCR

Kohút, Jan January 2019 (has links)
The aim of this Master's thesis is to design methods of active learning and to experiment with datasets of historical documents. A large and diverse dataset, IMPACT, of more than one million lines is used for the experiments. I use neural networks to check the readability of lines and the correctness of their annotations. First, I compare architectures of convolutional and recurrent neural networks with a bidirectional LSTM layer. Next, I study different ways of training neural networks using active learning methods. I mainly use active learning to adapt the neural networks to documents that are not present in the original training dataset; active learning is thus used for picking appropriate adaptation data. The convolutional neural networks achieve 98.6% accuracy, and the recurrent neural networks achieve 99.5% accuracy. Active learning decreases the error by 26% compared to a random pick of adaptation data.
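A common active-learning criterion for picking adaptation data is to select the samples on which the current model is least confident. The sketch below implements this entropy-based selection as an illustration only; the thesis studies several selection strategies, and the shapes and budget here are assumptions.

```python
# A sketch of uncertainty-based active learning: pick the lines with the highest
# predictive entropy under the current model as the next adaptation data.
import numpy as np

def predictive_entropy(probs: np.ndarray) -> np.ndarray:
    """Entropy of per-sample class distributions, shape (n_samples, n_classes)."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def select_for_adaptation(probs: np.ndarray, budget: int) -> np.ndarray:
    """Return indices of the `budget` most uncertain samples."""
    return np.argsort(-predictive_entropy(probs))[:budget]

rng = np.random.default_rng(1)
logits = rng.normal(size=(1000, 40))                        # hypothetical model outputs
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
chosen = select_for_adaptation(probs, budget=50)            # lines to annotate next
print(chosen[:10])
```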
75

Popis fotografií pomocí rekurentních neuronových sítí / Image Captioning with Recurrent Neural Networks

Kvita, Jakub January 2016 (has links)
This thesis deals with automatic generation of image captions using several kinds of neural networks. The work is based on papers from the MS COCO Captioning Challenge 2015 and on character-level language models popularized by A. Karpathy. The proposed model is a combination of a convolutional and a recurrent neural network in an encoder-decoder architecture. The vector representing the encoded image is passed to the language model as the memory values of the LSTM layers in the network. The thesis examines how well a model with such a simple architecture can describe images and how it compares with other current models. One of the conclusions of the thesis is that the proposed architecture is not sufficient for image captioning of any kind.
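The sketch below illustrates the encoder-decoder coupling described above: a pooled image feature vector is projected into the initial memory state of an LSTM decoder that predicts the caption character by character. Sizes, the projection layers and the use of separate h/c projections are assumptions, not the thesis's exact configuration.

```python
# A sketch of an encoder--decoder captioner where a CNN image embedding initializes
# the LSTM state of a character-level decoder. All sizes are assumptions.
import tensorflow as tf

VOCAB, EMBED, UNITS, MAX_CHARS, IMG_FEATS = 100, 64, 256, 40, 2048

image_features = tf.keras.Input(shape=(IMG_FEATS,))          # e.g. pooled CNN features
caption_chars = tf.keras.Input(shape=(MAX_CHARS,))

# Project the image into the decoder's state space (initial h and c of the LSTM).
h0 = tf.keras.layers.Dense(UNITS, activation="tanh")(image_features)
c0 = tf.keras.layers.Dense(UNITS, activation="tanh")(image_features)

x = tf.keras.layers.Embedding(VOCAB, EMBED)(caption_chars)
x = tf.keras.layers.LSTM(UNITS, return_sequences=True)(x, initial_state=[h0, c0])
next_char = tf.keras.layers.Dense(VOCAB, activation="softmax")(x)

model = tf.keras.Model([image_features, caption_chars], next_char)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```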
76

Fundamentální analýza numerických dat pro automatický trading / Fundamental Analysis of Numerical Data for Automatic Trading

Huf, Petr January 2016 (has links)
This thesis is aimed at exploiting fundamental analysis in automatic trading. Technical analysis uses historical prices and indicators derived from the price for price prediction. In contrast, fundamental analysis uses a variety of information sources for price prediction. In this thesis, only quantitative data are used: weather, Forex, Google Trends, WikiTrends, historical futures prices and some fundamental indicators (birth rate, migration, etc.). These data are processed by an LSTM neural network, which predicts the stock prices of selected companies. This prediction forms the basis of the resulting trading system. Experiments show a major improvement in the results of the trading system: an 8% increase in prediction accuracy thanks to the involvement of fundamental analysis.
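A minimal sketch of the kind of multivariate LSTM predictor described above is given below: windows of several aligned series (price plus fundamental signals) are mapped to the next price. Feature count, window length, layer sizes and the toy data are assumptions.

```python
# A sketch of a multivariate LSTM regressor: a window of price + fundamental signals
# predicts the next price. All sizes are assumptions.
import numpy as np
import tensorflow as tf

WINDOW, FEATURES = 30, 6   # 30 past days; price plus 5 fundamental signals (assumed)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(WINDOW, FEATURES)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1),          # next-day price (regression)
])
model.compile(optimizer="adam", loss="mse")

# Toy stand-in data; in the thesis the features come from Forex, weather,
# Google Trends, WikiTrends and historical futures prices.
rng = np.random.default_rng(2)
x = rng.normal(size=(512, WINDOW, FEATURES))
y = rng.normal(size=(512, 1))
model.fit(x, y, epochs=1, batch_size=32, verbose=0)
```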
77

Essential Reservoir Computing

Griffith, Aaron January 2021 (has links)
No description available.
78

Aplicación web para la detección de mentiras utilizando redes neuronales recurrentes y micro-expresiones / Web application for lie detection using recurrent neural networks and micro-expressions

Rodriguez Meza, Bryan Alberto, Vargas Lopez-Lavalle, Renzo Nicolas 21 January 2021 (has links)
In everyday life, detecting a lie can have important implications in different social situations. Uncovering lies can be decisive in situations that involve serious or moderate consequences, as in the case of police investigations. The work presented in the following pages aims to build a lie detection system that uses a web camera as the only means of detection, and to survey the subareas related to the problem: lie detection, deep learning, and computer vision. In this work, lying is taken to be any act that deliberately seeks to communicate false or distorted information with the purpose of deceiving others. The research is embodied in a project whose scope is the creation of an application capable of detecting whether a person is telling the truth from an analysis of their face. To do this, computer vision and machine learning techniques are used in order to provide a cheaper and more accessible option than other methodologies (polygraph, ERPs, fMRI), which analyze the brain state, require extremely expensive machinery and tend to have the same precision as the polygraph.
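The sketch below shows one plausible shape for the pipeline described above: per-frame facial features extracted from webcam video are fed to a recurrent classifier that outputs a lie/truth probability. The feature extractor, frame count, layer sizes and toy data are all assumptions, not the system actually built in the work.

```python
# A sketch of a recurrent lie/truth classifier over per-frame facial features.
# The feature representation and all sizes are assumptions.
import numpy as np
import tensorflow as tf

FRAMES, FACE_FEATURES = 60, 136   # e.g. ~2 s of video; 68 landmarks x (x, y) (assumed)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(FRAMES, FACE_FEATURES)),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # P(statement is a lie)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Toy stand-in for landmark sequences; a real system would extract them per frame
# with a face-landmark detector before feeding the sequence to the network.
rng = np.random.default_rng(3)
clips = rng.normal(size=(128, FRAMES, FACE_FEATURES))
labels = rng.integers(0, 2, size=(128, 1))
model.fit(clips, labels, epochs=1, verbose=0)
```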
79

Predicting Road Rut with a Multi-time-series LSTM Model

Backer-Meurke, Henrik, Polland, Marcus January 2021 (has links)
Road ruts are depressions or grooves worn into a road. Increases in rut depth are highly undesirable due to the heightened risk of hydroplaning. Accurately predicting increases in road rut depth is important for maintenance planning within the Swedish Transport Administration. At the time of writing this paper, the agency utilizes a linear regression model and is developing a feed-forward neural network for road rut predictions. The aim of the study was to evaluate the possibility of using a Recurrent Neural Network to predict road rut. Through design science research, an artefact in the form of an LSTM model was designed, developed, and evaluated. The dataset consisted of multiple short multivariate time series, an area where prior research is limited. Case studies were conducted which inspired the conceptual design of the model. The baseline LSTM model proposed in this paper utilizes the full dataset in combination with time-series individualization through an added index feature. Additional features thought to correlate with rut depth were also studied through multiple training set variations. The model was evaluated by calculating the Root Mean Squared Error (RMSE) and the Mean Absolute Error (MAE) for each training set variation. The baseline model predicted rut depth with an MAE of 0.8110 (mm) and an RMSE of 1.124 (mm), outperforming a control set without the added index. The feature with the highest correlation to rut depth was curvature, with an MAE of 0.8031 and an RMSE of 1.1093. Initial findings show that there is a possibility of utilizing an LSTM model trained on multiple multivariate time series to predict rut depth. Time-series individualization through an added index feature yielded better results than the control, indicating that it had the desired effect on model performance.
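The "added index feature" idea can be illustrated with a small data-preparation sketch: many short rut-depth series are pooled into one training set, with a per-series index appended so the model can tell road segments apart. Window length, feature layout and the toy data are assumptions.

```python
# A sketch of pooling multiple short multivariate series into LSTM training windows
# with an added per-series index feature. Layout and sizes are assumptions.
import numpy as np

def windows_with_index(series_list, window=5):
    """Turn a list of short multivariate series into (X, y) windows with an index feature."""
    xs, ys = [], []
    for idx, series in enumerate(series_list):             # series: (timesteps, features)
        index_col = np.full((len(series), 1), float(idx))  # the individualization feature
        data = np.hstack([series, index_col])
        for t in range(len(data) - window):
            xs.append(data[t:t + window])
            ys.append(series[t + window, 0])                # next rut depth (column 0 assumed)
    return np.stack(xs), np.array(ys)

rng = np.random.default_rng(4)
segments = [rng.normal(size=(rng.integers(8, 15), 3)) for _ in range(10)]  # toy road segments
X, y = windows_with_index(segments)
print(X.shape, y.shape)   # e.g. (N, 5, 4): 3 measurements + 1 index feature per timestep
```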
80

Reinforcement Learning with Recurrent Neural Networks

Schäfer, Anton Maximilian 20 November 2008 (has links)
Controlling a high-dimensional dynamical system with continuous state and action spaces in a partially unknown environment, such as a gas turbine, is a challenging problem. So far, hard-coded rules based on expert knowledge and experience are often used. Machine learning techniques, which comprise the field of reinforcement learning, are generally only applied to sub-problems. One reason for this is that most standard RL approaches still fail to produce satisfactory results in such complex environments. Besides, they are rarely data-efficient, which is crucial for most real-world applications, where the available amount of data is limited. In this thesis, recurrent neural reinforcement learning approaches to identify and control dynamical systems in discrete time are presented. They form a novel connection between recurrent neural networks (RNN) and reinforcement learning (RL) techniques. RNN are used as they allow for the identification of dynamical systems in the form of high-dimensional, non-linear state space models. They have also been shown to be very data-efficient. In addition, a proof of their universal approximation capability for open dynamical systems is given. Moreover, it is pointed out that, in contrast to an often-cited statement, they are well able to capture long-term dependencies. As a first step towards reinforcement learning, it is shown that RNN can map and reconstruct (partially observable) MDP well. In the so-called hybrid RNN approach, the resulting inner state of the network is then used as a basis for standard RL algorithms. The further developed recurrent control neural network combines system identification and determination of an optimal policy in one network. In contrast to most RL methods, it determines the optimal policy directly, without making use of a value function. The methods are tested on several standard benchmark problems. In addition, they are applied to different kinds of gas turbine simulations of industrial scale.
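The hybrid idea can be sketched as two steps: a recurrent network is first trained to model the system from observation/action histories, and its inner state is then handed to a standard RL algorithm as the state representation. The sketch below uses a plain RNN and toy data and makes no claim to reproduce the thesis's architectures; all sizes and the vanilla-RNN choice are assumptions.

```python
# A sketch of the hybrid RNN-RL idea: system identification with an RNN, whose hidden
# state later serves as input to a standard RL algorithm. All sizes are assumptions.
import numpy as np
import tensorflow as tf

HISTORY, OBS_DIM, ACT_DIM, STATE_DIM = 10, 8, 2, 32

# System identification: predict the next observation from a history of (obs, action).
inputs = tf.keras.Input(shape=(HISTORY, OBS_DIM + ACT_DIM))
hidden = tf.keras.layers.SimpleRNN(STATE_DIM)(inputs)        # inner state of the model
next_obs = tf.keras.layers.Dense(OBS_DIM)(hidden)
system_model = tf.keras.Model(inputs, next_obs)
system_model.compile(optimizer="adam", loss="mse")

rng = np.random.default_rng(5)
histories = rng.normal(size=(256, HISTORY, OBS_DIM + ACT_DIM))  # toy transition data
targets = rng.normal(size=(256, OBS_DIM))
system_model.fit(histories, targets, epochs=1, verbose=0)

# The learned inner state would then be passed to a standard RL algorithm
# (e.g. Q-learning or a policy-gradient method) in place of the raw observation.
state_encoder = tf.keras.Model(inputs, hidden)
rl_states = state_encoder.predict(histories, verbose=0)
print(rl_states.shape)   # (256, STATE_DIM)
```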
