81 |
Efficient Document Image Binarization using Heterogeneous Computing and Interactive Machine Learning
Westphal, Florian, January 2018
Large collections of historical document images have been collected by companies and government institutions for decades. More recently, these collections have been made available to a larger public via the Internet. However, to make accessing them truly useful, the contained images need to be made readable and searchable. One step in that direction is document image binarization, the separation of text foreground from page background. This separation makes the text shown in the document images easier to process by humans and other image processing algorithms alike. While reasonably well-working binarization algorithms exist, it is not sufficient to just be able to perform the separation of foreground and background well. This separation also has to be achieved efficiently, not only in terms of execution time, but also in terms of the training data required by machine learning based methods. This is necessary to make binarization not only theoretically possible, but also practically viable. In this thesis, we explore different ways to achieve efficient binarization in terms of execution time by improving the implementation and the algorithm of a state-of-the-art binarization method. We find that parameter prediction, as well as mapping the algorithm onto the graphics processing unit (GPU), helps to improve its execution performance. Furthermore, we propose a binarization algorithm based on recurrent neural networks and evaluate the choice of its design parameters with respect to their impact on execution time and binarization quality. Here, we identify a trade-off between binarization quality and execution performance based on the algorithm's footprint size and show that a dynamically weighted training loss tends to improve the binarization quality. Lastly, we address the problem of training data efficiency by evaluating the use of interactive machine learning for reducing the required amount of training data for our recurrent neural network based method. We show that user feedback can help to achieve better binarization quality with less training data and that visualized uncertainty helps to guide users to give more relevant feedback. / Scalable resource-efficient systems for big data analytics
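Binarization here means assigning every pixel to either text foreground or page background. As a point of reference only, the sketch below shows a classical global method (Otsu's threshold) in plain NumPy; it is an illustrative baseline, not the GPU-accelerated or recurrent method developed in the thesis.

```python
import numpy as np

def otsu_binarize(gray: np.ndarray) -> np.ndarray:
    """Binarize a grayscale page image (integer values 0-255) with Otsu's global threshold.

    A classical baseline: pick the threshold that maximizes the between-class
    variance of the foreground/background split.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                # cumulative class probability
    mu = np.cumsum(prob * np.arange(256))  # cumulative mean
    mu_total = mu[-1]
    # Between-class variance for every candidate threshold (0/0 cases become NaN).
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / (omega * (1.0 - omega))
    threshold = int(np.nanargmax(sigma_b))
    return (gray > threshold).astype(np.uint8)  # 1 = background, 0 = text (ink is dark)
```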
|
82 |
Swedish Natural Language Processing with Long Short-term Memory Neural Networks: A Machine Learning-powered Grammar and Spell-checker for the Swedish Language
Gudmundsson, Johan; Menkes, Francis, January 2018
Natural Language Processing (NLP) is a field studying computer processing of human language. Recently, neural network language models, a subset of machine learning, have been used to great effect in this field. However, research remains focused on the English language, with few implementations in other languages of the world. This work focuses on how NLP techniques can be used for the task of grammar and spelling correction in the Swedish language, in order to investigate how language models can be applied to non-English languages. We use a controlled experiment to find the hyperparameters most suitable for grammar and spelling correction on the Göteborgs-Posten corpus, using a Long Short-term Memory Recurrent Neural Network. We present promising results for Swedish-specific grammar correction tasks using this kind of neural network; specifically, our network has a high accuracy in completing these tasks, though the accuracy achieved for language-independent typos remains low.
|
83 |
Inteligência computacional aplicada à modelagem de cargas não-lineares e estimação de contribuição harmônica / Computational Intelligence Applied to Modeling Nonlinear Loads and Estimating Harmonic Contribution
Silva, Leandro Rodrigues Manso, 29 February 2012
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Harmonic distortion, among other forms of pollution in electric power systems, is an important issue for electric utilities. In fact, the increased use of nonlinear devices in industry has resulted in a direct increase of harmonic distortion in industrial power grids in recent years. Thus, the modeling of these loads and the understanding of their interactions with the system have become of great importance, and the use of computational intelligence techniques has emerged as a suitable tool to deal with these requirements. In this context, this work describes a methodology based on Computational Intelligence (Artificial Neural Networks (ANNs) and Fuzzy Logic (FL)) for modeling nonlinear loads present in electric power systems, as well as estimating their contribution to the harmonic distortion of the system. The main advantage of this technique is that only the waveforms of the voltages and currents at the point of common coupling need to be measured, and it can be applied to model both single-phase and three-phase loads.
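Harmonic distortion of a waveform is commonly summarized by the total harmonic distortion (THD), the ratio of the energy in the harmonics to that of the fundamental. The sketch below estimates THD of a sampled current waveform with an FFT; it is a generic illustration, not the ANN/fuzzy methodology of the thesis, and the sampling rate and fundamental frequency are assumed values.

```python
import numpy as np

def thd(signal: np.ndarray, fs: float, f0: float, n_harmonics: int = 20) -> float:
    """Estimate total harmonic distortion of a sampled waveform.

    signal      -- samples covering several whole periods of the waveform
    fs          -- sampling rate in Hz (assumed, e.g. 10 kHz)
    f0          -- fundamental frequency in Hz (e.g. 60 Hz)
    n_harmonics -- number of harmonics included in the estimate
    """
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)

    def magnitude_at(f):
        return spectrum[np.argmin(np.abs(freqs - f))]

    fundamental = magnitude_at(f0)
    harmonics = [magnitude_at(k * f0) for k in range(2, n_harmonics + 1)]
    return np.sqrt(np.sum(np.square(harmonics))) / fundamental

# Example: a 60 Hz current with a 5th-harmonic component of 10% amplitude.
fs, f0 = 10_000.0, 60.0
t = np.arange(0, 0.5, 1.0 / fs)
i_load = np.sin(2 * np.pi * f0 * t) + 0.1 * np.sin(2 * np.pi * 5 * f0 * t)
print(f"THD ~ {thd(i_load, fs, f0):.3f}")  # close to 0.10
```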
|
84 |
Large deviations for the dynamics of heterogeneous neural networks / Grandes déviations pour la dynamique de réseaux de neurones hétérogènes
Cabana, Tanguy, 14 December 2016
This thesis addresses the rigorous derivation of mean-field results for the continuous-time dynamics of large heterogeneous neural networks. In our models, we consider firing-rate neurons subject to additive noise. The network is fully connected, with highly random connectivity weights. Their variance scales as the inverse of the network size, and thus retains a non-trivial role in the thermodynamic limit. Moreover, another heterogeneity is considered at the level of each neuron, interpreted as a spatial location. For biological relevance, one model considered includes delays, as well as means and variances of connections that depend on the distance between cells. A second model considers interactions depending on the states of both neurons at play. This last case notably applies to Kuramoto's model of coupled oscillators. When the weights are independent Gaussian random variables, we show that the empirical measure of the neurons' states satisfies a large deviations principle, with a good rate function achieving its minimum at a unique probability measure, implying averaged convergence of the empirical measure and propagation of chaos. In certain cases, we also obtain quenched results. The limit is characterized through a complex non-Markovian implicit equation in which the network interaction term is replaced by a non-local Gaussian process whose statistics depend on the solution over the whole neural field. We further demonstrate the universality of this limit, in the sense that neuronal networks with non-Gaussian interconnections but sub-Gaussian tails converge towards it. Moreover, we present a few numerical applications and discuss possible perspectives.
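The model class described (firing-rate neurons, additive Brownian noise, fully connected Gaussian weights whose variance scales inversely with network size) admits a generic form. The equations below are a sketch in my own notation of that generic class, omitting the delays and spatial dependence treated in the thesis.

```latex
% Generic firing-rate dynamics with additive noise and random couplings:
% x^i = state of neuron i, S = sigmoidal firing-rate function, W^i = independent Brownian motions.
\begin{aligned}
  \mathrm{d}x_t^{i} &= \Big( -x_t^{i}
      + \sum_{j=1}^{N} J_{ij}\, S\!\big(x_t^{j}\big) \Big)\,\mathrm{d}t
      + \sigma\,\mathrm{d}W_t^{i}, \qquad i = 1,\dots,N, \\
  J_{ij} &\sim \mathcal{N}\!\Big(\tfrac{\bar{J}}{N},\, \tfrac{\sigma_J^{2}}{N}\Big)
  \quad \text{independently, so that } \operatorname{Var}(J_{ij}) = \tfrac{\sigma_J^{2}}{N}.
\end{aligned}
```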
|
85 |
Efficient and Robust Deep Learning through Approximate Computing
Sanchari Sen (9178400), 28 July 2020
Deep Neural Networks (DNNs) have greatly advanced the state-of-the-art in a wide range of machine learning tasks involving image, video, speech and text analytics, and are deployed in numerous widely-used products and services. Improvements in the capabilities of hardware platforms such as Graphics Processing Units (GPUs) and specialized accelerators have been instrumental in enabling these advances, as they have allowed more complex and accurate networks to be trained and deployed. However, the enormous computational and memory demands of DNNs continue to increase with growing data size and network complexity, posing a continuing challenge to computing system designers. For instance, state-of-the-art image recognition DNNs require hundreds of millions of parameters and hundreds of billions of multiply-accumulate operations, while state-of-the-art language models require hundreds of billions of parameters and several trillion operations to process a single input instance. Another major obstacle in the adoption of DNNs, despite their impressive accuracies on a range of datasets, has been their lack of robustness. Specifically, recent efforts have demonstrated that small, carefully-introduced input perturbations can force a DNN to behave in unexpected and erroneous ways, which can have severe consequences in safety-critical DNN applications like healthcare and autonomous vehicles. In this dissertation, we explore approximate computing as an avenue to improve the speed and energy efficiency of DNNs, as well as their robustness to input perturbations.
Approximate computing involves executing selected computations of an application in an approximate manner, while generating favorable trade-offs between computational efficiency and output quality. The intrinsic error resilience of machine learning applications makes them excellent candidates for approximate computing, allowing us to achieve execution time and energy reductions with minimal effect on the quality of outputs. This dissertation performs a comprehensive analysis of different approximate computing techniques for improving the execution efficiency of DNNs. Complementary to generic approximation techniques like quantization, it identifies approximation opportunities based on the specific characteristics of three popular classes of networks: Feed-forward Neural Networks (FFNNs), Recurrent Neural Networks (RNNs) and Spiking Neural Networks (SNNs), which vary considerably in their network structure and computational patterns.
First, in the context of feed-forward neural networks, we identify sparsity, or the presence of zero values in the data structures (activations, weights, gradients and errors), as a major source of redundancy and, therefore, an easy target for approximations. We develop lightweight micro-architectural and instruction set extensions to a general-purpose processor core that enable it to dynamically detect zero values when they are loaded and skip future instructions that are rendered redundant by them. Next, we explore LSTMs (the most widely used class of RNNs), which map sequences from an input space to an output space. We propose hardware-agnostic approximations that dynamically skip redundant symbols in the input sequence and discard redundant elements in the state vector to achieve execution time benefits. Following that, we consider SNNs, an emerging class of neural networks that represent and process information in the form of sequences of binary spikes. Observing that spike-triggered updates along synaptic connections are the dominant operation in SNNs, we propose hardware and software techniques to identify connections that minimally impact the output quality and deactivate them dynamically, skipping any associated updates.
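The zero-skipping idea for feed-forward layers can be pictured as skipping every multiply-accumulate whose activation operand is zero. The snippet below is only a software analogue of the micro-architectural mechanism described (the hardware extensions themselves detect zeros at load time and skip the dependent instructions).

```python
import numpy as np

def sparse_matvec(weights: np.ndarray, activations: np.ndarray) -> np.ndarray:
    """Compute weights @ activations while skipping work for zero activations.

    Columns of the weight matrix that would be multiplied by a zero activation
    are never touched, so the multiply-accumulate count scales with the
    number of non-zero activations rather than the layer width.
    """
    out = np.zeros(weights.shape[0])
    nonzero_idx = np.flatnonzero(activations)   # detect zeros once, up front
    for j in nonzero_idx:                       # skip every j with activations[j] == 0
        out += weights[:, j] * activations[j]
    return out

# After a ReLU layer, activations are often 50-90% zeros, so most columns are skipped.
rng = np.random.default_rng(0)
acts = np.maximum(rng.standard_normal(512), 0.0) * (rng.random(512) < 0.3)
w = rng.standard_normal((256, 512))
np.testing.assert_allclose(sparse_matvec(w, acts), w @ acts, rtol=1e-4)
```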
The dissertation also delves into the efficacy of combining multiple approximate computing techniques to improve the execution efficiency of DNNs. In particular, we focus on the combination of quantization, which reduces the precision of DNN data structures, and pruning, which introduces sparsity in them. We observe that the ability of pruning to reduce the memory demands of quantized DNNs decreases as precision is lowered, since the overhead of storing non-zero locations alongside the values starts to dominate in different sparse encoding schemes. We analyze this overhead and the overall compression of three different sparse formats across a range of sparsity and precision values, and propose a hybrid compression scheme that identifies the optimal sparse format for a pruned low-precision DNN.
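The interaction between precision and pruning can be seen with a back-of-the-envelope storage model: in a (value, index) style sparse encoding, every kept weight also pays for an index, and at low precisions the index bits dominate. The calculation below is illustrative only, with an assumed 16-bit index width rather than the exact formats analyzed in the dissertation.

```python
def compressed_size_bits(n_weights: int, density: float,
                         value_bits: int, index_bits: int = 16) -> int:
    """Approximate size of a pruned tensor stored as (value, index) pairs."""
    nnz = int(n_weights * density)
    return nnz * (value_bits + index_bits)

def dense_size_bits(n_weights: int, value_bits: int) -> int:
    return n_weights * value_bits

# 10 million weights, 20% kept after pruning, 16-bit indices (assumed).
n, density = 10_000_000, 0.2
for value_bits in (32, 16, 8, 4, 2):
    sparse = compressed_size_bits(n, density, value_bits)
    dense = dense_size_bits(n, value_bits)
    print(f"{value_bits:2d}-bit values: sparse/dense = {sparse / dense:.2f}")
# 32-bit values: sparse/dense = 0.30   <- pruning still pays off
#  8-bit values: sparse/dense = 0.60
#  2-bit values: sparse/dense = 1.80   <- index overhead dominates; dense wins
```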
Along with improved execution efficiency of DNNs, the dissertation explores an additional advantage of approximate computing in the form of improved robustness. We propose ensembles of quantized DNN models with different numerical precisions as a new approach to increase robustness against adversarial attacks. It is based on the observation that quantized neural networks often demonstrate much higher robustness to adversarial attacks than full-precision networks, but at the cost of a substantial loss in accuracy on the original (unperturbed) inputs. We overcome this limitation to achieve the best of both worlds, i.e., the higher unperturbed accuracies of the full-precision models combined with the higher robustness of the low-precision models, by composing them in an ensemble.
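In software terms, such an ensemble can be pictured as averaging the class probabilities of the same network evaluated at several precisions, so a perturbation that fools one precision level often fails to fool the others. The sketch below is framework-agnostic NumPy with a toy uniform quantizer and made-up member precisions; the `forward` callable stands in for whatever model evaluation routine is available.

```python
import numpy as np

def fake_quantize(weights: np.ndarray, bits: int) -> np.ndarray:
    """Uniformly quantize a weight tensor to the given bit-width (illustrative only)."""
    scale = np.max(np.abs(weights)) / (2 ** (bits - 1) - 1)
    return np.round(weights / scale) * scale

def ensemble_predict(x, weight_sets, forward, precisions=(2, 4, 8, 32)):
    """Average the softmax outputs of the same model evaluated at several precisions."""
    probs = []
    for bits in precisions:
        q_weights = [w if bits == 32 else fake_quantize(w, bits) for w in weight_sets]
        logits = forward(x, q_weights)   # user-supplied forward pass returning logits
        e = np.exp(logits - logits.max(axis=-1, keepdims=True))
        probs.append(e / e.sum(axis=-1, keepdims=True))
    return np.mean(probs, axis=0)        # ensemble probability; argmax gives the label
```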
In summary, this dissertation establishes approximate computing as a promising direction to improve the performance, energy efficiency and robustness of neural networks.
|
86 |
Rekurentní neuronové sítě pro klasifikaci textů / Recurrent Neural Network for Text Classification
Myška, Vojtěch, January 2018
This thesis deals with the design of neural networks for the classification of positive and negative texts. Development took place in the Python programming language. The deep neural network models were designed using the Keras high-level API and the TensorFlow numerical computation library. The computations were performed on a GPU with support for the CUDA architecture. The final outcome of the thesis is a language-independent neural network model for classifying texts at the character level, reaching up to 93.64% accuracy. Training and testing data were provided by multilingual and Yelp databases. The simulations were performed on 1,200,000 English and 12,000 Czech, German and Spanish texts.
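A character-level classifier of the kind described can be sketched in Keras as an embedding over the character vocabulary followed by an LSTM and a sigmoid output. The vocabulary size, sequence length and layer sizes below are assumptions for illustration, not the thesis's actual hyperparameters.

```python
import tensorflow as tf

MAX_LEN = 512      # characters per document (assumed)
VOCAB_SIZE = 256   # byte-level "alphabet", which keeps the model language-independent

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=VOCAB_SIZE, output_dim=64),
    tf.keras.layers.LSTM(128),                       # reads the text character by character
    tf.keras.layers.Dense(1, activation="sigmoid"),  # positive vs. negative
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# x_train: int array of shape (n_docs, MAX_LEN) with byte codes, y_train: 0/1 labels.
# model.fit(x_train, y_train, batch_size=128, epochs=5, validation_split=0.1)
```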
|
87 |
Exploring Contextual Information in Neural Machine Translation
Jon, Josef, January 2019
This thesis deals with the incorporation of cross-sentence context in neural machine translation (NMT). Common present-day NMT systems translate one source sentence into one target sentence, without any regard for the surrounding text. This approach is insufficient and does not correspond to the way human translators work. For many language pairs, under certain (strict) conditions, NMT output is today indistinguishable from human translation. One of these conditions is that evaluators score the translated sentences independently, without knowledge of the context. When whole documents are evaluated, NMT output is still rated worse than human translation, even in cases where it was preferred at the level of individual sentences. These findings motivate research into document-level context in NMT, since it is possible that there is little room for improvement left at the sentence level, at least for language pairs and domains rich in training data. This thesis summarizes current approaches to incorporating context into translation; several of them are implemented and evaluated in terms of overall translation quality as well as on the translation of specific context-related phenomena. To assess the quality of the individual systems, a test set for English-to-Czech translation was created manually.
|
88 |
Aktivní učení pro rozpoznávání textu / Active Learning for OCR
Kohút, Jan, January 2019
The aim of this Master's thesis is to design methods of active learning and to experiment with datasets of historical documents. A large and diverse dataset, IMPACT, of more than one million lines is used for the experiments. I use neural networks to check the readability of lines and the correctness of their annotations. First, I compare architectures of convolutional and recurrent neural networks with a bidirectional LSTM layer. Next, I study different ways of training neural networks using methods of active learning. Mainly, I use active learning to adapt neural networks to documents that are not present in the original training dataset; active learning is thus used for picking appropriate adaptation data. Convolutional neural networks achieve 98.6% accuracy, recurrent neural networks achieve 99.5% accuracy. Active learning decreases the error by 26% compared to a random pick of adaptation data.
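The core of adaptation-data selection can be pictured as scoring each unlabeled line by the network's confidence and sending the least confident lines to an annotator. The snippet below is a generic uncertainty-sampling sketch that uses mean per-character confidence as the score; it is an assumed criterion for illustration, not necessarily the one used in the thesis.

```python
import numpy as np

def select_for_annotation(char_probs_per_line, budget: int):
    """Pick the lines the OCR network is least sure about.

    char_probs_per_line -- list of arrays, one per text line, holding the
                           probability of the best character at each position
    budget              -- how many lines the annotator can check
    Returns indices of the lines to annotate, least confident first.
    """
    confidence = np.array([probs.mean() for probs in char_probs_per_line])
    return np.argsort(confidence)[:budget]

# The selected lines are transcribed (or their annotations corrected), added to
# the training set, and the network is fine-tuned on the new documents.
```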
|
89 |
Popis fotografií pomocí rekurentních neuronových sítí / Image Captioning with Recurrent Neural Networks
Kvita, Jakub, January 2016
This thesis deals with the automatic generation of image captions using several kinds of neural networks. The work is based on papers from the MS COCO Captioning Challenge 2015 and on character-level language models popularized by A. Karpathy. The proposed model is a combination of a convolutional and a recurrent neural network with an encoder-decoder architecture. The vector representing the encoded image is passed to the language model as the memory values of the LSTM layers in the network. The thesis examines how well a model with such a simple architecture is able to describe images and how it compares with other current models. One of the conclusions of the thesis is that the proposed architecture is not sufficient for general image captioning.
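The described architecture can be sketched as a CNN image descriptor projected into the state of a character-level LSTM decoder. The sketch below is a simplified reading of that idea; the layer sizes, the single LSTM layer and the way the image vector is injected as the initial state are assumptions, not the thesis's exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers

VOCAB = 64        # character vocabulary of the language model (assumed)
FEAT_DIM = 2048   # size of the precomputed CNN image descriptor (assumed)
UNITS = 512       # LSTM state size (assumed)

# Encoder side: the CNN feature vector is projected to the LSTM state size.
image_feat = layers.Input(shape=(FEAT_DIM,))
init_state = layers.Dense(UNITS, activation="tanh")(image_feat)

# Decoder side: a character-level LSTM whose memory is initialized from the image,
# mirroring the idea of handing the encoded image to the language model as LSTM state.
chars_in = layers.Input(shape=(None,), dtype="int32")
x = layers.Embedding(VOCAB, 128)(chars_in)
x = layers.LSTM(UNITS, return_sequences=True)(x, initial_state=[init_state, init_state])
char_probs = layers.Dense(VOCAB, activation="softmax")(x)

model = tf.keras.Model(inputs=[image_feat, chars_in], outputs=char_probs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```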
|
90 |
Fundamentální analýza numerických dat pro automatický trading / Fundamental Analysis of Numerical Data for Automatic Trading
Huf, Petr, January 2016
This thesis is aimed at the exploitation of fundamental analysis in automatic trading. Technical analysis uses historical prices and indicators derived from the price for price prediction. In contrast, fundamental analysis uses various other information resources for price prediction. In this thesis, only quantitative data are used. The data sources are namely weather, Forex, Google Trends, WikiTrends, historical prices of futures and some fundamental data (birth rate, migration, ...). These data are processed with an LSTM neural network, which predicts the stock prices of selected companies. This prediction is the basis of the created trading system. Experiments show a major improvement in the results of the trading system: an 8% increase in prediction accuracy thanks to the involvement of fundamental analysis.
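A pipeline of this kind can be pictured as stacking the aligned daily features from the various sources into fixed-length windows and training an LSTM regressor on the next-day price. The window length, feature count and layer size below are assumptions for illustration only.

```python
import numpy as np
import tensorflow as tf

WINDOW = 30   # days of history per sample (assumed)

def make_windows(features: np.ndarray, prices: np.ndarray):
    """Turn aligned daily features (price, weather, trends, ...) into LSTM samples."""
    x = np.stack([features[i: i + WINDOW] for i in range(len(features) - WINDOW)])
    y = prices[WINDOW:]                      # next-day price to predict
    return x, y

n_features = 8                               # e.g. price, volume, weather, Google Trends ...
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=(WINDOW, n_features)),
    tf.keras.layers.Dense(1),                # predicted price, fed to the trading rules
])
model.compile(optimizer="adam", loss="mse")
```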
|