  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Data Augmentation with Seq2Seq Models

Granstedt, Jason Louis 06 July 2017
Paraphrase sparsity is an issue that complicates the training process of question answering systems: syntactically diverse but semantically equivalent sentences can have significant disparities in predicted output probabilities. We propose a method for generating an augmented paraphrase corpus for the visual question answering system to make it more robust to paraphrases. This corpus is generated by concatenating two sequence to sequence models. In order to generate diverse paraphrases, we sample the neural network using diverse beam search. We evaluate the results on the standard VQA validation set. Our approach results in a significantly expanded training dataset and vocabulary size, but has slightly worse performance when tested on the validation split. Although not as fruitful as we had hoped, our work highlights additional avenues for investigation into selecting more optimal model parameters and the development of a more sophisticated paraphrase filtering algorithm. The primary contribution of this work is the demonstration that decent paraphrases can be generated from sequence to sequence models and the development of a pipeline for developing an augmented dataset. / Master of Science / For a machine, processing language is hard. All possible combinations of words in a language far exceed a computer’s ability to directly memorize them. Thus, generalizing language into a form that a computer can reason with is necessary for a machine to understand raw human input. Various advancements in machine learning have been particularly impressive in this regard. However, they require a corpus, or a body of information, in order to learn. Collecting this corpus is typically expensive and time consuming, and does not necessarily contain all of the information that a system would need to know - the machine would not know how to handle a word that it has never seen before, for example. 
This thesis examines the possibility of using a large, general corpus to expand the vocabulary size of a specialized corpus in order to improve performance on a specific task. To do so, we use Seq2Seq models, a recent development in neural networks that has seen great success in translation tasks. The Seq2Seq model is trained on the general corpus to learn the language and then applied to the specialized corpus to generate paraphrases similar in format to those in the specialized corpus. We were able to significantly expand the volume and vocabulary size of the specialized corpus via this approach. We also demonstrated that decent paraphrases can be generated from Seq2Seq models, and we developed a pipeline for augmenting other specialized datasets.
2

Molecular Optimization Using Graph-to-Graph Translation

Sandström, Emil January 2020
Drug development is a protracted and expensive process. One of the main challenges in drug discovery is to find molecules with desirable properties. Molecular optimization is the task of optimizing precursor molecules by endowing them with desirable properties. Recent advances in artificial intelligence have led to deep learning models designed for molecular optimization. These models, which generate new molecules with desirable properties, have the potential to accelerate drug discovery. In this thesis, I evaluate the current state-of-the-art graph-to-graph translation model for molecular optimization, the HierG2G. I examine the HierG2G's performance using three test cases, where the second test is designed, with the help of chemical experts, to represent a common molecular optimization task. The third test case tests the HierG2G's performance on molecules that the model has not previously seen. I conclude that, in each of the test cases, the HierG2G can successfully generate structurally similar molecules with desirable properties given a source molecule and a user-specified desired property change. Further, I benchmark the HierG2G against two well-known string-based models, the seq2seq and the Transformer. My results suggest that the seq2seq is the overall best model for molecular optimization, but due to the varying performance among the models, I encourage a potential user to use all three models simultaneously for molecular optimization.
3

Study of Semi-supervised Deep Learning Methods on Human Activity Recognition Tasks

Song, Shiping January 2019
This project focuses on semi-supervised human activity recognition (HAR) tasks, in which the inputs are partly labeled time series data acquired from sensors such as accelerometers, and the outputs are predefined human activities. Most state-of-the-art work in the HAR area is currently supervised and relies on fully labeled datasets. Since the cost of labeling the collected instances rises quickly as the scale of the data grows, semi-supervised methods are now widely required. This report proposes two semi-supervised methods and investigates how well they perform on a partly labeled dataset compared to the state-of-the-art supervised method. One of these methods is based on the state-of-the-art supervised method, DeepConvLSTM, combined with the semi-supervised learning concept of self-training. The other is adapted from a semi-supervised deep learning method, an LSTM initialized by a seq2seq autoencoder, which was first introduced for natural language processing. According to experiments on a published dataset (the Opportunity Activity Recognition dataset), both of these semi-supervised methods perform better than the state-of-the-art supervised methods.
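
The self-training idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: a toy one-dimensional nearest-centroid classifier stands in for DeepConvLSTM, and the names `fit_centroids`, `predict_with_confidence`, `self_train`, and the `threshold` parameter are hypothetical choices made for illustration only.

```python
from statistics import mean


def fit_centroids(xs, ys):
    """Return the mean feature value per class (a 1-D nearest-centroid model)."""
    return {label: mean(x for x, y in zip(xs, ys) if y == label) for label in set(ys)}


def predict_with_confidence(centroids, x):
    """Predict the nearest class and a margin-based confidence in [0, 1]."""
    ranked = sorted(centroids, key=lambda label: abs(x - centroids[label]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, 1.0
    d_best = abs(x - centroids[best])
    d_next = abs(x - centroids[ranked[1]])
    total = d_best + d_next
    confidence = 1.0 if total == 0 else (d_next - d_best) / total
    return best, confidence


def self_train(labeled_x, labeled_y, unlabeled_x, threshold=0.6, max_rounds=5):
    """Repeatedly pseudo-label confident unlabeled samples and refit the model."""
    xs, ys, pool = list(labeled_x), list(labeled_y), list(unlabeled_x)
    for _ in range(max_rounds):
        centroids = fit_centroids(xs, ys)
        remaining = []
        for x in pool:
            label, confidence = predict_with_confidence(centroids, x)
            if confidence >= threshold:
                # Confident prediction: treat it as a label for the next round.
                xs.append(x)
                ys.append(label)
            else:
                remaining.append(x)
        if len(remaining) == len(pool):
            break  # No sample was confident enough; stop early.
        pool = remaining
    return fit_centroids(xs, ys), pool


centroids, still_unlabeled = self_train([0.0, 10.0], ["rest", "walk"], [1.0, 9.0, 5.0])
```

In this toy run the samples near each class are absorbed as pseudo-labels, while the ambiguous midpoint sample stays unlabeled because its confidence never reaches the threshold.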
4

Dynamik och tillförlighet i finansiell prognostisering : En analys av djupinlärningsmodeller och deras reaktion på marknadsmanipulation / Dynamics and Reliability in Financial Forecasting : An Analysis of Deep Learning Models’ Response to Market Manipulation

Zawahri, Aya, Ibrahim, Nanci January 2024
Over the years, intensive research has been conducted to enhance the capability of machine learning models to predict market movements. Despite this, financial history has seen several events, such as the "Flash-crash", that have impacted the market and had dramatic consequences for price movements. Therefore, it is crucial to examine how the models are affected by manipulative actions in the financial market to ensure their robustness and reliability in such situations. To carry out this work, the process was divided into three steps. First, a review of previous studies was conducted to identify the most robust models in the field. This was achieved by training the models on the FI-2010 dataset, a publicly available dataset for high-frequency trading of stocks on the NASDAQ Nordic stock exchange. The examined models included DeepLOB, DeepLOB-Attention, DeepLOB-seq2seq, DTNN, and TCN. The second step involved acquiring the Swedish dataset from Nasdaq Nordic, which provides Limit Order Book (LOB) data for Swedish stocks. The two models that demonstrated the best results in the first step were then trained on this dataset. Finally, a manipulation was performed on the Swedish order books to investigate how these models would be affected. The result is a clear assessment of the models' robustness and reliability in predicting market movements, based on a comprehensive comparison and analysis of all tests and their results. The work also highlights how the models' outcomes are affected by manipulative actions. Furthermore, it shows how the choice of normalization method affects the models' results.
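
The abstract does not say which normalization methods were compared. As a hedged illustration of why the choice can matter, the sketch below contrasts two common options, z-score and min-max scaling, on a short list of prices; the function names and the sample values are assumptions, not taken from the thesis.

```python
from statistics import mean, pstdev


def z_score(values):
    """Center on the mean and scale by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # Constant series: nothing to scale.
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """Rescale linearly so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # Constant series: nothing to scale.
    return [(v - lo) / (hi - lo) for v in values]


prices = [100.0, 101.0, 104.0]  # Hypothetical price levels
scaled_by_range = min_max(prices)
scaled_by_spread = z_score(prices)
```

Min-max scaling depends entirely on the extreme values, so a single manipulated order at an unusual price would shift every scaled value, whereas z-score scaling reacts through the mean and standard deviation instead.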
5

Understand and Utilise Unformatted Text Documents by Natural Language Processing algorithms

Lindén, Johannes January 2017
News companies need to automate the editors' process of writing about hot and new events and make it more effective. Current technologies involve robotic programs that fill in values in templates and website listeners that notify the editors when changes are made, so that the editor can read up on the source change at the actual website. Editors can provide news faster and better if directly provided with abstracts of the external sources. This study applies deep learning algorithms to automatically formulate abstracts and tag sources with appropriate tags based on the context. The study is a full-stack solution, which manages both the editors' need for speed and the training, testing and validation of the algorithms. Decision Tree, Random Forest, Multi-Layer Perceptron and phrase document vectors are used to evaluate the categorisation, and Recurrent Neural Networks are used to paraphrase unformatted texts. In the evaluation, models trained by the algorithms with varying parameters are compared based on the F-score. The results show that the F-scores increase with the number of training documents and decrease with the number of categories the algorithm needs to consider. The Multi-Layer Perceptron performs best, followed by Random Forest and finally Decision Tree. Document length matters: when larger documents are considered during training, the score increases considerably. A user survey about the paraphrase algorithms shows that the paraphrase result is insufficient to satisfy the editors' needs. It confirms a need for more memory to conduct longer experiments.
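
For readers unfamiliar with the metric used in this evaluation, here is a minimal sketch of an F-score for a single category. The abstract does not state which variant was used, so the default `beta=1.0` (the F1 score) is an assumption, and the function name and sample labels are hypothetical.

```python
def f_score(true_labels, predicted_labels, positive, beta=1.0):
    """Compute the F-beta score for one category from paired label lists."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # No correct positive predictions: precision and recall are 0.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


truth = ["sport", "sport", "politics", "sport"]
guess = ["sport", "politics", "sport", "sport"]
score = f_score(truth, guess, positive="sport")
```

Because the score combines precision and recall, adding more categories tends to lower it: each extra category creates more ways for a document to be assigned the wrong tag.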
6

Strojový překlad pomocí umělých neuronových sítí / Machine Translation Using Artificial Neural Networks

Holcner, Jonáš January 2018
The goal of this thesis is to describe and build a system for neural machine translation. The system is built with recurrent neural networks, in particular the encoder-decoder architecture. The result is an NMT library used to conduct experiments with different model parameters. The results of the experiments are compared with a system built with the statistical tool Moses.
