Global ETD Search

1	Rozpoznávání ručně psaného textu pomocí hlubokých neuronových sítí / Deep Networks for Handwriting Recognition Richtarik, Lukáš January 2020 This work addresses handwritten text recognition with deep neural networks. It focuses on the sequence-to-sequence method using an encoder-decoder model. It also presents the design of an encoder-decoder model for handwritten text recognition that uses a transformer instead of recurrent neurons, along with a set of experiments performed on it.
2	Revealing the Positive Meaning of a Negation Sarabi, Zahra 05 1900 Negation is a complex phenomenon present in all human languages, allowing for the uniquely human capacities of denial, contradiction, misrepresentation, lying, and irony. It is in the first place a phenomenon of semantic opposition. Sentences containing negation are generally (a) less informative than affirmative ones, (b) morphosyntactically more marked (all languages have negative markers while only a few have affirmative markers), and (c) psychologically more complex and harder to process. Negation often conveys positive meaning, ranging from implicatures to entailments. In this dissertation, I develop a system to reveal the underlying positive interpretation of negation. I first identify which words are intended to be negated (i.e., the focus of negation), and then rewrite those tokens to generate an actual positive interpretation. I identify the focus of negation by scoring probable foci along a continuous scale. One obstacle to exploring foci scoring is that no public datasets exist for this task, so I create new corpora to study the problem. The corpora contain verbal, nominal, and adjectival negations and their potential positive interpretations, along with scores ranging from 1 to 5. I then use supervised learning models to score the focus of negation. To rewrite the focus of negation with its positive interpretation, I work with negations from Simple Wikipedia, automatically generate potential positive interpretations, and then collect manual annotations that effectively rewrite the negation in positive terms. This procedure yields positive interpretations for approximately 77% of negations, and the final corpus includes over 5,700 negations and over 5,900 positive interpretations. I then use sequence-to-sequence neural models and provide baseline results.
Negation Positive Interpretation Supervised Learning Neural Networks Sequence-to-sequence models Annotations Computer Science
3	Automatic morphological analysis of L-verbs in Palula / Automatisk morfologisk analys av L-verb i Palula Wallerö, Emma January 2020 This study explores the possibilities of automatic morphological analysis of L-verbs in the Palula language using finite-state technology and two-level morphology together with supervised machine learning. The machine learning models used are neural sequence-to-sequence models. A morphological transducer covering the L-verbs of Palula is built with the Helsinki Finite-State Transducer Technology (HFST) toolkit. Several sequence-to-sequence models are trained on sets of L-verbs with morphological tagging annotation. One model is trained on a small amount of manually annotated data, and four models are trained on different amounts of training examples generated by the finite-state transducer. The efficiency and accuracy of these methods are investigated. The sequence-to-sequence model trained solely on manually annotated data did not perform as well as the other models. A sequence-to-sequence model trained on examples generated by the transducer achieved the best recall, accuracy, and F1-score, while the finite-state transducer achieved the best precision.
Finite-State Transducer FST morphology two-level morphology machine learning Sequence to Sequence RNN General Language Studies and Linguistics
4	Sequence to Sequence Machine Learning for Automatic Program Repair Svensson, Niclas, Vrabac, Damir January 2019 Most previous program repair approaches, including machine-learning-based ones, can only generate fixes for one-line bugs. This work investigates whether such a system, using a state-of-the-art technique, can make useful predictions when fed whole source files. To test whether multi-line bugs can be fixed with a state-of-the-art solution, a system was built using existing neural machine translation tools and data gathered from GitHub. The results show, however, that the method used in this thesis is not sufficient to produce satisfactory results: the system did not successfully correct any bug. Although the results are poor, there remain unexplored approaches that could possibly improve the system's performance, one being to narrow the input data from file level to method level.
Automatic program repair neural machine translation sequence to sequence bug fix Electrical Engineering and Electronics
5	Novel Deep Learning Models for Spatiotemporal Predictive Tasks Le, Quang 23 November 2022 Spatiotemporal Predictive Learning (SPL) is an essential research topic with many practical, real-world applications, e.g., motion detection, video generation, precipitation forecasting, and traffic flow prediction. The problems and challenges of this field come from numerous data characteristics in both the time and space domains, and they vary depending on the specific task. For instance, spatial analysis refers to the study of spatial features, such as spatial location, latitude, elevation, longitude, the shape of objects, and other patterns. From the time domain perspective, temporal analysis generally concerns the time steps and time intervals of data points in the sequence, also known as interval recording or time sampling. Typically, there are two types of time sampling in temporal analysis: regular time sampling (i.e., the time interval is assumed to be fixed) and irregular time sampling (i.e., the time interval is considered arbitrary), which is closely related to the continuous-time prediction task when data are in continuous space. Therefore, an efficient spatiotemporal predictive method has to model spatial features properly under the given type of time sampling. In this thesis, by taking advantage of Machine Learning (ML) and Deep Learning (DL) methods, which have achieved promising performance in many complicated computational tasks, we propose three DL-based models for Spatiotemporal Sequence Prediction (SSP) with several types of time sampling. First, we design the Trajectory Gated Recurrent Unit Attention (TrajGRU-Attention) with novel attention mechanisms, namely Motion-based Attention (MA), to improve the performance of the standard Convolutional Recurrent Neural Networks (ConvRNNs) in SSP tasks. In particular, the TrajGRU-Attention model can alleviate the impact of the vanishing gradient, which leads to blurry long-term predictions, and can handle both regularly and irregularly sampled time series. Consequently, this model can work effectively with different scenarios of spatiotemporal sequential data, especially time series with missing time steps. Second, building on the idea of Neural Ordinary Differential Equations (NODEs), we propose the Trajectory Gated Recurrent Unit integrating Ordinary Differential Equation techniques (TrajGRU-ODE) as a continuous time-series model. With Ordinary Differential Equation (ODE) techniques and the TrajGRU neural network, this model can perform continuous-time spatiotemporal prediction tasks and generate output with high accuracy. Compared to TrajGRU-Attention, TrajGRU-ODE benefits from the development of efficient and accurate ODE solvers. Finally, we attempt to combine the two models to create TrajGRU-Attention-ODE. NODEs are still at an early stage of research, and recent ODE-based models were designed for relatively simple tasks. In this thesis, we train the models on several video datasets to verify the ability of the proposed models in practical applications. To evaluate their performance, we select four available spatiotemporal datasets based on complexity level: MovingMNIST, MovingMNIST++, and two real-life datasets, the weather radar HKO-7 and KTH Action. With each dataset, we train, validate, and test with distinct types of time sampling to verify the prediction ability of our models. In summary, the experimental results on the four datasets indicate that the proposed models can generate predictions with high accuracy and sharpness. Significantly, the proposed models outperform state-of-the-art ODE-based approaches on SSP tasks under different interval-recording conditions.
spatiotemporal sequence prediction convolutional recurrent networks attention mechanisms neural ordinary differential equations
6	Addressing the Rare Word Problem in Source Code Modelling Ivstam, Linn January 2020 The field of automatic program repair has adopted deep learning techniques. Sequence-to-sequence neural networks have been applied successfully in neural machine translation (NMT). They can also be applied to automatic program repair by attempting to translate buggy source code into fixed source code. However, the frequent occurrence of user-defined variables makes the rare word problem a significant issue. Two techniques used in NMT to handle the rare word problem, byte pair encoding (BPE) and the copy mechanism, were applied and evaluated on source code. When measured by exact sequence match between the predicted and target output, the techniques brought no improvement. When measured by syntactic correctness, however, the models with the techniques applied outperformed the original model, handling the syntax of the C programming language considerably better. To see an improvement in exact sequence match, there should be a greater variety of sequence lengths and vocabulary sizes, and more extensive hyperparameter tuning should be performed. / Bachelor's degree project in electrical engineering 2020, KTH, Stockholm
Automatic program repair neural machine translation sequence to sequence byte pair encoding copy mechanism Electrical Engineering and Electronics
7	Streamline searches in a database / Effektivisera sökningar in en databas Ellerblad Valtonen, David, Franzén, André January 2023 The objective of this thesis is to explore technologies and solutions that could make a logistical flow more efficient. The logistical flow consists of a database containing materiel for purchase or repair. At present, searches may return either too many results, several of them irrelevant, or no results at all. A search needs to be very specific to retrieve the exact item, which requires extensive knowledge of the database and its contents. The areas explored include Natural Language Processing and Machine Learning techniques. To address this, a literature study will be conducted to gain insight into existing work and possible solutions, and Exploratory Data Analysis will be used to understand the patterns and limitations of the data.
AI NLP ML Artificial Intelligence Natural Language Processing Machine Learning text-to-sql seq-to-seq sequence-to-sequence Interaction Technologies Media Engineering
8	End-to-End Trainable Chatbot for Restaurant Recommendations / Neuronnätsbaserad chatbot för restaurangrekommendationer Strigér, Amanda January 2017 Task-oriented chatbots can be used to automate a specific task, such as finding a restaurant and making a reservation. Implementing such a conversational system can be difficult and time-consuming, requiring domain knowledge and handcrafted rules. The focus of this thesis was to evaluate the possibility of using a neural network-based model to create an end-to-end trainable chatbot that can automate a restaurant reservation service. For this purpose, a sequence-to-sequence model was implemented and trained on dialog data. The strengths and limitations of the system were evaluated, and its prediction accuracy was compared against several baselines. With our relatively simple model, we were able to achieve results comparable to the most advanced baseline model. The evaluation showed some promising strengths of the system but also significant flaws that cannot be overlooked. The current model cannot be used as a standalone system to successfully conduct full conversations with the goal of making a restaurant reservation. The evaluation has, however, provided a thorough examination of the current system and shown where future work ought to be focused.
chatbots machine learning deep learning sequence to sequence learning conversation systems end-to-end trainable task-oriented chatbot Computer Sciences
9	Neural Sequence Modeling for Domain-Specific Language Processing: A Systematic Approach Zhu, Ming 14 August 2023 In recent years, deep learning based sequence modeling (neural sequence modeling) techniques have made substantial progress in many tasks, including information retrieval, question answering, information extraction, and machine translation. Benefiting from the highly scalable attention-based Transformer architecture and enormous open access online data, large-scale pre-trained language models have shown great modeling and generalization capacity for sequential data. However, not all domains benefit equally from the rapid development of neural sequence modeling. Domains like healthcare and software engineering have vast amounts of sequential data containing rich knowledge, yet remain under-explored due to a number of challenges: 1) the distribution of the sequences in specific domains is different from the general domain; 2) the effective comprehension of domain-specific data usually relies on domain knowledge; and 3) the labelled data is usually scarce and expensive to get in domain-specific settings. In this thesis, we focus on the research problem of applying neural sequence modeling methods to address both common and domain-specific challenges from the healthcare and software engineering domains. We systematically investigate neural machine learning approaches to address the above challenges in three research directions: 1) learning with long sequences, 2) learning from domain knowledge and 3) learning under limited supervision. Our work can also potentially benefit more domains with large amounts of sequential data. / Doctor of Philosophy / In the last few years, computer programs that learn and understand human languages (an area called machine learning for natural language processing) have significantly improved.
These advances are visible in various areas such as retrieving information, answering questions, extracting key details from texts, and translating between languages. A key to these successes has been the use of a type of neural network structure known as a "Transformer", which can process and learn from lots of information found online. However, these successes are not uniform across all areas. Two fields, healthcare and software engineering, still present unique challenges despite having a wealth of information. Some of these challenges include the different types of information in these fields, the need for specific expertise to understand this information, and the shortage of labeled data, which is crucial for training machine learning models. In this thesis, we focus on the use of machine learning for natural language processing methods to solve these challenges in the healthcare and software engineering fields. Our research investigates learning with long documents, learning from domain-specific expertise, and learning when there's a shortage of labeled data. The insights and techniques from our work could potentially be applied to other fields that also have a lot of sequential data. Machine Learning for Code Machine Learning for Healthcare Information Retrieval Question Answering Entity Linking Program Translation Code Refinement Sequence-to-Sequence Models
10	Strojový překlad pomocí umělých neuronových sítí / Machine Translation Using Artificial Neural Networks Holcner, Jonáš January 2018 The goal of this thesis is to describe and build a system for neural machine translation. The system is built with recurrent neural networks, specifically an encoder-decoder architecture. The result is an nmt library used to conduct experiments with different model parameters. The results of the experiments are compared with a system built with the statistical tool Moses.
