Global ETD Search

101	Vývoj trénovatelných strategií řízení pro dialogové systémy / Development of trainable policies for spoken dialogue systems Le, Thanh Cong January 2016 In human-human interaction, speech is the most natural and effective manner of communication. Spoken Dialogue Systems (SDS) try to bring that high-level interaction to computer systems, so that with an SDS you can talk to machines rather than learn to use a mouse and keyboard to perform a task. However, because of inaccuracy in speech recognition and the inherent ambiguity of spoken language, the dialogue state (the user's desire) can never be known with certainty, and therefore building such an SDS is not trivial. Statistical approaches have been proposed to deal with these uncertainties by maintaining a probability distribution over every possible dialogue state. Based on these distributions, the system learns how to interact with users so as to achieve the final goal in the most effective manner. In Reinforcement Learning (RL), the learning process is understood as optimizing a policy for choosing an action conditioned on the current belief state. Since the space of dialogue...
102	Implementace aproximativních Bayesovských metod pro odhad stavu v dialogových systémech / Approximative Bayes methods for belief monitoring in spoken dialogue systems Marek, David January 2013 The most important component of virtually any dialog system is the dialog manager. The aim of the dialog manager is to propose an action (a continuation of the dialog) given the last dialog state. The dialog state summarises all past user input and system input, and ideally it includes all information necessary for natural progress in the dialog. For the dialog manager to work efficiently, it is important to model the probability distribution over all dialog states as precisely as possible. The set of dialog states may be very large, so approximate methods usually must be used. In this thesis we discuss an implementation of approximate Bayes methods for belief state monitoring. The result is a library for dialog state monitoring in real dialog systems.
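The belief monitoring described in entry 102 rests on a Bayes update of the distribution over dialog states. The following is a minimal sketch of the exact update on a tiny state set, which approximate methods would stand in for when the state set is large. It is not the thesis's library; the state names and likelihood values are made up for illustration.

```python
def update_belief(belief, likelihood):
    """One exact Bayes step: posterior(s) is proportional to likelihood(s) * prior(s).

    belief:     dict mapping dialog state -> prior probability
    likelihood: dict mapping dialog state -> P(observation | state)
    """
    unnormalized = {s: p * likelihood.get(s, 0.0) for s, p in belief.items()}
    total = sum(unnormalized.values())
    if total == 0.0:
        # The observation rules out every state; keep the prior unchanged.
        return dict(belief)
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical two-state example: the user wants a cheap or an expensive restaurant.
prior = {"wants_cheap": 0.5, "wants_expensive": 0.5}
# Hypothetical recognizer output: the utterance fits "cheap" four times better.
likelihood = {"wants_cheap": 0.8, "wants_expensive": 0.2}
posterior = update_belief(prior, likelihood)
```

With a uniform prior, the posterior simply mirrors the normalized likelihoods; repeating the update turn by turn is what keeps the belief state current.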
103	Assessing the impact of manual corrections in the Groningen Meaning Bank Weck, Benno January 2016 The Groningen Meaning Bank (GMB) project develops a corpus with rich syntactic and semantic annotations. Annotations in GMB are generated semi-automatically and stem from two sources: (i) initial annotations from a set of standard NLP tools, and (ii) corrections/refinements by human annotators. For example, on the part-of-speech level of annotation there are currently 18,000 of those corrections, so-called Bits of Wisdom (BOWs). To apply this information to improve the NLP processing, we experimented with using the BOWs to retrain the part-of-speech tagger and found that it can be improved to correct up to 70% of identified errors within held-out data. Moreover, an improved tagger helps to raise the performance of the parser. Preferring sentences with a high rate of verified tags in retraining proved to be the most reliable approach. With a simulated active learning experiment using Query-by-Uncertainty (QBU) and Query-by-Committee (QBC), we showed that selectively sampling sentences for retraining yields better results with less data than random selection. In an additional pilot study we found that a standard maximum-entropy part-of-speech tagger can be augmented so that it uses already known tags to enhance its tagging decisions on an entire sequence without retraining a new model first. Powered by...
104	A Study on Text Classification Methods and Text Features Danielsson, Benjamin January 2019 When it comes to the task of classification, the data used for training is the most crucial part. It follows that how this data is processed and presented to the classifier plays an equally important role. This thesis investigates the performance of multiple classifiers depending on the features that are used, the type of classes to classify and the optimization of said classifiers. The classifiers of interest are support-vector machines (SMO) and multilayer perceptrons (MLP); the features tested are word vector spaces and text complexity measures, along with principal component analysis (PCA) on the complexity measures. The features are created based on the Stockholm-Umeå-Corpus (SUC) and DigInclude, a dataset containing standard and easy-to-read sentences. For the SUC dataset the classifiers attempted to classify texts into nine different text categories, while for the DigInclude dataset the sentences were classified into either standard or simplified classes. The classification tasks on the DigInclude dataset showed poor performance in all trials. The SUC dataset showed the best performance when using SMO in combination with word vector spaces. Comparing the SMO classifier on the text complexity measures with and without PCA showed that the performance was largely unchanged between the two, although not using PCA gave slightly better performance. Keywords: NLP; Text Classification; SVM; MLP; PCA; SUC; DigInclude
105	Predicting the area of industry : Using machine learning to classify SNI codes based on business descriptions, a degree project at SCB / Att prediktera näringsgrensindelning : Ett examensarbete om tillämpning av maskininlärning för att klassificera SNI-koder utifrån företagsbeskrivningar hos SCB Dahlqvist-Sjöberg, Philip, Strandlund, Robin January 2019 This study is part of an experimental project at Statistics Sweden, which aims to use natural language processing and machine learning to predict Swedish businesses’ area of industry codes based on their business descriptions. The response to predict consists of the 30 most frequent of the 88 main groups of Swedish standard industrial classification (SNI) codes, each of which represents a unique area of industry. The transformation from business description text to numerical features was done through the bag-of-words model. SNI codes are set when companies are founded, and due to the human factor, errors can occur. Using data from the Swedish Companies Registration Office, the purpose is to determine if the method of gradient boosting can provide high enough classification accuracy to automatically set the correct SNI codes that differ from the actual response. Today these corrections are made manually. The best gradient boosting model was able to correctly classify 52 percent of the observations, which is not considered high enough to implement automatic code correction in a production environment. Keywords: machine learning; classification; gradient boosting; data analysis; NLP; SNI; SCB; Probability Theory and Statistics; Sannolikhetsteori och statistik
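The bag-of-words transformation mentioned in entry 105 turns each description into a vector of token counts over a fixed vocabulary. A minimal sketch follows, assuming simple lowercase whitespace tokenization; the two example descriptions are invented and are not taken from the study's data, and the study's actual preprocessing is not described in the abstract.

```python
from collections import Counter


def build_vocabulary(texts):
    """Map each distinct token to a column index, in alphabetical order."""
    tokens = sorted({tok for text in texts for tok in text.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(text, vocab):
    """Return the count vector of `text`; tokens outside the vocabulary are ignored."""
    vector = [0] * len(vocab)
    for tok, count in Counter(text.lower().split()).items():
        if tok in vocab:
            vector[vocab[tok]] = count
    return vector


# Hypothetical business descriptions, for illustration only.
descriptions = [
    "retail sale of clothing",
    "wholesale of clothing and shoes",
]
vocab = build_vocabulary(descriptions)
features = [bag_of_words(d, vocab) for d in descriptions]
```

Each row of `features` has one column per vocabulary word, so word order is discarded and only counts remain. Vectors of this kind are what a classifier such as gradient boosting would take as input.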
106	APPLICATIONS OF DEEP LEARNING IN TEXT CLASSIFICATION FOR HIGHLY MULTICLASS DATA Grünwald, Adam January 2019 Text classification using deep learning is rarely applied to tasks with more than ten target classes. This thesis investigates if deep learning can be successfully applied to a task with over 1000 target classes. A pretrained Long Short-Term Memory language model is fine-tuned and used as a base for the classifier. After five days of training, the deep learning model achieves 80.5% accuracy on a publicly available dataset, 9.3% higher than Naive Bayes. With five guesses, the model predicts the correct class 92.2% of the time. Keywords: ULMFiT; Neural Networks; NLP; LSTM; Transfer Learning; Probability Theory and Statistics; Sannolikhetsteori och statistik
107	Domain Adaptation for Hypernym Discovery via Automatic Collection of Domain-Specific Training Data / Domänanpassning för identifiering av hypernymer via automatisk insamling av domänspecifikt träningsdata Palm Myllylä, Johannes January 2019 Identifying semantic relations in natural language text is an important component of many knowledge extraction systems. This thesis studies the task of hypernym discovery, i.e., discovering terms that are related by the hypernymy (is-a) relation. Specifically, this thesis explores how state-of-the-art methods for hypernym discovery perform when applied in specific language domains. In recent times, state-of-the-art methods for hypernym discovery are mostly made up of supervised machine learning models that leverage distributional word representations such as word embeddings. These models require labeled training data in the form of term pairs that are known to be related by hypernymy. Such labeled training data is often not available when working with a specific language domain. This thesis presents experiments with an automatic training data collection algorithm. The algorithm leverages a pre-defined domain-specific vocabulary, and the lexical resource WordNet, to extract training pairs automatically. This thesis contributes by presenting experimental results when attempting to leverage such automatically collected domain-specific training data for the purpose of domain adaptation. Experiments are conducted in two different domains: one domain where there is a large amount of text data, and another domain where there is a much smaller amount of text data. Results show that the automatically collected training data has a positive impact on performance in both domains. The performance boost is most significant in the domain with a large amount of text data, with mean average precision increasing by up to 8 points. Keywords: NLP; natural language processing; domain adaptation; hypernym; hyponym; Computer Engineering; Datorteknik
108	A study of a small-scale classroom intervention that uses an adapted neuro-linguistic programming (NLP) modelling approach Day, Trevor Rodney January 2008 This is a largely qualitative, part quantitative, inquiry into the effectiveness of classroom modelling in helping tertiary students prepare for their AS-level examinations. Classroom modelling, a form of peer modelling developed by the author, draws substantially upon neuro-linguistic programming (NLP), a discipline regarded as controversial in education. Classroom modelling involves students investigating each other's more successful practices and drawing out elements that might be woven into their own practice. 373.18
109	A Study on the Efficacy of Sentiment Analysis in Author Attribution Schneider, Michael J 01 August 2015 The field of authorship attribution seeks to characterize an author’s writing style well enough to determine whether he or she has written a text of interest. One subfield of authorship attribution, stylometry, seeks to find the necessary literary attributes to quantify an author’s writing style. The research presented here sought to determine the efficacy of sentiment analysis as a new stylometric feature, by comparing its performance in attributing authorship against the performance of traditional stylometric features. Experimentation, with a corpus of sci-fi texts, found sentiment analysis to have a much lower performance in assigning authorship than the traditional stylometric features. Keywords: NLP; Sentiment Analysis; Authorship Attribution; Data Mining; Stylometry; Computational Linguistics
110	Implementation of an acoustic echo canceller using MATLAB Raghavendran, Srinivasaprasath. January 2003 Thesis (M.S.E.E.)--University of South Florida, 2003. / Document formatted into pages; contains 66 pages. / Includes bibliographical references. / ABSTRACT: The rapid growth of technology in recent decades has changed the whole dimension of communications. Today people are more interested in hands-free communication. In such a situation, the use of a regular loudspeaker and a high-gain microphone, in place of a telephone receiver, might seem more appropriate. This would allow more than one person to participate in a conversation at the same time, as in a teleconference environment. Another advantage is that it would allow the person to have both hands free and to move freely in the room. However, the presence of a large acoustic coupling between the loudspeaker and microphone would produce a loud echo that would make conversation difficult. Furthermore, the acoustic system could become unstable, which would produce a loud howling noise. The solution to these problems is the elimination of the echo with an echo suppression or echo cancellation algorithm. The echo suppressor offers a simple but effective method to counter the echo problem. However, its main disadvantage is that it supports only half-duplex communication, which permits only one speaker to talk at a time. This drawback led to the invention of echo cancellers. An important aspect of echo cancellers is that full-duplex communication can be maintained, which allows both speakers to talk at the same time. The objective of this research was to produce an improved echo cancellation algorithm that is capable of providing convincing results. The three basic components of an echo canceller are an adaptive filter, a doubletalk detector and a nonlinear processor. The adaptive filter creates a replica of the echo and subtracts it from the combination of the actual echo and the near-end signal. The doubletalk detector senses doubletalk, which occurs when both ends are talking, and stops the adaptive filter in order to avoid divergence. Finally, the nonlinear processor removes the residual echo from the error signal. Usually, a certain amount of speech is clipped in the final stage of nonlinear processing. In order to avoid clipping, a noise gate was used as the nonlinear processor in this research. The noise gate allowed a threshold value to be set, and all signals below the threshold were removed. This ensured that only residual echoes were removed in the final stage. To date, real-time implementations of echo cancellation algorithms have relied on VLSI and DSP processors. Since there has been a revolution in the field of personal computers in recent years, this research attempted to implement the acoustic echo canceller algorithm natively on a PC with the help of the MATLAB software. Keywords: aec; nlms; dtd; nlp; matlab.
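Two of the components named in entry 110 can be illustrated with a short sketch: an adaptive filter that builds an echo replica and subtracts it from the microphone signal, and a noise gate that zeroes samples below a threshold. The abstract's keywords mention NLMS, so a normalized LMS update is assumed here. The sketch is written in Python rather than MATLAB, omits the doubletalk detector, and uses a made-up three-tap echo path and synthetic far-end signal; it is not the thesis's implementation.

```python
import random


def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Normalized LMS: adapt weights so the echo replica tracks the echo in `mic`.

    Returns the error signal (mic minus echo replica) and the final weights.
    """
    w = [0.0] * taps          # adaptive filter weights (echo path estimate)
    x_buf = [0.0] * taps      # most recent far-end samples, newest first
    errors = []
    for x, d in zip(far_end, mic):
        x_buf = [x] + x_buf[:-1]
        y = sum(wi * xi for wi, xi in zip(w, x_buf))   # echo replica
        e = d - y                                      # residual after subtraction
        norm = sum(xi * xi for xi in x_buf) + eps      # input power normalization
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x_buf)]
        errors.append(e)
    return errors, w


def noise_gate(signal, threshold):
    """Pass samples whose magnitude reaches the threshold; zero the rest."""
    return [s if abs(s) >= threshold else 0.0 for s in signal]


# Synthetic demo: hypothetical echo path, no near-end speech, no noise.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
echo_path = [0.6, 0.3, 0.1]
mic = [
    sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    for n in range(len(far_end))
]
errors, weights = nlms_echo_cancel(far_end, mic)
gated = noise_gate(errors, threshold=0.01)
```

In this noise-free setting the first three weights converge to the echo path and the residual becomes negligible, so the gate removes what little remains. With real near-end speech, adaptation would need to be frozen during doubletalk, which is the step this sketch leaves out.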
