Global ETD Search

1	Med doktorsexamen i bagaget : en enkätundersökning riktad till disputerade vid Uppsala universitet 1997-2001. Språkvetenskapliga fakulteten. Björnermark, Maria. January 2006. No description available. Keywords: faculty of languages, evaluation, doctoral degree, alumni.
2	Fördomsfulla associationer i en svensk vektorbaserad semantisk modell / Bias in a Swedish Word Embedding. Jonasson, Michael. January 2019. Semantiska vektormodeller är en kraftfull teknik där ords mening kan representeras av vektorer vilka består av siffror. Vektorerna tillåter geometriska operationer vilka fångar semantiskt viktiga förhållanden mellan orden de representerar. I denna studie implementeras och appliceras WEAT-metoden för att undersöka om statistiska förhållanden mellan ord som kan uppfattas som fördomsfulla existerar i en svensk semantisk vektormodell av en svensk nyhetstidning. Resultatet pekar på att ordförhållanden i vektormodellen har förmågan att återspegla flera av de sedan tidigare IAT-dokumenterade fördomar som undersöktes. I studien implementeras och appliceras också WEFAT-metoden för att undersöka vektormodellens förmåga att representera två faktiska statistiska samband i verkligheten, vilket görs framgångsrikt i båda undersökningarna. Resultaten av studien som helhet ger stöd till metoderna som används och belyser samtidigt problematik med att använda semantiska vektormodeller i språkteknologiska applikationer. / Word embeddings are a powerful technique in which word meaning is represented by vectors of real numbers. The vectors allow geometric operations that capture semantically important relationships between the words. In this study, WEAT is applied to examine whether statistical properties of words pertaining to bias can be found in a Swedish word embedding trained on a corpus from a Swedish newspaper. The results show that the word embedding can represent several of the IAT-documented biases that were tested. A second method, WEFAT, is applied to the word embedding to explore the embedding's ability to represent actual statistical properties, which is also done successfully. The results of this study lend support to the validity of both methods and also highlight the issue of problematic relationships between words in word embeddings.
3	Sentiment Analysis of Equity Analyst Research Reports using Convolutional Neural Networks. Olof, Löfving. January 2019. Natural language processing, a subfield of artificial intelligence and computer science, has recently attracted great research interest due to the vast amount of information created on the internet in the modern era. One of the main areas of natural language processing is sentiment analysis, a field that studies the polarity of human natural language and generally tries to categorize it as positive, negative or neutral. In this thesis, sentiment analysis has been applied to research reports written by equity analysts. The objective has been to investigate whether there exists a distinct distribution of the reports and whether sentiment in these reports can be classified. The thesis consists of two parts: first, investigating how to divide the reports into different sentiment labelling regimes, and second, categorizing the sentiment using machine learning techniques. Logistic regression as well as several convolutional neural network structures have been used to classify the sentiment. Working with textual data requires mapping text to real-valued values called features. Several feature extraction methods have been investigated, including Bag of Words, term frequency-inverse document frequency and Word2vec. Of the tested labelling regimes, classifying the documents using upgrades and downgrades of the report recommendation shows the most promise. For this regime, the convolutional neural network architectures outperform logistic regression by a significant margin. Of the networks tested, a double input channel utilizing two different Word2vec representations performs best. The two representations originate from different sources: one was trained on the set of equity research reports and the other was trained by the Google Brain team on an extensive Google News data set. This suggests that combining one representation that represents topic-specific words with one that is better at representing more common words enhances classification performance.
4	Automatic Error Detection and Correction in Neural Machine Translation : A comparative study of Swedish to English and Greek to English. Papadopoulou, Anthi. January 2019. Automatic detection and automatic correction of machine translation output are important steps in ensuring optimal quality of the final output. In this work, we compared the output of neural machine translation for two language pairs, Swedish to English and Greek to English. The comparison was made using common machine translation metrics (BLEU, METEOR, TER) and syntax-related ones (POSBLEU, WPF, WER on POS classes). It was found that neither the common metrics nor the purely syntax-related ones were able to capture the quality of the machine translation output accurately, but the decomposition of WER over POS classes was the most informative. A sample of each language was taken to aid in the comparison between manual and automatic error categorization over five error categories, namely reordering errors, inflectional errors, missing words, extra words, and incorrect lexical choices. Both Spearman’s ρ and Pearson’s r showed a good correlation with human judgment, with values above 0.9. Finally, based on the results of this error categorization, automatic post-editing rules were implemented and applied, and their performance was checked against the sample and the rest of the data set, with varying results. The impact on the sample was greater, showing improvement in all metrics, while the impact on the rest of the data set was negative. An investigation of this, alongside the fact that correction was not possible for Greek due to extremely free reference translations and a lack of error patterns in spoken speech, reinforced the belief that automatic post-editing is tightly connected to consistency in the reference translation, while also showing that, when handling machine translation output, more than one reference translation may be needed to ensure better results.
5	Transcription of Historical Encrypted Manuscripts : Evaluation of an automatic interactive transcription tool. Johansson, Kajsa. January 2019. Countless historical sources are preserved in national libraries and archives all over the world and contain important information about our history. Some of these sources are encrypted to prevent people from reading them. This thesis examines a semi-automated interactive transcription tool, based on unsupervised learning without any labelled training data, that has been developed for the transcription of encrypted sources, and compares it to manual transcription. The interactive transcription tool is based on handwritten text recognition (HTR) techniques, and the system identifies clusters of symbols based on similarity measures. The tool is evaluated on ciphers with number sequences that have previously been transcribed manually, in order to compare how well the transcription tool performs. The weaknesses of the tool are described and suggestions for improving it are proposed. Transcription based on HTR techniques and clustering shows promising results, and the unsupervised clustering-based method should be further investigated on ciphers with various symbol sets.
6	Using Unsupervised Morphological Segmentation to Improve Dependency Parsing for Morphologically Rich Languages. Yusupujiang, Zulipiye. January 2018. In this thesis, we investigate the influence of using unsupervised morphological segmentation as features on the dependency parsing of morphologically rich languages such as Finnish, Estonian, Hungarian, Turkish, Uyghur, and Kazakh. Studying the morphology of these languages is of great importance for dependency parsing, since dependency relations in a sentence in these languages rely mostly on morphemes rather than word order. To investigate our research questions, we conducted a large number of parsing experiments on both MaltParser and UDPipe. We generated the supervised morphology and the predicted POS tags with UDPipe, obtained the unsupervised morphological segmentation from Morfessor, converted the unsupervised morphological segmentation into features, and added them to the UD treebanks of each language. We also investigated different ways of converting the unsupervised segmentation into features and studied the result of each method. We report the Labeled Attachment Score (LAS) for all of our experimental results. The main finding of this study is that dependency parsing of some languages can be improved simply by providing unsupervised morphology during parsing when no manually annotated or supervised morphology is available for such languages. After adding unsupervised morphological information with predicted POS tags, we obtain improvements of 4.9%, 6.0%, 8.7%, 3.3%, 3.7%, and 12.0% on the test sets of Turkish, Uyghur, Kazakh, Finnish, Estonian, and Hungarian respectively on MaltParser, and the parsing accuracies improve by 2.7%, 4.1%, 8.2%, 2.4%, 1.6%, and 2.6% on the test sets of Turkish, Uyghur, Kazakh, Finnish, Estonian, and Hungarian respectively on UDPipe, compared with models that do not use any morphological information during parsing.
7	The effect of speaking style on the performance of a forensic voice comparison system. Koschwitz, Joana. January 2018. No description available.
8	Natural language interfaces over spatial data : investigations in scalability, extensibility and reliability / Naturliga-språkgränssnitt över rumsliga data : undersökningar i skalbarhet, utbyggbarhet och tillförlitlighet. Mollevik, Johan. January 2013. No description available.
9	Intonation and sentence type interpretation in Greek : A production and perception approach. Kotsifas, Dimitrios. January 2009. This thesis examines the intonation patterns of Modern Greek with regard to different interpretations of the sentence types (declarative, interrogative, imperative). Fourteen utterances were produced by Greek native speakers (2 men and 2 women) to express various speech acts: STATEMENT, QUESTION, COMMAND and REQUEST. The F0 curve of each utterance was acquired with the Wavesurfer tool, which allows an analysis of the pitch movements and their alignments. After the F0 curves were analyzed and plotted in Excel, we were able to compare and group them, arriving at 5 different intonation patterns. A second-level comparison, based on the fact that some of the F0 curves differed only in the final pitch movement, yielded 3 fundamental categories of intonation patterns. Category I, whose main feature is a rising pitch movement aligned to the onset of the stressed syllables, includes only sentences that denote a statement, so we call it the STATEMENT category. Category II is characterized by a dipping pitch movement aligned to the head of the utterance, that is, the stress of the verb or of a particle that signifies negation (/min/, /den/). Sentences meaning COMMAND or REQUEST belong to this category. Lastly, the intonation pattern of Category III consists of peaking pitch movements aligned to the initial and final stressed syllables. Interrogative sentences belong to this category regardless of their interpretation. A secondary goal of the thesis is to examine to what extent intonation can be a safe criterion for the “correct” interpretation of a sentence. The presumption that misunderstandings among speakers can always arise, since the ratio between the number of utterances (14) and the number of intonation patterns (5) is not 1:1, is largely confirmed by the results of our perception test with Greek native speakers. They were able to identify most of the speech acts expressed by the most common (default) sentence type (i.e. an imperative sentence for COMMAND and an interrogative for QUESTION); however, they had difficulty identifying some combinations, such as interrogative sentences that denoted something other than QUESTION, e.g. REQUEST or STATEMENT. Finally, a perception test with Flemish speakers (subjects who were native speakers of a language other than Greek) showed that they were more successful with sentences that meant STATEMENT and QUESTION, but they could hardly identify an interrogative sentence that meant something other than QUESTION, and they also confused COMMAND and REQUEST. This implies that the intonation used to convey different interpretations is basically language-dependent. In conclusion, this study offers a description of the intonation patterns (based on pitch movements) of the 3 sentence types with 4 different interpretations. Our findings show that intonation seems to be structure-independent in some cases (i.e. for sentences that express COMMAND or STATEMENT) and structure-dependent in others (cf. the interrogative sentences). Additionally, the fact that negation can play an important role in the choice of intonation pattern (as shown for COMMAND and STATEMENT) could be considered a structure-dependent feature of intonation. This approach contrasts with the one used for many years in traditional grammar, according to which the structure alone (sentence type) defines the meaning that is to be conveyed.
10	Digitala verktyg och läsmotivation : Hur digitala verktyg används i högstadiet för att stimulera elevernas vilja att läsa. Nilsson, Beatrice. January 2017. No description available.
