Global ETD Search

81	Thoughts don't have Colour, do they? : Finding Semantic Categories of Nouns and Adjectives in Text Through Automatic Language Processing / Generering av semantiska kategorier av substantiv och adjektiv genom automatisk textbearbetning. Fallgren, Per (January 2017). Not all combinations of nouns and adjectives are possible, and some are clearly more frequent than others. With this in mind, this study aims to construct semantic representations of the two parts of speech based on how they occur with each other. By investigating these ideas with automatic natural language processing methods, the study aims to find evidence for a semantic mutuality between nouns and adjectives; this notion suggests that the semantics of a noun can be captured by its corresponding adjectives, and vice versa. Furthermore, a set of proposed categories of adjectives and nouns, based on the ideas of Gärdenfors (2014), is presented; these categories are hypothesized to align with the produced representations. Four evaluation methods were used to analyze the results, ranging from subjective discussion of nearest neighbours in vector space to accuracy computed from manual annotation. The results provided some evidence for the hypothesis, which suggests that further research is of value. Keywords: semantic representations, semantic categories, word vectors, adjective noun pair
82	Dependency Parsing and Dialogue Systems : an investigation of dependency parsing for commercial application. Adams, Allison (January 2017). In this thesis, we investigate dependency parsing for commercial application, namely for future integration in a dialogue system. To do this, we conduct several experiments on dialogue data to assess parser performance on this domain and to improve this performance over a baseline. This work makes the following contributions: first, the creation and manual annotation of a gold-standard data set for dialogue data; second, a thorough error analysis of the data set, comparing neural network parsing to traditional parsing methods on this domain; and finally, various domain adaptation experiments showing how parsing on this data set can be improved over a baseline. We further show that dialogue data is characterized by questions in particular, and suggest a method for improving overall parsing on these constructions. Keywords: dependency parsing, dialogue systems, error analysis
83	Större chans att klara det? : En specialpedagogisk studie av 10 ungdomars syn på hur datorstöd har påverkat deras språk, lärande och skolsituation [A better chance of succeeding? : A special-education study of 10 young people's views on how computer support has affected their language, learning and school situation]. Hansson, Britt (January 2008). In this study, 10 young people were interviewed about their experiences of using a computer with speech synthesis and recorded books. They were asked in which situations the tools had proved useful, or had been felt to be a hindrance, in their learning and school situation. Because of major difficulties at school, the young people had been lent a laptop by the school, which they used both at home and at school. Together with parents and teachers, they received guidance at the municipality's Skoldatatek. The study takes as its starting point, from a sociocultural perspective, that language develops when it is used. Schools are to offer an up-to-date education, and pupils with difficulties at school are entitled to support. How this support should be designed can create a dilemma for the individual school: support aimed directly at the individual may be perceived as treating school difficulties as a problem carried by the pupil, which must not occur in "a school for all". In view of this dilemma, it was important to investigate the young people's experiences of support, development and obstacles, in order to understand whether these lead to singling out and exclusion. The results showed that the young people felt more motivated with their computer tools, which compensated for their difficulties and suited their different learning styles. The young people said that they had become more confident writers and readers thanks to increased language use. Their accounts also show the necessity of support from teachers and parents. The results indicate that alternative tools in learning could contribute to greater goal attainment in a school for all, with pedagogical diversity. Keywords: computer support, special education, Skoldatatek, alternative tools, computer use, compensation
84	Semantisk spegling : En implementation för att synliggöra semantiska relationer i tvåspråkiga data [Semantic mirroring : An implementation for making semantic relations visible in bilingual data]. Andersson, Sebastian (January 2004). Semantic theories in traditional linguistics have mainly focused on the relation between a word and the properties or objects that the word stands for. These theories have seldom been empirically grounded; rather, they have been the result of individual theorists' reflections, exemplified with a handful of words. For use in translation or machine translation, the meaning of a word can instead be defined on the basis of its relation to other languages. Translation of text also leaves behind analysable material, in the form of source text and translation, which opens the possibility of empirically grounded semantic relations. One method for trying to find monolingual semantic relations from bilingual translation data is semantic mirroring. By exploiting the fact that words are ambiguous in different ways in the source language and the target language, semantic relations between words in the source language can be found from their relation to the target language. In this thesis, semantic mirroring has been implemented and applied to bilingual (Swedish and English) dictionary data. Since the monolingual relations in semantic mirroring are derived via another language, this has been exploited in the work to derive bilingual semantic relations as well. The result has been compared with existing synonym dictionaries, evaluated qualitatively, and compared with the original data. The results are of varying quality but nevertheless show the potential of the method and the possibility of using the result as a lexical resource in, for example, lexicography. Keywords: Interdisciplinary studies, language technology, lexicography, semantics, dictionary, synonym, interdisciplinary studies (TVÄRVETENSKAP), Social Sciences Interdisciplinary
85	Extracting social networks from fiction : Imaginary and invisible friends: Investigating the social world of imaginary friends. Ek, Adam (January 2017). This thesis develops an approach to extract the social relations between characters in literary text in order to create a social network. The approach uses co-occurrences of named entities, keywords associated with the named entities, and the dependency relations that exist between the named entities to construct the network. Literary texts contain a large number of pronouns that stand for the named entities. To resolve the antecedents of these pronouns, a pronoun resolution system is implemented based on a standard pronoun resolution algorithm. The results indicate that the pronoun resolution system finds the correct named entity in 60.4% of all cases. The social network is evaluated by comparing character importance rankings based on graph properties with independently human-generated importance rankings. The generated social networks correlate moderately to strongly with the independent character rankings. Keywords: Centrality, Graphs, Named entities, Pronoun, Pronoun resolution, Social network
86	Automatisk utvinning av felaktigt särskrivna sammansättningar [Automatic extraction of erroneously split compounds]. Hedén, Sofia (January 2017). This thesis describes an automatic extraction of split compounds, which are added to a lexicon and implemented in an existing spell checker. The work was performed in cooperation with Svensk TalTeknologi. Many writers have difficulty understanding which phrases should be written jointly and which should be written separately. The computer-assisted language checkers that exist for Swedish today have difficulty dealing with both erroneously split compounds and joined compounds, which can result in misleading recommendations. The method developed in this work extracts joined bigrams from a non-normative corpus of 84.6 MB and compares them with unigrams from a normative corpus of 99.2 MB. With some restrictions, 2492 possible split compounds that are found in both corpora are extracted and put in a lexicon. The lexicon's precision is 92%. The recall of the spell checker is 60.8% for erroneously split compounds together with compounds that can be written either jointly or separately, and 41.6% for erroneously split compounds alone. The lexicon shows high accuracy, and with simple means the precision can be increased further. The spell checker does not perform as well, but a more extensive lexicon would improve its performance too. Keywords: split compound, split compounds, compound, compounds, automatic extraction, language checking, grammar checking, grammar checking program
87	Putting a spin on SPINN : Representations of syntactic structure in neural network sentence encoders for natural language inference. Jesper, Segeblad (January 2017). This thesis presents and investigates a dependency-based recursive neural network model applied to the task of natural language inference. The dependency-based model is a direct extension of a previous constituency-based model used for natural language inference. The dependency-based model is tested on the Stanford Natural Language Inference corpus and is compared to the previously proposed constituency-based model as well as a recurrent Long Short-Term Memory (LSTM) network. The experiments show that the LSTM outperforms both the dependency-based models and the constituency-based model. It is also shown that what should be explicitly represented depends on the model dimensionality used. With 50-dimensional models, more explicit representations of the dependency structure provide higher accuracy, and the best dependency-based model performs on par with the LSTM. Higher model dimensionalities seem to favor less explicit representations of the dependency structure. We hypothesize that a smaller dimensionality requires a more explicit representation of the relevant linguistic features of the input, while the explicit representation becomes limiting when a higher model dimensionality is used. Keywords: Human Computer Interaction
88	Using cloud services and machine learning to improve customer support : Study the applicability of the method on voice data. Spens, Henrik; Lindgren, Johan (January 2018). This project investigated how machine learning could be used to classify voice calls in a customer support setting. A set of a few hundred labeled voice calls was recorded and used as data. The calls were transcribed to text using a speech-to-text cloud service. This text was then normalized and used to train models able to classify new voice calls. Different algorithms were used to build the models, including support vector machines and neural networks. The best model, identified through an extensive parameter search, was a support vector machine. Using this model, a program that can classify live voice calls was built. Keywords: speech-to-text, machine learning, natural language processing
89	Neural Networks for Part-of-Speech Tagging. Strandqvist, Wiktor (January 2016). The aim of this thesis is to explore the viability of artificial neural networks using a purely contextual word representation as a solution for part-of-speech tagging. Furthermore, the effects of deep learning and of increased contextual information on the network are explored. This was achieved by creating an artificial neural network written in Python. The input vectors employed were created by Word2Vec. This system was compared, with respect to accuracy and precision, to a baseline tagger with handcrafted features. The results show that artificial neural networks using a purely contextual word representation show promise, but ultimately fall roughly two percent short of the baseline. The suspected reason for this is the suboptimal representation of rare words. The use of deeper network architectures shows an insignificant improvement, indicating that the data sets used might be too small. The use of additional context information yielded higher accuracy, but accuracy started to decline beyond a context size of one. Keywords: artificial neural network, part-of-speech tagging, language technology
90	Translationese and Swedish-English Statistical Machine Translation. Joelsson, Jakob (January 2016). This thesis investigates how well machine-learned classifiers can identify translated text, and the effect translationese may have in Statistical Machine Translation (SMT), all in a Swedish-to-English, and reverse, context. Translationese is a term used to describe the dialect of a target language that is produced when a source text is translated. The systems trained for this thesis are SVM-based classifiers for identifying translationese, as well as translation and language models for SMT. The classifiers successfully distinguished translationese from non-translated text and, to some extent, also identified the source language from which the texts were translated. In the SMT experiments, variation of the translation model was what affected the results the most in the BLEU evaluation. Systems configured with non-translated source text and translationese target text performed better than their reversed counterparts. The language model experiments showed that models trained on known translationese and on classified translationese performed better than those trained on known non-translated text, though classified translationese did not perform as well as known translationese. Ultimately, the thesis shows that translationese can be identified by machine-learned classifiers and may affect the results of SMT systems. Keywords: Translationese, Statistical Machine Translation, Text Classification, Classification of Translationese
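
Entry 86 (Hedén, 2017) describes joining bigrams from a non-normative corpus and checking the joined forms against unigrams from a normative corpus. The core of that idea might look roughly like the Python sketch below. This is only an illustration under assumptions, not the thesis's implementation: the function name, the `min_count` threshold (a stand-in for the unspecified "restrictions"), and the toy sentences are all hypothetical.

```python
from collections import Counter


def extract_split_compounds(non_normative_tokens, normative_tokens, min_count=1):
    """Map each bigram (w1, w2) to w1+w2 when the joined form is a normative unigram.

    Hypothetical helper: min_count is an assumed filter, not taken from the thesis.
    """
    normative_vocab = {token.lower() for token in normative_tokens}
    tokens = [token.lower() for token in non_normative_tokens]
    # Count adjacent word pairs in the non-normative text.
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    lexicon = {}
    for (first, second), count in bigram_counts.items():
        joined = first + second
        # Keep the pair only if its joined form occurs in the normative corpus.
        if count >= min_count and joined in normative_vocab:
            lexicon[(first, second)] = joined
    return lexicon


# Toy data (invented for illustration): two erroneously split Swedish compounds.
non_normative = "hon är sjuk sköterska på ett stort sjuk hus".split()
normative = "en sjuksköterska arbetar på ett sjukhus".split()

lexicon = extract_split_compounds(non_normative, normative)
for pair, joined in lexicon.items():
    print(" ".join(pair), "->", joined)
```

A real system would need further filtering, since many joined bigrams coincide with valid words even when the split form is also correct; the abstract reports this as the gap between the two recall figures.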
