81

Extracting social networks from fiction : Imaginary and invisible friends: Investigating the social world of imaginary friends.

Ek, Adam January 2017
This thesis develops an approach for extracting the social relations between characters in literary text in order to create a social network. The approach uses co-occurrences of named entities, keywords associated with the named entities, and the dependency relations that exist between the named entities to construct the network. Literary texts use a large number of pronouns to refer to named entities; to resolve the antecedents of these pronouns, a pronoun resolution system is implemented, based on a standard pronoun resolution algorithm. The results indicate that the pronoun resolution system finds the correct named entity in 60.4% of all cases. The social network is evaluated by comparing character importance rankings based on graph properties with an independent, human-generated importance ranking. The generated social networks correlate moderately to strongly with the independent character ranking.
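The co-occurrence part of such an approach can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the sentence-level co-occurrence window, the naive tokenisation, and the use of weighted degree as the importance measure are all assumptions made for illustration, and the keyword, dependency, and pronoun resolution components are left out.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(sentences, characters):
    """Count how often each pair of characters appears in the same sentence.

    Hypothetical helper: treats one sentence as the co-occurrence window.
    """
    edges = Counter()
    for sentence in sentences:
        # Naive tokenisation: strip basic punctuation, split on whitespace.
        tokens = set(sentence.replace(".", " ").replace(",", " ").split())
        present = sorted(c for c in characters if c in tokens)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum the edge weights per character (one possible importance score)."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


sentences = [
    "Alice met Bob at the station.",
    "Bob and Carol argued, while Alice watched.",
    "Carol left.",
]
characters = {"Alice", "Bob", "Carol"}

edges = cooccurrence_network(sentences, characters)
ranking = [name for name, _ in weighted_degree(edges).most_common()]
```

A ranking produced this way could then be compared with a human ranking using a rank correlation coefficient, which is the kind of evaluation the abstract describes.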
82

Automatisk utvinning av felaktigt särskrivna sammansättningar [Automatic extraction of erroneously split compounds]

Hedén, Sofia January 2017
This thesis describes the automatic extraction of split compounds, which are added to a lexicon and implemented in an existing spell checker. The work was performed in cooperation with Svensk TalTeknologi. Many writers have difficulty understanding which phrases should be written jointly and which should be written separately. The computer-assisted language checkers that exist for Swedish today have difficulty dealing with erroneously split and joined compounds, which can result in misleading recommendations. The method developed in this work extracts joined bigrams from a non-normative corpus of 84.6 MB and compares them with unigrams from a normative corpus of 99.2 MB. With some limitations, 2492 possible split compounds that are found in both corpora are extracted and put in a lexicon. The lexicon's precision amounts to 92%. The recall of the spell checker amounts to 60.8% for erroneously split compounds together with words that may be written either jointly or separately, and to 41.6% for erroneously split compounds alone. The lexicon shows high accuracy, and with simple means the precision can be increased further. The spell checker does not perform as well, but with a more extensive lexicon its performance will increase as well.
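The core comparison described above, joining adjacent words from a non-normative corpus and looking the result up among the unigrams of a normative corpus, can be sketched as follows. This is an illustrative sketch only: the function name and the toy sentences are invented, and the "limitations" (filters) mentioned in the abstract are not specified there and are not modelled here.

```python
def candidate_split_compounds(non_normative_tokens, normative_tokens):
    """Return bigrams whose concatenation occurs as a word in the normative corpus.

    Hypothetical helper: no frequency thresholds or other filters are applied.
    """
    normative_vocab = set(normative_tokens)
    candidates = set()
    # Pair each token with its right-hand neighbour to form bigrams.
    for first, second in zip(non_normative_tokens, non_normative_tokens[1:]):
        if first + second in normative_vocab:
            candidates.add((first, second))
    return candidates


# Toy data: "sjuk sköterska" (sick nurse) is a split form of "sjuksköterska" (nurse).
non_normative = "jag träffade en sjuk sköterska i går".split()
normative = "en sjuksköterska arbetar på ett sjukhus".split()

lexicon = candidate_split_compounds(non_normative, normative)
```

In practice a bigram such as "sjuk sköterska" can also be a legitimate phrase, which is one reason a lexicon built this way needs manual precision checks of the kind the abstract reports.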
83

Putting a spin on SPINN : Representations of syntactic structure in neural network sentence encoders for natural language inference

Jesper, Segeblad January 2017
This thesis presents and investigates a dependency-based recursive neural network model applied to the task of natural language inference. The dependency-based model is a direct extension of a previous constituency-based model used for natural language inference. It is tested on the Stanford Natural Language Inference corpus and compared with the previously proposed constituency-based model as well as a recurrent Long Short-Term Memory (LSTM) network. The experiments show that the LSTM outperforms both the dependency-based models and the constituency-based model. It is also shown that what should be explicitly represented depends on the model dimensionality used. With 50-dimensional models, more explicit representations of the dependency structure provide higher accuracy, and the best dependency-based model performs on par with the LSTM. Higher model dimensionalities seem to favor less explicit representations of the dependency structure. We hypothesize that a smaller dimensionality requires a more explicit representation of the relevant linguistic features of the input, while the explicit representation becomes limiting when a higher model dimensionality is used.
84

Using cloud services and machine learning to improve customer support : Study the applicability of the method on voice data

Spens, Henrik, Lindgren, Johan January 2018
This project investigated how machine learning could be used to classify voice calls in a customer support setting. A set of a few hundred labeled voice calls was recorded and used as data. The calls were transcribed to text using a speech-to-text cloud service. This text was then normalized and used to train models able to classify new voice calls. Different algorithms were used to build the models, including support vector machines and neural networks. The best model, identified through an extensive parameter search, was a support vector machine. Using this model, a program that can classify live voice calls was built.
85

Neural Networks for Part-of-Speech Tagging

Strandqvist, Wiktor January 2016
The aim of this thesis is to explore the viability of artificial neural networks using a purely contextual word representation as a solution for part-of-speech tagging. Furthermore, the effects of deep learning and of providing the network with more contextual information are explored. This was achieved by creating an artificial neural network written in Python. The input vectors employed were created by Word2Vec. This system was compared, with respect to accuracy and precision, to a baseline tagger with handcrafted features. The results show that artificial neural networks using a purely contextual word representation are promising, but ultimately fall roughly two percent short of the baseline. The suspected reason for this is the suboptimal representation of rare words. Deeper network architectures show an insignificant improvement, indicating that the data sets used might be too small. Additional context information yielded higher accuracy, but accuracy began to decline beyond a context size of one.
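The notion of "context size" can be made concrete with a small sketch of how a network input might be assembled from word vectors. This is an assumption about the general technique, not the thesis's code: the function name is hypothetical, the two-dimensional vectors are toy stand-ins for Word2Vec embeddings, and zero vectors are assumed for padding and unknown words.

```python
def window_features(tokens, index, embeddings, context_size, dim):
    """Concatenate the vectors of a target word and its neighbours.

    Positions outside the sentence, and words missing from `embeddings`,
    are represented by a zero vector of length `dim`.
    """
    pad = [0.0] * dim
    features = []
    for pos in range(index - context_size, index + context_size + 1):
        if 0 <= pos < len(tokens):
            features.extend(embeddings.get(tokens[pos], pad))
        else:
            features.extend(pad)
    return features


# Toy 2-dimensional vectors standing in for trained embeddings.
embeddings = {"the": [1.0, 0.0], "dog": [0.0, 1.0], "barks": [1.0, 1.0]}
tokens = ["the", "dog", "barks"]

# Context size 1: previous word, target word, next word.
first_word_input = window_features(tokens, 0, embeddings, context_size=1, dim=2)
```

With a context size of k the input has (2k + 1) × dim values, so larger contexts increase the number of parameters to learn, which may be one reason extra context stops helping on small data sets.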
86

Translationese and Swedish-English Statistical Machine Translation

Joelsson, Jakob January 2016
This thesis investigates how well machine-learned classifiers can identify translated text, and the effect translationese may have in Statistical Machine Translation (SMT), all in a Swedish-to-English, and reverse, context. Translationese is a term used to describe the dialect of a target language that is produced when a source text is translated. The systems trained for this thesis are SVM-based classifiers for identifying translationese, as well as translation and language models for SMT. The classifiers successfully distinguished translationese from non-translated text and, to some extent, also identified which source language the texts were translated from. In the SMT experiments, variation of the translation model was what affected the results the most in the BLEU evaluation. Systems configured with non-translated source text and translationese target text performed better than their reversed counterparts. The language model experiments showed that models trained on known translationese and on classified translationese performed better than those trained on known non-translated text, though classified translationese did not perform as well as known translationese. Ultimately, the thesis shows that translationese can be identified by machine-learned classifiers and may affect the results of SMT systems.
87

Pronoun translation between English and Icelandic

Odd, Jakobsson January 2018
A problem in machine translation is how to handle pronouns, since languages use them differently, for example in anaphoric reference. This essay examines what happens to the English third-person pronouns he, she, and it when translated into Icelandic. Parallel corpora were prepared by tokenisation, and the machine translation method of word alignment was subsequently applied to the corpora. The results show that major problems arise when a pronoun is used to refer to something outside the sentence (extra-sentential reference). Another problem encountered was the difference in deictic strength between pronouns in English and Icelandic. One conclusion that can be drawn is that more research is needed, since translation requires more reliable ways of handling pronouns.
88

Facebook i skolan? : Ur ett elevperspektiv [Facebook in school? From a student perspective]

Pettersson, Erica January 2011
This degree project is based on a survey conducted in one year-seven class and one year-nine class in the upper years of compulsory school. The aim has been to investigate, from a student perspective, which social media young people use today, above all Facebook, which is very much of the moment with over 4 million members in Sweden alone. It also asks whether teachers can make use of Facebook as a pedagogical tool in their teaching, since it is something the students naturally use anyway. This is an interesting question because social media seem to be taking an ever larger place in our increasingly digitalised society, where a great deal of information is available on the internet.
89

Tolkning av spansk känsloprosodi [Interpretation of Spanish emotional prosody]

Olavison, Jari January 2003
Text-to-speech systems are becoming increasingly common in everyday life, and a good deal of research is also being done on the development of speech-to-speech translation systems. Many companies increasingly use telephone services in which automatic systems with synthetic speech and speech recognition replace humans. For us as consumers to feel comfortable using these services and to understand the messages, it is important that these synthetic voices sound as natural as possible. What makes a voice natural is its prosody, i.e. its non-segmental aspects such as intonation, intensity and tempo, to name a few. Prosody has not only linguistic functions; it also signals the speaker's emotions and attitudes. Who wants to listen to a synthetic voice that sounds very sad or angry, for example when the car's GPS navigator sorrowfully tells us to take the next exit on the right? Emotions are normally signalled both auditorily and visually: a happy person often has a smile on their lips and speaks in a way that gives us as listeners the impression that the person is happy. This study concerns the auditory signalling of emotions, which I call emotional prosody. It is not self-evident that speakers of different languages signal emotions in the same way, even though many linguists, like me, are convinced that there is a certain universality, which should be taken into account in speech-to-speech translation systems. For this reason I have chosen to compare Swedish auditory emotional utterances with Spanish ones. I did this by running perception tests with Spanish voices and comparing the results with an earlier study by Åsa Abelin and Jens Allwood at Göteborgs universitet (1999), who carried out a similar study using Swedish voices. Comparisons of misinterpretations of intended emotions indicate, among other things, that certain emotions appear to be expressed differently in Spanish and Swedish. This is clearest for "surprise", which in both studies was to a large extent misinterpreted by informants with a different native language from the speaker; "disgust" also appears to be expressed somewhat differently. Other results are that Swedish speakers often misinterpret "anger" (Spanish) as "joy", which can be compared with Spanish speakers misinterpreting "joy" (Swedish) as "sorrow". The study also shows that emotions that are confused with each other are often acoustically similar in expression and also share some semantic similarities.
90

Using Alignment Methods to Reduce Translation of Changes in Structured Information

Resman, Daniel January 2012
In this thesis I present an unsupervised approach, which can be made supervised, for reducing the translation of changes in structured information stored in XML documents. By combining a sentence boundary detection algorithm and a sentence alignment algorithm, a translation memory is created from the old version of the information in different languages. This translation memory can then be used to translate sentences that have not changed. The structure of the XML is used to improve performance. Two implementations were made and evaluated in three steps: sentence boundary detection, sentence alignment, and correspondence. The last step evaluates the use of the translation memory on a new version in the source language. The second implementation was an improvement, using the results of the evaluation of the first implementation. The evaluation was done using 100 XML documents in English, German and Swedish. There was a significant difference between the results of the implementations in the first two steps. The errors were reduced with each step, and in the last step there were only three errors for the first implementation and none for the second. The evaluation of the implementations showed that it was possible to reduce the text that requires re-translation by about 80%. Similar information can be, and is, used by translators to achieve higher productivity, but this thesis shows that it is possible to reduce translation even before the texts reach the translators.
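The reuse step can be sketched as a simple lookup, assuming that sentence boundary detection and sentence alignment have already produced one-to-one sentence pairs. The function names and example sentences below are hypothetical and only illustrate the idea of a translation memory; the thesis's use of XML structure is not modelled.

```python
def build_translation_memory(old_source_sentences, old_target_sentences):
    """Map each old source sentence to its aligned translation.

    Assumes the two lists are already aligned one-to-one.
    """
    return dict(zip(old_source_sentences, old_target_sentences))


def apply_translation_memory(memory, new_source_sentences):
    """Reuse stored translations; collect sentences that still need translating."""
    reused = {}
    pending = []
    for sentence in new_source_sentences:
        if sentence in memory:
            reused[sentence] = memory[sentence]
        else:
            pending.append(sentence)
    return reused, pending


old_en = ["Press the start button.", "Wait for the green light."]
old_sv = ["Tryck på startknappen.", "Vänta på grönt ljus."]
new_en = ["Press the start button.", "Wait for the blue light."]

memory = build_translation_memory(old_en, old_sv)
reused, pending = apply_translation_memory(memory, new_en)
```

Here only the changed sentence ends up in `pending`, which corresponds to the portion of text that would still be sent to translators.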
