Global ETD Search

91	Tolkning av spansk känsloprosodi [Interpretation of Spanish Emotional Prosody]
Olavison, Jari. January 2003.
Text-to-speech systems are becoming increasingly common in everyday life, and considerable research is also devoted to developing speech-to-speech translation systems. Many companies rely more and more on telephone services in which automatic systems with synthetic speech and speech recognition replace people. For us as consumers to feel comfortable using these services and to understand the messages, it is important that the synthetic voices sound as natural as possible. What makes a voice natural is its prosody, i.e. its non-segmental aspects such as intonation, intensity and tempo, to name a few. Prosody not only has linguistic functions but also signals the speaker's emotions and attitudes. Who wants to listen to a synthetic voice that sounds very sad or angry, for example when the car's GPS navigator mournfully tells us to take the next exit to the right? Emotions are normally signalled both auditorily and visually: a happy person often has a smile on their lips and speaks in a way that gives us as listeners the impression that the person is happy. This study concerns the auditory signalling of emotions, which I call emotional prosody. It is not self-evident that speakers of different languages signal emotions in the same way, although many linguists, myself included, are convinced that there is a certain universality; this should be taken into account in speech-to-speech translation systems. For this reason I have chosen to compare Swedish auditory expressions of emotion with Spanish ones. I did this by running perception tests with Spanish voices and comparing the results with an earlier study by Åsa Abelin and Jens Allwood at the University of Gothenburg (1999), who carried out a similar study using Swedish voices. Comparisons of the misinterpretations of intended emotions indicate, among other things, that certain emotions appear to be expressed differently in Spanish and Swedish. This is clearest for "surprise", which in both studies was to a large extent misinterpreted by informants whose native language differed from the speaker's; "disgust" also appears to be expressed somewhat differently. Other results are that Swedish speakers often misinterpret (Spanish) "anger" as "joy", which can be compared with Spanish speakers misinterpreting (Swedish) "joy" as "sadness". The study also shows that emotions that are confused with one another are often acoustically similar in expression and also share some semantic similarities.
emotions; misinterpretations; acoustic parameters; semantic features; Spanish speakers; Swedish speakers
92	Using Alignment Methods to Reduce Translation of Changes in Structured Information
Resman, Daniel. January 2012.
In this thesis I present an unsupervised approach, which can be made supervised, for reducing the translation of changes in structured information stored in XML documents. By combining a sentence boundary detection algorithm and a sentence alignment algorithm, a translation memory is created from the old version of the information in different languages. This translation memory can then be used to translate sentences that have not changed. The structure of the XML is used to improve performance. Two implementations were made and evaluated in three steps: sentence boundary detection, sentence alignment and correspondence. The last step evaluates the use of the translation memory on a new version in the source language. The second implementation was an improvement based on the results of the evaluation of the first implementation. The evaluation was done using 100 XML documents in English, German and Swedish. There was a significant difference between the results of the implementations in the first two steps. The errors were reduced at each step, and in the last step there were only three errors with the first implementation and none with the second. The evaluation of the implementations showed that it was possible to reduce the text that requires re-translation by about 80%. Similar information can be, and is, used by translators to achieve higher productivity, but this thesis shows that it is possible to reduce translation even before the texts reach the translators.
translation; sentence alignment; translation memory; sentence boundary detection
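The reuse step described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the thesis implementation: it assumes that sentence boundary detection and alignment have already produced parallel sentence lists with a one-to-one alignment, and the function names `build_translation_memory` and `translate_with_memory` as well as the example sentences are hypothetical.

```python
def build_translation_memory(source_sentences, target_sentences):
    """Map each old source sentence to its aligned translation (1:1 alignment assumed)."""
    if len(source_sentences) != len(target_sentences):
        raise ValueError("this sketch assumes a one-to-one sentence alignment")
    return dict(zip(source_sentences, target_sentences))


def translate_with_memory(new_source_sentences, memory):
    """Reuse stored translations; None marks a sentence that still needs a translator."""
    return [memory.get(sentence) for sentence in new_source_sentences]


# Old version of a document in two languages (hypothetical example data).
old_en = ["Press the start button.", "Wait for the green light."]
old_sv = ["Tryck på startknappen.", "Vänta på grönt ljus."]
memory = build_translation_memory(old_en, old_sv)

# New source version: the first sentence is unchanged, the second was edited.
new_en = ["Press the start button.", "Wait for the blue light."]
reused = translate_with_memory(new_en, memory)
needs_translation = [s for s, t in zip(new_en, reused) if t is None]
```

Only the sentences collected in `needs_translation` would be sent on for translation; exact-match lookup is the simplest possible policy, and a real system would likely need fuzzier matching.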
93	Rhetorical Structures in Medication Information for Patients and Physicians : A comparative study in preparation for text generation
Krevers, Robert. January 2011.
The healthcare domain contains a lot of information that could help patients understand and handle their situation, if it is presented in an understandable way. One way to assist healthcare professionals in this endeavour could be a text generation system that can handle a large amount of information and produce a text adapted to fit the knowledge and needs of the recipient. In order to construct such a system, the current methods for presenting and adapting texts in the healthcare domain need to be analysed and understood. In this study, Rhetorical Structure Theory is used, which is a framework that has often been applied within text generation to map out how texts are structured. The objective is to discern how texts containing medication information directed toward laymen are structured in comparison to similar texts directed toward healthcare professionals. It turns out that the texts directed toward laymen prompt and motivate the reader directly, while texts directed toward healthcare professionals at most offer advice and generally provide more neutral, comprehensive information. The results indicate that Rhetorical Structure Theory can be used to find different intentions with texts directed toward different recipients, as well as how these intentions are mediated in the texts, in a structured way that appears to be useful for the text generation process. / The healthcare field contains a great deal of information that could help patients understand and manage their situation, provided that it is formulated comprehensibly. One way to ease this task for healthcare staff could be a text generation system that can handle the large amount of information and produce a text adapted to the recipient's needs and prior knowledge. To construct such a system, however, the current practice in healthcare for formulating and adapting texts must be analysed and understood. This study uses Rhetorical Structure Theory, a structuring framework that has often been applied in text generation to map out how texts cohere. The aim is to determine how texts with medical information intended for private individuals are structured compared with similar texts intended for healthcare staff. It turns out that texts aimed at private individuals give direct exhortations and motivations, while texts aimed at healthcare staff at most offer advice and on the whole give more neutral, wide-ranging information. The results indicate that Rhetorical Structure Theory can be used to identify differences in intention between texts aimed at different recipients, and how these intentions are conveyed in text, in a structured way that appears to be useful for the text generation process.
Human Computer Interaction
94	MaltParser -- An Architecture for Inductive Labeled Dependency Parsing
Hall, Johan. January 2006.
This licentiate thesis presents a software architecture for inductive labeled dependency parsing of unrestricted natural language text, which achieves a strict modularization of parsing algorithm, feature model and learning method such that these parameters can be varied independently. The architecture is based on the theoretical framework of inductive dependency parsing by Nivre (2006) and has been realized in MaltParser, a system that supports several parsing algorithms and learning methods, for which complex feature models can be defined in a special description language. Special attention is given in this thesis to learning methods based on support vector machines (SVM). The implementation is validated in three sets of experiments using data from three languages (Chinese, English and Swedish). First, we check whether the implementation realizes the underlying architecture. The experiments show that the MaltParser system outperforms the baseline and satisfies the basic constraints of well-formedness. Furthermore, the experiments show that it is possible to vary parsing algorithm, feature model and learning method independently. Secondly, we focus on the special properties of the SVM interface. It is possible to reduce the learning and parsing time without sacrificing accuracy by dividing the training data into smaller sets, according to the part-of-speech of the next token in the current parser configuration. Thirdly, the last set of experiments presents a broad empirical study that compares SVM to memory-based learning (MBL) with five different feature models, where all combinations have gone through parameter optimization for both learning methods. The study shows that SVM outperforms MBL for more complex and lexicalized feature models with respect to parsing accuracy. There are also indications that SVM, with a splitting strategy, can achieve faster parsing than MBL. The parsing accuracy achieved is the highest reported for the Swedish data set and very close to the state of the art for Chinese and English. / This licentiate thesis presents a software architecture for data-driven dependency parsing, i.e. for automatically producing a syntactic analysis in the form of dependency graphs for sentences in natural language text. The architecture builds on the idea that parsing algorithm, feature model and learning method should be variable independently of one another. As the basis for the architecture we have used the theoretical framework for inductive dependency parsing presented by Nivre (2006). The architecture has been realized in the MaltParser software, in which complex feature models can be defined in a special description language. In this thesis we place extra emphasis on describing how we integrated the learning method support vector machines (SVM). MaltParser is validated in three series of experiments using data from three languages (Chinese, English and Swedish). The first series checks whether the implementation realizes the underlying architecture. The experiments show that MaltParser outperforms a trivial dependency parsing method (a baseline) and that the basic requirements for well-formed dependency graphs are met. The experiments also show that parsing algorithm, feature model and learning method can be varied independently of one another. The second series focuses on the special properties of the SVM interface. The experiments show that learning and parsing time can be reduced without loss of parsing accuracy by splitting the training data according to the part-of-speech tag of the next word in the current parser configuration. The third and final series presents an empirical study comparing SVM with memory-based learning (MBL). The study uses five feature models, where all combinations of language, learning method and feature model have undergone extensive parameter optimization. The experiments show that SVM outperforms MBL for more complex and lexicalized feature models with respect to parsing accuracy. There are also some indications that SVM, with a splitting strategy, can parse a text faster than MBL. For Swedish we can report the highest parsing accuracy to date, and for Chinese and English the results are close to the best that have been reported.
Dependency Parsing; Support Vector Machines; Machine Learning
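The splitting strategy mentioned in this abstract, dividing the training data by the part-of-speech of the next token, might look roughly like the sketch below. It does not use MaltParser's actual API or data format: the instance layout (a feature dictionary with a `next_pos` key plus a transition label) and the function name `split_by_next_pos` are assumptions made for illustration only.

```python
from collections import defaultdict


def split_by_next_pos(instances):
    """Group training instances by the POS tag of the next input token."""
    partitions = defaultdict(list)
    for features, action in instances:
        partitions[features["next_pos"]].append((features, action))
    return dict(partitions)


# Hypothetical parser-configuration instances: (feature dict, transition label).
instances = [
    ({"next_pos": "NN", "stack_pos": "DT"}, "LEFT-ARC"),
    ({"next_pos": "VB", "stack_pos": "NN"}, "SHIFT"),
    ({"next_pos": "NN", "stack_pos": "JJ"}, "LEFT-ARC"),
]
partitions = split_by_next_pos(instances)
partition_sizes = {pos: len(items) for pos, items in partitions.items()}
```

One classifier would then be trained per partition, and at parsing time the tag of the next token selects which classifier to consult. Each classifier sees a smaller training set, which is the plausible source of the reduced learning and parsing time the abstract reports.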
95	Controlled Languages in Software User Documentation
Steensland, Henrik; Dervisevic, Dina. January 2005.
In order to facilitate comprehensibility and translation, the language used in software user documentation must be standardized. If the terminology and language rules are standardized and consistent, the time and cost of translation will be reduced. For this reason, controlled languages have been developed. Controlled languages are subsets of other languages, purposely limited by restricting the terminology and grammar that is allowed. The purpose and goal of this thesis is to investigate how using a controlled language can improve comprehensibility and translatability of software user documentation written in English. In order to reach our goal, we have performed a case study at IFS AB. We specify a number of research questions that help satisfy some of the goals of IFS and, when generalized, fulfill the goal of this thesis. A major result of our case study is a list of sixteen controlled language rules. Some examples of these rules are control of the maximum allowed number of words in a sentence, and control of when the author is allowed to use past participles. We have based our controlled language rules on existing controlled languages, style guides, research reports, and the opinions of technical writers at IFS. When we applied these rules to different user documentation texts at IFS, we managed to increase the readability score for each of the texts. Also, during an assessment test of readability and translatability, the rewritten versions were chosen in 85% of the cases by experienced technical writers at IFS. Another result of our case study is a prototype application that shows that it is possible to develop and use a software checker for helping the authors when writing documentation according to our suggested controlled language rules.
Controlled Language; Readability; Translatability; Style Guides; IFS
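One of the rule types named in this abstract, a maximum number of words per sentence, lends itself to a small checker. The sketch below is not the prototype from the thesis: the limit `MAX_WORDS = 20` is an assumed value (the abstract does not state the actual limit), the sentence splitter is deliberately naive, and the function names are hypothetical.

```python
import re

MAX_WORDS = 20  # assumed limit; the abstract does not state the actual value


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def find_long_sentences(text, max_words=MAX_WORDS):
    """Return (sentence, word_count) pairs that exceed the word limit."""
    violations = []
    for sentence in split_sentences(text):
        word_count = len(sentence.split())
        if word_count > max_words:
            violations.append((sentence, word_count))
    return violations


sample = "Click Save. Then open the menu and select the option that you want to use."
violations = find_long_sentences(sample, max_words=5)
```

With the limit lowered to five words for demonstration, only the second sentence of `sample` is flagged. A rule about past participles would need part-of-speech information and is outside the scope of this sketch.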
96	Exponerade hatkommentarer : En studie av svensk hatkommentarsklassificering [Exposed Hate Comments: A Study of Swedish Hate Comment Classification]
Johansson, Kim. January 2016.
This work presents hateful comments on the internet as a societal problem that we should do something about. The website Exponerat.net is presented as a source of hateful comments. Using the simplifying assumption that the comments found on Exponerat are a good representation of hateful comments on the internet, we construct a classifier. The classifier is evaluated in two steps, one using ten-fold cross-validation and the other manually. The classifier shows acceptable precision/recall values in the first evaluation step but falls short in the manual one. The work concludes with a discussion of how reasonable the simplifying assumption of using a single source is. / Hate speech on the internet is a serious issue. This study asks the question: "Is it possible to use machine learning to do something about it?". By using crawled comments from the blog Exponerat.net as a representation of "hate" and comments from the blog Feber.se as "not-hate", we try to construct a classifier. Evaluation is done in two steps: one using 10-fold cross-validation and one using manual evaluation methods. The classifier produces an acceptable result in the first step but falls short in the second. The study ends with a discussion of whether it is even possible to train a classifier using only one source of data.
exponerat.net; hate comments; SVM; text classification; blog; online hate
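The first evaluation step named in this abstract, ten-fold cross-validation with precision and recall, can be illustrated with the sketch below. It covers only the bookkeeping (fold construction and the two metrics), not the classifier itself. The function names and the toy labels are hypothetical, and the contiguous, unshuffled folds are a simplification.

```python
def k_fold_indices(n_items, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_items) if i < start or i >= start + size]
        yield train, test
        start += size


def precision_recall(gold, predicted, positive="hate"):
    """Compute precision and recall for the positive class."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, predicted) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, predicted) if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


folds = list(k_fold_indices(25, k=10))

# Toy gold labels and predictions, for illustration only.
gold = ["hate", "hate", "not-hate", "not-hate"]
predicted = ["hate", "not-hate", "hate", "not-hate"]
precision, recall = precision_recall(gold, predicted)
```

Cross-validation on data drawn from the same two blogs measures how well the classifier separates those sources, which is consistent with the abstract's observation that the result did not carry over to the manual evaluation.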
97	Användning av Self Organizing Maps som en metod att skapa semantiska representationer ur text [Using Self Organizing Maps as a Method for Creating Semantic Representations from Text]
Fallgren, Per. January 2015.
This study is a degree project in cognitive science that aims to create a model that builds semantic representations using a more biologically plausible approach than traditional methods. The model can be seen as a first step in investigating the following approach. The study investigates the assumption that Self Organizing Maps can be used to create semantic representations from large amounts of text using a distributed-inspired approach. The results point to a potentially working system, but one that needs further investigation in future studies for a higher degree of verification.
Self Organizing Maps; semantic; neural network
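For readers unfamiliar with Self Organizing Maps, a single training update can be sketched as below. This is a generic textbook-style step and not the model from the thesis. It assumes a one-dimensional map, a hard neighbourhood cut-off instead of a smooth neighbourhood function, and fixed rather than decaying learning rate and radius. All names and values are hypothetical.

```python
def best_matching_unit(weights, vector):
    """Index of the map node whose weight vector is closest to `vector`."""
    def sq_dist(w):
        return sum((wi - vi) ** 2 for wi, vi in zip(w, vector))
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i]))


def som_update(weights, vector, learning_rate=0.5, radius=1):
    """Move the best-matching node and its neighbours (1-D map) toward `vector`."""
    bmu = best_matching_unit(weights, vector)
    for i, w in enumerate(weights):
        if abs(i - bmu) <= radius:
            weights[i] = [wi + learning_rate * (vi - wi) for wi, vi in zip(w, vector)]
    return bmu


# Four map nodes with 2-D weights; the input could stand for a word's context vector.
weights = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0], [2.0, 2.0]]
bmu = som_update(weights, [1.0, 1.0], learning_rate=0.5, radius=1)
```

Repeating this update over many input vectors makes neighbouring nodes respond to similar inputs, which is the property that would let a map place related words near one another.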
98	An eye-tracking study on synonym replacement / En ögonrörelsestudie på synonymutbyte
Svensson, Cassandra. January 2015.
As the amount of information increases, the need for automatic text simplification also increases. There are several strategies for doing this, and this thesis has studied two basic synonym replacement strategies. The first one is called word length and means always choosing a shorter synonym if possible. The second one is called word frequency and means always choosing a more frequent synonym if possible. Three different versions of them were tried. The first simply chose the shortest or most frequent synonym. The second chose a synonym only if it was extremely shorter or more frequent. The last chose a synonym only if it met the requirements for being replaced and was on synonym level 5. Statistical analysis of the data revealed no significant difference, but small trends showed that always choosing a more frequent synonym of level 5 seemed to make the text a bit easier.
Synonym replacement; word length; word frequency; eye tracking
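The two basic strategies in this abstract, word length and word frequency, can be sketched as a single selection function. This is an illustration under stated assumptions, not the system evaluated in the thesis. The synonym list and frequency counts are invented, the function name `pick_synonym` is hypothetical, and the stricter variants (the "extremely shorter or more frequent" threshold and the synonym-level filter) are left out.

```python
def pick_synonym(word, synonyms, frequencies, strategy="frequency"):
    """Return a replacement for `word`, or `word` itself if no candidate is better."""
    if strategy == "length":
        best = min(synonyms, key=len, default=word)
        return best if len(best) < len(word) else word
    if strategy == "frequency":
        best = max(synonyms, key=lambda s: frequencies.get(s, 0), default=word)
        return best if frequencies.get(best, 0) > frequencies.get(word, 0) else word
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical synonym list and corpus counts, for illustration only.
synonyms = ["buy", "obtain", "acquire"]
frequencies = {"purchase": 120, "buy": 900, "obtain": 300, "acquire": 150}

by_length = pick_synonym("purchase", synonyms, frequencies, strategy="length")
by_frequency = pick_synonym("purchase", synonyms, frequencies, strategy="frequency")
unchanged = pick_synonym("buy", ["purchase"], frequencies, strategy="length")
```

Both strategies happen to pick the same replacement here. In general they can disagree, since a short word is not necessarily a frequent one.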
99	Cohesion and Comprehensibility in Swedish-English Machine Translated Texts
Askarieh, Sona. January 2014.
Access to texts in many languages creates a growing demand for fast, multi-purpose, and cheap translators. Pervasive internet use intensifies the need for intelligent and cheap translators, since traditional translation methods are too slow to handle such a variety of texts. In recent years, researchers have carried out much research aimed at adding human and artificial intelligence to older machine translation systems; the idea of developing a machine translation system came into existence during the days of World War (Kohenn, 2010). The invention was useful for helping human translators and many other people who need to translate different types of texts according to their needs. The new translation systems are useful in meeting people's needs. Since machine translation systems vary in the quality of their output, their performance should be evaluated from a linguistic point of view in order to reach a fair judgment about that quality. To this end, two different Swedish texts were translated by two different machine translation systems in the thesis. The translated texts were evaluated to examine the extent to which errors affect the comprehensibility of the translations. The performance of the systems was evaluated using three approaches. First, the most common linguistic errors appearing in the systems' output were analyzed (e.g. word alignment of the translated texts). Second, the influence of different types of errors on the cohesion chains was evaluated. Finally, the effect of the errors on the comprehensibility of the translations was investigated. Numerical results showed that some types of errors affect the comprehensibility of the systems' output more than others. The data showed that the subjects' comprehension of the translated texts depends on the type of error but not on its frequency. The analysis showed which translation system performed best.
Languages and Literature
100	Keeping an Eye on the Context : An Eye Tracking Study of Cohesion Errors in Automatic Text Summarization / Med ett öga på sammanhanget : En ögonrörelsestudie av kohesionsfel i automatiska textsammanfattningar
Rennes, Evelina. January 2013.
Automatic text summarization is a growing field in the modern world's Internet-based society, but automatically creating perfect summaries is not easy, and cohesion errors are common. Using an eye-tracking camera, this thesis studies the nature of four different types of cohesion errors occurring in summaries. A total of 23 participants read and rated four different texts and marked the most difficult areas of each text. Statistical analysis of the data revealed that absent cohesion or context and broken anaphoric reference (pronouns) caused some disturbance in reading, but that the impact is restricted to the effort to read rather than the comprehension of the text. Erroneous anaphoric reference (pronouns) was not detected by the participants, which poses a problem for automatic text summarizers, and other potentially disturbing factors were detected. Finally, the question of the meaningfulness of keeping absent cohesion or context as a separate error type was raised.
Automatic text summarization; cohesion errors; eye tracking; CogSum
