1 |
Automatisk utvinning av felaktigt särskrivna sammansättningar (Automatic extraction of erroneously split compounds)
Hedén, Sofia, January 2017
This thesis describes the automatic extraction of erroneously split compounds, which are collected in a lexicon and integrated into an existing spell checker. The work was carried out in cooperation with Svensk TalTeknologi. Many writers find it difficult to know which phrases should be written as one word and which may be written as separate words. The computer-assisted language checkers currently available for Swedish have difficulty handling both split and joined compounds, which can lead to misleading recommendations. The method developed in this work extracts joined bigrams from a non-normative corpus of 84.6 MB and compares them with unigrams from a normative corpus of 99.2 MB. With some restrictions, 2492 possible split compounds that occur in both corpora are extracted and added to a lexicon. The precision of the lexicon is 92%. The recall of the spell checker is 60.8% when both erroneously split compounds and words that may be written either jointly or separately are counted, and 41.6% for erroneously split compounds alone. The lexicon is highly accurate, and its precision can be raised further by simple means. The spell checker performs less well, but its performance would also improve with a more extensive lexicon.
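The extraction step described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `tokenize` and `extract_split_compounds`, the `min_count` threshold, and the whitespace tokenization are all assumptions made for illustration, and the abstract does not specify the restrictions that were actually applied. The idea shown is only the core comparison: join each adjacent word pair (bigram) from the non-normative text and keep it if the joined form occurs as a single word (unigram) in the normative text.

```python
from collections import Counter

PUNCTUATION = '.,;:!?"()'


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation.

    Hypothetical helper; a real system would use a proper tokenizer.
    """
    stripped = (word.strip(PUNCTUATION).lower() for word in text.split())
    return [word for word in stripped if word]


def extract_split_compounds(non_normative_text, normative_text, min_count=1):
    """Return a lexicon mapping (first, second) bigrams to their joined form.

    A bigram is kept when its joined form occurs at least `min_count`
    times as a unigram in the normative text. `min_count` is an assumed
    parameter, not one taken from the thesis.
    """
    tokens = tokenize(non_normative_text)
    normative_unigrams = Counter(tokenize(normative_text))

    lexicon = {}
    # Simplification: bigrams are formed across sentence boundaries too.
    for first, second in zip(tokens, tokens[1:]):
        joined = first + second
        if normative_unigrams[joined] >= min_count:
            lexicon[(first, second)] = joined
    return lexicon


if __name__ == "__main__":
    informal = "Hon är sjuk sköterska på ett sjukhus"
    edited = "En sjuksköterska arbetar på sjukhus"
    print(extract_split_compounds(informal, edited))
```

In this toy input the pair "sjuk sköterska" is flagged because "sjuksköterska" appears as one word in the normative text. A candidate found this way is only a possible split compound, since some word pairs are correct both joined and separated, which is the distinction behind the two recall figures reported above.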
|
2 |
Språklig förlust i främmande framtid: Nyspråk och språkkontroll i svenska dystopier 1958–1979 / Estranged Futures and Language Lost: Newspeak and Language Control in Swedish Dystopian Fiction 1958–1979
Järpedal, Ebba, January 2020
Dystopian fiction seeks to make readers conscious of the faults of contemporary society through estrangement. Newspeak plays an important role in this estrangement, being a euphemistic and propagandistic language meant to distort the characters' perception of the fictional world. This type of language, however, has two different functions: one fictional and one didactic, where the latter seeks to emphasize the negative aspects of the fictional world to the reader. In this thesis I analyze the use of newspeak and language control as a means of social criticism in five Swedish dystopian novels published from the late 1950s to the late 1970s. The novels analyzed are: Strålen (1958) by Ann Margret Dahlquist-Ljungberg, De sista (1962) by Arvid Rundberg, Elektra. Kvinna år 2070 (1967) by Ivar Lo-Johansson, Klotjorden (1970) by Kerstin Strandberg, and Järnblommorna (1979) by Jenny Berthelius. Apart from newspeak and language control, I also examine the use of obsolete language and literary onomastics. Additionally, the thesis contains a short bibliography of Swedish utopian and dystopian novels published from 1950 to 1979. Language plays a central role in the novels analyzed: they contain different forms of newspeak, and although these languages appear only sporadically, their function is clearly didactic and meant for social criticism. Language control, on the other hand, is a common theme that is often used to accentuate a totalitarian threat to society. Most of the novels, however, deal primarily with obsolete language. It is the lost and forgotten that produces anxiety. This type of language emphasizes a loss of normative values that makes readers question the fictional society as well as their own.
|