  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Towards the development and application of representative lexicographic corpora for the Gabonese languages

Soami, Leandre Serge 03 1900
Thesis (DLitt (Afrikaans and Dutch))--University of Stellenbosch, 2010. / ENGLISH ABSTRACT: The compilation of dictionaries is a laborious activity and it takes time, money and staff to achieve the objectives of any dictionary project. Many dictionaries have been compiled using the lexicographers’ personal intuition and guessing rather than being corpus based. That resulted in some dictionaries often being criticised by users because of the lack of representation of some important lexical items. This can probably be explained by the fact that most of these dictionaries were compiled in an era when theoretical lexicography was lacking or not well established. The last decades have witnessed the emergence of metalexicography as a theory directed also at dictionary planning in order to enhance the quality of lexicographic practice and the way in which the management and the compilation of dictionaries are dealt with. The planning of dictionaries takes into account not only the gathering of language material to be used but also the way in which this material will be treated and presented on both the macrostructural and the microstructural level as well as in the front matter texts and the back matter texts. In order to enhance the quality of the presentation in dictionaries, this dissertation pleads in favour of the formulation of a data collection policy that takes into consideration all the different sources of material, written and spoken, used in the different phases of the compilation of a dictionary. The three phases that form the main focus of this study are the material acquisition phase, the material preparation phase and the material processing phase. The involvement of the speech community in the compilation of a lexicographic corpus ensures the collection of representative and balanced data, and the different needs of that community are central to the dictionary project. The different language materials can be organised into different corpus types. 
The efficiency of a corpus resides in its capacity to provide different data types that can be included in the comment on semantics and the comment on form of each article in the central list of each dictionary. Some dictionaries lack a good representation of data in both these comments in the different articles. However, languages such as the Gabonese languages are in a privileged situation because they can still avoid the mistakes of other dictionary compilers by investing in corpus-based dictionaries at this early stage. Therefore, the establishment of lexicographic units with multifunctional tasks can play an important role. In a multilingual environment such as Gabon the issue of language status needs to be dealt with carefully because it is realistic to choose a certain number of languages to function as official languages. Different alphabets are presented in this study and realistic choices are made. The way in which the language material is organised will impact on the quality of the macrostructure and microstructure; this is essential because dictionaries are consulted most of the time for the spelling of a given lexical item, for a translation equivalent or for the explanation of the meaning of a lemma sign. The computerisation of a corpus is a focal point and needs to be done in a satisfactory manner that presents a clean and helpful corpus in order to provide the lexicographer with useful statistics, frequency word lists and the different concordance lines that are very important for the wording of definitions and the extraction of example sentences. This is why a corpus is seen as an indispensable tool in the improvement of the macro- and the microstructure of any type of dictionary. / AFRIKAANSE OPSOMMING: Die saamstel van woordeboeke is ’n moeisame aktiwiteit, en dit verg tyd, geld en personeel om die doelstellings van ’n woordeboekprojek te bereik. 
Talle woordeboeke is op grond van die navorsers se persoonlike intuïsie en raaiwerk saamgestel, in stede daarvan dat dit korpusgebaseerd is. Die gevolg is dat baie woordeboeke dikwels deur gebruikers gekritiseer word weens die gebrek aan verteenwoordiging van enkele belangrike leksikale items. Dít kan moontlik verklaar word deur die feit dat die meeste van hierdie woordeboeke saamgestel is in ’n era waartydens teoretiese leksikografie gebrekkig en nie goed gevestig was nie. In die afgelope dekades het metaleksikografie na vore getree as ’n teorie wat op woordeboekbeplanning gerig is ten einde die gehalte van die leksikografie-praktyk en die manier waarop die bestuur en samestelling van woordeboeke hanteer word, te verbeter. By die beplanning van woordeboeke word nie net die versameling taalmateriaal wat gebruik kan word in berekening gebring nie, maar ook die manier waarop hierdie materiaal op sowel makro- as mikrostrukturele vlakke, asook in die voorwerk en die agterwerk, hanteer en aangebied gaan word. Ten einde die gehalte van die aanbieding in woordeboeke te verbeter, lewer hierdie proefskrif ’n pleidooi vir die formulering van ’n dataversamelingsbeleid wat al die verskillende materiaalbronne, hetsy skriftelik of mondelings, wat in die verskillende stadia van die samestelling van ’n woordeboek gebruik word, in ag neem. Die drie stadia wat die hooffokus van hierdie studie is, is die stadia waarin die materiaal aangeskaf, voorberei en verwerk word. Die spraakgemeenskap se betrokkenheid by die saamstel van ’n leksikografiese korpus verseker die versameling van verteenwoordigende en gebalanseerde data, en die verskillende behoeftes van sodanige gemeenskap is die kern van die woordeboekprojek. Die verskillende taalmateriale kan in verskillende korpussoorte georden word. 
Die doeltreffendheid van ’n korpus berus op die vermoë daarvan om verskillende datasoorte te verskaf wat in die kommentaar op semantiek en die kommentaar op vorm van elke item in die sentrale lys van elke woordeboek ingesluit kan word. Sommige woordeboeke toon ’n gebrek aan goeie verteenwoordiging van data in albei hierdie soorte kommentaar in die verskillende items. Tale soos die Gaboenese tale is egter in ’n bevoorregte posisie, aangesien hulle nog die foute van ander woordeboeksamestellers kan vermy deur op hierdie vroeë stadium in korpusgebaseerde woordeboeke te belê. Die stigting van leksikografiese eenhede met multifunksionele take kan dus ’n belangrike rol speel. In ’n veeltalige omgewing soos Gaboen moet die kwessie van taalstatus versigtig hanteer word, aangesien dit realisties is om ’n sekere hoeveelheid tale as amptelike tale te kies. Verskillende alfabette word in hierdie studie aangebied en realistiese keuses word gemaak. Die manier waarop die taalmateriaal georden is, sal ’n uitwerking op die makro- en mikrostruktuur hê; dit is van belang omdat woordeboeke meestal vir die spelling van ’n gegewe leksikale item, vir ’n vertaalekwivalent of vir die verklaring van die betekenis van ’n lemmateken geraadpleeg word. Die rekenarisering van ’n korpus is ’n belangrike aspek en moet op ’n bevredigende wyse uitgevoer word wat ’n skoon en nuttige korpus lewer ten einde die leksikograaf van goeie statistieke, frekwensiewoordlyste en die verskillende konkordansielyne te voorsien, wat baie belangrik is vir die skryf van definisies en die onttrekking van voorbeeldsinne. Om hierdie rede word ’n korpus as ’n onmisbare instrument in die verbetering van die makro- en mikrostruktuur van enige soort woordeboek beskou.
2

Engelskan i skolan : en undersökning av vokabulär i gymnasieskolans textböcker i engelska / English in school : a study of vocabulary in upper secondary school textbooks in English

Borking, Ulrika January 2008
This essay reviews vocabulary samples from three different textbooks, which are readers for the basic course in English at an upper secondary school in Sweden. The aim of the study is to determine whether the word samples from the readers’ word lists consist mostly of high- or low-frequency words and whether the words denote any particular semantic fields. Moreover, the possible use of word frequencies in second language acquisition is also examined. The method used in ascertaining the quality of the words is comparing the word samples to the BNC (the British National Corpus) and analysing how frequently they occur in written and spoken modern English. The results are based on the findings from the analysis made in this study and are also compared to current research in the fields of linguistics and language acquisition. The results exhibit both overrepresentation and absence of words in particular semantic fields. For instance, words from the semantic field concerning ‘food and cooking’ were found to be somewhat predominant. The findings also include support for the use of word frequencies in language acquisition, especially in terms of how words are translated from English into Swedish in the textbooks’ word lists. The only Swedish synonym given was in some cases the item of least frequent usage in modern English, according to the BNC.
3

Structures in complex systems : Playing dice with networks and books

Bernhardsson, Sebastian January 2009
Complex systems are neither perfectly regular nor completely random. They consist of a multitude of players who, in many cases, play together in a way that makes their combined strength greater than the sum of their individual achievements. It is often very effective to represent these systems as networks where the actual connections between the players take on a crucial role. Networks exist all around us and are an important part of our world, from the protein machinery inside our cells to social interactions and man-made communication systems. Many of these systems have developed over a long period of time and are constantly undergoing changes driven by complicated microscopic events. These events are often too complicated for us to accurately resolve, making the world seem random and unpredictable. There are, however, ways of using this unpredictability in our favor by replacing the true events with much simpler stochastic rules giving effectively the same outcome. This allows us to capture the macroscopic behavior of the system, to extract important information about the dynamics of the system and to learn about the reason for what we observe. Statistical mechanics gives the tools to deal with such large systems driven by underlying random processes under various external constraints, much like how intracellular networks are driven by random mutations under the constraint of natural selection. This similarity makes it interesting to combine the two and to apply some of the tools provided by statistical mechanics to biological systems. In this thesis, several null models are presented, with this viewpoint in mind, to capture and explain different types of structural properties of real biological networks. The most recent major transition in evolution is the development of language, both spoken and written. This thesis also brings up the subject of quantitative linguistics through the eyes of a physicist, here called linguaphysics. 
Also in this case the data is analyzed with an assumption of an underlying randomness. It is shown that some statistical properties of books, previously thought to be universal, turn out to exhibit author-specific size dependencies. A meta book theory is put forward which explains this dependency by describing the writing of a text as pulling a section out of a huge, individual, abstract mother book. / Komplexa system är varken perfekt ordnade eller helt slumpmässiga. De består av en mängd aktörer, som i många fall agerar tillsammans på ett sådant sätt att deras kombinerade styrka är större än deras individuella prestationer. Det är ofta effektivt att representera dessa system som nätverk där de faktiska kopplingarna mellan aktörerna spelar en avgörande roll. Nätverk finns överallt omkring oss och är en viktig del av vår värld, från proteinmaskineriet inne i våra celler till sociala samspel och människotillverkade kommunikationssystem. Många av dessa system har utvecklats under lång tid och genomgår hela tiden förändringar som drivs på av komplicerade småskaliga händelser. Dessa händelser är ofta för komplicerade för oss att noggrant kunna analysera, vilket får vår värld att verka slumpmässig och oförutsägbar. Det finns dock sätt att använda denna oförutsägbarhet till vår fördel genom att byta ut de verkliga händelserna mot mycket enklare regler baserade på sannolikheter, som ger effektivt sett samma utfall. Detta tillåter oss att fånga systemets övergripande uppförande, att utvinna viktig information om systemets dynamik och att få kunskap om anledningen till vad vi observerar. Statistisk mekanik hanterar stora system pådrivna av sådana underliggande slumpmässiga processer under olika restriktioner, på liknande sätt som nätverk inne i celler drivs av slumpmässiga mutationer under restriktionerna från naturligt urval. Denna likhet gör det intressant att kombinera de två och att applicera de verktyg som ges av statistisk mekanik på biologiska system. 
I denna avhandling presenteras flera nollmodeller som, baserat på detta synsätt, fångar och förklarar olika typer av strukturella egenskaper hos verkliga biologiska nätverk. Den senaste stora evolutionära övergången är utvecklandet av språk, både talat och skrivet. Denna avhandling tar också upp ämnet om kvantitativ linguistik genom en fysikers ögon, här kallat linguafysik. Även i detta fall så analyseras data med ett antagande om en underliggande slumpmässighet. Det demonstreras att vissa statistiska egenskaper av böcker, som man tidigare trott vara universella, egentligen beror på bokens längd och på författaren. En metaboksteori ställs fram vilken förklarar detta beroende genom att beskriva författandet av en text som att rycka ut en sektion ur en stor, individuell, abstrakt moderbok.
4

Engelskan i skolan : en undersökning av vokabulär i gymnasieskolans textböcker i engelska / English in school : a study of vocabulary in upper secondary school textbooks in English

Borking, Ulrika January 2008
This essay reviews vocabulary samples from three different textbooks, which are readers for the basic course in English at an upper secondary school in Sweden. The aim of the study is to determine whether the word samples from the readers’ word lists consist mostly of high- or low-frequency words and whether the words denote any particular semantic fields. Moreover, the possible use of word frequencies in second language acquisition is also examined. The method used in ascertaining the quality of the words is comparing the word samples to the BNC (the British National Corpus) and analysing how frequently they occur in written and spoken modern English. The results are based on the findings from the analysis made in this study and are also compared to current research in the fields of linguistics and language acquisition. The results exhibit both overrepresentation and absence of words in particular semantic fields. For instance, words from the semantic field concerning ‘food and cooking’ were found to be somewhat predominant. The findings also include support for the use of word frequencies in language acquisition, especially in terms of how words are translated from English into Swedish in the textbooks’ word lists. The only Swedish synonym given was in some cases the item of least frequent usage in modern English, according to the BNC.
5

Ord för samkönade relationer : En korpusundersökning baserad på tidningsartiklar från åren 1965-2004 / Word for same-sex relationships : A corpus based study of articles from journals published 1965 to 2004

Pettersson Storsberg, Linda January 2011
No description available.
6

Ordförrådet och textboken : En analys av läromedlet Good Stuff för skolår 4-6 / Vocabulary and the textbook : An analysis of the teaching material Good Stuff aimed at school years 4-6

Nordlund, Marie January 2013
Syftet med detta arbete är att åskådliggöra variationen i det ordförråd som presenteras i läromedelsserien Good Stuff för skolår 4–6. Mer precist har följande forskningsfrågor behandlats: (i) Vilka ord är mest frekventa i böckernas texter? (ii) I vilken utsträckning förekommer och på vilket sätt sker återanvändning av orden? (iii) Vilka semantiska domäner finns representerade i det ordförråd som presenteras i de tre böckerna? För att besvara dessa frågor har en korpus av alla texterna i de tre böckerna skapats. Analysen visar att läromedlet till viss del stöttar ordinlärning (t.ex. genom övningar som kräver djupare mental bearbetning), men att det också skulle kunna förbättras i detta hänseende, i synnerhet gäller detta ordfrekvens och återanvändning av orden. / The aim of this study is to shed light on variation in the vocabulary presented in Good Stuff, a set of teaching materials aimed at school years 4–6 (age group 10–12 years) in Sweden. More precisely, the following questions have been considered: (i) What words are most frequent in the books? (ii) To what extent and in what way are the words recycled? (iii) What semantic domains are represented in the vocabulary presented in the three books? To answer these questions, a corpus of all the texts in the three books has been compiled. The analyses show that this teaching material supports vocabulary learning to some extent (e.g., with exercises demanding deeper mental processing), but also that it could be improved in that respect, in particular as regards word frequency and recycling of words.
