  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1421

VISUALISING DATA FRAME FORMATS CONTAINING SUPER COMMUTATION AND VARIABLE WORD LENGTHS

Kitchen, Frank 10 1900 (has links)
International Telemetering Conference Proceedings / October 25-28, 1999 / Riviera Hotel and Convention Center, Las Vegas, Nevada / Compiling a PCM data frame with super commutation poses the problem of maintaining constant sample intervals for the parameters while keeping within channel bandwidth limitations. Add the further requirement of using variable word lengths to optimise the use of the available bit rate and the problem becomes more challenging. The available telemetry or tape recorder channel bandwidth, rather than the capabilities of the data acquisition system, normally governs the amount of data that can be acquired by the aircraft instrumentation system. The amount of data demanded usually expands to fill all available bandwidth, and bit rates are operated at the maximum for the particular channel. The use of variable word lengths can, in some circumstances, increase the utilisation of a channel's bandwidth. In order to visualise whether a particular requirement can be accommodated within a given data structure, a method of sketching PCM data frames containing a wide mixture of sample rates using an intermediate matrix has been devised. The method is described in three stages: 1. compiling a simple PCM frame; 2. sketching the intermediate matrix to assist in visualising super commutation limits; 3. mixing variable word lengths and super commutation in the same PCM format. The method is not guaranteed to be the most efficient, but it does give a relatively simple, non-mathematical way to visualise whether the required sample rates can be accommodated in a given data structure. If the requirement will not fit into the data structure, the method allows the impact of the necessary changes to the structure to be assessed rapidly. The paper includes comments on the relevant characteristics needed in the aircraft data acquisition system, including variable word lengths, frame lengths, incremental bit rates and coherency of multiple data bus word parameters.
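The frame-fitting rule behind super commutation can be sketched in a few lines: a super-commutated parameter keeps a constant sample interval only if its samples-per-frame count divides the frame length evenly. The following Python sketch illustrates that check under simplifying assumptions (one word per sample, fixed word length); the function name and the example figures are illustrative, not taken from the paper.

```python
def supercom_slots(frame_len, frame_rate, sample_rate):
    """Return evenly spaced word slots for a super-commutated parameter,
    or None if it cannot keep a constant sample interval in this frame.

    Assumes one word per sample and a fixed word length."""
    samples_per_frame = sample_rate / frame_rate
    if samples_per_frame != int(samples_per_frame):
        return None                       # rate is not an integer multiple of the frame rate
    k = int(samples_per_frame)
    if frame_len % k != 0:
        return None                       # constant spacing is impossible
    interval = frame_len // k
    return list(range(0, frame_len, interval))

# A 64-word frame at 25 frames/s: a 100 samples/s parameter needs 4 evenly spaced slots.
print(supercom_slots(64, 25, 100))   # -> [0, 16, 32, 48]
print(supercom_slots(64, 25, 75))    # -> None (3 slots cannot be evenly spaced in 64 words)
```

Variable word lengths complicate the picture, since slot positions are then measured in bits rather than words, but the divisibility check is the same in spirit.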
1422

The Development of Second Language Reading and Morphological Processing Skills

Kraut, Rachel Elizabeth January 2016 (has links)
Decades of research have shed light on the nature of reading in our first language. There is substantial research about how we recognize words, the ways in which we process sentences, and the linguistic and non-linguistic factors which may affect those processes (e.g. Besner & Humphreys, 2009). This has led to more effective pedagogical techniques and methodologies in the teaching of L1 reading (Kamil et al., 2011). With the ever-increasing number of L2 English speakers in U.S. schools and universities, more recent research has begun to investigate reading in the L2. However, this field of inquiry is not nearly as robust as that of L1 reading. Much remains to be explored in terms of how L2 readers process words and sentences, and how they comprehend what they read (Grabe, 2012). The studies in this dissertation add to the growing body of literature detailing the processes of L2 reading and improvement in L2 reading skills. The first two studies focus on a topic that has sparked lively discussion in the field over the last 10 years or so: the online processing of L2 morphologically complex words in visual word recognition. Article 3 discusses the effects of a pedagogical intervention and the ways in which it may influence the development of second language reading. Broadly, the studies in this dissertation address the following research questions: (1) How do L2 readers process morphologically complex words? (2) Is there a connection between their knowledge of written morphology and their ability to use it during word recognition? (3) What is the role of L2 proficiency in these processes? (4) How does extensive reading influence the development of L2 reading skills? Many studies of L2 word processing have been conducted using offline methods.
Accordingly, the studies in this dissertation seek to supplement what we know about L2 morphological processing and reading skills with the use of psycholinguistic tasks, namely, traditional masked priming, masked intervenor priming, and timed reading. Secondly, this collection of studies is among the few to explore the relationship between online processing and offline morphological awareness, thereby bridging the two fields of study. Thirdly, unlike most studies of online processing, the data from this dissertation will be discussed in terms of its implications for the teaching of L2 morphologically complex words and L2 reading skills. Thus, this dissertation may be of interest to those working in L2 psycholinguistics of word recognition and sentence processing as well as ESL practitioners.
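The masked priming tasks mentioned above ultimately reduce to comparing lexical-decision reaction times across prime conditions. A minimal sketch of that comparison (the RT values are invented for illustration):

```python
def priming_effect_ms(related_rts, unrelated_rts):
    """Priming effect as the mean lexical-decision RT difference
    (unrelated minus related), in ms; positive values indicate
    facilitation by the related (e.g. morphological) prime."""
    mean = lambda xs: sum(xs) / len(xs)
    return mean(unrelated_rts) - mean(related_rts)

# Illustrative RTs (ms) for one target word across the two prime conditions.
print(priming_effect_ms([612, 598, 605], [641, 655, 630]))  # -> 37.0
```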
1423

Hybris in Greek tragedy

Jooste, Christoffel Murray January 1900 (has links)
Thesis (MA) -- Stellenbosch University, 1977.
1424

The perceived credibility of electronic word-of-mouth communication on e-commerce platforms

Bosman, Dirk Johannes 12 1900 (has links)
Thesis (MComm)--Stellenbosch University, 2012. / ENGLISH ABSTRACT: Enterprises and more specifically, marketing departments, function in a complex global market, while trying to deliver products and services to satisfy the needs of consumers. It is estimated that by 2013, enterprises will be spending $4.75 trillion and consumers $330 billion by means of commercial transactions over the Internet, and that by 2050 most transactions – if not all transactions – will be e-commerce based (Laudon and Traver, 2010:1-7). The 24-hour access to a global network of markets has brought about two major challenges for most enterprises. Firstly, the Internet as a publishing platform has exponentially increased the creation and sharing of information, which has significantly increased consumers’ search cost; and secondly, as more electronic word-of-mouth (EWOM) is being generated online, a significant amount of power and influence over enterprises has shifted to consumers (Chen, Wu and Yoon, 2004:716-722; Tapscott and Williams, 2008:52-53). Ultimately, enterprises are challenged to harness the power of EWOM for more successful e-commerce strategies and increased market share. Given previous studies, it was possible to extend the theoretical framework of EWOM communication in the fields of Internet marketing and online consumer behaviour. The purpose of this study was to create two models that could measure, over time, the impact of EWOM review communication on an e-commerce platform, specifically with regard to review credibility and sales levels. In using a non-probability judgement sampling procedure, it emerged that EWOM reviews do indeed influence the sales levels of e-commerce platform Amazon.com, and that certain review factors (platform, text length, time and star ratings) significantly influenced the credibility of Amazon.com and Barnesandnoble.com reviews. 
Furthermore, it was concluded that the overall credibility of reviews increases over time as more and more online users have the ability to scrutinise them. When Amazon.com and Barnesandnoble.com's reviews were compared, the results indicated that Amazon.com had more reviews than Barnesandnoble.com, and that the reviews posted at Amazon.com had on average longer text lengths and were found to be more helpful than the reviews at Barnesandnoble.com. The study's major contribution is that it provides wide-ranging guidelines for usability and user experience design, sales and inventory forecasting, as well as benchmark statistics for marketing campaigns.
1425

On the automated compilation of UML notation to a VLIW chip multiprocessor

Stevens, David January 2013 (has links)
With the availability of more and more cores within architectures, the process of extracting implicit and explicit parallelism in applications to utilise these cores fully is becoming complex. Implicit parallelism extraction is performed through the inclusion of intelligent software and hardware sections in tool chains, although these reach their theoretical limit rather quickly. Because of this, a method of allowing explicit parallelism to be performed as fast as possible has been investigated. This method enables application developers to create and synchronise parallel sections of an application at a finer-grained level than previously possible, resulting in smaller sections of code being executed in parallel while still reducing overall execution time. Alongside explicit parallelism, the high-level design of applications destined for multicore systems was also investigated. As systems get larger, it is becoming more difficult to design and track the full life-cycle of development. One method used to ease this process is a graphical design process to visualise the high-level designs of such systems. One drawback of graphical design is the explicit nature in which systems are required to be generated. This was investigated and, using concepts already in use in text-based programming languages, the generation of platform-independent models which can be specialised to multiple hardware architectures was developed. The explicit parallelism was performed using hardware elements for thread management; this resulted in speed-ups of over 13 times compared to threading libraries executed in software on commercially available processors. This allowed applications with large data-dependent sections to be parallelised in small sections within the code, resulting in a decrease in overall execution time.
The modelling concepts resulted in a saving of 40-50% of the time and effort required to generate platform-specific models, while incurring an overhead of up to 15% of the execution cycles of models designed for specific architectures.
1426

AXEL : a framework to deal with ambiguity in three-noun compounds

Martinez, Jorge Matadamas January 2010 (has links)
Cognitive Linguistics has been widely used to deal with the ambiguity generated by words in combination. Although this domain offers many solutions to address this challenge, not all of them can be implemented in a computational environment. The Dynamic Construal of Meaning framework is argued to have this ability because it describes an intrinsic degree of association of meanings which, in turn, can be translated into computational programs. A limitation of a computational approach, however, has been the lack of syntactic parameters. This research argues that this limitation could be overcome with the aid of the Generative Lexicon Theory (GLT). Specifically, this dissertation formulated possible means of marrying the GLT and Cognitive Linguistics in a novel rapprochement between the two. This bond between opposing theories provided the means to design a computational template (the AXEL System) by realising syntax and semantics at the software level. An instance of the AXEL system was created using a Design Research approach, with planned iterations involved in the development to improve artefact performance. Such iterations boosted performance in accounting for the degree of association of meanings in three-noun compounds. This dissertation delivered three major contributions on the brink of a so-called turning point in Computational Linguistics (CL). First, the AXEL system was used to disclose hidden lexical patterns of ambiguity. These patterns are difficult, if not impossible, to identify without automatic techniques; this research claimed that they can help linguists review lexical knowledge from a software-based viewpoint. Second, following linguistic awareness, the results advocated the adoption of improved resources by decreasing the electronic storage space of Sense Enumerative Lexicons (SELs). The AXEL system deployed the generation of "at the moment of use" interpretations, optimising the space needed for lexical storage. Finally, this research introduced a subsystem of metrics to characterise the degree of association of ambiguous three-noun compounds, enabling ranking methods. Weighting methods delivered mechanisms for the classification of meanings towards Word Sense Disambiguation (WSD). Overall, these results attempt to tackle difficulties in understanding studies of Lexical Semantics via software tools.
1427

חֶסֶד and Ikharari : the book of Ruth from a Lomwe perspective

Alfredo, Justino Manuel 12 1900 (has links)
Thesis (DTh (Old and New Testament))--University of Stellenbosch, 2010. / ENGLISH ABSTRACT: It has been acknowledged in two recent studies that the translation of key biblical terms is an area that needs urgent attention. Many lexicons provide the meaning of a word by describing its etymology, hardly paying any attention to the socio-cultural contexts within which it is used. Thus, lexicons are often of limited value for Bible interpretation and translation. This dissertation argues that the meaning of a word can only be fully determined by taking into consideration the linguistic and socio-cultural contexts within which it functions. A basic assumption is that the biblical source text serves as a frame of reference for the semantic analysis of a particular word. The text provides an integrative semantic and pragmatic framework within which a biblical term must be investigated with reference to its wider socio-cultural setting. In the light of this framework, this study investigates the meaning of חֶסֶד in the book of Ruth from a Lomwe perspective. Although the word occurs only three times (Ruth 1:8, 2:20 and 3:10), with reference to Ruth, Boaz and Yahweh as subjects respectively, the book is a "חֶסֶד story", which represents the essence of the covenant between Yahweh and His people. The essence of this covenant is demonstrated by the main characters of the story, which unveils the theological depth that חֶסֶד brings to the understanding of this narrative. Since the aim of the study is to evaluate the suitability of the terms osivela, osivela combined with woororomeleya, and ikharari, in relation to others that are potentially available in Lomwe, to convey the conceptual complexity denoted by חֶסֶד, a Cognitive Frames of Reference (CFR) approach was introduced for the translation.
To bridge the cognitive gap between the socio-cultural worlds of the biblical audience and the target audience, the study used different dimensions of CFR, namely the textual, socio-cultural, communicational and organizational frames of reference. Using the book of Ruth as a starting point for the translation of the word חֶסֶד into Lomwe, it is argued that this approach offers a better understanding of the meaning of חֶסֶד in Ruth 1:8, 2:20 and 3:10. Since osivela waya woororomeleya does not do justice to the meaning of חֶסֶד in the three passages, the words ikharari (1:8 and 2:20) and oreera murima (3:10) have been proposed as exegetically and socio-culturally more appropriate alternatives.
1428

Cultural algorithms : an application to the analysis of the Greekness of the World Wide Web

Κατσικούλη, Παναγιώτα 12 October 2013 (has links)
Cultural algorithms are evolutionary algorithms inspired by societal evolution. They involve a belief space, a population space and a communication protocol which provides functions that enable the exchange of knowledge between the population and the belief space. In this thesis, cultural algorithms are used to analyse how Greek the web is. It is commonly known that the Greek language is the source of a plethora of words in other languages' vocabularies. The World Wide Web is, nowadays, a universal means of communication, a place where huge amounts of information and data are transmitted, and a modern means of economic, political and social activity. In other words, the World Wide Web has emerged as a new kind of society. As such, it has become the place where any culture's influence, through its language, is evident in hosted texts. This thesis attempts to "count" the percentage of words of Greek origin used in web-hosted texts of any kind. The main objective is to investigate whether it is possible to design a proper model and corresponding algorithms that allow the "Greekness" of the web to be evaluated. The methodology followed consists of the design and implementation of a cultural algorithm, and of the use of the programming language Python for designing and implementing a suitable application and for experimental evaluation.
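The belief-space/population/communication-protocol loop described in the abstract can be sketched compactly. The version below is a generic cultural algorithm applied to a toy optimisation problem, not the thesis's implementation: the normative belief space is reduced to a single promising interval, and all names and parameters are illustrative assumptions.

```python
import random

def cultural_algorithm(fitness, bounds, pop_size=30, generations=40, seed=1):
    """Minimal cultural algorithm sketch: a population space plus a belief
    space (here, just a normative interval learned from the best individuals),
    linked by accept/influence functions."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    belief = [lo, hi]                                # normative knowledge: promising interval
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[:pop_size // 5]
        belief = [min(elite), max(elite)]            # accept(): elites update the belief space
        new_pop = elite[:]
        while len(new_pop) < pop_size:
            child = rng.uniform(*belief)             # influence(): sample within believed range
            child += rng.gauss(0, (hi - lo) * 0.01)  # small mutation
            new_pop.append(min(hi, max(lo, child)))
        pop = new_pop
    return max(pop, key=fitness)

# Toy problem: maximise -(x - 2)^2, whose optimum is x = 2.
best = cultural_algorithm(lambda x: -(x - 2.0) ** 2, (-10, 10))
```

Here accept() corresponds to the elite slice updating the belief interval, and influence() to sampling new individuals from that interval; the thesis applies the same loop to estimating the share of Greek-derived words on the web rather than to function optimisation.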
1429

Towards the development and application of representative lexicographic corpora for the Gabonese languages

Soami, Leandre Serge 03 1900 (has links)
Thesis (DLitt (Afrikaans and Dutch))--University of Stellenbosch, 2010. / ENGLISH ABSTRACT: The compilation of dictionaries is a laborious activity and it takes time, money and staff to achieve the objectives of any dictionary project. Many dictionaries have been compiled using the lexicographers’ personal intuition and guessing rather than being corpus based. That resulted in some dictionaries often being criticised by users because of the lack of representation of some important lexical items. This can probably be explained by the fact that most of these dictionaries were compiled in an era when theoretical lexicography was lacking or not well established. The last decades have witnessed the emergence of metalexicography as a theory directed also at dictionary planning in order to enhance the quality of lexicographic practice and the way in which the management and the compilation of dictionaries are dealt with. The planning of dictionaries takes into account not only the gathering of language material to be used but also the way in which this material will be treated and presented on both the macrostructural and the microstructural level as well as in the front matter texts and the back matter texts. In order to enhance the quality of the presentation in dictionaries, this dissertation pleads in favour of the formulation of a data collection policy that takes into consideration all the different sources of material, written and spoken, used in the different phases of the compilation of a dictionary. The three phases that form the main focus of this study are the material acquisition phase, the material preparation phase and the material processing phase. The involvement of the speech community in the compilation of a lexicographic corpus ensures the collection of representative and balanced data, and the different needs of that community are central to the dictionary project. The different language materials can be organised into different corpus types. 
The efficiency of a corpus resides in its capacity to provide different data types that can be included in the comment on semantics and the comment on form of each article in the central list of each dictionary. Some dictionaries lack a good representation of data in both these comments in the different articles. However, languages such as the Gabonese languages are in a privileged situation because they can still avoid the mistakes of other dictionary compilers by investing in corpus-based dictionaries at this early stage. Therefore, the establishment of lexicographic units with multifunctional tasks can play an important role. In a multilingual environment such as Gabon the issue of language status needs to be dealt with carefully because it is realistic to choose a certain number of languages to function as official languages. Different alphabets are presented in this study and realistic choices are made. The way in which the language material is organised will impact on the quality of the macrostructure and microstructure; this is essential because dictionaries are consulted most of the time for the spelling of a given lexical item, for a translation equivalent or for the explanation of the meaning of a lemma sign. The computerisation of a corpus is a focal point and needs to be done in a satisfactory manner that presents a clean and helpful corpus in order to provide the lexicographer with useful statistics, frequency word lists and the different concordance lines that are very important for the wording of definitions and the extraction of example sentences. This is why a corpus is seen as an indispensable tool in the improvement of the macro- and the microstructure of any type of dictionary.
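The frequency word lists and concordance lines the abstract calls indispensable are straightforward to produce once a corpus is machine-readable. A minimal sketch, assuming a plain-text corpus and a naive tokeniser (both simplifying assumptions, for illustration only):

```python
from collections import Counter
import re

def frequency_list(text):
    """Word frequency list of the kind a lexicographic corpus supplies."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common()

def concordance(text, node, width=3):
    """Simple KWIC concordance lines: `width` words of context per side."""
    words = re.findall(r"[a-z']+", text.lower())
    lines = []
    for i, w in enumerate(words):
        if w == node:
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append(f"{left} [{w}] {right}")
    return lines

sample = "The corpus helps the lexicographer. The corpus supplies frequency data."
print(frequency_list(sample)[0])        # -> ('the', 3)
print(concordance(sample, "corpus"))    # two KWIC lines, one per occurrence
```

A real corpus pipeline would add language-specific tokenisation and cleaning, but these two views, frequency and context, are the ones the chapter says the lexicographer draws on for definitions and example sentences.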
1430

Word based off-line handwritten Arabic classification and recognition : design of automatic recognition system for large vocabulary offline handwritten Arabic words using machine learning approaches

AlKhateeb, Jawad Hasan Yasin January 2010 (has links)
The design of a machine which reads unconstrained words still remains an unsolved problem. For example, automatic interpretation of handwritten documents by a computer is still under research. Most systems attempt to segment words into letters and read words one character at a time. However, segmenting handwritten words is very difficult, so to avoid this, words are treated as wholes. This research investigates a number of features computed from whole words for the recognition of handwritten words in particular. Arabic text classification and recognition is a complicated process compared to Latin and Chinese text recognition systems, owing to the cursive nature of Arabic text. The work presented in this thesis is proposed for word-based recognition of handwritten Arabic script, and is divided into three main stages to provide a recognition system. The first stage is pre-processing, which applies efficient methods essential for automatic recognition of handwritten documents. In this stage, techniques for detecting the baseline and segmenting words in handwritten Arabic text are presented. Connected components are then extracted, and the distances between different components are analyzed. The statistical distribution of these distances is then obtained to determine an optimal threshold for word segmentation. The second stage is feature extraction, which makes use of the normalized images to extract features that are essential in recognizing the images. Various methods of feature extraction are implemented and examined. The third and final stage is classification. Various classifiers are used, such as the k-nearest neighbour classifier (k-NN), a neural network classifier (NN), hidden Markov models (HMMs) and the dynamic Bayesian network (DBN). To test this concept, the particular pattern recognition problem studied is the classification of 32,492 words using the IFN/ENIT database.
The results were promising and very encouraging in terms of improved baseline detection and word segmentation for further recognition. Moreover, several feature subsets were examined, and a best recognition performance of 81.5% was achieved.
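The word-segmentation step in the first stage, choosing a threshold from the statistical distribution of inter-component distances, can be illustrated with a simple two-means split of the gap distribution. This is a stand-in for the thesis's actual statistical analysis; the gap values and function names are invented for illustration.

```python
def gap_threshold(gaps, iters=20):
    """Pick a threshold separating intra-word from inter-word gaps by
    iteratively splitting the gap distribution into two clusters and
    placing the threshold midway between the cluster means."""
    t = (min(gaps) + max(gaps)) / 2
    for _ in range(iters):
        small = [g for g in gaps if g <= t]
        large = [g for g in gaps if g > t]
        if not small or not large:
            break
        t_new = (sum(small) / len(small) + sum(large) / len(large)) / 2
        if abs(t_new - t) < 1e-9:
            break
        t = t_new
    return t

def segment_words(gaps, threshold):
    """Word boundaries are placed wherever a gap exceeds the threshold."""
    return [i for i, g in enumerate(gaps) if g > threshold]

gaps = [2, 3, 2, 14, 3, 2, 16, 2]      # pixel gaps between consecutive connected components
print(segment_words(gaps, gap_threshold(gaps)))   # -> [3, 6]
```

The small gaps cluster around the intra-word spacing and the large ones around the inter-word spacing, so the midpoint between the two cluster means separates them; the thesis derives its threshold from the full statistical distribution rather than this two-means shortcut.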
