11 |
Gemeenskapgebaseerde bejaardeversorging : 'n maatskaplikewerkperspektief (Afrikaans)
Claassen, Johanna Wilma 01 December 2005
Please read the abstract in the section 00front of this document / Dissertation (MA (Social Work))--University of Pretoria, 2006. / Social Work and Criminology / Unrestricted
|
12 |
An ultra-low duty cycle sleep scheduling protocol stack for wireless sensor networks
Kleu, Christo 18 July 2012
A wireless sensor network is a distributed network system consisting of miniature, spatially distributed autonomous devices that use sensors to sense the environment and cooperatively achieve a specific goal. Each sensor node contains a limited power source, a sensor and a radio through which it can communicate with other sensor nodes within its communication radius. Since these sensor nodes may be deployed in inaccessible terrain, it might not be possible to replace their power sources. The radio transceiver is the hardware component that uses the most power in a sensor node, and optimising it is necessary to reduce the overall energy consumption. In the data link layer there are several major sources of energy waste which should be minimised to achieve greater energy efficiency: idle listening, overhearing, over-emitting, network signalling overhead, and collisions. Sleep scheduling utilises the low-power sleep state of a transceiver and aims to reduce energy wastage caused by idle listening. Idle listening occurs when the radio is on even though there is no data to transmit or receive. Collisions are reduced by using medium reservation and carrier sensing; collisions occur when there are simultaneous transmissions from several nodes that are within the interference range of the receiver node. The medium reservation packets include a network allocation vector field, which is used for virtual carrier sensing and thereby reduces overhearing. Overhearing occurs when a node receives and decodes packets that are not destined for it. Proper scheduling can avoid energy wastage due to over-emitting; over-emitting occurs when a transmitter node transmits a packet while the receiver node is not ready to receive packets. A protocol stack is proposed that achieves an ultra-low duty cycle sleep schedule. The protocol stack is aimed at large nodal populations, densely deployed, with periodic sampling applications.
It uses the IEEE 802.15.4 Physical Layer (PHY) standard in the 2.4 GHz frequency band. A novel hybrid data-link/network cross-layer solution is proposed using the following features: a global sleep schedule, geographical data gathering tree, Time Division Multiple Access (TDMA) slotted architecture, Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA), Clear Channel Assessment (CCA) with a randomised contention window, adaptive listening using a conservative timeout activation mechanism, virtual carrier sensing, clock drift compensation, and error control.
AFRIKAANS : 'n Draadlose sensor-netwerk is 'n verspreide netwerk stelsel wat bestaan uit miniatuur ruimtelik verspreide outonome toestelle wat ontwerp is om in harmonie saam die omgewing te meet. Elke sensor nodus besit 'n beperkte bron van energie, 'n sensor en 'n radio waardeur dit met ander sensor nodusse binne hulle kommunikasie radius kan kommunikeer. Aangesien hierdie sensor nodusse in ontoeganklike terreine kan ontplooi word, is dit nie moontlik om hulle kragbronne te vervang nie. Die radio is die hardeware komponent wat van die meeste krag gebruik in 'n sensor nodus en die optimalisering van hierdie element is noodsaaklik vir die verminder die totale energieverbruik. In die data-koppelvlak laag is daar verskeie bronne van energie vermorsing wat minimaliseer moet word: ydele luister, afluistering, oor-uitstraling, oorhoofse netwerk seine, en botsings. Slaap-skedulering maak gebruik van die lae-krag slaap toestand van 'n radio met die doel om energie vermorsing wat veroorsaak word deur ydele luister, te verminder. Ydele luister vind plaas wanneer die radio aan is selfs al is daar geen data om te stuur of ontvang nie. Botsings word verminder deur medium bespreking en draer deteksie; botsings vind plaas wanneer verskeie nodusse gelyktydig data stuur. Die medium bespreking pakkies sluit 'n netwerk aanwysing vektor veld in wat gebruik word vir virtuele draer deteksie om afluistering te verminder. Afluistering vind plaas wanneer 'n nodus 'n pakkie ontvang en dekodeer maar dit was vir 'n ander nodus bedoel. Behoorlike skedulering kan energie verkwisting as gevolg van oor-uitstraling verminder; oor-uitstraling gebeur wanneer 'n sender nodus 'n pakkie stuur terwyl die ontvang nog nie gereed is nie. 'n Protokol stapel is voorgestel wat 'n ultra-lae slaap-skedule dienssiklus het. Die protokol is gemik op draadlose sensor-netwerke wat dig ontplooi, groot hoeveelhede nodusse bevat, en met periodiese toetsing toepassings. Dit maak gebruik van die IEEE 802.15.4 Fisiese-Laag standaard in die 2.4 GHz frekwensie band. 'n Nuwe baster datakoppelvlak/netwerk laag oplossing is voorgestel met die volgende kenmerke: globale slaap-skedulering, geografiese data rapportering, Tyd-Verdeling-Veelvuldige-Toegang (TVVT) gegleufde argitektuur, Draer-Deteksie-Veelvuldige-Toegang met Botsing-Vermyding (DDVT/BV), Skoon-Kanaal-Assessering (SKA) met 'n wisselvallige twis-tydperk, aanpasbare slaap-skedulering met 'n konserwatiewe aktiverings meganisme, virtuele draer-deteksie, klok-wegdrywing kompensasie, en fout beheer. Copyright / Dissertation (MEng)--University of Pretoria, 2012. / Electrical, Electronic and Computer Engineering / unrestricted
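To make the link between duty cycle and energy consumption concrete, the following minimal sketch estimates the average radio current and the resulting battery lifetime of a node that wakes up periodically. It is an illustration only: the function names and all numeric values (currents, battery capacity, wake-up period) are assumptions chosen for the example and are not taken from the dissertation.

```python
def duty_cycle(active_s: float, period_s: float) -> float:
    """Fraction of each period during which the radio is awake."""
    if period_s <= 0 or not 0 <= active_s <= period_s:
        raise ValueError("active time must lie within a positive period")
    return active_s / period_s


def average_current_ma(duty: float, active_ma: float, sleep_ma: float) -> float:
    """Time-weighted average of the active and sleep currents (mA)."""
    return duty * active_ma + (1.0 - duty) * sleep_ma


def lifetime_days(battery_mah: float, avg_ma: float) -> float:
    """Idealised lifetime: capacity divided by average current, in days."""
    return battery_mah / avg_ma / 24.0


if __name__ == "__main__":
    # Hypothetical values: radio awake 1 s out of every 100 s (1% duty cycle),
    # 20 mA while awake, 0.001 mA while asleep, 2400 mAh battery.
    duty = duty_cycle(active_s=1.0, period_s=100.0)
    avg = average_current_ma(duty, active_ma=20.0, sleep_ma=0.001)
    print(f"duty cycle: {duty:.2%}, average current: {avg:.4f} mA")
    print(f"estimated lifetime: {lifetime_days(2400.0, avg):.0f} days")
```

The model ignores battery self-discharge, sensor and processor current, and the cost of switching between states, so it only shows the trend: halving the duty cycle roughly halves the average radio current as long as the active current dominates.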
|
13 |
Tagging als soziales Bindeglied für Communities
Kammergruber, Walter Christian, Langen, Manfred January 2009
Social tagging and social networks are central building blocks of Web 2.0 and Enterprise 2.0. This contribution examines the social aspects of social tagging and presents an approach for finding people with similar interests in folksonomies. It also describes a tagging framework that was developed in the Alexandria use case as part of the BMWi project Theseus.
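The abstract does not say how similar interests are detected. One common way to approach the problem, shown here purely as an assumed illustration, is to describe each user by the tags they have assigned and to compare users by the cosine similarity of their tag-frequency vectors. The user names, tags and function names below are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two tag-frequency vectors (0.0 if either is empty)."""
    dot = sum(a[tag] * b[tag] for tag in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_users(target: str, profiles: dict, top_n: int = 3) -> list:
    """Rank all other users by similarity of their tag profile to the target's."""
    scores = [
        (other, cosine_similarity(profiles[target], tags))
        for other, tags in profiles.items()
        if other != target
    ]
    # Highest similarity first; ties are broken alphabetically for stable output.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_n]


if __name__ == "__main__":
    # Hypothetical tag profiles: tag -> number of times the user applied it.
    profiles = {
        "anna": Counter({"python": 2, "web": 1}),
        "ben": Counter({"python": 2, "web": 1}),
        "cara": Counter({"cooking": 3}),
    }
    for user, score in most_similar_users("anna", profiles):
        print(f"{user}: {score:.2f}")
```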
|
14 |
A study on the sustainability of a non-motorised transport CBD in Upington
Scheepers, Barend Jacobus January 2014
The introduction of the private vehicle in urban communities (towns and cities) resulted in numerous urban problems experienced in the developed and developing world. These include, inter alia, economic inefficiency due to traffic congestion; a high mortality rate among vehicle users and non-vehicle users; air and noise pollution; and an overall poor quality of life for residents.
As part of the literature review, it was found that the urban problems experienced will intensify if sustainable transportation systems are not introduced in urban areas. These predictions were based on the following three factors:
* The increase of the world population – It was predicted that the world population will increase by 2.3 billion people between 2011 and 2050. The total world population is therefore expected to be 9.3 billion in 2050.
* The urbanisation rate experienced – It was predicted that the entire world population growth, along with an additional 300 million people, will be absorbed by urban areas between 2011 and 2050. Urban communities will therefore accommodate 6.2 billion people, or 67% of the world population, in 2050.
* The level and growth in private vehicle ownership – The developed world has a high level of vehicle ownership per 1 000 residents (655 in 2010), but experienced a decline in growth of 0.8% between 2005 and 2010. In contrast, the developing world had a low level of vehicle ownership per 1 000 residents (128 in 2010), but experienced an increase of 21.9% between 2005 and 2010.
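The population figures in the first two factors can be cross-checked with simple arithmetic. The short sketch below uses only the numbers quoted above; the 2011 values it prints are derived from those numbers and are not stated in the abstract.

```python
# All figures in billions of people, taken from the factors listed above.
WORLD_2050 = 9.3      # projected world population in 2050
WORLD_GROWTH = 2.3    # projected increase between 2011 and 2050
EXTRA_URBAN = 0.3     # additional people absorbed by urban areas
URBAN_2050 = 6.2      # projected urban population in 2050

# Derived values (not quoted in the abstract).
world_2011 = WORLD_2050 - WORLD_GROWTH        # implied 2011 world population
urban_growth = WORLD_GROWTH + EXTRA_URBAN     # total urban increase 2011-2050
urban_2011 = URBAN_2050 - urban_growth        # implied 2011 urban population
urban_share_2050 = URBAN_2050 / WORLD_2050    # urban fraction in 2050

if __name__ == "__main__":
    print(f"implied world population in 2011: {world_2011:.1f} billion")
    print(f"urban growth 2011-2050: {urban_growth:.1f} billion")
    print(f"implied urban population in 2011: {urban_2011:.1f} billion")
    print(f"urban share in 2050: {urban_share_2050:.0%}")
```

The computed urban share rounds to the 67% quoted in the second factor, so the stated figures are internally consistent.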
Apart from the above data, the literature review introduced planning theories and international as well as national policies.
The three planning theories that were researched each revealed ten principles of sustainable alternative transportation measures for an unsustainable private vehicle orientated urban area. These sustainable measures were used to introduce the option of a sustainable non-motorised transportation system to the demarcated study area. The three planning theories researched were:
* The Smart growth theory
* New urbanism, and
* Pedestrian mall developments.
International and national policies were scrutinised to obtain a point of view on how different countries, cities, spheres of government and types of documentation addressed non-motorised transportation developments. The examination of the policies also provided insight into how South African spheres of government were addressing non-motorised transportation in South African urban communities, if at all. The international policies included the “Share the road” document compiled by the United Nations in 2010; Mount Rainier Town Centre Urban Renewal Plan (2005) (USA) and Ottawa’s Transport Master Plan (2008) (Canada). The South African policies included the National Non-motorised Transportation Policy (2008); National Transport Master Plan (2011); Northern Cape Provincial Spatial Development Framework (2012) and //Khara Hais Spatial Development Framework (2012) (local municipality).
The literature review is followed by an empirical study consisting of two sections. Firstly, a pilot study consisting of international and local examples was researched. These examples were selected because they contain vehicle-free areas within the central business district. The success of the vehicle-free developments was measured and the information utilised to guide recommendations for the demarcated study area within the town of Upington (case study). Pilot study examples include Copenhagen, Denmark; Ghent, Belgium; Santa Monica, USA and Cape Town, South Africa.
Secondly, a case study was analysed. A study area within the South African town of Upington, Northern Cape Province, was demarcated. The status quo of relevant aspects, including but not limited to the climate, coverage, parking, road hierarchy and transport modes, was obtained and analysed. This analysis was conducted in order to establish a) whether the study area experienced urban transport-related problems and b) whether the implementation of a non-motorised transport system would be more sustainable for the general public of Upington than the current private-vehicle-dependent system. Inputs from town planners were also obtained in order to gain a multi-dimensional point of view.
In the conclusion of the study it was found that a) the planning theories have been successfully implemented in the pilot study examples and these principles could therefore apply to the demarcated study area in Upington; b) international policies addressed non-motorised transport developments more comprehensively than the South African policies, with shortcomings especially at the provincial and local spheres of government, where implementation should take place; and c) the analysis of the case study made it evident that the demarcated study area within Upington was burdened by private-vehicle-orientated transport problems. However, the analysis also indicated that the study area has the potential to make a successful transition from dependence on unsustainable private vehicles to sustainable non-motorised transportation. Finally, tailor-made recommendations (based on information derived from planning theories, policies, pilot study and case study) were made for the study area situated within Upington. These recommendations include the phased development of a pedestrian-only area, the development of parking garages outside the pedestrian area that are linked to it, and the development of a public transportation system by means of buses. / MArt et Scien (Urban and Regional Planning), North-West University, Potchefstroom Campus, 2014
|
16 |
'n Masjienleerbenadering tot woordafbreking in Afrikaans
Fick, Machteld 06 1900
Text in Afrikaans / Die doel van hierdie studie was om te bepaal tot watter mate ’n suiwer patroongebaseerde benadering tot woordafbreking bevredigende resultate lewer. Die masjienleertegnieke kunsmatige neurale netwerke, beslissingsbome en die TeX-algoritme is ondersoek aangesien dit met letterpatrone uit woordelyste afgerig kan word om lettergreep- en saamgesteldewoordverdeling te doen.
’n Leksikon van Afrikaanse woorde is uit ’n korpus van elektroniese teks genereer. Om lyste vir lettergreep- en saamgesteldewoordverdeling te kry, is woorde in die leksikon in lettergrepe verdeel en saamgestelde woorde is in hul samestellende dele verdeel. Uit elkeen van hierdie lyste van ±183 000 woorde is ±10 000 woorde as toetsdata gereserveer terwyl die res as afrigtingsdata gebruik is.
’n Rekursiewe algoritme is vir saamgesteldewoordverdeling ontwikkel. In hierdie algoritme word alle ooreenstemmende woorde uit ’n verwysingslys (die leksikon) onttrek deur stringpassing van die begin en einde van woorde af. Verdelingspunte word dan op grond van woordlengte uit die samestelling van begin- en eindwoorde bepaal. Die algoritme is uitgebrei deur die tekortkominge van hierdie basiese prosedure aan te spreek.
Neurale netwerke en beslissingsbome is afgerig en variasies van beide tegnieke is ondersoek om die optimale modelle te kry. Patrone vir die TeX-algoritme is met die OPatGen-program gegenereer. Tydens toetsing het die TeX-algoritme die beste op beide lettergreep- en saamgesteldewoordverdeling presteer met 99,56% en 99,12% akkuraatheid, respektiewelik. Dit kan dus vir woordafbreking gebruik word met min risiko vir afbrekingsfoute in gedrukte teks. Die neurale netwerk met 98,82% en 98,42% akkuraatheid op lettergreep- en saamgesteldewoordverdeling, respektiewelik, is ook bruikbaar vir lettergreepverdeling, maar dis meer riskant. Ons het bevind dat beslissingsbome te riskant is om vir lettergreepverdeling en veral vir woordverdeling te gebruik, met 97,91% en 90,71% akkuraatheid, respektiewelik.
’n Gekombineerde algoritme is ontwerp waarin saamgesteldewoordverdeling eers met die TeX-algoritme gedoen word, waarna die resultate van lettergreepverdeling deur beide die TeX-algoritme en die neurale netwerk gekombineer word. Die algoritme het 1,3% minder foute as die TeX-algoritme gemaak. ’n Toets op gepubliseerde Afrikaanse teks het getoon dat die risiko vir woordafbrekingsfoute in teks met gemiddeld tien woorde per reël ±0,02% is. /
The aim of this study was to determine the level of success achievable with a purely pattern-based approach to hyphenation in Afrikaans. The machine learning techniques artificial neural networks, decision trees and the TeX algorithm were investigated since they can be trained with patterns of letters from word lists for syllabification and decompounding.
A lexicon of Afrikaans words was extracted from a corpus of electronic text. To obtain lists for syllabification and decompounding, words in the lexicon were syllabified and compound words were decomposed, respectively. From each list of ±183 000 words, ±10 000 words were reserved as testing data and the rest was used as training data.
A recursive algorithm for decompounding was developed. In this algorithm all words corresponding with a reference list (the lexicon) are extracted by string fitting from beginning and end of words. Splitting points are then determined based on the length of reassembled words. The algorithm was expanded by addressing shortcomings of this basic procedure.
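The idea of lexicon-based recursive decompounding can be sketched as follows. This is a simplified variant written for illustration, not the algorithm from the thesis: it only matches lexicon words from the beginning of the string and recurses on the remainder, it prefers the split with the fewest parts, and it ignores linking letters between components. The function name, the `min_len` parameter and the tiny lexicon are assumptions.

```python
from typing import Optional


def decompound(word: str, lexicon: set, min_len: int = 3) -> Optional[list]:
    """Split `word` into lexicon words, preferring the fewest parts.

    Returns None when no complete split exists. A word that is itself in
    the lexicon is returned unsplit.
    """
    if word in lexicon:
        return [word]
    best = None
    # Try every split point that leaves at least `min_len` letters on each side.
    for i in range(min_len, len(word) - min_len + 1):
        head = word[:i]
        if head not in lexicon:
            continue
        tail = decompound(word[i:], lexicon, min_len)
        if tail is None:
            continue
        candidate = [head] + tail
        if best is None or len(candidate) < len(best):
            best = candidate
    return best


if __name__ == "__main__":
    # Hypothetical mini-lexicon of simple (non-compound) Afrikaans words.
    lexicon = {"woord", "afbreking", "letter", "greep", "verdeling"}
    for compound in ("woordafbreking", "lettergreepverdeling"):
        print(compound, "->", decompound(compound, lexicon))
```

Preferring fewer parts is a stand-in for the length-based choice of splitting points mentioned in the abstract: fewer parts means longer components, which reduces spurious splits into short words that happen to be in the lexicon.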
Artificial neural networks and decision trees were trained and variations of both were examined to find optimal syllabification and decompounding models. Patterns for the TeX algorithm were generated by using the program OPatGen. Testing showed that the TeX algorithm performed best on both syllabification and decompounding tasks with 99.56% and 99.12% accuracy, respectively. It can therefore be used for hyphenation in Afrikaans with little risk of hyphenation errors in printed text. The performance of the artificial neural network was lower, but still acceptable, with 98.82% and 98.42% accuracy for syllabification and decompounding, respectively. The decision tree, with an accuracy of 97.91% on syllabification and 90.71% on decompounding, was found to be too risky to use for either of the tasks.
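For readers unfamiliar with pattern-based hyphenation, the sketch below shows how patterns of the kind used by the TeX algorithm are applied to a word: each pattern carries digits between its letters, the highest digit wins at every inter-letter position, and an odd value permits a break while an even value forbids it. The patterns in the example are hand-written toy patterns for illustration, not patterns generated by OPatGen, and the function names are assumptions.

```python
import re


def parse_pattern(pattern: str):
    """Split a pattern such as 'd1a' into its letters and inter-letter values.

    'd1a' -> ('da', [0, 1, 0]): value 1 applies to the gap between d and a.
    """
    letters = re.sub(r"\d", "", pattern)
    values = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            values[pos] = int(ch)
        else:
            pos += 1
    return letters, values


def hyphenate(word: str, patterns: list, left_min: int = 2, right_min: int = 2) -> list:
    """Return the word split at every position the patterns allow."""
    padded = "." + word.lower() + "."          # '.' marks the word boundaries
    scores = [0] * (len(padded) + 1)           # scores[k]: gap before padded[k]
    for pattern in patterns:
        letters, values = parse_pattern(pattern)
        start = padded.find(letters)
        while start != -1:
            for offset, value in enumerate(values):
                scores[start + offset] = max(scores[start + offset], value)
            start = padded.find(letters, start + 1)
    parts, last = [], 0
    # The gap before word[j] is the gap before padded[j + 1].
    for j in range(left_min, len(word) - right_min + 1):
        if scores[j + 1] % 2 == 1:
            parts.append(word[last:j])
            last = j
    parts.append(word[last:])
    return parts


if __name__ == "__main__":
    # Toy patterns: '1r' allows a break before every r; 'o2r' and 'b2r' veto it.
    toy_patterns = ["d1a", "f1b", "e1k", "1r", "o2r", "b2r"]
    print("-".join(hyphenate("woordafbreking", toy_patterns)))
```

Real pattern sets contain thousands of such patterns; the competition between permitting (odd) and inhibiting (even) values is what lets a compact set encode both rules and their exceptions.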
A combined algorithm was developed where words are first decompounded by using the TeX algorithm before syllabifying them with both the TeX algorithm and the neural network and combining the results. This algorithm reduced the number of errors made by the TeX algorithm by 1.3% but missed more hyphens. Testing the algorithm on Afrikaans publications showed the risk for hyphenation errors to be ±0.02% for text assumed to have an average of ten words per line. / Decision Sciences / D. Phil. (Operational Research)
|
17 |
An investigation into the feasibility of monitoring a call centre using an emotion recognition system
Stoop, Werner 04 June 2010
In this dissertation a method for the classification of emotion in speech recordings made in a customer service call centre of a large business is presented. The problem addressed here is that customer service analysts at large businesses have to listen to large numbers of call centre recordings in order to discover customer service-related issues. Since recordings where the customer exhibits emotion are more likely to contain useful information for service improvement than “neutral” ones, being able to identify those recordings should save a lot of time for the customer service analyst. MTN South Africa agreed to provide assistance for this project. The system that has been developed for this project can interface with MTN’s call centre database, download recordings, classify them according to their emotional content, and provide feedback to the user. The system faces the additional challenge that it is required to classify emotion notwithstanding the fact that the caller may have one of several South African accents. It should also be able to function with recordings made at telephone quality sample rates. The project identifies several speech features that can be used to classify a speech recording according to its emotional content. The project uses these features to research the general methods by which the problem of emotion classification in speech can be approached. The project examines both a K-Nearest Neighbours Approach and an Artificial Neural Network-Based Approach to classify the emotion of the speaker. Research is also done with regard to classifying a recording according to the gender of the speaker using a neural network approach. The reason for this classification is that the gender of a speaker may be useful input into an emotional classifier. The project furthermore examines the problem of identifying smaller segments of speech in a recording. In the typical call centre conversation, a recording may start with the agent greeting the customer, the customer stating his or her problem, the agent performing an action, during which time no speech occurs, the agent reporting back to the user and the call being terminated. The approach taken by this project allows the program to isolate these different segments of speech in a recording and discard segments of the recording where no speech occurs. This project suggests and implements a practical approach to the creation of a classifier in a commercial environment through its use of a scripting language interpreter that can train a classifier in one script and use the trained classifier in another script to classify unknown recordings. The project also examines the practical issues involved in implementing an emotional classifier. It addresses the downloading of recordings from the call centre, classifying the recording and presenting the results to the customer service analyst.
AFRIKAANS : ’n Metode vir die klassifisering van emosie in spraakopnames in die oproepsentrum van ’n groot sake-onderneming word in hierdie verhandeling aangebied. Die probleem wat hierdeur aangespreek word, is dat kliëntediens ontleders in ondernemings na groot hoeveelhede oproepsentrum opnames moet luister ten einde kliëntediens aangeleenthede te identifiseer. Aangesien opnames waarin die kliënt emosie toon, heel waarskynlik nuttige inligting bevat oor diensverbetering, behoort die vermoë om daardie opnames te identifiseer vir die analis baie tyd te spaar. MTN Suid-Afrika het ingestem om bystand vir die projek te verleen. Die stelsel wat ontwikkel is kan opnames vanuit MTN se oproepsentrum databasis verkry, klassifiseer volgens emosionele inhoud en terugvoering aan die gebruiker verskaf. Die stelsel moet die verdere uitdaging kan oorkom om emosie te kan klassifiseer nieteenstaande die feit dat die spreker een van verskeie Suid-Afrikaanse aksente het. Dit moet ook in staat wees om opnames wat gemaak is teen telefoon gehalte tempos te analiseer. Die projek identifiseer verskeie spraak eienskappe wat gebruik kan word om ’n opname volgens emosionele inhoud te klassifiseer. Die projek gebruik hierdie eienskappe om die algemene metodes waarmee die probleem van emosie klassifisering in spraak benader kan word, na te vors. Die projek gebruik ’n K-Naaste Bure en ’n Neurale Netwerk benadering om die emosie van die spreker te klassifiseer. Navorsing is voorts gedoen met betrekking tot die klassifisering van die geslag van die spreker deur ’n neurale netwerk. Die rede vir hierdie klassifisering is dat die geslag van die spreker ’n nuttige inset vir ’n emosie klassifiseerder mag wees. Die projek ondersoek ook die probleem van identifisering van spraakgedeeltes in ’n opname. In ’n tipiese oproepsentrum gesprek mag die opname begin met die agent wat die kliënt groet, die kliënt wat sy of haar probleem stel, die agent wat ’n aksie uitvoer sonder spraak, die agent wat terugrapporteer aan die gebruiker en die oproep wat beëindig word. Die benadering van hierdie projek laat die program toe om hierdie verskillende gedeeltes te isoleer uit die opname en om gedeeltes waar daar geen spraak plaasvind nie, uit te sny. Die projek stel ’n praktiese benadering vir die ontwikkeling van ’n klassifiseerder in ’n kommersiële omgewing voor en implementeer dit deur gebruik te maak van ’n programeer taal interpreteerder wat ’n klassifiseerder kan oplei in een program en die opgeleide klassifiseerder gebruik om ’n onbekende opname te klassifiseer met behulp van ’n ander program. Die projek ondersoek ook die praktiese aspekte van die implementering van ’n emosionele klassifiseerder. Dit spreek die aflaai van opnames uit die oproep sentrum, die klassifisering daarvan, en die aanbieding van die resultate aan die kliëntediens analis, aan. Copyright / Dissertation (MEng)--University of Pretoria, 2010. / Electrical, Electronic and Computer Engineering / unrestricted
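The K-Nearest Neighbours approach mentioned in the abstract can be illustrated with a minimal sketch: a recording is represented by a feature vector and receives the majority label of its k closest labelled neighbours. The two-dimensional feature vectors, the labels and the function name below are hypothetical; the abstract does not list the speech features that the dissertation actually uses.

```python
import math
from collections import Counter


def knn_classify(sample, training, k: int = 3) -> str:
    """Label `sample` by majority vote among its k nearest training vectors.

    `training` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    neighbours = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical, normalised features, e.g. (pitch variation, energy).
    training = [
        ((0.10, 0.20), "neutral"),
        ((0.20, 0.10), "neutral"),
        ((0.15, 0.15), "neutral"),
        ((0.90, 0.80), "emotional"),
        ((0.80, 0.90), "emotional"),
        ((0.85, 0.85), "emotional"),
    ]
    print(knn_classify((0.88, 0.82), training))
    print(knn_classify((0.12, 0.18), training))
```

In practice the features would need to be scaled to comparable ranges before distances are computed, since otherwise the feature with the largest numeric range dominates the vote.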
|