1

Learning to Read Bushman: Automatic Handwriting Recognition for Bushman Languages

Williams, Kyle 01 January 2012
The Bleek and Lloyd Collection contains notebooks that document the tradition, language and culture of the Bushman people who lived in South Africa in the late 19th century. Transcriptions of these notebooks would allow for the provision of services such as text-based search and text-to-speech. However, these notebooks are currently only available in the form of digital scans and the manual creation of transcriptions is a costly and time-consuming process. Thus, automatic methods could serve as an alternative approach to creating transcriptions of the text in the notebooks. In order to evaluate the use of automatic methods, a corpus of Bushman texts and their associated transcriptions was created. The creation of this corpus involved: the development of a custom method for encoding the Bushman script, which contains complex diacritics; the creation of a tool for creating and transcribing the texts in the notebooks; and the running of a series of workshops in which the tool was used to create the corpus. The corpus was used to evaluate various techniques for automatically transcribing its texts in order to determine which approaches were best suited to the complex Bushman script. These techniques included the use of Support Vector Machines, Artificial Neural Networks and Hidden Markov Models as machine learning algorithms, coupled with different descriptive features. The effect of the texts used for training the machine learning algorithms was also investigated, as was the use of a statistical language model. It was found that, for Bushman word recognition, a Support Vector Machine with Histograms of Oriented Gradient features resulted in the best performance and, for Bushman text line recognition, Marti & Bunke features resulted in the best performance when used with Hidden Markov Models. The automatic transcription of the Bushman texts proved to be difficult, and the performance of the different recognition systems was largely affected by the complexities of the Bushman script. It was also found that the texts used in an automatic handwriting recognition system not only influence which techniques are most appropriate, but also play a large role in determining whether automatic recognition should be attempted at all.
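As a rough illustration of the best-performing word-recognition setup, the sketch below pairs Histograms of Oriented Gradient features with a Support Vector Machine, using scikit-image and scikit-learn. The image size, HOG parameters and SVM hyperparameters are illustrative assumptions, not the configuration used in the thesis.

import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def hog_features(word_image, size=(64, 128)):
    # Normalise a grayscale word crop to a fixed size, then compute
    # Histogram of Oriented Gradients descriptors.
    return hog(resize(word_image, size, anti_aliasing=True),
               orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm='L2-Hys')

def train_word_recognizer(word_images, word_labels):
    # word_images: 2-D grayscale arrays; word_labels: their
    # transcriptions. Both are assumed to come from a labelled
    # corpus such as the one described in the abstract.
    X = np.array([hog_features(img) for img in word_images])
    X_train, X_test, y_train, y_test = train_test_split(
        X, np.array(word_labels), test_size=0.2, random_state=0)
    clf = SVC(kernel='rbf', C=10, gamma='scale')  # one class per word form
    clf.fit(X_train, y_train)
    print('held-out word accuracy:', accuracy_score(y_test, clf.predict(X_test)))
    return clf

Each distinct word form is treated as its own class here, which assumes a closed vocabulary; the HMM-based line recognition mentioned above typically avoids that assumption by modelling character sequences over sliding-window features.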
2

Investigating the Efficacy of XML and Stylesheets to Render Electronic Courseware for Multiple Learning Styles

du Toit, Masha 01 June 2007
The objective of this project was to test the efficacy of using Extensible Markup Language (XML) - in particular the DocBook 5.0b5 schema - and Extensible Stylesheet Language Transformation (XSLT) to render electronic courseware that can be dynamically reformatted according to a student’s individual learning style. The text of a typical lesson was marked up in XML according to the DocBook schema, and several XSLT stylesheets were created to transform the XML document into different versions, each according to particular learning needs. These learning needs were drawn from the Felder-Silverman learning style model. The notes contained links that triggered JavaScript functions, allowing the student to reformat the notes to produce different views of the lesson. The dynamic notes were tested on twelve users who filled out a feedback questionnaire. Feedback was largely positive and suggested that users were able to navigate according to their learning style. There were some usability issues caused by the program’s lack of compatibility with some browsers. However, the user test was not the most critical part of the evaluation: it served to confirm that the notes were usable, but the analysis of the use of XSLT and DocBook is the key aspect of this project. It was found that XML, and in particular the DocBook schema, was a useful tool in these circumstances, being easy to learn, well supported, and having the appropriate structure for a project of this type. The use of XSLT, on the other hand, was not so straightforward. Learning a declarative language was a challenge, as was using XSLT to transform the notes as this project required. A particular problem was the need to move content from one area of the document to another - to hide it in some cases and reveal it in others. The solution was not straightforward to achieve using XSLT and did not take proper advantage of the strengths of this technology. The fact that the XSLT processor uses the DOM API, which necessitates loading the entire XML document into memory, is particularly problematic in this instance, where the document is constantly transformed and re-transformed. The manner in which stylesheets are assigned, as well as the need to use DOM objects to edit the source tree, necessitated the use of JavaScript to achieve the necessary usability. These mechanisms introduced a limitation in terms of browser compatibility and caused the program to freeze on older machines. The problems with browser compatibility and the synchronous loading of data are not insurmountable, and can be overcome with appropriate use of JavaScript and asynchronous data retrieval, as made possible by AJAX.
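The core transformation step can be sketched briefly. The example below uses Python’s lxml rather than the browser-side XSLT processing the project relied on, and the stylesheet is a toy: it copies a DocBook lesson verbatim but suppresses example elements, standing in for one learning-style view. The element choice and the view itself are assumptions for illustration only.

from lxml import etree

HIDE_EXAMPLES_XSLT = b"""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:db="http://docbook.org/ns/docbook">
  <!-- identity template: copy everything by default -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
  <!-- empty template: drop examples from this view -->
  <xsl:template match="db:example"/>
</xsl:stylesheet>
"""

def render_view(lesson_xml: bytes) -> bytes:
    # Apply the view-specific stylesheet to a DocBook lesson and
    # return the transformed document.
    transform = etree.XSLT(etree.XML(HIDE_EXAMPLES_XSLT))
    return etree.tostring(transform(etree.XML(lesson_xml)), pretty_print=True)

Switching views then amounts to applying a different stylesheet to the same source document, which is also why the browser-side approach described above had to hold the whole DOM in memory and re-transform it on every change.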
3

Improving searchability of automatically transcribed lectures through dynamic language modelling

Marquard, Stephen 01 December 2012
Recording university lectures through lecture capture systems is increasingly common. However, a single continuous audio recording is often unhelpful for users, who may wish to navigate quickly to a particular part of a lecture, or locate a specific lecture within a set of recordings. A transcript of the recording can enable faster navigation and searching. Automatic speech recognition (ASR) technologies may be used to create automated transcripts, to avoid the significant time and cost involved in manual transcription. Low accuracy of ASR-generated transcripts may however limit their usefulness. In particular, ASR systems optimized for general speech recognition may not recognize the many technical or discipline-specific words occurring in university lectures. To improve the usefulness of ASR transcripts for the purposes of information retrieval (search) and navigating within recordings, the lexicon and language model used by the ASR engine may be dynamically adapted for the topic of each lecture. A prototype is presented which uses the English Wikipedia as a semantically dense, large language corpus to generate a custom lexicon and language model for each lecture from a small set of keywords. Two strategies for extracting a topic-specific subset of Wikipedia articles are investigated: a naïve crawler which follows all article links from a set of seed articles produced by a Wikipedia search from the initial keywords, and a refinement which follows only links to articles sufficiently similar to the parent article. Pair-wise article similarity is computed from a pre-computed vector space model of Wikipedia article term scores generated using latent semantic indexing. The CMU Sphinx4 ASR engine is used to generate transcripts from thirteen recorded lectures from Open Yale Courses, using the English HUB4 language model as a reference and the two topic-specific language models generated for each lecture from Wikipedia. Three standard metrics – Perplexity, Word Error Rate and Word Correct Rate – are used to evaluate the extent to which the adapted language models improve the searchability of the resulting transcripts, and in particular improve the recognition of specialist words. Ranked Word Correct Rate is proposed as a new metric better aligned with the goals of improving transcript searchability and specialist word recognition. Analysis of recognition performance shows that the language models derived using the similarity-based Wikipedia crawler outperform models created using the naïve crawler, and that transcripts using similarity-based language models have better perplexity and Ranked Word Correct Rate scores than those created using the HUB4 language model, but worse Word Error Rates. It is concluded that English Wikipedia may successfully be used as a language resource for unsupervised topic adaptation of language models to improve recognition performance for better searchability of lecture recording transcripts, although possibly at the expense of other attributes such as readability.
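The link-filtering step of the similarity-based crawler can be sketched as follows. The TF-IDF/LSI construction, dimensionality and similarity threshold below are illustrative assumptions standing in for the pre-computed vector space model described above.

from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def build_lsi_vectors(article_texts, n_topics=100):
    # Project TF-IDF article vectors into a latent semantic space,
    # one dense row per article.
    tfidf = TfidfVectorizer(stop_words='english')
    term_matrix = tfidf.fit_transform(article_texts)
    return TruncatedSVD(n_components=n_topics).fit_transform(term_matrix)

def should_follow(parent_idx, child_idx, lsi_vectors, threshold=0.5):
    # Follow a link only if the linked article is sufficiently
    # similar to its parent; the threshold is an illustrative guess.
    sim = cosine_similarity(lsi_vectors[parent_idx:parent_idx + 1],
                            lsi_vectors[child_idx:child_idx + 1])[0, 0]
    return sim >= threshold

A crawl then starts from the seed articles returned by the initial keyword search and enqueues only those links that pass should_follow, keeping the harvested corpus on-topic before the lexicon and language model are built from it.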
