1

Creation of a pronunciation dictionary for automatic speech recognition : a morphological approach

Nkosi, Mpho Caselinah January 2012
Thesis (M.Sc. (Computer Science))--University of Limpopo, 2012 / Pronunciation dictionaries or lexicons play an important role in guiding the predictive powers of an Automatic Speech Recognition (ASR) system. As the use of ASR systems increases, there is a need for dictionaries that cover a large number of inflected word forms to enhance the performance of these systems. The main purpose of this study is to investigate the contribution of the morphological approach to creating a more comprehensive and broadly representative Northern Sotho pronunciation dictionary for ASR systems. Northern Sotho verbs, together with morphological rules, are used to generate more valid inflected word forms in the Northern Sotho language for the creation of a pronunciation dictionary. The pronunciation dictionary is developed using the Dictionary Maker tool. The Hidden Markov Model Toolkit is used to develop a simple ASR system in order to evaluate recognition performance when using the created pronunciation dictionary.
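The idea of generating inflected word forms from verbs and morphological rules can be sketched as below. This is only a toy illustration: the roots, extensions and final vowel in the example are assumptions chosen for demonstration, not rules taken from the thesis, and the function name `generate_forms` is hypothetical. Real Northern Sotho morphology involves sound changes and constraints that plain concatenation ignores.

```python
from itertools import product


def generate_forms(roots, extensions, endings):
    """Concatenate every root + extension + ending and return sorted unique forms.

    This is plain string concatenation; it does not model sound changes or
    restrictions on which affixes may combine.
    """
    forms = set()
    for root, extension, ending in product(roots, extensions, endings):
        forms.add(root + extension + ending)
    return sorted(forms)


if __name__ == "__main__":
    # Illustrative inputs only (assumed, not from the thesis).
    roots = ["rek", "bal"]
    extensions = ["", "el", "is"]  # "" means no extension
    endings = ["a"]
    for form in generate_forms(roots, extensions, endings):
        print(form)
```

Each generated form would still need a pronunciation, and possibly a validity check, before it could enter a dictionary; the sketch covers only the generation step.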
2

Adapting a pronunciation dictionary to Standard South African English for automatic speech recognition / Olga Meruzhanovna Martirosian

Martirosian, Olga Meruzhanovna January 2009
The pronunciation dictionary is a key resource required during the development of an automatic speech recognition (ASR) system. In this thesis, we adapt a British English pronunciation dictionary to Standard South African English (SSAE), as a case study in dialect adaptation. Our investigation leads us in three different directions: dictionary verification, phoneme redundancy evaluation and phoneme adaptation. A pronunciation dictionary should be verified for correctness before its implementation in experiments or applications. However, employing a human to verify a full pronunciation dictionary is an indulgent process which cannot always be accommodated. In our dictionary verification research we attempt to reduce the human effort required in the verification of a pronunciation dictionary by implementing automatic and semi-automatic techniques that find and isolate possibly erroneous entries in the dictionary. We identify a number of new techniques that are very efficient in identifying errors, and apply them to a public domain British English pronunciation dictionary. Investigating phoneme redundancy involves looking into the possibility that not all phoneme distinctions are required in SSAE, and investigating different methods of analysing these distinctions. The methods that are investigated include both data-driven and knowledge-based pronunciation suggestions for a pronunciation dictionary used in an ASR system. This investigation facilitates a deeper linguistic insight into the pronunciation of phonemes in SSAE. Finally, we investigate phoneme adaptation by adapting the KIT phoneme between two dialects of English through the implementation of a set of adaptation rules. Adaptation rules are extracted from literature but also formulated through an investigation of the linguistic phenomena in the data. We achieve a 93% predictive accuracy, which is significantly higher than the 71% achievable through the implementation of previously identified rules. The adaptation of a British pronunciation dictionary to SSAE represents the final step of developing an SSAE pronunciation dictionary, which is the aim of this thesis. In addition, an ASR system utilising the dictionary is developed, achieving an unconstrained phoneme accuracy of 79.7%. / Thesis (M.Ing. (Computer Engineering))--North-West University, Potchefstroom Campus, 2009.
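For readers unfamiliar with the phoneme accuracy figure quoted above, the following sketch shows one common way such a score is computed: accuracy = (N - S - D - I) / N, where N is the number of reference phonemes and S, D, I are substitutions, deletions and insertions. The abstract does not state which scoring procedure was used, so this definition, the unit edit costs, and the function names are assumptions.

```python
def edit_distance(ref, hyp):
    """Minimum number of substitutions, deletions and insertions (unit costs)
    needed to turn the reference sequence into the hypothesis sequence."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1      # reference phoneme missing from hyp
            insertion = curr[j - 1] + 1  # extra phoneme in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def phoneme_accuracy(ref, hyp):
    """Return (N - errors) / N for a non-empty reference phoneme sequence."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return (len(ref) - edit_distance(ref, hyp)) / len(ref)


if __name__ == "__main__":
    # Made-up phoneme strings, used only to exercise the functions.
    reference = ["k", "ae", "t", "s"]
    hypothesis = ["k", "ah", "t", "s"]
    print(phoneme_accuracy(reference, hypothesis))
```

Because insertions are counted as errors, this measure can become negative when the hypothesis is much longer than the reference.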
3

Unsupervised clustering of audio data for acoustic modelling in automatic speech recognition systems

Goussard, George Willem March 2011
Thesis (MScEng (Electrical and Electronic Engineering))--University of Stellenbosch, 2011. / ENGLISH ABSTRACT: This thesis presents a system that is designed to replace the manual process of generating a pronunciation dictionary for use in automatic speech recognition. The proposed system has several stages. The first stage segments the audio into what will be known as the subword units, using a frequency domain method. In the second stage, dynamic time warping is used to determine the similarity between each possible pair of these acoustic segments. These similarities are used to cluster similar acoustic segments into acoustic clusters. The final stage derives a pronunciation dictionary from the orthography of the training data and the corresponding sequence of acoustic clusters. This process begins with an initial mapping between words and their sequence of clusters, established by Viterbi alignment with the orthographic transcription. The dictionary is refined iteratively by pruning redundant mappings, hidden Markov model estimation and Viterbi re-alignment in each iteration. This approach is evaluated experimentally by applying it to two subsets of the TIMIT corpus. It is found that, when test words are repeated often in the training material, the approach leads to a system whose accuracy is almost as good as one trained using the phonetic transcriptions. When test words are not repeated often in the training set, the proposed approach leads to better results than those achieved using the phonetic transcriptions, although the recognition is poor overall in this case.
AFRIKAANS ABSTRACT (translated): The aim of this thesis is to describe a system designed to replace the manual process of compiling a dictionary for use in automatic speech recognition systems. The proposed system consists of a number of steps. The first step segments the audio into so-called subword units using a frequency domain technique. In the second step, the dynamic time warping algorithm is used to determine the similarity between each possible pair of acoustic segments. These similarities are then used to group the acoustic segments into acoustic clusters. The last step compiles the dictionary using the orthographic transcription of the training data and the corresponding sequence of acoustic clusters. This final step begins with an initial mapping from words to their sequences of cluster identifiers, established by Viterbi alignment with the orthographic transcription. The dictionary is refined iteratively by pruning redundant mappings, training hidden Markov models and applying Viterbi alignment in each iteration. The approach was tested experimentally on two subsets of data from the TIMIT corpus. It was found that, when words are repeated in the training data, the approach matches the accuracy of a system trained with the phonetic transcription. When the words are not repeated in the training data, the approach is more accurate than a system trained with the phonetic transcription, although accuracy is poor overall.
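The second stage of the system described in this abstract compares pairs of acoustic segments with dynamic time warping (DTW). A minimal sketch of the textbook DTW recurrence follows. It assumes each segment is a sequence of equal-length feature vectors and uses Euclidean distance between frames; the thesis may use different features, local distances or path constraints, and the function name `dtw_distance` is hypothetical.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Accumulated cost of the best monotonic alignment between two sequences
    of feature vectors, using Euclidean distance between individual frames."""
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            frame_dist = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = frame_dist + min(
                cost[i - 1][j],      # advance in seq_a only
                cost[i][j - 1],      # advance in seq_b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


if __name__ == "__main__":
    # Made-up one-dimensional "segments"; real features would be spectral vectors.
    slow = [(0.0,), (1.0,), (1.0,), (2.0,)]
    fast = [(0.0,), (1.0,), (2.0,)]
    print(dtw_distance(slow, fast))
```

Computing this distance for every pair of segments yields a similarity matrix that a clustering algorithm could then consume; the quadratic cost per pair is one reason pairwise DTW becomes expensive on large corpora.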
