1 |
Data-driven augmentation of pronunciation dictionaries
Loots, Linsen 03 1900
Thesis (MScEng (Electrical and Electronic Engineering))--University of Stellenbosch, 2010. / ENGLISH ABSTRACT: This thesis investigates various data-driven techniques by which pronunciation dictionaries
can be automatically augmented. First, well-established grapheme-to-phoneme (G2P) conversion
techniques are evaluated for Standard South African English (SSAE), British English
(RP) and American English (GenAm) by means of four appropriate dictionaries: SAEDICT,
BEEP, CMUDICT and PRONLEX.
Next, the decision tree algorithm is extended to allow the conversion of pronunciations
between different accents by means of phoneme-to-phoneme (P2P) and grapheme-and-phoneme-to-phoneme
(GP2P) conversion. P2P conversion uses the phonemes of the source
accent as input to the decision trees. GP2P conversion further incorporates the graphemes
into the decision tree input. Both P2P and GP2P conversion are evaluated using the four
dictionaries. It is found that, when the pronunciation is needed for a word not present
in the target accent, it is substantially more accurate to modify an existing pronunciation
from a different accent than to derive it from the word’s spelling using G2P conversion.
When converting between accents, GP2P conversion provides a significant further increase
in performance above P2P.
Finally, experiments are performed to determine how large a training dictionary is required
in a target accent for G2P, P2P and GP2P conversion. It is found that GP2P
conversion requires less training data than P2P and substantially less than G2P conversion.
Furthermore, it is found that very little training data is needed for GP2P to perform at almost
maximum accuracy. The bulk of the accuracy is achieved within the initial 500 words,
and after 3000 words there is almost no further improvement.
Some specific approaches to compiling the best training set are also considered. By means
of an iterative greedy algorithm an optimal ranking of words to be included in the training
set is discovered. Using this set is shown to lead to substantially better GP2P performance
for the same training set size in comparison with alternative approaches such as the use of
phonetically rich words or random selections. A mere 25 words of training data from this
optimal set already achieve an accuracy within 1% of that of the full training dictionary. / AFRIKAANSE OPSOMMING: Hierdie tesis ondersoek verskeie data-gedrewe tegnieke waarmee uitspraakwoordeboeke outomaties
aangevul kan word. Eerstens word gevestigde grafeem-na-foneem (G2P) omskakelingstegnieke
geëvalueer vir Standaard Suid-Afrikaanse Engels (SSAE), Britse Engels (RP)
en Amerikaanse Engels (GenAm) deur middel van vier geskikte woordeboeke: SAEDICT,
BEEP, CMUDICT en PRONLEX.
Voorts word die beslissingsboomalgoritme uitgebrei om die omskakeling van uitsprake
tussen verskillende aksente moontlik te maak, deur middel van foneem-na-foneem (P2P) en
grafeem-en-foneem-na-foneem (GP2P) omskakeling. P2P omskakeling gebruik die foneme
van die bronaksent as inset vir die beslissingsbome. GP2P omskakeling inkorporeer verder
die grafeme by die inset. Beide P2P en GP2P omskakeling word evalueer deur middel van
die vier woordeboeke. Daar word bevind dat wanneer die uitspraak benodig word vir ’n
woord wat nie in die teikenaksent teenwoordig is nie, dit bepaald meer akkuraat is om ’n
bestaande uitspraak van ’n ander aksent aan te pas, as om dit af te lei vanuit die woord se
spelling met G2P omskakeling. Wanneer daar tussen aksente omgeskakel word, gee GP2P
omskakeling ’n verdere beduidende verbetering in akkuraatheid bo P2P.
Laastens word eksperimente uitgevoer om die grootte te bepaal van die afrigtingswoordeboek
wat benodig word in ’n teikenaksent vir G2P, P2P en GP2P omskakeling. Daar
word bevind dat GP2P omskakeling minder afrigtingsdata as P2P en substansieel minder as
G2P benodig. Verder word dit bevind dat baie min afrigtingsdata benodig word vir GP2P
om teen bykans maksimum akkuraatheid te funksioneer. Die oorwig van die akkuraatheid
word binne die eerste 500 woorde bereik, en ná 3000 woorde is daar amper geen verdere
verbetering nie.
’n Aantal spesifieke benaderings word ook oorweeg om die beste afrigtingstel saam te stel.
Deur middel van ’n iteratiewe, gulsige algoritme word ’n optimale rangskikking van woorde
bepaal vir insluiting by die afrigtingstel. Daar word getoon dat deur hierdie stel te gebruik,
substansieel beter GP2P gedrag verkry word vir dieselfde grootte afrigtingstel in vergelyking
met alternatiewe benaderings soos die gebruik van foneties-ryke woorde of lukrake seleksies.
’n Skamele 25 woorde uit hierdie optimale stel gee reeds ’n akkuraatheid binne 1% van dié
van die volle afrigtingswoordeboek.
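The iterative greedy ranking of training words mentioned in the abstract can be illustrated with a minimal sketch of greedy forward selection. This is not the thesis's implementation: the function name `greedy_rank`, the `evaluate` callback, and the toy rule-coverage score are all hypothetical stand-ins. In a real experiment `evaluate` would presumably train a GP2P converter on the selected words and return its accuracy on held-out data.

```python
from typing import Callable, List, Sequence


def greedy_rank(
    candidates: Sequence[str],
    evaluate: Callable[[List[str]], float],
    limit: int,
) -> List[str]:
    """Rank candidate words by greedy forward selection.

    At each step, the word whose addition gives the highest score
    is appended to the ranking. Ties go to the earliest candidate.
    """
    selected: List[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < limit:
        # Score every remaining word when added to the current selection.
        best = max(remaining, key=lambda w: evaluate(selected + [w]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: each word "teaches" some abstract conversion rules.
# These rule labels are placeholders, not real phonological rules.
TOY_RULES = {
    "bath": {"rule_a"},
    "dance": {"rule_a", "rule_b"},
    "kit": {"rule_c"},
    "city": {"rule_b", "rule_c"},
}
ALL_RULES = set().union(*TOY_RULES.values())


def toy_accuracy(words: List[str]) -> float:
    """Stand-in for 'train on these words, measure accuracy':
    the fraction of all toy rules covered by the chosen words."""
    covered = set().union(*(TOY_RULES[w] for w in words))
    return len(covered) / len(ALL_RULES)


ranking = greedy_rank(list(TOY_RULES), toy_accuracy, limit=3)
print(ranking)
```

With this toy score, the first two words chosen already cover every rule, which loosely mirrors the abstract's observation that a small, well-chosen training set can approach the accuracy of the full dictionary. Note that each step re-evaluates every remaining candidate, so the cost grows quickly with dictionary size when `evaluate` involves real training.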
|
2 |
Adapting a pronunciation dictionary to Standard South African English for automatic speech recognition / Olga Meruzhanovna Martirosian
Martirosian, Olga Meruzhanovna January 2009
The pronunciation dictionary is a key resource required during the development of an automatic speech recognition (ASR) system. In this thesis, we adapt a British English pronunciation dictionary to Standard South African English (SSAE), as a case study in dialect adaptation. Our investigation leads us in three different directions: dictionary verification, phoneme redundancy evaluation and phoneme adaptation.
A pronunciation dictionary should be verified for correctness before it is used in experiments or applications. However, having a human verify a full pronunciation dictionary is a costly process that cannot always be accommodated. In our dictionary verification research we attempt to reduce the human effort required to verify a pronunciation dictionary by implementing automatic and semi-automatic techniques that find and isolate possibly erroneous entries in the dictionary. We identify a number of new techniques that are very efficient in identifying errors, and apply them to a public domain British English pronunciation dictionary.
Investigating phoneme redundancy involves looking into the possibility that not all phoneme distinctions are required in SSAE, and investigating different methods of analysing these distinctions. The methods that are investigated include both data-driven and knowledge-based pronunciation suggestions for a pronunciation dictionary used in an automatic speech recognition (ASR) system. This investigation facilitates a deeper linguistic insight into the pronunciation of phonemes in SSAE.
Finally, we investigate phoneme adaptation by adapting the KIT phoneme between two dialects of English through the implementation of a set of adaptation rules. Adaptation rules are extracted from literature but also formulated through an investigation of the linguistic phenomena in the data. We achieve a 93% predictive accuracy, which is significantly higher than the 71% achievable through the implementation of previously identified rules. The adaptation of a British pronunciation dictionary to SSAE represents the final step of developing an SSAE pronunciation dictionary, which is the aim of this thesis. In addition, an ASR system utilising the dictionary is developed, achieving an unconstrained phoneme accuracy of 79.7%. / Thesis (M.Ing. (Computer Engineering))--North-West University, Potchefstroom Campus, 2009.
|