1

Fast accurate diphone-based phoneme recognition

Du Preez, Marianne
Thesis (MScEng (Electrical and Electronic Engineering))--University of Stellenbosch, 2009.

Statistical speech recognition systems typically utilise a set of statistical models of subword units based on the set of phonemes in a target language. However, in continuous speech it is important to consider co-articulation effects and the interactions between neighbouring sounds, as over-generalisation of the phonetic models can negatively affect system accuracy. Traditionally, co-articulation in continuous speech is handled by incorporating contextual information into the subword model by means of context-dependent models, which exponentially increase the number of subword models. In contrast, transitional models aim to handle co-articulation by modelling the interphone dynamics found in the transitions between phonemes. This research aimed to perform an objective analysis of diphones as subword units for use in hidden Markov model-based continuous-speech recognition systems, with special emphasis on a direct comparison to a context-dependent biphone-based system in terms of complexity, accuracy and computational efficiency under similar parametric conditions. To simulate practical conditions, the experiments were designed to evaluate these systems in a low-resource environment (limited supply of training data, computing power and system memory) while still attempting fast, accurate phoneme recognition. Adaptation techniques designed to exploit characteristics inherent in diphones, as well as techniques used for effective parameter estimation and state-level tying, were used to reduce resource requirements while simultaneously increasing parameter reliability. These techniques include diphthong splitting, utilisation of a basic diphone grammar, diphone set completion, maximum a posteriori estimation and decision-tree based state clustering algorithms. The experiments were designed to evaluate the contribution of each adaptation technique individually, and subsequently to compare the optimised diphone-based recognition system to a biphone-based recognition system that received similar treatment. Results showed that diphone-based recognition systems perform better than both traditional phoneme-based systems and context-dependent biphone-based systems when evaluated under similar parametric conditions. Diphones are therefore effective subword units, which carry suprasegmental knowledge of speech signals and provide an excellent compromise between detailed co-articulation modelling and acceptable system performance.
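The transitional view described in this abstract can be made concrete with a small sketch: instead of one model per phoneme (or per context-dependent biphone), each modelling unit spans the transition between two neighbouring phonemes. The helper below is a hypothetical illustration, not code from the thesis; the phone labels and the silence padding at utterance boundaries are assumed conventions.

```python
# Hypothetical sketch (not from the thesis): expand a phoneme sequence into
# diphone units, i.e. one unit per transition between neighbouring phonemes.
# The phone labels and the "sil" boundary padding are illustrative assumptions.

def phones_to_diphones(phones, boundary="sil"):
    """Map a phoneme sequence to the diphone (transition) units that cover it."""
    padded = [boundary] + list(phones) + [boundary]
    # Each diphone spans the end of one phone and the start of the next.
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]

if __name__ == "__main__":
    # "speech" as a rough ARPAbet-style phone string (illustrative only)
    print(phones_to_diphones(["s", "p", "iy", "ch"]))
    # -> ['sil-s', 's-p', 'p-iy', 'iy-ch', 'ch-sil']
```

An inventory built this way grows roughly quadratically with the size of the phone set, which is why the set-completion, maximum a posteriori estimation and state-tying techniques mentioned in the abstract matter in a low-resource setting.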
2

Probabilistic Modelling of Hearing : Speech Recognition and Optimal Audiometry

Stadler, Svante, January 2009
Hearing loss afflicts as many as 10% of our population. Fortunately, technologies designed to alleviate the effects of hearing loss are improving rapidly, including cochlear implants and the increasing computing power of digital hearing aids. This thesis focuses on theoretically sound methods for improving hearing aid technology. The main contributions are documented in three research articles, which treat two separate topics: modelling of human speech recognition (Papers A and B) and optimization of diagnostic methods for hearing loss (Paper C). Papers A and B present a hidden Markov model-based framework for simulating speech recognition in noisy conditions using auditory models and signal detection theory. In Paper A, a model of normal and impaired hearing is employed, in which a subject's pure-tone hearing thresholds are used to adapt the model to the individual. In Paper B, the framework is modified to simulate hearing with a cochlear implant (CI). Two models of hearing with CI are presented: a simple, functional model and a biologically inspired model. The models are adapted to the individual CI user by simulating a spectral discrimination test. The framework can estimate speech recognition ability for a given hearing impairment or cochlear implant user. This estimate could potentially be used to optimize hearing aid settings. Paper C presents a novel method for sequentially choosing the sound level and frequency for pure-tone audiometry. A Gaussian mixture model (GMM) is used to represent the probability distribution of hearing thresholds at 8 frequencies. The GMM is fitted to over 100,000 hearing thresholds from a clinical database. After each response, the GMM is updated using Bayesian inference. The sound level and frequency are chosen so as to maximize a predefined objective function, such as the entropy of the probability distribution. It is found through simulation that an average of 48 tone presentations is needed to achieve the same accuracy as the standard method, which requires an average of 135 presentations.
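The adaptive procedure of Paper C can be sketched in a few lines: keep a distribution over the 8-frequency threshold vector, update it after each "heard"/"not heard" response, and pick the next tone so as to reduce the uncertainty of that distribution as quickly as possible. The code below is a hedged illustration, not the author's implementation: it approximates the posterior with weighted samples (a particle approximation) rather than updating the GMM parameters directly as the abstract describes, it assumes a logistic psychometric function linking presentation level to the probability of hearing the tone, and the prior parameters are invented for illustration. Only the overall loop, a Bayesian update followed by entropy-driven stimulus selection, mirrors the description in the abstract.

```python
# Hedged sketch of the Paper C idea (not the author's code): keep a distribution
# over the 8-frequency hearing-threshold vector, update it after every listener
# response, and choose the next tone to minimise the expected remaining entropy.
# Assumptions not in the abstract: the posterior is approximated by weighted
# samples (particles), responses follow a logistic psychometric function, and
# the prior parameters below are invented for illustration.

import numpy as np

rng = np.random.default_rng(0)
N_FREQ, N_PART = 8, 2000

# Prior: samples drawn from an assumed two-component GMM over thresholds (dB HL).
means = np.array([np.full(N_FREQ, 10.0), np.full(N_FREQ, 50.0)])
comp = rng.choice(2, size=N_PART, p=[0.7, 0.3])
particles = means[comp] + rng.normal(0.0, 15.0, size=(N_PART, N_FREQ))
w = np.full(N_PART, 1.0 / N_PART)            # particle weights (the posterior)


def p_heard(level, thresholds, slope=1.0):
    """Logistic psychometric function: P(tone heard | level, threshold)."""
    return 1.0 / (1.0 + np.exp(-slope * (level - thresholds)))


def entropy(weights):
    nz = weights[weights > 0]
    return float(-np.sum(nz * np.log(nz)))


def update(w, f, level, heard):
    """Bayesian update of the particle weights after one response."""
    like = p_heard(level, particles[:, f])
    w = w * (like if heard else 1.0 - like)
    return w / w.sum()


def expected_entropy(w, f, level):
    """Expected posterior entropy if the next tone is (frequency f, level)."""
    p = p_heard(level, particles[:, f])
    p_yes = float(np.sum(w * p))
    h = 0.0
    for like, prob in ((p, p_yes), (1.0 - p, 1.0 - p_yes)):
        if prob > 1e-9:
            post = w * like
            post /= post.sum()
            h += prob * entropy(post)
    return h


def next_stimulus(w, levels=np.arange(-10, 101, 10)):
    """Pick the (frequency, level) pair that most reduces expected entropy."""
    candidates = [(f, L) for f in range(N_FREQ) for L in levels]
    return min(candidates, key=lambda fl: expected_entropy(w, *fl))


if __name__ == "__main__":
    true_thr = rng.normal(40.0, 10.0, size=N_FREQ)   # simulated listener
    for _ in range(30):                              # 30 tone presentations
        f, L = next_stimulus(w)
        heard = rng.random() < p_heard(L, true_thr[f])
        w = update(w, f, L, heard)
    estimate = w @ particles                         # posterior-mean thresholds
    print("estimate:", np.round(estimate, 1))
    print("true:    ", np.round(true_thr, 1))
```

The particle shortcut is used here only to keep the sketch short; the abstract describes updating the GMM itself with Bayesian inference, which this approximation does not attempt to reproduce.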
