Return to search

Confidence Measures for Speech/Speaker Recognition and Applications on Turkish LVCSR

Confidence measures for the results of speech/speaker recognition make the systems more useful in the real time applications. Confidence measures provide a test statistic for accepting or rejecting the recognition hypothesis of the speech/speaker recognition system.
Speech/speaker recognition systems are usually based on statistical modeling techniques. In
this thesis we defined confidence measures for statistical modeling techniques used in speech/speaker recognition systems.
For speech recognition we tested available confidence measures and the newly defined acoustic prior information based confidence measure in two different conditions which cause errors: the out-of-vocabulary words and presence of additive noise. We showed that the newly defined confidence measure performs better in both tests.
Review of speech recognition and speaker recognition techniques and some related statistical methods is given through the thesis.
We defined also a new interpretation technique for confidence measures which is based on Fisher transformation of likelihood ratios obtained in speaker verification. Transformation provided us with a linearly interpretable confidence level which can be used directly in real time applications like for dialog management.
We have also tested the confidence measures for speaker verification systems and evaluated the efficiency of the confidence measures for adaptation of speaker models. We showed that use of confidence measures to select adaptation data improves the accuracy of the speaker model adaptation process.
Another contribution of this thesis is the preparation of a phonetically rich continuous speech database for Turkish Language. The database is used for developing an HMM/MLP hybrid speech recognition for Turkish Language. Experiments on the test sets of the database showed that the speech recognition system has a good accuracy for long speech sequences while performance is lower for short words, as it is the case for current speech recognition systems for
other languages.
A new language modeling technique for the Turkish language is introduced in this thesis, which can be used for other agglutinative languages. Performance evaluations on newly defined language modeling techniques showed that it outperforms the classical n-gram language modeling technique.

Identiferoai:union.ndltd.org:BICfB/oai:fpms.ac.be:ETDFPMS:FPMSetd-07092004-102840
Date24 May 2004
CreatorsMengusoglu, Erhan
ContributorsTrecat, J., Leich, H., Hanton, J., Gosselin, B., Froidure, J-C., Macq, B., Grenez, F.
PublisherFaculte Polytechnique de Mons
Source SetsBibliothèque interuniversitaire de la Communauté française de Belgique
LanguageEnglish
Detected LanguageEnglish
Typetext
Formatapplication/pdf
Sourcehttp://theses.fpms.ac.be/ETD-db/collection/available/FPMSetd-07092004-102840/
Rightsunrestricted, J'accepte que le texte de la thèse (ci-après l'oeuvre), sous réserve des parties couvertes par la confidentialité, soit publié dans le recueil électronique des thèses FPMS. A cette fin, je donne licence à FPMS : - le droit de fixer et de reproduire l'oeuvre sur support électronique : logiciel ETD/db - le droit de communiquer l'oeuvre au public Cette licence, gratuite et non exclusive, est valable pour toute la durée de la propriété littéraire et artistique, y compris ses éventuelles prolongations, et pour le monde entier. Je conserve tous les autres droits pour la reproduction et la communication de la thèse, ainsi que le droit de l'utiliser dans de futurs travaux. Je certifie avoir obtenu, conformément à la législation sur le droit d'auteur et aux exigences du droit à l'image, toutes les autorisations nécessaires à la reproduction dans ma thèse d'images, de textes, et/ou de toute oeuvre protégés par le droit d'auteur, et avoir obtenu les autorisations nécessaires à leur communication à des tiers. Au cas où un tiers est titulaire d'un droit de propriété intellectuelle sur tout ou partie de ma thèse, je certifie avoir obtenu son autorisation écrite pour l'exercice des droits mentionnés ci-dessus.

Page generated in 0.0025 seconds