Global ETD Search

Return to search

Utilisation des coefficients de régression linéaire par maximum de vraisemblance comme paramètres pour la reconnaissance automatique du locuteur

The goal of this thesis is to find new and efficient features for speaker recognition. We are mostly concerned with the use of the Maximum-Likelihood Linear Regression (MLLR) family of adaptation techniques as features in speaker recognition systems. MLLR transformcoefficients are able to capture speaker cues after adaptation of a speaker-independent model using speech data. The resulting supervectors are high-dimensional and no underlying model guiding its generation is assumed a priori, becoming suitable for SVM for classification. This thesis brings some contributions to the speaker recognition field by proposing new approaches to feature extraction and studying existing ones via experimentation on large corpora: 1. We propose a compact yet efficient system, MLLR-SVM, which tackles the issues of transcript- and language-dependency of the standard MLLR-SVM approach by using single-class Constrained MLLR (CMLLR) adaptation transforms together with Speaker Adaptive Training (SAT) of a Universal Background Model (UBM). 1- When less data samples than dimensions are available. 2- We propose several alternative representations of CMLLR transformcoefficients based on the singular value and symmetric/skew-symmetric decompositions of transform matrices. 3- We develop a novel framework for feature-level inter-session variability compensation based on compensation of CMLLR transform supervectors via Nuisance Attribute Projection (NAP). 4- We perform a comprehensive experimental study of multi-class (C)MLLR-SVM systems alongmultiple axes including front-end, type of transform, type fmodel,model training and number of transforms. 5- We compare CMLLR and MLLR transform matrices based on an analysis of properties of their singular values. 6- We propose the use of lattice-basedMLLR as away to copewith erroneous transcripts in MLLR-SVMsystems using phonemic acoustic models.

traitement du langage

MLLR

reconnaissance du locuteur

adaptation du locuteur

Identifer	oai:union.ndltd.org:CCSD/oai:tel.archives-ouvertes.fr:tel-00616673
Date	10 July 2009
Creators	Ferràs Font, Marc
Publisher	Université Paris Sud - Paris XI
Source Sets	CCSD theses-EN-ligne, France
Language	English
Detected Language	English
Type	PhD thesis

Page generated in 0.0019 seconds

Utilisation des coefficients de régression linéaire par maximum de vraisemblance comme paramètres pour la reconnaissance automatique du locuteur

Description

Links & Downloads

Tags

Additional Fields