Semi-continuous hidden Markov models for automatic speaker verification

This thesis investigates the use of semi-continuous hidden Markov models (HMMs) for automatic speaker verification (ASV) over a telephone channel. The implemented system is evaluated on a large database of isolated digits recorded over the British telephone network. The goal of the work is to improve the performance of the ASV system under the constraints of limited enrolment data (5 tokens of each digit) and realistic computational and storage requirements. Experiments are conducted on the combined use of several standard feature sets under a common state segmentation, multiple-codebook architecture. The feature sets investigated are linear predictive cepstral coefficients, mel-frequency cepstral coefficients and their respective first-order differences. New algorithms which are proposed and evaluated include the weighting of digit scores according to their usefulness to the verification task, and the use of Gaussian state duration probabilities as an additional information source in the verification decision. The most important contribution of this thesis is the development of a method for the construction of discriminating HMMs without the need for discriminative training. This new form of model, known as a discriminating observation probability (DOP) HMM, combines standard HMMs to form a discriminating model. The DOP models are more flexible and perform better than the speaker normalisation techniques which are currently favoured in the literature. DOP models have potential application to many binary classification tasks using HMMs. The equal error rate (EER) using speaker-specific thresholds on a series of 12 isolated digits was 0.17% using multiple-codebook DOP models, compared to 1.93% using single-codebook conventional HMMs. This represents a relative reduction in EER of 91%.
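The EER figures quoted above can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names, the threshold sweep, and the toy scores are assumptions made for illustration, and the abstract does not say how the EER was actually estimated. The sketch shows one common way to locate the operating point where the false acceptance and false rejection rates meet, and then checks the arithmetic behind the 91% relative reduction.

```python
from typing import Sequence


def equal_error_rate(genuine: Sequence[float], impostor: Sequence[float]) -> float:
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumption (not from the thesis): higher scores mean "more likely the
    claimed speaker", and a claim is accepted when score >= threshold.
    """
    best_gap = None
    best_eer = None
    for t in sorted(set(genuine) | set(impostor)):
        # False rejection: a genuine speaker scores below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance: an impostor scores at or above the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            best_eer = (far + frr) / 2
    return best_eer


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative reduction of `improved` with respect to `baseline`, as a fraction."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


# Toy scores, invented purely to exercise the function.
toy_genuine = [0.9, 0.8, 0.7, 0.4]
toy_impostor = [0.6, 0.3, 0.2, 0.1]
toy_eer = equal_error_rate(toy_genuine, toy_impostor)

# EER values quoted in the abstract, in percent.
reduction = relative_reduction(baseline=1.93, improved=0.17)
print(f"Relative EER reduction: {reduction:.1%}")
```

With the two EER values from the abstract, the relative reduction is (1.93 - 0.17) / 1.93, which rounds to the 91% stated above.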

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:651039
Date: January 1995
Creators: Forsyth, Mark Eric
Publisher: University of Edinburgh
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://hdl.handle.net/1842/10903