Return to search

A Comparative Analysis of Bayesian Nonparametric Variational Inference Algorithms for Speech Recognition

Nonparametric Bayesian models have become increasingly popular in speech recognition tasks such as language and acoustic modeling due to their ability to discover underlying structure in an iterative manner. These methods do not require a priori assumptions about the structure of the data, such as the number of mixture components, and can learn this structure directly. Dirichlet process mixtures (DPMs) are a widely used nonparametric Bayesian method which can be used as priors to determine an optimal number of mixture components and their respective weights in a Gaussian mixture model (GMM). Because DPMs potentially require an infinite number of parameters, inference algorithms are needed to make posterior calculations tractable. The focus of this work is an evaluation of three of these Bayesian variational inference algorithms which have only recently become computationally viable: Accelerated Variational Dirichlet Process Mixtures (AVDPM), Collapsed Variational Stick Breaking (CVSB), and Collapsed Dirichlet Priors (CDP). To eliminate other effects on performance such as language models, a phoneme classification task is chosen to more clearly assess the viability of these algorithms for acoustic modeling. Evaluations were conducted on the CALLHOME English and Mandarin corpora, consisting of two languages that, from a human perspective, are phonologically very different. It is shown in this work that these inference algorithms yield error rates comparable to a baseline Gaussian mixture model (GMM) but with a factor of up to 20 fewer mixture components. AVDPM is shown to be the most attractive choice because it delivers the most compact models and is computationally efficient, enabling its application to big data problems. / Electrical and Computer Engineering

Identiferoai:union.ndltd.org:TEMPLE/oai:scholarshare.temple.edu:20.500.12613/2459
Date January 2013
CreatorsSteinberg, John
ContributorsPicone, Joseph, Obeid, Iyad, 1975-, Won, Chang-Hee, 1967-, Yates, Alexander, Sobel, Marc J.
PublisherTemple University. Libraries
Source SetsTemple University
LanguageEnglish
Detected LanguageEnglish
TypeThesis/Dissertation, Text
Format68 pages
RightsIN COPYRIGHT- This Rights Statement can be used for an Item that is in copyright. Using this statement implies that the organization making this Item available has determined that the Item is in copyright and either is the rights-holder, has obtained permission from the rights-holder(s) to make their Work(s) available, or makes the Item available under an exception or limitation to copyright (including Fair Use) that entitles it to make the Item available., http://rightsstatements.org/vocab/InC/1.0/
Relationhttp://dx.doi.org/10.34944/dspace/2441, Theses and Dissertations

Page generated in 0.0014 seconds