Acoustic modeling in state-of-the-art speech recognition systems is commonly based on discriminative criteria. Unlike conventional distribution estimation methods such as maximum a posteriori (MAP) and maximum likelihood (ML), the most popular discriminative criteria, such as minimum classification error (MCE) and minimum phone error (MPE), aim at direct minimization of the empirical error rate. As automatic speech recognition (ASR) applications become more diverse, it is increasingly recognized that realistic applications often require a model optimized for a task-specific goal or a particular scenario, beyond the general purpose of the current discriminative criteria. The current criteria cannot handle these specific requirements directly, since their objective is to minimize the overall empirical error rate.
In this thesis, we propose novel objective-driven discriminative training and adaptation frameworks, generalized from the minimum classification error (MCE) criterion, for various tasks and scenarios of speech recognition and detection. The proposed frameworks formulate new discriminative criteria that satisfy various requirements of recent ASR applications. Each objective required by an application or a developer is embedded directly into the learning criterion, and the resulting objective-driven discriminative criterion is then used to optimize an acoustic model to achieve that objective.
Three task-specific requirements that recent ASR applications often face in practice are taken into account in developing the objective-driven discriminative criteria. First, we address individual error minimization in speech recognition and propose a direct minimization algorithm for each error type. Second, a rapid adaptation scenario is embedded into the formulation of discriminative linear transforms under the MCE criterion; a regularized MCE criterion is proposed to efficiently improve the generalization capability of the MCE estimate in this scenario. Finally, we discuss the operating scenario that requires a system model optimized at a given specific operating point, in contrast to conventional receiver operating characteristic (ROC) optimization, and propose a constrained discriminative training algorithm that can directly optimize a system model for any particular operating need. For each of the developed algorithms, we provide an analytical solution and an appropriate optimization procedure.
Identifier | oai:union.ndltd.org:GATECH/oai:smartech.gatech.edu:1853/50255 |
Date | 13 January 2014 |
Creators | Shin, Sung-Hwan |
Contributors | Juang, Biing-Hwang |
Publisher | Georgia Institute of Technology |
Source Sets | Georgia Tech Electronic Thesis and Dissertation Archive |
Language | en_US |
Detected Language | English |
Type | Dissertation |
Format | application/pdf |