In Natural Language Processing (NLP), speech and text are parsed and generated with language models and parser models, and translated with translation models. Each model contains a set of numerical parameters which are found by applying a suitable training algorithm to a set of training data.
Many such training algorithms are instances of the Expectation-Maximization (EM) algorithm. In [BSV15], a generic EM algorithm for NLP is described. This work presents a particular speech model, the Hidden Markov model, and its standard training algorithm, the Baum-Welch algorithm. It is then shown that the Baum-Welch algorithm is an instance of the generic EM algorithm introduced by [BSV15], from which it follows that all statements about the generic EM algorithm also apply to the Baum-Welch algorithm, in particular those concerning its correctness and convergence properties.
1 Introduction
1.1 N-gram models
1.2 Hidden Markov model
2 Expectation-maximization algorithms
2.1 Preliminaries
2.2 Algorithmic skeleton
2.3 Corpus-based step mapping
2.4 Simple counting step mapping
2.5 Regular tree grammars
2.6 Inside-outside step mapping
2.7 Review
3 The Hidden Markov model
3.1 Forward and backward algorithms
3.2 The Baum-Welch algorithm
3.3 Deriving the Baum-Welch algorithm
3.3.1 Model parameter and countable events
3.3.2 Tree-shaped hidden information
3.3.3 Complete-data corpus
3.3.4 Inside weights
3.3.5 Outside weights
3.3.6 Complete-data corpus (cont.)
3.3.7 Step mapping
3.4 Review
Appendix
A Elided proofs from Chapter 3
A.1 Proof of Lemma 3.8
A.2 Proof of Lemma 3.9
B Formulary for Chapter 3
Bibliography
Identifier | oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:29382 |
Date | 22 August 2017 |
Creators | Majewsky, Stefan |
Contributors | Gebhardt, Kilian; Vogler, Heiko; Borchmann, Daniel; Technische Universität Dresden |
Source Sets | Hochschulschriftenserver (HSSS) der SLUB Dresden |
Language | English |
Detected Language | English |
Type | doc-type:bachelorThesis, info:eu-repo/semantics/bachelorThesis, doc-type:Text |
Rights | info:eu-repo/semantics/openAccess |