Language modeling for speech recognition of spoken Cantonese.

Yeung, Yu Ting.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2009.
Includes bibliographical references (leaves 84-93).
Abstracts in English and Chinese.

Acknowledgement --- p.iii
Abstract --- p.iv
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Cantonese Speech Recognition --- p.3
Chapter 1.2 --- Objectives --- p.4
Chapter 1.3 --- Thesis Outline --- p.5
Chapter 2 --- Fundamentals of Large Vocabulary Continuous Speech Recognition --- p.7
Chapter 2.1 --- Problem Formulation --- p.7
Chapter 2.2 --- Feature Extraction --- p.8
Chapter 2.3 --- Acoustic Models --- p.9
Chapter 2.4 --- Decoding --- p.10
Chapter 2.5 --- Statistical Language Modeling --- p.12
Chapter 2.5.1 --- N-gram Language Models --- p.12
Chapter 2.5.2 --- N-gram Smoothing --- p.13
Chapter 2.5.3 --- Complexity of Language Model --- p.15
Chapter 2.5.4 --- Class-based Language Model --- p.16
Chapter 2.5.5 --- Language Model Pruning --- p.17
Chapter 2.6 --- Performance Evaluation --- p.18
Chapter 3 --- The Cantonese Dialect --- p.19
Chapter 3.1 --- Phonology of Cantonese --- p.19
Chapter 3.2 --- Orthographic Representation of Cantonese --- p.22
Chapter 3.3 --- Classification of Cantonese speech --- p.25
Chapter 3.4 --- Cantonese-English Code-mixing --- p.27
Chapter 4 --- Rule-based Translation Method --- p.29
Chapter 4.1 --- Motivations --- p.29
Chapter 4.2 --- Transformation-based Learning --- p.30
Chapter 4.2.1 --- Algorithm Overview --- p.30
Chapter 4.2.2 --- Learning of Translation Rules --- p.32
Chapter 4.3 --- Performance Evaluation --- p.35
Chapter 4.3.1 --- The Learnt Translation Rules --- p.35
Chapter 4.3.2 --- Evaluation of the Rules --- p.37
Chapter 4.3.3 --- Analysis of the Rules --- p.37
Chapter 4.4 --- Preparation of Training Data for Language Modeling --- p.41
Chapter 4.5 --- Discussion --- p.43
Chapter 5 --- Language Modeling for Cantonese --- p.44
Chapter 5.1 --- Training Data --- p.44
Chapter 5.1.1 --- Text Corpora --- p.44
Chapter 5.1.2 --- Preparation of Formal Cantonese Text Data --- p.45
Chapter 5.2 --- Training of Language Models --- p.46
Chapter 5.2.1 --- Language Models for Standard Chinese --- p.46
Chapter 5.2.2 --- Language Models for Formal Cantonese --- p.46
Chapter 5.2.3 --- Language models for Colloquial Cantonese --- p.47
Chapter 5.3 --- Evaluation of Language Models --- p.48
Chapter 5.3.1 --- Speech Corpora for Evaluation --- p.48
Chapter 5.3.2 --- Perplexities of Formal Cantonese Language Models --- p.49
Chapter 5.3.3 --- Perplexities of Colloquial Cantonese Language Models --- p.51
Chapter 5.4 --- Speech Recognition Experiments --- p.53
Chapter 5.4.1 --- Speech Corpora --- p.53
Chapter 5.4.2 --- Experimental Setup --- p.54
Chapter 5.4.3 --- Results on Formal Cantonese Models --- p.55
Chapter 5.4.4 --- Results on Colloquial Cantonese Models --- p.56
Chapter 5.5 --- Analysis of Results --- p.58
Chapter 5.6 --- Discussion --- p.59
Chapter 5.6.1 --- Cantonese Language Modeling --- p.59
Chapter 5.6.2 --- Interpolated Language Models --- p.59
Chapter 5.6.3 --- Class-based Language Models --- p.60
Chapter 6 --- Towards Language Modeling of Code-mixing Speech --- p.61
Chapter 6.1 --- Data Collection --- p.61
Chapter 6.1.1 --- Data Collection --- p.62
Chapter 6.1.2 --- Filtering of Collected Data --- p.63
Chapter 6.1.3 --- Processing of Collected Data --- p.63
Chapter 6.2 --- Clustering of Chinese and English Words --- p.64
Chapter 6.3 --- Language Modeling for Code-mixing Speech --- p.64
Chapter 6.3.1 --- Language Models from Collected Data --- p.64
Chapter 6.3.2 --- Class-based Language Models --- p.66
Chapter 6.3.3 --- Performance Evaluation of Code-mixing Language Models --- p.67
Chapter 6.4 --- Speech Recognition Experiments with Code-mixing Language Models --- p.69
Chapter 6.4.1 --- Experimental Setup --- p.69
Chapter 6.4.2 --- Monolingual Cantonese Recognition --- p.70
Chapter 6.4.3 --- Code-mixing Speech Recognition --- p.72
Chapter 6.5 --- Discussion --- p.74
Chapter 6.5.1 --- Data Collection from the Internet --- p.74
Chapter 6.5.2 --- Speech Recognition of Code-mixing Speech --- p.75
Chapter 7 --- Conclusions and Future Work --- p.77
Chapter 7.1 --- Conclusions --- p.77
Chapter 7.1.1 --- Rule-based Translation Method --- p.77
Chapter 7.1.2 --- Cantonese Language Modeling --- p.78
Chapter 7.1.3 --- Code-mixing Language Modeling --- p.78
Chapter 7.2 --- Future Works --- p.79
Chapter 7.2.1 --- Rule-based Translation --- p.79
Chapter 7.2.2 --- Training data --- p.80
Chapter 7.2.3 --- Code-mixing speech --- p.80
Chapter A --- Equation Derivation --- p.82
Chapter A.1 --- Relationship between Average Mutual Information and Perplexity --- p.82
Bibliography --- p.83

Identifier: oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_326661
Date: January 2009
Contributors: Yeung, Yu Ting., Chinese University of Hong Kong Graduate School. Division of Electronic Engineering.
Source Sets: The Chinese University of Hong Kong
Language: English, Chinese
Detected Language: English
Type: Text, bibliography
Format: print, xv, 93 leaves : ill. ; 30 cm.
Rights: Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
