Yeung, Yu Ting. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2009. / Includes bibliographical references (leaves 84-93). / Abstracts in English and Chinese. / Acknowledgement --- p.iii / Abstract --- p.iv / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Cantonese Speech Recognition --- p.3 / Chapter 1.2 --- Objectives --- p.4 / Chapter 1.3 --- Thesis Outline --- p.5 / Chapter 2 --- Fundamentals of Large Vocabulary Continuous Speech Recognition --- p.7 / Chapter 2.1 --- Problem Formulation --- p.7 / Chapter 2.2 --- Feature Extraction --- p.8 / Chapter 2.3 --- Acoustic Models --- p.9 / Chapter 2.4 --- Decoding --- p.10 / Chapter 2.5 --- Statistical Language Modeling --- p.12 / Chapter 2.5.1 --- N-gram Language Models --- p.12 / Chapter 2.5.2 --- N-gram Smoothing --- p.13 / Chapter 2.5.3 --- Complexity of Language Model --- p.15 / Chapter 2.5.4 --- Class-based Language Model --- p.16 / Chapter 2.5.5 --- Language Model Pruning --- p.17 / Chapter 2.6 --- Performance Evaluation --- p.18 / Chapter 3 --- The Cantonese Dialect --- p.19 / Chapter 3.1 --- Phonology of Cantonese --- p.19 / Chapter 3.2 --- Orthographic Representation of Cantonese --- p.22 / Chapter 3.3 --- Classification of Cantonese speech --- p.25 / Chapter 3.4 --- Cantonese-English Code-mixing --- p.27 / Chapter 4 --- Rule-based Translation Method --- p.29 / Chapter 4.1 --- Motivations --- p.29 / Chapter 4.2 --- Transformation-based Learning --- p.30 / Chapter 4.2.1 --- Algorithm Overview --- p.30 / Chapter 4.2.2 --- Learning of Translation Rules --- p.32 / Chapter 4.3 --- Performance Evaluation --- p.35 / Chapter 4.3.1 --- The Learnt Translation Rules --- p.35 / Chapter 4.3.2 --- Evaluation of the Rules --- p.37 / Chapter 4.3.3 --- Analysis of the Rules --- p.37 / Chapter 4.4 --- Preparation of Training Data for Language Modeling --- p.41 / Chapter 4.5 --- Discussion --- p.43 / Chapter 5 --- Language Modeling for Cantonese --- p.44 / Chapter 5.1 --- Training Data --- p.44 / Chapter 5.1.1 --- Text Corpora --- p.44 / Chapter 5.1.2 --- Preparation of Formal Cantonese Text Data --- p.45 / Chapter 5.2 --- Training of Language Models --- p.46 / Chapter 5.2.1 --- Language Models for Standard Chinese --- p.46 / Chapter 5.2.2 --- Language Models for Formal Cantonese --- p.46 / Chapter 5.2.3 --- Language models for Colloquial Cantonese --- p.47 / Chapter 5.3 --- Evaluation of Language Models --- p.48 / Chapter 5.3.1 --- Speech Corpora for Evaluation --- p.48 / Chapter 5.3.2 --- Perplexities of Formal Cantonese Language Models --- p.49 / Chapter 5.3.3 --- Perplexities of Colloquial Cantonese Language Models --- p.51 / Chapter 5.4 --- Speech Recognition Experiments --- p.53 / Chapter 5.4.1 --- Speech Corpora --- p.53 / Chapter 5.4.2 --- Experimental Setup --- p.54 / Chapter 5.4.3 --- Results on Formal Cantonese Models --- p.55 / Chapter 5.4.4 --- Results on Colloquial Cantonese Models --- p.56 / Chapter 5.5 --- Analysis of Results --- p.58 / Chapter 5.6 --- Discussion --- p.59 / Chapter 5.6.1 --- Cantonese Language Modeling --- p.59 / Chapter 5.6.2 --- Interpolated Language Models --- p.59 / Chapter 5.6.3 --- Class-based Language Models --- p.60 / Chapter 6 --- Towards Language Modeling of Code-mixing Speech --- p.61 / Chapter 6.1 --- Data Collection --- p.61 / Chapter 6.1.1 --- Data Collection --- p.62 / Chapter 6.1.2 --- Filtering of Collected Data --- p.63 / Chapter 6.1.3 --- Processing of Collected Data --- p.63 / Chapter 6.2 --- Clustering of Chinese and English Words --- p.64 / Chapter 6.3 --- Language Modeling for Code-mixing Speech --- p.64 / Chapter 6.3.1 --- Language Models from Collected Data --- p.64 / Chapter 6.3.2 --- Class-based Language Models --- p.66 / Chapter 6.3.3 --- Performance Evaluation of Code-mixing Language Models --- p.67 / Chapter 6.4 --- Speech Recognition Experiments with Code-mixing Language Models --- p.69 / Chapter 6.4.1 --- Experimental Setup --- p.69 / Chapter 6.4.2 --- Monolingual Cantonese Recognition --- p.70 / Chapter 6.4.3 --- Code-mixing Speech Recognition --- p.72 / Chapter 6.5 --- Discussion --- p.74 / Chapter 6.5.1 --- Data Collection from the Internet --- p.74 / Chapter 6.5.2 --- Speech Recognition of Code-mixing Speech --- p.75 / Chapter 7 --- Conclusions and Future Work --- p.77 / Chapter 7.1 --- Conclusions --- p.77 / Chapter 7.1.1 --- Rule-based Translation Method --- p.77 / Chapter 7.1.2 --- Cantonese Language Modeling --- p.78 / Chapter 7.1.3 --- Code-mixing Language Modeling --- p.78 / Chapter 7.2 --- Future Works --- p.79 / Chapter 7.2.1 --- Rule-based Translation --- p.79 / Chapter 7.2.2 --- Training data --- p.80 / Chapter 7.2.3 --- Code-mixing speech --- p.80 / Chapter A --- Equation Derivation --- p.82 / Chapter A.1 --- Relationship between Average Mutual Information and Perplexity --- p.82 / Bibliography --- p.83
Identifier | oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_326661 |
Date | January 2009 |
Contributors | Yeung, Yu Ting., Chinese University of Hong Kong Graduate School. Division of Electronic Engineering. |
Source Sets | The Chinese University of Hong Kong |
Language | English, Chinese |
Detected Language | English |
Type | Text, bibliography |
Format | print, xv, 93 leaves : ill. ; 30 cm. |
Rights | Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/) |