
Automatic speech recognition of Cantonese-English code-mixing utterances.

Chan Yeuk Chi Joyce. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. / Includes bibliographical references. / Abstracts in English and Chinese.

Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Background --- p.1 / Chapter 1.2 --- Previous Work on Code-switching Speech Recognition --- p.2 / Chapter 1.2.1 --- Keyword Spotting Approach --- p.3 / Chapter 1.2.2 --- Translation Approach --- p.4 / Chapter 1.2.3 --- Language Boundary Detection --- p.6 / Chapter 1.3 --- Motivations of Our Work --- p.7 / Chapter 1.4 --- Methodology --- p.8 / Chapter 1.5 --- Thesis Outline --- p.10 / Chapter 1.6 --- References --- p.11
Chapter 2 --- Fundamentals of Large Vocabulary Continuous Speech Recognition for Cantonese and English --- p.14 / Chapter 2.1 --- Basic Theory of Speech Recognition --- p.14 / Chapter 2.1.1 --- Feature Extraction --- p.14 / Chapter 2.1.2 --- Maximum a Posteriori (MAP) Probability --- p.15 / Chapter 2.1.3 --- Hidden Markov Model (HMM) --- p.16 / Chapter 2.1.4 --- Statistical Language Modeling --- p.17 / Chapter 2.1.5 --- Search Algorithm --- p.18 / Chapter 2.2 --- Word Posterior Probability (WPP) --- p.19 / Chapter 2.3 --- Generalized Word Posterior Probability (GWPP) --- p.23 / Chapter 2.4 --- Characteristics of Cantonese --- p.24 / Chapter 2.4.1 --- Cantonese Phonology --- p.24 / Chapter 2.4.2 --- Variation and Change in Pronunciation --- p.27 / Chapter 2.4.3 --- Syllables and Characters in Cantonese --- p.28 / Chapter 2.4.4 --- Spoken Cantonese vs. Written Chinese --- p.28 / Chapter 2.5 --- Characteristics of English --- p.30 / Chapter 2.5.1 --- English Phonology --- p.30 / Chapter 2.5.2 --- English with Cantonese Accents --- p.31 / Chapter 2.6 --- References --- p.32
Chapter 3 --- Code-mixing and Code-switching Speech Recognition --- p.35 / Chapter 3.1 --- Introduction --- p.35 / Chapter 3.2 --- Definition --- p.35 / Chapter 3.2.1 --- Monolingual Speech Recognition --- p.35 / Chapter 3.2.2 --- Multilingual Speech Recognition --- p.35 / Chapter 3.2.3 --- Code-mixing and Code-switching --- p.36 / Chapter 3.3 --- Conversation in Hong Kong --- p.38 / Chapter 3.3.1 --- Language Choice of Hong Kong People --- p.38 / Chapter 3.3.2 --- Reasons for Code-mixing in Hong Kong --- p.40 / Chapter 3.3.3 --- How Does Code-mixing Occur? --- p.41 / Chapter 3.4 --- Difficulties for Code-mixing - Specific to Cantonese-English --- p.44 / Chapter 3.4.1 --- Phonetic Differences --- p.45 / Chapter 3.4.2 --- Phonology difference --- p.48 / Chapter 3.4.3 --- Accent and Borrowing --- p.49 / Chapter 3.4.4 --- Lexicon and Grammar --- p.49 / Chapter 3.4.5 --- Lack of Appropriate Speech Corpus --- p.50 / Chapter 3.5 --- References --- p.50
Chapter 4 --- Data Collection --- p.53 / Chapter 4.1 --- Data Collection --- p.53 / Chapter 4.1.1 --- Corpus Design --- p.53 / Chapter 4.1.2 --- Recording Setup --- p.59 / Chapter 4.1.3 --- Post-processing of Speech Data --- p.60 / Chapter 4.2 --- A Baseline Database --- p.61 / Chapter 4.2.1 --- Monolingual Spoken Cantonese Speech Data (CUMIX) --- p.61 / Chapter 4.3 --- References --- p.61
Chapter 5 --- System Design and Experimental Setup --- p.63 / Chapter 5.1 --- Overview of the Code-mixing Speech Recognizer --- p.63 / Chapter 5.1.1 --- Bilingual Syllable / Word-based Speech Recognizer --- p.63 / Chapter 5.1.2 --- Language Boundary Detection --- p.64 / Chapter 5.1.3 --- Generalized Word Posterior Probability (GWPP) --- p.65 / Chapter 5.2 --- Acoustic Modeling --- p.66 / Chapter 5.2.1 --- Speech Corpus for Training of Acoustic Models --- p.67 / Chapter 5.2.2 --- Features Extraction --- p.69 / Chapter 5.2.3 --- Variability in the Speech Signal --- p.69 / Chapter 5.2.4 --- Language Dependency of the Acoustic Models --- p.71 / Chapter 5.2.5 --- Pronunciation Dictionary --- p.80 / Chapter 5.2.6 --- The Training Process of Acoustic Models --- p.83 / Chapter 5.2.7 --- Decoding and Evaluation --- p.88 / Chapter 5.3 --- Language Modeling --- p.90 / Chapter 5.3.1 --- N-gram Language Model --- p.91 / Chapter 5.3.2 --- Difficulties in Data Collection --- p.91 / Chapter 5.3.3 --- Text Data for Training Language Model --- p.92 / Chapter 5.3.4 --- Training Tools --- p.95 / Chapter 5.3.5 --- Training Procedure --- p.95 / Chapter 5.3.6 --- Evaluation of the Language Models --- p.98 / Chapter 5.4 --- Language Boundary Detection --- p.99 / Chapter 5.4.1 --- Phone-based LBD --- p.100 / Chapter 5.4.2 --- Syllable-based LBD --- p.104 / Chapter 5.4.3 --- LBD Based on Syllable Lattice --- p.106 / Chapter 5.5 --- Integration of the Acoustic Model Scores, Language Model Scores and Language Boundary Information --- p.107 / Chapter 5.5.1 --- Integration of Acoustic Model Scores and Language Boundary Information --- p.107 / Chapter 5.5.2 --- Integration of Modified Acoustic Model Scores and Language Model Scores --- p.109 / Chapter 5.5.3 --- Evaluation Criterion --- p.111 / Chapter 5.6 --- References --- p.112
Chapter 6 --- Results and Analysis --- p.118 / Chapter 6.1 --- Speech Data for Development and Evaluation --- p.118 / Chapter 6.1.1 --- Development Data --- p.118 / Chapter 6.1.2 --- Testing Data --- p.118 / Chapter 6.2 --- Performance of Different Acoustic Units --- p.119 / Chapter 6.2.1 --- Analysis of Results --- p.120 / Chapter 6.3 --- Language Boundary Detection --- p.122 / Chapter 6.3.1 --- Phone-based Language Boundary Detection --- p.123 / Chapter 6.3.2 --- Syllable-based Language Boundary Detection (SYL LB) --- p.127 / Chapter 6.3.3 --- Language Boundary Detection Based on Syllable Lattice (BILINGUAL LBD) --- p.129 / Chapter 6.3.4 --- Observations --- p.129 / Chapter 6.4 --- Evaluation of the Language Models --- p.130 / Chapter 6.4.1 --- Character Perplexity --- p.130 / Chapter 6.4.2 --- Phonetic-to-text Conversion Rate --- p.131 / Chapter 6.4.3 --- Observations --- p.131 / Chapter 6.5 --- Character Error Rate --- p.132 / Chapter 6.5.1 --- Without Language Boundary Information --- p.133 / Chapter 6.5.2 --- With Language Boundary Detector SYL LBD --- p.134 / Chapter 6.5.3 --- With Language Boundary Detector BILINGUAL-LBD --- p.136 / Chapter 6.5.4 --- Observations --- p.138 / Chapter 6.6 --- References --- p.141
Chapter 7 --- Conclusions and Suggestions for Future Work --- p.143 / Chapter 7.1 --- Conclusion --- p.143 / Chapter 7.1.1 --- Difficulties and Solutions --- p.144 / Chapter 7.2 --- Suggestions for Future Work --- p.149 / Chapter 7.2.1 --- Acoustic Modeling --- p.149 / Chapter 7.2.2 --- Pronunciation Modeling --- p.149 / Chapter 7.2.3 --- Language Modeling --- p.150 / Chapter 7.2.4 --- Speech Data --- p.150 / Chapter 7.2.5 --- Language Boundary Detection --- p.151 / Chapter 7.3 --- References --- p.151
Appendix A --- Code-mixing Utterances in Training Set of CUMIX --- p.152
Appendix B --- Code-mixing Utterances in Testing Set of CUMIX --- p.175
Appendix C --- Usage of Speech Data in CUMIX --- p.202

Identifier: oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_325133
Date: January 2005
Contributors: Chan, Yeuk Chi Joyce., Chinese University of Hong Kong Graduate School. Division of Electronic Engineering.
Source Sets: The Chinese University of Hong Kong
Language: English, Chinese
Detected Language: English
Type: Text, bibliography
Format: print, xv, 204 leaves : ill. ; 30 cm.
Coverage: China, Hong Kong
Rights: Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
