1

Large vocabulary continuous speech recognition for Cantonese. / 粤語的大詞彙、連續語音識別系統 / Large vocabulary continuous speech recognition for Cantonese. / Yue yu de da ci hui, lian xu yu yin shi bie xi tong

January 2000 (has links)
Wong Yiu Wing = 粤語的大詞彙、連續語音識別系統 / 黃耀榮. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. / Includes bibliographical references. / Text in English; abstracts in English and Chinese. / Wong Yiu Wing = Yue yu de da ci hui, lian xu yu yin shi bie xi tong / Huang Yaorong. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Progress of Large Vocabulary Continuous Speech Recognition for Chinese --- p.2 / Chapter 1.2 --- Objectives of the Thesis --- p.5 / Chapter 1.3 --- Thesis Outline --- p.6 / Reference --- p.7 / Chapter 2 --- Fundamentals of Large Vocabulary Continuous Speech Recognition for Cantonese --- p.9 / Chapter 2.1 --- Characteristics of Cantonese --- p.9 / Chapter 2.1.1 --- Cantonese Phonology --- p.9 / Chapter 2.1.2 --- Written Cantonese versus Spoken Cantonese --- p.12 / Chapter 2.2 --- Techniques for Large Vocabulary Continuous Speech Recognition --- p.13 / Chapter 2.2.1 --- Feature Representation of the Speech Signal --- p.14 / Chapter 2.2.2 --- Hidden Markov Model for Acoustic Modeling --- p.15 / Chapter 2.2.3 --- Search Algorithm --- p.17 / Chapter 2.2.4 --- Statistical Language Modeling --- p.18 / Chapter 2.3 --- Discussions --- p.19 / Reference --- p.20 / Chapter 3 --- Acoustic Modeling for Cantonese --- p.21 / Chapter 3.1 --- The Speech Database --- p.21 / Chapter 3.2 --- Context-Dependent Acoustic Modeling --- p.22 / Chapter 3.2.1 --- Context-Independent Initial / Final Models --- p.23 / Chapter 3.2.2 --- Construction of Context-Dependent Tri-IF Models from Context-Independent IF Models --- p.26 / Chapter 3.2.3 --- Data Sharing in Acoustic Modeling --- p.27 / Chapter 1. --- Sparse Data Problem --- p.27 / Chapter 2. --- Decision-Tree Based State Clustering --- p.28 / Chapter 3.3 --- Experimental Results --- p.31 / Chapter 3.4 --- Error Analysis and Discussions --- p.33 / Chapter 3.4.1 --- Recognition Accuracy vs. Model Complexity --- p.33 / Chapter 3.4.2 --- Initial / Final Confusion Matrices --- p.34 / Chapter 3.4.3 --- Analysis of Phonetic Trees --- p.39 / Chapter 3.4.4 --- The NULL Initial HMM --- p.42 / Chapter 3.4.5 --- Comments on the CUSENT Speech Corpus --- p.42 / References --- p.44 / Chapter 4 --- Language Modeling for Cantonese --- p.46 / Chapter 4.1 --- N-gram Language Model --- p.46 / Chapter 4.1.1 --- Problems in Building an N-gram Language Model --- p.47 / Chapter 1. --- The Zero-Probability Problem and Backoff N-gram --- p.48 / Chapter 4.1.2 --- Perplexity of a Language Model --- p.49 / Chapter 4.2 --- N-gram Modeling in Cantonese --- p.50 / Chapter 4.2.1 --- The Vocabulary and Word Segmentation --- p.50 / Chapter 4.2.2 --- Evaluation of Chinese Language Models --- p.53 / Chapter 4.3 --- Character-Level versus Word-Level Language Models --- p.54 / Chapter 4.4 --- Language Modeling in a Specific Domain --- p.57 / Chapter 4.4.1 --- Language Model Adaptation to the Financial Domain --- p.57 / Chapter 1. --- Vocabulary Refinement --- p.57 / Chapter 2. --- The Seed Financial Bigram --- p.58 / Chapter 3. --- Linear Interpolation of Two Bigram Models --- p.59 / Chapter 4. --- Performance of the Interpolated Language Model --- p.60 / Chapter 4.5 --- Error Analysis and Discussions --- p.61 / References --- p.63 / Chapter 5 --- Integration of Acoustic Model and Language Model --- p.65 / Chapter 5.1 --- One-Pass Search versus Multi-Pass Search --- p.66 / Chapter 5.2 --- A Two-Pass Decoder for Chinese LVCSR --- p.68 / Chapter 5.2.1 --- The First Pass Search --- p.69 / Chapter 5.2.2 --- The Second Pass Search --- p.72 / Chapter 5.3 --- Experimental Results --- p.73 / Chapter 5.4 --- Error Analysis and Discussions --- p.75 / Chapter 5.4.1 --- Vocabulary and Search --- p.75 / Chapter 5.4.2 --- Expansion of the Syllable Lattice --- p.76 / Chapter 5.4.3 --- Perplexity and Recognition Accuracy --- p.78 / Reference --- p.80 / Chapter 6 --- Conclusions and Suggestions for Future Work --- p.82 / Chapter 6.1 --- Conclusions --- p.82 / Chapter 6.2 --- Suggestions for Future Work --- p.84 / Chapter 1. --- Speaker Adaptation --- p.84 / Chapter 2. --- Tone Recognition --- p.84 / Reference --- p.85 / Appendix I Base Syllable Table --- p.86 / Appendix II Phonetic Question Set --- p.87
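The language-modeling chapters in this record follow a standard recipe: a backoff bigram, perplexity as the evaluation measure, and adaptation to the financial domain by linear interpolation of a general bigram with a seed financial bigram. The sketch below illustrates that recipe; the toy corpora, the interpolation weight of 0.7, and the probability floor used in place of a full backoff scheme are illustrative assumptions, not details taken from the thesis.

```python
import math
from collections import Counter

def bigram_mle(tokens):
    """Maximum-likelihood bigram estimated from a token list, with a small
    probability floor standing in for a proper backoff scheme (sketch only)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens[:-1], tokens[1:]))
    def prob(w_prev, w):
        if bigrams[(w_prev, w)] == 0 or unigrams[w_prev] == 0:
            return 1e-6
        return bigrams[(w_prev, w)] / unigrams[w_prev]
    return prob

def interpolate(p_general, p_domain, lam):
    """Linear interpolation of two bigram models: lam*general + (1-lam)*domain."""
    return lambda w_prev, w: lam * p_general(w_prev, w) + (1 - lam) * p_domain(w_prev, w)

def perplexity(prob, tokens):
    """Perplexity = exp of the average negative log probability of the test text."""
    logs = [math.log(prob(a, b)) for a, b in zip(tokens[:-1], tokens[1:])]
    return math.exp(-sum(logs) / len(logs))

# Hypothetical toy corpora standing in for the general and financial training text.
general = "恒生 指數 今日 上升 一百 點 成交 活躍".split()
finance = "恒生 指數 收市 上升 成交 金額 增加".split()
p_mix = interpolate(bigram_mle(general), bigram_mle(finance), lam=0.7)
print(perplexity(p_mix, "恒生 指數 上升 成交 增加".split()))
```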
2

Verbal information verification for high-performance speaker authentication.

January 2005 (has links)
Qin Chao. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. / Includes bibliographical references (leaves 77-82). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Overview of Speaker Authentication --- p.1 / Chapter 1.2 --- Goals of this Research --- p.6 / Chapter 1.3 --- Thesis Outline --- p.7 / Chapter 2 --- Speaker Verification --- p.8 / Chapter 2.1 --- Introduction --- p.8 / Chapter 2.2 --- Front-End Processing --- p.9 / Chapter 2.2.1 --- Acoustic Feature Extraction --- p.10 / Chapter 2.2.2 --- Endpoint Detection --- p.12 / Chapter 2.3 --- Speaker Modeling --- p.12 / Chapter 2.3.1 --- Likelihood Ratio Test for Speaker Verification --- p.13 / Chapter 2.3.2 --- Gaussian Mixture Models --- p.15 / Chapter 2.3.3 --- UBM Adaptation --- p.16 / Chapter 2.4 --- Experiments on Cantonese Speaker Verification --- p.18 / Chapter 2.4.1 --- Speech Databases --- p.19 / Chapter 2.4.2 --- Effect of Endpoint Detection --- p.21 / Chapter 2.4.3 --- Comparison of the UBM Adaptation and the Cohort Method --- p.22 / Chapter 2.4.4 --- Discussions --- p.25 / Chapter 2.5 --- Summary --- p.26 / Chapter 3 --- Verbal Information Verification --- p.28 / Chapter 3.1 --- Introduction --- p.28 / Chapter 3.2 --- Utterance Verification for VIV --- p.29 / Chapter 3.2.1 --- Forced Alignment --- p.30 / Chapter 3.2.2 --- Subword Hypothesis Test --- p.30 / Chapter 3.2.3 --- Confidence Measure --- p.31 / Chapter 3.3 --- Sequential Utterance Verification for VIV --- p.34 / Chapter 3.3.1 --- Practical Security Consideration --- p.34 / Chapter 3.3.2 --- Robust Interval --- p.34 / Chapter 3.4 --- Application and Further Improvement --- p.36 / Chapter 3.5 --- Summary --- p.36 / Chapter 4 --- Model Design for Cantonese Verbal Information Verification --- p.37 / Chapter 4.1 --- General Considerations --- p.37 / Chapter 4.2 --- The Cantonese Dialect --- p.37 / Chapter 4.3 --- Target Model Design --- p.38 / Chapter 4.4 --- Anti-Model Design --- p.38 / Chapter 4.4.1 --- Role of Normalization Techniques --- p.38 / Chapter 4.4.2 --- Context-dependent versus Context-independent Antimodels --- p.40 / Chapter 4.4.3 --- General Approach to CI Anti-modeling --- p.40 / Chapter 4.4.4 --- Sub-syllable Clustering --- p.41 / Chapter 4.4.5 --- Cohort and World Anti-models --- p.42 / Chapter 4.4.6 --- GMM-based Anti-models --- p.44 / Chapter 4.5 --- Simulation Results and Discussions --- p.45 / Chapter 4.5.1 --- Speech Databases --- p.45 / Chapter 4.5.2 --- Effect of Model Complexity --- p.46 / Chapter 4.5.3 --- Comparisons among different Anti-models --- p.47 / Chapter 4.5.4 --- Discussions --- p.48 / Chapter 4.6 --- Summary --- p.49 / Chapter 5 --- Integration of SV and VIV --- p.50 / Chapter 5.1 --- Introduction --- p.50 / Chapter 5.2 --- Voting Method --- p.53 / Chapter 5.2.1 --- Permissive Test vs. Restrictive Test --- p.54 / Chapter 5.2.2 --- Shared vs. 
Speaker-specific Thresholds --- p.55 / Chapter 5.3 --- Support Vector Machines --- p.56 / Chapter 5.4 --- Gaussian-based Classifier --- p.59 / Chapter 5.5 --- Simulation Results and Discussions --- p.60 / Chapter 5.5.1 --- Voting Method --- p.60 / Chapter 5.5.2 --- Support Vector Machines --- p.63 / Chapter 5.5.3 --- Gaussian-based Classifier --- p.64 / Chapter 5.5.4 --- Discussions --- p.66 / Chapter 5.6 --- Summary --- p.67 / Chapter 6 --- Conclusions and Suggested Future Works --- p.68 / Chapter 6.1 --- Conclusions --- p.68 / Chapter 6.2 --- Summary of Findings and Contributions of This Thesis --- p.70 / Chapter 6.3 --- Future Perspective --- p.71 / Chapter 6.3.1 --- Integration of Keyword Spotting into VIV --- p.71 / Chapter 6.3.2 --- Integration of Prosodic Information --- p.71 / Appendices --- p.73 / Chapter A --- A Cantonese VIV Demonstration System --- p.73 / Bibliography --- p.77
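The speaker-verification chapters in this record centre on the likelihood ratio test with a Gaussian mixture target model and a universal background model (UBM). A textbook formulation of that test is sketched below for orientation; it is not a formula reproduced from the thesis, and the threshold θ is assumed to be tuned on development data.

```latex
% Log-likelihood ratio for an utterance X = {x_1, ..., x_T}:
\[
\Lambda(X) \;=\; \frac{1}{T}\sum_{t=1}^{T}\Big[\log p(x_t \mid \lambda_{\text{target}})
\;-\; \log p(x_t \mid \lambda_{\text{UBM}})\Big],
\qquad \text{accept if } \Lambda(X) \ge \theta,\ \text{reject otherwise,}
\]
% where each model is a Gaussian mixture density:
\[
p(x \mid \lambda) \;=\; \sum_{m=1}^{M} w_m\, \mathcal{N}(x;\, \mu_m, \Sigma_m).
\]
```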
3

Automatic recognition of continuous Cantonese speech. / CUHK electronic theses & dissertations collection

January 1997 (has links)
Alfred Ying-Pang Ng. / Thesis (Ph.D.)--Chinese University of Hong Kong, 1997. / Includes bibliographical references (p. 159-169). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Mode of access: World Wide Web.
4

Efficient algorithms for speech recognition of Cantonese.

January 1987 (has links)
by Lai Wai Ming. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1987. / Bibliography: leaves 81-85.
5

Cantonese text-to-speech synthesis using sub-syllable units. / 利用子音節的粤語文語轉換系統 / Cantonese text-to-speech synthesis using sub-syllable units. / Li yong zi yin jie de Yue yu wen yu zhuan huan xi tong

January 2001 (has links)
Law Ka Man = 利用子音節的粤語文語轉換系統 / 羅家文. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2001. / Includes bibliographical references. / Text in English; abstracts in English and Chinese. / Law Ka Man = Li yong zi yin jie de Yue yu wen yu zhuan huan xi tong / Luo Jiawen. / Chapter 1. --- INTRODUCTION --- p.1 / Chapter 1.1 --- Text analysis --- p.2 / Chapter 1.2 --- Prosody prediction --- p.3 / Chapter 1.3 --- Speech generation --- p.3 / Chapter 1.4 --- The trend of TTS technology --- p.5 / Chapter 1.5 --- TTS systems for different languages --- p.6 / Chapter 1.6 --- Objectives of the thesis --- p.8 / Chapter 1.7 --- Thesis outline --- p.8 / References --- p.10 / Chapter 2. --- BACKGROUND --- p.11 / Chapter 2.1 --- Cantonese phonology --- p.11 / Chapter 2.2 --- Cantonese TTS - a baseline system --- p.16 / Chapter 2.3 --- Time-Domain Pitch-Synchronous-OverLap-Add --- p.17 / Chapter 2.3.1 --- From speech signal to short-time analysis signals --- p.18 / Chapter 2.3.2 --- From short-time analysis signals to short-time synthesis signals --- p.19 / Chapter 2.3.3 --- From short-time synthesis signals to synthetic speech --- p.20 / Chapter 2.4 --- Time-scale and Pitch-scale modifications --- p.20 / Chapter 2.4.1 --- Voiced speech --- p.20 / Chapter 2.4.2 --- Unvoiced speech --- p.21 / Chapter 2.5 --- Summary --- p.22 / References --- p.23 / Chapter 3. --- SUB-SYLLABLE BASED TTS SYSTEM --- p.24 / Chapter 3.1 --- Motivations --- p.24 / Chapter 3.2 --- Choices of synthesis units --- p.27 / Chapter 3.2.1 --- Sub-syllable unit --- p.29 / Chapter 3.2.2 --- Diphones, demi-syllables and sub-syllable units --- p.31 / Chapter 3.3 --- Proposed TTS system --- p.32 / Chapter 3.3.1 --- Text analysis module --- p.33 / Chapter 3.3.2 --- Synthesis module --- p.36 / Chapter 3.3.3 --- Prosody module --- p.37 / Chapter 3.4 --- Summary --- p.38 / References --- p.39 / Chapter 4. --- ACOUSTIC INVENTORY --- p.40 / Chapter 4.1 --- The full set of Cantonese sub-syllable units --- p.40 / Chapter 4.2 --- A reduced set of sub-syllable units --- p.42 / Chapter 4.3 --- Corpus design --- p.44 / Chapter 4.4 --- Recording --- p.46 / Chapter 4.5 --- Post-processing of speech data --- p.47 / Chapter 4.6 --- Summary --- p.51 / References --- p.51 / Chapter 5. --- CONCATENATION TECHNIQUES --- p.52 / Chapter 5.1 --- Concatenation of sub-syllable units --- p.52 / Chapter 5.1.1 --- Concatenation of plosives and affricates --- p.54 / Chapter 5.1.2 --- Concatenation of fricatives --- p.55 / Chapter 5.1.3 --- Concatenation of vowels, semi-vowels and nasals --- p.55 / Chapter 5.1.4 --- Spectral distance measure --- p.57 / Chapter 5.2 --- Waveform concatenation method --- p.58 / Chapter 5.3 --- Selected examples of waveform concatenation --- p.59 / Chapter 5.3.1 --- I-I concatenation --- p.60 / Chapter 5.3.2 --- F-F concatenation --- p.66 / Chapter 5.4 --- Summary --- p.71 / References --- p.72 / Chapter 6. --- PERFORMANCE EVALUATION --- p.73 / Chapter 6.1 --- Listening test --- p.73 / Chapter 6.2 --- Test results --- p.74 / Chapter 6.3 --- Discussions --- p.75 / References --- p.78 / Chapter 7. --- CONCLUSIONS & FUTURE WORKS --- p.79 / Chapter 7.1 --- Conclusions --- p.79 / Chapter 7.2 --- Suggested future work --- p.81 / APPENDIX 1 SYLLABLE DURATION --- p.82 / APPENDIX 2 PERCEPTUAL TEST PARAGRAPHS --- p.86
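Chapter 2 of this record walks through TD-PSOLA: the signal is windowed into pitch-synchronous short-time analysis signals, these are re-spaced into short-time synthesis signals, and the result is overlap-added into synthetic speech, with time-scale and pitch-scale modification handled differently for voiced and unvoiced speech. A minimal numpy sketch of the voiced pitch-scale step is given below; the function name, the pulse-train test signal and the scale factor are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def td_psola_pitch(x, marks, alpha):
    """Minimal TD-PSOLA-style pitch-scale sketch (illustrative only).
    x     : speech samples as a 1-D numpy array
    marks : pitch-mark sample indices, one per pitch period
    alpha : pitch-scale factor; >1 packs synthesis marks closer together (higher F0)
    Analysis: two-period Hanning-windowed segments centred on each pitch mark.
    Synthesis: overlap-add the segments at pitch marks spaced period/alpha apart."""
    y = np.zeros(int(len(x) / min(alpha, 1.0)) + len(x) // 4 + 1)
    pos = float(marks[0])                      # first synthesis mark
    for i in range(1, len(marks) - 1):
        period = int(marks[i + 1] - marks[i])  # local pitch period in samples
        lo_a, hi_a = marks[i] - period, marks[i] + period
        if lo_a < 0 or hi_a > len(x):
            continue
        seg = x[lo_a:hi_a] * np.hanning(hi_a - lo_a)   # short-time analysis signal
        c = int(round(pos))
        lo_s, hi_s = c - period, c + period
        if lo_s >= 0 and hi_s <= len(y):
            y[lo_s:hi_s] += seg                # overlap-add at the synthesis mark
        pos += period / alpha                  # re-spaced synthesis marks
    return y

# Toy usage: a 16 kHz pulse train with a 5 ms period, pitch raised by 20 %.
fs = 16000
marks = np.arange(80, fs, 80)
x = np.zeros(fs)
x[marks] = 1.0
y = td_psola_pitch(x, marks, alpha=1.2)
```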
6

HMM based connected speech recognition system for Cantonese =: 建基於隱馬爾可夫模型的粤語連續語音識別系統. / 建基於隱馬爾可夫模型的粤語連續語音識別系統 / An HMM based connected speech recognition system for Cantonese =: Jian ji yu Yin Ma'erkefu mo xing de Yue yu lian xu yu yin shi bie xi tong. / Jian ji yu Yin Ma'erkefu mo xing de Yue yu lian xu yu yin shi bie xi tong

January 1998 (has links)
by Chow Ka Fai. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1998. / Includes bibliographical references (leaves [124-132]). / Text in English; abstract also in Chinese. / by Chow Ka Fai. / Chapter 1 --- INTRODUCTION --- p.1 / Chapter 1.1 --- Speech Recognition Technology --- p.4 / Chapter 1.2 --- Automatic Recognition of Cantonese Speech --- p.6 / Chapter 1.3 --- Objectives of the thesis --- p.8 / Chapter 1.4 --- Thesis Outline --- p.11 / Chapter 2 --- FUNDAMENTALS OF HMM BASED RECOGNITION SYSTEM --- p.13 / Chapter 2.1 --- Introduction --- p.13 / Chapter 2.2 --- HMM Fundamentals --- p.13 / Chapter 2.2.1 --- HMM Structure and Behavior --- p.13 / Chapter 2.2.2 --- HMM-based Speech Modeling --- p.15 / Chapter 2.2.3 --- Mathematics --- p.18 / Chapter 2.3 --- HMM Based Speech Recognition System --- p.22 / Chapter 2.3.1 --- Isolated Speech Recognition --- p.23 / Chapter 2.3.2 --- Connected Speech Recognition --- p.25 / Chapter 2.4 --- Algorithms for Finding Hidden State Sequence --- p.28 / Chapter 2.4.1 --- Forward-backward algorithm --- p.29 / Chapter 2.4.2 --- Viterbi Decoder Algorithm --- p.31 / Chapter 2.5 --- Parameter Estimation --- p.32 / Chapter 2.5.1 --- Basic Ideas for Estimation --- p.32 / Chapter 2.5.2 --- Single Model Re-estimation Using Best State-Time Alignment (HINIT) --- p.36 / Chapter 2.5.3 --- Single Model Re-estimation Using Baum-Welch Method (HREST) --- p.39 / Chapter 2.5.4 --- HMM Embedded Re-estimation (HEREST) --- p.41 / Chapter 2.6 --- Feature Extraction --- p.42 / Chapter 2.7 --- Summary --- p.47 / Chapter 3 --- CANTONESE PHONOLOGY AND LANGUAGE PROPERTIES --- p.48 / Chapter 3.1 --- Introduction --- p.48 / Chapter 3.2 --- Cantonese and Chinese Language --- p.48 / Chapter 3.2.1 --- Chinese Words and Characters --- p.48 / Chapter 3.2.2 --- The Relationship between Cantonese and Chinese Characters --- p.50 / Chapter 3.3 --- Basic Syllable structure --- p.51 / Chapter 3.3.1 --- CVC structure --- p.51 / Chapter 3.3.2 --- Cantonese Phonemes --- p.52 / Chapter 3.3.3 --- The Initial-Final structure --- p.55 / Chapter 3.3.4 --- Cantonese Nine Tone System --- p.57 / Chapter 3.4 --- Acoustic Properties of Cantonese --- p.58 / Chapter 3.5 --- Cantonese Phonology for Speech Recognition --- p.60 / Chapter 3.6 --- Summary --- p.62 / Chapter 4 --- CANTONESE SPEECH DATABASES --- p.64 / Chapter 4.1 --- Introduction --- p.64 / Chapter 4.2 --- The Importance of Speech Data --- p.64 / Chapter 4.3 --- The Demands of Cantonese Speech Databases --- p.67 / Chapter 4.4 --- Principles in Cantonese Database Development --- p.67 / Chapter 4.5 --- Resources and Limitations for Database Designs --- p.69 / Chapter 4.6 --- Details of Speech Databases --- p.69 / Chapter 4.6.1 --- Multiple speakers' Speech Database (CUWORD) --- p.70 / Chapter 4.6.2 --- Single Speaker's Speech Database (MYVOICE) --- p.72 / Chapter 4.7 --- Difficulties and Solutions in Recording Process --- p.76 / Chapter 4.8 --- Verification of Phonetic Transcription --- p.78 / Chapter 4.9 --- Summary --- p.79 / Chapter 5 --- TRAINING OF AN HMM BASED CANTONESE SPEECH RECOGNITION SYSTEM --- p.80 / Chapter 5.1 --- Introduction --- p.80 / Chapter 5.2 --- Objectives of HMM Development --- p.81 / Chapter 5.3 --- The Design of Initial-Final Models --- p.83 / Chapter 5.4 --- Initialization of Basic Initial-Final Models --- p.84 / Chapter 5.4.1 --- The Initialization Training with HEREST --- p.85 / Chapter 5.4.2 --- Refinement of Initialized Models --- p.88 / Chapter 5.4.3 --- Evaluation of the Models --- p.90 / Chapter 5.5 --- Training of Connected Speech Speaker Dependent Models --- p.93 / Chapter 5.5.1 --- Training Strategy --- p.93 / Chapter 5.5.2 --- Preliminary Result --- p.94 / Chapter 5.6 --- Design and Training of Context Dependent Initial Final Models --- p.95 / Chapter 5.6.1 --- Intra-syllable Context Dependent Units --- p.96 / Chapter 5.6.2 --- The Inter-syllable Context Dependent Units --- p.97 / Chapter 5.6.3 --- Model Refinement by Using Mixture Incrementing --- p.98 / Chapter 5.7 --- Training of Speaker Independent Models --- p.99 / Chapter 5.8 --- Discussions --- p.100 / Chapter 5.9 --- Summary --- p.101 / Chapter 6 --- PERFORMANCE ANALYSIS --- p.102 / Chapter 6.1 --- Substitution Errors --- p.102 / Chapter 6.1.1 --- Confusion of Long Vowels and Short Vowels for Initial Stop Consonants --- p.102 / Chapter 6.1.2 --- Confusion of Nasal Endings --- p.103 / Chapter 6.1.3 --- Confusion of Final Stop Consonants --- p.104 / Chapter 6.2 --- Insertion Errors and Deletion Errors --- p.105 / Chapter 6.3 --- Accuracy of Individual Models --- p.106 / Chapter 6.4 --- The Impact of Individual Models --- p.107 / Chapter 6.4.1 --- The Expected Error Rate of Initial Models --- p.110 / Chapter 6.4.2 --- The Expected Error Rate of Final Models --- p.111 / Chapter 6.5 --- Suggested Solutions for Error Reduction --- p.113 / Chapter 6.5.1 --- Duration Constraints --- p.113 / Chapter 6.5.2 --- The Use of Language Model --- p.113 / Chapter 6.6 --- Summary --- p.114 / Chapter 7 --- APPLICATION EXAMPLES OF THE HMM RECOGNITION SYSTEM --- p.115 / Chapter 7.1 --- Introduction --- p.115 / Chapter 7.2 --- Application 1: A Hong Kong Stock Market Inquiry System --- p.116 / Chapter 7.3 --- Application 2: A Navigating System for Hong Kong Street Map --- p.117 / Chapter 7.4 --- Automatic Character-to-Phonetic Conversion --- p.118 / Chapter 7.5 --- Summary --- p.119 / Chapter 8 --- CONCLUSIONS AND SUGGESTIONS FOR FURTHER WORK --- p.120 / Chapter 8.1 --- Conclusions --- p.120 / Chapter 8.2 --- Suggestions for Future Work --- p.122 / Chapter 8.2.1 --- Development of Continuous Speech Recognition System --- p.122 / Chapter 8.2.2 --- Implementation of Statistical Language Models --- p.122 / Chapter 8.2.3 --- Tones for Continuous Speech --- p.123 / BIBLIOGRAPHY / APPENDIX
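Chapter 2 of this record covers the forward-backward and Viterbi algorithms for finding the hidden state sequence. Below is a compact textbook Viterbi sketch in log space for orientation; the toy transition, initial and observation probabilities are made-up numbers, not values from the thesis.

```python
import numpy as np

def viterbi(log_A, log_B, log_pi):
    """Textbook Viterbi decoding for a discrete-observation HMM.
    log_A  : (N, N) log transition probabilities
    log_B  : (N, T) log observation probabilities b_j(o_t), already looked up
    log_pi : (N,)   log initial-state probabilities
    Returns the most likely state sequence and its log probability."""
    N, T = log_B.shape
    delta = np.full((T, N), -np.inf)     # best log score ending in state j at time t
    psi = np.zeros((T, N), dtype=int)    # backpointers
    delta[0] = log_pi + log_B[:, 0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A          # (from-state, to-state)
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = scores[psi[t], np.arange(N)] + log_B[:, t]
    path = [int(np.argmax(delta[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1], float(np.max(delta[-1]))

# Toy 2-state, 3-frame example (hypothetical numbers, purely illustrative).
log_A = np.log([[0.7, 0.3], [0.4, 0.6]])
log_pi = np.log([0.6, 0.4])
log_B = np.log(np.array([[0.5, 0.1, 0.2], [0.1, 0.6, 0.4]]))   # states x frames
print(viterbi(log_A, log_B, log_pi))
```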
7

Unit selection and waveform concatenation strategies in Cantonese text-to-speech.

January 2005 (has links)
Oey Sai Lok. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. / Includes bibliographical references. / Abstracts in English and Chinese. / Chapter 1. --- Introduction --- p.1 / Chapter 1.1 --- An overview of Text-to-Speech technology --- p.2 / Chapter 1.1.1 --- Text processing --- p.2 / Chapter 1.1.2 --- Acoustic synthesis --- p.3 / Chapter 1.1.3 --- Prosody modification --- p.4 / Chapter 1.2 --- Trends in Text-to-Speech technologies --- p.5 / Chapter 1.3 --- Objectives of this thesis --- p.7 / Chapter 1.4 --- Outline of the thesis --- p.9 / References --- p.11 / Chapter 2. --- Cantonese Speech --- p.13 / Chapter 2.1 --- The Cantonese dialect --- p.13 / Chapter 2.2 --- Phonology of Cantonese --- p.14 / Chapter 2.2.1 --- Initials --- p.15 / Chapter 2.2.2 --- Finals --- p.16 / Chapter 2.2.3 --- Tones --- p.18 / Chapter 2.3 --- Acoustic-phonetic properties of Cantonese syllables --- p.19 / References --- p.24 / Chapter 3. --- Cantonese Text-to-Speech --- p.25 / Chapter 3.1 --- General overview --- p.25 / Chapter 3.1.1 --- Text processing --- p.25 / Chapter 3.1.2 --- Corpus based acoustic synthesis --- p.26 / Chapter 3.1.3 --- Prosodic control --- p.27 / Chapter 3.2 --- Syllable based Cantonese Text-to-Speech system --- p.28 / Chapter 3.3 --- Sub-syllable based Cantonese Text-to-Speech system --- p.29 / Chapter 3.3.1 --- Definition of sub-syllable units --- p.29 / Chapter 3.3.2 --- Acoustic inventory --- p.31 / Chapter 3.3.3 --- Determination of the concatenation points --- p.33 / Chapter 3.4 --- Problems --- p.34 / References --- p.36 / Chapter 4. --- Waveform Concatenation for Sub-syllable Units --- p.37 / Chapter 4.1 --- Previous work in concatenation methods --- p.37 / Chapter 4.1.1 --- Determination of concatenation point --- p.38 / Chapter 4.1.2 --- Waveform concatenation --- p.38 / Chapter 4.2 --- Problems and difficulties in concatenating sub-syllable units --- p.39 / Chapter 4.2.1 --- Mismatch of acoustic properties --- p.40 / Chapter 4.2.2 --- Allophone problem of Initials /z/, Id and /s/ --- p.42 / Chapter 4.3 --- General procedures in concatenation strategies --- p.44 / Chapter 4.3.1 --- Concatenation of unvoiced segments --- p.45 / Chapter 4.3.2 --- Concatenation of voiced segments --- p.45 / Chapter 4.3.3 --- Measurement of spectral distance --- p.48 / Chapter 4.4 --- Detailed procedures in concatenation points determination --- p.50 / Chapter 4.4.1 --- Unvoiced segments --- p.50 / Chapter 4.4.2 --- Voiced segments --- p.53 / Chapter 4.5 --- Selected examples in concatenation strategies --- p.58 / Chapter 4.5.1 --- Concatenation at Initial segments --- p.58 / Chapter 4.5.1.1 --- Plosives --- p.58 / Chapter 4.5.1.2 --- Fricatives --- p.59 / Chapter 4.5.2 --- Concatenation at Final segments --- p.60 / Chapter 4.5.2.1 --- V group (long vowel) --- p.60 / Chapter 4.5.2.2 --- D group (diphthong) --- p.61 / References --- p.63 / Chapter 5. --- Unit Selection for Sub-syllable Units --- p.65 / Chapter 5.1 --- Basic requirements in unit selection process --- p.65 / Chapter 5.1.1 --- Availability of multiple copies of sub-syllable units --- p.65 / Chapter 5.1.1.1 --- Levels of "identical" --- p.66 / Chapter 5.1.1.2 --- Statistics on the availability --- p.67 / Chapter 5.1.2 --- Variations in acoustic parameters --- p.70 / Chapter 5.1.2.1 --- Pitch level --- p.71 / Chapter 5.1.2.2 --- Duration --- p.74 / Chapter 5.1.2.3 --- Intensity level --- p.75 / Chapter 5.2 --- Selection process: availability check on sub-syllable units --- p.77 / Chapter 5.2.1 --- Multiple copies found --- p.79 / Chapter 5.2.2 --- Unique copy found --- p.79 / Chapter 5.2.3 --- No matched copy found --- p.80 / Chapter 5.2.4 --- Illustrative examples --- p.80 / Chapter 5.3 --- Selection process: acoustic analysis on candidate units --- p.81 / References --- p.88 / Chapter 6. --- Performance Evaluation --- p.89 / Chapter 6.1 --- General information --- p.90 / Chapter 6.1.1 --- Objective test --- p.90 / Chapter 6.1.2 --- Subjective test --- p.90 / Chapter 6.1.3 --- Test materials --- p.91 / Chapter 6.2 --- Details of the objective test --- p.92 / Chapter 6.2.1 --- Testing method --- p.92 / Chapter 6.2.2 --- Results --- p.93 / Chapter 6.2.3 --- Analysis --- p.96 / Chapter 6.3 --- Details of the subjective test --- p.98 / Chapter 6.3.1 --- Testing method --- p.98 / Chapter 6.3.2 --- Results --- p.99 / Chapter 6.3.3 --- Analysis --- p.101 / Chapter 6.4 --- Summary --- p.107 / References --- p.108 / Chapter 7. --- Conclusions and Future Works --- p.109 / Chapter 7.1 --- Conclusions --- p.109 / Chapter 7.2 --- Suggested future works --- p.111 / References --- p.113 / Appendix 1 Mean pitch level of Initials and Finals stored in the inventory --- p.114 / Appendix 2 Mean durations of Initials and Finals stored in the inventory --- p.121 / Appendix 3 Mean intensity level of Initials and Finals stored in the inventory --- p.124 / Appendix 4 Test word used in performance evaluation --- p.127 / Appendix 5 Test paragraph used in performance evaluation --- p.128 / Appendix 6 Pitch profile used in the Text-to-Speech system --- p.131 / Appendix 7 Duration model used in Text-to-Speech system --- p.132
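Chapter 5 of this record selects among multiple copies of a sub-syllable unit by comparing pitch level, duration and intensity against the target, while Chapter 4 measures spectral distance at the concatenation point. The toy cost function below combines those ideas; the field names, weights and example values are hypothetical and serve only to illustrate the selection step, not the thesis's actual criterion.

```python
import numpy as np

def select_unit(candidates, target, prev_unit_mfcc, weights=(1.0, 1.0, 1.0, 1.0)):
    """Toy unit-selection sketch: pick the candidate sub-syllable unit whose
    pitch, duration and intensity best match the prosodic target, plus a join
    cost measured as Euclidean distance between MFCC frames at the boundary.
    Candidates are dicts with hypothetical fields 'pitch', 'dur', 'energy',
    'mfcc_first' used only for this illustration."""
    w_f0, w_dur, w_en, w_join = weights
    best, best_cost = None, np.inf
    for c in candidates:
        cost = (w_f0 * abs(c["pitch"] - target["pitch"]) / target["pitch"]
                + w_dur * abs(c["dur"] - target["dur"]) / target["dur"]
                + w_en * abs(c["energy"] - target["energy"]) / target["energy"]
                + w_join * np.linalg.norm(c["mfcc_first"] - prev_unit_mfcc))
        if cost < best_cost:
            best, best_cost = c, cost
    return best, best_cost

# Hypothetical usage with two candidate copies of the same unit.
prev = np.zeros(12)
cands = [{"pitch": 180.0, "dur": 0.12, "energy": 60.0, "mfcc_first": np.ones(12)},
         {"pitch": 210.0, "dur": 0.10, "energy": 62.0, "mfcc_first": np.zeros(12)}]
tgt = {"pitch": 200.0, "dur": 0.11, "energy": 61.0}
print(select_unit(cands, tgt, prev))
```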
8

Use of tone information in Cantonese LVCSR based on generalized character posterior probability decoding. / CUHK electronic theses & dissertations collection

January 2005 (has links)
Automatic recognition of Cantonese tones has long been regarded as a difficult task. Cantonese has one of the most complicated tone systems among all languages in the world. This thesis presents a novel approach to modeling Cantonese tones. We propose the use of supra-tone models. Each supra-tone unit covers a number of syllables in succession. The supra-tone model characterizes not only the tone contours of individual syllables but also the transitions among them. By including multiple tone contours in one modeling unit, the relative heights of the tones are captured explicitly. This is especially important for the discrimination among the level tones of Cantonese. / The decoding in conventional LVCSR systems aims at finding the sentence hypothesis, i.e. the string of words, which has the maximum a posteriori (MAP) probability in comparison with other hypotheses. However, in most applications, the recognition performance is measured in terms of word error rate (or word accuracy). In Chinese languages, given that "word" is a rather ambiguous concept, speech recognition performance is usually measured in terms of the character error rate. In this thesis, we develop a decoding algorithm that can minimize the character error rate. The algorithm is applied to a reduced search space, e.g. a word graph or the N-best sentence list, which results from the 1st pass of search, and the generalized character posterior probability (GCPP) is maximized. (Abstract shortened by UMI.) / This thesis addresses two major problems of the existing large vocabulary continuous speech recognition (LVCSR) technology: (1) inadequate exploitation of alternative linguistic and acoustic information; and (2) the mismatch between the decoding (recognition) criterion and the performance evaluation. The study is focused on Cantonese, one of the major Chinese dialects, which is also monosyllabic and tonal. Tone is somewhat indispensable for lexical access and disambiguation of homonyms in Cantonese. However, incorporating tone information into Cantonese LVCSR requires effective tone recognition as well as a seamless integration algorithm. / Qian Yao. / "July 2005." / Adviser: Tan Lee. / Source: Dissertation Abstracts International, Volume: 67-07, Section: B, page: 4009. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2005. / Includes bibliographical references (p. 100-110). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract in English and Chinese. / School code: 1307.
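The abstract above contrasts conventional MAP decoding over sentence hypotheses with decoding that maximizes the generalized character posterior probability (GCPP) over a word graph or N-best list. Schematically, and only as a sketch of the idea (the exact definition and normalization of GCPP are given in the thesis itself):

```latex
% Conventional MAP decoding over sentence hypotheses W given acoustics X:
\[
W^{*} \;=\; \arg\max_{W} P(W \mid X) \;=\; \arg\max_{W} P(X \mid W)\, P(W).
\]
% Character-level rescoring in the spirit of GCPP: each candidate character c
% spanning frames [s, e] in the reduced search space is scored by the posterior
% mass of all hypotheses that contain it, and the output string is assembled
% from the highest-scoring characters:
\[
\mathrm{GCPP}(c, s, e \mid X) \;\propto\; \sum_{W \,\ni\, (c,\,s,\,e)} P(X \mid W)\, P(W).
\]
```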
9

Pronunciation modeling for Cantonese speech recognition.

January 2003 (has links)
Kam Patgi. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2003. / Includes bibliographical references (leaf 103). / Abstracts in English and Chinese. / Chapter 1. --- Introduction --- p.1 / Chapter 1.1 --- Automatic Speech Recognition --- p.1 / Chapter 1.2 --- Pronunciation Modeling in ASR --- p.2 / Chapter 1.3 --- Objectives of the Thesis --- p.5 / Chapter 1.4 --- Thesis Outline --- p.5 / Reference --- p.7 / Chapter 2. --- The Cantonese Dialect --- p.9 / Chapter 2.1 --- Cantonese - A Typical Chinese Dialect --- p.10 / Chapter 2.1.1 --- Cantonese Phonology --- p.11 / Chapter 2.1.2 --- Cantonese Phonetics --- p.12 / Chapter 2.2 --- Pronunciation Variation in Cantonese --- p.13 / Chapter 2.2.1 --- Phone Change and Sound Change --- p.14 / Chapter 2.2.2 --- Notation for Different Sound Units --- p.16 / Chapter 2.3 --- Summary --- p.17 / Reference --- p.18 / Chapter 3. --- Large-Vocabulary Continuous Speech Recognition for Cantonese --- p.19 / Chapter 3.1 --- Feature Representation of the Speech Signal --- p.20 / Chapter 3.2 --- Probabilistic Framework of ASR --- p.20 / Chapter 3.3 --- Hidden Markov Model for Acoustic Modeling --- p.21 / Chapter 3.4 --- Pronunciation Lexicon --- p.25 / Chapter 3.5 --- Statistical Language Model --- p.25 / Chapter 3.6 --- Decoding --- p.26 / Chapter 3.7 --- The Baseline Cantonese LVCSR System --- p.26 / Chapter 3.7.1 --- System Architecture --- p.26 / Chapter 3.7.2 --- Speech Databases --- p.28 / Chapter 3.8 --- Summary --- p.29 / Reference --- p.30 / Chapter 4. --- Pronunciation Model --- p.32 / Chapter 4.1 --- Pronunciation Modeling at Different Levels --- p.33 / Chapter 4.2 --- Phone-level Pronunciation Model and its Application --- p.35 / Chapter 4.2.1 --- IF Confusion Matrix (CM) --- p.35 / Chapter 4.2.2 --- Decision Tree Pronunciation Model (DTPM) --- p.38 / Chapter 4.2.3 --- Refinement of Confusion Matrix --- p.41 / Chapter 4.3 --- Summary --- p.43 / References --- p.44 / Chapter 5. --- Pronunciation Modeling at Lexical Level --- p.45 / Chapter 5.1 --- Construction of PVD --- p.46 / Chapter 5.2 --- PVD Pruning by Word Unigram --- p.48 / Chapter 5.3 --- Recognition Experiments --- p.49 / Chapter 5.3.1 --- Experiment 1 - Pronunciation Modeling in LVCSR --- p.49 / Chapter 5.3.2 --- Experiment 2 - Pronunciation Modeling in Domain Specific Task --- p.58 / Chapter 5.3.3 --- Experiment 3 - PVD Pruning by Word Unigram --- p.62 / Chapter 5.4 --- Summary --- p.63 / Reference --- p.64 / Chapter 6. --- Pronunciation Modeling at Acoustic Model Level --- p.66 / Chapter 6.1 --- Hierarchy of HMM --- p.67 / Chapter 6.2 --- Sharing of Mixture Components --- p.68 / Chapter 6.3 --- Adaptation of Mixture Components --- p.70 / Chapter 6.4 --- Combination of Mixture Component Sharing and Adaptation --- p.74 / Chapter 6.5 --- Recognition Experiments --- p.78 / Chapter 6.6 --- Result Analysis --- p.80 / Chapter 6.6.1 --- Performance of Sharing Mixture Components --- p.81 / Chapter 6.6.2 --- Performance of Mixture Component Adaptation --- p.84 / Chapter 6.7 --- Summary --- p.85 / Reference --- p.87 / Chapter 7. --- Pronunciation Modeling at Decoding Level --- p.88 / Chapter 7.1 --- Search Process in Cantonese LVCSR --- p.88 / Chapter 7.2 --- Model-Level Search Space Expansion --- p.90 / Chapter 7.3 --- State-Level Output Probability Modification --- p.92 / Chapter 7.4 --- Recognition Experiments --- p.93 / Chapter 7.4.1 --- Experiment 1 - Model-Level Search Space Expansion --- p.93 / Chapter 7.4.2 --- Experiment 2 - State-Level Output Probability Modification --- p.94 / Chapter 7.5 --- Summary --- p.96 / Reference --- p.97 / Chapter 8. --- Conclusions and Suggestions for Future Work --- p.98 / Chapter 8.1 --- Conclusions --- p.98 / Chapter 8.2 --- Suggestions for Future Work --- p.100 / Reference --- p.103 / Appendix I Base Syllable Table --- p.104 / Appendix II Cantonese Initials and Finals --- p.105 / Appendix III IF confusion matrix --- p.106 / Appendix IV Phonetic Question Set --- p.112 / Appendix V CDDT and PCDT --- p.114
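Chapters 4 and 5 of this record derive pronunciation variants from an Initial/Final confusion matrix and collect them into a pronunciation variation dictionary (PVD). The sketch below shows one simple way such an expansion could look; the function, the threshold, and the /n/-to-/l/ example entry (reflecting the well-known n-l variation in Hong Kong Cantonese) are hypothetical illustrations, not data from the thesis.

```python
from itertools import product

def expand_pronunciations(lexicon, confusion, threshold=0.1):
    """Sketch of building a pronunciation variation dictionary (PVD):
    each canonical Initial/Final in a word's base form may be replaced by
    surface units it is confused with (probability >= threshold).
    `confusion` maps a canonical unit to {surface unit: probability};
    all entries here are hypothetical, not taken from the thesis."""
    pvd = {}
    for word, units in lexicon.items():
        options = []
        for u in units:
            alts = {u: 1.0,
                    **{v: p for v, p in confusion.get(u, {}).items() if p >= threshold}}
            options.append(list(alts.items()))
        variants = {}
        for combo in product(*options):
            pron = tuple(u for u, _ in combo)
            score = 1.0
            for _, p in combo:
                score *= p                       # crude variant weight
            variants[pron] = max(score, variants.get(pron, 0.0))
        pvd[word] = variants
    return pvd

# Hypothetical example: the Initial /n/ often surfaces as /l/.
lexicon = {"你": ["n", "ei"]}
confusion = {"n": {"l": 0.4}}
print(expand_pronunciations(lexicon, confusion))
```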
10

Language modeling for speech recognition of spoken Cantonese.

January 2009 (has links)
Yeung, Yu Ting. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2009. / Includes bibliographical references (leaves 84-93). / Abstracts in English and Chinese. / Acknowledgement --- p.iii / Abstract --- p.iv / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Cantonese Speech Recognition --- p.3 / Chapter 1.2 --- Objectives --- p.4 / Chapter 1.3 --- Thesis Outline --- p.5 / Chapter 2 --- Fundamentals of Large Vocabulary Continuous Speech Recognition --- p.7 / Chapter 2.1 --- Problem Formulation --- p.7 / Chapter 2.2 --- Feature Extraction --- p.8 / Chapter 2.3 --- Acoustic Models --- p.9 / Chapter 2.4 --- Decoding --- p.10 / Chapter 2.5 --- Statistical Language Modeling --- p.12 / Chapter 2.5.1 --- N-gram Language Models --- p.12 / Chapter 2.5.2 --- N-gram Smoothing --- p.13 / Chapter 2.5.3 --- Complexity of Language Model --- p.15 / Chapter 2.5.4 --- Class-based Language Model --- p.16 / Chapter 2.5.5 --- Language Model Pruning --- p.17 / Chapter 2.6 --- Performance Evaluation --- p.18 / Chapter 3 --- The Cantonese Dialect --- p.19 / Chapter 3.1 --- Phonology of Cantonese --- p.19 / Chapter 3.2 --- Orthographic Representation of Cantonese --- p.22 / Chapter 3.3 --- Classification of Cantonese Speech --- p.25 / Chapter 3.4 --- Cantonese-English Code-mixing --- p.27 / Chapter 4 --- Rule-based Translation Method --- p.29 / Chapter 4.1 --- Motivations --- p.29 / Chapter 4.2 --- Transformation-based Learning --- p.30 / Chapter 4.2.1 --- Algorithm Overview --- p.30 / Chapter 4.2.2 --- Learning of Translation Rules --- p.32 / Chapter 4.3 --- Performance Evaluation --- p.35 / Chapter 4.3.1 --- The Learnt Translation Rules --- p.35 / Chapter 4.3.2 --- Evaluation of the Rules --- p.37 / Chapter 4.3.3 --- Analysis of the Rules --- p.37 / Chapter 4.4 --- Preparation of Training Data for Language Modeling --- p.41 / Chapter 4.5 --- Discussion --- p.43 / Chapter 5 --- Language Modeling for Cantonese --- p.44 / Chapter 5.1 --- Training Data --- p.44 / Chapter 5.1.1 --- Text Corpora --- p.44 / Chapter 5.1.2 --- Preparation of Formal Cantonese Text Data --- p.45 / Chapter 5.2 --- Training of Language Models --- p.46 / Chapter 5.2.1 --- Language Models for Standard Chinese --- p.46 / Chapter 5.2.2 --- Language Models for Formal Cantonese --- p.46 / Chapter 5.2.3 --- Language Models for Colloquial Cantonese --- p.47 / Chapter 5.3 --- Evaluation of Language Models --- p.48 / Chapter 5.3.1 --- Speech Corpora for Evaluation --- p.48 / Chapter 5.3.2 --- Perplexities of Formal Cantonese Language Models --- p.49 / Chapter 5.3.3 --- Perplexities of Colloquial Cantonese Language Models --- p.51 / Chapter 5.4 --- Speech Recognition Experiments --- p.53 / Chapter 5.4.1 --- Speech Corpora --- p.53 / Chapter 5.4.2 --- Experimental Setup --- p.54 / Chapter 5.4.3 --- Results on Formal Cantonese Models --- p.55 / Chapter 5.4.4 --- Results on Colloquial Cantonese Models --- p.56 / Chapter 5.5 --- Analysis of Results --- p.58 / Chapter 5.6 --- Discussion --- p.59 / Chapter 5.6.1 --- Cantonese Language Modeling --- p.59 / Chapter 5.6.2 --- Interpolated Language Models --- p.59 / Chapter 5.6.3 --- Class-based Language Models --- p.60 / Chapter 6 --- Towards Language Modeling of Code-mixing Speech --- p.61 / Chapter 6.1 --- Data Collection --- p.61 / Chapter 6.1.1 --- Data Collection --- p.62 / Chapter 6.1.2 --- Filtering of Collected Data --- p.63 / Chapter 6.1.3 --- Processing of Collected Data --- p.63 / Chapter 6.2 --- Clustering of Chinese and English Words --- p.64 / Chapter 6.3 --- Language Modeling for Code-mixing Speech --- p.64 / Chapter 6.3.1 --- Language Models from Collected Data --- p.64 / Chapter 6.3.2 --- Class-based Language Models --- p.66 / Chapter 6.3.3 --- Performance Evaluation of Code-mixing Language Models --- p.67 / Chapter 6.4 --- Speech Recognition Experiments with Code-mixing Language Models --- p.69 / Chapter 6.4.1 --- Experimental Setup --- p.69 / Chapter 6.4.2 --- Monolingual Cantonese Recognition --- p.70 / Chapter 6.4.3 --- Code-mixing Speech Recognition --- p.72 / Chapter 6.5 --- Discussion --- p.74 / Chapter 6.5.1 --- Data Collection from the Internet --- p.74 / Chapter 6.5.2 --- Speech Recognition of Code-mixing Speech --- p.75 / Chapter 7 --- Conclusions and Future Work --- p.77 / Chapter 7.1 --- Conclusions --- p.77 / Chapter 7.1.1 --- Rule-based Translation Method --- p.77 / Chapter 7.1.2 --- Cantonese Language Modeling --- p.78 / Chapter 7.1.3 --- Code-mixing Language Modeling --- p.78 / Chapter 7.2 --- Future Work --- p.79 / Chapter 7.2.1 --- Rule-based Translation --- p.79 / Chapter 7.2.2 --- Training Data --- p.80 / Chapter 7.2.3 --- Code-mixing Speech --- p.80 / Chapter A --- Equation Derivation --- p.82 / Chapter A.1 --- Relationship between Average Mutual Information and Perplexity --- p.82 / Bibliography --- p.83
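Chapter 6 of this record clusters Chinese and English words into classes for code-mixing language modeling. The standard class-based bigram factorization that such models use is given below as a textbook form, not an equation copied from the thesis; the idea is that an English word rarely seen in the training text still receives probability mass through its class membership.

```latex
% Class-based bigram: the word is predicted through its class C(w).
\[
P(w_i \mid w_{i-1}) \;=\; P\big(C(w_i) \mid C(w_{i-1})\big)\; P\big(w_i \mid C(w_i)\big).
\]
```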
