by Tan Lee.
Thesis (Ph.D.)--Chinese University of Hong Kong, 1996.
Includes bibliographical references.
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Conventional Pattern Recognition Approaches for Speech Recognition --- p.3
Chapter 1.2 --- A Review on Neural Network Applications in Speech Recognition --- p.6
Chapter 1.2.1 --- Static Pattern Classification --- p.7
Chapter 1.2.2 --- Hybrid Approaches --- p.9
Chapter 1.2.3 --- Dynamic Neural Networks --- p.12
Chapter 1.3 --- Automatic Recognition of Cantonese Speech --- p.16
Chapter 1.4 --- Organization of the Thesis --- p.18
References --- p.20
Chapter 2 --- Phonological and Acoustical Properties of Cantonese Syllables --- p.29
Chapter 2.1 --- Phonology of Cantonese --- p.29
Chapter 2.1.1 --- Basic Phonetic Units --- p.30
Chapter 2.1.2 --- Syllabic Structure --- p.32
Chapter 2.1.3 --- Lexical Tones --- p.33
Chapter 2.2 --- Acoustical Properties of Cantonese Syllables --- p.35
Chapter 2.2.1 --- Spectral Features --- p.35
Chapter 2.2.2 --- Energy and Zero-Crossing Rate --- p.39
Chapter 2.2.3 --- Pitch --- p.40
Chapter 2.2.4 --- Duration --- p.41
Chapter 2.3 --- Acoustic Feature Extraction for Speech Recognition of Cantonese --- p.42
References --- p.46
Chapter 3 --- Tone Recognition of Isolated Cantonese Syllables --- p.48
Chapter 3.1 --- Acoustic Pre-processing --- p.48
Chapter 3.1.1 --- Voiced Portion Detection --- p.48
Chapter 3.1.2 --- Pitch Extraction --- p.51
Chapter 3.2 --- Supra-Segmental Feature Parameters for Tone Recognition --- p.53
Chapter 3.2.1 --- Pitch-Related Feature Parameters --- p.53
Chapter 3.2.2 --- Duration and Energy Drop Rate --- p.55
Chapter 3.2.3 --- Normalization of Feature Parameters --- p.57
Chapter 3.3 --- An MLP Based Tone Classifier --- p.58
Chapter 3.4 --- Simulation Experiments --- p.59
Chapter 3.4.1 --- Speech Data --- p.59
Chapter 3.4.2 --- Feature Extraction and Normalization --- p.61
Chapter 3.4.3 --- Experimental Results --- p.61
Chapter 3.5 --- Discussion and Conclusion --- p.64
References --- p.65
Chapter 4 --- Recurrent Neural Network Based Dynamic Speech Models --- p.67
Chapter 4.1 --- Motivations and Rationales --- p.68
Chapter 4.2 --- RNN Speech Model (RSM) --- p.71
Chapter 4.2.1 --- Network Architecture and Dynamic Operation --- p.71
Chapter 4.2.2 --- RNN for Speech Modeling --- p.72
Chapter 4.2.3 --- Illustrative Examples --- p.75
Chapter 4.3 --- Training of RNN Speech Models --- p.78
Chapter 4.3.1 --- Real-Time-Recurrent-Learning (RTRL) Algorithm --- p.78
Chapter 4.3.2 --- Iterative Re-segmentation Training of RSM --- p.80
Chapter 4.4 --- Several Practical Issues in RSM Training --- p.85
Chapter 4.4.1 --- Combining Adjacent Segments --- p.85
Chapter 4.4.2 --- Hypothesizing Initial Segmentation --- p.86
Chapter 4.4.3 --- Improving Temporal State Dependency --- p.89
Chapter 4.5 --- Simulation Experiments --- p.90
Chapter 4.5.1 --- Experiment 4.1 - Training with a Single Utterance --- p.91
Chapter 4.5.2 --- Experiment 4.2 - Effect of Augmenting Recurrent Learning Rate --- p.93
Chapter 4.5.3 --- Experiment 4.3 - Training with Multiple Utterances --- p.96
Chapter 4.5.4 --- Experiment 4.4 - Modeling Performance of RSMs --- p.99
Chapter 4.6 --- Conclusion --- p.104
References --- p.106
Chapter 5 --- Isolated Word Recognition Using RNN Speech Models --- p.107
Chapter 5.1 --- A Baseline System --- p.107
Chapter 5.1.1 --- System Description --- p.107
Chapter 5.1.2 --- Simulation Experiments --- p.110
Chapter 5.1.3 --- Discussion --- p.117
Chapter 5.2 --- Incorporating Duration Information --- p.118
Chapter 5.2.1 --- Duration Screening --- p.118
Chapter 5.2.2 --- Determination of Duration Bounds --- p.120
Chapter 5.2.3 --- Simulation Experiments --- p.120
Chapter 5.2.4 --- Discussion --- p.124
Chapter 5.3 --- Discriminative Training --- p.125
Chapter 5.3.1 --- The Minimum Classification Error Formulation --- p.126
Chapter 5.3.2 --- Generalized Probabilistic Descent Algorithm --- p.127
Chapter 5.3.3 --- Determination of Training Parameters --- p.128
Chapter 5.3.4 --- Simulation Experiments --- p.129
Chapter 5.3.5 --- Discussion --- p.133
Chapter 5.4 --- Conclusion --- p.134
References --- p.135
Chapter 6 --- An Integrated Speech Recognition System for Cantonese Syllables --- p.137
Chapter 6.1 --- System Architecture and Recognition Scheme --- p.137
Chapter 6.2 --- Speech Corpus and Data Pre-processing --- p.140
Chapter 6.3 --- Recognition Experiments and Results --- p.140
Chapter 6.4 --- Discussion and Conclusion --- p.144
References --- p.146
Chapter 7 --- Conclusions and Suggestions for Future Work --- p.147
Chapter 7.1 --- Conclusions --- p.147
Chapter 7.2 --- Suggestions for Future Work --- p.151
Identifier | oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_321646 |
Date | January 1996 |
Contributors | Lee, Tan., Chinese University of Hong Kong Graduate School. Division of Electronic Engineering. |
Publisher | Chinese University of Hong Kong |
Source Sets | The Chinese University of Hong Kong |
Language | English |
Detected Language | English |
Type | Text, bibliography |
Format | print, xii, 152 leaves : ill. ; 30 cm. |
Rights | Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/) |