Zhu Yu. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. / Includes bibliographical references (leaves 100-104). / Abstracts in English and Chinese. / Chapter 1 --- INTRODUCTION --- p.1 / Chapter 1.1. --- Speech and its temporal structure --- p.1 / Chapter 1.2. --- Previous work on the modeling of temporal structure --- p.1 / Chapter 1.3. --- Integrating explicit duration modeling in HMM-based ASR system --- p.3 / Chapter 1.4. --- Thesis outline --- p.3 / Chapter 2 --- BACKGROUND --- p.5 / Chapter 2.1. --- Automatic speech recognition process --- p.5 / Chapter 2.2. --- HMM for ASR --- p.6 / Chapter 2.2.1. --- HMM for ASR --- p.6 / Chapter 2.2.2. --- HMM-based ASR system --- p.7 / Chapter 2.3. --- General approaches to explicit duration modeling --- p.12 / Chapter 2.3.1. --- Explicit duration modeling --- p.13 / Chapter 2.3.2. --- Training of duration model --- p.16 / Chapter 2.3.3. --- Incorporation of duration model in decoding --- p.18 / Chapter 3 --- CANTONESE CONNECTED-DIGIT RECOGNITION --- p.21 / Chapter 3.1. --- Cantonese connected digit recognition --- p.21 / Chapter 3.1.1. --- Phonetics of Cantonese and Cantonese digit --- p.21 / Chapter 3.2. --- The baseline system --- p.24 / Chapter 3.2.1. --- Speech corpus --- p.24 / Chapter 3.2.2. --- Feature extraction --- p.25 / Chapter 3.2.3. --- HMM models --- p.26 / Chapter 3.2.4. --- HMM decoding --- p.27 / Chapter 3.3. --- Baseline performance and error analysis --- p.27 / Chapter 3.3.1. --- Recognition performance --- p.27 / Chapter 3.3.2. --- Performance for different speaking rates --- p.28 / Chapter 3.3.3. --- Confusion matrix --- p.30 / Chapter 4 --- DURATION MODELING FOR CANTONESE DIGITS --- p.41 / Chapter 4.1. --- Duration features --- p.41 / Chapter 4.1.1. --- Absolute duration feature --- p.41 / Chapter 4.1.2. --- Relative duration feature --- p.44 / Chapter 4.2. --- Parametric distribution for duration modeling --- p.47 /
Chapter 4.3. --- Estimation of the model parameters --- p.51 / Chapter 4.4. --- Speaking-rate-dependent duration model --- p.52 / Chapter 5 --- USING DURATION MODELING FOR CANTONESE DIGIT RECOGNITION --- p.57 / Chapter 5.1. --- Baseline decoder --- p.57 / Chapter 5.2. --- Incorporation of state-level duration model --- p.59 / Chapter 5.3. --- Incorporation of word-level duration model --- p.62 / Chapter 5.4. --- Weighted use of duration model --- p.65 / Chapter 6 --- EXPERIMENT RESULT AND ANALYSIS --- p.66 / Chapter 6.1. --- Experiments with speaking-rate-independent duration models --- p.66 / Chapter 6.1.1. --- Discussion --- p.68 / Chapter 6.1.2. --- Analysis of the error patterns --- p.71 / Chapter 6.1.3. --- Reduction of deletion, substitution and insertion --- p.72 / Chapter 6.1.4. --- Recognition performance at different speaking rates --- p.75 / Chapter 6.2. --- Experiments with speaking-rate-dependent duration models --- p.77 / Chapter 6.2.1. --- Using true speaking rate --- p.77 / Chapter 6.2.2. --- Using estimated speaking rate --- p.79 / Chapter 6.3. --- Evaluation on another speech database --- p.80 / Chapter 6.3.1. --- Experimental setup --- p.80 / Chapter 6.3.2. --- Experiment results and analysis --- p.82 / Chapter 7 --- CONCLUSIONS AND FUTURE WORK --- p.87 / Chapter 7.1. --- Conclusion and understanding of current work --- p.87 / Chapter 7.2. --- Future work --- p.89 / Chapter A --- APPENDIX --- p.90 / BIBLIOGRAPHY --- p.100
Identifier | oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_325149 |
Date | January 2005 |
Contributors | Zhu, Yu., Chinese University of Hong Kong Graduate School. Division of Electronic Engineering. |
Source Sets | The Chinese University of Hong Kong |
Language | English, Chinese |
Detected Language | English |
Type | Text, bibliography |
Format | print, xii, 104 leaves : ill. ; 30 cm. |
Rights | Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/) |