Using duration information in HMM-based automatic speech recognition.

Zhu Yu.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2005.
Includes bibliographical references (leaves 100-104).
Abstracts in English and Chinese.

Chapter 1 --- INTRODUCTION --- p.1
Chapter 1.1. --- Speech and its temporal structure --- p.1
Chapter 1.2. --- Previous work on the modeling of temporal structure --- p.1
Chapter 1.3. --- Integrating explicit duration modeling in HMM-based ASR system --- p.3
Chapter 1.4. --- Thesis outline --- p.3
Chapter 2 --- BACKGROUND --- p.5
Chapter 2.1. --- Automatic speech recognition process --- p.5
Chapter 2.2. --- HMM for ASR --- p.6
Chapter 2.2.1. --- HMM for ASR --- p.6
Chapter 2.2.2. --- HMM-based ASR system --- p.7
Chapter 2.3. --- General approaches to explicit duration modeling --- p.12
Chapter 2.3.1. --- Explicit duration modeling --- p.13
Chapter 2.3.2. --- Training of duration model --- p.16
Chapter 2.3.3. --- Incorporation of duration model in decoding --- p.18
Chapter 3 --- CANTONESE CONNECTED-DIGIT RECOGNITION --- p.21
Chapter 3.1. --- Cantonese connected digit recognition --- p.21
Chapter 3.1.1. --- Phonetics of Cantonese and Cantonese digit --- p.21
Chapter 3.2. --- The baseline system --- p.24
Chapter 3.2.1. --- Speech corpus --- p.24
Chapter 3.2.2. --- Feature extraction --- p.25
Chapter 3.2.3. --- HMM models --- p.26
Chapter 3.2.4. --- HMM decoding --- p.27
Chapter 3.3. --- Baseline performance and error analysis --- p.27
Chapter 3.3.1. --- Recognition performance --- p.27
Chapter 3.3.2. --- Performance for different speaking rates --- p.28
Chapter 3.3.3. --- Confusion matrix --- p.30
Chapter 4 --- DURATION MODELING FOR CANTONESE DIGITS --- p.41
Chapter 4.1. --- Duration features --- p.41
Chapter 4.1.1. --- Absolute duration feature --- p.41
Chapter 4.1.2. --- Relative duration feature --- p.44
Chapter 4.2. --- Parametric distribution for duration modeling --- p.47
Chapter 4.3. --- Estimation of the model parameters --- p.51
Chapter 4.4. --- Speaking-rate-dependent duration model --- p.52
Chapter 5 --- USING DURATION MODELING FOR CANTONESE DIGIT RECOGNITION --- p.57
Chapter 5.1. --- Baseline decoder --- p.57
Chapter 5.2. --- Incorporation of state-level duration model --- p.59
Chapter 5.3. --- Incorporation word-level duration model --- p.62
Chapter 5.4. --- Weighted use of duration model --- p.65
Chapter 6 --- EXPERIMENT RESULT AND ANALYSIS --- p.66
Chapter 6.1. --- Experiments with speaking-rate-independent duration models --- p.66
Chapter 6.1.1. --- Discussion --- p.68
Chapter 6.1.2. --- Analysis of the error patterns --- p.71
Chapter 6.1.3. --- Reduction of deletion, substitution and insertion --- p.72
Chapter 6.1.4. --- Recognition performance at different speaking rates --- p.75
Chapter 6.2. --- Experiments with speaking-rate-dependent duration models --- p.77
Chapter 6.2.1. --- Using true speaking rate --- p.77
Chapter 6.2.2. --- Using estimated speaking rate --- p.79
Chapter 6.3. --- Evaluation on another speech database --- p.80
Chapter 6.3.1. --- Experimental setup --- p.80
Chapter 6.3.2. --- Experiment results and analysis --- p.82
Chapter 7 --- CONCLUSIONS AND FUTURE WORK --- p.87
Chapter 7.1. --- Conclusion and understanding of current work --- p.87
Chapter 7.2. --- Future work --- p.89
Chapter A --- APPENDIX --- p.90
BIBLIOGRAPHY --- p.100

Identifier: oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_325149
Date: January 2005
Contributors: Zhu, Yu., Chinese University of Hong Kong Graduate School. Division of Electronic Engineering.
Source Sets: The Chinese University of Hong Kong
Language: English, Chinese
Detected Language: English
Type: Text, bibliography
Format: print, xii, 104 leaves : ill. ; 30 cm.
Rights: Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
