In this thesis, a novel technique based on empirical mode decomposition (EMD) is proposed and examined for improving the noise robustness of automatic speech recognition systems. EMD analysis generalizes Fourier analysis to nonlinear and non-stationary time functions, which in our case are the speech feature sequences. We use the intrinsic mode functions (IMFs) obtained from the EMD analysis, which include the sinusoidal functions as special cases, to post-process the log energy feature. We evaluate the proposed method on the Aurora 2.0 and Aurora 3.0 databases. On Aurora 2.0, we obtain a 44.9% overall relative improvement over the baseline on the mismatched (clean-training) tasks. On Aurora 3.0, the results show an overall improvement of 49.5% over the baseline on the high-mismatch tasks. These results show that the proposed method yields a significant improvement in noise robustness.
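As a rough illustration of the decomposition described above, the following minimal sketch sifts a one-dimensional sequence into IMFs plus a residual. It is not the thesis implementation: the function names (`local_extrema`, `envelope`, `sift`, `emd`), the fixed number of sifting passes, the endpoint handling, and the toy `log_energy` sequence are all assumptions made for this example, and linear interpolation stands in for the cubic-spline envelopes normally used in EMD. How the IMFs are combined when post-processing the log energy feature is not stated in the abstract, so the sketch stops at the decomposition.

```python
import numpy as np

def local_extrema(x):
    """Return indices of interior local maxima and minima of a 1-D sequence."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    return maxima, minima

def envelope(x, idx):
    """Interpolate through the given extrema, pinning both endpoints.

    Assumption: linear interpolation is a simplification chosen here;
    cubic splines are the usual choice for EMD envelopes.
    """
    n = len(x)
    knots = np.concatenate(([0], idx, [n - 1]))
    return np.interp(np.arange(n), knots, x[knots])

def sift(x, n_sift=10):
    """Extract one IMF candidate by repeatedly subtracting the mean envelope.

    Assumption: a fixed number of passes replaces a convergence criterion.
    """
    h = x.copy()
    for _ in range(n_sift):
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        mean_env = 0.5 * (envelope(h, maxima) + envelope(h, minima))
        h = h - mean_env
    return h

def emd(x, max_imfs=6):
    """Decompose x so that x equals the sum of the IMFs plus the residual."""
    residual = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residual)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few oscillations left to sift
        imf = sift(residual)
        imfs.append(imf)
        residual = residual - imf
    return imfs, residual

# Toy stand-in for a log energy sequence (hypothetical data, not from Aurora):
# a fast oscillation, a slow oscillation, and a linear trend.
t = np.arange(200)
log_energy = np.sin(2 * np.pi * t / 10) + 0.5 * np.sin(2 * np.pi * t / 50) + 0.01 * t

imfs, residual = emd(log_energy)
reconstruction = sum(imfs) + residual  # the decomposition is additive by construction
```

Because each IMF is subtracted from the running residual, the IMFs and the final residual always add back up to the input sequence, whatever envelope method is used.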
Identifier | oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0825110-171559 |
Date | 25 August 2010 |
Creators | Wu, Kuo-hao |
Contributors | Hsin-Min Wang, Chia-Ping Chen, Chung-Hsien Wu, Jui-Feng Yeh |
Publisher | NSYSU |
Source Sets | NSYSU Electronic Thesis and Dissertation Archive |
Language | English |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0825110-171559 |
Rights | not_available, Copyright information available at source archive |