Global ETD Search

1	Auditory Based Modification of MFCC Feature Extraction for Robust Automatic Speech Recognition Chiou, Sheng-chiuan 01 September 2009 (has links) The human auditory perception system is much more noise-robust than any state-of theart automatic speech recognition (ASR) system. It is expected that the noise-robustness of speech feature vectors may be improved by employing more human auditory functions in the feature extraction procedure. Forward masking is a phenomenon of human auditory perception, that a weaker sound is masked by the preceding stronger masker. In this work, two human auditory mechanisms, synaptic adaptation and temporal integration are implemented by filter functions and incorporated to model forward masking into MFCC feature extraction. A filter optimization algorithm is proposed to optimize the filter parameters. The performance of the proposed method is evaluated on Aurora 3 corpus, and the procedure of training/testing follows the standard setting provided by the Aurora 3 task. The synaptic adaptation filter achieves relative improvements of 16.6% over the baseline. The temporal integration and modified temporal integration filter achieve relative improvements of 21.6% and 22.5% respectively. The combination of synaptic adaptation with each of temporal integration filters results in further improvements of 26.3% and 25.5%. Applying the filter optimization improves the synaptic adaptation filter and two temporal integration filters, results in the 18.4%, 25.2%, 22.6% improvements respectively. The performance of the combined-filters models are also improved, the relative improvement are 26.9% and 26.3%. forward masking auditory model syanptic adaptation temporal integration noise robust ASR automatic speech recognition

Search results

Auditory Based Modification of MFCC Feature Extraction for Robust Automatic Speech Recognition