
Data-Driven Rescaling of Energy Features for Noisy Speech Recognition

In this paper, we investigate the rescaling of energy features for noise-robust speech
recognition. The performance of a speech recognition system degrades quickly under the
influence of environmental noise, so robustness techniques have long been an important
research topic. Even so, many studies report that noisy environments still have a large
impact on recognition performance. We therefore propose data-driven energy feature
rescaling (DEFR) to adjust the features. The method has three parts: voice activity
detection (VAD), a piecewise log rescaling function, and a parameter searching algorithm.
Its purpose is to reduce the difference between noisy and clean speech features. We apply
the method to Mel-frequency cepstral coefficients (MFCC) and Teager energy cepstral
coefficients (TECC), and compare it with mean subtraction (MS) and mean and variance
normalization (MVN). Performance is evaluated on the Aurora 2.0 and Aurora 3.0 databases.
The experimental results show that the proposed method effectively improves recognition
accuracy.
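
For orientation, the two baseline normalizations named in the abstract, MS and MVN, can be sketched as follows. This is a minimal illustration of the standard techniques, not the thesis's implementation: the function names, the per-utterance statistics, the `eps` guard, and the toy log-energy values are all assumptions made for the example.

```python
import math


def mean_subtraction(frames):
    """MS: subtract the per-dimension mean (over the utterance) from every frame."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    return [[f[d] - means[d] for d in range(dims)] for f in frames]


def mean_variance_normalization(frames, eps=1e-8):
    """MVN: remove the mean, then scale each dimension to unit variance."""
    centered = mean_subtraction(frames)
    n = len(centered)
    dims = len(centered[0])
    # Population standard deviation per dimension; eps (assumed) avoids division by zero.
    stds = [math.sqrt(sum(f[d] ** 2 for f in centered) / n) for d in range(dims)]
    return [[f[d] / (stds[d] + eps) for d in range(dims)] for f in centered]


# Hypothetical one-dimensional log-energy track, one value per frame.
log_energy = [[1.0], [2.0], [3.0], [6.0]]
ms = mean_subtraction(log_energy)
mvn = mean_variance_normalization(log_energy)
```

After MS the track has zero mean; after MVN it also has approximately unit variance. Both act on global statistics of the utterance, whereas the abstract describes DEFR as a piecewise rescaling driven by VAD and a parameter search.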

Identifier: oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0718112-111800
Date: 18 July 2012
Creators: Luan, Miau
Contributors: Chung-Hsien Wu, Chia-Ping Chen, Hsin-Min Wang
Publisher: NSYSU
Source Sets: NSYSU Electronic Thesis and Dissertation Archive
Language: Cholon
Detected Language: English
Type: text
Format: application/pdf
Source: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0718112-111800
Rights: user_define, Copyright information available at source archive
