Automatic Segmentation and Identification of Mixed-Language Speech Using delta-BIC and Support Vector Machines
Wang, Sheng-Fu, 09 September 2008
This thesis proposes an approach to segmenting and identifying mixed-language speech.
Automatic language identification (LID) can be divided into four steps: feature extraction, segmentation, segment clustering, and re-labeling. In feature extraction, we compare the group delay feature (GDF) with the Mel-frequency cepstral coefficient (MFCC) feature. Unlike traditional features derived from the Fourier transform magnitude, GDF uses the phase spectrum. In segmentation, we compare the delta Bayesian information criterion (delta-BIC) with support vector machines (SVMs). Delta-BIC is applied to segment the input speech utterance into a sequence of language-dependent segments using acoustic features. The segments are then clustered using the K-means algorithm. Finally, re-labeling determines the language of each cluster. SVMs, in contrast, segment and identify the speech automatically once their models have been trained.
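The delta-BIC segmentation step can be illustrated with a minimal sketch. This is not the implementation used in the thesis: it assumes the commonly used formulation in which each segment is modeled by a single full-covariance Gaussian, and the function names (`delta_bic`, `best_change_point`), the penalty weight `lam`, and the `margin` parameter are hypothetical choices made for illustration. A positive score at frame `i` suggests that two separate models explain the window better than one, that is, a likely language change at `i`.

```python
import numpy as np


def logdet_cov(x):
    """Log-determinant of the maximum-likelihood covariance of the frames in x."""
    d = x.shape[1]
    # A small ridge keeps the covariance well conditioned for short segments
    # (an assumption of this sketch, not part of the criterion itself).
    cov = np.cov(x, rowvar=False, bias=True) + 1e-6 * np.eye(d)
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(frames, i, lam=1.0):
    """Delta-BIC for splitting frames (shape: n_frames x n_dims) at index i."""
    n, d = frames.shape
    # Penalty for the extra parameters of the second Gaussian (mean + covariance).
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
    return (0.5 * n * logdet_cov(frames)
            - 0.5 * i * logdet_cov(frames[:i])
            - 0.5 * (n - i) * logdet_cov(frames[i:])
            - penalty)


def best_change_point(frames, margin=20, lam=1.0):
    """Return (index, score) of the candidate split with the highest delta-BIC."""
    n = frames.shape[0]
    candidates = list(range(margin, n - margin))
    scores = [delta_bic(frames, i, lam) for i in candidates]
    k = int(np.argmax(scores))
    return candidates[k], float(scores[k])


if __name__ == "__main__":
    # Synthetic stand-in for acoustic features: the mean shifts at frame 200.
    rng = np.random.default_rng(0)
    feats = np.vstack([rng.normal(0.0, 1.0, (200, 4)),
                       rng.normal(3.0, 1.0, (200, 4))])
    idx, score = best_change_point(feats)
    print(idx, score > 0)
```

In practice the window would slide over real GDF or MFCC frames, and a split would be accepted only where the best score is positive; the resulting segments would then go to the clustering and re-labeling steps described above.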
To account for the effect of accent, we use the English Across Taiwan (EAT) corpus to evaluate our system. The experimental results show that the system reaches a frame hit rate of 78.13%, compared with a baseline of 57.77%.