
Automatic Segmentation and Identification of Mixed-Language Speech Using delta-BIC and Support Vector Machines

This thesis proposes an approach to segmenting and identifying mixed-language speech.
Automatic language identification (LID) can be divided into four steps: feature extraction, segmentation, segment clustering, and re-labeling. In feature extraction, we compare the group delay feature (GDF) with the MFCC feature. Unlike traditional features derived from the Fourier transform magnitude, the GDF uses the phase spectrum. In segmentation, we compare the delta Bayesian information criterion (delta-BIC) with support vector machines (SVMs). Delta-BIC is applied to segment the input speech utterance into a sequence of language-dependent segments using acoustic features. The segments are then clustered using the K-means algorithm, and re-labeling determines the language of each cluster. The SVMs, by contrast, segment and identify the speech automatically once the models have been trained.
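The delta-BIC segmentation step can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes each frame is a list of acoustic feature values, models each segment as a single Gaussian with diagonal covariance (a simplification chosen to keep the example dependency-free), and uses hypothetical names (`delta_bic`, `best_change_point`, `penalty_weight`, `margin`). A positive delta-BIC at frame i suggests that two separate models explain the window better than one, so i is a candidate language boundary.

```python
import math

def log_det_diag(frames):
    """Log-determinant of the diagonal maximum-likelihood covariance of the frames."""
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for k in range(dim):
        column = [frame[k] for frame in frames]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n
        total += math.log(max(var, 1e-8))  # variance floor avoids log(0)
    return total

def delta_bic(frames, i, penalty_weight=1.0):
    """Delta-BIC for splitting the window at frame i (diagonal-covariance assumption)."""
    n = len(frames)
    dim = len(frames[0])
    left, right = frames[:i], frames[i:]
    gain = 0.5 * (n * log_det_diag(frames)
                  - len(left) * log_det_diag(left)
                  - len(right) * log_det_diag(right))
    # A diagonal Gaussian has 2 * dim free parameters (means and variances).
    penalty = 0.5 * penalty_weight * (2 * dim) * math.log(n)
    return gain - penalty

def best_change_point(frames, margin=5, penalty_weight=1.0):
    """Return the split with the largest positive delta-BIC, or None if there is none."""
    best_i, best_score = None, 0.0
    for i in range(margin, len(frames) - margin + 1):
        score = delta_bic(frames, i, penalty_weight)
        if score > best_score:
            best_i, best_score = i, score
    return best_i
```

In a full system this search would typically be repeated over a sliding or growing window, and the resulting segments would then be passed to the clustering step.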
To account for the effect of accent, we evaluate our system on the English Across Taiwan (EAT) corpus. The experimental results show that the system reaches a frame hit rate of 78.13%, compared with a baseline of 57.77%.
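The abstract does not define the frame hit rate. A plausible reading, assumed here, is the fraction of frames whose hypothesized language label matches the reference label. The function name `frame_hit_rate` and the label strings are hypothetical.

```python
def frame_hit_rate(reference, hypothesis):
    """Fraction of frames whose hypothesized label equals the reference label."""
    if len(reference) != len(hypothesis):
        raise ValueError("label sequences must have the same number of frames")
    if not reference:
        raise ValueError("label sequences must not be empty")
    hits = sum(1 for ref, hyp in zip(reference, hypothesis) if ref == hyp)
    return hits / len(reference)
```

For example, with reference labels `["zh", "zh", "en", "en"]` and hypothesis `["zh", "en", "en", "en"]`, three of four frames match.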

Identifier: oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0909108-092149
Date: 09 September 2008
Creators: Wang, Sheng-Fu
Contributors: Cheng-Wen Ko, Hsing-Min Wang, Chia-Ping Chen, Chun-I Fan
Publisher: NSYSU
Source Sets: NSYSU Electronic Thesis and Dissertation Archive
Language: English
Detected Language: English
Type: text
Format: application/pdf
Source: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0909108-092149
Rights: not_available, Copyright information available at source archive
