The thesis discusses in detail some concepts of speech and prosody that can contribute to build a speech corpus for the speaker segmentation purpose. Moreover, the Elan multimedia annotator used for labeling is described. The theoretical part highlights some frequently used speech features such as MFCC, PLP and LPC and deals with currently most popular speech segmentation methods. Some classification algorithms are also mentioned. The practical part describes implementation of Bayesian information criterium algorithm in system for automatic speaker segmentation. For classification of speaker change point in speech, were used different speech features. The results of tests were evaluated by the graphic method of receiver operating characteristic (ROC) and his quantitative indices. As the best speech features for this system were provided MFCC and HFCC.
Identifer | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:219007 |
Date | January 2011 |
Creators | Adamský, Aleš |
Contributors | Přinosil, Jiří, Smékal, Zdeněk |
Publisher | Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií |
Source Sets | Czech ETDs |
Language | Slovak |
Detected Language | English |
Type | info:eu-repo/semantics/masterThesis |
Rights | info:eu-repo/semantics/restrictedAccess |
Page generated in 0.0018 seconds