This work explores model-based classification of speech audio signals using low-level feature vectors. It describes how low-level features are extracted from audio signals, discusses established techniques for training and testing mixture model-based classifiers, and shows how these models are used with feature selection algorithms to select optimal feature subsets. The results of a number of classification experiments on a publicly available speech database, the Berlin Database of Emotional Speech, are presented. These include experiments in optimizing feature extraction parameters and in comparing feature selection results from over 700 candidate feature vectors for the tasks of classifying speaker gender, identity, and emotion. Final classification accuracies of 99.5%, 98.0%, and 79% were achieved for the gender, identity, and emotion tasks, respectively. / by Chris Thoman. / Thesis (M.S.C.S.)--Florida Atlantic University, 2009. / Includes bibliography. / Electronic reproduction. Boca Raton, Fla., 2009. Mode of access: World Wide Web.
Identifier | oai:union.ndltd.org:fau.edu/oai:fau.digital.flvc.org:fau_3410 |
Contributors | Thoman, Chris., College of Engineering and Computer Science, Department of Computer and Electrical Engineering and Computer Science |
Publisher | Florida Atlantic University |
Source Sets | Florida Atlantic University |
Language | English |
Detected Language | English |
Type | Text, Electronic Thesis or Dissertation |
Format | xiv, 186 p. : ill., electronic |
Rights | http://rightsstatements.org/vocab/InC/1.0/ |