This document describes three methods for detecting paralinguistic expressions such as laughing and crying and distinguishing them from ordinary speech by analysis of the audio signal. The database of recordings was originally designed for this purpose. Because everyday dialogs may also contain music, the database was extended with four new classes: speech, music, singing with music, and ordinary speech with background music. Feature extraction, feature reduction and classification are the recognition steps common to all three methods. The methods differ in the details of the classification process. The first method, called the straight approach, classifies all six classes at once. The second method, called the decision tree oriented approach, uses five intuitive sub-classifiers arranged in a tree structure, and the third method uses an emotion coupling approach for classification. The feature set was reduced to the best features by evaluating them with the F-ratio, and GMM classifiers were used for each classification stage.
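The F-ratio feature evaluation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the code assumes a common definition: the variance of the class means divided by the mean within-class variance, computed per feature. The function names `f_ratio` and `select_top`, the class labels, and the toy feature values are all hypothetical and are not taken from the thesis.

```python
from statistics import fmean, pvariance


def f_ratio(samples_by_class):
    """Return one F-ratio per feature.

    Assumed definition (not confirmed by the thesis): variance of the
    class means divided by the mean of the within-class variances.
    samples_by_class maps a class label to a list of feature vectors.
    """
    n_features = len(next(iter(samples_by_class.values()))[0])
    ratios = []
    for j in range(n_features):
        # Values of feature j, grouped by class.
        columns = [[row[j] for row in rows] for rows in samples_by_class.values()]
        class_means = [fmean(col) for col in columns]
        between = pvariance(class_means)
        within = fmean([pvariance(col) for col in columns])
        ratios.append(between / within if within > 0 else float("inf"))
    return ratios


def select_top(ratios, k):
    """Return the indices of the k features with the highest F-ratio."""
    return sorted(range(len(ratios)), key=lambda j: ratios[j], reverse=True)[:k]


# Hypothetical toy data: two classes, two features per sample.
toy = {
    "laughing": [[1.0, 5.0], [1.2, 4.0], [0.8, 6.0]],
    "crying": [[3.0, 5.5], [3.2, 4.5], [2.8, 5.0]],
}
ratios = f_ratio(toy)
best = select_top(ratios, 1)
```

In this toy data the first feature separates the two classes while the second has identical class means, so the first feature is the one kept. The retained features would then be passed to one GMM per class, which is outside the scope of this sketch.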
Identifier | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:218348 |
Date | January 2010 |
Creators | Mašek, Jan |
Contributors | Míča, Ivan, Atassi, Hicham |
Publisher | Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií |
Source Sets | Czech ETDs |
Language | Czech |
Detected Language | English |
Type | info:eu-repo/semantics/masterThesis |
Rights | info:eu-repo/semantics/restrictedAccess |