Identifikace hudby, řeči, křiku, zpěvu v audio (video) záznamu / Music, Speech, Crying, Singing Detection in Audio (Video)

This thesis follows the trend of recent decades of using neural networks to detect speech in noisy data. The text begins with background on the topics discussed, such as audio features, machine learning, and neural networks. The network parameters are then examined to establish the most suitable configuration for the experiments. The main focus of the experiments is the influence of various sound events on speech detection on a small, diverse database; the sound events correlated with speech proved to be the most beneficial. In addition, the detection accuracy for the acoustic events themselves, previously used only as a supplement to speech, is also examined. An experiment that extends the datasets with more evenly distributed data shows that doing so does not guarantee an improvement. Finally, the last experiment demonstrates that the network did learn to predict voice activity in both clean and noisy data.

http://www.nusl.cz/ntk/nusl-255309

Identifier	oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:255309
Date	January 2016
Creators	Danko, Michal
Contributors	Malenovský, Vladimír, Szőke, Igor
Publisher	Vysoké učení technické v Brně. Fakulta informačních technologií
Source Sets	Czech ETDs
Language	Czech
Detected Language	English
Type	info:eu-repo/semantics/masterThesis
Rights	info:eu-repo/semantics/restrictedAccess
