Return to search

Identifikace hudby, řeči, křiku, zpěvu v audio (video) záznamu / Music, Speech, Crying, Singing Detection in Audio (Video)

This thesis follows the trend of last decades in using neural networks in order to detect speech in noisy data. The text begins with basic knowledge about discussed topics, such as audio features, machine learning and neural networks. The network parameters are examined in order to provide the most suitable background for the experiments. The main focus of the experiments is to observe the influence of various sound events on the speech detection on a small, diverse database. Where the sound events correlated to the speech proved to be the most beneficial. In addition, the accuracy of the acoustic events, previously used only as a supplement to the speech, is also a part of experimentation. The experiment of examining the extending of the datasets by more fairly distributed data shows that it doesn't guarantee an improvement. And finally, the last experiment demonstrates that the network indeed succeeded in learning how to predict voice activity in both clean and noisy data.

Identiferoai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:255309
Date January 2016
CreatorsDanko, Michal
ContributorsMalenovský, Vladimír, Szőke, Igor
PublisherVysoké učení technické v Brně. Fakulta informačních technologií
Source SetsCzech ETDs
LanguageCzech
Detected LanguageEnglish
Typeinfo:eu-repo/semantics/masterThesis
Rightsinfo:eu-repo/semantics/restrictedAccess

Page generated in 0.0022 seconds