Robustní detekce řečové aktivity / Robust Speech Activity Detection

The aim of this work is to design and build a robust speech activity detector that can detect speech in different languages, in noisy environments, and with background music. I decided to solve this problem by using a neural network as a classification model that assigns one of four possible classes (silence, speech, music, or noise) to the input audio recording. The resulting tool is able to detect speech in at least 12 languages. Speech with background music is detected with up to 88 % accuracy, and the system's success rate on noisy data ranges from 84 % (5 dB SNR) to 88 % (20 dB SNR). This tool can be used for speech activity detection in various research areas of speech processing. The main contribution is the elimination of music, which, when not eliminated, significantly increases the error rate of systems for speaker identification or speech recognition.

http://www.nusl.cz/ntk/nusl-403178

Identifier	oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:403178
Date	January 2019
Creators	Popková, Anna
Contributors	Plchot, Oldřich; Matějka, Pavel
Publisher	Vysoké učení technické v Brně. Fakulta informačních technologií
Source Sets	Czech ETDs
Language	Czech
Detected Language	English
Type	info:eu-repo/semantics/masterThesis
Rights	info:eu-repo/semantics/restrictedAccess
