Global ETD Search

Robustní rozpoznávání mluvčího pomocí neuronových sítí / Robust Speaker Verification with Deep Neural Networks

The objective of this work is to study state-of-the-art speaker verification systems based on deep neural networks, called x-vectors, under various conditions, such as wideband and narrowband data, and to develop a system that is robust to an unseen language, specific noise, or a speech codec. The system takes a variable-length audio recording and maps it to a fixed-length embedding, which is then used to represent the speaker. We compared our systems to BUT's submission to the Speakers in the Wild Speaker Recognition Challenge (SITW) from 2016, which used the previously popular statistical models, i-vectors. When comparing single best systems, we observed that the recently published x-vectors achieved an Equal Error Rate more than 4.38 times lower on the SITW core-core condition than BUT's SITW submission. Moreover, we found that diarization substantially reduces the error rate when multiple speakers are present in the SITW core-multi condition, but we did not see the same trend on the NIST SRE 2018 VAST data.
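The Equal Error Rate mentioned in the abstract is the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how it could be approximated from verification scores; it is not taken from the thesis. The function name `equal_error_rate`, the threshold sweep over observed scores, and the example scores are all assumptions made for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping thresholds over all observed scores.

    Hypothetical helper for illustration; not the thesis's scoring tool.
    A trial is accepted when its score is at or above the threshold.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    best_gap, eer = None, None
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: a same-speaker trial scored below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # False acceptance: a different-speaker trial scored at or above it.
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Made-up scores, only to exercise the function.
example_targets = [0.9, 0.8, 0.7, 0.3]
example_nontargets = [0.6, 0.4, 0.2, 0.1]
example_eer = equal_error_rate(example_targets, example_nontargets)
```

With these made-up scores, a threshold of 0.6 rejects one of four target trials and accepts one of four non-target trials, so the sketch returns 0.25. A relative statement such as "4.38 times lower" would then be the ratio of two such EER values measured on the same trial list.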

http://www.nusl.cz/ntk/nusl-403182

Identifier	oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:403182
Date	January 2019
Creators	Profant, Ján
Contributors	Rohdin, Johan Andréas, Matějka, Pavel
Publisher	Vysoké učení technické v Brně. Fakulta informačních technologií
Source Sets	Czech ETDs
Language	Czech
Detected Language	English
Type	info:eu-repo/semantics/masterThesis
Rights	info:eu-repo/semantics/restrictedAccess
