Return to search

Robustní rozpoznávání mluvčího pomocí neuronových sítí / Robust Speaker Verification with Deep Neural Networks

The objective of this work is to study state-of-the-art deep neural networks based speaker verification systems called x-vectors on various conditions, such as wideband and narrowband data and to develop the system, which is robust to unseen language, specific noise or speech codec. This system takes variable length audio recording and maps it into fixed length embedding which is afterward used to represent the speaker. We compared our systems to BUT's submission to Speakers in the Wild Speaker Recognition Challenge (SITW) from 2016, which used previously popular statistical models - i-vectors. We observed, that when comparing single best systems, with recently published x-vectors we were able to obtain more than 4.38 times lower Equal Error Rate on SITW core-core condition compared to SITW submission from BUT. Moreover, we find that diarization substantially reduces error rate when there are multiple speakers for SITW core-multi condition but we could not see the same trend on NIST SRE 2018 VAST data.

Identiferoai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:403182
Date January 2019
CreatorsProfant, Ján
ContributorsRohdin, Johan Andréas, Matějka, Pavel
PublisherVysoké učení technické v Brně. Fakulta informačních technologií
Source SetsCzech ETDs
LanguageCzech
Detected LanguageEnglish
Typeinfo:eu-repo/semantics/masterThesis
Rightsinfo:eu-repo/semantics/restrictedAccess

Page generated in 0.0022 seconds