Global ETD Search

Vícejazyčná syntéza řeči / Multilingual speech synthesis

This work explores multilingual speech synthesis. We compare three models based on Tacotron that utilize various levels of parameter sharing. Two of them follow recent multilingual text-to-speech systems. The first one makes use of a fully shared encoder and an adversarial classifier that removes speaker-dependent information from the encoder. The other uses language-specific encoders. We introduce a new approach that combines the best of both previous methods. It enables effective parameter sharing using a meta-learning technique, preserves the encoder's flexibility, and actively removes speaker-specific information from the encoder. We compare the three models on two tasks. The first one aims at joint multilingual training on ten languages and reveals their knowledge-sharing abilities. The second concerns code-switching. We show that our model effectively shares information across languages and that, according to a subjective evaluation test, it produces more natural and accurate code-switching speech.

http://www.nusl.cz/ntk/nusl-415948

Identifier	oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:415948
Date	January 2020
Creators	Nekvinda, Tomáš
Contributors	Dušek, Ondřej; Peterek, Nino
Source Sets	Czech ETDs
Language	English
Detected Language	English
Type	info:eu-repo/semantics/masterThesis
Rights	info:eu-repo/semantics/restrictedAccess
