Strojový překlad mluvené řeči přes fonetickou reprezentaci zdrojové řeči / Spoken Language Translation via Phoneme Representation of the Source Language

We rework the traditional two-step approach that uses automatic speech recognition (ASR) for spoken language translation (SLT). Instead of conventional graphemes, we use phonemes as the intermediate speech representation. Starting with the acoustic model, we revise the cross-lingual transfer and propose a coarse-to-fine method that provides a further speed-up and performance boost. We then review the translation model, experimenting with source and target encoding and improving robustness through fine-tuning and transfer across ASR and SLT. We empirically document that this conventional setup with an alternative representation not only performs well on standard test sets but also provides robust transcripts and translations on challenging (e.g., non-native) test sets. Notably, our ASR system outperforms commercial ASR systems.

Identifier oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:416019
Date January 2020
Creators Polák, Peter
Contributors Bojar, Ondřej; Peterek, Nino
Source Sets Czech ETDs
Language English
Detected Language English
Type info:eu-repo/semantics/masterThesis
Rights info:eu-repo/semantics/restrictedAccess