Global ETD Search

Return to search

Strojový překlad mluvené řeči přes fonetickou reprezentaci zdrojové řeči / Spoken Language Translation via Phoneme Representation of the Source Language

We refactor the traditional two-step approach of automatic speech recognition for spoken language translation. Instead of conventional graphemes, we use phonemes as an intermediate speech representation. Starting with the acoustic model, we revise the cross-lingual transfer and propose a coarse-to-fine method providing further speed-up and performance boost. Further, we review the translation model. We experiment with source and target encoding, boosting the robustness by utilizing the fine-tuning and transfer across ASR and SLT. We empirically document that this conventional setup with an alternative representation not only performs well on standard test sets but also provides robust transcripts and translations on challenging (e.g., non-native) test sets. Notably, our ASR system outperforms commercial ASR systems. 1

http://www.nusl.cz/ntk/nusl-416019

Identifer	oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:416019
Date	January 2020
Creators	Polák, Peter
Contributors	Bojar, Ondřej, Peterek, Nino
Source Sets	Czech ETDs
Language	English
Detected Language	English
Type	info:eu-repo/semantics/masterThesis
Rights	info:eu-repo/semantics/restrictedAccess

Page generated in 0.0023 seconds

Strojový překlad mluvené řeči přes fonetickou reprezentaci zdrojové řeči / Spoken Language Translation via Phoneme Representation of the Source Language

Description

Links & Downloads

Tags

Additional Fields