Global ETD Search

Učící se analyzátor audio-vizuálních záznamů / Continously Learning Analyser of Audio-Visual Recordings

This thesis introduces a tool for analysing audiovisual recordings. The tool uses the audio and closed captions supplied by the user to prepare a text annotation. The annotation contains a transcript of the show based on the closed captions. In addition, speaker diarization is performed to mark who spoke when. The diarization is performed by a third-party library, which is evaluated on data from the DIALOG corpus and whose inner workings are described. Kaldi, a speech recognition toolkit, is used to assign each portion of the text to the corresponding section of the recording. Furthermore, the thesis contains an overview of how closed captions are created, an overview of speech corpora creation, and a brief review of the literature on recording analysis.

http://www.nusl.cz/ntk/nusl-352610

Identifier	oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:352610
Date	January 2016
Creators	Košarko, Ondřej
Contributors	Peterek, Nino; Klusáček, David
Source Sets	Czech ETDs
Language	Czech
Detected Language	English
Type	info:eu-repo/semantics/masterThesis
Rights	info:eu-repo/semantics/restrictedAccess
