Return to search

Visual Interactive Labeling of Large Multimedia News Corpora

The semantic annotation of large multimedia corpora is essential for numerous tasks. Be it for the training of classification algorithms, efficient content retrieval, or for analytical reasoning, appropriate labels are often the first necessity before automatic processing becomes efficient. However, manual labeling of large datasets is time-consuming and tedious. Hence, we present a new visual approach for labeling and retrieval of reports in multimedia news corpora. It combines automatic classifier training based on caption text from news reports with human interpretation to ease the annotation process. In our approach, users can initialize labels with keyword queries and iteratively annotate examples to train a classifier. The proposed visualization displays representative results in an overview that allows to follow different annotation strategies (e.g., active learning) and assess the quality of the classifier. Based on a usage scenario, we demonstrate the successful application of our approach. Therein, users label several topics which interest them and retrieve related documents with high confidence from three years of news reports.

Identiferoai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:32804
Date25 January 2019
CreatorsHan, Qi, John, Markus, Kurzhals, Kuno, Messner, Johannes, Ertl, Thomas
Source SetsHochschulschriftenserver (HSSS) der SLUB Dresden
LanguageEnglish
Detected LanguageEnglish
Typedoc-type:conferenceObject, info:eu-repo/semantics/conferenceObject, doc-type:Text
Rightsinfo:eu-repo/semantics/openAccess
Relationurn:nbn:de:bsz:15-qucosa2-327974, qucosa:32797

Page generated in 0.002 seconds