Global ETD Search

Textová klasifikace s limitovanými trénovacími daty / Text classification with limited training data

The aim of this thesis is to minimize the manual work needed to create training data for text classification tasks. Several research areas, including weak supervision, interactive learning and transfer learning, explore how to reduce the effort of creating training data. We combine ideas from the available literature to design a comprehensive text classification framework that employs keyword-based labeling instead of traditional text annotation. Keyword-based labeling assigns labels to texts based on the keywords they contain that are highly correlated with individual classification labels. As noted repeatedly in previous work, coming up with many new keywords is challenging for humans. To address this issue, we propose an interactive keyword labeler that uses word similarity to guide the user in keyword labeling. To verify the effectiveness of our novel approach, we implement a minimum viable prototype of the designed framework and use it to perform a user study on a multi-label restaurant review classification problem.
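The two ideas in the abstract, keyword-based labeling and similarity-guided keyword suggestion, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the keyword sets and the three-dimensional word vectors below are all hypothetical, chosen only to make the example self-contained. A real system would presumably use pretrained word embeddings and a more careful tokenizer.

```python
import math
import re


def label_text(text, keywords_by_label):
    """Return every label that has at least one keyword occurring in the text."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return {
        label
        for label, keywords in keywords_by_label.items()
        if tokens & keywords
    }


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def suggest_keywords(seed, vectors, top_n=2):
    """Rank all other words by similarity to the seed keyword."""
    seed_vec = vectors[seed]
    scored = [
        (cosine(seed_vec, vec), word)
        for word, vec in vectors.items()
        if word != seed
    ]
    scored.sort(reverse=True)
    return [word for _, word in scored[:top_n]]


# Hypothetical keyword sets for a restaurant review task (assumption).
KEYWORDS = {
    "food": {"pizza", "delicious"},
    "service": {"waiter", "rude"},
}

# Toy word vectors invented for illustration; not real embeddings.
VECTORS = {
    "waiter": [0.9, 0.1, 0.0],
    "server": [0.85, 0.15, 0.05],
    "staff": [0.8, 0.2, 0.1],
    "pizza": [0.0, 0.9, 0.3],
    "pasta": [0.05, 0.85, 0.35],
}

review = "The pizza was delicious but the waiter was rude."
print(label_text(review, KEYWORDS))
print(suggest_keywords("waiter", VECTORS))
```

In this sketch a user who has already entered "waiter" as a keyword for the service label would be shown its nearest neighbours as candidates to accept or reject, which is one plausible way to ease the difficulty of inventing keywords unaided.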

http://www.nusl.cz/ntk/nusl-448225

Identifier	oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:448225
Date	January 2021
Creators	Laitoch, Petr
Contributors	Hana, Jiří; Vidová Hladká, Barbora
Source Sets	Czech ETDs
Language	English
Detected Language	English
Type	info:eu-repo/semantics/masterThesis
Rights	info:eu-repo/semantics/restrictedAccess
