
Použití hlubokých kontextualizovaných slovních reprezentací založených na znacích pro neuronové sekvenční značkování / Deep contextualized word embeddings from character language models for neural sequence labeling

A family of Natural Language Processing (NLP) tasks such as part-of-speech (PoS) tagging, Named Entity Recognition (NER), and Multiword Expression (MWE) identification all involve assigning labels to sequences of words in text (sequence labeling). Most modern machine learning approaches to sequence labeling utilize word embeddings, learned representations of text, in which words with similar meanings have similar representations. Quite recently, contextualized word embeddings have garnered much attention because, unlike pretrained context-insensitive embeddings such as word2vec, they are able to capture word meaning in context. In this thesis, I evaluate the performance of different embedding setups (context-sensitive, context-insensitive word, as well as task-specific word, character, lemma, and PoS) on the three above-mentioned sequence labeling tasks using a deep learning model (BiLSTM) and Portuguese datasets.

Identifier: oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:393167
Date: January 2019
Creators: Lief, Eric
Contributors: Pecina, Pavel; Kocmi, Tom
Source Sets: Czech ETDs
Language: English
Detected Language: English
Type: info:eu-repo/semantics/masterThesis
Rights: info:eu-repo/semantics/restrictedAccess
