
Tagging a Morphologically Complex Language Using an Averaged Perceptron Tagger: The Case of Icelandic

In this paper, we experiment with using Stagger, an open-source implementation of an Averaged Perceptron tagger, to tag Icelandic, a morphologically complex language. By adding language-specific linguistic features and using IceMorphy, an unknown word guesser, we obtain state-of-the-art tagging accuracy of 92.82%. Furthermore, by adding data from a morphological database, and word embeddings induced from an unannotated corpus, the accuracy increases to 93.84%. This is equivalent to an error reduction of 5.5%, compared to the previously best tagger for Icelandic, consisting of linguistic rules and a Hidden Markov Model.
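The 5.5% figure is measured against the previous best tagger, whose accuracy is not given in this record, and not against the 92.82% result. As a minimal sketch, assuming "error reduction" means the usual relative reduction in error rate, the arithmetic can be checked as follows. The function names are illustrative, and the back-calculated baseline is only approximate because it starts from the rounded 5.5%.

```python
def relative_error_reduction(baseline_acc, new_acc):
    """Fraction of the baseline's errors removed (accuracies in percent)."""
    baseline_err = 100.0 - baseline_acc
    new_err = 100.0 - new_acc
    return (baseline_err - new_err) / baseline_err


def implied_baseline_accuracy(new_acc, reduction):
    """Baseline accuracy implied by a new accuracy and a relative error reduction."""
    new_err = 100.0 - new_acc
    return 100.0 - new_err / (1.0 - reduction)


# Assumption: the 5.5% reduction is relative to the previous best tagger,
# whose accuracy is not stated here; this recovers an approximate value.
implied = implied_baseline_accuracy(93.84, 0.055)
print(f"implied previous-best accuracy: {implied:.2f}%")

# For comparison, the gain between the two results reported in the abstract.
internal = relative_error_reduction(92.82, 93.84)
print(f"error reduction from 92.82% to 93.84%: {internal:.1%}")
```

The second figure describes the gain from adding the morphological database and word embeddings, so it is a different quantity from the 5.5% quoted in the abstract.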

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:su-90304
Date: January 2013
Creators: Östling, Robert
Publisher: Stockholms universitet, Avdelningen för datorlingvistik, Linköping University Electronic Press, Linköpings universitet
Source Sets: DiVA Archive at Uppsala University
Language: English
Detected Language: English
Type: Conference paper, info:eu-repo/semantics/conferenceObject, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
Relation: Linköping Electronic Conference Proceedings, 1650-3740, Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), p. 105-119
