Global ETD Search

Return to search

Automatic morphological analysis of L-verbs in Palula / Automatisk morfologisk analys av L-verb i Palula

This study is exploring the possibilities of automatic morphological analysis of L-verbs in the Palula language by the help from Finite-state technology and two-level morphology along with supervised machine learning. The type of machine learning used are neural Sequence to Sequence models. A morphological transducer is made with the Helsinki Finite-State Transducer Technology, HFST, toolkit covering the L-verbs of the Palula Language. Several Sequence to Sequence models are trained on sets of L-verbs along with morphological tagging annotation. One model is trained with a small amount of manually annotated data and four models are trained with different amounts of training examples generated by the Finite-State Transducer. The efficiency and accuracy of these methods are investigated. The Sequence to Sequence model trained on solely manually annotated data did not perform as well as the other models. A Sequence to Sequence model trained with training examples generated by the transducer performed the best recall, accuracy and F1-score, while the Finite-State Transducer performed the best precision score. / Denna studie undersöker möjligheterna för en automatisk morfologisk analys av L-verb i språket Palula med hjälp av finit tillståndsteknik och två-nivå-morfologi samt övervakad maskininlärning. Den typ av maskininlärning som används i studien är neurala Sekvens till Sekvens-modeller. En morfologisk transduktor är skapad med verktyget Helsinki Finite-State Transducer Technology, HFST, som täcker L-verben i Palula. Flera Sekvens till Sekvens-modeller tränas på set av L-verb med morfologisk taggningsannotation. En modell tränas på ett litet set av manuellt annoterade data och fyra modeller tränas på olika mängder träningsdata som genererats av den finita tillstånds-transduktorn. Effektiviteten och noggrannheten för dessa modeller undersöks. Sekvens till Sekvens-modellen som tränats med bara manuellt annoterade data presterade inte lika bra som de andra modellerna i studien. En Sekvens till Sekvens-modell tränad med träningsdata bestående av genereringar producerade av transduktorn gav bästa svarsfrekvens, noggrannhet och F1-poäng, medan den finita tillstånds-transduktorn gav bästa precision.

http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-182528

Finite-State Transducer

Finite-State Transducer

General Language Studies and Linguistics

Identifer	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:su-182528
Date	January 2020
Creators	Wallerö, Emma
Publisher	Stockholms universitet, Institutionen för lingvistik
Source Sets	DiVA Archive at Upsalla University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess

Page generated in 0.0022 seconds

Automatic morphological analysis of L-verbs in Palula / Automatisk morfologisk analys av L-verb i Palula

Description

Links & Downloads

Tags

Additional Fields