A current trend in computational linguistics and natural language processing is the development of multilingual tools for tasks such as information retrieval, summarization of documents in different languages, and machine translation, in all of which the resolution of anaphoric references plays a crucial role. This dissertation proposes an annotation scheme for creating corpus resources for linguistically based multilingual anaphora resolution. The scheme has been applied to the annotation of English and Italian data. Inter-annotator agreement studies show that the annotation scheme is reliable. The annotated corpora have been used for the anaphora resolution task, and the results have been compared with those obtained on well-known corpora. Finally, hand-annotated linguistic features have been used to support the anaphora resolution process. The results show that the proposed multilingual annotation scheme produces data useful for building anaphora resolution systems for languages with different grammatical and typological features, such as English and Italian.
Identifier | oai:union.ndltd.org:unitn.it/oai:iris.unitn.it:11572/367836 |
Date | January 2010 |
Creators | Rodriguez, Kepa Joseba |
Contributors | Rodriguez, Kepa Joseba, Poesio, Massimo |
Publisher | Università degli studi di Trento, place:TRENTO |
Source Sets | Università di Trento |
Language | English |
Detected Language | English |
Type | info:eu-repo/semantics/doctoralThesis |
Rights | info:eu-repo/semantics/openAccess |
Relation | firstpage:1, lastpage:109, numberofpages:109 |