This thesis deals with automatic syntactic analysis of natural language text, also known as parsing. The parsing approach is data-driven, which means that parsers are constructed by means of machine learning, looking at training data in the form of annotated natural language sentences. The syntactic framework used in the thesis is dependency-based. Robustness is one of the characteristics of the data-driven approaches investigated here. The overall aim of this thesis is to maintain robustness while increasing accuracy.
The content of the thesis falls naturally into two tracks, a transformation track and a combination track. The first type of transformation investigated is called pseudo-projective, because it enables strictly projective dependency parsers to recover non-projective dependency relations. Informally, a non-projective dependency tree contains crossing binary directed relations when drawn above the sentence. Experimental results show that pseudo-projective transformations can improve accuracy significantly for a range of languages. The second type of transformation aims to facilitate the processing of specific linguistic constructions such as coordination and verb groups. Experimental results again show a positive effect on parsing accuracy for several languages, often greater than for the pseudo-projective transformations. However, the improvement from these transformations depends on the internal structure of the base parser, which is not the case for the pseudo-projective transformations.
The combination track compares various approaches for combining data-driven dependency parsers, again as a means of improving accuracy. As different parsers have different strengths and weaknesses, making parsers collaborate in order to find one single syntactic analysis may result in higher accuracy than any of the syntactic analyzers can produce by itself. The experimental results show that accuracy improves across languages, given that appropriate parsers are combined.
The thesis ends with an attempt to combine the two tracks, showing that combining parsers with different tree transformations also increases accuracy. Moreover, this experiment indicates that high diversity among a small set of parsers is much more important than a large number of parsers with low diversity.
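The informal definition of non-projectivity given above (crossing relations when the tree is drawn above the sentence) can be illustrated with a minimal sketch. It is not taken from the thesis: the head-list encoding, the artificial root at position 0, and the function names `crossing_arcs` and `is_projective` are assumptions made here for illustration only.

```python
from itertools import combinations


def crossing_arcs(heads):
    """Return the pairs of arcs that cross when drawn above the sentence.

    Assumed encoding (hypothetical, not from the thesis): heads[i] is the
    position of the head of token i + 1, and 0 stands for an artificial
    root placed to the left of the first token.
    """
    # Represent each dependency as an undirected span (left end, right end).
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    crossings = []
    for (a, b), (c, d) in combinations(arcs, 2):
        # Two spans cross if exactly one endpoint of one span lies
        # strictly inside the other span.
        if a < c < b < d or c < a < d < b:
            crossings.append(((a, b), (c, d)))
    return crossings


def is_projective(heads):
    """A tree counts as projective here if no two of its arcs cross."""
    return not crossing_arcs(heads)
```

Under this encoding, `is_projective([2, 0, 2])` holds for a three-token sentence whose second token is the root, while `[3, 4, 0, 3]` is reported as non-projective because the arc between tokens 1 and 3 crosses the arc between tokens 2 and 4. A strictly projective parser cannot produce the latter tree directly, which is the gap the pseudo-projective transformations are meant to close.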
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:vxu-6776 |
Date | January 2009 |
Creators | Nilsson, Jens |
Publisher | Växjö universitet, Matematiska och systemtekniska institutionen, Växjö : Växjö University Press |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Doctoral thesis, monograph, info:eu-repo/semantics/doctoralThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Relation | Acta Wexionensia, 1404-4307 ; 183/2009 |