The goal of the diploma thesis is to design, implement and evaluate classifiers for automatic classification of semantic patterns of English verbs according to a pattern lexicon that draws on the Corpus Pattern Analysis. We use a pilot collection of 30 sample English verbs as training and test data sets. We employ standard methods of machine learning. In our experiments we use decision trees, k-nearest neighbourghs (kNN), support vector machines (SVM) and Adaboost algorithms. Among other things we concentrate on feature design and selection. We experiment with both morpho-syntactic and semantic features. Our results show that the morpho-syntactic features are the most important for statistically-driven semantic disambiguation. Nevertheless, for some verbs the use of semantic features plays an important role.
Identifer | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:304092 |
Date | January 2012 |
Creators | Kríž, Vincent |
Contributors | Holub, Martin, Bojar, Ondřej |
Source Sets | Czech ETDs |
Language | Slovak |
Detected Language | English |
Type | info:eu-repo/semantics/masterThesis |
Rights | info:eu-repo/semantics/restrictedAccess |
Page generated in 0.0022 seconds