The thesis aims to give an explicit description of the declension of Old Czech common nouns, with a view to its application in a tool for automatic morphological analysis of (digitized) Old Czech texts. The description is intended to serve as a basis for the automatic generation of word forms (together with their morphological information and lemma), which will then be used to assign morphological categories (gender, number, case) and a lemma to word forms occurring in digitized Old Czech texts. The thesis thus lays the groundwork for the first step in transforming the text banks that currently exist for the Old Czech period into an Old Czech corpus offering more possibilities for linguistic research. The Old Czech period is defined as running from the beginning of the 14th century (more precisely, from the time when the first coherent texts written in Czech appeared) to approximately the end of the 15th century. Nouns were chosen for this work because they make up approximately 30% of text in present-day Czech (the highest share of any part of speech). Old Czech texts are considered only in transcribed form (based on the transcription rules used in the Old Czech Text Bank developed at the Institute of the Czech Language of the Academy of Sciences of the Czech Republic). On the one...
Identifier | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:357844 |
Date | January 2017 |
Creators | Synková, Pavlína |
Contributors | Oliva, Karel, Petkevič, Vladimír, Vepřek, Miroslav |
Source Sets | Czech ETDs |
Language | Czech |
Detected Language | English |
Type | info:eu-repo/semantics/doctoralThesis |
Rights | info:eu-repo/semantics/restrictedAccess |