This thesis is a study in corpus linguistics. It examines two applications that process data from the DeReKo corpus using a corpus-driven approach: co-occurrence analysis and the Co-occurrence database. The aim is to evaluate whether the results obtained by co-occurrence analysis on the current scope of DeReKo differ from the results of the Co-occurrence database, which was built on a smaller corpus. In addition, the thesis offers illustrative examples of the use of both applications and evaluates their effectiveness depending on the purpose of the research. The theoretical part deals with the terminology of corpus linguistics and with the corpora mentioned, which serve as the basis for the practical part. The empirical part consists of analyses of randomly picked words (one from each word class) in both applications. The results show that the data obtained with the Co-occurrence database and with co-occurrence analysis differ in many respects, confirming the hypothesis that corpus size plays a crucial role in the results. Both applications have their advantages and disadvantages. The thesis offers a comprehensive overview and by doing so it...
Identifier | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:369912 |
Date | January 2017 |
Creators | Křesťanová, Jitka |
Contributors | Hejhalová, Věra, Šemelík, Martin |
Source Sets | Czech ETDs |
Language | German |
Detected Language | English |
Type | info:eu-repo/semantics/masterThesis |
Rights | info:eu-repo/semantics/restrictedAccess |