Return to search

Metody stemmingu používané při dolování textu / Stemming Methods Used in Text Mining

The main theme of this master's thesis is a description of text mining. This document is specialized to English texts and their automatic data preprocessing. The main part of this thesis analyses various stemming algorithms (Lovins, Porter and Paice/Husk). Stemming is a procedure for automatic conflating semantically related terms together via the use of rule sets. Next part of this thesis describes design of an application for various types of stemming algorithms. Application is based on the Java platform with using of graphic library Swing and MVC architecture. Next chapter contains description of implementation of the application and stemming algorithms. In the last part of this master's thesis experiments with stemming algorithms and comparing the algorithm from viewpoint to the results of classification the text are described.

Identiferoai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:235547
Date January 2010
CreatorsAdámek, Tomáš
ContributorsChmelař, Petr, Bartík, Vladimír
PublisherVysoké učení technické v Brně. Fakulta informačních technologií
Source SetsCzech ETDs
LanguageCzech
Detected LanguageEnglish
Typeinfo:eu-repo/semantics/masterThesis
Rightsinfo:eu-repo/semantics/restrictedAccess

Page generated in 0.0038 seconds