Reprezentace textu a její vliv na kategorizaci / Representation of Text and Its Influence on Categorization

This thesis deals with the machine processing of textual data. The theoretical part describes issues in natural language processing and introduces different approaches to text pre-processing and representation. It also focuses on the use of N-grams as features for document representation and describes some algorithms for extracting them. The next part outlines the classification methods used. In the practical part, an application for pre-processing text and creating different textual data representations is designed and implemented. The experiments analyse the influence of these representations on the accuracy of classification algorithms.

Identifier: oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:237263
Date: January 2010
Creators: Šabatka, Ondřej
Contributors: Chmelař, Petr; Bartík, Vladimír
Publisher: Vysoké učení technické v Brně. Fakulta informačních technologií
Source Sets: Czech ETDs
Language: Czech
Detected Language: English
Type: info:eu-repo/semantics/masterThesis
Rights: info:eu-repo/semantics/restrictedAccess