Process of text data clustering can be used to analysis, navigation and structure large sets of texts or hypertext documents. The basic idea is to group the documents into a set of clusters on the basis of their similarity. The well-known methods of text clustering, however, do not really solve the specific problems of text clustering like high dimensionality of the input data, very large size of the databases and understandability of the cluster description. This work deals with mentioned problems and describes the modern method of text data clustering based on the use of frequent term sets, which tries to solve deficiencies of other clustering methods.
Identifer | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:237188 |
Date | January 2010 |
Creators | Leixner, Petr |
Contributors | Burgetová, Ivana, Bartík, Vladimír |
Publisher | Vysoké učení technické v Brně. Fakulta informačních technologií |
Source Sets | Czech ETDs |
Language | Czech |
Detected Language | English |
Type | info:eu-repo/semantics/masterThesis |
Rights | info:eu-repo/semantics/restrictedAccess |
Page generated in 0.0089 seconds