We present a package that provides a general framework, including tools and algorithms, for text mining in R using the S4 class system. Using this package and the kernlab R package, we explore the use of kernel methods for clustering (e.g., kernel k-means and spectral clustering) on a set of text documents, using string kernels. We compare these methods to a more traditional clustering technique, k-means on a bag-of-words representation of the text, and evaluate the viability of kernel-based methods as a text clustering technique. (author's abstract) / Series: Research Report Series / Department of Statistics and Mathematics
Identifier | oai:union.ndltd.org:VIENNA/oai:epub.wu-wien.ac.at:epub-wu-01_96d |
Date | January 2006 |
Creators | Karatzoglou, Alexandros; Feinerer, Ingo |
Publisher | Department of Statistics and Mathematics, WU Vienna University of Economics and Business |
Source Sets | Wirtschaftsuniversität Wien |
Language | English |
Detected Language | English |
Type | Paper, NonPeerReviewed |
Format | application/pdf |
Relation | http://epub.wu.ac.at/1002/ |