TEXT MINER FOR HYPERGRAPHS USING OUTPUT SPACE SAMPLING

Indiana University-Purdue University Indianapolis (IUPUI)
Text mining is the process of extracting high-quality knowledge from the analysis of textual data. Rapidly growing interest in research across many fields is producing an overwhelming amount of research literature. This literature is a vast source of knowledge, but its sheer volume makes it practically impossible for researchers to extract that knowledge manually. Hence there is a need for an automated approach to extracting knowledge from unstructured data, and text mining is the right approach for automated extraction of knowledge from textual data.
The objective of this thesis is to mine documents from the research literature to find novel associations among the entities appearing in that literature, using incremental mining. Traditional text mining approaches provide binary associations, but it is important to understand the context in which these associations occur: for example, entity A is associated with entity B in the context of entity C. Such contexts can be viewed as multi-way associations among the entities, which are represented by a hypergraph. This thesis describes extracting such multi-way associations using frequent itemset mining, and applies a new concept called output space sampling to extract them in a space- and time-efficient manner. We incorporated the concept of personalization into output space sampling so that users can specify their interests as the frequent hyper-associations are extracted from the text.
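
As a rough illustration of the idea of treating frequent entity sets as hyperedges, the following Python sketch counts how often sets of entities co-occur across documents and keeps those that meet a support threshold. It is an assumption-laden toy, not the thesis's algorithm: the function name `frequent_hyperedges`, the parameters `min_support` and `max_size`, and the sample documents are all hypothetical, and the code enumerates candidates exhaustively rather than using output space sampling, incremental mining, or personalization.

```python
from collections import Counter
from itertools import combinations


def frequent_hyperedges(documents, min_support, max_size=3):
    """Return {frozenset(entities): support} for entity sets of size
    2..max_size that co-occur in at least min_support documents.

    Hypothetical helper for illustration only; it enumerates every
    candidate set, so its cost grows combinatorially with document size.
    """
    counts = Counter()
    for entities in documents:
        unique = sorted(set(entities))  # ignore repeated mentions in a document
        for size in range(2, max_size + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    return {edge: n for edge, n in counts.items() if n >= min_support}


# Made-up documents, each reduced to the set of entities it mentions.
docs = [
    {"A", "B", "C"},
    {"A", "B", "C"},
    {"A", "B", "D"},
    {"B", "D"},
]

edges = frequent_hyperedges(docs, min_support=2)
for edge, support in sorted(edges.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(edge), support)
```

In this toy data the set {A, B, C} is kept as a three-way association, which corresponds to the abstract's example of A being associated with B in the context of C, while {A, B, D} falls below the threshold.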

Identifier: oai:union.ndltd.org:IUPUI/oai:scholarworks.iupui.edu:1805/2620
Date: 16 August 2011
Creators: Tirupattur, Naveen
Contributors: Mukhopadhyay, Snehasis; Fang, Shiaofen; Xia, Yuni
Source Sets: Indiana University-Purdue University Indianapolis
Language: en_US
Detected Language: English
