
A Graph-based Approach for Semantic Data Mining

Liu, Haishan. January 2012
Data mining is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data. It is widely acknowledged that the role of domain knowledge in the discovery process is essential. However, the synergy between domain knowledge and data mining is still at a rudimentary level. This motivates me to develop a framework for explicit incorporation of domain knowledge in a data mining system so that insights can be drawn from both data and domain knowledge. I call such technology "semantic data mining." Recent research in knowledge representation has led to mature standards such as the Web Ontology Language (OWL) by the W3C's Semantic Web initiative. Semantic Web ontologies have become a key technology for knowledge representation and processing. The OWL ontology language is built on the W3C's Resource Description Framework (RDF), which provides a simple model to describe information resources as a graph. On the other hand, there has been a surge of interest in tackling data mining problems where objects of interest can be best described as a graph of interrelated nodes. I observe that graph representations can serve as the interface between domain knowledge and data mining. Therefore, I explore a graph-based approach for modeling both knowledge and data and for analyzing the combined information source from which insight can be drawn systematically. In summary, I make three main contributions in this dissertation to achieve semantic data mining. First, I develop an information integration solution based on metaheuristic optimization for cases where data mining tasks require access to heterogeneous data sources. Second, I describe how a graph interface for both domain knowledge and data can be structured by employing the RDF model and its graph representations. Finally, I describe several graph-theoretic analysis approaches for mining the combined information source.
I showcase the utility of the proposed methods on finding semantically associated itemsets, a particular case of frequent pattern mining. I believe these contributions in semantic data mining can provide a novel and useful way to incorporate domain knowledge. This dissertation includes published and unpublished coauthored material.
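
The abstract's central idea, that RDF-style triples let domain knowledge and data share one graph, can be illustrated with a minimal Python sketch. This is not the dissertation's method: the triples, the item and transaction names, and the use of breadth-first hop count as a rough proxy for semantic association are all assumptions made for illustration.

```python
from collections import defaultdict, deque

# Hypothetical domain knowledge as (subject, predicate, object) triples.
ONTOLOGY = [
    ("Espresso", "subClassOf", "Coffee"),
    ("Latte", "subClassOf", "Coffee"),
    ("Coffee", "subClassOf", "Beverage"),
    ("GreenTea", "subClassOf", "Tea"),
    ("Tea", "subClassOf", "Beverage"),
]

# Hypothetical transaction data expressed in the same triple form.
DATA = [
    ("t1", "contains", "Espresso"),
    ("t1", "contains", "Latte"),
    ("t2", "contains", "Espresso"),
    ("t2", "contains", "GreenTea"),
]


def build_graph(triples):
    """Build an undirected adjacency map over the subjects and objects."""
    adj = defaultdict(set)
    for subject, _predicate, obj in triples:
        adj[subject].add(obj)
        adj[obj].add(subject)
    return adj


def distance(adj, start, goal):
    """Return the breadth-first hop count from start to goal, or None."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        for neighbor in adj.get(node, ()):
            if neighbor == goal:
                return hops + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return None


knowledge_only = build_graph(ONTOLOGY)
combined = build_graph(ONTOLOGY + DATA)
```

In this toy example, `distance(knowledge_only, "Espresso", "GreenTea")` is 4 hops through the class hierarchy, while `distance(combined, "Espresso", "GreenTea")` drops to 2 because transaction `t2` links the two items directly. The point is only that a single graph exposes both kinds of evidence to the same analysis; a real system would likely use an RDF library and a more principled association measure.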
