Pattern mining is a knowledge discovery task which is useful for finding interesting data characteristics. Existing mining techniques sometimes suffer from limited performance in challenging situations, such as when finding patterns in high-dimensional datasets. Binary Decision Diagrams and their variants are a compact and efficient graph data structure for representing and manipulating boolean functions and they are potentially attractive for solving many problems in pattern mining. This thesis explores techniques for the use of binary decision diagrams for mining both simple and complex types of patterns. / Firstly, we investigate the use of Binary Decision Diagrams for mining the fundamental types of patterns. These include frequent patterns, also known as frequent itemsets. We introduce a structure called the Weighted Zero-suppressed Binary Decision Diagram and evaluate its use on high dimensional data. This type of Decision Diagram is extremely useful for re-using intermediate patterns during computation. / Secondly, we study the problem of mining patterns in sequential databases. Here, we introduce a new structure called the Sequence Binary Decision Diagram, which can be used for mining frequent subsequences. We show that our technique is competitive with the state of the art and identify situations where it is superior. / Thirdly, we show how Weighted Zero-suppressed Binary Decision Diagrams can be used for discovering new and complex types of patterns. We introduce new types of highly expressive patterns for capturing contrasts, which express disjunctions of attribute values. Moreover, to investigate the usefulness of disjunctive patterns for knowledge discovery, we employ a statistical methodology for testing their significance, and study their use for solving classification problems. Our findings show that classifiers based on significant disjunctive patterns can be more robust than those which are only based on simple patterns. / Finally, we introduce patterns for capturing second-order differences between two groups of classes, which can provide useful insights for human experts. Again, we show how binary decision diagrams can be deployed for efficiently discovering this type of knowledge. / In summary, we demonstrate that Binary Decision Diagrams, are a powerful and scalable tool in pattern mining. We believe their use is very promising for a range of current and future tasks in the data mining context.
Identifer | oai:union.ndltd.org:ADTP/245446 |
Date | January 2009 |
Creators | Loekito, E. |
Source Sets | Australiasian Digital Theses Program |
Language | English |
Detected Language | English |
Rights | Terms and Conditions: Copyright in works deposited in the University of Melbourne Eprints Repository (UMER) is retained by the copyright owner. The work may not be altered without permission from the copyright owner. Readers may only, download, print, and save electronic copies of whole works for their own personal non-commercial use. Any use that exceeds these limits requires permission from the copyright owner. Attribution is essential when quoting or paraphrasing from these works., Open Access |
Page generated in 0.0019 seconds