Return to search

The Discovery and Retrieval of Temporal Rules in Interval Sequence Data

Data mining is increasingly becoming important tool in extracting interesting knowledge from large databases. Many industries are now using data mining tools for analysing their large collections of databases and making business decisions. Many data mining problems involve temporal aspects, with examples ranging from engineering to scientific research, finance and medicine. Temporal data mining is an extension of data mining which deals with temporal data. Mining temporal data poses more challenges than mining static data. While the analysis of static data sets often comes down to the question of data items, with temporal data there are many additional possible relations.

One of the tasks in temporal data mining is the pattern discovery task, whose objective is to discover time-dependent correlations, patterns or rules between events in large volumes of data. To date, most temporal pattern discovery research has focused on events existing at a point in time rather than over a temporal interval. In comparison to static rules, mining with respect to time points provides semantically richer rules. However, accommodating temporal intervals offers rules that are richer still.

This thesis addresses several issues related to the pattern discovery from interval sequence data. Despite its importance, this area of research has received relatively little attention and there are still many issues that need to be addressed.
Three main issues that this thesis considers include the definition of what constitutes an interesting pattern in interval sequence data, the efficient mining for patterns in the data, and the identification of interesting patterns from a large number of discovered patterns.

In order to deal with these issues, this thesis formulates the problem of discovering rules, which we term richer temporal association rules, from interval sequence databases. Furthermore, this thesis develops an efficient algorithm, ARMADA, for discovering richer temporal association rules. The algorithm does not require candidate generation. It utilizes a simple index, and only requires at most two database scans. In this thesis, a retrieval system is proposed to facilitate the selection of interesting rules from a set of discovered richer temporal association rules. To this end, a high-level query language specification, TAR-QL, is proposed to specify the criteria of the rules to be retrieved from the rule sets. Three low-level methods are developed to evaluate queries involving rule format conditions. In order to improve the performance of the methods, signature file based indexes are proposed. In addition, this thesis proposes the discovery of inter-transaction relative temporal association rules from event sequence databases.

Identiferoai:union.ndltd.org:ADTP/216452
Date January 2007
CreatorsWinarko, Edi, edwin@ugm.ac.id
PublisherFlinders University. Informatics and Engineering
Source SetsAustraliasian Digital Theses Program
LanguageEnglish
Detected LanguageEnglish
Rightshttp://www.flinders.edu.au/disclaimer/), Copyright Edi Winarko

Page generated in 0.0021 seconds