A Subset-Lattice Algorithm for Mining Maximal Frequent Itemsets over a Data Stream Sliding Window

Online mining of association rules in data streams is an important field of data
mining, and within it, mining maximal frequent itemsets is an important problem.
A frequent itemset is called maximal if it is not a subset of any other frequent
itemset. We refer to the set of all such itemsets as the maximal frequent itemsets.
Because data streams are continuous, high-speed, unbounded, and real-time, the
data can be scanned only once. Therefore, earlier algorithms for mining maximal
frequent itemsets in traditional databases are not suitable for data streams.
Furthermore, many applications are interested only in recent data, and the sliding
window model is designed to handle the most recent part of a data stream. The
sliding window model requires a window size. One of the algorithms for mining
maximal frequent itemsets based on the sliding window model is the MFIoSSW
algorithm. The MFIoSSW algorithm uses a compact, array-based structure A to store
the maximal frequent itemsets and other helpful itemsets. However, it takes a long
time to mine the maximal frequent itemsets: when a new transaction arrives, too
many comparisons are made between the new transaction and the old transactions.
Therefore, in this project, we propose a sliding window approach, the Subset-Lattice
algorithm. We use a lattice structure to store the information of the transactions.
The lattice stores the relationship between each child node and its parent node. In
each node, we record the itemset and its support. When a new transaction arrives,
we consider five relations between it and the stored itemsets: (1) equivalent,
(2) subset, (3) intersection, (4) empty set, and (5) superset. With these five
relations, we can add new transactions and update the supports efficiently.
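
The definition of a maximal itemset and the five relations above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the names `classify_relation`, `maximal_itemsets`, and `Relation` are hypothetical, itemsets are modeled as `frozenset` values, and the direction of "subset" and "superset" (the new transaction relative to the stored node itemset) is an assumption, since the abstract does not state it. "Intersection" is taken to mean a partial overlap, and "empty set" to mean no common items.

```python
from enum import Enum
from typing import FrozenSet, Iterable, List

Itemset = FrozenSet[str]


class Relation(Enum):
    """The five relations named in the abstract (labels are illustrative)."""
    EQUIVALENT = "equivalent"
    SUBSET = "subset"
    INTERSECTION = "intersection"
    EMPTY = "empty set"
    SUPERSET = "superset"


def classify_relation(transaction: Itemset, node_itemset: Itemset) -> Relation:
    """Classify a new transaction against one stored node itemset.

    Assumption: the relation is read from the transaction's point of view,
    e.g. SUBSET means the transaction is a proper subset of the node itemset.
    """
    if transaction == node_itemset:
        return Relation.EQUIVALENT
    if transaction < node_itemset:      # proper subset
        return Relation.SUBSET
    if transaction > node_itemset:      # proper superset
        return Relation.SUPERSET
    if transaction & node_itemset:      # partial overlap only
        return Relation.INTERSECTION
    return Relation.EMPTY               # no items in common


def maximal_itemsets(frequent: Iterable[Itemset]) -> List[Itemset]:
    """Keep only the itemsets that are not a proper subset of another one."""
    unique = set(frequent)
    return [x for x in unique if not any(x < y for y in unique)]
```

For example, under these assumptions a transaction {a, b, d} compared with a node holding {a, b, c} falls into the intersection case, and among the frequent itemsets {a}, {b}, {a, b}, and {c}, only {a, b} and {c} are maximal. How the lattice is updated in each of the five cases is described in the thesis itself and is not reproduced here.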

Identifier: oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0709112-163748
Date: 09 July 2012
Creators: Wang, Syuan-Yun
Contributors: Ye-In Chang, Tei-Wei Kuo, Gen-Huey Chen, Chien-I Lee
Publisher: NSYSU
Source Sets: NSYSU Electronic Thesis and Dissertation Archive
Language: English
Detected Language: English
Type: text
Format: application/pdf
Source: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0709112-163748
Rights: user_define, Copyright information available at source archive
