
A Subset-Lattice Algorithm for Mining Maximal Frequent Itemsets over a Data Stream Sliding Window

Wang, Syuan-Yun 09 July 2012
Mining association rules online in data streams is an important area of data mining, and mining maximal frequent itemsets is an important problem within it. A frequent itemset is called maximal if it is not a proper subset of any other frequent itemset. The collection of all such itemsets is referred to as the set of maximal frequent itemsets. Because data streams are continuous, high-speed, unbounded, and real-time, the data can be scanned only once. Consequently, earlier algorithms for mining maximal frequent itemsets in traditional databases are not suitable for data streams. Furthermore, many applications are interested in recent data, and the sliding window is the model that deals with the most recent portion of a data stream. The sliding window model requires a window size to be specified. One algorithm for mining maximal frequent itemsets under the sliding window model is the MFIoSSW algorithm. MFIoSSW uses a compact, array-based structure A to store the maximal frequent itemsets and other helpful itemsets. However, it takes a long time to mine the maximal frequent itemsets: when a new transaction arrives, too many comparisons between the new transaction and the old transactions are needed. Therefore, in this project, we propose a sliding window approach, the Subset-Lattice algorithm. We use a lattice structure to store information about the transactions. The lattice records the relationship between each child node and its parent node, and each node stores an itemset and its support. When a new transaction arrives, we consider five relations: (1) equivalent, (2) subset, (3) intersection, (4) empty set, and (5) superset. With these five relations, we can add new transactions and update the supports efficiently.
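
The definitions in the abstract can be illustrated with a minimal Python sketch. It is not the Subset-Lattice algorithm or MFIoSSW: the abstract does not describe the lattice update procedure, so the miner below is a brute-force reference that simply applies the definition of a maximal frequent itemset to the current window. The function names (relation, maximal_frequent_itemsets), the absolute-count min_support parameter, and the sample transactions are hypothetical. The direction of "subset" and "superset" (new transaction relative to the stored itemset), and the reading of "intersection" as partial overlap and "empty set" as no shared items, are assumptions.

```python
from collections import deque
from itertools import combinations


def relation(new, node):
    """Classify a new transaction against a stored itemset.

    Assumption: relations are read from the new transaction's side,
    e.g. "subset" means `new` is a proper subset of `node`.
    """
    new, node = frozenset(new), frozenset(node)
    if new == node:
        return "equivalent"
    if new < node:
        return "subset"
    if new > node:
        return "superset"
    if new & node:
        return "intersection"  # partial overlap, neither contains the other
    return "empty set"         # no items in common


def maximal_frequent_itemsets(window, min_support):
    """Brute-force reference: test every candidate itemset in the window.

    Exponential in the number of distinct items; for illustration only.
    """
    items = sorted(set().union(*window))
    frequent = []
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            support = sum(1 for t in window if cand <= t)
            if support >= min_support:
                frequent.append(cand)
    # Keep only itemsets that have no frequent proper superset.
    return [f for f in frequent if not any(f < g for g in frequent)]


# A deque with maxlen drops the oldest transaction when a new one arrives.
window = deque(maxlen=3)
for transaction in [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}, {"a", "b", "d"}]:
    window.append(frozenset(transaction))

result = maximal_frequent_itemsets(window, min_support=2)
print([sorted(m) for m in result])  # [['a', 'b']]
```

In this sample, the window holds the last three transactions, so {a}, {b}, and {a, b} are frequent at a support threshold of 2, and only {a, b} is maximal. Rescanning the whole window on every arrival is the kind of repeated comparison cost that the abstract says the lattice structure is meant to reduce.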
