• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

An Obstruction-Check Approach to Mining Closed Sequential Patterns in Data Streams

Chin, Tsz-lin 21 June 2010 (has links)
Online mining sequential patterns over data streams is an important problem in data mining. There are many applications of using sequential patterns in data streams, such as market analysis, network security, sensor networks and web track- ing. Previous studies have shown mining closed patterns provides more benefits than mining the complete set of frequent patterns, since closed pattern mining leads to compact results. A sequential pattern is closed if the sequential pattern does not have any supersequence which has the same support. Chang et al. proposed a time- based sliding window model. The time-based sliding window has two features, the new item is inserted in front of a sequence, and the obsolete item is removed from of tail of a sequence. For solving the problem of data mining in the time-based sliding window, Chang et al. proposed an algorithm called SeqStream. It uses a data struc- ture IST (Inverse Closed Sequence Tree) to keep the result. IST can incrementally be updated by the SeqStream algorithm. Although the SeqStream algorithm has used the technique of dividing the time-based sliding window to speed up the updating of IST, the SeqStream algorithm still scans the sliding window many times when IST needs to be updated. In this thesis, we propose an obstruction-check approach to maintain the result of closed sequential patterns. Our approach is designed based on the lattice structure. The feature of the lattice structure is that the parent is a supersequence of its children. By utilizing this feature, we decide the obstruction link between the parent and child if their support is the same. If a node does not have any obstruction link parent, the node is a closed sequential pattern. Then we can utilize this feature to locally travel the lattice structure. Moreover, we can fully utilize the features of the time-based sliding window model to locally travel the lat- tice structure. Based on the lattice structure, we propose the EULB (Exact Update based on Lattice structure with Bit stream)-Lattice algorithm. The EULB-Lattice algorithm is an exact method for mining data streams. We record additional informa- tion, instead of scanning the entire sliding window. We conduct several experiments using different synthetic data sets. The simulation results show that the proposed algorithm outperforms the SeqStream algorithm.

Page generated in 0.086 seconds