This thesis analyzes the problem of finding frequent subsequences in large-volume sequences. A new probabilistic algorithm for mining frequent sequences (ProMFS) is proposed.
The introductory part of the thesis presents the main concepts involved in analyzing this problem. It introduces the Market Basket Data example, whose main idea is to find the most frequent sets of items selected by customers.
Several algorithms that address the problem of finding frequent sequences are also presented. The most popular are Apriori, which is based on the property that “any subset of a large item set must be large”; Eclat, whose main feature is that it dynamically processes each transaction online while maintaining 2-itemset counts; and GSP, which can be used to identify sets that are certainly not frequent.
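The Apriori property quoted above can be illustrated with a minimal sketch. It is not taken from the thesis: the function name `apriori`, the sample `baskets`, and the use of an absolute count as `min_support` are assumptions made for illustration only.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset seen in at least
    min_support transactions (min_support is an absolute count here)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (the Apriori property): a candidate can only be
        # frequent if every one of its (k-1)-subsets is frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical market-basket data, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent_sets = apriori(baskets, min_support=2)
```

In this sketch the three-item candidate is discarded without scanning the baskets, because one of its two-item subsets is already known to be infrequent.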
Building on the results of these algorithms, a new probabilistic algorithm for mining frequent sequences was implemented. The new algorithm is based on estimating the statistical characteristics of the main sequence. From these characteristics, a much shorter model sequence is generated and then analyzed with the GSP algorithm. The frequency of a subsequence in the main sequence is estimated from the results of applying GSP to the model sequence.
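The general idea of this pipeline can be sketched as follows. This is a rough illustration under stated assumptions, not the ProMFS algorithm itself: the abstract does not say which statistical characteristics are estimated, so the sketch assumes symbol frequencies and first-order transition counts, and it substitutes a naive count of contiguous subsequences for GSP. All function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def estimate_characteristics(seq):
    """Assumed characteristics: symbol counts and first-order transitions."""
    symbol_counts = Counter(seq)
    transitions = defaultdict(Counter)
    for a, b in zip(seq, seq[1:]):
        transitions[a][b] += 1
    return symbol_counts, transitions


def generate_model_sequence(symbol_counts, transitions, length, rng):
    """Sample a short model sequence that follows the estimated statistics."""
    symbols = list(symbol_counts)
    weights = [symbol_counts[s] for s in symbols]
    current = rng.choices(symbols, weights=weights)[0]
    model = [current]
    while len(model) < length:
        followers = transitions.get(current)
        if followers:
            options, w = zip(*followers.items())
            current = rng.choices(options, weights=w)[0]
        else:
            # No observed successor: fall back to overall symbol frequencies.
            current = rng.choices(symbols, weights=weights)[0]
        model.append(current)
    return model


def count_subsequences(seq, k):
    """Naive stand-in for GSP: count contiguous subsequences of length k."""
    return Counter(tuple(seq[i:i + k]) for i in range(len(seq) - k + 1))


def estimate_frequencies(main_seq, model_length, k, seed=0):
    """Estimate length-k subsequence counts in main_seq from a model sequence."""
    rng = random.Random(seed)
    symbol_counts, transitions = estimate_characteristics(main_seq)
    model = generate_model_sequence(symbol_counts, transitions, model_length, rng)
    # Scale counts by the ratio of window counts in the two sequences.
    scale = (len(main_seq) - k + 1) / (len(model) - k + 1)
    return {sub: c * scale for sub, c in count_subsequences(model, k).items()}


# Hypothetical main sequence, for illustration only.
main_sequence = list("ABC" * 300)
estimates = estimate_frequencies(main_sequence, model_length=30, k=2)
```

The expensive counting step runs only on the 30-element model sequence; the main sequence is scanned once to collect the statistics.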
The new probabilistic algorithm, implemented in the practical part, was tested in several experiments. Two programs were used: the first was written in Pascal, the second in Delphi...
Identifier | oai:union.ndltd.org:LABT_ETD/oai:elaba.lt:LT-eLABa-0001:E.02~2006~D_20060613_162922-58059 |
Date | 13 June 2006 |
Creators | Cibulskis, Žilvinas |
Contributors | Dzemyda, Gintautas, Kazlauskas, Kazys, Račienė, Jurga, Šaltenis, Vydūnas, Lipeikienė, Joana, Vilnius Pedagogical University |
Publisher | Lithuanian Academic Libraries Network (LABT), Vilnius Pedagogical University |
Source Sets | Lithuanian ETD submission system |
Language | Lithuanian |
Detected Language | English |
Type | Master thesis |
Format | application/pdf |
Source | http://vddb.library.lt/obj/LT-eLABa-0001:E.02~2006~D_20060613_162922-58059 |
Rights | Unrestricted |