Return to search

Didelių duomenų sekų analizės problemos / Data mining problems

The main goal of these thesis is to compare association rules finding algorithms and to indicate the usability of finding association rules in business area. In order to achieve this goal, the theoretical analysis of three algorithms is done:
1. The Apriori algorithm – the most well known association rule algorithm – based on the property: “Any subset of a large itemset must be large”. This algorithm assumes that the database is memory-resident. The maximum number of database scans is one more than the cardinality of the largest large itemset.
2. The Sampling algorithm deals with the database sample prior the full database scan. The database sample is drawn such that it can be memory-resident. The Sampling algorithm reduces the number of database scans to one in the best case and two in the worst case.
3. The Partitioning algorithm divides database into partitions and bases on the property: “A large itemset must be large in at least one of the partitions”. This algorithm reduces the number of database scans to two and divides the database into partitions such that each partition can be placed into main memory.
There are created programs for all three algorithms plus the program for the full set of itemsets algorithm. Programs are created in C++ language. In order to achieve topmost performance, the GUI is missed.
Nine test data sets are created to compare the algorithms. Six of them contains real life data from telecommunications business area. Datasets varies from the... [to full text]

Identiferoai:union.ndltd.org:LABT_ETD/oai:elaba.lt:LT-eLABa-0001:E.02~2004~D_20040611_164618-10612
Date11 June 2004
CreatorsAmbraziūnas, Valdas
ContributorsKazlauskas, Kazys, Lipeikienė, Joana, Leonavičius, Gražvydas, Dzemyda, Gintautas, Šaltenis, Vydūnas, Vilnius Pedagogical University
PublisherLithuanian Academic Libraries Network (LABT), Vilnius Pedagogical University
Source SetsLithuanian ETD submission system
LanguageLithuanian
Detected LanguageEnglish
TypeMaster thesis
Formatapplication/pdf
Sourcehttp://vddb.library.lt/obj/LT-eLABa-0001:E.02~2004~D_20040611_164618-10612
RightsUnrestricted

Page generated in 0.1228 seconds