
Parallel Closet+ Algorithm For Finding Frequent Closed Itemsets

Data mining is becoming increasingly important as the volume of available data grows exponentially, driven first by computerization and now by the Internet. At the same time, cluster computing systems built from commodity hardware are becoming widespread, along with multicore processor architectures. This computing power can be combined with data mining to process huge amounts of data and extract information and knowledge.

Frequent itemset mining holds a special place within data mining because it is an integral part of many data mining tasks. It is often a prerequisite for other data mining algorithms, most notably those in the association rule mining area. For this reason, it has been studied heavily in the literature.

In this thesis, a parallel implementation of CLOSET+, a frequent closed itemset mining algorithm, is presented. The CLOSET+ algorithm has been modified to run on multiple processors simultaneously in order to obtain results faster. The Open MPI and Boost libraries have been used for communication between processes, and the program has been tested on different inputs and parameters. Experimental results show that the algorithm exhibits high speedup and efficiency for dense data when the support value is higher than a certain threshold. The proposed parallel algorithm could prove useful in application areas where a fast response is needed for a low to medium number of frequent closed itemsets. A particular application area is the Web, where online applications have similar requirements.
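
To make the term "frequent closed itemset" concrete, the following is a minimal brute-force sketch in Python. It is not CLOSET+ and not the parallel algorithm of the thesis; it only illustrates the standard definition: an itemset is frequent if its support (here, the number of transactions containing it) is at least a minimum support, and closed if no proper superset has the same support. The function names and the toy transactions are assumptions made for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_closed_itemsets(transactions, min_support):
    """Brute-force enumeration of frequent closed itemsets.

    Exponential in the number of distinct items; for illustration only.
    Returns a dict mapping each closed itemset (frozenset) to its support.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))

    # Step 1: collect every frequent itemset with its support count.
    frequent = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = support(candidate, transactions)
            if count >= min_support:
                frequent[candidate] = count

    # Step 2: keep only itemsets with no proper superset of equal support.
    # A superset with equal support is itself frequent, so it is enough
    # to search among the frequent itemsets.
    closed = {}
    for itemset, count in frequent.items():
        absorbed = any(
            itemset < other and count == other_count
            for other, other_count in frequent.items()
        )
        if not absorbed:
            closed[itemset] = count
    return closed


if __name__ == "__main__":
    # Hypothetical toy data: four transactions over the items a, b, c.
    toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
    for itemset, count in frequent_closed_itemsets(toy, min_support=2).items():
        print(sorted(itemset), count)
```

With a minimum support of 2 on this toy data, {b} and {c} are frequent but not closed, because {a, b} and {a, c} occur in exactly the same transactions. Closed itemsets therefore summarize the frequent itemsets without losing support information, which is why their number matters for response time.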

Identifier: oai:union.ndltd.org:METU/oai:etd.lib.metu.edu.tr:http://etd.lib.metu.edu.tr/upload/12610742/index.pdf
Date: 01 July 2009
Creators: Sen, Tayfun
Contributors: Sener, Cevat
Publisher: METU
Source Sets: Middle East Technical Univ.
Language: English
Detected Language: English
Type: M.S. Thesis
Format: text/pdf
Rights: To liberate the content for public access
