pcApriori: Scalable apriori for multiprocessor systems

Frequent-itemset mining is an important part of data mining. It is a computationally and memory-intensive task with a large number of scientific and statistical application areas. In many of them, datasets can easily grow to tens or even several hundred gigabytes. Hence, efficient algorithms are required to process such amounts of data. In recent years, many efficient sequential mining algorithms have been proposed; however, they cannot exploit current and future systems that provide large degrees of parallelism. In contrast, the number of parallel frequent-itemset mining algorithms is rather small, and most of them do not scale well as the number of threads increases substantially. In this paper, we present a highly scalable mining algorithm that is based on the well-known Apriori algorithm; it is optimized for processing very large datasets on multiprocessor systems. The key idea of pcApriori is to employ a modified producer-consumer processing scheme, which partitions the data during processing and distributes it to the available threads. We conduct many experiments on large datasets. pcApriori scales almost linearly on our test system comprising 32 cores.
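
The abstract gives only the general idea, so the following is a minimal illustrative sketch rather than the authors' pcApriori implementation. It shows textbook Apriori (level-wise candidate generation with subset pruning) in which the support-counting pass uses a plain producer-consumer scheme: one producer thread partitions the transactions into chunks and hands them to consumer threads through a bounded queue. The function names `count_support` and `apriori`, the parameters `n_workers` and `chunk_size`, and the queue-based design are all assumptions made for illustration; the paper's "modified" scheme is not described here and may differ substantially. Note also that CPython threads do not run this counting loop in parallel because of the global interpreter lock, so the sketch shows the structure only, not the scalability.

```python
import queue
import threading
from collections import Counter
from itertools import combinations


def count_support(transactions, candidates, n_workers=4, chunk_size=2):
    """Count how many transactions contain each candidate itemset.

    Hypothetical helper: one producer partitions the data into chunks,
    several consumers count candidates in the chunks they receive.
    """
    # Bounded queue: the producer blocks when consumers fall behind.
    work = queue.Queue(maxsize=2 * n_workers)
    totals = Counter()
    lock = threading.Lock()

    def producer():
        for i in range(0, len(transactions), chunk_size):
            work.put(transactions[i:i + chunk_size])
        for _ in range(n_workers):
            work.put(None)  # one stop marker per consumer

    def consumer():
        local = Counter()  # thread-local counts, merged once at the end
        while True:
            chunk = work.get()
            if chunk is None:
                break
            for transaction in chunk:
                for candidate in candidates:
                    if candidate <= transaction:  # subset test
                        local[candidate] += 1
        with lock:
            totals.update(local)

    threads = [threading.Thread(target=producer)]
    threads += [threading.Thread(target=consumer) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return totals


def apriori(transactions, min_support, n_workers=4):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        counts = count_support(transactions, candidates, n_workers)
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in sorted(apriori(data, 3).items(), key=lambda x: sorted(x[0])):
        print(sorted(itemset), support)
```

Keeping per-thread counters and merging them once under a lock avoids contention on a shared counter during the scan, which is one common way to structure such a counting pass; whether pcApriori does the same is not stated in the abstract.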

info:eu-repo/classification/ddc/004

ddc:004

Identifier	oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:80641
Date	16 September 2022
Creators	Schlegel, Benjamin; Kiefer, Tim; Kissinger, Thomas; Lehner, Wolfgang
Publisher	ACM
Source Sets	Hochschulschriftenserver (HSSS) der SLUB Dresden
Language	English
Detected Language	English
Type	info:eu-repo/semantics/acceptedVersion, doc-type:conferenceObject, info:eu-repo/semantics/conferenceObject, doc-type:Text
Rights	info:eu-repo/semantics/openAccess
Relation	978-1-4503-1921-8, 20, 10.1145/2484838.2484879, info:eu-repo/grantAgreement/Deutsche Forschungsgemeinschaft/Sonderforschungsbereiche/164481002//HAEC - Highly Adaptive Energy-Efficient Computing/SFB 912
