In this paper we develop an alternative to minimum support that uses knowledge of the process generating the transaction data and allows for highly skewed frequency distributions. We apply a simple stochastic model (the NB model), which is known to describe item occurrences in transaction data well, to develop a frequency constraint. This model-based frequency constraint is used together with a precision threshold to find individual support thresholds for groups of associations. We develop the notion of NB-frequent itemsets and present two mining algorithms that find all NB-frequent itemsets in a database. In experiments with publicly available transaction databases, we show that the new constraint can provide significant improvements over a single minimum support threshold and that the precision threshold is easier to use. (author's abstract) / Series: Working Papers on Information Systems, Information Business and Operations
Identifier | oai:union.ndltd.org:VIENNA/oai:epub.wu-wien.ac.at:epub-wu-01_7a9 |
Date | January 2004 |
Creators | Hahsler, Michael |
Publisher | Institut für Informationsverarbeitung und Informationswirtschaft, WU Vienna University of Economics and Business |
Source Sets | Wirtschaftsuniversität Wien |
Language | English |
Detected Language | English |
Type | Paper, NonPeerReviewed |
Format | application/pdf |
Relation | http://epub.wu.ac.at/1760/ |