Many frequent pattern mining algorithms operate on precise data, where each data point is an exact accounting of a phenomena (e.g., I have exactly two sisters). Alas, reasoning this way is a simplification for many real world observations. Measurements, predictions, environmental factors, human error, &ct. all introduce a degree of uncertainty into the mix. Tree-based frequent pattern mining algorithms such as FP-growth are particularly efficient due to their compact in-memory representations of the input database, but their uncertain extensions can require many more tree nodes. I propose new algorithms with tightened upper bounds to expected support, Tube-S and Tube-P, which mine frequent patterns from uncertain data. Extensive experimentation and analysis on datasets with different probability distributions are undertaken that show the tightness of my bounds in different situations. / February 2016
Identifer | oai:union.ndltd.org:MANITOBA/oai:mspace.lib.umanitoba.ca:1993/31059 |
Date | 12 1900 |
Creators | MacKinnon, Richard Kyle |
Contributors | Leung, Carson K.-S. (Computer Science), Wang, Yang (Computer Science) Wang, Xikui (Statistics) |
Publisher | Springer International Publishing, Springer International Publishing, Elsevier, IEEE Computer Society Press, IEEE Computer Society Press |
Source Sets | University of Manitoba Canada |
Detected Language | English |
Page generated in 0.0019 seconds