Return to search

Entropy-based subspace clustering for mining numerical data.

by Cheng, Chun-hung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. / Includes bibliographical references (leaves 72-76). / Abstracts in English and Chinese. / Abstract --- p.ii / Acknowledgments --- p.iv / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Six Tasks of Data Mining --- p.1 / Chapter 1.1.1 --- Classification --- p.2 / Chapter 1.1.2 --- Estimation --- p.2 / Chapter 1.1.3 --- Prediction --- p.2 / Chapter 1.1.4 --- Market Basket Analysis --- p.3 / Chapter 1.1.5 --- Clustering --- p.3 / Chapter 1.1.6 --- Description --- p.3 / Chapter 1.2 --- Problem Description --- p.4 / Chapter 1.3 --- Motivation --- p.5 / Chapter 1.4 --- Terminology --- p.7 / Chapter 1.5 --- Outline of the Thesis --- p.7 / Chapter 2 --- Survey on Previous Work --- p.8 / Chapter 2.1 --- Data Mining --- p.8 / Chapter 2.1.1 --- Association Rules and its Variations --- p.9 / Chapter 2.1.2 --- Rules Containing Numerical Attributes --- p.15 / Chapter 2.2 --- Clustering --- p.17 / Chapter 2.2.1 --- The CLIQUE Algorithm --- p.20 / Chapter 3 --- Entropy and Subspace Clustering --- p.24 / Chapter 3.1 --- Criteria of Subspace Clustering --- p.24 / Chapter 3.1.1 --- Criterion of High Density --- p.25 / Chapter 3.1.2 --- Correlation of Dimensions --- p.25 / Chapter 3.2 --- Entropy in a Numerical Database --- p.27 / Chapter 3.2.1 --- Calculation of Entropy --- p.27 / Chapter 3.3 --- Entropy and the Clustering Criteria --- p.29 / Chapter 3.3.1 --- Entropy and the Coverage Criterion --- p.29 / Chapter 3.3.2 --- Entropy and the Density Criterion --- p.31 / Chapter 3.3.3 --- Entropy and Dimensional Correlation --- p.33 / Chapter 4 --- The ENCLUS Algorithms --- p.35 / Chapter 4.1 --- Framework of the Algorithms --- p.35 / Chapter 4.2 --- Closure Properties --- p.37 / Chapter 4.3 --- Complexity Analysis --- p.39 / Chapter 4.4 --- Mining Significant Subspaces --- p.40 / Chapter 4.5 --- Mining Interesting Subspaces --- p.42 / Chapter 4.6 --- Example --- p.44 / Chapter 5 --- Experiments --- p.49 / Chapter 5.1 --- Synthetic Data --- p.49 / Chapter 5.1.1 --- Data Generation ´ؤ Hyper-rectangular Data --- p.49 / Chapter 5.1.2 --- Data Generation ´ؤ Linearly Dependent Data --- p.50 / Chapter 5.1.3 --- Effect of Changing the Thresholds --- p.51 / Chapter 5.1.4 --- Effectiveness of the Pruning Strategies --- p.53 / Chapter 5.1.5 --- Scalability Test --- p.53 / Chapter 5.1.6 --- Accuracy --- p.55 / Chapter 5.2 --- Real-life Data --- p.55 / Chapter 5.2.1 --- Census Data --- p.55 / Chapter 5.2.2 --- Stock Data --- p.56 / Chapter 5.3 --- Comparison with CLIQUE --- p.58 / Chapter 5.3.1 --- Subspaces with Uniform Projections --- p.60 / Chapter 5.4 --- Problems with Hyper-rectangular Data --- p.62 / Chapter 6 --- Miscellaneous Enhancements --- p.64 / Chapter 6.1 --- Extra Pruning --- p.64 / Chapter 6.2 --- Multi-resolution Approach --- p.65 / Chapter 6.3 --- Multi-threshold Approach --- p.68 / Chapter 7 --- Conclusion --- p.70 / Bibliography --- p.71 / Appendix --- p.77 / Chapter A --- Differential Entropy vs Discrete Entropy --- p.77 / Chapter A.1 --- Relation of Differential Entropy to Discrete Entropy --- p.78 / Chapter B --- Mining Quantitative Association Rules --- p.80 / Chapter B.1 --- Approaches --- p.81 / Chapter B.2 --- Performance --- p.82 / Chapter B.3 --- Final Remarks --- p.83

Identiferoai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_322759
Date January 1999
ContributorsCheng, Chun-hung., Chinese University of Hong Kong Graduate School. Division of Computer Science and Engineering.
Source SetsThe Chinese University of Hong Kong
LanguageEnglish, Chinese
Detected LanguageEnglish
TypeText, bibliography
Formatprint, xii, 83 leaves : ill. ; 30 cm.
RightsUse of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Page generated in 0.0016 seconds