A study of two problems in data mining: projective clustering and multiple tables association rules mining.

Ng Ka Ka.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2002.
Includes bibliographical references (leaves 114-120).
Abstracts in English and Chinese.

Abstract
Acknowledgement
Chapter I --- Projective Clustering
Chapter 1 --- Introduction to Projective Clustering
Chapter 2 --- Related Work to Projective Clustering
Chapter 2.1 --- CLARANS - Graph Abstraction and Bounded Optimization
Chapter 2.1.1 --- Graph Abstraction
Chapter 2.1.2 --- Bounded Optimized Random Search
Chapter 2.2 --- OptiGrid - Grid Partitioning Approach and Density Estimation Function
Chapter 2.2.1 --- Empty Space Phenomenon
Chapter 2.2.2 --- Density Estimation Function
Chapter 2.2.3 --- Upper Bound Property
Chapter 2.3 --- CLIQUE and ENCLUS - Subspace Clustering
Chapter 2.3.1 --- Monotonicity Property of Subspaces
Chapter 2.4 --- PROCLUS Projective Clustering
Chapter 2.5 --- ORCLUS - Generalized Projective Clustering
Chapter 2.5.1 --- Singular Value Decomposition SVD
Chapter 2.6 --- An "Optimal" Projective Clustering
Chapter 3 --- EPC: Efficient Projective Clustering
Chapter 3.1 --- Motivation
Chapter 3.2 --- Notations and Definitions
Chapter 3.2.1 --- Density Estimation Function
Chapter 3.2.2 --- 1-d Histogram
Chapter 3.2.3 --- 1-d Dense Region
Chapter 3.2.4 --- Signature Q
Chapter 3.3 --- The overall framework
Chapter 3.4 --- Major Steps
Chapter 3.4.1 --- Histogram Generation
Chapter 3.4.2 --- Adaptive discovery of dense regions
Chapter 3.4.3 --- Count the occurrences of signatures
Chapter 3.4.4 --- Find the most frequent signatures
Chapter 3.4.5 --- Refine the top 3m signatures
Chapter 3.5 --- Time and Space Complexity
Chapter 4 --- EPCH: An extension and generalization of EPC
Chapter 4.1 --- Motivation of the extension
Chapter 4.2 --- Distinguish clusters by their projections in different subspaces
Chapter 4.3 --- EPCH: a generalization of EPC by building histogram with higher dimensionality
Chapter 4.3.1 --- Multidimensional histograms construction and dense regions detection
Chapter 4.3.2 --- Compressing data objects to signatures
Chapter 4.3.3 --- Merging Similar Signature Entries
Chapter 4.3.4 --- Associating membership degree
Chapter 4.3.5 --- The choice of Dimensionality d of the Histogram
Chapter 4.4 --- Implementation of EPC2
Chapter 4.5 --- Time and Space Complexity of EPCH
Chapter 5 --- Experimental Results
Chapter 5.1 --- Clustering Quality Measurement
Chapter 5.2 --- Synthetic Data Generation
Chapter 5.3 --- Experimental setup
Chapter 5.4 --- Comparison between EPC and PROCLUS
Chapter 5.5 --- Comparison between EPCH and ORCLUS
Chapter 5.5.1 --- Dimensionality of the original space and the associated subspace
Chapter 5.5.2 --- Projection not parallel to original axes
Chapter 5.5.3 --- Data objects belong to more than one cluster under fuzzy clustering
Chapter 5.6 --- Scalability of EPC
Chapter 5.7 --- Scalability of EPC2
Chapter 6 --- Conclusion
Chapter II --- Multiple Tables Association Rules Mining
Chapter 7 --- Introduction to Multiple Tables Association Rule Mining
Chapter 7.1 --- Problem Statement
Chapter 8 --- Related Work to Multiple Tables Association Rules Mining
Chapter 8.1 --- Apriori - A Bottom-up approach to generate candidate sets
Chapter 8.2 --- VIPER - Vertical Mining with various optimization techniques
Chapter 8.2.1 --- Vertical TID Representation and Mining
Chapter 8.2.2 --- FORC
Chapter 8.3 --- Frequent Itemset Counting across Multiple Tables
Chapter 9 --- The Proposed Method
Chapter 9.1 --- Notations
Chapter 9.2 --- Converting Dimension Tables to internal representation
Chapter 9.3 --- The idea of discovering frequent itemsets without joining
Chapter 9.4 --- Overall Steps
Chapter 9.5 --- Binding multiple Dimension Tables
Chapter 9.6 --- Prefix Tree for FT
Chapter 9.7 --- Maintaining frequent itemsets in FI-trees
Chapter 9.8 --- Frequency Counting
Chapter 10 --- Experiments
Chapter 10.1 --- Synthetic Data Generation
Chapter 10.2 --- Experimental Findings
Chapter 11 --- Conclusion and Future Works
Bibliography

Identifier: oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_324004
Date: January 2002
Contributors: Ng, Ka Ka., Chinese University of Hong Kong Graduate School. Division of Computer Science and Engineering.
Source Sets: The Chinese University of Hong Kong
Language: English, Chinese
Detected Language: English
Type: Text, bibliography
Format: print, xv, 120 leaves : ill. ; 30 cm.
Rights: Use of this resource is governed by the terms and conditions of the Creative Commons "Attribution-NonCommercial-NoDerivatives 4.0 International" License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
