About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations (NDLTD). Our metadata is collected from universities around the world. If you manage a university, consortium, or country archive and want to be added, details can be found on the NDLTD website.
51

A study on privacy-preserving clustering

Cui, Yingjie., 崔英杰. January 2009 (has links)
published_or_final_version / Computer Science / Master / Master of Philosophy
52

Generalized Feature Embedding Learning for Clustering and Classification

Unknown Date (has links)
Data comes in many different shapes and sizes. In real-life applications it is common that the data we are studying has features of varied types, including numerical, categorical, and text. Machine learning algorithms typically require numeric input, so data that is not originally numerical must be transformed before it can be used. In addition, the data we study often has many features relative to the number of samples. It is often desirable to reduce the number of features used to train a model, both to eliminate noise and to reduce training time. This problem of high dimensionality can be approached through feature selection, feature extraction, or feature embedding. Feature selection seeks to identify the most essential variables in a dataset, leading to a parsimonious model and high-performing results, while feature extraction and embedding apply a mathematical transformation of the data into a represented space. As a byproduct of using a new representation, we can reduce the dimension greatly without sacrificing performance; oftentimes, embedded features even yield a gain in performance. Though extraction and embedding methods may be powerful for isolated machine learning problems, they do not always generalize well. We are therefore motivated to illustrate a methodology that can be applied to any data type with little pre-processing. The methods we develop can be applied in unsupervised, supervised, incremental, and deep learning contexts. Using 28 benchmark datasets of differing data types as examples, we construct a framework that can be applied to general machine learning tasks. The techniques we develop contribute to the field of dimension reduction and feature embedding.
Using this framework, we make additional contributions to eigendecomposition by creating an objective matrix with three vital components. The first is a class-partitioned row and feature product representation of one-hot encoded data. The second is a weighted adjacency matrix derived from class-label relationships. Finally, by taking the inner product of these values, we condition the one-hot encoded data generated from the original data prior to eigenvector decomposition. Class partitioning and adjacency enable subsequent projections of the data to be trained more effectively in side-by-side comparisons with baseline algorithm performance. Along with this improved performance, we can adjust the dimension of the resulting data arbitrarily. We also show how these dense vectors may be used to order the features of generic data for deep learning. In this dissertation, we examine a general approach to dimension reduction and feature embedding that combines a class-partitioned row and feature representation, a weighted approach to instance similarity, and an adjacency representation; this general approach applies to unsupervised, supervised, online, and deep learning. In experiments on 28 benchmark datasets, we show significant performance gains in clustering, classification, and training time. / Includes bibliography. / Dissertation (Ph.D.)--Florida Atlantic University, 2018. / FAU Electronic Theses and Dissertations Collection
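The abstract above describes conditioning one-hot encoded data with a label-derived adjacency matrix before eigendecomposition. A minimal sketch of that idea follows; the equal-weight adjacency (1 for same-class pairs) and the function name are illustrative assumptions, not the dissertation's actual formulation.

```python
import numpy as np

def embed(X_onehot, y, dim=2):
    """Sketch: weight sample pairs by class-label agreement, condition the
    one-hot data with that adjacency, then project onto top eigenvectors."""
    # Hypothetical weighted adjacency: 1.0 where two samples share a label.
    A = (y[:, None] == y[None, :]).astype(float)
    # Inner product conditions the one-hot features (symmetric since A is).
    M = X_onehot.T @ A @ X_onehot
    # Symmetric eigendecomposition; keep the `dim` largest eigenvectors.
    vals, vecs = np.linalg.eigh(M)
    W = vecs[:, np.argsort(vals)[::-1][:dim]]
    return X_onehot @ W  # embedded, reduced-dimension representation

# Tiny demo: 4 samples, 3 one-hot features, 2 classes.
X = np.array([[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
y = np.array([0, 0, 1, 1])
Z = embed(X, y, dim=2)
print(Z.shape)  # (4, 2)
```

The embedding dimension is a free parameter, matching the abstract's claim that the dimension of the projected data can be adjusted arbitrarily.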
53

Incremental document clustering for web page classification.

January 2000 (has links)
by Wong, Wai-Chiu. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. / Includes bibliographical references (leaves 89-94). / Abstracts in English and Chinese. / Abstract --- p.ii / Acknowledgments --- p.iv / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Document Clustering --- p.2 / Chapter 1.2 --- DC-tree --- p.4 / Chapter 1.3 --- Feature Extraction --- p.5 / Chapter 1.4 --- Outline of the Thesis --- p.5 / Chapter 2 --- Related Work --- p.8 / Chapter 2.1 --- Clustering Algorithms --- p.8 / Chapter 2.1.1 --- Partitional Clustering Algorithms --- p.8 / Chapter 2.1.2 --- Hierarchical Clustering Algorithms --- p.10 / Chapter 2.2 --- Document Classification by Examples --- p.11 / Chapter 2.2.1 --- k-NN algorithm - Expert Network (ExpNet) --- p.11 / Chapter 2.2.2 --- Learning Linear Text Classifier --- p.12 / Chapter 2.2.3 --- Generalized Instance Set (GIS) algorithm --- p.12 / Chapter 2.3 --- Document Clustering --- p.13 / Chapter 2.3.1 --- B+-tree-based Document Clustering --- p.13 / Chapter 2.3.2 --- Suffix Tree Clustering --- p.14 / Chapter 2.3.3 --- Association Rule Hypergraph Partitioning Algorithm --- p.15 / Chapter 2.3.4 --- Principal Component Divisive Partitioning --- p.17 / Chapter 2.4 --- Projections for Efficient Document Clustering --- p.18 / Chapter 3 --- Background --- p.21 / Chapter 3.1 --- Document Preprocessing --- p.21 / Chapter 3.1.1 --- Elimination of Stopwords --- p.22 / Chapter 3.1.2 --- Stemming Technique --- p.22 / Chapter 3.2 --- Problem Modeling --- p.23 / Chapter 3.2.1 --- Basic Concepts --- p.23 / Chapter 3.2.2 --- Vector Model --- p.24 / Chapter 3.3 --- Feature Selection Scheme --- p.25 / Chapter 3.4 --- Similarity Model --- p.27 / Chapter 3.5 --- Evaluation Techniques --- p.29 / Chapter 4 --- Feature Extraction and Weighting --- p.31 / Chapter 4.1 --- Statistical Analysis of the Words in the Web Domain --- p.31 / Chapter 4.2 --- Zipf's Law --- p.33 / Chapter 4.3 --- Traditional Methods --- p.36 / Chapter 4.4 --- The 
Proposed Method --- p.38 / Chapter 4.5 --- Experimental Results --- p.40 / Chapter 4.5.1 --- Synthetic Data Generation --- p.40 / Chapter 4.5.2 --- Real Data Source --- p.41 / Chapter 4.5.3 --- Coverage --- p.41 / Chapter 4.5.4 --- Clustering Quality --- p.43 / Chapter 4.5.5 --- Binary Weight vs Numerical Weight --- p.45 / Chapter 5 --- Web Document Clustering Using DC-tree --- p.48 / Chapter 5.1 --- Document Representation --- p.48 / Chapter 5.2 --- Document Cluster (DC) --- p.49 / Chapter 5.3 --- DC-tree --- p.52 / Chapter 5.3.1 --- Tree Definition --- p.52 / Chapter 5.3.2 --- Insertion --- p.54 / Chapter 5.3.3 --- Node Splitting --- p.55 / Chapter 5.3.4 --- Deletion and Node Merging --- p.56 / Chapter 5.4 --- The Overall Strategy --- p.57 / Chapter 5.4.1 --- Preprocessing --- p.57 / Chapter 5.4.2 --- Building DC-tree --- p.59 / Chapter 5.4.3 --- Identifying the Interesting Clusters --- p.60 / Chapter 5.5 --- Experimental Results --- p.61 / Chapter 5.5.1 --- Alternative Similarity Measurement : Synthetic Data --- p.61 / Chapter 5.5.2 --- DC-tree Characteristics : Synthetic Data --- p.63 / Chapter 5.5.3 --- Compare DC-tree and B+-tree: Synthetic Data --- p.64 / Chapter 5.5.4 --- Compare DC-tree and B+-tree: Real Data --- p.66 / Chapter 5.5.5 --- Varying the Number of Features : Synthetic Data --- p.67 / Chapter 5.5.6 --- Non-Correlated Topic Web Page Collection: Real Data --- p.69 / Chapter 5.5.7 --- Correlated Topic Web Page Collection: Real Data --- p.71 / Chapter 5.5.8 --- Incremental updates on Real Data Set --- p.72 / Chapter 5.5.9 --- Comparison with the other clustering algorithms --- p.73 / Chapter 6 --- Conclusion --- p.75 / Appendix --- p.77 / Chapter A --- Stopword List --- p.77 / Chapter B --- Porter's Stemming Algorithm --- p.81 / Chapter C --- Insertion Algorithm --- p.83 / Chapter D --- Node Splitting Algorithm --- p.85 / Chapter E --- Features Extracted in Experiment 4.53 --- p.87 / Bibliography --- p.88
54

Entropy-based subspace clustering for mining numerical data.

January 1999 (has links)
by Cheng, Chun-hung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. / Includes bibliographical references (leaves 72-76). / Abstracts in English and Chinese. / Abstract --- p.ii / Acknowledgments --- p.iv / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Six Tasks of Data Mining --- p.1 / Chapter 1.1.1 --- Classification --- p.2 / Chapter 1.1.2 --- Estimation --- p.2 / Chapter 1.1.3 --- Prediction --- p.2 / Chapter 1.1.4 --- Market Basket Analysis --- p.3 / Chapter 1.1.5 --- Clustering --- p.3 / Chapter 1.1.6 --- Description --- p.3 / Chapter 1.2 --- Problem Description --- p.4 / Chapter 1.3 --- Motivation --- p.5 / Chapter 1.4 --- Terminology --- p.7 / Chapter 1.5 --- Outline of the Thesis --- p.7 / Chapter 2 --- Survey on Previous Work --- p.8 / Chapter 2.1 --- Data Mining --- p.8 / Chapter 2.1.1 --- Association Rules and its Variations --- p.9 / Chapter 2.1.2 --- Rules Containing Numerical Attributes --- p.15 / Chapter 2.2 --- Clustering --- p.17 / Chapter 2.2.1 --- The CLIQUE Algorithm --- p.20 / Chapter 3 --- Entropy and Subspace Clustering --- p.24 / Chapter 3.1 --- Criteria of Subspace Clustering --- p.24 / Chapter 3.1.1 --- Criterion of High Density --- p.25 / Chapter 3.1.2 --- Correlation of Dimensions --- p.25 / Chapter 3.2 --- Entropy in a Numerical Database --- p.27 / Chapter 3.2.1 --- Calculation of Entropy --- p.27 / Chapter 3.3 --- Entropy and the Clustering Criteria --- p.29 / Chapter 3.3.1 --- Entropy and the Coverage Criterion --- p.29 / Chapter 3.3.2 --- Entropy and the Density Criterion --- p.31 / Chapter 3.3.3 --- Entropy and Dimensional Correlation --- p.33 / Chapter 4 --- The ENCLUS Algorithms --- p.35 / Chapter 4.1 --- Framework of the Algorithms --- p.35 / Chapter 4.2 --- Closure Properties --- p.37 / Chapter 4.3 --- Complexity Analysis --- p.39 / Chapter 4.4 --- Mining Significant Subspaces --- p.40 / Chapter 4.5 --- Mining Interesting Subspaces --- p.42 / Chapter 4.6 --- Example --- p.44 / Chapter 5 --- Experiments --- 
p.49 / Chapter 5.1 --- Synthetic Data --- p.49 / Chapter 5.1.1 --- Data Generation - Hyper-rectangular Data --- p.49 / Chapter 5.1.2 --- Data Generation - Linearly Dependent Data --- p.50 / Chapter 5.1.3 --- Effect of Changing the Thresholds --- p.51 / Chapter 5.1.4 --- Effectiveness of the Pruning Strategies --- p.53 / Chapter 5.1.5 --- Scalability Test --- p.53 / Chapter 5.1.6 --- Accuracy --- p.55 / Chapter 5.2 --- Real-life Data --- p.55 / Chapter 5.2.1 --- Census Data --- p.55 / Chapter 5.2.2 --- Stock Data --- p.56 / Chapter 5.3 --- Comparison with CLIQUE --- p.58 / Chapter 5.3.1 --- Subspaces with Uniform Projections --- p.60 / Chapter 5.4 --- Problems with Hyper-rectangular Data --- p.62 / Chapter 6 --- Miscellaneous Enhancements --- p.64 / Chapter 6.1 --- Extra Pruning --- p.64 / Chapter 6.2 --- Multi-resolution Approach --- p.65 / Chapter 6.3 --- Multi-threshold Approach --- p.68 / Chapter 7 --- Conclusion --- p.70 / Bibliography --- p.71 / Appendix --- p.77 / Chapter A --- Differential Entropy vs Discrete Entropy --- p.77 / Chapter A.1 --- Relation of Differential Entropy to Discrete Entropy --- p.78 / Chapter B --- Mining Quantitative Association Rules --- p.80 / Chapter B.1 --- Approaches --- p.81 / Chapter B.2 --- Performance --- p.82 / Chapter B.3 --- Final Remarks --- p.83
55

Rival penalized competitive learning for content-based indexing.

January 1998 (has links)
by Lau Tak Kan. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1998. / Includes bibliographical references (leaves 100-108). / Abstract also in Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Background --- p.1 / Chapter 1.2 --- Problem Defined --- p.5 / Chapter 1.3 --- Contributions --- p.5 / Chapter 1.4 --- Thesis Organization --- p.7 / Chapter 2 --- Content-based Retrieval Multimedia Database Background and Indexing Problem --- p.8 / Chapter 2.1 --- Feature Extraction --- p.8 / Chapter 2.2 --- Nearest-neighbor Search --- p.10 / Chapter 2.3 --- Content-based Indexing Methods --- p.15 / Chapter 2.4 --- Indexing Problem --- p.22 / Chapter 3 --- Data Clustering Methods for Indexing --- p.25 / Chapter 3.1 --- Proposed Solution to Indexing Problem --- p.25 / Chapter 3.2 --- Brief Description of Several Clustering Methods --- p.26 / Chapter 3.2.1 --- K-means --- p.26 / Chapter 3.2.2 --- Competitive Learning (CL) --- p.27 / Chapter 3.2.3 --- Rival Penalized Competitive Learning (RPCL) --- p.29 / Chapter 3.2.4 --- General Hierarchical Clustering Methods --- p.31 / Chapter 3.3 --- Why RPCL? 
--- p.32 / Chapter 4 --- Non-hierarchical RPCL Indexing --- p.33 / Chapter 4.1 --- The Non-hierarchical Approach --- p.33 / Chapter 4.2 --- Performance Experiments --- p.34 / Chapter 4.2.1 --- Experimental Setup --- p.35 / Chapter 4.2.2 --- Experiment 1: Test for Recall and Precision Performance --- p.38 / Chapter 4.2.3 --- Experiment 2: Test for Different Sizes of Input Data Sets --- p.45 / Chapter 4.2.4 --- Experiment 3: Test for Different Numbers of Dimensions --- p.49 / Chapter 4.2.5 --- Experiment 4: Compare with Actual Nearest-neighbor Results --- p.53 / Chapter 4.3 --- Chapter Summary --- p.55 / Chapter 5 --- Hierarchical RPCL Indexing --- p.56 / Chapter 5.1 --- The Hierarchical Approach --- p.56 / Chapter 5.2 --- The Hierarchical RPCL Binary Tree (RPCL-b-tree) --- p.58 / Chapter 5.3 --- Insertion --- p.61 / Chapter 5.4 --- Deletion --- p.63 / Chapter 5.5 --- Searching --- p.63 / Chapter 5.6 --- Experiments --- p.69 / Chapter 5.6.1 --- Experimental Setup --- p.69 / Chapter 5.6.2 --- Experiment 5: Test for Different Node Sizes --- p.72 / Chapter 5.6.3 --- Experiment 6: Test for Different Sizes of Data Sets --- p.75 / Chapter 5.6.4 --- Experiment 7: Test for Different Data Distributions --- p.78 / Chapter 5.6.5 --- Experiment 8: Test for Different Numbers of Dimensions --- p.80 / Chapter 5.6.6 --- Experiment 9: Test for Different Numbers of Database Objects Retrieved --- p.83 / Chapter 5.6.7 --- Experiment 10: Test with VP-tree --- p.86 / Chapter 5.7 --- Discussion --- p.90 / Chapter 5.8 --- A Relationship Formula --- p.93 / Chapter 5.9 --- Chapter Summary --- p.96 / Chapter 6 --- Conclusion --- p.97 / Chapter 6.1 --- Future Works --- p.97 / Chapter 6.2 --- Conclusion --- p.98 / Bibliography --- p.100
56

Three essays in quantitative marketing.

January 1997 (has links)
by Ka-Kit Tse. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1997. / Includes bibliographical references. / Acknowledgments --- p.i / List of tables --- p.v / Chapter Chapter 1: --- Overall Review --- p.1 / Chapter Chapter 2: --- Essay one - A Mathematical Programming Approach to Clusterwise Regression Model and its Extensions / Chapter 2.0. --- Abstract --- p.5 / Chapter 2.1. --- Introduction --- p.6 / Chapter 2.2. --- A Mathematical Programming Formulation of the Clusterwise Regression Model --- p.10 / Chapter 2.2.1. --- The Generalized Clusterwise Regression Model --- p.10 / Chapter 2.2.2. --- "Clusterwise Regression Model (Spath, 1979)" --- p.14 / Chapter 2.2.3. --- A Nonparametric Clusterwise Regression Model --- p.15 / Chapter 2.2.4. --- A Mixture Approach to Clusterwise Regression Model --- p.16 / Chapter 2.2.5. --- An Illustrative Application --- p.19 / Chapter 2.3. --- Mathematical Programming Formulation of the Clusterwise Discriminant Analysis --- p.21 / Chapter 2.4. --- Conclusion --- p.25 / Chapter 2.5. --- Appendix --- p.28 / Chapter 2.6. --- References --- p.32 / Chapter 2.7. --- Tables --- p.35 / Chapter Chapter 3: --- Essay two - A Mathematical Programming Approach to Clusterwise Rank Order Logit Model / Chapter 3.0. --- Abstract --- p.40 / Chapter 3.1. --- Introduction --- p.41 / Chapter 3.2. --- Clusterwise Rank Order Logit Model --- p.42 / Chapter 3.3. --- Numerical Illustration --- p.46 / Chapter 3.4. --- Conclusion --- p.48 / Chapter 3.5. --- References --- p.50 / Chapter 3.6. --- Tables --- p.52 / Chapter Chapter 4: --- Essay three - A Mathematical Programming Approach to Metric Unidimensional Scaling / Chapter 4.0. --- Abstract --- p.53 / Chapter 4.1. --- Introduction --- p.54 / Chapter 4.2. --- Nonlinear Programming Formulation --- p.56 / Chapter 4.3. --- Numerical Examples --- p.60 / Chapter 4.4. --- Possible Extensions --- p.61 / Chapter 4.5. --- Conclusion and Extensions --- p.63 / Chapter 4.6.
--- References --- p.64 / Chapter 4.7. --- Tables --- p.66 / Chapter Chapter 5: --- Research Project in Progress / Chapter 5.1. --- Project 1 -- An Integrated Approach to Taste Test Experiment Within the Prospect Theory Framework --- p.68 / Chapter 5.1.1. --- Experiment Procedure --- p.68 / Chapter 5.1.2. --- Experimental Result --- p.72 / Chapter 5.2. --- Project 2 -- An Integrated Approach to Multi- Dimensional Scaling Problem --- p.75 / Chapter 5.2.1. --- Introduction --- p.75 / Chapter 5.2.2. --- Experiment Procedure --- p.76 / Chapter 5.2.3. --- Questionnaire --- p.78 / Chapter 5.2.4. --- Experimental Result --- p.78
57

The use of control variates in bootstrap simulation.

January 2001 (has links)
Lui Ying Kin. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2001. / Includes bibliographical references (leaves 63-65). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 2 --- Introduction to bootstrap and efficiency bootstrap simulation --- p.5 / Chapter 2.1 --- Background of bootstrap --- p.5 / Chapter 2.2 --- Basic idea of bootstrap --- p.7 / Chapter 2.3 --- Variance reduction methods --- p.10 / Chapter 2.3.1 --- Control variates --- p.10 / Chapter 2.3.2 --- Common random numbers --- p.12 / Chapter 2.3.3 --- Antithetic variates --- p.14 / Chapter 2.3.4 --- Importance Sampling --- p.15 / Chapter 2.4 --- Efficient bootstrap simulation --- p.17 / Chapter 2.4.1 --- Linear approximation --- p.18 / Chapter 2.4.2 --- Centring method --- p.19 / Chapter 2.4.3 --- Balanced resampling --- p.20 / Chapter 2.4.4 --- Antithetic resampling --- p.21 / Chapter 3 --- Methodology --- p.22 / Chapter 3.1 --- Introduction --- p.22 / Chapter 3.2 --- Cluster analysis --- p.24 / Chapter 3.3 --- Regression estimator and mixture experiment --- p.25 / Chapter 3.4 --- Estimate of standard error and bias --- p.30 / Chapter 4 --- Simulation study --- p.45 / Chapter 4.1 --- Introduction --- p.45 / Chapter 4.2 --- Ratio estimation --- p.46 / Chapter 4.3 --- Time series problem --- p.50 / Chapter 4.4 --- Regression problem --- p.54 / Chapter 5 --- Conclusion and discussion --- p.60 / Reference --- p.63
58

A study of two problems in data mining: anomaly monitoring and privacy preservation.

January 2008 (has links)
Bu, Yingyi. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2008. / Includes bibliographical references (leaves 89-94). / Abstracts in English and Chinese. / Abstract --- p.i / Acknowledgement --- p.v / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Anomaly Monitoring --- p.1 / Chapter 1.2 --- Privacy Preservation --- p.5 / Chapter 1.2.1 --- Motivation --- p.7 / Chapter 1.2.2 --- Contribution --- p.12 / Chapter 2 --- Anomaly Monitoring --- p.16 / Chapter 2.1 --- Problem Statement --- p.16 / Chapter 2.2 --- A Preliminary Solution: Simple Pruning --- p.19 / Chapter 2.3 --- Efficient Monitoring by Local Clusters --- p.21 / Chapter 2.3.1 --- Incremental Local Clustering --- p.22 / Chapter 2.3.2 --- Batch Monitoring by Cluster Join --- p.24 / Chapter 2.3.3 --- Cost Analysis and Optimization --- p.28 / Chapter 2.4 --- Piecewise Index and Query Reschedule --- p.31 / Chapter 2.4.1 --- Piecewise VP-trees --- p.32 / Chapter 2.4.2 --- Candidate Rescheduling --- p.35 / Chapter 2.4.3 --- Cost Analysis --- p.36 / Chapter 2.5 --- Upper Bound Lemma: For Dynamic Time Warping Distance --- p.37 / Chapter 2.6 --- Experimental Evaluations --- p.39 / Chapter 2.6.1 --- Effectiveness --- p.40 / Chapter 2.6.2 --- Efficiency --- p.46 / Chapter 2.7 --- Related Work --- p.49 / Chapter 3 --- Privacy Preservation --- p.52 / Chapter 3.1 --- Problem Definition --- p.52 / Chapter 3.2 --- HD-Composition --- p.58 / Chapter 3.2.1 --- Role-based Partition --- p.59 / Chapter 3.2.2 --- Cohort-based Partition --- p.61 / Chapter 3.2.3 --- Privacy Guarantee --- p.70 / Chapter 3.2.4 --- Refinement of HD-composition --- p.75 / Chapter 3.2.5 --- Anonymization Algorithm --- p.76 / Chapter 3.3 --- Experiments --- p.77 / Chapter 3.3.1 --- Failures of Conventional Generalizations --- p.78 / Chapter 3.3.2 --- Evaluations of HD-Composition --- p.79 / Chapter 3.4 --- Related Work --- p.85 / Chapter 4 --- Conclusions --- p.87 / Bibliography --- p.89
59

Clustering multivariate data using interpoint distances.

January 2011 (has links)
Ho, Siu Tung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2011. / Includes bibliographical references (p. 43-44). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Introduction --- p.1 / Chapter 2 --- Methodology and Algorithm --- p.6 / Chapter 2.1 --- Testing one homogeneous cluster --- p.8 / Chapter 3 --- Simulation Study --- p.17 / Chapter 3.1 --- Simulation Plan --- p.19 / Chapter 3.1.1 --- One single cluster --- p.19 / Chapter 3.1.2 --- Two separated clusters --- p.20 / Chapter 3.2 --- Measure of Performance --- p.26 / Chapter 3.3 --- Simulation Results --- p.27 / Chapter 3.3.1 --- One single cluster --- p.27 / Chapter 3.3.2 --- Two separated clusters --- p.30 / Chapter 4 --- Conclusion and further research --- p.36 / Chapter 4.1 --- Constructing Data depth --- p.38 / Bibliography --- p.43
60

The association between beverage intake and overweight and obesity among Canadian adults

Nikpartow, Nooshin 17 November 2010
Overweight and obesity in Canada have increased significantly during the last three decades, paralleled by increased intake of fat and sugar, particularly sugary beverages, leading to higher energy intake, as well as by reduced physical activity. The Canadian Community Health Survey, Cycle 2.2, 2004 (CCHS 2.2) provides the opportunity to evaluate the beverage intakes of Canadians in relation to overweight and obesity using Body Mass Index (BMI).

To examine the association between sugar-sweetened beverages and BMI in Canadian adults, we used data from CCHS 2.2 (n=14,304, aged >18 and <65 years), in which dietary intake was assessed using 24-h recall. Data on beverage consumption were identified, coded, and classified in several steps. Using descriptive statistics, we determined total gram intake and the contribution of each beverage to total energy intake among age/sex groups. To determine the most suitable patterns of beverage consumption among Canadian adults, K-means cluster analysis was applied: males and females were classified into distinct clusters based on the dominant pattern of beverage intake. Finally, step-wise logistic regression models were used to determine associations between sugar-sweetened beverages and BMI, controlling for age, marital status, income, education, physical activity, total energy intake, immigration status, smoking habits, and ethnicity. To account for the complex survey design, all data were weighted and bootstrapped.

BMI in women with a predominant fruit-drink pattern (mean intake 791.1±32.9 g) was significantly higher than in women with no dominant beverage pattern (28.3±1 vs. 26.8±0.3, P<0.001). In women, high intake of fruit drinks was a significant predictor of overweight (OR=1.84, 95% CI: 1.06-3.20), obesity (OR=2.55, 95% CI: 1.46-4.47), and overweight/obesity (OR=2.05, 95% CI: 1.29-3.25). In men, mean BMI did not differ among beverage-consumption clusters, and none of the beverages was a predictor of overweight or obesity. For the first time in nationally representative data, we report an association between sugar-sweetened beverages and overweight and obesity in Canadian women.
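The abstract above identifies beverage-intake patterns with K-means before modelling BMI. A minimal K-means sketch on synthetic intake vectors follows; the beverage categories, gram values, and seed-point initialization are illustrative assumptions, not CCHS 2.2 data or the study's actual procedure.

```python
import numpy as np

def kmeans(X, k, init_idx, iters=20):
    """Plain K-means: assign each sample to its nearest centroid,
    then recompute each centroid as the mean of its cluster."""
    centroids = X[init_idx].copy()
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

rng = np.random.default_rng(0)
# Hypothetical daily intakes (grams) of [fruit drinks, soft drinks, milk];
# three blobs mimic "dominant pattern" groups.
X = np.vstack([
    rng.normal([800.0, 100.0, 100.0], 50.0, (30, 3)),  # fruit-drink dominant
    rng.normal([100.0, 700.0, 100.0], 50.0, (30, 3)),  # soft-drink dominant
    rng.normal([150.0, 150.0, 150.0], 50.0, (30, 3)),  # no dominant pattern
])
labels, centroids = kmeans(X, k=3, init_idx=[0, 30, 60])
print(labels.shape)  # (90,)
```

In the study itself, cluster membership then fed step-wise logistic regression against overweight/obesity outcomes; here the clustering step alone is shown.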
