101 |
Identifying Agricultural Retailers' Gaps in Understanding of the Value Proposition for Large Commercial Producers / Utech, Hailey Grace (12468510) 28 April 2022 (has links)
<p>For agricultural retailers to remain successful in a volatile market, it is imperative that they understand the needs and buying behaviors of their producers. By identifying shared buying characteristics, producers can be divided into four buying segments: the Economic buyer, the Agronomic buyer, the Business buyer, and the Performance buyer. A retailer that can correctly assign each producer to its buying segment can market optimally to individual producers while offering a consistent value proposition across all farms. This research uses cluster analysis to segment the agricultural market, multinomial logistic regression models to extract the variables that determined cluster membership, and accuracy measures from a multilevel confusion matrix to assess retailers' ability to classify their producers into the correct buying segment. Retailers assigned 70% of their producers to the correct segment; however, accuracy differed across segments, leaving room for an inconsistent value proposition between segments. </p>
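The accuracy measures described in the abstract lend themselves to a short illustration. Below is a minimal Python sketch of overall and per-segment accuracy computed from a multilevel confusion matrix; the four segment names come from the abstract, but the matrix counts are invented for the example and are not the thesis's data.

```python
# Minimal sketch of the accuracy measures described above: overall and
# per-segment accuracy from a multilevel (4-class) confusion matrix.
# The counts are illustrative, not the thesis's actual results.

SEGMENTS = ["Economic", "Agronomic", "Business", "Performance"]

# confusion[i][j] = number of producers whose true segment is i
# and whom the retailer predicted to be in segment j.
confusion = [
    [18, 3, 2, 2],
    [4, 16, 3, 2],
    [2, 3, 19, 1],
    [3, 2, 3, 17],
]

def overall_accuracy(matrix):
    """Fraction of producers assigned to their true segment (diagonal / total)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def per_segment_recall(matrix, labels):
    """Within each true segment, the fraction of producers predicted correctly."""
    return {labels[i]: matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))}

overall = overall_accuracy(confusion)                 # 0.70 with these invented counts
by_segment = per_segment_recall(confusion, SEGMENTS)  # differs segment to segment
```

With these invented counts the overall rate matches the abstract's 70% while the per-segment rates diverge, which is exactly the pattern the abstract flags as a risk to a consistent value proposition.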
|
102 |
An improved clustering method for program restructuring /Laks, Jeffrey Mark. January 1981 (has links)
No description available.
|
103 |
The methodology of cluster analysis : an application to receivables /Heimann, Stephen R. January 1973 (has links)
No description available.
|
104 |
Clustering Concepts in Automatic Pattern Recognition / DeFilipps, Patricia J. 01 January 1975 (has links) (PDF)
During the past decade and a half, there has been considerable growth of interest in problems of pattern recognition. Contributions to this growth have come from many disciplines, including statistics, control theory, operations research, biology, linguistics, and computer science. One of the basic approaches to pattern recognition is cluster analysis, in which various methodologies may be successfully employed. It is the purpose of this research report to investigate some of the basic clustering concepts in automatic pattern recognition.
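As a small illustration of the kind of basic clustering concept such a report surveys, here is a bare-bones k-means pass in Python; the points, the initial centroids, and the fixed iteration count are all invented for the example.

```python
# A bare-bones k-means loop, one of the basic clustering approaches to
# pattern recognition. Points and initial centroids are invented toy data;
# for simplicity we assume no cluster ever ends up empty.

def assign(points, centroids):
    """Label each point with the index of its nearest centroid (squared distance)."""
    def sq_dist(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))
    return [min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            for p in points]

def update(points, labels, k):
    """Recompute each centroid as the coordinate-wise mean of its members."""
    new = []
    for i in range(k):
        members = [p for p, lab in zip(points, labels) if lab == i]
        new.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return new

points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids = [(0.0, 0.0), (9.0, 9.0)]   # deliberately rough starting guesses
for _ in range(5):                      # a few iterations suffice on toy data
    labels = assign(points, centroids)
    centroids = update(points, labels, k=2)
```

On this toy data the two obvious groups separate after the first pass and the centroids settle at the group means.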
|
105 |
Elevation-layered dendroclimatic signal in eastern Mediterranean tree rings / Touchan, Ramzi, Shishov, Vladimir V, Tychkov, Ivan I, Sivrikaya, Fatih, Attieh, Jihad, Ketmen, Muzaffer, Stephan, Jean, Mitsopoulos, Ioannis, Christou, Andreas, Meko, David M 01 April 2016 (has links)
Networks of tree-ring data are commonly applied in statistical reconstruction of spatial fields of climate variables. The importance of elevation to the climatic interpretation of tree-ring networks is addressed using 281 station precipitation records and a network of 79 tree-ring chronologies from different species and a range of elevations in the eastern Mediterranean. Cluster analysis of chronologies identifies 6 tree-ring groups, delineated principally by site elevation. Correlation analysis suggests several of the clusters are linked to homogeneous elevational moisture regimes. Results imply that climate stations close to the elevations of the tree-ring sites are essential for assessing the seasonal climatic signal in tree-ring chronologies from this region. A broader implication is that the elevations of stations contributing to gridded climate networks should be considered in the design and interpretation of field reconstructions of climate from tree rings. Finally, results suggest elevation-stratified tree-ring networks as a strategy for seasonal climate reconstruction.
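The correlation-analysis step can be sketched as a Pearson correlation between one chronology and one station record; the two series below are short synthetic stand-ins, not the study's 79 chronologies or 281 precipitation records.

```python
# Sketch of the correlation step described above: correlating a tree-ring
# chronology with a station precipitation record over the same years.
# Both series are synthetic, invented purely for illustration.
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

chronology    = [0.9, 1.1, 0.8, 1.3, 1.0, 0.7, 1.2]   # ring-width indices
precipitation = [400, 520, 380, 600, 450, 330, 560]   # mm, same years

r = pearson(chronology, precipitation)   # strongly positive for this toy pair
```

A high correlation between a chronology and a nearby station at a similar elevation is the kind of evidence the abstract's elevation-matching argument rests on.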
|
106 |
A study on privacy-preserving clustering / Cui, Yingjie., 崔英杰. January 2009 (has links)
published_or_final_version / Computer Science / Master / Master of Philosophy
|
107 |
Clustering uncertain data using Voronoi diagram / Lee, King-for, Foris., 李敬科. January 2009 (has links)
published_or_final_version / Computer Science / Master / Master of Philosophy
|
108 |
Generalized Feature Embedding Learning for Clustering and Classification / Unknown Date (has links)
Data comes in many different shapes and sizes. In real-life applications it is common that the data we are studying has features of varied data types, including numerical, categorical, and text. To model this data with machine learning algorithms, the data typically must be in numeric form; data that is not originally numerical must therefore be transformed before it can be used as input to these algorithms.

Along with this transformation, it is common that the data we study has many features relative to the number of samples. It is often desirable to reduce the number of features being trained in a model to eliminate noise and reduce training time. This problem of high dimensionality can be approached through feature selection, feature extraction, or feature embedding. Feature selection seeks to identify the most essential variables in a dataset that will lead to a parsimonious model and high-performing results, while feature extraction and embedding are techniques that apply a mathematical transformation of the data into a represented space. As a byproduct of using a new representation, we are able to reduce the dimension greatly without sacrificing performance. Oftentimes, by using embedded features we observe a gain in performance.

Though extraction and embedding methods may be powerful for isolated machine learning problems, they do not always generalize well. Therefore, we are motivated to illustrate a methodology that can be applied to any data type with little pre-processing. The methods we develop can be applied in unsupervised, supervised, incremental, and deep learning contexts. Using 28 benchmark datasets as examples, which include different data types, we construct a framework that can be applied to general machine learning tasks.

The techniques we develop contribute to the field of dimension reduction and feature embedding. Using this framework, we make additional contributions to eigendecomposition by creating an objective matrix that includes three vital components. The first is a class-partitioned row and feature product representation of one-hot encoded data. The second is the derivation of a weighted adjacency matrix based on class label relationships. Finally, by the inner product of these aforementioned values, we are able to condition the one-hot encoded data generated from the original data prior to eigenvector decomposition. The use of class partitioning and adjacency enables subsequent projections of the data to be trained more effectively when compared side-by-side to baseline algorithm performance. Along with this improved performance, we can adjust the dimension of the subsequent data arbitrarily. In addition, we also show how these dense vectors may be used in applications to order the features of generic data for deep learning.

In this dissertation, we examine a general approach to dimension reduction and feature embedding that utilizes a class-partitioned row and feature representation, a weighted approach to instance similarity, and an adjacency representation. This general approach has application to unsupervised, supervised, online, and deep learning. In our experiments on 28 benchmark datasets, we show significant performance gains in clustering, classification, and training time. / Includes bibliography. / Dissertation (Ph.D.)--Florida Atlantic University, 2018. / FAU Electronic Theses and Dissertations Collection
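Two ingredients named in the abstract, one-hot encoding of mixed-type data and a weighted adjacency based on class labels, can be sketched in a few lines. The samples, the same-class/cross-class weights, and the omission of the eigendecomposition step are all simplifications for illustration; this is not the dissertation's actual construction.

```python
# Sketch of two ingredients named above: one-hot encoding of categorical
# features, and a weighted adjacency derived from class labels that is
# multiplied into the encoded data ("conditioning"). The samples and the
# weights are invented; the eigendecomposition step is deliberately omitted.

samples = [
    {"color": "red",  "size": "small"},
    {"color": "blue", "size": "small"},
    {"color": "red",  "size": "large"},
]
labels = ["A", "B", "A"]

# Build a stable vocabulary of (feature, value) pairs, then one-hot encode.
vocab = sorted({(f, v) for s in samples for f, v in s.items()})

def one_hot(sample):
    return [1 if sample.get(f) == v else 0 for f, v in vocab]

X = [one_hot(s) for s in samples]

# Weighted adjacency: same-class pairs get a higher weight than cross-class
# pairs (the 1.0 / 0.1 weights are arbitrary choices for this sketch).
def adjacency(labels, same=1.0, diff=0.1):
    return [[same if li == lj else diff for lj in labels] for li in labels]

A = adjacency(labels)

# Condition the encoded data via the inner product A @ X: each row becomes
# a class-weighted mixture of all rows, pulling same-class rows together.
conditioned = [[sum(A[i][k] * X[k][j] for k in range(len(X)))
                for j in range(len(vocab))]
               for i in range(len(X))]
```

In a full pipeline, `conditioned` would be what is handed to the eigenvector decomposition to obtain the embedded features.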
|
109 |
Incremental document clustering for web page classification.January 2000 (has links)
by Wong, Wai-Chiu. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. / Includes bibliographical references (leaves 89-94). / Abstracts in English and Chinese. / Abstract --- p.ii / Acknowledgments --- p.iv / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Document Clustering --- p.2 / Chapter 1.2 --- DC-tree --- p.4 / Chapter 1.3 --- Feature Extraction --- p.5 / Chapter 1.4 --- Outline of the Thesis --- p.5 / Chapter 2 --- Related Work --- p.8 / Chapter 2.1 --- Clustering Algorithms --- p.8 / Chapter 2.1.1 --- Partitional Clustering Algorithms --- p.8 / Chapter 2.1.2 --- Hierarchical Clustering Algorithms --- p.10 / Chapter 2.2 --- Document Classification by Examples --- p.11 / Chapter 2.2.1 --- k-NN algorithm - Expert Network (ExpNet) --- p.11 / Chapter 2.2.2 --- Learning Linear Text Classifier --- p.12 / Chapter 2.2.3 --- Generalized Instance Set (GIS) algorithm --- p.12 / Chapter 2.3 --- Document Clustering --- p.13 / Chapter 2.3.1 --- B+-tree-based Document Clustering --- p.13 / Chapter 2.3.2 --- Suffix Tree Clustering --- p.14 / Chapter 2.3.3 --- Association Rule Hypergraph Partitioning Algorithm --- p.15 / Chapter 2.3.4 --- Principal Component Divisive Partitioning --- p.17 / Chapter 2.4 --- Projections for Efficient Document Clustering --- p.18 / Chapter 3 --- Background --- p.21 / Chapter 3.1 --- Document Preprocessing --- p.21 / Chapter 3.1.1 --- Elimination of Stopwords --- p.22 / Chapter 3.1.2 --- Stemming Technique --- p.22 / Chapter 3.2 --- Problem Modeling --- p.23 / Chapter 3.2.1 --- Basic Concepts --- p.23 / Chapter 3.2.2 --- Vector Model --- p.24 / Chapter 3.3 --- Feature Selection Scheme --- p.25 / Chapter 3.4 --- Similarity Model --- p.27 / Chapter 3.5 --- Evaluation Techniques --- p.29 / Chapter 4 --- Feature Extraction and Weighting --- p.31 / Chapter 4.1 --- Statistical Analysis of the Words in the Web Domain --- p.31 / Chapter 4.2 --- Zipf's Law --- p.33 / Chapter 4.3 --- Traditional Methods --- p.36 / Chapter 4.4 --- The Proposed Method --- p.38 / Chapter 4.5 --- Experimental Results --- p.40 / Chapter 4.5.1 --- Synthetic Data Generation --- p.40 / Chapter 4.5.2 --- Real Data Source --- p.41 / Chapter 4.5.3 --- Coverage --- p.41 / Chapter 4.5.4 --- Clustering Quality --- p.43 / Chapter 4.5.5 --- Binary Weight vs Numerical Weight --- p.45 / Chapter 5 --- Web Document Clustering Using DC-tree --- p.48 / Chapter 5.1 --- Document Representation --- p.48 / Chapter 5.2 --- Document Cluster (DC) --- p.49 / Chapter 5.3 --- DC-tree --- p.52 / Chapter 5.3.1 --- Tree Definition --- p.52 / Chapter 5.3.2 --- Insertion --- p.54 / Chapter 5.3.3 --- Node Splitting --- p.55 / Chapter 5.3.4 --- Deletion and Node Merging --- p.56 / Chapter 5.4 --- The Overall Strategy --- p.57 / Chapter 5.4.1 --- Preprocessing --- p.57 / Chapter 5.4.2 --- Building DC-tree --- p.59 / Chapter 5.4.3 --- Identifying the Interesting Clusters --- p.60 / Chapter 5.5 --- Experimental Results --- p.61 / Chapter 5.5.1 --- Alternative Similarity Measurement : Synthetic Data --- p.61 / Chapter 5.5.2 --- DC-tree Characteristics : Synthetic Data --- p.63 / Chapter 5.5.3 --- Compare DC-tree and B+-tree: Synthetic Data --- p.64 / Chapter 5.5.4 --- Compare DC-tree and B+-tree: Real Data --- p.66 / Chapter 5.5.5 --- Varying the Number of Features : Synthetic Data --- p.67 / Chapter 5.5.6 --- Non-Correlated Topic Web Page Collection: Real Data --- p.69 / Chapter 5.5.7 --- Correlated Topic Web Page Collection: Real Data --- p.71 / Chapter 5.5.8 --- Incremental updates on Real Data Set --- p.72 / Chapter 5.5.9 --- Comparison with the other clustering algorithms --- p.73 / Chapter 6 --- Conclusion --- p.75 / Appendix --- p.77 / Chapter A --- Stopword List --- p.77 / Chapter B --- Porter's Stemming Algorithm --- p.81 / Chapter C --- Insertion Algorithm --- p.83 / Chapter D --- Node Splitting Algorithm --- p.85 / Chapter E --- Features Extracted in Experiment 4.5.3 --- p.87 / Bibliography --- p.88
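The incremental idea behind DC-tree-style document clustering can be sketched in flat form: each arriving document joins its most similar existing cluster or seeds a new one. The actual DC-tree organizes cluster summaries hierarchically with insertion, node splitting, and merging; the documents and similarity threshold below are invented for the example.

```python
# Minimal flat sketch of incremental document clustering: each incoming
# document joins the most similar existing cluster (cosine similarity on
# term-frequency vectors) or starts a new one. The thesis's DC-tree keeps
# a hierarchy of such cluster summaries; documents and the 0.3 threshold
# here are invented.
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    common = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in common)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def insert(clusters, doc, threshold=0.3):
    """Add one document to its best-matching cluster, or start a new one."""
    vec = Counter(doc.lower().split())
    best, best_sim = None, 0.0
    for c in clusters:
        sim = cosine(vec, c)
        if sim > best_sim:
            best, best_sim = c, sim
    if best is not None and best_sim >= threshold:
        best.update(vec)          # merge term counts into the cluster summary
    else:
        clusters.append(vec)      # dissimilar document seeds a new cluster
    return clusters

clusters = []
for doc in ["web page clustering", "clustering web documents",
            "stock market prices", "market price index"]:
    insert(clusters, doc)
```

Because each document touches only the existing cluster summaries, updates are incremental: adding a new page never requires re-clustering the whole collection, which is the property the thesis exploits.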
|
110 |
Entropy-based subspace clustering for mining numerical data.January 1999 (has links)
by Cheng, Chun-hung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. / Includes bibliographical references (leaves 72-76). / Abstracts in English and Chinese. / Abstract --- p.ii / Acknowledgments --- p.iv / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Six Tasks of Data Mining --- p.1 / Chapter 1.1.1 --- Classification --- p.2 / Chapter 1.1.2 --- Estimation --- p.2 / Chapter 1.1.3 --- Prediction --- p.2 / Chapter 1.1.4 --- Market Basket Analysis --- p.3 / Chapter 1.1.5 --- Clustering --- p.3 / Chapter 1.1.6 --- Description --- p.3 / Chapter 1.2 --- Problem Description --- p.4 / Chapter 1.3 --- Motivation --- p.5 / Chapter 1.4 --- Terminology --- p.7 / Chapter 1.5 --- Outline of the Thesis --- p.7 / Chapter 2 --- Survey on Previous Work --- p.8 / Chapter 2.1 --- Data Mining --- p.8 / Chapter 2.1.1 --- Association Rules and its Variations --- p.9 / Chapter 2.1.2 --- Rules Containing Numerical Attributes --- p.15 / Chapter 2.2 --- Clustering --- p.17 / Chapter 2.2.1 --- The CLIQUE Algorithm --- p.20 / Chapter 3 --- Entropy and Subspace Clustering --- p.24 / Chapter 3.1 --- Criteria of Subspace Clustering --- p.24 / Chapter 3.1.1 --- Criterion of High Density --- p.25 / Chapter 3.1.2 --- Correlation of Dimensions --- p.25 / Chapter 3.2 --- Entropy in a Numerical Database --- p.27 / Chapter 3.2.1 --- Calculation of Entropy --- p.27 / Chapter 3.3 --- Entropy and the Clustering Criteria --- p.29 / Chapter 3.3.1 --- Entropy and the Coverage Criterion --- p.29 / Chapter 3.3.2 --- Entropy and the Density Criterion --- p.31 / Chapter 3.3.3 --- Entropy and Dimensional Correlation --- p.33 / Chapter 4 --- The ENCLUS Algorithms --- p.35 / Chapter 4.1 --- Framework of the Algorithms --- p.35 / Chapter 4.2 --- Closure Properties --- p.37 / Chapter 4.3 --- Complexity Analysis --- p.39 / Chapter 4.4 --- Mining Significant Subspaces --- p.40 / Chapter 4.5 --- Mining Interesting Subspaces --- p.42 / Chapter 4.6 --- Example --- p.44 / Chapter 5 --- Experiments --- p.49 / Chapter 5.1 --- Synthetic Data --- p.49 / Chapter 5.1.1 --- Data Generation - Hyper-rectangular Data --- p.49 / Chapter 5.1.2 --- Data Generation - Linearly Dependent Data --- p.50 / Chapter 5.1.3 --- Effect of Changing the Thresholds --- p.51 / Chapter 5.1.4 --- Effectiveness of the Pruning Strategies --- p.53 / Chapter 5.1.5 --- Scalability Test --- p.53 / Chapter 5.1.6 --- Accuracy --- p.55 / Chapter 5.2 --- Real-life Data --- p.55 / Chapter 5.2.1 --- Census Data --- p.55 / Chapter 5.2.2 --- Stock Data --- p.56 / Chapter 5.3 --- Comparison with CLIQUE --- p.58 / Chapter 5.3.1 --- Subspaces with Uniform Projections --- p.60 / Chapter 5.4 --- Problems with Hyper-rectangular Data --- p.62 / Chapter 6 --- Miscellaneous Enhancements --- p.64 / Chapter 6.1 --- Extra Pruning --- p.64 / Chapter 6.2 --- Multi-resolution Approach --- p.65 / Chapter 6.3 --- Multi-threshold Approach --- p.68 / Chapter 7 --- Conclusion --- p.70 / Bibliography --- p.71 / Appendix --- p.77 / Chapter A --- Differential Entropy vs Discrete Entropy --- p.77 / Chapter A.1 --- Relation of Differential Entropy to Discrete Entropy --- p.78 / Chapter B --- Mining Quantitative Association Rules --- p.80 / Chapter B.1 --- Approaches --- p.81 / Chapter B.2 --- Performance --- p.82 / Chapter B.3 --- Final Remarks --- p.83
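The entropy criterion at the heart of ENCLUS-style subspace clustering can be sketched directly: discretize a dimension into equal-width cells and compute the Shannon entropy of the cell occupancy; low entropy indicates clustering structure, while a uniform spread gives high entropy. The data and grid below are invented for illustration.

```python
# Sketch of the entropy criterion behind entropy-based subspace clustering:
# discretize a dimension into equal-width bins and compute the Shannon
# entropy of the bin occupancy. A dimension whose points pile into few
# bins (low entropy) is a better clustering candidate than a uniformly
# spread one. The data and the 4-bin grid are invented.
import math
from collections import Counter

def entropy(values, bins=4, lo=0.0, hi=1.0):
    """Shannon entropy (bits) of equal-width bin occupancy over [lo, hi)."""
    width = (hi - lo) / bins
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

clustered = [0.10, 0.12, 0.15, 0.11, 0.13, 0.14]   # all fall in one bin
uniform   = [0.10, 0.35, 0.60, 0.85, 0.20, 0.70]   # spread over all bins

low  = entropy(clustered)   # 0.0 bits: perfectly concentrated
high = entropy(uniform)     # near the 2-bit maximum for 4 bins
```

In the full algorithm this per-subspace entropy is what drives both the coverage/density criteria and the pruning of candidate subspaces.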
|