The present paper deals with the choice of clustering algorithm applied before treating a k-sample problem. We investigate multivariate data sets that are quantized by algorithms that define partitions by maximal support planes (MSP) of a convex function. These algorithms belong to a wide class that contains as special cases both the well-known k-means algorithm and the Kohonen (1985) algorithm, and they have been investigated in depth by Pötzelberger and Strasser (1999). To compute the test statistics for the k-sample problem, we replace the data points by their conditional expectations with respect to the MSP-partition. We present Monte Carlo simulations of the power functions of different tests for the k-sample problem; the tests are carried out as multivariate permutation tests to ensure that they hold the level. The results suggest a strong connection between the optimal choice of the clustering algorithm and the tails of the probability distribution of the data. In particular, for distributions with heavy tails like the exponential distribution, the performance of tests based on a quadratic convex function with k-means type partitions breaks down completely. (author's abstract) / Series: Report Series SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
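The pipeline described in the abstract (quantize the pooled data, replace each point by the mean of its cell, then run a permutation test) can be illustrated with a minimal sketch. It assumes plain k-means, which the abstract names as a special case of the MSP class, and a between-group sum of squares as the test statistic. The statistic and all function names (`kmeans_quantize`, `between_group_stat`, `permutation_test`) are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np


def kmeans_quantize(x, n_cells, n_iter=50, seed=0):
    """Partition x with Lloyd's k-means and replace each point by its cell mean."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # Initialise centers with randomly chosen data points (fancy indexing copies).
    centers = x[rng.choice(len(x), size=n_cells, replace=False)]
    for _ in range(n_iter):
        # Squared distances of every point to every center: shape (n, n_cells).
        d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(n_cells):
            members = x[labels == j]
            if len(members) > 0:  # keep the old center if a cell is empty
                centers[j] = members.mean(axis=0)
    d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d.argmin(axis=1)
    # Empirical conditional expectation given the final partition.
    quantized = np.empty_like(x)
    for j in np.unique(labels):
        quantized[labels == j] = x[labels == j].mean(axis=0)
    return quantized, labels


def between_group_stat(z, groups):
    """Assumed statistic: weighted squared distance of group means to the grand mean."""
    grand = z.mean(axis=0)
    stat = 0.0
    for g in np.unique(groups):
        zg = z[groups == g]
        stat += len(zg) * ((zg.mean(axis=0) - grand) ** 2).sum()
    return stat


def permutation_test(x, groups, n_cells=4, n_perm=999, seed=0):
    """Permutation p-value for the k-sample problem on quantized data."""
    groups = np.asarray(groups)
    # The partition is built from the pooled data without using group labels,
    # so the labels stay exchangeable under the null hypothesis.
    z, _ = kmeans_quantize(x, n_cells, seed=seed)
    rng = np.random.default_rng(seed)
    observed = between_group_stat(z, groups)
    count = 0
    for _ in range(n_perm):
        if between_group_stat(z, rng.permutation(groups)) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)


# Usage: two bivariate samples, the second shifted in location.
rng = np.random.default_rng(1)
x = np.vstack([rng.normal(size=(60, 2)), rng.normal(loc=3.0, size=(60, 2))])
groups = np.repeat([0, 1], 60)
p_value = permutation_test(x, groups)
```

Because the partition is computed once on the pooled sample, permuting the group labels leaves the quantized points unchanged, which is what lets the permutation test hold its level regardless of the clustering algorithm chosen.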
Identifier | oai:union.ndltd.org:VIENNA/oai:epub.wu-wien.ac.at:epub-wu-01_1d6 |
Date | January 1999 |
Creators | Rahnenführer, Jörg |
Publisher | SFB Adaptive Information Systems and Modelling in Economics and Management Science, WU Vienna University of Economics and Business |
Source Sets | Wirtschaftsuniversität Wien |
Language | English |
Type | Paper, NonPeerReviewed |
Format | application/pdf |
Relation | http://epub.wu.ac.at/1364/ |