Global ETD Search

1	Artificial binary data scenarios Dolnicar, Sara, Leisch, Friedrich, Weingessel, Andreas January 1998 This manual describes artificial binary data scenarios. These data sets can be used to compare the performance of algorithms for market segmentation. The data sets described in this manual are available as packages for R (Splus) and as ASCII files at http://www.ci.tuwien.ac.at/SFB/. (author's abstract) / Series: Working Papers SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
2	On the generation of correlated artificial binary data Leisch, Friedrich, Weingessel, Andreas, Hornik, Kurt January 1998 The generation of random variates from multivariate binary distributions has not gained as much interest in the literature as, e.g., multivariate normal or Poisson distributions. Binary variables are important in many types of applications. Our main interest is in the segmentation of marketing data, where data come from customer questionnaires with "yes/no" questions. Artificial data provide a valuable tool for the analysis of segmentation tools, because data with known structure can be constructed to mimic situations from the real world (Dolnicar et al. 1998). Questionnaire data can be highly correlated, when several questions covering the same field are likely to be answered similarly by a subject. In this paper we present a computationally fast method to simulate multivariate binary distributions with a given correlation structure. The implementation of the algorithm in R, an implementation of the S statistical language, is described in the appendix. / Series: Working Papers SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
3	A note on topological concepts for complexity reduction of binary survey data Buchta, Christian, Dolnicar, Sara, Köck, Robert, Skriner, Edith January 2000 Recently in Steiner 1999 the use of topological concepts was suggested for deciding on the level of complexity reduction of continuous data sets by quantisations. Level of complexity is synonymous with the number of classes or class representing means (centroids) of a data partition. In this paper a simulation study of the applicability of this concept to binary data is presented which is comparable with the results in Weingessel et al. 1999. The class of quantisation methods used contains traditional k-means and three robust approaches to partitioning, as suggested in Pötzelberger and Strasser 1997. (author's abstract) / Series: Working Papers SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
4	A comparison of several cluster algorithms on artificial binary data [Part 2]. Scenarios from travel market segmentation. Part 2 (Addition to Working Paper No. 7). Dolnicar, Sara, Leisch, Friedrich, Steiner, Gottfried, Weingessel, Andreas January 1998 The search for clusters in empirical data is an important and often encountered research problem. Numerous algorithms exist that are able to render groups of objects or individuals. Of course each algorithm has its strengths and weaknesses. In order to identify these crucial points artificial data was generated - based primarily on experience with structures of empirical data - and used as benchmark for evaluating the results of numerous cluster algorithms. This work is an addition to SFB Working Paper No. 7 where hard competitive learning (HCL), neural gas (NGAS), k-means and self organizing maps (SOMs) were compared. Since the artificial data scenarios and the evaluation criteria used remained the same, they are not explained in this work, where the results of five additional algorithms are evaluated. (author's abstract) / Series: Working Papers SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
5	A comparison of several cluster algorithms on artificial binary data [Part 1]. Scenarios from travel market segmentation [Part 2: Working Paper 19]. Dolnicar, Sara, Leisch, Friedrich, Weingessel, Andreas, Buchta, Christian, Dimitriadou, Evgenia January 1998 Social scientists confronted with the problem of segmenting individuals into plausible subgroups usually encounter two main problems. First, there is very little indication about the correct choice of the number of clusters to search for. Second, different cluster algorithms and even multiple replications of the same algorithm result in different solutions due to random initializations and stochastic learning methods. In the worst case numerous solutions are found which all seem plausible as far as interpretation is concerned. The consequence is that in the end clusters are postulated that are in fact "chosen" by the researcher, as he or she makes decisions on the number of clusters and the solution chosen as the "final" one. In this paper we concentrate on the power and stability of several popular clustering algorithms under the condition that the correct number of clusters is known. Artificial data sets modeled to mimic typical situations from tourism marketing are constructed. The structure of these data sets is described in several scenarios, and artificial binary data are generated accordingly. These data, ranging from very simple to more complex, real-data-like structures, enable us to systematically analyze the "behavior" of the cluster methods. Section 3 gives an overview of all cluster methods under investigation. Section 4 describes our experimental results, comparing first all scenarios and then all cluster methods. To accomplish this task, several evaluation criteria for cluster methods are proposed. Finally, Sections 5 and 6 draw some conclusions and give an outlook on future research. (author's abstract) / Series: Working Papers SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
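
Entry 2 above concerns simulating multivariate binary data with a given correlation structure. The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of one common approach: thresholding latent normal variables that share a common factor. The function name `correlated_binary` and its parameters are hypothetical and are not taken from the paper or its R implementation.

```python
import math
import random
from statistics import NormalDist


def correlated_binary(n_rows, probs, rho, seed=0):
    """Draw n_rows binary vectors with marginal probabilities `probs`.

    Assumption (not from the listed papers): every pair of latent normal
    variables has the same correlation `rho` (0 <= rho <= 1). Each entry
    of `probs` must lie strictly between 0 and 1.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Threshold t_i chosen so that P(z_i < t_i) = probs[i] for z_i ~ N(0, 1).
    thresholds = [std_normal.inv_cdf(p) for p in probs]
    shared_weight = math.sqrt(rho)
    own_weight = math.sqrt(1.0 - rho)

    rows = []
    for _ in range(n_rows):
        shared = rng.gauss(0.0, 1.0)  # common factor induces the correlation
        row = []
        for t in thresholds:
            # z has unit variance; any two z's in a row have correlation rho.
            z = shared_weight * shared + own_weight * rng.gauss(0.0, 1.0)
            row.append(1 if z < t else 0)
        rows.append(row)
    return rows


# Example: 1000 simulated respondents answering three yes/no questions.
answers = correlated_binary(1000, [0.3, 0.5, 0.7], rho=0.6, seed=42)
```

Note that `rho` here is the correlation of the latent normal variables; the correlation between the resulting binary columns is positive but generally smaller, so matching a prescribed binary correlation matrix would need an additional calibration step.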
