Approximating a distribution is a prominent problem in statistics. This dissertation explores the theory of principal points and principal curves as methods for approximating a distribution. Principal points of a distribution were first introduced by Flury (1990), who tackled the problem of optimal grouping in multivariate data. In essence, principal points are the theoretical counterparts of the cluster means obtained by the k-means algorithm. Principal curves, defined by Hastie (1984), are smooth one-dimensional curves that pass through the middle of a p-dimensional data set, providing a nonlinear summary of the data. In this dissertation, the usefulness of principal points and principal curves is reviewed. Their application is then extended beyond its original purpose to well-known computational methods in machine learning, such as Support Vector Machines.
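The link between principal points and k-means cluster means can be illustrated with a minimal sketch. It is not taken from the dissertation: the function name `estimate_principal_points`, the quantile-based starting values, and the sample size are assumptions chosen for illustration. The sketch runs plain Lloyd (k-means) iterations on a simulated standard normal sample, whose two principal points are known to be ±√(2/π).

```python
import numpy as np


def estimate_principal_points(sample, k, n_iter=100):
    """Run Lloyd's (k-means) iterations on a 1-D sample.

    Hypothetical helper for illustration only; returns k sorted centres,
    which serve as sample estimates of the k principal points.
    """
    # Start from evenly spaced sample quantiles so the run is deterministic.
    centres = np.quantile(sample, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        # Assign each observation to its nearest centre.
        labels = np.argmin(np.abs(sample[:, None] - centres[None, :]), axis=1)
        # Move each centre to the mean of its group (keep it if the group is empty).
        new_centres = np.array([
            sample[labels == j].mean() if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return np.sort(centres)


# Simulated data: an assumption of this sketch, not data from the dissertation.
rng = np.random.default_rng(0)
sample = rng.standard_normal(20_000)

points = estimate_principal_points(sample, k=2)
theoretical = np.sqrt(2 / np.pi)  # two principal points of N(0, 1) are +/- this value

print("estimated:  ", points)
print("theoretical:", [-theoretical, theoretical])
```

With a large sample, the two estimated centres should lie close to the theoretical values, which is the sense in which principal points are the population counterparts of k-means cluster means.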
Identifier | oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:uct/oai:localhost:11427/15515 |
Date | January 2015 |
Creators | Ganey, Raeesa |
Contributors | Lubbe, Sugnet |
Publisher | University of Cape Town, Faculty of Science, Department of Statistical Sciences |
Source Sets | South African National ETD Portal |
Language | English |
Detected Language | English |
Type | Master Thesis, Masters, MSc |
Format | application/pdf |