Investigation on Bayesian Ying-Yang learning for model selection in unsupervised learning. / CUHK electronic theses & dissertations collection / Digital dissertation consortium

For factor analysis models, we develop an improved BYY harmony data smoothing learning criterion (BYY-HDS) by taking into account the dependence between the factors and the observations. We empirically compare the BYY harmony empirical learning criterion (BYY-HEC), BYY-HDS, the BYY automatic model selection method (BYY-AUTO), AIC, CAIC, BIC, and CV for selecting the number of factors, both on simulated data sets of different sample sizes, noise variances, data dimensions and factor numbers, and on two real data sets: air pollution data and sport track records. / Model selection is a critical issue in unsupervised learning. Conventionally, model selection is implemented in two phases using a statistical model selection criterion such as Akaike's information criterion (AIC), Bozdogan's consistent Akaike's information criterion (CAIC), Schwarz's Bayesian inference criterion (BIC), which formally coincides with the minimum description length (MDL) criterion, or the cross-validation (CV) criterion. These methods are very time-consuming and may become problematic when the sample size is small. Recently, Bayesian Ying-Yang (BYY) harmony learning has been developed as a unified framework with new mechanisms for model selection and regularization. In this thesis we make a systematic investigation of BYY learning, as well as several typical model selection criteria, for model selection on factor analysis models, Gaussian mixture models, and factor analysis mixture models. / The most remarkable finding of our study is that BYY-HDS is superior to its counterparts, especially when the sample size is small. AIC, BYY-HEC, BYY-AUTO and CV carry a risk of overestimating, while BIC and CAIC carry a risk of underestimating in most cases. BYY-AUTO is superior to the other methods in terms of computational cost. The cross-validation method has the highest computing cost. (Abstract shortened by UMI.) / Hu Xuelei. / "November 2005." / Adviser: Lei Xu. 
/ Source: Dissertation Abstracts International, Volume: 67-07, Section: B, page: 3899. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2005. / Includes bibliographical references (p. 131-142). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. Ann Arbor, MI : ProQuest Information and Learning Company, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract in English and Chinese. / School code: 1307.

Identifier: oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_343655
Date: January 2005
Contributors: Hu, Xuelei., Chinese University of Hong Kong Graduate School. Division of Computer Science and Engineering.
Source Sets: The Chinese University of Hong Kong
Language: English, Chinese
Detected Language: English
Type: Text, theses
Format: electronic resource, microform, microfiche, 1 online resource (xx, 142 p. : ill.)
Rights: Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
