Mixture of Factor Analyzers with Information Criteria and the Genetic Algorithm

In this dissertation, we develop and combine several statistical techniques in Bayesian factor analysis (BAYFA) and mixture of factor analyzers (MFA) to overcome the shortcomings of these existing methods. Information criteria are brought into the BAYFA model as a decision rule for choosing the number of factors m, together with the Press and Shigemasu method, Gibbs sampling, and Iterated Conditional Modes deterministic optimization. Because BAYFA is sensitive to the prior information on the factor pattern structure, the prior factor pattern structure is learned adaptively, directly from the sample observations, using the Sparse Root algorithm.
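As a rough illustration of using an information criterion as a decision rule for the number of factors m, the sketch below scores candidate values of m with AIC and BIC. It rests on several assumptions that are not taken from the abstract: the abstract does not name the criteria used, so AIC and BIC serve as generic stand-ins; the parameter count is the one for classical maximum-likelihood factor analysis rather than the Bayesian model; and the helper names (`fa_free_params`, `choose_m`) and the log-likelihood values are hypothetical.

```python
import math


def fa_free_params(p: int, m: int) -> int:
    """Free parameters of a classical m-factor model for p variables.

    p means + p*m loadings + p uniquenesses, minus m(m-1)/2
    constraints that remove the rotational indeterminacy.
    """
    return p + p * m + p - m * (m - 1) // 2


def aic(loglik: float, k: int) -> float:
    """Akaike information criterion (smaller is better)."""
    return -2.0 * loglik + 2.0 * k


def bic(loglik: float, k: int, n: int) -> float:
    """Bayesian (Schwarz) information criterion (smaller is better)."""
    return -2.0 * loglik + k * math.log(n)


def choose_m(logliks: dict, p: int, n: int, criterion: str = "bic") -> int:
    """Return the candidate m with the smallest criterion value.

    logliks maps each candidate m to the maximized log-likelihood
    of the fitted m-factor model (supplied by the caller).
    """
    scores = {}
    for m, loglik in logliks.items():
        k = fa_free_params(p, m)
        scores[m] = bic(loglik, k, n) if criterion == "bic" else aic(loglik, k)
    return min(scores, key=scores.get)


# Made-up log-likelihoods for p = 6 variables and n = 200 observations.
example_logliks = {1: -1450.0, 2: -1402.0, 3: -1399.5}
selected_m = choose_m(example_logliks, p=6, n=200)
```

The point of the design is that the log-likelihood always improves as m grows, so the penalty term is what stops the rule from picking the largest model.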
Clustering and dimensionality reduction have long been considered two of the fundamental problems in unsupervised learning and statistical pattern recognition. In this dissertation, we introduce a novel statistical learning technique that treats MFA as a method for model-based density estimation: it clusters high-dimensional data and, at the same time, carries out factor analysis to mitigate the curse of dimensionality within an expert data mining system. The typical EM algorithm can get trapped in one of the many local maxima; it can therefore be slow to converge, may never reach the global optimum, and is highly dependent on its initial values. We extend the EM algorithm proposed by Ghahramani and Hinton (1997) for MFA using intelligent initialization techniques, namely K-means and the regularized Mahalanobis distance, and introduce the new Genetic Expectation Algorithm (GEM) into MFA to overcome these shortcomings of the typical EM algorithm. Another shortcoming of the EM algorithm for MFA is that it assumes the variance of the error vector and the number of factors are the same for each mixture component. We propose a Two-Stage GEM algorithm for MFA that relaxes this constraint and obtains a different number of factors for each population. Our approach uses information criteria as the fitness function within the statistical modeling procedures to determine the number of mixture clusters and, at the same time, to choose the number of factors that can be extracted from the data.
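The idea of using an information criterion as the fitness function of a genetic search can be sketched as follows. This is a minimal illustration under stated assumptions, not the GEM algorithm of the dissertation: it only searches over pairs (k, m) of cluster count and factor count, it does not run EM, and the function `genetic_search`, its parameters, and the toy `toy_criterion` score are all hypothetical. In practice the score would be the criterion value of an MFA model fitted with k clusters and m factors.

```python
import random


def genetic_search(score, k_range, m_range, pop_size=12,
                   generations=30, mutation_rate=0.3, seed=0):
    """Search integer pairs (k, m) for the lowest score.

    score: callable taking (k, m) and returning a number, lower is fitter.
    k_range, m_range: inclusive (low, high) bounds.
    Returns the best pair found and the best score seen in each generation.
    """
    rng = random.Random(seed)

    def clamp(value, bounds):
        return min(max(value, bounds[0]), bounds[1])

    def mutate(individual):
        k, m = individual
        if rng.random() < mutation_rate:
            k = clamp(k + rng.choice((-1, 1)), k_range)
        if rng.random() < mutation_rate:
            m = clamp(m + rng.choice((-1, 1)), m_range)
        return (k, m)

    population = [(rng.randint(*k_range), rng.randint(*m_range))
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=score)
        history.append(score(population[0]))
        # Elitism: the fitter half survives unchanged.
        elite = population[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Crossover: k from one parent, m from the other.
            children.append(mutate((a[0], b[1])))
        population = elite + children
    return min(population, key=score), history


def toy_criterion(individual):
    """Stand-in for an information criterion; smallest at (3, 2)."""
    k, m = individual
    return (k - 3) ** 2 + (m - 2) ** 2


best_pair, best_scores = genetic_search(toy_criterion, (1, 6), (1, 5))
```

Because the fitter half of each generation is carried over unchanged, the best score recorded per generation can never get worse, which is the property that makes elitism a common choice when each fitness evaluation is an expensive model fit.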

http://trace.tennessee.edu/utk_graddiss/853

Mixture of Factor Analyzers

Multivariate Analysis

Statistical Methodology

Statistical Models

Statistical Theory

Identifier	oai:union.ndltd.org:UTENN/oai:trace.tennessee.edu:utk_graddiss-1918
Date	01 August 2010
Creators	Turan, Esra
Publisher	Trace: Tennessee Research and Creative Exchange
Source Sets	University of Tennessee Libraries
Detected Language	English
Type	text
Format	application/pdf
Source	Doctoral Dissertations
