Data Mining with Multivariate Kernel Regression Using Information Complexity and the Genetic Algorithm

Kernel density estimation is a data smoothing technique that depends heavily on bandwidth selection. The current literature has focused on optimal selectors for the univariate case that are primarily data-driven. Plug-in and cross-validation selectors have recently been extended to the general multivariate case.
This dissertation will introduce and develop new techniques for data mining with multivariate kernel density regression, using information complexity and the genetic algorithm as a heuristic optimizer to choose the optimal bandwidth and the best predictors in kernel regression models. Simulated and real data will be used to cross-validate the optimal bandwidth selectors using information complexity. The genetic algorithm is used in conjunction with information complexity to determine kernel density estimates for variable selection from high-dimensional multivariate data sets.
Kernel regression is also hybridized with the implicit enumeration algorithm to determine the set of independent variables for the global optimal solution using information criteria as the objective function. The results from the genetic algorithm are compared to the optimal solution from the implicit enumeration algorithm and the known global optimal solution from an explicit enumeration of all possible subset models.

Identifier: oai:union.ndltd.org:UTENN/oai:trace.tennessee.edu:utk_graddiss-1628
Date: 01 December 2009
Creators: Beal, Dennis Jack
Publisher: Trace: Tennessee Research and Creative Exchange
Source Sets: University of Tennessee Libraries
Detected Language: English
Type: text
Format: application/pdf
Source: Doctoral Dissertations