We are interested in variable sélection in linear régression models. This research is motivated by recent development in microarrays, proteomics, brain images, among others. We study this problem in both frequentist and bayesian viewpoints.In a frequentist framework, we propose methods to deal with the problem of variable sélection, when the number of variables is much larger than the sample size with a possibly présence of additional structure in the predictor variables, such as high corrélations or order between successive variables. The performance of the proposed methods is theoretically investigated ; we prove that, under regularity conditions, the proposed estimators possess statistical good properties, such as Sparsity Oracle Inequalities, variable sélection consistency and asymptotic normality.In a Bayesian Framework, we propose a global noninformative approach for Bayesian variable sélection. In this thesis, we pay spécial attention to two calibration-free hierarchical Zellner's g-priors. The first one is the Jeffreys prior which is not location invariant. A second one avoids this problem by only considering models with at least one variable in the model. The practical performance of the proposed methods is illustrated through numerical experiments on simulated and real world datasets, with a comparison betwenn Bayesian and frequentist approaches under a low informative constraint when the number of variables is almost equal to the number of observations.
Identifer | oai:union.ndltd.org:CCSD/oai:tel.archives-ouvertes.fr:tel-00661689 |
Date | 14 December 2011 |
Creators | El anbari, Mohammed |
Publisher | Université Paris Sud - Paris XI |
Source Sets | CCSD theses-EN-ligne, France |
Language | fra |
Detected Language | English |
Type | PhD thesis |
Page generated in 0.0018 seconds