Global ETD Search

1	Quelques contributions à la sélection de variables et aux tests non-paramétriques / A few contributions to variable selection and nonparametric tests
Comminges, Laëtitia, 12 December 2012
Real-world data are often extremely high-dimensional, severely underconstrained, and interspersed with a large number of irrelevant or redundant features. Relevant variable selection is a compelling approach to the statistical issues raised by high-dimensional, noisy data with small sample size. First, we address variable selection in the regression model when the number of variables is very large. The main focus is on the situation where the number of relevant variables is much smaller than the ambient dimension. Without assuming any parametric form for the underlying regression function, we obtain tight conditions under which the set of relevant variables can be consistently estimated. These conditions relate the intrinsic dimension to the ambient dimension and the sample size. Second, we consider the problem of testing a particular type of composite null hypothesis under a nonparametric multivariate regression model. For a given quadratic functional $Q$, the null hypothesis states that the regression function $f$ satisfies the constraint $Q[f] = 0$, while the alternative corresponds to the functions for which $|Q[f]|$ is bounded away from zero. We provide minimax rates of testing and the exact separation constants, along with a sharp-optimal testing procedure, for diagonal and nonnegative quadratic functionals. These results can be used to test the relevance of one or several explanatory variables. The study of minimax rates for diagonal quadratic functionals that are neither positive nor negative reveals two different regimes: “regular” and “irregular”. We apply this to testing the equality of the norms of two functions observed in noisy environments.
Keywords: Variable selection; Nonparametric regression; Nonparametric hypothesis testing; Sharp asymptotics; Separation rates; Minimax approach; Sparsity pattern; High-dimensional regression
2	Tests non paramétriques minimax pour de grandes matrices de covariance / Non parametric minimax tests for high dimensional covariance matrices
Zgheib, Rania, 23 May 2016
Our work contributes to the theory of nonparametric minimax tests for high-dimensional covariance matrices. More precisely, we observe $n$ independent, identically distributed vectors of dimension $p$, $X_1, \ldots, X_n$, with Gaussian distribution $\mathcal{N}_p(0, \Sigma)$, where $\Sigma$ is the unknown covariance matrix. We test the null hypothesis $H_0 : \Sigma = I$, where $I$ is the identity matrix. The alternative hypothesis is given by an ellipsoid from which a ball of radius $\varphi$ centered at $I$ is removed. Asymptotically, $n$ and $p$ tend to infinity. The introduction presents minimax test theory, other approaches to testing covariance matrices, and a summary of our results.
The second chapter is devoted to Toeplitz covariance matrices $\Sigma$. The connection with the spectral density model is discussed. We consider two types of ellipsoids, described by polynomial weights (Sobolev type) and exponential weights, respectively. We find the minimax separation rate in both cases. We also establish sharp asymptotic equivalents of the minimax type II error probability and the minimax total error probability. The asymptotically sharp minimax test procedure is based on an optimally weighted U-statistic of order 2.
The third chapter considers an alternative hypothesis of covariance matrices, not necessarily Toeplitz, that belong to a Sobolev-type ellipsoid of parameter $\alpha$. We obtain the minimax separation rate and give sharp asymptotic equivalents of the minimax type II error probability and the minimax total error probability. We propose an adaptive test procedure, that is, one free of $\alpha$, for $\alpha$ belonging to a compact subset of $(1/2, +\infty)$. Numerical implementation of the test procedures from the two preceding chapters shows that they behave well for large values of $p$; in particular, they gain significantly over existing methods when $p$ is large and $n$ is small.
The fourth chapter is dedicated to adaptive tests in a covariance model where the observations are incomplete. That is, each coordinate of the observed vector is missing independently with probability $1-a$, $a \in (0,1)$, where $a$ may tend to 0. We treat this problem as an inverse problem. We establish the minimax separation rates and introduce new adaptive test procedures. Here, the test statistics have constant weights. We consider Sobolev-type ellipsoids in both cases: Toeplitz and non-Toeplitz matrices.
Keywords: Covariance matrices; Toeplitz matrices; Adaptive tests; Minimax separation rates; Sharp asymptotics; Missing data
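The second abstract describes testing $H_0 : \Sigma = I$ with a U-statistic of order 2. As a rough illustration only, the sketch below computes a simple unweighted U-statistic that is an unbiased estimate of $\|\Sigma - I\|_F^2$ for centred observations. It is not the optimally weighted statistic of the thesis, which the abstract does not specify; the function name and the choice of the Frobenius distance are assumptions made for this example.

```python
import numpy as np


def identity_covariance_u_statistic(X):
    """Unbiased estimate of ||Sigma - I||_F^2 from mean-zero rows of X (shape n x p).

    Illustrative, unweighted statistic (an assumption of this sketch),
    not the weighted procedure described in the abstract.
    """
    n, p = X.shape
    gram = X @ X.T                       # gram[i, j] = <X_i, X_j>
    squared = gram ** 2
    # Sum of <X_i, X_j>^2 over i != j; its mean is an unbiased estimate of tr(Sigma^2).
    off_diagonal_sum = squared.sum() - np.trace(squared)
    tr_sigma2_hat = off_diagonal_sum / (n * (n - 1))
    # Mean of ||X_i||^2 is an unbiased estimate of tr(Sigma).
    tr_sigma_hat = np.trace(gram) / n
    # ||Sigma - I||_F^2 = tr(Sigma^2) - 2 tr(Sigma) + p
    return tr_sigma2_hat - 2.0 * tr_sigma_hat + p
```

Under $H_0$ the value fluctuates around zero, while under an alternative such as $\Sigma = 2I$ it concentrates near $p$. Turning it into a test would require a rejection threshold, which this sketch leaves out.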
