Global ETD Search

11	Learning algorithms for sparse classification / Algorithmes d'estimation pour la classification parcimonieuse Sanchez Merchante, Luis Francisco 07 June 2013 (has links) Cette thèse traite du développement d'algorithmes d'estimation en haute dimension. Ces algorithmes visent à résoudre des problèmes de discrimination et de classification, notamment, en incorporant un mécanisme de sélection des variables pertinentes. Les contributions de cette thèse se concrétisent par deux algorithmes, GLOSS pour la discrimination et Mix-GLOSS pour la classification. Tous les deux sont basés sur le résolution d'une régression régularisée de type "optimal scoring" avec une formulation quadratique de la pénalité group-Lasso qui encourage l'élimination des descripteurs non-significatifs. Les fondements théoriques montrant que la régression de type "optimal scoring" pénalisée avec un terme "group-Lasso" permet de résoudre un problème d'analyse discriminante linéaire ont été développés ici pour la première fois. L'adaptation de cette théorie pour la classification avec l'algorithme EM n'est pas nouvelle, mais elle n'a jamais été détaillée précisément pour les pénalités qui induisent la parcimonie. Cette thèse démontre solidement que l'utilisation d'une régression de type "optimal scoring" pénalisée avec un terme "group-Lasso" à l'intérieur d'une boucle EM est possible. Nos algorithmes ont été testés avec des bases de données réelles et artificielles en haute dimension avec des résultats probants en terme de parcimonie, et ce, sans compromettre la performance du classifieur. / This thesis deals with the development of estimation algorithms with embedded feature selection the context of high dimensional data, in the supervised and unsupervised frameworks. The contributions of this work are materialized by two algorithms, GLOSS for the supervised domain and Mix-GLOSS for unsupervised counterpart. Both algorithms are based on the resolution of optimal scoring regression regularized with a quadratic formulation of the group-Lasso penalty which encourages the removal of uninformative features. The theoretical foundations that prove that a group-Lasso penalized optimal scoring regression can be used to solve a linear discriminant analysis bave been firstly developed in this work. The theory that adapts this technique to the unsupervised domain by means of the EM algorithm is not new, but it has never been clearly exposed for a sparsity-inducing penalty. This thesis solidly demonstrates that the utilization of group-Lasso penalized optimal scoring regression inside an EM algorithm is possible. Our algorithms have been tested with real and artificial high dimensional databases with impressive resuits from the point of view of the parsimony without compromising prediction performances. Algorithmes d'estimation Classification parcimonieuse Discrimination Variables pertinentes Algorithmes GLOSS Linear discriminant analysis Feature selection Regularization Variational approach Clustering Optimal scoring Group Lasso Sparsity
12	Contributions to Structured Variable Selection Towards Enhancing Model Interpretation and Computation Efficiency Shen, Sumin 07 February 2020 (has links) The advances in data-collecting technologies provides great opportunities to access large sample-size data sets with high dimensionality. Variable selection is an important procedure to extract useful knowledge from such complex data. While in many real-data applications, appropriate selection of variables should facilitate the model interpretation and computation efficiency. It is thus important to incorporate domain knowledge of underlying data generation mechanism to select key variables for improving the model performance. However, general variable selection techniques, such as the best subset selection and the Lasso, often do not take the underlying data generation mechanism into considerations. This thesis proposal aims to develop statistical modeling methodologies with a focus on the structured variable selection towards better model interpretation and computation efficiency. Specifically, this thesis proposal consists of three parts: an additive heredity model with coefficients incorporating the multi-level data, a regularized dynamic generalized linear model with piecewise constant functional coefficients, and a structured variable selection method within the best subset selection framework. In Chapter 2, an additive heredity model is proposed for analyzing mixture-of-mixtures (MoM) experiments. The MoM experiment is different from the classical mixture experiment in that the mixture component in MoM experiments, known as the major component, is made up of sub-components, known as the minor components. The proposed model considers an additive structure to inherently connect the major components with the minor components. To enable a meaningful interpretation for the estimated model, we apply the hierarchical and heredity principles by using the nonnegative garrote technique for model selection. The performance of the additive heredity model was compared to several conventional methods in both unconstrained and constrained MoM experiments. The additive heredity model was then successfully applied in a real problem of optimizing the Pringlestextsuperscript{textregistered} potato crisp studied previously in the literature. In Chapter 3, we consider the dynamic effects of variables in the generalized linear model such as logistic regression. This work is motivated from the engineering problem with varying effects of process variables to product quality caused by equipment degradation. To address such challenge, we propose a penalized dynamic regression model which is flexible to estimate the dynamic coefficient structure. The proposed method considers modeling the functional coefficient parameter as piecewise constant functions. Specifically, under the penalized regression framework, the fused lasso penalty is adopted for detecting the changes in the dynamic coefficients. The group lasso penalty is applied to enable a sparse selection of variables. Moreover, an efficient parameter estimation algorithm is also developed based on alternating direction method of multipliers. The performance of the dynamic coefficient model is evaluated in numerical studies and three real-data examples. In Chapter 4, we develop a structured variable selection method within the best subset selection framework. In the literature, many techniques within the LASSO framework have been developed to address structured variable selection issues. However, less attention has been spent on structured best subset selection problems. In this work, we propose a sparse Ridge regression method to address structured variable selection issues. The key idea of the proposed method is to re-construct the regression matrix in the angle of experimental designs. We employ the estimation-maximization algorithm to formulate the best subset selection problem as an iterative linear integer optimization (LIO) problem. the mixed integer optimization algorithm as the selection step. We demonstrate the power of the proposed method in various structured variable selection problems. Moverover, the proposed method can be extended to the ridge penalized best subset selection problems. The performance of the proposed method is evaluated in numerical studies. / Doctor of Philosophy / The advances in data-collecting technologies provides great opportunities to access large sample-size data sets with high dimensionality. Variable selection is an important procedure to extract useful knowledge from such complex data. While in many real-data applications, appropriate selection of variables should facilitate the model interpretation and computation efficiency. It is thus important to incorporate domain knowledge of underlying data generation mechanism to select key variables for improving the model performance. However, general variable selection techniques often do not take the underlying data generation mechanism into considerations. This thesis proposal aims to develop statistical modeling methodologies with a focus on the structured variable selection towards better model interpretation and computation efficiency. The proposed approaches have been applied to real-world problems to demonstrate their model performance. Model Selection Nonnegative Garrote Method Mixture Experiments Dynamic Coefficient Fused Lasso Group Lasso Integer Optimization Expectation-Maximization (EM) LASSO.
13	Analyse en composantes indépendantes avec une matrice de mélange éparse Billette, Marc-Olivier 06 1900 (has links) L'analyse en composantes indépendantes (ACI) est une méthode d'analyse statistique qui consiste à exprimer les données observées (mélanges de sources) en une transformation linéaire de variables latentes (sources) supposées non gaussiennes et mutuellement indépendantes. Dans certaines applications, on suppose que les mélanges de sources peuvent être groupés de façon à ce que ceux appartenant au même groupe soient fonction des mêmes sources. Ceci implique que les coefficients de chacune des colonnes de la matrice de mélange peuvent être regroupés selon ces mêmes groupes et que tous les coefficients de certains de ces groupes soient nuls. En d'autres mots, on suppose que la matrice de mélange est éparse par groupe. Cette hypothèse facilite l'interprétation et améliore la précision du modèle d'ACI. Dans cette optique, nous proposons de résoudre le problème d'ACI avec une matrice de mélange éparse par groupe à l'aide d'une méthode basée sur le LASSO par groupe adaptatif, lequel pénalise la norme 1 des groupes de coefficients avec des poids adaptatifs. Dans ce mémoire, nous soulignons l'utilité de notre méthode lors d'applications en imagerie cérébrale, plus précisément en imagerie par résonance magnétique. Lors de simulations, nous illustrons par un exemple l'efficacité de notre méthode à réduire vers zéro les groupes de coefficients non-significatifs au sein de la matrice de mélange. Nous montrons aussi que la précision de la méthode proposée est supérieure à celle de l'estimateur du maximum de la vraisemblance pénalisée par le LASSO adaptatif dans le cas où la matrice de mélange est éparse par groupe. / Independent component analysis (ICA) is a method of statistical analysis where the main goal is to express the observed data (mixtures) in a linear transformation of latent variables (sources) believed to be non-Gaussian and mutually independent. In some applications, the mixtures can be grouped so that the mixtures belonging to the same group are function of the same sources. This implies that the coefficients of each column of the mixing matrix can be grouped according to these same groups and that all the coefficients of some of these groups are zero. In other words, we suppose that the mixing matrix is sparse per group. This assumption facilitates the interpretation and improves the accuracy of the ICA model. In this context, we propose to solve the problem of ICA with a sparse group mixing matrix by a method based on the adaptive group LASSO. The latter penalizes the 1-norm of the groups of coefficients with adaptive weights. In this thesis, we point out the utility of our method in applications in brain imaging, specifically in magnetic resonance imaging. Through simulations, we illustrate with an example the effectiveness of our method to reduce to zero the non-significant groups of coefficients within the mixing matrix. We also show that the accuracy of the proposed method is greater than the one of the maximum likelihood estimator with an adaptive LASSO penalization in the case where the mixing matrix is sparse per group. Analyse en composantes indépendantes Matrice de mélange éparse Séparation aveugle de sources LASSO par groupe adaptatif Independent component analysis Sparse mixing matrix Blind source separation Adaptive group LASSO
14	Analyse en composantes indépendantes avec une matrice de mélange éparse Billette, Marc-Olivier 06 1900 (has links) L'analyse en composantes indépendantes (ACI) est une méthode d'analyse statistique qui consiste à exprimer les données observées (mélanges de sources) en une transformation linéaire de variables latentes (sources) supposées non gaussiennes et mutuellement indépendantes. Dans certaines applications, on suppose que les mélanges de sources peuvent être groupés de façon à ce que ceux appartenant au même groupe soient fonction des mêmes sources. Ceci implique que les coefficients de chacune des colonnes de la matrice de mélange peuvent être regroupés selon ces mêmes groupes et que tous les coefficients de certains de ces groupes soient nuls. En d'autres mots, on suppose que la matrice de mélange est éparse par groupe. Cette hypothèse facilite l'interprétation et améliore la précision du modèle d'ACI. Dans cette optique, nous proposons de résoudre le problème d'ACI avec une matrice de mélange éparse par groupe à l'aide d'une méthode basée sur le LASSO par groupe adaptatif, lequel pénalise la norme 1 des groupes de coefficients avec des poids adaptatifs. Dans ce mémoire, nous soulignons l'utilité de notre méthode lors d'applications en imagerie cérébrale, plus précisément en imagerie par résonance magnétique. Lors de simulations, nous illustrons par un exemple l'efficacité de notre méthode à réduire vers zéro les groupes de coefficients non-significatifs au sein de la matrice de mélange. Nous montrons aussi que la précision de la méthode proposée est supérieure à celle de l'estimateur du maximum de la vraisemblance pénalisée par le LASSO adaptatif dans le cas où la matrice de mélange est éparse par groupe. / Independent component analysis (ICA) is a method of statistical analysis where the main goal is to express the observed data (mixtures) in a linear transformation of latent variables (sources) believed to be non-Gaussian and mutually independent. In some applications, the mixtures can be grouped so that the mixtures belonging to the same group are function of the same sources. This implies that the coefficients of each column of the mixing matrix can be grouped according to these same groups and that all the coefficients of some of these groups are zero. In other words, we suppose that the mixing matrix is sparse per group. This assumption facilitates the interpretation and improves the accuracy of the ICA model. In this context, we propose to solve the problem of ICA with a sparse group mixing matrix by a method based on the adaptive group LASSO. The latter penalizes the 1-norm of the groups of coefficients with adaptive weights. In this thesis, we point out the utility of our method in applications in brain imaging, specifically in magnetic resonance imaging. Through simulations, we illustrate with an example the effectiveness of our method to reduce to zero the non-significant groups of coefficients within the mixing matrix. We also show that the accuracy of the proposed method is greater than the one of the maximum likelihood estimator with an adaptive LASSO penalization in the case where the mixing matrix is sparse per group. Analyse en composantes indépendantes Matrice de mélange éparse Séparation aveugle de sources LASSO par groupe adaptatif Independent component analysis Sparse mixing matrix Blind source separation Adaptive group LASSO
15	Learning algorithms for sparse classification Sanchez Merchante, Luis Francisco 07 June 2013 (has links) (PDF) This thesis deals with the development of estimation algorithms with embedded feature selection the context of high dimensional data, in the supervised and unsupervised frameworks. The contributions of this work are materialized by two algorithms, GLOSS for the supervised domain and Mix-GLOSS for unsupervised counterpart. Both algorithms are based on the resolution of optimal scoring regression regularized with a quadratic formulation of the group-Lasso penalty which encourages the removal of uninformative features. The theoretical foundations that prove that a group-Lasso penalized optimal scoring regression can be used to solve a linear discriminant analysis bave been firstly developed in this work. The theory that adapts this technique to the unsupervised domain by means of the EM algorithm is not new, but it has never been clearly exposed for a sparsity-inducing penalty. This thesis solidly demonstrates that the utilization of group-Lasso penalized optimal scoring regression inside an EM algorithm is possible. Our algorithms have been tested with real and artificial high dimensional databases with impressive resuits from the point of view of the parsimony without compromising prediction performances. [INFO:INFO_OH] Computer Science/Other [INFO:INFO_OH] Informatique/Autre Linear discriminant analysis Feature selection Regularization Variational approach Clustering Optimal scoring Group Lasso Sparsity
16	Advances on Dimension Reduction for Multivariate Linear Regression Guo, Wenxing January 2020 (has links) Multivariate linear regression methods are widely used statistical tools in data analysis, and were developed when some response variables are studied simultaneously, in which our aim is to study the relationship between predictor variables and response variables through the regression coefficient matrix. The rapid improvements of information technology have brought us a large number of large-scale data, but also brought us great challenges in data processing. When dealing with high dimensional data, the classical least squares estimation is not applicable in multivariate linear regression analysis. In recent years, some approaches have been developed to deal with high-dimensional data problems, among which dimension reduction is one of the main approaches. In some literature, random projection methods were used to reduce dimension in large datasets. In Chapter 2, a new random projection method, with low-rank matrix approximation, is proposed to reduce the dimension of the parameter space in high-dimensional multivariate linear regression model. Some statistical properties of the proposed method are studied and explicit expressions are then derived for the accuracy loss of the method with Gaussian random projection and orthogonal random projection. These expressions are precise rather than being bounds up to constants. In multivariate regression analysis, reduced rank regression is also a dimension reduction method, which has become an important tool for achieving dimension reduction goals due to its simplicity, computational efficiency and good predictive performance. In practical situations, however, the performance of the reduced rank estimator is not satisfactory when the predictor variables are highly correlated or the ratio of signal to noise is small. To overcome this problem, in Chapter 3, we incorporate matrix projections into reduced rank regression method, and then develop reduced rank regression estimators based on random projection and orthogonal projection in high-dimensional multivariate linear regression models. We also propose a consistent estimator of the rank of the coefficient matrix and achieve prediction performance bounds for the proposed estimators based on mean squared errors. Envelope technology is also a popular method in recent years to reduce estimative and predictive variations in multivariate regression, including a class of methods to improve the efficiency without changing the traditional objectives. Variable selection is the process of selecting a subset of relevant features variables for use in model construction. The purpose of using this technology is to avoid the curse of dimensionality, simplify models to make them easier to interpret, shorten training time and reduce overfitting. In Chapter 4, we combine envelope models and a group variable selection method to propose an envelope-based sparse reduced rank regression estimator in high-dimensional multivariate linear regression models, and then establish its consistency, asymptotic normality and oracle property. Tensor data are in frequent use today in a variety of fields in science and engineering. Processing tensor data is a practical but challenging problem. Recently, the prevalence of tensor data has resulted in several envelope tensor versions. In Chapter 5, we incorporate envelope technique into tensor regression analysis and propose a partial tensor envelope model, which leads to a parsimonious version for tensor response regression when some predictors are of special interest, and then consistency and asymptotic normality of the coefficient estimators are proved. The proposed method achieves significant gains in efficiency compared to the standard tensor response regression model in terms of the estimation of the coefficients for the selected predictors. Finally, in Chapter 6, we summarize the work carried out in the thesis, and then suggest some problems of further research interest. / Dissertation / Doctor of Philosophy (PhD)
17	Confidence bands in quantile regression and generalized dynamic semiparametric factor models Song, Song 01 November 2010 (has links) In vielen Anwendungen ist es notwendig, die stochastische Schwankungen der maximalen Abweichungen der nichtparametrischen Schätzer von Quantil zu wissen, zB um die verschiedene parametrische Modelle zu überprüfen. Einheitliche Konfidenzbänder sind daher für nichtparametrische Quantil Schätzungen der Regressionsfunktionen gebaut. Die erste Methode basiert auf der starken Approximation der empirischen Verfahren und Extremwert-Theorie. Die starke gleichmäßige Konsistenz liegt auch unter allgemeinen Bedingungen etabliert. Die zweite Methode beruht auf der Bootstrap Resampling-Verfahren. Es ist bewiesen, dass die Bootstrap-Approximation eine wesentliche Verbesserung ergibt. Der Fall von mehrdimensionalen und diskrete Regressorvariablen wird mit Hilfe einer partiellen linearen Modell behandelt. Das Verfahren wird mithilfe der Arbeitsmarktanalysebeispiel erklärt. Hoch-dimensionale Zeitreihen, die nichtstationäre und eventuell periodische Verhalten zeigen, sind häufig in vielen Bereichen der Wissenschaft, zB Makroökonomie, Meteorologie, Medizin und Financial Engineering, getroffen. Der typische Modelierungsansatz ist die Modellierung von hochdimensionalen Zeitreihen in Zeit Ausbreitung der niedrig dimensionalen Zeitreihen und hoch-dimensionale zeitinvarianten Funktionen über dynamische Faktorenanalyse zu teilen. Wir schlagen ein zweistufiges Schätzverfahren. Im ersten Schritt entfernen wir den Langzeittrend der Zeitreihen durch Einbeziehung Zeitbasis von der Gruppe Lasso-Technik und wählen den Raumbasis mithilfe der funktionalen Hauptkomponentenanalyse aus. Wir zeigen die Eigenschaften dieser Schätzer unter den abhängigen Szenario. Im zweiten Schritt erhalten wir den trendbereinigten niedrig-dimensionalen stochastischen Prozess (stationär). / In many applications it is necessary to know the stochastic fluctuation of the maximal deviations of the nonparametric quantile estimates, e.g. for various parametric models check. Uniform confidence bands are therefore constructed for nonparametric quantile estimates of regression functions. The first method is based on the strong approximations of the empirical process and extreme value theory. The strong uniform consistency rate is also established under general conditions. The second method is based on the bootstrap resampling method. It is proved that the bootstrap approximation provides a substantial improvement. The case of multidimensional and discrete regressor variables is dealt with using a partial linear model. A labor market analysis is provided to illustrate the method. High dimensional time series which reveal nonstationary and possibly periodic behavior occur frequently in many fields of science, e.g. macroeconomics, meteorology, medicine and financial engineering. One of the common approach is to separate the modeling of high dimensional time series to time propagation of low dimensional time series and high dimensional time invariant functions via dynamic factor analysis. We propose a two-step estimation procedure. At the first step, we detrend the time series by incorporating time basis selected by the group Lasso-type technique and choose the space basis based on smoothed functional principal component analysis. We show properties of this estimator under the dependent scenario. At the second step, we obtain the detrended low dimensional stochastic process (stationary). Bootstrap Quantilsregression Konsistenzrate Selbstvertrauen Band Check-Funktion Kernel Smoothing Nichtparametrische Fitting Teilweise Lineares Modell Semiparametrische Modell Faktor-Modell Saisonalität Periodisch Asymptotische Schlussfolgerung Wetter fMRI Group Lasso Implizite Volatilität Oberfläche fMRI Bootstrap Quantile Regression Consistency Rate Confidence Band Check Function Kernel Smoothing Nonparametric Fitting Partial Linear Model Factor model Group Lasso Seasonality Periodic Asymptotic inference Weather Semiparametric model Implied volatility surface 330 Wirtschaft 17 Wirtschaft QH 234 ddc:330
18	Contributions au démélange non-supervisé et non-linéaire de données hyperspectrales / Contributions to unsupervised and nonlinear unmixing of hyperspectral data Ammanouil, Rita 13 October 2016 (has links) Le démélange spectral est l’un des problèmes centraux pour l’exploitation des images hyperspectrales. En raison de la faible résolution spatiale des imageurs hyperspectraux en télédetection, la surface représentée par un pixel peut contenir plusieurs matériaux. Dans ce contexte, le démélange consiste à estimer les spectres purs (les end members) ainsi que leurs fractions (les abondances) pour chaque pixel de l’image. Le but de cette thèse estde proposer de nouveaux algorithmes de démélange qui visent à améliorer l’estimation des spectres purs et des abondances. En particulier, les algorithmes de démélange proposés s’inscrivent dans le cadre du démélange non-supervisé et non-linéaire. Dans un premier temps, on propose un algorithme de démelange non-supervisé dans lequel une régularisation favorisant la parcimonie des groupes est utilisée pour identifier les spectres purs parmi les observations. Une extension de ce premier algorithme permet de prendre en compte la présence du bruit parmi les observations choisies comme étant les plus pures. Dans un second temps, les connaissances a priori des ressemblances entre les spectres à l’échelle localeet non-locale ainsi que leurs positions dans l’image sont exploitées pour construire un graphe adapté à l’image. Ce graphe est ensuite incorporé dans le problème de démélange non supervisé par le biais d’une régularisation basée sur le Laplacian du graphe. Enfin, deux algorithmes de démélange non-linéaires sont proposés dans le cas supervisé. Les modèles de mélanges non-linéaires correspondants incorporent des fonctions à valeurs vectorielles appartenant à un espace de Hilbert à noyaux reproduisants. L’intérêt de ces fonctions par rapport aux fonctions à valeurs scalaires est qu’elles permettent d’incorporer un a priori sur la ressemblance entre les différentes fonctions. En particulier, un a priori spectral, dans un premier temps, et un a priori spatial, dans un second temps, sont incorporés pour améliorer la caractérisation du mélange non-linéaire. La validation expérimentale des modèles et des algorithmes proposés sur des données synthétiques et réelles montre une amélioration des performances par rapport aux méthodes de l’état de l’art. Cette amélioration se traduit par une meilleure erreur de reconstruction des données / Spectral unmixing has been an active field of research since the earliest days of hyperspectralremote sensing. It is concerned with the case where various materials are found inthe spatial extent of a pixel, resulting in a spectrum that is a mixture of the signatures ofthose materials. Unmixing then reduces to estimating the pure spectral signatures and theircorresponding proportions in every pixel. In the hyperspectral unmixing jargon, the puresignatures are known as the endmembers and their proportions as the abundances. Thisthesis focuses on spectral unmixing of remotely sensed hyperspectral data. In particular,it is aimed at improving the accuracy of the extraction of compositional information fromhyperspectral data. This is done through the development of new unmixing techniques intwo main contexts, namely in the unsupervised and nonlinear case. In particular, we proposea new technique for blind unmixing, we incorporate spatial information in (linear and nonlinear)unmixing, and we finally propose a new nonlinear mixing model. More precisely, first,an unsupervised unmixing approach based on collaborative sparse regularization is proposedwhere the library of endmembers candidates is built from the observations themselves. Thisapproach is then extended in order to take into account the presence of noise among theendmembers candidates. Second, within the unsupervised unmixing framework, two graphbasedregularizations are used in order to incorporate prior local and nonlocal contextualinformation. Next, within a supervised nonlinear unmixing framework, a new nonlinearmixing model based on vector-valued functions in reproducing kernel Hilbert space (RKHS)is proposed. The aforementioned model allows to consider different nonlinear functions atdifferent bands, regularize the discrepancies between these functions, and account for neighboringnonlinear contributions. Finally, the vector-valued kernel framework is used in orderto promote spatial smoothness of the nonlinear part in a kernel-based nonlinear mixingmodel. Simulations on synthetic and real data show the effectiveness of all the proposedtechniques Données hyperspectrales Démélange non-supervisé Démélange non-linéaire Régularisation de type groupe lasso Régularisation avec le Laplacian Hyperspectral data Unsupervised unmixing Nonlinear unmixing Group lasso regularization Laplacian regularization

Search results