  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Výběr modelu na základě penalizované věrohodnosti / Variable selection based on penalized likelihood

Chlubnová, Tereza January 2016 (has links)
Selection of variables and estimation of regression coefficients in datasets with the number of variables exceeding the number of observations constitutes an often discussed topic in modern statistics. Today the maximum penalized likelihood method, with an appropriately selected function of the parameter as the penalty, is used for solving this problem. The penalty should evaluate the benefit of the variable and possibly mitigate or nullify the respective regression coefficient. The SCAD and LASSO penalty functions are popular for their ability to choose appropriate regressors and at the same time estimate the parameters in a model. This thesis presents an overview of up-to-date results on the properties of estimates obtained by these two methods, for both a small number of regressors and multidimensional datasets, in a normal linear model. Because the amount of penalty, and therefore also the choice of the model, is heavily influenced by the tuning parameter, the thesis further discusses its selection. The behavior of the LASSO and SCAD penalty functions for different values and selection strategies of the tuning parameter is tested with various numbers of regressors on simulated datasets.
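The LASSO penalty described in this abstract can be illustrated with a minimal coordinate-descent sketch; this is not the thesis's code, and the simulated data, function names, and tuning value are purely illustrative:

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of the l1 penalty: shrinks z toward 0 by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1.

    Assumes the columns of X are roughly standardized.
    """
    n, p = X.shape
    b = np.zeros(p)
    col_norms = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]   # partial residual excluding x_j
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_norms[j]
    return b

rng = np.random.default_rng(0)
n, p = 100, 10
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:2] = [3.0, -2.0]                  # only two active variables
y = X @ beta_true + 0.1 * rng.standard_normal(n)
b_hat = lasso_cd(X, y, lam=0.2)
```

The inactive coefficients are set exactly to zero by the soft-thresholding step, which is the variable-selection property the abstract refers to; the active ones are shrunk slightly toward zero, the bias that the SCAD penalty is designed to reduce.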
2

Non-global regression modelling

Huang, Yunkai 21 June 2016 (has links)
In this dissertation, a new non-global regression model - the partial linear threshold regression model (PLTRM) - is proposed. Various issues related to the PLTRM are discussed. In the first main section of the dissertation (Chapter 2), we define what is meant by the term “non-global regression model”, and we provide a brief review of the current literature associated with such models. In particular, we focus on their advantages and disadvantages in terms of their statistical properties. Because there are some weaknesses in the existing non-global regression models, we propose the PLTRM. The PLTRM combines non-parametric modelling with the traditional threshold regression models (TRMs), and hence can be thought of as an extension of the latter models. We verify the performance of the PLTRM through a series of Monte Carlo simulation experiments. These experiments use a simulated data set that exhibits partial linear and partial nonlinear characteristics, and the PLTRM outperforms several competing parametric and non-parametric models in terms of the Mean Squared Error (MSE) of the within-sample fit. In the second main section of this dissertation (Chapter 3), we propose a method of estimation for the PLTRM. This requires estimating the parameters of the parametric part of the model; estimating the threshold; and fitting the non-parametric component of the model. An “unbalanced penalized least squares” approach is used. This involves using restricted penalized regression spline and smoothing spline techniques for the non-parametric component of the model; the least squares method for the linear parametric part of the model; together with a search procedure to estimate the threshold value. This estimation procedure is discussed for three mutually exclusive situations, which are classified according to the way in which the two components of the PLTRM “join” at the threshold.
Bootstrap sampling distributions of the estimators are provided using the parametric bootstrap technique. The various estimators appear to have good sampling properties in most of the situations that are considered. Inference issues such as hypothesis testing and confidence interval construction for the PLTRM are also investigated. In the third main section of the dissertation (Chapter 4), we illustrate the usefulness of the PLTRM, and the application of the proposed estimation methods, by modelling various real-world data sets. These examples demonstrate both the good statistical performance, and the great application potential, of the PLTRM. / Graduate
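The parametric bootstrap used above to approximate the sampling distributions of the estimators can be sketched in a much simpler setting than the PLTRM; the following fits a plain line, resamples responses from the fitted model, and reads off a bootstrap standard error and confidence interval (an illustration only, not the dissertation's procedure):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0, 1, n)
y = 1.0 + 2.5 * x + 0.3 * rng.standard_normal(n)

def fit_line(x, y):
    """OLS fit of y = a + b*x; returns (a, b, residual std)."""
    A = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return coef[0], coef[1], resid.std(ddof=2)

a_hat, b_hat, s_hat = fit_line(x, y)

# Parametric bootstrap: resample responses from the FITTED model, refit,
# and use the spread of the refitted slopes as its sampling distribution.
B = 500
boot_slopes = np.empty(B)
for b in range(B):
    y_star = a_hat + b_hat * x + s_hat * rng.standard_normal(n)
    _, boot_slopes[b], _ = fit_line(x, y_star)

se_slope = boot_slopes.std(ddof=1)
ci = np.percentile(boot_slopes, [2.5, 97.5])
```

The same recipe applies to the PLTRM estimators: simulate from the fitted model (including the estimated threshold), re-estimate, and tabulate the re-estimates.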
3

Change-point detection and kernel methods / Détection de ruptures et méthodes à noyaux

Garreau, Damien 12 October 2017 (has links)
In this thesis, we focus on a method for detecting abrupt changes in a sequence of independent observations belonging to an arbitrary set on which a positive semidefinite kernel is defined. That method, kernel change-point detection, is a kernelized version of a penalized least-squares procedure. Our main contribution is to show that, for any kernel satisfying some reasonably mild hypotheses, this procedure outputs a segmentation close to the true segmentation with high probability. This result is obtained under a boundedness assumption on the kernel, for a linear penalty and for another penalty function coming from model selection. The proofs rely on a concentration result for bounded random variables in Hilbert spaces, and we prove a less powerful result under relaxed hypotheses (a finite variance assumption). In the asymptotic setting, we show that we recover the minimax rate for the change-point locations without additional hypotheses on the segment sizes. We provide empirical evidence supporting these claims. Another contribution of this thesis is a detailed presentation of the different notions of distance between segmentations; additionally, we prove that these notions coincide for sufficiently close segmentations. From a practical point of view, we demonstrate how the so-called dimension-jump heuristic can be a reasonable choice of penalty constant when using kernel change-point detection with a linear penalty. We also show how a key quantity depending on the kernel that appears in our theoretical results influences the performance of kernel change-point detection in the case of a single change-point. When the kernel is translation invariant and parametric assumptions are made, it is possible to compute this quantity in closed form. Thanks to these computations, some of them novel, we are able to study precisely the behavior of the maximal penalty constant. Finally, we study the median heuristic, a popular tool to set the bandwidth of radial basis function kernels. For a large sample size, we show that it behaves approximately as the median of a distribution that we describe completely in the settings of the kernel two-sample test and kernel change-point detection. More precisely, we show that the median heuristic is asymptotically normal around this value.
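The median heuristic studied in this thesis can be sketched in a few lines; this is an illustrative implementation on simulated data, not the thesis's code:

```python
import numpy as np

def median_heuristic(X):
    """Bandwidth = median of pairwise Euclidean distances among rows of X."""
    diffs = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diffs ** 2).sum(axis=-1))
    iu = np.triu_indices_from(d, k=1)        # each pair counted once
    return np.median(d[iu])

def rbf_kernel(x, y, h):
    """Gaussian (radial basis function) kernel with bandwidth h."""
    return np.exp(-np.sum((x - y) ** 2) / (2 * h ** 2))

rng = np.random.default_rng(2)
X = rng.standard_normal((300, 2))
h = median_heuristic(X)
k00 = rbf_kernel(X[0], X[0], h)              # a kernel is 1 at zero distance
```

The asymptotic result in the abstract says that, as the sample grows, `h` concentrates around the median of the distribution of the distance between two independent draws, with Gaussian fluctuations.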
4

Velká data - extrakce klíčových informací pomocí metod matematické statistiky a strojového učení / Big data - extraction of key information combining methods of mathematical statistics and machine learning

Masák, Tomáš January 2017 (has links)
This thesis is concerned with data analysis, especially with principal component analysis and its sparse modification (SPCA), which is NP-hard to solve. The SPCA problem can be recast into the regression framework, in which sparsity is usually induced with an ℓ1-penalty. In the thesis, we propose to use an iteratively reweighted ℓ2-penalty instead of the aforementioned ℓ1-approach. We compare the resulting algorithm with several well-known approaches to SPCA using both a simulation study and an interesting practical example in which we analyze voting records of the Parliament of the Czech Republic. We show experimentally that the proposed algorithm outperforms the other considered algorithms. We also prove convergence of both the proposed algorithm and the original regression-based approach to PCA.
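The iteratively reweighted ℓ2-penalty idea can be illustrated in a plain sparse-regression setting rather than SPCA: each iteration solves a ridge problem whose per-coefficient weight 1/(|b_j| + eps) mimics the ℓ1 penalty at the current iterate. All names and data below are illustrative, not the thesis's algorithm:

```python
import numpy as np

def irl2_regression(X, y, lam, n_iter=30, eps=1e-6):
    """Approximate an l1-penalized fit by iteratively reweighted l2 steps.

    Each step solves (X'X/n + diag(W)) b = X'y/n with W_j = lam/(|b_j|+eps),
    i.e. a weighted ridge problem whose penalty matches lam*|b_j| locally.
    """
    b = np.linalg.lstsq(X, y, rcond=None)[0]   # start from OLS
    n = X.shape[0]
    for _ in range(n_iter):
        W = lam / (np.abs(b) + eps)            # per-coefficient ridge weights
        b = np.linalg.solve(X.T @ X / n + np.diag(W), X.T @ y / n)
    return b

rng = np.random.default_rng(3)
n, p = 120, 8
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[0] = 2.0                                  # one active variable
y = X @ beta + 0.1 * rng.standard_normal(n)
b_hat = irl2_regression(X, y, lam=0.1)
```

Coefficients whose magnitude falls below the penalty level get ever larger weights and collapse toward zero, so the reweighted ℓ2 scheme recovers the sparsity that a direct ℓ1 fit would give, while only solving smooth quadratic problems.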
5

High-dimensional VAR analysis of regional house prices in United States / Analýza regionálních cen nemovitostí ve Spojených státech pomocí vysokodimenzionálního VAR modelu

Krčál, Adam January 2015 (has links)
In this thesis the heterogeneity of regional real estate prices in the United States is investigated. A high-dimensional VAR model with additional exogenous predictors, originally introduced by \cite{fan11}, is adopted. In this framework, the common factor in regional house price dynamics is explained by exogenous predictors, and the spatial dependencies are captured by lagged house prices in other regions. For the purpose of estimation and variable selection in the high-dimensional setting, the concept of Penalized Least Squares (PLS) with different penalty functions (e.g. the LASSO penalty) is studied in detail and implemented. Moreover, clustering methods are employed to identify subsets of statistical regions with similar house price dynamics. It is demonstrated that these clusters are geographically well defined and contribute to a better interpretation of the VAR model. Next, we make use of the LASSO variable selection property in order to construct the impulse response functions and to simulate the behavior of prices when a shock occurs. Last but not least, one-period-ahead forecasts from the VAR model are compared to those from the Diffusion Index Factor Model of \cite{stock02}, a commonly used forecasting model.
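The impulse-response and one-period-ahead-forecast constructions mentioned above can be sketched on a toy two-variable VAR(1) estimated by plain least squares; the thesis works with a high-dimensional penalized VAR, so this is only an illustration of the mechanics:

```python
import numpy as np

rng = np.random.default_rng(4)
# Simulate a stable two-variable VAR(1): y_t = A y_{t-1} + e_t
A = np.array([[0.5, 0.2],
              [0.1, 0.4]])
T = 500
y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = A @ y[t - 1] + 0.1 * rng.standard_normal(2)

# Equation-by-equation least squares estimate of A
Y, Z = y[1:], y[:-1]
A_hat = np.linalg.lstsq(Z, Y, rcond=None)[0].T

def irf(A, h, shock):
    """Response of the system h periods after a one-time shock."""
    return np.linalg.matrix_power(A, h) @ shock

resp = irf(A_hat, 3, np.array([1.0, 0.0]))   # shock to variable 0
forecast = A_hat @ y[-1]                     # one-period-ahead forecast
```

In the high-dimensional case the least-squares step is replaced by a penalized fit per equation, and the LASSO's zeroed coefficients keep the impulse-response propagation sparse and interpretable.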
6

Change point estimation in noisy Hammerstein integral equations / Sprungstellen-Schätzer für verrauschte Hammerstein Integral Gleichungen

Frick, Sophie 02 December 2010 (has links)
No description available.
7

General Adaptive Penalized Least Squares 模型選取方法之模擬與其他方法之比較 / The Simulation of Model Selection Method for General Adaptive Penalized Least Squares and Comparison with Other Methods

陳柏錞 Unknown Date (has links)
In regression analysis, if the relationship between the response variable and the explanatory variables is nonlinear, B-splines can be used to model the nonlinear relationship. Knot selection is crucial in B-spline regression. Huang (2013) proposes a method for adaptive estimation, where knots are selected based on penalized least squares. This method is abbreviated as APLS (adaptive penalized least squares) in this thesis. In this thesis, a more general version of APLS is proposed, which is abbreviated as GAPLS (generalized APLS). Simulation studies are carried out to compare the estimation performance of GAPLS with a knot selection method based on BIC (Bayesian information criterion). The simulation results show that both methods perform well and fewer knots are selected using the BIC approach than using GAPLS.
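The BIC-based knot selection compared in this thesis can be sketched with a truncated-power spline basis (a simple stand-in for B-splines; numerically less stable, but the knot-selection logic is the same). Everything below is illustrative, not the thesis's code:

```python
import numpy as np

def spline_basis(x, knots, degree=3):
    """Truncated-power cubic spline design matrix with the given knots."""
    cols = [x ** d for d in range(degree + 1)]
    cols += [np.maximum(x - k, 0.0) ** degree for k in knots]
    return np.column_stack(cols)

def bic_score(x, y, knots):
    """Gaussian BIC of the least-squares spline fit with these knots."""
    B = spline_basis(x, knots)
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    rss = np.sum((y - B @ coef) ** 2)
    n, k = len(y), B.shape[1]
    return n * np.log(rss / n) + k * np.log(n)

rng = np.random.default_rng(5)
n = 300
x = np.sort(rng.uniform(0, 1, n))
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)

# Pick the number of equally spaced interior knots minimizing BIC
scores = {m: bic_score(x, y, np.linspace(0, 1, m + 2)[1:-1])
          for m in range(1, 11)}
best_m = min(scores, key=scores.get)

# Refit with the selected knots and check the in-sample fit
B = spline_basis(x, np.linspace(0, 1, best_m + 2)[1:-1])
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
rmse = np.sqrt(np.mean((y - B @ coef) ** 2))
```

The log(n) penalty per basis function is what makes BIC favor fewer knots, which matches the abstract's finding that the BIC approach selects fewer knots than GAPLS.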
8

Pénalités minimales pour la sélection de modèle / Minimal penalties for model selection

Sorba, Olivier 09 February 2017 (has links)
L. Birgé and P. Massart proved that the minimum penalty phenomenon occurs in Gaussian model selection when the model family arises from complete variable selection among independent variables. We extend some of their results to discrete Gaussian signal segmentation when the model family corresponds to a sufficiently rich family of partitions of the signal's support; this is the case for regression trees. We show that the same phenomenon occurs in the context of density estimation. The richness of the model family can be related to a certain form of isotropy, and in this respect the minimum penalty phenomenon is intrinsic. To corroborate this point of view, we show that the minimum penalty phenomenon also occurs when the models are chosen randomly under an isotropic law.
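The minimum penalty phenomenon for complete variable selection can be seen in a small simulation: on pure Gaussian noise, the penalized criterion keeps almost every coordinate when the per-coordinate penalty constant is below a threshold, and almost none above it. This toy example is only an illustration of the phenomenon, not the thesis's construction:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2000
y = rng.standard_normal(n)               # pure noise, sigma = 1

# Best subset of size k keeps the k largest squared coordinates, so
# RSS(k) is the total minus the top-k partial sums.
y2 = np.sort(y ** 2)[::-1]
rss = np.concatenate([[y2.sum()], y2.sum() - np.cumsum(y2)])

def best_dim(C):
    """Dimension minimizing RSS(k) + C*k over k = 0..n."""
    return int(np.argmin(rss + C * np.arange(n + 1)))

dim_low = best_dim(0.5)                  # C well below the critical level
dim_high = best_dim(3 * np.log(n))       # C comfortably above it
```

Below the minimal penalty the selected dimension is of order n; above it, it collapses to (nearly) zero. This sharp jump is what the dimension-jump heuristic exploits in practice to calibrate the penalty constant from data.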
