1 |
Semiparametric mixture models / Xiang, Sijia, January 1900
Doctor of Philosophy / Department of Statistics / Weixin Yao
This dissertation consists of three parts that are related to semiparametric mixture models.
In Part I, we construct the minimum profile Hellinger distance (MPHD) estimator for a class of semiparametric mixture models in which one component has a known distribution with possibly unknown parameters, while the other component density and the mixing proportion are unknown. Such semiparametric mixture models are often used in biology and in sequential clustering algorithms.
In Part II, we propose a new class of semiparametric mixture of regression models, where the mixing proportions and variances are constants but the component regression functions are smooth functions of a covariate. A one-step backfitting estimate and two EM-type algorithms are proposed to achieve the optimal convergence rate for both the global parameters and the nonparametric regression functions. We derive the asymptotic properties of the proposed estimates and show that both EM-type algorithms preserve the asymptotic ascent property.
In Part III, we apply the idea of single-index models to mixture of regression models and propose three new classes of models: the mixture of single-index models (MSIM), the mixture of regression models with varying single-index proportions (MRSIP), and the mixture of regression models with varying single-index proportions and variances (MRSIPV). Backfitting estimates and the corresponding algorithms are proposed for the new models to achieve the optimal convergence rate for both the parameters and the nonparametric functions. We show that the nonparametric functions can be estimated as well as if the parameters were known, and that the parameters can be estimated with the same rate of convergence, n^{-1/2}, that is achieved in a parametric model.
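The sketch below is not the MPHD estimator itself but a rough illustration of the Part I model: a naive EM-style fit of f(x) = pi * f0(x; mu) + (1 - pi) * g(x), where the known component is taken to be N(mu, 1) for concreteness and the unknown component g is re-estimated by a weighted kernel density. The component form, bandwidth, and all names are assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm

def weighted_kde(x_eval, data, weights, h):
    # Weighted Gaussian kernel density: sum_i w_i K_h(x - x_i) / sum_i w_i
    u = (x_eval[:, None] - data[None, :]) / h
    return (norm.pdf(u) * weights[None, :]).sum(axis=1) / (h * weights.sum())

def semiparametric_mixture_em(x, h=0.3, n_iter=50):
    """Naive EM-style fit of f(x) = pi * N(x; mu, 1) + (1 - pi) * g(x),
    with the unknown component g updated as a weighted kernel density.
    Illustrative only; the dissertation's MPHD estimator instead
    minimizes a profile Hellinger distance."""
    pi, mu = 0.5, x.mean()
    g = np.full(len(x), 1.0 / (x.max() - x.min()))   # flat start for g
    for _ in range(n_iter):
        f0 = norm.pdf(x, loc=mu, scale=1.0)          # known component
        resp = pi * f0 / (pi * f0 + (1 - pi) * g)    # E-step
        pi = resp.mean()                             # M-step: proportion
        mu = (resp * x).sum() / resp.sum()           # M-step: location
        g = weighted_kde(x, x, 1.0 - resp, h)        # update unknown g
    return pi, mu, g
```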
|
2 |
Random Matrix Theory: Selected Applications from Statistical Signal Processing and Machine Learning / Elkhalil, Khalil, 06 1900
Random matrix theory is an outstanding mathematical tool that has demonstrated its usefulness in many areas, ranging from wireless communication to finance and economics. The main motivation behind its use comes from the fundamental role that random matrices play in modeling unknown and unpredictable physical quantities. In many situations, meaningful metrics expressed as scalar functionals of these random matrices arise naturally. Along this line, the present work leverages tools from random matrix theory to answer fundamental questions in statistical signal processing and machine learning.
In a first part, this thesis develops analytical tools for computing the inverse moments of random Gram matrices with one-sided correlation. This question is mainly driven by applications in signal processing and wireless communications, wherein such matrices naturally arise. In particular, we derive closed-form expressions for the inverse moments and show that the obtained results can help approximate several performance metrics of common estimation techniques.
Then, we carry out a large dimensional study of discriminant analysis classifiers. Under mild assumptions, we show that the asymptotic classification error approaches a deterministic quantity that depends only on the means and covariances associated with each class as well as the problem dimensions. This result permits a better understanding of the underlying classifiers in practical, large but finite dimensions, and can be used to optimize the performance.
Finally, we revisit kernel ridge regression and study a centered version of it that we call centered kernel ridge regression, or CKRR in short. Relying on recent advances on the asymptotic properties of random kernel matrices, we carry out a large dimensional analysis of CKRR under the assumption that both the data dimension and the training size grow simultaneously large at the same rate. We show in particular that both the empirical and prediction risks converge to a limiting risk that relates the performance to the data statistics and the parameters involved. This result is important as it permits a better understanding of kernel ridge regression and allows one to efficiently optimize the performance.
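As a hedged illustration of the centered kernel ridge regression analyzed in the last part, the sketch below double-centers a precomputed kernel matrix and centers the responses before solving the ridge system; the function names and centering details are our assumptions, not the thesis's notation.

```python
import numpy as np

def ckrr_fit(K, y, lam):
    """Centered kernel ridge regression sketch: double-center the
    training kernel matrix, center the responses, solve the ridge
    system for the dual coefficients."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    Kc = H @ K @ H                             # doubly centered kernel
    alpha = np.linalg.solve(Kc + lam * np.eye(n), y - y.mean())
    return alpha, y.mean()

def ckrr_predict(K_test, K_train, alpha, y_bar):
    # Center the test-vs-train kernel consistently with training,
    # then add back the intercept absorbed by the centering.
    Kc_test = (K_test
               - K_test.mean(axis=1, keepdims=True)
               - K_train.mean(axis=0)
               + K_train.mean())
    return Kc_test @ alpha + y_bar
```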
|
3 |
Asymptotic Properties of Non-parametric Regression with Beta Kernels / Natarajan, Balasubramaniam, January 1900
Doctor of Philosophy / Department of Statistics / Weixing Song
Kernel-based nonparametric regression is a popular statistical tool for identifying the relationship between response and predictor variables when standard parametric regression models are not appropriate. The efficacy of kernel-based methods depends on both the kernel choice and the smoothing parameter. With insufficient smoothing, the resulting regression estimate is too rough; with excessive smoothing, important features of the underlying relationship are lost. While the choice of kernel has been shown to have less of an effect on the quality of the regression estimate, it is important to choose a kernel that matches the support of the underlying predictor variables. In the past few decades, there have been multiple efforts to quantify the properties of asymmetric kernel density and regression estimators. Unlike classic symmetric kernel-based estimators, asymmetric kernels do not suffer from boundary problems. For example, Beta kernel estimates are especially suitable for investigating the distribution structure of predictor variables with compact support. In this dissertation, two types of Beta kernel based nonparametric regression estimators are proposed and analyzed. First, a Nadaraya-Watson type Beta kernel estimator is introduced within the regression setup, followed by a local linear regression estimator based on Beta kernels. For both regression estimators, a comprehensive analysis of their large sample properties is presented. Specifically, for the first time, asymptotic normality and uniform almost sure convergence results for the new estimators are established. Additionally, general guidelines for bandwidth selection are provided. The finite sample performance of the proposed estimators is evaluated via both a simulation study and a real data application. The results presented and validated in this dissertation help advance the understanding and use of Beta kernel based methods in other nonparametric regression applications.
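A minimal sketch of a Nadaraya-Watson type estimator with Beta kernels, in the spirit of Chen's (1999) Beta kernel whose shape parameters x/b + 1 and (1 - x)/b + 1 adapt to the evaluation point, so the kernel never puts weight outside [0, 1]; the parameterization and names here are assumptions for illustration.

```python
import numpy as np
from scipy.stats import beta

def beta_kernel_nw(x_grid, X, Y, b):
    """Nadaraya-Watson regression with a Beta kernel whose shape
    adapts to each evaluation point x; X must lie in [0, 1] and
    b is the smoothing parameter."""
    est = np.empty(len(x_grid))
    for j, x in enumerate(x_grid):
        w = beta.pdf(X, x / b + 1.0, (1.0 - x) / b + 1.0)
        est[j] = np.sum(w * Y) / np.sum(w)
    return est

# Example: a compact-support regression with no boundary correction needed
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, 300)
Y = np.sin(2 * np.pi * X) + rng.normal(0, 0.2, 300)
m_hat = beta_kernel_nw(np.linspace(0, 1, 101), X, Y, b=0.02)
```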
|
4 |
Bias reduction studies in nonparametric regression with applications: an empirical approach / Krugell, Marike, January 2014
The purpose of this study is to determine the effect of three improvement methods on nonparametric kernel regression estimators. The improvement methods are applied to the Nadaraya-Watson estimator with cross-validation bandwidth selection, the Nadaraya-Watson estimator with plug-in bandwidth selection, the local linear estimator with plug-in bandwidth selection, and a bias-corrected nonparametric estimator proposed by Yao (2012). The different resulting regression estimates are evaluated by minimising a global discrepancy measure, i.e. the mean integrated squared error (MISE).
In the machine learning context, various methods exist for improving the precision and accuracy of an estimator. The first two improvement methods introduced in this study are bootstrap-based. Bagging is an acronym for bootstrap aggregating and was introduced by Breiman (1996a) from a machine learning viewpoint and by Swanepoel (1988, 1990) in a functional context. Bagging is primarily a variance reduction tool: it is implemented to reduce the variance of an estimator and thereby improve the precision of the estimation process. Bagging is performed by drawing repeated bootstrap samples from the original sample and generating multiple versions of an estimator. These replicates of the estimator are then used to obtain an aggregated estimator. Bragging stands for bootstrap robust aggregating: a robust estimator is obtained by taking the sample median over the B bootstrap estimates instead of the sample mean as in bagging.
The third improvement method aims to reduce the bias component of the estimator and is referred to as boosting. Boosting is a general method for improving the accuracy of any given learning algorithm. The method starts off with a sensible estimator and improves it iteratively, based on its performance on a training dataset.
Results and conclusions verifying existing literature are provided, as well as new results for the new methods. / MSc (Statistics), North-West University, Potchefstroom Campus, 2015
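A minimal sketch, under assumed names and a Gaussian kernel, of bagging and bragging applied to a Nadaraya-Watson fit: B bootstrap refits are aggregated by the mean (bagging) or by the median (bragging).

```python
import numpy as np

def nw(x_grid, X, Y, h):
    # Nadaraya-Watson estimate with a Gaussian kernel
    u = (x_grid[:, None] - X[None, :]) / h
    K = np.exp(-0.5 * u**2)
    return (K * Y).sum(axis=1) / K.sum(axis=1)

def bagged_nw(x_grid, X, Y, h, B=200, robust=False, seed=None):
    """Bagging (mean) or bragging (median) of Nadaraya-Watson fits
    over B bootstrap resamples of the original sample."""
    rng = np.random.default_rng(seed)
    n = len(X)
    fits = np.empty((B, len(x_grid)))
    for b in range(B):
        idx = rng.integers(0, n, n)           # bootstrap resample
        fits[b] = nw(x_grid, X[idx], Y[idx], h)
    return np.median(fits, axis=0) if robust else fits.mean(axis=0)
```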
|
5 |
技術型態分析可否預測未來?-台灣期貨市場的證據 / Is pattern analysis useful for predictions? Evidence from Taiwan's futures market / 鄭吉良, Unknown Date
This study examines the Taiwan futures market. A nonparametric econometric method, kernel regression, is used to smooth daily settlement prices, and a computerized, automated system is built to identify the formation of ten technical patterns, including the head-and-shoulders top, triangle, and double bottom. The aim is to remove the influence of human subjectivity on pattern recognition and to test whether technical patterns carry information for predicting future returns.
The empirical results show that the most frequently occurring technical patterns, such as the head-and-shoulders top and head-and-shoulders bottom, are not necessarily informative, whereas within two weeks of a pattern forming and being identified, four patterns (the rectangle, inverted rectangle, double top, and double bottom) remain significantly informative.
In addition, a smaller smoothing factor in the kernel regression produces more pattern occurrences, while a larger smoothing factor produces fewer. Changing the degree of smoothing affects the informational significance of the head-and-shoulders top and bottom patterns to some extent, but has no effect on the significance of the rectangle, inverted rectangle, double top, and double bottom patterns.
Finally, the informational significance of technical patterns does not differ across futures contracts; technical patterns retain the same characteristics in different futures markets. The test results for individual futures contracts are all weaker than those for the overall futures market, which may be related to the smaller sample sizes.
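A rough sketch of the two mechanical steps described above, with assumed function names: smooth the settlement-price series by kernel regression over time, then extract the alternating local extrema that rule-based templates (head-and-shoulders, double top, rectangle, and so on) would be matched against. The bandwidth h plays the role of the smoothing factor discussed above.

```python
import numpy as np

def kernel_smooth_prices(prices, h):
    """Gaussian kernel-regression smoothing of a daily price series
    against the time index."""
    t = np.arange(len(prices), dtype=float)
    u = (t[:, None] - t[None, :]) / h
    K = np.exp(-0.5 * u**2)
    return (K * prices[None, :]).sum(axis=1) / K.sum(axis=1)

def local_extrema(smooth):
    # Indices where the smoothed series changes direction; the
    # alternating maxima/minima feed the pattern-matching rules.
    d = np.sign(np.diff(smooth))
    return np.where(np.diff(d) != 0)[0] + 1
```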
|
6 |
ESTIMATION IN PARTIALLY LINEAR MODELS WITH CORRELATED OBSERVATIONS AND CHANGE-POINT MODELS / Fan, Liangdong, 01 January 2018
Methods of estimating parametric and nonparametric components, as well as properties of the corresponding estimators, have been examined in partially linear models by Wahba [1987], Green et al. [1985], Engle et al. [1986], Speckman [1988], Hu et al. [2004], and Charnigo et al. [2015], among others. These models are appealing due to their flexibility and wide range of practical applications, including the electricity usage study by Engle et al. [1986] and the gum disease study by Speckman [1988], where a parametric component explains linear trends and a nonparametric part captures nonlinear relationships.
The compound estimator (Charnigo et al. [2015]) has been used to estimate the nonparametric component of such a model with multiple covariates, in conjunction with linear mixed modeling for the parametric component. These authors showed, under a strict orthogonality condition, that parametric and nonparametric component estimators could achieve what appear to be (nearly) optimal rates, even in the presence of subject-specific random effects.
We continue with research on partially linear models with subject-specific random intercepts. Inspired by Speckman [1988], we propose estimators of both parametric and nonparametric components of a partially linear model, where consistency is achievable under an orthogonality condition. We also examine a scenario without orthogonality to find that bias could still exist asymptotically. The random intercepts accommodate analysis of individuals on whom repeated measures are taken. We illustrate our estimators in a biomedical case study and assess their finite-sample performance in simulation studies.
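A minimal sketch of the Speckman-type construction that inspires the proposed estimators, with an assumed Gaussian Nadaraya-Watson smoother and illustrative names: residualize both the design matrix and the response with the smoother matrix, estimate beta by least squares on the residualized quantities, then smooth what remains to obtain the nonparametric part. Subject-specific random intercepts and the orthogonality condition are omitted here.

```python
import numpy as np

def nw_smoother_matrix(t, h):
    # Smoother matrix S of a Gaussian Nadaraya-Watson fit on t
    u = (t[:, None] - t[None, :]) / h
    K = np.exp(-0.5 * u**2)
    return K / K.sum(axis=1, keepdims=True)

def speckman_plm(y, X, t, h):
    """Speckman-type fit of the partially linear model
    y = X beta + g(t) + error."""
    S = nw_smoother_matrix(t, h)
    Xt = X - S @ X                              # (I - S) X
    yt = y - S @ y                              # (I - S) y
    beta = np.linalg.lstsq(Xt, yt, rcond=None)[0]
    g_hat = S @ (y - X @ beta)                  # smooth the remainder
    return beta, g_hat
```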
Jump points have often been found within the domain of nonparametric models (Muller [1992], Loader [1996] and Gijbels et al. [1999]), which may lead to a poor fit when falsely assuming the underlying mean response is continuous. We study a specific type of change-point where the underlying mean response is continuous on both left and right sides of the change-point. We identify the convergence rate of the estimator proposed in Liu [2017] and illustrate the result in simulation studies.
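For the continuous change-point just described, a simple assumed stand-in is a "broken-stick" mean a + b x + c (x - tau)_+, fitted by profiling the kink location tau over a grid; this is an illustration, not the estimator of Liu [2017].

```python
import numpy as np

def fit_kink(x, y, taus):
    """Grid-search least-squares fit of a continuous piecewise-linear
    mean with an unknown kink at tau; the mean is continuous on both
    sides of the change-point, as in the models studied here."""
    best_rss, best_tau, best_coef = np.inf, None, None
    for tau in taus:
        Z = np.column_stack([np.ones_like(x), x, np.maximum(x - tau, 0.0)])
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        rss = np.sum((y - Z @ coef) ** 2)
        if rss < best_rss:
            best_rss, best_tau, best_coef = rss, tau, coef
    return best_tau, best_coef
```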
|
7 |
Sequential Procedures for Nonparametric Kernel Regression / Dharmasena, Tibbotuwa Deniye Kankanamge Lasitha Sandamali (Sandamali.dharmasena@rmit.edu.au), January 2008
In a nonparametric setting, the functional form of the relationship between the response variable and the associated predictor variables is unspecified; however, it is assumed to be a smooth function. The main aim of nonparametric regression is to highlight an important structure in data without any assumptions about the shape of the underlying regression function. In regression, the random and fixed design models should be distinguished. Among the variety of nonparametric regression estimators currently in use, kernel type estimators are most popular. Kernel type estimators provide a flexible class of nonparametric procedures by estimating the unknown function as a weighted average using a kernel function. The bandwidth, which determines the influence of the kernel, has to be adapted to any kernel type estimator. Our focus is on the Nadaraya-Watson estimator and the local linear estimator, which belong to a class of kernel type regression estimators called local polynomial kernel estimators.
A closely related problem is the determination of an appropriate sample size that would be required to achieve a desired confidence level of accuracy for the nonparametric regression estimators. Since sequential procedures allow an experimenter to make decisions based on the smallest number of observations without compromising accuracy, we consider the application of sequential procedures to a nonparametric regression model at a given point or series of points. The motivation for using such procedures is that in many applications the quality of estimating an underlying regression function in a controlled experiment is paramount; thus, it is reasonable to invoke a sequential procedure of estimation that chooses a sample size, based on recorded observations, that guarantees a preassigned accuracy.
We have employed sequential techniques to develop a procedure for constructing a fixed-width confidence interval for the predicted value at a specific point of the independent variable. These fixed-width confidence intervals are developed using asymptotic properties of both the Nadaraya-Watson and local linear kernel estimators of nonparametric kernel regression with data-driven bandwidths, and are studied for both fixed and random design contexts. The sample sizes for a preset confidence coefficient are optimized using sequential procedures, namely a two-stage procedure, a modified two-stage procedure and a purely sequential procedure. The proposed methodology is first tested by employing a large-scale simulation study. The performance of each kernel estimation method is assessed by comparing its coverage accuracy with the corresponding preset confidence coefficients, by the proximity of computed sample sizes to optimal sample sizes, and by contrasting the estimated values obtained from the two nonparametric methods with actual values at a given series of design points of interest.
We also employed the symmetric bootstrap method, which is considered an alternative method of estimating properties of unknown distributions. Resampling is done from a suitably estimated residual distribution, and the percentiles of the approximate distribution are used to construct confidence intervals for the curve at a set of given design points. A methodology is developed for determining whether it is advantageous to use the symmetric bootstrap method to reduce the extent of oversampling that is normally known to plague Stein's two-stage sequential procedure.
The procedure developed is validated using an extensive simulation study, and we also explore the asymptotic properties of the relevant estimators. Finally, our proposed sequential nonparametric kernel regression methods are applied to some problems in software reliability and finance.
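As a hedged sketch of the two-stage idea (not the thesis's exact rules), the snippet below estimates the asymptotic variance of the Nadaraya-Watson estimate at a point x0 from a pilot sample and then sizes the study so that a z-level interval has half-width at most d. The plug-in variance form sigma^2(x0) R(K) / (h f(x0)), the pilot bandwidth, and all names are our assumptions.

```python
import numpy as np

def two_stage_n(x0, Xp, Yp, d, z=1.96):
    """Two-stage sizing sketch for a fixed-width interval at x0:
    pilot-estimate the local variance, design density, and hence the
    variance constant of the Nadaraya-Watson estimate, then return
    the sample size making z * sqrt(V / n) <= d."""
    m = len(Xp)
    h = np.std(Xp) * m ** (-0.2)                      # rough pilot bandwidth
    w = np.exp(-0.5 * ((x0 - Xp) / h) ** 2)           # Gaussian weights
    m_hat = (w * Yp).sum() / w.sum()                  # NW estimate at x0
    sigma2 = (w * (Yp - m_hat) ** 2).sum() / w.sum()  # local variance
    f_hat = w.sum() / (m * h * np.sqrt(2 * np.pi))    # density at x0
    RK = 1.0 / (2.0 * np.sqrt(np.pi))                 # Gaussian roughness
    V = sigma2 * RK / (h * f_hat)                     # n * Var(m_hat(x0))
    return max(m, int(np.ceil((z / d) ** 2 * V)))
```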
|
8 |
Choosing a Kernel for Cross-Validation / Savchuk, Olga, 14 January 2010
The statistical properties of cross-validation bandwidths can be improved by choosing an appropriate kernel, which is different from the kernels traditionally used for cross-validation purposes. In light of this idea, we developed two new methods of bandwidth selection, termed indirect cross-validation and robust one-sided cross-validation. The kernels used in the indirect cross-validation method yield an improvement in the relative bandwidth rate to n^{-1/4}, which is substantially better than the n^{-1/10} rate of the least squares cross-validation method. The robust kernels used in the robust one-sided cross-validation method eliminate the bandwidth bias for the case of regression functions with discontinuous derivatives.
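For reference, a minimal sketch of the least squares cross-validation criterion that these methods improve upon, written for a Gaussian kernel density estimate; the names are assumed, and the identity used is the standard one that the convolution of two Gaussian kernels with scale h is a Gaussian with scale sqrt(2) h.

```python
import numpy as np
from scipy.stats import norm

def lscv_score(h, X):
    """Least squares cross-validation criterion
    LSCV(h) = int fhat^2 - (2/n) sum_i fhat_{-i}(X_i)
    for a Gaussian kernel density estimate with bandwidth h."""
    n = len(X)
    D = X[:, None] - X[None, :]
    # Integral of fhat^2: pairwise N(0, 2h^2) densities.
    int_f2 = norm.pdf(D, scale=np.sqrt(2) * h).sum() / n**2
    # Leave-one-out term: drop the diagonal (self) contributions.
    Kh = norm.pdf(D, scale=h)
    loo = (Kh.sum() - np.trace(Kh)) / (n * (n - 1))
    return int_f2 - 2 * loo

# Example: pick the LSCV bandwidth over a grid
X = np.random.default_rng(1).normal(size=200)
hs = np.linspace(0.05, 1.0, 60)
h_cv = hs[np.argmin([lscv_score(h, X) for h in hs])]
```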
|
9 |
Gas Sensor Array Modeling and Cuprate Superconductivity from Correlated Spin Disorder / Fulkerson, Matthew D., 02 July 2002
No description available.
|