Global ETD Search

1	Non parametric density estimation via regularization Lin, Mu 11 1900 (has links) The thesis presents important methods, theories, and applications of non-parametric density estimation via regularization in the univariate setting. It gives a brief introduction to non-parametric density estimation and discusses several well-known methods, such as histogram and kernel methods. Regularized methods with penalization and shape constraints are the focus of the thesis. Maximum entropy density estimation is introduced, and the relationship between the taut string method and maximum entropy density estimation is explored. Furthermore, the dual and primal theories are discussed, and some theoretical proofs for quasi-concave density estimation are presented. The different numerical methods for non-parametric density estimation with regularization are classified and compared. Finally, a real-data experiment is discussed in the last part of the thesis. / Statistics
2	Non parametric density estimation via regularization Lin, Mu Unknown Date No description available.
3	Méthodes spectrales pour l'inférence grammaticale probabiliste de langages stochastiques rationnels Bailly, Raphael 12 December 2011 (has links) Nous nous plaçons dans le cadre de l’inférence grammaticale probabiliste. Il s’agit, étant donnée une distribution p sur un ensemble de chaînes S∗ inconnue, d’inférer un modèle probabiliste pour p à partir d’un échantillon fini S d’observations supposé i.i.d. selon p. L’inférence grammaticale se concentre avant tout sur la structure du modèle, et la convergence de l’estimation des paramètres. Les modèles probabilistes dont il sera question ici sont les automates pondérés, ou WA. Les fonctions qu’ils modélisent sont appelées séries rationnelles. Dans un premier temps, nous étudierons la possibilité de trouver un critère de convergence absolue pour de telles séries. Par la suite, nous introduirons un type d’algorithme pour l’inférence de distributions rationnelles (i.e. distributions modélisées par un WA), basé sur des méthodes spectrales. Nous montrerons comment adapter cet algorithme pour l’appliquer au domaine, assez proche, des distributions sur les arbres. Enfin, nous tenterons d’utiliser cet algorithme d’inférence dans un contexte plus statistique d’estimation de densité. / Our framework is probabilistic grammatical inference. That is, given an unknown distribution p on a set of strings S∗, the goal is to infer a probabilistic model for p from a finite sample S of observations assumed to be i.i.d. according to p. Grammatical inference focuses primarily on the structure of the probabilistic model and on the convergence of the parameter estimates. The probabilistic models considered here are weighted automata, or WA. The functions they model are called rational series. Initially, we study the possibility of finding an absolute convergence criterion for such series. Subsequently, we introduce an algorithm for the inference of rational distributions (i.e. distributions modeled by a WA), based on spectral methods. 
We show how to adapt this algorithm to the closely related domain of rational distributions on trees. Finally, we explore how to use the spectral algorithm in a more statistical setting, namely a density estimation task. Inférence grammaticale Estimation de densité non-paramétrique Grammatical inference Non-parametric density estimation
4	Computational Challenges in Non-parametric Prediction of Bradycardia in Preterm Infants January 2020 (has links) abstract: Infants born before 37 weeks of pregnancy are considered to be preterm. Typically, preterm infants have to be strictly monitored since they are highly susceptible to health problems like hypoxemia (low blood oxygen level), apnea, respiratory issues, cardiac problems, and neurological problems, as well as an increased chance of long-term health issues such as cerebral palsy, asthma, and sudden infant death syndrome. One of the leading health complications in preterm infants is bradycardia, which is defined as a slower-than-expected heart rate, generally lower than 60 beats per minute. Bradycardia is often accompanied by low oxygen levels and can cause additional long-term health problems in the premature infant. The implementation of a non-parametric method to predict the onset of bradycardia is presented. This method assumes no prior knowledge of the data and uses kernel density estimation to predict the future onset of bradycardia events. The data is preprocessed and then analyzed to detect the peaks in the ECG signals, following which different kernels are implemented to estimate the shared underlying distribution of the data. The performance of the algorithm is evaluated using various metrics, and the computational challenges and methods to overcome them are also discussed. It is observed that the performance of the algorithm with regard to the kernels used is consistent with the theoretical performance of the kernels as presented in a previous work. The theoretical approach has also been automated in this work, and the various implementation challenges have been addressed. / Dissertation/Thesis / Masters Thesis Electrical Engineering 2020 Electrical engineering Statistics Bradycardia Infants Kernel Density Machine Learning Non-parametric Density Estimation
5	Bickel-Rosenblatt Test Based on Tilted Estimation for Autoregressive Models & Deep Merged Survival Analysis on Cancer Study Using Multiple Types of Bioinformatic Data Su, Yan January 2021 (has links) No description available. Statistics Goodness-of-fit test Time series models Non-parametric density estimation Tilted density function estimator Cancer bioinformatics Deep learning Survival analysis
6	Maximum-likelihood kernel density estimation in high-dimensional feature spaces / C.M. van der Walt Van der Walt, Christiaan Maarten January 2014 (has links) With the advent of the internet and advances in computing power, the collection of very large high-dimensional datasets has become feasible; understanding and modelling high-dimensional data has thus become a crucial activity, especially in the field of pattern recognition. Since non-parametric density estimators are data-driven and do not require or impose a pre-defined probability density function on data, they are very powerful tools for probabilistic data modelling and analysis. Conventional non-parametric density estimation methods, however, originated from the field of statistics and were not originally intended to perform density estimation in high-dimensional feature spaces, as is often encountered in real-world pattern recognition tasks. Therefore we address the fundamental problem of non-parametric density estimation in high-dimensional feature spaces in this study. Recent advances in maximum-likelihood (ML) kernel density estimation have shown that kernel density estimators hold much promise for estimating non-parametric probability density functions in high-dimensional feature spaces. We therefore derive two new iterative kernel bandwidth estimators from the maximum-likelihood (ML) leave-one-out objective function and also introduce a new non-iterative kernel bandwidth estimator (based on the theoretical bounds of the ML bandwidths) for the purpose of bandwidth initialisation. We name the iterative kernel bandwidth estimators the minimum leave-one-out entropy (MLE) and global MLE estimators, and name the non-iterative kernel bandwidth estimator the MLE rule-of-thumb estimator. 
We compare the performance of the MLE rule-of-thumb estimator and conventional kernel density estimators on artificial data with data properties that are varied in a controlled fashion and on a number of representative real-world pattern recognition tasks, to gain a better understanding of the behaviour of these estimators in high-dimensional spaces and to determine whether these estimators are suitable for initialising the bandwidths of iterative ML bandwidth estimators in high dimensions. We find that there are several regularities in the relative performance of conventional kernel density estimators across different tasks and dimensionalities and that the Silverman rule-of-thumb bandwidth estimator performs reliably across most tasks and dimensionalities of the pattern recognition datasets considered, even in high-dimensional feature spaces. Based on this empirical evidence and the intuitive theoretical motivation that the Silverman estimator optimises the asymptotic mean integrated squared error (assuming a Gaussian reference distribution), we select this estimator to initialise the bandwidths of the iterative ML kernel bandwidth estimators compared in our simulation studies. We then perform a comparative simulation study of the newly introduced iterative MLE estimators and other state-of-the-art iterative ML estimators on a number of artificial and real-world high-dimensional pattern recognition tasks. We illustrate with artificial data (guided by theoretical motivations) under what conditions certain estimators should be preferred and we empirically confirm on real-world data that no estimator performs optimally on all tasks and that the optimal estimator depends on the properties of the underlying density function being estimated. 
We also observe an interesting case of the bias-variance trade-off where ML estimators with fewer parameters than the MLE estimator perform exceptionally well on a wide variety of tasks; however, for the cases where these estimators do not perform well, the MLE estimator generally performs well. The newly introduced MLE kernel bandwidth estimators prove to be a useful contribution to the field of pattern recognition, since they perform optimally on a number of real-world pattern recognition tasks investigated and provide researchers and practitioners with two alternative estimators to employ for the task of kernel density estimation. / PhD (Information Technology), North-West University, Vaal Triangle Campus, 2014 Pattern recognition Non-parametric density estimation Kernel density estimation Kernel bandwidth estimation Maximum-likelihood High-dimensional data Artificial data Probability density function
