Global ETD Search

591	Optimum Savitzky-Golay Filtering for Signal Estimation Krishnan, Sunder Ram January 2013 Motivated by the classic works of Charles M. Stein, we focus on developing risk-estimation frameworks for denoising problems in both one and two dimensions. We assume a standard additive noise model, and formulate the denoising problem as one of estimating the underlying clean signal from noisy measurements by minimizing a risk corresponding to a chosen loss function. Our goal is to incorporate perceptually motivated loss functions wherever applicable, as in the case of speech enhancement, with the squared-error loss being considered for the other scenarios. Since the true risks depend on the unknown parameter of interest, we circumvent this roadblock by deriving finite-sample unbiased estimators of the corresponding risks based on Stein’s lemma. We establish the link between the multivariate parameter estimation problem addressed by Stein and our denoising problem, and derive estimators of the oracle risks. In all cases, optimum values of the parameters characterizing the denoising algorithm are determined by minimizing Stein’s unbiased risk estimator (SURE). The key contribution of this thesis is the development of a risk-estimation approach for choosing the two critical parameters affecting the quality of nonparametric regression, namely, the order and the bandwidth/smoothing parameter. This is a classic problem in statistics, and certain algorithms relying on the derivation of suitable finite-sample risk estimators for minimization have been reported in the literature (note that all these works consider the mean squared error (MSE) objective). We show that a SURE-based formalism is well suited to the regression parameter selection problem, and that the optimum solution guarantees near-minimum MSE (MMSE) performance. 
We develop algorithms for choosing the two parameters both globally and locally, the latter being referred to as spatially adaptive regression. We observe that the parameters are chosen so as to trade off the squared bias and variance terms that constitute the MSE. We also indicate the advantages of incorporating a regularization term in the cost function in addition to the data error term. In the more general case of kernel regression, which uses a weighted least-squares (LS) optimization, we consider image restoration from very few random measurements, in addition to denoising of uniformly sampled data. We show that local polynomial regression (LPR) is a special case of kernel regression, and extend our results for LPR on uniform data to non-uniformly sampled data. The denoising algorithms are compared with other standard, high-performing methods available in the literature, in terms of both estimation error and computational complexity. A major perspective provided in this thesis is that the problem of optimum parameter choice in nonparametric regression can be viewed as the selection of optimum parameters of a linear, shift-invariant filter. This interpretation is motivated by the hallmark paper of Savitzky and Golay and by Schafer’s recent article in IEEE Signal Processing Magazine. Savitzky and Golay showed in their original Analytical Chemistry article that LS fitting of a fixed-order polynomial over a neighborhood of fixed size is equivalent to convolution with an impulse response that is fixed and can be precomputed. They provided tables of impulse response coefficients for computing the smoothed function and smoothed derivatives for different orders and neighborhood sizes, the resulting filters being referred to as Savitzky-Golay (S-G) filters. 
Thus, we provide the new perspective that the regression parameter choice is equivalent to optimizing the filter impulse response length or, equivalently, the 3 dB bandwidth, the two being inversely related. We observe that the MMSE solution selects an S-G filter with a longer impulse response (equivalently, a smaller cutoff frequency) over relatively flat portions of the noisy signal, so as to smooth the noise, and a shorter one over locally fast-varying portions, so as to capture the signal patterns. We also provide a generalized S-G filtering viewpoint for kernel regression. Building on the S-G filtering perspective, we turn to the problem of dynamic feature computation in speech recognition. We observe that the methodology employed for computing dynamic features from the trajectories of static features is in fact derivative S-G filtering. With this perspective, we note that the filter coefficients can be precomputed, which makes delta feature computation efficient. Indeed, we observe an advantage by a factor of 104 on making use of S-G filtering over actual LS polynomial fitting and evaluation. Thereafter, we experimentally study the properties of first- and second-order derivative S-G filters of certain orders and lengths. The derivative filters are bandpass owing to the combined effects of LPR and derivative computation, which are lowpass and highpass operations, respectively. The first- and second-order S-G derivative filters are also observed to exhibit an approximately constant-Q property. We perform a TIMIT phoneme recognition experiment comparing the recognition accuracies obtained using S-G filters with those of the conventional approach followed in HTK, which uses Furui’s regression formula. The recognition accuracies in the two cases are almost identical, with S-G filters of certain bandwidths and orders registering a marginal improvement. 
For a given order, the accuracies are also observed to improve with longer filter lengths. In terms of computation latency, we note that S-G filtering computes the delta and delta-delta features in parallel by linear filtering, whereas they must be obtained sequentially with the standard regression formulas used in the literature. Finally, we turn to the problem of speech enhancement, where we are interested in denoising using perceptually motivated loss functions such as Itakura-Saito (IS). We propose to perform enhancement in the discrete cosine transform domain using risk minimization. The cost functions considered are non-quadratic, and the unbiased estimator of the risk corresponding to the IS distortion is derived using an approximate Taylor-series analysis under a high signal-to-noise ratio assumption. The exposition is general, since we focus on an additive noise model with the noise density assumed to fall within the exponential class of density functions, which comprises most of the common densities. The denoising function is assumed to be pointwise linear (the modified James-Stein (MJS) estimator), and parallels between Wiener filtering and the optimum MJS estimator are discussed. Signal Processing; Kernel Regression; Stein's Unbiased Risk Estimator; Savitzky-Golay Filtering; Local Polynomial Regression; Speech Recognition; Nonparametric Regression; SURE Theory; Savitzky-Golay Filters; Stein’s Lemma; Optical Coherence Tomography; Modified James-Stein Estimator; MJS Estimator; Electrical Engineering
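The equivalence described in entry 591, between least-squares polynomial fitting over a fixed window and convolution with a precomputed impulse response, can be illustrated with a short sketch. This is not the thesis's code: the function names `savgol_coeffs` and `savgol_apply` are hypothetical, unit sample spacing is assumed, and only the interior ("valid") samples are returned so that no boundary handling is implied.

```python
import math
import numpy as np

def savgol_coeffs(half_width, order, deriv=0):
    """Impulse response h of length 2*half_width + 1 such that
    sum_k h[k] * window[k] equals the deriv-th derivative, at the window
    centre, of the least-squares polynomial of the given order.
    Assumes unit sample spacing and deriv <= order."""
    k = np.arange(-half_width, half_width + 1, dtype=float)
    # Design matrix with columns k^0, k^1, ..., k^order.
    A = np.vander(k, order + 1, increasing=True)
    # Row j of the pseudo-inverse maps a window to the fitted coefficient a_j;
    # the deriv-th derivative of the polynomial at k = 0 is deriv! * a_deriv.
    return math.factorial(deriv) * np.linalg.pinv(A)[deriv]

def savgol_apply(x, half_width, order, deriv=0):
    """Filter x with the precomputed coefficients (interior samples only)."""
    h = savgol_coeffs(half_width, order, deriv)
    # np.convolve flips its second argument, so pass h reversed to obtain
    # the sliding inner product sum_k h[k] * x[n + k].
    return np.convolve(x, h[::-1], mode="valid")
```

The coefficients depend only on the window length, the polynomial order, and the derivative order, so they are computed once and reused for every sample; smoothed values and derivatives (such as delta features) then cost one inner product per output sample rather than one polynomial fit.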
592	Estimation non paramétrique pour les processus markoviens déterministes par morceaux / Nonparametric estimation for piecewise-deterministic Markov processes Azaïs, Romain 01 July 2013 Piecewise-deterministic Markov processes (PDMPs) were introduced by M.H.A. Davis as a general family of non-diffusion stochastic models, involving deterministic motion punctuated by random jumps at random times. In this thesis, we propose and analyze nonparametric estimators of the conditional distributions of the two sources of randomness that drive such a process. More precisely, for a PDMP trajectory observed over a long time interval, we present estimators of the conditional density of the inter-jump times and of the Markov kernel that governs the distribution of the jumps. We establish convergence results for both estimators, and numerical simulations for several applications illustrate our theoretical results. 
Furthermore, we propose an estimator of the jump rate of a nonhomogeneous renewal process and a numerical approximation method based on optimal quantization for a semiparametric regression model. Piecewise-deterministic Markov processes; Ergodic Markov chains; Nonparametric estimation; Jump rate estimation; Transition kernel estimation; Semiparametric regression
593	Système complet d’acquisition vidéo, de suivi de trajectoires et de modélisation comportementale pour des environnements 3D naturellement encombrés : application à la surveillance apicole / Full process of acquisition, multi-target tracking, behavioral modeling for naturally crowded environments : application to beehives monitoring Chiron, Guillaume 28 November 2014 This manuscript proposes a methodological approach for building a complete video surveillance chain for naturally cluttered environments. We identify and overcome a number of methodological and technological barriers inherent to: 1) the acquisition of video sequences in natural conditions, 2) image processing, 3) multi-target tracking, 4) the discovery and modeling of recurring behavioral patterns, and 5) data fusion. The application context of our work is beehive monitoring, and in particular the study of the trajectories of bees in flight in front of their hive. This thesis is therefore also a feasibility and prototyping study within the two interdisciplinary projects EPERAS and RISQAPI (conducted in collaboration with INRA Magneraud and the French National Museum of Natural History). For us as computer scientists, and for the biologists who worked with us, this is an entirely new area of investigation, in which the domain knowledge usually essential to this kind of application is yet to be established. Unlike existing approaches to insect tracking, we tackle the problem in three-dimensional space by using a high-frequency stereo camera. In this context, we detail our new target detection method, called HIDS segmentation. 
For the computation of trajectories, we explore several target tracking approaches, relying on more or less prior knowledge, that are able to cope with the extreme conditions of the application (e.g. numerous small targets exhibiting chaotic motion). Once the trajectories are collected, we organize them in a hierarchical data structure and apply a Bayesian nonparametric approach to discover emergent behaviors within the insect colony. The exploratory analysis of the trajectories from the cluttered scene is performed by unsupervised classification, simultaneously over different semantic levels, where the number of clusters at each level is not defined a priori but is estimated from the data. This approach is first validated using a pseudo-ground truth generated by a multi-agent system, and then applied to real data. Stereovision; RGB-D segmentation; Multi-target tracking; Behavioral modeling; Bayesian nonparametric approach; Hierarchical Dirichlet process; Beehive monitoring; Honeybee colony
594	Estimation Bayésienne non Paramétrique de Systèmes Dynamiques en Présence de Bruits Alpha-Stables / Nonparametric Bayesian Estimation of Dynamical Systems in the Presence of Alpha-Stable Noise Jaoua, Nouha 06 June 2013 In the signal processing literature, noise sources are often assumed to be Gaussian or modeled by a mixture of Gaussians. However, in a growing number of applications this assumption is inadequate and can lead to a loss of resolution and/or accuracy. 
This is particularly the case for impulsive noise, which is encountered in several areas, especially telecommunications. $\alpha$-stable distributions are better suited to modeling this type of noise. In this context, the main aim of this thesis is to design novel robust methods for the joint estimation of the state and the noise in impulsive environments. Inference is performed within a Bayesian framework using sequential Monte Carlo methods. First, this problem is addressed in the context of OFDM transmission systems, assuming that the channel distortions follow symmetric alpha-stable distributions. A sequential Monte Carlo algorithm (particle filter) is proposed for the joint estimation of the transmitted OFDM symbols and the parameters of the $\alpha$-stable noise. Then, the problem is tackled in the more general context of nonlinear dynamic systems. A flexible Bayesian nonparametric model based on Dirichlet process mixtures is introduced to model the alpha-stable noise. Moreover, particle filters based on efficient importance densities are developed for the joint estimation of the state and the unknown noise probability densities. Impulsive noise; Alpha-stable distributions; Bayesian inference; Monte Carlo methods; Particle filtering; OFDM systems; Nonparametric density estimation; Dirichlet process mixture
595	Estimation de régularité locale / Local regularity estimation Servien, Rémi 12 March 2010 The goal of this thesis is to study the local behavior of a probability measure, in particular by means of a local regularity index. In the first part, we establish the asymptotic normality of the k_n-nearest-neighbor density estimator. In the second, we define a mode estimator under weakened hypotheses. We show that the regularity index plays a role in both problems. Finally, in the third part, we construct several estimators of the regularity index from estimators of the distribution function, for which we provide a literature review. Local regularity index; Probability measure; Nonparametric estimation; Mode estimators; Distribution function estimators; Asymptotic normality; Nearest neighbor estimate
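For readers unfamiliar with the nearest-neighbor density estimator mentioned in entry 595, the following is a minimal one-dimensional sketch of the textbook form, f(x) ≈ k / (n · 2 · R_k(x)), where R_k(x) is the distance from x to its k-th nearest sample point. The function name `knn_density` is hypothetical, and the thesis may work with a more general setting than the real line.

```python
import numpy as np

def knn_density(x_eval, sample, k):
    """Textbook k-nearest-neighbor density estimate on the real line.

    The estimate at x is k / (n * 2 * R_k(x)), where 2 * R_k(x) is the
    length of the smallest interval centred at x containing k sample points.
    """
    sample = np.asarray(sample, dtype=float)
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    # Pairwise distances: one row per evaluation point.
    dists = np.abs(x_eval[:, None] - sample[None, :])
    # k-th smallest distance in each row (index k - 1 after partitioning).
    r_k = np.partition(dists, k - 1, axis=1)[:, k - 1]
    return k / (len(sample) * 2.0 * r_k)
```

In contrast to a fixed-bandwidth kernel estimator, the effective window width adapts to the local concentration of the sample.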
596	Nonparametric kernel estimation methods for discrete conditional functions in econometrics Elamin, Obbey Ahmed January 2013 This thesis studies the mixed-data-type kernel estimation framework for models of discrete dependent variables, known as kernel discrete conditional functions. The conventional parametric multinomial logit (MNL) model is compared with the mixed-data-type kernel conditional density estimator in Chapter 2. A new kernel estimator for discrete-time single-state hazard models, named the discrete-time “external kernel hazard” estimator, is developed in Chapter 3. The discrete-time (mixed) proportional hazard estimators are then compared empirically with the discrete-time external kernel hazard estimator in Chapter 4. The work in Chapter 2 estimates a labour force participation decision model using cross-sectional data from the 2007 UK Labour Force Survey. The work in Chapter 4 estimates a hazard rate for job vacancies, measured in weeks, using data from the Lancashire Careers Service (LCS) for the period from March 1988 to June 1992. Evidence from the extensive literature on female labour force participation and from job-market random matching theory is used to examine the empirical results of the estimators. The parametric estimators are constrained by restrictive assumptions regarding the link function of the discrete dependent variable and the dummy variables of the discrete covariates. Adding interaction terms improves the performance of the parametric models but introduces other risks, such as generating multicollinearity, making the data matrix closer to singular, and complicating the computation of the ML function. By contrast, the mixed-data-type kernel estimation framework performs very well compared with the conventional parametric estimation methods. 
The kernel functions used for the discrete variables, including the dependent variable, in the mixed-data-type estimation framework substantially improve the performance of the kernel estimators. The kernel framework makes very few assumptions about the functional form of the variables in the model, and relies on the right choice of kernel functions in the estimator. The results of the kernel conditional density estimator show that female education level and fertility have a strong impact on women's propensity to work and to be in the labour force. The kernel conditional density estimator captures more heterogeneity among the women in the sample than the MNL model, owing to the restrictive parametric assumptions of the latter. The (mixed) proportional hazard framework, on the other hand, fails to capture the effect of job-market tightness on the job-vacancy hazard rate and produces inconsistent results when the assumptions regarding the distribution of the unobserved heterogeneity are changed. The external kernel hazard estimator overcomes these problems and produces results that are consistent with job-market random matching theory. The results in this thesis are useful for research on nonparametric estimation in econometrics and for labour economics research. 519.5
597	Classificação de dados estacionários e não estacionários baseada em grafos / Graph-based classification for stationary and non-stationary data João Roberto Bertini Júnior 24 January 2011 Graph-based methods are a powerful form of data representation and abstraction that offers, among other advantages, the ability to represent topological relations, visualize structures, represent groups of data with distinct shapes, and supply alternative measures to characterize data. This type of approach has been increasingly considered for solving machine learning problems, mainly in unsupervised learning, such as clustering, and more recently in semi-supervised learning. Graph-based solutions for supervised learning tasks, however, remain underexplored in the literature. This work presents a nonparametric graph-based algorithm for classification problems with a stationary distribution, as well as its extension to problems with a non-stationary distribution. The algorithm relies on two concepts: 1) a structure called the optimal K-associated graph, which represents the training set as a sparse graph separated into components; and 2) the purity measure of each component, which uses the graph structure to determine the local level of mixing of the data with respect to their classes. 
This work also considers classification problems in which the distribution of new data changes. This problem characterizes concept drift and degrades the performance of the classifier. Hence, to maintain good performance, the classifier must keep learning during the application phase, for example by means of incremental learning. Experimental results suggest that both approaches offer advantages in data classification over the algorithms tested. 
Concept drift; Graph formation; Graph-based learning; Incremental learning; K-associated graph; Multi-class classification; Nonparametric classification; Purity measure
598	Extension au cadre spatial de l'estimation non paramétrique par noyaux récursifs / Extension to spatial setting of kernel recursive estimation Yahaya, Mohamed 15 December 2016 In this thesis, we are interested in recursive methods, which allow estimates to be updated sequentially for spatial or spatio-temporal data and do not require permanent storage of all the data. Processing and analyzing data streams effectively and efficiently is an active challenge in statistics. Indeed, in many application areas, decisions must be taken at a given time upon receipt of a certain amount of data, and updated once new data become available at a later date. We therefore propose and study kernel estimators of the probability density function and of the regression function for spatial or spatio-temporal data streams. Specifically, we adapt the classical kernel estimators of Parzen-Rosenblatt and Nadaraya-Watson. To this end, we combine the methodology of recursive density and regression estimators with that of spatially or spatio-temporally distributed data. We provide applications and numerical studies of the proposed estimators. The specificity of the methods studied lies in the fact that the estimates take into account the spatial dependence structure of the data considered, which is far from trivial. This thesis therefore falls within the context of nonparametric spatial statistics and its applications. 
It makes three main contributions, which rest on the study of recursive nonparametric estimators in a spatial or spatio-temporal setting: recursive kernel density estimation in a spatial setting, recursive kernel density estimation in a spatio-temporal setting, and recursive kernel regression estimation in a spatial setting. 
Spatial statistics; Data stream; Dependent data; Weakly dependent mixing processes; Nonparametric estimation; Kernel estimator; Mean squared error convergence; Almost sure convergence
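The idea in entry 598 of updating a kernel estimate as data arrive, without storing past observations, can be sketched with the classical non-spatial recursive density estimator, f_n(x) = ((n-1)/n) f_{n-1}(x) + K((x - X_n)/h_n) / (n h_n). This is only an illustration of the recursive principle, not the spatial estimators studied in the thesis: the class name `RecursiveKDE`, the Gaussian kernel, and the bandwidth sequence h_n = c · n^(-1/5) are all assumptions made for the example.

```python
import numpy as np

class RecursiveKDE:
    """Recursive kernel density estimate evaluated on a fixed grid.

    Only the current estimate and the observation count are stored;
    each new observation triggers a constant-memory update.
    """

    def __init__(self, grid, c=1.0, power=0.2):
        self.grid = np.asarray(grid, dtype=float)
        self.c = c          # bandwidth constant (assumed, to be tuned)
        self.power = power  # h_n = c * n**(-power) (assumed rate)
        self.n = 0
        self.density = np.zeros_like(self.grid)

    def update(self, x_new):
        self.n += 1
        h = self.c * self.n ** (-self.power)
        u = (self.grid - x_new) / h
        kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)  # Gaussian K
        # f_n = f_{n-1} + (K_h - f_{n-1}) / n, i.e. the running average of
        # the per-observation kernel terms K((x - X_i) / h_i) / h_i.
        self.density += (kernel / h - self.density) / self.n
        return self.density
```

Because each observation keeps the bandwidth it was given on arrival, the result coincides with the batch average of the per-observation kernel terms while using memory proportional to the grid size only.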
599	Contributions à la modélisation de données spatiales et fonctionnelles : applications / Contributions to modeling spatial and functional data: applications Ternynck, Camille 28 November 2014 (has links) In this dissertation, we study nonparametric modeling of spatial and/or functional data, based mainly on the kernel method. In general, the samples used to establish the asymptotic properties of the proposed estimators consist of dependent variables. The distinguishing feature of the methods studied is that the estimators take the dependence structure of the data into account. In the first part, we study spatially dependent real-valued variables. We propose a new kernel approach for estimating the spatial probability density and regression functions, as well as the mode. The distinctive feature of this approach is that it accounts for both the proximity between observations and the proximity between sites. We study the asymptotic behavior of the proposed estimators and apply them to simulated and real data. 
In the second part, we model data taking values in an infinite-dimensional space, so-called "functional data". First, we adapt the nonparametric regression model introduced in the first part to spatially dependent functional data, and give asymptotic as well as numerical results. Second, we study a time series regression model in which the explanatory variables are functional and the innovation process is autoregressive. We propose a procedure that takes into account the information contained in the error process. After studying the asymptotic behavior of the proposed kernel estimator, we analyze its performance on simulated and then real data. 
The third part is devoted to applications. First, we present unsupervised classification results for simulated and real (multivariate) spatial data. The classification method is based on estimating the spatial mode, obtained from the spatial density estimator introduced in the first part of this thesis. We then apply this mode-based classification method, along with other unsupervised classification methods from the literature, to hydrological data of a functional nature. Finally, this classification of the hydrological data led us to apply change point detection tools to these functional data. Nonparametric estimation Kernel estimate Probability density Mode Regression Spatial statistics Functional data Time series Unsupervised classification Change point detection
600	Multivariate Analysis of Korean Pop Music Audio Features Solomon, Mary Joanna 20 May 2021 (has links) No description available. Statistics Music Multivariate Statistics Nonparametric Statistics Classification Regression PCA Principal Component Analysis K-pop Korean Pop Music Logistic Regression Multiple Linear Regression MLR Shrinkage Methods Ridge Lasso Elastic Net
