11

Airborne Radar Ground Clutter Suppression Using Multitaper Spectrum Estimation : Comparison with Traditional Method

Ekvall, Linus January 2018 (has links)
When processing data received by an airborne radar, one issue is that the signal echo from the ground produces a large perturbation. Because of this perturbation, it can be difficult to detect targets with low velocity or a low signal-to-noise ratio, so a filtering process is needed to separate the large perturbation from the target signal. The traditional method includes a tapered Fourier transform operating in parallel with an MTI filter that suppresses the main spectral peak in order to produce a smoother spectral output. The echo from a typical object in the environment and the echo from the ground can differ by more than 60 dB. This thesis investigates how the multitaper approach can be combined with the minimum variance estimation technique to produce a spectral estimate that achieves more effective clutter suppression. A simulation model of the ground clutter was constructed, and a number of simulations of the multitaper, minimum variance estimation technique were made. Compared to the traditional method defined in this thesis, the multitaper approach gave a slight gain in improvement factor. An analysis of how variations of the multitaper parameters influence the results, with respect to minimum detectable velocity and improvement factor, has been carried out. It showed that a large number of time samples, a large number of tapers and a narrow bandwidth provided the best result. The analysis is based on a full factorial simulation that provides insight into how to choose the DPSS parameters if the method is to be implemented in a real radar system.
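As an illustration of the core idea only (not the thesis's own implementation, which combines the tapers with a minimum variance weighting), the sketch below forms a plain multitaper Doppler spectrum by averaging DPSS-tapered periodograms of the slow-time samples; the parameter values `NW` and `K` are illustrative assumptions.

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_spectrum(x, NW=3.0, K=5, nfft=None):
    """Average of K DPSS-tapered periodograms of the slow-time samples x.

    A plain multitaper estimate; a minimum variance (Capon-style) combination
    of the tapered spectra would replace the simple average below.
    """
    n = len(x)
    nfft = nfft or n
    tapers = dpss(n, NW, K)                       # shape (K, n)
    spectra = np.abs(np.fft.fft(tapers * x, nfft, axis=1)) ** 2
    return spectra.mean(axis=0)                   # average over tapers

# Toy example: strong zero-Doppler ground clutter plus a weak moving target.
rng = np.random.default_rng(0)
n = 64
t = np.arange(n)
clutter = 1000.0 * np.exp(2j * np.pi * 0.0 * t)   # ground return near 0 Hz
target = 1.0 * np.exp(2j * np.pi * 0.2 * t)       # slow-moving target
x = clutter + target + 0.1 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
S = multitaper_spectrum(x, NW=3.0, K=5)
```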
12

Modelling of conditional variance and uncertainty using industrial process data

Juutilainen, I. (Ilmari) 14 November 2006 (has links)
This thesis presents methods for modelling the conditional variance and the uncertainty of a prediction at a query point on the basis of industrial process data. The introductory part of the thesis provides an extensive background of the examined methods and a summary of the results; the results are presented in detail in the original papers. The application presented in the thesis is modelling of the mean and variance of the mechanical properties of steel plates. Both the mean and the variance of the mechanical properties depend on many process variables. A method for predicting the probability of rejection in a qualification test is presented and implemented in a tool developed for the planning of strength margins. The developed tool has been successfully utilised in the planning of mechanical properties in a steel plate mill. The methods for modelling the dependence of conditional variance on input variables are reviewed and their suitability for large industrial data sets is examined. In a comparative study, neural network modelling of the mean and dispersion performed the best by a narrow margin. A method is presented for evaluating the uncertainty of a regression-type prediction at a query point on the basis of the predicted conditional variance, the model variance and the effect of uncertainty about explanatory variables at early process stages. A method for measuring the uncertainty of a prediction on the basis of the density of the data around the query point is proposed, and the proposed distance measure is utilised in comparing the generalisation ability of models. The generalisation properties of the most important regression learning methods are studied, and the results indicate that local methods and quadratic regression have poor interpolation capability compared with the multi-layer perceptron and Gaussian kernel support vector regression. The possibility of adaptively modelling a time-varying conditional variance function is also considered: two methods for adaptive modelling of the variance function are proposed, and their background is presented.
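To make the rejection-probability idea concrete, here is a small hedged sketch (not the thesis's actual tool): given a predicted mean and predicted standard deviation of a strength property and a lower specification limit, a normal approximation gives the probability of failing the qualification test. The variable names, the numbers and the normality assumption are illustrative.

```python
from scipy.stats import norm

def rejection_probability(mu_pred, sigma_pred, lower_spec):
    """P(property < lower_spec) under a normal approximation,
    using the model's predicted mean and standard deviation."""
    return norm.cdf((lower_spec - mu_pred) / sigma_pred)

# Example: predicted yield strength 380 MPa with 12 MPa predicted std and a
# specification limit of 355 MPa gives a small but non-negligible risk.
p = rejection_probability(380.0, 12.0, 355.0)   # about 0.019
```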
13

Channel and Noise Variance Estimation for Future 5G Cellular Networks

Iscar Vergara, Jorge 10 November 2016 (has links)
Future fifth generation (5G) cellular networks have to cope with an expected ten-fold increase in mobile data traffic between 2015 and 2021. To achieve this goal, new technologies are being considered, including massive multiple-input multiple-output (MIMO) systems and millimeter-wave (mmWave) communications. Massive MIMO involves the use of large antenna arrays at the base station, while mmWave communications employ frequencies between 30 and 300 GHz. In this thesis we study the impact of these technologies on the performance of channel estimators. Our results show that the characteristics of the propagation channel at mmWave frequencies improve channel estimation performance in comparison with current cellular networks, which operate at lower frequencies. Furthermore, we demonstrate the existence of an optimal angular spread of the multipath clusters, which can be used to maximize the capacity of mmWave networks. We also propose efficient noise variance estimators, which can be employed as an input to existing channel estimators.
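The abstract does not give the estimators themselves; purely as a hedged illustration of the general idea, the sketch below estimates the noise variance from repeated transmissions of a known pilot symbol over an assumed static single-antenna channel. The setup and names are assumptions, not the thesis's estimators.

```python
import numpy as np

def noise_variance_from_pilots(y, x_pilot):
    """Estimate the noise variance from N received copies y of one known pilot.

    The least-squares channel estimate absorbs the (assumed static) channel;
    the residual power, corrected for the one estimated parameter, estimates
    the noise variance sigma^2.
    """
    y = np.asarray(y)
    h_ls = np.mean(y / x_pilot)                    # least-squares channel estimate
    residual = y - h_ls * x_pilot
    return np.sum(np.abs(residual) ** 2) / (len(y) - 1)

# Toy check with a known channel and noise level.
rng = np.random.default_rng(1)
h, sigma2, N = 0.8 - 0.3j, 0.05, 1000
x = 1.0 + 0.0j
noise = np.sqrt(sigma2 / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
y = h * x + noise
sigma2_hat = noise_variance_from_pilots(y, x)      # close to 0.05
```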
14

Variance estimation in the presence of imputed data for high-entropy sampling designs

Vallée, Audrey-Anne 07 1900 (has links)
Variance estimation in the case of item nonresponse treated by imputation is the main topic of this work. Treating the imputed values as if they were observed may lead to substantial underestimation of the variance of point estimators. Classical variance estimators rely on the availability of the second-order inclusion probabilities, which may be difficult (or even impossible) to calculate. We propose to study the properties of variance estimators obtained by approximating the second-order inclusion probabilities. These approximations are expressed in terms of first-order inclusion probabilities and are usually valid for high-entropy sampling designs. The results of a simulation study evaluating the properties of the proposed variance estimators in terms of bias and mean squared error are presented.
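One well-known approximation of this kind is Hájek's, shown below inside a Horvitz-Thompson-type variance estimator as a hedged sketch; whether this particular approximation is among those studied in the thesis is an assumption on my part, and the function names and data are illustrative.

```python
import numpy as np

def hajek_joint_inclusion(pi):
    """Hájek's high-entropy approximation of second-order inclusion
    probabilities from the first-order ones:
        pi_kl ~= pi_k pi_l [1 - (1 - pi_k)(1 - pi_l)/d],  d = sum_k pi_k(1 - pi_k),
    where pi holds the first-order probabilities of the whole population."""
    pi = np.asarray(pi, dtype=float)
    d = np.sum(pi * (1.0 - pi))
    pikl = np.outer(pi, pi) * (1.0 - np.outer(1.0 - pi, 1.0 - pi) / d)
    np.fill_diagonal(pikl, pi)                     # pi_kk = pi_k
    return pikl

def ht_variance_estimate(y_s, pi_s, pikl_s):
    """Horvitz-Thompson variance estimator of the estimated total, using
    (possibly approximated) joint inclusion probabilities of the sampled units."""
    z = y_s / pi_s
    delta = pikl_s - np.outer(pi_s, pi_s)          # pi_kl - pi_k pi_l
    return np.sum(delta / pikl_s * np.outer(z, z))

# Hypothetical use: population inclusion probabilities pi and a sample s.
pi = np.full(100, 0.1)                             # about n = 10 expected out of N = 100
pikl = hajek_joint_inclusion(pi)
s = np.array([2, 11, 23, 37, 41, 55, 60, 72, 88, 95])
y_s = np.random.default_rng(2).normal(50.0, 5.0, size=s.size)
v_hat = ht_variance_estimate(y_s, pi[s], pikl[np.ix_(s, s)])
```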
15

Gaussian Process Regression-based GPS Variance Estimation and Trajectory Forecasting

Kortesalmi, Linus January 2018 (has links)
Spatio-temporal data is a commonly used source of information, and applying machine learning to this kind of data can lead to many interesting and useful insights. In this thesis project, a novel public transportation spatio-temporal dataset is explored and analysed. The dataset contains 282 GB of positional events, spanning two weeks, from all public transportation vehicles in Östergötland county, Sweden. From the data exploration, three high-level problems are formulated: bus stop detection, GPS variance estimation, and arrival time prediction, also called trajectory forecasting. The bus stop detection problem is briefly discussed and solutions are proposed. Gaussian process regression is an effective method for solving regression problems; the GPS variance estimation problem is solved using a mixture of Gaussian processes, and a mixture of Gaussian processes is also used to predict the arrival time of public transportation buses. The arrival time is predicted from one bus stop to the next, not for the whole trajectory. The result of the arrival time prediction is a distribution of arrival times, from which the earliest and latest expected arrivals at the next bus stop can easily be determined, alongside the most probable arrival time. The naïve arrival time prediction model implemented has a root mean square error of 5 to 19 seconds, and in general the absolute error of the prediction model decreases over time within each segment. The result of the GPS variance estimation problem is a model that can compare the variance for different environments along the route of a given trajectory.
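As a reference point for the method named in the abstract (not the thesis's mixture model), here is a minimal sketch of standard Gaussian process regression with an RBF kernel; the posterior variance is the quantity a GPS variance estimate of this kind builds on. The kernel hyperparameters and toy data are assumptions.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel k(a, b) = variance * exp(-|a - b|^2 / (2 l^2))."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_posterior(x_train, y_train, x_test, noise_var=0.1, lengthscale=1.0):
    """Posterior mean and variance of a zero-mean GP at the test inputs."""
    K = rbf_kernel(x_train, x_train, lengthscale) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test, lengthscale)
    K_ss = rbf_kernel(x_test, x_test, lengthscale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    var = np.diag(K_ss) - np.sum(v ** 2, axis=0)
    return mean, var

# Toy 1-D example: noisy observations of a sine, prediction on a grid.
rng = np.random.default_rng(3)
x_tr = np.sort(rng.uniform(0, 10, 20))
y_tr = np.sin(x_tr) + 0.3 * rng.standard_normal(20)
x_te = np.linspace(0, 10, 100)
mu, var = gp_posterior(x_tr, y_tr, x_te, noise_var=0.09, lengthscale=1.5)
```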
16

Comparison of variance estimators for the TMLE

Boulanger, Laurence 09 1900 (has links)
No description available.
17

Analysis of Longitudinal Surveys with Missing Responses

Carrillo Garcia, Ivan Adolfo January 2008 (has links)
Longitudinal surveys have emerged in recent years as an important data collection tool for population studies where the primary interest is to examine population changes over time at the individual level. The National Longitudinal Survey of Children and Youth (NLSCY), a large-scale survey with a complex sampling design conducted by Statistics Canada, follows a large group of children and youth over time and collects measurements on various indicators related to their educational, behavioral and psychological development. One of the major objectives of the study is to explore how such development is related to or affected by familial, environmental and economic factors. The generalized estimating equation approach, better known as the GEE method, is the most popular statistical inference tool for longitudinal studies. The vast majority of the existing literature on the GEE method, however, uses it in non-survey settings, and issues related to complex sampling designs are ignored. This thesis develops methods for the analysis of longitudinal surveys when the response variable contains missing values. Our methods are built within the GEE framework, with a major focus on using the GEE method when missing responses are handled through hot-deck imputation. We first argue why, and further show how, the survey weights can be incorporated into the so-called pseudo GEE method under a joint randomization framework. The consistency of the resulting pseudo GEE estimators with complete responses is established under the proposed framework. The main focus of this research is to extend the proposed pseudo GEE method to cover cases where the missing responses are imputed through the hot-deck method. Both weighted and unweighted hot-deck imputation procedures are considered, and the consistency of the pseudo GEE estimators under imputation for missing responses is established for both. Linearization variance estimators are developed for the pseudo GEE estimators under the assumption that the finite population sampling fraction is small or negligible, a scenario that often holds for large-scale population surveys. Finite-sample performances of the proposed estimators are investigated through an extensive simulation study. The results show that the pseudo GEE estimators and the linearization variance estimators perform well under several sampling designs and for both continuous and binary responses.
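To fix ideas, here is a hedged sketch of survey-weighted estimating equations in the simplest possible setting: identity link, independence working correlation, and a linear model, in which case the weighted GEE reduces to weighted least squares over the stacked repeated measures. This is a baseline illustration, not the thesis's pseudo GEE estimator, which covers general links, working correlations, imputation and variance estimation.

```python
import numpy as np

def weighted_gee_independence(X_list, y_list, weights):
    """Solve  sum_i w_i X_i' (y_i - X_i beta) = 0  for a linear (identity-link)
    model with an independence working correlation; this is simply weighted
    least squares over the stacked longitudinal data."""
    XtWX = sum(w * X.T @ X for X, w in zip(X_list, weights))
    XtWy = sum(w * X.T @ y for X, y, w in zip(X_list, y_list, weights))
    return np.linalg.solve(XtWX, XtWy)

# Hypothetical longitudinal data: 3 subjects, 4 repeated measures each,
# design matrix = [intercept, time], subject-level survey weights w_i.
rng = np.random.default_rng(4)
time = np.arange(4.0)
X_i = np.column_stack([np.ones(4), time])
X_list = [X_i, X_i, X_i]
y_list = [1.0 + 0.5 * time + 0.2 * rng.standard_normal(4) for _ in range(3)]
weights = [120.0, 80.0, 200.0]
beta_hat = weighted_gee_independence(X_list, y_list, weights)   # roughly [1.0, 0.5]
```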
19

Variance parameter estimation methods with re-use of data

Meterelliyoz Kuyzu, Melike 25 August 2008 (has links)
This dissertation studies three classes of estimators for the asymptotic variance parameter of a stationary stochastic process. All estimators are based on the concept of data "re-use" and all transform the output process into functions of an approximate Brownian motion process. The first class of estimators consists of folded standardized time series area and Cramér-von Mises (CvM) estimators. Detailed expressions are obtained for their expectation at folding levels 0 and 1; those expressions explain the puzzling increase in small-sample bias as the folding level increases. In addition, we use batching and linear combinations of estimators from different levels to produce estimators with significantly smaller variance. Finally, we obtain very accurate approximations of the limiting distributions of batched folded estimators. These approximations are used to compute confidence intervals for the mean and variance parameter of the underlying stochastic process. The second class --- folded overlapping area estimators --- is computed by averaging folded versions of the standardized time series corresponding to overlapping batches. We establish the limiting distributions of the proposed estimators as the sample size tends to infinity, and we obtain statistical properties of these estimators such as bias and variance. Further, we find approximate confidence intervals for the mean and variance parameter of the process by approximating the theoretical distributions of the proposed estimators. In addition, we develop algorithms to compute these estimators with only order-of-sample-size work. The third class --- reflected area and CvM estimators --- is computed from reflections of the original sample path. We obtain the expected values and variances of the individual estimators. We show that it is possible to obtain linear combinations of reflected estimators with smaller variance than that of each constituent estimator, often at no cost in bias. A quadratic optimization problem is solved to find an optimal linear combination of estimators that minimizes the variance of the linearly combined estimator. For all classes of estimators, we provide Monte Carlo examples to show that the estimators perform as well in practice as advertised by the theory.
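For orientation only (not the folded, overlapping or reflected variants developed in the dissertation), the sketch below computes the basic level-0 weighted area estimator of the variance parameter sigma^2 = lim_n n Var(Ybar_n), built from the standardized time series with the constant weight sqrt(12). The AR(1) test data and names are illustrative.

```python
import numpy as np

def area_estimator(y):
    """Level-0 weighted area estimator of the variance parameter
    sigma^2 = lim n Var(mean(Y_1..Y_n)) of a stationary sequence y.

    Uses the standardized time series with the constant weight sqrt(12):
        A = [ (1/n) * sum_k sqrt(12) * k*(ybar_n - ybar_k)/sqrt(n) ]^2,
    which is asymptotically unbiased for sigma^2."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    k = np.arange(1, n + 1)
    running_mean = np.cumsum(y) / k                  # ybar_k for k = 1..n
    bridge = k * (running_mean[-1] - running_mean) / np.sqrt(n)
    return (np.sqrt(12.0) * bridge.mean()) ** 2

# Toy check on an AR(1) process with phi = 0.5, for which sigma^2 = 1/(1-phi)^2 = 4.
rng = np.random.default_rng(5)
phi, n = 0.5, 10_000
eps = rng.standard_normal(n)
y = np.empty(n)
y[0] = eps[0]
for t in range(1, n):
    y[t] = phi * y[t - 1] + eps[t]
A = area_estimator(y)   # one replicate; spread around sigma^2 is roughly chi-square_1
```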
20

Safe optimization algorithms for variable selection and hyperparameter tuning

Ndiaye, Eugene 04 October 2018 (has links)
Massive and automatic data processing requires the development of techniques able to filter out the most important information. Among these methods, those with sparse structures have been shown to improve the statistical and computational efficiency of estimators in a high-dimensional context. They can often be expressed as solutions of regularized empirical risk minimization and generally lead to non-differentiable optimization problems in the form of a sum of a smooth term, measuring the quality of the fit, and a non-smooth term, penalizing complex solutions. Although such a way of including prior information has considerable advantages, it unfortunately introduces many numerical difficulties, both for solving the underlying optimization problem and for calibrating the level of regularization. Addressing these issues has been at the heart of this thesis. A recently introduced technique, called "screening rules", proposes to ignore some variables during the optimization process by exploiting the expected sparsity of the solutions. These elimination rules are said to be safe when the procedure guarantees not to reject any variable wrongly. In this work, we propose a unified framework for identifying important structures in these convex optimization problems and we introduce the "Gap Safe Screening Rules". They allow significant gains in computational time thanks to the dimensionality reduction they induce. In addition, they can easily be inserted into iterative algorithms and apply to a large number of problems. To find a good compromise between minimizing risk and introducing a learning bias, (exact) homotopy continuation algorithms offer the possibility of tracking the curve of solutions as a function of the regularization parameter. However, they exhibit numerical instabilities due to several matrix inversions and are often expensive in large dimension; moreover, a worst-case analysis shows that their complexity is exponential in the dimension of the model parameter. Allowing approximate solutions makes it possible to circumvent these drawbacks by approximating the curve of solutions. In this thesis, we revisit approximation techniques for regularization paths given a predefined tolerance and propose an in-depth analysis of their complexity with respect to the regularity of the loss functions involved. We then propose optimal algorithms as well as various strategies for exploring the parameter space, together with a calibration method (for the regularization parameter) that enjoys global convergence guarantees for the minimization of the empirical risk on validation data. Among sparse regularization methods, the Lasso is one of the most celebrated and studied. Its statistical theory suggests choosing the level of regularization according to the amount of variance in the observations, which is difficult to use in practice because the variance of the model is often an unknown quantity. In such cases, it is possible to jointly optimize the regression parameter and the noise level. These concomitant estimators, which appeared in the literature under the names Scaled Lasso and Square-Root Lasso, provide theoretical results as sharp as those of the Lasso while being independent of the actual noise level of the observations. Although they represent important advances, these methods are numerically unstable and the currently available algorithms are expensive in computation time. We illustrate these difficulties and propose modifications based on smoothing techniques to increase the stability of these estimators, as well as a faster algorithm to compute them.
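As a hedged sketch of the screening idea for the standard Lasso objective (a minimal illustration following the Gap Safe sphere construction, not the thesis's full framework or code), the test below flags features guaranteed to be zero at the optimum given any primal iterate; the data and parameter choices are illustrative.

```python
import numpy as np

def gap_safe_screen(X, y, beta, lam):
    """Gap Safe screening test for the Lasso
        min_beta 0.5 * ||y - X beta||^2 + lam * ||beta||_1.

    Returns a boolean mask of features that can be safely discarded
    (guaranteed zero at the optimum), given any primal iterate beta."""
    residual = y - X @ beta
    # Rescale the residual so that theta is dual feasible: |x_j' theta| <= 1.
    theta = residual / max(lam, np.max(np.abs(X.T @ residual)))
    primal = 0.5 * residual @ residual + lam * np.sum(np.abs(beta))
    dual = 0.5 * y @ y - 0.5 * lam ** 2 * np.sum((theta - y / lam) ** 2)
    gap = max(primal - dual, 0.0)
    radius = np.sqrt(2.0 * gap) / lam
    scores = np.abs(X.T @ theta) + radius * np.linalg.norm(X, axis=0)
    return scores < 1.0          # True -> feature j can be removed

# Hypothetical use inside a coordinate-descent solver: screen every few
# epochs and drop the flagged columns of X (and entries of beta).
rng = np.random.default_rng(6)
X = rng.standard_normal((50, 200))
beta_true = np.zeros(200)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.1 * rng.standard_normal(50)
lam = 0.5 * np.max(np.abs(X.T @ y))
discard = gap_safe_screen(X, y, np.zeros(200), lam)
```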
