71

Evaluating Bag Of Little Bootstraps On Logistic Regression With Unbalanced Data

Bark, Henrik January 2023 (has links)
The Bag of Little Bootstraps (BLB) was introduced to make the bootstrap method more computationally efficient when used on massive data samples. Since its introduction, a broad spectrum of research on applications of the BLB has been conducted. However, while the BLB has shown promising results for logistic regression, those results were obtained on well-balanced data. There is therefore a clear need for further research into how the BLB performs when the dependent variable is unbalanced, and whether possible performance issues can be remedied through methods such as Firth's Penalized Maximum Likelihood Estimation (PMLE). This thesis shows that imbalance in the dependent variable severely affects the BLB's performance when applied to logistic regression. It also shows that PMLE produces mixed and unreliable results when used to remedy the resulting drops in performance.
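To make the resampling scheme concrete, here is a minimal sketch of the BLB applied to logistic regression (not the thesis's own code): the data are split into small subsamples, each subsample is inflated back to the full sample size through multinomial weights, and the per-subsample bootstrap standard errors are averaged. The subsample exponent, the number of resamples, and the use of scikit-learn's LogisticRegression are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def blb_logistic_se(X, y, n_subsamples=10, b_exponent=0.7, n_boot=50, seed=0):
    """Bag of Little Bootstraps estimate of coefficient standard errors
    for logistic regression (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    b = int(n ** b_exponent)                  # size of each little subsample
    per_subsample_se = []
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=b, replace=False)   # draw a little subsample
        Xs, ys = X[idx], y[idx]
        coefs = []
        for _ in range(n_boot):
            # resample the subsample back up to full size n via multinomial counts
            w = rng.multinomial(n, np.full(b, 1.0 / b))
            model = LogisticRegression(max_iter=1000)
            model.fit(Xs, ys, sample_weight=w)
            coefs.append(model.coef_.ravel())
        per_subsample_se.append(np.std(coefs, axis=0, ddof=1))
    # average the per-subsample bootstrap standard errors
    return np.mean(per_subsample_se, axis=0)

# Illustrative usage on synthetic data with a rare positive class
rng = np.random.default_rng(1)
X = rng.normal(size=(20_000, 3))
p = 1.0 / (1.0 + np.exp(-(-4.0 + 1.2 * X[:, 0])))
y = rng.binomial(1, p)
print(blb_logistic_se(X, y))
```

With a heavily unbalanced response, a small subsample can easily end up with very few (or no) positive cases, which is essentially the failure mode the thesis investigates.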
72

Essays on Multivariate and Simultaneous Equations Spatial Autoregressive Models

Yang, Kai 28 September 2016 (has links)
No description available.
73

Logaritmicko-konkávní rozděleni pravděpodobnosti a jejich aplikace / Logarithmic-concave probability distributions and their applications

Zavadilová, Barbora January 2014 (has links)
No description available.
74

Modeling based on a reparameterized Birnbaum-Saunders distribution for analysis of survival data / Modelagem baseada na distribuição Birnbaum-Saunders reparametrizada para análise de dados sobrevivência

Leão, Jeremias da Silva 09 January 2017 (has links)
In this thesis we propose models based on a reparameterized Birnbaum-Saunders (BS) distribution, introduced by Santos-Neto et al. (2012) and Santos-Neto et al. (2014), to analyze survival data. Initially we introduce the Birnbaum-Saunders frailty model, where we analyze the cases (i) with and (ii) without covariates. Survival models with frailty are used when further information is not available to explain the occurrence time of a medical event. The random effect is the frailty, which is introduced into the baseline hazard rate to control the unobservable heterogeneity of the patients. We use the maximum likelihood method to estimate the model parameters. We evaluate the performance of the estimators under different percentages of censored observations in a Monte Carlo study. Furthermore, we introduce a Birnbaum-Saunders regression frailty model, for which maximum likelihood estimation of the model parameters with censored data, as well as influence diagnostics, are investigated. We then propose a cure rate Birnbaum-Saunders frailty model. An important advantage of this model is the possibility of jointly accounting for the heterogeneity among patients, through their frailties, and for the presence of a cured fraction of them. We consider likelihood-based methods to estimate the model parameters and to derive influence diagnostics for the model. In addition, we introduce a bivariate Birnbaum-Saunders distribution based on a parameterization of the Birnbaum-Saunders which has the mean as one of its parameters. We discuss the maximum likelihood estimation of the model parameters and show that these estimators can be obtained by solving non-linear equations. We then derive a regression model based on the proposed bivariate Birnbaum-Saunders distribution, which permits us to model data in their original scale. A simulation study is carried out to evaluate the performance of the maximum likelihood estimators. Finally, examples with real data illustrate all the models proposed here. / Nesta tese propomos modelos baseados na distribuição Birnbaum-Saunders reparametrizada introduzida por Santos-Neto et al. (2012) e Santos-Neto et al. (2014), para análise de dados de sobrevivência. Inicialmente propomos o modelo de fragilidade Birnbaum-Saunders sem e com covariáveis observáveis. O modelo de fragilidade é caracterizado pela utilização de um efeito aleatório, ou seja, de uma variável aleatória não observável, que representa as informações que não podem ou não foram observadas, tais como fatores ambientais ou genéticos, como também informações que, por algum motivo, não foram consideradas no planejamento do estudo. O efeito aleatório (a fragilidade) é introduzido na função de risco de base para controlar a heterogeneidade não observável. Usamos o método de máxima verossimilhança para estimar os parâmetros do modelo. Avaliamos o desempenho dos estimadores sob diferentes percentuais de censura via estudo de simulações de Monte Carlo. Considerando variáveis regressoras, derivamos medidas de diagnóstico de influência. Os métodos de diagnóstico têm sido ferramentas importantes na análise de regressão para detectar anomalias, tais como quebra das pressuposições nos erros, presença de outliers e observações influentes. Em seguida propomos o modelo de fração de cura com fragilidade Birnbaum-Saunders.
Os modelos para dados de sobrevivência com proporção de curados (também conhecidos como modelos de taxa de cura ou modelos de sobrevivência com longa duração) têm sido amplamente estudados. Uma vantagem importante do modelo proposto é a possibilidade de considerar conjuntamente a heterogeneidade entre os pacientes por suas fragilidades e a presença de uma fração curada. As estimativas dos parâmetros do modelo foram obtidas via máxima verossimilhança, medidas de influência e diagnóstico foram desenvolvidas para o modelo proposto. Por fim, avaliamos a distribuição bivariada Birnbaum-Saunders baseada na média, como também introduzimos um modelo de regressão para o modelo proposto. Utilizamos os métodos de máxima verossimilhança e método dos momentos modificados, para estimar os parâmetros do modelo. Avaliamos o desempenho dos estimadores via estudo de simulações de Monte Carlo. Aplicações a conjuntos de dados reais ilustram as potencialidades dos modelos abordados.
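For reference, the sketch below samples from the classical Birnbaum-Saunders distribution through its normal representation, T = beta*(alpha*Z/2 + sqrt((alpha*Z/2)^2 + 1))^2 with Z ~ N(0,1), whose mean is beta*(1 + alpha^2/2); the reparameterization used in the thesis takes that mean as one of its parameters. The code is an illustrative aid rather than any of the thesis's models, and the parameter values are arbitrary.

```python
import numpy as np

def rbirnbaum_saunders(n, alpha, beta, seed=0):
    """Draw n samples from the classical Birnbaum-Saunders(alpha, beta)
    distribution via its normal representation (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    w = alpha * z / 2.0
    return beta * (w + np.sqrt(w ** 2 + 1.0)) ** 2

# Quick check: the mean of BS(alpha, beta) is beta * (1 + alpha**2 / 2);
# the reparameterized version takes this mean as a parameter directly.
samples = rbirnbaum_saunders(100_000, alpha=0.5, beta=2.0)
print(samples.mean(), 2.0 * (1 + 0.5 ** 2 / 2))
```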
75

Odhad momentů při intervalovém cenzorování typu I / Moments Estimation under Type I Interval Censoring

Ďurčík, Matej January 2012 (has links)
Title: Moments Estimation under Type I Interval Censoring Author: Matej Ďurčík Department: Department of Probability and Mathematical Statistics Supervisor: RNDr. Arnošt Komárek, Ph.D. Abstract: In this thesis we apply the uniform deconvolution model to the interval censoring problem. We restrict ourselves to interval censoring case 1. We show how to apply the uniform deconvolution model to estimate characteristics of the probability distribution under interval censoring case 1. Moreover, we derive the limit distributions of the estimators of the mean and variance. We then compare these estimators, through simulation studies under several distributions of the underlying random variables, to the asymptotically efficient estimators based on nonparametric maximum likelihood estimation.
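As background for the comparison mentioned above, the following sketch computes the nonparametric maximum likelihood estimator (NPMLE) for case 1 interval-censored (current status) data, which reduces to an isotonic regression of the censoring indicators on the observation times. The simulated exponential event times, the uniform observation times, and the use of scikit-learn's IsotonicRegression are illustrative assumptions; the uniform deconvolution estimators studied in the thesis are not reproduced here.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(1)
n = 2000
t = rng.exponential(scale=1.0, size=n)        # unobserved event times
c = rng.uniform(0.0, 3.0, size=n)             # observation (censoring) times
delta = (t <= c).astype(float)                # current status indicators 1{T <= C}

# NPMLE of F at the observation times: isotonic regression of delta on c
order = np.argsort(c)
iso = IsotonicRegression(y_min=0.0, y_max=1.0, increasing=True)
F_hat = iso.fit_transform(c[order], delta[order])

# Compare with the true distribution function at a few points
for q in (0.5, 1.0, 2.0):
    i = min(np.searchsorted(c[order], q), n - 1)
    print(q, F_hat[i], 1 - np.exp(-q))
```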
76

Regression models with an interval-censored covariate

Langohr, Klaus 16 June 2004 (has links)
El análisis de supervivencia trata de la evaluación estadística de variables que miden el tiempo transcurrido hasta un evento de interés. Una particularidad que ha de considerar el análisis de supervivencia son datos censurados. Éstos aparecen cuando el tiempo de interés no puede ser observado exactamente y la información al respecto es parcial. Se distinguen diferentes tipos de censura: un tiempo censurado por la derecha está presente si el tiempo de supervivencia es sabido mayor a un tiempo observado; censura por izquierda está dada si la supervivencia es menor que un tiempo observado. En el caso de censura en un intervalo, el tiempo está en un intervalo de tiempo observado, y el caso de doble censura aparece cuando, también, el origen del tiempo de supervivencia está censurado. La primera parte del Capítulo 1 contiene un resumen de la metodología estadística para datos censurados en un intervalo, incluyendo tanto métodos paramétricos como no-paramétricos. En la Sección 1.2 abordamos el tema de censura no informativa que se supone cumplida para todos los métodos presentados. Dada la importancia de métodos de optimización en los demás capítulos, la Sección 1.3 trata de la teoría de optimización. Esto incluye varios algoritmos de optimización y la presentación de herramientas de optimización. Se ha utilizado el lenguaje de programación matemática AMPL para resolver los problemas de maximización que han surgido. Una de las características más importantes de AMPL es la posibilidad de enviar problemas de optimización al servidor 'NEOS: Server for Optimization' en Internet para que sean solucionados por ese servidor. En el Capítulo 2, se presentan los conjuntos de datos que han sido analizados. El primer estudio es sobre la supervivencia de pacientes de tuberculosis co-infectados por el VIH en Barcelona, mientras el siguiente, también del área de VIH/SIDA, trata de usuarios de drogas intra-venosas de Badalona y alrededores que fueron admitidos a la unidad de desintoxicación del Hospital Trias i Pujol. Un área completamente diferente son los estudios sobre la vida útil de alimentos. Se presenta la aplicación de la metodología para datos censurados en un intervalo en esta área. El Capítulo 3 trata del marco teórico de un modelo de vida acelerada con una covariante censurada en un intervalo. Puntos importantes a tratar son el desarrollo de la función de verosimilitud y el procedimiento de estimación de parámetros con métodos del área de optimización. Su uso puede ser una herramienta importante en la estadística. Estos métodos se aplican también a otros modelos con una covariante censurada en un intervalo como se demuestra en el Capítulo 4. Otros métodos que se podrían aplicar son descritos en el Capítulo 5. Se trata sobre todo de métodos basados en técnicas de imputación para datos censurados en un intervalo. Consisten en dos pasos: primero, se imputa el valor desconocido de la covariante, después, se pueden estimar los parámetros con procedimientos estadísticos estándares disponibles en cualquier paquete de software estadístico. El método de maximización simultánea ha sido implementado por el autor con el código de AMPL y ha sido aplicado al conjunto de datos de Badalona. Presentamos los resultados de diferentes modelos y sus respectivas interpretaciones en el Capítulo 6. Se ha llevado a cabo un estudio de simulación cuyos resultados se dan en el Capítulo 7. Ha sido el objetivo comparar la maximización simultánea con dos procedimientos basados en la imputación para el modelo de vida acelerada.
Finalmente, en el último capítulo se resumen los resultados y se abordan diferentes aspectos que aún permanecen sin ser resueltos o podrían ser aproximados de manera diferente. / Survival analysis deals with the evaluation of variables which measure the elapsed time until an event of interest. One particularity survival analysis has to account for is censored data, which arise whenever the time of interest cannot be measured exactly but partial information is available. Four types of censoring are distinguished: right-censoring occurs when the unobserved survival time is greater than an observed time, left-censoring when it is less than an observed time, and in the case of interval-censoring, the survival time is only known to lie within an observed time interval. We speak of doubly-censored data if the time origin is also censored. In Chapter 1 of the thesis, we first give a survey of statistical methods for interval-censored data, including both parametric and nonparametric approaches. In the second part of Chapter 1, we address the important issue of noninformative censoring, which is assumed in all the methods presented. Given the importance of optimization procedures in the further chapters of the thesis, the final section of Chapter 1 is about optimization theory. This includes some optimization algorithms, as well as the presentation of optimization tools, which have played an important role in the elaboration of this work. We have used the mathematical programming language AMPL to solve the maximization problems that arose. One of its main features is that optimization problems written in AMPL code can be sent to the internet facility 'NEOS: Server for Optimization' and be solved by its available solvers. In Chapter 2, we present the three data sets analyzed for the elaboration of this dissertation. Two correspond to studies on HIV/AIDS: one is on the survival of tuberculosis patients co-infected with HIV in Barcelona, the other on injecting drug users from Badalona and surroundings, most of whom became infected with HIV as a result of their drug addiction. The complex censoring patterns in the variables of interest of the latter study have motivated the development of estimation procedures for regression models with interval-censored covariates. The third data set comes from a study on the shelf life of yogurt. We present a new approach to estimate the shelf lives of food products, taking advantage of the existing methodology for interval-censored data. Chapter 3 deals with the theoretical background of an accelerated failure time model with an interval-censored covariate, putting emphasis on the development of the likelihood functions and the estimation procedure by means of optimization techniques and tools. Their use in statistics can be an attractive alternative to established methods such as the EM algorithm. In Chapter 4 we present further regression models, such as linear and logistic regression with the same type of covariate, whose parameters are estimated with the same techniques as in Chapter 3. Other possible estimation procedures are described in Chapter 5. These comprise mainly imputation methods, which consist of two steps: first, the observed intervals of the covariate are replaced by an imputed value, for example the interval midpoint; then, standard procedures are applied to estimate the parameters. The application of the proposed estimation procedure for the accelerated failure time model with an interval-censored covariate to the data set on injecting drug users is addressed in Chapter 6.
Different distributions and covariates are considered and the corresponding results are presented and discussed. To compare the estimation procedure with the imputation-based methods of Chapter 5, a simulation study is carried out, whose design and results are the contents of Chapter 7. Finally, in the closing Chapter 8, the main results are summarized and several aspects which remain unsolved or might be approached in another way are addressed.
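As a simple illustration of the two-step imputation approach described for Chapter 5, the sketch below replaces an interval-censored covariate by its interval midpoint and then fits a standard log-linear (AFT-type) regression. The simulated data, the absence of censoring in the outcome, and the plain least-squares fit are illustrative assumptions, not the thesis's AMPL-based simultaneous maximization.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500

# True (unobserved) covariate and a lognormal AFT-type outcome without outcome censoring
x_true = rng.uniform(0.0, 10.0, size=n)
log_t = 1.0 + 0.3 * x_true + rng.normal(scale=0.5, size=n)

# The covariate is only observed as an interval [left, right), e.g. grouped into bins
width = rng.uniform(1.0, 3.0, size=n)
left = np.floor(x_true / width) * width
right = left + width

# Step 1: impute the unknown covariate value by the interval midpoint
x_imputed = (left + right) / 2.0

# Step 2: fit the standard log-linear regression with the imputed covariate
X = np.column_stack([np.ones(n), x_imputed])
beta_hat, *_ = np.linalg.lstsq(X, log_t, rcond=None)
print("estimated intercept and slope:", beta_hat)   # roughly (1.0, 0.3)
```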
77

Stochastic Volatility Models and Simulated Maximum Likelihood Estimation

Choi, Ji Eun 08 July 2011 (has links)
Financial time series studies indicate that the lognormal assumption for the return of an underlying security is often violated in practice. This is due to the presence of time-varying volatility in the return series. The most common departures are due to a fat left-tail of the return distribution, volatility clustering or persistence, and asymmetry of the volatility. To account for these characteristics of time-varying volatility, many volatility models have been proposed and studied in the financial time series literature. Two main conditional-variance model specifications are the autoregressive conditional heteroscedasticity (ARCH) and the stochastic volatility (SV) models. The SV model, proposed by Taylor (1986), is a useful alternative to the ARCH family (Engle (1982)). It incorporates the time dependency of the volatility through a latent process, which is an autoregressive model of order 1 (AR(1)), and successfully accounts for the stylized facts of the return series implied by the characteristics of time-varying volatility. In this thesis, we review both ARCH and SV models but focus on the SV model and its variations. We consider two modified SV models. One is an autoregressive process with stochastic volatility errors (AR--SV) and the other is the Markov regime switching stochastic volatility (MSSV) model. The AR--SV model consists of two AR processes. The conditional mean process is an AR(p) model, and the conditional variance process is an AR(1) model. One notable advantage of the AR--SV model is that it better captures volatility persistence by considering the AR structure in the conditional mean process. The MSSV model consists of the SV model and a discrete Markov process. In this model, the volatility can switch from a low level to a high level at random points in time, and this feature better captures the volatility movement. We study the moment properties and the likelihood functions associated with these models. In spite of the simple structure of the SV models, it is not easy to estimate the parameters by conventional estimation methods such as maximum likelihood estimation (MLE) or the Bayesian method because of the presence of the latent log-variance process. Of the various estimation methods proposed in the SV model literature, we consider the simulated maximum likelihood (SML) method with the efficient importance sampling (EIS) technique, one of the most efficient estimation methods for SV models. In particular, the EIS technique is applied in the SML to reduce the Monte Carlo (MC) sampling error. It increases the accuracy of the estimates by determining an importance function with a conditional density function of the latent log variance at time t given the latent log variance and the return at time t-1. Initially, we perform an empirical study to compare the estimation of the SV model using the SML method with EIS and the Markov chain Monte Carlo (MCMC) method with Gibbs sampling. We conclude that SML has a slight edge over MCMC. We then introduce the SML approach in the AR--SV models and study the performance of the estimation method through simulation studies and real-data analysis. In the analysis, we use the AIC and BIC criteria to determine the order of the AR process and perform model diagnostics for the goodness of fit. In addition, we introduce the MSSV models and extend the SML approach with EIS to estimate this new model.
Simulation studies and empirical studies with several return series indicate that this model is reasonable when there is a possibility of volatility switching at random time points. Based on our analysis, the modified SV, AR--SV, and MSSV models capture the stylized facts of financial return series reasonably well, and the SML estimation method with the EIS technique works very well in the models and the cases considered.
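To make the basic SV specification concrete, the following sketch simulates a return series from the standard SV model, y_t = exp(h_t/2)*eps_t with latent AR(1) log-variance h_t = mu + phi*(h_{t-1} - mu) + sigma_eta*eta_t. The parameter values are illustrative assumptions, and none of the SML/EIS estimation machinery of the thesis is reproduced here.

```python
import numpy as np

def simulate_sv(T=1000, mu=-1.0, phi=0.97, sigma_eta=0.2, seed=3):
    """Simulate returns from a basic stochastic volatility model:
    h_t = mu + phi*(h_{t-1} - mu) + sigma_eta*eta_t,  y_t = exp(h_t/2)*eps_t."""
    rng = np.random.default_rng(seed)
    h = np.empty(T)
    # start the latent log-variance from its stationary distribution
    h[0] = mu + rng.normal(scale=sigma_eta / np.sqrt(1.0 - phi ** 2))
    for t in range(1, T):
        h[t] = mu + phi * (h[t - 1] - mu) + sigma_eta * rng.normal()
    y = np.exp(h / 2.0) * rng.standard_normal(T)
    return y, h

y, h = simulate_sv()
# persistent clustering of |y_t| reflects the slowly moving latent volatility
print(np.corrcoef(np.abs(y[:-1]), np.abs(y[1:]))[0, 1])
```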
78

Methods for Blind Separation of Co-Channel BPSK Signals Arriving at an Antenna Array and Their Performance Analysis

Anand, K 07 1900 (has links)
Capacity improvement of wireless communication systems is a very important area of current research. The goal is to increase the number of users supported by the system per unit bandwidth allotted. One important way of achieving this improvement is to use multiple antennas backed by intelligent signal processing. In this thesis, we present methods for blind separation of co-channel BPSK signals arriving at an antenna array. These methods consist of two parts, Constellation Estimation and Assignment. We give two methods for constellation estimation, Smallest Distance Clustering and Maximum Likelihood Estimation. While the latter is theoretically sound, the former is computationally simple and intuitively appealing. We show that Maximum Likelihood Constellation Estimation is well approximated by the Smallest Distance Clustering algorithm at high SNR. The Assignment Algorithm exploits the structure of the BPSK signals. We observe that both methods for estimating the constellation vectors perform very well at high SNR and nearly attain the Cramér-Rao bounds. Using this fact, and noting that the Assignment Algorithm causes negligible error at high SNR, we derive an upper bound on the probability of bit error for the above methods at high SNR. This upper bound falls very rapidly with increasing SNR, showing that our constellation estimation-assignment approach is very efficient. Simulation results are given to demonstrate the usefulness of the bounds.
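As a rough illustration of the signal model behind these methods (K co-channel BPSK sources received at an M-element array, so that noise-free snapshots lie on 2^K constellation points), the sketch below clusters simulated snapshots with k-means as a simple stand-in for the thesis's Smallest Distance Clustering. The mixing matrix, noise level, dimensions, and the use of scikit-learn's KMeans are illustrative assumptions, and the Assignment Algorithm is not reproduced.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)
K, M, N = 2, 4, 2000                 # BPSK sources, antenna elements, snapshots
A = rng.normal(size=(M, K)) + 1j * rng.normal(size=(M, K))   # unknown mixing (array response)
s = rng.choice([-1.0, 1.0], size=(K, N))                      # co-channel BPSK symbols
noise = 0.05 * (rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N)))  # high-SNR regime
x = A @ s + noise

# Noise-free snapshots lie on the 2**K constellation points {A @ s : s in {+-1}^K}.
# As a simple stand-in for smallest-distance clustering, cluster the snapshots
# (stacked real/imaginary parts) and read off the centroids as constellation estimates.
feat = np.vstack([x.real, x.imag]).T
km = KMeans(n_clusters=2 ** K, n_init=10, random_state=0).fit(feat)
centers = km.cluster_centers_[:, :M] + 1j * km.cluster_centers_[:, M:]
print(np.round(centers, 2))          # estimates of A @ s, up to ordering and sign
```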
79

Application Of Controlled Random Search Optimization Technique In MMLE With Process Noise

Anilkumar, A K 08 1900 (has links)
Generally, in most applications of estimation theory using the Method of Maximum Likelihood Estimation (MMLE) to dynamical systems, one deals with a situation where only measurement noise is present. However, in many present-day applications, where modeling errors and random state noise input conditions occur, it has become necessary for MMLE to handle process noise as well as measurement noise. The numerical algorithms accounting for both measurement and process noise require roughly an order of magnitude more computer time and memory. Furthermore, implementation difficulties and convergence problems are often encountered. Here one has to estimate the initial state error covariance matrix Po, the measurement noise covariance matrix R, the process noise covariance matrix Q, and the system parameters θ, and the present work deals with the above. Since the above problem is fairly involved, we need a good reference solution. For this purpose we utilize the approach and results of Gemson, who considered the above problem via the extended Kalman filter (EKF) route, to compare with the present results from the MMLE route. The EKF uses the unknown parameters as additional states, unlike MMLE, which uses only the system states. Chapter 1 provides a brief historical perspective, followed by parameter identification in the presence of process and measurement noise. The earlier formulations, such as the natural, innovation, combined, and adaptive approaches, are discussed. Chapter 2 deals with the heuristic adaptive tuning of the Kalman filter matrices Q and R by Myers and Tapley, originally developed for state estimation problems involving satellite orbit estimation. It turns out that for parameter estimation problems, apart from the above matrices, even the choice of the initial covariance matrix Po is crucial for obtaining proper parameter estimates with a finite amount of data, and for this purpose the inverse of the information matrix is used for Po. This is followed by a description of the original Controlled Random Search (CRS) of Price and its variant as implemented and used in the present work to estimate or tune Q, R, and θ, which is the aim of the present work. The above helps the reader to appreciate the setting under which the present study has been carried out. Chapter 3 presents the results and the analysis of the estimation procedure adopted with respect to a specific case study of the lateral dynamics of an aircraft involving 15 unknown parameters. The reference results for the present work are the ones based on the approach of Gemson and Ananthasayanam (1998). The present work proceeds in two phases. In the first case (i), the EKF estimates of Po, Q, and R are used to obtain θ; in the second case (ii), the estimates of Po and Q, together with a reasonable choice of R, are utilized to obtain θ from the CRS algorithm. Thus one is able to assess the capability of the CRS to estimate only the unknown parameters. Chapter 4 presents the results of utilizing the CRS algorithm, with R based on a reasonable choice and Po from the inverse of the information matrix, to estimate both Q and θ. This brings out the efficiency of MMLE with the CRS algorithm in the estimation of unknown process noise characteristics and unknown parameters. Thus it demonstrates that the difficult Q can be estimated using the CRS technique without the attendant difficulties of the earlier MMLE formulations in dealing with process noise.
Chapter 5 discusses the implementation of CRS to estimate the unknown measurement noise covariance matrix R together with the unknown θ, by utilizing the values of Po and Q obtained through the EKF route. The effect of varying R in the parameter estimation procedure is also highlighted in this chapter. This chapter also explores the importance of Po in the estimation procedure. It establishes the importance of Po, though most of the earlier works do not appear to have recognized such a feature. It turned out that the CRS algorithm does not converge when some arbitrary value of Po is chosen; it has to be obtained from a scouting pass of the EKF. Sensitivity studies based on variations of Po show its importance. Further studies show that the sequence of updates, the random nature of the process and measurement noise effects, and the deterministic nature of the parameters play a critical role in the convergence of the algorithm. The last chapter, Chapter 6, presents the conclusions from the present work and suggestions for further work.
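For readers unfamiliar with the optimizer, below is a minimal sketch of Price's basic controlled random search on a generic cost function; in the present setting the cost would be the negative log-likelihood over the unknown covariances and parameters, which is not reproduced here. The population size, iteration count, and quadratic test function are illustrative assumptions.

```python
import numpy as np

def crs_minimize(f, bounds, n_pop=50, n_iter=5000, seed=5):
    """Price's basic controlled random search (illustrative sketch).
    bounds: sequence of (lower, upper) limits, one pair per dimension."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    dim = bounds.shape[0]
    # initial population of random points in the search domain
    pop = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_pop, dim))
    vals = np.array([f(p) for p in pop])
    for _ in range(n_iter):
        idx = rng.choice(n_pop, size=dim + 1, replace=False)
        centroid = pop[idx[:-1]].mean(axis=0)
        trial = 2.0 * centroid - pop[idx[-1]]       # reflect the (dim+1)-th point
        if np.any(trial < bounds[:, 0]) or np.any(trial > bounds[:, 1]):
            continue
        f_trial = f(trial)
        worst = np.argmax(vals)
        if f_trial < vals[worst]:                   # replace the current worst point
            pop[worst], vals[worst] = trial, f_trial
    best = np.argmin(vals)
    return pop[best], vals[best]

# Example: minimize a simple quadratic stand-in for a negative log-likelihood
x_best, f_best = crs_minimize(lambda p: np.sum((p - 1.5) ** 2), [(-5, 5)] * 3)
print(x_best, f_best)
```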
80

Dimension reduction of streaming data via random projections

Cosma, Ioana Ada January 2009 (has links)
A data stream is a transiently observed sequence of data elements that arrive unordered, with repetitions, and at a very high rate of transmission. Examples include Internet traffic data, networks of banking and credit transactions, and radar-derived meteorological data. Computer science and engineering communities have developed randomised, probabilistic algorithms to estimate statistics of interest over streaming data on the fly, with small computational complexity and storage requirements, by constructing low dimensional representations of the stream known as data sketches. This thesis combines techniques of statistical inference with algorithmic approaches, such as hashing and random projections, to derive efficient estimators for cardinality, l_{alpha} distance and quasi-distance, and entropy over streaming data. I demonstrate an unexpected connection between two approaches to cardinality estimation that involve indirect record keeping: the first using pseudo-random variates and storing selected order statistics, and the second using random projections. I show that l_{alpha} distances and quasi-distances between data streams, and entropy, can be recovered from random projections that exploit properties of alpha-stable distributions with full statistical efficiency. This is achieved by the method of L-estimation in a single-pass algorithm with modest computational requirements. The proposed estimators have good small sample performance, improved by the methods of trimming and winsorising; in other words, the value of these summary statistics can be approximated with high accuracy from data sketches of low dimension. Finally, I consider the problem of convergence assessment of Markov chain Monte Carlo methods for simulating from complex, high dimensional, discrete distributions. I argue that online, fast, and efficient computation of summary statistics such as cardinality, entropy, and l_{alpha} distances may be a useful qualitative tool for detecting lack of convergence, and illustrate this with simulations of the posterior distribution of a decomposable Gaussian graphical model via the Metropolis-Hastings algorithm.
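To illustrate the alpha-stable random projection idea for alpha = 1, the sketch below builds a Cauchy sketch of two frequency vectors and recovers their l_1 distance from the median of the absolute sketch differences. The thesis's more efficient L-estimators with trimming and winsorising are not reproduced, and the dimensions, sketch size, and simulated data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
d, k = 10_000, 400                    # ambient dimension, sketch size

# Two "streams" summarized as frequency vectors (illustrative stand-in)
x = rng.poisson(2.0, size=d).astype(float)
y = x.copy()
y[:500] += rng.poisson(5.0, size=500) # perturb part of the stream

# alpha = 1 stable (Cauchy) random projection: the sketch R @ v can be maintained
# incrementally, since each stream update just adds a scaled column of R
R = rng.standard_cauchy(size=(k, d))
sketch_x, sketch_y = R @ x, R @ y

# l_1 distance recovered from the sketch: the median of |Cauchy(0, s)| equals s
est = np.median(np.abs(sketch_x - sketch_y))
print(est, np.sum(np.abs(x - y)))     # estimate vs. true l_1 distance
```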
