151

ESTIMATION OF REGRESSION COEFFICIENTS IN THE COMPETING RISKS MODEL WITH MISSING CAUSE OF FAILURE

Lu, Kaifeng 13 March 2002 (has links)
In many clinical studies, researchers are interested in the effects of a set of prognostic factors on the hazard of death from a specific disease even though patients may die from other competing causes. Often the time to relapse is right-censored for some individuals due to incomplete follow-up. In some circumstances, it may also be the case that patients are known to die but the cause of death is unavailable. When cause of failure is missing, excluding the missing observations from the analysis or treating them as censored may yield biased estimates and erroneous inferences. Under the assumption that cause of failure is missing at random, we propose three approaches to estimate the regression coefficients. The imputation approach is straightforward to implement and allows for the inclusion of auxiliary covariates, which are not of inherent interest for modeling the cause-specific hazard of interest but may be related to the missing data mechanism. The partial likelihood approach we propose is semiparametric efficient and allows for more general relationships between the two cause-specific hazards and more general missingness mechanisms than the partial likelihood approach used by others. The inverse probability weighting approach is doubly robust and highly efficient and also allows for the incorporation of auxiliary covariates. Using martingale theory and semiparametric theory for missing data problems, the asymptotic properties of these estimators are developed and the semiparametric efficiency of relevant estimators is proved. Simulation studies are carried out to assess the performance of these estimators in finite samples. The approaches are also illustrated using the data from a clinical trial in elderly women with stage II breast cancer. The inverse probability weighted doubly robust semiparametric estimator is recommended for its simplicity, flexibility, robustness and high efficiency.
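As a hedged illustration of the inverse probability weighting idea described above (not the dissertation's actual estimator), the sketch below reweights deaths with an observed cause by the inverse of an estimated probability that the cause is observed, then fits a weighted cause-specific Cox model. The column names and the logistic missingness model are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from lifelines import CoxPHFitter

def ipw_cause_specific_cox(df):
    """df columns (assumed): time, died (0/1), cause (1, 2, or NaN when missing), x1, x2."""
    deaths = df[df["died"] == 1].copy()
    deaths["observed"] = deaths["cause"].notna().astype(int)
    # Estimate P(cause observed | death, covariates): the missingness mechanism.
    mm = LogisticRegression().fit(deaths[["x1", "x2"]], deaths["observed"])
    deaths["pi"] = mm.predict_proba(deaths[["x1", "x2"]])[:, 1]
    df = df.merge(deaths[["pi"]], left_index=True, right_index=True, how="left")
    # Deaths with a known cause get weight 1/pi; everyone else keeps weight 1.
    df["weight"] = np.where(df["died"].eq(1) & df["cause"].notna(), 1.0 / df["pi"], 1.0)
    df["event1"] = (df["cause"] == 1).astype(int)  # failure from the cause of interest
    analysis = df[(df["died"] == 0) | df["cause"].notna()]
    cph = CoxPHFitter()
    cph.fit(analysis[["time", "event1", "x1", "x2", "weight"]],
            duration_col="time", event_col="event1", weights_col="weight")
    return cph
```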
152

Repeated Measures Mixture Modeling with Applications to Neuroscience

Sun, Zhuoxin 06 June 2005 (has links)
In some neurological postmortem brain tissue studies, repeated measures are observed. These observations are taken on the same experimental subject and are therefore correlated within the subject. Furthermore, each observation can be viewed as coming from one of a pre-specified number of populations where each population corresponds to a possible type of neurons. In this dissertation, we propose several mixture models with two components to model such repeated data. In the first model, we include subject-specific random effects in the component distributions to account for the within-subject correlation present in the data. The mixture components are generalized linear models with random effects, while the mixing proportions are governed by a logistic regression. In the second proposed model, the mixture components are generalized linear models, while the component-indicator variables are modeled by a multivariate Bernoulli distribution that depends on covariates. The within-subject observations are taken to be correlated through the latent component indicator random variables. As a special case of the second model, we focus on multivariate Bernoulli mixtures of normals, where the component-indicator variables are modeled by logistic regressions with random effects, and the mixture components are linear regressions. The third proposed model combines the first and second models, so that the within-subject correlation is built into the model not only through the component distributions, but also through the latent component indicator variables. The focus again is on a special case of the third model, where the mixture components are linear regressions with random effects while the mixing proportions are logistic regressions with another group of random effects. For each model, model fitting procedures, based on MCMC methods for sampling from the posterior distribution of the parameters, are developed. The second and third models are used to compare schizophrenic and control subjects with regard to the somal volumes of deep layer 3 pyramidal cells in the auditory association cortex. As a preliminary analysis, we start by employing classic mixture models and mixtures-of-experts to analyze such data neglecting the within-subject correlation. We also provide a discussion of the statistical and computational issues concerning estimation of classic Poisson mixtures.
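A minimal simulation sketch of the first model's data-generating structure follows (subject-specific random effects in the component regressions, logistic mixing proportions). All parameter values are invented for illustration; the dissertation fits such models by MCMC rather than the forward simulation shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_reps = 20, 30
beta = np.array([[1.0, 0.5],     # component 1: intercept, slope (assumed)
                 [4.0, -0.3]])   # component 2
gamma = np.array([-0.5, 1.0])    # logistic coefficients for the mixing proportion
sigma_b, sigma_e = 0.8, 0.5      # random-effect and residual SDs

data = []
for i in range(n_subjects):
    b_i = rng.normal(0.0, sigma_b)                      # subject-level random effect
    x = rng.uniform(-1, 1, n_reps)                      # within-subject covariate
    p = 1 / (1 + np.exp(-(gamma[0] + gamma[1] * x)))    # P(component 2 | x)
    z = rng.binomial(1, p)                              # latent component indicator
    y = rng.normal(beta[z, 0] + beta[z, 1] * x + b_i, sigma_e)
    data.append((i, x, z, y))
```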
153

PARAMETER ESTIMATION IN STOCHASTIC VOLATILITY MODELS WITH MISSING DATA USING PARTICLE METHODS AND THE EM ALGORITHM

Kim, Jeongeun 05 October 2005 (has links)
The main concern of financial time series analysis is how to forecast future values of financial variables, based on all available information. One of the special features of financial variables, such as stock prices and exchange rates, is that they show changes in volatility, or variance, over time. Several statistical models have been suggested to explain volatility in data, and among them Stochastic Volatility (SV) models have been commonly and successfully used. Another feature of financial variables I want to consider is the presence of missing data. For example, there is no stock price data available for regular holidays, such as Christmas, Thanksgiving, and so on. Furthermore, even though the chance is small, stretches of data may not be available for many reasons. I believe that if this feature is brought into the model, it will produce more precise results. The goal of my research is to develop a new technique for estimating parameters of SV models when some parts of the data are missing. By estimating parameters, the dynamics of the process can be fully specified, and future values can be estimated from them. SV models have become increasingly popular in recent years, and their popularity has resulted in several different approaches being proposed for the problem of estimating the parameters of SV models. However, as yet there is no consensus on this problem. In addition, there has been no serious consideration of the missing data problem. A new statistical approach based on the EM algorithm and particle filters is presented. Moreover, I expand the scope of application of SV models by introducing a slight modification of the models.
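For a concrete, hedged picture of the building blocks, the sketch below simulates a standard stochastic volatility model and evaluates its likelihood with a bootstrap particle filter, simply skipping the reweighting step at missing observations. The EM machinery of the dissertation is not reproduced, and all parameter names and values are illustrative.

```python
import numpy as np
from scipy import stats

def simulate_sv(T, mu=-1.0, phi=0.95, sigma=0.3, rng=None):
    """y_t = exp(h_t / 2) * eps_t, with h_t an AR(1) log-volatility process."""
    rng = rng or np.random.default_rng(0)
    h = np.empty(T)
    h[0] = rng.normal(mu, sigma / np.sqrt(1 - phi**2))
    for t in range(1, T):
        h[t] = mu + phi * (h[t - 1] - mu) + sigma * rng.normal()
    return np.exp(h / 2) * rng.normal(size=T), h

def bootstrap_filter_loglik(y, mu, phi, sigma, n_particles=1000, rng=None):
    rng = rng or np.random.default_rng(1)
    h = rng.normal(mu, sigma / np.sqrt(1 - phi**2), n_particles)
    loglik = 0.0
    for t, y_t in enumerate(y):
        if t > 0:  # propagate particles through the latent AR(1)
            h = mu + phi * (h - mu) + sigma * rng.normal(size=n_particles)
        if np.isnan(y_t):  # missing observation: propagate only, no reweighting
            continue
        w = stats.norm.pdf(y_t, scale=np.exp(h / 2))
        loglik += np.log(w.mean())
        h = rng.choice(h, size=n_particles, p=w / w.sum())  # multinomial resampling
    return loglik
```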
154

Statistics in Ella Mathematics

Teng, Yunlong, Zhao, Yingrui January 2012 (has links)
"Ella Mathematics" is a web-based e-learning system which aims to improve elementary school students’ mathematics learning in Sweden. Such an e-learning tool has been partially completed in May 2012, except descriptive statistics module summarizing students’ performance in the learning process. This project report presents and describes the design and implementation of such descriptive statistics module, which intends to allow students to check their own grades and learning progress; teachers to check and compare students’ grades and progress, as well as parents to compare their children’s grades and learning progress with the average grade and progress of other students. To better understand and design such functionalities, different mathematical e-learning systems were investigated. Another contribution of this project relates to the evaluation and redesign of the existing database model of the “Ella Mathematics” system. The redesign improved performance and reduced data redundancy.
155

Quadratic Hedging with Margin Requirements and Portfolio Constraints

Tazhitdinova, Alisa January 2010 (has links)
We consider a mean-variance portfolio optimization problem, namely, a problem of minimizing the variance of the final wealth that results from trading over a fixed finite horizon in a continuous-time complete market in the presence of convex portfolio constraints, taking into account the cost imposed by margin requirements on trades and subject to the further constraint that the expected final wealth equal a specified target value. Market parameters are chosen to be random processes adapted to the information filtration available to the investor, and asset prices are modeled by Itô processes. To solve this problem we use an approach based on conjugate duality: we start by synthesizing a dual optimization problem and establish a set of optimality relations that describe an optimal solution in terms of solutions of the dual problem, thus giving necessary and sufficient conditions for the given optimization problem and its dual each to have a solution. Finally, we prove existence of a solution of the dual problem and, for a particular class of dual solutions, establish existence of an optimal portfolio and also describe it explicitly. The method elegantly and rather straightforwardly constructs a dual problem and its solution, and provides intuition for the construction of the actual optimal portfolio.
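As a hedged, one-period analogue of this problem (ignoring margin costs, portfolio constraints, and the continuous-time setting entirely), the sketch below minimizes portfolio variance subject to a target expected wealth by solving the KKT system of the Lagrangian; the asset data are invented.

```python
import numpy as np

mu = np.array([0.05, 0.08, 0.12])            # expected returns (assumed)
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])       # return covariance (assumed)
target = 0.09
ones = np.ones(3)

# KKT system for: min w' Sigma w  s.t.  w' mu = target,  w' 1 = 1
K = np.block([[2 * Sigma, mu[:, None], ones[:, None]],
              [mu[None, :], np.zeros((1, 2))],
              [ones[None, :], np.zeros((1, 2))]])
rhs = np.concatenate([np.zeros(3), [target, 1.0]])
w, lam, gam = np.split(np.linalg.solve(K, rhs), [3, 4])
print("optimal weights:", w, "portfolio variance:", w @ Sigma @ w)
```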
156

A comparison of unsupervised learning techniques for detection of medical abuse in automobile claims

Yang, Li 10 January 2013
157

Branching processes with biological applications

January 2010 (has links)
Branching processes play an important role in models of genetics, molecular biology, microbiology, ecology and evolutionary theory. This thesis explores three aspects of branching processes with biological applications. The first part of the thesis focuses on fluctuation analysis, with the main purpose of estimating mutation rates in microbial populations. We propose a novel estimator of mutation rates, and apply it to a number of Luria-Delbruck type fluctuation experiments in Saccharomyces cerevisiae. Second, we study the extinction of Markov branching processes, and derive theorems for the path to extinction in the critical case, as an extension of Jagers' theory. The third part of the thesis introduces infinite-allele Markov branching processes. As an important non-trivial example, the limiting frequency spectrum for the birth-death process has been derived. A potential application to modeling the proliferation and mutation of human Alu sequences is also discussed.
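For illustration only, the sketch below estimates a mutation rate from fluctuation-experiment counts using the classical Luria-Delbruck p0 (null-class) method, not the novel estimator proposed in the thesis; the mutant counts and final population size are invented.

```python
import numpy as np

mutant_counts = np.array([0, 0, 3, 0, 1, 0, 12, 0, 0, 2,
                          0, 5, 0, 0, 1, 0, 0, 27, 0, 0])  # per parallel culture (assumed)
n_final = 2e8                                   # final cells per culture (assumed)

p0 = np.mean(mutant_counts == 0)                # fraction of cultures with no mutants
m_hat = -np.log(p0)                             # expected number of mutations per culture
mu_hat = m_hat / n_final                        # mutation rate per cell per division
print(f"m = {m_hat:.3f}, mutation rate ~ {mu_hat:.2e}")
```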
158

Model-based clustering for multivariate time series of counts

January 2010 (has links)
This dissertation develops a modeling framework for univariate and multivariate zero-inflated time series of counts and applies the models in a clustering scheme to identify groups of count series with similar behavior. The basic modeling framework used is observation-driven Poisson regression with generalized linear model (GLM) structure. The zero-inflated Poisson (ZIP) model is employed to characterize the possibility of extra observed zeros relative to the Poisson, a common feature of count data. These two methods are combined to characterize time series of counts where the counts and the probability of extra zeros may depend on past data observations and on exogenous covariates. A key contribution of this work is a novel modeling paradigm for multivariate zero-inflated counts. The three related models considered are the jointly-inflated, the marginally-inflated, and the doubly-inflated multivariate Poisson. The doubly-inflated model encompasses both marginal-inflation, which allows for additional zeros at each time epoch for each individual count series, and joint-inflation, which allows for zero-inflation across all multivariate series. These models improve upon previously proposed models, which are either too rigid or too simplistic to be applicable in a wide variety of applications. To estimate the model parameters, a new Monte Carlo Expectation-Maximization (MCEM) algorithm is developed. The Monte Carlo sampling eliminates complex recursion formulas needed for calculating the probability function of the multivariate Poisson. The algorithm is easily adapted for different multivariate zero-inflation schemes. The new models, new estimation methods, and applications in clustering are demonstrated on simulated and real datasets. For an application in finance, the number of trades and the number of price changes for bonds are modeled as a bivariate doubly zero-inflated Poisson time series, where observations of zero trades or zero price changes represent the liquidity risk for that bond. In an environmental science application, the new models are used in a model-based clustering scheme to study counts of high pollution events at air quality monitoring stations around Houston, Texas. Clustering reveals regions of the air monitoring network which behave similarly in terms of time dependence and response to covariates representing atmospheric conditions and physical sources of air pollution.
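A hedged sketch of the univariate building block follows: a zero-inflated Poisson regression with a log link for the Poisson mean and a logit link for the extra-zero probability, fit by maximum likelihood. Lagged counts can enter the design matrix X to make the model observation-driven; the multivariate inflation schemes and the MCEM algorithm of the dissertation are not reproduced here.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln, expit

def zip_negloglik(params, y, X):
    k = X.shape[1]
    beta, gamma = params[:k], params[k:]
    lam = np.exp(X @ beta)                     # Poisson mean (log link)
    pi = expit(X @ gamma)                      # probability of a structural zero (logit link)
    pois_logpmf = -lam + y * np.log(lam) - gammaln(y + 1)
    ll_zero = np.log(pi + (1 - pi) * np.exp(-lam))   # a zero can be structural or Poisson
    ll_pos = np.log(1 - pi) + pois_logpmf
    return -np.sum(np.where(y == 0, ll_zero, ll_pos))

def fit_zip(y, X):
    k = X.shape[1]
    res = minimize(zip_negloglik, x0=np.zeros(2 * k), args=(y, X), method="BFGS")
    return res.x[:k], res.x[k:]                # (beta_hat, gamma_hat)
```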
159

On the separation of T Tauri star spectra using non-negative matrix factorization and Bayesian positive source separation

January 2010 (has links)
The objective of this study is to compare and evaluate Bayesian and deterministic methods of positive source separation of young star spectra. In the Bayesian approach, the proposed Bayesian Positive Source Separation (BPSS) method uses Gamma priors to enforce non-negativity in the source signals and mixing coefficients and a Markov Chain Monte Carlo (MCMC) algorithm, modified by suggesting simpler proposal distributions and randomly initializing the MCMC to correctly separate spectra. In the deterministic approach, two Non-negative Matrix Factorization (NNMF) algorithms, the multiplicative update rule algorithm and an alternating least squares algorithm, are used to separate the star spectra into sources. The BPSS and NNMF algorithms are applied to T Tauri star spectra, resulting in a successful decomposition of the spectra into their sources. These methods are applied and evaluated in optical spectroscopy for the first time. The results show that, while both methods perform well, BPSS outperforms NNMF. The NNMF and BPSS algorithms improve upon the current methodology used in Astrophysics in two important ways. First, they permit the identification of spectral components in addition to the photosphere and boundary layer, which can be modeled with current methods. Second, by applying a statistical algorithm, the modeling of T Tauri stars becomes less subjective. These methods may be further extrapolated to model spectra from other types of stars or astrophysical phenomena.
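The multiplicative-update NNMF referred to above can be written in a few lines. This is a sketch of the standard Lee-Seung updates for the Frobenius-norm objective, not the exact implementation used in the study; V is assumed to hold one non-negative spectrum per row.

```python
import numpy as np

def nnmf(V, n_sources, n_iter=500, eps=1e-9, seed=0):
    """Factor V (n_spectra x n_wavelengths) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.uniform(size=(n, n_sources))       # mixing coefficients
    H = rng.uniform(size=(n_sources, m))       # source spectra
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # multiplicative update for sources
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # multiplicative update for mixing
    return W, H
```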
160

Generalized Gaussian process models with Bayesian variable selection

January 2010 (has links)
This research proposes a unified Gaussian process modeling approach that extends to data from the exponential dispersion family and survival data. Our specific interest is in the analysis of datasets with predictors possessing an a priori unknown form of possibly non-linear associations to the response. We incorporate Gaussian processes in a generalized linear model framework to allow a flexible non-parametric response surface function of the predictors. We term these novel classes "generalized Gaussian process models". We consider continuous, categorical and count responses and extend to survival outcomes. Next, we focus on the problem of selecting variables from a set of possible predictors and construct a general framework that employs mixture priors and a Metropolis-Hastings sampling scheme for the selection of the predictors with joint posterior exploration of the model and associated parameter spaces. We build upon this framework by first developing a scheme to improve the efficiency of posterior sampling. In particular, we compare the computational performance of the Metropolis-Hastings sampling scheme with a newer Metropolis-within-Gibbs algorithm. The new construction achieves a substantial improvement in computational efficiency while simultaneously reducing false positives. Next, we leverage this efficient scheme to investigate selection methods for addressing more complex response surfaces, particularly under a high dimensional covariate space. Finally, we employ spiked Dirichlet process (DP) prior constructions over set partitions containing covariates. Our approach results in a nonparametric treatment of the distribution of the covariance parameters of the GP covariance matrix that in turn induces a clustering of the covariates. We evaluate two prior constructions: the first employs a mixture of a point-mass and a continuous distribution as the centering distribution for the DP prior, therefore clustering all covariates. The second employs a mixture of a spike and a DP prior with a continuous distribution as the centering distribution, which induces clustering of the selected covariates only. DP models borrow information across covariates through model-based clustering, achieving sharper variable selection and prediction than what is obtained using mixture priors alone. We demonstrate that the former prior construction favors "sparsity", while the latter is computationally more efficient.
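As a hedged sketch of the core computational object in this kind of variable selection, the function below evaluates a Gaussian-response GP log marginal likelihood with a squared exponential kernel restricted to the covariates whose binary indicators are switched on; an MCMC sampler (Metropolis-Hastings or Metropolis-within-Gibbs) would propose flips of these indicators and accept or reject them using this quantity. The generalized (non-Gaussian) and survival cases replace this marginal with the appropriate likelihood; all names and hyperparameter values here are illustrative.

```python
import numpy as np

def gp_log_marginal(y, X, gamma, lengthscale=1.0, sig_f=1.0, sig_n=0.1):
    """Log marginal likelihood of a zero-mean GP using only the columns of X with gamma = 1."""
    Xg = X[:, gamma.astype(bool)] / lengthscale           # selected, rescaled covariates
    sq = np.sum((Xg[:, None, :] - Xg[None, :, :]) ** 2, axis=-1)
    K = sig_f**2 * np.exp(-0.5 * sq) + sig_n**2 * np.eye(len(y))
    _, logdet = np.linalg.slogdet(K)
    return -0.5 * (y @ np.linalg.solve(K, y) + logdet + len(y) * np.log(2 * np.pi))
```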
