101

High-Dimensional Statistical Methods for Tensor Data and Efficient Algorithms

Unknown Date (has links)
In contemporary sciences, it is of great interest to study supervised and unsupervised learning problems for high-dimensional tensor data. In this dissertation, we develop new methods for tensor classification and clustering problems and discuss algorithms to enhance their performance. For supervised learning, we propose the CATCH model, short for Covariate-Adjusted Tensor Classification in High dimensions, which efficiently integrates low-dimensional covariates and the tensor to perform classification and variable selection. The CATCH model preserves and utilizes the structure of the data for maximum interpretability and optimal prediction. We propose a penalized approach to select a subset of tensor predictor entries that has direct discriminative effects after adjusting for covariates. Theoretical results confirm that our approach achieves variable selection consistency and optimal classification accuracy. For unsupervised learning, we consider the clustering problem for high-dimensional tensor data and propose an efficient procedure based on the EM algorithm. It directly estimates the sparse discriminant vector from a penalized objective function and provides computationally efficient rules to update all other parameters. Meanwhile, the algorithm takes advantage of the tensor structure to reduce the number of parameters, which leads to lower storage costs. The advantage of our method over existing methods is demonstrated in simulated and real data examples. Moreover, based on tensor computation, we propose a novel algorithm, referred to as the SMORE algorithm, for differential network analysis. The SMORE algorithm has low storage cost and high computation speed, especially in the presence of strong sparsity. It also provides a unified framework for binary and multiple network problems. In addition, we note that the SMORE algorithm can be applied to high-dimensional quadratic discriminant analysis problems, providing a new approach for multiclass high-dimensional quadratic discriminant analysis. Finally, we discuss some directions for future work, including new approaches, applications, and relaxed assumptions. / A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. / Spring Semester 2019. / April 16, 2019. / classification, clustering, high-dimension, tensor, variable selection / Includes bibliographical references. / Qing Mai, Professor Co-Directing Dissertation; Xin Zhang, Professor Co-Directing Dissertation; Weikuan Yu, University Representative; Elizabeth Slate, Committee Member.
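The clustering procedure described in this abstract couples an EM algorithm with a penalized estimate of a sparse discriminant vector. Below is a minimal, generic illustration of that idea on vectorized data, assuming a two-cluster spherical Gaussian mixture and a lasso-style soft-thresholding step; it is not the dissertation's tensor algorithm, whose updates exploit the tensor structure.

```python
import numpy as np

def soft_threshold(v, lam):
    """Elementwise soft-thresholding: the proximal map of the l1 penalty."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def sparse_em_two_clusters(X, lam=0.5, n_iter=50):
    """Toy EM for a two-cluster spherical Gaussian mixture in which the
    cluster-mean difference (a crude stand-in for a discriminant vector)
    is soft-thresholded, so irrelevant coordinates are set exactly to zero."""
    n = X.shape[0]
    # initialize responsibilities by splitting along the leading principal component
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    resp = (Xc @ vt[0] > 0).astype(float)
    for _ in range(n_iter):
        # M-step: weighted means; the mean difference is penalized to be sparse
        w1 = resp.sum()
        w0 = n - w1
        m0 = (X * (1 - resp)[:, None]).sum(axis=0) / max(w0, 1e-8)
        m1 = (X * resp[:, None]).sum(axis=0) / max(w1, 1e-8)
        delta = soft_threshold(m1 - m0, lam)
        m1 = m0 + delta
        pi1 = np.clip(w1 / n, 1e-6, 1 - 1e-6)
        # E-step: posterior probability of cluster 1 under unit-variance Gaussians
        d0 = ((X - m0) ** 2).sum(axis=1)
        d1 = ((X - m1) ** 2).sum(axis=1)
        logit = np.clip(np.log(pi1 / (1 - pi1)) - 0.5 * (d1 - d0), -30, 30)
        resp = 1.0 / (1.0 + np.exp(-logit))
    return delta, resp

# Example: two clusters of 50-dimensional vectors differing only in 3 coordinates
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 50))
X[:100, :3] += 2.0
delta, resp = sparse_em_two_clusters(X)
print("nonzero discriminant coordinates:", np.flatnonzero(delta))
```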
102

Envelopes, Subspace Learning and Applications

Unknown Date (has links)
The envelope model is a nascent dimension reduction technique. We focus on extending the envelope methodology to broader applications. In the first part of this thesis, we propose a common reducing subspace model that simultaneously estimates covariance matrices, precision matrices, and their differences across multiple populations. This model leads to substantial dimension reduction and efficient parameter estimation, and we explicitly quantify the efficiency gain through an asymptotic analysis. In the second part, we propose a set of new mixture models called CLEMM (Clustering with Envelope Mixture Models), built on the widely used Gaussian mixture model assumptions. The proposed CLEMM framework and the associated envelope-EM algorithms provide the foundations for envelope methodology in unsupervised and semi-supervised learning problems. We also illustrate the performance of these models with simulation studies and empirical applications. In the third part of this thesis, we extend envelope discriminant analysis from vector data to tensor data. A further study on copula-based models for forecasting realized volatility matrices is included, an important financial application of covariance matrix estimation. We consider multivariate t and Clayton copulas, as well as bivariate t, Gumbel, and Clayton copulas, to model and forecast one-day-ahead realized volatility matrices. Empirical results show that copula-based models can achieve strong performance in terms of both statistical precision and economic efficiency. / A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. / Spring Semester 2019. / April 18, 2019. / Clustering Analysis, Dimension Reduction, EM algorithm, Envelope models, Reducing subspace, Tensor classification / Includes bibliographical references. / Xin Zhang, Professor Co-Directing Dissertation; Minjing Tao, Professor Co-Directing Dissertation; Wen Li, University Representative; Fred Huffer, Committee Member.
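As a rough, hypothetical illustration of the common-reducing-subspace idea from the first part (not the thesis's actual estimator), the sketch below projects several covariance matrices onto one shared low-dimensional subspace, taken here heuristically as the span of the leading eigenvectors of their average, so that each population is summarized by a small core matrix.

```python
import numpy as np

def common_subspace_summary(covs, d):
    """Project each covariance matrix onto a shared d-dimensional subspace.
    The subspace is taken (heuristically) as the span of the top-d
    eigenvectors of the average covariance matrix."""
    avg = sum(covs) / len(covs)
    eigvals, eigvecs = np.linalg.eigh(avg)        # eigenvalues in ascending order
    gamma = eigvecs[:, -d:]                       # p x d basis of the subspace
    cores = [gamma.T @ S @ gamma for S in covs]   # d x d reduced matrices
    return gamma, cores

# Example: three population covariance matrices that differ only on a 2-D subspace
rng = np.random.default_rng(0)
p, d = 20, 2
basis, _ = np.linalg.qr(rng.normal(size=(p, d)))
covs = []
for _ in range(3):
    core = np.diag(rng.uniform(1.0, 5.0, size=d))
    covs.append(np.eye(p) + basis @ core @ basis.T)
gamma, cores = common_subspace_summary(covs, d)
print([np.round(C, 2) for C in cores])
```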
103

A Bayesian Semiparametric Joint Model for Longitudinal and Survival Data

Unknown Date (has links)
Many biomedical studies monitor both a longitudinal marker and a survival time on each subject under study. Modeling these two endpoints as joint responses has the potential to improve the inference for both. We consider the approach of Brown and Ibrahim (2003), who propose a Bayesian hierarchical semiparametric joint model. The model links the longitudinal and survival outcomes by incorporating the mean longitudinal trajectory as a predictor for the survival time. The usual parametric mixed effects model for the longitudinal trajectory is relaxed by placing a Dirichlet process prior on the coefficients. A Cox proportional hazards model is then used for the survival time. The complicated joint likelihood increases the computational complexity. We develop a computationally efficient method by using a multivariate log-gamma distribution instead of a Gaussian distribution to model the data. We use Gibbs sampling combined with Neal's (2000) algorithm and the Metropolis-Hastings method for inference. Simulation studies illustrate the procedure and compare the log-gamma joint model with its Gaussian counterpart. We apply this joint modeling method to human immunodeficiency virus (HIV) data and prostate-specific antigen (PSA) data. / A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. / Spring Semester 2019. / April 16, 2019. / Bayesian, Gibbs Sampler, Joint model, Longitudinal, Survival / Includes bibliographical references. / Elizabeth H. Slate, Professor Co-Directing Dissertation; Jonathan R. Bradley, Professor Co-Directing Dissertation; Amy M. Wetherby, University Representative; Lifeng Lin, Committee Member.
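The inference described above combines Gibbs steps with Metropolis-Hastings updates. As a generic, self-contained illustration of the Metropolis-Hastings ingredient only (a sketch on a toy posterior, not the dissertation's sampler for the joint model):

```python
import numpy as np

def random_walk_mh(log_post, init, n_draws=5000, step=0.5, seed=0):
    """Generic random-walk Metropolis-Hastings sampler for a scalar parameter.
    log_post: function returning the log posterior density up to a constant."""
    rng = np.random.default_rng(seed)
    draws = np.empty(n_draws)
    current = init
    current_lp = log_post(current)
    for i in range(n_draws):
        proposal = current + step * rng.normal()
        prop_lp = log_post(proposal)
        # accept with probability min(1, posterior ratio)
        if np.log(rng.uniform()) < prop_lp - current_lp:
            current, current_lp = proposal, prop_lp
        draws[i] = current
    return draws

# Toy example: posterior of a normal mean under a vague normal prior
y = np.array([1.2, 0.8, 1.5, 0.9, 1.1])
log_post = lambda mu: -0.5 * np.sum((y - mu) ** 2) - 0.5 * (mu / 10.0) ** 2
samples = random_walk_mh(log_post, init=0.0)
print("posterior mean ~", samples[1000:].mean())
```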
104

Univariate and Multivariate Volatility Models for Portfolio Value at Risk

Unknown Date (has links)
In modern financial risk management, modeling and forecasting stock return movements via their conditional volatilities, and in particular predicting the Value at Risk (VaR), have become increasingly important for a healthy economic environment. In this dissertation, we evaluate and compare the two main families of conditional volatility models, GARCH and Stochastic Volatility (SV), in terms of their VaR prediction performance for 5 major US stock indices. We estimate GARCH-type model parameters via Quasi-Maximum Likelihood Estimation (QMLE), while for SV we employ MCMC with the Ancillary Sufficient Interweaving Strategy. We use the forecast volatilities corresponding to each model to predict the VaR of the 5 indices. We test the predictive performance of the estimated models by a two-stage backtesting procedure and then compare them via the Lopez loss function. The results indicate that, even though it is more computationally demanding than GARCH-type models, SV dominates them in forecasting VaR. Since financial volatilities move together across assets and markets, modeling them in a multivariate framework appears more appropriate. However, existing studies in the literature do not present compelling evidence for a strong preference between univariate and multivariate models. In this dissertation we also address the problem of forecasting portfolio VaR via multivariate GARCH models versus univariate GARCH models. We construct 3 portfolios from the stock returns of 3 major US stock indices, 6 major banks, and 6 major technology companies, respectively. For each portfolio, we model the portfolio conditional covariances with univariate GARCH and EGARCH models and with MGARCH-BEKK, MGARCH-DCC, and GO-GARCH models. For each estimated model, the forecast portfolio volatilities are then used to calculate portfolio VaR. The ability to capture the portfolio volatilities is evaluated by MAE and RMSE; the VaR prediction performance is tested through a two-stage backtesting procedure and compared in terms of the loss function. The results of our study indicate that even though MGARCH models are better at predicting the volatilities of some portfolios, GARCH models can perform as well as their multivariate (and computationally more demanding) counterparts. / A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. / Spring Semester 2019. / April 2, 2019. / GARCH, MGARCH, SV, VaR / Includes bibliographical references. / Xufeng Niu, Professor Directing Dissertation; Giray Ökten, University Representative; Fred Huffer, Committee Member; Wei Wu, Committee Member.
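A minimal sketch of the GARCH leg of this comparison, assuming a plain Gaussian GARCH(1,1) fitted by QMLE to simulated returns and a normal quantile for the one-day-ahead VaR; the dissertation's actual specifications, data, and backtests are considerably richer.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def garch11_variances(params, r):
    """Conditional variance recursion of a GARCH(1,1) model."""
    omega, alpha, beta = params
    sigma2 = np.empty(len(r))
    sigma2[0] = r.var()                       # initialize at the sample variance
    for t in range(1, len(r)):
        sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

def neg_quasi_loglik(params, r):
    """Negative Gaussian quasi log-likelihood (constants dropped)."""
    sigma2 = garch11_variances(params, r)
    return 0.5 * np.sum(np.log(sigma2) + r ** 2 / sigma2)

def garch11_var_forecast(r, level=0.01):
    """QMLE fit of GARCH(1,1) and the one-day-ahead VaR at the given level."""
    x0 = np.array([0.05 * r.var(), 0.05, 0.90])
    res = minimize(neg_quasi_loglik, x0, args=(r,), method="L-BFGS-B",
                   bounds=[(1e-8, None), (1e-8, 1.0), (1e-8, 1.0)])
    omega, alpha, beta = res.x
    sigma2 = garch11_variances(res.x, r)
    sigma2_next = omega + alpha * r[-1] ** 2 + beta * sigma2[-1]
    return res.x, -norm.ppf(level) * np.sqrt(sigma2_next)   # VaR as a positive loss

# Simulated daily returns from a GARCH(1,1) process
rng = np.random.default_rng(0)
omega_t, alpha_t, beta_t = 0.05, 0.08, 0.90
n = 2000
r = np.empty(n)
s2 = omega_t / (1 - alpha_t - beta_t)
for t in range(n):
    r[t] = np.sqrt(s2) * rng.normal()
    s2 = omega_t + alpha_t * r[t] ** 2 + beta_t * s2
params, var_1pct = garch11_var_forecast(r)
print("QMLE estimates:", np.round(params, 3), " 1% one-day VaR:", round(var_1pct, 3))
```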
105

Identifiability in the autopsy model of reliability theory

Unknown Date (has links)
Let S be a coherent system of m components acting independently. Two statistical models are considered. In the autopsy model S is observed until it fails. The set of failed components and the failure time of the system are noted. The failure times of the dead components are not known. In the second model, which was considered by Doss, Freitag and Proschan (Ann. Statist., 1989), the failure times of the dead components are also known. / In the autopsy model, it is not always possible to estimate or identify the component lifelengths from the observed data. A sufficient condition for the identifiability of the component distributions is given for the case in which the distributions are assumed to be analytic. Necessary and sufficient conditions are given for the case in which the distributions are assumed to belong to certain parametric families. / The model of Doss, Freitag and Proschan is considered in two special cases. In the first of these the component distributions are known to be identical. In the second, the distributions are known to be exponential. Estimators of the component and system life lengths are given for each of these cases, and the asymptotic relative efficiency of each with respect to the corresponding estimator of Doss, Freitag and Proschan is calculated. / Source: Dissertation Abstracts International, Volume: 53-03, Section: B, page: 1449. / Major Professors: Hani Doss; Myles Hollander. / Thesis (Ph.D.)--The Florida State University, 1992.
106

A comparison of robust and least squares regression models using actual and simulated data

Unknown Date (has links)
The purpose of this study was to compare several robust regression techniques to ordinary least squares (OLS) regression when analyzing bivariate and multivariate data. The bivariate analysis compared the performance of alternative robust procedures with that of standard OLS regression techniques with regard to the detection of outliers. The bivariate analysis demonstrated the weaknesses of OLS regression and the standard OLS outlier diagnostic techniques when multiple outliers are present. In addition, this research assessed the empirical performance of alpha and power under three non-normal probability density functions using a Monte Carlo simulation. / The first analysis focused on several bivariate data sets. Each data set was plotted and each of the regression models was used to analyze the data. The usual results (e.g., $R^2$, regression coefficients, standard errors, and regression diagnostics) were examined to give a visual as well as empirical analysis of the models' performance in the presence of multiple outliers. / The second component of this study entailed a Monte Carlo simulation of five robust regression models and OLS regression under four probability density functions. The variables included in the study were placed in one $2^1 3^2$ and two $3^2$ factorial designs repeated over four probability density functions, resulting in a total of 90 experimental runs of the Monte Carlo simulation. Random samples were generated and then transformed to fit the desired distributional moment characteristics. The incremental null hypothesis was used as the basis for calculating empirical alpha and power values. / The analysis demonstrated the inadequacies of the standard OLS-based outlier detection methods and explained how regression analysis could be improved if a robust regression method is used in parallel with OLS regression. The multivariate analysis demonstrated the robustness of the OLS regression model to three non-normal populations. It further demonstrated a moderate inflation of alpha for the M-class of robust regression models and a lack of power stability with the rank transform regression method. / Based on the results of this study, recommendations were made for using robust regression methods and suggestions for future research were offered. / Source: Dissertation Abstracts International, Volume: 53-03, Section: B, page: 1450. / Thesis (Ph.D.)--The Florida State University, 1992.
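To make the robust-versus-OLS comparison concrete, here is a small illustrative sketch (not one of the study's actual designs or data sets) fitting OLS and a Huber M-estimator by iteratively reweighted least squares to a line contaminated by multiple outliers.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares fit."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def huber_irls(X, y, c=1.345, n_iter=50):
    """Huber M-estimator fit by iteratively reweighted least squares (IRLS)."""
    beta = ols(X, y)
    for _ in range(n_iter):
        resid = y - X @ beta
        scale = np.median(np.abs(resid)) / 0.6745 + 1e-12   # robust MAD scale
        u = resid / scale
        w = np.where(np.abs(u) <= c, 1.0, c / np.abs(u))    # Huber weights
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

# Simulated line with a cluster of outliers pulling OLS away from the truth
rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0, 10, n)
y = 2.0 + 0.5 * x + rng.normal(scale=0.5, size=n)
y[:10] += 15.0                                   # multiple outliers
X = np.column_stack([np.ones(n), x])
print("OLS:  ", ols(X, y))
print("Huber:", huber_irls(X, y))
```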
107

Limit theorems for Markov random fields

Unknown Date (has links)
Markov Random Fields (MRFs) have been extensively applied in Statistical Mechanics as well as in Bayesian Image Analysis. MRFs are a special class of dependent random variables located at the vertices of a graph, whose joint distribution includes a parameter called the temperature. When the number of vertices of the graph tends to infinity, suitably normalized statistics based on these random variables converge in distribution. It can happen that, for certain values of the temperature, the rate of growth of the normalizing constants changes drastically. This feature is generally used to explain the phenomenon of phase transition as understood by physicists. In this dissertation we show that this drastic change in normalizing constants occurs even in the relatively smooth case when all the random variables are Gaussian. Hence any image-analytic MRF ought to be checked for such discontinuous behavior before any analysis is performed. / Mixed limit theorems in Bayesian Image Analysis seek to replace intensive simulations of MRFs with limit theorems that approximate the distribution of the MRFs as the number of sites increases. The problem of deriving mixed limit theorems for MRFs on a one-dimensional lattice graph with an acceptor function that has a second moment has been studied by Chow. A mixed limit theorem for the integer lattice graph is derived when the acceptor function does not have a second moment, as for instance when the acceptor function is a symmetric stable density of index $0 < \alpha < 2$. / Source: Dissertation Abstracts International, Volume: 52-08, Section: B, page: 4297. / Major Professor: Jayaram Sethuraman. / Thesis (Ph.D.)--The Florida State University, 1991.
108

Cumulative regression function methods in survival analysis and time series

Unknown Date (has links)
One may estimate a conditional hazard function from grouped (and possibly censored) survival data by the time- and covariate-specific occurrence/exposure rate. Asymptotic results for cumulative versions of this estimator are developed, utilizing the general framework of counting processes. In particular, a grouped-data-based goodness-of-fit test for Cox's proportional hazards model is given. Various constraints on the asymptotic behavior of the widths of the calendar periods and covariate strata employed in grouping the data are needed to prove the results. The actual performance of the estimators and test statistics is evaluated by Monte Carlo methods. / We also consider the problem of identifying the class of time series models to which a series belongs based on observation of part of the series. Techniques of nonparametric estimation have been applied to this problem by Auestad and Tjostheim (Biometrika 77 (1990): 669-687), who used kernel estimates of the one-step lagged conditional mean and variance functions. We study cumulative versions of such estimates. These are more stable than the kernel estimates and can be used to construct confidence bands for the underlying cumulative mean and variance functions. Goodness-of-fit tests for specific parametric models are also developed. / Source: Dissertation Abstracts International, Volume: 52-08, Section: B, page: 4300. / Major Professor: Ian McKeague. / Thesis (Ph.D.)--The Florida State University, 1991.
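A minimal sketch of the grouped occurrence/exposure-rate estimator and its cumulative version, here without covariate strata and with simulated exponential failure times; this illustrates the general construction, not the dissertation's counting-process framework.

```python
import numpy as np

def grouped_cumulative_hazard(times, events, bin_edges):
    """Occurrence/exposure rates on time intervals and the cumulative hazard.
    times: observed (possibly censored) follow-up times
    events: 1 if failure observed, 0 if censored
    bin_edges: calendar-period boundaries used to group the data"""
    rates = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        # exposure: person-time spent in [lo, hi) by each subject
        exposure = np.clip(np.minimum(times, hi) - lo, 0.0, None).sum()
        occurrences = np.sum((times >= lo) & (times < hi) & (events == 1))
        rates.append(occurrences / exposure if exposure > 0 else 0.0)
    widths = np.diff(bin_edges)
    cum_hazard = np.cumsum(np.array(rates) * widths)
    return np.array(rates), cum_hazard

# Example with exponential failure times and independent censoring
rng = np.random.default_rng(0)
t_fail = rng.exponential(scale=2.0, size=500)     # true hazard = 0.5
t_cens = rng.exponential(scale=4.0, size=500)
times = np.minimum(t_fail, t_cens)
events = (t_fail <= t_cens).astype(int)
edges = np.linspace(0.0, 4.0, 9)
rates, cum_haz = grouped_cumulative_hazard(times, events, edges)
print("interval rates (should hover around 0.5):", np.round(rates, 2))
```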
109

A Comparison of Estimators in Hierarchical Linear Modeling: Restricted Maximum Likelihood versus Bootstrap via Minimum Norm Quadratic Unbiased Estimators

Unknown Date (has links)
The purpose of the study was to investigate the relative performance of two estimation procedures, restricted maximum likelihood (REML) and the bootstrap via MINQUE, for a two-level hierarchical linear model under a variety of conditions. The specific focus was on whether the bootstrap via MINQUE procedure offered improved accuracy in the estimation of the model parameters and their standard errors in situations where normality may not be guaranteed. Through Monte Carlo simulations, the importance of this assumption for the accuracy of multilevel parameter estimates and their standard errors was assessed using relative bias as the accuracy index and by observing the coverage percentages of 95% confidence intervals constructed under both estimation procedures. The study systematically varied the number of groups at level 2 (30 versus 100), the size of the intraclass correlation (0.01 versus 0.20), and the distribution of the observations (normal versus chi-squared with 1 degree of freedom). The number-of-groups and intraclass-correlation factors produced effects consistent with those previously reported: as the number of groups increased, the bias in the parameter estimates decreased, with a more pronounced effect observed for estimates obtained via REML. High levels of the intraclass correlation also led to a decrease in the efficiency of parameter estimation under both methods. Study results show that while both the restricted maximum likelihood and the bootstrap via MINQUE estimates of the fixed effects were accurate, the efficiency of the estimates was affected by the distribution of the errors, with the bootstrap via MINQUE procedure outperforming REML. Both procedures produced less efficient estimators under the chi-squared distribution, particularly for the variance-covariance component estimates. / A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. / Degree Awarded: Summer Semester, 2006. / Date of Defense: June 5, 2006. / REML, MINQUE / Includes bibliographical references. / Xu-Feng Niu, Professor Directing Dissertation; Richard L. Tate, Outside Committee Member; Fred W. Huffer, Committee Member; Douglas Zahn, Committee Member.
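The accuracy indices used in this study, relative bias and 95% confidence-interval coverage, can be illustrated with a much simpler Monte Carlo experiment; the sketch below evaluates them for a sample mean under chi-squared errors (the hierarchical-model simulation itself is considerably more involved).

```python
import numpy as np

def simulate_bias_and_coverage(n_reps=2000, n=30, true_mean=5.0, seed=0):
    """Monte Carlo evaluation of relative bias and 95% CI coverage for a
    sample mean under skewed, chi-squared(1) errors."""
    rng = np.random.default_rng(seed)
    estimates = np.empty(n_reps)
    covered = np.empty(n_reps, dtype=bool)
    for r in range(n_reps):
        # chi-squared(1) errors, recentered so the true mean is `true_mean`
        sample = true_mean + (rng.chisquare(df=1, size=n) - 1.0)
        est = sample.mean()
        se = sample.std(ddof=1) / np.sqrt(n)
        lo, hi = est - 1.96 * se, est + 1.96 * se
        estimates[r] = est
        covered[r] = lo <= true_mean <= hi
    rel_bias = (estimates.mean() - true_mean) / true_mean
    return rel_bias, covered.mean()

rel_bias, coverage = simulate_bias_and_coverage()
print(f"relative bias: {rel_bias:.4f}, 95% CI coverage: {coverage:.3f}")
```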
110

Estimation from Data Representing a Sample of Curves

Unknown Date (has links)
This dissertation introduces and assesses an algorithm to generate confidence bands for a regression function or a main effect when multiple data sets are available. In particular, it proposes to construct confidence bands for the different trajectories and then aggregate these to produce an overall confidence band for a mean function. An estimator of the regression function or main effect is also examined. First, nonparametric estimators and confidence bands are formed on each data set separately. Then each data set is in turn treated as a testing set for aggregating the preliminary results from the remaining data sets. The criterion used for this aggregation is either the least squares (LS) criterion or a BIC-type penalized LS criterion. The proposed estimator is the average over data sets of these aggregates; it is thus a weighted sum of the preliminary estimators. When there is only a main effect, the proposed confidence band is the minimum L1 band among all M aggregate bands. In the case where there is a random effect, we suggest an adjustment to the confidence band; the proposed band is then the minimum L1 band among all M adjusted aggregate bands. Desirable asymptotic properties are shown to hold. A simulation study examines the performance of each technique relative to several alternative methods and theoretical benchmarks. An application to seismic data is conducted. / A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. / Degree Awarded: Summer Semester, 2006. / Date of Defense: March 29, 2006. / Confidence Bands, Confidence Intervals, Nonparametric / Includes bibliographical references. / Florentina Bunea, Professor Directing Dissertation; Patrick Mason, Outside Committee Member; Myles Hollander, Committee Member; Fred Huffer, Committee Member.
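A rough sketch of the aggregation idea described above, under simplifying assumptions (a Gaussian-kernel local average in place of the dissertation's estimators, no confidence bands, and plain least-squares weights): each data set serves in turn as the testing set for weighting the preliminary fits from the others, and the final estimator averages the M aggregates.

```python
import numpy as np

def local_average(x_train, y_train, x_eval, h=0.3):
    """Nadaraya-Watson style local average with a Gaussian kernel."""
    w = np.exp(-0.5 * ((x_eval[:, None] - x_train[None, :]) / h) ** 2)
    return (w @ y_train) / w.sum(axis=1)

def ls_aggregate(datasets, x_grid):
    """Leave-one-dataset-out least-squares aggregation of preliminary fits."""
    M = len(datasets)
    aggregates = []
    for test_idx in range(M):
        x_test, y_test = datasets[test_idx]
        # preliminary estimators from the remaining data sets, evaluated on the test set
        F = np.column_stack([local_average(x, y, x_test)
                             for j, (x, y) in enumerate(datasets) if j != test_idx])
        weights, *_ = np.linalg.lstsq(F, y_test, rcond=None)   # LS criterion
        G = np.column_stack([local_average(x, y, x_grid)
                             for j, (x, y) in enumerate(datasets) if j != test_idx])
        aggregates.append(G @ weights)
    return np.mean(aggregates, axis=0)       # average of the M aggregates

# Example: several noisy samples of the same underlying curve
rng = np.random.default_rng(0)
datasets = []
for _ in range(4):
    x = np.sort(rng.uniform(0, 1, 80))
    y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=80)
    datasets.append((x, y))
x_grid = np.linspace(0, 1, 50)
estimate = ls_aggregate(datasets, x_grid)
print(np.round(estimate[:5], 2))
```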
