  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.

A study on structured covariance modeling approaches to designing compact recognizers of online handwritten Chinese characters

Wang, Yongqiang, January 2009
Thesis (M. Phil.)--University of Hong Kong, 2009. / Includes bibliographical references (leaves 81-89). Also available in print.

The multivariate one-way classification model with random effects

Schott, James Robert, January 1981
Thesis (Ph. D.)--University of Florida, 1981. / Description based on print version record. Typescript. Vita. Includes bibliographical references (leaves 108-109).

Global covariance modeling: a deformation approach to anisotropy

Das, Barnali, January 2000
Thesis (Ph. D.)--University of Washington, 2000. / Vita. Includes bibliographical references (p. 124-131).

High-dimensional covariance matrix estimation with application to Hotelling's tests

Dong, Kai 31 August 2015
In recent years, high-dimensional data sets have become widely available in many scientific areas, such as gene expression studies and finance. Estimating the covariance matrix is a significant issue in such high-dimensional data analysis. This thesis focuses on high-dimensional covariance matrix estimation and its application. First, this thesis addresses covariance matrix estimation. In Chapter 2, a new optimal shrinkage estimator of the covariance matrices is proposed. This method is motivated by quadratic discriminant analysis, where many covariance matrices need to be estimated simultaneously. We shrink the sample covariance matrix towards the pooled sample covariance matrix through a shrinkage parameter. Some properties of the optimal shrinkage parameter are investigated, and we also show how to estimate it. Simulation studies and real data analysis are also conducted. In Chapter 4, we estimate the determinant of the covariance matrix using some recent proposals for estimating high-dimensional covariance matrices. Specifically, a total of nine covariance matrix estimation methods are considered for comparison. Through extensive simulation studies, we explore and summarize some interesting comparison results among all compared methods. A few practical guidelines are also given on the sample size, the dimension, and the correlation of the data set for estimating the determinant of a high-dimensional covariance matrix. Finally, from the perspective of the loss function, the comparison study in this chapter also serves as a proxy for assessing the performance of covariance matrix estimation. Second, this thesis focuses on the application of high-dimensional covariance matrix estimation. In Chapter 3, we estimate the high-dimensional covariance matrix based on the diagonal matrix of the sample covariance matrix and apply it to Hotelling’s tests. In this chapter, we propose a shrinkage-based diagonal Hotelling’s test for both one-sample and two-sample cases. We also propose several different ways to derive the approximate null distribution of the proposed shrinkage-based test under different scenarios of p and n. Simulation studies show that the proposed method performs comparably to existing competitors when n is moderate or large, and better when n is small. In addition, we analyze four gene expression data sets, which demonstrate the advantage of the proposed shrinkage-based diagonal Hotelling’s test. Apart from covariance matrix estimation, we also develop a new classification method for a specific type of high-dimensional data, RNA-sequencing data. In Chapter 5, we propose a negative binomial linear discriminant analysis for RNA-Seq data. By Bayes’ rule, we construct the classifier by fitting a negative binomial model, and propose some plug-in rules to estimate the unknown parameters in the classifier. The relationship between the negative binomial classifier and the Poisson classifier is explored, with a numerical investigation of the impact of dispersion on the discriminant score. Simulation results show the superiority of the proposed method. We also analyze four real RNA-Seq data sets to demonstrate the advantage of our method in real-world applications. Keywords: Covariance matrix, Discriminant analysis, High-dimensional data, Hotelling’s test, Log determinant, RNA-sequencing data.
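The shrinkage step described in this abstract can be illustrated with a minimal sketch. It assumes the shrunken estimate is a convex combination of a group's sample covariance and the pooled sample covariance; the abstract does not give the exact form, so this is an assumption. The function names and the weight `lam` are hypothetical, and `lam` is fixed by hand here rather than estimated optimally as in the thesis.

```python
import numpy as np

def pooled_covariance(groups):
    """Pooled sample covariance of several groups (rows are observations)."""
    total_df = sum(len(g) - 1 for g in groups)
    weighted = sum((len(g) - 1) * np.cov(g, rowvar=False) for g in groups)
    return weighted / total_df

def shrink_towards_pooled(group, pooled, lam):
    """Convex combination of a group's sample covariance and the pooled one.

    lam is a hypothetical, hand-picked weight in [0, 1]; the thesis derives
    an optimal value, which is not reproduced here.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    sample = np.cov(group, rowvar=False)
    return (1.0 - lam) * sample + lam * pooled

# Two small synthetic groups with 4 variables each (illustrative data only).
rng = np.random.default_rng(0)
groups = [rng.normal(size=(6, 4)), rng.normal(size=(8, 4))]
pooled = pooled_covariance(groups)
shrunk = shrink_towards_pooled(groups[0], pooled, lam=0.5)
```

With `lam=0` the group's own sample covariance is returned unchanged, and with `lam=1` every group receives the pooled estimate, so the weight controls how much information is borrowed across groups.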

Semiparametric Inference of Censored Data with Time-dependent Covariates

Chu, Chi Wing January 2021
This thesis develops two semiparametric methods for censored survival data when the covariates involved are time-dependent. In the two parts of this thesis, we introduce, respectively, an interquantile regression model and a censored quantile regression model that account for the time-dependent covariates commonly observed in survival analysis. The proposed quantile-based techniques offer greater model flexibility compared with the Cox proportional hazards model and the accelerated failure time model. The first half of this thesis introduces a censored interquantile regression model with time-dependent covariates. Conventionally, censored quantile regression stipulates a specific, pointwise conditional quantile of the survival time given covariates. Despite its model flexibility and straightforward interpretation, the pointwise formulation often yields rather unstable estimates with large variances across neighbouring quantile levels. In view of this phenomenon, we propose a new class of censored interquantile regression models with time-dependent covariates that can capture the relationship between the failure time and the covariate processes of a target population that falls within a specific quantile bracket. The pooling of information within a homogeneous neighbourhood facilitates more efficient estimates and hence more consistent conclusions on the statistical significance of the variables concerned. This new formulation can also be regarded as a generalization of the accelerated failure time model for survival data in the sense that it relaxes the assumption of global homogeneity for the error at all quantile levels. By introducing a class of weighted rank-based estimation procedures, our framework allows quantile-based inference on the covariate effect under a less restrictive set of assumptions. Numerical studies demonstrate that the proposed estimator outperforms existing alternatives under various settings in terms of smaller empirical bias and standard deviation. A perturbation-based resampling method is also developed to reconcile the asymptotic distribution of the parameter estimates. Finally, consistency and weak convergence of the proposed estimator are established via empirical process theory. In the second half of this thesis, we propose a class of censored quantile regression models for right-censored failure time data with time-dependent covariates that requires only standard conditionally independent censoring. Upon a quantile-based transformation, a system of functional estimating equations for the quantile parameters is derived based on a martingale construction. While time-dependent covariates naturally arise in time-to-event analysis, the limited existing literature requires either an independent censoring mechanism or a fully observed covariate process even after the event has occurred. The proposed formulation extends the existing censored quantile regression model so that only the covariate history up to the observed event time is required, as in the Cox proportional hazards model for time-dependent covariates. A recursive algorithm is developed to evaluate the estimator numerically. Asymptotic properties, including uniform consistency and weak convergence of the proposed estimator as a process of the quantile level, are established. Monte Carlo simulations and numerical studies on clinical trial data from the AIDS Clinical Trials Group are presented to illustrate the numerical performance of the proposed estimator.

Multisample analysis of structural equation models with stochastic constraints.

January 1992
Wai-tung Ho. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1992. / Includes bibliographical references (leaves 81-83).
Contents:
Chapter 1. OVERVIEW OF CONSTRAINED ESTIMATION OF STRUCTURAL EQUATION MODEL
Chapter 2. MULTISAMPLE ANALYSIS OF STRUCTURAL EQUATION MODELS WITH STOCHASTIC CONSTRAINTS
2.1 The Basic Model
2.2 Bayesian Approach to Nuisance Parameters
2.3 Estimation and Algorithm
2.4 Asymptotic Properties of the Bayesian Estimate
Chapter 3. MULTISAMPLE ANALYSIS OF STRUCTURAL EQUATION MODELS WITH EXACT AND STOCHASTIC CONSTRAINTS
3.1 The Basic Model
3.2 Bayesian Approach to Nuisance Parameters and Estimation Procedures
3.3 Asymptotic Properties of the Bayesian Estimate
Chapter 4. SIMULATION STUDIES AND NUMERICAL EXAMPLE
4.1 Simulation Study for Identified Models with Stochastic Constraints
4.2 Simulation Study for Non-identified Models with Stochastic Constraints
4.3 Numerical Example with Exact and Stochastic Constraints
Chapter 5. DISCUSSION AND CONCLUSION
APPENDICES
TABLES
REFERENCES

A Three-Paper Dissertation on Longitudinal Data Analysis in Education and Psychology

Ahmadi, Hedyeh January 2019
In longitudinal settings, modeling the covariance structure of repeated measure data is essential for proper analysis. The first paper in this three-paper dissertation presents a survey of four journals in the fields of Education and Psychology to identify the most commonly used methods for analyzing longitudinal data. It provides literature reviews and statistical details for each identified method. This paper also offers a summary table giving the benefits and drawbacks of all the surveyed methods in order to help researchers choose the optimal model according to the structure of their data. Finally, this paper highlights that even when scholars do use more advanced methods for analyzing repeated measure data, they very rarely report (or explore in their discussions) the covariance structure implemented in their choice of modeling. This suggests that, at least in some cases, researchers may not be taking advantage of the optimal covariance patterns. This paper identifies a gap in the standard statistical practices of the fields of Education and Psychology, namely that researchers are not modeling the covariance structure as an extension of fixed/random effects modeling. The second paper introduces the General Serial Covariance (GSC) approach, an extension of the Linear Mixed Modeling (LMM) or Hierarchical Linear Model (HLM) techniques that models the covariance structure using spatial correlation functions such as Gaussian, Exponential, and other patterns. These spatial correlations model the covariance structure in a continuous manner and therefore can deal with missingness and imbalanced data in a straightforward way. A simulation study in the second paper reveals that when data are consistent with the GSC model, using basic HLMs is not optimal for the estimation and testing of the fixed effects. The third paper is a tutorial that uses a real-world data set from a drug abuse prevention intervention to demonstrate the use of the GSC and basic HLM models in the R programming language. This paper utilizes variograms (a visualization tool borrowed from geostatistics) among other exploratory tools to determine the covariance structure of the repeated measure data. This paper aims to introduce the GSC model and variogram plots to Education and Psychology, where, according to the survey in the first paper, they are not in use. This paper can also help scholars seeking guidance for interpreting the fixed-effect parameters.
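To make the idea of a continuous-time serial covariance concrete, here is a minimal sketch of how Exponential and Gaussian correlation functions turn measurement times into a covariance matrix. The parameter names (`variance`, `range_`) and the exact parameterization are assumptions for illustration; the dissertation's GSC model may include further components such as random effects and measurement error, which are omitted here.

```python
import math

def exponential_corr(lag, range_):
    """Exponential correlation: decays as exp(-lag / range_)."""
    return math.exp(-lag / range_)

def gaussian_corr(lag, range_):
    """Gaussian correlation: decays as exp(-(lag / range_) ** 2)."""
    return math.exp(-((lag / range_) ** 2))

def serial_covariance(times, variance, range_, corr=exponential_corr):
    """Covariance matrix for one subject measured at the given times.

    Only the time lag between two measurements matters, so unequally
    spaced or missing occasions need no special handling.
    """
    return [[variance * corr(abs(s - t), range_) for t in times] for s in times]

# One subject observed at irregular times (illustrative values only).
times = [0.0, 1.0, 2.5, 6.0]
cov = serial_covariance(times, variance=2.0, range_=3.0)
```

Because the matrix is built from lags rather than from a fixed grid of occasions, a subject with a skipped visit simply contributes a shorter `times` list, which is the property the abstract refers to when it mentions missingness and imbalanced data.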

Estimation of two-level structural equation models with constraints.

January 1997
by Sin Yu Tsang. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1997. / Includes bibliographical references (leaves 40-42).
Contents:
Chapter 1. Introduction
Chapter 2. Two-level structural equation model
Chapter 3. Estimation of the model under general constraints
Chapter 4. Estimation of the model under linear constraints
Chapter 5. Simulation results
5.1 Artificial examples for "modified" EM algorithm
5.2 Artificial examples for "restricted" EM algorithm
Chapter 6. Discussion and conclusion
References
Tables

Latent models for cross-covariance

Wegelin, Jacob A. January 2001
Thesis (Ph. D.)--University of Washington, 2001. / Vita. Includes bibliographical references (p. 139-145).

Phylogenetic analysis of multiple genes based on spectral methods

Abeysundera, Melanie 28 October 2011
Multiple gene phylogenetic analysis is of interest since single gene analysis often results in poorly resolved trees. Here the use of spectral techniques for analyzing multi-gene data sets is explored. The protein sequences are treated as categorical time series, and a measure of similarity between a pair of sequences, the spectral covariance, is used to build trees. Unlike other methods, the spectral covariance method focuses on the relationship between the sites of genetic sequences. We consider two methods for combining the dissimilarity or distance matrices of multiple genes. The first method involves properly scaling the dissimilarity measures derived from different genes between a pair of species and using the mean of these scaled dissimilarity measures as a summary statistic to measure the taxonomic distances across multiple genes. We introduce two criteria for computing scale coefficients, which can then be used to combine information across genes, namely the minimum variance (MinVar) criterion and the minimum coefficient of variation squared (MinCV) criterion. The scale coefficients obtained with the MinVar and MinCV criteria can then be used to derive a combined-gene tree from the weighted average of the distance or dissimilarity matrices of multiple genes. The second method is based on the singular value decomposition of a matrix made up of the p-vectors of pairwise distances for k genes. By decomposing such a matrix, we extract the common signal present in multiple genes to obtain a single tree representation of the relationship between a given set of taxa. Influence functions for the components of the singular value decomposition are derived to determine which genes are most influential in determining the combined-gene tree.
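The second combination method can be sketched as follows. The sketch assumes the p-vectors of pairwise distances are stacked as the columns of a p-by-k matrix and that the "common signal" is the leading singular component; the abstract does not specify how the decomposition is turned into a tree, so only the extraction step is shown. The function name and the toy data are hypothetical.

```python
import numpy as np

def common_distance_signal(distances):
    """Leading singular component of a (p, k) matrix of pairwise distances.

    Column j holds the p pairwise distances computed from gene j.
    Returns the combined p-vector and the per-gene loadings.
    """
    u, s, vt = np.linalg.svd(distances, full_matrices=False)
    signal = s[0] * u[:, 0]
    loadings = vt[0]
    # Singular vectors are defined up to sign; orient the signal to be positive.
    if signal.sum() < 0:
        signal, loadings = -signal, -loadings
    return signal, loadings

# Toy example: 3 taxa give p = 3 pairwise distances; k = 3 genes that share
# one distance pattern but differ in overall scale.
base = np.array([1.0, 2.0, 3.0])
scales = np.array([1.0, 2.0, 0.5])
distances = np.outer(base, scales)
signal, loadings = common_distance_signal(distances)
```

In this toy case the genes differ only by scale, so the extracted signal is proportional to the shared pattern `base`, and the loadings are proportional to `scales`. With real data the genes would disagree, and the loadings would give a rough indication of how strongly each gene contributes to the combined distances.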
