1 |
Anchored Bayesian Gaussian Mixture Models. Kunkel, Deborah Elizabeth, 25 September 2018.
No description available.
|
2 |
Econometric modelling of nonlinearity and nonstationarity in the foreign exchange market. Hillman, Robert J. T., January 1998.
No description available.
|
3 |
Extensions to the OCLUST Algorithm. Clark, Katharine M, January 2024.
OCLUST is a clustering algorithm that trims outliers in Gaussian mixture models. While mixtures of multivariate Gaussian distributions are a useful way to model heterogeneity in data, it is not always appropriate to assume that the data arise from a finite mixture of Gaussian distributions. This thesis extends the OCLUST algorithm to three types of data that depart from the multivariate Gaussian distribution. The first extension, funOCLUST, is developed for data in functional form. Next, MVN-OCLUST applies outlier trimming to matrix-variate normal data. Finally, the skewOCLUST algorithm is formulated for skewed data by applying a transformation to normality; Chapter 5 takes a brief detour to lay the foundation for this final extension. / Thesis / Doctor of Philosophy (PhD)
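For readers unfamiliar with the trimming idea, the following is a minimal sketch of removing low-density points from a fitted Gaussian mixture (Python with scikit-learn). It shows only the generic trimming loop; the actual OCLUST criterion for identifying outliers, and the functional, matrix-variate, and skewed extensions developed in this thesis, are substantially more involved.

    # Minimal sketch: iteratively drop the lowest-density point from a fitted
    # Gaussian mixture. This is the generic trimming idea, not the OCLUST criterion.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    def trim_outliers(X, n_components=2, n_trim=10):
        """Return the indices kept after trimming the n_trim least likely points."""
        keep = np.arange(len(X))
        for _ in range(n_trim):
            gmm = GaussianMixture(n_components=n_components, random_state=0).fit(X[keep])
            log_dens = gmm.score_samples(X[keep])        # mixture log-density per point
            keep = np.delete(keep, np.argmin(log_dens))  # remove the least likely point
        return keep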
|
4 |
DSP-Based Phrase-Independent Real-Time Speaker Recognition System. Yan, Ming-Xiang, 27 July 2004.
This thesis presents a DSP-based speaker recognition system, covering both the hardware setup and the implementation of the recognition algorithm. To keep the implementation within the DSP's floating-point representation, the algorithm is simplified. The DSP chip is a floating-point device (the ADSP-21161 from the ADI SHARC series), and the speaker recognition algorithm is a Gaussian mixture model. Experimental results show that the DSP implementation achieves good recognition accuracy and speed.
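As a point of reference, a minimal sketch of the textbook GMM speaker-identification scheme is shown below (Python with scikit-learn). The feature extraction front end and the DSP-specific simplifications described in the thesis are not shown, and the speaker set and feature shapes are assumptions.

    # One GMM per enrolled speaker, trained on feature vectors (e.g. MFCC frames);
    # a test utterance is assigned to the speaker whose model scores it highest.
    from sklearn.mixture import GaussianMixture

    def enroll(speaker_features, n_components=16):
        """speaker_features: dict mapping speaker id -> (n_frames, n_dims) array."""
        return {spk: GaussianMixture(n_components=n_components, covariance_type='diag').fit(f)
                for spk, f in speaker_features.items()}

    def identify(models, test_features):
        """Return the speaker whose GMM gives the highest mean frame log-likelihood."""
        return max(models, key=lambda spk: models[spk].score(test_features))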
|
5 |
Evaluation of two types of Differential Item Functioning in factor mixture models with binary outcomes. Lee, Hwa Young, Doctor of Educational Psychology, 22 February 2013.
Differential Item Functioning (DIF) occurs when examinees with the same ability have different probabilities of endorsing an item. Conventional DIF detection methods (e.g., the Mantel-Haenszel test) can detect DIF only across observed groups, such as gender or ethnicity. However, research has found that DIF is typically not fully explained by an observed variable (e.g., Cohen & Bolt, 2005). The true source of DIF may be unobserved, involving variables such as personality, response patterns, or unmeasured background variables.
The Factor Mixture Model (FMM) is designed to detect unobserved sources of heterogeneity in factor structures, and FMMs with binary outcomes have recently been used for assessing DIF (DeMars & Lau, 2011; Jackman, 2010). However, FMMs with binary outcomes have not been thoroughly explored for detecting both types of DIF: between-class latent DIF (LDIF) and class-specific observed DIF (ODIF).
The present simulation study investigated whether models correctly specified in terms of LDIF and/or ODIF outperform models incorrectly specified in terms of either LDIF or ODIF, as judged by model fit indices (AIC, BIC, aBIC, and CAIC) and entropy. In addition, the study examined the recovery of item difficulty parameters and the proportion of replications in which items were correctly or incorrectly identified as displaying DIF, manipulating DIF effect size and latent class probability. For each simulation condition, two latent classes of 27 item responses were generated from a one-parameter logistic model, with item difficulties generated to exhibit DIF across the classes and/or the observed groups.
Results showed that FMMs with binary outcomes performed well in terms of fit indices, entropy, DIF detection, and recovery of large DIF effects. When class probabilities were unequal and DIF effects were small, performance of the fit indices, power, and recovery of DIF effects decreased relative to the equal class probability conditions. Inflated Type I errors were found for DIF-invariant items across simulation conditions. When data were generated with ODIF but the model estimated LDIF, specifying LDIF fully captured the ODIF effects when the DIF effect sizes were large. / text
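A plausible way to write the generating model described above is the following one-parameter logistic model with class- and group-specific difficulty shifts (the notation here is illustrative, not taken from the thesis):

    P(y_{ij} = 1 \mid \theta_i, c, g)
      = \frac{\exp\{\theta_i - (b_j + \delta_{jc} + \gamma_{jg})\}}
             {1 + \exp\{\theta_i - (b_j + \delta_{jc} + \gamma_{jg})\}}

where \theta_i is the ability of examinee i, b_j is the baseline difficulty of item j, \delta_{jc} is a between-class shift capturing LDIF, and \gamma_{jg} is a class-specific, observed-group shift capturing ODIF; both shifts are zero for invariant items.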
|
6 |
Linear clustering with application to single nucleotide polymorphism genotyping. Yan, Guohua.
Single nucleotide polymorphisms (SNPs) have become increasingly popular for a wide range of genetic studies. High-throughput genotyping technologies usually involve a statistical genotype calling algorithm. Most calling algorithms in the literature, using methods such as k-means and mixture models, rely on elliptical structures in the genotyping data; they may fail when the minor allele homozygous cluster is small or absent, or when the data have extreme tails or linear patterns.
We propose an automatic genotype calling algorithm that further develops a linear grouping algorithm (Van Aelst et al., 2006). The proposed algorithm clusters unnormalized data points around lines rather than around centroids. In addition, we associate a quality value, the silhouette width, with each DNA sample and with each whole plate. The algorithm shows promise for genotyping data generated by TaqMan technology (Applied Biosystems). A key feature of the proposed algorithm is that it applies to unnormalized fluorescent signals when the TaqMan SNP assay is used. It could also be adapted to other fluorescence-based SNP genotyping technologies such as the Invader Assay.
Motivated by the SNP genotyping problem, we propose a partial likelihood approach to linear clustering, which explores potential linear clusters in a data set. Instead of fully modelling the data, we assume only that the signed orthogonal distance from each data point to a hyperplane is normally distributed. Its relationships with several existing clustering methods are discussed. Some existing methods for determining the number of components in a data set are adapted to this linear clustering setting. Several simulated and real data sets are analyzed for comparison and illustration. We also investigate some asymptotic properties of the partial likelihood approach.
A Bayesian version of this methodology is helpful when some clusters are sparse but there is strong prior information about their approximate locations or properties. We propose a Bayesian hierarchical approach that is particularly appropriate for identifying sparse linear clusters. We show that the sparse cluster in SNP genotyping data sets can be successfully identified after a careful specification of the prior distributions.
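Under the stated assumption that only the signed orthogonal distance to each hyperplane is modelled, the partial likelihood for K linear clusters can be written as follows (notation introduced here for illustration):

    L(\theta) = \prod_{i=1}^{n} \sum_{k=1}^{K}
                p_k \, \phi\!\left(a_k^{\top} x_i - b_k ;\, 0, \sigma_k^2\right),
    \qquad \|a_k\| = 1

where p_k are mixing proportions, \{x : a_k^{\top} x = b_k\} is the k-th hyperplane, and \phi(\cdot; 0, \sigma_k^2) is the normal density of the signed orthogonal distance from a point to that hyperplane.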
|
7 |
Improving the Error Resilience of G.711.1 Speech Coder with Multiple Description Coding. Alikhanian, Hooman, 02 June 2010.
This thesis devises quantization and source-channel coding schemes to increase the error robustness of the newly standardized ITU-T G.711.1 speech coder. The schemes employ Gaussian mixture model (GMM) based multiple description quantizers (MDQ). The thesis reviews the literature on GMM-based quantization, MDQ, GMM-MDQ design methods, and bit allocation schemes. GMM-MDQs are then designed for the quantization and coding of the MDCT coefficients in the G.711.1 speech coder. The designs are optimized for and tested over packet erasure channels, and their performance is compared with Mohr's multiple description coding (MDC) scheme based on forward error correcting codes. / Thesis (Master, Electrical & Computer Engineering) -- Queen's University, 2010
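To illustrate the multiple description idea in its simplest form, the toy sketch below (Python) splits a block of transform coefficients into two descriptions and falls back to interpolation when one packet is lost. This is only a generic illustration; the GMM-based MDQ designed in the thesis is a far more sophisticated quantizer and is not reproduced here.

    # Toy two-description coder: even-indexed coefficients in one packet,
    # odd-indexed in the other; a lost description is estimated from its neighbours.
    import numpy as np

    def split_descriptions(coeffs):
        return coeffs[0::2], coeffs[1::2]

    def reconstruct(d1, d2=None):
        """Recombine both descriptions, or interpolate the odd slots if d2 is lost."""
        n = 2 * len(d1) if d2 is None else len(d1) + len(d2)
        out = np.zeros(n)
        out[0::2] = d1
        if d2 is not None:
            out[1::2] = d2
        else:
            even = out[0::2]
            out[1::2] = (even + np.append(even[1:], even[-1])) / 2  # neighbour average
        return out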
|
10 |
Non-Gaussian Mixture Model Averaging for Clustering. Zhang, Xu Xuan, January 2017.
The Gaussian mixture model has been used for model-based clustering for decades, and most model-based clustering analyses are based on it. Wei and McNicholas proposed model averaging approaches for Gaussian mixture models, based on a family of 14 Gaussian parsimonious clustering models. In this thesis, we use non-Gaussian mixture models, namely the tEigen family, in our averaging approaches. Rather than selecting a single best model, the thesis studies fitting an averaged model from a set of multivariate t-mixture models. / Thesis / Master of Science (MSc)
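A rough sketch of BIC-weighted averaging of classifications is given below (Python, with Gaussian mixtures standing in because a tEigen implementation is not assumed here). A real implementation, like the one in the thesis, must also match component labels across models, a step omitted in this sketch.

    # Fit several candidate mixture models, weight them by BIC, and average their
    # posterior classification matrices; labels are taken from the averaged matrix.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    def averaged_classification(X, candidates):
        fitted = [m.fit(X) for m in candidates]
        bic = np.array([m.bic(X) for m in fitted])
        w = np.exp(-0.5 * (bic - bic.min()))            # smaller BIC -> larger weight
        w /= w.sum()
        Z = sum(wi * m.predict_proba(X) for wi, m in zip(w, fitted))
        return Z, Z.argmax(axis=1)

    # Example: candidates share G = 3 groups but differ in covariance structure.
    candidates = [GaussianMixture(n_components=3, covariance_type=c)
                  for c in ('full', 'diag', 'spherical')]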
|