1 |
Anchored Bayesian Gaussian Mixture Models. Kunkel, Deborah Elizabeth, 25 September 2018.
No description available.
|
2 |
Econometric modelling of nonlinearity and nonstationarity in the foreign exchange market. Hillman, Robert J. T., January 1998.
No description available.
|
3 |
Extensions to the OCLUST Algorithm. Clark, Katharine M, January 2024.
OCLUST is a clustering algorithm that trims outliers in Gaussian mixture models. While mixtures of multivariate Gaussian distributions are a useful way to model heterogeneity in data, it is not always appropriate to assume that the data arise from a finite mixture of Gaussian distributions. This thesis extends the OCLUST algorithm to three types of data that depart from the multivariate Gaussian distribution. The first extension, funOCLUST, is developed for data in functional form. Next, MVN-OCLUST applies outlier trimming to matrix-variate normal data. Finally, the skewOCLUST algorithm is formulated for skewed data by applying a transformation to normality; Chapter 5 takes a brief detour to lay the foundation for this final extension. / Thesis / Doctor of Philosophy (PhD)
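For readers unfamiliar with the trimming idea, the following is a minimal sketch of removing low-density points from a fitted Gaussian mixture (Python with scikit-learn). It shows only the generic trimming loop; the actual OCLUST criterion for identifying outliers, and the functional, matrix-variate, and skewed extensions developed in this thesis, are substantially more involved.

    # Minimal sketch: iteratively drop the lowest-density point from a fitted
    # Gaussian mixture. This is the generic trimming idea, not the OCLUST criterion.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    def trim_outliers(X, n_components=2, n_trim=10):
        """Return the indices kept after trimming the n_trim least likely points."""
        keep = np.arange(len(X))
        for _ in range(n_trim):
            gmm = GaussianMixture(n_components=n_components, random_state=0).fit(X[keep])
            log_dens = gmm.score_samples(X[keep])        # mixture log-density per point
            keep = np.delete(keep, np.argmin(log_dens))  # remove the least likely point
        return keep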
|
4 |
DSP-Based Phrase-Independent Real-Time Speaker Recognition System. Yan, Ming-Xiang, 27 July 2004.
This thesis presents a DSP-based speaker recognition system, covering both the hardware setup and the implementation of the recognition algorithm. To keep the implementation within the DSP's floating-point representation, the algorithm is simplified. The DSP chip is a floating-point device (the ADSP-21161 from the ADI SHARC series), and the speaker recognition algorithm is a Gaussian mixture model. Experimental results show that the DSP implementation achieves good recognition accuracy and speed.
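As a point of reference, a minimal sketch of the textbook GMM speaker-identification scheme is shown below (Python with scikit-learn). The feature extraction front end and the DSP-specific simplifications described in the thesis are not shown, and the speaker set and feature shapes are assumptions.

    # One GMM per enrolled speaker, trained on feature vectors (e.g. MFCC frames);
    # a test utterance is assigned to the speaker whose model scores it highest.
    from sklearn.mixture import GaussianMixture

    def enroll(speaker_features, n_components=16):
        """speaker_features: dict mapping speaker id -> (n_frames, n_dims) array."""
        return {spk: GaussianMixture(n_components=n_components, covariance_type='diag').fit(f)
                for spk, f in speaker_features.items()}

    def identify(models, test_features):
        """Return the speaker whose GMM gives the highest mean frame log-likelihood."""
        return max(models, key=lambda spk: models[spk].score(test_features))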
|
5 |
Evaluation of two types of Differential Item Functioning in factor mixture models with binary outcomes. Lee, Hwa Young, Doctor of Educational Psychology, 22 February 2013.
Differential Item Functioning (DIF) occurs when examinees with the same ability have different probabilities of endorsing an item. Conventional DIF detection methods (e.g., the Mantel-Haenszel test) can detect DIF only across observed groups, such as gender or ethnicity. However, research has found that DIF is typically not fully explained by an observed variable (e.g., Cohen & Bolt, 2005). The true source of DIF may be unobserved, involving variables such as personality, response patterns, or unmeasured background variables.
The Factor Mixture Model (FMM) is designed to detect unobserved sources of heterogeneity in factor structures, and FMMs with binary outcomes have recently been used for assessing DIF (DeMars & Lau, 2011; Jackman, 2010). However, FMMs with binary outcomes have not been thoroughly explored for detecting both types of DIF: between-class latent DIF (LDIF) and class-specific observed DIF (ODIF).
The present simulation study investigated whether models correctly specified in terms of LDIF and/or ODIF outperform models incorrectly specified in terms of either LDIF or ODIF, as judged by model fit indices (AIC, BIC, aBIC, and CAIC) and entropy. In addition, the study examined the recovery of item difficulty parameters and the proportion of replications in which items were correctly or incorrectly identified as displaying DIF, manipulating DIF effect size and latent class probability. For each simulation condition, two latent classes of 27 item responses were generated from a one-parameter logistic model, with item difficulties generated to exhibit DIF across the classes and/or the observed groups.
Results showed that FMMs with binary outcomes performed well in terms of fit indices, entropy, DIF detection, and recovery of large DIF effects. When class probabilities were unequal and DIF effects were small, performance of the fit indices, power, and recovery of DIF effects decreased relative to the equal class probability conditions. Inflated Type I errors were found for DIF-invariant items across simulation conditions. When data were generated with ODIF but the model estimated LDIF, specifying LDIF fully captured the ODIF effects when the DIF effect sizes were large. / text
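A plausible way to write the generating model described above is the following one-parameter logistic model with class- and group-specific difficulty shifts (the notation here is illustrative, not taken from the thesis):

    P(y_{ij} = 1 \mid \theta_i, c, g)
      = \frac{\exp\{\theta_i - (b_j + \delta_{jc} + \gamma_{jg})\}}
             {1 + \exp\{\theta_i - (b_j + \delta_{jc} + \gamma_{jg})\}}

where \theta_i is the ability of examinee i, b_j is the baseline difficulty of item j, \delta_{jc} is a between-class shift capturing LDIF, and \gamma_{jg} is a class-specific, observed-group shift capturing ODIF; both shifts are zero for invariant items.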
|
6 |
Linear clustering with application to single nucleotide polymorphism genotyping. Yan, Guohua.
Single nucleotide polymorphisms (SNPs) have become increasingly popular for a wide range of genetic studies. High-throughput genotyping technologies usually involve a statistical genotype calling algorithm. Most calling algorithms in the literature, using methods such as k-means and mixture models, rely on elliptical structures in the genotyping data; they may fail when the minor allele homozygous cluster is small or absent, or when the data have extreme tails or linear patterns.
We propose an automatic genotype calling algorithm that further develops a linear grouping algorithm (Van Aelst et al., 2006). The proposed algorithm clusters unnormalized data points around lines rather than around centroids. In addition, we associate a quality value, the silhouette width, with each DNA sample and with each whole plate. The algorithm shows promise for genotyping data generated by TaqMan technology (Applied Biosystems). A key feature of the proposed algorithm is that it applies to unnormalized fluorescent signals when the TaqMan SNP assay is used. It could also be adapted to other fluorescence-based SNP genotyping technologies such as the Invader Assay.
Motivated by the SNP genotyping problem, we propose a partial likelihood approach to linear clustering, which explores potential linear clusters in a data set. Instead of fully modelling the data, we assume only that the signed orthogonal distance from each data point to a hyperplane is normally distributed. Its relationships with several existing clustering methods are discussed. Some existing methods for determining the number of components in a data set are adapted to this linear clustering setting. Several simulated and real data sets are analyzed for comparison and illustration. We also investigate some asymptotic properties of the partial likelihood approach.
A Bayesian version of this methodology is helpful when some clusters are sparse but there is strong prior information about their approximate locations or properties. We propose a Bayesian hierarchical approach that is particularly appropriate for identifying sparse linear clusters. We show that the sparse cluster in SNP genotyping data sets can be successfully identified after a careful specification of the prior distributions.
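Under the stated assumption that only the signed orthogonal distance to each hyperplane is modelled, the partial likelihood for K linear clusters can be written as follows (notation introduced here for illustration):

    L(\theta) = \prod_{i=1}^{n} \sum_{k=1}^{K}
                p_k \, \phi\!\left(a_k^{\top} x_i - b_k ;\, 0, \sigma_k^2\right),
    \qquad \|a_k\| = 1

where p_k are mixing proportions, \{x : a_k^{\top} x = b_k\} is the k-th hyperplane, and \phi(\cdot; 0, \sigma_k^2) is the normal density of the signed orthogonal distance from a point to that hyperplane.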
|
7 |
Improving the Error Resilience of G.711.1 Speech Coder with Multiple Description Coding. Alikhanian, Hooman, 02 June 2010.
This thesis devises quantization and source-channel coding schemes to increase the error robustness of the newly standardized ITU-T G.711.1 speech coder. The schemes employ Gaussian mixture model (GMM) based multiple description quantizers (MDQ). The thesis reviews the literature on GMM-based quantization, MDQ, GMM-MDQ design methods, and bit allocation schemes. GMM-MDQs are then designed for the quantization and coding of the MDCT coefficients in the G.711.1 speech coder. The designs are optimized for and tested over packet erasure channels, and their performance is compared with Mohr's multiple description coding (MDC) scheme based on forward error correcting codes. / Thesis (Master, Electrical & Computer Engineering) -- Queen's University, 2010
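To illustrate the multiple description idea in its simplest form, the toy sketch below (Python) splits a block of transform coefficients into two descriptions and falls back to interpolation when one packet is lost. This is only a generic illustration; the GMM-based MDQ designed in the thesis is a far more sophisticated quantizer and is not reproduced here.

    # Toy two-description coder: even-indexed coefficients in one packet,
    # odd-indexed in the other; a lost description is estimated from its neighbours.
    import numpy as np

    def split_descriptions(coeffs):
        return coeffs[0::2], coeffs[1::2]

    def reconstruct(d1, d2=None):
        """Recombine both descriptions, or interpolate the odd slots if d2 is lost."""
        n = 2 * len(d1) if d2 is None else len(d1) + len(d2)
        out = np.zeros(n)
        out[0::2] = d1
        if d2 is not None:
            out[1::2] = d2
        else:
            even = out[0::2]
            out[1::2] = (even + np.append(even[1:], even[-1])) / 2  # neighbour average
        return out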
|
10 |
Non-Gaussian Mixture Model Averaging for Clustering. Zhang, Xu Xuan, January 2017.
The Gaussian mixture model has been used for model-based clustering for decades, and most model-based clustering analyses are based on it. Wei and McNicholas proposed model averaging approaches for Gaussian mixture models, based on a family of 14 Gaussian parsimonious clustering models. In this thesis, we use non-Gaussian mixture models, namely the tEigen family, in our averaging approaches. Rather than selecting a single best model, the thesis studies fitting an averaged model from a set of multivariate t-mixture models. / Thesis / Master of Science (MSc)
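A rough sketch of BIC-weighted averaging of classifications is given below (Python, with Gaussian mixtures standing in because a tEigen implementation is not assumed here). A real implementation, like the one in the thesis, must also match component labels across models, a step omitted in this sketch.

    # Fit several candidate mixture models, weight them by BIC, and average their
    # posterior classification matrices; labels are taken from the averaged matrix.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    def averaged_classification(X, candidates):
        fitted = [m.fit(X) for m in candidates]
        bic = np.array([m.bic(X) for m in fitted])
        w = np.exp(-0.5 * (bic - bic.min()))            # smaller BIC -> larger weight
        w /= w.sum()
        Z = sum(wi * m.predict_proba(X) for wi, m in zip(w, fitted))
        return Z, Z.argmax(axis=1)

    # Example: candidates share G = 3 groups but differ in covariance structure.
    candidates = [GaussianMixture(n_components=3, covariance_type=c)
                  for c in ('full', 'diag', 'spherical')]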
|