  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Automatic architecture selection for probability density function estimation in computer vision

Sadeghi, Mohammad T. January 2002 (has links)
In this thesis, the problem of probability density function estimation using finite mixture models is considered. Gaussian mixture modelling is used to provide a semi-parametric density estimate for a given data set. The fundamental problem with this approach is that the number of mixture components required to adequately describe the data is not known in advance. In this work, a predictive validation technique [91] is studied and developed into a useful, operational tool that automatically selects the number of components for Gaussian mixture models. The predictive validation test approves a candidate model if, for the set of events it tries to predict, the predicted frequencies derived from the model match the empirical ones derived from the data set. A model selection algorithm based on the validation test is developed which avoids both over-fitting and under-fitting. We investigate the influence of the various parameters in the model selection method in order to develop it into a robust operational tool. The capability of the proposed method in real-world applications is examined on the problem of face image segmentation for automatic initialisation of lip tracking systems. A segmentation approach is proposed which is based on Gaussian mixture modelling of the pixels' RGB values using the predictive validation technique. The lip region segmentation is based on the estimated model: first a grouping of the model components is performed using a novel approach, and the resulting groups then form the basis of a Bayesian decision-making system which labels the pixels in the mouth area as lip or non-lip. The experimental results demonstrate the superiority of the method over conventional clustering approaches. To improve the method computationally, an image sampling technique based on Sobol sequences is applied. The image modelling process is also strengthened by incorporating spatial contextual information using two different methods: a Neighbourhood Expectation Maximisation technique and a spatial clustering method based on a Gibbs/Markov random field modelling approach. Both methods are developed within the proposed modelling framework. The results obtained on the lip segmentation application suggest that spatial context is beneficial.
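
As an illustration of the idea, a minimal sketch follows that fits mixtures of increasing size and accepts the smallest one whose predicted bin frequencies match the empirical histogram. The specific predictive validation test of [91] is not reproduced; the chi-square acceptance rule, binning, and significance level are assumptions made for the example.

```python
# A minimal sketch of predictive-validation-style model selection for a
# 1-D Gaussian mixture. Assumption: adequacy is judged by a chi-square
# comparison of model-predicted vs. empirical bin frequencies.
import numpy as np
from scipy import stats
from sklearn.mixture import GaussianMixture

def select_components(x, k_max=10, n_bins=20, alpha=0.05):
    x = x.reshape(-1, 1)
    counts, edges = np.histogram(x, bins=n_bins)
    for k in range(1, k_max + 1):
        gmm = GaussianMixture(n_components=k, random_state=0).fit(x)
        # Predicted frequency per bin: P(bin) = CDF(right) - CDF(left),
        # accumulated over the mixture components.
        cdf = lambda t: sum(
            w * stats.norm.cdf(t, m, np.sqrt(v))
            for w, m, v in zip(gmm.weights_,
                               gmm.means_.ravel(),
                               gmm.covariances_.ravel()))
        expected = np.diff([cdf(e) for e in edges]) * len(x)
        chi2 = np.sum((counts - expected) ** 2 / np.maximum(expected, 1e-9))
        dof = n_bins - 1 - (3 * k - 1)   # free parameters of a 1-D GMM
        if dof > 0 and chi2 < stats.chi2.ppf(1 - alpha, dof):
            return k, gmm                # smallest adequate model
    return k_max, gmm
```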
2

DSP-Based Phrase-Independent Real-Time Speaker Recognition System

Yan, Ming-Xiang 27 July 2004 (has links)
This thesis presents a DSP-based speaker recognition system. To keep the implementation modular within a floating-point representation, we simplify the algorithm. The system covers both the hardware setup and the implementation of the recognition algorithm. The DSP chip is a floating-point DSP (an ADSP-21161 from the ADI SHARC series), and the speaker recognition algorithm is the Gaussian mixture model. Experimental results show that the DSP-based speaker recognition achieves good recognition accuracy and speed.
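
For orientation, a minimal sketch of GMM-based speaker recognition follows, assuming MFCC feature matrices (frames by coefficients) have already been extracted; the DSP-side simplifications described in the thesis are not modelled.

```python
# A minimal sketch of GMM-based speaker identification: one GMM per
# enrolled speaker, scored by average per-frame log-likelihood.
import numpy as np
from sklearn.mixture import GaussianMixture

def train_speaker_models(features_by_speaker, n_components=16):
    """Fit one diagonal-covariance GMM per enrolled speaker."""
    return {name: GaussianMixture(n_components=n_components,
                                  covariance_type='diag',
                                  random_state=0).fit(f)
            for name, f in features_by_speaker.items()}

def identify(models, test_features):
    """Return the speaker whose model gives the highest average
    per-frame log-likelihood for the test utterance."""
    return max(models, key=lambda s: models[s].score(test_features))
```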
3

Birthweight-specific neonatal health : With application to data from a tertiary hospital in Tanzania

Dahlqwist, Elisabeth January 2014 (has links)
The following study analyzes birthweight-specific neonatal health using a combination of a mixture model and logistic regression: the extended Parametric Mixture of Logistic Regression. The data are collected from the Obstetric database at Muhimbili National Hospital in Dar es Salaam, Tanzania, and the years 2009-2013 are used in the analysis. Because the birthweight data are rounded, a novel method to adjust for rounding when estimating a mixture model is applied, and the influence of rounding on the estimates is then investigated. A three-component model is selected. The variables used in the analysis of neonatal health are early neonatal mortality, whether the mother has HIV or anaemia, whether she is a private patient, and whether the neonate is born after 36 completed weeks of gestation. It can be concluded that mortality rates are high, especially for low birthweights (2000 g or less), in the estimated first and second components. However, due to wide confidence bounds it is hard to draw conclusions from the data.
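
A minimal sketch of the mixture-of-logistic-regressions idea follows: a three-component Gaussian mixture is fitted to birthweight, and a mortality logistic regression is fitted per component, weighted by the posterior component memberships. The thesis's rounding adjustment is not reproduced, and the responsibility-weighted fitting scheme here is an assumption for illustration.

```python
# A minimal sketch: mixture components over birthweight, one weighted
# logistic regression for early neonatal mortality per component.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.mixture import GaussianMixture

def fit_mixture_of_logits(birthweight, X, died, n_components=3):
    gmm = GaussianMixture(n_components=n_components, random_state=0)
    gmm.fit(birthweight.reshape(-1, 1))
    resp = gmm.predict_proba(birthweight.reshape(-1, 1))  # memberships
    logits = [LogisticRegression().fit(X, died, sample_weight=resp[:, k])
              for k in range(n_components)]
    return gmm, logits

def mortality_rate(gmm, logits, w, x):
    """P(death | w, x) = sum_k P(component k | w) * p_k(death | x)."""
    resp = gmm.predict_proba(np.array([[w]])).ravel()
    return sum(r * m.predict_proba(x.reshape(1, -1))[0, 1]
               for r, m in zip(resp, logits))
```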
4

The wild bootstrap resampling in regression imputation algorithm with a Gaussian Mixture Model

Mat Jasin, A., Neagu, Daniel, Csenki, Attila 08 July 2018 (has links)
Unsupervised learning of a finite Gaussian mixture model (FGMM) is used to learn the distribution of population data. This paper proposes using wild bootstrapping to create variability in the imputed data in single missing-data imputation. We compare the performance and accuracy of the proposed method in single imputation against multiple imputation from the R package Amelia II, using RMSE, R-squared, MAE and MAPE. The proposed method shows better performance when compared with multiple imputation (MI), which is widely regarded as the gold standard among missing-data imputation techniques.
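
As a sketch of the wild bootstrap step, assuming a single continuous variable y with missing entries and fully observed covariates X; the FGMM fitting stage of the paper is omitted for brevity.

```python
# A minimal sketch of wild-bootstrap regression imputation: imputed
# values are the regression prediction plus a resampled residual
# perturbed by a Rademacher weight, so they carry residual variability.
import numpy as np

def wild_bootstrap_impute(X, y, rng=np.random.default_rng(0)):
    obs = ~np.isnan(y)
    Xd = np.column_stack([np.ones(len(X)), X])        # add intercept
    beta, *_ = np.linalg.lstsq(Xd[obs], y[obs], rcond=None)
    resid = y[obs] - Xd[obs] @ beta
    n_mis = (~obs).sum()
    e = rng.choice(resid, size=n_mis)                 # resampled residuals
    v = rng.choice([-1.0, 1.0], size=n_mis)           # Rademacher weights
    y_imp = y.copy()
    y_imp[~obs] = Xd[~obs] @ beta + v * e
    return y_imp
```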
5

Security in Voice Authentication

Yang, Chenguang 27 March 2014 (has links)
We evaluate the security of human voice password databases from an information-theoretical point of view. More specifically, we provide a theoretical estimate of the amount of entropy in human voice when processed using conventional GMM-UBM technologies with MFCCs as the acoustic features. The theoretical estimate gives rise to a methodology for analyzing the security level of a corpus of human voice: given a database containing speech signals, we provide a method for estimating the relative entropy (Kullback-Leibler divergence) of the database, thereby establishing the security level of the speaker verification system. To demonstrate this, we analyze the YOHO database, a corpus of voice samples collected from 138 speakers, and show that the amount of entropy extracted is less than 14 bits. We also present a practical attack that succeeds in impersonating the voice of any speaker within the corpus with 98% probability in as few as 9 trials; the attack still succeeds at a rate of 62.50% if only 4 attempts are permitted. Further, based on the same attack rationale, we mount an attack on the ALIZE speaker verification system and show through experimentation that the attacker can impersonate any user in a database of 69 people with a success rate of about 25% in only 5 trials, rising above 50% when the allowed authentication attempts are increased to 20. Finally, when the practical attack is cast in terms of an entropy metric, we find that the theoretical entropy estimate almost perfectly predicts the success rate of the practical attack, giving further credence to the theoretical model and the associated entropy estimation technique.
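
A minimal sketch of the underlying quantity follows: the relative entropy between a speaker model and a universal background model (UBM), estimated by Monte Carlo. The paper's full entropy estimation methodology is not reproduced.

```python
# KL(speaker || UBM) ~ E_speaker[log p_speaker(x) - log p_ubm(x)],
# estimated by sampling from the speaker model; result in bits.
import numpy as np
from sklearn.mixture import GaussianMixture

def kl_speaker_vs_ubm(speaker_gmm, ubm, n_samples=100_000):
    x, _ = speaker_gmm.sample(n_samples)
    kl_nats = np.mean(speaker_gmm.score_samples(x) - ubm.score_samples(x))
    return kl_nats / np.log(2)   # convert nats to bits
```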
6

Robust feature extractions from geometric data using geometric algebra

Minh Tuan, Pham, Yoshikawa, Tomohiro, Furuhashi, Takeshi, Tachibana, Kaita 11 October 2009 (has links)
No description available.
7

Speaker and Emotion Recognition System Based on the Gaussian Mixture Model

Wang, Jhong-yi 01 August 2006 (has links)
In this thesis, a speaker and emotion recognition system is built from a PC and a digital signal processor (DSP). Most speaker recognition and emotion recognition systems are implemented separately rather than combined; this thesis shows how the two are combined into a single system. The voice is picked up by a microphone and passed through the DSP to extract its features; once a sample is matched correctly, the recognition result can be produced. The recognition system is divided into four sub-systems: pronunciation pre-processing, speaker model training, speaker and emotion recognition, and speaker verification. The pronunciation pre-processing stage captures the voice with the microphone, conveys it through the DSP board to SRAM, and then applies the pre-processing operations. The speaker model training stage uses the Gaussian mixture model to establish the mean, variance and weight values for each enrolled speaker; this information forms the reference data for the whole recognition system. Speaker recognition uses the probability density to identify the speaker, while emotion recognition exploits the variance coefficients to recognize the emotion. Speaker verification confirms whether the user matches a speaker in the system database. The DSP-based recognition system comprises two parts: the hardware setup and the implementation of the speaker algorithm. We use a fixed-point DSP chip, and the recognition algorithm is the Gaussian mixture model. Compared with a floating-point device, a fixed-point DSP costs much less, which brings the system closer to users.
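
As a side illustration of the fixed-point point above, the sketch below quantizes values to Q15 format (16 bits, 15 fractional bits) and performs a fixed-point multiply, roughly as a fixed-point DSP would handle mixture weights or likelihood values; the thesis's actual DSP implementation is not reproduced.

```python
# A minimal sketch of Q15 fixed-point arithmetic (an assumption for
# illustration; the thesis's number format is not specified here).
import numpy as np

Q = 15  # fractional bits

def to_q15(x):
    """Quantize floats in [-1, 1) to 16-bit Q15 integers."""
    return np.clip(np.round(np.asarray(x) * (1 << Q)),
                   -(1 << Q), (1 << Q) - 1).astype(np.int16)

def q15_mul(a, b):
    """Fixed-point multiply with rounding: (a * b + 2^14) >> 15."""
    prod = a.astype(np.int32) * b.astype(np.int32)
    return ((prod + (1 << (Q - 1))) >> Q).astype(np.int16)

# Example: a mixture weight times a likelihood value, both in Q15.
w = to_q15(0.25)        # 8192
p = to_q15(0.5)         # 16384
print(q15_mul(w, p))    # 4096 == to_q15(0.125)
```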
8

GMMEDA: A demonstration of probabilistic modeling in continuous metaheuristic optimization using mixture models

Naveen Kumar Unknown Date (has links)
Optimization problems are common throughout science, engineering and commerce. The desire to continually improve solutions and to solve larger, more complex problems has kept this field of research prominent for several decades and has led to the development of a range of optimization algorithms for different classes of problems. Estimation of Distribution Algorithms (EDAs) are a relatively recent class of metaheuristic optimization algorithms that use probabilistic modeling techniques to control the search process. Within the general EDA framework, a number of different probabilistic models have previously been proposed for both discrete and continuous optimization problems. This thesis focuses on GMMEDAs: continuous EDAs based on Gaussian Mixture Models (GMMs) with parameter estimation performed by the Expectation Maximization (EM) algorithm. To date, this type of model has received only limited attention in the literature; there are few prior experimental studies of the algorithms, and a number of implementation details of continuous iterated density estimation algorithms based on Gaussian mixture models have not previously been documented. This thesis provides a clear description of the GMMEDAs, discusses the implementation decisions and details, and presents an experimental study evaluating the performance of the algorithms. The effectiveness of the GMMEDAs with varying model complexity (structure of the covariance matrices and number of components) was tested on five benchmark functions (Sphere, Rastrigin, Griewank, Ackley and Rosenbrock) of varying dimensionality (2-, 10- and 30-D). The effect of the selection pressure parameter was also studied. The 2-D results show that a variant of moderate complexity (Diagonal GMMEDA) was able to optimize both unimodal and multimodal functions. Analysis of the 10- and 30-D results indicates that the simplest variant (Spherical GMMEDA) was the most effective of the three, although the most complex variant (Full GMMEDA) achieved greater consistency across these functions. Comparison of the results on four artificial test functions (Sphere, Griewank, Ackley and Rosenbrock) showed that the GMMEDA variants optimized most of the complex functions better than existing continuous EDAs, owing to the ability of the GMM components to model the functions effectively. The analysis further showed that both the number of components and the selection pressure affect the optimum value found on the artificial test functions. Convergence of the GMMEDA variants to a function's best local optimum was driven more by the complexity of the GMM components: while the model complexity due to the number of components grows alongside that due to the structure of the covariance matrices, when optimizing complex functions the complexity contributed by the covariance structure dominates that contributed by the number of components. Additionally, for most functions the effect of the number of components on convergence diminishes as the selection pressure increases; these effects appear in the results as differences in the stability of the outcomes for each function.
Other factors that affect convergence to the local optima are the initialization of the GMM parameters, the number of EM iterations, and the reset condition. Although not graphically visible in the 10-D optimization, different initializations of the GMM parameters in 2-D changed the optimum values found. Since population initialization in evolutionary algorithms is known to affect convergence to the global optimum, the similar effects observed for GMM parameter initialization on the 2-D functions suggest that convergence in 10-D, and in turn the optimum values of the respective functions, could likewise be affected. The estimates of the covariances and means over the EM iterations in 2-D indicated that some functions required more EM iterations to reach their optimum values, suggesting that too few EM iterations could impair the fitting of the components to the selected population in 10-D and hence the effective modeling of functions of varying complexity. Finally, the reset condition was observed, in 2-D, to reset the covariances and the best individual fitness in each generation, which necessarily affects convergence of the GMMEDA variants to a function's best local optimum: frequent invocation of the reset condition could return the component covariances to their initial values and so degrade the fit of the model to the selected fraction of the population. Taking all of these effects together, the results indicate that a smaller number of components and a smaller selected-population percentage, used with the simpler S-GMMEDA, modeled most functions of varying complexity well.
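
For orientation, a minimal sketch of the GMMEDA search loop described above follows: truncation selection, an EM-fitted GMM over the selected individuals, and resampling of the next population. The covariance_type argument mirrors the Spherical/Diagonal/Full variants; the population size, selection fraction, and the omission of the reset condition are assumptions of the sketch.

```python
# A minimal sketch of a GMMEDA generation loop (minimization).
import numpy as np
from sklearn.mixture import GaussianMixture

def gmmeda(f, dim, bounds, pop_size=500, tau=0.3, n_components=3,
           covariance_type='diag', generations=100, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    best_x, best_f = None, np.inf
    for _ in range(generations):
        fit = np.apply_along_axis(f, 1, pop)
        order = np.argsort(fit)
        if fit[order[0]] < best_f:
            best_f, best_x = fit[order[0]], pop[order[0]].copy()
        selected = pop[order[:int(tau * pop_size)]]   # truncation selection
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type=covariance_type,
                              random_state=seed).fit(selected)
        pop, _ = gmm.sample(pop_size)                 # next population
        pop = np.clip(pop, lo, hi)
    return best_x, best_f

# e.g. minimizing the Sphere function in 10-D:
# gmmeda(lambda x: np.sum(x**2), dim=10, bounds=(-5.0, 5.0))
```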
9

Some non-standard statistical dependence problems

Bere, Alphonce January 2016 (has links)
The major result of this thesis is the development of a framework for applying pair-mixtures of copulas to model asymmetric dependencies in bivariate data. The main motivation is the inadequacy of the mixtures of bivariate Gaussian models that are commonly fitted to data. Mixtures of rotated single-parameter Archimedean and Gaussian copulas are fitted to real data sets, with parameters estimated by maximum likelihood. Goodness-of-fit tests performed on the models with the highest log-likelihood values show that the models fit the data well. We use mixtures of univariate Gaussian models and mixtures of regression models to investigate the existence of bimodality in the distribution of the widths of autocorrelation functions in a sample of 119 gamma-ray bursts; contrary to previous findings, our results do not reveal any evidence of bimodality. We also extend a study by Genest et al. (2012) of the power and significance levels of tests of copula symmetry to two copula models which have not been considered previously. Our results confirm that for small sample sizes these tests fail to maintain their 5% significance level and that the Cramér-von Mises-type statistics are the most powerful.
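
A minimal sketch of fitting a two-component pair-mixture of copulas by maximum likelihood follows, using the Clayton copula and its 180-degree rotation (survival Clayton) as an illustrative pair; the particular rotated Archimedean and Gaussian mixtures fitted in the thesis are not reproduced. Inputs (u, v) are pseudo-observations in (0, 1).

```python
# A minimal sketch: MLE for a mixture of a Clayton copula (lower-tail
# dependence) and its 180-degree rotation (upper-tail dependence).
import numpy as np
from scipy.optimize import minimize

def clayton_pdf(u, v, theta):
    a = u ** -theta + v ** -theta - 1.0
    return (1 + theta) * (u * v) ** (-theta - 1) * a ** (-2 - 1 / theta)

def neg_loglik(params, u, v):
    t1, t2, w = params
    dens = (w * clayton_pdf(u, v, t1)
            + (1 - w) * clayton_pdf(1 - u, 1 - v, t2))  # 180-deg rotation
    return -np.sum(np.log(np.maximum(dens, 1e-300)))

def fit_pair_mixture(u, v):
    res = minimize(neg_loglik, x0=[1.0, 1.0, 0.5], args=(u, v),
                   bounds=[(0.05, 20), (0.05, 20), (0.01, 0.99)],
                   method='L-BFGS-B')
    return res.x   # (theta_lower_tail, theta_upper_tail, mixture weight)
```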
10

Embedded Feature Selection for Model-based Clustering

January 2020 (has links)
Model-based clustering is a sub-field of statistical modeling and machine learning. Mixture models use probabilities to describe the degree to which a data point belongs to each cluster, and these probabilities are updated iteratively during clustering. While mixture models have demonstrated superior performance in handling noisy data in many fields, high-dimensional datasets pose challenges: among a large number of features, some may not actually contribute to delineating the cluster profiles, and including these "noisy" features obscures the real structure of the clusters and costs more computation time. Recognizing this issue, in this dissertation I propose a new feature selection algorithm, first for continuous data and then extended to mixed data types; finally, I conduct uncertainty quantification for the feature selection results as a third topic. The first topic is an embedded feature selection algorithm termed Expectation-Selection-Maximization (ESM) that automatically selects features while optimizing the parameters of a Gaussian Mixture Model. I introduce a relevancy index (RI) revealing each feature's contribution to the clustering process to assist feature selection, and demonstrate the efficacy of the ESM on two synthetic datasets, four benchmark datasets, and an Alzheimer's Disease dataset. The second topic extends the ESM algorithm to handle mixed data types: the Gaussian mixture model is generalized to the Generalized Model of Mixture (GMoM), which can handle not only continuous features but also binary and nominal ones. The last topic concerns uncertainty quantification (UQ) of the feature selection. A new algorithm termed ESOM is proposed, which takes variance information into consideration while conducting feature selection; a set of outliers is also generated in the feature selection process to infer the uncertainty in the input data. Finally, the selected features and detected outlier instances are evaluated by visual comparison. (Doctoral Dissertation, Industrial Engineering, 2020)
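
A minimal sketch of an embedded relevancy score follows. The dissertation's relevancy index (RI) is not reproduced; as an illustrative stand-in, each feature is scored by the spread of component means relative to the within-component standard deviation, so features with near-identical means across components (likely "noisy" for clustering) score near zero.

```python
# A minimal sketch of a per-feature relevance score from a fitted GMM
# (an illustrative stand-in, not the dissertation's RI definition).
import numpy as np
from sklearn.mixture import GaussianMixture

def feature_relevance(X, n_components=3):
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type='diag', random_state=0).fit(X)
    means = gmm.means_                                  # shape (k, d)
    within_sd = np.sqrt(np.average(gmm.covariances_,    # shape (k, d)
                                   weights=gmm.weights_, axis=0))
    between = means.max(axis=0) - means.min(axis=0)     # mean spread
    return between / within_sd                          # higher = relevant

# Features could then be ranked and low-scoring ones dropped before
# re-fitting the mixture, echoing the select-while-fitting idea.
```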
