101 |
Development in Normal Mixture and Mixture of Experts Modeling
Qi, Meng 01 January 2016
In this dissertation, we first consider the problem of testing homogeneity and order in a contaminated normal model when the data are correlated under a known covariance structure. To address this problem, we develop a moment-based homogeneity and order test and design weights for the test statistics to increase the power of the homogeneity test. We apply the test to microarray data on Down’s syndrome. The dissertation also studies a singular Bayesian information criterion (sBIC) for a bivariate hierarchical mixture model with varying weights and develops a new data-dependent information criterion (sFLIC). Both criteria are intended to select model complexity from data, and we apply the model and the criteria to birth-weight and gestational-age data.
|
102 |
Balance-guaranteed optimized tree with reject option for live fish recognition
Huang, Xuan January 2014
This thesis investigates live fish recognition, a computer vision application needed when underwater videos are too numerous for manual annotation to be affordable. Such a system can assist ecological surveillance research, e.g. computing fish population statistics in the open sea. Pre-processing procedures are employed to improve recognition accuracy, and then 69 types of features are extracted. These features combine colour, shape and texture properties of different parts of the fish, such as the tail, head, top and bottom, as well as of the whole fish. We then present a novel Balance-Guaranteed Optimized Tree with Reject option (BGOTR) for live fish recognition. It improves on the normal hierarchical method by arranging more accurate classifications at a higher level and keeping the hierarchical tree balanced. BGOTR is constructed automatically from inter-class similarities. After the hierarchical classification, we apply a Gaussian Mixture Model (GMM) and Bayes rule as a reject option: the posterior probability of a fish belonging to a given species is evaluated, and less confident decisions are filtered out. This novel classification-rejection method cleans up decisions and rejects unknown classes. After the tree architecture is constructed, a novel trajectory voting method is used to eliminate errors accumulated during hierarchical classification and therefore achieves better performance. The proposed BGOTR-based hierarchical classification method is applied to recognize the 15 major species in 24150 manually labelled fish images and to detect new species in an unrestricted natural environment recorded by underwater cameras in the south Taiwan sea. It achieves significant improvements compared to state-of-the-art techniques. Furthermore, the order in which feature selection and the construction of a multi-class SVM are performed is investigated. We propose that an Individual Feature Selection (IFS) procedure can be applied directly to the binary One-versus-One SVMs before assembling the full multiclass SVM. The IFS method selects a different subset of features for each One-versus-One SVM inside the multiclass classifier, so that each vote is optimized to discriminate between its two specific classes. The proposed IFS method is tested on four different datasets, comparing performance and time cost. Experimental results demonstrate significant improvements over the normal Multiclass Feature Selection (MFS) method on all datasets.
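The reject option described in this abstract (a GMM per class, combined through Bayes rule, with low-posterior decisions discarded) can be sketched roughly as follows. This is a minimal illustration and not the thesis implementation: it assumes a single scalar feature, and the function names, the threshold and every number in the example models are hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def gmm_pdf(x, components):
    """Density of a mixture given as (weight, mean, std) triples."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)

def classify_with_reject(x, class_models, priors, threshold=0.9):
    """Return (label, posterior); label is None when the decision is rejected."""
    # Bayes rule: posterior is proportional to prior times class likelihood.
    joint = {c: priors[c] * gmm_pdf(x, comps) for c, comps in class_models.items()}
    evidence = sum(joint.values())
    if evidence == 0.0:
        return None, 0.0
    best = max(joint, key=joint.get)
    posterior = joint[best] / evidence
    return (best if posterior >= threshold else None), posterior

# Hypothetical one-feature models for two species (all numbers are made up).
models = {
    "species_a": [(0.6, 0.0, 1.0), (0.4, 1.0, 0.5)],
    "species_b": [(1.0, 5.0, 1.0)],
}
priors = {"species_a": 0.5, "species_b": 0.5}

confident = classify_with_reject(0.0, models, priors)
ambiguous = classify_with_reject(2.6, models, priors)
```

A value deep inside one class keeps its label, while a value between the two classes falls below the threshold and is rejected. A posterior threshold alone only filters ambiguous decisions; how the thesis handles samples far from every known class is not specified in the abstract.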
|
103 |
A Finite Element Model for Mixed Porohyperelasticity with Transport, Swelling, and Growth
Armstrong, Michelle Hine; Buganza Tepole, Adrián; Kuhl, Ellen; Simon, Bruce R.; Vande Geest, Jonathan P. 14 April 2016
The purpose of this manuscript is to establish a unified theory of porohyperelasticity with transport and growth and to demonstrate the capability of this theory using a finite element model developed in MATLAB. We combine the theories of volumetric growth and mixed porohyperelasticity with transport and swelling (MPHETS) to derive a new method that models growth of biological soft tissues. The conservation equations and constitutive equations are developed for both solid-only growth and solid/fluid growth. An axisymmetric finite element framework is introduced for the new theory of growing MPHETS (GMPHETS). To illustrate the capabilities of this model, several example finite element test problems are considered using model geometry and material parameters based on experimental data from a porcine coronary artery. Multiple growth laws are considered, including time-driven, concentration-driven, and stress-driven growth. Time-driven growth is compared against an exact analytical solution to validate the model. For concentration-dependent growth, changing the diffusivity (representing a change in drug) fundamentally changes growth behavior. We further demonstrate that for stress-dependent, solid-only growth of an artery, growth of an MPHETS model results in a more uniform hoop stress than growth in a hyperelastic model for the same amount of growth time using the same growth law. This may have implications in the context of developing residual stresses in soft tissues under intraluminal pressure. To our knowledge, this manuscript provides the first full description of an MPHETS model with growth. The developed computational framework can be used in concert with novel in-vitro and in-vivo experimental approaches to identify the governing growth laws for various soft tissues.
|
104 |
Generalised density function estimation using moments and the characteristic function
Esterhuizen, Gerhard 03 1900
139 leaves printed single pages, preliminary pages i-xi and numbered pages 1-127. Includes bibliography and a list of figures and tables. Digitized at 600 dpi grayscale to pdf format (OCR), using a Bizhub 250 Konica Minolta Scanner. / Thesis (MScEng (Electrical and Electronic Engineering))--University of Stellenbosch, 2003. / ENGLISH ABSTRACT: Probability density functions (PDFs) and cumulative distribution functions (CDFs) play a central role in statistical pattern recognition and verification systems. They allow observations that do not occur according to deterministic rules to be quantified and modelled. An example of such observations would be the voice patterns of a person that are used as input to a biometric security device.
In order to model such non-deterministic observations, a density function estimator is employed to estimate a PDF or CDF from sample data. Although numerous density function estimation techniques exist, all of them can be classified into one of two groups, parametric and non-parametric, each with its own characteristic advantages and disadvantages.
In this research, we introduce a novel approach to density function estimation that attempts to combine some of the advantages of both the parametric and non-parametric estimators. This is done by considering density estimation using an abstract approach in which the density function is modelled entirely in terms of its moments or characteristic function. New density function estimation techniques are first developed in theory, after which a number of practical density function estimators are presented. These estimators are able to estimate a wide variety of density functions and are not designed to represent only certain families of density functions optimally.
Experiments are performed in which the performance of the new estimators is compared to two established estimators, namely the Parzen estimator and the Gaussian mixture model (GMM). The comparison is performed in terms of the accuracy, computational requirements and ease of use of the estimators, and it is found that the new estimators do combine some of the advantages of the established estimators without the corresponding disadvantages.
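For orientation, the Parzen estimator named in this abstract as one of the two established baselines can be sketched in a few lines. This is only a one-dimensional illustration with a Gaussian kernel; it does not reproduce the moment-based or characteristic-function estimators developed in the thesis, and the sample values and bandwidth are made up.

```python
import math

def parzen_estimate(x, samples, bandwidth):
    """Parzen (kernel) density estimate at x using a Gaussian kernel."""
    if not samples:
        raise ValueError("at least one sample is required")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)
    # One Gaussian bump is centred on every sample; the estimate is their average.
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

samples = [-1.2, -0.4, 0.0, 0.3, 0.9, 1.5]  # made-up data
step = 0.01
grid = [-8.0 + i * step for i in range(1601)]
# Riemann sum over a wide grid; a valid density should integrate to about 1.
area = sum(parzen_estimate(g, samples, 0.5) for g in grid) * step
```

The bandwidth is the tuning parameter of this estimator: small values give a spiky estimate, large values oversmooth it.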
|
105 |
Probabilistic Models for Species Tree Inference and Orthology Analysis
Ullah, Ikram January 2015
A phylogenetic tree is used to model gene evolution and species evolution using molecular sequence data. For artifactual and biological reasons, a gene tree may differ from a species tree, a phenomenon known as gene tree-species tree incongruence. Assuming the presence of one or more evolutionary events, e.g., gene duplication, gene loss, and lateral gene transfer (LGT), the incongruence may be explained using a reconciliation of a gene tree inside a species tree. Such information has biological uses, e.g., inference of orthologous relationships between genes. In this thesis, we present probabilistic models and methods for orthology analysis and species tree inference, while accounting for evolutionary factors such as gene duplication, gene loss, and sequence evolution. Furthermore, we use a probabilistic LGT-aware model for inferring gene trees having temporal information for duplication and LGT events. In the first project, we present a Bayesian method, called DLRSOrthology, for estimating orthology probabilities using the DLRS model: a probabilistic model integrating gene evolution, a relaxed molecular clock for substitution rates, and sequence evolution. We devise a dynamic programming algorithm for efficiently summing orthology probabilities over all reconciliations of a gene tree inside a species tree. Furthermore, we present heuristics based on the receiver operating characteristic (ROC) curve to estimate suitable thresholds for deciding orthology events. Our method, as demonstrated by synthetic and biological results, outperforms existing probabilistic approaches in accuracy and is robust to incomplete taxon sampling artifacts. In the second project, we present a probabilistic method, based on a mixture model, for species tree inference. The method employs a two-phase approach, where in the first phase, a structural expectation maximization algorithm, based on a mixture model, is used to reconstruct a maximum likelihood set of candidate species trees. In the second phase, in order to select the best species tree, each of the candidate species trees is evaluated using PrIME-DLRS: a method based on the DLRS model. The method is accurate, efficient, and scalable when compared to a recent probabilistic species tree inference method called PHYLDOG. We observe that, in most cases, the first phase alone may also be used for selecting the target species tree, yielding a fast and accurate method for larger datasets. Finally, we devise a probabilistic method based on the DLTRS model, an extension of the DLRS model that includes LGT events, for sampling reconciliations of a gene tree inside a species tree. The method enables us to estimate gene trees having temporal information for duplication and LGT events. To the best of our knowledge, this is the first probabilistic method that takes gene sequence data directly into account for sampling reconciliations that contain information about LGT events. Based on the synthetic data analysis, we believe that the method has the potential to identify LGT highways. / QC 20150529
|
106 |
Non-parametric probability density function estimation for medical images
Joshi, Niranjan Bhaskar January 2008
The estimation of probability density functions (PDF) of intensity values plays an important role in medical image analysis. Non-parametric PDF estimation methods have the advantage of generality in their application. The two most popular estimators in image analysis methods to perform the non-parametric PDF estimation task are the histogram and the kernel density estimator. But these popular estimators crucially need to be ‘tuned’ by setting a number of parameters and may be either computationally inefficient or need a large amount of training data. In this thesis, we critically analyse and further develop a recently proposed non-parametric PDF estimation method for signals, called the NP windows method. We propose three new algorithms to compute PDF estimates using the NP windows method. One of these algorithms, called the log-basis algorithm, provides an easier and faster way to compute the NP windows estimate, and allows us to compare the NP windows method with the two existing popular estimators. Results show that the NP windows method is fast and can estimate PDFs with a significantly smaller amount of training data. Moreover, it does not require any additional parameter settings. To demonstrate utility of the NP windows method in image analysis we consider its application to image segmentation. To do this, we first describe the distribution of intensity values in the image with a mixture of non-parametric distributions. We estimate these distributions using the NP windows method. We then use this novel mixture model to evolve curves with the well-known level set framework for image segmentation. We also take into account the partial volume effect that assumes importance in medical image analysis methods. In the final part of the thesis, we apply our non-parametric mixture model (NPMM) based level set segmentation framework to segment colorectal MR images. 
The segmentation of colorectal MR images is made challenging due to sparsity and ambiguity of features, presence of various artifacts, and complex anatomy of the region. We propose to use the monogenic signal (local energy, phase, and orientation) to overcome the first difficulty, and the NPMM to overcome the remaining two. Results are improved substantially on those that have been reported previously. We also present various ways to visualise clinically useful information obtained with our segmentations in a 3-dimensional manner.
|
107 |
Statistical inference for rankings in the presence of panel segmentation
Xie, Lin January 1900
Doctor of Philosophy / Department of Statistics / Paul Nelson / Panels of judges are often used to estimate consumer preferences for m items such as food products. Judges can either evaluate each item on several ordinal scales and indirectly produce an overall ranking, or directly report a ranking of the items. A complete ranking orders all the items from best to worst. A partial ranking, as we use the term, only reports rankings of the best q out of m items. Direct ranking, the subject of this report, does not require the widespread but questionable practice of treating ordinal measurements as though they were on ratio or interval scales. Here, we develop and study segmentation models in which the panel may consist of relatively homogeneous subgroups, the segments. Judges within a subgroup will tend to agree among themselves and differ from judges in the other subgroups. We develop and study the statistical analysis of mixture models where it is not known to which segment a judge belongs or, in some cases, how many segments there are. Viewing segment membership indicator variables as latent data, an E-M algorithm was used to find the maximum likelihood estimators of the parameters specifying a mixture of Mallows' (1957) distance models for complete and partial rankings. A simulation study was conducted to evaluate the behavior of the E-M algorithm in terms of such issues as the fraction of data sets for which the algorithm fails to converge and the sensitivity of the convergence rate to initial values, and to evaluate the performance of the maximum likelihood estimators in terms of bias and mean square error, where applicable.
A Bayesian approach was developed and credible set estimators were constructed. Simulation was used to evaluate the performance of these credible sets as confidence sets.
A method for predicting segment membership from covariates measured on a judge was derived using a logistic model applied to a mixture of Mallows probability distance models. The effects of covariates on segment membership were assessed.
Likelihood sets for parameters specifying mixtures of Mallows distance models were constructed and explored.
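A mixture of Mallows distance models, and the E-step that treats segment membership as latent data, can be sketched for a handful of items as follows. This is a hedged illustration only: it assumes complete rankings and the Kendall distance (the abstract does not say which distance is used), it enumerates all permutations and so only works for small m, and all names and parameter values are hypothetical.

```python
import itertools
import math

def kendall_distance(a, b):
    """Number of item pairs that rankings a and b order differently."""
    pos = {item: i for i, item in enumerate(b)}
    seq = [pos[item] for item in a]
    return sum(1 for i, j in itertools.combinations(range(len(seq)), 2)
               if seq[i] > seq[j])

def mallows_pmf(center, theta):
    """Probability of every complete ranking under a Mallows model.

    P(ranking) is proportional to exp(-theta * distance(ranking, center)).
    """
    weights = {p: math.exp(-theta * kendall_distance(p, center))
               for p in itertools.permutations(center)}
    total = sum(weights.values())
    return {p: w / total for p, w in weights.items()}

def segment_posteriors(ranking, segments):
    """E-step for one judge: posterior probability of each segment.

    segments is a list of (mixing_weight, center, theta) triples.
    """
    joint = [w * mallows_pmf(c, t)[tuple(ranking)] for w, c, t in segments]
    total = sum(joint)
    return [j / total for j in joint]

# Two hypothetical segments with opposite central rankings of three items.
segments = [(0.5, ("a", "b", "c"), 1.0), (0.5, ("c", "b", "a"), 1.0)]
pmf = mallows_pmf(("a", "b", "c"), 1.0)
post = segment_posteriors(("a", "b", "c"), segments)
```

In a full E-M fit, these posteriors would weight each judge's contribution when the central rankings, dispersion parameters and mixing weights are re-estimated in the M-step.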
|
108 |
Utilização da metodologia de superfície de resposta no desenvolvimento de um molho tipo Pesto visando a atividade antioxidante / Utilization of response surface methodology in the development of a Pesto sauce to maximize its antioxidant activity
Afonso, Guilherme 06 September 2006
Recent evidence has shown that diets high in vegetables, fruits and grains can reduce the risk of several non-communicable diseases. The beneficial properties of these foods have been attributed largely to the presence of antioxidants, substances capable of decreasing the harmful effects of free radicals. The objective of this work was to develop a Pesto sauce formulation based on the antioxidant properties of its main ingredients: sweet basil, Brazil nut and extra-virgin olive oil. The methodology was divided into two phases. The first consisted of evaluating the interaction between the components with antioxidant activity (AA) present in the sauce's main ingredients, applying response surface methodology with a mixture model. A simplex-centroid design was used, in which the measured response was the AA of extracts of different polarities obtained from the different formulations. Using the DPPH (1,1-diphenyl-2-picrylhydrazyl) method and the β-carotene/linoleic acid system, no interaction between the components with AA was detected. Although the models obtained did not adequately describe the variation in the results, sweet basil was identified as the ingredient contributing most to the total AA of the sauce. Sensory analysis was conducted to determine the best-accepted formulation among the possibilities obtained. The second phase consisted of submitting the formulation determined in phase 1 to proximate composition analysis, quantification of total phenolic compounds and four in vitro methods for evaluating AA: the reducing power method, the β-carotene/linoleic acid system, the DPPH method and a lipid-medium assay using the Rancimat® apparatus. The final formulation may be considered a good source of natural antioxidants and can therefore be part of a healthy diet.
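The simplex-centroid design mentioned in this abstract has a simple combinatorial structure: for q mixture components it contains every blend that uses equal shares of one non-empty subset of the components, giving 2^q - 1 formulations. The sketch below generates those points for three ingredients. The ingredient labels are taken from the abstract, but the function name is hypothetical and the actual proportions tested in the thesis are not given there.

```python
from fractions import Fraction
from itertools import combinations

def simplex_centroid_design(components):
    """Blends with equal proportions over every non-empty subset of components."""
    design = []
    for size in range(1, len(components) + 1):
        for subset in combinations(components, size):
            share = Fraction(1, size)
            # Components outside the subset get proportion 0.
            design.append({c: (share if c in subset else Fraction(0))
                           for c in components})
    return design

ingredients = ["basil", "brazil_nut", "olive_oil"]
design = simplex_centroid_design(ingredients)
for blend in design:
    print({name: str(share) for name, share in blend.items()})
```

For three ingredients this yields the three pure blends, the three 1:1 binary blends and the overall centroid with one third of each ingredient.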
|
109 |
Classificação de fluxos de dados não estacionários com algoritmos incrementais baseados no modelo de misturas gaussianas / Non-stationary data streams classification with incremental algorithms based on Gaussian mixture models
Oliveira, Luan Soares 18 August 2015
Learning concepts from data streams differs significantly from traditional batch learning. In batch learning there is an implicit assumption that the concepts to be learned are static and do not evolve significantly over time. In data stream learning, on the other hand, the concepts to be learned may evolve over time. This evolution is called concept drift, and it makes a fixed training set inapplicable in this setting. Incremental learning is a promising approach for working with data streams. However, in the presence of concept drift, outdated concepts can cause misclassifications. Several incremental methods based on Gaussian mixture models have been proposed in the literature, but these algorithms lack an explicit policy for discarding outdated concepts. In this work, a new incremental algorithm for data streams with concept drift, based on Gaussian mixture models, is proposed. The proposed method is compared with several algorithms widely used in the literature, and the results show that it is competitive with them in various scenarios, outperforming them in some cases.
|
110 |
Distribuição e abundância de Amazona vinacea (Papagaio-de-peito-roxo) no oeste de Santa Catarina / Distribution and abundance of Amazona vinacea (Vinaceous parrot) in western Santa Catarina
Zulian, Viviane January 2017
We offer an assessment of Vinaceous parrot (Amazona vinacea) abundance in 2016 and 2017, combining roost counts over the whole range of the species with a replicated survey of roosts at the local scale, in western Santa Catarina state (WSC), Brazil. The whole-range counts amounted to 3888 and 4066 individuals in 2016 and 2017, respectively. The WSC estimates were 945 ± 50 and 1393 ± 40 individuals for the same two years. We found no evidence of population growth from 2016 to 2017, because the increase in numbers was accompanied by an increase in observation effort both in WSC and at the whole-range scale. When extrapolating the WSC abundance estimate to the whole IUCN extant range of the species, under the simplifying assumption of homogeneous population density, we obtain values above the whole-range counts but within the same order of magnitude. This result offers a sound basis for putting the global population size of A. vinacea in the thousands of individuals, but not in the tens of thousands. We made a systematic effort to address key sources of uncertainty in parrot abundance estimation. Each count, at the local or whole-range scale, included visits to all known roosts within ten days, to avoid double counting due to movement of parrots between roosts. At the local scale, we estimated abundance using an N-Mixture Model of replicated count data, implemented in a Bayesian framework. Even though we estimate a larger population size and a larger geographic range than those currently reported by the IUCN, we suggest that A. vinacea should remain in the ‘Endangered’ IUCN threat category, pending further investigation of population trends.
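The core of an N-mixture model for replicated counts can be sketched as a marginal likelihood. The version below assumes the common formulation in which the latent abundance at each roost is Poisson with mean lam and each replicated count is binomial with detection probability p; the abstract does not state the exact specification or priors used in the thesis, so this parameterisation, the truncation point n_max and the example counts are all assumptions.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability, computed on the log scale to avoid overflow."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

def binomial_pmf(y, n, p):
    """Probability of detecting y of n individuals with detection probability p."""
    return math.comb(n, y) * p ** y * (1.0 - p) ** (n - y)

def site_likelihood(counts, lam, p, n_max=100):
    """Marginal likelihood of replicated counts at one site.

    The latent abundance N is summed out from max(counts) up to n_max.
    """
    total = 0.0
    for n in range(max(counts), n_max + 1):
        term = poisson_pmf(n, lam)
        for y in counts:
            term *= binomial_pmf(y, n, p)
        total += term
    return total

def log_likelihood(data, lam, p, n_max=100):
    """Log-likelihood over sites; data holds one list of counts per site."""
    return sum(math.log(site_likelihood(c, lam, p, n_max)) for c in data)

# Made-up replicated counts at three roosts, three visits each.
data = [[12, 9, 11], [4, 6, 5], [20, 17, 22]]
ll = log_likelihood(data, lam=15.0, p=0.7)
```

In a Bayesian implementation, a likelihood of this form would be combined with priors on lam and p and explored by sampling; the truncation n_max must be large enough that the omitted Poisson tail is negligible.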
|