151

Normal Mixture and Contaminated Model with Nuisance Parameter and Applications

Fan, Qian 01 January 2014 (has links)
This paper intends to find a proper hypothesis and test statistic for testing the existence of bilateral contamination in the presence of a nuisance parameter. The test statistic is based on method-of-moments estimators. A union-intersection test is used to test whether the population distribution can be described by a bilaterally contaminated normal model with unknown variance. The paper also develops a hierarchical normal mixture model (HNM) and applies it to birth weight data. The EM algorithm is employed for parameter estimation, and a singular Bayesian information criterion (sBIC) is applied to choose the number of components. We also propose a singular flexible information criterion which additionally involves a data-driven penalty.
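As a rough illustration of the estimation step described above, the sketch below fits an ordinary (non-hierarchical) univariate normal mixture by EM and picks the number of components with ordinary BIC; the thesis's hierarchical model and sBIC refine this baseline, and the birth-weight-like data here are synthetic placeholders.

```python
import numpy as np
from scipy.stats import norm

def em_normal_mixture(x, k, n_iter=200, seed=0):
    """Fit a k-component univariate normal mixture by EM (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    mu = rng.choice(x, size=k, replace=False).astype(float)  # crude initialization
    sigma = np.full(k, x.std())
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation
        dens = pi * norm.pdf(x[:, None], mu, sigma)           # shape (n, k)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted updates of weights, means and standard deviations
        nk = resp.sum(axis=0)
        pi = nk / n
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    loglik = np.log((pi * norm.pdf(x[:, None], mu, sigma)).sum(axis=1)).sum()
    n_par = 3 * k - 1                       # k means, k sds, k - 1 free weights
    bic = n_par * np.log(n) - 2 * loglik    # ordinary BIC; sBIC adjusts the penalty
    return pi, mu, sigma, bic

# choose the number of components by minimizing BIC over k = 1..4
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(2500, 400, 300),
                    rng.normal(3400, 450, 700)])  # synthetic birth-weight-like data
fits = {k: em_normal_mixture(x, k) for k in range(1, 5)}
best_k = min(fits, key=lambda k: fits[k][3])
```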
152

Dynamic Food Demand in China and International Nutrition Transition

Zhou, De 12 May 2014 (has links)
No description available.
153

Design of robust blind detector with application to watermarking

Anamalu, Ernest Sopuru 14 February 2014 (has links)
One of the difficult issues in detection theory is to design a robust detector that takes into account the actual distribution of the original data. The most commonly used statistical model for blind detection is the Gaussian distribution. Specifically, linear correlation is an optimal detection method in the presence of Gaussian-distributed features, but it has been found to be a sub-optimal detection metric when the density deviates strongly from the Gaussian distribution. Hence, we formulate a detection algorithm that enhances the detection probability by exploiting the true characteristics of the original data. To understand the underlying distribution of the data, we employed estimation techniques such as a parametric model, the approximated density-ratio logistic regression model, and semiparametric estimation. The semiparametric model has the advantage of yielding density ratios as well as the individual densities. Both methods are applicable to signals such as watermarks embedded in the spatial domain and outperform conventional linear correlation on non-Gaussian-distributed data.
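A minimal sketch of the density-ratio idea mentioned above: a logistic regression trained to separate watermarked from unwatermarked feature vectors approximates the log density ratio, which can then replace linear correlation as the detection statistic. The features, embedding model and threshold rule below are hypothetical placeholders, not the detector developed in the thesis.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: feature vectors from unwatermarked (label 0) and
# watermarked (label 1) content; in practice these would come from the spatial
# domain of training images.
rng = np.random.default_rng(1)
x0 = rng.laplace(0.0, 1.0, size=(2000, 8))              # non-Gaussian host features
x1 = x0 + 0.3 * rng.standard_normal((2000, 8)) + 0.2    # features after embedding
X = np.vstack([x0, x1])
y = np.concatenate([np.zeros(2000), np.ones(2000)])

# Logistic regression fitted to discriminate the two classes; its decision
# function approximates log p1(x)/p0(x) up to an additive constant.
clf = LogisticRegression(max_iter=1000).fit(X, y)

def detection_statistic(features):
    """Approximate log-likelihood-ratio detector (replaces linear correlation)."""
    return clf.decision_function(features)

# Declare "watermark present" when the statistic exceeds a threshold chosen for
# a target false-alarm rate on held-out unwatermarked data.
threshold = np.quantile(detection_statistic(x0), 0.99)
decisions = detection_statistic(x1) > threshold
```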
155

Modèle de mélange de lois multinormales appliqué à l'analyse de comportements et d'habiletés cognitives d'enfants / Multivariate normal mixture model applied to the analysis of children's behaviours and cognitive abilities

Giguère, Charles-Édouard 11 1900 (has links)
No description available.
156

Essays on Transaction Costs and Food Diversity in Developing Countries

Steffen David, Christoph 28 June 2017 (has links)
No description available.
157

Analyse de données de cytometrie de flux pour un grand nombre d'échantillons / Automated flow cytometric analysis across a large number of samples

Chen, Xiaoyi 06 October 2015 (has links)
Cette thèse a conduit à la mise au point de deux nouvelles approches statistiques pour l'identification automatique de populations cellulaires en cytométrie de flux multiparamétrique, et ceci pour le traitement d'un grand nombre d'échantillons, chaque échantillon étant prélevé sur un donneur particulier. Ces deux approches répondent à des besoins exprimés dans le cadre du projet Labex «Milieu Intérieur». Dix panels cytométriques de 8 marqueurs ont été sélectionnés pour la quantification des populations principales et secondaires présentes dans le sang périphérique. Sur la base de ces panels, les données ont été acquises et analysées sur une cohorte de 1000 donneurs sains. Tout d'abord, nous avons recherché une quantification robuste des principales composantes cellulaires du système immunitaire. Nous décrivons une procédure computationnelle, appelée FlowGM, qui minimise l'intervention de l'utilisateur. Le cœur statistique est fondé sur le modèle classique de mélange de lois gaussiennes. Ce modèle est tout d'abord utilisé pour obtenir une classification initiale, le nombre de classes étant déterminé par le critère d'information BIC. Après cela, une méta-classification, qui consiste en l'étiquetage des classes et la fusion de celles qui ont la même étiquette au regard de la référence, a permis l'identification automatique de 24 populations cellulaires sur quatre panels. Ces identifications ont ensuite été intégrées dans les fichiers de cytométrie de flux standard (FCS), permettant ainsi la comparaison avec l'analyse manuelle opérée par les experts. Nous montrons que la qualité est similaire entre FlowGM et l'analyse manuelle classique pour les lymphocytes, mais notamment que FlowGM montre une meilleure discrimination des sous-populations de monocytes et de cellules dendritiques (DC), qui sont difficiles à obtenir manuellement. FlowGM fournit ainsi une analyse rapide de phénotypes cellulaires et se prête à des études de cohortes. À des fins d'évaluation, de diagnostic et de recherche, une analyse tenant compte de l'influence de facteurs, comme par exemple les effets du protocole, l'effet de l'âge et du sexe, a été menée. Dans le contexte du projet MI, les 1000 donneurs sains ont été stratifiés selon le sexe et l'âge. Les résultats de l'analyse quantitative faite avec FlowGM ont été jugés concordants avec l'analyse manuelle qui est considérée comme l'état de l'art. On note surtout une augmentation de la précision pour les populations CD16+ et cDC1, où les sous-populations CD14loCD16hi et HLADRhi cDC1 ont été systématiquement identifiées. Nous démontrons que les effectifs de ces deux populations présentent une corrélation significative avec l'âge. En ce qui concerne les populations qui sont connues pour être associées à l'âge, un modèle de régression linéaire multiple a été considéré qui fournit un coefficient de régression renforcé. Ces résultats établissent une base efficace pour l'évaluation de notre procédure FlowGM. Lors de l'utilisation de FlowGM pour la caractérisation détaillée de certaines sous-populations présentant de fortes variations au travers des différents échantillons, par exemple les cellules T, nous avons constaté que FlowGM était en difficulté. En effet, dans ce cas, l'algorithme EM classique initialisé avec la classification de l'échantillon de référence est insuffisant pour garantir l'alignement et donc l'identification des différentes classes entre tous échantillons. Nous avons donc amélioré FlowGM en une nouvelle procédure FlowGMP.
Pour ce faire, nous avons ajouté au modèle de mélange une distribution a priori sur les paramètres des composantes, conduisant à un algorithme EM contraint. Enfin, l'évaluation de FlowGMP sur un panel difficile de cellules T a été réalisée, en effectuant une comparaison avec l'analyse manuelle. Cette comparaison montre que notre procédure Bayésienne fournit une identification fiable et efficace des onze sous-populations de cellules T à travers un grand nombre d'échantillons. / In the course of my Ph.D. work, I have developed and applied two new computational approaches for automatic identification of cell populations in multi-parameter flow cytometry across a large number of samples. Both approaches were motivated by and carried out within the LabEX "Milieu Intérieur" study (hereafter the MI study). In this project, ten 8-color flow cytometry panels were standardized for assessment of the major and minor cell populations present in peripheral whole blood, and data were collected and analyzed for a cohort of 1,000 healthy donors. First, we aim at robust characterization of the major cellular components of the immune system. We report a computational pipeline, called FlowGM, which minimizes operator input, is insensitive to compensation settings, and can be adapted to different analytic panels. A Gaussian Mixture Model (GMM)-based approach was utilized for initial clustering, with the number of clusters determined using the Bayesian Information Criterion. Meta-clustering in a reference donor, by which we mean labeling clusters and merging those with the same label in a pre-selected representative donor, permitted automated identification of 24 cell populations across four panels. Cluster labels were then integrated into Flow Cytometry Standard (FCS) files, thus permitting comparisons to human expert manual analysis. We show that cell numbers and coefficients of variation (CV) are similar between FlowGM and conventional manual analysis of lymphocyte populations, but notably FlowGM provided improved discrimination of "hard-to-gate" monocyte and dendritic cell (DC) subsets. FlowGM thus provides rapid, high-dimensional analysis of cell phenotypes and is amenable to cohort studies. Once cell counts are available across a large number of cohort donors, further analyses (for example, agreement with other methods and the effects of age and gender) are naturally required for comprehensive evaluation, diagnosis and discovery. In the context of the MI project, the 1,000 healthy donors were stratified across gender (50% women and 50% men) and age (20-69 years of age). Analysis was streamlined using our established FlowGM approach, and the results were highly concordant with the state-of-the-art gold-standard manual gating. More importantly, further precision was achieved with FlowGM for the CD16+ monocyte and cDC1 populations: CD14loCD16hi monocytes and HLADRhi cDC1 cells were consistently identified. We demonstrate that the counts of these two populations show a significant correlation with age. As for the cell populations that are well known to be related to age, a multiple linear regression model was considered, and our results provided a higher regression coefficient.
These findings establish a strong foundation for comprehensive evaluation of our previous work. When extending the FlowGM method to detailed characterization of certain subpopulations that show more variation across a large number of samples, for example the T cells, we find that the conventional EM algorithm initiated with reference clustering is insufficient to guarantee the alignment of clusters between all samples, owing to the presence of technical and biological variation. We therefore improved FlowGM and present the FlowGMP pipeline to address this specific panel. We introduce a Bayesian mixture model by assuming a prior distribution on the component parameters and derive a penalized EM algorithm. Finally, the performance of FlowGMP on this difficult T cell panel, compared against manual analysis, shows that our method provides reliable and efficient identification of eleven T cell subpopulations across a large number of samples.
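A minimal sketch of FlowGM's initial clustering step as described above: Gaussian mixtures are fitted over a grid of component counts and the model with the lowest BIC is kept, after which each event receives a cluster label. The grid, placeholder data and scikit-learn implementation are illustrative assumptions; the actual pipeline adds meta-clustering against a reference donor and, in FlowGMP, a penalized EM with a prior on component parameters.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_gmm_bic(events, k_range=range(5, 31, 5), seed=0):
    """Fit Gaussian mixtures over a grid of component counts and keep the
    model with the lowest BIC, in the spirit of FlowGM's initial clustering."""
    best, best_bic = None, np.inf
    for k in k_range:
        gmm = GaussianMixture(n_components=k, covariance_type="full",
                              random_state=seed).fit(events)
        bic = gmm.bic(events)
        if bic < best_bic:
            best, best_bic = gmm, bic
    return best

# events: (n_cells x n_markers) matrix of compensated, transformed intensities
events = np.random.default_rng(0).standard_normal((5000, 8))  # placeholder data
gmm = fit_gmm_bic(events)
labels = gmm.predict(events)   # cluster labels, later mapped to cell populations
                               # by meta-clustering against a reference donor
```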
158

Algumas extensões da distribuição Birnbaum-Saunders: uma abordagem Bayesiana / Some extensions of the Birnbaum-Saunders distribution: a Bayesian approach

Cahui, Edwin Chaiña 09 January 2012 (has links)
The Birnbaum-Saunders distribution is based on a physical argument of cumulative damage that produces fatigue in materials; this fatigue has been identified as an important cause of failure in engineering structures. Recently, the model has been applied in other areas such as health sciences, environmental measurements, forestry, demography and finance, among others. Due to its importance, several distributions have been proposed to describe the behavior of fatigue resistance, yet there is no consensus about which is most effective for the analysis of fatigue data. A major problem in choosing a statistical distribution is that several models often fit the data well in the central region, while the tails of the distribution cast doubt on the decision to select one of the proposed models. The scarcity of data in the tails justifies considering other arguments for using a specific statistical distribution and thereby rejecting other models. In this work we study some extensions of the Birnbaum-Saunders distribution based on scale mixtures of normals, with inference carried out from a Bayesian perspective using Markov chain Monte Carlo (MCMC) methods. To detect possibly influential observations in the considered models, we used a case-by-case Bayesian influence analysis based on the Kullback-Leibler divergence. Moreover, the geometric Birnbaum-Saunders model is proposed for survival data. / A distribuição Birnbaum-Saunders (BS) está baseada em um argumento físico de dano cumulativo que produz a fadiga de materiais. Esta fadiga foi identificada como uma importante causa de falhas em estruturas de engenharia. Nos últimos tempos, este modelo tem sido aplicado em outras áreas, tais como: ciências da saúde, ambientais, florestais, demográficas, financeiras, entre outras. Devido a sua importância, várias distribuições têm sido propostas para descrever o comportamento da resistência à fadiga. Entretanto não há um consenso sobre qual modelo é mais efetivo para a análise dos dados de fadiga. Um dos principais problemas para escolher uma distribuição estatística é que frequentemente vários modelos ajustam os dados bem na parte central, porém os extremos da distribuição colocam em dúvida a decisão para selecionar alguns dos modelos propostos. A falta de dados nos extremos da distribuição justifica considerar outros argumentos como o uso de um modelo estatístico específico, e assim rejeitar outros modelos. Neste trabalho estudamos algumas extensões da distribuição Birnbaum-Saunders com mistura de escala normal, no qual o procedimento para obtenção de inferências será considerado sob uma perspectiva Bayesiana baseada em Métodos de Monte Carlo via Cadeias de Markov (MCMC). Para detectar possíveis observações influentes nos modelos considerados, foi usado o método Bayesiano de análise de influência caso a caso, baseado na divergência de Kullback-Leibler. Além disso, é proposto o modelo geométrico Birnbaum-Saunders, para dados de sobrevivência.
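A minimal sketch of Bayesian inference for the baseline Birnbaum-Saunders model via random-walk Metropolis, using SciPy's fatiguelife parameterization (shape alpha, scale beta). The priors, proposal scale and simulated data are illustrative assumptions; the thesis works with scale-mixture-of-normal extensions and Kullback-Leibler influence diagnostics not shown here.

```python
import numpy as np
from scipy.stats import fatiguelife

def log_posterior(alpha, beta, t):
    """Log posterior for the baseline Birnbaum-Saunders model under vague
    normal-type priors (an illustrative choice, not the thesis's exact prior)."""
    if alpha <= 0 or beta <= 0:
        return -np.inf
    loglik = fatiguelife.logpdf(t, alpha, loc=0, scale=beta).sum()
    logprior = -0.5 * (alpha / 10.0) ** 2 - 0.5 * (beta / 100.0) ** 2
    return loglik + logprior

def metropolis_bs(t, n_draws=5000, step=0.05, seed=0):
    """Random-walk Metropolis for (alpha, beta), sampled on the log scale."""
    rng = np.random.default_rng(seed)

    def log_target(theta):                      # includes the log-scale Jacobian
        return log_posterior(*np.exp(theta), t) + theta.sum()

    theta = np.log([1.0, np.median(t)])
    current = log_target(theta)
    draws = np.empty((n_draws, 2))
    for i in range(n_draws):
        prop = theta + step * rng.standard_normal(2)
        cand = log_target(prop)
        if np.log(rng.uniform()) < cand - current:
            theta, current = prop, cand
        draws[i] = np.exp(theta)
    return draws

# fatigue-like lifetimes simulated from a Birnbaum-Saunders(alpha=0.5, beta=100) model
t = fatiguelife.rvs(0.5, scale=100, size=200, random_state=1)
posterior = metropolis_bs(t)[1000:]             # discard burn-in draws
```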
159

Modelos de mistura para dados com distribuições Poisson truncadas no zero / Mixture models for data with zero truncated Poisson distributions

Andressa do Carmo Gigante 22 September 2017 (has links)
Modelo de mistura de distribuições tem sido utilizado desde longa data, mas ganhou maior atenção recentemente devido ao desenvolvimento de métodos de estimação mais eficientes. Nesta dissertação, o modelo de mistura foi utilizado como uma forma de agrupar ou segmentar dados para as distribuições Poisson e Poisson truncada no zero. Para solucionar o problema do truncamento foram estudadas duas abordagens. Na primeira, foi considerado o truncamento em cada componente da mistura, ou seja, a distribuição Poisson truncada no zero. E, alternativamente, o truncamento na resultante do modelo de mistura utilizando a distribuição Poisson usual. As estimativas dos parâmetros de interesse do modelo de mistura foram calculadas via metodologia de máxima verossimilhança, sendo necessária a utilização de um método iterativo. Dado isso, implementamos o algoritmo EM para estimar os parâmetros do modelo de mistura para as duas abordagens em estudo. Para analisar a performance dos algoritmos construídos elaboramos um estudo de simulação em que apresentaram estimativas próximas dos verdadeiros valores dos parâmetros de interesse. Aplicamos os algoritmos a uma base de dados real de uma determinada loja eletrônica e para determinar a escolha do melhor modelo utilizamos os critérios de seleção de modelos AIC e BIC. O truncamento no zero indica afetar mais a metodologia na qual aplicamos o truncamento em cada componente da mistura, tornando algumas estimativas para a distribuição Poisson truncada no zero com viés forte. Ao passo que, na abordagem em que empregamos o truncamento no zero diretamente no modelo as estimativas apontaram menor viés. / Mixture models have been used for a long time, but only recently have they attracted more attention, owing to the development of more efficient estimation methods. In this dissertation, we consider the mixture model as a method for clustering or segmenting data under the Poisson and zero-truncated Poisson distributions. To handle the zero-truncation problem we study two approaches. The first considers the truncation in each mixture component, that is, we use the zero-truncated Poisson distribution. Alternatively, we apply the truncation to the resulting mixture model built from the usual Poisson distribution. We estimate the parameters of interest via maximum likelihood, which requires an iterative method, so we implemented the EM algorithm for both approaches. To assess the performance of the implemented algorithms we carried out a simulation study, in which the estimates were close to the true parameter values. We apply the algorithms to a real data set from an electronics store and use the AIC and BIC model selection criteria to choose the best model. The zero truncation appears to affect more strongly the approach in which each mixture component is truncated, producing some heavily biased estimates for the zero-truncated Poisson distribution. On the other hand, when the truncation is applied directly to the mixture model, the estimates show less bias.
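A minimal sketch of the first approach described above: an EM algorithm for a mixture of zero-truncated Poisson components, where the M-step solves the truncated-Poisson likelihood equation for each rate by fixed-point iteration. The synthetic purchase counts and the BIC computation are illustrative placeholders, not the dissertation's data or exact implementation.

```python
import numpy as np
from scipy.stats import poisson

def ztp_pmf(y, lam):
    """Zero-truncated Poisson pmf for counts y >= 1."""
    return poisson.pmf(y, lam) / (1.0 - np.exp(-lam))

def solve_ztp_rate(mean_y, n_iter=100):
    """Solve lam / (1 - exp(-lam)) = mean_y by fixed-point iteration (M-step)."""
    lam = max(mean_y - 1.0, 1e-3)
    for _ in range(n_iter):
        lam = mean_y * (1.0 - np.exp(-lam))
    return lam

def em_ztp_mixture(y, k, n_iter=200):
    """EM for a k-component mixture of zero-truncated Poisson distributions
    (truncation inside each component, the first approach in the abstract)."""
    lam = np.linspace(y.min(), y.max(), k).astype(float)   # spread-out initial rates
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        dens = pi * np.column_stack([ztp_pmf(y, l) for l in lam])   # shape (n, k)
        resp = dens / dens.sum(axis=1, keepdims=True)               # E-step
        nk = resp.sum(axis=0)
        pi = nk / len(y)                                            # M-step
        means = (resp * y[:, None]).sum(axis=0) / nk
        lam = np.array([solve_ztp_rate(m) for m in means])
    loglik = np.log((pi * np.column_stack([ztp_pmf(y, l) for l in lam])).sum(axis=1)).sum()
    return pi, lam, loglik

# synthetic purchase counts (>= 1) from two latent customer segments
rng = np.random.default_rng(1)
y = np.concatenate([rng.poisson(1.5, 600), rng.poisson(6.0, 400)])
y = y[y > 0]
pi_hat, lam_hat, ll = em_ztp_mixture(y, k=2)
bic = (2 * 2 - 1) * np.log(len(y)) - 2 * ll    # two rates plus one free weight
```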
160

Alternative regression models to Beta distribution under Bayesian approach / Modelos de regressão alternativos à distribuição Beta sob abordagem bayesiana

Rosineide Fernando da Paz 25 August 2017 (has links)
The Beta distribution is a bounded-domain distribution that has dominated the modeling of random variables taking values between 0 and 1. Bounded-domain distributions arise in various situations, such as rates, proportions and indices. Motivated by an analysis of electoral vote percentages (where a distribution with support on the positive real numbers was used, although a distribution with bounded support would be more suitable), we focus on alternatives to the Beta distribution, with emphasis on regression models. In this work we initially present the Simplex mixture model as a flexible model for bounded random variables, and then extend it to the regression context by including covariates. Parameter estimation is discussed for both models under Bayesian inference. We apply these models to simulated data sets in order to investigate the performance of the estimators, and the results were satisfactory in all the cases investigated. Finally, we introduce a parameterization of the L-Logistic distribution to be used in regression models and extend it to a mixture of mixed regression models. / A distribuição beta é uma distribuição com suporte limitado que tem dominado a modelagem de variáveis aleatórias que assumem valores entre 0 e 1. Distribuições com suporte limitado surgem em várias situações como em taxas, proporções e índices. Motivados por uma análise de porcentagens de votos eleitorais, em que foi assumida uma distribuição com suporte nos números reais positivos quando uma distribuição com suporte limitado seria mais apropriada, focamos em modelos alternativos à distribuição beta com ênfase em modelos de regressão. Neste trabalho, apresentamos, inicialmente, um modelo de mistura de distribuições Simplex como um modelo flexível para modelar a distribuição de variáveis aleatórias que assumem valores em um intervalo limitado, em seguida estendemos o modelo para o contexto de modelos de regressão com a inclusão de covariáveis. A estimação dos parâmetros foi discutida para ambos os modelos, considerando o método bayesiano. Aplicamos os dois modelos a dados simulados para investigarmos a performance dos estimadores usados. Os resultados obtidos foram satisfatórios para todos os casos investigados. Finalmente, introduzimos a distribuição L-Logística no contexto de modelos de regressão e posteriormente estendemos este modelo para o contexto de misturas de modelos de regressão mista.
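A minimal sketch of a Bayesian Simplex regression with a logit link for the mean, fitted by plain random-walk Metropolis. The priors, link choice and synthetic vote-share-like data are illustrative assumptions; the dissertation's Simplex mixture and L-Logistic models build on this basic likelihood.

```python
import numpy as np

def simplex_logpdf(y, mu, sigma2):
    """Log-density of the Simplex distribution S(mu, sigma^2) on (0, 1)."""
    d = (y - mu) ** 2 / (y * (1 - y) * mu ** 2 * (1 - mu) ** 2)
    return -0.5 * (np.log(2 * np.pi * sigma2) + 3 * np.log(y * (1 - y))) - d / (2 * sigma2)

def log_posterior(params, y, X):
    """Simplex regression with a logit link for the mean and vague normal-type
    priors (an illustrative choice, not the dissertation's exact setup)."""
    beta, log_sigma2 = params[:-1], params[-1]
    mu = 1.0 / (1.0 + np.exp(-X @ beta))            # logit link
    loglik = simplex_logpdf(y, mu, np.exp(log_sigma2)).sum()
    logprior = -0.5 * np.sum(beta ** 2) / 100.0 - 0.5 * log_sigma2 ** 2 / 100.0
    return loglik + logprior

def metropolis(y, X, n_draws=5000, step=0.05, seed=0):
    """Plain random-walk Metropolis over (beta, log sigma^2)."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(X.shape[1] + 1)
    current = log_posterior(theta, y, X)
    draws = np.empty((n_draws, theta.size))
    for i in range(n_draws):
        prop = theta + step * rng.standard_normal(theta.size)
        cand = log_posterior(prop, y, X)
        if np.log(rng.uniform()) < cand - current:
            theta, current = prop, cand
        draws[i] = theta
    return draws

# synthetic vote-share-like responses in (0, 1) with an intercept and one covariate
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(300), rng.standard_normal(300)])
y = np.clip(rng.beta(5, 3, 300), 1e-3, 1 - 1e-3)    # placeholder data
posterior = metropolis(y, X)[1000:]                 # discard burn-in draws
```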
