  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
71

A Classification Tool for Predictive Data Analysis in Healthcare

Victors, Mason Lemoyne 07 March 2013 (has links) (PDF)
Hidden Markov Models (HMMs) have seen widespread use in a variety of applications ranging from speech recognition to gene prediction. Though developed over forty years ago, they remain a standard tool for sequential data analysis. More recently, Latent Dirichlet Allocation (LDA) was developed and soon gained widespread popularity as a powerful topic analysis tool for text corpora. We thoroughly develop LDA and a generalization of HMMs and demonstrate the conjunctive use of both methods in predictive data analysis for healthcare problems. While these two tools (LDA and HMM) have been used together previously, we use LDA in a new way to reduce the dimensionality involved in training HMMs. With both LDA and our extension of the HMM, we train classifiers to predict the development of Chronic Kidney Disease (CKD) in the near future.
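The HMM half of the pipeline above rests on the forward algorithm, which computes a sequence's likelihood under a fitted model. A minimal sketch, with invented toy parameters; the two latent states and three observation symbols below are illustrative stand-ins, not the thesis's trained CKD model:

```python
import numpy as np

def hmm_forward(pi, A, B, obs):
    """Return P(obs) under an HMM with initial distribution pi,
    transition matrix A, and discrete emission matrix B."""
    alpha = pi * B[:, obs[0]]            # joint prob. of state and first symbol
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]    # propagate one step, then emit
    return alpha.sum()

# Toy 2-state model: e.g. latent "healthy"/"at-risk" states emitting one of
# three coarse observation symbols (a stand-in for LDA-reduced topic labels).
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])
likelihood = hmm_forward(pi, A, B, [0, 1, 2])
```

Training such a model and classifying with it require the full Baum-Welch machinery; the forward pass above is only the likelihood kernel both rely on.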
72

Measuring Skill Importance in Women's Soccer and Volleyball

Allan, Michelle L. 11 March 2009 (has links) (PDF)
The purpose of this study is to demonstrate how to measure skill importance for two sports: soccer and volleyball. A Division I women's soccer team filmed each home game during a competitive season. Every defensive, dribbling, first touch, and passing skill was rated and recorded for each team, and it was noted whether each sequence of plays led to a successful shot. A hierarchical Bayesian logistic regression model is implemented to determine how the performance of each skill affects the probability of a successful shot. A Division I women's volleyball team rated each skill (serve, pass, set, etc.) and recorded rally outcomes during home games in a competitive season. Skills were rated only when the ball was on the home team's side of the net. Events followed one of three patterns: serve-outcome, pass-set-attack-outcome, or dig-set-attack-outcome. We analyze the volleyball data using two techniques: Markov chains and Bayesian logistic regression. The sequences of events are assumed to be first-order Markov chains, meaning the quality of the current skill depends only on the quality of the previous skill. The count matrix is assumed to follow a multinomial distribution, so a Dirichlet prior is used to estimate each row of the count matrix. Bayesian simulation is used to produce unconditional posterior probabilities (e.g., that a perfect serve results in a point). The volleyball logistic regression model uses a Bayesian approach to determine how the performance of each skill affects the probability of a successful outcome. The posterior distributions produced by each model are used to calculate importance scores. The soccer importance scores revealed that passing, first touch, and dribbling skills are the most important for the primary team. The Markov chain model for the volleyball data indicates that setting 3–5 feet off the net increases the probability of a successful outcome. The logistic regression model for the volleyball data reveals that serves have a high importance score because of their steep slope. Importance scores can be used to help coaches allocate practice time, develop new strategies, and analyze each player's skill performance.
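The Dirichlet-multinomial step described above can be sketched directly: each row of the transition count matrix gets a Dirichlet posterior (prior plus observed counts), and simulation from it yields posterior outcome probabilities. The states, counts, and prior below are invented toy values, not the study's rated match data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Rows: quality of the current skill (e.g. perfect serve, good serve).
# Columns: next event (point won, rally continues, point lost).
counts = np.array([[30, 50, 20],     # after a perfect serve
                   [10, 60, 30]])    # after a good serve
prior = np.ones(3)                   # symmetric Dirichlet(1, 1, 1) prior

# Draw posterior samples of each transition row: posterior is
# Dirichlet(prior + counts) under a multinomial likelihood.
samples = np.stack([rng.dirichlet(prior + row, size=4000) for row in counts])

# Posterior mean probability that a perfect serve leads directly to a point.
p_point_perfect = samples[0, :, 0].mean()
```

Importance scores in the thesis are then functionals of such posterior distributions; this sketch stops at the posterior draws themselves.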
73

Bayesian Solution to the Analysis of Data with Values below the Limit of Detection (LOD)

Jin, Yan January 2008 (has links)
No description available.
74

Statistical models of TF/DNA interaction

Fouquier d'Herouel, Aymeric January 2008 (has links)
Gene expression is regulated in response to metabolic necessities and environmental changes throughout the life of a cell. A major part of this regulation is governed at the level of transcription, deciding whether messengers to specific genes are produced or not. This decision is triggered by the action of transcription factors, proteins which interact with specific sites on DNA and thus influence the rate of transcription of proximal genes. Mapping the organisation of these transcription factor binding sites sheds light on potential causal relations between genes and is the key to establishing networks of genetic interactions, which determine how the cell adapts to external changes. In this work I briefly review the basics of genetics and summarise popular approaches to describing transcription factor binding sites, from the most straightforward to a biophysically motivated representation based on the estimation of free energies of molecular interactions. Two articles on transcription factors are contained in this thesis, one published (Aurell, Fouquier d'Hérouël, Malmnäs and Vergassola, 2007) and one submitted (Fouquier d'Hérouël, 2008). Both rely strongly on the representation of binding sites by matrices accounting for the affinity of the proteins for specific nucleotides at the different positions of the binding sites. The importance of non-specific binding of transcription factors to DNA is briefly addressed in the text and extensively discussed in the first appended article: in a study of the affinity of yeast transcription factors for their binding sites, we conclude that measured in vivo protein concentrations are marginally sufficient to guarantee the occupation of functional sites, as opposed to unspecific emplacements on the genomic sequence. Since a common task is the inference of binding site motifs, the most common statistical method is reviewed in detail, and upon it I construct an alternative, biophysically motivated approach, exemplified in the second appended article.
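The matrix representation of binding sites mentioned above is commonly realized as a position weight matrix, scored as log-odds against a background nucleotide distribution. A minimal sketch with an invented three-position motif (not a matrix from the appended articles):

```python
import math

BACKGROUND = 0.25  # uniform background frequency per nucleotide

# Toy position frequency matrix for a 3-base motif (one dict per position).
PFM = [
    {"A": 0.8,  "C": 0.1,  "G": 0.05, "T": 0.05},
    {"A": 0.1,  "C": 0.7,  "G": 0.1,  "T": 0.1},
    {"A": 0.05, "C": 0.05, "G": 0.8,  "T": 0.1},
]

def pwm_score(site):
    """Sum of per-position log-odds scores for a candidate binding site.
    High scores indicate sequences resembling the motif; under the
    biophysical view, the score is proportional to a binding free energy."""
    return sum(math.log2(PFM[i][base] / BACKGROUND)
               for i, base in enumerate(site))
```

Scanning a genome then amounts to sliding `pwm_score` along the sequence and thresholding, which is where specific and non-specific binding compete.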
75

Semiparametric Bayesian Approach using Weighted Dirichlet Process Mixture For Finance Statistical Models

Sun, Peng 07 March 2016 (has links)
Dirichlet process mixture (DPM) models have been widely used as flexible priors in the nonparametric Bayesian literature, and the weighted Dirichlet process mixture (WDPM) can be viewed as an extension of DPM that relaxes model distribution assumptions. However, WDPM requires specifying weight functions and can introduce extra computational burden. In this dissertation, we develop more efficient and flexible WDPM approaches under three research topics. The first is semiparametric cubic spline regression, where we adopt a nonparametric prior for the error terms in order to automatically handle heterogeneity of measurement errors or an unknown mixture distribution. The second provides an innovative way to construct a weight function and illustrates several desirable properties and the computational efficiency of this weight under a semiparametric stochastic volatility (SV) model. The last develops a WDPM approach for the Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) model (an alternative to the SV model) and proposes a new model evaluation approach for GARCH that produces results easier to interpret than the canonical marginal likelihood approach.

In the first topic, the response variable is modeled as the sum of three parts. One part is a linear function of covariates that enter the model parametrically. The second part is an additive nonparametric model: covariates whose relationships to the response variable are unclear are included nonparametrically using Lancaster and Šalkauskas bases. The third part consists of error terms whose means and variances are assumed to follow nonparametric priors. We therefore call our model dual-semiparametric regression, since nonparametric components appear both in the mean and in the error terms. Instead of assuming all error terms follow the same prior, as in DPM, our WDPM provides multiple candidate priors for each observation to select with certain probabilities. These probabilities (weights) are modeled from relevant predictive covariates using a Gaussian kernel. We propose several WDPMs with different weights depending on distance in covariates, provide efficient Markov chain Monte Carlo (MCMC) algorithms, and compare our WDPMs to a parametric model and the DPM model in terms of Bayes factors in both simulation and empirical studies.

In the second topic, we propose an innovative way to construct the weight function for WDPM and apply it to the SV model, which is adopted for time series data where the constant-variance assumption is violated. One essential issue is specifying the distribution of the conditional return. We assume a WDPM prior for the conditional return and propose a new way to model the weights. Our approach has several advantages, including computational efficiency relative to the weight constructed with a Gaussian kernel. We list six properties of the proposed weight function and prove them. Because of the additional Metropolis-Hastings steps introduced by the WDPM prior, we establish conditions that ensure uniform geometric ergodicity of the transition kernel in our MCMC. Due to the presence of zero values in asset price data, our SV model is semiparametric: we employ the WDPM prior for nonzero values and a parametric prior for zero values.

In the third project, we develop the WDPM approach for GARCH-type models and compare different weight functions, including the innovative method proposed in the second topic. GARCH can be viewed as an alternative to SV for analyzing daily stock price data where the constant-variance assumption does not hold. While the response variable of our SV models is the transformed log return (based on a log-square transformation), GARCH models the log return directly. This means that, in principle, we can predict stock returns with GARCH models, which is not feasible with SV models, since SV models ignore the sign of log returns and provide predictive densities for the squared log return only. Motivated by this property, we propose a new model evaluation approach, back-testing return (BTR), specifically for GARCH. BTR produces model evaluation results that are easier to interpret than marginal likelihood and makes it straightforward to draw conclusions about model profitability. Since BTR applies only to GARCH, we also show how to properly calculate marginal likelihoods for comparing GARCH and SV. Based on our MCMC algorithms and model evaluation approaches, we conduct a large number of model fits to compare models in both simulation and empirical studies.
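The DPM prior underlying both DPM and WDPM can be sketched through its stick-breaking construction, in which mixture weights are w_k = v_k ∏_{j<k}(1 − v_j) with v_k ~ Beta(1, α). WDPM, as described above, additionally makes component choice covariate-dependent, which this baseline sketch does not attempt; the concentration parameter and truncation level are toy values:

```python
import numpy as np

rng = np.random.default_rng(1)

def stick_breaking(alpha, n_atoms):
    """Truncated stick-breaking weights for a Dirichlet process prior:
    break a unit stick repeatedly, each break a Beta(1, alpha) fraction
    of what remains."""
    v = rng.beta(1.0, alpha, size=n_atoms)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - v)[:-1]])
    return v * remaining

# Smaller alpha concentrates mass on few components; larger alpha spreads it.
w = stick_breaking(alpha=2.0, n_atoms=50)
```

A DPM draw then pairs these weights with component parameters sampled from a base measure; the truncation at 50 atoms leaves only negligible unassigned mass here.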
76

Estimação clássica e bayesiana para relação espécie-área com distribuições truncadas no zero / Classical and Bayesian estimation for the species-area relationship with zero-truncated distributions

Arrabal, Claude Thiago 23 March 2012 (has links)
In ecology, understanding the species-area relationship (SAR) is extremely important for determining species diversity. SARs are fundamental for assessing the impact of the destruction of natural habitats, creating biodiversity maps, and determining the minimum area to preserve. In this study, the number of species is observed in areas of different sizes. Such studies are approached in the literature through nonlinear models without assuming any distribution for the data. In this situation, it only makes sense to consider areas in which the species counts are greater than zero. Since the dependent variable is a count, we assume it comes from a known distribution for positive discrete data. We use the zero-truncated Poisson (ZTP) and zero-truncated Negative Binomial (ZTNB) distributions to represent the probability distribution of the number of species. To describe the relationship between species diversity and habitat, we consider nonlinear models with asymptotic behavior: Negative Exponential, Weibull, Logistic, Chapman-Richards, Gompertz, and Beta. The models are fitted by maximum likelihood, and we also take a Bayesian approach, using auxiliary latent variables to obtain the conditional distributions needed to implement the Gibbs sampler. We present a comparative study on simulated data and an application to a real data set.
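The zero-truncated Poisson used above simply renormalizes the Poisson pmf over k ≥ 1, since zero-species areas are excluded by design. A minimal sketch, together with the Negative Exponential species-area mean curve named in the abstract (toy parameter values, not fitted estimates):

```python
import math

def ztp_pmf(k, lam):
    """P(K = k) for K ~ Poisson(lam) conditioned on K >= 1."""
    if k < 1:
        return 0.0
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return poisson / (1.0 - math.exp(-lam))   # renormalize over k >= 1

def sar_mean(area, s_max, b):
    """Negative Exponential species-area curve: expected species richness
    rising toward the asymptote s_max as area grows."""
    return s_max * (1.0 - math.exp(-b * area))

# The truncated pmf still sums to one over k = 1, 2, ...
total = sum(ztp_pmf(k, lam=3.0) for k in range(1, 60))
```

In the model above, the ZTP rate at a given area would be tied to `sar_mean`, giving a likelihood that respects both the truncation and the asymptotic SAR shape.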
77

馬可夫鏈蒙地卡羅法在外匯選擇權定價的應用 / Application of Markov chain Monte Carlo methods to currency option pricing

謝盈弘 Unknown Date (has links)
This thesis adopts a regime-switching stochastic volatility (RSV) model for volatility in the currency option market, estimates the RSV model's parameters with the Gibbs sampling algorithm from the Markov chain Monte Carlo (MCMC) family, and predicts currency option prices under the RSV model. On the numerical side, the Gibbs sampling parameter estimates are discussed first, the predicted option prices are then compared with Black-Scholes prices, and finally the volatility smile and the implied volatility surface are presented. The conclusions of this study are: 1. The combination of the RSV model and MCMC simulation is capable of producing a volatility smile, providing sufficient evidence that option prices computed with the RSV model and the MCMC algorithm indeed reflect and capture the features that market option prices should exhibit. 2. The model effectively explains the term structure of volatility and the volatility smile. Keywords: Markov chain Monte Carlo, currency options, Bayesian option pricing, MCMC, regime switching, Gibbs sampling
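The regime-switching stochastic volatility idea above can be sketched by forward simulation: a hidden two-state Markov chain selects each day's volatility level, and returns are Gaussian at that level. The transition matrix and per-regime volatilities below are invented toy values, not the Gibbs-sampled estimates from the thesis:

```python
import numpy as np

rng = np.random.default_rng(7)

P = np.array([[0.95, 0.05],      # calm regime is persistent
              [0.10, 0.90]])     # turbulent regime is persistent
sigma = np.array([0.005, 0.02])  # toy daily FX volatility per regime

def simulate_rsv(n_days, state=0):
    """Simulate hidden regimes and daily log returns under a two-state RSV."""
    states, returns = [], []
    for _ in range(n_days):
        state = rng.choice(2, p=P[state])          # regime transition
        states.append(state)
        returns.append(rng.normal(0.0, sigma[state]))  # return given regime
    return np.array(states), np.array(returns)

states, returns = simulate_rsv(1000)
```

Estimation reverses this generative story: Gibbs sampling alternates between drawing the hidden regime path and the model parameters given that path.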
78

Modelos multidimensionais da TRI com distribuições assimétricas para os traços latentes / Multidimensional IRT models with skew distributions for latent traits.

Gilberto da Silva Matos 15 December 2008 (has links)
The lack of alternatives to the univariate or multivariate normal model is no longer a problem: numerous works now introduce and develop generalizations of the normal distribution with respect to asymmetry, kurtosis, and/or multimodality (Branco and Arellano-Valle (2004), Genton (2004), Arellano-Valle et al. (2006)). In the context of unidimensional models of Item Response Theory (IRT), Bazán (2005) observed this and introduced a class called PANA (Probito Assimétrico - Normal Assimétrica, i.e., skew probit - skew normal), which accounts for asymmetry in the shape of an item response model (a probability) and allows the specification of a skew normal distribution for the unidimensional latent traits used in the estimation process. Motivated by the need to better represent phenomena in the psychometric area (Heinen, 1996, p. 105) and by the current availability of skew elliptical distributions whose properties are as convenient as those of the normal distribution, this work provides an extension of the K-dimensional 3-parameter probit model (Kd3PP), in which latent trait vectors are normally distributed, to the skew-t case (Sahu et al., 2003), generating what we call the Kd3PP-St model. Our proposal can therefore be regarded as an extension of Bazán (2005) in two ways: it extends the unidimensional skew distribution of the latent traits to the multidimensional case, and it accounts for the flattening (kurtosis) of the distribution. It can also be seen as an extension of Béguin and Glas (2001), in that we develop a Bayesian estimation method for multidimensional IRT models via data augmentation with Gibbs sampling for the case where the latent trait vectors follow a multivariate skew-t distribution.

In developing this work, we confronted one of the main difficulties in the estimation and inference of multidimensional IRT models, the lack of identifiability, and, intending to demystify and expand knowledge of a subject still little explored in the IRT literature, we present a bibliographical study of this topic in both the classical and Bayesian inference contexts. To identify particular situations in which a skew normal distribution for the latent traits is most relevant for the estimation and inference of item parameters, as well as other parameters related to the latent trait distribution, we analyze several simulated data sets. These analyses show a modest improvement when information about possible asymmetry in the latent trait distribution is not ignored. Moreover, the results favored the selection of models with asymmetric latent trait distributions, especially models that allow estimation of the location and scale parameters of that distribution. Two main contributions of practical interest are: the analysis and interpretation of tests through unidimensional and multidimensional IRT models that consider both symmetric and asymmetric latent trait distributions, and a function written in R and C++ made available for estimating the models developed in this work.
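The skew normal distribution assumed above for latent traits has a convenient stochastic representation, X = δ|Z0| + sqrt(1 − δ²) Z1 with independent standard normals, which is what makes data-augmentation samplers tractable. A minimal sampling sketch with an arbitrary δ (an illustration, not a thesis result):

```python
import numpy as np

rng = np.random.default_rng(42)

def skew_normal(delta, size):
    """Draw skew normal variates via the convolution representation:
    delta in (-1, 1) controls skewness; delta = 0 recovers the normal."""
    z0 = np.abs(rng.standard_normal(size))   # half-normal component
    z1 = rng.standard_normal(size)           # symmetric component
    return delta * z0 + np.sqrt(1.0 - delta ** 2) * z1

x = skew_normal(delta=0.9, size=50_000)
# Theoretical mean is delta * sqrt(2/pi), about 0.718 for delta = 0.9.
```

The multivariate skew-t used in the thesis layers a Wishart-type scale mixture on top of a vector version of this construction; the scalar case shows the augmentation idea.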
80

Stochastic Nested Aggregation for Images and Random Fields

Wesolkowski, Slawomir Bogumil 27 March 2007 (has links)
Image segmentation is a critical step in building a computer vision algorithm that is able to distinguish between separate objects in an image scene. Image segmentation is based on two fundamentally intertwined components: pixel comparison and pixel grouping. In the pixel comparison step, pixels are determined to be similar or different from each other. In pixel grouping, those pixels which are similar are grouped together to form meaningful regions which can later be processed. This thesis makes original contributions to both of those areas. First, given a Markov Random Field framework, a Stochastic Nested Aggregation (SNA) framework for pixel and region grouping is presented and thoroughly analyzed using a Potts model. This framework is applicable in general to graph partitioning and discrete estimation problems where pairwise energy models are used. Nested aggregation reduces the computational complexity of stochastic algorithms such as Simulated Annealing to order O(N) while at the same time allowing local deterministic approaches such as Iterated Conditional Modes to escape most local minima in order to become a global deterministic optimization method. SNA is further enhanced by the introduction of a Graduated Models strategy which allows an optimization algorithm to converge to the model via several intermediary models. A well-known special case of Graduated Models is the Highest Confidence First algorithm which merges pixels or regions that give the highest global energy decrease. Finally, SNA allows us to use different models at different levels of coarseness. For coarser levels, a mean-based Potts model is introduced in order to compute region-to-region gradients based on the region mean and not edge gradients. Second, we develop a probabilistic framework based on hypothesis testing in order to achieve color constancy in image segmentation. We develop three new shading invariant semi-metrics based on the Dichromatic Reflection Model. 
An RGB image is transformed into an R'G'B' highlight invariant space to remove any highlight components, and only the component representing color hue is preserved to remove shading effects. This transformation is applied successfully to one of the proposed distance measures. The probabilistic semi-metrics show similar performance to vector angle on images without saturated highlight pixels; however, for saturated regions, as well as very low intensity pixels, the probabilistic distance measures outperform vector angle. Third, for interferometric Synthetic Aperture Radar image processing we apply the Potts model using SNA to the phase unwrapping problem. We devise a new distance measure for identifying phase discontinuities based on the minimum coherence of two adjacent pixels and their phase difference. As a comparison we use the probabilistic cost function of Carballo as a distance measure for our experiments.
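The pairwise Potts energy that the stochastic nested aggregation framework above optimizes can be sketched directly: each pair of 4-connected neighbouring pixels with different labels contributes a unit penalty, so homogeneous regions are cheap and fragmented labellings expensive. A toy grid version only; the SNA and Graduated Models machinery itself is not reproduced:

```python
import numpy as np

def potts_energy(labels):
    """Count of disagreeing 4-connected neighbour pairs in a label image:
    the (unit-weight) Potts pairwise energy."""
    horiz = (labels[:, :-1] != labels[:, 1:]).sum()   # left-right pairs
    vert = (labels[:-1, :] != labels[1:, :]).sum()    # up-down pairs
    return int(horiz + vert)

uniform = np.zeros((4, 4), dtype=int)         # one region: zero energy
split = np.zeros((4, 4), dtype=int)
split[:, 2:] = 1                              # two homogeneous halves
checker = np.indices((4, 4)).sum(0) % 2       # worst case: checkerboard

e_uniform, e_split, e_checker = map(potts_energy, (uniform, split, checker))
```

Optimizers such as simulated annealing, ICM, or the SNA hierarchy all score candidate groupings with an energy of this form (plus a data term tying labels to pixel values).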
