Global ETD Search

11	Misturas finitas de normais assimétricas e de t assimétricas aplicadas em análise discriminante Coelho, Carina Figueiredo 28 June 2013 CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / We investigated the use of finite mixture models with densities in the skew-normal independent family, in particular the skew normal and the skew t, to model the class-conditional distributions of the feature vector in discriminant analysis. The goal is to obtain models able to handle data with more complex structure, such as skewness and multimodality, which often occur in real discriminant analysis problems. To evaluate this approach, we carried out a simulation study and applications to real data sets, analyzing the error rates of the classifiers obtained with these mixture models. Problems were simulated with different structures, varying the separation and distribution of the classes and the size of the training set.
The results of the study suggest that the models evaluated are able to fit the different problems studied, from the simplest to the most complex, in terms of modeling the observations for classification purposes. With real data, where the shapes of the class distributions are unknown, the models showed reasonable error rates when compared with other classifiers. As a limitation, for the data sets analyzed, modeling by finite mixtures was observed to require large samples per class when the dimension of the feature vector is relatively high.
Discriminant analysis; Finite mixture models; Skew normal and skew t; Exact and Earth Sciences: Mathematics
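The classification rule described in this abstract can be illustrated with a minimal sketch. It assumes a univariate feature, Azzalini's skew-normal density for each mixture component, and made-up parameter values; the names `skew_normal_pdf`, `mixture_pdf`, `classify`, `class_models` and `priors` are hypothetical and are not taken from the dissertation, which works with multivariate skew-normal and skew-t mixtures fitted to data.

```python
import math


def skew_normal_pdf(x, loc=0.0, scale=1.0, shape=0.0):
    """Univariate skew-normal density: (2/scale) * phi(z) * Phi(shape * z)."""
    z = (x - loc) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    big_phi = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))
    return 2.0 * phi * big_phi / scale


def mixture_pdf(x, components):
    """Finite mixture density; components are (weight, loc, scale, shape) tuples."""
    return sum(w * skew_normal_pdf(x, loc, scale, shape)
               for w, loc, scale, shape in components)


def classify(x, class_models, priors):
    """Assign x to the class with the largest prior * class-conditional density."""
    scores = {label: priors[label] * mixture_pdf(x, comps)
              for label, comps in class_models.items()}
    return max(scores, key=scores.get)


# Illustrative (assumed) parameters: class A is a two-component mixture,
# class B is a single left-skewed component.
class_models = {
    "A": [(0.6, -2.0, 1.0, 3.0), (0.4, 0.0, 0.5, 0.0)],
    "B": [(1.0, 3.0, 1.0, -2.0)],
}
priors = {"A": 0.5, "B": 0.5}

print(classify(-1.0, class_models, priors))
print(classify(3.0, class_models, priors))
```

In practice the component parameters would be estimated from the training set of each class (for example by an EM-type algorithm), and the error rate would be estimated on held-out observations.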
12	An Investigation of Distribution Functions Su, Nan-cheng 24 June 2008 The study of the properties of probability distributions has always been a persistent theme of statistics and applied probability. This thesis investigates distribution functions under two topics: (i) characterization of distributions based on record values and order statistics, and (ii) properties of the skew-t distribution. Within the extensive characterization literature there are several results involving properties of record values and order statistics. Although many well-known results have already been developed, it is still of great interest to find new characterizations of distributions based on record values and order statistics. In the first part, we provide the conditional distribution of any record value given the maximum order statistics and study characterizations of distributions based on record values and the maximum order statistics. We also give some characterizations of the mean value function within the class of order statistics point processes, using certain relations between the conditional moments of the jump times or current lives. These results can be applied to characterize the uniform distribution using the sequence of order statistics, and the exponential distribution using the sequence of record values, respectively. Azzalini (1985, 1986) introduced the skew-normal distribution, which includes the normal distribution, shares some of its properties, and yet is skewed. This class of distributions is useful in studying robustness and for modeling skewness. Since then, skew-symmetric distributions have been proposed by many authors. In the second part, the so-called generalized skew-t distribution is defined and studied. Examples of distributions in this class, generated by the ratio of two independent skew-symmetric distributions, are given. We also investigate properties of the skew-symmetric distribution.
skew-normal distribution; nonhomogeneous Poisson process; conditional expectation; skew-Cauchy distribution; skew-t distribution; skew-symmetric distribution; order statistics; characterization; conditional distribution; record values; order statistics property
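For reference, the skew-normal density mentioned in this abstract can be written as follows. The standard form is the one given by Azzalini (1985); the location-scale form uses the symbols \xi and \omega, which are notation assumed here and do not appear in the abstract.

```latex
% Standard skew-normal density with shape parameter \lambda;
% \phi and \Phi are the standard normal pdf and cdf.
f(x;\lambda) = 2\,\phi(x)\,\Phi(\lambda x), \qquad x \in \mathbb{R}.

% Setting \lambda = 0 gives \Phi(0) = 1/2, so f(x;0) = \phi(x): the normal case.

% Location-scale version (assumed notation: location \xi, scale \omega > 0):
f(x;\xi,\omega,\lambda) = \frac{2}{\omega}\,
  \phi\!\left(\frac{x-\xi}{\omega}\right)
  \Phi\!\left(\lambda\,\frac{x-\xi}{\omega}\right).
```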
13	Statistical Inference for a New Class of Skew t Distribution and Its Related Properties Basalamah, Doaa 04 August 2017 No description available. Mathematics; Statistics; skew t distribution; skew distribution; Kumaraswamy distribution; Beta distribution; L-moments; Maximum likelihood estimation; Mixture distribution; Mixture of Kumaraswamy distribution; Mixture of
14	Analyse intégrative de données de grande dimension appliquée à la recherche vaccinale / Integrative analysis of high-dimensional data applied to vaccine research Hejblum, Boris 06 March 2015 Gene expression data is recognized as high-dimensional data that needs specific statistical tools for its analysis. But in the context of vaccine trials, other measures, such as flow-cytometry measurements, are also high-dimensional. In addition, such measurements are often repeated over time. This work is built on the idea that using the maximum of available information, by modeling prior knowledge and integrating all data at hand, will improve the inference and the interpretation of biological results from high-dimensional data. First, we present an original methodological development, Time-course Gene Set Analysis (TcGSA), for the analysis of longitudinal gene expression data, taking into account prior biological knowledge in the form of predefined gene sets. Second, we describe two integrative analyses of two different vaccine studies. The first study reveals lower expression of inflammatory pathways consistently associated with lower viral rebound following an HIV therapeutic vaccine. The second study highlights the role of a testosterone-mediated group of genes linked to lipid metabolism in sex differences in immunological response to a flu vaccine. Finally, we introduce a new model-based clustering approach for the automated identification of cell populations from flow-cytometry data, namely a Dirichlet process mixture of skew t-distributions, with a sequential posterior approximation strategy for dealing with repeated measurements.
Hence, the automatic recognition of the cell populations could allow a practical improvement of the daily work of immunologists as well as a better interpretation of gene expression data after taking into account the frequency of all cell populations.
Automated gating; Dirichlet process; Flow cytometry; Flu; Gene set analysis; High-dimensional data; HIV; Integrative analysis; Mixture model; Nonparametric Bayesian; Prior knowledge; Sexual dimorphism; Skew t-distribution; Statistical genomics; Vaccine
15	Calibração linear assimétrica / Asymmetric Linear Calibration Figueiredo, Cléber da Costa 27 February 2009 This thesis addresses theoretical and applied aspects of parameter estimation in the linear calibration model with errors following the skew-normal (Azzalini, 1985) and skew-t-normal (Gómez, Venegas and Bolfarine, 2007) distributions. With an asymmetric error model, it is not necessary to transform the variables in order to obtain symmetric errors. Both frequentist and Bayesian solutions are presented.
Estimation of the parameters and of the variances of the estimators was studied under the frequentist and the Bayesian approach, using EM-type algorithms and Gibbs samplers, respectively. The main point in the frequentist approach is a reparameterization that avoids singularity of the Fisher information matrix under the skew-normal calibration model in a neighborhood of lambda = 0. Another interesting aspect is that this reparameterization leaves the parameter of interest unchanged. In the Bayesian framework, the main point is the development of goodness-of-fit measures that take the asymmetry of the data into account. Two measures are proposed: ADIC (Asymmetric Deviance Information Criterion) and EDIC (Evident Deviance Information Criterion). They extend the ordinary DIC of Spiegelhalter et al. (2002), which, according to the author, should only be used with symmetric models.
EM algorithm; Gibbs sampler; information criteria; singularity of the information matrix
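As background for the criteria named in this abstract, the ordinary DIC of Spiegelhalter et al. (2002) can be computed from posterior draws as sketched below. The abstract does not define ADIC or EDIC, so they are not reproduced here; the normal likelihood, the function names `normal_deviance` and `dic`, and the toy data and draws are assumptions made only for illustration.

```python
import math


def normal_deviance(data, mu, sigma):
    """Deviance D(theta) = -2 * log-likelihood under a N(mu, sigma^2) model."""
    loglik = sum(
        -0.5 * math.log(2.0 * math.pi * sigma ** 2)
        - (y - mu) ** 2 / (2.0 * sigma ** 2)
        for y in data
    )
    return -2.0 * loglik


def dic(data, draws):
    """Return (DIC, p_D) from posterior draws given as (mu, sigma) pairs.

    p_D = mean deviance - deviance at the posterior mean
    DIC = deviance at the posterior mean + 2 * p_D
    """
    mean_dev = sum(normal_deviance(data, m, s) for m, s in draws) / len(draws)
    mu_bar = sum(m for m, _ in draws) / len(draws)
    sigma_bar = sum(s for _, s in draws) / len(draws)
    dev_at_mean = normal_deviance(data, mu_bar, sigma_bar)
    p_d = mean_dev - dev_at_mean
    return dev_at_mean + 2.0 * p_d, p_d


# Toy data and two hand-picked "posterior draws" (assumed, not real MCMC output).
data = [0.0, 1.0, 2.0]
draws = [(0.5, 1.0), (1.5, 1.0)]
dic_value, p_d = dic(data, draws)
print(dic_value, p_d)
```

With real output from a Gibbs sampler, `draws` would hold the sampled parameter values after burn-in, and the deviance function would use the likelihood of the fitted model.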
17	Modelos multidimensionais da TRI com distribuições assimétricas para os traços latentes / Multidimensional IRT models with skew distributions for latent traits. Gilberto da Silva Matos 15 December 2008 The lack of alternatives to the univariate or multivariate normal model is no longer a problem: several works now introduce and develop generalizations of the normal distribution with respect to asymmetry, kurtosis and/or multimodality (Branco and Arellano-Valle (2004), Genton (2004), Arellano-Valle et al. (2006)). In the context of unidimensional Item Response Theory (IRT) models, Bazán (2005) recognized this and introduced a class called PANA (Probito Assimétrico - Normal Assimétrica, i.e. skew probit - skew normal), which allows for asymmetry in the shape of an item response model (a probability) and specifies a skew-normal distribution for the unidimensional latent traits, which is used in the estimation process. Motivated by the need to better represent phenomena in psychometrics (Heinen, 1996, p. 105) and by the current availability of skew-elliptical distributions whose properties are as convenient as those of the normal distribution, this work extends the K-dimensional 3-parameter probit model (Kd3PP) with normally distributed latent trait vectors to the skew-t case (Sahu et al., 2003), yielding what we call the Kd3PP-St model (Kd3PP-tA in the Portuguese text). Our proposal can therefore be regarded as an extension of the work of Bazán (2005) in two ways: it extends the unidimensional skew distribution of the latent traits to the multidimensional case, and it takes the flattening (kurtosis) of this distribution into account.
Our proposal can also be seen as an extension of the work of Béguin and Glas (2001), in that we develop the Bayesian estimation method for multidimensional IRT models via DAGS (Dados Aumentados com Amostrador de Gibbs, i.e. data augmentation with Gibbs sampling) for the case where the latent trait vectors follow a multivariate skew-t distribution. In developing this work we came across one of the main difficulties in estimation and inference for multidimensional IRT models, the lack of identifiability. To broaden and demystify our knowledge of a subject still little explored in the IRT literature, we present a bibliographical study of this topic in the context of both classical and Bayesian inference. To identify particular situations in which a skew-normal distribution for the latent traits is more relevant for the estimation and inference of item parameters, as well as of other parameters related to the distribution of the latent traits, some analyses of simulated data sets are carried out. From these analyses we conclude that there is a modest improvement when information about a possible asymmetry in the distribution of the latent traits is not ignored. Moreover, the results favored the selection of models with asymmetric distributions for the latent traits, especially models that allow the location and scale parameters of the latent trait distribution to be estimated.
Two main contributions that we consider of practical interest are: the analysis and interpretation of tests through the estimation of unidimensional and multidimensional IRT models with both symmetric and skewed distributions for the latent trait vectors, and a function written in R and C++ that is made available for estimating the models presented and developed in this work.
Data augmentation with Gibbs sampling; Identifiability; Bayesian methods via MCMC; Multidimensional IRT models; Multidimensional equating; Multivariate skew-t distribution; R and C++ program
19	Markovo grandinės Monte-Karlo metodo tyrimas ir taikymas / Study and application of Markov chain Monte Carlo method Vaičiulytė, Ingrida 09 December 2014 This dissertation analyzes adaptive Markov chain Monte Carlo (MCMC) methods for building computationally efficient algorithms for decision-making in data analysis with a prescribed accuracy. Parameter estimation problems are formulated and solved for multivariate distributions constructed hierarchically (skew t distribution, Poisson-Gaussian model, stable symmetric vector law). To build the adaptive MCMC procedure, the method of sequential generation of Monte Carlo samples is applied, introducing a statistical stopping criterion and regulation of the sample size. The statistical problems solved by this method reveal relevant features of the computational issues of MCMC methods. The efficiency of the MCMC algorithms is studied using the statistical simulation method constructed in the dissertation. Experiments with athletes' data and with financial data of companies in the health-care industry confirmed that the numerical properties of the method agree with the theoretical model. The methods and algorithms developed were also applied to build a model for sociological data analysis. The studies showed that the adaptive MCMC algorithm yields estimates of the parameters of the distributions considered with a smaller number of chains and reduces the amount of computation roughly by half. The algorithms constructed in this dissertation can be applied to study systems of a stochastic nature and to solve other statistical problems by the MCMC method.
Informatics; Markov chain Monte Carlo method; Skew t distribution; Poisson-Gaussian model; Stable distribution; Statistical modeling
20	Study and application of Markov chain Monte Carlo method / Markovo grandinės Monte-Karlo metodo tyrimas ir taikymas Vaičiulytė, Ingrida 09 December 2014 This dissertation analyzes adaptive Markov chain Monte Carlo (MCMC) methods for building computationally efficient data-analysis and decision-making algorithms with a prescribed accuracy. Parameter estimation problems are formulated and solved for hierarchically constructed multivariate distributions (the skew t distribution, the Poisson-Gaussian model, and the stable symmetric vector law). The adaptive MCMC procedure is built on a sequential Monte Carlo sample generation method, with a statistical stopping criterion and a rule for regulating the sample size. Solving statistical problems with this method reveals characteristic features of the computational issues that arise in MCMC methods. The efficiency of the MCMC algorithms is studied with a statistical modeling method constructed in the dissertation. Experiments with athletes' data and with financial data of enterprises in the health-care industry confirmed that the numerical properties of the method agree with the theoretical model. The methods and algorithms were also applied to build a model for sociological data analysis. The experiments showed that the adaptive MCMC algorithm yields estimates of the parameters of the examined distributions in a smaller number of chain links and reduces the volume of computation by roughly half. The algorithms constructed in the dissertation can be applied to the study of stochastic systems and to other statistical problems solved by the MCMC method. Informatics; Markov chain Monte Carlo method; Skew t distribution; Poisson-Gaussian model; Stable distribution; Statistical modeling
