Global ETD Search

1	A distribuição normal-valor extremo generalizado para a modelagem de dados limitados no intervalo unitário (0,1) / The normal-generalized extreme value distribution for the modeling of data restricted in the unit interval (0,1). Benites, Yury Rojas. 28 June 2019. In this research a new statistical model is introduced for data restricted to the continuous interval (0,1). The proposed model is constructed through a transformation of variables, in which the transformed variable results from combining a variable with a standard normal distribution and the cumulative distribution function of the generalized extreme value distribution. The structural properties of the new model are studied. The new family is extended to regression models, in which the model is reparametrized in terms of the median of the response variable, and the median and the dispersion parameter are related to covariates through a link function. Inferential procedures are developed from both a classical and a Bayesian perspective. The classical inference is based on the theory of maximum likelihood, and the Bayesian inference is based on the Markov chain Monte Carlo method. In addition, simulation studies were performed to evaluate the performance of the classical and Bayesian estimates of the model parameters. Finally, a colorectal cancer data set is considered to show the applicability of the model. Keywords: Bayesian inference; Generalized extreme value distribution; Maximum likelihood estimator; MCMC Method.
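The construction described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the thesis's definition: it assumes the transformed variable is Y = F(Z), with Z standard normal and F the generalized extreme value CDF in its usual textbook form, and the parameter names `mu`, `sigma`, and `xi` and both function names are hypothetical choices for this example.

```python
import math
import random

def gev_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """CDF of the generalized extreme value distribution (textbook form).

    mu, sigma, xi are location, scale, and shape; these names are
    assumptions of this sketch, not notation taken from the thesis.
    """
    z = (x - mu) / sigma
    if xi == 0.0:
        # Gumbel limit of the GEV family.
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower bound when xi > 0,
        # above the upper bound when xi < 0.
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def sample_unit_interval(n, mu=0.0, sigma=1.0, xi=0.0, seed=0):
    """Draw n values of Y = F_GEV(Z) with Z ~ N(0, 1) (assumed construction)."""
    rng = random.Random(seed)
    return [gev_cdf(rng.gauss(0.0, 1.0), mu, sigma, xi) for _ in range(n)]

draws = sample_unit_interval(1000, xi=0.2)
print(min(draws), max(draws))
```

Because F is increasing and the median of Z is 0, the median of Y under this assumed construction is F(0), which is one reason a median-based reparametrization is natural for regression.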
2	Bayesian Networks for Modelling the Respiratory System and Predicting Hospitalizations. Lopo Martinez, Victor. January 2023. Bayesian networks can be used to model the respiratory system. Their structure indicates how risk factors, symptoms, and diseases are related, and the Conditional Probability Tables enable predictions about a patient's need for hospitalization. Numerous structure learning algorithms exist for discerning the structure of a Bayesian network, but none is guaranteed to find the perfect structure. Employing multiple algorithms can reveal relationships between variables that might otherwise remain hidden when relying on a single algorithm. The Maximum Likelihood Estimator is the predominant algorithm for learning the Conditional Probability Tables. However, it faces challenges due to the data fragmentation problem, which can compromise its predictions. Failing to hospitalize patients who require specialized medical care could lead to severe consequences. Therefore, this thesis proposes using an XGBoost model to learn the tables as a novel and better method, since it does not suffer from data fragmentation. A Bayesian network is constructed by combining several structure learning algorithms, and the predictive performance of the Maximum Likelihood Estimator and XGBoost is compared. In predicting future patient hospitalization, XGBoost achieved a maximum accuracy of 86.0%, compared with 81.5% for the Maximum Likelihood Estimator. In this way, the predictive performance of Bayesian networks has been enhanced. Keywords: Bayesian Networks; Structure Learning; Conditional Probability Tables; Maximum Likelihood Estimator; XGBoost; Respiratory System. Subject: Computer and Information Sciences.
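For readers unfamiliar with how a Conditional Probability Table is learned by maximum likelihood, the following is a minimal sketch under stated assumptions. For discrete variables, the maximum likelihood estimate of P(child | parents) is the relative frequency within each parent configuration. The variable names, the toy records, and the function `mle_cpt` are hypothetical and do not come from the thesis.

```python
from collections import Counter, defaultdict

def mle_cpt(records, child, parents):
    """Estimate P(child | parents) by relative frequencies.

    Returns the table and the number of records behind each parent
    configuration, which makes sparsely supported rows visible.
    """
    joint = Counter()
    parent_totals = Counter()
    for r in records:
        key = tuple(r[p] for p in parents)
        joint[(key, r[child])] += 1
        parent_totals[key] += 1
    cpt = defaultdict(dict)
    for (key, value), n in joint.items():
        cpt[key][value] = n / parent_totals[key]
    return dict(cpt), parent_totals

# Hypothetical toy data, invented for illustration only.
records = [
    {"smoker": True, "cough": True, "hospitalized": True},
    {"smoker": True, "cough": True, "hospitalized": True},
    {"smoker": True, "cough": True, "hospitalized": False},
    {"smoker": False, "cough": True, "hospitalized": False},
    {"smoker": False, "cough": False, "hospitalized": False},
]

cpt, support = mle_cpt(records, "hospitalized", ["smoker", "cough"])
for key in sorted(cpt):
    print(key, cpt[key], "records:", support[key])
```

In this toy data the configuration (smoker=True, cough=False) never occurs, so the table has no row for it, and two other rows rest on a single record each. As the number of parents grows, the records are split across more configurations, which is the usual reading of the data fragmentation problem mentioned in the abstract.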
3	Estimation in partly parametric additive Cox models. Läuter, Henning. January 2003. The dependence between survival times and covariates is described, e.g., by proportional hazard models. We consider partly parametric Cox models and discuss here the estimation of interesting parameters. We represent the maximum likelihood approach and extend the results of Huang (1999) from linear to nonlinear parameters. Then we investigate the least squares estimation and formulate conditions for the a.s. boundedness and consistency of these estimators. Keywords: Survival models with covariates; estimation of regression; maximum likelihood estimator; least squares estimator; boundedness; consistency. Subject: Mathematics.
4	Contributions to Estimation and Testing Block Covariance Structures in Multivariate Normal Models. Liang, Yuli. January 2015. This thesis concerns inference problems in balanced random effects models with a so-called block circular Toeplitz covariance structure. This class of covariance structures describes the dependency of some specific multivariate two-level data when both compound symmetry and circular symmetry appear simultaneously. We derive two covariance structures under two different invariance restrictions. The obtained covariance structures reflect both circularity and exchangeability present in the data. In particular, estimation in balanced random effects models with block circular covariance matrices is considered. The spectral properties of such patterned covariance matrices are provided. Maximum likelihood estimation is performed through the spectral decomposition of the patterned covariance matrices. The existence of explicit maximum likelihood estimators is discussed, and sufficient conditions for obtaining explicit and unique estimators for the variance-covariance components are derived. Different restricted models are discussed and the corresponding maximum likelihood estimators are presented. This thesis also deals with hypothesis testing of block covariance structures, especially block circular Toeplitz covariance matrices. We consider both so-called external tests and internal tests. In the external tests, various hypotheses about block covariance structures, as well as mean structures, are considered; the internal tests are concerned with testing specific covariance parameters given the block circular Toeplitz structure. Likelihood ratio tests are constructed, and the null distributions of the corresponding test statistics are derived. Keywords: Block circular symmetry; covariance parameters; explicit maximum likelihood estimator; likelihood ratio test; restricted model; Toeplitz matrix.
5	Effects of template mass, complexity, and analysis method on the ability to correctly determine the number of contributors to DNA mixtures. Alfonse, Lauren Elizabeth. 08 April 2016. In traditional forensic DNA casework, the inclusion or exclusion of individuals who may have contributed to an item of evidence may depend on the assumed number of individuals from which the evidence arose. Typically, the minimum number of contributors (NOC) to a mixture is determined by counting the number of alleles observed above a given analytical threshold (AT); this technique is known as maximum allele count (MAC). However, advances in polymerase chain reaction (PCR) chemistries and improvements in analytical sensitivities have led to an increase in the detection of complex, low template DNA (LtDNA) mixtures for which MAC is an inadequate means of determining the actual NOC. Despite the addition of highly polymorphic loci to multiplexed PCR kits and the advent of interpretation software that deconvolves DNA mixtures, a gap remains in the DNA analysis pipeline, where an effective method of determining the NOC needs to be established. NOCIt, a computational tool that provides a probability distribution on the NOC, may serve as a promising alternative to traditional, threshold-based methods. Utilizing user-provided calibration data consisting of single source samples of known genotype, NOCIt calculates the a posteriori probability (APP) that an evidentiary sample arose from 0 to 5 contributors. The software models baseline noise, reverse and forward stutter proportions, stutter and allele dropout rates, and allele heights. This information is then used to determine whether the evidentiary profile originated from one or many contributors. In short, NOCIt provides information not only on the likely NOC, but also on whether more than one value may be deemed probable. 
In the latter case, it may be necessary to modify downstream interpretation steps such that multiple values for the NOC are considered or the conclusion that most favors the defense is adopted. Phase I of this study focused on establishing the minimum number of single source samples needed to calibrate NOCIt. Once determined, the performance of NOCIt was evaluated and compared to that of two other methods: the maximum likelihood estimator (MLE) -- accessed via the forensim R package, and MAC. Fifty (50) single source samples proved to be sufficient to calibrate NOCIt, and results indicate NOCIt was the most accurate method of the three. Phase II of this study explored the effects of template mass and sample complexity on the accuracy of NOCIt. Data showed that the accuracy decreased as the NOC increased: for 1- and 5-contributor samples, the accuracy was 100% and 20%, respectively. The minimum template mass from any one contributor required to consistently estimate the true NOC was 0.07 ng -- the equivalent of approximately 10 cells' worth of DNA. Phase III further explored NOCIt and was designed to assess its robustness. Because the efficacy of determining the NOC may be affected by the PCR kit utilized, the results obtained from NOCIt analysis of 1-, 2-, 3-, 4-, and 5-contributor mixtures amplified with AmpFlstr® Identifiler® Plus and PowerPlex® 16 HS were compared. A positive correlation was observed for all NOCIt outputs between kits. Additionally, NOCIt was found to result in increased accuracies when analyzed with 1-, 3-, and 4-contributor samples amplified with Identifiler® Plus and with 5-contributor samples amplified with PowerPlex® 16 HS. The accuracy rates obtained for 2-contributor samples were equivalent between kits; therefore, the effect of amplification kit type on the ability to determine the NOC was not substantive. 
Cumulatively, the data indicate that NOCIt is an improvement over traditional methods of determining the NOC and results in high accuracy rates with samples containing sufficient quantities of DNA. Further, the results of investigations into the effect of template mass on the ability to determine the NOC may serve as a caution that forensic DNA samples containing low-target quantities may need to be interpreted using multiple or different assumptions on the number of contributors, as this assumption is known to affect the conclusion in certain casework scenarios. As a significant degree of inaccuracy was observed for all methods of determining the NOC at severely low template amounts, the data presented also challenge the notion that any DNA sample can be utilized for comparison purposes. This suggests that the ability to detect extremely complex, LtDNA mixtures may not be commensurate with the ability to accurately interpret such mixtures, despite critical advances in software-based analysis. In addition to the availability of advanced comparison algorithms, limitations on the interpretability of complex, LtDNA mixtures may also depend on the amount of biological material present on an evidentiary substrate. Keywords: Bioinformatics; DNA mixtures; NOCIt; Low template; Maximum allele count; Maximum likelihood estimator; Number of contributors.
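The maximum allele count (MAC) rule described in this record can be sketched as follows. This is an illustrative example under stated assumptions, not code from the thesis or from NOCIt: it assumes each contributor carries at most two alleles per locus, so the minimum number of contributors is the largest per-locus allele count divided by two and rounded up. The function name, the profile layout, the peak heights, and the choice to treat heights equal to the threshold as detected are all hypothetical.

```python
import math

def min_contributors_mac(profile, analytical_threshold):
    """Minimum number of contributors by maximum allele count.

    profile maps locus -> {allele label: peak height}; the layout and
    the >= comparison against the threshold are assumptions of this sketch.
    """
    max_alleles = 0
    for peaks in profile.values():
        detected = sum(1 for height in peaks.values()
                       if height >= analytical_threshold)
        max_alleles = max(max_alleles, detected)
    # At most two alleles per contributor per locus (diploid assumption).
    return math.ceil(max_alleles / 2)

# Hypothetical peak heights, invented for illustration only.
profile = {
    "D8S1179": {"12": 850, "13": 790, "14": 410, "15": 45, "16": 60},
    "TH01": {"6": 900, "9.3": 870},
}

print(min_contributors_mac(profile, analytical_threshold=50))
print(min_contributors_mac(profile, analytical_threshold=30))
```

Lowering the threshold in this toy profile admits one more low peak and changes the answer, which illustrates why a threshold-based count gives only a lower bound and can be fragile for low template mixtures.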
6	Testing the Hazard Rate, Part I. Liero, Hannelore. January 2003. We consider a nonparametric survival model with random censoring. To test whether the hazard rate has a parametric form, the unknown hazard rate is estimated by a kernel estimator. Based on a limit theorem stating the asymptotic normality of the quadratic distance of this estimator from the smoothed hypothesis, an asymptotic α-test is proposed. Since the test statistic depends on the maximum likelihood estimator for the unknown parameter in the hypothetical model, properties of this parameter estimator are investigated. Power considerations complete the approach. Keywords: kernel estimator of the hazard rate; goodness of fit; maximum likelihood estimator; censoring. Subject: Mathematics.
7	Condições de regularidade para o modelo de regressão com parametrização geral / Regularity conditions for the regression model with general parameterization. Loose, Laís Helen. 24 May 2019. This work aims to present a detailed and systematic study of some regularity conditions for maximum likelihood inference in the multivariate elliptical regression model with general parameterization proposed in Lemonte and Patriota (2011). The model has several important models as particular cases, among them linear and nonlinear homoscedastic and heteroscedastic models, mixed models, heteroscedastic models with errors in the variables and in the equation, and multilevel models. The regularity conditions studied are associated with the identifiability of the model; the existence, uniqueness, consistency, and asymptotic normality of the maximum likelihood estimators (MLEs); and the asymptotic distribution of the test statistics. Sufficient conditions are stated, and theorems are formalized, that guarantee the existence, uniqueness, consistency, and asymptotic normality of the MLEs and the asymptotic distribution of the usual test statistics. In addition, the results of each theorem are discussed and the proofs are presented in detail. Initially, the model was considered under the assumption of normally distributed errors, and the results were then generalized to the elliptical case. To exemplify the results, the validity of some conditions and the results of some theorems are verified analytically in particular cases of the general model. In addition, a simulation study is developed in which one of the conditions is violated under the heteroscedastic model with errors in the variables and in the equation. By means of Monte Carlo simulations, the impact of this violation on the consistency and asymptotic normality of the MLEs is evaluated. Keywords: Asymptotic properties of estimators; Asymptotic theory; Elliptical distribution; Maximum likelihood estimator; Regression models.
8	Extensões do modelo α-potência / Extension for the alpha-power model. Martinez Florez, Guillermo Domingo. 22 June 2011. In data analysis where the data present a certain degree of asymmetry, the assumption of normality can be unrealistic, and applying the normal model can hide important characteristics of the true model. Situations of this type have strengthened the use of asymmetric models, with special emphasis on the skew-symmetric family of distributions developed by Azzalini (1985). In this work we present an alternative for analyzing data with significant asymmetry or kurtosis relative to the normal distribution, as well as for other situations that involve such a model. We present and study properties of the α-power and log-α-power distributions, including the estimation problem, the observed and expected Fisher information matrices, and the degree of bias of the estimators, assessed through simulation. A flexible version of the α-power model is proposed, from which a bimodal version is derived, and symmetric and asymmetric bimodal α-power models are introduced. Next, the α-power distribution is extended to the Birnbaum-Saunders model; properties of the new model are studied, parameter estimators are developed, and bias-corrected estimators are proposed. We also develop α-power regression models for censored and uncensored data and for the log-linear Birnbaum-Saunders regression model, for which parameter estimators are derived and model validation techniques are studied. Finally, a multivariate extension of the α-power model is proposed and some estimation procedures are investigated. All cases are illustrated with data sets previously analyzed under other distributional assumptions. Keywords: asymptotic distribution; Fisher information matrix; likelihood ratio test; maximum likelihood estimator.
10	Implementação no software estatístico R de modelos de regressão normal com parametrização geral / Normal regression models with general parametrization in software R. Perette, André Casagrandi. 23 August 2019. This work aims to develop a package in the R language that implements estimators for univariate normal regression models with general parameterization, a particular case of the model defined in Patriota and Lemonte (2011). This class covers a wide range of known models, such as nonlinear and heteroscedastic regression models. Corrections are implemented for the maximum likelihood estimators (MLEs) and for the likelihood-ratio statistic; these corrections are effective in small samples. The algorithm uses the second-order bias correction of the MLEs presented in Patriota and Lemonte (2009) and Skovgaard's correction for the likelihood-ratio statistic defined in Skovgaard (2001). All functionalities of the package are described in detail. In addition, a Monte Carlo simulation study is carried out under different scenarios, evaluating convergence rates, relative squared error, and the efficiency of the bias correction and of Skovgaard's correction. Keywords: Bias correction; General parameterization; Maximum Likelihood Estimator; R language; Skovgaard's correction; Software R.
