91

Relação entre níveis de significância Bayesiano e freqüentista: e-value e p-value em tabelas de contingência / Relationship between Bayesian and frequentist significance tests: e-value and p-value in contingency tables

Petri, Cátia 20 April 2007 (has links)
The FBST (Full Bayesian Significance Test) is a procedure for testing precise hypotheses, introduced by Pereira and Stern (1999) and based on computing the posterior probability of the set tangent to the set that defines the null hypothesis. The procedure is a Bayesian alternative to the usual significance tests. In this work we study the relationship between the results of the FBST and those of a frequentist test, the GLRT (Generalized Likelihood Ratio Test), through some classical hypothesis-testing problems. We also present all the computational procedures used to run both tests automatically on large samples, which was necessary for studying the relationship between the two tests.
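As a point of reference for the comparison above, the Pereira and Stern (1999) evidence value can be sketched as follows; the notation (posterior density, parameter space, null set) is ours and only summarises the construction the abstract describes.

```latex
% Tangent set: parameter values whose posterior density exceeds the supremum over the null set
T(x) = \{\, \theta \in \Theta : p(\theta \mid x) > \sup_{\theta_0 \in \Theta_0} p(\theta_0 \mid x) \,\}
% Evidence value (e-value) in favour of H_0, compared in the thesis with the GLRT p-value
\operatorname{ev}(H_0; x) = 1 - \Pr\!\big(\theta \in T(x) \mid x\big)
```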
92

Discrete Weibull regression model for count data

Kalktawi, Hadeel Saleh January 2017 (has links)
Data can be collected in the form of counts in many situations: the number of deaths from an accident, the number of days until a machine stops working, or the number of annual visitors to a city may all be variables of interest. This study is motivated by two facts. First, the continuous Weibull distribution plays a vital role in survival analysis and failure-time studies; the discrete Weibull (DW) distribution is defined analogously to it (see Nakagawa and Osaki (1975) and Kulasekera (1994)). Second, researchers usually focus on modeling count data, which take only non-negative integer values, as a function of other variables. The DW distribution introduced by Nakagawa and Osaki (1975) is therefore used to investigate the relationship between count data and a set of covariates; in particular, it is generalised by allowing one of its parameters to be a function of covariates. Although Poisson regression is the most common model for count data, it is constrained by equi-dispersion (the assumption of equal mean and variance). Negative binomial (NB) regression has consequently become the most widely used method for count regression, but while the NB model suits over-dispersed data, it is not the best choice for under-dispersed data. Models that handle under-dispersion are therefore needed, such as the generalized Poisson regression model (Efron (1986) and Famoye (1993)) and COM-Poisson regression (Sellers and Shmueli (2010) and Sáez-Castillo and Conde-Sánchez (2013)). All of these can be viewed as modifications and developments of the Poisson model. This thesis instead develops a model based on a simple distribution with no modification: if the data do not follow the dispersion pattern of the Poisson or NB models, the true structure generating the data should be detected, and a model able to handle different kinds of dispersion is of great interest. Thus, the DW regression model is introduced. Besides the flexibility of the DW to model under- and over-dispersion, it is a good model for inhomogeneous and highly skewed data, such as data with excessive zero counts, which are more dispersed than Poisson. Although such data can be fitted well with zero-inflated and hurdle models, the DW provides a good fit with less complexity than these modified models. However, in some cases a model that separates the probability of zeros from that of the positive counts is required; to cope with excess zeros, two modifications of the DW regression are developed, the zero-inflated discrete Weibull (ZIDW) and hurdle discrete Weibull (HDW) models. Furthermore, this thesis considers data in which the response count is right-censored, as observed in many experiments. Applying standard models to such data without accounting for the censoring may yield misleading results, so the censored discrete Weibull (CDW) model is employed for this case. Finally, the thesis introduces the median discrete Weibull (MDW) regression model for investigating the effect of covariates on the count response through the median, which is more appropriate for the skewed nature of count data. The likelihood of the DW model is re-parameterized so that the predictors act directly on the median. In comparison with generalized linear models (GLMs), both MDW and GLMs relate the response to a set of covariates via a location measure; however, GLMs use the mean, which is not the best summary of skewed data. These DW regression models are investigated through simulation studies to illustrate their performance and are applied to several real data sets, where they are compared with related count models, mainly Poisson and NB models. Overall, the DW models fit count data well, serving as an alternative to NB models in the over-dispersion case and fitting much better than Poisson models; moreover, unlike the NB model, the DW can also be applied in the under-dispersion case.
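As a rough illustration of the distribution underlying these models, the sketch below gives the type-I discrete Weibull probability mass function, its closed-form median, and one possible regression likelihood. The complementary log-log-style link for q and the overall parameterization are illustrative assumptions, not necessarily those used in the thesis.

```python
import numpy as np

def dw_pmf(y, q, beta):
    """Type-I discrete Weibull pmf (Nakagawa & Osaki, 1975):
    P(Y = y) = q**(y**beta) - q**((y+1)**beta), y = 0, 1, 2, ..."""
    y = np.asarray(y, dtype=float)
    return q ** (y ** beta) - q ** ((y + 1) ** beta)

def dw_median(q, beta):
    """Median of the discrete Weibull: the smallest integer m with
    F(m) = 1 - q**((m+1)**beta) >= 1/2."""
    return np.ceil((np.log(0.5) / np.log(q)) ** (1.0 / beta)) - 1

def neg_loglik(params, y, X):
    """Negative log-likelihood of a DW regression in which q depends on
    covariates through a log-log-type link (illustrative choice)."""
    gamma, log_beta = params[:-1], params[-1]
    beta = np.exp(log_beta)                 # keep beta > 0
    q = np.exp(-np.exp(X @ gamma))          # keeps q in (0, 1)
    p = dw_pmf(y, q, beta)
    return -np.sum(np.log(np.clip(p, 1e-300, None)))
```

A general-purpose optimiser such as scipy.optimize.minimize can then be applied to neg_loglik, with the fitted median obtained from dw_median at the estimated parameters.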
93

Two component semiparametric density mixture models with a known component

Zhou Shen (5930258) 17 January 2019 (has links)
Finite mixture models have been successfully used in many applications, such as classification, clustering, and many others. As opposed to classical parametric mixture models, nonparametric and semiparametric mixture models often provide more flexible approaches to the description of inhomogeneous populations. As an example, in the last decade a particular two-component semiparametric density mixture model with a known component has attracted substantial research interest. Our thesis provides an innovative way of estimation for this model based on minimization of a smoothed objective functional, conceptually similar to the log-likelihood. The minimization is performed with the help of an EM-like algorithm. We show that the algorithm is convergent and that the minimizers of the objective functional, viewed as estimators of the model parameters, are consistent.

More specifically, in our thesis a semiparametric mixture of two density functions is considered where one of them is known while the weight and the other function are unknown. For the first part, a new sufficient identifiability condition for this model is derived, and a specific class of distributions describing the unknown component is given for which this condition is mostly satisfied. A novel approach to estimation of this model is derived. That approach is based on the idea of using a smoothed likelihood-like functional as an objective functional in order to avoid ill-posedness of the original problem. Minimization of this functional is performed using an iterative Majorization-Minimization (MM) algorithm that estimates all of the unknown parts of the model. The algorithm possesses a descent property with respect to the objective functional. Moreover, we show that the algorithm converges even when the unknown density is not defined on a compact interval. Later, we also study properties of the minimizers of this functional viewed as estimators of the mixture model parameters. Their convergence to the true solution with respect to a bandwidth parameter is justified by reconsidering them in the framework of a Tikhonov-type functional. They also turn out to be large-sample consistent; this is justified using an empirical minimization approach. The third part of the thesis contains a series of simulation studies, a comparison with another method, and a real data example. All of them show the good performance of the proposed algorithm in recovering unknown components from data.
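The following is a minimal EM-like sketch in the spirit of the smoothed-likelihood algorithms the abstract refers to, not the thesis's exact MM procedure: responsibilities of the known component are updated, the weight is re-estimated, and the unknown density is refit as a weighted kernel estimate. The Gaussian known component, the default bandwidth, and the stopping rule are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm, gaussian_kde  # weighted gaussian_kde requires SciPy >= 1.2

def fit_two_component(x, f0_pdf, p_init=0.5, n_iter=200, tol=1e-6):
    """EM-like updates for g(x) = p*f0(x) + (1-p)*f(x) with f0 known.
    The unknown density f is represented by a weighted kernel estimate."""
    x = np.asarray(x, dtype=float)
    p = p_init
    f_hat = gaussian_kde(x)                      # crude initial guess for the unknown density
    for _ in range(n_iter):
        num = p * f0_pdf(x)
        denom = num + (1.0 - p) * f_hat(x)
        w = num / denom                          # posterior weight of the known component
        p_new = w.mean()                         # update the mixing proportion
        # re-estimate the unknown density from points weighted by 1 - w
        f_hat = gaussian_kde(x, weights=(1.0 - w) / (1.0 - w).sum())
        if abs(p_new - p) < tol:
            p = p_new
            break
        p = p_new
    return p, f_hat

# Illustration on simulated data: known N(0,1) component plus a shifted unknown component.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 300), rng.normal(3, 1, 700)])
p_hat, f_hat = fit_two_component(x, f0_pdf=norm(0, 1).pdf)
print(round(p_hat, 3))   # expected to be roughly 0.3 in this well-separated example
```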
95

Best-subset model selection based on multitudinal assessments of likelihood improvements

Carter, Knute Derek 01 December 2013 (has links)
Given a set of potential explanatory variables, one model selection approach is to select the best model, according to some criterion, from among the collection of models defined by all possible subsets of the explanatory variables. A popular procedure that has been used in this setting is to select the model that results in the smallest value of the Akaike information criterion (AIC). One drawback in using the AIC is that it can lead to the frequent selection of overspecified models. This can be problematic if the researcher wishes to assert, with some level of certainty, the necessity of any given variable that has been selected. This thesis develops a model selection procedure that allows the researcher to nominate, a priori, the probability at which overspecified models will be selected from among all possible subsets. The procedure seeks to determine if the inclusion of each candidate variable results in a sufficiently improved fitting term, and hence is referred to as the SIFT procedure. In order to determine whether there is sufficient evidence to retain a candidate variable or not, a set of threshold values are computed. Two procedures are proposed: a naive method based on a set of restrictive assumptions; and an empirical permutation-based method. Graphical tools have also been developed to be used in conjunction with the SIFT procedure. The graphical representation of the SIFT procedure clarifies the process being undertaken. Using these tools can also assist researchers in developing a deeper understanding of the data they are analyzing. The naive and empirical SIFT methods are investigated by way of simulation under a range of conditions within the standard linear model framework. The performance of the SIFT methodology is compared with model selection by minimum AIC; minimum Bayesian Information Criterion (BIC); and backward elimination based on p-values. The SIFT procedure is found to behave as designed—asymptotically selecting those variables that characterize the underlying data generating mechanism, while limiting the selection of false or spurious variables to the desired level. The SIFT methodology offers researchers a promising new approach to model selection, whereby they are now able to control the probability of selecting an overspecified model to a level that best suits their needs.
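The SIFT thresholds themselves are not reproduced here. As a point of reference, the baseline the thesis compares against, exhaustive best-subset selection by minimum AIC in a linear model, can be sketched as follows; statsmodels is assumed, and the data in the usage example are hypothetical.

```python
from itertools import combinations
import numpy as np
import statsmodels.api as sm

def best_subset_by_aic(y, X):
    """Fit an OLS model for every subset of the candidate columns of X
    (intercept always included) and return the subset with minimum AIC."""
    n, k = X.shape
    best = (np.inf, ())
    for size in range(k + 1):
        for subset in combinations(range(k), size):
            design = sm.add_constant(X[:, list(subset)]) if subset else np.ones((n, 1))
            aic = sm.OLS(y, design).fit().aic
            best = min(best, (aic, subset))
    return best   # (minimum AIC, indices of selected columns)

# Hypothetical usage with four candidate predictors, two of which matter.
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 4))
y = 1.0 + 2.0 * X[:, 0] - X[:, 2] + rng.normal(size=100)
print(best_subset_by_aic(y, X))   # typically selects columns (0, 2), sometimes with extras
```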
96

Flächennutzungswandel in Tirana : Untersuchungen anhand von Landsat TM, Terra ASTER und GIS

Richter, Dietmar January 2007 (has links)
In the course of the 1990s, migration to Tirana led to enormous land consumption at the expense of agricultural land in the surroundings of the Albanian capital. This thesis documents the development of this rapid land consumption using computer-assisted methods. The study is based on two satellite scenes acquired at different dates (1988 and 2000), which are used to carry out a change analysis. The aim of the change analysis is to analyse land-use change, to generate data, and to visualise the results in a suitable way. The main methods used in the change analysis are maximum likelihood classification and a knowledge-based classification approach. The results of the change analysis are presented in change maps and evaluated statistically with GIS software.
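As an illustration of the per-pixel Gaussian maximum likelihood classifier named above (one of the two classification approaches), a minimal sketch follows; the band values, class statistics, and two-class setup are hypothetical, and the knowledge-based classification step is not shown.

```python
import numpy as np

def ml_classify(pixels, class_means, class_covs):
    """Gaussian maximum likelihood classification: assign each pixel
    (a vector of band values) to the class with the highest Gaussian
    log-likelihood, with class statistics estimated from training areas."""
    scores = []
    for mu, cov in zip(class_means, class_covs):
        diff = pixels - mu
        inv = np.linalg.inv(cov)
        # log-density of a multivariate normal, up to a common constant
        maha = np.einsum('ij,jk,ik->i', diff, inv, diff)
        scores.append(-0.5 * (np.log(np.linalg.det(cov)) + maha))
    return np.argmax(np.stack(scores, axis=1), axis=1)

# Hypothetical usage with two land-cover classes and three spectral bands.
rng = np.random.default_rng(1)
pixels = rng.normal(size=(10, 3))
means = [np.zeros(3), np.ones(3) * 2]
covs = [np.eye(3), np.eye(3)]
print(ml_classify(pixels, means, covs))
```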
97

A Strategy for Earthquake Catalog Relocations Using a Maximum Likelihood Method

Li, Ka Lok January 2012 (has links)
A strategy for relocating earthquakes in a catalog is presented. The strategy is based on the argument that the distribution of the earthquake events in a catalog is reasonable a priori information for earthquake relocation in that region. This argument can be implemented using the method of maximum likelihood for arrival time data inversion, where the a priori probability distribution of the event locations is defined as the sum of the probability densities of all events in the catalog. This a priori distribution is then added to the standard misfit criterion in earthquake location to form the likelihood function. The probability density of an event in the catalog is described by a Gaussian probability density. The a priori probability distribution is, therefore, defined as the normalized sum of the Gaussian probability densities of all events in the catalog, excluding the event being relocated. For a linear problem, the likelihood function can be approximated by the joint probability density of the a priori distribution and the distribution of an unconstrained location due to the misfit alone. After relocating the events according to the maximum of the likelihood function, a modified distribution of events is generated. This distribution should be more densely clustered than before in general since the events are moved towards the maximum of the posterior distribution. The a priori distribution is updated and the process is iterated. The strategy is applied to the aftershock sequence in southwest Iceland after a pair of earthquakes on 29th May 2008. The relocated events reveal the fault systems in that area. Three synthetic data sets are used to test the general behaviour of the strategy. It is observed that the synthetic data give significantly different behaviour from the real data.
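For orientation, the likelihood the abstract describes (arrival-time misfit combined with a catalog-based prior) can be sketched as follows; the notation, including the data and prior covariances, is ours rather than the thesis's.

```latex
% m: hypocentre/origin-time parameters of the event being relocated
% t, t(m): observed and predicted arrival times; m_k: the other K catalog events
\ell(\mathbf{m}) \propto
\exp\!\Big(-\tfrac{1}{2}\,[\mathbf{t}-\mathbf{t}(\mathbf{m})]^{\mathsf T}\mathbf{C}_d^{-1}[\mathbf{t}-\mathbf{t}(\mathbf{m})]\Big)
\cdot
\frac{1}{K}\sum_{k=1}^{K}\exp\!\Big(-\tfrac{1}{2}\,(\mathbf{m}-\mathbf{m}_k)^{\mathsf T}\mathbf{C}_m^{-1}(\mathbf{m}-\mathbf{m}_k)\Big)
```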
98

Påverkan av auktoritet : Berömmelse ingen faktor som övertygar

Illi, Peter January 2013 (has links)
Iconic authorities are persons ascribed such significance that they have come to symbolise parts or aspects of society, or epochs in history, and who can be said to have had a considerable influence on society and culture. The study examined whether iconic authorities influence us more than, or differently from, other sources. In an experiment, university students divided into four groups read, under high elaboration, a text about a psychological theory in which the variables iconic authority and personal relevance were manipulated. The participants were then asked to rate how credible they considered the theory to be. The study posed two hypotheses: that high iconic authority would increase the rated credibility and that personal relevance would decrease it. The results supported neither hypothesis, but an interaction effect showed that, under low personal relevance, the text was perceived as more credible by participants exposed to low iconic authority than by those exposed to high iconic authority. It is suggested that this interaction is due to a backlash effect.
99

Empirical Likelihood Confidence Intervals for ROC Curves with Missing Data

An, Yueheng 25 April 2011 (has links)
The receiver operating characteristic (ROC) curve is widely used to evaluate the diagnostic performance of a test, that is, the accuracy of a test in discriminating normal cases from diseased cases. In biomedical studies we often encounter missing data, to which the regular inference procedures cannot be applied directly. In this thesis, random hot deck imputation is used to obtain a 'complete' sample. Empirical likelihood (EL) confidence intervals are then constructed for ROC curves. The empirical log-likelihood ratio statistic is derived, and its asymptotic distribution is proved to be a weighted chi-square distribution. The results of a simulation study show that the EL confidence intervals perform well in terms of coverage probability and average length for various sample sizes and response rates.
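As a sketch of the imputation step described above, random hot deck imputation replaces each missing response by a draw from the observed donors. How donors are grouped (for example by disease status) and the subsequent EL interval construction are not shown, and the data below are hypothetical.

```python
import numpy as np

def random_hot_deck(values, missing_mask, rng=None):
    """Random hot deck imputation: each missing entry is replaced by a
    value drawn at random (with replacement) from the observed donors."""
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values, dtype=float).copy()
    donors = values[~missing_mask]
    values[missing_mask] = rng.choice(donors, size=missing_mask.sum(), replace=True)
    return values

# Hypothetical test scores with some missing responses:
scores = np.array([0.9, 1.4, np.nan, 2.1, np.nan, 1.7])
completed = random_hot_deck(scores, np.isnan(scores))
print(completed)
```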
100

Copula Models for Multi-type Life History Processes

Diao, Liqun January 2013 (has links)
This thesis considers statistical issues in the analysis of data in the studies of chronic diseases which involve modeling dependencies between life history processes using copula functions. Many disease processes feature recurrent events which represent events arising from an underlying chronic condition; these are often modeled as point processes. In addition, however, there often exists a random variable which is realized upon the occurrence of each event, which is called a mark of the point process. When considered together, such processes are called marked point processes. A novel copula model for the marked point process is described here which uses copula functions to govern the association between marks and event times. Specifically, a copula function is used to link each mark with the next event time following the realization of that mark to reflect the pattern in the data wherein larger marks are often followed by longer time to the next event. The extent of organ damage in an individual can often be characterized by ordered states, and interest frequently lies in modeling the rates at which individuals progress through these states. Risk factors can be studied and the effect of therapeutic interventions can be assessed based on relevant multistate models. When chronic diseases affect multiple organ systems, joint modeling of progression in several organ systems is also important. In contrast to common intensity-based or frailty-based approaches to modelling, this thesis considers a copula-based framework for modeling and analysis. Through decomposition of the density and by use of conditional independence assumptions, an appealing joint model is obtained by assuming that the joint survival function of absorption transition times is governed by a multivariate copula function. Different approaches to estimation and inference are discussed and compared including composite likelihood and two-stage estimation methods. Special attention is paid to the case of interval-censored data arising from intermittent assessment. Attention is also directed to use of copula models for more general scenarios with a focus on semiparametric two-stage estimation procedures. In this approach nonparametric or semiparametric estimates of the marginal survivor functions are obtained in the first stage and estimates of the association parameters are obtained in the second stage. Bivariate failure time models are considered for data under right-censoring and current status observation schemes, and right-censored multistate models. A new expression for the asymptotic variance of the second-stage estimator for the association parameter along with a way of estimating this for finite samples are presented under these models and observation schemes.
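As a rough illustration of the two-stage idea described above (nonparametric margins in the first stage, association parameter in the second), the following sketch uses a Clayton copula on uncensored, hypothetical mark and gap-time data. The copula family, the rank-based margins, and the data are illustrative assumptions, and censoring is not handled here.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import rankdata

def clayton_logdensity(u, v, theta):
    """Log-density of the Clayton copula, theta > 0."""
    s = u ** (-theta) + v ** (-theta) - 1.0
    return (np.log1p(theta)
            - (theta + 1.0) * (np.log(u) + np.log(v))
            - (2.0 + 1.0 / theta) * np.log(s))

def two_stage_clayton(x, y):
    """Two-stage (pseudo-likelihood) estimation: estimate the margins by
    rescaled empirical ranks, then maximize the copula log-likelihood."""
    n = len(x)
    u = rankdata(x) / (n + 1.0)    # first stage: nonparametric margins
    v = rankdata(y) / (n + 1.0)
    res = minimize_scalar(lambda th: -clayton_logdensity(u, v, th).sum(),
                          bounds=(1e-3, 20.0), method='bounded')
    return res.x                   # second stage: association parameter

# Illustration: marks and subsequent gap times with positive dependence (hypothetical data).
rng = np.random.default_rng(2)
marks = rng.gamma(2.0, 1.0, 500)
gaps = marks + rng.exponential(1.0, 500)   # larger marks tend to precede longer gaps
print(two_stage_clayton(marks, gaps))
```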
