About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
121

具有額外或不足變異的群集類別資料之研究 / A Study of Modelling Categorical Data with Overdispersion or Underdispersion

蘇聖珠, Su, Sheng-Chu Unknown Date (has links)
This paper presents a modelling method for analyzing categorical data with overdispersion or underdispersion. In many studies, the final sampling units are drawn from different clusters, and members within the same cluster, having similar backgrounds, tend to give the same or similar responses; their responses are therefore not independent, and the multinomial distribution is not the correct distribution for the observed counts. As a result, the covariance matrix of the sample proportion vector tends to differ substantially from that of the multinomial model. We discuss four models to fit count data with overdispersion or underdispersion, namely the Dirichlet-Multinomial model (DM model), the extended DM model (EDM model), and two mean-covariance models. Maximum-likelihood estimation of the relevant parameters and cell probabilities is discussed for the DM and EDM models, and procedures for deriving generalized least squares estimates are proposed for each of the two mean-covariance models. As for the cell probabilities, we also discuss how to estimate them under several special structures of two-way and three-way tables. Easily evaluated score test statistics are derived for the DM and EDM models to test for the existence of intra-cluster correlation, that is, whether the data exhibit overdispersion or underdispersion, and a test of homogeneity of the intra-cluster correlation across several populations is also discussed.
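To illustrate the overdispersion mechanism this abstract describes, here is a minimal Python sketch, not from the thesis: the cell probabilities, cluster size, and precision parameter theta are made-up values. It simulates Dirichlet-multinomial clusters and compares the empirical covariance of the sample proportions with the nominal multinomial covariance; under the DM model the variances are inflated by the factor 1 + (m - 1)ρ, where ρ = 1/(θ + 1) is the intra-cluster correlation.

```python
import numpy as np

rng = np.random.default_rng(42)

m, n_clusters = 20, 5000            # cluster size, number of clusters
pi = np.array([0.5, 0.3, 0.2])      # mean cell probabilities (assumed values)
theta = 5.0                          # Dirichlet precision; smaller -> more overdispersion

# Each cluster draws its own cell-probability vector, then counts given that vector.
p_cluster = rng.dirichlet(theta * pi, size=n_clusters)
counts = np.array([rng.multinomial(m, p) for p in p_cluster])
props = counts / m

# Nominal multinomial covariance of the sample proportion vector: (diag(pi) - pi pi^T) / m
cov_multinomial = (np.diag(pi) - np.outer(pi, pi)) / m
cov_empirical = np.cov(props.T)

rho = 1.0 / (theta + 1.0)            # intra-cluster correlation under the DM model
print("theoretical variance inflation:", 1 + (m - 1) * rho)
print("empirical/nominal variance ratios:",
      np.diag(cov_empirical) / np.diag(cov_multinomial))
```

With these assumed values the empirical ratios come out near the theoretical inflation factor of about 4.2, which is exactly the discrepancy that invalidates a naive multinomial analysis.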
122

Eliminação de parâmetros perturbadores em um modelo de captura-recaptura / Elimination of nuisance parameters in a capture-recapture model

Salasar, Luis Ernesto Bueno 18 November 2011 (has links)
Financiadora de Estudos e Projetos / The capture-recapture process, largely used in the estimation of the number of elements of an animal population, is also applied in other branches of knowledge such as epidemiology, linguistics, software reliability, and ecology, among others. One of the first applications of this method was made by Laplace in 1783, with the aim of estimating the number of inhabitants of France. Later, Carl G. J. Petersen in 1889 and Lincoln in 1930 applied the same estimator in the context of animal populations, and it has become known in the literature as the Lincoln-Petersen estimator. In the mid-twentieth century, several researchers dedicated themselves to the formulation of statistical models appropriate for the estimation of population size, which caused a substantial increase in the amount of theoretical and applied work on the subject. Capture-recapture models are constructed under certain assumptions concerning the population, the sampling procedure, and the experimental conditions. The main assumption that distinguishes models concerns changes in the number of individuals in the population during the period of the experiment: models that allow for births, deaths, or migration are called open population models, while models that do not allow these events to occur are called closed population models. In this work, the goal is to characterize the likelihood functions obtained by applying methods of elimination of nuisance parameters in the case of closed population models. Based on these likelihood functions, we discuss methods for point and interval estimation of the population size. The estimation methods are illustrated on a real data set, and their frequentist properties are analysed via Monte Carlo simulation.
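The Lincoln-Petersen estimator mentioned above has a simple closed form: if n1 individuals are marked in a first sample, n2 are caught in a second sample, and m of those carry marks, the population size is estimated by N̂ = n1·n2/m. A small sketch follows, with Chapman's bias-corrected variant included as a standard refinement; neither function is taken from the dissertation, and the numbers are hypothetical.

```python
def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Classical Lincoln-Petersen estimate of a closed population's size."""
    if m == 0:
        raise ValueError("no recaptures: the classical estimator is undefined")
    return n1 * n2 / m

def chapman(n1: int, n2: int, m: int) -> float:
    """Chapman's bias-corrected variant, finite even when m == 0."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1

# Hypothetical numbers: 200 marked, 150 caught on the second occasion, 30 recaptures.
print(lincoln_petersen(200, 150, 30))   # 1000.0
print(chapman(200, 150, 30))            # ~978.1
```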
123

Imputação de dados faltantes via algoritmo EM e rede neural MLP com o método de estimativa de máxima verossimilhança para aumentar a acurácia das estimativas / Imputation of missing data via the EM algorithm and an MLP neural network with maximum likelihood estimation to increase the accuracy of the estimates

Ribeiro, Elisalvo Alves 14 August 2015 (has links)
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Databases with missing values are a common occurrence in the real world, a problem that arises for several reasons (failure of the equipment that transmits and stores the data, handler error, failure of the party providing the information, etc.). This can make the data inconsistent and impossible to analyze, leading to heavily biased conclusions. This dissertation explores the use of the Multilayer Perceptron Artificial Neural Network (MLP ANN), with new activation functions, under two approaches (single imputation and multiple imputation). First, we propose using the Maximum Likelihood Estimation (MLE) method in the activation function of each neuron of the network, in contrast to current practice, which either does not use this method or uses it only in the cost function (at the network output). We then compare the results of these approaches with the Expectation Maximization (EM) algorithm, the state of the art for treating missing data. The results indicate that using the MLP network with maximum likelihood estimation, whether in all neurons or only in the output function, leads to imputation with lower error. The experimental results, evaluated by metrics such as MAE (Mean Absolute Error) and RMSE (Root Mean Square Error), showed that the best results in most experiments occurred when the MLP network addressed in this dissertation was used for single and multiple imputation.
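For context, the EM baseline the dissertation compares against can be sketched for the classical multivariate-normal case: missing entries are replaced by their conditional expectations given the observed entries (E-step), and the mean and covariance are re-estimated from the completed data (M-step). The sketch below is a simplified textbook version, not the dissertation's method; it omits the conditional-covariance correction in the M-step, so the covariance estimate is mildly understated.

```python
import numpy as np

def em_mvn_impute(X, n_iter=100):
    """EM-style imputation for a multivariate normal with NaNs marking
    missing values. Simplified: the M-step ignores the conditional-covariance
    term, so the covariance estimate is mildly understated."""
    X = X.copy()
    n, p = X.shape
    miss = np.isnan(X)
    mu = np.nanmean(X, axis=0)                    # start from column means
    X[miss] = np.take(mu, np.where(miss)[1])
    sigma = np.cov(X.T, bias=True)
    for _ in range(n_iter):
        # E-step: conditional expectation of each row's missing block
        # given its observed block.
        for i in range(n):
            mi = miss[i]
            if not mi.any():
                continue
            oi = ~mi
            s_oo = sigma[np.ix_(oi, oi)]
            s_mo = sigma[np.ix_(mi, oi)]
            X[i, mi] = mu[mi] + s_mo @ np.linalg.solve(s_oo, X[i, oi] - mu[oi])
        # M-step: re-estimate mean and covariance from the completed data.
        mu = X.mean(axis=0)
        sigma = np.cov(X.T, bias=True)
    return X, mu, sigma

# Example: bivariate normal with roughly 20% of the second coordinate missing.
rng = np.random.default_rng(0)
Z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=500)
Z[rng.random(500) < 0.2, 1] = np.nan
X_imp, mu_hat, sigma_hat = em_mvn_impute(Z)
print(mu_hat)
print(sigma_hat)
```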
124

Observation error model selection by information criteria vs. normality testing

Lehmann, Rüdiger 17 October 2016 (has links) (PDF)
To extract the best possible information from geodetic and geophysical observations, it is necessary to select a model of the observation errors, mostly the family of Gaussian normal distributions. However, there are alternatives, typically chosen in the framework of robust M-estimation. We give a synopsis of well-known and less well-known models for observation errors and propose to select a model based on information criteria. In this contribution we compare the Akaike information criterion (AIC) and the Anderson-Darling (AD) test and apply them to the test problem of fitting a straight line. The comparison is facilitated by a Monte Carlo approach. It turns out that model selection by AIC has some advantages over the AD test.
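The AIC-based selection the abstract describes can be made concrete for the straight-line test problem: fit the same line under two candidate error models by maximum likelihood and compare AIC = 2k - 2 log L. In the sketch below, the data, parameter values, and the choice of Laplace as the robust alternative are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm, laplace

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 100)
# Synthetic data with heavy-tailed (Laplace) errors; slope and intercept are made up.
y = 2.0 + 0.5 * x + laplace.rvs(scale=1.0, size=x.size, random_state=rng)

def neg_loglik(params, dist):
    a, b, log_s = params              # intercept, slope, log of the error scale
    return -dist.logpdf(y - a - b * x, scale=np.exp(log_s)).sum()

for name, dist in [("normal", norm), ("laplace", laplace)]:
    fit = minimize(neg_loglik, x0=[0.0, 0.0, 0.0], args=(dist,), method="Nelder-Mead")
    aic = 2 * 3 + 2 * fit.fun         # AIC = 2k - 2 log L with k = 3 parameters
    print(f"{name}: AIC = {aic:.1f}")
# The Laplace error model should attain the lower AIC on this heavy-tailed sample.
```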
125

Statistická analýza souborů s malým rozsahem / Statistical Analysis of Sample with Small Size

Holčák, Lukáš January 2008 (has links)
This diploma thesis focuses on the analysis of small samples in situations where it is not possible to obtain more data, typically because of capital intensity or time demands, or because production lacks the means or financial resources to collect more observations. Of course, the analysis of small samples is inherently uncertain, because the resulting inferences are always burdened with some level of uncertainty.
126

Route choice and traffic equilibrium modeling in multi-modal and activity-based networks

Zimmermann, Maëlle 06 1900 (has links)
No description available.
127

Stochastic Modelling of Daily Peak Electricity Demand Using Extreme Value Theory

Boano - Danquah, Jerry 21 September 2018 (has links)
MSc (Statistics) / Department of Statistics / Daily peak electricity demand data from ESKOM, the South African power utility company, for the period January 1997 to December 2013, consisting of 6209 observations, were used in this dissertation. Since 1994, increased electricity demand has led to sustainability issues in South Africa, and demand continues to rise every day due to a variety of driving factors. If the electricity generating capacity in South Africa does not show signs of meeting the country's demand in the coming years, this may have a significant impact on the national grid, causing it to operate in a risky and vulnerable state and leading to disturbances such as the load shedding experienced during the past few years. In particular, it is of great interest to have sufficient information about the extreme values of the stochastic load process in time for proper planning and for designing the generation and distribution system and the storage devices, as these ensure efficient use of electrical energy and maintain discipline in the grid system. More importantly, electricity is an important commodity used mainly as a source of energy in the industrial, residential, and commercial sectors. Effective monitoring of electricity demand is of great importance because demand that exceeds the maximum power generated will lead to power outages and load shedding. In this light, the study seeks to assess the frequency of occurrence of extreme peak electricity demand in order to arrive at a full electricity demand distribution capable of managing uncertainties in the grid system. To achieve stationarity in the daily peak electricity demand (DPED), we apply a penalized cubic regression smoothing spline to detrend the data non-linearly. The R package "evmix" is used to estimate the thresholds using the boundary-corrected kernel density plot. The non-linearly detrended datasets were divided into summer, spring, winter, and autumn according to the calendar dates in the Southern Hemisphere for frequency analysis. The data are declustered using Ferro and Segers' automatic declustering method, and the cluster maxima are extracted using the R package "evd". We fit a Poisson-GPD and a stationary point process to the cluster maxima, and the intensity function of the point process, which measures the frequency of occurrence of the daily peak electricity demand per year, is calculated for each dataset. Formal goodness-of-fit tests based on the Cramér-von Mises and Anderson-Darling statistics supported the null hypothesis that each dataset follows a Poisson-GPD (σ, ξ) at the 5 percent level of significance. The modelling framework, which is easily extensible to other peak load parameters, is based on the assumption that peak power follows a Poisson process. The parameters of the developed models were estimated using maximum likelihood, and the usual asymptotic properties underlying the Poisson-GPD were satisfied by the model. / NRF
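The peaks-over-threshold step of such an analysis can be sketched in a few lines. The thesis works in R with the "evmix" and "evd" packages; the sketch below instead uses Python with scipy's generalized Pareto fit on synthetic stand-in data (no detrending or declustering shown), purely to illustrate the Poisson-GPD idea: excesses over a high threshold are fitted by a GPD, while the yearly exceedance count gives the Poisson rate.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(7)
# Synthetic stand-in for detrended daily peak demand (the thesis uses ESKOM data).
demand = rng.gamma(shape=9.0, scale=1.0, size=6209)

u = np.quantile(demand, 0.95)          # high threshold (the thesis selects it via 'evmix')
excess = demand[demand > u] - u        # excesses over the threshold

# Fit the generalized Pareto distribution to the excesses, location fixed at 0.
xi, _, sigma = genpareto.fit(excess, floc=0)

# Poisson part: mean number of threshold exceedances per year.
years = len(demand) / 365.25
rate = len(excess) / years
print(f"xi = {xi:.3f}, sigma = {sigma:.3f}, exceedances/year = {rate:.1f}")
```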
129

Exact Analysis of Exponential Two-Component System Failure Data

Zhang, Xuan 01 1900 (has links)
A survival distribution is developed for exponential two-component systems that can survive as long as at least one of the two components in the system functions. It is assumed that the two components are initially independent and non-identical. If one of the two components fails (repair is impossible), the surviving component is subject to a different failure rate due to the stress caused by the failure of the other.

In this paper, we consider such an exponential two-component system failure model when the observed failure time data are (1) complete, (2) Type-I censored, (3) Type-I censored with partial information on component failures, (4) Type-II censored, and (5) Type-II censored with partial information on component failures. In these situations, we discuss the maximum likelihood estimates (MLEs) of the parameters, assuming the lifetimes to be exponentially distributed. The exact distributions (whenever possible) of the MLEs of the parameters are then derived using the conditional moment generating function approach. Construction of confidence intervals for the model parameters is discussed using the exact conditional distributions (when available), asymptotic distributions, and two parametric bootstrap methods. The performance of these four types of confidence intervals, in terms of coverage probabilities, is then assessed through Monte Carlo simulation studies. Finally, some examples are presented to illustrate all the methods of inference developed here.

In the case of Type-I and Type-II censored data, since there are no closed-form expressions for the MLEs, we present an iterative maximum likelihood estimation procedure for determining the MLEs of all the model parameters. We also carry out a Monte Carlo simulation study to examine the bias and variance of the MLEs.

In the case of Type-II censored data, since the exact distributions of the MLEs depend on the data, we discuss exact conditional confidence intervals and asymptotic confidence intervals for the unknown parameters by conditioning on the observed data. / Thesis / Doctor of Philosophy (PhD)
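The data-generating model described in the first paragraph is easy to simulate: both components start with their own exponential rates, and by memorylessness the survivor's residual life after the first failure is again exponential, at a stressed rate. A minimal sketch follows; all rate values are made up, and this is not the thesis code.

```python
import numpy as np

rng = np.random.default_rng(3)

def system_lifetimes(l1, l2, l1_star, l2_star, size):
    """Simulate a parallel two-component system whose surviving component
    switches to a stressed failure rate after the first component fails.
    l1, l2 are the initial rates; l1_star, l2_star are the stressed rates."""
    t1 = rng.exponential(1.0 / l1, size)
    t2 = rng.exponential(1.0 / l2, size)
    first = np.minimum(t1, t2)
    # By memorylessness, the survivor's residual life restarts as exponential
    # at its stressed rate once the other component has failed.
    stressed_rate = np.where(t1 < t2, l2_star, l1_star)
    return first + rng.exponential(1.0 / stressed_rate)

# Made-up rates: component 1 outlasts component 2 on average; both degrade under stress.
times = system_lifetimes(0.5, 0.8, 1.2, 1.5, size=10_000)
print(times.mean())   # roughly 1.5 for these rates
```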
130

Some Contributions to Inferential Issues of Censored Exponential Failure Data

Han, Donghoon 06 1900 (has links)
In this thesis, we investigate several inferential issues regarding lifetime data from the exponential distribution under different censoring schemes. For reasons of time constraint and cost reduction, censored sampling is commonly employed in practice, especially in reliability engineering. Among the various censoring schemes, progressive Type-I censoring provides not only the practical advantage of a known termination time but also greater flexibility to the experimenter in the design stage by allowing for the removal of test units at non-terminal time points. Hence, we first consider inference for a progressively Type-I censored life-testing experiment with k uniformly spaced intervals. For small to moderate sample sizes, a practical modification is proposed to the censoring scheme in order to guarantee a feasible life-test under progressive Type-I censoring. Under this setup, we obtain the maximum likelihood estimator (MLE) of the unknown mean parameter and derive the exact sampling distribution of the MLE through the use of the conditional moment generating function, under the condition that the existence of the MLE is ensured. Using the exact distribution of the MLE as well as its asymptotic distribution and the parametric bootstrap method, we discuss the construction of confidence intervals for the mean parameter, and their performance is then assessed through Monte Carlo simulations. Next, we consider a special class of accelerated life tests, known as step-stress tests, in reliability testing. In a step-stress test, the stress levels increase discretely at pre-fixed time points, which allows the experimenter to obtain information on the parameters of the lifetime distributions more quickly than under normal operating conditions. Here, we consider a k-step-stress accelerated life testing experiment with an equal step duration τ. In particular, the case of progressively Type-I censored data with a single stress variable is investigated. For small to moderate sample sizes, we introduce another practical modification to the model for a feasible k-step-stress test under progressive censoring, and the optimal τ is searched for using the modified model. Next, we seek the optimal τ under the condition that the step-stress test proceeds to the k-th stress level, and the efficiency of this conditional inference is compared to that of the preceding models. In all cases, censoring is allowed at each stress-change point iτ, i = 1, 2, ..., k, and the problem of selecting the optimal τ is discussed using the C-optimality, D-optimality, and A-optimality criteria. Moreover, when a test unit fails, there is often more than one fatal cause of failure, such as mechanical or electrical. Thus, we also consider simple step-stress models under Type-I and Type-II censoring when the lifetime distributions corresponding to the different risk factors are independently exponentially distributed. Under this setup, we derive the MLEs of the unknown mean parameters of the different causes under the assumption of a cumulative exposure model. The exact distributions of the MLEs of the parameters are then derived through the use of conditional moment generating functions. Using these exact distributions as well as the asymptotic distributions and the parametric bootstrap method, we discuss the construction of confidence intervals for the parameters and then assess their performance through Monte Carlo simulations. / Thesis / Doctor of Philosophy (PhD)
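As a small concrete anchor for the exponential MLE results discussed above: in the simplest Type-I setting, with censoring at a fixed time tau, the MLE of the exponential mean is the total time on test divided by the number of observed failures, and it exists only if at least one failure is observed, the analogue of the existence condition the abstract alludes to. A sketch with made-up values:

```python
import numpy as np

def exp_mean_mle_type1(times, tau):
    """MLE of the exponential mean under Type-I censoring at time tau:
    total time on test divided by the number of observed failures.
    The MLE exists only if at least one failure occurs before tau."""
    times = np.asarray(times)
    failed = times <= tau
    r = int(failed.sum())
    if r == 0:
        raise ValueError("no failures observed before tau; the MLE does not exist")
    total_time_on_test = times[failed].sum() + (times.size - r) * tau
    return total_time_on_test / r

rng = np.random.default_rng(0)
sample = rng.exponential(scale=10.0, size=50)   # true mean 10 (made-up values)
print(exp_mean_mle_type1(sample, tau=15.0))     # should be near 10
```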
