31 |
Παραβιάσεις των βασικών υποθέσεων του γραμμικού μοντέλου παλινδρόμησης / Violations of the basic assumptions of the linear regression model / Γρηγοριάδου, Μαρία, 05 February 2015
A statistical model is a formalization of stochastic relationships between variables, expressed as mathematical equations, whose purpose is to describe a system (a phenomenon or an event) as accurately as possible. Almost every system contains variable quantities that change. An interesting question is the study of the effects that these variables have (or appear to have) on others. This study is the subject of regression analysis, a widely used statistical technique for detecting and modeling relationships and dependencies between variables. When the relationships between the variables are linear, the result is the so-called linear regression models. Statistical regression models rest on certain basic assumptions, which we are obliged to check before analyzing the model. In practice, however, these assumptions are often violated. With real-world data in particular, violations are so frequent that they are the rule rather than the exception.
This thesis addresses the important problem that arises when some of the basic assumptions underlying the linear regression model are violated. The purpose of the thesis is:
a) to analyze the causes of each violation and its effects on the model,
b) to record the main ways of detecting violations in the model,
c) to find ways of dealing with the "problematic situations".
The results show that combining established knowledge of the subject (the theoretical background) with modern methods and ideas can significantly reduce the adverse consequences of the violations for the model, and occasionally even reverse the damage, while allowing us to "salvage" a satisfactory amount of information.
|
32 |
The comparison of stochastic frontier analysis with panel data models / Zhang, Miao, January 2012
Since Koopmans introduced the idea of efficiency in 1951, and Pitt and Lee (1981) and Schmidt and Sickles (1984) first brought panel data into efficiency analysis, the techniques of stochastic frontier analysis have developed rapidly and have been applied widely in areas such as education, industry and hospitals. Most researchers, however, focus on only one aspect: either the development of new models or empirical applications. This thesis attempts to fill that gap by building a general picture of the properties of different panel data stochastic frontier models, in both statistical and economic terms, through a comparison of different models applied to different production applications. The thesis also attempts to shed light on whether particular panel data stochastic frontier models are better suited to particular data sets. The models selected range from the simplest situation, with no heterogeneity or heteroscedasticity, to more complicated ones that include exogenous variables. The thesis examines not only classical models, such as those of Pitt and Lee (1981) and Battese and Coelli (1992, 1995), but also newly developed models, such as the latent class model and the fixed management model. On the economic side, the data cover both the microeconomic and the macroeconomic level, with applications to world GDP and to the Italian manufacturing industry. The results show that panel data stochastic frontier models perform better at the microeconomic level than at the macroeconomic level; that the classical models perform better than the newly developed ones; that some panel data stochastic frontier models make idealized assumptions whose requirements on the data set are hard to meet; and that the influence of the exogenous variables is quite strong.
|
33 |
Econometric Models of Crop Yields: Two Essays / Tolhurst, Tor, 17 May 2013
This thesis is an investigation of econometric crop yield models, divided into two essays. In the first essay, I propose estimating a single heteroscedasticity coefficient for all counties within a crop-reporting district by pooling county-level crop yield data in a two-stage estimation process. In the context of crop insurance, where heteroscedasticity has significant economic implications, I demonstrate that the pooling approach provides economically and statistically significant improvements in rating crop insurance contracts over contemporary methods. In the second essay, I propose a new method for measuring the rate of technological change in crop yields. To date the agricultural economics literature has measured technological change exclusively at the mean; in contrast, the proposed model can measure the rate of technological change in endogenously defined yield subpopulations. I find evidence of different rates of technological change in yield subpopulations, which leads to interesting questions about the effect of technological change on agricultural production. / Ontario Ministry of Agriculture and Food
|
34 |
Estimation for state space models: quasi-likelihood and asymptotic quasi-likelihood approaches / Al zghool, Raed Ahmad Hasan, January 2008
Thesis (Ph.D.)--University of Wollongong, 2008. / Typescript. Includes bibliographical references: leaf 239-254.
|
35 |
Essays on theories and applications of spatial econometric models / Lin, Xu, January 2006
Thesis (Ph. D.)--Ohio State University, 2006. / Title from first page of PDF file. Includes bibliographical references (p. 114-119).
|
36 |
Predicting Uncertainty in Financial Markets: An empirical study on ARCH-class models ability to estimate Value at Risk / Nybrant, Arvid; Rundberg, Henrik, January 2018
Over the last couple of decades, Value at Risk has become one of the most widely used measures of market risk, and several methods for computing it have been suggested. In this paper, we evaluate the GARCH(1,1), EGARCH(1,1) and APARCH(1,1) models for estimating this measure under the assumption that the conditional error distribution is normal, t, skewed t or NIG, respectively. For each model, the 95% and 99% one-day Value at Risk is computed using rolling out-of-sample forecasts for three equity indices. These forecasts are evaluated with Kupiec's test for unconditional coverage and Christoffersen's test for conditional coverage. The results imply that the models generally perform well. The APARCH(1,1) model seems to be the most robust, but the GARCH(1,1) and EGARCH(1,1) models also provide accurate predictions. The results indicate that the assumed conditional distribution matters more for the 99% than for the 95% Value at Risk. In general, a leptokurtic distribution appears to be a sound choice for the conditional distribution.
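The unconditional coverage test mentioned in the abstract can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `kupiec_pof` and its arguments are hypothetical, and it assumes the standard likelihood-ratio form of Kupiec's proportion-of-failures test, with a chi-squared reference distribution on one degree of freedom.

```python
import math


def kupiec_pof(n_obs, n_exceed, var_level):
    """Kupiec proportion-of-failures test (illustrative sketch).

    n_obs     -- number of out-of-sample VaR forecasts
    n_exceed  -- number of days the loss exceeded the forecast VaR
    var_level -- VaR confidence level, e.g. 0.95 or 0.99
    Returns (likelihood-ratio statistic, p-value).
    """
    p = 1.0 - var_level  # exceedance probability implied by the VaR level

    def loglik(q):
        # Binomial log-likelihood of n_exceed failures in n_obs trials,
        # skipping terms with a zero count so that log(0) is never evaluated.
        ll = 0.0
        if n_obs - n_exceed > 0:
            ll += (n_obs - n_exceed) * math.log(1.0 - q)
        if n_exceed > 0:
            ll += n_exceed * math.log(q)
        return ll

    observed_rate = n_exceed / n_obs
    lr = -2.0 * (loglik(p) - loglik(observed_rate))
    lr = max(lr, 0.0)  # guard against tiny negative values from rounding
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value


# Example call: 1000 forecasts at the 95% level with 80 exceedances.
lr_stat, p_val = kupiec_pof(1000, 80, 0.95)
```

A small p-value suggests that the observed exceedance rate differs from the rate implied by the VaR level. The test says nothing about clustering of exceedances, which is what the conditional coverage test addresses.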
|
37 |
Critérios de seleção para incremento de uniformidade de produção em bovinos de corte / Selection criteria for increasing production uniformity in beef cattle / Neves, Haroldo Henrique de Rezende, January 2010
Abstract: The objective of this study was to investigate the existence of additive genetic variability in the residual variance of weight gain from birth to weaning (GND) in Nellore cattle, and to evaluate the prospects of exploiting differences between genotypes in residual variance to obtain greater uniformity of production through selection. Different two-step approaches were studied to address these questions. First, three models were employed to analyze different measures of residual dispersion of GND in the progeny of Nellore sires. The model that performed best was employed in a subsequent study to assess the impact of progeny size on estimates of additive variance for residual dispersion, to compare dispersion estimators on different scales, and to predict selection response in each situation. The reliability of this approach was verified by Monte Carlo simulation. A final study investigated the possibility of considering additive and environmental effects on the residual variance of GND simultaneously, by analyzing the natural logarithm of the squared residual associated with each observation under different models. It was concluded that, with large sire families, accurate predictions of the genetic merit of sires for residual variance could be obtained, along with some improvement in the uniformity of GND. Analyzing the log squared residuals associated with each observation was considered the most suitable approach for this purpose. Ignoring environmental effects on the residual variance in the second step of the analysis could lead to inflated estimates of the additive variance of residual dispersion, and therefore to overestimation of the response to selection / Advisor: Sandra Aidar de Queiroz / Co-advisor: Roberto Carvalheiro / Committee: Vanerlei Mozaquatro Roso / Committee: Henrique Nunes de Oliveira / Master's
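The two-step idea of analyzing log squared residuals can be illustrated with a toy sketch. It is not the thesis's method: the mean model here is assumed to be a plain sire-family average rather than a mixed model, the second step is a simple per-sire mean of ln(e²) rather than a genetic analysis, and the function name and data layout are hypothetical.

```python
import math
from collections import defaultdict


def mean_log_squared_residual_by_sire(records):
    """Toy two-step dispersion summary.

    records -- iterable of (sire_id, weight_gain) pairs
    Step 1: residual = observation minus its sire-family mean
            (a deliberately simple stand-in for a full mean model).
    Step 2: average ln(residual^2) within each sire family.
    Returns a dict mapping sire_id to the mean log squared residual.
    """
    gains_by_sire = defaultdict(list)
    for sire_id, gain in records:
        gains_by_sire[sire_id].append(gain)

    summary = {}
    for sire_id, gains in gains_by_sire.items():
        family_mean = sum(gains) / len(gains)
        # Zero residuals are skipped because ln(0) is undefined.
        logs = [math.log((g - family_mean) ** 2)
                for g in gains if g != family_mean]
        summary[sire_id] = sum(logs) / len(logs) if logs else float("nan")
    return summary


# Example call with two hypothetical sire families.
example = mean_log_squared_residual_by_sire(
    [("A", 99.0), ("A", 101.0), ("B", 80.0), ("B", 120.0)]
)
```

A lower value for a sire indicates progeny that vary less around their family mean. With few progeny per sire such summaries are very noisy, which is consistent with the abstract's emphasis on large sire families.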
|
38 |
Modelo de regressão linear Sinh-Normal : Aplicações à tempo de vidas / Linear Regression model Sinh-Normal : Applications to life times / Maehara Sánchez, Rocío Paola, 1983-, 03 July 2014
Advisor: Filidor Edilfonso Vilca Labra / Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Matemática Estatística e Computação Científica / Abstract: The family of Sinh-Normal distributions is a class of symmetric distributions with three parameters, and the presence of these parameters makes the family very flexible. When the Sinh-Normal distribution is unimodal, it can be used in place of the normal distribution, and consequently in regression models. A subclass of the Sinh-Normal distributions is the log-transformation of the Birnbaum-Saunders fatigue-time distribution, so several properties of the Birnbaum-Saunders distribution and some of its generalizations can be obtained. The main objective of this work is to study some aspects of estimation and diagnostic analysis in the Sinh-Normal regression model. The diagnostic analysis is based on the approach of Cook (1986). Two data analyses are performed to show how the proposed model can be used in practice. Furthermore, we investigate a test of homogeneity for the shape parameters in the Sinh-Normal regression model and obtain the score statistics for this test. Finally, a numerical example is given to illustrate the methodology, and the properties of the score statistics are investigated through Monte Carlo simulations / Master's / Statistics / Master in Statistics
|
39 |
Avaliação genética de uma população multirracial angus-nelore / Genetic evaluation of a multibreed population angus-nellore / Prestes, Alan Miranda, 21 February 2017
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES / The objective of this study was to identify the best model for the genetic evaluation of average daily gain from weaning to post-weaning (ADGWP) in a multiple-breed Nellore and Angus population comprising 49,634 animals, the progeny of 34,006 dams and 793 sires, born between 1986 and 2015. The genetic evaluation was performed by Bayesian inference with an animal model, and the model selection criteria were the number of parameters (Np), the deviance information criterion (DIC) and the conditional predictive ordinate (CPO). In the first chapter three models were tested: the Traditional Animal Model (TAM) and the Multiple-Breed Animal Model without (MBAMW) and with segregation (MBAMWS). Based on the selection criteria, the MBAMW was chosen because it gave the better fit and had the smallest number of parameters, which reduces the computational demand. In the second chapter, homoscedastic and heteroscedastic multiple-breed models (HMBM) were tested. A 2×2 factorial scheme was used, combining two residual variance models (homoscedastic (HO) or heteroscedastic (HE)) with two distributional assumptions (Gaussian (G) and Student's t (T)). The HMBM-T-HE gave the best fit for the population in question. The Spearman rank correlations of the breeding values predicted for the sires were high when all animals were considered (0.93 to 0.99). However, when only the top 10% of sires were considered, these correlations fell drastically (ranging from 0.05 to 0.96). These results support the implementation of robust multibreed models that account for sources of heteroscedasticity in order to increase the accuracy of genetic evaluations of multiple-breed populations.
|
40 |
Modlování vývoje výše škodních událostí / Modeling development of incurred value of claim / Kantorová, Petra, January 2010
This diploma thesis focuses on estimating the incurred value of a claim and the probability that the claim remains open (not settled) at a specific stage of the insurance settlement process. A change in the incurred value of a claim marks a change in the stage of the settlement process. A generalized linear model is used to model these changes. The classical linear regression model is a special case of this theory, with stricter assumptions. Among other things, the generalized linear model makes it possible to handle heteroscedasticity in an unconventional way, by means of a joint model. This model is applied in the practical part of the thesis. Logistic regression, which is also part of generalized linear model theory, is used here to model the probability that a claim remains open. The model output is presented graphically, mainly as graphs showing the probability that the level of a given claim will fall within a certain range.
|