Global ETD Search

141	Recursive Partitioning of Models of a Generalized Linear Model Type Rusch, Thomas 10 June 2012 This thesis is concerned with recursive partitioning of models of a generalized linear model type (GLM-type), i.e., maximum likelihood models with a linear predictor for the linked mean, a topic that has received constant interest over the last twenty years. The resulting tree (a "model tree") can be seen as an extension of classic trees that allows for a GLM-type model in the partitions. In this work, the focus lies on applied and computational aspects of model trees with GLM-type node models, in order to work out different areas where combining parametric models and trees is beneficial and to build a computational scaffold for future applications of model trees. In the first part, model trees are defined, and some algorithms for fitting model trees with GLM-type node models are reviewed and compared in terms of their properties of tree induction and node model fitting. Additionally, the design of a particularly versatile algorithm, the MOB algorithm (Zeileis et al. 2008), in R is described, and an in-depth discussion is provided of how the functionality offered can be extended to various GLM-type models. This is highlighted by an example of using partitioned negative binomial models to investigate the effect of health care incentives. Part 2 consists of three research articles in which model trees are applied to different problems that frequently occur in the social sciences. The first applies trees with GLM-type node models to a data set of voters who show a non-monotone relationship between the frequency of attending past elections and turnout in 2004. Three different types of model tree algorithms are used to investigate this phenomenon, and for two of them the resulting trees can explain the counter-intuitive finding. 
Here model trees are used to learn a nonlinear relationship between a target model and a large number of candidate variables to provide more insight into a data set. A second application area is also discussed, namely using model trees to detect ill-fitting subsets in the data. The second article uses model trees to model the number of fatalities in the Afghanistan war, based on the WikiLeaks Afghanistan war diary. Data pre-processing with a topic model generates predictors that are used as explanatory variables in a model tree for overdispersed count data. Here the combination of model trees and topic models allows flexible analysis of database data, which are frequently encountered in data journalism, and provides a coherent description of fatalities in the Afghanistan war. The third paper uses a new framework built around model trees to approach the classic problem of segmentation, frequently encountered in marketing and management science. Here, the framework is used to segment a sample of the US electorate in order to identify likely and unlikely voters. It is shown that the framework's model trees enable accurate identification, which in turn allows efficient targeted mobilisation of eligible voters. (author's abstract)
142	Genetic Heteroscedasticity for Domestic Animal Traits Felleki, Majbritt January 2014 Animal traits differ not only in mean, but also in variation around the mean. For instance, one sire’s daughter group may be very homogeneous, while another sire’s daughters are much more heterogeneous in performance. The difference in residual variance can partially be explained by genetic differences. Models for such genetic heterogeneity of environmental variance include genetic effects for the mean and residual variance, and a correlation between the genetic effects for the mean and residual variance to measure how the residual variance might vary with the mean. The aim of this thesis was to develop a method based on double hierarchical generalized linear models for estimating genetic heteroscedasticity, and to apply it to four traits in two domestic animal species: teat count and litter size in pigs, and milk production and somatic cell count in dairy cows. The method developed is fast and has been implemented in software that is widely used in animal breeding, which makes it convenient to use. It is based on an approximation of double hierarchical generalized linear models by normal distributions. With repeated observations on individuals or genetic groups, the estimates were found to be unbiased. For the traits studied, the estimated heritability values for the mean and the residual variance, and the genetic coefficients of variation, were within the ranges usually reported. The genetic correlation between mean and residual variance was estimated for the pig traits only, and was found to be favorable for litter size, but unfavorable for teat count. Quantitative genetics; genetic heteroscedasticity of residuals; teat count in pigs; litter size in pigs; milk yield in cows; somatic cell count in cows
143	Die nächtliche Habitatnutzung von Feldhasen (Lepus europaeus) in drei unterschiedlichen Habitaten / The nocturnal habitat use of European Brown Hare (Lepus europaeus) in three different habitats Kinser, Andreas 08 June 2011 Die vorliegende Studie untersucht die nächtliche Habitatnutzung von Feldhasen in drei unterschiedlichen Habitaten. Das Untersuchungsgebiet Opferbaum ist stark ackerbaulich geprägt und das Untersuchungsgebiet Güntersleben sehr strukturreich durch das Vorkommen von Gehölzen und Waldrändern. Das Untersuchungsgebiet Fritzlar besitzt einen waldrandgeprägten sowie einen ackerbaulich intensiv genutzten Landschaftsteil. Die nächtlichen Aufenthaltsorte von Feldhasen wurden mittels Wärmebildkamera zwischen September 2004 bis April 2005 und September 2005 bis April 2006 kartiert. In jedem der Untersuchungsgebiete wurden einmal monatlich sogenannte Festpunkte angefahren, die umliegenden Landschaftsbereiche abgesucht und beobachtete Feldhasen in Arbeitskarten eingezeichnet. Eine Kartierung der von den Festpunkten einsehbaren Landschaftsteile geschah vor jedem Erfassungstermin bei Tageslicht. Den kartierten Feldhasen (Präsenz-Punkte) wurde im GIS eine zufällige Punktverteilung im beobachteten Landschaftsraum gegenüber gestellt (Pseudo-Absenz-Punkte). Für jeden dieser Punkte wurden bis zu 20 Minimaldistanzen zu verschiedenen Strukturelementen der Landschaft berechnet. In Generalisierten Linearen Modellen (GLM) wurden die univariaten und multivariaten Zusammenhänge der erklärenden Variablen mit der binomialen Zielvariablen modelliert. Zeitliche Aspekte der Habitatnutzung im Verlauf des Winterhalbjahres wurden mit einer multitemporalen Modellierung für zusammengefasste Zwei-Monats-Zeiträume untersucht. Die Modellselektion geschah mit Hilfe des Akaike Information Criterion (AIC). Insgesamt wurden 4.494 Standorte von Feldhasen in Opferbaum, 2.418 in Güntersleben und 1.391 in Fritzlar kartiert. 
Die univariate Analyse zeigt eine Meidung von Verkehrs- und Siedlungsstrukturen. Waldränder, Gehölze, Buntbrachen und Grünland werden in den Untersuchungsgebieten Fritzlar und Opferbaum bevorzugt, in Güntersleben werden die zwei letzteren gemieden. Die multivariaten Modelle zeigen eine Präferenz der Nahrungshabitate Wintergetreide und Raps, in Fritzlar und Opferbaum wird auch Grünland bevorzugt. Nach dem Nahrungshabitat wird von Feldhasen die Nähe zu potentiellen Deckungshabitaten präferiert, dabei werden nur Buntbrachen in allen Untersuchungsgebieten bevorzugt. Besonders Verkehrswege und Siedlungen werden gemiedenen, Ausnahme ist die Bevorzugung von Siedlungsbereichen in Güntersleben. Teilweise gegensätzliche Ergebnisse zeigt die Modellierung der Zwei-Monats-Zeiträume zwischen den Untersuchungsgebieten. Sie zeigen aber nur geringe Veränderungen der Habitatnutzung von Feldhasen im Verlauf des Winterhalbjahres. Allen selektierten Modellen gemein ist die geringe Erklärungsgüte von weniger als 5 % der Datenvarianz. Die Eignung der entwickelten Aufnahmemethodik und die Ergebnisse werden anhand der umfangreichen Literatur diskutiert. Die Art des Habitats ist von großer Bedeutung für die Habitatnutzung der Feldhasen. Durch die landwirtschaftliche Fruchtfolge bedingte strukturelle Veränderungen verändern ebenso die kleinräumige Habitatnutzung wie die Veränderungen der landwirtschaftlichen Schläge im Verlauf des Herbstes und Winters. Das opportunistische Habitatverhalten von Feldhasen erschwert dabei die Beobachtung von speziellem Habitatverhalten. Die zum Teil gegensätzlichen Ergebnisse werden auch vor dem Hintergrund potentieller Fehlerquellen der Methodik und einem möglichen Einfluss vernachlässigter Variablen diskutiert. Dabei stellt sich die Frage nach grundsätzlichen Konsequenzen für zukünftige Untersuchungen. 
Die unterschiedliche Habitatnutzung des Feldhasen in unterschiedlichen Habitaten muss sowohl bei der Wahl der Methodik als auch bei der Wahl der Gebietskulisse berücksichtigt werden. / The study presented in this thesis examined the nocturnal habitat use of hares in three different habitats. The study area Opferbaum is strongly influenced by agriculture, whereas the landscape of the study area Güntersleben is structurally very diverse, with groves and forest edges. The study area Fritzlar comprises a landscape dominated by forest edges on the one hand and an intensively farmed landscape on the other. Hare locations were mapped using thermography from September 2004 to April 2005 and from September 2005 to April 2006. In each of the study sites, the landscape surrounding selected viewpoints was surveyed once a month, and hare locations were plotted on topographical maps. The parts of the landscape visible from the viewpoints were mapped in daylight before each survey. Up to 20 minimum distances to different structural elements of the landscape were calculated for each hare location (presence points) and for randomly distributed points (pseudo-absence points) in the observed landscape. Generalized linear models (GLM) were applied to model the univariate and multivariate relationships of the explanatory variables with the binomial response variable (hare = 1; pseudo-absence = 0). Temporal aspects of habitat use during the winter were analyzed by multi-temporal modeling for combined two-month periods. Model selection was done using the Akaike Information Criterion (AIC). A total of 4,494 hare locations were mapped in Opferbaum, 2,418 in Güntersleben and 1,391 in Fritzlar. The univariate analysis shows an avoidance of traffic and urban areas. Forest edges and groves are preferred in all study areas. Pasture and wildlife-friendly set-asides are preferred in Fritzlar and Opferbaum, but avoided in Güntersleben. 
The multivariate models show a preference for feeding habitats such as winter cereals and oilseed rape; hares also prefer pasture in Fritzlar and Opferbaum. After the feeding habitat, hares prefer proximity to habitats that provide shelter; among these, only wildlife-friendly set-asides were preferred in all study sites. Traffic and urban areas are avoided in Opferbaum and Fritzlar, but urban areas are preferred in Güntersleben. Modeling the two-month periods shows differing results between the study areas but only small changes in habitat use by brown hares during the winter months. All selected models explain less than 5 % of the variance in the data. Comparison with similar studies shows that, besides methodology and survey time, results on the habitat use of brown hares are primarily influenced by the kind of landscape examined. The small-scale habitat use of brown hares is also influenced by structural changes due to agricultural crop rotation as well as by changing vegetation in autumn and winter. The opportunistic behaviour of brown hares makes it difficult to observe specific habitat use. The results are discussed with regard to possible methodological errors and unconsidered variables, and also with regard to fundamental consequences for future investigations. The differences in habitat use of brown hares in different habitats have to be considered both in the choice of methodology and in the choice of study sites. Lepus europaeus; Habitatnutzung; Habitatwahl; Wärmebild; Infrarot; GLM; Strukturelemente; Lebensraum; Brache; Festpunkte; Habitat use; Hare locations; Generalized linear models; thermography; Wildlife-friendly set-asides; ddc:599; ddc:630; rvk:ZC 92300
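The presence/pseudo-absence GLM and AIC comparison described in record 143 can be illustrated with a minimal sketch. It assumes a single covariate (distance to one landscape element) and a logit link, neither of which is stated in the abstract. The data are invented toy values, and the function names `fit_logistic`, `log_likelihood` and `aic` are hypothetical, not taken from the thesis.

```python
import math

def fit_logistic(x, y, iters=50, tol=1e-10):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson (Fisher scoring)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)          # binomial variance weight
            g0 += yi - p               # score for the intercept
            g1 += (yi - p) * xi        # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det   # solve the 2x2 system H d = g
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

def log_likelihood(x, y, b0, b1):
    """Bernoulli log-likelihood under the logit link."""
    total = 0.0
    for xi, yi in zip(x, y):
        eta = b0 + b1 * xi
        total += yi * eta - math.log1p(math.exp(eta))
    return total

def aic(loglik, n_params):
    """Akaike Information Criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * loglik

# Invented toy data: distance to cover (hypothetical units), 1 = hare, 0 = pseudo-absence.
dist = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
presence = [1, 1, 0, 1, 0, 1, 0, 0]

b0, b1 = fit_logistic(dist, presence)
ll_full = log_likelihood(dist, presence, b0, b1)
# Intercept-only model: with balanced toy data the MLE is logit(0.5) = 0.
ll_null = log_likelihood(dist, presence, 0.0, 0.0)
aic_full = aic(ll_full, 2)
aic_null = aic(ll_null, 1)
print(b0, b1, aic_full, aic_null)
```

A negative slope `b1` here would correspond to presence becoming less likely with increasing distance. The model with the lower AIC would be preferred, which is the selection rule the abstract names.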
144	Inférence statistique en grande dimension pour des modèles structurels. Modèles linéaires généralisés parcimonieux, méthode PLS et polynômes orthogonaux et détection de communautés dans des graphes. / Statistical inference for structural models in high dimension. Sparse generalized linear models, PLS through orthogonal polynomials and community detection in graphs Blazere, Melanie 01 July 2015 Cette thèse s'inscrit dans le cadre de l'analyse statistique de données en grande dimension. Nous avons en effet aujourd'hui accès à un nombre toujours plus important d'information. L'enjeu majeur repose alors sur notre capacité à explorer de vastes quantités de données et à en inférer notamment les structures de dépendance. L'objet de cette thèse est d'étudier et d'apporter des garanties théoriques à certaines méthodes d'estimation de structures de dépendance de données en grande dimension. La première partie de la thèse est consacrée à l'étude de modèles parcimonieux et aux méthodes de type Lasso. Après avoir présenté les résultats importants sur ce sujet dans le chapitre 1, nous généralisons le cas gaussien à des modèles exponentiels généraux. La contribution majeure à cette partie est présentée dans le chapitre 2 et consiste en l'établissement d'inégalités oracles pour une procédure Group Lasso appliquée aux modèles linéaires généralisés. Ces résultats montrent les bonnes performances de cet estimateur sous certaines conditions sur le modèle et sont illustrés dans le cas du modèle Poissonien. Dans la deuxième partie de la thèse, nous revenons au modèle de régression linéaire, toujours en grande dimension mais l'hypothèse de parcimonie est cette fois remplacée par l'existence d'une structure de faible dimension sous-jacente aux données. Nous nous penchons dans cette partie plus particulièrement sur la méthode PLS qui cherche à trouver une décomposition optimale des prédicteurs étant donné un vecteur réponse. 
Nous rappelons les fondements de la méthode dans le chapitre 3. La contribution majeure à cette partie consiste en l'établissement pour la PLS d'une expression analytique explicite de la structure de dépendance liant les prédicteurs à la réponse. Les deux chapitres suivants illustrent la puissance de cette formule aux travers de nouveaux résultats théoriques sur la PLS. Dans une troisième et dernière partie, nous nous intéressons à la modélisation de structures au travers de graphes et plus particulièrement à la détection de communautés. Après avoir dressé un état de l'art du sujet, nous portons notre attention sur une méthode en particulier connue sous le nom de spectral clustering et qui permet de partitionner les noeuds d'un graphe en se basant sur une matrice de similarité. Nous proposons dans cette thèse une adaptation de cette méthode basée sur l'utilisation d'une pénalité de type l1. Nous illustrons notre méthode sur des simulations. / This thesis falls within the context of high-dimensional data analysis. Nowadays we have access to an ever-increasing amount of information. The major challenge lies in our ability to explore huge amounts of data and to infer their dependency structures. The purpose of this thesis is to study, and provide theoretical guarantees for, some specific methods that aim at estimating dependency structures for high-dimensional data. The first part of the thesis is devoted to the study of sparse models through Lasso-type methods. In Chapter 1, we present the main results on this topic and then we generalize the Gaussian case to any distribution from the exponential family. The major contribution to this field is presented in Chapter 2 and consists in oracle inequalities for a Group Lasso procedure applied to generalized linear models. These results show that this estimator achieves good performance under some specific conditions on the model. We illustrate this part by considering the case of the Poisson model. 
The second part concerns linear regression in high dimension, but the sparsity assumption is replaced by a low-dimensional structure underlying the data. We focus in particular on the PLS method, which attempts to find an optimal decomposition of the predictors given a response. We recall the main idea in Chapter 3. The major contribution to this part consists in a new explicit analytical expression of the dependency structure that links the predictors to the response. The next two chapters illustrate the power of this formula through new theoretical results for PLS. The third and last part is dedicated to graph modelling and especially to community detection. After presenting the main trends on this topic, we turn our attention to spectral clustering, which clusters the nodes of a graph with respect to a similarity matrix. In this thesis, we suggest an adaptation of this method based on an $\ell_1$ penalty. We illustrate this method through simulations. Grande dimension; Méthode de régularisation; Méthode de réduction de dimension; High dimension; Sparse generalized linear models; Regularization methods; Dimension reduction methods; Partial least squares; Community detection in graphs; 519
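To illustrate how a Group Lasso penalty, as mentioned in record 144, acts on coefficients, the sketch below implements the standard block soft-thresholding operator, i.e. the proximal operator of an unweighted group penalty `lam * sum_g ||beta_g||_2`. This is a generic textbook step, not the procedure analysed in the thesis. The function name `group_soft_threshold`, the unweighted penalty and the example numbers are all assumptions made for illustration.

```python
import math

def group_soft_threshold(beta, groups, lam):
    """Proximal operator of lam * sum_g ||beta_g||_2 (unweighted groups).

    Each group is shrunk towards zero as a block; a group whose Euclidean
    norm is at most lam is set to zero entirely.
    """
    out = list(beta)
    for idx in groups:
        norm = math.sqrt(sum(beta[i] ** 2 for i in idx))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        for i in idx:
            out[i] = scale * beta[i]
    return out

# Hypothetical coefficient vector with two groups of two coefficients each.
beta = [3.0, 4.0, 0.3, -0.4]
groups = [[0, 1], [2, 3]]
shrunk = group_soft_threshold(beta, groups, lam=1.0)
print(shrunk)
```

The first group has norm 5 and is scaled by 0.8. The second has norm 0.5, which is below `lam`, so the whole group is zeroed. Selecting or dropping variables by whole groups is the defining feature of the penalty.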
145	Modelos lineares parciais aditivos generalizados com suavização por meio de P-splines / Generalized additive partial linear models with P-splines smoothing Amanda Amorim Holanda 03 May 2018 Neste trabalho apresentamos os modelos lineares parciais generalizados com uma variável explicativa contínua tratada de forma não paramétrica e os modelos lineares parciais aditivos generalizados com no mínimo duas variáveis explicativas contínuas tratadas de tal forma. São utilizados os P-splines para descrever a relação da variável resposta com as variáveis explicativas contínuas. Sendo assim, as funções de verossimilhança penalizadas, as funções escore penalizadas e as matrizes de informação de Fisher penalizadas são desenvolvidas para a obtenção das estimativas de máxima verossimilhança penalizadas por meio da combinação do algoritmo backfitting (Gauss-Seidel) e do processo iterativo escore de Fisher para os dois tipos de modelo. Em seguida, são apresentados procedimentos para a estimação do parâmetro de suavização, bem como dos graus de liberdade efetivos. Por fim, com o objetivo de ilustração, os modelos propostos são ajustados a conjuntos de dados reais. / In this work we present generalized partial linear models with one continuous explanatory variable treated nonparametrically and generalized additive partial linear models with at least two continuous explanatory variables treated in this way. P-splines are used to describe the relationship between the response and the continuous explanatory variables. The penalized likelihood functions, penalized score functions and penalized Fisher information matrices are then derived to obtain the penalized maximum likelihood estimators by combining the backfitting (Gauss-Seidel) algorithm with the Fisher scoring iterative method for the two types of model. In addition, we present ways to estimate the smoothing parameter as well as the effective degrees of freedom. 
Finally, for the purpose of illustration, the proposed models are fitted to real data sets. Método de suavização; Modelos aditivos generalizados; Modelos lineares generalizados; Modelos lineares parciais generalizados; Modelos parcialmente lineares; Modelos semiparamétricos; P-splines; Splines; Generalized additive models; Generalized linear models; Generalized partial linear models; P-splines; Partial linear models; Semiparametric models; Smoothing method; Splines
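The P-spline penalty that turns a likelihood into the penalized likelihood mentioned in record 145 can be sketched as follows. The sketch assumes the common form in which a difference penalty on adjacent B-spline coefficients, `(lam / 2) * ||D gamma||^2`, is subtracted from the log-likelihood. The factor 1/2, the default difference order of 2 and the names `difference_matrix` and `pspline_penalty` are assumptions, not details taken from the abstract.

```python
def difference_matrix(n_coef, order=2):
    """Return the order-th difference matrix D as a list of rows.

    Starts from the identity and differences consecutive rows `order` times,
    so D has n_coef - order rows and n_coef columns.
    """
    D = [[1.0 if i == j else 0.0 for j in range(n_coef)] for i in range(n_coef)]
    for _ in range(order):
        D = [[D[i + 1][j] - D[i][j] for j in range(n_coef)]
             for i in range(len(D) - 1)]
    return D

def pspline_penalty(gamma, lam, order=2):
    """Roughness penalty (lam / 2) * ||D gamma||^2 on spline coefficients."""
    D = difference_matrix(len(gamma), order)
    diffs = [sum(row[j] * gamma[j] for j in range(len(gamma))) for row in D]
    return 0.5 * lam * sum(d * d for d in diffs)

D2 = difference_matrix(4, order=2)
flat = pspline_penalty([1.0, 2.0, 3.0, 4.0], lam=2.0)   # linear coefficients
bump = pspline_penalty([0.0, 0.0, 1.0, 0.0], lam=2.0)   # a single spike
print(D2, flat, bump)
```

With second-order differences, coefficients that lie on a straight line incur no penalty, while a spike is penalized. A larger smoothing parameter `lam` therefore pulls the fitted curve towards a linear trend.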
146	Sistema radicular de plantas com enfoque na criação e seleção de genótipos de feijão adaptados ao Planalto Serrano / Plants root system with focus on the creation and selection of bean genotypes adapted to the Planalto Serrano Rocha, Fabiani da 02 February 2011 Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Bean production is affected by a range of abiotic stresses such as drought, low soil fertility, soil acidity and unfavorable temperatures. However, a deep and well-distributed root system allows better adaptation of the crop, especially under conditions of drought and low nutrient availability. Despite advances in research, little work has been done in this line of study. Thus, this study aimed to: i) present an alternative for the measurement and statistical analysis of the root distribution trait; ii) measure the root distribution trait in bean genotypes from the Active Germplasm Bank of UDESC (Universidade do Estado de Santa Catarina) and assess its correlation with other traits of agronomic importance; and iii) evaluate the root distribution along the soil profile among mutant populations and select bean genotypes with superior metric values for the trait. The experiments were performed in the experimental area of IMEGEM, arranged in a completely randomized design. Root distribution was evaluated in hybrids, cultivars, accessions and mutant populations of beans. For this purpose, profiles were opened perpendicular to the sowing rows, a rectangle 0.5 m wide by 0.3 m high, divided into a grid of 0.05 m squares, was positioned on each profile, and a photo was taken. Root distribution was determined in the binary system (presence (1) or absence (0) of roots in each grid cell) by analysis of the photo. 
It was observed that measuring the root distribution trait through the determination of simple events (presence = 1, absence = 0) is a valuable tool for researchers because it allows quantitative analysis of root distribution by means of generalized linear models using the GENMOD procedure of SAS. Although the findings are preliminary because of the small number of genotypes evaluated, it can be stated that the Bean Active Germplasm Bank has promising genotypes for the root distribution trait, among which BAF09 (black) and BAF35 (carioca) stand out for their significant deep root distribution (20 to 30 cm). A significant positive correlation between root distribution and other traits of agronomic importance was also found. The mutant populations performed differently in response to the mutagen with respect to root distribution. The most promising segregating populations were derived from the cultivars IPR Uirapuru and IPR Chopim, as they showed a significant increase in root distribution with increasing doses of the mutagen. / A produção de feijão é afetada por uma gama de estresses abióticos, como seca, baixa fertilidade do solo, acidez do solo e temperaturas desfavoráveis. No entanto, um sistema radicular profundo e bem distribuído permite melhor adaptação da cultura, principalmente no que tange as condições de deficiência hídrica e baixa disponibilidade de nutrientes. Apesar do avanço da pesquisa, pouco se tem trabalhado nessa linha de estudo. Sendo assim, este trabalho teve como objetivos: i) apresentar uma alternativa para a mensuração e a análise estatística para o caráter distribuição radicular; ii) mensurar o caráter distribuição radicular em genótipos de feijão do Banco Ativo de Germoplasma da UDESC (Universidade do Estado de Santa Catarina) e verificar a correlação com outros caracteres de importância agronômica; e iii) avaliar a distribuição radicular ao longo do perfil entre populações mutantes e selecionar genótipos de feijão com valores métricos superiores para o caráter. 
Os experimentos foram realizados na área experimental do IMEGEM, arranjados em delineamento inteiramente casualizado. A avaliação da distribuição radicular foi realizada em híbridos, cultivares, acessos e populações mutantes de feijão. Para isso foram abertos perfis perpendiculares à linha de semeadura, onde um retângulo com dimensões de 0,5 m de largura por 0,3 m de altura, quadriculado com 0,05 m de lado foi disposto e uma foto foi capturada. A determinação da distribuição radicular no sistema binário (denominação de presença (1) e ausência (0) das raízes em cada quadrícula) foi realizada por meio da análise da foto. A mensuração da característica distribuição radicular a partir da determinação de eventos simples (presença=1 e ausência=0) é uma valiosa ferramenta para o pesquisador, já que possibilita a análise quantitativa da distribuição radicular, por meio dos Modelos Lineares Generalizados do procedimento GENMOD do SAS. Ainda que de forma incipiente, devido ao pequeno número de genótipos avaliados pode ser afirmado que o Banco Ativo de Germoplasma de Feijão possui genótipos promissores para o caráter distribuição radicular. Sendo que BAF09 (preto) e BAF35 (carioca), por apresentarem distribuição radicular profunda e significativa (20 a 30 cm), merecem destaque. Pôde ser verificada ainda a presença de correlação positiva e significativa entre a distribuição radicular e outros caracteres de importância agronômica. As populações mutantes apresentaram desempenho diferenciado frente ao mutagênico para a distribuição radicular. As populações segregantes mais promissoras foram oriundas das cultivares IPR Uirapuru e IPR Chopim, pois apresentaram um aumento significativo na distribuição radicular com o aumento das doses do agente mutagênico.
Phaseolus vulgaris L.; distribuição radicular; modelos lineares generalizados; estresses abióticos; variabilidade genética; populações mutantes; Phaseolus vulgaris L.; root distribution; generalized linear models; abiotic stresses; genetic variability; mutant populations; CNPQ::CIENCIAS AGRARIAS::AGRONOMIA
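The binary grid measurement described in record 146 can be illustrated with a small sketch. A rectangle 0.5 m wide by 0.3 m high with 0.05 m cells gives 10 columns by 6 depth layers. The grid values below are invented, and the variable names and the per-layer summary are assumptions for illustration only. The abstract states that the presence/absence data were analysed with generalized linear models in SAS GENMOD but gives no further detail.

```python
# Grid dimensions in centimetres, as integers to avoid floating-point rounding.
WIDTH_CM, HEIGHT_CM, CELL_CM = 50, 30, 5
n_cols = WIDTH_CM // CELL_CM    # 10 cells across
n_rows = HEIGHT_CM // CELL_CM   # 6 depth layers of 5 cm each

# Invented example: 1 = roots present in the cell, 0 = absent (top row = 0-5 cm).
grid = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
]

# Events (cells with roots) out of trials (cells per layer) for each depth layer.
events_per_layer = [sum(row) for row in grid]
proportion_per_layer = [e / n_cols for e in events_per_layer]

# The 20-30 cm layer highlighted in the abstract corresponds to the last two rows.
deep_rows = grid[20 // CELL_CM: 30 // CELL_CM]
deep_proportion = sum(sum(row) for row in deep_rows) / (len(deep_rows) * n_cols)

print(events_per_layer, proportion_per_layer, deep_proportion)
```

Counts of this events-out-of-trials form are the kind of response a binomial generalized linear model takes, which is why the binary coding makes the photographs amenable to quantitative analysis.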
147	Desenvolvimento de um modelo para previsão de ocorrência de Ecdytolopha aurantiana (Lima, 1927) (Lepidoptera: Tortricidae) e um Sistema WEB Integrado de Apoio ao Citricultor. / Development of a model to predict the occurrence of Ecdytolopha aurantiana (Lima, 1927) (Lepidoptera: Tortricidae) and an integrated web system to support citrus producers. Ronaldo Reis Junior 19 April 2004 Os frutos cítricos são uma das principais fontes de exportação do Brasil, sendo que São Paulo responde por cerca de 80% da produção total. Dentre os fatores que limitam a produção citrícola brasileira, especialmente no estado de São Paulo, vem se sobressaindo nos últimos anos, a clorose variegada dos citros (CVC), a podridão floral, o minador-dos-citros e o bicho-furão-dos-citros, Ecdytolopha aurantiana (Lima, 1927) (Lepidoptera: Tortricidae), além dos ácaros, moscas-das-frutas, pinta preta e mais recentemente a morte súbita. O presente trabalho teve como objetivo desenvolver um modelo para previsão de ocorrência de E. aurantiana, baseado em dados de monitoramento através de armadilhas de feromônio sexual e desenvolver um sistema computacional que possa utilizar este modelo para gerar as previsões de ocorrência, além de fornecer um local de troca de informações entre o citricultor e a comunidade científica. O tipo de solo, temperatura do local, variedade de citros, idade das plantas e uso de agroquímicos para controle de E. aurantiana, influenciaram na dinâmica populacional deste inseto. A maior influência sobre a flutuação do bicho-furão foi exercida pelo tipo de solo, seguido pela temperatura local, variedade de citros, idade das plantas e uso de agroquímicos para controle de E. aurantiana. A ocorrência de E. aurantiana em função da temperatura é diferente para cada combinação de tipo de solo, variedade de citros, idade das plantas e uso de agroquímicos. O modelo desenvolvido pode prever o potencial de ocorrência de E. 
aurantiana em função da temperatura ou dos meses do ano, levando-se em consideração o tipo de solo, variedade de citros, idade das plantas e aplicação de agroquímicos. O programa (BF) elaborado na linguagem R conta com equações para simular as diversas situações de ocorrência de E. aurantiana. O SIAC (Sistema Integrado de Apoio ao Citricultor) é um sistema que facilita o uso do modelo, sem a necessidade de conhecimento de R e fornece uma gama de recursos que visa facilitar o acesso do citricultor às informações e ao pesquisador aos problemas do citricultor, criando com isto uma maior interação de ambos. O modelo de previsão de ocorrência de bicho-furão pode ser aperfeiçoado com coleta de dados mais regulares e de forma contínua. / Citrus fruits are among Brazil's main export products, and São Paulo accounts for approximately 80% of total production. Among the factors that limit Brazilian citrus production, especially in the state of São Paulo, those that have stood out in the past few years are citrus variegated chlorosis (CVC), flower rot, the citrus leaf miner and the citrus fruit borer, Ecdytolopha aurantiana (Lima, 1927) (Lepidoptera: Tortricidae), in addition to mites, fruit flies, black spot and, more recently, sudden death. The goal of this work was to develop a model to predict the occurrence of E. aurantiana, based on monitoring data collected with sex pheromone traps, and to develop a computer system capable of using this model to generate occurrence predictions and to provide a place for information exchange between citrus growers and the scientific community. Soil type, site temperature, citrus variety, age of plants and use of agrochemicals to control E. aurantiana influenced the population dynamics of the insect. The strongest influence on citrus fruit borer fluctuation was exerted by soil type, followed by site temperature, citrus variety, age of plants and use of agrochemicals for E. aurantiana control. The occurrence of E. 
aurantiana as a function of temperature is different for each combination of soil type, citrus variety, age of plants and use of agrochemicals. The model developed can predict the occurrence potential of E. aurantiana as a function of temperature or month of the year, taking into account soil type, citrus variety, age of plants and application of agrochemicals. The software (BF), written in the R language, includes equations that simulate the various situations of E. aurantiana occurrence. SIAC (Sistema Integrado de Apoio ao Citricultor) is a system that simplifies the use of the model, does not require previous knowledge of R, and provides a wide range of resources to facilitate citrus growers' access to information and researchers' access to citrus growers' problems, thus creating better interaction between them. The model for predicting citrus fruit borer occurrence can be improved with more regular and continuous data collection. armadilha para inseto; bicho-furão; feromônio sexual de inseto; frutas cítricas; modelo linear generalizado; SIAC; sistema de computador; citric fruit; citrus fruit borer; computer system; generalized linear models; insect pheromone; insect trap; SIAC
148	Aperfeiçoamento de métodos estatísticos em modelos de regressão da família exponencial / Further statistical methods in regression models of the exponential family Alexsandro Bezerra Cavalcanti 03 August 2009 Neste trabalho, desenvolvemos três tópicos relacionados a modelos de regressão da família exponencial. No primeiro tópico, obtivemos a matriz de covariância assintótica de ordem $n^{-2}$, onde $n$ é o tamanho da amostra, dos estimadores de máxima verossimilhança corrigidos pelo viés de ordem $n^{-1}$ em modelos lineares generalizados, considerando o parâmetro de precisão conhecido. No segundo tópico calculamos o coeficiente de assimetria assintótico de ordem $n^{-1/2}$ para a distribuição dos estimadores de máxima verossimilhança dos parâmetros que modelam a média e dos parâmetros de precisão e dispersão em modelos não-lineares da família exponencial, considerando o parâmetro de dispersão desconhecido, porém o mesmo para todas as observações. Finalmente, obtivemos fatores de correção tipo-Bartlett para o teste escore em modelos não-lineares da família exponencial, considerando covariáveis para modelar o parâmetro de dispersão. Avaliamos os resultados obtidos nos três tópicos desenvolvidos por meio de estudos de simulação de Monte Carlo / In this work, we develop three topics related to exponential family regression models. First, we obtain the asymptotic covariance matrix of order $n^{-2}$, where $n$ is the sample size, of the maximum likelihood estimators corrected for the bias of order $n^{-1}$ in generalized linear models, assuming the precision parameter is known. Second, we calculate an asymptotic formula of order $n^{-1/2}$ for the skewness of the distribution of the maximum likelihood estimators of the mean parameters and of the precision and dispersion parameters in exponential family nonlinear models, assuming that the dispersion parameter is unknown but the same for all observations. 
Finally, we obtain Bartlett-type correction factors for the score test in exponential family nonlinear models assuming that the precision parameter is modelled by covariates. Monte Carlo simulation studies are developed to evaluate the results obtained in the three topics. coeficiente de assimetria correção tipo-Bartlett estimador de máxima verossimilhança matriz de covariância Modelos lineares generalizados modelos não lineares teste escore Bartlett-type corrections covariance matrix Generalized linear models score test skewness
149	Modelagem simultânea de média e dispersão e aplicações na pesquisa agronômica / Joint modeling of mean and dispersion and applications to agricultural research Afrânio Márcio Corrêa Vieira 10 February 2009 Several experimental designs in current use originated in agricultural experiments. Such experimental data are usually analysed with statistical models that assume a constant (homogeneous) residual variance as a basic assumption. However, this assumption is hard to sustain when environmental or external factors exert a strong influence on the measurements. In this work, we study joint modelling of the mean and the variance, the latter being structured in two ways: (i) through a linear predictor, which allows the incorporation of external variables and/or noise factors, and (ii) through random effects, which accommodate both the possible overdispersion effect and the longitudinal dependence in the case of binary measurements taken over time. The class of double generalized linear models (DGLM) was applied to an observational study in which poultry mortality was measured during preslaughter operations. In this setting there is strong evidence that some environmental factors influence the observed variability and consequently reduce the precision of the inferential analysis. Another relevant agricultural problem, related to horticulture, is the plant tissue culture experiment, in which the number of regenerated explants is counted. This kind of experiment usually involves a large number of parameters to be estimated relative to the sample size. Current frequentist models rely on large samples for statistical inference and, under this experimental condition, can generate unreliable estimates or even lead to erroneous conclusions. A double generalized linear model was proposed to analyse proportion data from a Bayesian perspective, which can be applied to small samples and can incorporate expert knowledge into the parameter estimation process. A clinical problem involving binary data measured repeatedly over time is presented, and two models using random effects are proposed to accommodate the overdispersion effect and the longitudinal dependence of the measurements. Satisfactory results were obtained for the three problems studied. The DGLM made it possible to identify factors associated with poultry mortality, which will help to minimize losses and to bring the process, from catching to lairage at the slaughterhouse, into line with animal welfare criteria and European Community requirements. The Bayesian DGLM made it possible to identify the genotype associated with the overdispersion effect, increasing the precision of inference in variety selection. Two combined models were proposed, a logit-normal-Bernoulli-beta and a probit-normal-Bernoulli-beta, both of which satisfactorily accommodated the overdispersion effect and the longitudinal dependence of the binary measurements. These results reinforce the importance of modelling mean and dispersion jointly as a way to increase the precision of agricultural research, in both experimental and observational studies. Keywords: analysis of variance; Bayesian inference; experimental design; generalized linear models; likelihood; longitudinal data analysis; thermal comfort of buildings; tissue culture
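The overdispersion discussed in the abstract above can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the thesis's DGLM or combined models: it assumes each experimental unit has its own success probability drawn from a beta distribution, and it uses the Pearson chi-square statistic divided by its degrees of freedom as a rough dispersion check. The function names, parameter values and seed are hypothetical.

```python
import random


def simulate_counts(n_units, m, a, b, rng):
    """Draw n_units counts out of m trials each, where every unit has its own
    success probability p ~ Beta(a, b). The unit-to-unit variation in p is
    what produces extra-binomial variation (overdispersion)."""
    counts = []
    for _ in range(n_units):
        p = rng.betavariate(a, b)
        counts.append(sum(rng.random() < p for _ in range(m)))
    return counts


def pearson_dispersion(counts, m):
    """Pearson chi-square statistic divided by its degrees of freedom under a
    binomial model with one common probability. Values well above 1 suggest
    overdispersion. Assumes the pooled proportion is strictly between 0 and 1."""
    n = len(counts)
    p_hat = sum(counts) / (n * m)
    binomial_var = m * p_hat * (1.0 - p_hat)
    chi2 = sum((y - m * p_hat) ** 2 for y in counts) / binomial_var
    return chi2 / (n - 1)


rng = random.Random(42)
# Beta(2, 2): probabilities vary strongly between units.
overdispersed = pearson_dispersion(simulate_counts(200, 20, 2.0, 2.0, rng), 20)
# Beta(5000, 5000): probabilities are almost constant at 0.5, close to binomial.
binomial_like = pearson_dispersion(simulate_counts(200, 20, 5000.0, 5000.0, rng), 20)
print(f"dispersion with varying p:       {overdispersed:.2f}")
print(f"dispersion with nearly fixed p:  {binomial_like:.2f}")
```

A dispersion estimate far above 1 in the first case is the kind of signal that motivates modelling the dispersion alongside the mean, rather than treating it as a fixed constant.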
150	Matematické modelování v neživotním pojištění / Mathematical modelling in general insurance Zajíček, Jakub January 2015 This diploma thesis deals with mathematical models in general insurance. The aim of the thesis is to analyse selected mathematical models that are widely used in general insurance for estimating insurance portfolio statistics, pricing and calculating the regulatory capital requirement. Claim frequency models, claim severity models, aggregate loss models and generalized linear models are analysed. The thesis consists of a theoretical and a practical part. The theoretical part describes the selected models, which are then applied to a real dataset in the practical part. The modelling of the real dataset was performed in the statistical software R. Maximum likelihood parameter estimates were found to be of better quality than the method-of-moments or quantile-method estimates. The results of the computational methods for the aggregate loss distribution are comparable, mostly because of the large number of observations. In the tariff analysis, the most significant factors were found to be the driver's age and the driver's area of residence.
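The comparison between maximum likelihood and method-of-moments estimation mentioned in the abstract above can be sketched on simulated data. This is a minimal illustration, not the thesis's analysis: the thesis used R and a real dataset, whereas the sketch below uses Python, assumes a lognormal claim severity model, and picks hypothetical parameter values, sample sizes and function names.

```python
import math
import random
import statistics


def lognormal_mle(claims):
    """Maximum likelihood estimates (mu, sigma) for a lognormal sample."""
    logs = [math.log(x) for x in claims]
    mu = statistics.fmean(logs)
    # The MLE of the variance divides by n, hence pvariance, not variance.
    sigma = math.sqrt(statistics.pvariance(logs, mu))
    return mu, sigma


def lognormal_moments(claims):
    """Method-of-moments estimates (mu, sigma), obtained by matching the
    sample mean and variance to the lognormal mean and variance."""
    m = statistics.fmean(claims)
    v = statistics.pvariance(claims, m)
    sigma2 = math.log(1.0 + v / (m * m))
    mu = math.log(m) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


def simulate(true_mu=7.0, true_sigma=1.2, n=200, replications=300, seed=1):
    """Root mean squared error of the sigma estimate for both methods,
    over repeated simulated claim samples."""
    rng = random.Random(seed)
    sq_err = {"mle": 0.0, "moments": 0.0}
    for _ in range(replications):
        claims = [rng.lognormvariate(true_mu, true_sigma) for _ in range(n)]
        sq_err["mle"] += (lognormal_mle(claims)[1] - true_sigma) ** 2
        sq_err["moments"] += (lognormal_moments(claims)[1] - true_sigma) ** 2
    return {name: math.sqrt(total / replications) for name, total in sq_err.items()}


rmse = simulate()
print(rmse)
```

For a heavy-tailed severity model such as this one, the sample variance is driven by a few very large claims, which is one reason a moment-based estimate can be noticeably less stable than the likelihood-based one.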
