Global ETD Search

41	Ein nichtlineares, hierarchisches und gemischtes Modell für das Baum-Höhenwachstum der Fichte (Picea abies (L.) Karst.) in Baden-Württemberg / A non-linear hierarchical mixed model for tree height growth of Norway spruce (Picea abies (L.) Karst.) in Baden-Württemberg Nothdurft, Arne 09 February 2007 (has links) No description available. 570 Biowissenschaften Biologie Forest Sciences and Forest Ecology nichtlineares gemischtes Modell Zufallsparameter Mehrebenenanalyse Wachstumsmodell Baumhöhenwachstum non-linear mixed model random parameter multilevel analysis growth model tree height growth 48.99 42.11
42	Análise espacial do potencial fotovoltaico em telhados de residências usando modelagem hierárquica bayesiana / Análisis espacial del potencial fotovoltaico en tejados de residencias usando modelamiento jerárquico bayesiano Villavicencio Gastelu, Joel [UNESP] 01 March 2016 (has links) Previous issue date: 2016-03-01 / Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) / No presente trabalho tem-se como objetivo estimar o potencial fotovoltaico devido à instalação de sistemas fotovoltaicos em telhados de áreas residenciais. Na estimação desse potencial foram consideradas quatro grandezas: o nível de irradiação solar, a área aproveitável de telhado para a instalação dos sistemas fotovoltaicos, a eficiência de conversão dos sistemas fotovoltaicos e as probabilidades de instalação dos sistemas fotovoltaicos, que caracterizam as preferências dos habitantes à instalação desses sistemas. Um modelo hierárquico bayesiano foi proposto para o cálculo das probabilidades de instalação dos sistemas fotovoltaicos. Nesse modelo bayesiano é estabelecida uma relação entre as probabilidades de instalação, as variáveis socioeconômicas e as interações entre as subáreas, através de um modelo linear generalizado misto. O cálculo do valor esperado das probabilidades de instalação foi realizado usando o método de Monte Carlo via cadeias de Markov. Os resultados do potencial fotovoltaico são apresentados através de mapas temáticos, que permitem a visualização da distribuição espacial do seu valor esperado. Esta informação pode ajudar as concessionárias de distribuição no planejamento e expansão de suas redes elétricas em regiões com maior potencial de geração fotovoltaica. / The present work aims to estimate the photovoltaic potential of installing solar panels on the rooftops of residential areas. 
The estimation of this potential considers four quantities: the solar radiation level, the rooftop area available for installing photovoltaic systems, the conversion efficiency of the photovoltaic systems, and the installation probabilities of photovoltaic systems, which characterize the inhabitants' preferences regarding the installation of such systems. A Bayesian hierarchical model is proposed to calculate the installation probabilities of photovoltaic systems. This Bayesian model relates the installation probabilities to socioeconomic variables and to interactions between subareas through a generalized linear mixed model. The expected value of the installation probabilities in each subarea is computed using the Markov chain Monte Carlo method. The photovoltaic potential results are presented as thematic maps that show the spatial distribution of its expected value. This information can help distribution utilities plan and expand their networks in regions with the greatest potential for photovoltaic generation. Modelo hierárquico bayesiano Modelo linear generalizado misto Sistemas fotovoltaicos Sistema de distribuição elétrico Bayesian hierarchical model Electrical distribution systems Generalized linear mixed model Markov chain Monte Carlo methods Photovoltaic systems
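As a rough illustration of how the four quantities named in the abstract above could combine, the sketch below multiplies them per subarea. The multiplicative form, the function name `expected_pv_potential`, and all numeric values are assumptions made for illustration; the abstract does not state the formula actually used in the thesis.

```python
def expected_pv_potential(irradiation_kwh_m2, roof_area_m2, efficiency, install_prob):
    """Expected yearly PV energy (kWh) for one subarea.

    ASSUMPTION: the four quantities combine multiplicatively:
    irradiation * usable roof area * conversion efficiency * P(installation).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie in [0, 1]")
    if not 0.0 <= install_prob <= 1.0:
        raise ValueError("install_prob must lie in [0, 1]")
    return irradiation_kwh_m2 * roof_area_m2 * efficiency * install_prob


# Hypothetical subareas:
# (irradiation in kWh/m^2/year, usable roof area in m^2, efficiency, P(installation))
subareas = {
    "A": (1800.0, 5000.0, 0.15, 0.10),
    "B": (1800.0, 8000.0, 0.15, 0.02),
}

# One expected-potential value per subarea, ready to be drawn on a thematic map.
potential = {name: expected_pv_potential(*q) for name, q in subareas.items()}
```

In this toy setting subarea B has more roof area than A but a lower expected potential, because its assumed installation probability is much smaller; in the thesis, that probability is the quantity estimated by the Bayesian hierarchical model.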
43	Classification et modélisation statistique intégrant des données cliniques et d’imagerie par résonance magnétique conventionnelle et avancée / Classification and statistical modeling based on clinical and conventional and advanced Magnetic Resonance Imaging data Tozlu, Ceren 19 March 2018 (has links) L'accident vasculaire cérébral et la sclérose en plaques figurent parmi les maladies neurologiques les plus destructrices du système nerveux central. L'accident vasculaire cérébral est la deuxième cause de décès et la principale cause de handicap chez l'adulte dans le monde alors que la sclérose en plaques est la maladie neurologique non traumatique la plus fréquente chez l'adulte jeune. L'imagerie par résonance magnétique est un outil important pour distinguer le tissu cérébral sain du tissu pathologique à des fins de diagnostic, de suivi de la maladie, et de prise de décision pour un traitement personnalisé des patients atteints d'accident vasculaire cérébral ou de sclérose en plaques. La prédiction de l'évolution individuelle de la maladie chez les patients atteints d'accident vasculaire cérébral ou de sclérose en plaques constitue un défi pour les cliniciens avant de donner un traitement individuel approprié. Cette prédiction est possible avec des approches statistiques appropriées basées sur des informations cliniques et d'imagerie. Toutefois, l'étiologie, la physiopathologie, les symptômes et l'évolution dans l'accident vasculaire cérébral et la sclérose en plaques sont très différents. Par conséquent, dans cette thèse, les méthodes statistiques utilisées pour ces deux maladies neurologiques sont différentes. Le premier objectif était l'identification du tissu à risque d'infarctus chez les patients atteints d'accident vasculaire cérébral. Pour cet objectif, les méthodes de classification (dont les méthodes de machine learning) ont été utilisées sur des données d'imagerie mesurées à l'admission pour prédire le risque d'infarctus à un mois. 
Les performances des méthodes de classification ont été ensuite comparées dans un contexte d'identification de tissu à haut risque d'infarctus à partir de données humaines codées voxel par voxel. Le deuxième objectif était de regrouper les patients atteints de sclérose en plaques avec une méthode non supervisée basée sur des trajectoires individuelles cliniques et d'imagerie tracées sur cinq ans. Les groupes de trajectoires aideraient à identifier les patients menacés d'importantes progressions et donc à leur donner des médicaments plus efficaces. Le troisième et dernier objectif de la thèse était de développer un modèle prédictif pour l'évolution du handicap individuel des patients atteints de sclérose en plaques sur la base de données démographiques, cliniques et d'imagerie obtenues à l'inclusion. L'hétérogénéité des évolutions du handicap chez les patients atteints de sclérose en plaques est un important défi pour les cliniciens qui cherchent à prévoir l'évolution individuelle du handicap. Le modèle mixte linéaire à classes latentes a donc été utilisé pour prendre en compte la variabilité individuelle et la variabilité inobservée entre sous-groupes de sclérose en plaques. / Stroke and multiple sclerosis are two of the most destructive neurological diseases of the central nervous system. Stroke is the second most common cause of death and the major cause of disability worldwide, whereas multiple sclerosis is the most common non-traumatic disabling neurological disease of adulthood. Magnetic resonance imaging is an important tool to distinguish healthy from pathological brain tissue in diagnosis, monitoring of disease evolution, and decision-making in personalized treatment of patients with stroke or multiple sclerosis. Predicting disease evolution in patients with stroke or multiple sclerosis is a challenge for clinicians who are about to decide on an appropriate individual treatment. 
The etiology, pathophysiology, symptoms, and evolution of stroke and multiple sclerosis are very different. Therefore, in this thesis, the statistical methods used to study the two neurological diseases are different. The first aim was to identify the tissue at risk of infarction in patients with stroke. For this purpose, classification methods (including machine learning methods) were applied to voxel-based imaging data measured at hospital admission to predict the infarction risk at one month. Next, the performances of the classification methods in identifying tissue at high risk of infarction were compared. The second aim was to cluster patients with multiple sclerosis using an unsupervised method based on individual clinical and imaging trajectories plotted over five years. Clusters of trajectories would help identify patients who may experience substantial progression, and thus treat them with more effective drugs irrespective of the clinical subtypes. The third and final aim of this thesis was to develop a predictive model for the individual evolution of patients with multiple sclerosis based on demographic, clinical, and imaging data taken at study onset. The heterogeneity of disease evolution in patients with multiple sclerosis is an important challenge for clinicians who seek to predict the disease evolution and decide on an appropriate individual treatment. For this purpose, the latent class linear mixed model was used to predict disease evolution while accounting for individual variability and unobserved variability between subgroups of multiple sclerosis. Accident vasculaire cérébral Sclérose en plaques Méthodes de classification Regroupement des données longitudinales Modélisation prédictive Modèle mixte à classes latentes Stroke Multiple Sclerosis Classification Methods Trajectory Clustering Predictive Modeling Latent Class Linear Mixed Model 570.15
44	Análise espacial do potencial fotovoltaico em telhados de residências usando modelagem hierárquica bayesiana / Villavicencio Gastelu, Joel January 2016 (has links) Orientador: Antônio Padilha Feltrin / Resumo: No presente trabalho tem-se como objetivo estimar o potencial fotovoltaico devido à instalação de sistemas fotovoltaicos em telhados de áreas residenciais. Na estimação desse potencial foram consideradas quatro grandezas: o nível de irradiação solar, a área aproveitável de telhado para a instalação dos sistemas fotovoltaicos, a eficiência de conversão dos sistemas fotovoltaicos e as probabilidades de instalação dos sistemas fotovoltaicos, que caracterizam as preferências dos habitantes à instalação desses sistemas. Um modelo hierárquico bayesiano foi proposto para o cálculo das probabilidades de instalação dos sistemas fotovoltaicos. Nesse modelo bayesiano é estabelecida uma relação entre as probabilidades de instalação, as variáveis socioeconômicas e as interações entre as subáreas, através de um modelo linear generalizado misto. O cálculo do valor esperado das probabilidades de instalação foi realizado usando o método de Monte Carlo via cadeias de Markov. Os resultados do potencial fotovoltaico são apresentados através de mapas temáticos, que permitem a visualização da distribuição espacial do seu valor esperado. Esta informação pode ajudar as concessionárias de distribuição no planejamento e expansão de suas redes elétricas em regiões com maior potencial de geração fotovoltaica. / Abstract: The present work aims to estimate the photovoltaic potential of installing solar panels on the rooftops of residential areas. 
The estimation of this potential considers four quantities: the solar radiation level, the rooftop area available for installing photovoltaic systems, the conversion efficiency of the photovoltaic systems, and the installation probabilities of photovoltaic systems, which characterize the inhabitants' preferences regarding the installation of such systems. A Bayesian hierarchical model is proposed to calculate the installation probabilities of photovoltaic systems. This Bayesian model relates the installation probabilities to socioeconomic variables and to interactions between subareas through a generalized linear mixed model. The expected value of the installation probabilities in each subarea is computed using the Markov chain Monte Carlo method. The photovoltaic potential results are presented as thematic maps that show the spatial distribution of its expected value. This information can help distribution utilities plan and expand their networks in regions with the greatest potential for photovoltaic generation. / Mestre Modelo hierárquico bayesiano Modelo linear generalizado misto Sistemas fotovoltaicos Sistema de distribuição elétrico Bayesian hierarchical model Electrical distribution systems Generalized linear mixed model Markov chain Monte Carlo methods Photovoltaic systems
45	Modeling strategies for complex hierarchical and overdispersed data in the life sciences / Estratégias de modelagem para dados hierárquicos complexos e com superdispersão em ciências biológicas Izabela Regina Cardoso de Oliveira 24 July 2014 (has links) In this work, we study the so-called combined models, generalized linear mixed models with an extension to allow for overdispersion, in the context of genetics and breeding. Such flexible models accommodate cluster-induced correlation and overdispersion through two separate sets of random effects and contain as special cases the generalized linear mixed models (GLMM) on the one hand, and commonly known overdispersion models on the other. We use such models to obtain heritability coefficients for non-Gaussian characters. Heritability is one of the many important concepts that are often quantified upon fitting a model to hierarchical data. It is often of importance in plant and animal breeding. Knowledge of this attribute is useful to quantify the magnitude of improvement in the population. For data where linear models can be used, this attribute is conveniently defined as a ratio of variance components. Matters are less simple for non-Gaussian outcomes. The focus is on time-to-event and count traits, where the Weibull-Gamma-Normal and Poisson-Gamma-Normal models are used. The resulting expressions are sufficiently simple and appealing, in particular in special cases, to be of practical value. The proposed methodologies are illustrated using data from animal and plant breeding. Furthermore, attention is given to the occurrence of negative estimates of variance components in the Poisson-Gamma-Normal model. The occurrence of negative variance components in linear mixed models (LMM) has received a certain amount of attention in the literature, whereas almost no work has been done for GLMM. This phenomenon can be confusing at first sight because, by definition, variances themselves are non-negative quantities. 
However, this is a well-understood phenomenon in the context of linear mixed modeling, where one has to choose between a hierarchical and a marginal view. The variance components of the combined model for count outcomes are studied theoretically, and the plant breeding study used as illustration underscores that this phenomenon can be common in applied research. We also call attention to the performance of different estimation methods, because not all available methods are capable of extending the parameter space of the variance components. Thus, when inference on such components is needed and they are expected to be negative, the accuracy of the method is not the only characteristic to be considered. / Neste trabalho foram estudados os chamados modelos combinados, modelos lineares generalizados mistos com extensão para acomodar superdispersão, no contexto de genética e melhoramento. Esses modelos flexíveis acomodam correlação induzida por agrupamento e superdispersão por meio de dois conjuntos separados de efeitos aleatórios e contêm como casos especiais os modelos lineares generalizados mistos (MLGM) e os modelos de superdispersão comumente conhecidos. Tais modelos são usados na obtenção do coeficiente de herdabilidade para caracteres não Gaussianos. Herdabilidade é um dos vários importantes conceitos que são frequentemente quantificados com o ajuste de um modelo a dados hierárquicos. Ela é usualmente importante no melhoramento vegetal e animal. Conhecer esse atributo é útil para quantificar a magnitude do ganho na população. Para dados em que modelos lineares podem ser usados, esse atributo é convenientemente definido como uma razão de componentes de variância. Os problemas são menos simples para respostas não Gaussianas. O foco aqui é em características do tipo tempo-até-evento e contagem, em que os modelos Weibull-Gama-Normal e Poisson-Gama-Normal são usados. 
As expressões resultantes são suficientemente simples e atrativas, em particular nos casos especiais, pelo valor prático. As metodologias propostas são ilustradas usando dados de melhoramento animal e vegetal. Além disso, a atenção é voltada à ocorrência de estimativas negativas de componentes de variância no modelo Poisson-Gama-Normal. A ocorrência de componentes de variância negativos em modelos lineares mistos (MLM) tem recebido certa atenção na literatura enquanto quase nenhum trabalho tem sido feito para MLGM. Esse fenômeno pode ser confuso a princípio porque, por definição, variâncias são quantidades não-negativas. Entretanto, este é um fenômeno bem compreendido no contexto de modelagem linear mista, em que a escolha deverá ser feita entre uma interpretação hierárquica ou marginal. Os componentes de variância do modelo combinado para respostas de contagem são estudados teoricamente e o estudo de melhoramento vegetal usado como ilustração confirma que esse fenômeno pode ser comum em pesquisas aplicadas. A atenção também é voltada ao desempenho de diferentes métodos de estimação, porque nem todos aqueles disponíveis são capazes de estender o espaço paramétrico dos componentes de variância. Então, quando há a necessidade de inferência de tais componentes e é esperado que eles sejam negativos, a acurácia do método de estimação não é a única característica a ser considerada. Distribuição gama Distribuição Poisson Distribuição Weibull Efeito aleatório Herdabilidade Modelo combinado Modelo linear misto generalizado Variâncias negativas Combined model Gamma distribution Generalized linear mixed model Heritability Negative variance components Poisson distribution Random effect Weibull distribution
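For the linear (Gaussian) case that the abstract above describes as "a ratio of variance components", heritability is commonly written as shown below. The symbols are notation assumed here rather than taken from the thesis: $\sigma_g^2$ denotes the genetic (random-effect) variance and $\sigma_e^2$ the residual variance.

```latex
h^2 = \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2}
```

For example, with $\sigma_g^2 = 2$ and $\sigma_e^2 = 6$ this gives $h^2 = 2/8 = 0.25$. The corresponding expressions for the Weibull-Gamma-Normal and Poisson-Gamma-Normal models are derived in the thesis and are not reproduced here.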
46	Modèles statistiques pour l'étude de la progression de la maladie rénale chronique / Statistical models to study progression of chronic kidney disease Boucquemont, Julie 15 December 2014 (has links) Cette thèse avait pour but d'illustrer l'intérêt de méthodes statistiques avancées lorsqu'on s'intéresse aux associations entre différents facteurs et la progression de la maladie rénale chronique (MRC). Dans un premier temps, une revue de la littérature a été effectuée afin d'identifier les méthodes classiquement utilisées pour étudier les facteurs de progression de la MRC ; leurs limites et des méthodes permettant de mieux prendre en compte ces limites ont été discutées. Notre second travail s'est concentré sur les analyses de données de survie et la prise en compte de la censure par intervalle, qui survient lorsque l'évènement d'intérêt est la progression vers un stade spécifique de la MRC, et le risque compétitif avec le décès. Une comparaison entre des modèles de survie standards et le modèle illness-death pour données censurées par intervalle nous a permis d'illustrer l'impact de la modélisation choisie sur les estimations à la fois des effets des facteurs de risque et des probabilités d'évènements, à partir des données de la cohorte NephroTest. Les autres travaux ont porté sur les analyses de données longitudinales de la fonction rénale. Nous avons illustré l'intérêt du modèle linéaire mixte dans ce contexte et présenté son extension pour la prise en compte de sous-populations de trajectoires de la fonction rénale différentes. Nous avons ainsi identifié cinq classes, dont une avec un déclin très rapide et une autre avec une amélioration de la fonction rénale au cours du temps. Des perspectives de travaux liés à la prédiction permettent enfin de lier les deux types d'analyses présentées dans la thèse. 
/ The objective of this thesis was to illustrate the benefit of using advanced statistical methods to study associations between risk factors and chronic kidney disease (CKD) progression. First, we conducted a literature review of the statistical methods used to investigate risk factors of CKD progression, identified important methodological issues, and discussed solutions. In our second work, we focused on survival analyses and issues with interval censoring, which occurs when the event of interest is the progression to a specific CKD stage, and competing risk with death. A comparison between standard survival models and the illness-death model for interval-censored data allowed us to illustrate the impact of modeling on the estimates of both the effects of risk factors and the probabilities of events, using data from the NephroTest cohort. Other works focused on the analysis of longitudinal data on renal function. We illustrated the value of the linear mixed model in this context and presented its extension to account for sub-populations with different trajectories of renal function. We identified five classes, including one with a strong decline and one with an improvement of renal function over time. Several perspectives on prediction link the two types of analyses presented in this thesis. Maladie rénale chronique Débit de filtration glomérulaire Censure par intervalle Risques compétitifs Chronic kidney disease Glomerular filtration rate Interval censoring Competing risks Latent class linear mixed model
47	Development and maintenance of victimization associated with bullying during the transition to middle school: The role of school-based factors Abel, Leah A. 04 August 2020 (has links) No description available. Education Teacher Education School Administration School Counseling Psychology Educational Psychology victimization bullying victimization bullying school climate student teacher relationship generalized linear mixed model McNemar test binary logistic regression elementary middle school middle school transition
48	Statistical Evaluation of Correlated Measurement Data in Longitudinal Setting Based on Bilateral Corneal Cross-Linking Herber, Robert, Graehlert, Xina, Raiskup, Frederik, Veselá, Martina, Pillunat, Lutz E., Spoerl, Eberhard 13 April 2023 (has links) Purpose: In ophthalmology, data from both eyes of a person are frequently included in the statistical evaluation. Because such data are correlated, this violates the requirement of data independence for classical statistical tests (e.g. t-test or analysis of variance (ANOVA)). Linear mixed models (LMM) were used as a way to include the data of both eyes in the statistical evaluation. Methods: The LMM is available in a variety of statistical software packages such as SPSS or R. It was applied to a retrospective longitudinal analysis of an accelerated corneal cross-linking (ACXL (9*10)) treatment in progressive keratoconus (KC) with a follow-up period of 36 months. Forty eyes of 20 patients were included, with sequential bilateral CXL treatment performed within 12 months. LMM and ANOVA for repeated measurements were used for statistical evaluation of topographical and tomographical data measured by Pentacam (Oculus, Wetzlar, Germany). Results: Both eyes were classified into a worse and a better eye with respect to corneal topography. Visual acuity, keratometric values and minimal corneal thickness differed significantly between them at baseline (p < 0.05). A significant correlation between the worse and the better eye was shown (p < 0.05). Therefore, analyzing the data at each follow-up visit using ANOVA partially led to an overestimation of the statistical effect that could be avoided by using LMM. After 36 months, ACXL had significantly improved BCVA and flattened the cornea. Conclusion: Evaluating the data of both eyes with classical statistical tests, without considering their correlation, leads to an overestimation of the statistical effect, which can be avoided by using the LMM. 
info:eu-repo/classification/ddc/610 ddc:610
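The overestimation described in the abstract above can be illustrated with the standard design-effect formula for clustered data, 1 + (m - 1) * ICC, where m is the cluster size (here two eyes per patient). This is a simplified sketch, not the LMM analysis the authors performed; the function name and the intraclass correlation of 0.8 are hypothetical.

```python
def effective_sample_size(n_obs, cluster_size, icc):
    """Effective number of independent observations for equal-sized clusters,
    using the standard design effect 1 + (cluster_size - 1) * icc."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1] for this sketch")
    design_effect = 1.0 + (cluster_size - 1) * icc
    return n_obs / design_effect


# 40 eyes from 20 patients, i.e. clusters of size 2 (figures from the abstract).
n_eff_independent = effective_sample_size(40, 2, 0.0)  # eyes treated as independent
n_eff_correlated = effective_sample_size(40, 2, 0.8)   # hypothetical ICC of 0.8
n_eff_identical = effective_sample_size(40, 2, 1.0)    # fellow eyes perfectly correlated
```

With an assumed ICC of 0.8, the 40 eyes carry roughly as much information as 22 independent observations, which is why a test that treats them as 40 independent values overstates the evidence.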
49	ASSESSMENT OF VARIABILITY OF LAND USE IMPACTS ON WATER QUALITY CONTAMINANTS Johann Alexander Vera (14103150), Bernard A. Engel (5644601) 10 December 2022 (has links) The hydrological cycle is affected by land use variability. Land use spatial and temporal variability has the power to alter watershed runoff, water resource quantity and quality, ecosystems, and environmental sustainability. In recent decades, agricultural lands, pastures, plantations, and urban areas have increased, resulting in significant increases in energy, water, and fertilizer usage, as well as significant biodiversity losses. Land use and environmental planning Surface water hydrology Statistical data science water quality contaminants swat linear mixed model
50	Inférence robuste à la présence des valeurs aberrantes dans les enquêtes Dongmo Jiongo, Valéry 12 1900 (has links) Cette thèse comporte trois articles dont un est publié et deux en préparation. Le sujet central de la thèse porte sur le traitement des valeurs aberrantes représentatives dans deux aspects importants des enquêtes que sont : l’estimation des petits domaines et l’imputation en présence de non-réponse partielle. En ce qui concerne les petits domaines, les estimateurs robustes dans le cadre des modèles au niveau des unités ont été étudiés. Sinha & Rao (2009) proposent une version robuste du meilleur prédicteur linéaire sans biais empirique pour la moyenne des petits domaines. Leur estimateur robuste est de type «plugin», et à la lumière des travaux de Chambers (1986), cet estimateur peut être biaisé dans certaines situations. Chambers et al. (2014) proposent un estimateur corrigé du biais. En outre, un estimateur de l’erreur quadratique moyenne a été associé à ces estimateurs ponctuels. Sinha & Rao (2009) proposent une procédure bootstrap paramétrique pour estimer l’erreur quadratique moyenne. Des méthodes analytiques sont proposées dans Chambers et al. (2014). Cependant, leur validité théorique n’a pas été établie et leurs performances empiriques ne sont pas pleinement satisfaisantes. Ici, nous examinons deux nouvelles approches pour obtenir une version robuste du meilleur prédicteur linéaire sans biais empirique : la première est fondée sur les travaux de Chambers (1986), et la deuxième est basée sur le concept de biais conditionnel comme mesure de l’influence d’une unité de la population. Ces deux classes d’estimateurs robustes des petits domaines incluent également un terme de correction pour le biais. Cependant, ils utilisent tous les deux l’information disponible dans tous les domaines contrairement à celui de Chambers et al. (2014) qui utilise uniquement l’information disponible dans le domaine d’intérêt. 
Dans certaines situations, un biais non négligeable est possible pour l’estimateur de Sinha & Rao (2009), alors que les estimateurs proposés exhibent un faible biais pour un choix approprié de la fonction d’influence et de la constante de robustesse. Les simulations Monte Carlo sont effectuées, et les comparaisons sont faites entre les estimateurs proposés et ceux de Sinha & Rao (2009) et de Chambers et al. (2014). Les résultats montrent que les estimateurs de Sinha & Rao (2009) et de Chambers et al. (2014) peuvent avoir un biais important, alors que les estimateurs proposés ont une meilleure performance en termes de biais et d’erreur quadratique moyenne. En outre, nous proposons une nouvelle procédure bootstrap pour l’estimation de l’erreur quadratique moyenne des estimateurs robustes des petits domaines. Contrairement aux procédures existantes, nous montrons formellement la validité asymptotique de la méthode bootstrap proposée. Par ailleurs, la méthode proposée est semi-paramétrique, c’est-à-dire, elle n’est pas assujettie à une hypothèse sur les distributions des erreurs ou des effets aléatoires. Ainsi, elle est particulièrement attrayante et plus largement applicable. Nous examinons les performances de notre procédure bootstrap avec les simulations Monte Carlo. Les résultats montrent que notre procédure performe bien et surtout performe mieux que tous les compétiteurs étudiés. Une application de la méthode proposée est illustrée en analysant les données réelles contenant des valeurs aberrantes de Battese, Harter & Fuller (1988). S’agissant de l’imputation en présence de non-réponse partielle, certaines formes d’imputation simple ont été étudiées. L’imputation par la régression déterministe entre les classes, qui inclut l’imputation par le ratio et l’imputation par la moyenne sont souvent utilisées dans les enquêtes. 
Ces méthodes d’imputation peuvent conduire à des estimateurs imputés biaisés si le modèle d’imputation ou le modèle de non-réponse n’est pas correctement spécifié. Des estimateurs doublement robustes ont été développés dans les années récentes. Ces estimateurs sont sans biais si l’un au moins des modèles d’imputation ou de non-réponse est bien spécifié. Cependant, en présence des valeurs aberrantes, les estimateurs imputés doublement robustes peuvent être très instables. En utilisant le concept de biais conditionnel, nous proposons une version robuste aux valeurs aberrantes de l’estimateur doublement robuste. Les résultats des études par simulations montrent que l’estimateur proposé performe bien pour un choix approprié de la constante de robustesse. / This thesis focuses on the treatment of representative outliers in two important aspects of surveys: small area estimation and imputation for item non-response. Concerning small area estimation, robust estimators in unit-level models have been studied. Sinha & Rao (2009) proposed estimation procedures designed for small area means, based on robustified maximum likelihood estimates of the parameters of the linear mixed model and on robust empirical best linear unbiased predictors of the random effects of the underlying model. Their robust methods for estimating area means are of the plug-in type, and in view of the results of Chambers (1986), the resulting robust estimators may be biased in some situations. Bias-corrected estimators have been proposed by Chambers et al. (2014). In addition, these robust small area estimators were accompanied by estimators of the mean squared error (MSE). Sinha & Rao (2009) proposed a parametric bootstrap procedure based on the robust estimates of the parameters of the underlying linear mixed model to estimate the MSE. Analytical procedures for the estimation of the MSE have been proposed in Chambers et al. (2014). 
However, their theoretical validity has not been formally established and their empirical performance is not fully satisfactory. Here, we investigate two new approaches for the robust version of the best empirical unbiased estimator: the first relies on the work of Chambers (1986), while the second uses the concept of conditional bias as an influence measure to assess the impact of units in the population. These two classes of robust small area estimators also include a correction term for the bias. However, both are fully bias-corrected, in the sense that the correction term takes into account the potential impact of the other domains on the small area of interest, unlike that of Chambers et al. (2014), which focuses only on the domain of interest. Under certain conditions, non-negligible bias is expected for the Sinha-Rao method, while the proposed methods exhibit significant bias reduction, controlled by appropriate choices of the influence function and tuning constants. Monte Carlo simulations are conducted to compare the new robust estimators, the Sinha-Rao estimator, and the bias-corrected estimator of Chambers et al. (2014). Empirical results suggest that the Sinha-Rao method and the bias-adjusted estimator of Chambers et al. (2014) may exhibit a large bias, while the new procedures often perform better in terms of bias and mean squared error. In addition, we propose a new bootstrap procedure for MSE estimation of robust small area predictors. In contrast to existing approaches, the asymptotic validity of the proposed bootstrap method is formally established. Moreover, the proposed method is semi-parametric, i.e., it does not rely on specific distributional assumptions about the errors and random effects of the unit-level model underlying the small area estimation; it is therefore particularly attractive and more widely applicable. We assess the finite-sample performance of our bootstrap estimator through Monte Carlo simulations. 
The results show that our procedure performs satisfactorily and outperforms existing ones. Application of the proposed method is illustrated by analyzing the well-known, outlier-contaminated county crop area data of Battese, Harter & Fuller (1988), obtained from North-Central Iowa farms and Landsat satellite images. Concerning imputation in the presence of item non-response, some single imputation methods have been studied. Deterministic regression imputation, which includes ratio imputation and mean imputation, is often used in surveys. These imputation methods may lead to biased imputed estimators if the imputation model or the non-response model is not properly specified. Recently, doubly robust imputed estimators have been developed; they are unbiased if at least one of the two models, imputation or non-response, is correctly specified. However, in the presence of outliers, the doubly robust imputed estimators can be very unstable. Using the concept of conditional bias as a measure of influence (Beaumont, Haziza and Ruiz-Gazen, 2013), we propose an outlier-robust version of the doubly robust imputed estimator, which we refer to as a triply robust imputed estimator. The results of simulation studies show that the proposed estimator performs satisfactorily for an appropriate choice of the tuning constant. Estimateur corrigé pour le biais Biais conditionnel Valeurs aberrantes Inférence basée sur le modèle Inférence basée sur le plan Petits domaines Bootstrap Modèle linéaire mixte Robustesse Imputation Corrected-bias estimator Conditional bias Outliers Model-based inference Sampling-based inference Small-area Linear mixed model Robustness
