Global ETD Search

41	Variable Selection and Function Estimation Using Penalized Methods Xu, Ganggang December 2011 Penalized methods are becoming increasingly popular in statistical research. This dissertation covers two major applications of penalized methods: variable selection and nonparametric function estimation. The following two paragraphs give brief introductions to each of the two topics. Infinite variance autoregressive models are important for modeling heavy-tailed time series. We use a penalty method to conduct model selection for autoregressive models with innovations in the domain of attraction of a stable law indexed by alpha in (0, 2). We show that by combining the least absolute deviation loss function and the adaptive lasso penalty, we can consistently identify the true model. At the same time, the resulting coefficient estimator converges at a rate of n^(-1/alpha). The proposed approach gives a unified variable selection procedure for both finite and infinite variance autoregressive models. While automatic smoothing parameter selection for nonparametric function estimation has been extensively researched for independent data, it has received much less attention for clustered and longitudinal data. Although leave-subject-out cross-validation (CV) has been widely used, its theoretical properties are unknown and its minimization is computationally expensive, especially when there are multiple smoothing parameters. By focusing on penalized modeling methods, we show that leave-subject-out CV is optimal in that its minimization is asymptotically equivalent to the minimization of the true loss function. We develop an efficient Newton-type algorithm to compute the smoothing parameters that minimize the CV criterion. Furthermore, we derive a simplification of the leave-subject-out CV, which leads to a more efficient algorithm for selecting the smoothing parameters.
We show that the simplified CV criterion is asymptotically equivalent to the unsimplified one and thus enjoys the same optimality property. This CV criterion also provides a completely data-driven approach to selecting the working covariance structure when using generalized estimating equations in longitudinal data analysis. Our results are applicable to additive, linear varying-coefficient, and nonlinear models with data from exponential families. Adaptive lasso Autoregressive model Infinite variance Least absolute deviation
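The leave-subject-out idea described in this abstract can be illustrated with a minimal sketch. It makes several assumptions that are not from the dissertation: a ridge penalty stands in for the penalized smoothers, a plain grid search stands in for the Newton-type algorithm, and the names `leave_subject_out_cv` and `select_lambda` are hypothetical.

```python
import numpy as np


def leave_subject_out_cv(X, y, subject, lam):
    """Leave-subject-out CV score for ridge-penalized least squares.

    All rows belonging to one subject are held out together, the model is
    refit on the remaining subjects, and the squared prediction errors on
    the held-out rows are accumulated.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    subject = np.asarray(subject)
    p = X.shape[1]
    total = 0.0
    for s in np.unique(subject):
        held = subject == s
        X_tr, y_tr = X[~held], y[~held]
        # Ridge solution on the training subjects (illustrative penalty only).
        beta = np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(p), X_tr.T @ y_tr)
        resid = y[held] - X[held] @ beta
        total += float(resid @ resid)
    return total / y.size


def select_lambda(X, y, subject, grid):
    """Return the penalty in `grid` with the smallest CV score (grid search)."""
    scores = [leave_subject_out_cv(X, y, subject, lam) for lam in grid]
    return grid[int(np.argmin(scores))]
```

Holding out whole subjects rather than single rows keeps correlated observations from the same subject out of the training set, which is the point of the criterion for clustered and longitudinal data.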
42	Do Childhood Excess Weight and Family Food Insecurity Share Common Risk Factors in the Local Environment? An Examination Using a Quebec Birth Cohort Carter, Megan Ann 20 February 2013 Background: Childhood excess weight and family food insecurity are food-system related public health problems that exist in Canada. Since both relate to issues of food accessibility and availability, which have elements of “place”, they may share common risk factors in the local environment that are amenable to intervention. In this area of research, the literature derives mostly from a US context, and there is a dearth of high-quality evidence, specifically from longitudinal studies. Objectives: The main objectives of this thesis were to examine the adjusted associations between the place factors: material deprivation, social deprivation, social cohesion, disorder, and living location, with change in child BMI Z-score and with change in family food insecurity status in a Canadian cohort of children. Methods: The Québec Longitudinal Study of Child Development was used to meet the main objectives of this thesis. Response data from six collection cycles (4 – 10 years of age) were used in three main analyses. The first analysis examined change in child BMI Z-score as a function of the place factors using mixed models regression. The second analysis examined change in child BMI Z-score as a function of place factors using group-based trajectory modeling. The third and final analysis examined change in family food insecurity status as a function of the place factors using generalized estimating equations. Results: Social deprivation, social cohesion and disorder were strongly and positively associated with family food insecurity, increasing the odds by 45-76%. These place factors, on the other hand, were not consistently associated with child weight status.
Material deprivation was not important for either outcome, except for a slight positive association in the mixed models analysis of child weight status. Living location was not important in explaining family food insecurity. On the other hand, it was associated with child weight status in both analyses, but the nature of the relationship is still unclear. Conclusions: Results do not suggest that addressing similar place factors may alleviate both child excess weight and family food insecurity. More high quality longitudinal and experimental studies are needed to clarify relationships between the local environment and child weight status and family food insecurity. social epidemiology food insecurity childhood obesity longitudinal study group-based trajectory modeling mixed models regression generalized estimating equations social capital residential characteristics
43	Design, maintenance and methodology for analysing longitudinal social surveys, including applications Domrow, Nathan Craig January 2007 This thesis describes the design, maintenance and statistical analysis involved in undertaking a longitudinal survey. A longitudinal survey (or study) obtains observations or responses from individuals at several times over a defined period. This enables the direct study of changes in an individual's response over time. In particular, it distinguishes an individual's change over time from the baseline differences among individuals within the initial panel (or cohort). This is not possible in a cross-sectional study. Because the same individuals are observed repeatedly, longitudinal surveys give correlated responses within individuals. Longitudinal studies therefore require different considerations for sample design, selection and analysis from standard cross-sectional studies. This thesis looks at the methodology for analysing social surveys. Most social surveys consist largely of categorical variables. This thesis outlines the process of sample design and selection, interviewing and analysis for a longitudinal study. Emphasis is given to categorical response data typical of a survey. Included in this thesis are examples relating to the Goodna Longitudinal Survey and the Longitudinal Survey of Immigrants to Australia (LSIA). Analysis in this thesis also utilises data collected from these surveys. The Goodna Longitudinal Survey was conducted by the Queensland Office of Economic and Statistical Research (a portfolio office within Queensland Treasury) and began in 2002. It ran for two years, during which two waves of responses were collected. bayesian benchmarking correlation cross sectional surveys data analysis generalized estimating equations imputation longitudinal surveys missing data sample size standard error survey design survey methodology weighting
44	Modelos estatísticos para dados politômicos nominais em estudos longitudinais com uma aplicação à área agronômica / Statistical models for nominal polytomous data in longitudinal studies with an application to agronomy Vinicius Menarin 14 January 2016 Studies in which the response is a categorical variable are quite common in many fields of science. In many situations this response comprises more than two unordered categories, characterizing a nominal polytomous outcome, and in general the aim of the study is to relate the probability of occurrence of each category to the effects of explanatory variables. Furthermore, there are special types of study in which measurements are taken repeatedly over time on the same sampling unit, called longitudinal studies. Such studies require statistical models whose formulation includes some structure that accommodates the dependence that tends to arise among repeated measurements on the same sampling unit. This work focuses on two extensions of the baseline-category logit model, which is usually employed when the response is nominal polytomous and the observations are independent. The first is a modification of the generalized estimating equations for nominal data that uses local odds ratios to describe the dependence between observations of the polytomous response over the repeated measurements. This type of model is known as a marginal model. The second approach adds random effects to the linear predictor of the baseline-category logit model, which also accounts for dependence between the observations. This characterizes a baseline-category logit mixed model.
There are substantial differences in the interpretation of marginal and mixed models, which are discussed and which should be considered when choosing the appropriate approach. Both methodologies are applied to data from an agronomic field experiment conducted under a randomized complete block design with a factorial arrangement of treatments. The experiment was followed over six seasons, which gives the data their longitudinal structure, and the response is the type of vegetation observed in the field (tussocks, weeds or bare ground). The results are satisfactory, even though the dependence found in the data is not strong, and likelihood-ratio and Wald tests point to several significant differences between treatments. Moreover, owing to methodological differences between the two approaches, the marginal model based on generalized estimating equations appears more appropriate for these data. generalized estimating equations generalized linear mixed models nominal categorical data repeated measurements over time
45	Do Childhood Excess Weight and Family Food Insecurity Share Common Risk Factors in the Local Environment? An Examination Using a Quebec Birth Cohort Carter, Megan Ann January 2013 Background: Childhood excess weight and family food insecurity are food-system related public health problems that exist in Canada. Since both relate to issues of food accessibility and availability, which have elements of “place”, they may share common risk factors in the local environment that are amenable to intervention. In this area of research, the literature derives mostly from a US context, and there is a dearth of high-quality evidence, specifically from longitudinal studies. Objectives: The main objectives of this thesis were to examine the adjusted associations between the place factors: material deprivation, social deprivation, social cohesion, disorder, and living location, with change in child BMI Z-score and with change in family food insecurity status in a Canadian cohort of children. Methods: The Québec Longitudinal Study of Child Development was used to meet the main objectives of this thesis. Response data from six collection cycles (4 – 10 years of age) were used in three main analyses. The first analysis examined change in child BMI Z-score as a function of the place factors using mixed models regression. The second analysis examined change in child BMI Z-score as a function of place factors using group-based trajectory modeling. The third and final analysis examined change in family food insecurity status as a function of the place factors using generalized estimating equations. Results: Social deprivation, social cohesion and disorder were strongly and positively associated with family food insecurity, increasing the odds by 45-76%. These place factors, on the other hand, were not consistently associated with child weight status.
Material deprivation was not important for either outcome, except for a slight positive association in the mixed models analysis of child weight status. Living location was not important in explaining family food insecurity. On the other hand, it was associated with child weight status in both analyses, but the nature of the relationship is still unclear. Conclusions: Results do not suggest that addressing similar place factors may alleviate both child excess weight and family food insecurity. More high quality longitudinal and experimental studies are needed to clarify relationships between the local environment and child weight status and family food insecurity. social epidemiology food insecurity childhood obesity longitudinal study group-based trajectory modeling mixed models regression generalized estimating equations social capital residential characteristics
46	Empirical likelihood and mean-variance models for longitudinal data Li, Daoji January 2011 Improving estimation efficiency has always been an important aspect of statistical modelling. Our goal is to develop new statistical methodologies yielding more efficient estimators in the analysis of longitudinal data. In this thesis, we consider two different approaches, empirical likelihood and joint modelling of the mean and variance, to improve estimation efficiency. In Part I of this thesis, empirical likelihood-based inference for longitudinal data within the framework of generalized linear models is investigated. The proposed procedure takes into account the within-subject correlation without involving direct estimation of nuisance parameters in the correlation matrix and retains optimality even if the working correlation structure is misspecified. The proposed approach yields more efficient estimators than conventional generalized estimating equations and achieves the same asymptotic variance as methods based on quadratic inference functions. The second part of this thesis focuses on joint mean-variance models. We propose a data-driven approach to modelling the mean and variance simultaneously, yielding more efficient estimates of the mean regression parameters than the conventional generalized estimating equations approach even if the within-subject correlation structure is misspecified in our joint mean-variance models. Joint mean-variance models in parametric as well as semi-parametric form have been investigated. Extensive simulation studies are conducted to assess the performance of our proposed approaches. Three longitudinal data sets, the Ohio Children’s wheeze status data (Ware et al., 1984), the Cattle data (Kenward, 1987) and the CD4+ data (Kaslow et al., 1987), are used to demonstrate our models and approaches. 519.5
47	Zobecněné odhadovací rovnice (GEE) / Generalized estimating equations Sotáková, Martina January 2020 This thesis deals with generalized estimating equations (GEE). First, we introduce the generalized linear model, on which generalized estimating equations are based. Next, we present the methods of pseudo maximum likelihood and quasi-pseudo maximum likelihood, from which we move on to generalized estimating equations. Finally, we perform simulation studies that demonstrate the theoretical results presented in the thesis.
48	A Dynamic Longitudinal Examination of Social Networks and Political Behavior: The Moderating Effect of Local Network Properties and Its Implication for Social Influence Processes Song, Hyunjin 21 May 2015 No description available. Communication Political Science Sociology Social networks political discussion political behavior political attitudes social selection social influence Temporal Exponential Random Graph Models Generalized Estimating Equations
49	Longitudinal Analysis to Assess the Impact of Method of Delivery on Postpartum Outcomes: The Ontario Mother and Infant Study (TOMIS) III Bai, Yu Qing 10 1900 Postpartum depression has become a major public health concern for women within a specific time period after delivery. Depression may be associated with risk factors such as socioeconomic status, social support, maternal mental and physical health, and history of anxiety. TOMIS III, funded by the Canadian Institutes of Health Research, is a prospective cohort study of the associations between delivery method and health and health resource utilization. Clinically, we investigated the associations between mode of delivery and the outcomes of postnatal depression, maternal health and infant health, and we identified risk predictors for these outcomes using marginal models with generalized estimating equations (GEE). Statistically, a variety of regression models, namely the generalized linear mixed effect model (GLMM), the hierarchical generalized linear model (HGLM) and the Bayesian hierarchical model, were applied in this analysis and the results were compared with those from GEE. Several imputation strategies, namely mean imputation, last observation carried forward (LOCF), hot-deck imputation and multiple imputation, were employed to handle missing values in this study. Analysis results demonstrated that there was no statistically significant association between mode of delivery and postpartum depression [OR 0.99, 95% CI (0.73, 1.34)]. However, the development of postpartum depression was found to be associated with low income, low mental and physical health functioning, lack of social support, the low number of unmet learning needs in hospital, and English or French spoken at home. Results were consistent for all regression models but GEE provided the best fit and an excellent discriminative ability.
GEE models were constructed on different datasets imputed by mean, LOCF, hot-deck and multiple imputation, and LOCF was recommended to handle the missing data in this longitudinal study. Analyses of the maternal health and infant health outcomes showed that method of delivery had a statistically significant influence on maternal health but no significant impact on infant health. Risks of maternal health problems were associated with cesarean delivery, good/fair/poor infant health, low maternal mental and physical health functioning, lack of care for maternal mental health, and good/fair/poor health before pregnancy. Risks of infant health problems were associated with good/fair/poor maternal health before pregnancy and after discharge, inadequate care or help for infant health, fair/poor community services after discharge, low maternal mental health functioning, non-English or non-French spoken at home, and mothers born outside of Canada. / Master of Science (MSc) The Ontario Mother and Infant Study generalized estimating equations generalized linear mixed effect model hierarchical generalized linear model Bayesian hierarchical model Biostatistics
50	"Modelos lineares generalizados para análise de dados com medidas repetidas" / "Generalized linear models for repeated measures regression analysis" Venezuela, Maria Kelly 04 July 2003 In this work, we consider the generalized estimating equations developed by Liang and Zeger (1986) from the viewpoint of the theory of estimating functions presented by Godambe (1991). These estimating equations extend generalized linear models (GLMs) to the analysis of repeated measurements. We present an iterative procedure for estimating the regression parameters, as well as hypothesis tests for these parameters. For residual analysis, we generalize to repeated measurements some of the diagnostic methods available for GLMs. The half-normal probability plot with a simulated envelope is useful for diagnosing model inadequacy and detecting outliers. To obtain this plot, we consider an algorithm for generating a set of nonnegatively correlated variables having a specified correlation structure. Finally, the theory is applied to real data sets.
diagnostic techniques generalized estimating equations generalized linear models longitudinal data quasi-likelihood methods repeated measures
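For orientation, the generalized estimating equations of Liang and Zeger (1986) that recur in the abstracts above are commonly written as below. The notation is the usual textbook one and is an assumption here, not taken from any of the listed theses: $y_i$ is the response vector of subject $i$, $\mu_i(\beta)$ its mean under the GLM, $A_i$ the diagonal matrix of variance functions, $R(\alpha)$ the working correlation matrix and $\phi$ the dispersion parameter.

```latex
\begin{align}
  U(\beta) &= \sum_{i=1}^{n} D_i^{\top} V_i^{-1} \bigl( y_i - \mu_i(\beta) \bigr) = 0,
  \qquad D_i = \frac{\partial \mu_i}{\partial \beta^{\top}}, \\
  V_i &= \phi \, A_i^{1/2} \, R(\alpha) \, A_i^{1/2}.
\end{align}
```

With $R(\alpha)$ equal to the identity matrix, these equations reduce to the score equations of an ordinary GLM fitted as if all observations were independent.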

Search results