51

Modelos para a análise de dados de contagens longitudinais com superdispersão: estimação INLA / Models for data analysis of longitudinal counts with overdispersion: INLA estimation

Rocha, Everton Batista da 04 September 2015 (has links)
Em ensaios clínicos é muito comum a ocorrência de dados longitudinais discretos. Para sua análise é necessário levar em consideração que dados observados na mesma unidade experimental ao longo do tempo possam ser correlacionados. Além dessa correlação inerente aos dados, é comum ocorrer o fenômeno de superdispersão (ou sobredispersão), em que existe uma variabilidade nos dados além daquela captada pelo modelo. Um caso que pode acarretar a superdispersão é o excesso de zeros, podendo também a superdispersão ocorrer em valores não nulos, ou ainda, em ambos os casos. Molenberghs, Verbeke e Demétrio (2007) propuseram uma classe de modelos para acomodar simultaneamente a superdispersão e a correlação em dados de contagens: modelo Poisson, modelo Poisson-gama, modelo Poisson-normal e modelo Poisson-normal-gama (ou modelo combinado). Rizzato (2011) apresentou a abordagem bayesiana para o ajuste desses modelos por meio do Método de Monte Carlo com Cadeias de Markov (MCMC). Este trabalho, para modelar a incerteza relativa aos parâmetros desses modelos, considerou a abordagem bayesiana por meio de um método determinístico para a solução de integrais, INLA (do inglês, Integrated Nested Laplace Approximations). Além dessa classe de modelos, foram propostos outros quatro modelos que também consideram a correlação entre medidas longitudinais e a ocorrência de superdispersão, além da ocorrência de zeros estruturais e não estruturais (amostrais): modelo Poisson inflacionado de zeros (ZIP), modelo binomial negativo inflacionado de zeros (ZINB), modelo Poisson inflacionado de zeros - normal (ZIP-normal) e modelo binomial negativo inflacionado de zeros - normal (ZINB-normal). Para ilustrar a metodologia desenvolvida, foi considerado um conjunto de dados reais referentes a contagens de ataques epilépticos sofridos por pacientes portadores de epilepsia submetidos a dois tratamentos (um placebo e uma nova droga) ao longo de 27 semanas. A seleção de modelos foi realizada utilizando-se medidas preditivas baseadas em validação cruzada. Sob essas medidas, o modelo selecionado foi o modelo ZIP-normal, em vez do modelo corrente na literatura, o modelo combinado. As rotinas computacionais foram implementadas no programa R e são parte deste trabalho. / Discrete and longitudinal structures naturally arise in clinical trial data. Such data are usually correlated, particularly when the observations are made within the same experimental unit over time, and thus statistical analyses must take this situation into account. Besides this typical correlation, overdispersion is another common phenomenon in discrete data, defined as a greater observed variability than that implied by the statistical model. The causes of overdispersion are usually related to an excess of observed zeros (zero-inflation), an excess of specific observed positive values, or both. Molenberghs, Verbeke and Demétrio (2007) developed a class of models that encompasses both overdispersion and correlation in count data: the Poisson, Poisson-gamma, Poisson-normal and Poisson-normal-gamma (combined) models. A Bayesian approach was presented by Rizzato (2011) to fit these models using the Markov Chain Monte Carlo (MCMC) method. In this work, a Bayesian framework was adopted as well and, in order to consider the uncertainty related to the model parameters, the Integrated Nested Laplace Approximations (INLA) method was used.
Along with the models considered in Rizzato (2011), four new models were proposed that also account for longitudinal correlation, overdispersion and zero-inflation through structural and random zeros, namely: the zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), zero-inflated Poisson-normal (ZIP-normal) and zero-inflated negative binomial-normal (ZINB-normal) models. In order to illustrate the developed methodology, the models were fit to a real dataset in which the response variable was the number of epileptic events per week for each individual. These individuals were split into two groups, one taking a placebo and the other taking an experimental drug, and they were followed for up to 27 weeks. Model selection was based on different predictive measures using cross-validation. In this setting, the ZIP-normal model was selected instead of the usual model in the literature (the combined model). The computational routines were implemented in the R language and constitute a part of this work.
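As a rough illustration of the kind of fit described above, the sketch below uses the R-INLA package to estimate a zero-inflated Poisson model with a normal (subject-level) random intercept and to request the cross-validated CPO values on which predictive model comparison can be based. The data frame `epilepsy` and its column names are hypothetical, and this is only a minimal sketch, not the routines developed in the thesis.

```r
# Minimal sketch (hypothetical data frame `epilepsy` with columns
# count, treat, week and id): ZIP likelihood with a normal random
# intercept per patient, fitted by INLA.
library(INLA)

formula <- count ~ treat + week +
  f(id, model = "iid")            # subject-level normal random effect

fit <- inla(formula,
            family = "zeroinflatedpoisson1",
            data = epilepsy,
            control.compute = list(cpo = TRUE, dic = TRUE, waic = TRUE))

summary(fit)
# A cross-validation-based predictive summary: the log pseudo marginal
# likelihood computed from the conditional predictive ordinates (CPO).
sum(log(fit$cpo$cpo), na.rm = TRUE)
```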
52

Geographies of motherhood : sub-national differences in the involvement in paid work of mothers of young children : the cases of Germany and the UK

Walthery, Pierre January 2012 (has links)
In this thesis I analyse sub-national differences in the employment trajectories of mothers of young children in Germany (Bundeslaender) and the UK (Government Office Regions and Metropolitan counties). The thesis combines longitudinal and spatial approaches to paid work, and focuses on mothers of children under 6 - arguably the group at the core of the social (re)production of gender differences in employment. One of its aims is to nuance the existing literature explaining differences in women's involvement in paid work in terms of national welfare and/or breadwinner regimes, by looking at the nature and extent of regional variations in the patterns of involvement that make these countries typical of such regimes. Its specific goals are to test the Latent Growth Curve (LCM) framework as a method for modelling variations in participation in paid work over time, and then to explore three possible explanations for the regional differences observed: the respective roles of regional differences in the family formation and social position of the maternal labour force, of the availability of suitable jobs, in particular segregated jobs, and of economic histories in relation to women's orientations to work. The results confirmed that the LCM framework represents an innovative tool for understanding variations in involvement in paid work over time, and revealed significant regional differences beyond the 'North-South' and 'East-West' divides documented respectively in the UK and Germany. In both countries, results pointed to a combined effect of the three explanatory factors analysed. Whilst composition and labour demand effects went some way towards explaining the variations observed, additional regional variations were discovered once composition factors were taken into account. Finally, the pattern of association between the remaining unexplained regional variation and women's aggregate attitudes towards paid work suggests an influence of long-term trends in participation on present levels of involvement.
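For readers unfamiliar with the latent growth curve setup, the sketch below shows one conventional way such a model can be specified in R with the lavaan package. The wide-format data frame `emp`, its variables `emp1`-`emp4` (involvement in paid work measured at four waves) and the `region` grouping variable are all hypothetical; this is an illustrative sketch rather than the models estimated in the thesis.

```r
# Unconditional latent growth curve model in lavaan (hypothetical
# wide-format data frame `emp` with repeated measures emp1..emp4).
library(lavaan)

lgc <- '
  # intercept and slope factors with fixed loadings
  i =~ 1*emp1 + 1*emp2 + 1*emp3 + 1*emp4
  s =~ 0*emp1 + 1*emp2 + 2*emp3 + 3*emp4
'
fit <- growth(lgc, data = emp)
summary(fit, fit.measures = TRUE)

# Regional variation could be examined by fitting the same model by group:
fit_region <- growth(lgc, data = emp, group = "region")
```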
53

Methods for handling missing data in cohort studies where outcomes are truncated by death

Wen, Lan January 2018 (has links)
This dissertation addresses problems found in observational cohort studies where the repeated outcomes of interest are truncated by both death and by dropout. In particular, we consider methods that make inference for the population of survivors at each time point, otherwise known as 'partly conditional inference'. Partly conditional inference distinguishes between the reasons for missingness; failure to make this distinction will cause inference to be based not only on pre-death outcomes which exist but also on post-death outcomes which fundamentally do not exist. Such inference is called 'immortal cohort inference'. Investigations of health and cognitive outcomes in two studies - the 'Origins of Variance in the Old Old' and the 'Health and Retirement Study' - are conducted. Analysis of these studies is complicated by outcomes of interest being missing because of death and dropout. We show, first, that linear mixed models and joint models (that model both the outcome and survival processes) produce immortal cohort inference. This makes the parameters in the longitudinal (sub-)model difficult to interpret. Second, a thorough comparison of well-known methods used to handle missing outcomes - inverse probability weighting, multiple imputation and linear increments - is made, focusing particularly on the setting where outcomes are missing due to both dropout and death. We show that when the dropout models are correctly specified for inverse probability weighting, and the imputation models are correctly specified for multiple imputation or linear increments, then the assumptions of multiple imputation and linear increments are the same as those of inverse probability weighting only if the time of death is included in the dropout and imputation models. Otherwise they may not be. Simulation studies show that each of these methods gives negligibly biased estimates of the partly conditional mean when its assumptions are met, but potentially biased estimates if its assumptions are not met. In addition, we develop new augmented inverse probability weighted estimating equations for making partly conditional inference, which offer double protection against model misspecification. That is, as long as one of the dropout and imputation models is correctly specified, the partly conditional inference is valid. Third, we describe methods that can be used to make partly conditional inference for non-ignorable missing data. Both monotone and non-monotone missing data are considered. We propose three methods that use a tilt function to relate the distribution of an outcome at visit j among those who were last observed at some time before j to those who were observed at visit j. Sensitivity analyses to departures from ignorable missingness assumptions are conducted on simulations and on real datasets. The three methods are: i) an inverse probability weighted method that up-weights observed subjects to represent subjects who are still alive but are not observed; ii) an imputation method that replaces missing outcomes of subjects who are alive with their conditional mean outcomes given past observed data; and iii) a new augmented inverse probability method that combines the previous two methods and is doubly-robust against model misspecification.
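To make the distinction between immortal cohort and partly conditional inference concrete, the base R sketch below computes an inverse probability weighted estimate of the mean outcome among survivors at each visit: subjects known to be alive are kept, and those actually observed are up-weighted by their estimated observation probability. The data frame `dat` and its columns `id`, `visit`, `y` (NA when unobserved) and `alive` are hypothetical, and the dropout model here is deliberately simplistic; it is not the augmented or tilt-function estimators developed in the dissertation.

```r
# Hypothetical long-format data `dat`: one row per subject-visit with
# columns id, visit, y (NA when not observed) and alive (1 if alive).
surv <- subset(dat, alive == 1)            # partly conditional: survivors only
surv$obs <- as.numeric(!is.na(surv$y))

# Simple observation (dropout) model among survivors
pmod   <- glm(obs ~ factor(visit), family = binomial, data = surv)
surv$w <- surv$obs / fitted(pmod)          # IPW weights; zero if unobserved

# Weighted mean outcome among survivors at each visit
ipw_mean <- with(surv,
                 tapply(y * w, visit, sum, na.rm = TRUE) /
                 tapply(w,     visit, sum, na.rm = TRUE))
ipw_mean
```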
54

Análise estatística para dados de contagem longitudinais na presença de covariáveis: aplicações na área médica / Statistical Analysis for Longitudinal Count Data in the Presence of Covariates: Applications in Medical Research

Emilio Augusto Coelho Barros 09 February 2009 (has links)
COELHO-BARROS, E. A. Análise estatística para dados de contagem longitudinais na presença de covariáveis: Aplicações na área médica. Dissertação (mestrado) - Faculdade de Medicina de Ribeirão Preto - USP, Ribeirão Preto - SP - Brasil, 2009. Dados de contagem ao longo do tempo na presença de covariáveis são muito comuns em estudos na área da saúde coletiva, por exemplo: número de doenças que uma pessoa, com alguma característica específica, adquiriu ao longo de um período de tempo; número de internações hospitalares em um período de tempo, devido a algum tipo de doença; número de doadores de órgãos em um período de tempo. Nesse trabalho são apresentados diferentes modelos estatísticos de "fragilidade" de Poisson para a análise estatística de dados de contagem longitudinais. Teoricamente, a distribuição de Poisson exige que a média seja igual à variância; quando isto não ocorre, tem-se a presença de uma variabilidade extra-Poisson. Os modelos estatísticos propostos nesta dissertação incorporam a variabilidade extra-Poisson e capturam uma possível correlação entre as contagens para o mesmo indivíduo. Para cada modelo foi feita uma análise Bayesiana hierárquica considerando os métodos MCMC (Markov Chain Monte Carlo). Utilizando bancos de dados reais, cedidos por pesquisadores auxiliados pelo CEMEQ (Centro de Métodos Quantitativos, USP/FMRP), foram discutidos alguns aspectos de discriminação Bayesiana para a escolha do melhor modelo. Um exemplo de banco de dados reais, discutido na Seção 4 dessa dissertação e que se encaixa na área da saúde coletiva, é composto de um estudo prospectivo, aberto e randomizado, realizado em pacientes infectados pelo HIV que procuraram atendimento na Unidade Especial de Terapia de Doenças Infecciosas (UETDI) do Hospital das Clínicas da Faculdade de Medicina de Ribeirão Preto da Universidade de São Paulo (HCFMRP-USP). Os esquemas terapêuticos estudados consistiam em zidovudina e lamivudina, associadas ao efavirenz ou ao lopinavir. Entre setembro de 2004 e maio de 2006 foram avaliados 66 pacientes, sendo 43 deles incluídos no estudo. Destes, 39 participantes alcançaram a semana 24 de acompanhamento, enquanto 27 atingiram a semana 48. Os grupos de pacientes apresentavam características basais semelhantes quanto a idade, sexo, mediana de CD4 e carga viral. O interesse desse experimento é estudar a contagem de CD4 considerando os dois esquemas terapêuticos (efavirenz e lopinavir). / COELHO-BARROS, E. A. Análise estatística para dados de contagem longitudinais na presença de covariáveis: Aplicações na área médica. Dissertação (mestrado) - Faculdade de Medicina de Ribeirão Preto - USP, Ribeirão Preto - SP - Brasil, 2009. Longitudinal count data in the presence of covariates are very common in many applications, especially in medical research. In this work we present different "frailty" models to analyze longitudinal Poisson data in the presence of covariates. These models incorporate the extra-Poisson variability and the possible correlation among the repeated counts for each individual. A hierarchical Bayesian analysis is introduced for each model, using standard MCMC (Markov Chain Monte Carlo) methods. Considering a real biological data set (obtained from CEMEQ, Medical School of Ribeirão Preto, University of São Paulo, Brazil), we also discuss some Bayesian discrimination aspects for the choice of the best model.
Section 4 considers a data set from an open, prospective and randomized study of treatment-naive HIV-infected patients who entered the Infectious Diseases Therapy Special Unit (UETDI) of the Clinical Hospital of the Medical School of Ribeirão Preto, University of São Paulo (HCFMRP-USP). The therapeutic regimens consisted of the drugs zidovudine and lamivudine, combined with either efavirenz or lopinavir. The data set relates to 66 patients followed from September 2004 to May 2006, of whom 43 were included in the study. The patient groups presented similar baseline characteristics in terms of sex, age, median CD4 count and viral load. The main goal of this study was to compare the CD4 cell counts under the two treatments, based on the drugs efavirenz and lopinavir, recently adopted as preferred options for the initial treatment of the disease.
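A quick way to see the extra-Poisson variability and within-subject correlation that these frailty models target is to simulate counts with a shared multiplicative frailty and compare the sample variance with the mean. The sketch below does this in R and then fits a Poisson GLMM with a normal random intercept via lme4 as a simple frequentist analogue of the hierarchical Bayesian frailty models; all names and parameter values are illustrative.

```r
# Counts sharing a gamma frailty per subject: overdispersed and correlated.
set.seed(1)
n_id <- 50; n_visit <- 6
id    <- rep(1:n_id, each = n_visit)
treat <- rep(rbinom(n_id, 1, 0.5), each = n_visit)
frail <- rep(rgamma(n_id, shape = 2, rate = 2), each = n_visit)  # mean 1
mu    <- exp(5 + 0.3 * treat) * frail
y     <- rpois(length(mu), mu)

c(mean = mean(y), var = var(y))   # variance greatly exceeds the mean

# Frequentist analogue of a frailty model: Poisson GLMM with a
# normal random intercept per subject.
library(lme4)
fit <- glmer(y ~ treat + (1 | id), family = poisson)
summary(fit)
```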
55

Time Series Decomposition Using Singular Spectrum Analysis

Deng, Cheng 01 May 2014 (has links)
Singular Spectrum Analysis (SSA) is a method for decomposing and forecasting time series that has seen major recent developments but is not yet routinely included in introductory time series courses. An international conference on the topic was held in Beijing in 2012. The basic SSA method decomposes a time series into a trend, a seasonal component and noise. However, there are more advanced extensions and applications of the method, such as change-point detection and the treatment of multivariate time series. The purpose of this work is to understand the basic SSA method through its application to the monthly average sea temperature at a point on the coast of South America, near where the "El Niño" phenomenon originates, and to artificial time series simulated using harmonic functions. The output of the basic SSA method is then compared with that of other decomposition methods, such as classical seasonal decomposition, X-11 decomposition using moving averages and seasonal decomposition by Loess (STL), that are included in some time series courses.
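The basic SSA steps referred to above (embedding, singular value decomposition, grouping and diagonal averaging) are short enough to sketch directly in base R. The series `x`, the window length and the grouping below are hypothetical choices, not those used in the thesis.

```r
# Basic SSA decomposition sketch: embed, decompose, group, diagonally average.
basic_ssa <- function(x, L, groups) {
  N <- length(x); K <- N - L + 1
  # Step 1: embedding -- L x K trajectory (Hankel) matrix
  X <- sapply(1:K, function(j) x[j:(j + L - 1)])
  # Step 2: singular value decomposition
  s <- svd(X)
  # Steps 3-4: group eigentriples and reconstruct each series by
  # diagonal (anti-diagonal) averaging of the grouped matrix
  lapply(groups, function(g) {
    Xg <- s$u[, g, drop = FALSE] %*%
          diag(s$d[g], nrow = length(g)) %*%
          t(s$v[, g, drop = FALSE])
    sapply(1:N, function(t) mean(Xg[row(Xg) + col(Xg) - 1 == t]))
  })
}

# Hypothetical usage on a monthly series x (e.g. 12-month seasonality):
# comps <- basic_ssa(x, L = 120, groups = list(trend = 1, seasonal = 2:3))
```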
56

Examination of Mixed-Effects Models with Nonparametrically Generated Data

January 2019 (has links)
Previous research has shown that functional mixed-effects models and traditional mixed-effects models perform similarly when recovering mean and individual trajectories (Fine, Suk, & Grimm, 2019). However, Fine et al. (2019) showed that traditional mixed-effects models recovered the underlying mean curves more accurately than functional mixed-effects models. That project generated data following a parametric structure. This paper extends that work by comparing nonlinear mixed-effects models and functional mixed-effects models on their ability to recover underlying trajectories generated from an inherently nonparametric process. This paper introduces readers to nonlinear mixed-effects models and functional mixed-effects models. A simulation study is then presented in which the mean and random-effects structure of the simulated data were generated using B-splines. The accuracy of the recovered curves was examined under various conditions, including sample size, number of time points per curve, and measurement design. Results showed that the functional mixed-effects models recovered the underlying mean curve more accurately than the nonlinear mixed-effects models. In general, the functional mixed-effects models also recovered the underlying individual curves more accurately than the nonlinear mixed-effects models. Progesterone cycle data from Brumback and Rice (1998) were then analyzed to demonstrate the utility of both models. Both models were shown to perform similarly when analyzing the progesterone data. / Dissertation/Thesis / Doctoral Dissertation Psychology 2019
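The B-spline data-generating step described above can be sketched in a few lines of R using the splines package. All settings (number of subjects, basis dimension, coefficient values, noise levels) are illustrative placeholders, not the simulation conditions of the dissertation.

```r
# Generate a mean curve and individual curves from a B-spline basis.
library(splines)
set.seed(2)

n_sub <- 30
tt    <- seq(0, 1, length.out = 25)           # time points per curve
B     <- bs(tt, df = 6, intercept = TRUE)     # B-spline basis matrix

beta       <- c(0, 1.5, 2.5, 1.0, 2.0, 0.5)   # mean-curve coefficients
mean_curve <- B %*% beta

# Subject-specific curves: random perturbations of the basis coefficients
# plus measurement noise.
curves <- sapply(1:n_sub, function(i) {
  B %*% (beta + rnorm(length(beta), sd = 0.4)) + rnorm(length(tt), sd = 0.1)
})

matplot(tt, curves, type = "l", lty = 1, col = "grey",
        xlab = "time", ylab = "simulated outcome")
lines(tt, mean_curve, lwd = 2)
```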
57

Investigating Post-Earnings-Announcement Drift Using Principal Component Analysis and Association Rule Mining

Schweickart, Ian R. W. 01 January 2017 (has links)
Post-Earnings-Announcement Drift (PEAD) is commonly accepted in the fields of accounting and finance as evidence for stock market inefficiency. Less accepted are the numerous explanations for this anomaly. This project aims to investigate the cause of PEAD by harnessing the power of machine learning algorithms, namely Principal Component Analysis (PCA) and a rule-based learning technique, applied to large stock market data sets. Based on the notion that the market is consumer-driven, repeated occurrences of irrational behavior exhibited by traders in response to news events such as earnings reports are uncovered. The project produces findings in support of the PEAD anomaly using methods drawn from neither accounting nor finance. In particular, this project finds evidence of delayed price response in trader behavior, a common manifestation of the PEAD phenomenon.
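The PCA step of such an analysis might look like the R sketch below, applied to a hypothetical matrix `ret` whose rows are earnings-announcement events and whose columns are abnormal returns on the days following the announcement; the matrix, its layout and the follow-up step are assumptions for illustration only.

```r
# PCA on a hypothetical event-by-day matrix of post-announcement abnormal
# returns (rows = announcement events, columns = days 0..k after release).
pca <- prcomp(ret, center = TRUE, scale. = TRUE)

summary(pca)                 # variance explained by each component
head(pca$rotation[, 1:3])    # loadings: which post-announcement days drive PC1-PC3

# Event scores on the leading components; these could then be discretized
# and passed to a rule-based learner to search for recurring drift patterns.
scores <- pca$x[, 1:3]
```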
58

Improved Standard Error Estimation for Maintaining the Validities of Inference in Small-Sample Cluster Randomized Trials and Longitudinal Studies

Tanner, Whitney Ford 01 January 2018 (has links)
Data arising from Cluster Randomized Trials (CRTs) and longitudinal studies are correlated, and generalized estimating equations (GEE) are a popular analysis method for correlated data. Previous research has shown that analyses using GEE could result in liberal inference due to the use of the empirical sandwich covariance matrix estimator, which can yield negatively biased standard error estimates when the number of clusters or subjects is not large. Many techniques have been presented to correct this negative bias; however, use of these corrections can still result in biased standard error estimates and thus test sizes that are not consistently at their nominal level. Therefore, there is a need for an improved correction such that nominal type I error rates will consistently result. First, GEEs are becoming a popular choice for the analysis of data arising from CRTs. We study the use of recently developed corrections for empirical standard error estimation and the use of a combination of two popular corrections. In an extensive simulation study, we find that nominal type I error rates can be consistently attained when using an average of two popular corrections developed by Mancl and DeRouen (2001, Biometrics 57, 126-134) and Kauermann and Carroll (2001, Journal of the American Statistical Association 96, 1387-1396) (AVG MD KC). Use of this new correction was found to notably outperform the use of previously recommended corrections. Second, data arising from longitudinal studies are also commonly analyzed with GEE. We conduct a simulation study, finding two methods that attain nominal type I error rates more consistently than other methods in a variety of settings: first, a recently proposed method by Westgate and Burchett (2016, Statistics in Medicine 35, 3733-3744) that specifies both a covariance estimator and degrees of freedom, and second, AVG MD KC with degrees of freedom equaling the number of subjects minus the number of parameters in the marginal model. Finally, stepped wedge trials are an increasingly popular alternative to traditional parallel cluster randomized trials. Such trials often utilize a small number of clusters and numerous time intervals, and these components must be considered when choosing an analysis method. A generalized linear mixed model containing a random intercept and fixed time and intervention covariates is the most common analysis approach. However, the sole use of a random intercept applies assumptions that will be violated in practice. We show, using an extensive simulation study based on a motivating example and a more general design, that alternative analysis methods are preferable for maintaining the validity of inference in small-sample stepped wedge trials with binary outcomes. First, we show that the use of generalized estimating equations, with an appropriate bias correction and a degrees-of-freedom adjustment dependent on the study setting type, will result in nominal type I error rates. Second, we show that the use of a cluster-level summary linear mixed model can also achieve nominal type I error rates for equal cluster size settings.
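The averaging idea behind AVG MD KC is easy to express once the two bias-corrected covariance estimates are available. In the R sketch below, `V_md`, `V_kc` and `beta_hat` are assumed to have been produced already (for example by software implementing the Mancl-DeRouen and Kauermann-Carroll corrections); the degrees-of-freedom choice shown is one of the options discussed above, and everything here is illustrative rather than the dissertation's exact procedure.

```r
# Average the two bias-corrected sandwich covariance estimates and base
# Wald-type inference on a t distribution with cluster-based df.
V_avg <- (V_md + V_kc) / 2          # AVG MD KC covariance estimate
se    <- sqrt(diag(V_avg))

n_clusters <- 20                    # hypothetical number of clusters/subjects
p      <- length(beta_hat)          # parameters in the marginal model
t_stat <- beta_hat / se
p_val  <- 2 * pt(abs(t_stat), df = n_clusters - p, lower.tail = FALSE)

cbind(estimate = beta_hat, se = se, p.value = p_val)
```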
59

Provision of Hospital-based Palliative Care and the Impact on Organizational and Patient Outcomes

Roczen, Marisa L 01 January 2016 (has links)
Hospital-based palliative care services aim to streamline medical care for patients with chronic and potentially life-limiting illnesses by focusing on individual patient needs, efficient use of hospital resources, and providing guidance for patients, patients’ families and clinical providers toward making optimal decisions concerning a patient’s care. This study examined the nature of palliative care provision in U.S. hospitals and its impact on selected organizational and patient outcomes, including hospital costs, length of stay, in-hospital mortality, and transfer to hospice. Hospital costs and length of stay are viewed as important economic indicators. Specifically, lower hospital costs may increase a hospital’s profit margin and shorter lengths of stay can enable patient turnover and efficiency of care. Higher rates of hospice transfers and lower in-hospital mortality may be considered positive outcomes from a patient perspective, as the majority of patients prefer to die at home or outside of the hospital setting. Several data sources were utilized to obtain information about patient, hospital, and county characteristics; patterns of hospitals’ palliative care provision; and patients’ hospital costs, length of stay, in-hospital mortality, and transfer to hospice (if a patient survived hospitalization). The study sample consisted of 3,763,339 patients; 348 urban, general, short-term, acute care, non-federal hospitals; and 111 counties located in six states over a 5-year study (2007-2011). Hospital-based palliative care provision was measured by the presence of three palliative care services, including inpatient palliative care consultation services (PAL), inpatient palliative care units (IPAL), and hospice programs (HOSPC). Derived from Institutional Theory, Resource Dependence Theory, and Donabedian’s Structure Process-Outcome framework, 13 hypotheses were tested using a hierarchical (generalized) linear modeling approach. The study findings suggested that hospital size was associated with a higher probability of hospital-based palliative care provision. Conversely, the presence of palliative care services through a hospital’s health system, network, or joint venture was associated with a lower probability of hospital-based palliative care provision. The study findings also indicated that hospitals with an IPAL or HOSPC incurred lower hospital costs, whereas hospitals with PAL incurred higher hospital costs. The presence of PAL, IPAL, and HOSPC was generally associated with a lower probability of in-hospital mortality and transfer to hospice. Finally, the effects of hospital-based palliative care services on length of stay were mixed, and further research is needed to understand this relationship.
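As a rough sketch of the hierarchical (generalized) linear modeling approach described above, the R code below fits a patient-level logistic model with hospital- and county-level random intercepts using lme4. The data frame `pc` and every variable name in it are hypothetical stand-ins for the study's measures.

```r
# Patient-level binary outcome with hospitals nested in counties;
# palliative-care service indicators (PAL, IPAL, HOSPC) enter as
# hospital-level predictors. All names are illustrative.
library(lme4)

fit <- glmer(in_hospital_death ~ pal + ipal + hospc + hospital_size +
               (1 | county / hospital),
             data = pc, family = binomial)
summary(fit)
```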
60

Working correlation selection in generalized estimating equations

Jang, Mi Jin 01 December 2011 (has links)
Longitudinal data analysis is common in biomedical research. The generalized estimating equations (GEE) approach is widely used for longitudinal marginal models. The GEE method is known to provide consistent regression parameter estimates regardless of the choice of working correlation structure, provided root-n consistent estimates of the nuisance parameters are used. However, it is important to use an appropriate working correlation structure in small samples, since it improves the statistical efficiency of the β estimates. Several working correlation selection criteria have been proposed (Rotnitzky and Jewell, 1990; Pan, 2001; Hin and Wang, 2009; Shults et al., 2009). However, these selection criteria share the limitation that they perform poorly when over-parameterized structures are considered as candidates. In this dissertation, new working correlation selection criteria are developed based on generalized eigenvalues. A set of generalized eigenvalues is used to measure the disparity between the bias-corrected sandwich variance estimator under the hypothesized working correlation matrix and the model-based variance estimator under a working independence assumption. A summary measure based on the set of generalized eigenvalues provides an indication of the disparity between the true correlation structure and the misspecified working correlation structure. Motivated by the test statistics in MANOVA, three working correlation selection criteria are proposed: PT (Pillai's trace type criterion), WR (Wilks' ratio type criterion) and RMR (Roy's maximum root type criterion). The relationship between these generalized eigenvalues and the CIC measure is revealed. In addition, this dissertation proposes a method to penalize over-parameterized working correlation structures. An over-parameterized structure converges to the true correlation structure, using extra parameters. Thus, the true correlation structure and the over-parameterized structure tend to provide similar variance estimates of the estimated β and similar working correlation selection criterion values. However, the over-parameterized structure is more likely to be chosen as the best working correlation structure by "the smaller the better" rule for criterion values. This is because the over-parameterization leads to a negatively biased sandwich variance estimator and hence a smaller selection criterion value. In this dissertation, the over-parameterized structure is penalized through cluster detection and an optimization function. In order to find the group ("cluster") of working correlation structures that are similar to each other, a cluster detection method is developed based on spacings of the order statistics of the selection criterion measures. Once a cluster is found, an optimization function considering the trade-off between bias and variability provides the choice of the "best" approximating working correlation structure. The performance of our proposed criterion measures relative to other relevant criteria (QIC, RJ and CIC) is examined in a series of simulation studies.
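One plausible rendering of the generalized-eigenvalue idea is sketched below: the eigenvalues of the product of the inverse model-based (independence) covariance and the bias-corrected sandwich covariance are summarized with MANOVA-style statistics. The matrices `V_ind` and `V_bc` are assumed to have been computed elsewhere, and the exact forms of PT, WR and RMR used in the dissertation may differ from these illustrative definitions.

```r
# Generalized eigenvalues comparing a bias-corrected sandwich covariance
# under the hypothesized working structure (V_bc) with the model-based
# covariance under working independence (V_ind); both assumed precomputed.
lambda <- Re(eigen(solve(V_ind, V_bc), only.values = TRUE)$values)

PT  <- sum(lambda / (1 + lambda))   # Pillai's-trace-type summary
WR  <- prod(1 / (1 + lambda))       # Wilks'-ratio-type summary
RMR <- max(lambda)                  # Roy's-maximum-root-type summary
c(PT = PT, WR = WR, RMR = RMR)
```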
