81

Modelos para a análise de dados de contagens longitudinais com superdispersão: estimação INLA / Models for data analysis of longitudinal counts with overdispersion: INLA estimation

Everton Batista da Rocha 04 September 2015
Discrete, longitudinal structures arise naturally in clinical trial data. Such data are usually correlated, particularly when observations are made on the same experimental unit over time, and statistical analyses must take this into account. Besides this typical correlation, overdispersion is another common phenomenon in count data, defined as observed variability greater than that accounted for by the statistical model; its usual causes are an excess of observed zeros (zero-inflation), an excess of specific positive values, or both. Molenberghs, Verbeke and Demétrio (2007) proposed a class of models that accommodates both overdispersion and correlation in count data: the Poisson, Poisson-gamma, Poisson-normal and Poisson-normal-gamma (combined) models. Rizzato (2011) presented a Bayesian approach for fitting these models via Markov Chain Monte Carlo (MCMC). In this work a Bayesian framework was also adopted and, to model the uncertainty in the parameters, the Integrated Nested Laplace Approximations (INLA) method, a deterministic approach to the required integrals, was used. Along with the models considered by Rizzato (2011), four further models were proposed that also account for the correlation between longitudinal measurements, overdispersion, and the occurrence of structural and sampling zeros: the zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), zero-inflated Poisson-normal (ZIP-normal) and zero-inflated negative binomial-normal (ZINB-normal) models. To illustrate the methodology, a real dataset was analysed in which the response was the weekly number of epileptic seizures suffered by patients with epilepsy assigned to one of two treatments (a placebo or a new drug) and followed for up to 27 weeks. Model selection was based on predictive measures obtained by cross-validation, under which the ZIP-normal model was chosen over the model current in the literature, the combined model. The computational routines were implemented in R and form part of this work.
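As a point of reference, a minimal sketch of the kind of zero-inflated hierarchy the selected ZIP-normal model combines is given below; the notation is generic and illustrative, not the thesis's exact parameterization.

```latex
% Illustrative ZIP-normal hierarchy for the weekly seizure count Y_{ij} of
% patient i at week j (generic notation, assumed for illustration only)
Y_{ij} \sim
\begin{cases}
  0 & \text{with probability } \pi,\\[2pt]
  \mathrm{Poisson}(\lambda_{ij}) & \text{with probability } 1-\pi,
\end{cases}
\qquad
\log \lambda_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + b_i,
\qquad
b_i \sim N(0,\sigma_b^{2}).
```

Here the zero-inflation probability π absorbs structural zeros, the patient-specific random intercept b_i induces the within-patient correlation, and together they capture variability beyond a plain Poisson model; written this way the model is a latent Gaussian model of the type INLA is designed to handle.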
82

Modelo não linear misto aplicado a análise de dados longitudinais em um solo localizado em Paragominas, PA / Nonlinear mixed model applied in longitudinal data analysis in a soil located in Paragominas, PA

Marcello Neiva de Mello 22 January 2014
This work applies the theory of mixed models to the nitrogen and carbon content of a soil at various depths. Because of the large amount of organic matter in the soil, the nitrogen and carbon contents show high variability at the shallowest depths and follow a nonlinear pattern, so a nonlinear mixed-model approach to the longitudinal data was required. This approach yields a model that handles nonlinear data with heterogeneous variances and provides a fitted curve for each sample.
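To make the modelling idea concrete, here is a generic nonlinear mixed-model form of the kind the abstract describes; the exponential mean function is an illustrative assumption, not the curve actually fitted in the thesis.

```latex
% Content y_{ij} of sample i at depth d_{ij}; f is an assumed illustrative
% nonlinear mean function with sample-specific coefficients
y_{ij} = f\!\left(d_{ij},\, \boldsymbol{\beta} + \mathbf{b}_i\right) + \varepsilon_{ij},
\qquad
\mathbf{b}_i \sim N(\mathbf{0}, \mathbf{D}),
\qquad
\varepsilon_{ij} \sim N\!\left(0, \sigma^{2}_{ij}\right),
\qquad
\text{e.g. } f(d, \boldsymbol{\phi}) = \phi_1 + \phi_2\, e^{-\phi_3 d}.
```

The sample-specific coefficients b_i give one fitted curve per sample, and letting the residual variance σ²_ij change with depth accommodates the higher variability observed near the surface.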
83

Pronostic dynamique de l'évolution de l'état de santé de patients atteints d'une maladie chronique / Dynamic prognostic of clinical evolution for chronic disease patients

Fournier, Marie-Cecile 10 October 2016
For many chronic diseases, patient care can be improved by a better understanding of disease progression and by the ability to predict the occurrence of major events early. The evolution of a patient's health status can be followed through repeated measurements of a longitudinal marker, such as serum creatinine in renal transplantation. This thesis in epidemiology and biostatistics, applied to renal transplantation, focuses on joint models for longitudinal and time-to-event data. These models have many advantages but are still rarely used in practice. In a first part, we use this methodology to identify the specific role of risk factors in serum creatinine evolution and/or graft-failure risk. This modelling gives a rich epidemiological overview and highlights factors that deserve additional attention in patient follow-up, as they appear associated with graft-failure risk without any prior change in the longitudinal marker, serum creatinine. In a second part, we focus on dynamic predictions, which can be computed from a joint model; they are called dynamic because they are updated throughout follow-up using the longitudinal information collected up to the prediction time. The clinical usefulness of such a prediction score must be evaluated and rests in part on adequate performance in terms of calibration and discrimination. Evaluation tools such as the Brier score and the ROC curve have already been developed; to complement them and to address some of their limitations, we propose an R²-type indicator.
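For orientation, a minimal sketch of a shared-random-effects joint model and the dynamic prediction it produces is shown below; this is the generic textbook form rather than the exact specification used in the thesis.

```latex
% Longitudinal submodel (e.g. serum creatinine), survival submodel (graft
% failure), and the dynamic prediction updated at each landmark time s
y_i(t) = m_i(t) + \varepsilon_i(t)
       = \mathbf{x}_i(t)^{\top}\boldsymbol{\beta}
         + \mathbf{z}_i(t)^{\top}\mathbf{b}_i + \varepsilon_i(t),
\qquad \mathbf{b}_i \sim N(\mathbf{0},\mathbf{D}),
\\
h_i\bigl(t \mid m_i(t)\bigr) = h_0(t)\,
  \exp\bigl\{\boldsymbol{\gamma}^{\top}\mathbf{w}_i + \alpha\, m_i(t)\bigr\},
\\
\pi_i(s+t \mid s) = \Pr\bigl\{T_i \ge s+t \;\bigm|\; T_i > s,\; \mathcal{Y}_i(s)\bigr\}.
```

The association parameter α links the current value of the marker to the instantaneous risk of the event, and π_i is recomputed every time a new measurement enters the patient's history Y_i(s), which is what makes the prediction dynamic.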
84

The Generalized Monotone Incremental Forward Stagewise Method for Modeling Longitudinal, Clustered, and Overdispersed Count Data: Application Predicting Nuclear Bud and Micronuclei Frequencies

Lehman, Rebecca 01 January 2017
With the influx of high-dimensional data there is an immediate need for statistical methods that can handle situations where the number of predictors greatly exceeds the number of samples. One such area of growth is examining how environmental exposures to toxins affect the body in the long term. The cytokinesis-block micronucleus assay can measure the genotoxic effect of exposure as a count outcome, and high-throughput assays that assess gene expression and methylation have been developed to investigate potential biomarkers. It is of interest to identify biomarkers or molecular features associated with elevated micronuclei (MN) or nuclear bud (Nbud) frequency, both measures of exposure to environmental toxins. Given our desire to model a count outcome (MN and Nbud frequency) using high-throughput genomic features as predictors, novel methods that can handle over-parameterized models need to be developed. Overdispersion, when the variance of a count outcome is larger than its mean, is frequently observed with count response data, and in such situations the negative binomial distribution is more appropriate than the Poisson. The method we have chosen to extend is the Generalized Monotone Incremental Forward Stagewise (GMIFS) method. We extend the GMIFS to the negative binomial distribution so it may be used to analyze a count outcome when both a high-dimensional predictor space and overdispersion are present, and we further extend it to the longitudinal Poisson and longitudinal negative binomial settings for modeling a longitudinal or clustered outcome under either equidispersion or overdispersion. Our negative binomial method was compared to glmpath, and our longitudinal methods were compared to glmmLasso and GLMMLasso. The developed methods were used to analyze two datasets, one from the Norwegian Mother and Child Cohort study and one from a breast cancer epigenomic study conducted by researchers at Virginia Commonwealth University. In both studies a count outcome measured exposure to potential genotoxins, and either gene expression or high-throughput methylation data formed a high-dimensional predictor space. Further, the breast cancer study was longitudinal, with outcomes and high-dimensional genomic features collected at multiple time points for each patient. Our goal is to identify biomarkers associated with elevated MN or Nbud frequency. From the development of these methods, we hope to make available more comprehensive statistical models for analyzing count outcomes with high-dimensional predictor spaces and either cross-sectional or longitudinal study designs.
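Since the abstract centres on extending the GMIFS algorithm, a minimal sketch of the basic monotone forward stagewise update is given below for the simplest (equidispersed Poisson, cross-sectional) case; the step size, stopping rule and simulated data are illustrative assumptions, and the negative binomial and longitudinal extensions developed in the dissertation are not reproduced here.

```python
import numpy as np

def gmifs_poisson(X, y, epsilon=0.01, n_steps=2000):
    # Expand the design matrix to [X, -X] so every coefficient can be
    # updated by a non-negative increment (the "monotone" part).
    Xe = np.hstack([X, -X])
    beta = np.zeros(Xe.shape[1])
    for _ in range(n_steps):
        mu = np.exp(Xe @ beta)            # Poisson mean under the log link
        grad = Xe.T @ (mu - y)            # gradient of the negative log-likelihood
        j = np.argmin(grad)               # coordinate with the steepest descent direction
        if grad[j] >= 0:                  # no coordinate improves the fit; stop
            break
        beta[j] += epsilon                # small incremental update
    p = X.shape[1]
    return beta[:p] - beta[p:]            # fold the expanded coefficients back

# Toy usage with simulated data
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
y = rng.poisson(np.exp(0.5 * X[:, 0] - 0.3 * X[:, 1]))
coef = gmifs_poisson(X, y)
```

At each pass a single coefficient of the expanded design [X, -X] moves by a small fixed increment in the direction that most reduces the negative log-likelihood, so the solution path grows slowly and remains usable when predictors far outnumber samples.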
85

Towards a Privacy Preserving Framework for Publishing Longitudinal Data

Sehatkar, Morvarid January 2014
Recent advances in information technology have enabled public organizations and corporations to collect and store huge amounts of individuals' data in data repositories. Such data are powerful sources of information about an individual's life, such as interests, activities, and finances. Corporations can employ data mining and knowledge discovery techniques to extract useful knowledge and interesting patterns from large repositories of individuals' data. The extracted knowledge can be exploited to improve strategic decision making, enhance business performance, and improve services. However, person-specific data often contain sensitive information about individuals, and publishing such data poses potential privacy risks. To deal with these privacy issues, data must be anonymized so that no sensitive information about individuals can be disclosed from published data, while distortion is minimized to ensure the usefulness of the data in practice. In this thesis, we address privacy concerns in publishing longitudinal data. A data set is longitudinal if it contains information about the same observation or event for individuals collected at several points in time. For instance, the data set of multiple visits of patients to a hospital over a period of time is longitudinal. Due to temporal correlations among the events of each record, the potential background knowledge of adversaries about an individual in the context of longitudinal data has specific characteristics, and none of the previous anonymization techniques can effectively protect longitudinal data against an adversary with such knowledge. In this thesis we identify the potential privacy threats to longitudinal data and propose a novel framework of anonymization algorithms that protects individuals' privacy against both identity disclosure and attribute disclosure while preserving data utility. In particular, we propose two privacy models, (K,C)^P-privacy and (K,C)-privacy, and for each of these models we propose efficient algorithms for anonymizing longitudinal data. An extensive experimental study demonstrates that our proposed framework can effectively and efficiently anonymize longitudinal data.
86

Triclosan: Source Attribution, Urinary Metabolite Levels and Temporal Variability in Exposure Among Pregnant Women in Canada

Weiss, Lorelle D. January 2013
OBJECTIVE: To measure urinary triclosan levels and their variability across pregnancy, and to identify sources of triclosan exposure among Canadian pregnant women. METHODS: Single spot and serial urine samples, as well as consumer product use information, were collected across pregnancy and post-partum from 80 healthy pregnant women in Ottawa. Analyses included descriptive statistics, linear mixed-effects and parametric trend modelling, and surrogate category analysis. RESULTS: Triclosan was detected in 87% of maternal urine samples (LOD = 3.0 µg/L). Triclosan concentrations varied by time of day of urine collection (p=0.0006), season of sampling (p=0.019), and parity (p=0.038). Triclosan was an ingredient in 4% of all personal care products used by participants; 89% of these triclosan-containing products were various brands of toothpaste and hand soap. CONCLUSION: This study provides the first data on temporal variability in urinary triclosan levels, and on source attribution, in Canadian pregnant women. The results will assist with population-specific exposure assessment strategies.
87

Chronic poverty concepts and measures: an application to Kazakhstan

Kudebayeva, Alma January 2015
This thesis explores concepts and measurements of chronic poverty, with an application to Kazakhstan. A rigorous analysis of different approaches to measuring poverty and chronic poverty is presented. Five matching techniques were applied to construct an unintended panel from the KHBS 2001-2009, and substantial tests of the reliability, representativeness and robustness of the constructed panel data were carried out. Attrition biases in the longitudinal data were studied rigorously, and an appropriate equivalence scale was determined through regression analysis of the Kazakhstan HBS. The sensitivity of conventional and chronic poverty measures to various poverty lines and equivalence scales is examined, and a stochastic dominance analysis of per-adult-equivalent consumption expenditures is presented. Chronic poverty measures and the determinants of being chronically or transiently poor are estimated. The main correlates of chronic poverty are education, the employment status of the head of household, household composition, ownership of assets such as a dwelling other than the main dwelling, a car, access to water in the house, and location. The correlates of transient poverty are similar to those of chronic poverty, but some have opposite signs, for example the ethnicity of the head of household, household composition, ownership of a dwelling other than the main dwelling, urban location and loan repayments in 2008. An Oaxaca-Blinder decomposition of the gap in consumption expenditures between the chronically and transiently poor, and between the chronically poor and the non-poor, explains the differences through returns to endowments. Poverty transition analysis shows an improvement in poverty dynamics in the later period of the study, 2006-2009. Long poverty durations prevail among singles with children and couples with children, and poverty exit rates are higher than poverty entry rates over the whole period 2001-2009. Multivariate hazard regression models are estimated to examine differences in people's experience of poverty over time: for individuals who enter poverty, the total time spent in poverty depends on both the chances of exiting poverty and the chances of re-entering it. The results confirm negative duration dependence of the hazards of poverty exit and re-entry for longer spells. The only factor with a significantly positive influence on poverty exit is location in Almaty. Many covariates have the same signs for the hazards of poverty exit and re-entry, meaning they are common to the transient poor, who move in and out of poverty over the period. As noted above, the presence of children under age six increases the hazard of poverty re-entry.
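For reference, the Oaxaca-Blinder decomposition mentioned above has the following standard two-fold form (shown in its textbook version, which may differ in detail from the variant used in the thesis), with groups A and B being, for example, the transiently poor and the chronically poor:

```latex
% Two-fold Oaxaca-Blinder decomposition of the mean gap in (log)
% per-adult-equivalent consumption between groups A and B
\bar{y}_A - \bar{y}_B
  = \underbrace{(\bar{\mathbf{x}}_A - \bar{\mathbf{x}}_B)^{\top}\hat{\boldsymbol{\beta}}_B}_{\text{explained (endowments)}}
  \;+\;
  \underbrace{\bar{\mathbf{x}}_A^{\top}\bigl(\hat{\boldsymbol{\beta}}_A - \hat{\boldsymbol{\beta}}_B\bigr)}_{\text{unexplained (returns)}}.
```

The first term attributes part of the consumption gap to differences in average endowments, and the second to differences in the returns to those endowments, which is the component the thesis interprets.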
88

Intégration des facteurs prédictifs de l'effet d'un traitement dans la conception et l'analyse des essais cliniques de petite taille : application à la maladie de Huntington. / Integration of predictive factors of treatment effect in the design and analysis of small clinical trials: application to Huntington's disease

Schramm, Catherine 06 July 2016
Huntington's disease is a rare, genetic, multifaceted neurodegenerative disease with a long course, which induces great heterogeneity in patients' conditions and in disease progression. Current biotherapy trials are performed on small samples of patients, with a treatment effect that is measurable only in the long term and is heterogeneous. Identifying markers of disease progression and of treatment response may help to better understand and improve the results of biotherapy studies in Huntington's disease. We developed a clustering method for treatment efficacy with longitudinal data in order to identify treatment responders and non-responders. Our method combines a linear mixed model with two slopes and a classical clustering algorithm: the mixed model generates patient-specific random effects associated with the treatment response, and the clustering algorithm defines subgroups according to the values of those random effects. The method is robust for small samples. Finding subgroups of responders may help to define predictive markers of treatment response, which can then be used to give each patient the most appropriate treatment. We discuss the integration of (i) predictive markers into the design of future clinical trials, assessing their impact on study power, and (ii) prognostic markers of disease progression, studying the COMT polymorphism as a prognostic marker of cognitive decline in Huntington's disease. Finally, we evaluate the learning effect of neuropsychological tasks measuring cognitive abilities, and show how a double baseline assessment at inclusion in a clinical trial can account for it when the primary outcome is cognitive decline.
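A minimal sketch of the two-step idea described above (mixed model, then clustering of the patient-specific random effects) is given below; the column names and the exact two-slope parameterisation are hypothetical, and the thesis's own method may differ in both respects.

```python
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.cluster import KMeans

def cluster_treatment_response(df: pd.DataFrame, n_groups: int = 2) -> pd.Series:
    # Linear mixed model with a random intercept and a random slope for the
    # on-treatment period, yielding patient-specific deviations from the
    # average treatment response (hypothetical columns: score, time_pre,
    # time_post, patient).
    model = smf.mixedlm("score ~ time_pre + time_post", df,
                        groups=df["patient"], re_formula="~time_post")
    fit = model.fit()
    # fit.random_effects maps each patient id to a Series of random effects
    effects = pd.DataFrame(fit.random_effects).T
    # Cluster patients on their random effects, e.g. responders vs non-responders
    labels = KMeans(n_clusters=n_groups, n_init=10, random_state=0).fit_predict(effects)
    return pd.Series(labels, index=effects.index, name="response_cluster")
```

The subgroup labels can then be cross-tabulated with candidate baseline characteristics to look for predictive markers of response.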
89

Investigating maintenance costs using response feature analysis

Hansson, Linus January 2022
Svenska Kraftnät (Svk) is responsible for ensuring that Sweden has a safe, sustainable, and cost-effective transmission system for electricity. In an effort to reduce costs, Svk participated in a study which determined that it is mainly the costs of common facilities that stand out. The goal of this master's thesis is to identify and assess the factors that influence maintenance costs for the substations in the Swedish national grid. Response feature modelling was applied to longitudinal data for the substations (N = 53) for the years 2017-2020 to obtain individual intercepts with a common slope for further analysis. The factors included in the global model were selected through Pearson correlation analysis and consultation with subject-matter experts. Further attempts to improve on the global model were made through stepwise variable selection based on AIC, by repeating the selection after a log-transformation of the response, and by fitting a model with a subset of predictors chosen by experts. All of the resulting models were significant (p < 0.001), with the best model having an R² of 0.8376. Predictions were made for the first fifty years of a proposed substation's lifespan. Predictors found significant in multiple models include variables describing substation size and age; factors that were not significant in any model include, among others, substation fence length and geographic location.
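A minimal sketch of the response-feature step described above, assuming a long-format table with hypothetical columns station, year and cost: each substation gets its own intercept while the year slope is shared, and the estimated intercepts then serve as the single response value per station in the follow-up regressions.

```python
import pandas as pd
import statsmodels.formula.api as smf

def station_intercepts(costs: pd.DataFrame) -> pd.Series:
    # One intercept per substation, a single shared slope for year
    fit = smf.ols("cost ~ C(station) - 1 + year", data=costs).fit()
    # Without a global intercept the station terms are named "C(station)[A]", ...
    values = {name.split("[", 1)[1].rstrip("]"): coef
              for name, coef in fit.params.items() if name.startswith("C(station)")}
    return pd.Series(values, name="intercept")

# The per-station intercepts can then be regressed on candidate factors, e.g.
#   smf.ols("intercept ~ size + age", data=station_features).fit()
```

Reducing each station's four yearly observations to one feature sidesteps the within-station correlation when the candidate factors are assessed in ordinary regression models.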
90

Addressing the Variable Selection Bias and Local Optimum Limitations of Longitudinal Recursive Partitioning with Time-Efficient Approximations

January 2019
abstract: Longitudinal recursive partitioning (LRP) is a tree-based method for longitudinal data. It takes a sample of individuals that were each measured repeatedly across time, and it splits them based on a set of covariates such that individuals with similar trajectories become grouped together into nodes. LRP does this by fitting a mixed-effects model to each node every time that it becomes partitioned and extracting the deviance, which is the measure of node purity. LRP is implemented using the classification and regression tree algorithm, which suffers from a variable selection bias and does not guarantee reaching a global optimum. Additionally, fitting mixed-effects models to each potential split only to extract the deviance and discard the rest of the information is a computationally intensive procedure. Therefore, in this dissertation, I address the high computational demand, variable selection bias, and local optimum solution. I propose three approximation methods that reduce the computational demand of LRP, and at the same time, allow for a straightforward extension to recursive partitioning algorithms that do not have a variable selection bias and can reach the global optimum solution. In the three proposed approximations, a mixed-effects model is fit to the full data, and the growth curve coefficients for each individual are extracted. Then, (1) a principal component analysis is fit to the set of coefficients and the principal component score is extracted for each individual, (2) a one-factor model is fit to the coefficients and the factor score is extracted, or (3) the coefficients are summed. The three methods result in each individual having a single score that represents the growth curve trajectory. Therefore, now that the outcome is a single score for each individual, any tree-based method may be used for partitioning the data and grouping the individuals together. Once the individuals are assigned to their final nodes, a mixed-effects model is fit to each terminal node with the individuals belonging to it. I conduct a simulation study, where I show that the approximation methods achieve the goals proposed while maintaining a similar level of out-of-sample prediction accuracy as LRP. I then illustrate and compare the methods using an applied data set. / Dissertation/Thesis / Doctoral Dissertation Psychology 2019
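A minimal sketch of approximation (1) described above, assuming coefs is an array of per-individual growth-curve coefficients already extracted from a mixed-effects model fit to the full data and Z holds the individual-level covariates used for splitting; the specific tree learner and its settings are illustrative choices only.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeRegressor

def partition_by_pca_score(coefs: np.ndarray, Z: np.ndarray) -> np.ndarray:
    # Collapse each individual's growth-curve coefficients to a single score
    score = PCA(n_components=1).fit_transform(coefs).ravel()
    # With one outcome value per individual, any standard tree can do the split
    tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=20, random_state=0)
    tree.fit(Z, score)
    return tree.apply(Z)  # terminal-node membership for each individual
```

A mixed-effects model would then be refit within each terminal node, as in the final step the abstract describes.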
