  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Some problems in model specification and inference for generalized additive models

Marra, Giampiero January 2010
Regression models describing the dependence between a univariate response and a set of covariates play a fundamental role in statistics. In the last two decades, a tremendous effort has been made in developing flexible regression techniques such as generalized additive models (GAMs), with the aim of modelling the expected value of a response variable as a sum of smooth unspecified functions of predictors. Many nonparametric regression methodologies exist, including local-weighted regression and smoothing splines. Here the focus is on penalized regression spline methods, which can be viewed as a generalization of smoothing splines with a more flexible choice of bases and penalties. This thesis addresses three issues. First, the problem of model misspecification is treated by extending the instrumental variable approach to the GAM context. Second, we study the theoretical and empirical properties of the confidence intervals for the smooth component functions of a GAM. Third, we consider the problem of variable selection within this flexible class of models. All results are supported by theoretical arguments and extensive simulation experiments which shed light on the practical performance of the methods discussed in this thesis.
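As a brief sketch of the model class this abstract describes, a GAM fitted with penalized regression splines is commonly written as below. The notation (link function g, smooth functions f_j, basis functions b_jk, penalty matrices S_j, smoothing parameters λ_j, log-likelihood ℓ) is standard textbook notation assumed here for illustration; it is not taken from the thesis itself.

```latex
\[
g\bigl(\mathbb{E}[y_i]\bigr) = \beta_0 + \sum_{j=1}^{p} f_j(x_{ij}),
\qquad
f_j(x) = \sum_{k=1}^{K_j} \beta_{jk}\, b_{jk}(x)
\]
\[
\hat{\beta} = \arg\min_{\beta}\;
\Bigl\{ -\ell(\beta) + \tfrac{1}{2} \sum_{j=1}^{p} \lambda_j\, \beta^{\top} S_j\, \beta \Bigr\}
\]
```

Each λ_j trades fidelity to the data against the wiggliness of f_j, which is where the "flexible choice of bases and penalties" enters.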
2

Comparison of background correction in tiling arrays and a spatial model

Maurer, Dustin January 1900
Master of Science / Department of Statistics / Susan J. Brown / Haiyan Wang / DNA hybridization microarray technologies have made it possible to gain an unbiased perspective of whole genome transcriptional activity on a scale that is growing rapidly. However, due to biologically irrelevant bias introduced by the experimental process and the machinery involved, correction methods are needed to restore the data to its true biologically meaningful state. Therefore, it is important that the algorithms developed to remove such technical biases are accurate and robust. This report explores the concept of background correction in microarrays by using a real data set of five replicates of whole genome tiling arrays hybridized with genetic material from Tribolium castaneum. It reviews the literature surrounding such correction techniques and explores some of the more traditional methods through implementation on the data set. Finally, it introduces an alternative approach, implements it, and compares it to the traditional approaches for the correction of such errors.
3

Estimation and bias correction of the magnitude of an abrupt level shift

Liu, Wenjie January 2012
Consider a time series model which is stationary apart from a single shift in mean. If the time of a level shift is known, the least squares estimator of the magnitude of this level shift is a minimum variance unbiased estimator. If the time is unknown, however, this estimator is biased. Here, we first carry out extensive simulation studies to determine the relationship between the bias and three parameters of our time series model: the true magnitude of the level shift, the true time point and the autocorrelation of adjacent observations. Thereafter, we use two generalized additive models to generalize the simulation results. Finally, we examine to what extent the bias can be reduced by multiplying the least squares estimator with a shrinkage factor. Our results showed that the bias of the estimated magnitude of the level shift can be reduced when the level shift does not occur close to the beginning or end of the time series. However, it was not possible to simultaneously reduce the bias for all possible time points and magnitudes of the level shift.
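The simulation setting in this abstract can be sketched in a few lines. The code below is only an illustration under stated assumptions, not the author's procedure: the noise is taken to be AR(1) with unit innovation variance, the unknown shift time is chosen by minimizing the residual sum of squares of a two-mean fit, and all function names and default values are hypothetical.

```python
import math
import random


def simulate_series(n, shift_time, delta, phi, rng):
    """Stationary AR(1) noise plus a mean shift of size delta from index shift_time on."""
    noise = rng.gauss(0.0, 1.0 / math.sqrt(1.0 - phi * phi))  # stationary start
    series = []
    for t in range(n):
        if t > 0:
            noise = phi * noise + rng.gauss(0.0, 1.0)
        series.append(noise + (delta if t >= shift_time else 0.0))
    return series


def shift_estimate(series, k):
    """Least squares estimate of the shift placed at index k: mean after minus mean before."""
    before, after = series[:k], series[k:]
    return sum(after) / len(after) - sum(before) / len(before)


def estimate_unknown_time(series, margin=2):
    """Choose the split that most reduces the residual sum of squares, then estimate the shift."""
    n = len(series)
    best_k, best_gain = margin, -1.0
    for k in range(margin, n - margin + 1):
        d = shift_estimate(series, k)
        gain = d * d * k * (n - k) / n  # RSS reduction of a two-mean fit split at k
        if gain > best_gain:
            best_k, best_gain = k, gain
    return best_k, shift_estimate(series, best_k)


def monte_carlo(reps, n, shift_time, delta, phi, seed):
    """Average the known-time and unknown-time estimates over simulated series."""
    rng = random.Random(seed)
    known = unknown = 0.0
    for _ in range(reps):
        series = simulate_series(n, shift_time, delta, phi, rng)
        known += shift_estimate(series, shift_time)
        unknown += estimate_unknown_time(series)[1]
    return known / reps, unknown / reps


if __name__ == "__main__":
    mean_known, mean_unknown = monte_carlo(500, 60, 30, 0.5, 0.3, seed=1)
    print("true shift 0.5, known time:", round(mean_known, 3),
          "unknown time:", round(mean_unknown, 3))
```

Varying `delta`, `shift_time` and `phi` in `monte_carlo` mirrors the three parameters the abstract lists; a shrinkage factor would simply multiply the unknown-time estimate.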
4

Regression analysis with longitudinal measurements

Ryu, Duchwan 29 August 2005
Bayesian approaches to the regression analysis for longitudinal measurements are considered. The history of measurements from a subject may convey characteristics of the subject. Hence, in a regression analysis with longitudinal measurements, the characteristics of each subject can serve as covariates, in addition to other possible covariates. Also, the longitudinal measurements may lead to complicated covariance structures within each subject and they should be modeled properly. When covariates are some unobservable characteristics of each subject, Bayesian parametric and nonparametric regressions have been considered. Although covariates are not observable directly, by virtue of longitudinal measurements, the covariates can be estimated. In this case, the measurement error problem is inevitable. Hence, a classical measurement error model is established. In the Bayesian framework, the regression function as well as all the unobservable covariates and nuisance parameters are estimated. As multiple covariates are involved, a generalized additive model is adopted, and the Bayesian backfitting algorithm is utilized for each component of the additive model. For the binary response, the logistic regression has been proposed, where the link function is estimated by the Bayesian parametric and nonparametric regressions. For the link function, the introduction of latent variables makes the computation fast. In the next part, each subject is assumed not to be observed at prespecified time points. Furthermore, the time of the next measurement from a subject is supposed to be dependent on the previous measurement history of the subject. For these outcome-dependent follow-up times, various modeling options and the associated analyses have been examined to investigate how outcome-dependent follow-up times affect the estimation, within the frameworks of Bayesian parametric and nonparametric regressions.
Correlation structures of outcomes are based on different correlation coefficients for different subjects. First, by assuming a Poisson process for the follow-up times, regression models have been constructed. To interpret the subject-specific random effects, more flexible models are considered by introducing a latent variable for the subject-specific random effect and a survival distribution for the follow-up times. The performance of each model has been evaluated by utilizing Bayesian model assessments.
5

Spatial and Temporal Shifts in Estuarine Nursery Habitats Used by Juvenile Southern Flounder (Paralichthys lethostigma)

Furey, Nathaniel 2012 August 1900
Southern flounder (Paralichthys lethostigma) is a recreationally and commercially important flatfish species found in the Gulf of Mexico, and recent analyses indicate that the northern Gulf of Mexico population is in decline. For proper management, knowledge of habitats used throughout the juvenile stage is needed. The aim of the current study is to examine habitat use of young-of-year (YOY) southern flounder in the Galveston Bay complex using habitat distribution models and acoustic telemetry. A set of habitat distribution models examined how habitat use changes during the first year of life. In addition, southern flounder were tagged with acoustic telemetry transmitters and monitored with a novel receiver array that allows for measurements of fine-scale movements. These movements were compared to habitat maps to examine habitat selection. Habitat distribution models determined that habitat requirements for southern flounder change with ontogeny and season. Newly settled southern flounder were most influenced by physicochemical parameters and the presence of seagrass beds. YOY southern flounder, however, showed increased occurrence at freshwater inlets during summer and fall months, and occurrence decreased at tidal inlets during the fall. Predictions of habitat suitability across the Galveston Bay complex indicate that the factors influencing occurrence of southern flounder change with season, ontogeny, and availability of suitable habitats. With acoustic telemetry, it was apparent that habitat use by southern flounder was nonrandom and influenced by benthic and other physicochemical conditions. Habitat analyses indicated that southern flounder used sand habitats more frequently than seagrass, oyster reef, or salt marsh habitats. Telemetry results also indicated that depth and water temperature were important determinants of habitat suitability for YOY southern flounder, with individuals preferring deeper and cooler regions of the water column in Christmas Bay.
Both model and telemetry analyses indicate that habitat use by YOY southern flounder is dynamic across multiple spatial and temporal scales, with distributions and movements influenced strongly by ontogenetic changes in habitat associations, temporal and spatial variability in physicochemical conditions, and tidal cycles.
6

Distribuição espacial dos indicadores entomológicos de Aedes aegypti e associação com a ocorrência de casos de dengue em município de médio porte do Estado de São Paulo / Spatial distribution of entomological indicators of Aedes aegypti and association with the ocurrence of dengue cases in medium-sized municipality of São Paulo

Barbosa, Gerson Laurindo, 1970- 25 August 2018
Advisors: Roberto Wagner Lourenço, Maria Rita Donalísio Cordeiro / Thesis (doctorate) - Universidade Estadual de Campinas, Faculdade de Ciências Médicas / Previous issue date: 2014 / Abstract: Reducing Aedes aegypti infestation levels is one of the few strategies currently available for dengue control. Monitoring infestation indicators is a strategic parameter for the actions of disease control teams, but little is known about the predictive capacity of these indicators. This study aimed to analyze the spatial distribution of entomological indicators of Aedes aegypti at the egg, larva/pupa and adult stages and their influence on the risk of dengue occurrence in a medium-sized municipality in the state of São Paulo. It is a spatial case-control study evaluating the association between entomological indicators and dengue risk in Sumaré, SP, in 2011. Dengue cases were those confirmed and reported by the city's Epidemiological Surveillance System, and controls were obtained by randomly drawing points within the perimeter of the inhabited area. The entomological indicators were constructed from monthly collections of eggs (traps), larvae/pupae and adult mosquitoes in randomly selected blocks. Smoothed surfaces of the entomological indicator values were obtained by ordinary kriging. These indicators were included in a generalized additive model to assess their influence on the spatial risk of the disease. Both the disease and the indicators occurred seasonally. Dengue cases and vectors at the various stages of the life cycle were found throughout the study area. However, there was no spatial coincidence between disease risk and the intensity of the entomological indicators. The crude and adjusted spatial relative risks of dengue showed similar spatial features, indicating that the indicators had limited influence on disease risk. Thus, the spatial and temporal distribution of dengue may not depend exclusively on the spatial distribution of vectors in places where infestation levels are high, longstanding and stable, as in Sumaré. Furthermore, the area analyzed has longstanding infestation and transmission, deficient public sanitation services and intense movement of people, which may be relevant factors in explaining the circulation of the virus. The vector was found in sufficient abundance to initiate and maintain circulation of the virus in the study area. Infestation did not vary greatly in intensity and was sufficient for the maintenance and/or occurrence of dengue cases in the study area. The generalized additive model did not identify any of the entomological indicators analyzed as predictors of areas at risk of transmission. Including other variables in the generalized additive models, such as circulating serotypes, population immunity and interventions by the control teams, could eventually reveal a modulating effect on disease risk that was not found using the entomological indicators alone / Doctorate / Epidemiology / Doctor of Public Health (Doutor em Saúde Coletiva)
7

Modern Monte Carlo Methods and Their Application in Semiparametric Regression

Thomas, Samuel Joseph 05 1900
Indiana University-Purdue University Indianapolis (IUPUI) / The essence of Bayesian data analysis is to ascertain posterior distributions. Posteriors generally do not have closed-form expressions for direct computation in practical applications. Analysts, therefore, resort to Markov Chain Monte Carlo (MCMC) methods for the generation of sample observations that approximate the desired posterior distribution. Standard MCMC methods simulate sample values from the desired posterior distribution via random proposals. As a result, the mechanism used to generate the proposals inevitably determines the efficiency of the algorithm. One of the modern MCMC techniques designed to explore the high-dimensional space more efficiently is Hamiltonian Monte Carlo (HMC), based on the Hamiltonian differential equations. Inspired by classical mechanics, these equations incorporate a latent variable to generate MCMC proposals that are likely to be accepted. This dissertation discusses how such a powerful computational approach can be used for implementing statistical models. Along this line, I created a unified computational procedure for using HMC to fit various types of statistical models. The procedure that I proposed can be applied to a broad class of models, including linear models, generalized linear models, mixed-effects models, and various types of semiparametric regression models. To facilitate the fitting of a diverse set of models, I incorporated new parameterization and decomposition schemes to ensure the numerical performance of Bayesian model fitting without sacrificing the procedure’s general applicability. As a concrete application, I demonstrate how to use the proposed procedure to fit a multivariate generalized additive model (GAM), a nonstandard statistical model with a complex covariance structure and numerous parameters. 
Byproducts of the research include two software packages that allow practical data analysts to use the proposed computational method to fit their own models. The research’s main methodological contribution is the unified computational approach that it presents for Bayesian model fitting, which can be used for standard and nonstandard statistical models. The availability of such a procedure greatly enhances statistical modelers’ toolbox for implementing new and nonstandard statistical models.
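To make the HMC mechanism described in this abstract concrete, here is a minimal one-dimensional sketch: a momentum variable (the latent variable) is drawn, Hamilton's equations are integrated with the leapfrog scheme, and the endpoint is accepted or rejected. It is an illustration under assumptions, not the dissertation's procedure or its software packages; the normal target with mean 3 and standard deviation 2, the step size, and all function names are hypothetical choices.

```python
import math
import random


def u_normal(q):
    """Negative log density (up to a constant) of a normal with mean 3 and sd 2."""
    return (q - 3.0) ** 2 / 8.0


def grad_u_normal(q):
    """Derivative of u_normal with respect to q."""
    return (q - 3.0) / 4.0


def leapfrog(q, p, grad_u, step, n_steps):
    """Integrate Hamilton's equations for H(q, p) = U(q) + p**2 / 2."""
    p = p - 0.5 * step * grad_u(q)      # initial half step for the momentum
    for i in range(n_steps):
        q = q + step * p                # full step for the position
        if i < n_steps - 1:
            p = p - step * grad_u(q)    # full step for the momentum
    p = p - 0.5 * step * grad_u(q)      # final half step for the momentum
    return q, p


def hmc_sample(u, grad_u, q0, n_samples, step=0.2, n_steps=15, seed=0):
    """Draw samples with HMC, using a Metropolis correction for integration error."""
    rng = random.Random(seed)
    q, samples = q0, []
    for _ in range(n_samples):
        p0 = rng.gauss(0.0, 1.0)        # fresh momentum, the latent variable
        q_new, p_new = leapfrog(q, p0, grad_u, step, n_steps)
        h_old = u(q) + 0.5 * p0 * p0
        h_new = u(q_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new                   # accept the proposal
        samples.append(q)
    return samples


if __name__ == "__main__":
    draws = hmc_sample(u_normal, grad_u_normal, q0=0.0, n_samples=4000, seed=1)
    print("sample mean:", round(sum(draws) / len(draws), 2))
```

Because the leapfrog scheme nearly conserves the Hamiltonian, proposals far from the current point are still accepted with high probability, which is the efficiency gain over random proposals that the abstract refers to.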
8

Optimizing Parameters for High-quality Metagenomic Assembly

Kumar, Ashwani 29 July 2015
No description available.
9

Additive Latent Variable (ALV) Modeling: Assessing Variation in Intervention Impact in Randomized Field Trials

Toyinbo, Peter Ayo 23 October 2009
In order to personalize or tailor treatments to maximize impact among different subgroups, there is a need to model not only the main effects of intervention but also the variation in intervention impact by baseline individual-level risk characteristics. To this end, a suitable statistical model will allow researchers to answer a major research question: who benefits or is harmed by this intervention program? Commonly in social and psychological research, the baseline risk may be unobservable and may have to be estimated from observed indicators that are measured with error; it may also have a nonlinear relationship with the outcome. Most of the existing nonlinear structural equation models (SEMs) developed to address such problems employ polynomial or fully parametric nonlinear functions to define the structural equations. These methods are limited because they require functional forms to be specified beforehand, and even if the models include higher order polynomials there may be problems when the focus of interest relates to the function over its whole domain. The objective here is to develop a more flexible statistical modeling technique for assessing complex relationships between a proximal/distal outcome and 1) baseline characteristics measured with error, and 2) baseline-treatment interaction, such that the shapes of these relationships are data driven and there is no need for the shapes to be determined a priori. In the ALV model structure, the nonlinear components of the regression equations are represented as a generalized additive model (GAM) or a generalized additive mixed-effects model (GAMM). Replication study results show that the ALV model estimates of underlying relationships in the data are sufficiently close to the true pattern. The ALV modeling technique allows researchers to assess how an intervention affects individuals differently as a function of baseline risk that is itself measured with error, and uncover complex relationships in the data that might otherwise be missed.
Although the ALV approach is computationally intensive, it relieves its users from the need to decide functional forms before the model is run. It can be extended to examine complex nonlinearity between growth factors and distal outcomes in a longitudinal study.
10

Estruturação da comunidade de trepadeiras em uma floresta estacional semidecídua / Community structure of climbing plants in a seasonal semideciduos forest

Van Melis, Juliano, 1981- 28 January 2013
Advisor: Fernando Roberto Martins / Thesis (doctorate) - Universidade Estadual de Campinas, Instituto de Biologia / Previous issue date: 2013 / Abstract: Despite the importance of climbing plants in tropical forests, studies of the community assembly of lianas (woody and sub-woody climbers) that investigate everything from the contribution of abiotic and biotic factors to intrinsic factors (coexistence between individuals) are scarce. The overall objective of this thesis is to investigate the community structure of liana species in a Seasonal Semideciduous Forest (SSF), examining (1) the relative importance of environmental and spatial factors for different liana species, (2) the phylogenetic structure of the climbing plant community in different environments along forest development (from treefall gaps to old-growth forest), and (3) the direct or mediated effects of trees and shrubs on the number of species and individuals of climbing plants. We show that (1) much of the variation in liana species composition in an SSF is due to uninvestigated (stochastic) factors and to space (spatial autocorrelation). We therefore conclude that the major determinants of variation in liana species composition in an SSF are randomness, reflecting stochastic variation of populations, and dispersal limitation, shown by high spatial autocorrelation. In the second chapter (2), we found that a slight majority of the sample plots showed greater phylogenetic clustering than expected by chance in the sampled climbing plant community. Variables related to forest dynamics had little influence on the variation in phylogenetic clustering; areas with taller trees and a higher proportion of trees of the present showed more clustering than other areas. We conclude that areas with a lower canopy and a smaller proportion of trees of the present (treefall gaps) show less phylogenetic clustering, since all liana species could potentially occur in these areas, whereas in old-growth forest environmental filters would restrict the community to a few phylogenetic branches. Finally (3), we found that the attributes of the tree and shrub community (abundance and species richness) are important factors in the variation of the attributes of the liana community (species richness and abundance), with part of this effect arising from canopy disturbance. However, canopy disturbance as a direct factor is more important for the variation in abundance and number of species of lianas in a Seasonal Semideciduous Forest / Doctorate / Doctor in Plant Biology
