
Penalized spline modeling of the ex-vivo assays dose-response curves and the HIV-infected patients' bodyweight change

Sarwat, Samiha 05 June 2015 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / A semi-parametric approach incorporates parametric and nonparametric functions in the model and is very useful in situations when a fully parametric model is inadequate. The objective of this dissertation is to extend statistical methodology employing the semi-parametric modeling approach to analyze data in health science research areas. This dissertation has three parts. The first part discusses the modeling of the dose-response relationship with correlated data by introducing overall drug effects in addition to the deviation of each subject-specific curve from the population average. Here, a penalized spline regression method that allows modeling of the smooth dose-response relationship is applied to data from studies monitoring malaria drug resistance through ex-vivo assays. The second part of the dissertation extends the SiZer map, an exploratory and powerful visualization tool, to detect underlying significant features (increase, decrease, or no change) of a curve at various smoothing levels. Here, Penalized Spline Significant Zero Crossings of Derivatives (PS-SiZer), using a penalized spline regression, is introduced to investigate significant features in correlated data arising from longitudinal settings. The third part of the dissertation applies the proposed PS-SiZer methodology to analyze HIV data. The durability of significant weight change over a period is explored through the PS-SiZer visualization. PS-SiZer is a graphical tool for exploring structures in curves by mapping areas where the rate of change is significantly increasing, decreasing, or unchanged. PS-SiZer maps provide information about the significant rate of weight change that occurs under two ART regimens at various levels of smoothing. A penalized spline regression model at an optimum smoothing level is applied to obtain an estimate of the first time point at which weight no longer increases under different treatment regimens.
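As a loose illustration of the penalized-spline idea this abstract builds on, the sketch below fits a ridge-penalized truncated-power basis with NumPy; the basis, knot placement, and penalty weight are generic textbook choices, not the dissertation's actual specification:

```python
import numpy as np

def pspline_fit(x, y, n_knots=10, degree=3, lam=1.0):
    """Penalized spline fit via a ridge-penalized truncated-power basis."""
    # Interior knots at equally spaced quantiles of x.
    knots = np.quantile(x, np.linspace(0, 1, n_knots + 2)[1:-1])
    # Design matrix: polynomial terms plus one truncated power term per knot.
    X = np.column_stack(
        [x ** p for p in range(degree + 1)]
        + [np.maximum(x - k, 0.0) ** degree for k in knots]
    )
    # Penalize only the knot coefficients, leaving the polynomial part free.
    D = np.diag([0.0] * (degree + 1) + [1.0] * n_knots)
    beta = np.linalg.solve(X.T @ X + lam * D, X.T @ y)
    return beta, X @ beta
```

Shrinking only the knot coefficients is what makes the fit a smoother rather than an interpolator; the smoothing level is governed by `lam`, mirroring the "various levels of smoothing" the abstract refers to.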

Data Quality Assessment for the Secondary Use of Person-Generated Wearable Device Data: Assessing Self-Tracking Data for Research Purposes

Cho, Sylvia January 2021 (has links)
The Quantified Self movement has led to increased routine use of consumer wearables, generating large amounts of person-generated wearable device data. This has become an opportunity for researchers to conduct studies with large-scale person-generated wearable device data without having to collect data in a costly and time-consuming way. However, there are known challenges of wearable device data, such as missing or inaccurate data, which raise the need to assess the quality of data before conducting research. Currently, there is a lack of in-depth understanding of the data quality challenges of using person-generated wearable device data for research purposes, and of how data quality assessment should be conducted. Data quality assessment can be a particular burden for those without domain knowledge of a specific data type, which may be the case for emerging biomedical data sources. The goal of this dissertation is to advance knowledge on the data quality challenges and assessment of person-generated wearable device data and to facilitate data quality assessment for those without domain knowledge of an emerging data type. The dissertation consists of two aims: (1) identifying data quality dimensions important for assessing the quality of person-generated wearable device data for research purposes, and (2) designing and evaluating an interactive data quality characterization tool that supports researchers in assessing the fitness-for-use of fitness tracker data. In the first aim, a multi-method approach was taken, comprising a literature review, a survey, and focus group discussion sessions. We found that intrinsic data quality dimensions applicable to electronic health record data, such as conformance, completeness, and plausibility, are also applicable to person-generated wearable device data. In addition, contextual/fitness-for-use dimensions such as breadth and density completeness and temporal data granularity were identified, given that our focus was on assessing data quality for research purposes. In the second aim, we followed an iterative design process from understanding informational needs to designing a prototype and evaluating the usability of the final version of the tool. The tool allows users to customize the definition of data completeness (fitness-for-use measures) and provides a data summarization of the cohort that meets that definition. We found that an interactive tool that incorporates fitness-for-use measures and allows customization of data completeness supports fitness-for-use assessment more accurately and in less time than a tool that only presents information on intrinsic data quality measures.
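The breadth and density completeness dimensions mentioned in this abstract can be illustrated with a small, hypothetical summary function; the record layout and the one-reading-per-hour expectation are assumptions for the example, not the tool's actual definitions:

```python
from collections import defaultdict
from datetime import datetime

def completeness(records, expected_per_day=24):
    """Breadth (days with any data) and per-day density completeness.

    `records` is a list of (ISO timestamp string, value) pairs; one
    reading per hour is assumed as the expected sampling density.
    """
    per_day = defaultdict(int)
    for ts, _value in records:
        per_day[datetime.fromisoformat(ts).date()] += 1
    breadth = len(per_day)  # number of days covered at all
    density = {day: min(n / expected_per_day, 1.0) for day, n in per_day.items()}
    return breadth, density
```

A fitness-for-use filter in the spirit of the tool would then keep only participants whose breadth and per-day density exceed researcher-chosen thresholds.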

General Bayesian Calibration Framework for Model Contamination and Measurement Error

Wang, Siquan January 2023 (has links)
Many applied statistical analyses face the potential problems of model contamination and measurement error. The form and degree of contamination, as well as the measurement error, are usually unknown and sample-specific, which brings additional challenges for researchers. In this thesis, we propose several Bayesian inference models to address these issues, with application to a special type of data for allergen concentration measurement, called serial dilution data, which is self-calibrated. In our first chapter, we address the problem of model contamination by using a multilevel model to simultaneously flag problematic observations and estimate unknown concentrations in serial dilution data, a problem where the current approach can lead to noisy estimates and difficulty in estimating very low or high concentrations. In our second chapter, we propose a Bayesian joint contamination model for modeling multiple measurement units at the same time while adjusting for differences between experiments using the idea of global calibration; it accounts for uncertainty in both predictors and response variables in Bayesian regression. We obtain efficiency gains by analyzing multiple experiments together while maintaining robustness through the use of hierarchical models. In our third chapter, we develop a Bayesian two-step inference model to account for measurement uncertainty propagation in regression analysis when a joint inference model is infeasible. We aim to increase model inference reliability while providing flexibility to users by not restricting the type of inference model used in the first step. For each of the proposed methods, we also demonstrate how to integrate multiple model building blocks through the idea of a Bayesian workflow. In extensive simulation studies, we show that our proposed methods outperform other commonly used approaches. For the data applications, we apply the proposed methods to the New York City Neighborhood Asthma and Allergy Study (NYC NAAS) data to estimate indoor allergen concentrations more accurately and to reveal the underlying associations between dust mite allergen concentrations and exhaled nitric oxide (NO) measurements in asthmatic children. The methods and tools developed here have a wide range of applications and can be used to improve lab analyses, which are crucial for quantifying exposures to assess disease risk and for evaluating interventions.
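The two-step propagation idea from the third chapter can be sketched in a simplified form: refit the outcome model on each posterior draw of the error-prone covariate and pool the results. The pooling rule below (mean of per-draw estimates, between-draw spread) is a common simplification, not the thesis's exact method:

```python
import numpy as np

def two_step_slopes(x_draws, y):
    """Pool simple-regression slopes across posterior draws of a noisy covariate."""
    slopes = []
    for x in x_draws:
        # One ordinary least squares fit per draw of the covariate.
        X = np.column_stack([np.ones_like(x), x])
        slopes.append(np.linalg.lstsq(X, y, rcond=None)[0][1])
    slopes = np.asarray(slopes)
    # Pooled point estimate and between-draw spread.
    return slopes.mean(), slopes.std(ddof=1)
```

The between-draw spread is the component of uncertainty attributable to the first-step measurement model, which a single plug-in fit would silently discard.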

Statistical Methods for Learning Patients Heterogeneity and Treatment Effects to Achieve Precision Medicine

Xu, Tianchen January 2022 (has links)
The burgeoning adoption of modern technologies provides a great opportunity for gathering multiple modalities of comprehensive personalized data on individuals. This thesis aims to address statistical challenges in analyzing these data, including patient-specific biomarkers, digital phenotypes, and clinical data available from electronic health records (EHRs) linked with other data sources, to achieve precision medicine. The first part of the thesis introduces a dimension reduction method for microbiome data to facilitate subsequent analyses such as regression and clustering. We apply the proposed zero-inflated Poisson factor analysis (ZIPFA) model to the Oral Infections, Glucose Intolerance and Insulin Resistance Study (ORIGINS) and provide valuable insights into the relation between the subgingival microbiome and periodontal disease. The second part focuses on modeling the intensive longitudinal digital phenotypes collected by mobile devices. We develop a method based on a generalized state-space model to estimate the latent process of a patient's health status. The application to the Mobile Parkinson's Observatory for Worldwide Evidence-based Research (mPower) data reveals the low-rank structure of digital phenotypes and infers short-term and long-term Levodopa treatment effects. The third part proposes a self-matched learning method to learn an individualized treatment rule (ITR) from longitudinal EHR data. The medical history data in EHRs provide the opportunity to alleviate unmeasured time-invariant confounding by matching different periods of treatments within the same patient (self-controlled matching). We estimate the ITR for type 2 diabetes patients to reduce the risk of diabetes-related complications using EHR data from New York Presbyterian (NYP) hospital. Furthermore, we include an additional example of a self-controlled case series (SCCS) study on the side effects of stimulants. Significant associations between the use of stimulants and mortality are found in both the FDA Adverse Event Reporting System and the SCCS study, but the latter uses a much smaller sample size, which suggests the high efficiency of the SCCS design.
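The zero-inflated Poisson building block underlying ZIPFA can be sketched as a per-observation log-likelihood; the factor structure on the Poisson rate is omitted here, so this is only the distributional core, not the full model:

```python
import math

def zip_loglik(y, lam, pi):
    """Log-likelihood of count y under a zero-inflated Poisson(lam) with
    structural-zero probability pi."""
    if y == 0:
        # A zero arises either structurally or from the Poisson component.
        return math.log(pi + (1.0 - pi) * math.exp(-lam))
    # Positive counts can only come from the Poisson component.
    return math.log(1.0 - pi) + y * math.log(lam) - lam - math.lgamma(y + 1)
```

In a factor-analysis setting, `lam` for each taxon-by-sample cell would be driven by a low-rank product of loadings and scores, with `pi` absorbing the excess zeros typical of microbiome counts.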

Refractive error, ocular biometry and oculomotor function: The prevalence of myopia and its potential risk factors in the Middle East, with an investigation of dynamic accommodation responses and axial length fluctuations in young myopic adults.

Gammoh, Yazan S.S. January 2011 (has links)
The main experimental work of this thesis has been a cross-sectional study of the prevalence of refractive error and its biometric correlates in Middle Eastern adults. In addition, dynamic accommodative responses and twenty-four-hour axial length fluctuations were investigated in young myopic adults. The prevalence of myopia in 3000 Middle Eastern adults (age range 17-40 years) was similar to previously reported levels of myopia in the West. Myopia was associated with a higher level of education, occupations with a high nearwork demand, and a positive family history of myopia, all of which have been identified as risk factors for myopia development and progression. Diurnal variations in axial length (AL) of similar magnitude to those previously reported in emmetropes were observed in the myopes recruited for the current thesis. However, the pattern of the diurnal variation in AL was significantly different between early-onset myopes (EOMs) and late-onset myopes (LOMs). There were no significant differences between EOMs and LOMs in the dynamic accommodative response to a sinusoidally oscillating target. The accommodative phase lag increased following 30-minute adaptation to myopic defocus using a +2.00 D lens. However, intense prolonged (30-minute) nearwork was found to have no effect on accommodative gain or phase lag. A number of recommendations for further work on the prevalence of refractive error in the Middle East are suggested, along with further research on diurnal AL variations and dynamic accommodative responses in EOMs and LOMs.

Correcting for Measurement Error and Misclassification using General Location Models

Kwizera, Muhire Honorine January 2023 (has links)
Measurement error is common in epidemiologic studies and can lead to biased statistical inference. It is well known, for example, that regression analyses involving measurement error in predictors often produce biased model coefficient estimates. The work in this dissertation adds to the existing vast literature on measurement error by proposing a missing data treatment of measurement error through general location models. The focus is on the case in which information about the measurement error model is obtained not from a subsample of the main study data but from separate, external information, namely external calibration. Methods for handling measurement error in the external calibration setting are needed, given the increasing availability of external data sources and the popularity of data integration in epidemiologic studies. General location models are well suited to the joint analysis of continuous and discrete variables. They offer direct relationships with the linear and logistic regression models and can be readily implemented using frequentist and Bayesian approaches. We use general location models to correct for measurement error and misclassification in the context of three practical problems. The first problem concerns measurement error in a continuous variable from a dataset containing both continuous and categorical variables. In the second problem, measurement error in the continuous variable is further complicated by the limit of detection (LOD) of the measurement instrument, which renders some measures of the error-prone continuous variable undetectable if they fall below the LOD. The third problem deals with misclassification in a binary treatment variable. We implement the proposed methods using Bayesian approaches for the first two problems and the expectation-maximization algorithm for the third problem.
For the first problem we propose a Bayesian approach, based on the general location model, to correct measurement error in a continuous variable in a data set with both continuous and categorical variables. We consider the external calibration setting where, in addition to the main study data of interest, calibration data are available and provide information on the measurement error but not on the error-free variables. The proposed method uses observed data from both the calibration and main study samples and incorporates relationships among all variables in the measurement error adjustment, unlike existing methods that use only the calibration data for model estimation. We make the strong nondifferential measurement error (sNDME) assumption: the measurement error is independent of all the error-free variables given the true value of the error-prone variable. The sNDME assumption allows us to identify our model parameters. We show through simulations that the proposed method yields reduced bias, smaller mean squared error, and interval coverage closer to the nominal level compared with existing methods in regression settings. Furthermore, this improvement is more pronounced with increased measurement error, higher correlation between covariates, and stronger covariate effects. We apply the new method to the New York City Neighborhood Asthma and Allergy Study to examine the association between indoor allergen concentrations and asthma morbidity among urban asthmatic children. The simultaneous occurrence of measurement error and LOD is particularly common in environmental exposures, such as the measurements of indoor allergen concentrations mentioned in the first problem. Statistical analyses that do not address these two problems simultaneously could lead to wrong scientific conclusions.
To address this second problem, we extend the Bayesian general location models for measurement error adjustment to handle both measurement error and values below the LOD in a continuous environmental exposure, in a regression setting with mixed continuous and discrete variables. We treat values below the LOD as censored. Simulations show that our method yields smaller bias and root mean squared error, and that its posterior credible interval has coverage closer to the nominal level, compared with alternative methods, even when the proportion of data below the LOD is moderate. We revisit data from the New York City Neighborhood Asthma and Allergy Study and quantify the effect of indoor allergen concentrations on childhood asthma when over 50% of the measured concentrations are below the LOD. We finally look at the third problem, comparison of group means when treatment groups are misclassified. Our motivation comes from the Frequent User Services Engagement (FUSE) study. Researchers wish to compare quantitative health and social outcome measures for frequent jail-and-shelter users who were assigned housing and those who were not housed, and misclassification occurs as a result of noncompliance. The recommended intent-to-treat analysis, which is based on initial group assignment, is known to underestimate group mean differences. We use the general location model to estimate differences in group means after adjusting for misclassification in the binary grouping variable. Information on the misclassification is available through the sensitivity and specificity. We assume nondifferential misclassification, so that misclassification does not depend on the outcome. We use the expectation-maximization algorithm to obtain estimates of the general location model parameters and the group mean difference. Simulations show the bias reduction in the estimates of the group mean difference.
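The misclassification correction in the third problem can be illustrated with a closed-form, moment-based analogue: given sensitivity and specificity, invert the 2x2 mixing of observed group means. The dissertation fits a full general location model by EM; the function below is only a simplified sketch of the same idea:

```python
import numpy as np

def correct_group_means(mean_flagged, mean_unflagged, p_flagged, se, sp):
    """Recover true group means from misclassified-group means, given
    sensitivity (se) and specificity (sp) of the group label."""
    # True treated prevalence implied by the observed flagged fraction.
    p = (p_flagged - (1.0 - sp)) / (se + sp - 1.0)
    # P(truly treated | flagged) and P(truly treated | unflagged).
    w1 = se * p / p_flagged
    w0 = (1.0 - se) * p / (1.0 - p_flagged)
    # Observed means are 2x2 mixtures of the true group means; invert.
    A = np.array([[w1, 1.0 - w1], [w0, 1.0 - w0]])
    b = np.array([mean_flagged, mean_unflagged])
    mu_treated, mu_control = np.linalg.solve(A, b)
    return mu_treated, mu_control
```

This makes concrete why the naive (intent-to-treat-style) comparison shrinks the difference: each observed group mean is a mixture of the two true means, so their gap is attenuated until the mixing is inverted.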

Computational Algorithms for Multi-omics and Electronic Health Records Data

Guo, Jia January 2023 (has links)
Real-world data have enhanced healthcare research, improving our understanding of disease progression, aiding in diagnosis, and enabling the development of personalized and targeted treatments. In recent years, multi-omics data and electronic health record (EHR) data have become increasingly available, providing researchers with a wealth of information to analyze. The use of machine learning methods with EHR and multi-omics data has emerged as a promising approach to extracting valuable insights from these complex data sources. This dissertation focuses on the development of supervised and unsupervised learning methods, as well as their applications to EHR and multi-omics data, with a particular emphasis on the early detection of clinical outcomes and the identification of novel cancer subtypes. The first part of the dissertation centers on developing a risk prediction tool using EHR data that enables early disease detection so that preventive treatments can be taken to better manage the disease. For this goal, we developed a similarity-based supervised learning method with two applications: predicting end-stage kidney disease (ESKD) and aortic stenosis (AS). In the second part of the dissertation, we expanded our goal to a phenome-wide prediction task and developed a patient-representation-based deep learning method that can predict phenotypes across the phenome. Through a weighting scheme, this approach conducts tailored disease phenotype prediction in a computationally efficient manner with good prediction performance. In the final part of the dissertation, the focus shifts to identifying clinically meaningful novel disease subtypes with unsupervised learning methods using multi-omics data. We tackled this goal by integrating multiple patient graphs generated from multiple omics data with molecular-level features for improved disease subtyping. This dissertation has contributed significantly to the development of data-driven approaches to healthcare and biomedical research using EHR and multi-omics data. The new methodologies, developed with applications to multiple diseases, advance our knowledge of disease diagnosis and the identification of vulnerable groups, and ultimately improve patient care.
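The "similarity-based supervised learning" idea in the first part can be loosely illustrated with a nearest-neighbour sketch; the Euclidean distance and unweighted neighbour average here are placeholder choices, as the dissertation's actual similarity measure and weighting are richer:

```python
import numpy as np

def similarity_predict(X_train, y_train, x_new, k=3):
    """Average the outcomes of the k training patients closest to x_new."""
    d = np.linalg.norm(X_train - x_new, axis=1)
    nearest = np.argsort(d)[:k]
    return float(y_train[nearest].mean())
```

In an EHR setting, `X_train` rows would be patient feature vectors derived from the record, and the returned value a risk score borrowed from the most similar historical patients.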

Machine Learning Methods for Intensive Longitudinal Data and Causal Inference in Multi-Study, Multi-Outcome Settings

Kim, Soohyun January 2024 (has links)
This dissertation examines the challenges and opportunities of analyzing distinct sources of mental health data in the age of precision medicine and big data. The focus lies on two areas: leveraging real-time Ecological Momentary Assessment (EMA) data to understand individual-level variations in mental disorders, especially depression; and the integration of data from randomized clinical trials (RCTs) to assess treatment efficacy, with an application to schizophrenia. The multifaceted and heterogeneous nature of mental disorders calls for nuanced and personalized assessment methods. In the first part of this dissertation, through our proposed machine learning method, the Heterogeneous-Dynamics Restricted Boltzmann Machine (HDRBM), we examine symptom-level variations beyond the traditional one-size-fits-all summary scores and learn heterogeneous group dynamics. We demonstrate the effectiveness of our approach on simulated and real-world EMA data sets. We show that by incorporating covariates, HDRBM can improve accuracy and interpretability, explore the underlying drivers of the group dynamics of participants, and serve as a generative model for EMA studies. In the second part of the dissertation, we present the challenges of integrating multiple randomized clinical trials in mental health research, proposing data fusion as a means to integrate individual patient data across similar studies to enhance statistical power. The dissertation introduces novel estimators tailored for multi-study, multi-outcome fused datasets, aiming to optimize health outcomes for each treatment. The method also addresses the use of similar trials with different outcome follow-up measurements, which serve as proxies for unobserved outcomes. An application to cognitive remediation (CR) therapy's efficacy is provided, using the NIMH Database of Cognitive Training and Remediation Studies (DCTRS) as a resource and emphasizing the importance of leveraging surrogate outcomes in clinical trials.

Validation of Optical Coherence Tomography-Based Crystalline Lens Thickness Measurements in Children

Lehman, Bret M. 14 July 2009 (has links)
No description available.

The Joint Modeling of Longitudinal Covariates and Censored Quantile Regression for Health Applications

Hu, Bo January 2022 (has links)
The overall theme of this thesis is the joint modeling of longitudinal covariates and a censored survival outcome, where the survival outcome is modeled using a conditional quantile regression. In traditional joint modeling approaches, a survival outcome is usually modeled parametrically, for instance as a Cox regression. Censored quantile regressions can model a survival outcome without pre-specifying a parametric likelihood function or assuming a proportional hazard ratio. Existing censored quantile methods are mostly limited to fixed cross-sectional covariates, while in many longitudinal studies researchers wish to investigate the associations between longitudinal covariates and a survival outcome. The first part considers the problem of joint modeling with a survival outcome under a mixture of censoring: left censoring, interval censoring, or right censoring. We pose a linear mixed effects model for a longitudinal covariate and a conditional quantile regression for the censored survival outcome, assuming that the longitudinal covariate and the survival outcome are conditionally independent given individual-level random effects. We propose a Gibbs sampling approach as an extension of a censored-quantile-based data augmentation algorithm, to allow for a longitudinal covariate process. We also propose an iterative algorithm that alternately updates individual-level random effects and model parameters, where the censored survival outcome is handled via re-weighting. Both of our methods are illustrated by an application to the LEGACY Girls cohort study to understand the influence of individual genetic profiles on pubertal development (i.e., the onset of breast development) while adjusting for BMI growth trajectories. The second part considers the problem of joint modeling with a randomly right-censored survival outcome. We pose a linear mixed effects model for a longitudinal covariate and a conditional quantile regression for the censored survival outcome, again assuming conditional independence given individual-level random effects. We propose a Gibbs sampling approach as an extension of a censored-quantile-based data augmentation algorithm, to allow for a longitudinal covariate process. Theoretical properties for the resulting parameter estimates are established. We also propose an iterative algorithm that alternately updates individual-level random effects and model parameters, where the censored survival outcome is handled via re-weighting. Both methods are illustrated by an application to the Mayo Clinic Primary Biliary Cholangitis data to assess the effect of the drug D-penicillamine on the risk of liver transplantation or death, while controlling for age at registration and the serum bilirubin (serBilir) marker.
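The re-weighting treatment of censored observations in the quantile regression can be sketched as a weighted check-function loss; how the weights are constructed (the data augmentation step) is assumed given here, so this shows only the objective being minimized:

```python
import numpy as np

def censored_quantile_loss(beta, X, t, delta, w, tau):
    """Weighted Koenker check loss: events (delta == 1) get full weight,
    censored observations contribute through the weights w in [0, 1]."""
    r = t - X @ beta
    check = r * (tau - (r < 0.0))  # Koenker check function rho_tau(r)
    return np.sum(np.where(delta == 1, check, w * check))
```

With all observations uncensored this reduces to ordinary quantile regression; the iterative algorithm in the abstract alternates between updating such weights and minimizing this objective.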
