111

DEVELOPING COMPOSITE AREA-LEVEL INDICATORS OF SOCIOECONOMIC POSITION FOR PITTSBURGH, PENNSYLVANIA

Doebler, Donna Charissa Almario 29 September 2009 (has links)
Objective: To develop a process to construct composite area-level indicators of socioeconomic position (SEP) from existing SEP measures and examine how well they predict the proportion of low birth weight (LBW) infants in Pittsburgh, Pennsylvania. Methodology: Twelve existing measures of SEP were derived from U.S. Census 2000 and constructed at block group (BG) and neighborhood (NB) levels. Geocoded individual-level LBW data were obtained from the Allegheny County Birth Registry (2003-2006) and aggregated to the BG level for Pittsburgh. The indicator development process included multilevel data exploration (boxplots, variance decomposition, mapping, and examining correlations), exploratory multilevel factor analysis (MFA), and model selection. Multilevel linear regression (MLR) and diagnostic tests were used to examine whether indicators of SEP predicted LBW. Results: MFA identified two BG-level factors: material and economic deprivation (MEDij, mean=29.8, variance=184.8), representing the percentage of individuals or households not owning a car, renting their residence, in poverty, receiving public assistance, and earning low income; and concentrated disadvantage (CDij, mean=15.7, variance=164.4), representing the percentage of Blacks, single-headed families, families with members under 18 years old, and those receiving public assistance. At the NB level, all 12 SEP measures were captured in one factor, overall neighborhood deprivation (ONDj, mean=29.3, variance=115.9). MLR identified significant associations between both ONDj and MEDij and LBW: a unit increase in ONDj was associated with a 0.003 increase in LBW infants (p<0.001), and a unit increase in MEDij was associated with a 0.0018 increase (p<0.01). The association between CDij and LBW was moderated by ONDj (p=0.017): in NBs with high ONDj, LBW increased as CDij increased, while in NBs with low ONDj, LBW decreased as CDij increased. This result suggests that lower levels of ONDj may ameliorate the effects of high CDij at the BG level in Pittsburgh. Conclusion: The study outlines a novel approach to examining area-level associations between SEP and health by utilizing MFA to develop BG and NB composite SEP measures; this approach has not been reported in previous neighborhood research. An important public health implication is that these methods facilitate a closer examination of the mechanisms by which SEP at different area levels could impact health.
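The factor-analysis step can be illustrated with a single-level stand-in: the minimal Python sketch below extracts one composite SEP factor from simulated block-group measures and regresses an LBW-style outcome on it. The data, loadings, and single-level simplification are all illustrative assumptions; the dissertation's actual multilevel factor analysis and multilevel regression are richer than this.

```python
# Sketch: deriving a composite area-level SEP indicator via factor analysis
# and relating it to a health outcome. A single-level stand-in for the MFA
# described in the abstract; data are simulated in place of Census measures.
import numpy as np
from sklearn.decomposition import FactorAnalysis
import statsmodels.api as sm

rng = np.random.default_rng(6)
n_bg = 400                                   # block groups
deprivation = rng.normal(size=n_bg)          # latent SEP construct
# 12 observed SEP measures loading on the latent factor, plus noise
sep = deprivation[:, None] * rng.uniform(0.5, 1.0, 12) + rng.normal(0, 0.5, (n_bg, 12))

fa = FactorAnalysis(n_components=1).fit(sep)
composite = fa.transform(sep).ravel()        # composite SEP score per BG

p_lbw = 0.08 + 0.02 * deprivation + rng.normal(0, 0.02, n_bg)  # simulated outcome
ols = sm.OLS(p_lbw, sm.add_constant(composite)).fit()
print(ols.params, ols.pvalues)
```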
112

Effects of missing value imputation on down-stream analyses in microarray data

Oh, Sunghee 28 January 2010 (has links)
Among high-throughput technologies, DNA microarray experiments provide enormous quantities of gene- and array-level biological information about disease. Studies of gene expression values across various conditions and organisms in public health have led to the identification of genes that distinguish tumor from normal tissue and clinically relevant tumor subtypes, yielded prognostic signatures, and ultimately provided potential targets for disease-specific therapy. Despite such advances and the popularity of microarrays, these experiments frequently produce multiple missing values due to factors such as dust, scratches on the slides, insufficient resolution, or hybridization errors on the chips. Thus, gene expression data contain missing entries, and a large number of genes may be affected. Unfortunately, many downstream algorithms for gene expression analysis require a complete matrix as input, so effective missing value imputation methods are needed, and many have been developed in the literature. No imputation method is uniformly superior; performance depends on the structure and nature of the data set. In addition, imputation methods have mostly been compared in terms of variants of RMSE (root mean squared error), which measures the similarity between true and imputed expression values. The drawback of RMSE-based evaluation is that the measure does not reflect the true biological effect in downstream analyses. In this dissertation, we investigate how the missing value imputation process affects the biological results of differentially expressed gene discovery, clustering, and classification. Multiple statistical methods in each downstream analysis will be considered. Quantitative measures reflecting the true biological effects in each downstream analysis will be used to evaluate imputation methods and compared to RMSE-based evaluation.
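As a concrete illustration of the RMSE-based evaluation the abstract critiques, here is a minimal Python sketch that masks entries of a complete matrix, imputes them, and scores RMSE only on the masked cells. The data, masking rate, and choice of scikit-learn imputers are assumptions for illustration; the dissertation's point is precisely that such RMSE comparisons should be supplemented with downstream biological criteria.

```python
# Sketch: compare imputation methods by RMSE on artificially masked entries.
# Hypothetical setup -- not the dissertation's actual pipeline or data.
import numpy as np
from sklearn.impute import KNNImputer, SimpleImputer

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))          # genes x arrays, complete "truth"
mask = rng.random(X.shape) < 0.05       # hide ~5% of entries
X_missing = X.copy()
X_missing[mask] = np.nan

def rmse_on_masked(imputer, X_missing, X_true, mask):
    """RMSE between imputed and true values, restricted to masked cells."""
    X_hat = imputer.fit_transform(X_missing)
    return np.sqrt(np.mean((X_hat[mask] - X_true[mask]) ** 2))

for name, imp in [("column mean", SimpleImputer(strategy="mean")),
                  ("KNN (k=10)", KNNImputer(n_neighbors=10))]:
    print(name, rmse_on_masked(imp, X_missing, X, mask))
```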
113

Analysis of Non-ignorable Missing and Left-Censored Longitudinal Biomarker Data

Sattar, Abdus 28 January 2010 (has links)
In a longitudinal study of biomarker data collected during a hospital stay, observations may be missing due to administrative reasons, the death of the subject, or the subject's discharge from the hospital, resulting in non-ignorable missing data. Standard likelihood-based methods for the analysis of longitudinal data, e.g., mixed models, do not include a mechanism that accounts for the different reasons for missingness. Rather than specifying a full likelihood function for the observed and missing data, we have proposed a weighted pseudo likelihood (WPL) method. Using this method, a model can be built on the available data by accounting for the unobserved data via weights, which are then treated as nuisance parameters in the model. The WPL method accounts for the nuisance parameters in the computation of the variances of parameter estimates. The performance of the proposed method has been compared with a number of widely used methods. The WPL method is illustrated using an example from the Genetic and Inflammatory Markers of Sepsis (GenIMS) study. A simulation study has been conducted to study the properties of the proposed method, and the results are competitive with the widely used methods. In the second part, our goal is to address the problem of analyzing left-censored, longitudinally measured biomarker data when subjects are lost for the reasons mentioned above. We propose to analyze one such biomarker, IL-6, obtained from the GenIMS study, using a weighted random effects Tobit (WRT) model. We have compared the results of the WRT model with the random effects Tobit model. The simulation study shows that the WRT model estimates are approximately unbiased. The correct standard error has been computed using asymptotic pseudo likelihood theory. The use of multiple weights across the panel improves the estimates and produces smaller root mean square error. Therefore, the WRT model with multiple weights across panels is the recommended model for analyzing non-ignorable missing and left-censored longitudinal biomarker data. Model selection is an extremely important part of the analysis of any data set. As illustrated in these analyses, conclusions, which can directly impact public health, depend heavily on the data analytic approach.
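A minimal sketch of the weighted left-censored (Tobit) likelihood idea follows: each observation's log-likelihood contribution is multiplied by a weight. This omits the random effects of the WRT model and uses placeholder weights and simulated data, so it illustrates only the core computation, not the proposed method itself.

```python
# Sketch: weighted log-likelihood for a left-censored (Tobit) regression.
# Simplified illustration of the WRT idea -- no random effects, and the
# weights w are placeholders for inverse-probability-of-dropout weights.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def weighted_tobit_negloglik(params, y, X, w, lower):
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)            # keep sigma positive
    mu = X @ beta
    cens = y <= lower                    # left-censored observations
    ll = np.where(
        cens,
        norm.logcdf((lower - mu) / sigma),               # P(Y* <= lower)
        norm.logpdf((y - mu) / sigma) - np.log(sigma),   # density of observed y
    )
    return -np.sum(w * ll)

rng = np.random.default_rng(1)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y_star = X @ np.array([1.0, 0.5]) + rng.normal(size=n)
lower = 0.0
y = np.maximum(y_star, lower)            # detection limit at 0
w = np.ones(n)                           # placeholder weights
fit = minimize(weighted_tobit_negloglik, x0=np.zeros(3),
               args=(y, X, w, lower), method="BFGS")
print("beta_hat:", fit.x[:2], "sigma_hat:", np.exp(fit.x[-1]))
```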
114

A Meta-Analytic Framework for Combining Incomparable Cox Proportional Hazard Models Caused by Omitting Important Covariates

Yuan, Xing 27 January 2010 (has links)
Meta-analysis can be broadly defined as the quantitative review and synthesis of the results of related but independent studies into a single overall result. It is a statistical analysis that combines or integrates the results of several independent clinical trials considered by the analyst to be "combinable". In many biomedical research areas, especially clinical trials in oncology, researchers often use time to some event (or death) as the primary endpoint to assess treatment effects. As the number of survival analyses continues to increase, there is a greater need to summarize a pool of studies into a coherent overview. It is well established that in Cox proportional hazard models with censored survival data, estimates of treatment effects with some important covariates omitted will be biased toward zero. This is especially problematic in meta-analyses that combine estimates of parameters from studies where different covariate adjustments were made. Presently, few constructive solutions have been provided to address this issue. We propose a meta-analytic framework for combining incomparable Cox models using both aggregated patient data (APD) and individual patient data (IPD) structures. For APD, two meta-regression models (meta-ANOVA and meta-polynomial models) with indicators of the different covariates in the Cox models are proposed to adjust for the heterogeneity of treatment effects across studies. Both parametric and nonparametric estimators for the pooled treatment effect and the heterogeneity variance are presented and compared. For IPD, we propose a hierarchical multiple imputation method to handle the unique missing covariates problem that arises when we combine individual data from different studies for a meta-analysis, and results are compared with estimates from the conventional multiple imputation method. We illustrate the advantages of our proposed analytic procedures over existing methodologies by simulation studies and real data analyses using multiple breast cancer clinical trials. The public health significance of our work is to provide practical guidance for designing and implementing meta-analyses of incomparable Cox proportional hazard models for researchers in the fields of clinical trials, medical research, and other health care areas. Such guidance is important due to the emerging role of meta-analysis in assessing important public health studies.
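The meta-ANOVA idea — regressing study-level effects on indicators of which covariates each Cox model adjusted for, with inverse-variance weights — can be sketched in a few lines. All numbers below are made up for illustration, and a real analysis would also model between-study heterogeneity.

```python
# Sketch: inverse-variance weighted meta-regression ("meta-ANOVA" flavor).
# Study-level log hazard ratios are regressed on an indicator of whether
# each Cox model adjusted for a key covariate. All numbers are invented.
import numpy as np

log_hr   = np.array([-0.35, -0.28, -0.15, -0.12, -0.40, -0.18])  # per study
se       = np.array([ 0.10,  0.12,  0.09,  0.11,  0.15,  0.08])
adjusted = np.array([1, 1, 0, 0, 1, 0])      # 1 = covariate-adjusted Cox model

W = np.diag(1.0 / se**2)                      # inverse-variance weights
X = np.column_stack([np.ones_like(log_hr), adjusted])
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ log_hr)
cov = np.linalg.inv(X.T @ W @ X)
print("pooled effect from unadjusted models:", beta[0])
print("shift when covariates are adjusted for:", beta[1],
      "+/-", 1.96 * np.sqrt(cov[1, 1]))
```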
115

IS THERE A DIFFERENCE IN COMPLETION RATE OF RADIATION TREATMENT IN AFRICAN AMERICAN AND CAUCASIAN WOMEN IN CLINICAL TRIALS?

Glover, Khaleelah 27 January 2010 (has links)
Breast cancer is a disease that can affect all women. However, the rate at which this disease affects women varies by race and ethnicity. When one analyzes incidence rates for a life-threatening disease, a higher incidence rate for a certain group usually portends a higher death rate. However, this is not necessarily true for breast cancer. In particular, compared with their white counterparts, African American women have a lower incidence rate but a higher death rate. The phenomenon of racial differences or health disparities among cancer patients has been established several times by studies primarily associated with differences in health care in the general population. However, within randomized clinical trials, one does not anticipate that disparities in health outcome would be evident, as all patients receive treatment in accordance with standard treatment protocols. The purpose of this study is to test this premise by asking the question: Is there a difference in radiation treatment when comparing African American and Caucasian women who are treated in randomized clinical trials? The study population includes patients from the National Surgical Adjuvant Breast and Bowel Project (NSABP) on various protocols (B15, B16, B18, B22, B23, B25, and B28). The focus was on patients who received chemotherapy in the form of Adriamycin and cyclophosphamide (AC), alone or prior to other chemotherapy agents. AC was given as adjuvant therapy in all of the protocols. There were 9,646 Caucasian patients and 1,040 African-American (AA) patients. Among these patients were 3,504 Caucasian patients and 377 AA patients who received radiation therapy according to protocol. After adjusting for various potential confounders, no evidence was found of a difference by race in total radiation therapy. Public health importance: Randomized clinical trials provide important evidence for the choices of breast cancer treatment. The success of such trials in providing an environment where patients receive a standardized treatment would be called into question if there were treatment differences by race in those trials. This study did not find evidence of racial disparity in radiation therapy in the NSABP breast cancer trials examined.
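A hedged sketch of the kind of adjusted comparison described: a logistic model for treatment completion with a race indicator and confounders. Variable names and data here are hypothetical, not from the NSABP protocols.

```python
# Sketch: adjusted comparison of radiation-therapy completion by race.
# Hypothetical column names and simulated data -- not the NSABP analysis.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 2000
df = pd.DataFrame({
    "completed_rt": rng.binomial(1, 0.85, n),   # outcome: completed therapy
    "race_aa": rng.binomial(1, 0.10, n),        # 1 = African American
    "age": rng.normal(55, 10, n),               # example confounders
    "nodes_positive": rng.poisson(2, n),
})
model = smf.logit("completed_rt ~ race_aa + age + nodes_positive", data=df).fit()
print(model.summary())   # race_aa coefficient near zero under no disparity
```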
116

Statistical Assessment of Medication Adherence Data: A Technique to Analyze the J-Shaped Curve

Rohay, Jeffrey Michael 27 January 2010 (has links)
Medication non-adherence impacts public health by impeding the evaluation of medication efficacy and decreasing improvement and/or increasing morbidity in patients, while increasing health care costs. As a result, intervention studies are designed to improve adherence rates. Medication adherence is J-shaped in nature, with many people taking their medication completely, a significant proportion taking no medication, and a substantial proportion taking their medication on some intermittent schedule. Therefore, descriptive statistics and standard statistical techniques (e.g., parametric t-tests, non-parametric Wilcoxon Rank Sum tests, and dichotomization) can provide misleading results. This study developed and evaluated a method to more accurately assess interventions designed to improve adherence. Better evaluation could lead to identifying new interventions that decrease morbidity, mortality, and health care costs. Parametric techniques utilizing a Gaussian distribution are inappropriate, as J-shaped adherence distributions violate the normality assumption and transformations fail to induce normality. Additionally, measures of central tendency fail to provide an adequate depiction of the distribution. While non-parametric techniques overcome distributional problems, they fail to adequately describe the distribution's shape. Similarly, dichotomizing data results in a loss of information, making small improvements impossible to detect. Using a mixture of beta distributions to describe adherence measures and the expectation-maximization algorithm, parameter and standard error estimates of this distribution were produced. This technique is advantageous as it allows one both to describe the shape of the distribution and to compare parameter estimates. We assessed, via simulation studies, α-levels and power for this new method as compared to standard methods. Additionally, we applied the technique to data obtained from studies designed to increase medication adherence in rheumatoid arthritis patients. Via simulations, the mixed beta model was shown to adequately depict adherence distributions. This technique performed better at distinguishing datasets, exhibiting power ranging from 66% to 92% across sample sizes. Additionally, α-levels for the new technique were reasonable, ranging from 3.4% to 5.4%. Finally, application to the Adherence in Rheumatoid Arthritis: Nursing Interventions studies produced parameter estimates and allowed for the comparison of interventions. The p-value for this new test was 0.0597, compared to 0.20 for the t-test.
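A minimal sketch of the beta-mixture EM idea, assuming adherence proportions clipped into (0,1) and only two components; the dissertation's model and its standard-error machinery are more elaborate than this.

```python
# Sketch: EM for a two-component beta mixture on adherence proportions.
# A simplified stand-in for the dissertation's model; assumes values are
# clipped into (0,1) and uses scipy for the weighted M-step.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import beta as beta_dist

def weighted_beta_mle(x, w):
    """Maximize sum(w * log Beta(x; a, b)) over a, b > 0."""
    def nll(p):
        a, b = np.exp(p)                        # enforce positivity
        return -np.sum(w * beta_dist.logpdf(x, a, b))
    res = minimize(nll, x0=[0.0, 0.0], method="Nelder-Mead")
    return np.exp(res.x)

def em_beta_mixture(x, n_iter=50):
    pi, (a1, b1), (a2, b2) = 0.5, (5.0, 1.0), (1.0, 5.0)   # start near J shape
    for _ in range(n_iter):
        f1 = pi * beta_dist.pdf(x, a1, b1)                 # E-step
        f2 = (1 - pi) * beta_dist.pdf(x, a2, b2)
        g = f1 / (f1 + f2)                                 # responsibilities
        pi = g.mean()                                      # M-step
        a1, b1 = weighted_beta_mle(x, g)
        a2, b2 = weighted_beta_mle(x, 1 - g)
    return pi, (a1, b1), (a2, b2)

rng = np.random.default_rng(3)
x = np.concatenate([rng.beta(8, 1, 300), rng.beta(1, 6, 100)])  # J-shaped mix
x = np.clip(x, 1e-4, 1 - 1e-4)
print(em_beta_mixture(x))
```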
117

COMPARISON BETWEEN RESPONDENTS AND NON-RESPONDENTS IN A NESTED CASE-CONTROL STUDY OF BRAIN TUMORS

Xu, Hui 27 January 2010 (has links)
The purpose of this study was to identify characteristics of non-respondents and estimate the potential non-response bias by comparing the respondents with non-respondents from a nested case-control study of brain tumors. The nested case-control study was conducted in eight Pratt & Whitney plants in Connecticut. Information about demographic and some work-related variables for 239 cases and 116 controls who responded to an interview, as well as 483 cases and 604 controls who did not respond, was obtained from the plant records. Pearson's chi-square test was used to test whether these commonly known variables were distributed differently between respondents and non-respondents by case-control status. There were no differences detected between the respondents and the non-respondents in the control group. However, significant distribution differences were identified between the case respondents and the case non-respondents with respect to the variables: age at hire, age at termination, and duration of time worked. Multivariate logistic regression was conducted to specify which variables were significantly associated with non-response. The probability of being a non-respondent in the case group was significantly associated with age at hire and age at termination. Furthermore, case-control status, age at hire, and duration of time worked were significant predictors of being a non-respondent in the whole dataset. In addition, the non-response biases in brain tumor risk associated with age at hire and age at termination were calculated by comparing risk among respondents and all subjects. The bias varied from -9% to 43%, indicating that differences between the respondents and the non-respondents may result in a large bias in the risk estimate for brain tumors in the nested case-control study. Our study has great public health relevance because survey data with a low response rate could undermine the results of a case-control study of some exposure of interest and a specific disease, or, worse, lead to erroneous conclusions.
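A short sketch of the respondent vs. non-respondent comparison using Pearson's chi-square test on a contingency table; the counts below are invented for illustration.

```python
# Sketch: testing whether a characteristic is distributed differently
# between respondents and non-respondents. Counts are made up.
from scipy.stats import chi2_contingency

# rows: respondents / non-respondents; columns: age-at-hire tertiles
table = [[40, 55, 44],
         [70, 65, 130]]
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, df = {dof}, p = {p:.4f}")
```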
118

ANALYZING TRAJECTORIES OF CAREGIVER PSYCHOLOGICAL DISTRESS OVER TIME USING GROUP-BASED MODELING METHODS

Kuo, Chien-Wen Jean 28 January 2010 (has links)
Group-based trajectory analysis is an innovative statistical method to identify distinct populations over time. We used this approach to characterize patterns of change in distress using shortened scales (depressive symptoms (CESD), anxiety (POMS), and caregiver burden (CRA)) in caregivers (CG) of persons with primary malignant brain tumors. In an ongoing longitudinal study, 98 CGs were interviewed within a month of their care recipient's diagnosis and at 4, 8, and 12 months afterwards. We used SAS Proc Traj to select models based on clinical criteria and statistical judgment. We identified 2 trajectories for depressive symptoms, 2 for anxiety, and 3 for caregiver burden. An estimated 61.2% of CGs had low CESD (range: 0-30) scores at baseline (mean (M)=5.3, standard deviation (SD)=3.6) and remained low (M=2.7, SD=2.8) at 12 months (p=0.06 for trajectory slope); the remaining CGs (38.8%) had high scores at baseline (M=14.4, SD=5.3) that significantly decreased by 12 months (M=9.1, SD=4.6; p=0.01). An estimated 20.4% of CGs had low POMS (range: 3-18) scores at baseline (M=6.0, SD=2.2) that decreased significantly (M=4.0, SD=1.1) at 12 months (p=0.002); the remaining CGs (79.6%) had high scores at baseline (M=10.2, SD=2.1) that decreased significantly by 12 months (M=7.8, SD=1.5; p=0.001). An estimated 20.4% of CGs had low CRA (range: 5-25) scores at baseline (M=10.5, SD=2.7) that decreased significantly (M=6.4, SD=1.3) at 12 months (p<0.001); the moderate trajectory included 26.5% of CGs with consistent scores at baseline (M=14.2, SD=2.0) and 12 months (M=11.0, SD=1.4; p=0.51); the majority of CGs (53.1%) had consistently high scores at baseline (M=19.7, SD=2.1) and at 12 months (M=20.0, SD=2.4; p=0.85). Logistic and multinomial regression results revealed that CGs with low emotional stability were more likely to belong to the high depressive symptoms (p=0.007) and anxiety (p=0.002) trajectory groups. CGs were more likely to belong to the moderate or high caregiver burden trajectory groups if their care recipients had more aggressive tumor types (p=0.004) or lower constructional ability (p=0.05). The public health significance of this work is that trajectory analysis provides a way to identify CGs at risk of increasing psychological distress so that suitable interventions can be developed and targeted.
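SAS Proc Traj fits a formal group-based mixture model; a rough Python analogue, shown below on simulated data, is to estimate a per-caregiver intercept and slope and then cluster those growth parameters. This two-stage shortcut is not equivalent to the likelihood-based trajectory model, but it conveys the idea of distinct trajectory groups.

```python
# Sketch: a rough analogue of group-based trajectory modeling. Fit a
# per-subject intercept and slope over the four visits, then cluster
# those growth parameters. (Proc Traj fits a formal mixture model;
# this shortcut only illustrates the idea. Data are simulated.)
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)
months = np.array([0, 4, 8, 12], dtype=float)
low = 5 + 0.0 * months + rng.normal(0, 2, (60, 4))      # stable-low group
high = 14 - 0.4 * months + rng.normal(0, 2, (38, 4))    # high, declining group
scores = np.vstack([low, high])                          # 98 caregivers x 4 visits

X = np.column_stack([np.ones_like(months), months])
growth = scores @ X @ np.linalg.inv(X.T @ X)             # OLS (intercept, slope) per CG

gm = GaussianMixture(n_components=2, random_state=0).fit(growth)
print("group sizes:", np.bincount(gm.predict(growth)))
print("group means (intercept, slope):", gm.means_)
```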
119

Statistical Methods for Genotype Assay Data

Cheong, Soo Yeon 28 June 2010 (has links)
There are many methods to detect a relationship between genotype and phenotype. All of them must be preceded by measuring genotypes. Genotypes are assigned at each marker for every person to be tested, based on raw data from any of a number of different assays. After genotyping, association is tested with a chi-square test on a 2 x 3 table of phenotype x genotype for a simple case-control study design. Based on the chi-square test, we may infer that one of the alleles at the marker might increase risk of the disease. In this dissertation we study analysis methods for raw data from genotyping assays, with particular attention to two issues: genotype calling for trisomic individuals, and design and testing for pooled DNA studies. There are a number of statistical clustering techniques and software packages in use to call genotypes for disomic individuals. However, standard software packages cannot be used if a chromosomal abnormality exists. We used data from individuals with Down syndrome, who have an extra copy of chromosome 21. A method of calling genotypes for individuals with Down syndrome was suggested in a previous study; here we propose a new method to improve genotype calling in this situation. In most association studies, individual genotyping is used, but that approach has high cost. Pooled genotyping is a cost-effective way to perform the first stage of a genetic association study. DNA pools are formed by mixing DNA samples from multiple individuals before genotyping. Pooled DNA is assayed on a standard genotyping chip, and allele frequencies are estimated from the raw intensity data for the chip. Many previous studies looked at the issue of estimating more accurate allele frequencies for pooled genotyping. In this study we consider two different issues: the design of pooled studies and statistical testing methods. We consider several pooling designs with the same cost and compare them to determine the most effective design. We also discuss the most appropriate statistics for testing each design. The two issues addressed in this study are prerequisites to any genetic association analysis. Genetic association studies are leading to new knowledge that will eventually improve prevention and treatment options for many diseases. However, these studies cannot succeed unless we know how to design and analyze them correctly. Using incorrect genotype calls, incorrect statistics, or inefficient designs will all severely compromise the public health advances that these studies are able to make. The studies we have done will help lead to more correct and efficient genetic association studies, and thus to quicker and surer advances in prevention and treatment. Thus this work has great public health significance.
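A minimal sketch of pooled-DNA allele-frequency estimation and a simple case-control comparison; the intensities, correction factor, and binomial-only error model are illustrative assumptions, since real pooled designs must also account for pool-construction and array measurement error.

```python
# Sketch: estimating an allele frequency from pooled-DNA intensities and
# comparing case vs. control pools. Intensities and the correction factor
# k are illustrative, not from a specific genotyping platform.
import numpy as np
from scipy.stats import norm

def pooled_allele_freq(a_intensity, b_intensity, k=1.0):
    """Relative allele-A signal; k corrects for dye/probe imbalance."""
    return a_intensity / (a_intensity + k * b_intensity)

p_case = pooled_allele_freq(1150.0, 890.0)       # case pool
p_ctrl = pooled_allele_freq(980.0, 1060.0)       # control pool
n_case, n_ctrl = 500, 500                         # individuals per pool

# Z-test on the pooled frequency difference (binomial sampling only;
# a real analysis would add pool-construction and array error terms).
p_bar = (p_case + p_ctrl) / 2
se = np.sqrt(p_bar * (1 - p_bar) * (1 / (2 * n_case) + 1 / (2 * n_ctrl)))
z = (p_case - p_ctrl) / se
print(f"z = {z:.2f}, two-sided p = {2 * norm.sf(abs(z)):.4f}")
```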
120

Identifying And Validating Type 1 And Type 2 Diabetic Cases Using Administrative Data: A Tree-Structured Model

Lo, Wei-Hsuan 28 June 2010 (has links)
Background: Planning, implementation, monitoring, temporal evolution, and prognosis differ between type 1 diabetes (T1DM) and type 2 diabetes (T2DM). To date, few administrative diabetes registries have distinguished T1DM from T2DM, reflecting the lack of the required differential information and possible recording bias. Objective: Using a classification tree model, we developed a prediction rule to distinguish T1DM from T2DM accurately, using information from a large administrative database. Methods: The Medical Archival Retrieval System (MARS) at the University of Pittsburgh Medical Center from 1/1/2000-9/30/2009 included administrative and clinical data for 209,642 unique diabetic patients aged ≥ 18 years. We identified 10,004 T1DM and 156,712 T2DM patients as probable or possible cases, based on clinical criteria. Classification tree models were fit using TIBCO Spotfire S+ 8.1 (TIBCO Software). We used 10-fold cross-validation to choose the model size. We estimated the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for T1DM. Results: The main predictors that distinguish T1DM from T2DM include age < 40 vs. ≥ 40 years, ICD-9 codes for a T1DM or T2DM diagnosis, oral hypoglycemic agent use, insulin use, and episode(s) of diabetic ketoacidosis. History of hypoglycemic coma, duration in the MARS database, in-patient diagnosis of diabetes, and number of complications (including myocardial infarction, coronary artery bypass graft, dialysis, neuropathy, retinopathy, and amputation) are ancillary predictors. The tree-structured model to predict T1DM from probable cases yields sensitivity of 99.63%, specificity of 99.28%, PPV of 89.87%, and NPV of 99.71%. Conclusion: Our preliminary predictive rule for distinguishing between T1DM and T2DM cases in a large administrative database appears promising and needs to be validated. The public health significance is that being able to distinguish between these diabetes subtypes will allow future subtype-specific analyses of cost, morbidity, and mortality. Future work will focus on ascertaining the validity and generalizability of our predictive rule by conducting a review of medical charts (as an internal validation) and applying the rule to another MARS dataset or other administrative databases (as external validations).
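A minimal sketch of the described workflow — fit a classification tree, choose its size by 10-fold cross-validation, and report sensitivity, specificity, PPV, and NPV — using scikit-learn on simulated stand-ins for the MARS predictors.

```python
# Sketch: classification-tree workflow like the one described -- fit a
# tree, size it by 10-fold cross-validation, and report sensitivity,
# specificity, PPV, and NPV. Features are simulated stand-ins for the
# MARS variables (age < 40, ICD-9 codes, insulin use, DKA, ...).
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(5)
n = 5000
X = rng.binomial(1, 0.3, size=(n, 5)).astype(float)    # binary predictors
logit = -3 + 2.5 * X[:, 0] + 2.0 * X[:, 1] - 1.5 * X[:, 2]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))          # 1 = T1DM, 0 = T2DM

grid = GridSearchCV(DecisionTreeClassifier(random_state=0),
                    {"max_leaf_nodes": [4, 8, 16, 32, 64]},
                    cv=10)                              # 10-fold CV picks tree size
grid.fit(X, y)
tn, fp, fn, tp = confusion_matrix(y, grid.predict(X)).ravel()
print("sensitivity:", tp / (tp + fn), "specificity:", tn / (tn + fp))
print("PPV:", tp / (tp + fp), "NPV:", tn / (tn + fn))
```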
