71

An Inferential Framework for Network Hypothesis Tests: With Applications to Biological Networks

Yates, Phillip 30 June 2010 (has links)
The analysis of weighted co-expression gene sets is gaining momentum in systems biology. In addition to substantial research directed toward inferring co-expression networks on the basis of microarray/high-throughput sequencing data, inferential methods are being developed to compare gene networks across one or more phenotypes. Common gene set hypothesis testing procedures are mostly confined to comparing average gene/node transcription levels between one or more groups and make limited use of additional network features, e.g., edges induced by significant partial correlations. Ignoring the gene set architecture disregards relevant network topological comparisons and can result in familiar n…
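A minimal sketch of the kind of topology-aware comparison this abstract motivates: build correlation-threshold co-expression networks for two phenotype groups and permutation-test the difference in edge density. The threshold, simulated groups, and edge-density statistic are illustrative assumptions, not the thesis's actual test.

```python
import numpy as np

rng = np.random.default_rng(0)

def edge_density(expr, threshold=0.7):
    """Fraction of gene pairs whose absolute correlation exceeds the threshold."""
    corr = np.corrcoef(expr, rowvar=False)
    iu = np.triu_indices(corr.shape[0], k=1)
    return np.mean(np.abs(corr[iu]) > threshold)

# Simulated expression matrices (samples x genes) for two phenotypes.
group_a = rng.normal(size=(30, 20))
group_b = rng.normal(size=(30, 20)) + 0.5 * rng.normal(size=(30, 1))  # shared factor adds co-expression

observed = edge_density(group_b) - edge_density(group_a)

# Permutation test: shuffle phenotype labels and recompute the statistic.
pooled = np.vstack([group_a, group_b])
n_a = group_a.shape[0]
perm_stats = []
for _ in range(999):
    idx = rng.permutation(pooled.shape[0])
    perm_a, perm_b = pooled[idx[:n_a]], pooled[idx[n_a:]]
    perm_stats.append(edge_density(perm_b) - edge_density(perm_a))

p_value = (1 + np.sum(np.abs(perm_stats) >= abs(observed))) / (1 + len(perm_stats))
print(f"observed difference in edge density: {observed:.3f}, permutation p-value: {p_value:.3f}")
```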
72

A Comparison of Methods of Analysis to Control for Confounding in a Cohort Study of a Dietary Intervention

Hali, Esinhart 23 July 2012 (has links)
Comparing samples from different populations can be biased by confounding. Several statistical methods can be used to control for confounding, including multiple linear regression, propensity score matching, use of the propensity score (or its logit) as a single covariate in a linear regression model, stratified analysis using propensity score quintiles, and weighted analysis using propensity scores or trimmed propensity scores. The data were from two studies of a dietary intervention (FIBERR and RNP). The outcome variable was the change from baseline to one month for eight outcome measures: fat, fiber, and fruits/vegetables behavior; fat, fiber, and fruits/vegetables intentions; and fat and fruits/vegetables self-efficacy. Propensity score matching and the quintile-stratified analysis were found to be the two best methods for analyzing this dataset, while the weighted analyses performed worst of all the methods compared.
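A hedged sketch of two of the listed adjustments on synthetic data: fitting a propensity score by logistic regression, then using it (a) as a single covariate alongside treatment and (b) for inverse-probability weighting. The data-generating model and variable names are assumptions for illustration; the FIBERR/RNP data are not reproduced here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

rng = np.random.default_rng(1)
n = 500

# Confounders influence both intervention assignment and outcome.
confounders = rng.normal(size=(n, 3))
p_treat = 1 / (1 + np.exp(-(confounders @ np.array([0.8, -0.5, 0.3]))))
treated = rng.binomial(1, p_treat)
outcome = 2.0 * treated + confounders @ np.array([1.0, 1.0, -0.5]) + rng.normal(size=n)

# Propensity score: P(treated | confounders) from logistic regression.
ps = LogisticRegression().fit(confounders, treated).predict_proba(confounders)[:, 1]

# (a) Propensity score as a single covariate alongside the treatment indicator.
X_cov = np.column_stack([treated, ps])
effect_cov = LinearRegression().fit(X_cov, outcome).coef_[0]

# (b) Inverse-probability weighting: difference in weighted group means.
w = treated / ps + (1 - treated) / (1 - ps)
effect_ipw = (np.sum(w * treated * outcome) / np.sum(w * treated)
              - np.sum(w * (1 - treated) * outcome) / np.sum(w * (1 - treated)))

print(f"PS-as-covariate estimate: {effect_cov:.2f}, IPW estimate: {effect_ipw:.2f} (true effect 2.0)")
```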
73

Tolerance Intervals in Random-Effects Models

Sanogo, Kakotan 01 December 2008 (has links)
In the pharmaceutical setting, it is often necessary to establish the shelf life of a drug product and sometimes desirable to assess the risk of product failure at the intended expiry period. Current statistical methodology uses confidence intervals for the predicted mean to establish the expiry period, and prediction intervals for a new assay value or tolerance intervals for a proportion of the population for use in risk assessment. A major concern is that most methodology treats a homogeneous subpopulation, such as batch, either as a fixed effect, and therefore uses a fixed-effects regression model (Graybill, 1976), or within a mixed-effects model limited to balanced data structures (Jonsson, 2003). However, batch is properly a random effect, a fact reflected in some recent methodology [Altan, Cabrera and Shoung (2005); Hoffman and Kringle (2005)]. Thus, to assess the risk of product failure at expiry, it is necessary to use tolerance intervals, since they provide an estimate of the proportion of assay values and/or batches failing at the expiry period. In this thesis, we illustrate the methodology described by Jonsson (2003) for constructing β-expectation tolerance limits for longitudinal data in a random-effects setting. We underline the limitations of Jonsson’s approach to constructing tolerance intervals and highlight the need for a better methodology.
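A minimal numerical sketch of a β-expectation tolerance (prediction-type) limit for a new assay value from a new batch in a random-intercept stability model. The variance-component estimates, fixed-effect estimates, and normal-quantile construction below are assumed inputs for illustration and are simpler than Jonsson's (2003) longitudinal construction.

```python
import numpy as np
from scipy import stats

def beta_expectation_limits(x, beta_hat, cov_beta, var_batch, var_resid, beta_content=0.95):
    """Two-sided limits expected to contain a new batch's assay value at design point x
    with probability beta_content, under a normal random-intercept model."""
    x = np.asarray(x, dtype=float)
    mean = x @ beta_hat
    # Prediction variance: fixed-effect uncertainty + batch-to-batch + assay error.
    pred_var = x @ cov_beta @ x + var_batch + var_resid
    half_width = stats.norm.ppf(0.5 + beta_content / 2) * np.sqrt(pred_var)
    return mean - half_width, mean + half_width

# Assumed estimates from a stability study: potency = intercept + slope * months.
beta_hat = np.array([101.2, -0.35])            # intercept (%), slope (% per month)
cov_beta = np.array([[0.40, -0.02],
                     [-0.02, 0.004]])          # covariance of fixed-effect estimates
var_batch, var_resid = 0.8, 1.5                # batch and residual variance components

lower, upper = beta_expectation_limits([1.0, 24.0], beta_hat, cov_beta, var_batch, var_resid)
print(f"95%-expectation tolerance limits at 24 months: ({lower:.1f}, {upper:.1f})")
```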
74

Application and Extension of Weighted Quantile Sum Regression for the Development of a Clinical Risk Prediction Tool

Bello, Ghalib 21 April 2014 (has links)
In clinical settings, the diagnosis of medical conditions is often aided by measurement of various serum biomarkers through the use of laboratory tests. These biomarkers provide information about different aspects of a patient’s health and the overall function of different organs. In this dissertation, we develop and validate a weighted composite index that aggregates the information from a variety of health biomarkers covering multiple organ systems. The index can be used for predicting all-cause mortality and could also be used as a holistic measure of overall physiological health status. We refer to it as the Health Status Metric (HSM). Validation analysis shows that the HSM is predictive of long-term mortality risk and exhibits a robust association with concurrent chronic conditions, recent hospital utilization, and self-rated health. We develop the HSM using Weighted Quantile Sum (WQS) regression (Gennings et al., 2013; Carrico, 2013), a novel penalized regression technique that imposes nonnegativity and unit-sum constraints on the coefficients used to weight index components. In this dissertation, we develop a number of extensions to the WQS regression technique and apply them to the construction of the HSM. We introduce a new guided approach for the standardization of index components which accounts for potential nonlinear relationships with the outcome of interest. An extended version of the WQS that accommodates interaction effects among index components is also developed and implemented. In addition, we demonstrate that ensemble learning methods borrowed from the field of machine learning can be used to improve the predictive power of the WQS index. Specifically, we show that the use of techniques such as weighted bagging, the random subspace method and stacked generalization in conjunction with the WQS model can produce an index with substantially enhanced predictive accuracy. Finally, practical applications of the HSM are explored. A comparative study is performed to evaluate the feasibility and effectiveness of a number of ‘real-time’ imputation strategies in potential software applications for computing the HSM. In addition, the efficacy of the HSM as a predictor of hospital readmission is assessed in a cohort of emergency department patients.
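A small sketch of the core WQS idea on simulated data: score each component into quartiles, then estimate a single index coefficient together with non-negative weights constrained to sum to one. The constrained least-squares fit via scipy stands in for the bootstrap-based estimation of Carrico (2013) and Gennings et al. (2013); the data and component count are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n, p = 400, 5

# Simulated components; only the first two truly drive the outcome.
X = rng.normal(size=(n, p))
y = 1.0 + 1.5 * (0.7 * X[:, 0] + 0.3 * X[:, 1]) + rng.normal(size=n)

# Quartile-score each component (0, 1, 2, 3) -- the "quantile" part of WQS.
Q = np.column_stack([np.digitize(X[:, j], np.quantile(X[:, j], [0.25, 0.5, 0.75]))
                     for j in range(p)])

def sse(params):
    b0, b1, w = params[0], params[1], params[2:]
    index = Q @ w                      # weighted quantile sum
    return np.sum((y - b0 - b1 * index) ** 2)

constraints = [{"type": "eq", "fun": lambda params: np.sum(params[2:]) - 1.0}]
bounds = [(None, None), (None, None)] + [(0.0, 1.0)] * p   # weights non-negative, unit sum
start = np.concatenate([[0.0, 0.1], np.full(p, 1.0 / p)])

fit = minimize(sse, start, method="SLSQP", bounds=bounds, constraints=constraints)
print("index coefficient:", round(fit.x[1], 2), "weights:", np.round(fit.x[2:], 2))
```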
75

A NUMERICAL METHOD FOR ESTIMATING THE VARIANCE OF AGE AT MAXIMUM GROWTH RATE IN GROWTH MODELS

Ogbagaber, Semhar 23 April 2010 (has links)
Most studies on maturation and body composition using the Fels Longitudinal data mention peak height velocity (PHV) as an important outcome measure. The PHV is often derived from growth models, such as the triple logistic model, fitted to stature (height) data. The age at PHV is sometimes ordinalized to designate an individual as an early, average, or late maturer. The age at PHV is the age at which the rate of growth reaches its maximum; for a well-behaved growth function, it can be obtained by setting the second derivative of the growth function to zero and solving for age. Such a solution depends on the parameters of the growth function, so an estimate of the age at PHV is a function of the estimates of these parameters. Since estimates of the age at PHV are ultimately used as a predictor variable for analyzing adulthood outcomes, the uncertainty in the estimated age at PHV, inherited from the uncertainty in the estimated growth model, needs to be accounted for. The asymptotic s.e. of the age at maximum velocity in simple growth models, such as the logistic and Gompertz models, can be obtained explicitly because closed-form expressions for the age at maximum velocity are available. In this thesis, a numerical method is proposed for computing the s.e. of the age at PHV for growth models that do not admit explicit solutions for the age at PHV. The accuracy of this method is demonstrated by computing the s.e. with both the explicit and the proposed numerical methods and comparing the results. Incorporating the s.e. estimates into regression models that use age at PHV as a predictor is illustrated using the Fels data.
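A sketch of the numerical idea on a single logistic growth curve, where the age at maximum velocity is known to equal the inflection parameter and so provides a check: locate the age at PHV numerically, finite-difference its gradient with respect to the growth parameters, and apply the delta method with an assumed parameter covariance matrix. The parameter values and covariance are illustrative, not Fels estimates, and the thesis's triple logistic model would be substituted in practice.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def velocity(t, theta):
    a, b, c = theta                        # adult size, rate, inflection age
    e = np.exp(-b * (t - c))
    return a * b * e / (1 + e) ** 2        # derivative of the logistic growth curve

def age_at_phv(theta, bracket=(5.0, 20.0)):
    # Age maximizing growth velocity, found numerically.
    res = minimize_scalar(lambda t: -velocity(t, theta), bounds=bracket, method="bounded")
    return res.x

theta_hat = np.array([170.0, 0.9, 13.2])   # assumed fitted growth parameters
cov_theta = np.diag([4.0, 0.01, 0.09])     # assumed parameter covariance

# Finite-difference gradient of age-at-PHV with respect to the growth parameters.
grad = np.zeros(3)
for j in range(3):
    h = 1e-4 * max(abs(theta_hat[j]), 1.0)
    up, down = theta_hat.copy(), theta_hat.copy()
    up[j] += h
    down[j] -= h
    grad[j] = (age_at_phv(up) - age_at_phv(down)) / (2 * h)

se_numeric = np.sqrt(grad @ cov_theta @ grad)   # delta method
se_explicit = np.sqrt(cov_theta[2, 2])          # logistic: age at PHV is exactly c
print(f"age at PHV: {age_at_phv(theta_hat):.2f}, "
      f"numeric s.e.: {se_numeric:.3f}, explicit s.e.: {se_explicit:.3f}")
```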
76

Adaptive Threat Detector Testing Using Bayesian Gaussian Process Models

Ferguson, Bradley Thomas 18 May 2011 (has links)
Detection of biological and chemical threats is an important consideration in modern national defense policy. Much of the testing and evaluation of threat detection technologies is performed without appropriate uncertainty quantification. This paper proposes an approach to analyzing the effect of threat concentration on the probability of detecting chemical and biological threats. The approach uses a semi-parametric probit formulation relating threat concentration level to the probability of instrument detection. It also utilizes a Bayesian adaptive design to determine the threat concentrations at which tests should be performed. The approach offers unique advantages, namely the flexibility to model non-monotone curves and the ability to test in a more informative way. We compare the performance of this approach to current threat detection models and designs via a simulation study.
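A toy sketch of the adaptive-design idea under a plain parametric probit model (not the paper's semi-parametric Gaussian-process formulation): maintain a grid posterior over the probit parameters and pick the next test concentration where the posterior variance of the detection probability is largest. The prior, grids, and data are assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm

# Data so far: concentrations tested and detection outcomes (assumed).
conc = np.array([0.5, 1.0, 2.0, 4.0])
detected = np.array([0, 0, 1, 1])

# Grid posterior over probit parameters: P(detect | x) = Phi(b0 + b1 * x).
b0_grid = np.linspace(-5.0, 2.0, 120)
b1_grid = np.linspace(0.0, 4.0, 120)
B0, B1 = np.meshgrid(b0_grid, b1_grid, indexing="ij")

log_post = np.zeros_like(B0)                      # flat prior over the grid
for x, y in zip(conc, detected):
    p = np.clip(norm.cdf(B0 + B1 * x), 1e-10, 1 - 1e-10)
    log_post += y * np.log(p) + (1 - y) * np.log(1 - p)
post = np.exp(log_post - log_post.max())
post /= post.sum()

# Choose the next concentration where posterior uncertainty about P(detect) is highest.
candidates = np.linspace(0.1, 8.0, 80)
post_var = []
for x in candidates:
    p = norm.cdf(B0 + B1 * x)
    mean_p = np.sum(post * p)
    post_var.append(np.sum(post * (p - mean_p) ** 2))
next_conc = candidates[int(np.argmax(post_var))]
print(f"next concentration to test: {next_conc:.2f}")
```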
77

The ROC Curve and the Area under the Curve (AUC)

Zheng, Shimin 17 February 2017 (has links)
No description available.
78

Design and Analysis of Toxicological Experiments with Multiple Endpoints and Synergistic and Inhibitory Effects

Farhat, Naha 01 January 2014 (has links)
The enormous increase in exposure to toxic materials and hazardous chemicals in recent years is a major concern because of the adverse effects of such exposure on human health specifically and on all organisms in general. Among the major concerns of toxicologists is determining acceptable levels of exposure to such hazardous substances. Current approaches often evaluate each endpoint and stressor individually. Herein we propose two novel approaches to simultaneously determine the Benchmark Dose Tolerable Region (BMDTR) for studies with multiple endpoints and multiple stressors when the stressors exhibit no more than additive effects, adopting a Bayesian approach to fit the non-linear hierarchical model. A main concern when assessing the combined toxicological effect of a chemical mixture is the anticipated type of combined action (i.e., synergistic or antagonistic); thus it was essential to extend the two proposed methods to handle this situation, which imposes additional challenges due to the non-linearity of the tolerable region. Furthermore, we propose a new method to determine the endpoint probabilities for each endpoint, which reflect the importance of each endpoint in determining the boundaries of the BMDTR. This method was also extended to situations where there is an interaction effect between stressors, and the results obtained were consistent with the BMDTR approach in both scenarios (i.e., additive and non-additive effects). In addition, we developed new criteria for determining ray designs for follow-up toxicology experiments, based on the popular D-, A-, and E-optimality criteria introduced by Kiefer (1959), for both scenarios. Moreover, the endpoint probabilities were used to extend these criteria into weighted versions; the main motivation for using these probabilities is to separate necessary from unnecessary information by introducing them as weights into the Fisher information matrix. Illustrative examples based on simulated data are provided for all methods and criteria.
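A compact sketch of the weighted D-criterion idea for comparing candidate follow-up designs: compute the Fisher information matrix for each endpoint under an assumed nonlinear dose-response model along a fixed mixture ray, weight the per-endpoint log-determinants by assumed endpoint probabilities, and rank designs by the weighted score. The dose-response form, parameter values, weights, and candidate designs are all illustrative, not the thesis's models.

```python
import numpy as np

def grad_response(dose, theta):
    """Gradient of f(d; theta) = theta0 * exp(-theta1 * d) with respect to theta."""
    t0, t1 = theta
    e = np.exp(-t1 * dose)
    return np.array([e, -t0 * dose * e])

def fisher_information(doses, theta, sigma2=1.0):
    G = np.array([grad_response(d, theta) for d in doses])
    return G.T @ G / sigma2

def weighted_log_det(doses, thetas, weights):
    """Endpoint-probability-weighted sum of per-endpoint log det(FIM)."""
    return sum(w * np.linalg.slogdet(fisher_information(doses, th))[1]
               for w, th in zip(weights, thetas))

# Assumed parameters for two endpoints along a fixed mixture ray, and endpoint probabilities.
thetas = [np.array([10.0, 0.30]), np.array([5.0, 0.80])]
endpoint_weights = [0.7, 0.3]

# Candidate follow-up designs: total-dose levels along the ray.
designs = {"low-range": np.array([0.0, 0.5, 1.0, 2.0]),
           "wide-range": np.array([0.0, 1.0, 4.0, 8.0]),
           "high-range": np.array([4.0, 6.0, 8.0, 10.0])}

scores = {name: weighted_log_det(d, thetas, endpoint_weights) for name, d in designs.items()}
best = max(scores, key=scores.get)
print({k: round(v, 2) for k, v in scores.items()}, "-> preferred design:", best)
```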
79

Interpretation of Principal Components

Dabdoub, Marwan A. 01 May 1978 (has links)
Principal component analysis can be carried out in two ways: the R-mode, based on R = K'K, and the Q-mode, based on Q = KK', where K is a data matrix centered by column or by row. The most commonly used method is the R-mode. It has been suggested that principal components computed from either the R-mode or the Q-mode may have the same interpretation. If this is true, then interpretation of the principal components could be put on a much more intuitive level in many applications. This will occur whenever one type of principal component is more intuitively related to the physical or natural system being studied than the other. The relationship between the R-mode and Q-mode principal components has been investigated, with the result that they are perfectly correlated. The conclusion that the R-mode and Q-mode principal components have the same interpretation is established. An example is given to illustrate this work; the resulting interpretation is found to be the same as that obtained by Donald L. Phillips (1977) using different methods.
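A short numerical check of the stated result: for a column-centered K with singular value decomposition K = UΣVᵀ, the R-mode component scores KV equal UΣ, so each R-mode score vector is perfectly correlated (up to sign) with the corresponding Q-mode eigenvector. The random data below are only for demonstration.

```python
import numpy as np

rng = np.random.default_rng(4)
K = rng.normal(size=(50, 6))
K = K - K.mean(axis=0)                 # center by column

# R-mode: eigenvectors of K'K are the right singular vectors V.
R = K.T @ K
_, V = np.linalg.eigh(R)
V = V[:, ::-1]                         # order by decreasing eigenvalue
r_scores = K @ V                       # R-mode principal component scores

# Q-mode: eigenvectors of KK' are the left singular vectors U.
Q = K @ K.T
_, U = np.linalg.eigh(Q)
U = U[:, ::-1]

# Correlation between each R-mode score and the matching Q-mode eigenvector.
for j in range(3):
    r = np.corrcoef(r_scores[:, j], U[:, j])[0, 1]
    print(f"component {j + 1}: correlation = {r:+.6f}")   # +/-1 up to sign
```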
80

Investigations of Variable Importance Measures Within Random Forests

Merrill, Andrew C. 01 May 2009 (has links)
Random Forests (RF) (Breiman 2001; Breiman and Cutler 2004) is a completely nonparametric statistical learning procedure that may be used for regression and classification. A feature of RF that is drawing a lot of attention is the novel algorithm used to evaluate the relative importance of the predictor/explanatory variables. Other machine learning algorithms for regression and classification, such as support vector machines and artificial neural networks (Hastie et al. 2009), exhibit high predictive accuracy but provide little insight into the predictive power of individual variables. In contrast, the permutation algorithm of RF has already established a track record for identification of important predictors (Huang et al. 2005; Cutler et al. 2007; Archer and Kimes 2008). Recently, however, some authors (Nicodemus and Shugart 2007; Strobl et al. 2007, 2008) have shown that the presence of categorical variables with many categories (Strobl et al. 2007) or high collinearity (Strobl et al. 2008) yields unduly large variable importance values under the standard RF permutation algorithm. This work creates simulations from multiple linear regression models with small numbers of variables to understand the issues raised by Strobl et al. (2008) regarding shortcomings of the original RF variable importance algorithm and the alternatives implemented in conditional forests (Strobl et al. 2008). In addition, this work examines the dependence of RF variable importance values on user-defined parameters.
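A brief sketch of the behaviour under study: permutation importance from a random forest regression on simulated data in which one informative predictor has a highly collinear copy. The computation uses scikit-learn rather than the original Breiman–Cutler implementation or the conditional-forest alternative of Strobl et al. (2008); the simulation design is an assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(5)
n = 600

x1 = rng.normal(size=n)
x1_copy = x1 + 0.05 * rng.normal(size=n)   # nearly collinear with x1, no direct effect on y
x2 = rng.normal(size=n)                    # independent, weaker effect
x3 = rng.normal(size=n)                    # pure noise
X = np.column_stack([x1, x1_copy, x2, x3])
y = 2.0 * x1 + 1.0 * x2 + rng.normal(size=n)

forest = RandomForestRegressor(n_estimators=300, random_state=0).fit(X, y)
result = permutation_importance(forest, X, y, n_repeats=20, random_state=0)

for name, imp in zip(["x1", "x1_copy (collinear)", "x2", "x3 (noise)"], result.importances_mean):
    print(f"{name:22s} importance = {imp:.3f}")
# The collinear copy, which has no direct effect on y, still receives sizable importance --
# the kind of behaviour examined by Strobl et al. (2008).
```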
