231. An Adaptive Dose Finding Design (DOSEFIND) Using A Nonlinear Dose Response Model. Davenport, James Michael, 01 January 2007.
First-in-man (FIM) Phase I clinical trials are part of the critical path in the development of a new chemical entity (NCE). Since FIM clinical trials are the first time that an NCE is dosed in human subjects, the designs used in these trials are unique and geared toward patient safety. We develop a method for obtaining the desired response using an adaptive non-linear approach. This method is applicable for studies in which MTD, NOEL, NOAEL, PK, PD effects, or other such endpoints are evaluated to determine the desired dose. The method has application whenever a measurable PD marker is an indicator of potential efficacy and could be particularly useful for dose-finding studies. The advantage of the adaptive non-linear methodology is that the actual range of dose response and the lowest non-effective dose levels are determined more quickly and accurately, using fewer subjects than typically needed for a conventional early-phase clinical trial. Using the nonlinear logistic model, we demonstrate with simulations that the DOSEFIND approach has better asymptotic relative efficiency than a fixed-dose approach. Further, we demonstrate that, on average, this method is consistent in reproducing the target dose and has very little bias. This is an indicator of the reproducibility of the method, showing that the long-run average error is quite small. Additionally, DOSEFIND is more cost effective because the sample size needed to obtain the desired target dose is much smaller than that needed in the fixed-dose approach.
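To make the adaptive idea concrete, the following is a minimal simulation sketch in which a two-parameter logistic dose-response curve is refit after each cohort and the next cohort is dosed at the current estimate of the dose giving a chosen target response. The dose range, cohort size, target response, and model parameters are illustrative assumptions and are not taken from the DOSEFIND design itself.

```python
# Hypothetical adaptive dose-finding loop with a two-parameter logistic
# dose-response model; all settings below are illustrative assumptions.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)

def logistic(dose, ed50, slope):
    """Mean PD response as a function of dose (two-parameter logistic)."""
    return 1.0 / (1.0 + np.exp(-slope * (np.log(dose) - np.log(ed50))))

true_ed50, true_slope, sigma = 30.0, 1.5, 0.08   # assumed truth, for simulation only
target_response = 0.5                            # dose sought: mean response of 0.5

doses, responses = [], []
next_dose = 5.0                                  # conservative starting dose
for cohort in range(8):
    d = np.full(3, next_dose)                    # 3 subjects per cohort
    y = logistic(d, true_ed50, true_slope) + rng.normal(0, sigma, size=3)
    doses.extend(d); responses.extend(y)
    if len(set(doses)) < 2:                      # not enough dose spread to fit yet
        next_dose = min(next_dose * 2, 100.0)    # simple escalation
        continue
    try:
        (ed50_hat, slope_hat), _ = curve_fit(
            logistic, np.array(doses), np.array(responses),
            p0=[10.0, 1.0], maxfev=10000)
        # invert the fitted curve at the target response to choose the next dose
        cand = np.exp(np.log(ed50_hat) - np.log(1 / target_response - 1) / slope_hat)
        next_dose = float(np.clip(cand, 1.0, 100.0)) if np.isfinite(cand) else next_dose
    except RuntimeError:
        next_dose = min(next_dose * 2, 100.0)    # fall back to simple escalation

print(f"estimated target dose after {len(doses)} subjects: {next_dose:.1f}")
```

The point of the sketch is only the feedback loop: each refit concentrates subsequent doses near the region of interest, which is what allows a target dose to be located with fewer subjects than a fixed-dose design.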
232. Optimal Clustering: Genetic Constrained K-Means and Linear Programming Algorithms. Zhao, Jianmin, 01 January 2006.
Methods for determining clusters of data under specified constraints have recently gained popularity. Although general constraints may be used, we focus on clustering methods with the constraint of a minimal cluster size. In this dissertation, we propose two constrained k-means algorithms: the Linear Programming Algorithm (LPA) and the Genetic Constrained K-means Algorithm (GCKA). The Linear Programming Algorithm recasts the k-means clustering problem as a linear programming problem with constraints requiring that each cluster have m or more subjects. In order to achieve an acceptable clustering solution, we run the algorithm with a large number of random sets of initial seeds and choose the solution with minimal Root Mean Squared Error (RMSE) as our final solution for a given data set. We evaluate LPA with both generic data and simulated data, and the results indicate that LPA can obtain a reasonable clustering solution. The Genetic Constrained K-means Algorithm hybridizes a genetic algorithm with a constrained k-means algorithm. We define selection, mutation, and constrained k-means operators. Using finite Markov chain theory, we prove that GCKA converges in probability to the global optimum. We test the algorithm with several datasets. The analysis shows that a good clustering solution can be achieved by carefully choosing parameters such as the population size, mutation probability, and number of generations. We also propose a Bi-Nelder algorithm to search for an appropriate number of clusters with minimal RMSE.
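As a rough illustration of the linear-programming idea, the sketch below solves the assignment step of a minimum-cluster-size k-means as a small LP with scipy. The data, cluster count, and minimum size m are invented, and the dissertation's seeding strategy, RMSE-based selection, and genetic operators are not reproduced.

```python
# Sketch of an LP assignment step for minimum-size constrained k-means:
# each point goes to exactly one cluster, each cluster gets at least m points.
import numpy as np
from scipy.optimize import linprog

def constrained_assignment(X, centers, m):
    n, k = X.shape[0], centers.shape[0]
    # cost of assigning point i to cluster j = squared Euclidean distance
    cost = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2).ravel()
    # equality constraints: each point is assigned with total weight 1
    A_eq = np.zeros((n, n * k)); b_eq = np.ones(n)
    for i in range(n):
        A_eq[i, i * k:(i + 1) * k] = 1.0
    # inequality constraints: each cluster receives at least m points
    A_ub = np.zeros((k, n * k)); b_ub = -m * np.ones(k)   # -sum_i x_ij <= -m
    for j in range(k):
        A_ub[j, j::k] = -1.0
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, 1), method="highs-ds")       # simplex -> vertex solution
    return res.x.reshape(n, k).argmax(axis=1)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (40, 2)), rng.normal(4, 1, (10, 2))])
centers = X[rng.choice(len(X), 3, replace=False)]
for _ in range(10):                    # Lloyd-style iterations with LP assignment
    labels = constrained_assignment(X, centers, m=10)
    centers = np.array([X[labels == j].mean(axis=0) for j in range(3)])
print(np.bincount(labels))             # every cluster ends up with at least 10 points
```

Because the constraints form a transportation-type structure, simplex solutions of this LP are integral, so the relaxed assignment weights can be read off directly as cluster labels.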
233. NONLINEAR MODELS IN MULTIVARIATE POPULATION BIOEQUIVALENCE TESTING. Dahman, Bassam, 17 November 2009.
In this dissertation, a methodology is proposed for simultaneously evaluating the population bioequivalence (PBE) of a generic drug to a pre-licensed drug, or the bioequivalence of two formulations of a drug, using multiple correlated pharmacokinetic metrics. The univariate criterion accepted by the Food and Drug Administration (FDA) for testing population bioequivalence is generalized. Very few approaches for testing multivariate extensions of PBE have appeared in the literature. One method uses the trace of the covariance matrix as a measure of total variability, and another uses a pooled variance instead of the reference variance. The former ignores the correlation between the measurements, while the latter is not equivalent to the criterion proposed by the FDA in the univariate case unless the variances of the test and reference are identical, which reduces PBE to average bioequivalence. The confidence interval approach is used to test multivariate population bioequivalence, with a parametric bootstrap method used to evaluate the 100(1-alpha)% confidence interval. The performance of the multivariate criterion is evaluated in a simulation study: the size and power of the bioequivalence test are assessed by varying the mean differences, the variances, the correlations between pharmacokinetic variables, and the sample size. A comparison between the two published approaches and the proposed criterion is also presented. Using nonlinear models and nonlinear mixed-effects models, multivariate population bioequivalence is examined. Finally, the proposed methods are illustrated by simultaneously testing population bioequivalence for AUC and Cmax in two datasets.
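For orientation, the univariate reference-scaled criterion referred to above is commonly written as follows; the notation is the usual one from FDA bioequivalence guidance and may differ from the dissertation's.

```latex
\[
  \theta_{\mathrm{PBE}}
  = \frac{(\mu_T - \mu_R)^2 + \sigma_T^2 - \sigma_R^2}
         {\max\!\left(\sigma_R^2,\ \sigma_{T0}^2\right)}
  \le \theta_P
  \quad\Longleftrightarrow\quad
  (\mu_T - \mu_R)^2 + \sigma_T^2 - \sigma_R^2
  - \theta_P \max\!\left(\sigma_R^2,\ \sigma_{T0}^2\right) \le 0 .
\]
```

Here mu_T, mu_R and sigma_T^2, sigma_R^2 are the test and reference means and total variances, sigma_T0^2 is a regulatory constant, and theta_P is the PBE limit. The multivariate criterion generalizes these scalar quantities, and the parametric bootstrap supplies the 100(1-alpha)% confidence bound used for the test.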
234. Phase II Trials Powered to Detect Activity in Tumor Subsets with Retrospective (or Prospective) Use of Predictive Markers. Sheth, Grishma S., 01 January 2007.
Classical phase II trial designs assume a patient population with a homogeneous tumor type and yield an estimate of the probability of tumor response. Clinically, however, oncology is moving toward identifying patients who are likely to respond to therapy using tumor subtyping based upon predictive markers. Such designs are called targeted designs (Simon, 2004). For a given phase II trial, predictive markers may be defined prospectively (on the basis of previous results) or identified retrospectively on the basis of analysis of responding and non-responding tumors. For the prospective case we propose two Phase II targeted designs in which (a) the trial is powered to detect the presence of responding subtype(s) as identified either prospectively or retrospectively by predictive markers, or (b) the trial is powered to achieve a desired precision in the smallest subtype. Relevant parameters in such a design include the prevalence of the smallest subtype of interest, the hypothesized response rate within that subtype, the expected total response rate, and the targeted probabilities of type I and II errors (α and β). (The expected total response rate is needed for design (a) but not for (b).) Extensions of this design to simultaneous or sequential multiple subtyping and to imperfect assays for predictive markers will also be considered. The Phase II targeted design could be formulated as a single-stage or Simon two-stage design. For multiple subtyping, corrections to the significance level will be considered. Sample size calculations for different scenarios will be presented. An implication of this approach is that Phase II trials based upon classical designs are too small. On the other hand, trials involving "reasonable" numbers of patients must target relatively high threshold response rates within tumor subtypes. For the retrospective case we will provide the power to detect desired rates in the subtypes and provide the sample sizes required to achieve desired power. Retrospective analysis has the advantages that the analysis can be "supervised" by grouping responding and non-responding tumors and that multiple hypotheses, including hypotheses not formulated at the time of trial design, can be tested.
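A small sketch of the kind of sample-size reasoning involved: choose the total accrual so that the expected number of patients in the smallest marker-defined subtype gives an exact binomial test adequate power against the hoped-for response rate. The prevalence, response rates, alpha, and power below are placeholders, not the design parameters of the proposed trials.

```python
# Back-of-the-envelope accrual calculation for powering within the smallest
# subtype; all numeric settings are illustrative placeholders.
from scipy.stats import binom

def subtype_power(n_subtype, p0, p1, alpha=0.05):
    """Power of a one-sided exact binomial test of H0: p <= p0 against p = p1."""
    # smallest critical value c with P(X >= c | p0) <= alpha
    c = next(c for c in range(n_subtype + 2)
             if binom.sf(c - 1, n_subtype, p0) <= alpha)
    return binom.sf(c - 1, n_subtype, p1)

prevalence = 0.20      # prevalence of the smallest marker-defined subtype
p0, p1 = 0.05, 0.30    # uninteresting vs. hoped-for response rate in the subtype
target_power = 0.80

n_total = 10
while True:
    n_sub = int(prevalence * n_total)          # expected accrual in the subtype
    if n_sub >= 1 and subtype_power(n_sub, p0, p1) >= target_power:
        break
    n_total += 1
print(f"about {n_total} patients overall, roughly {n_sub} in the smallest subtype")
```

With these placeholder values the search ends around 70 patients overall for about 14 in the subtype, which illustrates the abstract's point that trials powered within subtypes are larger than classical single-arm designs.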
235. Variable Selection in Competing Risks Using the L1-Penalized Cox Model. Kong, XiangRong, 22 September 2008.
A common situation in survival analysis is that the failure of an individual can happen because of one of multiple distinct causes. Survival data generated in this scenario are commonly referred to as competing risks data. One of the major tasks when examining survival data is to assess the dependence of survival time on explanatory variables. In competing risks, as with ordinary univariate survival data, there may be explanatory variables associated with the risks arising from the different causes being studied. The same variable might have different degrees of influence on the risks due to different causes. Given a set of explanatory variables, it is of interest to identify the subset of variables that are significantly associated with the risk corresponding to each failure cause. In this project, we develop a statistical methodology to achieve this purpose, that is, to perform variable selection in the presence of competing risks survival data. Asymptotic properties of the model and empirical simulation results for evaluation of the model performance are provided. One important feature of our method, which is based on the L1-penalized Cox model, is the ability to perform variable selection in situations with high-dimensional explanatory variables, i.e., where the number of explanatory variables is larger than the number of observations. The method was applied to a real dataset originating from the National Institutes of Health-funded project "Genes related to hepatocellular carcinoma progression in living donor and deceased donor liver transplant" to identify genes that might be relevant to tumor progression in hepatitis C virus (HCV) infected patients diagnosed with hepatocellular carcinoma (HCC). Gene expression was measured on Affymetrix GeneChip microarrays. Based on the 46 currently available samples, 42 genes show very strong association with tumor progression and deserve to be further investigated for their clinical implications in the prognosis of progression in patients diagnosed with HCV and HCC.
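The snippet below is only a stand-in illustration of the modeling idea: events from the competing cause are treated as censored for the cause of interest, and a lasso-type penalty shrinks the coefficients of uninformative covariates toward zero. It uses the general-purpose lifelines package on simulated low-dimensional data; it is not the dissertation's algorithm, its asymptotic theory, or its high-dimensional implementation for the transplant gene-expression data.

```python
# Cause-specific penalized Cox fit on simulated data (illustration only).
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(7)
n, p = 200, 10
X = pd.DataFrame(rng.normal(size=(n, p)), columns=[f"x{j}" for j in range(p)])
# only x0 and x1 drive the cause-1 hazard; cause 2 is an independent competing risk
t1 = rng.exponential(1 / np.exp(0.8 * X["x0"] - 0.8 * X["x1"]))
t2 = rng.exponential(1.5, size=n)
cens = rng.exponential(3.0, size=n)
time = np.minimum.reduce([t1, t2, cens])
cause = np.select([t1 <= np.minimum(t2, cens), t2 <= np.minimum(t1, cens)],
                  [1, 2], default=0)

df = X.copy()
df["time"] = time
df["event_cause1"] = (cause == 1).astype(int)   # competing cause treated as censored

cph = CoxPHFitter(penalizer=0.3, l1_ratio=1.0)  # lasso-type penalty
cph.fit(df, duration_col="time", event_col="event_cause1")
print(cph.params_.round(3))                     # x0 and x1 should dominate
```

In the genuinely high-dimensional setting (p much larger than n) a dedicated coordinate-descent lasso Cox solver would be used instead; the sketch only shows the cause-specific censoring construction and the shrinkage effect of the penalty.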
236. Quantifying the Effects of Correlated Covariates on Variable Importance Estimates from Random Forests. Kimes, Ryan Vincent, 01 January 2006.
Recent advances in computing technology have led to the development of algorithmic modeling techniques. These methods can be used to analyze data that are difficult to analyze using traditional statistical models. This study examined the effectiveness of variable importance estimates from the random forest algorithm in identifying the true predictor among a large number of candidate predictors. A simulation study was conducted using twenty different levels of association among the independent variables and seven different levels of association between the true predictor and the response. We conclude that the random forest method is an effective classification tool when the goals of a study are to produce an accurate classifier and to provide insight regarding the discriminative ability of individual predictor variables. These goals are common in gene expression analysis; therefore, we apply the random forest method for the purpose of estimating variable importance on a microarray data set.
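A compact version of the kind of simulation described: one truly predictive variable embedded in a block of correlated candidates, with the forest's importance scores used to see whether the true predictor still ranks first. The correlation level, effect size, and forest settings are arbitrary choices, not those of the study's twenty-by-seven design.

```python
# One true predictor among equicorrelated candidates, then RF importance.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(3)
n, p, rho = 500, 20, 0.6
# equicorrelated candidate predictors; x0 is the true predictor
cov = np.full((p, p), rho) + (1 - rho) * np.eye(p)
X = rng.multivariate_normal(np.zeros(p), cov, size=n)
y = (1.5 * X[:, 0] + rng.normal(size=n) > 0).astype(int)

rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X, y)
ranking = np.argsort(rf.feature_importances_)[::-1]
print("importance ranking (x0 expected near the top):", ranking[:5])
print("share of importance on x0:", rf.feature_importances_[0].round(3))
```

As the correlation rho is raised, importance tends to spread across the correlated neighbors of the true predictor, which is the effect the study sets out to quantify.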
237. Joint Mixed-Effects Models for Longitudinal Data Analysis: An Application for the Metabolic Syndrome. Thorp, John, III, 11 November 2009.
Mixed-effects models are commonly used to model longitudinal data because they can appropriately account for within- and between-subject sources of variability. Univariate mixed-effects modeling strategies are well developed for a single outcome (response) variable that may be continuous (e.g. Gaussian) or categorical (e.g. binary, Poisson) in nature. Only recently have extensions been discussed for jointly modeling multiple outcome variables measured longitudinally. Many disease processes are a function of several factors that are correlated. For example, the metabolic syndrome, a constellation of cardiovascular risk factors associated with an increased risk of cardiovascular disease and type 2 diabetes, is often defined as having three or more of the following: elevated blood pressure, high waist circumference, elevated glucose, elevated triglycerides, and decreased HDL. Clearly these multiple measures within a subject are not independent. A model that could jointly model two or more of these risk factors, and appropriately account for between-subject sources of variability as well as within-subject sources of variability due to the longitudinal and multivariate nature of the data, would be more useful than several univariate models. In fact, the univariate mixed-effects model can be extended in a relatively straightforward fashion to define a multivariate mixed-effects model for longitudinal data by appropriately defining the variance-covariance structure of the random effects. Existing software such as PROC MIXED in SAS can be used to fit the multivariate mixed-effects model. The Fels Longitudinal Study data were used to illustrate both univariate and multivariate mixed-effects modeling strategies. Specifically, jointly modeled longitudinal measures of systolic (SBP) and diastolic (DBP) blood pressure during childhood (ages two to eighteen) were compared between participants who were diagnosed with at least three of the metabolic syndrome risk factors in adulthood (ages thirty to fifty-five) and those who were never diagnosed with any risk factors. By identifying differences in risk factors, such as blood pressure, early in childhood between those who go on to develop the metabolic syndrome in adulthood and those who do not, earlier interventions could be used to prevent the development of cardiovascular disease and type 2 diabetes. As demonstrated by these analyses, the multivariate model is able not only to answer the same questions addressed by the univariate model but also to answer additional important questions about the association in the evolutions of the responses as well as the evolution of the associations. Furthermore, the additional information gained by incorporating the correlations between the responses reduced the variability (standard errors) of both the fixed-effects estimates (e.g. differences between groups, effects of covariates) and the random-effects estimates (e.g. variability).
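In generic notation, the bivariate random-intercept/random-slope version of such a joint model can be written as below; this is a standard formulation of the idea, not the exact parameterization fitted to the Fels data.

```latex
\[
\begin{aligned}
\mathrm{SBP}_{ij} &= \beta_{01} + \beta_{11} t_{ij}
  + b_{0i}^{(1)} + b_{1i}^{(1)} t_{ij} + \varepsilon_{ij}^{(1)}, \\
\mathrm{DBP}_{ij} &= \beta_{02} + \beta_{12} t_{ij}
  + b_{0i}^{(2)} + b_{1i}^{(2)} t_{ij} + \varepsilon_{ij}^{(2)}, \\
\left(b_{0i}^{(1)}, b_{1i}^{(1)}, b_{0i}^{(2)}, b_{1i}^{(2)}\right)^{\top}
  &\sim N(0, D), \qquad
\left(\varepsilon_{ij}^{(1)}, \varepsilon_{ij}^{(2)}\right)^{\top}
  \sim N(0, \Sigma).
\end{aligned}
\]
```

The off-diagonal blocks of D carry the association between the evolutions of SBP and DBP (and, through the slope terms, the evolution of that association), while Sigma captures residual within-visit correlation; it is this joint variance-covariance structure that software such as PROC MIXED is asked to estimate.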
238. Design and Analysis Methods for Cluster Randomized Trials with Pair-Matching on Baseline Outcome: Reduction of Treatment Effect Variance. Park, Misook, 01 January 2006.
Cluster randomized trials (CRTs) are comparative studies designed to evaluate interventions in which the unit of randomization and analysis is the cluster but the unit of observation is the individual within the cluster. Typically such designs involve a limited number of clusters, and thus the variation between clusters is left uncontrolled. Experimental designs and analysis strategies that minimize this variance are required. In this work we focus on the CRT with pre-post intervention measures. By incorporating the baseline measure into the analysis, we can effectively reduce the variance of the treatment effect. Well-known methods such as adjustment for baseline as a covariate and analysis of differences of pre and post measures are two ways to accomplish this. An alternative way of incorporating baseline measures into the data analysis is to order the clusters on their baseline means, pair-match the two clusters with the smallest means, pair-match the next two, and so on. Our results show that matching on baseline helps to control the between-cluster variation when there is a high correlation between the pre and post measures. Six cases of designs and analyses are evaluated by comparing the variance of the treatment effect and the power of the related hypothesis tests. We observed that, given our assumptions, the adjusted analysis with baseline as a covariate and without pair-matching is the best choice in terms of variance. Future work may reveal that other matching schemes that reflect the natural clustering of experimental units could reduce the variance and increase the power over the standard methods.
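A quick cluster-level simulation of the trade-off discussed above: with a high pre-post correlation, both the difference analysis and the baseline-adjusted (covariate) analysis shrink the variance of the estimated treatment effect relative to a follow-up-only comparison, with the covariate adjustment doing slightly better. The number of clusters, correlation, and effect size are invented, and the six design/analysis cases of the dissertation are not reproduced.

```python
# Cluster-level simulation: follow-up only vs. pre-post differences vs. ANCOVA.
import numpy as np

rng = np.random.default_rng(11)
k, rho, delta, reps = 10, 0.8, 0.5, 2000   # clusters per arm, pre-post correlation
cov = np.array([[1.0, rho], [rho, 1.0]])

est = {"post-only": [], "difference": [], "ancova": []}
for _ in range(reps):
    ctrl = rng.multivariate_normal([0, 0], cov, size=k)       # (pre, post) means
    trt = rng.multivariate_normal([0, delta], cov, size=k)
    pre = np.concatenate([ctrl[:, 0], trt[:, 0]])
    post = np.concatenate([ctrl[:, 1], trt[:, 1]])
    grp = np.concatenate([np.zeros(k), np.ones(k)])
    est["post-only"].append(post[grp == 1].mean() - post[grp == 0].mean())
    diff = post - pre
    est["difference"].append(diff[grp == 1].mean() - diff[grp == 0].mean())
    Xmat = np.column_stack([np.ones(2 * k), grp, pre])         # baseline-adjusted
    beta, *_ = np.linalg.lstsq(Xmat, post, rcond=None)
    est["ancova"].append(beta[1])

for name, vals in est.items():
    print(f"{name:>10}: empirical SD of treatment effect = {np.std(vals):.3f}")
```

At rho = 0.8 the empirical standard errors come out on the order of 0.45 for the follow-up-only analysis versus roughly 0.28 for the two baseline-using analyses, with the covariate adjustment slightly ahead, consistent with the conclusion stated above.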
239. An Empirical Approach to Evaluating Sufficient Similarity: Utilization of Euclidean Distance As A Similarity Measure. Marshall, Scott, 27 May 2010.
Individuals are exposed to chemical mixtures while carrying out everyday tasks, with unknown risk associated with exposure. Given the number of resulting mixtures, it is not economically feasible to identify or characterize all possible mixtures. When complete dose-response data are not available on a (candidate) mixture of concern, EPA guidelines define a similar mixture based on chemical composition, component proportions, and expert biological judgment (EPA, 1986, 2000). Recent work in this literature by Feder et al. (2009) evaluates sufficient similarity in exposure to disinfection by-products of water purification using multivariate statistical techniques and traditional hypothesis testing. The work of Stork et al. (2008) introduced the idea of sufficient similarity in dose-response (making a connection between exposure and effect). They developed methods to evaluate sufficient similarity of a fully characterized reference mixture, with dose-response data available, and a candidate mixture with only mixing proportions available. A limitation of that approach is that the two mixtures must contain the same components. It is of interest to determine whether a fully characterized reference mixture (representative of the random process) is sufficiently similar in dose-response to a candidate mixture resulting from a random process. Four similarity measures based on Euclidean distance are developed to aid in the evaluation of sufficient similarity in dose-response, allowing the mixtures to be subsets of each other. If a reference and a candidate mixture are concluded to be sufficiently similar in dose-response, inference about the candidate mixture can be based on the reference mixture. An example is presented demonstrating that the benchmark dose (BMD) of the reference mixture can be used as a surrogate measure of the BMD for the candidate mixture when the two mixtures are determined to be sufficiently similar in dose-response. Guidelines are developed that enable the researcher to evaluate the performance of the proposed similarity measures.
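As a toy version of the distance idea, the sketch below fits a simple log-dose response model to a reference and a candidate mixture and summarizes dissimilarity by the Euclidean distance between the two fitted curves on a shared dose grid. The model, data, and cutoff are invented; the four proposed similarity measures, their handling of subset mixtures, and the BMD surrogacy argument are not reproduced here.

```python
# Toy Euclidean-distance comparison of two fitted dose-response curves.
import numpy as np

rng = np.random.default_rng(5)
doses = np.array([1, 3, 10, 30, 100], dtype=float)

def simulate(slope, intercept, n_per_dose=4, sd=0.3):
    d = np.repeat(doses, n_per_dose)
    return d, intercept + slope * np.log(d) + rng.normal(0, sd, d.size)

def fit_curve(d, y):
    """Least-squares fit of response = b0 + b1 * log(dose)."""
    A = np.column_stack([np.ones(d.size), np.log(d)])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    return b

d_ref, y_ref = simulate(slope=1.0, intercept=0.2)     # reference mixture
d_can, y_can = simulate(slope=1.1, intercept=0.1)     # candidate mixture
b_ref, b_can = fit_curve(d_ref, y_ref), fit_curve(d_can, y_can)

grid = np.column_stack([np.ones(doses.size), np.log(doses)])
dist = np.linalg.norm(grid @ b_ref - grid @ b_can) / np.sqrt(doses.size)
print(f"mean per-dose distance between fitted curves: {dist:.3f}")
print("sufficiently similar?", dist < 0.25)           # illustrative cutoff only
```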
240. APPLICATIONS OF THE BIVARIATE GAMMA DISTRIBUTION IN NUTRITIONAL EPIDEMIOLOGY AND MEDICAL PHYSICS. Barker, Jolene, 26 September 2008.
In this thesis, the utility of a bivariate gamma distribution is explored. In the field of nutritional epidemiology, a nutrient density transformation is used to reduce collinearity. This phenomenon is shown to arise when the independent variables follow a bivariate gamma model. In the field of radiation oncology, paired comparisons of variances are often performed. The bivariate gamma model is also appropriate for fitting correlated variances. A method for simulating bivariate gamma random variables is presented. This method is used to generate data from several bivariate gamma models, and the asymptotic properties of a test statistic suggested for the radiation oncology application are studied.
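One classical construction for simulating correlated gamma pairs is trivariate reduction, sketched below: two gamma variables share a common gamma component, which induces a positive correlation with a closed form. This is offered only as a generic illustration; whether it coincides with the simulation method developed in the thesis is not claimed.

```python
# Trivariate-reduction construction: X = G0 + G1, Y = G0 + G2 with
# independent gammas sharing a common scale.
import numpy as np

def bivariate_gamma(a0, a1, a2, scale, size, rng):
    """X ~ Gamma(a0+a1, scale), Y ~ Gamma(a0+a2, scale),
    corr(X, Y) = a0 / sqrt((a0+a1)*(a0+a2))."""
    g0 = rng.gamma(a0, scale, size)
    x = g0 + rng.gamma(a1, scale, size)
    y = g0 + rng.gamma(a2, scale, size)
    return x, y

rng = np.random.default_rng(2)
x, y = bivariate_gamma(a0=2.0, a1=1.0, a2=3.0, scale=1.5, size=100_000, rng=rng)
print("sample correlation:     ", np.corrcoef(x, y)[0, 1].round(3))
print("theoretical correlation:", (2.0 / np.sqrt(3.0 * 5.0)).round(3))
```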