211

Contagion in Financial Markets: Two Statistical Approaches

Rao, Harshavardhana 17 August 2004 (has links)
Financial markets in different countries undergo crises at one point in time or another. These crises can have different causes, but they can spread to other markets through trade relations and capital mobility. Some crises affect markets in other countries more strongly than market fundamentals would dictate. We model this phenomenon, known as contagion, using two approaches, a one-factor model and volatility spillover, and compare the two.
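For orientation, a one-factor formulation of the kind this abstract names can be sketched as follows (the notation is illustrative, not taken from the thesis): each market's return is driven by a common factor plus an idiosyncratic term,

$$
r_{i,t} = \alpha_i + \beta_i f_t + \varepsilon_{i,t},
$$

and contagion corresponds to comovement beyond what the common factor $f_t$ accounts for, for example cross-market correlation in the $\varepsilon_{i,t}$ (or a shift in the $\beta_i$) during crisis periods. A volatility-spillover formulation instead lets crisis-period volatility in one market enter the conditional variance of another.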
212

Variations on the Accelerated Failure Time Model: Mixture Distributions, Cure Rates, and Different Censoring Scenarios

Krachey, Elizabeth Catherine 06 October 2009 (has links)
The accelerated failure time (AFT) model is a popular model for time-to-event data. It provides a useful alternative when the proportional hazards assumption is in question and it provides an intuitive linear regression interpretation where the logarithm of the survival time is regressed on the covariates. We have explored several deviations from the standard AFT model. Standard survival analysis assumes that in the case of perfect follow-up, every patient will eventually experience the event of interest. However, in some clinical trials, a number of patients may never experience such an event, and in essence, are considered cured from the disease. In such a scenario, the Kaplan-Meier survival curve will level off at a nonzero proportion. Hence there is a window of time in which most or all of the events occur, while heavy censoring occurs in the tail. The two-component mixture cure model provides a means of adjusting the AFT model to account for this cured fraction. Chapters 1 and 2 propose parametric and semiparametric estimation procedures for this cure rate AFT model. Survival analysis methods for interval-censoring have been much slower to develop than for the right-censoring case. This is in part because interval-censored data have a more complex censoring mechanism and because the counting process theory developed for right-censored data does not generalize or extend to interval-censored data. Because of the analytical difficulty associated with interval-censored data, recent estimation strategies have focused on the implementation rather than the large sample theoretical justifications of the semiparametric AFT model. Chapter 3 proposes a semiparametric Bayesian estimation procedure for the AFT model under interval-censored data.
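For readers unfamiliar with the models named here, the standard AFT regression and the two-component mixture cure adjustment can be written as (a generic sketch, not the thesis's exact parameterization):

$$
\log T_i = \mathbf{x}_i^{\top}\boldsymbol{\beta} + \sigma\,\varepsilon_i,
\qquad
S_{\mathrm{pop}}(t \mid \mathbf{x}) = \pi(\mathbf{x}) + \bigl(1 - \pi(\mathbf{x})\bigr)\,S_u(t \mid \mathbf{x}),
$$

where $\pi(\mathbf{x})$ is the cured (event-free) fraction responsible for the plateau in the Kaplan-Meier curve and $S_u$ is the survival function of the uncured group, itself modeled with an AFT form.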
213

Age-Dependent Tag Return Models for Estimating Fishing Mortality, Natural Mortality and Selectivity

Jiang, Honghua 09 September 2005 (has links)
We extend the instantaneous rates formulation of fisheries tag return models to allow for age-dependence of fishing mortality rates in Chapter 1. This is important in many applications where tagged fish vary over a large range of ages (and sizes). We focus on a model that assumes selectivity by age is constant over years and that, above a certain age, selectivity is fixed at 1. We show that it is possible to allow natural mortality, M, to vary by age and year. We allow for incomplete mixing of tagged fish and for fisheries to be pulse, continuous or continuous over part of the year. We focus on the case where all or most age classes are tagged each year. We investigate model identifiability and how well parameters can be estimated using analytic and simulation methods. Results show that some models in which the tag reporting rate is estimated are singular or near-singular. The age-length key method commonly used for age specification may produce substantial errors in converting size to age, especially for older fish. To reduce such errors, in Chapter 2 we propose two alternative sampling designs to the standard one of tagging all age classes: one in which only age 1 fish are tagged, and another in which both age 1 and age 2 fish are tagged. Catch-and-release fisheries have become very important to the management of overexploited recreational fish stocks. Tag return studies where the tag is removed regardless of fish disposition have been used to assess the effectiveness of restoration efforts for these catch-and-release fisheries. In Chapter 3, we extend the instantaneous rate formulation of tag return models introduced in Chapter 1 to catch-and-release tagging studies. We illustrate the methods using multiple age class tag return data on striped bass (Morone saxatilis) from the Maryland Department of Natural Resources (MDNR). We found evidence that M is age dependent and that M has increased since 1999, possibly due to an outbreak of mycobacteriosis in striped bass in the Chesapeake Bay.
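A generic instantaneous-rates tag return structure of the type described (a sketch; the thesis's exact formulation may differ) writes age- and year-specific fishing mortality as an age selectivity times a year effect, and the expected recovery probability for a continuous fishery in the year of release as

$$
F_{a,y} = s_a F_y,
\qquad
P_{a,y} = \lambda\,\frac{F_{a,y}}{F_{a,y} + M_{a,y}}\Bigl(1 - e^{-(F_{a,y} + M_{a,y})}\Bigr),
$$

with $s_a$ the selectivity (fixed at 1 above some age), $M_{a,y}$ natural mortality, $\lambda$ the tag reporting rate, and survival terms $e^{-(F + M)}$ accumulating across the years between release and recovery.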
214

Bayesian Analysis of Circular Data Using Wrapped Distributions

Ravindran, Palanikumar 29 October 2002 (has links)
Circular data arise in a number of different areas such as the geological, meteorological, biological and industrial sciences. We cannot use standard statistical techniques to model circular data, due to the circular geometry of the sample space. One of the common methods used to analyze such data is the wrapping approach. Using the wrapping approach, we assume that, by wrapping a probability distribution from the real line onto the circle, we obtain the probability distribution for circular data. This approach creates a vast class of probability distributions that are flexible enough to account for different features of circular data. However, likelihood-based inference for such distributions can be very complicated and computationally intensive. The EM algorithm used to compute the MLE is feasible, but is computationally unsatisfactory. Instead, we use Markov Chain Monte Carlo (MCMC) methods with a data augmentation step to overcome such computational difficulties. Given a probability distribution on the circle, we assume that the original distribution was distributed on the real line, and then wrapped onto the circle. If we can "unwrap" the distribution off the circle and obtain a distribution on the real line, then the standard statistical techniques for data on the real line can be used. Our proposed methods are flexible and computationally efficient for fitting a wide class of wrapped distributions. Furthermore, we can easily compute the usual summary statistics. We present extensive simulation studies to validate the performance of our method. We apply our method to several real data sets and compare our results to parameter estimates available in the literature. We find that the Wrapped Double Exponential family produces robust parameter estimates with good frequentist coverage probability. We extend our method to the regression model. As an example, we analyze the association between ozone data and wind direction. A major contribution of this dissertation is to illustrate a technique to interpret the circular regression coefficients in terms of the linear regression model setup. Regression diagnostics can be developed after augmenting the circular data with wrapping numbers (see Section 3.5). We extend our method to fit time-correlated data. We can compute other statistics, such as circular autocorrelation functions and their standard errors, very easily. We use the Wrapped Normal model to analyze hourly wind directions, an example of circular time series data.
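As a concrete illustration of the wrapping idea (a minimal sketch, not code from the dissertation), the snippet below wraps draws from a normal distribution on the real line onto the circle and computes the usual circular summaries:

```python
import numpy as np

rng = np.random.default_rng(1)

# Draw from a linear (real-line) normal distribution, then wrap onto [0, 2*pi).
mu, sigma, n = 1.0, 0.8, 5000
x = rng.normal(mu, sigma, size=n)      # unobserved linear values
theta = np.mod(x, 2 * np.pi)           # observed wrapped-normal sample on the circle

# Standard circular summaries: mean direction and mean resultant length.
C, S = np.cos(theta).mean(), np.sin(theta).mean()
mean_direction = np.arctan2(S, C) % (2 * np.pi)   # close to mu (mod 2*pi)
resultant_length = np.hypot(C, S)                 # close to exp(-sigma**2 / 2)

print(mean_direction, resultant_length)
```

The data augmentation described in the abstract works in the opposite direction: each observed angle is augmented with an unobserved wrapping number k, so that theta + 2*pi*k can be treated as ordinary real-line data within the MCMC.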
215

Family-based methods which rely on association for the mapping of genes in human populations

Monks, Stephanie Ann 14 May 1999 (has links)
Family-based tests that employ association between alleles at a marker locus and a trait locus have proven useful for the localization of genes in human genomes. Many existing tests have been derived as extensions of the transmission/disequilibrium test (TDT), which was originally introduced as a test of linkage in the presence of association for a susceptibility locus. One of these tests, the sib-TDT or S-TDT, makes use of genetic information from sibships containing at least one diseased and one nondiseased child and is defined for a biallelic marker. We propose an extension of the S-TDT to a multiallelic marker and provide evidence that the chi-square distribution can be used to measure statistical significance. The test is compared to three contemporary extensions for a multiallelic marker. We also present a test for a multiallelic marker that combines data from families with and without parental genetic information.

Next, tests of linkage and association are developed for a quantitative trait that utilize families, without restrictions on the number of children per family that can be used in the analysis. Tests are introduced that can be used on family data with parent and child genotypes, only child genotypes, or a combination of these types of families. Equations are derived that allow one to determine the sample size needed to achieve desired power. Through simulation, we demonstrate that existing tests have an elevated false-positive rate when the size restrictions are not followed, and that a good deal of information is lost by adhering to the size restrictions. Permutation procedures are introduced that are recommended for small samples, but can also be used for extensions of the tests to multiallelic markers and to the simultaneous use of more than one marker.

Finally, resampling procedures for existing tests are explored. The resampling procedures reduce families to contain the number of children allowable for a valid test of linkage and association. We show that our tests are equivalent to the use of within-cluster resampling for the existing tests, but that differences exist when resampling is performed on the basis of phenotypically extreme individuals.
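For context, the biallelic TDT that these tests extend compares transmissions and non-transmissions of a marker allele from heterozygous parents to affected children (the standard form, shown here for orientation rather than as this thesis's statistic):

$$
T = \frac{(b - c)^2}{b + c} \;\overset{H_0}{\sim}\; \chi^2_1,
$$

where $b$ and $c$ count heterozygous parents who transmit and do not transmit the allele of interest. The S-TDT and the multiallelic extensions discussed above replace this parent-based comparison with within-sibship comparisons and multi-allele contingency structures.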
216

Statistical Analysis of Genetic Associations

Zaykin, Dmitri V. 30 September 1999 (has links)
Advisor: Bruce S. Weir. There is an increasing need for a statistical treatment of genetic data prompted by recent advances in molecular genetics and molecular technology. Study of associations between genes is one of the most important aspects in applications of population genetics theory and statistical methodology to genetic data. Developments of these methods are important for conservation biology, experimental population genetics, forensic science, and for mapping human disease genes. Over the next several years, genotypic data will be collected to attempt locating positions of multiple genes affecting disease phenotype. Adequate statistical methodology is required to analyze these data. Special attention should be paid to multiple testing issues resulting from searching through many genetic markers and the high risk of false associations. In this research we develop theory and methods needed to treat some of these problems. We introduce exact conditional tests for analyzing associations within and between genes in samples of multilocus genotypes, and efficient algorithms to perform them. These tests are formulated for the general case of multiple alleles at arbitrary numbers of loci and lead to multiple testing adjustments based on the closed testing principle, thus providing strong protection of the family-wise error rate. We discuss an application of the closed testing method to testing for Hardy-Weinberg equilibrium and computationally efficient shortcuts arising from methods for combining p-values that make it possible to deal with large numbers of loci. We also discuss efficient Bayesian tests for heterozygote excess and deficiency, as a special case of testing for Hardy-Weinberg equilibrium, and the frequentist properties of a p-value type of quantity resulting from them. We further develop new methods for validation of experiments and for combining and adjusting independent and correlated p-values and apply them to simulated as well as to actual gene expression data sets. These methods prove to be especially useful in situations with large numbers of statistical tests, such as in whole-genome screens for associations of genetic markers with disease phenotypes and in analyzing gene expression data obtained from DNA microarrays.
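As one simple example of the p-value combination machinery mentioned above (Fisher's method is shown purely as a generic illustration; the thesis develops its own combination and adjustment procedures for independent and correlated p-values):

```python
import numpy as np
from scipy import stats

def fisher_combine(pvalues):
    """Fisher's method: -2 * sum(log p_i) ~ chi-square with 2k degrees of freedom
    under the global null, assuming the k p-values are independent."""
    p = np.asarray(pvalues, dtype=float)
    statistic = -2.0 * np.log(p).sum()
    return stats.chi2.sf(statistic, df=2 * p.size)

# Combined evidence across, say, four independent marker tests (hypothetical values).
print(fisher_combine([0.03, 0.20, 0.45, 0.01]))
```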
217

General Zero-Inflated Models and Their Applications

Gan, Nianci 31 March 2000 (has links)
Count data with excess zeros are commonly seen in experiments for improving electronics manufacturing quality, in medical research on HIV patients with high-risk behaviors, and in agricultural studies of the number of insects per leaf. Yip (1988) and Lambert (1992) proposed the zero-inflated Poisson distribution, and Heilbron (1989) used zero-altered Poisson and negative binomial distributions to model this type of data. Li, Lu, Park, Kim, Brinkley and Peterson (1999) derived a multivariate version of the zero-inflated Poisson distribution and applied it to detect equipment problems in electronics manufacturing processes.

Zero-inflated distributions assume that with probability 1 - p the only possible observation is 0, and with probability p a random variable describing defect counts in the imperfect state is observed. For example, when manufacturing equipment is properly aligned (perfect state), there may be no defects. Otherwise, defects may occur according to a distribution of the imperfect state. The defect counts in the imperfect state could follow Poisson, negative binomial, or other distributions, but most current research uses the Poisson distribution. Although the maximum likelihood (ML) method is widely used for estimating parameters in zero-inflated distributions, there has been no theoretical study of the properties of the ML estimates. In Chapter 1, we propose a general framework for generalized zero-inflated models (ZIM), which assume only that the distribution of the imperfect state has the support of the nonnegative integers and satisfies appropriate regularity conditions. We study the properties of the ML estimates of ZIM parameters, including their existence, uniqueness, strong consistency and asymptotic normality under regularity conditions. By focusing on the univariate ZIM, we give detailed rigorous proofs of the lemmas and theorems stated in the thesis. Then, we study covariate effects in the univariate and multivariate zero-inflated regression models. Because the zero-inflated model involves both the Bernoulli parameter p and the imperfect-state parameter lambda, building the model separately does not use the information efficiently, and the resulting model is more complicated than needed. This problem gets worse in the multivariate ZIM, where the number of model terms increases drastically. Our procedure selects a limited number of important model terms to maximize the ZIM likelihood functions.

In Chapter 2, we review current research on zero-inflated Poisson models. Some new results on multivariate Poisson and multivariate zero-inflated Poisson distributions are given. By generalizing the results in Lambert (1992) and Li et al. (1999), we propose a multivariate zero-inflated Poisson regression model. An example from Nortel process development research is used to illustrate the model selection procedure for the zero-inflated regression models and computational details.
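In the notation used above (p for the imperfect state, 1 - p for the perfect state), the zero-inflated Poisson probability mass function being generalized is

$$
P(Y = 0) = (1 - p) + p\,e^{-\lambda},
\qquad
P(Y = k) = p\,\frac{\lambda^{k} e^{-\lambda}}{k!}, \quad k = 1, 2, \dots,
$$

and the generalized ZIM of Chapter 1 replaces the Poisson component with any distribution on the nonnegative integers satisfying the stated regularity conditions.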
218

Multi-Systems Operation and Control

Fenner, Joel S. 07 April 2000 (has links)
The typical process modeling and control approach of dealing with each manufacturing process stage in isolation is re-evaluated, and a new approach is developed for the case of a series of process stages and the case of several similar processes operating in parallel at different sites. The serial multi-stage modeling framework uses the possible correlation between stages as a tool to achieve tighter control of a nonstationary process line. The case of several similar processes operating in parallel at different sites is termed global manufacturing. Global manufacturing allows updated estimation of the sensitivity or slope parameters over the course of the process in some cases, since the site effect causes a spread in the range of the inputs. This spread in the inputs provides an opportunity for stable sensitivity estimates without perturbing the process, as is usually necessary. The Bayesian parallel-site estimation procedure is shown to have broad application to any scenario where the data are collected at various related but not identical sites. Specifically, uniformity modeling is explored using the Bayesian estimation procedure. The multi-systems operation and control methodologies developed for the serial and global manufacturing cases provide valuable tools for the improvement of manufacturing processes in many industries.
219

A Proposed Framework for Establishing Optimal Genetic Designs for Estimating Narrow-sense Heritability

Silva, Carlos H. 14 April 2000 (has links)
We develop a framework for establishing sample sizes in breeding plans, so that one is able to estimate narrow-sense heritability with the smallest possible variance for a given amount of effort. We call this an optimal genetic design. The framework allows one to compare the variances of estimators of narrow-sense heritability, when estimated from each of the alternative plans under consideration, and does not require data simulation, but does require computer programming. We apply it to the study of a peanut (Arachis hypogaea) breeding example, in order to determine the ideal number of plants to be selected at each generation. We also propose a methodology that allows one to estimate the additive genetic variance for the estimation of the narrow-sense heritability using MINQUE and REML, without an analysis of variance model. It requires that one can build the matrix of genetic variances and covariances among the subjects on which observations are taken. This methodology can be easily adapted to ANOVA-based methods, and we exemplify this by using Henderson's Method III. We compare Henderson's Method III, MINQUE, and REML under the proposed methodology, with an emphasis on comparing these estimation methods with non-normally distributed data and unbalanced designs. A location-scale transformation of the beta density is proposed for simulation of non-normal data.
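The target quantity, narrow-sense heritability, is the ratio of additive genetic variance to total phenotypic variance (the standard definition, with a simple variance decomposition shown for orientation):

$$
h^2 = \frac{\sigma_A^2}{\sigma_P^2},
\qquad
\sigma_P^2 = \sigma_A^2 + \sigma_D^2 + \sigma_E^2,
$$

so the optimal-design question amounts to choosing how many plants to carry at each generation so that the variance of the resulting estimator of $h^2$ is minimized for a fixed total effort.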
220

Information-based Group Sequential Tests With Lagged or Censored Data

Kung, Meifen 23 June 2000 (has links)
The values of nuisance parameters assumed in a statistical design are often erroneous, and thus traditional sample size calculations may result in overpowering or underpowering a test. In this thesis, we propose to use Fisher information data monitoring in group sequential studies, not only to allow early stopping in a clinical trial but also to maintain the desired power of the test for all values of the nuisance parameters. Simulation studies for the simple case of comparing two response rates are used to demonstrate that a test of a single parameter of interest with a specified alternative achieves the desired power under information-based monitoring regardless of the values of the nuisance parameters, provided that this parameter of interest can be estimated efficiently. The emphasis in this part is to show how information-based monitoring can be implemented in practice and to demonstrate the accuracy of the corresponding operating characteristics in some simulation studies. When there is lag time in reporting, standard statistical techniques often lead to biased inferences on interim data. A maximum lag estimator ensures complete information by using only data from before a lag-time period. The estimator is unbiased but less powerful. We propose an inverse probability weighted estimator which accounts for censoring and is consistent and asymptotically normal for estimating the mean of dichotomous variables. The joint distribution of the test statistics at different times has the covariance structure of a sequential process with independent increments. This allows the use of information-based monitoring. A simulation study shows that our estimator preserves the type I and type II errors and reduces the number of participants required in a trial. A future approach for finding an efficient estimator is also suggested in Chapter 3.
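The core of information-based monitoring (a standard formulation, sketched here for orientation rather than as this thesis's exact design) is that, for a test of $H_0\colon \theta = 0$ against the specified alternative $\theta = \theta_A$ with two-sided level $\alpha$ and power $1 - \beta$, the trial continues until the observed Fisher information reaches a prespecified target,

$$
\mathcal{I}_{\max} = \left(\frac{z_{\alpha/2} + z_{\beta}}{\theta_A}\right)^{2},
\qquad
\widehat{\mathcal{I}}_t = \frac{1}{\widehat{\operatorname{Var}}\bigl(\hat{\theta}_t\bigr)},
$$

with the target inflated slightly when group sequential stopping boundaries are used. Because the stopping rule is driven by accrued information rather than sample size, the power requirement is met whatever the nuisance parameters turn out to be, provided $\theta$ is estimated efficiently.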
