Global ETD Search

1	Testing an assumed distribution Wong, Tze-yue 04 1900 Testing for an assumed distribution has been a major area of statistical research, both in theory and in practice. A reason for this interest is that many statistical procedures are based on certain distributional assumptions. Two new statistics, BN and BL, are suggested in this thesis for testing for normal and logistic distributions. The formulation of these statistics is based on the best linear unbiased estimator of the population scale parameter δ, using order statistics. The distributions of BN and BL tend to normal very rapidly, effectively for sample size n ≥ 20. In general, BN and BL have good power properties. They are particularly sensitive in testing against skew distributions or symmetric distributions with large kurtosis. The power of BN is comparable with other available test-statistics. / Master of Science (MS) Applied Statistics Applied Statistics
2	Testing For Outliers Chawla, Satwant 05 1900 Several known test statistics can be used to test for outliers. Two new statistics, T and tc, are proposed. T and tc are based on censored and complete samples and are similar to Tiku's T and tc statistics for testing for normality. The distribution of T is closely approximated by the Beta distribution, and the distribution of tc is closely approximated by Student's t distribution. T and tc are both origin and scale invariant, and both are easy to calculate. The statistic T is more powerful than Tietjen and Moore's statistics Lr and Er. The statistic tc is, on the whole, as powerful as Er. / Master of Science (MS) Applied Statistics Applied Statistics
3	A Statistical Analysis of Key Factors Influencing the Location of Biomass-using Facilities Liu, Xu 01 December 2009 Bioenergy and biofuels are emerging industries in the U.S. economy that will require statistical and economic analyses of woody biomass resources, supply chains, and other key factors that influence the siting of industrial facilities. This thesis develops models using logistic regression to improve the understanding of the key factors that influence the locations of existing wood-using bioenergy and biofuels plants, and other wood-using plants. The scope of the study is 13 Southeastern states (Alabama, Arkansas, Florida, Georgia, Kentucky, Louisiana, Mississippi, North Carolina, Oklahoma, South Carolina, Tennessee, Texas, and Virginia). Logistic regression models are developed at the state and regional levels. The resolution of the study is the ZIP Code tabulation area (ZCTA). There are 9,416 ZCTAs in the 13-state study region. Because a small number of woody biomass-using bioenergy and biofuels plants exist relative to the large number of traditional woody biomass-using facilities (e.g., wood composites, sawmills, and secondary mills), two sample groups are developed. The first group combines all wood-using mills with wood-using bioenergy and biofuels plants, and compares ZCTAs with these types of mills with ZCTAs that do not contain any such facilities. This follows a more modern planning view of total woody biomass management. The second group combines only one type of mill, pulp and paper mills, with wood-using bioenergy and biofuels plants, and compares ZCTAs of these mill types with ZCTAs that do not contain such facilities. For both groups in the entire study region, logging residues harvesting costs (negative influence) and the availability of thinnings within an 80-mile haul distance (positive influence) are statistically significant factors (p-value < 0.0001) in the logistic models. Population is statistically significant and has a negative influence on site location for six of the thirteen states in the region (p-values ranged from < 0.0001 to 0.0197) for the first group. Twenty-five optimal locations in the Southeastern states (ZCTAs) are predicted from the logistic regression models. A de-clustering algorithm is developed as part of this study to avoid locating potential bioenergy and biofuels sites in close proximity to competing mills within the same ZCTA. Applied Statistics
4	Likelihood Inference of Some Cure Rate Models and Applications Liu, Xiaofeng 04 1900 In this thesis, we perform a survival analysis for right-censored data of populations with a cure rate. We consider two cure rate models based on the Geometric distribution and the Poisson distribution, which are special cases of the Conway-Maxwell distribution. The models are based on the assumption that the number of competing causes of the event of interest follows a Conway-Maxwell distribution. For various sample sizes, we implement a simulation process to generate samples with a cure rate. Under this setup, we obtain the maximum likelihood estimator (MLE) of the model parameters by using the gamlss R package. Using the asymptotic distribution of the MLE as well as the parametric bootstrap method, we discuss the construction of confidence intervals for the model parameters and assess their performance through Monte Carlo simulations. / Master of Science (MSc) Applied Statistics Statistical Models Applied Statistics
5	The application of a new method of parameter estimation to the three-parameter lognormal and the three-parameter weibull distributions Amin, N. A. K. January 1981 A new method of parameter estimation similar to maximum likelihood (ML) estimation is proposed to overcome the problem associated with the unbounded likelihood in ML estimation when applied to distributions such as the 3-parameter lognormal and the 3-parameter Weibull models. In these distributions, ML estimation often breaks down with unresolved theoretical and practical difficulties, but the new method yields efficient estimators for all values in the parameter space. Unlike the ad hoc modifications sometimes applied to obtain usable estimators in the ML method, this method is based on a general principle applicable to any continuous univariate distribution. It gives consistent estimators under more general conditions than ML. In regular situations, the new estimators are asymptotically normal and efficient. Moreover, in non-standard situations, the new method yields efficient estimators even when ML estimation fails. An extensive Monte Carlo study is carried out to compare the performance of the new method and the ML method in the above 3-parameter distributions for a series of sample sizes and parameter values. Iterative procedures are suggested for computing the estimates, which include a goodness of fit test of the model before estimation. The Monte Carlo results indicate that variances of the new estimators are similar to those of the ML estimators. An exception is in very skewed samples where, while the ML estimator is bound to fail, this method gives more efficient estimators. 519.5 Applied statistics
6	Models for queueing systems with 'feedback' Worthington, D. J. January 1983 No description available. 519.5 Applied statistics
7	Machine Learning on Statistical Manifold Zhang, Bo 01 January 2017 This senior thesis project explores and generalizes some fundamental machine learning algorithms from the Euclidean space to the statistical manifold, an abstract space in which each point is a probability distribution. In this thesis, we adapt the optimal separating hyperplane, the k-means clustering method, and the hierarchical clustering method for classifying and clustering probability distributions. In these modifications, we use statistical distances as a measure of the dissimilarity between objects. We describe a situation where the clustering of probability distributions is needed and useful. We present promising empirical clustering results, which demonstrate that the statistical-distance-based clustering algorithms often outperform the same algorithms with the Euclidean distance in complex scenarios. In particular, we apply our statistical-distance-based hierarchical and k-means clustering algorithms to the univariate normal distributions with k = 2 and k = 3 clusters, the bivariate normal distributions with diagonal covariance matrix and k = 3 clusters, and the discrete Poisson distributions with k = 3 clusters. Finally, we prove that the k-means clustering algorithm applied to discrete distributions with the Hellinger distance converges not only to the partial optimal solution but also to the local minimum. Applied Statistics Statistical Methodology
8	Parkinson Disease Loci in the Mid-Western Amish Davis, Mary Feller 15 April 2013 Previous evidence has shown that Parkinson disease (PD) has a heritable component, but only a small proportion of the total genetic contribution to PD has been identified. Genetic heterogeneity complicates the verification of proposed PD genes and the identification of new PD susceptibility genes. Our approach to overcome the problem of heterogeneity is to study a population isolate, the mid-western Amish communities of Indiana and Ohio. We performed genome-wide association and linkage analyses on 798 individuals (31 with PD), who are part of a 4,998-member pedigree. Through these analyses, we identified a region on chromosome 5q31.3 that shows evidence of association (p-value < 1 × 10^-4) and linkage (multipoint HLOD = 3.77). We also found further evidence of linkage on chromosomes 6 and 10 (multipoint HLOD 4.02 and 4.35, respectively). These data suggest that locus heterogeneity, even within the Amish, may be more extensive than previously appreciated.
9	Identification of Additional Independent Loci in the Major Histocompatibility Complex in Multiple Sclerosis Susceptibility Zuvich, Rebecca Lynn 04 August 2010 Multiple sclerosis (MS) is characterized as an autoimmune neurodegenerative disease. The disease manifests as demyelination or degradation of the myelin sheath in the central nervous system. The Major Histocompatibility Complex (MHC) was associated with MS in the mid-1970s; the association was later refined to the HLA-DRB1*1501-DQB*0602 haplotype. The MHC region is riddled with complicating factors including high gene content, extreme levels of polymorphism, and a dense pattern of linkage disequilibrium (LD). These characteristics make it difficult to determine whether a single allele or an entire haplotype contributes to disease association. Despite these challenges it is clear that the MHC region harbors MS susceptibility loci in addition to the HLA-DRB1*1501 region. Using the strong LD in this region we can test a model that predicts residual odds ratios (ORs) for a marker in LD with a disease allele such as HLA-DRB1*1501. Comparing the correlation between the observed OR and the calculated OR for multiple SNPs in the MHC region, we hypothesize that those SNPs that appear as outliers are suggestive of additional effects independent of the HLA-DRB1*1501 region. We examined ~2,300 SNPs in 1,479 cases and 1,482 controls in the 28 Mb to 36 Mb region on chromosome 6 containing the MHC. The ORs for the SNPs were grouped based on the amount of LD with the HLA-DRB1 surrogate SNP (rs3135388). We identified nine outlying SNPs, which had observed ORs much larger than the calculated OR. These nine SNPs are in six different genes that suggest susceptibility to MS independent of HLA-DRB1*1501.
10	POWER AND TYPE 1 ERROR FOR LARGE PEDIGREE ANALYSES OF BINARY TRAITS Cummings, Anna Christine 07 December 2012 Studying population isolates with large, complex pedigrees has many advantages for discovering genetic susceptibility loci; however, statistical analyses can be computationally challenging. Allelic association tests need to be corrected for relatedness among study participants, and linkage analyses require subdividing and simplifying the pedigree structures. In this thesis work I simulated SNP (single nucleotide polymorphism) data in complex pedigree structures based on an Amish pedigree. I evaluated type 1 error rates and power when performing two-point and multipoint linkage after dividing the pedigree into subpedigrees. I also ran MQLS (modified likelihood score test) to test for allelic association in the subpedigrees and in the unified pedigree.
