381

A Comparison of Five Statistical Methods for Predicting Stream Temperature Across Stream Networks

Holthuijzen, Maike F. 01 August 2017 (has links)
The health of freshwater aquatic systems, particularly stream networks, is mainly influenced by water temperature, which controls biological processes and influences species distributions and aquatic biodiversity. Thermal regimes of rivers are likely to change in the future due to climate change and other anthropogenic impacts, and our ability to predict stream temperatures will be critical to understanding distribution shifts of aquatic biota. Spatial statistical network (SSN) models take spatial relationships into account but have drawbacks, including high computation times and data pre-processing requirements. Machine learning techniques and generalized additive models (GAM) are promising alternatives to the SSN model. Two machine learning methods, gradient boosting machines (GBM) and random forests (RF), are computationally efficient and can automatically model complex data structures. However, a study comparing the predictive accuracy of these widely used statistical modeling techniques has not yet been conducted. My objectives for this study were to 1) compare the accuracy of linear models (LM), SSN, GAM, RF, and GBM in predicting stream temperature over two stream networks and 2) provide guidelines for practitioners and ecologists choosing a prediction method. Stream temperature prediction accuracies were compared with the test-set root mean square error (RMSE) for all methods. On the actual data, SSN had the highest predictive accuracy overall, followed closely by GBM and GAM; LM had the poorest performance overall. This study shows that although SSN appears to be the most accurate method for stream temperature prediction, machine learning methods and GAM may be suitable alternatives.
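A minimal sketch of the comparison framework, assuming scikit-learn and hypothetical covariate and file names (elevation, air temperature, canopy, discharge): each model is fit on a training split and scored by test-set RMSE, as in the study. The SSN and GAM fits (typically done in R with the SSN and mgcv packages) are omitted here.

```python
# Sketch: compare stream-temperature models by test-set RMSE (assumes scikit-learn).
# The CSV file and column names are hypothetical placeholders.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

df = pd.read_csv("stream_temps.csv")  # hypothetical data set
X = df[["elevation", "air_temp", "canopy", "discharge"]]  # hypothetical covariates
y = df["stream_temp"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=42)

models = {
    "LM": LinearRegression(),
    "RF": RandomForestRegressor(n_estimators=500, random_state=42),
    "GBM": GradientBoostingRegressor(n_estimators=500, learning_rate=0.05,
                                     random_state=42),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    rmse = mean_squared_error(y_te, model.predict(X_te)) ** 0.5
    print(f"{name}: test RMSE = {rmse:.3f}")
```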
382

Linear Operators Strongly Preserving Polynomial Equations Over Antinegative Semirings

Lee, Sang-Gu 01 May 1991 (has links)
We characterized the group of linear operators that strongly preserve r-potent matrices over the binary Boolean semiring, nonbinary Boolean semirings, and zero-divisor-free antinegative semirings. We extended these results to show that the linear operators that strongly preserve r-potent matrices are precisely the linear operators that strongly preserve the matrix polynomial equation p(X) = X, where p(X) = X^r1 + X^r2 + ... + X^rt and r1 > r2 > ... > rt ≥ 2. In addition, we characterized the group of linear operators that strongly preserve r-cyclic matrices over the same semirings. We also extended these results to linear operators that strongly preserve the matrix polynomial equation p(X) = I, where p(X) is as above. Chapters I and II of this thesis contain background material and summaries of the work done by other researchers on the linear preserver problem. The characterizations of linear operators in Chapters III, IV, V, and VI are new.
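For reference, the strong-preservation condition can be written compactly. The following fragment is an editorial restatement of the standard definition from the linear preserver literature, not an excerpt from the thesis:

```latex
% A linear operator T on matrices over a semiring strongly preserves the
% equation p(X) = X when membership in its solution set is preserved in
% both directions:
\[
  p(X) = X^{r_1} + X^{r_2} + \cdots + X^{r_t},
  \qquad r_1 > r_2 > \cdots > r_t \ge 2,
\]
\[
  p(T(X)) = T(X) \iff p(X) = X \quad \text{for all matrices } X.
\]
```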
383

Family-Wise Error Rate Control in Quantitative Trait Loci (QTL) Mapping and Gene Ontology Graphs with Remarks on Family Selection

Saunders, Garret 01 May 2014 (has links)
The main aim of this dissertation is to meet real needs of practitioners in multiple hypothesis testing. The issue of multiplicity has become a significant concern in most fields of research as computational abilities have increased, allowing for the simultaneous testing of many (thousands or millions) statistical hypothesis tests. While many error rates have been defined to address this issue of multiplicity, this work considers only the most natural generalization of the Type I Error rate to multiple tests, the family-wise error rate (FWER). Much work has already been done to establish powerful yet general methods which control the FWER under arbitrary dependencies among tests. This work both introduces these methods and expands upon them as is detailed through its four main chapters. Chapter 1 contains general introductions and preliminaries important to the remainder of the work, particularly a previously published graphical weighted Bonferroni multiplicity adjustment. Chapter 2 then applies the principles introduced in Chapter 1 to achieve a substantial computational improvement to an existing FWER-controlling multiplicity approach (the Focus Level method) for gene set testing in high-throughput microarray and next-generation sequencing studies using Gene Ontology graphs. This improvement to the Focus Level procedure, which we call the Short Focus Level procedure, is achieved by extending the reach of graphical weighted Bonferroni testing to closed testing situations where restricted hypotheses are present. This is accomplished through Theorem 1 of Chapter 2. As a result of the improvement, the full top-down approach to the Focus Level procedure can now be performed, overcoming a significant disadvantage of the otherwise powerful approach to multiple testing. Chapter 3 presents a solution to a multiple testing difficulty within quantitative trait loci (QTL) mapping in natural populations for QTL LD (linkage disequilibrium) mapping models. Such models apply a two-hypothesis framework to the testing of thousands of genetic markers across the genome in search of QTL underlying a quantitative trait of interest. Inherent to the model is an unidentifiability issue where a parameter of interest is identifiable only under the alternative hypothesis. Through a second application of graphical weighted Bonferroni methods we show how the multiplicity can be accounted for while simultaneously accounting for the required logical structuring of the testing such that identifiability is preserved. Finally, Chapter 4 details some of the difficulties associated with the distributional assumptions for the test statistics of the two hypotheses of the LD-based QTL mapping framework. A novel bivariate testing strategy is proposed for these test statistics in order to overcome these distributional difficulties while preserving power in the multiplicity correction by reducing the number of tests performed. Chapter 5 concludes the work with a summary of the main contributions and future research goals aimed at continual improvement to the multiple testing issues inherent to both the fields of genetics and genomics.
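The graphical weighted Bonferroni adjustment that underlies Chapters 2 and 3 can be sketched in a few lines. The weights, transition matrix, and p-values below are illustrative placeholders, and the update rules follow the published graphical approach of Bretz et al. (2009) rather than any code from the dissertation:

```python
# Sketch of a graphical weighted Bonferroni procedure (Bretz et al., 2009).
# Weights, transition matrix, and p-values are illustrative placeholders.
import numpy as np

def graphical_bonferroni(p, w, G, alpha=0.05):
    """Reject H_i whenever p[i] <= w[i] * alpha; on each rejection, pass
    H_i's weight along G and rewire the graph among the remaining nodes."""
    p, w, G = np.asarray(p, float), np.asarray(w, float), np.asarray(G, float)
    active, rejected = set(range(len(p))), []
    while True:
        cands = [i for i in active if w[i] > 0 and p[i] <= w[i] * alpha]
        if not cands:
            return sorted(rejected)
        i = cands[0]                 # final rejection set is order-invariant
        rejected.append(i)
        active.discard(i)
        new_w, new_G = w.copy(), np.zeros_like(G)
        for j in active:
            new_w[j] = w[j] + w[i] * G[i, j]        # inherit H_i's weight
            for k in active - {j}:
                denom = 1.0 - G[j, i] * G[i, j]
                if denom > 0:
                    new_G[j, k] = (G[j, k] + G[j, i] * G[i, k]) / denom
        w, G = new_w, new_G

# Three hypotheses, equal starting weights, weight passed on cyclically.
print(graphical_bonferroni(p=[0.010, 0.040, 0.300],
                           w=[1/3, 1/3, 1/3],
                           G=[[0, 1, 0], [0, 0, 1], [1, 0, 0]]))
```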
384

Model for Bathtub-Shaped Hazard Rate: Monte Carlo Study

Leithead, Glen S. 01 May 1970 (has links)
A new model developed for the entire bathtub-shaped hazard rate curve has been evaluated as to its usefulness as a method of reliability estimation. The model is of the form F(t) = 1 - exp[-(θ1 t^L + θ2 t + θ3 t^M)], where "L" and "M" were assumed known. The estimate of reliability obtained from the new model was compared with the traditional restricted sample estimate for four different time intervals and was found to have less bias and variance at all time points. This was a Monte Carlo study, and the data generated showed that the new model has much potential as a method for estimating reliability. (51 pages)
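A brief sketch of the model and its derived quantities, with purely illustrative parameter values (L < 1 drives the early "infant mortality" portion of the bathtub, M > 1 the wear-out portion):

```python
# Sketch of the bathtub model F(t) = 1 - exp[-(th1*t^L + th2*t + th3*t^M)].
# Parameter values are illustrative only.
import numpy as np

def reliability(t, th1, th2, th3, L, M):
    """R(t) = 1 - F(t) = exp of minus the cumulative hazard."""
    return np.exp(-(th1 * t**L + th2 * t + th3 * t**M))

def hazard(t, th1, th2, th3, L, M):
    """h(t) = H'(t) for cumulative hazard H(t) = th1*t**L + th2*t + th3*t**M;
    the sum of the three terms traces the bathtub shape."""
    return th1 * L * t**(L - 1) + th2 + th3 * M * t**(M - 1)

t = np.linspace(0.01, 10, 5)
print(reliability(t, 0.5, 0.05, 0.002, L=0.5, M=3.0))
print(hazard(t, 0.5, 0.05, 0.002, L=0.5, M=3.0))
```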
385

A Report on the Statistical Properties of the Coefficient of Variation and Some Applications

Irvin, Howard P. 01 May 1970 (has links)
Examples from four disciplines were used to introduce the coefficient of variation, which has considerable usage and application in solving quality control and reliability problems. Its statistical properties, namely the mean and the variance of the coefficient of variation, were gathered from the statistical literature and are presented. The cumulative probability function was determined by two approximate methods and by using the noncentral t distribution. A graphical method to determine approximate confidence intervals and a method to determine whether the coefficients of variation from two samples differ significantly from each other are also provided, with examples. Applications of the coefficient of variation to some of the main problems encountered in industry are included in this report: (a) using the coefficient of variation to measure relative efficiency, (b) acceptance sampling, (c) the stress-versus-strength reliability problem, and (d) estimating the shape parameter of the two-parameter Weibull distribution. (84 pages)
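One way to make the noncentral-t construction concrete is to invert it numerically for a confidence interval, as sketched below; this assumes SciPy and simulated normal data, and is an editorial illustration rather than the report's own graphical method:

```python
# Sketch: approximate confidence interval for the coefficient of variation
# (CV) by numerically inverting the noncentral t distribution.
import numpy as np
from scipy.optimize import brentq
from scipy.stats import nct

rng = np.random.default_rng(7)
x = rng.normal(loc=50, scale=5, size=30)        # simulated measurements
n = len(x)
cv_hat = x.std(ddof=1) / x.mean()               # sample CV = s / x-bar
t_obs = np.sqrt(n) / cv_hat                     # sqrt(n) * x-bar / s

def cv_bound(prob):
    # kappa is a hypothesized true CV; T = sqrt(n)*x-bar/s then follows a
    # noncentral t with df = n - 1 and noncentrality sqrt(n)/kappa.
    f = lambda kappa: nct.cdf(t_obs, df=n - 1, nc=np.sqrt(n) / kappa) - prob
    return brentq(f, 1e-4, 10.0)

lo, hi = cv_bound(0.025), cv_bound(0.975)       # equal-tail 95% interval
print(f"CV = {cv_hat:.4f}, approximate 95% CI = ({lo:.4f}, {hi:.4f})")
```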
386

Resource Requirements Determination (Based on Statistical Methods)

Howard, Robert L. 01 May 1971 (has links)
Two methods of determining resource requirements at an Air Force maintenance depot were developed. The first method is designed for new workloads and is based on infinite queuing theory formulas; tables have been developed for this method. The second method is designed for workloads with, at minimum, several months of historical data. An optimum-fit test was designed to aid in fitting and smoothing the empirical data to the normal distribution. These data are then input to a simulation model for determination of resource requirements. (86 pages)
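Classical infinite-queue formulas of the kind the first method tabulates include the M/M/c (Erlang C) results; a sketch of how such a table entry can be computed, with illustrative arrival and service rates, follows:

```python
# Sketch: sizing servers for a new workload with M/M/c (Erlang C) formulas.
# Rates and the waiting-probability target are illustrative.
from math import factorial

def erlang_c(c, a):
    """Probability an arrival must wait in an M/M/c queue with offered
    load a = lambda/mu (requires a < c for a stable queue)."""
    num = a**c / factorial(c) * (c / (c - a))
    den = sum(a**k / factorial(k) for k in range(c)) + num
    return num / den

def servers_needed(lam, mu, max_wait_prob=0.2):
    """Smallest number of servers keeping P(wait) below a target."""
    a, c = lam / mu, 1
    while c <= a or erlang_c(c, a) > max_wait_prob:
        c += 1
    return c

# e.g., 30 jobs arriving per day, each server completing 4 jobs per day
print(servers_needed(lam=30, mu=4))
```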
387

Surviving a Civil War: Expanding the Scope of Survival Analysis in Political Science

Whetten, Andrew B. 01 December 2018 (has links)
Survival analysis in the context of political science is frequently used to study the duration of agreements, political party influence, wars, senator term lengths, etc. This paper surveys a collection of methods implemented on a modified version of the Power-Sharing Event Dataset (which documents civil war peace agreement durations in the post-Cold War era) in order to identify the research questions that are optimally addressed by each method. A primary comparison will be made between a Cox proportional hazards model using some advanced capabilities of the glmnet package, a survival random forest model, and a survival SVM. En route to this comparison, issues including Cox model variable selection using the LASSO, identification of clusters using hierarchical clustering, and discretizing the response for classification analysis will be discussed. The results of the analysis will be used to justify the need for, and the accessibility of, the survival random forest algorithm as an additional tool for survival analysis.
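A sketch of the core comparison, assuming the Python scikit-survival library as a stand-in for the paper's R tooling (its CoxnetSurvivalAnalysis plays the role of a glmnet-style LASSO Cox model); the file and column names are placeholders:

```python
# Sketch: LASSO Cox vs. survival random forest on a peace-agreement-style
# data set (assumes scikit-survival; data and column names are placeholders).
import pandas as pd
from sksurv.ensemble import RandomSurvivalForest
from sksurv.linear_model import CoxnetSurvivalAnalysis
from sksurv.util import Surv

df = pd.read_csv("power_sharing.csv")              # hypothetical file
y = Surv.from_arrays(event=df["failed"].astype(bool),
                     time=df["duration_months"])
X = df.drop(columns=["failed", "duration_months"])

cox_lasso = CoxnetSurvivalAnalysis(l1_ratio=1.0)   # pure L1 (LASSO) penalty
rsf = RandomSurvivalForest(n_estimators=500, random_state=0)
for name, model in [("Cox LASSO", cox_lasso), ("Survival RF", rsf)]:
    model.fit(X, y)
    # score() reports the concordance index on the supplied data
    print(name, "concordance =", round(model.score(X, y), 3))
```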
388

To Dot Product Graphs and Beyond

Bailey, Sean 01 May 2016 (has links)
We will introduce three new classes of graphs: bipartite dot product graphs, probe dot product graphs, and combinatorial orthogonal graphs. All of these representations were inspired by a vector representation known as a dot product representation. Given a bipartite graph G = (X, Y, E), a bipartite dot product representation of G is a function f : X ∪ Y → R^k and a positive threshold t such that for any x ∈ X and y ∈ Y, xy ∈ E if and only if f(x) · f(y) ≥ t. The minimum k such that a bipartite dot product representation exists for G is the bipartite dot product dimension of G, denoted bdp(G). We will show that such representations exist for all bipartite graphs and give an upper bound on the bipartite dot product dimension of any bipartite graph. We will also characterize the bipartite graphs of bipartite dot product dimension 1 by their forbidden subgraphs. An undirected graph G = (V, E) is a probe C graph if its vertex set can be partitioned into two sets, N (nonprobes) and P (probes), where N is independent and there exists E′ ⊆ N × N such that G′ = (V, E ∪ E′) is a C graph. In this dissertation we introduce probe k-dot product graphs and characterize (at least partially) probe 1-dot product graphs in terms of forbidden subgraphs and certain 2-SAT formulas. These characterizations are given for two very different circumstances: when the partition into probes and nonprobes is given, and when it is not. Vectors x = (x1, x2, ..., xn)^T and y = (y1, y2, ..., yn)^T are combinatorially orthogonal if |{i : xi yi ≠ 0}| ≠ 1. An undirected graph G = (V, E) is a combinatorial orthogonal graph if there exists f : V → R^n for some n ∈ N such that for any u, v ∈ V, uv ∉ E if and only if f(u) and f(v) are combinatorially orthogonal. These representations can also be limited to a mapping g : V → {0, 1}^n such that for any u, v ∈ V, uv ∉ E if and only if g(u) · g(v) ≠ 1. We will show that every graph has a combinatorial orthogonal representation. We will also state the minimum dimension necessary to generate such a representation for specific classes of graphs.
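The two vector conditions are easy to check on concrete vectors; a short sketch with illustrative values, assuming NumPy:

```python
# Sketch: checking the two vector conditions from the abstract on
# concrete (illustrative) vectors.
import numpy as np

def bipartite_dot_edge(fx, fy, t):
    """Bipartite dot product rule: xy is an edge iff f(x) . f(y) >= t."""
    return float(np.dot(fx, fy)) >= t

def combinatorially_orthogonal(x, y):
    """x and y are combinatorially orthogonal iff the number of
    coordinates where x_i * y_i is nonzero differs from 1."""
    return np.count_nonzero(np.asarray(x) * np.asarray(y)) != 1

print(bipartite_dot_edge([1.0, 2.0], [0.5, 1.0], t=2.0))   # True: 2.5 >= 2
print(combinatorially_orthogonal([1, 0, 2], [0, 3, 0]))    # True: overlap 0
print(combinatorially_orthogonal([1, 0, 2], [0, 3, 1]))    # False: overlap 1
```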
389

Implementation and Application of the Curds and Whey Algorithm to Regression Problems

Kidd, John 01 May 2014 (has links)
A common multivariate statistical problem is the prediction of two or more response variables using two or more predictor variables. The simplest model for this situation is the multivariate linear regression model, and the standard least squares estimation for this model involves regressing each response variable separately on all the predictor variables. Breiman and Friedman found a way to take advantage of correlations among the response variables to increase the predictive accuracy for each response variable, with an algorithm they called Curds and Whey. In this report, I describe an implementation of the Curds and Whey algorithm in the R language and environment for statistical computing, apply the algorithm to some simulated and real data sets, and discuss the R package I developed for Curds and Whey.
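A sketch of the core Curds and Whey computation as published by Breiman and Friedman (1997), written here in Python with NumPy rather than the report's R implementation; it uses the simple shrinkage d_i = c_i^2 / (c_i^2 + r(1 - c_i^2)) with r = p/n, not the GCV variant:

```python
# Sketch of Curds and Whey: rotate OLS predictions into canonical
# coordinates, shrink each coordinate, rotate back. Assumes q <= p so all
# q canonical correlations exist.
import numpy as np

def curds_and_whey(X, Y):
    n, p = X.shape
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    B_ols, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    Y_hat = Xc @ B_ols                         # per-response OLS predictions
    # Canonical correlation analysis between Y and X via QR + SVD.
    Qx, _ = np.linalg.qr(Xc)
    Qy, Ry = np.linalg.qr(Yc)
    _, c, Vt = np.linalg.svd(Qx.T @ Qy)        # c: canonical correlations
    T = np.linalg.solve(Ry, Vt.T)              # Y-side canonical vectors
    r = p / n
    d = c**2 / (c**2 + r * (1 - c**2))         # shrinkage factors
    # Shrink in canonical coordinates, then map back to the response scale.
    return Y.mean(0) + Y_hat @ T @ np.diag(d) @ np.linalg.inv(T)

# Tiny demonstration on simulated correlated responses.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
Y = X[:, :2] @ rng.normal(size=(2, 3)) + 0.5 * rng.normal(size=(100, 3))
print(np.round(curds_and_whey(X, Y)[:2], 3))
```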
390

Statistical Modeling, Exploration, and Visualization of Snow Water Equivalent Data

Odei, James Beguah 01 May 2014 (has links)
Due to a continual increase in the demand for water as well as an ongoing regional drought, there is an imminent need to monitor and forecast water resources in the western United States. In particular, water resources in the Intermountain West rely heavily on snow water storage, so improving seasonal forecasts of snowpack and considering new techniques would allow water resources to be managed more effectively throughout the entire water-year. Many available models used in forecasting snow water equivalent (SWE) measurements require delicate calibrations. In contrast to the physical SWE models most commonly used for forecasting, we present a data-based statistical model that characterizes seasonal snow water equivalent in terms of a nested time series, with the large scale focusing on the inter-annual periodicity of dominant signals and the small scale accommodating seasonal noise and autocorrelation. This model provides a framework for independently estimating the temporal dynamics of SWE for the various snow telemetry (SNOTEL) sites. We use SNOTEL data from ten stations in Utah over 34 water-years to implement and validate this model. This dissertation has three main goals: (i) developing a new statistical model to forecast SWE; (ii) bridging existing R packages into a new R package to visualize and explore spatial and spatio-temporal SWE data; and (iii) applying the newly developed R package to SWE data from Utah SNOTEL sites and the Upper Sheep Creek site in Idaho as case studies.
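The nested-time-series idea can be illustrated with a small sketch: an annual harmonic regression for the large-scale signal plus an AR(1) coefficient for the small-scale autocorrelated residuals. The data below are simulated stand-ins for a daily SNOTEL SWE series, not the Utah data:

```python
# Sketch of the nested decomposition: annual harmonics (large scale) plus
# an AR(1) residual term (small scale). Simulated stand-in data.
import numpy as np

rng = np.random.default_rng(3)
days = np.arange(3 * 365)                       # three water-years, daily
season = 20 * np.maximum(np.sin(2 * np.pi * (days - 60) / 365), 0)
swe = season + rng.normal(0, 2, days.size).cumsum() * 0.05 + 5

# Large scale: least squares fit of the annual frequency and its overtone.
t = 2 * np.pi * days / 365
H = np.column_stack([np.ones_like(t), np.sin(t), np.cos(t),
                     np.sin(2 * t), np.cos(2 * t)])
beta, *_ = np.linalg.lstsq(H, swe, rcond=None)
resid = swe - H @ beta

# Small scale: AR(1) coefficient of the residuals (lag-1 regression).
phi = np.dot(resid[1:], resid[:-1]) / np.dot(resid[:-1], resid[:-1])
print("harmonic coefficients:", np.round(beta, 2))
print("AR(1) coefficient:", round(phi, 3))
```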
