111

Data Analysis Using Experimental Design Model Factorial Analysis of Variance/Covariance (DMAOVC.BAS)

Newton, Wesley E. 01 May 1985 (has links)
DMAOVC.BAS is a computer program, written in the compiler version of Microsoft BASIC, that performs factorial analysis of variance/covariance with expected mean squares. The program accommodates factorial and other hierarchical experimental designs with balanced sets of data, and is written for use on most modest-sized microcomputers for which the compiler is available. The program is parameter-file driven: the parameter file consists of the response variable structure, the experimental design model expressed in a structure similar to that seen in most textbooks, information concerning the factors (i.e., whether each is fixed or random, and the number of levels), and the information necessary to perform covariance analysis. The results of the analysis are written to separate files in a format that can be used for reporting purposes and for further computation if needed.
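As a point of reference for the kind of analysis DMAOVC.BAS performs, the sketch below fits a balanced two-factor ANCOVA in Python with statsmodels rather than compiled BASIC; the data and the names resp, a, b, and cov are invented stand-ins for a parameter-file specification.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical balanced 2x3 factorial with a covariate, mirroring the
# inputs DMAOVC expects (factors, levels, covariate).
df = pd.DataFrame({
    "resp": [12.1, 13.4, 11.8, 14.0, 15.2, 13.9,
             12.5, 13.1, 14.7, 15.5, 16.0, 14.2],
    "a":    ["lo"] * 6 + ["hi"] * 6,      # fixed factor, 2 levels
    "b":    ["x", "y", "z"] * 4,          # fixed factor, 3 levels
    "cov":  [1.1, 0.9, 1.3, 1.0, 1.2, 0.8,
             1.4, 1.1, 0.9, 1.3, 1.0, 1.2],
})
model = ols("resp ~ C(a) * C(b) + cov", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))    # ANCOVA table with SS and F tests
```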
112

An investigation of the methods for estimating usual dietary intake distributions : a thesis presented in partial fulfillment of the requirements for the degree of Master of Applied Statistics at Massey University, Albany, New Zealand

Stoyanov, Stefan Kremenov January 2008 (has links)
The estimation of the distribution of usual intake of nutrients is important for developing nutrition policies as well as for etiological research and educational purposes. In most nutrition surveys only a small number of repeated intake observations are collected per individual. Of main interest is the usual intake, defined as the long-term daily average intake of a dietary component. However, dietary intake on a single day is a poor estimate of an individual's long-term usual intake. Furthermore, the distribution of individual intake means is also a poor estimator of the distribution of usual intake, since within-individual variability in dietary intake data is usually large compared to between-individual variability. Hence, the variance of the mean intakes is larger than the variance of the usual intake distribution. Essentially, estimating the distribution of long-term intake is equivalent to estimating the distribution of a random variable observed with measurement error. Some of the methods for estimating usual dietary intake distributions are reviewed in detail and applied to nutrient intake data in order to evaluate their properties. The results indicate that there are a number of robust methods which could be used to derive the distribution of long-term dietary intake. The methods share a common framework but differ in complexity and in their assumptions about the properties of the dietary consumption data. Hence, the choice of the most appropriate method depends on the specific characteristics of the data, the research purpose, and the availability of analytical tools and statistical expertise.
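A minimal sketch of the shared framework, assuming the simplest variance-decomposition (NRC-style) adjustment applied to simulated two-day recall data; the intake values and variance components are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 500, 2                        # individuals, repeated recall days
usual = rng.normal(60.0, 10.0, n)    # invented true usual intakes
obs = usual[:, None] + rng.normal(0.0, 25.0, (n, d))  # large within-person error

person_mean = obs.mean(axis=1)
s2_within = obs.var(axis=1, ddof=1).mean()    # within-individual variance
s2_means = person_mean.var(ddof=1)            # variance of individual means
s2_between = max(s2_means - s2_within / d, 0.0)

# Shrink individual means toward the grand mean so the adjusted values
# carry (approximately) the between-person variance only.
shrink = np.sqrt(s2_between / s2_means)
adjusted = person_mean.mean() + shrink * (person_mean - person_mean.mean())
print(person_mean.std(ddof=1), adjusted.std(ddof=1))  # adjusted SD is smaller
```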
113

Measurement of body posture using multivariate statistical techniques

Petkov, John January 2005 (has links)
The aim of this thesis is to develop a quantitative measure of the postural defects known as lordosis and kyphosis. Measuring these is an important part of their identification and treatment.
114

Adjusting the parameter estimation of the parentage analysis software MasterBayes to the presence of siblings : a thesis presented in partial fulfillment of the requirements for the degree of Master of Applied Statistics at Massey University, Albany, New Zealand

Heller, Florian January 2009 (has links)
Parentage analysis is concerned with the estimation of a sample's pedigree structure, which is often essential knowledge for estimating population parameters of animal species, such as reproductive success. While it is often easy to relate one parent to an offspring simply by observation, the second parent frequently remains unknown. Parentage analysis uses genotypic data to estimate the pedigree, which then allows the desired parameters to be inferred. There are several software applications available for parentage analysis, one of which is MasterBayes, an extension to the statistical software package R. MasterBayes makes use of behavioural, phenotypic, spatial and genetic data, providing a Bayesian approach to simultaneously estimating the pedigree and the population parameters of interest, and allowing for a range of covariate models. MasterBayes, however, assumes the sample to be randomly collected from the population of interest. Often, however, data are collected from nests or other groups that are likely to contain siblings. If siblings are present, the assumption of a random population sample is no longer met and, as a result, the parameter variance will be underestimated. This thesis presents four methods to adjust MasterBayes' parameter estimates for the presence of siblings, all of which are based on the pedigree structure as estimated by MasterBayes. One approach, denoted DEP, provides a Bayesian estimate similar to MasterBayes' approach but incorporating the presence of siblings. Three further approaches, denoted W1, W2 and W3, apply importance sampling to re-weight parameter estimates obtained from MasterBayes and DEP. Although a fully satisfactory adjustment of the estimate's variance is achieved only with nearly perfect pedigree assignment, the presented methods considerably improve MasterBayes' parameter estimation in the presence of siblings when the pedigree is uncertain. DEP and W3 prove to be the most successful adjustment methods, providing comparatively accurate, though still underestimated, variances for small family sizes. W3 is the superior approach when the pedigree is highly uncertain, whereas DEP becomes superior once about half of all parental assignments are correct. Large family sizes introduce a tendency in all approaches to underestimate the parameter variance, with the degree of underestimation depending on the certainty of the pedigree. Additionally, when the pedigree is highly uncertain, the importance sampling schemes provide comparatively good estimates of the parameters' expected values, where the non-importance-sampling approaches fail severely.
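As a rough illustration of the re-weighting idea behind W1-W3, the sketch below applies generic self-normalized importance sampling to hypothetical posterior draws; the densities log_q (sibling-ignoring proposal) and log_p (sibling-aware target) are invented stand-ins, not the thesis's actual weighting schemes.

```python
import numpy as np

rng = np.random.default_rng(7)
# Hypothetical posterior draws of a population parameter from the
# sibling-ignoring model (the proposal q), re-weighted toward a
# sibling-aware target p with a larger variance.
draws = rng.normal(0.5, 0.10, 5000)
log_q = -0.5 * ((draws - 0.5) / 0.10) ** 2
log_p = -0.5 * ((draws - 0.5) / 0.12) ** 2

w = np.exp(log_p - log_q)               # unnormalized importance weights
w /= w.sum()                            # self-normalize
mean = np.sum(w * draws)
var = np.sum(w * (draws - mean) ** 2)   # inflated, sibling-adjusted variance
print(mean, var)
```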
115

A New Screening Methodology for Mixture Experiments

Weese, Maria 01 May 2010 (has links)
Many materials we use in daily life are mixtures: plastics, gasoline, food, medicine, etc. Mixture experiments, in which the factors are proportions of components and the response depends only on the relative proportions of the components, are an integral part of product development and improvement. However, when the number of components is large and there are complex constraints, experimentation can be a daunting task. We study screening methods in a mixture setting using the framework of the Cox mixture model [1]. We exploit the easy interpretation of the parameters in the Cox mixture model and develop methods for screening in a mixture setting. We present specific methods for adding a component and removing a component, and a general method for screening a subset of components in mixtures with complex constraints. The variances of our parameter estimates are comparable with the variances of the typically used Scheffé model, and our methods provide a reduced run size for screening experiments with mixtures containing a large number of components. We then further extend the new screening methods using Evolutionary Operation (EVOP), developed by Box and Draper [2]. EVOP methods use small movements in a subset of process parameters, together with replication, to reveal effects out of the process noise. Mixture experiments inherently involve small movements (since the proportions can only range from zero to unity) and effects with large variances. We update the EVOP methods by using sequential testing of effects as opposed to the confidence interval method originally proposed by Box and Draper. We show that, compared with a fixed sample size and with all other testing parameters held constant, the sequential testing approach reduces the required sample size by as much as 50 percent. We present two methods for adding a component and a general screening method using a graphical sequential t-test, and provide R code to reproduce the limits for the test.
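A toy version of the sequential-testing idea, assuming a plain one-sample t-test applied after each new replicated effect estimate; the fixed 5% level is a stand-in for the graphical sequential limits derived in the thesis (which account for repeated testing), and the data are simulated.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
effects = rng.normal(0.8, 1.0, 30)   # simulated replicated effect estimates

# Test H0: effect = 0 after each new replicate and stop at the first
# rejection. NB: re-testing at a fixed 5% level inflates the type I
# error; the thesis's graphical limits adjust for this.
for n in range(2, len(effects) + 1):
    t, p = stats.ttest_1samp(effects[:n], 0.0)
    if p < 0.05:
        print(f"effect declared after {n} replicates (t = {t:.2f})")
        break
```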
116

An Analysis of Boosted Regression Trees to Predict the Strength Properties of Wood Composites

Carty, Dillon Matthew 01 August 2011 (has links)
The forest products industry is a significant contributor to the U.S. economy, contributing six percent of the total U.S. manufacturing gross domestic product (GDP) and placing it on par with the U.S. automotive and plastics industries. Sustaining business competitiveness by reducing costs and maintaining product quality will be essential for this industry in the long term. Improved production efficiency and business competitiveness are the primary rationale for this work. A challenge facing this industry is to develop better knowledge of the complex nature of process variables and their relationship with final product quality attributes. The goals of this study are to better quantify the relationships between process variables (e.g., press temperature) and final product quality attributes, and to predict the strength properties of final products. Destructive lab tests are taken at one- to two-hour intervals to estimate the internal bond (IB) tensile strength and modulus of rupture (MOR) strength properties. Significant amounts of production occur between destructive test samples. In the absence of a real-time model that predicts strength properties, operators may run higher-than-necessary feedstock input targets (e.g., weight, resin, etc.). Improved prediction of strength properties using boosted regression tree (BRT) models may reduce the costs associated with rework (i.e., panels remanufactured because of poor strength properties), reduce feedstock costs (e.g., resin and wood), reduce energy usage, and improve utilization of the valuable forest resource. Real-time, temporal process data sets were obtained from a U.S. particleboard manufacturer. In this thesis, BRT models were developed to predict the continuous response variables MOR and IB from a pool of possible continuous predictor variables. BRT models were compared using the root mean squared error of prediction (RMSEP) and the RMSEP relative to the mean of the response variable as a percent (RMSEP%) on the validation data set(s). Overall, for MOR, RMSEP values ranged from 0.99 to 1.443 MPa, and RMSEP% values ranged from 7.9% to 11.6%. For IB, RMSEP values ranged from 0.074 to 0.108 MPa, and RMSEP% values ranged from 12.7% to 18.6%.
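As a rough illustration of the modeling and validation workflow, the sketch below fits a gradient-boosted regression tree and computes RMSEP and RMSEP% in Python with scikit-learn; the simulated data stand in for the proprietary mill data, and the tuning values are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Simulated stand-in for the real-time process variables and a strength
# response such as MOR.
X, y = make_regression(n_samples=1000, n_features=20, noise=5.0, random_state=0)
y = y + 200.0                        # shift so the mean response is positive
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)

brt = GradientBoostingRegressor(n_estimators=500, learning_rate=0.05,
                                max_depth=3, random_state=0).fit(X_tr, y_tr)
pred = brt.predict(X_va)
rmsep = np.sqrt(np.mean((y_va - pred) ** 2))   # validation RMSEP
rmsep_pct = 100 * rmsep / np.mean(y_va)        # RMSEP% relative to mean response
print(f"RMSEP = {rmsep:.3f}, RMSEP% = {rmsep_pct:.1f}%")
```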
117

Applying Localized Realized Volatility Modeling to Futures Indices

Fu, Luella 01 January 2011 (has links)
This thesis extends the application of the localized realized volatility model created by Ying Chen, Wolfgang Karl Härdle, and Uta Pigorsch to other futures markets, particularly the CAC 40 and the NI 225. The research attempted to replicate the original results, though ultimately those results were invalidated by procedural difficulties.
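For context, the realized volatility that the model localizes is computed from intraday returns; a minimal sketch, with simulated one-minute returns standing in for actual futures data:

```python
import numpy as np

rng = np.random.default_rng(11)
# Simulated 1-minute log-returns for one trading day (390 minutes).
intraday_returns = rng.normal(0.0, 0.001, 390)

# Daily realized volatility: the square root of the sum of squared
# intraday returns. The localized model treats the volatility level as
# approximately constant over an adaptively chosen local window.
rv = np.sqrt(np.sum(intraday_returns ** 2))
print(rv)
```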
118

Assessing Changes in the Abundance of the Continental Population of Scaup Using a Hierarchical Spatio-Temporal Model

Ross, Beth E. 01 January 2012 (has links)
In ecological studies, the goal is often to describe and gain further insight into the ecological processes underlying data collected during observational studies. Because of the nature of observational data, it can be difficult to separate the variation in the data from the underlying process, or 'state dynamics.' To better address this issue, it is becoming increasingly common for researchers to use hierarchical models. Hierarchical spatial, temporal, and spatio-temporal models allow the simultaneous modeling of both first- and second-order processes, thus accounting for underlying autocorrelation in the system while still providing insight into overall spatial and temporal pattern. In this study, I use two species of interest, the lesser and greater scaup (Aythya affinis and Aythya marila), as an example of how hierarchical models can be utilized in wildlife management studies. Scaup are the most abundant and widespread diving ducks in North America and are important game species. Since 1978, the continental population of scaup has declined to levels 16% below the 1955-2010 average and 34% below the North American Waterfowl Management Plan goal. The greatest decline in scaup abundance appears to be occurring in the western boreal forest, where populations may have depressed rates of reproductive success, survival, or both. To better understand the causes of the decline, and the biology of scaup in general, high importance has been placed on retrospective analyses that determine the spatial and temporal changes in population abundance. To implement the Bayesian hierarchical models, I used Integrated Nested Laplace Approximation (INLA) to approximate the posterior marginal distributions of the parameters of interest, rather than the more common Markov chain Monte Carlo (MCMC) approach. Based on preliminary analysis, the data appeared to be overdispersed, containing a disproportionately high number of zeros along with a high variance relative to the mean. Thus, I considered two potential data models, the negative binomial and the zero-inflated negative binomial. Of these, the zero-inflated negative binomial had the lowest DIC, so inference was based on this model. Results indicated that a large proportion of the strata were not decreasing (i.e., the estimated slope parameter was not significantly different from zero). However, there were important exceptions among strata in the northwest boreal forest and southern prairie parkland habitats. Several strata in the boreal forest habitat had negative slope estimates, indicating a decrease in breeding pairs, while some strata in the prairie parkland habitat had positive slope estimates, indicating an increase in this region. Additionally, plots of individual strata suggest that the strata gaining breeding pairs are increasing dramatically. Overall, my results support previous work indicating a decline in population abundance in the northern boreal forest of Canada, and additionally indicate that the scaup population has increased rapidly in the prairie pothole region since 1957. Yet, by accounting for spatial and temporal autocorrelation in the data, it appears that declines in abundance are not as widespread as previously reported.
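A small sketch of the zero-inflated negative binomial data model on simulated counts, fitted by maximum likelihood in Python with statsmodels rather than the Bayesian INLA approach used in the thesis; the covariate and the inflation structure are invented.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedNegativeBinomialP

rng = np.random.default_rng(5)
n = 1000
x = rng.normal(size=n)
X = sm.add_constant(x)

# Simulate overdispersed counts with extra (structural) zeros, mimicking
# survey counts with many empty strata.
mu = np.exp(0.5 + 0.3 * x)
counts = rng.negative_binomial(2.0, 2.0 / (2.0 + mu))
counts[rng.random(n) < 0.3] = 0

zinb = ZeroInflatedNegativeBinomialP(counts, X, exog_infl=np.ones((n, 1)))
res = zinb.fit(maxiter=500, disp=0)
print(res.params)   # inflation logit, regression coefficients, dispersion
```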
119

New Results in ℓ1 Penalized Regression

Roualdes, Edward A. 01 January 2015 (has links)
Here we consider penalized regression methods and extend the results surrounding the ℓ1 norm penalty. We address a more recent development that generalizes previous methods by penalizing a linear transformation of the coefficients of interest instead of penalizing just the coefficients themselves. We introduce an approximate algorithm to fit this generalization, along with a fully Bayesian hierarchical model that is a direct analogue of the frequentist version. A number of benefits derive from the Bayesian perspective, most notably the choice of the tuning parameter and a natural means of estimating the variation of the estimates, a notoriously difficult task for the frequentist formulation. We then introduce Bayesian trend filtering, which exemplifies the benefits of our Bayesian version. Bayesian trend filtering is shown to be an empirically strong technique for fitting univariate nonparametric regressions. Through a simulation study, we show that Bayesian trend filtering reduces prediction error and attains more accurate coverage probabilities than the frequentist method. We then apply Bayesian trend filtering to real data sets, where our method is quite competitive against a number of other popular nonparametric methods.
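A sketch of the frequentist penalty being generalized, assuming the simplest case in which the penalized linear transformation is the first-difference operator (zeroth-order trend filtering, i.e., the fused-lasso signal approximator); re-parameterizing with a cumulative-sum basis lets an ordinary lasso solver fit it, whereas the thesis's Bayesian version replaces the tuning parameter with a hyperprior.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n = 200
truth = np.concatenate([np.full(70, 1.0), np.full(60, 3.0), np.full(70, 2.0)])
y = truth + rng.normal(0.0, 0.4, n)

# With beta = intercept + H @ theta and step-function columns in H, the
# lasso penalty ||theta||_1 equals ||D beta||_1, the sum of successive
# differences |beta[i] - beta[i-1]| that trend filtering penalizes.
H = np.tril(np.ones((n, n)))[:, 1:]
fit = Lasso(alpha=0.02, fit_intercept=True, max_iter=200000).fit(H, y)
beta_hat = fit.intercept_ + H @ fit.coef_   # piecewise-constant fit
print(np.unique(np.round(beta_hat, 2)))     # a few recovered levels
```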
120

A complex survey data analysis of TB and HIV mortality in South Africa.

Murorunkwere, Joie Lea. January 2012 (has links)
Many countries record annual summary statistics, such as economic indicators like gross domestic product (GDP), and vital statistics, for example the numbers of births and deaths. In this thesis we focus on mortality data from various causes, including tuberculosis (TB) and HIV. TB is an infectious disease caused by the bacterium Mycobacterium tuberculosis. It is the leading cause of death in the world among all infectious diseases. An additional complexity is that HIV/AIDS acts as a catalyst for the occurrence of TB. Vaidyanathan and Singh revealed that people infected with Mycobacterium tuberculosis alone have approximately a 10% lifetime risk of developing active TB, compared to 60% or more in persons co-infected with HIV and Mycobacterium tuberculosis. South Africa was ranked seventh highest by the World Health Organization among the 22 high-burden TB countries in the world, and fourth highest in Africa. The research work in this thesis uses the 2007 Statistics South Africa (STATSSA) data on TB and HIV as the primary cause of death to build statistical models that can be used to investigate factors associated with death due to TB. Logistic regression, survey logistic regression and generalized linear models (GLMs) are used to assess the effects of risk factors or predictors on the probability of death associated with TB and HIV. The study is guided by a theoretical approach to understanding factors associated with TB and HIV deaths. Bayesian modeling using WinBUGS is used for spatial modeling of relative risk, with spatial prior distributions for disease-mapping models. Of the 615,312 deceased, 546,917 (89%) died a natural death, 14,179 (2%) were stillborn, and 54,216 (9%) died a non-natural death, possibly accidents, murder or suicide. Among those who died from natural death and disease, 65,052 (12%) died of TB and 13,718 (2%) died of HIV. The results of the analysis revealed risk factors associated with TB and HIV mortality. / Thesis (M.Sc.)-University of KwaZulu-Natal, Pietermaritzburg, 2012.
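A minimal sketch of a design-weighted logistic regression in Python with statsmodels, standing in for the survey logistic regression used in the thesis; the records, covariates and weights are simulated, and a full design-based analysis would additionally account for stratification and clustering.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(9)
n = 2000
age = rng.integers(0, 90, n)
sex = rng.choice(["m", "f"], n)
p = 1.0 / (1.0 + np.exp(-(-3.0 + 0.03 * age)))   # invented age effect
df = pd.DataFrame({
    "tb_death": rng.binomial(1, p),              # 1 if TB was primary cause
    "age": age,
    "sex": sex,
    "w": rng.uniform(0.5, 2.0, n),               # survey design weight
})

# Weighted GLM: the weights enter the pseudo-likelihood, mimicking the
# point estimation of a survey logistic regression.
fit = smf.glm("tb_death ~ age + C(sex)", data=df,
              family=sm.families.Binomial(),
              freq_weights=df["w"]).fit()
print(fit.summary())
```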
