161
The performance of Multilevel Structural Equation Modeling (MSEM) in comparison to Multilevel Modeling (MLM) in multilevel mediation analysis with non-normal data
Pham, Thanh Vinh, 17 November 2017
Mediation analysis tests whether the effect of one variable on another is transmitted through a third, mediating variable. It answers the question of how a predictor influences an outcome and thereby helps build understanding of the mechanism underlying variation in the outcome. When mediation analysis is conducted on hierarchical data, the structure of the data must be taken into account. Krull and MacKinnon (1999) recommended Multilevel Modeling (MLM) for nested data and showed that the MLM approach offers more power and flexibility than the standard Ordinary Least Squares (OLS) approach for multilevel data. However, the MLM mediation model has limitations, such as its inability to analyze outcome variables measured at the upper level. Preacher, Zyphur, and Zhang (2010) proposed Multilevel Structural Equation Modeling (MSEM) to overcome these limitations. The purpose of this study was to examine the performance of the MSEM approach on non-normal hierarchical data and to compare it with the MLM method proposed by MacKinnon (2008) and Zhang, Zyphur, and Preacher (2009). The study focused on null hypothesis testing, evaluated in terms of Type I error, statistical power, and convergence rate. Using the Monte Carlo method, the study systematically investigated the effect of several design factors on the performance of the MSEM and MLM methods: the magnitude of the population indirect effect, the shape of the population distribution, sample sizes at level 1 and level 2, and the level of intra-class correlation (ICC). The results showed no significant effect of the degree of non-normality on any performance criterion for either model. While the Type I error rates of the MLM model reached the nominal alpha level once the number of groups was 300 or higher, the MSEM model was very conservative in controlling Type I error, with rejection rates under null conditions at or near zero across all conditions. The MLM model outperformed the MSEM model in terms of power for most simulated conditions. Among the simulation factors examined in this dissertation, the mediation effect size emerged as the most important, being strongly associated with every performance criterion considered. The study also supported the findings of previous studies (Preacher, Zhang, & Zyphur, 2011; Zhang, 2005) on the relationship between sample size, especially the number of groups, and the performance of the MLM and MSEM models. The accuracy and precision of the two methods were also examined in terms of relative bias and confidence interval (CI) width: the MSEM model outperformed the MLM model on relative bias, while the MLM model produced better CI widths. Sample size, effect size, and ICC were the factors significantly associated with performance on both criteria.
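To make the simulation logic concrete, the following is a minimal sketch of this kind of Monte Carlo study: it generates two-level mediation data, estimates the a (X to M) and b (M to Y) paths, and tallies rejection rates for the indirect effect a*b. For brevity it uses pooled OLS with a Sobel z-test rather than the full MLM or MSEM estimators evaluated in the dissertation, and all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_rejection_rate(a, b, n_groups=100, group_size=10, icc=0.2, reps=1000):
    """Crude Monte Carlo for the indirect effect a*b in clustered data.

    Simplification: pooled OLS with a Sobel z-test, ignoring clustering
    in the standard errors (the real study fits MLM/MSEM models).
    """
    rejections = 0
    n = n_groups * group_size
    groups = np.repeat(np.arange(n_groups), group_size)
    for _ in range(reps):
        u_m = rng.normal(0, np.sqrt(icc), n_groups)[groups]   # group effects
        u_y = rng.normal(0, np.sqrt(icc), n_groups)[groups]
        x = rng.normal(size=n)
        m = a * x + u_m + rng.normal(0, np.sqrt(1 - icc), n)
        y = b * m + u_y + rng.normal(0, np.sqrt(1 - icc), n)
        # OLS slopes and standard errors for the a and b paths
        a_hat = np.cov(x, m)[0, 1] / np.var(x, ddof=1)
        se_a = np.sqrt(np.var(m - a_hat * x, ddof=2) / (np.var(x, ddof=1) * (n - 1)))
        b_hat = np.cov(m, y)[0, 1] / np.var(m, ddof=1)
        se_b = np.sqrt(np.var(y - b_hat * m, ddof=2) / (np.var(m, ddof=1) * (n - 1)))
        # Sobel (delta-method) z-test for the indirect effect a*b
        z = (a_hat * b_hat) / np.sqrt(a_hat**2 * se_b**2 + b_hat**2 * se_a**2)
        rejections += abs(z) > 1.96
    return rejections / reps

print("Type I error (a=0, b=0.4):", simulate_rejection_rate(0.0, 0.4))
print("Power       (a=b=0.4):   ", simulate_rejection_rate(0.4, 0.4))
```

The null-condition run illustrates the well-known conservatism of product-of-coefficients tests, the same phenomenon the abstract reports for the MSEM model's near-zero rejection rates.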
162
Telephone Polls and PPS Sampling: A Potential Boon to the Polling Industry
Burt, Jade McKay, 01 April 2017
In the wake of the 2016 election, the polling industry has no shortage of critics. While these are difficult times for the industry as a whole, exciting innovations are emerging that will benefit and revitalize it for years to come. One of these innovations is Probability Proportional to Size (PPS) sampling. I elaborate on what PPS sampling is and provide a mathematical foundation for its use in polling. I also discuss some of the myriad issues plaguing the polling industry and show how PPS sampling can remedy many of these ills. Finally, I look at a real-world application of PPS sampling: the Mia Love internal polling team, Y2 Analytics, granted me access to their PPS data, and I use it to show that we can accurately model the electorate using PPS samples and that polls conducted by this method are at least as accurate as other polls using simple random samples.
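A minimal sketch of the PPS idea follows: units are drawn with probability proportional to a size measure, and the Hansen-Hurwitz estimator reweights each draw by its selection probability to estimate a population total without bias. The precinct frame and vote counts below are simulated stand-ins, not the Y2 Analytics data.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical frame: each precinct has a known size measure (e.g., registered
# voters) and an unknown quantity we want the total of (e.g., candidate votes).
sizes = rng.integers(200, 5000, size=500).astype(float)
support_rate = np.clip(rng.normal(0.52, 0.08, size=500), 0, 1)
votes = sizes * support_rate                      # unknown in practice

p = sizes / sizes.sum()                           # PPS selection probabilities

def hansen_hurwitz(n):
    """PPS-with-replacement estimate of the total: mean of y_i / p_i."""
    idx = rng.choice(len(sizes), size=n, p=p)
    return np.mean(votes[idx] / p[idx])

estimates = [hansen_hurwitz(50) for _ in range(2000)]
print("true total:           ", votes.sum().round())
print("mean of PPS estimates:", np.mean(estimates).round())
```

Because larger precincts are both more likely to be sampled and down-weighted in proportion, the estimator stays unbiased while concentrating effort where most of the electorate lives.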
163
Simulation of Mathematical Models in Genetic Analysis
Patel, Dinesh Govindlal, 01 January 1964
In recent years a new field of statistics has become important in many branches of experimental science. This is the Monte Carlo method, so called because it is based on the simulation of stochastic processes. By a stochastic process is meant some physical process in the real world that has a random or stochastic element in its structure. This is the subject that may appropriately be called the dynamic part of statistics, or the statistics of "change," in contrast with the static statistical problems that have so far been the more systematically studied. Many obvious examples of such processes are found in various branches of science and technology: the phenomenon of Brownian motion, the growth of a bacterial colony, the fluctuating numbers of electrons and protons in a cosmic ray shower, or the random segregation and assortment of genes (the chemical entities governing physical traits in plant and animal systems) under linkage conditions. Such processes are prominent in medicine, genetics, physics, oceanography, economics, engineering, and industry, to name only a few scientific disciplines. The scientist making measurements in his laboratory, the meteorologist attempting to forecast weather, the control systems engineer designing a servomechanism (such as an aircraft or a thermostatic control), the electrical engineer designing a communication system (such as the radio link between entertainer and audience, or the apparatus and cables that transmit messages from one point to another), the economist studying price fluctuations in business cycles, and the neurosurgeon studying brain wave records: all are encountering problems to which the theory of stochastic processes may be relevant.

Let us consider a few of these processes in a little more detail. In statistical physics, many parts of the theory of stochastic processes were developed in connection with the study of fluctuations and noise in physical systems (Einstein, 1905; Smoluchowski, 1906; Schottky, 1918); consequently, the theory of stochastic processes can be regarded as the mathematical foundation of statistical physics. Stochastic models for population growth consider the size and composition of a population that is constantly fluctuating; these are treated by Bailey (1957), Bartlett (1960), and Bharucha-Reid (1960). In communication theory, a wide variety of problems involving communication and/or control, such as the automatic tracking of moving objects, the reception of radio signals in the presence of natural and artificial disturbances, the reproduction of sound and images, the design of guidance systems, and the design of control systems for industrial processes, may be regarded as special cases of the following general problem: let T denote a set of points on a time axis such that at each point t in T an observation has been made of a random variable X(t). Given the observations {X(t), t ∈ T} and a quantity Z related to the observations, one desires to form, in an optimum manner, estimates of, and tests of hypotheses about, Z and various functions h(Z).
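As a concrete illustration, the sketch below applies the Monte Carlo method to one of the examples named above: the random segregation of two linked genes. A double heterozygote AB/ab produces parental gametes (AB, ab) and, with probability equal to the recombination fraction, recombinant gametes (Ab, aB); the value r = 0.2 is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(1964)

def gametes(n, r=0.2):
    """Simulate n gametes from an AB/ab double heterozygote.

    With probability r (the recombination fraction) a crossover swaps
    the allele at the second locus, producing an Ab or aB gamete.
    """
    parental = rng.random(n) >= r                 # True: no crossover
    first = rng.random(n) < 0.5                   # which parental strand
    a_allele = np.where(first, "A", "a")
    b_parental = np.where(first, "B", "b")
    b_recomb = np.where(first, "b", "B")
    b_allele = np.where(parental, b_parental, b_recomb)
    return np.char.add(a_allele, b_allele)

g = gametes(100_000, r=0.2)
types, counts = np.unique(g, return_counts=True)
# Expect AB, ab near (1-r)/2 = 0.4 and Ab, aB near r/2 = 0.1
print(dict(zip(types, (counts / len(g)).round(3))))
```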
164
Robustness of the Within- and Between-Series Estimators to Non-Normal Multiple-Baseline Studies: A Monte Carlo Study
Joo, Seang-Hwane, 06 April 2017
In single-case research, the multiple-baseline (MB) design is the most widely used design in practical settings. It provides the opportunity to estimate the treatment effect based not only on within-series comparisons of treatment-phase to baseline-phase observations, but also on time-specific between-series comparisons of observations from participants who have started treatment to those still in baseline. In MB studies, the average treatment effect and the variation of these effects across multiple participants can be estimated using various statistical modeling methods. Recently, two types of statistical modeling methods were proposed for analyzing MB studies: a) the within-series model and b) the between-series model. The within-series model is a typical two-level multilevel modeling approach analyzing the measurement occasions within each participant, whereas the between-series model is an alternative approach analyzing participants' measurement occasions at particular time points, where some participants are in the baseline phase and others are in the treatment phase. Parameters of both models are generally estimated with restricted maximum likelihood (REML), which was developed under the assumption of normality (Hox et al., 2010; Raudenbush & Bryk, 2002). In practical educational and psychological settings, however, observed data cannot always be assumed normal. The purpose of this study was therefore to investigate the robustness of the within- and between-series models when level-1 errors are non-normal. A Monte Carlo study was conducted in which level-1 errors were generated from non-normal distributions whose skewness and kurtosis were manipulated. Four statistical approaches were compared based on theoretical and/or empirical rationales, defined by the crossing of two analytic decisions: a) whether to use a within- or between-series estimate of effect, and b) whether to use REML estimation with the Kenward-Roger adjustment for inferences or Bayesian estimation and inference. The accuracy of parameter estimation, statistical power, and Type I error were systematically analyzed. The results showed that the within- and between-series models are robust to non-normality of the level-1 errors: both models estimated the treatment effect accurately, and statistical inferences were acceptable. REML and Bayesian estimation also showed similar results. Applications and implications for applied and methodology researchers are discussed based on the findings of the study.
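In simulation studies of this kind, level-1 errors with manipulated skewness and kurtosis are commonly generated with Fleishman's (1978) power method; the sketch below is one plausible way to implement that manipulation (it is not code from this dissertation), with illustrative target values.

```python
import numpy as np
from scipy.optimize import fsolve

def fleishman_coeffs(skew, exkurt):
    """Solve Fleishman's (1978) power-method equations for b, c, d,
    so that -c + b*Z + c*Z**2 + d*Z**3 has the target moments."""
    def eqs(p):
        b, c, d = p
        return (
            b**2 + 6*b*d + 2*c**2 + 15*d**2 - 1,                  # unit variance
            2*c*(b**2 + 24*b*d + 105*d**2 + 2) - skew,            # skewness
            24*(b*d + c**2*(1 + b**2 + 28*b*d)
                + d**2*(12 + 48*b*d + 141*c**2 + 225*d**2)) - exkurt,
        )
    return fsolve(eqs, (1.0, 0.0, 0.0))

def nonnormal_errors(n, skew=1.5, exkurt=3.0, rng=None):
    """Level-1 errors: zero mean, unit variance, target shape."""
    rng = rng or np.random.default_rng()
    b, c, d = fleishman_coeffs(skew, exkurt)
    z = rng.standard_normal(n)
    return -c + b*z + c*z**2 + d*z**3

e = nonnormal_errors(100_000, skew=1.5, exkurt=3.0)
print(e.mean().round(3), e.std().round(3))  # approximately 0 and 1
```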
165
Heterogeneous computing for the Bayesian hierarchical normal intrinsic conditional autoregressive model with incomplete data
Somal, Harsimran S., 01 August 2016
A popular model for spatial association is the conditional autoregressive (CAR) model, and generalizations exist in the literature that utilize intrinsic CAR (ICAR) models within spatial hierarchical models. One generalization is the class of Bayesian hierarchical normal ICAR models, abbreviated HNICAR. The Bayesian HNICAR model can be used to smooth areal or lattice data, estimate the directional strength of spatio-temporal associations, and make posterior predictions at each point in space or time. Furthermore, it allows sample-based posterior inference about model parameters and predictions. The R package CARrampsOcl enables fast, independent sampling-based inference for a Bayesian HNICAR model when data are complete and the spatial precision matrix is expressible as a Kronecker sum of lower-order matrices. This thesis presents an independent sampling algorithm that accommodates incomplete data and arbitrary precision structures; a parallelized implementation of the algorithm that can be executed on a wide range of hardware, including NVIDIA and AMD graphics processing units (GPUs) and multicore Intel CPUs; an analysis of the effects of missingness on the posterior distribution of model parameters and predictive densities; and a survey of model comparison methods for CAR models. The merits of the model and algorithm are demonstrated through both simulation and analysis of an environmental data set.
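To illustrate the Kronecker-sum structure mentioned above, the sketch below builds the ICAR precision matrix of a regular grid from two one-dimensional random-walk precision matrices; this is a generic construction for intuition, not code from CARrampsOcl.

```python
import numpy as np

def rw1_precision(n):
    """Precision matrix of a first-order random walk (1-D ICAR)."""
    q = np.diag(np.r_[1.0, np.full(n - 2, 2.0), 1.0])
    q -= np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
    return q

def grid_icar_precision(nrows, ncols):
    """ICAR precision for a regular grid as a Kronecker sum:
    Q = Q_r (+) Q_c = kron(Q_r, I) + kron(I, Q_c)."""
    qr, qc = rw1_precision(nrows), rw1_precision(ncols)
    return np.kron(qr, np.eye(ncols)) + np.kron(np.eye(nrows), qc)

Q = grid_icar_precision(4, 3)
print(Q.shape)                        # (12, 12)
print(np.allclose(Q.sum(axis=1), 0))  # True: ICAR precision rows sum to zero
```

The payoff of this structure is that the eigendecomposition of Q factors into the much smaller decompositions of Q_r and Q_c, which is what makes fast independent sampling feasible on parallel hardware.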
166
A µ-Model Approach on the Cell Means: The Analysis of Full Design Models with Non-Orthogonal Data
Van Koningsveld, Richard, 01 May 1979
This work considers the application of a µ-model approach on the cell means to a special yet important class of experimental designs. These include full factorial, completely nested, and mixed models with one or more observations per cell. By limiting attention to full models, an approach to the general data situation is developed which is both conceptually simple and computationally advantageous.
Conceptually, the method is simple because design-related effects are defined as if the cell means were single observations. This leads to a rather simple algorithm for generating main-effect contrasts, from which associated interaction contrasts can also be formed. While the sums of squares found from these contrasts are not additive with non-orthogonal data, they do lead to the class of design-related hypotheses with the clearest interpretation in terms of the cells.
The computational method is advantageous because the sum of squares for each source of variation is evaluated separately. This avoids the storage and inversion of a potentially large matrix associated with alternative methods, and allows the user to evaluate only those sources of interest.
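A small illustration of the cell-means viewpoint (not the author's algorithm) appears below: with unbalanced two-factor data, each cell mean is treated as a single observation, and a main-effect contrast is formed on the unweighted marginal means, so the hypothesis is stated directly in terms of the cells regardless of cell sizes.

```python
import numpy as np

rng = np.random.default_rng(3)

# Unbalanced 2x3 factorial: unequal numbers of observations per cell.
cells = {(i, j): rng.normal(5 + i + 0.5 * j, 1.0, size=rng.integers(2, 7))
         for i in range(2) for j in range(3)}

# Treat each cell mean as a single observation (the mu-model view).
mu_hat = np.array([[cells[i, j].mean() for j in range(3)] for i in range(2)])

# Main effect of factor A: contrast on unweighted marginal cell means,
# i.e. the hypothesis that row means of mu are equal, whatever the n's.
contrast_A = mu_hat[0].mean() - mu_hat[1].mean()
print("cell means:\n", mu_hat.round(2))
print("A main-effect contrast (unweighted):", round(contrast_A, 3))
```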
The methodology outlined in this work is programmed as a user-friendly, interactive terminal program for the analysis of these n-factor design models.
167
Statistical Properties and Problems on Modeling the Bolivian Foreign Exchange Market
Barja, Gover, 01 May 1994
The Bolivian foreign exchange market is explained in terms of the official and parallel exchange rates. The data cover the post-hyperinflationary period from 1986 to 1992. The distribution of the rate of depreciation of the official and parallel exchange rates is long-tailed and departs strongly from normality due to the existence of outliers. A market interactions model of the autoregressive kind is estimated using robust regression. This procedure produces M-estimates of the parameters using iteratively reweighted least squares. The robust method handles the outlier problem well and, by failing to produce white noise in the squared residuals, reveals the true nature of the statistical properties of the data. Both markets show a one-time break in the variance, creating two periods of differential behavior, one of them with GARCH properties. Robust unit root and cointegration tests also fail to produce white-noise squared residuals due to the same phenomenon. Further research requires the development of a robust procedure that can handle the outlier and heteroskedasticity problems simultaneously.
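The following is a minimal sketch of M-estimation by iteratively reweighted least squares with Huber weights, the general procedure described above; the AR(1) specification, tuning constant, and simulated series are illustrative, not the Bolivian data.

```python
import numpy as np

def huber_irls(X, y, k=1.345, n_iter=50):
    """M-estimation via iteratively reweighted least squares (Huber weights)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]        # start from OLS
    for _ in range(n_iter):
        r = y - X @ beta
        s = np.median(np.abs(r - np.median(r))) / 0.6745   # robust scale (MAD)
        u = np.abs(r) / (s + 1e-12)
        w = np.where(u <= k, 1.0, k / u)               # Huber weight function
        Xw = X * w[:, None]                            # down-weight outliers
        beta = np.linalg.solve(Xw.T @ X, Xw.T @ y)     # weighted normal equations
    return beta

# Illustration: AR(1) fit to a depreciation-rate-like series with outliers.
rng = np.random.default_rng(94)
x = np.zeros(300)
for t in range(1, 300):
    x[t] = 0.6 * x[t - 1] + rng.normal()
x[rng.choice(300, 5)] += 15                            # inject gross outliers
X = np.column_stack([np.ones(299), x[:-1]])
print("robust AR(1) coefficient:", huber_irls(X, x[1:]).round(3))
```

Because the weights shrink the influence of large residuals rather than discarding them, the outliers remain visible in the squared residuals, which is exactly how the thesis detects the variance break and GARCH behavior.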
168
An Investigation of Cluster Analysis
Klingel, John C., 01 May 1973
Three cluster analysis programs were used to group the same 64 individuals, generated so as to represent eight populations of eight individuals each. Each individual had quantitative values for seven attributes. All eight populations shared a common attribute variance-covariance matrix.
The first program, from F. J. Rohlf's MINT package, implemented single linkage, with correlation used as the basis for similarity. The results were not satisfactory, and the further use of correlation is questionable.
The second program, MDISP, bases similarity on Euclidean distance. It was found to give excellent results, in that it clustered individuals into the exact populations from which they were generated. It is the recommended program of the three used here.
The last program, MINFO, bases similarity on mutual information. It also gave very satisfactory results but, for reasons of visualization, was found less favorable than the MDISP program.
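The same comparison can be sketched with modern tools; MINT, MDISP, and MINFO are legacy programs, so the sketch below substitutes scipy's hierarchical clustering (the linkage choices only approximate the originals): single linkage on correlation distance versus average linkage on Euclidean distance, scored by within-population purity on data generated like the thesis design.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(73)

# 8 populations x 8 individuals, 7 attributes, shared covariance (identity here).
centers = rng.normal(0, 4, size=(8, 7))
X = np.vstack([c + rng.normal(size=(8, 7)) for c in centers])
truth = np.repeat(np.arange(8), 8)

for name, dist, method in [("correlation + single", "correlation", "single"),
                           ("euclidean + average", "euclidean", "average")]:
    labels = fcluster(linkage(pdist(X, metric=dist), method=method),
                      t=8, criterion="maxclust")
    # crude recovery score: largest agreement within each true population
    score = np.mean([np.bincount(labels[truth == g]).max() / 8 for g in range(8)])
    print(f"{name}: within-population purity = {score:.2f}")
```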
169
Sequential Analysis for Tolerances of Noxious Weed Seeds
Tokko, Seung, 01 May 1972
The application of a sequential test, the sequential probability ratio test, to the tolerances of noxious weed seeds is studied. It is proved that the sequential test can give a power curve similar to that of the current fixed-sample test if the test parameters are properly chosen.

The average sample size required by a sequential test is, in general, smaller than that of the existing test. In some cases, however, it requires a relatively larger sample than the current test.

As a solution to this problem, a method of truncation is considered and a kind of mixed procedure is suggested. This procedure gives a power curve almost identical to the standard one with great savings in sample size; the sample size is always less than that of the current test procedure.
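A minimal sketch of the underlying sequential probability ratio test with a simple truncation rule follows; the hypothesized seed rates, error levels, and truncation point are illustrative, not the official tolerances or the thesis's mixed procedure.

```python
import numpy as np

def sprt_binomial(observations, p0=0.01, p1=0.03, alpha=0.05, beta=0.05,
                  max_n=None):
    """Wald's SPRT for H0: p = p0 vs H1: p = p1 on a stream of 0/1 outcomes
    (1 = noxious seed found). Returns the decision and sample size used."""
    upper = np.log((1 - beta) / alpha)            # cross above: reject H0
    lower = np.log(beta / (1 - alpha))            # cross below: accept H0
    llr = 0.0
    for n, x in enumerate(observations, start=1):
        llr += np.log(p1 / p0) if x else np.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "reject H0", n
        if llr <= lower:
            return "accept H0", n
        if max_n and n >= max_n:                  # crude truncation rule
            return ("reject H0" if llr > 0 else "accept H0"), n
    return "no decision", n

rng = np.random.default_rng(72)
seeds = rng.random(5000) < 0.01                   # true rate equals p0
print(sprt_binomial(seeds, max_n=2000))
```

On average the test stops well before a comparable fixed-sample size, which is the saving the abstract describes; truncation caps the occasional long run.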
170
Analysis of Case Histories by Markov Chains Using Juvenile Court Data of State of Utah
Uh, Soo-Hong, 01 May 1973
The purpose of this paper is to analyze juvenile court data using Markov chains. A computer program was generalized, using a single-array orientation, to analyze realizations of a Markov chain up to the kth order within machine limitations. The data used in this paper were gathered by the Juvenile Court of the State of Utah for administrative purposes and are limited to District II. The results of the paper "Statistical Inference About Markov Chains" by Anderson and Goodman were applied in testing hypotheses. The paper is divided into five chapters: introduction, statistical background, methodology, analysis and summary, and conclusions.
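For concreteness, the sketch below estimates a transition matrix from a single realization and applies the Anderson-Goodman likelihood-ratio test of first-order dependence against independence; the two-state chain is hypothetical, not the court data.

```python
import numpy as np
from scipy.stats import chi2

def transition_counts(seq, k):
    """Count one-step transitions in a single realization over k states."""
    counts = np.zeros((k, k))
    for a, b in zip(seq[:-1], seq[1:]):
        counts[a, b] += 1
    return counts

def test_first_order(seq, k):
    """Anderson-Goodman LR test: first-order chain vs independence."""
    n = transition_counts(seq, k)
    p_hat = n / n.sum(axis=1, keepdims=True)      # estimated transition matrix
    p_marg = n.sum(axis=0) / n.sum()              # independence model
    with np.errstate(divide="ignore", invalid="ignore"):
        lr = 2 * np.nansum(n * np.log(p_hat / p_marg))
    df = (k - 1) ** 2
    return p_hat, lr, chi2.sf(lr, df)

rng = np.random.default_rng(73)
P = np.array([[0.8, 0.2], [0.3, 0.7]])            # hypothetical 2-state chain
seq = [0]
for _ in range(500):
    seq.append(rng.choice(2, p=P[seq[-1]]))
p_hat, lr, pval = test_first_order(np.array(seq), 2)
print(p_hat.round(2), f"LR={lr:.1f}", f"p={pval:.4f}")
```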