1.
Sufficient sample sizes for the multivariate multilevel regression model. Chang, Wanchen. 08 September 2015.
The three-level multivariate multilevel model (MVMM) is a multivariate extension of the conventional univariate two-level hierarchical linear model (HLM) and is used for estimating and testing the effects of explanatory variables on a set of correlated continuous outcome measures. Two simulation studies were conducted to investigate the sample size requirements for restricted maximum likelihood (REML) estimation of three-level MVMMs, the effects of sample sizes and other design characteristics on estimation, and the performance of the MVMMs compared to corresponding two-level HLMs. The model for the first study was a random-intercept MVMM, and the model for the second study was a fully-conditional MVMM. Study conditions included number of clusters, cluster size, intraclass correlation coefficient, number of outcomes, and correlations between pairs of outcomes. The accuracy and precision of estimates were assessed with parameter bias, relative parameter bias, relative standard error bias, and 95% confidence interval coverage. Empirical power and type I error rates were also calculated. Implications of the results for applied researchers and suggestions for future methodological studies are discussed.
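The REML estimation studied in this thesis is beyond a short sketch, but the role of the intraclass correlation coefficient in such designs can be illustrated with a simpler moment estimator. The sketch below simulates a balanced two-level random-intercept dataset and recovers the ICC with a one-way ANOVA estimator; all function names and parameter values are ours, not the thesis's.

```python
import numpy as np

def simulate_two_level(n_clusters, cluster_size, icc, rng):
    """Simulate outcomes from a random-intercept model with a given ICC.
    Total variance is fixed at 1: between-cluster = icc, within = 1 - icc."""
    u = rng.normal(0.0, np.sqrt(icc), size=n_clusters)            # cluster effects
    e = rng.normal(0.0, np.sqrt(1 - icc), size=(n_clusters, cluster_size))
    return u[:, None] + e                                         # shape (J, n)

def anova_icc(y):
    """Moment (one-way ANOVA) estimator of the intraclass correlation."""
    J, n = y.shape
    grand = y.mean()
    msb = n * ((y.mean(axis=1) - grand) ** 2).sum() / (J - 1)     # between MS
    msw = ((y - y.mean(axis=1, keepdims=True)) ** 2).sum() / (J * (n - 1))
    sigma_b2 = max((msb - msw) / n, 0.0)                          # between variance
    return sigma_b2 / (sigma_b2 + msw)

rng = np.random.default_rng(0)
estimates = [anova_icc(simulate_two_level(50, 20, 0.2, rng)) for _ in range(200)]
print(np.mean(estimates))  # close to the true ICC of 0.2
```

Averaged over replications the estimator sits near the generating ICC, and rerunning with fewer clusters shows exactly the precision loss the simulation studies above quantify.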
2.
Sample Size Determination in Multivariate Parameters With Applications to Nonuniform Subsampling in Big Data High Dimensional Linear Regression. Yu Wang (11821553). 20 December 2021.
Subsampling is an important method in the analysis of Big Data. Subsample size determination (SSSD) plays a crucial part in extracting information from data and in overcoming the challenges that result from huge data sizes. In this thesis, (1) sample size determination (SSD) is investigated for multivariate parameters, and sample size formulas are obtained for the multivariate normal distribution. (2) Sample size formulas are obtained based on concentration inequalities. (3) Improved bounds for McDiarmid's inequalities are obtained. (4) The results are applied to nonuniform subsampling in Big Data high-dimensional linear regression. (5) Numerical studies are conducted.

The sample size formula for the univariate normal distribution is a staple of elementary statistics. To the best of our knowledge, its generalization to the multivariate normal (or, more generally, to multivariate parameters) has not attracted much attention. In this thesis, we introduce a definition of SSD and obtain explicit formulas for the multivariate normal distribution, in gratifying analogy to the univariate normal sample size formula.

Commonly used concentration inequalities provide exponential rates, and sample sizes based on these inequalities are often loose. Talagrand (1995) provided the missing factor to sharpen these inequalities. We obtain numeric values for the constants in the missing factor and slightly improve his results. Furthermore, we provide the missing factor in McDiarmid's inequality. These improved bounds are used to give shrunken sample sizes.
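The univariate formula the abstract calls elementary can be sketched as follows. The multivariate function shown is only our illustrative guess at what an extension could look like (a chi-square quantile for a joint confidence ellipsoid), not the thesis's actual formula.

```python
import math
from scipy.stats import norm, chi2

def n_univariate(sigma, margin, alpha=0.05):
    """Classical sample size so that a (1-alpha) CI for a normal mean
    has half-width at most `margin`: n >= (z_{1-alpha/2} * sigma / E)^2."""
    z = norm.ppf(1 - alpha / 2)
    return math.ceil((z * sigma / margin) ** 2)

def n_multivariate_sketch(cov_eigs, margin, alpha=0.05):
    """Illustrative multivariate analogue (NOT the thesis's formula):
    choose n so the (1-alpha) confidence ellipsoid for the mean vector fits
    in a ball of radius `margin`, using the largest eigenvalue of Sigma."""
    p = len(cov_eigs)
    c = chi2.ppf(1 - alpha, df=p)
    return math.ceil(c * max(cov_eigs) / margin ** 2)

print(n_univariate(1.0, 0.1))             # 385, the textbook value
print(n_multivariate_sketch([1.0, 0.5], 0.1))
```

With `p = 1` the sketch reduces exactly to the univariate formula, since the chi-square quantile with one degree of freedom equals the squared normal quantile.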
3.
Sample Size Determination in Auditing Accounts Receivable Using a Zero-Inflated Poisson Model. Pedersen, Kristen E. 28 April 2010.
In the practice of auditing, a sample of accounts is chosen to verify whether the accounts are materially misstated, as opposed to auditing all accounts, which would be too expensive. This paper seeks a method for choosing a sample size of accounts that gives a more accurate estimate than the sample size determination methods currently in use. Methods to determine sample size are reviewed under both frequentist and Bayesian settings, and then our method using the Zero-Inflated Poisson (ZIP) model is introduced, which explicitly considers zero versus non-zero errors. This model is favorable because of the excess zeros present in auditing data, which the standard Poisson model does not account for, and it could easily be extended to data similar to accounting populations.
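A minimal sketch of the excess-zeros point: data drawn from a zero-inflated Poisson contain far more zeros than a plain Poisson fitted to the same mean would predict. The mixing probability and rate below are invented for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)
n, pi_zero, lam = 100_000, 0.3, 2.0        # hypothetical audit-error parameters

# Zero-inflated Poisson draw: with probability pi_zero the account error is a
# structural zero, otherwise the error count is Poisson(lam).
structural_zero = rng.random(n) < pi_zero
counts = np.where(structural_zero, 0, rng.poisson(lam, size=n))

observed_zero_frac = np.mean(counts == 0)
# A plain Poisson fitted by matching the mean predicts far fewer zeros:
lam_hat = counts.mean()                     # ~ (1 - pi_zero) * lam = 1.4
poisson_zero_frac = np.exp(-lam_hat)        # ~ 0.25 predicted zeros

print(observed_zero_frac, poisson_zero_frac)
# ZIP truth: pi_zero + (1 - pi_zero) * exp(-lam), roughly 0.39 zeros here
```

The gap between the observed and Poisson-predicted zero fractions is exactly what motivates modeling zero versus non-zero errors separately.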
4.
Introduction to power and sample size in multilevel models. Venkatesan, Harini. 21 August 2012.
In this report we give a brief introduction to multilevel models, summarize the need for using them, discuss the assumptions underlying their use, and present by means of example the steps involved in model building. This introduction is followed by a discussion of power and sample size determination in multilevel designs. Some formulae are discussed to provide insight into the design aspects that are most influential for power and for the calculation of standard errors. Finally, we review the simulation study by Maas and Hox (2005) on the influence of different sample sizes at the individual and group levels on the accuracy of the estimates (regression coefficients and variances) and their standard errors.
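One of the most influential design quantities in such power calculations is the design effect of cluster sampling. A minimal sketch of the standard formula (the function names are ours):

```python
def design_effect(cluster_size, icc):
    """Design effect for a balanced two-level design:
    deff = 1 + (m - 1) * rho, where m is the cluster size and rho the ICC."""
    return 1.0 + (cluster_size - 1) * icc

def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n_total / design_effect(cluster_size, icc)

# 30 groups of 20 pupils with ICC = 0.05 behave like ~308 independent pupils.
print(effective_sample_size(600, 20, 0.05))
```

Even a modest ICC of 0.05 nearly halves the effective sample size at cluster size 20, which is why group-level sample size dominates power in multilevel designs.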
5.
Cutoff sample size estimation for survival data: a simulation study. Che, Huiwen. January 2014.
This thesis demonstrates a possible cutoff sample size point that balances goodness of estimation and study expenditure, using a practical cancer case. As it is crucial to determine the sample size when designing an experiment, researchers attempt to find a sample size that achieves the desired power and budget efficiency at the same time. The thesis shows how simulation can be used for sample size and precision calculations with survival data. The presentation concentrates on the simulation involved in carrying out the estimates and precision calculations. The Kaplan-Meier estimator and the Cox regression coefficient are chosen as point estimators, and the precision measurements focus on the mean square error and the standard error.
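The simulation logic can be sketched in miniature. The code below is a deliberately simplified stand-in for the thesis's study: exponential survival times with no censoring, so the empirical proportion surviving past t plays the role of the Kaplan-Meier estimate, and Monte Carlo MSE is tracked as the sample size grows.

```python
import numpy as np

def survival_mse(n, lam, t, n_reps, rng):
    """Monte Carlo MSE of the empirical survival estimate S_hat(t) = mean(T > t)
    for exponential survival times (a censoring-free stand-in for Kaplan-Meier)."""
    true_s = np.exp(-lam * t)
    times = rng.exponential(1.0 / lam, size=(n_reps, n))
    s_hat = (times > t).mean(axis=1)
    return float(np.mean((s_hat - true_s) ** 2))

rng = np.random.default_rng(7)
for n in (25, 50, 100, 200):
    print(n, survival_mse(n, lam=0.5, t=1.0, n_reps=2000, rng=rng))
# MSE shrinks roughly like 1/n; a cutoff sample size is the point where the
# extra precision no longer justifies the extra study expenditure.
```

Plotting MSE against n and looking for the elbow is precisely the kind of cutoff reasoning the thesis formalizes with real censored cancer data.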
6.
Influence of Correlation and Missing Data on Sample Size Determination in Mixed Models. Chen, Yanran. 26 July 2013.
No description available.
7.
Effect of Sample Size on IRT Equating of Unidimensional Tests in the Common-Item Nonequivalent Groups Design: A Monte Carlo Simulation Study. Wang, Xiangrong. 03 May 2012.
Test equating is important to large-scale testing programs for two reasons: strict test security is a key concern for high-stakes tests, and fairness of test equating is important for test takers. The question of adequate sample size often arises in test equating. However, most recommendations in the existing literature are based on classical test equating. Very few studies have systematically investigated the minimal sample size that leads to reasonably accurate equating results based on item response theory (IRT). The main purpose of this study was to examine the minimal sample size for desired IRT equating accuracy in the common-item nonequivalent groups design under various conditions. Accuracy was determined by examining the relative magnitude of six accuracy statistics. Two IRT equating methods were carried out on simulated tests with combinations of test length, test format, group ability difference, similarity of form difficulty, and parameter estimation method for 14 sample sizes, using Monte Carlo simulations with 1,000 replications per cell. Observed score equating and true score equating were compared to the criterion equating to obtain the accuracy statistics. The results suggest that different sample size requirements exist for different test lengths, test formats, and parameter estimation methods. Additionally, the results show the following. First, the results for true score equating and observed score equating are very similar. Second, the longer test has less accurate equating than the shorter one at the same sample size level, and as the sample size decreases, the gap grows. Third, the concurrent parameter estimation method produced less equating error than separate estimation at the same sample size level, and as the sample size is reduced, the difference increases. Fourth, the cases with different group ability have larger and less stable error compared to the base case and to the cases with different test difficulty, especially when using the separate parameter estimation method with sample sizes below 750. Last, the mixed-format test is equated more accurately than the single-format one at the same sample size level.
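One small building block of common-item equating is placing the two forms' item parameter estimates on a shared scale. The sketch below implements classical mean-sigma linking for difficulty parameters; the study above uses full IRT equating methods, and the item values here are invented for illustration.

```python
import statistics

def mean_sigma_link(b_old, b_new):
    """Mean-sigma linking constants (A, B) so the new form's difficulty
    estimates map onto the old form's scale: b_linked = A * b_new + B."""
    A = statistics.pstdev(b_old) / statistics.pstdev(b_new)
    B = statistics.mean(b_old) - A * statistics.mean(b_new)
    return A, B

# Common-item difficulty estimates from the two nonequivalent groups:
b_old = [-1.2, -0.4, 0.1, 0.8, 1.5]        # base-form scale
b_new = [-0.9, -0.1, 0.4, 1.1, 1.8]        # new-form scale, shifted by ~0.3
A, B = mean_sigma_link(b_old, b_new)
print(A, B)  # A ~ 1.0, B ~ -0.3: the two scales differ by a shift of 0.3
```

Because the linking constants are themselves estimated from sampled common items, small calibration samples make A and B noisy, which is one route by which small sample sizes degrade equating accuracy.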
8.
Bayesian decision theoretic methods for clinical trials. Tan, Say Beng. January 1999.
No description available.
9.
Estudo de algoritmos de otimização estocástica aplicados em aprendizado de máquina / Study of stochastic optimization algorithms applied to machine learning problems. Fernandes, Jessica Katherine de Sousa. 23 August 2017.
In different machine learning applications we may be interested in minimizing the expected value of some loss function. For this problem, stochastic optimization and sample size selection play an important role. This work presents the theoretical analysis of some algorithms from these two areas, including some variations that consider variance reduction. In the practical examples we observe the advantage of Stochastic Gradient Descent with respect to processing time and memory, but, considering the accuracy of the obtained solution together with the cost of minimization, the variance reduction methodologies obtain the best solutions. The Dynamic Sample Size Gradient and Line Search with variable sample size selection algorithms, despite obtaining better solutions than Stochastic Gradient Descent, have the disadvantage of a high computational cost.
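The dynamic sample size idea can be illustrated on a toy problem. The sketch below is not the thesis's Dynamic Sample Size Gradient algorithm; it only shows the general mechanism of starting with cheap small minibatches and growing the subsample to reduce gradient variance, with all problem data and schedule parameters invented.

```python
import numpy as np

rng = np.random.default_rng(3)
# Toy least-squares problem: minimize the average of ||X w - y||^2 terms.
X = rng.normal(size=(5000, 10))
w_true = rng.normal(size=10)
y = X @ w_true + 0.1 * rng.normal(size=5000)

def loss(w):
    return float(np.mean((X @ w - y) ** 2))

def dynamic_sample_size_gd(batch0=8, growth=1.05, steps=300, lr=0.05):
    """Gradient descent on growing subsamples: tiny cheap batches early
    (SGD-like speed), larger batches later to shrink gradient variance."""
    w, batch = np.zeros(10), float(batch0)
    for _ in range(steps):
        idx = rng.choice(len(X), size=int(batch), replace=False)
        grad = 2 * X[idx].T @ (X[idx] @ w - y[idx]) / int(batch)
        w -= lr * grad
        batch = min(batch * growth, len(X))   # dynamic sample size schedule
    return w

w_hat = dynamic_sample_size_gd()
print(loss(np.zeros(10)), loss(w_hat))  # the loss drops sharply
```

The growing batch is also the source of the cost disadvantage the abstract notes: late iterations touch nearly the full dataset, so each step costs as much as a full-batch gradient.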