41

Who Are the Cigarette Smokers in Arizona

Chen, Mei-Kuang January 2007 (has links)
The purpose of this study was to investigate the relationship between cigarette smoking and socio-demographic variables, based on the empirical literature and early theories in the field. Two regression approaches, logistic regression and linear multiple regression, were applied to the two most recent Arizona Adult Tobacco Surveys to test the hypothesized models. The results showed that cigarette smokers in Arizona are mainly residents who have not completed a four-year college degree, who are unemployed, White, non-Hispanic, or young to middle-aged adults. Among the socio-demographic predictors of interest, education was the most important variable in identifying cigarette smokers, although the predictive power of these socio-demographic variables as a whole was small. Practical and methodological implications of these findings are discussed.
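As an illustration of the kind of analysis described above, the following Python sketch fits a logistic regression of smoking status on socio-demographic predictors. The file name and column names (smoker, education, employed, race, ethnicity, age_group) are hypothetical placeholders, not the actual Arizona Adult Tobacco Survey variables.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical survey extract; file and column names are placeholders.
survey = pd.read_csv("az_tobacco_survey.csv")

# Binary outcome (smoker = 1/0) regressed on categorical socio-demographics.
model = smf.logit(
    "smoker ~ C(education) + C(employed) + C(race) + C(ethnicity) + C(age_group)",
    data=survey,
).fit()

print(model.summary())                 # coefficients and Wald tests
print(model.get_margeff().summary())   # average marginal effects, easier to compare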
42

Sample Size in Ordinal Logistic Hierarchical Linear Modeling

Timberlake, Allison M 07 May 2011 (has links)
Most quantitative research is conducted by randomly selecting members of a population on which to conduct a study. When statistics are run on a sample, and not the entire population of interest, they are subject to a certain amount of error. Many factors can impact the amount of error, or bias, in statistical estimates. One important factor is sample size; larger samples are more likely to minimize bias than smaller samples. Therefore, determining the necessary sample size to obtain accurate statistical estimates is a critical component of designing a quantitative study. Much research has been conducted on the impact of sample size on simple statistical techniques such as group mean comparisons and ordinary least squares regression. Less sample size research, however, has been conducted on complex techniques such as hierarchical linear modeling (HLM). HLM, also known as multilevel modeling, is used to explain and predict an outcome based on knowledge of other variables in nested populations. Ordinal logistic HLM (OLHLM) is used when the outcome variable has three or more ordered categories. While there is a growing body of research on sample size for two-level HLM utilizing a continuous outcome, there is no existing research exploring sample size for OLHLM. The purpose of this study was to determine the impact of sample size on statistical estimates for ordinal logistic hierarchical linear modeling. A Monte Carlo simulation study was used to investigate this research question. Four variables were manipulated: level-one sample size, level-two sample size, sample outcome category allocation, and predictor-criterion correlation. The statistical estimates explored included bias in level-one and level-two parameters, power, and prediction accuracy. Results indicate that, in general, holding other conditions constant, bias decreases as level-one sample size increases. However, bias increases or remains unchanged as level-two sample size increases, holding other conditions constant. Power to detect the independent-variable coefficients increased as both level-one and level-two sample size increased, holding other conditions constant. Overall, prediction accuracy was extremely poor: the accuracy rate across conditions was 47.7%, with little variance across conditions. Furthermore, there was a strong tendency to over-predict the middle outcome category.
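To make the simulation design concrete, the sketch below generates one cell of a Monte Carlo study for a two-level, three-category ordinal outcome with a random intercept, in the spirit of the study described above. The parameter values (fixed effect, random-intercept standard deviation, cutpoints) are illustrative assumptions, not those used in the dissertation.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_ordinal_two_level(n_groups=30, n_per_group=20,
                               gamma=0.5, tau=0.3, cutpoints=(-0.5, 0.5)):
    """One simulated data set: 3-category ordinal outcome, random intercepts."""
    x = rng.normal(size=(n_groups, n_per_group))      # level-1 predictor
    u = rng.normal(scale=tau, size=(n_groups, 1))     # level-2 random intercepts
    eta = gamma * x + u                               # linear predictor
    latent = eta + rng.logistic(size=eta.shape)       # cumulative-logit latent form
    y = np.digitize(latent, cutpoints)                # ordered categories 0, 1, 2
    groups = np.repeat(np.arange(n_groups), n_per_group)
    return x.ravel(), groups, y.ravel()

x, group, y = simulate_ordinal_two_level()
print(np.bincount(y) / y.size)   # realized outcome-category allocation
```

Repeating such a generator across the manipulated factors, fitting the ordinal logistic HLM to each data set, and comparing estimates with the generating parameters is what yields the bias, power, and prediction-accuracy summaries reported above.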
43

Factors that Influence Cross-validation of Hierarchical Linear Models

Widman, Tracy 07 May 2011 (has links)
While use of hierarchical linear modeling (HLM) to predict an outcome is reasonable and desirable, employing the model for prediction without first establishing the model’s predictive validity is ill-advised. Estimating the predictive validity of a regression model by cross-validation has been thoroughly researched, but there is a dearth of research investigating the cross-validation of hierarchical linear models. One of the major obstacles in cross-validating HLM is the lack of a measure of explained variance similar to the squared multiple correlation coefficient in regression analysis. The purpose of this Monte Carlo simulation study is to explore the impact of sample size, centering, and predictor-criterion correlation magnitudes on potential cross-validation measurements for hierarchical linear modeling. This study considered the impact of 64 simulated conditions across three explained variance approaches: Raudenbush and Bryk’s (2002) proportional reduction in error variance, Snijders and Bosker’s (1994) modeled variance, and a measure of explained variance proposed by Gagné and Furlow (2009). For each of the explained variance approaches, a cross-validation measurement, shrinkage, was obtained. The results indicate that sample size, predictor-criterion correlations, and centering impact the cross-validation measurement. The degree and direction of the impact differs with the explained variance approach employed. Shrinkage decreased with larger level-2 sample sizes under some explained variance approaches and increased under others. Likewise, grand-mean centering resulted in higher shrinkage estimates than group-mean centering under some approaches but smaller estimates under others. Larger total sample sizes yielded smaller shrinkage estimates, as did the predictor-criterion correlation combination in which the group-level predictor had the stronger correlation. The approaches to explained variance differed substantially in their usability for cross-validation. The Snijders and Bosker approach provided relatively large shrinkage estimates, and, depending on the predictor-criterion correlation, shrinkage under both Raudenbush and Bryk measures could be so sizable that the estimate begins to lack meaning. Researchers seeking to cross-validate HLM need to be mindful of the interplay between the explained variance approach employed and the impact of sample size, centering, and predictor-criterion correlations on shrinkage estimates when making research design decisions.
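For reference, the sketch below shows a Snijders and Bosker (1994) style level-1 explained-variance measure and the shrinkage quantity obtained by comparing calibration and validation samples. The variance-component values are invented, and the calculation is a simplified stand-in for the study's full cross-validation procedure.

```python
def snijders_bosker_r2_level1(sigma2_full, tau2_full, sigma2_null, tau2_null):
    """R^2_1 = 1 - (sigma^2_F + tau^2_F) / (sigma^2_0 + tau^2_0)."""
    return 1.0 - (sigma2_full + tau2_full) / (sigma2_null + tau2_null)

# Variance components would come from fitted null and full multilevel models;
# the numbers here are made up for illustration.
r2_calibration = snijders_bosker_r2_level1(0.70, 0.12, 1.00, 0.25)
r2_validation  = snijders_bosker_r2_level1(0.78, 0.15, 1.00, 0.25)

shrinkage = r2_calibration - r2_validation   # cross-validation shrinkage
print(f"R2 calibration = {r2_calibration:.3f}")
print(f"R2 validation  = {r2_validation:.3f}")
print(f"shrinkage      = {shrinkage:.3f}")
```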
44

Approximation of the sample size calculation for multiple hypothesis tests when r of m hypotheses must be significant

Delorme, Philippe 12 1900 (has links)
Generally, in multiple-endpoint situations one seeks to reject either all of the hypotheses or only one of them. More recently, the need has emerged to answer the question: "Can we reject at least r hypotheses?" However, statistical tools for answering this question are rare in the literature. We therefore derive general power formulas for the most widely used procedures: the Bonferroni, Hochberg, and Holm procedures. We also develop an R package for sample size calculation in multiple-endpoint tests where at least r of the m hypotheses must be significant. We restrict ourselves to the case where all variables are continuous, and we present four different situations that depend on the structure of the variance-covariance matrix of the data.
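The following Python sketch estimates, by simulation, the power to reject at least r of m hypotheses under the Bonferroni procedure for equicorrelated normal test statistics. It is a hedged illustration of the quantity the power formulas target; the effect size, correlation, and alpha are arbitrary, and the thesis derives analytical formulas (and an R package) rather than relying on simulation.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def power_at_least_r(n, delta, r, m, rho=0.3, alpha=0.05, n_sim=20_000):
    """Monte Carlo P(at least r of m one-sided z-tests reject), Bonferroni."""
    cov = np.full((m, m), rho)                    # equicorrelated test statistics
    np.fill_diagonal(cov, 1.0)
    mean = np.full(m, delta * np.sqrt(n))         # common noncentrality
    z = rng.multivariate_normal(mean, cov, size=n_sim)
    crit = stats.norm.ppf(1 - alpha / m)          # Bonferroni critical value
    n_rejected = (z > crit).sum(axis=1)
    return (n_rejected >= r).mean()

print(power_at_least_r(n=50, delta=0.4, r=2, m=4))
```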
45

An Investigation of the Optimal Sample Size, Relationship between Existing Tests and Performance, and New Recommended Specifications for Flexible Base Courses in Texas

Hewes, Bailey 03 October 2013 (has links)
The purpose of this study was to improve flexible base course performance within the state of Texas while reducing TxDOT’s testing burden. The focus of this study was to revise the current specification with the intent of providing a “performance-related” specification while optimizing sample sizes and testing frequencies based on material variability. A literature review yielded information on base course variability within and outside the state of Texas, and on which tests other states and Canada currently use to characterize flexible base performance. A sampling and testing program was conducted at Texas A&M University to define current variability information, and to conduct performance-related tests including resilient modulus and permanent deformation. In addition to these data being more current, they are more representative of short-term variability than data obtained from the literature. This “short-term” variability is considered more realistic for what typically occurs during construction operations. A statistical sensitivity analysis (based on the 80th percentile standard deviation) of these data was conducted to determine minimum sample sizes for contractors to qualify for the proposed quality monitoring program (QMP). The required sample sizes for contractors to qualify for the QMP are 20 for gradation, compressive strength, and moisture-density tests, 15 for Atterberg Limits, and 10 for Wet Ball Mill. These sample sizes are based on a minimum 25,000-ton stockpile, or “lot”. After qualifying for the program, if contractors can prove their variability is better than the 80th percentile, they can reduce their testing frequencies. The sample size for TxDOT’s verification testing is 5 samples per lot and will remain at that number regardless of reduced variability. Once qualified for the QMP, a contractor may continue to send material to TxDOT projects until a failing sample disqualifies the contractor from the program. TxDOT does not currently require washed gradations for flexible base. Dry and washed sieve analyses were performed during this study to investigate the need for washed gradations. Statistical comparisons of these data yielded strong evidence that TxDOT should always use a washed method. Significant differences between the washed and dry methods were found for the percentage of material passing the No. 40 and No. 200 sieves. Since TxDOT already specifies limits on the fraction of material passing the No. 40 sieve, and since this study yielded evidence of that size fraction having a relationship with resilient modulus (performance), it would be beneficial to use a washed sieve analysis and therefore obtain a more accurate reading for that specification. Furthermore, it is suggested that TxDOT require contractors to set “target” test values and to place 90 percent-within-limits (90PWL) bands around those target values to control material variability.
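As a small illustration of the washed-versus-dry comparison, the following sketch runs a paired t-test on percent passing the No. 200 sieve measured both ways on the same samples; the numbers are invented for illustration and are not data from the study.

```python
import numpy as np
from scipy import stats

# Percent passing the No. 200 sieve on the same samples, measured two ways
# (values are hypothetical).
dry_passing    = np.array([6.1, 5.8, 7.0, 6.4, 5.5, 6.8, 7.2, 6.0])
washed_passing = np.array([8.9, 8.1, 9.6, 9.0, 7.9, 9.4, 10.1, 8.5])

t_stat, p_value = stats.ttest_rel(washed_passing, dry_passing)
print(f"mean difference = {np.mean(washed_passing - dry_passing):.2f} percentage points")
print(f"paired t = {t_stat:.2f}, p = {p_value:.4f}")
```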
46

Some problems in high dimensional data analysis

Pham, Tung Huy January 2010 (has links)
The growth of economics and technology has had an enormous impact on society. Along with these developments, human activities now produce massive amounts of data that can be collected easily and at relatively low cost with the aid of new technologies. Examples include web term-document data, sensor arrays, gene expression, finance data, imaging, and hyperspectral analysis. Because of the enormous amount of data from varied and new sources, more and more challenging scientific problems appear, and these problems have changed the kinds of problems on which mathematical scientists work. / In traditional statistics, the dimension of the data, p say, is low, with many observations, n say. In this case, classical rules such as the Central Limit Theorem are often applied to obtain some understanding from the data. A new challenge for statisticians today is a different setting, in which the data dimension is very large and the number of observations is small. The mathematical assumption may now be p > n, or even p tending to infinity with n fixed; for example, there are few patients but many genes. In these cases, classical methods fail to produce a good understanding of the nature of the problem, so new methods need to be found, along with mathematical explanations that generalize these cases. / The research presented in this thesis addresses two problems, variable selection and classification, in the case where the dimension is very large. The work on variable selection, in particular the Adaptive Lasso, was completed by June 2007, and the research on classification was carried out throughout 2008 and 2009. The research on the Dantzig selector and the Lasso was finished in July 2009. The thesis is therefore divided into two parts. In the first part we study the Adaptive Lasso, the Lasso, and the Dantzig selector. In particular, Chapter 2 presents some results for the Adaptive Lasso, and Chapter 3 provides two examples showing that neither the Dantzig selector nor the Lasso is uniformly better than the other. The second part of the thesis is organized as follows. In Chapter 5 we construct the model setting. In Chapter 6 we summarize, and prove, results on the scaled centroid-based classifier. Because there are similarities between the Support Vector Machine (SVM) and Distance Weighted Discrimination (DWD) classifiers, Chapter 8 introduces a class of distance-based classifiers that can be considered a generalization of the SVM and DWD classifiers. Chapters 9 and 10 are about the SVM and DWD classifiers. Chapter 11 demonstrates the performance of these classifiers on simulated data sets and some cancer data sets.
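The sketch below illustrates a simple centroid-based classifier for the p >> n setting discussed in the second part of the thesis: a new observation is assigned to the class whose scaled centroid is closest. Scaling by feature-wise standard deviations is one common choice and is assumed here; the thesis's exact scaled centroid-based classifier may differ.

```python
import numpy as np

def fit_scaled_centroids(X, y):
    """Class centroids plus a feature-wise scale (one simple scaling choice)."""
    classes = np.unique(y)
    centroids = np.vstack([X[y == c].mean(axis=0) for c in classes])
    scale = X.std(axis=0) + 1e-8                  # guard against zero variance
    return classes, centroids, scale

def predict(X_new, classes, centroids, scale):
    # scaled Euclidean distance from each observation to each class centroid
    d = np.linalg.norm((X_new[:, None, :] - centroids[None, :, :]) / scale, axis=2)
    return classes[d.argmin(axis=1)]

# Toy p >> n example: 40 observations, 1000 features, weak signal in 10 features.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 1000))
y = np.repeat([0, 1], 20)
X[y == 1, :10] += 1.0

classes, centroids, scale = fit_scaled_centroids(X, y)
print((predict(X, classes, centroids, scale) == y).mean())  # resubstitution accuracy
```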
48

Power and sample size of cluster randomized trials

You, Zhiying. January 2008 (has links) (PDF)
Thesis (Ph.D.)--University of Alabama at Birmingham, 2008. / Title from first page of PDF file (viewed on June 29, 2009). Includes bibliographical references.
49

Comparing the performance of four calculation methods for estimating the sample size in repeated measures clinical trials where difference in treatment groups means is of interest

Hagen, Clinton Ernest. January 2008 (has links) (PDF)
Thesis--University of Oklahoma. / Bibliography: leaf 51.
50

Hypothesis testing based on pool screening with unequal pool sizes

Gao, Hongjiang. January 2010 (has links) (PDF)
Thesis (Ph.D.)--University of Alabama at Birmingham, 2010. / Title from PDF title page (viewed on June 28, 2010). Includes bibliographical references.
