21

Investigating the Effects of Sample Size, Model Misspecification, and Underreporting in Crash Data on Three Commonly Used Traffic Crash Severity Models

Ye, Fan, May 2011 (has links)
Numerous studies have applied crash severity models to explore the relationship between crash severity and its contributing factors. Although a large amount of work has been conducted on this topic, it has usually focused on individual model types, and only a limited amount of research has compared the performance of different crash severity models. Additionally, three major issues in the modeling process for crash severity analysis have not been sufficiently explored: sample size, model misspecification, and underreporting in crash data. Therefore, this research examined three commonly used traffic crash severity models, the multinomial logit (MNL), ordered probit (OP), and mixed logit (ML) models, with respect to the effects of sample size, model misspecification, and underreporting, using a Monte Carlo approach with both simulated and observed crash data. The sample size results are consistent with prior expectations: small sample sizes significantly affect the development of crash severity models, regardless of model type. Among the three models, the ML model requires the largest sample size and the OP model the smallest, with the MNL model falling between the two. When the sample size is sufficient, the model misspecification analysis leads to the following suggestions: to decrease the bias and variability of the estimated parameters, logit models should be selected over probit models, and more general and flexible specifications, such as those allowing randomness in the parameters (i.e., the ML model), are preferable. The analysis of underreported data showed that none of the three models is immune to underreporting. To minimize bias and reduce model variability, fatal crashes should be set as the baseline severity for the MNL and ML models, while for the OP model the crash severities should be ranked from fatal to property-damage-only (PDO) in descending order. Furthermore, when full or partial information about the unreported rates for each severity level is available, treating the crash data as outcome-based samples and estimating with the Weighted Exogenous Sample Maximum Likelihood Estimator (WESMLE) dramatically improves estimation for all three models compared with the standard Maximum Likelihood Estimator (MLE).
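A minimal Monte Carlo sketch of the sample-size experiment described in this abstract (not the thesis's code): simulate three-level crash severities from a known multinomial logit process, refit an MNL model at several sample sizes, and track the bias of the estimated coefficients. All parameter values below are invented for illustration.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
# True coefficients: columns are severity levels (PDO, injury, fatal);
# PDO is the baseline, so its column is fixed at zero.
true_beta = np.array([[0.0, -0.5, -2.0],   # intercepts
                      [0.0,  0.6,  1.1]])  # effect of one covariate (e.g. speed)

def simulate(n):
    X = sm.add_constant(rng.normal(size=n))
    util = X @ true_beta                    # linear utilities
    prob = np.exp(util) / np.exp(util).sum(axis=1, keepdims=True)
    y = np.array([rng.choice(3, p=p) for p in prob])
    return y, X

for n in (200, 1000, 5000):
    errors = []
    for _ in range(50):                     # Monte Carlo replications
        y, X = simulate(n)
        try:
            res = sm.MNLogit(y, X).fit(disp=0)
            errors.append(np.asarray(res.params)[1, :] - true_beta[1, 1:])
        except Exception:                   # very small samples may fail to converge
            continue
    print(f"n = {n:5d}, mean |bias| of covariate effects:",
          np.mean(np.abs(errors), axis=0))
```

Comparing the printed bias across n illustrates the kind of evidence behind the thesis's conclusion that small samples degrade all three model types.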
22

Preemptive power analysis for the consulting statistician: novel applications of internal pilot design and information-based monitoring systems

Sawrie, David Franklin. January 2007 (has links) (PDF)
Thesis (Ph.D.)--University of Alabama at Birmingham, 2007. / Title from PDF title page (viewed on Feb. 19, 2010). Includes bibliographical references.
23

Self-designing optimal group sequential clinical trials

Thach, Chau Thuy. January 2000 (has links)
Thesis (Ph. D.)--University of Washington, 2000. / Vita. Includes bibliographical references (leaves 107-111).
24

Statistical analysis of TxCAP and its subsystems

Qazi, Abdus Shakur 29 September 2011 (has links)
The Texas Department of Transportation (TxDOT) uses the Texas Condition Assessment Program (TxCAP) to measure and compare overall road maintenance conditions among its 25 districts. TxCAP combines data from three existing subsystems to provide an overall picture of the condition of state roads: the Pavement Management Information System (PMIS), which scores pavement condition; the Texas Maintenance Assessment Program (TxMAP), which evaluates roadside conditions; and the Texas Traffic Assessment Program (TxTAP), which evaluates the condition of signs, work zones, railroad crossings, and other traffic elements. As a result, TxCAP provides a more comprehensive assessment of interstate and non-interstate highways. However, the scores for each subsystem are based on data with different sample sizes, accuracy, and levels of variation, making it difficult to decide whether the difference between two TxCAP scores reflects a true difference or measurement error. Whether TxCAP is an effective and consistent means of measuring TxDOT roadway maintenance conditions therefore needs to be evaluated. To achieve this objective, statistical analyses of the system were conducted in two ways: 1) to determine whether sufficient samples are collected for each of the subsystems, and 2) to determine whether the scores are statistically different from each other. A case study was conducted with a statewide dataset covering 2008 to 2010. The results show that the differences in scores are statistically significant for some pairs of districts and insignificant for others. It is therefore recommended that TxDOT either compare the 25 districts by groups/tiers or increase the sample size of the data being collected so that districts can be compared individually.
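A small sketch (not from the thesis) of the two questions posed above, using invented section-level scores for two hypothetical districts: a Welch two-sample t-test for whether the score difference is significant, and a power-based check of how many sampled sections would be needed to detect a given difference.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.power import TTestIndPower

rng = np.random.default_rng(0)
district_a = rng.normal(loc=82.0, scale=6.0, size=40)   # hypothetical section scores
district_b = rng.normal(loc=79.5, scale=7.0, size=35)

# Is the observed difference between the two districts statistically significant?
t_stat, p_value = stats.ttest_ind(district_a, district_b, equal_var=False)
print(f"Welch t = {t_stat:.2f}, p = {p_value:.3f}")

# How many sections per district would be needed to detect a 3-point difference
# with 80% power, assuming a pooled SD of about 6.5?
effect_size = 3.0 / 6.5
n_needed = TTestIndPower().solve_power(effect_size=effect_size,
                                       power=0.80, alpha=0.05)
print(f"sections needed per district: {int(np.ceil(n_needed))}")
```

The same two steps, applied per subsystem and per district pair, are essentially the "sufficient samples" and "statistically different scores" checks the abstract describes.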
25

Color Image Based Face Recognition

Ganapathi, Tejaswini 24 February 2009 (has links)
Traditional appearance-based face recognition (FR) systems use grayscale images; however, attention has recently been drawn to the use of color images. Color inputs have a higher dimensionality, which increases the computational cost and makes the small sample size (SSS) problem in supervised FR systems more challenging. It is therefore important to determine the scenarios in which color information helps the FR system. This thesis finds that including chromatic information is particularly advantageous under poor illumination conditions. In supervised systems, a color input of optimal dimensionality can improve FR performance under SSS conditions. A fusion of decisions from the individual spectral planes also helps in the SSS scenario. Finally, chromatic information is integrated into a supervised ensemble learner to address pose and illumination variations. This framework significantly boosts FR performance under a range of learning scenarios.
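A toy sketch (not the thesis's algorithm) of the decision-fusion idea mentioned above: fit a separate PCA plus nearest-neighbour classifier on each spectral plane and average the predicted class probabilities. The "face images" here are synthetic arrays, purely for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
n_subjects, n_train, n_pixels = 5, 4, 32 * 32
# One mean pattern per subject per channel; samples are that pattern plus noise.
means = rng.normal(size=(3, n_subjects, n_pixels))

def sample(n_per_subject):
    X = np.stack([means[:, s] + 0.8 * rng.normal(size=(3, n_pixels))
                  for s in range(n_subjects) for _ in range(n_per_subject)])
    y = np.repeat(np.arange(n_subjects), n_per_subject)
    return X, y          # X has shape (n_samples, 3 channels, n_pixels)

X_train, y_train = sample(n_train)     # small training set, mimicking the SSS setting
X_test, y_test = sample(10)

probas = []
for c in range(3):       # one classifier per spectral plane
    clf = make_pipeline(PCA(n_components=10), KNeighborsClassifier(n_neighbors=3))
    clf.fit(X_train[:, c], y_train)
    probas.append(clf.predict_proba(X_test[:, c]))

fused = np.mean(probas, axis=0)        # simple sum-rule fusion of the three planes
print(f"fused accuracy: {np.mean(fused.argmax(axis=1) == y_test):.2f}")
```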
26

A New Reclassification Method for Highly Uncertain Microarray Data in Allergy Gene Prediction

Paul, Jasmin 11 April 2012 (has links)
The analysis of microarray data is challenging because of the large dimensionality and small sample size involved. Although a few methods are available to address the problem of small sample size, they are not sufficiently successful with microarray data from extremely small samples (fewer than about 20). We propose a method that incorporates information from diverse sources into the analysis of microarray data so as to improve the predictability of significant genes. A transformed data set, including statistical parameters, literature-mining results, and gene ontology data, is evaluated. We performed classification experiments to identify potential allergy-related genes and used feature selection to identify the effect of individual features on classifier behaviour. An exploratory and domain-knowledge analysis was performed on noisy real-life allergy data, and subsets of genes were selected as the positive and negative classes. A new set of transformed variables, based on the mean and standard deviation of the data distribution and on the other data sources, was identified. Significant allergy- and immune-related genes were selected from the microarray data. Experiments showed that the classification predictability of significant genes can be improved, and important features within the transformed variable set were identified.
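A minimal sketch (not the author's pipeline, and using only the statistical-transformation idea, not the literature-mining or gene-ontology sources): represent each gene by summary statistics of its expression profile, then classify genes as allergy-related or not with feature selection. All data here are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(7)
n_genes, n_arrays = 200, 12                 # very small sample size, as in the abstract
expr = rng.normal(size=(n_genes, n_arrays))
labels = rng.integers(0, 2, size=n_genes)   # 1 = putative allergy gene (synthetic)
expr[labels == 1] += 0.5                    # inject a weak class signal

# Transformed variables: per-gene mean, SD, median, and range of expression.
features = np.column_stack([expr.mean(axis=1), expr.std(axis=1),
                            np.median(expr, axis=1), np.ptp(expr, axis=1)])

clf = make_pipeline(SelectKBest(f_classif, k=2),
                    RandomForestClassifier(n_estimators=200, random_state=0))
scores = cross_val_score(clf, features, labels, cv=5)
print(f"cross-validated accuracy: {scores.mean():.2f}")
```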
28

Robust estimation of inter-chip variability to improve microarray sample size calculations

Knowlton, Nicholas Scott. January 2005 (has links) (PDF)
Thesis--University of Oklahoma. / Bibliography: leaves 82-83.
29

Guidance for using pilot studies to inform the design of intervention trials with continuous outcomes

Bell, Melanie L; Whitehead, Amy L; Julious, Steven A (has links)
Background: A pilot study can be an important step in the assessment of an intervention, providing information for the design of the future definitive trial. Pilot studies can be used to estimate recruitment and retention rates and the population variance, and to provide preliminary evidence of efficacy potential. However, because pilot studies are small, these estimates are imprecise, so sensitivity analyses for the main trial's sample size calculations should be undertaken. Methods: We demonstrate, using an example, how to carry out easy-to-perform sensitivity analyses for designing trials based on pilot data. Furthermore, we introduce rules of thumb for the size of the pilot study so that the overall sample size, for both the pilot and the main trial, is minimized. Results: The example illustrates how sample size estimates for the main trial can change dramatically when assumptions are varied plausibly. The required sample size for 90% power varied from 392 to 692, depending on the assumptions. Some scenarios were not feasible given the pilot study's recruitment and retention rates. Conclusion: Pilot studies can help in designing the main trial, but caution should be exercised. We recommend the use of sensitivity analyses to assess the robustness of the design assumptions for the main trial.
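A small sketch (not the paper's code) of the kind of sensitivity analysis described above: recompute the main trial's per-arm sample size for a two-arm comparison while varying the standard deviation estimated from a hypothetical pilot study. The pilot SD, pilot size, and target difference below are assumed values for illustration.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.power import TTestIndPower

pilot_sd, pilot_n = 1.2, 30       # SD estimate and size from a (hypothetical) pilot
target_diff = 0.5                 # clinically relevant difference for the main trial
power, alpha = 0.90, 0.05

# Inflate the pilot SD to one-sided upper confidence limits (chi-square based),
# one common way of allowing for the imprecision of a small pilot.
df = pilot_n - 1
for label, gamma in [("pilot SD as-is", None), ("80% UCL", 0.80), ("95% UCL", 0.95)]:
    sd = pilot_sd if gamma is None else pilot_sd * np.sqrt(df / stats.chi2.ppf(1 - gamma, df))
    n_per_arm = TTestIndPower().solve_power(effect_size=target_diff / sd,
                                            power=power, alpha=alpha)
    print(f"{label:>14}: SD = {sd:.2f}, n per arm = {int(np.ceil(n_per_arm))}")
```

Seeing how quickly the required n grows as the assumed SD is inflated is exactly the "assumptions vary plausibly, sample size changes dramatically" point made in the abstract.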
30

Estimating the necessary sample size for a binomial proportion confidence interval with low success probabilities

Ahlers, Zachary (has links)
Master of Science / Department of Statistics / Christopher Vahl / Among the most used statistical concepts and techniques, seen even in the most cursory of introductory courses, are the confidence interval, binomial distribution, and sample size estimation. This paper investigates a particular case of generating a confidence interval from a binomial experiment in the case where zero successes are expected. Several current methods of generating a binomial proportion confidence interval are examined by means of large-scale simulations and compared in order to determine an ad-hoc method for generating a confidence interval with coverage as close as possible to nominal while minimizing width. This is then used to construct a formula which allows for the estimation of a sample size necessary to obtain a sufficiently narrow confidence interval (with some predetermined probability of success) using the ad-hoc method given a prior estimate of the probability of success for a single trial. With this formula, binomial experiments could potentially be planned more efficiently, allowing researchers to plan only for the amount of precision they deem necessary, rather than trying to work with methods of producing confidence intervals that result in inefficient or, at worst, meaningless bounds.
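A brief sketch (not the thesis's ad hoc method) of the two ingredients the abstract describes: comparing the coverage of standard binomial-proportion intervals by simulation when successes are rare, and searching for the smallest n whose interval width at the expected count meets a target, given a prior guess of p. Target values are invented.

```python
import numpy as np
from statsmodels.stats.proportion import proportion_confint

rng = np.random.default_rng(3)
p_true, n, reps = 0.01, 300, 2000

# (1) Simulated coverage of Wilson and Clopper-Pearson ("beta") intervals at low p.
for method in ("wilson", "beta"):
    covered = 0
    for _ in range(reps):
        x = rng.binomial(n, p_true)
        lo, hi = proportion_confint(x, n, alpha=0.05, method=method)
        covered += (lo <= p_true <= hi)
    print(f"{method:>7}: coverage = {covered / reps:.3f}")

# (2) Smallest n whose interval width at the expected count is below a target,
#     for a prior estimate of the per-trial success probability.
def required_n(p_guess, target_width, method="wilson"):
    n = 10
    while True:
        x = int(round(n * p_guess))
        lo, hi = proportion_confint(x, n, alpha=0.05, method=method)
        if hi - lo <= target_width:
            return n
        n += 10

print("n for width <= 0.02 at p ~ 0.01:", required_n(0.01, 0.02))
```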
