131

Simulation and Application of Binary Logic Regression Models

Heredia Rico, Jobany J 01 April 2016 (has links)
Logic regression (LR) is a methodology for identifying logic combinations of binary predictors, in the form of intersections (and), unions (or), and negations (not), that are linearly associated with an outcome variable. Logic regression uses the predictors as inputs and identifies important logic combinations of independent variables with a computationally efficient tree-based stochastic search algorithm, unlike classical regression models, which only consider pre-determined conventional interactions (the “and” rules). In this thesis, we focused on LR with a binary outcome in a logistic regression framework. Simulation studies were conducted to examine the performance of LR under independent and correlated observations, respectively, across various characteristics of the data sets and LR search parameters. We found that the proportion of times LR selected the correct logic rule was usually low when the signal and/or prevalence of the true logic rule was relatively low. The method performed satisfactorily under easy learning conditions such as high signal, simple logic rules, and/or small numbers of predictors. Given the simulation characteristics and correlation structures tested, we found some differences in performance when LR was applied to dependent observations compared to the independent case, but these were not significant. In addition to the simulation studies, an advanced application method was proposed that integrates LR with resampling methods to enhance LR performance. The proposed method was illustrated using two simulated data sets as well as a data set from a real-life situation, and showed some evidence of being effective in discerning the correct logic rule, even under unfavorable learning conditions.
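As an illustration of the kind of Boolean rule logic regression searches for, the sketch below builds a hypothetical rule (X1 AND NOT X2) OR X3 and scores it in a logistic model; this is a minimal sketch, not the thesis's tree-based stochastic search implementation, and the data and rule are made up.

```python
# Minimal sketch (not the thesis's implementation): evaluate one hypothetical
# Boolean rule over binary predictors and score it in a logistic model.
# Logic regression instead searches over many such rules with a tree-based
# stochastic search; the data and rule here are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
X = rng.integers(0, 2, size=(n, 4)).astype(bool)   # four binary predictors
rule = (X[:, 0] & ~X[:, 1]) | X[:, 2]              # (X1 AND NOT X2) OR X3
logit = -1.0 + 2.0 * rule                          # signal carried by the rule
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))  # binary outcome

# Treat the evaluated rule as a single covariate in a logistic regression.
fit = LogisticRegression().fit(rule.reshape(-1, 1).astype(float), y)
print(fit.coef_, fit.intercept_)
```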
132

Maximum Likelihood Estimation of Parameters in Exponential Power Distribution with Upper Record Values

Zhi, Tianchen 27 March 2017 (has links)
The exponential power (EP) distribution is an important lifetime distribution used in survival analysis and is related to the asymmetrical EP distribution. Many researchers have discussed statistical inference about the parameters of the EP distribution using i.i.d. random samples. Sometimes, however, the available data contain only record values, or it is more convenient for researchers to collect record values. We aim to address this situation. We estimated the two parameters of the EP distribution by maximum likelihood (MLE) using upper record values. In a simulation study, we used the bias and MSE of the estimators to assess the efficiency of the proposed estimation method. We then discussed prediction of the next upper record value from the known upper record values. The study concluded that the MLEs of the EP distribution parameters based on upper record values perform satisfactorily, and that prediction of the next upper record value also performs well.
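For reference, the likelihood maximized in this kind of estimation has a standard form for upper record values; the block below gives only that general form (the thesis's specific EP parameterization is not reproduced here).

```latex
% Joint likelihood of the first n upper record values r_1 < \dots < r_n from a
% parametric family with density f(\cdot\,;\theta) and CDF F(\cdot\,;\theta):
L(\theta \mid r_1,\dots,r_n)
  \;=\; f(r_n;\theta)\,\prod_{i=1}^{n-1}\frac{f(r_i;\theta)}{1 - F(r_i;\theta)} .
% The MLEs of the EP parameters are the maximizers of \log L, obtained numerically.
```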
133

A Comparison of Standard Denoising Methods for Peptide Identification

Carpenter, Skylar 01 May 2019 (has links)
Peptide identification using tandem mass spectrometry depends on matching the observed spectrum with the theoretical spectrum. The raw data from tandem mass spectrometry, however, are often not optimal because they may contain noise or measurement errors. Denoising these data can improve alignment between observed and theoretical spectra and reduce the number of peaks. The method of Lewis et al. (2018) uses a combined constant and moving threshold to denoise spectra. We compare the effects of applying the standard preprocessing methods baseline removal, wavelet smoothing, and binning to spectra alongside Lewis et al.'s threshold method. We consider individual methods and combinations, using measures of distance from Lewis et al.'s scoring function for comparison. Our findings showed that no single method provided better results than Lewis et al.'s, but combining techniques with that of Lewis et al. reduced the distance measurements and the size of the data set for many peptides.
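To make the preprocessing steps concrete, here is a small, purely illustrative sketch of intensity thresholding and fixed-width m/z binning; it is not Lewis et al.'s combined constant-and-moving threshold nor the code used in this work, and the cutoffs are hypothetical.

```python
# Illustrative sketch only (not Lewis et al.'s method or this thesis's code):
# two common preprocessing steps for a peak list -- intensity thresholding
# and fixed-width m/z binning that keeps the most intense peak per bin.
import numpy as np

def threshold_peaks(mz, intensity, frac=0.01):
    """Drop peaks below a fraction of the maximum intensity (hypothetical cutoff)."""
    keep = intensity >= frac * intensity.max()
    return mz[keep], intensity[keep]

def bin_peaks(mz, intensity, width=1.0):
    """Keep the most intense peak in each m/z bin of the given width."""
    bins = np.floor(mz / width).astype(int)
    kept = []
    for b in np.unique(bins):
        idx = np.where(bins == b)[0]
        kept.append(idx[np.argmax(intensity[idx])])
    kept = np.array(sorted(kept))
    return mz[kept], intensity[kept]

mz = np.array([100.1, 100.4, 250.2, 250.9, 400.5])
inten = np.array([5.0, 80.0, 300.0, 20.0, 1.0])
mz2, in2 = threshold_peaks(mz, inten)
print(bin_peaks(mz2, in2))
```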
134

Function Space Tensor Decomposition and its Application in Sports Analytics

Reising, Justin 01 December 2019 (has links)
Recent advancements in sports information and technology systems have ushered in a new age of applications of both supervised and unsupervised analytical techniques in the sports domain. These automated systems capture large volumes of data points about competitors during live competition. As a result, multi-relational analyses are gaining popularity in the field of Sports Analytics. We review two case studies applied in sports: dimensionality reduction with Principal Component Analysis and latent factor analysis with Non-Negative Matrix Factorization. We also review a framework for extending these techniques to higher-order data structures. The primary scope of this thesis is to further extend the concept of tensor decomposition through the use of function spaces. In doing so, we address the restriction of PCA to vector and matrix representations and of the CP decomposition to tensor representations. Lastly, we provide an application in the context of professional stock car racing.
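For readers unfamiliar with the discrete setting being generalized, the sketch below shows what a CP decomposition represents, a sum of rank-one outer products; the factor matrices are random placeholders, and this is not the thesis's function-space construction.

```python
# Sketch of the ordinary (discrete) CP representation the thesis generalizes:
# a rank-R approximation of a third-order tensor as a sum of outer products,
# X approx sum_r a_r (outer) b_r (outer) c_r. Factors are random placeholders.
import numpy as np

I, J, K, R = 4, 5, 6, 2
rng = np.random.default_rng(1)
A = rng.normal(size=(I, R))
B = rng.normal(size=(J, R))
C = rng.normal(size=(K, R))

# Reconstruct the tensor implied by the factor matrices.
X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
print(X_hat.shape)  # (4, 5, 6)
```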
135

Evaluating Public Masking Mandates on COVID-19 Growth Rates in U.S. States

Wong, Angus K 01 July 2021 (has links)
U.S. state governments have implemented numerous policies to help mitigate the spread of COVID-19. While there is strong biological evidence supporting the wearing of face masks or coverings in public spaces, the impact of public masking policies remains unclear. We aimed to evaluate how early versus delayed implementation of state-level public masking orders impacted subsequent COVID-19 growth rates. We defined “early” implementation as having a state-level mandate in place before September 1, 2020, the approximate start of the school year. We defined COVID-19 growth rates as the relative increase in confirmed cases 7, 14, 21, 30, 45, and 60 days after September 1. Primary analyses used targeted maximum likelihood estimation (TMLE) with Super Learner and considered a wide range of potential confounders to account for differences between states. In secondary analyses, we took an unadjusted approach and calculated the average COVID-19 growth rate among early-implementing states divided by the average COVID-19 growth rate among late-implementing states. At a national level, the expected growth rate after 14 days was 4% lower with early vs. delayed implementation (aRR: 0.96; 95% CI: 0.95-0.98). Associations did not plateau over time, but instead grew linearly. After 60 days, the expected growth rate was 16% lower with early vs. delayed implementation (aRR: 0.84; 95% CI: 0.78-0.91). Unadjusted estimates were exaggerated (e.g., 60-day RR: 0.72; 95% CI: 0.60-0.84). Sensitivity analyses varying the timing of the masking order yielded similar results. In both the short and long term, state-level public masking mandates were associated with lower COVID-19 growth rates. Given their low cost and minimal (if any) impact on the economy, masking policies are promising public health strategies to mitigate further spread of COVID-19.
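A small numerical sketch of the two quantities defined above, using made-up counts rather than study data:

```python
# Hypothetical-numbers sketch of the two quantities defined in the abstract:
# the k-day growth rate (relative increase in cumulative confirmed cases) and
# the unadjusted ratio comparing early- vs. late-implementing states.
# None of these counts come from the study.
cases_day0, cases_day14 = 50_000, 58_000          # hypothetical cumulative counts
growth_14d = (cases_day14 - cases_day0) / cases_day0
print(f"14-day growth rate: {growth_14d:.3f}")    # 0.160

early_avg_growth, late_avg_growth = 0.12, 0.15    # hypothetical state averages
unadjusted_rr = early_avg_growth / late_avg_growth
print(f"Unadjusted ratio: {unadjusted_rr:.2f}")   # 0.80
```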
136

Intraday Algorithmic Trading using Momentum and Long Short-Term Memory network strategies

Whitinger, Andrew R., II 01 May 2022 (has links)
Intraday stock trading is an infamously difficult and risky strategy. Momentum and reversal strategies and long short-term memory (LSTM) neural networks have been shown to be effective for selecting stocks to buy and sell over time periods of multiple days. To explore whether these strategies can be effective for intraday trading, their implementations were simulated using intraday price data for stocks in the S&P 500 index, collected at 1-second intervals between February 11, 2021 and March 9, 2021 inclusive. The study tested 160 variations of momentum and reversal strategies for profitability in long, short, and market-neutral portfolios, totaling 480 portfolios. Long and short portfolios for each strategy were also compared to the market to observe excess returns. Eight reversal portfolios yielded statistically significant profits, and 16 yielded significant excess returns. Tests of these strategies on another set of 16 days failed to yield statistically significant returns, though average returns remained profitable. Four LSTM network configurations were tested on the same original set of days, with no strategy yielding statistically significant returns. Close examination of the stocks chosen by LSTM networks suggests that the networks expect stocks to exhibit a momentum effect. Further studies may explore whether an intraday reversal effect can be observed over time during different market conditions and whether different configurations of LSTM networks can generate significant returns.
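For illustration, a bare-bones version of one momentum-style selection rule of the general kind tested; the lookback window, universe size, and cutoff fractions here are hypothetical placeholders, not any of the study's 160 configurations.

```python
# Minimal sketch of a momentum-style intraday selection rule (hypothetical
# parameters, not the study's configurations): rank stocks by trailing return
# over a lookback window and form a market-neutral long/short portfolio.
import numpy as np

def momentum_signal(prices, lookback):
    """prices: (T, n_stocks) array of 1-second prices; returns trailing returns."""
    return prices[-1] / prices[-1 - lookback] - 1.0

rng = np.random.default_rng(2)
prices = 100 * np.cumprod(1 + 0.0005 * rng.standard_normal((600, 50)), axis=0)

ret = momentum_signal(prices, lookback=300)       # e.g., a 5-minute window at 1 s bars
order = np.argsort(ret)
short_leg, long_leg = order[:5], order[-5:]       # bottom and top 10% of 50 stocks
print("long:", long_leg, "short:", short_leg)
```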
137

Performance Comparison of Imputation Methods for Mixed Data Missing at Random with Small and Large Sample Data Set with Different Variability

Afari, Kyei 01 August 2021 (has links)
One of the concerns in the field of statistics is the presence of missing data, which leads to bias in parameter estimation and inaccurate results. Multiple imputation is a remedy for handling missing data. This study examined which multiple imputation methods best handle mixed-variable datasets with different sample sizes and variability, along with different levels of missingness. The study employed the predictive mean matching, classification and regression trees, and random forest imputation methods. For each dataset, the multiple regression parameter estimates from the complete dataset were compared to the multiple regression parameter estimates obtained from the imputed dataset. The results showed that the random forest imputation method performed best for samples of 150 and 500, irrespective of the variability, while the classification and regression trees imputation method worked best for samples of 30, irrespective of the variability.
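A hedged sketch of the overall workflow (impute, then compare regression estimates) using a random-forest-based chained-equations imputer in scikit-learn; the named methods (PMM, CART, random forest) are typically run through R's mice package, so this is an analogue rather than the study's code, and the data are simulated placeholders.

```python
# Hedged sketch: impute missing values with a random-forest-based chained-
# equations imputer, then compare regression coefficients from complete vs.
# imputed data. Not the study's code; data and settings are placeholders.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(150, 4))                       # hypothetical sample of 150
y = X @ np.array([1.0, -0.5, 0.0, 2.0]) + rng.normal(size=150)

X_missing = X.copy()
X_missing[rng.random(X.shape) < 0.2] = np.nan       # ~20% missing at random

imputer = IterativeImputer(estimator=RandomForestRegressor(n_estimators=50),
                           random_state=0, max_iter=10)
X_imputed = imputer.fit_transform(X_missing)

# Compare regression coefficients from complete vs. imputed data.
print(LinearRegression().fit(X, y).coef_)
print(LinearRegression().fit(X_imputed, y).coef_)
```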
138

Family-Wise Error Rate Control in Quantitative Trait Loci (QTL) Mapping and Gene Ontology Graphs with Remarks on Family Selection

Saunders, Garret 01 May 2014 (has links)
The main aim of this dissertation is to meet real needs of practitioners in multiple hypothesis testing. The issue of multiplicity has become a significant concern in most fields of research as computational abilities have increased, allowing for the simultaneous testing of many (thousands or millions) statistical hypothesis tests. While many error rates have been defined to address this issue of multiplicity, this work considers only the most natural generalization of the Type I Error rate to multiple tests, the family-wise error rate (FWER). Much work has already been done to establish powerful yet general methods which control the FWER under arbitrary dependencies among tests. This work both introduces these methods and expands upon them as is detailed through its four main chapters. Chapter 1 contains general introductions and preliminaries important to the remainder of the work, particularly a previously published graphical weighted Bonferroni multiplicity adjustment. Chapter 2 then applies the principles introduced in Chapter 1 to achieve a substantial computational improvement to an existing FWER-controlling multiplicity approach (the Focus Level method) for gene set testing in high-throughput microarray and next-generation sequencing studies using Gene Ontology graphs. This improvement to the Focus Level procedure, which we call the Short Focus Level procedure, is achieved by extending the reach of graphical weighted Bonferroni testing to closed testing situations where restricted hypotheses are present. This is accomplished through Theorem 1 of Chapter 2. As a result of the improvement, the full top-down approach to the Focus Level procedure can now be performed, overcoming a significant disadvantage of the otherwise powerful approach to multiple testing. Chapter 3 presents a solution to a multiple testing difficulty within quantitative trait loci (QTL) mapping in natural populations for QTL LD (linkage disequilibrium) mapping models. Such models apply a two-hypothesis framework to the testing of thousands of genetic markers across the genome in search of QTL underlying a quantitative trait of interest. Inherent to the model is an unidentifiability issue where a parameter of interest is identifiable only under the alternative hypothesis. Through a second application of graphical weighted Bonferroni methods we show how the multiplicity can be accounted for while simultaneously accounting for the required logical structuring of the testing such that identifiability is preserved. Finally, Chapter 4 details some of the difficulties associated with the distributional assumptions for the test statistics of the two hypotheses of the LD-based QTL mapping framework. A novel bivariate testing strategy is proposed for these test statistics in order to overcome these distributional difficulties while preserving power in the multiplicity correction by reducing the number of tests performed. Chapter 5 concludes the work with a summary of the main contributions and future research goals aimed at continual improvement to the multiple testing issues inherent to both the fields of genetics and genomics.
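For context, the fixed-weight Bonferroni inequality that the graphical weighted procedures build on can be stated in a few lines; the sketch below shows only that basic rule with hypothetical p-values and weights, not the Focus Level or Short Focus Level procedure.

```python
# Sketch of the basic weighted Bonferroni rule underlying graphical weighted
# procedures (not the Focus Level / Short Focus Level procedure itself): with
# nonnegative weights w_i summing to at most 1, rejecting H_i whenever
# p_i <= w_i * alpha controls the FWER at level alpha under any dependence.
# Graphical approaches additionally reallocate the weights of rejected
# hypotheses along the graph and iterate.
import numpy as np

def weighted_bonferroni(pvals, weights, alpha=0.05):
    pvals, weights = np.asarray(pvals), np.asarray(weights)
    assert weights.sum() <= 1 + 1e-12 and (weights >= 0).all()
    return pvals <= weights * alpha

# Hypothetical p-values and weights for four hypotheses.
p = [0.001, 0.03, 0.3, 0.04]
w = [0.4, 0.4, 0.1, 0.1]
print(weighted_bonferroni(p, w))   # [ True False False False]
```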
139

Model for Bathtub-Shaped Hazard Rate: Monte Carlo Study

Leithead, Glen S. 01 May 1970 (has links)
A new model developed for the entire bathtub-shaped hazard rate curve has been evaluated as to its usefulness as a method of reliability estimation. The model is of the form F(t) = 1 - exp{-(θ1t^L + θ2t + θ3t^M)}, where "L" and "M" were assumed known. The estimate of reliability obtained from the new model was compared with the traditional restricted-sample estimate for four different time intervals and was found to have less bias and variance at all time points. This was a Monte Carlo study, and the data generated showed that the new model has much potential as a method for estimating reliability. (51 pages)
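Reading the exponents as θ1t^L, θ2t, and θ3t^M, the reliability and hazard functions implied by the stated model follow directly:

```latex
% Reliability and hazard functions implied by the stated cumulative form
% F(t) = 1 - \exp\{-(\theta_1 t^{L} + \theta_2 t + \theta_3 t^{M})\}:
R(t) = 1 - F(t) = \exp\!\big\{-\big(\theta_1 t^{L} + \theta_2 t + \theta_3 t^{M}\big)\big\},
\qquad
h(t) = \frac{f(t)}{R(t)} = \theta_1 L\, t^{L-1} + \theta_2 + \theta_3 M\, t^{M-1}.
% The three terms are decreasing, constant, and increasing in t when L < 1 < M,
% which produces the bathtub shape.
```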
140

A Report on the Statistical Properties of the Coefficient of Variation and Some Applications

Irvin, Howard P. 01 May 1970 (has links)
Examples from four disciplines were used to introduce the coefficient of variation, which was considered to have considerable usage and application in solving quality control and reliability problems. Its statistical properties, namely the mean and the variance of the coefficient of variation, were gathered from the statistical literature and are presented. The cumulative probability function was determined by two approximate methods and by using the noncentral t distribution. A graphical method for determining approximate confidence intervals and a method for testing whether the coefficients of variation from two samples differ significantly from each other are also provided, with examples. Applications of the coefficient of variation to some of the main problems encountered in industry are also included in this report: (a) using the coefficient of variation to measure relative efficiency, (b) acceptance sampling, (c) the stress versus strength reliability problem, and (d) estimating the shape parameter of the two-parameter Weibull distribution. (84 pages)
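For context, the standard definitions and the noncentral t connection referred to above are:

```latex
% Population and sample coefficients of variation:
\mathrm{CV} = \frac{\sigma}{\mu}, \qquad \widehat{\mathrm{CV}} = \frac{s}{\bar{x}} .
% For a normal sample of size n, the link to the noncentral t distribution used
% for its sampling distribution and confidence intervals:
T = \frac{\sqrt{n}\,\bar{x}}{s} \sim t_{n-1}(\delta),
\qquad \delta = \frac{\sqrt{n}\,\mu}{\sigma} = \frac{\sqrt{n}}{\mathrm{CV}} ,
% so probabilities and intervals for \widehat{\mathrm{CV}} = \sqrt{n}/T follow
% from the noncentral t distribution with n-1 degrees of freedom.
```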
