21

Sample comparisons using microarrays -- application of false discovery rate and quadratic logistic regression

Guo, Ruijuan. January 2007
Thesis (M.S.)--Worcester Polytechnic Institute. / Keywords: FDR; logistic regression; microarray. Includes bibliographical references (leaf 26).
22

Knowing when a higher education institution is in trouble

Sturm, Pamela S. January 2005
Theses (Ed. D.)--Marshall University, 2005. / Title from document title page. Includes abstract. Document formatted into pages: contains ix, 180 p. Bibliography: p. 121-129.
23

Logistic regression, measures of explained variation, and the base rate problem

Sharma, Dinesh R. McGee, Daniel. January 2006
Thesis (Ph. D.)--Florida State University, 2006. / Advisor: Daniel L. McGee, Sr., Florida State University, College of Arts and Sciences, Dept. of Statistics. Title and description from dissertation home page (viewed Sept. 21, 2006). Document formatted into pages; contains xii, 147 pages. Includes bibliographical references.
24

The variable selection problem and the application of the roc curve for binary outcome variables

Matshego, James Moeng 11 August 2008
Variable selection refers to the problem of selecting the input variables that are most predictive of a given outcome. Variable selection problems are found in all machine learning tasks, supervised or unsupervised, classification, regression, time series prediction, two-class or multi-class, posing various levels of challenges. Variable selection problems are related to the problems of input dimensionality reduction and of parameter planning. Variable selection has practical and theoretical challenges of its own. From the practical point of view, eliminating variables may reduce the cost of producing the outcome and increase its speed, while space dimensionality reduction does not address these problems. Theoretical challenges include estimating with what confidence one can state that a variable is relevant to the concept when it is useful to the outcome, and providing a theoretical understanding of the stability of selected variable subsets. As the probability cut-point increases in value, it becomes more likely that an observation is classified as a non-event by the selected variables. The mathematical statement of the problem is not widely agreed upon and may depend on the application. One typically distinguishes: i) The problem of discovering all the variables relevant to the outcome variable and determining how relevant they are and how they are related to each other. ii) The problem of finding a minimum subset of variables that is useful to the outcome variable. Logistic regression is an increasingly popular statistical technique used to model the probability of a discrete binary outcome. Logistic regression applies maximum likelihood estimation after transforming the outcome variable into a logit variable. In this way, logistic regression estimates the probability of a certain event. When properly applied, logistic regression analyses yield powerful insight into which variables are more or less likely to predict the event outcome in a population of interest. 
These models also show the extent to which changes in the values of the variables may increase or decrease the predicted probability of the event outcome. Variable selection, in all its facets, is similarly important with logistic regression. The receiver operating characteristic (ROC) curve is a graphic display that gives a measure of the predictive accuracy of a logistic regression model. It is a measure of classification performance; the area under the ROC curve (AUC) is a scalar measure gauging one facet of performance. Another measure of the predictive accuracy of a logistic regression model is a classification table. It uses the model to classify observations as events if their estimated probability is greater than or equal to a given probability cut-point; otherwise, observations are classified as non-events. This technique, as it appears in the literature, is also studied in this thesis. In this thesis the issue of variable selection, both for continuous and binary outcome variables, is investigated as it appears in the statistical literature. It is clear that this topic has been widely researched and still remains a feature of modern research. The last word certainly hasn’t been spoken. / Dissertation (MSc)--University of Pretoria, 2008. / Statistics / unrestricted
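The classification table and AUC described in this abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the function names and the six toy probabilities and labels below are hypothetical, and the AUC is computed with the standard pairwise-ranking definition (the share of event/non-event pairs in which the event has the higher estimated probability, ties counting one half).

```python
def classification_table(probs, labels, cut_point):
    """Count outcomes when an estimated probability >= cut_point is called an event."""
    table = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for p, y in zip(probs, labels):
        predicted_event = p >= cut_point
        if predicted_event and y == 1:
            table["tp"] += 1
        elif predicted_event and y == 0:
            table["fp"] += 1
        elif not predicted_event and y == 0:
            table["tn"] += 1
        else:
            table["fn"] += 1
    return table


def auc(probs, labels):
    """Area under the ROC curve via pairwise comparison of events and non-events."""
    events = [p for p, y in zip(probs, labels) if y == 1]
    non_events = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum(
        1.0 if e > n else 0.5 if e == n else 0.0
        for e in events
        for n in non_events
    )
    return wins / (len(events) * len(non_events))


# Hypothetical estimated probabilities and observed outcomes (1 = event).
probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

table_low = classification_table(probs, labels, 0.5)
table_high = classification_table(probs, labels, 0.7)
area = auc(probs, labels)
```

Raising the cut-point from 0.5 to 0.7 in this toy data moves one observation from the event column to the non-event column, which is the behaviour the abstract describes for increasing cut-points.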
25

GIS-based Evaluation of Landslide Susceptibility for Eastern Tennessee

Smith, Sara Ann 06 May 2017
The Appalachian Mountains in eastern Tennessee are known for landslides, and landslides are reported to cause millions of dollars of damage. To aid in the estimation of future susceptibility, a geographic information system (GIS) was used to perform a logistic regression to identify landslides in eastern Tennessee. The landslide model results were validated using k-fold cross-validation. The model results suggest that the environmental variables slope, soil, landcover/vegetation, and distance to roads were significant factors related to landslide susceptibility. The susceptibility map showed that 86.8% of urban areas in eastern Tennessee were at highest susceptibility for landslides, possibly due to lower amounts of landcover. When past landslides were overlaid on the landslide susceptibility map to check accuracy, areas with high landslide susceptibility were found along main highways and interstates. This model is a first step in using GIS to increase the awareness of landslide susceptibility in the region and may ultimately lead to better preparation.
26

Genomic signature of trait-associated variants

Kindt, Alida Sophie Dorothea January 2014
Genome-wide association studies have been used extensively to study hundreds of phenotypes and have determined thousands of associated SNPs whose underlying biology and causation are as yet largely unknown. Many previous studies attempted to clarify the causal biology by investigating overlaps of trait-associated variants with functional annotations, but lacked statistical rigor and examined incomplete subsets of available functional annotations. Additionally, it has been difficult to disentangle the relative contributions of different annotations that may show strong correlations with one another. In this thesis, we address these shortcomings and strengthen and extend the obtained results. Two methods, permutations and logistic regression, are applied in statistically rigorous analyses of genomic annotations and their observed enrichment or depletion of trait-associated SNPs. The genomic annotations range from genic regions and regulatory features to measures of conservation and aspects of chromatin structure. Logistic regressions in a number of trait-specific subsets identify genomic annotations influencing SNPs associated with both normal variation (e.g., eye or hair colour) and diseases, suggesting some generalities in the biological underpinnings of phenotypes. SNPs associated with phenotypes of the immune system are investigated and the results highlight the distinct aetiology for this subset. Despite the heterogeneity of the studied cancers, SNPs associated with different cancers are particularly enriched for conserved regions, unlike all other trait-subsets. Nonetheless, chromatin states are, perhaps surprisingly, among the most influential genomic annotations in all trait-subsets. Evolutionarily conserved regions are rarely within the top genomic annotations despite their widespread use in prioritisation methods for follow-up studies. 
We identify a common set of enriched or depleted genomic annotations that significantly influence all traits, but also highlight trait-specific differences. These annotations may be used for the computational prioritisation of variants implicated in phenotypes of interest. The approaches developed for this thesis are further applied to studies of a specific human complex trait (height) and gene expression in atherosclerosis.
27

Identifying historical financial crisis: Bayesian stochastic search variable selection in logistic regression

Ho, Chi-San August 2009
This work investigates the factors that contribute to financial crises. We first study the Dow Jones index performance by grouping the daily adjusted closing value into two-month windows and finding several critical quantiles in each window. Then, we identify severe downturns in these quantiles and find that the 5th quantile is the best to identify financial crises. We then match these quantiles with historical financial crises and give a basic explanation of them. Next, we introduce all exogenous factors that could be related to the crises. Then, we apply a rapid Bayesian variable selection technique, Stochastic Search Variable Selection (SSVS), using a Bayesian logistic regression model. Finally, we analyze the result of SSVS, leading to the conclusion that the dummy variable we created for disastrous hurricanes, the crude oil price and the gold price (GOLD) should be included in the model.
28

Logistic regression with conjugate gradient descent for document classification

Namburi, Sruthi January 1900
Master of Science / Department of Computing and Information Sciences / William H. Hsu / Logistic regression is a model for function estimation that measures the relationship between independent variables and a categorical dependent variable by approximating a conditional probability density function using a logistic function, also known as a sigmoidal function. Multinomial logistic regression is used to predict categorical variables where there can be more than two categories or classes. The most common type of algorithm for optimizing the cost function for this model is gradient descent. In this project, I implemented logistic regression using conjugate gradient descent (CGD). I used the 20 Newsgroups data set collected by Ken Lang. I compared the results with those for existing implementations of gradient descent. In this comparison, the conjugate gradient optimization methodology outperformed the existing implementations.
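To make the idea of fitting logistic regression with a conjugate gradient method concrete, here is a minimal sketch. It is not the project's implementation: it uses binary rather than multinomial logistic regression, a four-point toy data set instead of 20 Newsgroups, an L2 penalty, a backtracking line search, and the Polak-Ribière update with restarts. All names and parameter values are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def loss_and_grad(w, X, y, l2):
    """Mean negative log-likelihood plus an L2 penalty, and its gradient."""
    n = len(X)
    loss = 0.5 * l2 * sum(wj * wj for wj in w)
    grad = [l2 * wj for wj in w]
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi))
        # log(1 + exp(z)) - y * z, written to avoid overflow.
        loss += (max(z, 0.0) + math.log1p(math.exp(-abs(z))) - yi * z) / n
        p = sigmoid(z)
        for j, xj in enumerate(xi):
            grad[j] += (p - yi) * xj / n
    return loss, grad


def fit_cg(X, y, l2=0.1, iters=50):
    """Nonlinear conjugate gradient (Polak-Ribiere with restarts)."""
    w = [0.0] * len(X[0])
    loss, g = loss_and_grad(w, X, y, l2)
    d = [-gj for gj in g]
    for _ in range(iters):
        gg = sum(gj * gj for gj in g)
        if gg < 1e-18:
            break  # gradient is essentially zero
        slope = sum(gj * dj for gj, dj in zip(g, d))
        if slope >= 0:
            # Not a descent direction: restart with steepest descent.
            d = [-gj for gj in g]
            slope = -gg
        # Backtracking line search with the Armijo sufficient-decrease test.
        step = 1.0
        while True:
            w_new = [wj + step * dj for wj, dj in zip(w, d)]
            new_loss, g_new = loss_and_grad(w_new, X, y, l2)
            if new_loss <= loss + 1e-4 * step * slope or step < 1e-12:
                break
            step *= 0.5
        # Polak-Ribiere coefficient, clipped at zero (an automatic restart).
        beta = max(0.0, sum(gn * (gn - go) for gn, go in zip(g_new, g)) / gg)
        d = [-gn + beta * dj for gn, dj in zip(g_new, d)]
        w, loss, g = w_new, new_loss, g_new
    return w


# Toy data: first column is a constant 1 for the intercept.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
y = [0, 0, 1, 1]
w = fit_cg(X, y)
final_loss = loss_and_grad(w, X, y, 0.1)[0]
preds = [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) >= 0.5 else 0 for xi in X]
```

The only difference from plain gradient descent is the search direction: each new direction mixes the current negative gradient with the previous direction through `beta`, which is what tends to reduce the number of iterations on ill-conditioned problems.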
29

Statistical methods for diagnostic testing: an illustration using a new method for cancer detection

Sun, Xin January 1900
Master of Science / Department of Statistics / Gary Gadbury / This report illustrates how to use two statistical methods to investigate the performance of a new technique for detecting breast cancer and lung cancer at early stages. The two methods are logistic regression and classification and regression trees (CART). It is found that the technique is effective in detecting breast cancer and lung cancer, with both sensitivity and specificity close to 0.9. However, the ability of this technique to predict the actual stages of cancer is low. The age variable improves the ability of logistic regression to predict the existence of breast cancer for the samples used in this report. Because the sample sizes are small, however, it is impossible to conclude that including the age variable helps the prediction of breast cancer. Including the age variable does not improve the ability to predict the existence of lung cancer. If the age variable is excluded, CART and logistic regression give very similar results.
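For readers unfamiliar with the two accuracy measures quoted in this abstract, the sketch below shows how sensitivity and specificity are computed from a diagnostic test's confusion counts. The counts are hypothetical and were chosen only so that both values come out at 0.9; they are not the report's data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn)  # share of true cases the test detects
    specificity = tn / (tn + fp)  # share of non-cases the test clears
    return sensitivity, specificity


# Hypothetical counts: 20 patients with cancer, 30 without.
sens, spec = sensitivity_specificity(tp=18, fn=2, tn=27, fp=3)
```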
30

Classification of Bone Cements Using Multinomial Logistic Regression Method

Wei, Jinglun 29 April 2018
Bone cement surgery is a new technique widely used in the medical field nowadays. In this thesis I analyze 48 bone cement types using their content of 20 elements. My goal is to find a method to classify a newly found bone cement sample into one of these 48 categories. Here I use the multinomial logistic regression method to see whether it works. Due to the lack of observations, I generate enough data by repeatedly adding appropriately scaled white noise to the original data, which yields a data set with over 100 times as many points as the original one. Then I use the purposeful variable selection method, rather than stepwise selection, to pick the covariates I need. There are 15 covariates left after the selection, and I then use my new data set to fit a multinomial logistic regression model. The model does not perform that well in the goodness-of-fit test, but the result is still acceptable, and the diagnostic statistics also indicate good performance. Combined with clinical experience and prior conditions, this model is helpful in this classification case.
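The data-generation step described in this abstract, adding scaled white noise to the original observations many times, might look roughly like the sketch below. The thesis does not state how the noise was scaled; the choice here (Gaussian noise with a standard deviation equal to a fixed fraction of each feature's standard deviation) is an assumption, as are the function name, parameters, and toy values.

```python
import random
import statistics


def augment_with_noise(rows, labels, copies, noise_fraction, seed=0):
    """Return the original rows plus `copies` noisy replicates of each row.

    Noise for feature j is Gaussian with standard deviation
    noise_fraction * (population standard deviation of feature j).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_features = len(rows[0])
    scales = [
        noise_fraction * statistics.pstdev([r[j] for r in rows])
        for j in range(n_features)
    ]
    out_rows = [list(r) for r in rows]
    out_labels = list(labels)
    for _ in range(copies):
        for r, label in zip(rows, labels):
            out_rows.append([v + rng.gauss(0.0, s) for v, s in zip(r, scales)])
            out_labels.append(label)  # a noisy copy keeps its class label
    return out_rows, out_labels


# Hypothetical toy data: 3 cement types described by 2 element contents.
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
labels = ["A", "B", "C"]
aug_rows, aug_labels = augment_with_noise(rows, labels, copies=100, noise_fraction=0.05)
```

Because every noisy copy derives from a single original observation, a goodness-of-fit or accuracy figure computed on such data measures robustness to the injected noise rather than performance on independent samples.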
