  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

A Comparison Of Data Mining Methods For Prediction And Classification Types Of Quality Problems

Anakli, Zeynep 01 December 2009
In this study, an approach based on the Analytic Network Process (ANP) and the Preference Ranking Organization METHod for Enrichment Evaluations (PROMETHEE) is developed and used to compare the overall performance of some commonly used classification and prediction data mining methods on quality improvement data, according to several decision criteria. Classification and prediction data mining (DM) methods are frequently used in many areas, including quality improvement. Previous studies comparing the performance of these methods are not valid for quality improvement data. Furthermore, these studies do not consider all relevant decision criteria in their comparison. All relevant criteria and the interdependencies among them should be taken into consideration during the performance evaluation. In this study, the classification DM methods, namely Decision Trees (DT), Neural Networks (NN), Multivariate Adaptive Regression Splines (MARS), Logistic Regression (LR), Mahalanobis-Taguchi System (MTS), Fuzzy Classifier (FC) and Support Vector Machine (SVM), and the prediction DM methods, namely DT, NN, MARS, Multiple Linear Regression (MLR), Fuzzy Regression (FR) and Robust Regression (RR), are prioritized according to a comprehensive set of criteria using ANP and PROMETHEE. According to the results of this study, MARS is found superior to the other methods for both classification and prediction. Moreover, the sensitivity of the results to changes in the weights and thresholds of the decision criteria is analyzed. These analyses show that the resulting priorities are very insensitive to these parameters.
2

Data measures that characterise classification problems

Van der Walt, Christiaan Maarten 29 August 2008
We have a wide range of classifiers today that are employed in numerous applications, from credit scoring to speech processing, with great technical and commercial success. No classifier, however, exists that will outperform all other classifiers on all classification tasks, and the process of classifier selection is still mainly one of trial and error. The optimal classifier for a classification task is determined by the characteristics of the data set employed; understanding the relationship between data characteristics and the performance of classifiers is therefore crucial to the process of classifier selection. Empirical and theoretical approaches have been employed in the literature to define this relationship. None of these approaches has, however, been very successful in accurately predicting or explaining classifier performance on real-world data. We use theoretical properties of classifiers to identify data characteristics that influence classifier performance; these data properties guide us in the development of measures that describe the relationship between data characteristics and classifier performance. We employ these data measures on real-world and artificial data to construct a meta-classification system. The purpose of this meta-classifier is two-fold: (1) to predict the classification performance of real-world classification tasks, and (2) to explain these predictions in order to gain insight into the properties of real-world data.
We show that these data measures can be employed successfully to predict the classification performance of real-world data sets; these predictions are accurate in some instances but there is still unpredictable behaviour in other instances. We illustrate that these data measures can give valuable insight into the properties and data structures of real-world data; these insights are extremely valuable for high-dimensional classification problems. / Dissertation (MEng)--University of Pretoria, 2008. / Electrical, Electronic and Computer Engineering / unrestricted
3

Bioinformatický nástroj pro predikci rozpustnosti proteinů / Bioinformatics Tool for Prediction of Protein Solubility

Hronský, Patrik January 2016
This master's thesis addresses the solubility of recombinant proteins and its prediction. It describes protein synthesis as well as the process of creating recombinant proteins. Recombinant protein synthesis is of great importance, for example to the pharmaceutical industry. This synthesis is not a simple task, and it does not always produce viable proteins. Protein solubility is an important factor in determining the viability of the resulting proteins. It is naturally in the interest of companies that take part in recombinant protein synthesis to focus their effort and resources on proteins that will be viable in the end. In this regard, bioinformatics is of great help, as it can use machine learning to predict the solubility of proteins, for example from their sequences. This thesis introduces the reader to the basic principles of machine learning and presents several machine learning methods used in the field of protein solubility prediction. It defines a dataset, which is later used to test selected predictors as well as to train the ensemble predictor that is the main focus of this thesis. It also examines several specific protein solubility predictors and explains the basic principles upon which they are built, as well as the results of their testing. Finally, it presents the ensemble predictor of protein solubility.
