Global ETD Search

1	The role of classifiers in feature selection: number vs nature Chrysostomou, Kyriacos January 2008 Wrapper feature selection approaches are widely used to select a small subset of relevant features from a dataset. However, Wrappers suffer from the fact that they use only a single classifier when selecting the features. The problem with using a single classifier is that each classifier is of a different nature and has its own biases, so each classifier will select a different feature subset. To address this problem, this thesis investigates the effects of using different classifiers for Wrapper feature selection. More specifically, it investigates the effects of using different numbers of classifiers and classifiers of different nature. This aim is achieved by proposing a new data mining method called Wrapper-based Decision Trees (WDT). The WDT method can combine multiple classifiers from four different families (Bayesian Network, Decision Tree, Nearest Neighbour and Support Vector Machine) to select relevant features and to visualise the relationships among the selected features using decision trees. Specifically, the WDT method is applied to investigate the three research questions of this thesis: (1) the effects of the number of classifiers on feature selection results; (2) the effects of the nature of classifiers on feature selection results; and (3) which of the two (i.e., number or nature of classifiers) has more of an effect on feature selection results. Two types of user preference datasets derived from Human-Computer Interaction (HCI) are used with WDT to help answer these three research questions. The results of the investigation revealed that the number of classifiers and the nature of classifiers greatly affect feature selection results. 
In terms of number of classifiers, the results showed that few classifiers selected many relevant features whereas many classifiers selected few relevant features. In addition, it was found that using three classifiers resulted in highly accurate feature subsets. In terms of nature of classifiers, it was shown that Decision Tree, Bayesian Network and Nearest Neighbour classifiers caused significant differences in both the number of features selected and the accuracy levels of the features. A comparison of the results for number of classifiers and nature of classifiers revealed that the former has more of an effect on feature selection than the latter. The thesis makes contributions to three communities: data mining, feature selection, and HCI. For the data mining community, this thesis proposes a new method called WDT, which integrates the use of multiple classifiers for feature selection with decision trees to effectively select and visualise the most relevant features within a dataset. For the feature selection community, the results of this thesis have shown that the number of classifiers and the nature of classifiers can truly affect the feature selection process. The results, and the suggestions based on them, can provide useful insight about classifiers when performing feature selection. For the HCI community, this thesis has shown the usefulness of feature selection for identifying a small number of highly relevant features for determining the preferences of different users. 005.3
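The idea of running a Wrapper with several classifiers and then combining their feature subsets can be sketched as follows. This is a minimal illustration only, not the WDT method itself: the abstract does not say how WDT combines classifiers, so the greedy forward search, the leave-one-out scoring, the majority vote, the two toy classifiers, and the `DATA` sample are all assumptions made for the example.

```python
from collections import Counter

def nearest_neighbour(train, query, feats):
    """1-NN: label of the closest training row, measured on `feats` only."""
    def dist(row):
        return sum((row[0][f] - query[f]) ** 2 for f in feats)
    return min(train, key=dist)[1]

def nearest_centroid(train, query, feats):
    """Label of the class whose mean vector over `feats` is closest."""
    sums, counts = {}, Counter()
    for x, y in train:
        counts[y] += 1
        acc = sums.setdefault(y, [0.0] * len(feats))
        for i, f in enumerate(feats):
            acc[i] += x[f]
    def dist(label):
        return sum((sums[label][i] / counts[label] - query[f]) ** 2
                   for i, f in enumerate(feats))
    return min(sorted(counts), key=dist)

def loo_accuracy(classify, data, feats):
    """Leave-one-out accuracy of `classify` restricted to `feats`."""
    hits = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += classify(train, x, feats) == y
    return hits / len(data)

def wrapper_select(classify, data, n_features):
    """Greedy forward Wrapper: keep adding the feature that raises accuracy."""
    selected, best = [], 0.0
    while True:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(loo_accuracy(classify, data, selected + [f]), f)
                  for f in candidates]
        if not scored:
            break
        score, feat = max(scored, key=lambda s: (s[0], -s[1]))
        if score <= best:  # no improvement: stop searching
            break
        selected.append(feat)
        best = score
    return sorted(selected), best

def vote(subsets, min_votes):
    """Keep the features chosen by at least `min_votes` classifiers."""
    counts = Counter(f for subset in subsets for f in subset)
    return sorted(f for f, c in counts.items() if c >= min_votes)

# Hypothetical toy data: feature 0 separates the classes, 1 and 2 are noise.
DATA = [
    ((0.1, 5.0, 0.3), "a"), ((0.2, 1.0, 0.9), "a"), ((0.0, 3.0, 0.5), "a"),
    ((0.9, 4.8, 0.4), "b"), ((1.0, 1.2, 0.8), "b"), ((0.8, 3.1, 0.6), "b"),
]

subsets = [wrapper_select(clf, DATA, 3)[0]
           for clf in (nearest_neighbour, nearest_centroid)]
consensus = vote(subsets, min_votes=2)
print(subsets, consensus)
```

Raising `min_votes` as more classifiers are added shrinks the consensus subset, which is one simple way to see why the number of classifiers can change what gets selected.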
2	Sémantické rozpoznávání komentářů na webu / Semantic Recognition of Comments on the Web Stříteský, Radek January 2017 The main goal of this paper is to identify comments on websites. The theoretical part focuses on artificial intelligence and mainly describes classifiers. The practical part deals with the creation of a training database, which is built using feature generators. A generated feature might be, for example, the title of the HTML element that contains the comment. The training database serves as the input to the classifiers. The result of this paper is a test of classifiers in the RapidMiner program.
3	Příprava cvičení pro dolování znalostí z báze dat - klasifikace a predikce / Design of exercises for data mining - Classification and prediction Martiník, Jan January 2009 My master's thesis, "Design of exercises for data mining - Classification and prediction", deals with the most frequently used methods of classification and prediction. For classification, it covers association rules, Bayesian classification, genetic algorithms, the nearest neighbor method, neural networks and decision trees. For prediction, it covers linear and non-linear prediction. The work also summarizes decision trees in detail and presents a detailed algorithm for creating a decision tree, including the development of individual diagrams. The proposed algorithm is tested on two data sets downloaded from the Internet. The results are compared with each other and the differences between the two implementations are described. The work is written so as to give the reader an idea of the individual data mining methods and techniques, their advantages and disadvantages, and some of the issues directly related to this topic.
4	Natural Language Explanation Model for Decision Trees Silva, Jesús, Hernández Palma, Hugo, Niebles Núñez, William, Ruiz-Lazaro, Alex, Varela, Noel 07 January 2020 This study describes a model of explanations in natural language for classification decision trees. The explanations include global aspects of the classifier and local aspects of the classification of a particular instance. The proposal is implemented in the ExpliClas open source Web service [1], which in its current version operates on trees built with Weka and data sets with numerical attributes. The feasibility of the proposal is illustrated with two example cases, where the detailed explanation of the respective classification trees is shown. Decision trees; Web services; Classification decision; Classification trees; Global aspects; Natural language explanations; Natural languages; Numerical attributes
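The distinction between global and local explanations can be illustrated with a small sketch. It is not the ExpliClas service or its API: the nested-dictionary tree format, the function names, the sentence templates, and the iris-like thresholds are all hypothetical, chosen only to show how a tree over numerical attributes can be verbalised.

```python
def leaf_labels(node):
    """Collect the class labels of all leaves below `node`."""
    if "label" in node:
        return [node["label"]]
    return leaf_labels(node["left"]) + leaf_labels(node["right"])

def describe(tree):
    """Global explanation: a one-sentence summary of the whole classifier."""
    labels = leaf_labels(tree)
    classes = sorted(set(labels))
    return (f"The tree has {len(labels)} leaves and distinguishes "
            f"{len(classes)} classes: {', '.join(classes)}.")

def explain(tree, instance):
    """Local explanation: verbalise the root-to-leaf path for one instance."""
    node, clauses = tree, []
    while "label" not in node:
        attr, threshold = node["attr"], node["threshold"]
        value = instance[attr]
        if value <= threshold:
            clauses.append(f"{attr} ({value}) is at most {threshold}")
            node = node["left"]
        else:
            clauses.append(f"{attr} ({value}) is greater than {threshold}")
            node = node["right"]
    return f"Classified as {node['label']} because " + " and ".join(clauses) + "."

# Hypothetical tree over two numerical attributes.
TREE = {
    "attr": "petal_length", "threshold": 2.5,
    "left": {"label": "setosa"},
    "right": {
        "attr": "petal_width", "threshold": 1.7,
        "left": {"label": "versicolor"},
        "right": {"label": "virginica"},
    },
}

print(describe(TREE))
print(explain(TREE, {"petal_length": 4.5, "petal_width": 1.3}))
```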
5	A Programming Framework To Implement Rule-based Target Detection In Images Sahin, Yavuz 01 December 2008 An expert system is useful when conventional programming techniques fall short of capturing human expert knowledge and making decisions with it. In this study, we describe a framework that captures expert knowledge in the form of a decision tree and makes decisions based on the captured knowledge. The proposed framework is generic and can be used to create domain-specific expert systems for different problems. Features are created or processed by the nodes of the decision tree, and a final conclusion is reached for each feature. The framework supplies three types of nodes for constructing a decision tree. The first type is the decision node, which guides the search path with its answers. The second type is the operator node, which creates new features from its inputs. The last type is the end node, which corresponds to a conclusion about a feature. Once the nodes of the tree are developed, the user can interactively create the decision tree and run the supplied inference engine to collect the result for a specific problem. The proposed framework is tested in two case studies: "Airport Runway Detection in High Resolution Satellite Images" and "Urban Area Detection in High Resolution Satellite Images". In these studies, linear features are used for structural decisions and Scale Invariant Feature Transform (SIFT) features are used to test for the existence of man-made structures.
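The three node types and the inference engine described in this abstract could be modelled roughly as below. This is a sketch under stated assumptions, not the thesis's framework: the class names, the dictionary-based `Feature` record, the `infer` function, and the length and aspect-ratio thresholds in the example tree are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union

Feature = Dict[str, float]  # hypothetical feature record, e.g. one line segment


@dataclass
class EndNode:
    """Leaf: a conclusion about the feature that reached it."""
    conclusion: str


@dataclass
class DecisionNode:
    """Asks a question; the answer selects which branch to follow."""
    question: Callable[[Feature], str]
    branches: Dict[str, "Node"]


@dataclass
class OperatorNode:
    """Creates new features from its input and passes them to its child."""
    operator: Callable[[Feature], List[Feature]]
    child: "Node"


Node = Union[EndNode, DecisionNode, OperatorNode]


def infer(node: Node, feature: Feature) -> List[Tuple[Feature, str]]:
    """Walk the tree and return one (feature, conclusion) pair per end node hit."""
    if isinstance(node, EndNode):
        return [(feature, node.conclusion)]
    if isinstance(node, DecisionNode):
        return infer(node.branches[node.question(feature)], feature)
    results = []
    for derived in node.operator(feature):
        results.extend(infer(node.child, derived))
    return results


def add_aspect(f: Feature) -> List[Feature]:
    """Example operator: derive a feature that also carries an aspect ratio."""
    return [{**f, "aspect": f["length"] / f["width"]}]


# Hypothetical rule set: long, elongated segments become runway candidates.
tree = DecisionNode(
    question=lambda f: "long" if f["length"] >= 1000 else "short",
    branches={
        "short": EndNode("rejected"),
        "long": OperatorNode(
            operator=add_aspect,
            child=DecisionNode(
                question=lambda f: "elongated" if f["aspect"] >= 20 else "compact",
                branches={
                    "elongated": EndNode("runway candidate"),
                    "compact": EndNode("rejected"),
                },
            ),
        ),
    },
)

results = infer(tree, {"length": 1200.0, "width": 40.0})
print(results)
```

Because an operator node may return several derived features, `infer` returns a list, so each feature ends with its own conclusion, as the abstract describes.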
6	Política antitruste e sua consistência: uma análise das decisões do Sistema Brasileiro de Defesa da Concorrência relativas aos Atos de Concentração / An analysis of the Brazilian Antitrust Policy Consistency Cardoso, Diego Soares 20 May 2013 Financiadora de Estudos e Projetos / The goal of competition policy, also known as antitrust policy, is to promote welfare and economic efficiency by preserving fair competition in markets. Merger control is one of the main responsibilities of antitrust institutions. Prohibitions and restrictions of merger operations affect market structures, thus making these decisions relevant to economic agents. This Master's thesis analyzes the decisions made by Brazilian antitrust institutions regarding merger processes. Data were collected from public documents issued from 2004 to 2011. Bivariate analysis, discrete choice models and classification decision trees show that these merger control decisions are consistent with Brazilian antitrust law. Consistent competition policy reduces uncertainty, aligns expectations and increases the efficiency of antitrust law enforcement. Therefore, this research contributes to a better understanding of Brazilian competition policy related to merger control and its decision drivers. / Competition policies, or antitrust policies, aim at greater social welfare by maintaining competitive environments that promote economic efficiency. In Brazil, the bodies that make up the Sistema Brasileiro de Defesa da Concorrência (Brazilian Competition Defense System) are responsible for decisions concerning economic agents in order to achieve the goals of antitrust policy. 
In this context, decisions that influence market structure through restrictions on and vetoes of processes such as mergers and acquisitions (the rulings on Atos de Concentração, i.e., merger filings) are highly relevant. This work evaluates the decisions of the Sistema Brasileiro de Defesa da Concorrência regarding Atos de Concentração. To that end, data were collected from public documents issued by the antitrust bodies between 2004 and 2011. By applying discrete choice regression models and induced decision trees, it was found that these decisions are consistent with Brazilian antitrust rules. Consistency with established rules enables more efficient application of competition policy, since it reduces the uncertainty faced by economic agents, aligns expectations and facilitates the conduct of proceedings. In this sense, this research contributes to a better understanding of the factors that influence the decisions of the Brazilian competition authorities, and also offers indications that help verify how efficiently such policies are applied. antitrust policy - Brazil; economic concentration - Brazil; economic competition; atos de concentração (merger filings); CADE; discrete choice models; induced decision trees; antitrust policy; merger control; classification decision trees
7	Optimalizace strojového učení pro predikci KPI / Machine Learning Optimization of KPI Prediction Haris, Daniel January 2018 This thesis aims to optimize the machine learning algorithms used to predict KPI metrics for an organization. The organization uses machine learning to predict whether projects will meet the planned deadlines of the last phase of the development process. The work focuses on the analysis of prediction models and sets the goal of selecting new candidate models for the prediction system. We have implemented a system that automatically selects the best feature variables for learning. Trained models were evaluated with several performance metrics and the best candidates were chosen for the prediction. The candidate models achieved higher accuracy, which means that the prediction system provides more reliable responses. We also suggest other improvements that could increase the accuracy of the forecast.
8	Analýza klasifikačních metod / Analysis of Classification Methods Juríček, Jakub January 2019 This work deals with the classification methods used in the process of knowledge discovery from data and discusses the possibilities for their validation and comparison. Through experiments, the work focuses on the analysis of four selected methods: the Naive Bayes classifier, decision tree, neural network and SVM. Factors influencing basic characteristics such as training speed, classification speed and accuracy are examined. Part of the thesis is a desktop application that serves as a tool for training, testing and validating the individual methods. Eleven reference data sets are selected for experimental purposes. The work concludes with a summary of the experimental comparison and the observed characteristics of the classification methods.
9	Identifikace zařízení na základě jejich chování v síti / Behaviour-Based Identification of Network Devices Polák, Michael Adam January 2020 This thesis deals with the identification of network devices based on their behaviour in the network. With the ever-increasing number of devices on the network, the ability to identify devices for security reasons is becoming ever more important. The thesis further discusses the fundamentals of computer networks and the methods used in the past to identify network devices. It then describes the algorithms used in machine learning, together with their advantages and disadvantages. Finally, the thesis tests two traditional machine learning algorithms and proposes two new approaches to identifying network devices. The final algorithm proposed in this thesis achieves 89% accuracy in identifying network devices on a real dataset with more than 10,000 devices.
10	Porovnání klasifikačních metod / Comparison of Classification Methods Dočekal, Martin January 2019 This thesis deals with a comparison of classification methods. First, classification methods based on machine learning are described; then a classifier comparison system is designed and implemented. The thesis also describes some classification tasks and datasets on which the designed system will be tested. The classification tasks are evaluated according to standard metrics. The thesis also presents the design and implementation of a classifier based on the principle of evolutionary algorithms.

Search results