  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
281

Automatic State Construction using Decision Trees for Reinforcement Learning Agents

Au, Manix January 2005
Reinforcement Learning (RL) is a learning framework in which an agent learns a policy from continual interaction with the environment. A policy is a mapping from states to actions. The agent receives rewards as feedback on the actions performed. The objective of RL is to design autonomous agents that search for the policy maximizing the expectation of the cumulative reward. When the environment is partially observable, the agent cannot determine the states with certainty; such states are called hidden in the literature. An agent that relies exclusively on its current observations will not always find the optimal policy. For example, a mobile robot moving down a corridor of identical doors needs to remember how many doors it has gone by in order to reach a specific door. To overcome the problem of partial observability, an agent uses both current and past (memory) observations to construct an internal state representation, which is treated as an abstraction of the environment. This research focuses on how features of past events are extracted, at variable granularity, for the construction of the internal state. The project introduces a new method that applies information theory and decision tree techniques to derive a tree structure which represents both the state and the policy. The relevance of a candidate feature is assessed by its Information Gain Ratio ranking with respect to the cumulative expected reward. Experiments carried out on three different RL tasks have shown that our variant of the U-Tree (McCallum, 1995) produces a more robust state representation and faster learning. This better performance can be explained by the fact that the Information Gain Ratio exhibits a lower variance in return prediction than the Kolmogorov-Smirnov statistical test used in the original U-Tree algorithm.
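The Information Gain Ratio that ranks candidate features can be sketched as follows. This is a generic illustration, not the thesis's implementation: the feature values and labels below are hypothetical, and in the U-Tree variant the "labels" would be discretized expected returns rather than class names.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(feature, labels):
    """Information Gain Ratio of a discrete feature with respect to labels:
    the entropy reduction from splitting on the feature, normalized by the
    entropy of the partition the feature induces."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    # Conditional entropy of the labels given the feature value.
    cond = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - cond
    # Split information: entropy of the partition itself.
    split_info = entropy(list(feature))
    return gain / split_info if split_info > 0 else 0.0
```

A perfectly predictive feature scores 1.0, while a feature independent of the labels scores 0.0, which is the ordering the tree-growing step relies on.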
282

Detecção de intrusos em redes de computadores com uso de códigos corretores de erros e medidas de informação. / Intrusion detection in computer networks using error correction codes and information measures.

LIMA, Christiane Ferreira Lemos. 13 August 2018
The thesis's main objective is to present a new scheme for protecting computer systems against intrusions by making use of error-correcting codes and information measures. The identification of different types of attacks on a computer network is viewed as a multiclass classification task, because it involves discriminating attacks into various types or categories. Based on this approach, this work presents a strategy for multiclass classification based on error-correcting code principles, where each of the M classes of the proposed problem is associated with a codeword of length N, N being the number of attributes selected by means of information measures. These attributes are monitored by software devices, here called network detectors and host detectors. In this approach, the codewords that form the code table are restricted to a sub-code of a BCH (Bose-Chaudhuri-Hocquenghem) linear code, which allows the decoding step to be performed with algebraic decoding algorithms, something that is not possible with randomly selected codewords. In this context, a variant of a genetic algorithm is applied to the design of the table used to identify attacks in computer networks.
The effective contributions of this thesis, demonstrated in experiments carried out by computer simulation, are: the application of coding theory to determine codewords suited to the problem of detecting intrusions in a computer network; the selection of attributes by means of C4.5 decision trees based on the Rényi and Tsallis information measures; and the use of algebraic decoding, based on the concepts of traditional decoding and list decoding.
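The codeword-table idea can be illustrated as error-correcting output coding: each attack class owns a codeword, the N attribute detectors emit a bit vector, and the class is recovered by decoding. The sketch below is a simplification under stated assumptions: the 4-class table is hypothetical, and brute-force nearest-codeword (minimum Hamming distance) decoding stands in for the algebraic BCH decoding the thesis actually uses.

```python
def hamming(a, b):
    """Number of positions in which two equal-length binary words differ."""
    return sum(x != y for x, y in zip(a, b))

def decode_class(observed, table):
    """Return the class whose codeword is nearest, in Hamming distance,
    to the bit vector produced by the attribute detectors."""
    return min(table, key=lambda cls: hamming(observed, table[cls]))

# Hypothetical 4-class table with length-7 codewords (illustrative only;
# the thesis restricts the table to a sub-code of a BCH code and designs
# it with a genetic algorithm).
TABLE = {
    "normal": (0, 0, 0, 0, 0, 0, 0),
    "dos":    (1, 1, 1, 0, 0, 0, 1),
    "probe":  (0, 0, 1, 1, 1, 0, 1),
    "u2r":    (1, 1, 0, 1, 1, 1, 0),
}
```

An observation that disagrees with the "dos" codeword in a single position still decodes to "dos", which is exactly the error-correcting behavior the scheme relies on when individual detectors misfire.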
283

Arbres de décisions symboliques, outils de validations et d'aide à l'interprétation / Symbolic decision trees, tools for validation and interpretation assistance

Seck, Djamal 20 December 2012
In this thesis, we propose the STREE method for constructing decision trees with symbolic data. This data type makes it possible to characterize higher-level individuals, which may be classes or categories of individuals, or concepts in the sense of Galois lattices.
The values of the variables, called symbolic variables, may be sets, intervals, or histograms. The recursive partitioning criterion is a combination of a criterion on the explanatory variables and a criterion on the dependent variable. The first criterion is the variation of the variance of the explanatory variables; when applied alone, STREE acts as a top-down clustering method. The second criterion makes it possible to build a decision tree: it is the variation of the Gini index if the dependent variable is nominal, and the variation of the variance if the dependent variable is continuous or is a symbolic variable. Conventional data are a special case of symbolic data, on which STREE can also obtain good results: it performs well on several UCI data sets compared with classical data mining methods such as CART, C4.5, Naive Bayes, KNN, MLP, and SVM. STREE also allows the construction of ensembles of symbolic decision trees, either by bagging or by boosting. The use of such ensembles aims to overcome the shortcomings of individual decision trees and to obtain a final decision that is, in principle, more reliable than that obtained from a single tree.
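The second splitting criterion for a nominal dependent variable, the variation of the Gini index, can be sketched for the classical-data special case. This is a generic illustration of the Gini impurity decrease for one binary split, not STREE's symbolic-data machinery.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a set of nominal labels: 1 - sum of squared
    class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_decrease(left, right):
    """Variation of the Gini index achieved by splitting a parent node
    into the two given child nodes (weighted impurity decrease)."""
    parent = left + right
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted
```

A split that perfectly separates two classes yields the maximal decrease, while a split that leaves both children with the parent's class mix yields zero; candidate splits are ranked by this quantity.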
284

  • Classificação de data streams utilizando árvore de decisão estatística e a teoria dos fractais na análise evolutiva dos dados / Data stream classification using a statistical decision tree and fractal theory in the evolutionary analysis of data

Cazzolato, Mirela Teixeira 24 March 2014
A data stream is generated quickly, continuously, in an ordered fashion, and in large quantities. To process data streams one must consider, among other factors, the limited use of memory, the need for real-time processing, the accuracy of the results, and concept drift (which occurs when the concept of the data being analyzed changes). The decision tree is a popular form of classifier representation that is intuitive and fast to build, and it generally achieves high accuracy. The incremental decision tree techniques in the literature, however, generally have high computational costs for constructing and updating the model, especially in the calculation used to split the decision nodes. The existing methods behave conservatively when dealing with limited amounts of data, tending to improve their results as the number of examples increases. Another problem is that many real-world applications generate noisy data, and the existing techniques have low tolerance to such noise. This work aims to develop decision tree methods for data streams that address these deficiencies of the current state of the art. In addition, another objective is to develop a technique to detect concept drift using fractal theory; this functionality should indicate when the model needs to be corrected, allowing the adequate description of the most recent events. To achieve these objectives, three decision tree algorithms were developed: StARMiner Tree, Automatic StARMiner Tree, and Information Gain StARMiner Tree. These algorithms use a fast statistical method as the node-splitting heuristic, one that does not depend on the number of examples read.
In the experiments the algorithms achieved high accuracy, also showing tolerant behavior when classifying noisy data. Finally, a drift detection method based on fractal theory, called the Fractal Drift Detection Method, was proposed to detect changes in the data distribution. It detects significant changes in the distribution, causing the model to be updated whenever it no longer describes the current data (i.e., becomes obsolete). The method achieved good results in the classification of data containing concept drift, proving to be suitable for the evolutionary analysis of data.
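A fractal-theory drift signal can be sketched with a box-counting dimension estimate: summarize a reference window and the current window by their estimated fractal dimension, and flag drift when the two diverge. This is a generic 1-D sketch with an arbitrary tolerance, not the estimator or thresholds of the Fractal Drift Detection Method itself.

```python
import math

def box_counting_dimension(points, scales=(2, 4, 8, 16, 32, 64)):
    """Rough box-counting dimension for 1-D points in [0, 1): the slope of
    log N(eps) versus log(1/eps), where N(eps) counts occupied boxes of
    side eps = 1/s."""
    xs, ys = [], []
    for s in scales:
        occupied = {min(int(p * s), s - 1) for p in points}
        xs.append(math.log(s))
        ys.append(math.log(len(occupied)))
    # Least-squares slope of the log-log relationship.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

def drift_detected(reference, window, tolerance=0.2):
    """Flag drift when the fractal dimension of the current window moves
    away from that of the reference window by more than the tolerance."""
    return abs(box_counting_dimension(reference)
               - box_counting_dimension(window)) > tolerance
```

Points spread over the whole interval estimate a dimension near 1, while points collapsed onto a single value estimate near 0, so a sudden concentration of the stream triggers the flag.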
285

Supervised Learning of Piecewise Linear Models

Manwani, Naresh January 2012
Supervised learning of piecewise linear models is a well-studied problem in the machine learning community. The key idea in piecewise linear modeling is to properly partition the input space and learn a linear model for every partition. Decision trees and regression trees are classic examples of piecewise linear models for classification and regression problems. The existing approaches for learning decision/regression trees can be broadly classified into two classes: fixed structure approaches and greedy approaches. In fixed structure approaches, the tree structure is fixed beforehand by fixing the number of non-leaf nodes, the height of the tree, and the paths from the root node to every leaf node. Mixture of experts and hierarchical mixture of experts are examples of fixed structure approaches for learning piecewise linear models. Parameters of the models are found using, e.g., maximum likelihood estimation, for which the expectation maximization (EM) algorithm can be used. Fixed structure piecewise linear models can also be learnt using risk minimization under an appropriate loss function. Learning an optimal decision tree with a fixed structure approach is a hard problem: constructing an optimal binary decision tree is known to be NP-complete. On the other hand, greedy approaches do not assume any parametric form or fixed structure for the decision tree classifier. Most greedy approaches learn tree-structured piecewise linear models in a top-down fashion, built by binary or multi-way recursive partitioning of the input space. The main issue in top-down decision tree induction is choosing an appropriate objective function to rate the split rules; the objective function should be easy to optimize. Top-down decision trees are easy to implement and understand, but they offer no optimality guarantees due to their greedy nature. Regression trees are built in a similar way to decision trees.
In regression trees, every leaf node is associated with a linear regression function. All piecewise linear modeling techniques deal with two main tasks: partitioning the input space and learning a linear model for every partition. However, partitioning the input space and learning linear models for the different partitions are not independent problems. Simultaneous optimal estimation of the partitions and the linear models is a combinatorial problem and hence computationally hard. Nevertheless, piecewise linear models provide better insight into the classification or regression problem by giving an explicit representation of the structure in the data. The information they capture can be summarized in terms of simple rules, so that it can be used to analyze the properties of the domain from which the data originates. These properties make piecewise linear models, like decision trees and regression trees, extremely useful in many data mining applications and place them among the top data mining algorithms. In this thesis, we address the problem of supervised learning of piecewise linear models for classification and regression. We propose novel algorithms for learning piecewise linear classifiers and regression functions, and we also address the problem of noise-tolerant learning of classifiers in the presence of label noise. We propose a novel algorithm for learning polyhedral classifiers, which are the simplest form of piecewise linear classifiers. Polyhedral classifiers are useful when the points of the positive class fall inside a convex region and all the negative class points are distributed outside it; the region of the positive class can then be well approximated by a simple polyhedral set. The key challenge in optimally learning a fixed structure polyhedral classifier is to identify the sub-problems, where each sub-problem is a linear classification problem.
This is a hard problem, and identifying polyhedral separability is known to be NP-complete. The goal of any polyhedral learning algorithm is to handle the underlying combinatorial problem efficiently while achieving good classification accuracy. Existing methods for learning a fixed structure polyhedral classifier are based on solving non-convex constrained optimization problems; these approaches do not handle the combinatorial aspect of the problem efficiently and are computationally expensive. We propose a method of model-based estimation of the posterior class probability to learn polyhedral classifiers. We solve an unconstrained optimization problem using a simple two-step algorithm (similar to the EM algorithm) to find the model parameters. To the best of our knowledge, this is the first attempt to formulate an unconstrained optimization problem for learning polyhedral classifiers. We then modify our algorithm so that it also finds the required number of hyperplanes automatically. We show experimentally that our approach is better than existing polyhedral learning algorithms in terms of training time, performance, and complexity. Most often, class conditional densities are multimodal. In such cases, each class region may be represented as a union of polyhedral regions, and hence a single polyhedral classifier is not sufficient; a generic decision tree is required. Learning an optimal fixed structure decision tree is computationally hard, while top-down decision trees have no optimality guarantees due to their greedy nature. Top-down approaches are nonetheless widely used because they are versatile and easy to implement. Most existing top-down decision tree algorithms (CART, OC1, C4.5, etc.) use impurity measures to assess the goodness of hyperplanes at each node of the tree. These measures do not properly capture the geometric structure in the data.
We propose a novel decision tree algorithm that, at each node, selects hyperplanes based on an objective function which takes into consideration the geometric structure of the class regions. The resulting optimization problem turns out to be a generalized eigenvalue problem and hence can be solved efficiently. We show through empirical studies that our approach leads to smaller trees and better performance than other top-down decision tree approaches, and we provide some theoretical justification for the proposed method. Piecewise linear regression is similar to the corresponding classification problem. For example, in regression trees, each leaf node is associated with a linear regression model, so the problem is once again that of (simultaneous) estimation of optimal partitions and learning a linear model for each partition. Regression trees, the hinging hyperplanes method, and mixtures of experts are some of the approaches for learning continuous piecewise linear regression models; many of these algorithms are computationally intensive. We present a method of learning piecewise linear regression models which is computationally simple and is capable of learning discontinuous functions as well. The method is based on the idea of K-plane regression, which can identify a set of linear models given the training data. K-plane regression is a simple algorithm motivated by the philosophy of k-means clustering. However, this simple algorithm has several problems: it does not give a model function with which one can predict the target value for any given input, and it is very sensitive to noise. We propose a modified K-plane regression algorithm which can learn continuous as well as discontinuous functions. The proposed algorithm retains the spirit of the k-means algorithm, improves the objective function after every iteration, and learns a proper piecewise linear model that can be used for prediction.
The algorithm is also more robust to additive noise than K-plane regression. When learning classifiers, one normally assumes that the class labels in the training data set are noise-free. However, in many applications, such as spam filtering and text classification, the training data can be mislabeled due to subjective errors. In such cases, standard learning algorithms (SVM, AdaBoost, decision trees, etc.) start overfitting on the noisy points, leading to poor test accuracy. Analyzing the vulnerability of classifiers to label noise has therefore recently attracted growing interest in the machine learning community. Existing noise-tolerant learning approaches first try to identify the noisy points and then learn a classifier on the remaining points. In this thesis, we address the issue of developing learning algorithms which are inherently noise tolerant. An algorithm is inherently noise tolerant if the classifier it learns from noisy samples has the same performance on test data as one learnt from noise-free samples. Algorithms having such robustness (under suitable assumptions on the noise) are attractive for learning with noisy samples. Here, we consider non-uniform label noise, a generic noise model in which the probability of an example's class label being incorrect is a function of the example's feature vector (we assume this probability is less than 0.5 for all feature vectors). This can account for most cases of noisy data sets. There is no provably optimal algorithm for learning noise-tolerant classifiers in the presence of non-uniform label noise. We propose a novel characterization of the noise tolerance of an algorithm, and we analyze the noise tolerance properties of the risk minimization framework, as risk minimization is a common strategy for classifier learning. We show that risk minimization under the 0-1 loss has the best noise tolerance properties, and that none of the standard convex loss functions has such properties. Empirical risk minimization under the 0-1 loss is a hard problem, as the 0-1 loss function is not differentiable. We propose a gradient-free stochastic optimization technique to minimize risk under the 0-1 loss for noise-tolerant learning of linear classifiers. We show (under some conditions) that the algorithm converges asymptotically to the global minimum of the risk under the 0-1 loss, and we illustrate the noise tolerance of the algorithm through simulation experiments.
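The k-means-flavored alternation behind K-plane regression can be sketched in one dimension: assign each point to the line with the smallest residual, refit each line by least squares on its assigned points, and repeat. This is an illustrative baseline under simplifying assumptions (1-D inputs, a deterministic contiguous-chunk initialization), not the thesis's modified algorithm with its continuity handling and noise robustness.

```python
def fit_line(pts):
    """Ordinary least-squares line y = a*x + b through (x, y) pairs."""
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    a = sum((x - mx) * (y - my) for x, y in pts) / sxx if sxx else 0.0
    return a, my - a * mx

def k_plane_regression(data, k=2, iters=20):
    """k-means-style alternation: assign points to their best-fitting line,
    then refit each line on its assigned points."""
    data = sorted(data)
    n = len(data)
    # Crude deterministic initialization: k contiguous chunks of x-sorted data.
    chunks = [data[i * n // k:(i + 1) * n // k] for i in range(k)]
    lines = [fit_line(c) for c in chunks]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x, y in data:
            j = min(range(k),
                    key=lambda i: (y - (lines[i][0] * x + lines[i][1])) ** 2)
            groups[j].append((x, y))
        # Keep a line unchanged if its group empties out.
        lines = [fit_line(g) if g else lines[i] for i, g in enumerate(groups)]
    return lines
```

On data drawn from two linear pieces, the alternation recovers both slopes; each step can only reduce the total squared residual, which is the objective-improvement property the abstract refers to.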
286

  • Estruturas flutuantes para a exploração de campos de petróleo no mar (FPSO): apoio à decisão na escolha do sistema. / Decision aid methods applied to the selection of floating production storage and offloading systems.

Marcos Fernando Garber 17 December 2002
Naval construction professionals frequently take decisions to select the elements that must be specified in the composition of a given design. Besides the experience and knowledge necessary to choose a proper path, the choice must efficiently satisfy the problem requirements and the preferences of the designer.
The selection of components in the design of floating structures for oil production at sea involves both objective and subjective aspects. This work introduces and applies several decision aid methods, helping the designer to improve his sensibility. The objective of this research is to present, in summary form, the two foundations for the design decision: the decision aid methods and the requirements that an FPSO (Floating Production Storage and Offloading) installation must fulfil. The result is a procedure that enables the designer to make the best choice of components and increases the designer's sensibility in selecting among the possible options, which are to build a completely new FPSO or to make use of an existing one. This work presents a review of the principles of decision analysis, information on decision aid methods, the input data of the proposed problem, the classic naval design method, the basic requirements for building floating structures for offshore oil production, and the basic requirements for an FPSO installation. To solve the problem, the decision tree method was applied to the decisions under risk, and the AHP (Analytic Hierarchy Process) to the decisions under certainty.
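The AHP step derives priority weights for the competing criteria from a pairwise comparison matrix. A common approximation of the principal eigenvector is the geometric-mean (row) method, sketched below; the three criteria and their comparison values are hypothetical, chosen only to illustrate the calculation, not taken from the thesis.

```python
import math

def ahp_weights(matrix):
    """Priority weights from a square pairwise comparison matrix, using
    the geometric-mean approximation of the principal eigenvector."""
    n = len(matrix)
    gm = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(gm)
    return [g / total for g in gm]

# Hypothetical comparison of three FPSO selection criteria
# (cost vs schedule vs technical risk) on Saaty's 1-9 scale.
M = [
    [1,     3,   5],
    [1 / 3, 1,   2],
    [1 / 5, 1 / 2, 1],
]
```

For a perfectly consistent matrix the method recovers the exact eigenvector weights; in practice one would also check the consistency ratio before trusting the result.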
287

  • Uma metodologia de binarização para áreas de imagens de cheque utilizando algoritmos de aprendizagem supervisionada / A binarization methodology for check image areas using supervised learning algorithms

Alves, Rafael Félix 23 June 2015
The process of image binarization consists of transforming a color image into a new image with only two colors: one representing the background, the other the object of interest. This process is an important step in many modern applications such as check clearing, Optical Character Recognition (OCR), and Handwriting Recognition (HWR). Improvements in the automatic image binarization process therefore have a direct impact on the applications that rely on this step. The present work proposes a methodology for automatic image binarization based on supervised learning algorithms, such as artificial neural networks and decision trees. It consists of the following steps: image database construction; extraction of the region of interest; construction of the pattern matrix; pattern labelling; database sampling; and classifier training.
Experimental results are presented using a database of Brazilian bank check images (the CMC-7 code and the courtesy amount) and the DIBCO 2009 competition database. In conclusion, the proposed methodology proved competitive with the methods in the literature, standing out in applications where image processing is restricted to a single category of images, as is the case with Brazilian bank check images: it ranked among the top three positions and achieved better results in terms of F-Measure.
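The core of the methodology, labeling pixels and training a classifier to separate ink from background, can be illustrated with the simplest possible supervised learner: a one-feature decision stump that learns an intensity threshold from labeled pixels. This is a drastic simplification of the neural networks and decision trees used in the work, intended only to show the supervised-binarization idea.

```python
def learn_threshold(intensities, labels):
    """Learn the intensity threshold that best separates labeled pixels
    (label 1 = ink/object, 0 = background): a one-feature decision stump
    trained by exhaustively trying each observed intensity."""
    best_t, best_err = None, float("inf")
    for t in sorted(set(intensities)):
        # Predict "ink" when the intensity is below the threshold.
        err = sum((i < t) != bool(y) for i, y in zip(intensities, labels))
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def binarize(image, threshold):
    """Map a grayscale image (list of rows) to black (1) and white (0)."""
    return [[1 if px < threshold else 0 for px in row] for row in image]
```

In the full methodology each pixel would carry a feature vector (local statistics, neighborhood patterns) rather than a single intensity, and the stump would be replaced by a trained classifier.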
288

Tvorba konceptu category managementu / Creating the Concept of Category Management

Lepař, Martin January 2012 (has links)
The goal of this thesis is to define the process of creating a category-management concept and to apply that process to the existing assortment concept of a particular retail company. The theoretical part covers the basics of category management, merchandising and other related disciplines, and their influence on concept creation. The practical part gives a detailed description of the creation of the concept, carried out by a retail company selling do-it-yourself products. The creation process starts by analyzing the current situation, followed by determining the category strategy and objectives and implementing the new concept.
289

Árbol de decisión para la selección de un motor de base de datos / Decision tree for the selection of database engine

Bendezú Kiyán, Enrique Renato, Monjaras Flores, Álvaro Gianmarco 30 August 2020 (has links)
In recent years, the number of users browsing the internet has grown exponentially. As a consequence, the amount of information handled has grown disproportionately, and handling the large volumes of information obtained from the internet has caused major problems. Different categories of databases behave differently, since the performance of executing transactions varies with the amount of information being handled. Among these varieties, relational databases, non-relational databases and in-memory databases are analyzed. Fast information handling is very important for organizations, given the high demand from customers and the market in general: internal operations must not slow down when information has to be processed, and its integrity must be preserved. However, each category of database is designed to cover different specific use cases while maintaining high performance in data handling. The objective of this project is to study various scenarios covering the main use cases, costs, scalability and performance of each database category by building a decision tree that determines the best database category according to the path the user decides to follow. Keywords: Database, Relational Database, Non-Relational Database, In-Memory Database, Decision Tree. / Tesis
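A decision tree of this kind can be encoded directly as branching logic. The sketch below is purely illustrative: the questions asked at each branch and the resulting recommendations are assumptions made for the example, not the criteria or tree actually derived in the thesis.

```python
# Hypothetical sketch of a database-selection decision tree: each call walks
# the branches from the root and returns a database *category*.

def recommend_engine(structured: bool, low_latency: bool, horizontal_scale: bool) -> str:
    """Return a database category for the given (illustrative) requirements."""
    if low_latency:
        # Sub-millisecond access favors keeping the working set in RAM.
        return "in-memory database"
    if structured and not horizontal_scale:
        # Fixed schema, joins and ACID transactions on a single node.
        return "relational database"
    # Flexible schema, or scale-out across many nodes.
    return "non-relational (NoSQL) database"

print(recommend_engine(structured=True, low_latency=False, horizontal_scale=False))
```

Each path from the root to a leaf corresponds to one flow a user can take through the tree, which is exactly what makes the representation easy to present as a selection guide.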
290

Získávání znalostí pro modelování následných akcí / Data Mining for Suggesting Further Actions

Veselovský, Martin January 2017 (has links)
Knowledge discovery from databases is a complex process involving integration, data preparation, data mining using machine-learning methods, and visualization of results. The thesis deals with the whole knowledge-discovery process, especially with data warehousing, for which it presents the design and implementation of a specific data warehouse for the company ROI Hunter, a.s. In the data-mining part, the work focuses on classification and prediction over the advertising data available from the prepared data warehouse, in particular on decision-tree classification. When predicting the performance of new ads, emphasis is placed on explaining the rationale behind each prediction and on proposing adjustments to the ad settings so that the prediction becomes positive and, with some likelihood, the ads actually achieve better results.
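One appeal of decision-tree classification for this task is that the rationale behind a prediction can be read off as the sequence of tests on the path to a leaf. The sketch below is an illustration under invented assumptions (the two ad features, the tiny training set, and scikit-learn as the library), not the thesis's actual model or data:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical ad features: [daily_budget, click_through_rate]; label 1 = "performs well".
X = np.array([[10, 0.01], [50, 0.05], [80, 0.06], [15, 0.02],
              [90, 0.07], [20, 0.01], [70, 0.04], [60, 0.05]])
y = np.array([0, 1, 1, 0, 1, 0, 1, 1])

clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Explain one prediction by printing the tests along its decision path.
sample = np.array([[65, 0.05]])
node_path = clf.decision_path(sample)          # sparse indicator of visited nodes
feature_names = ["daily_budget", "click_through_rate"]

for node in node_path.indices:
    if clf.tree_.children_left[node] == -1:    # leaf reached: final decision
        print("=> predicted class:", clf.predict(sample)[0])
    else:
        feat = clf.tree_.feature[node]
        thr = clf.tree_.threshold[node]
        branch = "yes" if sample[0, feat] <= thr else "no"
        print(f"{feature_names[feat]} <= {thr:.3f}? {branch}")
```

Inverting such a path (finding the smallest feature change that flips the leaf) is one simple way to turn the explanation into a suggested adjustment of the ad settings.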
