Global ETD Search

1	[en] ON THE SIMULTANEOUS MINIMIZATION OF WORST TESTING COST AND EXPECTED TESTING COST WITH DECISION TREES / [pt] MINIMIZAÇÃO SIMULTÂNEA DO PIOR CUSTO E DO CUSTO MÉDIO EM ÁRVORES DE DECISÃO ALINE MEDEIROS SAETTLER 25 January 2017 (has links) [pt] O problema de minimizar o custo de avaliar uma função discreta lendo sequencialmente as suas variáveis é um problema que surge em diversas aplicações, entre elas sistemas de diagnóstico automático e aprendizado ativo. Neste problema, cada variável da função está associada a um custo, que se deve pagar para checar o seu valor. Além disso, pode existir uma distribuição de probabilidades associadas aos pontos onde a função está definida. A maioria dos trabalhos nesta área se concentra ou na minimização do custo máximo ou na minimização do custo esperado gasto para avaliar a função. Nesta dissertação, mostramos como obter uma Ômicron logaritmo de N aproximação em relação à minimização do pior custo (a melhor aproximação possível assumindo que P é diferente de NP). Nós também mostramos um procedimento polinomial para avaliar uma função otimizando simultaneamente o pior custo e o custo esperado. / [en] The problem of minimizing the cost of evaluating a discrete function by sequentially reading its variables is a problem that arises in several applications, among them automatic diagnosis design and active learning. In this problem, each variable of the function is associated with a cost, that we have to pay in order to check its value. In addition, there may exist a probability distribution associated with the points where the function is defined. Most of the work in the area has focussed either on the minimization of the maximum cost or on the minimization of the expected cost spent to evaluate the function. In this dissertation, we show how to obtain an Ômicron logarithm of N approximation with respect to the worst case minimization (the best possible approximation under the assumption that P is different from NP). We also show a polynomial time procedure for evaluate a function that simultaneously optimizes both the worst and the expected costs. [pt] ARVORE DE DECISAO [en] DECISION TREE [pt] OTIMIZACAO COMBINATORIA [en] COMBINATORIAL OPTIMIZATION [pt] ANALISE DE ALGORITMOS [pt] ALGORITMOS DE APROXIMACAO
2	[pt] ALGORITMOS DE APROXIMAÇÃO PARA ÁRVORES DE DECISÃO / [en] APPROXIMATION ALGORITHMS FOR DECISION TREES ALINE MEDEIROS SAETTLER 13 December 2021 (has links) [pt] A construção de árvores de decisão é um problema central em diversas áreas da ciência da computação, por exemplo, teoria de banco de dados e aprendizado computacional. Este problema pode ser visto como o problema de avaliar uma função discreta, onde para verificar o valor de cada variável da função temos que pagar um custo, e os pontos onde a função está definida estão associados a uma distribuição de probabilidade. O objetivo do problema é avaliar a função minimizando o custo gasto (no pior caso ou no caso médio). Nesta tese, apresentamos quatro contribuições relacionadas a esse problema. A primeira é um algoritmo que alcança uma aproximação de O(log(n)) em relação a tanto o custo esperado quanto ao pior custo. A segunda é um método que combina duas árvores, uma com pior custo W e outra com custo esperado E, e produz uma árvore com pior custo de no máximo (1+p)W e custo esperado no máximo (1/(1-e-p))E, onde p é um parâmetro dado. Nós também provamos que esta é uma caracterização justa do melhor trade-off alcançável, mostrando que existe um número infinito de instâncias para as quais não podemos obter uma árvore de decisão com tanto o pior custo menor que (1 + p)OPTW(I) quanto o custo esperado menor que (1/(1 - e - p))OPTE(I), onde OPTW(I) (resp. OPTE(I)) denota o pior custo da árvore de decisão que minimiza o pior custo (resp. custo esperado) para uma instância I do problema. A terceira contribuição é um algoritmo de aproximação de O(log(n)) para a minimização do pior custo para uma variante do problema onde o custo de ler uma variável depende do seu valor. Nossa última contribuição é um algoritmo randomized rounding que, dada uma instância do problema (com um inteiro adicional (k > 0) e um parâmetro 0 < e < 1/2, produz uma árvore de decisão oblivious com custo no máximo (3/(1 - 2e))ln(n)OPT(I) e que produz no máximo (k/e) erros, onde OPT(I) denota o custo da árvore de decisão oblivious com o menor custo entre todas as árvores oblivious para a instância I que produzem no máximo k erros de classificação. / [en] Decision tree construction is a central problem in several areas of computer science, for example, data base theory and computational learning. This problem can be viewed as the problem of evaluating a discrete function, where to check the value of each variable of the function we have to pay a cost, and the points where the function is defined are associated with a probability distribution. The goal of the problem is to evaluate the function minimizing the cost spent (in the worst case or in expectation). In this Thesis, we present four contributions related to this problem. The first one is an algorithm that achieves an O(log(n)) approximation with respect to both the expected and the worst costs. The second one is a procedure that combines two trees, one with worst costW and another with expected cost E, and produces a tree with worst cost at most (1+p)W and expected cost at most (1/(1-e-p))E, where p is a given parameter. We also prove that this is a sharp characterization of the best possible trade-off attainable, showing that there are infinitely many instances for which we cannot obtain a decision tree with both worst cost smaller than (1+p)OPTW(I) and expected cost smaller than (1/(1-e-p))OPTE(I), where OPTW(I) (resp. OPTE(I)) denotes the cost of the decision tree that minimizes the worst cost (resp. expected cost) for an instance I of the problem. The third contribution is an O(log(n)) approximation algorithm for the minimization of the worst cost for a variant of the problem where the cost of reading a variable depends on its value. Our final contribution is a randomized rounding algorithm that, given an instance of the problem (with an additional integer k > 0) and a parameter 0 < e < 1/2, builds an oblivious decision tree with cost at most (3/(1 - 2e))ln(n)OPT(I) and produces at most (k/e) errors, where OPT(I) denotes the cost of the oblivious decision tree with minimum cost among all oblivious decision trees for instance I that make at most k classification errors. [pt] ARVORE DE DECISAO [pt] ALGORITMOS DE APROXIMACAO [pt] OTIMIZACAO COMBINATORIA [en] DECISION TREE [en] APPROXIMATION ALGORITHMS [en] COMBINATORIAL OPTIMIZATION
3	[en] DECISION TREES WITH EXPLAINABLE RULES / [pt] ÁRVORES DE DECISÃO COM REGRAS EXPLICÁVEIS VICTOR FEITOSA DE CARVALHO SOUZA 04 August 2023 (has links) [pt] As árvores de decisão são estruturas comumente utilizadas em cenários nos quais modelos explicáveis de Aprendizado de Máquina são desejados, por serem visualmente intuitivas. Na literatura existente, a busca por explicabilidade em árvores envolve a minimização de métricas como altura e número de nós. Nesse contexto, definimos uma métrica de explicabilidade, chamada de explanation size, que reflete o número de atributos necessários para explicar a classificação dos exemplos. Apresentamos também um algoritmo, intitulado SER-DT, que obtém uma aproximação O(log n) (ótima se P diferente NP) para a minimização da altura no pior caso ou caso médio, assim como do explanation size no pior caso ou caso médio. Em uma série de experimentos, comparamos a implementação de SER-DT com algoritmos conhecidos da área, como CART e EC2, além de testarmos o impacto de parâmetros e estratégias de poda nesses algoritmos. SER-DT mostrou-se competitivo em acurácia com os algoritmos citados, mas gerou árvores muito mais explicáveis. / [en] Decision trees are commonly used structures in scenarios where explainable Machine Learning models are desired, as they are visually intuitive. In the existing literature, the search for explainability in trees involves minimizing metrics such as depth and number of nodes. In this context, we define an explainability metric, called explanation size, which reflects the number of attributes needed to explain the classification of examples. We also present an algorithm, called SER-DT, which obtains an O(log n) approximation (optimal if P different NP) for the minimization of depth in the worst/average case, as well as of explanation size in the worst/average case. In a series of experiments, we compared the SER-DT implementation with well-known algorithms in the field, such as CART and EC2 in addition to testing the impact of parameters and pruning strategies on these algorithms. SER-DT proved to be competitive in terms of accuracy with the aforementioned algorithms, but generated much more explainable trees. [pt] APRENDIZADO DE MAQUINA [pt] MODELO EXPLICAVEL [pt] ALGORITMOS DE APROXIMACAO [pt] ARVORE DE DECISAO [pt] CLASSIFICACAO [en] MACHINE LEARNING [en] EXPLAINABLE MODEL [en] APPROXIMATION ALGORITHMS [en] DECISION TREE [en] CLASSIFICATION
4	[en] A CHARACTERIZATION OF TESTABLE GRAPH PROPERTIES IN THE DENSE GRAPH MODEL / [pt] UMA CARACTERIZAÇÃO DE PROPRIEDADES TESTÁVEIS NO MODELO DE GRAFOS DENSOS FELIPE DE OLIVEIRA 19 June 2023 (has links) [pt] Consideramos, nesta dissertação, a questão de determinar se um grafo tem uma propriedade P, tal como G é livre de triângulos ou G é 4- colorível. Em particular, consideramos para quais propriedades P existe um algoritmo aleatório com probabilidades de erro constantes que aceita grafos que satisfazem P e rejeita grafos que são epsilon-longe de qualquer grafo que o satisfaça. Se, além disso, o algoritmo tiver complexidade independente do tamanho do grafo, a propriedade é dita testável. Discutiremos os resultados de Alon, Fischer, Newman e Shapira que obtiveram uma caracterização combinatória de propriedades testáveis de grafos, resolvendo um problema em aberto levantado em 1996. Essa caracterização diz informalmente que uma propriedade P de um grafo é testável se e somente se testar P pode ser reduzido a testar a propriedade de satisfazer uma das finitas partições Szemerédi. / [en] We consider, in this thesis, the question of determining if a graph has a property P such as G is triangle-free or G is 4-colorable. In particular, we consider for which properties P there exists a random algorithm with constant error probabilities that accept graphs that satisfy P and reject graphs that are epsilon-far from any graph that satisfies it. If, in addition, the algorithm has complexity independent of the size of the graph, the property is called testable. We will discuss the results of Alon, Fischer, Newman, and Shapira that obtained a combinatorial characterization of testable graph properties, solving an open problem raised in 1996. This characterization informally says that a graph property P is testable if and only if testing P can be reduced to testing the property of satisfying one of finitely many Szemerédi-partitions. [pt] ALGORITMOS ALEATORIZADOS [pt] O LEMA DE REGULARIDADE DE SZEMEREDI [pt] TESTAGEM DE PROPRIEDADES [pt] ALGORITMOS EM GRAFOS [pt] ALGORITMOS DE APROXIMACAO [en] RANDOMIZED ALGORITHMS [en] SZEMEREDI S REGULARITY LEMMA [en] PROPERTY TESTING [en] GRAPH ALGORITHMS [en] APPROXIMATION ALGORITHMS

1

Page generated in 0.0339 seconds