Global ETD Search

1	Formal Verification of Tree Ensembles in Safety-Critical Applications Törnblom, John January 2020 (has links) In the presence of data and computational resources, machine learning can be used to synthesize software automatically. For example, machines are now capable of learning complicated pattern recognition tasks and sophisticated decision policies, two key capabilities in autonomous cyber-physical systems. Unfortunately, humans find software synthesized by machine learning algorithms difficult to interpret, which currently limits their use in safety-critical applications such as medical diagnosis and avionic systems. In particular, successful deployments of safety-critical systems mandate the execution of rigorous verification activities, which often rely on human insights, e.g., to identify scenarios in which the system shall be tested. A natural pathway towards a viable verification strategy for such systems is to leverage formal verification techniques, which, in the presence of a formal specification, can provide definitive guarantees with little human intervention. However, formal verification suffers from scalability issues with respect to system complexity. In this thesis, we investigate the limits of current formal verification techniques when applied to a class of machine learning models called tree ensembles, and identify model-specific characteristics that can be exploited to improve the performance of verification algorithms when applied specifically to tree ensembles. To this end, we develop two formal verification techniques specifically for tree ensembles, one fast and conservative technique, and one exact but more computationally demanding. We then combine these two techniques into an abstraction-refinement approach, that we implement in a tool called VoTE (Verifier of Tree Ensembles). Using a couple of case studies, we recognize that sets of inputs that lead to the same system behavior can be captured precisely as hyperrectangles, which enables tractable enumeration of input-output mappings when the input dimension is low. Tree ensembles with a high-dimensional input domain, however, seems generally difficult to verify. In some cases though, conservative approximations of input-output mappings can greatly improve performance. This is demonstrated in a digit recognition case study, where we assess the robustness of classifiers when confronted with additive noise. Formal verification Machine learning Tree ensembles Computer Sciences Datavetenskap (datalogi)
2	Urychlení evolučních algoritmů pomocí rozhodovacích stromů a jejich zobecnění / Accelerating evolutionary algorithms by decision trees and their generalizations Klíma, Jan January 2011 (has links) Evolutionary algorithms are one of the most successful methods for solving non-traditional optimization problems. As they employ only function values of the objective function, evolutionary algorithms converge much more slowly than optimization methods for smooth functions. This property of evolutionary algorithms is particularly disadvantageous in the context of costly and time-consuming empirical way of obtaining values of the objective function. However, evolutionary algorithms can be substantially speeded up by employing a sufficiently accurate regression model of the empirical objective function. This thesis provides a survey of utilizability of regression trees and their ensembles as a surrogate model to accelerate convergence of evolutionary optimization.
3	[en] OCEANUI: INTERFACE FOR COUNTERFACTUAL EXPLANATIONS GENERATION / [pt] OCEANUI: INTERFACE PARA GERAÇÃO DE EXPLICAÇÕES CONTRAFACTUAIS MOISES HENRIQUE PEREIRA 22 August 2022 (has links) [pt] Atualmente algoritmos de aprendizado de máquina (ML) estão incrivelmente presentes no nosso cotidiano, desde sistemas de recomendação de filmes e músicas até áreas de alto risco como saúde, justiça criminal, finanças e assim por diante, auxiliando na tomada de decisões. Mas a complexidade de criação desses algoritmos de ML também está aumentando, enquanto sua interpretabilidade está diminuindo. Muitos algoritmos e suas decisões não podem ser facilmente explicados por desenvolvedores ou usuários, e os algoritmos também não são autoexplicáveis. Com isso, erros e vieses podem acabar ficando ocultos, o que pode impactar profundamente a vida das pessoas. Devido a isso, iniciativas relacionadas a transparência, explicabilidade e interpretabilidade estão se tornando cada vez mais relevantes, como podemos ver no novo regulamento sobre proteção e tratamento de dados pessoais (GDPR, do inglês General Data Protection Regulation), aprovado em 2016 para a União Europeia, e também na Lei Geral de Proteção de Dados (LGPD) aprovada em 2020 no Brasil. Além de leis e regulamentações tratando sobre o tema, diversos autores consideram necessário o uso de algoritmos inerentemente interpretáveis; outros mostram alternativas para se explicar algoritmos caixa-preta usando explicações locais, tomando a vizinhança de um determinado ponto e então analisando a fronteira de decisão dessa região; enquanto ainda outros estudam o uso de explicações contrafactuais. Seguindo essa linha dos contrafactuais, nos propomos a desenvolver uma interface com usuário para o sistema Optimal Counterfactual Explanations in Tree Ensembles (OCEAN), denominada OceanUI, através do qual o usuário gera explicações contrafactuais plausíveis usando Programação Inteira Mista e Isolation Forest. O propósito desta interface é facilitar a geração de contrafactuais e permitir ao usuário obter um contrafactual personalizado e mais aplicável individualmente, por meio da utilização de restrições e gráficos interativos. / [en] Machine learning algorithms (ML) are becoming incredibly present in our daily lives, from movie and song recommendation systems to high-risk areas like health care, criminal justice, finance, and so on, supporting decision making. But the complexity of those algorithms is increasing while their interpretability is decreasing. Many algorithms and their decisions cannot be easily explained by either developers or users, and the algorithms are also not self-explanatory. As a result, mistakes and biases can end up being hidden, which can profoundly impact people s lives. So, initiatives concerning transparency, explainability, and interpretability are becoming increasingly more relevant, as we can see in the General Data Protection Regulation (GDPR), approved in 2016 for the European Union, and in the General Data Protection Law (LGPD) approved in 2020 in Brazil. In addition to laws and regulations, several authors consider necessary the use of inherently interpretable algorithms; others show alternatives to explain black-box algorithms using local explanations, taking the neighborhood of a given point and then analyzing the decision boundary in that region; while yet others study the use of counterfactual explanations. Following the path of counterfactuals, we propose to develop a user interface for the system Optimal Counterfactual Explanations in Tree Ensembles (OCEAN), which we call OceanUI, through which the user generates plausible counterfactual explanations using Mixed Integer Programming and Isolation Forest. The purpose of this user interface is to facilitate the counterfactual generation and to allow the user to obtain a personal and more individually applicable counterfactual, by means ofrestrictions and interactive graphics. [pt] EXPLICACAO COUNTRAFACTUAL [pt] TREE ENSEMBLES [pt] VISUALIZACAO DE DADOS [en] COUNTERFACTUAL EXPLANATION [en] TREE ENSEMBLE [en] DATA VISUALIZATION
4	[en] APPROXIMATE BORN AGAIN TREE ENSEMBLES / [pt] ÁRVORES BA APROXIMADAS MATHEUS DE SOUSA SUKNAIC 28 October 2021 (has links) [pt] Métodos ensemble como random forest, boosting e bagging foram extensivamente estudados e provaram ter uma acurácia melhor do que usar apenas um preditor. Entretanto, a desvantagem é que os modelos obtidos utilizando esses métodos podem ser muito mais difíceis de serem interpretados do que por exemplo, uma árvore de decisão. Neste trabalho, nós abordamos o problema de construir uma árvore de decisão que aproximadamente reproduza um conjunto de árvores, explorando o tradeoff entre acurácia e interpretabilidade, que pode ser alcançado quando a reprodução exata do conjunto de árvores é relaxada. Primeiramente, nós formalizamos o problem de obter uma árvore de decisão de uma determinada profundidade que seja a mais aderente ao conjunto de árvores e propomos um algoritmo de programação dinâmica para resolver esse problema. Nós também provamos que a árvore de decisão obtida por esse procedimento satisfaz garantias de generalização relacionadas a generalização do modelo original de conjuntos de árvores, um elemento crucial para a efetividade dessa árvore de decisão em prática. Visto que a complexidade computacional do algoritmo de programação dinâmica é exponencial no número de features, nós propomos duas heurísticas para gerar árvores de uma determinada profundidade com boa aderência em relação ao conjunto de árvores. Por fim, nós conduzimos experimentos computacionais para avaliar os algoritmos propostos. Quando utilizados classificadores mais interpretáveis, os resultados indicam que em diversas situações a perda em acurácia é pequena ou inexistente: restrigindo a árvores de decisão de profundidade 6, nossos algoritmos produzem árvores que em média possuem acurácias que estão a 1 por cento (considerando o algoritmo de programção dinâmica) ou 2 por cento (considerando os algoritmos heurísticos) do conjunto original de árvores. / [en] Ensemble methods in machine learning such as random forest, boosting, and bagging have been thoroughly studied and proven to have better accuracy than using a single predictor. However, their drawback is that they give models that can be much harder to interpret than those given by, for example, decision trees. In this work, we approach in a principled way the problem of constructing a decision tree that approximately reproduces a tree ensemble, exploring the tradeoff between accuracy and interpretability that can be obtained once exact reproduction is relaxed. First, we formally define the problem of obtaining the decision tree of a given depth that is most adherent to a tree ensemble and give a Dynamic Programming algorithm for solving this problem. We also prove that the decision trees obtained by this procedure satisfy generalization guarantees related to the generalization of the original tree ensembles, a crucial element for their effectiveness in practice. Since the computational complexity of the Dynamic Programming algorithm is exponential in the number of features, we also design heuristics to compute trees of a given depth with good adherence to a tree ensemble. Finally, we conduct a comprehensive computational evaluation of the algorithms proposed. The results indicate that in many situations, there is little or no loss in accuracy in working more interpretable classifiers: even restricting to only depth-6 decision trees, our algorithms produce trees with average accuracies that are within 1 percent (for the Dynamic Programming algorithm) or 2 percent (heuristics) of the original random forest. [pt] ARVORE DE DECISAO [pt] MODELOS DE COMPRESSAO DE DADOS [pt] MACHINE LEARNING INTERPRETAVEL [pt] CONJUNTOS DE ARVORES [en] DECISION TREE [en] MODEL COMPRESSION [en] INTERPRETABLE MACHINE LEARNING [en] TREE ENSEMBLES
5	Применение алгоритмов машинного обучения для прогнозирования выработки солнечной электрической станции : магистерская диссертация / Application of Machine Learning Algorithms for Forecasting the Output of a Solar Power Station Харлашкина, М. А., Kharlashkina, M. A. January 2024 (has links) The result of this work is the development of an algorithm for forecasting the generation of a solar power station using artificial intelligence technologies. The developed models were successfully tested, demonstrating the best results with the use of random forests. / Результатом работы стала разработка алгоритма для прогнозирования генерации солнечной электростанции с использованием технологий искусственного интеллекта. Разработанные модели были успешно протестированы, показав наилучший результат с использованием случайного леса. MASTER'S THESIS RENEWABLE ENERGY SOURCES PHOTOVOLTAIC STATION ARTIFICIAL INTELLIGENCE MACHINE LEARNING DECISION TREE ENSEMBLES МАШИННОЕ ОБУЧЕНИЕ

1

Page generated in 0.0776 seconds