121 |
Uma comparação de métodos de classificação aplicados à detecção de fraude em cartões de crédito / A comparison of classification methods applied to credit card fraud detection
Gadi, Manoel Fernando Alonso. 22 April 2008
In recent years, many bio-inspired algorithms have emerged for solving classification problems. In confirmation of this, the journal Nature published an article in 2002 that anticipated, for 2003, the commercial use of Artificial Immune Systems for fraud detection in financial institutions by a British company. Despite this, to the best of our knowledge, no scientific publication with promising results has appeared since then. Our work applied Artificial Immune Systems (AIS) to credit card fraud detection. We compared AIS with Decision Trees (DT), Neural Networks (NN), Bayesian Networks (BN), and Naive Bayes (NB). For a fairer comparison among the methods, exhaustive search and a genetic algorithm (GA) were used to select an optimized parameter set, in the sense of minimizing the cost of fraud on a credit card database provided by a Brazilian credit card issuer. In addition to this optimization, we also analyzed and searched for more robust parameters via multi-resolution; these parameters are presented in this work. Specific characteristics of fraud databases, such as class imbalance and the different costs of false positives and false negatives, were taken into account. All runs were carried out in Weka, a public, open-source software package, and test sets were always used to validate the classifiers. The results obtained are consistent with Maes et al., who show that BN are better than NN; although NN is one of the most widely used methods today, for our database and our implementations it is among the worst. Despite its poor result with default parameters, AIS obtained the best result with the parameters optimized by the GA, so that DT and AIS presented the best and most robust results among all the methods tested.
/ On 31 January 2002, the high-impact journal Nature published a news item about immune-based systems. Among the applications considered was the detection of fraudulent financial transactions, with commercial use by a British company anticipated as early as 2003. In spite of that, we do not know of any scientific publication that uses Artificial Immune Systems in financial fraud detection. This work reports very satisfactory results on the application of Artificial Immune Systems (AIS) to credit card fraud detection. In fact, scientific publications on financial fraud detection are quite rare, as Phua et al. [PLSG05] point out, in particular for credit card transactions. Phua et al. identify the lack of a public database of fraudulent financial transactions available for testing as the main cause of this small number of publications. Two of the most important publications on this subject that report results of their implementations are the award-winning Maes (2000), which compares Neural Networks and Bayesian Networks in credit card fraud detection, with results favoring Bayesian Networks, and Stolfo et al. (1997), which proposed the AdaCost method. This thesis joins both these works in publishing results on credit card fraud detection. Moreover, although the data and implementations of Maes are not available, we reproduce their results and extend the set of comparisons to cover Neural Networks, Bayesian Networks, Artificial Immune Systems, Decision Trees, and even the simple Naive Bayes. We also reproduce, to a certain extent, the results of Stolfo et al. (1997): we verify that a cost-sensitive meta-heuristic, generalized from the step that leads from AdaBoost to AdaCost, substantially improves the performance of every tested method except Naive Bayes.
Our analysis took into account the skewed nature of the dataset, as well as the need for parameter tuning, sometimes through the use of genetic algorithms, in order to obtain the best results from each compared method.
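The cost-sensitive evaluation mentioned in the abstract can be sketched roughly as follows. This is only an illustration in Python, not the Weka setup used in the thesis: the cost values (1 per false positive, 100 per false negative), the example scores, and the function names `fraud_cost` and `best_threshold` are all assumptions made for the sketch.

```python
def fraud_cost(y_true, scores, threshold, cost_fp=1.0, cost_fn=100.0):
    """Total cost when every transaction with score >= threshold is flagged.

    y_true: 1 for fraud, 0 for legitimate. The costs are hypothetical.
    """
    cost = 0.0
    for label, score in zip(y_true, scores):
        flagged = score >= threshold
        if flagged and label == 0:
            cost += cost_fp      # legitimate transaction flagged (false positive)
        elif not flagged and label == 1:
            cost += cost_fn      # fraud missed (false negative)
    return cost


def best_threshold(y_true, scores, candidates, cost_fp=1.0, cost_fn=100.0):
    """Exhaustive search over candidate thresholds for the lowest total cost."""
    return min(
        candidates,
        key=lambda t: fraud_cost(y_true, scores, t, cost_fp, cost_fn),
    )


# Hypothetical, heavily imbalanced toy data: two frauds among eight transactions.
y_true = [0, 0, 0, 0, 1, 0, 1, 0]
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.90, 0.15]
candidates = [0.1, 0.3, 0.5, 0.7]

chosen = best_threshold(y_true, scores, candidates)
print(chosen, fraud_cost(y_true, scores, chosen))
```

Because a missed fraud is assumed to cost far more than a false alarm, the search favors a low threshold that accepts a few false positives, which is the kind of trade-off an accuracy-only comparison would hide on skewed data.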
|
122 |
Quantification of the influences of built-form upon travel of employed adults: new models based on the UK National Travel Survey
Jahanshahi, Kaveh. January 2017
After decades of research, a host of analytical difficulties is still hindering our understanding of the influences of the built form on travel. The main challenges are (a) assembling good quality data that reflects the majority of the known influences and that supports continuous monitoring, and (b) making sense methodologically of the many variables which strongly intercorrelate. This study uses the UK National Travel Survey (NTS) data, which is among the most comprehensive of its kind in the world. The fact that it has rarely been used so far for this purpose may be attributable to the methodological difficulties. This dissertation aims to develop a new analytical framework based on extended structural equation models (SEMs) in order to overcome some of the key methodological difficulties in quantifying the influences of the built form on travel, and in addition to provide a means to continuously monitor any changes in the effects over time. The analyses are focused on employed adults, because they are not only the biggest UK population segment with the highest per capita travel demand, but also the segment that is capable of adapting more rapidly to changing land use, built form and transport supply conditions. The research is pursued through three new models. Model 1 is a path diagram coupled with factor analyses, which estimates continuous, categorical and binary dependent variables. The model estimates the influences on travel distance, time and trip frequency by trip purpose while accounting for self-selection, spatial sorting, endogeneity of car ownership, and interactions among trip purposes. The results highlight stark differences among commuters, particularly the mobility disadvantages of women, part-time and non-car-owning workers even when they live in the most accessible urban areas. Model 2 incorporates latent categorisation analyses in order to identify a tangible typology of the built form and the associated variations in impacts on travel.
Identifying NTS variables as descriptors for tangible built form categories provides an improved basis for investigating land use and transport planning interventions. The model reveals three distinct built form categories in the UK with striking variations in the patterns of influences. Model 3 further investigates the variations across the built form categories. The resulting random intercept SEM provides a more precise quantification of the influences of self-selection and spatial sorting across the built form categories for each socioeconomic group. Four research areas are highlighted for further study: first, new preference, attitude and behavioural parameters may be introduced by incorporating non-NTS behavioural surveys; second, the new SEMs provide a basis for incorporating choice modelling where the utility function is defined with direct, indirect and latent variables; third, conceptual and methodological developments, such as non-parametric latent class analysis, allow the current model to be expanded to monitor changes in travel behaviour as and when new NTS or non-NTS data become available; fourth, the robustness of the inferences regarding causal or directional influences may require further quantification through the design of new panel data sets, building on the findings above.
|
123 |
Avaliação e modelagem de sistemas de suporte à decisão utilizando reconhecimento de padrões e redes bayesianas / Assessment and modeling of decision support systems using pattern recognition and bayesian networks
Michel Bessani. 09 February 2015
Decision support systems are used in scenarios involving uncertainty. A decision is normally aided by the results of past actions on similar problems. When a decision support system incorporates knowledge specific to a domain, it is called an expert system. This specific knowledge is used for inference together with the input information about the problem. The objective of this work is the assessment and modeling of decision support systems; two approaches to the same target problem were analyzed, one for managing the problem and one for detecting it. The management approach uses Bayesian networks to model both the specific knowledge and the inference. The variables used, the dependence relations, and the conditional probabilities between variables were extracted from the literature. The detection approach used images for feature extraction, followed by a clustering algorithm whose output was compared with a specialist's classification. One application area of expert systems is clinical practice, where they can assist in the detection, diagnosis, and treatment of diseases. Dental caries is a widespread problem that affects most people, in rich and poor countries alike. Few systems exist to assist the caries diagnosis process, and most existing systems are deterministic, focusing only on detecting the lesion. The caries management system developed here was presented to two dental professionals, whose opinion indicates that this approach is promising and applicable in fields such as education and primary health care. In addition to the presentation to the professionals, well-established cases from the literature were used to analyze the suggestions provided by the network, and the result was consistent with the real decision-making scenario.
The caries detection methodology achieved a high accuracy, 96.88%, showing that the methodology is promising in comparison with other work in the area. Beyond the contribution to dental informatics, the results show that extracting the network structure and conditional probabilities from the literature is a methodology that can be used in other areas with scenarios similar to caries diagnosis. In the next steps of the project, some points concerning system modeling and Bayesian networks will be analyzed, such as scalability and validation tests, both quantitative and qualitative; this includes the development of computationally effective methods for generating random cases using the Monte Carlo method. / Decision support systems are used in scenarios involving uncertainty; a decision is normally chosen using the results of actions taken on similar problems. Decision support systems can incorporate domain-specific knowledge; such systems are called expert systems. The specific knowledge is used for inference about the problem scenario. The objective of this work is the evaluation and modeling of decision support systems. We analyzed two distinct approaches to the same problem, one for detection and another for management. The management approach uses Bayesian networks to model the specific knowledge and the inference engine. The choice of variables, the dependence relationships, and the conditional probabilities were extracted from the scientific literature. The detection approach used images and feature extraction to perform clustering and compared the output labels with a specialist's classification. One application of expert systems is clinical, supporting the detection, diagnosis, and treatment of diseases. Dental caries is a widespread problem that affects a major part of the population. Few systems exist to support the caries diagnostic process, and most of them are deterministic, focusing only on the detection problem.
The caries management system developed here was shown to two dental professionals, and their opinion encourages applying this approach in fields such as dental education and primary health care. Beyond this, we used well-established cases to analyze the suggestions output by the network; the result obtained was coherent with the real decision-making scenario. The caries detection approach achieved a high accuracy, 96.88%, showing that the methodology is promising. Besides the contribution to the field of dental informatics, the results obtained here show that extracting the network structure from the literature could be used in problems similar to caries diagnosis. The next steps of the project are to analyze some points of systems modeling and Bayesian networks, such as scalability and validation tests, both quantitative and qualitative, including the development of computationally effective methods for the use of the Monte Carlo methodology.
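How a Bayesian network turns conditional probabilities into a recommendation can be illustrated with a very small Python sketch. The three variables (high sugar intake, poor hygiene, caries), the network shape, and every probability below are hypothetical values chosen for the example; they are not the structure or numbers extracted from the literature in this work.

```python
from itertools import product

# Hypothetical priors for two independent risk factors.
P_SUGAR = 0.3      # P(high sugar intake)
P_HYGIENE = 0.4    # P(poor oral hygiene)

# Hypothetical conditional probability table: P(caries | sugar, hygiene).
P_CARIES = {
    (True, True): 0.8,
    (True, False): 0.5,
    (False, True): 0.4,
    (False, False): 0.1,
}


def joint(sugar, hygiene, caries):
    """Joint probability of one full assignment, via the chain rule of the network."""
    p = (P_SUGAR if sugar else 1 - P_SUGAR) * (P_HYGIENE if hygiene else 1 - P_HYGIENE)
    p_c = P_CARIES[(sugar, hygiene)]
    return p * (p_c if caries else 1 - p_c)


def prob_caries():
    """Marginal P(caries), summing the joint over both risk factors."""
    return sum(joint(s, h, True) for s, h in product((True, False), repeat=2))


def posterior_sugar_given_caries():
    """P(high sugar | caries observed), by inference by enumeration."""
    numerator = sum(joint(True, h, True) for h in (True, False))
    return numerator / prob_caries()


print(round(prob_caries(), 4), round(posterior_sugar_given_caries(), 4))
```

Enumeration is only practical for networks this small; it is used here because it makes the role of each conditional probability explicit.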
|
124 |
Motif representation and discovery
Carvalho, A.M. 01 July 2011
An important part of gene regulation is mediated by specific proteins, called transcription factors, which influence the transcription of a particular gene by binding to specific sites on DNA sequences, called transcription factor binding sites (TFBSs) or, simply, motifs. Such binding sites are relatively short segments of DNA, normally 5 to 25 nucleotides long, over-represented in a set of co-regulated DNA sequences. There are two different problems in this setup: motif representation, which concerns the model that describes the TFBSs; and motif discovery, which focuses on unravelling TFBSs from a set of co-regulated DNA sequences. This thesis proposes a discriminative scoring criterion that culminates in a discriminative mixture of Bayesian networks to distinguish TFBSs from the background DNA. This new probabilistic model provides further evidence of non-additivity among binding site positions and offers superior discriminative power in TFBS detection. In addition, extra knowledge carefully selected from the literature was incorporated in TFBS discovery in order to capture a variety of characteristics of TFBS patterns. This extra knowledge was combined during the process of motif discovery, leading to results that are considerably more accurate than those achieved by methods that rely on the DNA sequence alone.
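For contrast with the model proposed in the thesis, the sketch below shows the simpler additive baseline that the non-additivity claim argues against: a position weight matrix (PWM) that scores each position independently by log-odds against a uniform background. It is not the discriminative mixture of Bayesian networks described above, and the matrix, sequence, and function names are hypothetical.

```python
import math

BACKGROUND = 0.25  # assumed uniform background frequency for A, C, G, T

# Hypothetical 3-position motif favouring "ACG"; each row sums to 1.
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]


def log_odds(window, pwm=PWM):
    """Additive score: positions contribute independently of one another."""
    return sum(math.log2(column[base] / BACKGROUND) for column, base in zip(pwm, window))


def best_site(sequence, pwm=PWM):
    """Slide the motif along the sequence and return (start, score) of the best window."""
    width = len(pwm)
    windows = range(len(sequence) - width + 1)
    start = max(windows, key=lambda i: log_odds(sequence[i:i + width], pwm))
    return start, log_odds(sequence[start:start + width], pwm)


print(best_site("TTACGTT"))
```

A model that captures dependencies between positions replaces the per-column sum with a score that conditions each position on others, which is where a Bayesian network structure comes in.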
|
125 |
Modeling of Reliable Service Based Operations Support Systems (MORSBOSS)
Kogeda, Okuthe Paul. January 2008
The underlying theme of this thesis is identification, classification, detection and prediction of cellular network faults using state-of-the-art technologies, methods and algorithms.
|
126 |
Design Patterns for Service-Based Fault Tolerant Mechatronic Systems / Designmönster för feltoleranta servicebaserade mekatroniska system
Lundqvist, Erik. January 2011
In this Master's thesis a new framework for achieving fault tolerance in mechatronic systems is studied. The framework is called service-based fault tolerant control and has the advantage of being completely decentralized and modular, and it therefore scales very well to large system sizes. First, a method is presented for designing the signal-flow architecture of mechatronic systems of real-life size and complexity. The result is a small set of generic building blocks in the form of design patterns, a concept that has gained widespread popularity in the field of software architecture. Best practices are then established for how each of the design patterns can be extended to support fault tolerance through diagnosis and reconfiguration according to the service-based framework. These extended design patterns can be used either to aid in the construction of new and more complex mechatronic systems or as a methodology for applying service-based fault tolerant control to large existing systems. The presented methods for designing and modelling large-scale mechatronic systems have the advantages of being applicable to a large class of mechatronic systems, being easy to apply without expert knowledge, and having the potential to be automated in the future. Finally, a case study demonstrates how the new methods can be used to construct a fault tolerance architecture for a real-life automotive system currently used by Scania CV AB. As a part of this study a mathematical model for the system was also constructed and implemented. The model can be used for analysis during the development phase as well as for troubleshooting in a repair workshop.
|
127 |
Information-driven Sensor Path Planning and the Treasure Hunt Problem
Cai, Chenghui. 25 April 2008
This dissertation presents a basic information-driven sensor management problem, referred to as treasure hunt, that is relevant to mobile-sensor applications such as mine hunting, monitoring, and surveillance. The objective is to classify or infer one or more fixed targets, or treasures, located in an obstacle-populated workspace by planning the path of a robotic sensor installed on a mobile platform and the sequence of measurements it takes of the treasures distributed in the workspace. The workspace is represented by a connectivity graph, where each node represents a possible sensor deployment, and the arcs represent possible sensor movements. A methodology is developed for planning the sensing strategy of a deployed robotic sensor. The sensing strategy includes the robotic sensor's path, because it determines which targets are measurable given a bounded field of view. Existing path planning techniques are not directly applicable to robots whose primary objective is to gather sensor measurements. Thus, in this dissertation, a novel approximate cell-decomposition approach is developed in which obstacles, targets, the sensor's platform and field of view are represented as closed and bounded subsets of a Euclidean workspace. The approach constructs a connectivity graph with observation cells that is pruned and transformed into a decision tree, from which an optimal sensing strategy can be computed. It is shown that an additive incremental-entropy function can be used to efficiently compute the expected information value of the measurement sequence over time.
The methodology is applied to a robotic landmine classification problem and the board game of CLUE®. In the landmine detection application, the optimal strategy of a robotic ground-penetrating radar is computed based on prior remote measurements and environmental information. Extensive numerical experiments show that this methodology outperforms shortest-path, complete-coverage, random, and grid search strategies, and is applicable to non-overpass-capable platforms that must avoid targets as well as obstacles. The board game of CLUE® is shown to be an excellent benchmark example of the treasure hunt problem. The test results show that a player implementing the strategies developed in this dissertation outperforms players implementing Bayesian networks only, Q-learning, or constraint satisfaction, as well as human players. / Dissertation
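The idea of valuing a measurement by the entropy it is expected to remove can be sketched for a single binary target as follows. This Python sketch is an assumption-laden illustration, not the dissertation's incremental-entropy function: the sensor model (hit probabilities given target or clutter), the priors, and the function names are hypothetical, and summing per-cell gains along a path assumes the cells are independent.

```python
import math


def entropy(p):
    """Binary entropy in bits of a target being present with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def expected_entropy_reduction(prior, p_hit_given_target, p_hit_given_clutter):
    """Prior entropy minus the expected posterior entropy after one measurement."""
    p_hit = prior * p_hit_given_target + (1 - prior) * p_hit_given_clutter
    expected_posterior = 0.0
    if p_hit > 0.0:
        post_hit = prior * p_hit_given_target / p_hit           # Bayes rule, sensor fires
        expected_posterior += p_hit * entropy(post_hit)
    if p_hit < 1.0:
        post_miss = prior * (1 - p_hit_given_target) / (1 - p_hit)  # sensor stays silent
        expected_posterior += (1 - p_hit) * entropy(post_miss)
    return entropy(prior) - expected_posterior


def path_value(priors, p_hit_given_target=0.9, p_hit_given_clutter=0.2):
    """Additive value of measuring every cell on a path (independence assumed)."""
    return sum(
        expected_entropy_reduction(p, p_hit_given_target, p_hit_given_clutter)
        for p in priors
    )


# Two hypothetical candidate paths, described by the prior of each cell they observe.
print(path_value([0.5, 0.4]), path_value([0.05, 0.9, 0.1]))
```

Under these assumptions a planner would compare candidate paths by `path_value` rather than by length alone, which is the sense in which the strategy is information-driven.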
|
128 |
Symbolische Interpretation Technischer Zeichnungen / Symbolic Interpretation of Technical Drawings
Bringmann, Oliver. 19 January 2003
Scanned and vectorized technical drawings are automatically migrated into a high-level data structure using a network of models. The models describe the contents of the drawings hierarchically and declaratively. Models for individual components of the drawings can be developed pairwise independently. This makes even very complex classes of drawings, such as electrical line networks or building plans, accessible. The models are used by the new, so-called Y-algorithm: hypotheses about the interpretation of local drawing content are generated hierarchically. If conflicts arise when competing models are used, they are logged. Using this notion of conflict, consistent interpretations of a complete drawing can be defined abstractly and determined during the analysis of a concrete drawing. A probability-based quality measure evaluates each of these alternative global interpretations. Finding an interpretation that is optimal with respect to this measure is an NP-hard problem. A branch-and-bound algorithm is the appropriate solution.
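The final step, choosing the best conflict-free set of hypotheses by branch and bound, can be sketched in Python under simplifying assumptions. The sketch treats the quality of a global interpretation as the sum of hypothetical non-negative per-hypothesis scores and treats conflicts as forbidden pairs; the actual quality measure and the Y-algorithm are not reproduced here, and all names and numbers are illustrative.

```python
def best_interpretation(scores, conflicts):
    """Pick the highest-scoring set of hypotheses with no conflicting pair.

    scores: non-negative score per hypothesis (hypothetical values).
    conflicts: iterable of (i, j) index pairs that cannot both be selected.
    """
    n = len(scores)
    conflict_sets = [set() for _ in range(n)]
    for a, b in conflicts:
        conflict_sets[a].add(b)
        conflict_sets[b].add(a)

    # suffix[i] = sum of scores[i:], an optimistic bound on what can still be gained.
    suffix = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] + scores[i]

    best = {"score": 0.0, "chosen": []}

    def search(i, chosen, total):
        # Bound: even taking every remaining hypothesis cannot beat the incumbent.
        if total + suffix[i] <= best["score"]:
            return
        if i == n:
            best["score"], best["chosen"] = total, list(chosen)
            return
        # Branch 1: accept hypothesis i if it conflicts with nothing chosen so far.
        if not conflict_sets[i].intersection(chosen):
            chosen.append(i)
            search(i + 1, chosen, total + scores[i])
            chosen.pop()
        # Branch 2: reject hypothesis i.
        search(i + 1, chosen, total)

    search(0, [], 0.0)
    return best["chosen"], best["score"]


# Four hypothetical local hypotheses; 1 conflicts with both 0 and 2.
print(best_interpretation([3.0, 2.0, 2.5, 1.0], [(0, 1), (1, 2)]))
```

The bound prunes whole subtrees once a good interpretation has been found, which keeps the search manageable in practice even though the worst case remains exponential.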
|
130 |
Towards an Integral Approach for Modeling Causality
Meganck, Stijn. 24 September 2008
From classical observational data, it is rarely possible to arrive at a Bayesian network structure that is completely causal. The theoretical question we address is the learning of causal Bayesian networks, with or without latent variables. We first focused on discovering causal relationships when all variables are known (i.e., there are no latent variables), proposing a learning algorithm that uses both observational and experimental data. We then turned to the same problem when not all variables are known. In that case, both the causal relationships between variables and the possible presence of latent variables in the Bayesian network structure must be discovered. To this end, we attempt to unify two formalisms, semi-Markovian causal models (SMCM) and maximal ancestral graphs (MAG), previously used separately, one for causal inference (SMCM) and the other for causal discovery (MAG). We also studied the adaptation of causal Bayesian networks to multi-agent systems and the learning of these multi-agent causal models (MACM).
|