Global ETD Search

51	Indexing and Search Algorithmsfor Web shops : / Indexering och sök algoritmer för webshoppar : Reimers, Axel, Gustafsson, Isak January 2016 (has links) Web shops today needs to be more and more responsive, where one part of this responsivenessis fast product searches. One way of getting faster searches are by searching against anindex instead of directly against a database. Network Expertise Sweden AB (Net Exp) wants to explore different methods of implementingan index in their future web shop, building upon the open-source web shop platformSmartStore.NET. Since SmartStore.NET does all of its searches directly against itsdatabase, it will not scale well and will wear more on the database. The aim was thereforeto find different solutions to offload the database by using an index instead. A prototype that retrieved products from a database and made them searchable through anindex was developed, evaluated and implemented. The prototype indexed the data with aninverted index algorithm, and was made searchable with a search algorithm that mixed typeboolean queries with normal queries. / Webbutiker idag behöver vara mer och mer responsiva, en del av denna responsivitet ärsnabb produkt sökningar. Ett sätt att skaffa snabbare sökningar är genom att söka mot ettindex istället för att söka direkt mot en databas. Network Expertise Sweden AB vill utforska olika metoder för att implementera ett index ideras framtida webbutik, byggt ovanpå SmartStore.NET som är öppen käll-kod. Då Smart-Store.NET gör alla av sina sökningar direkt mot sin databas, kommer den inte att skala braoch kommer slita mer på databasen. Målsättningen var därför att hitta olika lösningar somavlastar databasen genom att använda ett index istället. En prototyp som hämtade produkter från en databas och gjorde dom sökbara genom ettindex var utvecklad, utvärderad och implementerad. Prototypen indexerade datan med eninverterad indexerings algoritm, och gjordes sökbara med en sök algoritm som blandar booleskafrågor med normala frågor. / <p></p><p></p><p></p> Index Data Mining Knowledge Discovery in Databases Lucene.NET Inverted Index Index data utvinning kunskaps upptäckelse i databaser Lucene.NET inverterad index Computer Sciences Datavetenskap (datalogi)
52	Definition of a human-machine learning process from timed observations : application to the modelling of human behaviourfor the detection of abnormal behaviour of old people at home / Définition d'un processus d'apprentissage par l'homme et la machine à partir d'observations datées : application à la modélisation du comportement humain pour la détection des comportements anormaux de personnes âgées maintenues dans leur domicile Pomponio, Laura 26 June 2012 (has links) L'acquisition et la modélisation de connaissances ont été abordés jusqu'à présent selon deux approches principales : les êtres humains (experts) à l'aide des méthodologies de l'Ingénierie des Connaissances et le Knowledge Management, et les données à l'aide des techniques relevant de la découverte de connaissances à partir du contenu de bases de données (fouille de données). Cette thèse porte sur la conception d'un processus d'apprentissage conjoint par l'être humain et la machine combinant une approche de modélisation des connaissances de type Ingénierie des Connaissances (TOM4D, Timed Observation Modelling for Diagnosis) et une approche d'apprentissage automatique fondée sur un processus de découverte de connaissances à partir de données datées (TOM4L, Timed Observation Mining for Learning). Ces deux approches étant fondées sur la Théorie des Observations Datées, les modèles produits sont représentés dans le même formalisme ce qui permet leur comparaison et leur combinaison. Le mémoire propose également une méthode d'abstraction, inspiée des travaux de Newell sur le "Knowledge Level'' et fondée sur le paradigme d'observation datée, qui a pour but de traiter le problème de la différence de niveau d'abstraction inhérent entre le discours d'un expert et les données mesurées sur un système par un processus d'abstractions successives. Les travaux présentés dans ce mémoire ayant été menés en collaboration avec le CSTB de Sophia Antipolis (Centre Scientifique et Technique du Bâtiment), ils sont appliqués à la modélisation de l'activité humaine dans le cadre de l'aide aux personnes âgées maintenues à domicile. / Knowledge acquisition has been traditionally approached from a primarily people-driven perspective, through Knowledge Engineering and Management, or from a primarily data-driven approach, through Knowledge Discovery in Databases, rather than from an integral standpoint. This thesis proposes then a human-machine learning approach that combines a Knowledge Engineering modelling approach called TOM4D (Timed Observation Modelling For Diagnosis) with a process of Knowledge Discovery in Databases based on an automatic data mining technique called TOM4L (Timed Observation Mining For Learning). The combination and comparison between models obtained through TOM4D and those ones obtained through TOM4L is possible, owing to that TOM4D and TOM4L are based on the Theory of Timed Observations and share the same representation formalism. Consequently, a learning process nourished with experts' knowledge and knowledge discovered in data is defined in the present work. In addition, this dissertation puts forward a theoretical framework of abstraction levels, in line with the mentioned theory and inspired by the Newell's Knowledge Level work, in order to reduce the broad gap of semantic content that exists between data, relative to an observed process, in a database and what can be inferred in a higher level; that is, in the experts' discursive level. Thus, the human-machine learning approach along with the notion of abstraction levels are then applied to the modelling of human behaviour in smart environments. In particular, the modelling of elderly people's behaviour at home in the GerHome Project of the CSTB (Centre Scientifique et Technique du Bâtiment) of Sophia Antipolis, France. Ingénierie des Connaissances Modéle de Connaissance Fouille de donnés Niveaux d'Abstraction Environnements Intelligents Activités Humaines Knowledge Engineering Knowledge Modelling Knowledge Discovery in Databases Data Mining Abstraction Levels Smart Environments Human Activity
53	Etude comportementale des mesures d'intérêt d'extraction de connaissances / Behavioral study of interestingness measures of knowledge extraction Grissa, Dhouha 02 December 2013 (has links) La recherche de règles d’association intéressantes est un domaine important et actif en fouille de données. Puisque les algorithmes utilisés en extraction de connaissances à partir de données (ECD), ont tendance à générer un nombre important de règles, il est difficile à l’utilisateur de sélectionner par lui même les connaissances réellement intéressantes. Pour répondre à ce problème, un post-filtrage automatique des règles s’avère essentiel pour réduire fortement leur nombre. D’où la proposition de nombreuses mesures d’intérêt dans la littérature, parmi lesquelles l’utilisateur est supposé choisir celle qui est la plus appropriée à ses objectifs. Comme l’intérêt dépend à la fois des préférences de l’utilisateur et des données, les mesures ont été répertoriées en deux catégories : les mesures subjectives (orientées utilisateur ) et les mesures objectives (orientées données). Nous nous focalisons sur l’étude des mesures objectives. Néanmoins, il existe une pléthore de mesures objectives dans la littérature, ce qui ne facilite pas le ou les choix de l’utilisateur. Ainsi, notre objectif est d’aider l’utilisateur, dans sa problématique de sélection de mesures objectives, par une approche par catégorisation. La thèse développe deux approches pour assister l’utilisateur dans sa problématique de choix de mesures objectives : (1) étude formelle suite à la définition d’un ensemble de propriétés de mesures qui conduisent à une bonne évaluation de celles-ci ; (2) étude expérimentale du comportement des différentes mesures d’intérêt à partir du point de vue d’analyse de données. Pour ce qui concerne la première approche, nous réalisons une étude théorique approfondie d’un grand nombre de mesures selon plusieurs propriétés formelles. Pour ce faire, nous proposons tout d’abord une formalisation de ces propriétés afin de lever toute ambiguïté sur celles-ci. Ensuite, nous étudions, pour différentes mesures d’intérêt objectives, la présence ou l’absence de propriétés caractéristiques appropriées. L’évaluation des mesures est alors un point de départ pour une catégorisation de celle-ci. Différentes méthodes de classification ont été appliquées : (i) méthodes sans recouvrement (CAH et k-moyennes) qui permettent l’obtention de groupes de mesures disjoints, (ii) méthode avec recouvrement (analyse factorielle booléenne) qui permet d’obtenir des groupes de mesures qui se chevauchent. Pour ce qui concerne la seconde approche, nous proposons une étude empirique du comportement d’une soixantaine de mesures sur des jeux de données de nature différente. Ainsi, nous proposons une méthodologie expérimentale, où nous cherchons à identifier les groupes de mesures qui possèdent, empiriquement, un comportement semblable. Nous effectuons par la suite une confrontation avec les deux résultats de classification, formel et empirique dans le but de valider et mettre en valeur notre première approche. Les deux approches sont complémentaires, dans l’optique d’aider l’utilisateur à effectuer le bon choix de la mesure d’intérêt adaptée à son application. / The search for interesting association rules is an important and active field in data mining. Since knowledge discovery from databases used algorithms (KDD) tend to generate a large number of rules, it is difficult for the user to select by himself the really interesting knowledge. To address this problem, an automatic post-filtering rules is essential to significantly reduce their number. Hence, many interestingness measures have been proposed in the literature in order to filter and/or sort discovered rules. As interestingness depends on both user preferences and data, interestingness measures were classified into two categories : subjective measures (user-driven) and objective measures (data-driven). We focus on the study of objective measures. Nevertheless, there are a plethora of objective measures in the literature, which increase the user’s difficulty for choosing the appropriate measure. Thus, our goal is to avoid such difficulty by proposing groups of similar measures by means of categorization approaches. The thesis presents two approaches to assist the user in his problematic of objective measures choice : (1) formal study as per the definition of a set of measures properties that lead to a good measure evaluation ; (2) experimental study of the behavior of various interestingness measures from data analysispoint of view. Regarding the first approach, we perform a thorough theoretical study of a large number of measures in several formal properties. To do this, we offer first of all a formalization of these properties in order to remove any ambiguity about them. We then study for various objective interestingness measures, the presence or absence of appropriate characteristic properties. Interestingness measures evaluation is therefore a starting point for measures categorization. Different clustering methods have been applied : (i) non overlapping methods (CAH and k-means) which allow to obtain disjoint groups of measures, (ii) overlapping method (Boolean factor analysis) that provides overlapping groups of measures. Regarding the second approach, we propose an empirical study of the behavior of about sixty measures on datasets with different nature. Thus, we propose an experimental methodology, from which we seek to identify groups of measures that have empirically similar behavior. We do next confrontation with the two classification results, formal and empirical in order to validate and enhance our first approach. Both approaches are complementary, in order to help the user making the right choice of the appropriate interestingness measure to his application. Mesures d’intérêt Propriétés formelles Règles d’association Classification non supervisée Analyse factorielle booléenne Knowledge Discovery from Databases (KDD) Interestingness measures Formal properties Association rules Clustering Boolean factor analysis
54	An analysis of semantic data quality defiencies in a national data warehouse: a data mining approach Barth, Kirstin 07 1900 (has links) This research determines whether data quality mining can be used to describe, monitor and evaluate the scope and impact of semantic data quality problems in the learner enrolment data on the National Learners’ Records Database. Previous data quality mining work has focused on anomaly detection and has assumed that the data quality aspect being measured exists as a data value in the data set being mined. The method for this research is quantitative in that the data mining techniques and model that are best suited for semantic data quality deficiencies are identified and then applied to the data. The research determines that unsupervised data mining techniques that allow for weighted analysis of the data would be most suitable for the data mining of semantic data deficiencies. Further, the academic Knowledge Discovery in Databases model needs to be amended when applied to data mining semantic data quality deficiencies. / School of Computing / M. Tech. (Information Technology) Data warehouse Data mining Data quality mining Exploratory data mining Cluster analysis Association rule Knowledge discovery in databases National Learners’ Records Database Learner enrolment data Semantic data quality deficiencies 005.745 Data warehousing Data mining Cluster analysis Association rule mining
55	[en] INTELLIGENT ASSISTANCE FOR KDD-PROCESS ORIENTATION / [pt] ASSISTÊNCIA INTELIGENTE À ORIENTAÇÃO DO PROCESSO DE DESCOBERTA DE CONHECIMENTO EM BASES DE DADOS RONALDO RIBEIRO GOLDSCHMIDT 15 December 2003 (has links) [pt] A notória complexidade inerente ao processo de KDD - Descoberta de Conhecimento em Bases de Dados - decorre essencialmente de aspectos relacionados ao controle e à condução deste processo (Fayyad et al., 1996b; Hellerstein et al., 1999). De uma maneira geral, estes aspectos envolvem dificuldades em perceber inúmeros fatos cuja origem e os níveis de detalhe são os mais diversos e difusos, em interpretar adequadamente estes fatos, em conjugar dinamicamente tais interpretações e em decidir que ações devem ser realizadas de forma a procurar obter bons resultados. Como identificar precisamente os objetivos do processo, como escolher dentre os inúmeros algoritmos de mineração e de pré-processamento de dados existentes e, sobretudo, como utilizar adequadamente os algoritmos escolhidos em cada situação são alguns exemplos das complexas e recorrentes questões na condução de processos de KDD. Cabe ao analista humano a árdua tarefa de orientar a execução de processos de KDD. Para tanto, diante de cada cenário, o homem utiliza sua experiência anterior, seus conhecimentos e sua intuição para interpretar e combinar os fatos de forma a decidir qual a estratégia a ser adotada (Fayyad et al., 1996a, b; Wirth et al., 1998). Embora reconhecidamente úteis e desejáveis, são poucas as alternativas computacionais existentes voltadas a auxiliar o homem na condução do processo de KDD (Engels, 1996; Amant e Cohen, 1997; Livingston, 2001; Bernstein et al., 2002; Brazdil et al., 2003). Aliado ao exposto acima, a demanda por aplicações de KDD em diversas áreas vem crescendo de forma muito acentuada nos últimos anos (Buchanan, 2000). É muito comum não existirem profissionais com experiência em KDD disponíveis para atender a esta crescente demanda (Piatetsky-Shapiro, 1999). Neste contexto, a criação de ferramentas inteligentes que auxiliem o homem no controle do processo de KDD se mostra ainda mais oportuna (Brachman e Anand, 1996; Mitchell, 1997). Assim sendo, esta tese teve como objetivos pesquisar, propor, desenvolver e avaliar uma Máquina de Assistência Inteligente à Orientação do Processo de KDD que possa ser utilizada, fundamentalmente, como instrumento didático voltado à formação de profissionais especializados na área da Descoberta de Conhecimento em Bases de Dados. A máquina proposta foi formalizada com base na Teoria do Planejamento para Resolução de Problemas (Russell e Norvig, 1995) da Inteligência Artificial e implementada a partir da integração de funções de assistência utilizadas em diferentes níveis de controle do processo de KDD: Definição de Objetivos, Planejamento de Ações de KDD, Execução dos Planos de Ações de KDD e Aquisição e Formalização do Conhecimento. A Assistência à Definição de Objetivos tem como meta auxiliar o homem na identificação de tarefas de KDD cuja execução seja potencialmente viável em aplicações de KDD. Esta assistência foi inspirada na percepção de um certo tipo de semelhança no nível intensional apresentado entre determinados bancos de dados. Tal percepção auxilia na prospecção do tipo de conhecimento a ser procurado, uma vez que conjuntos de dados com estruturas similares tendem a despertar interesses similares mesmo em aplicações de KDD distintas. Conceitos da Teoria da Equivalência entre Atributos de Bancos de Dados (Larson et al., 1989) viabilizam a utilização de uma estrutura comum na qual qualquer base de dados pode ser representada. Desta forma, bases de dados, ao serem representadas na nova estrutura, podem ser mapeadas em tarefas de KDD, compatíveis com tal estrutura. Conceitos de Espaços Topológicos (Lipschutz, 1979) e recursos de Redes Neurais Artificiais (Haykin, 1999) são utilizados para viabilizar os mapeamentos entre padrões heterogêneos. Uma vez definidos os objetivos em uma aplicação de KDD, decisões sobre como tais objetivos podem ser alcançados se tornam necessárias. O primeiro passo envolve a escolha de qual algoritmo de mineração de dados é o mais apropriado para o problema em questão. A Assistência ao Planejamento de Ações de KDD auxilia o homem nesta escolha. Utiliza, para tanto, uma metodologia de ordenação dos algoritmos de mineração baseada no desempenho prévio destes algoritmos em problemas similares (Soares et al., 2001; Brazdil et al., 2003). Critérios de ordenação de algoritmos baseados em similaridade entre bases de dados nos níveis intensional e extensional foram propostos, descritos e avaliados. A partir da escolha de um ou mais algoritmos de mineração de dados, o passo seguinte requer a escolha de como deverá ser realizado o pré-processamento dos dados. Devido à diversidade de algoritmos de pré-processamento, são muitas as alternativas de combinação entre eles (Bernstein et al., 2002). A Assistência ao Planejamento de Ações de KDD também auxilia o homem na formulação e na escolha do plano ou dos planos de ações de KDD a serem adotados. Utiliza, para tanto, conceitos da Teoria do Planejamento para Resolução de Problemas. Uma vez escolhido um plano de ações de KDD, surge a necessidade de executá-lo. A execução de um plano de ações de KDD compreende a execução, de forma ordenada, dos algoritmos de KDD previstos no plano. A execução de um algoritmo de KDD requer conhecimento sobre ele. A Assistência à Execução dos Planos de Ações de KDD provê orientações específicas sobre algoritmos de KDD. Adicionalmente, esta assistência dispõe de mecanismos que auxiliam, de forma especializada, no processo de execução de algoritmos de KDD e na análise dos resultados obtidos. Alguns destes mecanismos foram descritos e avaliados. A execução da Assistência à Aquisição e Formalização do Conhecimento constitui-se em um requisito operacional ao funcionamento da máquina proposta. Tal assistência tem por objetivo adquirir e disponibilizar os conhecimentos sobre KDD em uma representação e uma organização que viabilizem o processamento das funções de assistência mencionadas anteriormente. Diversos recursos e técnicas de aquisição de conhecimento foram utilizados na concepção desta assistência. / [en] Generally speaking, such aspects involve difficulties in perceiving innumerable facts whose origin and levels of detail are highly diverse and diffused, in adequately interpreting these facts, in dynamically conjugating such interpretations, and in deciding which actions must be performed in order to obtain good results. How are the objectives of the process to be identified in a precise manner? How is one among the countless existing data mining and preprocessing algorithms to be selected? And most importantly, how can the selected algorithms be put to suitable use in each different situation? These are but a few examples of the complex and recurrent questions that are posed when KDD processes are performed. Human analysts must cope with the arduous task of orienting the execution of KDD processes. To this end, in face of each different scenario, humans resort to their previous experiences, their knowledge, and their intuition in order to interpret and combine the facts and therefore be able to decide on the strategy to be adopted (Fayyad et al., 1996a, b; Wirth et al., 1998). Although the existing computational alternatives have proved to be useful and desirable, few of them are designed to help humans to perform KDD processes (Engels, 1996; Amant and Cohen, 1997; Livingston, 2001; Bernstein et al., 2002; Brazdil et al., 2003). In association with the above-mentioned fact, the demand for KDD applications in several different areas has increased dramatically in the past few years (Buchanan, 2000). Quite commonly, the number of available practitioners with experience in KDD is not sufficient to satisfy this growing demand (Piatetsky-Shapiro, 1999). Within such a context, the creation of intelligent tools that aim to assist humans in controlling KDD processes proves to be even more opportune (Brachman and Anand, 1996; Mitchell, 1997). Such being the case, the objectives of this thesis were to investigate, propose, develop, and evaluate an Intelligent Machine for KDD-Process Orientation that is basically intended to serve as a teaching tool to be used in professional specialization courses in the area of Knowledge Discovery in Databases. The basis for formalization of the proposed machine was the Planning Theory for Problem-Solving (Russell and Norvig, 1995) in Artificial Intelligence. Its implementation was based on the integration of assistance functions that are used at different KDD process control levels: Goal Definition, KDD Action-Planning, KDD Action Plan Execution, and Knowledge Acquisition and Formalization. The Goal Definition Assistant aims to assist humans in identifying KDD tasks that are potentially executable in KDD applications. This assistant was inspired by the detection of a certain type of similarity between the intensional levels presented by certain databases. The observation of this fact helps humans to mine the type of knowledge that must be discovered since data sets with similar structures tend to arouse similar interests even in distinct KDD applications. Concepts from the Theory of Attribute Equivalence in Databases (Larson et al., 1989) make it possible to use a common structure in which any database may be represented. In this manner, when databases are represented in the new structure, it is possible to map them into KDD tasks that are compatible with such a structure. Topological space concepts and ANN resources as described in Topological Spaces (Lipschutz, 1979) and Artificial Neural Nets (Haykin, 1999) have been employed so as to allow mapping between heterogeneous patterns. After the goals have been defined in a KDD application, it is necessary to decide how such goals are to be achieved. The first step involves selecting the most appropriate data mining algorithm for the problem at hand. The KDD Action-Planning Assistant helps humans to make this choice. To this end, it makes use of a methodology for ordering the mining algorithms that is based on the previous experiences, their knowledge, and their intuition in order to interpret and combine the facts and therefore be able to decide on the strategy to be adopted (Fayyad et al., 1996a, b; Wirth et al., 1998). Although the existing computational alternatives have proved to be useful and desirable, few of them are designed to help humans to perform KDD processes (Engels, 1996; Amant & Cohen, 1997; Livingston, 2001; Bernstein et al., 2002; Brazdil et al., 2003). In association with the above-mentioned fact, the demand for KDD applications in several different areas has increased dramatically in the past few years (Buchanan, 2000). Quite commonly, the number of available practitioners with experience in KDD is not sufficient to satisfy this growing demand (Piatetsky-Shapiro, 1999). Within such a context, the creation of intelligent tools that aim to assist humans in controlling KDD processes proves to be even more opportune (Brachman & Anand, 1996; Mitchell, 1997). Such being the case, the objectives of this thesis were to investigate, propose, develop, and evaluate an Intelligent Machine for KDD-Process Orientation that is basically intended to serve as a teaching tool to be used in professional specialization courses in the area of Knowledge Discovery in Databases. The basis for formalization of the proposed machine was the Planning Theory for Problem-Solving (Russell and Norvig, 1995) in Artificial Intelligence. Its implementation was based on the integration of assistance functions that are used at different KDD process control levels: Goal Definition, KDD Action- Planning, KDD Action Plan Execution, and Knowledge Acquisition and Formalization. The Goal Definition Assistant aims to assist humans in identifying KDD tasks that are potentially executable in KDD applications. This assistant was inspired by the detection of a certain type of similarity between the intensional levels presented by certain databases. The observation of this fact helps humans to mine the type of knowledge that must be discovered since data sets with similar structures tend to arouse similar interests even in distinct KDD applications. Concepts from the Theory of Attribute Equivalence in Databases (Larson et al., 1989) make it possible to use a common structure in which any database may be represented. In this manner, when databases are represented in the new structure, it is possible to map them into KDD tasks that are compatible with such a structure. Topological space concepts and ANN resources as described in Topological Spaces (Lipschutz, 1979) and Artificial Neural Nets (Haykin, 1999) have been employed so as to allow mapping between heterogeneous patterns. After the goals have been defined in a KDD application, it is necessary to decide how such goals are to be achieved. The first step involves selecting the most appropriate data mining algorithm for the problem at hand. The KDD Action-Planning Assistant helps humans to make this choice. To this end, it makes use of a methodology for ordering the mining algorithms that is based on the previous performance of these algorithms in similar problems (Soares et al., 2001; Brazdil et al., 2003). Algorithm ordering criteria based on database similarity at the intensional and extensional levels were proposed, described and evaluated. The data mining algorithm or algorithms having been selected, the next step involves selecting the way in which data preprocessing is to be performed. Since there is a large variety of preprocessing algorithms, many are the alternatives for combining them (Bernstein et al., 2002). The KDD Action-Planning Assistant also helps humans to formulate and to select the KDD action plan or plans to be adopted. To this end, it makes use of concepts contained in the Planning Theory for Problem-Solving. Once a KDD action plan has been chosen, it is necessary to execute it. Executing a KDD action plan involves the ordered execution of the KDD algorithms that have been anticipated in the plan. Executing a KDD algorithm requires knowledge about it. The KDD Action Plan Execution Assistant provides specific guidance on KDD algorithms. In addition, this assistant is equipped with mechanisms that provide specialized assistance for performing the KDD algorithm execution process and for analyzing the results obtained. Some of these mechanisms have been described and evaluated. The execution of the Knowledge Acquisition and Formalization Assistant is an operational requirement for running the proposed machine. The objective of this assistant is to acquire knowledge about KDD and to make such knowledge available by representing and organizing it a way that makes it possible to process the above-mentioned assistance functions. A variety of knowledge acquisition resources and techniques were employed in the conception of this assistant. [pt] MINERACAO DE DADOS [en] DATA MINING [en] KNOWLEDGE DISCOVERY IN DATABASES [en] KDD TASK DEFINITION ASSISTANCE [pt] PLANEJAMENTO EM KDD [en] PLANNING IN KDD
56	Descoberta de regras de conhecimento utilizando computação evolutiva multiobjetivo / Discoveing knowledge rules with multiobjective evolutionary computing Giusti, Rafael 22 June 2010 (has links) Na área de inteligência artificial existem algoritmos de aprendizado, notavelmente aqueles pertencentes à área de aprendizado de máquina AM , capazes de automatizar a extração do conhecimento implícito de um conjunto de dados. Dentre estes, os algoritmos de AM simbólico são aqueles que extraem um modelo de conhecimento inteligível, isto é, que pode ser facilmente interpretado pelo usuário. A utilização de AM simbólico é comum no contexto de classificação, no qual o modelo de conhecimento extraído é tal que descreve uma correlação entre um conjunto de atributos denominados premissas e um atributo particular denominado classe. Uma característica dos algoritmos de classificação é que, em geral, estes são utilizados visando principalmente a maximização das medidas de cobertura e precisão, focando a construção de um classificador genérico e preciso. Embora essa seja uma boa abordagem para automatizar processos de tomada de decisão, pode deixar a desejar quando o usuário tem o desejo de extrair um modelo de conhecimento que possa ser estudado e que possa ser útil para uma melhor compreensão do domínio. Tendo-se em vista esse cenário, o principal objetivo deste trabalho é pesquisar métodos de computação evolutiva multiobjetivo para a construção de regras de conhecimento individuais com base em critérios definidos pelo usuário. Para isso utiliza-se a biblioteca de classes e ambiente de construção de regras de conhecimento ECLE, cujo desenvolvimento remete a projetos anteriores. Outro objetivo deste trabalho consiste comparar os métodos de computação evolutiva pesquisados com métodos baseado em composição de rankings previamente existentes na ECLE. É mostrado que os métodos de computação evolutiva multiobjetivo apresentam melhores resultados que os métodos baseados em composição de rankings, tanto em termos de dominância e proximidade das soluções construídas com aquelas da fronteira Pareto-ótima quanto em termos de diversidade na fronteira de Pareto. Em otimização multiobjetivo, ambos os critérios são importantes, uma vez que o propósito da otimização multiobjetivo é fornecer não apenas uma, mas uma gama de soluções eficientes para o problema, das quais o usuário pode escolher uma ou mais soluções que apresentem os melhores compromissos entre os objetivos / Machine Learning algorithms are notable examples of Artificial Intelligence algorithms capable of automating the extraction of implicit knowledge from datasets. In particular, Symbolic Learning algorithms are those which yield an intelligible knowledge model, i.e., one which a user may easily read. The usage of Symbolic Learning is particularly common within the context of classification, which involves the extraction of knowledge such that the associated model describes correelation among a set of attributes named the premises and one specific attribute named the class. Classification algorithms usually target into creating knowledge models which maximize the measures of coverage and precision, leading to classifiers that tend to be generic and precise. Althought this constitutes a good approach to creating models that automate the decision making process, it may not yield equally good results when the user wishes to extract a knowledge model which could assist them into getting a better understanding of the domain. Having that in mind, it has been established as the main goal of this Masters thesis the research of multi-objective evolutionary computing methods to create individual knowledge rules maximizing sets of arbitrary user-defined criteria. This is achieved by employing the class library and knowledge rule construction environment ECLE, which had been developed during previous research work. A second goal of this Masters thesis is the comparison of the researched evolutionary computing methods against previously existing ranking composition methods in ECLE. It is shown in this Masters thesis that the employment of multi-objective evolutionary computing methods produces better results than those produced by the employment of ranking composition-based methods. This improvement is verified both in terms of solution dominance and proximity of the solution set to the Pareto-optimal front and in terms of Pareto-front diversity. Both criteria are important for evaluating the efficiency of multi-objective optimization algorithms, for the goal of multi-objective optimization is to provide a broad range of efficient solutions, so the user may pick one or more solutions which present the best trade-off among all objectives Algoritmos evolutivos Aprendizado de máquina Computação evolutiva Computação evolutiva multiobjetivo Construção de regras de conhecimento Evolutionary algorithm Evolutionary computing Extração de conhecimento Knowledge discovery in databases Knowledge rule discovery Machine learning Multiobjective evolutionary computing Multiobjective optimization Otimização multiobjetivo
57	Zpracování asociačních pravidel metodou vícekriteriálního shlukování / Post-processing of association rules by multicriterial clustering method Kejkula, Martin January 2002 (has links) Association rules mining is one of several ways of knowledge discovery in databases. Paradoxically, data mining itself can produce such great amounts of association rules that there is a new knowledge management problem: there can easily be thousands or even more association rules holding in a data set. The goal of this work is to design a new method for association rules post-processing. The method should be software and domain independent. The output of the new method should be structured description of the whole set of discovered association rules. The output should help user to work with discovered rules. The path to reach the goal I used is: to split association rules into clusters. Each cluster should contain rules, which are more similar each other than to rules from another cluster. The output of the method is such cluster definition and description. The main contribution of this Ph.D. thesis is the described new Multicriterial clustering association rules method. Secondary contribution is the discussion of already published association rules post-processing methods. The output of the introduced new method are clusters of rules, which cannot be reached by any of former post-processing methods. According user expectations clusters are more relevant and more effective than any former association rules clustering results. The method is based on two orthogonal clustering of the same set of association rules. One clustering is based on interestingness measures (confidence, support, interest, etc.). Second clustering is inspired by document clustering in information retrieval. The representation of rules in vectors like documents is fontal in this thesis. The thesis is organized as follows. Chapter 2 identify the role of association rules in the KDD (knowledge discovery in databases) process, using KDD methodologies (CRISP-DM, SEMMA, GUHA, RAMSYS). Chapter 3 define association rule and introduce characteristics of association rules (including interestingness measuress). Chapter 4 introduce current association rules post-processing methods. Chapter 5 is the introduction to cluster analysis. Chapter 6 is the description of the new Multicriterial clustering association rules method. Chapter 7 consists of several experiments. Chapter 8 discuss possibilities of usage and development of the new method.
58	[en] HYBRID INTELLIGENT SYSTEM FOR CLASSIFICATION OF NON-RESIDENTIAL ELECTRICITY CUSTOMERS PAYMENT PROFILES / [pt] SISTEMA INTELIGENTE HÍBRIDO PARA CLASSIFICAÇÃO DO PERFIL DE PAGAMENTO DOS CONSUMIDORES NÃO-RESIDENCIAIS DE ENERGIA ELÉTRICA NORMA ALICE DA SILVA CARVALHO 26 March 2018 (has links) [pt] O objetivo desta pesquisa é classificar o perfil de pagamento dos consumidores não-residenciais de energia elétrica, considerando conhecimento armazenado em base de dados de distribuidoras de energia elétrica. A motivação para desenvolvê-la surgiu da necessidade das distribuidoras por um modelo de suporte a formulação de estratégias capazes de reduzir o grau inadimplência. A metodologia proposta consiste em um sistema inteligente híbrido composto por módulos intercomunicativos que usam conhecimentos armazenados em base de dados para segmentar consumidores e, então, atingir o objetivo proposto. O sistema inicia-se com o módulo neural, que aloca as unidades consumidoras em grupos conforme similaridades (valor fatura, consumo, demanda medida/demanda contratada, intensidade energética e peso da conta no orçamento), em sequência, o módulo bayesiano, estabelece um escore entre 0 e 1 que permite predizer o perfil de pagamento das unidades considerando os grupos gerados e os atributos categóricos (atividade econômica, estrutura tarifária, mesorregião, natureza jurídica e porte empresarial) que caracterizam essas unidades. Os resultados revelaram que o sistema proposto estabelece razoável taxa de acerto na classificação do perfil de consumidores e, portanto, constitui uma importante ferramenta de suporte a formulação de estratégias para combate à inadimplência. Conclui-se que, o sistema híbrido proposto apresenta caráter generalista podendo ser adaptado e implementado em outros mercados. / [en] The objective of this research is to classify the non-residential electricity customer payment profiles regarding the knowledge stored in electricity distribution utilities databases. The motivation for development of the work from the need of electricity distribution by a support model to formulate strategies for tackling non-payment and late payment. The proposed methodology consists of a hybrid intelligent system constituted by intercommunicating modules that use knowledge stored in database to customer segmentation and then achieve the proposed objective. The system begins with the neural module, which allocates the consuming units in groups according to similarities (bill amount, consumption, measured demand/contracted demand, energy intensity and share of the electricity bill in the customer s income), in sequence, the Bayesian module establishes a score between 0 and 1 that allows to predict what payment profile of the units considering the generated groups and categorical attributes (business activity, tariff type, business size, mesoregion and company s legal form) that characterize these units. The results showed that the proposed system provides a reasonable success rate when classifying customer profiles and thus constitutes an important tool in the formulation of strategies for tackling non-payment and late payment. In conclusion, the hybrid system proposed here is a generalist one and could usefully be adapted and implemented in other markets. [pt] MAPAS AUTO-ORGANIZAVEIS DE KOHONEN [en] KOHONEN SELF-ORGANIZED MAP [pt] INADIMPLENCIA [en] NON-PAYMENT LATE PAYMENT [en] ELECTRIC POWER DISTRIBUTION SYSTEMS [en] KNOWLEDGE DISCOVERY IN DATABASES [pt] CLASSIFICADOR BAYESIANO SIMPLES [en] NAIVE BAYES
59	Modelação e análise da vida útil (metrológica) de medidores tipo indução de energia elétrica ativa Silva, Marcelo Rubia da [UNESP] 27 August 2010 (has links) (PDF) Made available in DSpace on 2014-06-11T19:22:31Z (GMT). No. of bitstreams: 0 Previous issue date: 2010-08-27Bitstream added on 2014-06-13T18:49:27Z : No. of bitstreams: 1 silva_mr_me_ilha.pdf: 2058535 bytes, checksum: 046bcb6196cc4909e675190cc0e21275 (MD5) / Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) / O estudo da confiabilidade operacional de equipamentos se tornou fundamental para as empresas possuírem o devido controle dos seus ativos, tanto pelo lado financeiro quanto em questões de segurança. O estudo da taxa de falha de equipamentos prevê quando as falhas irão ocorrer possibilitando estabelecer atitudes preventivas, porém, seu estudo deve ser realizado em condições de operação estabelecidas e fixas. Os medidores de energia elétrica, parte do ativo financeiro das concessionárias de energia, são equipamentos utilizados em diversas condições de operação, tanto nas condições do fluxo de energia, tais como presenças de harmônicos, subtensões, sobre-tensões e padrões de consumo distintos, quanto pelo local físico de instalação, tais como maresia, temperatura, umidade, etc. As falhas nos medidores eletromecânicos de energia elétrica são de difícil constatação uma vez que a maioria dos erros de medição, ocasionados principalmente por envelhecimento de componentes, não alteram a qualidade da energia fornecida e nem interrompem o seu fornecimento. Neste sentido, este trabalho propõe uma nova metodologia de determinação de falhas em medidores eletromecânicos de energia elétrica ativa. Faz-se uso de banco de dados de uma concessionária de energia elétrica e do processo de descoberta de conhecimento em bases de dados para selecionar as variáveis mais significativas na determinação de falhas em medidores eletromecânicos de energia elétrica ativa, incluindo no conjunto de falhas a operação com erros de medição acima do permitido pela legislação nacional (2010). Duas técnicas de mineração de dados foram utilizadas: regressão stepwise e árvores de decisão. As variáveis obtidas foram utilizadas na construção de um modelo de agrupamento de equipamentos associando a cada grupo uma probabilidade... / The operational reliability study of equipments has become primal in order to enterprises have the righteous control over their assets, both by financial side as by security reasons. The study for the hazard rate of equipments allows to foresee the failures for the equipments and to act preventively, but this study must be accomplished under established and fixed operation conditions. The energy meters, for their part, are equipments utilized in several operating conditions so on the utilization manner, like presence of harmonics, undervoltages and over-voltages and distinct consumption patterns, as on the installation location, like swel, temperature, humidity, etc. Failures in electromechanical Wh-meters are difficult to detect once that the majority of metering errors occurred mainly by aging of components do not change the quality of offered energy neither disrupt its supply. In this context, this work proposes a novel methodology to obtain failure determination for electromechanical Whmeters. It utilizes Wh-databases from an electrical company and of the process of knowledge discovery in databases to specify the most significant variables in determining failures in electromechanical Wh-meters, including in the failure set the operation with metering errors above those permitted by national regulations (2010). Two techniques of data mining were used in this work: stepwise regression and decision trees. The obtained variables were utilized on the construction of a model of clustering similar equipments and the probability of failure of those clusters were determined. As final results, an application in a friendly platform were developed in order to apply the methodology, and a case study was accomplished in order to demonstrate its feasibility. Inteligencia artificial Bases de dados Arvores de decisão Probabilidade de falha Regressão stepwise Active electromechanical energy meters Probability of failure Artificial intelligence Knowledge discovery in databases Stepwise regression Decision trees
60	Descoberta de regras de conhecimento utilizando computação evolutiva multiobjetivo / Discoveing knowledge rules with multiobjective evolutionary computing Rafael Giusti 22 June 2010 (has links) Na área de inteligência artificial existem algoritmos de aprendizado, notavelmente aqueles pertencentes à área de aprendizado de máquina AM , capazes de automatizar a extração do conhecimento implícito de um conjunto de dados. Dentre estes, os algoritmos de AM simbólico são aqueles que extraem um modelo de conhecimento inteligível, isto é, que pode ser facilmente interpretado pelo usuário. A utilização de AM simbólico é comum no contexto de classificação, no qual o modelo de conhecimento extraído é tal que descreve uma correlação entre um conjunto de atributos denominados premissas e um atributo particular denominado classe. Uma característica dos algoritmos de classificação é que, em geral, estes são utilizados visando principalmente a maximização das medidas de cobertura e precisão, focando a construção de um classificador genérico e preciso. Embora essa seja uma boa abordagem para automatizar processos de tomada de decisão, pode deixar a desejar quando o usuário tem o desejo de extrair um modelo de conhecimento que possa ser estudado e que possa ser útil para uma melhor compreensão do domínio. Tendo-se em vista esse cenário, o principal objetivo deste trabalho é pesquisar métodos de computação evolutiva multiobjetivo para a construção de regras de conhecimento individuais com base em critérios definidos pelo usuário. Para isso utiliza-se a biblioteca de classes e ambiente de construção de regras de conhecimento ECLE, cujo desenvolvimento remete a projetos anteriores. Outro objetivo deste trabalho consiste comparar os métodos de computação evolutiva pesquisados com métodos baseado em composição de rankings previamente existentes na ECLE. É mostrado que os métodos de computação evolutiva multiobjetivo apresentam melhores resultados que os métodos baseados em composição de rankings, tanto em termos de dominância e proximidade das soluções construídas com aquelas da fronteira Pareto-ótima quanto em termos de diversidade na fronteira de Pareto. Em otimização multiobjetivo, ambos os critérios são importantes, uma vez que o propósito da otimização multiobjetivo é fornecer não apenas uma, mas uma gama de soluções eficientes para o problema, das quais o usuário pode escolher uma ou mais soluções que apresentem os melhores compromissos entre os objetivos / Machine Learning algorithms are notable examples of Artificial Intelligence algorithms capable of automating the extraction of implicit knowledge from datasets. In particular, Symbolic Learning algorithms are those which yield an intelligible knowledge model, i.e., one which a user may easily read. The usage of Symbolic Learning is particularly common within the context of classification, which involves the extraction of knowledge such that the associated model describes correelation among a set of attributes named the premises and one specific attribute named the class. Classification algorithms usually target into creating knowledge models which maximize the measures of coverage and precision, leading to classifiers that tend to be generic and precise. Althought this constitutes a good approach to creating models that automate the decision making process, it may not yield equally good results when the user wishes to extract a knowledge model which could assist them into getting a better understanding of the domain. Having that in mind, it has been established as the main goal of this Masters thesis the research of multi-objective evolutionary computing methods to create individual knowledge rules maximizing sets of arbitrary user-defined criteria. This is achieved by employing the class library and knowledge rule construction environment ECLE, which had been developed during previous research work. A second goal of this Masters thesis is the comparison of the researched evolutionary computing methods against previously existing ranking composition methods in ECLE. It is shown in this Masters thesis that the employment of multi-objective evolutionary computing methods produces better results than those produced by the employment of ranking composition-based methods. This improvement is verified both in terms of solution dominance and proximity of the solution set to the Pareto-optimal front and in terms of Pareto-front diversity. Both criteria are important for evaluating the efficiency of multi-objective optimization algorithms, for the goal of multi-objective optimization is to provide a broad range of efficient solutions, so the user may pick one or more solutions which present the best trade-off among all objectives Algoritmos evolutivos Aprendizado de máquina Computação evolutiva Computação evolutiva multiobjetivo Construção de regras de conhecimento Extração de conhecimento Otimização multiobjetivo Evolutionary algorithm Evolutionary computing Knowledge discovery in databases Knowledge rule discovery Machine learning Multiobjective evolutionary computing Multiobjective optimization

Search results