Global ETD Search

131	Využití data miningových metod při zpracování dat z demografických šetření / Using data mining methods for demographic survey data processing Fišer, David January 2015 The goal of the thesis was to describe and demonstrate the principles of knowledge discovery in databases, i.e. data mining (DM). The theoretical part describes selected data mining methods and the basic principles of those DM techniques. In the second part, a DM task is carried out in accordance with the CRISP-DM methodology. The practical part is divided into two parts and uses data from the American Community Survey (ACS). The first part contains a classification task whose goal was to determine whether the selected DM techniques can be used to handle missing data in surveys. The success rate of the classifications and of the subsequent prediction of values for selected attributes was in the 55-80 % range. The second part focused on finding knowledge of interest using association rules and the GUHA method. Keywords: data mining, knowledge discovery in databases, statistical surveys, missing values, classification, association rules, GUHA method, ACS
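The abstract above does not say which classifiers were used, so the following is only a minimal sketch of the general idea of treating a missing survey value as a classification target. It assumes a k-nearest-neighbour vote over categorical attributes; the function names and the toy records are hypothetical and are not taken from the thesis or from ACS data.

```python
from collections import Counter


def hamming(a, b, keys):
    """Count the attributes in `keys` on which records a and b differ."""
    return sum(a[k] != b[k] for k in keys)


def impute(records, target, k=3):
    """Fill missing `target` values by majority vote among the k most
    similar complete records (similarity = fewest differing attributes)."""
    complete = [r for r in records if r[target] is not None]
    filled = []
    for r in records:
        if r[target] is not None:
            filled.append(dict(r))
            continue
        keys = [key for key in r if key != target]
        nearest = sorted(complete, key=lambda c: hamming(r, c, keys))[:k]
        vote = Counter(c[target] for c in nearest).most_common(1)[0][0]
        filled.append({**r, target: vote})
    return filled


# Hypothetical toy records; None marks a missing answer.
records = [
    {"age": "young", "edu": "college", "employed": "yes"},
    {"age": "young", "edu": "college", "employed": "yes"},
    {"age": "old", "edu": "none", "employed": "no"},
    {"age": "old", "edu": "none", "employed": "no"},
    {"age": "young", "edu": "college", "employed": None},
]
result = impute(records, "employed", k=3)
```

In practice the success rate of such an imputation would be estimated by hiding known values and comparing the predictions against them, which is one way a figure like the 55-80 % range quoted above could be obtained.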
132	Porovnatelnost dat v dobývání znalostí z databází / Data comparability in knowledge discovery in databases Horáková, Linda January 2017 The master's thesis analyses the comparability and commensurability of data in datasets used for obtaining knowledge with data mining methods. Data comparability is one aspect of data quality and is crucial for obtaining correct and applicable results from data mining tasks. The aim of the theoretical part is to briefly describe the field of knowledge discovery and to define the specifics of mining aggregated data. The terms comparability and commensurability are also discussed. The main part focuses on the process of knowledge discovery. These findings are applied in the practical part of the thesis, whose main goal is to define a general methodology for discovering potential data comparability problems in the analysed data. The methodology is based on an analysis of a real dataset containing daily product sales. Finally, the methodology is applied to data from the field of public budgets.
133	Učení business rules z výsledků dolování GUHA asociačních pravidel / Business rule learning using data mining of GUHA association rules Vojíř, Stanislav January 2012 In today's highly competitive environment, business information systems should not only support existing business processes effectively but also adapt dynamically to changes in the environment. There are increasing efforts to separate application logic from business logic in information systems. One suitable instrument for this separation is the business rule approach. Business rules are simple, understandable rules. They can be used for externalizing and sharing knowledge as well as for active control and decision making within business processes. Although the business rule approach has been in use for almost 20 years, the various specifications and practical applications of business rules are still a subject of active research. A disadvantage of the business rule approach is the high cost of obtaining the rules: a domain expert is needed who is able to write them manually. One of the problems addressed by current research is the (semi)automatic acquisition of business rules from different resources, such as unstructured documents and historical data. This dissertation addresses the problem of acquiring (learning) business rules from a company's historical data. The main objective of the thesis is to design and validate a method for (semi)automatic learning of business rules using association rule mining. Association rules are a well-known data mining method for discovering interesting relations hidden in data. They are comprehensible and explainable, which makes them suitable for learning business rules. 
For this purpose the user can use not only simple rules discovered using the Apriori or FP-Growth algorithm, but also more complex association rules discovered using the GUHA method. This thesis uses the 4ft-Miner procedure implemented in the LISp-Miner data mining system. The first part of the thesis describes the relevant topics from research on business rules and association rules. "Business rules" is not the name of a single specification or standard but rather a label for an approach to modelling business logic. As part of the work, a process is defined for selecting the most appropriate business rule specification for a given practical use. The author then proposes three models for incorporating association rule mining into business rule sets. These models also define a transformation of GUHA association rules into business rules for the JBoss Drools system. To make it possible to learn business rules from data mining results over more than one data set, the author proposes a knowledge base that interconnects business rules with association rule mining. From the perspective of business rules, the knowledge base is a term dictionary. From the perspective of data mining, it contains background knowledge for data preprocessing and for the preparation of classification models. The proposed models have been validated through practical implementations in the EasyMiner system (in conjunction with JBoss Drools) and in Erian. The thesis also describes two model use cases based on real data from the fields of marketing and health insurance.
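As a rough illustration of what a GUHA-style rule evaluates, the sketch below builds the four-fold contingency table (a, b, c, d) for an antecedent and a succedent and checks the founded implication quantifier, which requires a >= Base and a / (a + b) >= p. This is not the LISp-Miner implementation or its API; the function names, thresholds and customer rows are assumptions made for the example.

```python
def four_fold_table(rows, antecedent, succedent):
    """Return (a, b, c, d): counts of rows by truth of antecedent/succedent.

    a: both hold, b: only antecedent, c: only succedent, d: neither.
    """
    a = b = c = d = 0
    for row in rows:
        ant, suc = antecedent(row), succedent(row)
        if ant and suc:
            a += 1
        elif ant:
            b += 1
        elif suc:
            c += 1
        else:
            d += 1
    return a, b, c, d


def founded_implication(table, p, base):
    """True if at least `base` rows satisfy both sides and the share of
    antecedent rows that also satisfy the succedent is at least p."""
    a, b, _, _ = table
    return a >= base and a / (a + b) >= p


# Hypothetical customer data, not taken from the thesis.
rows = (
    [{"plan": "basic", "churn": True}] * 6
    + [{"plan": "basic", "churn": False}] * 2
    + [{"plan": "premium", "churn": True}] * 1
    + [{"plan": "premium", "churn": False}] * 3
)
table = four_fold_table(
    rows,
    antecedent=lambda r: r["plan"] == "basic",
    succedent=lambda r: r["churn"],
)
holds_at_70 = founded_implication(table, p=0.7, base=5)
holds_at_80 = founded_implication(table, p=0.8, base=5)
```

A rule that passes such a test ("plan is basic implies churn, with confidence 0.75") is the kind of statement that could then be rewritten as a condition-action business rule, which is the transformation the abstract describes.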
134	Association rule mining as a support for OLAP / Dolování asociačních pravidel jako podpora pro OLAP Chudán, David January 2010 The aim of this work is to identify possibilities for the complementary use of two analytical methods of data analysis: OLAP analysis and data mining, represented by GUHA association rule mining. Using these two methods on one dataset in the proposed scenarios is expected to have a synergistic effect, yielding more knowledge than the two methods provide independently. This is the main contribution of the work. Another contribution is the original use of GUHA association rules, where the mining is performed on aggregated data. GUHA association rules offer greater capabilities than the classic association rules described in the literature. Experiments on real data demonstrate the discovery of unusual trends in data that would be very difficult to find using standard OLAP analysis, i.e. time-consuming manual browsing of an OLAP cube. On the other hand, using association rules alone loses the general overview of the data. The two methods therefore complement each other very well. Part of the solution is the use of the LMCL scripting language, which automates selected parts of the data mining process. The proposed recommender system would shield the user from association rules, enabling ordinary analysts unfamiliar with association rules to benefit from them. The thesis combines quantitative and qualitative research. The quantitative research consists of experiments on a real dataset, the proposal of a recommender system, and the implementation of selected parts of the association rule mining process in the LISp-Miner Control Language. The qualitative research consists of structured interviews with selected experts from the fields of data mining and business intelligence, who confirm the meaningfulness of the proposed methods.
135	Creating a prediction model for weather forecasting based on artificial neural network supported by association rules mining / Vytvoření predikčního modelu předpovědi počasí pomocí neuronové sítě a asociačních pravidel Kadlec, Jakub January 2016 This diploma thesis introduces three different methods of creating a neural network binary classifier for automated weather prediction, with attribute pre-selection based on association rules and correlation patterns mined with the LISp-Miner system. The first part of the thesis collects the theoretical knowledge needed to create such a predictive model, whereas the second part describes the creation of the model itself using the CRISP-DM methodology. The final part analyses the performance of the created classifiers and draws conclusions about the proposed methods and their possible benefits over training the network without attribute pre-selection.
136	Regras de associação aplicadas aos filtros de mensagens e canais de informação do projeto direto / Association rules applied to messages filters and information channel in the direto environment Frighetto, Michele January 2003 This work presents a brief study of the process of knowledge discovery in databases, focusing on the data mining stage performed with association rules. Proposed by Agrawal in 1993 in a study known as market basket analysis, association rules state that, with a certain degree of support and confidence, one set of items is likely to be present in a transaction given that another set is present. The need for analyses similar to Agrawal's arose in other fields, and association rules were extended to other applications. The main variations on association rules found in the literature are presented, along with the main algorithms. The work proposes mining association rules over the message filters and information channels of Direto, a catalog, schedule and e-mail software. Three data mining tools were used: Intelligent Miner, CBA and Magnus Opus. They were applied to a Java language discussion list, because the Direto project does not yet have public messages. Each tool has distinct features: Intelligent Miner allows hierarchies to be defined over the data to be mined; Magnus Opus works with various filters and allows ranges to be defined for handling numeric fields; CBA allows multiple minimum supports to be specified for the items. Keywords: knowledge discovery, information technology, Internet: social aspects, data mining, association rules, discussion list, message filters, information channel, Intelligent Miner, Magnus Opus, CBA
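The support and confidence measures mentioned in this abstract can be made concrete with a short sketch. It uses the usual definitions (support as the fraction of transactions containing an itemset, confidence as the support of the whole rule divided by the support of its antecedent); the function names and the basket data are illustrative assumptions, not material from the thesis or from any of the three tools.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of `consequent` given `antecedent`:
    support(antecedent and consequent) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical market baskets.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
```

Here the rule "bread implies milk" holds in half of all baskets and in two of the three baskets that contain bread. A tool that accepts multiple minimum supports, as the abstract says of CBA, would compare `rule_support` against a threshold that depends on the items involved rather than a single global value.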
137	Uma metodologia para exploração de regras de associação generalizadas integrando técnicas de visualização de informação com medidas de avaliação do conhecimento / A methodology for exploration of generalized association rules integrating information visualization techniques with knowledge evaluation measures Magaly Lika Fujimoto 04 August 2008 The data mining process aims at finding implicit knowledge in a data set to support decision making. From the user's point of view, several problems can arise during post-processing and provision of the extracted knowledge, such as the huge number of patterns generated by some extraction algorithms and the difficulty of understanding the models extracted from the data. Besides the problem of the number of rules, traditional association rule algorithms may lead to the discovery of very specific knowledge. Association rules can therefore be generalized to obtain more general knowledge. This project proposes an interactive methodology to aid in the evaluation of generalized association rules, in order to improve comprehensibility and to facilitate the identification of interesting knowledge. This aid is provided through visualization techniques combined with objective and subjective evaluation measures, which are implemented in the generalized association rule visualization module RulEE-GARVis, integrated with the rule exploration environment RulEE (Rule Exploration Environment). The RulEE environment is being developed at LABIC-ICMC-USP and supports the post-processing and provision of knowledge. In this context, a further objective of this research project was to develop the Management Module of the RulEE environment. A directed study verified that the proposed methodology does facilitate the understanding and identification of interesting generalized association rules. Keywords: association rules, data mining, generalization, objective measures, post-processing, subjective measures, taxonomies, visualization
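The effect of generalization described above can be sketched as follows: items are replaced by their parent in a taxonomy before support is counted, so several very specific patterns collapse into one more general pattern with higher support. This is only an assumed, single-level illustration; the taxonomy, the item names and the functions are hypothetical and do not reproduce RulEE-GARVis.

```python
# Hypothetical one-level taxonomy: item -> more general category.
TAXONOMY = {
    "cola": "soft drink",
    "lemonade": "soft drink",
    "chips": "snack",
    "pretzels": "snack",
}


def generalize(transaction, taxonomy):
    """Replace each item by its parent category when one is defined."""
    return frozenset(taxonomy.get(item, item) for item in transaction)


def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


transactions = [
    frozenset({"cola", "chips"}),
    frozenset({"lemonade", "pretzels"}),
    frozenset({"cola", "pretzels"}),
    frozenset({"lemonade"}),
]
generalized = [generalize(t, TAXONOMY) for t in transactions]

specific_support = support(transactions, {"cola", "chips"})
general_support = support(generalized, {"soft drink", "snack"})
```

The specific pattern covers a quarter of the toy transactions while the generalized one covers three quarters, which shows why generalization reduces the number of rules a user has to inspect. Objective evaluation measures of the kind the abstract mentions would then be computed on the generalized rules.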
138	Data Mining in Small Business / Data Mining in Small Business Sabovčik, František January 2018 This thesis aims to evaluate knowledge discovery techniques for use in a small business setting. After exploring the data and consulting domain experts, two tasks were selected: market basket analysis and sales prediction. For market basket analysis, the Relim algorithm was used to find frequent itemsets, together with metrics that measure the interestingness of association rules. For the sales prediction task, a decomposition model, SARIMA, MARS and neural networks with a time window were implemented. The models were evaluated, and acceptable results were achieved through hyper-parameter optimization. Contrary to expectations, adding weather data and using nonlinear models did not improve on SARIMA. The prediction was implemented as a server-side service for testing in a production environment.
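The "time window" used with the neural networks can be illustrated by the usual sliding-window transformation, which turns a series into supervised examples where the previous `window` values predict the next one. The abstract gives no details of the window size or features, so the function name, the window length and the sales figures below are assumptions for illustration only.

```python
def make_windows(series, window):
    """Split a series into (inputs, targets) pairs for supervised learning.

    Each input is `window` consecutive values; its target is the value
    that immediately follows them.
    """
    inputs, targets = [], []
    for i in range(len(series) - window):
        inputs.append(series[i:i + window])
        targets.append(series[i + window])
    return inputs, targets


# Hypothetical daily sales counts.
sales = [10, 12, 13, 15, 14, 18]
inputs, targets = make_windows(sales, window=3)
```

The resulting pairs can be fed to any regressor. The window length is one of the hyper-parameters that would typically be tuned, and extra columns such as weather observations could be appended to each input row.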
139	Algoritmus pro cílené doporučování produktů / Algorithm for Product Recommendation Bodeček, Miroslav January 2011 The goal of this project is to explore the problem of product recommendation in e-commerce, to evaluate known techniques, and to design, implement and test a product recommendation system for an existing e-commerce site. This report introduces the problem, briefly examines the current state of the art in this area and defines requirements for a product recommendation module. The concept of data mining in general is introduced. The report then presents a detailed design corresponding to the defined requirements and summarizes the data gathered during the testing phase. It concludes with an evaluation and a discussion of the remaining goals for this thesis.
140	Získávání znalostí z obchodních procesů / Business Process Mining Skácel, Jan January 2015 This thesis explains business process mining and its principles. A substantial part is devoted to the problems of process discovery. Based on an analysis of a specific manufacturing process, three methods are proposed that try to identify shortcomings in the process. The first discovers the manufacturing process and renders it as a graph. The second uses a simulator of production history to find products that may have caused delays in the process; the acquired data are used to mine frequent itemsets. The third tries to predict processing time at a selected workplace using association rules. The last two methods employ the Frequent Pattern Growth algorithm. The knowledge obtained in this thesis improves the efficiency of the manufacturing process and enables better production planning.
