Global ETD Search

111	Descoberta de conhecimento em bases de dados e estratégias de relacionamento com clientes: um estudo no setor de serviços Fernandes, Marcelo Pires 12 February 2008 Made available in DSpace on 2016-03-15T19:26:36Z (GMT). Previous issue date: 2008-02-12 / The research problem addressed is how companies in the services industry use customer databases to discover useful knowledge about their customers and so improve the development of relationship strategies. The issue matters because, with increasing competition and customer demands, companies need to treat their customers differently so that they can keep the most profitable ones in their portfolios. Accordingly, the literature has suggested deeper integration among disciplines such as Relationship Marketing, CRM and Data Mining. This study examined how the literature presents and describes database analysis processes and found several proposals that divide the process of knowledge discovery in databases into stages such as problem understanding, data understanding, data preparation, data modeling, model evaluation and deployment. The target population consisted of services companies in the cities of São Paulo and Rio de Janeiro, and quantitative research was carried out by applying a questionnaire to 67 professionals from that population. The research investigated the level of use of the stages of the knowledge discovery process, of data mining techniques and of relationship strategies. 
It was found that the companies surveyed make heavy use of the knowledge discovery stages identified in the literature, that only a small part of the data mining techniques is used uniformly across these companies and, finally, that the most used strategies are those related to acquiring new customers and identifying profitable ones. This last finding was somewhat surprising because it runs counter to authors who argue that companies should focus their relationship strategies on customer retention. These results can support companies in developing customer relationship strategies based on an integrated analysis of business issues, customer information and quantitative models for analysing that information, so as to turn it into useful knowledge for decision making. / O problema de pesquisa a ser investigado está associado ao modo como empresas do setor de serviços utilizam bases de dados para descobrir conhecimento sobre o cliente e embasar o desenvolvimento de estratégias de relacionamento. Este tema é importante, visto que em função do aumento da concorrência e da exigência dos clientes, as empresas precisam tratar seus clientes de forma diferenciada, de forma a manter em sua carteira aqueles mais rentáveis. Neste sentido, a literatura tem sugerido uma integração cada vez mais intensa entre disciplinas como Marketing de Relacionamento, CRM e Mineração de Dados. O presente trabalho estudou o modo como a literatura apresenta e descreve processos de análise de bases de dados e algumas propostas foram encontradas, propostas que segmentam o processo de descoberta de conhecimento em bases de dados em etapas como entendimento do problema, entendimento e preparação dos dados, modelagem dos dados, avaliação do modelo e implementação da solução desenvolvida. 
O universo estudado foi o de empresas do setor de serviços que atuam nas cidades de São Paulo e do Rio de Janeiro e uma pesquisa quantitativa foi realizada por meio da aplicação de um questionário a 67 respondentes. Nesta pesquisa, foi investigado o nível de utilização das etapas dos processos de descoberta de conhecimento em bases de dados, as técnicas de mineração utilizadas, bem como as estratégias de relacionamento adotadas com clientes. Constatou-se que as empresas pesquisadas possuem um alto nível de utilização das etapas de descoberta de conhecimento identificadas na literatura, que elas utilizam de forma uniforme apenas algumas das técnicas de mineração de dados identificadas na literatura e que, do ponto de vista de estratégias de relacionamento com clientes, as estratégias de aquisição de novos clientes e identificação dos melhores clientes possuem um nível de utilização superior ao de estratégias de retenção de clientes (considerando resultados da amostra). Esta última constatação, de certo modo, contraria o pensamento de algumas correntes teóricas, que defendem que as empresas devem focar suas estratégias de relacionamento na retenção de clientes. Estes resultados podem servir de apoio aos gestores das empresas, no que se refere aos processos de desenvolvimento de estratégias de relacionamento com clientes, sustentados em análise integrada dos aspectos de negócio envolvidos, informações sobre o cliente, bem como modelos quantitativos de análise destas informações, de forma a transformá-las em conhecimento útil para a tomada de decisão. estratégias de relacionamento descoberta de conhecimento em bases mineração de dados relationship strategies knowledge discovery in databases data mining
112	Otimização multiobjetivo e programação genética para descoberta de conhecimento em engenharia Russo, Igor Lucas de Souza 26 January 2017 Submitted by Renata Lopes (renatasil82@gmail.com) on 2017-04-19T15:28:50Z / Approved for entry into archive by Adriana Oliveira (adriana.oliveira@ufjf.edu.br) on 2017-04-20T12:28:17Z (GMT) / Made available in DSpace on 2017-04-20T12:28:17Z (GMT). Previous issue date: 2017-01-26 / CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / A área de Otimização envolve o estudo e emprego de métodos para determinação dos parâmetros que levam à obtenção de soluções ótimas, de acordo com critérios denominados objetivos. Um problema é classificado como multiobjetivo quando apresenta objetivos múltiplos e conflitantes, que devem ser otimizados simultaneamente. Recentemente tem crescido o interesse dos pesquisadores pela análise de pós-otimalidade, que consiste na busca por propriedades intrínsecas às soluções ótimas de problemas de otimização e que podem lançar uma nova luz à compreensão dos mesmos. Innovization (inovação através de otimização, do inglês innovation through optimization) é um processo de descoberta de conhecimento a partir de problemas de otimização na forma de relações matemáticas entre variáveis, objetivos, restrições e parâmetros. Dentre as técnicas de busca que podem ser utilizadas neste processo está a Programação Genética (PG), uma meta-heurística bioinspirada capaz de evoluir programas de forma automatizada. 
Além de numericamente válidos, os modelos encontrados devem utilizar corretamente as variáveis de decisão em relação às unidades envolvidas, de forma a apresentar significado físico coerente. Neste trabalho é proposta uma alternativa para tratamento das unidades através de operações protegidas que ignoram os termos inválidos. Além disso, propõe-se aqui uma estratégia para evitar a obtenção de soluções triviais que não agregam conhecimento sobre o problema. Visando aumentar a diversidade dos modelos obtidos, propõe-se também a utilização de um arquivo externo para armazenar as soluções de interesse ao longo da busca. Experimentos computacionais são apresentados utilizando cinco estudos de caso em engenharia para verificar a influência das ideias propostas. Os problemas tratados aqui envolvem os projetos de: uma treliça de 2 barras, uma viga soldada, do corte de uma peça metálica, de engrenagens compostas e de uma treliça de 10 barras, sendo este último ainda não explorado na literatura de descoberta de conhecimento. Finalmente, o conhecimento inferido no estudo de caso da estrutura de 10 barras é utilizado para reduzir a dimensionalidade do problema. / The area of optimization involves the study and use of methods for determining the parameters that lead to optimal solutions according to criteria called objectives. A problem is classified as multiobjective when it has multiple conflicting objectives that must be optimized simultaneously. Researchers have recently shown growing interest in post-optimality analysis, which searches for intrinsic properties of the optimal solutions of optimization problems and can shed new light on those problems. Innovization (from innovation through optimization) is a process of knowledge discovery from optimization problems in the form of mathematical relationships between variables, objectives, constraints, and parameters. 
Genetic Programming (GP), a search technique that can be used in this process, is a bio-inspired metaheuristic capable of evolving programs automatically. In addition to being numerically valid, the models found must use the decision variables correctly with respect to the units involved, so that they carry a coherent physical meaning. In this work, a method is proposed to handle units through protected operations that ignore invalid terms. A strategy is also proposed to avoid trivial solutions that add no knowledge about the problem. To increase the diversity of the models obtained, the use of an external archive to store the solutions of interest found during the search is proposed as well. Computational experiments on five engineering case studies are presented to verify the influence of the proposed ideas. The problems addressed are the design of a 2-bar truss, a welded beam, the cutting of a metal part, compound gears, and a 10-bar truss. The last of these had not previously been explored in the knowledge discovery literature. Finally, the knowledge inferred in the 10-bar truss case study is used to reduce the dimensionality of that problem. CNPQ::CIENCIAS EXATAS E DA TERRA Innovization Otimização multiobjetivo Descoberta de conhecimento Programação genética Innovization Multiobjective Optimization Knowledge discovery Genetic Programming
113	Méthodologie d’extraction de connaissances spatio-temporelles par fouille de données pour l’analyse de comportements à risques : application à la surveillance maritime / Methodology of spatio-temporal knowledge discovery through data mining for risk behavior analysis : application to maritime traffic monitoring Idiri, Bilal 17 December 2013 Les progrès technologiques en systèmes de localisation (AIS, radar, GPS, RFID, etc.), de télétransmission (VHF, satellite, GSM, etc.), en systèmes embarqués et leur faible coût de production a permis leur déploiement à une large échelle. Énormément de données sur les déplacements d'objets sont produites par le biais de ces technologies et utilisées dans diverses applications de surveillance temps-réel comme la surveillance du trafic maritime. L'analyse a posteriori des données de déplacement de navires et d'événements à risques peut présenter des perspectives intéressantes pour la compréhension et l'aide à la modélisation des comportements à risques. Dans ce travail de thèse une méthodologie basée sur la fouille de données spatio-temporelle est proposée pour l'extraction de connaissances sur les comportements potentiellement à risques de navires. Un atelier d'aide à l'analyse de comportements de navires fondé sur cette méthodologie est aussi proposé. / Advances in positioning technologies (AIS, radar, GPS, RFID, etc.), remote transmission (VHF, satellite, GSM, etc.) and embedded systems, together with their low production cost, have enabled their deployment on a large scale. Huge amounts of moving-object data are collected through these technologies and used in various real-time monitoring applications, such as the surveillance of maritime traffic. Post-hoc analysis of ship movement data and of risk events may offer interesting opportunities for understanding risky behaviors and supporting their modeling. 
In this work, we propose a methodology based on spatio-temporal data mining for discovering knowledge about potentially risky behaviors of ships. A workbench built on this methodology to support the analysis of ship behavior is also proposed. Fouille de données Extraction de connaissances Objets mobiles Surveillance maritime Analyse de comportements Data mining Knowledge discovery Moving objects Maritime monitoring Behavior analysis
114	A Framework for How to Make Use of an Automatic Passenger Counting System Fihn, John, Finndahl, Johan January 2011 Most modern cities today face severe traffic congestion, a consequence of the increasing use of private motor vehicles. Public transport plays a crucial role in reducing this traffic, but to be an attractive alternative to private motor vehicles it needs to provide services that suit citizens' travel requirements. One system that can give transit agencies rapid feedback about the usage of their transport network is the Automatic Passenger Counting (APC) system, which registers the number of passengers boarding and alighting a vehicle. Transit agencies can use knowledge about passengers' travel behaviour to adapt and improve their services, but to gain this knowledge they need to know how to use an APC system. This thesis investigates how a transit agency can make use of an APC system. The research took place in Melbourne, where Yarra Trams, operator of the tram network, is now working out how to utilise its APC system. A theoretical framework based on theories of Knowledge Discovery from Data, System Development, and Human Computer Interaction is built, tested, and evaluated in a case study at Yarra Trams. The case study resulted in a software system that can process and model Yarra Trams' APC data. The result of the research is a proposed framework consisting of steps and events that can serve as a guide for a transit agency that wants to make use of an APC system. APC Automatic Passenger Counting System System Development CRISP-DM KDD Knowledge Discovery from Data ITS Computer and Information Sciences Data- och informationsvetenskap
115	Approche évolutionnaire et agrégation de variables : application à la prévision de risques hydrologiques / Evolutionary approach and variable aggregation : application to hydrological risks forecasting Segretier, Wilfried 10 December 2013 Les travaux de recherche présentés dans ce mémoire s'inscrivent dans la lignée des approches de modélisation hydrologiques prédictives dirigées par les données. Nous avons particulièrement développé leur application sur le contexte difficile des phénomènes de crues éclairs caractéristiques des bassins versants de la région Caraïbe, qui pose un défi sécuritaire. En envisageant le problème de la prévision de crues comme un problème d'optimisation combinatoire difficile, nous proposons d'utiliser la notion de métaheuristiques, à travers les algorithmes évolutionnaires, notamment pour leur capacité à parcourir efficacement de grands espaces de recherche et à fournir des solutions de bonne qualité en des temps d'exécution raisonnables. Nous avons présenté l'approche de prédiction AV2D : Aggregate Variable Data Driven, dont le concept central est la notion de variable agrégée. L'idée sous-jacente à ce concept est de considérer le pouvoir prédictif de nouvelles variables définies comme le résultat de fonctions statistiques, dites d'agrégation, calculées sur des données correspondant à des périodes de temps précédant un événement à prédire. Ces variables sont caractérisées par des ensembles de paramètres correspondant à leurs propriétés. Nous avons introduit les variables agrégées hydrométéorologiques permettant de répondre au problème de la classification d'événements hydrologiques. La complexité du parcours de l'espace de recherche engendré par les paramètres définissant ces variables a été prise en compte grâce à la mise en oeuvre d'un algorithme évolutionnaire particulier dont les composants ont été spécifiquement définis pour ce problème. 
Nous avons montré, à travers une étude comparative avec d'autres approches de modélisation dirigées par les données, menée sur deux cas d'études de bassins versants caribéens, que l'approche AV2D est particulièrement bien adaptée à leur contexte. Nous étudions par la suite les bénéfices offerts par les approches de modélisation hydrologiques modulaires dirigées par les données, en définissant un procédé de division en sous-processus prenant en compte les caractéristiques particulières des bassins versants auxquels nous nous intéressons. Nous avons proposé une extension des travaux précédents à travers la définition d'une approche de modélisation modulaire SM2D : Spatial Modular Data Driven, consistant à considérer des sous-processus en divisant l'ensemble des exemples à classifier en sous-ensembles correspondant à des comportements hydrologiques homogènes. Nous avons montré, à travers une étude comparative avec d'autres approches dirigées par les données mises en oeuvre sur les mêmes sous-ensembles de données, que cette approche permet d'améliorer les résultats de prédiction, particulièrement à court terme. Nous avons enfin proposé la modélisation d'un outil de pi / The work presented in this thesis belongs to the area of data-driven hydrological modeling approaches. We particularly investigated their application to the difficult problem of the flash flood phenomena typically observed in Caribbean watersheds. Treating flood prediction as a combinatorial optimization problem, we propose to use metaheuristics, through evolutionary algorithms, especially for their capacity to explore large search spaces efficiently and to provide good solutions in reasonable execution times. We proposed the hydrological prediction approach AV2D: Aggregate Variable Data Driven, whose central concept is the notion of aggregate variable. 
The underlying idea of this concept is to consider the predictive power of new variables defined as the results of statistical functions, called aggregation functions, computed on data corresponding to time periods preceding an event to be predicted. These variables are characterized by sets of parameters corresponding to their specifications. We introduced hydro-meteorological aggregate variables that address the problem of classifying hydrological events. We showed, through a comparative study on two typical Caribbean watersheds using several common data-driven modelling techniques, that the AV2D approach is particularly well suited to the studied context. We also study the benefits offered by modular approaches through the definition of the SM2D: Spatial Modular Data Driven approach, which considers sub-processes partly defined by spatial criteria. We showed that the results obtained by AV2D on these sub-processes increase performance, particularly for short-term prediction. Finally, we proposed the modeling of a generic control tool for hydro-meteorological prediction systems, H2FCT: Hydro-meteorological Flood Forecasting Control Tool. Métaheuristiques Algorithmes évolutionnaires Intelligence artificielle Knowledge discovery from data Metaheuristics Evolutionary algorithms Artificial intelligence
116	Empirické porovnání systémů dobývání znalostí z databází / Empirical comparison of systems for knowledge discovery in databases Benešová, Kristýna January 2008 As the volume of collected and stored data grows, so do the need and interest of the data's owners in exploiting its potential for decision making. New approaches and methods drawing on computer science, statistics and machine learning are therefore being developed to meet this need. The aim of this diploma thesis is to present the process of knowledge discovery in databases on the Tinnitus medical data and to introduce the LISp-Miner and Weka systems, which support this process. The theoretical part summarizes the basic characteristics of, and approaches to, the knowledge discovery process. The practical part is devoted to carrying out the whole process step by step. The modeling step itself uses the academic systems LISp-Miner and Weka mentioned above. The final section of the practical part presents the results achieved and the author's own evaluation of the systems.
117	Automatizace předzpracování dat za využití doménových znalosti / Automation of data preprocessing using domain knowledge Beskyba, Jan January 2014 In this work we propose a solution that would help automate part of the knowledge discovery in databases process. Domain knowledge plays an important role in this automation and must be built into the proposed data preparation program. The introduction covers the theoretical basis of knowledge discovery in databases, with an emphasis on domain knowledge. Next, we focus on the basic principles of data pre-processing and on the scripting language LMCL, which could become part of the design of new applications for automated data preparation. Finally, we design an application for data pre-processing and verify it on data from the House of Commons.
118	Aplikace data miningu v podnikové praxi / Data mining applications in business practice Trávníček, Petr January 2011 Over the last decades, knowledge discovery from databases, one of the disciplines of information and communication technologies, has developed into its current state and attracted growing interest, and not only from major corporations. This diploma thesis deals with data mining, paying particular attention to its practical use in a business environment. The first objective is to review the possible applications of data mining and to break down implementation techniques, focusing on specific data mining methods and algorithms as well as on the adaptation of business processes. This objective is addressed in the theoretical part of the thesis, which covers the principles of data mining, the process of knowledge discovery from databases, commonly used data mining methods and algorithms, and the tasks typically implemented in this domain. A further objective is to present the benefits of data mining on a model example, shown in the practical part of the thesis. Besides evaluating the data mining models created, the practical part also designs subsequent steps that would enable higher efficiency in some specific areas of the given business. I consider this last point, together with the characterization of the knowledge discovery in databases process, to be the most beneficial contributions of the thesis.
119	Mineração de dados aplicada à classificação do risco de evasão de discentes ingressantes em instituições federais de ensino superior AMARAL, Marcelo Gomes do 08 July 2016 Submitted by Fabio Sobreira Campos da Costa (fabio.sobreira@ufpe.br) on 2017-07-11T14:35:16Z / Made available in DSpace on 2017-07-11T14:35:16Z (GMT). Previous issue date: 2016-07-08 / As Instituições Federais de Ensino Superior (IFES) possuem um importante papel no desenvolvimento social e econômico do país, contribuindo para o avanço tecnológico e científico e fomentando investimentos. Nesse sentido, entende-se que um melhor aproveitamento dos recursos educacionais ofertados pelas IFES contribui para a evolução da educação superior, como um todo. Uma maneira eficaz de atender esta necessidade é analisar o perfil dos estudantes ingressos e procurar prever, com antecedência, casos indesejáveis de evasão que, quanto mais cedo identificados, melhor poderão ser estudados e tratados pela administração. Neste trabalho, propõe-se a definição de uma abordagem para aplicação de técnicas diretas de Mineração de Dados objetivando a classificação dos discentes ingressos de acordo com o risco de evasão que apresentam. Como prova de conceito, a análise dos aspectos inerentes ao processo de Mineração de Dados proposto se deu por meio de experimentações conduzidas no ambiente da Universidade Federal de Pernambuco (UFPE). 
Para alguns dos algoritmos classificadores, foi possível obter uma acurácia de classificação de 73,9%, utilizando apenas dados socioeconômicos disponíveis quando do ingresso do discente na instituição, sem a utilização de nenhum dado dependente do histórico acadêmico. / Brazil's Federal Institutions of Higher Education play an important role in the social and economic development of the country, contributing to technological and scientific advances and encouraging investment. A better use of the educational resources offered by these institutions therefore contributes to the evolution of higher education as a whole. An effective way to meet this need is to analyze the profile of incoming students and try to predict, as early as possible, undesirable cases of dropout, which can be better examined and addressed by the institution's administration the earlier they are identified. This work proposes an approach for the direct application of Data Mining techniques to classify incoming students according to their dropout risk. As a proof of concept, the proposed Data Mining approach was evaluated through experiments conducted at the Federal University of Pernambuco. Some of the classification algorithms tested achieved a classification accuracy of 73.9% using only socioeconomic data available at the time of the student's admission to the institution, without using any data that depends on academic history. Mineração de Dados Educacionais Algoritmos de Classificação Knowledge Discovery in Databases Educational Data Mining Classification Algorithms
120	Visualização de operações de junção em sistemas de bases de dados para mineração de dados. / Visualization of join operations in DBMS for data mining. Maria Camila Nardini Barioni 13 June 2002 Nas últimas décadas, a capacidade das empresas de gerar e coletar informações aumentou rapidamente. Essa explosão no volume de dados gerou a necessidade do desenvolvimento de novas técnicas e ferramentas que pudessem, além de processar essa enorme quantidade de dados, permitir sua análise para a descoberta de informações úteis, de maneira inteligente e automática. Isso fez surgir um proeminente campo de pesquisa para a extração de informação em bases de dados denominado Knowledge Discovery in Databases (KDD), no qual, em geral, técnicas de mineração de dados (DM) têm um papel preponderante. A obtenção de bons resultados na etapa de mineração de dados depende fortemente de quão adequadamente o preparo dos dados é realizado. Sendo assim, a etapa de extração de conhecimento (DM) no processo de KDD, é normalmente precedida de uma etapa de pré-processamento, onde os dados que porventura devam ser submetidos à etapa de DM são integrados em uma única relação. Um problema importante enfrentado nessa etapa é que, na maioria das vezes, o usuário ainda não tem uma idéia muito precisa dos dados que devem ser extraídos. Levando em consideração a grande habilidade de exploração da mente humana, este trabalho propõe uma técnica de visualização de dados armazenados em múltiplas relações de uma base de dados relacional, com o intuito de auxiliar o usuário na preparação dos dados a serem minerados. Esta técnica permite que a etapa de DM seja aplicada sobre múltiplas relações simultaneamente, trazendo as operações de junção para serem parte desta etapa. De uma maneira geral, a adoção de junções em ferramentas de DM não é prática, devido ao alto custo computacional associado às operações de junção. 
Entretanto, os resultados obtidos nas avaliações de desempenho da técnica proposta neste trabalho mostraram que ela reduz esse custo significativamente, tornando possível a exploração visual de múltiplas relações de uma maneira interativa. / In recent decades the capacity to generate and accumulate information has increased quickly. With the explosive growth in the volume of data, new techniques and tools are being sought to process it and to discover useful information from it automatically, leading to the field known as Knowledge Discovery in Databases (KDD), in which data mining (DM) techniques generally play an important role. The results of applying data mining techniques to datasets depend heavily on proper data preparation. In traditional DM processes, therefore, data goes through a pre-processing step that produces a single table, which is then submitted to mining. An important problem faced during this step is that, most of the time, the analyst does not have a clear idea of which portions of the data should be mined. This work draws on the strong ability of human beings to interpret data represented graphically to develop a technique for visualizing data from multiple tables, helping human analysts prepare data for DM. The technique allows the data mining process to be applied over multiple relations at once, making the join operations part of this process. In general, the use of multiple tables in DM tools is not practical because of the high computational cost of exploring them. Experimental evaluation of the proposed technique shows that it reduces this cost significantly, making it possible to explore data from multiple tables visually and interactively. mineração visual de dados pré-processamento knowledge discovery in databases pre-processing visual data mining
