Global ETD Search

41	Efficient Authentication, Node Clone Detection, and Secure Data Aggregation for Sensor Networks
Li, Zhijun (January 2010)
Sensor networks are innovative wireless networks consisting of a large number of low-cost, resource-constrained sensor nodes that collect, process, and transmit data in a distributed and collaborative way. There are numerous applications for wireless sensor networks, and security is vital for many of them. However, sensor nodes suffer from many constraints, including low computation capability, small memory, limited energy resources, susceptibility to physical capture, and the lack of infrastructure, all of which impose formidable security challenges and call for innovative approaches. In this thesis, we present our research results on three important aspects of securing sensor networks: lightweight entity authentication, distributed node clone detection, and secure data aggregation. As the technical core of our lightweight authentication proposals, a special type of circulant matrix named the circulant-P2 matrix is introduced. We prove the linear independence of matrix vectors, present efficient algorithms for matrix operations, and explore other important properties. By combining the circulant-P2 matrix with the learning parity with noise problem, we develop two one-way authentication protocols: the innovative LCMQ protocol, which is provably secure against all probabilistic polynomial-time attacks and provides remarkable performance on almost all metrics except one mild requirement for the verifier's computational capacity, and the HB$^C$ protocol, which utilizes the conventional HB-like authentication structure to preserve the bit-operation-only computation requirement for both participants and consumes less key storage than previous HB-like protocols without sacrificing other performance metrics. Moreover, two enhancement mechanisms are provided to protect the HB-like protocols from known attacks and to improve performance. For both protocols, practical parameters for different security levels are recommended. In addition, we build a framework to extend enhanced HB-like protocols to mutual authentication in a communication-efficient fashion. The node clone attack, that is, the attempt by adversaries to add one or more nodes to the network by cloning captured nodes, poses a severe threat to wireless sensor networks. To cope with it, we propose two distributed detection protocols with different tradeoffs on network conditions and performance. The first one is based on a distributed hash table, by which a fully decentralized, key-based caching and checking system is constructed to deterministically catch cloned nodes in general sensor networks. The protocol's efficient storage consumption and high security level are derived theoretically through a probability model, and the resulting equations, with necessary adjustments for real application, are supported by the simulations. The other is the randomly directed exploration protocol, which achieves notable communication performance and minimal storage consumption through an elegant probabilistic directed forwarding technique along with random initial direction and border determination. The extensive experimental results uphold the protocol design and show its efficiency in communication overhead and its satisfactory detection probability. Data aggregation is an inherent requirement for many sensor network applications, but designing secure mechanisms for data aggregation is very challenging because the nature of aggregation, which requires intermediate nodes to process and change messages, conflicts to a great extent with the security objective of preventing malicious manipulation. To address the different challenges of secure data aggregation, we present two types of approaches. The first is to provide cryptographic integrity mechanisms for general data aggregation. Based on recent developments of homomorphic primitives, we propose three integrity schemes: a concrete homomorphic MAC construction, homomorphic hash plus aggregate MAC, and homomorphic hash with identity-based aggregate signature, which provide different tradeoffs on security assumption, communication payload, and computation cost. The other is a substantial data aggregation scheme that is suitable for a specific and popular class of aggregation applications, embedded with built-in security techniques that effectively defeat outside and inside attacks. Its foundation is a new data structure, the secure Bloom filter, which combines HMAC with a Bloom filter. The secure Bloom filter is naturally compatible with aggregation and has reliable security properties. We systematically analyze the scheme's performance and run extensive simulations on different network scenarios for evaluation. The simulation results demonstrate that the scheme presents good performance on security, communication cost, and balance.
wireless sensor networks; security protocols; efficient entity authentication; node clone detection; secure data aggregation; homomorphic primitives; Electrical and Computer Engineering
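The abstract describes the secure Bloom filter only as a combination of HMAC with a Bloom filter. The following is a minimal sketch of that idea, not the thesis's construction: the class name, the parameters `num_bits` and `num_hashes`, and the way the hash index is prefixed to the item are all assumptions made for illustration.

```python
import hashlib
import hmac


class SecureBloomFilter:
    """Bloom filter whose bit positions come from HMAC, so only key holders can add or test items."""

    def __init__(self, key: bytes, num_bits: int = 256, num_hashes: int = 4):
        self.key = key
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array stored as a Python integer

    def _positions(self, item: bytes):
        # One HMAC-SHA256 per hash index; the index byte separates the hash functions.
        for i in range(self.num_hashes):
            digest = hmac.new(self.key, bytes([i]) + item, hashlib.sha256).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: bytes) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

    def merge(self, other: "SecureBloomFilter") -> None:
        # Aggregation step: the union of two filters is the bitwise OR of their bit arrays.
        self.bits |= other.bits
```

The `merge` method is what makes such a filter compatible with aggregation: an intermediate node can combine the filters of its children without knowing which items they contain.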
43	Agrégation et dissémination de données dans un réseau véhiculaire VANET. / Data Dissemination and Aggregation in Vehicular Adhoc Network
Allani, Sabri (02 November 2018)
This thesis addresses data dissemination and aggregation in vehicular ad hoc networks (VANETs). The problem is highly relevant and remains topical in an increasingly urbanized world. On the one hand, dissemination informs moving vehicles of important events in a timely manner; on the other hand, aggregation summarizes multiple data items from different sources that concern the same event. The challenge of dissemination is to compute the zone of relevance of an event, to deliver messages to the vehicles in that zone, and to keep delivering messages continuously to the vehicles in that zone. The challenge of aggregation is essentially to select the messages to aggregate and to qualify messages coming from distant vehicles. To solve the dissemination problem, we propose a new data dissemination protocol for VANETs. The main idea of this protocol rests on defining zones of relevance, ZOR (zone of relevance of a region), which measure the interest of a zone with respect to a given event, and on defining a split map that decomposes a large region into a set of ZORs. The approach for computing ZORs is formalized; it is based on greedy techniques for extracting the relevant cover. The dissemination protocol is presented as a flowchart summarizing the activities carried out when a vehicle is moving and an event is detected. The performance of the proposed protocol is evaluated and compared with the Slotted1-Persistence protocol in a simulation environment using the real road topology of the city of Bizerte, Tunisia. The simulation results are presented and discussed. In addition, some VANET applications, such as the traffic information system (TIS), require data aggregation to inform vehicles of traffic conditions, which reduces congestion and consequently CO2 emissions. Designing an efficient aggregation protocol that combines correlated traffic information such as location, speed, and direction, known as floating car data (FCD), is therefore a complex problem. In this thesis, we introduce a new data aggregation protocol for VANETs called SDDA (Smart Directional Data Aggregation). The protocol is intended for data exchange in both urban and highway settings. It is based on selecting the messages to aggregate. Three main filters are used: filtering based on vehicle direction, filtering based on the speed limit, and filtering based on the elimination of duplicate messages. Three aggregation algorithms are proposed, aiming to optimize the SOTIS algorithm. They handle one-way roads, two-way roads, and urban networks, respectively. As in the previous chapter, the performance of the proposed algorithms is evaluated through simulation, and various results are presented and discussed. / Over the last decade, the emergence of affordable wireless devices in vehicle ad-hoc networks has been a key step towards improving road safety as well as transport efficiency. Informing vehicles about interesting safety and non-safety events is of key interest. Thus, the design of an efficient data dissemination protocol has been of paramount importance. A careful scrutiny of the pioneering vehicle-to-vehicle data dissemination approaches highlights that geocasting is the most feasible approach for VANET applications, especially safety applications, since safety events are of interest mainly to vehicles located within a specific area close to the event, commonly called the ZOR or Zone Of Relevance. Indeed, the most challenging issue in geocast protocols is the definition of the ZOR for a given event dissemination. In this thesis, our first contribution introduces a new geocast approach, called Data Dissemination Protocol based on Map Splitting (DPMS). The main thrust of DPMS consists of building the zones of relevance through the mining of correlations between vehicles' trajectories and crossed regions. To do so, we rely on Formal Concept Analysis (FCA), which is a method of extracting interesting clusters from relational data. The performed experiments show that DPMS outperforms its competitors in terms of effectiveness and efficiency. On the other hand, some VANET applications, e.g., the Traffic Information System (TIS), require data aggregation in order to inform vehicles about road traffic conditions, which helps reduce traffic jams and consequently CO2 emissions while increasing user comfort. Therefore, the design of an efficient aggregation protocol that combines correlated traffic information like location, speed, and direction, known as Floating Car Data (FCD), is a challenging issue. In this thesis, we introduce a new TIS data aggregation protocol called Smart Directional Data Aggregation (SDDA), able to decrease the network overload while obtaining highly accurate information on traffic conditions for large road sections. To this end, we introduce three levels of message filtering: (i) filtering all FCD messages before the aggregation process based on vehicle directions and road speed limitations, (ii) integrating a suppression technique in the phase of information gathering in order to eliminate duplicate data, and (iii) aggregating the filtered FCD data and then disseminating it to other vehicles. The performed experiments show that SDDA outperforms existing approaches in terms of effectiveness and efficiency.
VANET; Data dissemination; Data aggregation; Geocast; ZOR; ITS; 004.2
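The three filtering levels listed in the abstract (direction and speed limit, duplicate suppression, then aggregation) can be sketched as follows. This is an illustrative reading rather than the SDDA protocol itself: the `FcdMessage` fields, the 45-degree heading tolerance, the rule that discards speeds above the limit, and the use of a per-segment mean as the aggregate are all assumptions.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class FcdMessage:
    vehicle_id: str
    segment: str        # road segment identifier
    heading_deg: float  # vehicle heading, 0-360
    speed_kmh: float


def filter_and_aggregate(messages, road_heading_deg, speed_limit_kmh, tolerance_deg=45.0):
    """Return the mean speed per segment after direction, speed, and duplicate filtering."""
    seen = set()
    per_segment = {}
    for msg in messages:
        # Smallest angle between the vehicle heading and the road direction.
        diff = abs((msg.heading_deg - road_heading_deg + 180.0) % 360.0 - 180.0)
        if diff > tolerance_deg:
            continue  # direction filter
        if msg.speed_kmh > speed_limit_kmh:
            continue  # speed-limit filter (assumed: readings above the limit are discarded)
        key = (msg.vehicle_id, msg.segment)
        if key in seen:
            continue  # duplicate suppression
        seen.add(key)
        per_segment.setdefault(msg.segment, []).append(msg.speed_kmh)
    return {seg: mean(speeds) for seg, speeds in per_segment.items()}
```

Filtering before aggregation keeps the aggregate small and avoids mixing traffic from the opposite carriageway into the same summary.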
44	Roteamento e agregação de dados baseado no RSSI em Redes de Sensores Sem Fio / RSSI-based routing and data aggregation in Wireless Sensor Networks
Lima, Moysés Mendes de (19 April 2015)
A Wireless Sensor Network (WSN) consists of a set of sensors arranged in a sensor field, in contact with or near an event or phenomenon to be monitored. Such networks have a major impact on applications that monitor environmental conditions such as temperature, light, motion, and the presence of certain types of objects. In most WSN applications, the sensor nodes have limited power and bandwidth. Thus, techniques for reducing power consumption and communication are required in order to optimize the lifetime of a WSN. One of the main objectives in the design of a WSN is to provide data communication between the sensor nodes and the sink node. The challenge is to extend the network lifetime while maintaining the quality of communication and data delivery. One of the key points observed in several studies is that routing protocols for WSNs differ depending on the application and network architecture. Recently, algorithms that exploit geographic information, called Geographic Routing Algorithms, have been considered among the most viable solutions for routing in WSNs due to their high scalability, dynamism, and high data delivery rate. These algorithms refer to nodes by position rather than address and use these coordinates to discover routes to the sink node. Their main drawback is the need for node position information, which can be expensive in several ways. In this work, we propose two new routing algorithms that do not require position information and also add data aggregation functionality to the network. Our proposal takes advantage of the greater communication range of the sink node, as well as the Received Signal Strength Indicator (RSSI) of the sensor nodes, to configure routing paths and aggregation times back to the sink node. Our results clearly indicate that the proposed algorithms reduce the amount of redundant data and the number of transmissions in the network while maintaining all the advantages of this kind of algorithm.
Wireless Sensor Networks; Routing; Data Aggregation
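The abstract says that the RSSI of the sink's long-range signal is used to configure both routing paths and aggregation times, without giving the rules. One plausible reading is sketched below; the function names, the dBm bounds, the 500 ms ceiling, and the linear mapping from RSSI to waiting time are hypothetical and are not taken from the dissertation.

```python
def next_hop(own_rssi_dbm, neighbor_rssi_dbm):
    """Pick the neighbor that hears the sink best, provided it hears it better than we do."""
    closer = {n: r for n, r in neighbor_rssi_dbm.items() if r > own_rssi_dbm}
    if not closer:
        return None  # no neighbor appears closer to the sink
    return max(closer, key=closer.get)


def aggregation_wait_ms(own_rssi_dbm, weakest_dbm=-100.0, strongest_dbm=-40.0, max_wait_ms=500.0):
    """Nodes with a stronger sink signal (assumed closer to the sink) wait longer for upstream data."""
    clamped = min(max(own_rssi_dbm, weakest_dbm), strongest_dbm)
    return max_wait_ms * (clamped - weakest_dbm) / (strongest_dbm - weakest_dbm)
```

Under these assumptions the sink RSSI acts as a stand-in for distance, so no node needs to know its coordinates, and nodes far from the sink transmit first so that nodes nearer the sink can merge their data.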
45	Integrace statistické aplikace a herního systému s využitím datového skladu a platformy Java / Integration of statistic application and gaming system using data warehouse and Java platform
Macoun, Jakub (January 2013)
This diploma thesis describes the creation of support software integrated into a gaming system. Because no documentation existed for the chosen gaming information system, an untraditional software development method had to be used; the thesis describes this method. The main objective is to create a supporting application that generates aggregated data, stores it in a data warehouse, and presents it to users. All development follows the defined method. The thesis is divided into two main parts. The first contains an analysis of the gaming information system. This analysis is used in the second part, which describes how the software was designed and implemented. The analyses in the thesis are an important part of the final product's development and make the product possible. The resulting application is unique in its domain and offers a view of an untraditional approach to developing this kind of support software. Thanks to its uniqueness, it can serve as inspiration for further development in this domain.
46	Secure data aggregation protocol for sensor networks
Shah, Kavit (20 February 2015)
Indiana University-Purdue University Indianapolis (IUPUI) / We propose a secure in-network data aggregation protocol with internal verification that extends the lifespan of the network by preserving bandwidth. To perform secure internal distributed operations, we show an algorithm for securely computing the sum of sensor readings in the network. Our algorithm can be generalized to any random tree topology and can be applied to any combination of mathematical functions. In addition, we present an efficient way of performing statistical analysis for the protocol. Furthermore, we propose a novel, distributed, and interactive algorithm to trace the adversary and remove it from the network. Finally, we analyze the protocol's bandwidth usage and prove its efficiency.
Sensor networks; Security; Data aggregation; Database management; Computer networks -- Security measures; Statistics; Broadband communication systems; Algorithms
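The abstract does not say how the sum is verified inside the network. As a rough illustration of how a sum can be computed over a tree while remaining checkable, the sketch below attaches a hash commitment to every partial sum. This is a generic commitment-tree idea chosen for illustration, not the thesis's protocol, and the names `label` and `aggregate` are hypothetical.

```python
import hashlib


def label(node_id, reading, child_labels):
    """Return (subtree_sum, commitment) for a node, given its children's labels."""
    total = reading + sum(s for s, _ in child_labels)
    h = hashlib.sha256(f"{node_id}|{total}".encode())
    for _, commitment in child_labels:
        h.update(commitment)  # bind this node's commitment to every child's commitment
    return total, h.digest()


def aggregate(tree, readings, node):
    """Recursively label the subtree rooted at `node`; `tree` maps a node to its list of children."""
    children = [aggregate(tree, readings, c) for c in tree.get(node, [])]
    return label(node, readings[node], children)
```

Because each commitment covers the commitments below it, a change to any reading changes the root commitment, which gives a verifier something to check the reported sum against.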
47	Analytical Model for Energy Management in Wireless Sensor Networks
Li, Hailong (24 September 2013)
No description available.
Computer Engineering; Wireless Sensor Networks; Wireless Visual Sensor Network; Energy Management; Data Aggregation; Gaussian Random Distribution; Lifetime Optimization
48	Spatiotemporal heterogeneity and bias in respiratory infection surveillance
Rader, Benjamin Matthew (20 February 2024)
Parameter estimation of respiratory infection surveillance dynamics commonly utilizes data aggregated over space and time. However, estimates derived from aggregated data may fail to account for biologically meaningful spatiotemporal heterogeneity of effects or to identify where and when transmissions occur. This dissertation shows that high-resolution temporal and spatial data can improve our understanding of heterogeneity while producing more valid and precise estimates of transmission parameters (e.g., contagiousness), behavioral trends (e.g., face mask utilization), and intervention effects (e.g., at-home test distribution). In three projects, we evaluate spatiotemporal heterogeneity in the context of two major respiratory pathogens: tuberculosis and SARS-CoV-2. First, in project one, we identify disease transmission hotspots from a tuberculosis case surveillance system in Greater Vitória, Brazil. Utilizing a human mobility model and a recently developed method to quantify disease transmission, we overcome multiple methodological constraints that often obscure spatially and temporally accurate transmission measurements. We estimate that two cities in Greater Vitória, Vila Velha (reproductive number = 1.05, 95%CI: 1.03–1.07) and Vitória (reproductive number = 1.04, 95%CI: 1.02–1.06), help sustain tuberculosis transmission in the entire region and may be effective targets for intervention, while Cariacica (reproductive number = 0.95, 95%CI: 0.94–0.97) fell below the critical threshold of 1 required to sustain transmission alone. Next, in project two, we utilize interrupted time series methods to estimate the effect of mask mandates on mask adherence using a nationally representative digital health survey on masking and a comprehensive database of pandemic-related government policies. The analysis improves on previous attempts to measure the effectiveness of mask mandates at the state level by utilizing county-level exposure and outcome data. We find that mask mandates were associated with a large heterogeneity of effects, ranging from an increase in masking of approximately 8% in counties with low levels of prior masking to a change of 1% or lower in places like the Northeast U.S., where masking levels were already high. Last, in project three, we leverage the same nationally representative digital health survey to understand at-home testing patterns in the United States. We utilize two different economic measures of resource allocation and a regression model with autoregressive integrated moving average errors to examine whether the Covidtests.gov government program reduced at-home testing inequities. We show that Covidtests.gov did increase at-home testing across all demographics; however, income-, geographic-, and race-based disparities in at-home test utilization were heightened during periods when the program was active. Specifically, the regression results estimate that Theil's T, an economic metric used here to measure at-home testing disparities, was 53% (95%CI: 6%–121%) higher for household income, 214% (95%CI: 86%–429%) higher for race, and 90% (95%CI: 23%–193%) higher for geography during Covidtests.gov dissemination periods. Disparities were not elevated for age. Together, these three projects demonstrate the substantial role that high-resolution data can play in improving our understanding of respiratory infection surveillance and informing effective public health interventions.
Public health; Behavioral epidemiology; Data aggregation bias; Disease transmission dynamics; Hotspot identification; Pandemic response evaluation; Public health policy
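For readers unfamiliar with Theil's T, the standard definition is T = (1/n) Σ (x_i/μ) ln(x_i/μ), where μ is the mean of the values. The sketch below computes it for a list of non-negative values; how the dissertation grouped respondents or weighted the index is not stated in the abstract, so this shows only the unweighted textbook form.

```python
import math


def theil_t(values):
    """Theil's T index: 0 for perfect equality, ln(n) when a single unit holds everything."""
    n = len(values)
    mu = sum(values) / n
    total = 0.0
    for x in values:
        if x > 0:  # the term x * ln(x) tends to 0 as x approaches 0
            ratio = x / mu
            total += ratio * math.log(ratio)
    return total / n
```

A higher value means the quantity (here, at-home test use) is more unevenly spread across the groups being compared.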
49	Energy-efficient privacy homomorphic encryption scheme for multi-sensor data in WSNs
Verma, Suraj; Pillai, Prashant; Hu, Yim Fun (04 May 2015)
Recent advancements in wireless sensor hardware enable a single hardware unit to sense multiple kinds of data, such as temperature, pressure, and humidity; this is referred to as multi-sensor data communication in wireless sensor networks (WSNs). The in-network processing technique of data aggregation is crucial in energy-efficient WSNs; however, combining it with the requirement of end-to-end data confidentiality can be a challenge. End-to-end data confidentiality along with data aggregation is possible with a special type of encryption called privacy homomorphic (PH) encryption. This paper proposes an optimized PH encryption scheme for WSN integrated networks handling multi-sensor data. The proposed scheme ensures lightweight payloads and reduced energy and bandwidth consumption along with lower latencies. The performance of the proposed scheme is analyzed with respect to the existing scheme. The working principle of the multi-sensor data framework is also presented, along with the appropriate packet structures and process. The scheme decreases the payload size by 56.86% and spends an average energy of 8-18 mJ at the aggregator node as the number of sensor nodes varies from 10 to 50, thereby ensuring scalability of the WSN, unlike the existing scheme.
Contiki-OS; Wireless sensor networks; Homomorphic encryption scheme; Data aggregation; Energy-efficient WSNs; End-to-end data confidentiality
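To illustrate what a privacy homomorphic scheme makes possible, the sketch below uses a simple additive construction in which each node adds a secret pad to its reading modulo a fixed value, so an aggregator can sum ciphertexts without decrypting them. This is a well-known textbook-style construction used here as a stand-in; it is not the optimized scheme proposed in the paper, and the modulus, key derivation, and function names are assumptions.

```python
import hashlib
import hmac

MODULUS = 2**32  # must exceed the largest possible aggregate


def round_key(secret: bytes, round_no: int) -> int:
    """Derive a fresh per-round pad from a node's long-term secret shared with the sink."""
    digest = hmac.new(secret, round_no.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def encrypt(reading: int, secret: bytes, round_no: int) -> int:
    return (reading + round_key(secret, round_no)) % MODULUS


def aggregate(ciphertexts) -> int:
    # The aggregator adds ciphertexts without learning any individual reading.
    return sum(ciphertexts) % MODULUS


def decrypt_sum(ciphertext: int, secrets, round_no: int) -> int:
    # The sink removes the pads of all contributing nodes to recover the plain sum.
    pads = sum(round_key(s, round_no) for s in secrets)
    return (ciphertext - pads) % MODULUS
```

Only the sink, which knows every node's secret, can recover the sum, which is what gives end-to-end confidentiality while still allowing aggregation at intermediate nodes.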
50	Uma nova metáfora visual escalável para dados tabulares e sua aplicação na análise de agrupamentos / A scalable visual metaphor for tabular data and its application on clustering analysis
Mosquera, Evinton Antonio Cordoba (19 September 2017)
The rapid evolution of computing resources has enabled large datasets to be stored and retrieved. However, exploring, understanding, and extracting useful information is still a challenge. Among the computational tools that address this problem, information visualization techniques enable data analysis that draws on human visual ability through graphical representations of the data set, and data mining provides automatic processes for the discovery and interpretation of patterns. Despite the recent popularity of information visualization methods, a recurring problem is low visual scalability when analyzing large data sets, resulting in context loss and visual disorder. To represent large datasets while reducing the loss of relevant information, the process of visual data aggregation is being used. Aggregation decreases the amount of data to be represented while preserving the distribution and trends of the original dataset. Regarding data mining, information visualization has become an essential tool in the interpretation of computational models and generated results, especially those of unsupervised techniques such as clustering. This is because, in these techniques, the only way the user interacts with the mining process is through parameterization, which limits the insertion of domain knowledge in the data analysis process. In this thesis, we propose and develop a new visual metaphor based on the TableLens that employs approaches based on the concept of aggregation to create more scalable representations of tabular data. As an application, we use the developed metaphor in the analysis of the results of clustering techniques. The resulting framework not only supports the analysis of large databases with reduced loss of context but also provides insights into how data attributes contribute to cluster formation in terms of the cohesion and separation of the resulting groups.
Clustering analysis; Data aggregation; Data mining; Data visualization; Tabular data; Visual analytics
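The aggregation step described in the abstract, reducing many table rows to a smaller number of visual elements while keeping the overall trend, can be sketched as below. The binning rule (equal-sized bins after sorting by one column) and the use of the per-column mean are assumptions for illustration; the thesis may aggregate differently.

```python
def aggregate_rows(rows, sort_column, max_bars):
    """Sort rows by one column, split them into at most `max_bars` bins, and average each column per bin."""
    ordered = sorted(rows, key=lambda r: r[sort_column])
    n = len(ordered)
    num_bins = min(max_bars, n)
    bars = []
    for b in range(num_bins):
        # Integer bin boundaries keep bin sizes within one row of each other.
        start = b * n // num_bins
        end = (b + 1) * n // num_bins
        chunk = ordered[start:end]
        bars.append([sum(col) / len(chunk) for col in zip(*chunk)])
    return bars
```

Each returned row would be drawn as one bar per column, so a table with millions of rows can be shown in a fixed number of screen lines.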
