71 |
Complex network component unfolding using a particle competition technique / Desdobramento de componentes de redes complexas utilizando uma técnica de competição de partículas
Urio, Paulo Roberto, 12 June 2017
This work applies complex network theory to semi-supervised and unsupervised learning in networks that represent multivariate datasets. Complex networks allow nonlinear dynamical systems to exhibit behaviors that depend on the connectivity patterns of the network. Inspired by behavior observed in nature, such as competition for limited resources, dynamical system models can be employed to uncover the organizational structure of a network. In this dissertation, we develop a technique for classifying data represented as interaction networks. As part of the technique, we model a dynamical system inspired by the biological dynamics of resource competition. Similar methods have so far focused on vertices as the resource under competition; we instead introduce edges as that resource. In doing so, the connectivity pattern of a network can be used not only in the simulation of the dynamical system but also in the learning task itself.
|
72 |
Segmentação de imagens baseada em redes complexas e superpixels: uma aplicação ao censo de aves / Image segmentation based on complex networks and superpixels: an application to birds census
Botelho, Glenda Michele, 19 September 2014
Segmentation is one of the most important steps in image analysis and has a wide range of applications. However, many traditional techniques have a high computational cost, which hinders their application to high-resolution images such as the images of bird nesting sites in the Pantanal, one of Brazil's most important wetlands, that are analyzed in this work. We therefore propose a new segmentation approach that combines community detection algorithms, which originate from complex network theory, with superpixel extraction techniques. This approach can segment high-resolution images while maintaining a trade-off between accuracy and processing time. Moreover, because the nesting-site images have peculiar characteristics that are better handled by texture segmentation techniques, a technique based on Markov Random Fields (MRF) is proposed as a complement to the initial segmentation approach to perform the final identification of the birds. Finally, given the importance of quantitatively evaluating segmentation quality, a new evaluation metric based on ground truth was developed. This work advances the state of the art in high-resolution image segmentation by improving and developing methods that combine complex networks with superpixels, which achieved satisfactory results with low processing time. In addition, applying the MRF segmentation technique made possible an important contribution to bird census through the analysis of aerial images of nesting sites.
|
73 |
Triangle packing for community detection : algorithms, visualizations and application to Twitter's network / La détection de communautés basée sur la triangulation de graphes : algorithmes, visualisations et application aux réseaux de tweets
Abdelsadek, Youcef, 31 March 2016
Relational data are constantly growing in volume: we generate an immense quantity of data simply by carrying out everyday tasks, and analyzing these data raises hard challenges. In this thesis, we consider two aspects of relational data. First, we are interested in relational data with weighted relationships. As a concrete example, relationships among Twitter users can be weighted by their number of shared followers. The second aspect is the dynamic nature inherent to such data; in the same example, the number of common followers between two Twitter users can change over time. To handle these complex and evolving relational data, we model them as graphs. Another facet of this thesis is community detection in weighted and dynamic graphs. For an analyst, identifying communities can help to grasp the semantics behind the graph structure. Our hypothesis is that a set of pairwise disjoint triangles can serve as a backbone for detecting the community structure. To select these triangles, several algorithms are proposed: branch-and-bound, greedy search, heuristics, and a genetic algorithm. Building on this set of triangles, we propose a community detection algorithm called Tribase, whose guiding idea is to compare the weights of communities so that dominant communities gain more members. Tribase is evaluated in a comparative study on the well-known LFR benchmark. The results show that Tribase succeeds in detecting the communities in graphs where a community structure exists. Additionally, to assess Tribase on real-world data, we apply it to social network data, in particular Twitter data, from the ANR Info-RSN project. To support the analyst in acquiring knowledge, we developed a visual and interactive application called NLCOMS (Nœud-Lien et COMmunautéS), which offers multiple synchronized views for visualizing the community structure and related information. Furthermore, we propose Dyci, a community detection algorithm for weighted and dynamic graphs that takes advantage of previously detected communities. Dyci handles the possible update scenarios of the graph structure: node or edge addition, node or edge removal, and edge weight update. Its main idea is to track whether a community weakens over time (in terms of weight) in order to reconsider its place in the structure locally, merging it with the "dominant" neighbouring community and thereby avoiding a global re-identification of the communities. To assess the quality of the returned community structure, we compare Dyci with a genetic algorithm on real-world data from the ANR Info-RSN project. The comparison shows that Dyci offers a good trade-off between solution quality and computation time. Finally, the dynamic changes that occur in the underlying graph structure can be visualized with NLCOMS, which combines physical and axial time for this purpose.
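The greedy selection of triangles mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's own code: it assumes that "pairwise disjoint" means vertex-disjoint, that a triangle's weight is the sum of its three edge weights, and that heavier triangles are preferred. The function name `greedy_triangle_packing` and the input format are hypothetical.

```python
from itertools import combinations


def greedy_triangle_packing(weights):
    """Greedily pick vertex-disjoint triangles, heaviest first.

    `weights` maps frozenset({u, v}) to the weight of edge (u, v).
    Assumption: a triangle's weight is the sum of its edge weights.
    """
    # Build an adjacency map from the weighted edge list.
    adj = {}
    for edge in weights:
        u, v = tuple(edge)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Enumerate every triangle by brute force (fine for a small sketch).
    triangles = []
    for u, v, w in combinations(sorted(adj), 3):
        if v in adj[u] and w in adj[u] and w in adj[v]:
            total = (weights[frozenset((u, v))]
                     + weights[frozenset((u, w))]
                     + weights[frozenset((v, w))])
            triangles.append((total, (u, v, w)))

    # Heaviest triangles first; ties broken by node labels for determinism.
    triangles.sort(key=lambda t: (-t[0], t[1]))

    used, packing = set(), []
    for _, tri in triangles:
        if used.isdisjoint(tri):  # keep only triangles sharing no vertex
            packing.append(tri)
            used.update(tri)
    return packing


example = {frozenset(e): w for e, w in {
    (0, 1): 1, (0, 2): 1, (1, 2): 1,
    (1, 3): 5, (2, 3): 5,
    (3, 4): 2, (3, 5): 2, (4, 5): 2,
}.items()}
print(greedy_triangle_packing(example))  # prints [(1, 2, 3)]
```

In this toy graph the heavy triangle (1, 2, 3) blocks both of its lighter neighbours, which shows why a greedy search is only a heuristic and why the abstract also lists exact methods such as branch-and-bound.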
|
74 |
Classificação e previsão de séries temporais através de redes complexas / Time series trend classification and forecasting using complex network analysis
Anghinoni, Leandro, 06 November 2018
Extracting knowledge from time series has been growing in importance and complexity over the last decade as the amount of stored data has increased exponentially. In this scenario, new data mining techniques have been continuously developed. In this work, we propose to study time series based on their topological characteristics, observed in a complex network generated from the time series data. Specifically, the aim of the proposed model is to create a trend detection algorithm for stochastic time series based on community detection and network metrics. The proposed model has some advantages over traditional time series analysis, such as an adaptive number of classes with measurable strength and better noise absorption. Experimental results on artificial and real datasets show that the proposed method is able to classify time series into local and global patterns, improving the predictability of the series when machine learning methods are used to predict the classes.
|
76 |
Time series data mining using complex networks / Mineração de dados em séries temporais usando redes complexas
Ferreira, Leonardo Nascimento, 15 September 2017
A time series is a time-ordered dataset. Because time series are ubiquitous, their analysis is of interest to many scientific fields. Time series data mining is a research area that aims to extract information from such time-related data. To this end, different models are used to describe series and search for patterns. One approach to modeling temporal data is to use complex networks: temporal data are mapped to a topological space, which allows the data to be explored with network techniques. In this thesis, we present solutions for time series data mining tasks using complex networks. The primary goal was to evaluate the benefits of using network theory to extract information from temporal data. We focused on three mining tasks. (1) In the clustering task, we represented every time series by a vertex and connected vertices that represent similar time series. We used community detection algorithms to cluster similar series. Results show that this approach performs better than traditional clustering methods. (2) In the classification task, we mapped every labeled time series in a database to a visibility graph. We performed classification by transforming an unlabeled time series into a visibility graph and comparing it to the labeled graphs using a distance function. The new label is the most frequent label among the k nearest graphs. (3) In the periodicity detection task, we first transform a time series into a visibility graph. Local maxima in a time series are usually mapped to highly connected vertices that link two communities. We used this community structure to propose a periodicity detection algorithm for time series. The method is robust to noisy data and does not require parameters. With the methods and results presented in this thesis, we conclude that network science is beneficial to time series data mining. Moreover, this approach can provide better results than traditional methods. It is a new way of extracting information from time series and can be easily extended to other tasks.
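The mapping from a time series to a visibility graph can be sketched in a few lines of Python. The abstract does not say which variant is used, so this sketch assumes the natural visibility criterion: two samples are linked when the straight line between them passes strictly above every sample in between. The function names are hypothetical, and the cubic-time loop is chosen for clarity rather than speed.

```python
def visibility_graph(series):
    """Return the edge set of the natural visibility graph of `series`.

    Vertices are sample indices; (a, b) with a < b is an edge when every
    intermediate sample lies strictly below the line from a to b.
    """
    n = len(series)
    edges = set()
    for a in range(n):
        for b in range(a + 1, n):
            visible = True
            for c in range(a + 1, b):
                # Height of the line of sight between a and b at position c.
                line = series[b] + (series[a] - series[b]) * (b - c) / (b - a)
                if series[c] >= line:
                    visible = False
                    break
            if visible:
                edges.add((a, b))
    return edges


def degree_sequence(n, edges):
    """Number of neighbours of each of the n vertices."""
    deg = [0] * n
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


series = [1, 3, 2, 4, 1]
edges = visibility_graph(series)
print(sorted(edges))                        # [(0, 1), (1, 2), (1, 3), (2, 3), (3, 4)]
print(degree_sequence(len(series), edges))  # [1, 3, 2, 3, 1]
```

In this small series the two local maxima (indices 1 and 3) receive the highest degree, which is consistent with the abstract's remark that local maxima tend to become highly connected vertices.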
|
77 |
Visualizing media with interactive multiplex networks / Cartographier les médias avec des réseaux multiplexes interactifs
Ren, Haolin, 14 March 2019
Nowadays, information follows complex paths: its propagation involves online publishers, 24-hour news channels, and social media, and follows intertwined paths that can affect both content and its perception. This thesis studies the adaptation of classical graph measures to multiplex graphs in relation to the domain under study, and proposes to build visualizations from several graphical representations of the networks and to combine them (synchronized multi-view visualizations, hybrid representations, etc.). Emphasis is placed on modes of interaction that take into account the multiplex (multilayer) nature of the networks. These representations and interactive manipulations also rely on the computation of indicators specific to multiplex networks. The work is based on two main datasets: one is a 12-year archive of the Japanese public daily broadcast NHK News 7, from 2001 to 2013; the other lists the participants in French TV and radio shows between 2010 and 2015. Two visualization systems with a Web interface, which we call "Visual Cloud" and "Laputa", have been developed for multiplex network analysis. In Visual Cloud, we formally define a notion of similarity between concepts and groups of concepts that we call co-occurrence possibility (CP). Based on this definition, we propose a hierarchical classification algorithm. We aggregate the layers in a multiplex network of documents and integrate the resulting hierarchy into an interactive word cloud. We improve traditional word cloud layout algorithms so as to preserve the constraints of the concept hierarchy. The Laputa system is intended for the complex analysis of dense, multidimensional temporal networks. To do this, it associates a graph with a segmentation. Segmentation by community, by attribute, or by time slice forms views of this graph. To relate these views to the global whole, we use Sankey diagrams, which we have augmented with a semantic zoom, to reveal the evolution of the communities. This thesis thus covers three of the most interesting aspects (the 3 Vs) of data mining and Big Data applied to multimedia archives: Volume, since our archives are immense and reach orders of magnitude that are not practicable for visualizing and exploiting links; Velocity, because of the inherently temporal nature of our data; and Variety, a corollary of the richness of multimedia data and of everything one may wish to investigate in it. What can be retained from this thesis is that each of these three challenges was answered by a multiplex network analysis. These structures are always at the heart of our work, whether more discreetly in the criteria for filtering edges with the Simmelian backbone algorithm, in the superposition of time slices, or much more directly in the combination of visual and textual semantic indices from which we extract the hierarchies that enable our visualization.
|
78 |
Big Networks: Analysis and Optimal Control
Nguyen, Hung The, 01 January 2018
The study of networks has seen tremendous growth in research, driven by the wide spectrum of practical problems in which networks are central. These problems range from detecting functionally correlated proteins in biology to finding the people to whom discounts should be offered in order to maximize the popularity of a product in economics. Thus, understanding, and further being able to manipulate and control, the development and evolution of networks have become critical tasks for network scientists. Despite the vast research effort put into these studies, present state-of-the-art methods largely either lack high-quality solutions or require an excessive amount of time at real-world "Big Data" scale.
This research aims to boost the efficiency of modern algorithms to meet practical requirements, that is, to develop a ground-breaking class of algorithms that simultaneously provide provably good solution quality and low time and space complexity. Specifically, I target important yet challenging problems in three main areas:
Information Diffusion: Analyzing and maximizing the influence in networks and extending results for different variations of the problems.
Community Detection: Finding communities from multiple sources of information.
Security and Privacy: Assessing organization vulnerability under targeted cyber attacks via social networks.
|
79 |
Détection et analyse de communautés dans les réseaux / Community detection and analysis in networks
Serrour, Belkacem, 10 December 2010
L'étude de structures de communautés dans les réseaux devient de plus en plus une question importante. La connaissance des modules de base (communautés) des réseaux nous aide à bien comprendre leurs fonctionnements et comportements, et à appréhender les performances de ces systèmes. Une communauté dans un graphe (réseau) est définie comme un ensemble de nœuds qui sont fortement liés entre eux, mais faiblement liés avec le reste du graphe. Les membres de la même communauté partagent les mêmes centres d'intérêt. La plupart des travaux qui existent dans ce domaine se scindent en deux grandes thématiques: la détection de communautés et l'analyse de communautés. La détection de communautés consiste à trouver les communautés dans un réseau donné, sans connaître à priori ni la taille ni le nombre des communautés. La partie analyse de communautés, quant à elle, consiste à étudier les propriétés structurelles et sémantiques des communautés détectées et de celles du réseau étudié. Dans cette thèse, nous nous intéressons à l'étude de structures de communautés dans les réseaux. Nous contribuons dans les deux parties, analyse et détection de communautés. Dans la partie analyse de communautés, nos contributions sont l'étude des communautés dans les réseaux de communication et l'étude des communautés dans les services Web. D'une part, nous étudions l'émergence de communautés dans les réseaux de communication. Nous proposons une classification de structures de communautés émergées dans un réseau de communication donné. Nous modélisons les réseaux par les graphes et nous les caractérisons par un ensemble de paramètres. Nous concluons par une corrélation directe entre le réseau initial et les types de structures de communautés émergées. D'autre part, nous étudions les communautés dans les logs de services Web. 
Nous analysons les historiques d'exécution (les fichiers logs) afin de découvrir les protocoles métiers de services (séquences de messages échangés entre le service et le client pour aboutir à un but donné). Nous modélisons les logs par les graphes, et nous cherchons l'ensemble de conversations (communautés) issues de notre graphe de messages (le graphe de messages est un graphe induit du graphe de logs). Notre contribution dans la partie détection de communautés, est la proposition d'un algorithme de détection de communautés basé sur les motifs utilisant l'optimisation spectrale. Nous définissons une matrice de modularité motif (particulièrement, le triangle), et nous utilisons l'algorithme de décomposition et d'optimisation spectrale pour détecter les communautés basées sur des motifs. Nous montrons l'apport des communautés basées sur les motifs en appliquant notre algorithme sur des réseaux sociaux connus dans la littérature et en comparant les communautés basées sur les motifs trouvées avec les communautés classiques. / The study of the sub-structure of complex networks is of major importance to relate topology and functionality. Understanding the modular units (communities) of graphs is of utmost importance to grasping knowledge about the functionality and performance of such systems. A community is defined as a group of nodes such that connections between the nodes are denser than connections with the rest of the network. Generally, the members of one community share the same interest. Many efforts have been devoted to the analysis of the modular structure of networks. The most of these works are grouped into two parts: community detection and community analysis. Community detection consists on finding communities in networks whithout knowing there size and number. While the community analysis deals the study of the structural and semantic properties of the emerged communities, and the understanding of the functionality and the performance of the network. 
In this thesis, we are interested in the study of community structures in networks. We contribute to both the community analysis and the community detection parts. In the community analysis part, we study the communities of communication networks and the communities in web services. On the one hand, we study community emergence in communication networks. We propose a classification of the community structures that emerge in a given network. We model the networks as graphs and characterize them by several parameters (network size, network density, number of resources in the network, number of providers in the network, etc.). We also give a direct correlation between the network parameters and the emerged community structures. On the other hand, we study the communities in web service logs. We aim to discover the business protocol of services (sequences of messages exchanged between the service and a client to achieve a given goal). We analyze the log files and model them as graphs. In our final tree graph (message graph), the paths represent the conversations (communities). In the community detection part, the main goal of our contribution is to determine communities using triangular motifs as building blocks. We propose an approach for triangle community detection based on modularity optimization using the spectral decomposition and optimization algorithm. The resulting algorithm can efficiently identify the best partition of any given network into communities of triangles, optimizing the corresponding modularity function.
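The triangle-based spectral idea in the abstract above can be illustrated with a small sketch. The abstract does not give the thesis's exact definition of the motif modularity matrix, so this example makes a simplifying assumption: each edge is weighted by the number of triangles it belongs to, and the usual weighted modularity matrix is built from those weights. The function names, the toy graph, and the power-iteration bisection are all hypothetical illustrations, not the author's algorithm.

```python
from itertools import combinations


def triangle_weights(adj):
    """Weight each edge (i, j) by the number of triangles containing it."""
    return {(i, j): len(adj[i] & adj[j])
            for i in adj for j in adj[i] if i < j}


def triangle_modularity_matrix(adj):
    """Assumed stand-in: B = W - k k^T / (2m), with W the triangle weights."""
    nodes = sorted(adj)
    idx = {v: p for p, v in enumerate(nodes)}
    n = len(nodes)
    W = [[0.0] * n for _ in range(n)]
    for (i, j), w in triangle_weights(adj).items():
        W[idx[i]][idx[j]] = W[idx[j]][idx[i]] = float(w)
    k = [sum(row) for row in W]          # triangle-weighted degrees
    two_m = sum(k)
    if two_m == 0:
        raise ValueError("graph contains no triangles")
    B = [[W[a][b] - k[a] * k[b] / two_m for b in range(n)] for a in range(n)]
    return nodes, B, two_m


def spectral_bisect(B, iters=500):
    """Split nodes by the sign of the leading eigenvector of B.

    Power iteration is run on B + shift*I, where the Gershgorin-style
    shift makes every eigenvalue non-negative, so the iteration
    converges to the eigenvector of the largest eigenvalue of B.
    """
    n = len(B)
    shift = max(sum(abs(x) for x in row) for row in B)
    v = [float(p + 1) for p in range(n)]  # arbitrary non-uniform start
    for _ in range(iters):
        w = [sum(B[a][b] * v[b] for b in range(n)) + shift * v[a]
             for a in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return [1 if x >= 0 else -1 for x in v]


def modularity(B, two_m, labels):
    """Q = (1 / 2m) * sum of B[a][b] over pairs in the same group."""
    n = len(B)
    return sum(B[a][b] for a in range(n) for b in range(n)
               if labels[a] == labels[b]) / two_m


def two_cliques():
    """Toy graph: two 4-cliques joined by the single edge (3, 4)."""
    adj = {v: set() for v in range(8)}
    for group in (range(0, 4), range(4, 8)):
        for i, j in combinations(group, 2):
            adj[i].add(j)
            adj[j].add(i)
    adj[3].add(4)
    adj[4].add(3)
    return adj


adj = two_cliques()
nodes, B, two_m = triangle_modularity_matrix(adj)
labels = spectral_bisect(B)
print(dict(zip(nodes, labels)), modularity(B, two_m, labels))
```

In this toy graph the bridging edge lies in no triangle, so its triangle weight is zero and the two cliques separate cleanly. A full method would apply such splits repeatedly and refine them, which this sketch does not attempt.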
|
80 |
Community Detection in Imperfect Networks
Dahlin, Johan (January 2011)
Community detection in networks is an important area of current research with many applications. Finding community structures is a challenging task, and despite significant effort no satisfactory method has been found. Different methods find different communities in the same network and have different computational requirements. To counter this problem, several different methods are often used and the results compared manually. In this thesis, we present three methods that instead merge the results from different methods (or from several runs of the same algorithm) to find better estimates of the community structure. Another problem in practical applications is noisy and imperfect networks with missing and false edges. These imperfections are natural results of the methods used to map the network structure and are often difficult to eliminate. In this thesis, we apply a Monte Carlo sampling method in combination with the introduced methods for merging community detection results to find community structures in such networks. The method is tested by simulation studies on both real-world networks and synthetic networks with generated uncertainties and imperfections. Finally, we demonstrate how confidence levels for the obtained community structure can be generated from the merging methods. This allows for a qualitative comparison of the robustness and significance of the network clustering. / Identifying groupings in networks is an important area of current research with many different applications. Finding groupings is often difficult, and despite considerable effort no satisfactory method has been found. Different methods often find different groupings in the same network and require varying amounts of computing power. To handle these problems, several methods are often used, after which the results are compared manually. 
In this thesis, we present three different methods that instead merge results from different methods (or from several runs of the same algorithm) to find better estimates of the groupings. Another problem in practical applications is noise and incomplete networks with missing and false edges. These deficiencies are natural results of the methods used to map the network structure and are often difficult to eliminate. In this thesis, we use Monte Carlo methods in combination with the introduced methods for merging the groupings found, in order to find groupings in the uncertain network. We test the method through simulation studies on both real and synthetic networks with generated uncertainties and deficiencies. Finally, we demonstrate how confidence levels can be created for nodes in groupings using the merging methods. This enables a qualitative comparison of the stability and significance of the identified network groupings.
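The combination of Monte Carlo sampling and result merging described above can be sketched roughly as follows. The abstract does not specify the three merging methods, so this example substitutes a common consensus device, a co-association score (the fraction of sampled networks in which two nodes land in the same community). The edge probabilities, the function names, and the use of connected components as a stand-in community detector are all assumptions made for illustration only.

```python
import random


def sample_network(edge_probs, rng):
    """Draw one network: keep each edge with its (assumed) probability."""
    return [edge for edge, p in edge_probs.items() if rng.random() < p]


def components(nodes, edges):
    """Stand-in detector: label each node by its connected component."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    return {v: find(v) for v in nodes}


def co_association(nodes, edge_probs, detect=components,
                   n_samples=200, seed=0):
    """Fraction of sampled networks in which each node pair is co-clustered."""
    rng = random.Random(seed)
    counts = {(a, b): 0 for a in nodes for b in nodes if a < b}
    for _ in range(n_samples):
        labels = detect(nodes, sample_network(edge_probs, rng))
        for a, b in counts:
            if labels[a] == labels[b]:
                counts[(a, b)] += 1
    return {pair: c / n_samples for pair, c in counts.items()}


# Hypothetical uncertain network: two certain triangles, one doubtful bridge.
nodes = list(range(6))
edge_probs = {
    (0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0,
    (3, 4): 1.0, (4, 5): 1.0, (3, 5): 1.0,
    (2, 3): 0.2,
}
conf = co_association(nodes, edge_probs)
print(conf[(0, 1)], conf[(0, 5)])
```

Pairs joined only through the uncertain bridge receive a score strictly between 0 and 1, which can be read as a rough confidence that they belong together. Any detector with the same `(nodes, edges) -> labels` shape could be passed as `detect` in place of the connected-components placeholder.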
|