91 |
Community Detection in Imperfect Networks
Dahlin, Johan. January 2011
Community detection in networks is an important area of current research with many applications. Finding community structures is a challenging task, and despite significant effort no satisfactory method has been found. Different methods find different communities in the same network and have different computational requirements. To counter this problem, several methods are often used and the results compared manually. In this thesis, we present three methods that instead merge the results from different methods (or from several runs of the same algorithm) to find better estimates of the community structure. Another problem in practical applications is noisy and imperfect networks with missing and false edges. These imperfections are natural results of the methods used to map the network structure and are often difficult to eliminate. In this thesis, we apply a Monte Carlo sampling method in combination with the introduced merging methods to find community structures in such uncertain networks. The method is tested in simulation studies on both real-world networks and synthetic networks with generated uncertainties and imperfections. Finally, we demonstrate how the merging methods make it possible to generate confidence levels for the nodes in the obtained community structure. This allows a qualitative comparison of the robustness and significance of the network clustering.
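To make the idea of merging several community detection runs more concrete, here is a minimal Python sketch of one common consensus scheme, co-association counting. It is an illustration under stated assumptions and not one of the three merging methods of the thesis, which the abstract does not describe; the function names `co_association` and `merge_partitions` and the 0.5 threshold are hypothetical choices.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of runs in which each node pair lands in the same community."""
    nodes = sorted(partitions[0])
    counts = {pair: 0 for pair in combinations(nodes, 2)}
    for labels in partitions:
        for u, v in counts:
            if labels[u] == labels[v]:
                counts[(u, v)] += 1
    return {pair: c / len(partitions) for pair, c in counts.items()}


def merge_partitions(partitions, threshold=0.5):
    """Group nodes whose co-association exceeds the (assumed) threshold."""
    nodes = sorted(partitions[0])
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (u, v), share in co_association(partitions).items():
        if share > threshold:
            parent[find(u)] = find(v)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return sorted(groups.values(), key=min)


# Three hypothetical runs; each maps node -> community label.
runs = [
    {"a": 0, "b": 0, "c": 1, "d": 1},
    {"a": 0, "b": 0, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 2, "d": 2},
]
consensus = merge_partitions(runs)
shares = co_association(runs)
```

In this sketch the co-association share of a pair can be read as a rough agreement score between runs; whether that corresponds to the confidence levels described in the abstract is an assumption.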
|
92 |
Der Einfluss der Länge von Beobachtungszeiträumen auf die Identifizierung von Subgruppen in Online Communities / The influence of the length of observation periods on the identification of subgroups in online communities
Zeini, Sam; Göhnert, Tilman; Hecking, Tobias; Krempel, Lothar; Hoppe, H. Ulrich. 25 October 2013
The spread of social media, and with it the emergence and growth of communities on the Internet, is leading to an increase in analysable digital traces, many of which are publicly accessible. These traces can be evaluated with various analytical techniques, for example the method of social network analysis [1]. Approaches to community detection are particularly popular; among other things, they make it possible to identify innovative sub-communities and subgroups, for example in large open-source projects [2]. These applications raise new methodological and fundamental questions, including that of the role of time in such analyses. While the representation of dynamic effects (e.g. through animations) contains time as an explicit parameter, the choice of the time intervals over which data are aggregated to obtain networks enters the premises of the procedure only implicitly. In contrast to the analysis of dynamics, these effects have hardly been studied so far. In social network analysis, the target representation itself no longer carries time but is, so to speak, a “static snapshot”, so that time-dependent interaction patterns, for instance, cannot be recognised.
(...)
|
93 |
Detection of malicious user communities in data networks
Moghaddam, Amir. 04 April 2011
Malicious users in data networks may form social interactions to create communities in abnormal fashions that deviate from the communication standards of a network. As a community, these users may perform many illegal tasks such as spamming, denial-of-service attacks, spreading confidential information, or sharing illegal content. They may use different methods to evade existing security systems, such as session splicing, polymorphic shell code, changing port numbers, and basic string manipulation. One way to disguise the traffic is to change the data rate patterns or to use very low (trickle) data rates for communication; the latter is the focus of this research. Network administrators consider these communities of users a serious threat.
In this research, we propose a framework that not only detects abnormal data rate patterns in a stream of traffic by using a type of neural network, the Self-Organizing Map (SOM), but also detects and reveals the community structure of these users for further decisions. Through a set of comprehensive simulations, we show that the suggested framework is able to detect these malicious user communities with low false negative and false positive rates.
We further discuss ways of improving the performance of the neural network by studying the size of the SOMs.
|
94 |
Détection de communautés recouvrantes dans des réseaux de terrain dynamiques / Overlapping community detection in dynamic networks
Wang, Qinna. 12 April 2012
In complex networks, the notion of community structure refers to the presence of groups of nodes that are more densely connected internally than with the rest of the network. The presence of communities gives insight into the structural properties of a network. For example, a community of web pages groups pages dealing with the same topic, and in social networks communities are based on common interests, location, or hobbies. Generally, a community structure is described by a partition of the network nodes, where each node belongs to a unique community. A more reasonable description seems to be an overlapping community structure, where nodes may be shared by several communities. Moreover, when considering dynamic networks in which interactions between nodes evolve over time, it appears crucial to also consider the evolution of the intrinsic community structure. This thesis focuses on mining dynamic community evolution and on overlapping community detection. We propose two distinct methods for overlapping community detection: the first is named clique optimization and the second fuzzy detection. Clique optimization aims to identify granular overlaps and is a fine-grained approach. Fuzzy detection works at a coarser grain and aims to identify modular overlaps. Their application to synthetic and real networks indicates that both methods can be used to characterize overlapping nodes, but from distinct and complementary points of view. We also propose a definition of predecessor and successor for mining community evolution. This definition describes the relationship between communities at consecutive time steps. We use it to detect community evolution in dynamic networks and to show how modular overlaps evolve over time. A visualization tool called lineage diagrams shows community evolution by connecting communities that stand in a predecessor-successor relationship. Several concrete cases are studied.
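The abstract does not state how predecessor and successor are defined. As a rough illustration of how communities at consecutive time steps might be linked into a lineage diagram, the Python sketch below connects two communities when their Jaccard similarity reaches a threshold. The Jaccard criterion, the `lineage_edges` name, and the 0.3 threshold are assumptions made for this example, not the thesis's definition.

```python
def jaccard(a, b):
    """Jaccard similarity of two node sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def lineage_edges(earlier, later, min_similarity=0.3):
    """Link communities of two consecutive snapshots.

    Returns (index_in_earlier, index_in_later, similarity) triples,
    i.e. candidate predecessor -> successor edges of a lineage diagram.
    """
    edges = []
    for i, old in enumerate(earlier):
        for j, new in enumerate(later):
            score = jaccard(old, new)
            if score >= min_similarity:
                edges.append((i, j, round(score, 2)))
    return edges


# Hypothetical snapshots; communities may overlap (nodes 3 and 4 at t1).
t0 = [{1, 2, 3, 4}, {5, 6, 7}]
t1 = [{1, 2, 3}, {4, 5, 6, 7}, {3, 4}]
links = lineage_edges(t0, t1)
```

Because each community is a plain set, a node may appear in several communities of the same snapshot, which is what allows overlapping structures to be tracked with the same code.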
|
95 |
Tools for Understanding the Dynamics of Social Networks / Des Outils pour Comprendre les Dynamiques des Réseaux Sociaux
Morini, Matteo. 29 September 2017
This thesis provides the reader with a compendium of applications of network theory; tailor-made tools suited to the purpose have been devised and implemented in a data-driven fashion. In the first part, a novel centrality metric, named “bridgeness”, is presented, based on a decomposition of the standard betweenness centrality. One component, local connectivity, roughly corresponding to the degree of a node, is set apart from the other, which evaluates longer-range structural properties. Indeed, the latter provides a measure of the relevance of each node in “bridging” weakly connected parts of a network; a prominent feature of the metric is its agnosticism with regard to any underlying ground-truth community structure. A second application aims to describe dynamic features of temporal graphs that are apparent at the mesoscopic level. The dataset of choice includes 40 years of selected scientific publications. The appearance and evolution over time of a specific field of study (“wavelets”) is captured, discriminating persistent features from transient artifacts, which result from the intrinsically noisy community detection process performed independently on successive static snapshots. The concept of “laminar stream”, on which the “complexity score” we seek to optimize is based, is introduced. In a similar vein, a network of Japanese investors has been constructed, based on a dataset that includes (indirect) information on co-owned overseas subsidiaries. A hotly debated issue in the field of industrial economics, the Miwa-Ramseyer hypothesis, has been conclusively shown to be false, at least in its strong form.
|
96 |
Técnica de agrupamento de dados baseada em redes complexas para o posicionamento de cluster heads em rede de sensores sem fio / A clustering technique based on community detection for deployment of cluster head nodes
Leonardo Nascimento Ferreira. 19 October 2012
Wireless Sensor Networks are a special kind of ad hoc network deployed in a region to monitor physical phenomena. Because individual nodes have low dependability and small radio coverage, it is common to use a large number of sensors to monitor a large area. A common problem in this sort of network is to guarantee that as much of the captured data as possible is successfully collected and transmitted to a base station, where it can be analysed by users. One approach to this problem uses special sensors called cluster heads. These sensors are positioned strategically to collect data from a group of common sensors and transmit it to the base station. Thus, it is necessary to cluster these sensors. Here we propose a hybrid clustering algorithm based on community detection in complex networks and the traditional K-means clustering technique: the QK-Means algorithm. The algorithm tries to exploit the advantages of both approaches in two steps. First, the network is broken into communities using a community detection technique. Then these communities are broken into subcommunities that the cluster heads can manage. Simulation results show that QK-Means can decrease the rate of lost messages in the network while using fewer cluster heads than traditional clustering algorithms for wireless sensor networks.
|
97 |
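Decomposição baseada em modelo de problemas de otimização de projeto utilizando redução de dimensionalidade e redes complexas / Model-based decomposition of design optimization problems using dimensionality reduction and complex networks
Cardoso, Alexandre Cançado. 16 September 2016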
Submitted by Renata Lopes (renatasil82@gmail.com) on 2017-03-07T15:01:41Z / Approved for entry into archive by Adriana Oliveira (adriana.oliveira@ufjf.edu.br) on 2017-03-07T15:07:15Z (GMT) / Made available in DSpace on 2017-03-07T15:07:15Z (GMT)
No. of bitstreams: 1
alexandrecancadocardoso.pdf: 3207141 bytes, checksum: 46de44194b8a9a99093ecb73f332eacd (MD5)
Previous issue date: 2016-09-16 / The divide-and-conquer strategy is common to many fields of activity, ranging from algorithm design to politics and sociology. In engineering, it is used, among other applications, to assist in solving general design problems or optimal design problems for large, complex, or multidisciplinary systems. This work presents a method for decomposing these problems into smaller sub-problems using only information from their model (model-based decomposition). The patterns of relationships between variables, functions, simulations, and other model elements are extracted by unsupervised learning algorithms in two steps. First, the dimensional space is reduced in order to highlight the most significant relationships; then community detection, a technique from the field of complex networks, or clustering techniques are used to identify the sub-problems. Finally, the method is applied to design optimization problems found in the structural and mechanical engineering literature. The resulting sub-problems are evaluated against comparative and qualitative criteria.
|
98 |
Réseaux et signal : des outils de traitement du signal pour l'analyse des réseaux / Networks and signal: signal processing tools for network analysis
Tremblay, Nicolas. 09 October 2014
This thesis describes new tools specifically designed for the analysis of networks such as social, transportation, neuronal, protein, and telecommunication networks. With the rapid expansion of electronic, IT, and mobile technologies, these networks are increasingly monitored and measured. Analysis tools are therefore very much in demand; they need to be generic enough to apply to networks of very different natures, powerful enough to handle their large size, and relevant enough to extract useful information from them. To this end, a large community of researchers from various disciplines has concentrated its efforts on the analysis of graphs, well-defined mathematical tools that model the relational structure of the objects in a network. Among the directions of research considered, graph signal processing brings a new and promising perspective: a signal is no longer defined on a regular n-dimensional topology, as in classical signal processing, but on a particular topology defined by the graph. Applying these new ideas to practical problems of network analysis paves the way to an analysis firmly rooted in signal processing theory. It is precisely this frontier between signal processing and network science that we explore throughout this thesis, as illustrated by its two main contributions. First, a multiscale version of community detection in networks is proposed, based on the recent definition of graph wavelets. Then, inspired by the classical concept of the bootstrap, a graph resampling method is introduced for the purpose of statistical estimation.
|
99 |
Détection et analyse des communautés dans les réseaux sociaux : approche basée sur l'analyse formelle de concepts / Community detection and analysis in social networks: approach based on formal concept analysis
Selmane, Sid Ali. 11 May 2015
The study of community structure in networks has become an increasingly important issue. Knowledge of the basic modules (communities) of networks helps us understand how they function and behave, and assess the performance of these systems. A community in a graph (network) is defined as a set of nodes that are strongly linked to each other but weakly linked to the rest of the graph. Members of the same community share the same interests. The originality of our research lies in showing that formal concept analysis is relevant for community detection, in contrast to conventional approaches that use graphs. We studied several problems raised by community detection in social networks: (1) the evaluation of community detection methods proposed in the literature, (2) the detection of disjoint and overlapping communities, and (3) the modelling and analysis of social networks built from three-dimensional data. To evaluate the community detection methods proposed in the literature, we first studied the state of the art, which allowed us to present a classification of community detection methods and to evaluate each of the best-known methods. For the second part, we developed an approach for detecting disjoint and overlapping communities in homogeneous social networks derived from adjacency matrices (so-called one-mode or one-dimensional data) by exploiting techniques from formal concept analysis. We also paid special attention to methods for modelling heterogeneous social networks. We focused in particular on three-dimensional data and proposed an approach for modelling and analysing social networks derived from such data. This approach rests on a methodological framework designed to capture the three-dimensional aspect of the data. In addition, the analysis concerns the discovery of communities and of hidden relationships between the different types of individuals in these networks. The main idea lies in extracting communities and triadic association rules from these heterogeneous networks in order to simplify the process and reduce its algorithmic complexity. The results will then be used in an application that recommends links and content to individuals in a social network.
|
100 |
Fast Identification of Structured P2P Botnets Using Community Detection Algorithms
Venkatesh, Bharath. January 2013
Botnets are a global problem, and effective botnet detection requires cooperation of large Internet Service Providers, allowing near global visibility of traffic that can be exploited to detect them. The global visibility comes with huge challenges, especially in the amount of data that has to be analysed. To handle such large volumes of data, a robust and effective detection method is the need of the hour and it must rely primarily on a reduced or abstracted form of data such as a graph of hosts, with the presence of an edge between two hosts if there is any data communication between them. Such an abstraction would be easy to construct and store, as very little of the packet needs to be looked at.
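The host-graph abstraction described above can be sketched in a few lines of Python. This is a minimal illustration, assuming that traffic has already been reduced to (source, destination) address pairs; the `build_host_graph` name and the sample addresses are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def build_host_graph(flows):
    """Build an undirected host graph as adjacency sets.

    An edge exists between two hosts if any flow was seen between them,
    regardless of direction, volume, or payload.
    """
    graph = defaultdict(set)
    for src, dst in flows:
        if src != dst:  # ignore self-communication
            graph[src].add(dst)
            graph[dst].add(src)
    return dict(graph)


# Hypothetical flow records: only the endpoint addresses are kept.
flows = [
    ("10.0.0.1", "10.0.0.2"),
    ("10.0.0.2", "10.0.0.1"),
    ("10.0.0.2", "10.0.0.3"),
    ("10.0.0.1", "10.0.0.2"),
]
graph = build_host_graph(flows)
edge_count = sum(len(neighbours) for neighbours in graph.values()) // 2
```

Repeated and reversed flows collapse into a single edge, which is why only the packet endpoints need to be inspected and why the structure stays small relative to the raw traffic.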
Structured P2P command-and-control mechanisms have been shown to be robust against targeted and random node failures, and are therefore ideal for botmasters to organize and command their botnets effectively. This thesis develops a scalable, efficient and robust algorithm for the detection of structured P2P botnets in large traffic graphs. It draws on advances in the state of the art in community detection, which aims to partition a graph into dense communities.
Popular community detection algorithms with low theoretical time complexity, such as Label Propagation, Infomap and the Louvain method, have been implemented and compared on large LFR benchmark graphs to study their efficiency. The Louvain method is found to be capable of handling graphs of millions of vertices and billions of edges. This thesis analyses the performance of this method with two objective functions, Modularity and Stability, and finds that neither of them is robust and general.
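For readers unfamiliar with the Modularity objective mentioned above, the following Python sketch computes the standard Newman-Girvan modularity of a given partition of an unweighted, undirected graph, Q = Σ_c [L_c/m − (d_c/2m)²], where L_c is the number of edges inside community c, d_c the total degree of its nodes, and m the number of edges. It is an illustrative sketch only: the abstract does not give the exact formulations used in the thesis, and the `modularity` function and the toy graph are assumptions for this example.

```python
def modularity(edges, community_of):
    """Newman-Girvan modularity of a partition of an undirected graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping node -> community label.
    """
    m = len(edges)
    internal = {}  # edges with both endpoints inside the community
    degree = {}    # sum of node degrees per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree.items()
    )


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {n: "A" for n in range(6)}

q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
```

Splitting the toy graph into its two triangles scores higher than placing all nodes in one community, which is the behaviour a greedy optimizer such as the Louvain method exploits when it moves nodes between communities.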
In order to overcome the limitations of these objective functions, a third objective function proposed in the literature is considered. This objective function has previously been used successfully on protein interaction networks, and is used in this thesis to detect structured P2P botnets for the first time. Further, the differences in the topological properties (assortativity and density) of structured P2P botnet communities and benign communities are discussed. In order to exploit these differences, a novel measure based on mean regular degree is proposed, which captures both the assortativity and the density of a graph, and its properties are studied.
This thesis proposes a robust and efficient algorithm that combines greedy community detection with community filtering using the proposed measure, mean regular degree. The proposed algorithm is tested extensively on a large number of datasets; in most cases its performance is comparable to that of an existing botnet detection algorithm, BotGrep, while it is significantly faster.
|