1 |
Generating Random Graphs with Tunable Clustering Coefficient
Parikh, Nidhi Kiranbhai, 29 April 2011
Most real-world networks exhibit a high clustering coefficient, the probability that two neighbors of a node are also neighbors of each other. We propose four algorithms, CONF-1, CONF-2, THROW-1, and THROW-2, that are based on the configuration model. Each takes as input a triangle degree sequence (the number of triangles/corners at each node) and a single-edge degree sequence (the number of single edges/stubs at each node) and generates a random graph with a tunable clustering coefficient. We analyze them theoretically and empirically for the case of a regular graph. CONF-1 and CONF-2 generate a random graph with the degree sequence and the clustering coefficient anticipated from the input triangle and single-edge degree sequences. At each time step, CONF-1 chooses each node for creating triangles or single edges with the same probability, while CONF-2 chooses a node for creating triangles or single edges with a probability proportional to its number of unconnected corners or unconnected stubs, respectively. Experimental results match the anticipated clustering coefficient quite well except for highly dense graphs, for which the experimental clustering coefficient is higher than the anticipated value. THROW-2 chooses three distinct nodes for creating triangles and two distinct nodes for creating single edges, whereas THROW-1 does not require the nodes to be distinct. For THROW-1 and THROW-2, the degree sequence and the clustering coefficient of the generated graph vary from the input. However, the expected degree distribution and the clustering coefficient of the generated graph can be predicted using analytical results. Experiments show that, for THROW-1 and THROW-2, the results match the analytical results quite well. Typically, only information about the degree sequence or degree distribution is available. We also propose an algorithm, DEG, that takes a degree sequence and a clustering coefficient as input and generates a graph with the same properties. Experiments show results for DEG that are quite similar to those for CONF-1 and CONF-2.
Master of Science
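The clustering coefficient described in this abstract (the probability that two neighbors of a node are themselves neighbors) can be illustrated with a short sketch. This is only an illustration of the definition, not the thesis's code; the function name `clustering_coefficient` and the adjacency-set representation are assumptions made for the example.

```python
from itertools import combinations


def clustering_coefficient(adj):
    """Fraction of neighbor pairs, taken over all nodes, that are adjacent.

    `adj` maps each node to the set of its neighbors (undirected, no loops).
    This representation is an assumption made for the illustration.
    """
    closed = 0  # neighbor pairs that are connected (a triangle corner)
    total = 0   # all neighbor pairs (connected triples centered on a node)
    for node, nbrs in adj.items():
        for u, v in combinations(sorted(nbrs), 2):
            total += 1
            if v in adj[u]:
                closed += 1
    return closed / total if total else 0.0


# Triangle 0-1-2 with a pendant node 3 attached to node 2.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(clustering_coefficient(example))  # 3 closed pairs out of 5
```

In the example, nodes 0 and 1 each contribute one closed pair, while node 2 contributes one closed pair out of three, giving 3/5.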
|
2 |
Metaheuristics for the graph clustering problem (original title: Metaheurísticas para o problema de agrupamento de dados em grafo)
Nascimento, Mariá Cristina Vasconcelos, 26 February 2010
Graph clustering aims to identify highly connected groups, or clusters, of nodes in a graph. The problem goes by other names, such as graph partitioning and community detection. Many mathematical formulations model this problem, each with advantages and disadvantages. Most have the disadvantage of requiring the number of clusters in the final partition to be defined in advance. However, this information is not available in data for clustering, that is, in unlabeled data. This is one of the reasons why the measure known as modularity, which is maximized to find graph partitions, has become popular in recent decades. This formulation does not require the number of clusters to be defined in advance, and it produces high-quality partitions. In this thesis, Greedy Randomized Search Procedures metaheuristics are proposed for two existing graph clustering formulations: one for maximizing partition modularity and the other for maximizing intra-cluster similarity. The proposed metaheuristics outperformed other heuristics found in the literature. However, their computational cost was high, especially for the modularity maximization metaheuristic. Over the years, studies have revealed that the modularity maximization formulation has some limitations. To provide a competitive alternative to it, this thesis proposes new mathematical formulations for clustering weighted and unweighted graphs that aim to find partitions whose clusters are highly connected. Furthermore, the proposed formulations can produce partitions without the number of clusters being defined in advance. Computational tests with hundreds of weighted graphs confirmed the efficiency of the proposed models. Comparing the partitions from all formulations studied in this thesis, the best results came from one of the newly proposed formulations, which found quite satisfactory partitions, superior to those of the existing formulations, including modularity maximization. The results showed high correlation with the true classification of the simulated and real data, the latter being mostly of biological origin.
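For readers unfamiliar with the modularity measure mentioned in this abstract, the sketch below computes it for an unweighted, undirected graph. It assumes the abstract refers to the standard Newman-Girvan modularity, Q = Σ_c [L_c/m − (d_c/2m)²], where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of c. The function name and the input format are hypothetical, and this is not the thesis's metaheuristic, only the objective such a method would maximize.

```python
def modularity(edges, community):
    """Newman-Girvan modularity of a partition (assumed definition).

    `edges` is a list of undirected edges (u, v) without duplicates;
    `community` maps each node to a community label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    intra = {}   # number of edges with both endpoints in the community
    degree = {}  # total degree of the nodes in the community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


# Two triangles joined by a single edge, split into the two obvious clusters.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(edges, split))  # 2 * (3/7 - (7/14)**2) = 5/14
```

Placing every node in a single community gives Q = 0, which shows why a method that maximizes Q does not need the number of clusters fixed in advance: the measure itself penalizes partitions that are too coarse.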
|
4 |
A pollination network of Cornus florida
Lee, James H, 01 January 2014
Using the agent-based, correlated random walk model presented, we observe the effects of varying four parameters on the distribution of pollen within a population of Cornus florida (flowering dogwood): the maximum insect turning area, 𝛿max; the density of trees, ω; the maximum pollen carryover, 𝜅max; and the probability of fertilization, P𝜅. We see that varying 𝛿max and 𝜅max changes the dispersal distance of pollen, which greatly affects many measures of connectivity. The clustering coefficient of fathers is maximized when 𝛿max is between 60° and 90°. Varying ω does not have a major effect on the clustering coefficient of fathers, but it does have a greater effect on other measures of genetic diversity. Lastly, we compare our simulations of randomly placed trees with simulations that use the actual tree placement of C. florida at the VCU Rice Center, and conclude that truly understanding how pollen is distributed within a specific ecosystem requires specific descriptions of tree locations.
|
5 |
Enumerating Approximate Maximal Cliques in a Distributed Framework
Dhanasetty, Abhishek, 05 October 2021
No description available.
|
6 |
Nested (2,r)-regular graphs and their network properties
Brooks, Josh Daniel, 15 August 2012
A graph $G$ is a $(t, r)$-regular graph if every collection of $t$ independent vertices is collectively adjacent to exactly $r$ vertices. If a graph $G$ is $(2, r)$-regular and $n$ is sufficiently large, then $G$ is isomorphic to $K_s + mK_p$, where $p$, $s$, and $m$ are positive integers, $m \geq 2$, and $2(p-1) + s = r$. A nested $(2, r)$-regular graph is constructed by replacing selected cliques with a $(2, r)$-regular graph and joining the vertices of the peripheral cliques. For example, in a nested 's' graph when $n = s + mp$, we obtain $n = s_1 + m_1 p_1 + mp$. The nested 's' graph is now of the form $G_s = K_{s_1} + m_1 K_{p_1} + mK_p$. We examine network properties of these nested graphs, such as the average path length, the clustering coefficient, and the spectrum.
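The relation 2(p-1)+s = r stated in this abstract can be checked numerically on a small case. The sketch below assumes that "+" denotes the graph join (every vertex of the clique K_s is adjacent to every vertex of the m disjoint copies of K_p), which is the usual reading of this notation. The function names are hypothetical.

```python
from itertools import combinations


def join_of_cliques(s, m, p):
    """Adjacency sets of K_s + mK_p, reading '+' as graph join (an assumption)."""
    n = s + m * p
    adj = {v: set() for v in range(n)}
    core = range(s)
    for u, v in combinations(core, 2):      # edges inside K_s
        adj[u].add(v)
        adj[v].add(u)
    for i in range(m):
        block = range(s + i * p, s + (i + 1) * p)
        for u, v in combinations(block, 2):  # edges inside the i-th copy of K_p
            adj[u].add(v)
            adj[v].add(u)
        for u in block:                      # join: connect the copy to all of K_s
            for c in core:
                adj[u].add(c)
                adj[c].add(u)
    return adj


def is_2r_regular(adj, r):
    """True if every pair of non-adjacent vertices has exactly r neighbors in total."""
    for u, v in combinations(adj, 2):
        if v not in adj[u] and len(adj[u] | adj[v]) != r:
            return False
    return True


graph = join_of_cliques(s=2, m=3, p=4)
print(is_2r_regular(graph, r=2 * (4 - 1) + 2))  # r = 8
```

Two independent vertices must lie in different copies of K_p; each sees the p-1 other vertices of its own copy plus all s vertices of K_s, so together they are adjacent to 2(p-1)+s vertices.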
|
7 |
Models of Information Diffusion and The Role of Influence
Don Dimungu Arachchige, Chathura JJ, 01 January 2024
Information diffusion is significant in fields such as propagation prediction and influence maximization, with applications in viral marketing and rumor control. Despite conceptual differences, existing diffusion models may not represent identical underlying generative structures. A classification of diffusion of information models is developed based on infection requirements and stochasticity. The study involves analyzing seven existing DOI models on directed scale-free networks. The distinctive properties of each model are identified through simulations and analysis of experimental results. Our analysis reveals that similarity in conceptual design does not imply similarity in behavior concerning speed, the final state of nodes and edges, and sensitivity to parameters. Therefore, we highlight the importance of considering the unique behavioral characteristics of each model when selecting a suitable information diffusion model for a particular application. We further investigate how the network structure and clustering affect the diffusion of information. Our findings reveal that clustering does not consistently accelerate the spread of information. Instead, the extent to which clustering facilitates the dissemination of information is influenced by the interplay between the specific network structure types and the information diffusion model employed. Another significant aspect of information diffusion is the effect of influential nodes. Identifying highly influential nodes is of great interest for strategic targeting in various applications such as viral marketing and information campaigns. Our follow-up study aims to identify influential nodes using a transfer entropy-based method. In this work, we use our method to identify influential users in Twitter data and compare the results against other existing methods. Finally, we developed a methodology based on Transfer Entropy to evaluate influence in the context of information diffusion. 
This methodology demonstrated its superiority in predicting user adoption against retweet-based metrics, marking it as a direct and reliable metric for understanding influential users and information diffusion trends.
|
8 |
The centrality metric in large scale-free networks (original title: Η παράμετρος της κεντρικότητας σε ανεξάρτητα κλίμακας μεγάλα δίκτυα)
Γεωργιάδης, Γιώργος, 16 May 2007
A trend of recent years is the study of large networks that exhibit a hierarchical structure independent of scale (large scale-free networks). A traditional way to model networks is to use graphs and results from graph theory. However, the classical models studied treat the probability that two vertices are connected as equal for all pairs of vertices. This modelling fails to describe many everyday networks, such as acquaintance networks, where vertices represent people and are joined by an edge if they know each other directly. In such a network, two friends of a given person are expected to be more likely to know each other than two randomly chosen strangers. This phenomenon is called clustering and is characteristic of these networks. Many networks found in nature, as well as a great many man-made networks, fall into this category. Examples include protein networks, food-chain networks, networks of epidemic disease spread, power grids, computer networks, networks of World Wide Web pages, acquaintance networks, and citation networks. Although they touch on many sciences, such as physics, biology, sociology, and computer science, they have not been widely studied, because until recently truly large networks were not available for experimental study (a gap filled by the growth of the World Wide Web). To date, not all of the features and quantities that characterize these networks, and on which research should focus, have been clarified, although some progress has been made. One such concept, which can be expressed through many measures, is the centrality of a vertex in the network. The usefulness of such a measure, if it can be defined, is evident, for example in the area of deliberate "attacks" on such a network (e.g., a computer network). However, the exact relationship between centrality and the network's other characteristic quantities, such as clustering, is not known. The aim of this thesis is to explore the concept of centrality in depth, using deliberate attacks on scale-free networks as its experimental setting. In this context, we briefly survey the network models proposed to date and analyze centrality through its traditional definitions from sociology. We then propose a series of centrality definitions that link it to network quantities such as the clustering coefficient. The suitability of these definitions is verified in practice by experimentally simulating attacks on large scale-free networks with attack strategies based on them.
|
9 |
An Evolutionary Analysis of the Internet Autonomous System Network
Stewart, Craig R., 22 June 2010
No description available.
|
10 |
CONTROLLING THE AVERAGE DEGREE IN BUILDING COMPLEX NETWORKS (original title: CONTROLANDO O GRAU MÉDIO NA CONSTRUÇÃO DE REDES COMPLEXAS)
JUDSON DE OLIVEIRA MOURA, 18 January 2022
The construction of complex networks is of great importance for the study of agent-based models and dynamical systems, such as opinion models, epidemic models, and systems of oscillators or coupled maps, which use graphs as the substrate for the interactions between the elements of the system. These dynamics depend strongly on the topological characteristics of the interaction network, so it is essential to build networks with well-defined structural properties. One property of great importance is the average degree, the first moment of the degree distribution. When the degree distribution decays as a power law, its exponent is another relevant quantity, related to the possibility of having highly connected vertices (hubs). In addition, correlations should be avoided. Within this framework, we study the effects that certain characteristics of the degree distribution have on the properties of networks built using the configuration model. For each value of the power-law exponent, we fix the minimum, maximum, and average degrees and compare the effect of these parameters on the resulting networks through the clustering coefficient and the degree-degree correlation between neighboring sites.
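As a rough illustration of the setting described in this abstract, the sketch below draws degrees from a power law truncated to [k_min, k_max] and wires them with a configuration model. It is a minimal sketch under stated assumptions, not the thesis's procedure: it does not fix the average degree, and it simply discards self-loops and repeated edges (the "erased" variant), which is a choice made here for simplicity. All function and parameter names are hypothetical.

```python
import random


def power_law_degrees(n, gamma, k_min, k_max, rng):
    """Sample n degrees with P(k) proportional to k**(-gamma) on [k_min, k_max]."""
    ks = list(range(k_min, k_max + 1))
    weights = [k ** -gamma for k in ks]
    degrees = rng.choices(ks, weights=weights, k=n)
    if sum(degrees) % 2 == 1:
        # Stubs are paired two at a time, so the total must be even.
        if k_min == k_max:
            raise ValueError("cannot make the stub count even with one allowed degree")
        # Nudge one degree by one, staying inside [k_min, k_max].
        degrees[0] += 1 if degrees[0] < k_max else -1
    return degrees


def configuration_model(degrees, rng):
    """Pair stubs uniformly at random; drop self-loops and repeated edges."""
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    edges = set()
    for u, v in zip(stubs[::2], stubs[1::2]):
        if u != v:
            edges.add((min(u, v), max(u, v)))
    return edges


rng = random.Random(0)
degrees = power_law_degrees(n=200, gamma=2.5, k_min=2, k_max=20, rng=rng)
edges = configuration_model(degrees, rng)
print(len(edges), 2 * len(edges) / len(degrees))  # edge count, realized mean degree
```

Because discarded stubs are not rewired, the realized mean degree can fall slightly below the target; a study that needs exact control of the average degree would have to handle this differently.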
|