71 |
Bipartitní grafy pro analýzu mikrobiomů / Bipartite graphs for microbiome analysisŠafárová, Marcela January 2017 (has links)
Microorganisms are all around us. Some of them even live in our body and are essential for our healthy being. Study of microbial communities based on their genetic content has become very popular with the development of new technologies, which enable easy reading of DNA or RNA. The key role of these studies is usually to characterize significant microbial patterns of an environment. However, currently used visualization tools have many drawbacks for such analyses. The subject of this thesis is to design a R/Bioconductor package for simple creation of bipartite graphs from microbial data. This type of visualization brings many advantages for microbiome analysis. Benefits of bipartite graphs are further demonstrated by analysis of main parameters affecting computer processing of microbial data.
|
72 |
Dynamické sociální sítě a jejich analýza / Dynamic Social Networks and their AnalysisHudeček, Ján January 2021 (has links)
For a long time, there has been little research on dynamic social networks. However, in recent years, there has been much more focus on this field and many techniques for analyzing temporal aspects of social networks were proposed. In this work, we studied a dynamic social network based on data retrieved from the Commercial Register. This registry contains information about all economic entities that operate in the Czech Republic, including people who hold functions in entities and their addresses of living. We applied several data analysis techniques including community tracing, clustering, and methods for identifying key actors to find important entities and individuals in the social network and inspect their changes over time. 1
|
73 |
Improving the speed and quality of an Adverse Event cluster analysis with Stepwise Expectation Maximization and Community DetectionErlanson, Nils January 2020 (has links)
Adverse drug reactions are unwanted effects alongside the intended benefit of a drug and might be responsible for 3-7\% of hospitalizations. Finding such reactions is partly done by analysing individual case safety reports (ICSR) of adverse events. The reports consist of categorical terms that describe the event.Data-driven identification of suspected adverse drug reactions using this data typically considers single adverse event terms, one at a time. This single term approach narrows the identification of reports and information in the reports is ignored during the search. If one instead assumes that each report is connected to a topic, then by creating a cluster of the reports that are connected to the topic more reports would be identified. More context would also be provided by virtue of the topics. This thesis takes place at Uppsala Monitoring Centre which has implemented a probabilistic model of how an ICSR, and its topic, is assumed to be generated. The parameters of the model are estimated with expectation maximization (EM), which also assigns the reports to clusters. The clusters are improved with Consensus Clustering that identify groups of reports that tend to be grouped together by several runs of EM. Additionally, in order to not cluster outlying reports all clusters below a certain size are excluded. The objective of the thesis is to improve the algorithm in terms of computational efficiency and quality, as measured by stability and clinical coherence. The convergence of EM is improved using stepwise EM, which resulted in a speed up of at least 1.4, and a decrease of the computational complexity. With all the speed improvements the speed up factor of the entire algorithm can reach 2 but is constrained by the size of the data. In order to improve the clusters' quality, the community detection algorithm Leiden is used. It is able to improve the stability with the added benefit of increasing the number of clustered reports. The clinical coherence score performs worse with Leiden. There are good reasons to further investigate the benefits of Leiden as there were suggestions that community detection identified clusters with greater resolution that still appeared clinically coherent in a posthoc analysis.
|
74 |
Groupes et Communautés dans les flots de liens : des données aux algorithmes / Groups and communities in link streams : from data to algorithmsGaumont, Noé 11 October 2016 (has links)
Les interactions sont partout : il peut s'agir de contacts entre individus, d'emails, d'appels téléphoniques, etc. Toutes ces interactions sont définies par deux entités interagissant sur un intervalle de temps: par exemple, deux individus se rencontrant entre 12h et 14h. Nous modélisons ces interactions par des flots de liens qui sont des ensembles de quadruplets (b, e, u, v), où chaque quadruplet représente un lien entre les noeuds u et v existant durant l'intervalle [b,e]. Dans un graphe, une communauté est un sous-ensemble plus densément connecté qu’une référence. Dans le formalisme de flot de liens, les notions même de densité et de référence sont à définir. Nous étudions donc comment étendre la notion de communauté aux flots de liens. Pour ce faire, nous nous appuyons sur des données réel où une structure communautaire est connue. Puis, nous développons une méthode permettant de trouver automatiquement des sous-flots qui sont jugés pertinents. Ces sous-flots, c’est-à-dire des sous-ensembles de liens, sont trouvés grâce à une méthode de détection de communautés appliquée sur une projection du flot sur un graphe statique. Un sous-flot est jugé pertinent s’il est plus dense que les sous-flots qui lui sont proches temporellement et topologiquement. Ainsi nous approfondissons les notions de voisinage et référence dans les flots de liens. Nous appliquons cette méthode sur plusieurs jeux de données d’interactions réelles et obtenons des groupes pertinents qui n’auraient pas pu être détectés par les méthodes existantes. Enfin, nous abordons la génération de flots de liens avec une structure communautaire donnée et à la manière d'évaluer une telle partition. / Interactions are everywhere: in the contexts of face-to-face contacts, emails, phone calls, IP traffic, etc. In all of them, an interaction is characterized by two entities and a time interval: for instance, two individuals meet from 1pm to 3pm. We model them as link stream which is a set of quadruplets (b,e,u,v) where each quadruplet means that a link exists between u and v from time b to time e. In graphs, a community is a subset which is more densely connected than a reference. Within the link stream formalism, the notion of density and reference have to be redefined. Therefore, we study how to extend the notion of density for link streams. To this end, we use a real data set where a community structure is known. Then, we develop a method that finds automatically substream which are considered relevant. These substream, defined as subsets of links, are discovered by a classical community detection algorithm applied on the link stream the transformed into a static graph. A substream is considered relevant, if it is denser than the substreams which are close temporally and structurally. Thus, we deepen the notion of neighbourhood and reference in link streams. We apply our method on several real world interaction networks and we find relevant substream which would not have been found by existing methods. Finally, we discuss the generation of link streams having a given community structure and also a proper way to evaluate such community structure.
|
75 |
Comparing Communities & User Clusters in Twitter Network DataBhowmik, Kowshik January 2019 (has links)
No description available.
|
76 |
Decentralized Diffusion-Controlled Algorithm for Community Detection : Initialization and Resolution StudyRamirez, Adrian January 2017 (has links)
Community detection in graphs has been an important research topic for many fields. The aim of community detection is to extract from graphs those groups of nodes that present more connections between them than with the rest of the network. Detecting such groups at different scales can help understanding the global behaviour of the system. However, recent studies have shown that realworld graphs follow power-law distributions for degree and community sizes. Specifically, these graphs present many small communities but just a few large ones. This unbalanced community size distribution poses a great challenge for community detection algorithms.Most of the existing methods are based on global approaches that require information about the network to be processed as a whole. Thus, those techniques can not be applied when the graph is too big to fit into one single machine, or in distributed setting when the graph is partitioned among multiple machines. To solve this limitations, a completely decentralized community detection algorithm is presented. It is based on diffusion, following a vertex-centric approach that allows each node to decide the diffusion rates based on local information. It adds as well a mechanism for controlling the diffusion speed through a customizable function.We evaluate the algorithm with a variety of graphs with different levels of imbalance and community structures. Our algorithm is able to detect (almost) perfectly the communities when the imbalance between community sizes is not extreme. We show as well how the sizes of the detected communities can be controlled by the diffusion strategy, allowing for better detection of finer or coarser resolutions in hierarchical graphs. The algorithm is also compared to other two well-known existing methods, achieving similar results in most of the cases though with a higher computation time. / Gemenskap detektering i grafer har varit ett viktigt forsknings ämne förmånga områden. Gemenskapsdetekterings syftet är att extrahera ur grafernade grupper av noder som har mer kopplingar mellan varandra än med restenav nätverket. Att upptäcka sådana grupper i olika skaler kan hjälpa till att förstå systemets globala beteende. Däremot har nyliga studier visat att verkliga grafers grad och gemenskap storlek följer en potenslagen fördelning. Specifikt,dessa grafer uppvisar många små gemenskaper men bara några stora. Denhär obalanserade gemenskaps storleks fördelningen utgör en stor utmaning för gemenskapsdetekterings algoritmer.De flesta av de befintliga metoderna är baserade på globala tillvägagångssätt som kräver att information om nätverket behandlas som helhet. Således kan dessa tekniker inte tillämpas när grafen är för stor för att passa in i en enda maskin, eller på distribuerat sätt när grafen är uppdelad bland flera maskiner. För att lösa dessa begränsningar, uppvisas en helt decentraliserad gemenskapsdetekterings algoritm.Denär baserad pådiffusion som följer en vertex-centrerad tillvägagångssätt.Varje node valder diffusionshastigheten baserad på lokal information. Deninnehåller även en mekanism som kontrollerar diffusionens hastighet genom en anpassningsbar funktion.Vi utvärderar algoritmen genom flera olika grafer med olika nivåer av obalans och gemenskaps strukurer. Vår algoritm kan (nästan) felfritt upptäcka gemenskaper där obalansen mellan dem inte är för stor. Vi visar även hur storlekenpå de hittade gemenskaperna kan kontrolleras genom diffusions strategin, somtillåter bättre uptäckt av finare eller grövre resolution av hierarkiska grafer. Algoritmen kan också jämföras med två befintliga, välkända metoder, vilka ger liknande resultat i de flesta fallen men tar längre tid att genomföra.
|
77 |
Fast Algorithms for Large-Scale Network AnalyticsSariyuce, Ahmet Erdem 29 May 2015 (has links)
No description available.
|
78 |
Détection de communautés dans les réseaux d'information utilisant liens et attributs / Community detection in information networks using links and attributesCombe, David 15 October 2013 (has links)
Alors que les réseaux sociaux s'attachent à représenter des entités et les relations existant entre elles, les réseaux d'information intègrent également des attributs décrivant ces entités ; ce qui conduit à revisiter les méthodes d'analyse et de fouille de ces réseaux. Dans ces travaux, nous proposons des méthodes de classification des entités du réseau d'information qui exploitent d'une part les relations entre celles-ci et d'autre part les attributs les caractérisant. Nous nous penchons sur le cas des réseaux à vecteurs d'attributs, où les entités du réseau sont décrites par des vecteurs numériques. Ainsi nous proposons des approches basées sur des techniques reconnues pour chaque type d'information, faisant appel notamment à l'inertie pour la classification automatique et à la modularité de Newman et Girvan pour la détection de communautés. Nous évaluons nos propositions sur des réseaux issus de données bibliographiques, faisant usage en particulier d'information textuelle. Nous évaluons également nos approches face à diverses évolutions du réseau, notamment au regard d'une détérioration des informations des liens et des attributs, et nous caractérisons la robustesse de nos méthodes à celle-ci / While social networks use to represent entities and relationships between them, information networks also include attributes describing these entities, leading to review the analysis and mining methods for these networks. In this work, we discuss classification of the entities in an information network. Classification operate simultaneously on the relationships and on the attributes characterizing the entities. We look at the case of attributed graphs where entities are described by numerical feature vectors. We propose approaches based on proven classification techniques for each type of information, including the inertia for machine learning and Newman and Girvan's modularity for community detection. We evaluate our proposals on networks from bibliographic data, using textual information. We also evaluate our methods against various changes in the network, such as a deterioration of the relational or vector data, mesuring the robustness of our methods to them
|
79 |
Statistical physics of disordered networks - Spin Glasses on hierarchical lattices and community inference on random graphs / Physique statistique des réseaux désordonnées - Verres de spin sur réseaux hiérarchique et inférence de modules dans les graphes aléatoiresDecelle, Aurélien 11 October 2011 (has links)
Cette thèse aborde des aspects fondamentales et appliquées de la théorie des verres de spin etplus généralement des systèmes complexes. Les premiers modèles théoriques décrivant la transitionvitreuse sont apparues dans les années 1970. Ceux-ci décrivaient les verres à l'aide d'interactionsaléatoires. Il a fallu alors plusieurs années avant qu'une théorie de champs moyen pour ces systèmessoient comprises. De nos jours il existe un grand nombre de modèles tombant dans la classe de« champs moyen » et qui sont bien compris à la fois analytiquement, mais également numériquementgrâce à des outils tels que le monte-carlo ou la méthode de la cavité. Par ailleurs il est bien connu quele groupe de renormalisation a échoué jusque ici à pouvoir prédire le comportement des observablescritiques dans les verres hors champs moyen. Nous avons donc choisi d'étudier des systèmes eninteraction à longue portée dont on ignore encore si la physique est identique à celle du champmoyen. Nous avons montré dans une première partie, la facilité avec laquelle on peut décrire unetransformation du groupe de renormalisation dans les systèmes ferromagnétiques en interaction àlongue portée dé finies sur le réseau hiérarchique de Dyson. Dans un second temps, nous avons portéenotre attention sur des modèles de verre de spin sur ce même réseau. Un début d'analyse sur cestransformations dans l'espace réel est présenté ainsi qu'une comparaison de la mesure de l'exposantcritique nu par différentes méthodes. Si la transformation décrite semble prometteuse il faut cependantnoter que celle-ci doit encore être améliorée afin d'être considérée comme une méthode valide pournotre système. Nous avons continué dans cette même direction en analysant un modèle d'énergiesaléatoires toujours en utilisant la topologie du réseau hiérarchique. Nous avons étudié numériquementce système dans lequel nous avons pu observer l'existence d'une transition de phase de type « criseentropique » tout à fait similaire à celle du REM de Derrida. Toutefois, notre modèle présente desdifférences importantes avec ce dernier telles que le comportement non-analytique de l'entropie à latransition, ainsi que l'émergence de « criticalité » dont la présence serait à confirmer par d'autres études.Nous montrons également à l'aide de notre méthode numérique comment la température critique dece système peut-être estimée de trois façon différentes.Dans une dernière partie nous avons abordé des problèmes liés aux systèmes complexes. Il aété remarqué récemment que les modèles étudiés dans divers domaines, par exemple la physique, labiologie ou l'informatique, étaient très proches les uns des autres. Ceci est particulièrement vrai dansl'optimisation combinatoire qui a en partie été étudiée par des méthodes de physique statistique. Cesméthodes issues de la théories des verres de spin et des verres structuraux ont été très utilisées pourétudier les transitions de phase qui ont lieux dans ces systèmes ainsi que pour inventer de nouveauxalgorithmes pour ces modèles. Nous avons étudié le problème de l'inférence de modules dans lesréseaux à l'aide de ces même méthodes. Nous présentons une analyse sur la détection des modules topologiques dans des réseaux aléatoires et démontrons la présence d'une transition de phase entre une région où ces modules sont indétectables et une région où ils sont détectables. Par ailleurs, nous avons implémenté pour ces problèmes un algorithme utilisant Belief Propagation afin d'inférer les modules ainsi que d'apprendre leurs propriétés en ayant pour unique information la structure du réseau. Finalementnous avons appliqué cet algorithme sur des réseaux construits à partir de données réelles et discutonsles développements à apporter à notre méthode. / This thesis presents fundamental and applied aspects of spin glasses theory and complex systems. The first theoretical models of spin glasses appeared during the 1970. They were modelling glassy systems by using random interactions. It took several years before a mean-field theory of spin glasses was solved and understood. Nowadays there exists many different models falling in the class of mean-field models. They are well-understood analytically but also numerically where many methods exist to analyse them, namely the monte-carlo and the cavity method which are now essential numerical tools to investigate spin glass. At the same time, the renormalisation group technique which has been very useful in the past to analyse second order transition failed in many disordered systems to predict the behaviour of critical observables in non-mean-field spin glasses. We have chosen to study long-range interacting systems in which we don't know if the physics is identical to mean-field models. In a first part, we studied a ferromagnetic model on the Dyson hierarchical lattice. In this system with long-range interaction, we showed that it is easy to find a real-space transformation of the renormalisation group to compute the critical exponents. In a second part we focused on a spin glass model built on the same lattice. We made a first study where a real-space transformation is described for this system and we compare the estimations of the critical exponent nu for this model by different methods. The renormalisation group transformation gives some encouraging results but needs to be improved to become a more reliable method in this system. We have then investigated a model of random energies by using the same hierarchical topology. We studied numerically this system where we observed the existence of a phase transition of the same type as the one present in the REM of Derrida. However our model exhibits many different features compare to the REM. We found a non-analytical behaviour of the entropy at the transition and critical properties such as a diverging length-scale should occur according to our results. This last prediction has to be studied by a more direct measurement. By the numerical method we developed, we estimated the critical temperature using three different observables, all giving the same value. In the last part I turned to problems related to complex systems. It has been noticed recently that models of different fields such as physics, biology or computer science were very close to each other. This is particularly true in combinatorial optimisation problem which has been investigated using method of statistical physics. These techniques coming from the field of spin glasses and structural glasses were used to studied phase transitions in such systems and to invent new algorithms. We studied the problem of inference and learning of modular structure in random graphs by these techniques. We analysed the presence of topological clusters in some particular types of random graphs, and we showed that a phase transition occurred between a region where it is possible to detect clusters and a region where it is impossible. We also implemented a new algorithm using Belief Propagation to learn the properties of these clusters and to infer them in networks. We applied this algorithm to real-graph and discussed further development of this problem.
|
80 |
Técnica de agrupamento de dados baseada em redes complexas para o posicionamento de cluster heads em rede de sensores sem fio / A clustering technique based on community detection for deployment of cluster head nodesFerreira, Leonardo Nascimento 19 October 2012 (has links)
Redes de Sensores Sem Fio são um tipo especial de rede ad-hoc que são posicionadas em uma região para monitorar fenômenos físicos. Considerando que os sensores dessas redes são independentes e possuem um raio de cobertura pequeno, é comum a utilização de um grande número de sensores para monitorar uma área grande. Um problema nesses tipos de redes é garantir que o máximo de dados capturados por esses sensores sejam coletados e transmitidos até uma estação base para que possam ser analisados por usuários. Uma abordagem para resolver esse problema é por meio da utilização de sensores especiais chamados cluster heads. Esses sensores são posicionados estrategicamente para coletar a informação de um grupo de sensores e transmiti-la para a estação base. Assim surge a necessidade de agrupar esses sensores. Nesse trabalho é proposta uma técnica híbrida baseada no algoritmo de agrupamento de dados K-Médias e em detecção comunidades em redes complexas. Esse algoritmo, chamado de QK-Médias, tenta aproveitar as vantagens das duas abordagens em duas etapas. Primeiro a rede é quebrada em comunidades usando uma técnica de detecção de comunidades. Em seguida essas comunidades são quebradas em subcomunidades de tal forma que os cluster heads consigam gerenciar. Os resultados obtidos a partir do agrupamento de sensores utilizando o QK-Médias mostram que é possível diminuir o número de mensagens perdidas na rede utilizando menos cluster heads que algoritmos tradicionais de agrupamento em redes de sensores sem fio / Wireless Sensor Networks are a special kind of ad-hoc network that are deployed in a monitoring field in order to detect some physical phenomenon. Due to the low dependability of individual nodes and small radio coverage, it is common to use a large number of sensors. A common problem in this sort of network is to guarantee that the highst number of captured data was sucessfull broadcast to the base station. One approach to solve this problem use special sensors called cluster heads. These sensors are responsible for collecting data from a group of common sensors and broadcast it to a base station. Thus, it is necessary to cluster these sensors. Here we propose a hybrid clustering algorithm based on community detection in complex networks and traditional K-means clustering technique: the QK-Means algorithm. This new algorithm is composed by two steps. First, the network is broken into communities and then broken into subcommuinties that the cluster heads can deal with. Simulation results show that QK-Means can decrease the rate of lost messages in the network using less cluster heads than tradicional clustering algorithms
|
Page generated in 0.0349 seconds