1 |
IDLE: A Novel Approach to Improving Overlapping Community Detection in Complex Networks
Senthil, Rathna, 18 April 2016
Complex systems in areas such as biology, physics, social science, and technology are extensively modeled as networks due to the rich set of tools available for their study and analysis. In such networks, groups of nodes that correspond to functional units or those that share some common attributes result in densely connected structures called communities. Community formation is an inherent process, and it is not easy to detect these structures because of the complex ways in which components of these systems interact.
Detecting communities in complex networks is important because it helps us understand their internal dynamics, which in turn yields significant insights into the underlying systems. Overlapping communities form when nodes in the network belong to more than one community simultaneously, and it has been shown that most real networks naturally contain such an overlapping community structure. In this thesis, I introduce a new approach to overlapping community detection called IDLE that incorporates ideas from another interesting problem: the identification of influential spreaders. Influential spreaders are nodes that play an important role in the propagation of information or diseases in networks. Research suggests that the nodes in the main core identified by k-core decomposition are the most influential spreaders. In my approach, I use these k-cores as candidate seeds for local community detection. Following a well-defined seed selection process, IDLE builds and prunes the local communities corresponding to the selected seeds. It then augments the resulting local communities and combines them to obtain the global overlapping community structure of the network.
My approach improves on current local community detection techniques, which use either random nodes or maximal k-cliques as seeds and do not focus explicitly on detecting overlapping nodes in the network. Their results therefore leave significant room for improvement in recovering ground-truth overlapping communities. The results of my experiments on real and synthetic networks indicate that IDLE yields better overlapping community detection and thereby better identification of overlapping nodes, which could be important or influential components in the underlying system. / Master of Science
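The k-core decomposition that the abstract names as the source of candidate seeds can be illustrated with a minimal sketch. The code below is not IDLE itself: the abstract does not specify the seed selection, pruning, or augmentation steps, so only the standard peeling procedure for core numbers is shown. The function names `core_numbers` and `main_core` and the toy edge list are assumptions made for illustration.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: core number} for an undirected simple graph.

    Standard peeling: repeatedly remove a node of minimum remaining
    degree; its core number is the largest minimum degree seen so far.
    This simple version is quadratic in the number of nodes.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=lambda n: degree[n])
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


def main_core(edges):
    """Return the nodes with the highest core number (the main core).

    Assumes a non-empty edge list.
    """
    core = core_numbers(edges)
    k_max = max(core.values())
    return {node for node, c in core.items() if c == k_max}


# Hypothetical toy graph: a 4-clique {a, b, c, d} with a tail d-e-f.
toy_edges = [
    ("a", "b"), ("a", "c"), ("a", "d"),
    ("b", "c"), ("b", "d"), ("c", "d"),
    ("d", "e"), ("e", "f"),
]
candidate_seeds = main_core(toy_edges)
```

On this toy graph the clique nodes form the main core, while the tail nodes `e` and `f` have core number 1; in the approach described above, such main-core nodes would be the candidate seeds for local community detection.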
|
2 |
Analyse temporelle et sémantique des réseaux sociaux typés à partir du contenu de sites généré par des utilisateurs sur le Web / Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web
Meng, Zide, 07 November 2016
We propose an approach to detect topics, overlapping communities of interest, expertise, trends and activities in user-generated content sites, and in particular in question-answering forums such as StackOverflow. We first describe QASM (Question & Answer Social Media), a system based on social network analysis to manage the two main resources in question-answering sites: users and content. We also introduce the QASM vocabulary used to formalize both the level of interest and the expertise of users on topics. We then propose an efficient approach to detect communities of interest. It relies on another method to enrich questions with a more general tag when needed. We compare three detection methods on a dataset extracted from the popular Q&A site StackOverflow. Our method, based on topic modeling and user membership assignment, is shown to be much simpler and faster while preserving the quality of the detection. We then propose an additional method to automatically generate a label for a detected topic by analyzing the meaning and links of its bag of words. We conduct a user study to compare different algorithms for choosing the label. Finally, we extend our probabilistic graphical model to jointly model topics, expertise, activities and trends. We perform experiments with real-world data to confirm the effectiveness of our joint model, which integrates users' behaviors and topic dynamics.
|