Spelling suggestions: "subject:"graph 2analysis"" "subject:"graph 3analysis""
1 |
Graph analysis combining numerical, statistical, and streaming techniquesFairbanks, James Paul 27 May 2016 (has links)
Graph analysis uses graph data collected on a physical, biological, or social
phenomena to shed light on the underlying dynamics and behavior of the agents in that system. Many fields contribute to this topic including graph theory, algorithms, statistics, machine learning, and linear algebra. This dissertation advances a novel framework for dynamic graph analysis that combines numerical, statistical, and streaming algorithms to provide deep
understanding into evolving networks. For example, one can be interested in the changing influence structure over time. These disparate techniques each
contribute a fragment to understanding the graph; however, their combination
allows us to understand dynamic behavior and graph structure. Spectral partitioning methods rely on eigenvectors for solving data analysis
problems such as clustering. Eigenvectors of large sparse systems must be approximated with iterative methods. This dissertation analyzes how data analysis accuracy depends on the numerical accuracy of the eigensolver. This leads to new bounds on the residual tolerance necessary to guarantee correct partitioning. We present a novel stopping criterion for spectral partitioning guaranteed to satisfy the Cheeger inequality along with an empirical study of the performance on real world networks such as web, social, and e-commerce networks. This work bridges the gap between numerical analysis and computational data analysis.
|
2 |
Formulation of Hybrid Knowledge-Based/Molecular Mechanics Potentials for Protein Structure Refinement and a Novel Graph Theoretical Protein Structure Comparison and Analysis TechniqueMaus, Aaron 05 August 2019 (has links)
Proteins are the fundamental machinery that enables the functions of life. It is critical to understand them not just for basic biology, but also to enable medical advances. The field of protein structure prediction is concerned with developing computational techniques to predict protein structure and function from a protein’s amino acid sequence, encoded for directly in DNA, alone. Despite much progress since the first computational models in the late 1960’s, techniques for the prediction of protein structure still cannot reliably produce structures of high enough accuracy to enable desired applications such as rational drug design. Protein structure refinement is the process of modifying a predicted model of a protein to bring it closer to its native state. In this dissertation a protein structure refinement technique, that of potential energy minimization using hybrid molecular mechanics/knowledge based potential energy functions is examined in detail. The generation of the knowledge-based component is critically analyzed, and in the end, a potential that is a modest improvement over the original is presented.
This dissertation also examines the task of protein structure comparison. In evaluating various protein structure prediction techniques, it is crucial to be able to compare produced models against known structures to understand how well the technique performs. A novel technique is proposed that allows an in-depth yet intuitive evaluation of the local similarities between protein structures. Based on a graph analysis of pairwise atomic distance similarities, multiple regions of structural similarity can be identified between structures independently of relative orientation. Multidomain structures can be evaluated and this technique can be combined with global measures of similarity such as the global distance test. This method of comparison is expected to have broad applications in rational drug design, the evolutionary study of protein structures, and in the analysis of the protein structure prediction effort.
|
3 |
The role of the psychological contract in affecting employee behaviour under the influence of merger and acquisition: a study of local regional managers in Hong KongFong, Dominic January 2009 (has links)
In past decades, the expectation of synergy has fueled many thousands of mergers and acquisitions. Meanwhile, economists and analysts have reported a large proportion of merger failures. This apparent contradiction has provided researchers with a rich source of studies. One of the likely causes of a merger failure is the “people factor”. Revolving around the axis of mergers and acquisitions, the peoples affected are, on the one side, the stockholders, top management, and economists who “talk the project” and tend to have a positive attitude and on the other side, the people who “walk the project” – the employees - who have a more hesitant attitude. / This empirical study adopted the construct of Psychological Contracts to measure the expectations of employees who are influenced by mergers and acquisitions. Based on this construct, a model was developed to study employees’ behaviour after a merger, examining it from a multitude of dimensions. Using the PLS-Graph analysis tools, the model was tested with the aim of assessing the factors’ impact on employees’ behaviour. Apart from the direct causal relationship between two variables, the indirect effects caused by other variables are assessed as well. / The first contribution made by this research is the fact that it examines the relevance of a psychological contract in a non-Western geographical region. Next, the study clearly confirms some of the existing conceptualizations regarding psychological contracts and reveals some additional insights, particularly in relation to the consideration of psychological contracts in a non-Western socio-cultural context. / The research aspires to generalize the model for predicting the post-merger behaviour of employees anywhere, across any industry, business segment and profession.
|
4 |
Functional network centrality in obesityGarcía-García, Isabel, Jurado, María Ángeles, Garolera, Maite, Marqués-Iturria, Idoia, Horstmann, Annette, Segura, Bàrbara, Pueyo, Roser, Sender-Palacios, María José, Vernet-Vernet, Maria, Villringer, Arno, Junqué, Carme, Margulies, Daniel S., Neumann, Jane 23 June 2016 (has links) (PDF)
Obesity is associated with structural and functional alterations in brain areas that are often functionally distinct and anatomically distant. This suggests that obesity is associated with differences in functional connectivity of regions distributed across the brain. However, studies addressing whole brain functional connectivity in obesity remain scarce. Here, we compared voxel-wise degree centrality and eigenvector centrality between participants with obesity (n=20) and normal-weight controls (n=21). We analyzed resting state and task-related fMRI data acquired from the same individuals. Relative to normal-weight controls, participants with obesity exhibited reduced degree centrality in the right middle frontal gyrus in the resting-state condition. During the task fMRI condition, obese participants exhibited less degree centrality in the left middle frontal gyrus and the lateral occipital cortex along with reduced eigenvector centrality in the lateral occipital cortex and occipital pole. Our results highlight the central role of the middle frontal gyrus in the pathophysiology of obesity, a structure involved in several brain circuits signaling attention, executive functions and motor functions. Additionally, our analysis suggests the existence of task-dependent reduced centrality in occipital areas; regions with a role in perceptual processes and that are profoundly modulated by attention.
|
5 |
Development and Application of Reaction Route Graph Representation and Analysis of Catalytic Reaction NetworksO'Malley, Patrick Daniel 18 January 2017 (has links)
Chemical reactions can have a staggering amount of molecular complexity. Reaction mechanisms have been proposed with over one hundred elementary reaction steps that occur in the same system simultaneously. While several methods exist to simplify and make sense of the pathways and kinetics via which these reactions proceed, e.g., reaction graphs, sensitivity or flux analysis, microkinetic analysis, and comparison of energy landscapes, etc., these methods all have limitations and are often not able to capture a comprehensive picture of the kinetics of system. It has been found useful to view these mechanisms as a network, i.e., a reaction graph. These graphs enable the visualization of the pathways of the reaction and can provide an analytical tool for pathway and kinetic analysis. However, many of the specific graph-theoretic approaches in the literature are not the most suitable for kinetic analysis of complex mechanisms; as they are simply not based on rules that are rigorous enough to fully enumerate all the pathways or provide quantitative analysis of the reaction rates. Our Reaction Route (RR) Graph approach is different in that it depicts the mechanism by a graph that is consistent with all physical and chemical laws associated with reaction networks, particularly being consistent with mass and energy conservation, i.e., Kirchoff’s Flux Law (KFL) and Kirchoff’s Potential Law (KPL). Because of their adherence to these laws, RR Graphs are able to provide an accurate graph-theoretical tool not only for depicting all reactions routes as walks (hence the name RR Graph) but also for pruning mechanisms and allowing a simplified but accurate quantitative description of reaction rates. This adherence to KFL and KPL does mean that the construction and implementation of these graphs can be prohibitively difficult for large mechanisms. For large reaction systems,especially nonlinear mechanisms, it is not realistic to generate these graphs by hand. And although there exists an analytical solution to find a determinant matrix for the RR Graph of a mechanism, the process involves an exhaustive search for a solution which experiences a combinatorial explosion as the number of steps gets very large. This leads to the idea of developing an algorithm for a computer program that can determine how to generate these graphs automatically. Unfortunately, the same combinatorial explosion is present such that for a moderately sized twenty step mechanism, it could take an average computational processor over a decade to find a solution. We have determined, however, that this brute force combinatorial approach can be avoided if heuristics could be developed to bridge gaps in our knowledge of how these graphs are constructed. Thus, developing a better analytical approach and/or a tighter set of heuristics for a computer algorithm are the overarching goals of this work. To make progress toward developing such heuristics, a set of microkinetic mechanisms were analyzed with the notion that the realization of the RR Graphs would highlight a better approach to their construction and usage. In particular, a very large linear reaction system, a smaller linear system and two non-linear reaction systems were analyzed to develop insights into how each graph is manually constructed and analyzed. Furthermore, kinetic analysis was done for these mechanisms and compared to experimental data and other analytical tools to prove not only the validity of the RR Graphs, but also how they are a significant improvement over more commonly used approaches for mechanistic and kinetic analysis. Based on the lessons learned through a consideration of these examples, a set of heuristics are established and enumerated with the ultimate goal of developing an intuitive algorithm that can help automate drawing and kinetic analysis via RR Graphs of complex mechanisms.
|
6 |
Towards effective analysis of big graphs : from scalability to qualityTian, Chao January 2017 (has links)
This thesis investigates the central issues underlying graph analysis, namely, scalability and quality. We first study the incremental problems for graph queries, which aim to compute the changes to the old query answer, in response to the updates to the input graph. The incremental problem is called bounded if its cost is decided by the sizes of the query and the changes only. No matter how desirable, however, our first results are negative: for common graph queries such as graph traversal, connectivity, keyword search and pattern matching, their incremental problems are unbounded. In light of the negative results, we propose two new characterizations for the effectiveness of incremental computation, and show that the incremental computations above can still be effectively conducted, by either reducing the computations on big graphs to small data, or incrementalizing batch algorithms by minimizing unnecessary recomputation. We next study the problems with regards to improving the quality of the graphs. To uniquely identify entities represented by vertices in a graph, we propose a class of keys that are recursively defined in terms of graph patterns, and are interpreted with subgraph isomorphism. As an application, we study the entity matching problem, which is to find all pairs of entities in a graph that are identified by a given set of keys. Although the problem is proved to be intractable, and cannot be parallelized in logarithmic rounds, we provide two parallel scalable algorithms for it. In addition, to catch numeric inconsistencies in real-life graphs, we extend graph functional dependencies with linear arithmetic expressions and comparison predicates, referred to as NGDs. Indeed, NGDs strike a balance between expressivity and complexity, since if we allow non-linear arithmetic expressions, even of degree at most 2, the satisfiability and implication problems become undecidable. A localizable incremental algorithm is developed to detect errors using NGDs, where the cost is determined by small neighbors of nodes in the updates instead of the entire graph. Finally, a rule-based method to clean graphs is proposed. We extend graph entity dependencies (GEDs) as data quality rules. Given a graph, a set of GEDs and a block of ground truth, we fix violations of GEDs in the graph by combining data repairing and object identification. The method finds certain fixes to errors detected by GEDs, i.e., as long as the GEDs and the ground truth are correct, the fixes are assured correct as their logical consequences. Several fundamental results underlying the method are established, and an algorithm is developed to implement the method. We also parallelize the method and guarantee to reduce its running time with the increase of processors.
|
7 |
Metody dolování v grafech a jejich využití v prostředí IS pro historiky / Methods for Graph Mining and Their Use in an IS for HistoriansKapavík, Radim January 2010 (has links)
This diploma work presents design and implementation of web information system for historians (called Electronic Encyclopaedia of History) with data model based on semantic network and of several tools for data analysis in such IS, using graph algorithms and graph mining methods.
|
8 |
A graph database management system for a logistics-related serviceWalldén, Marcus, Özkan, Aylin January 2016 (has links)
Higher demands on database systems have lead to an increased popularity of certain database system types in some niche areas. One such niche area is graph networks, such as social networks or logistics networks. An analysis made on such networks often focus on complex relational patterns that sometimes can not be solved efficiently by traditional relational databases, which has lead to the infusion of some specialized non-relational database systems. Some of the database systems that have seen a surge in popularity in this area are graph database systems. This thesis presents a prototype of a logistics network-related service using a graph database management system called Neo4j, which currently is the most popular graph database management system in use. The logistics network covered by the service is based on existing data from PostNord, Sweden’s biggest provider of logistics solutions, and primarily focuses on customer support and business to business. By creating a prototype of the service this thesis strives to indicate some of the positive and negative aspects of a graph database system, as well as give an indication of how a service using a graph database system could be created. The results indicate that Neo4j is very intuitive and easy to use, which would make it optimal for prototyping and smaller systems, but due to the used evaluation method more research in this area would need to be carried out in order to confirm these conclusions. / Högre krav på databassystem har lett till en ökad popularitet för vissa databassystemstyper i några nischområden. Ett sådant nischområde är grafnätverk, såsomsociala nätverk eller logistiknätverk. Analyser på grafnätverk fokuserar ofta påkomplexa relationsmönster som ibland inte kan lösas effektivt av traditionella relationsdatabassystem, vilket har lett till att vissa specialiserade icke-relationella databassystem har blivit populära alternativ. Många av de populära databassystemen inom detta område är grafdatabassystem. Detta arbete presenterar en prototyp av en logistiknätverksrelaterad tjänst som använder sig av ett grafdatabashanteringssystem som heter Neo4j, vilket är det mest använda grafdatabashanteringssystemet. Logistiknätverket som täcks av tjänsten är baserad på existerande data från PostNord, Sveriges ledande leverantör av logistiklösningar, och fokuserar primärt på kundsupport och företagsrelaterad analys. Genom att skapa en prototyp av tjänsten strävar detta arbete efter att uppvisa vissa av de positiva och negativa aspekterna av ett grafdatabashanteringssystem samt att visa hur en tjänst kan skapas genom att använda ett grafdatabashanteringssystem. Resultaten indikerar att Neo4j är väldigt intuitivt och lättanvänt, vilket skulle göra den optimal för prototyping och mindre system, men på grund av den använda evalueringsmetoden så behöver mer forskning inom detta område utföras innan dessa slutsatser kan bekräftas.
|
9 |
Graph-based Modern Nonparametrics For High-dimensional DataWang, Kaijun January 2019 (has links)
Developing nonparametric statistical methods and inference procedures for high-dimensional large data have been a challenging frontier problem of statistics. To attack this problem, in recent years, a clear rising trend has been observed with a radically different viewpoint--``Graph-based Nonparametrics," which is the main research focus of this dissertation. The basic idea consists of two steps: (i) representation step: code the given data using graphs, (ii) analysis step: apply statistical methods on the graph-transformed problem to systematically tackle various types of data structures. Under this general framework, this dissertation develops two major research directions. Chapter 2—based on Mukhopadhyay and Wang (2019a)—introduces a new nonparametric method for high-dimensional k-sample comparison problem that is distribution-free, robust, and continues to work even when the dimension of the data is larger than the sample size. The proposed theory is based on modern LP-nonparametrics tools and unexplored connections with spectral graph theory. The key is to construct a specially-designed weighted graph from the data and to reformulate the k-sample problem into a community detection problem. The procedure is shown to possess various desirable properties along with a characteristic exploratory flavor that has practical consequences. The numerical examples show surprisingly well performance of our method under a broad range of realistic situations. Chapter 3—based on Mukhopadhyay and Wang (2019b)—revisits some foundational questions about network modeling that are still unsolved. In particular, we present unified statistical theory of the fundamental spectral graph methods (e.g., Laplacian, Modularity, Diffusion map, regularized Laplacian, Google PageRank model), which are often viewed as spectral heuristic-based empirical mystery facts. Despite half a century of research, this question has been one of the most formidable open issues, if not the core problem in modern network science. Our approach integrates modern nonparametric statistics, mathematical approximation theory (of integral equations), and computational harmonic analysis in a novel way to develop a theory that unifies and generalizes the existing paradigm. From a practical standpoint, it is shown that this perspective can provide adequate guidance for designing next-generation computational tools for large-scale problems. As an example, we have described the high-dimensional change-point detection problem. Chapter 4 discusses some further extensions and application of our methodologies to regularized spectral clustering and spatial graph regression problems. The dissertation concludes with the a discussion of two important areas of future studies. / Statistics
|
10 |
The Dao of Wikipedia: Extracting Knowledge from the Structure of WikilinksConsonni, Cristian 24 October 2019 (has links)
Wikipedia is a multilingual encyclopedia written collaboratively by volunteers online, and it is now the largest, most visited encyclopedia in existence. Wikipedia has arisen through the self-organized collaboration of contributors, and since its launch in January 2001, its potential as a research resource has become apparent to scientists, its appeal lies in the fact that it strikes a middle ground between accurate, manually created, limited-coverage resources, and noisy knowledge mined from the web. For this reason, Wikipedia's content has been exploited for a variety of applications: to build knowledge bases, to study interactions between users on the Internet, and to investigate social and cultural issues such as gender bias in history, or the spreading of information.
Similarly to what happened for the Web at large, a structure has emerged from the collaborative creation of Wikipedia: its articles contain hundreds of millions of links. In Wikipedia parlance, these internal links are called wikilinks. These connections explain the topics being covered in articles and provide a way to navigate between different subjects, contextualizing the information, and making additional information available.
In this thesis, we argue that the information contained in the link structure of Wikipedia can be harnessed to gain useful insights by extracting it with dedicated algorithms. More prosaically, in this thesis, we explore the link structure of Wikipedia with new methods.
In the first part, we discuss in depth the characteristics of Wikipedia, and we describe the process and challenges we have faced to extract the network of links. Since Wikipedia is available in several language editions and its entire edition history is publicly available, we have extracted the wikilink network at various points in time, and we have performed data integration to improve its quality.
In the second part, we show that the wikilink network can be effectively used to find the most relevant pages related to an article provided by the user. We introduce a novel algorithm, called CycleRank, that takes advantage of the link structure of Wikipedia considering cycles of links, thus giving weight to both incoming and outgoing connections, to produce a ranking of articles with respect to an article chosen by the user.
In the last part, we explore applications of CycleRank. First, we describe the Engineroom EU project, where we faced the challenge to find which were the most relevant Wikipedia pages connected to the Wikipedia article about the Internet. Finally, we present another contribution using Wikipedia article accesses to estimate how the information about diseases propagates.
In conclusion, with this thesis, we wanted to show that browsing Wikipedia's wikilinks is not only fascinating and serendipitous, but it is an effective way to extract useful information that is latent in the user-generated encyclopedia.
|
Page generated in 0.0623 seconds