Global ETD Search

11	Methods for Differential Analysis of Gene Expression and Metabolic Pathway Activity Temate Tiagueu, Yvette Charly B, Temate Tiagueu, Yvette C. B. 09 May 2016 (has links) RNA-Seq is an increasingly popular approach to transcriptome profiling that uses the capabilities of next-generation sequencing technologies and provides better measurement of the levels of transcripts and their isoforms. In this thesis, we apply the RNA-Seq protocol and transcriptome quantification to estimate gene expression and pathway activity levels. We present a novel method, called IsoDE, for differential gene expression analysis based on bootstrapping. For the first version of IsoDE, we compared the tool against four existing methods (Fisher's exact test, GFOLD, edgeR and Cuffdiff) on RNA-Seq datasets generated using three different sequencing technologies, both with and without replicates. We also introduce the second version of IsoDE, which runs 10 times faster than the first implementation thanks to in-memory processing applied to the underlying tool for estimating gene expression frequencies, together with further optimization of the analysis. The second part of this thesis presents a set of tools for the differential analysis of metabolic pathways from RNA-Seq data. Metabolic pathways are series of chemical reactions occurring within a cell. We focus on two main problems in the differential analysis of metabolic pathways, namely, differential analysis of their inferred activity level and of their estimated abundance. We validate our approaches through differential expression analysis at the transcript and gene levels and also through real-time quantitative PCR experiments. In Part Four, we present the different packages created or updated in the course of this study. We conclude with our future work plans for further improving IsoDE 2.0. 
Bootstrapping algorithm Next generation sequencing Gene expression RNA-Seq data Expectation maximization Graph analysis Metabolic pathway activity level Metabolic pathways Metabolic pathway abundance KEGG Differential gene expression analysis
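The abstract above says only that IsoDE bases differential expression calls on bootstrapping. As a rough illustration of that general idea, and not of IsoDE's actual algorithm, the sketch below resamples toy read lists with replacement and calls a gene differentially expressed when most bootstrap fold changes exceed a threshold. The function names, the pseudocount, the thresholds `min_fold` and `min_support`, and the data are all assumptions made for this example.

```python
import random


def bootstrap_fold_changes(reads_a, reads_b, gene, n_boot=200, seed=0):
    """Resample reads with replacement; return one fold change per bootstrap round.

    reads_a, reads_b: lists holding one gene label per mapped read (toy data).
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_boot):
        sample_a = rng.choices(reads_a, k=len(reads_a))
        sample_b = rng.choices(reads_b, k=len(reads_b))
        # Relative frequency of the gene; the +1 pseudocount avoids division by zero.
        freq_a = (sample_a.count(gene) + 1) / (len(sample_a) + 1)
        freq_b = (sample_b.count(gene) + 1) / (len(sample_b) + 1)
        ratios.append(freq_a / freq_b)
    return ratios


def is_differentially_expressed(ratios, min_fold=2.0, min_support=0.95):
    """Call a gene DE if enough bootstrap ratios pass the fold threshold in one direction."""
    up = sum(r >= min_fold for r in ratios) / len(ratios)
    down = sum(r <= 1.0 / min_fold for r in ratios) / len(ratios)
    return max(up, down) >= min_support


# Hypothetical conditions: g1 makes up 40% of reads in A but only 5% in B.
reads_a = ["g1"] * 400 + ["g2"] * 600
reads_b = ["g1"] * 50 + ["g2"] * 950
g1_call = is_differentially_expressed(bootstrap_fold_changes(reads_a, reads_b, "g1"))
g2_call = is_differentially_expressed(bootstrap_fold_changes(reads_a, reads_b, "g2"))
print(g1_call, g2_call)
```

Requiring support across bootstrap rounds, rather than a single point estimate of fold change, is what lets this kind of test work without biological replicates.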
12	Mining Tera-Scale Graphs: Theory, Engineering and Discoveries Kang, U 01 May 2012 (has links) How do we find patterns and anomalies in graphs with billions of nodes and edges that do not fit in memory? How can we use parallelism for such Tera- or Peta-scale graphs? In this thesis, we propose PEGASUS, a large-scale graph mining system implemented on top of the HADOOP platform, the open source version of MAPREDUCE. PEGASUS includes algorithms which help us spot patterns and anomalous behaviors in large graphs. PEGASUS enables structure analysis of large graphs. We unify many different structure analysis algorithms, including the analysis of connected components, PageRank, and radius/diameter, into a general primitive called GIM-V. GIM-V is highly optimized, achieving good scale-up with the number of edges and available machines. We discover surprising patterns using GIM-V, including 7 degrees of separation in one of the largest publicly available Web graphs, with 7 billion edges. PEGASUS also enables inference and spectral analysis on large graphs. We design an efficient distributed belief propagation algorithm which infers the states of unlabeled nodes given a set of labeled nodes. We also develop an eigensolver for computing the top k eigenvalues and eigenvectors of the adjacency matrices of very large graphs. We use the eigensolver to discover anomalous adult advertisers in the who-follows-whom Twitter graph with 3 billion edges. In addition, we develop an efficient tensor decomposition algorithm and use it to analyze a large knowledge base tensor. Finally, PEGASUS allows the management of large graphs. We propose efficient graph storage and indexing methods to answer graph mining queries quickly. We also develop an edge layout algorithm for better compressing graphs. 
graph mining MAPREDUCE HADOOP graph structure analysis radius plot diameter connected component inference spectral graph analysis eigensolver tensor analysis graph management graph indexing graph compression Computer Sciences
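The abstract names GIM-V as one primitive covering connected components, PageRank and radius/diameter, but does not spell it out. The single-machine sketch below assumes the usual reading of GIM-V as a generalized iterative matrix-vector multiplication parameterized by three user-supplied operations; the names `combine2`, `combine_all` and `assign`, the dictionary representation of the matrix, and the toy graphs are assumptions for illustration, and nothing here reflects the HADOOP implementation.

```python
def gim_v(edges, v, combine2, combine_all, assign, max_iter=100, tol=1e-10):
    """Repeat a generalized matrix-vector product until the vector stops changing.

    edges: dict mapping (i, j) -> m_ij for the nonzero entries of the matrix.
    v: initial vector as a list of numbers.
    """
    n = len(v)
    for _ in range(max_iter):
        partial = [[] for _ in range(n)]
        for (i, j), m_ij in edges.items():
            partial[i].append(combine2(m_ij, v[j]))
        new_v = [assign(v[i], combine_all(partial[i])) for i in range(n)]
        if all(abs(a - b) <= tol for a, b in zip(new_v, v)):
            return new_v
        v = new_v
    return v


# Connected components: every node repeatedly adopts the smallest id it can see.
undirected = [(0, 1), (1, 2), (3, 4)]
cc_edges = {}
for a, b in undirected:
    cc_edges[(a, b)] = 1
    cc_edges[(b, a)] = 1
components = gim_v(
    cc_edges,
    [0, 1, 2, 3, 4],
    combine2=lambda m, vj: m * vj,
    combine_all=lambda xs: min(xs) if xs else float("inf"),
    assign=min,
)

# PageRank on a directed 3-cycle; m_ij = 1 / outdegree(j) when j links to i.
damping, n = 0.85, 3
pr_edges = {(1, 0): 1.0, (2, 1): 1.0, (0, 2): 1.0}
ranks = gim_v(
    pr_edges,
    [1.0 / n] * n,
    combine2=lambda m, vj: damping * m * vj,
    combine_all=lambda xs: (1 - damping) / n + sum(xs),
    assign=lambda old, new: new,
)
print(components, ranks)
```

Only the three small functions change between the two analyses, which is the sense in which one optimized primitive can serve several graph algorithms.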
13	A high-performance framework for analyzing massive complex networks Madduri, Kamesh 08 July 2008 (has links) Graphs are a fundamental and widely-used abstraction for representing data. We can analytically study interesting aspects of real-world complex systems such as the Internet, social systems, transportation networks, and biological interaction data by modeling them as graphs. Graph-theoretic and combinatorial problems are also pervasive in scientific computing and engineering applications. In this dissertation, we address the problem of analyzing large-scale complex networks that represent interactions between hundreds of thousands to billions of entities. We present SNAP, a new high-performance computational framework for efficiently processing graph-theoretic queries on massive datasets. Graph analysis is computationally very different from traditional scientific computing, and solving massive graph-theoretic problems on current high performance computing systems is challenging due to several reasons. First, real-world graphs are often characterized by a low diameter and unbalanced degree distributions, and are difficult to partition on parallel systems. Second, parallel algorithms for solving graph-theoretic problems are typically memory intensive, and the memory accesses are fine-grained and highly irregular. The primary contributions of this dissertation are the design and implementation of novel parallel graph algorithms for traversal, shortest paths, and centrality computations, optimized for the small-world network topology, and high-performance multithreaded architectures and multicore servers. SNAP (Small-world Network Analysis and Partitioning) is a modular, open-source framework for the exploratory analysis and partitioning of large-scale networks. 
With SNAP, we demonstrate the capability to process massive graphs with billions of vertices and edges, and achieve up to two orders of magnitude speedup over state-of-the-art network analysis approaches. We also design a new parallel computing benchmark for characterizing the performance of graph-theoretic problems on high-end systems; study data representations for dynamic graph problems on parallel systems; and apply algorithms in SNAP to solve real-world problems in social network analysis and systems biology. Parallel computing Graph algorithms Multithreaded algorithms Complex networks Graph analysis framework Graph theory Data processing Network analysis (Planning) Combinatorial analysis Graph algorithms
14	Signal Processing on Graphs - Contributions to an Emerging Field / Traitement du signal sur graphes - Contributions à un domaine émergent Girault, Benjamin 01 December 2015 (has links) This dissertation introduces in its first part the field of signal processing on graphs. We start by recalling the required elements from linear algebra and spectral graph theory. 
Then, we define signal processing on graphs and give intuitions on its strengths and weaknesses compared to classical signal processing. In the second part, we introduce our contributions to the field. Chapter 4 studies structural properties of graphs using classical signal processing through a transformation from graphs to time series. In doing so, we take advantage of a unified method of semi-supervised learning on graphs dedicated to classification to obtain a smooth time series. Finally, we show that our method can be recognized as a smoothing operator on graph signals. Chapter 5 introduces a new translation operator on graphs, defined by analogy to the classical time shift operator and satisfying the key property of isometry. Our operator is compared to the two operators from the literature and its action is empirically described on several graphs. Chapter 6 describes the use of the operator above to define stationary graph signals. After giving a spectral characterization of these graph signals, we give a method to study and test stationarity on real graph signals. The closing chapter shows the strength of the MATLAB toolbox developed and used during the course of this PhD. Graph signal processing Graph translation Graph analysis Stochastic graph signals Stationarity
15	Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis Royer, Loic 12 December 2017 (has links) (PDF) Molecular biology has entered an era of systematic and automated experimentation. High-throughput techniques have moved biology from small-scale experiments focused on specific genes and proteins to genome and proteome-wide screens. One result of this endeavor is the compilation of complex networks of interacting proteins. Molecular biologists hope to understand life's complex molecular machines by studying these networks. This thesis addresses three open problems centered upon their analysis and quality assessment. First, we introduce power graph analysis as a novel approach to the representation and visualization of biological networks. Power graphs are a graph-theoretic approach to the lossless and compact representation of complex networks. They group edges into cliques and bicliques, and nodes into a neighborhood hierarchy. We demonstrate power graph analysis on five examples, and show its advantages over traditional network representations. Moreover, we evaluate the algorithm's performance on a benchmark, test the robustness of the algorithm to noise, and measure its empirical time complexity at O(e^1.71), sub-quadratic in the number of edges e. Second, we tackle the difficult and controversial problem of data quality in protein interaction networks. We propose a novel measure for the accuracy and completeness of genome-wide protein interaction networks based on network compressibility. 
We validate this new measure by i) verifying the detrimental effect of false positives and false negatives, ii) showing that gold standard networks are highly compressible, iii) showing that authors' choice of confidence thresholds is consistent with high network compressibility, iv) presenting evidence that compressibility is correlated with co-expression, co-localization and shared function, and v) showing that complete and accurate networks of complex systems in other domains exhibit levels of compressibility similar to those of current high quality interactomes. Third, we apply power graph analysis to networks derived from text-mining as well as to gene expression microarray data. In particular, we present i) the network-based analysis of genome-wide expression profiles of the neuroectodermal conversion of mesenchymal stem cells, ii) the analysis of regulatory modules in a rare mitochondrial cytopathy, Mitochondrial Encephalomyopathy, Lactic acidosis, and Stroke-like episodes (MELAS), and iii) an investigation of the biochemical causes behind the enhanced biocompatibility of tantalum compared with titanium. protein interaction networks Power graph analysis proteomics bioinformatics computational biology graph theory visualization network compression Y2H APMS miR-124 HIF-1 MELAS Sjogren syndrome ddc:570 rvk:WD 5100
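The abstract reports an empirical time complexity of O(e^1.71) without saying how the exponent was obtained. One common way to estimate such an exponent, shown here as an assumption rather than as the thesis's procedure, is a least-squares fit of log running time against log edge count. The timings below are synthetic values generated from the stated exponent, not measurements from the thesis, and the function name is hypothetical.

```python
import math


def fit_power_law_exponent(sizes, times):
    """Return the least-squares slope of log(time) against log(size)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


edge_counts = [1_000, 2_000, 4_000, 8_000, 16_000]
# Synthetic timings following t = 1e-6 * e^1.71 (illustrative only).
timings = [1e-6 * e ** 1.71 for e in edge_counts]
exponent = fit_power_law_exponent(edge_counts, timings)
print(round(exponent, 2))
```

An exponent below 2 on such a fit is what "sub-quadratic in the number of edges" refers to.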
16	Compile- and run-time approaches for the selection of efficient data structures for dynamic graph analysis Schiller, Benjamin, Deusser, Clemens, Castrillon, Jeronimo, Strufe, Thorsten 11 January 2017 (has links) (PDF) Graphs are used to model a wide range of systems from different disciplines including social network analysis, biology, and big data processing. When analyzing these constantly changing dynamic graphs at a high frequency, performance is the main concern. Depending on the graph size and structure, update frequency, and read accesses of the analysis, the use of different data structures can yield great performance variations. Even for expert programmers, it is not always obvious, which data structure is the best choice for a given scenario. In previous work, we presented an approach for handling the selection of the most efficient data structures automatically using a compile-time approach well-suited for constant workloads. We extend this work with a measurement study of seven data structures and use the results to fit actual cost estimation functions. In addition, we evaluate our approach for the computations of seven different graph metrics. In analyses of real-world dynamic graphs with a constant workload, our approach achieves a speedup of up to 5.4× compared to basic data structure configurations. Such a compile-time based approach cannot yield optimal results when the behavior of the system changes later and the workload becomes non-constant. To close this gap we present a run-time approach which provides live profiling and facilitates automatic exchanges of data structures during execution. We analyze the performance of this approach using an artificial, non-constant workload where our approach achieves speedups of up to 7.3× compared to basic configurations. 
Dynamic graph analysis Data structures Performance Measurement study Compile-time optimization TU Dresden Publishing Fund ddc:300 rvk:MN 1000
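The abstract describes choosing data structures from fitted cost estimation functions and a known workload. A minimal sketch of that selection step follows; the structure names, the cost functions and the operation counts are invented for illustration and are not the fitted functions or the seven data structures measured in the paper.

```python
import math


def select_data_structure(cost_functions, workload, size):
    """Return the structure with the lowest estimated total cost.

    cost_functions: {structure: {operation: f(size) -> estimated cost per call}}
    workload: {operation: number of calls}
    """
    def total_cost(structure):
        per_operation = cost_functions[structure]
        return sum(count * per_operation[op](size) for op, count in workload.items())

    return min(cost_functions, key=total_cost)


# Hypothetical per-call cost estimates as functions of the collection size n.
cost_functions = {
    "array_list": {"add": lambda n: 1.0, "contains": lambda n: 0.5 * n},
    "hash_set": {"add": lambda n: 3.0, "contains": lambda n: 2.0},
    "sorted_array": {"add": lambda n: 0.5 * n, "contains": lambda n: math.log2(n + 1)},
}

write_heavy = {"add": 10_000, "contains": 10}
read_heavy = {"add": 10, "contains": 10_000}
best_for_writes = select_data_structure(cost_functions, write_heavy, size=1_000)
best_for_reads = select_data_structure(cost_functions, read_heavy, size=1_000)
print(best_for_writes, best_for_reads)
```

Under a constant workload the selection can be made once, ahead of execution. A run-time variant in the spirit of the abstract would re-evaluate the same function with operation counts gathered by live profiling and exchange the structure when the winner changes.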
17	Functional network centrality in obesity: a resting-state and task fMRI study García-García, Isabel, Jurado, María Ángeles, Garolera, Maite, Marqués-Iturria, Idoia, Horstmann, Annette, Segura, Bàrbara, Pueyo, Roser, Sender-Palacios, María José, Vernet-Vernet, Maria, Villringer, Arno, Junqué, Carme, Margulies, Daniel S., Neumann, Jane January 2015 (has links) Obesity is associated with structural and functional alterations in brain areas that are often functionally distinct and anatomically distant. This suggests that obesity is associated with differences in functional connectivity of regions distributed across the brain. However, studies addressing whole brain functional connectivity in obesity remain scarce. Here, we compared voxel-wise degree centrality and eigenvector centrality between participants with obesity (n=20) and normal-weight controls (n=21). We analyzed resting state and task-related fMRI data acquired from the same individuals. Relative to normal-weight controls, participants with obesity exhibited reduced degree centrality in the right middle frontal gyrus in the resting-state condition. During the task fMRI condition, obese participants exhibited less degree centrality in the left middle frontal gyrus and the lateral occipital cortex along with reduced eigenvector centrality in the lateral occipital cortex and occipital pole. Our results highlight the central role of the middle frontal gyrus in the pathophysiology of obesity, a structure involved in several brain circuits signaling attention, executive functions and motor functions. Additionally, our analysis suggests the existence of task-dependent reduced centrality in occipital areas; regions with a role in perceptual processes and that are profoundly modulated by attention. info:eu-repo/classification/ddc/610 ddc:610
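The study compares degree centrality and eigenvector centrality between groups. For readers unfamiliar with the two measures, the sketch below computes both on a small hypothetical undirected graph. It does not reproduce the study's voxel-wise fMRI pipeline; the adjacency matrix and the use of power iteration on A + I are assumptions made for this example.

```python
def degree_centrality(adjacency):
    """Number of nonzero connections per node."""
    return [sum(1 for weight in row if weight != 0) for row in adjacency]


def eigenvector_centrality(adjacency, iterations=200, tol=1e-12):
    """Power iteration for the leading eigenvector, scaled so the maximum is 1."""
    n = len(adjacency)
    x = [1.0] * n
    for _ in range(iterations):
        # Iterate with A + I instead of A: same eigenvectors, but no
        # oscillation on bipartite graphs such as the star used below.
        new_x = [x[i] + sum(adjacency[i][j] * x[j] for j in range(n)) for i in range(n)]
        largest = max(new_x)
        new_x = [value / largest for value in new_x]
        if max(abs(a - b) for a, b in zip(new_x, x)) < tol:
            return new_x
        x = new_x
    return x


# Hypothetical star graph: node 0 is connected to nodes 1, 2 and 3.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
degrees = degree_centrality(star)
eigen = eigenvector_centrality(star)
print(degrees, [round(value, 3) for value in eigen])
```

Degree centrality counts only direct connections, whereas eigenvector centrality also weights each node by how central its neighbors are, which is why a study may report the two measures separately.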
18	Analysis, integration and applications of the human interactome Chaurasia, Gautam 12 December 2012 (has links) Protein interaction networks aim to provide the scaffold maps for systematic studies of the complex molecular machinery in the cell. The complexity of protein interactions poses, however, large experimental and computational challenges regarding their identification, validation and annotation. Additionally, storage and linking are demanding since new data are rapidly accumulating. In this research work, I addressed these issues and provided solutions to overcome the limitations of current human protein-protein interaction (PPI) maps. In particular, my thesis can be partitioned into two parts: In the first part, I conducted a comparative assessment of eight recently constructed human protein-protein interaction networks to identify experimental biases. Results showed strong selection and detection biases, which need to be taken into consideration in future applications of these maps. One of the important conclusions of this study was that the current human interaction networks contain complementary information; hence, a database termed Unified Human Interactome (UniHI) was developed, integrating human PPI data from twelve major sources. Several new tools were included for querying, analyzing and visualizing human PPI networks. In the second part of this research work, the UniHI dataset was applied to characterize the genetic modifiers involved in a specific disease: Chorea Huntington (HD), an autosomal dominant neurodegenerative disease. To find the modifiers, a network-based modeling approach was implemented by integrating a huntingtin-specific protein interaction network with gene expression data from HD patients in multiple steps. 
Using this approach, a Caudate Nucleus-specific HD protein interaction (PPI) network was predicted, connecting 14 potentially dysregulated proteins directly or indirectly to the disease protein, showing a possible link to molecular processes such as pro-apoptotic pathways, cell survival, anti-apoptotic, growth, and neuronal diseases. Systems Biology Network Biology Protein-protein Interaction Graph Analysis Huntington Disease 570 Life sciences, biology 32 Biology WD 5100 ddc:570
19	Compile- and run-time approaches for the selection of efficient data structures for dynamic graph analysis Schiller, Benjamin, Deusser, Clemens, Castrillon, Jeronimo, Strufe, Thorsten 11 January 2017 (has links) Graphs are used to model a wide range of systems from different disciplines including social network analysis, biology, and big data processing. When analyzing these constantly changing dynamic graphs at a high frequency, performance is the main concern. Depending on the graph size and structure, update frequency, and read accesses of the analysis, the use of different data structures can yield great performance variations. Even for expert programmers, it is not always obvious, which data structure is the best choice for a given scenario. In previous work, we presented an approach for handling the selection of the most efficient data structures automatically using a compile-time approach well-suited for constant workloads. We extend this work with a measurement study of seven data structures and use the results to fit actual cost estimation functions. In addition, we evaluate our approach for the computations of seven different graph metrics. In analyses of real-world dynamic graphs with a constant workload, our approach achieves a speedup of up to 5.4× compared to basic data structure configurations. Such a compile-time based approach cannot yield optimal results when the behavior of the system changes later and the workload becomes non-constant. To close this gap we present a run-time approach which provides live profiling and facilitates automatic exchanges of data structures during execution. We analyze the performance of this approach using an artificial, non-constant workload where our approach achieves speedups of up to 7.3× compared to basic configurations. info:eu-repo/classification/ddc/300 ddc:300
20	Statistické zhodnocení dat / Statistical data evaluation Fadrný, Tomáš January 2009 (has links) This diploma thesis evaluates and processes data from final device checks. All the devices are similar types of thermal overcurrent relays by the ABB company. For appropriate statistical data processing, the Minitab 14 statistical software was used and various statistical methods were applied. Results are always listed for each device type and each method used. The diploma thesis is divided into two parts. The first one analyzes the methods used and the second part states the method results. There is also an overall evaluation of the processed data.
